Linux-ARM-Kernel Archive on lore.kernel.org


* RE: [PATCH v2 4/5] mmc: sdhci-esdhc-imx: disable irq during suspend to fix unhandled interrupt
From: Luke Wang (OSS) @ 2026-06-26  6:04 UTC (permalink / raw)
  To: Frank Li (OSS), Luke Wang (OSS)
  Cc: adrian.hunter@intel.com, ulfh@kernel.org, Bough Chen, Frank Li,
	s.hauer@pengutronix.de, kernel@pengutronix.de, festevam@gmail.com,
	imx@lists.linux.dev, linux-mmc@vger.kernel.org, dl-S32,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <aj1ZWZo_9MLUWDBD@SMW015318>



> -----Original Message-----
> From: Frank Li (OSS) <frank.li@oss.nxp.com>
> Sent: Friday, June 26, 2026 12:38 AM
> To: Luke Wang (OSS) <ziniu.wang_1@oss.nxp.com>
> Cc: adrian.hunter@intel.com; ulfh@kernel.org; Bough Chen
> <haibo.chen@nxp.com>; Frank Li <frank.li@nxp.com>;
> s.hauer@pengutronix.de; kernel@pengutronix.de; festevam@gmail.com;
> imx@lists.linux.dev; linux-mmc@vger.kernel.org; dl-S32 <S32@nxp.com>;
> linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2 4/5] mmc: sdhci-esdhc-imx: disable irq during
> suspend to fix unhandled interrupt
> 
> On Thu, Jun 25, 2026 at 06:59:33PM +0800, ziniu.wang_1@oss.nxp.com
> wrote:
> > From: Luke Wang <ziniu.wang_1@nxp.com>
> >
> > When using WIFI out-of-band wakeup, an "irq xxx: nobody cared" warning
> > occurs. This happens because the usdhc interrupt is not disabled during
> > system suspend when device_may_wakeup() returns false.
> >
> > The sequence of events leading to this issue:
> > 1. System enters suspend without disabling usdhc interrupt
> > (because device_may_wakeup() returns false for usdhc device)
> > 2. WIFI out-of-band wakeup triggers system resume via GPIO interrupt
> > 3. WIFI sends a Card interrupt before usdhc has fully resumed
> > 4. usdhc is still in runtime suspend state and cannot handle the
> > interrupt properly
> > 5. The unhandled interrupt triggers "nobody cared" warning
> >
> > Fix this by unconditionally disabling the usdhc interrupt during suspend
> > and re-enabling it during resume, regardless of the wakeup capability.
> > This ensures no interrupts are processed during the suspend/resume
> > transition.
> 
> Does it impact the case where WIFI doesn't use out-of-band wakeup?

It doesn't impact other cases.

Thanks,
Luke

> 
> Frank
> >
> > Fixes: 676a83855614 ("mmc: host: sdhci-esdhc-imx: refactor the system PM logic")
> > Signed-off-by: Luke Wang <ziniu.wang_1@nxp.com>
> > ---
> >  drivers/mmc/host/sdhci-esdhc-imx.c | 11 ++++++-----
> >  1 file changed, 6 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c
> > index 7fcaecdd4ec6..c4a22e42628e 100644
> > --- a/drivers/mmc/host/sdhci-esdhc-imx.c
> > +++ b/drivers/mmc/host/sdhci-esdhc-imx.c
> > @@ -2076,9 +2076,10 @@ static int sdhci_esdhc_suspend(struct device *dev)
> >  	if (mmc_card_keep_power(host->mmc) && esdhc_is_usdhc(imx_data))
> >  		sdhc_esdhc_tuning_save(host);
> >
> > +	/* The irqs of imx are not shared. It is safe to disable */
> > +	disable_irq(host->irq);
> > +
> >  	if (device_may_wakeup(dev)) {
> > -		/* The irqs of imx are not shared. It is safe to disable */
> > -		disable_irq(host->irq);
> >  		ret = sdhci_enable_irq_wakeups(host);
> >  		if (!ret)
> >  			dev_warn(dev, "Failed to enable irq wakeup\n");
> > @@ -2129,10 +2130,10 @@ static int sdhci_esdhc_resume(struct device *dev)
> >  	/* re-initialize hw state in case it's lost in low power mode */
> >  	sdhci_esdhc_imx_hwinit(host);
> >
> > -	if (host->irq_wake_enabled) {
> > +	if (host->irq_wake_enabled)
> >  		sdhci_disable_irq_wakeups(host);
> > -		enable_irq(host->irq);
> > -	}
> > +
> > +	enable_irq(host->irq);
> >
> >  	/*
> >  	 * restore the saved tuning delay value for the device which keep
> > --
> > 2.34.1
> >
> >
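For readers following the thread, the effect of moving disable_irq() out of the device_may_wakeup() branch can be modeled in userspace. This is only a toy sketch, not driver code: struct toy_irq, struct toy_host and every toy_* function are hypothetical names invented here, the wakeup arming is reduced to a flag, and the depth counter merely imitates the way disable_irq()/enable_irq() calls nest.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy interrupt line: depth > 0 means masked (imitates nested disable_irq()). */
struct toy_irq {
	int disable_depth;
};

/* Toy host: the two flags stand in for device_may_wakeup() and
 * host->irq_wake_enabled. All names here are illustrative assumptions. */
struct toy_host {
	struct toy_irq irq;
	bool may_wakeup;
	bool irq_wake_enabled;
};

static void toy_disable_irq(struct toy_irq *irq)
{
	irq->disable_depth++;
}

static void toy_enable_irq(struct toy_irq *irq)
{
	/* An enable without a matching disable would be a bug in this model. */
	assert(irq->disable_depth > 0);
	irq->disable_depth--;
}

static bool toy_irq_masked(const struct toy_irq *irq)
{
	return irq->disable_depth > 0;
}

/* Suspend as in the patch: mask unconditionally, then arm wakeup if allowed. */
static void toy_suspend(struct toy_host *host)
{
	toy_disable_irq(&host->irq);
	if (host->may_wakeup)
		host->irq_wake_enabled = true;
}

/* Resume as in the patch: disarm wakeup if armed, then always unmask. */
static void toy_resume(struct toy_host *host)
{
	if (host->irq_wake_enabled)
		host->irq_wake_enabled = false;
	toy_enable_irq(&host->irq);
}

/* Pre-patch suspend, for comparison: masking depends on wakeup capability. */
static void toy_suspend_old(struct toy_host *host)
{
	if (host->may_wakeup) {
		toy_disable_irq(&host->irq);
		host->irq_wake_enabled = true;
	}
}
```

With may_wakeup set to false, toy_suspend_old() leaves the line unmasked, which corresponds to the window in which the card interrupt arrives in the reported sequence; toy_suspend() masks it in both cases, and toy_resume() keeps the disable/enable calls balanced.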



* RE: [PATCH v2 5/5] mmc: sdhci-esdhc-imx: fix suspend/resume error handling
From: Luke Wang (OSS) @ 2026-06-26  6:07 UTC (permalink / raw)
  To: Frank Li (OSS), Luke Wang (OSS)
  Cc: adrian.hunter@intel.com, ulfh@kernel.org, Bough Chen, Frank Li,
	s.hauer@pengutronix.de, kernel@pengutronix.de, festevam@gmail.com,
	imx@lists.linux.dev, linux-mmc@vger.kernel.org, dl-S32,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <aj1Z0fYSVY4cw0Mq@SMW015318>



> -----Original Message-----
> From: Frank Li (OSS) <frank.li@oss.nxp.com>
> Sent: Friday, June 26, 2026 12:40 AM
> To: Luke Wang (OSS) <ziniu.wang_1@oss.nxp.com>
> Cc: adrian.hunter@intel.com; ulfh@kernel.org; Bough Chen
> <haibo.chen@nxp.com>; Frank Li <frank.li@nxp.com>;
> s.hauer@pengutronix.de; kernel@pengutronix.de; festevam@gmail.com;
> imx@lists.linux.dev; linux-mmc@vger.kernel.org; dl-S32 <S32@nxp.com>;
> linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2 5/5] mmc: sdhci-esdhc-imx: fix suspend/resume error
> handling
> 
> On Thu, Jun 25, 2026 at 06:59:34PM +0800, ziniu.wang_1@oss.nxp.com
> wrote:
> > From: Luke Wang <ziniu.wang_1@nxp.com>
> >
> > Fix several error handling issues in sdhci_esdhc_suspend/resume:
> >
> > 1. Use pm_runtime_resume_and_get() instead of pm_runtime_get_sync()
> >    to simplify error handling. If it fails, the device is unclocked
> >    and accessing hardware registers would cause a kernel panic.
> >
> > 2. Make pinctrl_pm_select_sleep_state() and mmc_gpio_set_cd_wake()
> >    failures non-fatal in suspend path. These failures only mean
> >    slightly higher power consumption or missing CD wakeup, but should
> >    not block system suspend.
> >
> > 3. Check pm_runtime_force_resume() return value in resume. If it
> >    fails (clock enable failure), return immediately since accessing
> >    hardware registers on an unclocked device would cause a panic.
> >
> > 4. Make mmc_gpio_set_cd_wake(false) call in resume not check return
> >    value since it always returns 0.
> >
> > 5. Always return 0 on success path instead of propagating non-fatal
> >    warning return values.
> 
> Each patch should fix one problem.

I will split this patch in v3.

Thanks,
Luke

> 
> Frank
> 
> >
> > Signed-off-by: Luke Wang <ziniu.wang_1@nxp.com>
> > ---
> >  drivers/mmc/host/sdhci-esdhc-imx.c | 18 +++++++++++-------
> >  1 file changed, 11 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c
> > index c4a22e42628e..4d6818c95809 100644
> > --- a/drivers/mmc/host/sdhci-esdhc-imx.c
> > +++ b/drivers/mmc/host/sdhci-esdhc-imx.c
> > @@ -2060,7 +2060,9 @@ static int sdhci_esdhc_suspend(struct device *dev)
> >  	 * 2, make sure the pm_runtime_force_resume() in sdhci_esdhc_resume() really
> >  	 *    invoke its ->runtime_resume callback (needs_force_resume = 1).
> >  	 */
> > -	pm_runtime_get_sync(dev);
> > +	ret = pm_runtime_resume_and_get(dev);
> > +	if (ret)
> > +		return ret;
> >
> >  	if ((imx_data->socdata->flags & ESDHC_FLAG_STATE_LOST_IN_LPMODE) &&
> >  		(host->tuning_mode != SDHCI_TUNING_MODE_1)) {
> > @@ -2094,10 +2096,12 @@ static int sdhci_esdhc_suspend(struct device *dev)
> >  		 */
> >  		ret = pinctrl_pm_select_sleep_state(dev);
> >  		if (ret)
> > -			return ret;
> > +			dev_warn(dev, "Failed to select sleep pinctrl state\n");
> >  	}
> >
> >  	ret = mmc_gpio_set_cd_wake(host->mmc, true);
> > +	if (ret)
> > +		dev_warn(dev, "Failed to enable cd wake\n");
> >
> >  	/*
> >  	 * Make sure invoke runtime_suspend to gate off clock.
> > @@ -2105,7 +2109,7 @@ static int sdhci_esdhc_suspend(struct device *dev)
> >  	 */
> >  	pm_runtime_force_suspend(dev);
> >
> > -	return ret;
> > +	return 0;
> >  }
> >
> >  static int sdhci_esdhc_resume(struct device *dev)
> > @@ -2121,12 +2125,12 @@ static int sdhci_esdhc_resume(struct device *dev)
> >  			dev_warn(dev, "Failed to restore pinctrl state\n");
> >  	}
> >
> > -	pm_runtime_force_resume(dev);
> > -
> > -	ret = mmc_gpio_set_cd_wake(host->mmc, false);
> > +	ret = pm_runtime_force_resume(dev);
> >  	if (ret)
> >  		return ret;
> >
> > +	mmc_gpio_set_cd_wake(host->mmc, false);
> > +
> >  	/* re-initialize hw state in case it's lost in low power mode */
> >  	sdhci_esdhc_imx_hwinit(host);
> >
> > @@ -2153,7 +2157,7 @@ static int sdhci_esdhc_resume(struct device *dev)
> >
> >  	pm_runtime_put_autosuspend(dev);
> >
> > -	return ret;
> > +	return 0;
> >  }
> >
> >  static int sdhci_esdhc_runtime_suspend(struct device *dev)
> > --
> > 2.34.1
> >
> >
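Item 1 of the commit message rests on a difference in reference handling: pm_runtime_get_sync() keeps the usage count raised even when the resume fails, and may return 1 if the device was already active, whereas pm_runtime_resume_and_get() drops the count on failure and returns 0 on success. The following is a userspace sketch of that contract only; struct toy_dev and the toy_* functions are hypothetical stand-ins, and the forced -EIO failure is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy device: only the fields needed to show the refcount contract. */
struct toy_dev {
	int usage_count;
	bool active;
	bool resume_fails;	/* force the toy resume to fail (assumption) */
};

/* Imitates pm_runtime_get_sync(): the count is raised even on failure,
 * and 1 is returned when the device was already active. */
static int toy_get_sync(struct toy_dev *dev)
{
	dev->usage_count++;
	if (dev->active)
		return 1;
	if (dev->resume_fails)
		return -EIO;
	dev->active = true;
	return 0;
}

/* Imitates pm_runtime_resume_and_get(): undo the count on failure and
 * collapse every success case to 0. */
static int toy_resume_and_get(struct toy_dev *dev)
{
	int ret = toy_get_sync(dev);

	if (ret < 0) {
		dev->usage_count--;	/* the role pm_runtime_put_noidle() plays */
		return ret;
	}
	return 0;
}

/* Drop one reference; an unbalanced put is a bug in this model. */
static void toy_put(struct toy_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}
```

A caller that returns early on a toy_resume_and_get() error leaves usage_count unchanged, which is why the suspend path in the patch can simply do "if (ret) return ret;" without a matching put.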



* [PATCH v4 00/32] pinctrl: mediatek: Enable module build support for all drivers
From: Justin Yeh @ 2026-06-26  4:00 UTC (permalink / raw)
  To: Sean Wang, Linus Walleij, Matthias Brugger,
	AngeloGioacchino Del Regno
  Cc: Project_Global_Chrome_Upstream_Group, linux-mediatek, linux-gpio,
	linux-kernel, linux-arm-kernel, Justin Yeh

Sorry for the quick v4 - v3 was sent with an incomplete cover letter
(template placeholders) by mistake. This v4 also unifies MODULE_LICENSE
to consistently use "GPL v2" across all patches.

This series enables all MediaTek pinctrl drivers to be built as loadable
kernel modules. This is required for Android GKI (Generic Kernel Image) +
vendor_dlkm deployments where vendor-specific drivers must be kept separate
from the GKI vmlinux.

Each patch adds MODULE_LICENSE("GPL v2") and MODULE_DESCRIPTION() macros where
missing, and changes the Kconfig option from bool to tristate. This allows
these drivers to be properly packaged as vendor kernel modules while
maintaining the existing built-in option.

Changes in v4:
  * Fix cover letter content (v3 accidentally sent with template placeholders)
  * Unify MODULE_LICENSE to use "GPL v2" consistently across all drivers
  * Update all commit messages to reflect "GPL v2" instead of "GPL"

Changes in v3:
  * Add MODULE_DESCRIPTION() for all drivers (even those that already had MODULE_LICENSE)
  * Update commit messages to reflect that we're adding MODULE_DESCRIPTION too

Changes in v2:
  * Squash MODULE_LICENSE and tristate changes into single patch per driver
  * Extend fix to all MediaTek pinctrl drivers (32 total), not just MT8189
  * Add Android GKI + vendor_dlkm context to cover letter
  * Add MODULE_DESCRIPTION() where it was missing
  * Add Fixes: tags referencing the original commits that added each driver

Justin Yeh (32):
  pinctrl: mediatek: mt8189: Enable module build support
  pinctrl: mediatek: mt6878: Enable module build support
  pinctrl: mediatek: mt6893: Enable module build support
  pinctrl: mediatek: mt7622: Enable module build support
  pinctrl: mediatek: mt7981: Enable module build support
  pinctrl: mediatek: mt7986: Enable module build support
  pinctrl: mediatek: mt7988: Enable module build support
  pinctrl: mediatek: mt8167: Enable module build support
  pinctrl: mediatek: mt8173: Enable module build support
  pinctrl: mediatek: mt8183: Enable module build support
  pinctrl: mediatek: mt8186: Enable module build support
  pinctrl: mediatek: mt8188: Enable module build support
  pinctrl: mediatek: mt8192: Enable module build support
  pinctrl: mediatek: mt8195: Enable module build support
  pinctrl: mediatek: mt8196: Enable module build support
  pinctrl: mediatek: mt8365: Enable module build support
  pinctrl: mediatek: mt8516: Enable module build support
  pinctrl: mediatek: mt2701: Enable module build support
  pinctrl: mediatek: mt7623: Enable module build support
  pinctrl: mediatek: mt7629: Enable module build support
  pinctrl: mediatek: mt8135: Enable module build support
  pinctrl: mediatek: mt8127: Enable module build support
  pinctrl: mediatek: mt7620: Enable module build support
  pinctrl: mediatek: mt7621: Enable module build support
  pinctrl: mediatek: mt76x8: Enable module build support
  pinctrl: mediatek: rt2880: Enable module build support
  pinctrl: mediatek: rt305x: Enable module build support
  pinctrl: mediatek: rt3883: Enable module build support
  pinctrl: mediatek: mt6397: Enable module build support
  pinctrl: mediatek: mt2712: Enable module build support
  pinctrl: mediatek: mt6795: Enable module build support
  pinctrl: mediatek: mt6797: Enable module build support

 drivers/pinctrl/mediatek/Kconfig          | 64 +++++++++++------------
 drivers/pinctrl/mediatek/pinctrl-mt2701.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt2712.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt6397.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt6795.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt6797.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt6878.c |  1 +
 drivers/pinctrl/mediatek/pinctrl-mt6893.c |  1 +
 drivers/pinctrl/mediatek/pinctrl-mt7620.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt7621.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt7622.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt7623.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt7629.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt76x8.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt7981.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt7986.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt7988.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt8127.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt8135.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt8167.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt8173.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt8183.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt8186.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt8188.c |  1 +
 drivers/pinctrl/mediatek/pinctrl-mt8189.c |  1 +
 drivers/pinctrl/mediatek/pinctrl-mt8192.c |  1 +
 drivers/pinctrl/mediatek/pinctrl-mt8195.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-mt8196.c |  1 +
 drivers/pinctrl/mediatek/pinctrl-mt8365.c |  1 +
 drivers/pinctrl/mediatek/pinctrl-mt8516.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-rt2880.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-rt305x.c |  3 ++
 drivers/pinctrl/mediatek/pinctrl-rt3883.c |  3 ++
 33 files changed, 114 insertions(+), 32 deletions(-)

-- 
2.45.2




* [PATCH v4 01/32] pinctrl: mediatek: mt8189: Enable module build support
From: Justin Yeh @ 2026-06-26  4:00 UTC (permalink / raw)
  To: Sean Wang, Linus Walleij, Matthias Brugger,
	AngeloGioacchino Del Regno
  Cc: Project_Global_Chrome_Upstream_Group, linux-mediatek, linux-gpio,
	linux-kernel, linux-arm-kernel, Justin Yeh
In-Reply-To: <20260626040112.2436185-1-justin.yeh@mediatek.com>

Add MODULE_LICENSE("GPL v2") macro and change Kconfig option from
bool to tristate to allow building as a loadable kernel module.

This is required for Android GKI + vendor_dlkm deployments where
vendor-specific drivers must be kept separate from the GKI vmlinux.

Fixes: a3fe1324c3c5 ("pinctrl: mediatek: Add pinctrl driver for mt8189")

Signed-off-by: Justin Yeh <justin.yeh@mediatek.com>
---
 drivers/pinctrl/mediatek/Kconfig          | 2 +-
 drivers/pinctrl/mediatek/pinctrl-mt8189.c | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pinctrl/mediatek/Kconfig b/drivers/pinctrl/mediatek/Kconfig
index 97980cc28b9c..e79139700d72 100644
--- a/drivers/pinctrl/mediatek/Kconfig
+++ b/drivers/pinctrl/mediatek/Kconfig
@@ -255,7 +255,7 @@ config PINCTRL_MT8188
 	  map specific eint which doesn't have real gpio pin.
 
 config PINCTRL_MT8189
-        bool "MediaTek MT8189 pin control"
+        tristate "MediaTek MT8189 pin control"
         depends on OF
         depends on ARM64 || COMPILE_TEST
         default ARM64 && ARCH_MEDIATEK
diff --git a/drivers/pinctrl/mediatek/pinctrl-mt8189.c b/drivers/pinctrl/mediatek/pinctrl-mt8189.c
index cd4cdff309a1..a9c128c514a4 100644
--- a/drivers/pinctrl/mediatek/pinctrl-mt8189.c
+++ b/drivers/pinctrl/mediatek/pinctrl-mt8189.c
@@ -1696,3 +1696,4 @@ static int __init mt8189_pinctrl_init(void)
 arch_initcall(mt8189_pinctrl_init);
 
 MODULE_DESCRIPTION("MediaTek MT8189 Pinctrl Driver");
+MODULE_LICENSE("GPL v2");
-- 
2.45.2




* RE: [EXTERNAL] Re: [PATCH v4 1/3] perf: marvell: Add MPAM partid filtering to CN10K TAD PMU
From: Geethasowjanya Akula @ 2026-06-26  6:21 UTC (permalink / raw)
  To: Ben Horgan, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org
  Cc: mark.rutland@arm.com, will@kernel.org, krzk+dt@kernel.org,
	james.morse@arm.com, Sunil Kovvuri Goutham, Tanmay Jagdale
In-Reply-To: <6b15d3fc-4b4e-4c6c-a96d-5817d7114d02@arm.com>



>-----Original Message-----
>From: Ben Horgan <ben.horgan@arm.com>
>Sent: Thursday, June 25, 2026 7:23 PM
>To: Geethasowjanya Akula <gakula@marvell.com>; linux-perf-
>users@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
>kernel@lists.infradead.org; devicetree@vger.kernel.org
>Cc: mark.rutland@arm.com; will@kernel.org; krzk+dt@kernel.org;
>james.morse@arm.com
>Subject: [EXTERNAL] Re: [PATCH v4 1/3] perf: marvell: Add MPAM partid
>filtering to CN10K TAD PMU
>Hi Geetha,
>
>+CC James
>
>On 6/18/26 16:36, Geetha sowjanya wrote:
>> From: Tanmay Jagdale <tanmay@marvell.com>
>>
>> The TAD PMU exposes counters that can be filtered by MPAM partition id
>> for a subset of allocation and hit events.
>>
>> Add a 9-bit partid format attribute (config1) and route counter
>> programming through variant-specific ops so CN10K keeps MPAM-capable
>> programming while Odyssey keeps the reduced event set without advertising
>> partid in sysfs.
>>
>> Probe no longer mutates the platform_device MMIO resource (walk a
>> local map_start), rejects tad-cnt / page sizes of zero, validates the
>> memory window against tad-cnt, and registers the perf PMU before
>> hotplug with correct unwind.
>>
>> Example:
>>   perf stat -e tad/tad_alloc_any,partid=0x12,partid_en=1/ -- <program>
>
>Where is the user expected to get the PARTID from? The MPAM driver
>considers the PARTID as an internal only value.
>
>resctrl does support a 'debug' mount option which will show the CLOSID
>associated with a control group. Whilst the CLOSID is often the PARTID, it is
>really a set of PARTIDs. When the cdp mount option is used, CLOSID maps to 2
>PARTIDs and if we use PARTID narrowing to give us more monitors, as
>proposed in [1], then the set of PARTIDs may be bigger.
>Furthermore, if the PARTID narrowing scheme is made dynamic the size of the
>PARTID set may change when control or monitoring groups are created or
>deleted.
>
>It seems that a way to map from a resctrl control group to the set of PARTIDs is
>required and a mechanism to tie this to lifetime of the resctrl mount.
>
>Perhaps some helpers along the lines of:
>
>int resctrl_mount_generation(void)
>int mpam_rdtgrp_to_partid_is_static(int mount_gen)
>int resctrl_rdtgrp_generation(char *name)
>int mpam_rdtgrp_to_partid_count(char *name, int rdt_gen)
>int mpam_rdtgrp_to_partid_array(char *name, int rdt_gen, int* partids)
>
>The rdtgrp generation is an attempt to avoid having to use a debug interface
>in anger and cope with renaming of control groups in resctrl.
>This does seem a bit unwieldy so hopefully there is a better way to do this.
>
>Sorry to throw a spanner in the works.
Hi Ben,

Thank you for the detailed feedback. The concern you raise is valid,
particularly from the perspective of resctrl-managed deployments.

However, to clarify the intent of this patch: the exposure of partid in
the TAD PMU is deliberately a low-level, hardware-facing interface, and is
not intended to integrate with or mirror the abstractions provided by
resctrl. It is mainly meant for platform bring-up and low-level
performance/debug users, who already have explicit knowledge of the MPAM
configuration, typically provisioned by firmware or other privileged
software layers (e.g. EL3/EL2). In such environments, PARTIDs are known
out-of-band, so the expectation is that the user supplying partid is
already aware of the MPAM IDs programmed on the system.

A proper "profile this resctrl group" path would require MPAM-resctrl
support (e.g. something along the lines of the helpers you suggest) to
resolve a group to its PARTID set. This is indeed important, but it
constitutes a separate design discussion that is outside the scope of this
driver patch.

We will clarify this in the commit message and avoid implying that users
normally obtain PARTIDs from resctrl today.


Thanks,
Geetha
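For reference while reading the quoted patch below, the config1 layout it describes (bits 0..8 partid, bit 9 partid_en, higher bits reserved) and the resulting TAD_PRF value can be modeled in userspace. This is a sketch derived from the quoted diff for the CN10K (V1) variant only, not the driver itself: BIT_U64, GENMASK_U64, TAD_PARTID_MASK, tad_check_config() and tad_prf_value() are names invented here, and the bit positions are copied from the patch as posted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local stand-ins for the kernel's BIT()/GENMASK() helpers. */
#define BIT_U64(n)		(UINT64_C(1) << (n))
#define GENMASK_U64(h, l)	((~UINT64_C(0) >> (63 - (h))) & ~(BIT_U64(l) - 1))

#define TAD_EVENT_SEL_MASK	GENMASK_U64(7, 0)
#define TAD_PARTID_MASK		GENMASK_U64(8, 0)	/* hypothetical name */
#define TAD_PARTID_FILTER_EN	BIT_U64(9)
#define TAD_PRF_MATCH_PARTID	BIT_U64(8)
#define TAD_PRF_PARTID_NS	BIT_U64(10)

/* Mirrors the V1 checks quoted from tad_pmu_event_init(). */
static int tad_check_config(uint64_t config, uint64_t config1)
{
	uint64_t event = config & TAD_EVENT_SEL_MASK;

	if (config & ~TAD_EVENT_SEL_MASK)
		return -EINVAL;		/* event id wider than 8 bits */
	if (config1 & ~GENMASK_U64(9, 0))
		return -EINVAL;		/* reserved config1 bits set */
	if ((config1 & TAD_PARTID_MASK) && !(config1 & TAD_PARTID_FILTER_EN))
		return -EINVAL;		/* partid given without partid_en */
	if ((config1 & TAD_PARTID_FILTER_EN) &&
	    (event <= 0x19 || event >= 0x21))
		return -EINVAL;		/* event is not MPAM-capable */
	return 0;
}

/* Builds the value the quoted start_counter op writes to TAD_PRF. */
static uint64_t tad_prf_value(uint64_t config, uint64_t config1)
{
	uint64_t event = config & TAD_EVENT_SEL_MASK;
	uint64_t val = event;

	/* Callers are expected to validate first, as event_init does. */
	assert(tad_check_config(config, config1) == 0);

	if ((config1 & TAD_PARTID_FILTER_EN) && event > 0x19 && event < 0x21)
		val |= TAD_PRF_MATCH_PARTID | TAD_PRF_PARTID_NS |
		       ((config1 & TAD_PARTID_MASK) << 11);
	return val;
}
```

Under these assumptions, event 0x1f (tad_hit_any) with partid=0x12 and partid_en=1 gives config1 = 0x212 and a TAD_PRF value of 0x951f, while partid=0x12 without partid_en is rejected with -EINVAL.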
>
>Thanks,
>
>Ben
>
>>
>> Signed-off-by: Tanmay Jagdale <tanmay@marvell.com>
>> Signed-off-by: Geetha sowjanya <gakula@marvell.com>
>> ---
>>
>> Changelog (since v3)
>> --------------------
>> - Restore cpuhp_state_add_instance_nocalls before perf_pmu_register in
>>   probe so users cannot attach events before the hotplug instance exists;
>>   unwind removes the hotplug instance if perf registration fails.
>> - Add perf_ready: tad_pmu_offline_cpu skips perf_pmu_migrate_context
>>   until after successful perf_pmu_register, so a CPU offline between
>>   hotplug add and perf register does not touch perf core state for an
>>   unregistered PMU.
>>
>> Changelog (since v2)
>> --------------------
>> - Validate the eventId using an appropriate mask to ensure
>>   it is restricted to 8 bits.
>>
>> Changelog (since v1)
>> --------------------
>> - Fix config1 filter enable to use bit 9 consistently with the PMU format
>>   string (partid_en) and reject reserved bits with GENMASK(9, 0).
>> - Register perf_pmu_register before cpuhp_state_add_instance_nocalls and
>>   unregister on hotplug failure.
>>
>>  drivers/perf/marvell_cn10k_tad_pmu.c | 220 +++++++++++++++++++++------
>>  1 file changed, 171 insertions(+), 49 deletions(-)
>>
>> diff --git a/drivers/perf/marvell_cn10k_tad_pmu.c b/drivers/perf/marvell_cn10k_tad_pmu.c
>> index 51ccb0befa05..340be3776fe7 100644
>> --- a/drivers/perf/marvell_cn10k_tad_pmu.c
>> +++ b/drivers/perf/marvell_cn10k_tad_pmu.c
>> @@ -7,6 +7,8 @@
>>  #define pr_fmt(fmt) "tad_pmu: " fmt
>>
>>  #include <linux/io.h>
>> +#include <linux/bits.h>
>> +#include <linux/compiler.h>
>>  #include <linux/module.h>
>>  #include <linux/of.h>
>>  #include <linux/cpuhotplug.h>
>> @@ -14,12 +16,20 @@
>>  #include <linux/platform_device.h>
>>  #include <linux/acpi.h>
>>
>> -#define TAD_PFC_OFFSET		0x800
>> -#define TAD_PFC(counter)	(TAD_PFC_OFFSET | (counter << 3))
>>  #define TAD_PRF_OFFSET		0x900
>> -#define TAD_PRF(counter)	(TAD_PRF_OFFSET | (counter << 3))
>> +#define TAD_PFC_OFFSET		0x800
>> +#define TAD_PFC(base, counter)	((base) | ((u64)(counter) << 3))
>> +#define TAD_PRF(base, counter)	((base) | ((u64)(counter) << 3))
>>  #define TAD_PRF_CNTSEL_MASK	0xFF
>> +#define TAD_PRF_MATCH_PARTID	BIT(8)
>> +#define TAD_PRF_PARTID_NS	BIT(10)
>> +/*
>> + * config1: bits 0..8 MPAM partition id (including 0); bit 9 requests
>> + * filtering for MPAM-capable events. All-zero config1 means no filter.
>> + */
>> +#define TAD_PARTID_FILTER_EN	BIT(9)
>>  #define TAD_MAX_COUNTERS	8
>> +#define TAD_EVENT_SEL_MASK	GENMASK(7, 0)
>>
>>  #define to_tad_pmu(p) (container_of(p, struct tad_pmu, pmu))
>>
>> @@ -27,30 +37,94 @@ struct tad_region {
>>  	void __iomem	*base;
>>  };
>>
>> +enum mrvl_tad_pmu_version {
>> +	TAD_PMU_V1 = 1,
>> +	TAD_PMU_V2,
>> +};
>> +
>> +struct tad_pmu_data {
>> +	int id;
>> +	u64 tad_prf_offset;
>> +	u64 tad_pfc_offset;
>> +};
>> +
>>  struct tad_pmu {
>>  	struct pmu pmu;
>>  	struct tad_region *regions;
>>  	u32 region_cnt;
>>  	unsigned int cpu;
>> +	/* Set after successful perf_pmu_register(); gates offline migration. */
>> +	bool perf_ready;
>> +	const struct tad_pmu_ops *ops;
>> +	const struct tad_pmu_data *pdata;
>>  	struct hlist_node node;
>>  	struct perf_event *events[TAD_MAX_COUNTERS];
>>  	DECLARE_BITMAP(counters_map, TAD_MAX_COUNTERS);
>>  };
>>
>> -enum mrvl_tad_pmu_version {
>> -	TAD_PMU_V1 = 1,
>> -	TAD_PMU_V2,
>> -};
>> -
>> -struct tad_pmu_data {
>> -	int id;
>> +struct tad_pmu_ops {
>> +	void (*start_counter)(struct tad_pmu *pmu, struct perf_event *event);
>>  };
>>
>>  static int tad_pmu_cpuhp_state;
>>
>> +static void tad_pmu_start_counter(struct tad_pmu *pmu,
>> +				  struct perf_event *event)
>> +{
>> +	const struct tad_pmu_data *pdata = pmu->pdata;
>> +	struct hw_perf_event *hwc = &event->hw;
>> +	u32 event_idx = (u32)(event->attr.config & TAD_EVENT_SEL_MASK);
>> +	u32 counter_idx = hwc->idx;
>> +	u64 partid_filter = 0;
>> +	u64 reg_val;
>> +	u64 cfg1 = event->attr.config1;
>> +	bool use_mpam = cfg1 & TAD_PARTID_FILTER_EN;
>> +	u32 partid = (u32)(cfg1 & GENMASK(8, 0));
>> +	int i;
>> +
>> +	for (i = 0; i < pmu->region_cnt; i++)
>> +		writeq_relaxed(0, pmu->regions[i].base +
>> +			       TAD_PFC(pdata->tad_pfc_offset, counter_idx));
>> +
>> +	if (use_mpam && event_idx > 0x19 && event_idx < 0x21) {
>> +		partid_filter = TAD_PRF_MATCH_PARTID | TAD_PRF_PARTID_NS |
>> +				((u64)partid << 11);
>> +	}
>> +
>> +
>> +	for (i = 0; i < pmu->region_cnt; i++) {
>> +		reg_val = event_idx & 0xFF;
>> +		reg_val |= partid_filter;
>> +		writeq_relaxed(reg_val, pmu->regions[i].base +
>> +			       TAD_PRF(pdata->tad_prf_offset, counter_idx));
>> +	}
>> +}
>> +
>> +static void tad_pmu_v2_start_counter(struct tad_pmu *pmu,
>> +				     struct perf_event *event)
>> +{
>> +	const struct tad_pmu_data *pdata = pmu->pdata;
>> +	struct hw_perf_event *hwc = &event->hw;
>> +	u32 event_idx = (u32)(event->attr.config & TAD_EVENT_SEL_MASK);
>> +	u32 counter_idx = hwc->idx;
>> +	u64 reg_val;
>> +	int i;
>> +
>> +	for (i = 0; i < pmu->region_cnt; i++)
>> +		writeq_relaxed(0, pmu->regions[i].base +
>> +			       TAD_PFC(pdata->tad_pfc_offset, counter_idx));
>> +
>> +	for (i = 0; i < pmu->region_cnt; i++) {
>> +		reg_val = event_idx & 0xFF;
>> +		writeq_relaxed(reg_val, pmu->regions[i].base +
>> +			       TAD_PRF(pdata->tad_prf_offset, counter_idx));
>> +	}
>> +}
>> +
>>  static void tad_pmu_event_counter_read(struct perf_event *event)
>>  {
>>  	struct tad_pmu *tad_pmu = to_tad_pmu(event->pmu);
>> +	const struct tad_pmu_data *pdata = tad_pmu->pdata;
>>  	struct hw_perf_event *hwc = &event->hw;
>>  	u32 counter_idx = hwc->idx;
>>  	u64 prev, new;
>> @@ -60,7 +134,7 @@ static void tad_pmu_event_counter_read(struct perf_event *event)
>>  		prev = local64_read(&hwc->prev_count);
>>  		for (i = 0, new = 0; i < tad_pmu->region_cnt; i++)
>>  			new += readq(tad_pmu->regions[i].base +
>> -				     TAD_PFC(counter_idx));
>> +				     TAD_PFC(pdata->tad_pfc_offset, counter_idx));
>>  	} while (local64_cmpxchg(&hwc->prev_count, prev, new) != prev);
>>
>>  	local64_add(new - prev, &event->count);
>> @@ -69,16 +143,14 @@ static void tad_pmu_event_counter_read(struct perf_event *event)
>>  static void tad_pmu_event_counter_stop(struct perf_event *event, int flags)
>>  {
>>  	struct tad_pmu *tad_pmu = to_tad_pmu(event->pmu);
>> +	const struct tad_pmu_data *pdata = tad_pmu->pdata;
>>  	struct hw_perf_event *hwc = &event->hw;
>>  	u32 counter_idx = hwc->idx;
>>  	int i;
>>
>> -	/* TAD()_PFC() stop counting on the write
>> -	 * which sets TAD()_PRF()[CNTSEL] == 0
>> -	 */
>>  	for (i = 0; i < tad_pmu->region_cnt; i++) {
>>  		writeq_relaxed(0, tad_pmu->regions[i].base +
>> -			       TAD_PRF(counter_idx));
>> +			       TAD_PRF(pdata->tad_prf_offset, counter_idx));
>>  	}
>>
>>  	tad_pmu_event_counter_read(event);
>> @@ -89,26 +161,10 @@ static void tad_pmu_event_counter_start(struct perf_event *event, int flags)
>>  {
>>  	struct tad_pmu *tad_pmu = to_tad_pmu(event->pmu);
>>  	struct hw_perf_event *hwc = &event->hw;
>> -	u32 event_idx = event->attr.config;
>> -	u32 counter_idx = hwc->idx;
>> -	u64 reg_val;
>> -	int i;
>>
>>  	hwc->state = 0;
>>
>> -	/* Typically TAD_PFC() are zeroed to start counting */
>> -	for (i = 0; i < tad_pmu->region_cnt; i++)
>> -		writeq_relaxed(0, tad_pmu->regions[i].base +
>> -			       TAD_PFC(counter_idx));
>> -
>> -	/* TAD()_PFC() start counting on the write
>> -	 * which sets TAD()_PRF()[CNTSEL] != 0
>> -	 */
>> -	for (i = 0; i < tad_pmu->region_cnt; i++) {
>> -		reg_val = event_idx & 0xFF;
>> -		writeq_relaxed(reg_val,	tad_pmu->regions[i].base +
>> -			       TAD_PRF(counter_idx));
>> -	}
>> +	tad_pmu->ops->start_counter(tad_pmu, event);
>>  }
>>
>>  static void tad_pmu_event_counter_del(struct perf_event *event, int flags)
>> @@ -128,7 +184,6 @@ static int tad_pmu_event_counter_add(struct perf_event *event, int flags)
>>  	struct hw_perf_event *hwc = &event->hw;
>>  	int idx;
>>
>> -	/* Get a free counter for this event */
>>  	idx = find_first_zero_bit(tad_pmu->counters_map, TAD_MAX_COUNTERS);
>>  	if (idx == TAD_MAX_COUNTERS)
>>  		return -EAGAIN;
>> @@ -148,6 +203,9 @@ static int tad_pmu_event_counter_add(struct perf_event *event, int flags)
>>  static int tad_pmu_event_init(struct perf_event *event)
>>  {
>>  	struct tad_pmu *tad_pmu = to_tad_pmu(event->pmu);
>> +	const struct tad_pmu_data *pdata = tad_pmu->pdata;
>> +	u32 event_idx = (u32)(event->attr.config & TAD_EVENT_SEL_MASK);
>> +	u64 cfg1 = event->attr.config1;
>>
>>  	if (event->attr.type != event->pmu->type)
>>  		return -ENOENT;
>> @@ -158,6 +216,23 @@ static int tad_pmu_event_init(struct perf_event *event)
>>  	if (event->state != PERF_EVENT_STATE_OFF)
>>  		return -EINVAL;
>>
>> +	if (event->attr.config & ~TAD_EVENT_SEL_MASK)
>> +		return -EINVAL;
>> +
>> +	if (pdata->id == TAD_PMU_V2) {
>> +		if (cfg1)
>> +			return -EINVAL;
>> +	} else {
>> +		if ((cfg1 & GENMASK(8, 0)) && !(cfg1 & TAD_PARTID_FILTER_EN))
>> +			return -EINVAL;
>> +		if (cfg1 & TAD_PARTID_FILTER_EN) {
>> +			if (event_idx <= 0x19 || event_idx >= 0x21)
>> +				return -EINVAL;
>> +		}
>> +		if (cfg1 & ~GENMASK(9, 0))
>> +			return -EINVAL;
>> +	}
>> +
>>  	event->cpu = tad_pmu->cpu;
>>  	event->hw.idx = -1;
>>  	event->hw.config_base = event->attr.config;
>> @@ -232,7 +307,7 @@ static struct attribute *ody_tad_pmu_event_attrs[] = {
>>  	TAD_PMU_EVENT_ATTR(tad_hit_ltg, 0x1e),
>>  	TAD_PMU_EVENT_ATTR(tad_hit_any, 0x1f),
>>  	TAD_PMU_EVENT_ATTR(tad_tag_rd, 0x20),
>> -	TAD_PMU_EVENT_ATTR(tad_tot_cycle, 0xFF),
>> +	TAD_PMU_EVENT_ATTR(tad_tot_cycle, 0xff),
>>  	NULL
>>  };
>>
>> @@ -242,9 +317,13 @@ static const struct attribute_group ody_tad_pmu_events_attr_group = {
>>  };
>>
>>  PMU_FORMAT_ATTR(event, "config:0-7");
>> +PMU_FORMAT_ATTR(partid, "config1:0-8");
>> +PMU_FORMAT_ATTR(partid_en, "config1:9-9");
>>
>>  static struct attribute *tad_pmu_format_attrs[] = {
>>  	&format_attr_event.attr,
>> +	&format_attr_partid.attr,
>> +	&format_attr_partid_en.attr,
>>  	NULL
>>  };
>>
>> @@ -253,6 +332,16 @@ static struct attribute_group tad_pmu_format_attr_group = {
>>  	.attrs = tad_pmu_format_attrs,
>>  };
>>
>> +static struct attribute *ody_tad_pmu_format_attrs[] = {
>> +	&format_attr_event.attr,
>> +	NULL
>> +};
>> +
>> +static struct attribute_group ody_tad_pmu_format_attr_group = {
>> +	.name = "format",
>> +	.attrs = ody_tad_pmu_format_attrs,
>> +};
>> +
>>  static ssize_t tad_pmu_cpumask_show(struct device *dev,
>>  				struct device_attribute *attr, char *buf)
>>  {
>> @@ -281,16 +370,25 @@ static const struct attribute_group *tad_pmu_attr_groups[] = {
>>
>>  static const struct attribute_group *ody_tad_pmu_attr_groups[] = {
>>  	&ody_tad_pmu_events_attr_group,
>> -	&tad_pmu_format_attr_group,
>> +	&ody_tad_pmu_format_attr_group,
>>  	&tad_pmu_cpumask_attr_group,
>>  	NULL
>>  };
>>
>> +static const struct tad_pmu_ops tad_pmu_ops = {
>> +	.start_counter = tad_pmu_start_counter,
>> +};
>> +
>> +static const struct tad_pmu_ops tad_pmu_v2_ops = {
>> +	.start_counter = tad_pmu_v2_start_counter,
>> +};
>> +
>>  static int tad_pmu_probe(struct platform_device *pdev)
>>  {
>>  	const struct tad_pmu_data *dev_data;
>>  	struct device *dev = &pdev->dev;
>>  	struct tad_region *regions;
>> +	resource_size_t map_start;
>>  	struct tad_pmu *tad_pmu;
>>  	struct resource *res;
>>  	u32 tad_pmu_page_size;
>> @@ -298,7 +396,6 @@ static int tad_pmu_probe(struct platform_device *pdev)
>>  	u32 tad_cnt;
>>  	int version;
>>  	int i, ret;
>> -	char *name;
>>
>>  	tad_pmu = devm_kzalloc(&pdev->dev, sizeof(*tad_pmu),
>GFP_KERNEL);
>>  	if (!tad_pmu)
>> @@ -312,6 +409,7 @@ static int tad_pmu_probe(struct platform_device
>*pdev)
>>  		return -ENODEV;
>>  	}
>>  	version = dev_data->id;
>> +	tad_pmu->pdata = dev_data;
>>
>>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>>  	if (!res) {
>> @@ -338,22 +436,31 @@ static int tad_pmu_probe(struct platform_device *pdev)
>>  		dev_err(&pdev->dev, "Can't find tad-cnt property\n");
>>  		return ret;
>>  	}
>> +	if (!tad_cnt || !tad_page_size || !tad_pmu_page_size) {
>> +		dev_err(&pdev->dev, "Invalid tad-cnt or page size\n");
>> +		return -EINVAL;
>> +	}
>>
>>  	regions = devm_kcalloc(&pdev->dev, tad_cnt,
>>  			       sizeof(*regions), GFP_KERNEL);
>>  	if (!regions)
>>  		return -ENOMEM;
>>
>> -	/* ioremap the distributed TAD pmu regions */
>> -	for (i = 0; i < tad_cnt && res->start < res->end; i++) {
>> -		regions[i].base = devm_ioremap(&pdev->dev,
>> -					       res->start,
>> +	map_start = res->start;
>> +	for (i = 0; i < tad_cnt; i++) {
>> +		if (map_start > res->end ||
>> +		    tad_pmu_page_size > (resource_size_t)(res->end - map_start + 1)) {
>> +			dev_err(&pdev->dev, "TAD PMU mem window too small for tad-cnt=%u\n",
>> +				tad_cnt);
>> +			return -EINVAL;
>> +		}
>> +		regions[i].base = devm_ioremap(&pdev->dev, map_start,
>>  					       tad_pmu_page_size);
>>  		if (!regions[i].base) {
>>  			dev_err(&pdev->dev, "TAD%d ioremap fail\n", i);
>>  			return -ENOMEM;
>>  		}
>> -		res->start += tad_page_size;
>> +		map_start += tad_page_size;
>>  	}
>>
>>  	tad_pmu->regions = regions;
>> @@ -374,14 +481,16 @@ static int tad_pmu_probe(struct platform_device *pdev)
>>  		.read		= tad_pmu_event_counter_read,
>>  	};
>>
>> -	if (version == TAD_PMU_V1)
>> +	if (version == TAD_PMU_V1) {
>>  		tad_pmu->pmu.attr_groups = tad_pmu_attr_groups;
>> -	else
>> +		tad_pmu->ops		 = &tad_pmu_ops;
>> +	} else {
>>  		tad_pmu->pmu.attr_groups = ody_tad_pmu_attr_groups;
>> +		tad_pmu->ops		 = &tad_pmu_v2_ops;
>> +	}
>>
>>  	tad_pmu->cpu = raw_smp_processor_id();
>>
>> -	/* Register pmu instance for cpu hotplug */
>>  	ret = cpuhp_state_add_instance_nocalls(tad_pmu_cpuhp_state,
>>  					       &tad_pmu->node);
>>  	if (ret) {
>> @@ -389,19 +498,24 @@ static int tad_pmu_probe(struct platform_device *pdev)
>>  		return ret;
>>  	}
>>
>> -	name = "tad";
>> -	ret = perf_pmu_register(&tad_pmu->pmu, name, -1);
>> -	if (ret)
>> +	ret = perf_pmu_register(&tad_pmu->pmu, "tad", -1);
>> +	if (ret) {
>> +		dev_err(&pdev->dev, "Error %d registering perf PMU\n", ret);
>>  		cpuhp_state_remove_instance_nocalls(tad_pmu_cpuhp_state,
>>  						    &tad_pmu->node);
>> +		return ret;
>> +	}
>>
>> -	return ret;
>> +	WRITE_ONCE(tad_pmu->perf_ready, true);
>> +
>> +	return 0;
>>  }
>>
>>  static void tad_pmu_remove(struct platform_device *pdev)
>>  {
>>  	struct tad_pmu *pmu = platform_get_drvdata(pdev);
>>
>> +	WRITE_ONCE(pmu->perf_ready, false);
>>  	cpuhp_state_remove_instance_nocalls(tad_pmu_cpuhp_state,
>>  						&pmu->node);
>>  	perf_pmu_unregister(&pmu->pmu);
>> @@ -410,12 +524,17 @@ static void tad_pmu_remove(struct platform_device *pdev)
>>  #if defined(CONFIG_OF) || defined(CONFIG_ACPI)
>>  static const struct tad_pmu_data tad_pmu_data = {
>>  	.id   = TAD_PMU_V1,
>> +	.tad_prf_offset = TAD_PRF_OFFSET,
>> +	.tad_pfc_offset = TAD_PFC_OFFSET,
>>  };
>> +
>>  #endif
>>
>>  #ifdef CONFIG_ACPI
>>  static const struct tad_pmu_data tad_pmu_v2_data = {
>>  	.id   = TAD_PMU_V2,
>> +	.tad_prf_offset = TAD_PRF_OFFSET,
>> +	.tad_pfc_offset = TAD_PFC_OFFSET,
>>  };
>>  #endif
>>
>> @@ -451,6 +570,9 @@ static int tad_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
>>  	struct tad_pmu *pmu = hlist_entry_safe(node, struct tad_pmu, node);
>>  	unsigned int target;
>>
>> +	if (!READ_ONCE(pmu->perf_ready))
>> +		return 0;
>> +
>>  	if (cpu != pmu->cpu)
>>  		return 0;
>>
>> @@ -491,6 +613,6 @@ static void __exit tad_pmu_exit(void)
>>  module_init(tad_pmu_init);
>>  module_exit(tad_pmu_exit);
>>
>> -MODULE_DESCRIPTION("Marvell CN10K LLC-TAD Perf driver");
>> +MODULE_DESCRIPTION("Marvell CN10K LLC-TAD perf driver");
>>  MODULE_AUTHOR("Bhaskara Budiredla <bbudiredla@marvell.com>");
>>  MODULE_LICENSE("GPL v2");
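
For readers following the probe-loop change quoted above, the window check can be modelled in userspace roughly as below. This is only a sketch: window_fits() and count_mappable() are hypothetical names, plain uint64_t stands in for resource_size_t, and the window is assumed not to span the entire 64-bit range (otherwise "end - map_start + 1" would wrap).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the quoted check: does a mapping of
 * 'size' bytes starting at 'map_start' fit in a window whose last
 * valid address is 'end' (inclusive)?
 */
static bool window_fits(uint64_t map_start, uint64_t end, uint64_t size)
{
	if (map_start > end)
		return false;
	return size <= end - map_start + 1;
}

/*
 * Walk 'cnt' regions of 'map_size' bytes, 'stride' bytes apart, without
 * modifying the window itself. Returns cnt on success, or -1 as soon as
 * one region would fall outside the window.
 */
static int count_mappable(uint64_t start, uint64_t end, uint64_t map_size,
			  uint64_t stride, unsigned int cnt)
{
	uint64_t map_start = start;
	unsigned int i;

	for (i = 0; i < cnt; i++) {
		if (!window_fits(map_start, end, map_size))
			return -1;
		map_start += stride;
	}
	return (int)cnt;
}
```

Keeping the cursor in a local variable, as the quoted hunk does with map_start, leaves the start of the resource untouched, which matters if the resource is inspected again later.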


^ permalink raw reply

* RE: [PATCH v3 1/2] i2c: imx: Clear slave pointer on registration error
From: Carlos Song (OSS) @ 2026-06-26  6:23 UTC (permalink / raw)
  To: Liem, Frank Li (OSS)
  Cc: Frank Li, andi.shyti@kernel.org, Biwen Li, festevam@gmail.com,
	imx@lists.linux.dev, kernel@pengutronix.de,
	linux-arm-kernel@lists.infradead.org, linux-i2c@vger.kernel.org,
	linux-kernel@vger.kernel.org, o.rempel@pengutronix.de,
	s.hauer@pengutronix.de, stable@vger.kernel.org, wsa@kernel.org
In-Reply-To: <20260626025846.106157-2-liem16213@gmail.com>



> -----Original Message-----
> From: Liem <liem16213@gmail.com>
> Sent: Friday, June 26, 2026 10:59 AM
> To: Frank Li (OSS) <frank.li@oss.nxp.com>
> Cc: Frank Li <frank.li@nxp.com>; andi.shyti@kernel.org; Biwen Li
> <biwen.li@nxp.com>; festevam@gmail.com; imx@lists.linux.dev;
> kernel@pengutronix.de; liem16213@gmail.com;
> linux-arm-kernel@lists.infradead.org; linux-i2c@vger.kernel.org;
> linux-kernel@vger.kernel.org; o.rempel@pengutronix.de;
> s.hauer@pengutronix.de; stable@vger.kernel.org; wsa@kernel.org
> Subject: [PATCH v3 1/2] i2c: imx: Clear slave pointer on registration error
> 
> In i2c_imx_reg_slave(), i2c_imx->slave is checked at the beginning and the
> function returns -EBUSY if it is non-NULL.  If
> pm_runtime_resume_and_get() fails later, the error path returns without clearing
> i2c_imx->slave, leaving it non-NULL.  Subsequent attempts to register a slave will
> then immediately fail with -EBUSY, making it impossible to register the slave
> again.
> 
> Fix by setting i2c_imx->slave = NULL on the error path.
> 
> Fixes: f7414cd6923f ("i2c: imx: support slave mode for imx I2C driver")
> Cc: stable@vger.kernel.org
> Signed-off-by: Liem <liem16213@gmail.com>
> ---
>  drivers/i2c/busses/i2c-imx.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
> index 28313d0fad37..17defb470776 100644
> --- a/drivers/i2c/busses/i2c-imx.c
> +++ b/drivers/i2c/busses/i2c-imx.c

Hi, Liem

LGTM, but I noticed that Sashiko raised a point worth considering:

Can this assignment race with the interrupt handler?
Because the driver uses a shared IRQ, i2c_imx_isr() could execute
concurrently if another device triggers the interrupt line.
If the ISR acquires slave_lock and evaluates i2c_imx->slave as valid, and
this error path locklessly sets it to NULL, wouldn't subsequent accesses
in the ISR dereference a NULL pointer?

> @@ -936,6 +936,7 @@ static int i2c_imx_reg_slave(struct i2c_client *client)
>  	/* Resume */
>  	ret = pm_runtime_resume_and_get(i2c_imx->adapter.dev.parent);
>  	if (ret < 0) {
> +		i2c_imx->slave = NULL;

>  		dev_err(&i2c_imx->adapter.dev, "failed to resume i2c controller");
>  		return ret;
>  	}

Would something like the following help?

diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index 73317ddd5f02..e50058fd39ee 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -930,14 +930,16 @@ static int i2c_imx_reg_slave(struct i2c_client *client)
        if (i2c_imx->slave)
                return -EBUSY;
 
-       i2c_imx->slave = client;
-       i2c_imx->last_slave_event = I2C_SLAVE_STOP;
-
        /* Resume */
        ret = pm_runtime_resume_and_get(i2c_imx->adapter.dev.parent);
        if (ret < 0) {
                dev_err(&i2c_imx->adapter.dev, "failed to resume i2c controller");
                return ret;
+        }
+
+       scoped_guard(spinlock_irqsave, &i2c_imx->slave_lock) {
+               i2c_imx->slave = client;
+               i2c_imx->last_slave_event = I2C_SLAVE_STOP;
        }
 
        i2c_imx_slave_init(i2c_imx);
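
To spell out the ordering in the diff above, here is a small userspace sketch, not driver code. Everything in it is a stand-in: struct ctrl, model_resume() and model_reg_slave() are hypothetical names, and a pthread mutex plays the role of slave_lock. The point is only that the step that can fail runs first and the pointer is published last, under the lock, so the error path has nothing to undo.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model of the controller state. */
struct ctrl {
	pthread_mutex_t lock;	/* stands in for slave_lock */
	void *slave;		/* stands in for i2c_imx->slave */
	int resume_fails;	/* test knob: make the "resume" step fail */
};

/* Stand-in for pm_runtime_resume_and_get(): 0 on success, negative on error. */
static int model_resume(struct ctrl *c)
{
	return c->resume_fails ? -EIO : 0;
}

/* Register a client: fallible step first, publish the pointer last. */
static int model_reg_slave(struct ctrl *c, void *client)
{
	int ret;

	if (c->slave)
		return -EBUSY;

	ret = model_resume(c);
	if (ret < 0)
		return ret;	/* nothing was published, nothing to undo */

	/* A reader holding the lock never sees the pointer change under it. */
	pthread_mutex_lock(&c->lock);
	c->slave = client;
	pthread_mutex_unlock(&c->lock);

	return 0;
}
```

With this ordering a failed resume leaves slave NULL, so a later registration attempt is not rejected with -EBUSY.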

> --
> 2.34.1
> 



^ permalink raw reply related

* RE: [PATCH v3 2/2] i2c: imx: Cancel hrtimer before clearing slave pointer
From: Carlos Song (OSS) @ 2026-06-26  6:26 UTC (permalink / raw)
  To: Liem, Frank Li (OSS)
  Cc: Frank Li, andi.shyti@kernel.org, Biwen Li, festevam@gmail.com,
	imx@lists.linux.dev, kernel@pengutronix.de,
	linux-arm-kernel@lists.infradead.org, linux-i2c@vger.kernel.org,
	linux-kernel@vger.kernel.org, o.rempel@pengutronix.de,
	s.hauer@pengutronix.de, stable@vger.kernel.org, wsa@kernel.org
In-Reply-To: <20260626025846.106157-3-liem16213@gmail.com>



> -----Original Message-----
> From: Liem <liem16213@gmail.com>
> Sent: Friday, June 26, 2026 10:59 AM
> To: Frank Li (OSS) <frank.li@oss.nxp.com>
> Cc: Frank Li <frank.li@nxp.com>; andi.shyti@kernel.org; Biwen Li
> <biwen.li@nxp.com>; festevam@gmail.com; imx@lists.linux.dev;
> kernel@pengutronix.de; liem16213@gmail.com;
> linux-arm-kernel@lists.infradead.org; linux-i2c@vger.kernel.org;
> linux-kernel@vger.kernel.org; o.rempel@pengutronix.de;
> s.hauer@pengutronix.de; stable@vger.kernel.org; wsa@kernel.org
> Subject: [PATCH v3 2/2] i2c: imx: Cancel hrtimer before clearing slave pointer
> 
> In i2c_imx_unreg_slave(), the slave pointer is set to NULL after disabling
> interrupts.  However, a pending interrupt might already have started the
> hrtimer (i2c_imx_slave_timeout) before the pointer was cleared.  If the hrtimer
> fires after i2c_imx->slave is set to NULL, the timer callback
> i2c_imx_slave_finish_op() will call
> i2c_imx_slave_event() with a NULL slave pointer, which results in a use-after-free /
> NULL pointer dereference.
> 
> Fix by canceling the hrtimer and waiting for it to complete after disabling
> interrupts, before clearing the slave pointer.
> 
> Fixes: f7414cd6923f ("i2c: imx: support slave mode for imx I2C driver")
> Cc: stable@vger.kernel.org
> Signed-off-by: Liem <liem16213@gmail.com>

Hi,

LGTM, thank you very much!

Acked-by: Carlos Song <carlos.song@nxp.com>

> ---
>  drivers/i2c/busses/i2c-imx.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
> index 17defb470776..f02c216ba299 100644
> --- a/drivers/i2c/busses/i2c-imx.c
> +++ b/drivers/i2c/busses/i2c-imx.c
> @@ -959,6 +959,7 @@ static int i2c_imx_unreg_slave(struct i2c_client *client)
> 
>  	i2c_imx_reset_regs(i2c_imx);
> 
> +	hrtimer_cancel(&i2c_imx->slave_timer);
>  	i2c_imx->slave = NULL;
> 
>  	/* Suspend */
> --
> 2.34.1
> 
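
For anyone reading along, the ordering argument in the commit message can be illustrated with a deliberately simplified, single-threaded model. All names here (struct ctrl, model_timer_fire(), model_unreg_slave()) are made up for the sketch, and the "timer" is just a flag. The real hrtimer_cancel() additionally waits for a callback that is already running, which this model does not attempt to show.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the slave client and controller state. */
struct slave {
	int events;
};

struct ctrl {
	struct slave *slave;
	bool timer_armed;	/* a timeout was started by an earlier interrupt */
};

/*
 * Model of the timer callback. Returns 1 if it delivered an event,
 * 0 if the timer was not armed, and -1 where the real callback would
 * have dereferenced a NULL slave pointer.
 */
static int model_timer_fire(struct ctrl *c)
{
	if (!c->timer_armed)
		return 0;
	c->timer_armed = false;
	if (!c->slave)
		return -1;
	c->slave->events++;
	return 1;
}

/* Model of unregistration, with and without cancelling the timer first. */
static void model_unreg_slave(struct ctrl *c, bool cancel_first)
{
	if (cancel_first)
		c->timer_armed = false;	/* stands in for hrtimer_cancel() */
	c->slave = NULL;
}
```

Without the cancel, a timer armed before unregistration still fires and finds slave == NULL. With the cancel placed before the pointer is cleared, the late callback never runs.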



^ permalink raw reply

* Re: [PATCH RFC 1/4] media: imx8mq-mipi-csi2: Make reset release SoC-specific
From: Philipp Zabel @ 2026-06-26  6:47 UTC (permalink / raw)
  To: Vincent Cloutier, linux-media, devicetree, linux-arm-kernel
  Cc: linux-kernel, linux-imx, kernel, Vincent Cloutier,
	Laurent Pinchart, Frank Li, Martin Kepplinger-Novakovic,
	Rui Miguel Silva, Mauro Carvalho Chehab, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, imx
In-Reply-To: <20260626000715.1111803-2-vincent.cloutier@icloud.com>

On Thu, 2026-06-25 at 20:06 -0400, Vincent Cloutier wrote:
> From: Vincent Cloutier <vincent@cloutier.co>
> 
> The CSI-2 software reset helper currently asserts the reset control and
> then releases it again unconditionally.
> 
> That release step is required by the i.MX8QXP path, but it changes the
> reset sequence used by i.MX8MQ. On Librem 5r4, which is i.MX8MQ-based,
> the unconditional release step prevents the camera pipeline from producing
> frames after reset; captures time out waiting for EOF from the CSI bridge.
> 
> This series enables the Librem 5 rear camera on the second i.MX8MQ CSI-2
> receiver. Keep the i.MX8MQ path on the known-working assert-only software
> reset sequence while preserving the explicit release step for i.MX8QXP.
> 
> Make reset release opt-in through platform data.
> 
> Tested on Librem 5r4 with the existing HI846 front camera and the S5K3L6
> rear camera added by this series.

I think you are missing [1] ("reset: imx7: Correct polarity of MIPI CSI
resets on i.MX8MQ") instead.

[1] https://lore.kernel.org/all/20260619073115.3778313-1-robby.cai@oss.nxp.com/


regards
Philipp


^ permalink raw reply

* Re: [PATCH v2] nvme-apple: Use acquire/release for queue enabled state
From: Christoph Hellwig @ 2026-06-26  6:59 UTC (permalink / raw)
  To: Gui-Dong Han
  Cc: sven, kbusch, linux-nvme, axboe, hch, sagi, j, neal, asahi,
	linux-kernel, linux-arm-kernel, baijiaju1990
In-Reply-To: <20260618021543.3866850-1-hanguidong02@gmail.com>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>



^ permalink raw reply

* Re: [PATCH v2 3/5] mmc: sdhci-esdhc-imx: restore pinctrl before restoring ios timing on resume
From: Bough Chen @ 2026-06-26  6:58 UTC (permalink / raw)
  To: Luke Wang (OSS)
  Cc: Frank Li (OSS), adrian.hunter@intel.com, ulfh@kernel.org,
	Bough Chen, Frank Li, s.hauer@pengutronix.de,
	kernel@pengutronix.de, festevam@gmail.com, imx@lists.linux.dev,
	linux-mmc@vger.kernel.org, dl-S32,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <AM7PR04MB68708D2075BE3EF4BC35204BEDEB2@AM7PR04MB6870.eurprd04.prod.outlook.com>

On Fri, Jun 26, 2026 at 06:03:05AM +0000, Luke Wang (OSS) wrote:
> 
> 
> > -----Original Message-----
> > From: Frank Li (OSS) <frank.li@oss.nxp.com>
> > Sent: Friday, June 26, 2026 12:36 AM
> > To: Luke Wang (OSS) <ziniu.wang_1@oss.nxp.com>
> > Cc: adrian.hunter@intel.com; ulfh@kernel.org; Bough Chen
> > <haibo.chen@nxp.com>; Frank Li <frank.li@nxp.com>;
> > s.hauer@pengutronix.de; kernel@pengutronix.de; festevam@gmail.com;
> > imx@lists.linux.dev; linux-mmc@vger.kernel.org; dl-S32 <S32@nxp.com>;
> > linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH v2 3/5] mmc: sdhci-esdhc-imx: restore pinctrl before
> > restoring ios timing on resume
> > 
> > On Thu, Jun 25, 2026 at 06:59:32PM +0800, ziniu.wang_1@oss.nxp.com
> > wrote:
> > > From: Luke Wang <ziniu.wang_1@nxp.com>
> > >
> > > SDIO devices such as WiFi may keep power during suspend, so the MMC
> > > core skips full card re-initialization on resume and directly restores
> > > the host controller's ios timing to match the card. For DDR mode,
> > > pm_runtime_force_resume() sets DDR_EN before the pin configuration is
> > > restored from sleep state. When DDR_EN is set while the pinctrl is still
> > > muxed to GPIO or other non-uSDHC function, the loopback clock from the
> > > external pad is not valid, resulting in an incorrect internal sampling
> > > point being selected. This causes persistent read CRC errors on subsequent
> > > data transfers, even after the pinctrl is later configured correctly.
> > >
> > > > SD/eMMC running in DDR mode are unaffected as they are fully
> > > > re-initialized from legacy timing after resume.
> > >
> > > Fix this by restoring the pinctrl state based on current timing mode
> > > using esdhc_change_pinstate() before pm_runtime_force_resume(). This
> > > ensures the correct pin configuration (e.g., 100/200MHz for UHS modes)
> > > is applied. Only restore for non-wakeup devices since wakeup devices
> > > kept their active pin state during suspend to avoid glitching the SD
> > > bus pins for powered SDIO cards.
> > 
> > > A pin state change should only affect drive strength, so why would it cause a glitch?
> 
> You're right that switching driver strength alone won't cause a glitch. 
> The issue is more specific to the sleep pinctrl state: the uSDHC clock pin is
> low when the clock is stopped, but the sleep pinctrl enables a pull-up on that
> pin, driving it high during suspend. When we switch back to the uSDHC function
> pinctrl on resume, the pin transitions from high back to low, generating 
> a falling edge glitch.
> 
> I'll update the commit message in v3 to clarify this.

The glitch should be related to the SoC IP integration: switching the pinctrl
setting (changing the alt function from GPIO to USDHC) affects the internal
loopback path. If pinctrl configures the pad as GPIO when DDR_EN is set, the DLL
delay is fixed based on the GPIO-function loopback path. When the pinctrl is
later switched to the USDHC function, the internal loopback path changes, and
the previously fixed sample point may no longer suit the current loopback path.

Luke, please add this in the commit log.

Regards
Haibo Chen
> 
> Thanks, 
> Luke
> 
> > 
> > Frank
> > >
> > > > Fixes: 676a83855614 ("mmc: host: sdhci-esdhc-imx: refactor the system PM logic")
> > > Signed-off-by: Luke Wang <ziniu.wang_1@nxp.com>
> > > ---
> > >  drivers/mmc/host/sdhci-esdhc-imx.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > > diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c
> > > index a944351dbcdf..7fcaecdd4ec6 100644
> > > --- a/drivers/mmc/host/sdhci-esdhc-imx.c
> > > +++ b/drivers/mmc/host/sdhci-esdhc-imx.c
> > > > @@ -2114,6 +2114,12 @@ static int sdhci_esdhc_resume(struct device *dev)
> > >  	struct pltfm_imx_data *imx_data = sdhci_pltfm_priv(pltfm_host);
> > >  	int ret;
> > >
> > > +	if (!device_may_wakeup(dev)) {
> > > +		ret = esdhc_change_pinstate(host, host->timing);
> > > +		if (ret)
> > > +			dev_warn(dev, "Failed to restore pinctrl state\n");
> > > +	}
> > >  	pm_runtime_force_resume(dev);
> > >
> > >  	ret = mmc_gpio_set_cd_wake(host->mmc, false);
> > > --
> > > 2.34.1
> > >
> > >


^ permalink raw reply

* [PATCH v3 0/8] KVM: arm64: Rework pKVM vCPU state synchronisation
From: Fuad Tabba @ 2026-06-26  7:04 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
	linux-kernel
  Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
	Sebastian Ene, Hyunwoo Kim, Fuad Tabba

Hi folks,

Changes since v2 [1]:
  - Sync host state only on trap or SError exits, and move it into a
    dedicated handle_exit_pkvm_state(). (Vincent)
  - Collected Vincent's Reviewed-by.

Building on Will's pKVM infrastructure series [2], this series reworks
how pKVM moves vCPU state between the host and EL2, and stops copying a
non-protected guest's state on every world switch.

EL2 gains proper primitives for the state it transfers: vCPU lookup
helpers, and VGIC flush/sync that reduces how much host state EL2
dereferences. The series also moves some preparatory code (such as sys
reg access and PSCI helpers) to shared headers and HYP, and implements
lazy copying of a non-protected guest's register state back to the host
until the host actually needs it, instead of on every exit.
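
As a rough mental model of the lazy copy, and not the code in this series, the idea can be sketched in plain C as below. struct model_vcpu, model_guest_exit() and model_host_get_regs() are hypothetical names invented for the illustration, and the two-register struct regs is a placeholder for the real vCPU context.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder for the register state that gets transferred. */
struct regs {
	uint64_t x0;
	uint64_t pc;
};

/* Hypothetical model: one copy owned by the hypervisor, one by the host. */
struct model_vcpu {
	struct regs hyp;	/* authoritative while the guest is running */
	struct regs host;	/* host-visible copy, possibly stale */
	bool host_stale;	/* host copy has not been refreshed yet */
	unsigned int copies;	/* number of hyp -> host copies performed */
};

/* On guest exit, record the new state but defer the copy to the host. */
static void model_guest_exit(struct model_vcpu *v, uint64_t x0, uint64_t pc)
{
	v->hyp.x0 = x0;
	v->hyp.pc = pc;
	v->host_stale = true;
}

/* The host only pays for the copy when it actually looks at the state. */
static const struct regs *model_host_get_regs(struct model_vcpu *v)
{
	if (v->host_stale) {
		v->host = v->hyp;
		v->host_stale = false;
		v->copies++;
	}
	return &v->host;
}
```

In this model three back-to-back exits followed by one host access cost a single copy rather than three.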

This is the first of two series moving pKVM vCPU state management to
EL2. The follow-up completes the job for protected VMs: state
isolation, PSCI handling at EL2, and the resulting API behaviour.

The series is structured as follows:

  01-04:  Preparatory refactoring (MPIDR, sys reg access, vCPU reset, PSCI
          helpers) to shared headers and HYP.
  05:     Host and hypervisor vCPU lookup primitives.
  06-07:  VGIC: reduce EL2's exposure to host state, add flush/sync primitives.
  08:     Lazy state sync for non-protected guests.

Based on kvmarm/next (1ee27dacbe5dc).

Cheers,
/fuad

[1] https://lore.kernel.org/all/20260619070719.812227-1-tabba@google.com/
[2] https://lore.kernel.org/all/20260105154939.11041-1-will@kernel.org/

Fuad Tabba (5):
  KVM: arm64: Extract MPIDR computation into a shared header
  KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code
  KVM: arm64: Factor out reusable vCPU reset helpers
  KVM: arm64: Move PSCI helper functions to a shared header
  KVM: arm64: Implement lazy vCPU state sync for non-protected guests

Marc Zyngier (3):
  KVM: arm64: Add host and hypervisor vCPU lookup primitives
  KVM: arm64: Minimise EL2's exposure of host VGIC state during world
    switch
  KVM: arm64: Add primitives to flush/sync the VGIC state at EL2

 arch/arm64/include/asm/kvm_arm.h     |  12 ++
 arch/arm64/include/asm/kvm_asm.h     |   1 +
 arch/arm64/include/asm/kvm_emulate.h |  79 +++++++-
 arch/arm64/include/asm/kvm_host.h    |   2 +
 arch/arm64/kvm/arm.c                 |   7 +
 arch/arm64/kvm/handle_exit.c         |  23 +++
 arch/arm64/kvm/hyp/exception.c       |  34 +---
 arch/arm64/kvm/hyp/nvhe/hyp-main.c   | 258 +++++++++++++++++++++++----
 arch/arm64/kvm/psci.c                |  30 +---
 arch/arm64/kvm/reset.c               |  60 +------
 arch/arm64/kvm/sys_regs.c            |  14 +-
 arch/arm64/kvm/sys_regs.h            |  19 ++
 include/kvm/arm_psci.h               |  27 +++
 13 files changed, 403 insertions(+), 163 deletions(-)

-- 
2.39.5



^ permalink raw reply

* [PATCH v3 1/8] KVM: arm64: Extract MPIDR computation into a shared header
From: Fuad Tabba @ 2026-06-26  7:04 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
	linux-kernel
  Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
	Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260626070408.3420953-1-fuad.tabba@linux.dev>

Extract the vCPU MPIDR computation embedded in reset_mpidr() into a
kvm_calculate_mpidr() inline in sys_regs.h, so it can be computed
without duplicating the logic. A follow-up series reuses it to reset
protected vCPUs at EL2.

No functional change intended.

Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Fuad Tabba <fuad.tabba@linux.dev>
---
 arch/arm64/kvm/sys_regs.c | 14 +-------------
 arch/arm64/kvm/sys_regs.h | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 33c921df19b54..674fabe1d40d1 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -976,21 +976,9 @@ static u64 reset_actlr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 
 static u64 reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 {
-	u64 mpidr;
+	u64 mpidr = kvm_calculate_mpidr(vcpu);
 
-	/*
-	 * Map the vcpu_id into the first three affinity level fields of
-	 * the MPIDR. We limit the number of VCPUs in level 0 due to a
-	 * limitation to 16 CPUs in that level in the ICC_SGIxR registers
-	 * of the GICv3 to be able to address each CPU directly when
-	 * sending IPIs.
-	 */
-	mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
-	mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
-	mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
-	mpidr |= (1ULL << 31);
 	vcpu_write_sys_reg(vcpu, mpidr, MPIDR_EL1);
-
 	return mpidr;
 }
 
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index 2a983664220ce..bd56a45abbf9c 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -222,6 +222,25 @@ find_reg(const struct sys_reg_params *params, const struct sys_reg_desc table[],
 	return __inline_bsearch((void *)pval, table, num, sizeof(table[0]), match_sys_reg);
 }
 
+static inline u64 kvm_calculate_mpidr(const struct kvm_vcpu *vcpu)
+{
+	u64 mpidr;
+
+	/*
+	 * Map the vcpu_id into the first three affinity level fields of
+	 * the MPIDR. We limit the number of VCPUs in level 0 due to a
+	 * limitation to 16 CPUs in that level in the ICC_SGIxR registers
+	 * of the GICv3 to be able to address each CPU directly when
+	 * sending IPIs.
+	 */
+	mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
+	mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
+	mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
+	mpidr |= (1ULL << 31);
+
+	return mpidr;
+}
+
 const struct sys_reg_desc *get_reg_by_id(u64 id,
 					 const struct sys_reg_desc table[],
 					 unsigned int num);
-- 
2.39.5
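
As a worked illustration of the mapping that kvm_calculate_mpidr() performs in the patch above, the arithmetic can be reproduced in userspace as follows. This is a sketch under assumptions: calc_mpidr() is a hypothetical stand-in that takes a bare vcpu_id instead of a struct kvm_vcpu, and the MPIDR_LEVEL_SHIFT() macro is copied in so the snippet builds on its own.

```c
#include <stdint.h>

/* Local copies of the arm64 affinity-shift macros, for a standalone build. */
#define MPIDR_LEVEL_BITS_SHIFT	3
#define MPIDR_LEVEL_SHIFT(level) \
	(((1 << (level)) >> 1) << MPIDR_LEVEL_BITS_SHIFT)

/* Hypothetical stand-in for kvm_calculate_mpidr(), taking a bare vcpu_id. */
static uint64_t calc_mpidr(uint32_t vcpu_id)
{
	uint64_t mpidr;

	/* Aff0 holds the low 4 bits, so at most 16 vCPUs per level-0 group. */
	mpidr  = (uint64_t)(vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
	/* Aff1 and Aff2 each take the next 8 bits of the vcpu_id. */
	mpidr |= (uint64_t)((vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
	mpidr |= (uint64_t)((vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
	/* Bit 31 is always set in the value written to MPIDR_EL1. */
	mpidr |= (1ULL << 31);

	return mpidr;
}
```

For example, vcpu_id 0x123 splits into Aff0 = 0x3 and Aff1 = 0x12, giving 0x80001203, while vcpu_id 17 lands in Aff1 = 1, Aff0 = 1, giving 0x80000101.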



^ permalink raw reply related

* [PATCH v3 2/8] KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code
From: Fuad Tabba @ 2026-06-26  7:04 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
	linux-kernel
  Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
	Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260626070408.3420953-1-fuad.tabba@linux.dev>

The vcpu_{read,write}_sys_reg() accessors are host-only, so helpers
built on them such as kvm_vcpu_set_be()/kvm_vcpu_is_be() cannot be
shared with hyp code. exception.c already wraps them in
__vcpu_{read,write}_sys_reg(), which pick the host- or hyp-side accessor
via has_vhe() and so are valid in any context.

Move those wrappers to kvm_emulate.h as kvm_vcpu_{read,write}_sys_reg()
and switch the callers over, so a follow-up series can share that
emulation code at EL2.

No functional change intended.

Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Fuad Tabba <fuad.tabba@linux.dev>
---
 arch/arm64/include/asm/kvm_emulate.h | 22 +++++++++++++++---
 arch/arm64/kvm/hyp/exception.c       | 34 ++++++++--------------------
 2 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5bf3d7e1d92c7..80b30fead3d16 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -506,6 +506,22 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
 	return __vcpu_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
 }
 
+static inline u64 kvm_vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
+{
+	if (has_vhe())
+		return vcpu_read_sys_reg(vcpu, reg);
+
+	return __vcpu_sys_reg(vcpu, reg);
+}
+
+static inline void kvm_vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
+{
+	if (has_vhe())
+		vcpu_write_sys_reg(vcpu, val, reg);
+	else
+		__vcpu_assign_sys_reg(vcpu, reg, val);
+}
+
 static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
 {
 	if (vcpu_mode_is_32bit(vcpu)) {
@@ -516,9 +532,9 @@ static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
 
 		r = vcpu_has_nv(vcpu) ? SCTLR_EL2 : SCTLR_EL1;
 
-		sctlr = vcpu_read_sys_reg(vcpu, r);
+		sctlr = kvm_vcpu_read_sys_reg(vcpu, r);
 		sctlr |= SCTLR_ELx_EE;
-		vcpu_write_sys_reg(vcpu, sctlr, r);
+		kvm_vcpu_write_sys_reg(vcpu, sctlr, r);
 	}
 }
 
@@ -533,7 +549,7 @@ static inline bool kvm_vcpu_is_be(struct kvm_vcpu *vcpu)
 	r = is_hyp_ctxt(vcpu) ? SCTLR_EL2 : SCTLR_EL1;
 	bit = vcpu_mode_priv(vcpu) ? SCTLR_ELx_EE : SCTLR_EL1_E0E;
 
-	return vcpu_read_sys_reg(vcpu, r) & bit;
+	return kvm_vcpu_read_sys_reg(vcpu, r) & bit;
 }
 
 static inline unsigned long vcpu_data_guest_to_host(struct kvm_vcpu *vcpu,
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index bef40ddb16dbc..2cb68dc7d441e 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -20,22 +20,6 @@
 #error Hypervisor code only!
 #endif
 
-static inline u64 __vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
-{
-	if (has_vhe())
-		return vcpu_read_sys_reg(vcpu, reg);
-
-	return __vcpu_sys_reg(vcpu, reg);
-}
-
-static inline void __vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
-{
-	if (has_vhe())
-		vcpu_write_sys_reg(vcpu, val, reg);
-	else
-		__vcpu_assign_sys_reg(vcpu, reg, val);
-}
-
 static void __vcpu_write_spsr(struct kvm_vcpu *vcpu, unsigned long target_mode,
 			      u64 val)
 {
@@ -101,14 +85,14 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
 
 	switch (target_mode) {
 	case PSR_MODE_EL1h:
-		vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL1);
-		sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL1);
-		__vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL1);
+		vbar = kvm_vcpu_read_sys_reg(vcpu, VBAR_EL1);
+		sctlr = kvm_vcpu_read_sys_reg(vcpu, SCTLR_EL1);
+		kvm_vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL1);
 		break;
 	case PSR_MODE_EL2h:
-		vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL2);
-		sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL2);
-		__vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL2);
+		vbar = kvm_vcpu_read_sys_reg(vcpu, VBAR_EL2);
+		sctlr = kvm_vcpu_read_sys_reg(vcpu, SCTLR_EL2);
+		kvm_vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL2);
 		break;
 	default:
 		/* Don't do that */
@@ -185,7 +169,7 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
  */
 static unsigned long get_except32_cpsr(struct kvm_vcpu *vcpu, u32 mode)
 {
-	u32 sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL1);
+	u32 sctlr = kvm_vcpu_read_sys_reg(vcpu, SCTLR_EL1);
 	unsigned long old, new;
 
 	old = *vcpu_cpsr(vcpu);
@@ -281,7 +265,7 @@ static void enter_exception32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset)
 {
 	unsigned long spsr = *vcpu_cpsr(vcpu);
 	bool is_thumb = (spsr & PSR_AA32_T_BIT);
-	u32 sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL1);
+	u32 sctlr = kvm_vcpu_read_sys_reg(vcpu, SCTLR_EL1);
 	u32 return_address;
 
 	*vcpu_cpsr(vcpu) = get_except32_cpsr(vcpu, mode);
@@ -305,7 +289,7 @@ static void enter_exception32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset)
 	if (sctlr & (1 << 13))
 		vect_offset += 0xffff0000;
 	else /* always have security exceptions */
-		vect_offset += __vcpu_read_sys_reg(vcpu, VBAR_EL1);
+		vect_offset += kvm_vcpu_read_sys_reg(vcpu, VBAR_EL1);
 
 	*vcpu_pc(vcpu) = vect_offset;
 }
-- 
2.39.5



^ permalink raw reply related

* [PATCH v3 3/8] KVM: arm64: Factor out reusable vCPU reset helpers
From: Fuad Tabba @ 2026-06-26  7:04 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
	linux-kernel
  Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
	Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260626070408.3420953-1-fuad.tabba@linux.dev>

Pull the reusable pieces out of kvm_reset_vcpu(): expose the reset
PSTATE values in kvm_arm.h, and split the core register reset and the
PSCI-driven reset into kvm_reset_vcpu_core() and kvm_reset_vcpu_psci().
A follow-up series reuses these to reset protected vCPUs at EL2.

No functional change intended.

Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Fuad Tabba <fuad.tabba@linux.dev>
---
 arch/arm64/include/asm/kvm_arm.h     | 12 ++++++
 arch/arm64/include/asm/kvm_emulate.h | 57 ++++++++++++++++++++++++++
 arch/arm64/kvm/reset.c               | 60 ++--------------------------
 3 files changed, 72 insertions(+), 57 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 3f9233b5a1308..aba4ec09acd23 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -348,4 +348,16 @@
 	{ PSR_AA32_MODE_UND,	"32-bit UND" },	\
 	{ PSR_AA32_MODE_SYS,	"32-bit SYS" }
 
+/*
+ * ARMv8 Reset Values
+ */
+#define VCPU_RESET_PSTATE_EL1	(PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | \
+				 PSR_F_BIT | PSR_D_BIT)
+
+#define VCPU_RESET_PSTATE_EL2	(PSR_MODE_EL2h | PSR_A_BIT | PSR_I_BIT | \
+				 PSR_F_BIT | PSR_D_BIT)
+
+#define VCPU_RESET_PSTATE_SVC	(PSR_AA32_MODE_SVC | PSR_AA32_A_BIT | \
+				 PSR_AA32_I_BIT | PSR_AA32_F_BIT)
+
 #endif /* __ARM64_KVM_ARM_H__ */
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 80b30fead3d16..2385d8855fcfd 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -704,4 +704,61 @@ static inline void vcpu_set_hcrx(struct kvm_vcpu *vcpu)
 			vcpu->arch.hcrx_el2 |= HCRX_EL2_EnASR;
 	}
 }
+
+/* Reset a vcpu's core registers. */
+static inline void kvm_reset_vcpu_core(struct kvm_vcpu *vcpu)
+{
+	u32 pstate;
+
+	if (vcpu_el1_is_32bit(vcpu))
+		pstate = VCPU_RESET_PSTATE_SVC;
+	else if (vcpu_has_nv(vcpu))
+		pstate = VCPU_RESET_PSTATE_EL2;
+	else
+		pstate = VCPU_RESET_PSTATE_EL1;
+
+	/* Reset core registers */
+	memset(vcpu_gp_regs(vcpu), 0, sizeof(*vcpu_gp_regs(vcpu)));
+	memset(&vcpu->arch.ctxt.fp_regs, 0, sizeof(vcpu->arch.ctxt.fp_regs));
+	vcpu->arch.ctxt.spsr_abt = 0;
+	vcpu->arch.ctxt.spsr_und = 0;
+	vcpu->arch.ctxt.spsr_irq = 0;
+	vcpu->arch.ctxt.spsr_fiq = 0;
+	vcpu_gp_regs(vcpu)->pstate = pstate;
+}
+
+/* PSCI reset handling for a vcpu. */
+static inline void kvm_reset_vcpu_psci(struct kvm_vcpu *vcpu,
+				       struct vcpu_reset_state *reset_state)
+{
+	unsigned long target_pc = reset_state->pc;
+
+	/* Gracefully handle Thumb2 entry point */
+	if (vcpu_mode_is_32bit(vcpu) && (target_pc & 1)) {
+		target_pc &= ~1UL;
+		vcpu_set_thumb(vcpu);
+	}
+
+	/* Propagate caller endianness */
+	if (reset_state->be)
+		kvm_vcpu_set_be(vcpu);
+
+	*vcpu_pc(vcpu) = target_pc;
+
+	/*
+	 * We may come from a state where either a PC update was
+	 * pending (SMC call resulting in PC being incremented to
+	 * skip the SMC) or a pending exception. Make sure we get
+	 * rid of all that, as this cannot be valid out of reset.
+	 *
+	 * Note that clearing the exception mask also clears PC
+	 * updates, but that's an implementation detail, and we
+	 * really want to make it explicit.
+	 */
+	vcpu_clear_flag(vcpu, PENDING_EXCEPTION);
+	vcpu_clear_flag(vcpu, EXCEPT_MASK);
+	vcpu_clear_flag(vcpu, INCREMENT_PC);
+	vcpu_set_reg(vcpu, 0, reset_state->r0);
+}
+
 #endif /* __ARM64_KVM_EMULATE_H__ */
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index b963fd975aaca..10eb7249aa9e8 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -34,18 +34,6 @@
 static u32 __ro_after_init kvm_ipa_limit;
 unsigned int __ro_after_init kvm_host_sve_max_vl;
 
-/*
- * ARMv8 Reset Values
- */
-#define VCPU_RESET_PSTATE_EL1	(PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | \
-				 PSR_F_BIT | PSR_D_BIT)
-
-#define VCPU_RESET_PSTATE_EL2	(PSR_MODE_EL2h | PSR_A_BIT | PSR_I_BIT | \
-				 PSR_F_BIT | PSR_D_BIT)
-
-#define VCPU_RESET_PSTATE_SVC	(PSR_AA32_MODE_SVC | PSR_AA32_A_BIT | \
-				 PSR_AA32_I_BIT | PSR_AA32_F_BIT)
-
 unsigned int __ro_after_init kvm_sve_max_vl;
 
 int __init kvm_arm_init_sve(void)
@@ -191,7 +179,6 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_reset_state reset_state;
 	bool loaded;
-	u32 pstate;
 
 	spin_lock(&vcpu->arch.mp_state_lock);
 	reset_state = vcpu->arch.reset_state;
@@ -210,21 +197,8 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 		kvm_vcpu_reset_sve(vcpu);
 	}
 
-	if (vcpu_el1_is_32bit(vcpu))
-		pstate = VCPU_RESET_PSTATE_SVC;
-	else if (vcpu_has_nv(vcpu))
-		pstate = VCPU_RESET_PSTATE_EL2;
-	else
-		pstate = VCPU_RESET_PSTATE_EL1;
-
 	/* Reset core registers */
-	memset(vcpu_gp_regs(vcpu), 0, sizeof(*vcpu_gp_regs(vcpu)));
-	memset(&vcpu->arch.ctxt.fp_regs, 0, sizeof(vcpu->arch.ctxt.fp_regs));
-	vcpu->arch.ctxt.spsr_abt = 0;
-	vcpu->arch.ctxt.spsr_und = 0;
-	vcpu->arch.ctxt.spsr_irq = 0;
-	vcpu->arch.ctxt.spsr_fiq = 0;
-	vcpu_gp_regs(vcpu)->pstate = pstate;
+	kvm_reset_vcpu_core(vcpu);
 
 	/* Reset system registers */
 	kvm_reset_sys_regs(vcpu);
@@ -233,36 +207,8 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 	 * Additional reset state handling that PSCI may have imposed on us.
 	 * Must be done after all the sys_reg reset.
 	 */
-	if (reset_state.reset) {
-		unsigned long target_pc = reset_state.pc;
-
-		/* Gracefully handle Thumb2 entry point */
-		if (vcpu_mode_is_32bit(vcpu) && (target_pc & 1)) {
-			target_pc &= ~1UL;
-			vcpu_set_thumb(vcpu);
-		}
-
-		/* Propagate caller endianness */
-		if (reset_state.be)
-			kvm_vcpu_set_be(vcpu);
-
-		*vcpu_pc(vcpu) = target_pc;
-
-		/*
-		 * We may come from a state where either a PC update was
-		 * pending (SMC call resulting in PC being increpented to
-		 * skip the SMC) or a pending exception. Make sure we get
-		 * rid of all that, as this cannot be valid out of reset.
-		 *
-		 * Note that clearing the exception mask also clears PC
-		 * updates, but that's an implementation detail, and we
-		 * really want to make it explicit.
-		 */
-		vcpu_clear_flag(vcpu, PENDING_EXCEPTION);
-		vcpu_clear_flag(vcpu, EXCEPT_MASK);
-		vcpu_clear_flag(vcpu, INCREMENT_PC);
-		vcpu_set_reg(vcpu, 0, reset_state.r0);
-	}
+	if (reset_state.reset)
+		kvm_reset_vcpu_psci(vcpu, &reset_state);
 
 	/* Reset timer */
 	kvm_timer_vcpu_reset(vcpu);
-- 
2.39.5
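
For readers following the Thumb2 entry-point handling that moves into
kvm_reset_vcpu_psci(), here is a minimal user-space sketch of the same
bit-0 convention. struct model_vcpu and model_set_entry() are
hypothetical stand-ins, not kernel interfaces; the two bool fields only
model what vcpu_mode_is_32bit() and vcpu_set_thumb() would report or do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vCPU state touched by the PSCI reset path. */
struct model_vcpu {
	uint64_t pc;
	bool is_32bit;	/* models vcpu_mode_is_32bit() */
	bool thumb;	/* models the effect of vcpu_set_thumb() */
};

/* Apply a PSCI-provided entry point, treating bit 0 as the Thumb marker. */
static void model_set_entry(struct model_vcpu *vcpu, uint64_t target_pc)
{
	if (vcpu->is_32bit && (target_pc & 1)) {
		target_pc &= ~1ULL;	/* the PC itself must stay halfword aligned */
		vcpu->thumb = true;
	}
	vcpu->pc = target_pc;
}
```

Bit 0 is only interpreted for a 32-bit vCPU; a 64-bit vCPU gets the
target PC written unchanged, as in the patch.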




* [PATCH v3 5/8] KVM: arm64: Add host and hypervisor vCPU lookup primitives
From: Fuad Tabba @ 2026-06-26  7:04 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
	linux-kernel
  Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
	Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260626070408.3420953-1-fuad.tabba@linux.dev>

From: Marc Zyngier <maz@kernel.org>

The nVHE hypervisor repeatedly resolves a host vCPU into the EL2
address space and validates that the loaded hyp vCPU matches it, with
that logic open-coded in each handler.

Add __get_host_hyp_vcpus() and the get_host_hyp_vcpus() macro, which
translate the host vCPU into the hypervisor's address space and, when
pKVM is enabled, also return the loaded hyp vCPU if it matches. If pKVM
is enabled but the loaded hyp vCPU does not correspond to the requested
host vCPU, both the host and hyp vCPU are returned as NULL. Convert
handle___kvm_vcpu_run() to use it.

No functional change intended.

Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Co-developed-by: Fuad Tabba <fuad.tabba@linux.dev>
Signed-off-by: Fuad Tabba <fuad.tabba@linux.dev>
---
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 52 ++++++++++++++++++++++--------
 1 file changed, 38 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 1d01c6e547f5d..8923f594c2640 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -212,14 +212,45 @@ static void handle___pkvm_vcpu_put(struct kvm_cpu_context *host_ctxt)
 		pkvm_put_hyp_vcpu(hyp_vcpu);
 }
 
-static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
+static struct kvm_vcpu *__get_host_hyp_vcpus(struct kvm_vcpu *arg,
+					     struct pkvm_hyp_vcpu **hyp_vcpup)
 {
-	DECLARE_REG(struct kvm_vcpu *, host_vcpu, host_ctxt, 1);
-	int ret;
+	struct kvm_vcpu *host_vcpu = kern_hyp_va(arg);
+	struct pkvm_hyp_vcpu *hyp_vcpu = NULL;
 
 	if (unlikely(is_protected_kvm_enabled())) {
-		struct pkvm_hyp_vcpu *hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
+		hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
 
+		if (!hyp_vcpu || hyp_vcpu->host_vcpu != host_vcpu) {
+			hyp_vcpu = NULL;
+			host_vcpu = NULL;
+		}
+	}
+
+	*hyp_vcpup = hyp_vcpu;
+	return host_vcpu;
+}
+
+#define get_host_hyp_vcpus(ctxt, regnr, hyp_vcpup)			\
+	({								\
+		DECLARE_REG(struct kvm_vcpu *, __vcpu, ctxt, regnr);	\
+		__get_host_hyp_vcpus(__vcpu, hyp_vcpup);		\
+	})
+
+static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
+{
+	struct pkvm_hyp_vcpu *hyp_vcpu;
+	struct kvm_vcpu *host_vcpu;
+	int ret;
+
+	host_vcpu = get_host_hyp_vcpus(host_ctxt, 1, &hyp_vcpu);
+
+	if (!host_vcpu) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (unlikely(hyp_vcpu)) {
 		/*
 		 * KVM (and pKVM) doesn't support SME guests for now, and
 		 * ensures that SME features aren't enabled in pstate when
@@ -231,23 +262,16 @@ static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
 			goto out;
 		}
 
-		if (!hyp_vcpu) {
-			ret = -EINVAL;
-			goto out;
-		}
-
 		flush_hyp_vcpu(hyp_vcpu);
 
 		ret = __kvm_vcpu_run(&hyp_vcpu->vcpu);
 
 		sync_hyp_vcpu(hyp_vcpu);
 	} else {
-		struct kvm_vcpu *vcpu = kern_hyp_va(host_vcpu);
-
 		/* The host is fully trusted, run its vCPU directly. */
-		fpsimd_lazy_switch_to_guest(vcpu);
-		ret = __kvm_vcpu_run(vcpu);
-		fpsimd_lazy_switch_to_host(vcpu);
+		fpsimd_lazy_switch_to_guest(host_vcpu);
+		ret = __kvm_vcpu_run(host_vcpu);
+		fpsimd_lazy_switch_to_host(host_vcpu);
 	}
 out:
 	cpu_reg(host_ctxt, 1) =  ret;
-- 
2.39.5
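
The return contract of __get_host_hyp_vcpus() can be illustrated with a
small user-space model. This is only a sketch: the model_* types and the
protected_mode and loaded parameters are hypothetical replacements for
is_protected_kvm_enabled() and pkvm_get_loaded_hyp_vcpu(), and the
kern_hyp_va() address translation is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_host_vcpu { int id; };
struct model_hyp_vcpu { struct model_host_vcpu *host_vcpu; };

/*
 * Model of the lookup: with protection on, the loaded hyp vCPU must
 * point back at the requested host vCPU, otherwise both results are NULL.
 * With protection off, the host vCPU is returned and the hyp vCPU is NULL.
 */
static struct model_host_vcpu *
model_get_host_hyp_vcpus(struct model_host_vcpu *host, bool protected_mode,
			 struct model_hyp_vcpu *loaded,
			 struct model_hyp_vcpu **hyp_out)
{
	struct model_hyp_vcpu *hyp = NULL;

	if (protected_mode) {
		hyp = loaded;
		if (!hyp || hyp->host_vcpu != host) {
			hyp = NULL;
			host = NULL;
		}
	}

	*hyp_out = hyp;
	return host;
}
```

A caller therefore needs only two checks: a NULL host vCPU means the
request is rejected, and a non-NULL hyp vCPU selects the pKVM path.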




* [PATCH v3 6/8] KVM: arm64: Minimise EL2's exposure of host VGIC state during world switch
From: Fuad Tabba @ 2026-06-26  7:04 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
	linux-kernel
  Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
	Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260626070408.3420953-1-fuad.tabba@linux.dev>

From: Marc Zyngier <maz@kernel.org>

The host passes a vgic_v3_cpu_if pointer to the __vgic_v3_save_aprs and
__vgic_v3_restore_vmcr_aprs hypercalls, which EL2 dereferences
wholesale. That exposes the host's full VGIC emulation state to the
hypervisor, against pKVM's isolation goals.

Recover the host vCPU from the supplied cpu_if via container_of() and
copy only vgic_vmcr and the active priority registers between EL2's
hyp-side state and the host vCPU, so EL2 no longer dereferences the
host's vgic_v3_cpu_if directly.

Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Co-developed-by: Fuad Tabba <fuad.tabba@linux.dev>
Signed-off-by: Fuad Tabba <fuad.tabba@linux.dev>
---
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 67 ++++++++++++++++++++++++++++--
 1 file changed, 63 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 8923f594c2640..f25ee39715282 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -7,6 +7,8 @@
 #include <hyp/adjust_pc.h>
 #include <hyp/switch.h>
 
+#include <linux/irqchip/arm-gic-v3.h>
+
 #include <asm/pgtable-types.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
@@ -237,6 +239,16 @@ static struct kvm_vcpu *__get_host_hyp_vcpus(struct kvm_vcpu *arg,
 		__get_host_hyp_vcpus(__vcpu, hyp_vcpup);		\
 	})
 
+#define get_host_hyp_vcpus_from_vgic_v3_cpu_if(ctxt, regnr, hyp_vcpup)		\
+	({									\
+		DECLARE_REG(struct vgic_v3_cpu_if *, cif, ctxt, regnr);\
+		struct kvm_vcpu *__vcpu = container_of(cif,			\
+						       struct kvm_vcpu,		\
+						       arch.vgic_cpu.vgic_v3);	\
+										\
+		__get_host_hyp_vcpus(__vcpu, hyp_vcpup);			\
+	})
+
 static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
 {
 	struct pkvm_hyp_vcpu *hyp_vcpu;
@@ -506,16 +518,63 @@ static void handle___vgic_v3_init_lrs(struct kvm_cpu_context *host_ctxt)
 
 static void handle___vgic_v3_save_aprs(struct kvm_cpu_context *host_ctxt)
 {
-	DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
+	struct pkvm_hyp_vcpu *hyp_vcpu;
+	struct kvm_vcpu *host_vcpu;
 
-	__vgic_v3_save_aprs(kern_hyp_va(cpu_if));
+	host_vcpu = get_host_hyp_vcpus_from_vgic_v3_cpu_if(host_ctxt, 1,
+							   &hyp_vcpu);
+	if (!host_vcpu)
+		return;
+
+	if (unlikely(hyp_vcpu)) {
+		struct vgic_v3_cpu_if *hyp_cpu_if, *host_cpu_if;
+		int i;
+
+		hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+		__vgic_v3_save_aprs(hyp_cpu_if);
+
+		host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
+		host_cpu_if->vgic_vmcr = hyp_cpu_if->vgic_vmcr;
+		for (i = 0; i < ARRAY_SIZE(host_cpu_if->vgic_ap0r); i++) {
+			host_cpu_if->vgic_ap0r[i] = hyp_cpu_if->vgic_ap0r[i];
+			host_cpu_if->vgic_ap1r[i] = hyp_cpu_if->vgic_ap1r[i];
+		}
+	} else {
+		__vgic_v3_save_aprs(&host_vcpu->arch.vgic_cpu.vgic_v3);
+	}
 }
 
 static void handle___vgic_v3_restore_vmcr_aprs(struct kvm_cpu_context *host_ctxt)
 {
-	DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
+	struct pkvm_hyp_vcpu *hyp_vcpu;
+	struct kvm_vcpu *host_vcpu;
 
-	__vgic_v3_restore_vmcr_aprs(kern_hyp_va(cpu_if));
+	host_vcpu = get_host_hyp_vcpus_from_vgic_v3_cpu_if(host_ctxt, 1,
+							   &hyp_vcpu);
+	if (!host_vcpu)
+		return;
+
+	if (unlikely(hyp_vcpu)) {
+		struct vgic_v3_cpu_if *hyp_cpu_if, *host_cpu_if;
+		int i;
+
+		hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+		host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
+
+		hyp_cpu_if->vgic_vmcr = host_cpu_if->vgic_vmcr;
+		/* Should be a one-off */
+		hyp_cpu_if->vgic_sre = (ICC_SRE_EL1_DIB |
+					ICC_SRE_EL1_DFB |
+					ICC_SRE_EL1_SRE);
+		for (i = 0; i < ARRAY_SIZE(host_cpu_if->vgic_ap0r); i++) {
+			hyp_cpu_if->vgic_ap0r[i] = host_cpu_if->vgic_ap0r[i];
+			hyp_cpu_if->vgic_ap1r[i] = host_cpu_if->vgic_ap1r[i];
+		}
+
+		__vgic_v3_restore_vmcr_aprs(hyp_cpu_if);
+	} else {
+		__vgic_v3_restore_vmcr_aprs(&host_vcpu->arch.vgic_cpu.vgic_v3);
+	}
 }
 
 static void handle___pkvm_init(struct kvm_cpu_context *host_ctxt)
-- 
2.39.5
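
The container_of() step that recovers the vCPU from the cpu_if pointer
can be sketched in user space as follows. The struct layouts and the
model_container_of() macro are assumptions made for illustration; the
kernel's container_of() additionally type-checks the member, and the
real structures are far larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space equivalent of the kernel's container_of(), without type checks. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily reduced layouts mirroring the nesting in the patch. */
struct model_cpu_if { uint32_t vmcr; uint32_t ap0r[4]; uint32_t ap1r[4]; };
struct model_vgic_cpu { int pad; struct model_cpu_if vgic_v3; };
struct model_arch { long other_state; struct model_vgic_cpu vgic_cpu; };
struct model_vcpu { int id; struct model_arch arch; };

/* Walk back from the embedded cpu_if to the vCPU that contains it. */
static struct model_vcpu *model_vcpu_from_cpu_if(struct model_cpu_if *cif)
{
	return model_container_of(cif, struct model_vcpu,
				  arch.vgic_cpu.vgic_v3);
}
```

The computation is pure pointer arithmetic, so the supplied pointer is
never dereferenced before the result has been validated by
__get_host_hyp_vcpus().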




* [PATCH v3 4/8] KVM: arm64: Move PSCI helper functions to a shared header
From: Fuad Tabba @ 2026-06-26  7:04 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
	linux-kernel
  Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
	Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260626070408.3420953-1-fuad.tabba@linux.dev>

Move kvm_psci_valid_affinity() and kvm_psci_narrow_to_32bit() from
psci.c to include/kvm/arm_psci.h, and move psci_affinity_mask() there
too, renaming it kvm_psci_affinity_mask() now that it is no longer
file-local. A follow-up series handles some protected-guest PSCI calls
at EL2 using these helpers.

No functional change intended.

Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Fuad Tabba <fuad.tabba@linux.dev>
---
 arch/arm64/kvm/psci.c  | 30 +-----------------------------
 include/kvm/arm_psci.h | 27 +++++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 3b5dbe9a0a0ea..e3db84400d1f8 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -21,16 +21,6 @@
  * as described in ARM document number ARM DEN 0022A.
  */
 
-#define AFFINITY_MASK(level)	~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1)
-
-static unsigned long psci_affinity_mask(unsigned long affinity_level)
-{
-	if (affinity_level <= 3)
-		return MPIDR_HWID_BITMASK & AFFINITY_MASK(affinity_level);
-
-	return 0;
-}
-
 static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
 {
 	/*
@@ -51,12 +41,6 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
 	return PSCI_RET_SUCCESS;
 }
 
-static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
-					   unsigned long affinity)
-{
-	return !(affinity & ~MPIDR_HWID_BITMASK);
-}
-
 static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 {
 	struct vcpu_reset_state *reset_state;
@@ -135,7 +119,7 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
 		return PSCI_RET_INVALID_PARAMS;
 
 	/* Determine target affinity mask */
-	target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
+	target_affinity_mask = kvm_psci_affinity_mask(lowest_affinity_level);
 	if (!target_affinity_mask)
 		return PSCI_RET_INVALID_PARAMS;
 
@@ -220,18 +204,6 @@ static void kvm_psci_system_suspend(struct kvm_vcpu *vcpu)
 	run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
 }
 
-static void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
-{
-	int i;
-
-	/*
-	 * Zero the input registers' upper 32 bits. They will be fully
-	 * zeroed on exit, so we're fine changing them in place.
-	 */
-	for (i = 1; i < 4; i++)
-		vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
-}
-
 static unsigned long kvm_psci_check_allowed_function(struct kvm_vcpu *vcpu, u32 fn)
 {
 	/*
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index cbaec804eb839..f86a006d67136 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -38,6 +38,33 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu)
 	return KVM_ARM_PSCI_0_1;
 }
 
+/* Narrow the PSCI register arguments (r1 to r3) to 32 bits. */
+static inline void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
+{
+	int i;
+
+	/*
+	 * Zero the input registers' upper 32 bits. They will be fully
+	 * zeroed on exit, so we're fine changing them in place.
+	 */
+	for (i = 1; i < 4; i++)
+		vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
+}
+
+static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
+					   unsigned long affinity)
+{
+	return !(affinity & ~MPIDR_HWID_BITMASK);
+}
+
+static inline unsigned long kvm_psci_affinity_mask(unsigned long affinity_level)
+{
+	if (affinity_level <= 3)
+		return MPIDR_HWID_BITMASK &
+			~((0x1UL << (affinity_level * MPIDR_LEVEL_BITS)) - 1);
+
+	return 0;
+}
 
 int kvm_psci_call(struct kvm_vcpu *vcpu);
 
-- 
2.39.5
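
As a worked example of the arithmetic in kvm_psci_affinity_mask(), the
sketch below reproduces it in user space. The two MODEL_* constants are
assumed to match arm64's MPIDR_LEVEL_BITS (8) and MPIDR_HWID_BITMASK
(0xff00ffffff); model_affinity_mask() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Values assumed to match arm64's MPIDR_LEVEL_BITS and MPIDR_HWID_BITMASK. */
#define MODEL_MPIDR_LEVEL_BITS		8
#define MODEL_MPIDR_HWID_BITMASK	UINT64_C(0xff00ffffff)

/*
 * Keep the affinity fields at and above the requested level and clear
 * the ones below it. Levels above 3 are invalid and yield 0.
 */
static uint64_t model_affinity_mask(unsigned int affinity_level)
{
	if (affinity_level <= 3)
		return MODEL_MPIDR_HWID_BITMASK &
		       ~((UINT64_C(1) << (affinity_level * MODEL_MPIDR_LEVEL_BITS)) - 1);

	return 0;
}
```

Under those assumptions, level 0 keeps every affinity field
(0xff00ffffff), level 1 drops Aff0 (0xff00ffff00), and level 3 keeps
only Aff3 (0xff00000000).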




* [PATCH v3 7/8] KVM: arm64: Add primitives to flush/sync the VGIC state at EL2
From: Fuad Tabba @ 2026-06-26  7:04 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
	linux-kernel
  Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
	Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260626070408.3420953-1-fuad.tabba@linux.dev>

From: Marc Zyngier <maz@kernel.org>

pKVM performs its own world switch for protected VMs but has no
primitives to move the per-vCPU VGIC state between the host and
hypervisor vCPU contexts.

Add flush_hyp_vgic_state() and sync_hyp_vgic_state(). Flush copies
vgic_hcr, the in-use list registers and used_lrs from the host into the
hyp vCPU and pins vgic_sre to a fixed value; sync copies vgic_hcr,
vgic_vmcr and the in-use list registers back. The active priority
registers are handled separately by the save/restore-aprs path.

Bound used_lrs by hyp_gicv3_nr_lr, the cached implemented-LR count,
instead of reading ICH_VTR_EL2 on each entry. That clamps the
host-supplied value and avoids a per-entry sysreg read that is costly
under NV.

Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Co-developed-by: Fuad Tabba <fuad.tabba@linux.dev>
Signed-off-by: Fuad Tabba <fuad.tabba@linux.dev>
---
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 55 ++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index f25ee39715282..0194965930e61 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -102,6 +102,45 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
 	*host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;
 }
 
+static void flush_hyp_vgic_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+	struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
+	struct vgic_v3_cpu_if *host_cpu_if, *hyp_cpu_if;
+	unsigned int used_lrs, i;
+
+	host_cpu_if	= &host_vcpu->arch.vgic_cpu.vgic_v3;
+	hyp_cpu_if	= &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+
+	used_lrs	= host_cpu_if->used_lrs;
+	used_lrs	= min(used_lrs, hyp_gicv3_nr_lr);
+
+	hyp_cpu_if->vgic_hcr	= host_cpu_if->vgic_hcr;
+	/* Should be a one-off */
+	hyp_cpu_if->vgic_sre	= (ICC_SRE_EL1_DIB |
+				   ICC_SRE_EL1_DFB |
+				   ICC_SRE_EL1_SRE);
+	hyp_cpu_if->used_lrs	= used_lrs;
+
+	for (i = 0; i < used_lrs; i++)
+		hyp_cpu_if->vgic_lr[i] = host_cpu_if->vgic_lr[i];
+}
+
+static void sync_hyp_vgic_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+	struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
+	struct vgic_v3_cpu_if *host_cpu_if, *hyp_cpu_if;
+	unsigned int i;
+
+	host_cpu_if	= &host_vcpu->arch.vgic_cpu.vgic_v3;
+	hyp_cpu_if	= &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+
+	host_cpu_if->vgic_hcr = hyp_cpu_if->vgic_hcr;
+	host_cpu_if->vgic_vmcr = hyp_cpu_if->vgic_vmcr;
+
+	for (i = 0; i < hyp_cpu_if->used_lrs; i++)
+		host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i];
+}
+
 static void flush_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)
 {
 	struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
@@ -150,13 +189,7 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
 
 	hyp_vcpu->vcpu.arch.vsesr_el2	= host_vcpu->arch.vsesr_el2;
 
-	hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3 = host_vcpu->arch.vgic_cpu.vgic_v3;
-
-	/* Bound used_lrs by the number of implemented list registers. */
-	hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3.used_lrs =
-		min_t(unsigned int,
-		      hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3.used_lrs,
-		      hyp_gicv3_nr_lr);
+	flush_hyp_vgic_state(hyp_vcpu);
 
 	hyp_vcpu->vcpu.arch.pid = host_vcpu->arch.pid;
 }
@@ -164,9 +197,6 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
 static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
 {
 	struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
-	struct vgic_v3_cpu_if *hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
-	struct vgic_v3_cpu_if *host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
-	unsigned int i;
 
 	fpsimd_sve_sync(&hyp_vcpu->vcpu);
 	sync_debug_state(hyp_vcpu);
@@ -179,10 +209,7 @@ static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
 
 	host_vcpu->arch.iflags		= hyp_vcpu->vcpu.arch.iflags;
 
-	host_cpu_if->vgic_hcr		= hyp_cpu_if->vgic_hcr;
-	host_cpu_if->vgic_vmcr		= hyp_cpu_if->vgic_vmcr;
-	for (i = 0; i < hyp_cpu_if->used_lrs; ++i)
-		host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i];
+	sync_hyp_vgic_state(hyp_vcpu);
 }
 
 static void handle___pkvm_vcpu_load(struct kvm_cpu_context *host_ctxt)
-- 
2.39.5
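
The used_lrs clamp in flush_hyp_vgic_state() can be modelled in user
space as below. This is a sketch under stated assumptions: the
model_cpu_if layout and MODEL_MAX_LRS are hypothetical, nr_lr stands in
for hyp_gicv3_nr_lr, and the caller is assumed to pass a value no larger
than MODEL_MAX_LRS.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_LRS 16	/* hypothetical array size for this model */

struct model_cpu_if {
	uint64_t lr[MODEL_MAX_LRS];
	unsigned int used_lrs;
};

/*
 * Copy the in-use list registers from the host view to the hyp view,
 * never trusting the host-supplied count beyond nr_lr.
 */
static void model_flush_lrs(const struct model_cpu_if *host,
			    struct model_cpu_if *hyp, unsigned int nr_lr)
{
	unsigned int used = host->used_lrs < nr_lr ? host->used_lrs : nr_lr;
	unsigned int i;

	hyp->used_lrs = used;
	for (i = 0; i < used; i++)
		hyp->lr[i] = host->lr[i];
}
```

Because the clamped count is stored in the hyp copy, the sync direction
can iterate over hyp->used_lrs without re-validating anything the host
wrote.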




* [PATCH v3 8/8] KVM: arm64: Implement lazy vCPU state sync for non-protected guests
From: Fuad Tabba @ 2026-06-26  7:04 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
	linux-kernel
  Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
	Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260626070408.3420953-1-fuad.tabba@linux.dev>

pKVM copies a non-protected guest's register context between the host
and the hypervisor on every world switch, even when the host never
inspects it. Defer the copy: on entry, flush the host context into the
hyp vCPU only when the host marked it dirty (PKVM_HOST_STATE_DIRTY); on
exit, leave it in the hyp vCPU and copy it back only when the host needs
it, via a __pkvm_vcpu_sync_state hypercall or at vcpu put. A protected
guest's context is copied as before, since lazy sync only helps where
the host is trusted to see the guest's registers.

PC and PSTATE are the exception: they are copied back on every exit so
the kvm_exit tracepoint reports the guest's real exit PC, and the run
loop's vcpu_mode_is_bad_32bit() and SError-masking checks evaluate the
guest's current PSTATE rather than the value left by the previous sync.

The host needs the full context when it is about to read it (trap
handling) or write it (the SError injection that writes ESR_EL1). Sync
both from handle_exit_early(), which runs non-preemptible so the loaded
hyp vCPU is stable without a preempt guard.

Signed-off-by: Fuad Tabba <fuad.tabba@linux.dev>
---
 arch/arm64/include/asm/kvm_asm.h   |  1 +
 arch/arm64/include/asm/kvm_host.h  |  2 +
 arch/arm64/kvm/arm.c               |  7 +++
 arch/arm64/kvm/handle_exit.c       | 23 ++++++++
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 86 ++++++++++++++++++++++++++++--
 5 files changed, 114 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 043495f7fc78b..6e1135b3ded44 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -113,6 +113,7 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___pkvm_finalize_teardown_vm,
 	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_load,
 	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_put,
+	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_sync_state,
 	__KVM_HOST_SMCCC_FUNC___pkvm_tlb_flush_vmid,
 
 	MARKER(__KVM_HOST_SMCCC_FUNC_MAX)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2faa60df847d2..caa39ee5125f2 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1068,6 +1068,8 @@ struct kvm_vcpu_arch {
 #define INCREMENT_PC		__vcpu_single_flag(iflags, BIT(1))
 /* Target EL/MODE (not a single flag, but let's abuse the macro) */
 #define EXCEPT_MASK		__vcpu_single_flag(iflags, GENMASK(3, 1))
+/* Host-set: the hyp flushes the non-protected vCPU state in on entry */
+#define PKVM_HOST_STATE_DIRTY	__vcpu_single_flag(iflags, BIT(4))
 
 /* Helpers to encode exceptions with minimum fuss */
 #define __EXCEPT_MASK_VAL	unpack_vcpu_flag(EXCEPT_MASK)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 3732ee9eb0d4e..4e89558d80278 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -733,6 +733,10 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 	if (is_protected_kvm_enabled()) {
 		kvm_call_hyp(__vgic_v3_save_aprs, &vcpu->arch.vgic_cpu.vgic_v3);
 		kvm_call_hyp_nvhe(__pkvm_vcpu_put);
+
+		/* __pkvm_vcpu_put implies a sync of the state */
+		if (!kvm_vm_is_protected(vcpu->kvm))
+			vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
 	}
 
 	kvm_vcpu_put_debug(vcpu);
@@ -964,6 +968,9 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 		return ret;
 
 	if (is_protected_kvm_enabled()) {
+		/* Start with the vcpu in a dirty state */
+		if (!kvm_vm_is_protected(vcpu->kvm))
+			vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
 		ret = pkvm_create_hyp_vm(kvm);
 		if (ret)
 			return ret;
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 54aedf93c78b6..29108e5c0206e 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -486,9 +486,32 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index)
 	}
 }
 
+static void handle_exit_pkvm_state(struct kvm_vcpu *vcpu, int exception_index)
+{
+	int exception_code = ARM_EXCEPTION_CODE(exception_index);
+
+	if (!is_protected_kvm_enabled() || kvm_vm_is_protected(vcpu->kvm))
+		return;
+
+	/*
+	 * Sync the context back when the host will read (trap) or write
+	 * (SError) it. Preempt-off here, so the loaded hyp vCPU is stable.
+	 */
+	if (exception_code == ARM_EXCEPTION_TRAP ||
+	    exception_code == ARM_EXCEPTION_EL1_SERROR ||
+	    ARM_SERROR_PENDING(exception_index)) {
+		kvm_call_hyp_nvhe(__pkvm_vcpu_sync_state);
+		vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
+	} else {
+		vcpu_clear_flag(vcpu, PKVM_HOST_STATE_DIRTY);
+	}
+}
+
 /* For exit types that need handling before we can be preempted */
 void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index)
 {
+	handle_exit_pkvm_state(vcpu, exception_index);
+
 	if (ARM_SERROR_PENDING(exception_index)) {
 		if (this_cpu_has_cap(ARM64_HAS_RAS_EXTN)) {
 			u64 disr = kvm_vcpu_get_disr(vcpu);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 0194965930e61..acf53aae4fe43 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -141,6 +141,48 @@ static void sync_hyp_vgic_state(struct pkvm_hyp_vcpu *hyp_vcpu)
 		host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i];
 }
 
+static void __copy_vcpu_state(const struct kvm_vcpu *from_vcpu,
+			      struct kvm_vcpu *to_vcpu)
+{
+	int i;
+
+	to_vcpu->arch.ctxt.regs		= from_vcpu->arch.ctxt.regs;
+	to_vcpu->arch.ctxt.spsr_abt	= from_vcpu->arch.ctxt.spsr_abt;
+	to_vcpu->arch.ctxt.spsr_und	= from_vcpu->arch.ctxt.spsr_und;
+	to_vcpu->arch.ctxt.spsr_irq	= from_vcpu->arch.ctxt.spsr_irq;
+	to_vcpu->arch.ctxt.spsr_fiq	= from_vcpu->arch.ctxt.spsr_fiq;
+	to_vcpu->arch.ctxt.fp_regs	= from_vcpu->arch.ctxt.fp_regs;
+
+	/*
+	 * Copy the sysregs, but don't mess with the timer state which
+	 * is directly handled by EL1 and is expected to be preserved.
+	 * enum vcpu_sysreg is sparse: VNCR-mapped registers take values
+	 * derived from their VNCR page offset, so the timer registers do
+	 * not form a contiguous numeric range and must be skipped by name.
+	 */
+	for (i = 1; i < NR_SYS_REGS; i++) {
+		switch (i) {
+		case CNTVOFF_EL2:
+		case CNTV_CVAL_EL0:
+		case CNTV_CTL_EL0:
+		case CNTP_CVAL_EL0:
+		case CNTP_CTL_EL0:
+			continue;
+		}
+		to_vcpu->arch.ctxt.sys_regs[i] = from_vcpu->arch.ctxt.sys_regs[i];
+	}
+}
+
+static void sync_hyp_vcpu_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+	__copy_vcpu_state(&hyp_vcpu->vcpu, hyp_vcpu->host_vcpu);
+}
+
+static void flush_hyp_vcpu_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+	__copy_vcpu_state(hyp_vcpu->host_vcpu, &hyp_vcpu->vcpu);
+}
+
 static void flush_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)
 {
 	struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
@@ -170,7 +212,17 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
 	fpsimd_sve_flush();
 	flush_debug_state(hyp_vcpu);
 
-	hyp_vcpu->vcpu.arch.ctxt	= host_vcpu->arch.ctxt;
+	/*
+	 * If we deal with a non-protected guest and the state is potentially
+	 * dirty (from a host perspective), copy the state back into the hyp
+	 * vcpu.
+	 */
+	if (!pkvm_hyp_vcpu_is_protected(hyp_vcpu)) {
+		if (vcpu_get_flag(host_vcpu, PKVM_HOST_STATE_DIRTY))
+			flush_hyp_vcpu_state(hyp_vcpu);
+	} else {
+		hyp_vcpu->vcpu.arch.ctxt = host_vcpu->arch.ctxt;
+	}
 
 	/* __hyp_running_vcpu must be NULL in a guest context. */
 	hyp_vcpu->vcpu.arch.ctxt.__hyp_running_vcpu = NULL;
@@ -201,9 +253,13 @@ static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
 	fpsimd_sve_sync(&hyp_vcpu->vcpu);
 	sync_debug_state(hyp_vcpu);
 
-	host_vcpu->arch.ctxt		= hyp_vcpu->vcpu.arch.ctxt;
-
-	host_vcpu->arch.hcr_el2		= hyp_vcpu->vcpu.arch.hcr_el2;
+	if (pkvm_hyp_vcpu_is_protected(hyp_vcpu)) {
+		host_vcpu->arch.ctxt = hyp_vcpu->vcpu.arch.ctxt;
+	} else {
+		/* Keep PC (tracepoint) and PSTATE (vcpu_mode_is_bad_32bit) current. */
+		host_vcpu->arch.ctxt.regs.pc = hyp_vcpu->vcpu.arch.ctxt.regs.pc;
+		host_vcpu->arch.ctxt.regs.pstate = hyp_vcpu->vcpu.arch.ctxt.regs.pstate;
+	}
 
 	host_vcpu->arch.fault		= hyp_vcpu->vcpu.arch.fault;
 
@@ -237,8 +293,27 @@ static void handle___pkvm_vcpu_put(struct kvm_cpu_context *host_ctxt)
 {
 	struct pkvm_hyp_vcpu *hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
 
-	if (hyp_vcpu)
+	if (hyp_vcpu) {
+		struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
+
+		if (!pkvm_hyp_vcpu_is_protected(hyp_vcpu) &&
+		    !vcpu_get_flag(host_vcpu, PKVM_HOST_STATE_DIRTY)) {
+			sync_hyp_vcpu_state(hyp_vcpu);
+		}
+
 		pkvm_put_hyp_vcpu(hyp_vcpu);
+	}
+}
+
+static void handle___pkvm_vcpu_sync_state(struct kvm_cpu_context *host_ctxt)
+{
+	struct pkvm_hyp_vcpu *hyp_vcpu;
+
+	hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
+	if (!hyp_vcpu || pkvm_hyp_vcpu_is_protected(hyp_vcpu))
+		return;
+
+	sync_hyp_vcpu_state(hyp_vcpu);
 }
 
 static struct kvm_vcpu *__get_host_hyp_vcpus(struct kvm_vcpu *arg,
@@ -869,6 +944,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__pkvm_finalize_teardown_vm),
 	HANDLE_FUNC(__pkvm_vcpu_load),
 	HANDLE_FUNC(__pkvm_vcpu_put),
+	HANDLE_FUNC(__pkvm_vcpu_sync_state),
 	HANDLE_FUNC(__pkvm_tlb_flush_vmid),
 };
 
-- 
2.39.5
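
The lazy sync protocol for non-protected guests can be summarised with a
small user-space state model. Everything here is a hypothetical
simplification: model_ctxt carries one general-purpose register and the
PC only (PSTATE, which the patch also copies on every exit, is omitted),
and host_dirty plays the role of PKVM_HOST_STATE_DIRTY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct model_ctxt { uint64_t x0; uint64_t pc; };

struct model_vcpu {
	struct model_ctxt host;	/* host's view of the guest registers */
	struct model_ctxt hyp;	/* hypervisor's private copy */
	bool host_dirty;	/* models PKVM_HOST_STATE_DIRTY */
};

/* Entry: flush host -> hyp only if the host may have changed the state. */
static void model_enter(struct model_vcpu *v)
{
	if (v->host_dirty)
		v->hyp = v->host;
}

/* Exit: always publish PC; copy the rest only when the host will need it. */
static void model_exit(struct model_vcpu *v, bool host_needs_state)
{
	v->host.pc = v->hyp.pc;
	if (host_needs_state)
		v->host = v->hyp;
	v->host_dirty = host_needs_state;
}

/* Put: make sure the host copy is current, then treat it as dirty. */
static void model_put(struct model_vcpu *v)
{
	if (!v->host_dirty)
		v->host = v->hyp;
	v->host_dirty = true;
}
```

In this model, an exit that the host does not inspect costs one PC copy
instead of a full context copy, and host-side writes made while the flag
is clear are never flushed into the hyp copy.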




* Re: [PATCH v2 1/2] gpio: shared-proxy: always serialize with a sleeping mutex
From: Viacheslav @ 2026-06-26  7:16 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-gpio, linux-arm-kernel, linux-amlogic, linux-kernel
In-Reply-To: <d8d407d5-ba6c-4197-9cf0-2fa7e6e17155@samsung.com>

Hi!

26.06.2026 08:54, Marek Szyprowski wrote:
> On 25.06.2026 13:57, Viacheslav Bocharov wrote:
>> The shared GPIO descriptor used either a mutex or a spinlock, chosen at
>> runtime from the underlying chip's can_sleep:
>>
>> 	shared_desc->can_sleep = gpiod_cansleep(shared_desc->desc);
>> 	... if (can_sleep) mutex_lock(); else spin_lock_irqsave();

...

>>
>> The lock type was added by commit a060b8c511ab ("gpiolib: implement
>> low-level, shared GPIO support"); the sleeping call under it arrived with
>> the proxy driver.
>>
>> Fixes: e992d54c6f97 ("gpio: shared-proxy: implement the shared GPIO proxy driver")
>> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> Closes: https://lore.kernel.org/all/00107523-7737-4b92-a785-14ce4e93b8cb@samsung.com/
>> Signed-off-by: Viacheslav Bocharov <v@baodeep.com>
> 
> 
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> 

Thanks!


Best regards
-- 
Viacheslav Bocharov




* Re: [PATCH v5 1/7] dt-bindings: display: verisilicon,dc: generalize for single-output variants
From: Conor Dooley @ 2026-06-26  7:19 UTC (permalink / raw)
  To: Icenowy Zheng
  Cc: Conor Dooley, Joey Lu, maarten.lankhorst, mripard, tzimmermann,
	airlied, simona, robh, krzk+dt, conor+dt, ychuang3, schung, yclu4,
	dri-devel, devicetree, linux-arm-kernel, linux-kernel
In-Reply-To: <e3fe23ddbc504879bd797bbaa595d3653fa139ff.camel@iscas.ac.cn>


On Fri, Jun 26, 2026 at 01:27:21PM +0800, Icenowy Zheng wrote:
> On Thu, 2026-06-25 at 17:33 +0100, Conor Dooley wrote:
> > On Thu, Jun 25, 2026 at 05:44:43PM +0800, Joey Lu wrote:
> > > +allOf:
> > > +  - if:
> > > +      properties:
> > > +        compatible:
> > > +          contains:
> > > +            const: thead,th1520-dc8200
> > > +    then:
> > > +      properties:
> > > +        clocks:
> > > +          minItems: 5
> > > +          maxItems: 5
> > > +
> > > +        clock-names:
> > > +          minItems: 5
> > > +          maxItems: 5
> > 
> > All the maxItems here repeat the maximum constraint and do nothing.
> > 
> > Since you didn't change the minimum constraint at the top level, your
> > minItems also do nothing.
> > 
> > > +
> > > +        resets:
> > > +          minItems: 3
> > > +          maxItems: 3
> > > +
> > > +        reset-names:
> > > +          minItems: 3
> > > +          maxItems: 3
> > > +
> > > +      required:
> > > +        - resets
> > > +        - reset-names
> > 
> > Both conditional sections have this, but the original binding doesn't
> > require these for the thead device. This is a functional change
> > therefore and shouldn't be in a patch calling itself "generalise for
> > single ended variants".
> 
> Well yes they're required.
> 
> Should I send a patch adding the `thead,th1520-dc8200` part of the
> schema?

If you mean the code above, no. Adding a conditional section when
there's only that compatible doesn't make sense.

What you could do is just add it at the top level though, which would
also benefit this patch since it'd not have to be conditionally added
for the new nuvoton device.
Just note in your commit message about what the ABI impact of the change
to required properties is (effectively nothing because it's optional in
the driver and the only user has the properties).

> > > +
> > > +        resets:
> > > +          minItems: 1
> > > +          maxItems: 1
> > > +
> > > +        reset-names:
> > > +          items:
> > > +            - const: core
> > 
> > This is just maxItems: 1.
> 
> Well the implicit rules of DT binding schemas are quite weird...

I don't think it is that strange, as the binding has
  reset-names:
    items:
      - const: core
      - const: axi
      - const: ahb
so just constraining to one item is the simplest way to do this without
duplication.


^ permalink raw reply

* Re: [PATCH v5 1/7] dt-bindings: display: verisilicon,dc: generalize for single-output variants
From: Conor Dooley @ 2026-06-26  7:22 UTC (permalink / raw)
  To: Conor Dooley
  Cc: Joey Lu, zhengxingda, maarten.lankhorst, mripard, tzimmermann,
	airlied, simona, robh, krzk+dt, conor+dt, ychuang3, schung, yclu4,
	dri-devel, devicetree, linux-arm-kernel, linux-kernel
In-Reply-To: <20260625-bobbing-annotate-d1c4d6874ee2@spud>

On Thu, Jun 25, 2026 at 05:33:37PM +0100, Conor Dooley wrote:
> On Thu, Jun 25, 2026 at 05:44:43PM +0800, Joey Lu wrote:
> > The verisilicon,dc binding was originally written for the T-Head TH1520
> > SoC carrying a DC8200, and hard-codes five clocks, three resets and two
> > output ports.
> > 
> > Add the Nuvoton MA35D1 DCUltraLite (nuvoton,ma35d1-dcu) to the binding.
> > The DCUltraLite uses only two clocks (core, pix0) and one reset (core),
> > with a single output port.
> > 
> > Use allOf/if blocks to express per-variant constraints rather than
> > hard-coding the DC8200 topology at the top level.  Each compatible's
> > block constrains the clock and reset item counts; the nuvoton block
> > additionally overrides clock-names to the two names it actually uses.
> > 
> > Signed-off-by: Joey Lu <a0987203069@gmail.com>
> > ---
> >  .../bindings/display/verisilicon,dc.yaml      | 57 +++++++++++++++++++
> >  1 file changed, 57 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/display/verisilicon,dc.yaml b/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
> > index 9dc35ab973f2..1e751f3c7ce8 100644
> > --- a/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
> > +++ b/Documentation/devicetree/bindings/display/verisilicon,dc.yaml
> > @@ -17,6 +17,7 @@ properties:
> >      items:
> >        - enum:
> >            - thead,th1520-dc8200
> > +          - nuvoton,ma35d1-dcu
> >        - const: verisilicon,dc # DC IPs have discoverable ID/revision registers
> >  
> >    reg:
> > @@ -77,6 +78,62 @@ required:
> >    - clock-names
> >    - ports
> >  
> > +allOf:
> > +  - if:
> > +      properties:
> > +        compatible:
> > +          contains:
> > +            const: thead,th1520-dc8200
> > +    then:
> > +      properties:
> > +        clocks:
> > +          minItems: 5
> > +          maxItems: 5
> > +
> > +        clock-names:
> > +          minItems: 5
> > +          maxItems: 5
> 
> All the maxItems here repeat the maximum constraint and do nothing.
> 
> Since you didn't change the minimum constraint at the top level, your
> minItems also do nothing.
> 
> > +
> > +        resets:
> > +          minItems: 3
> > +          maxItems: 3
> > +
> > +        reset-names:
> > +          minItems: 3
> > +          maxItems: 3
> > +
> > +      required:
> > +        - resets
> > +        - reset-names
> 
> Both conditional sections have this, but the original binding doesn't
> require these for the thead device. This is a functional change
> therefore and shouldn't be in a patch calling itself "generalise for
> single ended variants".
> 
> FWIW, adding your new compatible shouldn't really be in a patch with
> that subject either, it really should say "add support for nuvoton
> ma35d1" or something.
> 
> > +
> > +  - if:
> > +      properties:
> > +        compatible:
> > +          contains:
> > +            const: nuvoton,ma35d1-dcu
> > +    then:
> > +      properties:
> > +        clocks:
> > +          minItems: 2
> 
> Anything that updates the minimum constraint should be done at the top
> level of this schema. The conditional section should then tighten the
> constraint, in this case that means only having maxItems.
> 
> > +          maxItems: 2
> > +
> > +        clock-names:
> > +          items:
> > +            - const: core
> > +            - const: pix0
> 
> Does this even work when the top level schema thinks clock 2 should be
> called axi?

Additionally here, only having core and pix0 seems like it might be an
oversimplification. I doubt removing the second output port means that
the axi and ahb clocks are no longer needed.
Is it the case that your device supplies the same clock to core, ahb and
axi? If so, then you should fill those clocks in in your devicetree and
this can just constrain the number of clocks/clock-names to 4.
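
To make the quoted point about minimum constraints concrete: allOf is a
conjunction, so a conditional block can only narrow the item-count range the
top level already allows. A minimal sketch of that reasoning as interval
intersection, assuming the top level currently pins clocks to exactly five
items (inferred from the commit message and the review comments, not read
from the schema). The toy_* names are hypothetical and are not dt-schema
code.

```c
#include <assert.h>
#include <stdbool.h>

/* Inclusive item-count range, modelling minItems/maxItems. */
struct toy_range {
	int min;
	int max;
};

/* allOf requires both constraints to hold, so intersect the ranges. */
static struct toy_range toy_intersect(struct toy_range a, struct toy_range b)
{
	struct toy_range r = {
		.min = a.min > b.min ? a.min : b.min,
		.max = a.max < b.max ? a.max : b.max,
	};
	return r;
}

/* An empty range means no node can ever validate. */
static bool toy_satisfiable(struct toy_range r)
{
	return r.min <= r.max;
}

/* Usage: the three cases discussed in the review. */
void toy_demo(void)
{
	struct toy_range top = { 5, 5 };	/* assumed current top level */
	struct toy_range thead = { 5, 5 };	/* conditional repeats it: no effect */
	struct toy_range nuvoton = { 2, 2 };	/* conditional tries to lower min */
	struct toy_range relaxed = { 2, 5 };	/* min lowered at the top level */
	struct toy_range max_only = { 0, 2 };	/* conditional with only maxItems */
	struct toy_range r;

	r = toy_intersect(top, thead);
	assert(r.min == 5 && r.max == 5);
	assert(!toy_satisfiable(toy_intersect(top, nuvoton)));
	r = toy_intersect(relaxed, max_only);
	assert(r.min == 2 && r.max == 2);
}
```

Under these assumptions the second case is unsatisfiable, which is why the
minimum has to be lowered at the top level and the conditional section left
to tighten the maximum.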

> 
> > +
> > +        resets:
> > +          minItems: 1
> > +          maxItems: 1
> > +
> > +        reset-names:
> > +          items:
> > +            - const: core
> 
> This is just maxItems: 1.
> 
> pw-bot: changes-requested
> 
> Thanks,
> Conor.
> 
> > +
> > +      required:
> > +        - resets
> > +        - reset-names
> > +
> >  additionalProperties: false
> >  
> >  examples:
> > -- 
> > 2.43.0
> > 



^ permalink raw reply

* Re: [PATCH v3 8/8] KVM: arm64: Implement lazy vCPU state sync for non-protected guests
From: Vincent Donnefort @ 2026-06-26  7:26 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
	linux-kernel, Catalin Marinas, Will Deacon, Joey Gouly,
	Steffen Eiden, Suzuki K Poulose, Zenghui Yu, Quentin Perret,
	Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260626070408.3420953-9-fuad.tabba@linux.dev>

On Fri, Jun 26, 2026 at 08:04:08AM +0100, Fuad Tabba wrote:
> pKVM copies a non-protected guest's register context between the host
> and the hypervisor on every world switch, even when the host never
> inspects it. Defer the copy: on entry, flush the host context into the
> hyp vCPU only when the host marked it dirty (PKVM_HOST_STATE_DIRTY); on
> exit, leave it in the hyp vCPU and copy it back only when the host needs
> it, via a __pkvm_vcpu_sync_state hypercall or at vcpu put. A protected
> guest's context is copied as before, since lazy sync only helps where
> the host is trusted to see the guest's registers.
> 
> PC and PSTATE are the exception: they are copied back on every exit so
> the kvm_exit tracepoint reports the guest's real exit PC, and the run
> loop's vcpu_mode_is_bad_32bit() and SError-masking checks evaluate the
> guest's current PSTATE rather than the value left by the previous sync.
> 
> The host needs the full context when it is about to read it (trap
> handling) or write it (the SError injection that writes ESR_EL1). Sync
> both from handle_exit_early(), which runs non-preemptible so the loaded
> hyp vCPU is stable without a preempt guard.
> 
> Signed-off-by: Fuad Tabba <fuad.tabba@linux.dev>

Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
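
For readers following along, the flow described in the commit message can be
modelled in a few lines of userspace C. This is only a sketch under stated
assumptions: the toy_* types and functions are hypothetical stand-ins, the
context is reduced to three fields, and the vcpu put path is left out. It is
not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy register context: only the fields needed to show the flow. */
struct toy_ctxt {
	unsigned long pc;
	unsigned long pstate;
	unsigned long x0;	/* stands in for the rest of the context */
};

struct toy_vcpu {
	struct toy_ctxt host;	/* copy the host can inspect */
	struct toy_ctxt hyp;	/* copy the hypervisor runs the guest from */
	bool host_state_dirty;	/* models PKVM_HOST_STATE_DIRTY */
};

/* Entry: flush host -> hyp only when the host marked its copy dirty. */
static void toy_flush_on_entry(struct toy_vcpu *v)
{
	if (v->host_state_dirty)
		v->hyp = v->host;
}

/* Exit: leave the context in the hyp copy, but keep PC and PSTATE current. */
static void toy_sync_on_exit(struct toy_vcpu *v)
{
	v->host.pc = v->hyp.pc;
	v->host.pstate = v->hyp.pstate;
}

/* Host side after an exit: full sync only when the host will touch the state. */
static void toy_handle_exit(struct toy_vcpu *v, bool host_needs_state)
{
	if (host_needs_state) {
		v->host = v->hyp;	/* models __pkvm_vcpu_sync_state */
		v->host_state_dirty = true;
	} else {
		v->host_state_dirty = false;
	}
}

/* Usage: one exit the host ignores, then one it must inspect. */
void toy_demo(void)
{
	struct toy_vcpu v = {
		.host = { .pc = 0x1000, .pstate = 5, .x0 = 1 },
		.host_state_dirty = true,	/* start dirty, as the patch does */
	};

	toy_flush_on_entry(&v);		/* dirty: host -> hyp */
	v.hyp.pc = 0x1004;		/* guest runs and changes state */
	v.hyp.x0 = 42;
	toy_sync_on_exit(&v);
	toy_handle_exit(&v, false);	/* exit the host does not inspect */
	assert(v.host.pc == 0x1004);	/* PC is always current */
	assert(v.host.x0 == 1);		/* the rest is stale by design */

	toy_flush_on_entry(&v);		/* clean: nothing is copied */
	toy_sync_on_exit(&v);
	toy_handle_exit(&v, true);	/* exit the host must inspect */
	assert(v.host.x0 == 42 && v.host_state_dirty);
}
```

The point of the model is that the full copy happens only on the second
exit, while PC and PSTATE are refreshed on both.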

> ---
>  arch/arm64/include/asm/kvm_asm.h   |  1 +
>  arch/arm64/include/asm/kvm_host.h  |  2 +
>  arch/arm64/kvm/arm.c               |  7 +++
>  arch/arm64/kvm/handle_exit.c       | 23 ++++++++
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c | 86 ++++++++++++++++++++++++++++--
>  5 files changed, 114 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 043495f7fc78b..6e1135b3ded44 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -113,6 +113,7 @@ enum __kvm_host_smccc_func {
>  	__KVM_HOST_SMCCC_FUNC___pkvm_finalize_teardown_vm,
>  	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_load,
>  	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_put,
> +	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_sync_state,
>  	__KVM_HOST_SMCCC_FUNC___pkvm_tlb_flush_vmid,
>  
>  	MARKER(__KVM_HOST_SMCCC_FUNC_MAX)
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 2faa60df847d2..caa39ee5125f2 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -1068,6 +1068,8 @@ struct kvm_vcpu_arch {
>  #define INCREMENT_PC		__vcpu_single_flag(iflags, BIT(1))
>  /* Target EL/MODE (not a single flag, but let's abuse the macro) */
>  #define EXCEPT_MASK		__vcpu_single_flag(iflags, GENMASK(3, 1))
> +/* Host-set: the hyp flushes the non-protected vCPU state in on entry */
> +#define PKVM_HOST_STATE_DIRTY	__vcpu_single_flag(iflags, BIT(4))
>  
>  /* Helpers to encode exceptions with minimum fuss */
>  #define __EXCEPT_MASK_VAL	unpack_vcpu_flag(EXCEPT_MASK)
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 3732ee9eb0d4e..4e89558d80278 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -733,6 +733,10 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>  	if (is_protected_kvm_enabled()) {
>  		kvm_call_hyp(__vgic_v3_save_aprs, &vcpu->arch.vgic_cpu.vgic_v3);
>  		kvm_call_hyp_nvhe(__pkvm_vcpu_put);
> +
> +		/* __pkvm_vcpu_put implies a sync of the state */
> +		if (!kvm_vm_is_protected(vcpu->kvm))
> +			vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
>  	}
>  
>  	kvm_vcpu_put_debug(vcpu);
> @@ -964,6 +968,9 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
>  		return ret;
>  
>  	if (is_protected_kvm_enabled()) {
> +		/* Start with the vcpu in a dirty state */
> +		if (!kvm_vm_is_protected(vcpu->kvm))
> +			vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
>  		ret = pkvm_create_hyp_vm(kvm);
>  		if (ret)
>  			return ret;
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 54aedf93c78b6..29108e5c0206e 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -486,9 +486,32 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index)
>  	}
>  }
>  
> +static void handle_exit_pkvm_state(struct kvm_vcpu *vcpu, int exception_index)
> +{
> +	int exception_code = ARM_EXCEPTION_CODE(exception_index);
> +
> +	if (!is_protected_kvm_enabled() || kvm_vm_is_protected(vcpu->kvm))
> +		return;
> +
> +	/*
> +	 * Sync the context back when the host will read (trap) or write
> +	 * (SError) it. Preempt-off here, so the loaded hyp vCPU is stable.
> +	 */
> +	if (exception_code == ARM_EXCEPTION_TRAP ||
> +	    exception_code == ARM_EXCEPTION_EL1_SERROR ||
> +	    ARM_SERROR_PENDING(exception_index)) {
> +		kvm_call_hyp_nvhe(__pkvm_vcpu_sync_state);
> +		vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
> +	} else {
> +		vcpu_clear_flag(vcpu, PKVM_HOST_STATE_DIRTY);
> +	}
> +}
> +
>  /* For exit types that need handling before we can be preempted */
>  void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index)
>  {
> +	handle_exit_pkvm_state(vcpu, exception_index);
> +
>  	if (ARM_SERROR_PENDING(exception_index)) {
>  		if (this_cpu_has_cap(ARM64_HAS_RAS_EXTN)) {
>  			u64 disr = kvm_vcpu_get_disr(vcpu);
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index 0194965930e61..acf53aae4fe43 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -141,6 +141,48 @@ static void sync_hyp_vgic_state(struct pkvm_hyp_vcpu *hyp_vcpu)
>  		host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i];
>  }
>  
> +static void __copy_vcpu_state(const struct kvm_vcpu *from_vcpu,
> +			      struct kvm_vcpu *to_vcpu)
> +{
> +	int i;
> +
> +	to_vcpu->arch.ctxt.regs		= from_vcpu->arch.ctxt.regs;
> +	to_vcpu->arch.ctxt.spsr_abt	= from_vcpu->arch.ctxt.spsr_abt;
> +	to_vcpu->arch.ctxt.spsr_und	= from_vcpu->arch.ctxt.spsr_und;
> +	to_vcpu->arch.ctxt.spsr_irq	= from_vcpu->arch.ctxt.spsr_irq;
> +	to_vcpu->arch.ctxt.spsr_fiq	= from_vcpu->arch.ctxt.spsr_fiq;
> +	to_vcpu->arch.ctxt.fp_regs	= from_vcpu->arch.ctxt.fp_regs;
> +
> +	/*
> +	 * Copy the sysregs, but don't mess with the timer state which
> +	 * is directly handled by EL1 and is expected to be preserved.
> +	 * enum vcpu_sysreg is sparse: VNCR-mapped registers take values
> +	 * derived from their VNCR page offset, so the timer registers do
> +	 * not form a contiguous numeric range and must be skipped by name.
> +	 */
> +	for (i = 1; i < NR_SYS_REGS; i++) {
> +		switch (i) {
> +		case CNTVOFF_EL2:
> +		case CNTV_CVAL_EL0:
> +		case CNTV_CTL_EL0:
> +		case CNTP_CVAL_EL0:
> +		case CNTP_CTL_EL0:
> +			continue;
> +		}
> +		to_vcpu->arch.ctxt.sys_regs[i] = from_vcpu->arch.ctxt.sys_regs[i];
> +	}
> +}
> +
> +static void sync_hyp_vcpu_state(struct pkvm_hyp_vcpu *hyp_vcpu)
> +{
> +	__copy_vcpu_state(&hyp_vcpu->vcpu, hyp_vcpu->host_vcpu);
> +}
> +
> +static void flush_hyp_vcpu_state(struct pkvm_hyp_vcpu *hyp_vcpu)
> +{
> +	__copy_vcpu_state(hyp_vcpu->host_vcpu, &hyp_vcpu->vcpu);
> +}
> +
>  static void flush_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)
>  {
>  	struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
> @@ -170,7 +212,17 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
>  	fpsimd_sve_flush();
>  	flush_debug_state(hyp_vcpu);
>  
> -	hyp_vcpu->vcpu.arch.ctxt	= host_vcpu->arch.ctxt;
> +	/*
> +	 * If we deal with a non-protected guest and the state is potentially
> +	 * dirty (from a host perspective), copy the state back into the hyp
> +	 * vcpu.
> +	 */
> +	if (!pkvm_hyp_vcpu_is_protected(hyp_vcpu)) {
> +		if (vcpu_get_flag(host_vcpu, PKVM_HOST_STATE_DIRTY))
> +			flush_hyp_vcpu_state(hyp_vcpu);
> +	} else {
> +		hyp_vcpu->vcpu.arch.ctxt = host_vcpu->arch.ctxt;
> +	}
>  
>  	/* __hyp_running_vcpu must be NULL in a guest context. */
>  	hyp_vcpu->vcpu.arch.ctxt.__hyp_running_vcpu = NULL;
> @@ -201,9 +253,13 @@ static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
>  	fpsimd_sve_sync(&hyp_vcpu->vcpu);
>  	sync_debug_state(hyp_vcpu);
>  
> -	host_vcpu->arch.ctxt		= hyp_vcpu->vcpu.arch.ctxt;
> -
> -	host_vcpu->arch.hcr_el2		= hyp_vcpu->vcpu.arch.hcr_el2;
> +	if (pkvm_hyp_vcpu_is_protected(hyp_vcpu)) {
> +		host_vcpu->arch.ctxt = hyp_vcpu->vcpu.arch.ctxt;
> +	} else {
> +		/* Keep PC (tracepoint) and PSTATE (vcpu_mode_is_bad_32bit) current. */
> +		host_vcpu->arch.ctxt.regs.pc = hyp_vcpu->vcpu.arch.ctxt.regs.pc;
> +		host_vcpu->arch.ctxt.regs.pstate = hyp_vcpu->vcpu.arch.ctxt.regs.pstate;
> +	}
>  
>  	host_vcpu->arch.fault		= hyp_vcpu->vcpu.arch.fault;
>  
> @@ -237,8 +293,27 @@ static void handle___pkvm_vcpu_put(struct kvm_cpu_context *host_ctxt)
>  {
>  	struct pkvm_hyp_vcpu *hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
>  
> -	if (hyp_vcpu)
> +	if (hyp_vcpu) {
> +		struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
> +
> +		if (!pkvm_hyp_vcpu_is_protected(hyp_vcpu) &&
> +		    !vcpu_get_flag(host_vcpu, PKVM_HOST_STATE_DIRTY)) {
> +			sync_hyp_vcpu_state(hyp_vcpu);
> +		}
> +
>  		pkvm_put_hyp_vcpu(hyp_vcpu);
> +	}
> +}
> +
> +static void handle___pkvm_vcpu_sync_state(struct kvm_cpu_context *host_ctxt)
> +{
> +	struct pkvm_hyp_vcpu *hyp_vcpu;
> +
> +	hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
> +	if (!hyp_vcpu || pkvm_hyp_vcpu_is_protected(hyp_vcpu))
> +		return;
> +
> +	sync_hyp_vcpu_state(hyp_vcpu);
>  }
>  
>  static struct kvm_vcpu *__get_host_hyp_vcpus(struct kvm_vcpu *arg,
> @@ -869,6 +944,7 @@ static const hcall_t host_hcall[] = {
>  	HANDLE_FUNC(__pkvm_finalize_teardown_vm),
>  	HANDLE_FUNC(__pkvm_vcpu_load),
>  	HANDLE_FUNC(__pkvm_vcpu_put),
> +	HANDLE_FUNC(__pkvm_vcpu_sync_state),
>  	HANDLE_FUNC(__pkvm_tlb_flush_vmid),
>  };
>  
> -- 
> 2.39.5
> 
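
The skip-by-name loop in __copy_vcpu_state() is worth a small illustration,
since a range check would not work when the excluded indices are not
adjacent. A minimal sketch, assuming a made-up toy_sysreg enum whose values
are illustrative only and do not match enum vcpu_sysreg:

```c
#include <assert.h>

/* Hypothetical stand-in for a register index enum; values are illustrative. */
enum toy_sysreg {
	TOY_INVALID = 0,	/* index 0 is never copied, the loop starts at 1 */
	TOY_SCTLR,
	TOY_CNTV_CTL,		/* timer register */
	TOY_TCR,
	TOY_CNTV_CVAL,		/* timer register, not adjacent to TOY_CNTV_CTL */
	TOY_TTBR0,
	TOY_NR_REGS
};

/* Copy every register except the timer ones, which are skipped by name. */
static void toy_copy_sysregs(const unsigned long *from, unsigned long *to)
{
	int i;

	for (i = 1; i < TOY_NR_REGS; i++) {
		switch (i) {
		case TOY_CNTV_CTL:
		case TOY_CNTV_CVAL:
			continue;	/* destination keeps its timer state */
		}
		to[i] = from[i];
	}
}

/* Usage: timer slots in the destination survive the copy. */
void toy_demo(void)
{
	unsigned long from[TOY_NR_REGS] = { 9, 1, 2, 3, 4, 5 };
	unsigned long to[TOY_NR_REGS] = { 0, 0, 20, 0, 40, 0 };

	toy_copy_sysregs(from, to);
	assert(to[TOY_SCTLR] == 1 && to[TOY_TCR] == 3 && to[TOY_TTBR0] == 5);
	assert(to[TOY_CNTV_CTL] == 20 && to[TOY_CNTV_CVAL] == 40);
}
```

The continue inside the switch applies to the enclosing for loop, so the
assignment is skipped for the named cases only.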


^ permalink raw reply

* [PATCH 1/4] firmware: raspberrypi: reorder rpi_firmware_property_tag enum
From: Gregor Herburger @ 2026-06-26  7:35 UTC (permalink / raw)
  To: Florian Fainelli, Broadcom internal kernel review list, Ray Jui,
	Scott Branden, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Eric Anholt, Stefan Wahren
  Cc: linux-rpi-kernel, linux-arm-kernel, linux-kernel, devicetree,
	Gregor Herburger
In-Reply-To: <20260626-rpi-tryboot-v1-0-490b1c4c4970@linutronix.de>

The enum was once ordered by tag value. Tags added later were inserted
out of order. Sort the tags by value again.

No functional change intended.

Signed-off-by: Gregor Herburger <gregor.herburger@linutronix.de>
---
 include/soc/bcm2835/raspberrypi-firmware.h | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/soc/bcm2835/raspberrypi-firmware.h b/include/soc/bcm2835/raspberrypi-firmware.h
index e1f87fbfe5542..66cc5a426c3c5 100644
--- a/include/soc/bcm2835/raspberrypi-firmware.h
+++ b/include/soc/bcm2835/raspberrypi-firmware.h
@@ -72,26 +72,26 @@ enum rpi_firmware_property_tag {
 	RPI_FIRMWARE_GET_EDID_BLOCK =                         0x00030020,
 	RPI_FIRMWARE_GET_CUSTOMER_OTP =                       0x00030021,
 	RPI_FIRMWARE_GET_DOMAIN_STATE =                       0x00030030,
+	RPI_FIRMWARE_GET_GPIO_STATE =                         0x00030041,
+	RPI_FIRMWARE_GET_GPIO_CONFIG =                        0x00030043,
+	RPI_FIRMWARE_GET_PERIPH_REG =                         0x00030045,
 	RPI_FIRMWARE_GET_THROTTLED =                          0x00030046,
 	RPI_FIRMWARE_GET_CLOCK_MEASURED =                     0x00030047,
 	RPI_FIRMWARE_NOTIFY_REBOOT =                          0x00030048,
+	RPI_FIRMWARE_GET_POE_HAT_VAL =                        0x00030049,
+	RPI_FIRMWARE_SET_POE_HAT_VAL =                        0x00030050,
+	RPI_FIRMWARE_NOTIFY_XHCI_RESET =                      0x00030058,
+	RPI_FIRMWARE_NOTIFY_DISPLAY_DONE =                    0x00030066,
 	RPI_FIRMWARE_SET_CLOCK_STATE =                        0x00038001,
 	RPI_FIRMWARE_SET_CLOCK_RATE =                         0x00038002,
 	RPI_FIRMWARE_SET_VOLTAGE =                            0x00038003,
 	RPI_FIRMWARE_SET_TURBO =                              0x00038009,
 	RPI_FIRMWARE_SET_CUSTOMER_OTP =                       0x00038021,
 	RPI_FIRMWARE_SET_DOMAIN_STATE =                       0x00038030,
-	RPI_FIRMWARE_GET_GPIO_STATE =                         0x00030041,
 	RPI_FIRMWARE_SET_GPIO_STATE =                         0x00038041,
 	RPI_FIRMWARE_SET_SDHOST_CLOCK =                       0x00038042,
-	RPI_FIRMWARE_GET_GPIO_CONFIG =                        0x00030043,
 	RPI_FIRMWARE_SET_GPIO_CONFIG =                        0x00038043,
-	RPI_FIRMWARE_GET_PERIPH_REG =                         0x00030045,
 	RPI_FIRMWARE_SET_PERIPH_REG =                         0x00038045,
-	RPI_FIRMWARE_GET_POE_HAT_VAL =                        0x00030049,
-	RPI_FIRMWARE_SET_POE_HAT_VAL =                        0x00030050,
-	RPI_FIRMWARE_NOTIFY_XHCI_RESET =                      0x00030058,
-	RPI_FIRMWARE_NOTIFY_DISPLAY_DONE =                    0x00030066,
 
 	/* Dispmanx TAGS */
 	RPI_FIRMWARE_FRAMEBUFFER_ALLOCATE =                   0x00040001,
@@ -107,7 +107,6 @@ enum rpi_firmware_property_tag {
 	RPI_FIRMWARE_FRAMEBUFFER_GET_PALETTE =                0x0004000b,
 	RPI_FIRMWARE_FRAMEBUFFER_GET_TOUCHBUF =               0x0004000f,
 	RPI_FIRMWARE_FRAMEBUFFER_GET_GPIOVIRTBUF =            0x00040010,
-	RPI_FIRMWARE_FRAMEBUFFER_RELEASE =                    0x00048001,
 	RPI_FIRMWARE_FRAMEBUFFER_TEST_PHYSICAL_WIDTH_HEIGHT = 0x00044003,
 	RPI_FIRMWARE_FRAMEBUFFER_TEST_VIRTUAL_WIDTH_HEIGHT =  0x00044004,
 	RPI_FIRMWARE_FRAMEBUFFER_TEST_DEPTH =                 0x00044005,
@@ -117,6 +116,7 @@ enum rpi_firmware_property_tag {
 	RPI_FIRMWARE_FRAMEBUFFER_TEST_OVERSCAN =              0x0004400a,
 	RPI_FIRMWARE_FRAMEBUFFER_TEST_PALETTE =               0x0004400b,
 	RPI_FIRMWARE_FRAMEBUFFER_TEST_VSYNC =                 0x0004400e,
+	RPI_FIRMWARE_FRAMEBUFFER_RELEASE =                    0x00048001,
 	RPI_FIRMWARE_FRAMEBUFFER_SET_PHYSICAL_WIDTH_HEIGHT =  0x00048003,
 	RPI_FIRMWARE_FRAMEBUFFER_SET_VIRTUAL_WIDTH_HEIGHT =   0x00048004,
 	RPI_FIRMWARE_FRAMEBUFFER_SET_DEPTH =                  0x00048005,
@@ -125,10 +125,10 @@ enum rpi_firmware_property_tag {
 	RPI_FIRMWARE_FRAMEBUFFER_SET_VIRTUAL_OFFSET =         0x00048009,
 	RPI_FIRMWARE_FRAMEBUFFER_SET_OVERSCAN =               0x0004800a,
 	RPI_FIRMWARE_FRAMEBUFFER_SET_PALETTE =                0x0004800b,
-	RPI_FIRMWARE_FRAMEBUFFER_SET_TOUCHBUF =               0x0004801f,
-	RPI_FIRMWARE_FRAMEBUFFER_SET_GPIOVIRTBUF =            0x00048020,
 	RPI_FIRMWARE_FRAMEBUFFER_SET_VSYNC =                  0x0004800e,
 	RPI_FIRMWARE_FRAMEBUFFER_SET_BACKLIGHT =              0x0004800f,
+	RPI_FIRMWARE_FRAMEBUFFER_SET_TOUCHBUF =               0x0004801f,
+	RPI_FIRMWARE_FRAMEBUFFER_SET_GPIOVIRTBUF =            0x00048020,
 
 	RPI_FIRMWARE_VCHIQ_INIT =                             0x00048010,
 

-- 
2.47.3



^ permalink raw reply related

* [PATCH 2/4] dt-bindings: raspberrypi,bcm2835-firmware: Include 'reboot-mode.yaml'
From: Gregor Herburger @ 2026-06-26  7:35 UTC (permalink / raw)
  To: Florian Fainelli, Broadcom internal kernel review list, Ray Jui,
	Scott Branden, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Eric Anholt, Stefan Wahren
  Cc: linux-rpi-kernel, linux-arm-kernel, linux-kernel, devicetree,
	Gregor Herburger
In-Reply-To: <20260626-rpi-tryboot-v1-0-490b1c4c4970@linutronix.de>

The Raspberry Pi firmware supports a reboot mode called tryboot, which
attempts to boot from a different partition so that the boot partition
can be updated. Allow reboot mode properties by referencing the
reboot-mode schema.

Signed-off-by: Gregor Herburger <gregor.herburger@linutronix.de>
---
 .../devicetree/bindings/arm/bcm/raspberrypi,bcm2835-firmware.yaml    | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/arm/bcm/raspberrypi,bcm2835-firmware.yaml b/Documentation/devicetree/bindings/arm/bcm/raspberrypi,bcm2835-firmware.yaml
index 983ea80eaec97..30b490e0d9fb3 100644
--- a/Documentation/devicetree/bindings/arm/bcm/raspberrypi,bcm2835-firmware.yaml
+++ b/Documentation/devicetree/bindings/arm/bcm/raspberrypi,bcm2835-firmware.yaml
@@ -133,11 +133,14 @@ properties:
     required:
       - compatible
 
+allOf:
+  - $ref: /schemas/power/reset/reboot-mode.yaml#
+
 required:
   - compatible
   - mboxes
 
-additionalProperties: false
+unevaluatedProperties: false
 
 examples:
   - |

-- 
2.47.3



^ permalink raw reply related
