Linux Power Management development
* Re: [PATCH 3/8] firmware: meson: sm: Add thermal calibration SMC call
From: Ronald Claveau @ 2026-04-13 10:31 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: linux-pm, linux-amlogic, devicetree, linux-kernel,
	linux-arm-kernel, Guillaume La Roque, Rafael J. Wysocki,
	Daniel Lezcano, Zhang Rui, Lukasz Luba, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Neil Armstrong, Kevin Hilman,
	Jerome Brunet, Martin Blumenstingl
In-Reply-To: <6200b372-149f-48f7-ae91-a5364562058c@kernel.org>

On 4/12/26 12:47 PM, Krzysztof Kozlowski wrote:
> On 10/04/2026 18:48, Ronald Claveau wrote:
>> @@ -245,6 +246,14 @@ struct meson_sm_firmware *meson_sm_get(struct device_node *sm_node)
>>  }
>>  EXPORT_SYMBOL_GPL(meson_sm_get);
>>  
>> +int meson_sm_get_thermal_calib(struct meson_sm_firmware *fw, u32 *trim_info,
> 
> Exported functions should have kerneldoc.
> 

Thanks for your feedback, I will add it.
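
A kerneldoc block for the new export could look roughly like the sketch below. The parameter descriptions are inferred from the quoted hunk and are assumptions, not text from the patch. The typedefs, the struct body, the command index value and the meson_sm_call() stub are userspace stand-ins so the snippet builds outside the kernel.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;	/* stand-ins for the kernel integer types */
typedef int32_t s32;

/* Stand-in: the real structure is private to the meson_sm driver. */
struct meson_sm_firmware {
	u32 fake_trim;	/* hypothetical value the stub hands back */
};

/* Stand-in index; the real value would come from the driver's enum. */
#define SM_THERMAL_CALIB_READ 0

/* Stub shaped like meson_sm_call(); no secure monitor call is made. */
static int meson_sm_call(struct meson_sm_firmware *fw, unsigned int cmd_index,
			 s32 *ret, u32 arg0, u32 arg1, u32 arg2, u32 arg3,
			 u32 arg4)
{
	(void)cmd_index; (void)arg0; (void)arg1;
	(void)arg2; (void)arg3; (void)arg4;

	if (!fw)
		return -22;	/* -EINVAL */
	if (ret)
		*ret = (s32)fw->fake_trim;
	return 0;
}

/**
 * meson_sm_get_thermal_calib() - read thermal sensor calibration data
 * @fw:		secure monitor firmware handle
 * @trim_info:	filled with the calibration (trim) word on success
 * @tsensor_id:	identifier of the thermal sensor to query
 *
 * Return: 0 on success, a negative error code otherwise.
 */
int meson_sm_get_thermal_calib(struct meson_sm_firmware *fw, u32 *trim_info,
			       u32 tsensor_id)
{
	return meson_sm_call(fw, SM_THERMAL_CALIB_READ, (s32 *)trim_info,
			     tsensor_id, 0, 0, 0, 0);
}
```

The comment follows the usual kerneldoc layout: a one-line summary, one @name line per parameter, then a "Return:" section.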

>> +			       u32 tsensor_id)
>> +{
>> +	return meson_sm_call(fw, SM_THERMAL_CALIB_READ, trim_info, tsensor_id,
>> +			     0, 0, 0, 0);
> 
> Best regards,
> Krzysztof


-- 
Best regards,
Ronald


* Re: [PATCH v4 0/2] Add QEMU virt-ctrl driver and update m68k virt
From: Geert Uytterhoeven @ 2026-04-13 10:19 UTC (permalink / raw)
  To: Kuan-Wei Chiu
  Cc: sre, jserv, eleanor15x, daniel, laurent, linux-kernel, linux-m68k,
	linux-pm
In-Reply-To: <20260412211952.3564033-1-visitorckw@gmail.com>

Hi Kuan-Wei,

On Sun, 12 Apr 2026 at 23:20, Kuan-Wei Chiu <visitorckw@gmail.com> wrote:
> Introduce a generic platform driver for the QEMU 'virt-ctrl' device [1]
> and transition the m68k 'virt' machine to use it, replacing
> architecture-specific hooks.
>
> The new driver ('qemu-virt-ctrl') registers a restart handler and
> populates the global 'pm_power_off' callback.
>
> On the m68k side, the platform initialization is updated to register
> the 'qemu-virt-ctrl' platform device. Additionally, the 'mach_reset'
> hook is bridged to 'do_kernel_restart()' to ensure the kernel's restart
> handler chain is correctly invoked.
>
> Verified on QEMU m68k virt. Both system reset and power-off were
> confirmed functional by invoking 'reboot(LINUX_REBOOT_CMD_RESTART)',
> 'reboot(LINUX_REBOOT_CMD_POWER_OFF)', and
> 'reboot(LINUX_REBOOT_CMD_HALT)' from userspace.
>
> Link: https://gitlab.com/qemu-project/qemu/-/blob/v10.2.0/hw/misc/virt_ctrl.c [1]
> ---
> Changes in v4:
> - Fix sparse warning caught by kernel test robot.
> - Pass the struct qemu_virt_ctrl context to sys-off handlers instead of
>   the __iomem pointer.

Thanks, I replaced v3 with v4 in my queue in the m68k tree for v7.1,
and will postpone my PR by a few days to give it some coverage in
linux-next.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH v3] PM: hibernate: call preallocate_image after freeze prepare
From: Matthew Leach @ 2026-04-13  9:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Pavel Machek, Len Brown, Mario Limonciello, linux-pm,
	linux-kernel, YoungJun Park, kernel
In-Reply-To: <CAJZ5v0j7fqyL2GtWzwpWP8ATj=WnuQGC5z8Zhz7VO-jixrYnpA@mail.gmail.com>

Hi Rafael,

"Rafael J. Wysocki" <rafael@kernel.org> writes:

> On Fri, Apr 3, 2026 at 9:36 AM Matthew Leach
> <matthew.leach@collabora.com> wrote:
>>

[...]

> Can you please have a look at
>
> https://sashiko.dev/#/patchset/20260403-hibernation-fixes-v3-1-31bc9fa3ba2d%40collabora.com

[pasting the comment inline here so I can reply to it]

> Does this relocation introduce a deadlock during memory reclaim?
> 
> hibernate_preallocate_memory() allocates a large amount of memory and
> triggers direct reclaim. Direct reclaim needs to write back dirty file
> pages and swap out anonymous pages.
> 
> Since freeze_kernel_threads() just ran, threads required for I/O
> completion (like kswapd, jbd2 for Ext4, or WQ_FREEZABLE workqueues) are
> currently frozen. Will the I/O for page reclaim block indefinitely
> waiting on these frozen threads?

The existing code already performs a memory reclaim after
freeze_kernel_threads(). The old shrink_shmem_memory() called
shrink_all_memory() in the same position, after both
freeze_kernel_threads() and dpm_prepare(). This isn't a new pattern
being introduced by this patch.

Nevertheless, the call chain looks like:

shrink_all_memory()
  -> do_try_to_free_pages()
    -> shrink_zones()
      -> shrink_node()
        -> shrink_folio_list()
          -> pageout()

pageout() only writes back shmem and anonymous folios to swap, so jbd2
and other FS threads being frozen isn't a concern. For the swap write
out, the I/O submission path is via submit_bio() which is also
synchronous.

> Additionally, because the OOM killer is already disabled by
> freeze_processes() earlier in the hibernation path, can the reclaim path
> get stuck permanently without being able to fall back to killing
> processes?

There's nothing new here regarding the OOM killer. freeze_processes()
disables it in hibernate() prior to calling hibernation_snapshot().
Since this patch is entirely contained within hibernation_snapshot()
that pattern hasn't changed.

Regards,
-- 
Matt


* [RFC PATCH 0/2] Decouple ftrace/livepatch from module loader via notifier priority and reverse traversal
From: chensong_2000 @ 2026-04-13  8:01 UTC (permalink / raw)
  To: rafael, lenb, mturquette, sboyd, viresh.kumar, agk, snitzer,
	mpatocka, bmarzins, song, yukuai, linan122, jason.wessel, danielt,
	dianders, horms, davem, edumazet, kuba, pabeni, paulmck, frederic,
	mcgrof, petr.pavlu, da.gomez, samitolvanen, atomlin, jpoimboe,
	jikos, mbenes, pmladek, joe.lawrence, rostedt, mhiramat,
	mark.rutland, mathieu.desnoyers
  Cc: linux-modules, linux-kernel, linux-trace-kernel, linux-acpi,
	linux-clk, linux-pm, live-patching, dm-devel, linux-raid,
	kgdb-bugreport, netdev, Song Chen

From: Song Chen <chensong_2000@189.cn>

This patchset addresses a long-standing tight coupling between the
module loader and two of its key consumers: ftrace and livepatch.

Background:

The module loader currently hard-codes direct calls to
ftrace_module_enable(), klp_module_coming(), klp_module_going() and
ftrace_release_mod() inside prepare_coming_module() and the module
unload path. This hard-coding was necessary because the module notifier
chain could not guarantee the strict call ordering that ftrace and
livepatch require:

  During MODULE_STATE_COMING, ftrace must run before livepatch, so
  that per-module function records are ready before livepatch registers
  its ftrace hooks.

  During MODULE_STATE_GOING, livepatch must run before ftrace, so that
  livepatch removes its hooks before ftrace releases those records.

This symmetric setup/teardown ordering could not be expressed through
the notifier chain because the chain only supported forward (descending
priority) traversal. Without reverse traversal, it was impossible to
guarantee that the GOING order would be the strict inverse of the
COMING order using a single priority value per notifier.

Patch 1 - notifier: replace single-linked list with double-linked list.
Patch 2 - ftrace/klp: decouple from module loader using notifier
priority.
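
The ordering requirement can be illustrated with a small userspace model. This is a sketch, not the code from the patches: callbacks sit on a doubly-linked list sorted by descending priority, COMING walks it head to tail and GOING walks it tail to head. All names here (struct nb, chain_register(), chain_call(), the priority values) are invented for illustration.

```c
#include <stddef.h>

enum { STATE_COMING, STATE_GOING };

struct nb {
	void (*call)(int state);
	int priority;		/* higher runs first on forward traversal */
	struct nb *prev, *next;
};

struct chain {
	struct nb *head, *tail;
};

/* Insert while keeping the list sorted by descending priority. */
static void chain_register(struct chain *c, struct nb *n)
{
	struct nb *pos = c->head;

	while (pos && pos->priority >= n->priority)
		pos = pos->next;

	n->next = pos;
	n->prev = pos ? pos->prev : c->tail;
	if (n->prev)
		n->prev->next = n;
	else
		c->head = n;
	if (pos)
		pos->prev = n;
	else
		c->tail = n;
}

/* Forward traversal for setup, reverse traversal for teardown. */
static void chain_call(struct chain *c, int state)
{
	struct nb *n;

	if (state == STATE_GOING)
		for (n = c->tail; n; n = n->prev)
			n->call(state);
	else
		for (n = c->head; n; n = n->next)
			n->call(state);
}

/* Record of who ran, in order: 'f' = ftrace, 'k' = livepatch. */
static char order_log[8];
static size_t order_len;

static void log_call(char who)
{
	if (order_len < sizeof(order_log))
		order_log[order_len++] = who;
}

static void ftrace_cb(int state) { (void)state; log_call('f'); }
static void klp_cb(int state) { (void)state; log_call('k'); }
```

Registering ftrace_cb with the higher priority is enough to get ftrace before livepatch on COMING and livepatch before ftrace on GOING, with a single priority value per callback.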

Heads-up: the SMTP server for my mailbox has been unreliable lately, so
if I receive a bounce I will have to resend. Sorry in advance.

Song Chen (2):
  kernel/notifier: replace single-linked list with double-linked list
    for reverse traversal
  kernel/module: Decouple klp and ftrace from load_module

 drivers/acpi/sleep.c      |   1 -
 drivers/clk/clk.c         |   2 +-
 drivers/cpufreq/cpufreq.c |   2 +-
 drivers/md/dm-integrity.c |   1 -
 drivers/md/md.c           |   1 -
 include/linux/module.h    |   8 ++
 include/linux/notifier.h  |  26 ++---
 kernel/debug/debug_core.c |   1 -
 kernel/livepatch/core.c   |  29 ++++-
 kernel/module/main.c      |  34 +++---
 kernel/notifier.c         | 219 ++++++++++++++++++++++++++++++++------
 kernel/trace/ftrace.c     |  38 +++++++
 net/ipv4/nexthop.c        |   2 +-
 13 files changed, 290 insertions(+), 74 deletions(-)

-- 
2.43.0



* Re: [PATCH v7 0/8] Add support for handling PCIe M.2 Key E connectors in devicetree
From: Chen-Yu Tsai @ 2026-04-13  7:54 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: Rob Herring, Greg Kroah-Hartman, Jiri Slaby, Nathan Chancellor,
	Nicolas Schier, Hans de Goede, Ilpo Järvinen, Mark Pearson,
	Derek J. Clark, Manivannan Sadhasivam, Krzysztof Kozlowski,
	Conor Dooley, Marcel Holtmann, Luiz Augusto von Dentz,
	Bartosz Golaszewski, Andy Shevchenko, Bartosz Golaszewski,
	linux-serial, linux-kernel, linux-kbuild, platform-driver-x86,
	linux-pci, devicetree, linux-arm-msm, linux-bluetooth, linux-pm,
	Stephan Gerhold, Dmitry Baryshkov, linux-acpi, Hans de Goede,
	Bartosz Golaszewski
In-Reply-To: <20260326-pci-m2-e-v7-0-43324a7866e6@oss.qualcomm.com>

Hi,

On Thu, Mar 26, 2026 at 01:36:28PM +0530, Manivannan Sadhasivam wrote:
> Hi,
> 
> This series is the continuation of the series [1] that added the initial support
> for the PCIe M.2 connectors. This series extends it by adding support for Key E
> connectors. These connectors are used to connect the Wireless Connectivity
> devices such as WiFi, BT, NFC and GNSS devices to the host machine over
> interfaces such as PCIe/SDIO, USB/UART and NFC. This series adds support for
> connectors that expose PCIe interface for WiFi and UART interface for BT. Other
> interfaces are left for future improvements.

Thanks for working on this. I started playing with it now that it is in
-next. The PCIe part works fine. I'm looking into how to fit the pwrseq

A couple of questions:

- Given that this connector actually represents two devices, how do I
  say I want the BT part to be a wakeup source, but not the WiFi part?
  Does wakeup-source even work at this point?

- Are there plans to do the SDIO part?

- The matching done in the M.2 connector driver for pwrseq_get() seems a
  bit naive. It simply checks if the remote device in the OF graph is
  the same as the requesting device.

  I think this would run into issues with USB hubs. If I have a USB hub
  and two M.2 connectors, with both connectors connected to the same
  hub, pwrseq_get() is going to always return only one of the instances.
  This is because the USB hub has one device node with multiple OF graph
  ports.
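
To make the hub case concrete, here is a toy userspace model. It is not the driver's code, and the struct names and lookup helpers are made up: two connectors whose graph endpoints both resolve to the same hub node, on different ports. Matching on the node alone always finds the first connector; matching on node plus port tells them apart.

```c
#include <stddef.h>

struct node { const char *name; };	/* stand-in for a device node */

struct connector {
	const char *name;
	const struct node *remote;	/* node at the far end of the graph link */
	int remote_port;		/* which port of that node is linked */
};

/* Node-only match: the behaviour described above. */
static const struct connector *
match_by_node(const struct connector *tbl, size_t n, const struct node *dev)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].remote == dev)
			return &tbl[i];
	return NULL;
}

/* Node + port match: one possible way to disambiguate. */
static const struct connector *
match_by_node_and_port(const struct connector *tbl, size_t n,
		       const struct node *dev, int port)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].remote == dev && tbl[i].remote_port == port)
			return &tbl[i];
	return NULL;
}
```

With a table holding two connectors that both point at one hub node, match_by_node() can only ever return the first entry, which is the failure mode described for pwrseq_get().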


Thanks
ChenYu


> Serdev device support for BT
> ============================
> 
> Adding support for the PCIe interface was mostly straightforward and a lot
> similar to the previous Key M connector. But adding UART interface has proved to
> be tricky. This is mostly because of the fact UART is a non-discoverable bus,
> unlike PCIe which is discoverable. So this series relied on the PCI notifier to
> create the serdev device for UART/BT. This means the PCIe interface will be
> brought up first and after the PCIe device enumeration, the serdev device will
> be created by the pwrseq driver. This logic is necessary since the connector
> driver and DT node don't describe the device, but just the connector. So to make
> the connector interface Plug and Play, the connector driver uses the PCIe device
> ID to identify the card and creates the serdev device. This logic could be
> extended in the future to support more M.2 cards. Even if the M.2 card uses SDIO
> interface for connecting WLAN, a SDIO notifier could be added to create the
> serdev device.
> 
> Testing
> =======
> 
> This series, together with the devicetree changes [2] was tested on the
> Qualcomm X1e based Lenovo Thinkpad T14s Laptop which has the WCN7850 WLAN/BT
> 1620 LGA card connected over PCIe and UART.
> 
> Merge Strategy
> ==============
> 
> Due to the API dependency, both the serdev and pwrseq patches need to go through
> a single tree, maybe through pwrseq tree. So the serdev patches need Ack from
> Greg. But Bluetooth patch can be merged separately.
> 
> NOTE
> ====
> 
> This series is based on bluetooth-next/master to resolve the conflict with the
> Bluetooth patch. Other patches should apply cleanly on top of v7.0-rc1.
> 
> [1] https://lore.kernel.org/linux-pci/20260107-pci-m2-v5-0-8173d8a72641@oss.qualcomm.com
> [2] https://github.com/Mani-Sadhasivam/linux/commit/b50f8386900990eed3dce8d91c3b643fb0e8739d
> 
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> ---
> Changes in v7:
> - Dropped the LGA binding change due to vendor prefix concern. This will be
>   submitted later once I get clarity.
> - Fixed several issues in the cleanup path of the pwrseq-pci-m2 driver which
>   includes adding the .remove() callback.
> - Rebased on top of bluetooth-next/master to resolve conflict with bluetooth
>   patch.
> - Link to v6: https://lore.kernel.org/r/20260317-pci-m2-e-v6-0-9c898f108d3d@oss.qualcomm.com
> 
> Changes in v6:
> - Added a check to bail out if the serdev device was already added during notifier.
> - Collected tags
> - Link to v5: https://lore.kernel.org/r/20260224-pci-m2-e-v5-0-dd9b9501d33c@oss.qualcomm.com
> 
> Changes in v5:
> - Incorporated comments in the binding patch by using single endpoint per port,
>   reordering port nodes, adding missing properties and using a complete example.
> - Incorporated comments in the pwrseq patch (nothing major)
> - Fixed the build issue in patch 2
> - Collected tags
> - Rebased on top of 7.0-rc1
> - Link to v4: https://lore.kernel.org/r/20260112-pci-m2-e-v4-0-eff84d2c6d26@oss.qualcomm.com
> 
> Changes in v4:
> - Switched to dynamic OF node for serdev instead of swnode and dropped all
>   swnode related patches
> - Link to v3: https://lore.kernel.org/r/20260110-pci-m2-e-v3-0-4faee7d0d5ae@oss.qualcomm.com
> 
> Changes in v3:
> - Switched to swnode for the serdev device and dropped the custom
>   serdev_device_id related patches
> - Added new swnode APIs to match the swnode with existing of_device_id
> - Incorporated comments in the bindings patch
> - Dropped the UIM interface from binding since it is not clear how it should get
>   wired
> - Incorporated comments in the pwrseq driver patch
> - Split the pwrseq patch into two
> - Added the 1620 LGA compatible with Key E fallback based on Stephan's finding
> - Link to v2: https://lore.kernel.org/r/20251125-pci-m2-e-v2-0-32826de07cc5@oss.qualcomm.com
> 
> Changes in v2:
> - Used '-' for GPIO names in the binding and removed led*-gpios properties
> - Described the endpoint nodes for port@0 and port@1 nodes
> - Added the OF graph port to the serial binding
> - Fixed the hci_qca driver to return err if devm_pwrseq_get() fails
> - Incorporated various review comments in pwrseq driver
> - Collected Ack
> - Link to v1: https://lore.kernel.org/r/20251112-pci-m2-e-v1-0-97413d6bf824@oss.qualcomm.com
> 
> ---
> Manivannan Sadhasivam (8):
>       serdev: Convert to_serdev_*() helpers to macros and use container_of_const()
>       serdev: Add an API to find the serdev controller associated with the devicetree node
>       serdev: Do not return -ENODEV from of_serdev_register_devices() if external connector is used
>       dt-bindings: serial: Document the graph port
>       dt-bindings: connector: Add PCIe M.2 Mechanical Key E connector
>       Bluetooth: hci_qca: Add M.2 Bluetooth device support using pwrseq
>       power: sequencing: pcie-m2: Add support for PCIe M.2 Key E connectors
>       power: sequencing: pcie-m2: Create serdev device for WCN7850 bluetooth
> 
>  .../bindings/connector/pcie-m2-e-connector.yaml    | 184 +++++++++++
>  .../devicetree/bindings/serial/serial.yaml         |   3 +
>  MAINTAINERS                                        |   1 +
>  drivers/bluetooth/hci_qca.c                        |   9 +
>  drivers/power/sequencing/Kconfig                   |   3 +-
>  drivers/power/sequencing/pwrseq-pcie-m2.c          | 346 ++++++++++++++++++++-
>  drivers/tty/serdev/core.c                          |  28 +-
>  include/linux/serdev.h                             |  24 +-
>  8 files changed, 570 insertions(+), 28 deletions(-)
> ---
> base-commit: 559f264e403e4d58d56a17595c60a1de011c5e20
> change-id: 20251112-pci-m2-e-94695ac9d657
> 
> Best regards,
> --  
> Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> 


* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support
From: Viresh Kumar @ 2026-04-13  5:51 UTC (permalink / raw)
  To: Pierre Gondois
  Cc: Sumit Gupta, linux-tegra, linux-kernel, linux-doc, zhenglifeng1,
	treding, jonathanh, vsethi, ionela.voinescu, ksitaraman, sanjayc,
	zhanjie9, corbet, mochs, skhan, bbasu, rdunlap, linux-pm,
	mario.limonciello, rafael
In-Reply-To: <208360b1-36a5-419d-80f4-431914407f61@arm.com>

On 10-04-26, 15:47, Pierre Gondois wrote:
> I need to ping Viresh to check if this is still relevant.

I think it's okay to clear the min/max state in the kernel once and for all if
you think it is not done nicely. As discussed earlier, try that in a fresh
series which only does that part.

-- 
viresh


* RE: [PATCH] pmdomain: imx: Make IMX8M/IMX9 BLK_CTRL tristate
From: Zhipeng Wang @ 2026-04-13  5:32 UTC (permalink / raw)
  To: Frank Li
  Cc: ulfh@kernel.org, s.hauer@pengutronix.de, kernel@pengutronix.de,
	festevam@gmail.com, linux-pm@vger.kernel.org, imx@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Xuegang Liu, Jindong Yue
In-Reply-To: <adxiLObZ5gp2uQ4t@lizhi-Precision-Tower-5810>

> On Mon, Apr 13, 2026 at 03:08:20AM +0000, Zhipeng Wang wrote:
> > > On Fri, Apr 10, 2026 at 06:27:35PM +0900, Zhipeng Wang wrote:
> > > > Convert IMX8M_BLK_CTRL and IMX9_BLK_CTRL from bool to tristate to
> > > > allow building as loadable modules.
> > > >
> > > > Add prompt strings to make these options visible and configurable
> > > > in menuconfig, keeping them enabled by default on appropriate
> > > > platforms.
> > > >
> > > > Also remove the IMX_GPCV2_PM_DOMAINS dependency from
> > > > IMX9_BLK_CTRL since i.MX93 doesn't use GPCv2 power domains.
> > >
> > > Does it cause a build failure on GPCv2 platforms? Or was the
> > > previous dependency actually wrong?
> > >
> > > Frank
> >
> > Hi Frank,
> >
> > The previous dependency was actually wrong. Here's why:
> >
> > i.MX93 uses a different power domain architecture compared to i.MX8M
> > series:
> >
> > - i.MX8M uses GPCv2 (General Power Controller v2) for power domain
> >   management
> > - i.MX93 uses BLK_CTRL directly without GPCv2.
> >
> > The IMX_GPCV2_PM_DOMAINS dependency was likely copied from the
> > IMX8M_BLK_CTRL configuration when IMX9_BLK_CTRL was initially added,
> > but it was incorrect from the beginning since i.MX93 hardware doesn't
> > have GPCv2 at all.
> 
> Okay, please add such information into commit message. Some
> 
> Frank
Hi Frank,

Thanks! I've added the detailed explanation to the commit message and 
sent v2.

Best regards,
Zhipeng


* [PATCH v2] pmdomain: imx: Make IMX8M/IMX9 BLK_CTRL tristate
From: Zhipeng Wang @ 2026-04-13  5:30 UTC (permalink / raw)
  To: ulfh, Frank.Li, s.hauer
  Cc: kernel, festevam, linux-pm, imx, linux-arm-kernel, linux-kernel,
	xuegang.liu, jindong.yue

Convert IMX8M_BLK_CTRL and IMX9_BLK_CTRL from bool to tristate
to allow building as loadable modules.

Add prompt strings to make these options visible and configurable
in menuconfig, keeping them enabled by default on appropriate platforms.

Also remove the IMX_GPCV2_PM_DOMAINS dependency from IMX9_BLK_CTRL.
This dependency was incorrect from the beginning - i.MX93 uses a
different power domain architecture compared to i.MX8M series:

- i.MX8M uses GPCv2 (General Power Controller v2) for power domain
  management, hence IMX8M_BLK_CTRL correctly depends on it.

- i.MX93 uses BLK_CTRL directly without GPCv2. The hardware doesn't
  have GPCv2 at all.

Signed-off-by: Zhipeng Wang <zhipeng.wang_1@nxp.com>
---
 drivers/pmdomain/imx/Kconfig | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/pmdomain/imx/Kconfig b/drivers/pmdomain/imx/Kconfig
index 00203615c65e..9168d183b0c5 100644
--- a/drivers/pmdomain/imx/Kconfig
+++ b/drivers/pmdomain/imx/Kconfig
@@ -10,15 +10,18 @@ config IMX_GPCV2_PM_DOMAINS
 	default y if SOC_IMX7D
 
 config IMX8M_BLK_CTRL
-	bool
-	default SOC_IMX8M && IMX_GPCV2_PM_DOMAINS
+	tristate "i.MX8M BLK CTRL driver"
+	depends on SOC_IMX8M
+	depends on IMX_GPCV2_PM_DOMAINS
 	depends on PM_GENERIC_DOMAINS
 	depends on COMMON_CLK
+	default y
 
 config IMX9_BLK_CTRL
-	bool
-	default SOC_IMX9 && IMX_GPCV2_PM_DOMAINS
+	tristate "i.MX93 BLK CTRL driver"
+	depends on SOC_IMX9
 	depends on PM_GENERIC_DOMAINS
+	default y
 
 config IMX_SCU_PD
 	bool "IMX SCU Power Domain driver"
-- 
2.34.1



* Re: [PATCH v2] cpuidle: Deny idle entry when CPU already have IPI interrupt pending
From: Maulik Shah (mkshah) @ 2026-04-13  5:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Daniel Lezcano, Christian Loehle, Ulf Hansson, linux-pm,
	linux-kernel, linux-arm-msm
In-Reply-To: <CAJZ5v0jXwtBz3z4h3ehJTuaqYN4z7=wOv_LGnjQ4LQMP0TBKmA@mail.gmail.com>



On 4/6/2026 8:37 PM, Rafael J. Wysocki wrote:
> On Fri, Apr 3, 2026 at 6:08 AM Maulik Shah <maulik.shah@oss.qualcomm.com> wrote:
>>
>> A CPU can get an IPI from another CPU while it is executing
>> cpuidle_select() or is about to execute it. The selection does not account
>> for pending interrupts and may go on to enter the selected idle state only
>> to exit immediately.
>>
>> Example trace collected when there is cross CPU IPI.
>>
>>  [000] 154.892148: sched_waking: comm=sugov:4 pid=491 prio=-1 target_cpu=007
>>  [000] 154.892148: ipi_raise: target_mask=00000000,00000080 (Function call interrupts)
>>  [007] 154.892162: cpu_idle: state=2 cpu_id=7
>>  [007] 154.892208: cpu_idle: state=4294967295 cpu_id=7
>>  [007] 154.892211: irq_handler_entry: irq=2 name=IPI
>>  [007] 154.892211: ipi_entry: (Function call interrupts)
>>  [007] 154.892213: sched_wakeup: comm=sugov:4 pid=491 prio=-1 target_cpu=007
>>  [007] 154.892214: ipi_exit: (Function call interrupts)
>>
>> This impacts performance and the above count increments.
>>
>> commit ccde6525183c ("smp: Introduce a helper function to check for pending
>> IPIs") already introduced a helper function to check the pending IPIs and
>> it is used in pmdomain governor to deny the cluster level idle state when
>> there is a pending IPI on any of cluster CPUs.
>>
>> This, however, does not stop a CPU from entering a CPU-level idle state.
>> Use the same helper in cpuidle to deny idle entry when an IPI is already
>> pending.
>>
>> With this change, glmark2 [1] off-screen scores improve by 25% to 30% on
>> the Qualcomm lemans-evk board, which is arm64 based with two clusters of
>> 4 CPUs each.
>>
>> [1] https://github.com/glmark2/glmark2
>>
>> Signed-off-by: Maulik Shah <maulik.shah@oss.qualcomm.com>
>> ---
>> Changes in v2:
>> - Fix cpumask argument of cpus_peek_for_pending_ipi() to take single cpu
>> - Link to v1: https://lore.kernel.org/r/20260316-cpuidle_ipi-v1-1-d0ff6350f4e2@oss.qualcomm.com
>> ---
>>  drivers/cpuidle/cpuidle.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
>> index c7876e9e024f9076663063ad21cfc69343fdbbe7..c01e57df64ca5af8c28da3d971500b3f38306cdf 100644
>> --- a/drivers/cpuidle/cpuidle.c
>> +++ b/drivers/cpuidle/cpuidle.c
>> @@ -224,6 +224,9 @@ noinstr int cpuidle_enter_state(struct cpuidle_device *dev,
>>         bool broadcast = !!(target_state->flags & CPUIDLE_FLAG_TIMER_STOP);
>>         ktime_t time_start, time_end;
>>
>> +       if (cpus_peek_for_pending_ipi(cpumask_of(dev->cpu)))
>> +               return -EBUSY;
>> +
> 
> Why do you want to check it here and not in cpuidle_idle_call(), for example?

It can be moved into cpuidle_idle_call() too, just before call_cpuidle().
The intention is to check after cpuidle_select() is done.

> 
> In principle, this check may be useful in the default idle path too.

Yes, this check may be useful in default_idle_call() too.
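
Roughly, the flow being discussed would look like the sketch below: select a state, then bail out if an IPI is already pending, on both the governor path and the default idle path. This is a userspace model only; pending_ipi[], select_state(), enter_state(), idle_call() and default_idle() are placeholders, not the kernel functions.

```c
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS 8

static bool pending_ipi[NR_CPUS];	/* placeholder for per-CPU IPI state */
static int entered[NR_CPUS];		/* counts real idle entries */

/* Placeholder governor: always picks state 2. */
static int select_state(int cpu)
{
	(void)cpu;
	return 2;
}

/* Placeholder for the low-level idle entry. */
static int enter_state(int cpu, int state)
{
	entered[cpu]++;
	return state;
}

/* Governor path: check after selection, just before entering. */
static int idle_call(int cpu)
{
	int state = select_state(cpu);

	if (pending_ipi[cpu])
		return -EBUSY;	/* skip the enter/exit round trip */

	return enter_state(cpu, state);
}

/* Default idle path: same check, no governor involved. */
static int default_idle(int cpu)
{
	if (pending_ipi[cpu])
		return -EBUSY;

	return enter_state(cpu, 0);
}
```

Placing the check in the caller rather than in the state-entry function keeps it in one spot that both paths can share.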

Thanks,
Maulik

> 
>>         instrumentation_begin();
>>
>>         /*
>>
>> ---



* Re: [PATCH v3 0/2] Support BPF traversal of wakeup sources
From: Greg Kroah-Hartman @ 2026-04-13  4:47 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Samuel Wu, Rafael J. Wysocki, Pavel Machek, Len Brown,
	Danilo Krummrich, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa,
	Shuah Khan, Android Kernel Team, LKML, Linux Power Management,
	driver-core, bpf, open list:KERNEL SELFTEST FRAMEWORK
In-Reply-To: <CAADnVQLELWyv0kuv8wK3FJVFdJPbp-tD8d3p2pgCiVrcw_cVNA@mail.gmail.com>

On Sun, Apr 12, 2026 at 03:48:58PM -0700, Alexei Starovoitov wrote:
> On Fri, Apr 3, 2026 at 9:28 AM Samuel Wu <wusamuel@google.com> wrote:
> >
> > On Fri, Apr 3, 2026 at 3:04 AM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Thu, Apr 02, 2026 at 12:37:12PM -0700, Samuel Wu wrote:
> > > > On Wed, Apr 1, 2026 at 9:06 PM Greg Kroah-Hartman
> > > > <gregkh@linuxfoundation.org> wrote:
> > > > >
> > > > > On Wed, Apr 01, 2026 at 12:07:12PM -0700, Samuel Wu wrote:
> > > > > > On Wed, Apr 1, 2026 at 2:15 AM Greg Kroah-Hartman
> > > > > > <gregkh@linuxfoundation.org> wrote:
> > > > > > >
> > > > > > > On Tue, Mar 31, 2026 at 08:34:09AM -0700, Samuel Wu wrote:
> >
> > [ ... ]
> >
> > > > The data is fundamental for debugging and improving power at scale.
> > > > The original discussion and patch [1] provide more context of the
> > > > intent. To summarize the history, debugfs was unstable and insecure,
> > > > leading to the current sysfs implementation. However, sysfs has the
> > > > constraint of one attribute per node, requiring 10 sysfs accesses per
> > > > wakeup source.
> > >
> > > Ok, as the sysfs api doesn't work for your use case anymore, why do we
> > > need to keep it around at all?
> > >
> > > > That said, I completely agree that reading 1500+ sysfs files at once
> > > > is unreasonable. Perhaps the sysfs approach was manageable at the time
> > > > of [1], but moving forward we need a more scalable solution. This is
> > > > the main motivator and makes BPF the sane approach, as it improves
> > > > traversal in nearly every aspect (e.g. cycles, memory, simplicity,
> > > > scalability).
> > >
> > > I'm all for making this more scalable and work for your systems now, but
> > > consider if you could drop the sysfs api entirely, would you want this
> > > to be a different type of api entirely instead of having to plug through
> > > these using ebpf?
> >
> > Almost all use cases want all this data at once, so AFAICT BPF offers
> > the best performance for that. But of course, open to discussion if
> > there is an alternative API that matches BPF's performance for this
> > use case.
> >
> > I'm not opposed to dropping the sysfs approach, and I attempted to do
> > so in the v1 patch [1]. I'm not sure who else currently uses those
> > sysfs nodes, but a config flag should remove friction and could be a
> > stepping stone toward deprecation/removal.
> >
> > [1]: https://lore.kernel.org/all/20260320160055.4114055-3-wusamuel@google.com/
> 
> The patches make sense to me.
> 
> Patch 2 adds a bpf selftest and corresponding:
> +CONFIG_DIBS_LO=y
> +CONFIG_PM_WAKELOCKS=y
> 
> and almost green in BPF CI.
> 
> Except s390 that fails with:
> 
> Error: #682/1 wakeup_source/iterate_and_verify_times
> Error: #682/1 wakeup_source/iterate_and_verify_times
> libbpf: extern (func ksym) 'bpf_wakeup_sources_get_head': not found in
> kernel or module BTFs
> libbpf: failed to load BPF skeleton 'test_wakeup_source': -EINVAL
> test_wakeup_source:FAIL:skel_open_and_load unexpected error: -22
> 
> We can still land it into bpf-next for this merge window.
> 
> Greg,
> any objection ?

Yes, it is too late for 7.1-rc1, sorry; there will not have been any
time in linux-next to add it.  Let's revisit it after -rc1 is out, and
again, I feel that "walk all sysfs devices in bpf" is not the correct
solution for a system-wide snapshot interface you want to have,
especially as the one you previously added you feel is now obsolete.

thanks,

greg k-h


* Re: [PATCH v4 1/4] rust: devres: return reference in `devres::register`
From: Viresh Kumar @ 2026-04-13  3:58 UTC (permalink / raw)
  To: markus.probst
  Cc: Rob Herring, Greg Kroah-Hartman, Jiri Slaby, Miguel Ojeda,
	Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
	Alice Ryhl, Trevor Gross, Danilo Krummrich, Kari Argillander,
	Rafael J. Wysocki, Boqun Feng, David Airlie, Simona Vetter,
	linux-serial, linux-kernel, rust-for-linux, linux-pm, driver-core,
	dri-devel
In-Reply-To: <20260411-rust_serdev-v4-1-845e960c6627@posteo.de>

On 11-04-26, 17:10, Markus Probst via B4 Relay wrote:
> From: Markus Probst <markus.probst@posteo.de>
> 
> Return the reference to the initialized data in the `devres::register`
> function.
> 
> This is needed in a following commit (rust: add basic serial device bus
> abstractions).
> 
> Signed-off-by: Markus Probst <markus.probst@posteo.de>
> ---
>  rust/kernel/cpufreq.rs    |  3 ++-
>  rust/kernel/devres.rs     | 15 +++++++++++++--
>  rust/kernel/drm/driver.rs |  3 ++-
>  3 files changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/rust/kernel/cpufreq.rs b/rust/kernel/cpufreq.rs
> index f5adee48d40c..31bf7e685097 100644
> --- a/rust/kernel/cpufreq.rs
> +++ b/rust/kernel/cpufreq.rs
> @@ -1052,7 +1052,8 @@ pub fn new_foreign_owned(dev: &Device<Bound>) -> Result
>      where
>          T: 'static,
>      {
> -        devres::register(dev, Self::new()?, GFP_KERNEL)
> +        devres::register(dev, Self::new()?, GFP_KERNEL)?;
> +        Ok(())
>      }
>  }

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh


* Re: [PATCH] pmdomain: imx: Make IMX8M/IMX9 BLK_CTRL tristate
From: Frank Li @ 2026-04-13  3:25 UTC (permalink / raw)
  To: Zhipeng Wang
  Cc: ulfh@kernel.org, s.hauer@pengutronix.de, kernel@pengutronix.de,
	festevam@gmail.com, linux-pm@vger.kernel.org, imx@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Xuegang Liu, Jindong Yue
In-Reply-To: <AS8PR04MB84204966B358280D2A89F348EB242@AS8PR04MB8420.eurprd04.prod.outlook.com>

On Mon, Apr 13, 2026 at 03:08:20AM +0000, Zhipeng Wang wrote:
> > On Fri, Apr 10, 2026 at 06:27:35PM +0900, Zhipeng Wang wrote:
> > > Convert IMX8M_BLK_CTRL and IMX9_BLK_CTRL from bool to tristate to
> > > allow building as loadable modules.
> > >
> > > Add prompt strings to make these options visible and configurable in
> > > menuconfig, keeping them enabled by default on appropriate platforms.
> > >
> > > Also remove the IMX_GPCV2_PM_DOMAINS dependency from
> > > IMX9_BLK_CTRL since i.MX93 doesn't use GPCv2 power domains.
> >
> > Does it cause a build failure on GPCv2 platforms? Or was the previous
> > dependency actually wrong?
> >
> > Frank
>
> Hi Frank,
>
> The previous dependency was actually wrong. Here's why:
>
> i.MX93 uses a different power domain architecture compared to i.MX8M series:
>
> - i.MX8M uses GPCv2 (General Power Controller v2) for power domain management
> - i.MX93 uses BLK_CTRL directly without GPCv2.
>
> The IMX_GPCV2_PM_DOMAINS dependency was likely copied from the
> IMX8M_BLK_CTRL configuration when IMX9_BLK_CTRL was initially added, but
> it was incorrect from the beginning since i.MX93 hardware doesn't have
> GPCv2 at all.

Okay, please add such information into commit message. Some

Frank


* RE: [PATCH] pmdomain: imx: Make IMX8M/IMX9 BLK_CTRL tristate
From: Zhipeng Wang @ 2026-04-13  3:08 UTC (permalink / raw)
  To: Frank Li
  Cc: ulfh@kernel.org, s.hauer@pengutronix.de, kernel@pengutronix.de,
	festevam@gmail.com, linux-pm@vger.kernel.org, imx@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Xuegang Liu, Jindong Yue
In-Reply-To: <adrwO8HdZ7d85mEy@lizhi-Precision-Tower-5810>

> On Fri, Apr 10, 2026 at 06:27:35PM +0900, Zhipeng Wang wrote:
> > Convert IMX8M_BLK_CTRL and IMX9_BLK_CTRL from bool to tristate to
> > allow building as loadable modules.
> >
> > Add prompt strings to make these options visible and configurable in
> > menuconfig, keeping them enabled by default on appropriate platforms.
> >
> > Also remove the IMX_GPCV2_PM_DOMAINS dependency from
> IMX9_BLK_CTRL
> > since i.MX93 doesn't use GPCv2 power domains.
> 
> Does it cause a build failure on GPCv2 platforms? Or was the previous
> dependency actually wrong?
> 
> Frank

Hi Frank,

The previous dependency was actually wrong. Here's why:

i.MX93 uses a different power domain architecture compared to i.MX8M series:

- i.MX8M uses GPCv2 (General Power Controller v2) for power domain management
- i.MX93 uses BLK_CTRL directly without GPCv2.

The IMX_GPCV2_PM_DOMAINS dependency was likely copied from IMX8M_BLK_CTRL
configuration when IMX9_BLK_CTRL was initially added, but it was incorrect
from the beginning since i.MX93 hardware doesn't have GPCv2 at all.

So this patch corrects the Kconfig to match the actual hardware architecture.

Best regards,
Zhipeng
> >
> > Signed-off-by: Zhipeng Wang <zhipeng.wang_1@nxp.com>
> > ---
> >  drivers/pmdomain/imx/Kconfig | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/pmdomain/imx/Kconfig
> > b/drivers/pmdomain/imx/Kconfig index 00203615c65e..9168d183b0c5
> 100644
> > --- a/drivers/pmdomain/imx/Kconfig
> > +++ b/drivers/pmdomain/imx/Kconfig
> > @@ -10,15 +10,18 @@ config IMX_GPCV2_PM_DOMAINS
> >  	default y if SOC_IMX7D
> >
> >  config IMX8M_BLK_CTRL
> > -	bool
> > -	default SOC_IMX8M && IMX_GPCV2_PM_DOMAINS
> > +	tristate "i.MX8M BLK CTRL driver"
> > +	depends on SOC_IMX8M
> > +	depends on IMX_GPCV2_PM_DOMAINS
> >  	depends on PM_GENERIC_DOMAINS
> >  	depends on COMMON_CLK
> > +	default y
> >
> >  config IMX9_BLK_CTRL
> > -	bool
> > -	default SOC_IMX9 && IMX_GPCV2_PM_DOMAINS
> > +	tristate "i.MX93 BLK CTRL driver"
> > +	depends on SOC_IMX9
> >  	depends on PM_GENERIC_DOMAINS
> > +	default y
> >
> >  config IMX_SCU_PD
> >  	bool "IMX SCU Power Domain driver"
> > --
> > 2.34.1
> >

^ permalink raw reply

* Re: [PATCH v3 0/2] Support BPF traversal of wakeup sources
From: Alexei Starovoitov @ 2026-04-12 22:48 UTC (permalink / raw)
  To: Samuel Wu
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Pavel Machek, Len Brown,
	Danilo Krummrich, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa,
	Shuah Khan, Android Kernel Team, LKML, Linux Power Management,
	driver-core, bpf, open list:KERNEL SELFTEST FRAMEWORK
In-Reply-To: <CAG2KctqOLeYm_NFUjGdXQcBo1KhSZpXFQxQdM-DQBV4Mn0ahTw@mail.gmail.com>

On Fri, Apr 3, 2026 at 9:28 AM Samuel Wu <wusamuel@google.com> wrote:
>
> On Fri, Apr 3, 2026 at 3:04 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Apr 02, 2026 at 12:37:12PM -0700, Samuel Wu wrote:
> > > On Wed, Apr 1, 2026 at 9:06 PM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Wed, Apr 01, 2026 at 12:07:12PM -0700, Samuel Wu wrote:
> > > > > On Wed, Apr 1, 2026 at 2:15 AM Greg Kroah-Hartman
> > > > > <gregkh@linuxfoundation.org> wrote:
> > > > > >
> > > > > > On Tue, Mar 31, 2026 at 08:34:09AM -0700, Samuel Wu wrote:
>
> [ ... ]
>
> > > The data is fundamental for debugging and improving power at scale.
> > > The original discussion and patch [1] provide more context of the
> > > intent. To summarize the history, debugfs was unstable and insecure,
> > > leading to the current sysfs implementation. However, sysfs has the
> > > constraint of one attribute per node, requiring 10 sysfs accesses per
> > > wakeup source.
> >
> > Ok, as the sysfs api doesn't work for your use case anymore, why do we
> > need to keep it around at all?
> >
> > > That said, I completely agree that reading 1500+ sysfs files at once
> > > is unreasonable. Perhaps the sysfs approach was manageable at the time
> > > of [1], but moving forward we need a more scalable solution. This is
> > > the main motivator and makes BPF the sane approach, as it improves
> > > traversal in nearly every aspect (e.g. cycles, memory, simplicity,
> > > scalability).
> >
> > I'm all for making this more scalable and work for your systems now, but
> > consider if you could drop the sysfs api entirely, would you want this
> > to be a different type of api entirely instead of having to plug through
> > these using ebpf?
>
> Almost all use cases want all this data at once, so AFAICT BPF offers
> the best performance for that. But of course, open to discussion if
> there is an alternative API that matches BPF's performance for this
> use case.
>
> I'm not opposed to dropping the sysfs approach, and I attempted to do
> so in the v1 patch [1]. I'm not sure who else currently uses those
> sysfs nodes, but a config flag should remove friction and could be a
> stepping stone toward deprecation/removal.
>
> [1]: https://lore.kernel.org/all/20260320160055.4114055-3-wusamuel@google.com/

The patches make sense to me.

Patch 2 adds a bpf selftest and corresponding:
+CONFIG_DIBS_LO=y
+CONFIG_PM_WAKELOCKS=y

and it is almost green in BPF CI.

The exception is s390, which fails with:

Error: #682/1 wakeup_source/iterate_and_verify_times
Error: #682/1 wakeup_source/iterate_and_verify_times
libbpf: extern (func ksym) 'bpf_wakeup_sources_get_head': not found in
kernel or module BTFs
libbpf: failed to load BPF skeleton 'test_wakeup_source': -EINVAL
test_wakeup_source:FAIL:skel_open_and_load unexpected error: -22

We can still land it into bpf-next for this merge window.

Greg,
any objection ?

If not, Samuel, could you resend and address s390 issue?
I'm guessing wakeup sources just don't exist there.

^ permalink raw reply

* [PATCH v4 2/2] m68k: virt: Switch to qemu-virt-ctrl driver
From: Kuan-Wei Chiu @ 2026-04-12 21:19 UTC (permalink / raw)
  To: geert, sre
  Cc: jserv, eleanor15x, daniel, laurent, linux-kernel, linux-m68k,
	linux-pm, Kuan-Wei Chiu
In-Reply-To: <20260412211952.3564033-1-visitorckw@gmail.com>

Register the "qemu-virt-ctrl" platform device during board
initialization to utilize the new generic power/reset driver.

Consequently, remove the legacy reset and power-off implementations
specific to the virt machine. The platform's mach_reset callback is
updated to call do_kernel_restart(), bridging the legacy m68k reboot
path to the generic kernel restart handler framework for this machine.

To prevent any regressions in reboot or power-off functionality when
the driver is not built-in, explicitly select POWER_RESET and
POWER_RESET_QEMU_VIRT_CTRL for the VIRT machine in Kconfig.machine.

Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
 arch/m68k/Kconfig.machine |  2 ++
 arch/m68k/virt/config.c   | 42 +--------------------------------------
 arch/m68k/virt/platform.c | 20 ++++++++++++++++---
 3 files changed, 20 insertions(+), 44 deletions(-)

diff --git a/arch/m68k/Kconfig.machine b/arch/m68k/Kconfig.machine
index de39f23b180e..624e6b27f394 100644
--- a/arch/m68k/Kconfig.machine
+++ b/arch/m68k/Kconfig.machine
@@ -133,6 +133,8 @@ config VIRT
 	select GOLDFISH_TIMER
 	select GOLDFISH_TTY
 	select M68040
+	select POWER_RESET
+	select POWER_RESET_QEMU_VIRT_CTRL
 	select RTC_CLASS
 	select RTC_DRV_GOLDFISH
 	select TTY
diff --git a/arch/m68k/virt/config.c b/arch/m68k/virt/config.c
index 632ba200ad42..b338e2a8da6a 100644
--- a/arch/m68k/virt/config.c
+++ b/arch/m68k/virt/config.c
@@ -13,18 +13,6 @@
 
 struct virt_booter_data virt_bi_data;
 
-#define VIRT_CTRL_REG_FEATURES	0x00
-#define VIRT_CTRL_REG_CMD	0x04
-
-static struct resource ctrlres;
-
-enum {
-	CMD_NOOP,
-	CMD_RESET,
-	CMD_HALT,
-	CMD_PANIC,
-};
-
 static void virt_get_model(char *str)
 {
 	/* str is 80 characters long */
@@ -33,25 +21,9 @@ static void virt_get_model(char *str)
 		(u8)(virt_bi_data.qemu_version >> 16),
 		(u8)(virt_bi_data.qemu_version >> 8));
 }
-
-static void virt_halt(void)
-{
-	void __iomem *base = (void __iomem *)virt_bi_data.ctrl.mmio;
-
-	iowrite32be(CMD_HALT, base + VIRT_CTRL_REG_CMD);
-	local_irq_disable();
-	while (1)
-		;
-}
-
 static void virt_reset(void)
 {
-	void __iomem *base = (void __iomem *)virt_bi_data.ctrl.mmio;
-
-	iowrite32be(CMD_RESET, base + VIRT_CTRL_REG_CMD);
-	local_irq_disable();
-	while (1)
-		;
+	do_kernel_restart(NULL);
 }
 
 /*
@@ -113,20 +85,8 @@ void __init config_virt(void)
 		 virt_bi_data.tty.mmio);
 	setup_earlycon(earlycon);
 
-	ctrlres = (struct resource)
-		   DEFINE_RES_MEM_NAMED(virt_bi_data.ctrl.mmio, 0x100,
-					"virtctrl");
-
-	if (request_resource(&iomem_resource, &ctrlres)) {
-		pr_err("Cannot allocate virt controller resource\n");
-		return;
-	}
-
 	mach_init_IRQ = virt_init_IRQ;
 	mach_sched_init = virt_sched_init;
 	mach_get_model = virt_get_model;
 	mach_reset = virt_reset;
-	mach_halt = virt_halt;
-
-	register_platform_power_off(virt_halt);
 }
diff --git a/arch/m68k/virt/platform.c b/arch/m68k/virt/platform.c
index 1560c4140ab9..764f556b4b32 100644
--- a/arch/m68k/virt/platform.c
+++ b/arch/m68k/virt/platform.c
@@ -30,7 +30,10 @@ static int __init virt_platform_init(void)
 		DEFINE_RES_MEM(virt_bi_data.rtc.mmio + 0x1000, 0x1000),
 		DEFINE_RES_IRQ(virt_bi_data.rtc.irq + 1),
 	};
-	struct platform_device *pdev1, *pdev2;
+	const struct resource virt_ctrl_res[] = {
+		DEFINE_RES_MEM(virt_bi_data.ctrl.mmio, 0x100),
+	};
+	struct platform_device *pdev1, *pdev2, *pdev3;
 	struct platform_device *pdevs[VIRTIO_BUS_NB];
 	unsigned int i;
 	int ret = 0;
@@ -57,19 +60,30 @@ static int __init virt_platform_init(void)
 		goto err_unregister_tty;
 	}
 
+	pdev3 = platform_device_register_simple("qemu-virt-ctrl",
+						PLATFORM_DEVID_NONE,
+						virt_ctrl_res,
+						ARRAY_SIZE(virt_ctrl_res));
+	if (IS_ERR(pdev3)) {
+		ret = PTR_ERR(pdev3);
+		goto err_unregister_rtc;
+	}
+
 	for (i = 0; i < VIRTIO_BUS_NB; i++) {
 		pdevs[i] = virt_virtio_init(i);
 		if (IS_ERR(pdevs[i])) {
 			ret = PTR_ERR(pdevs[i]);
-			goto err_unregister_rtc_virtio;
+			goto err_unregister_virtio;
 		}
 	}
 
 	return 0;
 
-err_unregister_rtc_virtio:
+err_unregister_virtio:
 	while (i > 0)
 		platform_device_unregister(pdevs[--i]);
+	platform_device_unregister(pdev3);
+err_unregister_rtc:
 	platform_device_unregister(pdev2);
 err_unregister_tty:
 	platform_device_unregister(pdev1);
-- 
2.53.0.1213.gd9a14994de-goog


^ permalink raw reply related

* [PATCH v4 1/2] power: reset: Add QEMU virt-ctrl driver
From: Kuan-Wei Chiu @ 2026-04-12 21:19 UTC (permalink / raw)
  To: geert, sre
  Cc: jserv, eleanor15x, daniel, laurent, linux-kernel, linux-m68k,
	linux-pm, Kuan-Wei Chiu, Sebastian Reichel
In-Reply-To: <20260412211952.3564033-1-visitorckw@gmail.com>

Add a new driver for the 'virt-ctrl' device found on QEMU virt machines
(e.g. m68k). This device provides a simple interface for system reset
and power off [1].

This driver utilizes the modern system-off API to register callbacks
for both system restart and power off. It also registers a reboot
notifier to catch SYS_HALT events, ensuring that LINUX_REBOOT_CMD_HALT
is properly handled. It is designed to be generic and can be reused by
other architectures utilizing this QEMU device.

Link: https://gitlab.com/qemu-project/qemu/-/blob/v10.2.0/hw/misc/virt_ctrl.c [1]
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com>
---
Changes in v4:
- Fix sparse warning caught by kernel test robot.
- Pass the struct qemu_virt_ctrl context to sys-off handlers instead of
  the __iomem pointer.

 MAINTAINERS                          |   6 ++
 drivers/power/reset/Kconfig          |  10 +++
 drivers/power/reset/Makefile         |   1 +
 drivers/power/reset/qemu-virt-ctrl.c | 122 +++++++++++++++++++++++++++
 4 files changed, 139 insertions(+)
 create mode 100644 drivers/power/reset/qemu-virt-ctrl.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 55af015174a5..aa9eb8540637 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21441,6 +21441,12 @@ S:	Maintained
 F:	drivers/firmware/qemu_fw_cfg.c
 F:	include/uapi/linux/qemu_fw_cfg.h
 
+QEMU VIRT MACHINE SYSTEM CONTROLLER DRIVER
+M:	Kuan-Wei Chiu <visitorckw@gmail.com>
+L:	linux-pm@vger.kernel.org
+S:	Maintained
+F:	drivers/power/reset/qemu-virt-ctrl.c
+
 QLOGIC QL41xxx FCOE DRIVER
 M:	Saurav Kashyap <skashyap@marvell.com>
 M:	Javed Hasan <jhasan@marvell.com>
diff --git a/drivers/power/reset/Kconfig b/drivers/power/reset/Kconfig
index f6c1bcbb57de..99e3334726a5 100644
--- a/drivers/power/reset/Kconfig
+++ b/drivers/power/reset/Kconfig
@@ -354,4 +354,14 @@ config POWER_MLXBF
 	help
 	  This driver supports reset or low power mode handling for Mellanox BlueField.
 
+config POWER_RESET_QEMU_VIRT_CTRL
+	tristate "QEMU Virt Machine System Controller"
+	depends on HAS_IOMEM
+	help
+	  This driver supports the system reset and power off functionality
+	  provided by the QEMU 'virt-ctrl' device.
+
+	  Say Y here if you are running Linux on a QEMU virtual machine that
+	  provides this controller, such as the m68k virt machine.
+
 endif
diff --git a/drivers/power/reset/Makefile b/drivers/power/reset/Makefile
index 0e4ae6f6b5c5..d7ae97241a83 100644
--- a/drivers/power/reset/Makefile
+++ b/drivers/power/reset/Makefile
@@ -41,3 +41,4 @@ obj-$(CONFIG_SYSCON_REBOOT_MODE) += syscon-reboot-mode.o
 obj-$(CONFIG_POWER_RESET_SC27XX) += sc27xx-poweroff.o
 obj-$(CONFIG_NVMEM_REBOOT_MODE) += nvmem-reboot-mode.o
 obj-$(CONFIG_POWER_MLXBF) += pwr-mlxbf.o
+obj-$(CONFIG_POWER_RESET_QEMU_VIRT_CTRL) += qemu-virt-ctrl.o
diff --git a/drivers/power/reset/qemu-virt-ctrl.c b/drivers/power/reset/qemu-virt-ctrl.c
new file mode 100644
index 000000000000..01409dfe2265
--- /dev/null
+++ b/drivers/power/reset/qemu-virt-ctrl.c
@@ -0,0 +1,122 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * QEMU Virt Machine System Controller Driver
+ *
+ * Copyright (C) 2026 Kuan-Wei Chiu <visitorckw@gmail.com>
+ */
+
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/mod_devicetable.h>
+#include <linux/platform_device.h>
+#include <linux/reboot.h>
+
+/* Registers */
+#define VIRT_CTRL_REG_FEATURES	0x00
+#define VIRT_CTRL_REG_CMD	0x04
+
+/* Commands */
+#define CMD_NOOP	0
+#define CMD_RESET	1
+#define CMD_HALT	2
+#define CMD_PANIC	3
+
+struct qemu_virt_ctrl {
+	void __iomem *base;
+	struct notifier_block reboot_nb;
+};
+
+static inline void virt_ctrl_write32(u32 val, void __iomem *addr)
+{
+	if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
+		iowrite32be(val, addr);
+	else
+		iowrite32(val, addr);
+}
+
+static int qemu_virt_ctrl_power_off(struct sys_off_data *data)
+{
+	struct qemu_virt_ctrl *ctrl = data->cb_data;
+
+	virt_ctrl_write32(CMD_HALT, ctrl->base + VIRT_CTRL_REG_CMD);
+
+	return NOTIFY_DONE;
+}
+
+static int qemu_virt_ctrl_restart(struct sys_off_data *data)
+{
+	struct qemu_virt_ctrl *ctrl = data->cb_data;
+
+	virt_ctrl_write32(CMD_RESET, ctrl->base + VIRT_CTRL_REG_CMD);
+
+	return NOTIFY_DONE;
+}
+
+static int qemu_virt_ctrl_reboot_notify(struct notifier_block *nb,
+					unsigned long action, void *data)
+{
+	struct qemu_virt_ctrl *ctrl = container_of(nb, struct qemu_virt_ctrl, reboot_nb);
+
+	if (action == SYS_HALT)
+		virt_ctrl_write32(CMD_HALT, ctrl->base + VIRT_CTRL_REG_CMD);
+
+	return NOTIFY_DONE;
+}
+
+static int qemu_virt_ctrl_probe(struct platform_device *pdev)
+{
+	struct qemu_virt_ctrl *ctrl;
+	int ret;
+
+	ctrl = devm_kzalloc(&pdev->dev, sizeof(*ctrl), GFP_KERNEL);
+	if (!ctrl)
+		return -ENOMEM;
+
+	ctrl->base = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(ctrl->base))
+		return PTR_ERR(ctrl->base);
+
+	ret = devm_register_sys_off_handler(&pdev->dev,
+					    SYS_OFF_MODE_RESTART,
+					    SYS_OFF_PRIO_DEFAULT,
+					    qemu_virt_ctrl_restart,
+					    ctrl);
+	if (ret)
+		return dev_err_probe(&pdev->dev, ret,
+				     "cannot register restart handler\n");
+
+	ret = devm_register_sys_off_handler(&pdev->dev,
+					    SYS_OFF_MODE_POWER_OFF,
+					    SYS_OFF_PRIO_DEFAULT,
+					    qemu_virt_ctrl_power_off,
+					    ctrl);
+	if (ret)
+		return dev_err_probe(&pdev->dev, ret,
+				     "cannot register power-off handler\n");
+
+	ctrl->reboot_nb.notifier_call = qemu_virt_ctrl_reboot_notify;
+	ret = devm_register_reboot_notifier(&pdev->dev, &ctrl->reboot_nb);
+	if (ret)
+		return dev_err_probe(&pdev->dev, ret, "cannot register reboot notifier\n");
+
+	return 0;
+}
+
+static const struct platform_device_id qemu_virt_ctrl_id[] = {
+	{ "qemu-virt-ctrl", 0 },
+	{ }
+};
+MODULE_DEVICE_TABLE(platform, qemu_virt_ctrl_id);
+
+static struct platform_driver qemu_virt_ctrl_driver = {
+	.probe = qemu_virt_ctrl_probe,
+	.driver = {
+		.name = "qemu-virt-ctrl",
+	},
+	.id_table = qemu_virt_ctrl_id,
+};
+module_platform_driver(qemu_virt_ctrl_driver);
+
+MODULE_AUTHOR("Kuan-Wei Chiu <visitorckw@gmail.com>");
+MODULE_DESCRIPTION("QEMU Virt Machine System Controller Driver");
+MODULE_LICENSE("GPL");
-- 
2.53.0.1213.gd9a14994de-goog


^ permalink raw reply related

* [PATCH v4 0/2] Add QEMU virt-ctrl driver and update m68k virt
From: Kuan-Wei Chiu @ 2026-04-12 21:19 UTC (permalink / raw)
  To: geert, sre
  Cc: jserv, eleanor15x, daniel, laurent, linux-kernel, linux-m68k,
	linux-pm, Kuan-Wei Chiu

Introduce a generic platform driver for the QEMU 'virt-ctrl' device [1]
and transition the m68k 'virt' machine to use it, replacing
architecture-specific hooks.

The new driver ('qemu-virt-ctrl') registers sys-off handlers for restart
and power-off, plus a reboot notifier that handles halt.

On the m68k side, the platform initialization is updated to register
the 'qemu-virt-ctrl' platform device. Additionally, the 'mach_reset'
hook is bridged to 'do_kernel_restart()' to ensure the kernel's restart
handler chain is correctly invoked.

Verified on QEMU m68k virt. Both system reset and power-off were
confirmed functional by invoking 'reboot(LINUX_REBOOT_CMD_RESTART)',
'reboot(LINUX_REBOOT_CMD_POWER_OFF)', and
'reboot(LINUX_REBOOT_CMD_HALT)' from userspace.
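
For reference, a minimal sketch of a userspace helper that can exercise
these three paths. This is not the exact test program used for the
verification above; the function names and the "restart"/"poweroff"/"halt"
strings are hypothetical, and the raw reboot(2) syscall is used here as an
assumption instead of the glibc wrapper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/reboot.h>

/*
 * Map a (hypothetical) command name to its LINUX_REBOOT_CMD_* value.
 * Returns false and leaves *cmd untouched if the name is unknown.
 */
bool reboot_cmd_from_name(const char *name, unsigned int *cmd)
{
	assert(name != NULL && cmd != NULL);

	if (strcmp(name, "restart") == 0)
		*cmd = LINUX_REBOOT_CMD_RESTART;
	else if (strcmp(name, "poweroff") == 0)
		*cmd = LINUX_REBOOT_CMD_POWER_OFF;
	else if (strcmp(name, "halt") == 0)
		*cmd = LINUX_REBOOT_CMD_HALT;
	else
		return false;

	return true;
}

/*
 * Issue the raw reboot(2) syscall. Requires CAP_SYS_BOOT; on failure it
 * returns -1 with errno set.
 */
long issue_reboot(unsigned int cmd)
{
	sync();	/* flush dirty data before the machine goes away */
	return syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
		       cmd, NULL);
}
```

A caller would look up the command with reboot_cmd_from_name() and pass
the result to issue_reboot(); only run that inside a disposable guest.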

Link: https://gitlab.com/qemu-project/qemu/-/blob/v10.2.0/hw/misc/virt_ctrl.c [1]
---
Changes in v4:
- Fix sparse warning caught by kernel test robot.
- Pass the struct qemu_virt_ctrl context to sys-off handlers instead of
  the __iomem pointer.

Changes in v3:
- Add a reboot notifier in the driver to handle LINUX_REBOOT_CMD_HALT.
- Handle native endianness in the driver instead of hardcoding
  big-endian I/O writes.
- Select POWER_RESET and POWER_RESET_QEMU_VIRT_CTRL in m68k
  Kconfig.machine.

Changes in v2:
- Use devm_register_sys_off_handler() instead of register_restart_handler()
  and global pm_power_off.
- Switch Kconfig to tristate to support modular build.
- Add .id_table to platform_driver and use MODULE_DEVICE_TABLE() to correct
  module auto-loading.

v2: https://lore.kernel.org/lkml/20260203170824.2968045-1-visitorckw@gmail.com/
v1: https://lore.kernel.org/lkml/20260112182258.1851769-1-visitorckw@gmail.com/

Kuan-Wei Chiu (2):
  power: reset: Add QEMU virt-ctrl driver
  m68k: virt: Switch to qemu-virt-ctrl driver

 MAINTAINERS                          |   6 ++
 arch/m68k/Kconfig.machine            |   2 +
 arch/m68k/virt/config.c              |  42 +--------
 arch/m68k/virt/platform.c            |  20 ++++-
 drivers/power/reset/Kconfig          |  10 +++
 drivers/power/reset/Makefile         |   1 +
 drivers/power/reset/qemu-virt-ctrl.c | 122 +++++++++++++++++++++++++++
 7 files changed, 159 insertions(+), 44 deletions(-)
 create mode 100644 drivers/power/reset/qemu-virt-ctrl.c

-- 
2.53.0.1213.gd9a14994de-goog


^ permalink raw reply

* Re: [PATCH V5 0/5] i3c: mipi-i3c-hci-pci: Enable IBI while runtime suspended for Intel controllers
From: Alexandre Belloni @ 2026-04-12 14:34 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Frank.Li, rafael, linux-i3c, linux-kernel, linux-pm
In-Reply-To: <20260306085338.62955-1-adrian.hunter@intel.com>

On Fri, 06 Mar 2026 10:53:33 +0200, Adrian Hunter wrote:
> 	Please note all patches have Frank's Rev'd-by.
> 
> 
> Changes in V5:
> 
> 	Re-base on top of v7.0 fixes series:
> 		https://lore.kernel.org/linux-i3c/20260306072451.11131-1-adrian.hunter@intel.com/T
> 
> [...]

Applied, thanks!

[1/5] i3c: mipi-i3c-hci-pci: Set d3hot_delay to 0 for Intel controllers
      https://git.kernel.org/i3c/c/815b4448198f
[2/5] i3c: mipi-i3c-hci: Add quirk to allow IBI while runtime suspended
      https://git.kernel.org/i3c/c/5fe77a6d8d5d
[3/5] i3c: mipi-i3c-hci: Allow parent to manage runtime PM
      https://git.kernel.org/i3c/c/82851828a8b1
[4/5] i3c: mipi-i3c-hci-pci: Add optional ability to manage child runtime PM
      https://git.kernel.org/i3c/c/e813e7e30086
[5/5] i3c: mipi-i3c-hci-pci: Enable IBI while runtime suspended for Intel controllers
      https://git.kernel.org/i3c/c/e7a718627c6f

Best regards,

-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply

* Re: [chanwoo:devfreq-next 5/5] drivers/devfreq/devfreq.c:1946:37: error: redefinition of 'devfreq_group'
From: Jori Koolstra @ 2026-04-12 14:09 UTC (permalink / raw)
  To: Chanwoo Choi, kernel test robot
  Cc: gregkh@linuxfoundation.org, llvm, oe-kbuild-all, linux-pm,
	Chanwoo Choi
In-Reply-To: <890652743.852439.1775062451369@kpc.webmail.kpnmail.nl>

Hi Chanwoo,

I replied to your email a bit ago. Just sending this as a reminder again:

> Op 21-03-2026 16:11 CET schreef Chanwoo Choi <chanwoo@kernel.org>:
> 
>  
> Hi Jori,
> 
> There are build errors in your patches.
> Did you test it? I hope to fix and test the patch before posting.
> 

Yes, and I see no build issues. Could it be that the test bot did not
apply the patch correctly?

Looking at the build error it seems that ATTRIBUTE_GROUPS(devfreq) is duplicated,
not moved.

> 
> 
> -- 
> Best Regards,
> Chanwoo Choi
> Samsung Electronics

Thanks,
Jori.

^ permalink raw reply

* Re: [PATCH 3/8] firmware: meson: sm: Add thermal calibration SMC call
From: Krzysztof Kozlowski @ 2026-04-12 10:47 UTC (permalink / raw)
  To: Ronald Claveau, Guillaume La Roque, Rafael J. Wysocki,
	Daniel Lezcano, Zhang Rui, Lukasz Luba, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Neil Armstrong, Kevin Hilman,
	Jerome Brunet, Martin Blumenstingl
  Cc: linux-pm, linux-amlogic, devicetree, linux-kernel,
	linux-arm-kernel
In-Reply-To: <20260410-add-thermal-t7-vim4-v1-3-19f2b8da74d7@aliel.fr>

On 10/04/2026 18:48, Ronald Claveau wrote:
> @@ -245,6 +246,14 @@ struct meson_sm_firmware *meson_sm_get(struct device_node *sm_node)
>  }
>  EXPORT_SYMBOL_GPL(meson_sm_get);
>  
> +int meson_sm_get_thermal_calib(struct meson_sm_firmware *fw, u32 *trim_info,

Exported functions should have kerneldoc.

> +			       u32 tsensor_id)
> +{
> +	return meson_sm_call(fw, SM_THERMAL_CALIB_READ, trim_info, tsensor_id,
> +			     0, 0, 0, 0);

Best regards,
Krzysztof

^ permalink raw reply

* Re: [PATCH 1/8] dt-bindings: thermal: amlogic: Add support for T7
From: Krzysztof Kozlowski @ 2026-04-12  9:58 UTC (permalink / raw)
  To: Ronald Claveau
  Cc: Guillaume La Roque, Rafael J. Wysocki, Daniel Lezcano, Zhang Rui,
	Lukasz Luba, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Neil Armstrong, Kevin Hilman, Jerome Brunet, Martin Blumenstingl,
	linux-pm, linux-amlogic, devicetree, linux-kernel,
	linux-arm-kernel
In-Reply-To: <20260410-add-thermal-t7-vim4-v1-1-19f2b8da74d7@aliel.fr>

On Fri, Apr 10, 2026 at 06:48:02PM +0200, Ronald Claveau wrote:
> Add the amlogic,t7-thermal compatible for the Amlogic T7 thermal sensor.
> 
> Unlike existing variants which use a phandle to the ao-secure syscon,
> the T7 relies on a secure monitor interface described by a phandle and
> a sensor index argument.
> 
> Introduce the amlogic,secure-monitor property as a phandle-array and
> make amlogic,ao-secure or amlogic,secure-monitor conditionally required
> depending on the compatible.
> 
> Signed-off-by: Ronald Claveau <linux-kernel-dev@aliel.fr>
> ---
>  .../bindings/thermal/amlogic,thermal.yaml          | 40 +++++++++++++++++++++-
>  1 file changed, 39 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml b/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml
> index 70b273271754b..85ee73c6e1161 100644
> --- a/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml
> +++ b/Documentation/devicetree/bindings/thermal/amlogic,thermal.yaml
> @@ -22,6 +22,7 @@ properties:
>                - amlogic,g12a-ddr-thermal
>            - const: amlogic,g12a-thermal
>        - const: amlogic,a1-cpu-thermal
> +      - const: amlogic,t7-thermal

So these two entries are enum.

>  
>    reg:
>      maxItems: 1
> @@ -42,12 +43,40 @@ properties:
>    '#thermal-sensor-cells':
>      const: 0
>  
> +  amlogic,secure-monitor:
> +    description: phandle to the secure monitor
> +    $ref: /schemas/types.yaml#/definitions/phandle-array
> +    items:
> +      - items:
> +          - description: phandle to the secure monitor
> +          - description: sensor index

What exactly is this sensor index needed for? The commit msg explained
nothing to me; instead it repeated what you did. That's pointless, explain
why you did it.

> +
>  required:
>    - compatible
>    - reg
>    - interrupts
>    - clocks
> -  - amlogic,ao-secure
> +
> +allOf:
> +  - if:
> +      properties:
> +        compatible:
> +          contains:
> +            enum:
> +              - amlogic,g12a-cpu-thermal
> +              - amlogic,g12a-ddr-thermal

Drop both, you need only fallback.

> +              - amlogic,a1-cpu-thermal

And list is sorted alphabetically.

> +    then:
> +      required:
> +        - amlogic,ao-secure

Best regards,
Krzysztof


^ permalink raw reply

* Re: [PATCH 1/2] thermal/drivers/imx: Fix thermal zone leak on probe error path
From: Daniel Lezcano @ 2026-04-12  9:35 UTC (permalink / raw)
  To: Frank Li, Felix Gu
  Cc: Rafael J. Wysocki, Daniel Lezcano, Zhang Rui, Lukasz Luba,
	Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam,
	Oleksij Rempel, linux-pm, imx, linux-arm-kernel, linux-kernel
In-Reply-To: <adruT5fgNSZ_VOLr@lizhi-Precision-Tower-5810>

On 4/12/26 02:58, Frank Li wrote:
> On Sun, Apr 12, 2026 at 03:03:03AM +0800, Felix Gu wrote:
>> If pm_runtime_resume_and_get() fails after the thermal zone has been
>> registered, the probe error path cleans up runtime PM but skips
>> thermal_zone_device_unregister(), leaking the thermal zone device.
>>
>> Move thermal_zone_device_unregister() into disable_runtime_pm so all
> 
> Use devm_thermal_of_zone_register() to fix this problem

+1

^ permalink raw reply

* Re: [PATCH v5 00/21] Virtual Swap Space
From: Nhat Pham @ 2026-04-12  1:40 UTC (permalink / raw)
  To: YoungJun Park
  Cc: kasong, Liam.Howlett, akpm, apopple, axelrasmussen, baohua,
	baolin.wang, bhe, byungchul, cgroups, chengming.zhou, chrisl,
	corbet, david, dev.jain, gourry, hannes, hughd, jannh,
	joshua.hahnjy, lance.yang, lenb, linux-doc, linux-kernel,
	linux-mm, linux-pm, lorenzo.stoakes, matthew.brost, mhocko,
	muchun.song, npache, pavel, peterx, peterz, pfalcato, rafael,
	rakie.kim, roman.gushchin, rppt, ryan.roberts, shakeel.butt,
	shikemeng, surenb, tglx, vbabka, weixugc, ying.huang, yosry.ahmed,
	yuanchu, zhengqi.arch, ziy, kernel-team, riel
In-Reply-To: <acQrQYHJgqof0yx4@yjaykim-PowerEdge-T330>

On Wed, Mar 25, 2026 at 11:36 AM YoungJun Park <youngjun.park@lge.com> wrote:
>
> On Fri, Mar 20, 2026 at 12:27:14PM -0700, Nhat Pham wrote:
> >
> > This patch series is based on 6.19. There are a couple more
> > swap-related changes in mainline that I would need to coordinate
> > with, but I still want to send this out as an update for the
> > regressions reported by Kairui Song in [15]. It's probably easier
> > to just build this thing rather than dig through that series of
> > emails to get the fix patch :)
>
> Hi Nhat,
>
> I wanted to fully understand the patches before asking questions,
> but reviewing everything takes time, and I didn't want to miss the
> timing. So let me share some thoughts and ask about your direction.
>
> These are the perspectives I'm coming from:
>
> Pros:
> - The architecture is very clean.
> - Zero entries currently consume swap space, which can prevent
>   actual swap usage in some cases.

Yeah not just zero entries. Compressed entries consuming a static
space also makes no sense to me.

> - It resolves zswap's dependency on swap device size.
> - And so on.
>
> Cons:
> - An additional virtual allocation step is introduced per every swap.
> - not easy to merge (change swap infrastructure totally?)
>
> To address the cons, I think if we can demonstrate that the
> benefits always outweigh the costs, it could fully replace the
> existing mechanism. However, if this can be applied selectively,
> we get only the pros without the cons.
>
> 1. Modularization
>
> You removed CONFIG_* and went with a unified approach. I recall
> you were also considering a module-based structure at some point.
> What are your thoughts on that direction?
>

The CONFIG-based approach was a huge mess. It makes me not want to
look at the code, and I'm the author :)

> If we take that approach, we could extend the recent swap ops
> patchset (https://lore.kernel.org/linux-mm/20260302104016.163542-1-bhe@redhat.com/)
> as follows:
> - Make vswap a swap module
> - Have cluster allocation functions reside in swapops
> - Enable vswap through swapon

Hmmmmm.


>
> I think this could result in a similar structure. An additional
> benefit would be that it enables various configurations:
>
> - vswap + regular swap together
> - vswap only
> - And other combinations
>
> And merging is not that hard; it is not a total change of the swap infrastructure.
>
> But the swapoff speedup might disappear? I don't think that is critical.

Yeah that's not critical. It's a cool beans optimization but nobody
does swapoff and expects it to be fast ;)

(It is a lot cleaner tho but again not my first priority).

>
> 2. Flash-friendly swap integration (for my use case)
>
> I've been thinking about the flash-friendly swap concept that
> I mentioned before and recently proposed:
> (https://lore.kernel.org/linux-mm/aZW0voL4MmnMQlaR@yjaykim-PowerEdge-T330/)
>
> One of its core functions requires buffering RAM-swapped pages
> and writing them sequentially at an appropriate time -- not
> immediately, but in proper block-sized units, sequentially.
>
> This means allocated offsets must essentially be virtual, and
> physical offsets need to be managed separately at the actual
> write time.
>
> If we integrate this into the current vswap, we would either
> need vswap itself to handle the sequential writes (bypassing
> the physical device and receiving pages directly), or swapon
> a swap device and have vswap obtain physical offsets from it.
> But since those offsets cannot be used directly (due to
> buffering and sequential write requirements), they become
> virtual too, resulting in:
>
>   virtual -> virtual -> physical
>
> This triple indirection is not ideal.
>
> However, if the modularization from point 1 is achieved and
> vswap acts as a swap device itself, then we can cleanly
> establish a:
>
>   virtual -> physical

I read that thread some time ago. Some remarks:

1. I think Christoph has a point. Seems like some of your ideas are
broadly applicable to swap in general. Maybe fixing swap infra
generally would make a lot of sense?

2. Why do we need to do two virtual layers here? For example, If you
want to buffer multiple swap outs and turn them into a sequential
request, you can:

a. Allocate virtual swap space for them as you wish. They don't even
need to be sequential.

b. At swap_writeout() time, don't allocate physical swap space for
them right away. Instead, accumulate them into a buffer. You can add a
new virtual swap entry type to flag it if necessary.

c. Once that buffer reaches a certain size, you can now allocate
contiguous physical swap space for them. Then flush etc. You can flush
at swap_writeout() time, or use a dedicated thread etc.
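
Steps a-c above can be sketched roughly as below. This is a userspace
toy, not vswap code: struct vswap_sim, BATCH_PAGES, NO_PHYS and the
bump allocator for physical offsets are all made up for illustration,
and the actual I/O submission is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VSLOTS   64         /* hypothetical size of the virtual space */
#define BATCH_PAGES 4          /* hypothetical flush threshold */
#define NO_PHYS     UINT32_MAX /* virtual slot has no physical backing yet */

/* Hypothetical table mapping virtual slot -> physical offset. */
struct vswap_sim {
	uint32_t phys_of[NR_VSLOTS];
	uint32_t pending[BATCH_PAGES]; /* virtual slots sitting in the buffer */
	size_t nr_pending;
	uint32_t next_phys;            /* next free physical offset */
	unsigned int flushes;
};

static void vswap_sim_init(struct vswap_sim *vs)
{
	for (size_t i = 0; i < NR_VSLOTS; i++)
		vs->phys_of[i] = NO_PHYS;
	vs->nr_pending = 0;
	vs->next_phys = 0;
	vs->flushes = 0;
}

/* Step c: back the whole batch with one contiguous physical run. */
static void vswap_sim_flush(struct vswap_sim *vs)
{
	if (!vs->nr_pending)
		return;
	for (size_t i = 0; i < vs->nr_pending; i++)
		vs->phys_of[vs->pending[i]] = vs->next_phys++;
	vs->nr_pending = 0;
	vs->flushes++;
}

/* Step b: writeout only buffers the page, no physical slot yet. */
static void vswap_sim_writeout(struct vswap_sim *vs, uint32_t vslot)
{
	assert(vslot < NR_VSLOTS && vs->nr_pending < BATCH_PAGES);
	vs->pending[vs->nr_pending++] = vslot;
	if (vs->nr_pending == BATCH_PAGES)
		vswap_sim_flush(vs);
}
```

The virtual slots passed in can be arbitrary (step a); only the
physical offsets handed out at flush time end up sequential, so there
is a single virtual -> physical mapping throughout.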

Deduplication sounds like something that should live at a lower layer
- I was thinking about it for zswap/zsmalloc back then. I mean, I
assume you don't want content sharing across different swap media? :)
Something along the line of:

1. Maintain a content index for swapped out pages.

2. For the swap media that support deduplication, you'll need to add
some sort of reference count (more overhead ew).

3. Each time we swap out, we can content-check to see if the same
piece of content has been swapped out before. If so, set the vswap
backend to the physical location of the data, increment some sort of
reference count (perhaps we can use swap count) of the older entry,
and have the swap type point to it.
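
To make the three steps concrete, here is a toy userspace sketch. All
names (struct dedup_index, dedup_swap_out(), SIM_PAGE_SIZE) are
hypothetical, the "index" is a linear scan with a full memcmp() where
real code would presumably hash first, and reference drop on swap-in
and cgroup charging are not modeled at all.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_PAGE_SIZE   16         /* tiny "page" to keep the toy small */
#define SIM_MAX_ENTRIES 8          /* hypothetical index capacity */
#define SIM_NO_SPACE    UINT32_MAX

struct dedup_entry {
	unsigned char data[SIM_PAGE_SIZE]; /* content already on the backend */
	uint32_t phys;                     /* physical location of that content */
	unsigned int refcount;             /* swap entries sharing it */
};

struct dedup_index {
	struct dedup_entry e[SIM_MAX_ENTRIES];
	unsigned int nr;
	uint32_t next_phys;
};

/*
 * Return the physical location backing @page. If identical content was
 * swapped out before, share it and bump its reference count; otherwise
 * record a new entry. Returns SIM_NO_SPACE when the index is full.
 */
static uint32_t dedup_swap_out(struct dedup_index *idx,
			       const unsigned char *page)
{
	struct dedup_entry *n;

	assert(idx && page);
	for (unsigned int i = 0; i < idx->nr; i++) {
		if (!memcmp(idx->e[i].data, page, SIM_PAGE_SIZE)) {
			idx->e[i].refcount++;
			return idx->e[i].phys;
		}
	}
	if (idx->nr == SIM_MAX_ENTRIES)
		return SIM_NO_SPACE;
	n = &idx->e[idx->nr++];
	memcpy(n->data, page, SIM_PAGE_SIZE);
	n->phys = idx->next_phys++;
	n->refcount = 1;
	return n->phys;
}
```

Two swap-outs of the same content return the same physical location
with a reference count of 2, which is exactly the point where the
ownership question below shows up.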

But have you considered the implications of sharing swap data like
this? I need to read the paper you cite - seems like a potential fun
read. But what happens when these two pages that share the content
belong to two different cgroups? How does the
charging/uncharging/charge transferring story work? That's one of the
things that made me pause when I wanted to implement deduplication for
zswap/zsmalloc. Zram does not charge memory towards cgroup, but zswap
does, so we'll need to handle this somehow, and at that point all the
complexity might no longer be worth it.

>
> relationship within it.
>
> I noticed you seem to be exploring collaboration with Kairui
> as well. I'm curious whether you have a compromise direction
> in mind, or if you plan to stick with the current approach.

I do have some ideas from discussing with Kairui. I'm still figuring
that part out though.

What I'm working on right now is tracing all the inherent overhead of
swap virtualization, regardless of the method we use.

>
> P.S. I definitely want to review the vswap code in detail
> when I get the time. great work and code.
>
> Thanks,
> Youngjun Park
>

^ permalink raw reply

* Re: [PATCH] pmdomain: imx: Make IMX8M/IMX9 BLK_CTRL tristate
From: Frank Li @ 2026-04-12  1:07 UTC (permalink / raw)
  To: Zhipeng Wang
  Cc: ulfh, s.hauer, kernel, festevam, linux-pm, imx, linux-arm-kernel,
	linux-kernel, xuegang.liu, jindong.yue
In-Reply-To: <20260410092735.1065294-1-zhipeng.wang_1@nxp.com>

On Fri, Apr 10, 2026 at 06:27:35PM +0900, Zhipeng Wang wrote:
> Convert IMX8M_BLK_CTRL and IMX9_BLK_CTRL from bool to tristate
> to allow building as loadable modules.
>
> Add prompt strings to make these options visible and configurable
> in menuconfig, keeping them enabled by default on appropriate platforms.
>
> Also remove the IMX_GPCV2_PM_DOMAINS dependency from IMX9_BLK_CTRL
> since i.MX93 doesn't use GPCv2 power domains.

Does it cause a build failure on GPCv2 platforms? Or was the previous
dependency actually wrong?

Frank
>
> Signed-off-by: Zhipeng Wang <zhipeng.wang_1@nxp.com>
> ---
>  drivers/pmdomain/imx/Kconfig | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/pmdomain/imx/Kconfig b/drivers/pmdomain/imx/Kconfig
> index 00203615c65e..9168d183b0c5 100644
> --- a/drivers/pmdomain/imx/Kconfig
> +++ b/drivers/pmdomain/imx/Kconfig
> @@ -10,15 +10,18 @@ config IMX_GPCV2_PM_DOMAINS
>  	default y if SOC_IMX7D
>
>  config IMX8M_BLK_CTRL
> -	bool
> -	default SOC_IMX8M && IMX_GPCV2_PM_DOMAINS
> +	tristate "i.MX8M BLK CTRL driver"
> +	depends on SOC_IMX8M
> +	depends on IMX_GPCV2_PM_DOMAINS
>  	depends on PM_GENERIC_DOMAINS
>  	depends on COMMON_CLK
> +	default y
>
>  config IMX9_BLK_CTRL
> -	bool
> -	default SOC_IMX9 && IMX_GPCV2_PM_DOMAINS
> +	tristate "i.MX93 BLK CTRL driver"
> +	depends on SOC_IMX9
>  	depends on PM_GENERIC_DOMAINS
> +	default y
>
>  config IMX_SCU_PD
>  	bool "IMX SCU Power Domain driver"
> --
> 2.34.1
>

^ permalink raw reply

* Re: [PATCH v5 00/21] Virtual Swap Space
From: Nhat Pham @ 2026-04-12  1:03 UTC (permalink / raw)
  To: YoungJun Park
  Cc: Kairui Song, Liam.Howlett, akpm, apopple, axelrasmussen, baohua,
	baolin.wang, bhe, byungchul, cgroups, chengming.zhou, chrisl,
	corbet, david, dev.jain, gourry, hannes, hughd, jannh,
	joshua.hahnjy, lance.yang, lenb, linux-doc, linux-kernel,
	linux-mm, linux-pm, lorenzo.stoakes, matthew.brost, mhocko,
	muchun.song, npache, pavel, peterx, peterz, pfalcato, rafael,
	rakie.kim, roman.gushchin, rppt, ryan.roberts, shakeel.butt,
	shikemeng, surenb, tglx, vbabka, weixugc, ying.huang, yosry.ahmed,
	yuanchu, zhengqi.arch, ziy, kernel-team, riel
In-Reply-To: <acQvNRLpHwnHt7i+@yjaykim-PowerEdge-T330>

On Wed, Mar 25, 2026 at 11:53 AM YoungJun Park <youngjun.park@lge.com> wrote:
>
> On Mon, Mar 23, 2026 at 11:32:57AM -0400, Nhat Pham wrote:
>
> > Interesting. Normally "lots of zero-filled page" is a very beneficial
> > case for vswap. You don't need a swapfile, or any zram/zswap metadata
> > overhead - it's a native swap backend. If production workload has this
> > many zero-filled pages, I think the numbers of vswap would be much
> > less alarming - perhaps even matching memory overhead because you
> > don't need to maintain a zram entry metadata (it's at least 2 words
> > per zram entry right?), while there's no reverse map overhead induced
> > (so it's 24 bytes on both side), and no need to do zram-side locking
> > :)
> >
> > So I was surprised to see that it's not working out very well here. I
> > checked the implementation of memhog - let me know if this is wrong
> > place to look:
> >
> > https://man7.org/linux/man-pages/man8/memhog.8.html
> > https://github.com/numactl/numactl/blob/master/memhog.c#L52
> >
> > I think this is what happened here: memhog was populating the memory
> > 0xff, which triggers the full overhead of a swapfile-backed swap entry
> > because even though it's "same-filled" it's not zero-filled! I was
> > following Usama's observation - "less than 1% of the same-filled pages
> > were non-zero" - and so I only handled the zero-filled case here:
> >
> > https://lore.kernel.org/all/20240530102126.357438-1-usamaarif642@gmail.com/
> >
> > This sounds a bit artificial IMHO - as Usama pointed out above, I
> > think most samefilled pages are zero pages, in real production
> > workloads. However, if you think there are real use cases with a lot
> > of non-zero samefilled pages, please let me know I can fix this real
> > quick. We can support this in vswap with zero extra metadata overhead
> > - change the VSWAP_ZERO swap entry type to VSWAP_SAME_FILLED, then use
> > the backend field to store that value. I can send you a patch if
> > you're interested.
>
> This brings back memories -- I'm pretty sure we talked about
> exactly this at LPC. Our custom swap device already handles both
> zero-filled and same-filled pages on its own, so what we really
> wanted was a way to tell the swap layer "just skip the detection
> and let it through."
>
> I looked at two approaches back then but never submitted either:
>
>   - A per-swap_info flag to opt out of zero/same-filled handling.
>     But this felt wrong from vswap's perspective -- if even one
>     device opts out of the zeromap, the model gets messy.
>
>   - Revisiting Usama's patch 2 approach.
>     Sounded good in theory, but as you said,
>     it's not as simple to verify in practice. And it is more clean design
>     swapout time zero check as I see. So,  I gave up on it.
>
> Seeing this come up again is actually kind of nice :)
>
> One thought -- maybe a compile-time CONFIG or a boot param to
> control the scope? e.g. zero-only, same-filled, or disabled.
> That way vendors like us just turn it off, and setups like
> Kairui's can opt into broader detection. Just an idea though --
> open to other approaches if you have something in mind.

Yeah for vswap it's probably going to be a CONFIG or boot param.

But in the status quo, we can always add a swapfile flag. That one
should work already, right?

Thanks for thinking about it :) FWIW I think zero check is really
cheap, but yeah it's just wasted work.
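
For reference, the kind of check being discussed is roughly a
word-by-word scan like the userspace sketch below. page_same_filled()
and SIM_PAGE_WORDS are made-up names for illustration, not the actual
kernel helpers; the zero-only case is just the result with the fill
value equal to 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_PAGE_WORDS 512 /* assumed page size, in unsigned long words */

/*
 * Return true if every word in @page equals the first one and store
 * that word in @value. A zero-filled page is the *value == 0 case; a
 * page filled with 0xff bytes (as memhog does) is same-filled but not
 * zero-filled.
 */
static bool page_same_filled(const unsigned long *page, unsigned long *value)
{
	assert(page && value);
	for (size_t i = 1; i < SIM_PAGE_WORDS; i++)
		if (page[i] != page[0])
			return false;
	*value = page[0];
	return true;
}
```

The scan bails out at the first mismatching word, so it is cheap for
most non-same-filled pages, but it still has to walk the whole page
when the page really is same-filled.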

(ZRAM folks - do you feel the overhead here?)

>
> Thanks,
> Youngjun Park
>

^ permalink raw reply

