Linux Documentation
* [PATCH 2/3] docs/zh_CN: add process/changes.rst translation
From: Jiandong Qiu @ 2026-06-19 14:02 UTC
  To: alexs, si.yanteng
  Cc: dzm91, corbet, skhan, linux-doc, linux-kernel, Jiandong Qiu
In-Reply-To: <20260619140245.1982921-1-qiujiandong1998@gmail.com>

Add the zh_CN translation of process/changes.rst.

Update the translation through commit ece7e57afd51
("docs: changes.rst and ver_linux: sort the lists").

Signed-off-by: Jiandong Qiu <qiujiandong1998@gmail.com>
---
 .../translations/zh_CN/process/changes.rst    | 530 ++++++++++++++++++
 1 file changed, 530 insertions(+)
 create mode 100644 Documentation/translations/zh_CN/process/changes.rst

diff --git a/Documentation/translations/zh_CN/process/changes.rst b/Documentation/translations/zh_CN/process/changes.rst
new file mode 100644
index 000000000000..cc22f65e4888
--- /dev/null
+++ b/Documentation/translations/zh_CN/process/changes.rst
@@ -0,0 +1,530 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/process/changes.rst
+
+:翻译: 裘剑东 Jiandong Qiu <qiujiandong1998@gmail.com>
+
+.. _changes_zh:
+
+==================
+编译内核的最小需求
+==================
+
+引言
+====
+
+本文旨在给出运行当前内核版本所需的最低软件版本列表。
+
+本文最初基于 Linus 为 2.0.x 内核编写的 “Changes” 文件,因此也应将功劳归于
+与该文件相关的同一批人(Jared Mauch、Axel Boldt、Alessandro Sigala,
+以及互联网上无数其他用户)。
+
+当前最低需求
+------------
+
+在认为自己碰到了一个bug之前,请先至少升级到以下软件版本。
+如果你不确定当前运行的版本,建议使用右侧命令进行检查。
+若要列出系统中的程序及其版本,请执行 ``./scripts/ver_linux``
+
+再次提醒,本列表假定你已经能够正常运行一个Linux内核。另外,
+并非所有工具在所有系统上都是必需的;例如,如果你的机器没有任何
+PC Card硬件,那么大概无需关心 pcmciautils。
+
+====================== ===============  ========================================
+        程序             最低版本              版本检查命令
+====================== ===============  ========================================
+bash                   4.2              bash --version
+bc                     1.06.95          bc --version
+bindgen(可选)        0.65.1           bindgen --version
+binutils               2.30             ld -v
+bison                  2.0              bison --version
+btrfs-progs            0.18             btrfs --version
+Clang/LLVM(可选)     15.0.0           clang --version
+e2fsprogs              1.41.4           e2fsck -V
+flex                   2.5.35           flex --version
+gdb                    7.2              gdb --version
+GNU awk(可选)        5.1.0            gawk --version
+GNU C                  8.1              gcc --version
+GNU make               4.0              make --version
+GNU tar                1.28             tar --version
+GRUB                   0.93             grub --version || grub-install --version
+gtags(可选)          6.6.5            gtags --version
+iptables               1.4.2            iptables -V
+jfsutils               1.1.3            fsck.jfs -V
+kmod                   13               kmod -V
+mcelog                 0.6              mcelog --version
+mkimage(可选)        2017.01          mkimage --version
+nfs-utils              1.0.5            showmount --version
+openssl & libcrypto    1.0.0            openssl version
+pahole                 1.22             pahole --version
+pcmciautils            004              pccardctl -V
+PPP                    2.4.0            pppd --version
+procps                 3.2.0            ps --version
+Python                 3.9.x            python3 --version
+quota-tools            3.09             quota -V
+Rust(可选)           1.78.0           rustc --version
+Sphinx\ [#f1]_         3.4.3            sphinx-build --version
+squashfs-tools         4.0              mksquashfs -version
+udev                   081              udevadm --version
+util-linux             2.10o            mount --version
+xfsprogs               2.6.0            xfs_db -V
+====================== ===============  ========================================
+
+.. [#f1] Sphinx 仅在构建内核文档时需要
+
+内核编译
+--------
+
+GCC
+~~~
+
+gcc 的版本要求可能会因你计算机中CPU的类型不同而有所变化。
+
+Clang/LLVM(可选)
+~~~~~~~~~~~~~~~~~~
+
+clang和LLVM工具的最新正式发行版(依据
+`releases.llvm.org <https://releases.llvm.org>`_)支持用于构建内核。
+较旧版本并不保证可用,我们也可能移除内核中为支持旧版而加入的兼容性处理。
+更多信息请参阅 :ref:`使用 Clang/LLVM 构建 Linux <kbuild_llvm_zh>`。
+
+Rust(可选)
+~~~~~~~~~~~~
+
+需要较新的 Rust 编译器版本。
+
+关于如何满足 Rust 支持的构建需求,请参阅
+Documentation/translations/zh_CN/rust/quick-start.rst。其中,``Makefile`` 目标
+``rustavailable`` 可用于检查 Rust 工具链为何未被检测到。
+
+bindgen(可选)
+~~~~~~~~~~~~~~~
+
+``bindgen`` 用于为内核的 C 侧生成Rust绑定。它依赖 ``libclang``。
+
+Make
+~~~~
+
+要构建内核,你需要 GNU make 4.0 或更高版本。
+
+Bash
+~~~~
+
+内核构建会使用一些 bash 脚本。需要 Bash 4.2 或更新版本。
+
+Binutils
+~~~~~~~~
+
+构建内核需要 Binutils 2.30 或更新版本。
+
+pkg-config
+~~~~~~~~~~
+
+从 Linux 4.18 起,构建系统需要 pkg-config 来检查已安装的 kconfig 工具,并确定用于
+'make {g,x}config' 的标志设置。此前虽然已经在使用 pkg-config,
+但并未进行检查或文档说明。
+
+Flex
+~~~~
+
+自 Linux 4.16 起,构建系统会在构建过程中生成词法分析器。这需要
+flex 2.5.35 或更高版本。
+
+
+Bison
+~~~~~
+
+自 Linux 4.16 起,构建系统会在构建过程中生成语法解析器。这需要 bison 2.0
+或更高版本。
+
+pahole
+~~~~~~
+
+自 Linux 5.2 起,如果选择了 CONFIG_DEBUG_INFO_BTF,构建系统会从 vmlinux 中的
+DWARF 生成 BTF(BPF Type Format),稍后也会为内核模块生成。这需要 pahole
+v1.22 或更高版本。
+
+它可从发行版中的 'dwarves' 或 'pahole' 软件包获得,或从
+https://fedorapeople.org/~acme/dwarves/ 获取。
+
+Perl
+~~~~
+
+要构建内核,你需要 perl 5 以及以下模块:``Getopt::Long``、
+``Getopt::Std``、``File::Basename`` 和 ``File::Find``。
+
+Python
+~~~~~~
+
+若干配置选项需要它:例如 arm/arm64 默认配置、CONFIG_LTO_CLANG、某些
+DRM 可选配置、kernel-doc 工具以及文档构建(Sphinx)等。
+
+BC
+~~
+
+构建 3.10 及以上版本内核时需要 bc。
+
+
+OpenSSL
+~~~~~~~
+
+模块签名和外部证书处理使用OpenSSL程序及其加密库来创建密钥并生成签名。
+
+如果启用了模块签名,那么构建 3.7 及以上版本内核时需要 openssl。构建
+4.3 及以上版本内核时,还需要 openssl 的开发包。
+
+Tar
+~~~
+
+如果你想通过 sysfs 启用对内核头文件的访问(CONFIG_IKHEADERS),则需要 GNU tar。
+
+gtags / GNU GLOBAL(可选)
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+内核构建要求 GNU GLOBAL 版本 6.6.5 或更高,以便通过 ``make gtags``
+生成标签文件。这是因为它使用了 gtags 的 ``-C (--directory)`` 选项。
+
+mkimage
+~~~~~~~
+
+该工具用于构建 Flat Image Tree(FIT),常见于 ARM 平台。该工具可通过
+``u-boot-tools`` 软件包获得,也可以从 U-Boot 源码构建。详见
+https://docs.u-boot.org/en/latest/build/tools.html#building-tools-for-linux
+
+GNU AWK
+~~~~~~~
+
+如果你希望内核构建为内建模块生成地址范围数据(CONFIG_BUILTIN_MODULE_RANGES),
+则需要GNU AWK。
+
+系统工具
+--------
+
+架构方面的变化
+~~~~~~~~~~~~~~
+
+DevFS 已被废弃,转而使用 udev
+(https://www.kernel.org/pub/linux/utils/kernel/hotplug/)
+
+现在已经支持 32 位 UID。尽情享用!
+
+Linux 中函数的文档正在转向以内联文档形式存在,即在源码定义附近使用特殊格式的
+注释。这些注释可以与 Documentation/ 目录中的 ReST 文件结合,生成更丰富的文档,
+随后可以再转换为 PostScript、HTML、LaTeX、ePUB 和 PDF 文件。若要将 ReST
+格式转换为你所需的格式,需要Sphinx。
+
+Util-linux
+~~~~~~~~~~
+
+较新的 util-linux 版本为更大容量磁盘提供 ``fdisk`` 支持,支持更多的 mount 选项,
+识别更多分区类型,以及其他类似改进。你大概会想升级它。
+
+Ksymoops
+~~~~~~~~
+
+如果发生了最糟糕的情况,内核出现 oops,你可能需要 ksymoops 工具来解码它,
+但在大多数情况下并不需要。通常更推荐在构建内核时启用 ``CONFIG_KALLSYMS``,
+这样可以产生可直接使用的可读转储(而且输出比 ksymoops 更好)。
+如果由于某种原因你的内核不是以 ``CONFIG_KALLSYMS`` 构建的,
+并且你也没有办法重新构建并在启用该选项的情况下重新复现Oops,
+那么你仍然可以使用 ksymoops 对该 Oops 进行解码。
+
+Mkinitrd
+~~~~~~~~
+
+``/lib/modules`` 文件树布局的这些变化同样要求升级 mkinitrd。
+
+E2fsprogs
+~~~~~~~~~
+
+最新版 ``e2fsprogs`` 修复了 fsck 和 debugfs 中的若干bug。显然,升级它是个好主意。
+
+JFSutils
+~~~~~~~~
+
+``jfsutils`` 软件包包含该文件系统的工具。可用工具如下:
+
+ - ``fsck.jfs`` - 启动事务日志重放,并检查和修复 JFS 格式分区。
+ - ``mkfs.jfs`` - 创建 JFS 格式分区。
+ - 该软件包中还提供了其他文件系统工具。
+
+Xfsprogs
+~~~~~~~~
+
+最新版 ``xfsprogs`` 包含 ``mkfs.xfs``、``xfs_db`` 和 ``xfs_repair`` 等
+XFS文件系统工具。它与架构无关,2.0.0 及以上的任何版本都应能与当前版本的
+XFS内核代码正常配合使用(推荐 2.6.0 或更高版本,因为其包含一些重要改进)。
+
+PCMCIAutils
+~~~~~~~~~~~
+
+PCMCIAutils取代了 ``pcmcia-cs``。它会在系统启动时正确设置 PCMCIA插槽;
+如果内核采用模块化并使用了 hotplug 子系统,它还会为16位PCMCIA设备加载相应模块。
+
+Quota-tools
+~~~~~~~~~~~
+
+如果你想使用较新的 version 2 配额格式,就需要支持 32 位 uid 和 gid。
+Quota-tools 3.07 及更新版本提供了该支持。请使用上表中推荐版本或更新版本。
+
+Intel IA32 微码
+~~~~~~~~~~~~~~~
+
+新增了一个驱动,可用于更新 Intel IA32 微码,并以普通(misc)
+字符设备的形式提供访问。如果你没有使用udev,则在使用前可能需要以root身份执行::
+
+  mkdir /dev/cpu
+  mknod /dev/cpu/microcode c 10 184
+  chmod 0644 /dev/cpu/microcode
+
+你可能还会希望获取用户空间的 microcode_ctl 工具来配合使用。
+
+udev
+~~~~
+
+``udev`` 是一个用户空间程序,用于动态填充 ``/dev``,
+仅为实际存在的设备创建设备节点。``udev`` 替代了 devfs 的基本功能,
+同时允许为设备提供持久化命名。
+
+FUSE
+~~~~
+
+需要 libfuse 2.4.0 或更高版本。绝对最低要求是 2.3.0,但 mount 选项
+``direct_io`` 和 ``kernel_cache`` 将无法工作。
+
+网络
+----
+
+通用变化
+~~~~~~~~
+
+如果你有较复杂的网络配置需求,应该考虑使用 ip-route2 中的网络工具。
+
+包过滤 / NAT
+~~~~~~~~~~~~
+
+数据包过滤和 NAT 代码使用的工具与此前的 2.4.x 内核系列相同(iptables)。
+它仍然包含与 2.2.x 风格 ipchains 以及 2.0.x 风格ipfwadm 的向后兼容模块。
+
+PPP
+~~~
+
+PPP 驱动已经过重构,以支持 multilink 并使其能够运行在多种介质层之上。如
+果你使用 PPP,请将 pppd 至少升级到2.4.0。
+
+如果你没有使用 udev,则必须拥有设备文件 /dev/ppp,可以通过以下命令创建::
+
+  mknod /dev/ppp c 108 0
+
+需要以 root 身份执行。
+
+NFS-utils
+~~~~~~~~~
+
+在很早期的内核(2.4 及更早版本)中,nfs 服务器需要知道哪些客户端希望通
+过 NFS 访问文件。这些信息会在客户端挂载文件系统时由 ``mountd`` 提供
+给内核,或者在系统启动时由 ``exportfs`` 提供。exportfs 会从
+``/var/lib/nfs/rmtab`` 中获取活跃客户端信息。
+
+这种方法相当脆弱,因为它依赖于 rmtab 的正确性,而这并不总是容易保证,
+尤其是在尝试实现故障切换时。即便系统运行正常,``rmtab``
+也会积累大量从未被移除的旧条目。
+
+在现代内核中,我们可以选择让内核在收到未知主机请求时通知 mountd,
+再由 mountd 将合适的导出信息提供给内核。这样就不再依赖 ``rmtab``,
+并且内核只需要知道当前活跃的客户端。
+
+要启用这一新功能,你需要在运行 exportfs 或 mountd 之前执行::
+
+  mount -t nfsd nfsd /proc/fs/nfsd
+
+建议尽可能使用防火墙将所有NFS服务与公共互联网隔离。
+
+mcelog
+~~~~~~
+
+在x86内核上,如果启用了 ``CONFIG_X86_MCE``,则需要 mcelog 工具来处理和
+记录机器检查事件。机器检查事件是CPU报告的错误,强烈建议对其进行处理。
+
+内核文档
+--------
+
+Sphinx
+~~~~~~
+
+关于Sphinx需求的详细信息,请参阅
+Documentation/translations/zh_CN/doc-guide/sphinx.rst 中的 :ref:`sphinx_install_zh`。
+
+rustdoc
+~~~~~~~
+
+``rustdoc`` 用于为 Rust 代码生成文档。更多信息请参阅
+Documentation/translations/zh_CN/rust/general-information.rst。
+
+获取更新的软件
+==============
+
+内核编译
+--------
+
+gcc
+~~~
+
+- <ftp://ftp.gnu.org/gnu/gcc/>
+
+Clang/LLVM
+~~~~~~~~~~
+
+- :ref:`获取 LLVM <zh_cn_getting_llvm>`。
+
+Rust
+~~~~
+
+- Documentation/rust/quick-start.rst。
+
+bindgen
+~~~~~~~
+
+- Documentation/rust/quick-start.rst。
+
+Make
+~~~~
+
+- <ftp://ftp.gnu.org/gnu/make/>
+
+Bash
+~~~~
+
+- <ftp://ftp.gnu.org/gnu/bash/>
+
+Binutils
+~~~~~~~~
+
+- <https://www.kernel.org/pub/linux/devel/binutils/>
+
+Flex
+~~~~
+
+- <https://github.com/westes/flex/releases>
+
+Bison
+~~~~~
+
+- <ftp://ftp.gnu.org/gnu/bison/>
+
+OpenSSL
+~~~~~~~
+
+- <https://www.openssl.org/>
+
+系统工具
+--------
+
+Util-linux
+~~~~~~~~~~
+
+- <https://www.kernel.org/pub/linux/utils/util-linux/>
+
+Kmod
+~~~~
+
+- <https://www.kernel.org/pub/linux/utils/kernel/kmod/>
+- <https://git.kernel.org/pub/scm/utils/kernel/kmod/kmod.git>
+
+Ksymoops
+~~~~~~~~
+
+- <https://www.kernel.org/pub/linux/utils/kernel/ksymoops/v2.4/>
+
+Mkinitrd
+~~~~~~~~
+
+- <https://code.launchpad.net/initrd-tools/main>
+
+E2fsprogs
+~~~~~~~~~
+
+- <https://www.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs/>
+- <https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/>
+
+JFSutils
+~~~~~~~~
+
+- <https://jfs.sourceforge.net/>
+
+Xfsprogs
+~~~~~~~~
+
+- <https://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git>
+- <https://www.kernel.org/pub/linux/utils/fs/xfs/xfsprogs/>
+
+Pcmciautils
+~~~~~~~~~~~
+
+- <https://www.kernel.org/pub/linux/utils/kernel/pcmcia/>
+
+Quota-tools
+~~~~~~~~~~~
+
+- <https://sourceforge.net/projects/linuxquota/>
+
+
+Intel P6 微码
+~~~~~~~~~~~~~
+
+- <https://downloadcenter.intel.com/>
+
+udev
+~~~~
+
+- <https://www.freedesktop.org/software/systemd/man/udev.html>
+
+FUSE
+~~~~
+
+- <https://github.com/libfuse/libfuse/releases>
+
+mcelog
+~~~~~~
+
+- <https://www.mcelog.org/>
+
+网络
+----
+
+PPP
+~~~
+
+- <https://download.samba.org/pub/ppp/>
+- <https://git.ozlabs.org/?p=ppp.git>
+- <https://github.com/paulusmack/ppp/>
+
+NFS-utils
+~~~~~~~~~
+
+- <https://sourceforge.net/project/showfiles.php?group_id=14>
+- <https://nfs.sourceforge.net/>
+
+Iptables
+~~~~~~~~
+
+- <https://netfilter.org/projects/iptables/index.html>
+
+Ip-route2
+~~~~~~~~~
+
+- <https://www.kernel.org/pub/linux/utils/net/iproute2/>
+
+OProfile
+~~~~~~~~
+
+- <https://oprofile.sf.net/download/>
+
+内核文档
+--------
+
+Sphinx
+~~~~~~
+
+- <https://www.sphinx-doc.org/>
-- 
Jiandong Qiu <qiujiandong1998@gmail.com>



* [PATCH 1/3] docs/zh_CN: add llvm.rst translation anchor
From: Jiandong Qiu @ 2026-06-19 14:02 UTC
  To: alexs, si.yanteng
  Cc: dzm91, corbet, skhan, linux-doc, linux-kernel, Jiandong Qiu
In-Reply-To: <20260619140245.1982921-1-qiujiandong1998@gmail.com>

Add the kbuild_llvm_zh label for local cross-references.

Signed-off-by: Jiandong Qiu <qiujiandong1998@gmail.com>
---
process/changes.rst refers to this anchor.

 Documentation/translations/zh_CN/kbuild/llvm.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/translations/zh_CN/kbuild/llvm.rst b/Documentation/translations/zh_CN/kbuild/llvm.rst
index f87e0181d8e7..5fdf281a614a 100644
--- a/Documentation/translations/zh_CN/kbuild/llvm.rst
+++ b/Documentation/translations/zh_CN/kbuild/llvm.rst
@@ -5,6 +5,8 @@
 :Original: Documentation/kbuild/llvm.rst
 :Translator: 慕冬亮 Dongliang Mu <dzm91@hust.edu.cn>
 
+.. _kbuild_llvm_zh:
+
 ==========================
 使用 Clang/LLVM 构建 Linux
 ==========================
-- 
Jiandong Qiu <qiujiandong1998@gmail.com>



* [PATCH 0/3] docs/zh_CN: update translation of doc-guide/sphinx.rst
From: Jiandong Qiu @ 2026-06-19 14:02 UTC
  To: alexs, si.yanteng
  Cc: dzm91, corbet, skhan, linux-doc, linux-kernel, Jiandong Qiu

Hi all,

This is my first time sending patches to the Linux community. I have
been reading the kernel documentation to learn more about Linux, and in
the process I found a few places where I could help improve the zh_CN
translations. Comments and suggestions are welcome.

This series contains three patches, all intended to update the zh_CN
translation of doc-guide/sphinx.rst. That translation appears to have
been out of date for several years. While updating it, I noticed that
sphinx.rst refers to process/changes.rst, which did not yet have a zh_CN
translation, so I added one. During that work, I also noticed that
changes.rst refers to llvm.rst, so I added the missing anchor in the
zh_CN translation of llvm.rst.

Jiandong Qiu (3):
  docs/zh_CN: add llvm.rst translation anchor
  docs/zh_CN: add process/changes.rst translation
  docs/zh_CN: update sphinx.rst translation

 .../translations/zh_CN/doc-guide/sphinx.rst   | 165 ++++--
 .../translations/zh_CN/kbuild/llvm.rst        |   2 +
 .../translations/zh_CN/process/changes.rst    | 530 ++++++++++++++++++
 3 files changed, 665 insertions(+), 32 deletions(-)
 create mode 100644 Documentation/translations/zh_CN/process/changes.rst

-- 
Jiandong Qiu <qiujiandong1998@gmail.com>


* Re: [PATCH net-next v5 12/15] onsemi: s2500: Add driver support for TS2500 MAC-PHY
From: Uwe Kleine-König @ 2026-06-19 13:59 UTC
  To: Selvamani.Rajagopal
  Cc: Andrew Lunn, Piergiorgio Beruto, Heiner Kallweit, Russell King,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Andrew Lunn, Parthiban Veerasooran, Richard Cochran, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Simon Horman, Jonathan Corbet,
	Shuah Khan, netdev, linux-kernel, devicetree, linux-doc,
	Jerry Ray
In-Reply-To: <20260614-s2500-mac-phy-support-v5-12-89874b72f725@onsemi.com>

On Sun, Jun 14, 2026 at 10:00:28AM -0700, Selvamani Rajagopal via B4 Relay wrote:
> +static const struct of_device_id s2500_of_match[] = {
> +	{ .compatible = "onnn,s2500" },
> +	{}

s/{}/{ }/

> +};
> +
> +static const struct spi_device_id s2500_ids[] = {
> +	{ "s2500" },
> +	{}
> +};

Please make this:

static const struct spi_device_id s2500_ids[] = {
	{ .name = "s2500" },
	{ }
};

> +MODULE_DEVICE_TABLE(spi, s2500_ids);
> +
> +static struct spi_driver s2500_driver = {
> +	.driver = {
> +		.name	= DRV_NAME,
> +		.of_match_table = s2500_of_match,
> +	},
> +	.probe		= s2500_probe,
> +	.remove		= s2500_remove,
> +	.id_table	= s2500_ids,

Tastes are different, but the idea to align = is usually screwed by
follow up patches. Here it's broken from the start. If you ask me: Use a
single space before each =.

> +};
> +
> +module_spi_driver(s2500_driver);

Usually there is no empty line between the driver struct and the macro
registering it.

> +
> +MODULE_AUTHOR("Piergiorgio Beruto <pier.beruto@onsemi.com>");
> +MODULE_AUTHOR("Selva Rajagopal <selvamani.rajagopal@onsemi.com>");
> +MODULE_DESCRIPTION("onsemi MACPHY ethernet driver");
> +MODULE_LICENSE("GPL");

Best regards
Uwe


* Re: [RFC PATCH 0/2] kasan: hw_tags: Add option to tag only at allocation time
From: Catalin Marinas @ 2026-06-19 13:19 UTC
  To: Harry Yoo
  Cc: Dev Jain, ryabinin.a.a, akpm, corbet, glider, andreyknvl, dvyukov,
	vincenzo.frascino, kasan-dev, linux-mm, linux-kernel, skhan,
	workflows, linux-doc, linux-arm-kernel, ryan.roberts,
	anshuman.khandual, kaleshsingh, 21cnbao, david, will
In-Reply-To: <b1502a60-09a1-4699-886b-93d041de7023@kernel.org>

On Thu, Jun 18, 2026 at 10:35:15PM +0900, Harry Yoo wrote:
> On 6/12/26 1:44 PM, Dev Jain wrote:
> > Introduce a boot option to tag only at allocation time of the objects. This
> > reduces KASAN MTE overhead, the tradeoff being reduced ability of
> > catching bugs.
> 
> I think most of the overhead when enabling MTE comes from loading and
> validating tags for every memory access (either in SYNC or ASYNC mode),
> rather than from storing tags.

I guess it depends on the workload. Lots of allocations for short-lived
buffers (e.g. network traffic) may notice the additional tagging more
than the actual tag checking.

Of course, it would be nice to get some numbers from those who have
access to MTE capable hardware.

> > Now, when a memory object will be freed, it will retain the random tag it
> > had at allocation time. This compromises on catching UAF bugs, till the
> > time the object is not reallocated, at which point it will have a new
> > random tag.
> > 
> > Hence, not catching "use-after-free-before-reallocation" and not catching
> > "double-free" will be the compromise for reduced KASAN overhead.
> 
> I doubt users who care about security enough to enable HW_TAGS KASAN
> are willing to compromise on security just to save a few instructions
> to store tags in the free path.
> 
> To me, it looks like too much of a compromise on security for little
> performance gain.

I don't think there's much compromise on security for use-after-free.
The buffer will be re-tagged later so use-after-realloc should be
caught, especially if we ensure that a different tag will be used (I
don't think Dev's patches do this).

Of course, if you want MTE as a debug/bug-finding feature, tagging on
both allocation and freeing is highly recommended. This patchset is
aimed at those wanting to run MTE in production and squeeze a bit more
performance out of it (with the compromise of not detecting
use-after-free, only preventing access after re-allocation).
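
The trade-off can be illustrated with a toy model. This is only a sketch
under stated assumptions, not the KASAN implementation: the class name, the
tag range and the INVALID_TAG value are made up for illustration, and a
"pointer" is reduced to the tag it carries. With tagging on free, a stale
pointer faults immediately; with tagging only at allocation, it keeps working
until the object is reallocated with a different tag.

```python
import random

NUM_TAGS = 14     # usable tag values 0..13 in this toy model (an assumption)
INVALID_TAG = 14  # hypothetical "freed" tag that no live pointer ever carries


class TaggedObject:
    """One memory object whose tag is compared against the pointer's tag."""

    def __init__(self, tag_on_free, seed=0):
        self.tag_on_free = tag_on_free
        self._rng = random.Random(seed)
        self.mem_tag = INVALID_TAG

    def alloc(self):
        # Always pick a tag that differs from the current memory tag, as
        # suggested in the thread, so a stale pointer cannot match by chance
        # in the tag-at-allocation-only mode.
        choices = [t for t in range(NUM_TAGS) if t != self.mem_tag]
        self.mem_tag = self._rng.choice(choices)
        return self.mem_tag  # the "pointer" is just the tag it carries

    def free(self):
        # Only the classic mode retags (poisons) the memory on free.
        if self.tag_on_free:
            self.mem_tag = INVALID_TAG

    def access_ok(self, ptr_tag):
        # An access succeeds only when pointer tag and memory tag match.
        return ptr_tag == self.mem_tag
```

In the allocation-only mode, a sequence of alloc(), free(), access passes
the check, which is exactly the undetected use-after-free-before-reallocation
window. After the next alloc() the stale tag no longer matches.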

-- 
Catalin


* Re: [RFC PATCH 0/2] kasan: hw_tags: Add option to tag only at allocation time
From: Catalin Marinas @ 2026-06-19 13:04 UTC
  To: Harry Yoo
  Cc: Dev Jain, ryabinin.a.a, akpm, corbet, glider, andreyknvl, dvyukov,
	vincenzo.frascino, kasan-dev, linux-mm, linux-kernel, skhan,
	workflows, linux-doc, linux-arm-kernel, ryan.roberts,
	anshuman.khandual, kaleshsingh, 21cnbao, david, will
In-Reply-To: <2a7d21fa-28c1-446c-97f5-2513f29157d3@kernel.org>

On Thu, Jun 18, 2026 at 11:05:43PM +0900, Harry Yoo wrote:
> On 6/18/26 10:35 PM, Harry Yoo wrote:
> > On 6/12/26 1:44 PM, Dev Jain wrote:
> >> Introduce a boot option to tag only at allocation time of the objects. This
> >> reduces KASAN MTE overhead, the tradeoff being reduced ability of
> >> catching bugs.
> > 
> > > I think most of the overhead when enabling MTE comes from loading and
> > > validating tags for every memory access (either in SYNC or ASYNC mode),
> > rather than from storing tags.
> 
> Is there any reason not to use STGM instead of STG + DC GVA when
> setting/clearing tags for large sizes when we know they are properly
> aligned?

STGM is intended for copying tags when paired with LDGM. Have you seen
hardware where STGM is faster than STG or DC GVA? For properly aligned
buffers, I'd expect DC GVA to behave at least on par with STGM.

-- 
Catalin


* Re: [PATCH v3 1/2] dt-bindings: iio: dac: Add AD5529R
From: Nuno Sá @ 2026-06-19 13:01 UTC
  To: Conor Dooley
  Cc: Janani Sunil, Jonathan Cameron, Rodrigo Alencar, Janani Sunil,
	Lars-Peter Clausen, Michael Hennerich, David Lechner,
	Nuno Sá, Andy Shevchenko, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Philipp Zabel, Jonathan Corbet, Shuah Khan,
	linux-iio, devicetree, linux-kernel, linux-doc, Mark Brown
In-Reply-To: <20260619-bunch-diocese-dd7805cc17ff@spud>

On Fri, Jun 19, 2026 at 12:40:54PM +0100, Conor Dooley wrote:
> On Fri, Jun 19, 2026 at 12:36:55PM +0100, Conor Dooley wrote:
> > On Fri, Jun 19, 2026 at 12:33:11PM +0200, Janani Sunil wrote:
> > > 
> > > On 6/14/26 21:44, Jonathan Cameron wrote:
> > > > On Tue, 9 Jun 2026 16:47:23 +0200
> > > > Janani Sunil <jan.sun97@gmail.com> wrote:
> > > > 
> > > > > On 5/26/26 15:11, Rodrigo Alencar wrote:
> > > > > > On 26/05/19 05:42PM, Janani Sunil wrote:
> > > > > > > Devicetree bindings for AD5529R 16 channel 12/16 bit high voltage,
> > > > > > > buffered voltage output digital-to-analog converter (DAC) with an
> > > > > > > integrated precision reference.
> > > > > > ...
> > > > > > Probably others may comment on that, but...
> > > > > > 
> > > > > > This parent node may support device addressing for multi-device support through
> > > > > > those ID pins. I suppose that each device may have its own power supplies or
> > > > > > other resources like the toggle pins or reset and enable.
> > > > > > 
> > > > > > That way I suppose that an example would look like...
> > > > > > > +
> > > > > > > +patternProperties:
> > > > > > > +  "^channel@([0-9]|1[0-5])$":
> > > > > > > +    type: object
> > > > > > > +    description: Child nodes for individual channel configuration
> > > > > > > +
> > > > > > > +    properties:
> > > > > > > +      reg:
> > > > > > > +        description: Channel number.
> > > > > > > +        minimum: 0
> > > > > > > +        maximum: 15
> > > > > > > +
> > > > > > > +      adi,output-range-microvolt:
> > > > > > > +        description: |
> > > > > > > +          Output voltage range for this channel as [min, max] in microvolts.
> > > > > > > +          If not specified, defaults to 0V to 5V range.
> > > > > > > +        oneOf:
> > > > > > > +          - items:
> > > > > > > +              - const: 0
> > > > > > > +              - enum: [5000000, 10000000, 20000000, 40000000]
> > > > > > > +          - items:
> > > > > > > +              - const: -5000000
> > > > > > > +              - const: 5000000
> > > > > > > +          - items:
> > > > > > > +              - const: -10000000
> > > > > > > +              - const: 10000000
> > > > > > > +          - items:
> > > > > > > +              - const: -15000000
> > > > > > > +              - const: 15000000
> > > > > > > +          - items:
> > > > > > > +              - const: -20000000
> > > > > > > +              - const: 20000000
> > > > > > > +
> > > > > > > +    required:
> > > > > > > +      - reg
> > > > > > > +
> > > > > > > +    additionalProperties: false
> > > > > > > +
> > > > > > > +required:
> > > > > > > +  - compatible
> > > > > > > +  - reg
> > > > > > > +  - vdd-supply
> > > > > > > +  - avdd-supply
> > > > > > > +  - hvdd-supply
> > > > > > > +
> > > > > > > +dependencies:
> > > > > > > +  spi-cpha: [ spi-cpol ]
> > > > > > > +  spi-cpol: [ spi-cpha ]
> > > > > > > +
> > > > > > > +allOf:
> > > > > > > +  - $ref: /schemas/spi/spi-peripheral-props.yaml#
> > > > > > > +
> > > > > > > +unevaluatedProperties: false
> > > > > > > +
> > > > > > > +examples:
> > > > > > > +  - |
> > > > > > > +    #include <dt-bindings/gpio/gpio.h>
> > > > > > > +
> > > > > > > +    spi {
> > > > > > > +        #address-cells = <1>;
> > > > > > > +        #size-cells = <0>;
> > > > > > > +
> > > > > > > +        dac@0 {
> > > > > > > +            compatible = "adi,ad5529r-16";
> > > > > > > +            reg = <0>;
> > > > > > > +            spi-max-frequency = <25000000>;
> > > > > > > +
> > > > > > > +            vdd-supply = <&vdd_regulator>;
> > > > > > > +            avdd-supply = <&avdd_regulator>;
> > > > > > > +            hvdd-supply = <&hvdd_regulator>;
> > > > > > > +            hvss-supply = <&hvss_regulator>;
> > > > > > > +
> > > > > > > +            reset-gpios = <&gpio0 87 GPIO_ACTIVE_LOW>;
> > > > > > > +
> > > > > > > +            #address-cells = <1>;
> > > > > > > +            #size-cells = <0>;
> > > > > > > +
> > > > > > > +            channel@0 {
> > > > > > > +                reg = <0>;
> > > > > > > +                adi,output-range-microvolt = <0 5000000>;
> > > > > > > +            };
> > > > > > > +
> > > > > > > +            channel@1 {
> > > > > > > +                reg = <1>;
> > > > > > > +                adi,output-range-microvolt = <(-10000000) 10000000>;
> > > > > > > +            };
> > > > > > > +
> > > > > > > +            channel@2 {
> > > > > > > +                reg = <2>;
> > > > > > > +                adi,output-range-microvolt = <0 40000000>;
> > > > > > > +            };
> > > > > > > +        };
> > > > > > > +    };
> > > > > > ...
> > > > > > 
> > > > > > 	spi {
> > > > > > 		#address-cells = <1>;
> > > > > > 		#size-cells = <0>;
> > > > > > 
> > > > > > 		multi-dac@0 {
> > > > > > 			compatible = "adi,ad5529r-16";
> > > > > > 			reg = <0>;
> > > > > > 			spi-max-frequency = <25000000>;
> > > > > > 
> > > > > > 			#address-cells = <1>;
> > > > > > 			#size-cells = <0>;
> > > > > > 
> > > > > > 			dac@0 {
> > > > > > 				reg = <0>;
> > > > > > 				vdd-supply = <&vdd_regulator>;
> > > > > > 				avdd-supply = <&avdd_regulator>;
> > > > > > 				hvdd-supply = <&hvdd_regulator>;
> > > > > > 				hvss-supply = <&hvss_regulator>;
> > > > > > 
> > > > > > 				reset-gpios = <&gpio0 87 GPIO_ACTIVE_LOW>;
> > > > > > 
> > > > > > 				#address-cells = <1>;
> > > > > > 				#size-cells = <0>;
> > > > > > 
> > > > > > 				channel@0 {
> > > > > > 					reg = <0>;
> > > > > > 					adi,output-range-microvolt = <0 5000000>;
> > > > > > 				};
> > > > > > 
> > > > > > 				channel@1 {
> > > > > > 					reg = <1>;
> > > > > > 					adi,output-range-microvolt = <(-10000000) 10000000>;
> > > > > > 				};
> > > > > > 
> > > > > > 				channel@2 {
> > > > > > 					reg = <2>;
> > > > > > 					adi,output-range-microvolt = <0 40000000>;
> > > > > > 				};
> > > > > > 			}
> > > > > > 
> > > > > > 			dac@1 {
> > > > > > 				reg = <1>;
> > > > > > 				vdd-supply = <&vdd_regulator>;
> > > > > > 				avdd-supply = <&avdd_regulator>;
> > > > > > 				hvdd-supply = <&hvdd_regulator>;
> > > > > > 				hvss-supply = <&hvss_regulator>;
> > > > > > 
> > > > > > 				reset-gpios = <&gpio0 88 GPIO_ACTIVE_LOW>;
> > > > > > 
> > > > > > 				#address-cells = <1>;
> > > > > > 				#size-cells = <0>;
> > > > > > 
> > > > > > 				channel@0 {
> > > > > > 					reg = <0>;
> > > > > > 					adi,output-range-microvolt = <0 5000000>;
> > > > > > 				};
> > > > > > 
> > > > > > 				channel@1 {
> > > > > > 					reg = <1>;
> > > > > > 					adi,output-range-microvolt = <(-10000000) 10000000>;
> > > > > > 				};
> > > > > > 			}
> > > > > > 		};
> > > > > > 	};
> > > > > > 
> > > > > > then you might need something like:
> > > > > > 
> > > > > > 	patternProperties:
> > > > > > 		"^dac@[0-3]$":
> > > > > > 
> > > > > > and put most of the things under this node pattern.
> > > > > > 
> > > > > > So the main driver that you're putting together might need to handle up to four instances.
> > > > > > > Even if your current driver cannot handle this, the dt-bindings might need to cover that.
> > > > > > 
> > > > > > Need to double check if each dac node needs a separate compatible, so you would maybe populate
> > > > > > a platform data to be shared with the child nodes, which would be a separate driver.
> > > > > > (not sure if it would make sense to mix and match ad5529r-16 and ad5529r-12).
> > > > > Hi Rodrigo,
> > > > > 
> > > > > Thank you for looking at this.
> > > > > 
> > > > > For now, I would prefer to keep the binding scoped to a single AD5529R device instance. The current
> > > > > hardware/use case we have only needs one device node and the driver is written around that model as well.
> > > > > While the device addressing pins could allow multi-device topology, we do not have an actual platform using
> > > > > that configuration at the moment, so I would prefer not to introduce an extra parent/child binding structure
> > > > > speculatively without a validating use case.
> > > > Interesting feature - kind of similar to address control on a typical i2c bus device, or
> > > > looking at it another way a kind of distributed SPI mux.
> > > > 
> > > > Challenge of a binding is we need to anticipate the future.  So I think we do need something
> > > > like Rodrigo is suggesting even if we only (for now) support a single instance in the driver.
> > > > That would leave the path open to supporting the addressing at a later date.
> > > > An alternative might be to look at it like a chained device setup. In those we pretend there
> > > > is just one device with a lot of channels etc.  The snag is that here things are more loosely
> > > > coupled whereas for those devices it tends to be you have to read / write the same register
> > > > in all devices in the chain as one big SPI message.
> > > > 
> > > > > +CC Mark Brown as he may know of some precedent for this feature. For his reference..
> > > > > - Each of these devices has 2 ID pins.  The SPI transfers have to contain the 2 bit
> > > > value that matches that or they are ignored.  Thus a single bus + 1 chip select can
> > > > be used to talk to 4 devices.  Question is what that looks like in device tree + I guess
> > > > longer term how to support it cleanly in SPI.
> > 
> > I'd swear I have seen this before, from some Microchip devices. Let me
> > see if I can find what I am thinking of...
> 
> 
> microchip,mcp3911 and microchip,mcp3564 both seem to do this with
> slightly different properties.
> 
>   microchip,device-addr:
>     description: Device address when multiple MCP3911 chips are present on the same SPI bus.
>     $ref: /schemas/types.yaml#/definitions/uint32
>     enum: [0, 1, 2, 3]
>     default: 0
> 
> and
> 
> 
>   microchip,hw-device-address:
>     $ref: /schemas/types.yaml#/definitions/uint32
>     minimum: 0
>     maximum: 3
>     description:
>       The address is set on a per-device basis by fuses in the factory,
>       configured on request. If not requested, the fuses are set for 0x1.
>       The device address is part of the device markings to avoid
>       potential confusion. This address is coded on two bits, so four possible
>       addresses are available when multiple devices are present on the same
>       SPI bus with only one Chip Select line for all devices.
>       Each device communication starts by a CS falling edge, followed by the
>       clocking of the device address (BITS[7:6] - top two bits of COMMAND BYTE
>       which is first one on the wire).
> 
> This sounds exactly like the sort of feature that you're dealing with
> here?
> 

The core idea, yes, but for this chip things are a bit more annoying (but
Janani can correct me if I'm wrong). Here, each device can, in theory,
have its own supplies, pins and, at the very least, channels with maybe
different scales. That is why Janani is proposing dac nodes. Given that I
honestly don't much like that "adi,ad5529r-bus" compatible, I wondered
about solving this at the SPI level.

Ah, and to make it more annoying, we can also mix 12-bit and 16-bit
variants on the same bus.
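
For reference, the addressing scheme under discussion can be sketched as
follows. This is a minimal model that follows the mcp3564 binding text quoted
above (device address in BITS[7:6] of the first byte on the wire). The
AD5529R frame layout is not given in this thread, so the bit positions and
the helper names command_byte() and addressed_to() are assumptions made for
illustration only.

```python
ADDR_SHIFT = 6   # address sits in bits 7:6, per the quoted mcp3564 text
ADDR_MASK = 0x3  # two ID pins, so four possible device addresses


def command_byte(device_addr, command):
    """Build the first byte on the wire: 2-bit address plus 6-bit command."""
    if not 0 <= device_addr <= ADDR_MASK:
        raise ValueError("device address must fit in two bits")
    if not 0 <= command < (1 << ADDR_SHIFT):
        raise ValueError("command must fit in six bits")
    return (device_addr << ADDR_SHIFT) | command


def addressed_to(first_byte, my_addr):
    """Model of a device deciding whether a transfer is meant for it."""
    return ((first_byte >> ADDR_SHIFT) & ADDR_MASK) == my_addr
```

With one chip select shared by four devices, every device sees the same
first byte and only the one whose ID pins match reacts; the others ignore
the transfer.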

- Nuno Sá




* Re: [PATCH v4 00/31] Introduce SCMI Telemetry FS support
From: Cristian Marussi @ 2026-06-19 12:51 UTC
  To: David Hildenbrand (Arm)
  Cc: Cristian Marussi, Christian Brauner, linux-kernel,
	linux-arm-kernel, arm-scmi, linux-fsdevel, linux-doc,
	sudeep.holla, james.quinlan, f.fainelli, vincent.guittot,
	etienne.carriere, peng.fan, michal.simek, d-gole, jic23,
	elif.topuz, lukasz.luba, philip.radford, souvik.chakravarty,
	leitao, kas, puranjay, usama.arif, kernel-team
In-Reply-To: <0025b907-27b9-4a51-b78f-f8ad413644d0@kernel.org>

On Fri, Jun 19, 2026 at 12:16:58PM +0200, David Hildenbrand (Arm) wrote:
> 
> >> Is the configuration aspect limited to enabling selected events, or is there
> >> more that can be configured?

Hi,

> >>
> > 
> > The needed configuration is:
> > 
> >  - global Telemetry enable (tlm_enable)
> >  - global common update_interval (current_update_interval)
> 
> Okay, so simple global properties.
> 
> >  - per-DE enable/disable (des/0x<NNNN>/enable)
> >  - per-DE timestamping enable/disable (des/0x<NNNN>/tstamp_enable)
> > 
> >  ... then there are a couple of handy catch-all entries:
> > 	all_des_enable, all_des_tstamp_enable
> 
> Okay, so fairly trivial configs.

Yes, mostly on/off switches or single-value configs.

> > 
> > Note that all the existent DEs are discovered at runtime dynamically via
> > SCMI in the background at init/probe and then never change: i.e.
> > the tree is statically created upon discovery, user cannot
> > create/destroy or symlink files at will, nor the backend platform FW
> > running the SCMI server can pop-up new DataEvents after the initial
> > enumeration.
> 
> That makes sense.
> 
> > 
> > All the above configs can also be pre-defined in the FW (at built time)
> > as being default boot-on with predefined values, like a specific
> > boot-on update interval, so that you could have a system in which really
> > you dont need to configure anything...everything is on and you just
> > read data. (unless you want to change config of course...)
> 
> Okay, so the initial value of some parameters might not be "disabled" etc.

Yes, at the protocol layer I take care to look up all of these states at
init so that the initial states are consistent with what is exposed...

> 
> I guess, from a user space perspective, reading should be allowed by everyone
> but writing should be limited to root?
>

Yes, it is currently world readable and only root has write access by default,
BUT in this latest V4 I added (as asked by some internal team) handling of the
usual uid/gid/umask mount options so that a privileged user can change the ownership
policy at mount time. (FS_USERNS_MOUNT is not supported anyway, since it does not
make sense to support containers for SCMI Telemetry.)

> > 
> > There is more stuff that indeed is configurable per the SCMI spec
> > but these additional params are hidden into the SCMI Telemetry protocol
> > layer (the initial patches in this series) and NOT made available to
> > the driver/users of the protocol (like the SCMI FS driver that sits on
> > top)
> 
> Do you assume that there will get significantly more config options added in the
> future for user space to configure?

No, I don't think so...the only planned extensions were to support more
performant read access mechanisms, i.e. direct mmap'ability of FW/Kernel
SCMI Telemetry shared memory areas...BUT that will immediately dump the
bulk of the lower layer protocol work into the tools domain...and
we're not ready to do so...besides adding one more thing, the tool, to keep
in sync with possible future spec changes (unless exposing even more stuff
like tlm mem-areas accessors to the UAPI...that would be painful kernel
side and not desired AFAIU...)

> 
> > 
> > IOW, this humonguos series (~8k lines) is only partially composed by
> > the Filesystem driver (~3k): the bulk of the Telemetry logic and SCMI
> > message exchanges are contained in the SCMI Protocol stack which has
> > been extended to support the Telemerty protocol at first
> > (the 'firmware: arm_scmi:' initial patches).
> > 
> > This latter common support is exposed by the SCMI stack for the SCMI
> > drivers to use via custom per-protocol operations (not an orginal name :P)
> > exposed in include/linux/scmi_protocol.h
> > 
> > So when you write into FS to configure smth, you end up calling an internal
> > tlm_proto_ops that in turn will cause an SCMI message to be sent
> > (in some cases say to enable a DE or set the update interval)
> 
> Makes sense.
> 
> > 
> > When you read something, you end up calling another Telemetry operation
> > that in turn returns you the DataEvent value you were looking for...how
> > this is retrieved via SCMI in the background is transparent to the
> > FS driver because, again, these details are buried into the protocol
> > layer. Talking about reads, you can:
> > 
> >  - read a single value from des/0x<NNNN>/value
> >  - read ALL the currently enabled DE in a bulk read via des_bulk_read
> > 
> > ...most of the other entries in the tree are simply RO properties of the DEs
> > that have been discovered at enumeration time.
> 
> Is this bulk-reading relevant for performance or just a "nice to have" ?
> 

I suppose it depends on your usage pattern: it is definitely relevant
because the main collection mechanism is shared memory areas (SHMTIs)
between the platform firmware and the Kernel: such areas, being accessed
from 2 different worlds concurrently, come with an SCMI-specified
synchro/consistency mechanism based simply on a pair of sequence numbers
placed at the start and at the end of the SHMTI, so that the FW increases
such magic numbers in a well-known way before and after updating the SHMTI
values, so that the kernel can detect (without any interlocking mechanism)
if a platform write happened in the middle of its reads...

...so if you read one single DE 64bit value, under the hood the kernel
has to perform at least 3 reads from the SHMTI to check
the consistency of that single read...

...while if you do a bulk_read the overhead due to the consistency
checks gets 'spread' across a number of DEs because the kernel will snapshot
the whole SHMTIs (potentially KBytes) between the 2 consistency reads

...the good side effect of all of this is that I can leverage such
sequence numbers to optimize reads, i.e. do NOT even try to read anything
if the new sequence number is unchanged from the one I cached on the
last successful read of this value...
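
To make the idea a bit more concrete, here is a rough userspace-style sketch
of that kind of sequence-number check. It is NOT the SCMI SHMTI layout nor the
kernel code: FakeShmti, its field names, the update rule and the read ordering
are all assumptions made up for illustration. The writer bumps the leading
number before touching the values and the trailing one when done, so the
reader samples them in the opposite order and only trusts a snapshot when
both match.

```python
from dataclasses import dataclass, field


@dataclass
class FakeShmti:
    """Hypothetical stand-in for a shared telemetry area (not the SCMI layout)."""
    seq_head: int = 0                  # bumped by the writer BEFORE an update
    values: list = field(default_factory=lambda: [0, 0, 0, 0])
    seq_tail: int = 0                  # bumped by the writer AFTER an update

    def begin_update(self):
        self.seq_head += 1

    def end_update(self):
        self.seq_tail = self.seq_head


class TornRead(Exception):
    """Raised when no consistent snapshot could be taken within the retries."""


def bulk_read(shm, cached_seq=None, retries=3):
    """Return (seq, snapshot); snapshot is None if nothing changed since cached_seq."""
    for _ in range(retries):
        tail = shm.seq_tail            # consistency read 1: written last by writer
        if tail == cached_seq and shm.seq_head == tail:
            return tail, None          # unchanged since last good read: skip the copy
        snapshot = list(shm.values)    # one copy covers every DE in the area
        head = shm.seq_head            # consistency read 2: written first by writer
        if head == tail:
            return head, snapshot      # no update started or finished in between
    raise TornRead("writer kept updating the area")
```

With this model a single-value read costs the two sequence reads plus the
value itself, while a bulk read of N values costs N + 2, which is where the
'spread' overhead comes from.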

So in the end I would say it is NOT simply a nice to have, BUT it is
certainly only the first step towards a more performant alternative access
(like with mmaps)...it depends on the usage pattern...I am not sure which
mechanism is used more by our tools...

> 
> > 
> > Given that walking a FS tree and issuing configuration as writes is NOT
> > performant really (nor handy if you are not a human), currently, even
> > in this FS-based series you can really perform all of the discovery AND
> > the configuration tasks WITHOUT walking the filesystem tree, but instead
> > issuing a bunch of IOCTLs issued on a special 'control' file that I
> > embedded in the FS. Such UAPI IOCTLs described at:
> 
> Makes sense.
> 
> > 
> > https://lore.kernel.org/arm-scmi/20260612223802.1337232-6-cristian.marussi@arm.com/T/#u
> >  
> > So my plan of action in order to get rid of the FS in-kenel implementation
> > would be to drop this Filesystem in favour of simple character devices
> > and move the existent IOCTLs interface (revisited where needed) on top of
> > these devices: that way you will be able to use IOCTLs to enumerate the
> > Telemetry sources and then configure them.
> > 
> > Read will then happen (probably) leveraging a number of chardev fops like:
> > IOCTLs, .read and .mmap...up to the tool decide what to use.
> > 
> > After this porting to chardev is done, I would start optionally exposing
> > again all of this in a human-readable alternative way by adding a layer
> > of FUSE on top of this chardev interface.
> 
> Yes. How high-priority is the fs side? Or would a tool using a library to access
> this information also work in the first step?
> 

I have to sync with tools on this...because they are probably still
using the FS currently, but it was already planned for the future to move to
a more low level access (ioctl/mmap)...

...my aim would be, at this point, to favour this transition without suddenly
breaking their current world (and having to expatriate :P)

...from my personal point of view, I would certainly like to still have the
FUSE layer for ease of testing and verification on my side...but it is just
a nice to have...

> > 
> > Basically my aim is to drop the FS implementation from the kernel, as
> > advised, while trying to optionally make it still available via a userspace
> > FUSE implementation...IOW the intention would be for the next V5 to expose
> > the same interfaces as V4 but with the help of a tool instead that builds,
> > if wanted, a FUSE mount built on top of the chardev interface.

[snip]

> >>
> >> It's a good question how that could be done, if you need more information about
> >> these events from user space.
> > 
> > I have NOT really delved into that, so as of know we do NOT fed any data
> > to existing Kernel subsystems, not there is any available in-kernel
> > interface to consume DE data (nobody asked), but, I can imagine 2 solution:
> > 
> >  - our beloved architects decide to 'architect' more DataEvents in the
> >    next version of the spec.. i.e. they reserve some specific DE IDs to
> >    represent some well defined entity (like it is done already in the spec
> >    for a dozen IDs)...this avoids the needs of any new interface all
> >    together
> 
> That would be the cleanest solution :)
> 

Definitely agree.

> > 
> > OR
> > 
> > - we open some sort of user-->kernel ABI channel 'somewhere' where the
> >   userspace tool, interpreting the JSON description, can communicate something
> >   like " on this platform ID 1,2,3,4 should be fed to the IIO sensors frmwk
> >   too, while ID 39,8,76 can be fed to HWMON..." etc
> > 
> >>
> >> [...]
> >>
> >>
> >> That sounds reasonable.
> >>
> >> [...]
> >>

[snip]

> > Regarding the user concurrency, I have already explicitly pushed back on
> > this, our own tools team: any concurrent read or configuration write is
> > allowed and properly handled in a consistent way, BUT on the configuration
> > side the last write/ioctl wins: there is NO in-kernel OR userspace
> > co-ordination provided out of the box: IOW if you use multiple tools
> > concurrently to apply conflicting configurations, it is none of our problem
> 
> Would concurrent reading work? I assume so, right?
> 

Yes, concurrent reading is not a problem, and concurrent writes are
properly handled at the write/message level (i.e. no corruption) BUT
no co-ordination is provided by the kernel on those config writes:
last write wins.

> > 
> > ...similarly as if you have an actively running network configuration daemon
> > and you try to set your IP manually...nobody will prevent you from doing this,
> > the same netlink will be used freely by you on the shell and the daemon (if you
> > have enough privilege), but you will gonna have unexpected result...
> > 
> > I dont either see the case to enforce exclusive access for Telemetry resources:
> > co-ordination is up to the user in my view...I mean if you have 2 tools
> > configuring concurrently SCMI telemetry in a conflicting way something has been
> > misconfigured somewhere
> > 
> > .....having said that, I understand that the concurrency co-ordination
> > issue can be particularly tricky to spot and solve in userspace, so I DO
> > expose a generation counter entry that is updated on any configuration
> > change, so that a userspace app using Telemetry can monitor (poll) this
> > counter to spot if someone else on the system is quietly suddenly applying
> > configuration changes...
> 
> Okay, so a single writer (admin) changing stuff could get picked up my possibly
> many concurrent readers?

Mmm...not sure what you mean here...

If you configure your Telemetry as you desire and start collecting data via
readers, BUT then some other process changes configs under your feet, that is
allowed as said, and so your analysis could be impacted...(something turned off
as an example, or update interval changed)...

...so while this is NOT regulated/co-ordinated by the Kernel, in order to
ease the detection of such events by your reading process, I provide a pollable
entry that returns an integer and then blocks until such counter is next updated
by an intervening under-the-hood configuration change...so you can configure,
monitor the generation counter and then start reading your data, sure that you
will detect any conflicting re-config issued by a rogue process...
(and I still have to extend this event polling mechanism to use a user
provided eventfd...since it was NOT strictly needed...but now with the
IOCTLs interface I will add that too...)
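
Just to illustrate the pattern (this is a toy model, not the real entry or
its UAPI): the GenerationCounter class and its method names below are made up,
and a threading.Condition stands in for the blocking poll on the kernel side.

```python
import threading


class GenerationCounter:
    """Toy model of a config generation counter that readers can block on."""

    def __init__(self):
        self._cond = threading.Condition()
        self._gen = 0

    def read(self):
        """Return the current generation without blocking."""
        with self._cond:
            return self._gen

    def bump(self):
        """Called on every configuration change, whoever issued it."""
        with self._cond:
            self._gen += 1
            self._cond.notify_all()

    def wait_for_change(self, seen, timeout=None):
        """Block until the generation differs from 'seen'.

        Returns the new generation, or None if the timeout expired first.
        """
        with self._cond:
            changed = self._cond.wait_for(lambda: self._gen != seen, timeout)
            return self._gen if changed else None
```

A reader would configure, record read(), start collecting, and treat any
non-None result from wait_for_change() as a hint that some other process
re-configured Telemetry and the collected data may need to be re-validated.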

> 
> > 
> >>>
> >>> Should/could such a tool live in the kernel tree (tools/) at least for
> >>> ease of development/deployment ?
> >>
> >> I think OOT.
> >>
> > 
> > Ok.
> > 
> > Sorry for the long email..I hope I have clarified the situation, anyway
> > I am already moving to get rid of the in-kernel interface as advised in
> > favour of a chardev kernel interface and an optional FUSE based FS...
> 
> Yes, thank you a lot, I hope it also helps Christian to help push this into the
> right direction!
>

Thanks a lot, David !
Cristian


^ permalink raw reply

* Re: [PATCH v8 05/46] KVM: Make CONFIG_KVM_VM_MEMORY_ATTRIBUTES selectable
From: Julian Braha @ 2026-06-19 12:51 UTC (permalink / raw)
  To: ackerleytng, aik, andrew.jones, binbin.wu, brauner, chao.p.peng,
	david, jmattson, jthoughton, michael.roth, oupton, pankaj.gupta,
	qperret, rick.p.edgecombe, rientjes, shivankg, steven.price,
	tabba, willy, wyihan, yan.y.zhao, forkloop, pratyush,
	suzuki.poulose, aneesh.kumar, liam, Paolo Bonzini,
	Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen,
	Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt,
	Kiryl Shutsemau, Baoquan He, Jason Gunthorpe, Vlastimil Babka
  Cc: kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-5-9d2959357853@google.com>

Hi Ackerley,

On 6/19/26 01:31, Ackerley Tng via B4 Relay wrote:

>  config KVM_VM_MEMORY_ATTRIBUTES
> -	bool
> +	depends on KVM_SW_PROTECTED_VM || KVM_INTEL_TDX || KVM_AMD_SEV
> +	bool "Enable per-VM PRIVATE vs. SHARED attributes (for CoCo VMs)"

Sorry for the style nitpick, but could you keep the type and prompt as
the first attribute in the Kconfig option definition (like the other
options do)?

- Julian Braha

^ permalink raw reply

* Re: [PATCH v8 00/46] guest_memfd: In-place conversion support
From: Garg, Shivank @ 2026-06-19 12:28 UTC (permalink / raw)
  To: ackerleytng, aik, andrew.jones, binbin.wu, brauner, chao.p.peng,
	david, jmattson, jthoughton, michael.roth, oupton, pankaj.gupta,
	qperret, rick.p.edgecombe, rientjes, steven.price, tabba, willy,
	wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
	aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
	Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
	Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
	Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
	Baoquan He, Jason Gunthorpe, Vlastimil Babka
  Cc: kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-0-9d2959357853@google.com>



On 6/19/2026 6:01 AM, Ackerley Tng via B4 Relay wrote:
> This is v8 of guest_memfd in-place conversion support.
> 
> Up till now, guest_memfd supports the entire inode worth of memory being
> used as all-shared, or all-private. CoCo VMs may request guest memory to be
> converted between private and shared states, and the only way to support
> that currently would be to have the userspace VMM provide two sources of
> backing memory from completely different areas of physical memory.
> 
> pKVM has a use case for in-place sharing: the guest and host may be
> cooperating on given data, and pKVM doesn't protect data through
> encryption, so copying that given data between different areas of physical
> memory as part of conversions would be unnecessary work.
> 
> This series also serves as a foundation for guest_memfd huge page
> support. Now, guest_memfd only supports PAGE_SIZE pages, so if two sources
> of backing memory are used, the userspace VMM could maintain a steady total
> memory utilized by punching out the pages that are not used. When huge
> pages are available in guest_memfd, even if the backing memory source
> supports hole punching within a huge page, punching out pages to maintain
> the total memory utilized by a VM would be introducing lots of
> fragmentation.
> 
> In-place conversion avoids fragmentation by allowing the same physical
> memory to be used for both shared and private memory, with guest_memfd
> tracks the shared/private status of all the pages at a per-page
> granularity.
> 
> The central principle, which guest_memfd continues to uphold, is that any
> guest-private page will not be mappable to host userspace. All pages will
> be mmap()-able in host userspace, but accesses to guest-private pages (as
> tracked by guest_memfd) will result in a SIGBUS.
> 
> This series introduces a guest_memfd ioctl (not kvm, vm or vcpu, but
> guest_memfd ioctl) that allows userspace to set memory
> attributes (shared/private) directly through the guest_memfd. This is the
> appropriate interface because shared/private-ness is a property of memory
> and hence the request should be sent directly to the memory provider -
> guest_memfd.
> 
> Tested with both CONFIG_KVM_VM_MEMORY_ATTRIBUTES enabled and disabled:
> 
> + tools/testing/selftests/kvm/guest_memfd_test.c
> + tools/testing/selftests/kvm/pre_fault_memory_test.c
> + tools/testing/selftests/kvm/x86/guest_memfd_conversions_test.c
> + tools/testing/selftests/kvm/x86/private_mem_conversions_test.c
> + tools/testing/selftests/kvm/x86/private_mem_kvm_exits_test.c
> 
> Updates for this revision:
> 
> + Updated the series to _not_ deprecate all of VM memory attributes, but
>   only deprecate tracking of the PRIVATE attributes in VM memory
>   attributes. This takes into account upcoming RWX attributes support,
>   which will be tracked at the VM level.
> + Reshuffled the earlier commits that deal with preparing KVM to stop
>   seeing VM memory attributes as the only source of attributes.
> + Addressed comments from v7
> 
> TODOs
> 
> + Retest with TDX selftests. v7 was tested with TDX [12], but the setup there was
>   wrong. Conversions were successful (no errors), but the shared memory being
>   tested is actually in a completely different host physical page.
> + Retest with SNP selftests. v6 was tested with SNP, I ported that to v7
>   and those ran fine too. Just need to double-check for v8.
> 
> This series is based on kvm-x86/next, and here's the tree for your convenience:
> 
> https://github.com/googleprodkernel/linux-cc/commits/guest_memfd-inplace-conversion-v8
> 
> Older series:
> 
> + RFCv7 is at [11]
> + RFCv6 is at [10]
> + RFCv5 is at [8]
> + RFCv4 is at [7]
> + RFCv3 is at [6]
> + RFCv2 is at [5]
> + RFCv1 is at [4]
> + Previous versions of this feature, part of other series, are available at
>   [1][2][3].
> 
> [1] https://lore.kernel.org/all/bd163de3118b626d1005aa88e71ef2fb72f0be0f.1726009989.git.ackerleytng@google.com/
> [2] https://lore.kernel.org/all/20250117163001.2326672-6-tabba@google.com/
> [3] https://lore.kernel.org/all/b784326e9ccae6a08388f1bf39db70a2204bdc51.1747264138.git.ackerleytng@google.com/
> [4] https://lore.kernel.org/all/cover.1760731772.git.ackerleytng@google.com/T/
> [5] https://lore.kernel.org/all/cover.1770071243.git.ackerleytng@google.com/T/
> [6] https://lore.kernel.org/r/20260313-gmem-inplace-conversion-v3-0-5fc12a70ec89@google.com/T/
> [7] https://lore.kernel.org/all/20260326-gmem-inplace-conversion-v4-0-e202fe950ffd@google.com/T/
> [8] https://lore.kernel.org/r/20260428-gmem-inplace-conversion-v5-0-d8608ccfca22@google.com
> [9] https://lore.kernel.org/all/20260414-selftest-global-metadata-v1-0-fd223922bc57@google.com/T/
> [10] https://lore.kernel.org/r/20260507-gmem-inplace-conversion-v6-0-91ab5a8b19a4@google.com
> [11] https://lore.kernel.org/r/20260522-gmem-inplace-conversion-v7-0-2f0fae496530@google.com
> [12] https://lore.kernel.org/all/20260605134153.204152-1-ackerleytng@google.com/
> 
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> ---
> Ackerley Tng (27):
>       KVM: Make CONFIG_KVM_VM_MEMORY_ATTRIBUTES selectable
>       KVM: Enumerate support for PRIVATE memory iff kvm_arch_has_private_mem is defined
>       KVM: guest_memfd: Introduce function to check GFN private/shared status
>       KVM: guest_memfd: Only prepare folios for private pages
>       KVM: guest_memfd: Add base support for KVM_SET_MEMORY_ATTRIBUTES2
>       KVM: guest_memfd: Ensure pages are not in use before conversion
>       KVM: guest_memfd: Call arch invalidate hooks on conversion
>       KVM: guest_memfd: Return early if range already has requested attributes
>       KVM: guest_memfd: Advertise KVM_SET_MEMORY_ATTRIBUTES2 ioctl
>       KVM: guest_memfd: Handle lru_add fbatch refcounts during conversion safety check
>       KVM: guest_memfd: Use actual size for invalidation in kvm_gmem_release()
>       KVM: guest_memfd: Determine invalidation filter from memory attributes
>       KVM: guest_memfd: Zero page while getting pfn
>       KVM: TDX: Make source page optional for KVM_TDX_INIT_MEM_REGION
>       KVM: guest_memfd: Make in-place conversion the default
>       KVM: selftests: Test basic single-page conversion flow
>       KVM: selftests: Test conversion flow when INIT_SHARED
>       KVM: selftests: Test conversion precision in guest_memfd
>       KVM: selftests: Test conversion before allocation
>       KVM: selftests: Convert with allocated folios in different layouts
>       KVM: selftests: Test that truncation does not change shared/private status
>       KVM: selftests: Add helpers to pin pages with CONFIG_GUP_TEST
>       KVM: selftests: Test conversion with elevated page refcount
>       KVM: selftests: Reset shared memory after hole-punching
>       KVM: selftests: Provide function to look up guest_memfd details from gpa
>       KVM: selftests: Make TEST_EXPECT_SIGBUS thread-safe
>       KVM: selftests: Update private_mem_conversions_test to mmap() guest_memfd
> 
> Michael Roth (1):
>       KVM: SEV: Make 'uaddr' parameter optional for KVM_SEV_SNP_LAUNCH_UPDATE
> 
> Sean Christopherson (18):
>       KVM: guest_memfd: Introduce per-gmem attributes, use to guard user mappings
>       KVM: Rename KVM_GENERIC_MEMORY_ATTRIBUTES to KVM_VM_MEMORY_ATTRIBUTES
>       KVM: Move KVM_VM_MEMORY_ATTRIBUTES config definition to x86
>       KVM: Decouple kvm_has_arch_private_mem from CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>       KVM: Rename memory attribute APIs to prepare for in-place gmem conversion
>       KVM: Provide generic interface for checking memory private/shared status
>       KVM: guest_memfd: Wire up core private/shared attribute interfaces
>       KVM: Consolidate private memory and guest_memfd ifdeffery in kvm_host.h
>       KVM: guest_memfd: Enable INIT_SHARED on guest_memfd for x86 Coco VMs
>       KVM: selftests: Create gmem fd before "regular" fd when adding memslot
>       KVM: selftests: Rename guest_memfd{,_offset} to gmem_{fd,offset}
>       KVM: selftests: Add support for mmap() on guest_memfd in core library
>       KVM: selftests: Add selftests global for guest memory attributes capability
>       KVM: selftests: Add helpers for calling ioctls on guest_memfd
>       KVM: selftests: Test that shared/private status is consistent across processes
>       KVM: selftests: Provide common function to set memory attributes
>       KVM: selftests: Check fd/flags provided to mmap() when setting up memslot
>       KVM: selftests: Update private memory exits test to work with per-gmem attributes
> 

Hi,

Thanks for this series.
This works well for me on AMD EPYC 7713 (SEV-SNP enabled). I tested:
1. KVM selftests: all tests pass.
2. Using in-place conversion QEMU branch [1]:
qemu-system-x86_64 \
  -machine q35,confidential-guest-support=sev0 \
  -enable-kvm -cpu EPYC-v4 -smp 8,maxcpus=8 -m 120G -no-reboot \
  -object memory-backend-guest-memfd,id=ram0,size=60G,share=on,host-nodes=0-1,policy=interleave \
  -object memory-backend-guest-memfd,id=ram1,size=60G,share=on,host-nodes=0,policy=bind \
  -numa node,nodeid=0,memdev=ram0,cpus=0-3 \
  -numa node,nodeid=1,memdev=ram1,cpus=4-7 \
  -object sev-snp-guest,id=sev0,policy=0x30000,cbitpos=51,reduced-phys-bits=1,convert-in-place=on \
  -bios "$OVMF" \
  -drive file="$DISK",if=none,id=disk0,format=qcow2 \
  -device virtio-scsi-pci,id=scsi0,disable-legacy=on,iommu_platform=true -device scsi-hd,drive=disk0 \
  -netdev user,id=net0,hostfwd=tcp::8000-:22 -device virtio-net-pci,netdev=net0 \
  -kernel "$KERNEL" -initrd "$INITRD" \
  -append "$ROOT ro console=ttyS0,115200" \
  -trace enable=kvm_convert_memory,file=/tmp/convert.log \
  -nographic -serial mon:stdio

   The guest boots successfully and runs a memory hogger. With this, I verified the
   shared <-> private conversion logs (trace_kvm_convert_memory).

3. Additionally, verified the NUMA placement for SEV-SNP. With this series,
   NUMA mempolicy support for guest_memfd [2] now works for SEV-SNP as well.

[1] https://github.com/amdese/qemu/commits/snp-inplace-rfc1
[2] https://lore.kernel.org/kvm/20251016172853.52451-1-seanjc@google.com

Tested-by: Shivank Garg <shivankg@amd.com>

Best regards,
Shivank

^ permalink raw reply

* Re: [PATCH v3 1/2] dt-bindings: iio: dac: Add AD5529R
From: Conor Dooley @ 2026-06-19 11:40 UTC (permalink / raw)
  To: Janani Sunil
  Cc: Jonathan Cameron, Rodrigo Alencar, Janani Sunil,
	Lars-Peter Clausen, Michael Hennerich, David Lechner,
	Nuno Sá, Andy Shevchenko, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Philipp Zabel, Jonathan Corbet, Shuah Khan,
	linux-iio, devicetree, linux-kernel, linux-doc, Mark Brown
In-Reply-To: <20260619-obstinate-polo-a230bef97fda@spud>

On Fri, Jun 19, 2026 at 12:36:55PM +0100, Conor Dooley wrote:
> On Fri, Jun 19, 2026 at 12:33:11PM +0200, Janani Sunil wrote:
> > 
> > On 6/14/26 21:44, Jonathan Cameron wrote:
> > > On Tue, 9 Jun 2026 16:47:23 +0200
> > > Janani Sunil <jan.sun97@gmail.com> wrote:
> > > 
> > > > On 5/26/26 15:11, Rodrigo Alencar wrote:
> > > > > On 26/05/19 05:42PM, Janani Sunil wrote:
> > > > > > Devicetree bindings for AD5529R 16 channel 12/16 bit high voltage,
> > > > > > buffered voltage output digital-to-analog converter (DAC) with an
> > > > > > integrated precision reference.
> > > > > ...
> > > > > Probably others may comment on that, but...
> > > > > 
> > > > > This parent node may support device addressing for multi-device support through
> > > > > those ID pins. I suppose that each device may have its own power supplies or
> > > > > other resources like the toggle pins or reset and enable.
> > > > > 
> > > > > That way I suppose that an example would look like...
> > > > > > +
> > > > > > +patternProperties:
> > > > > > +  "^channel@([0-9]|1[0-5])$":
> > > > > > +    type: object
> > > > > > +    description: Child nodes for individual channel configuration
> > > > > > +
> > > > > > +    properties:
> > > > > > +      reg:
> > > > > > +        description: Channel number.
> > > > > > +        minimum: 0
> > > > > > +        maximum: 15
> > > > > > +
> > > > > > +      adi,output-range-microvolt:
> > > > > > +        description: |
> > > > > > +          Output voltage range for this channel as [min, max] in microvolts.
> > > > > > +          If not specified, defaults to 0V to 5V range.
> > > > > > +        oneOf:
> > > > > > +          - items:
> > > > > > +              - const: 0
> > > > > > +              - enum: [5000000, 10000000, 20000000, 40000000]
> > > > > > +          - items:
> > > > > > +              - const: -5000000
> > > > > > +              - const: 5000000
> > > > > > +          - items:
> > > > > > +              - const: -10000000
> > > > > > +              - const: 10000000
> > > > > > +          - items:
> > > > > > +              - const: -15000000
> > > > > > +              - const: 15000000
> > > > > > +          - items:
> > > > > > +              - const: -20000000
> > > > > > +              - const: 20000000
> > > > > > +
> > > > > > +    required:
> > > > > > +      - reg
> > > > > > +
> > > > > > +    additionalProperties: false
> > > > > > +
> > > > > > +required:
> > > > > > +  - compatible
> > > > > > +  - reg
> > > > > > +  - vdd-supply
> > > > > > +  - avdd-supply
> > > > > > +  - hvdd-supply
> > > > > > +
> > > > > > +dependencies:
> > > > > > +  spi-cpha: [ spi-cpol ]
> > > > > > +  spi-cpol: [ spi-cpha ]
> > > > > > +
> > > > > > +allOf:
> > > > > > +  - $ref: /schemas/spi/spi-peripheral-props.yaml#
> > > > > > +
> > > > > > +unevaluatedProperties: false
> > > > > > +
> > > > > > +examples:
> > > > > > +  - |
> > > > > > +    #include <dt-bindings/gpio/gpio.h>
> > > > > > +
> > > > > > +    spi {
> > > > > > +        #address-cells = <1>;
> > > > > > +        #size-cells = <0>;
> > > > > > +
> > > > > > +        dac@0 {
> > > > > > +            compatible = "adi,ad5529r-16";
> > > > > > +            reg = <0>;
> > > > > > +            spi-max-frequency = <25000000>;
> > > > > > +
> > > > > > +            vdd-supply = <&vdd_regulator>;
> > > > > > +            avdd-supply = <&avdd_regulator>;
> > > > > > +            hvdd-supply = <&hvdd_regulator>;
> > > > > > +            hvss-supply = <&hvss_regulator>;
> > > > > > +
> > > > > > +            reset-gpios = <&gpio0 87 GPIO_ACTIVE_LOW>;
> > > > > > +
> > > > > > +            #address-cells = <1>;
> > > > > > +            #size-cells = <0>;
> > > > > > +
> > > > > > +            channel@0 {
> > > > > > +                reg = <0>;
> > > > > > +                adi,output-range-microvolt = <0 5000000>;
> > > > > > +            };
> > > > > > +
> > > > > > +            channel@1 {
> > > > > > +                reg = <1>;
> > > > > > +                adi,output-range-microvolt = <(-10000000) 10000000>;
> > > > > > +            };
> > > > > > +
> > > > > > +            channel@2 {
> > > > > > +                reg = <2>;
> > > > > > +                adi,output-range-microvolt = <0 40000000>;
> > > > > > +            };
> > > > > > +        };
> > > > > > +    };
> > > > > ...
> > > > > 
> > > > > 	spi {
> > > > > 		#address-cells = <1>;
> > > > > 		#size-cells = <0>;
> > > > > 
> > > > > 		multi-dac@0 {
> > > > > 			compatible = "adi,ad5529r-16";
> > > > > 			reg = <0>;
> > > > > 			spi-max-frequency = <25000000>;
> > > > > 
> > > > > 			#address-cells = <1>;
> > > > > 			#size-cells = <0>;
> > > > > 
> > > > > 			dac@0 {
> > > > > 				reg = <0>;
> > > > > 				vdd-supply = <&vdd_regulator>;
> > > > > 				avdd-supply = <&avdd_regulator>;
> > > > > 				hvdd-supply = <&hvdd_regulator>;
> > > > > 				hvss-supply = <&hvss_regulator>;
> > > > > 
> > > > > 				reset-gpios = <&gpio0 87 GPIO_ACTIVE_LOW>;
> > > > > 
> > > > > 				#address-cells = <1>;
> > > > > 				#size-cells = <0>;
> > > > > 
> > > > > 				channel@0 {
> > > > > 					reg = <0>;
> > > > > 					adi,output-range-microvolt = <0 5000000>;
> > > > > 				};
> > > > > 
> > > > > 				channel@1 {
> > > > > 					reg = <1>;
> > > > > 					adi,output-range-microvolt = <(-10000000) 10000000>;
> > > > > 				};
> > > > > 
> > > > > 				channel@2 {
> > > > > 					reg = <2>;
> > > > > 					adi,output-range-microvolt = <0 40000000>;
> > > > > 				};
> > > > > 			}
> > > > > 
> > > > > 			dac@1 {
> > > > > 				reg = <1>;
> > > > > 				vdd-supply = <&vdd_regulator>;
> > > > > 				avdd-supply = <&avdd_regulator>;
> > > > > 				hvdd-supply = <&hvdd_regulator>;
> > > > > 				hvss-supply = <&hvss_regulator>;
> > > > > 
> > > > > 				reset-gpios = <&gpio0 88 GPIO_ACTIVE_LOW>;
> > > > > 
> > > > > 				#address-cells = <1>;
> > > > > 				#size-cells = <0>;
> > > > > 
> > > > > 				channel@0 {
> > > > > 					reg = <0>;
> > > > > 					adi,output-range-microvolt = <0 5000000>;
> > > > > 				};
> > > > > 
> > > > > 				channel@1 {
> > > > > 					reg = <1>;
> > > > > 					adi,output-range-microvolt = <(-10000000) 10000000>;
> > > > > 				};
> > > > > 			}
> > > > > 		};
> > > > > 	};
> > > > > 
> > > > > then you might need something like:
> > > > > 
> > > > > 	patternProperties:
> > > > > 		"^dac@[0-3]$":
> > > > > 
> > > > > and put most of the things under this node pattern.
> > > > > 
> > > > > So the main driver that you're putting together might need to handle up to four instances.
> > > > > > Even if your current driver cannot handle this, the dt-bindings might need to cover that.
> > > > > 
> > > > > Need to double check if each dac node needs a separate compatible, so you would maybe populate
> > > > > a platform data to be shared with the child nodes, which would be a separate driver.
> > > > > (not sure if it would make sense to mix and match ad5529r-16 and ad5529r-12).
> > > > Hi Rodrigo,
> > > > 
> > > > Thank you for looking at this.
> > > > 
> > > > For now, I would prefer to keep the binding scoped to a single AD5529R device instance. The current
> > > > hardware/use case we have only needs one device node and the driver is written around that model as well.
> > > > While the device addressing pins could allow multi-device topology, we do not have an actual platform using
> > > > that configuration at the moment, so I would prefer not to introduce an extra parent/child binding structure
> > > > speculatively without a validating use case.
> > > Interesting feature - kind of similar to address control on a typical i2c bus device, or
> > > looking at it another way a kind of distributed SPI mux.
> > > 
> > > Challenge of a binding is we need to anticipate the future.  So I think we do need something
> > > like Rodrigo is suggesting even if we only (for now) support a single instance in the driver.
> > > That would leave the path open to supporting the addressing at a later date.
> > > An alternative might be to look at it like a chained device setup. In those we pretend there
> > > is just one device with a lot of channels etc.  The snag is that here things are more loosely
> > > coupled whereas for those devices it tends to be you have to read / write the same register
> > > in all devices in the chain as one big SPI message.
> > > 
> > > > +CC Mark Brown as he may know of some precedent for this feature. For his reference:
> > > > - Each of these devices has 2 ID pins.  The SPI transfers have to contain the 2-bit
> > > value that matches that or they are ignored.  Thus a single bus + 1 chip select can
> > > be used to talk to 4 devices.  Question is what that looks like in device tree + I guess
> > > longer term how to support it cleanly in SPI.
> 
> I'd swear I have seen this before, from some Microchip devices. Let me
> see if I can find what I am thinking of...


microchip,mcp3911 and microchip,mcp3564 both seem to do this with
slightly different properties.

  microchip,device-addr:
    description: Device address when multiple MCP3911 chips are present on the same SPI bus.
    $ref: /schemas/types.yaml#/definitions/uint32
    enum: [0, 1, 2, 3]
    default: 0

and


  microchip,hw-device-address:
    $ref: /schemas/types.yaml#/definitions/uint32
    minimum: 0
    maximum: 3
    description:
      The address is set on a per-device basis by fuses in the factory,
      configured on request. If not requested, the fuses are set for 0x1.
      The device address is part of the device markings to avoid
      potential confusion. This address is coded on two bits, so four possible
      addresses are available when multiple devices are present on the same
      SPI bus with only one Chip Select line for all devices.
      Each device communication starts by a CS falling edge, followed by the
      clocking of the device address (BITS[7:6] - top two bits of COMMAND BYTE
      which is first one on the wire).

This sounds exactly like the sort of feature that you're dealing with
here?
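
To make the addressing scheme in the MCP3564 description concrete, here is a
minimal sketch of packing a 2-bit device address into bits [7:6] of the first
byte on the wire. The helper names and the 6-bit command field are assumptions
for illustration only; nothing in this thread confirms that the AD5529R uses
the same byte layout.

```c
#include <stdint.h>

/* Hypothetical names; the 6-bit command field width is an assumption. */
#define DEV_ADDR_SHIFT 6
#define DEV_ADDR_MASK  0xC0u
#define CMD_MASK       0x3Fu

/* Returns 0 on success, -1 if the address does not fit in two bits. */
int build_command_byte(unsigned int dev_addr, uint8_t cmd, uint8_t *out)
{
	if (dev_addr > 3)
		return -1;

	*out = (uint8_t)((dev_addr << DEV_ADDR_SHIFT) | (cmd & CMD_MASK));
	return 0;
}

/* A device only reacts when the address bits match its own address. */
int command_is_for_device(uint8_t first_byte, unsigned int dev_addr)
{
	return ((first_byte & DEV_ADDR_MASK) >> DEV_ADDR_SHIFT) == dev_addr;
}
```

With a layout like this, four devices can sit behind one chip select because
each one ignores transfers whose top two bits do not match its own address.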



* Re: [PATCH v3 1/2] dt-bindings: iio: dac: Add AD5529R
From: Conor Dooley @ 2026-06-19 11:36 UTC (permalink / raw)
  To: Janani Sunil
  Cc: Jonathan Cameron, Rodrigo Alencar, Janani Sunil,
	Lars-Peter Clausen, Michael Hennerich, David Lechner,
	Nuno Sá, Andy Shevchenko, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Philipp Zabel, Jonathan Corbet, Shuah Khan,
	linux-iio, devicetree, linux-kernel, linux-doc, Mark Brown
In-Reply-To: <076d7d2d-81a0-49c2-af94-bd65ead66c09@gmail.com>

On Fri, Jun 19, 2026 at 12:33:11PM +0200, Janani Sunil wrote:
> 
> On 6/14/26 21:44, Jonathan Cameron wrote:
> > On Tue, 9 Jun 2026 16:47:23 +0200
> > Janani Sunil <jan.sun97@gmail.com> wrote:
> > 
> > > On 5/26/26 15:11, Rodrigo Alencar wrote:
> > > > On 26/05/19 05:42PM, Janani Sunil wrote:
> > > > > Devicetree bindings for AD5529R 16 channel 12/16 bit high voltage,
> > > > > buffered voltage output digital-to-analog converter (DAC) with an
> > > > > integrated precision reference.
> > > > ...
> > > > Probably others may comment on that, but...
> > > > 
> > > > This parent node may support device addressing for multi-device support through
> > > > those ID pins. I suppose that each device may have its own power supplies or
> > > > other resources like the toggle pins or reset and enable.
> > > > 
> > > > That way I suppose that an example would look like...
> > > > > +
> > > > > +patternProperties:
> > > > > +  "^channel@([0-9]|1[0-5])$":
> > > > > +    type: object
> > > > > +    description: Child nodes for individual channel configuration
> > > > > +
> > > > > +    properties:
> > > > > +      reg:
> > > > > +        description: Channel number.
> > > > > +        minimum: 0
> > > > > +        maximum: 15
> > > > > +
> > > > > +      adi,output-range-microvolt:
> > > > > +        description: |
> > > > > +          Output voltage range for this channel as [min, max] in microvolts.
> > > > > +          If not specified, defaults to 0V to 5V range.
> > > > > +        oneOf:
> > > > > +          - items:
> > > > > +              - const: 0
> > > > > +              - enum: [5000000, 10000000, 20000000, 40000000]
> > > > > +          - items:
> > > > > +              - const: -5000000
> > > > > +              - const: 5000000
> > > > > +          - items:
> > > > > +              - const: -10000000
> > > > > +              - const: 10000000
> > > > > +          - items:
> > > > > +              - const: -15000000
> > > > > +              - const: 15000000
> > > > > +          - items:
> > > > > +              - const: -20000000
> > > > > +              - const: 20000000
> > > > > +
> > > > > +    required:
> > > > > +      - reg
> > > > > +
> > > > > +    additionalProperties: false
> > > > > +
> > > > > +required:
> > > > > +  - compatible
> > > > > +  - reg
> > > > > +  - vdd-supply
> > > > > +  - avdd-supply
> > > > > +  - hvdd-supply
> > > > > +
> > > > > +dependencies:
> > > > > +  spi-cpha: [ spi-cpol ]
> > > > > +  spi-cpol: [ spi-cpha ]
> > > > > +
> > > > > +allOf:
> > > > > +  - $ref: /schemas/spi/spi-peripheral-props.yaml#
> > > > > +
> > > > > +unevaluatedProperties: false
> > > > > +
> > > > > +examples:
> > > > > +  - |
> > > > > +    #include <dt-bindings/gpio/gpio.h>
> > > > > +
> > > > > +    spi {
> > > > > +        #address-cells = <1>;
> > > > > +        #size-cells = <0>;
> > > > > +
> > > > > +        dac@0 {
> > > > > +            compatible = "adi,ad5529r-16";
> > > > > +            reg = <0>;
> > > > > +            spi-max-frequency = <25000000>;
> > > > > +
> > > > > +            vdd-supply = <&vdd_regulator>;
> > > > > +            avdd-supply = <&avdd_regulator>;
> > > > > +            hvdd-supply = <&hvdd_regulator>;
> > > > > +            hvss-supply = <&hvss_regulator>;
> > > > > +
> > > > > +            reset-gpios = <&gpio0 87 GPIO_ACTIVE_LOW>;
> > > > > +
> > > > > +            #address-cells = <1>;
> > > > > +            #size-cells = <0>;
> > > > > +
> > > > > +            channel@0 {
> > > > > +                reg = <0>;
> > > > > +                adi,output-range-microvolt = <0 5000000>;
> > > > > +            };
> > > > > +
> > > > > +            channel@1 {
> > > > > +                reg = <1>;
> > > > > +                adi,output-range-microvolt = <(-10000000) 10000000>;
> > > > > +            };
> > > > > +
> > > > > +            channel@2 {
> > > > > +                reg = <2>;
> > > > > +                adi,output-range-microvolt = <0 40000000>;
> > > > > +            };
> > > > > +        };
> > > > > +    };
> > > > ...
> > > > 
> > > > 	spi {
> > > > 		#address-cells = <1>;
> > > > 		#size-cells = <0>;
> > > > 
> > > > 		multi-dac@0 {
> > > > 			compatible = "adi,ad5529r-16";
> > > > 			reg = <0>;
> > > > 			spi-max-frequency = <25000000>;
> > > > 
> > > > 			#address-cells = <1>;
> > > > 			#size-cells = <0>;
> > > > 
> > > > 			dac@0 {
> > > > 				reg = <0>;
> > > > 				vdd-supply = <&vdd_regulator>;
> > > > 				avdd-supply = <&avdd_regulator>;
> > > > 				hvdd-supply = <&hvdd_regulator>;
> > > > 				hvss-supply = <&hvss_regulator>;
> > > > 
> > > > 				reset-gpios = <&gpio0 87 GPIO_ACTIVE_LOW>;
> > > > 
> > > > 				#address-cells = <1>;
> > > > 				#size-cells = <0>;
> > > > 
> > > > 				channel@0 {
> > > > 					reg = <0>;
> > > > 					adi,output-range-microvolt = <0 5000000>;
> > > > 				};
> > > > 
> > > > 				channel@1 {
> > > > 					reg = <1>;
> > > > 					adi,output-range-microvolt = <(-10000000) 10000000>;
> > > > 				};
> > > > 
> > > > 				channel@2 {
> > > > 					reg = <2>;
> > > > 					adi,output-range-microvolt = <0 40000000>;
> > > > 				};
> > > > 			}
> > > > 
> > > > 			dac@1 {
> > > > 				reg = <1>;
> > > > 				vdd-supply = <&vdd_regulator>;
> > > > 				avdd-supply = <&avdd_regulator>;
> > > > 				hvdd-supply = <&hvdd_regulator>;
> > > > 				hvss-supply = <&hvss_regulator>;
> > > > 
> > > > 				reset-gpios = <&gpio0 88 GPIO_ACTIVE_LOW>;
> > > > 
> > > > 				#address-cells = <1>;
> > > > 				#size-cells = <0>;
> > > > 
> > > > 				channel@0 {
> > > > 					reg = <0>;
> > > > 					adi,output-range-microvolt = <0 5000000>;
> > > > 				};
> > > > 
> > > > 				channel@1 {
> > > > 					reg = <1>;
> > > > 					adi,output-range-microvolt = <(-10000000) 10000000>;
> > > > 				};
> > > > 			}
> > > > 		};
> > > > 	};
> > > > 
> > > > then you might need something like:
> > > > 
> > > > 	patternProperties:
> > > > 		"^dac@[0-3]$":
> > > > 
> > > > and put most of the things under this node pattern.
> > > > 
> > > > So the main driver that you're putting together might need to handle up to four instances.
> > > > > Even if your current driver cannot handle this, the dt-bindings might need to cover that.
> > > > 
> > > > Need to double check if each dac node needs a separate compatible, so you would maybe populate
> > > > a platform data to be shared with the child nodes, which would be a separate driver.
> > > > (not sure if it would make sense to mix and match ad5529r-16 and ad5529r-12).
> > > Hi Rodrigo,
> > > 
> > > Thank you for looking at this.
> > > 
> > > For now, I would prefer to keep the binding scoped to a single AD5529R device instance. The current
> > > hardware/use case we have only needs one device node and the driver is written around that model as well.
> > > While the device addressing pins could allow multi-device topology, we do not have an actual platform using
> > > that configuration at the moment, so I would prefer not to introduce an extra parent/child binding structure
> > > speculatively without a validating use case.
> > Interesting feature - kind of similar to address control on a typical i2c bus device, or
> > looking at it another way a kind of distributed SPI mux.
> > 
> > Challenge of a binding is we need to anticipate the future.  So I think we do need something
> > like Rodrigo is suggesting even if we only (for now) support a single instance in the driver.
> > That would leave the path open to supporting the addressing at a later date.
> > An alternative might be to look at it like a chained device setup. In those we pretend there
> > is just one device with a lot of channels etc.  The snag is that here things are more loosely
> > coupled whereas for those devices it tends to be you have to read / write the same register
> > in all devices in the chain as one big SPI message.
> > 
> > > +CC Mark Brown as he may know of some precedent for this feature. For his reference:
> > > - Each of these devices has 2 ID pins.  The SPI transfers have to contain the 2-bit
> > value that matches that or they are ignored.  Thus a single bus + 1 chip select can
> > be used to talk to 4 devices.  Question is what that looks like in device tree + I guess
> > longer term how to support it cleanly in SPI.

I'd swear I have seen this before, from some Microchip devices. Let me
see if I can find what I am thinking of...


* Re: [PATCH v3 1/2] dt-bindings: iio: dac: Add AD5529R
From: Nuno Sá @ 2026-06-19 11:31 UTC (permalink / raw)
  To: Janani Sunil
  Cc: Jonathan Cameron, Rodrigo Alencar, Janani Sunil,
	Lars-Peter Clausen, Michael Hennerich, David Lechner,
	Nuno Sá, Andy Shevchenko, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Philipp Zabel, Jonathan Corbet, Shuah Khan,
	linux-iio, devicetree, linux-kernel, linux-doc, Mark Brown
In-Reply-To: <076d7d2d-81a0-49c2-af94-bd65ead66c09@gmail.com>

On Fri, Jun 19, 2026 at 12:33:11PM +0200, Janani Sunil wrote:
> 
> On 6/14/26 21:44, Jonathan Cameron wrote:
> > On Tue, 9 Jun 2026 16:47:23 +0200
> > Janani Sunil <jan.sun97@gmail.com> wrote:
> > 
> 
> Hi Jonathan, Rob, Krzysztof, Conor,
> 
> One possible model that would also allow mixing the 12-bit and 16-bit variants would be to treat the parent node
> as the shared SPI transport only, and let each dac@N child carry its own compatible.
> 
> Rob, Krzysztof, Conor — wanted to get your input on whether this is an acceptable binding pattern.
> 
> properties:
>   compatible:
>     const: adi,ad5529r-bus
> 
> patternProperties:
>   "^dac@[0-3]$":
>     type: object
>     properties:
>       compatible:
>         enum:
>           - adi,ad5529r-16
>           - adi,ad5529r-12
>       reg:
>         minimum: 0
>         maximum: 3
> 
> With a DT example such as:
> 
> ad5529r@0 {
>         compatible = "adi,ad5529r-bus";
>         reg = <0>;
> 
>         dac@0 {
>                 compatible = "adi,ad5529r-16";
>                 reg = <0>;
>         };
> 
>         dac@1 {
>                 compatible = "adi,ad5529r-12";
>                 reg = <1>;
>         };
> };
> 
> The downside is that it introduces adi,ad5529r-bus as a compatible that does not correspond to an actual
> standalone device variant - it would require a parent driver to manage the shared SPI transport and enumerate the
> child devices. The actual DAC functionality is handled by the matching per-child compatibles (12 or 16 bit).
> Is this an acceptable pattern, or is there a preferred way to model this type of addressing scheme?
> 

At some point, I wondered whether we could just handle this at the SPI
level. In the simplest terms, that would be a new spi-peripheral property
allowing devices to share the same CS. We would then need an adi,pin-id
kind of property for this device, but the bindings would otherwise look
pretty much as if we only supported one device.

I see Mark is already in the loop; maybe he has seen this kind of thing
before.

- Nuno Sá



* Re: [PATCH v8 23/46] KVM: TDX: Make source page optional for KVM_TDX_INIT_MEM_REGION
From: Fuad Tabba @ 2026-06-19 11:09 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-23-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> Update tdx_gmem_post_populate() to handle cases where a source page is
> not explicitly provided. Instead of returning -EOPNOTSUPP when src_page
> is NULL, default to using the page associated with the destination PFN.
>
> This change allows for in-place memory conversion where the data is
> already present in the target PFN, ensuring the TDX module has a valid
> source page reference for the TDH.MEM.PAGE.ADD operation.
>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---

Sashiko flagged that when src_page = pfn_to_page(pfn),
tdh_mem_page_add gets identical physical addresses for r8
(destination) and r9 (source), reading with host KeyID and writing
with TD KeyID on the same address. I don't know enough about the TDX
module's operand constraints to confirm whether it allows overlapping
source and destination, but the concern looks legitimate.
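
To illustrate the concern, here is a small userspace model of the patched
fallback. It only mirrors the control flow of the hunk in
tdx_gmem_post_populate(); struct page_model and both helpers are hypothetical
stand-ins, and it says nothing about what the TDX module itself permits.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a kernel page; this models control flow only, not KVM. */
struct page_model {
	unsigned long pfn;
};

/*
 * Mirrors the patched check: with no explicit source and in-place
 * conversion enabled, the destination page doubles as the source.
 */
int pick_source(const struct page_model *src, const struct page_model *dst,
		bool in_place, const struct page_model **out)
{
	if (!src) {
		if (!in_place)
			return -EOPNOTSUPP;
		src = dst;
	}

	*out = src;
	return 0;
}

/* True when the add operation would see identical source and destination. */
bool source_aliases_dest(const struct page_model *src,
			 const struct page_model *dst)
{
	return src->pfn == dst->pfn;
}
```

In this model the NULL-source path always ends with source_aliases_dest()
returning true, which is exactly the case the question is about.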

nit: why does it have Sean's SoB?

Cheers,
/fuad


>  Documentation/virt/kvm/x86/intel-tdx.rst |  4 ++++
>  arch/x86/kvm/vmx/tdx.c                   | 11 ++++++++---
>  2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/virt/kvm/x86/intel-tdx.rst b/Documentation/virt/kvm/x86/intel-tdx.rst
> index 6a222e9d09541..74357fe87f9ec 100644
> --- a/Documentation/virt/kvm/x86/intel-tdx.rst
> +++ b/Documentation/virt/kvm/x86/intel-tdx.rst
> @@ -158,6 +158,10 @@ KVM_TDX_INIT_MEM_REGION
>  Initialize @nr_pages TDX guest private memory starting from @gpa with userspace
>  provided data from @source_addr. @source_addr must be PAGE_SIZE-aligned.
>
> +If guest_memfd in-place conversion is enabled, pass NULL for @source_addr to
> +initialize the memory region using memory contents already populated in
> +guest_memfd memory.
> +
>  Note, before calling this sub command, memory attribute of the range
>  [gpa, gpa + nr_pages] needs to be private.  Userspace can use
>  KVM_SET_MEMORY_ATTRIBUTES to set the attribute.
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index ffe9d0db58c59..56d10333c61a7 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -3198,8 +3198,12 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
>         if (KVM_BUG_ON(kvm_tdx->page_add_src, kvm))
>                 return -EIO;
>
> -       if (!src_page)
> -               return -EOPNOTSUPP;
> +       if (!src_page) {
> +               if (!gmem_in_place_conversion)
> +                       return -EOPNOTSUPP;
> +
> +               src_page = pfn_to_page(pfn);
> +       }
>
>         kvm_tdx->page_add_src = src_page;
>         ret = kvm_tdp_mmu_map_private_pfn(arg->vcpu, gfn, pfn);
> @@ -3278,7 +3282,8 @@ static int tdx_vcpu_init_mem_region(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *c
>                         break;
>                 }
>
> -               region.source_addr += PAGE_SIZE;
> +               if (region.source_addr)
> +                       region.source_addr += PAGE_SIZE;
>                 region.gpa += PAGE_SIZE;
>                 region.nr_pages--;
>
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>


* Re: [PATCH v8 11/46] KVM: Consolidate private memory and guest_memfd ifdeffery in kvm_host.h
From: Fuad Tabba @ 2026-06-19 11:02 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-11-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Sean Christopherson <seanjc@google.com>
>
> Move the kvm_arch_has_private_mem() stub and a few guest_memfd function
> definitions/declarations "down" in kvm_host.h to utilize existing #ifdefs,
> and so that related code is clustered together.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

SoB fix please. With that...

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad
> ---
>  include/linux/kvm_host.h | 37 ++++++++++++++++---------------------
>  1 file changed, 16 insertions(+), 21 deletions(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index acb552745b428..9c1cf1a6559e3 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -722,27 +722,6 @@ static inline int kvm_arch_vcpu_memslots_id(struct kvm_vcpu *vcpu)
>  }
>  #endif
>
> -#ifndef kvm_arch_has_private_mem
> -static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
> -{
> -       return false;
> -}
> -#endif
> -
> -#ifdef CONFIG_KVM_GUEST_MEMFD
> -bool kvm_arch_supports_gmem_init_shared(struct kvm *kvm);
> -
> -static inline u64 kvm_gmem_get_supported_flags(struct kvm *kvm)
> -{
> -       u64 flags = GUEST_MEMFD_FLAG_MMAP;
> -
> -       if (!kvm || kvm_arch_supports_gmem_init_shared(kvm))
> -               flags |= GUEST_MEMFD_FLAG_INIT_SHARED;
> -
> -       return flags;
> -}
> -#endif
> -
>  #ifndef kvm_arch_has_readonly_mem
>  static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
>  {
> @@ -2572,6 +2551,11 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>  #else
>  #define gmem_in_place_conversion false
>
> +static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
> +{
> +       return false;
> +}
> +
>  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>  {
>         return false;
> @@ -2580,6 +2564,17 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>
>  #ifdef CONFIG_KVM_GUEST_MEMFD
>  bool kvm_gmem_is_private(struct kvm *kvm, gfn_t gfn);
> +bool kvm_arch_supports_gmem_init_shared(struct kvm *kvm);
> +
> +static inline u64 kvm_gmem_get_supported_flags(struct kvm *kvm)
> +{
> +       u64 flags = GUEST_MEMFD_FLAG_MMAP;
> +
> +       if (!kvm || kvm_arch_supports_gmem_init_shared(kvm))
> +               flags |= GUEST_MEMFD_FLAG_INIT_SHARED;
> +
> +       return flags;
> +}
>
>  int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>                      gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>


* Re: [PATCH v8 22/46] KVM: SEV: Make 'uaddr' parameter optional for KVM_SEV_SNP_LAUNCH_UPDATE
From: Fuad Tabba @ 2026-06-19 11:01 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-22-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Michael Roth <michael.roth@amd.com>
>
> Make the source page for populating an SNP guest_memfd instance optional
> if in-place conversion/population is enabled.  If KVM can convert the page
> in-place, then it's possible for guest memory to be initialized directly
> from userspace by mmap()'ing the guest_memfd and writing to it while the
> corresponding GPA ranges are in a 'shared' state, before converting them
> to the 'private' state expected by KVM_SEV_SNP_LAUNCH_UPDATE.
>
> Update the handling/documentation for KVM_SEV_SNP_LAUNCH_UPDATE to allow
> for 'uaddr' to be set to NULL when in-place conversion is enabled, which
> SNP_LAUNCH_UPDATE will then use to determine when it should/shouldn't
> copy in data from a separate memory location. Continue to enforce
> non-NULL when PRIVATE is tracked per-VM, not per-guest_memfd.
>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> [Added src_page check in error handling path when the firmware command fails]
> [Dropped ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES]
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> [sean: drop explicit vm_memory_attributes references]
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  Documentation/virt/kvm/x86/amd-memory-encryption.rst | 13 +++++++++----
>  arch/x86/kvm/svm/sev.c                               | 16 +++++++++++-----
>  virt/kvm/kvm_main.c                                  |  1 +
>  3 files changed, 21 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> index bd04a908a8dbd..29409297f1ef0 100644
> --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> @@ -503,7 +503,8 @@ secrets.
>
>  It is required that the GPA ranges initialized by this command have had the
>  KVM_MEMORY_ATTRIBUTE_PRIVATE attribute set in advance. See the documentation
> -for KVM_SET_MEMORY_ATTRIBUTES for more details on this aspect.
> +for KVM_SET_MEMORY_ATTRIBUTES/KVM_SET_MEMORY_ATTRIBUTES2 for more details on
> +this aspect.
>
>  Upon success, this command is not guaranteed to have processed the entire
>  range requested. Instead, the ``gfn_start``, ``uaddr``, and ``len`` fields of
> @@ -511,9 +512,13 @@ range requested. Instead, the ``gfn_start``, ``uaddr``, and ``len`` fields of
>  remaining range that has yet to be processed. The caller should continue
>  calling this command until those fields indicate the entire range has been
>  processed, e.g. ``len`` is 0, ``gfn_start`` is equal to the last GFN in the
> -range plus 1, and ``uaddr`` is the last byte of the userspace-provided source
> -buffer address plus 1. In the case where ``type`` is KVM_SEV_SNP_PAGE_TYPE_ZERO,
> -``uaddr`` will be ignored completely.
> +range plus 1, and ``uaddr`` (if specified) is the last byte of the
> +userspace-provided source buffer address plus 1.
> +
> +In the case where ``type`` is KVM_SEV_SNP_PAGE_TYPE_ZERO, ``uaddr`` will be
> +ignored completely. For all other page types, ``uaddr`` is optional if in-place
> +conversion is enable, i.e. when the destination can also be the source, and is

Typo: "is enable" -> "is enabled".

"when the destination can also be the source" is hard to parse without
context. Maybe: "i.e. when the data has been written directly to
guest_memfd while the range was in the shared state".

Also, how does userspace discover whether in-place conversion is
enabled? A cross-reference to KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES
would help here.
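
For what it's worth, this is how I read the documented retry contract with an
optional ``uaddr``, written as a tiny userspace model. The struct is not the
real uapi struct (it keeps only the three fields the text says are updated),
and model_one_call() is a hypothetical stand-in for a call that processes
only part of the range.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

/* Only the fields the documentation says are updated; not the uapi struct. */
struct launch_update_model {
	uint64_t gfn_start;
	uint64_t uaddr;	/* 0 models a NULL source (in-place population) */
	uint64_t len;	/* bytes, expected to be page-aligned */
};

/* Models one partial call: handles at most max_pages, then updates fields. */
void model_one_call(struct launch_update_model *p, uint64_t max_pages)
{
	uint64_t count = p->len / MODEL_PAGE_SIZE;

	if (count > max_pages)
		count = max_pages;

	p->gfn_start += count;
	p->len -= count * MODEL_PAGE_SIZE;
	if (p->uaddr)	/* a NULL source is left untouched */
		p->uaddr += count * MODEL_PAGE_SIZE;
}

/* Caller-side loop: repeat until no whole page is left; returns call count. */
unsigned int model_populate(struct launch_update_model *p, uint64_t max_pages)
{
	unsigned int calls = 0;

	if (!max_pages)
		return 0;

	while (p->len >= MODEL_PAGE_SIZE) {
		model_one_call(p, max_pages);
		calls++;
	}
	return calls;
}
```

The point is that a caller passing a NULL ``uaddr`` should see it stay NULL
across retries, while ``gfn_start`` and ``len`` still advance as before.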

Cheers,
/fuad

> +required if in-place conversion is disabled.
>
>  Parameters (in): struct  kvm_sev_snp_launch_update
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 74fb15551e83f..2b7569b6a8609 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2330,7 +2330,13 @@ static int sev_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
>         int level;
>         int ret;
>
> -       if (WARN_ON_ONCE(sev_populate_args->type != KVM_SEV_SNP_PAGE_TYPE_ZERO && !src_page))
> +       /*
> +        * A source page is required if in-place conversion isn't enabled, as
> +        * the data needs to come from a separate physical page.  Zero pages
> +        * are exempt as they don't consume a source page.
> +        */
> +       if (!gmem_in_place_conversion &&
> +           sev_populate_args->type != KVM_SEV_SNP_PAGE_TYPE_ZERO && !src_page)
>                 return -EINVAL;
>
>         ret = snp_lookup_rmpentry((u64)pfn, &assigned, &level);
> @@ -2377,7 +2383,7 @@ static int sev_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
>          */
>         if (ret && !snp_page_reclaim(kvm, pfn) &&
>             sev_populate_args->type == KVM_SEV_SNP_PAGE_TYPE_CPUID &&
> -           sev_populate_args->fw_error == SEV_RET_INVALID_PARAM) {
> +           sev_populate_args->fw_error == SEV_RET_INVALID_PARAM && src_page) {
>                 void *src_vaddr = kmap_local_page(src_page);
>                 void *dst_vaddr = kmap_local_pfn(pfn);
>
> @@ -2410,8 +2416,8 @@ static int snp_launch_update(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         if (copy_from_user(&params, u64_to_user_ptr(argp->data), sizeof(params)))
>                 return -EFAULT;
>
> -       pr_debug("%s: GFN start 0x%llx length 0x%llx type %d flags %d\n", __func__,
> -                params.gfn_start, params.len, params.type, params.flags);
> +       pr_debug("%s: GFN start 0x%llx length 0x%llx type %d flags %d src %llx\n", __func__,
> +                params.gfn_start, params.len, params.type, params.flags, params.uaddr);
>
>         if (!params.len || !PAGE_ALIGNED(params.len) || params.flags ||
>             (params.type != KVM_SEV_SNP_PAGE_TYPE_NORMAL &&
> @@ -2468,7 +2474,7 @@ static int snp_launch_update(struct kvm *kvm, struct kvm_sev_cmd *argp)
>
>         params.gfn_start += count;
>         params.len -= count * PAGE_SIZE;
> -       if (params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO)
> +       if (src && params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO)
>                 params.uaddr += count * PAGE_SIZE;
>
>         if (copy_to_user(u64_to_user_ptr(argp->data), &params, sizeof(params)))
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 044486f128c37..dd1d18a1d2f68 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -103,6 +103,7 @@ module_param(allow_unsafe_mappings, bool, 0444);
>
>  #ifdef kvm_arch_has_private_mem
>  bool __ro_after_init gmem_in_place_conversion = false;
> +EXPORT_SYMBOL_FOR_KVM_INTERNAL(gmem_in_place_conversion);
>  #endif
>
>  #define MEMORY_ATTRIBUTES_MATCH(one, two)                              \
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>


* Re: [PATCH v8 21/46] KVM: guest_memfd: Zero page while getting pfn
From: Fuad Tabba @ 2026-06-19 10:51 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-21-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> Move the folio initialization logic from kvm_gmem_get_pfn() into
> __kvm_gmem_get_pfn() to also zero pages if the page is to be used in
> kvm_gmem_populate().
>
> With in-place conversion, the existing data in a guest_memfd page can be
> populated into guest memory through platform-specific ioctls.
>
> Without first zeroing the page obtained using __kvm_gmem_get_pfn(), it
> might contain uninitialized host memory, which would leak to the guest if
> the populate completes.
>
> guest_memfd pages are zeroed at most once in the page's entire lifetime
> with guest_memfd, and that is tracked using the uptodate flag.
>
> Zeroing the page in __kvm_gmem_get_pfn() is chosen over zeroing in
> kvm_gmem_get_folio() since other flows, such as a future write() syscall,
> can get a page, write to the page and then set page uptodate without
> zeroing.
>
> This aligns with the concept of zeroing before first use - the other place
> where zeroing happens is in kvm_gmem_fault_user_mapping().
>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad
> ---
>  virt/kvm/guest_memfd.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 90bc1a26512b6..86c9f5b0863cb 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -1137,6 +1137,11 @@ static struct folio *__kvm_gmem_get_pfn(struct file *file,
>                 return ERR_PTR(-EHWPOISON);
>         }
>
> +       if (!folio_test_uptodate(folio)) {
> +               clear_highpage(folio_page(folio, 0));
> +               folio_mark_uptodate(folio);
> +       }
> +
>         *pfn = folio_file_pfn(folio, index);
>         if (max_order)
>                 *max_order = 0;
> @@ -1166,11 +1171,6 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>                 goto out;
>         }
>
> -       if (!folio_test_uptodate(folio)) {
> -               clear_highpage(folio_page(folio, 0));
> -               folio_mark_uptodate(folio);
> -       }
> -
>         if (kvm_gmem_is_private_mem(inode, index))
>                 r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);
>
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>
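
The "zero at most once, tracked by the uptodate flag" rule described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct sketch_folio and sketch_zero_on_first_use() are hypothetical stand-ins, not kernel APIs, and the memset() plays the role of clear_highpage().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical stand-in for a folio: one page of data plus an uptodate flag. */
struct sketch_folio {
	unsigned char data[SKETCH_PAGE_SIZE];
	bool uptodate;
};

/*
 * Zero the page only on first use, mirroring the
 * folio_test_uptodate() / clear_highpage() / folio_mark_uptodate()
 * sequence in the patch. Returns true if this call did the zeroing.
 */
static bool sketch_zero_on_first_use(struct sketch_folio *folio)
{
	if (folio->uptodate)
		return false;

	memset(folio->data, 0, sizeof(folio->data));
	folio->uptodate = true;
	return true;
}
```

A second call on the same folio leaves the contents alone, which is why a later writer that sets the flag itself can skip the zeroing entirely.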


* Re: [PATCH v8 19/46] KVM: guest_memfd: Use actual size for invalidation in kvm_gmem_release()
From: Fuad Tabba @ 2026-06-19 10:46 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-19-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> __kvm_gmem_invalidate_begin() and __kvm_gmem_invalidate_end() actually do
> not specially handle -1ul. -1ul is used as a huge number, which legal
> indices do not exceed, and hence the invalidation works as expected.
>
> Since a later patch is going to make use of the exact range, calculate the
> size of the guest_memfd inode and use it as the end range for invalidating
> SPTEs.
>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> ---

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad

>  virt/kvm/guest_memfd.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index d163559da0235..d72ecbfcc3144 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -366,6 +366,7 @@ static long kvm_gmem_fallocate(struct file *file, int mode, loff_t offset,
>
>  static int kvm_gmem_release(struct inode *inode, struct file *file)
>  {
> +       pgoff_t end = i_size_read(inode) >> PAGE_SHIFT;
>         struct gmem_file *f = file->private_data;
>         struct kvm_memory_slot *slot;
>         struct kvm *kvm = f->kvm;
> @@ -396,9 +397,9 @@ static int kvm_gmem_release(struct inode *inode, struct file *file)
>          * Zap all SPTEs pointed at by this file.  Do not free the backing
>          * memory, as its lifetime is associated with the inode, not the file.
>          */
> -       __kvm_gmem_invalidate_start(f, 0, -1ul,
> +       __kvm_gmem_invalidate_start(f, 0, end,
>                                     kvm_gmem_get_invalidate_filter(inode));
> -       __kvm_gmem_invalidate_end(f, 0, -1ul);
> +       __kvm_gmem_invalidate_end(f, 0, end);
>
>         list_del(&f->entry);
>
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>
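
For reference, the end index used above is just the inode size converted to a page count. A minimal sketch, assuming 4 KiB pages and an exclusive end index; SKETCH_PAGE_SHIFT and both helpers are hypothetical names, not kernel symbols:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Mirrors "i_size_read(inode) >> PAGE_SHIFT": first index past the file. */
static uint64_t sketch_end_index(uint64_t size_bytes)
{
	return size_bytes >> SKETCH_PAGE_SHIFT;
}

/* True if index lies in [start, end), i.e. end is treated as exclusive. */
static bool sketch_index_in_range(uint64_t index, uint64_t start, uint64_t end)
{
	return index >= start && index < end;
}
```

With -1ul as the end every legal index is trivially in range; with the computed end the range is exact, which is what a later consumer of the range needs.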


* Re: [PATCH v8 17/46] KVM: guest_memfd: Advertise KVM_SET_MEMORY_ATTRIBUTES2 ioctl
From: Fuad Tabba @ 2026-06-19 10:35 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-17-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> Introduce KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES to advertise the
> availability of the KVM_SET_MEMORY_ATTRIBUTES2 ioctl.
>
> KVM_SET_MEMORY_ATTRIBUTES2 is a guest_memfd-scoped version of the existing
> KVM_SET_MEMORY_ATTRIBUTES VM ioctl. It allows userspace to manage memory
> attributes, such as KVM_MEMORY_ATTRIBUTE_PRIVATE, directly on a guest_memfd
> file descriptor.
>
> This new version uses struct kvm_memory_attributes2, which adds an
> error_offset field to the output. This allows KVM to return the specific
> offset that triggered an error, which is especially useful for handling
> EAGAIN results caused by transient page reference counts during attribute
> conversions.
>
> Update the KVM API documentation to define the new ioctl and its behavior,
> and add the necessary UAPI definitions and capability checks.
>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Suggested-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad
> ---
>  Documentation/virt/kvm/api.rst | 78 +++++++++++++++++++++++++++++++++++++++++-
>  include/uapi/linux/kvm.h       |  2 ++
>  virt/kvm/kvm_main.c            | 23 +++++++++----
>  3 files changed, 95 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index a833d90845b95..73878f34f6d2e 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -117,7 +117,7 @@ description:
>        x86 includes both i386 and x86_64.
>
>    Type:
> -      system, vm, or vcpu.
> +      system, vm, vcpu or guest_memfd.
>
>    Parameters:
>        what parameters are accepted by the ioctl.
> @@ -6373,6 +6373,8 @@ S390:
>  Returns -EINVAL if the VM has the KVM_VM_S390_UCONTROL flag set.
>  Returns -EINVAL if called on a protected VM.
>
> +.. _KVM_SET_MEMORY_ATTRIBUTES:
> +
>  4.141 KVM_SET_MEMORY_ATTRIBUTES
>  -------------------------------
>
> @@ -6566,6 +6568,80 @@ KVM_S390_KEYOP_SSKE
>    Sets the storage key for the guest address ``guest_addr`` to the key
>    specified in ``key``, returning the previous value in ``key``.
>
> +4.145 KVM_SET_MEMORY_ATTRIBUTES2
> +---------------------------------
> +
> +:Capability: KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES
> +:Architectures: all
> +:Type: guest_memfd ioctl
> +:Parameters: struct kvm_memory_attributes2 (in/out)
> +:Returns: 0 on success, <0 on error
> +
> +Errors:
> +
> +  ========== ===============================================================
> +  EINVAL     The specified `offset` or `size` were invalid (e.g. not
> +             page aligned, causes an overflow, or size is zero).
> +  EFAULT     The parameter address was invalid.
> +  EAGAIN     Some page within requested range had unexpected refcounts. The
> +             offset of the page will be returned in `error_offset`.
> +  ENOMEM     Ran out of memory trying to track private/shared state
> +  ========== ===============================================================
> +
> +KVM_SET_MEMORY_ATTRIBUTES2 is an extension to
> +KVM_SET_MEMORY_ATTRIBUTES that supports returning (writing) values to
> +userspace.  The original (pre-extension) fields are shared with
> +KVM_SET_MEMORY_ATTRIBUTES identically.
> +
> +Attribute values are shared with KVM_SET_MEMORY_ATTRIBUTES.
> +
> +::
> +
> +  struct kvm_memory_attributes2 {
> +       /* in */
> +       union {
> +               __u64 address;
> +               __u64 offset;
> +       };
> +       __u64 size;
> +       __u64 attributes;
> +       __u64 flags;
> +       /* out */
> +       __u64 error_offset;
> +       __u64 reserved[11];
> +  };
> +
> +  #define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
> +
> +Set attributes for a range of offsets within a guest_memfd to
> +KVM_MEMORY_ATTRIBUTE_PRIVATE to limit the specified guest_memfd backed
> +memory range for guest use. Even if KVM_CAP_GUEST_MEMFD_MMAP is
> +supported, after a successful call to set
> +KVM_MEMORY_ATTRIBUTE_PRIVATE, the requested range will not be mappable
> +into host userspace and will only be mappable by the guest.
> +
> +To allow the range to be mappable into host userspace again, call
> +KVM_SET_MEMORY_ATTRIBUTES2 on the guest_memfd again with
> +KVM_MEMORY_ATTRIBUTE_PRIVATE unset.
> +
> +KVM does not directly manipulate the memory contents of pages during
> +attribute updates. However, the process of setting these attributes,
> +which includes operations such as unmapping pages from the host or
> +stage-2 page tables, may result in side effects on memory contents
> +that vary across different trusted firmware implementations.
> +
> +If this ioctl returns -EAGAIN, the offset of the page with unexpected
> +refcounts will be returned in `error_offset`. This can occur if there
> +are transient refcounts on the pages, taken by other parts of the
> +kernel.
> +
> +Userspace is expected to figure out how to remove all known refcounts
> +on the shared pages, such as refcounts taken by get_user_pages(), and
> +try the ioctl again. A possible source of these long term refcounts is
> +if the guest_memfd memory was pinned in IOMMU page tables.
> +
> +See also: :ref:`KVM_SET_MEMORY_ATTRIBUTES`.
> +
>  .. _kvm_run:
>
>  5. The kvm_run structure
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 876c0429f9d4e..129d6f6303251 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -997,6 +997,7 @@ struct kvm_enable_cap {
>  #define KVM_CAP_S390_KEYOP 247
>  #define KVM_CAP_S390_VSIE_ESAMODE 248
>  #define KVM_CAP_S390_HPAGE_2G 249
> +#define KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES 250
>
>  struct kvm_irq_routing_irqchip {
>         __u32 irqchip;
> @@ -1649,6 +1650,7 @@ struct kvm_memory_attributes {
>         __u64 flags;
>  };
>
> +/* Available with KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES */
>  #define KVM_SET_MEMORY_ATTRIBUTES2              _IOWR(KVMIO,  0xd2, struct kvm_memory_attributes2)
>
>  struct kvm_memory_attributes2 {
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index a08b518cdb175..044486f128c37 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2434,18 +2434,22 @@ static int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm,
>  }
>  #endif /* CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT */
>
> +#ifdef kvm_arch_has_private_mem
> +static u64 kvm_supports_private_mem(struct kvm *kvm)
> +{
> +       return !kvm || kvm_arch_has_private_mem(kvm);
> +}
> +#else
> +#define kvm_supports_private_mem(kvm) false
> +#endif
> +
>  #ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>  static u64 kvm_supported_vm_mem_attributes(struct kvm *kvm)
>  {
> -#ifdef kvm_arch_has_private_mem
> -       if (gmem_in_place_conversion)
> +       if (gmem_in_place_conversion || !kvm_supports_private_mem(kvm))
>                 return 0;
>
> -       if (!kvm || kvm_arch_has_private_mem(kvm))
> -               return KVM_MEMORY_ATTRIBUTE_PRIVATE;
> -#endif
> -
> -       return 0;
> +       return KVM_MEMORY_ATTRIBUTE_PRIVATE;
>  }
>
>  /*
> @@ -4969,6 +4973,11 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
>                 return 1;
>         case KVM_CAP_GUEST_MEMFD_FLAGS:
>                 return kvm_gmem_get_supported_flags(kvm);
> +       case KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES:
> +               if (!gmem_in_place_conversion || !kvm_supports_private_mem(kvm))
> +                       return 0;
> +
> +               return KVM_MEMORY_ATTRIBUTE_PRIVATE;
>  #endif
>         default:
>                 break;
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>
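
The documented -EAGAIN flow (read error_offset, drop the known references, retry) might look roughly like the sketch below from the userspace side. It is self-contained and does not call the real ioctl: the struct is a local mirror of the layout quoted above, the callbacks stand in for ioctl(gmem_fd, KVM_SET_MEMORY_ATTRIBUTES2, ...) and for whatever the VMM does to unpin pages, and every sketch_* name is hypothetical. The callback returns 0 or a negative errno value, whereas a real ioctl() returns -1 and sets errno.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the documented layout; not taken from a real header. */
struct sketch_memory_attributes2 {
	union {
		uint64_t address;
		uint64_t offset;
	};
	uint64_t size;
	uint64_t attributes;
	uint64_t flags;
	uint64_t error_offset;	/* out */
	uint64_t reserved[11];
};

#define SKETCH_ATTR_PRIVATE (1ULL << 3)

typedef int (*sketch_set_attrs_fn)(struct sketch_memory_attributes2 *attrs,
				   void *ctx);
typedef void (*sketch_drop_refs_fn)(uint64_t error_offset, void *ctx);

/* Retry on -EAGAIN, giving the caller a chance to drop references. */
static int sketch_set_private(sketch_set_attrs_fn set_attrs,
			      sketch_drop_refs_fn drop_refs, void *ctx,
			      uint64_t offset, uint64_t size, int max_tries)
{
	struct sketch_memory_attributes2 attrs;
	int ret = -EAGAIN;

	for (int i = 0; i < max_tries; i++) {
		memset(&attrs, 0, sizeof(attrs));
		attrs.offset = offset;
		attrs.size = size;
		attrs.attributes = SKETCH_ATTR_PRIVATE;

		ret = set_attrs(&attrs, ctx);
		if (ret != -EAGAIN)
			return ret;

		/* error_offset names the page with unexpected refcounts. */
		drop_refs(attrs.error_offset, ctx);
	}
	return ret;
}

/* Mock backend for demonstration: fails while references remain. */
struct sketch_mock {
	int pinned_refs;
	uint64_t pinned_offset;
	int calls;
};

static int sketch_mock_set_attrs(struct sketch_memory_attributes2 *attrs,
				 void *ctx)
{
	struct sketch_mock *m = ctx;

	m->calls++;
	if (m->pinned_refs > 0) {
		attrs->error_offset = m->pinned_offset;
		return -EAGAIN;
	}
	return 0;
}

static void sketch_mock_drop_refs(uint64_t error_offset, void *ctx)
{
	struct sketch_mock *m = ctx;

	if (error_offset == m->pinned_offset && m->pinned_refs > 0)
		m->pinned_refs--;
}
```

Bounding the number of attempts is a design choice of this sketch: long-term references (for example pages pinned in IOMMU page tables) will not go away on their own, so an unbounded loop could spin forever.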


* Re: [PATCH v3 1/2] dt-bindings: iio: dac: Add AD5529R
From: Janani Sunil @ 2026-06-19 10:33 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Rodrigo Alencar, Janani Sunil, Lars-Peter Clausen,
	Michael Hennerich, David Lechner, Nuno Sá, Andy Shevchenko,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley, Philipp Zabel,
	Jonathan Corbet, Shuah Khan, linux-iio, devicetree, linux-kernel,
	linux-doc, Mark Brown
In-Reply-To: <20260614204455.408c4d40@jic23-huawei>


On 6/14/26 21:44, Jonathan Cameron wrote:
> On Tue, 9 Jun 2026 16:47:23 +0200
> Janani Sunil <jan.sun97@gmail.com> wrote:
>
>> On 5/26/26 15:11, Rodrigo Alencar wrote:
>>> On 26/05/19 05:42PM, Janani Sunil wrote:
>>>> Devicetree bindings for AD5529R 16 channel 12/16 bit high voltage,
>>>> buffered voltage output digital-to-analog converter (DAC) with an
>>>> integrated precision reference.
>>> ...
>>> Probably others may comment on that, but...
>>>
>>> This parent node may support device addressing for multi-device support through
>>> those ID pins. I suppose that each device may have its own power supplies or
>>> other resources like the toggle pins or reset and enable.
>>>
>>> That way I suppose that an example would look like...
>>>   
>>>> +
>>>> +patternProperties:
>>>> +  "^channel@([0-9]|1[0-5])$":
>>>> +    type: object
>>>> +    description: Child nodes for individual channel configuration
>>>> +
>>>> +    properties:
>>>> +      reg:
>>>> +        description: Channel number.
>>>> +        minimum: 0
>>>> +        maximum: 15
>>>> +
>>>> +      adi,output-range-microvolt:
>>>> +        description: |
>>>> +          Output voltage range for this channel as [min, max] in microvolts.
>>>> +          If not specified, defaults to 0V to 5V range.
>>>> +        oneOf:
>>>> +          - items:
>>>> +              - const: 0
>>>> +              - enum: [5000000, 10000000, 20000000, 40000000]
>>>> +          - items:
>>>> +              - const: -5000000
>>>> +              - const: 5000000
>>>> +          - items:
>>>> +              - const: -10000000
>>>> +              - const: 10000000
>>>> +          - items:
>>>> +              - const: -15000000
>>>> +              - const: 15000000
>>>> +          - items:
>>>> +              - const: -20000000
>>>> +              - const: 20000000
>>>> +
>>>> +    required:
>>>> +      - reg
>>>> +
>>>> +    additionalProperties: false
>>>> +
>>>> +required:
>>>> +  - compatible
>>>> +  - reg
>>>> +  - vdd-supply
>>>> +  - avdd-supply
>>>> +  - hvdd-supply
>>>> +
>>>> +dependencies:
>>>> +  spi-cpha: [ spi-cpol ]
>>>> +  spi-cpol: [ spi-cpha ]
>>>> +
>>>> +allOf:
>>>> +  - $ref: /schemas/spi/spi-peripheral-props.yaml#
>>>> +
>>>> +unevaluatedProperties: false
>>>> +
>>>> +examples:
>>>> +  - |
>>>> +    #include <dt-bindings/gpio/gpio.h>
>>>> +
>>>> +    spi {
>>>> +        #address-cells = <1>;
>>>> +        #size-cells = <0>;
>>>> +
>>>> +        dac@0 {
>>>> +            compatible = "adi,ad5529r-16";
>>>> +            reg = <0>;
>>>> +            spi-max-frequency = <25000000>;
>>>> +
>>>> +            vdd-supply = <&vdd_regulator>;
>>>> +            avdd-supply = <&avdd_regulator>;
>>>> +            hvdd-supply = <&hvdd_regulator>;
>>>> +            hvss-supply = <&hvss_regulator>;
>>>> +
>>>> +            reset-gpios = <&gpio0 87 GPIO_ACTIVE_LOW>;
>>>> +
>>>> +            #address-cells = <1>;
>>>> +            #size-cells = <0>;
>>>> +
>>>> +            channel@0 {
>>>> +                reg = <0>;
>>>> +                adi,output-range-microvolt = <0 5000000>;
>>>> +            };
>>>> +
>>>> +            channel@1 {
>>>> +                reg = <1>;
>>>> +                adi,output-range-microvolt = <(-10000000) 10000000>;
>>>> +            };
>>>> +
>>>> +            channel@2 {
>>>> +                reg = <2>;
>>>> +                adi,output-range-microvolt = <0 40000000>;
>>>> +            };
>>>> +        };
>>>> +    };
>>> ...
>>>
>>> 	spi {
>>> 		#address-cells = <1>;
>>> 		#size-cells = <0>;
>>>
>>> 		multi-dac@0 {
>>> 			compatible = "adi,ad5529r-16";
>>> 			reg = <0>;
>>> 			spi-max-frequency = <25000000>;
>>>
>>> 			#address-cells = <1>;
>>> 			#size-cells = <0>;
>>>
>>> 			dac@0 {
>>> 				reg = <0>;
>>> 				vdd-supply = <&vdd_regulator>;
>>> 				avdd-supply = <&avdd_regulator>;
>>> 				hvdd-supply = <&hvdd_regulator>;
>>> 				hvss-supply = <&hvss_regulator>;
>>>
>>> 				reset-gpios = <&gpio0 87 GPIO_ACTIVE_LOW>;
>>>
>>> 				#address-cells = <1>;
>>> 				#size-cells = <0>;
>>>
>>> 				channel@0 {
>>> 					reg = <0>;
>>> 					adi,output-range-microvolt = <0 5000000>;
>>> 				};
>>>
>>> 				channel@1 {
>>> 					reg = <1>;
>>> 					adi,output-range-microvolt = <(-10000000) 10000000>;
>>> 				};
>>>
>>> 				channel@2 {
>>> 					reg = <2>;
>>> 					adi,output-range-microvolt = <0 40000000>;
>>> 				};
>>> 			}
>>>
>>> 			dac@1 {
>>> 				reg = <1>;
>>> 				vdd-supply = <&vdd_regulator>;
>>> 				avdd-supply = <&avdd_regulator>;
>>> 				hvdd-supply = <&hvdd_regulator>;
>>> 				hvss-supply = <&hvss_regulator>;
>>>
>>> 				reset-gpios = <&gpio0 88 GPIO_ACTIVE_LOW>;
>>>
>>> 				#address-cells = <1>;
>>> 				#size-cells = <0>;
>>>
>>> 				channel@0 {
>>> 					reg = <0>;
>>> 					adi,output-range-microvolt = <0 5000000>;
>>> 				};
>>>
>>> 				channel@1 {
>>> 					reg = <1>;
>>> 					adi,output-range-microvolt = <(-10000000) 10000000>;
>>> 				};
>>> 			}
>>> 		};
>>> 	};
>>>
>>> then you might need something like:
>>>
>>> 	patternProperties:
>>> 		"^dac@[0-3]$":
>>>
>>> and put most of the things under this node pattern.
>>>
>>> So the main driver that you're putting together might need to handle up to four instances.
>>> Even if your current driver cannot handle this, the dt-bindings might need to cover that.
>>>
>>> Need to double check if each dac node needs a separate compatible, so you would maybe populate
>>> a platform data to be shared with the child nodes, which would be a separate driver.
>>> (not sure if it would make sense to mix and match ad5529r-16 and ad5529r-12).
>> Hi Rodrigo,
>>
>> Thank you for looking at this.
>>
>> For now, I would prefer to keep the binding scoped to a single AD5529R device instance. The current
>> hardware/use case we have only needs one device node and the driver is written around that model as well.
>> While the device addressing pins could allow multi-device topology, we do not have an actual platform using
>> that configuration at the moment, so I would prefer not to introduce an extra parent/child binding structure
>> speculatively without a validating use case.
> Interesting feature - kind of similar to address control on a typical i2c bus device, or
> looking at it another way a kind of distributed SPI mux.
>
> Challenge of a binding is we need to anticipate the future.  So I think we do need something
> like Rodrigo is suggesting even if we only (for now) support a single instance in the driver.
> That would leave the path open to supporting the addressing at a later date.
> An alternative might be to look at it like a chained device setup. In those we pretend there
> is just one device with a lot of channels etc.  The snag is that here things are more loosely
> coupled whereas for those devices it tends to be you have to read / write the same register
> in all devices in the chain as one big SPI message.
>
> +CC Mark Brown as he may know of some precedent for this feature. For his reference..
> - Each of these devices has 2 ID pins.  The SPI transfers have to contain the 2 bit
> value that matches that or they are ignored.  Thus a single bus + 1 chip select can
> be used to talk to 4 devices.  Question is what that looks like in device tree + I guess
> longer term how to support it cleanly in SPI.
>
> Jonathan

Hi Jonathan, Rob, Krzysztof, Conor,

One possible model that would also allow mixing the 12-bit and 16-bit variants would be to treat the parent node
as the shared SPI transport only, and let each dac@N child carry its own compatible.

Rob, Krzysztof, Conor — wanted to get your input on whether this is an acceptable binding pattern.

properties:
   compatible:
     const: adi,ad5529r-bus

patternProperties:
   "^dac@[0-3]$":
     type: object
     properties:
       compatible:
         enum:
           - adi,ad5529r-16
           - adi,ad5529r-12
       reg:
         minimum: 0
         maximum: 3

With a DT example such as:

ad5529r@0 {
         compatible = "adi,ad5529r-bus";
         reg = <0>;

         dac@0 {
                 compatible = "adi,ad5529r-16";
                 reg = <0>;
         };

         dac@1 {
                 compatible = "adi,ad5529r-12";
                 reg = <1>;
         };
};

The downside is that it introduces adi,ad5529r-bus as a compatible that does not correspond to an actual
standalone device variant - it would require a parent driver to manage the shared SPI transport and enumerate the
child devices. The actual DAC functionality is handled by the matching per-child compatibles (12 or 16 bit).
Is this an acceptable pattern, or is there a preferred way to model this type of addressing scheme?

Regards,
Janani Sunil
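
To make the addressing scheme discussed above concrete, here is a small C model of four devices sharing one chip select, each accepting only frames whose 2-bit ID matches its ID pins. This is a sketch under loud assumptions: placing the ID in the top two bits of the first byte is purely hypothetical and is not taken from the AD5529R datasheet, and all sketch_* names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical frame layout: device ID in bits [7:6] of the first byte. */
#define SKETCH_ID_SHIFT 6
#define SKETCH_ID_MASK  0x3u

static uint8_t sketch_frame_id(uint8_t first_byte)
{
	return (first_byte >> SKETCH_ID_SHIFT) & SKETCH_ID_MASK;
}

/* A device ignores any frame whose ID does not match its ID pins. */
static bool sketch_device_accepts(uint8_t id_pins, uint8_t first_byte)
{
	return sketch_frame_id(first_byte) == (id_pins & SKETCH_ID_MASK);
}

/* Count how many devices on the shared chip select accept the frame. */
static int sketch_count_responders(const uint8_t *id_pins, int n,
				   uint8_t first_byte)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		if (sketch_device_accepts(id_pins[i], first_byte))
			count++;
	return count;
}
```

In this model the reg value of each dac@N child would map directly onto the ID that the parent places in every transfer for that child.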



* Re: [PATCH v4 00/31] Introduce SCMI Telemetry FS support
From: David Hildenbrand (Arm) @ 2026-06-19 10:16 UTC (permalink / raw)
  To: Cristian Marussi
  Cc: Christian Brauner, linux-kernel, linux-arm-kernel, arm-scmi,
	linux-fsdevel, linux-doc, sudeep.holla, james.quinlan, f.fainelli,
	vincent.guittot, etienne.carriere, peng.fan, michal.simek, d-gole,
	jic23, elif.topuz, lukasz.luba, philip.radford,
	souvik.chakravarty, leitao, kas, puranjay, usama.arif,
	kernel-team
In-Reply-To: <ajR_FBWOoXJKSeoH@pluto>


>> Is the configuration aspect limited to enabling selected events, or is there
>> more that can be configured?
>>
> 
> The needed configuration is:
> 
>  - global Telemetry enable (tlm_enable)
>  - global common update_interval (current_update_interval)

Okay, so simple global properties.

>  - per-DE enable/disable (des/0x<NNNN>/enable)
>  - per-DE timestamping enable/disable (des/0x<NNNN>/tstamp_enable)
> 
>  ... then there are a couple of handy catch-all entries:
> 	all_des_enable, all_des_tstamp_enable

Okay, so fairly trivial configs.
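
For illustration, the per-DE entries listed above could be addressed from a tool by building the relative path from the DE ID. A minimal sketch: sketch_de_path() is a hypothetical helper, and the four-digit uppercase hex formatting of <NNNN> is an assumption, since the exact width and case are not stated here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "des/0x<NNNN>/<leaf>" relative to the (unspecified) mount point.
 * Assumption: <NNNN> is four uppercase hex digits.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int sketch_de_path(char *buf, size_t len, uint32_t de_id,
			  const char *leaf)
{
	int n = snprintf(buf, len, "des/0x%04X/%s", (unsigned int)de_id, leaf);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

For example, sketch_de_path(buf, sizeof(buf), 0x1234, "tstamp_enable") fills buf with the entry that toggles timestamping for that DE.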
> 
> Note that all the existing DEs are discovered at runtime dynamically via
> SCMI in the background at init/probe and then never change: i.e.
> the tree is statically created upon discovery, user cannot
> create/destroy or symlink files at will, nor can the backend platform FW
> running the SCMI server pop up new DataEvents after the initial
> enumeration.

That makes sense.

> 
> All the above configs can also be pre-defined in the FW (at build time)
> as being default boot-on with predefined values, like a specific
> boot-on update interval, so that you could have a system in which really
> you don't need to configure anything...everything is on and you just
> read data. (unless you want to change config of course...)

Okay, so the initial value of some parameters might not be "disabled" etc.

I guess, from a user space perspective, reading should be allowed by everyone
but writing should be limited to root?

> 
> There is more stuff that indeed is configurable per the SCMI spec
> but these additional params are hidden into the SCMI Telemetry protocol
> layer (the initial patches in this series) and NOT made available to
> the driver/users of the protocol (like the SCMI FS driver that sits on
> top)

Do you assume that there will get significantly more config options added in the
future for user space to configure?

> 
> IOW, this humongous series (~8k lines) is only partially composed by
> the Filesystem driver (~3k): the bulk of the Telemetry logic and SCMI
> message exchanges are contained in the SCMI Protocol stack which has
> been extended to support the Telemetry protocol at first
> (the 'firmware: arm_scmi:' initial patches).
> 
> This latter common support is exposed by the SCMI stack for the SCMI
> drivers to use via custom per-protocol operations (not an original name :P)
> exposed in include/linux/scmi_protocol.h
> 
> So when you write into FS to configure something, you end up calling an internal
> tlm_proto_ops that in turn will cause an SCMI message to be sent
> (in some cases say to enable a DE or set the update interval)

Makes sense.

> 
> When you read something, you end up calling another Telemetry operation
> that in turn returns you the DataEvent value you were looking for...how
> this is retrieved via SCMI in the background is transparent to the
> FS driver because, again, these details are buried into the protocol
> layer. Talking about reads, you can:
> 
>  - read a single value from des/0x<NNNN>/value
>  - read ALL the currently enabled DE in a bulk read via des_bulk_read
> 
> ...most of the other entries in the tree are simply RO properties of the DEs
> that have been discovered at enumeration time.

Is this bulk-reading relevant for performance or just a "nice to have" ?


> 
> Given that walking a FS tree and issuing configuration as writes is NOT
> performant really (nor handy if you are not a human), currently, even
> in this FS-based series you can really perform all of the discovery AND
> the configuration tasks WITHOUT walking the filesystem tree, but instead
> issuing a bunch of IOCTLs issued on a special 'control' file that I
> embedded in the FS. Such UAPI IOCTLs described at:

Makes sense.

> 
> https://lore.kernel.org/arm-scmi/20260612223802.1337232-6-cristian.marussi@arm.com/T/#u
>  
> So my plan of action in order to get rid of the FS in-kernel implementation
> would be to drop this Filesystem in favour of simple character devices
> and move the existing IOCTL interface (revisited where needed) on top of
> these devices: that way you will be able to use IOCTLs to enumerate the
> Telemetry sources and then configure them.
> 
> Read will then happen (probably) leveraging a number of chardev fops like:
> IOCTLs, .read and .mmap...up to the tool to decide what to use.
> 
> After this porting to chardev is done, I would start optionally exposing
> again all of this in a human-readable alternative way by adding a layer
> of FUSE on top of this chardev interface.

Yes. How high-priority is the fs side? Or would a tool using a library to access
this information also work in the first step?

> 
> Basically my aim is to drop the FS implementation from the kernel, as
> advised, while trying to optionally make it still available via a userspace
> FUSE implementation...IOW the intention would be for the next V5 to expose
> the same interfaces as V4 but with the help of a tool instead that builds,
> if wanted, a FUSE mount built on top of the chardev interface.
> 
> So basically 'floating up' the current FS-like interface into userspace.

Yes.

> 
>>
>> You mention json here ... but I assume the data we are getting fed by the
>> protocol is not in some default format? (e.g., json)
> 
> The data format is defined by the SCMI spec and it is buried in the SCMI
> layer, there are a number of collection methods and a number of formats: this
> is NOT exposed from the SCMI core BUT handled transparently.
> 
> The raw spec format basically defines how DE ID, Tstamps, values are represented
> in memory and how their consistency can be assured despite the fact that
> platform could update the same entries that a user is concurrently reading...
> 
> JSON definitions only assign a semantic to the DEs (in theory...): e.g. on this
> specific platform...wth is 0x1234 ? ..also note that JSON defs are NOT part of
> the spec....they do NOT really exist for the Kernel: they are parsed and
> interpreted by more complex user space tools that are supposed to leverage some
> of these interfaces to retrieve data and carry-on analysis.

What I thought, thanks.

> 
>>
>>
>> Maybe you have it in some of the patches here, but what does the typical
>> directory + file structure look like in the current implementation?
>>
>> Do you have an example?
>>
>> Also, is everything in that filesystem read-only, or are there some writable
>> file (IOW, how is stuff configured?).
> 
> See above for config/write entry ... and I think you found the FS layout in the
> doc already...
> 
>>
>>
>> Okay, so you really only feed this data to user space, exposing all the data you
>> have easily available as part of the protocol.
> 
> Yes, no interpretation nor filtering: I expose all that has been enumerated and
> discovered by the protocol, allowing for configurations while hiding the inner
> SCMI Telemetry mechanism...
> 
>>
>>
>> It's a good question how that could be done, if you need more information about
>> these events from user space.
> 
> I have NOT really delved into that, so as of now we do NOT feed any data
> to existing Kernel subsystems, nor is there any available in-kernel
> interface to consume DE data (nobody asked), but, I can imagine 2 solutions:
> 
>  - our beloved architects decide to 'architect' more DataEvents in the
>    next version of the spec.. i.e. they reserve some specific DE IDs to
>    represent some well defined entity (like it is done already in the spec
>    for a dozen IDs)...this avoids the need for any new interface
>    altogether

That would be the cleanest solution :)

> 
> OR
> 
> - we open some sort of user-->kernel ABI channel 'somewhere' where the
>   userspace tool, interpreting the JSON description, can communicate something
>   like " on this platform ID 1,2,3,4 should be fed to the IIO sensors frmwk
>   too, while ID 39,8,76 can be fed to HWMON..." etc
> 
>>
>> [...]
>>
>>
>> That sounds reasonable.
>>
>> [...]
>>
>>> ...I would not say that this was the kind of feedback I was hoping for,
>>> but I am NOT gonna argue, given that you shot down already what I thought
>>> were all my best selling points :P
>>>
>>> At this point my understanding is that the way forward must be to use
>>> a custom tool to configure/extract/translate the raw Telemetry data and
>>> move up into userspace the whole human readable FS layer via FUSE, if
>>> really needed.
>>>
>>> I suppose that the new kernel/user interface has to be some dedicated char
>>> device implementing proper fops. (like I did previously in early versions
>>> of this series and then abandoned...)
>>>
>>> Is this what you have in mind? Dedicated character device(s) with enough fops
>>> to be able to configure/extract Telemetry data with a custom tool ?
>>
>> I cannot speak for Christian, but I guess you could have some kind of libscmi in
>> user space that can obtain the information (as you say, probably char device,
>> not sure which alternatives we have), to expose the data through a nice ABI, to
>> then either make tools build upon that directly, or have a fuse server in user
>> space that mimics what you currently do with the file system.
> 
> My aim would be at first a simple tool that can exercise the chardev interface to
> discover configure and read back data, and then a FUSE server on top of this to
> optionally expose the human readable FS....I suppose our internal and external
> customers can use the FS interface to validate/test/script on one side, OR
> simply code their own tools/libs to use directly the bare chardev interface...
> 
> ...we do have a tools team already working on a library to ease all of this
> SCMI Telemetry collection and analysis...it is just a matter of re-targeting
> the kind of lower level interfaces that they are using in the near future
> probably (they were already planning indeed AFAIK to use a more performant
> interface than the FS...)

Good.

> 
>>
>> One thing that is not clear to me yet is how stuff would be configured, and how
>> possibly multiple users of libscmi would possibly interact.
>>
> 
> Configuration/discovery will happen via IOCTLs, while consuming the Data
> can happen:
> 
>  - all together in bulk via a device read fops
>  - a single DE via a targeted IOCTL
>  - direct access to the raw SCMI data via dev/mmap of the underlying SCMI
>    areas (that means the tool has to parse the SCMI format defined by the
>    spec on its own, without the currently provided Kernel mediation...)
> 
> Regarding the user concurrency, I have already explicitly pushed back on
> this with our own tools team: any concurrent read or configuration write is
> allowed and properly handled in a consistent way, BUT on the configuration
> side the last write/ioctl wins: there is NO in-kernel OR userspace
> co-ordination provided out of the box: IOW if you use multiple tools
> concurrently to apply conflicting configurations, it is none of our problem

Would concurrent reading work? I assume so, right?

> 
> ...similarly as if you have an actively running network configuration daemon
> and you try to set your IP manually...nobody will prevent you from doing this,
> the same netlink will be used freely by you on the shell and the daemon (if you
> have enough privilege), but you are going to get unexpected results...
> 
> I don't see the case either to enforce exclusive access for Telemetry resources:
> co-ordination is up to the user in my view...I mean if you have 2 tools
> configuring concurrently SCMI telemetry in a conflicting way something has been
> misconfigured somewhere
> 
> .....having said that, I understand that the concurrency co-ordination
> issue can be particularly tricky to spot and solve in userspace, so I DO
> expose a generation counter entry that is updated on any configuration
> change, so that a userspace app using Telemetry can monitor (poll) this
> counter to spot if someone else on the system is quietly suddenly applying
> configuration changes...

Okay, so a single writer (admin) changing stuff could get picked up by possibly
many concurrent readers?
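
To make the single-writer/many-readers pattern discussed above concrete, here
is a minimal userspace sketch. It is NOT the SCMI Telemetry interface: the
struct and function names (tlm_state, tlm_set_interval, tlm_config_changed)
are hypothetical, and a real tool would read the generation counter through
whatever entry the kernel interface ends up exposing. The only idea shown is
that each reader caches the last generation it saw and compares it on every
poll, while configuration writes stay "last write wins".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared telemetry configuration state. */
struct tlm_state {
	uint64_t generation;	/* bumped on every configuration change */
	uint32_t interval_ms;	/* example configuration knob (assumption) */
};

/* Writer side: apply a change (last write wins) and bump the counter. */
static void tlm_set_interval(struct tlm_state *st, uint32_t interval_ms)
{
	st->interval_ms = interval_ms;
	st->generation++;
}

/* Reader side: each reader remembers the last generation it has seen. */
struct tlm_reader {
	uint64_t seen_generation;
};

/* Returns true if the configuration changed since this reader last looked. */
static bool tlm_config_changed(struct tlm_reader *rd,
			       const struct tlm_state *st)
{
	if (rd->seen_generation == st->generation)
		return false;
	rd->seen_generation = st->generation;
	return true;
}
```

Each reader keeps its own cached generation, so any number of readers can
notice the same change independently without coordinating with the writer.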

> 
>>>
>>> Should/could such a tool live in the kernel tree (tools/) at least for
>>> ease of development/deployment ?
>>
>> I think OOT.
>>
> 
> Ok.
> 
> Sorry for the long email..I hope I have clarified the situation, anyway
> I am already moving to get rid of the in-kernel interface as advised in
> favour of a chardev kernel interface and an optional FUSE based FS...

Yes, thank you a lot, I hope it also helps Christian to help push this into the
right direction!

-- 
Cheers,

David


* Re: [PATCH v8 15/46] KVM: guest_memfd: Call arch invalidate hooks on conversion
From: Fuad Tabba @ 2026-06-19 10:09 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-15-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> When memory in guest_memfd is converted from private to shared, the
> platform-specific state associated with the guest-private pages must be
> invalidated or cleaned up.
>
> Iterate over the folios in the affected range and call the
> kvm_arch_gmem_invalidate() hook for each PFN range. This allows
> architectures to perform necessary teardown, such as updating hardware
> metadata or encryption states, before the pages are transitioned to the
> shared state.
>
> Invoke this helper after indicating to KVM's mmu code that an invalidation
> is in progress to stop in-flight page faults from succeeding.
>
> Reviewed-by: Fuad Tabba <tabba@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Coming back to this after working through the arm64/pKVM side. My
Reviewed-by here is from the previous round and the patch hasn't
changed, but I missed an implication for arm64.

kvm_arch_gmem_invalidate() is now called from two paths with the same
(start, end) signature: folio teardown (kvm_gmem_free_folio) and
private->shared conversion (here). For SNP/TDX that's fine, conversion is
destructive anyway. For pKVM the two need opposite content semantics:
conversion must preserve the page in place (same physical page, the point
of in-place conversion without encryption), while teardown must scrub it
before returning it to the host.

The hook gets only a pfn range with no indication of which caller it's
serving, so arm64 can't give the two paths the behaviour they need. It
would help to signal intent on the conversion path: a reason/flag, a
separate hook, or not routing non-destructive conversion through the
teardown hook.

arm64 isn't here yet, so this isn't urgent, but the hook is gaining a
second caller now, and it's cheaper to leave room for the distinction
than to change a generic contract other arches depend on later.
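
As a rough illustration of the "reason/flag" option, here is a toy userspace
sketch. Everything in it is hypothetical: the enum, its values and the
arch_gmem_invalidate() signature are made up for illustration and are not part
of this series or of any existing KVM interface. It only shows how a caller
supplied reason would let an arch scrub on teardown while preserving content
on an in-place conversion.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: why the range is being invalidated. Not in the series. */
enum gmem_invalidate_reason {
	GMEM_INVALIDATE_TEARDOWN,	/* folio is being returned to the host */
	GMEM_INVALIDATE_CONVERSION,	/* private->shared, converted in place */
};

/* Toy "arch hook": scrub only on teardown, keep contents on conversion. */
static void arch_gmem_invalidate(unsigned char *page, size_t len,
				 enum gmem_invalidate_reason reason)
{
	if (reason == GMEM_INVALIDATE_TEARDOWN)
		memset(page, 0, len);
	/* On conversion the content stays; only ownership state would change. */
}
```

A separate hook for conversion would achieve the same split without touching
the existing hook's signature; which of the two is preferable is a design
choice for the series.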

Cheers,
/fuad


> ---
>  virt/kvm/guest_memfd.c | 41 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 433f79047b9d1..3c94442bc8131 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -607,6 +607,42 @@ static bool kvm_gmem_is_safe_for_conversion(struct inode *inode, pgoff_t start,
>         return safe;
>  }
>
> +#ifdef CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE
> +static void kvm_gmem_invalidate(struct inode *inode, pgoff_t start, pgoff_t end)
> +{
> +       struct folio_batch fbatch;
> +       pgoff_t next = start;
> +       int i;
> +
> +       folio_batch_init(&fbatch);
> +       while (filemap_get_folios(inode->i_mapping, &next, end - 1, &fbatch)) {
> +               for (i = 0; i < folio_batch_count(&fbatch); ++i) {
> +                       struct folio *folio = fbatch.folios[i];
> +                       pgoff_t start_index, end_index;
> +                       kvm_pfn_t start_pfn, end_pfn;
> +
> +                       start_index = max(start, folio->index);
> +                       end_index = min(end, folio_next_index(folio));
> +                       /*
> +                        * end_index is either in folio or points to
> +                        * the first page of the next folio. Hence,
> +                        * all pages in range [start_index, end_index)
> +                        * are contiguous.
> +                        */
> +                       start_pfn = folio_file_pfn(folio, start_index);
> +                       end_pfn = start_pfn + end_index - start_index;
> +
> +                       kvm_arch_gmem_invalidate(start_pfn, end_pfn);
> +               }
> +
> +               folio_batch_release(&fbatch);
> +               cond_resched();
> +       }
> +}
> +#else
> +static void kvm_gmem_invalidate(struct inode *inode, pgoff_t start, pgoff_t end) {}
> +#endif
> +
>  static int __kvm_gmem_set_attributes(struct inode *inode, pgoff_t start,
>                                      size_t nr_pages, uint64_t attrs,
>                                      pgoff_t *err_index)
> @@ -647,7 +683,12 @@ static int __kvm_gmem_set_attributes(struct inode *inode, pgoff_t start,
>          */
>
>         kvm_gmem_invalidate_start(inode, start, end);
> +
> +       if (!to_private)
> +               kvm_gmem_invalidate(inode, start, end);
> +
>         mas_store_prealloc(&mas, xa_mk_value(attrs));
> +
>         kvm_gmem_invalidate_end(inode, start, end);
>  out:
>         filemap_invalidate_unlock(mapping);
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>


* Re: [PATCH v8 08/46] KVM: Provide generic interface for checking memory private/shared status
From: Suzuki K Poulose @ 2026-06-19  9:57 UTC (permalink / raw)
  To: Fuad Tabba, ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, aneesh.kumar, liam, Paolo Bonzini,
	Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen,
	Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt,
	Kiryl Shutsemau, Baoquan He, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <CA+EHjTxu32nQ+vPV7Zmcw76_-V4g2_g=P_UzRnnO2dP1PFO2ww@mail.gmail.com>

On 19/06/2026 09:21, Fuad Tabba wrote:
> On Fri, 19 Jun 2026 at 09:19, Fuad Tabba <tabba@google.com> wrote:
>>
>> On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
>> <devnull+ackerleytng.google.com@kernel.org> wrote:
>>>
>>> From: Sean Christopherson <seanjc@google.com>
>>>
>>> Introduce a generic kvm_mem_is_private() interface using a static call to
>>> determine if a GFN is private. This allows the implementation for checking
>>> a GFN's private/shared status to be set at runtime.
>>>
>>> In preparation for choosing implementations between a guest_memfd lookup
>>> and the existing VM attribute lookup, rename the existing
>>> VM-attribute-based check to kvm_vm_mem_is_private to emphasize that it
>>> looks up VM attributes.
>>>
>>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>>
>> (SoB fix plz)
>>
>> Reviewed-by: Fuad Tabba <tabba@google.com>
>>
>> Cheers,
>> /fuad
>>> ---
>>>   include/linux/kvm_host.h | 12 +++++++++++-
>>>   virt/kvm/kvm_main.c      | 15 +++++++++++++++
>>>   2 files changed, 26 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>>> index eb26d4ea8945a..3915da2a61778 100644
>>> --- a/include/linux/kvm_host.h
>>> +++ b/include/linux/kvm_host.h
>>> @@ -2546,7 +2546,7 @@ bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm,
>>>   bool kvm_arch_post_set_memory_attributes(struct kvm *kvm,
>>>                                           struct kvm_gfn_range *range);
>>>
>>> -static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>>> +static inline bool kvm_vm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> 
> Should have read the Sashiko review first, but where is this used?
> It's not used at all in this series...

See below:

> 
> /fuad
> 
>>>   {
>>>          return kvm_get_vm_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
>>>   }
>>> @@ -2557,6 +2557,16 @@ static inline bool kvm_mem_range_is_private(struct kvm *kvm, gfn_t start,
>>>                                                    KVM_MEMORY_ATTRIBUTE_PRIVATE,
>>>                                                    KVM_MEMORY_ATTRIBUTE_PRIVATE);
>>>   }
>>> +#endif  /* CONFIG_KVM_VM_MEMORY_ATTRIBUTES */
>>> +
>>> +#ifdef kvm_arch_has_private_mem
>>> +typedef bool (kvm_mem_is_private_t)(struct kvm *kvm, gfn_t gfn);
>>> +DECLARE_STATIC_CALL(__kvm_mem_is_private, kvm_mem_is_private_t);
>>> +
>>> +static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>>> +{
>>> +       return static_call(__kvm_mem_is_private)(kvm, gfn);
>>> +}
>>>   #else
>>>   static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>>>   {
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index 6669f1477013c..8b238e461b854 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -2627,6 +2627,20 @@ static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
>>>   }
>>>   #endif /* CONFIG_KVM_VM_MEMORY_ATTRIBUTES */
>>>
>>> +#ifdef kvm_arch_has_private_mem
>>> +DEFINE_STATIC_CALL_RET0(__kvm_mem_is_private, kvm_mem_is_private_t);
>>> +EXPORT_STATIC_CALL_GPL(__kvm_mem_is_private);
>>> +
>>> +static void kvm_init_memory_attributes(void)
>>> +{
>>> +#ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>>> +       static_call_update(__kvm_mem_is_private, kvm_vm_mem_is_private);
>>> +#endif
>>> +}


Here ^^ as the static call update ?
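
For readers less familiar with static calls, the pattern in the quoted hunk
can be approximated in plain C with a function pointer: a default target that
returns 0 (false) and an init step that retargets it. This is only an analogy
under stated assumptions. Real static calls patch the call site rather than
loading a pointer, and the names below (ret0, vm_mem_is_private's toy rule,
init_memory_attributes) are illustrative stand-ins, not the kernel code.

```c
#include <stdbool.h>

typedef bool (mem_is_private_t)(unsigned long gfn);

/* Default target, analogous to a RET0 static call: always "not private". */
static bool ret0(unsigned long gfn)
{
	(void)gfn;
	return false;
}

/* Toy attribute lookup (assumption): odd GFNs are treated as private. */
static bool vm_mem_is_private(unsigned long gfn)
{
	return gfn & 1;
}

static mem_is_private_t *mem_is_private_impl = ret0;

/* Generic entry point: dispatches to whichever target is installed. */
static bool mem_is_private(unsigned long gfn)
{
	return mem_is_private_impl(gfn);
}

/* Analogous to the static_call_update() in the quoted init function. */
static void init_memory_attributes(void)
{
	mem_is_private_impl = vm_mem_is_private;
}
```

In this sketch the renamed lookup is never called directly; its only use is as
the target installed by the init step.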


Suzuki


* [PATCH 5/5] arm64: dts: socfpga: stratix10: add hwmon node
From: tze.yee.ng @ 2026-06-19  9:38 UTC (permalink / raw)
  To: Guenter Roeck, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	linux-hwmon, devicetree, linux-kernel, Dinh Nguyen, Mahesh Rao,
	Jonathan Corbet, Shuah Khan, linux-doc
In-Reply-To: <cover.1781861409.git.tze.yee.ng@altera.com>

From: Tze Yee Ng <tze.yee.ng@altera.com>

Add an hwmon child node under the Stratix 10 service layer and describe
the SoCDK voltage and temperature sensors using the altr,stratix10-hwmon
compatible.

Signed-off-by: Nazim Amirul <muhammad.nazim.amirul.nazle.asmade@altera.com>
Signed-off-by: Tze Yee Ng <tze.yee.ng@altera.com>
---
 .../boot/dts/altera/socfpga_stratix10.dtsi    |  5 +++
 .../dts/altera/socfpga_stratix10_socdk.dts    | 33 +++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi b/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
index 0d9cad0c0351..afb11e6f6813 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
@@ -78,6 +78,11 @@ svc {
 			fpga_mgr: fpga-mgr {
 				compatible = "intel,stratix10-soc-fpga-mgr";
 			};
+
+			temp_volt: hwmon {
+				compatible = "altr,stratix10-hwmon";
+				status = "disabled";
+			};
 		};
 	};
 
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
index e2a1cea7f3da..01a8ffe430ed 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
@@ -134,3 +134,36 @@ root: partition@4200000 {
 		};
 	};
 };
+
+&temp_volt {
+	status = "okay";
+
+	voltage {
+		#address-cells = <1>;
+		#size-cells = <0>;
+		input@2 {
+			label = "0.8V VCC";
+			reg = <2>;
+		};
+
+		input@3 {
+			label = "1.8V VCCIO_SDM";
+			reg = <3>;
+		};
+
+		input@6 {
+			label = "0.9V VCCERAM";
+			reg = <6>;
+		};
+	};
+
+	temperature {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		input@0 {
+			label = "Main Die SDM";
+			reg = <0x0>;
+		};
+	};
+};
-- 
2.43.7



* [PATCH 4/5] hwmon: add Stratix 10 SoC FPGA hardware monitor driver
From: tze.yee.ng @ 2026-06-19  9:38 UTC (permalink / raw)
  To: Guenter Roeck, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	linux-hwmon, devicetree, linux-kernel, Dinh Nguyen, Mahesh Rao,
	Jonathan Corbet, Shuah Khan, linux-doc
In-Reply-To: <cover.1781861409.git.tze.yee.ng@altera.com>

From: Tze Yee Ng <tze.yee.ng@altera.com>

Add a hardware monitoring driver for Altera Stratix 10 SoC FPGA devices
that reads temperature and voltage sensors through the Stratix 10 service
layer. Use the asynchronous service layer interface when available, with
a synchronous fallback.

Signed-off-by: Nazim Amirul <muhammad.nazim.amirul.nazle.asmade@altera.com>
Signed-off-by: Tze Yee Ng <tze.yee.ng@altera.com>
---
 Documentation/hwmon/index.rst           |   1 +
 Documentation/hwmon/stratix10-hwmon.rst |  31 ++
 MAINTAINERS                             |   2 +
 drivers/hwmon/Kconfig                   |  10 +
 drivers/hwmon/Makefile                  |   1 +
 drivers/hwmon/stratix10-hwmon.c         | 575 ++++++++++++++++++++++++
 6 files changed, 620 insertions(+)
 create mode 100644 Documentation/hwmon/stratix10-hwmon.rst
 create mode 100644 drivers/hwmon/stratix10-hwmon.c

diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst
index 8b655e5d6b68..30f533301903 100644
--- a/Documentation/hwmon/index.rst
+++ b/Documentation/hwmon/index.rst
@@ -244,6 +244,7 @@ Hardware Monitoring Kernel Drivers
    sparx5-temp
    spd5118
    stpddc60
+   stratix10-hwmon
    surface_fan
    sy7636a-hwmon
    tc654
diff --git a/Documentation/hwmon/stratix10-hwmon.rst b/Documentation/hwmon/stratix10-hwmon.rst
new file mode 100644
index 000000000000..61b682fe177a
--- /dev/null
+++ b/Documentation/hwmon/stratix10-hwmon.rst
@@ -0,0 +1,31 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Kernel driver stratix10-hwmon
+=============================
+
+Supported chips:
+
+ * Altera Stratix 10 SoC FPGA
+
+Authors:
+      - Nazim Amirul <muhammad.nazim.amirul.nazle.asmade@altera.com>
+      - Tze Yee Ng <tze.yee.ng@altera.com>
+
+Description
+-----------
+
+This driver supports hardware monitoring for Altera Stratix 10 SoC FPGA
+devices through the Secure Device Manager and Stratix 10 service layer.
+
+The following sensor types are supported:
+
+  * temperature
+  * voltage
+
+Usage Notes
+-----------
+
+The driver relies on a device tree node to enumerate sensors present on the
+specific device. See
+Documentation/devicetree/bindings/hwmon/altr,stratix10-hwmon.yaml for details
+of the device-tree node.
diff --git a/MAINTAINERS b/MAINTAINERS
index 678f6c429627..5afdf286f8f9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -943,6 +943,8 @@ M:	Tze Yee Ng <tze.yee.ng@altera.com>
 L:	linux-hwmon@vger.kernel.org
 S:	Maintained
 F:	Documentation/devicetree/bindings/hwmon/altr,stratix10-hwmon.yaml
+F:	Documentation/hwmon/stratix10-hwmon.rst
+F:	drivers/hwmon/stratix10-hwmon.c
 
 ALTERA MAILBOX DRIVER
 M:	Tien Sung Ang <tiensung.ang@altera.com>
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 14e4cea48acc..8eff1c71a226 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -2112,6 +2112,16 @@ config SENSORS_SMSC47M192
 	  This driver can also be built as a module. If so, the module
 	  will be called smsc47m192.
 
+config SENSORS_ALTERA_SOCFPGA_STRATIX10
+	tristate "Altera SoC FPGA Stratix 10 hardware monitoring features"
+	depends on INTEL_STRATIX10_SERVICE
+	help
+	  If you say yes here you get support for the temperature and
+	  voltage sensors of Altera SoC FPGA Stratix 10 devices.
+
+	  This driver can also be built as a module. If so, the module
+	  will be called stratix10-hwmon.
+
 config SENSORS_SMSC47B397
 	tristate "SMSC LPC47B397-NC"
 	depends on HAS_IOPORT
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index 982ee2c6f9de..7e643de0e7d4 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -217,6 +217,7 @@ obj-$(CONFIG_SENSORS_SMPRO)	+= smpro-hwmon.o
 obj-$(CONFIG_SENSORS_SMSC47B397)+= smsc47b397.o
 obj-$(CONFIG_SENSORS_SMSC47M1)	+= smsc47m1.o
 obj-$(CONFIG_SENSORS_SMSC47M192)+= smsc47m192.o
+obj-$(CONFIG_SENSORS_ALTERA_SOCFPGA_STRATIX10)	+= stratix10-hwmon.o
 obj-$(CONFIG_SENSORS_SPARX5)	+= sparx5-temp.o
 obj-$(CONFIG_SENSORS_SPD5118)	+= spd5118.o
 obj-$(CONFIG_SENSORS_STTS751)	+= stts751.o
diff --git a/drivers/hwmon/stratix10-hwmon.c b/drivers/hwmon/stratix10-hwmon.c
new file mode 100644
index 000000000000..7ed1116e57b8
--- /dev/null
+++ b/drivers/hwmon/stratix10-hwmon.c
@@ -0,0 +1,575 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Altera Stratix 10 SoC FPGA hardware monitoring driver
+ *
+ * Copyright (c) 2026 Altera Corporation
+ *
+ * Authors:
+ *	Nazim Amirul <muhammad.nazim.amirul.nazle.asmade@altera.com>
+ *	Tze Yee Ng <tze.yee.ng@altera.com>
+ */
+
+#include <linux/bitops.h>
+#include <linux/cleanup.h>
+#include <linux/completion.h>
+#include <linux/delay.h>
+#include <linux/firmware/intel/stratix10-svc-client.h>
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+
+#define HWMON_TIMEOUT			msecs_to_jiffies(SVC_HWMON_REQUEST_TIMEOUT_MS)
+#define HWMON_RETRY_SLEEP_MS		1U
+#define HWMON_ASYNC_MSG_RETRY		3U
+#define STRATIX10_HWMON_MAXSENSORS	16
+#define STRATIX10_HWMON_TEMPERATURE	"temperature"
+#define STRATIX10_HWMON_VOLTAGE		"voltage"
+#define STRATIX10_HWMON_CHANNEL_MASK	GENMASK(15, 0)
+#define STRATIX10_HWMON_PAGE_SHIFT	16
+#define STRATIX10_HWMON_ATTR_VISIBLE	0444
+/* Temperature from SDM is signed Q8.8 millidegrees Celsius (8 fractional bits). */
+#define STRATIX10_HWMON_TEMP_FRAC_BITS	8
+#define STRATIX10_HWMON_TEMP_FRAC_DIV	BIT(STRATIX10_HWMON_TEMP_FRAC_BITS)
+/* Voltage from SDM is unsigned Q16 (millivolts, 16 fractional bits). */
+#define STRATIX10_HWMON_VOLT_FRAC_BITS	16
+#define STRATIX10_HWMON_VOLT_FRAC_DIV	BIT(STRATIX10_HWMON_VOLT_FRAC_BITS)
+
+#define ETEMP_INACTIVE			0x80000000U
+#define ETEMP_TOO_OLD			0x80000001U
+#define ETEMP_NOT_PRESENT		0x80000002U
+#define ETEMP_TIMEOUT			0x80000003U
+#define ETEMP_CORRUPT			0x80000004U
+#define ETEMP_BUSY			0x80000005U
+#define ETEMP_NOT_INITIALIZED		0x800000FFU
+
+struct stratix10_hwmon_priv {
+	struct stratix10_svc_chan *chan;
+	struct stratix10_svc_client client;
+	struct completion completion;
+	struct mutex lock;	/* protect SVC calls */
+	bool async;
+	u32 temperature;
+	u32 voltage;
+	int temperature_channels;
+	int voltage_channels;
+	const char *temp_chan_names[STRATIX10_HWMON_MAXSENSORS];
+	const char *volt_chan_names[STRATIX10_HWMON_MAXSENSORS];
+	u32 temp_chan[STRATIX10_HWMON_MAXSENSORS];
+	u32 volt_chan[STRATIX10_HWMON_MAXSENSORS];
+};
+
+static umode_t stratix10_hwmon_is_visible(const void *dev,
+					  enum hwmon_sensor_types type,
+					 u32 attr, int chan)
+{
+	const struct stratix10_hwmon_priv *priv = dev;
+
+	switch (type) {
+	case hwmon_temp:
+		if (chan < priv->temperature_channels)
+			return STRATIX10_HWMON_ATTR_VISIBLE;
+		return 0;
+	case hwmon_in:
+		if (chan < priv->voltage_channels)
+			return STRATIX10_HWMON_ATTR_VISIBLE;
+		return 0;
+	default:
+		return 0;
+	}
+}
+
+static void stratix10_hwmon_readtemp_cb(struct stratix10_svc_client *client,
+					struct stratix10_svc_cb_data *data)
+{
+	struct stratix10_hwmon_priv *priv = client->priv;
+
+	if (data->status == BIT(SVC_STATUS_OK)) {
+		priv->temperature = (u32)*(unsigned long *)data->kaddr1;
+	} else if (data->kaddr1) {
+		dev_err(client->dev, "%s failed with status 0x%x, value 0x%lx\n",
+			__func__, data->status,
+			*(unsigned long *)data->kaddr1);
+	} else {
+		dev_err(client->dev, "%s failed with status 0x%x\n",
+			__func__, data->status);
+	}
+
+	complete(&priv->completion);
+}
+
+static void stratix10_hwmon_readvolt_cb(struct stratix10_svc_client *client,
+					struct stratix10_svc_cb_data *data)
+{
+	struct stratix10_hwmon_priv *priv = client->priv;
+
+	if (data->status == BIT(SVC_STATUS_OK)) {
+		priv->voltage = (u32)*(unsigned long *)data->kaddr1;
+	} else if (data->kaddr1) {
+		dev_err(client->dev, "%s failed with status 0x%x, value 0x%lx\n",
+			__func__, data->status,
+			*(unsigned long *)data->kaddr1);
+	} else {
+		dev_err(client->dev, "%s failed with status 0x%x\n",
+			__func__, data->status);
+	}
+
+	complete(&priv->completion);
+}
+
+static void stratix10_hwmon_async_callback(void *ptr)
+{
+	if (ptr)
+		complete(ptr);
+}
+
+static int stratix10_hwmon_parse_temp(long *val, u32 temperature)
+{
+	switch (temperature) {
+	case ETEMP_INACTIVE:
+	case ETEMP_NOT_PRESENT:
+	case ETEMP_CORRUPT:
+	case ETEMP_NOT_INITIALIZED:
+		return -EOPNOTSUPP;
+	case ETEMP_TIMEOUT:
+	case ETEMP_BUSY:
+	case ETEMP_TOO_OLD:
+		return -EAGAIN;
+	default:
+		/* Convert Q8.8 millidegrees Celsius to millidegrees for hwmon. */
+		*val = (long)(s32)temperature / STRATIX10_HWMON_TEMP_FRAC_DIV;
+		return 0;
+	}
+}
+
+static int stratix10_hwmon_encode_temp_arg(u32 reg, u64 *arg)
+{
+	u32 page = (reg >> STRATIX10_HWMON_PAGE_SHIFT) & STRATIX10_HWMON_CHANNEL_MASK;
+	u32 channel = reg & STRATIX10_HWMON_CHANNEL_MASK;
+
+	if (channel >= STRATIX10_HWMON_MAXSENSORS)
+		return -EINVAL;
+
+	*arg = (1ULL << channel) | ((u64)page << STRATIX10_HWMON_PAGE_SHIFT);
+	return 0;
+}
+
+static int stratix10_hwmon_encode_volt_arg(u32 reg, u64 *arg)
+{
+	u32 channel = reg & STRATIX10_HWMON_CHANNEL_MASK;
+
+	if (channel >= STRATIX10_HWMON_MAXSENSORS)
+		return -EINVAL;
+
+	*arg = 1ULL << channel;
+	return 0;
+}
+
+static int stratix10_hwmon_async_read(struct device *dev,
+				      enum hwmon_sensor_types type,
+				     struct stratix10_svc_client_msg *msg)
+{
+	struct stratix10_hwmon_priv *priv = dev_get_drvdata(dev);
+	struct stratix10_svc_cb_data data = {};
+	struct completion completion;
+	unsigned long wait_ret;
+	void *handle = NULL;
+	int status, index, ret;
+
+	init_completion(&completion);
+
+	for (index = 0; index < HWMON_ASYNC_MSG_RETRY; index++) {
+		status = stratix10_svc_async_send(priv->chan, msg, &handle,
+						  stratix10_hwmon_async_callback,
+						  &completion);
+		if (status == 0)
+			break;
+		dev_warn(dev, "Failed to send async message\n");
+		msleep(HWMON_RETRY_SLEEP_MS);
+	}
+
+	if (status && !handle)
+		return status;
+
+	wait_ret = wait_for_completion_io_timeout(&completion, HWMON_TIMEOUT);
+	if (wait_ret > 0)
+		dev_dbg(dev, "Received async interrupt\n");
+	else if (wait_ret == 0)
+		dev_dbg(dev, "Timeout occurred, trying to poll the response\n");
+
+	ret = -ETIMEDOUT;
+	for (index = 0; index < HWMON_ASYNC_MSG_RETRY; index++) {
+		status = stratix10_svc_async_poll(priv->chan, handle, &data);
+		if (status == -EAGAIN) {
+			dev_dbg(dev, "Async message is still in progress\n");
+		} else if (status < 0) {
+			dev_alert(dev, "Failed to poll async message: %d\n", status);
+			ret = status;
+			break;
+		} else if (status == 0) {
+			ret = 0;
+			break;
+		}
+		msleep(HWMON_RETRY_SLEEP_MS);
+	}
+
+	if (ret) {
+		dev_err(dev, "Failed to get async response\n");
+		goto done;
+	}
+
+	if (data.status) {
+		dev_err(dev, "%s returned 0x%x from SDM\n", __func__,
+			data.status);
+		ret = -EFAULT;
+		goto done;
+	}
+
+	if (type == hwmon_temp)
+		priv->temperature = (u32)*(unsigned long *)data.kaddr1;
+	else
+		priv->voltage = (u32)*(unsigned long *)data.kaddr1;
+
+	ret = 0;
+
+done:
+	stratix10_svc_async_done(priv->chan, handle);
+	return ret;
+}
+
+static int stratix10_hwmon_sync_read(struct device *dev,
+				     enum hwmon_sensor_types type,
+				    struct stratix10_svc_client_msg *msg)
+{
+	struct stratix10_hwmon_priv *priv = dev_get_drvdata(dev);
+	int ret;
+
+	reinit_completion(&priv->completion);
+
+	if (type == hwmon_temp)
+		priv->client.receive_cb = stratix10_hwmon_readtemp_cb;
+	else
+		priv->client.receive_cb = stratix10_hwmon_readvolt_cb;
+
+	ret = stratix10_svc_send(priv->chan, msg);
+	if (ret < 0)
+		goto status_done;
+
+	ret = wait_for_completion_interruptible_timeout(&priv->completion,
+							HWMON_TIMEOUT);
+	if (!ret) {
+		dev_err(priv->client.dev, "timeout waiting for SMC call\n");
+		ret = -ETIMEDOUT;
+		goto status_done;
+	}
+	if (ret < 0) {
+		dev_err(priv->client.dev, "error %d waiting for SMC call\n", ret);
+		goto status_done;
+	}
+
+	ret = 0;
+
+status_done:
+	stratix10_svc_done(priv->chan);
+	return ret;
+}
+
+static int stratix10_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
+				u32 attr, int chan, long *val)
+{
+	struct stratix10_hwmon_priv *priv = dev_get_drvdata(dev);
+	struct stratix10_svc_client_msg msg = {0};
+	int ret;
+
+	if (chan >= STRATIX10_HWMON_MAXSENSORS)
+		return -EOPNOTSUPP;
+
+	switch (type) {
+	case hwmon_temp:
+		ret = stratix10_hwmon_encode_temp_arg(priv->temp_chan[chan],
+						      &msg.arg[0]);
+		if (ret)
+			return ret;
+		msg.command = COMMAND_HWMON_READTEMP;
+		break;
+	case hwmon_in:
+		ret = stratix10_hwmon_encode_volt_arg(priv->volt_chan[chan],
+						      &msg.arg[0]);
+		if (ret)
+			return ret;
+		msg.command = COMMAND_HWMON_READVOLT;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	guard(mutex)(&priv->lock);
+	if (priv->async)
+		ret = stratix10_hwmon_async_read(dev, type, &msg);
+	else
+		ret = stratix10_hwmon_sync_read(dev, type, &msg);
+	if (ret)
+		return ret;
+
+	if (type == hwmon_temp)
+		ret = stratix10_hwmon_parse_temp(val, priv->temperature);
+	else
+		/* Convert Q16 millivolts to millivolts for hwmon. */
+		*val = (long)priv->voltage / STRATIX10_HWMON_VOLT_FRAC_DIV;
+	return ret;
+}
+
+static int stratix10_hwmon_read_string(struct device *dev,
+				       enum hwmon_sensor_types type, u32 attr,
+				      int chan, const char **str)
+{
+	struct stratix10_hwmon_priv *priv = dev_get_drvdata(dev);
+
+	switch (type) {
+	case hwmon_in:
+		*str = priv->volt_chan_names[chan];
+		return 0;
+	case hwmon_temp:
+		*str = priv->temp_chan_names[chan];
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static const struct hwmon_ops stratix10_hwmon_ops = {
+	.is_visible = stratix10_hwmon_is_visible,
+	.read = stratix10_hwmon_read,
+	.read_string = stratix10_hwmon_read_string,
+};
+
+static const struct hwmon_channel_info *stratix10_hwmon_info[] = {
+	HWMON_CHANNEL_INFO(temp,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL,
+			   HWMON_T_INPUT | HWMON_T_LABEL),
+	HWMON_CHANNEL_INFO(in,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL,
+			   HWMON_I_INPUT | HWMON_I_LABEL),
+	NULL
+};
+
+static const struct hwmon_chip_info stratix10_hwmon_chip_info = {
+	.ops = &stratix10_hwmon_ops,
+	.info = stratix10_hwmon_info,
+};
+
+static int stratix10_hwmon_add_channel(struct device *dev, const char *type,
+				       u32 val, const char *label,
+				       struct stratix10_hwmon_priv *priv)
+{
+	if (!strcmp(type, STRATIX10_HWMON_TEMPERATURE)) {
+		if (priv->temperature_channels >= STRATIX10_HWMON_MAXSENSORS) {
+			dev_warn(dev, "Can't add temp node %s, too many channels\n",
+				 label);
+			return 0;
+		}
+
+		priv->temp_chan_names[priv->temperature_channels] = label;
+		priv->temp_chan[priv->temperature_channels] = val;
+		priv->temperature_channels++;
+		return 0;
+	}
+
+	if (!strcmp(type, STRATIX10_HWMON_VOLTAGE)) {
+		if (priv->voltage_channels >= STRATIX10_HWMON_MAXSENSORS) {
+			dev_warn(dev, "Can't add voltage node %s, too many channels\n",
+				 label);
+			return 0;
+		}
+
+		priv->volt_chan_names[priv->voltage_channels] = label;
+		priv->volt_chan[priv->voltage_channels] = val;
+		priv->voltage_channels++;
+		return 0;
+	}
+
+	dev_warn(dev, "unsupported sensor type %s\n", type);
+	return 0;
+}
+
+static int stratix10_hwmon_probe_child_from_dt(struct device *dev,
+					       struct device_node *child,
+					       struct stratix10_hwmon_priv *priv)
+{
+	struct device_node *grandchild;
+	const char *label;
+	u32 val;
+	int ret;
+
+	for_each_child_of_node(child, grandchild) {
+		ret = of_property_read_u32(grandchild, "reg", &val);
+		if (ret) {
+			dev_err(dev, "missing reg property of %pOFn\n",
+				grandchild);
+			of_node_put(grandchild);
+			return ret;
+		}
+
+		ret = of_property_read_string(grandchild, "label", &label);
+		if (ret)
+			label = grandchild->name;
+
+		stratix10_hwmon_add_channel(dev, child->name, val, label, priv);
+	}
+
+	return 0;
+}
+
+static int stratix10_hwmon_probe_from_dt(struct device *dev,
+					 struct stratix10_hwmon_priv *priv)
+{
+	struct device_node *child;
+	int ret;
+
+	if (!dev->of_node)
+		return 0;
+
+	for_each_child_of_node(dev->of_node, child) {
+		ret = stratix10_hwmon_probe_child_from_dt(dev, child, priv);
+		if (ret) {
+			of_node_put(child);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int stratix10_hwmon_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct stratix10_hwmon_priv *priv;
+	struct device *hwmon_dev;
+	int ret;
+
+	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->client.dev = dev;
+	priv->client.priv = priv;
+	init_completion(&priv->completion);
+	mutex_init(&priv->lock);
+
+	ret = stratix10_hwmon_probe_from_dt(dev, priv);
+	if (ret) {
+		dev_err(dev, "Unable to probe from device tree\n");
+		return ret;
+	}
+
+	if (!priv->temperature_channels && !priv->voltage_channels) {
+		dev_err(dev, "no temperature or voltage channels in device tree\n");
+		return -ENODEV;
+	}
+
+	priv->chan = stratix10_svc_request_channel_byname(&priv->client,
+							  SVC_CLIENT_HWMON);
+	if (IS_ERR(priv->chan)) {
+		ret = PTR_ERR(priv->chan);
+		if (ret == -EPROBE_DEFER)
+			dev_dbg(dev, "service channel %s not ready, deferring probe\n",
+				SVC_CLIENT_HWMON);
+		else
+			dev_err(dev, "couldn't get service channel %s: %d\n",
+				SVC_CLIENT_HWMON, ret);
+		return ret;
+	}
+
+	ret = stratix10_svc_add_async_client(priv->chan, false);
+	switch (ret) {
+	case 0:
+		priv->async = true;
+		break;
+	case -EINVAL:
+		dev_dbg(dev, "async operations not supported, using sync mode\n");
+		priv->async = false;
+		break;
+	default:
+		dev_err(dev, "failed to add async client: %d\n", ret);
+		stratix10_svc_free_channel(priv->chan);
+		return ret;
+	}
+
+	dev_info(dev, "Initialized %d temperature and %d voltage channels\n",
+		 priv->temperature_channels, priv->voltage_channels);
+
+	hwmon_dev = devm_hwmon_device_register_with_info(dev, "stratix10_hwmon",
+							 priv,
+							 &stratix10_hwmon_chip_info,
+							 NULL);
+	if (IS_ERR(hwmon_dev)) {
+		if (priv->async)
+			stratix10_svc_remove_async_client(priv->chan);
+		stratix10_svc_free_channel(priv->chan);
+		return PTR_ERR(hwmon_dev);
+	}
+
+	platform_set_drvdata(pdev, priv);
+	return 0;
+}
+
+static void stratix10_hwmon_remove(struct platform_device *pdev)
+{
+	struct stratix10_hwmon_priv *priv = platform_get_drvdata(pdev);
+
+	if (priv->async)
+		stratix10_svc_remove_async_client(priv->chan);
+	stratix10_svc_free_channel(priv->chan);
+}
+
+static const struct of_device_id stratix10_hwmon_of_match[] = {
+	{ .compatible = "altr,stratix10-hwmon" },
+	{}
+};
+MODULE_DEVICE_TABLE(of, stratix10_hwmon_of_match);
+
+static struct platform_driver stratix10_hwmon_driver = {
+	.driver = {
+		.name = "stratix10-hwmon",
+		.of_match_table = stratix10_hwmon_of_match,
+	},
+	.probe = stratix10_hwmon_probe,
+	.remove = stratix10_hwmon_remove,
+};
+module_platform_driver(stratix10_hwmon_driver);
+
+MODULE_AUTHOR("Nazim Amirul <muhammad.nazim.amirul.nazle.asmade@altera.com>");
+MODULE_AUTHOR("Tze Yee Ng <tze.yee.ng@altera.com>");
+MODULE_DESCRIPTION("Altera Stratix 10 SoC FPGA hardware monitoring driver");
+MODULE_LICENSE("GPL");
-- 
2.43.7
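
For readers following the voltage path in stratix10_hwmon_read(): the comment
there says the raw reading is a Q16 fixed-point millivolt value that is divided
down to whole millivolts before being handed to hwmon. The sketch below is a
user-space illustration of that arithmetic only, not the driver's code. The
names `Q16_FRAC_DIV` and `q16_to_whole()` are hypothetical, and the divisor of
65536 (16 fractional bits) is an assumption, since the definition of
STRATIX10_HWMON_VOLT_FRAC_DIV is not shown in this part of the patch.

```c
#include <stdint.h>

/*
 * Assumption: a Q16 value carries 16 fractional bits, so dividing by
 * 1 << 16 yields the integer part. This mirrors the shape of
 * "(long)priv->voltage / STRATIX10_HWMON_VOLT_FRAC_DIV" in the patch,
 * but the real divisor is defined elsewhere in the driver.
 */
#define Q16_FRAC_DIV (1L << 16)

/* Hypothetical helper: drop the fractional bits of a Q16 reading. */
static long q16_to_whole(uint32_t raw)
{
	/* Integer division truncates, so any fractional part is discarded. */
	return (long)raw / Q16_FRAC_DIV;
}
```

Under these assumptions a raw reading of 800 << 16 converts to 800, and so does
(800 << 16) + 0x8000, because the half-unit fraction is truncated rather than
rounded.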

