Linux Documentation
* Re: [PATCH v2] Fail the build on RUST=y and RUST_IS_AVAILABLE=n
From: Neal Gompa @ 2026-05-21 10:14 UTC
  To: Sasha Finkelstein
  Cc: Alice Ryhl, Andreas Hindborg, Benno Lossin, Björn Roy Baron,
	Boqun Feng, Danilo Krummrich, Gary Guo, Jonathan Corbet,
	Miguel Ojeda, Shuah Khan, Trevor Gross, linux-doc, linux-kernel,
	rust-for-linux
In-Reply-To: <20260521-evolve-to-crab-v2-1-c18e0e98fc54@chaosmail.tech>

On Thu, May 21, 2026 at 4:32 AM Sasha Finkelstein <k@chaosmail.tech> wrote:
>
> The current approach of silently disabling all Rust drivers if the
> toolchain is missing results in users who compile their own kernels
> getting a "successful" build and then being confused about where their
> drivers went. In comparison, missing openssl results in a build
> failure, not a disappearance of everything that depends on it.
>
> This also means that allyesconfig will depend on Rust, but since the
> Rust experiment concluded with "Rust is here to stay", I believe that
> allyesconfig should be building Rust drivers too.
>
> Signed-off-by: Sasha Finkelstein <k@chaosmail.tech>
> ---
> Changes in v2:
> - No longer an RFC, let's make it happen.
> - Update the docs.
> - Link to v1: https://patch.msgid.link/20260510-evolve-to-crab-v1-1-208df84e67be@chaosmail.tech
> ---
>  Documentation/rust/quick-start.rst | 6 +++---
>  init/Kconfig                       | 1 -
>  2 files changed, 3 insertions(+), 4 deletions(-)
>

At this point, yes, we should just go ahead and do this.

Reviewed-by: Neal Gompa <neal@gompa.dev>


-- 
真実はいつも一つ!/ Always, there's only one truth!


* [PATCH v7 4/8] docs/zh_CN: Add chipidea.rst translation
From: Kefan Bai @ 2026-05-21  9:55 UTC
  To: linux-usb, si.yanteng
  Cc: gregkh, seakeel, alexs, dzm91, corbet, skhan, linux-doc, doubled
In-Reply-To: <cover.1779355170.git.baikefan@leap-io-kernel.com>

Translate .../usb/chipidea.rst into Chinese

Update the translation through commit e4157519ad46
("Documentation: usb: correct spelling")

Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Signed-off-by: Kefan Bai <baikefan@leap-io-kernel.com>
---
 .../translations/zh_CN/usb/chipidea.rst       | 150 ++++++++++++++++++
 .../translations/zh_CN/usb/index.rst          |   2 +-
 2 files changed, 151 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/translations/zh_CN/usb/chipidea.rst

diff --git a/Documentation/translations/zh_CN/usb/chipidea.rst b/Documentation/translations/zh_CN/usb/chipidea.rst
new file mode 100644
index 000000000000..ea0dc3043189
--- /dev/null
+++ b/Documentation/translations/zh_CN/usb/chipidea.rst
@@ -0,0 +1,150 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/usb/chipidea.rst
+
+:翻译:
+
+ 白钶凡 Kefan Bai <baikefan@leap-io-kernel.com>
+
+:校译:
+
+
+=============================
+ChipIdea 高速双角色控制器驱动
+=============================
+
+1. 如何测试 OTG FSM(HNP 和 SRP)
+---------------------------------
+
+下面以两块 Freescale i.MX6Q Sabre SD 开发板为例,
+说明如何通过 sysfs 输入文件演示 OTG 的 HNP 和 SRP 功能。
+
+1.1 如何使能 OTG FSM
+--------------------
+
+1.1.1 在 ``menuconfig`` 中选择 ``CONFIG_USB_OTG_FSM``,并重新编译内核
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+重新构建内核镜像和模块。如果想查看 OTG FSM 的
+一些内部变量,可以挂载 ``debugfs``;其中有两个文件
+可以显示 OTG FSM 变量以及部分控制器寄存器的值::
+
+	cat /sys/kernel/debug/ci_hdrc.0/otg
+	cat /sys/kernel/debug/ci_hdrc.0/registers
+
+1.1.2 在控制器节点对应的 ``dts`` 文件中添加以下条目
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+::
+
+	otg-rev = <0x0200>;
+	adp-disable;
+
+1.2 测试步骤
+------------
+
+1) 给两块 Freescale i.MX6Q Sabre SD 开发板上电,
+   并加载 gadget 类驱动(例如 ``g_mass_storage``)。
+
+2) 用 USB 线连接两块开发板:
+   一端是 micro A 插头,另一端是 micro B 插头。
+
+   插入 micro A 插头的一端是 A 设备,它应枚举另一端的 B 设备。
+
+3) 角色切换
+
+   在 B 设备上执行::
+
+	echo 1 > /sys/bus/platform/devices/ci_hdrc.0/inputs/b_bus_req
+
+   B 设备应接管主机角色并枚举 A 设备。
+
+4) A 设备切回主机角色
+
+   在 B 设备上执行::
+
+	echo 0 > /sys/bus/platform/devices/ci_hdrc.0/inputs/b_bus_req
+
+   或者,通过引入 HNP 轮询,B 端主机可以知道
+   A 端外设希望切换为主机角色,因此这次角色切换
+   也可以通过 A 端外设响应 B 端主机的轮询,
+   在 A 侧触发。
+   这可以通过在 A 设备上执行下面的命令来完成::
+
+	echo 1 > /sys/bus/platform/devices/ci_hdrc.0/inputs/a_bus_req
+
+   A 设备应切回主机角色并枚举 B 设备。
+
+5) 拔掉 B 设备(拔掉 micro B 插头),
+   并在 10 秒内重新插入;
+   A 设备应重新枚举 B 设备。
+
+6) 拔掉 B 设备(拔掉 micro B 插头),
+   并在 10 秒后重新插入;
+   A 设备不应重新枚举 B 设备。
+
+   如果 A 设备希望使用总线:
+
+   在 A 设备上执行::
+
+	echo 0 > /sys/bus/platform/devices/ci_hdrc.0/inputs/a_bus_drop
+	echo 1 > /sys/bus/platform/devices/ci_hdrc.0/inputs/a_bus_req
+
+   如果 B 设备希望使用总线:
+
+   在 B 设备上执行::
+
+	echo 1 > /sys/bus/platform/devices/ci_hdrc.0/inputs/b_bus_req
+
+7) A 设备关闭总线供电
+
+   在 A 设备上执行::
+
+	echo 1 > /sys/bus/platform/devices/ci_hdrc.0/inputs/a_bus_drop
+
+   A 设备应断开与 B 设备的连接,并关闭总线供电。
+
+8) B 设备发出 SRP 数据脉冲
+
+   在 B 设备上执行::
+
+	echo 1 > /sys/bus/platform/devices/ci_hdrc.0/inputs/b_bus_req
+
+   A 设备应恢复 USB 总线并枚举 B 设备。
+
+1.3 参考文档
+------------
+《On-The-Go and Embedded Host Supplement
+to the USB Revision 2.0 Specification
+July 27, 2012 Revision 2.0 version 1.1a》
+
+2. 如何将 USB 用作系统唤醒源
+----------------------------
+下面是在 i.MX6 平台上把 USB 用作系统唤醒源的示例。
+
+2.1 使能核心控制器的唤醒功能::
+
+	echo enabled > /sys/bus/platform/devices/ci_hdrc.0/power/wakeup
+
+2.2 使能 glue 层的唤醒功能::
+
+	echo enabled > /sys/bus/platform/devices/2184000.usb/power/wakeup
+
+2.3 使能 PHY 的唤醒功能(可选)::
+
+	echo enabled > /sys/bus/platform/devices/20c9000.usbphy/power/wakeup
+
+2.4 使能根集线器的唤醒功能::
+
+	echo enabled > /sys/bus/usb/devices/usb1/power/wakeup
+
+2.5 使能相关设备的唤醒功能::
+
+	echo enabled > /sys/bus/usb/devices/1-1/power/wakeup
+
+如果系统只有一个 USB 端口,
+而你希望在该端口上启用 USB 唤醒功能,
+可以使用下面的脚本::
+
+	for i in $(find /sys -name wakeup | grep usb);do echo enabled > $i;done;
diff --git a/Documentation/translations/zh_CN/usb/index.rst b/Documentation/translations/zh_CN/usb/index.rst
index 3480966fee19..e6d0a4fceff7 100644
--- a/Documentation/translations/zh_CN/usb/index.rst
+++ b/Documentation/translations/zh_CN/usb/index.rst
@@ -19,10 +19,10 @@ USB 支持

     acm
     authorization
+    chipidea

 Todolist:

-* chipidea
 * dwc3
 * ehci
 * usbmon
--
2.54.0



* Re: [PATCH net-next v3 03/14] libeth: allow to create fill queues without NAPI
From: Larysa Zaremba @ 2026-05-21 10:07 UTC
  To: Jakub Kicinski
  Cc: Tony Nguyen, davem, pabeni, edumazet, andrew+netdev, netdev,
	Pavan Kumar Linga, przemyslaw.kitszel, aleksander.lobakin,
	sridhar.samudrala, anjali.singhai, michal.swiatkowski,
	maciej.fijalkowski, emil.s.tantilov, joshua.a.hay, jacob.e.keller,
	jayaprakash.shanmugam, jiri, horms, corbet, richardcochran,
	linux-doc, Bharath R, Samuel Salin
In-Reply-To: <20260520184922.34c36c74@kernel.org>

On Wed, May 20, 2026 at 06:49:22PM -0700, Jakub Kicinski wrote:
> On Fri, 15 May 2026 15:44:27 -0700 Tony Nguyen wrote:
> > +int libeth_rx_fq_create(struct libeth_fq *fq, void *napi_dev)
> 
> Why do you have to pass an opaque void pointer?
> Just add another arg for dev.

I agree that type safety would be nice. But first, napi and dev are
mutually exclusive, and second, the call sites would look pretty ugly
unless we add a macro on top.

> 
>  int libeth_rx_fq_create(struct libeth_fq *fq, struct napi_struct *napi,
> 			 struct device *dev)
>  {
>  	struct page_pool_params pp = {
>  		.flags		= PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
>  		.order		= LIBETH_RX_PAGE_ORDER,
>  		.pool_size	= fq->count,
>  		.nid		= fq->nid,
> -		.dev		= napi->dev->dev.parent,
> -		.netdev		= napi->dev,
> +		.dev		= dev ? dev : napi->dev->dev.parent,
> +		.netdev		= napi ? napi->dev : NULL,


* [PATCH v7 6/8] docs/zh_CN: Add ehci.rst translation
From: Kefan Bai @ 2026-05-21  9:55 UTC
  To: linux-usb, si.yanteng
  Cc: gregkh, seakeel, alexs, dzm91, corbet, skhan, linux-doc, doubled
In-Reply-To: <cover.1779355170.git.baikefan@leap-io-kernel.com>

Translate .../usb/ehci.rst into Chinese

Update the translation through commit 570eb861243c
("docs: usb: replace some characters")

Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Signed-off-by: Kefan Bai <baikefan@leap-io-kernel.com>
---
 Documentation/translations/zh_CN/usb/ehci.rst | 261 ++++++++++++++++++
 .../translations/zh_CN/usb/index.rst          |   2 +-
 2 files changed, 262 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/translations/zh_CN/usb/ehci.rst

diff --git a/Documentation/translations/zh_CN/usb/ehci.rst b/Documentation/translations/zh_CN/usb/ehci.rst
new file mode 100644
index 000000000000..e05e493a30d3
--- /dev/null
+++ b/Documentation/translations/zh_CN/usb/ehci.rst
@@ -0,0 +1,261 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/usb/ehci.rst
+
+:翻译:
+
+ 白钶凡 Kefan Bai <baikefan@leap-io-kernel.com>
+
+:校译:
+
+
+=========
+EHCI 驱动
+=========
+
+2002年12月27日
+
+EHCI 驱动用于通过支持 USB 2.0 的主机控制器
+硬件与高速 USB 2.0 设备通信。USB 2.0 兼容
+USB 1.1 标准,它定义了三种传输速率:
+
+    - “高速”(High Speed)480 Mbit/sec(60 MByte/sec)
+    - “全速”(Full Speed)12 Mbit/sec(1.5 MByte/sec)
+    - “低速”(Low Speed)1.5 Mbit/sec
+
+USB 1.1 仅支持全速与低速。
+高速设备可以在 USB 1.1 系统上使用,
+但速度会降到 USB 1.1 的速率。
+
+USB 1.1 设备也可以在 USB 2.0 系统上使用。当它们
+插入 EHCI 控制器时,会被交由 USB 1.1 的伴随
+(companion)控制器处理,该控制器通常是 OHCI 或 UHCI。
+
+当 USB 1.1 设备插入 USB 2.0 集线器时,它们通过
+集线器中的事务转换器(Transaction Translator,TT)
+与 EHCI 控制器交互,该转换器将低速或全速事务转换为
+高速分割事务,从而避免浪费传输带宽。
+
+截至本文撰写时,该驱动已在以下 EHCI 实现上成功运行
+(按字母顺序):Intel、NEC、Philips 和 VIA。
+其他供应商的 EHCI 实现正在陆续问世;
+预计该驱动在这些实现上也可正常运行。
+
+自 2001 年年中起,usb-storage 设备就已可用
+(在 2.4 版该驱动上速度相当不错),
+集线器则直到 2001 年底才开始可用,而其他类型的高速设备
+似乎要等到更多系统内置 USB 2.0 后才会出现。
+这类新系统从 2002 年初开始上市,
+并在 2002 年下半年变得更加常见。
+
+注意,USB 2.0 支持并不只是 EHCI 本身。
+它还需要对 Linux-USB 核心 API 作出其他修改,
+包括 hub 驱动;不过这些修改并不需要真正改变
+暴露给 USB 设备驱动的基本 ``usbcore`` API。
+
+- David Brownell
+  <dbrownell@users.sourceforge.net>
+
+
+功能
+====
+
+该驱动会定期在 x86 硬件上进行测试,
+也已在 PPC 硬件上使用,因此大小端问题应当已经解决。
+因此可以认为,它已经处理好了所有必要的 PCI 细节,
+所以即便在 DMA 映射有些特殊的系统上,
+I/O 也应能正常运行。
+
+传输类型
+--------
+
+截至本文撰写时,该驱动应当已经能够很好地处理
+所有控制传输、批量传输和中断传输,
+包括通过 USB 2.0 集线器中的事务转换器
+与 USB 1.1 设备通信;但仍可能存在 bug。
+
+高速等时(ISO)传输支持也已可用,但截至本文撰写时,
+还没有 Linux 驱动使用这项支持。
+
+目前尚不支持通过事务转换器实现全速等时传输。
+需要注意,ISO 传输的 split transaction 支持
+与高速 ISO 传输几乎无法共用代码,
+因为 EHCI 用不同的数据结构表示它们。
+因此,目前大多数 USB 音频和视频设备
+还不能通过高速总线连接使用。
+
+驱动行为
+--------
+
+所有类型的传输都可以排队。
+这意味着来自一个接口驱动的控制传输
+(或通过 usbfs 发出的控制传输)不会干扰
+另一个驱动的控制传输,而且中断传输可以使用 1 帧的周期,
+而不必担心中断处理开销导致的数据丢失。
+
+
+EHCI 根集线器代码会将 USB 1.1 设备移交给其伴随控制器。
+该驱动不需要了解那些驱动的任何细节;
+一个原本就能正常工作的 OHCI 或 UHCI 驱动,
+并不会因为 EHCI 驱动也存在而需要更改。
+
+电源管理方面还有一些问题;
+当前挂起/恢复的行为还不完全正确。
+
+此外,在调度周期性事务
+(中断和等时传输)时还采取了一些简化处理。
+这些简化会限制可调度的周期性事务数量,
+并且无法使用小于一帧的轮询间隔。
+
+使用方式
+========
+
+假设有一个 EHCI 控制器(位于 PCI 卡或主板上),
+并且已将此驱动编译为模块,可这样加载::
+
+    # modprobe ehci-hcd
+
+卸载方式::
+
+    # rmmod ehci-hcd
+
+还应加载一个伴随控制器驱动,
+例如 ``ohci-hcd`` 或 ``uhci-hcd``。
+如果 EHCI 驱动出现任何问题,只需卸载它的模块,
+随后该伴随控制器驱动就会接手
+此前由 EHCI 驱动处理的所有设备
+(但速度会降低)。
+
+模块参数(传给 ``modprobe``)包括:
+
+    log2_irq_thresh(默认值 0):
+        默认中断延迟的 log2 值,单位是微帧。默认值 0 表示 1 个微帧
+        (125 微秒)。最大值 6 表示 2^6 = 64 个微帧。
+        该值控制 EHCI 控制器发出中断的频率。
+
+如果在 2.5 内核上使用此驱动,并且启用了 USB 调试支持,
+则会在任一 EHCI 控制器的 ``sysfs`` 目录中看到三个文件:
+
+    ``async``
+        转储异步调度,用于控制传输和批量传输。它会显示每个活动的 ``qh``
+        以及待处理的 ``qtd``,通常每个 ``urb`` 对应一个 ``qtd``。
+        (可以在 ``usb-storage`` 做磁盘 I/O 时看它;顺便观察请求队列!)
+
+    ``periodic``
+        转储周期性调度,用于中断传输和等时传输。不显示 ``qtd``。
+
+    ``registers``
+        显示控制器寄存器状态。
+
+这些文件的内容有助于定位驱动问题。
+
+
+设备驱动通常不需要关心自己是否运行在 EHCI 之上,
+但它们可能想检查
+``usb_device->speed == USB_SPEED_HIGH``。
+高速设备能做到全速(或低速)设备做不到的事,
+例如高带宽的周期性传输(中断或 ISO 传输)。
+另外,设备描述符中的某些值
+(例如周期性传输的轮询间隔)
+在高速模式下使用不同的编码方式。
+
+不过,一定要让设备驱动经过 USB 2.0 集线器的测试。
+当使用事务转换器时,这些集线器报告某些故障
+(例如断开连接)的方式会不同;
+已经见过一些驱动在遇到与 OHCI 或 UHCI
+所报告的不同故障时表现不佳。
+
+性能
+====
+
+USB 2.0 吞吐量主要受两个因素制约:
+主机控制器处理请求的速度,以及设备响应这些请求的速度。
+480 Mbit/sec 的“原始传输率”对所有设备都成立,
+但总吞吐量还会受到诸如单个高速包之间的延迟、
+驱动是否足够聪明,以及系统整体负载等因素的影响。
+延迟也是性能考量因素。
+
+批量传输最常用于关注吞吐量的场景。
+需要记住的是,批量传输总是以 512 字节包为单位,
+而一个 USB 2.0 微帧中最多只能容纳 13 个这样的包。
+8 个 USB 2.0 微帧构成一个 USB 1.1 帧;
+一个微帧的时长是 1 毫秒 / 8 = 125 微秒。
+
+因此,只要硬件和设备驱动软件都允许,
+批量传输可提供超过 50 MByte/sec 的带宽。
+周期性传输模式(等时和中断)允许使用更大的包大小,
+从而可以逼近所宣称的 480 Mbit/sec 传输率。
+
+硬件性能
+--------
+
+截至本文撰写时,单个 USB 2.0 设备的最大传输速率
+通常约为 20 MByte/sec。
+这当然会随着时间改变:一些设备现在更快,一些更慢。
+
+第一代 NEC EHCI 实现似乎存在
+大约 28 MByte/sec 的硬件瓶颈。
+虽然这对单个 20 MByte/sec 的设备显然已经够用,
+但把三个这样的设备挂到同一总线上,
+并不能得到 60 MByte/sec。
+问题似乎在于控制器硬件无法并发进行 USB 与 PCI 访问,
+因此它每个微帧只会尝试 6 次(也许是 7 次)
+USB 事务,而不是 13 次。
+(对一个比其他产品早上市一年的芯片来说,
+这是个合理的妥协!)
+
+
+预计较新的实现会在这方面做得更好,
+通过投入更多芯片面积来解决这个问题,
+使新的主板芯片组更接近 60 MByte/sec 的目标。
+这既包括 NEC 的更新实现,也包括其他厂商的芯片。
+
+
+主机从 EHCI 控制器收到“请求已完成”中断的最小延迟
+为一个微帧(125 微秒)。该延迟可以调节;
+驱动提供了一个模块选项。默认情况下,
+``ehci-hcd`` 使用最小延迟,这意味着当发出一个控制
+或批量请求时,通常可以在不到 250 微秒内得知它已完成
+(具体取决于传输大小)。
+
+软件性能
+--------
+
+即便只是要达到 20 MByte/sec 的传输速率,
+Linux-USB 设备驱动也必须让 EHCI 队列始终保持满载。
+这意味着要发出较大的请求,
+或者在需要发出一连串小请求时使用批量请求排队。
+如果驱动未做到这一点,那么会直接从性能结果上表现出来。
+
+
+在典型情况下,使用 ``usb_bulk_msg()``
+以 4 KB 块循环写出,
+会浪费超过一半的 USB 2.0 带宽。
+I/O 完成与驱动发出下一次请求之间的延迟,
+通常会比一次 I/O 本身耗时更长。
+如果同样的循环改用 16 KB 块,会好一些;
+若使用一连串 128 KB 块,则浪费会少得多。
+
+
+但与其依赖这么大的 I/O 缓冲区来让同步 I/O 高效,
+不如直接向主机控制器排入多个(批量)请求,
+然后等待它们全部完成(或在出错时取消)。
+这种 URB 排队方式对所有 USB 1.1
+主机控制器驱动也同样适用。
+
+
+在 Linux 2.5 内核中,定义了新的 ``usb_sg_*()`` API;
+它们会把 scatterlist 中的所有缓冲区都排入队列。
+它们还使用 scatterlist 的 DMA 映射
+(其中可能应用 IOMMU)并减少中断次数,
+这些都有助于让高速传输尽可能快地运行。
+
+待办:
+   中断传输和等时(ISO)传输的性能问题。
+   这些周期性传输都是完全调度的,因此,主要问题可能在于如何触发高带宽模式。
+
+待办:
+   通过 ``sysfs`` 中的 ``uframe_periodic_max`` 参数,
+   可以分配超过标准 80% 的周期性带宽。
+   后续将对此进行说明。
diff --git a/Documentation/translations/zh_CN/usb/index.rst b/Documentation/translations/zh_CN/usb/index.rst
index 7c739627077b..8c6b26912320 100644
--- a/Documentation/translations/zh_CN/usb/index.rst
+++ b/Documentation/translations/zh_CN/usb/index.rst
@@ -21,10 +21,10 @@ USB 支持
     authorization
     chipidea
     dwc3
+    ehci

 Todolist:

-* ehci
 * usbmon
 * functionfs
 * functionfs-desc
--
2.54.0



* [PATCH v7 8/8] docs/zh_CN: Add CREDITS translation
From: Kefan Bai @ 2026-05-21  9:55 UTC
  To: linux-usb, si.yanteng
  Cc: gregkh, seakeel, alexs, dzm91, corbet, skhan, linux-doc, doubled
In-Reply-To: <cover.1779355170.git.baikefan@leap-io-kernel.com>

Translate .../usb/CREDITS into Chinese

Update the translation through commit 7b2328c5a009
("docs: Fix typo in usb/CREDITS")

Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Signed-off-by: Kefan Bai <baikefan@leap-io-kernel.com>
---
 Documentation/translations/zh_CN/usb/CREDITS | 163 +++++++++++++++++++
 1 file changed, 163 insertions(+)
 create mode 100644 Documentation/translations/zh_CN/usb/CREDITS

diff --git a/Documentation/translations/zh_CN/usb/CREDITS b/Documentation/translations/zh_CN/usb/CREDITS
new file mode 100644
index 000000000000..c133b1a5daff
--- /dev/null
+++ b/Documentation/translations/zh_CN/usb/CREDITS
@@ -0,0 +1,163 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/usb/CREDITS
+
+:翻译:
+
+ 白钶凡 Kefan Bai <baikefan@leap-io-kernel.com>
+
+:校译:
+
+
+简易 Linux USB 驱动的致谢名单:
+
+以下人员都为 Linux USB 驱动代码作出了贡献
+(按姓氏字母顺序排列)。我相信这份名单本应
+更长一些,但确实不容易维护。
+如需将自己加入名单,请提交补丁。
+
+  Georg Acher <acher@informatik.tu-muenchen.de>
+  David Brownell <dbrownell@users.sourceforge.net>
+  Alan Cox <alan@lxorguk.ukuu.org.uk>
+  Randy Dunlap <randy.dunlap@intel.com>
+  Johannes Erdfelt <johannes@erdfelt.com>
+  Deti Fliegl <deti@fliegl.de>
+  ham <ham@unsuave.com>
+  Bradley M Keryan <keryan@andrew.cmu.edu>
+  Greg Kroah-Hartman <greg@kroah.com>
+  Pavel Machek <pavel@suse.cz>
+  Paul Mackerras <paulus@cs.anu.edu.au>
+  Petko Manlolov <petkan@dce.bg>
+  David E. Nelson <dnelson@jump.net>
+  Vojtech Pavlik <vojtech@suse.cz>
+  Bill Ryder <bryder@sgi.com>
+  Thomas Sailer <sailer@ife.ee.ethz.ch>
+  Gregory P. Smith <greg@electricrain.com>
+  Linus Torvalds <torvalds@linux-foundation.org>
+  Roman Weissgaerber <weissg@vienna.at>
+  <Kazuki.Yasumatsu@fujixerox.co.jp>
+
+特别感谢:
+
+  Inaky Perez Gonzalez <inaky@peloncho.fis.ucm.es>
+  感谢他发起了 Linux USB 驱动开发工作,并编写了体量较大的 uusbd
+  驱动中的大部分代码。我们从那项工作中学到了很多。
+
+  NetBSD 和 FreeBSD 的 USB 开发者们
+  感谢他们加入 Linux USB 邮件列表,提供建议并分享实现经验。
+
+附加感谢:
+  还要感谢以下公司与个人在硬件、支持、时间投入和开发方面提供的捐赠与帮助
+  (摘自 Inaky 驱动原始的 THANKS 文件):
+
+    以下公司曾帮助我们开发 Linux USB / UUSBD:
+
+        - 3Com GmbH 捐赠了一台 ISDN Pro TA,并在技术问题和测试设备方面为我
+          提供支持。没想到能得到这么大的帮助。
+
+        - USAR Systems 向我们提供了他们出色的 USB 评估套件,
+          使我们能够测试 Linux USB 驱动对最新 USB 规范的符合性。
+          USAR Systems 认识到保持开放操作系统与时俱进的重要性,
+          并以硬件支持这个项目。感谢!
+
+        - 感谢英特尔提供的宝贵帮助。
+
+        - 我们与 Cherry 合作,使 Linux 成为首个内置 USB 支持的操作系统。
+          Cherry 是全球最大的键盘制造商之一。
+
+        - CMD Technology, Inc. 慷慨捐赠了一块 CSA-6700 PCI-to-USB
+          控制卡,用于测试 OHCI 实现。
+
+        - 由于他们对我们的支持,Keytronic 可以放心,
+          他们的键盘能卖给至少 300 万 Linux 用户中的一部分。
+
+        - ing büro h doran [http://www.ibhdoran.com]!
+          在欧洲,想给主板买一个 PC 背板 USB 连接器几乎是不可能的
+          (我自己做的那个相当糟糕 :))。现在我知道该去哪里买漂亮的 USB
+          配件了!
+
+        - Genius Germany 捐赠了一只 USB 鼠标,用于测试鼠标启动协议;
+          他们还捐赠了 F-23 数字摇杆和 NetMouse Pro。感谢!
+
+        - AVM GmbH Berlin 支持我们开发 Linux 下的 AVM ISDN Controller B1 USB 驱动。
+          AVM 是领先的 ISDN 控制器制造商,其主动式设计对包括 Linux 在内的
+          所有操作系统平台开放。
+
+        - 非常感谢 Y-E Data, Inc 捐赠的 FlashBuster-U USB 软驱,
+          使我们能够测试批量传输代码。
+
+        - 感谢 Logitech 捐赠了一只三轴 USB 鼠标。
+
+          Logitech 负责设计、制造并销售各种人机接口设备,
+          在键盘、鼠标、轨迹球、摄像头、扬声器,以及面向游戏和专业用途的
+          控制设备方面拥有悠久历史和丰富经验。
+
+          作为这些设备广为人知的供应商和销售商,他们捐赠了 USB 鼠标、
+          摇杆和扫描仪,以表明 Linux 的重要性,也让 Logitech 的客户
+          能在自己喜欢的操作系统上获得支持,并让所有 Linux 用户都能使用
+          Logitech 以及其他 USB 硬件。
+
+          Logitech 也是 1999 年 2 月 11 日维也纳 Linux 大会的官方赞助商,
+          我们将在会上展示 Linux USB 工作的最新进展。
+
+        - 感谢 CATC 提供 USB Inspector,帮助我们揭开 UHCI 内部实现中
+          那些不为人知的角落。
+
+        - 感谢 Entrega 为开发工作提供 PCI 转 USB 卡、集线器和转换器产品。
+
+        - 感谢 ConnectTech 提供 WhiteHEAT USB 转串口转换器以及相关文档,
+          让这个驱动得以写成。
+
+        - 感谢 ADMtek 提供 Pegasus 和 Pegasus II 评估板、规格说明,
+          以及驱动开发过程中的宝贵建议。
+
+    另外还要感谢以下个人(嘿,顺序不分先后 :))
+
+        - Oren Tirosh <orenti@hishome.net>,
+          他非常耐心地听我唠叨各种 USB 疑问,还给了很多很酷的想法。
+
+        - Jochen Karrer <karrer@wpfd25.physik.uni-wuerzburg.de>,
+          指出了致命 bug,并给出了宝贵建议。
+
+        - Edmund Humemberger <ed@atnet.at>,他在公共关系与项目管理方面
+          为 Linux-USB 项目付出了巨大的努力。
+
+        - Alberto Menegazzi <flash@flash.iol.it> 正在着手编写 UUSBD 文档,加油!
+
+        - Ric Klaren <ia_ric@cs.utwente.nl> 编写了很好的入门文档,
+          与 Alberto 的作品形成良性竞争:)。
+
+        - Christian Groessler <cpg@aladdin.de>,感谢他在那些棘手细节上的帮助。
+
+        - Paul MacKerras 改进了 OHCI 实现,推动了对 iMac 的支持,
+          并提供了大量的改进意见。
+
+        - Fernando Herrera <fherrera@eurielec.etsit.upm.es>
+          负责撰写、维护并不断补充那份期待已久、独一无二又精彩的
+          UUSBD FAQ!太棒了!
+
+        - Rasca Gmelch <thron@gmx.de> 重新启用了 raw 驱动,
+          指出了一些错误,并启动了 uusbd-utils 软件包。
+
+        - Peter Dettori <dettori@ozy.dec.com>,像疯了一样挖掘 bug,
+          还提出了很多很酷的建议,太棒了!
+
+        - 自由软件与 Linux 社区的所有成员,包括 FSF、GNU 项目、
+          MIT X 联盟、TeX 社区等等,谢谢你们!
+
+        - 特别感谢 Richard Stallman 创造了 Emacs!
+
+        - 感谢 linux-usb 邮件列表的所有成员,读了那么多邮件——不开玩笑了,
+          感谢你们提出的所有建议!
+
+        - 感谢 USB Implementers Forum 成员们的帮助与支持。
+
+        - Nathan Myers <ncm@cantrip.org>,感谢他的建议!
+          (希望你喜欢 Cibeles 的派对。)
+
+        - 感谢 Linus Torvalds 创建、开发并管理 Linux。
+
+        - Mike Smith、Craig Keithley、Thierry Giron 和 Janet Schank
+          感谢他们让我认识到标准 USB 集线器其实也没那么“标准”,
+          这有助于我们在标准集线器驱动中加入厂商特定的特殊处理。
--
2.54.0



* [PATCH v7 5/8] docs/zh_CN: Add dwc3.rst translation
From: Kefan Bai @ 2026-05-21  9:55 UTC
  To: linux-usb, si.yanteng
  Cc: gregkh, seakeel, alexs, dzm91, corbet, skhan, linux-doc, doubled
In-Reply-To: <cover.1779355170.git.baikefan@leap-io-kernel.com>

Translate .../usb/dwc3.rst into Chinese

Update the translation through commit ecefae6db042
("docs: usb: rename files to .rst and add them to drivers-api")

Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Signed-off-by: Kefan Bai <baikefan@leap-io-kernel.com>
---
 Documentation/translations/zh_CN/usb/dwc3.rst | 63 +++++++++++++++++++
 .../translations/zh_CN/usb/index.rst          |  2 +-
 2 files changed, 64 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/translations/zh_CN/usb/dwc3.rst

diff --git a/Documentation/translations/zh_CN/usb/dwc3.rst b/Documentation/translations/zh_CN/usb/dwc3.rst
new file mode 100644
index 000000000000..3468ce50c5ba
--- /dev/null
+++ b/Documentation/translations/zh_CN/usb/dwc3.rst
@@ -0,0 +1,63 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/usb/dwc3.rst
+
+:翻译:
+
+ 白钶凡 Kefan Bai <baikefan@leap-io-kernel.com>
+
+:校译:
+
+
+=========
+DWC3 驱动
+=========
+
+
+待办
+~~~~
+
+阅读时如果想顺手认领点任务,可以从下面挑一项 :)
+
+- 将中断处理程序改为每个端点各自使用线程化 IRQ
+
+  事实证明,有些 DWC3 命令大约需要 ``~1 ms`` 才能完成。
+  当前代码会一直自旋等待命令完成,这种设计并不好。
+
+  实现思路:
+
+  - DWC 核心实现了一个按端点对中断进行解复用的 IRQ 控制器。
+    中断号在探测(``probe``)阶段分配,并归属于该设备。
+    如果硬件通过 ``MSI`` 为每个端点提供独立中断,
+    那么这个“虚拟”IRQ 控制器就可以被真实的端点中断取代。
+
+  - 在调用 ``usb_ep_enable()`` 时请求并分配中断资源,
+    在调用 ``usb_ep_disable()`` 时释放中断资源。
+    最坏情况下需要 32 个中断,最少是 ``ep0/1`` 的两个中断。
+  - ``dwc3_send_gadget_ep_cmd()`` 将在 ``wait_for_completion_timeout()``
+    中休眠,直到命令完成。
+  - 中断处理程序分为以下几个部分:
+
+    - 设备级主中断处理程序
+      遍历每个事件,并对其调用 ``generic_handle_irq()``。
+      从 ``generic_handle_irq()`` 返回后,确认事件计数器,使中断最终消失。
+
+    - 设备级线程化处理程序
+      无。
+
+    - 端点中断的主处理程序
+      读取事件并尽量处理它。凡是需要睡眠的操作都交给线程处理。
+      事件保存在每个端点的数据结构中。
+      还要注意,一旦把某项工作交给线程处理,
+      就不要再在主处理程序里处理它,
+      以免出现优先级反转之类的问题。
+
+    - 端点中断的线程化处理程序
+      处理剩余的端点工作,这些工作可能会睡眠,例如等待命令完成。
+
+  延迟:
+
+   不应增加延迟,因为中断线程具有较高优先级,
+   会在普通用户态任务之前运行
+   (除非用户更改了调度优先级)。
diff --git a/Documentation/translations/zh_CN/usb/index.rst b/Documentation/translations/zh_CN/usb/index.rst
index e6d0a4fceff7..7c739627077b 100644
--- a/Documentation/translations/zh_CN/usb/index.rst
+++ b/Documentation/translations/zh_CN/usb/index.rst
@@ -20,10 +20,10 @@ USB 支持
     acm
     authorization
     chipidea
+    dwc3

 Todolist:

-* dwc3
 * ehci
 * usbmon
 * functionfs
--
2.54.0



* Re: [PATCH bpf-next v11 0/8] bpf: Extend the bpf_list family of APIs
From: patchwork-bot+netdevbpf @ 2026-05-21 10:00 UTC
  To: Kaitao Cheng
  Cc: ast, corbet, martin.lau, daniel, andrii, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah,
	chengkaitao, skhan, memxor, bpf, linux-kernel, linux-doc, vmalik,
	linux-kselftest
In-Reply-To: <20260521032306.97118-1-kaitao.cheng@linux.dev>

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Thu, 21 May 2026 11:22:58 +0800 you wrote:
> In BPF, a list can only be used to implement a stack structure.
> Due to an incomplete API set, only FIFO or LIFO operations are
> supported. The patches enhance the BPF list API, making it more
> list-like.
> 
> Five new kfuncs have been added:
> bpf_list_del: remove a node from the list
> bpf_list_add_impl: insert a node after a given list node
> bpf_list_is_first: check if a node is the first in the list
> bpf_list_is_last: check if a node is the last in the list
> bpf_list_empty: check if the list is empty
> 
> [...]
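
The five operations listed in the cover letter have direct analogues on an
ordinary circular doubly linked list. The following is a minimal userspace
sketch of those analogues only. All names (struct node, list_init() and so
on) are hypothetical, and it does not show the real kfunc signatures, which
are not quoted in this message and which operate on BPF list types under
verifier rules.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type; the head is a node that carries no payload. */
struct node {
	struct node *prev, *next;
};

/* An empty list is a head that points to itself. */
void list_init(struct node *head)
{
	head->prev = head->next = head;
}

/* Insert new_node right after pos (pos may be the head itself). */
void list_add_after(struct node *new_node, struct node *pos)
{
	new_node->prev = pos;
	new_node->next = pos->next;
	pos->next->prev = new_node;
	pos->next = new_node;
}

/* Unlink n from wherever it is and leave it self-linked. */
void list_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

bool list_is_first(const struct node *head, const struct node *n)
{
	return head->next == n;
}

bool list_is_last(const struct node *head, const struct node *n)
{
	return head->prev == n;
}

bool list_empty(const struct node *head)
{
	return head->next == head;
}
```

Removing or inserting at an arbitrary node is what distinguishes this from
the push/pop-only interface described in the quoted text.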

Here is the summary with links:
  - [bpf-next,v11,1/8] bpf: refactor __bpf_list_del to take list node pointer
    https://git.kernel.org/bpf/bpf-next/c/cb339ac61d72
  - [bpf-next,v11,2/8] bpf: clear list node owner and unlink before drop
    https://git.kernel.org/bpf/bpf-next/c/cfa6afa4b931
  - [bpf-next,v11,3/8] bpf: allow non-owning list-node args via __nonown_allowed
    https://git.kernel.org/bpf/bpf-next/c/7c8c71591b76
  - [bpf-next,v11,4/8] bpf: Introduce the bpf_list_del kfunc.
    https://git.kernel.org/bpf/bpf-next/c/187baa10963a
  - [bpf-next,v11,5/8] bpf: refactor __bpf_list_add to take insertion point via **prev_ptr
    https://git.kernel.org/bpf/bpf-next/c/e6919ff67c1e
  - [bpf-next,v11,6/8] bpf: Add bpf_list_add to insert node after a given list node
    https://git.kernel.org/bpf/bpf-next/c/a3493ca504f1
  - [bpf-next,v11,7/8] bpf: add bpf_list_is_first/last/empty kfuncs
    https://git.kernel.org/bpf/bpf-next/c/745515d386eb
  - [bpf-next,v11,8/8] selftests/bpf: Add test cases for bpf_list_del/add/is_first/is_last/empty
    https://git.kernel.org/bpf/bpf-next/c/ba3dc064f406

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH v6 21/43] KVM: SEV: Make 'uaddr' parameter optional for KVM_SEV_SNP_LAUNCH_UPDATE
From: Fuad Tabba @ 2026-05-21  9:55 UTC
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	ira.weiny, jmattson, jthoughton, michael.roth, oupton,
	pankaj.gupta, qperret, rick.p.edgecombe, rientjes, shivankg,
	steven.price, willy, wyihan, yan.y.zhao, forkloop, pratyush,
	suzuki.poulose, aneesh.kumar, liam, Paolo Bonzini,
	Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
	Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng,
	Shakeel Butt, Kiryl Shutsemau, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <20260507-gmem-inplace-conversion-v6-21-91ab5a8b19a4@google.com>

Hi,

On Thu, 7 May 2026 at 21:22, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Michael Roth <michael.roth@amd.com>
>
> For vm_memory_attributes=1, in-place conversion/population is not
> supported, so the initial contents necessarily need to come
> from a separate src address, which is enforced by the current
> implementation. However, for vm_memory_attributes=0, it is possible for
> guest memory to be initialized directly from userspace by mmap()'ing the
> guest_memfd and writing to it while the corresponding GPA ranges are in
> a 'shared' state before converting them to the 'private' state expected
> by KVM_SEV_SNP_LAUNCH_UPDATE.
>
> Update the handling/documentation for KVM_SEV_SNP_LAUNCH_UPDATE to allow
> for 'uaddr' to be set to NULL when vm_memory_attributes=0, which
> SNP_LAUNCH_UPDATE will then use to determine when it should/shouldn't
> copy in data from a separate memory location. Continue to enforce
> non-NULL for the original vm_memory_attributes=1 case.
>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> [Added src_page check in error handling path when the firmware command fails]
> [Dropped ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES]
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

I'm not very familiar with the SEV-SNP populate flows, but it looks
like Sashiko is on to something:
https://sashiko.dev/#/patchset/20260507-gmem-inplace-conversion-v6-0-91ab5a8b19a4%40google.com?part=21

- a potential read-only page overwrite, because src_page is acquired
via get_user_pages_fast() without the FOLL_WRITE flag, but is then
overwritten via memcpy
- an ordering violation with the kunmap_local() calls

These predate this patch series and are just being touched by the
'src_page' addition, but if Sashiko's right, these should probably be
fixed sooner rather than later.
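
As background for the second point: local kmap mappings are stack based, so
nested mappings have to be released in the reverse order of acquisition. The
toy userspace model below only illustrates that ordering rule. The names
toy_map() and toy_unmap() and the fixed slot count are hypothetical, and it
is not a statement about what the SEV code currently does.

```c
#include <stddef.h>

#define MAX_SLOTS 16

/* Toy model of a per-context stack of active local mappings. */
struct map_stack {
	const void *slot[MAX_SLOTS];
	size_t depth;
};

/* Record a new mapping on top of the stack; returns 0, or -1 when full. */
int toy_map(struct map_stack *s, const void *addr)
{
	if (s->depth == MAX_SLOTS)
		return -1;
	s->slot[s->depth++] = addr;
	return 0;
}

/*
 * Release a mapping. Only the most recently created mapping may be
 * released; anything else is reported as an ordering violation (-1).
 */
int toy_unmap(struct map_stack *s, const void *addr)
{
	if (s->depth == 0 || s->slot[s->depth - 1] != addr)
		return -1;
	s->depth--;
	return 0;
}
```

In this model, mapping src and then dst means dst has to be unmapped first;
unmapping src first is rejected.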

Cheers,
/fuad



> ---
>  Documentation/virt/kvm/x86/amd-memory-encryption.rst | 15 +++++++++++----
>  arch/x86/kvm/svm/sev.c                               | 18 +++++++++++++-----
>  virt/kvm/kvm_main.c                                  |  1 +
>  3 files changed, 25 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> index b2395dd4769de..43085f65b2d85 100644
> --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> @@ -503,7 +503,8 @@ secrets.
>
>  It is required that the GPA ranges initialized by this command have had the
>  KVM_MEMORY_ATTRIBUTE_PRIVATE attribute set in advance. See the documentation
> -for KVM_SET_MEMORY_ATTRIBUTES for more details on this aspect.
> +for KVM_SET_MEMORY_ATTRIBUTES/KVM_SET_MEMORY_ATTRIBUTES2 for more details on
> +this aspect.
>
>  Upon success, this command is not guaranteed to have processed the entire
>  range requested. Instead, the ``gfn_start``, ``uaddr``, and ``len`` fields of
> @@ -511,9 +512,15 @@ range requested. Instead, the ``gfn_start``, ``uaddr``, and ``len`` fields of
>  remaining range that has yet to be processed. The caller should continue
>  calling this command until those fields indicate the entire range has been
>  processed, e.g. ``len`` is 0, ``gfn_start`` is equal to the last GFN in the
> -range plus 1, and ``uaddr`` is the last byte of the userspace-provided source
> -buffer address plus 1. In the case where ``type`` is KVM_SEV_SNP_PAGE_TYPE_ZERO,
> -``uaddr`` will be ignored completely.
> +range plus 1, and ``uaddr`` (if specified) is the last byte of the
> +userspace-provided source buffer address plus 1.
> +
> +In the case where ``type`` is KVM_SEV_SNP_PAGE_TYPE_ZERO, ``uaddr`` will be
> +ignored completely. Otherwise, ``uaddr`` is required if
> +kvm.vm_memory_attributes=1 and optional if kvm.vm_memory_attributes=0, since
> +in the latter case guest memory can be initialized directly from userspace
> +prior to converting it to private and passing the GPA range on to this
> +interface.
>
>  Parameters (in): struct  kvm_sev_snp_launch_update
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index c2126b3c30724..bf10d24907a00 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2343,7 +2343,15 @@ static int sev_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
>         int level;
>         int ret;
>
> -       if (WARN_ON_ONCE(sev_populate_args->type != KVM_SEV_SNP_PAGE_TYPE_ZERO && !src_page))
> +       /*
> +        * For vm_memory_attributes=1, in-place conversion/population is not
> +        * supported, so the initial contents necessarily need to come from a
> +        * separate src address. For vm_memory_attributes=0, this isn't
> +        * necessarily the case, since the pages may have been populated
> +        * directly from userspace before calling KVM_SEV_SNP_LAUNCH_UPDATE.
> +        */
> +       if (vm_memory_attributes &&
> +           sev_populate_args->type != KVM_SEV_SNP_PAGE_TYPE_ZERO && !src_page)
>                 return -EINVAL;
>
>         ret = snp_lookup_rmpentry((u64)pfn, &assigned, &level);
> @@ -2390,7 +2398,7 @@ static int sev_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
>          */
>         if (ret && !snp_page_reclaim(kvm, pfn) &&
>             sev_populate_args->type == KVM_SEV_SNP_PAGE_TYPE_CPUID &&
> -           sev_populate_args->fw_error == SEV_RET_INVALID_PARAM) {
> +           sev_populate_args->fw_error == SEV_RET_INVALID_PARAM && src_page) {
>                 void *src_vaddr = kmap_local_page(src_page);
>                 void *dst_vaddr = kmap_local_pfn(pfn);
>
> @@ -2422,8 +2430,8 @@ static int snp_launch_update(struct kvm *kvm, struct kvm_sev_cmd *argp)
>         if (copy_from_user(&params, u64_to_user_ptr(argp->data), sizeof(params)))
>                 return -EFAULT;
>
> -       pr_debug("%s: GFN start 0x%llx length 0x%llx type %d flags %d\n", __func__,
> -                params.gfn_start, params.len, params.type, params.flags);
> +       pr_debug("%s: GFN start 0x%llx length 0x%llx type %d flags %d src %llx\n", __func__,
> +                params.gfn_start, params.len, params.type, params.flags, params.uaddr);
>
>         if (!params.len || !PAGE_ALIGNED(params.len) || params.flags ||
>             (params.type != KVM_SEV_SNP_PAGE_TYPE_NORMAL &&
> @@ -2479,7 +2487,7 @@ static int snp_launch_update(struct kvm *kvm, struct kvm_sev_cmd *argp)
>
>         params.gfn_start += count;
>         params.len -= count * PAGE_SIZE;
> -       if (params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO)
> +       if (src && params.type != KVM_SEV_SNP_PAGE_TYPE_ZERO)
>                 params.uaddr += count * PAGE_SIZE;
>
>         if (copy_to_user(u64_to_user_ptr(argp->data), &params, sizeof(params)))
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index ba195bb239aaa..3bf212fd99193 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -105,6 +105,7 @@ module_param(allow_unsafe_mappings, bool, 0444);
>  #ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>  bool vm_memory_attributes = true;
>  module_param(vm_memory_attributes, bool, 0444);
> +EXPORT_SYMBOL_FOR_KVM_INTERNAL(vm_memory_attributes);
>  #endif
>  DEFINE_STATIC_CALL_RET0(__kvm_get_memory_attributes, kvm_get_memory_attributes_t);
>  EXPORT_SYMBOL_FOR_KVM_INTERNAL(STATIC_CALL_KEY(__kvm_get_memory_attributes));
>
> --
> 2.54.0.563.g4f69b47b94-goog
>
>

^ permalink raw reply

* [PATCH v7 1/8] docs/zh_CN: Add index.rst translation
From: Kefan Bai @ 2026-05-21  9:55 UTC (permalink / raw)
  To: linux-usb, si.yanteng
  Cc: gregkh, seakeel, alexs, dzm91, corbet, skhan, linux-doc, doubled
In-Reply-To: <cover.1779355170.git.baikefan@leap-io-kernel.com>

Translate .../usb/index.rst into Chinese and update subsystem-apis.rst

Update the translation through commit a592a36e4937
("Documentation: use a source-read extension for the index link boilerplate")

Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Signed-off-by: Kefan Bai <baikefan@leap-io-kernel.com>
---
 .../translations/zh_CN/subsystem-apis.rst     |  2 +-
 .../translations/zh_CN/usb/index.rst          | 54 +++++++++++++++++++
 2 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/translations/zh_CN/usb/index.rst

diff --git a/Documentation/translations/zh_CN/subsystem-apis.rst b/Documentation/translations/zh_CN/subsystem-apis.rst
index 830217140fb6..b52e1feb0167 100644
--- a/Documentation/translations/zh_CN/subsystem-apis.rst
+++ b/Documentation/translations/zh_CN/subsystem-apis.rst
@@ -90,6 +90,7 @@ TODOList:
    security/index
    PCI/index
    peci/index
+   usb/index

 TODOList:

@@ -104,6 +105,5 @@ TODOList:
 * accel/index
 * crypto/index
 * bpf/index
-* usb/index
 * misc-devices/index
 * wmi/index
diff --git a/Documentation/translations/zh_CN/usb/index.rst b/Documentation/translations/zh_CN/usb/index.rst
new file mode 100644
index 000000000000..b4cb0ccaa39b
--- /dev/null
+++ b/Documentation/translations/zh_CN/usb/index.rst
@@ -0,0 +1,54 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: ../disclaimer-zh_CN.rst
+
+:Original: Documentation/usb/index.rst
+
+:翻译:
+
+ 白钶凡 Kefan Bai <baikefan@leap-io-kernel.com>
+
+:校译:
+
+
+========
+USB 支持
+========
+
+.. toctree::
+    :maxdepth: 1
+
+
+Todolist:
+
+* acm
+* authorization
+* chipidea
+* dwc3
+* ehci
+* usbmon
+* functionfs
+* functionfs-desc
+* gadget_configfs
+* gadget_hid
+* gadget_multi
+* gadget_printer
+* gadget_serial
+* gadget_uvc
+* gadget-testing
+* iuu_phoenix
+* mass-storage
+* misc_usbsevseg
+* mtouchusb
+* ohci
+* raw-gadget
+* usbip_protocol
+* usb-serial
+* usb-help
+* text_files
+
+.. only::  subproject and html
+
+   索引
+   ====
+
+   * :ref:`genindex`
--
2.54.0


^ permalink raw reply related

* Re: [PATCH net-next v3 01/14] virtchnl: create 'include/linux/intel' and move necessary header files
From: Larysa Zaremba @ 2026-05-21  9:28 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Tony Nguyen, davem, pabeni, edumazet, andrew+netdev, netdev,
	przemyslaw.kitszel, aleksander.lobakin, sridhar.samudrala,
	anjali.singhai, michal.swiatkowski, maciej.fijalkowski,
	emil.s.tantilov, madhu.chittim, joshua.a.hay, jacob.e.keller,
	jayaprakash.shanmugam, jiri, horms, corbet, richardcochran,
	linux-doc, tatyana.e.nikolova, krzysztof.czurylo, jgg, leon,
	linux-rdma, Samuel Salin, Aleksandr Loktionov
In-Reply-To: <20260520175201.72f83c4a@kernel.org>

On Wed, May 20, 2026 at 05:52:01PM -0700, Jakub Kicinski wrote:
> On Fri, 15 May 2026 15:44:25 -0700 Tony Nguyen wrote:
> > include/linux/intel is vacant
> 
> I don't see any other vendor directory under include/linux

There are at least

include/linux/mlx4, include/linux/mlx5 and include/linux/bnxt.

Those are per-driver and not per-vendor, but intel ethernet has too many drivers 
to have separate folders for them.

I just do not think this necessarily creates a precedent.

Folder structure is for you to decide as a maintainer, but it would be nice to 
have known about such doubts earlier.

> and TBH I don't want to be the maintainer making a precedent
> for this sort of stuff. include/net/intel is a better choice.
> Or rather, at least it's in "our" section of the tree so nobody
> will complain.
> 

^ permalink raw reply

* Re: [PATCH v3] killswitch: add per-function short-circuit mitigation primitive
From: Daniel Borkmann @ 2026-05-21  9:11 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Song Liu, linux-kernel, linux-doc, linux-kselftest, bpf,
	live-patching, Greg Kroah-Hartman, Andrew Morton, Jonathan Corbet,
	Mathieu Desnoyers, Joshua Peisach, Florian Weimer, Breno Leitao,
	Anthony Iliopoulos, Michal Hocko, Jiri Olsa, John Fastabend,
	Christian Brauner, KP Singh
In-Reply-To: <agzAwjKhOhuANz_P@laps>

On 5/19/26 9:57 PM, Sasha Levin wrote:
> On Tue, May 19, 2026 at 02:13:26PM +0200, Daniel Borkmann wrote:
>> On 5/19/26 1:59 AM, Song Liu wrote:
>>> On Mon, May 18, 2026 at 6:33 AM Sasha Levin <sashal@kernel.org> wrote:
>>>> On Sun, May 17, 2026 at 11:37:36PM -0700, Song Liu wrote:
>>>>> On Sun, May 17, 2026 at 6:49 AM Sasha Levin <sashal@kernel.org> wrote:
>>>>>> * fail_function (CONFIG_FUNCTION_ERROR_INJECTION) is disabled in
>>>>>>   most production kernels. Even where enabled, it only works on
>>>>>>   functions pre-annotated with ALLOW_ERROR_INJECTION() in source -
>>>>>>   no help for a freshly-disclosed CVE. The debugfs UI is blocked by
>>>>>>   lockdown=integrity and the override is probabilistic.
>>>>>>
>>>>>> * BPF override (bpf_override_return) honors the same
>>>>>>   ALLOW_ERROR_INJECTION() whitelist, and BPF itself is off in many
>>>>>>   production kernels. Even where on, the operator interface is
>>>>>>   "load a verified BPF program," not a one-line write.
>>>>>
>>>>> If it is OK for killswitch to attach to any kernel functions, do we still
>>>>> need ALLOW_ERROR_INJECTION() for fail_function and BPF
>>>>> override? Shall we instead also allow fail_function and BPF override
>>>>> to attach to any kernel functions?
>>>>
>>>> I don't think so. ALLOW_ERROR_INJECTION is not a security mechanism, it's an
>>>> integrity/safety mechanism for both bpf and fault injection.
>>>>
>>>> It protects against a "developer or CI script doing legitimate fault injection
>>>> accidentally panics the box" scenario, not an "attacker gets in" one.
>>>
>>> There really isn't a clear boundary between "security mechanism" and
>>> "non-security mechanism". As we are making killswitch available
>>> everywhere under root, users will soon learn to use it to do fault injection,
>>> and potentially much more scary things. (Think about agents with sudo
>>> access).
>>
>> Fully agree with Song here that there is no clear boundary, and that the
>> killswitch could lead to arbitrary, hard to debug breakage if applied to
>> the wrong function.. introducing worse bugs than the one being mitigated
>> or even /short-circuit LSM enforcement/ (engage security_file_open 0,
>> engage cap_capable 0, engage apparmor_* etc).
> 
> This is similar to livepatch, right? Do we need guardrails there too?
> 
> Or do we just trust root to do the right thing for its systems without needing
> to be its babysitter?

[See Song's reply.]

>> The ALLOW_ERROR_INJECTION() provides a curated white-list where you may
>> return with an error without causing more severe damage (assuming the
>> error handling code is right). The right thing would be to more widely
>> apply ALLOW_ERROR_INJECTION() or to figure out a better way to safely
>> enable the latter without explicit function annotation.
> 
> Sure, this would also work. How do you see this happening? Can we let a certain
> user/pid/etc disable the allowlist if they choose to?

I don't think we should, given then we're back to square one where root
or some other user would be able to just override/bypass an LSM.

[...]
> How do you see this working with the allowlist?

We should look at the underlying areas where most of the CVE-like fixes
took place (these days should be more easily doable given Claude and friends)
and based on that either extend ALLOW_ERROR_INJECTION() or (better) create
new hooks which BPF LSM can consume where you can then have a policy to reject
requests and tighten the attack surface. For example, the AF_ALG stuff you
can already easily cover today ...

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

#define AF_ALG	38
#define EPERM	1

char _license[] SEC("license") = "Dual BSD/GPL";

SEC("lsm/socket_create")
int BPF_PROG(block_af_alg, int family, int type, int protocol, int kern)
{
	if (family == AF_ALG)
		return -EPERM;
	return 0;
}

... the problem is that distros enable and pull in all sorts of crap which
non-root users could then load via request_module(), as an example; similarly,
for netlink we want a BPF LSM policy that parses netlink requests and
rejects them based on certain attribute matching (both are on our todo list),
which would have helped in the case of exotic tc cls/act/qdisc modules by
preventing them from being pulled in from a userns. I bet there are a ton more
examples once we look further into the data.

Thanks,
Daniel

^ permalink raw reply

* Re: [PATCH v6 20/43] KVM: guest_memfd: Enable INIT_SHARED on guest_memfd for x86 Coco VMs
From: Fuad Tabba @ 2026-05-21  8:54 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	ira.weiny, jmattson, jthoughton, michael.roth, oupton,
	pankaj.gupta, qperret, rick.p.edgecombe, rientjes, shivankg,
	steven.price, willy, wyihan, yan.y.zhao, forkloop, pratyush,
	suzuki.poulose, aneesh.kumar, liam, Paolo Bonzini,
	Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
	Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng,
	Shakeel Butt, Kiryl Shutsemau, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <20260507-gmem-inplace-conversion-v6-20-91ab5a8b19a4@google.com>

On Thu, 7 May 2026 at 21:22, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Sean Christopherson <seanjc@google.com>
>
> Now that guest_memfd supports tracking private vs. shared within gmem
> itself, allow userspace to specify INIT_SHARED on a guest_memfd instance
> for x86 Confidential Computing (CoCo) VMs, so long as per-VM attributes
> are disabled, i.e. when it's actually possible for a guest_memfd instance
> to contain shared memory.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad


> ---
>  arch/x86/kvm/x86.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1560de1e95be0..6609957ecfea3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -14172,14 +14172,13 @@ bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
>  }
>
>  #ifdef CONFIG_KVM_GUEST_MEMFD
> -/*
> - * KVM doesn't yet support initializing guest_memfd memory as shared for VMs
> - * with private memory (the private vs. shared tracking needs to be moved into
> - * guest_memfd).
> - */
>  bool kvm_arch_supports_gmem_init_shared(struct kvm *kvm)
>  {
> -       return !kvm_arch_has_private_mem(kvm);
> +       /*
> +        * INIT_SHARED isn't supported if the memory attributes are per-VM,
> +        * in which case guest_memfd can _only_ be used for private memory.
> +        */
> +       return !vm_memory_attributes || !kvm_arch_has_private_mem(kvm);
>  }
>
>  #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_PREPARE
>
> --
> 2.54.0.563.g4f69b47b94-goog
>
>

^ permalink raw reply

* Re: [PATCH v5 06/13] ima: Mediate open/release method of the measurements list
From: Roberto Sassu @ 2026-05-21  8:30 UTC (permalink / raw)
  To: Mimi Zohar, corbet, skhan, dmitry.kasatkin, eric.snowberg, paul,
	jmorris, serge
  Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
	gregorylumen, chenste, nramas, Roberto Sassu
In-Reply-To: <db872f810f22bf25ff0ae7fe15b44f316b078079.camel@linux.ibm.com>

On Wed, 2026-05-20 at 22:07 -0400, Mimi Zohar wrote:
> On Wed, 2026-04-29 at 18:03 +0200, Roberto Sassu wrote:
> > From: Roberto Sassu <roberto.sassu@huawei.com>
> > 
> > Introduce the ima_measure_users counter, to implement a semaphore-like
> > locking scheme where the binary and ASCII measurements list interfaces can
> > be concurrently open by multiple readers, or alternatively by a single
> > writer.
> > 
> > A semaphore cannot be used because the kernel cannot return to user space
> > with a lock held.
> > 
> > Introduce the ima_measure_lock() and ima_measure_unlock() primitives, to
> > respectively lock/unlock the interfaces (safely with the ima_measure_users
> > counter, without holding a lock).
> > 
> > Finally, introduce _ima_measurements_open() to lock the interface before
> > seq_open(), and call it from ima_measurements_open() and
> > ima_ascii_measurements_open(). And, introduce ima_measurements_release(),
> > to unlock the interface.
> > 
> > Require CAP_SYS_ADMIN if the interface is opened for write (not possible
> > for the current measurements interfaces, since they only have read
> > permission).
> > 
> > No functional changes: multiple readers are allowed as before.
> > 
> > Link: https://github.com/linux-integrity/linux/issues/1
> > Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
> > ---
> >  security/integrity/ima/ima_fs.c | 71 +++++++++++++++++++++++++++++++--
> >  1 file changed, 67 insertions(+), 4 deletions(-)
> > 
> > diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
> > index 9a8dba14d82a..68edea7139d5 100644
> > --- a/security/integrity/ima/ima_fs.c
> > +++ b/security/integrity/ima/ima_fs.c
> > @@ -25,6 +25,8 @@
> >  #include "ima.h"
> >  
> >  static DEFINE_MUTEX(ima_write_mutex);
> > +static DEFINE_MUTEX(ima_measure_mutex);
> > +static long ima_measure_users;
> 
> long?

The per-process limit can be up to INT_MAX, so two processes could
overflow the counter if it were an int.

Since privileged users can bypass the system wide max-file check (see
alloc_empty_file()), I will add an overflow check to be sure.
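
For illustration only, a minimal user-space sketch of the counter scheme
discussed here, with the kind of overflow check mentioned above. It is not
the patch code: a pthread mutex stands in for the kernel mutex, the
measure_* names are hypothetical, and returning -EOVERFLOW is an assumption
about what such a check might return.

```c
#include <errno.h>
#include <limits.h>
#include <pthread.h>
#include <stdbool.h>

/*
 * measure_users:  > 0  number of open readers
 *                == 0  interface is free
 *                == -1 one open writer
 */
static pthread_mutex_t measure_mutex = PTHREAD_MUTEX_INITIALIZER;
static long measure_users;

/* Take the interface for reading (write == false) or writing (write == true). */
static int measure_lock(bool write)
{
	int ret = 0;

	pthread_mutex_lock(&measure_mutex);
	if ((write && measure_users != 0) || (!write && measure_users < 0))
		ret = -EBUSY;		/* a conflicting opener is present */
	else if (!write && measure_users == LONG_MAX)
		ret = -EOVERFLOW;	/* assumed error code for the overflow check */
	else if (write)
		measure_users--;	/* 0 -> -1: single writer */
	else
		measure_users++;	/* one more reader */
	pthread_mutex_unlock(&measure_mutex);
	return ret;
}

/* Undo a successful measure_lock() with the same 'write' value. */
static void measure_unlock(bool write)
{
	pthread_mutex_lock(&measure_mutex);
	if (write)
		measure_users++;
	else
		measure_users--;
	pthread_mutex_unlock(&measure_mutex);
}
```

The mutex is only held while the counter is updated, so nothing is held
across the return to the caller; the counter itself carries the
reader/writer state between open and release.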

> >  
> >  bool ima_canonical_fmt;
> >  static int __init default_canonical_fmt_setup(char *str)
> > @@ -209,16 +211,76 @@ static const struct seq_operations ima_measurments_seqops = {
> >  	.show = ima_measurements_show
> >  };
> >  
> > +static int ima_measure_lock(bool write)
> > +{
> > +	mutex_lock(&ima_measure_mutex);
> > +	if ((write && ima_measure_users != 0) ||
> > +	    (!write && ima_measure_users < 0)) {
> > +		mutex_unlock(&ima_measure_mutex);
> > +		return -EBUSY;
> > +	}
> 
> Thanks, Roberto. The code is really clear and well written.  However, it could
> use a comment indicating the different ima_measure_users values as a reminder.
> 
> ima_measure_users:  > 0 open readers
> ima_measure_users: == -1 open writer

Ok.

> > +
> > +	if (write)
> > +		ima_measure_users--;
> > +	else
> > +		ima_measure_users++;
> > +	mutex_unlock(&ima_measure_mutex);
> > +	return 0;
> > +}
> > +
> > +static void ima_measure_unlock(bool write)
> > +{
> > +	mutex_lock(&ima_measure_mutex);
> > +	if (write)
> > +		ima_measure_users++;
> 
> There should only be one writer at a time. ima_measure_users could be set to
> zero.

Sure, but I find the code clearer this way.

Roberto

> > +	else
> > +		ima_measure_users--;
> > +	mutex_unlock(&ima_measure_mutex);
> > +}
> > +
> 
> thanks,
> 
> Mimi


^ permalink raw reply

* Re: [PATCH v6 19/43] KVM: Let userspace disable per-VM mem attributes, enable per-gmem attributes
From: Fuad Tabba @ 2026-05-21  8:44 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	ira.weiny, jmattson, jthoughton, michael.roth, oupton,
	pankaj.gupta, qperret, rick.p.edgecombe, rientjes, shivankg,
	steven.price, willy, wyihan, yan.y.zhao, forkloop, pratyush,
	suzuki.poulose, aneesh.kumar, liam, Paolo Bonzini,
	Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
	Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng,
	Shakeel Butt, Kiryl Shutsemau, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <20260507-gmem-inplace-conversion-v6-19-91ab5a8b19a4@google.com>

Hi Ackerley,

On Thu, 7 May 2026 at 21:22, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Sean Christopherson <seanjc@google.com>
>
> Make vm_memory_attributes a module parameter so that userspace can disable
> the use of memory attributes on the VM level.
>
> To avoid inconsistencies in the way memory attributes are tracked in KVM
> and guest_memfd, the vm_memory_attributes module_param is made
> read-only (0444).
>
> Make CONFIG_KVM_VM_MEMORY_ATTRIBUTES selectable, only for (CoCo) VM types
> that might use vm_memory_attributes.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Config files always confuse me, but Sashiko might be onto something:

https://sashiko.dev/#/patchset/20260507-gmem-inplace-conversion-v6-0-91ab5a8b19a4%40google.com?part=19

I think this partially goes back to commit 6, the one I flagged
yesterday. But how about also adding "default y" to
KVM_VM_MEMORY_ATTRIBUTES? The default value should at least fix this
issue, but I'm not sure whether it would cause other problems...

Cheers,
/fuad


> ---
>  arch/x86/kvm/Kconfig | 13 +++++++++----
>  virt/kvm/kvm_main.c  |  1 +
>  2 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index b6d65ee664d0f..8b97d341bd33f 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -82,13 +82,20 @@ config KVM_WERROR
>
>  config KVM_VM_MEMORY_ATTRIBUTES
>         select KVM_MEMORY_ATTRIBUTES
> -       bool
> +       depends on KVM_SW_PROTECTED_VM || KVM_INTEL_TDX || KVM_AMD_SEV
> +       bool "Enable per-VM memory attributes (for CoCo VMs)"
> +       help
> +         Enable support for per-VM memory attributes, which are deprecated in
> +         favor of tracking memory attributes in guest_memfd.  Select this if
> +         you need to run CoCo VMs using a VMM that doesn't support guest_memfd
> +         memory attributes.
> +
> +         If unsure, say N.
>
>  config KVM_SW_PROTECTED_VM
>         bool "Enable support for KVM software-protected VMs"
>         depends on EXPERT
>         depends on KVM_X86 && X86_64
> -       select KVM_VM_MEMORY_ATTRIBUTES
>         help
>           Enable support for KVM software-protected VMs.  Currently, software-
>           protected VMs are purely a development and testing vehicle for
> @@ -139,7 +146,6 @@ config KVM_INTEL_TDX
>         bool "Intel Trust Domain Extensions (TDX) support"
>         default y
>         depends on INTEL_TDX_HOST
> -       select KVM_VM_MEMORY_ATTRIBUTES
>         select HAVE_KVM_ARCH_GMEM_POPULATE
>         help
>           Provides support for launching Intel Trust Domain Extensions (TDX)
> @@ -163,7 +169,6 @@ config KVM_AMD_SEV
>         depends on KVM_AMD && X86_64
>         depends on CRYPTO_DEV_SP_PSP && !(KVM_AMD=y && CRYPTO_DEV_CCP_DD=m)
>         select ARCH_HAS_CC_PLATFORM
> -       select KVM_VM_MEMORY_ATTRIBUTES
>         select HAVE_KVM_ARCH_GMEM_PREPARE
>         select HAVE_KVM_ARCH_GMEM_INVALIDATE
>         select HAVE_KVM_ARCH_GMEM_POPULATE
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index cec02d68d7039..ba195bb239aaa 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -104,6 +104,7 @@ module_param(allow_unsafe_mappings, bool, 0444);
>  #ifdef CONFIG_KVM_MEMORY_ATTRIBUTES
>  #ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>  bool vm_memory_attributes = true;
> +module_param(vm_memory_attributes, bool, 0444);
>  #endif
>  DEFINE_STATIC_CALL_RET0(__kvm_get_memory_attributes, kvm_get_memory_attributes_t);
>  EXPORT_SYMBOL_FOR_KVM_INTERNAL(STATIC_CALL_KEY(__kvm_get_memory_attributes));
>
> --
> 2.54.0.563.g4f69b47b94-goog
>
>

^ permalink raw reply

* [PATCH v2] Fail the build on RUST=y and RUST_IS_AVAILABLE=n
From: Sasha Finkelstein @ 2026-05-21  8:30 UTC (permalink / raw)
  To: Alice Ryhl, Andreas Hindborg, Benno Lossin, Björn Roy Baron,
	Boqun Feng, Danilo Krummrich, Gary Guo, Jonathan Corbet,
	Miguel Ojeda, Shuah Khan, Trevor Gross
  Cc: Neal Gompa, linux-doc, linux-kernel, rust-for-linux,
	Sasha Finkelstein

The current approach of silently disabling all Rust drivers if the
toolchain is missing results in users who compile their own kernels
getting a "successful" build and then being confused about where their
drivers went. In comparison, a missing openssl results in a build
failure, not the disappearance of everything that depends on it.

This also means that allyesconfig will depend on Rust, but since the
Rust experiment concluded with "Rust is here to stay", I believe that
allyesconfig should be building Rust drivers too.

Signed-off-by: Sasha Finkelstein <k@chaosmail.tech>
---
Changes in v2:
- No longer an RFC, let's make it happen.
- Update the docs.
- Link to v1: https://patch.msgid.link/20260510-evolve-to-crab-v1-1-208df84e67be@chaosmail.tech
---
 Documentation/rust/quick-start.rst | 6 +++---
 init/Kconfig                       | 1 -
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/Documentation/rust/quick-start.rst b/Documentation/rust/quick-start.rst
index a6ec3fa94d33..764c81d0dd59 100644
--- a/Documentation/rust/quick-start.rst
+++ b/Documentation/rust/quick-start.rst
@@ -321,9 +321,9 @@ Configuration
 -------------
 
 ``Rust support`` (``CONFIG_RUST``) needs to be enabled in the ``General setup``
-menu. The option is only shown if a suitable Rust toolchain is found (see
-above), as long as the other requirements are met. In turn, this will make
-visible the rest of options that depend on Rust.
+menu. In turn, this will make visible the rest of options that depend on Rust.
+You can check the value of ``RUST_IS_AVAILABLE`` to determine if your toolchain
+is configured correctly.
 
 Afterwards, go to::
 
diff --git a/init/Kconfig b/init/Kconfig
index 2937c4d308ae..f7d4c7ea764f 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2190,7 +2190,6 @@ config PROFILING
 config RUST
 	bool "Rust support"
 	depends on HAVE_RUST
-	depends on RUST_IS_AVAILABLE
 	select EXTENDED_MODVERSIONS if MODVERSIONS
 	depends on !MODVERSIONS || GENDWARFKSYMS
 	depends on !GCC_PLUGIN_RANDSTRUCT

---
base-commit: 8bc67e4db64aa72732c474b44ea8622062c903f0
change-id: 20260510-evolve-to-crab-8cba1768dcd5

Best regards,
--  
Sasha Finkelstein <k@chaosmail.tech>


^ permalink raw reply related

* Re: [PATCH net-next v3 05/14] libie: add bookkeeping support for control queue messages
From: Larysa Zaremba @ 2026-05-21  8:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Tony Nguyen, davem, pabeni, edumazet, andrew+netdev, netdev,
	Phani R Burra, przemyslaw.kitszel, aleksander.lobakin,
	sridhar.samudrala, anjali.singhai, michal.swiatkowski,
	maciej.fijalkowski, emil.s.tantilov, madhu.chittim, joshua.a.hay,
	jacob.e.keller, jayaprakash.shanmugam, jiri, horms, corbet,
	richardcochran, linux-doc, Bharath R, Samuel Salin,
	Aleksandr Loktionov
In-Reply-To: <20260520185121.6f380ad0@kernel.org>

On Wed, May 20, 2026 at 06:51:21PM -0700, Jakub Kicinski wrote:
> On Fri, 15 May 2026 15:44:29 -0700 Tony Nguyen wrote:
> > +	guard(spinlock)(&xnm->free_xns_bm_lock);
> 
> Quoting documentation:
> 
>   Using device-managed and cleanup.h constructs
>   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   
>   Netdev remains skeptical about promises of all "auto-cleanup" APIs,
>   including even ``devm_`` helpers, historically. They are not the preferred
>   style of implementation, merely an acceptable one.
>   
>   Use of ``guard()`` is discouraged within any function longer than 20 lines,
>   ``scoped_guard()`` is considered more readable. Using normal lock/unlock is
>   still (weakly) preferred.
>

I agree that using guard in long functions is confusing, but the longest 
function in this patchset that uses guard() is libie_ctlq_xn_pop_free(), which 
has 18 lines between curly braces and represents a concrete atomic operation.

There is also scoped_guard() in libie_ctlq_xn_process_send(), which protects a 
block of 22 lines, but I would consider it acceptable under the guidelines you 
shared too.
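
To make the distinction concrete, here is a rough user-space analogue, not
the libie code. The kernel's guard() and scoped_guard() are built on the
compiler's cleanup attribute; the sketch below uses the same attribute with
a pthread mutex. The GUARD macro, pop_free(), push_free() and free_slots are
all hypothetical names chosen for illustration.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int free_slots = 4;

/* Runs automatically when a variable tagged with cleanup() leaves scope. */
static void unlock_cleanup(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Hypothetical stand-in for guard(): lock now, unlock at end of scope. */
#define GUARD(m) \
	pthread_mutex_t *guard_ptr \
		__attribute__((cleanup(unlock_cleanup), unused)) = \
		(pthread_mutex_lock(m), (m))

/* Short function: the lock covers the whole body, as with guard(). */
static int pop_free(void)
{
	GUARD(&lock);

	if (free_slots == 0)
		return -1;	/* the unlock also runs on this early return */
	return --free_slots;
}

/* Longer function: braces bound the locked region, as with scoped_guard(). */
static int push_free(int max)
{
	int ret = -1;

	{
		GUARD(&lock);
		if (free_slots < max)
			ret = ++free_slots;
	}
	/* the lock is already released here; unlocked work would follow */
	return ret;
}
```

In the short function the implicit unlock is easy to see at a glance; in a
longer one the explicit block makes the extent of the critical section
visible, which is the readability argument behind preferring scoped_guard().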
   
>   Low level cleanup constructs (such as ``__free()``) can be used when building
>   APIs and helpers, especially scoped iterators. However, direct use of
>   ``__free()`` within networking core and drivers is discouraged.
>   Similar guidance applies to declaring variables mid-function.
>   
> See: https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#using-device-managed-and-cleanup-h-constructs

^ permalink raw reply

* Re: [PATCH 1/6] alloc_tag: add ioctl to /proc/allocinfo
From: Hao Ge @ 2026-05-21  8:19 UTC (permalink / raw)
  To: Suren Baghdasaryan, Abhishek Bapat
  Cc: Shuah Khan, Jonathan Corbet, linux-doc, linux-kernel, linux-mm,
	Sourav Panda, Kent Overstreet, Andrew Morton
In-Reply-To: <CAJuCfpFn0Oefewvjp1jBhCOgxxwhHFy_RK08DwQywOjYcfr2pw@mail.gmail.com>

On 2026/5/20 01:42, Suren Baghdasaryan wrote:
> On Mon, May 18, 2026 at 7:53 PM Hao Ge <hao.ge@linux.dev> wrote:
>> Hi Abhishek
>>
>>
>> Thanks for the follow-up.
>>
>>
>> On 2026/5/19 07:41, Abhishek Bapat wrote:
>>> On Wed, May 13, 2026 at 9:38 PM Hao Ge<hao.ge@linux.dev>  wrote:
>>>> Hi Suren and Abhishek
>>>>
>>>>
>>>> Thanks for the patch! A couple of minor comments below.
>>>>
>>>>
>>>> On 2026/5/5 07:36, Abhishek Bapat wrote:
>>>>> From: Suren Baghdasaryan<surenb@google.com>
>>>>>
>>>>> Add the following ioctl commands for /proc/allocinfo file:
>>>>>
>>>>> ALLOCINFO_IOC_CONTENT_ID - gets content identifier which can be used
>>>>> to check whether the file content has changed specifically due to module
>>>>> load/unload. Every time a module is loaded / unloaded, the returned
>>>>> value will be different. By comparing the identifier value at the
>>>>> beginning and at the end of the content retrieval operation, users can
>>>>> validate retrieved information for consistency.
>>>>>
>>>>> ALLOCINFO_IOC_GET_AT - gets the record at the specified position. This
>>>>> is the position of a record in /proc/allocinfo.
>>>>>
>>>>> ALLOCINFO_IOC_GET_NEXT - gets the record next to the last retrieved
>>>>> one. If no records were previously retrieved, returns the first
>>>>> record.
>>>>>
>>>>> Signed-off-by: Suren Baghdasaryan<surenb@google.com>
>>>>> Signed-off-by: Abhishek Bapat<abhishekbapat@google.com>
>>>>> ---
>>>>>     .../userspace-api/ioctl/ioctl-number.rst      |   2 +
>>>>>     include/linux/codetag.h                       |   1 +
>>>>>     include/uapi/linux/alloc_tag.h                |  54 ++++++
>>>>>     lib/alloc_tag.c                               | 178 +++++++++++++++++-
>>>>>     lib/codetag.c                                 |  11 ++
>>>>>     5 files changed, 244 insertions(+), 2 deletions(-)
>>>>>     create mode 100644 include/uapi/linux/alloc_tag.h
>>>>>
>>>>> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
>>>>> index 331223761fff..84f6808a8578 100644
>>>>> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
>>>>> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
>>>>> @@ -349,6 +349,8 @@ Code  Seq#    Include File                                             Comments
>>>>>                                                                            <mailto:luzmaximilian@gmail.com>
>>>>>     0xA5  20-2F  linux/surface_aggregator/dtx.h                            Microsoft Surface DTX driver
>>>>>                                                                            <mailto:luzmaximilian@gmail.com>
>>>>> +0xA6  00-0F  uapi/linux/alloc_tag.h                                    Memory allocation profiling
>>>>> +<mailto:surenb@google.com>
>>>>>     0xAA  00-3F  linux/uapi/linux/userfaultfd.h
>>>>>     0xAB  00-1F  linux/nbd.h
>>>>>     0xAC  00-1F  linux/raw.h
>>>>> diff --git a/include/linux/codetag.h b/include/linux/codetag.h
>>>>> index 8ea2a5f7c98a..2bcd4e7c809e 100644
>>>>> --- a/include/linux/codetag.h
>>>>> +++ b/include/linux/codetag.h
>>>>> @@ -76,6 +76,7 @@ struct codetag_iterator {
>>>>>
>>>>>     void codetag_lock_module_list(struct codetag_type *cttype, bool lock);
>>>>>     bool codetag_trylock_module_list(struct codetag_type *cttype);
>>>>> +unsigned long codetag_get_content_id(struct codetag_type *cttype);
>>>>>     struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype);
>>>>>     struct codetag *codetag_next_ct(struct codetag_iterator *iter);
>>>>>
>>>>> diff --git a/include/uapi/linux/alloc_tag.h b/include/uapi/linux/alloc_tag.h
>>>>> new file mode 100644
>>>>> index 000000000000..e9a5b55fcc7a
>>>>> --- /dev/null
>>>>> +++ b/include/uapi/linux/alloc_tag.h
>>>>> @@ -0,0 +1,54 @@
>>>>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>>>>> +/*
>>>>> + *  include/linux/alloc_tag.h
>>>>> + */
>>>>> +
>>>>> +#ifndef _UAPI_ALLOC_TAG_H
>>>>> +#define _UAPI_ALLOC_TAG_H
>>>>> +
>>>>> +#include <linux/types.h>
>>>>> +
>>>>> +#define ALLOCINFO_STR_SIZE   64
>>>>> +
>>>>> +struct allocinfo_content_id {
>>>>> +     __u64 id;
>>>>> +};
>>>>> +
>>>>> +struct allocinfo_tag {
>>>>> +     /* Longer names are trimmed */
>>>>> +     char modname[ALLOCINFO_STR_SIZE];
>>>>> +     char function[ALLOCINFO_STR_SIZE];
>>>>> +     char filename[ALLOCINFO_STR_SIZE];
>>>>> +     __u64 lineno;
>>>>> +};
>>>>> +
>>>>> +struct allocinfo_counter {
>>>>> +     __u64 bytes;
>>>>> +     __u64 calls;
>>>>> +     __u8 accurate;
>>>>> +     __u8 pad[7]; /* Add alignment to not break the 32-bit compatible interface */
>>>>> +};
>>>>> +
>>>>> +struct allocinfo_tag_data {
>>>>> +     struct allocinfo_tag tag;
>>>>> +     struct allocinfo_counter counter;
>>>>> +};
>>>>> +
>>>>> +struct allocinfo_get_at {
>>>>> +     __u64 pos;      /* input */
>>>>> +     struct allocinfo_tag_data data;
>>>>> +};
>>>>> +
>>>>> +#define _ALLOCINFO_IOC_CONTENT_ID    0
>>>>> +#define _ALLOCINFO_IOC_GET_AT                1
>>>>> +#define _ALLOCINFO_IOC_GET_NEXT              2
>>>>> +
>>>>> +#define ALLOCINFO_IOC_BASE           0xA6
>>>>> +#define ALLOCINFO_IOC_CONTENT_ID     _IOR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_CONTENT_ID,     \
>>>>> +                                          struct allocinfo_content_id)
>>>>> +#define ALLOCINFO_IOC_GET_AT         _IOWR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_GET_AT,        \
>>>>> +                                           struct allocinfo_get_at)
>>>>> +#define ALLOCINFO_IOC_GET_NEXT               _IOR(ALLOCINFO_IOC_BASE, _ALLOCINFO_IOC_GET_NEXT,       \
>>>>> +                                          struct allocinfo_tag_data)
>>>>> +
>>>>> +#endif /* _UAPI_ALLOC_TAG_H */
>>>>> diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
>>>>> index ed1bdcf1f8ab..5c24d2f954d4 100644
>>>>> --- a/lib/alloc_tag.c
>>>>> +++ b/lib/alloc_tag.c
>>>>> @@ -14,6 +14,7 @@
>>>>>     #include <linux/string_choices.h>
>>>>>     #include <linux/vmalloc.h>
>>>>>     #include <linux/kmemleak.h>
>>>>> +#include <uapi/linux/alloc_tag.h>
>>>>>
>>>>>     #define ALLOCINFO_FILE_NAME         "allocinfo"
>>>>>     #define MODULE_ALLOC_TAG_VMAP_SIZE  (100000UL * sizeof(struct alloc_tag))
>>>>> @@ -46,6 +47,9 @@ int alloc_tag_ref_offs;
>>>>>     struct allocinfo_private {
>>>>>         struct codetag_iterator iter;
>>>>>         bool print_header;
>>>>> +     /* ioctl uses a separate iterator not to interfere with reads */
>>>>> +     struct codetag_iterator ioctl_iter;
>>>>> +     bool positioned; /* seq_open_private() sets to 0 */
>>>>>     };
>>>>>
>>>>>     static void *allocinfo_start(struct seq_file *m, loff_t *pos)
>>>>> @@ -125,6 +129,177 @@ static const struct seq_operations allocinfo_seq_op = {
>>>>>         .show   = allocinfo_show,
>>>>>     };
>>>>>
>>>>> +static int allocinfo_open(struct inode *inode, struct file *file)
>>>>> +{
>>>>> +     return seq_open_private(file, &allocinfo_seq_op,
>>>>> +                             sizeof(struct allocinfo_private));
>>>>> +}
>>>>> +
>>>>> +static int allocinfo_release(struct inode *inode, struct file *file)
>>>>> +{
>>>>> +     return seq_release_private(inode, file);
>>>>> +}
>>>>> +
>>>>> +static const char *allocinfo_str(const char *str)
>>>>> +{
>>>>> +     size_t len = strlen(str);
>>>>> +
>>>>> +     /* Keep an extra space for the trailing NULL. */
>>>>> +     if (len >= ALLOCINFO_STR_SIZE)
>>>>> +             str += (len - ALLOCINFO_STR_SIZE) + 1;
>>>>> +     return str;
>>>>> +}
>>>>> +
>>>>> +/* Copy a string and trim from the beginning if it's too long */
>>>>> +static void allocinfo_copy_str(char *dest, const char *src)
>>>>> +{
>>>>> +     strscpy(dest, allocinfo_str(src), ALLOCINFO_STR_SIZE);
>>>>> +}
>>>>> +
>>>>> +static void allocinfo_to_params(struct codetag *ct,
>>>>> +                             struct allocinfo_tag_data *data)
>>>>> +{
>>>>> +     struct alloc_tag *tag = ct_to_alloc_tag(ct);
>>>>> +     struct alloc_tag_counters counter = alloc_tag_read(tag);
>>>>> +
>>>>> +     if (ct->modname)
>>>>> +             allocinfo_copy_str(data->tag.modname, ct->modname);
>>>>> +     else
>>>>> +             data->tag.modname[0] = '\0';
>>>> Minor nit about allocinfo_to_params():
>>>>
>>>> When modname is NULL (built-in kernel code), the current code sets it
>>>> to an empty string:
>>>>
>>>>        if (ct->modname)
>>>>            allocinfo_copy_str(data->tag.modname, ct->modname);
>>>>        else
>>>>            data->tag.modname[0] = '\0';
>>>>
>>>> This is of course workable in userspace by checking for an empty
>>>> string, but I was wondering if it would be cleaner to use "vmlinux"
>>>> as a default:
>>>>
>>>>        else
>>>>            allocinfo_copy_str(data->tag.modname, "vmlinux");
>>>>
>>>> For some context, in our memory analysis workflow we often group
>>>> allocations by module to get a quick overview of where memory goes,
>>>> for example:
>>>>
>>>> vmlinux:    2.1 GB    (kernel core)
>>>> nvidia:     1.2 GB    (GPU driver)
>>>> iwlwifi:    800 MB    (WiFi driver)
>>>> ext4:       500 MB    (filesystem)
>>>>
>>>> Having a consistent identifier for kernel built-in allocations would
>>>> avoid each userspace tool needing to handle the empty string as a
>>>> special case. Totally fine if this is intentional though.
>>>>
>>> Thanks for bringing this up, I can certainly make this change.
>>> However, the information is not currently exposed this way through
>>> /proc/allocinfo. /proc/allocinfo does not categorize kernel non-module
>>> allocations as vmlinux, so there will be a delta between how IOCTL and
>>> /proc/allocinfo behave. Suren, could you comment on whether this
>>> recommendation is fine by you?
>>>
>> Right, /proc/allocinfo indeed doesn't categorize them as vmlinux currently.
>>
>> It's just that in practice we often group allocations by module, so
>> having "vmlinux" as a default would be convenient. Let's wait for
>> Suren's input.
> Hi Folks,
> I would prefer to keep it empty because vmlinux is not really a module
> and hardcoding this name also seems suboptimal (in case it ever
> changes). Empty string also aligns with how we output /proc/allocinfo
> data. If the symbol is in the kernel itself, we do not display the
> module name at all. So, all in all, unless there is a strong reason
> against it, I think we should keep it empty.

Hi Suren,

Thanks for the clarification, that makes sense.

For userspace tools that want to group by module, we can always map an
empty modname to "vmlinux" at the presentation layer; there is no need to
hardcode that in the kernel.
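
As a rough illustration of that presentation-layer mapping, a userspace
tool could do something like the sketch below. The helper name
display_modname() and the "vmlinux" label are hypothetical choices made
by the tool; neither is part of the proposed uAPI.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace helper (not part of the patch): choose the label
 * used when grouping allocations by module. The ioctl reports built-in
 * kernel code with an empty modname, so map that to a display name here.
 */
static const char *display_modname(const char *modname)
{
	assert(modname != NULL);	/* expected to point at tag.modname */

	return modname[0] != '\0' ? modname : "vmlinux";
}
```

A tool would pass data.tag.modname from struct allocinfo_tag_data through
this helper before aggregating, so the kernel interface can keep the empty
string.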


Hi Abhishek,

I noticed that new files (such as include/uapi/linux/alloc_tag.h) were
added in this patchset. Should they be reflected in the MAINTAINERS file
for easier future maintenance?

Thanks

Best Regards
Hao

>>>>> +     allocinfo_copy_str(data->tag.function, ct->function);
>>>>> +     allocinfo_copy_str(data->tag.filename, ct->filename);
>>>>> +     data->tag.lineno = ct->lineno;
>>>>> +     data->counter.bytes = counter.bytes;
>>>>> +     data->counter.calls = counter.calls;
>>>>> +     data->counter.accurate = !alloc_tag_is_inaccurate(tag);
>>>>> +}
>>>>> +
>>>>> +static int allocinfo_ioctl_get_content_id(struct seq_file *m, void __user *arg)
>>>>> +{
>>>>> +     struct allocinfo_content_id params;
>>>>> +
>>>>> +     codetag_lock_module_list(alloc_tag_cttype, true);
>>>>> +     params.id = codetag_get_content_id(alloc_tag_cttype);
>>>>> +     codetag_lock_module_list(alloc_tag_cttype, false);
>>>>> +     if (copy_to_user(arg, &params, sizeof(params)))
>>>>> +             return -EFAULT;
>>>>> +
>>>>> +     return 0;
>>>>> +}
>>>>> +
>>>>> +static int allocinfo_ioctl_get_at(struct seq_file *m, void __user *arg)
>>>>> +{
>>>>> +     struct allocinfo_private *priv;
>>>>> +     struct codetag *ct;
>>>>> +     __u64 pos;
>>>>> +     struct allocinfo_get_at params = {0};
>>>>> +
>>>>> +     if (copy_from_user(&params, arg, sizeof(params)))
>>>>> +             return -EFAULT;
>>>>> +
>>>>> +     priv = (struct allocinfo_private *)m->private;
>>>>> +     pos = params.pos;
>>>>> +
>>>>> +     codetag_lock_module_list(alloc_tag_cttype, true);
>>>>> +
>>>>> +     /* Find the codetag */
>>>>> +     priv->ioctl_iter = codetag_get_ct_iter(alloc_tag_cttype);
>>>>> +     ct = codetag_next_ct(&priv->ioctl_iter);
>>>>> +     while (ct && pos--)
>>>>> +             ct = codetag_next_ct(&priv->ioctl_iter);
>>>> I noticed that codetag_next_ct(&priv->ioctl_iter) and
>>>> priv->positioned are accessed without serialization in the ioctl
>>>> path. Concurrent ioctl calls on the same fd could race on these
>>>> fields. Just something I spotted while reading the code.
>>>>
>>>> Thanks
>>>>
>>>> Best Regards
>>>> Hao
>>>>
>>> I believe this should be prevented by `codetag_lock_module_list`; am I
>>> wrong in my understanding?
>> Thanks for the explanation! codetag_lock_module_list is designed to
>> protect the module list from concurrent load/unload, which it does
>> correctly. However, it doesn't cover the race between concurrent ioctl
>> calls on the same fd, since it acquires cttype->mod_lock via
>> down_read() and rwsem read locks allow multiple readers to proceed
>> concurrently:
>>
>> Thread A: ALLOCINFO_IOC_GET_AT
>>   down_read(&cttype->mod_lock)               // read lock acquired
>>   priv->ioctl_iter = codetag_get_ct_iter(...)
>>   ct = codetag_next_ct(&priv->ioctl_iter)
>>   priv->positioned = true;
>>
>> Thread B: ALLOCINFO_IOC_GET_NEXT             // concurrent ioctl on same fd
>>   down_read(&cttype->mod_lock)               // read locks don't exclude each other
>>   if (!priv->positioned) {                   // sees partial state from Thread A
>>       priv->ioctl_iter = ...                 // overwrites Thread A's iterator
>>   }
>>   ct = codetag_next_ct(&priv->ioctl_iter)    // corrupted iterator
>>
>> priv->ioctl_iter and priv->positioned are per-fd state with no
>> serialization in the ioctl path.
> Yep, you are right. codetag_lock_module_list() is not enough here to
> protect from such races. I guess allocinfo_private would need another
> lock.
> Thanks,
> Suren.
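
The interleaving described above can be replayed deterministically in a
small single-threaded userspace model. This is only a toy sketch, not
kernel code: struct priv_model, NTAGS and the helper names are all
hypothetical, and the iterator is reduced to an index. GET_AT is split
into two halves to mark the window in which a concurrent GET_NEXT on the
same fd can run while both hold the read lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NTAGS 8	/* pretend the tag list has eight entries */

/* Toy stand-in for the per-fd ioctl_iter/positioned pair. */
struct priv_model {
	size_t next_idx;	/* entry the iterator hands out next */
	bool positioned;
};

/* First half of GET_AT: reset the iterator and walk to entry pos. */
static void get_at_seek(struct priv_model *p, size_t pos)
{
	assert(pos < NTAGS);
	p->next_idx = pos + 1;
}

/* Second half of GET_AT: publish that the iterator is valid. */
static void get_at_finish(struct priv_model *p)
{
	p->positioned = true;
}

/* GET_NEXT: returns the entry index, or -1 once the list is exhausted. */
static long get_next(struct priv_model *p)
{
	if (!p->positioned) {
		p->next_idx = 0;
		p->positioned = true;
	}
	if (p->next_idx >= NTAGS) {
		p->positioned = false;
		return -1;
	}
	return (long)p->next_idx++;
}

/* Thread B's GET_NEXT lands between the two halves of Thread A's GET_AT. */
static long racy_sequence(size_t pos)
{
	struct priv_model p = { 0 };

	get_at_seek(&p, pos);	/* A */
	(void)get_next(&p);	/* B: sees !positioned, rewinds to entry 0 */
	get_at_finish(&p);	/* A */
	return get_next(&p);	/* A wanted pos + 1 */
}

/* Same calls, but A's GET_AT completes before B runs (per-fd lock held). */
static long serialized_sequence(size_t pos)
{
	struct priv_model p = { 0 };

	get_at_seek(&p, pos);	/* A, under the per-fd lock */
	get_at_finish(&p);
	(void)get_next(&p);	/* B: continues from pos + 1 */
	return get_next(&p);	/* A: pos + 2, shared cursor but consistent */
}
```

With pos = 5, the racy ordering makes A's follow-up GET_NEXT return
entry 1 instead of continuing after entry 5, while the serialized
ordering returns entry 7 because B consumed entry 6. A per-fd lock in
allocinfo_private, as suggested above, would be one way to force the
second ordering.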
>
>
>> Just something I spotted.
>>
>> Thanks
>>
>> Best Regards
>>
>> Hao
>>
>>>>> +     if (ct) {
>>>>> +             allocinfo_to_params(ct, &params.data);
>>>>> +             priv->positioned = true;
>>>>> +     }
>>>>> +
>>>>> +     codetag_lock_module_list(alloc_tag_cttype, false);
>>>>> +
>>>>> +     if (!ct)
>>>>> +             return -ENOENT;
>>>>> +
>>>>> +     if (copy_to_user(arg, &params, sizeof(params)))
>>>>> +             return -EFAULT;
>>>>> +
>>>>> +     return 0;
>>>>> +}
>>>>> +
>>>>> +static int allocinfo_ioctl_get_next(struct seq_file *m, void __user *arg)
>>>>> +{
>>>>> +     struct allocinfo_private *priv;
>>>>> +     struct codetag *ct;
>>>>> +     struct allocinfo_tag_data params = {0};
>>>>> +     int ret = 0;
>>>>> +
>>>>> +     priv = (struct allocinfo_private *)m->private;
>>>>> +
>>>>> +     codetag_lock_module_list(alloc_tag_cttype, true);
>>>>> +
>>>>> +     if (!priv->positioned) {
>>>>> +             priv->ioctl_iter = codetag_get_ct_iter(alloc_tag_cttype);
>>>>> +             priv->positioned = true;
>>>>> +     }
>>>>> +
>>>>> +     ct = codetag_next_ct(&priv->ioctl_iter);
>>>>> +     if (ct)
>>>>> +             allocinfo_to_params(ct, &params);
>>>>> +
>>>>> +     if (!ct) {
>>>>> +             priv->positioned = false;
>>>>> +             ret = -ENOENT;
>>>>> +     }
>>>>> +     codetag_lock_module_list(alloc_tag_cttype, false);
>>>>> +
>>>>> +     if (ret == 0) {
>>>>> +             if (copy_to_user(arg, &params, sizeof(params)))
>>>>> +                     return -EFAULT;
>>>>> +     }
>>>>> +     return ret;
>>>>> +}
>>>>> +
>>>>> +static long allocinfo_ioctl(struct file *file, unsigned int cmd,
>>>>> +                         unsigned long __arg)
>>>>> +{
>>>>> +     void __user *arg = (void __user *)__arg;
>>>>> +     int ret;
>>>>> +
>>>>> +     switch (cmd) {
>>>>> +     case ALLOCINFO_IOC_CONTENT_ID:
>>>>> +             ret = allocinfo_ioctl_get_content_id(file->private_data, arg);
>>>>> +             break;
>>>>> +     case ALLOCINFO_IOC_GET_AT:
>>>>> +             ret = allocinfo_ioctl_get_at(file->private_data, arg);
>>>>> +             break;
>>>>> +     case ALLOCINFO_IOC_GET_NEXT:
>>>>> +             ret = allocinfo_ioctl_get_next(file->private_data, arg);
>>>>> +             break;
>>>>> +     default:
>>>>> +             ret = -ENOIOCTLCMD;
>>>>> +             break;
>>>>> +     }
>>>>> +
>>>>> +     return ret;
>>>>> +}
>>>>> +
>>>>> +#ifdef CONFIG_COMPAT
>>>>> +static long allocinfo_compat_ioctl(struct file *file, unsigned int cmd,
>>>>> +                                unsigned long arg)
>>>>> +{
>>>>> +     return allocinfo_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
>>>>> +}
>>>>> +#endif
>>>>> +
>>>>> +static const struct proc_ops allocinfo_proc_ops = {
>>>>> +     .proc_open              = allocinfo_open,
>>>>> +     .proc_read_iter         = seq_read_iter,
>>>>> +     .proc_lseek             = seq_lseek,
>>>>> +     .proc_release           = allocinfo_release,
>>>>> +     .proc_ioctl             = allocinfo_ioctl,
>>>>> +#ifdef CONFIG_COMPAT
>>>>> +     .proc_compat_ioctl      = allocinfo_compat_ioctl,
>>>>> +#endif
>>>>> +
>>>>> +};
>>>>> +
>>>>>     size_t alloc_tag_top_users(struct codetag_bytes *tags, size_t count, bool can_sleep)
>>>>>     {
>>>>>         struct codetag_iterator iter;
>>>>> @@ -946,8 +1121,7 @@ static int __init alloc_tag_init(void)
>>>>>                 return 0;
>>>>>         }
>>>>>
>>>>> -     if (!proc_create_seq_private(ALLOCINFO_FILE_NAME, 0400, NULL, &allocinfo_seq_op,
>>>>> -                                  sizeof(struct allocinfo_private), NULL)) {
>>>>> +     if (!proc_create(ALLOCINFO_FILE_NAME, 0400, NULL, &allocinfo_proc_ops)) {
>>>>>                 pr_err("Failed to create %s file\n", ALLOCINFO_FILE_NAME);
>>>>>                 shutdown_mem_profiling(false);
>>>>>                 return -ENOMEM;
>>>>> diff --git a/lib/codetag.c b/lib/codetag.c
>>>>> index 304667897ad4..93aa30991563 100644
>>>>> --- a/lib/codetag.c
>>>>> +++ b/lib/codetag.c
>>>>> @@ -48,6 +48,17 @@ bool codetag_trylock_module_list(struct codetag_type *cttype)
>>>>>         return down_read_trylock(&cttype->mod_lock) != 0;
>>>>>     }
>>>>>
>>>>> +unsigned long codetag_get_content_id(struct codetag_type *cttype)
>>>>> +{
>>>>> +     lockdep_assert_held(&cttype->mod_lock);
>>>>> +
>>>>> +     /*
>>>>> +      * next_mod_seq is updated on every load, so can be used to identify
>>>>> +      * content changes.
>>>>> +      */
>>>>> +     return cttype->next_mod_seq;
>>>>> +}
>>>>> +
>>>>>     struct codetag_iterator codetag_get_ct_iter(struct codetag_type *cttype)
>>>>>     {
>>>>>         struct codetag_iterator iter = {
>>> Note, I will be following up with a v2 patchset with your feedback
>>> included. Please bring up any other points you'd want to clarify so
>>> that I can include all the changes in the v2 patchset. Thanks for
>>> reviewing!

^ permalink raw reply

* Re: [PATCH v5 04/13] ima: Introduce per binary measurements list type binary_runtime_size value
From: Roberto Sassu @ 2026-05-21  7:58 UTC (permalink / raw)
  To: Mimi Zohar, corbet, skhan, dmitry.kasatkin, eric.snowberg, paul,
	jmorris, serge
  Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
	gregorylumen, chenste, nramas, Roberto Sassu
In-Reply-To: <b7f97a0a3b79b72a014d12514febc338d1ecd038.camel@linux.ibm.com>

On Wed, 2026-05-20 at 22:06 -0400, Mimi Zohar wrote:
> On Wed, 2026-04-29 at 18:03 +0200, Roberto Sassu wrote:
> > From: Roberto Sassu <roberto.sassu@huawei.com>
> > 
> > Make binary_runtime_size an array, to have separate counters per binary
> > measurements list type. Currently, define the BINARY type for the existing
> > binary measurements list.
> > 
> > Introduce ima_update_binary_runtime_size() to facilitate updating a
> > binary_runtime_size value with a given binary measurement list type.
> > 
> > Also add the binary measurements list type parameter to
> > ima_get_binary_runtime_size(), to retrieve the desired value. Retrieving
> > the value is now done under the ima_extend_list_mutex, since there can be
> > concurrent updates.
> > 
> > No functional change (except for the mutex usage, that fixes the
> > concurrency issue): the BINARY array element is equivalent to the old
> > binary_runtime_size.
> 
> The patch is really clear and well written, but I don't see a concurrency issue
> requiring taking the ima_extend_list_mutex at least in this patch.

binary_runtime_size is not an atomic variable. It is updated under the
ima_extend_list_mutex lock in ima_add_digest_entry(). The same lock
must be taken on the reader side, ima_get_binary_runtime_size().
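
As a userspace analogy of that rule (a sketch only, using a pthread mutex
in place of ima_extend_list_mutex; all names below are hypothetical and
not the IMA code), both the writer and the reader of a plain, non-atomic
counter take the same lock:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical list types; only the first mirrors the existing list. */
enum runtime_list_type { LIST_BINARY, LIST_TYPE_COUNT };

static pthread_mutex_t extend_list_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t runtime_size[LIST_TYPE_COUNT];	/* plain, non-atomic */

/* Writer side: the counter is only modified with the lock held. */
static void update_runtime_size(enum runtime_list_type type,
				uint64_t entry_size)
{
	assert(type < LIST_TYPE_COUNT);

	pthread_mutex_lock(&extend_list_lock);
	runtime_size[type] += entry_size;
	pthread_mutex_unlock(&extend_list_lock);
}

/* Reader side: take the same lock so a concurrent update is excluded. */
static uint64_t get_runtime_size(enum runtime_list_type type)
{
	uint64_t val;

	assert(type < LIST_TYPE_COUNT);

	pthread_mutex_lock(&extend_list_lock);
	val = runtime_size[type];
	pthread_mutex_unlock(&extend_list_lock);

	return val;
}
```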

Roberto


^ permalink raw reply

* Re: [PATCH v6 18/43] KVM: Move KVM_VM_MEMORY_ATTRIBUTES config definition to x86
From: Fuad Tabba @ 2026-05-21  8:07 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	ira.weiny, jmattson, jthoughton, michael.roth, oupton,
	pankaj.gupta, qperret, rick.p.edgecombe, rientjes, shivankg,
	steven.price, willy, wyihan, yan.y.zhao, forkloop, pratyush,
	suzuki.poulose, aneesh.kumar, liam, Paolo Bonzini,
	Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
	Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng,
	Shakeel Butt, Kiryl Shutsemau, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <20260507-gmem-inplace-conversion-v6-18-91ab5a8b19a4@google.com>

On Thu, 7 May 2026 at 21:22, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Sean Christopherson <seanjc@google.com>
>
> Bury KVM_VM_MEMORY_ATTRIBUTES in x86 to discourage other architectures
> from adding support for per-VM memory attributes, because tracking private
> vs. shared memory on a per-VM basis is now deprecated in favor of tracking
> on a per-guest_memfd basis, and no other memory attributes are on the
> horizon.
>
> This will also allow modifying KVM_VM_MEMORY_ATTRIBUTES to be
> user-selectable (in x86) without creating weirdness in KVM's Kconfigs.
> Now that guest_memfd supports memory attributes, it's entirely possible to
> run x86 CoCo VMs without support for KVM_VM_MEMORY_ATTRIBUTES.
>
> Leave the code itself in common KVM so that it's trivial to undo this
> change if new per-VM attributes do come along.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad
> ---
>  arch/x86/kvm/Kconfig | 4 ++++
>  virt/kvm/Kconfig     | 4 ----
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 26f6afd51bbdc..b6d65ee664d0f 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -80,6 +80,10 @@ config KVM_WERROR
>
>           If in doubt, say "N".
>
> +config KVM_VM_MEMORY_ATTRIBUTES
> +       select KVM_MEMORY_ATTRIBUTES
> +       bool
> +
>  config KVM_SW_PROTECTED_VM
>         bool "Enable support for KVM software-protected VMs"
>         depends on EXPERT
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index e371e079e2c50..663de6421eda2 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -103,10 +103,6 @@ config KVM_MMU_LOCKLESS_AGING
>  config KVM_MEMORY_ATTRIBUTES
>         bool
>
> -config KVM_VM_MEMORY_ATTRIBUTES
> -       select KVM_MEMORY_ATTRIBUTES
> -       bool
> -
>  config KVM_GUEST_MEMFD
>         select XARRAY_MULTI
>         select KVM_MEMORY_ATTRIBUTES
>
> --
> 2.54.0.563.g4f69b47b94-goog
>
>

^ permalink raw reply

* Re: [PATCH v5 8/8] ARM: defconfig: Add a zx29 defconfig file
From: Stefan Dösinger @ 2026-05-21  8:00 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linus Walleij, Jonathan Corbet, Shuah Khan, Russell King,
	Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Krzysztof Kozlowski, Alexandre Belloni, Drew Fustini,
	Greg Kroah-Hartman, Jiri Slaby, linux-doc, linux-kernel,
	linux-arm-kernel, devicetree, soc, linux-serial
In-Reply-To: <30b96e0d-f296-4c31-8701-a15c568ebffc@app.fastmail.com>

[-- Attachment #1: Type: text/plain, Size: 3910 bytes --]

Hi Arnd,

I saw your reply to my defconfig pull request, but apparently never received your original reply. I only found this mail here. It looks like I have to look for a better E-Mail provider as gmail is choking on the volume of the linux-arm-kernel mailing list.

To answer your questions I found at https://lore.kernel.org/all/61452117-0cdc-4ec2-83eb-dc03ccbd410b@app.fastmail.com/ :

> Either way, the patch description above should at least explain
> why you think you need your own defconfig, as we don't normally
> take those.

It was more cluelessness / being new to kernel development that gave me the impression that boards should have defconfigs. Since then I ran across scripts/dt_to_config. I haven't tested it yet on my DT, but if it does the right thing I don't think this board needs a defconfig.

>> +CONFIG_CMDLINE="console=ttyAMA0 earlyprintk root=/dev/ram rw"

> A defconfig should normally not rely on earlyprintk; just add
> that when you actually need to debug the super-early boot
> stages. With "earlycon" it should pick up the right console
> from the stdout path and work almost as early.

>> +CONFIG_BINFMT_FLAT=y

> Are you actually using flat binaries? I wasn't aware that this
> is still possible on MMU-enabled kernels.

>> +CONFIG_BLK_DEV_RAM=y
>> +CONFIG_BLK_DEV_RAM_COUNT=4

> The old ramdisk boot is going away in the future, please use
> initramfs instead. This should also save a good amount of RAM.

I'll fix those in my tree and keep the defconfig around just in case, but otherwise drop it from the submission. We can revisit it later when the board is more complete.

>> +CONFIG_DEVTMPFS=y # FIXME: This is specific to my initrd. Remove before upstream

> stale comment?

I believe I removed this in later versions though :-)

Cheers,
Stefan

> Am 24.04.2026 um 11:54 schrieb Arnd Bergmann <arnd@arndb.de>:
> 
> On Fri, Apr 24, 2026, at 09:13, Linus Walleij wrote:
>> On Tue, Apr 21, 2026 at 10:24 PM Stefan Dösinger
>> <stefandoesinger@gmail.com> wrote:
>> 
>>> This enables existing drivers that already are (UART) or will be (USB,
>>> GPIO) necessary to operate this board even if they aren't declared in
>>> the DTS yet.
>>> 
>>> Signed-off-by: Stefan Dösinger <stefandoesinger@gmail.com>
>> 
>> *I* personally (as SoC maintainer) think that having a few more defconfigs
>> is fine, even helpful.
>> 
>> But I would defer this to the more senior SoC maintainers because I think
>> their stance is something like:
>> 
>> - We have multi_v7_defconfig for compile testing
>> 
>> - We know that binary gets way too big for your system: it's for build
>>  testing and perhaps booting in QEMU or systems with many MB of
>>  RAM, not for actually running it on products.
>> 
>> - You are encouraged to keep your own defconfig out-of-tree.
> 
> Right, we clearly need to do something better than what we are doing with
> the general defconfigs, as I'm sure many of the existing ones are
> never actually used for booting a machine, and are horribly out of
> date with the Kconfig options.
> 
> I wouldn't object to adding another defconfig for a new (or revived)
> soc family, but I don't want to have more per-board ones.
> Overall, we have about 70 defconfigs and 55 soc families that have their
> own mach-* directory (plus a few without code), and the number of
> defconfigs alone makes it hard to keep them up to date. 
> 
>> However I even challenged this myself by adding a defconfig for memory
>> constrained Broadcoms a while back (NACKed/ignored ;) so if it was all
>> up to me I would merge this.
> 
> I don't even remember that discussion ;-)
> 
> One idea might be to have a tiny base defconfig, plus platform
> specific fragments that add drivers. The problem is agreeing
> what bits are essential enough to still get enabled in the
> tiny config.
> 
>       Arnd


[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH v6 17/43] KVM: guest_memfd: Determine invalidation filter from memory attributes
From: Fuad Tabba @ 2026-05-21  7:56 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	ira.weiny, jmattson, jthoughton, michael.roth, oupton,
	pankaj.gupta, qperret, rick.p.edgecombe, rientjes, shivankg,
	steven.price, willy, wyihan, yan.y.zhao, forkloop, pratyush,
	suzuki.poulose, aneesh.kumar, liam, Paolo Bonzini,
	Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
	Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng,
	Shakeel Butt, Kiryl Shutsemau, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <20260507-gmem-inplace-conversion-v6-17-91ab5a8b19a4@google.com>

On Thu, 7 May 2026 at 21:22, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> Before conversion, the range filter doesn't really matter:
>
> + For non-CoCo VMs that use guest_memfd, they have no mirrored tdp, so
>   KVM_DIRECT_ROOTS would have been invalidated anyway.
> + CoCo VMs could not use INIT_SHARED, and there's no conversion support, so
>   always using KVM_FILTER_PRIVATE would have worked.
>
> Now with conversion support, update kvm_gmem_get_invalidate_filter to
> inspect the memory attributes maple tree for a given range.
>
> Instead of determining the invalidation filter based on static inode
> flags, iterate through the attributes maple tree for the specific range
> being invalidated. This allows KVM to identify if the range contains
> private pages, shared pages, or both, and set the filter bits
> accordingly.
>
> Update kvm_gmem_invalidate_begin and kvm_gmem_release to pass the range
> parameters to the filter helper to ensure invalidation accurately
> targets the memory types present in the affected range.
>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad
> ---
>  virt/kvm/guest_memfd.c | 27 ++++++++++++++++++++-------
>  1 file changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 9f6eebfb68f6b..c9f155c2dc5c5 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -193,12 +193,24 @@ static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
>         return folio;
>  }
>
> -static enum kvm_gfn_range_filter kvm_gmem_get_invalidate_filter(struct inode *inode)
> +static enum kvm_gfn_range_filter kvm_gmem_get_invalidate_filter(
> +               struct inode *inode, pgoff_t start, pgoff_t end)
>  {
> -       if (GMEM_I(inode)->flags & GUEST_MEMFD_FLAG_INIT_SHARED)
> -               return KVM_FILTER_SHARED;
> +       struct gmem_inode *gi = GMEM_I(inode);
> +       enum kvm_gfn_range_filter filter = 0;
> +       void *entry;
> +
> +       lockdep_assert(mt_lock_is_held(&gi->attributes));
> +
> +       mt_for_each(&gi->attributes, entry, start, end - 1) {
> +               filter |= (xa_to_value(entry) & KVM_MEMORY_ATTRIBUTE_PRIVATE) ?
> +                         KVM_FILTER_PRIVATE : KVM_FILTER_SHARED;
> +
> +               if (filter == (KVM_FILTER_PRIVATE | KVM_FILTER_SHARED))
> +                       break;
> +       }
>
> -       return KVM_FILTER_PRIVATE;
> +       return filter;
>  }
>
>  static void __kvm_gmem_invalidate_begin(struct gmem_file *f, pgoff_t start,
> @@ -244,7 +256,7 @@ static void kvm_gmem_invalidate_begin(struct inode *inode, pgoff_t start,
>         enum kvm_gfn_range_filter attr_filter;
>         struct gmem_file *f;
>
> -       attr_filter = kvm_gmem_get_invalidate_filter(inode);
> +       attr_filter = kvm_gmem_get_invalidate_filter(inode, start, end);
>
>         kvm_gmem_for_each_file(f, inode)
>                 __kvm_gmem_invalidate_begin(f, start, end, attr_filter);
> @@ -367,6 +379,7 @@ static long kvm_gmem_fallocate(struct file *file, int mode, loff_t offset,
>  static int kvm_gmem_release(struct inode *inode, struct file *file)
>  {
>         struct gmem_file *f = file->private_data;
> +       enum kvm_gfn_range_filter filter;
>         struct kvm_memory_slot *slot;
>         struct kvm *kvm = f->kvm;
>         unsigned long index;
> @@ -398,8 +411,8 @@ static int kvm_gmem_release(struct inode *inode, struct file *file)
>          * memory, as its lifetime is associated with the inode, not the file.
>          */
>         end = i_size_read(inode) >> PAGE_SHIFT;
> -       __kvm_gmem_invalidate_begin(f, 0, end,
> -                                   kvm_gmem_get_invalidate_filter(inode));
> +       filter = kvm_gmem_get_invalidate_filter(inode, 0, end);
> +       __kvm_gmem_invalidate_begin(f, 0, end, filter);
>         __kvm_gmem_invalidate_end(f, 0, end);
>
>         list_del(&f->entry);
>
> --
> 2.54.0.563.g4f69b47b94-goog
>
>

^ permalink raw reply

* Re: [PATCH bpf-next v11 6/8] bpf: Add bpf_list_add to insert node after a given list node
From: Kaitao Cheng @ 2026-05-21  7:35 UTC (permalink / raw)
  To: bot+bpf-ci
  Cc: bpf, ast, corbet, martin.lau, daniel, andrii, eddyz87, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa, shuah,
	chengkaitao, skhan, memxor, linux-kernel, linux-doc, vmalik,
	linux-kselftest, martin.lau, clm, ihor.solodrai
In-Reply-To: <61871735d9151729b3092d42209dad08b1e38fa930bc1c9e452645240efbc40e@mail.kernel.org>

On 2026/5/21 12:08, bot+bpf-ci@kernel.org wrote:
>> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
>> index 1c69476c8a09..89579165ef4d 100644
>> --- a/kernel/bpf/helpers.c
>> +++ b/kernel/bpf/helpers.c
>> @@ -2577,6 +2577,16 @@ __bpf_kfunc int bpf_list_push_back_impl(struct bpf_list_head *head,
>>  	return bpf_list_push_back(head, node, meta__ign, off);
>>  }
>>
>> +__bpf_kfunc int bpf_list_add(struct bpf_list_head *head, struct bpf_list_node *new,
>> +			     struct bpf_list_node *prev__nonown_allowed,
>> +			     struct btf_struct_meta *meta, u64 off)
>> +{
>> +	struct bpf_list_node_kern *n = (void *)new, *p = (void *)prev__nonown_allowed;
>> +	struct list_head *prev_ptr = &p->list_head;
>> +
>> +	return __bpf_list_add(n, head, &prev_ptr, meta ? meta->record : NULL, off);
>> +}
> 
> If __bpf_list_add() fails at runtime (e.g., because prev is not in the
> list indicated by head), it drops the 'new' node via __bpf_obj_drop_impl
> in the fail path.
> 
> However, the verifier unconditionally converts 'new' into a non-owning
> reference during check_kfunc_call(). If __bpf_list_add() fails, the BPF
> program retains this non-owning reference to a freed node.
> 
> Could this allow the program to trigger a use-after-free within the lock
> critical section?
> 
> (sashiko-bot@kernel.org, v10 RESEND)
> https://lore.kernel.org/bpf/20260513204420.2BB22C19425@smtp.kernel.org/

This should be an acceptable case within the current design framework.

>>  static struct bpf_list_node *__bpf_list_del(struct bpf_list_head *head,
>>  					    struct list_head *n)
> 
> [ ... ]
> 
>> @@ -4756,6 +4766,7 @@ BTF_ID_FLAGS(func, bpf_list_push_front, KF_IMPLICIT_ARGS)
>>  BTF_ID_FLAGS(func, bpf_list_push_front_impl)
>>  BTF_ID_FLAGS(func, bpf_list_push_back, KF_IMPLICIT_ARGS)
>>  BTF_ID_FLAGS(func, bpf_list_push_back_impl)
>> +BTF_ID_FLAGS(func, bpf_list_add, KF_IMPLICIT_ARGS)
> 
> Since bpf_list_add is marked with KF_IMPLICIT_ARGS, won't the verifier
> look for a bpf_list_add_impl counterpart during program load?
> 
> Without defining and registering a bpf_list_add_impl function,
> find_kfunc_impl_proto() will fail to find the required _impl suffix in
> the vmlinux BTF.
> 
> Will this cause the verifier to unconditionally reject the kfunc?
> 
> (sashiko-bot@kernel.org, v10 RESEND)
> https://lore.kernel.org/bpf/20260513204420.2BB22C19425@smtp.kernel.org/

The target prototype with the _impl suffix has already been deprecated.

>>  BTF_ID_FLAGS(func, bpf_list_pop_front, KF_ACQUIRE | KF_RET_NULL)
>>  BTF_ID_FLAGS(func, bpf_list_pop_back, KF_ACQUIRE | KF_RET_NULL)
>>  BTF_ID_FLAGS(func, bpf_list_del, KF_ACQUIRE | KF_RET_NULL)
> 
> [ ... ]
> 
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 35eebb5e7769..662ad7312697 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
> 
> [ ... ]
> 
>> @@ -19524,8 +19527,11 @@ int bpf_fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>>  		int struct_meta_reg = BPF_REG_3;
>>  		int node_offset_reg = BPF_REG_4;
>>
>> -		/* rbtree_add has extra 'less' arg, so args-to-fixup are in diff regs */
>> -		if (is_bpf_rbtree_add_kfunc(desc->func_id)) {
>> +		/* list_add/rbtree_add have an extra arg (prev/less),
>> +		 * so args-to-fixup are in diff regs.
>> +		 */
> 
> This isn't a bug, but the multi-line comment format doesn't follow the
> BPF subsystem style. The preferred form in kernel/bpf/ has the opening
> /* on its own line.
> 
> Additionally, consider adding an is_bpf_list_add_kfunc() helper instead
> of direct comparison with special_kfunc_list[KF_bpf_list_add].
> 
> (bot+bpf-ci@kernel.org and Eduard Zingerman, v10 RESEND)
> https://lore.kernel.org/bpf/f573f7e0a5140fe6254d2ddfd3f69add455fa25c67c020c0312e2c2ceeca7c69@mail.kernel.org/

I provided a more detailed explanation in the link below.
https://lore.kernel.org/all/9e300669-8793-41a2-84a1-98fb8195e263@linux.dev/

>> +		if (desc->func_id == special_kfunc_list[KF_bpf_list_add] ||
>> +		    is_bpf_rbtree_add_kfunc(desc->func_id)) {
>>  			struct_meta_reg = BPF_REG_4;
>>  			node_offset_reg = BPF_REG_5;
>>  		}
> 
> 
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> 
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26204125015

-- 
Thanks
Kaitao Cheng



* Re: [PATCH v6 16/43] KVM: guest_memfd: Use actual size for invalidation in kvm_gmem_release()
From: Fuad Tabba @ 2026-05-21  7:30 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	ira.weiny, jmattson, jthoughton, michael.roth, oupton,
	pankaj.gupta, qperret, rick.p.edgecombe, rientjes, shivankg,
	steven.price, willy, wyihan, yan.y.zhao, forkloop, pratyush,
	suzuki.poulose, aneesh.kumar, liam, Paolo Bonzini,
	Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Barry Song,
	Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng,
	Shakeel Butt, Kiryl Shutsemau, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <20260507-gmem-inplace-conversion-v6-16-91ab5a8b19a4@google.com>

Hi Ackerley,

On Thu, 7 May 2026 at 21:22, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> __kvm_gmem_invalidate_begin() and __kvm_gmem_invalidate_end() actually do
> not specially handle -1ul. -1ul is used as a huge number, which legal
> indices do not exceed, and hence the invalidation works as expected.
>
> Since a later patch is going to make use of the exact range, calculate the
> size of the guest_memfd inode and use it as the end range for invalidating
> SPTEs.
>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Want to look at what Sashiko has to say? Seems to be a real issue:

https://sashiko.dev/#/patchset/20260507-gmem-inplace-conversion-v6-0-91ab5a8b19a4%40google.com?part=16

If I understand correctly, the fix should be simple: use
check_add_overflow() to validate the offset and size parameters in
kvm_gmem_bind():

   int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
             unsigned int fd, loff_t offset)
   {
       loff_t size = slot->npages << PAGE_SHIFT;
   +    loff_t end;
       unsigned long start, end_index;
       struct gmem_file *f;
...
   -    if (offset < 0 || !PAGE_ALIGNED(offset) ||
   -        offset + size > i_size_read(inode))
   +    if (offset < 0 || !PAGE_ALIGNED(offset) ||
   +        check_add_overflow(offset, size, &end) ||
   +        end > i_size_read(inode))
           goto err;
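
To illustrate why the plain "offset + size" comparison is not enough, here
is a minimal userspace sketch of the same check. It is not the kernel code:
range_fits() is a hypothetical name, int64_t stands in for loff_t, the
PAGE_ALIGNED() test is left out, and the compiler builtin
__builtin_add_overflow() is used in place of check_add_overflow().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the bounds check discussed above.
 * Returns true only if [offset, offset + size) lies inside a file of
 * file_size bytes, treating a wrapped sum as "does not fit".
 */
static bool range_fits(int64_t offset, int64_t size, int64_t file_size)
{
	int64_t end;

	assert(file_size >= 0);

	if (offset < 0 || size < 0)
		return false;

	/* The builtin returns true when offset + size overflowed. */
	if (__builtin_add_overflow(offset, size, &end))
		return false;

	return end <= file_size;
}
```

With a naive signed addition, an offset close to the maximum value could
wrap to a negative sum and slip past the "> i_size" comparison; the explicit
overflow check rejects that case instead.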

What do you think?

/fuad

> ---
>  virt/kvm/guest_memfd.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 050a8c092b1a3..9f6eebfb68f6b 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -370,6 +370,7 @@ static int kvm_gmem_release(struct inode *inode, struct file *file)
>         struct kvm_memory_slot *slot;
>         struct kvm *kvm = f->kvm;
>         unsigned long index;
> +       pgoff_t end;
>
>         /*
>          * Prevent concurrent attempts to *unbind* a memslot.  This is the last
> @@ -396,9 +397,10 @@ static int kvm_gmem_release(struct inode *inode, struct file *file)
>          * Zap all SPTEs pointed at by this file.  Do not free the backing
>          * memory, as its lifetime is associated with the inode, not the file.
>          */
> -       __kvm_gmem_invalidate_begin(f, 0, -1ul,
> +       end = i_size_read(inode) >> PAGE_SHIFT;
> +       __kvm_gmem_invalidate_begin(f, 0, end,
>                                     kvm_gmem_get_invalidate_filter(inode));
> -       __kvm_gmem_invalidate_end(f, 0, -1ul);
> +       __kvm_gmem_invalidate_end(f, 0, end);
>
>         list_del(&f->entry);
>
>
> --
> 2.54.0.563.g4f69b47b94-goog
>
>


* [PATCH net-next 2/3] devlink: Add eswitch mode boot defaults
From: Tariq Toukan @ 2026-05-21  7:24 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Jonathan Corbet, Shuah Khan, Jiri Pirko, Simon Horman,
	Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Borislav Petkov (AMD), Andrew Morton, Randy Dunlap,
	Thomas Gleixner, Petr Mladek, Peter Zijlstra (Intel), Tejun Heo,
	Vlastimil Babka, Feng Tang, Christian Brauner, Dave Hansen,
	Dapeng Mi, Kees Cook, Marco Elver, Li RongQing, Eric Biggers,
	Paul E. McKenney, linux-doc, linux-kernel, netdev, linux-rdma,
	Gal Pressman, Dragos Tatulea, Jiri Pirko
In-Reply-To: <20260521072434.362624-1-tariqt@nvidia.com>

From: Mark Bloch <mbloch@nvidia.com>

Add devlink_eswitch_mode= command line support for setting an eswitch
mode during device initialization.

The supported syntax selects either all devlink handles or one explicit
comma-separated handle list:

  devlink_eswitch_mode=[*]:<mode>
  devlink_eswitch_mode=[<handle>[,<handle>...]]:<mode>

where <mode> is one of legacy, switchdev or switchdev_inactive. All
selected handles receive the same mode. Assigning different modes to
different handle lists in the same parameter value is not supported.
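
As a rough illustration of the syntax above, the following userspace sketch
splits a parameter value into its selector and mode. It only mirrors the
top-level "[<selector>]:<mode>" shape; the names esw_mode, parse_mode() and
split_param() are hypothetical and are not the functions added by this
patch, and per-handle validation is not shown.

```c
#include <assert.h>
#include <string.h>

enum esw_mode { ESW_LEGACY, ESW_SWITCHDEV, ESW_SWITCHDEV_INACTIVE };

/* Map a mode string to the enum; returns 0 on success, -1 if unknown. */
static int parse_mode(const char *s, enum esw_mode *mode)
{
	if (!strcmp(s, "legacy"))
		*mode = ESW_LEGACY;
	else if (!strcmp(s, "switchdev"))
		*mode = ESW_SWITCHDEV;
	else if (!strcmp(s, "switchdev_inactive"))
		*mode = ESW_SWITCHDEV_INACTIVE;
	else
		return -1;
	return 0;
}

/*
 * Split "[<selector>]:<mode>" in place. On success, *selector points at
 * the NUL-terminated text between the brackets and *mode holds the mode.
 * Returns 0 on success, -1 on malformed input.
 */
static int split_param(char *str, char **selector, enum esw_mode *mode)
{
	char *close;

	assert(selector && mode);

	if (!str || str[0] != '[')
		return -1;

	close = strchr(str + 1, ']');
	/* Reject a missing ']', an empty selector, or a missing ":<mode>". */
	if (!close || close == str + 1 || close[1] != ':' || !close[2])
		return -1;

	*close = '\0';
	*selector = str + 1;
	return parse_mode(close + 2, mode);
}
```

Because everything after the first "]:" is treated as the mode string, a
value that tries to chain a second "[...]:<mode>" group after a comma fails
the mode lookup and is rejected as a whole.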

The default is applied through the existing eswitch_mode_set() devlink
operation, matching the userspace devlink eswitch set command.

Expose devl_apply_default_esw_mode() so drivers can apply the default at
the point where their devlink instance and eswitch operations are ready.
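
The matching decision made when a driver asks for the default can be
sketched in userspace as below. This is an assumption-laden simplification:
struct handle, struct esw_default and esw_default_matches() are hypothetical
names, and a plain array replaces the kernel linked list used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One parsed "<bus-name>/<dev-name>" handle (hypothetical layout). */
struct handle {
	const char *bus_name;
	const char *dev_name;
};

/* Parsed default: either "match everything" or an explicit handle list. */
struct esw_default {
	bool match_all;
	const struct handle *handles;
	size_t n_handles;
};

/* Return true if the default should be applied to bus_name/dev_name. */
static bool esw_default_matches(const struct esw_default *def,
				const char *bus_name, const char *dev_name)
{
	size_t i;

	assert(def && bus_name && dev_name);

	if (def->match_all)
		return true;

	for (i = 0; i < def->n_handles; i++) {
		if (!strcmp(def->handles[i].bus_name, bus_name) &&
		    !strcmp(def->handles[i].dev_name, dev_name))
			return true;
	}

	return false;
}
```

A device that does not match simply gets no default applied, which
corresponds to the helper returning 0 without calling eswitch_mode_set().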

Document the devlink_eswitch_mode= syntax and duplicate handle handling.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 .../admin-guide/kernel-parameters.txt         |  25 ++
 .../networking/devlink/devlink-defaults.rst   |  80 ++++++
 Documentation/networking/devlink/index.rst    |   1 +
 include/net/devlink.h                         |   1 +
 net/devlink/core.c                            | 255 ++++++++++++++++++
 5 files changed, 362 insertions(+)
 create mode 100644 Documentation/networking/devlink/devlink-defaults.rst

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7834ee927310..f87ae561c0dc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1278,6 +1278,31 @@ Kernel parameters
 	dell_smm_hwmon.fan_max=
 			[HW] Maximum configurable fan speed.
 
+	devlink_eswitch_mode=
+			[NET]
+			Format:
+			[<selector>]:<mode>
+
+			<selector>:
+			* | <handle>[,<handle>...]
+
+			<handle>:
+			<bus-name>/<dev-name>
+
+			Configure default devlink eswitch mode for matching
+			devlink instances during device initialization.
+
+			<mode>:
+			legacy | switchdev | switchdev_inactive
+
+			Examples:
+			devlink_eswitch_mode=[*]:switchdev
+			devlink_eswitch_mode=[pci/0000:08:00.0]:switchdev
+			devlink_eswitch_mode=[pci/0000:08:00.0,pci/0000:09:00.1]:legacy
+
+			See Documentation/networking/devlink/devlink-defaults.rst
+			for the full syntax.
+
 	dfltcc=		[HW,S390]
 			Format: { on | off | def_only | inf_only | always }
 			on:       s390 zlib hardware support for compression on
diff --git a/Documentation/networking/devlink/devlink-defaults.rst b/Documentation/networking/devlink/devlink-defaults.rst
new file mode 100644
index 000000000000..b554e75eeeea
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-defaults.rst
@@ -0,0 +1,80 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Devlink Eswitch Mode Defaults
+==============================
+
+Devlink eswitch mode defaults allow the eswitch mode to be provided on the
+kernel command line and applied to matching devlink instances during device
+initialization.
+
+The devlink device is selected by its devlink handle. For PCI devices this is
+the same handle shown by ``devlink dev show``, for example
+``pci/0000:08:00.0``.
+
+Kernel command line syntax
+==========================
+
+Defaults are specified with the ``devlink_eswitch_mode=`` kernel command line
+parameter.
+
+The general syntax is::
+
+  devlink_eswitch_mode=[<selector>]:<mode>
+
+``<selector>`` is either ``*`` or one or more devlink handles::
+
+  * | <bus-name>/<dev-name>[,<bus-name>/<dev-name>...]
+
+``*`` applies the mode to every devlink instance. All handles in the same
+``[]`` list receive the same eswitch mode.
+
+``<mode>`` is one of ``legacy``, ``switchdev`` or ``switchdev_inactive``.
+
+Syntax rules
+------------
+
+The following syntax rules apply:
+
+* Specify the default in one ``devlink_eswitch_mode=`` parameter. Repeated
+  ``devlink_eswitch_mode=`` parameters are not accumulated.
+* The ``devlink_eswitch_mode=`` value is limited by the kernel command line
+  size.
+* Whitespace is not allowed within the parameter value.
+* ``<selector>`` must be either ``*`` or a handle list. ``*`` cannot be
+  combined with explicit handles.
+* ``<bus-name>`` and ``<dev-name>`` must not be empty.
+* ``<bus-name>`` must not contain ``:``.
+* ``<dev-name>`` may contain ``:``. This allows PCI names such as
+  ``0000:08:00.0``.
+* Handles must not contain whitespace, ``[``, ``]``, ``*`` or more than one
+  ``/``.
+* A comma inside ``[]`` separates handles.
+* Comma-separated default groups are not supported.
+* Duplicate handles are rejected and the devlink eswitch mode default is
+  ignored.
+
+The eswitch mode default corresponds to the userspace command::
+
+  devlink dev eswitch set <handle> mode <value>
+
+
+Examples
+========
+
+Set all devlink instances to switchdev mode::
+
+  devlink_eswitch_mode=[*]:switchdev
+
+Set one PCI devlink instance to switchdev mode::
+
+  devlink_eswitch_mode=[pci/0000:08:00.0]:switchdev
+
+Set two PCI devlink instances to legacy mode::
+
+  devlink_eswitch_mode=[pci/0000:08:00.0,pci/0000:09:00.1]:legacy
+
+The following is invalid because comma-separated default groups are not
+supported::
+
+  devlink_eswitch_mode=[pci/0000:08:00.0]:switchdev,[pci/0000:09:00.0]:switchdev_inactive
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index f7ba7dcf477d..0d27a7008b14 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -56,6 +56,7 @@ general.
    :maxdepth: 1
 
    devlink-dpipe
+   devlink-defaults
    devlink-eswitch-attr
    devlink-flash
    devlink-health
diff --git a/include/net/devlink.h b/include/net/devlink.h
index bcd31de1f890..98885f7c6c10 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1622,6 +1622,7 @@ int devl_trylock(struct devlink *devlink);
 void devl_unlock(struct devlink *devlink);
 void devl_assert_locked(struct devlink *devlink);
 bool devl_lock_is_held(struct devlink *devlink);
+int devl_apply_default_esw_mode(struct devlink *devlink);
 DEFINE_GUARD(devl, struct devlink *, devl_lock(_T), devl_unlock(_T));
 
 struct ib_device;
diff --git a/net/devlink/core.c b/net/devlink/core.c
index eeb6a71f5f56..4bc1734878d1 100644
--- a/net/devlink/core.c
+++ b/net/devlink/core.c
@@ -4,6 +4,10 @@
  * Copyright (c) 2016 Jiri Pirko <jiri@mellanox.com>
  */
 
+#include <linux/init.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/string.h>
 #include <net/genetlink.h>
 #define CREATE_TRACE_POINTS
 #include <trace/events/devlink.h>
@@ -16,6 +20,233 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(devlink_trap_report);
 
 DEFINE_XARRAY_FLAGS(devlinks, XA_FLAGS_ALLOC);
 
+static char *devlink_default_esw_mode_param;
+static bool devlink_default_esw_mode_match_all;
+static enum devlink_eswitch_mode devlink_default_esw_mode;
+static LIST_HEAD(devlink_default_esw_mode_nodes);
+
+struct devlink_default_esw_mode_node {
+	struct list_head list;
+	char *bus_name;
+	char *dev_name;
+};
+
+static int __init
+devlink_default_esw_mode_to_value(const char *str,
+				  enum devlink_eswitch_mode *mode)
+{
+	if (!strcmp(str, "legacy")) {
+		*mode = DEVLINK_ESWITCH_MODE_LEGACY;
+		return 0;
+	}
+	if (!strcmp(str, "switchdev")) {
+		*mode = DEVLINK_ESWITCH_MODE_SWITCHDEV;
+		return 0;
+	}
+	if (!strcmp(str, "switchdev_inactive")) {
+		*mode = DEVLINK_ESWITCH_MODE_SWITCHDEV_INACTIVE;
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
+static int devlink_default_esw_mode_apply(struct devlink *devlink)
+{
+	const struct devlink_ops *ops = devlink->ops;
+
+	if (!ops->eswitch_mode_set)
+		return -EOPNOTSUPP;
+
+	return ops->eswitch_mode_set(devlink, devlink_default_esw_mode,
+				     NULL);
+}
+
+static int __init
+devlink_default_esw_mode_handle_parse(char *handle, char **bus_name,
+				      char **dev_name)
+{
+	char *slash;
+	char *p;
+
+	if (!handle || !*handle)
+		return -EINVAL;
+
+	for (p = handle; *p; p++) {
+		if (*p == '[' || *p == ']' || *p == '*')
+			return -EINVAL;
+	}
+
+	slash = strchr(handle, '/');
+	if (!slash || slash == handle || !slash[1])
+		return -EINVAL;
+	if (strchr(slash + 1, '/'))
+		return -EINVAL;
+
+	*slash = '\0';
+	if (strchr(handle, ':'))
+		return -EINVAL;
+
+	*bus_name = handle;
+	*dev_name = slash + 1;
+	return 0;
+}
+
+static struct devlink_default_esw_mode_node *
+devlink_default_esw_mode_node_find(const char *bus_name, const char *dev_name)
+{
+	struct devlink_default_esw_mode_node *node;
+
+	list_for_each_entry(node, &devlink_default_esw_mode_nodes, list) {
+		if (!strcmp(node->bus_name, bus_name) &&
+		    !strcmp(node->dev_name, dev_name))
+			return node;
+	}
+
+	return NULL;
+}
+
+static int __init
+devlink_default_esw_mode_node_add(const char *bus_name, const char *dev_name)
+{
+	struct devlink_default_esw_mode_node *node;
+
+	if (devlink_default_esw_mode_node_find(bus_name, dev_name))
+		return -EEXIST;
+
+	node = kzalloc_obj(*node);
+	if (!node)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&node->list);
+	node->bus_name = kstrdup(bus_name, GFP_KERNEL);
+	node->dev_name = kstrdup(dev_name, GFP_KERNEL);
+	if (!node->bus_name || !node->dev_name) {
+		kfree(node->bus_name);
+		kfree(node->dev_name);
+		kfree(node);
+		return -ENOMEM;
+	}
+
+	list_add_tail(&node->list, &devlink_default_esw_mode_nodes);
+	return 0;
+}
+
+static int __init devlink_default_esw_mode_handles_parse(char *handles)
+{
+	char *handle;
+	int err;
+
+	if (!strcmp(handles, "*")) {
+		devlink_default_esw_mode_match_all = true;
+		return 0;
+	}
+
+	while ((handle = strsep(&handles, ",")) != NULL) {
+		char *bus_name;
+		char *dev_name;
+
+		err = devlink_default_esw_mode_handle_parse(handle, &bus_name,
+							    &dev_name);
+		if (err)
+			return err;
+
+		err = devlink_default_esw_mode_node_add(bus_name, dev_name);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static void __init
+devlink_default_esw_mode_node_free(struct devlink_default_esw_mode_node *node)
+{
+	kfree(node->bus_name);
+	kfree(node->dev_name);
+	kfree(node);
+}
+
+static void __init devlink_default_esw_mode_nodes_clear(void)
+{
+	struct devlink_default_esw_mode_node *node;
+	struct devlink_default_esw_mode_node *node_tmp;
+
+	list_for_each_entry_safe(node, node_tmp,
+				 &devlink_default_esw_mode_nodes, list) {
+		list_del(&node->list);
+		devlink_default_esw_mode_node_free(node);
+	}
+
+	devlink_default_esw_mode_match_all = false;
+}
+
+static int __init devlink_default_esw_mode_parse(char *str)
+{
+	char *handles_end;
+	char *handles;
+	char *mode;
+	int err;
+
+	if (!str || *str != '[')
+		return -EINVAL;
+
+	handles = str + 1;
+	handles_end = strchr(handles, ']');
+	if (!handles_end || handles_end[1] != ':' || !handles_end[2])
+		return -EINVAL;
+
+	*handles_end = '\0';
+	mode = handles_end + 2;
+	if (!*handles)
+		return -EINVAL;
+
+	err = devlink_default_esw_mode_to_value(mode,
+						&devlink_default_esw_mode);
+	if (err)
+		return err;
+
+	err = devlink_default_esw_mode_handles_parse(handles);
+	if (err)
+		devlink_default_esw_mode_nodes_clear();
+
+	return err;
+}
+
+/**
+ * devl_apply_default_esw_mode - Apply default eswitch mode to devlink instance
+ * @devlink: devlink
+ *
+ * The caller must hold the devlink instance lock.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int devl_apply_default_esw_mode(struct devlink *devlink)
+{
+	const char *bus_name = devlink_bus_name(devlink);
+	const char *dev_name = devlink_dev_name(devlink);
+	struct devlink_default_esw_mode_node *node;
+
+	devl_assert_locked(devlink);
+
+	if (devlink_default_esw_mode_match_all)
+		return devlink_default_esw_mode_apply(devlink);
+
+	node = devlink_default_esw_mode_node_find(bus_name, dev_name);
+	if (node)
+		return devlink_default_esw_mode_apply(devlink);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devl_apply_default_esw_mode);
+
+static int __init devlink_default_esw_mode_setup(char *str)
+{
+	devlink_default_esw_mode_param = str;
+	return 1;
+}
+__setup("devlink_eswitch_mode=", devlink_default_esw_mode_setup);
+
 static struct devlink *devlinks_xa_get(unsigned long index)
 {
 	struct devlink *devlink;
@@ -578,6 +809,27 @@ static int __init devlink_init(void)
 {
 	int err;
 
+	if (devlink_default_esw_mode_param) {
+		char *def;
+
+		def = kstrdup(devlink_default_esw_mode_param, GFP_KERNEL);
+		if (!def) {
+			err = -ENOMEM;
+			goto out;
+		}
+		err = devlink_default_esw_mode_parse(def);
+		kfree(def);
+		if (err == -EEXIST) {
+			devlink_default_esw_mode_param = NULL;
+			pr_warn("devlink: duplicate eswitch mode handles ignored\n");
+		} else if (err == -EINVAL) {
+			devlink_default_esw_mode_param = NULL;
+			pr_warn("devlink: invalid devlink_eswitch_mode parameter ignored\n");
+		} else if (err) {
+			goto out;
+		}
+	}
+
 	err = register_pernet_subsys(&devlink_pernet_ops);
 	if (err)
 		goto out;
@@ -593,7 +845,10 @@ static int __init devlink_init(void)
 out_unreg_pernet_subsys:
 	unregister_pernet_subsys(&devlink_pernet_ops);
 out:
+	if (err)
+		devlink_default_esw_mode_nodes_clear();
 	WARN_ON(err);
+
 	return err;
 }
 
-- 
2.44.0



* [PATCH net-next 3/3] net/mlx5: Apply devlink default eswitch mode during init
From: Tariq Toukan @ 2026-05-21  7:24 UTC (permalink / raw)
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Jonathan Corbet, Shuah Khan, Jiri Pirko, Simon Horman,
	Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch,
	Borislav Petkov (AMD), Andrew Morton, Randy Dunlap,
	Thomas Gleixner, Petr Mladek, Peter Zijlstra (Intel), Tejun Heo,
	Vlastimil Babka, Feng Tang, Christian Brauner, Dave Hansen,
	Dapeng Mi, Kees Cook, Marco Elver, Li RongQing, Eric Biggers,
	Paul E. McKenney, linux-doc, linux-kernel, netdev, linux-rdma,
	Gal Pressman, Dragos Tatulea, Jiri Pirko, Shay Drori,
	Moshe Shemesh
In-Reply-To: <20260521072434.362624-1-tariqt@nvidia.com>

From: Mark Bloch <mbloch@nvidia.com>

Apply devlink default eswitch mode for mlx5 devices after successful
device initialization while holding the devlink instance lock.

At this point the devlink instance is registered and the mlx5 devlink
operations are available, so the default eswitch mode can be applied to
the matching PCI devlink handle.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 0c6e4efe38c8..4528097f3d84 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1391,6 +1391,21 @@ static void mlx5_unload(struct mlx5_core_dev *dev)
 	mlx5_free_bfreg(dev, &dev->priv.bfreg);
 }
 
+static void mlx5_devl_apply_default_esw_mode(struct mlx5_core_dev *dev)
+{
+	struct devlink *devlink = priv_to_devlink(dev);
+	int err;
+
+	if (!MLX5_ESWITCH_MANAGER(dev))
+		return;
+
+	devl_assert_locked(devlink);
+	err = devl_apply_default_esw_mode(devlink);
+	if (err)
+		mlx5_core_warn(dev, "Couldn't apply default eswitch mode, err %d\n",
+			       err);
+}
+
 int mlx5_init_one_devl_locked(struct mlx5_core_dev *dev)
 {
 	bool light_probe = mlx5_dev_is_lightweight(dev);
@@ -1437,6 +1452,7 @@ int mlx5_init_one_devl_locked(struct mlx5_core_dev *dev)
 		mlx5_core_err(dev, "mlx5_hwmon_dev_register failed with error code %d\n", err);
 
 	mutex_unlock(&dev->intf_state_mutex);
+	mlx5_devl_apply_default_esw_mode(dev);
 	return 0;
 
 err_register:
@@ -1538,6 +1554,7 @@ int mlx5_load_one_devl_locked(struct mlx5_core_dev *dev, bool recovery)
 		goto err_attach;
 
 	mutex_unlock(&dev->intf_state_mutex);
+	mlx5_devl_apply_default_esw_mode(dev);
 	return 0;
 
 err_attach:
-- 
2.44.0



