* [PATCH v6 10/10] rust: module: update MAINTAINERS to cover module.rs
From: Alvin Sun @ 2026-06-24 15:00 UTC
To: Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross,
Danilo Krummrich, Luis Chamberlain, Petr Pavlu, Daniel Gomez,
Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
Jens Axboe, Dave Ertman, Leon Romanovsky, Igor Korotin,
FUJITA Tomonori, Bjorn Helgaas, Krzysztof Wilczyński,
Arve Hjønnevåg, Todd Kjos, Christian Brauner,
Carlos Llamas
Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
linux-kselftest, kunit-dev, linux-block, linux-kernel, netdev,
linux-pci, Alvin Sun
In-Reply-To: <20260624-fix-fops-owner-v6-0-5295e333cb3e@linux.dev>
Module types now live in `rust/kernel/module.rs` alongside
`rust/kernel/module_param.rs`. Update the MODULE SUPPORT file pattern
from `rust/kernel/module_param.rs` to `rust/kernel/module*.rs` so both
files are covered.
Assisted-by: opencode:glm-5.2
Link: https://lore.kernel.org/rust-for-linux/8ea21b29-9baf-4926-a16f-7d21c5a1a1b8@suse.com
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
MAINTAINERS | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index e035a3be797c4..74733de3e41ee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17984,7 +17984,7 @@ F: include/linux/module*.h
F: kernel/module/
F: lib/test_kmod.c
F: lib/tests/module/
-F: rust/kernel/module_param.rs
+F: rust/kernel/module*.rs
F: rust/macros/module.rs
F: scripts/module*
F: tools/testing/selftests/kmod/
--
2.43.0
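To illustrate why the widened `F:` entry covers both files, here is a minimal sketch of a wildcard matcher. It assumes that `*` matches any run of characters within a single path component (it does not cross `/`). `glob_match` is a hypothetical helper written for this note, not the logic used by the kernel's tooling.

```rust
/// Returns true if `path` matches `pattern`.
///
/// Assumption for this sketch: `*` matches zero or more characters other
/// than `/`. Every other byte must match literally.
pub fn glob_match(pattern: &str, path: &str) -> bool {
    fn go(p: &[u8], s: &[u8]) -> bool {
        match p.split_first() {
            // Pattern exhausted: match only if the path is exhausted too.
            None => s.is_empty(),
            Some((b'*', rest)) => {
                // Let `*` consume 0, 1, 2, ... bytes, stopping at `/`.
                let mut i = 0;
                loop {
                    if go(rest, &s[i..]) {
                        return true;
                    }
                    if i == s.len() || s[i] == b'/' {
                        return false;
                    }
                    i += 1;
                }
            }
            // Literal byte: must equal the next byte of the path.
            Some((c, rest)) => match s.split_first() {
                Some((d, tail)) if c == d => go(rest, tail),
                _ => false,
            },
        }
    }
    go(pattern.as_bytes(), path.as_bytes())
}
```

Under that assumption, the old entry matches only `rust/kernel/module_param.rs`, while `rust/kernel/module*.rs` matches both `module.rs` and `module_param.rs`.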
* [PATCH v6 09/10] rust: macros: remove `THIS_MODULE` static from `module!`
From: Alvin Sun @ 2026-06-24 15:00 UTC
In-Reply-To: <20260624-fix-fops-owner-v6-0-5295e333cb3e@linux.dev>
All users have been migrated to the `ModuleMetadata::THIS_MODULE` const
or the `this_module::<LocalModule>()` helper. The `static THIS_MODULE`
generated by the `module!` macro is no longer referenced anywhere,
so remove it to avoid having two sources of the same `ThisModule`
pointer.
Assisted-by: opencode:glm-5.2
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
rust/macros/module.rs | 16 ----------------
1 file changed, 16 deletions(-)
diff --git a/rust/macros/module.rs b/rust/macros/module.rs
index aa9a618d5d19e..23b6a1b456b80 100644
--- a/rust/macros/module.rs
+++ b/rust/macros/module.rs
@@ -497,22 +497,6 @@ pub(crate) fn module(info: ModuleInfo) -> Result<TokenStream> {
/// Used by the printing macros, e.g. [`info!`].
const __LOG_PREFIX: &[u8] = #name_cstr.to_bytes_with_nul();
- // SAFETY: `__this_module` is constructed by the kernel at load time and will not be
- // freed until the module is unloaded.
- #[cfg(MODULE)]
- static THIS_MODULE: ::kernel::ThisModule = unsafe {
- extern "C" {
- static __this_module: ::kernel::types::Opaque<::kernel::bindings::module>;
- };
-
- ::kernel::ThisModule::from_ptr(__this_module.get())
- };
-
- #[cfg(not(MODULE))]
- static THIS_MODULE: ::kernel::ThisModule = unsafe {
- ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
- };
-
/// The `LocalModule` type is the type of the module created by `module!`,
/// `module_pci_driver!`, `module_platform_driver!`, etc.
type LocalModule = #type_;
--
2.43.0
* [PATCH v6 06/10] rust: miscdevice: set fops.owner from driver module pointer
From: Alvin Sun @ 2026-06-24 15:00 UTC
In-Reply-To: <20260624-fix-fops-owner-v6-0-5295e333cb3e@linux.dev>
Set the miscdevice fops owner field from the driver module pointer
via the `this_module::<T::OwnerModule>()` helper, instead of
defaulting to null.
Assisted-by: opencode:glm-5.2
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
rust/kernel/miscdevice.rs | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/rust/kernel/miscdevice.rs b/rust/kernel/miscdevice.rs
index 83ce50def5ac9..2a4329f98614e 100644
--- a/rust/kernel/miscdevice.rs
+++ b/rust/kernel/miscdevice.rs
@@ -24,12 +24,13 @@
IovIterSource, //
},
mm::virt::VmaNew,
+ module::this_module,
prelude::*,
seq_file::SeqFile,
types::{
ForeignOwnable,
Opaque, //
- },
+ }, //
};
use core::marker::PhantomData;
@@ -430,6 +431,7 @@ impl<T: MiscDevice> MiscdeviceVTable<T> {
} else {
None
},
+ owner: this_module::<T::OwnerModule>().as_ptr(),
..pin_init::zeroed()
};
--
2.43.0
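The shape of the miscdevice change can be modelled in userspace. This is a rough sketch, not the kernel code: `FileOperations`, `ZEROED`, `THIS_MODULE_ADDR`, `DemoModule` and `DemoDevice` are stand-ins invented here, and a plain `usize` plays the role of the module pointer. It shows a per-driver vtable const whose `owner` field comes from the driver's `OwnerModule`, with every other unset field left zeroed via struct update syntax.

```rust
use std::marker::PhantomData;

/// Stand-in for the module metadata trait; the address models the pointer.
pub trait ModuleMetadata {
    const THIS_MODULE_ADDR: usize;
}

/// Stand-in for `bindings::file_operations` (only three fields modelled).
pub struct FileOperations {
    pub owner: usize,
    pub open: Option<fn() -> i32>,
    pub llseek: Option<fn() -> i32>,
}

impl FileOperations {
    /// Models an all-zero struct: owner 0 (null), no callbacks.
    pub const ZEROED: Self = FileOperations { owner: 0, open: None, llseek: None };
}

/// Stand-in for the driver-facing trait.
pub trait MiscDevice {
    type OwnerModule: ModuleMetadata;
    fn open() -> i32;
}

pub struct MiscdeviceVTable<T: MiscDevice>(PhantomData<T>);

impl<T: MiscDevice> MiscdeviceVTable<T> {
    /// Trampoline that forwards to the driver's implementation.
    fn open() -> i32 {
        T::open()
    }

    /// Built at compile time, once per driver type.
    pub const VTABLE: FileOperations = FileOperations {
        open: Some(Self::open),
        // The point of the change: the owner is no longer left at zero.
        owner: <T::OwnerModule as ModuleMetadata>::THIS_MODULE_ADDR,
        ..FileOperations::ZEROED
    };
}

/// Hypothetical module and driver used to exercise the sketch.
pub struct DemoModule;
impl ModuleMetadata for DemoModule {
    const THIS_MODULE_ADDR: usize = 0x3000;
}

pub struct DemoDevice;
impl MiscDevice for DemoDevice {
    type OwnerModule = DemoModule;
    fn open() -> i32 {
        7
    }
}
```

In this model `MiscdeviceVTable::<DemoDevice>::VTABLE.owner` carries the module address chosen by the driver, while `llseek` stays unset.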
* [PATCH v6 07/10] rust: configfs: use `LocalModule` for `THIS_MODULE`
From: Alvin Sun @ 2026-06-24 15:00 UTC
In-Reply-To: <20260624-fix-fops-owner-v6-0-5295e333cb3e@linux.dev>
Replace the `THIS_MODULE` static reference in the `configfs_attrs!`
macro with `this_module::<LocalModule>()`, and update
rnull to import `LocalModule` instead of `THIS_MODULE`, consistent
with the move of `THIS_MODULE` into the `ModuleMetadata` trait.
Assisted-by: opencode:glm-5.2
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
drivers/block/rnull/configfs.rs | 6 ++----
rust/kernel/configfs.rs | 8 +++++---
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/block/rnull/configfs.rs b/drivers/block/rnull/configfs.rs
index c10a55fc58948..b2547ad1e5ddd 100644
--- a/drivers/block/rnull/configfs.rs
+++ b/drivers/block/rnull/configfs.rs
@@ -1,9 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
-use super::{
- NullBlkDevice,
- THIS_MODULE, //
-};
+use super::NullBlkDevice;
+use crate::LocalModule;
use kernel::{
block::mq::gen_disk::{
GenDisk,
diff --git a/rust/kernel/configfs.rs b/rust/kernel/configfs.rs
index 2339c6467325d..c31d7882e216d 100644
--- a/rust/kernel/configfs.rs
+++ b/rust/kernel/configfs.rs
@@ -875,7 +875,7 @@ fn as_ptr(&self) -> *const bindings::config_item_type {
/// configfs::Subsystem<Configuration>,
/// Configuration
/// >::new_with_child_ctor::<N,Child>(
-/// &THIS_MODULE,
+/// ::kernel::module::this_module::<crate::LocalModule>(),
/// &CONFIGURATION_ATTRS
/// );
///
@@ -1021,7 +1021,8 @@ macro_rules! configfs_attrs {
static [< $data:upper _TPE >] : $crate::configfs::ItemType<$container, $data> =
$crate::configfs::ItemType::<$container, $data>::new::<N>(
- &THIS_MODULE, &[<$ data:upper _ATTRS >]
+ $crate::module::this_module::<LocalModule>(),
+ &[<$ data:upper _ATTRS >]
);
)?
@@ -1030,7 +1031,8 @@ macro_rules! configfs_attrs {
$crate::configfs::ItemType<$container, $data> =
$crate::configfs::ItemType::<$container, $data>::
new_with_child_ctor::<N, $child>(
- &THIS_MODULE, &[<$ data:upper _ATTRS >]
+ $crate::module::this_module::<LocalModule>(),
+ &[<$ data:upper _ATTRS >]
);
)?
--
2.43.0
* [PATCH v6 08/10] rust: binder: use `LocalModule` for `THIS_MODULE`
From: Alvin Sun @ 2026-06-24 15:00 UTC
In-Reply-To: <20260624-fix-fops-owner-v6-0-5295e333cb3e@linux.dev>
Replace the `THIS_MODULE` static reference in the binder fops with
`this_module::<LocalModule>()`, consistent with the move of
`THIS_MODULE` into the `ModuleMetadata` trait.
Assisted-by: opencode:glm-5.2
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
drivers/android/binder/rust_binder_main.rs | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/android/binder/rust_binder_main.rs b/drivers/android/binder/rust_binder_main.rs
index dc1941cd2407b..d6ceebbd5f94e 100644
--- a/drivers/android/binder/rust_binder_main.rs
+++ b/drivers/android/binder/rust_binder_main.rs
@@ -17,6 +17,7 @@
bindings::{self, seq_file},
fs::File,
list::{ListArc, ListArcSafe, ListLinksSelfPtr, TryNewListArc},
+ module::this_module,
prelude::*,
seq_file::SeqFile,
seq_print,
@@ -318,7 +319,7 @@ unsafe impl<T> Sync for AssertSync<T> {}
let zeroed_ops = unsafe { core::mem::MaybeUninit::zeroed().assume_init() };
let ops = kernel::bindings::file_operations {
- owner: THIS_MODULE.as_ptr(),
+ owner: this_module::<LocalModule>().as_ptr(),
poll: Some(rust_binder_poll),
unlocked_ioctl: Some(rust_binder_ioctl),
compat_ioctl: bindings::compat_ptr_ioctl,
--
2.43.0
* [PATCH v6 05/10] rust: drm: set fops.owner from driver module pointer
From: Alvin Sun @ 2026-06-24 15:00 UTC
In-Reply-To: <20260624-fix-fops-owner-v6-0-5295e333cb3e@linux.dev>
Change `create_fops()` to accept an owner module pointer instead of
hardcoding `null_mut()`, ensuring the kernel correctly tracks the
module owning the DRM device's file operations.
Assisted-by: opencode:glm-5.2
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
rust/kernel/drm/device.rs | 3 ++-
rust/kernel/drm/gem/mod.rs | 4 ++--
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/rust/kernel/drm/device.rs b/rust/kernel/drm/device.rs
index 403fc35353c74..d92cacb665366 100644
--- a/rust/kernel/drm/device.rs
+++ b/rust/kernel/drm/device.rs
@@ -111,7 +111,8 @@ impl<T: drm::Driver> Device<T> {
fops: &Self::GEM_FOPS,
};
- const GEM_FOPS: bindings::file_operations = drm::gem::create_fops();
+ const GEM_FOPS: bindings::file_operations =
+ drm::gem::create_fops(crate::module::this_module::<T::OwnerModule>().as_ptr());
/// Create a new `drm::Device` for a `drm::Driver`.
pub fn new(dev: &device::Device, data: impl PinInit<T::Data, Error>) -> Result<ARef<Self>> {
diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs
index 01b5bd47a3332..9a203efc59116 100644
--- a/rust/kernel/drm/gem/mod.rs
+++ b/rust/kernel/drm/gem/mod.rs
@@ -357,10 +357,10 @@ impl<T: DriverObject> AllocImpl for Object<T> {
};
}
-pub(super) const fn create_fops() -> bindings::file_operations {
+pub(super) const fn create_fops(owner: *mut bindings::module) -> bindings::file_operations {
let mut fops: bindings::file_operations = pin_init::zeroed();
- fops.owner = core::ptr::null_mut();
+ fops.owner = owner;
fops.open = Some(bindings::drm_open);
fops.release = Some(bindings::drm_release);
fops.unlocked_ioctl = Some(bindings::drm_ioctl);
--
2.43.0
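The `create_fops()` change above can be sketched outside the kernel as follows. This is a userspace model under stated assumptions: `FileOperations`, `THIS_MODULE_ADDR`, `DemoModule` and `DemoDriver` are invented for illustration, and a `usize` stands in for `*mut bindings::module`. It shows a `const fn` that takes the owner as a parameter and a generic associated const that feeds it from the driver's `OwnerModule`.

```rust
use std::marker::PhantomData;

/// Stand-in for the module metadata trait; the address models the pointer.
pub trait ModuleMetadata {
    const THIS_MODULE_ADDR: usize;
}

/// Stand-in for `bindings::file_operations` (only three fields modelled).
#[derive(Clone, Copy)]
pub struct FileOperations {
    pub owner: usize,
    pub open: Option<fn() -> i32>,
    pub release: Option<fn() -> i32>,
}

// Placeholders for the C callbacks the real code points at.
fn drm_open() -> i32 {
    0
}
fn drm_release() -> i32 {
    0
}

/// Builds the table at compile time; the owner is now an argument
/// instead of a hardcoded null.
pub const fn create_fops(owner: usize) -> FileOperations {
    let mut fops = FileOperations { owner: 0, open: None, release: None };
    fops.owner = owner;
    fops.open = Some(drm_open);
    fops.release = Some(drm_release);
    fops
}

/// Stand-in for `drm::Driver`.
pub trait Driver {
    type OwnerModule: ModuleMetadata;
}

pub struct Device<T: Driver>(PhantomData<T>);

impl<T: Driver> Device<T> {
    /// One table per driver type, each carrying that driver's module.
    pub const GEM_FOPS: FileOperations =
        create_fops(<T::OwnerModule as ModuleMetadata>::THIS_MODULE_ADDR);
}

/// Hypothetical module and driver used to exercise the sketch.
pub struct DemoModule;
impl ModuleMetadata for DemoModule {
    const THIS_MODULE_ADDR: usize = 0x2000;
}

pub struct DemoDriver;
impl Driver for DemoDriver {
    type OwnerModule = DemoModule;
}
```

Passing the owner in keeps `create_fops()` free of any knowledge about which module it is built for; the generic `Device<T>` is the only place that names `T::OwnerModule`.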
* [PATCH v6 04/10] rust: macros: auto-insert OwnerModule in #[vtable]
From: Alvin Sun @ 2026-06-24 15:00 UTC
In-Reply-To: <20260624-fix-fops-owner-v6-0-5295e333cb3e@linux.dev>
Auto-add `type OwnerModule: ::kernel::ModuleMetadata;` as a required
associated type on the trait side if not already defined, and
auto-insert `type OwnerModule = crate::LocalModule;` on the impl side
if not explicitly provided, eliminating the need to manually declare
and implement `OwnerModule` in every vtable trait and impl.
Assisted-by: opencode:glm-5.2
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Suggested-by: Gary Guo <gary@garyguo.net>
Link: https://lore.kernel.org/all/DIMMWHUOLPSH.13JFRHDKDQJGO@garyguo.net
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
rust/macros/lib.rs | 6 ++++++
rust/macros/vtable.rs | 41 ++++++++++++++++++++++++++++++++++++-----
2 files changed, 42 insertions(+), 5 deletions(-)
diff --git a/rust/macros/lib.rs b/rust/macros/lib.rs
index 2cfd59e0f9e7c..bc7ded353c5ca 100644
--- a/rust/macros/lib.rs
+++ b/rust/macros/lib.rs
@@ -176,6 +176,12 @@ pub fn module(input: TokenStream) -> TokenStream {
///
/// This macro should not be used when all functions are required.
///
+/// Additionally, this macro automatically handles the `OwnerModule`
+/// associated type: on the trait side, `type OwnerModule: ModuleMetadata;`
+/// is added as a required associated type if not already defined; on the
+/// impl side, `type OwnerModule = LocalModule;` is automatically inserted
+/// if not explicitly defined.
+///
/// # Examples
///
/// ```
diff --git a/rust/macros/vtable.rs b/rust/macros/vtable.rs
index c6510b0c4ea1d..be9a5ed8abe5e 100644
--- a/rust/macros/vtable.rs
+++ b/rust/macros/vtable.rs
@@ -30,6 +30,22 @@ fn handle_trait(mut item: ItemTrait) -> Result<ItemTrait> {
const USE_VTABLE_ATTR: ();
});
+ // Add `type OwnerModule: ModuleMetadata` as a required associated type if
+ // the trait does not already define it.
+ if !item
+ .items
+ .iter()
+ .any(|i| matches!(i, TraitItem::Type(t) if t.ident == "OwnerModule"))
+ {
+ gen_items.push(parse_quote! {
+ /// The module implementing this vtable trait.
+ ///
+ /// Automatically set to `crate::LocalModule` by the `#[vtable]`
+ /// impl macro.
+ type OwnerModule: ::kernel::ModuleMetadata;
+ });
+ }
+
for item in &item.items {
if let TraitItem::Fn(fn_item) = item {
let name = &fn_item.sig.ident;
@@ -57,12 +73,18 @@ fn handle_trait(mut item: ItemTrait) -> Result<ItemTrait> {
fn handle_impl(mut item: ItemImpl) -> Result<ItemImpl> {
let mut gen_items = Vec::new();
- let mut defined_consts = HashSet::new();
+ let mut defined_items = HashSet::new();
- // Iterate over all user-defined constants to gather any possible explicit overrides.
+ // Iterate over all user-defined items to gather any possible explicit overrides.
for item in &item.items {
- if let ImplItem::Const(const_item) = item {
- defined_consts.insert(const_item.ident.clone());
+ match item {
+ ImplItem::Const(const_item) => {
+ defined_items.insert(const_item.ident.clone());
+ }
+ ImplItem::Type(type_item) => {
+ defined_items.insert(type_item.ident.clone());
+ }
+ _ => {}
}
}
@@ -70,6 +92,15 @@ fn handle_impl(mut item: ItemImpl) -> Result<ItemImpl> {
const USE_VTABLE_ATTR: () = ();
});
+ // Auto-insert `type OwnerModule = crate::LocalModule` if not explicitly defined.
+ // `crate::LocalModule` resolves to the real module type (via `module!`) or a
+ // dummy fallback in non-module contexts (e.g., doctests).
+ if !defined_items.contains(&parse_quote!(OwnerModule)) {
+ gen_items.push(parse_quote! {
+ type OwnerModule = crate::LocalModule;
+ });
+ }
+
for item in &item.items {
if let ImplItem::Fn(fn_item) = item {
let name = &fn_item.sig.ident;
@@ -78,7 +109,7 @@ fn handle_impl(mut item: ItemImpl) -> Result<ItemImpl> {
name.span(),
);
// Skip if it's declared already -- this allows user override.
- if defined_consts.contains(&gen_const_name) {
+ if defined_items.contains(&gen_const_name) {
continue;
}
let cfg_attrs = crate::helpers::gather_cfg_attrs(&fn_item.attrs);
--
2.43.0
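For readers unfamiliar with the macro, the following sketch shows roughly what a trait and impl look like after the auto-insertion described in this patch. It is written by hand in plain Rust, with no proc macro involved, and simplifies on purpose: `Operations`, `MyDevice`, `owner_name` and the `NAME` string are invented for illustration, and the real macro emits `crate::LocalModule` rather than a bare `LocalModule`.

```rust
/// Stand-in for `kernel::ModuleMetadata` (only the name is modelled).
pub trait ModuleMetadata {
    const NAME: &'static str;
}

/// Stand-in for the type that `module!` defines at the crate root.
pub struct LocalModule;
impl ModuleMetadata for LocalModule {
    const NAME: &'static str = "my_driver";
}

// The author writes the trait without `OwnerModule`; the trait-side
// expansion adds it as a required associated type.
pub trait Operations {
    /// Inserted by the macro on the trait side.
    type OwnerModule: ModuleMetadata;

    fn open(&self) -> i32;
}

pub struct MyDevice;

// The author writes the impl without `OwnerModule`; the impl-side
// expansion fills it in unless the impl already defines it.
impl Operations for MyDevice {
    /// Inserted by the macro on the impl side.
    type OwnerModule = LocalModule;

    fn open(&self) -> i32 {
        0
    }
}

/// Generic code can now reach the owning module through the vtable trait.
pub fn owner_name<T: Operations>() -> &'static str {
    <T::OwnerModule as ModuleMetadata>::NAME
}
```

Because the impl-side insertion is skipped when `OwnerModule` is already present, an impl that needs a different owner can still spell it out explicitly.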
* [PATCH v6 00/10] Fix missing fops.owner in Rust DRM/misc abstractions
From: Alvin Sun @ 2026-06-24 14:59 UTC
During tyr debugfs development, a kernel NULL pointer dereference was
encountered after `rmmod tyr` while gnome-shell still held /dev/card1 open:
```
[158827.868132] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[158827.868918] Mem abort info:
[158827.869177] ESR = 0x0000000086000004
[158827.869519] EC = 0x21: IABT (current EL), IL = 32 bits
[158827.870000] SET = 0, FnV = 0
[158827.870281] EA = 0, S1PTW = 0
[158827.870571] FSC = 0x04: level 0 translation fault
[158827.871043] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000108dec000
[158827.871623] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[158827.872242] Internal error: Oops: 0000000086000004 [#1] SMP
[158827.872246] Modules linked in: tyr sunrpc snd_soc_simple_card rk805_pwrkey snd_soc_simple_card_utils rtw88_8822bu display_connector rtw88_usb rtw88_8822b snd_soc_rockchip_i2s_tdm snd_soc_hdmi_codec
rtw88_core]
[158827.872337] CPU: 4 UID: 1000 PID: 11276 Comm: gnome-s:disk$0 Tainted: G N 7.1.0-rc1+ #331 PREEMPT
[158827.880534] Tainted: [N]=TEST
[158827.880535] Hardware name: FriendlyElec NanoPi R6C/NanoPi R6C, BIOS v1.1 04/09/2025
[158827.880538] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[158827.880542] pc : 0x0
[158827.880547] lr : _RNvNtCs257m05FHVbX_3tyr2vm8pt_unmap+0x8c/0x12c [tyr]
[158827.880578] sp : ffff800083c236b0
[158827.880579] x29: ffff800083c236d0 x28: ffff00013f8a0000 x27: 0000000000000000
[158827.880585] x26: 000000000000007c x25: ffff000108e6ed80 x24: 0000000000401000
[158827.880590] x23: 0000000000000000 x22: 0000000040000000 x21: 0000000000001000
[158827.880595] x20: ffff00010f778138 x19: 0000000000400000 x18: 00000000ffffffff
[158827.880600] x17: 000000040044ffff x16: 045000f2b5503510 x15: 0720072007200720
[158827.880606] x14: 0720072007200720 x13: 0000000000401000 x12: 0000000000400000
[158827.880611] x11: ffff800083c239d0 x10: ffff000141e4fd88 x9 : 0000000000000000
[158827.880615] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000400000
[158827.880620] x5 : ffff00013f8a0000 x4 : 0000000000000000 x3 : 0000000000000001
[158827.880625] x2 : 0000000000001000 x1 : 0000000000400000 x0 : ffff00010f778138
[158827.880630] Call trace:
[158827.880632] 0x0 (P)
[158827.880635] _RNvXs6_NtCs257m05FHVbX_3tyr2vmNtB5_9GpuVmDataNtNtNtCsgmSOfgXi5CZ_6kernel3drm5gpuvm11DriverGpuVm13sm_step_unmap+0x3c/0x120 [tyr]
[158827.891166] _RNvMs4_NtNtNtCsgmSOfgXi5CZ_6kernel3drm5gpuvm6sm_opsINtB7_5GpuVmNtNtCs257m05FHVbX_3tyr2vm9GpuVmDataE13sm_step_unmapB13_+0x18/0x34 [tyr]
[158827.891187] op_unmap_cb+0x78/0xb0
[158827.891196] __drm_gpuvm_sm_unmap+0x18c/0x1b4
[158827.891204] drm_gpuvm_sm_unmap+0x38/0x4c
[158827.891209] _RNvMs5_NtCs257m05FHVbX_3tyr2vmNtB5_2Vm7exec_op+0x1cc/0x254 [tyr]
[158827.894085] _RNvMs5_NtCs257m05FHVbX_3tyr2vmNtB5_2Vm11unmap_range+0x124/0x188 [tyr]
[158827.894105] _RINvNtCs5hGKnPbRUFW_4core3ptr13drop_in_placeNtNtCs257m05FHVbX_3tyr3gem8KernelBoEBK_+0x44/0xd8 [tyr]
[158827.894125] _RINvNtCs5hGKnPbRUFW_4core3ptr13drop_in_placeINtNtNtCsgmSOfgXi5CZ_6kernel5alloc4kvec3VecNtNtCs257m05FHVbX_3tyr2fw7SectionNtNtBL_9allocator7KmallocEEB1r_+0x3c/0x100 [tyr]
[158827.894147] _RINvNtCs5hGKnPbRUFW_4core3ptr13drop_in_placeINtNtNtCsgmSOfgXi5CZ_6kernel4sync3arc3ArcNtNtCs257m05FHVbX_3tyr2fw8FirmwareEEB1p_+0x94/0x190 [tyr]
[158827.894167] _RNvMs4_NtNtCsgmSOfgXi5CZ_6kernel3drm6deviceINtB5_6DeviceNtNtCs257m05FHVbX_3tyr6driver12TyrDrmDriverE7releaseBW_+0x30/0x98 [tyr]
[158827.899550] drm_dev_put.part.0+0x88/0xc0
[158827.899557] drm_minor_release+0x18/0x28
[158827.899562] drm_release+0x144/0x170
[158827.899567] __fput+0xe4/0x30c
[158827.899573] ____fput+0x14/0x20
[158827.899579] task_work_run+0x7c/0xe8
[158827.899586] do_exit+0x2a8/0xac4
[158827.899590] do_group_exit+0x34/0x90
[158827.899594] get_signal+0xaac/0xabc
[158827.899599] arch_do_signal_or_restart+0x90/0x3e8
[158827.899606] exit_to_user_mode_loop+0x140/0x1d0
[158827.899613] el0_svc+0x2f4/0x2f8
[158827.899620] el0t_64_sync_handler+0xa0/0xe4
[158827.899627] el0t_64_sync+0x198/0x19c
[158827.899632] ---[ end trace 0000000000000000 ]---
```
The root cause: `fops.owner` was `NULL` in Rust DRM drivers, so the kernel
never blocked module unloading while file descriptors were open. This leads to
a use-after-free when drm_release (or any other fop) runs after the module code
has been freed.
The series moves `THIS_MODULE` into the `ModuleMetadata` trait as a const,
threads it through `#[vtable]` to set `fops.owner` in DRM/miscdevice, and
updates configfs, rnull, and binder to use `this_module::<LocalModule>()`.
Assisted-by: opencode:glm-5.2
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
Changes in v6:
- Update MAINTAINERS to cover the new `rust/kernel/module.rs`.
- Link to v5: https://lore.kernel.org/r/20260624-fix-fops-owner-v5-0-aa1cba242f05@linux.dev
Changes in v5:
- Add `#[inline]` to the `this_module()` helper.
- Fix configfs doc comment to reference `crate::LocalModule` instead of
bare `LocalModule`.
- Link to v4: https://lore.kernel.org/r/20260623-fix-fops-owner-v4-0-0daf5f077d5c@linux.dev
Changes in v4:
- Move module-related types into a new `rust/kernel/module.rs`.
- Migrate binder from the `module!`-generated `THIS_MODULE` static to
`this_module::<LocalModule>()`.
- Reorganise the series so that every commit builds independently, and
drop the legacy `THIS_MODULE` static once all users are migrated.
- Link to v3: https://lore.kernel.org/r/20260622-fix-fops-owner-v3-0-49d45cb37032@linux.dev
Changes in v3:
- Renamed vtable associated type `ThisModule` to `OwnerModule`
- Added `this_module()` helper for ergonomic `THIS_MODULE` access
- Refined vtable macro implementation: one-liner detection and single `defined_items` set
- Reordered commits to place doctest fallback before vtable auto-insert
- Link to v2: https://lore.kernel.org/r/20260521-fix-fops-owner-v2-0-fd99079c5a04@linux.dev
Changes in v2:
- Merged old `static THIS_MODULE` and v1's `MODULE_PTR` into a single
`ModuleMetadata::THIS_MODULE` const
- `#[vtable]` macro now auto-inserts `type ThisModule`, removing all per-driver
manual patches from v1
- Added configfs & rnull usage site updates and doctest `LocalModule` fallback
- Link to v1: https://lore.kernel.org/r/20260519-fix-fops-owner-v1-0-2ded9830da14@linux.dev
---
Alvin Sun (10):
rust: module: move module types into `module.rs`
rust: module: add `THIS_MODULE` const to `ModuleMetadata` trait
rust: doctest: add LocalModule fallback for #[vtable] ThisModule
rust: macros: auto-insert OwnerModule in #[vtable]
rust: drm: set fops.owner from driver module pointer
rust: miscdevice: set fops.owner from driver module pointer
rust: configfs: use `LocalModule` for `THIS_MODULE`
rust: binder: use `LocalModule` for `THIS_MODULE`
rust: macros: remove `THIS_MODULE` static from `module!`
rust: module: update MAINTAINERS to cover module.rs
MAINTAINERS | 2 +-
drivers/android/binder/rust_binder_main.rs | 3 +-
drivers/block/rnull/configfs.rs | 6 +--
rust/kernel/auxiliary.rs | 2 +-
rust/kernel/configfs.rs | 8 +--
rust/kernel/drm/device.rs | 3 +-
rust/kernel/drm/gem/mod.rs | 4 +-
rust/kernel/i2c.rs | 2 +-
rust/kernel/lib.rs | 75 +++-------------------------
rust/kernel/miscdevice.rs | 4 +-
rust/kernel/module.rs | 80 ++++++++++++++++++++++++++++++
rust/kernel/net/phy.rs | 6 ++-
rust/kernel/pci.rs | 2 +-
rust/kernel/platform.rs | 2 +-
rust/kernel/usb.rs | 2 +-
rust/macros/lib.rs | 6 +++
rust/macros/module.rs | 34 ++++++-------
rust/macros/vtable.rs | 41 +++++++++++++--
scripts/rustdoc_test_gen.rs | 16 ++++++
19 files changed, 189 insertions(+), 109 deletions(-)
---
base-commit: b7e5ac83cb16f7ffd11dc23736f84276602100ed
change-id: 20260519-fix-fops-owner-e3a77bb27c6c
prerequisite-change-id: 20260519-miscdev-use-format-9ab7e83b1c11:v3
prerequisite-patch-id: 405b334ff0d48ad350014f05a2321bdbaa025400
prerequisite-patch-id: 604b631c81d5423f4ebb2e12ba2d22e9ce371bfc
prerequisite-patch-id: cb550d94cefe01920e0d3ced2b2bcbecd76f3907
prerequisite-patch-id: 3bc830839742591460cb86d9472c04f4686dc600
prerequisite-patch-id: 571058244bc8c7088638d2e3225713011246c7e9
prerequisite-patch-id: 347c5a3c6dbef9832bfce8419fc23e6e08ba477f
prerequisite-patch-id: 3e202d988b56b88446f7535e90d3f00cf5f15701
Best regards,
--
Alvin Sun <alvin.sun@linux.dev>
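The failure mode described in this cover letter can be illustrated with a toy userspace model. It is a deliberately simplified sketch, not the kernel's implementation: `Module`, `unload` and `unload_while_open` are invented here, and `try_module_get`/`module_put` only imitate the behaviour relied on in this series (a null owner is accepted and pins nothing).

```rust
use std::cell::Cell;

/// Toy model of a loadable module with a reference count.
pub struct Module {
    refcnt: Cell<u32>,
    live: Cell<bool>,
}

impl Module {
    pub fn new() -> Self {
        Module { refcnt: Cell::new(0), live: Cell::new(true) }
    }
}

/// Models taking a module reference when a file is opened.
/// A missing owner succeeds but pins nothing.
pub fn try_module_get(owner: Option<&Module>) -> bool {
    match owner {
        None => true,
        Some(m) if m.live.get() => {
            m.refcnt.set(m.refcnt.get() + 1);
            true
        }
        Some(_) => false,
    }
}

/// Models dropping the reference when the file is released.
pub fn module_put(owner: Option<&Module>) {
    if let Some(m) = owner {
        m.refcnt.set(m.refcnt.get() - 1);
    }
}

/// Models `rmmod`: refused while references are held.
pub fn unload(m: &Module) -> Result<(), &'static str> {
    if m.refcnt.get() > 0 {
        return Err("module is in use");
    }
    m.live.set(false);
    Ok(())
}

/// Opens a file, tries to unload the module while the file is still open,
/// then closes the file. Returns the unload result and whether the module
/// code is still live at the time the release callback would run.
pub fn unload_while_open(owner_set: bool) -> (Result<(), &'static str>, bool) {
    let module = Module::new();
    let owner = if owner_set { Some(&module) } else { None };

    assert!(try_module_get(owner)); // open()
    let result = unload(&module); // rmmod with the fd still open
    let live_at_release = module.live.get();
    module_put(owner); // close() -> release

    (result, live_at_release)
}
```

With `owner_set == false` the unload succeeds while the file is open, so the release path would run against module code that is already gone; with `owner_set == true` the unload is refused and the module stays live until the file is closed.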
* [PATCH v6 02/10] rust: module: add `THIS_MODULE` const to `ModuleMetadata` trait
From: Alvin Sun @ 2026-06-24 15:00 UTC
In-Reply-To: <20260624-fix-fops-owner-v6-0-5295e333cb3e@linux.dev>
Now that `const_refs_to_static` is stable as of the MSRV bump, a
`ThisModule` pointer can be used in const contexts.
Add a `THIS_MODULE` const to the `ModuleMetadata` trait so that modules
can provide their `ThisModule` pointer in const contexts such as static
`file_operations`.
Add a `this_module()` helper to retrieve the `THIS_MODULE` pointer of a
given module type, and update `__init` to use it instead of the
`THIS_MODULE` static generated by the `module!` macro.
The `static THIS_MODULE` generated by the `module!` macro is retained
for backwards compatibility with existing users and removed in a later
patch once all references have been migrated.
Assisted-by: opencode:glm-5.2
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
rust/kernel/module.rs | 9 +++++++++
rust/macros/module.rs | 18 +++++++++++++++++-
2 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/rust/kernel/module.rs b/rust/kernel/module.rs
index be242a82e86d2..d713705984477 100644
--- a/rust/kernel/module.rs
+++ b/rust/kernel/module.rs
@@ -42,6 +42,15 @@ fn init(module: &'static ThisModule) -> impl pin_init::PinInit<Self, crate::erro
pub trait ModuleMetadata {
/// The name of the module as specified in the `module!` macro.
const NAME: &'static crate::str::CStr;
+
+ /// The module's `THIS_MODULE` pointer.
+ const THIS_MODULE: ThisModule;
+}
+
+/// Returns a reference to the `THIS_MODULE` of the given module type.
+#[inline]
+pub const fn this_module<M: ModuleMetadata>() -> &'static ThisModule {
+ &M::THIS_MODULE
}
/// Equivalent to `THIS_MODULE` in the C API.
diff --git a/rust/macros/module.rs b/rust/macros/module.rs
index 06c18e2075083..aa9a618d5d19e 100644
--- a/rust/macros/module.rs
+++ b/rust/macros/module.rs
@@ -519,6 +519,22 @@ pub(crate) fn module(info: ModuleInfo) -> Result<TokenStream> {
impl ::kernel::ModuleMetadata for #type_ {
const NAME: &'static ::kernel::str::CStr = #name_cstr;
+
+ #[cfg(MODULE)]
+ const THIS_MODULE: ::kernel::ThisModule = {
+ extern "C" {
+ static __this_module: ::kernel::types::Opaque<::kernel::bindings::module>;
+ }
+
+ // SAFETY: `__this_module` is constructed by the kernel at load time
+ // and lives until the module is unloaded.
+ unsafe { ::kernel::ThisModule::from_ptr(__this_module.get()) }
+ };
+
+ #[cfg(not(MODULE))]
+ const THIS_MODULE: ::kernel::ThisModule = unsafe {
+ ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
+ };
}
// Double nested modules, since then nobody can access the public items inside.
@@ -616,7 +632,7 @@ pub extern "C" fn #ident_exit() {
/// This function must only be called once.
unsafe fn __init() -> ::kernel::ffi::c_int {
let initer = <super::super::LocalModule as ::kernel::InPlaceModule>::init(
- &super::super::THIS_MODULE
+ ::kernel::module::this_module::<super::super::LocalModule>()
);
// SAFETY: No data race, since `__MOD` can only be accessed by this module
// and there only `__init` and `__exit` access it. These functions are only
--
2.43.0
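The trait const and helper added by this patch can be tried out in a small userspace model. This is a sketch under stated assumptions: `ThisModule` here wraps a `usize` instead of a `*mut bindings::module`, `from_addr`/`as_addr` are hypothetical counterparts of `from_ptr`/`as_ptr`, and `DemoModule`, `BuiltinModule` and `DEMO_OWNER` exist only for illustration.

```rust
/// Userspace stand-in for `kernel::ThisModule`; the address models the
/// module pointer, with 0 playing the role of null.
#[derive(Debug, PartialEq, Eq)]
pub struct ThisModule(usize);

impl ThisModule {
    /// Hypothetical counterpart of `ThisModule::from_ptr`.
    pub const fn from_addr(addr: usize) -> Self {
        ThisModule(addr)
    }

    /// Hypothetical counterpart of `ThisModule::as_ptr`.
    pub const fn as_addr(&self) -> usize {
        self.0
    }
}

/// Model of the trait after this patch: name plus module handle.
pub trait ModuleMetadata {
    const NAME: &'static str;
    const THIS_MODULE: ThisModule;
}

/// Same shape as the helper in the patch: a reference to the const,
/// obtainable in const contexts.
#[inline]
pub const fn this_module<M: ModuleMetadata>() -> &'static ThisModule {
    &M::THIS_MODULE
}

/// Models a loadable module (non-null handle).
pub struct DemoModule;
impl ModuleMetadata for DemoModule {
    const NAME: &'static str = "demo";
    const THIS_MODULE: ThisModule = ThisModule::from_addr(0x1000);
}

/// Models the `#[cfg(not(MODULE))]` case (null handle).
pub struct BuiltinModule;
impl ModuleMetadata for BuiltinModule {
    const NAME: &'static str = "builtin";
    const THIS_MODULE: ThisModule = ThisModule::from_addr(0);
}

/// The helper is usable in a const initializer, which is what static
/// tables such as `file_operations` need.
pub const DEMO_OWNER: usize = this_module::<DemoModule>().as_addr();
```

Keying the lookup on the module type rather than on a free-standing static is what lets generic abstractions name "the module of whoever implements this trait" without importing anything from the driver crate.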
* [PATCH v6 03/10] rust: doctest: add LocalModule fallback for #[vtable] ThisModule
From: Alvin Sun @ 2026-06-24 15:00 UTC
In-Reply-To: <20260624-fix-fops-owner-v6-0-5295e333cb3e@linux.dev>
Add a `LocalModule` struct with a null-pointer `ModuleMetadata` impl
in the doctest harness, so that `crate::LocalModule` (auto-inserted
by `#[vtable]`) resolves correctly when there is no `module!` macro.
Assisted-by: opencode:glm-5.2
Reviewed-by: Andreas Hindborg <a.hindborg@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
scripts/rustdoc_test_gen.rs | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/scripts/rustdoc_test_gen.rs b/scripts/rustdoc_test_gen.rs
index ee76e96b41eea..198af4e446c8c 100644
--- a/scripts/rustdoc_test_gen.rs
+++ b/scripts/rustdoc_test_gen.rs
@@ -239,6 +239,22 @@ macro_rules! assert_eq {{
const __LOG_PREFIX: &[u8] = b"rust_doctests_kernel\0";
+/// Dummy module type for doctest context.
+struct LocalModule;
+
+use kernel::{{
+ str::CStr,
+ ModuleMetadata,
+ ThisModule, //
+}};
+use core::ptr::null_mut;
+
+impl ModuleMetadata for LocalModule {{
+ const NAME: &'static CStr = c"rust_doctests_kernel";
+ // SAFETY: `try_module_get`/`module_put` handle null module pointers gracefully.
+ const THIS_MODULE: ThisModule = unsafe {{ ThisModule::from_ptr(null_mut()) }};
+}}
+
{rust_tests}
"#
)
--
2.43.0
* [PATCH v6 01/10] rust: module: move module types into `module.rs`
From: Alvin Sun @ 2026-06-24 15:00 UTC (permalink / raw)
To: Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross,
Danilo Krummrich, Luis Chamberlain, Petr Pavlu, Daniel Gomez,
Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
Jens Axboe, Dave Ertman, Leon Romanovsky, Igor Korotin,
FUJITA Tomonori, Bjorn Helgaas, Krzysztof Wilczyński,
Arve Hjønnevåg, Todd Kjos, Christian Brauner,
Carlos Llamas
Cc: rust-for-linux, linux-modules, driver-core, dri-devel, nova-gpu,
linux-kselftest, kunit-dev, linux-block, linux-kernel, netdev,
linux-pci, Alvin Sun
In-Reply-To: <20260624-fix-fops-owner-v6-0-5295e333cb3e@linux.dev>
Move `Module`, `InPlaceModule`, `ModuleMetadata` and `ThisModule` from
`lib.rs` into a new `rust/kernel/module.rs`. Re-export them from `lib.rs`
to avoid tree-wide changes.
Switch six bus driver registrations from `module.0` to the public
`ThisModule::as_ptr()` accessor, since the field is no longer visible
outside the new `module` submodule.
No functional change.
Assisted-by: opencode:glm-5.2
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Alvin Sun <alvin.sun@linux.dev>
---
rust/kernel/auxiliary.rs | 2 +-
rust/kernel/i2c.rs | 2 +-
rust/kernel/lib.rs | 75 +++++-------------------------------------------
rust/kernel/module.rs | 71 +++++++++++++++++++++++++++++++++++++++++++++
rust/kernel/net/phy.rs | 6 +++-
rust/kernel/pci.rs | 2 +-
rust/kernel/platform.rs | 2 +-
rust/kernel/usb.rs | 2 +-
8 files changed, 88 insertions(+), 74 deletions(-)
diff --git a/rust/kernel/auxiliary.rs b/rust/kernel/auxiliary.rs
index 93c0db1f66555..4a02f83240be3 100644
--- a/rust/kernel/auxiliary.rs
+++ b/rust/kernel/auxiliary.rs
@@ -63,7 +63,7 @@ unsafe fn register(
// SAFETY: `adrv` is guaranteed to be a valid `DriverType`.
to_result(unsafe {
- bindings::__auxiliary_driver_register(adrv.get(), module.0, name.as_char_ptr())
+ bindings::__auxiliary_driver_register(adrv.get(), module.as_ptr(), name.as_char_ptr())
})
}
diff --git a/rust/kernel/i2c.rs b/rust/kernel/i2c.rs
index 7b908f0c5a58d..24eff08f47123 100644
--- a/rust/kernel/i2c.rs
+++ b/rust/kernel/i2c.rs
@@ -142,7 +142,7 @@ unsafe fn register(
}
// SAFETY: `idrv` is guaranteed to be a valid `DriverType`.
- to_result(unsafe { bindings::i2c_register_driver(module.0, idrv.get()) })
+ to_result(unsafe { bindings::i2c_register_driver(module.as_ptr(), idrv.get()) })
}
unsafe fn unregister(idrv: &Opaque<Self::DriverType>) {
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index b72b2fbe046d6..040ae85056509 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -93,6 +93,7 @@
pub mod maple_tree;
pub mod miscdevice;
pub mod mm;
+pub mod module;
pub mod module_param;
#[cfg(CONFIG_NET)]
pub mod net;
@@ -139,79 +140,17 @@
#[doc(hidden)]
pub use bindings;
pub use macros;
+pub use module::{
+ InPlaceModule,
+ Module,
+ ModuleMetadata,
+ ThisModule, //
+};
pub use uapi;
/// Prefix to appear before log messages printed from within the `kernel` crate.
const __LOG_PREFIX: &[u8] = b"rust_kernel\0";
-/// The top level entrypoint to implementing a kernel module.
-///
-/// For any teardown or cleanup operations, your type may implement [`Drop`].
-pub trait Module: Sized + Sync + Send {
- /// Called at module initialization time.
- ///
- /// Use this method to perform whatever setup or registration your module
- /// should do.
- ///
- /// Equivalent to the `module_init` macro in the C API.
- fn init(module: &'static ThisModule) -> error::Result<Self>;
-}
-
-/// A module that is pinned and initialised in-place.
-pub trait InPlaceModule: Sync + Send {
- /// Creates an initialiser for the module.
- ///
- /// It is called when the module is loaded.
- fn init(module: &'static ThisModule) -> impl pin_init::PinInit<Self, error::Error>;
-}
-
-impl<T: Module> InPlaceModule for T {
- fn init(module: &'static ThisModule) -> impl pin_init::PinInit<Self, error::Error> {
- let initer = move |slot: *mut Self| {
- let m = <Self as Module>::init(module)?;
-
- // SAFETY: `slot` is valid for write per the contract with `pin_init_from_closure`.
- unsafe { slot.write(m) };
- Ok(())
- };
-
- // SAFETY: On success, `initer` always fully initialises an instance of `Self`.
- unsafe { pin_init::pin_init_from_closure(initer) }
- }
-}
-
-/// Metadata attached to a [`Module`] or [`InPlaceModule`].
-pub trait ModuleMetadata {
- /// The name of the module as specified in the `module!` macro.
- const NAME: &'static crate::str::CStr;
-}
-
-/// Equivalent to `THIS_MODULE` in the C API.
-///
-/// C header: [`include/linux/init.h`](srctree/include/linux/init.h)
-pub struct ThisModule(*mut bindings::module);
-
-// SAFETY: `THIS_MODULE` may be used from all threads within a module.
-unsafe impl Sync for ThisModule {}
-
-impl ThisModule {
- /// Creates a [`ThisModule`] given the `THIS_MODULE` pointer.
- ///
- /// # Safety
- ///
- /// The pointer must be equal to the right `THIS_MODULE`.
- pub const unsafe fn from_ptr(ptr: *mut bindings::module) -> ThisModule {
- ThisModule(ptr)
- }
-
- /// Access the raw pointer for this module.
- ///
- /// It is up to the user to use it correctly.
- pub const fn as_ptr(&self) -> *mut bindings::module {
- self.0
- }
-}
-
#[cfg(not(testlib))]
#[panic_handler]
fn panic(info: &core::panic::PanicInfo<'_>) -> ! {
diff --git a/rust/kernel/module.rs b/rust/kernel/module.rs
new file mode 100644
index 0000000000000..be242a82e86d2
--- /dev/null
+++ b/rust/kernel/module.rs
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Module-related types and helpers.
+
+/// The entrypoint to implementing a kernel module.
+///
+/// For any teardown or cleanup operations, your type may implement [`Drop`].
+pub trait Module: Sized + Sync + Send {
+ /// Called at module initialization time.
+ ///
+ /// Use this method to perform whatever setup or registration your module
+ /// should do.
+ ///
+ /// Equivalent to the `module_init` macro in the C API.
+ fn init(module: &'static ThisModule) -> crate::error::Result<Self>;
+}
+
+/// A module that is pinned and initialised in-place.
+pub trait InPlaceModule: Sync + Send {
+ /// Creates an initialiser for the module.
+ ///
+ /// It is called when the module is loaded.
+ fn init(module: &'static ThisModule) -> impl pin_init::PinInit<Self, crate::error::Error>;
+}
+
+impl<T: Module> InPlaceModule for T {
+ fn init(module: &'static ThisModule) -> impl pin_init::PinInit<Self, crate::error::Error> {
+ let initer = move |slot: *mut Self| {
+ let m = <Self as Module>::init(module)?;
+
+ // SAFETY: `slot` is valid for write per the contract with `pin_init_from_closure`.
+ unsafe { slot.write(m) };
+ Ok(())
+ };
+
+ // SAFETY: On success, `initer` always fully initialises an instance of `Self`.
+ unsafe { pin_init::pin_init_from_closure(initer) }
+ }
+}
+
+/// Metadata attached to a [`Module`] or [`InPlaceModule`].
+pub trait ModuleMetadata {
+ /// The name of the module as specified in the `module!` macro.
+ const NAME: &'static crate::str::CStr;
+}
+
+/// Equivalent to `THIS_MODULE` in the C API.
+///
+/// C header: [`include/linux/init.h`](srctree/include/linux/init.h)
+pub struct ThisModule(*mut crate::bindings::module);
+
+// SAFETY: `THIS_MODULE` may be used from all threads within a module.
+unsafe impl Sync for ThisModule {}
+
+impl ThisModule {
+ /// Creates a [`ThisModule`] given the `THIS_MODULE` pointer.
+ ///
+ /// # Safety
+ ///
+ /// The pointer must be equal to the right `THIS_MODULE`.
+ pub const unsafe fn from_ptr(ptr: *mut crate::bindings::module) -> ThisModule {
+ ThisModule(ptr)
+ }
+
+ /// Access the raw pointer for this module.
+ ///
+ /// It is up to the user to use it correctly.
+ pub const fn as_ptr(&self) -> *mut crate::bindings::module {
+ self.0
+ }
+}
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index 3ca99db5cccf2..8b7036b8fe480 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -659,7 +659,11 @@ pub fn register(
// the `drivers` slice are initialized properly. `drivers` will not be moved.
// So it's just an FFI call.
to_result(unsafe {
- bindings::phy_drivers_register(drivers[0].0.get(), drivers.len().try_into()?, module.0)
+ bindings::phy_drivers_register(
+ drivers[0].0.get(),
+ drivers.len().try_into()?,
+ module.as_ptr(),
+ )
})?;
// INVARIANT: The `drivers` slice is successfully registered to the kernel via `phy_drivers_register`.
Ok(Registration { drivers })
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index af74ddff6114d..916ed2cb6b70b 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -86,7 +86,7 @@ unsafe fn register(
// SAFETY: `pdrv` is guaranteed to be a valid `DriverType`.
to_result(unsafe {
- bindings::__pci_register_driver(pdrv.get(), module.0, name.as_char_ptr())
+ bindings::__pci_register_driver(pdrv.get(), module.as_ptr(), name.as_char_ptr())
})
}
diff --git a/rust/kernel/platform.rs b/rust/kernel/platform.rs
index 8917d4ee499fb..9fdbafd53bc21 100644
--- a/rust/kernel/platform.rs
+++ b/rust/kernel/platform.rs
@@ -82,7 +82,7 @@ unsafe fn register(
}
// SAFETY: `pdrv` is guaranteed to be a valid `DriverType`.
- to_result(unsafe { bindings::__platform_driver_register(pdrv.get(), module.0) })
+ to_result(unsafe { bindings::__platform_driver_register(pdrv.get(), module.as_ptr()) })
}
unsafe fn unregister(pdrv: &Opaque<Self::DriverType>) {
diff --git a/rust/kernel/usb.rs b/rust/kernel/usb.rs
index 9c17a672cd275..213db32727c17 100644
--- a/rust/kernel/usb.rs
+++ b/rust/kernel/usb.rs
@@ -63,7 +63,7 @@ unsafe fn register(
// SAFETY: `udrv` is guaranteed to be a valid `DriverType`.
to_result(unsafe {
- bindings::usb_register_driver(udrv.get(), module.0, name.as_char_ptr())
+ bindings::usb_register_driver(udrv.get(), module.as_ptr(), name.as_char_ptr())
})
}
--
2.43.0
* Re: [BUG] KFENCE: use-after-free read in udp_tunnel_nic_device_sync_work
From: Sam Sun @ 2026-06-24 14:46 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, netdev,
linux-kernel, syzkaller
In-Reply-To: <CANn89iKhPmbJW_6DA1_okSGsr_e_Jz47qns-nFcZpnQZ-nAUOA@mail.gmail.com>
On Wed, Jun 24, 2026 at 10:10 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Jun 24, 2026 at 6:59 AM Eric Dumazet <edumazet@google.com> wrote:
>
> > Oh well.
> >
> > u8 need_sync:1;
> > u8 need_replay:1;
> > u8 work_pending:1;
> >
> > These bitfields are not safe, obviously :/
> >
> > Time to convert them to atomic bit operations.
>
> Can you try:
>
> diff --git a/net/ipv4/udp_tunnel_nic.c b/net/ipv4/udp_tunnel_nic.c
> index 9944ed923ddfd10f9adf6ad788c0740daeaf2adb..939d6f656bb71814718bc3bf84be665adad27e4b
> 100644
> --- a/net/ipv4/udp_tunnel_nic.c
> +++ b/net/ipv4/udp_tunnel_nic.c
> @@ -30,9 +30,7 @@ struct udp_tunnel_nic_table_entry {
> * @work: async work for talking to hardware from process context
> * @dev: netdev pointer
> * @lock: protects all fields
> - * @need_sync: at least one port start changed
> - * @need_replay: space was freed, we need a replay of all ports
> - * @work_pending: @work is currently scheduled
> + * @flags: sync, replay, pending flags
> * @n_tables: number of tables under @entries
> * @missed: bitmap of tables which overflown
> * @entries: table of tables of ports currently offloaded
> @@ -44,9 +42,10 @@ struct udp_tunnel_nic {
>
> struct mutex lock;
>
> - u8 need_sync:1;
> - u8 need_replay:1;
> - u8 work_pending:1;
> + unsigned long flags;
> +#define UDP_TUNNEL_NIC_NEED_SYNC 0
> +#define UDP_TUNNEL_NIC_NEED_REPLAY 1
> +#define UDP_TUNNEL_NIC_WORK_PENDING 2
>
> unsigned int n_tables;
> unsigned long missed;
> @@ -116,7 +115,7 @@ udp_tunnel_nic_entry_queue(struct udp_tunnel_nic *utn,
> unsigned int flag)
> {
> entry->flags |= flag;
> - utn->need_sync = 1;
> + set_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags);
> }
>
> static void
> @@ -283,7 +282,7 @@ udp_tunnel_nic_device_sync_by_table(struct net_device *dev,
> static void
> __udp_tunnel_nic_device_sync(struct net_device *dev, struct
> udp_tunnel_nic *utn)
> {
> - if (!utn->need_sync)
> + if (!test_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags))
> return;
>
> if (dev->udp_tunnel_nic_info->sync_table)
> @@ -291,21 +290,24 @@ __udp_tunnel_nic_device_sync(struct net_device
> *dev, struct udp_tunnel_nic *utn)
> else
> udp_tunnel_nic_device_sync_by_port(dev, utn);
>
> - utn->need_sync = 0;
> + clear_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags);
> /* Can't replay directly here, in case we come from the tunnel driver's
> * notification - trying to replay may deadlock inside tunnel driver.
> */
> - utn->need_replay = udp_tunnel_nic_should_replay(dev, utn);
> + if (udp_tunnel_nic_should_replay(dev, utn))
> + set_bit(UDP_TUNNEL_NIC_NEED_REPLAY, &utn->flags);
> + else
> + clear_bit(UDP_TUNNEL_NIC_NEED_REPLAY, &utn->flags);
> }
>
> static void
> udp_tunnel_nic_device_sync(struct net_device *dev, struct udp_tunnel_nic *utn)
> {
> - if (!utn->need_sync)
> + if (!test_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags))
> return;
>
> + set_bit(UDP_TUNNEL_NIC_WORK_PENDING, &utn->flags);
> queue_work(udp_tunnel_nic_workqueue, &utn->work);
> - utn->work_pending = 1;
> }
>
> static bool
> @@ -348,7 +350,7 @@ udp_tunnel_nic_has_collision(struct net_device
> *dev, struct udp_tunnel_nic *utn,
> if (!udp_tunnel_nic_entry_is_free(entry) &&
> entry->port == ti->port &&
> entry->type != ti->type) {
> - __set_bit(i, &utn->missed);
> + set_bit(i, &utn->missed);
> return true;
> }
> }
> @@ -483,7 +485,7 @@ udp_tunnel_nic_add_new(struct net_device *dev,
> struct udp_tunnel_nic *utn,
> * are no devices currently which have multiple tables accepting
> * the same tunnel type, and false positives are okay.
> */
> - __set_bit(i, &utn->missed);
> + set_bit(i, &utn->missed);
> }
>
> return false;
> @@ -552,7 +554,7 @@ static void __udp_tunnel_nic_reset_ntf(struct
> net_device *dev)
>
> mutex_lock(&utn->lock);
>
> - utn->need_sync = false;
> + clear_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags);
> for (i = 0; i < utn->n_tables; i++)
> for (j = 0; j < info->tables[i].n_entries; j++) {
> struct udp_tunnel_nic_table_entry *entry;
> @@ -696,8 +698,8 @@ udp_tunnel_nic_flush(struct net_device *dev,
> struct udp_tunnel_nic *utn)
> for (i = 0; i < utn->n_tables; i++)
> memset(utn->entries[i], 0, array_size(info->tables[i].n_entries,
> sizeof(**utn->entries)));
> - WARN_ON(utn->need_sync);
> - utn->need_replay = 0;
> + WARN_ON(test_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags));
> + clear_bit(UDP_TUNNEL_NIC_NEED_REPLAY, &utn->flags);
> }
>
> static void
> @@ -713,8 +715,8 @@ udp_tunnel_nic_replay(struct net_device *dev,
> struct udp_tunnel_nic *utn)
> for (i = 0; i < utn->n_tables; i++)
> for (j = 0; j < info->tables[i].n_entries; j++)
> udp_tunnel_nic_entry_freeze_used(&utn->entries[i][j]);
> - utn->missed = 0;
> - utn->need_replay = 0;
> + bitmap_zero(&utn->missed, UDP_TUNNEL_NIC_MAX_TABLES);
> + clear_bit(UDP_TUNNEL_NIC_NEED_REPLAY, &utn->flags);
>
> if (!info->shared) {
> udp_tunnel_get_rx_info(dev);
> @@ -736,10 +738,10 @@ static void
> udp_tunnel_nic_device_sync_work(struct work_struct *work)
> rtnl_lock();
> mutex_lock(&utn->lock);
>
> - utn->work_pending = 0;
> + clear_bit(UDP_TUNNEL_NIC_WORK_PENDING, &utn->flags);
> __udp_tunnel_nic_device_sync(utn->dev, utn);
>
> - if (utn->need_replay)
> + if (test_bit(UDP_TUNNEL_NIC_NEED_REPLAY, &utn->flags))
> udp_tunnel_nic_replay(utn->dev, utn);
>
> mutex_unlock(&utn->lock);
> @@ -866,6 +868,11 @@ udp_tunnel_nic_unregister(struct net_device *dev,
> struct udp_tunnel_nic *utn)
>
> udp_tunnel_nic_lock(dev);
>
> + if (test_bit(UDP_TUNNEL_NIC_WORK_PENDING, &utn->flags)) {
> + udp_tunnel_nic_unlock(dev);
> + return;
> + }
> +
> /* For a shared table remove this dev from the list of sharing devices
> * and if there are other devices just detach.
> */
> @@ -901,12 +908,6 @@ udp_tunnel_nic_unregister(struct net_device *dev,
> struct udp_tunnel_nic *utn)
> udp_tunnel_nic_flush(dev, utn);
> udp_tunnel_nic_unlock(dev);
>
> - /* Wait for the work to be done using the state, netdev core will
> - * retry unregister until we give up our reference on this device.
> - */
> - if (utn->work_pending)
> - return;
> -
> udp_tunnel_nic_free(utn);
> release_dev:
> dev->udp_tunnel_nic = NULL;
I tested this version as well, but the C reproducer still triggers a crash
on my side.
This time the VM panicked after about 50 seconds on a debugobjects warning:
[ 50.420529][ T9744] ------------[ cut here ]------------
[ 50.421258][ T9744] ODEBUG: free active (active state 0) object:
ff110001114b5a00 object type: work_struct hint:
udp_tunnel_nic_device_sync_work+0x0/0x940
[ 50.424052][ T9744] WARNING: lib/debugobjects.c:629 at
debug_print_object+0x1a0/0x2e0, CPU#0: repro/9744
[ 50.425766][ T9744] Modules linked in:
[ 50.426279][ T9744] CPU: 0 UID: 0 PID: 9744 Comm: repro Not tainted
7.1.0-11240-g840ef6c78e6a-dirty #33 PREEMPT(full)
[ 50.428614][ T9744] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS 1.15.0-1 04/01/2014
[ 50.429661][ T9744] RIP: 0010:debug_print_object+0x1a5/0x2e0
[ 50.430338][ T9744] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8a 00
00 00 48 8b 14 ed 60 33 1e 8c 48 83 fd 05 77 47 48 8d 3d e0 0c 1e 0c
41 56 4c 89 e6 <67> 48 0f b9 3a 58 83 05 42 30 14 0c 01 48 83 c4 20 5b
5d 41 5c 41
[ 50.432538][ T9744] RSP: 0018:ffa0000012c8ee60 EFLAGS: 00010293
[ 50.433045][ T9744] RAX: dffffc0000000000 RBX: ffa0000012c8ef40 RCX:
0000000000000000
[ 50.433710][ T9744] RDX: ffffffff8c1e32a0 RSI: ffffffff8c1e2e80 RDI:
ffffffff90e31820
[ 50.434390][ T9744] RBP: 0000000000000003 R08: ff110001114b5a00 R09:
ffffffff8bae17e0
[ 50.435050][ T9744] R10: ffffffff90d907d7 R11: 0000000000000000 R12:
ffffffff8c1e2e80
[ 50.435719][ T9744] R13: ffffffff8bae1820 R14: ffffffff8a0f69e0 R15:
ffa0000012c8ef58
[ 50.436393][ T9744] FS: 00007f8430d5b640(0000)
GS:ff1100018394a000(0000) knlGS:0000000000000000
[ 50.437149][ T9744] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 50.437749][ T9744] CR2: 00007f422d7b8000 CR3: 000000010b850000 CR4:
0000000000753ef0
[ 50.438404][ T9744] PKRU: 55555554
[ 50.438716][ T9744] Call Trace:
[ 50.439004][ T9744] <TASK>
[ 50.439277][ T9744] ? __pfx_udp_tunnel_nic_device_sync_work+0x10/0x10
[ 50.439856][ T9744] ? _raw_spin_unlock_irqrestore+0x58/0x70
[ 50.440942][ T9744] debug_check_no_obj_freed+0x3ec/0x520
[ 50.441419][ T9744] ? __udp_tunnel_nic_lock+0x47/0x60
[ 50.441878][ T9744] ? __pfx_debug_check_no_obj_freed+0x10/0x10
[ 50.442403][ T9744] ? kasan_quarantine_put+0x10d/0x230
[ 50.442875][ T9744] ? lockdep_hardirqs_on+0x7c/0x110
[ 50.443325][ T9744] kfree+0x2a0/0x6d0
[ 50.443663][ T9744] ? udp_tunnel_nic_netdevice_event+0xc14/0x1e40
[ 50.444214][ T9744] udp_tunnel_nic_netdevice_event+0xc14/0x1e40
[ 50.444730][ T9744] notifier_call_chain+0xbd/0x430
[ 50.445164][ T9744] ? __pfx_udp_tunnel_nic_netdevice_event+0x10/0x10
[ 50.445729][ T9744] call_netdevice_notifiers_info+0xbe/0x110
[ 50.446236][ T9744] unregister_netdevice_many_notify+0xbab/0x2130
[ 50.446781][ T9744] ? __pfx_unregister_netdevice_many_notify+0x10/0x10
[ 50.447907][ T9744] ? __pfx___mutex_lock+0x10/0x10
[ 50.448351][ T9744] unregister_netdevice_queue+0x305/0x3c0
[ 50.448842][ T9744] ? __pfx_unregister_netdevice_queue+0x10/0x10
[ 50.449369][ T9744] nsim_destroy+0x231/0x980
[ 50.449773][ T9744] __nsim_dev_port_del+0x197/0x2c0
[ 50.450215][ T9744] nsim_dev_reload_destroy+0x105/0x490
[ 50.450677][ T9744] nsim_dev_reload_down+0x67/0xd0
[ 50.451143][ T9744] devlink_reload+0x197/0x7b0
[ 50.451564][ T9744] ? __pfx_devlink_reload+0x10/0x10
[ 50.452020][ T9744] ? security_capable+0x210/0x250
[ 50.452466][ T9744] ? ns_capable+0xe2/0x120
[ 50.452858][ T9744] devlink_nl_reload_doit+0x541/0x1160
[ 50.453323][ T9744] ? __pfx_devlink_nl_reload_doit+0x10/0x10
[ 50.453828][ T9744] ? genl_family_rcv_msg_attrs_parse.constprop.0+0x1e5/0x2f0
[ 50.454458][ T9744] genl_family_rcv_msg_doit+0x1ff/0x2f0
[ 50.454930][ T9744] ? __pfx_genl_family_rcv_msg_doit+0x10/0x10
[ 50.455442][ T9744] ? bpf_lsm_capable+0x9/0x10
[ 50.455845][ T9744] ? security_capable+0x210/0x250
[ 50.456297][ T9744] genl_rcv_msg+0x532/0x7e0
[ 50.456683][ T9744] ? __pfx_genl_rcv_msg+0x10/0x10
[ 50.457115][ T9744] ? __pfx_devlink_nl_pre_doit_dev_lock+0x10/0x10
[ 50.457690][ T9744] ? __pfx_devlink_nl_reload_doit+0x10/0x10
[ 50.458218][ T9744] ? __pfx_devlink_nl_post_doit_dev_lock+0x10/0x10
[ 50.458776][ T9744] ? __lock_acquire+0x476/0x2420
[ 50.459208][ T9744] netlink_rcv_skb+0x147/0x430
[ 50.459633][ T9744] ? __pfx_genl_rcv_msg+0x10/0x10
[ 50.460062][ T9744] ? __pfx_netlink_rcv_skb+0x10/0x10
[ 50.460520][ T9744] ? netlink_deliver_tap+0x1ae/0xd10
[ 50.460976][ T9744] genl_rcv+0x28/0x40
[ 50.461330][ T9744] netlink_unicast+0x58d/0x850
[ 50.461739][ T9744] ? __pfx_netlink_unicast+0x10/0x10
[ 50.462198][ T9744] netlink_sendmsg+0x88d/0xd90
[ 50.462610][ T9744] ? __pfx_netlink_sendmsg+0x10/0x10
[ 50.463092][ T9744] ? __pfx_netlink_sendmsg+0x10/0x10
[ 50.463557][ T9744] ____sys_sendmsg+0xa27/0xb90
[ 50.463982][ T9744] ? __pfx_____sys_sendmsg+0x10/0x10
[ 50.464438][ T9744] ? __pfx_copy_msghdr_from_user+0x10/0x10
[ 50.464938][ T9744] ? find_held_lock+0x2b/0x80
[ 50.465347][ T9744] ? futex_wake+0x4f7/0x5e0
[ 50.465735][ T9744] ___sys_sendmsg+0x11c/0x1b0
[ 50.466129][ T9744] ? __pfx____sys_sendmsg+0x10/0x10
[ 50.466586][ T9744] ? __pfx_futex_wake+0x10/0x10
[ 50.467002][ T9744] ? __fget_files+0x1f1/0x3b0
[ 50.467440][ T9744] ? __fget_files+0x1fb/0x3b0
[ 50.467836][ T9744] ? __lock_acquire+0x450/0x2420
[ 50.468298][ T9744] __sys_sendmsg+0x142/0x1f0
[ 50.468696][ T9744] ? __pfx___sys_sendmsg+0x10/0x10
[ 50.469133][ T9744] ? __cpu_to_node+0x8a/0x130
[ 50.469538][ T9744] do_syscall_64+0x11f/0x860
[ 50.469953][ T9744] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 50.470454][ T9744] RIP: 0033:0x451a4d
[ 50.470783][ T9744] Code: c3 e8 a7 23 00 00 0f 1f 80 00 00 00 00 f3
0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b
4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8
64 89 01 48
[ 50.472393][ T9744] RSP: 002b:00007f8430d5b198 EFLAGS: 00000246
ORIG_RAX: 000000000000002e
[ 50.473102][ T9744] RAX: ffffffffffffffda RBX: 00000000004e9440 RCX:
0000000000451a4d
[ 50.473761][ T9744] RDX: 0000000000000000 RSI: 0000200000000800 RDI:
0000000000000003
[ 50.474419][ T9744] RBP: 00000000004b66b4 R08: 000000000000006d R09:
0000000000000000
[ 50.475085][ T9744] R10: 0000000000000001 R11: 0000000000000246 R12:
0000200000000280
[ 50.475738][ T9744] R13: 0000200000000190 R14: 0000200000000180 R15:
00000000004e9448
[ 50.476418][ T9744] </TASK>
[ 50.476683][ T9744] Kernel panic - not syncing: kernel: panic_on_warn set ...
[ 50.477297][ T9744] CPU: 0 UID: 0 PID: 9744 Comm: repro Not tainted
7.1.0-11240-g840ef6c78e6a-dirty #33 PREEMPT(full)
[ 50.478201][ T9744] Hardware name: QEMU Standard PC (i440FX + PIIX,
1996), BIOS 1.15.0-1 04/01/2014
[ 50.478952][ T9744] Call Trace:
[ 50.479240][ T9744] <TASK>
[ 50.479492][ T9744] dump_stack_lvl+0x3d/0x1b0
[ 50.479895][ T9744] vpanic+0x7f2/0xa70
[ 50.480241][ T9744] ? __pfx_vpanic+0x10/0x10
[ 50.480621][ T9744] ? is_bpf_text_address+0x96/0x1a0
[ 50.481070][ T9744] ? debug_print_object+0x1a0/0x2e0
[ 50.481512][ T9744] panic+0xc2/0xd0
[ 50.481844][ T9744] ? __pfx_panic+0x10/0x10
[ 50.482230][ T9744] ? check_panic_on_warn+0x1f/0xc0
[ 50.482673][ T9744] check_panic_on_warn+0xb1/0xc0
[ 50.483102][ T9744] __warn+0x108/0x3f0
[ 50.483460][ T9744] __report_bug+0x42c/0x510
[ 50.483854][ T9744] ? debug_print_object+0x1a0/0x2e0
[ 50.484297][ T9744] ? __pfx___report_bug+0x10/0x10
[ 50.484721][ T9744] ? __kernel_text_address+0xd/0x40
[ 50.485166][ T9744] ? unwind_get_return_address+0x59/0xa0
[ 50.485664][ T9744] report_bug_entry+0xe1/0x280
[ 50.486068][ T9744] ? debug_print_object+0x1a5/0x2e0
[ 50.486508][ T9744] handle_bug+0x428/0x4e0
[ 50.486889][ T9744] exc_invalid_op+0x35/0x80
[ 50.487281][ T9744] asm_exc_invalid_op+0x1a/0x20
[ 50.487693][ T9744] RIP: 0010:debug_print_object+0x1a5/0x2e0
[ 50.488194][ T9744] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8a 00
00 00 48 8b 14 ed 60 33 1e 8c 48 83 fd 05 77 47 48 8d 3d e0 0c 1e 0c
41 56 4c 89 e6 <67> 48 0f b9 3a 58 83 05 42 30 14 0c 01 48 83 c4 20 5b
5d 41 5c 41
[ 50.489814][ T9744] RSP: 0018:ffa0000012c8ee60 EFLAGS: 00010293
[ 50.490362][ T9744] RAX: dffffc0000000000 RBX: ffa0000012c8ef40 RCX:
0000000000000000
[ 50.491033][ T9744] RDX: ffffffff8c1e32a0 RSI: ffffffff8c1e2e80 RDI:
ffffffff90e31820
[ 50.491698][ T9744] RBP: 0000000000000003 R08: ff110001114b5a00 R09:
ffffffff8bae17e0
[ 50.492384][ T9744] R10: ffffffff90d907d7 R11: 0000000000000000 R12:
ffffffff8c1e2e80
[ 50.493043][ T9744] R13: ffffffff8bae1820 R14: ffffffff8a0f69e0 R15:
ffa0000012c8ef58
[ 50.493718][ T9744] ? __pfx_udp_tunnel_nic_device_sync_work+0x10/0x10
[ 50.494288][ T9744] ? __pfx_udp_tunnel_nic_device_sync_work+0x10/0x10
[ 50.494845][ T9744] ? _raw_spin_unlock_irqrestore+0x58/0x70
[ 50.495347][ T9744] debug_check_no_obj_freed+0x3ec/0x520
[ 50.495825][ T9744] ? __udp_tunnel_nic_lock+0x47/0x60
[ 50.496276][ T9744] ? __pfx_debug_check_no_obj_freed+0x10/0x10
[ 50.496794][ T9744] ? kasan_quarantine_put+0x10d/0x230
[ 50.497266][ T9744] ? lockdep_hardirqs_on+0x7c/0x110
[ 50.497706][ T9744] kfree+0x2a0/0x6d0
[ 50.498038][ T9744] ? udp_tunnel_nic_netdevice_event+0xc14/0x1e40
[ 50.498577][ T9744] udp_tunnel_nic_netdevice_event+0xc14/0x1e40
[ 50.499107][ T9744] notifier_call_chain+0xbd/0x430
[ 50.499544][ T9744] ? __pfx_udp_tunnel_nic_netdevice_event+0x10/0x10
[ 50.500096][ T9744] call_netdevice_notifiers_info+0xbe/0x110
[ 50.500625][ T9744] unregister_netdevice_many_notify+0xbab/0x2130
[ 50.501169][ T9744] ? __pfx_unregister_netdevice_many_notify+0x10/0x10
[ 50.501737][ T9744] ? __pfx___mutex_lock+0x10/0x10
[ 50.502174][ T9744] unregister_netdevice_queue+0x305/0x3c0
[ 50.502660][ T9744] ? __pfx_unregister_netdevice_queue+0x10/0x10
[ 50.503186][ T9744] nsim_destroy+0x231/0x980
[ 50.503564][ T9744] __nsim_dev_port_del+0x197/0x2c0
[ 50.503966][ T9744] nsim_dev_reload_destroy+0x105/0x490
[ 50.504406][ T9744] nsim_dev_reload_down+0x67/0xd0
[ 50.504808][ T9744] devlink_reload+0x197/0x7b0
[ 50.505220][ T9744] ? __pfx_devlink_reload+0x10/0x10
[ 50.505679][ T9744] ? security_capable+0x210/0x250
[ 50.506113][ T9744] ? ns_capable+0xe2/0x120
[ 50.506490][ T9744] devlink_nl_reload_doit+0x541/0x1160
[ 50.506962][ T9744] ? __pfx_devlink_nl_reload_doit+0x10/0x10
[ 50.507471][ T9744] ? genl_family_rcv_msg_attrs_parse.constprop.0+0x1e5/0x2f0
[ 50.508094][ T9744] genl_family_rcv_msg_doit+0x1ff/0x2f0
[ 50.508577][ T9744] ? __pfx_genl_family_rcv_msg_doit+0x10/0x10
[ 50.509098][ T9744] ? bpf_lsm_capable+0x9/0x10
[ 50.509495][ T9744] ? security_capable+0x210/0x250
[ 50.509924][ T9744] genl_rcv_msg+0x532/0x7e0
[ 50.510327][ T9744] ? __pfx_genl_rcv_msg+0x10/0x10
[ 50.510753][ T9744] ? __pfx_devlink_nl_pre_doit_dev_lock+0x10/0x10
[ 50.511291][ T9744] ? __pfx_devlink_nl_reload_doit+0x10/0x10
[ 50.511795][ T9744] ? __pfx_devlink_nl_post_doit_dev_lock+0x10/0x10
[ 50.512365][ T9744] ? __lock_acquire+0x476/0x2420
[ 50.512785][ T9744] netlink_rcv_skb+0x147/0x430
[ 50.513197][ T9744] ? __pfx_genl_rcv_msg+0x10/0x10
[ 50.513627][ T9744] ? __pfx_netlink_rcv_skb+0x10/0x10
[ 50.514090][ T9744] ? netlink_deliver_tap+0x1ae/0xd10
[ 50.514542][ T9744] genl_rcv+0x28/0x40
[ 50.514879][ T9744] netlink_unicast+0x58d/0x850
[ 50.515302][ T9744] ? __pfx_netlink_unicast+0x10/0x10
[ 50.515760][ T9744] netlink_sendmsg+0x88d/0xd90
[ 50.516181][ T9744] ? __pfx_netlink_sendmsg+0x10/0x10
[ 50.516640][ T9744] ? __pfx_netlink_sendmsg+0x10/0x10
[ 50.517109][ T9744] ____sys_sendmsg+0xa27/0xb90
[ 50.517520][ T9744] ? __pfx_____sys_sendmsg+0x10/0x10
[ 50.517965][ T9744] ? __pfx_copy_msghdr_from_user+0x10/0x10
[ 50.518457][ T9744] ? find_held_lock+0x2b/0x80
[ 50.518871][ T9744] ? futex_wake+0x4f7/0x5e0
[ 50.519271][ T9744] ___sys_sendmsg+0x11c/0x1b0
[ 50.519677][ T9744] ? __pfx____sys_sendmsg+0x10/0x10
[ 50.520117][ T9744] ? __pfx_futex_wake+0x10/0x10
[ 50.520542][ T9744] ? __fget_files+0x1f1/0x3b0
[ 50.520945][ T9744] ? __fget_files+0x1fb/0x3b0
[ 50.521347][ T9744] ? __lock_acquire+0x450/0x2420
[ 50.521770][ T9744] __sys_sendmsg+0x142/0x1f0
[ 50.522176][ T9744] ? __pfx___sys_sendmsg+0x10/0x10
[ 50.522617][ T9744] ? __cpu_to_node+0x8a/0x130
[ 50.523021][ T9744] do_syscall_64+0x11f/0x860
[ 50.523417][ T9744] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 50.523926][ T9744] RIP: 0033:0x451a4d
[ 50.524264][ T9744] Code: c3 e8 a7 23 00 00 0f 1f 80 00 00 00 00 f3
0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b
4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8
64 89 01 48
[ 50.525868][ T9744] RSP: 002b:00007f8430d5b198 EFLAGS: 00000246
ORIG_RAX: 000000000000002e
[ 50.526579][ T9744] RAX: ffffffffffffffda RBX: 00000000004e9440 RCX:
0000000000451a4d
[ 50.527243][ T9744] RDX: 0000000000000000 RSI: 0000200000000800 RDI:
0000000000000003
[ 50.527914][ T9744] RBP: 00000000004b66b4 R08: 000000000000006d R09:
0000000000000000
[ 50.528614][ T9744] R10: 0000000000000001 R11: 0000000000000246 R12:
0000200000000280
[ 50.529305][ T9744] R13: 0000200000000190 R14: 0000200000000180 R15:
00000000004e9448
[ 50.529982][ T9744] </TASK>
[ 50.530453][ T9744] Kernel Offset: disabled
Resolved from the patched vmlinux:
udp_tunnel_nic_netdevice_event+0xc14/0x1e40:
udp_tunnel_nic_unregister at net/ipv4/udp_tunnel_nic.c:913
udp_tunnel_nic_netdevice_event at net/ipv4/udp_tunnel_nic.c:943
So struct udp_tunnel_nic is still being freed while its embedded work_struct
is active. debugobjects catches this at kfree(), before the active work gets a
chance to run and dereference the freed utn.
My read is that the conversion from bitfields to atomic bitops removes the
plain bitfield data race, but UDP_TUNNEL_NIC_WORK_PENDING is still only one
boolean state. It can represent "some work is pending", but it cannot
distinguish between:
    idle
    queued
    running
    running and queued again
In particular, the workqueue core clears WORK_STRUCT_PENDING before invoking
the worker. At that point the same work item can be queued again by
udp_tunnel_nic_device_sync(). If an already running instance later executes:
clear_bit(UDP_TUNNEL_NIC_WORK_PENDING, &utn->flags);
it can still clear the bit that was set for the requeued instance. Then
udp_tunnel_nic_unregister() may observe UDP_TUNNEL_NIC_WORK_PENDING clear and
free utn, even though debugobjects still sees utn->work as active.
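To make the interleaving concrete, here is a minimal single-threaded userspace model of the sequence above. It is only a sketch under stated assumptions: struct pending_model and the model_*() helpers are hypothetical names that do not exist in the kernel, wq_pending stands in for WORK_STRUCT_PENDING, drv_pending stands in for UDP_TUNNEL_NIC_WORK_PENDING, and the running worker is assumed to clear the driver bit after the requeue has already happened.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct pending_model {
	bool wq_pending;	/* stands in for WORK_STRUCT_PENDING */
	bool wq_running;	/* a worker is executing the callback */
	bool drv_pending;	/* stands in for UDP_TUNNEL_NIC_WORK_PENDING */
};

/* Models udp_tunnel_nic_device_sync(): set the driver bit, then queue. */
static void model_queue(struct pending_model *m)
{
	m->drv_pending = true;
	m->wq_pending = true;
}

/* Models the workqueue core: pending is cleared before the callback runs. */
static void model_worker_start(struct pending_model *m)
{
	m->wq_pending = false;
	m->wq_running = true;
}

/* Models the running callback clearing the driver bit and returning. */
static void model_worker_finish(struct pending_model *m)
{
	m->drv_pending = false;
	m->wq_running = false;
}

/* True when unregister would see the bit clear although work is queued. */
static bool model_free_while_queued(const struct pending_model *m)
{
	return !m->drv_pending && m->wq_pending;
}

/* Replays: queue, start, requeue while running, finish. */
static bool model_replay_requeue_race(void)
{
	struct pending_model m = { false, false, false };

	model_queue(&m);
	model_worker_start(&m);
	model_queue(&m);		/* second instance queued */
	model_worker_finish(&m);	/* first instance clears the shared bit */
	return model_free_while_queued(&m);
}

/* Replays the simple case with no requeue: queue, start, finish. */
static bool model_replay_single(void)
{
	struct pending_model m = { false, false, false };

	model_queue(&m);
	model_worker_start(&m);
	model_worker_finish(&m);
	return model_free_while_queued(&m);
}
```

In this model only the requeue sequence ends with the driver bit clear while the work is still queued, which is the state in which udp_tunnel_nic_unregister() would go on to free utn.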
Thanks,
Yue
^ permalink raw reply
* Re: [PATCH v3 1/7] list: Add mutable iterator variants
From: David Laight @ 2026-06-24 14:23 UTC (permalink / raw)
To: Christian König
Cc: Kaitao Cheng, Andrew Morton, David Hildenbrand, Jens Axboe,
Tejun Heo, Alexander Viro, Christian Brauner, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Johannes Weiner, Peter Zijlstra,
Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
Thomas Gleixner, Juri Lelli, Vincent Guittot, Paul Moore,
Andy Shevchenko, Paul E. McKenney, Shakeel Butt, David Howells,
Simona Vetter, Randy Dunlap, Luca Ceresoli, Philipp Stanner,
linux-block, linux-kernel, cgroups, linux-ntfs-dev, linux-fsdevel,
io-uring, audit, bpf, netdev, dri-devel, linux-perf-users,
linux-trace-kernel, kexec, live-patching, linux-modules,
linux-crypto, linux-pm, rcu, sched-ext, linux-mm, virtualization,
damon, llvm, Kaitao Cheng
In-Reply-To: <cf8467c7-b98f-44a5-9cf9-60b43b5da711@amd.com>
On Wed, 24 Jun 2026 15:23:47 +0200
Christian König <christian.koenig@amd.com> wrote:
> On 6/24/26 15:14, Kaitao Cheng wrote:
> >
> >
> > On 2026/6/22 16:42, David Laight wrote:
> >> On Mon, 22 Jun 2026 12:05:31 +0800
> >> Kaitao Cheng <kaitao.cheng@linux.dev> wrote:
> >>
> >>> From: Kaitao Cheng <chengkaitao@kylinos.cn>
> >>>
> >>> The list_for_each*_safe() helpers are used when the loop body may
> >>> remove the current entry. Their API exposes the temporary cursor at
> >>> every call site, even though most users only need it for the iterator
> >>> implementation and never reference it in the loop body.
> >>>
> >>> Add *_mutable() variants for list and hlist iteration. The new helpers
> >>> support both forms: callers may keep passing an explicit temporary cursor
> >>> when they need to inspect or reset it, or omit it and let the helper use
> >>> a unique internal cursor.
> >>
> >> I'm not really sure 'mutable' means anything either.
> >> It is possible to make it valid for the loop body (or even other threads)
> >> to delete arbitrary list items - but that needs significant extra overheads.
> >>
> >> It might be worth doing something that doesn't need the extra variable,
> >> but there is little point doing all the churn just to rename things.
> >>
> >>>
> >>> This makes call sites that only mutate the list through the current entry
> >>> less noisy, while keeping the existing *_safe() helpers available for
> >>> compatibility.
> >>>
> >>> Signed-off-by: Kaitao Cheng <chengkaitao@kylinos.cn>
> >>> ---
> >>> include/linux/list.h | 269 +++++++++++++++++++++++++++++++++++++------
> >>> 1 file changed, 231 insertions(+), 38 deletions(-)
> >>>
> >>> diff --git a/include/linux/list.h b/include/linux/list.h
> >>> index 09d979976b3b..1081def7cea9 100644
> >>> --- a/include/linux/list.h
> >>> +++ b/include/linux/list.h
> >>> @@ -7,6 +7,7 @@
> >>> #include <linux/stddef.h>
> >>> #include <linux/poison.h>
> >>> #include <linux/const.h>
> >>> +#include <linux/args.h>
> >>>
> >>> #include <asm/barrier.h>
> >>>
> >>> @@ -763,28 +764,72 @@ static inline void list_splice_tail_init(struct list_head *list,
> >>> #define list_for_each_prev(pos, head) \
> >>> for (pos = (head)->prev; !list_is_head(pos, (head)); pos = pos->prev)
> >>>
> >>> -/**
> >>> - * list_for_each_safe - iterate over a list safe against removal of list entry
> >>> - * @pos: the &struct list_head to use as a loop cursor.
> >>> - * @n: another &struct list_head to use as temporary storage
> >>> - * @head: the head for your list.
> >>> +/*
> >>> + * list_for_each_safe is an old interface, use list_for_each_mutable instead.
> >>> */
> >>> #define list_for_each_safe(pos, n, head) \
> >>> for (pos = (head)->next, n = pos->next; \
> >>> !list_is_head(pos, (head)); \
> >>> pos = n, n = pos->next)
> >>>
> >>> +#define __list_for_each_mutable_internal(pos, tmp, head) \
> >>> + for (typeof(pos) tmp = (pos = (head)->next)->next; \
> >>
> >> Use auto
> >>
> >>> + !list_is_head(pos, (head)); \
> >>> + pos = tmp, tmp = pos->next)
> >>> +
> >>> +#define __list_for_each_mutable1(pos, head) \
> >>> + __list_for_each_mutable_internal(pos, __UNIQUE_ID(next), head)
> >>> +
> >>> +#define __list_for_each_mutable2(pos, next, head) \
> >>> + list_for_each_safe(pos, next, head)
> >>> +
> >>> /**
> >>> - * list_for_each_prev_safe - iterate over a list backwards safe against removal of list entry
> >>> + * list_for_each_mutable - iterate over a list safe against entry removal
> >>> * @pos: the &struct list_head to use as a loop cursor.
> >>> - * @n: another &struct list_head to use as temporary storage
> >>> - * @head: the head for your list.
> >>> + * @...: either (head) or (next, head)
> >>> + *
> >>> + * next: another &struct list_head to use as optional temporary storage.
> >>> + * The temporary cursor is internal unless explicitly supplied by
> >>> + * the caller.
> >>> + * head: the head for your list.
> >>> + */
> >>> +#define list_for_each_mutable(pos, ...) \
> >>> + CONCATENATE(__list_for_each_mutable, COUNT_ARGS(__VA_ARGS__)) \
> >>> + (pos, __VA_ARGS__)
> >>
> >> The variable argument count logic really just slows down compilation.
> >> Maybe there aren't enough copies of this code to make that significant.
> >> But just because you can do it doesn't mean it is a good idea.
> >> I'm also not sure it really adds anything to the readability.
> >>
> >> And, if you are going to make the middle argument optional there is
> >> no need to change the macro name.
> >
> > Christian König and Jani Nikula also disagree with the variadic-argument
> > implementation approach. If we abandon that method, it means we will
> > inevitably need to add some new macros. If mutable is not a good name,
> > suggestions for better alternatives would be welcome; coming up with a
> > suitable name is indeed rather tricky.
>
> I don't think you need to add a new macro for the specific use case where people want to modify the next element of the iteration.
>
> If I remember your numbers correctly, that is really a corner case, and continuing to use the existing *_safe() macros for it sounds perfectly fine to me.
IIRC currently you have a choice of either:
	define                  Item that can't be deleted
	list_for_each()         The current item.
	list_for_each_safe()    The next item.
There is also likely to be code that updates the variables to allow
for other scenarios.
Note that if you increase a reference count and release a lock, then
list_for_each() is likely safer than list_for_each_safe() :-)
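As a rough illustration of the choice listed above, here is a userspace sketch of the "safe" loop deleting the current item. The demo_* names are hypothetical local re-implementations; the real helpers live in include/linux/list.h and differ in detail (pointer poisoning, for instance).

```c
#include <stddef.h>

/* Userspace re-implementation for illustration only. */
struct list_head {
	struct list_head *next, *prev;
};

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same shape as list_for_each_safe(): n caches the next entry. */
#define demo_list_for_each_safe(pos, n, head) \
	for (pos = (head)->next, n = pos->next; \
	     pos != (head); \
	     pos = n, n = pos->next)

struct demo_item {
	int value;
	struct list_head node;
};

static void demo_list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void demo_list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink and clear the pointers, so a loop that later read entry->next
 * (as plain list_for_each() would) could not continue.
 */
static void demo_list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = NULL;
	entry->prev = NULL;
}

/* Delete the current entry whenever its value is even; return the sum of
 * the values that remain on the list.
 */
static int demo_remove_even(struct list_head *head)
{
	struct list_head *pos, *n;
	int sum = 0;

	demo_list_for_each_safe(pos, n, head) {
		struct demo_item *it =
			demo_container_of(pos, struct demo_item, node);

		if (it->value % 2 == 0)
			demo_list_del(pos);
		else
			sum += it->value;
	}
	return sum;
}

/* Build 1..6, remove the even entries, return the remaining sum. */
static int demo_run(void)
{
	struct demo_item items[6];
	struct list_head head;
	int i;

	demo_list_init(&head);
	for (i = 0; i < 6; i++) {
		items[i].value = i + 1;
		demo_list_add_tail(&items[i].node, &head);
	}
	return demo_remove_even(&head);
}
```

Deleting n (the cached next item) inside the body would leave the loop stepping onto an unlinked entry, which is why the next item is the one that can't be deleted in the "safe" variant.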
list.h has 9 variants of the 'safe' loop.
The bloat of another 9 is getting excessive.
It has to be said that this is one of my least favourite types of list...
David
>
> Regards,
> Christian.
^ permalink raw reply
* [PATCH V2 net 4/4] net: hns3: differentiate autoneg default values between copper and fiber
From: Jijie Shao @ 2026-06-24 14:13 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, huangdonghua3, yangshuaisong,
netdev, linux-kernel, shaojijie
In-Reply-To: <20260624141319.271439-1-shaojijie@huawei.com>
From: Shuaisong Yang <yangshuaisong@h-partners.com>
Fix a link loss issue during driver initialization on optical ports
connected to forced-mode (non-autoneg) remote switches.
Previously, during driver probe or initialization, hclge_configure()
blindly hardcoded hdev->hw.mac.req_autoneg to AUTONEG_ENABLE for all
media types. While this is necessary for copper (BASE-T) ports to
establish a link, many high-speed optical (fiber) ports in data
centers are connected to switches running in forced mode (fixed speed,
autoneg disabled). Forcing autoneg on these optical ports during
initialization causes a permanent link failure since the remote end
refuses to respond to autoneg pulses.
Fix this by implementing media-type differentiated initialization in
hclge_init_ae_dev(). Copper ports continue to default to
AUTONEG_ENABLE, while optical ports strictly inherit the preset
autoneg status pre-configured by the firmware (hdev->hw.mac.autoneg),
preserving native compatibility with forced-mode network environments.
Fixes: 05eb60e9648c ("net: hns3: using user configure after hardware reset")
Signed-off-by: Shuaisong Yang <yangshuaisong@h-partners.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
Changes in V2:
- Fix a link-loss regression on fiber ports in forced mode where the
helper would configure hardware with the static default_speed instead
of the firmware-probed SFP speed, by synchronizing req_speed from
mac.speed when req_autoneg is overridden to AUTONEG_DISABLE.
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index d176100d3e4c..fc8587c80813 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -9498,6 +9498,13 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
if (ret)
goto err_ptp_uninit;
+ if (hdev->hw.mac.media_type != HNAE3_MEDIA_TYPE_COPPER) {
+ hdev->hw.mac.req_autoneg = hdev->hw.mac.autoneg;
+ if (hdev->hw.mac.autoneg == AUTONEG_DISABLE &&
+ hdev->hw.mac.speed != SPEED_UNKNOWN)
+ hdev->hw.mac.req_speed = hdev->hw.mac.speed;
+ }
+
ret = hclge_set_autoneg_speed_dup(hdev);
if (ret) {
dev_err(&pdev->dev,
--
2.33.0
^ permalink raw reply related
* [PATCH V2 net 3/4] net: hns3: fix permanent link down deadlock after reset
From: Jijie Shao @ 2026-06-24 14:13 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, huangdonghua3, yangshuaisong,
netdev, linux-kernel, shaojijie
In-Reply-To: <20260624141319.271439-1-shaojijie@huawei.com>
From: Shuaisong Yang <yangshuaisong@h-partners.com>
Fix a critical race condition deadlock where the network interface
remains permanently Link Down after a hardware reset under specific
ethtool sequences.
This issue exclusively manifests in firmware-controlled PHY topologies
where the driver relies on the IMP firmware to arbitrate link parameters.
Standard devices driven by the kernel's native PHY_LIB are unaffected.
The deadlock occurs via the following path:
1. User disables autoneg and forces an unmatched speed, forcing link
down: `ethtool -s ethx autoneg off speed 10 duplex full`
2. User re-enables autoneg: `ethtool -s ethx autoneg on`. The netdev
stack passes cmd->base.speed as SPEED_UNKNOWN (0xffffffff).
3. Driver saves req_autoneg=1, but before the interface can link up,
a hardware reset is triggered.
4. During reset recovery, MAC init reads the un-synchronized runtime
state mac.autoneg (which is still 0/OFF), misinterprets it as
forced mode, and pushes the cached SPEED_UNKNOWN into the hardware
registers, causing the MAC firmware state machine to freeze.
Meanwhile, PHY init reads req_autoneg=1 and enables PHY autoneg.
Since the MAC is frozen with 0xffffffff and PHY is running autoneg,
they mismatch permanently.
Fix this by:
1. Intercepting SPEED_UNKNOWN/DUPLEX_UNKNOWN in
hclge_set_phy_link_ksettings() and hclge_cfg_mac_speed_dup_h() to
prevent them from corrupting the driver's cached valid configuration.
2. Saving req_autoneg in hclge_set_autoneg().
3. Aligning the state judgment in hclge_set_autoneg_speed_dup() to use
req_autoneg instead of the un-synchronized runtime mac.autoneg,
ensuring both MAC and PHY consistently enter the autoneg branch to
eliminate configuration discrepancies during reset recovery.
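The caching rule in step 1 can be sketched in userspace as follows. This is a hedged model, not driver code: struct mac_req_model and the model_*() helpers are hypothetical, the MODEL_* constants are local stand-ins mirroring the ethtool UAPI values, the initial speed of 1000 is arbitrary, and the sketch assumes that duplex also arrives as "unknown" when autoneg is re-enabled.

```c
#include <stdint.h>

/* Local stand-ins for the ethtool UAPI constants, so the sketch builds
 * without kernel headers.
 */
#define MODEL_SPEED_UNKNOWN	UINT32_MAX	/* SPEED_UNKNOWN is -1 */
#define MODEL_DUPLEX_UNKNOWN	0xff
#define MODEL_DUPLEX_FULL	0x01
#define MODEL_AUTONEG_DISABLE	0x00
#define MODEL_AUTONEG_ENABLE	0x01

/* Hypothetical reduced view of the req_* fields in struct hclge_mac. */
struct mac_req_model {
	uint32_t req_speed;
	uint8_t req_duplex;
	uint8_t req_autoneg;
};

/* Cache a user request, ignoring "unknown" placeholders so that the last
 * valid speed and duplex survive.
 */
static void model_cache_request(struct mac_req_model *mac, uint8_t autoneg,
				uint32_t speed, uint8_t duplex)
{
	mac->req_autoneg = autoneg;
	if (speed != MODEL_SPEED_UNKNOWN)
		mac->req_speed = speed;
	if (duplex != MODEL_DUPLEX_UNKNOWN)
		mac->req_duplex = duplex;
}

/* Replays the two ethtool commands from the sequence above. */
static struct mac_req_model model_replay_ethtool(void)
{
	struct mac_req_model mac = {
		1000, MODEL_DUPLEX_FULL, MODEL_AUTONEG_ENABLE
	};

	/* ethtool -s ethx autoneg off speed 10 duplex full */
	model_cache_request(&mac, MODEL_AUTONEG_DISABLE, 10,
			    MODEL_DUPLEX_FULL);
	/* ethtool -s ethx autoneg on */
	model_cache_request(&mac, MODEL_AUTONEG_ENABLE, MODEL_SPEED_UNKNOWN,
			    MODEL_DUPLEX_UNKNOWN);
	return mac;
}
```

After the replay the cached request holds autoneg enabled together with the last valid speed and duplex, so a reset that restores from the cache never sees the placeholder value.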
Fixes: 05eb60e9648c ("net: hns3: using user configure after hardware reset")
Signed-off-by: Shuaisong Yang <yangshuaisong@h-partners.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
.../hisilicon/hns3/hns3pf/hclge_main.c | 22 +++++++++++++------
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index fb12ba77228c..d176100d3e4c 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2585,8 +2585,10 @@ static int hclge_cfg_mac_speed_dup_h(struct hnae3_handle *handle, int speed,
return ret;
hdev->hw.mac.req_lane_num = lane_num;
- hdev->hw.mac.req_speed = (u32)speed;
- hdev->hw.mac.req_duplex = duplex;
+ if (speed != SPEED_UNKNOWN)
+ hdev->hw.mac.req_speed = (u32)speed;
+ if (duplex != DUPLEX_UNKNOWN)
+ hdev->hw.mac.req_duplex = duplex;
return 0;
}
@@ -2617,6 +2619,7 @@ static int hclge_set_autoneg(struct hnae3_handle *handle, bool enable)
{
struct hclge_vport *vport = hclge_get_vport(handle);
struct hclge_dev *hdev = vport->back;
+ int ret;
if (!hdev->hw.mac.support_autoneg) {
if (enable) {
@@ -2628,7 +2631,10 @@ static int hclge_set_autoneg(struct hnae3_handle *handle, bool enable)
}
}
- return hclge_set_autoneg_en(hdev, enable);
+ ret = hclge_set_autoneg_en(hdev, enable);
+ if (!ret)
+ hdev->hw.mac.req_autoneg = enable;
+ return ret;
}
static int hclge_get_autoneg(struct hnae3_handle *handle)
@@ -3343,8 +3349,10 @@ hclge_set_phy_link_ksettings(struct hnae3_handle *handle,
return ret;
hdev->hw.mac.req_autoneg = cmd->base.autoneg;
- hdev->hw.mac.req_speed = cmd->base.speed;
- hdev->hw.mac.req_duplex = cmd->base.duplex;
+ if (cmd->base.speed != SPEED_UNKNOWN)
+ hdev->hw.mac.req_speed = cmd->base.speed;
+ if (cmd->base.duplex != DUPLEX_UNKNOWN)
+ hdev->hw.mac.req_duplex = cmd->base.duplex;
return 0;
}
@@ -9313,12 +9321,12 @@ static int hclge_set_autoneg_speed_dup(struct hclge_dev *hdev)
int ret;
if (hdev->hw.mac.support_autoneg) {
- ret = hclge_set_autoneg_en(hdev, hdev->hw.mac.autoneg);
+ ret = hclge_set_autoneg_en(hdev, hdev->hw.mac.req_autoneg);
if (ret)
return ret;
}
- if (!hdev->hw.mac.autoneg) {
+ if (!hdev->hw.mac.req_autoneg) {
ret = hclge_cfg_mac_speed_dup_hw(hdev, hdev->hw.mac.req_speed,
hdev->hw.mac.req_duplex,
hdev->hw.mac.req_lane_num);
--
2.33.0
^ permalink raw reply related
* [PATCH V2 net 0/4] net: hns3: fix configuration deadlocks and refactor link setup
From: Jijie Shao @ 2026-06-24 14:13 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, huangdonghua3, yangshuaisong,
netdev, linux-kernel, shaojijie
This patch series addresses a sequence of link configuration deadlocks
and parameter contamination issues in the hns3 network driver, which
typically occur during hardware resets or driver initialization under
specific user-configured scenarios.
The bugs stem from asynchronous discrepancies between the MAC state
machine and cached user requests during sudden hardware resets, leading
to invalid parameter combos or frozen registers.
Changes in V2:
- Squashed the former patch 5 ("fix init failure caused by lane_num
contamination") into patch 2, introducing the req_lane_num separation
directly where the helper is created. This avoids a bisect-time
regression where an intermediate commit could fail probe with an
inconsistent (speed, lane_num) pair.
- Added a NULL phydev guard in patch 1 (hclge_set_phy_link_ksettings)
to prevent a kernel panic when firmware reports PHY_INEXISTENT on a
copper port. The previous netdev->phydev check was lost during the
ethtool refactor.
- In patch 1, for copper ports where neither IMP firmware nor a kernel
PHY is available (e.g. PHY_INEXISTENT), hclge_set_phy_link_ksettings()
now returns -ENODEV, and hns3_set_link_ksettings() catches this error
to proceed to the existing MAC-level path (check_ksettings_param
-> cfg_mac_speed_dup_h), preserving compatibility with PHY-less copper
deployments.
- Preserved the 1000BASE-T forced-mode rejection in the kernel PHY
path inside the new hclge_set_phy_link_ksettings() wrapper, closing
a gap identified in community review.
- Fixed a link-loss regression in patch 4 where fiber ports in forced
mode would be configured with the static default_speed instead of the
firmware-probed SFP speed, by synchronizing req_speed from mac.speed
when req_autoneg is overridden to AUTONEG_DISABLE.
- Rewrote the commit message of patch 2 to accurately describe the
init/reset path asymmetry and the req_lane_num rationale.
The series is organized as follows:
- Patch 1 refactors the ethtool link settings entry path to unify copper
port handling (both native kernel PHY_LIB and firmware-controlled PHY)
and ensures req_xxx configurations are uniformly saved across all modes.
For PHY_INEXISTENT copper ports, -ENODEV is returned to allow fallthrough
to MAC-level configuration.
- Patch 2 refactors the MAC initialization by extracting the autoneg and
speed configuration logic out of hclge_mac_init() into a dedicated
helper function, and introduces req_lane_num to isolate the user-
requested lane count from firmware-overwritten mac.lane_num.
- Patch 3 fixes a permanent link-down deadlock after a reset by ensuring
that the driver caches and uses the user's intended autoneg/speed
settings (req_*) rather than unsynchronized runtime states or
SPEED_UNKNOWN tokens.
- Patch 4 fixes a link loss issue on optical ports during initialization
by differentiating autoneg default values between copper and fiber
media types, and synchronizing req_speed with the firmware-probed
SFP speed when forced mode is detected.
Shuaisong Yang (4):
net: hns3: unify copper port ksettings configuration path
net: hns3: refactor MAC autoneg and speed configuration
net: hns3: fix permanent link down deadlock after reset
net: hns3: differentiate autoneg default values between copper and
fiber
.../ethernet/hisilicon/hns3/hns3_ethtool.c | 31 +++--
.../hisilicon/hns3/hns3pf/hclge_main.c | 108 ++++++++++++++----
.../hisilicon/hns3/hns3pf/hclge_main.h | 1 +
3 files changed, 102 insertions(+), 38 deletions(-)
base-commit: d87363b0edfc7504ff2b144fe4cdd8154f90f42e
--
2.33.0
^ permalink raw reply
* [PATCH V2 net 2/4] net: hns3: refactor MAC autoneg and speed configuration
From: Jijie Shao @ 2026-06-24 14:13 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, huangdonghua3, yangshuaisong,
netdev, linux-kernel, shaojijie
In-Reply-To: <20260624141319.271439-1-shaojijie@huawei.com>
From: Shuaisong Yang <yangshuaisong@h-partners.com>
Extract the MAC autoneg and speed/duplex/lane configuration logic out
of hclge_mac_init() and encapsulate it into a new dedicated helper
function hclge_set_autoneg_speed_dup().
In the init path (hclge_init_ae_dev), this helper is now called after
hclge_update_port_info() so that firmware-reported autoneg values are
already populated before applying the link configuration.
Introduce a separate req_lane_num field in struct hclge_mac to isolate
the user-requested lane count from mac.lane_num, which firmware may
overwrite via hclge_get_sfp_info() with stale values from a prior link
lifecycle (e.g., lane_num=4 from 100G). During probe, req_lane_num is
initialized to 0, which instructs firmware to auto-select the correct
lane count for the current speed, rather than reusing the firmware-
reported mac.lane_num that may be inconsistent with the target speed.
This prevents probe failures from mismatched (speed, lane_num) pairs.
In the reset path (hclge_reset_ae_dev), it runs immediately after
hclge_mac_init(), using the previously cached req_* values to restore
the link without re-querying firmware.
Signed-off-by: Shuaisong Yang <yangshuaisong@h-partners.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
Changes in V2:
- Squashed the former patch 5 ("fix init failure caused by lane_num
contamination") into this patch. The req_lane_num separation is
introduced here to avoid a bisect-time regression where an
intermediate commit could fail probe with an inconsistent
(speed, lane_num) pair.
- Rewrote the commit message to accurately describe the init/reset
path asymmetry and the req_lane_num rationale.
---
.../hisilicon/hns3/hns3pf/hclge_main.c | 55 ++++++++++++++-----
.../hisilicon/hns3/hns3pf/hclge_main.h | 1 +
2 files changed, 42 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 9fe6bc02d71e..fb12ba77228c 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -1504,6 +1504,11 @@ static int hclge_configure(struct hclge_dev *hdev)
hdev->hw.mac.req_autoneg = AUTONEG_ENABLE;
hdev->hw.mac.req_duplex = DUPLEX_FULL;
+ /* When lane_num is 0, the firmware will automatically
+ * select the appropriate lane_num based on the speed.
+ */
+ hdev->hw.mac.req_lane_num = 0;
+
hclge_parse_link_mode(hdev, cfg.speed_ability);
hdev->hw.mac.max_speed = hclge_get_max_speed(cfg.speed_ability);
@@ -2579,6 +2584,7 @@ static int hclge_cfg_mac_speed_dup_h(struct hnae3_handle *handle, int speed,
if (ret)
return ret;
+ hdev->hw.mac.req_lane_num = lane_num;
hdev->hw.mac.req_speed = (u32)speed;
hdev->hw.mac.req_duplex = duplex;
@@ -2884,20 +2890,6 @@ static int hclge_mac_init(struct hclge_dev *hdev)
if (!test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
hdev->hw.mac.duplex = HCLGE_MAC_FULL;
- if (hdev->hw.mac.support_autoneg) {
- ret = hclge_set_autoneg_en(hdev, hdev->hw.mac.autoneg);
- if (ret)
- return ret;
- }
-
- if (!hdev->hw.mac.autoneg) {
- ret = hclge_cfg_mac_speed_dup_hw(hdev, hdev->hw.mac.req_speed,
- hdev->hw.mac.req_duplex,
- hdev->hw.mac.lane_num);
- if (ret)
- return ret;
- }
-
mac->link = 0;
if (mac->user_fec_mode & BIT(HNAE3_FEC_USER_DEF)) {
@@ -9316,6 +9308,27 @@ static int hclge_set_wol(struct hnae3_handle *handle,
return ret;
}
+static int hclge_set_autoneg_speed_dup(struct hclge_dev *hdev)
+{
+ int ret;
+
+ if (hdev->hw.mac.support_autoneg) {
+ ret = hclge_set_autoneg_en(hdev, hdev->hw.mac.autoneg);
+ if (ret)
+ return ret;
+ }
+
+ if (!hdev->hw.mac.autoneg) {
+ ret = hclge_cfg_mac_speed_dup_hw(hdev, hdev->hw.mac.req_speed,
+ hdev->hw.mac.req_duplex,
+ hdev->hw.mac.req_lane_num);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
{
struct pci_dev *pdev = ae_dev->pdev;
@@ -9477,6 +9490,13 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
if (ret)
goto err_ptp_uninit;
+ ret = hclge_set_autoneg_speed_dup(hdev);
+ if (ret) {
+ dev_err(&pdev->dev,
+ "failed to set autoneg speed duplex, ret = %d\n", ret);
+ goto err_ptp_uninit;
+ }
+
INIT_KFIFO(hdev->mac_tnl_log);
hclge_dcb_ops_set(hdev);
@@ -9807,6 +9827,13 @@ static int hclge_reset_ae_dev(struct hnae3_ae_dev *ae_dev)
return ret;
}
+ ret = hclge_set_autoneg_speed_dup(hdev);
+ if (ret) {
+ dev_err(&pdev->dev,
+ "failed to set autoneg speed duplex, ret = %d\n", ret);
+ return ret;
+ }
+
ret = hclge_tp_port_init(hdev);
if (ret) {
dev_err(&pdev->dev, "failed to init tp port, ret = %d\n",
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
index 87adeb64e6ea..7419481422c3 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
@@ -287,6 +287,7 @@ struct hclge_mac {
u8 support_autoneg;
u8 speed_type; /* 0: sfp speed, 1: active speed */
u8 lane_num;
+ u8 req_lane_num;
u32 speed;
u32 req_speed;
u32 max_speed;
--
2.33.0
^ permalink raw reply related
* [PATCH V2 net 1/4] net: hns3: unify copper port ksettings configuration path
From: Jijie Shao @ 2026-06-24 14:13 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, andrew+netdev, horms
Cc: shenjian15, liuyonglong, chenhao418, huangdonghua3, yangshuaisong,
netdev, linux-kernel, shaojijie
In-Reply-To: <20260624141319.271439-1-shaojijie@huawei.com>
From: Shuaisong Yang <yangshuaisong@h-partners.com>
Refactor hns3_set_link_ksettings() and hclge_set_phy_link_ksettings()
to unify the configuration path for copper ports.
Previously, netdevs with a native kernel phy attached bypassed the main
MAC parameter caching logic and returned early via
phy_ethtool_ksettings_set(). This prevented the driver from updating
hdev->hw.mac.req_xxx variables for kernel PHY setups, leaving them
out-of-sync during reset recovery.
Clean this up by routing all copper port configurations through
ops->set_phy_link_ksettings(), and perform driver-level or kernel-level
PHY arbitration inside hclge_set_phy_link_ksettings() via
hnae3_dev_phy_imp_supported(). This ensures that the user's intended link
profiles (req_speed, req_duplex, req_autoneg) are uniformly recorded
across all copper and fiber deployment topologies, laying the groundwork
for stable reset recovery.
For copper ports where neither IMP firmware nor a kernel PHY is available
(e.g. PHY_INEXISTENT), hclge_set_phy_link_ksettings() returns -ENODEV.
In hns3_set_link_ksettings(), this is caught so the configuration falls
through to the existing MAC-level path (check_ksettings_param ->
cfg_mac_speed_dup_h), preserving compatibility with PHY-less copper
deployments.
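The intended dispatch for copper ports can be modelled roughly as below. This is a userspace sketch with hypothetical names (enum model_path and the model_*() functions); it only mirrors the control flow described above and performs no hardware or PHY access.

```c
#include <errno.h>
#include <stdbool.h>

/* Which configuration path handled the request in this model. */
enum model_path {
	MODEL_PATH_IMP_FIRMWARE,
	MODEL_PATH_KERNEL_PHY,
	MODEL_PATH_MAC_LEVEL,
};

/* Hypothetical stand-in for hclge_set_phy_link_ksettings(): returns 0 and
 * reports the path taken, or -ENODEV when no PHY control is available.
 */
static int model_set_phy_link_ksettings(bool imp_supported, bool has_phydev,
					enum model_path *path)
{
	if (imp_supported) {
		*path = MODEL_PATH_IMP_FIRMWARE;
		return 0;
	}
	if (has_phydev) {
		*path = MODEL_PATH_KERNEL_PHY;
		return 0;
	}
	return -ENODEV;
}

/* Hypothetical stand-in for the copper branch of hns3_set_link_ksettings(). */
static int model_set_link_ksettings(bool imp_supported, bool has_phydev,
				    enum model_path *path)
{
	int ret = model_set_phy_link_ksettings(imp_supported, has_phydev, path);

	if (ret != -ENODEV)
		return ret;

	/* PHY_INEXISTENT: fall through to the MAC-level configuration. */
	*path = MODEL_PATH_MAC_LEVEL;
	return 0;
}
```

Only -ENODEV triggers the fallthrough in this model; any other error from the PHY path would be returned to the caller unchanged.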
Signed-off-by: Shuaisong Yang <yangshuaisong@h-partners.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
Changes in V2:
- Add NULL phydev guard in hclge_set_phy_link_ksettings() to prevent
kernel panic when firmware reports PHY_INEXISTENT on a copper port.
- For PHY_INEXISTENT copper ports, return -ENODEV from
hclge_set_phy_link_ksettings() and catch it in
hns3_set_link_ksettings() to fall through to the existing MAC-level
path, preserving compatibility with PHY-less copper deployments.
- Preserve the 1000BASE-T forced-mode (SPEED_1000 + AUTONEG_DISABLE)
rejection in the kernel PHY path, closing a validation gap.
---
.../ethernet/hisilicon/hns3/hns3_ethtool.c | 31 +++++++++----------
.../hisilicon/hns3/hns3pf/hclge_main.c | 28 +++++++++++++++--
2 files changed, 40 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index 9cb7ce9fd311..64bee0e78db3 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -811,12 +811,11 @@ static int hns3_get_link_ksettings(struct net_device *netdev,
}
static int hns3_check_ksettings_param(const struct net_device *netdev,
- const struct ethtool_link_ksettings *cmd)
+ const struct ethtool_link_ksettings *cmd,
+ u8 media_type)
{
struct hnae3_handle *handle = hns3_get_handle(netdev);
const struct hnae3_ae_ops *ops = hns3_get_ops(handle);
- u8 module_type = HNAE3_MODULE_TYPE_UNKNOWN;
- u8 media_type = HNAE3_MEDIA_TYPE_UNKNOWN;
u32 lane_num;
u8 autoneg;
u32 speed;
@@ -836,9 +835,6 @@ static int hns3_check_ksettings_param(const struct net_device *netdev,
return 0;
}
- if (ops->get_media_type)
- ops->get_media_type(handle, &media_type, &module_type);
-
if (cmd->base.duplex == DUPLEX_HALF &&
media_type != HNAE3_MEDIA_TYPE_COPPER) {
netdev_err(netdev,
@@ -863,6 +859,8 @@ static int hns3_set_link_ksettings(struct net_device *netdev,
struct hnae3_handle *handle = hns3_get_handle(netdev);
struct hnae3_ae_dev *ae_dev = hns3_get_ae_dev(handle);
const struct hnae3_ae_ops *ops = hns3_get_ops(handle);
+ u8 module_type = HNAE3_MODULE_TYPE_UNKNOWN;
+ u8 media_type = HNAE3_MEDIA_TYPE_UNKNOWN;
int ret;
/* Chip don't support this mode. */
@@ -878,22 +876,23 @@ static int hns3_set_link_ksettings(struct net_device *netdev,
cmd->base.autoneg, cmd->base.speed, cmd->base.duplex,
cmd->lanes);
- /* Only support ksettings_set for netdev with phy attached for now */
- if (netdev->phydev) {
- if (cmd->base.speed == SPEED_1000 &&
- cmd->base.autoneg == AUTONEG_DISABLE)
- return -EINVAL;
+ if (!ops->get_media_type)
+ return -EOPNOTSUPP;
+ ops->get_media_type(handle, &media_type, &module_type);
- return phy_ethtool_ksettings_set(netdev->phydev, cmd);
- } else if (test_bit(HNAE3_DEV_SUPPORT_PHY_IMP_B, ae_dev->caps) &&
- ops->set_phy_link_ksettings) {
- return ops->set_phy_link_ksettings(handle, cmd);
+ if (media_type == HNAE3_MEDIA_TYPE_COPPER) {
+ if (!ops->set_phy_link_ksettings)
+ return -EOPNOTSUPP;
+ ret = ops->set_phy_link_ksettings(handle, cmd);
+ if (ret != -ENODEV)
+ return ret;
+ /* PHY_INEXISTENT, use MAC-level configuration */
}
if (ae_dev->dev_version < HNAE3_DEVICE_VERSION_V2)
return -EOPNOTSUPP;
- ret = hns3_check_ksettings_param(netdev, cmd);
+ ret = hns3_check_ksettings_param(netdev, cmd, media_type);
if (ret)
return ret;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 2f1984930da2..9fe6bc02d71e 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3285,8 +3285,8 @@ static int hclge_get_phy_link_ksettings(struct hnae3_handle *handle,
}
static int
-hclge_set_phy_link_ksettings(struct hnae3_handle *handle,
- const struct ethtool_link_ksettings *cmd)
+hclge_ethtool_ksettings_set(struct hnae3_handle *handle,
+ const struct ethtool_link_ksettings *cmd)
{
struct hclge_desc desc[HCLGE_PHY_LINK_SETTING_BD_NUM];
struct hclge_vport *vport = hclge_get_vport(handle);
@@ -3327,10 +3327,32 @@ hclge_set_phy_link_ksettings(struct hnae3_handle *handle,
return ret;
}
+ linkmode_copy(hdev->hw.mac.advertising, cmd->link_modes.advertising);
+ return 0;
+}
+
+static int
+hclge_set_phy_link_ksettings(struct hnae3_handle *handle,
+ const struct ethtool_link_ksettings *cmd)
+{
+ struct hclge_vport *vport = hclge_get_vport(handle);
+ struct hclge_dev *hdev = vport->back;
+ int ret = -ENODEV;
+
+ if (hnae3_dev_phy_imp_supported(hdev)) {
+ ret = hclge_ethtool_ksettings_set(handle, cmd);
+ } else if (handle->netdev->phydev) {
+ if (cmd->base.speed == SPEED_1000 &&
+ cmd->base.autoneg == AUTONEG_DISABLE)
+ return -EINVAL;
+ ret = phy_ethtool_ksettings_set(handle->netdev->phydev, cmd);
+ }
+ if (ret)
+ return ret;
+
hdev->hw.mac.req_autoneg = cmd->base.autoneg;
hdev->hw.mac.req_speed = cmd->base.speed;
hdev->hw.mac.req_duplex = cmd->base.duplex;
- linkmode_copy(hdev->hw.mac.advertising, cmd->link_modes.advertising);
return 0;
}
--
2.33.0
^ permalink raw reply related
* Re: [BUG] KFENCE: use-after-free read in udp_tunnel_nic_device_sync_work
From: Eric Dumazet @ 2026-06-24 14:10 UTC (permalink / raw)
To: Sam Sun
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, netdev,
linux-kernel, syzkaller
In-Reply-To: <CANn89iJ5wKuvKy=Ed-JkeyzKYaEoA_8S=mEMBicMjEPxLCY0Vw@mail.gmail.com>
On Wed, Jun 24, 2026 at 6:59 AM Eric Dumazet <edumazet@google.com> wrote:
> Oh well.
>
> u8 need_sync:1;
> u8 need_replay:1;
> u8 work_pending:1;
>
> These bitfields are not safe, obviously :/
>
> Time to convert them to atomic bit operations.
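For context, adjacent bitfields share one storage unit, so a plain store to one of them is a read-modify-write of the whole unit and can lose a concurrent update to a neighbouring bit. A userspace analogue of the atomic bit operations, sketched with C11 atomics, could look like this. The model_*() helpers and MODEL_* names are hypothetical and are not the kernel's set_bit()/clear_bit()/test_bit() implementations.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers, mirroring the three flags quoted above. */
enum {
	MODEL_NEED_SYNC = 0,
	MODEL_NEED_REPLAY = 1,
	MODEL_WORK_PENDING = 2,
};

/* Atomically set bit nr; other bits in the word are left untouched. */
static void model_set_bit(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_or(addr, 1UL << nr);
}

/* Atomically clear bit nr; other bits in the word are left untouched. */
static void model_clear_bit(unsigned int nr, atomic_ulong *addr)
{
	atomic_fetch_and(addr, ~(1UL << nr));
}

/* Read bit nr from an atomic snapshot of the word. */
static bool model_test_bit(unsigned int nr, atomic_ulong *addr)
{
	return (atomic_load(addr) >> nr) & 1UL;
}
```

Each helper is a single atomic read-modify-write on the word, so concurrent updates to different bits cannot overwrite each other.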
Can you try:
diff --git a/net/ipv4/udp_tunnel_nic.c b/net/ipv4/udp_tunnel_nic.c
index 9944ed923ddfd10f9adf6ad788c0740daeaf2adb..939d6f656bb71814718bc3bf84be665adad27e4b 100644
--- a/net/ipv4/udp_tunnel_nic.c
+++ b/net/ipv4/udp_tunnel_nic.c
@@ -30,9 +30,7 @@ struct udp_tunnel_nic_table_entry {
* @work: async work for talking to hardware from process context
* @dev: netdev pointer
* @lock: protects all fields
- * @need_sync: at least one port start changed
- * @need_replay: space was freed, we need a replay of all ports
- * @work_pending: @work is currently scheduled
+ * @flags: sync, replay, pending flags
* @n_tables: number of tables under @entries
* @missed: bitmap of tables which overflown
* @entries: table of tables of ports currently offloaded
@@ -44,9 +42,10 @@ struct udp_tunnel_nic {
struct mutex lock;
- u8 need_sync:1;
- u8 need_replay:1;
- u8 work_pending:1;
+ unsigned long flags;
+#define UDP_TUNNEL_NIC_NEED_SYNC 0
+#define UDP_TUNNEL_NIC_NEED_REPLAY 1
+#define UDP_TUNNEL_NIC_WORK_PENDING 2
unsigned int n_tables;
unsigned long missed;
@@ -116,7 +115,7 @@ udp_tunnel_nic_entry_queue(struct udp_tunnel_nic *utn,
unsigned int flag)
{
entry->flags |= flag;
- utn->need_sync = 1;
+ set_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags);
}
static void
@@ -283,7 +282,7 @@ udp_tunnel_nic_device_sync_by_table(struct net_device *dev,
static void
__udp_tunnel_nic_device_sync(struct net_device *dev, struct udp_tunnel_nic *utn)
{
- if (!utn->need_sync)
+ if (!test_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags))
return;
if (dev->udp_tunnel_nic_info->sync_table)
@@ -291,21 +290,24 @@ __udp_tunnel_nic_device_sync(struct net_device *dev, struct udp_tunnel_nic *utn)
else
udp_tunnel_nic_device_sync_by_port(dev, utn);
- utn->need_sync = 0;
+ clear_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags);
/* Can't replay directly here, in case we come from the tunnel driver's
* notification - trying to replay may deadlock inside tunnel driver.
*/
- utn->need_replay = udp_tunnel_nic_should_replay(dev, utn);
+ if (udp_tunnel_nic_should_replay(dev, utn))
+ set_bit(UDP_TUNNEL_NIC_NEED_REPLAY, &utn->flags);
+ else
+ clear_bit(UDP_TUNNEL_NIC_NEED_REPLAY, &utn->flags);
}
static void
udp_tunnel_nic_device_sync(struct net_device *dev, struct udp_tunnel_nic *utn)
{
- if (!utn->need_sync)
+ if (!test_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags))
return;
+ set_bit(UDP_TUNNEL_NIC_WORK_PENDING, &utn->flags);
queue_work(udp_tunnel_nic_workqueue, &utn->work);
- utn->work_pending = 1;
}
static bool
@@ -348,7 +350,7 @@ udp_tunnel_nic_has_collision(struct net_device *dev, struct udp_tunnel_nic *utn,
if (!udp_tunnel_nic_entry_is_free(entry) &&
entry->port == ti->port &&
entry->type != ti->type) {
- __set_bit(i, &utn->missed);
+ set_bit(i, &utn->missed);
return true;
}
}
@@ -483,7 +485,7 @@ udp_tunnel_nic_add_new(struct net_device *dev, struct udp_tunnel_nic *utn,
* are no devices currently which have multiple tables accepting
* the same tunnel type, and false positives are okay.
*/
- __set_bit(i, &utn->missed);
+ set_bit(i, &utn->missed);
}
return false;
@@ -552,7 +554,7 @@ static void __udp_tunnel_nic_reset_ntf(struct net_device *dev)
mutex_lock(&utn->lock);
- utn->need_sync = false;
+ clear_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags);
for (i = 0; i < utn->n_tables; i++)
for (j = 0; j < info->tables[i].n_entries; j++) {
struct udp_tunnel_nic_table_entry *entry;
@@ -696,8 +698,8 @@ udp_tunnel_nic_flush(struct net_device *dev, struct udp_tunnel_nic *utn)
for (i = 0; i < utn->n_tables; i++)
memset(utn->entries[i], 0, array_size(info->tables[i].n_entries,
sizeof(**utn->entries)));
- WARN_ON(utn->need_sync);
- utn->need_replay = 0;
+ WARN_ON(test_bit(UDP_TUNNEL_NIC_NEED_SYNC, &utn->flags));
+ clear_bit(UDP_TUNNEL_NIC_NEED_REPLAY, &utn->flags);
}
static void
@@ -713,8 +715,8 @@ udp_tunnel_nic_replay(struct net_device *dev, struct udp_tunnel_nic *utn)
for (i = 0; i < utn->n_tables; i++)
for (j = 0; j < info->tables[i].n_entries; j++)
udp_tunnel_nic_entry_freeze_used(&utn->entries[i][j]);
- utn->missed = 0;
- utn->need_replay = 0;
+ bitmap_zero(&utn->missed, UDP_TUNNEL_NIC_MAX_TABLES);
+ clear_bit(UDP_TUNNEL_NIC_NEED_REPLAY, &utn->flags);
if (!info->shared) {
udp_tunnel_get_rx_info(dev);
@@ -736,10 +738,10 @@ static void udp_tunnel_nic_device_sync_work(struct work_struct *work)
rtnl_lock();
mutex_lock(&utn->lock);
- utn->work_pending = 0;
+ clear_bit(UDP_TUNNEL_NIC_WORK_PENDING, &utn->flags);
__udp_tunnel_nic_device_sync(utn->dev, utn);
- if (utn->need_replay)
+ if (test_bit(UDP_TUNNEL_NIC_NEED_REPLAY, &utn->flags))
udp_tunnel_nic_replay(utn->dev, utn);
mutex_unlock(&utn->lock);
@@ -866,6 +868,11 @@ udp_tunnel_nic_unregister(struct net_device *dev, struct udp_tunnel_nic *utn)
udp_tunnel_nic_lock(dev);
+ if (test_bit(UDP_TUNNEL_NIC_WORK_PENDING, &utn->flags)) {
+ udp_tunnel_nic_unlock(dev);
+ return;
+ }
+
/* For a shared table remove this dev from the list of sharing devices
* and if there are other devices just detach.
*/
@@ -901,12 +908,6 @@ udp_tunnel_nic_unregister(struct net_device *dev, struct udp_tunnel_nic *utn)
udp_tunnel_nic_flush(dev, utn);
udp_tunnel_nic_unlock(dev);
- /* Wait for the work to be done using the state, netdev core will
- * retry unregister until we give up our reference on this device.
- */
- if (utn->work_pending)
- return;
-
udp_tunnel_nic_free(utn);
release_dev:
dev->udp_tunnel_nic = NULL;
^ permalink raw reply
* Re: [PATCH v5 0/9] Fix missing fops.owner in Rust DRM/misc abstractions
From: Miguel Ojeda @ 2026-06-24 14:02 UTC (permalink / raw)
To: Petr Pavlu
Cc: Alvin Sun, Miguel Ojeda, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross, Danilo Krummrich, Luis Chamberlain, Daniel Gomez,
Sami Tolvanen, Aaron Tomlin, Greg Kroah-Hartman,
Rafael J. Wysocki, David Airlie, Simona Vetter, Daniel Almeida,
Arnd Bergmann, Brendan Higgins, David Gow, Rae Moar, Breno Leitao,
Jens Axboe, Dave Ertman, Leon Romanovsky, Igor Korotin,
FUJITA Tomonori, Bjorn Helgaas, Krzysztof Wilczyński,
Arve Hjønnevåg, Todd Kjos, Christian Brauner,
Carlos Llamas, rust-for-linux, linux-modules, driver-core,
dri-devel, nova-gpu, linux-kselftest, kunit-dev, linux-block,
linux-kernel, netdev, linux-pci
In-Reply-To: <8ea21b29-9baf-4926-a16f-7d21c5a1a1b8@suse.com>
On Wed, Jun 24, 2026 at 3:23 PM Petr Pavlu <petr.pavlu@suse.com> wrote:
>
> I would only suggest adding the new file rust/kernel/module.rs in
> patch #1 under the MODULE SUPPORT entry in the MAINTAINERS file,
> similarly to the other module-related Rust code, so that the module
> maintainers are emailed when changes to this file are proposed. I think
> you can change the existing 'F: rust/kernel/module_param.rs' to
> 'F: rust/kernel/module*.rs'.
That would be great, yes -- thanks!
Cheers,
Miguel
^ permalink raw reply
* Re: [BUG] KFENCE: use-after-free read in udp_tunnel_nic_device_sync_work
From: Eric Dumazet @ 2026-06-24 13:59 UTC (permalink / raw)
To: Sam Sun
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, netdev,
linux-kernel, syzkaller
In-Reply-To: <CAEkJfYN4Uep_WEzZyaDGkW5p4rW+oVRqq2DgBt2xiYp9ARt0GA@mail.gmail.com>
On Wed, Jun 24, 2026 at 6:42 AM Sam Sun <samsun1006219@gmail.com> wrote:
>
> On Wed, Jun 24, 2026 at 6:01 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Wed, Jun 24, 2026 at 2:01 AM Yue Sun <samsun1006219@gmail.com> wrote:
> > >
> > > Hello,
> > >
> > > I hit a reproducible use-after-free in the UDP tunnel NIC offload work item.
> > > The original local crash was reported by KFENCE as:
> > >
> > > KFENCE: use-after-free read in udp_tunnel_nic_device_sync_work
> > >
> > > On current mainline, the C reproducer below triggers the same lifetime bug,
> > > reported by KASAN before KFENCE samples the object:
> > >
> > > BUG: KASAN: slab-use-after-free in __mutex_lock
> > > Workqueue: udp_tunnel_nic udp_tunnel_nic_device_sync_work
> > >
> > > Tested kernel:
> > >
> > > 840ef6c78e6a ("Merge tag 'nfs-for-7.2-1' of git://git.linux-nfs.org/projects/anna/linux-nfs")
> > > Linux 7.1.0-11240-g840ef6c78e6a #31 SMP PREEMPT_DYNAMIC
> > >
> >
> >
> > Thanks for the report.
> >
> > Can you test the following patch?
> >
> > diff --git a/net/ipv4/udp_tunnel_nic.c b/net/ipv4/udp_tunnel_nic.c
> > index 9944ed923ddfd10f9adf6ad788c0740daeaf2adb..c5f8d2f9d325de8f4d2247ddaa52e33378851857
> > 100644
> > --- a/net/ipv4/udp_tunnel_nic.c
> > +++ b/net/ipv4/udp_tunnel_nic.c
> > @@ -304,8 +304,8 @@ udp_tunnel_nic_device_sync(struct net_device *dev,
> > struct udp_tunnel_nic *utn)
> > if (!utn->need_sync)
> > return;
> >
> > - queue_work(udp_tunnel_nic_workqueue, &utn->work);
> > utn->work_pending = 1;
> > + queue_work(udp_tunnel_nic_workqueue, &utn->work);
> > }
> >
> > static bool
> > @@ -866,6 +866,11 @@ udp_tunnel_nic_unregister(struct net_device *dev,
> > struct udp_tunnel_nic *utn)
> >
> > udp_tunnel_nic_lock(dev);
> >
> > + if (utn->work_pending) {
> > + udp_tunnel_nic_unlock(dev);
> > + return;
> > + }
> > +
> > /* For a shared table remove this dev from the list of sharing devices
> > * and if there are other devices just detach.
> > */
> > @@ -901,12 +906,6 @@ udp_tunnel_nic_unregister(struct net_device *dev,
> > struct udp_tunnel_nic *utn)
> > udp_tunnel_nic_flush(dev, utn);
> > udp_tunnel_nic_unlock(dev);
> >
> > - /* Wait for the work to be done using the state, netdev core will
> > - * retry unregister until we give up our reference on this device.
> > - */
> > - if (utn->work_pending)
> > - return;
> > -
> > udp_tunnel_nic_free(utn);
> > release_dev:
> > dev->udp_tunnel_nic = NULL;
>
> I tested the patch, but unfortunately the C reproducer still triggers the
> same use-after-free for me.
>
> Tested on top of:
>
> 840ef6c78e6a ("Merge tag 'nfs-for-7.2-1' of
> git://git.linux-nfs.org/projects/anna/linux-nfs")
>
> I booted the kernel with KASAN/KFENCE enabled and:
>
> panic_on_warn=1 panic_on_oops=1 kfence.sample_interval=1
>
> Then I ran the same C reproducer:
>
> timeout -k 10 360 /root/repro
>
> The VM panicked after about 236 seconds:
>
> [ 236.471119][ T58] BUG: KASAN: slab-use-after-free in
> __mutex_lock+0x16d0/0x1d80
> [ 236.473404][ T58] Read of size 8 at addr ff11000076a63ea8 by task
> kworker/u16:3/58
> [ 236.476455][ T58] Hardware name: QEMU Standard PC (i440FX + PIIX,
> 1996), BIOS 1.15.0-1 04/01/2014
> [ 236.476478][ T58] Workqueue: udp_tunnel_nic udp_tunnel_nic_device_sync_work
> [ 236.476787][ T58] __mutex_lock+0x16d0/0x1d80
> [ 236.477020][ T58] udp_tunnel_nic_device_sync_work+0x32/0x9c0
> [ 236.477068][ T58] process_one_work+0x9de/0x1bf0
>
> The allocation/free stacks are still the same shape:
> ```
> Allocated by task 11563:
> __kmalloc_noprof
> udp_tunnel_nic_netdevice_event+0x12d8/0x1e80
> register_netdevice
> nsim_create
> nsim_dev_reload_up
> devlink_reload
>
> Freed by task 11609:
> kfree
> udp_tunnel_nic_netdevice_event+0xc26/0x1e80
> unregister_netdevice_many_notify
> nsim_destroy
> nsim_dev_reload_down
> devlink_reload
>
> Last potentially related work creation:
> queue_work_on
> __udp_tunnel_nic_del_port+0x2af/0x320
> udp_tunnel_notify_del_rx_port
> __geneve_sock_release.part.0
> geneve_stop
>
> Second to last potentially related work creation:
> queue_work_on
> __udp_tunnel_nic_add_port+0x6ec/0xd70
> udp_tunnel_notify_add_rx_port
> geneve_open
> ```
>
> My read of the patch is that it closes the small window where queue_work()
> can publish the work before utn->work_pending is set, and it also prevents
> udp_tunnel_nic_unregister() from flushing/freeing the object when
> work_pending is already set.
>
> However, the test above suggests that work_pending still does not fully
> protect the lifetime of struct udp_tunnel_nic. The crashing work was still
> queued through udp_tunnel_nic_device_sync() at line 308, so the patched path
> was exercised. One suspicious point is that udp_tunnel_nic_device_sync_work()
> clears utn->work_pending at the beginning of the worker, while the same work
> item can still interact with replay/add/del-port state. The reproducer can
> still end up with udp_tunnel_nic_unregister() freeing utn while a
> udp_tunnel_nic_device_sync_work item later runs and dereferences the freed
> utn->lock.
>
> So this patch does not seem to be sufficient for this reproducer.
>
Oh well.
u8 need_sync:1;
u8 need_replay:1;
u8 work_pending:1;
These bitfields are not safe, obviously :/
Time to convert them to atomic bit operations.
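
To illustrate the general hazard (not a claim about the root cause of this
report): adjacent one-bit bitfields share a storage unit, so writing one of
them is a plain read-modify-write of the whole byte and can lose a concurrent
update to a neighbouring field. A minimal user-space sketch using C11 atomics
follows. The bit numbers mirror the three fields quoted above; the names
`struct flags_word`, `flag_set()`, `flag_clear()` and `flag_test()` are
hypothetical helpers and are not the kernel's set_bit()/clear_bit()/test_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers named after the three fields above (illustrative only). */
enum {
	NEED_SYNC    = 0,
	NEED_REPLAY  = 1,
	WORK_PENDING = 2,
};

/* Hypothetical user-space stand-in for an "unsigned long flags" word. */
struct flags_word {
	atomic_ulong bits;
};

#define FLAG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set one bit; the other bits in the word are not disturbed. */
static void flag_set(struct flags_word *f, unsigned int nr)
{
	assert(nr < FLAG_BITS);
	atomic_fetch_or(&f->bits, 1UL << nr);
}

/* Atomically clear one bit. */
static void flag_clear(struct flags_word *f, unsigned int nr)
{
	assert(nr < FLAG_BITS);
	atomic_fetch_and(&f->bits, ~(1UL << nr));
}

/* Read one bit from an atomic load of the whole word. */
static bool flag_test(struct flags_word *f, unsigned int nr)
{
	assert(nr < FLAG_BITS);
	return (atomic_load(&f->bits) & (1UL << nr)) != 0;
}
```

With this shape, one thread calling flag_set(&f, WORK_PENDING) cannot undo
another thread's flag_clear(&f, NEED_SYNC), which a bitfield store could.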
^ permalink raw reply
* [PATCH net v2] tipc: fix out-of-bounds read in broadcast Gap ACK blocks
From: Samuel Page @ 2026-06-24 13:56 UTC (permalink / raw)
To: Jon Maloy
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Tung Quang Nguyen, netdev, tipc-discussion,
linux-kernel, Samuel Page
A broadcast PROTOCOL/STATE_MSG can carry a Gap ACK blocks record in its
data area. tipc_get_gap_ack_blks() only verifies that the record's len
field is self-consistent with its ugack_cnt/bgack_cnt counts
(sz == struct_size(p, gacks, ugack_cnt + bgack_cnt)); it does not check
that the record actually fits in the message data area, msg_data_sz().
The unicast caller tipc_link_proto_rcv() bounds it ("if (glen > dlen)
break;"), but the broadcast caller tipc_bcast_sync_rcv() discards the
returned size, so tipc_link_advance_transmq() copies the record off the
receive skb with an attacker-controlled count:
this_ga = kmemdup(ga, struct_size(ga, gacks, ga->bgack_cnt),
GFP_ATOMIC);
A TIPC neighbour that negotiated TIPC_GAP_ACK_BLOCK triggers it with one
ordinary broadcast STATE_MSG (msg_bc_ack_invalid() clear), sized so its
data area is short, carrying a Gap ACK record with len = 0x400,
bgack_cnt = 0xff and ugack_cnt = 0. len then equals
struct_size(p, gacks, 255), so the consistency check passes and ga is
non-NULL; kmemdup() reads struct_size(ga, gacks, 255) = 1024 bytes out
of the much smaller skb:
BUG: KASAN: slab-out-of-bounds in kmemdup_noprof+0x48/0x60
Read of size 1024 at addr ffff0000c7030d38 by task poc864/69
Call trace:
kmemdup_noprof+0x48/0x60
tipc_link_advance_transmq+0x86c/0xb80
tipc_link_bc_ack_rcv+0x19c/0x1e0
tipc_bcast_sync_rcv+0x1c4/0x2c4
tipc_rcv+0x85c/0x1340
tipc_l2_rcv_msg+0xac/0x104
The buggy address belongs to the object at ffff0000c7030d00
which belongs to the cache skbuff_small_head of size 704
The buggy address is located 56 bytes inside of
allocated 704-byte region [ffff0000c7030d00, ffff0000c7030fc0)
The copied-out bytes are subsequently consumed as gap/ack values, but
the read is already out of bounds at the kmemdup() regardless of how
they are used.
The unicast STATE path drops such a message: "if (glen > dlen) break;"
skips the rest of STATE_MSG handling and the skb is freed. Make the
broadcast path drop it too. tipc_bcast_sync_rcv() now bounds the record
against msg_data_sz() and, when it does not fit, reports it back through
tipc_node_bc_sync_rcv() to tipc_rcv() so the skb is discarded rather than
processed. ga is not cleared on this path: ga == NULL already means
"legacy peer without Selective ACK", a distinct legitimate state.
Fixes: d7626b5acff9 ("tipc: introduce Gap ACK blocks for broadcast link")
Cc: stable@vger.kernel.org
Assisted-by: Bynario AI
Signed-off-by: Samuel Page <sam@bynar.io>
---
v2, per review of v1 [1]:
- v1 cleared 'ga' on an oversized Gap ACK record, which let the malformed
STATE message be processed as a legacy (no Selective ACK) one rather than
dropped. v2 drops it instead, matching the unicast STATE path:
tipc_bcast_sync_rcv() reports the bad record through a bool output
parameter, propagated by tipc_node_bc_sync_rcv() to tipc_rcv(), which
discards the skb.
- v1 touched only net/tipc/bcast.c; v2 also touches net/tipc/{bcast.h,node.c}.
[1] https://lore.kernel.org/netdev/20260623134137.3641275-1-sam@bynar.io/
For reference, an earlier thread proposed validating inside
tipc_get_gap_ack_blks():
https://lore.kernel.org/netdev/1316452e465e9a96fce44ec15130a14f3872149f.1775809727.git.caoruide123@gmail.com/
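
The difference between the self-consistency check and the bounds check
described above can be sketched in user-space C. This is an illustration
under stated assumptions, not the kernel code: the struct layout (a 4-byte
header followed by 4-byte gap/ack entries) is assumed so that it reproduces
the 1024-byte figure from the report, byte-order conversion of `len` is
ignored, and `rec_size()`, `rec_self_consistent()` and `rec_fits()` are
hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed entry layout: two 16-bit fields, 4 bytes per entry. */
struct gap_ack {
	uint16_t ack;
	uint16_t gap;
};

/* Assumed record layout: 4-byte header plus a flexible array of entries. */
struct gap_ack_blks {
	uint16_t len;
	uint8_t ugack_cnt;
	uint8_t bgack_cnt;
	struct gap_ack gacks[];
};

static_assert(sizeof(struct gap_ack) == 4, "entry is assumed to be 4 bytes");

/* Size of a record holding cnt entries (what struct_size() would compute). */
static size_t rec_size(unsigned int cnt)
{
	return sizeof(struct gap_ack_blks) + (size_t)cnt * sizeof(struct gap_ack);
}

/* The existing check: len agrees with the record's own counts. */
static bool rec_self_consistent(uint16_t len, uint8_t ugack_cnt,
				uint8_t bgack_cnt)
{
	return len == rec_size((unsigned int)ugack_cnt + bgack_cnt);
}

/* The additional check: the record must fit in the message data area. */
static bool rec_fits(uint16_t len, size_t data_sz)
{
	return len <= data_sz;
}
```

For the record in the report (len = 0x400, ugack_cnt = 0, bgack_cnt = 0xff),
rec_self_consistent() is true because rec_size(255) is 1024, while rec_fits()
is false for any data area shorter than 1024 bytes; a 16-byte data area, for
example, is rejected only by the second check.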
net/tipc/bcast.c | 22 ++++++++++++++--------
net/tipc/bcast.h | 2 +-
net/tipc/node.c | 13 ++++++++++---
3 files changed, 25 insertions(+), 12 deletions(-)
diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index 76a1585d3f6b..08637c3c9db0 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -497,11 +497,12 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
*/
int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
struct tipc_msg *hdr,
- struct sk_buff_head *retrq)
+ struct sk_buff_head *retrq, bool *valid)
{
struct sk_buff_head *inputq = &tipc_bc_base(net)->inputq;
struct tipc_gap_ack_blks *ga;
struct sk_buff_head xmitq;
+ u16 glen;
int rc = 0;
__skb_queue_head_init(&xmitq);
@@ -510,13 +511,18 @@ int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
if (msg_type(hdr) != STATE_MSG) {
tipc_link_bc_init_rcv(l, hdr);
} else if (!msg_bc_ack_invalid(hdr)) {
- tipc_get_gap_ack_blks(&ga, l, hdr, false);
- if (!sysctl_tipc_bc_retruni)
- retrq = &xmitq;
- rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr),
- msg_bc_gap(hdr), ga, &xmitq,
- retrq);
- rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq);
+ glen = tipc_get_gap_ack_blks(&ga, l, hdr, false);
+ if (glen > msg_data_sz(hdr)) {
+ /* Malformed Gap ACK blocks; caller drops the msg */
+ *valid = false;
+ } else {
+ if (!sysctl_tipc_bc_retruni)
+ retrq = &xmitq;
+ rc = tipc_link_bc_ack_rcv(l, msg_bcast_ack(hdr),
+ msg_bc_gap(hdr), ga, &xmitq,
+ retrq);
+ rc |= tipc_link_bc_sync_rcv(l, hdr, &xmitq);
+ }
}
tipc_bcast_unlock(net);
diff --git a/net/tipc/bcast.h b/net/tipc/bcast.h
index 2d9352dc7b0e..55d17b5413e1 100644
--- a/net/tipc/bcast.h
+++ b/net/tipc/bcast.h
@@ -97,7 +97,7 @@ void tipc_bcast_ack_rcv(struct net *net, struct tipc_link *l,
struct tipc_msg *hdr);
int tipc_bcast_sync_rcv(struct net *net, struct tipc_link *l,
struct tipc_msg *hdr,
- struct sk_buff_head *retrq);
+ struct sk_buff_head *retrq, bool *valid);
int tipc_nl_add_bc_link(struct net *net, struct tipc_nl_msg *msg,
struct tipc_link *bcl);
int tipc_nl_bc_link_set(struct net *net, struct nlattr *attrs[]);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 97aa970a0d83..2887f94ee28f 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1831,12 +1831,13 @@ static void tipc_node_mcast_rcv(struct tipc_node *n)
}
static void tipc_node_bc_sync_rcv(struct tipc_node *n, struct tipc_msg *hdr,
- int bearer_id, struct sk_buff_head *xmitq)
+ int bearer_id, struct sk_buff_head *xmitq,
+ bool *valid)
{
struct tipc_link *ucl;
int rc;
- rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr, xmitq);
+ rc = tipc_bcast_sync_rcv(n->net, n->bc_entry.link, hdr, xmitq, valid);
if (rc & TIPC_LINK_DOWN_EVT) {
tipc_node_reset_links(n);
@@ -2140,12 +2141,18 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, struct tipc_bearer *b)
/* Ensure broadcast reception is in synch with peer's send state */
if (unlikely(usr == LINK_PROTOCOL)) {
+ bool valid = true;
+
if (unlikely(skb_linearize(skb))) {
tipc_node_put(n);
goto discard;
}
hdr = buf_msg(skb);
- tipc_node_bc_sync_rcv(n, hdr, bearer_id, &xmitq);
+ tipc_node_bc_sync_rcv(n, hdr, bearer_id, &xmitq, &valid);
+ if (!valid) {
+ tipc_node_put(n);
+ goto discard;
+ }
} else if (unlikely(tipc_link_acked(n->bc_entry.link) != bc_ack)) {
tipc_bcast_ack_rcv(net, n->bc_entry.link, hdr);
}
base-commit: a986fde914d88af47eb78fd29c5d1af7952c3500
--
2.54.0
^ permalink raw reply related
* Re: [PATCH net-next] selftests: tls: size splice_short pipe by page size
From: Nirmoy Das @ 2026-06-24 13:46 UTC (permalink / raw)
To: Simon Horman
Cc: Jakub Kicinski, Sabrina Dubroca, John Fastabend, netdev,
linux-kernel
In-Reply-To: <20260624125132.GN827683@horms.kernel.org>
On 24.06.26 15:51, Simon Horman wrote:
> On Mon, Jun 22, 2026 at 01:28:47PM -0700, Nirmoy Das wrote:
>> splice_short grows its pipe with (MAX_FRAGS + 1) * 0x1000 so it can
>> queue one short vmsplice() buffer for each fragment before draining the
>> pipe. That assumes 4K pipe buffers.
>>
>> On 64K-page kernels the request is rounded to 262144 bytes, which
>> provides only four pipe buffers. The fifth one-byte vmsplice() blocks in
>> pipe_wait_writable and the test times out before it reaches the TLS path.
>>
>> Request enough bytes for the same number of pipe buffers using the
>> runtime page size, and assert that the kernel granted at least that much.
>> If an unprivileged run cannot raise the pipe above the system
>> pipe-max-size limit, skip the test because it cannot exercise the
>> intended path.
>>
>> Fixes: 3667e9b442b9 ("selftests: tls: add test for short splice due to full skmsg")
>> Assisted-by: Codex:gpt-5
>> Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
> The nit below notwithstanding, this looks good to me.
>
> Reviewed-by: Simon Horman <horms@kernel.org>
>
>> ---
>> tools/testing/selftests/net/tls.c | 9 ++++++++-
>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
>> index 30a236b8e9f73..e3bf4ade0f770 100644
>> --- a/tools/testing/selftests/net/tls.c
>> +++ b/tools/testing/selftests/net/tls.c
>> @@ -997,6 +997,8 @@ TEST_F(tls, splice_short)
>> char sendbuf[0x100];
>> char sendchar = 'S';
>> int pipefds[2];
>> + int pipe_sz;
>> + int ret;
>> int i;
>>
>> sendchar_iov.iov_base = &sendchar;
>> @@ -1005,7 +1007,12 @@ TEST_F(tls, splice_short)
>> memset(sendbuf, 's', sizeof(sendbuf));
>>
>> ASSERT_GE(pipe2(pipefds, O_NONBLOCK), 0);
>> - ASSERT_GE(fcntl(pipefds[0], F_SETPIPE_SZ, (MAX_FRAGS + 1) * 0x1000), 0);
>> + pipe_sz = (MAX_FRAGS + 1) * getpagesize();
>> + ret = fcntl(pipefds[0], F_SETPIPE_SZ, pipe_sz);
>> + if (ret < 0 && errno == EPERM)
>> + SKIP(return, "insufficient pipe capacity");
>> + ASSERT_GE(ret, 0);
> nit: the line above seems redundant to me given the line below.
Thanks Simon. Sent v2 with the nit addressed.
>
>> + ASSERT_GE(ret, pipe_sz);
>>
>> for (i = 0; i < MAX_FRAGS; i++)
>> ASSERT_GE(vmsplice(pipefds[1], &sendchar_iov, 1, 0), 0);
>> --
>> 2.43.0
>>
^ permalink raw reply
* [PATCH net-next v2] selftests: tls: size splice_short pipe by page size
From: Nirmoy Das @ 2026-06-24 13:44 UTC (permalink / raw)
To: Jakub Kicinski, Sabrina Dubroca, John Fastabend
Cc: Simon Horman, netdev, linux-kernel, Nirmoy Das
splice_short grows its pipe with (MAX_FRAGS + 1) * 0x1000 so it can
queue one short vmsplice() buffer for each fragment before draining the
pipe. That assumes 4K pipe buffers.
On 64K-page kernels the request is rounded to 262144 bytes, which
provides only four pipe buffers. The fifth one-byte vmsplice() blocks in
pipe_wait_writable and the test times out before it reaches the TLS path.
Request enough bytes for the same number of pipe buffers using the
runtime page size, and assert that the kernel granted at least that much.
If an unprivileged run cannot raise the pipe above the system
pipe-max-size limit, skip the test because it cannot exercise the
intended path.
Fixes: 3667e9b442b9 ("selftests: tls: add test for short splice due to full skmsg")
Assisted-by: Codex:gpt-5
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
---
v2: Remove redundant ASSERT_GE(ret, 0) (Simon Horman).
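
The arithmetic in the commit message can be checked with a small sketch. It
assumes the kernel rounds an F_SETPIPE_SZ request up to a power of two and
never below one page, which is consistent with the 262144-byte figure above;
`roundup_pow2()` and `pipe_slots()` are hypothetical helpers, and the value
48 used for MAX_FRAGS in the usage note is an assumption chosen only because
it reproduces that figure.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest power of two that is >= v (for v >= 1). */
static size_t roundup_pow2(size_t v)
{
	size_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Number of page-sized pipe buffers obtained for a requested byte count,
 * under the assumed rounding rule: at least one page, otherwise the
 * request rounded up to a power of two.
 */
static size_t pipe_slots(size_t request, size_t page_size)
{
	size_t bytes;

	assert(page_size != 0);
	bytes = request < page_size ? page_size : roundup_pow2(request);
	return bytes / page_size;
}
```

With MAX_FRAGS assumed to be 48, pipe_slots(49 * 0x1000, 4096) gives 64
buffers, pipe_slots(49 * 0x1000, 65536) gives only 4, and the page-size-based
request pipe_slots(49 * 65536, 65536) gives 64 again. The last request rounds
to 4 MiB, which is why an unprivileged run can fail against the
pipe-max-size limit.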
tools/testing/selftests/net/tls.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/tls.c b/tools/testing/selftests/net/tls.c
index 30a236b8e9f73..d805a7dfbdd51 100644
--- a/tools/testing/selftests/net/tls.c
+++ b/tools/testing/selftests/net/tls.c
@@ -997,6 +997,8 @@ TEST_F(tls, splice_short)
char sendbuf[0x100];
char sendchar = 'S';
int pipefds[2];
+ int pipe_sz;
+ int ret;
int i;
sendchar_iov.iov_base = &sendchar;
@@ -1005,7 +1007,11 @@ TEST_F(tls, splice_short)
memset(sendbuf, 's', sizeof(sendbuf));
ASSERT_GE(pipe2(pipefds, O_NONBLOCK), 0);
- ASSERT_GE(fcntl(pipefds[0], F_SETPIPE_SZ, (MAX_FRAGS + 1) * 0x1000), 0);
+ pipe_sz = (MAX_FRAGS + 1) * getpagesize();
+ ret = fcntl(pipefds[0], F_SETPIPE_SZ, pipe_sz);
+ if (ret < 0 && errno == EPERM)
+ SKIP(return, "insufficient pipe capacity");
+ ASSERT_GE(ret, pipe_sz);
for (i = 0; i < MAX_FRAGS; i++)
ASSERT_GE(vmsplice(pipefds[1], &sendchar_iov, 1, 0), 0);
base-commit: 47186409c092cd7dd70350999186c700233e854d
--
2.43.0
^ permalink raw reply related
* Re: [BUG] KFENCE: use-after-free read in udp_tunnel_nic_device_sync_work
From: Sam Sun @ 2026-06-24 13:42 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, netdev,
linux-kernel, syzkaller
In-Reply-To: <CANn89iKD=4_A+7K2erw5AZPnzvBrQqTKXJyfKN5ZfNpAYZ+y3A@mail.gmail.com>
On Wed, Jun 24, 2026 at 6:01 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Jun 24, 2026 at 2:01 AM Yue Sun <samsun1006219@gmail.com> wrote:
> >
> > Hello,
> >
> > I hit a reproducible use-after-free in the UDP tunnel NIC offload work item.
> > The original local crash was reported by KFENCE as:
> >
> > KFENCE: use-after-free read in udp_tunnel_nic_device_sync_work
> >
> > On current mainline, the C reproducer below triggers the same lifetime bug,
> > reported by KASAN before KFENCE samples the object:
> >
> > BUG: KASAN: slab-use-after-free in __mutex_lock
> > Workqueue: udp_tunnel_nic udp_tunnel_nic_device_sync_work
> >
> > Tested kernel:
> >
> > 840ef6c78e6a ("Merge tag 'nfs-for-7.2-1' of git://git.linux-nfs.org/projects/anna/linux-nfs")
> > Linux 7.1.0-11240-g840ef6c78e6a #31 SMP PREEMPT_DYNAMIC
> >
>
>
> Thanks for the report.
>
> Can you test the following patch?
>
> diff --git a/net/ipv4/udp_tunnel_nic.c b/net/ipv4/udp_tunnel_nic.c
> index 9944ed923ddfd10f9adf6ad788c0740daeaf2adb..c5f8d2f9d325de8f4d2247ddaa52e33378851857
> 100644
> --- a/net/ipv4/udp_tunnel_nic.c
> +++ b/net/ipv4/udp_tunnel_nic.c
> @@ -304,8 +304,8 @@ udp_tunnel_nic_device_sync(struct net_device *dev,
> struct udp_tunnel_nic *utn)
> if (!utn->need_sync)
> return;
>
> - queue_work(udp_tunnel_nic_workqueue, &utn->work);
> utn->work_pending = 1;
> + queue_work(udp_tunnel_nic_workqueue, &utn->work);
> }
>
> static bool
> @@ -866,6 +866,11 @@ udp_tunnel_nic_unregister(struct net_device *dev,
> struct udp_tunnel_nic *utn)
>
> udp_tunnel_nic_lock(dev);
>
> + if (utn->work_pending) {
> + udp_tunnel_nic_unlock(dev);
> + return;
> + }
> +
> /* For a shared table remove this dev from the list of sharing devices
> * and if there are other devices just detach.
> */
> @@ -901,12 +906,6 @@ udp_tunnel_nic_unregister(struct net_device *dev,
> struct udp_tunnel_nic *utn)
> udp_tunnel_nic_flush(dev, utn);
> udp_tunnel_nic_unlock(dev);
>
> - /* Wait for the work to be done using the state, netdev core will
> - * retry unregister until we give up our reference on this device.
> - */
> - if (utn->work_pending)
> - return;
> -
> udp_tunnel_nic_free(utn);
> release_dev:
> dev->udp_tunnel_nic = NULL;
I tested the patch, but unfortunately the C reproducer still triggers the
same use-after-free for me.
Tested on top of:
840ef6c78e6a ("Merge tag 'nfs-for-7.2-1' of git://git.linux-nfs.org/projects/anna/linux-nfs")
I booted the kernel with KASAN/KFENCE enabled and:
panic_on_warn=1 panic_on_oops=1 kfence.sample_interval=1
Then I ran the same C reproducer:
timeout -k 10 360 /root/repro
The VM panicked after about 236 seconds:
[ 236.471119][ T58] BUG: KASAN: slab-use-after-free in __mutex_lock+0x16d0/0x1d80
[ 236.473404][ T58] Read of size 8 at addr ff11000076a63ea8 by task kworker/u16:3/58
[ 236.476455][ T58] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 236.476478][ T58] Workqueue: udp_tunnel_nic udp_tunnel_nic_device_sync_work
[ 236.476787][ T58] __mutex_lock+0x16d0/0x1d80
[ 236.477020][ T58] udp_tunnel_nic_device_sync_work+0x32/0x9c0
[ 236.477068][ T58] process_one_work+0x9de/0x1bf0
The allocation/free stacks are still the same shape:
```
Allocated by task 11563:
__kmalloc_noprof
udp_tunnel_nic_netdevice_event+0x12d8/0x1e80
register_netdevice
nsim_create
nsim_dev_reload_up
devlink_reload
Freed by task 11609:
kfree
udp_tunnel_nic_netdevice_event+0xc26/0x1e80
unregister_netdevice_many_notify
nsim_destroy
nsim_dev_reload_down
devlink_reload
Last potentially related work creation:
queue_work_on
__udp_tunnel_nic_del_port+0x2af/0x320
udp_tunnel_notify_del_rx_port
__geneve_sock_release.part.0
geneve_stop
Second to last potentially related work creation:
queue_work_on
__udp_tunnel_nic_add_port+0x6ec/0xd70
udp_tunnel_notify_add_rx_port
geneve_open
```
My read of the patch is that it closes the small window where queue_work()
can publish the work before utn->work_pending is set, and it also prevents
udp_tunnel_nic_unregister() from flushing/freeing the object when
work_pending is already set.
However, the test above suggests that work_pending still does not fully
protect the lifetime of struct udp_tunnel_nic. The crashing work was still
queued through udp_tunnel_nic_device_sync() at line 308, so the patched path
was exercised. One suspicious point is that udp_tunnel_nic_device_sync_work()
clears utn->work_pending at the beginning of the worker, while the same work
item can still interact with replay/add/del-port state. The reproducer can
still end up with udp_tunnel_nic_unregister() freeing utn while a
udp_tunnel_nic_device_sync_work item later runs and dereferences the freed
utn->lock.
So this patch does not seem to be sufficient for this reproducer.
Thanks,
Yue
^ permalink raw reply