* Fw: [Bug 221665] New: tcp_ecn: doc says default is 2, but should be 5
From: Stephen Hemminger @ 2026-06-18 15:52 UTC (permalink / raw)
To: netdev
Begin forwarded message:
Date: Thu, 18 Jun 2026 13:05:29 +0000
From: bugzilla-daemon@kernel.org
To: stephen@networkplumber.org
Subject: [Bug 221665] New: tcp_ecn: doc says default is 2, but should be 5
https://bugzilla.kernel.org/show_bug.cgi?id=221665
Bug ID: 221665
Summary: tcp_ecn: doc says default is 2, but should be 5
Product: Networking
Version: 2.5
Hardware: All
OS: Linux
Status: NEW
Severity: normal
Priority: P3
Component: IPV4
Assignee: stephen@networkplumber.org
Reporter: pedretti.fabio@gmail.com
CC: corbet@lwn.net, pabeni@redhat.com
Regression: No
Here: https://docs.kernel.org/networking/ip-sysctl.html#:~:text=tcp_ecn
tcp_ecn - INTEGER
...
Default: 2
However it looks like the default was recently changed:
"In previous kernels, the value of tcp_ecn is, by default, two, [...] The new
default value is five, [...]"
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8ae3e8e6ceedfb3cf74ca18169c942e073586a39
Please address this.
Thanks.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply
* Re: [PATCH v3] net: mvneta: re-enable percpu interrupt on resume
From: Sebastian Andrzej Siewior @ 2026-06-18 15:53 UTC (permalink / raw)
To: Zhou, Yun
Cc: marcin.s.wojtas, andrew+netdev, davem, edumazet, kuba, pabeni,
maxime.chevallier, netdev, linux-kernel
In-Reply-To: <3a4ba3da-47ee-4da7-b0da-af500ff1a369@windriver.com>
On 2026-06-18 23:45:51 [+0800], Zhou, Yun wrote:
> > Having a NAPI instance with IRQ per queue and those configured and
> > spread among CPUs should cause less trouble and is what others do.
> > In fact is the only net driver using per-CPU interrupts.
> >
> It is a SoC limitation. Armada XP's MPIC provides a single shared
> interrupt for the ethernet controller with per-CPU masking for
> interrupt steering — there are no per-queue MSI vectors. The percpu
> IRQ model was the only way to distribute interrupt handling across
> CPUs given this hardware constraint.
Is this a hardware constraint or more a software design choice? From the
other comment it read like it could be changed.
There is nothing wrong to provide 4 interrupts for a device from the
device-tree and then allocate and request all four. This requires that
SMP affinity is supported properly in order to spread it across CPU. You
would also be able to reduce the amount of queues/ interrupts via
ethtool if you would like to isolate a CPU for NOHZ reasons.
Sebastian
^ permalink raw reply
* RE: [Intel-wired-lan] [PATCH iwl-net 1/2] ice: skip per-VLAN promisc rules when default VSI Rx rule is set
From: Loktionov, Aleksandr @ 2026-06-18 16:01 UTC (permalink / raw)
To: Oros, Petr, netdev@vger.kernel.org
Cc: Vecera, Ivan, Alice Michael, Kitszel, Przemyslaw, Eric Dumazet,
linux-kernel@vger.kernel.org, Andrew Lunn, Nguyen, Anthony L,
Michal Swiatkowski, Keller, Jacob E, Jakub Kicinski, Paolo Abeni,
David S. Miller, intel-wired-lan@lists.osuosl.org
In-Reply-To: <89efbea9831175e6f57e9fe8557f7a0e48e050b7.1781786935.git.poros@redhat.com>
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Petr Oros
> Sent: Thursday, June 18, 2026 5:09 PM
> To: netdev@vger.kernel.org
> Cc: Vecera, Ivan <ivecera@redhat.com>; Alice Michael
> <alice.michael@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Eric Dumazet <edumazet@google.com>;
> linux-kernel@vger.kernel.org; Andrew Lunn <andrew+netdev@lunn.ch>;
> Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Michal Swiatkowski
> <michal.swiatkowski@linux.intel.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; Jakub Kicinski <kuba@kernel.org>; Paolo
> Abeni <pabeni@redhat.com>; David S. Miller <davem@davemloft.net>;
> intel-wired-lan@lists.osuosl.org
> Subject: [Intel-wired-lan] [PATCH iwl-net 1/2] ice: skip per-VLAN
> promisc rules when default VSI Rx rule is set
>
> When an ice port is part of a vlan-filtering bridge with a wide VLAN
> trunk and the netdev is in IFF_PROMISC (typical for bond slaves
> attached to a bridge), the driver installs per-VLAN
> ICE_SW_LKUP_PROMISC_VLAN entries (recipe 9) in addition to the broad
> ICE_SW_LKUP_DFLT VSI Rx rule (recipe 5). Each per-VLAN rule consumes
> one Flow Lookup Unit (FLU) entry from a fixed hardware pool of "up to
> 32K FLU entries" per device, documented in the E810 datasheet
> (613875-009 section 7.8.10, Table 7-18, page 1015).
>
> With three active PFs sharing one switch context and a bridge trunk of
> vid 2-4094, the configuration would require roughly
>
> 3 PFs * 4093 VLANs * 3 rules per VLAN per PF ~= 36,800 rules
>
> which exceeds the 32K FLU budget. Firmware then responds to further
> Add Switch Rules requests with AQ retval 0x10 (LIBIE_AQ_RC_ENOSPC) and
> the user-visible failure surfaces as
>
> ice 0000:5c:00.1: Failed to set VSI 14 as the default forwarding
> VSI, error -5
> ice 0000:5c:00.1 ens1f1: Error -5 setting default VSI 14 Rx rule
>
> After a switch context has been driven into overrun, subsequent
> retries can come back as AQ retval 0x2 (LIBIE_AQ_RC_ENOENT), which has
> misled triage attempts toward a perceived recipe binding defect rather
> than a capacity issue.
>
> When the DFLT VSI Rx rule is in place it catches every packet on the
> lport regardless of VLAN tag, so the per-VLAN PROMISC_VLAN expansion
> is redundant. The recipe 4 VLAN prune entries are still installed per
> VLAN and continue to track the allowed VID set, but the IFF_PROMISC
> sync path disables their enforcement on the VSI via
> vlan_ops->dis_rx_filtering() before ice_set_promisc() runs.
> ena_rx_filtering() is restored when IFF_PROMISC is cleared.
>
> Skip the per-VLAN expansion at the two call sites that drive it:
> ice_set_promisc() falls through to ice_fltr_set_vsi_promisc() and
> ice_vlan_rx_add_vid() omits the per-VLAN ICE_MCAST_VLAN_PROMISC_BITS
> add. Plain IFF_ALLMULTI without an installed DFLT VSI rule is
> unchanged and still installs per-VLAN multicast promisc rules.
>
> Both checks use ice_is_vsi_dflt_vsi() which inspects the recipe filter
> list for an installed DFLT rule on this VSI, not
> netdev->flags & IFF_PROMISC. The HW-state predicate avoids two
> regression vectors that a user-intent predicate would introduce:
>
> 1. ice_lag_is_switchdev_running() short-circuits ice_set_dflt_vsi()
> to return 0 without installing the DFLT rule for a PF in
> switchdev LAG mode. An IFF_PROMISC-only check would also
> suppress the per-VLAN fallback, leaving the PF with no rule.
>
> 2. When ice_set_dflt_vsi() returns a non-EEXIST error (FLU
> exhausted, switch context divergence), the driver clears
> IFF_PROMISC from vsi->current_netdev_flags but the netdev's own
> flags retain IFF_PROMISC. The user-intent predicate would still
> suppress the per-VLAN fallback even though DFLT failed to
> install.
>
> The predicate is install-time only. The IFF_PROMISC off path closes
> the lifecycle gap in ice_vsi_exit_dflt_promisc(): for an IFF_ALLMULTI
> VSI with VLANs it reinstates the per-VID rules before clearing the
> default rule, so multicast coverage never lapses. If that AQ call
> fails the default rule is left in place, ice_vsi_exit_dflt_promisc()
> returns the error, and the sync_fltr pass bails with
> vsi->current_netdev_flags |= IFF_PROMISC; the current/netdev flag
> mismatch re-fires the IFF_PROMISC off path on the next sync. Clearing
> the default rule first would instead expose a window where neither the
> default rule nor the per-VID rules carry multicast.
>
> If ice_clear_dflt_vsi() fails after the per-VID rules were reinstated
> they are deliberately not rolled back. Clearing the default rule is a
> removal that frees an FLU entry rather than allocating one, so it
> cannot fail for lack of space; a failure is a transient AdminQ error.
> The per-VID rules are the steady state for an IFF_ALLMULTI VLAN VSI,
> so the only redundant entry left behind is the single un-removed
> default rule, not the per-VID set. The retry re-enters this path,
> ice_fltr_set_vlan_vsi_promisc() returns -EEXIST for the rules that
> already exist so nothing is reallocated, and the default rule is
> removed on the next attempt. Rolling the per-VID rules back here would
> instead churn thousands of removes and re-adds on every retry.
>
> After the default rule is gone the vid=0 PROMISC rule that paired with
> it is redundant and is dropped, but only to reclaim a filter entry, so
> a failure there is logged and does not abort the transition.
>
> ice_set_vsi_promisc() and ice_clear_vsi_promisc() dispatch the recipe
> based on whether ICE_PROMISC_VLAN_RX/TX bits are present in the mask:
> with the bits set, recipe ICE_SW_LKUP_PROMISC_VLAN is used; otherwise
> ICE_SW_LKUP_PROMISC. The else branch in
> ice_set_promisc() installs the vid=0 rule in ICE_SW_LKUP_PROMISC.
> Because ice_clear_promisc() with VLANs present adds the VLAN bits and
> would search ICE_SW_LKUP_PROMISC_VLAN, the recipe mismatch would leave
> the vid=0 ICE_SW_LKUP_PROMISC rule orphaned when VLANs are configured.
> This is a single stale rule, not a per-cycle leak:
> re-adding it on the next promisc on returns -EEXIST rather than
> allocating a new entry. The set-time recipe is not recorded, so
> ice_clear_promisc() clears both recipes; clearing a rule that is not
> present succeeds, both clears run unconditionally, and the first error
> is returned.
>
> The two VLAN-0 recipe transition blocks in ice_vlan_rx_add_vid() and
> ice_vlan_rx_kill_vid() that promote / demote the vid=0 rule between
> ICE_SW_LKUP_PROMISC and ICE_SW_LKUP_PROMISC_VLAN are likewise guarded
> by !ice_is_vsi_dflt_vsi(). With DFLT in place the
> vid=0 rule already covers every VID and a recipe swap would only
> install a redundant rule.
>
> Lab reproduction on an E810-C with the same firmware family (4.80, NVM
> 1.3805.0, DDP 1.3.43.0) using four PFs in vlan-filtering bridges with
> vid 2-4094 and the slaves brought to IFF_PROMISC before the bridge
> VLAN bulk add:
>
> before fix: ~12,279 AQ Add Switch Rules per PF, ENOSPC and ENOENT
> responses in dmesg, DFLT VSI Rx rule install fails on
> the affected PF
> after fix: ~4,093 AQ Add Switch Rules per PF, no AQ errors, DFLT
> VSI Rx rule installs on every PF
>
> The 66.7% reduction in installed switch rules per PF matches the
> expected per-VLAN saving: a single DFLT rule replaces the per-VID
> PROMISC_VLAN expansion.
>
> Functional regression test with vid 2-100 trunk between two ice ports
> through the lab switch (40/40 PASS, 0 AQ errors, 0 ENOSPC at 4093-VID
> customer scale):
>
> vid 50 unicast, vid 100 unicast, vid 50 broadcast ARP,
> vid 100 multicast IPv6 ND
> vid 200/500/1500/4000 isolation (out-of-trunk) and untagged not
> leaked: 0 packets reach any bridge endpoint
> IGMP/MLD snooping, Jumbo MTU 9000, reserved-multicast STP BPDU
> IFF_PROMISC + IFF_ALLMULTI transition (off while allmulti stays)
> Regression reproducer for commit 1273f89578f2 ("ice: Fix broken
> IFF_ALLMULTI handling"): allmulti on -> add vid -> allmulti off
> -> allmulti on plus the orphan-rule Scenario 2; both converge
> with no stale rules
> 100-VID, 1000-VID, 4093-VID stress cycles (5/3/2 iterations each)
> switchdev mode toggle preserves IFF_PROMISC pruning state across
> the session (vid 999 multicast received before and after the
> legacy -> switchdev -> legacy cycle)
> SR-IOV: VFs unaffected because ice_set_promisc() early-returns
> for non-PF VSI and VF representors do not register
> ndo_vlan_rx_add_vid
>
> Fixes: 1273f89578f2 ("ice: Fix broken IFF_ALLMULTI handling")
> Signed-off-by: Petr Oros <poros@redhat.com>
> ---
> drivers/net/ethernet/intel/ice/ice_main.c | 90 ++++++++++++++++++----
> -
> 1 file changed, 70 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
> b/drivers/net/ethernet/intel/ice/ice_main.c
> index 6d24056c247cf4..af8df81fc45623 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -274,7 +274,8 @@ static int ice_set_promisc(struct ice_vsi *vsi, u8
> promisc_m)
> if (vsi->type != ICE_VSI_PF)
> return 0;
>
> - if (ice_vsi_has_non_zero_vlans(vsi)) {
> + /* skip per-VID expansion; the DFLT Rx rule already covers
> every VID */
> + if (ice_vsi_has_non_zero_vlans(vsi) &&
> !ice_is_vsi_dflt_vsi(vsi)) {
> promisc_m |= (ICE_PROMISC_VLAN_RX |
> ICE_PROMISC_VLAN_TX);
> status = ice_fltr_set_vlan_vsi_promisc(&vsi->back->hw,
> vsi,
> promisc_m);
> @@ -304,9 +305,19 @@ static int ice_clear_promisc(struct ice_vsi *vsi,
> u8 promisc_m)
> return 0;
>
> if (ice_vsi_has_non_zero_vlans(vsi)) {
...
> ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi-
> >idx,
>
> ICE_MCAST_VLAN_PROMISC_BITS,
> 0);
> --
> 2.53.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
^ permalink raw reply
* RE: [Intel-wired-lan] [PATCH iwl-net 2/2] ice: preserve uplink DFLT Rx rule on switchdev release
From: Loktionov, Aleksandr @ 2026-06-18 16:02 UTC (permalink / raw)
To: Oros, Petr, netdev@vger.kernel.org
Cc: Vecera, Ivan, Alice Michael, Kitszel, Przemyslaw, Eric Dumazet,
linux-kernel@vger.kernel.org, Andrew Lunn, Nguyen, Anthony L,
Michal Swiatkowski, Keller, Jacob E, Jakub Kicinski, Paolo Abeni,
David S. Miller, intel-wired-lan@lists.osuosl.org
In-Reply-To: <deef5756e534ef06c12d910c5305d3fd205d30a0.1781786935.git.poros@redhat.com>
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Petr Oros
> Sent: Thursday, June 18, 2026 5:09 PM
> To: netdev@vger.kernel.org
> Cc: Vecera, Ivan <ivecera@redhat.com>; Alice Michael
> <alice.michael@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Eric Dumazet <edumazet@google.com>;
> linux-kernel@vger.kernel.org; Andrew Lunn <andrew+netdev@lunn.ch>;
> Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Michal Swiatkowski
> <michal.swiatkowski@linux.intel.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; Jakub Kicinski <kuba@kernel.org>; Paolo
> Abeni <pabeni@redhat.com>; David S. Miller <davem@davemloft.net>;
> intel-wired-lan@lists.osuosl.org
> Subject: [Intel-wired-lan] [PATCH iwl-net 2/2] ice: preserve uplink
> DFLT Rx rule on switchdev release
>
> ice_eswitch_setup_env() calls ice_set_dflt_vsi() to install the
> ICE_SW_LKUP_DFLT Rx rule on the uplink VSI. The helper returns 0 even
> when the rule is already in place, so the call is a no-op if
> ice_vsi_sync_fltr() had previously installed the DFLT rule in response
> to IFF_PROMISC on the uplink netdev. ice_remove_vsi_fltr() called
> earlier in ice_eswitch_setup_env() does not affect this rule because
> ice_remove_vsi_lkup_fltr() lacks a case for ICE_SW_LKUP_DFLT and falls
> into its default branch which only logs. Switchdev mode then adds an
> ICE_FLTR_TX leg via ice_cfg_dflt_vsi() on the same VSI handle.
>
> ice_eswitch_release_env() unconditionally removed both the Rx and Tx
> DFLT rules. When the Rx DFLT was installed by ice_vsi_sync_fltr()
> before the switchdev session started, this clobbered promisc state the
> operator had asked for: the DFLT Rx rule disappeared while IFF_PROMISC
> was still set on the netdev, and the IFF_PROMISC sync path was not
> retriggered, so the uplink ended the session without the catch-all
> rule the netdev flags requested.
>
> Skip the Rx DFLT removal when the uplink is still promiscuous, both in
> ice_eswitch_release_env() and in the err_def_tx unwind of
> ice_eswitch_setup_env(). The Tx leg installed by switchdev is always
> removed since switchdev owns it.
>
> The ena_rx_filtering() call earlier in ice_eswitch_release_env() is
> left unconditional: it calls ice_cfg_vlan_pruning(), which returns
> without enabling pruning while the netdev is in IFF_PROMISC, so it
> cannot re-enable VLAN pruning under the preserved DFLT rule and drop
> tagged traffic. Pruning is re-enabled later, when the IFF_PROMISC sync
> path runs after promisc is actually cleared.
>
> Use vsi->current_netdev_flags rather than the live netdev->flags for
> this test. netdev->flags is written under RTNL by dev_change_flags(),
> while ice_eswitch_release_env() runs under devl_lock, so reading it
> here would be a TOCTOU against a concurrent promisc change. The
> IFF_PROMISC bit of current_netdev_flags is written only under
> ICE_CFG_BUSY by ice_vsi_sync_fltr(), and ice_set_rx_mode() gates that
> sync off for the uplink while ice_is_switchdev_running() is true. The
> bit is therefore frozen for the whole session and stable when
> release_env reads it.
>
> Because the sync is gated off during the session, a promisc change the
> operator makes while switchdev runs never reaches ice_vsi_sync_fltr():
> current_netdev_flags keeps the value captured before the session while
> netdev->flags carries the new one. Once switchdev is torn down and
> pf->eswitch.is_running is cleared, schedule a filter sync from
> ice_eswitch_disable_switchdev() so the suppressed change is replayed
> and the DFLT Rx rule is reconciled with the current netdev flags. This
> also closes the window where release_env kept the rule based on the
> frozen flag but the operator had since cleared IFF_PROMISC.
>
> Fixes: 1a1c40df2e80 ("ice: set and release switchdev environment")
> Signed-off-by: Petr Oros <poros@redhat.com>
> ---
> drivers/net/ethernet/intel/ice/ice_eswitch.c | 32 +++++++++++++++++--
> -
> 1 file changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c
> b/drivers/net/ethernet/intel/ice/ice_eswitch.c
> index 2e4f0969035f77..b6073fc2375019 100644
> --- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
> @@ -66,8 +66,10 @@ static int ice_eswitch_setup_env(struct ice_pf *pf)
> ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
> ICE_FLTR_TX);
> err_def_tx:
> - ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
> - ICE_FLTR_RX);
> + /* keep the Rx DFLT rule if still promiscuous (see release_env)
> */
> + if (!(uplink_vsi->current_netdev_flags & IFF_PROMISC))
> + ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx,
> + false, ICE_FLTR_RX);
> err_def_rx:
> ice_vsi_del_vlan_zero(uplink_vsi);
> err_vlan_zero:
> @@ -275,11 +277,23 @@ static void ice_eswitch_release_env(struct
> ice_pf *pf)
> vlan_ops = ice_get_compat_vsi_vlan_ops(uplink_vsi);
>
> ice_vsi_update_local_lb(uplink_vsi, false);
> + /* No-op while IFF_PROMISC is set: ice_cfg_vlan_pruning() self-
> gates on
> + * it, so this cannot re-enable VLAN pruning under a preserved
> DFLT rule.
> + */
> vlan_ops->ena_rx_filtering(uplink_vsi);
> ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
> ICE_FLTR_TX);
> - ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx, false,
> - ICE_FLTR_RX);
> +
> + /* Keep the Rx DFLT rule if the uplink is still promiscuous; it
> must
> + * outlive the session. current_netdev_flags is used because
> its
> + * IFF_PROMISC bit only changes under ice_vsi_sync_fltr(),
> gated off
> + * during switchdev, so the read cannot race the RTNL netdev-
> >flags.
> + * Any change made during the session is replayed on teardown.
> + */
> + if (!(uplink_vsi->current_netdev_flags & IFF_PROMISC))
> + ice_cfg_dflt_vsi(uplink_vsi->port_info, uplink_vsi->idx,
> + false, ICE_FLTR_RX);
> +
> ice_fltr_add_mac_and_broadcast(uplink_vsi,
> uplink_vsi->port_info-
> >mac.perm_addr,
> ICE_FWD_TO_VSI);
> @@ -327,10 +341,20 @@ static int ice_eswitch_enable_switchdev(struct
> ice_pf *pf)
> */
> static void ice_eswitch_disable_switchdev(struct ice_pf *pf) {
> + struct ice_vsi *uplink_vsi = pf->eswitch.uplink_vsi;
> +
> ice_eswitch_br_offloads_deinit(pf);
> ice_eswitch_release_env(pf);
>
> pf->eswitch.is_running = false;
> +
> + /* ice_set_rx_mode() was gated off during the session; replay a
> filter
> + * sync so any suppressed promisc change reconciles the DFLT Rx
> rule.
> + */
> + set_bit(ICE_VSI_UMAC_FLTR_CHANGED, uplink_vsi->state);
> + set_bit(ICE_VSI_MMAC_FLTR_CHANGED, uplink_vsi->state);
> + set_bit(ICE_FLAG_FLTR_SYNC, pf->flags);
> + ice_service_task_schedule(pf);
> }
>
> /**
> --
> 2.53.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
^ permalink raw reply
* RE: [Intel-wired-lan] [PATCH net v2] ice: eswitch: fix use-after-free of metadata_dst in repr release
From: Loktionov, Aleksandr @ 2026-06-18 16:02 UTC (permalink / raw)
To: Doruk Tan Ozturk, Nguyen, Anthony L, Kitszel, Przemyslaw,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com
Cc: michal.swiatkowski@linux.intel.com, Drewek, Wojciech,
intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, stable@vger.kernel.org,
horms@kernel.org
In-Reply-To: <20260618145003.47471-1-doruk@0sec.ai>
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Doruk Tan Ozturk
> Sent: Thursday, June 18, 2026 4:50 PM
> To: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel,
> Przemyslaw <przemyslaw.kitszel@intel.com>; andrew+netdev@lunn.ch;
> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com
> Cc: michal.swiatkowski@linux.intel.com; Drewek, Wojciech
> <wojciech.drewek@intel.com>; intel-wired-lan@lists.osuosl.org;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> stable@vger.kernel.org; horms@kernel.org
> Subject: [Intel-wired-lan] [PATCH net v2] ice: eswitch: fix use-after-
> free of metadata_dst in repr release
>
> ice_eswitch_release_repr() frees the port representor metadata_dst via
> metadata_dst_free(), which directly kfree()s the object and ignores
> the dst_entry refcount. The eswitch slow-path TX routine
> ice_eswitch_port_start_xmit() takes a reference on this dst with
> dst_hold() and attaches it to the skb via skb_dst_set(). If such an
> skb is still in flight (e.g. queued in a qdisc) when the representor
> is torn down, the metadata_dst is freed while the skb still points at
> it. When the skb is later freed, dst_release() operates on already-
> freed memory.
>
> Replace metadata_dst_free() with dst_release() so the metadata_dst is
> freed only after the last reference is dropped. The dst subsystem
> frees metadata_dst objects from dst_destroy() once the refcount
> reaches zero (DST_METADATA is set by metadata_dst_alloc()).
>
> Same class of bug and fix as commit c32b26aaa2f9 ("netfilter:
> nft_tunnel: fix use-after-free on object destroy").
>
> Fixes: 1a1c40df2e80 ("ice: set and release switchdev environment")
> Cc: stable@vger.kernel.org
> Signed-off-by: Doruk Tan Ozturk <doruk@0sec.ai>
> Reviewed-by: Simon Horman <horms@kernel.org>
> ---
> v2:
> - Correct the Fixes: tag to 1a1c40df2e80 ("ice: set and release
> switchdev environment"); the previously cited fff292b47ac1 only
> moved
> the affected code rather than introducing the unbalanced free, and
> the
> bug dates back to when switchdev support was added (Simon Horman).
> - Add Simon Horman's Reviewed-by. No functional change.
>
> drivers/net/ethernet/intel/ice/ice_eswitch.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c
> b/drivers/net/ethernet/intel/ice/ice_eswitch.c
> index 2e4f0969035f..41b30a7ca4a9 100644
> --- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
> +++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
> @@ -95,7 +95,7 @@ ice_eswitch_release_repr(struct ice_pf *pf, struct
> ice_repr *repr)
> return;
>
> ice_vsi_update_security(vsi, ice_vsi_ctx_set_antispoof);
> - metadata_dst_free(repr->dst);
> + dst_release(&repr->dst->dst);
> repr->dst = NULL;
> ice_fltr_add_mac_and_broadcast(vsi, repr->parent_mac,
> ICE_FWD_TO_VSI);
> --
> 2.43.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
^ permalink raw reply
* Re: [PATCH v3 2/3] net/smc: bound the receive length to the RMB in smc_rx_recvmsg()
From: Dust Li @ 2026-06-18 16:03 UTC (permalink / raw)
To: hexlabsecurity, Wenjia Zhang, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
In-Reply-To: <20260614-b4-disp-edd64be9-v3-2-551fa514257e@proton.me>
On 2026-06-14 03:23:31, Bryam Vargas via B4 Relay wrote:
>From: Bryam Vargas <hexlabsecurity@proton.me>
>
>conn->bytes_to_rcv is accumulated in the receive tasklet from the peer's
>producer cursor:
>
> diff_prod = smc_curs_diff(rmb_desc->len, &prod_old, &prod_new);
> atomic_add(diff_prod, &conn->bytes_to_rcv);
>
>smc_curs_diff()'s differing-wrap branch returns (size - old.count) +
>new.count, which exceeds rmb_desc->len for a forged producer cursor and
>accumulates across CDC messages, so bytes_to_rcv can grow past the RMB
>(and across many messages can overflow the signed counter negative).
>smc_rx_recvmsg() reads it as the number of readable bytes and performs a
>wrap-around copy whose second chunk (copylen - first_chunk, read from
>ring offset 0) is never re-bounded to rmb_desc->len, reading past the
>RMB into adjacent kernel memory and disclosing it to the peer.
Hi Bryam,
Once we validate the CDC message at the input boundary (as in the
previous patch), bytes_to_rcv can never exceed rmb_desc->len, so
this check becomes unreachable. So I don't think this patch is needed.
Best regards,
Dust
>
>Bound the readable count to rmb_desc->len where it is consumed, treating
>a negative (sign-overflowed) value as out of range too, so the copy
>length can never exceed the ring. This enforces the documented
>0 <= bytes_to_rcv <= rmb_desc->len invariant at the consumer, where it
>is race-free against the producer update that runs in the receive
>tasklet.
>
>Fixes: 952310ccf2d8 ("smc: receive data from RMBE")
>Cc: stable@vger.kernel.org
>Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
>---
> net/smc/smc_rx.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
>diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
>index c1d9b923938d..f461cf10b085 100644
>--- a/net/smc/smc_rx.c
>+++ b/net/smc/smc_rx.c
>@@ -442,6 +442,18 @@ int smc_rx_recvmsg(struct smc_sock *smc, struct msghdr *msg,
> /* initialize variables for 1st iteration of subsequent loop */
> /* could be just 1 byte, even after waiting on data above */
> readable = smc_rx_data_available(conn, peeked_bytes);
>+ /* bytes_to_rcv is accumulated from the peer's wire-controlled
>+ * producer cursor; a forged cursor can drive it past the RMB,
>+ * or overflow the signed accumulator to a negative value across
>+ * many CDC messages (which a plain "> len" check would miss
>+ * before the size_t cast below turns it huge). Bound it to the
>+ * RMB in either case so the wrap-around copy cannot run past
>+ * rmb_desc->len. This enforces the documented
>+ * 0 <= bytes_to_rcv <= rmb_desc->len invariant at the consumer,
>+ * race-free against the producer update in the receive tasklet.
>+ */
>+ if (readable < 0 || readable > conn->rmb_desc->len)
>+ readable = conn->rmb_desc->len;
> splbytes = atomic_read(&conn->splice_pending);
> if (!readable || (msg && splbytes)) {
> if (splbytes)
>
>--
>2.43.0
>
^ permalink raw reply
* Re: [PATCH v3 3/3] net/smc: bound the send length to the send buffer in smc_tx_sendmsg()
From: Dust Li @ 2026-06-18 16:08 UTC (permalink / raw)
To: hexlabsecurity, Wenjia Zhang, D. Wythe, Sidraya Jayagond
Cc: Eric Dumazet, David S. Miller, Mahanta Jambigi, Wen Gu,
Simon Horman, netdev, Ursula Braun, Stefan Raspl, linux-s390,
Paolo Abeni, linux-kernel, linux-rdma, Jakub Kicinski, Tony Lu
In-Reply-To: <20260614-b4-disp-edd64be9-v3-3-551fa514257e@proton.me>
On 2026-06-14 03:23:32, Bryam Vargas via B4 Relay wrote:
>From: Bryam Vargas <hexlabsecurity@proton.me>
>
>On the SMC-D DMB-merge (nocopy) path, smc_cdc_msg_recv_action() advances
>conn->sndbuf_space from the peer's consumer cursor:
>
> diff_tx = smc_curs_diff(sndbuf_desc->len, &tx_curs_fin,
> &local_rx_ctrl.cons);
> atomic_add(diff_tx, &conn->sndbuf_space);
>
>The consumer cursor is wire-controlled and unvalidated, and
>smc_curs_diff()'s differing-wrap branch can return more than
>sndbuf_desc->len, so a forged cursor drives sndbuf_space past the send
>buffer (and across many CDC messages can overflow the signed counter
>negative). smc_tx_sendmsg() reads it as the available write space and
>performs a wrap-around copy whose second chunk (copylen - first_chunk,
>written at ring offset 0) is never re-bounded to sndbuf_desc->len,
>writing user data past the send buffer -- a heap out-of-bounds write of
>attacker-influenced length and content.
Hi Bryam,
I think this is the same as patch #2.
Best regards,
Dust
>
>Bound the write space to sndbuf_desc->len where it is consumed, treating
>a negative (sign-overflowed) value as out of range too, so the copy
>length can never exceed the ring. This enforces the documented
>0 <= sndbuf_space <= sndbuf_desc->len invariant at the producer,
>race-free against the CDC tasklet that advances sndbuf_space.
>
>Fixes: cc0ab806fc52 ("net/smc: adapt cursor update when sndbuf and peer DMB are merged")
>Cc: stable@vger.kernel.org
>Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
>---
> net/smc/smc_tx.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
>diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
>index 3144b4b1fe29..5916f02060fb 100644
>--- a/net/smc/smc_tx.c
>+++ b/net/smc/smc_tx.c
>@@ -233,6 +233,19 @@ int smc_tx_sendmsg(struct smc_sock *smc, struct msghdr *msg, size_t len)
> /* initialize variables for 1st iteration of subsequent loop */
> /* could be just 1 byte, even after smc_tx_wait above */
> writespace = atomic_read(&conn->sndbuf_space);
>+ /* sndbuf_space is advanced from the peer's wire-controlled
>+ * consumer cursor on the SMC-D DMB-merge path; a forged cursor
>+ * can inflate it past the send buffer, or overflow the signed
>+ * accumulator to a negative value across many CDC messages
>+ * (which a plain "> len" check would miss before the size_t
>+ * cast below turns it huge). Bound it to the send buffer in
>+ * either case so the wrap-around write cannot run past
>+ * sndbuf_desc->len. This enforces the documented
>+ * 0 <= sndbuf_space <= sndbuf_desc->len invariant at the
>+ * producer, race-free against the CDC tasklet.
>+ */
>+ if (writespace < 0 || writespace > conn->sndbuf_desc->len)
>+ writespace = conn->sndbuf_desc->len;
> /* not more than what user space asked for */
> copylen = min_t(size_t, send_remaining, writespace);
> /* determine start of sndbuf */
>
>--
>2.43.0
>
^ permalink raw reply
* [PATCH net 0/6] ipv6: fix sysctl error handling and missing notifications
From: Fernando Fernandez Mancera @ 2026-06-18 16:22 UTC (permalink / raw)
To: netdev
Cc: nicolas.dichtel, shemminger, dforster, gospo, ddutt, brian.haley,
horms, pabeni, kuba, edumazet, davem, idosch, dsahern,
Fernando Fernandez Mancera
While working on a different IPv6 patch series I have spotted multiple
minor bugs around sysctl error handling and notifications. In general,
they are not serious issues.
In addition, there is one more issue in forwarding sysctl as it does not
check for CAP_NET_ADMIN for the namespace. I am keeping that patch out
of this series and I am aiming it at the net-next tree once it re-opens.
Fernando Fernandez Mancera (6):
ipv6: fix error handling in disable_ipv6 sysctl
ipv6: fix error handling in ignore_routes_with_linkdown sysctl
ipv6: fix error handling in forwarding sysctl
ipv6: fix error handling in disable_policy sysctl
ipv6: reset value and position for proxy_ndp sysctl restart
ipv6: fix missing notification for ignore_routes_with_linkdown
net/ipv6/addrconf.c | 35 +++++++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 8 deletions(-)
--
2.54.0
^ permalink raw reply
* [PATCH net 1/6] ipv6: fix error handling in disable_ipv6 sysctl
From: Fernando Fernandez Mancera @ 2026-06-18 16:22 UTC (permalink / raw)
To: netdev
Cc: nicolas.dichtel, shemminger, dforster, gospo, ddutt, brian.haley,
horms, pabeni, kuba, edumazet, davem, idosch, dsahern,
Fernando Fernandez Mancera
In-Reply-To: <20260618162225.4588-1-fmancera@suse.de>
When writing to the disable_ipv6 sysctl, if proc_dointvec() fails to
parse the input, it returns a negative error code. The current
implementation is overwriting that error for write operations.
This results in a silent failure, it returns a successful write although
the configuration was not modified at all. When modifying the "all"
variant it can also modify the configuration of existing interfaces to
the wrong value.
Fix this by checking the return value of proc_dointvec() and returning
early on failure.
Fixes: 56d417b12e57 ("IPv6: Add 'autoconf' and 'disable_ipv6' module parameters")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
---
net/ipv6/addrconf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 1f21ccb55caa..c901b444a995 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -6467,6 +6467,8 @@ static int addrconf_sysctl_disable(const struct ctl_table *ctl, int write,
lctl.data = &val;
ret = proc_dointvec(&lctl, write, buffer, lenp, ppos);
+ if (ret)
+ return ret;
if (write)
ret = addrconf_disable_ipv6(ctl, valp, val);
--
2.54.0
^ permalink raw reply related
* [PATCH net 4/6] ipv6: fix error handling in disable_policy sysctl
From: Fernando Fernandez Mancera @ 2026-06-18 16:22 UTC (permalink / raw)
To: netdev
Cc: nicolas.dichtel, shemminger, dforster, gospo, ddutt, brian.haley,
horms, pabeni, kuba, edumazet, davem, idosch, dsahern,
Fernando Fernandez Mancera
In-Reply-To: <20260618162225.4588-1-fmancera@suse.de>
When writing to the disable_policy sysctl, if proc_dointvec() fails to
parse the input, it returns a negative error code. The current
implementation is resetting the position argument even if an error
occurred during proc_dointvec() and not only during sysctl restart.
Fix this by checking the return value of proc_dointvec() and returning
early on failure.
Fixes: df789fe75206 ("ipv6: Provide ipv6 version of "disable_policy" sysctl")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
---
net/ipv6/addrconf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 0b128b6cff60..8ff015975e27 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -6769,6 +6769,8 @@ static int addrconf_sysctl_disable_policy(const struct ctl_table *ctl, int write
lctl = *ctl;
lctl.data = &val;
ret = proc_dointvec(&lctl, write, buffer, lenp, ppos);
+ if (ret)
+ return ret;
if (write && (*valp != val))
ret = addrconf_disable_policy(ctl, valp, val);
--
2.54.0
^ permalink raw reply related
* [PATCH net 2/6] ipv6: fix error handling in ignore_routes_with_linkdown sysctl
From: Fernando Fernandez Mancera @ 2026-06-18 16:22 UTC (permalink / raw)
To: netdev
Cc: nicolas.dichtel, shemminger, dforster, gospo, ddutt, brian.haley,
horms, pabeni, kuba, edumazet, davem, idosch, dsahern,
Fernando Fernandez Mancera
In-Reply-To: <20260618162225.4588-1-fmancera@suse.de>
When writing to the ignore_routes_with_linkdown sysctl, if
proc_dointvec() fails to parse the input, it returns a negative error
code. The current implementation is overwriting that error for write
operations.
This results in a silent failure, it returns a successful write although
the configuration was not modified at all. When modifying the "all"
variant it can also modify the configuration of existing interfaces to
the wrong value.
Fix this by checking the return value of proc_dointvec() and returning
early on failure.
Fixes: 35103d11173b ("net: ipv6 sysctl option to ignore routes when nexthop link is down")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
---
net/ipv6/addrconf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index c901b444a995..70058d971205 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -6671,6 +6671,8 @@ int addrconf_sysctl_ignore_routes_with_linkdown(const struct ctl_table *ctl,
lctl.data = &val;
ret = proc_dointvec(&lctl, write, buffer, lenp, ppos);
+ if (ret)
+ return ret;
if (write)
ret = addrconf_fixup_linkdown(ctl, valp, val);
--
2.54.0
^ permalink raw reply related
* [PATCH net 3/6] ipv6: fix error handling in forwarding sysctl
From: Fernando Fernandez Mancera @ 2026-06-18 16:22 UTC (permalink / raw)
To: netdev
Cc: nicolas.dichtel, shemminger, dforster, gospo, ddutt, brian.haley,
horms, pabeni, kuba, edumazet, davem, idosch, dsahern,
Fernando Fernandez Mancera
In-Reply-To: <20260618162225.4588-1-fmancera@suse.de>
When writing to the forwarding sysctl, if proc_dointvec() fails to parse
the input, it returns a negative error code. The current implementation
is overwriting that error for write operations.
This results in a silent failure, it returns a successful write although
the configuration was not modified at all. When modifying the "all"
variant it can also modify the configuration of existing interfaces to
the wrong value.
Fix this by checking the return value of proc_dointvec() and returning
early on failure.
Fixes: b325fddb7f86 ("ipv6: Fix sysctl unregistration deadlock")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
---
net/ipv6/addrconf.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 70058d971205..0b128b6cff60 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -6370,6 +6370,8 @@ static int addrconf_sysctl_forward(const struct ctl_table *ctl, int write,
lctl.data = &val;
ret = proc_dointvec(&lctl, write, buffer, lenp, ppos);
+ if (ret)
+ return ret;
if (write)
ret = addrconf_fixup_forwarding(ctl, valp, val);
--
2.54.0
^ permalink raw reply related
* [PATCH net 5/6] ipv6: reset value and position for proxy_ndp sysctl restart
From: Fernando Fernandez Mancera @ 2026-06-18 16:22 UTC (permalink / raw)
To: netdev
Cc: nicolas.dichtel, shemminger, dforster, gospo, ddutt, brian.haley,
horms, pabeni, kuba, edumazet, davem, idosch, dsahern,
Fernando Fernandez Mancera
In-Reply-To: <20260618162225.4588-1-fmancera@suse.de>
When handling proxy_ndp, if rtnl_net_trylock() fails, the operation is
retried but as the value was already modified by the initial
proc_dointvec() call, the restarted syscall will read the newly modified
value as the 'old' state.
Fix this by restoring the original value and position pointer before
restarting the syscall.
Fixes: c92d5491a6d9 ("netconf: add support for IPv6 proxy_ndp")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
---
net/ipv6/addrconf.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 8ff015975e27..1cfb223476bd 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -6483,8 +6483,9 @@ static int addrconf_sysctl_proxy_ndp(const struct ctl_table *ctl, int write,
void *buffer, size_t *lenp, loff_t *ppos)
{
int *valp = ctl->data;
- int ret;
+ loff_t pos = *ppos;
int old, new;
+ int ret;
old = *valp;
ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
@@ -6493,8 +6494,12 @@ static int addrconf_sysctl_proxy_ndp(const struct ctl_table *ctl, int write,
if (write && old != new) {
struct net *net = ctl->extra2;
- if (!rtnl_net_trylock(net))
+ if (!rtnl_net_trylock(net)) {
+ /* Restore the original values before restarting */
+ *valp = old;
+ *ppos = pos;
return restart_syscall();
+ }
if (valp == &net->ipv6.devconf_dflt->proxy_ndp) {
inet6_netconf_notify_devconf(net, RTM_NEWNETCONF,
--
2.54.0
^ permalink raw reply related
* [PATCH net 6/6] ipv6: fix missing notification for ignore_routes_with_linkdown
From: Fernando Fernandez Mancera @ 2026-06-18 16:22 UTC (permalink / raw)
To: netdev
Cc: nicolas.dichtel, shemminger, dforster, gospo, ddutt, brian.haley,
horms, pabeni, kuba, edumazet, davem, idosch, dsahern,
Fernando Fernandez Mancera
In-Reply-To: <20260618162225.4588-1-fmancera@suse.de>
When changing the ignore_routes_with_linkdown sysctl for a specific
interface, the RTM_NEWNETCONF netlink notification was not being emitted
to userspace. Fix this by emitting the notification when needed.
In addition, fix bogus return value for successful "all" and specific
interface write operation leading to a wrong reset of the position
pointer.
Fixes: 35103d11173b ("net: ipv6 sysctl option to ignore routes when nexthop link is down")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
---
net/ipv6/addrconf.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 1cfb223476bd..2d81569b692b 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -955,11 +955,7 @@ static int addrconf_fixup_linkdown(const struct ctl_table *table, int *p, int ne
NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN,
NETCONFA_IFINDEX_DEFAULT,
net->ipv6.devconf_dflt);
- rtnl_net_unlock(net);
- return 0;
- }
-
- if (p == &net->ipv6.devconf_all->ignore_routes_with_linkdown) {
+ } else if (p == &net->ipv6.devconf_all->ignore_routes_with_linkdown) {
WRITE_ONCE(net->ipv6.devconf_dflt->ignore_routes_with_linkdown, newf);
addrconf_linkdown_change(net, newf);
if ((!newf) ^ (!old))
@@ -968,11 +964,21 @@ static int addrconf_fixup_linkdown(const struct ctl_table *table, int *p, int ne
NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN,
NETCONFA_IFINDEX_ALL,
net->ipv6.devconf_all);
+ } else {
+ if (!newf ^ !old) {
+ struct inet6_dev *idev = table->extra1;
+
+ inet6_netconf_notify_devconf(net,
+ RTM_NEWNETCONF,
+ NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN,
+ idev->dev->ifindex,
+ &idev->cnf);
+ }
}
rtnl_net_unlock(net);
- return 1;
+ return 0;
}
#endif
--
2.54.0
^ permalink raw reply related
* [PATCH 1/1] xfrm: nat_keepalive: avoid double free on send error
From: Ren Wei @ 2026-06-18 16:36 UTC (permalink / raw)
To: netdev
Cc: steffen.klassert, herbert, davem, eyal.birger, yuantan098, bird,
qianyuluo3, n05ec
In-Reply-To: <cover.1781788385.git.qianyuluo3@gmail.com>
From: Qianyu Luo <qianyuluo3@gmail.com>
nat_keepalive_send() frees the keepalive skb whenever the IPv4 or IPv6
send helper reports an error.
That cleanup is only correct before the skb is handed to the output
path. Once ip_build_and_send_pkt() or ip6_xmit() takes ownership, the
networking stack may already have consumed the skb before returning an
error, so freeing it again is unsafe.
Handle the pre-handoff failure cases inside nat_keepalive_send_ipv4()
and nat_keepalive_send_ipv6(), where the caller still owns the skb, and
keep nat_keepalive_send() responsible only for family dispatch and the
unsupported-family cleanup path.
Fixes: f531d13bdfe3 ("xfrm: support sending NAT keepalives in ESP in UDP states")
Cc: stable@vger.kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Qianyu Luo <qianyuluo3@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
---
net/xfrm/xfrm_nat_keepalive.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/net/xfrm/xfrm_nat_keepalive.c b/net/xfrm/xfrm_nat_keepalive.c
index 458931062a04..f71328096f11 100644
--- a/net/xfrm/xfrm_nat_keepalive.c
+++ b/net/xfrm/xfrm_nat_keepalive.c
@@ -55,8 +55,10 @@ static int nat_keepalive_send_ipv4(struct sk_buff *skb,
ka->encap_sport, sock_net_uid(net, NULL));
rt = ip_route_output_key(net, &fl4);
- if (IS_ERR(rt))
+ if (IS_ERR(rt)) {
+ kfree_skb(skb);
return PTR_ERR(rt);
+ }
skb_dst_set(skb, &rt->dst);
@@ -100,6 +102,7 @@ static int nat_keepalive_send_ipv6(struct sk_buff *skb,
sock_net_set(sk, net);
dst = ip6_dst_lookup_flow(net, sk, &fl6, NULL);
if (IS_ERR(dst)) {
+ kfree_skb(skb);
local_unlock_nested_bh(&nat_keepalive_sk_ipv6.bh_lock);
return PTR_ERR(dst);
}
@@ -118,7 +121,6 @@ static void nat_keepalive_send(struct nat_keepalive *ka)
sizeof(struct ipv6hdr)) +
sizeof(struct udphdr);
const u8 nat_ka_payload = 0xFF;
- int err = -EAFNOSUPPORT;
struct sk_buff *skb;
struct udphdr *uh;
@@ -140,16 +142,17 @@ static void nat_keepalive_send(struct nat_keepalive *ka)
switch (ka->family) {
case AF_INET:
- err = nat_keepalive_send_ipv4(skb, ka);
+ nat_keepalive_send_ipv4(skb, ka);
break;
#if IS_ENABLED(CONFIG_IPV6)
case AF_INET6:
- err = nat_keepalive_send_ipv6(skb, ka, uh);
+ nat_keepalive_send_ipv6(skb, ka, uh);
break;
#endif
- }
- if (err)
+ default:
kfree_skb(skb);
+ break;
+ }
}
struct nat_keepalive_work_ctx {
--
2.43.7
^ permalink raw reply related
* [PATCH 00/10] rust: driver: use pointers instead of indices for ID info
From: Gary Guo @ 2026-06-18 17:03 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Daniel Almeida,
Tamir Duberstein, Alexandre Courbot, Onur Özkan,
FUJITA Tomonori, David Airlie, Simona Vetter, Bjorn Helgaas,
Krzysztof Wilczyński, Abdiel Janulgue, Robin Murphy,
Dave Ertman, Ira Weiny, Leon Romanovsky, Len Brown, Igor Korotin,
Rob Herring, Saravana Kannan, Viresh Kumar, Michal Wilczynski,
Drew Fustini, Guo Ren, Fu Wei, Uwe Kleine-König
Cc: driver-core, rust-for-linux, linux-kernel, netdev, nova-gpu,
dri-devel, linux-pci, linux-acpi, devicetree, linux-pm,
linux-riscv, linux-pwm, Gary Guo
Most C drivers use a pointer (and cast to kernel_ulong_t) for driver_data
fields in device_id. Rust code currently does not do this, but rather use
indices. These indices then needs to be translated to `&IdInfo` separately
and this is by a side table.
This leads to open-coded ACPI/OF handling in driver.rs, which is not
desirable. Convert the code to use pointers (or rather, static references)
instead.
Signed-off-by: Gary Guo <gary@garyguo.net>
---
Gary Guo (10):
rust: driver: remove `IdTable::id`
rust: driver: simplify `IdArray::new_without_index`
rust: pci: use `Option<&IdInfo>` for device ID info
rust: net/phy: remove expansion from doc
rust: driver: centralize device ID handling
rust: driver: remove `$module_table_name` from `module_device_table`
rust: driver: store pointers in `DeviceId`
rust: driver: remove open-coded matching logic
rust: driver: remove duplicate ID table
RFC: rust: driver: support map-like syntax for ID table
drivers/cpufreq/rcpufreq_dt.rs | 1 -
drivers/gpu/drm/nova/driver.rs | 1 -
drivers/gpu/drm/tyr/driver.rs | 1 -
drivers/gpu/nova-core/driver.rs | 3 +-
drivers/pwm/pwm_th1520.rs | 1 -
rust/kernel/acpi.rs | 14 +--
rust/kernel/auxiliary.rs | 18 +--
rust/kernel/device_id.rs | 205 +++++++++++++++++++---------------
rust/kernel/driver.rs | 114 ++-----------------
rust/kernel/i2c.rs | 26 ++---
rust/kernel/net/phy.rs | 66 +----------
rust/kernel/of.rs | 14 +--
rust/kernel/pci.rs | 24 ++--
rust/kernel/platform.rs | 5 +-
rust/kernel/usb.rs | 18 +--
samples/rust/rust_debugfs.rs | 1 -
samples/rust/rust_dma.rs | 3 +-
samples/rust/rust_driver_auxiliary.rs | 4 +-
samples/rust/rust_driver_i2c.rs | 3 -
samples/rust/rust_driver_pci.rs | 11 +-
samples/rust/rust_driver_platform.rs | 2 -
samples/rust/rust_driver_usb.rs | 1 -
samples/rust/rust_i2c_client.rs | 2 -
samples/rust/rust_soc.rs | 2 -
24 files changed, 166 insertions(+), 374 deletions(-)
---
base-commit: 4fa3f5fabb30bf00d7475d5a33459ea83d639bf9
change-id: 20260612-id_info-23eca472ccd8
Best regards,
--
Gary Guo <gary@garyguo.net>
^ permalink raw reply
* [PATCH 02/10] rust: driver: simplify `IdArray::new_without_index`
From: Gary Guo @ 2026-06-18 17:03 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Daniel Almeida,
Tamir Duberstein, Alexandre Courbot, Onur Özkan,
FUJITA Tomonori, David Airlie, Simona Vetter, Bjorn Helgaas,
Krzysztof Wilczyński, Abdiel Janulgue, Robin Murphy,
Dave Ertman, Ira Weiny, Leon Romanovsky, Len Brown, Igor Korotin,
Rob Herring, Saravana Kannan, Viresh Kumar, Michal Wilczynski,
Drew Fustini, Guo Ren, Fu Wei, Uwe Kleine-König
Cc: driver-core, rust-for-linux, linux-kernel, netdev, nova-gpu,
dri-devel, linux-pci, linux-acpi, devicetree, linux-pm,
linux-riscv, linux-pwm, Gary Guo
In-Reply-To: <20260618-id_info-v1-0-96af1e559ef9@garyguo.net>
This method can very easily construct the `IdArray` on its own without
delegating to `Self::build`. Doing so also simplifies the phy device table
macro because it does not need to construct tuples anymore.
This also allows simplification of `new` and `build` which removes the
`unsafe`.
Signed-off-by: Gary Guo <gary@garyguo.net>
---
rust/kernel/device_id.rs | 64 ++++++++++++++++++++----------------------------
rust/kernel/net/phy.rs | 2 +-
2 files changed, 28 insertions(+), 38 deletions(-)
diff --git a/rust/kernel/device_id.rs b/rust/kernel/device_id.rs
index fbf6d8e6afb9..eeef3f5e7b63 100644
--- a/rust/kernel/device_id.rs
+++ b/rust/kernel/device_id.rs
@@ -73,19 +73,11 @@ pub struct IdArray<T: RawDeviceId, U, const N: usize> {
id_infos: [U; N],
}
-impl<T: RawDeviceId, U, const N: usize> IdArray<T, U, N> {
+impl<T: RawDeviceId + RawDeviceIdIndex, U, const N: usize> IdArray<T, U, N> {
/// Creates a new instance of the array.
///
/// The contents are derived from the given identifiers and context information.
- ///
- /// # Safety
- ///
- /// `data_offset` as `None` is always safe.
- /// If `data_offset` is `Some(data_offset)`, then:
- /// - `data_offset` must be the correct offset (in bytes) to the context/data field
- /// (e.g., the `driver_data` field) within the raw device ID structure.
- /// - The field at `data_offset` must be correctly sized to hold a `usize`.
- const unsafe fn build(ids: [(T, U); N], data_offset: Option<usize>) -> Self {
+ pub const fn new(ids: [(T, U); N]) -> Self {
let mut raw_ids = [const { MaybeUninit::<T::RawType>::uninit() }; N];
let mut infos = [const { MaybeUninit::uninit() }; N];
@@ -94,16 +86,14 @@ impl<T: RawDeviceId, U, const N: usize> IdArray<T, U, N> {
// SAFETY: by the safety requirement of `RawDeviceId`, we're guaranteed that `T` is
// layout-wise compatible with `RawType`.
raw_ids[i] = unsafe { core::mem::transmute_copy(&ids[i].0) };
- if let Some(data_offset) = data_offset {
- // SAFETY: by the safety requirement of this function, this would be effectively
- // `raw_ids[i].driver_data = i;`.
- unsafe {
- raw_ids[i]
- .as_mut_ptr()
- .byte_add(data_offset)
- .cast::<usize>()
- .write(i);
- }
+ // SAFETY: by the safety requirement of `RawDeviceIdIndex`, this would be effectively
+ // `raw_ids[i].driver_data = i;`.
+ unsafe {
+ raw_ids[i]
+ .as_mut_ptr()
+ .byte_add(T::DRIVER_DATA_OFFSET)
+ .cast::<usize>()
+ .write(i);
}
// SAFETY: this is effectively a move: `infos[i] = ids[i].1`. We make a copy here but
@@ -127,32 +117,32 @@ impl<T: RawDeviceId, U, const N: usize> IdArray<T, U, N> {
id_infos: unsafe { core::mem::transmute_copy(&infos) },
}
}
+}
- /// Creates a new instance of the array without writing index values.
- ///
- /// The contents are derived from the given identifiers and context information.
- /// If the device implements [`RawDeviceIdIndex`], consider using [`IdArray::new`] instead.
- pub const fn new_without_index(ids: [(T, U); N]) -> Self {
- // SAFETY: Calling `Self::build` with `offset = None` is always safe,
- // because no raw memory writes are performed in this case.
- unsafe { Self::build(ids, None) }
- }
-
+impl<T: RawDeviceId, U, const N: usize> IdArray<T, U, N> {
/// Reference to the contained [`RawIdArray`].
pub const fn raw_ids(&self) -> &RawIdArray<T, N> {
&self.raw_ids
}
}
-impl<T: RawDeviceId + RawDeviceIdIndex, U, const N: usize> IdArray<T, U, N> {
- /// Creates a new instance of the array.
+impl<T: RawDeviceId, const N: usize> IdArray<T, (), N> {
+ /// Creates a new instance of the array without writing index values.
///
/// The contents are derived from the given identifiers and context information.
- pub const fn new(ids: [(T, U); N]) -> Self {
- // SAFETY: by the safety requirement of `RawDeviceIdIndex`,
- // `T::DRIVER_DATA_OFFSET` is guaranteed to be the correct offset (in bytes) to
- // a field within `T::RawType`.
- unsafe { Self::build(ids, Some(T::DRIVER_DATA_OFFSET)) }
+ /// If the device implements [`RawDeviceIdIndex`], consider using [`IdArray::new`] instead.
+ pub const fn new_without_index(ids: [T; N]) -> Self {
+ // SAFETY: `T` is layout-wise compatible with `T::RawType`, so is the array of them.
+ let raw_ids: [T::RawType; N] = unsafe { core::mem::transmute_copy(&ids) };
+ core::mem::forget(ids);
+
+ Self {
+ raw_ids: RawIdArray {
+ ids: raw_ids,
+ sentinel: MaybeUninit::zeroed(),
+ },
+ id_infos: [(); N],
+ }
}
}
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index 3ca99db5cccf..2868e3a9e02c 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -868,7 +868,7 @@ macro_rules! module_phy_driver {
const N: usize = $crate::module_phy_driver!(@count_devices $($dev),+);
const TABLE: $crate::device_id::IdArray<$crate::net::phy::DeviceId, (), N> =
- $crate::device_id::IdArray::new_without_index([ $(($dev,())),+, ]);
+ $crate::device_id::IdArray::new_without_index([ $($dev),+, ]);
$crate::module_device_table!("mdio", phydev, TABLE);
};
--
2.54.0
^ permalink raw reply related
* [PATCH 01/10] rust: driver: remove `IdTable::id`
From: Gary Guo @ 2026-06-18 17:03 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Daniel Almeida,
Tamir Duberstein, Alexandre Courbot, Onur Özkan,
FUJITA Tomonori, David Airlie, Simona Vetter, Bjorn Helgaas,
Krzysztof Wilczyński, Abdiel Janulgue, Robin Murphy,
Dave Ertman, Ira Weiny, Leon Romanovsky, Len Brown, Igor Korotin,
Rob Herring, Saravana Kannan, Viresh Kumar, Michal Wilczynski,
Drew Fustini, Guo Ren, Fu Wei, Uwe Kleine-König
Cc: driver-core, rust-for-linux, linux-kernel, netdev, nova-gpu,
dri-devel, linux-pci, linux-acpi, devicetree, linux-pm,
linux-riscv, linux-pwm, Gary Guo
In-Reply-To: <20260618-id_info-v1-0-96af1e559ef9@garyguo.net>
This is unused.
Signed-off-by: Gary Guo <gary@garyguo.net>
---
rust/kernel/device_id.rs | 7 -------
1 file changed, 7 deletions(-)
diff --git a/rust/kernel/device_id.rs b/rust/kernel/device_id.rs
index 8e9721446014..fbf6d8e6afb9 100644
--- a/rust/kernel/device_id.rs
+++ b/rust/kernel/device_id.rs
@@ -166,9 +166,6 @@ pub trait IdTable<T: RawDeviceId, U> {
/// Obtain the pointer to the ID table.
fn as_ptr(&self) -> *const T::RawType;
- /// Obtain the pointer to the bus specific device ID from an index.
- fn id(&self, index: usize) -> &T::RawType;
-
/// Obtain the pointer to the driver-specific information from an index.
fn info(&self, index: usize) -> &U;
}
@@ -180,10 +177,6 @@ fn as_ptr(&self) -> *const T::RawType {
core::ptr::from_ref(self).cast()
}
- fn id(&self, index: usize) -> &T::RawType {
- &self.raw_ids.ids[index]
- }
-
fn info(&self, index: usize) -> &U {
&self.id_infos[index]
}
--
2.54.0
^ permalink raw reply related
* [PATCH 03/10] rust: pci: use `Option<&IdInfo>` for device ID info
From: Gary Guo @ 2026-06-18 17:03 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Daniel Almeida,
Tamir Duberstein, Alexandre Courbot, Onur Özkan,
FUJITA Tomonori, David Airlie, Simona Vetter, Bjorn Helgaas,
Krzysztof Wilczyński, Abdiel Janulgue, Robin Murphy,
Dave Ertman, Ira Weiny, Leon Romanovsky, Len Brown, Igor Korotin,
Rob Herring, Saravana Kannan, Viresh Kumar, Michal Wilczynski,
Drew Fustini, Guo Ren, Fu Wei, Uwe Kleine-König
Cc: driver-core, rust-for-linux, linux-kernel, netdev, nova-gpu,
dri-devel, linux-pci, linux-acpi, devicetree, linux-pm,
linux-riscv, linux-pwm, Gary Guo
In-Reply-To: <20260618-id_info-v1-0-96af1e559ef9@garyguo.net>
It is possible that `pci_device_id_any` will be passed to the driver, e.g.
`driver_override` is used on the device. Therefore, the driver must be able
to handle the case where `driver_data` is 0. Thus, update the `probe`
functions to get `Option`.
The current code cannot tell if the info does not exist or is the first
entry; however this will be achievable once the code is updated to use a
`&'static IdInfo` pointer instead of indices.
Signed-off-by: Gary Guo <gary@garyguo.net>
---
drivers/gpu/nova-core/driver.rs | 2 +-
rust/kernel/pci.rs | 6 +++---
samples/rust/rust_dma.rs | 2 +-
samples/rust/rust_driver_auxiliary.rs | 2 +-
samples/rust/rust_driver_pci.rs | 3 ++-
5 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 5738d4ac521b..5a5f0b63e0f3 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -70,7 +70,7 @@ impl pci::Driver for NovaCoreDriver {
fn probe<'bound>(
pdev: &'bound pci::Device<Core<'_>>,
- _info: &'bound Self::IdInfo,
+ _info: Option<&'bound Self::IdInfo>,
) -> impl PinInit<Self::Data<'bound>, Error> + 'bound {
pin_init::pin_init_scope(move || {
dev_dbg!(pdev, "Probe Nova Core GPU driver.\n");
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index 5071cae6543f..0e055e4df99e 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -113,7 +113,7 @@ extern "C" fn probe_callback(
let info = T::ID_TABLE.info(id.index());
from_result(|| {
- let data = T::probe(pdev, info);
+ let data = T::probe(pdev, Some(info));
pdev.as_ref().set_drvdata(data)?;
Ok(0)
@@ -284,7 +284,7 @@ macro_rules! pci_device_table {
///
/// fn probe<'bound>(
/// _pdev: &'bound pci::Device<Core<'_>>,
-/// _id_info: &'bound Self::IdInfo,
+/// _id_info: Option<&'bound Self::IdInfo>,
/// ) -> impl PinInit<Self::Data<'bound>, Error> + 'bound {
/// Err(ENODEV)
/// }
@@ -313,7 +313,7 @@ pub trait Driver {
/// attempt to initialize the device here.
fn probe<'bound>(
dev: &'bound Device<device::Core<'_>>,
- id_info: &'bound Self::IdInfo,
+ id_info: Option<&'bound Self::IdInfo>,
) -> impl PinInit<Self::Data<'bound>, Error> + 'bound;
/// PCI driver unbind.
diff --git a/samples/rust/rust_dma.rs b/samples/rust/rust_dma.rs
index 5046b4628d0e..9beb37275e0d 100644
--- a/samples/rust/rust_dma.rs
+++ b/samples/rust/rust_dma.rs
@@ -63,7 +63,7 @@ impl pci::Driver for DmaSampleDriver {
fn probe<'bound>(
pdev: &'bound pci::Device<Core<'_>>,
- _info: &'bound Self::IdInfo,
+ _info: Option<&'bound Self::IdInfo>,
) -> impl PinInit<Self, Error> + 'bound {
pin_init::pin_init_scope(move || {
dev_info!(pdev, "Probe DMA test driver.\n");
diff --git a/samples/rust/rust_driver_auxiliary.rs b/samples/rust/rust_driver_auxiliary.rs
index 2c1351040e45..73c63afc046a 100644
--- a/samples/rust/rust_driver_auxiliary.rs
+++ b/samples/rust/rust_driver_auxiliary.rs
@@ -79,7 +79,7 @@ impl pci::Driver for ParentDriver {
fn probe<'bound>(
pdev: &'bound pci::Device<Core<'_>>,
- _info: &'bound Self::IdInfo,
+ _info: Option<&'bound Self::IdInfo>,
) -> impl PinInit<Self::Data<'bound>, Error> + 'bound {
Ok(ParentData {
// SAFETY: `ParentData` is the driver's private data, which is dropped when the
diff --git a/samples/rust/rust_driver_pci.rs b/samples/rust/rust_driver_pci.rs
index 1aa8197d8698..5547dd704a1b 100644
--- a/samples/rust/rust_driver_pci.rs
+++ b/samples/rust/rust_driver_pci.rs
@@ -144,7 +144,7 @@ impl pci::Driver for SampleDriver {
fn probe<'bound>(
pdev: &'bound pci::Device<Core<'_>>,
- info: &'bound Self::IdInfo,
+ info: Option<&'bound Self::IdInfo>,
) -> impl PinInit<Self::Data<'bound>, Error> + 'bound {
let vendor = pdev.vendor_id();
dev_dbg!(
@@ -153,6 +153,7 @@ fn probe<'bound>(
vendor,
pdev.device_id()
);
+ let info = info.ok_or(ENODEV)?;
pdev.enable_device_mem()?;
pdev.set_master();
--
2.54.0
^ permalink raw reply related
* [PATCH 04/10] rust: net/phy: remove expansion from doc
From: Gary Guo @ 2026-06-18 17:03 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Daniel Almeida,
Tamir Duberstein, Alexandre Courbot, Onur Özkan,
FUJITA Tomonori, David Airlie, Simona Vetter, Bjorn Helgaas,
Krzysztof Wilczyński, Abdiel Janulgue, Robin Murphy,
Dave Ertman, Ira Weiny, Leon Romanovsky, Len Brown, Igor Korotin,
Rob Herring, Saravana Kannan, Viresh Kumar, Michal Wilczynski,
Drew Fustini, Guo Ren, Fu Wei, Uwe Kleine-König
Cc: driver-core, rust-for-linux, linux-kernel, netdev, nova-gpu,
dri-devel, linux-pci, linux-acpi, devicetree, linux-pm,
linux-riscv, linux-pwm, Gary Guo
In-Reply-To: <20260618-id_info-v1-0-96af1e559ef9@garyguo.net>
The expansion serves little purpose and it can easily diverge.
Signed-off-by: Gary Guo <gary@garyguo.net>
---
rust/kernel/net/phy.rs | 56 --------------------------------------------------
1 file changed, 56 deletions(-)
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index 2868e3a9e02c..965ecca7d55f 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -800,62 +800,6 @@ const fn as_int(&self) -> u32 {
/// }
/// # }
/// ```
-///
-/// This expands to the following code:
-///
-/// ```ignore
-/// use kernel::net::phy::{self, DeviceId};
-/// use kernel::prelude::*;
-///
-/// struct Module {
-/// _reg: ::kernel::net::phy::Registration,
-/// }
-///
-/// module! {
-/// type: Module,
-/// name: "rust_sample_phy",
-/// authors: ["Rust for Linux Contributors"],
-/// description: "Rust sample PHYs driver",
-/// license: "GPL",
-/// }
-///
-/// struct PhySample;
-///
-/// #[vtable]
-/// impl phy::Driver for PhySample {
-/// const NAME: &'static CStr = c"PhySample";
-/// const PHY_DEVICE_ID: phy::DeviceId = phy::DeviceId::new_with_exact_mask(0x00000001);
-/// }
-///
-/// const _: () = {
-/// static mut DRIVERS: [::kernel::net::phy::DriverVTable; 1] =
-/// [::kernel::net::phy::create_phy_driver::<PhySample>()];
-///
-/// impl ::kernel::Module for Module {
-/// fn init(module: &'static ::kernel::ThisModule) -> Result<Self> {
-/// let drivers = unsafe { &mut DRIVERS };
-/// let mut reg = ::kernel::net::phy::Registration::register(
-/// module,
-/// ::core::pin::Pin::static_mut(drivers),
-/// )?;
-/// Ok(Module { _reg: reg })
-/// }
-/// }
-/// };
-///
-/// const N: usize = 1;
-///
-/// const TABLE: ::kernel::device_id::IdArray<::kernel::net::phy::DeviceId, (), N> =
-/// ::kernel::device_id::IdArray::new_without_index([
-/// ::kernel::net::phy::DeviceId(
-/// ::kernel::bindings::mdio_device_id {
-/// phy_id: 0x00000001,
-/// phy_id_mask: 0xffffffff,
-/// }),
-/// ]);
-///
-/// ::kernel::module_device_table!("mdio", phydev, TABLE);
-/// ```
#[macro_export]
macro_rules! module_phy_driver {
(@replace_expr $_t:tt $sub:expr) => {$sub};
--
2.54.0
^ permalink raw reply related
* [PATCH 05/10] rust: driver: centralize device ID handling
From: Gary Guo @ 2026-06-18 17:03 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Daniel Almeida,
Tamir Duberstein, Alexandre Courbot, Onur Özkan,
FUJITA Tomonori, David Airlie, Simona Vetter, Bjorn Helgaas,
Krzysztof Wilczyński, Abdiel Janulgue, Robin Murphy,
Dave Ertman, Ira Weiny, Leon Romanovsky, Len Brown, Igor Korotin,
Rob Herring, Saravana Kannan, Viresh Kumar, Michal Wilczynski,
Drew Fustini, Guo Ren, Fu Wei, Uwe Kleine-König
Cc: driver-core, rust-for-linux, linux-kernel, netdev, nova-gpu,
dri-devel, linux-pci, linux-acpi, devicetree, linux-pm,
linux-riscv, linux-pwm, Gary Guo
In-Reply-To: <20260618-id_info-v1-0-96af1e559ef9@garyguo.net>
Move the `IdArray` creation from individual buses to be handled by shared
code in `device_id.rs`.
Signed-off-by: Gary Guo <gary@garyguo.net>
---
rust/kernel/acpi.rs | 10 ++--------
rust/kernel/auxiliary.rs | 10 ++--------
rust/kernel/device_id.rs | 31 ++++++++++++++++++++++++++++++-
rust/kernel/i2c.rs | 10 ++--------
rust/kernel/net/phy.rs | 10 ++++------
rust/kernel/of.rs | 10 ++--------
rust/kernel/pci.rs | 10 ++--------
rust/kernel/usb.rs | 10 ++--------
8 files changed, 46 insertions(+), 55 deletions(-)
diff --git a/rust/kernel/acpi.rs b/rust/kernel/acpi.rs
index 9b8efa623130..315f2f2af446 100644
--- a/rust/kernel/acpi.rs
+++ b/rust/kernel/acpi.rs
@@ -53,13 +53,7 @@ pub const fn new(id: &'static CStr) -> Self {
/// Create an ACPI `IdTable` with an "alias" for modpost.
#[macro_export]
macro_rules! acpi_device_table {
- ($table_name:ident, $module_table_name:ident, $id_info_type: ty, $table_data: expr) => {
- const $table_name: $crate::device_id::IdArray<
- $crate::acpi::DeviceId,
- $id_info_type,
- { $table_data.len() },
- > = $crate::device_id::IdArray::new($table_data);
-
- $crate::module_device_table!("acpi", $module_table_name, $table_name);
+ ($($tt:tt)*) => {
+ $crate::module_device_table!("acpi", $crate::acpi::DeviceId, $($tt)*);
};
}
diff --git a/rust/kernel/auxiliary.rs b/rust/kernel/auxiliary.rs
index c42928d5a239..59787c9bff26 100644
--- a/rust/kernel/auxiliary.rs
+++ b/rust/kernel/auxiliary.rs
@@ -181,14 +181,8 @@ fn index(&self) -> usize {
/// Create a auxiliary `IdTable` with its alias for modpost.
#[macro_export]
macro_rules! auxiliary_device_table {
- ($table_name:ident, $module_table_name:ident, $id_info_type: ty, $table_data: expr) => {
- const $table_name: $crate::device_id::IdArray<
- $crate::auxiliary::DeviceId,
- $id_info_type,
- { $table_data.len() },
- > = $crate::device_id::IdArray::new($table_data);
-
- $crate::module_device_table!("auxiliary", $module_table_name, $table_name);
+ ($($tt:tt)*) => {
+ $crate::module_device_table!("auxiliary", $crate::auxiliary::DeviceId, $($tt)*);
};
}
diff --git a/rust/kernel/device_id.rs b/rust/kernel/device_id.rs
index eeef3f5e7b63..0239f89d5f69 100644
--- a/rust/kernel/device_id.rs
+++ b/rust/kernel/device_id.rs
@@ -175,7 +175,36 @@ fn info(&self, index: usize) -> &U {
/// Create device table alias for modpost.
#[macro_export]
macro_rules! module_device_table {
- ($table_type: literal, $module_table_name:ident, $table_name:ident) => {
+ (
+ $table_type: literal, $device_id_ty: ty,
+ $table_name: ident, $module_table_name: ident, $id_info_type: ty,
+ [$(($id: expr, $info:expr $(,)?)),* $(,)?]
+ ) => {
+ const $table_name: $crate::device_id::IdArray<
+ $device_id_ty,
+ $id_info_type,
+ { <[$device_id_ty]>::len(&[$($id,)*]) },
+ > = $crate::device_id::IdArray::new([$(($id, $info),)*]);
+
+ $crate::module_device_table!($table_type, $module_table_name, $table_name);
+ };
+
+ // Case for no ID info.
+ (
+ $table_type: literal, $device_id_ty: ty,
+ $table_name: ident, $module_table_name: ident, @none,
+ [$($id: expr),* $(,)?]
+ ) => {
+ const $table_name: $crate::device_id::IdArray<
+ $device_id_ty,
+ (),
+ { <[$device_id_ty]>::len(&[$($id,)*]) },
+ > = $crate::device_id::IdArray::new_without_index([$($id),*]);
+
+ $crate::module_device_table!($table_type, $module_table_name, $table_name);
+ };
+
+ ($table_type: literal, $module_table_name: ident, $table_name:ident) => {
#[rustfmt::skip]
#[export_name =
concat!("__mod_device_table__", line!(),
diff --git a/rust/kernel/i2c.rs b/rust/kernel/i2c.rs
index 7655d61daac8..a7d9b88ae616 100644
--- a/rust/kernel/i2c.rs
+++ b/rust/kernel/i2c.rs
@@ -77,14 +77,8 @@ fn index(&self) -> usize {
/// Create a I2C `IdTable` with its alias for modpost.
#[macro_export]
macro_rules! i2c_device_table {
- ($table_name:ident, $module_table_name:ident, $id_info_type: ty, $table_data: expr) => {
- const $table_name: $crate::device_id::IdArray<
- $crate::i2c::DeviceId,
- $id_info_type,
- { $table_data.len() },
- > = $crate::device_id::IdArray::new($table_data);
-
- $crate::module_device_table!("i2c", $module_table_name, $table_name);
+ ($($tt:tt)*) => {
+ $crate::module_device_table!("i2c", $crate::i2c::DeviceId, $($tt)*);
};
}
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index 965ecca7d55f..166572861e61 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -809,12 +809,10 @@ macro_rules! module_phy_driver {
};
(@device_table [$($dev:expr),+]) => {
- const N: usize = $crate::module_phy_driver!(@count_devices $($dev),+);
-
- const TABLE: $crate::device_id::IdArray<$crate::net::phy::DeviceId, (), N> =
- $crate::device_id::IdArray::new_without_index([ $($dev),+, ]);
-
- $crate::module_device_table!("mdio", phydev, TABLE);
+ $crate::module_device_table!(
+ "mdio", $crate::net::phy::DeviceId,
+ phydev, TABLE, @none, [$($dev),+]
+ );
};
(drivers: [$($driver:ident),+ $(,)?], device_table: [$($dev:expr),+ $(,)?], $($f:tt)*) => {
diff --git a/rust/kernel/of.rs b/rust/kernel/of.rs
index 58b20c367f99..35aa6d36d309 100644
--- a/rust/kernel/of.rs
+++ b/rust/kernel/of.rs
@@ -53,13 +53,7 @@ pub const fn new(compatible: &'static CStr) -> Self {
/// Create an OF `IdTable` with an "alias" for modpost.
#[macro_export]
macro_rules! of_device_table {
- ($table_name:ident, $module_table_name:ident, $id_info_type: ty, $table_data: expr) => {
- const $table_name: $crate::device_id::IdArray<
- $crate::of::DeviceId,
- $id_info_type,
- { $table_data.len() },
- > = $crate::device_id::IdArray::new($table_data);
-
- $crate::module_device_table!("of", $module_table_name, $table_name);
+ ($($tt:tt)*) => {
+ $crate::module_device_table!("of", $crate::of::DeviceId, $($tt)*);
};
}
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index 0e055e4df99e..34e07a53244d 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -245,14 +245,8 @@ fn index(&self) -> usize {
/// Create a PCI `IdTable` with its alias for modpost.
#[macro_export]
macro_rules! pci_device_table {
- ($table_name:ident, $module_table_name:ident, $id_info_type: ty, $table_data: expr) => {
- const $table_name: $crate::device_id::IdArray<
- $crate::pci::DeviceId,
- $id_info_type,
- { $table_data.len() },
- > = $crate::device_id::IdArray::new($table_data);
-
- $crate::module_device_table!("pci", $module_table_name, $table_name);
+ ($($tt:tt)*) => {
+ $crate::module_device_table!("pci", $crate::pci::DeviceId, $($tt)*);
};
}
diff --git a/rust/kernel/usb.rs b/rust/kernel/usb.rs
index 7aff0c82d0af..154919ee1e19 100644
--- a/rust/kernel/usb.rs
+++ b/rust/kernel/usb.rs
@@ -254,14 +254,8 @@ fn index(&self) -> usize {
/// Create a USB `IdTable` with its alias for modpost.
#[macro_export]
macro_rules! usb_device_table {
- ($table_name:ident, $module_table_name:ident, $id_info_type: ty, $table_data: expr) => {
- const $table_name: $crate::device_id::IdArray<
- $crate::usb::DeviceId,
- $id_info_type,
- { $table_data.len() },
- > = $crate::device_id::IdArray::new($table_data);
-
- $crate::module_device_table!("usb", $module_table_name, $table_name);
+ ($($tt:tt)*) => {
+ $crate::module_device_table!("usb", $crate::usb::DeviceId, $($tt)*);
};
}
--
2.54.0
^ permalink raw reply related
* [PATCH 06/10] rust: driver: remove `$module_table_name` from `module_device_table`
From: Gary Guo @ 2026-06-18 17:03 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Daniel Almeida,
Tamir Duberstein, Alexandre Courbot, Onur Özkan,
FUJITA Tomonori, David Airlie, Simona Vetter, Bjorn Helgaas,
Krzysztof Wilczyński, Abdiel Janulgue, Robin Murphy,
Dave Ertman, Ira Weiny, Leon Romanovsky, Len Brown, Igor Korotin,
Rob Herring, Saravana Kannan, Viresh Kumar, Michal Wilczynski,
Drew Fustini, Guo Ren, Fu Wei, Uwe Kleine-König
Cc: driver-core, rust-for-linux, linux-kernel, netdev, nova-gpu,
dri-devel, linux-pci, linux-acpi, devicetree, linux-pm,
linux-riscv, linux-pwm, Gary Guo
In-Reply-To: <20260618-id_info-v1-0-96af1e559ef9@garyguo.net>
Wrap the generated code in a `const _: ()` block to avoid symbol conflict.
This removes the need of creating a new identifier.
Signed-off-by: Gary Guo <gary@garyguo.net>
---
drivers/cpufreq/rcpufreq_dt.rs | 1 -
drivers/gpu/drm/nova/driver.rs | 1 -
drivers/gpu/drm/tyr/driver.rs | 1 -
drivers/gpu/nova-core/driver.rs | 1 -
drivers/pwm/pwm_th1520.rs | 1 -
rust/kernel/device_id.rs | 30 ++++++++++++++++--------------
rust/kernel/i2c.rs | 3 ---
rust/kernel/net/phy.rs | 2 +-
rust/kernel/pci.rs | 1 -
rust/kernel/platform.rs | 2 --
rust/kernel/usb.rs | 1 -
samples/rust/rust_debugfs.rs | 1 -
samples/rust/rust_dma.rs | 1 -
samples/rust/rust_driver_auxiliary.rs | 2 --
samples/rust/rust_driver_i2c.rs | 3 ---
samples/rust/rust_driver_pci.rs | 1 -
samples/rust/rust_driver_platform.rs | 2 --
samples/rust/rust_driver_usb.rs | 1 -
samples/rust/rust_i2c_client.rs | 2 --
samples/rust/rust_soc.rs | 2 --
20 files changed, 17 insertions(+), 42 deletions(-)
diff --git a/drivers/cpufreq/rcpufreq_dt.rs b/drivers/cpufreq/rcpufreq_dt.rs
index 10106fa13095..145daa12072f 100644
--- a/drivers/cpufreq/rcpufreq_dt.rs
+++ b/drivers/cpufreq/rcpufreq_dt.rs
@@ -194,7 +194,6 @@ fn register_em(policy: &mut cpufreq::Policy) {
kernel::of_device_table!(
OF_TABLE,
- MODULE_OF_TABLE,
<CPUFreqDTDriver as platform::Driver>::IdInfo,
[(of::DeviceId::new(c"operating-points-v2"), ())]
);
diff --git a/drivers/gpu/drm/nova/driver.rs b/drivers/gpu/drm/nova/driver.rs
index 48933d86ddda..43f15cdfeb09 100644
--- a/drivers/gpu/drm/nova/driver.rs
+++ b/drivers/gpu/drm/nova/driver.rs
@@ -43,7 +43,6 @@ pub(crate) struct NovaData {
kernel::auxiliary_device_table!(
AUX_TABLE,
- MODULE_AUX_TABLE,
<NovaDriver as auxiliary::Driver>::IdInfo,
[(
auxiliary::DeviceId::new(NOVA_CORE_MODULE_NAME, AUXILIARY_NAME),
diff --git a/drivers/gpu/drm/tyr/driver.rs b/drivers/gpu/drm/tyr/driver.rs
index d063bc664cc1..218e9af899c7 100644
--- a/drivers/gpu/drm/tyr/driver.rs
+++ b/drivers/gpu/drm/tyr/driver.rs
@@ -87,7 +87,6 @@ fn issue_soft_reset(dev: &Device, iomem: &IoMem<'_>) -> Result {
kernel::of_device_table!(
OF_TABLE,
- MODULE_OF_TABLE,
<TyrPlatformDriver as platform::Driver>::IdInfo,
[
(of::DeviceId::new(c"rockchip,rk3588-mali"), ()),
diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 5a5f0b63e0f3..0c53b7239ac9 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -40,7 +40,6 @@ pub(crate) struct NovaCore<'bound> {
kernel::pci_device_table!(
PCI_TABLE,
- MODULE_PCI_TABLE,
<NovaCoreDriver as pci::Driver>::IdInfo,
[
// Modern NVIDIA GPUs will show up as either VGA or 3D controllers.
diff --git a/drivers/pwm/pwm_th1520.rs b/drivers/pwm/pwm_th1520.rs
index 3e3fa51ccef9..1df752330e8f 100644
--- a/drivers/pwm/pwm_th1520.rs
+++ b/drivers/pwm/pwm_th1520.rs
@@ -303,7 +303,6 @@ fn drop(self: Pin<&mut Self>) {
kernel::of_device_table!(
OF_TABLE,
- MODULE_OF_TABLE,
<Th1520PwmPlatformDriver as platform::Driver>::IdInfo,
[(of::DeviceId::new(c"thead,th1520-pwm"), ())]
);
diff --git a/rust/kernel/device_id.rs b/rust/kernel/device_id.rs
index 0239f89d5f69..84852a2d9ad7 100644
--- a/rust/kernel/device_id.rs
+++ b/rust/kernel/device_id.rs
@@ -177,7 +177,7 @@ fn info(&self, index: usize) -> &U {
macro_rules! module_device_table {
(
$table_type: literal, $device_id_ty: ty,
- $table_name: ident, $module_table_name: ident, $id_info_type: ty,
+ $table_name: ident, $id_info_type: ty,
[$(($id: expr, $info:expr $(,)?)),* $(,)?]
) => {
const $table_name: $crate::device_id::IdArray<
@@ -186,13 +186,13 @@ macro_rules! module_device_table {
{ <[$device_id_ty]>::len(&[$($id,)*]) },
> = $crate::device_id::IdArray::new([$(($id, $info),)*]);
- $crate::module_device_table!($table_type, $module_table_name, $table_name);
+ $crate::module_device_table!($table_type, $table_name);
};
// Case for no ID info.
(
$table_type: literal, $device_id_ty: ty,
- $table_name: ident, $module_table_name: ident, @none,
+ $table_name: ident, @none,
[$($id: expr),* $(,)?]
) => {
const $table_name: $crate::device_id::IdArray<
@@ -201,18 +201,20 @@ macro_rules! module_device_table {
{ <[$device_id_ty]>::len(&[$($id,)*]) },
> = $crate::device_id::IdArray::new_without_index([$($id),*]);
- $crate::module_device_table!($table_type, $module_table_name, $table_name);
+ $crate::module_device_table!($table_type, $table_name);
};
- ($table_type: literal, $module_table_name: ident, $table_name:ident) => {
- #[rustfmt::skip]
- #[export_name =
- concat!("__mod_device_table__", line!(),
- "__kmod_", module_path!(),
- "__", $table_type,
- "__", stringify!($table_name))
- ]
- static $module_table_name: [::core::mem::MaybeUninit<u8>; $table_name.raw_ids().size()] =
- unsafe { ::core::mem::transmute_copy($table_name.raw_ids()) };
+ ($table_type: literal, $table_name:ident) => {
+ const _: () = {
+ #[rustfmt::skip]
+ #[export_name =
+ concat!("__mod_device_table__", line!(),
+ "__kmod_", module_path!(),
+ "__", $table_type,
+ "__", stringify!($table_name))
+ ]
+ static TABLE: [::core::mem::MaybeUninit<u8>; $table_name.raw_ids().size()] =
+ unsafe { ::core::mem::transmute_copy($table_name.raw_ids()) };
+ };
};
}
diff --git a/rust/kernel/i2c.rs b/rust/kernel/i2c.rs
index a7d9b88ae616..55c89ba3a82a 100644
--- a/rust/kernel/i2c.rs
+++ b/rust/kernel/i2c.rs
@@ -261,7 +261,6 @@ macro_rules! module_i2c_driver {
///
/// kernel::acpi_device_table!(
/// ACPI_TABLE,
-/// MODULE_ACPI_TABLE,
/// <MyDriver as i2c::Driver>::IdInfo,
/// [
/// (acpi::DeviceId::new(c"LNUXBEEF"), ())
@@ -270,7 +269,6 @@ macro_rules! module_i2c_driver {
///
/// kernel::i2c_device_table!(
/// I2C_TABLE,
-/// MODULE_I2C_TABLE,
/// <MyDriver as i2c::Driver>::IdInfo,
/// [
/// (i2c::DeviceId::new(c"rust_driver_i2c"), ())
@@ -279,7 +277,6 @@ macro_rules! module_i2c_driver {
///
/// kernel::of_device_table!(
/// OF_TABLE,
-/// MODULE_OF_TABLE,
/// <MyDriver as i2c::Driver>::IdInfo,
/// [
/// (of::DeviceId::new(c"test,device"), ())
diff --git a/rust/kernel/net/phy.rs b/rust/kernel/net/phy.rs
index 166572861e61..1e86b901c391 100644
--- a/rust/kernel/net/phy.rs
+++ b/rust/kernel/net/phy.rs
@@ -811,7 +811,7 @@ macro_rules! module_phy_driver {
(@device_table [$($dev:expr),+]) => {
$crate::module_device_table!(
"mdio", $crate::net::phy::DeviceId,
- phydev, TABLE, @none, [$($dev),+]
+ TABLE, @none, [$($dev),+]
);
};
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index 34e07a53244d..a3dd48f76353 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -261,7 +261,6 @@ macro_rules! pci_device_table {
///
/// kernel::pci_device_table!(
/// PCI_TABLE,
-/// MODULE_PCI_TABLE,
/// <MyDriver as pci::Driver>::IdInfo,
/// [
/// (
diff --git a/rust/kernel/platform.rs b/rust/kernel/platform.rs
index 9b362e0495d3..210a815925ce 100644
--- a/rust/kernel/platform.rs
+++ b/rust/kernel/platform.rs
@@ -176,7 +176,6 @@ macro_rules! module_platform_driver {
///
/// kernel::of_device_table!(
/// OF_TABLE,
-/// MODULE_OF_TABLE,
/// <MyDriver as platform::Driver>::IdInfo,
/// [
/// (of::DeviceId::new(c"test,device"), ())
@@ -185,7 +184,6 @@ macro_rules! module_platform_driver {
///
/// kernel::acpi_device_table!(
/// ACPI_TABLE,
-/// MODULE_ACPI_TABLE,
/// <MyDriver as platform::Driver>::IdInfo,
/// [
/// (acpi::DeviceId::new(c"LNUXBEEF"), ())
diff --git a/rust/kernel/usb.rs b/rust/kernel/usb.rs
index 154919ee1e19..500b5e0ba4ea 100644
--- a/rust/kernel/usb.rs
+++ b/rust/kernel/usb.rs
@@ -271,7 +271,6 @@ macro_rules! usb_device_table {
///
/// kernel::usb_device_table!(
/// USB_TABLE,
-/// MODULE_USB_TABLE,
/// <MyDriver as usb::Driver>::IdInfo,
/// [
/// (usb::DeviceId::from_id(0x1234, 0x5678), ()),
diff --git a/samples/rust/rust_debugfs.rs b/samples/rust/rust_debugfs.rs
index 1f59e08aaa4b..181fd98ae5b6 100644
--- a/samples/rust/rust_debugfs.rs
+++ b/samples/rust/rust_debugfs.rs
@@ -110,7 +110,6 @@ fn from_str(s: &str) -> Result<Self> {
kernel::acpi_device_table!(
ACPI_TABLE,
- MODULE_ACPI_TABLE,
<RustDebugFs as platform::Driver>::IdInfo,
[(acpi::DeviceId::new(c"LNUXBEEF"), ())]
);
diff --git a/samples/rust/rust_dma.rs b/samples/rust/rust_dma.rs
index 9beb37275e0d..80c309ce07e9 100644
--- a/samples/rust/rust_dma.rs
+++ b/samples/rust/rust_dma.rs
@@ -51,7 +51,6 @@ unsafe impl kernel::transmute::FromBytes for MyStruct {}
kernel::pci_device_table!(
PCI_TABLE,
- MODULE_PCI_TABLE,
<DmaSampleDriver as pci::Driver>::IdInfo,
[(pci::DeviceId::from_id(pci::Vendor::REDHAT, 0x5), ())]
);
diff --git a/samples/rust/rust_driver_auxiliary.rs b/samples/rust/rust_driver_auxiliary.rs
index 73c63afc046a..704567a072ba 100644
--- a/samples/rust/rust_driver_auxiliary.rs
+++ b/samples/rust/rust_driver_auxiliary.rs
@@ -24,7 +24,6 @@
kernel::auxiliary_device_table!(
AUX_TABLE,
- MODULE_AUX_TABLE,
<AuxiliaryDriver as auxiliary::Driver>::IdInfo,
[(auxiliary::DeviceId::new(MODULE_NAME, AUXILIARY_NAME), ())]
);
@@ -66,7 +65,6 @@ struct ParentData<'bound> {
kernel::pci_device_table!(
PCI_TABLE,
- MODULE_PCI_TABLE,
<ParentDriver as pci::Driver>::IdInfo,
[(pci::DeviceId::from_id(pci::Vendor::REDHAT, 0x5), ())]
);
diff --git a/samples/rust/rust_driver_i2c.rs b/samples/rust/rust_driver_i2c.rs
index ead8263a7d48..a0df0c6097c4 100644
--- a/samples/rust/rust_driver_i2c.rs
+++ b/samples/rust/rust_driver_i2c.rs
@@ -14,21 +14,18 @@
kernel::acpi_device_table! {
ACPI_TABLE,
- MODULE_ACPI_TABLE,
<SampleDriver as i2c::Driver>::IdInfo,
[(acpi::DeviceId::new(c"LNUXBEEF"), 0)]
}
kernel::i2c_device_table! {
I2C_TABLE,
- MODULE_I2C_TABLE,
<SampleDriver as i2c::Driver>::IdInfo,
[(i2c::DeviceId::new(c"rust_driver_i2c"), 0)]
}
kernel::of_device_table! {
OF_TABLE,
- MODULE_OF_TABLE,
<SampleDriver as i2c::Driver>::IdInfo,
[(of::DeviceId::new(c"test,rust_driver_i2c"), 0)]
}
diff --git a/samples/rust/rust_driver_pci.rs b/samples/rust/rust_driver_pci.rs
index 5547dd704a1b..2282191e6292 100644
--- a/samples/rust/rust_driver_pci.rs
+++ b/samples/rust/rust_driver_pci.rs
@@ -74,7 +74,6 @@ struct SampleDriverData<'bound> {
kernel::pci_device_table!(
PCI_TABLE,
- MODULE_PCI_TABLE,
<SampleDriver as pci::Driver>::IdInfo,
[(
pci::DeviceId::from_id(pci::Vendor::REDHAT, 0x5),
diff --git a/samples/rust/rust_driver_platform.rs b/samples/rust/rust_driver_platform.rs
index ec0d6cac4f57..710145b3605a 100644
--- a/samples/rust/rust_driver_platform.rs
+++ b/samples/rust/rust_driver_platform.rs
@@ -87,14 +87,12 @@ struct SampleDriver {
kernel::of_device_table!(
OF_TABLE,
- MODULE_OF_TABLE,
<SampleDriver as platform::Driver>::IdInfo,
[(of::DeviceId::new(c"test,rust-device"), Info(42))]
);
kernel::acpi_device_table!(
ACPI_TABLE,
- MODULE_ACPI_TABLE,
<SampleDriver as platform::Driver>::IdInfo,
[(acpi::DeviceId::new(c"LNUXBEEF"), Info(0))]
);
diff --git a/samples/rust/rust_driver_usb.rs b/samples/rust/rust_driver_usb.rs
index 02bd5085f9bc..284042c5969b 100644
--- a/samples/rust/rust_driver_usb.rs
+++ b/samples/rust/rust_driver_usb.rs
@@ -19,7 +19,6 @@ struct SampleDriver {
kernel::usb_device_table!(
USB_TABLE,
- MODULE_USB_TABLE,
<SampleDriver as usb::Driver>::IdInfo,
[(usb::DeviceId::from_id(0x1234, 0x5678), ()),]
);
diff --git a/samples/rust/rust_i2c_client.rs b/samples/rust/rust_i2c_client.rs
index 2d876f4e3ee0..c8a23875ef5b 100644
--- a/samples/rust/rust_i2c_client.rs
+++ b/samples/rust/rust_i2c_client.rs
@@ -87,14 +87,12 @@ struct SampleDriver {
kernel::of_device_table!(
OF_TABLE,
- MODULE_OF_TABLE,
<SampleDriver as platform::Driver>::IdInfo,
[(of::DeviceId::new(c"test,rust-device"), ())]
);
kernel::acpi_device_table!(
ACPI_TABLE,
- MODULE_ACPI_TABLE,
<SampleDriver as platform::Driver>::IdInfo,
[(acpi::DeviceId::new(c"LNUXBEEF"), ())]
);
diff --git a/samples/rust/rust_soc.rs b/samples/rust/rust_soc.rs
index 808d58200eb6..f5e5f2f9adf7 100644
--- a/samples/rust/rust_soc.rs
+++ b/samples/rust/rust_soc.rs
@@ -23,14 +23,12 @@ struct SampleSocDriver {
kernel::of_device_table!(
OF_TABLE,
- MODULE_OF_TABLE,
<SampleSocDriver as platform::Driver>::IdInfo,
[(of::DeviceId::new(c"test,rust-device"), ())]
);
kernel::acpi_device_table!(
ACPI_TABLE,
- MODULE_ACPI_TABLE,
<SampleSocDriver as platform::Driver>::IdInfo,
[(acpi::DeviceId::new(c"LNUXBEEF"), ())]
);
--
2.54.0
^ permalink raw reply related
* [PATCH 07/10] rust: driver: store pointers in `DeviceId`
From: Gary Guo @ 2026-06-18 17:03 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Daniel Almeida,
Tamir Duberstein, Alexandre Courbot, Onur Özkan,
FUJITA Tomonori, David Airlie, Simona Vetter, Bjorn Helgaas,
Krzysztof Wilczyński, Abdiel Janulgue, Robin Murphy,
Dave Ertman, Ira Weiny, Leon Romanovsky, Len Brown, Igor Korotin,
Rob Herring, Saravana Kannan, Viresh Kumar, Michal Wilczynski,
Drew Fustini, Guo Ren, Fu Wei, Uwe Kleine-König
Cc: driver-core, rust-for-linux, linux-kernel, netdev, nova-gpu,
dri-devel, linux-pci, linux-acpi, devicetree, linux-pm,
linux-riscv, linux-pwm, Gary Guo
In-Reply-To: <20260618-id_info-v1-0-96af1e559ef9@garyguo.net>
The common practice in C drivers is to store pointers into `driver_data`
field of device IDs. The Rust code is however currently storing indices
into the fields and then carry a side table that maps the index to
pointers.
It is much simpler to just have `DeviceId` carry the pointer like C code
does. However, just doing so naively would cause a "pointers cannot be cast
to integers during const eval" error, as kernel_ulong_t does not have
provenance while pointers do, and Rust forbids `expose_provenance` during
consteval.
Work around this limitation by wrapping raw IDs in `MaybeUninit`.
`MaybeUninit` is allowed to host arbitrary bytes with or without
provenance, so we can just then use `unsafe` to store a pointer with
provenance there. This has the same effect as changing the C-side
definition to use `void*` instead of `kernel_ulong_t`, but without actually
changing the C side.
Signed-off-by: Gary Guo <gary@garyguo.net>
---
rust/kernel/acpi.rs | 4 ---
rust/kernel/auxiliary.rs | 8 ++---
rust/kernel/device_id.rs | 88 +++++++++++++++++++++++++++++-------------------
rust/kernel/driver.rs | 14 ++++----
rust/kernel/i2c.rs | 7 ++--
rust/kernel/of.rs | 4 ---
rust/kernel/pci.rs | 11 +++---
rust/kernel/usb.rs | 7 ++--
8 files changed, 73 insertions(+), 70 deletions(-)
diff --git a/rust/kernel/acpi.rs b/rust/kernel/acpi.rs
index 315f2f2af446..ea2ce61ee393 100644
--- a/rust/kernel/acpi.rs
+++ b/rust/kernel/acpi.rs
@@ -25,10 +25,6 @@ unsafe impl RawDeviceId for DeviceId {
// SAFETY: `DRIVER_DATA_OFFSET` is the offset to the `driver_data` field.
unsafe impl RawDeviceIdIndex for DeviceId {
const DRIVER_DATA_OFFSET: usize = core::mem::offset_of!(bindings::acpi_device_id, driver_data);
-
- fn index(&self) -> usize {
- self.0.driver_data
- }
}
impl DeviceId {
diff --git a/rust/kernel/auxiliary.rs b/rust/kernel/auxiliary.rs
index 59787c9bff26..aa13d8866a19 100644
--- a/rust/kernel/auxiliary.rs
+++ b/rust/kernel/auxiliary.rs
@@ -93,7 +93,9 @@ extern "C" fn probe_callback(
// SAFETY: `DeviceId` is a `#[repr(transparent)`] wrapper of `struct auxiliary_device_id`
// and does not add additional invariants, so it's safe to transmute.
let id = unsafe { &*id.cast::<DeviceId>() };
- let info = T::ID_TABLE.info(id.index());
+
+ // SAFETY: `id` comes from `T::ID_TABLE` which is of type `IdArray<_, T::IdInfo>`.
+ let info = unsafe { id.info_unchecked::<T::IdInfo>() };
from_result(|| {
let data = T::probe(adev, info);
@@ -169,10 +171,6 @@ unsafe impl RawDeviceId for DeviceId {
unsafe impl RawDeviceIdIndex for DeviceId {
const DRIVER_DATA_OFFSET: usize =
core::mem::offset_of!(bindings::auxiliary_device_id, driver_data);
-
- fn index(&self) -> usize {
- self.0.driver_data
- }
}
/// IdTable type for auxiliary drivers.
diff --git a/rust/kernel/device_id.rs b/rust/kernel/device_id.rs
index 84852a2d9ad7..59453588df0e 100644
--- a/rust/kernel/device_id.rs
+++ b/rust/kernel/device_id.rs
@@ -5,7 +5,10 @@
//! Each bus / subsystem that matches device and driver through a bus / subsystem specific ID is
//! expected to implement [`RawDeviceId`].
-use core::mem::MaybeUninit;
+use core::{
+ marker::PhantomData,
+ mem::MaybeUninit, //
+};
/// Marker trait to indicate a Rust device ID type represents a corresponding C device ID type.
///
@@ -47,15 +50,48 @@ pub unsafe trait RawDeviceIdIndex: RawDeviceId {
/// The offset (in bytes) to the context/data field in the raw device ID.
const DRIVER_DATA_OFFSET: usize;
- /// The index stored at `DRIVER_DATA_OFFSET` of the implementor of the [`RawDeviceIdIndex`]
- /// trait.
- fn index(&self) -> usize;
+ /// Obtain the data pointer stored inside the device ID.
+ ///
+ /// # Safety
+ ///
+ /// `&Self` must be stored inside a `IdArray<Self, U>`.
+ unsafe fn info_unchecked<U>(&self) -> &'static U {
+ // SAFETY: By safety requirement of the trait, this is `self.driver_data as *const U` and by
+ // the safety requirement of the function, this is stored in `IdArray<Self, U>` so is
+ // convertible to `&'static U`.
+ unsafe {
+ core::ptr::from_ref(self)
+ .byte_add(Self::DRIVER_DATA_OFFSET)
+ .cast::<&U>()
+ .read()
+ }
+ }
+
+ /// Obtain the data pointer stored inside the device ID.
+ ///
+ /// # Safety
+ ///
+ /// `&Self` must be stored inside a `IdArray<Self, U>`, or has NULL (or 0) as driver data.
+ unsafe fn info_unchecked_opt<U>(&self) -> Option<&'static U> {
+ // SAFETY: By safety requirement of the trait, this is `self.driver_data as *const U` and by
+ // the safety requirement of the function, if this is stored in `IdArray<Self, U>`, this is
+ // convertible to `Option<&'static U>`. Otherwise it is NULL which is `None` as
+ // `Option<&U>`.
+ unsafe {
+ core::ptr::from_ref(self)
+ .byte_add(Self::DRIVER_DATA_OFFSET)
+ .cast::<Option<&U>>()
+ .read()
+ }
+ }
}
/// A zero-terminated device id array.
#[repr(C)]
pub struct RawIdArray<T: RawDeviceId, const N: usize> {
- ids: [T::RawType; N],
+ // This is `MaybeUninit<T::RawType>` so any bytes inside it can carry provenance in CTFE.
+ // If this were `T::RawType`, integer fields would not be able to contain pointers.
+ ids: [MaybeUninit<T::RawType>; N],
sentinel: MaybeUninit<T::RawType>,
}
@@ -68,18 +104,17 @@ pub const fn size(&self) -> usize {
/// A zero-terminated device id array, followed by context data.
#[repr(C)]
-pub struct IdArray<T: RawDeviceId, U, const N: usize> {
+pub struct IdArray<T: RawDeviceId, U: 'static, const N: usize> {
raw_ids: RawIdArray<T, N>,
- id_infos: [U; N],
+ phantom: PhantomData<&'static U>,
}
-impl<T: RawDeviceId + RawDeviceIdIndex, U, const N: usize> IdArray<T, U, N> {
+impl<T: RawDeviceId + RawDeviceIdIndex, U: 'static, const N: usize> IdArray<T, U, N> {
/// Creates a new instance of the array.
///
/// The contents are derived from the given identifiers and context information.
- pub const fn new(ids: [(T, U); N]) -> Self {
+ pub const fn new(ids: [(T, &'static U); N]) -> Self {
let mut raw_ids = [const { MaybeUninit::<T::RawType>::uninit() }; N];
- let mut infos = [const { MaybeUninit::uninit() }; N];
let mut i = 0usize;
while i < N {
@@ -87,18 +122,15 @@ impl<T: RawDeviceId + RawDeviceIdIndex, U, const N: usize> IdArray<T, U, N> {
// layout-wise compatible with `RawType`.
raw_ids[i] = unsafe { core::mem::transmute_copy(&ids[i].0) };
// SAFETY: by the safety requirement of `RawDeviceIdIndex`, this would be effectively
- // `raw_ids[i].driver_data = i;`.
+ // `raw_ids[i].driver_data = ids[i].1;`.
unsafe {
raw_ids[i]
.as_mut_ptr()
.byte_add(T::DRIVER_DATA_OFFSET)
- .cast::<usize>()
- .write(i);
+ .cast::<&U>()
+ .write(ids[i].1);
}
- // SAFETY: this is effectively a move: `infos[i] = ids[i].1`. We make a copy here but
- // later forget `ids`.
- infos[i] = MaybeUninit::new(unsafe { core::ptr::read(&ids[i].1) });
i += 1;
}
@@ -106,20 +138,15 @@ impl<T: RawDeviceId + RawDeviceIdIndex, U, const N: usize> IdArray<T, U, N> {
Self {
raw_ids: RawIdArray {
- // SAFETY: this is effectively `array_assume_init`, which is unstable, so we use
- // `transmute_copy` instead. We have initialized all elements of `raw_ids` so this
- // `array_assume_init` is safe.
- ids: unsafe { core::mem::transmute_copy(&raw_ids) },
+ ids: raw_ids,
sentinel: MaybeUninit::zeroed(),
},
- // SAFETY: We have initialized all elements of `infos` so this `array_assume_init` is
- // safe.
- id_infos: unsafe { core::mem::transmute_copy(&infos) },
+ phantom: PhantomData,
}
}
}
-impl<T: RawDeviceId, U, const N: usize> IdArray<T, U, N> {
+impl<T: RawDeviceId, U: 'static, const N: usize> IdArray<T, U, N> {
/// Reference to the contained [`RawIdArray`].
pub const fn raw_ids(&self) -> &RawIdArray<T, N> {
&self.raw_ids
@@ -133,7 +160,7 @@ impl<T: RawDeviceId, const N: usize> IdArray<T, (), N> {
/// If the device implements [`RawDeviceIdIndex`], consider using [`IdArray::new`] instead.
pub const fn new_without_index(ids: [T; N]) -> Self {
// SAFETY: `T` is layout-wise compatible with `T::RawType`, so is the array of them.
- let raw_ids: [T::RawType; N] = unsafe { core::mem::transmute_copy(&ids) };
+ let raw_ids: [MaybeUninit<T::RawType>; N] = unsafe { core::mem::transmute_copy(&ids) };
core::mem::forget(ids);
Self {
@@ -141,7 +168,7 @@ impl<T: RawDeviceId, const N: usize> IdArray<T, (), N> {
ids: raw_ids,
sentinel: MaybeUninit::zeroed(),
},
- id_infos: [(); N],
+ phantom: PhantomData,
}
}
}
@@ -155,9 +182,6 @@ impl<T: RawDeviceId, const N: usize> IdArray<T, (), N> {
pub trait IdTable<T: RawDeviceId, U> {
/// Obtain the pointer to the ID table.
fn as_ptr(&self) -> *const T::RawType;
-
- /// Obtain the pointer to the driver-specific information from an index.
- fn info(&self, index: usize) -> &U;
}
impl<T: RawDeviceId, U, const N: usize> IdTable<T, U> for IdArray<T, U, N> {
@@ -166,10 +190,6 @@ fn as_ptr(&self) -> *const T::RawType {
// to access the sentinel.
core::ptr::from_ref(self).cast()
}
-
- fn info(&self, index: usize) -> &U {
- &self.id_infos[index]
- }
}
/// Create device table alias for modpost.
@@ -184,7 +204,7 @@ macro_rules! module_device_table {
$device_id_ty,
$id_info_type,
{ <[$device_id_ty]>::len(&[$($id,)*]) },
- > = $crate::device_id::IdArray::new([$(($id, $info),)*]);
+ > = $crate::device_id::IdArray::new([$(($id, &$info),)*]);
$crate::module_device_table!($table_type, $table_name);
};
diff --git a/rust/kernel/driver.rs b/rust/kernel/driver.rs
index bf5ba0d27553..824899d76fed 100644
--- a/rust/kernel/driver.rs
+++ b/rust/kernel/driver.rs
@@ -107,6 +107,7 @@
use crate::{
acpi,
device,
+ device_id::RawDeviceIdIndex,
of,
prelude::*,
types::Opaque,
@@ -350,7 +351,8 @@ fn acpi_id_info(dev: &device::Device) -> Option<&'static Self::IdInfo> {
// and does not add additional invariants, so it's safe to transmute.
let id = unsafe { &*raw_id.cast::<acpi::DeviceId>() };
- Some(table.info(<acpi::DeviceId as crate::device_id::RawDeviceIdIndex>::index(id)))
+ // SAFETY: `id` comes from `table` which is of type `IdArray<_, Self::IdInfo>`.
+ Some(unsafe { id.info_unchecked::<Self::IdInfo>() })
}
}
}
@@ -381,9 +383,8 @@ fn of_id_info(dev: &device::Device) -> Option<&'static Self::IdInfo> {
// and does not add additional invariants, so it's safe to transmute.
let id = unsafe { &*raw_id.cast::<of::DeviceId>() };
- return Some(table.info(
- <of::DeviceId as crate::device_id::RawDeviceIdIndex>::index(id),
- ));
+ // SAFETY: `id` comes from `table` which is of type `IdArray<_, Self::IdInfo>`.
+ return Some(unsafe { id.info_unchecked::<Self::IdInfo>() });
}
}
@@ -412,9 +413,8 @@ fn of_id_info(dev: &device::Device) -> Option<&'static Self::IdInfo> {
// and does not add additional invariants, so it's safe to transmute.
let id = unsafe { &*raw_id.cast::<of::DeviceId>() };
- return Some(table.info(
- <of::DeviceId as crate::device_id::RawDeviceIdIndex>::index(id),
- ));
+ // SAFETY: `id` comes from `table` which is of type `IdArray<_, Self::IdInfo>`.
+ return Some(unsafe { id.info_unchecked::<Self::IdInfo>() });
}
}
diff --git a/rust/kernel/i2c.rs b/rust/kernel/i2c.rs
index 55c89ba3a82a..9e551c7e8e41 100644
--- a/rust/kernel/i2c.rs
+++ b/rust/kernel/i2c.rs
@@ -65,10 +65,6 @@ unsafe impl RawDeviceId for DeviceId {
// SAFETY: `DRIVER_DATA_OFFSET` is the offset to the `driver_data` field.
unsafe impl RawDeviceIdIndex for DeviceId {
const DRIVER_DATA_OFFSET: usize = core::mem::offset_of!(bindings::i2c_device_id, driver_data);
-
- fn index(&self) -> usize {
- self.0.driver_data
- }
}
/// IdTable type for I2C
@@ -212,7 +208,8 @@ fn i2c_id_info(dev: &I2cClient) -> Option<&'static <Self as driver::Adapter>::Id
// does not add additional invariants, so it's safe to transmute.
let id = unsafe { &*raw_id.cast::<DeviceId>() };
- Some(table.info(<DeviceId as RawDeviceIdIndex>::index(id)))
+ // SAFETY: `id` comes from `table` which is of type `IdArray<_, Self::IdInfo>`.
+ Some(unsafe { id.info_unchecked::<T::IdInfo>() })
}
}
diff --git a/rust/kernel/of.rs b/rust/kernel/of.rs
index 35aa6d36d309..d0318f62afd7 100644
--- a/rust/kernel/of.rs
+++ b/rust/kernel/of.rs
@@ -25,10 +25,6 @@ unsafe impl RawDeviceId for DeviceId {
// SAFETY: `DRIVER_DATA_OFFSET` is the offset to the `data` field.
unsafe impl RawDeviceIdIndex for DeviceId {
const DRIVER_DATA_OFFSET: usize = core::mem::offset_of!(bindings::of_device_id, data);
-
- fn index(&self) -> usize {
- self.0.data as usize
- }
}
impl DeviceId {
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index a3dd48f76353..a630c7fc6a85 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -110,10 +110,13 @@ extern "C" fn probe_callback(
// SAFETY: `DeviceId` is a `#[repr(transparent)]` wrapper of `struct pci_device_id` and
// does not add additional invariants, so it's safe to transmute.
let id = unsafe { &*id.cast::<DeviceId>() };
- let info = T::ID_TABLE.info(id.index());
+
+ // SAFETY: `id` comes from `T::ID_TABLE` which is of type `IdArray<_, T::IdInfo>` or
+ // `pci_device_id_any` which has 0 as driver_data.
+ let info = unsafe { id.info_unchecked_opt::<T::IdInfo>() };
from_result(|| {
- let data = T::probe(pdev, Some(info));
+ let data = T::probe(pdev, info);
pdev.as_ref().set_drvdata(data)?;
Ok(0)
@@ -233,10 +236,6 @@ unsafe impl RawDeviceId for DeviceId {
// SAFETY: `DRIVER_DATA_OFFSET` is the offset to the `driver_data` field.
unsafe impl RawDeviceIdIndex for DeviceId {
const DRIVER_DATA_OFFSET: usize = core::mem::offset_of!(bindings::pci_device_id, driver_data);
-
- fn index(&self) -> usize {
- self.0.driver_data
- }
}
/// `IdTable` type for PCI.
diff --git a/rust/kernel/usb.rs b/rust/kernel/usb.rs
index 500b5e0ba4ea..8aeff5011755 100644
--- a/rust/kernel/usb.rs
+++ b/rust/kernel/usb.rs
@@ -89,7 +89,8 @@ extern "C" fn probe_callback(
// does not add additional invariants, so it's safe to transmute.
let id = unsafe { &*id.cast::<DeviceId>() };
- let info = T::ID_TABLE.info(id.index());
+ // SAFETY: `id` comes from `T::ID_TABLE` which is of type `IdArray<_, T::IdInfo>`.
+ let info = unsafe { id.info_unchecked::<T::IdInfo>() };
let data = T::probe(intf, id, info);
let dev: &device::Device<device::CoreInternal<'_>> = intf.as_ref();
@@ -242,10 +243,6 @@ unsafe impl RawDeviceId for DeviceId {
// SAFETY: `DRIVER_DATA_OFFSET` is the offset to the `driver_info` field.
unsafe impl RawDeviceIdIndex for DeviceId {
const DRIVER_DATA_OFFSET: usize = core::mem::offset_of!(bindings::usb_device_id, driver_info);
-
- fn index(&self) -> usize {
- self.0.driver_info
- }
}
/// [`IdTable`](kernel::device_id::IdTable) type for USB.
--
2.54.0
^ permalink raw reply related
* [PATCH 08/10] rust: driver: remove open-coded matching logic
From: Gary Guo @ 2026-06-18 17:03 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Daniel Almeida,
Tamir Duberstein, Alexandre Courbot, Onur Özkan,
FUJITA Tomonori, David Airlie, Simona Vetter, Bjorn Helgaas,
Krzysztof Wilczyński, Abdiel Janulgue, Robin Murphy,
Dave Ertman, Ira Weiny, Leon Romanovsky, Len Brown, Igor Korotin,
Rob Herring, Saravana Kannan, Viresh Kumar, Michal Wilczynski,
Drew Fustini, Guo Ren, Fu Wei, Uwe Kleine-König
Cc: driver-core, rust-for-linux, linux-kernel, netdev, nova-gpu,
dri-devel, linux-pci, linux-acpi, devicetree, linux-pm,
linux-riscv, linux-pwm, Gary Guo
In-Reply-To: <20260618-id_info-v1-0-96af1e559ef9@garyguo.net>
With device ID info now including pointers instead of indices, the
open-coded ACPI/OF matching is no longer needed and can be replaced with
`device_get_match_data`.
Signed-off-by: Gary Guo <gary@garyguo.net>
---
rust/kernel/driver.rs | 114 ++++--------------------------------------------
rust/kernel/i2c.rs | 6 ++-
rust/kernel/platform.rs | 3 +-
3 files changed, 15 insertions(+), 108 deletions(-)
diff --git a/rust/kernel/driver.rs b/rust/kernel/driver.rs
index 824899d76fed..a881f5ef99ec 100644
--- a/rust/kernel/driver.rs
+++ b/rust/kernel/driver.rs
@@ -107,7 +107,6 @@
use crate::{
acpi,
device,
- device_id::RawDeviceIdIndex,
of,
prelude::*,
types::Opaque,
@@ -325,117 +324,22 @@ pub trait Adapter {
/// The [`acpi::IdTable`] of the corresponding driver
fn acpi_id_table() -> Option<acpi::IdTable<Self::IdInfo>>;
- /// Returns the driver's private data from the matching entry in the [`acpi::IdTable`], if any.
- ///
- /// If this returns `None`, it means there is no match with an entry in the [`acpi::IdTable`].
- fn acpi_id_info(dev: &device::Device) -> Option<&'static Self::IdInfo> {
- #[cfg(not(CONFIG_ACPI))]
- {
- let _ = dev;
- None
- }
-
- #[cfg(CONFIG_ACPI)]
- {
- let table = Self::acpi_id_table()?;
-
- // SAFETY:
- // - `table` has static lifetime, hence it's valid for read,
- // - `dev` is guaranteed to be valid while it's alive, and so is `dev.as_raw()`.
- let raw_id = unsafe { bindings::acpi_match_device(table.as_ptr(), dev.as_raw()) };
-
- if raw_id.is_null() {
- None
- } else {
- // SAFETY: `DeviceId` is a `#[repr(transparent)]` wrapper of `struct acpi_device_id`
- // and does not add additional invariants, so it's safe to transmute.
- let id = unsafe { &*raw_id.cast::<acpi::DeviceId>() };
-
- // SAFETY: `id` comes from `table` which is of type `IdArray<_, Self::IdInfo>`.
- Some(unsafe { id.info_unchecked::<Self::IdInfo>() })
- }
- }
- }
-
/// The [`of::IdTable`] of the corresponding driver.
fn of_id_table() -> Option<of::IdTable<Self::IdInfo>>;
- /// Returns the driver's private data from the matching entry in the [`of::IdTable`], if any.
- ///
- /// If this returns `None`, it means there is no match with an entry in the [`of::IdTable`].
- fn of_id_info(dev: &device::Device) -> Option<&'static Self::IdInfo> {
- let table = Self::of_id_table()?;
-
- #[cfg(not(any(CONFIG_OF, CONFIG_ACPI)))]
- {
- let _ = (dev, table);
- }
-
- #[cfg(CONFIG_OF)]
- {
- // SAFETY:
- // - `table` has static lifetime, hence it's valid for read,
- // - `dev` is guaranteed to be valid while it's alive, and so is `dev.as_raw()`.
- let raw_id = unsafe { bindings::of_match_device(table.as_ptr(), dev.as_raw()) };
-
- if !raw_id.is_null() {
- // SAFETY: `DeviceId` is a `#[repr(transparent)]` wrapper of `struct of_device_id`
- // and does not add additional invariants, so it's safe to transmute.
- let id = unsafe { &*raw_id.cast::<of::DeviceId>() };
-
- // SAFETY: `id` comes from `table` which is of type `IdArray<_, Self::IdInfo>`.
- return Some(unsafe { id.info_unchecked::<Self::IdInfo>() });
- }
- }
-
- #[cfg(CONFIG_ACPI)]
- {
- use core::ptr;
- use device::property::FwNode;
-
- let mut raw_id = ptr::null();
-
- let fwnode = dev.fwnode().map_or(ptr::null_mut(), FwNode::as_raw);
-
- // SAFETY: `fwnode` is a pointer to a valid `fwnode_handle`. A null pointer will be
- // passed through the function.
- let adev = unsafe { bindings::to_acpi_device_node(fwnode) };
-
- // SAFETY:
- // - `adev` is a valid pointer to `acpi_device` or is null. It is guaranteed to be
- // valid as long as `dev` is alive.
- // - `table` has static lifetime, hence it's valid for read.
- if unsafe { acpi_of_match_device(adev, table.as_ptr(), &raw mut raw_id) } {
- // SAFETY:
- // - the function returns true, therefore `raw_id` has been set to a pointer to a
- // valid `of_device_id`.
- // - `DeviceId` is a `#[repr(transparent)]` wrapper of `struct of_device_id`
- // and does not add additional invariants, so it's safe to transmute.
- let id = unsafe { &*raw_id.cast::<of::DeviceId>() };
-
- // SAFETY: `id` comes from `table` which is of type `IdArray<_, Self::IdInfo>`.
- return Some(unsafe { id.info_unchecked::<Self::IdInfo>() });
- }
- }
-
- None
- }
-
/// Returns the driver's private data from the matching entry of any of the ID tables, if any.
///
/// If this returns `None`, it means that there is no match in any of the ID tables directly
/// associated with a [`device::Device`].
- fn id_info(dev: &device::Device) -> Option<&'static Self::IdInfo> {
- let id = Self::acpi_id_info(dev);
- if id.is_some() {
- return id;
- }
-
- let id = Self::of_id_info(dev);
- if id.is_some() {
- return id;
- }
+ ///
+ /// # Safety
+ ///
+ /// The caller must ensure that the `dev` matched data is of type `Self::IdInfo`.
+ unsafe fn id_info(dev: &device::Device) -> Option<&'static Self::IdInfo> {
+ // SAFETY: `dev` is guaranteed to be valid while it's alive, and so is `dev.as_raw()`.
+ let data = unsafe { bindings::device_get_match_data(dev.as_raw()) };
- None
+ // SAFETY: Per safety requirement, `data` is of type `Self::IdInfo`.
+ unsafe { data.cast::<Self::IdInfo>().as_ref() }
}
}
diff --git a/rust/kernel/i2c.rs b/rust/kernel/i2c.rs
index 9e551c7e8e41..07680fd2f3fc 100644
--- a/rust/kernel/i2c.rs
+++ b/rust/kernel/i2c.rs
@@ -149,8 +149,10 @@ extern "C" fn probe_callback(idev: *mut bindings::i2c_client) -> kernel::ffi::c_
// INVARIANT: `idev` is valid for the duration of `probe_callback()`.
let idev = unsafe { &*idev.cast::<I2cClient<device::CoreInternal<'_>>>() };
- let info =
- Self::i2c_id_info(idev).or_else(|| <Self as driver::Adapter>::id_info(idev.as_ref()));
+ let info = Self::i2c_id_info(idev).or_else(|| {
+ // SAFETY: `idev` matched data is of type `Self::IdInfo`.
+ unsafe { <Self as driver::Adapter>::id_info(idev.as_ref()) }
+ });
from_result(|| {
let data = T::probe(idev, info);
diff --git a/rust/kernel/platform.rs b/rust/kernel/platform.rs
index 210a815925ce..e12e88113ca5 100644
--- a/rust/kernel/platform.rs
+++ b/rust/kernel/platform.rs
@@ -100,7 +100,8 @@ extern "C" fn probe_callback(pdev: *mut bindings::platform_device) -> kernel::ff
//
// INVARIANT: `pdev` is valid for the duration of `probe_callback()`.
let pdev = unsafe { &*pdev.cast::<Device<device::CoreInternal<'_>>>() };
- let info = <Self as driver::Adapter>::id_info(pdev.as_ref());
+ // SAFETY: `pdev` matched data is of type `Self::IdInfo`.
+ let info = unsafe { <Self as driver::Adapter>::id_info(pdev.as_ref()) };
from_result(|| {
let data = T::probe(pdev, info);
--
2.54.0
^ permalink raw reply related
* [PATCH 09/10] rust: driver: remove duplicate ID table
From: Gary Guo @ 2026-06-18 17:03 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
Andreas Hindborg, Alice Ryhl, Trevor Gross, Daniel Almeida,
Tamir Duberstein, Alexandre Courbot, Onur Özkan,
FUJITA Tomonori, David Airlie, Simona Vetter, Bjorn Helgaas,
Krzysztof Wilczyński, Abdiel Janulgue, Robin Murphy,
Dave Ertman, Ira Weiny, Leon Romanovsky, Len Brown, Igor Korotin,
Rob Herring, Saravana Kannan, Viresh Kumar, Michal Wilczynski,
Drew Fustini, Guo Ren, Fu Wei, Uwe Kleine-König
Cc: driver-core, rust-for-linux, linux-kernel, netdev, nova-gpu,
dri-devel, linux-pci, linux-acpi, devicetree, linux-pm,
linux-riscv, linux-pwm, Gary Guo
In-Reply-To: <20260618-id_info-v1-0-96af1e559ef9@garyguo.net>
Previously, `IdArray` contains both device ID table and info table so we
keep a separate copy for MODULE_DEVICE_TABLE for hotplug (which needs to be
just the device ID table). With the info being changed to be carried via
pointers, `IdArray` is now layout compatible with raw ID table and hence
there is no longer a need to keep the distinction.
Deduplicate the code, and remove the redundant copy for hotplug purpose by
just giving the `IdArray` instance a proper symbol name.
Signed-off-by: Gary Guo <gary@garyguo.net>
---
rust/kernel/device_id.rs | 76 +++++++++++++++++-------------------------------
1 file changed, 27 insertions(+), 49 deletions(-)
diff --git a/rust/kernel/device_id.rs b/rust/kernel/device_id.rs
index 59453588df0e..26618bcda276 100644
--- a/rust/kernel/device_id.rs
+++ b/rust/kernel/device_id.rs
@@ -86,28 +86,23 @@ unsafe fn info_unchecked_opt<U>(&self) -> Option<&'static U> {
}
}
-/// A zero-terminated device id array.
+/// A zero-terminated device id array, followed by context data.
#[repr(C)]
-pub struct RawIdArray<T: RawDeviceId, const N: usize> {
+pub struct IdArray<T: RawDeviceId, U: 'static, const N: usize> {
// This is `MaybeUninit<T::RawType>` so any bytes inside it can carry provenance in CTFE.
// If this were `T::RawType`, integer fields would not be able to contain pointers.
ids: [MaybeUninit<T::RawType>; N],
sentinel: MaybeUninit<T::RawType>,
+ phantom: PhantomData<&'static U>,
}
-impl<T: RawDeviceId, const N: usize> RawIdArray<T, N> {
- #[doc(hidden)]
- pub const fn size(&self) -> usize {
- core::mem::size_of::<Self>()
- }
-}
+// SAFETY: device ID is plain data plus a `&'static U` and can thus be sent between threads safely
+// if `&U` can.
+unsafe impl<T: RawDeviceId, U: Sync + 'static, const N: usize> Send for IdArray<T, U, N> {}
-/// A zero-terminated device id array, followed by context data.
-#[repr(C)]
-pub struct IdArray<T: RawDeviceId, U: 'static, const N: usize> {
- raw_ids: RawIdArray<T, N>,
- phantom: PhantomData<&'static U>,
-}
+// SAFETY: device ID is plain data plus a `&'static U` and can thus be shared between threads safely
+// if `&U` can.
+unsafe impl<T: RawDeviceId, U: Sync + 'static, const N: usize> Sync for IdArray<T, U, N> {}
impl<T: RawDeviceId + RawDeviceIdIndex, U: 'static, const N: usize> IdArray<T, U, N> {
/// Creates a new instance of the array.
@@ -137,22 +132,13 @@ impl<T: RawDeviceId + RawDeviceIdIndex, U: 'static, const N: usize> IdArray<T, U
core::mem::forget(ids);
Self {
- raw_ids: RawIdArray {
- ids: raw_ids,
- sentinel: MaybeUninit::zeroed(),
- },
+ ids: raw_ids,
+ sentinel: MaybeUninit::zeroed(),
phantom: PhantomData,
}
}
}
-impl<T: RawDeviceId, U: 'static, const N: usize> IdArray<T, U, N> {
- /// Reference to the contained [`RawIdArray`].
- pub const fn raw_ids(&self) -> &RawIdArray<T, N> {
- &self.raw_ids
- }
-}
-
impl<T: RawDeviceId, const N: usize> IdArray<T, (), N> {
/// Creates a new instance of the array without writing index values.
///
@@ -164,10 +150,8 @@ impl<T: RawDeviceId, const N: usize> IdArray<T, (), N> {
core::mem::forget(ids);
Self {
- raw_ids: RawIdArray {
- ids: raw_ids,
- sentinel: MaybeUninit::zeroed(),
- },
+ ids: raw_ids,
+ sentinel: MaybeUninit::zeroed(),
phantom: PhantomData,
}
}
@@ -200,13 +184,17 @@ macro_rules! module_device_table {
$table_name: ident, $id_info_type: ty,
[$(($id: expr, $info:expr $(,)?)),* $(,)?]
) => {
- const $table_name: $crate::device_id::IdArray<
+ #[export_name =
+ concat!("__mod_device_table__", line!(),
+ "__kmod_", module_path!(),
+ "__", $table_type,
+ "__", stringify!($table_name))
+ ]
+ static $table_name: $crate::device_id::IdArray<
$device_id_ty,
$id_info_type,
{ <[$device_id_ty]>::len(&[$($id,)*]) },
> = $crate::device_id::IdArray::new([$(($id, &$info),)*]);
-
- $crate::module_device_table!($table_type, $table_name);
};
// Case for no ID info.
@@ -215,26 +203,16 @@ macro_rules! module_device_table {
$table_name: ident, @none,
[$($id: expr),* $(,)?]
) => {
- const $table_name: $crate::device_id::IdArray<
+ #[export_name =
+ concat!("__mod_device_table__", line!(),
+ "__kmod_", module_path!(),
+ "__", $table_type,
+ "__", stringify!($table_name))
+ ]
+ static $table_name: $crate::device_id::IdArray<
$device_id_ty,
(),
{ <[$device_id_ty]>::len(&[$($id,)*]) },
> = $crate::device_id::IdArray::new_without_index([$($id),*]);
-
- $crate::module_device_table!($table_type, $table_name);
- };
-
- ($table_type: literal, $table_name:ident) => {
- const _: () = {
- #[rustfmt::skip]
- #[export_name =
- concat!("__mod_device_table__", line!(),
- "__kmod_", module_path!(),
- "__", $table_type,
- "__", stringify!($table_name))
- ]
- static TABLE: [::core::mem::MaybeUninit<u8>; $table_name.raw_ids().size()] =
- unsafe { ::core::mem::transmute_copy($table_name.raw_ids()) };
- };
};
}
--
2.54.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox