* [PATCH v3] net/sched: act_nat: only rewrite IPv4 packets
From: Samuel Moelius @ 2026-06-28 13:20 UTC
To: Jamal Hadi Salim
Cc: Samuel Moelius, Jiri Pirko, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Herbert Xu,
open list:TC subsystem, open list
act_nat can process packets whose protocol is not IPv4 and then
interpret the payload as an IPv4 header. Non-IPv4 packets may be
modified based on unrelated bytes at the network header offset.
The action is documented as IPv4 NAT and should leave other protocols
alone.
Check skb->protocol before parsing and rewriting the IPv4 header. This
keeps accepting hardware-accelerated VLAN IPv4 packets whose network
header already points at the IPv4 header, while still rejecting inline
VLAN packets because act_nat does not adjust the network header offset
before using ip_hdr(skb).
Fixes: b4219952356b ("[PKT_SCHED]: Add stateless NAT")
Assisted-by: Codex:gpt-5.5-cyber-preview
Signed-off-by: Samuel Moelius <sam.moelius@trailofbits.com>
---
Changes in v3:
- Check skb->protocol
Changes in v2:
- Check skb_protocol(skb, false)
net/sched/act_nat.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/sched/act_nat.c b/net/sched/act_nat.c
index abb332dee836..1bf4a5853617 100644
--- a/net/sched/act_nat.c
+++ b/net/sched/act_nat.c
@@ -142,6 +142,9 @@ TC_INDIRECT_SCOPE int tcf_nat_act(struct sk_buff *skb,
egress = parms->flags & TCA_NAT_FLAG_EGRESS;
noff = skb_network_offset(skb);
+ if (skb->protocol != htons(ETH_P_IP))
+ goto out;
+
if (!pskb_may_pull(skb, sizeof(*iph) + noff))
goto drop;
--
2.43.0
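
The guard described in the commit message can be sketched in userspace as follows. This is a minimal illustration, not the kernel code: `struct fake_pkt`, `rewrite_ipv4_dst()` and the `SKETCH_*` constants are hypothetical stand-ins for `struct sk_buff` and the act_nat logic. The point it shows is ordering: the protocol field is checked before any bytes at the network header offset are treated as an IPv4 header.

```c
#include <arpa/inet.h>   /* htons() */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_P_IP   0x0800  /* EtherType for IPv4 */
#define SKETCH_ETH_P_IPV6 0x86DD  /* EtherType for IPv6 */
#define SKETCH_IPV4_HLEN  20      /* minimal IPv4 header length */
#define SKETCH_DADDR_OFF  16      /* offset of daddr inside the IPv4 header */

/* Hypothetical stand-in for the few sk_buff fields the check needs. */
struct fake_pkt {
	uint16_t protocol;  /* network byte order, like skb->protocol */
	size_t   noff;      /* offset of the network header in data[] */
	size_t   len;       /* number of valid bytes in data[] */
	uint8_t  data[64];
};

/* Returns true only if the destination address was rewritten.
 * The IP checksum update is omitted in this sketch. */
bool rewrite_ipv4_dst(struct fake_pkt *p, uint32_t new_daddr_be)
{
	assert(p != NULL && p->len <= sizeof(p->data));

	/* Guard first: never interpret a non-IPv4 payload as an IPv4 header. */
	if (p->protocol != htons(SKETCH_ETH_P_IP))
		return false;

	/* Then make sure a full header is present (cf. pskb_may_pull()). */
	if (p->noff + SKETCH_IPV4_HLEN > p->len)
		return false;

	memcpy(p->data + p->noff + SKETCH_DADDR_OFF, &new_daddr_be,
	       sizeof(new_daddr_be));
	return true;
}
```

With this ordering, a packet tagged as IPv6 is returned untouched even if its bytes happen to resemble an IPv4 header.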
* Re: [PATCH] net/sched: drr: reseed active class deficit after quantum changes
From: Samuel Moelius @ 2026-06-28 13:40 UTC
To: Jakub Kicinski
Cc: Jamal Hadi Salim, Jiri Pirko, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, open list:TC subsystem, open list
In-Reply-To: <20260609185625.6e4bb757@kernel.org>
On Tue, Jun 9, 2026 at 9:56 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 9 Jun 2026 00:36:18 +0000 Samuel Moelius wrote:
> > Subject: [PATCH] net/sched: drr: reseed active class deficit after quantum changes
>
> If the change is not for a serious bug it needs to be generated against
> net-next. This was generated against Linus's tree I guess and doesn't
> apply to -next.
>
> > Changing the quantum of an active DRR class leaves the old deficit in
> > place. The next scheduling round can therefore use credit accumulated
> > under a different quantum.
> >
> > This can be observed by making a class active, changing its quantum, and
> > then dequeuing with the old deficit still present.
> >
> > When an active class quantum changes, reseed its deficit from the new
> > quantum so the changed class weight is reflected immediately.
>
> TBH the current implementation is how I would expect DRR to work.
> quantum is the "refill" value, it should not reset the state of
> the current round? It wouldn't in an ASIC.
Your point is well taken. Thank you for the feedback.
I withdraw the patch.
* Re: [PATCH v2 1/1] xfrm: nat_keepalive: avoid double free on send error
From: Eyal Birger @ 2026-06-28 13:42 UTC
To: Ren Wei
Cc: netdev, steffen.klassert, herbert, davem, yuantan098, bird,
qianyuluo3
In-Reply-To: <20260625055513.1841167-1-n05ec@lzu.edu.cn>
On Wed, Jun 24, 2026 at 10:55 PM Ren Wei <n05ec@lzu.edu.cn> wrote:
>
> From: Qianyu Luo <qianyuluo3@gmail.com>
>
> nat_keepalive_send() frees the keepalive skb whenever the IPv4 or IPv6
> send helper reports an error.
>
> That cleanup is only correct before the skb is handed to the output
> path. Once ip_build_and_send_pkt() or ip6_xmit() takes ownership, the
> networking stack may already have consumed the skb before returning an
> error, so freeing it again is unsafe.
>
> Handle the pre-handoff failure cases inside nat_keepalive_send_ipv4()
> and nat_keepalive_send_ipv6(), where the caller still owns the skb, and
> keep nat_keepalive_send() responsible only for family dispatch and the
> unsupported-family cleanup path.
>
> Fixes: f531d13bdfe3 ("xfrm: support sending NAT keepalives in ESP in UDP states")
Thanks for the fix!
Reviewed-by: Eyal Birger <eyal.birger@gmail.com>
* [PATCH v3] net/sched: dualpi2: clear stale classification on filter miss
From: Samuel Moelius @ 2026-06-28 13:48 UTC
To: Jamal Hadi Salim
Cc: Samuel Moelius, Jiri Pirko, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Olga Albisser,
Koen De Schepper, Henrik Steen, Olivier Tilmans,
open list:TC subsystem, open list
DualPI2 leaves previous classification state attached to an skb when
filter classification returns no match. The enqueue path can then act
on stale state from an earlier classification attempt.
A filter miss should fall back to the default class without reusing old
per-packet classification data.
Initialize the classification result to CLASSIC before running the
classifier. Explicit L4S, priority, and successful filter
classification can still override that default.
Fixes: 8f9516daedd6 ("sched: Add enqueue/dequeue of dualpi2 qdisc")
Assisted-by: Codex:gpt-5.5-cyber-preview
Signed-off-by: Samuel Moelius <sam.moelius@trailofbits.com>
---
Changes in v3:
- Improve readability
Changes in v2:
- Add fixes tag
net/sched/sch_dualpi2.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/sched/sch_dualpi2.c b/net/sched/sch_dualpi2.c
index 5434df6ca8ef..27088760eff4 100644
--- a/net/sched/sch_dualpi2.c
+++ b/net/sched/sch_dualpi2.c
@@ -346,6 +346,8 @@ static int dualpi2_skb_classify(struct dualpi2_sched_data *q,
struct tcf_proto *fl;
int result;
+ cb->classified = DUALPI2_C_CLASSIC;
+
dualpi2_read_ect(skb);
if (cb->ect & q->ecn_mask) {
cb->classified = DUALPI2_C_L4S;
@@ -359,10 +361,8 @@ static int dualpi2_skb_classify(struct dualpi2_sched_data *q,
}
fl = rcu_dereference_bh(q->tcf_filters);
- if (!fl) {
- cb->classified = DUALPI2_C_CLASSIC;
+ if (!fl)
return NET_XMIT_SUCCESS;
- }
result = tcf_classify(skb, NULL, fl, &res, false);
if (result >= 0) {
--
2.43.0
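
The pattern in this patch, setting the default result before the classifier runs so that a miss cannot reuse an older value, can be sketched in isolation. All names here (`struct sketch_cb`, `sketch_classify()`, the two demo filters) are hypothetical and only model the control flow, not the dualpi2 code.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_class { SKETCH_CLASSIC, SKETCH_L4S };

/* Hypothetical per-packet control block. */
struct sketch_cb {
	enum sketch_class classified;
};

/* Hypothetical filter: returns 0 and writes *out on a match, -1 on a miss. */
typedef int (*sketch_filter_fn)(const void *pkt, enum sketch_class *out);

void sketch_classify(struct sketch_cb *cb, const void *pkt,
		     sketch_filter_fn filter)
{
	enum sketch_class res;

	assert(cb != NULL);

	/* Start from the default so no earlier value survives a miss. */
	cb->classified = SKETCH_CLASSIC;

	if (!filter)
		return;
	if (filter(pkt, &res) == 0)
		cb->classified = res;  /* a successful match overrides it */
}

/* Demo filters: one always misses, one always selects L4S. */
int sketch_filter_miss(const void *pkt, enum sketch_class *out)
{
	(void)pkt;
	(void)out;
	return -1;
}

int sketch_filter_l4s(const void *pkt, enum sketch_class *out)
{
	(void)pkt;
	*out = SKETCH_L4S;
	return 0;
}
```

Because the default is assigned once at the top, the no-filter path and the filter-miss path need no assignment of their own.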
* Re: [PATCH] net/sched: codel: refresh CAN_BYPASS when limit changes
From: Samuel Moelius @ 2026-06-28 13:59 UTC
To: Eric Dumazet
Cc: Jamal Hadi Salim, Jiri Pirko, David S. Miller, Jakub Kicinski,
Paolo Abeni, Simon Horman, open list:TC subsystem, open list
In-Reply-To: <CANn89iJdrwPtMkexacercdV+2qDAZ=JYwGjW8xA1aKZTk3muhg@mail.gmail.com>
On Tue, Jun 9, 2026 at 11:28 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Jun 8, 2026 at 5:12 PM Samuel Moelius
> <sam.moelius@trailofbits.com> wrote:
> >
> > sch_codel and sch_fq_codel update their packet limit without refreshing
> > the queue bypass state. Changing the limit to zero can leave CAN_BYPASS
> > set from the previous configuration.
> >
> > The enqueue path can then bypass limit enforcement even though the new
> > limit should prevent queued packets.
> >
> > Recompute the bypass flag whenever the configured limit changes.
> >
> > Assisted-by: Codex:gpt-5.5-cyber-preview
> > Signed-off-by: Samuel Moelius <sam.moelius@trailofbits.com>
> > ---
>
> We can't change sch->flags in a change() method even under qdisc
> spinlock, look at __dev_xmit_skb() which reads q->flags locklessly.
Your point is well taken. Thank you for the feedback.
I withdraw this patch.
* Re: [PATCH] net: usb: rtl8150: handle link status read failures
From: Petko Manolov @ 2026-06-28 15:18 UTC
To: Yousef Alhouseen
Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-usb, netdev, linux-kernel, stable,
syzbot+9db6c624635564ad813c
In-Reply-To: <20260628093929.44214-1-alhouseenyousef@gmail.com>
On 26-06-28 11:39:29, Yousef Alhouseen wrote:
> set_carrier() ignores the result of the USB control transfer and tests the
> stack variable supplied as its receive buffer. If the device rejects or aborts
> the request, that variable remains uninitialized and the driver chooses an
> arbitrary carrier state.
>
> Report carrier down when the link status cannot be read.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: syzbot+9db6c624635564ad813c@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=9db6c624635564ad813c
> Cc: stable@vger.kernel.org
> Signed-off-by: Yousef Alhouseen <alhouseenyousef@gmail.com>
> ---
> drivers/net/usb/rtl8150.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/usb/rtl8150.c b/drivers/net/usb/rtl8150.c
> index c880c95c41a5..5606490aaea0 100644
> --- a/drivers/net/usb/rtl8150.c
> +++ b/drivers/net/usb/rtl8150.c
> @@ -732,7 +732,11 @@ static void set_carrier(struct net_device *netdev)
> rtl8150_t *dev = netdev_priv(netdev);
> short tmp;
>
> - get_registers(dev, CSCR, 2, &tmp);
> + if (get_registers(dev, CSCR, 2, &tmp)) {
> + netif_carrier_off(netdev);
> + return;
> + }
> +
> if (tmp & CSCR_LINK_STATUS)
> netif_carrier_on(netdev);
> else
I'd rather do something along these lines:
@@ -732,7 +732,9 @@ static void set_carrier(struct net_device *netdev)
rtl8150_t *dev = netdev_priv(netdev);
short tmp;
- get_registers(dev, CSCR, 2, &tmp);
+ if (get_registers(dev, CSCR, 2, &tmp))
+ return;
+
if (tmp & CSCR_LINK_STATUS)
netif_carrier_on(netdev);
else
IIRC it is possible for the message to get lost in bus noise while the device is
still operating correctly. So if my memory isn't failing me, it is better to
not do anything if usb_control_msg_recv() is non-zero and only change the
carrier status if 'tmp' contains a meaningful value.
Petko
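
The approach suggested above, keeping the previous carrier state when the status read fails, can be modelled in userspace as follows. This is a sketch only: `struct sketch_dev`, the reader callbacks and `SKETCH_LINK_STATUS` are hypothetical stand-ins for the driver's `rtl8150_t`, `get_registers()` and `CSCR_LINK_STATUS`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LINK_STATUS 0x0008  /* hypothetical link-status bit */

/* Hypothetical register read: 0 and *val filled on success, non-zero on error. */
typedef int (*sketch_read_fn)(void *ctx, uint16_t *val);

struct sketch_dev {
	bool carrier_on;
	sketch_read_fn read_cscr;
	void *ctx;
};

void sketch_set_carrier(struct sketch_dev *dev)
{
	uint16_t tmp;

	assert(dev != NULL && dev->read_cscr != NULL);

	/* On a failed read tmp holds no valid data: keep the previous state. */
	if (dev->read_cscr(dev->ctx, &tmp))
		return;

	dev->carrier_on = (tmp & SKETCH_LINK_STATUS) != 0;
}

/* Demo readers: a failing transfer, link up, and link down. */
int sketch_read_fail(void *ctx, uint16_t *val)
{
	(void)ctx;
	(void)val;
	return -1;
}

int sketch_read_link_up(void *ctx, uint16_t *val)
{
	(void)ctx;
	*val = SKETCH_LINK_STATUS;
	return 0;
}

int sketch_read_link_down(void *ctx, uint16_t *val)
{
	(void)ctx;
	*val = 0;
	return 0;
}
```

A transient read error therefore neither raises nor drops the carrier; only a successful read changes it.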
* Re: [RFC] xdp: add device context to bpf_xdp_link_attach_failed tracepoint
From: Leon Hwang @ 2026-06-28 15:26 UTC
To: Masashi Honma, netdev, bpf, linux-trace-kernel
Cc: ast, daniel, kuba, hawk, andrii, rostedt, mhiramat, edumazet,
pabeni, linux-kernel
In-Reply-To: <CAFk-A4mE9Jweo2hfX7y_85xbPyt0FqpMT1EvqX1OcYZ=LTLgRA@mail.gmail.com>
On 2026/6/28 19:39, Masashi Honma wrote:
> Hello, I am re-posting this mail because I forgot to add [RFC].
>
> The bpf_xdp_link_attach_failed tracepoint (added in commit bf4ea1d0b2cb
> "xdp: Add tracepoint for xdp attaching failure") exposes the netlink
> extack message produced when attaching an XDP program via BPF_LINK_CREATE
> fails. This is useful because, unlike the netlink attach path, the
I really appreciate that the XDP tracepoint helped someone.
> bpf_link attach path does not return the extack to userspace -- the caller
> only gets an errno (e.g. EINVAL/ERANGE).
>
> We would like to use this in Cilium [1][2]: when attaching the XDP
> datapath program fails, surface the kernel's reason (e.g. "single-buffer
> XDP requires MTU less than ...") in the agent logs instead of an opaque
> errno, so operators don't have to inspect dmesg on the host.
>
> The limitation we hit is that the tracepoint only carries the message
> string, so a consumer cannot tell which device a failure belongs to.
> This matters for two reasons:
>
> 1. Correlation: with only the message, a consumer cannot reliably
> attribute a failure to a specific attach, particularly if multiple
> XDP attaches happen concurrently.
> 2. Scoping: a consumer watching this tracepoint sees XDP attach
> failures system-wide and cannot limit them to the devices it
> manages.
>
> At the call site (bpf_xdp_link_attach() in net/core/dev.c) the net_device
> is in scope, so exposing it looks straightforward:
>
> TRACE_EVENT(bpf_xdp_link_attach_failed,
> TP_PROTO(const char *msg, const struct net_device *dev),
> TP_ARGS(msg, dev),
> TP_STRUCT__entry(
> __string(msg, msg)
> __field(int, ifindex)
> ),
> TP_fast_assign(
> __assign_str(msg);
> __entry->ifindex = dev->ifindex;
> ),
> TP_printk("ifindex=%d errmsg=%s", __entry->ifindex, __get_str(msg))
> );
>
> - trace_bpf_xdp_link_attach_failed(extack._msg);
> + trace_bpf_xdp_link_attach_failed(extack._msg, dev);
>
> Before sending a formal patch I'd appreciate guidance on a few points:
>
> - Should the tracepoint take const struct net_device *dev (consistent
> with the other tracepoints in this file, and lets TP_printk show the
> device), or just the ifindex as an int (simpler for raw_tp BPF
> consumers, which otherwise read dev->ifindex via CO-RE)?
>
> - For raw_tp consumers the argument order is effectively ABI: prepending
> dev would shift the existing msg argument. I've appended dev above to
> keep msg at args[0]. Is preserving the existing argument position the
> right call, or is reordering acceptable given how new and rarely
> consumed this tracepoint is?
>
Good concerns. I'm not sure about these parts.
> - Is extending the existing tracepoint preferred, or would you rather
> keep it as-is and expose the device context some other way?
>
I'm planning to retire this tracepoint, but I don't think I can do that
if any user space application relies on it.
I'm planning to add BPF syscall common attributes support for
BPF_LINK_CREATE, including XDP link. That way, the kernel will be able
to propagate 'extack._msg' back to user space when it fails to create
an XDP link. The user space library will then be able to get the error
message alongside the errno.
Thanks,
Leon
> This would be my first XDP/BPF tracepoint change, so any direction is
> welcome. I'm happy to send a proper patch once the shape is agreed.
>
> Regards,
> Masashi Honma
>
> [1] https://github.com/cilium/cilium/issues/40777
> [2] https://github.com/cilium/cilium/pull/46546
* Re: [PATCH net 2/2] net: pse-pd: guard against freed PI data on regulator disable
From: Carlo Szelinsky @ 2026-06-28 15:31 UTC
To: Simon Horman
Cc: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel,
Carlo Szelinsky
In-Reply-To: <20260615180048.828053-1-github@szelinsky.de>
Hi Simon,
Gentle ping on this one... I think I'm just waiting on your read before
I send v2, and I'd like to get it unblocked :-)
v2 would take pcdev->lock around the kfree() + pcdev->pi = NULL in
pse_release_pis() so the NULL is authoritative, and add the same
!pcdev->pi guard to pse_pi_is_enabled() and pse_pi_enable().
Two things I'd value your view on before I send the next version:
- Is the contained fix (lock around the free) ok, or would you prefer
the regulator unregister reordered ahead of pse_release_pis()?
- I couldn't find a concrete consumer hitting a regulator op on
another CPU during unbind, so I'd describe it as a narrow window
rather than a proven race. Does that sound right to you?
Happy to just send v2 with the lock fix if that works for you.
Thanks,
Carlo
* [PATCH v2] net: usb: rtl8150: handle link status read failures
From: Yousef Alhouseen @ 2026-06-28 16:25 UTC
To: Petko Manolov, Andrew Lunn, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni
Cc: linux-usb, netdev, linux-kernel, stable,
syzbot+9db6c624635564ad813c, Yousef Alhouseen
In-Reply-To: <20260628151835.GC14404@carbon.k.g>
set_carrier() ignores the result of the USB control transfer and tests
the stack variable supplied as its receive buffer. If the device rejects
or aborts the request, that variable remains uninitialized and the driver
chooses an arbitrary carrier state.
Leave the existing carrier state unchanged when the link status cannot be
read. A transient USB error should not be treated as link loss.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+9db6c624635564ad813c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9db6c624635564ad813c
Cc: stable@vger.kernel.org
Signed-off-by: Yousef Alhouseen <alhouseenyousef@gmail.com>
---
drivers/net/usb/rtl8150.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/usb/rtl8150.c b/drivers/net/usb/rtl8150.c
index c880c95c41a5..d51e43170e03 100644
--- a/drivers/net/usb/rtl8150.c
+++ b/drivers/net/usb/rtl8150.c
@@ -732,7 +732,9 @@ static void set_carrier(struct net_device *netdev)
rtl8150_t *dev = netdev_priv(netdev);
short tmp;
- get_registers(dev, CSCR, 2, &tmp);
+ if (get_registers(dev, CSCR, 2, &tmp))
+ return;
+
if (tmp & CSCR_LINK_STATUS)
netif_carrier_on(netdev);
else
--
2.54.0
* Re: [PATCH v2] net: usb: rtl8150: handle link status read failures
From: Andrew Lunn @ 2026-06-28 16:40 UTC
To: Yousef Alhouseen
Cc: Petko Manolov, Andrew Lunn, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, linux-usb, netdev, linux-kernel,
stable, syzbot+9db6c624635564ad813c
In-Reply-To: <20260628162528.8273-1-alhouseenyousef@gmail.com>
On Sun, Jun 28, 2026 at 06:25:28PM +0200, Yousef Alhouseen wrote:
> set_carrier() ignores the result of the USB control transfer and tests
> the stack variable supplied as its receive buffer. If the device rejects
> or aborts the request, that variable remains uninitialized and the driver
> chooses an arbitrary carrier state.
>
> Leave the existing carrier state unchanged when the link status cannot be
> read. A transient USB error should not be treated as link loss.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: syzbot+9db6c624635564ad813c@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=9db6c624635564ad813c
> Cc: stable@vger.kernel.org
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
Does this issue bother people?
I think it would be better to submit to net-next:
https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html
Please don't thread patch versions together.
Also, a Suggested-by: might be appropriate here.
Andrew
---
pw-bot: cr
* Re: iproute2: trailing whitespace in man pages
From: Bjarni Ingi Gislason @ 2026-06-28 16:45 UTC
To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20260626081534.70b9f2ff@phoenix.local>
grep -n -e ' $' -e '\\~$' -e ' \\f.$' -e ' \\"' -e ' "$' <file>
Added option -e ' "$' for the last argument of macros.
* [PATCH net-next v3] net: usb: rtl8150: handle link status read failures
From: Yousef Alhouseen @ 2026-06-28 16:50 UTC
To: Petko Manolov, Andrew Lunn, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni
Cc: linux-usb, netdev, linux-kernel, syzbot+9db6c624635564ad813c,
Yousef Alhouseen
set_carrier() ignores the result of the USB control transfer and tests
the stack variable supplied as its receive buffer. If the device rejects
or aborts the request, that variable remains uninitialized and the driver
chooses an arbitrary carrier state.
Leave the existing carrier state unchanged when the link status cannot be
read. A transient USB error should not be treated as link loss.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+9db6c624635564ad813c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9db6c624635564ad813c
Suggested-by: Petko Manolov <petkan@nucleusys.com>
Signed-off-by: Yousef Alhouseen <alhouseenyousef@gmail.com>
---
drivers/net/usb/rtl8150.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/usb/rtl8150.c b/drivers/net/usb/rtl8150.c
index c880c95c41a5..d51e43170e03 100644
--- a/drivers/net/usb/rtl8150.c
+++ b/drivers/net/usb/rtl8150.c
@@ -732,7 +732,9 @@ static void set_carrier(struct net_device *netdev)
rtl8150_t *dev = netdev_priv(netdev);
short tmp;
- get_registers(dev, CSCR, 2, &tmp);
+ if (get_registers(dev, CSCR, 2, &tmp))
+ return;
+
if (tmp & CSCR_LINK_STATUS)
netif_carrier_on(netdev);
else
--
2.54.0
* Re: [PATCH iproute2-next 0/7] devlink: add per-port resource support
From: David Ahern @ 2026-06-28 17:14 UTC
To: Tariq Toukan, Stephen Hemminger, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Andrew Lunn, David S. Miller
Cc: Donald Hunter, Simon Horman, Jiri Pirko, Jonathan Corbet,
Shuah Khan, Saeed Mahameed, Leon Romanovsky, Mark Bloch,
Shuah Khan, Matthieu Baerts (NGI0), Chuck Lever, Or Har-Toov,
Carolina Jubran, Moshe Shemesh, Shay Drori, Dragos Tatulea,
Daniel Zahka, Shahar Shitrit, Jacob Keller, Cosmin Ratiu,
Parav Pandit, Kees Cook, Adithya Jayachandran, Daniel Jurgens,
netdev, linux-kernel, linux-doc, linux-rdma, linux-kselftest,
Gal Pressman, Ido Schimmel, Jiri Pirko, Petr Machata
In-Reply-To: <20260609053953.487152-1-tariqt@nvidia.com>
On 6/8/26 11:39 PM, Tariq Toukan wrote:
> Hi,
>
> Currently, devlink resource show only supports querying a specific
> device and displays device-level resources. However, some resources
> are per-port, such as the maximum number of SFs that can be created
> on a specific PF port.
>
> This series extends devlink resource show with full support for
> port-level resources, including a dump mode, per-port querying syntax,
> and scope filtering. In preparation for these features, the first two
> patches refactor how dpipe tables are handled to unblock dump support
> and ensure errors in secondary queries are non-fatal.
>
> The series is organized as follows:
>
> Patch 1 splits the dpipe tables display into a separate function.
>
> Patch 2 moves the dpipe tables query into the per-device resource show
> callback, ensuring it behaves correctly during a multi-device dump.
>
> Patch 3 fixes a pre-existing memory leak in resource_ctx_fini.
>
> Patch 4 adds dump support to resource show (no device required).
>
> Patch 5 shows port-level resources returned in a dump reply.
>
> Patch 6 adds DEV/PORT_INDEX syntax to resource show.
>
> Patch 7 adds scope filter to resource show.
>
> With this series, users can query resources at all levels:
>
> $ devlink resource show
> pci/0000:03:00.0:
> name local_max_SFs size 508 unit entry
> name external_max_SFs size 508 unit entry
> pci/0000:03:00.0/196608:
> name max_SFs size 20 unit entry
>
> $ devlink resource show scope dev
> pci/0000:03:00.0:
> name local_max_SFs size 508 unit entry
> name external_max_SFs size 508 unit entry
>
> $ devlink resource show scope port
> pci/0000:03:00.0/196608:
> name max_SFs size 20 unit entry
>
> $ devlink resource show pci/0000:03:00.0/196608
> pci/0000:03:00.0/196608:
> name max_SFs size 20 unit entry
>
> This series is the userspace counterpart to the kernel series:
> https://lore.kernel.org/all/20260407194107.148063-1-tariqt@nvidia.com/
>
> Ido Schimmel (2):
> devlink: Split dpipe tables output to a separate function
> devlink: Move dpipe tables query to resources show callback
>
> Or Har-Toov (5):
> devlink: fix memory leak in resource_ctx_fini
> devlink: add dump support for resource show
> devlink: show port resources in resource dump
> devlink: add per-port resource show support
> devlink: add scope filter to resource show
>
> bash-completion/devlink | 8 ++
> devlink/devlink.c | 202 +++++++++++++++++++++++++++---------
> man/man8/devlink-resource.8 | 34 +++++-
> 3 files changed, 192 insertions(+), 52 deletions(-)
>
>
> base-commit: 7340b539841dc739bc0b813e8e86825bc1eb5a4c
applied to iproute2-next with the fixup recommended by Claude and
confirmed by Or
* Re: [PATCH iproute2-next] devlink: support u32-array values in devlink param show/set
From: David Ahern @ 2026-06-28 17:19 UTC
To: Ratheesh Kannoth, stephen, kuba, linux-kernel, netdev
Cc: andrew+netdev, davem, edumazet, pabeni, sgoutham
In-Reply-To: <20260615041042.549715-1-rkannoth@marvell.com>
On 6/14/26 10:10 PM, Ratheesh Kannoth wrote:
> @@ -3904,6 +3935,14 @@ static int cmd_dev_param_set(struct dl *dl)
> if (!strcmp(dl->opts.param_value, ctx.value.vstr))
> return 0;
> break;
> + case 129:
no magic numbers. What does 129 represent? Is there a named macro for
it? If not, why not if this is part of a UAPI?
> + buf = (char *)dl->opts.param_value;
> + token = strtok(buf, delim);
> + while (token) {
> + mnl_attr_put_u32(nlh, DEVLINK_ATTR_PARAM_VALUE_DATA, atoi(token));
> + token = strtok(NULL, delim);
> + }
> + break;
> default:
> printf("Value type not supported\n");
> return -ENOTSUP;
* Re: [PATCH iproute2-next v3] rdma: display resource limits in curr/max format
From: David Ahern @ 2026-06-28 17:22 UTC
To: Tao Cui, leonro; +Cc: linux-rdma, netdev, Tao Cui
In-Reply-To: <20260615005315.169582-1-cui.tao@linux.dev>
On 6/14/26 6:53 PM, Tao Cui wrote:
> diff --git a/rdma/include/uapi/rdma/rdma_netlink.h b/rdma/include/uapi/rdma/rdma_netlink.h
> index 4356ec4a..e5b8b065 100644
> --- a/rdma/include/uapi/rdma/rdma_netlink.h
> +++ b/rdma/include/uapi/rdma/rdma_netlink.h
> @@ -604,6 +604,11 @@ enum rdma_nldev_attr {
> RDMA_NLDEV_ATTR_FRMR_POOL_PINNED_HANDLES, /* u32 */
> RDMA_NLDEV_ATTR_FRMR_POOL_KEY_KERNEL_VENDOR_KEY, /* u64 */
>
> + /*
> + * Resource summary entry maximum value.
> + */
> + RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_MAX, /* u64 */
I do not see this uapi in Linus' tree. What is the status of the kernel
commit? Put a reference to the kernel patches in the commit message.
* Re: [PATCH iproute-next v3] ipaddress: add support for showing IPv4 devconf attributes
From: David Ahern @ 2026-06-28 17:30 UTC
To: Fernando Fernandez Mancera, netdev
Cc: stephen, davem, edumazet, kuba, pabeni, horms
In-Reply-To: <20260614182515.8765-1-fmancera@suse.de>
On 6/14/26 12:25 PM, Fernando Fernandez Mancera wrote:
> This patch introduces support for showing IPv4 devconf attributes on
> detailed output of an interface e.g "ip -d link show dev enp1s0".
>
> Additionally, this refactors 'print_af_spec()' to sequentially process
> both AF_INET and AF_INET6 attributes rather than returning early if
> AF_INET6 is missing.
refactors should be a separate patch.
>
> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
> ---
> v2: changed print_string to print_bool for boolean attributes
> v3: use print_bool for JSON output only
> ---
> ip/ipaddress.c | 313 ++++++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 273 insertions(+), 40 deletions(-)
>
> diff --git a/ip/ipaddress.c b/ip/ipaddress.c
> index 6017bc83..1530b836 100644
> --- a/ip/ipaddress.c
> +++ b/ip/ipaddress.c
> @@ -23,6 +23,7 @@
> #include <linux/netdevice.h>
> #include <linux/if_arp.h>
> #include <linux/if_infiniband.h>
> +#include <linux/ip.h>
> #include <linux/sockios.h>
> #include <linux/net_namespace.h>
>
> @@ -294,53 +295,285 @@ static void print_linktype(FILE *fp, struct rtattr *tb)
> close_json_object();
> }
>
> +static void print_inet(FILE *fp, struct rtattr *inet_attr)
> +{
> + struct rtattr *tb[IFLA_INET_MAX + 1];
> +
> + parse_rtattr_nested(tb, IFLA_INET_MAX, inet_attr);
> +
> + if (tb[IFLA_INET_CONF]) {
> + int *conf = RTA_DATA(tb[IFLA_INET_CONF]);
> + int max_elements = RTA_PAYLOAD(tb[IFLA_INET_CONF]) / sizeof(int);
> +
> + if (max_elements >= IPV4_DEVCONF_FORWARDING) {
> + print_bool(PRINT_JSON, "forwarding", NULL,
> + conf[IPV4_DEVCONF_FORWARDING - 1]);
> + print_string(PRINT_FP, "forwarding", "forwarding %s ",
> + conf[IPV4_DEVCONF_FORWARDING - 1] ? "on" : "off");
> + }
> +
> + if (max_elements >= IPV4_DEVCONF_MC_FORWARDING) {
> + print_bool(PRINT_JSON, "mc_forwarding", NULL,
> + conf[IPV4_DEVCONF_MC_FORWARDING - 1]);
> + print_string(PRINT_FP, "mc_forwarding", "mc_forwarding %s ",
> + conf[IPV4_DEVCONF_MC_FORWARDING - 1] ? "on" : "off");
> + }
> +
> + if (max_elements >= IPV4_DEVCONF_PROXY_ARP) {
> + print_bool(PRINT_JSON, "proxy_arp", NULL,
> + conf[IPV4_DEVCONF_PROXY_ARP - 1]);
> + print_string(PRINT_FP, "proxy_arp", "proxy_arp %s ",
> + conf[IPV4_DEVCONF_PROXY_ARP - 1] ? "on" : "off");
> + }
> +
> + if (max_elements >= IPV4_DEVCONF_ACCEPT_REDIRECTS) {
> + print_bool(PRINT_JSON, "accept_redirects", NULL,
> + conf[IPV4_DEVCONF_ACCEPT_REDIRECTS - 1]);
> + print_string(PRINT_FP, "accept_redirects",
> + "accept_redirects %s ",
> + conf[IPV4_DEVCONF_ACCEPT_REDIRECTS - 1] ? "on" : "off");
As I stated in the last patch for devconf:
"iproute2 follows netdev with coding standards and those need to be
followed as long as humans are in the loop. Please make sure follow on
patches adhere to roughly 80 columns with a little extra if it improves
readability (and of course strings are not broken across lines)."
Use print_on_off for example for these or use a temp variable for the
attributes.
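
One way to read the "temp variable" suggestion is sketched below. It is only an illustration under stated assumptions: the `SKETCH_*` indices and `sketch_format_conf()` are hypothetical, the output goes to a buffer rather than through the iproute2 print helpers, and the table-driven loop is a choice of this sketch, not something requested in the review. It shows how a local variable for the attribute value keeps each line within roughly 80 columns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical 1-based indices, mirroring the IPV4_DEVCONF_* convention. */
enum {
	SKETCH_FORWARDING = 1,
	SKETCH_MC_FORWARDING,
	SKETCH_PROXY_ARP,
	SKETCH_MAX
};

/* Formats "name on|off " for each attribute present in conf[].
 * Returns the number of bytes written, or -1 if buf is too small. */
int sketch_format_conf(char *buf, size_t size, const int *conf, int n_elems)
{
	static const char *const names[SKETCH_MAX] = {
		[SKETCH_FORWARDING]    = "forwarding",
		[SKETCH_MC_FORWARDING] = "mc_forwarding",
		[SKETCH_PROXY_ARP]     = "proxy_arp",
	};
	int off = 0;

	assert(buf != NULL && conf != NULL && size > 0);
	buf[0] = '\0';

	for (int i = 1; i < SKETCH_MAX && i <= n_elems; i++) {
		int val = conf[i - 1];  /* temp variable keeps lines short */
		size_t room = size - (size_t)off;
		int n = snprintf(buf + off, room, "%s %s ", names[i],
				 val ? "on" : "off");

		if (n < 0 || (size_t)n >= room)
			return -1;
		off += n;
	}
	return off;
}
```

The bounds check on `n_elems` plays the same role as the `max_elements >= IPV4_DEVCONF_*` tests in the quoted patch: attributes beyond the received array length are simply not printed.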
* Re: Question: bridge: clarify MST VLAN list RCU traversal contract
From: Ido Schimmel @ 2026-06-28 17:49 UTC
To: Runyu Xiao
Cc: Nikolay Aleksandrov, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, bridge, netdev,
linux-kernel, jianhao.xu
In-Reply-To: <20260627132539.3701630-1-runyu.xiao@seu.edu.cn>
On Sat, Jun 27, 2026 at 09:25:39PM +0800, Runyu Xiao wrote:
> Hi bridge maintainers,
>
> This question comes from a candidate found by our static analysis tool
> and then manually reviewed against the current tree. The audit used
> CONFIG_PROVE_RCU_LIST as target-matched triage evidence; I am asking
> for maintainer guidance because the source-level review did not prove
> a use-after-free.
>
> A CONFIG_PROVE_RCU_LIST audit flags the VLAN-list traversal in
> br_mst_info_size():
>
> net/bridge/br_mst.c:251 br_mst_info_size()
>
> The helper walks vg->vlan_list with list_for_each_entry_rcu(). In the
> direct local context, br_get_link_af_size_filtered() first enters an
> RCU read-side section, resolves the bridge port or bridge VLAN group,
> and calls br_get_num_vlan_infos(vg, filter_mask). That local RCU
> read-side section is then dropped before the later MST sizing call:
>
> net/bridge/br_netlink.c:104 rcu_read_lock()
> net/bridge/br_netlink.c:113 br_get_num_vlan_infos(vg, filter_mask)
> net/bridge/br_netlink.c:114 rcu_read_unlock()
> net/bridge/br_netlink.c:123 br_mst_info_size(vg)
>
> The helper is registered through rtnl_af_ops.get_link_af_size, and
> bridge VLAN updates appear RTNL-centered, so the broader rtnetlink
> sizing path may already provide the intended serialization. I am not
> claiming a use-after-free here. The question is only whether the
> RCU-list traversal contract around br_mst_info_size() should be made
> explicit enough for CONFIG_PROVE_RCU_LIST to see it.
>
> Would you prefer one of these directions?
>
> 1. keep the MST sizing loop inside an explicit rcu_read_lock() in
> br_get_link_af_size_filtered();
>
> 2. pass a confirmed RTNL lockdep condition to the iterator in
> br_mst_info_size();
>
> 3. document that the outer rtnetlink sizing path is the required
> protection and leave the helper unchanged;
>
> 4. use a different bridge-specific pattern.
>
> I am intentionally sending this as a maintainer question rather than a
> patch because the right contract seems to depend on the bridge/rtnetlink
> caller semantics.
I don't think anything needs to change. AFAICT, br_mst_info_size() is
only reachable via the get_link_af_size() callback and
rtnl_link_get_af_size() always invokes it from an RCU read-side critical
section.
Did you see a splat with CONFIG_PROVE_RCU_LIST?
* [PATCH net-next v4] vsock/virtio: rewrite MSG_ZEROCOPY flag handling
From: Arseniy Krasnov @ 2026-06-28 18:20 UTC
To: Stefan Hajnoczi, Stefano Garzarella, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Michael S. Tsirkin,
Jason Wang, Bobby Eshleman, Xuan Zhuo, Eugenio Pérez,
Simon Horman
Cc: kvm, virtualization, netdev, linux-kernel, oxffffaa, rulkc,
Arseniy Krasnov
The MSG_ZEROCOPY handling was logically based on the TCP implementation,
so to make future maintenance easier, rewrite it the way TCP does it (as
in 'tcp_sendmsg_locked()'). This also adds handling for the case where
'msg_ubuf' is already set.
Signed-off-by: Arseniy Krasnov <avkrasnov@rulkc.org>
---
Changelog v1->v2:
* Rebase on last 'net-next'. Don't need 'skb_zcopy_set()' now - it was
already added.
Changelog v2->v3:
* Update commit message.
* Remove one empty line.
Changelog v3->v4:
* Update commit message.
net/vmw_vsock/virtio_transport_common.c | 47 ++++++++++++-------------
1 file changed, 22 insertions(+), 25 deletions(-)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 09475007165b..41c2a0b82a8e 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -328,38 +328,35 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
return pkt_len;
- if (info->msg) {
- /* If zerocopy is not enabled by 'setsockopt()', we behave as
- * there is no MSG_ZEROCOPY flag set.
+ if (info->msg && (info->msg->msg_flags & MSG_ZEROCOPY)) {
+ /* If 'info->msg' is not NULL, this is only VIRTIO_VSOCK_OP_RW.
+ * 'MSG_ZEROCOPY' flag handling here is based on the same flag
+ * handling from 'tcp_sendmsg_locked()'.
*/
- if (!sock_flag(sk_vsock(vsk), SOCK_ZEROCOPY))
- info->msg->msg_flags &= ~MSG_ZEROCOPY;
+ if (info->msg->msg_ubuf) {
+ uarg = info->msg->msg_ubuf;
+ can_zcopy = virtio_transport_can_zcopy(t_ops, info, pkt_len);
+ } else if (sock_flag(sk_vsock(vsk), SOCK_ZEROCOPY)) {
+ uarg = msg_zerocopy_realloc(sk_vsock(vsk), pkt_len,
+ NULL, false);
+ if (!uarg) {
+ virtio_transport_put_credit(vvs, pkt_len);
+ return -ENOMEM;
+ }
- if (info->msg->msg_flags & MSG_ZEROCOPY)
can_zcopy = virtio_transport_can_zcopy(t_ops, info, pkt_len);
+ if (!can_zcopy)
+ uarg_to_msgzc(uarg)->zerocopy = 0;
+ have_uref = true;
+ }
+
+ /* 'can_zcopy' means that this transmission will be
+ * in zerocopy way (e.g. using 'frags' array).
+ */
if (can_zcopy)
max_skb_len = min_t(u32, VIRTIO_VSOCK_MAX_PKT_BUF_SIZE,
(MAX_SKB_FRAGS * PAGE_SIZE));
-
- if (info->msg->msg_flags & MSG_ZEROCOPY &&
- info->op == VIRTIO_VSOCK_OP_RW) {
- uarg = info->msg->msg_ubuf;
-
- if (!uarg) {
- uarg = msg_zerocopy_realloc(sk_vsock(vsk),
- pkt_len, NULL, false);
- if (!uarg) {
- virtio_transport_put_credit(vvs, pkt_len);
- return -ENOMEM;
- }
-
- if (!can_zcopy)
- uarg_to_msgzc(uarg)->zerocopy = 0;
-
- have_uref = true;
- }
- }
}
rest_len = pkt_len;
--
2.25.1
^ permalink raw reply related
* Re: [PATCH net-next v4] vsock/virtio: rewrite MSG_ZEROCOPY flag handling
From: Michael S. Tsirkin @ 2026-06-28 18:35 UTC (permalink / raw)
To: Arseniy Krasnov
Cc: Stefan Hajnoczi, Stefano Garzarella, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jason Wang,
Bobby Eshleman, Xuan Zhuo, Eugenio Pérez, Simon Horman, kvm,
virtualization, netdev, linux-kernel, oxffffaa, rulkc
In-Reply-To: <20260628182052.951760-1-avkrasnov@rulkc.org>
On Sun, Jun 28, 2026 at 09:20:52PM +0300, Arseniy Krasnov wrote:
> The MSG_ZEROCOPY flag handling was logically based on the TCP
> implementation, so to make future maintenance easier, rewrite it the
> TCP way (as in 'tcp_sendmsg_locked()'). As a side effect, this patch
> also handles the case when 'msg_ubuf' is already set.
>
> Signed-off-by: Arseniy Krasnov <avkrasnov@rulkc.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> Changelog v1->v2:
> * Rebase on last 'net-next'. Don't need 'skb_zcopy_set()' now - it was
> already added.
> Changelog v2->v3:
> * Update commit message.
> * Remove one empty line.
> Changelog v3->v4:
> * Update commit message.
>
> net/vmw_vsock/virtio_transport_common.c | 47 ++++++++++++-------------
> 1 file changed, 22 insertions(+), 25 deletions(-)
>
> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> index 09475007165b..41c2a0b82a8e 100644
> --- a/net/vmw_vsock/virtio_transport_common.c
> +++ b/net/vmw_vsock/virtio_transport_common.c
> @@ -328,38 +328,35 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
> if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
> return pkt_len;
>
> - if (info->msg) {
> - /* If zerocopy is not enabled by 'setsockopt()', we behave as
> - * there is no MSG_ZEROCOPY flag set.
> + if (info->msg && (info->msg->msg_flags & MSG_ZEROCOPY)) {
> + /* If 'info->msg' is not NULL, this is only VIRTIO_VSOCK_OP_RW.
> + * 'MSG_ZEROCOPY' flag handling here is based on the same flag
> + * handling from 'tcp_sendmsg_locked()'.
> */
> - if (!sock_flag(sk_vsock(vsk), SOCK_ZEROCOPY))
> - info->msg->msg_flags &= ~MSG_ZEROCOPY;
> + if (info->msg->msg_ubuf) {
> + uarg = info->msg->msg_ubuf;
> + can_zcopy = virtio_transport_can_zcopy(t_ops, info, pkt_len);
> + } else if (sock_flag(sk_vsock(vsk), SOCK_ZEROCOPY)) {
> + uarg = msg_zerocopy_realloc(sk_vsock(vsk), pkt_len,
> + NULL, false);
> + if (!uarg) {
> + virtio_transport_put_credit(vvs, pkt_len);
> + return -ENOMEM;
> + }
>
> - if (info->msg->msg_flags & MSG_ZEROCOPY)
> can_zcopy = virtio_transport_can_zcopy(t_ops, info, pkt_len);
> + if (!can_zcopy)
> + uarg_to_msgzc(uarg)->zerocopy = 0;
>
> + have_uref = true;
> + }
> +
> + /* 'can_zcopy' means that this transmission will be
> + * in zerocopy way (e.g. using 'frags' array).
> + */
> if (can_zcopy)
> max_skb_len = min_t(u32, VIRTIO_VSOCK_MAX_PKT_BUF_SIZE,
> (MAX_SKB_FRAGS * PAGE_SIZE));
> -
> - if (info->msg->msg_flags & MSG_ZEROCOPY &&
> - info->op == VIRTIO_VSOCK_OP_RW) {
> - uarg = info->msg->msg_ubuf;
> -
> - if (!uarg) {
> - uarg = msg_zerocopy_realloc(sk_vsock(vsk),
> - pkt_len, NULL, false);
> - if (!uarg) {
> - virtio_transport_put_credit(vvs, pkt_len);
> - return -ENOMEM;
> - }
> -
> - if (!can_zcopy)
> - uarg_to_msgzc(uarg)->zerocopy = 0;
> -
> - have_uref = true;
> - }
> - }
> }
>
> rest_len = pkt_len;
> --
> 2.25.1
^ permalink raw reply
* [PATCH] net: sysfs: cleanup coding style
From: Lucas Poupeau @ 2026-06-28 18:58 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni
Cc: horms, kuniyu, sdf, brauner, krikku, netdev, linux-kernel,
Lucas Poupeau
Replace DEVICE_ATTR() with DEVICE_ATTR_RW() where applicable to
follow kernel coding style conventions. Add blank lines between
function definitions and macro invocations. Fold multi-line
attribute initializations onto single lines where it improves
readability without exceeding the line length limit.
Signed-off-by: Lucas Poupeau <lucasp.linux@gmail.com>
---
net/core/net-sysfs.c | 43 ++++++++++++++++++++-----------------------
1 file changed, 20 insertions(+), 23 deletions(-)
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 0e71c9ed41e8..14efe81d006b 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -494,6 +494,7 @@ static ssize_t mtu_store(struct device *dev, struct device_attribute *attr,
{
return netdev_store(dev, attr, buf, len, change_mtu);
}
+
NETDEVICE_SHOW_RW(mtu, fmt_dec);
static int change_flags(struct net_device *dev, unsigned long new_flags)
@@ -506,6 +507,7 @@ static ssize_t flags_store(struct device *dev, struct device_attribute *attr,
{
return netdev_store(dev, attr, buf, len, change_flags);
}
+
NETDEVICE_SHOW_RW(flags, fmt_hex);
static ssize_t tx_queue_len_store(struct device *dev,
@@ -517,6 +519,7 @@ static ssize_t tx_queue_len_store(struct device *dev,
return netdev_store(dev, attr, buf, len, dev_change_tx_queue_len);
}
+
NETDEVICE_SHOW_RW(tx_queue_len, fmt_dec);
static int change_gro_flush_timeout(struct net_device *dev, unsigned long val)
@@ -534,6 +537,7 @@ static ssize_t gro_flush_timeout_store(struct device *dev,
return netdev_lock_store(dev, attr, buf, len, change_gro_flush_timeout);
}
+
NETDEVICE_SHOW_RW(gro_flush_timeout, fmt_ulong);
static int change_napi_defer_hard_irqs(struct net_device *dev, unsigned long val)
@@ -555,6 +559,7 @@ static ssize_t napi_defer_hard_irqs_store(struct device *dev,
return netdev_lock_store(dev, attr, buf, len,
change_napi_defer_hard_irqs);
}
+
NETDEVICE_SHOW_RW(napi_defer_hard_irqs, fmt_uint);
static ssize_t ifalias_store(struct device *dev, struct device_attribute *attr,
@@ -612,8 +617,9 @@ static ssize_t group_store(struct device *dev, struct device_attribute *attr,
{
return netdev_store(dev, attr, buf, len, change_group);
}
+
NETDEVICE_SHOW(group, fmt_dec);
-static DEVICE_ATTR(netdev_group, 0644, group_show, group_store);
+static DEVICE_ATTR_RW(netdev_group);
static int change_proto_down(struct net_device *dev, unsigned long proto_down)
{
@@ -626,6 +632,7 @@ static ssize_t proto_down_store(struct device *dev,
{
return netdev_store(dev, attr, buf, len, change_proto_down);
}
+
NETDEVICE_SHOW_RW(proto_down, fmt_dec);
static ssize_t phys_port_id_show(struct device *dev,
@@ -1125,8 +1132,7 @@ static struct rx_queue_attribute rps_cpus_attribute __ro_after_init
= __ATTR(rps_cpus, 0644, show_rps_map, store_rps_map);
static struct rx_queue_attribute rps_dev_flow_table_cnt_attribute __ro_after_init
- = __ATTR(rps_flow_cnt, 0644,
- show_rps_dev_flow_table_cnt, store_rps_dev_flow_table_cnt);
+ = __ATTR(rps_flow_cnt, 0644, show_rps_dev_flow_table_cnt, store_rps_dev_flow_table_cnt);
#endif /* CONFIG_RPS */
static struct attribute *rx_queue_default_attrs[] __ro_after_init = {
@@ -1279,8 +1285,7 @@ static int rx_queue_change_owner(struct net_device *dev, int index, kuid_t kuid,
return error;
if (dev->sysfs_rx_queue_group)
- error = sysfs_group_change_owner(
- kobj, dev->sysfs_rx_queue_group, kuid, kgid);
+ error = sysfs_group_change_owner(kobj, dev->sysfs_rx_queue_group, kuid, kgid);
return error;
}
@@ -1358,6 +1363,7 @@ struct netdev_queue_attribute {
struct netdev_queue *queue, const char *buf,
size_t len);
};
+
#define to_netdev_queue_attr(_attr) \
container_of(_attr, struct netdev_queue_attribute, attr)
@@ -1366,8 +1372,7 @@ struct netdev_queue_attribute {
static ssize_t netdev_queue_attr_show(struct kobject *kobj,
struct attribute *attr, char *buf)
{
- const struct netdev_queue_attribute *attribute
- = to_netdev_queue_attr(attr);
+ const struct netdev_queue_attribute *attribute = to_netdev_queue_attr(attr);
struct netdev_queue *queue = to_netdev_queue(kobj);
if (!attribute->show)
@@ -1380,8 +1385,7 @@ static ssize_t netdev_queue_attr_store(struct kobject *kobj,
struct attribute *attr,
const char *buf, size_t count)
{
- const struct netdev_queue_attribute *attribute
- = to_netdev_queue_attr(attr);
+ const struct netdev_queue_attribute *attribute = to_netdev_queue_attr(attr);
struct netdev_queue *queue = to_netdev_queue(kobj);
if (!attribute->store)
@@ -1499,15 +1503,12 @@ static ssize_t tx_maxrate_store(struct kobject *kobj, struct attribute *attr,
return err;
}
-static struct netdev_queue_attribute queue_tx_maxrate __ro_after_init
- = __ATTR_RW(tx_maxrate);
+static struct netdev_queue_attribute queue_tx_maxrate __ro_after_init = __ATTR_RW(tx_maxrate);
#endif
-static struct netdev_queue_attribute queue_trans_timeout __ro_after_init
- = __ATTR_RO(tx_timeout);
+static struct netdev_queue_attribute queue_trans_timeout __ro_after_init = __ATTR_RO(tx_timeout);
-static struct netdev_queue_attribute queue_traffic_class __ro_after_init
- = __ATTR_RO(traffic_class);
+static struct netdev_queue_attribute queue_traffic_class __ro_after_init = __ATTR_RO(traffic_class);
#ifdef CONFIG_BQL
/*
@@ -1565,8 +1566,7 @@ static ssize_t bql_set_hold_time(struct kobject *kobj, struct attribute *attr,
}
static struct netdev_queue_attribute bql_hold_time_attribute __ro_after_init
- = __ATTR(hold_time, 0644,
- bql_show_hold_time, bql_set_hold_time);
+ = __ATTR(hold_time, 0644, bql_show_hold_time, bql_set_hold_time);
static ssize_t bql_show_stall_thrs(struct kobject *kobj, struct attribute *attr,
struct netdev_queue *queue, char *buf)
@@ -1660,8 +1660,7 @@ static ssize_t bql_set_ ## NAME(struct kobject *kobj, \
} \
\
static struct netdev_queue_attribute bql_ ## NAME ## _attribute __ro_after_init \
- = __ATTR(NAME, 0644, \
- bql_show_ ## NAME, bql_set_ ## NAME)
+ = __ATTR(NAME, 0644, bql_show_ ## NAME, bql_set_ ## NAME)
BQL_ATTR(limit, limit);
BQL_ATTR(limit_max, max_limit);
@@ -1816,8 +1815,7 @@ static ssize_t xps_cpus_store(struct kobject *kobj, struct attribute *attr,
return err ? : len;
}
-static struct netdev_queue_attribute xps_cpus_attribute __ro_after_init
- = __ATTR_RW(xps_cpus);
+static struct netdev_queue_attribute xps_cpus_attribute __ro_after_init = __ATTR_RW(xps_cpus);
static ssize_t xps_rxqs_show(struct kobject *kobj, struct attribute *attr,
struct netdev_queue *queue, char *buf)
@@ -1886,8 +1884,7 @@ static ssize_t xps_rxqs_store(struct kobject *kobj, struct attribute *attr,
return err ? : len;
}
-static struct netdev_queue_attribute xps_rxqs_attribute __ro_after_init
- = __ATTR_RW(xps_rxqs);
+static struct netdev_queue_attribute xps_rxqs_attribute __ro_after_init = __ATTR_RW(xps_rxqs);
#endif /* CONFIG_XPS */
static struct attribute *netdev_queue_default_attrs[] __ro_after_init = {
--
2.53.0
^ permalink raw reply related
* Re: [PATCH] net: sysfs: cleanup coding style
From: Andrew Lunn @ 2026-06-28 20:14 UTC (permalink / raw)
To: Lucas Poupeau
Cc: davem, edumazet, kuba, pabeni, horms, kuniyu, sdf, brauner,
krikku, netdev, linux-kernel
In-Reply-To: <20260628185824.231250-1-lucasp.linux@gmail.com>
On Sun, Jun 28, 2026 at 08:58:24PM +0200, Lucas Poupeau wrote:
> Replace DEVICE_ATTR() with DEVICE_ATTR_RW() where applicable to
> follow kernel coding style conventions. Add blank lines between
> function definitions and macro invocations. Fold multi-line
> attribute initializations onto single lines where it improves
> readability without exceeding the line length limit.
When you have a list like this, it means you should have a patch
series, one patch per item on the list.
Please also take a read of
https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html
Andrew
---
pw-bot: cr
^ permalink raw reply
* Re: [PATCH net-next v3] net: usb: rtl8150: handle link status read failures
From: Andrew Lunn @ 2026-06-28 20:17 UTC (permalink / raw)
To: Yousef Alhouseen
Cc: Petko Manolov, Andrew Lunn, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, linux-usb, netdev, linux-kernel,
syzbot+9db6c624635564ad813c
In-Reply-To: <20260628165033.17842-1-alhouseenyousef@gmail.com>
On Sun, Jun 28, 2026 at 06:50:33PM +0200, Yousef Alhouseen wrote:
> set_carrier() ignores the result of the USB control transfer and tests
> the stack variable supplied as its receive buffer. If the device rejects
> or aborts the request, that variable remains uninitialized and the driver
> chooses an arbitrary carrier state.
>
> Leave the existing carrier state unchanged when the link status cannot be
> read. A transient USB error should not be treated as link loss.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
If it is for net-next it does not need a Fixes tag.
Please also note that:
https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html
says:
* don’t repost your patches within one 24h period
It is a good idea to read that whole document.
Andrew
^ permalink raw reply
* [PATCH] Wireguard: Fix data-race in rx/tx counter
From: Rafael Passos @ 2026-06-28 20:38 UTC (permalink / raw)
To: rafael
Cc: Jason, andrew+netdev, davem, edumazet, kuba, linux-kernel, netdev,
pabeni, syzbot+9ca7674fa7521a3f1bc2, syzkaller-bugs, wireguard
In-Reply-To: <DJFTVX3FE7OD.2O8GTO84798T@rcpassos.me>
Fix a data race in the {rx,tx}_bytes counters of a WireGuard peer.
These values were incremented inside a read_lock_bh() section, which
does not serialize concurrent writers. Making them atomic was the
simplest way out. This was found by syzbot with KCSAN.
Reported-by: syzbot+9ca7674fa7521a3f1bc2@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=9ca7674fa7521a3f1bc2
Signed-off-by: Rafael Passos <rafael@rcpassos.me>
---
Hi,
I am posting this patch to better illustrate the discussion.
If this is a non-issue, that's fine.
As I mentioned in the previous email, this issue was reported by syzbot,
but I was not able to reproduce it.
I am also aware that atomic operations may add extra cost on older ARM CPUs.
I would like to hear from the community: would this be an adequate solution?
Thanks,
Rafael Passos
drivers/net/wireguard/netlink.c | 4 ++--
drivers/net/wireguard/peer.h | 2 +-
drivers/net/wireguard/receive.c | 2 +-
drivers/net/wireguard/socket.c | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c
index 1da7e98d0d509..ec66f79e46377 100644
--- a/drivers/net/wireguard/netlink.c
+++ b/drivers/net/wireguard/netlink.c
@@ -109,9 +109,9 @@ get_peer(struct wg_peer *peer, struct sk_buff *skb, struct dump_ctx *ctx)
sizeof(last_handshake), &last_handshake) ||
nla_put_u16(skb, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
peer->persistent_keepalive_interval) ||
- nla_put_u64_64bit(skb, WGPEER_A_TX_BYTES, peer->tx_bytes,
+ nla_put_u64_64bit(skb, WGPEER_A_TX_BYTES, atomic64_read(&peer->tx_bytes),
WGPEER_A_UNSPEC) ||
- nla_put_u64_64bit(skb, WGPEER_A_RX_BYTES, peer->rx_bytes,
+ nla_put_u64_64bit(skb, WGPEER_A_RX_BYTES, atomic64_read(&peer->rx_bytes),
WGPEER_A_UNSPEC) ||
nla_put_u32(skb, WGPEER_A_PROTOCOL_VERSION, 1))
goto err;
diff --git a/drivers/net/wireguard/peer.h b/drivers/net/wireguard/peer.h
index 718fb42bdac7e..01c4b80086759 100644
--- a/drivers/net/wireguard/peer.h
+++ b/drivers/net/wireguard/peer.h
@@ -49,7 +49,7 @@ struct wg_peer {
struct work_struct transmit_handshake_work, clear_peer_work, transmit_packet_work;
struct cookie latest_cookie;
struct hlist_node pubkey_hash;
- u64 rx_bytes, tx_bytes;
+ atomic64_t rx_bytes, tx_bytes;
struct timer_list timer_retransmit_handshake, timer_send_keepalive;
struct timer_list timer_new_handshake, timer_zero_key_material;
struct timer_list timer_persistent_keepalive;
diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
index eb8851113654f..500d86576c692 100644
--- a/drivers/net/wireguard/receive.c
+++ b/drivers/net/wireguard/receive.c
@@ -20,7 +20,7 @@
static void update_rx_stats(struct wg_peer *peer, size_t len)
{
dev_sw_netstats_rx_add(peer->device->dev, len);
- peer->rx_bytes += len;
+ atomic64_add(len, &peer->rx_bytes);
}
#define SKB_TYPE_LE32(skb) (((struct message_header *)(skb)->data)->type)
diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c
index 0028ef17dc716..9e8a49b9078f2 100644
--- a/drivers/net/wireguard/socket.c
+++ b/drivers/net/wireguard/socket.c
@@ -179,7 +179,7 @@ int wg_socket_send_skb_to_peer(struct wg_peer *peer, struct sk_buff *skb, u8 ds)
else
dev_kfree_skb(skb);
if (likely(!ret))
- peer->tx_bytes += skb_len;
+ atomic64_add(skb_len, &peer->tx_bytes);
read_unlock_bh(&peer->endpoint_lock);
return ret;
--
2.53.0
^ permalink raw reply related
* [PATCH net] nfc: nci: fix out-of-bounds read in activation parameter parsing
From: Muhammad Bilal @ 2026-06-28 21:00 UTC (permalink / raw)
To: David Heidelberg, netdev
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, oe-linux-nfc, linux-kernel
nci_extract_activation_params_iso_dep() and
nci_extract_activation_params_nfc_dep() receive a pointer into the
RF_INTF_ACTIVATED_NTF notification but are not told how many bytes
remain. Each reads a one-byte length field (rats_res_len,
attrib_res_len, atr_res_len or atr_req_len) and then memcpy()s that many
bytes from the packet. The length is clamped to the destination size,
but it is never checked against the remaining activation-parameter data,
so a notification whose length field is larger than the data present
reads past the end of the buffer.
The sibling nci_extract_rf_params_*() helpers were recently given a
data_len argument and matching remaining-length checks, but the
activation-parameter helpers were not updated.
Pass the remaining length down and validate each field against it before
copying, as the rf_params helpers do.
Fixes: ac2068384034 ("NFC: Parse NCI NFC-DEP activation params")
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Bilal <meatuni001@gmail.com>
---
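The check pattern described above can be sketched in plain userspace C. This is an illustration under stated assumptions, not the kernel code: extract_len_prefixed() and its parameters are hypothetical names, and the sketch returns -1 where the patch returns NCI_STATUS_RF_PROTOCOL_ERROR. As in the patch, the length byte is first clamped to the destination size and the clamped value is then checked against the bytes that remain.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a one-byte length, then copy that many
 * bytes into dst. Returns 0 on success, or -1 if the length byte or
 * the payload would lie beyond the `remaining` bytes of input.
 */
static int extract_len_prefixed(const uint8_t *data, size_t remaining,
				uint8_t *dst, size_t dst_cap,
				uint8_t *out_len)
{
	uint8_t len;

	/* The length byte itself must be present. */
	if (remaining < 1)
		return -1;

	len = *data++;
	remaining--;

	/* Clamp to the destination size, like min_t() in the patch. */
	if (len > dst_cap)
		len = (uint8_t)dst_cap;

	/* The step the old code skipped: compare with what is left. */
	if (remaining < len)
		return -1;

	if (len > 0)
		memcpy(dst, data, len);

	*out_len = len;
	return 0;
}
```

A packet such as { 3, 0xAA } claims three payload bytes but carries one, so the helper rejects it instead of reading past the end of the buffer.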
net/nfc/nci/ntf.c | 36 ++++++++++++++++++++++++++++++------
1 file changed, 30 insertions(+), 6 deletions(-)
diff --git a/net/nfc/nci/ntf.c b/net/nfc/nci/ntf.c
index c96512bb86531..63aa0a78472b1 100644
--- a/net/nfc/nci/ntf.c
+++ b/net/nfc/nci/ntf.c
@@ -525,7 +525,7 @@ static int nci_rf_discover_ntf_packet(struct nci_dev *ndev,
static int nci_extract_activation_params_iso_dep(struct nci_dev *ndev,
struct nci_rf_intf_activated_ntf *ntf,
- const __u8 *data)
+ const __u8 *data, ssize_t data_len)
{
struct activation_params_nfca_poll_iso_dep *nfca_poll;
struct activation_params_nfcb_poll_iso_dep *nfcb_poll;
@@ -533,9 +533,14 @@ static int nci_extract_activation_params_iso_dep(struct nci_dev *ndev,
switch (ntf->activation_rf_tech_and_mode) {
case NCI_NFC_A_PASSIVE_POLL_MODE:
nfca_poll = &ntf->activation_params.nfca_poll_iso_dep;
+ if (data_len < 1)
+ return NCI_STATUS_RF_PROTOCOL_ERROR;
nfca_poll->rats_res_len = min_t(__u8, *data++, NFC_ATS_MAXSIZE);
+ data_len--;
pr_debug("rats_res_len %d\n", nfca_poll->rats_res_len);
if (nfca_poll->rats_res_len > 0) {
+ if (data_len < nfca_poll->rats_res_len)
+ return NCI_STATUS_RF_PROTOCOL_ERROR;
memcpy(nfca_poll->rats_res,
data, nfca_poll->rats_res_len);
}
@@ -543,9 +548,14 @@ static int nci_extract_activation_params_iso_dep(struct nci_dev *ndev,
case NCI_NFC_B_PASSIVE_POLL_MODE:
nfcb_poll = &ntf->activation_params.nfcb_poll_iso_dep;
+ if (data_len < 1)
+ return NCI_STATUS_RF_PROTOCOL_ERROR;
nfcb_poll->attrib_res_len = min_t(__u8, *data++, 50);
+ data_len--;
pr_debug("attrib_res_len %d\n", nfcb_poll->attrib_res_len);
if (nfcb_poll->attrib_res_len > 0) {
+ if (data_len < nfcb_poll->attrib_res_len)
+ return NCI_STATUS_RF_PROTOCOL_ERROR;
memcpy(nfcb_poll->attrib_res,
data, nfcb_poll->attrib_res_len);
}
@@ -562,7 +572,7 @@ static int nci_extract_activation_params_iso_dep(struct nci_dev *ndev,
static int nci_extract_activation_params_nfc_dep(struct nci_dev *ndev,
struct nci_rf_intf_activated_ntf *ntf,
- const __u8 *data)
+ const __u8 *data, ssize_t data_len)
{
struct activation_params_poll_nfc_dep *poll;
struct activation_params_listen_nfc_dep *listen;
@@ -571,21 +581,33 @@ static int nci_extract_activation_params_nfc_dep(struct nci_dev *ndev,
case NCI_NFC_A_PASSIVE_POLL_MODE:
case NCI_NFC_F_PASSIVE_POLL_MODE:
poll = &ntf->activation_params.poll_nfc_dep;
+ if (data_len < 1)
+ return NCI_STATUS_RF_PROTOCOL_ERROR;
poll->atr_res_len = min_t(__u8, *data++,
NFC_ATR_RES_MAXSIZE - 2);
+ data_len--;
pr_debug("atr_res_len %d\n", poll->atr_res_len);
- if (poll->atr_res_len > 0)
+ if (poll->atr_res_len > 0) {
+ if (data_len < poll->atr_res_len)
+ return NCI_STATUS_RF_PROTOCOL_ERROR;
memcpy(poll->atr_res, data, poll->atr_res_len);
+ }
break;
case NCI_NFC_A_PASSIVE_LISTEN_MODE:
case NCI_NFC_F_PASSIVE_LISTEN_MODE:
listen = &ntf->activation_params.listen_nfc_dep;
+ if (data_len < 1)
+ return NCI_STATUS_RF_PROTOCOL_ERROR;
listen->atr_req_len = min_t(__u8, *data++,
NFC_ATR_REQ_MAXSIZE - 2);
+ data_len--;
pr_debug("atr_req_len %d\n", listen->atr_req_len);
- if (listen->atr_req_len > 0)
+ if (listen->atr_req_len > 0) {
+ if (data_len < listen->atr_req_len)
+ return NCI_STATUS_RF_PROTOCOL_ERROR;
memcpy(listen->atr_req, data, listen->atr_req_len);
+ }
break;
default:
@@ -806,12 +828,14 @@ static int nci_rf_intf_activated_ntf_packet(struct nci_dev *ndev,
switch (ntf.rf_interface) {
case NCI_RF_INTERFACE_ISO_DEP:
err = nci_extract_activation_params_iso_dep(ndev,
- &ntf, data);
+ &ntf, data,
+ ntf.activation_params_len);
break;
case NCI_RF_INTERFACE_NFC_DEP:
err = nci_extract_activation_params_nfc_dep(ndev,
- &ntf, data);
+ &ntf, data,
+ ntf.activation_params_len);
break;
case NCI_RF_INTERFACE_FRAME:
--
2.54.0
^ permalink raw reply related
* Re: [PATCH] Wireguard: Fix data-race in rx/tx counter
From: Andrew Lunn @ 2026-06-28 21:02 UTC (permalink / raw)
To: Rafael Passos
Cc: Jason, andrew+netdev, davem, edumazet, kuba, linux-kernel, netdev,
pabeni, syzbot+9ca7674fa7521a3f1bc2, syzkaller-bugs, wireguard
In-Reply-To: <20260628203823.144789-1-rafael@rcpassos.me>
On Sun, Jun 28, 2026 at 05:38:23PM -0300, Rafael Passos wrote:
> Fix a data race in the {rx,tx}_bytes counters of a WireGuard peer.
> These values were incremented inside a read_lock_bh() section, which
> does not serialize concurrent writers. Making them atomic was the
> simplest way out. This was found by syzbot with KCSAN.
>
> Reported-by: syzbot+9ca7674fa7521a3f1bc2@syzkaller.appspotmail.com
> Link: https://syzkaller.appspot.com/bug?extid=9ca7674fa7521a3f1bc2
> Signed-off-by: Rafael Passos <rafael@rcpassos.me>
> ---
>
> Hi,
>
> I am posting this patch to better illustrate the discussion.
> If this is a non-issue, that's fine.
> As I mentioned in the previous email, this issue was reported by syzbot,
> but I was not able to reproduce it.
> I am also aware that atomic operations may add extra cost on older ARM CPUs.
Atomics are expensive in general, especially on high CPU count
systems.
Statistics counters tend to be very asymmetric in usage. They are
incremented frequently, maybe per packet, but reported very
infrequently, maybe every minute when an SNMP agent reads them. So the
solution for statistics counters should reflect this. Incrementing
should be very cheap; reporting can be expensive.
There are a few different solutions. Per-CPU counters are
one. u64_stats_sync.h may help.
Please take a look at other drivers doing statistics. This is a solved
problem, you just need to copy bits of code from somewhere else.
Andrew
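The cheap-increment, expensive-read trade-off can be sketched in userspace C11 as follows. This is an analogue under assumptions, not the kernel's per-CPU API: sharded_counter, counter_add(), counter_read() and NR_SLOTS are hypothetical names, and the caller is assumed to guarantee that each slot has exactly one writer. The kernel helpers also deal with preemption and 32-bit tearing, which this sketch ignores.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NR_SLOTS 8 /* hypothetical: one slot per writer (CPU or thread) */

struct sharded_counter {
	struct {
		_Atomic uint64_t v;
		/* Pad towards a 64-byte line to limit false sharing. */
		char pad[64 - sizeof(uint64_t)];
	} slot[NR_SLOTS];
};

/*
 * Hot path. Only the owner of `slot` ever writes it, so a relaxed
 * load followed by a relaxed store is enough; no locked
 * read-modify-write instruction is needed.
 */
static void counter_add(struct sharded_counter *c, unsigned int slot,
			uint64_t n)
{
	uint64_t old = atomic_load_explicit(&c->slot[slot].v,
					    memory_order_relaxed);

	atomic_store_explicit(&c->slot[slot].v, old + n,
			      memory_order_relaxed);
}

/* Slow path. The rare reader pays for walking every slot. */
static uint64_t counter_read(struct sharded_counter *c)
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < NR_SLOTS; i++)
		sum += atomic_load_explicit(&c->slot[i].v,
					    memory_order_relaxed);
	return sum;
}
```

The sum is not a snapshot taken at a single instant, which is usually acceptable for statistics that are read far less often than they are updated.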
^ permalink raw reply