* Re: [PATCH net] NFC: digital: bound SENSF response copy into nfc_target
From: Jakub Kicinski @ 2026-04-13 18:41 UTC
To: Michael Bommarito
Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Kees Cook, stable, linux-kernel
In-Reply-To: <20260413174715.197640-1-michael.bommarito@gmail.com>
On Mon, 13 Apr 2026 13:47:15 -0400 Michael Bommarito wrote:
> Assisted-by: Claude:claude-opus-4-6
> Assisted-by: Codex:gpt-5-4
Could you do some experimentation and figure out what we can do to the
kernel to make the bots check the submission history? It's the 4th time
we received this (incorrect) patch.
* Re: [PATCH net v3 4/5] bonding: 3ad: fix stuck negotiation on recovery
From: Jay Vosburgh @ 2026-04-13 18:39 UTC
To: Louis Scalbert
Cc: netdev, andrew+netdev, edumazet, kuba, pabeni, fbl, andy,
shemminger, maheshb
In-Reply-To: <20260408152353.276204-5-louis.scalbert@6wind.com>
Louis Scalbert <louis.scalbert@6wind.com> wrote:
>The previous commit introduced a side effect caused by clearing the
>SELECTED flag on disabled ports. After all ports in an aggregator go
>down, if only a subset of ports comes back up, those ports can no
>longer renegotiate LACP unless all aggregator ports come back up.
>
>1. All aggregator ports go down
> - The SELECTED flag is cleared on all of them.
>2. One port comes back up
> - Its SELECTED flag is set again.
> - It enters the WAITING state and gets its READY_N flag.
> - The remaining ports stay UNSELECTED. Because of that, they cannot
> enter the WAITING state and therefore never get READY_N.
This is the part where I think we may be doing something else
incorrectly. If the port is UNSELECTED, then that means that no
aggregator is currently selected for that port, and therefore it
shouldn't be assigned to an aggregator with other ports (per
802.1AX-2014 6.4.8, "Selected").
I'm not seeing anything in the 6.4.14 Selection Logic that makes
me think a port that is down (port_enabled == FALSE) is disallowed from
being SELECTED.
Looking at the Receive machine state diagram (Figure 6-18), I
tend to think that in this case the port would transition to
PORT_DISABLED state, as we're not asserting a BEGIN (reinitialization of
the LACP protocol entity), so the port variables can remain unchanged.
There's even some language that suggests this is intentional:
"If the Aggregation Port becomes inoperable and the BEGIN
variable is not asserted, the state machine enters the
PORT_DISABLED state. [...] This state allows the current
Selection state to remain undisturbed, so that, in the event
that the Aggregation Port is still connected to the same Partner
and Partner Aggregation Port when it becomes operable again,
there will be no disturbance caused to higher layers by
unnecessary re-configuration."
So, perhaps the actual bug is that these ports are attached to
the aggregator but not SELECTED.
-J
> - __agg_ports_are_ready() returns 0 because it finds a port without
> READY_N.
> - As a result, __set_agg_ports_ready() keeps the READY flag cleared on
> all ports.
> - The port that came back up is therefore not marked READY and cannot
> transition to ATTACHED.
> - LACP negotiation becomes stuck, and the port cannot be used.
>3. All aggregator ports come back up
> - They all regain SELECTED and READY_N.
> - __agg_ports_are_ready() now returns 1.
> - __set_agg_ports_ready() sets READY on all ports.
> - They can then transition to ATTACHED.
> - Negotiation resumes and the aggregator becomes operational again.
>
>Consider only ports currently in the WAITING mux state for READY_N in
>order to keep __agg_ports_are_ready() from returning 0 because of a disabled
>port. That matches 802.3ad, which states: "The Selection Logic asserts
>Ready TRUE when the values of Ready_N for all ports that are waiting to
>attach to a given Aggregator are TRUE.".
>
>Fixes: 655f8919d549 ("bonding: add min links parameter to 802.3ad")
>Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
>---
> drivers/net/bonding/bond_3ad.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index 3a94fbcbf721..3f56d892b101 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -700,7 +700,8 @@ static void __update_ntt(struct lacpdu *lacpdu, struct port *port)
> }
>
> /**
>- * __agg_ports_are_ready - check if all ports in an aggregator are ready
>+ * __agg_ports_are_ready - check if all ports in an aggregator that are in
>+ * the WAITING state are ready
> * @aggregator: the aggregator we're looking at
> *
> */
>@@ -716,6 +717,8 @@ static int __agg_ports_are_ready(struct aggregator *aggregator)
> for (port = aggregator->lag_ports;
> port;
> port = port->next_port_in_aggregator) {
>+ if (port->sm_mux_state != AD_MUX_WAITING)
>+ continue;
> if (!(port->sm_vars & AD_PORT_READY_N)) {
> retval = 0;
> break;
>--
>2.39.2
>
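For readers following the thread, the readiness check as changed by the quoted hunk can be modeled in userspace roughly as follows. This is a simplified sketch, not the kernel code: the struct layouts, the enum values, the flag bit, and the function name are stand-ins chosen for illustration. Only the loop shape and the WAITING filter follow the diff. Treating a NULL aggregator as "nothing blocks readiness" is a choice made for this sketch.

```c
#include <stddef.h>

/* Illustrative stand-ins; the real definitions live in the bonding driver. */
enum mux_state { MUX_DETACHED, MUX_WAITING, MUX_ATTACHED };
#define PORT_READY_N 0x1u

struct port {
	enum mux_state sm_mux_state;
	unsigned int sm_vars;
	struct port *next_port_in_aggregator;
};

struct aggregator {
	struct port *lag_ports;
};

/* Returns 1 when every port in the WAITING mux state has READY_N set. */
static int agg_waiting_ports_are_ready(const struct aggregator *agg)
{
	const struct port *port;

	if (!agg)
		return 1;	/* sketch-only choice: no aggregator, nothing blocks */

	for (port = agg->lag_ports; port; port = port->next_port_in_aggregator) {
		if (port->sm_mux_state != MUX_WAITING)
			continue;	/* ports not waiting to attach are ignored */
		if (!(port->sm_vars & PORT_READY_N))
			return 0;	/* a waiting port is not ready yet */
	}
	return 1;
}
```

One property of this form is that an aggregator with no WAITING ports at all is reported as ready, since no port can veto the result.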
---
-Jay Vosburgh, jv@jvosburgh.net
* Re: [PATCH net-next v2] net: check qdisc_pkt_len_segs_init() return value on ingress
From: Daniel Borkmann @ 2026-04-13 18:38 UTC
To: David Carlier, Jakub Kicinski, David S . Miller, Eric Dumazet,
Paolo Abeni
Cc: Simon Horman, Stanislav Fomichev, Kuniyuki Iwashima,
Samiullah Khawaja, Hangbin Liu, Krishna Kumar, netdev,
linux-kernel
In-Reply-To: <20260413182225.10683-1-devnexen@gmail.com>
On 4/13/26 8:22 PM, David Carlier wrote:
> Commit 7fb4c1967011 ("net: pull headers in qdisc_pkt_len_segs_init()")
> changed qdisc_pkt_len_segs_init() to return an skb drop reason when
> it detects malicious GSO packets. The egress path in __dev_queue_xmit()
> checks this return value and drops bad packets, but the ingress path in
> sch_handle_ingress() ignores it.
>
> This means malformed GSO packets entering via TC ingress are not dropped
> and could be redirected to another interface or cause incorrect qdisc
> accounting.
Why do we need to do this on both sides (and what's the perf impact)? If TC
ingress redirects it to some other device, then don't we hit the same check via
__dev_queue_xmit(), where 7fb4c1967011 added the qdisc_pkt_len_segs_init() call?
> Check the return value and drop the packet when a bad GSO is detected.
>
> Fixes: 7fb4c1967011 ("net: pull headers in qdisc_pkt_len_segs_init()")
> Signed-off-by: David Carlier <devnexen@gmail.com>
> ---
>
> v1 -> v2: reorder variable declarations for reverse xmas tree
> v1: https://lore.kernel.org/netdev/20260408172307.46498-1-devnexen@gmail.com/
> net/core/dev.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 5a31f9d2128c..d11c22cafca9 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4459,8 +4459,8 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
> struct net_device *orig_dev, bool *another)
> {
> struct bpf_mprog_entry *entry = rcu_dereference_bh(skb->dev->tcx_ingress);
> - enum skb_drop_reason drop_reason = SKB_DROP_REASON_TC_INGRESS;
> struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx;
> + enum skb_drop_reason drop_reason;
> int sch_ret;
>
> if (!entry)
> @@ -4472,7 +4472,15 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
> *pt_prev = NULL;
> }
>
> - qdisc_pkt_len_segs_init(skb);
> + drop_reason = qdisc_pkt_len_segs_init(skb);
> + if (unlikely(drop_reason)) {
> + kfree_skb_reason(skb, drop_reason);
> + *ret = NET_RX_DROP;
> + bpf_net_ctx_clear(bpf_net_ctx);
> + return NULL;
> + }
> +
> + drop_reason = SKB_DROP_REASON_TC_INGRESS;
> tcx_set_ingress(skb, true);
>
> if (static_branch_unlikely(&tcx_needed_key)) {
* Re: [RFC v2 1/2] vfio: add callback to get tph info for dmabuf
From: Zhiping Zhang @ 2026-04-13 18:32 UTC
To: Leon Romanovsky
Cc: Keith Busch, Jason Gunthorpe, Bjorn Helgaas, linux-rdma,
linux-pci, netdev, dri-devel, Yochai Cohen, Yishai Hadas,
Bjorn Helgaas
In-Reply-To: <20260409120415.GF86584@unreal>
On Thu, Apr 9, 2026 at 5:04 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> >
> On Tue, Mar 31, 2026 at 01:44:02PM -0600, Keith Busch wrote:
> > On Tue, Mar 31, 2026 at 10:02:20PM +0300, Leon Romanovsky wrote:
> > >
> > > Right, what about adding TPH fields to struct vfio_region_dma_range
> > > instead of struct vfio_device_feature_dma_buf?
> >
> > You might have to show me with code what you're talking about because I
> > can't see any way we can add fields to any struct here without breaking
> > backward compatibility.
> >
> > If we can't claim bits out of the unused "flags" field for this feature,
> > then my initial reply is the only sane approach: we can introduce a new
> > feature and struct for it that closely mirrors the existing one, but
> > with the extra hint fields.
>
> Something like that, on top of this proposal:
>
> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> index 3961afa640391..70d5ee1e3ef7b 100644
> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> @@ -241,9 +241,7 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
> return -EFAULT;
>
> if (!get_dma_buf.nr_ranges ||
> - (get_dma_buf.flags & ~(VFIO_DMABUF_FL_TPH |
> - VFIO_DMABUF_TPH_PH_MASK |
> - VFIO_DMABUF_TPH_ST_MASK)))
> + (get_dma_buf.flags & ~VFIO_DMABUF_FLAG_TPH))
> return -EINVAL;
>
> /*
> @@ -300,13 +298,10 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
> ret = PTR_ERR(priv->dmabuf);
> goto err_dev_put;
> }
> - if (get_dma_buf.flags & VFIO_DMABUF_FL_TPH) {
> - priv->steering_tag = (get_dma_buf.flags &
> - VFIO_DMABUF_TPH_ST_MASK) >>
> - VFIO_DMABUF_TPH_ST_SHIFT;
> - priv->ph = (get_dma_buf.flags &
> - VFIO_DMABUF_TPH_PH_MASK) >>
> - VFIO_DMABUF_TPH_PH_SHIFT;
> + if (get_dma_buf.flags & VFIO_DMABUF_FLAG_TPH) {
> + priv->steering_tag =
> + dma_ranges[get_dma_buf.nr_ranges + 1].tph.tag;
> + priv->ph = dma_ranges[get_dma_buf.nr_ranges + 1].tph.ph;
> }
> /* dma_buf_put() now frees priv */
> INIT_LIST_HEAD(&priv->dmabufs_elm);
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index e2a8962641d2c..a8b8d8b1a3278 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1497,20 +1497,30 @@ struct vfio_device_feature_bus_master {
> */
> #define VFIO_DEVICE_FEATURE_DMA_BUF 11
>
> +struct vfio_region_dma_tph {
> + u16 tag;
> + u8 ph;
> +};
> +
> struct vfio_region_dma_range {
> - __u64 offset;
> - __u64 length;
> + union {
> + __u64 offset;
> + struct vfio_region_dma_tph tph;
> + };
> + union {
> + __u64 length;
> + __u64 reserved;
> + };
> +};
> +
> +enum {
> + VFIO_DMABUF_FLAG_TPH = 1 << 0,
> };
>
> struct vfio_device_feature_dma_buf {
> __u32 region_index;
> __u32 open_flags;
> __u32 flags;
> -#define VFIO_DMABUF_FL_TPH (1U << 0) /* TPH info is present */
> -#define VFIO_DMABUF_TPH_PH_SHIFT 1 /* bits 1-2: PH (2-bit) */
> -#define VFIO_DMABUF_TPH_PH_MASK 0x6U
> -#define VFIO_DMABUF_TPH_ST_SHIFT 16 /* bits 16-31: steering tag */
> -#define VFIO_DMABUF_TPH_ST_MASK 0xffff0000U
> __u32 nr_ranges;
> struct vfio_region_dma_range dma_ranges[] __counted_by(nr_ranges);
> };
Sounds good, thanks! We will follow up and move this RFC to a formal patch.
Zhiping
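As a rough illustration of the layout proposed in the quoted diff, the sketch below mirrors the two structs in userspace and checks that the unions do not grow the existing 16-byte range entry. The type and macro names are shortened stand-ins, and `dma_range_entries()` is a hypothetical helper. It assumes exactly one extra trailing `dma_ranges[]` entry carries the TPH data when the flag is set. The quoted sketch reads that entry at index `nr_ranges + 1`, so its exact position is left for the formal patch to define.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the proposed TPH payload; field names follow the diff. */
struct region_dma_tph {
	uint16_t tag;	/* steering tag */
	uint8_t ph;	/* processing hint */
};

struct region_dma_range {
	union {
		uint64_t offset;
		struct region_dma_tph tph;
	};
	union {
		uint64_t length;
		uint64_t reserved;
	};
};

/* The unions must not change the size of the existing UAPI entry. */
_Static_assert(sizeof(struct region_dma_range) == 16, "entry size changed");

#define DMABUF_FLAG_TPH (1u << 0)

/*
 * Number of dma_ranges[] entries userspace would have to supply.
 * ASSUMPTION: one extra trailing entry holds the TPH data when the flag is set.
 */
static size_t dma_range_entries(uint32_t flags, uint32_t nr_ranges)
{
	return (size_t)nr_ranges + ((flags & DMABUF_FLAG_TPH) ? 1 : 0);
}
```

Old userspace that never sets the flag keeps passing exactly `nr_ranges` entries, which is what keeps this shape backward compatible.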
* [PATCH RFC bpf-next 8/8] selftests/bpf: add tests to validate KASAN on JIT programs
From: Alexis Lothoré (eBPF Foundation) @ 2026-04-13 18:28 UTC
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton
Cc: ebpf, Bastien Curutchet, Thomas Petazzoni, Xu Kuohai, bpf,
linux-kernel, netdev, linux-kselftest, linux-stm32,
linux-arm-kernel, kasan-dev, linux-mm,
Alexis Lothoré (eBPF Foundation)
In-Reply-To: <20260413-kasan-v1-0-1a5831230821@bootlin.com>
Add a basic KASAN test runner that loads and test-runs programs that can
trigger memory management bugs. The test captures kernel logs and ensures
that the expected KASAN splat is emitted by searching for the
corresponding first lines in the report.
This version implements two faulty programs triggering either a
use-after-free or an out-of-bounds memory access. The bugs are
triggered thanks to some dedicated kfuncs in bpf_testmod.c, but two
different techniques are used, as some cases can be quite hard to
trigger in a pure "black box" approach:
- for reads, we can make the used kfuncs return some faulty pointers
that ebpf programs will manipulate; they will generate legitimate
kasan reports as a consequence
- applying the same trick for faulty writes is harder, as ebpf programs
can't write kernel data freely. So ebpf programs can call another
specific testing kfunc that will alter the shadow memory matching the
passed memory (eg: a map). When the program tries to write to the
corresponding memory, it will trigger a report as well.
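The log check described at the top of this message (find the KASAN header, then validate the access line that follows it) can be tried out in userspace with something like the sketch below. It is not the selftest code: the function name is made up for illustration, and it copies the access line into a local buffer and uses strstr() where the patch uses memmem(), to stay within standard C.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Returns true if `log` holds a report matching pattern, access type and size. */
static bool log_has_kasan_report(const char *log, const char *pattern,
				 bool is_write, int size)
{
	char expected[64];
	char line[256];
	const char *start, *end;
	size_t len;

	start = strstr(log, pattern);
	if (!start)
		return false;

	/* The access description sits on the line right after the header. */
	start = strchr(start, '\n');
	if (!start)
		return false;
	start++;

	end = strchr(start, '\n');
	if (!end)
		return false;

	/* Copy that single line so the search cannot run past it. */
	len = (size_t)(end - start);
	if (len >= sizeof(line))
		len = sizeof(line) - 1;
	memcpy(line, start, len);
	line[len] = '\0';

	snprintf(expected, sizeof(expected), "%s of size %d at addr",
		 is_write ? "Write" : "Read", size);
	return strstr(line, expected) != NULL;
}
```

Including " at addr" in the expected string keeps a size-4 expectation from matching a line that reports a different size starting with the same digit.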
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
The way of bringing kasan_poison into bpf_testmod is definitely not
ideal. But I would like to validate the testing approach (triggering
real faulty accesses, which is hard in some cases, vs. manually poisoning
BPF-manipulated memory) before eventually making clean bridges between
KASAN APIs and bpf_testmod.c, if the latter approach is the valid one.
---
tools/testing/selftests/bpf/prog_tests/kasan.c | 165 +++++++++++++++++++++
tools/testing/selftests/bpf/progs/kasan.c | 146 ++++++++++++++++++
.../testing/selftests/bpf/test_kmods/bpf_testmod.c | 79 ++++++++++
3 files changed, 390 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/kasan.c b/tools/testing/selftests/bpf/prog_tests/kasan.c
new file mode 100644
index 000000000000..fd628aaa8005
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/kasan.c
@@ -0,0 +1,165 @@
+// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+#include <bpf/bpf.h>
+#include <fcntl.h>
+#include <linux/if_ether.h>
+#include <sys/klog.h>
+#include <test_progs.h>
+#include <unpriv_helpers.h>
+#include "kasan.skel.h"
+
+#define SUBTEST_NAME_MAX_LEN 64
+#define SYSLOG_ACTION_READ_ALL 3
+#define SYSLOG_ACTION_CLEAR 5
+
+#define MAX_LOG_SIZE (8*1024)
+#define READ_CHUNK_SIZE 128
+
+#define KASAN_PATTERN_SLAB_UAF "BUG: KASAN: slab-use-after-free in bpf_prog_"
+#define KASAN_PATTERN_GLOBAL_OOB "BUG: KASAN: global-out-of-bounds in bpf_prog_"
+
+static char klog_buffer[MAX_LOG_SIZE];
+
+static int read_kernel_logs(char *buf, size_t max_len)
+{
+ return klogctl(SYSLOG_ACTION_READ_ALL, buf, max_len);
+}
+
+static int clear_kernel_logs(void)
+{
+ return klogctl(SYSLOG_ACTION_CLEAR, NULL, 0);
+}
+
+static int kernel_logs_have_matching_kasan_report(char *buf, char *pattern,
+ bool is_write, int size)
+{
+ char *access_desc_start, *access_desc_end, *tmp;
+ char access_log[READ_CHUNK_SIZE];
+ char *kasan_report_start;
+ int hsize, nsize;
+ /* Searched kasan report is valid if
+ * - it contains the expected kasan pattern
+ * - the next line is the description of the faulty access
+ * - faulty access properties match the tested type and size
+ */
+ kasan_report_start = strstr(buf, pattern);
+
+ if (!kasan_report_start)
+ return 1;
+
+ /* Find next line */
+ access_desc_start = strchr(kasan_report_start, '\n');
+ if (!access_desc_start)
+ return 1;
+ access_desc_start++;
+
+ access_desc_end = strchr(access_desc_start, '\n');
+ if (!access_desc_end)
+ return 1;
+
+ nsize = snprintf(access_log, READ_CHUNK_SIZE, "%s of size %d at addr",
+ is_write ? "Write" : "Read", size);
+
+ hsize = access_desc_end - access_desc_start;
+ tmp = memmem(access_desc_start, hsize, access_log, nsize);
+
+ if (!tmp)
+ return 1;
+
+ return 0;
+}
+
+struct test_spec {
+ char *prog_name;
+ char *expected_report_pattern;
+};
+
+static struct test_spec tests[] = {
+ {
+ .prog_name = "bpf_kasan_uaf",
+ .expected_report_pattern = KASAN_PATTERN_SLAB_UAF
+ },
+ {
+ .prog_name = "bpf_kasan_oob",
+ .expected_report_pattern = KASAN_PATTERN_GLOBAL_OOB
+ }
+};
+
+static void run_test_with_type_and_size(struct kasan *skel,
+ struct test_spec *test, bool is_write,
+ int access_size)
+{
+ char subtest_name[SUBTEST_NAME_MAX_LEN];
+ struct bpf_program *prog;
+ uint8_t buf[ETH_HLEN];
+ int ret;
+
+ prog = bpf_object__find_program_by_name(skel->obj, test->prog_name);
+ if (!ASSERT_OK_PTR(prog, "find test prog"))
+ return;
+
+ snprintf(subtest_name, SUBTEST_NAME_MAX_LEN, "%s_%s_%d",
+ test->prog_name, is_write ? "write" : "read", access_size);
+
+ if (!test__start_subtest(subtest_name))
+ return;
+
+ ret = clear_kernel_logs();
+ if (!ASSERT_OK(ret, "reset log buffer"))
+ return;
+
+ LIBBPF_OPTS(bpf_test_run_opts, topts);
+ topts.sz = sizeof(struct bpf_test_run_opts);
+ topts.data_size_in = ETH_HLEN;
+ topts.data_in = buf;
+ skel->bss->is_write = is_write;
+ skel->bss->access_size = access_size;
+ ret = bpf_prog_test_run_opts(bpf_program__fd(prog), &topts);
+ if (!ASSERT_OK(ret, "run prog"))
+ return;
+
+ ret = read_kernel_logs(klog_buffer, MAX_LOG_SIZE);
+ if (ASSERT_GE(ret, 0, "read kernel logs"))
+ ASSERT_OK(kernel_logs_have_matching_kasan_report(
+ klog_buffer, test->expected_report_pattern,
+ is_write, access_size),
+ test->prog_name);
+}
+
+static void run_test_with_type(struct kasan *skel, struct test_spec *test,
+ bool is_write)
+{
+ run_test_with_type_and_size(skel, test, is_write, 1);
+ run_test_with_type_and_size(skel, test, is_write, 2);
+ run_test_with_type_and_size(skel, test, is_write, 4);
+ run_test_with_type_and_size(skel, test, is_write, 8);
+}
+
+static void run_test(struct kasan *skel, struct test_spec *test)
+{
+ run_test_with_type(skel, test, false);
+ run_test_with_type(skel, test, true);
+}
+
+void test_kasan(void)
+{
+ struct test_spec *test;
+ struct kasan *skel;
+ int i;
+
+ if (!is_jit_enabled() || !get_kasan_jit_enabled()) {
+ test__skip();
+ return;
+ }
+
+ skel = kasan__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "open and load prog"))
+ return;
+
+ for (i = 0; i < ARRAY_SIZE(tests); i++) {
+ test = &tests[i];
+
+ run_test(skel, test);
+ }
+
+ kasan__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/kasan.c b/tools/testing/selftests/bpf/progs/kasan.c
new file mode 100644
index 000000000000..f713c9b7c9ce
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/kasan.c
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+#define KASAN_SLAB_FREE 0xFB
+#define KASAN_GLOBAL_REDZONE 0xF9
+
+extern __u8 *bpf_kfunc_kasan_uaf_1(void) __ksym;
+extern __u16 *bpf_kfunc_kasan_uaf_2(void) __ksym;
+extern __u32 *bpf_kfunc_kasan_uaf_4(void) __ksym;
+extern __u64 *bpf_kfunc_kasan_uaf_8(void) __ksym;
+extern __u8 *bpf_kfunc_kasan_oob_1(void) __ksym;
+extern __u16 *bpf_kfunc_kasan_oob_2(void) __ksym;
+extern __u32 *bpf_kfunc_kasan_oob_4(void) __ksym;
+extern __u64 *bpf_kfunc_kasan_oob_8(void) __ksym;
+extern void bpf_kfunc_kasan_poison(void *mem, __u32 mem__sz, __u8 byte) __ksym;
+
+int access_size;
+int is_write;
+
+struct kasan_write_val {
+ __u8 data_1;
+ __u16 data_2;
+ __u32 data_4;
+ __u64 data_8;
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(max_entries, 1);
+ __type(key, __u32);
+ __type(value, struct kasan_write_val);
+} test_map SEC(".maps");
+
+static void bpf_kasan_faulty_write(int size, __u8 poison_byte)
+{
+ struct kasan_write_val *val;
+ __u32 key = 0;
+
+ val = bpf_map_lookup_elem(&test_map, &key);
+ if (!val)
+ return;
+
+ bpf_kfunc_kasan_poison(val, sizeof(struct kasan_write_val),
+ poison_byte);
+ switch (size) {
+ case 1:
+ val->data_1 = 0xAA;
+ break;
+ case 2:
+ val->data_2 = 0xAA;
+ break;
+ case 4:
+ val->data_4 = 0xAA;
+ break;
+ case 8:
+ val->data_8 = 0xAA;
+ break;
+ }
+ bpf_kfunc_kasan_poison(val, sizeof(struct kasan_write_val), 0x00);
+}
+
+
+static int bpf_kasan_uaf_read(int size)
+{
+ __u8 *result_1;
+ __u16 *result_2;
+ __u32 *result_4;
+ __u64 *result_8;
+ int ret = 0;
+
+ switch (size) {
+ case 1:
+ result_1 = bpf_kfunc_kasan_uaf_1();
+ ret = result_1[0] ? 1 : 0;
+ break;
+ case 2:
+ result_2 = bpf_kfunc_kasan_uaf_2();
+ ret = result_2[0] ? 1 : 0;
+ break;
+ case 4:
+ result_4 = bpf_kfunc_kasan_uaf_4();
+ ret = result_4[0] ? 1 : 0;
+ break;
+ case 8:
+ result_8 = bpf_kfunc_kasan_uaf_8();
+ ret = result_8[0] ? 1 : 0;
+ break;
+ }
+ return ret;
+}
+
+SEC("tcx/ingress")
+int bpf_kasan_uaf(struct __sk_buff *skb)
+{
+ if (is_write) {
+ bpf_kasan_faulty_write(access_size, KASAN_SLAB_FREE);
+ return 0;
+ }
+
+ return bpf_kasan_uaf_read(access_size);
+}
+
+static int bpf_kasan_oob_read(int size)
+{
+ __u8 *result_1;
+ __u16 *result_2;
+ __u32 *result_4;
+ __u64 *result_8;
+ int ret = 0;
+
+ switch (size) {
+ case 1:
+ result_1 = bpf_kfunc_kasan_oob_1();
+ ret = result_1[0] ? 1 : 0;
+ break;
+ case 2:
+ result_2 = bpf_kfunc_kasan_oob_2();
+ ret = result_2[0] ? 1 : 0;
+ break;
+ case 4:
+ result_4 = bpf_kfunc_kasan_oob_4();
+ ret = result_4[0] ? 1 : 0;
+ break;
+ case 8:
+ result_8 = bpf_kfunc_kasan_oob_8();
+ ret = result_8[0] ? 1 : 0;
+ break;
+ }
+ return ret;
+}
+
+SEC("tcx/ingress")
+int bpf_kasan_oob(struct __sk_buff *skb)
+{
+ if (is_write) {
+ bpf_kasan_faulty_write(access_size, KASAN_GLOBAL_REDZONE);
+ return 0;
+ }
+
+ return bpf_kasan_oob_read(access_size);
+}
+
+char LICENSE[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index d876314a4d67..01554bcbbbb0 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -271,6 +271,76 @@ __bpf_kfunc void bpf_kfunc_put_default_trusted_ptr_test(struct prog_test_member
*/
}
+static void *kasan_uaf(void)
+{
+ void *p = kmalloc(64, GFP_ATOMIC);
+
+ if (!p)
+ return NULL;
+ memset(p, 0xAA, 64);
+ kfree(p);
+
+ return p;
+}
+
+#ifdef CONFIG_KASAN_GENERIC
+extern void kasan_poison(const void *addr, size_t size, u8 value, bool init);
+
+__bpf_kfunc void bpf_kfunc_kasan_poison(void *mem, u32 mem__sz, u8 byte)
+{
+ kasan_poison(mem, mem__sz, byte, false);
+}
+#else
+__bpf_kfunc void bpf_kfunc_kasan_poison(void *mem, u32 mem__sz, u8 byte) { }
+#endif
+
+__bpf_kfunc u8 *bpf_kfunc_kasan_uaf_1(void)
+{
+ return kasan_uaf();
+}
+
+__bpf_kfunc u16 *bpf_kfunc_kasan_uaf_2(void)
+{
+ return kasan_uaf();
+}
+
+__bpf_kfunc u32 *bpf_kfunc_kasan_uaf_4(void)
+{
+ return kasan_uaf();
+}
+
+__bpf_kfunc u64 *bpf_kfunc_kasan_uaf_8(void)
+{
+ return kasan_uaf();
+}
+
+static u8 test_oob_buffer[64];
+
+static void *bpf_kfunc_kasan_oob(void)
+{
+ return test_oob_buffer+64;
+}
+
+__bpf_kfunc u8 *bpf_kfunc_kasan_oob_1(void)
+{
+ return bpf_kfunc_kasan_oob();
+}
+
+__bpf_kfunc u16 *bpf_kfunc_kasan_oob_2(void)
+{
+ return bpf_kfunc_kasan_oob();
+}
+
+__bpf_kfunc u32 *bpf_kfunc_kasan_oob_4(void)
+{
+ return bpf_kfunc_kasan_oob();
+}
+
+__bpf_kfunc u64 *bpf_kfunc_kasan_oob_8(void)
+{
+ return bpf_kfunc_kasan_oob();
+}
+
__bpf_kfunc struct bpf_testmod_ctx *
bpf_testmod_ctx_create(int *err)
{
@@ -740,6 +810,15 @@ BTF_ID_FLAGS(func, bpf_testmod_ops3_call_test_1)
BTF_ID_FLAGS(func, bpf_testmod_ops3_call_test_2)
BTF_ID_FLAGS(func, bpf_kfunc_get_default_trusted_ptr_test);
BTF_ID_FLAGS(func, bpf_kfunc_put_default_trusted_ptr_test);
+BTF_ID_FLAGS(func, bpf_kfunc_kasan_poison)
+BTF_ID_FLAGS(func, bpf_kfunc_kasan_uaf_1)
+BTF_ID_FLAGS(func, bpf_kfunc_kasan_uaf_2)
+BTF_ID_FLAGS(func, bpf_kfunc_kasan_uaf_4)
+BTF_ID_FLAGS(func, bpf_kfunc_kasan_uaf_8)
+BTF_ID_FLAGS(func, bpf_kfunc_kasan_oob_1)
+BTF_ID_FLAGS(func, bpf_kfunc_kasan_oob_2)
+BTF_ID_FLAGS(func, bpf_kfunc_kasan_oob_4)
+BTF_ID_FLAGS(func, bpf_kfunc_kasan_oob_8)
BTF_KFUNCS_END(bpf_testmod_common_kfunc_ids)
BTF_ID_LIST(bpf_testmod_dtor_ids)
--
2.53.0
* [PATCH RFC bpf-next 7/8] bpf, x86: enable KASAN for JITed programs on x86
From: Alexis Lothoré (eBPF Foundation) @ 2026-04-13 18:28 UTC
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton
Cc: ebpf, Bastien Curutchet, Thomas Petazzoni, Xu Kuohai, bpf,
linux-kernel, netdev, linux-kselftest, linux-stm32,
linux-arm-kernel, kasan-dev, linux-mm,
Alexis Lothoré (eBPF Foundation)
In-Reply-To: <20260413-kasan-v1-0-1a5831230821@bootlin.com>
Mark x86 as supporting KASAN checks in JITed programs so that the
corresponding JIT compiler inserts checks on the translated
instructions.
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
arch/x86/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e2df1b147184..a50aa9a0b93c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -234,6 +234,7 @@ config X86
select HAVE_SAMPLE_FTRACE_DIRECT if X86_64
select HAVE_SAMPLE_FTRACE_DIRECT_MULTI if X86_64
select HAVE_EBPF_JIT
+ select HAVE_EBPF_JIT_KASAN if X86_64
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_EISA if X86_32
select HAVE_EXIT_THREAD
--
2.53.0
* [PATCH RFC bpf-next 6/8] selftests/bpf: do not run verifier JIT tests when BPF_JIT_KASAN is enabled
From: Alexis Lothoré (eBPF Foundation) @ 2026-04-13 18:28 UTC
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton
Cc: ebpf, Bastien Curutchet, Thomas Petazzoni, Xu Kuohai, bpf,
linux-kernel, netdev, linux-kselftest, linux-stm32,
linux-arm-kernel, kasan-dev, linux-mm,
Alexis Lothoré (eBPF Foundation)
In-Reply-To: <20260413-kasan-v1-0-1a5831230821@bootlin.com>
Multiple verifier tests validate the exact list of JITed instructions.
Even if the test offers some flexibility in its checks (eg: not
enforcing the first instruction to be verified right at the beginning of
jited code, but rather searching where the expected JIT instructions
could be located), it is confused by the new KASAN instrumentation JITed
in programs: this instrumentation can be inserted anywhere in-between
searched instructions, leading to test failures despite the correct
instructions being generated.
Prevent those failures by skipping tests involving JITed instruction
checks when the kernel is built with KASAN _and_ JIT is enabled, as those
two conditions lead the JITed code to contain KASAN checks.
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
tools/testing/selftests/bpf/test_loader.c | 5 +++++
tools/testing/selftests/bpf/unpriv_helpers.c | 5 +++++
tools/testing/selftests/bpf/unpriv_helpers.h | 1 +
3 files changed, 11 insertions(+)
diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index c4c34cae6102..d2c0062ef31a 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -1175,6 +1175,11 @@ void run_subtest(struct test_loader *tester,
return;
}
+ if (is_jit_enabled() && subspec->jited.cnt && get_kasan_jit_enabled()) {
+ test__skip();
+ return;
+ }
+
if (unpriv) {
if (!can_execute_unpriv(tester, spec)) {
test__skip();
diff --git a/tools/testing/selftests/bpf/unpriv_helpers.c b/tools/testing/selftests/bpf/unpriv_helpers.c
index f997d7ec8fd0..25bd08648f5f 100644
--- a/tools/testing/selftests/bpf/unpriv_helpers.c
+++ b/tools/testing/selftests/bpf/unpriv_helpers.c
@@ -142,3 +142,8 @@ bool get_unpriv_disabled(void)
}
return mitigations_off;
}
+
+bool get_kasan_jit_enabled(void)
+{
+ return config_contains("CONFIG_BPF_JIT_KASAN=y");
+}
diff --git a/tools/testing/selftests/bpf/unpriv_helpers.h b/tools/testing/selftests/bpf/unpriv_helpers.h
index 151f67329665..bc5f4c953c9d 100644
--- a/tools/testing/selftests/bpf/unpriv_helpers.h
+++ b/tools/testing/selftests/bpf/unpriv_helpers.h
@@ -5,3 +5,4 @@
#define UNPRIV_SYSCTL "kernel/unprivileged_bpf_disabled"
bool get_unpriv_disabled(void);
+bool get_kasan_jit_enabled(void);
--
2.53.0
* [PATCH RFC bpf-next 5/8] bpf, x86: emit KASAN checks into x86 JITed programs
From: Alexis Lothoré (eBPF Foundation) @ 2026-04-13 18:28 UTC
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton
Cc: ebpf, Bastien Curutchet, Thomas Petazzoni, Xu Kuohai, bpf,
linux-kernel, netdev, linux-kselftest, linux-stm32,
linux-arm-kernel, kasan-dev, linux-mm,
Alexis Lothoré (eBPF Foundation)
In-Reply-To: <20260413-kasan-v1-0-1a5831230821@bootlin.com>
Insert KASAN shadow memory checks before memory load and store
operations in JIT-compiled BPF programs. This helps detect memory safety
bugs such as use-after-free and out-of-bounds accesses at runtime.
The main instructions being targeted are BPF_LDX and BPF_STX, but not
all of them are being instrumented:
- if the load/store instruction is in fact accessing the program stack,
emit_kasan_check silently skips the instrumentation, as we already
have page guards to monitor stack accesses. Stack accesses _could_ be
monitored more finely by adding kasan checks, but that would need the JIT
compiler to insert red zones around every variable on the stack, and we
likely do not have enough info in the JIT compiler to do so.
- if the load/store instruction is a BPF_PROBE_MEM or a BPF_PROBE_ATOMIC
instruction, we do not instrument it, as the passed address can fault
(hence the custom fault management with BPF_PROBE_XXX instructions),
and so the corresponding kasan check could fault as well.
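The two exclusions above boil down to a small predicate. The sketch below restates them in plain C. The enum and function names are hypothetical stand-ins and do not appear in the patch, which makes the same decisions inside do_jit() and emit_kasan_check().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the BPF access modes discussed above. */
enum access_mode {
	MODE_MEM,		/* plain BPF_MEM load/store */
	MODE_PROBE_MEM,		/* BPF_PROBE_MEM: the address may fault */
	MODE_PROBE_ATOMIC,	/* BPF_PROBE_ATOMIC: the address may fault */
};

/* Returns true if a load/store should get a KASAN check emitted before it. */
static bool needs_kasan_check(enum access_mode mode, bool accesses_stack)
{
	if (accesses_stack)
		return false;	/* stack is already covered by page guards */
	if (mode == MODE_PROBE_MEM || mode == MODE_PROBE_ATOMIC)
		return false;	/* the shadow lookup itself could fault */
	return true;
}
```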
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
This RFC also ignores for now atomic operations, because I am not
perfectly clear yet about how they are JITed and so how much kasan
instrumentation is legitimate here.
---
arch/x86/net/bpf_jit_comp.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index b90103bd0080..111fe1d55121 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1811,6 +1811,7 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
const s32 imm32 = insn->imm;
u32 dst_reg = insn->dst_reg;
u32 src_reg = insn->src_reg;
+ bool accesses_stack;
u8 b2 = 0, b3 = 0;
u8 *start_of_ldx;
s64 jmp_offset;
@@ -1831,6 +1832,7 @@ static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *
EMIT_ENDBR();
ip = image + addrs[i - 1] + (prog - temp);
+ accesses_stack = bpf_insn_accesses_stack(env, bpf_prog, i - 1);
switch (insn->code) {
/* ALU */
@@ -2242,6 +2244,11 @@ st: if (is_imm8(insn->off))
case BPF_STX | BPF_MEM | BPF_H:
case BPF_STX | BPF_MEM | BPF_W:
case BPF_STX | BPF_MEM | BPF_DW:
+ err = emit_kasan_check(&prog, dst_reg, insn,
+ image + addrs[i - 1],
+ accesses_stack);
+ if (err)
+ return err;
emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
break;
@@ -2390,6 +2397,12 @@ st: if (is_imm8(insn->off))
/* populate jmp_offset for JAE above to jump to start_of_ldx */
start_of_ldx = prog;
end_of_jmp[-1] = start_of_ldx - end_of_jmp;
+ } else {
+ err = emit_kasan_check(&prog, src_reg, insn,
+ image + addrs[i - 1],
+ accesses_stack);
+ if (err)
+ return err;
}
if (BPF_MODE(insn->code) == BPF_PROBE_MEMSX ||
BPF_MODE(insn->code) == BPF_MEMSX)
--
2.53.0
^ permalink raw reply related
* [PATCH RFC bpf-next 4/8] bpf, x86: add helper to emit kasan checks in x86 JITed programs
From: Alexis Lothoré (eBPF Foundation) @ 2026-04-13 18:28 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton
Cc: ebpf, Bastien Curutchet, Thomas Petazzoni, Xu Kuohai, bpf,
linux-kernel, netdev, linux-kselftest, linux-stm32,
linux-arm-kernel, kasan-dev, linux-mm,
Alexis Lothoré (eBPF Foundation)
In-Reply-To: <20260413-kasan-v1-0-1a5831230821@bootlin.com>
Add the emit_kasan_check() function that emits KASAN shadow memory
checks before memory accesses in JIT-compiled BPF programs. The
implementation relies on the existing __asan_{load,store}X functions
from KASAN subsystem. The helper:
- ensures that the kasan instrumentation is actually needed: if the
instruction being processed accesses the program stack, we skip the
instrumentation, as those accesses are already protected with page
guards
- saves registers. This includes caller-saved registers, but also
temporary registers, as those were possibly used by the
affected program
- computes the accessed address and stores it in %rdi
- calls the relevant function, depending on the instruction being a load
or a store, and the size of the access.
- restores registers
The special care needed when inserting this instrumentation comes at the
cost of a non-negligible increase in JITed code size. For example, a
bare
mov 0x0(%rsi),%rbx # Load into rbx the content at the address stored in rsi
becomes
push %rax
push %rcx
push %rdx
push %rsi
push %rdi
push %r8
push %r9
push %r10
push %r11
sub $0x8,%rsp
mov %rsi,%rdi
call 0xffffffff81da0a60 <__asan_load8>
add $0x8,%rsp
pop %r11
pop %r10
pop %r9
pop %r8
pop %rdi
pop %rsi
pop %rdx
pop %rcx
pop %rax
mov 0x0(%rsi),%rbx
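The save/restore sequence above can be sketched in plain C to make the byte encoding and the alignment arithmetic explicit. This is only an illustrative user-space model, not the patch's code: the names emit_push_pop(), emit_save() and emit_restore() are hypothetical, and the sketch assumes the stack pointer is 16-byte aligned before the first push. Nine 8-byte pushes move rsp by 72 bytes, so 8 more bytes of padding are needed to get back to a multiple of 16 before the call.

```c
#include <stddef.h>
#include <stdint.h>

/* x86-64 hardware register numbers, as used in instruction encodings. */
enum { RAX = 0, RCX = 1, RDX = 2, RSI = 6, RDI = 7,
       R8 = 8, R9 = 9, R10 = 10, R11 = 11 };

/* Registers saved around the check, in push order (illustrative list). */
static const uint8_t saved_regs[] = {
	RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11
};
#define NR_SAVED (sizeof(saved_regs) / sizeof(saved_regs[0]))

/* Padding needed after the pushes to get back to 16-byte alignment. */
#define SAVE_PAD ((16 - (NR_SAVED * 8) % 16) % 16)

/*
 * Emit "push reg" (base 0x50) or "pop reg" (base 0x58).
 * r8..r15 need a REX.B prefix (0x41) in front of the opcode.
 */
static size_t emit_push_pop(uint8_t *buf, uint8_t base, uint8_t reg)
{
	size_t n = 0;

	if (reg >= 8)
		buf[n++] = 0x41;
	buf[n++] = base + (reg & 7);
	return n;
}

/* Emit all pushes, then "sub rsp, SAVE_PAD". Returns bytes written. */
static size_t emit_save(uint8_t *buf)
{
	size_t n = 0, i;

	for (i = 0; i < NR_SAVED; i++)
		n += emit_push_pop(buf + n, 0x50, saved_regs[i]);
	buf[n++] = 0x48;	/* REX.W */
	buf[n++] = 0x83;	/* group 1 opcode with imm8 */
	buf[n++] = 0xEC;	/* modrm: /5 (sub), rsp */
	buf[n++] = SAVE_PAD;
	return n;
}

/* Emit "add rsp, SAVE_PAD", then the pops in reverse push order. */
static size_t emit_restore(uint8_t *buf)
{
	size_t n = 0, i;

	buf[n++] = 0x48;	/* REX.W */
	buf[n++] = 0x83;	/* group 1 opcode with imm8 */
	buf[n++] = 0xC4;	/* modrm: /0 (add), rsp */
	buf[n++] = SAVE_PAD;
	for (i = NR_SAVED; i-- > 0;)
		n += emit_push_pop(buf + n, 0x58, saved_regs[i]);
	return n;
}
```

In this model each half of the sequence costs 17 bytes (five 1-byte and four 2-byte push or pop instructions, plus a 4-byte stack adjustment), which gives a rough idea of the per-access overhead before counting the address computation and the call itself.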
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
arch/x86/net/bpf_jit_comp.c | 93 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 93 insertions(+)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index ea9e707e8abf..b90103bd0080 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -20,6 +20,10 @@
#include <asm/unwind.h>
#include <asm/cfi.h>
+#ifdef CONFIG_BPF_JIT_KASAN
+#include <linux/kasan.h>
+#endif
+
static bool all_callee_regs_used[4] = {true, true, true, true};
static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
@@ -1301,6 +1305,95 @@ static void emit_store_stack_imm64(u8 **pprog, int reg, int stack_off, u64 imm64
emit_stx(pprog, BPF_DW, BPF_REG_FP, reg, stack_off);
}
+static int emit_kasan_check(u8 **pprog, u32 addr_reg, struct bpf_insn *insn,
+ u8 *ip, bool accesses_stack)
+{
+#ifdef CONFIG_BPF_JIT_KASAN
+ bool is_write = BPF_CLASS(insn->code) == BPF_STX;
+ u32 bpf_size = BPF_SIZE(insn->code);
+ s32 off = insn->off;
+ u8 *prog = *pprog;
+ void *kasan_func;
+
+ if (accesses_stack)
+ return 0;
+
+ /* Derive KASAN check function from access type and size */
+ switch (bpf_size) {
+ case BPF_B:
+ kasan_func = is_write ? __asan_store1 : __asan_load1;
+ break;
+ case BPF_H:
+ kasan_func = is_write ? __asan_store2 : __asan_load2;
+ break;
+ case BPF_W:
+ kasan_func = is_write ? __asan_store4 : __asan_load4;
+ break;
+ case BPF_DW:
+ kasan_func = is_write ? __asan_store8 : __asan_load8;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ /* Save rax */
+ EMIT1(0x50);
+ /* Save rcx */
+ EMIT1(0x51);
+ /* Save rdx */
+ EMIT1(0x52);
+ /* Save rsi */
+ EMIT1(0x56);
+ /* Save rdi */
+ EMIT1(0x57);
+ /* Save r8 */
+ EMIT2(0x41, 0x50);
+ /* Save r9 */
+ EMIT2(0x41, 0x51);
+ /* Save r10 */
+ EMIT2(0x41, 0x52);
+ /* Save r11 */
+ EMIT2(0x41, 0x53);
+ /* We have pushed 72 bytes, realign stack to 16 bytes: sub rsp, 8 */
+ EMIT4(0x48, 0x83, 0xEC, 8);
+
+ /* mov rdi, addr_reg */
+ EMIT_mov(BPF_REG_1, addr_reg);
+
+ /* add rdi, off (if offset is non-zero) */
+ if (off) {
+ if (is_imm8(off)) {
+ /* add rdi, imm8 */
+ EMIT4(0x48, 0x83, 0xC7, (u8)off);
+ } else {
+ /* add rdi, imm32 */
+ EMIT3_off32(0x48, 0x81, 0xC7, off);
+ }
+ }
+
+ /* Adjust ip to account for the instrumentation generated so far */
+ ip += (prog - *pprog);
+ /* call kasan_func */
+ if (emit_call(&prog, kasan_func, ip))
+ return -ERANGE;
+
+ /* Restore registers */
+ EMIT4(0x48, 0x83, 0xC4, 8);
+ EMIT2(0x41, 0x5B);
+ EMIT2(0x41, 0x5A);
+ EMIT2(0x41, 0x59);
+ EMIT2(0x41, 0x58);
+ EMIT1(0x5F);
+ EMIT1(0x5E);
+ EMIT1(0x5A);
+ EMIT1(0x59);
+ EMIT1(0x58);
+
+ *pprog = prog;
+#endif /* CONFIG_BPF_JIT_KASAN */
+ return 0;
+}
+
static int emit_atomic_rmw(u8 **pprog, u32 atomic_op,
u32 dst_reg, u32 src_reg, s16 off, u8 bpf_size)
{
--
2.53.0
^ permalink raw reply related
* [PATCH RFC bpf-next 3/8] bpf: add BPF_JIT_KASAN for KASAN instrumentation of JITed programs
From: Alexis Lothoré (eBPF Foundation) @ 2026-04-13 18:28 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton
Cc: ebpf, Bastien Curutchet, Thomas Petazzoni, Xu Kuohai, bpf,
linux-kernel, netdev, linux-kselftest, linux-stm32,
linux-arm-kernel, kasan-dev, linux-mm,
Alexis Lothoré (eBPF Foundation)
In-Reply-To: <20260413-kasan-v1-0-1a5831230821@bootlin.com>
Add a new Kconfig option CONFIG_BPF_JIT_KASAN that automatically enables
KASAN (Kernel Address Sanitizer) memory access checks for JIT-compiled
BPF programs, when both KASAN and JIT compiler are enabled. When
enabled, the JIT compiler will emit shadow memory checks before memory
loads and stores to detect use-after-free, out-of-bounds, and other
memory safety bugs at runtime. The option is gated behind
HAVE_EBPF_JIT_KASAN, as it needs proper arch-specific implementation.
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
kernel/bpf/Kconfig | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
index eb3de35734f0..28392adb3d7e 100644
--- a/kernel/bpf/Kconfig
+++ b/kernel/bpf/Kconfig
@@ -17,6 +17,10 @@ config HAVE_CBPF_JIT
config HAVE_EBPF_JIT
bool
+# KASAN support for JIT compiler
+config HAVE_EBPF_JIT_KASAN
+ bool
+
# Used by archs to tell that they want the BPF JIT compiler enabled by
# default for kernels that were compiled with BPF JIT support.
config ARCH_WANT_DEFAULT_BPF_JIT
@@ -101,4 +105,9 @@ config BPF_LSM
If you are unsure how to answer this question, answer N.
+config BPF_JIT_KASAN
+ bool
+ depends on HAVE_EBPF_JIT_KASAN
+ default y if BPF_JIT && KASAN_GENERIC
+
endmenu # "BPF subsystem"
--
2.53.0
^ permalink raw reply related
* [PATCH RFC bpf-next 2/8] bpf: mark instructions accessing program stack
From: Alexis Lothoré (eBPF Foundation) @ 2026-04-13 18:28 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton
Cc: ebpf, Bastien Curutchet, Thomas Petazzoni, Xu Kuohai, bpf,
linux-kernel, netdev, linux-kselftest, linux-stm32,
linux-arm-kernel, kasan-dev, linux-mm,
Alexis Lothoré (eBPF Foundation)
In-Reply-To: <20260413-kasan-v1-0-1a5831230821@bootlin.com>
In order to prepare to emit KASAN checks in JITed programs, JIT
compilers need to know whether a given load/store instruction
targets the bpf program stack, as those accesses should not be monitored
(we already have guard pages for that, and it is difficult anyway to
correctly monitor any kind of data passed on stack).
To support this need, make the BPF verifier mark the instructions that
access program stack:
- add a setter that allows the verifier to mark instructions accessing
the program stack
- add a getter that allows JIT compilers to check whether instructions
being JITed are accessing the stack
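The setter/getter pair described above boils down to a per-instruction flag that one side writes and the other reads. A minimal user-space sketch of that idea follows; the toy_* names and the explicit bounds checks are assumptions made for illustration and are not taken from the patch, which stores the flag in struct bpf_insn_aux_data.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy per-instruction auxiliary data, filled in by the "verifier" side. */
struct toy_insn_aux {
	bool accesses_stack;
};

struct toy_env {
	struct toy_insn_aux *aux;	/* one entry per instruction */
	size_t nr_insns;
};

/* Setter: called when an instruction is found to access the stack. */
static void toy_mark_accesses_stack(struct toy_env *env, size_t idx)
{
	if (idx < env->nr_insns)
		env->aux[idx].accesses_stack = true;
}

/*
 * Getter: called by the "JIT" side. insn_idx is relative to the
 * subprogram being compiled, so the subprogram start is added first.
 * A missing env means no verifier data is available: report false.
 */
static bool toy_insn_accesses_stack(const struct toy_env *env,
				    size_t subprog_start, size_t insn_idx)
{
	size_t idx = subprog_start + insn_idx;

	if (!env || idx >= env->nr_insns)
		return false;
	return env->aux[idx].accesses_stack;
}
```

Returning false when no environment is available errs on the side of instrumenting the access rather than skipping it.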
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
include/linux/bpf.h | 2 ++
include/linux/bpf_verifier.h | 2 ++
kernel/bpf/core.c | 10 ++++++++++
kernel/bpf/verifier.c | 7 +++++++
4 files changed, 21 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index b4b703c90ca9..774a0395c498 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1543,6 +1543,8 @@ void bpf_jit_uncharge_modmem(u32 size);
bool bpf_prog_has_trampoline(const struct bpf_prog *prog);
bool bpf_insn_is_indirect_target(const struct bpf_verifier_env *env, const struct bpf_prog *prog,
int insn_idx);
+bool bpf_insn_accesses_stack(const struct bpf_verifier_env *env,
+ const struct bpf_prog *prog, int insn_idx);
#else
static inline int bpf_trampoline_link_prog(struct bpf_tramp_link *link,
struct bpf_trampoline *tr,
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index b148f816f25b..ab99ed4c4227 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -660,6 +660,8 @@ struct bpf_insn_aux_data {
u16 const_reg_map_mask;
u16 const_reg_subprog_mask;
u32 const_reg_vals[10];
+ /* instruction accesses stack */
+ bool accesses_stack;
};
#define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 8b018ff48875..340abfdadbed 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1582,6 +1582,16 @@ bool bpf_insn_is_indirect_target(const struct bpf_verifier_env *env, const struc
insn_idx += prog->aux->subprog_start;
return env->insn_aux_data[insn_idx].indirect_target;
}
+
+bool bpf_insn_accesses_stack(const struct bpf_verifier_env *env,
+ const struct bpf_prog *prog, int insn_idx)
+{
+ if (!env)
+ return false;
+ insn_idx += prog->aux->subprog_start;
+ return env->insn_aux_data[insn_idx].accesses_stack;
+}
+
#endif /* CONFIG_BPF_JIT */
/* Base function for offset calculation. Needs to go into .text section,
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1e36b9e91277..7bce4fb4e540 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3502,6 +3502,11 @@ static void mark_indirect_target(struct bpf_verifier_env *env, int idx)
env->insn_aux_data[idx].indirect_target = true;
}
+static void mark_insn_accesses_stack(struct bpf_verifier_env *env, int idx)
+{
+ env->insn_aux_data[idx].accesses_stack = true;
+}
+
#define LR_FRAMENO_BITS 3
#define LR_SPI_BITS 6
#define LR_ENTRY_BITS (LR_SPI_BITS + LR_FRAMENO_BITS + 1)
@@ -6490,6 +6495,8 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
else
err = check_stack_write(env, regno, off, size,
value_regno, insn_idx);
+
+ mark_insn_accesses_stack(env, insn_idx);
} else if (reg_is_pkt_pointer(reg)) {
if (t == BPF_WRITE && !may_access_direct_pkt_data(env, NULL, t)) {
verbose(env, "cannot write into packet\n");
--
2.53.0
^ permalink raw reply related
* [PATCH RFC bpf-next 0/8] bpf: add support for KASAN checks in JITed programs
From: Alexis Lothoré (eBPF Foundation) @ 2026-04-13 18:28 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton
Cc: ebpf, Bastien Curutchet, Thomas Petazzoni, Xu Kuohai, bpf,
linux-kernel, netdev, linux-kselftest, linux-stm32,
linux-arm-kernel, kasan-dev, linux-mm,
Alexis Lothoré (eBPF Foundation)
Hello,
this series aims to bring basic support for KASAN checks to BPF JITed
programs. This follows the first RFC posted in [1].
KASAN helps spot memory management mistakes by reserving a fraction
of memory as "shadow memory" that maps to the rest of the memory and
allows monitoring it. Each memory-accessing instruction is then
instrumented at build time to call an ASAN check function that
analyzes the corresponding bits in shadow memory and, if it detects
that the access is invalid, triggers a detailed report. The goal of this
series is to replicate this mechanism for BPF programs when they are
being JITed into native instructions: it is then the (runtime) JIT
compiler that is in charge of inserting calls to the corresponding kasan
checks when a program is being loaded into the kernel. This task involves:
- identifying at program load time the instructions performing memory
accesses
- identifying the properties of those accesses (size, read or write) to
define the relevant kasan check function to call
- just before the identified instructions:
- saving the basic context (i.e. registers)
- inserting a call to the relevant kasan check function
- restoring the context
- whenever the instrumented program executes, if it performs an invalid
access, it triggers a kasan report identical to those instrumented on
kernel side at build time.
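To make the shadow memory idea more concrete, here is a small user-space model of the kind of lookup a generic KASAN check function performs. It is a sketch under stated assumptions, not the kernel implementation: the arena, the toy_* names and the byte-by-byte loop are invented for illustration. What it keeps from generic KASAN is the encoding: one shadow byte covers 8 bytes of memory, 0 means all 8 bytes are accessible, a value k between 1 and 7 means only the first k bytes are accessible, and a negative value means the whole granule is poisoned.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SHIFT	3			/* 8 bytes per shadow byte */
#define GRANULE_SIZE	(1u << GRANULE_SHIFT)
#define ARENA_SIZE	64			/* toy memory region, in bytes */

/* One signed shadow byte per 8-byte granule of the toy arena. */
static int8_t toy_shadow[ARENA_SIZE >> GRANULE_SHIFT];

/* Mark the first 'len' bytes of the arena accessible, the rest poisoned. */
static void toy_unpoison_prefix(size_t len)
{
	size_t g;

	for (g = 0; g < ARENA_SIZE / GRANULE_SIZE; g++) {
		size_t start = g * GRANULE_SIZE;

		if (len >= start + GRANULE_SIZE)
			toy_shadow[g] = 0;			/* fully valid */
		else if (len > start)
			toy_shadow[g] = (int8_t)(len - start);	/* partial */
		else
			toy_shadow[g] = -1;			/* poisoned */
	}
}

/* Check a single byte at offset 'off' against its shadow byte. */
static bool toy_byte_ok(size_t off)
{
	int8_t s;

	if (off >= ARENA_SIZE)
		return false;
	s = toy_shadow[off >> GRANULE_SHIFT];
	return s == 0 || (s > 0 && (off & (GRANULE_SIZE - 1)) < (size_t)s);
}

/* Toy equivalent of a sized load/store check: every byte must be valid. */
static bool toy_access_ok(size_t off, size_t size)
{
	size_t i;

	for (i = 0; i < size; i++)
		if (!toy_byte_ok(off + i))
			return false;
	return true;
}
```

With a 20-byte valid prefix, a 4-byte access at offset 16 passes while an 8-byte access at the same offset fails, which is the out-of-bounds situation the instrumented JITed code is expected to report.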
As discussed in [1], this series is based on some choices and
assumptions:
- it focuses on x86_64 for now, and so only on KASAN_GENERIC
- not all memory accessing BPF instructions are being instrumented:
- it focuses on STX/LDX instructions
- it discards instructions accessing BPF program stack (already
monitored by page guards)
- it discards possibly faulting instructions, like BPF_PROBE_MEM or
BPF_PROBE_ATOMIC insns
The series is marked and sent as RFC:
- to allow collecting feedback early and make sure that it goes in the
right direction
- because it depends on Xu's work to pass data between the verifier and
JIT compilers. This work is not merged yet, see [2]. I have been
tracking the various revisions he sent on the ML and based my local
branch on his work
- because tests brought by this series currently can't run on BPF CI:
they expect kasan multishot to be enabled, otherwise the first test
will make all other kasan-related tests fail.
- because some cases like atomic loads/stores are not instrumented yet
(and are still making me scratch my head)
- because it will hopefully provide a good basis to discuss the topic at
LSFMMBPF (see [3])
Despite this series not being ready for integration yet, anyone
interested in running it locally can perform the following steps to run
the JITed KASAN instrumentation selftests:
- rebasing this series locally on [2]
- building and running the corresponding kernel with kasan_multi_shot
enabled
- running `test_progs -a kasan`
They should then get a variety of KASAN tests executed for BPF programs:
#162/1 kasan/bpf_kasan_uaf_read_1:OK
#162/2 kasan/bpf_kasan_uaf_read_2:OK
#162/3 kasan/bpf_kasan_uaf_read_4:OK
#162/4 kasan/bpf_kasan_uaf_read_8:OK
#162/5 kasan/bpf_kasan_uaf_write_1:OK
#162/6 kasan/bpf_kasan_uaf_write_2:OK
#162/7 kasan/bpf_kasan_uaf_write_4:OK
#162/8 kasan/bpf_kasan_uaf_write_8:OK
#162/9 kasan/bpf_kasan_oob_read_1:OK
#162/10 kasan/bpf_kasan_oob_read_2:OK
#162/11 kasan/bpf_kasan_oob_read_4:OK
#162/12 kasan/bpf_kasan_oob_read_8:OK
#162/13 kasan/bpf_kasan_oob_write_1:OK
#162/14 kasan/bpf_kasan_oob_write_2:OK
#162/15 kasan/bpf_kasan_oob_write_4:OK
#162/16 kasan/bpf_kasan_oob_write_8:OK
#162 kasan:OK
Summary: 1/16 PASSED, 0 SKIPPED, 0 FAILED
[1] https://lore.kernel.org/bpf/DG7UG112AVBC.JKYISDTAM30T@bootlin.com/
[2] https://lore.kernel.org/bpf/cover.1776062885.git.xukuohai@hotmail.com/
[3] https://lore.kernel.org/bpf/DGGNCXX79H8O.2P6K8L1QW1M8K@bootlin.com/
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
Alexis Lothoré (eBPF Foundation) (8):
kasan: expose generic kasan helpers
bpf: mark instructions accessing program stack
bpf: add BPF_JIT_KASAN for KASAN instrumentation of JITed programs
bpf, x86: add helper to emit kasan checks in x86 JITed programs
bpf, x86: emit KASAN checks into x86 JITed programs
selftests/bpf: do not run verifier JIT tests when BPF_JIT_KASAN is enabled
bpf, x86: enable KASAN for JITed programs on x86
selftests/bpf: add tests to validate KASAN on JIT programs
arch/x86/Kconfig | 1 +
arch/x86/net/bpf_jit_comp.c | 106 +++++++++++++
include/linux/bpf.h | 2 +
include/linux/bpf_verifier.h | 2 +
include/linux/kasan.h | 13 ++
kernel/bpf/Kconfig | 9 ++
kernel/bpf/core.c | 10 ++
kernel/bpf/verifier.c | 7 +
mm/kasan/kasan.h | 10 --
tools/testing/selftests/bpf/prog_tests/kasan.c | 165 +++++++++++++++++++++
tools/testing/selftests/bpf/progs/kasan.c | 146 ++++++++++++++++++
.../testing/selftests/bpf/test_kmods/bpf_testmod.c | 79 ++++++++++
tools/testing/selftests/bpf/test_loader.c | 5 +
tools/testing/selftests/bpf/unpriv_helpers.c | 5 +
tools/testing/selftests/bpf/unpriv_helpers.h | 1 +
15 files changed, 551 insertions(+), 10 deletions(-)
---
base-commit: 7990a071b32887a1a883952e8cf60134b6d6fea0
change-id: 20260126-kasan-fcd68f64cd7b
Best regards,
--
Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
^ permalink raw reply
* [PATCH RFC bpf-next 1/8] kasan: expose generic kasan helpers
From: Alexis Lothoré (eBPF Foundation) @ 2026-04-13 18:28 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton
Cc: ebpf, Bastien Curutchet, Thomas Petazzoni, Xu Kuohai, bpf,
linux-kernel, netdev, linux-kselftest, linux-stm32,
linux-arm-kernel, kasan-dev, linux-mm,
Alexis Lothoré (eBPF Foundation)
In-Reply-To: <20260413-kasan-v1-0-1a5831230821@bootlin.com>
In order to prepare KASAN helpers to be called from the eBPF subsystem
(to add KASAN instrumentation at runtime when JITing eBPF programs),
expose the __asan_{load,store}X functions in linux/kasan.h.
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
include/linux/kasan.h | 13 +++++++++++++
mm/kasan/kasan.h | 10 ----------
2 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 338a1921a50a..6f580d4a39e4 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -710,4 +710,17 @@ void kasan_non_canonical_hook(unsigned long addr);
static inline void kasan_non_canonical_hook(unsigned long addr) { }
#endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
+#ifdef CONFIG_KASAN_GENERIC
+void __asan_load1(void *p);
+void __asan_store1(void *p);
+void __asan_load2(void *p);
+void __asan_store2(void *p);
+void __asan_load4(void *p);
+void __asan_store4(void *p);
+void __asan_load8(void *p);
+void __asan_store8(void *p);
+void __asan_load16(void *p);
+void __asan_store16(void *p);
+#endif /* CONFIG_KASAN_GENERIC */
+
#endif /* LINUX_KASAN_H */
diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
index fc9169a54766..3bfce8eb3135 100644
--- a/mm/kasan/kasan.h
+++ b/mm/kasan/kasan.h
@@ -594,16 +594,6 @@ void __asan_handle_no_return(void);
void __asan_alloca_poison(void *, ssize_t size);
void __asan_allocas_unpoison(void *stack_top, ssize_t stack_bottom);
-void __asan_load1(void *);
-void __asan_store1(void *);
-void __asan_load2(void *);
-void __asan_store2(void *);
-void __asan_load4(void *);
-void __asan_store4(void *);
-void __asan_load8(void *);
-void __asan_store8(void *);
-void __asan_load16(void *);
-void __asan_store16(void *);
void __asan_loadN(void *, ssize_t size);
void __asan_storeN(void *, ssize_t size);
--
2.53.0
^ permalink raw reply related
* [PATCH net] ixgbevf: fix use-after-free in VEPA multicast source pruning
From: Michael Bommarito @ 2026-04-13 18:24 UTC (permalink / raw)
To: intel-wired-lan
Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, stable,
linux-kernel, Michael Bommarito
ixgbevf_clean_rx_irq() prunes frames whose source MAC matches the VF's
own address (VEPA multicast workaround) by freeing the skb and
continuing to the next descriptor:
dev_kfree_skb_irq(skb);
continue;
The skb pointer is declared outside the while loop and persists across
iterations. Because the continue skips the "skb = NULL" reset at the
bottom of the loop, the next iteration enters the "else if (skb)" path
and calls ixgbevf_add_rx_frag() on the freed skb, dereferencing
skb_shinfo(skb)->nr_frags, which is a use-after-free in NAPI softirq context.
The sibling driver iavf already handles this correctly by nulling the
pointer before continuing. Apply the same pattern here.
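The drop-and-continue pattern described above can be modelled in user space without any real use-after-free. The sketch below is a toy model, not driver code: struct toy_buf, struct toy_desc and toy_clean_rx() are hypothetical names, "freeing" only sets a flag, and the caller is assumed to provide a pool with at least as many entries as descriptors. The reset_on_drop parameter switches between the behaviour before and after the fix.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy packet buffer: "freeing" only sets a flag so stale use is countable. */
struct toy_buf {
	bool freed;
	int nr_frags;
};

/* Toy RX descriptor: end_of_packet closes a packet, drop prunes it. */
struct toy_desc {
	bool drop;
	bool end_of_packet;
};

/*
 * Walk the descriptors with a buffer pointer that persists across
 * iterations. Returns how many times a freed buffer was touched.
 * pool must hold at least n entries.
 */
static int toy_clean_rx(const struct toy_desc *descs, size_t n,
			struct toy_buf *pool, bool reset_on_drop)
{
	struct toy_buf *buf = NULL;	/* declared outside the loop */
	size_t next = 0, i;
	int stale_uses = 0;

	for (i = 0; i < n; i++) {
		if (!buf) {
			/* No packet in progress: start a new buffer. */
			buf = &pool[next++];
			buf->freed = false;
			buf->nr_frags = 0;
		} else {
			/* Packet in progress: append a fragment. */
			if (buf->freed)
				stale_uses++;	/* would be a UAF */
			buf->nr_frags++;
		}

		if (!descs[i].end_of_packet)
			continue;

		if (descs[i].drop) {
			buf->freed = true;
			if (reset_on_drop)
				buf = NULL;	/* the one-line fix */
			continue;		/* skips the reset below */
		}

		buf = NULL;	/* packet handed off, start fresh next time */
	}
	return stale_uses;
}
```

Running the same two descriptors (one dropped packet followed by a normal one) through both modes shows one stale access without the reset and none with it.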
I do not have ixgbevf hardware; the bug was found by static analysis
(scan_drop_continue_loops.py + semgrep drop_continue_in_loop, multi-tool
corroboration with the highest score in the scan). The UAF was confirmed
under KASAN by loading a test module that reproduces the exact code
pattern (alloc skb, kfree_skb, then read skb_shinfo(skb)->nr_frags):
BUG: KASAN: slab-use-after-free in ixgbevf_uaf_test_init+0x100/0x1000
Read of size 8 at addr 000000006163ae78 by task insmod/30
freed 208-byte region [000000006163adc0, 000000006163ae90)
QEMU emulates igb (82576) but not ixgbe (82599), and the igbvf VF
driver does not include the VEPA source pruning path, so a full
end-to-end reproduction with emulated hardware was not possible.
Fixes: bad17234ba70 ("ixgbevf: Change receive model to use double buffered page based receives")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 42f89a179a3f..4ba3be961ab6 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -1221,6 +1221,7 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
ether_addr_equal(rx_ring->netdev->dev_addr,
eth_hdr(skb)->h_source)) {
dev_kfree_skb_irq(skb);
+ skb = NULL;
continue;
}
--
2.53.0
^ permalink raw reply related
* [PATCH net-next v2] net: check qdisc_pkt_len_segs_init() return value on ingress
From: David Carlier @ 2026-04-13 18:22 UTC (permalink / raw)
To: Jakub Kicinski, David S . Miller, Eric Dumazet, Paolo Abeni
Cc: Simon Horman, Stanislav Fomichev, Kuniyuki Iwashima,
Samiullah Khawaja, Hangbin Liu, Krishna Kumar, netdev,
linux-kernel, David Carlier
Commit 7fb4c1967011 ("net: pull headers in qdisc_pkt_len_segs_init()")
changed qdisc_pkt_len_segs_init() to return an skb drop reason when
it detects malicious GSO packets. The egress path in __dev_queue_xmit()
checks this return value and drops bad packets, but the ingress path in
sch_handle_ingress() ignores it.
This means malformed GSO packets entering via TC ingress are not dropped
and could be redirected to another interface or cause incorrect qdisc
accounting.
Check the return value and drop the packet when a bad GSO is detected.
Fixes: 7fb4c1967011 ("net: pull headers in qdisc_pkt_len_segs_init()")
Signed-off-by: David Carlier <devnexen@gmail.com>
---
v1 -> v2: reorder variable declarations for reverse xmas tree
v1: https://lore.kernel.org/netdev/20260408172307.46498-1-devnexen@gmail.com/
net/core/dev.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 5a31f9d2128c..d11c22cafca9 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4459,8 +4459,8 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
struct net_device *orig_dev, bool *another)
{
struct bpf_mprog_entry *entry = rcu_dereference_bh(skb->dev->tcx_ingress);
- enum skb_drop_reason drop_reason = SKB_DROP_REASON_TC_INGRESS;
struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx;
+ enum skb_drop_reason drop_reason;
int sch_ret;
if (!entry)
@@ -4472,7 +4472,15 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
*pt_prev = NULL;
}
- qdisc_pkt_len_segs_init(skb);
+ drop_reason = qdisc_pkt_len_segs_init(skb);
+ if (unlikely(drop_reason)) {
+ kfree_skb_reason(skb, drop_reason);
+ *ret = NET_RX_DROP;
+ bpf_net_ctx_clear(bpf_net_ctx);
+ return NULL;
+ }
+
+ drop_reason = SKB_DROP_REASON_TC_INGRESS;
tcx_set_ingress(skb, true);
if (static_branch_unlikely(&tcx_needed_key)) {
--
2.53.0
^ permalink raw reply related
* Re: [PATCH net-next v7 14/15] selftests: net: add team_bridge_macvlan rx_mode test
From: Breno Leitao @ 2026-04-13 18:09 UTC (permalink / raw)
To: Stanislav Fomichev; +Cc: netdev, davem, edumazet, kuba, pabeni
In-Reply-To: <20260413171131.550126-15-sdf@fomichev.me>
On Mon, Apr 13, 2026 at 10:11:30AM -0700, Stanislav Fomichev wrote:
> Add a test that exercises the ndo_change_rx_flags path through a
> macvlan -> bridge -> team -> dummy stack. This triggers dev_uc_add
> under addr_list_lock which flips promiscuity on the lower device.
> With the new work queue approach, this must not deadlock.
>
> Link: https://lore.kernel.org/netdev/20260214033859.43857-1-jiayuan.chen@linux.dev/
> Cc: Breno Leitao <leitao@debian.org>
> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Breno Leitao <leitao@debian.org>
^ permalink raw reply
* Re: [PATCH net-next] net: stmmac: enable RPS and RBU interrupts
From: Jakub Kicinski @ 2026-04-13 18:02 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Alexandre Torgue, Andrew Lunn, David S. Miller,
Eric Dumazet, linux-arm-kernel, linux-stm32, netdev, Paolo Abeni,
Sam Edwards
In-Reply-To: <E1wBBaR-0000000GZHR-1dbM@rmk-PC.armlinux.org.uk>
On Fri, 10 Apr 2026 14:07:51 +0100 Russell King (Oracle) wrote:
> Since we are seeing receive buffer exhaustion on several platforms,
> let's enable the interrupts so the statistics we publish via ethtool -S
> actually work to aid diagnosis. I've been in two minds about whether
> to send this patch, but given the problems with stmmac at the moment,
> I think it should be merged.
Sorry for an under-researched response, but wasn't there a person trying
to fix the OOM starvation issue? Who was supposed to add a timer?
Is your problem also OOM related or do you suspect something else?
Firing interrupts when the Rx fill ring runs dry (which IIUC this patch
does?) is not a good idea.
^ permalink raw reply
* Re: [PATCH net-next v2 1/2] keys, dns: drop unused upayload->data NUL terminator
From: Jakub Kicinski @ 2026-04-13 18:00 UTC (permalink / raw)
To: Thorsten Blum
Cc: David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Tim Bird, netdev, linux-kernel
In-Reply-To: <adw5cvtPfx1SWQq9@linux.dev>
On Mon, 13 Apr 2026 02:31:46 +0200 Thorsten Blum wrote:
> On Sun, Apr 12, 2026 at 05:05:08PM -0700, Jakub Kicinski wrote:
> > On Mon, 13 Apr 2026 01:04:54 +0200 Thorsten Blum wrote:
> > > On Sun, Apr 12, 2026 at 02:10:04PM -0700, Jakub Kicinski wrote:
> [...]
> [...]
> [...]
> > >
> > > The point of patch 1/2 is not the removed NUL terminator itself, but to
> > > prepare for patch 2/2, which adds __counted_by() and requires ->datalen
> > > to match the number of elements in ->data.
> > >
> > > Currently, that is not the case because ->data includes an extra NUL
> > > despite never being used as a C string. Removing the unused terminator
> > > makes the length match the allocation size and allows adding the
> > > __counted_by() annotation.
> > >
> > > I can fold this into the __counted_by() patch if you prefer.
> >
> > I understand that part, but I don't get where the data from which
> > the terminating character is removed, is used. Only other access
> > I saw was freeing it, the rest of the callback seem to looking
> > at the error, not the data..
>
> ->data and ->datalen are used in multiple places.
>
> For example, in dns_query() in net/dns_resolver/dns_query.c:
>
> upayload = user_key_payload_locked(rkey);
> len = upayload->datalen;
>
> if (_result) {
> ret = -ENOMEM;
> *_result = kmemdup_nul(upayload->data, len, GFP_KERNEL);
> if (!*_result)
> goto put;
> }
>
> In cifs_set_cifscreds() in fs/smb/client/connect.c:
>
> /* find first : in payload */
> payload = upayload->data;
> delim = strnchr(payload, upayload->datalen, ':');
>
Alright, could you repost this after the merge window and CC David and
Jarkko on both patches? They supposedly maintain this.
^ permalink raw reply
* Re: [PATCH v3] nfc: hci: fix out-of-bounds read in HCP header parsing
From: Jakub Kicinski @ 2026-04-13 17:55 UTC (permalink / raw)
To: Ashutosh Desai; +Cc: netdev, edumazet, davem, pabeni, horms, linux-kernel
In-Reply-To: <20260413024329.3293075-1-ashutoshdesai993@gmail.com>
On Mon, 13 Apr 2026 02:43:29 +0000 Ashutosh Desai wrote:
> nfc_hci_recv_from_llc() and nci_hci_data_received_cb() cast skb->data
> to struct hcp_packet and read the message header byte without checking
> that enough data is present in the linear sk_buff area. A malicious NFC
> peer can send a 1-byte HCP frame that passes through the SHDLC layer
> and reaches these functions, causing an out-of-bounds heap read.
>
> Fix this by adding pskb_may_pull() before each cast to ensure the full
> 2-byte HCP header is pulled into the linear area before it is accessed.
This is missing a Fixes tag.
Also, please do not post a new revision of a patch in response to the
previous one.
--
pw-bot: cr
pv-bot: fixes
pv-bot: thread
^ permalink raw reply
* Re: [PATCH 2/4] tools: ynl-gen-c: optionally emit structs and helpers
From: Jakub Kicinski @ 2026-04-13 17:49 UTC (permalink / raw)
To: Christoph Böhmwalder
Cc: Jens Axboe, drbd-dev, linux-kernel, Lars Ellenberg,
Philipp Reisner, linux-block, Donald Hunter, Eric Dumazet, netdev
In-Reply-To: <adzVUdf74CVk2DwJ@localhost.localdomain>
On Mon, 13 Apr 2026 13:48:32 +0200 Christoph Böhmwalder wrote:
> >Can we just commit the code they output and leave the YNL itself be?
> >Every single legacy family has some weird quirks the point of YNL
> >is to get rid of them, not support them all..
>
> Fair enough, we could also do that. Though the question then becomes
> whether we want to keep the YAML spec for the "drbd" family (patch 3 of
> this series) in Documentation/.
>
> I would argue it makes sense to keep it around somewhere so that the old
> family is somehow documented, but obviously that yaml file won't work
> with the unmodified generator.
To be clear (correct me if I misunderstood) it looked like we would be
missing out on "automating" things, so extra work would still need to
be done in the C code / manually written headers. But pure YNL (e.g.
Python or Rust) clients _would_ work? They could generate correct
requests and parse responses, right?
If yes, keeping it makes sense. FWIW all the specs we have for "old"
networking families (routing etc) also don't replace any kernel code.
They are purely to enable user space libraries in various languages.
Whether broad language support matters for drbd, or you just have one
well-known user space stack, I dunno.
> Maybe keep it, but with a comment at the top that notes that
> - this family is deprecated and "frozen",
> - the spec is only for documentation purposes, and
> - the spec doesn't work with the upstream parser?
The last point needs a clarification, I guess..
^ permalink raw reply
* [PATCH net] NFC: digital: bound SENSF response copy into nfc_target
From: Michael Bommarito @ 2026-04-13 17:47 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Kees Cook, stable, linux-kernel, Michael Bommarito
digital_in_recv_sensf_res() copies the received SENSF response into
struct nfc_target without bounding the copy to target.sensf_res. A full
on-wire digital_sensf_res is 19 bytes long, while nfc_target stores 18
bytes, so full-length or oversized responses can overwrite adjacent
stack fields before digital_target_found() sees the target.
Reject payloads larger than struct digital_sensf_res and clamp the copy
into target.sensf_res so valid 19-byte responses keep working while the
destination buffer remains bounded.
This was confirmed by injecting an oversized SENSF_RES frame via a
patched nfcsim driver, producing a kernel panic with the overflow
pattern visible on the stack:
Kernel panic - not syncing: Kernel mode fault at addr 0x0
Stack:
4141414141414141 4141414141414141 4141414141414141 ...
Found by static analysis with Coccinelle (memcpy-from-TLV pattern
derived from CVE-2019-14814).
Fixes: 8c0695e4998d ("NFC Digital: Add NFC-F technology support")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-6
Assisted-by: Codex:gpt-5-4
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
net/nfc/digital_technology.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/nfc/digital_technology.c b/net/nfc/digital_technology.c
index 63f1b721c71d..5ef49f813f70 100644
--- a/net/nfc/digital_technology.c
+++ b/net/nfc/digital_technology.c
@@ -768,12 +768,18 @@ static void digital_in_recv_sensf_res(struct nfc_digital_dev *ddev, void *arg,
skb_pull(resp, 1);
+ if (resp->len > sizeof(struct digital_sensf_res)) {
+ rc = -EIO;
+ goto exit;
+ }
+
memset(&target, 0, sizeof(struct nfc_target));
sensf_res = (struct digital_sensf_res *)resp->data;
- memcpy(target.sensf_res, sensf_res, resp->len);
- target.sensf_res_len = resp->len;
+ target.sensf_res_len = min_t(unsigned int, resp->len,
+ sizeof(target.sensf_res));
+ memcpy(target.sensf_res, sensf_res, target.sensf_res_len);
memcpy(target.nfcid2, sensf_res->nfcid2, NFC_NFCID2_MAXSIZE);
target.nfcid2_len = NFC_NFCID2_MAXSIZE;
--
2.53.0
^ permalink raw reply related
* [PATCH net-next 3/3] rose: guard rose_neigh_put() against NULL in timer expiry
From: f6bvp @ 2026-04-13 17:42 UTC (permalink / raw)
To: linux-hams; +Cc: netdev, edumazet, pabeni, f6bvp
In-Reply-To: <20260413174238.112418-1-bernard.f6bvp@gmail.com>
In rose_timer_expiry(), ROSE_STATE_2 calls rose_neigh_put() on
rose->neighbour without checking whether it is NULL first. The pointer
can be NULL if the connection was already being torn down by a
concurrent code path (e.g. rose_kill_by_neigh()), leading to a
NULL-pointer dereference.
Add a NULL check before the put and clear the pointer afterwards.
Signed-off-by: f6bvp <bernard.f6bvp@gmail.com>
---
net/rose/rose_timer.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/rose/rose_timer.c b/net/rose/rose_timer.c
index bb60a1654d61..d997d24ab081 100644
--- a/net/rose/rose_timer.c
+++ b/net/rose/rose_timer.c
@@ -180,7 +180,10 @@ static void rose_timer_expiry(struct timer_list *t)
break;
case ROSE_STATE_2: /* T3 */
- rose_neigh_put(rose->neighbour);
+ if (rose->neighbour) {
+ rose_neigh_put(rose->neighbour);
+ rose->neighbour = NULL;
+ }
rose_disconnect(sk, ETIMEDOUT, -1, -1);
break;
--
2.51.0
^ permalink raw reply related
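The guard-then-clear pattern described in the rose_timer_expiry() patch above can be sketched in user space roughly as follows. This is only an illustration: struct neigh, struct conn, neigh_put() and conn_release_neigh() are hypothetical stand-ins, not the real ROSE types or helpers, and the plain int counter replaces whatever reference counting the kernel uses.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real ROSE structures. */
struct neigh {
	int refcount;	/* plain int for illustration only */
};

struct conn {
	struct neigh *neighbour;
};

/* Drop one reference. A real put helper may free the object at zero. */
static void neigh_put(struct neigh *n)
{
	n->refcount--;
}

/*
 * Release the connection's neighbour reference at most once.
 * Returns 1 if a reference was dropped, 0 if there was nothing to drop.
 */
static int conn_release_neigh(struct conn *c)
{
	if (!c->neighbour)
		return 0;

	neigh_put(c->neighbour);
	c->neighbour = NULL;	/* later callers see NULL, not a stale pointer */
	return 1;
}
```

Calling conn_release_neigh() twice on the same connection drops only one reference. The sketch is single-threaded; it does not show whatever locking would be needed to make the NULL check safe against a concurrent teardown path.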
* [PATCH net-next 2/3] rose: clear neighbour pointer after rose_neigh_put() in state machines
From: f6bvp @ 2026-04-13 17:42 UTC (permalink / raw)
To: linux-hams; +Cc: netdev, edumazet, pabeni, f6bvp
In-Reply-To: <20260413174238.112418-1-bernard.f6bvp@gmail.com>
After releasing a neighbour reference with rose_neigh_put() in the
ROSE state machines, the pointer in rose_sock was left dangling.
A subsequent code path could dereference the freed neighbour, causing
a use-after-free.
Set rose->neighbour to NULL immediately after each rose_neigh_put()
call in rose_state1_machine() through rose_state5_machine().
Signed-off-by: f6bvp <bernard.f6bvp@gmail.com>
---
net/rose/rose_in.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/rose/rose_in.c b/net/rose/rose_in.c
index 0276b393f0e5..622527f1354f 100644
--- a/net/rose/rose_in.c
+++ b/net/rose/rose_in.c
@@ -57,6 +57,7 @@ static int rose_state1_machine(struct sock *sk, struct sk_buff *skb, int framety
rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION);
rose_disconnect(sk, ECONNREFUSED, skb->data[3], skb->data[4]);
rose_neigh_put(rose->neighbour);
+ rose->neighbour = NULL;
break;
default:
@@ -80,11 +81,13 @@ static int rose_state2_machine(struct sock *sk, struct sk_buff *skb, int framety
rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION);
rose_disconnect(sk, 0, skb->data[3], skb->data[4]);
rose_neigh_put(rose->neighbour);
+ rose->neighbour = NULL;
break;
case ROSE_CLEAR_CONFIRMATION:
rose_disconnect(sk, 0, -1, -1);
rose_neigh_put(rose->neighbour);
+ rose->neighbour = NULL;
break;
default:
@@ -122,6 +125,7 @@ static int rose_state3_machine(struct sock *sk, struct sk_buff *skb, int framety
rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION);
rose_disconnect(sk, 0, skb->data[3], skb->data[4]);
rose_neigh_put(rose->neighbour);
+ rose->neighbour = NULL;
break;
case ROSE_RR:
@@ -235,6 +239,7 @@ static int rose_state4_machine(struct sock *sk, struct sk_buff *skb, int framety
rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION);
rose_disconnect(sk, 0, skb->data[3], skb->data[4]);
rose_neigh_put(rose->neighbour);
+ rose->neighbour = NULL;
break;
default:
@@ -255,6 +260,7 @@ static int rose_state5_machine(struct sock *sk, struct sk_buff *skb, int framety
rose_write_internal(sk, ROSE_CLEAR_CONFIRMATION);
rose_disconnect(sk, 0, skb->data[3], skb->data[4]);
rose_neigh_put(rose_sk(sk)->neighbour);
+ rose_sk(sk)->neighbour = NULL;
}
return 0;
--
2.51.0
^ permalink raw reply related
* [PATCH net-next 1/3] rose: fix race between loopback timer and module removal
From: f6bvp @ 2026-04-13 17:42 UTC (permalink / raw)
To: linux-hams; +Cc: netdev, edumazet, pabeni, f6bvp
In-Reply-To: <5a88b747-bb06-4ebd-99de-80ceb574cf22@free.fr>
rose_loopback_clear() used timer_delete() which returns immediately
without waiting for any running callback to complete. If the timer
fired concurrently with module removal, rose_loopback_timer() would
access rose_loopback_neigh after it was freed, causing a use-after-free.
Three changes fix the race:
1. Add a loopback_stopping atomic flag. rose_loopback_timer() checks
this at entry and mid-loop; when set it drains the queue and bails
out without re-arming the timer.
2. Switch rose_loopback_clear() to timer_delete_sync() so it blocks
until any in-flight callback has returned.
3. Wrap the timer body with rose_neigh_hold()/rose_neigh_put() so the
loopback neighbour cannot be freed while the callback is running.
Also fix a pre-existing bug: dev_put(dev) was only called on the
failure path of rose_rx_call_request(); it is now called unconditionally
so the device reference is always released.
Remove a dead check (!neigh->dev && !neigh->loopback) that can never
be true for the loopback neighbour, which always has loopback=1.
Signed-off-by: f6bvp <bernard.f6bvp@gmail.com>
---
net/rose/rose_loopback.c | 53 +++++++++++++++++++++++++++-------------
1 file changed, 36 insertions(+), 17 deletions(-)
diff --git a/net/rose/rose_loopback.c b/net/rose/rose_loopback.c
index b538e39b3df5..80d7879ef36a 100644
--- a/net/rose/rose_loopback.c
+++ b/net/rose/rose_loopback.c
@@ -12,13 +12,15 @@
#include <net/rose.h>
#include <linux/init.h>
-static struct sk_buff_head loopback_queue;
#define ROSE_LOOPBACK_LIMIT 1000
-static struct timer_list loopback_timer;
+static struct timer_list loopback_timer;
+static struct sk_buff_head loopback_queue;
static void rose_set_loopback_timer(void);
static void rose_loopback_timer(struct timer_list *unused);
+static atomic_t loopback_stopping = ATOMIC_INIT(0);
+
void rose_loopback_init(void)
{
skb_queue_head_init(&loopback_queue);
@@ -66,10 +68,25 @@ static void rose_loopback_timer(struct timer_list *unused)
unsigned int lci_i, lci_o;
int count;
+ if (atomic_read(&loopback_stopping))
+ return;
+
+ if (rose_loopback_neigh)
+ rose_neigh_hold(rose_loopback_neigh);
+ else
+ return;
+
for (count = 0; count < ROSE_LOOPBACK_LIMIT; count++) {
skb = skb_dequeue(&loopback_queue);
if (!skb)
- return;
+ goto out;
+
+ if (atomic_read(&loopback_stopping)) {
+ kfree_skb(skb);
+ skb_queue_purge(&loopback_queue);
+ goto out;
+ }
+
if (skb->len < ROSE_MIN_LEN) {
kfree_skb(skb);
continue;
@@ -96,27 +113,24 @@ static void rose_loopback_timer(struct timer_list *unused)
}
if (frametype == ROSE_CALL_REQUEST) {
- if (!rose_loopback_neigh->dev &&
- !rose_loopback_neigh->loopback) {
- kfree_skb(skb);
- continue;
- }
-
dev = rose_dev_get(dest);
if (!dev) {
kfree_skb(skb);
continue;
}
- if (rose_rx_call_request(skb, dev, rose_loopback_neigh, lci_o) == 0) {
- dev_put(dev);
+ if (rose_rx_call_request(skb, dev, rose_loopback_neigh, lci_o) == 0)
kfree_skb(skb);
- }
+ dev_put(dev);
} else {
kfree_skb(skb);
}
}
- if (!skb_queue_empty(&loopback_queue))
+
+out:
+ rose_neigh_put(rose_loopback_neigh);
+
+ if (!atomic_read(&loopback_stopping) && !skb_queue_empty(&loopback_queue))
mod_timer(&loopback_timer, jiffies + 1);
}
@@ -124,10 +138,15 @@ void __exit rose_loopback_clear(void)
{
struct sk_buff *skb;
- timer_delete(&loopback_timer);
+ atomic_set(&loopback_stopping, 1);
+ /* Pairs with atomic_read() in rose_loopback_timer(): ensure the
+ * stopping flag is visible before we cancel, so a concurrent
+ * callback aborts its loop early rather than re-arming the timer.
+ */
+ smp_mb();
- while ((skb = skb_dequeue(&loopback_queue)) != NULL) {
- skb->sk = NULL;
+ timer_delete_sync(&loopback_timer);
+
+ while ((skb = skb_dequeue(&loopback_queue)) != NULL)
kfree_skb(skb);
- }
}
--
2.51.0
^ permalink raw reply related
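The stopping-flag logic from the rose_loopback patch above can be modelled in user space roughly as below. This is a sketch under stated assumptions, not the kernel code: loopback_tick(), loopback_stop(), BATCH_LIMIT and the integer queued counter are hypothetical names that replace the timer callback, the teardown function, ROSE_LOOPBACK_LIMIT and the skb queue. The return value of loopback_tick() stands in for the decision to re-arm the timer.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define BATCH_LIMIT 4		/* stands in for ROSE_LOOPBACK_LIMIT */

static atomic_int stopping;	/* set once by the teardown path */
static int queued;		/* hypothetical queue depth, replaces the skb queue */

/* Model of the timer callback. Returns true if it would re-arm the timer. */
static bool loopback_tick(void)
{
	int count;

	/* Entry check: do nothing once teardown has started. */
	if (atomic_load(&stopping))
		return false;

	for (count = 0; count < BATCH_LIMIT && queued > 0; count++) {
		/* Mid-loop check: drain instead of processing. */
		if (atomic_load(&stopping)) {
			queued = 0;
			break;
		}
		queued--;	/* "process" one item */
	}

	/* Re-arm only if not stopping and work remains. */
	return !atomic_load(&stopping) && queued > 0;
}

/* Model of the teardown path: raise the flag, then drain the queue. */
static void loopback_stop(void)
{
	atomic_store(&stopping, 1);
	/* The patch waits for a running callback here via timer_delete_sync(). */
	queued = 0;
}
```

The model only captures the control flow: once the flag is set, the callback neither processes items nor asks to be re-armed. The actual waiting for an in-flight callback is what timer_delete_sync() provides in the patch and is not reproduced here.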
* Re: [PATCH v5 net-next 0/8] dpll/ice: Add TXC DPLL type and full TX reference clock control for E825
From: Jakub Kicinski @ 2026-04-13 17:40 UTC (permalink / raw)
To: Kubalewski, Arkadiusz
Cc: Nitka, Grzegorz, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
Oros, Petr, richardcochran@gmail.com, andrew+netdev@lunn.ch,
Kitszel, Przemyslaw, Nguyen, Anthony L,
Prathosh.Satish@microchip.com, Vecera, Ivan, jiri@resnulli.us,
vadim.fedorenko@linux.dev, donald.hunter@gmail.com,
horms@kernel.org, pabeni@redhat.com, davem@davemloft.net,
edumazet@google.com
In-Reply-To: <IA0PR11MB737882B384AE7279EBCD05C79B242@IA0PR11MB7378.namprd11.prod.outlook.com>
On Mon, 13 Apr 2026 08:19:30 +0000 Kubalewski, Arkadiusz wrote:
> >My concern is that I think this is a pretty run of the mill SyncE
> >design. If we need to pretend we have two DPLLs here if we really
> >only have one and a mux - then our APIs are mis-designed :(
>
> Well, the truth is that we did not anticipate per-port control of the
> TX clock source, as a single DPLL device could drive multiple of such.
>
> It is not true that we pretend there is a second PLL - there is a
> PLL on each TX clock, maybe not a full DPLL, but still the loop with
> a control over its sources is there and it has the same 2 external
> sources + default XO.
Let me dig around and see if I can find any docs for PLL IPs
that get integrated into ASICs. The DPLL subsystem has implicitly
focused on standalone, timing related PLLs. Every ASIC out there
has a bunch of PLLs to generate the clock signals. It's not clear
to me that DPLL subsystem is the right fit for this. Ping me if
I don't get back to this by the end of the week please. I'll need
to wrap up net-next and send the PR first..
> The mentioned attempt at adding a per-port MUX-type pin, just to give some
> control to the user, is where we wanted to simplify things, but in the end
> the API would have to be modified in a significant way (various paths
> related to pin registration and keeping correct references) just to make a
> working case for pin_on_pin_register and its internals. We decided that the
> burden and impact on the existing design was too high.
>
> And that is why the TXC approach emerged: the change to DPLL is minimal and
> the model is still correct from the user's perspective. A SyncE SW controller
> shall anticipate the possibility that a per-port TXC dpll is there.
>
> This particular device and driver don't implement any EEC-type DPLL
> device, so one could think that we can just change the type here and use
> the EEC type instead of the new TXC one, since we share pins from an external
> dpll driver, which is EEC type, and our DPLL device would have a different
> clock_id and module. But for future designs, where a single NIC has control
> over both an EEC DPLL and the ability to control each source per port, this
> would be problematic. At least one NIC port driver would have to have 2
> EEC-type DPLLs, leaving the user with extra confusion.
^ permalink raw reply