Netdev List
* [PATCH v5.10] netfilter: nft_set_pipapo: do not rely on ZERO_SIZE_PTR
From: Keerthana K @ 2026-04-14  6:32 UTC (permalink / raw)
  To: stable, gregkh
  Cc: pablo, kadlec, fw, davem, edumazet, kuba, pabeni, netfilter-devel,
	coreteam, netdev, linux-kernel, ajay.kaher, alexey.makhalov,
	vamsi-krishna.brahmajosyula, yin.ding, tapas.kundu,
	Stefano Brivio, Mukul Sikka, Brennan Lamoreaux, Keerthana K

From: Florian Westphal <fw@strlen.de>

commit 07ace0bbe03b3d8e85869af1dec5e4087b1d57b8 upstream

pipapo relies on kmalloc(0) returning ZERO_SIZE_PTR (i.e., not NULL
but pointer is invalid).

Rework this to not call slab allocator when we'd request a 0-byte
allocation.
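
As a rough user-space analogue of the rework described above (not the kernel
code itself), the sketch below skips the allocator entirely when there is
nothing to copy and leaves the destination pointer NULL. The names
clone_rules() and struct rule are hypothetical, and malloc() merely stands in
for kvmalloc_array().

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one mapping-table entry. */
struct rule {
	unsigned int to;
	unsigned int n;
};

/*
 * Copy 'count' rules from 'src' into a freshly allocated array.
 * Returns 0 on success, -1 on overflow or allocation failure.
 * With count == 0 the allocator is never called and *dst is NULL.
 */
int clone_rules(const struct rule *src, size_t count, struct rule **dst)
{
	assert(dst != NULL);

	if (count == 0) {
		*dst = NULL;	/* nothing to copy, nothing to allocate */
		return 0;
	}

	/* Same style of clamp as the INT_MAX check in the patch. */
	if (count > INT_MAX / sizeof(*src))
		return -1;

	*dst = malloc(count * sizeof(*src));
	if (!*dst)
		return -1;

	memcpy(*dst, src, count * sizeof(*src));
	return 0;
}
```

Callers then treat a NULL table with a zero count as a valid empty state
instead of relying on a special non-NULL marker pointer.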

Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mukul Sikka <mukul.sikka@broadcom.com>
Signed-off-by: Brennan Lamoreaux <brennan.lamoreaux@broadcom.com>
[Keerthana: In older stable branches (v6.6 and earlier), the allocation logic in
pipapo_clone() still relies on `src->rules` rather than `src->rules_alloc`
(introduced in v6.9 via 9f439bd6ef4f). Consequently, the previously
backported INT_MAX clamping check uses `src->rules`. This patch correctly
moves that `src->rules > (INT_MAX / ...)` check inside the new
`if (src->rules > 0)` block]
Signed-off-by: Keerthana K <keerthana.kalyanasundaram@broadcom.com>
---
 net/netfilter/nft_set_pipapo.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index a4fdd1587bb3..83606dfde033 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -524,6 +524,9 @@ static struct nft_pipapo_elem *pipapo_get(const struct net *net,
 	struct nft_pipapo_field *f;
 	int i;
 
+	if (m->bsize_max == 0)
+		return ret;
+
 	res_map = kmalloc_array(m->bsize_max, sizeof(*res_map), GFP_ATOMIC);
 	if (!res_map) {
 		ret = ERR_PTR(-ENOMEM);
@@ -1363,14 +1366,20 @@ static struct nft_pipapo_match *pipapo_clone(struct nft_pipapo_match *old)
 		       src->bsize * sizeof(*dst->lt) *
 		       src->groups * NFT_PIPAPO_BUCKETS(src->bb));
 
-		if (src->rules > (INT_MAX / sizeof(*src->mt)))
-			goto out_mt;
+		if (src->rules > 0) {
+			if (src->rules > (INT_MAX / sizeof(*src->mt)))
+				goto out_mt;
 
-		dst->mt = kvmalloc(src->rules * sizeof(*src->mt), GFP_KERNEL);
-		if (!dst->mt)
-			goto out_mt;
+			dst->mt = kvmalloc_array(src->rules, sizeof(*src->mt),
+						 GFP_KERNEL);
+			if (!dst->mt)
+				goto out_mt;
+
+			memcpy(dst->mt, src->mt, src->rules * sizeof(*src->mt));
+		} else {
+			dst->mt = NULL;
+		}
 
-		memcpy(dst->mt, src->mt, src->rules * sizeof(*src->mt));
 		src++;
 		dst++;
 	}
-- 
2.43.7



* [PATCH v2 v5.15-v6.1] netfilter: nft_set_pipapo: do not rely on ZERO_SIZE_PTR
From: Keerthana K @ 2026-04-14  6:31 UTC (permalink / raw)
  To: stable, gregkh
  Cc: pablo, kadlec, fw, davem, edumazet, kuba, pabeni, netfilter-devel,
	coreteam, netdev, linux-kernel, ajay.kaher, alexey.makhalov,
	vamsi-krishna.brahmajosyula, yin.ding, tapas.kundu,
	Stefano Brivio, Mukul Sikka, Brennan Lamoreaux, Keerthana K

From: Florian Westphal <fw@strlen.de>

commit 07ace0bbe03b3d8e85869af1dec5e4087b1d57b8 upstream

pipapo relies on kmalloc(0) returning ZERO_SIZE_PTR (i.e., not NULL
but pointer is invalid).

Rework this to not call slab allocator when we'd request a 0-byte
allocation.

Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mukul Sikka <mukul.sikka@broadcom.com>
Signed-off-by: Brennan Lamoreaux <brennan.lamoreaux@broadcom.com>
[Keerthana: In older stable branches (v6.6 and earlier), the allocation logic in
pipapo_clone() still relies on `src->rules` rather than `src->rules_alloc`
(introduced in v6.9 via 9f439bd6ef4f). Consequently, the previously
backported INT_MAX clamping check uses `src->rules`. This patch correctly
moves that `src->rules > (INT_MAX / ...)` check inside the new
`if (src->rules > 0)` block]
Signed-off-by: Keerthana K <keerthana.kalyanasundaram@broadcom.com>
---
Changes in v2:
- Fixed patch apply failure

v1: https://lore.kernel.org/all/20260413043247.3327855-1-keerthana.kalyanasundaram@broadcom.com/

 net/netfilter/nft_set_pipapo.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index 863162c82330..2072c89a467d 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -525,6 +525,8 @@ static struct nft_pipapo_elem *pipapo_get(const struct net *net,
 	int i;
 
 	m = priv->clone;
+	if (m->bsize_max == 0)
+		return ret;
 
 	res_map = kmalloc_array(m->bsize_max, sizeof(*res_map), GFP_ATOMIC);
 	if (!res_map) {
@@ -1365,14 +1367,20 @@ static struct nft_pipapo_match *pipapo_clone(struct nft_pipapo_match *old)
 		       src->bsize * sizeof(*dst->lt) *
 		       src->groups * NFT_PIPAPO_BUCKETS(src->bb));
 
-		if (src->rules > (INT_MAX / sizeof(*src->mt)))
-			goto out_mt;
+		if (src->rules > 0) {
+			if (src->rules > (INT_MAX / sizeof(*src->mt)))
+				goto out_mt;
+
+			dst->mt = kvmalloc_array(src->rules, sizeof(*src->mt),
+						 GFP_KERNEL);
+			if (!dst->mt)
+				goto out_mt;
 
-		dst->mt = kvmalloc(src->rules * sizeof(*src->mt), GFP_KERNEL);
-		if (!dst->mt)
-			goto out_mt;
+			memcpy(dst->mt, src->mt, src->rules * sizeof(*src->mt));
+		} else {
+			dst->mt = NULL;
+		}
 
-		memcpy(dst->mt, src->mt, src->rules * sizeof(*src->mt));
 		src++;
 		dst++;
 	}
-- 
2.43.7



* Re: [PATCH v2] rose: fix OOB reads on short CLEAR REQUEST frames
From: Eric Dumazet @ 2026-04-14  6:11 UTC (permalink / raw)
  To: Ashutosh Desai
  Cc: netdev, linux-hams, davem, kuba, pabeni, horms, linux-kernel
In-Reply-To: <177614667427.3606651.8700070406932922261@gmail.com>

On Mon, Apr 13, 2026 at 11:04 PM Ashutosh Desai
<ashutoshdesai993@gmail.com> wrote:
>
> rose_process_rx_frame() calls rose_decode() which reads skb->data[2]
> without any prior length check. For CLEAR REQUEST frames the state
> machines then read skb->data[3] and skb->data[4] as the cause and
> diagnostic bytes.
>
> A crafted 3-byte ROSE CLEAR REQUEST frame passes the minimum length
> gate in rose_route_frame() and reaches rose_process_rx_frame(), where
> rose_decode() reads one byte past the header and the state machines
> read two bytes past the valid buffer.
>
> Add a pskb_may_pull(skb, 3) check before rose_decode() to cover its
> skb->data[2] access, and a pskb_may_pull(skb, 5) check afterwards for
> the CLEAR REQUEST path to cover the cause and diagnostic reads.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Cc: stable@vger.kernel.org
> Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
> ---
> V1 -> V2: switch skb->len check to pskb_may_pull; also add
>           pskb_may_pull(skb, 3) before rose_decode() to cover its
>           skb->data[2] access
>
> v1: https://lore.kernel.org/netdev/20260409013246.2051746-1-ashutoshdesai993@gmail.com/
>
>  net/rose/rose_in.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/net/rose/rose_in.c b/net/rose/rose_in.c
> index 0276b393f0e5..b9f01a11e2df 100644
> --- a/net/rose/rose_in.c
> +++ b/net/rose/rose_in.c
> @@ -269,8 +269,18 @@ int rose_process_rx_frame(struct sock *sk, struct sk_buff *skb)
>  if (rose->state == ROSE_STATE_0)
>   return 0;
>
> +if (!pskb_may_pull(skb, 3)) {
> +kfree_skb(skb);
> +return 0;
> +}
> +
>  frametype = rose_decode(skb, &ns, &nr, &q, &d, &m);
>
> +if (frametype == ROSE_CLEAR_REQUEST && !pskb_may_pull(skb, 5)) {
> +kfree_skb(skb);
> +return 0;
> +}
> +
>  switch (rose->state) {
>  case ROSE_STATE_1:
>  queued = rose_state1_machine(sk, skb, frametype);
> --
> 2.34.1

rose_process_rx_frame() callers already call kfree_skb(skb) if
rose_process_rx_frame() returns 0, so your patch would add double-frees.
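
The ownership rule behind this objection can be modelled in user space. This
is a minimal sketch, not the ROSE code: struct frame, frame_free(),
process_frame() and receive() are hypothetical names, and the 'freed' counter
stands in for kfree_skb().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb; 'freed' counts release calls. */
struct frame {
	const unsigned char *data;
	size_t len;
	int freed;
};

void frame_free(struct frame *f)
{
	f->freed++;
}

/*
 * Returns 1 if the frame was consumed (queued), 0 if it was not.
 * On a 0 return the frame still belongs to the caller, so this
 * function must not free it on the error path.
 */
int process_frame(struct frame *f)
{
	assert(f != NULL);

	if (f->len < 3)
		return 0;	/* too short: reject, but do not free here */

	return 1;		/* pretend the frame was queued */
}

/* Caller side: frees exactly when the handler did not consume the frame. */
void receive(struct frame *f)
{
	if (process_frame(f) == 0)
		frame_free(f);
}
```

If process_frame() also called frame_free() before returning 0, the counter
would reach 2 for a short frame, which is the double free in miniature.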


Your patch is white-space mangled.

Please take a look at Documentation/process/maintainer-netdev.rst

Preparing changes
-----------------

Attention to detail is important.  Re-read your own work as if you were the
reviewer.  You can start with using ``checkpatch.pl``, perhaps even with
the ``--strict`` flag.  But do not be mindlessly robotic in doing so.
If your change is a bug fix, make sure your commit log indicates the
end-user visible symptom, the underlying reason as to why it happens,
and then if necessary, explain why the fix proposed is the best way to
get things done.  Don't mangle whitespace, and as is common, don't
mis-indent function arguments that span multiple lines.  If it is your
first patch, mail it to yourself so you can test apply it to an
unpatched tree to confirm infrastructure didn't mangle it.

Finally, go back and read
:ref:`Documentation/process/submitting-patches.rst <submittingpatches>`
to be sure you are not repeating some common mistake documented there.

Also:

Indicating target tree
~~~~~~~~~~~~~~~~~~~~~~

To help maintainers and CI bots you should explicitly mark which tree
your patch is targeting. Assuming that you use git, use the prefix
flag::

  git format-patch --subject-prefix='PATCH net-next' start..finish

Use ``net`` instead of ``net-next`` (always lower case) in the above for
bug-fix ``net`` content.

Please

pw-bot: cr


* [PATCH v2] rose: fix OOB reads on short CLEAR REQUEST frames
From: Ashutosh Desai @ 2026-04-14  6:04 UTC (permalink / raw)
  To: netdev; +Cc: linux-hams, davem, edumazet, kuba, pabeni, horms, linux-kernel

rose_process_rx_frame() calls rose_decode() which reads skb->data[2]
without any prior length check. For CLEAR REQUEST frames the state
machines then read skb->data[3] and skb->data[4] as the cause and
diagnostic bytes.

A crafted 3-byte ROSE CLEAR REQUEST frame passes the minimum length
gate in rose_route_frame() and reaches rose_process_rx_frame(), where
rose_decode() reads one byte past the header and the state machines
read two bytes past the valid buffer.

Add a pskb_may_pull(skb, 3) check before rose_decode() to cover its
skb->data[2] access, and a pskb_may_pull(skb, 5) check afterwards for
the CLEAR REQUEST path to cover the cause and diagnostic reads.
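
A user-space sketch of the two-stage length check described above, assuming a
flat byte buffer in place of an skb. parse_clear_request(), struct clear_info
and FRAME_CLEAR_REQUEST are hypothetical names, and the value 0x13 is only a
placeholder frame type for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_CLEAR_REQUEST 0x13	/* placeholder frame type for this sketch */

struct clear_info {
	unsigned char cause;
	unsigned char diagnostic;
};

/*
 * Returns 0 and fills 'out' for a well-formed CLEAR REQUEST,
 * 1 for any other frame type, and -1 if the buffer is too short
 * for the bytes that would be read.
 */
int parse_clear_request(const unsigned char *data, size_t len,
			struct clear_info *out)
{
	assert(data != NULL && out != NULL);

	if (len < 3)		/* data[2] holds the frame type */
		return -1;

	if (data[2] != FRAME_CLEAR_REQUEST)
		return 1;

	if (len < 5)		/* data[3] and data[4]: cause, diagnostic */
		return -1;

	out->cause = data[3];
	out->diagnostic = data[4];
	return 0;
}
```

The sketch only reports the error to its caller; who releases the buffer
afterwards is a separate question that depends on the calling convention.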

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
---
V1 -> V2: switch skb->len check to pskb_may_pull; also add
          pskb_may_pull(skb, 3) before rose_decode() to cover its
          skb->data[2] access

v1: https://lore.kernel.org/netdev/20260409013246.2051746-1-ashutoshdesai993@gmail.com/

 net/rose/rose_in.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/rose/rose_in.c b/net/rose/rose_in.c
index 0276b393f0e5..b9f01a11e2df 100644
--- a/net/rose/rose_in.c
+++ b/net/rose/rose_in.c
@@ -269,8 +269,18 @@ int rose_process_rx_frame(struct sock *sk, struct sk_buff *skb)
 if (rose->state == ROSE_STATE_0)
  return 0;
 
+if (!pskb_may_pull(skb, 3)) {
+kfree_skb(skb);
+return 0;
+}
+
 frametype = rose_decode(skb, &ns, &nr, &q, &d, &m);
 
+if (frametype == ROSE_CLEAR_REQUEST && !pskb_may_pull(skb, 5)) {
+kfree_skb(skb);
+return 0;
+}
+
 switch (rose->state) {
 case ROSE_STATE_1:
 queued = rose_state1_machine(sk, skb, frametype);
-- 
2.34.1


* Re: [PATCH net-next v2 3/3] selftests/net: Add additional test coverage in nk_qlease
From: Nikolay Aleksandrov @ 2026-04-14  5:59 UTC (permalink / raw)
  To: Daniel Borkmann, netdev; +Cc: kuba, dw, pabeni
In-Reply-To: <20260413220809.604592-4-daniel@iogearbox.net>

On 4/14/26 01:08, Daniel Borkmann wrote:
> Add further netkit queue-lease coverage for netns lifecycle of the guest
> and physical halves, channel resize across active leases, single-device
> and multi-lessee scenarios, L3 mode operation, lease capacity exhaustion,
> and corner-cases of e.g. queue-create rejection paths. Also make the tests
> more robust by removing the time.sleep(0.1) after netns deletion and turn
> them into a wait loop.
> 
> Full test run:
> 
>   # ./nk_qlease.py
>   TAP version 13
>   1..45
>   ok 1 nk_qlease.test_remove_phys
>   ok 2 nk_qlease.test_double_lease
>   ok 3 nk_qlease.test_virtual_lessor
>   ok 4 nk_qlease.test_phys_lessee
>   ok 5 nk_qlease.test_different_lessors
>   ok 6 nk_qlease.test_queue_out_of_range
>   ok 7 nk_qlease.test_resize_leased
>   ok 8 nk_qlease.test_self_lease
>   ok 9 nk_qlease.test_create_tx_type
>   ok 10 nk_qlease.test_create_primary
>   ok 11 nk_qlease.test_create_limit
>   ok 12 nk_qlease.test_link_flap_phys
>   ok 13 nk_qlease.test_queue_get_virtual
>   ok 14 nk_qlease.test_remove_virt_first
>   ok 15 nk_qlease.test_multiple_leases
>   ok 16 nk_qlease.test_lease_queue_tx_type
>   ok 17 nk_qlease.test_invalid_netns
>   ok 18 nk_qlease.test_invalid_phys_ifindex
>   ok 19 nk_qlease.test_multi_netkit_remove_phys
>   ok 20 nk_qlease.test_single_remove_phys
>   ok 21 nk_qlease.test_link_flap_virt
>   ok 22 nk_qlease.test_phys_queue_no_lease
>   ok 23 nk_qlease.test_same_ns_lease
>   ok 24 nk_qlease.test_resize_after_unlease
>   ok 25 nk_qlease.test_lease_queue_zero
>   ok 26 nk_qlease.test_release_and_reuse
>   ok 27 nk_qlease.test_veth_queue_create
>   ok 28 nk_qlease.test_two_netkits_same_queue
>   ok 29 nk_qlease.test_l3_mode_lease
>   ok 30 nk_qlease.test_single_double_lease
>   ok 31 nk_qlease.test_single_different_lessors
>   ok 32 nk_qlease.test_cross_ns_netns_id
>   ok 33 nk_qlease.test_delete_guest_netns
>   ok 34 nk_qlease.test_move_guest_netns
>   ok 35 nk_qlease.test_resize_phys_no_reduction
>   ok 36 nk_qlease.test_delete_one_netkit_of_two
>   ok 37 nk_qlease.test_bind_rx_leased_phys_queue
>   ok 38 nk_qlease.test_resize_phys_shrink_past_leased
>   ok 39 nk_qlease.test_resize_virt_not_supported
>   ok 40 nk_qlease.test_lease_devices_down
>   ok 41 nk_qlease.test_lease_capacity_exhaustion
>   ok 42 nk_qlease.test_resize_phys_up
>   ok 43 nk_qlease.test_multi_ns_lease
>   ok 44 nk_qlease.test_multi_ns_delete_one
>   ok 45 nk_qlease.test_move_phys_netns
>   # Totals: pass:45 fail:0 xfail:0 xpass:0 skip:0 error:0
> 
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---
>  tools/testing/selftests/net/nk_qlease.py | 951 ++++++++++++++++++++++-
>  1 file changed, 946 insertions(+), 5 deletions(-)
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>



* Re: [PATCH net-next v2 2/3] selftests/net: Split netdevsim tests from HW tests in nk_qlease
From: Nikolay Aleksandrov @ 2026-04-14  5:58 UTC (permalink / raw)
  To: Daniel Borkmann, netdev; +Cc: kuba, dw, pabeni
In-Reply-To: <20260413220809.604592-3-daniel@iogearbox.net>

On 4/14/26 01:08, Daniel Borkmann wrote:
> As pointed out in 3d2c3d2eea9a ("selftests: net: py: explicitly forbid
> multiple ksft_run() calls"), ksft_run() cannot be called multiple times.
> 
> Move the netdevsim-based queue lease tests to selftests/net/ so that
> each file has exactly one ksft_run() call.
> 
> The HW tests (io_uring ZC RX, queue attrs, XDP with MP, destroy) remain
> in selftests/drivers/net/hw/.
> 
> Fixes: 65d657d80684 ("selftests/net: Add queue leasing tests with netkit")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Link: https://lore.kernel.org/netdev/20260409181950.7e099b6c@kernel.org
> ---
>  .../selftests/drivers/net/hw/nk_qlease.py     | 1142 ----------------
>  tools/testing/selftests/net/Makefile          |    1 +
>  tools/testing/selftests/net/nk_qlease.py      | 1168 +++++++++++++++++
>  3 files changed, 1169 insertions(+), 1142 deletions(-)
>  create mode 100755 tools/testing/selftests/net/nk_qlease.py
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>



* Re: [PATCH net-next v2 1/3] tools/ynl: Make YnlFamily closeable as a context manager
From: Nikolay Aleksandrov @ 2026-04-14  5:57 UTC (permalink / raw)
  To: Daniel Borkmann, netdev; +Cc: kuba, dw, pabeni
In-Reply-To: <20260413220809.604592-2-daniel@iogearbox.net>

On 4/14/26 01:08, Daniel Borkmann wrote:
> YnlFamily opens an AF_NETLINK socket in __init__ but has no way
> to release it other than leaving it to the GC. YnlFamily holds a
> self reference cycle through SpecFamily's self.family = self
> in its super().__init__() call, so refcount GC cannot reclaim
> it and the socket stays open until the cyclic GC runs.
> 
> If a test creates a guest netns, instantiates a YnlFamily inside
> it via NetNSEnter(), performs some test case work via Ynl, and
> then deletes the netns, then the 'ip netns del' only drops the
> mount binding and cleanup_net in the kernel never runs, so any
> subsequent test case assertions that objects got cleaned up would
> fail given this only gets triggered later via cyclic GC run.
> 
> Add an explicit close() that closes the netlink socket and wire
> up the __enter__/__exit__ so callers can scope the instance
> deterministically via 'with YnlFamily(...) as ynl: ...'.
> 
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---
>  tools/net/ynl/pyynl/lib/ynl.py | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/tools/net/ynl/pyynl/lib/ynl.py b/tools/net/ynl/pyynl/lib/ynl.py
> index 9c078599cea0..f63c6f828735 100644
> --- a/tools/net/ynl/pyynl/lib/ynl.py
> +++ b/tools/net/ynl/pyynl/lib/ynl.py
> @@ -731,6 +731,16 @@ class YnlFamily(SpecFamily):
>              bound_f = functools.partial(self._op, op_name)
>              setattr(self, op.ident_name, bound_f)
>  
> +    def close(self):
> +        if self.sock is not None:
> +            self.sock.close()
> +            self.sock = None
> +
> +    def __enter__(self):
> +        return self
> +
> +    def __exit__(self, exc_type, exc, tb):
> +        self.close()
>  
>      def ntf_subscribe(self, mcast_name):
>          mcast_id = self.nlproto.get_mcast_id(mcast_name, self.mcast_groups)

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>



* Re: [PATCH net-next v2 2/2] selftests/bpf: verify syncookie statistics in tcp_custom_syncookie
From: Kuniyuki Iwashima @ 2026-04-14  5:50 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: netdev, Eric Dumazet, Neal Cardwell, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, linux-kernel, bpf, linux-kselftest
In-Reply-To: <20260411013211.225834-2-jiayuan.chen@linux.dev>

On Fri, Apr 10, 2026 at 6:32 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> Add read_tcpext_snmp() helper to network_helpers which reads a
> TcpExt SNMP counter via nstat, and use it in the tcp_custom_syncookie
> test to verify that LINUX_MIB_SYNCOOKIESRECV is incremented and
> LINUX_MIB_SYNCOOKIESFAILED stays unchanged across a successful
> BPF custom syncookie validation.
>
> The delta is captured between start_server() and accept(), which
> covers the full SYN/ACK/cookie-check path for one connection.
>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>  tools/testing/selftests/bpf/network_helpers.c | 22 +++++++++++++++++++
>  tools/testing/selftests/bpf/network_helpers.h |  1 +
>  .../bpf/prog_tests/tcp_custom_syncookie.c     | 20 +++++++++++++++++

As you touch bpf selftest helper files, please rebase on bpf-next
to avoid possible conflicts and tag bpf-next in the Subject.

Change itself looks good.

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>

One nit below.


>  3 files changed, 43 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c
> index b82f572641b7..3388dd5112b6 100644
> --- a/tools/testing/selftests/bpf/network_helpers.c
> +++ b/tools/testing/selftests/bpf/network_helpers.c
> @@ -621,6 +621,28 @@ int get_socket_local_port(int sock_fd)
>         return -1;
>  }
>
> +int read_tcpext_snmp(const char *name, unsigned long *val)
> +{
> +       char cmd[128], buf[128];
> +       int ret = 0;
> +       FILE *f;
> +
> +       snprintf(cmd, sizeof(cmd),
> +                "nstat -az TcpExt%s | awk '/TcpExt/ {print $2}'", name);
> +       f = popen(cmd, "r");
> +       if (!f)
> +               return -errno;
> +
> +       if (!fgets(buf, sizeof(buf), f)) {
> +               ret = ferror(f) ? -errno : -ENODATA;
> +               goto out;
> +       }
> +       *val = strtoul(buf, NULL, 10);
> +out:
> +       pclose(f);
> +       return ret;
> +}
> +
>  int get_hw_ring_size(char *ifname, struct ethtool_ringparam *ring_param)
>  {
>         struct ifreq ifr = {0};
> diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h
> index 79a010c88e11..c53cd781df6e 100644
> --- a/tools/testing/selftests/bpf/network_helpers.h
> +++ b/tools/testing/selftests/bpf/network_helpers.h
> @@ -84,6 +84,7 @@ int make_sockaddr(int family, const char *addr_str, __u16 port,
>                   struct sockaddr_storage *addr, socklen_t *len);
>  char *ping_command(int family);
>  int get_socket_local_port(int sock_fd);
> +int read_tcpext_snmp(const char *name, unsigned long *val);
>  int get_hw_ring_size(char *ifname, struct ethtool_ringparam *ring_param);
>  int set_hw_ring_size(char *ifname, struct ethtool_ringparam *ring_param);
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
> index eaf441dc7e79..6adfb4b892f8 100644
> --- a/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
> +++ b/tools/testing/selftests/bpf/prog_tests/tcp_custom_syncookie.c
> @@ -91,12 +91,21 @@ static void transfer_message(int sender, int receiver)
>
>  static void create_connection(struct test_tcp_custom_syncookie_case *test_case)
>  {
> +       unsigned long recv_before, recv_after;
> +       unsigned long failed_before, failed_after;

While at it, please keep reverse xmas tree order.
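
For readers unfamiliar with the convention: reverse xmas tree order means
local variable declarations are sorted from the longest line to the shortest.
A small illustration reusing the declarations from the quoted hunk follows;
deltas_ok() is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: before/after pairs are passed as two-element arrays. */
bool deltas_ok(const unsigned long recv[2], const unsigned long failed[2])
{
	/* Declarations run from the longest line to the shortest. */
	unsigned long failed_before, failed_after;
	unsigned long recv_before, recv_after;
	bool ok;

	assert(recv != NULL && failed != NULL);

	failed_before = failed[0];
	failed_after = failed[1];
	recv_before = recv[0];
	recv_after = recv[1];

	/* One cookie received, none failed. */
	ok = (recv_after - recv_before == 1) &&
	     (failed_after - failed_before == 0);
	return ok;
}
```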


>         int server, client, child;
>
>         server = start_server(test_case->family, test_case->type, test_case->addr, 0, 0);
>         if (!ASSERT_NEQ(server, -1, "start_server"))
>                 return;
>
> +       if (!ASSERT_OK(read_tcpext_snmp("SyncookiesRecv", &recv_before),
> +                      "read SyncookiesRecv before"))
> +               goto close_server;
> +       if (!ASSERT_OK(read_tcpext_snmp("SyncookiesFailed", &failed_before),
> +                      "read SyncookiesFailed before"))
> +               goto close_server;
> +
>         client = connect_to_fd(server, 0);
>         if (!ASSERT_NEQ(client, -1, "connect_to_fd"))
>                 goto close_server;
> @@ -105,9 +114,20 @@ static void create_connection(struct test_tcp_custom_syncookie_case *test_case)
>         if (!ASSERT_NEQ(child, -1, "accept"))
>                 goto close_client;
>
> +       if (!ASSERT_OK(read_tcpext_snmp("SyncookiesRecv", &recv_after),
> +                      "read SyncookiesRecv after"))
> +               goto close_child;
> +       if (!ASSERT_OK(read_tcpext_snmp("SyncookiesFailed", &failed_after),
> +                      "read SyncookiesFailed after"))
> +               goto close_child;
> +
> +       ASSERT_EQ(recv_after - recv_before, 1, "SyncookiesRecv delta");
> +       ASSERT_EQ(failed_after - failed_before, 0, "SyncookiesFailed delta");
> +
>         transfer_message(client, child);
>         transfer_message(child, client);
>
> +close_child:
>         close(child);
>  close_client:
>         close(client);
> --
> 2.43.0
>


* Re: [PATCH net-next v2 1/2] net: add missing syncookie statistics for BPF custom syncookies
From: Kuniyuki Iwashima @ 2026-04-14  5:38 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: netdev, Eric Dumazet, Neal Cardwell, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern,
	Andrii Nakryiko, Eduard Zingerman, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, linux-kernel, bpf, linux-kselftest
In-Reply-To: <20260411013211.225834-1-jiayuan.chen@linux.dev>

On Fri, Apr 10, 2026 at 6:32 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>
> 1. Replace IS_ENABLED(CONFIG_BPF) with CONFIG_BPF_SYSCALL for
>    cookie_bpf_ok() and cookie_bpf_check(). CONFIG_BPF is selected by
>    CONFIG_NET unconditionally, so IS_ENABLED(CONFIG_BPF) is always
>    true and provides no real guard. CONFIG_BPF_SYSCALL is the correct
>    config for BPF program functionality.
>
> 2. Remove the CONFIG_BPF_SYSCALL guard around struct bpf_tcp_req_attrs.
>    This struct is referenced by bpf_sk_assign_tcp_reqsk() in
>    net/core/filter.c which is compiled unconditionally, so wrapping
>    the definition in a config guard could cause build failures when
>    CONFIG_BPF_SYSCALL=n.
>
> 3. Fix mismatched declaration of cookie_bpf_check() between the
>    CONFIG_BPF_SYSCALL and stub paths: the real definition takes
>    'struct net *net' but the declaration in the header did not.
>    Add the net parameter to the declaration and all call sites.
>
> 4. Add missing LINUX_MIB_SYNCOOKIESRECV and LINUX_MIB_SYNCOOKIESFAILED
>    statistics in cookie_bpf_check(), so that BPF custom syncookie
>    validation is accounted for in SNMP counters just like the
>    non-BPF path.
>
> Compile-tested with CONFIG_BPF_SYSCALL=y and CONFIG_BPF_SYSCALL
> not set.
>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>

Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>


* Re: [PATCH net v2 1/1] af_unix: Reject SIOCATMARK on non-stream sockets
From: Kuniyuki Iwashima @ 2026-04-14  5:33 UTC (permalink / raw)
  To: Ren Wei
  Cc: netdev, davem, edumazet, kuba, pabeni, horms, rao.shoaib,
	yifanwucs, tomapufckgml, yuantan098, bird, enjou1224z,
	wangjiexun2025
In-Reply-To: <20260413122916.1479959-1-n05ec@lzu.edu.cn>

On Mon, Apr 13, 2026 at 5:29 AM Ren Wei <n05ec@lzu.edu.cn> wrote:
>
> From: Jiexun Wang <wangjiexun2025@gmail.com>
>
> SIOCATMARK reports whether the receive queue is at the urgent mark for
> MSG_OOB.
>
> In AF_UNIX, MSG_OOB is supported only for SOCK_STREAM sockets.
> SOCK_DGRAM and SOCK_SEQPACKET reject MSG_OOB in sendmsg() and recvmsg(),
> so they should not support SIOCATMARK either.
>
> Return -EOPNOTSUPP for non-stream sockets before checking the receive
> queue.
>
> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
> Reported-by: Yifan Wu <yifanwucs@gmail.com>
> Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> Co-developed-by: Yuan Tan <yuantan098@gmail.com>
> Signed-off-by: Yuan Tan <yuantan098@gmail.com>
> Suggested-by: Xin Liu <bird@lzu.edu.cn>

Please read this guideline again.
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by

Co-developed-by is not where you mention someone who
developed a tool to find a bug, and Suggested-by is not where
you mention someone who funds your research.
https://lore.kernel.org/netdev/7c26a74d-90c5-4520-a10a-22f06e098b86@gmail.com/

When you just copy my fix and modify the commit message,
the two tags are inappropriate.


> Tested-by: Ren Wei <enjou1224z@gmail.com>
> Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com>
> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
> ---
> Changes in v2:
> - Rework the fix based on maintainer feedback.
> - Drop the receive-queue locking approach and reject SIOCATMARK on
>   non-stream sockets instead, since it is only meaningful for MSG_OOB.
> - V1 link: https://lore.kernel.org/netdev/f6cbbc8da90e95584847b5ceb60aae830d1631c2.1775731983.git.wangjiexun2025@gmail.com/
>
>  net/unix/af_unix.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index b23c33df8b46..09d43b4813b1 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -3300,6 +3300,9 @@ static int unix_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
>                         struct sk_buff *skb;
>                         int answ = 0;
>
> +                       if (sk->sk_type != SOCK_STREAM)
> +                               return -EOPNOTSUPP;
> +
>                         mutex_lock(&u->iolock);
>
>                         skb = skb_peek(&sk->sk_receive_queue);
> --
> 2.34.1
>


* [PATCH v4] nfc: hci: fix out-of-bounds read in HCP header parsing
From: Ashutosh Desai @ 2026-04-14  5:24 UTC (permalink / raw)
  To: netdev; +Cc: kuba, edumazet, davem, pabeni, horms, linux-kernel

nfc_hci_recv_from_llc() and nci_hci_data_received_cb() cast skb->data
to struct hcp_packet and read the message header byte without checking
that enough data is present in the linear sk_buff area. A malicious NFC
peer can send a 1-byte HCP frame that passes through the SHDLC layer
and reaches these functions, causing an out-of-bounds heap read.

Fix this by adding pskb_may_pull() before each cast to ensure the full
2-byte HCP header is pulled into the linear area before it is accessed.
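
The check-before-cast idea can be sketched in user space with a plain byte
buffer. hcp_msg_type() and HCP_HEADER_LEN are hypothetical names, and the
layout (message type in the top two bits of the second byte) is an assumption
made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define HCP_HEADER_LEN 2	/* packet header byte + message header byte */

/*
 * Returns the 2-bit message type from the message header, or -1 if
 * the buffer is shorter than the 2-byte HCP header.
 */
int hcp_msg_type(const unsigned char *data, size_t len)
{
	assert(data != NULL);

	if (len < HCP_HEADER_LEN)
		return -1;	/* a 1-byte frame stops here */

	/* Assumed layout: type lives in the top two bits of byte 1. */
	return (data[1] & 0xc0) >> 6;
}
```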

Fixes: 8b8d2e08bf0d ("NFC: HCI support")
Fixes: 11f54f228643 ("NFC: nci: Add HCI over NCI protocol support")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
---
V3 -> V4: add Fixes tags
V2 -> V3: drop redundant checks from nfc_hci_msg_rx_work/nci_hci_msg_rx_work,
          remove incorrect Suggested-by tag
V1 -> V2: switch skb->len check to pskb_may_pull

v3: https://lore.kernel.org/netdev/20260413024329.3293075-1-ashutoshdesai993@gmail.com/
v2: https://lore.kernel.org/netdev/20260409150825.2217133-1-ashutoshdesai993@gmail.com/
v1: https://lore.kernel.org/netdev/20260408223113.2009304-1-ashutoshdesai993@gmail.com/

 net/nfc/hci/core.c | 5 +++++
 net/nfc/nci/hci.c  | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/net/nfc/hci/core.c b/net/nfc/hci/core.c
index 0d33c81a15fe..cd9cf6c94a50 100644
--- a/net/nfc/hci/core.c
+++ b/net/nfc/hci/core.c
@@ -904,6 +904,11 @@ static void nfc_hci_recv_from_llc(struct nfc_hci_dev *hdev, struct sk_buff *skb)
          * unblock waiting cmd context. Otherwise, enqueue to dispatch
          * in separate context where handler can also execute command.
          */
+if (!pskb_may_pull(hcp_skb, NFC_HCI_HCP_HEADER_LEN)) {
+kfree_skb(hcp_skb);
+return;
+}
+
 packet = (struct hcp_packet *)hcp_skb->data;
 type = HCP_MSG_GET_TYPE(packet->message.header);
 if (type == NFC_HCI_HCP_RESPONSE) {
diff --git a/net/nfc/nci/hci.c b/net/nfc/nci/hci.c
index 40ae8e5a7ec7..6e633da257d1 100644
--- a/net/nfc/nci/hci.c
+++ b/net/nfc/nci/hci.c
@@ -482,6 +482,11 @@ void nci_hci_data_received_cb(void *context,
          * unblock waiting cmd context. Otherwise, enqueue to dispatch
          * in separate context where handler can also execute command.
          */
+if (!pskb_may_pull(hcp_skb, NCI_HCI_HCP_HEADER_LEN)) {
+kfree_skb(hcp_skb);
+return;
+}
+
 packet = (struct nci_hcp_packet *)hcp_skb->data;
 type = NCI_HCP_MSG_GET_TYPE(packet->message.header);
 if (type == NCI_HCI_HCP_RESPONSE) {
-- 
2.34.1


* Re: [PATCH net] net: usb: cdc_ncm: reject negative chained NDP offsets
From: Greg Kroah-Hartman @ 2026-04-14  4:23 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: Oliver Neukum, linux-usb, netdev, linux-kernel, Oliver Neukum,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, stable
In-Reply-To: <87wlyavnl3.fsf@miraculix.mork.no>

On Mon, Apr 13, 2026 at 06:20:40PM +0200, Bjørn Mork wrote:
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
> > On Mon, Apr 13, 2026 at 02:11:50PM +0200, Oliver Neukum wrote:
> >> On 13.04.26 12:43, Greg Kroah-Hartman wrote:
> >> > On Mon, Apr 13, 2026 at 10:36:19AM +0200, Oliver Neukum wrote:
> >> > > 
> >> > > 
> >> > > On 11.04.26 12:53, Greg Kroah-Hartman wrote:
> >> > > > cdc_ncm_rx_fixup() reads dwNextNdpIndex from each NDP32 to chain to the
> >> > > > next one.  The 32-bit value from the device is stored into the signed
> >> > > > int ndpoffset so that means values with the high bit set become
> >> > > 
> >> > > Well, then isn't the problem rather that you should not store an
> >> > > unsigned value in a signed variable?
> >> > 
> >> > No.  well, yes.  but no.
> >> > 
> >> > cdc_ncm_rx_verify_nth16() returns an int, and is negative if something
> >> > went wrong, so we need it that way, and then we need to check it, like
> >> > we properly do at the top of the loop, it's just that at the bottom of
> >> > the loop we also need to do the same exact thing.
> >> 
> >> Doesn't that suggest that cdc_ncm_rx_verify_nth16() is the problem?
> >> To be precise, the way it indicates errors?
> >> As this is an offset into a buffer and the header must be at the start
> >> of the buffer, isn't 0 the natural indication of an error?
> >
> > Maybe?  I really don't know, sorry, parsing the cdc_ncm buffer is not
> > something I looked too deeply into :)
> 
> Oliver is correct AFAICS. These functions could use 0 to indicate
> errors.  This would make the code simpler and cleaner.
> 
> The negative error return is just a sloppy choice I made at a time we
> only supported the 16bit versions.  Didn't anticipate 32bit support
> since it is optional and pointless.  But as usual, hardware vendors do
> surprising things.
> 
> Note that cdc_mbim.c must be updated if cdc_ncm_rx_verify_nth16() is
> changed.

Ok thanks for the background, I'll rework this after the merge window is
over.

greg k-h

^ permalink raw reply
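
The signedness problem discussed in this thread can be reproduced in
userspace. The sketch below is only an illustration: the function name
demo_ndp_offset_ok() and its parameters are hypothetical, and it does
not model the driver's loop or the use of 0 as an end-of-chain marker.
It shows why a device-supplied 32-bit offset with the high bit set has
to be rejected once it lands in a signed int.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Validate a device-supplied 32-bit "next NDP" offset before use.
 * raw_next is the value after little-endian conversion.
 */
static bool demo_ndp_offset_ok(uint32_t raw_next, int buf_len,
			       int ndp_hdr_len, int *ndpoffset)
{
	/*
	 * Converting an out-of-range uint32_t to int is
	 * implementation-defined in ISO C; on the usual two's-complement
	 * targets a value with the high bit set becomes negative.
	 */
	int off = (int)raw_next;

	if (off < 0)
		return false;	/* the case the patch rejects */
	if (off > buf_len - ndp_hdr_len)
		return false;	/* header would run past the buffer */

	*ndpoffset = off;
	return true;
}
```

Using 0 as the error value, as suggested in the thread, would let the
offset stay unsigned and make the first check unnecessary.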

* [PATCH 5.10.y] Revert "wifi: cfg80211: stop NAN and P2P in cfg80211_leave"
From: guocai.he.cn @ 2026-04-14  4:03 UTC (permalink / raw)
  To: gregkh
  Cc: stable, johannes.berg, netdev, regressions,
	miriam.rachel.korenblit, linux-kernel

From: Guocai He <guocai.he.cn@windriver.com>

This reverts commit d91240f24e831d3bd36954599ada6b456fb1bd0a which is commit
e1696c8bd0056bc1a5f7766f58ac333adc203e8a upstream.

The reverted patch introduced a deadlock. The locking situation in mainline is
totally different, so it is incorrect to directly backport the commit from mainline.

Signed-off-by: Guocai He <guocai.he.cn@windriver.com>
---
 net/wireless/core.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/wireless/core.c b/net/wireless/core.c
index cc2093f75468..3b25b78896a2 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -1207,10 +1207,8 @@ void __cfg80211_leave(struct cfg80211_registered_device *rdev,
 		/* must be handled by mac80211/driver, has no APIs */
 		break;
 	case NL80211_IFTYPE_P2P_DEVICE:
-		cfg80211_stop_p2p_device(rdev, wdev);
-		break;
 	case NL80211_IFTYPE_NAN:
-		cfg80211_stop_nan(rdev, wdev);
+		/* cannot happen, has no netdev */
 		break;
 	case NL80211_IFTYPE_AP_VLAN:
 	case NL80211_IFTYPE_MONITOR:
-- 
2.34.1


^ permalink raw reply related

* [PATCH net v2] net: reduce RFS/ARFS flow updates by checking LLC affinity
From: Chuang Wang @ 2026-04-14  3:59 UTC (permalink / raw)
  Cc: Chuang Wang, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Stanislav Fomichev, Kuniyuki Iwashima,
	Samiullah Khawaja, Hangbin Liu, Krishna Kumar, Neal Cardwell,
	Willem de Bruijn, netdev, linux-kernel

The current implementation of rps_record_sock_flow() updates the flow
table every time a socket is processed on a different CPU. In high-load
scenarios, especially with Accelerated RFS (ARFS), this triggers
frequent flow steering updates via ndo_rx_flow_steer.

For drivers like mlx5 that implement hardware flow steering, these
constant updates lead to significant contention on internal driver locks
(e.g., arfs_lock). This contention often becomes a performance
bottleneck that outweighs the steering benefits.

This patch introduces a cache-aware update strategy: the flow record is
only updated if the flow migrates across Last Level Cache (LLC)
boundaries. This minimizes expensive hardware reconfigurations while
preserving cache locality for the application. A new sysctl,
net.core.rps_feat_llc_affinity, is added to toggle this feature.

Performance Test Results:
The patch was tested in a K8s environment (AMD CPU 128*2, 16-core Pod
with CPU pinning, mlx5 NIC) using brpc[1] echo_server and rpc_press.

rpc_press Commands:

  for i in {1..8}; do
    ./rpc_press -proto=./echo.proto -method=example.EchoService.Echo
    -server=<IP>:8000 -input='{"message":"hello"}'
    -qps=0 -thread_num=512 -connection_type=pooled &
  done

Monitor mlx5e_rx_flow_steer frequency:

  /usr/share/bcc/tools/funccount -i 1 mlx5e_rx_flow_steer

Frequency of mlx5e_rx_flow_steer (via funccount[2]):

  Before: ~335,000 counts/sec
  After:   ~23,000 counts/sec (reduced by ~93%)

System Metrics (after enabling rps_feat_llc_affinity):

  CPU Utilization: 38% -> 32%
  CPU PSI (Pressure Stall Information): 20% -> 10%

These results demonstrate that filtering updates by LLC affinity
significantly reduces driver lock contention and improves overall
CPU efficiency under heavy network load.

[1] https://github.com/apache/brpc/
[2] https://github.com/iovisor/bcc/blob/master/tools/funccount.py

Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
---
v1 -> v2: add rps_feat_llc_affinity; add brpc tests

 include/net/rps.h          | 18 ++--------
 net/core/dev.c             | 72 ++++++++++++++++++++++++++++++++++++++
 net/core/sysctl_net_core.c | 34 ++++++++++++++++++
 3 files changed, 108 insertions(+), 16 deletions(-)

diff --git a/include/net/rps.h b/include/net/rps.h
index e33c6a2fa8bb..37bbb7009c36 100644
--- a/include/net/rps.h
+++ b/include/net/rps.h
@@ -12,6 +12,7 @@
 
 extern struct static_key_false rps_needed;
 extern struct static_key_false rfs_needed;
+extern struct static_key_false rps_feat_llc_affinity;
 
 /*
  * This structure holds an RPS map which can be of variable length.  The
@@ -55,22 +56,7 @@ struct rps_sock_flow_table {
 
 #define RPS_NO_CPU 0xffff
 
-static inline void rps_record_sock_flow(rps_tag_ptr tag_ptr, u32 hash)
-{
-	unsigned int index = hash & rps_tag_to_mask(tag_ptr);
-	u32 val = hash & ~net_hotdata.rps_cpu_mask;
-	struct rps_sock_flow_table *table;
-
-	/* We only give a hint, preemption can change CPU under us */
-	val |= raw_smp_processor_id();
-
-	table = rps_tag_to_table(tag_ptr);
-	/* The following WRITE_ONCE() is paired with the READ_ONCE()
-	 * here, and another one in get_rps_cpu().
-	 */
-	if (READ_ONCE(table[index].ent) != val)
-		WRITE_ONCE(table[index].ent, val);
-}
+void rps_record_sock_flow(rps_tag_ptr tag_ptr, u32 hash);
 
 static inline void _sock_rps_record_flow_hash(__u32 hash)
 {
diff --git a/net/core/dev.c b/net/core/dev.c
index 203dc36aaed5..630a7f21d8de 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4964,6 +4964,8 @@ struct static_key_false rps_needed __read_mostly;
 EXPORT_SYMBOL(rps_needed);
 struct static_key_false rfs_needed __read_mostly;
 EXPORT_SYMBOL(rfs_needed);
+struct static_key_false rps_feat_llc_affinity __read_mostly;
+EXPORT_SYMBOL(rps_feat_llc_affinity);
 
 static u32 rfs_slot(u32 hash, rps_tag_ptr tag_ptr)
 {
@@ -5175,6 +5177,76 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
 	return cpu;
 }
 
+/**
+ * rps_record_cond - Determine if RPS flow table should be updated
+ * @old_val: Previous flow record value
+ * @new_val: Target flow record value
+ *
+ * Returns true if the record needs an update.
+ */
+static inline bool rps_record_cond(u32 old_val, u32 new_val)
+{
+	u32 old_cpu = old_val & ~net_hotdata.rps_cpu_mask;
+	u32 new_cpu = new_val & ~net_hotdata.rps_cpu_mask;
+
+	if (old_val == new_val)
+		return false;
+
+	/*
+	 * RPS LLC Affinity Feature:
+	 * Reduce RFS/ARFS flow updates by checking LLC affinity.
+	 *
+	 * Frequent flow table updates can trigger constant hardware steering
+	 * reconfigurations (e.g., ndo_rx_flow_steer), leading to significant
+	 * contention on driver internal locks (like mlx5's arfs_lock).
+	 *
+	 * This strategy only updates the flow record if it migrates across LLC
+	 * boundaries. This minimizes expensive hardware updates while preserving
+	 * cache locality for the application.
+	 */
+	if (static_branch_unlikely(&rps_feat_llc_affinity)) {
+		/* Force update if the recorded CPU is invalid or has gone offline */
+		if (old_cpu >= nr_cpu_ids || !cpu_active(old_cpu))
+			return true;
+
+		/*
+		 * Force an update if the current task is no longer permitted
+		 * to run on the old_cpu.
+		 */
+		if (!cpumask_test_cpu(old_cpu, current->cpus_ptr))
+			return true;
+
+		/*
+		 * If CPUs do not share a cache, allow the update to prevent
+		 * expensive remote memory accesses and cache misses.
+		 */
+		if (!cpus_share_cache(old_cpu, new_cpu))
+			return true;
+
+		return false;
+	}
+
+	return true;
+}
+
+void rps_record_sock_flow(rps_tag_ptr tag_ptr, u32 hash)
+{
+	unsigned int index = hash & rps_tag_to_mask(tag_ptr);
+	u32 val = hash & ~net_hotdata.rps_cpu_mask;
+	struct rps_sock_flow_table *table;
+
+	/* We only give a hint, preemption can change CPU under us */
+	val |= raw_smp_processor_id();
+
+	table = rps_tag_to_table(tag_ptr);
+	/* The following WRITE_ONCE() is paired with the READ_ONCE()
+	 * here, and another one in get_rps_cpu().
+	 */
+	if (rps_record_cond(READ_ONCE(table[index].ent), val))
+		WRITE_ONCE(table[index].ent, val);
+}
+EXPORT_SYMBOL(rps_record_sock_flow);
+
 #ifdef CONFIG_RFS_ACCEL
 
 /**
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 502705e04649..dbc99aea7bb0 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -210,6 +210,32 @@ static int rps_sock_flow_sysctl(const struct ctl_table *table, int write,
 	kvfree_rcu_mightsleep(tofree);
 	return ret;
 }
+
+static int rps_feat_llc_affinity_sysctl(const struct ctl_table *table, int write,
+					void *buffer, size_t *lenp, loff_t *ppos)
+{
+	u8 curr_state;
+	int ret;
+	const struct ctl_table tmp = {
+		.data = &curr_state,
+		.maxlen = sizeof(curr_state),
+		.mode = table->mode,
+		.extra1 = table->extra1,
+		.extra2 = table->extra2
+	};
+
+	curr_state = static_branch_unlikely(&rps_feat_llc_affinity) ? 1 : 0;
+
+	ret = proc_dou8vec_minmax(&tmp, write, buffer, lenp, ppos);
+	if (write && ret == 0) {
+		if (curr_state && !static_branch_unlikely(&rps_feat_llc_affinity))
+			static_branch_enable(&rps_feat_llc_affinity);
+		else if (!curr_state && static_branch_unlikely(&rps_feat_llc_affinity))
+			static_branch_disable(&rps_feat_llc_affinity);
+	}
+
+	return ret;
+}
 #endif /* CONFIG_RPS */
 
 #ifdef CONFIG_NET_FLOW_LIMIT
@@ -531,6 +557,14 @@ static struct ctl_table net_core_table[] = {
 		.mode		= 0644,
 		.proc_handler	= rps_sock_flow_sysctl
 	},
+	{
+		.procname	= "rps_feat_llc_affinity",
+		.maxlen		= sizeof(u8),
+		.mode		= 0644,
+		.proc_handler   = rps_feat_llc_affinity_sysctl,
+		.extra1     = SYSCTL_ZERO,
+		.extra2     = SYSCTL_ONE
+	},
 #endif
 #ifdef CONFIG_NET_FLOW_LIMIT
 	{
-- 
2.47.3


^ permalink raw reply related
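
The update decision described in the commit message can be modelled in
userspace as follows. This is a sketch under stated assumptions, not the
kernel code: the topology table demo_llc_id[], DEMO_CPU_MASK and
DEMO_NR_CPUS are hypothetical, and the cpu_active() and cpus_ptr checks
are left out. The sketch takes the CPU id from the low bits selected by
the mask, which is how rps_record_sock_flow() packs the entry (hash bits
in ~mask, CPU id ORed into the low bits); note that the diff above
instead applies `& ~net_hotdata.rps_cpu_mask` when extracting old_cpu
and new_cpu.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_CPUS	8u
#define DEMO_CPU_MASK	0xfu	/* assumption: low 4 bits carry the CPU id */

/* Hypothetical topology: CPUs 0-3 share one LLC, CPUs 4-7 share another. */
static const unsigned int demo_llc_id[DEMO_NR_CPUS] = {
	0, 0, 0, 0, 1, 1, 1, 1
};

/* Return true if the flow table entry should be rewritten. */
static bool demo_record_cond(uint32_t old_val, uint32_t new_val,
			     bool llc_affinity)
{
	uint32_t old_cpu = old_val & DEMO_CPU_MASK;
	uint32_t new_cpu = new_val & DEMO_CPU_MASK;

	if (old_val == new_val)
		return false;		/* nothing changed */

	if (!llc_affinity)
		return true;		/* feature off: always update */

	/* An out-of-range CPU id forces an update. */
	if (old_cpu >= DEMO_NR_CPUS || new_cpu >= DEMO_NR_CPUS)
		return true;

	/* Update only when the flow crosses an LLC boundary. */
	return demo_llc_id[old_cpu] != demo_llc_id[new_cpu];
}
```

With the feature enabled, a move from CPU 1 to CPU 2 leaves the entry
alone, while a move from CPU 1 to CPU 5 rewrites it.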

* Re: [PATCH v11 net-next 5/7] octeontx2-af: npc: cn20k: add subbank search order control
From: Ratheesh Kannoth @ 2026-04-14  3:46 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, linux-kernel, linux-rdma, sgoutham, andrew+netdev, davem,
	edumazet, kuba, donald.hunter, horms, jiri, chuck.lever, matttbe,
	cjubran, saeedm, leon, tariqt, mbloch, dtatulea
In-Reply-To: <b9ffa72d-ebe2-4fd1-b668-93620f206179@redhat.com>

On 2026-04-13 at 18:26:00, Paolo Abeni (pabeni@redhat.com) wrote:
> > +	xa_for_each(&npc_priv.xa_sb_free, index, v) {
> > +		val = xa_to_value(v);
> > +		fslots[fcnt][0] = index;
> > +		fslots[fcnt][1] = val;
> > +		xa_erase(&npc_priv.xa_sb_free, index);
> > +		fcnt++;
> > +	}
> > +
> > +	/* xa_store() is done under lock. If xa_store fails
> > +	 * ,no rollback is planned as it might also fail.
>
> Why do you need to go through erase and add loop? Why can't you directly
> xa_store() the new value? Note that xa_store() can fail due to memory
> pressure.
>
> Avoiding the previous erase will prevent deallocation and re allocation
> and will avoid any reasonable xa_store() failure.
ACK.

>
> AFAICS there are a few more items reported by sashiko, please have a look:
>
> https://sashiko.dev/#/patchset/20260409025055.1664053-1-rkannoth%40marvell.com
>
> /P
>

Patch 1: [PATCH v11 net-next 1/7] octeontx2-af: npc: cn20k: debugfs enhancements

>"+static u64 dstats[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS] = {};
>Is it safe to use a static global array here? If multiple RVU AF devices exist
>in the system, it looks like they might share this array and overwrite each
>other's hardware statistics."
There is only a single RVU AF device in the system, so this comment is a
false positive and can be ignored.

>"Also, since there are no locks, could concurrent reads of the debugfs file by
>multiple threads cause a data race and corrupt the stored statistics?"
ACK. Added a mutex lock.

>"Since en_map is added to the global npc_priv structure, is it ever explicitly
>cleared during device initialization?"
npc_priv is defined as static, so all fields are zero by default.
Anyway, added a bitmap clear to get rid of this AI review warning.

>"Does updating dstats inline during the seq_file show callback break the
>statistics if the buffer overflows?"
ACK. Used the seq open with size() API to set the required buffer size at the
start. Added a check for buffer overflow by counting what is written into the
seq_file. dstats are "delta stats" and are used for debugging.

>"Does updating dstats inline during the seq_file show callback break the
>statistics if the buffer overflows?"
ACK.

>"When seq_read detects the overflow, it allocates a larger buffer and calls
>this show callback again from the beginning. During the retry,
>stats == dstats[bank][idx] will evaluate to true because the array was
>already updated, which skips all the previously overflowing entries."
ACK.

>"This isn't a bug, but using the ternary operator for its void side-effects
>is generally avoided in the kernel. Could this be written using an if/else
>block, or the assign_bit() macro instead?"
ACK.

>"Since en_map is added to the global npc_priv structure, is it ever explicitly
>cleared during device initialization?
>If the PCI device is unbound and rebound via sysfs, could en_map retain bits
>from the previous lifecycle and cause the new mismatch debugfs file to falsely
>report old enabled rules as mismatched?"
ACK. Reset bitmap to zero on deinit.

Patch 2: [PATCH v11 net-next 2/7] net/mlx5e: heap-allocate devlink param values

>Could we avoid the dynamic allocation entirely? Instead of storing all four
>values at once, could a single union devlink_param_value be declared inside
>the loop, populated, and immediately assigned to the respective config field?
>This would reduce the stack usage to a single union, avoiding both the frame
>size warning and the new memory allocation failure path."
ACK.

Patch 3: [PATCH v11 net-next 3/7] devlink: Change function syntax.

>"The commit message notes that exported APIs will be updated to pass the
>unions by pointer. Is this incomplete? Exported APIs like
>devl_param_driverinit_value_set and the validate callback in struct
>devlink_param still appear to take the union by value, which might continue
>to push the large structure onto the stack during validation."
ACK.

>"This isn't a bug, but it looks like an automated search and replace
>accidentally modified the word "values" in this comment."
ACK.

Patch 5: [PATCH v11 net-next 5/7] octeontx2-af: npc: cn20k: add subbank search order control

>Does this code overflow uslots[]?
>Looking at npc_priv_init(), num_subbanks is populated directly from the
>hardware register NPC_AF_CONST2. While it ensures the value is a power of 2,
>there doesn't seem to be a check that restricts it to MAX_NUM_SUB_BANKS.
>If the hardware reports a value larger than 32, ucnt could exceed the
>bounds of the stack-allocated uslots array.
ACK.

>Can this approach cause a permanent leak of the erased subbanks?
>If any xa_store() call fails, the function jumps to fail_used and returns
>without rolling back the state. Since the subbanks were already erased from
>the xarray into the local stack arrays, failing to re-insert them leaves the
>hardware allocator state partially empty and corrupts it for the lifetime of
>the system.
>Would it be safer to pre-allocate with xa_reserve(), or build new xarrays and
>swap them in on success?
Now using xa_store() without xa_erase(). If xa_store() fails due to memory
pressure, the error is unrecoverable because it can happen partway through the
for loop.

Patch 6: [PATCH v11 net-next 6/7] octeontx2-af: npc: cn20k: dynamically allocate and free default MCAM entries

>"The NIX_LF_DONT_FREE_DFT_IDXS flag is read directly from the mailbox message
>structure. Since mailbox messages can be initiated by Virtual Functions, what
>prevents an unprivileged VF from repeatedly allocating and freeing LFs with
>this flag set to exhaust the hardware MCAM entries?"
rvu_mbox_handler_nix_lf_alloc() calls npc_cn20k_dft_rules_alloc(), which
allocates default entries if and only if they are not already present. So if a
VF allocates and frees an LF without setting the flag, the default entries
won't be allocated or freed.

>Are the values in ptr[] virtual or physical MCAM indices? It appears that
>npc_cn20k_dft_rules_idx_get() retrieves virtual indices, but they are used
>here as direct indices into mcam->entry2pfvf_map and passed to
>npc_mcam_clear_bit(). Since those structures are sized for physical indices,
>could this cause an out-of-bounds memory corruption or an integer underflow?
Default entries are always allocated by setting the "ref_entry" field in
struct npc_mcam_alloc_entry_req. In that case,
rvu_mbox_handler_npc_mcam_alloc_entry() won't return a virtual MCAM index.

>If xa_erase() fails above and returns NULL, ptr[i] is not cleared and the
>code falls through to the free_rules label. Will this result in
>unconditionally calling npc_cn20k_idx_free() on the stale index, potentially
>causing a double-free?
ACK.

>Furthermore, if a VF manually frees its default MCAM rules via the
>NPC_MCAM_FREE_ENTRY mailbox command before this NIX LF teardown occurs,
>npc_cn20k_idx_free() will be called during that manual free. Since the manual
>free does not remove the index from xa_pf2dfl_rmap, could this teardown path
>fetch the same index and attempt to free it again?
Default MCAM rules are allocated in rvu_mbox_handler_nix_lf_alloc(), not
through NPC_MCAM_FREE_ENTRY. If a VF does this intentionally, it is a
violation. We have a dev_err() there, and it needs to be debugged on the user
side.

>Does the caller of this function properly handle negative error codes?
>For example, in npc_enadis_default_mce_entry() and
>npc_enadis_default_entries(), the returned index is passed directly to
>npc_enable_mcam_entry() and nix_update_mce_list() without checking for a
>negative value. This could lead to a WARN(1) in npc_enable_mcam_entry() or an
>out-of-bounds write in nix_update_mce_list().
This change is intentional: it helps find the flows that pass a wrong MCAM
index, so we want the splat from WARN(1).

>Here, index is a physical index from the bitmap iteration, but the values
>returned into dft_idxs[] by npc_cn20k_dft_rules_idx_get() are virtual
>indices. Will this comparison always fail, causing default rules to be
>erroneously physically freed?
No. Default indexes are not virtual. This is ensured at allocation time.

>Additionally, if the NIX LF is freed with NIX_LF_DONT_FREE_DFT_IDXS to
>preserve default rules, the ownership mapping is cleared here.
ACK.

>Upon
>re-allocation, npc_cn20k_dft_rules_alloc() will detect the rules in
>xa_pf2dfl_rmap but won't restore the ownership in entry2pfvf_map, meaning
>subsequent operations on these rules will fail verification.
ACK.

>Does this make the firmware layout dependent on the internal size of
>ikpu_action_entries?
Yes.
>If future kernel versions add new packet kinds and increase the size of
>this array, older firmware files will fail this bounds check and be rejected.
struct npc_kpu_profile_fwdata does not have a field to indicate the size of
ikpu_action_entries. We can't modify the structure, as that would break
backward compatibility with old firmware.

>Will this trigger a compiler warning or build failure on strict builds?
>The min() macro performs strict type checking, and fw_kpu->entries appears
>to be a signed int, while rvu->hw->npc_kpu_entries is an unsigned u16.
ACK.

>Could a negative value in fw_kpu->entries cause an integer underflow here?
>If fw_kpu->entries is read from untrusted firmware as a negative value, the
>offset calculation can underflow the size_t offset variable.
>This would bypass the subsequent bounds check because the wrapped offset
>plus hdr_sz wraps again to a small positive value.
>On the next iteration, calculating fw_kpu = fw->data + offset could result
>in an out-of-bounds memory read.
Added a check to return on an invalid value.

>Does modifying profile->kpu here corrupt the global default profile state?
>Earlier in the flow, profile->kpu is initialized to point to the global
>static array npc_kpu_profiles. Allocating device-managed memory into
>profile->kpu[kpu].cam2 overwrites this global state with device-specific
>pointers.
>When the device is unbound and the memory is freed, could this leave dangling
>pointers in the global array for other RVU devices in the system? The same
>applies to the legacy firmware parsing path where cam[entry] is overwritten.

We are not using profile->kpu after unbind, once the memory is freed. During
reinit, these fields are initialized again, so there is no issue.

>Could this printk formatter read past the end of the profile name?
>The name array in the firmware header is 32 bytes. If a user provides a
>firmware file with exactly 32 non-null characters, the string will lack a
>null terminator.
>Printing this with %s can leak adjacent heap memory contents into the kernel
>log. Using %.32s would ensure the read stays within bounds.
ACK.
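
The bounded-print suggestion above relies on standard printf behaviour:
with a precision, %s reads at most that many bytes and does not need a
NUL terminator within them. A minimal userspace sketch, where
demo_format_name() and DEMO_NAME_LEN are hypothetical names and the
32-byte size is taken from the review comment:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAME_LEN 32

/*
 * Format a fixed-size name field that may lack a NUL terminator.
 * "%.32s" stops after 32 bytes even if no terminator is found.
 */
static int demo_format_name(char *out, size_t out_len,
			    const char name[DEMO_NAME_LEN])
{
	return snprintf(out, out_len, "profile: %.32s", name);
}
```

Calling this with a name made of 32 non-NUL bytes produces the 9-byte
prefix followed by exactly those 32 bytes, and nothing from the memory
after the array.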

>Do these fields require an endianness conversion before use?
>The 16-bit values like dp0, dp1, and dp2 are read directly from the firmware
>blob.
>If the firmware payload uses little-endian byte order, applying these
>directly to hardware registers could result in misprogramming on big-endian
>architectures. Would it be safer to use le16_to_cpu() here?
The software is validated only for little endian, as the HW is little endian.
If big endian is required, we will provide separate firmware for it.
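
For reference, the conversion the reviewer asks about can be written so
that it does not depend on host byte order. The sketch below is a
userspace stand-in for what le16_to_cpu() achieves, assuming the
firmware blob stores 16-bit fields in little-endian order;
demo_le16_decode() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/*
 * Decode a little-endian 16-bit field byte by byte. The result is the
 * same on little-endian and big-endian hosts.
 */
static uint16_t demo_le16_decode(const uint8_t *p)
{
	return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}
```

On a little-endian host this yields the same value as a direct 16-bit
load, so it costs nothing on the platforms that are validated today.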

^ permalink raw reply

* [PATCH 5.15.y] Revert "wifi: cfg80211: stop NAN and P2P in cfg80211_leave"
From: guocai.he.cn @ 2026-04-14  3:20 UTC (permalink / raw)
  To: gregkh
  Cc: stable, johannes.berg, netdev, regressions,
	miriam.rachel.korenblit, linux-kernel

From: Guocai He <guocai.he.cn@windriver.com>

This reverts commit 31344ffecd7a34335ce2b52e8c205bce3cbfca4b which is commit
e1696c8bd0056bc1a5f7766f58ac333adc203e8a upstream.

The reverted patch introduced a deadlock. The locking situation in mainline is
totally different, so it is incorrect to directly backport the commit from mainline.

Signed-off-by: Guocai He <guocai.he.cn@windriver.com>
---
 net/wireless/core.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/wireless/core.c b/net/wireless/core.c
index 22e6fd12f201..58b91e9647c2 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -1300,10 +1300,8 @@ void __cfg80211_leave(struct cfg80211_registered_device *rdev,
 		__cfg80211_leave_ocb(rdev, dev);
 		break;
 	case NL80211_IFTYPE_P2P_DEVICE:
-		cfg80211_stop_p2p_device(rdev, wdev);
-		break;
 	case NL80211_IFTYPE_NAN:
-		cfg80211_stop_nan(rdev, wdev);
+		/* cannot happen, has no netdev */
 		break;
 	case NL80211_IFTYPE_AP_VLAN:
 	case NL80211_IFTYPE_MONITOR:
-- 
2.34.1


^ permalink raw reply related

* [PATCH 6.18.y] netfilter: conntrack: add missing netlink policy validations
From: Li hongliang @ 2026-04-14  3:31 UTC (permalink / raw)
  To: gregkh, stable, fw
  Cc: patches, linux-kernel, pablo, kadlec, davem, edumazet, kuba,
	pabeni, horms, kaber, netfilter-devel, coreteam, netdev, imv4bel

From: Florian Westphal <fw@strlen.de>

[ Upstream commit f900e1d77ee0ef87bfb5ab3fe60f0b3d8ad5ba05 ]

Hyunwoo Kim reports out-of-bounds access in sctp and ctnetlink.

These attributes are used by the kernel without any validation.
Extend the netlink policies accordingly.

Quoting the reporter:
  nlattr_to_sctp() assigns the user-supplied CTA_PROTOINFO_SCTP_STATE
  value directly to ct->proto.sctp.state without checking that it is
  within the valid range. [..]

  and: ... with exp->dir = 100, the access at
  ct->master->tuplehash[100] reads 5600 bytes past the start of a
  320-byte nf_conn object, causing a slab-out-of-bounds read confirmed by
  UBSAN.

Fixes: 076a0ca02644 ("netfilter: ctnetlink: add NAT support for expectations")
Fixes: a258860e01b8 ("netfilter: ctnetlink: add full support for SCTP to ctnetlink")
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Li hongliang <1468888505@139.com>
---
 net/netfilter/nf_conntrack_netlink.c    | 2 +-
 net/netfilter/nf_conntrack_proto_sctp.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 879413b9fa06..2bb9eb2d25fb 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -3465,7 +3465,7 @@ ctnetlink_change_expect(struct nf_conntrack_expect *x,
 
 #if IS_ENABLED(CONFIG_NF_NAT)
 static const struct nla_policy exp_nat_nla_policy[CTA_EXPECT_NAT_MAX+1] = {
-	[CTA_EXPECT_NAT_DIR]	= { .type = NLA_U32 },
+	[CTA_EXPECT_NAT_DIR]	= NLA_POLICY_MAX(NLA_BE32, IP_CT_DIR_REPLY),
 	[CTA_EXPECT_NAT_TUPLE]	= { .type = NLA_NESTED },
 };
 #endif
diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c
index 7c6f7c9f7332..645d2c43ebf7 100644
--- a/net/netfilter/nf_conntrack_proto_sctp.c
+++ b/net/netfilter/nf_conntrack_proto_sctp.c
@@ -582,7 +582,8 @@ static int sctp_to_nlattr(struct sk_buff *skb, struct nlattr *nla,
 }
 
 static const struct nla_policy sctp_nla_policy[CTA_PROTOINFO_SCTP_MAX+1] = {
-	[CTA_PROTOINFO_SCTP_STATE]	    = { .type = NLA_U8 },
+	[CTA_PROTOINFO_SCTP_STATE]	    = NLA_POLICY_MAX(NLA_U8,
+							 SCTP_CONNTRACK_HEARTBEAT_SENT),
 	[CTA_PROTOINFO_SCTP_VTAG_ORIGINAL]  = { .type = NLA_U32 },
 	[CTA_PROTOINFO_SCTP_VTAG_REPLY]     = { .type = NLA_U32 },
 };
-- 
2.34.1



^ permalink raw reply related
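
The effect of the tightened policies can be illustrated with a small
userspace model: a user-supplied value is range-checked before it is
used as an array index. All names below (demo_conn, demo_dir_valid(),
demo_get_tuple()) are hypothetical, and the value is assumed to be in
host byte order already; the real attribute is big-endian on the wire
and is validated by the netlink core, not by hand-written code like
this.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_dir {
	DEMO_DIR_ORIGINAL,
	DEMO_DIR_REPLY,
	DEMO_DIR_MAX
};

/* Stand-in for an object with one slot per direction. */
struct demo_conn {
	int tuplehash[DEMO_DIR_MAX];
};

/* Equivalent of a "maximum value" policy: accept 0..DEMO_DIR_REPLY. */
static bool demo_dir_valid(uint32_t dir)
{
	return dir <= DEMO_DIR_REPLY;
}

/* Read a slot only after the index has passed validation. */
static bool demo_get_tuple(const struct demo_conn *ct, uint32_t dir, int *out)
{
	if (!demo_dir_valid(dir))
		return false;	/* e.g. dir == 100 from the report */

	*out = ct->tuplehash[dir];
	return true;
}
```

With the check in place, the dir = 100 case from the report is refused
instead of reading far past the end of the object.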

* [PATCH iwl-next v2 2/2] idpf: implement pci error handlers
From: Emil Tantilov @ 2026-04-14  3:16 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, przemyslaw.kitszel, jay.bhat, ivan.d.barrera,
	aleksandr.loktionov, larysa.zaremba, anthony.l.nguyen,
	andrew+netdev, davem, edumazet, kuba, pabeni, aleksander.lobakin,
	linux-pci, madhu.chittim, decot, willemb, sheenamo, lukas
In-Reply-To: <20260414031631.2107-1-emil.s.tantilov@intel.com>

Add callbacks to handle PCI errors and FLR reset. When preparing to handle
reset on the bus, the driver must stop all operations that can lead to MMIO
access in order to prevent HW errors. To accomplish this introduce helper
idpf_reset_prepare() that gets called prior to FLR or when PCI error is
detected. Upon resume the recovery is done through the existing reset path
by starting the event task.

The following callbacks are implemented:
.reset_prepare runs the first portion of the generic reset path leading up
to the part where we wait for the reset to complete.
.reset_done/resume runs the recovery part of the reset handling.
.error_detected is the callback dealing with PCI errors, similar to the
prepare call, we stop all operations, prior to attempting a recovery.
.slot_reset is the callback attempting to restore the device, provided a
PCI reset was initiated by the AER driver.

Whereas previously the init logic guaranteed netdevs during reset, the
addition of idpf_detach_and_close() to the PCI callbacks flow makes it
possible for the function to be called without netdevs. Add check to
avoid NULL pointer dereference in that case.

Co-developed-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Jay Bhat <jay.bhat@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
---
 drivers/net/ethernet/intel/idpf/idpf.h      |   3 +
 drivers/net/ethernet/intel/idpf/idpf_lib.c  |  13 ++-
 drivers/net/ethernet/intel/idpf/idpf_main.c | 112 ++++++++++++++++++++
 3 files changed, 126 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf.h b/drivers/net/ethernet/intel/idpf/idpf.h
index 1d0e32e47e87..164d2f3e233a 100644
--- a/drivers/net/ethernet/intel/idpf/idpf.h
+++ b/drivers/net/ethernet/intel/idpf/idpf.h
@@ -88,6 +88,7 @@ enum idpf_state {
  * @IDPF_REMOVE_IN_PROG: Driver remove in progress
  * @IDPF_MB_INTR_MODE: Mailbox in interrupt mode
  * @IDPF_VC_CORE_INIT: virtchnl core has been init
+ * @IDPF_PCI_CB_RESET: Reset via the PCI callbacks
  * @IDPF_FLAGS_NBITS: Must be last
  */
 enum idpf_flags {
@@ -97,6 +98,7 @@ enum idpf_flags {
 	IDPF_REMOVE_IN_PROG,
 	IDPF_MB_INTR_MODE,
 	IDPF_VC_CORE_INIT,
+	IDPF_PCI_CB_RESET,
 	IDPF_FLAGS_NBITS,
 };
 
@@ -1012,4 +1014,5 @@ void idpf_idc_vdev_mtu_event(struct iidc_rdma_vport_dev_info *vdev_info,
 int idpf_add_del_fsteer_filters(struct idpf_adapter *adapter,
 				struct virtchnl2_flow_rule_add_del *rule,
 				enum virtchnl2_op opcode);
+void idpf_detach_and_close(struct idpf_adapter *adapter);
 #endif /* !_IDPF_H_ */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index 7988836fbae0..1e706beb0098 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -758,13 +758,16 @@ static int idpf_init_mac_addr(struct idpf_vport *vport,
 	return 0;
 }
 
-static void idpf_detach_and_close(struct idpf_adapter *adapter)
+void idpf_detach_and_close(struct idpf_adapter *adapter)
 {
 	int max_vports = adapter->max_vports;
 
 	for (int i = 0; i < max_vports; i++) {
 		struct net_device *netdev = adapter->netdevs[i];
 
+		if (!netdev)
+			continue;
+
 		/* If the interface is in detached state, that means the
 		 * previous reset was not handled successfully for this
 		 * vport.
@@ -1908,6 +1911,10 @@ static void idpf_init_hard_reset(struct idpf_adapter *adapter)
 
 	dev_info(dev, "Device HW Reset initiated\n");
 
+	/* Reset has already happened, skip to recovery. */
+	if (test_and_clear_bit(IDPF_PCI_CB_RESET, adapter->flags))
+		goto check_rst_complete;
+
 	/* Prepare for reset */
 	if (test_bit(IDPF_HR_DRV_LOAD, adapter->flags)) {
 		reg_ops->trigger_reset(adapter, IDPF_HR_DRV_LOAD);
@@ -1925,6 +1932,7 @@ static void idpf_init_hard_reset(struct idpf_adapter *adapter)
 		goto unlock_mutex;
 	}
 
+check_rst_complete:
 	/* Wait for reset to complete */
 	err = idpf_check_reset_complete(adapter, &adapter->reset_reg);
 	if (err) {
@@ -1984,7 +1992,8 @@ void idpf_vc_event_task(struct work_struct *work)
 	if (test_bit(IDPF_HR_FUNC_RESET, adapter->flags))
 		goto func_reset;
 
-	if (test_bit(IDPF_HR_DRV_LOAD, adapter->flags))
+	if (test_bit(IDPF_HR_DRV_LOAD, adapter->flags) ||
+	    test_bit(IDPF_PCI_CB_RESET, adapter->flags))
 		goto drv_load;
 
 	return;
diff --git a/drivers/net/ethernet/intel/idpf/idpf_main.c b/drivers/net/ethernet/intel/idpf/idpf_main.c
index d99f759c55e1..54fca25c09f7 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_main.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_main.c
@@ -234,6 +234,7 @@ static int idpf_cfg_device(struct idpf_adapter *adapter)
 	if (err)
 		pci_dbg(pdev, "PCIe PTM is not supported by PCIe bus/controller\n");
 
+	pci_save_state(pdev);
 	pci_set_drvdata(pdev, adapter);
 
 	return 0;
@@ -360,6 +361,116 @@ static int idpf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	return err;
 }
 
+static void idpf_reset_prepare(struct idpf_adapter *adapter)
+{
+	pci_dbg(adapter->pdev, "resetting\n");
+	set_bit(IDPF_HR_RESET_IN_PROG, adapter->flags);
+	cancel_delayed_work_sync(&adapter->serv_task);
+	cancel_delayed_work_sync(&adapter->vc_event_task);
+	idpf_detach_and_close(adapter);
+	idpf_idc_issue_reset_event(adapter->cdev_info);
+	idpf_vc_core_deinit(adapter);
+}
+
+/**
+ * idpf_pci_err_detected - PCI error detected, about to attempt recovery
+ * @pdev: PCI device struct
+ * @err: err detected
+ *
+ * Return: %PCI_ERS_RESULT_NEED_RESET to attempt recovery,
+ * %PCI_ERS_RESULT_DISCONNECT if recovery is not possible.
+ */
+static pci_ers_result_t
+idpf_pci_err_detected(struct pci_dev *pdev, pci_channel_state_t err)
+{
+	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
+
+	/* Shutdown the mailbox if PCI I/O is in a bad state to avoid MBX
+	 * timeouts during the prepare stage.
+	 */
+	if (pci_channel_offline(pdev))
+		libie_ctlq_xn_shutdown(adapter->xnm);
+
+	idpf_reset_prepare(adapter);
+
+	if (err == pci_channel_io_perm_failure)
+		return PCI_ERS_RESULT_DISCONNECT;
+
+	/* When called due to PCI error, driver will have to force PFR on
+	 * resume, in order to complete the recovery via the event task.
+	 */
+	set_bit(IDPF_PCI_CB_RESET, adapter->flags);
+
+	return PCI_ERS_RESULT_NEED_RESET;
+}
+
+/**
+ * idpf_pci_err_slot_reset - PCI undergoing reset
+ * @pdev: PCI device struct
+ *
+ * Reset PCI state and use a register read to see if we're good.
+ *
+ * Return: %PCI_ERS_RESULT_RECOVERED on success,
+ * %PCI_ERS_RESULT_DISCONNECT on failure.
+ */
+static pci_ers_result_t
+idpf_pci_err_slot_reset(struct pci_dev *pdev)
+{
+	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
+
+	pci_restore_state(pdev);
+	pci_set_master(pdev);
+	pci_wake_from_d3(pdev, false);
+	if (readl(adapter->reset_reg.rstat) != 0xFFFFFFFF)
+		return PCI_ERS_RESULT_RECOVERED;
+
+	return PCI_ERS_RESULT_DISCONNECT;
+}
+
+/**
+ * idpf_pci_err_resume - Resume operations after PCI error recovery
+ * @pdev: PCI device struct
+ */
+static void idpf_pci_err_resume(struct pci_dev *pdev)
+{
+	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
+
+	/* Force a PFR when resuming from PCI error. */
+	if (test_and_set_bit(IDPF_PCI_CB_RESET, adapter->flags))
+		adapter->dev_ops.reg_ops.trigger_reset(adapter, IDPF_HR_FUNC_RESET);
+
+	queue_delayed_work(adapter->vc_event_wq,
+			   &adapter->vc_event_task,
+			   msecs_to_jiffies(300));
+}
+
+/**
+ * idpf_pci_err_reset_prepare - Prepare driver for PCI reset
+ * @pdev: PCI device struct
+ */
+static void idpf_pci_err_reset_prepare(struct pci_dev *pdev)
+{
+	idpf_reset_prepare(pci_get_drvdata(pdev));
+}
+
+/**
+ * idpf_pci_err_reset_done - PCI err reset recovery complete
+ * @pdev: PCI device struct
+ */
+static void idpf_pci_err_reset_done(struct pci_dev *pdev)
+{
+	pci_dbg(pdev, "reset: done\n");
+	idpf_pci_err_resume(pdev);
+}
+
+static const struct pci_error_handlers idpf_pci_err_handler = {
+	.error_detected = idpf_pci_err_detected,
+	.slot_reset = idpf_pci_err_slot_reset,
+	.reset_prepare = idpf_pci_err_reset_prepare,
+	.reset_done = idpf_pci_err_reset_done,
+	.resume = idpf_pci_err_resume,
+};
+
 /* idpf_pci_tbl - PCI Dev idpf ID Table
  */
 static const struct pci_device_id idpf_pci_tbl[] = {
@@ -377,5 +488,6 @@ static struct pci_driver idpf_driver = {
 	.sriov_configure	= idpf_sriov_configure,
 	.remove			= idpf_remove,
 	.shutdown		= idpf_shutdown,
+	.err_handler		= &idpf_pci_err_handler,
 };
 module_pci_driver(idpf_driver);
-- 
2.37.3



* [PATCH iwl-next v2 1/2] idpf: remove conditional MBX deinit from idpf_vc_core_deinit()
From: Emil Tantilov @ 2026-04-14  3:16 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, przemyslaw.kitszel, jay.bhat, ivan.d.barrera,
	aleksandr.loktionov, larysa.zaremba, anthony.l.nguyen,
	andrew+netdev, davem, edumazet, kuba, pabeni, aleksander.lobakin,
	linux-pci, madhu.chittim, decot, willemb, sheenamo, lukas
In-Reply-To: <20260414031631.2107-1-emil.s.tantilov@intel.com>

Previously it was assumed that idpf_vc_core_deinit() is always called
during reset handling, with remove being the exception. Ideally the
driver should communicate the changes to FW in all instances where the
MBX is not already disabled. Remove the remove_in_prog check from
idpf_vc_core_deinit(), as during reset handling the MBX has already been
disabled via libie_ctlq_xn_shutdown() by the service task. This is also
needed by the following patch, which introduces PCI callback support.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Reviewed-by: Jay Bhat <jay.bhat@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
---
 drivers/net/ethernet/intel/idpf/idpf_virtchnl.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
index 129c8f6b0faa..fceaf3ec1cd4 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
@@ -3229,24 +3229,15 @@ int idpf_vc_core_init(struct idpf_adapter *adapter)
  */
 void idpf_vc_core_deinit(struct idpf_adapter *adapter)
 {
-	bool remove_in_prog;
-
 	if (!test_bit(IDPF_VC_CORE_INIT, adapter->flags))
 		return;
 
-	/* Avoid transaction timeouts when called during reset */
-	remove_in_prog = test_bit(IDPF_REMOVE_IN_PROG, adapter->flags);
-	if (!remove_in_prog)
-		idpf_deinit_dflt_mbx(adapter);
-
 	idpf_ptp_release(adapter);
 	idpf_deinit_task(adapter);
 	idpf_idc_deinit_core_aux_device(adapter);
 	idpf_rel_rx_pt_lkup(adapter);
 	idpf_intr_rel(adapter);
-
-	if (remove_in_prog)
-		idpf_deinit_dflt_mbx(adapter);
+	idpf_deinit_dflt_mbx(adapter);
 
 	cancel_delayed_work_sync(&adapter->serv_task);
 
-- 
2.37.3



* [PATCH iwl-next v2 0/2] Introduce IDPF PCI callbacks
From: Emil Tantilov @ 2026-04-14  3:16 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, przemyslaw.kitszel, jay.bhat, ivan.d.barrera,
	aleksandr.loktionov, larysa.zaremba, anthony.l.nguyen,
	andrew+netdev, davem, edumazet, kuba, pabeni, aleksander.lobakin,
	linux-pci, madhu.chittim, decot, willemb, sheenamo, lukas

This series implements PCI callbacks for the purpose of handling FLR and
PCI errors in the IDPF driver.

The first patch removes the conditional deinitialization of the mailbox in
the idpf_vc_core_deinit() function. The check was redundant, because the
mailbox is already shut down once a reset is detected. It also prevented
the driver from sending the messages that stop and disable the vports and
queues on the FW side, which is needed for the prepare phase of the FLR
handling.

The second patch implements the PCI callbacks. The logic here follows
the reset handling done in idpf_init_hard_reset(), but is split into
prepare and resume phases, where idpf_reset_prepare() stops all driver
operations and the resume callback attempts to recover following the
reset or the PCI error event.
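
The prepare/resume split described above can be modelled in plain user-space
C. This is only a rough sketch of the control flow, not driver code: every
model_* name, the enums and the struct fields are hypothetical stand-ins
(cb_reset plays the role of IDPF_PCI_CB_RESET), and the real callbacks do far
more work than toggling flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for pci_channel_state_t and pci_ers_result_t. */
enum model_chan_state { MODEL_IO_NORMAL, MODEL_IO_FROZEN, MODEL_IO_PERM_FAILURE };
enum model_ers_result { MODEL_NEED_RESET, MODEL_DISCONNECT, MODEL_RECOVERED };

struct model_adapter {
	bool stopped;           /* prepare phase has quiesced the driver */
	bool cb_reset;          /* models the IDPF_PCI_CB_RESET flag */
	bool pfr_triggered;     /* a function reset was forced on resume */
	bool event_task_queued; /* recovery was handed to the event task */
};

/* Prepare phase: stop all driver operations. */
static void model_reset_prepare(struct model_adapter *a)
{
	a->stopped = true;
}

/* Error detected: always quiesce, then decide whether recovery is possible. */
static enum model_ers_result
model_err_detected(struct model_adapter *a, enum model_chan_state state)
{
	model_reset_prepare(a);
	if (state == MODEL_IO_PERM_FAILURE)
		return MODEL_DISCONNECT;
	a->cb_reset = true; /* remember that resume must force a reset */
	return MODEL_NEED_RESET;
}

/* Slot reset: an all-ones register read means the device is unreachable. */
static enum model_ers_result model_slot_reset(unsigned int rstat)
{
	return rstat != 0xFFFFFFFFu ? MODEL_RECOVERED : MODEL_DISCONNECT;
}

/* Resume: emulate test_and_set_bit(), force a reset only after an error. */
static void model_resume(struct model_adapter *a)
{
	bool was_set = a->cb_reset;

	a->cb_reset = true;
	if (was_set)
		a->pfr_triggered = true;
	a->event_task_queued = true;
}
```

In this model the FLR path (model_reset_prepare() followed by model_resume())
leaves pfr_triggered clear, because the reset has already happened, while the
error path sets it. Both paths leave cb_reset set, so that the event task can
skip straight to waiting for reset completion.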

Testing hints:
1. FLR via sysfs:
echo 1 > /sys/class/net/<ifname>/device/reset

Previously this would have been handled by idpf_init_hard_reset() as the
driver detects the reset. Now it will be done by the PCI err callbacks,
so this is the easiest way to test the reset_prepare/resume path.

2. PCI errors can be tested with aer-inject:
./aer-inject -s 83:00.0 examples/<error_type>

3. Stress testing can be done by combining various callbacks with the
reset from step 1:
echo 1 > /sys/class/net/<if>/device/reset& ethtool -L <if> combined 8
ethtool -L <if> combined 16& echo 1 > /sys/class/net/<if>/device/reset

Changelog:
v1->v2:
- Removed the call to pci_save_state() from idpf_pci_err_slot_reset(),
  as it is no longer needed after pci_restore_state(). Suggested by
  Lukas Wunner.

v1:
https://lore.kernel.org/netdev/20260411003959.30959-1-emil.s.tantilov@intel.com/

Emil Tantilov (2):
  idpf: remove conditional MBX deinit from idpf_vc_core_deinit()
  idpf: implement pci error handlers

 drivers/net/ethernet/intel/idpf/idpf.h        |   3 +
 drivers/net/ethernet/intel/idpf/idpf_lib.c    |  13 +-
 drivers/net/ethernet/intel/idpf/idpf_main.c   | 112 ++++++++++++++++++
 .../net/ethernet/intel/idpf/idpf_virtchnl.c   |  11 +-
 4 files changed, 127 insertions(+), 12 deletions(-)

-- 
2.37.3



* [PATCH 6.6.y] netfilter: conntrack: add missing netlink policy validations
From: Li hongliang @ 2026-04-14  2:59 UTC (permalink / raw)
  To: gregkh, stable, fw
  Cc: patches, linux-kernel, pablo, kadlec, davem, edumazet, kuba,
	pabeni, horms, kaber, netfilter-devel, coreteam, netdev, imv4bel

From: Florian Westphal <fw@strlen.de>

[ Upstream commit f900e1d77ee0ef87bfb5ab3fe60f0b3d8ad5ba05 ]

Hyunwoo Kim reports out-of-bounds access in sctp and ctnetlink.

These attributes are used by the kernel without any validation.
Extend the netlink policies accordingly.

Quoting the reporter:
  nlattr_to_sctp() assigns the user-supplied CTA_PROTOINFO_SCTP_STATE
  value directly to ct->proto.sctp.state without checking that it is
  within the valid range. [..]

  and: ... with exp->dir = 100, the access at
  ct->master->tuplehash[100] reads 5600 bytes past the start of a
  320-byte nf_conn object, causing a slab-out-of-bounds read confirmed by
  UBSAN.
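
For illustration, the effect of an upper-bound policy on these two attributes
can be sketched in user-space C. This is a simplified model, not the kernel's
netlink validation code: the model_* helpers and MODEL_DIR_MAX are
hypothetical names, and the SCTP state bound is passed in as a parameter
rather than assumed. The only values taken as given are IP_CT_DIR_ORIGINAL = 0
and IP_CT_DIR_REPLY = 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models IP_CT_DIR_REPLY, the largest valid direction index. */
#define MODEL_DIR_MAX 1u

/* Decode a 4-byte big-endian attribute payload into host order. */
static uint32_t model_be32_to_cpu(const uint8_t b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Accept a big-endian u32 attribute only if its value is <= max. */
static bool model_validate_max_be32(const uint8_t payload[4], uint32_t max)
{
	return model_be32_to_cpu(payload) <= max;
}

/* Accept a u8 attribute (such as the SCTP state) only if it is <= max. */
static bool model_validate_max_u8(uint8_t value, uint8_t max)
{
	return value <= max;
}
```

With a check of this kind in the policy, a direction of 100 is rejected at
parse time and never reaches the tuplehash[] index.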

Fixes: 076a0ca02644 ("netfilter: ctnetlink: add NAT support for expectations")
Fixes: a258860e01b8 ("netfilter: ctnetlink: add full support for SCTP to ctnetlink")
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Li hongliang <1468888505@139.com>
---
 net/netfilter/nf_conntrack_netlink.c    | 2 +-
 net/netfilter/nf_conntrack_proto_sctp.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 9b089cdfcd35..255996f43d85 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -3454,7 +3454,7 @@ ctnetlink_change_expect(struct nf_conntrack_expect *x,
 
 #if IS_ENABLED(CONFIG_NF_NAT)
 static const struct nla_policy exp_nat_nla_policy[CTA_EXPECT_NAT_MAX+1] = {
-	[CTA_EXPECT_NAT_DIR]	= { .type = NLA_U32 },
+	[CTA_EXPECT_NAT_DIR]	= NLA_POLICY_MAX(NLA_BE32, IP_CT_DIR_REPLY),
 	[CTA_EXPECT_NAT_TUPLE]	= { .type = NLA_NESTED },
 };
 #endif
diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c
index 4cc97f971264..fabb2c1ca00a 100644
--- a/net/netfilter/nf_conntrack_proto_sctp.c
+++ b/net/netfilter/nf_conntrack_proto_sctp.c
@@ -587,7 +587,8 @@ static int sctp_to_nlattr(struct sk_buff *skb, struct nlattr *nla,
 }
 
 static const struct nla_policy sctp_nla_policy[CTA_PROTOINFO_SCTP_MAX+1] = {
-	[CTA_PROTOINFO_SCTP_STATE]	    = { .type = NLA_U8 },
+	[CTA_PROTOINFO_SCTP_STATE]	    = NLA_POLICY_MAX(NLA_U8,
+							 SCTP_CONNTRACK_HEARTBEAT_SENT),
 	[CTA_PROTOINFO_SCTP_VTAG_ORIGINAL]  = { .type = NLA_U32 },
 	[CTA_PROTOINFO_SCTP_VTAG_REPLY]     = { .type = NLA_U32 },
 };
-- 
2.34.1




* [PATCH 6.1.y] netfilter: conntrack: add missing netlink policy validations
From: Li hongliang @ 2026-04-14  2:59 UTC (permalink / raw)
  To: gregkh, stable, fw
  Cc: patches, linux-kernel, pablo, kadlec, davem, edumazet, kuba,
	pabeni, horms, kaber, netfilter-devel, coreteam, netdev, imv4bel

From: Florian Westphal <fw@strlen.de>

[ Upstream commit f900e1d77ee0ef87bfb5ab3fe60f0b3d8ad5ba05 ]

Hyunwoo Kim reports out-of-bounds access in sctp and ctnetlink.

These attributes are used by the kernel without any validation.
Extend the netlink policies accordingly.

Quoting the reporter:
  nlattr_to_sctp() assigns the user-supplied CTA_PROTOINFO_SCTP_STATE
  value directly to ct->proto.sctp.state without checking that it is
  within the valid range. [..]

  and: ... with exp->dir = 100, the access at
  ct->master->tuplehash[100] reads 5600 bytes past the start of a
  320-byte nf_conn object, causing a slab-out-of-bounds read confirmed by
  UBSAN.

Fixes: 076a0ca02644 ("netfilter: ctnetlink: add NAT support for expectations")
Fixes: a258860e01b8 ("netfilter: ctnetlink: add full support for SCTP to ctnetlink")
Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Li hongliang <1468888505@139.com>
---
 net/netfilter/nf_conntrack_netlink.c    | 2 +-
 net/netfilter/nf_conntrack_proto_sctp.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 89cec02de68b..bcbd77608365 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -3458,7 +3458,7 @@ ctnetlink_change_expect(struct nf_conntrack_expect *x,
 
 #if IS_ENABLED(CONFIG_NF_NAT)
 static const struct nla_policy exp_nat_nla_policy[CTA_EXPECT_NAT_MAX+1] = {
-	[CTA_EXPECT_NAT_DIR]	= { .type = NLA_U32 },
+	[CTA_EXPECT_NAT_DIR]	= NLA_POLICY_MAX(NLA_BE32, IP_CT_DIR_REPLY),
 	[CTA_EXPECT_NAT_TUPLE]	= { .type = NLA_NESTED },
 };
 #endif
diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c
index 7ffd698497f2..90458799324e 100644
--- a/net/netfilter/nf_conntrack_proto_sctp.c
+++ b/net/netfilter/nf_conntrack_proto_sctp.c
@@ -600,7 +600,8 @@ static int sctp_to_nlattr(struct sk_buff *skb, struct nlattr *nla,
 }
 
 static const struct nla_policy sctp_nla_policy[CTA_PROTOINFO_SCTP_MAX+1] = {
-	[CTA_PROTOINFO_SCTP_STATE]	    = { .type = NLA_U8 },
+	[CTA_PROTOINFO_SCTP_STATE]	    = NLA_POLICY_MAX(NLA_U8,
+							 SCTP_CONNTRACK_HEARTBEAT_SENT),
 	[CTA_PROTOINFO_SCTP_VTAG_ORIGINAL]  = { .type = NLA_U32 },
 	[CTA_PROTOINFO_SCTP_VTAG_REPLY]     = { .type = NLA_U32 },
 };
-- 
2.34.1




* [PATCH 6.6.y] Revert "wifi: cfg80211: stop NAN and P2P in cfg80211_leave"
From: guocai.he.cn @ 2026-04-14  2:46 UTC (permalink / raw)
  To: gregkh
  Cc: stable, johannes.berg, netdev, regressions,
	miriam.rachel.korenblit, linux-kernel

From: Guocai He <guocai.he.cn@windriver.com>

This reverts commit 4d7a05da767e5cbcf4db511b9289d7ebd380dc56 which is commit
e1696c8bd0056bc1a5f7766f58ac333adc203e8a upstream.

The reverted patch introduced a deadlock. The locking situation in mainline is
totally different, so it is incorrect to directly backport the commit from mainline.

Signed-off-by: Guocai He <guocai.he.cn@windriver.com>
---
 net/wireless/core.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/wireless/core.c b/net/wireless/core.c
index fac19dab23c6..d07c4baa32d9 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -1332,10 +1332,8 @@ void __cfg80211_leave(struct cfg80211_registered_device *rdev,
 		__cfg80211_leave_ocb(rdev, dev);
 		break;
 	case NL80211_IFTYPE_P2P_DEVICE:
-		cfg80211_stop_p2p_device(rdev, wdev);
-		break;
 	case NL80211_IFTYPE_NAN:
-		cfg80211_stop_nan(rdev, wdev);
+		/* cannot happen, has no netdev */
 		break;
 	case NL80211_IFTYPE_AP_VLAN:
 	case NL80211_IFTYPE_MONITOR:
-- 
2.34.1



* Re: [PATCH net-next] net: shaper: Reject zero weight in shaper config
From: Mohsin Bashir @ 2026-04-14  2:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, ast, chuck.lever, davem, donald.hunter, edumazet, horms,
	linux-kernel, matttbe, pabeni
In-Reply-To: <20260413145039.43f7b162@kernel.org>



On 4/13/26 2:50 PM, Jakub Kicinski wrote:
> On Fri, 10 Apr 2026 15:51:23 -0700 Mohsin Bashir wrote:
>> A zero weight is meaningless for DWRR scheduling and can cause
>> starvation of the affected node. Add a min-value constraint to
>> the weight attribute in the net_shaper netlink spec so that zero
>> is rejected at the netlink policy level.
>>
>> Found while prototyping a new driver, existing drivers are not
>> affected.
> 
> AI review points out that if the netlink attr is not present core will
> leave the DWRR weight as 0 in the struct. I guess we need to think this
> thru a little more carefully. What should the "default" weight be?
> What if user specifies weights only for subset of leaves?
> 
> This part of the uAPI seems under-defined.
> 
> Maybe a better adjustment would be to make core set the weight to 1
> automatically if the user has not defined it? Only when sending it to
> the driver tho, because we'd still want it to not be reported back to
> user space. Not sure how hairy it'd get code-wise.

Interesting!!
Let me look at the big picture here and re-spin.
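
One possible reading of the suggestion quoted above, as a user-space C sketch.
This is not the net_shaper core code: struct model_shaper and the model_*
helpers are hypothetical, and whether 1 is the right default is exactly the
open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of a shaper as tracked by the core. */
struct model_shaper {
	bool weight_set; /* did user space supply the weight attribute? */
	uint32_t weight; /* meaningful only when weight_set is true */
};

/* Policy-level check from the patch: an explicit zero is rejected. */
static bool model_weight_valid(uint32_t weight)
{
	return weight >= 1u;
}

/* Value handed to the driver: fall back to 1 when the user gave nothing. */
static uint32_t model_driver_weight(const struct model_shaper *s)
{
	return s->weight_set ? s->weight : 1u;
}

/* The implicit default is not reported back to user space. */
static bool model_report_weight(const struct model_shaper *s)
{
	return s->weight_set;
}
```

Keeping weight_set separate from the value lets the core tell "not
configured" apart from any real weight, which a bare 0 in the struct cannot
do once zero is rejected by the policy.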


* [PATCH 6.1.y] Revert "wifi: cfg80211: stop NAN and P2P in cfg80211_leave"
From: guocai.he.cn @ 2026-04-14  2:16 UTC (permalink / raw)
  To: stable; +Cc: gregkh, johannes.berg, netdev, regressions,
	miriam.rachel.korenblit

From: Guocai He <guocai.he.cn@windriver.com>

This reverts commit 0c4f1c02d27a880b10b58c63f574f13bed4f711d which is commit
e1696c8bd0056bc1a5f7766f58ac333adc203e8a upstream.

The reverted patch introduced a deadlock. The locking situation in mainline is
totally different, so it is incorrect to directly backport the commit from mainline.

Signed-off-by: Guocai He <guocai.he.cn@windriver.com>
---
 net/wireless/core.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/wireless/core.c b/net/wireless/core.c
index e75326932c32..2a6a8bdfa724 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -1328,10 +1328,8 @@ void __cfg80211_leave(struct cfg80211_registered_device *rdev,
 		__cfg80211_leave_ocb(rdev, dev);
 		break;
 	case NL80211_IFTYPE_P2P_DEVICE:
-		cfg80211_stop_p2p_device(rdev, wdev);
-		break;
 	case NL80211_IFTYPE_NAN:
-		cfg80211_stop_nan(rdev, wdev);
+		/* cannot happen, has no netdev */
 		break;
 	case NL80211_IFTYPE_AP_VLAN:
 	case NL80211_IFTYPE_MONITOR:
-- 
2.34.1



