Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] net: usbnet: Fix potential NULL pointer dereference
From: Oliver Neukum @ 2023-11-06 12:59 UTC (permalink / raw)
  To: Ren Mingshuai, kuba, Bjørn Mork
  Cc: caowangbao, davem, khlebnikov, liaichun, linux-kernel, netdev,
	oneukum, yanan
In-Reply-To: <20231102090630.938759-1-renmingshuai@huawei.com>

On 02.11.23 10:06, Ren Mingshuai wrote:

>> What do you mean? Grepping the function name shows call sites with NULL getting passed as skb.

You may see that we do check skb != NULL before we timestamp it.
But later in the process we depend skb == NULL implying that tx_fixup != NULL

That is the combination that must not happen. If it happens, though
simply bailing out seems to the wrong answer.

> Yes And I just learned that during the cdc_ncm_driver.probe, it is possible to pass a NULL SKB to usbnet_start_xmit().

How can that happen? And if it happens is tx_fixup set?

	Regards
		Oliver

^ permalink raw reply

* Re: [PATCH bpf 2/2] bpf: sockmap, add af_unix test with both sockets in map
From: Jakub Sitnicki @ 2023-11-06 12:44 UTC (permalink / raw)
  To: John Fastabend; +Cc: bpf, netdev, yangyingliang, martin.lau
In-Reply-To: <20231016190819.81307-3-john.fastabend@gmail.com>

On Mon, Oct 16, 2023 at 12:08 PM -07, John Fastabend wrote:
> This adds a test where both pairs of a af_unix paired socket are put into
> a BPF map. This ensures that when we tear down the af_unix pair we don't
> have any issues on sockmap side with ordering and reference counting.
>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  .../selftests/bpf/prog_tests/sockmap_listen.c | 39 ++++++++++++++++---
>  .../selftests/bpf/progs/test_sockmap_listen.c |  7 ++++
>  2 files changed, 40 insertions(+), 6 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> index 8df8cbb447f1..90e97907c1c1 100644
> --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
> @@ -1824,8 +1824,10 @@ static void inet_unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
>  	xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
>  }
>  
> -static void unix_inet_redir_to_connected(int family, int type, int sock_mapfd,
> -					int verd_mapfd, enum redir_mode mode)
> +static void unix_inet_redir_to_connected(int family, int type,
> +					int sock_mapfd, int nop_mapfd,
> +					int verd_mapfd,
> +					enum redir_mode mode)
>  {
>  	const char *log_prefix = redir_mode_str(mode);
>  	int c0, c1, p0, p1;
> @@ -1849,6 +1851,12 @@ static void unix_inet_redir_to_connected(int family, int type, int sock_mapfd,
>  	if (err)
>  		goto close;
>  
> +	if (nop_mapfd >= 0) {
> +		err = add_to_sockmap(nop_mapfd, c0, c1);
> +		if (err)
> +			goto close;
> +	}
> +
>  	n = write(c1, "a", 1);
>  	if (n < 0)
>  		FAIL_ERRNO("%s: write", log_prefix);
> @@ -1883,6 +1891,7 @@ static void unix_inet_skb_redir_to_connected(struct test_sockmap_listen *skel,
>  					    struct bpf_map *inner_map, int family)
>  {
>  	int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
> +	int nop_map = bpf_map__fd(skel->maps.nop_map);
>  	int verdict_map = bpf_map__fd(skel->maps.verdict_map);
>  	int sock_map = bpf_map__fd(inner_map);
>  	int err;
> @@ -1892,14 +1901,32 @@ static void unix_inet_skb_redir_to_connected(struct test_sockmap_listen *skel,
>  		return;
>  
>  	skel->bss->test_ingress = false;
> -	unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
> +	unix_inet_redir_to_connected(family, SOCK_DGRAM,
> +				     sock_map, -1, verdict_map,
> +				     REDIR_EGRESS);
> +	unix_inet_redir_to_connected(family, SOCK_DGRAM,
> +				     sock_map, -1, verdict_map,
>  				     REDIR_EGRESS);
> -	unix_inet_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
> +
> +	unix_inet_redir_to_connected(family, SOCK_DGRAM,
> +				     sock_map, nop_map, verdict_map,
> +				     REDIR_EGRESS);
> +	unix_inet_redir_to_connected(family, SOCK_STREAM,
> +				     sock_map, nop_map, verdict_map,
>  				     REDIR_EGRESS);
>  	skel->bss->test_ingress = true;
> -	unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
> +	unix_inet_redir_to_connected(family, SOCK_DGRAM,
> +				     sock_map, -1, verdict_map,
> +				     REDIR_INGRESS);
> +	unix_inet_redir_to_connected(family, SOCK_STREAM,
> +				     sock_map, -1, verdict_map,
> +				     REDIR_INGRESS);
> +
> +	unix_inet_redir_to_connected(family, SOCK_DGRAM,
> +				     sock_map, nop_map, verdict_map,
>  				     REDIR_INGRESS);
> -	unix_inet_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
> +	unix_inet_redir_to_connected(family, SOCK_STREAM,
> +				     sock_map, nop_map, verdict_map,
>  				     REDIR_INGRESS);
>  
>  	xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
> diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
> index 464d35bd57c7..b7250eb9c30c 100644
> --- a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
> +++ b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
> @@ -14,6 +14,13 @@ struct {
>  	__type(value, __u64);
>  } sock_map SEC(".maps");
>  
> +struct {
> +	__uint(type, BPF_MAP_TYPE_SOCKMAP);
> +	__uint(max_entries, 2);
> +	__type(key, __u32);
> +	__type(value, __u64);
> +} nop_map SEC(".maps");
> +
>  struct {
>  	__uint(type, BPF_MAP_TYPE_SOCKHASH);
>  	__uint(max_entries, 2);

So... we have a bug in unix_inet_redir_to_connected() - it happily
ignores the passed socket type, which is currently hardcoded to
SOCK_DGRAM :-)

Which means these tests don't exercise unix_stream paths where the added
psock->skpair is actually needed.

But I'm able to reproduce the bug by running the VSOCK redir test:

bash-5.2# ./test_progs -n 212/79
[   23.232282] ==================================================================
[   23.232634] BUG: KASAN: slab-use-after-free in sock_def_readable+0xe3/0x400
[   23.232942] Read of size 8 at addr ffff8881013f9860 by task kworker/1:2/220
[   23.233253]
[   23.233326] CPU: 1 PID: 220 Comm: kworker/1:2 Tainted: G           OE      6.6.0 #30
[   23.233697] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
[   23.234074] Workqueue: events sk_psock_backlog
[   23.234271] Call Trace:
[   23.234381]  <TASK>
[   23.234477]  dump_stack_lvl+0x4a/0x90
[   23.234640]  print_address_description.constprop.0+0x33/0x400
[   23.234888]  ? preempt_count_sub+0x13/0xc0
[   23.235071]  print_report+0xb6/0x260
[   23.235228]  ? kasan_complete_mode_report_info+0x7c/0x1f0
[   23.235462]  kasan_report+0xd0/0x110
[   23.235619]  ? sock_def_readable+0xe3/0x400
[   23.235801]  ? sock_def_readable+0xe3/0x400
[   23.235989]  kasan_check_range+0xf7/0x1b0
[   23.236164]  __kasan_check_read+0x11/0x20
[   23.236340]  sock_def_readable+0xe3/0x400
[   23.236514]  unix_stream_sendmsg+0x3c5/0x7d0
[   23.236704]  ? queue_oob+0x300/0x300
[   23.236865]  sock_sendmsg+0x229/0x250
[   23.237030]  ? sock_write_iter+0x320/0x320
[   23.237211]  ? __this_cpu_preempt_check+0x13/0x20
[   23.237416]  ? lock_acquire+0x191/0x410
[   23.237607]  ? lock_sync+0x110/0x110
[   23.237766]  ? lock_is_held_type+0xd0/0x130
[   23.237948]  ? __asan_storeN+0x12/0x20
[   23.238115]  __skb_send_sock+0x4fa/0x670
[   23.238288]  ? preempt_count_sub+0x13/0xc0
[   23.238494]  ? sendmsg_locked+0x90/0x90
[   23.238721]  ? sendmsg_unlocked+0x40/0x40
[   23.238975]  ? __lock_acquire+0x765/0xf00
[   23.239252]  ? __this_cpu_preempt_check+0x13/0x20
[   23.239570]  ? lock_acquire+0x191/0x410
[   23.239831]  skb_send_sock+0x10/0x20
[   23.240079]  sk_psock_backlog+0x141/0x5e0
[   23.240340]  ? __this_cpu_preempt_check+0x13/0x20
[   23.240638]  process_one_work+0x49d/0x970
[   23.240900]  ? drain_workqueue+0x1c0/0x1c0
[   23.241173]  ? assign_work+0xe1/0x120
[   23.241404]  worker_thread+0x380/0x680
[   23.241660]  ? trace_hardirqs_on+0x22/0x100
[   23.241933]  ? preempt_count_sub+0x13/0xc0
[   23.242213]  ? process_one_work+0x970/0x970
[   23.242491]  kthread+0x1ba/0x200
[   23.242687]  ? kthread+0xfe/0x200
[   23.242890]  ? kthread_complete_and_exit+0x20/0x20
[   23.243193]  ret_from_fork+0x35/0x60
[   23.243418]  ? kthread_complete_and_exit+0x20/0x20
[   23.243718]  ret_from_fork_asm+0x11/0x20
[   23.243995]  </TASK>
[   23.244145]
[   23.244227] Allocated by task 227:
[   23.244446]  kasan_save_stack+0x26/0x50
[   23.244709]  kasan_set_track+0x25/0x40
[   23.244951]  kasan_save_alloc_info+0x1e/0x30
[   23.245220]  __kasan_slab_alloc+0x72/0x80
[   23.245491]  kmem_cache_alloc+0x182/0x360
[   23.245758]  sk_prot_alloc+0x43/0x160
[   23.246007]  sk_alloc+0x2c/0x3a0
[   23.246216]  unix_create1+0x86/0x440
[   23.246462]  unix_create+0x7d/0x100
[   23.246701]  __sock_create+0x1bc/0x420
[   23.246960]  __sys_socketpair+0x1ac/0x3a0
[   23.247237]  __x64_sys_socketpair+0x4f/0x60
[   23.247521]  do_syscall_64+0x38/0x90
[   23.247769]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   23.248113]
[   23.248225] Freed by task 227:
[   23.248444]  kasan_save_stack+0x26/0x50
[   23.248707]  kasan_set_track+0x25/0x40
[   23.248963]  kasan_save_free_info+0x2b/0x50
[   23.249249]  ____kasan_slab_free+0x154/0x1c0
[   23.249541]  __kasan_slab_free+0x12/0x20
[   23.249810]  kmem_cache_free+0x1e7/0x480
[   23.250084]  __sk_destruct+0x270/0x3f0
[   23.250342]  sk_destruct+0x78/0x90
[   23.250577]  __sk_free+0x51/0x160
[   23.250807]  sk_free+0x45/0x70
[   23.251025]  unix_release_sock+0x5cc/0x700
[   23.251301]  unix_release+0x50/0x70
[   23.251536]  __sock_release+0x5f/0x120
[   23.251754]  sock_close+0x13/0x20
[   23.252109]  __fput+0x1f3/0x470
[   23.252451]  __fput_sync+0x2f/0x40
[   23.252811]  __x64_sys_close+0x51/0x90
[   23.253169]  do_syscall_64+0x38/0x90
[   23.253480]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   23.253940]
[   23.254097] The buggy address belongs to the object at ffff8881013f9800
[   23.254097]  which belongs to the cache UNIX-STREAM of size 1920
[   23.254936] The buggy address is located 96 bytes inside of
[   23.254936]  freed 1920-byte region [ffff8881013f9800, ffff8881013f9f80)
[   23.255731]
[   23.255844] The buggy address belongs to the physical page:
[   23.256225] page:00000000c005ecb3 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff8881013f8000 pfn:0x1013f8
[   23.256905] head:00000000c005ecb3 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[   23.257382] flags: 0x2fffe000000840(slab|head|node=0|zone=2|lastcpupid=0x7fff)
[   23.257791] page_type: 0xffffffff()
[   23.257988] raw: 002fffe000000840 ffff888100961a40 dead000000000122 0000000000000000
[   23.258418] raw: ffff8881013f8000 000000008010000e 00000001ffffffff 0000000000000000
[   23.258817] page dumped because: kasan: bad access detected
[   23.259131]
[   23.259205] Memory state around the buggy address:
[   23.259469]  ffff8881013f9700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   23.259871]  ffff8881013f9780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   23.260290] >ffff8881013f9800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   23.260704]                                                        ^
[   23.261056]  ffff8881013f9880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   23.261469]  ffff8881013f9900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   23.261854] ==================================================================
[   23.262453] Disabling lock debugging due to kernel taint
#212/79  sockmap_listen/sockmap VSOCK test_vsock_redir:OK
#212     sockmap_listen:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
bash-5.2#

If I modify the test to use (AF_UNIX, SOCK_DGRAM) instead of
SOCK_STREAM, the bug no longer reproduces.

Which confirms my thinking that unix_dgram_sendmsg is safe to use from
sockmap because it grabs a ref to skpair.

^ permalink raw reply

* Re: [PATCH] net: usbnet: Fix potential NULL pointer dereference
From: Oliver Neukum @ 2023-11-06 12:53 UTC (permalink / raw)
  To: Bjørn Mork, Oliver Neukum
  Cc: Ren Mingshuai, kuba, caowangbao, davem, khlebnikov, liaichun,
	linux-kernel, netdev, yanan
In-Reply-To: <871qd3up56.fsf@miraculix.mork.no>

On 06.11.23 11:55, Bjørn Mork wrote:

> I believe that code is based on the (safe?) assumption that the struct
> usbnet driver_info->tx_fixup points to cdc_ncm_tx_fixup().  And

That seems to be a correct assumption, but one that is far from obvious.
Could you add a big, fat comment?

> cdc_ncm_tx_fixup does lots of weird stuff, including special handling of
> NULL skb. It might return a valid skb for further processing by
> usbnet_start_xmit().  If it doesn't, then we jump straight to
> "not_drop", like we do when cdc_ncm_tx_fixup decides to eat the passed
> skb.
> 
> But "funky" is i precise description of all this...  If someone feels
> like it, then all that open coded skb queing inside cdc_ncm should be
> completely rewritten.

I understand what you mean, but I need a generic answer. Can you call
ndo_start_xmit() with skb == NULL?

	Regards
		Oliver


^ permalink raw reply

* Re: [PATCH bpf 1/2] bpf: sockmap, af_unix sockets need to hold ref for pair sock
From: Jakub Sitnicki @ 2023-11-06 12:35 UTC (permalink / raw)
  To: John Fastabend; +Cc: bpf, netdev, yangyingliang, martin.lau
In-Reply-To: <20231016190819.81307-2-john.fastabend@gmail.com>

On Mon, Oct 16, 2023 at 12:08 PM -07, John Fastabend wrote:
> AF_UNIX sockets are a paired socket. So sending on one of the pairs
> will lookup the paired socket as part of the send operation. It is
> possible however to put just one of the pairs in a BPF map. This
> currently increments the refcnt on the sock in the sockmap to
> ensure it is not free'd by the stack before sockmap cleans up its
> state and stops any skbs being sent/recv'd to that socket.
>
> But we missed a case. If the peer socket is closed it will be
> free'd by the stack. However, the paired socket can still be
> referenced from BPF sockmap side because we hold a reference
> there. Then if we are sending traffic through BPF sockmap to
> that socket it will try to dereference the free'd pair in its
> send logic creating a use after free.  And following splat,
>
>    [59.900375] BUG: KASAN: slab-use-after-free in sk_wake_async+0x31/0x1b0
>    [59.901211] Read of size 8 at addr ffff88811acbf060 by task kworker/1:2/954
>    [...]
>    [59.905468] Call Trace:
>    [59.905787]  <TASK>
>    [59.906066]  dump_stack_lvl+0x130/0x1d0
>    [59.908877]  print_report+0x16f/0x740
>    [59.910629]  kasan_report+0x118/0x160
>    [59.912576]  sk_wake_async+0x31/0x1b0
>    [59.913554]  sock_def_readable+0x156/0x2a0
>    [59.914060]  unix_stream_sendmsg+0x3f9/0x12a0
>    [59.916398]  sock_sendmsg+0x20e/0x250
>    [59.916854]  skb_send_sock+0x236/0xac0
>    [59.920527]  sk_psock_backlog+0x287/0xaa0
>
> To fix let BPF sockmap hold a refcnt on both the socket in the
> sockmap and its paired socket.  It wasn't obvious how to contain
> the fix to bpf_unix logic. The primarily problem with keeping this
> logic in bpf_unix was: In the sock close() we could handle the
> deref by having a close handler. But, when we are destroying the
> psock through a map delete operation we wouldn't have gotten any
> signal thorugh the proto struct other than it being replaced.
> If we do the deref from the proto replace its too early because
> we need to deref the skpair after the backlog worker has been
> stopped.
>
> Given all this it seems best to just cache it at the end of the
> psock and eat 8B for the af_unix and vsock users.
>
> Fixes: 94531cfcbe79 ("af_unix: Add unix_stream_proto for sockmap")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  include/linux/skmsg.h |  1 +
>  include/net/af_unix.h |  1 +
>  net/core/skmsg.c      |  2 ++
>  net/unix/af_unix.c    |  2 --
>  net/unix/unix_bpf.c   | 10 ++++++++++
>  5 files changed, 14 insertions(+), 2 deletions(-)
>

[...]

> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 3e8a04a13668..87dd723aacf9 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -212,8 +212,6 @@ static inline bool unix_secdata_eq(struct scm_cookie *scm, struct sk_buff *skb)
>  }
>  #endif /* CONFIG_SECURITY_NETWORK */
>  
> -#define unix_peer(sk) (unix_sk(sk)->peer)
> -
>  static inline int unix_our_peer(struct sock *sk, struct sock *osk)
>  {
>  	return unix_peer(osk) == sk;
> diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
> index 2f9d8271c6ec..705eeed10be3 100644
> --- a/net/unix/unix_bpf.c
> +++ b/net/unix/unix_bpf.c
> @@ -143,6 +143,8 @@ static void unix_stream_bpf_check_needs_rebuild(struct proto *ops)
>  
>  int unix_dgram_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
>  {
> +	struct sock *skpair;
> +
>  	if (sk->sk_type != SOCK_DGRAM)
>  		return -EOPNOTSUPP;
>  
> @@ -152,6 +154,9 @@ int unix_dgram_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool re
>  		return 0;
>  	}
>  
> +	skpair = unix_peer(sk);
> +	sock_hold(skpair);
> +	psock->skpair = skpair;
>  	unix_dgram_bpf_check_needs_rebuild(psock->sk_proto);
>  	sock_replace_proto(sk, &unix_dgram_bpf_prot);
>  	return 0;

unix_dgram should not need this, since it grabs a ref on each sendmsg.

I'm not able to reproduce this bug for unix_dgram.

Have you seen any KASAN reports for unix_dgram from syzcaller?

> @@ -159,12 +164,17 @@ int unix_dgram_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool re
>  
>  int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
>  {
> +	struct sock *skpair = unix_peer(sk);
> +
>  	if (restore) {
>  		sk->sk_write_space = psock->saved_write_space;
>  		sock_replace_proto(sk, psock->sk_proto);
>  		return 0;
>  	}
>  
> +	skpair = unix_peer(sk);
> +	sock_hold(skpair);
> +	psock->skpair = skpair;
>  	unix_stream_bpf_check_needs_rebuild(psock->sk_proto);
>  	sock_replace_proto(sk, &unix_stream_bpf_prot);
>  	return 0;


^ permalink raw reply

* Re: [PATCH net v2 3/4] net: ethernet: cortina: Protect against oversized frames
From: Vladimir Oltean @ 2023-11-06 12:40 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michał Mirosław, Andrew Lunn,
	linux-arm-kernel, netdev, linux-kernel
In-Reply-To: <20231105-gemini-largeframe-fix-v2-3-cd3a5aa6c496@linaro.org>

On Sun, Nov 05, 2023 at 09:57:25PM +0100, Linus Walleij wrote:
> The max size of a transfer no matter the MTU is 64KB-1 so immediately
> bail out if the skb exceeds that.
> 
> The calling site tries to linearize the skbuff on error so return a
> special error code -E2BIG to indicate that this will not work in
> any way and bail out immediately if this happens.
> 
> Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
>  drivers/net/ethernet/cortina/gemini.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
> index b21a94b4ab5c..576174a862a9 100644
> --- a/drivers/net/ethernet/cortina/gemini.c
> +++ b/drivers/net/ethernet/cortina/gemini.c
> @@ -1151,6 +1151,12 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
>  	if (skb->protocol == htons(ETH_P_8021Q))
>  		mtu += VLAN_HLEN;
>  
> +	if (skb->len > 65535) {
> +		/* The field for length is only 16 bits */
> +		netdev_err(netdev, "%s: frame too big, max size 65535 bytes\n", __func__);
> +		return -E2BIG;
> +	}
> +

Prints in the packet data path are extremely discouraged, since if they
trigger, they will spam your serial console and make it unusable.

I see that the out_drop label already bumps a counter. That should be
enough to signal there is a problem.

>  	word1 = skb->len;
>  	word3 = SOF_BIT;
>  
> @@ -1232,6 +1238,7 @@ static netdev_tx_t gmac_start_xmit(struct sk_buff *skb,
>  	struct gmac_txq *txq;
>  	int txq_num, nfrags;
>  	union dma_rwptr rw;
> +	int ret;
>  
>  	if (skb->len >= 0x10000)
>  		goto out_drop_free;

Since you already have this test, does the newly introduced one make
this redundant? Why not just change the limit here?

> @@ -1269,7 +1276,11 @@ static netdev_tx_t gmac_start_xmit(struct sk_buff *skb,
>  		}
>  	}
>  
> -	if (gmac_map_tx_bufs(netdev, skb, txq, &w)) {
> +	ret = gmac_map_tx_bufs(netdev, skb, txq, &w);
> +	if (ret == -E2BIG)
> +		goto out_drop;

Why out_drop and not out_drop_free? This handling will eventually cause an OOM.

The fact that it didn't makes me suspect that you never actually hit this condition,
because the network stack isn't delivering skbs larger than dev->mtu.
Maybe net/core/pktgen.c doesn't take the MTU into consideration, I'm not
completely sure there...

> +	if (ret) {
> +		/* Linearize and retry */
>  		if (skb_linearize(skb))
>  			goto out_drop;
>  
> 
> -- 
> 2.34.1
> 

^ permalink raw reply

* Re: [PATCH net v3] net: ti: icssg-prueth: Add missing icss_iep_put to error path
From: Wojciech Drewek @ 2023-11-06 12:01 UTC (permalink / raw)
  To: Jan Kiszka, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, MD Danish Anwar
  Cc: netdev, linux-kernel, Lopes Ivo, Diogo Miguel (T CED IFD-PT),
	Nishanth Menon, Su, Bao Cheng (RC-CN DF FA R&D)
In-Reply-To: <b2857e2c-cacf-4077-8e15-308dce8ccb0b@siemens.com>



On 06.11.2023 12:47, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> Analogously to prueth_remove, just also taking care for NULL'ing the
> iep pointers.
> 
> Fixes: 186734c15886 ("net: ti: icssg-prueth: add packet timestamping and ptp support")
> Fixes: 443a2367ba3c ("net: ti: icssg-prueth: am65x SR2.0 add 10M full duplex support")
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---

Thanks!
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>

> 
> Changes in v3:
>  - consolidate cleanup logic further [Wojciech]
>  - make sure to NULL iep pointers
> 
> Changes in v2:
>  - add proper tags
> 
> This was lost from the TI SDK version while ripping out SR1.0 support - which we are currently restoring for upstream.
> 
>  drivers/net/ethernet/ti/icssg/icssg_prueth.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> index 6c4b64227ac8..3abbeba26f1b 100644
> --- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> +++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> @@ -2105,10 +2105,7 @@ static int prueth_probe(struct platform_device *pdev)
>  	prueth->iep1 = icss_iep_get_idx(np, 1);
>  	if (IS_ERR(prueth->iep1)) {
>  		ret = dev_err_probe(dev, PTR_ERR(prueth->iep1), "iep1 get failed\n");
> -		icss_iep_put(prueth->iep0);
> -		prueth->iep0 = NULL;
> -		prueth->iep1 = NULL;
> -		goto free_pool;
> +		goto put_iep0;
>  	}
>  
>  	if (prueth->pdata.quirk_10m_link_issue) {
> @@ -2205,6 +2202,12 @@ static int prueth_probe(struct platform_device *pdev)
>  exit_iep:
>  	if (prueth->pdata.quirk_10m_link_issue)
>  		icss_iep_exit_fw(prueth->iep1);
> +	icss_iep_put(prueth->iep1);
> +
> +put_iep0:
> +	icss_iep_put(prueth->iep0);
> +	prueth->iep0 = NULL;
> +	prueth->iep1 = NULL;
>  
>  free_pool:
>  	gen_pool_free(prueth->sram_pool,

^ permalink raw reply

* [PATCH net v3] net: ti: icssg-prueth: Add missing icss_iep_put to error path
From: Jan Kiszka @ 2023-11-06 11:47 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	MD Danish Anwar
  Cc: netdev, linux-kernel, Lopes Ivo, Diogo Miguel (T CED IFD-PT),
	Nishanth Menon, Su, Bao Cheng (RC-CN DF FA R&D),
	Wojciech Drewek

From: Jan Kiszka <jan.kiszka@siemens.com>

Analogously to prueth_remove, just also taking care for NULL'ing the
iep pointers.

Fixes: 186734c15886 ("net: ti: icssg-prueth: add packet timestamping and ptp support")
Fixes: 443a2367ba3c ("net: ti: icssg-prueth: am65x SR2.0 add 10M full duplex support")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---

Changes in v3:
 - consolidate cleanup logic further [Wojciech]
 - make sure to NULL iep pointers

Changes in v2:
 - add proper tags

This was lost from the TI SDK version while ripping out SR1.0 support - which we are currently restoring for upstream.

 drivers/net/ethernet/ti/icssg/icssg_prueth.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
index 6c4b64227ac8..3abbeba26f1b 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
@@ -2105,10 +2105,7 @@ static int prueth_probe(struct platform_device *pdev)
 	prueth->iep1 = icss_iep_get_idx(np, 1);
 	if (IS_ERR(prueth->iep1)) {
 		ret = dev_err_probe(dev, PTR_ERR(prueth->iep1), "iep1 get failed\n");
-		icss_iep_put(prueth->iep0);
-		prueth->iep0 = NULL;
-		prueth->iep1 = NULL;
-		goto free_pool;
+		goto put_iep0;
 	}
 
 	if (prueth->pdata.quirk_10m_link_issue) {
@@ -2205,6 +2202,12 @@ static int prueth_probe(struct platform_device *pdev)
 exit_iep:
 	if (prueth->pdata.quirk_10m_link_issue)
 		icss_iep_exit_fw(prueth->iep1);
+	icss_iep_put(prueth->iep1);
+
+put_iep0:
+	icss_iep_put(prueth->iep0);
+	prueth->iep0 = NULL;
+	prueth->iep1 = NULL;
 
 free_pool:
 	gen_pool_free(prueth->sram_pool,
-- 
2.35.3

^ permalink raw reply related

* Re: [RFC Draft net-next] docs: netdev: add section on using lei to manage netdev mail volume
From: Matthieu Baerts @ 2023-11-06 11:24 UTC (permalink / raw)
  To: David Wei; +Cc: netdev, Jakub Kicinski
In-Reply-To: <20231105185014.2523447-1-dw@davidwei.uk>

Hi David,

On 05/11/2023 19:50, David Wei wrote:
> As a beginner to netdev I found the volume of mail to be overwhelming. I only
> want to focus on core netdev changes and ignore most driver changes. I found a
> way to do this using lei, filtering the mailing list using lore's query
> language and writing the results into an IMAP server.

I agree that the volume of mail is too high with a variety of subjects.
That's why it is very important to CC the right people (as mentioned by
Patchwork [1] ;) )

[1]
https://patchwork.kernel.org/project/netdevbpf/patch/20231105185014.2523447-1-dw@davidwei.uk/

> This patch is an RFC draft of updating the maintainer-netdev documentation with
> this information in the hope of helping out others in the future.

Note that I'm also using lei to filter emails, e.g. to be notified when
someone sends a patch modifying this maintainer-netdev.rst file! [2]

But I don't think this issue of "busy mailing list" is specific to
netdev. It seems that "lei" is already mentioned in another part of the
doc [3]. Maybe this part can be improved? Or the netdev doc could add a
reference to the existing part?

(Maybe such info should be present elsewhere, e.g. on vger [4] or lore)

[2]
https://lore.kernel.org/netdev/?q=%28dfn%3ADocumentation%2Fnetworking%2Fnetdev-FAQ.rst+OR+dfn%3ADocumentation%2Fprocess%2Fmaintainer-netdev.rst%29+AND+rt%3A1.month.ago..
[3]
https://docs.kernel.org/maintainer/feature-and-driver-maintainers.html#mailing-list-participation
[4] http://vger.kernel.org/vger-lists.html

(Note: regarding the commit message here, each line should be limited to
max 72 chars ideally)

> Signed-off-by: David Wei <dw@davidwei.uk>
> ---
>  Documentation/process/maintainer-netdev.rst | 39 +++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/Documentation/process/maintainer-netdev.rst b/Documentation/process/maintainer-netdev.rst
> index 7feacc20835e..93851783de6f 100644
> --- a/Documentation/process/maintainer-netdev.rst
> +++ b/Documentation/process/maintainer-netdev.rst
> @@ -33,6 +33,45 @@ Aside from subsystems like those mentioned above, all network-related
>  Linux development (i.e. RFC, review, comments, etc.) takes place on
>  netdev.
>  
> +Managing emails
> +~~~~~~~~~~~~~~~
> +
> +netdev is a busy mailing list with on average over 200 emails received per day,
> +which can be overwhelming to beginners. Rather than subscribing to the entire
> +list, considering using ``lei`` to only subscribe to topics that you are
> +interested in. Konstantin Ryabitsev wrote excellent tutorials on using ``lei``:
> +
> + - https://people.kernel.org/monsieuricon/lore-lei-part-1-getting-started
> + - https://people.kernel.org/monsieuricon/lore-lei-part-2-now-with-imap
> +
> +As a netdev beginner, you may want to filter out driver changes and only focus
> +on core netdev changes. Try using the following query with ``lei q``::
> +
> +  lei q -o ~/Mail/netdev \
> +    -I https://lore.kernel.org/all \
> +    -t '(b:b/net/* AND tc:netdev@vger.kernel.org AND rt:2.week.ago..'

Small optimisations:

- you can remove tc:netdev@vger.kernel.org and modify the '-I' to
restrict to netdev instead of querying 'all': -I
https://lore.kernel.org/netdev/

- In theory, 'dfn:' should help you to match a filename being modified.
But in your case, 'net' is too generic, and I don't think we can specify
"starting with 'net'". You can still omit some results after [5] but the
syntax doesn't look better :)

  dfn:net AND NOT dfn:drivers/net AND NOT dfn:selftests/net AND NOT
dfn:tools/net AND rt:2.week.ago..

[5]
https://lore.kernel.org/netdev/?q=dfn%3Anet+AND+NOT+dfn%3Adrivers%2Fnet+AND+NOT+dfn%3Aselftests%2Fnet+AND+NOT+dfn%3Atools%2Fnet+AND+rt%3A2.week.ago..

> +This query will only match threads containing messages with patches that modify
> +files in ``net/*``. For more information on the query language, see:
> +
> +  https://lore.kernel.org/linux-btrfs/_/text/help/

(if this is specific to 'netdev', best to use '/netdev/', not
'/linux-btrfs/')

> +By default ``lei`` will output to a Maildir, but it also supports Mbox and IMAP
> +by adding a prefix to the output directory ``-o``. For a list of supported
> +formats and prefix strings, see:
> +
> +  https://www.mankier.com/1/lei-q

Maybe safer to point to the official doc?

https://public-inbox.org/lei-q.html

(or 'man lei-q')

> +If you would like to use IMAP, Konstantin’s blog is slightly outdated and you
> +no longer need to use here strings i.e. ``<<<`` or ``<<EOF``.

I think we can still use them. In the part 1, they are not used. Maybe
best to contact Konstantin to update his blog post instead of mentioning
in the doc that the blog post is outdated?

> You can simply
> +point lei at an IMAP server e.g. ``imaps://imap.gmail.com``::

In Konstantin's blog post, he mentioned different servers with different
specificities. Maybe easier to just point to that instead of taking one
example without more explanations?

Cheers,
Matt

^ permalink raw reply

* Re: [PATCH] net: usbnet: Fix potential NULL pointer dereference
From: Bjørn Mork @ 2023-11-06 10:55 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Ren Mingshuai, kuba, caowangbao, davem, khlebnikov, liaichun,
	linux-kernel, netdev, yanan
In-Reply-To: <80af8b7a-c543-4386-bb0c-a356189581a0@suse.com>

Oliver Neukum <oneukum@suse.com> writes:

> yes it looks like NCM does funky things, but what does that mean?
>
> ndp_to_end_store()
>
>         /* flush pending data before changing flag */
>         netif_tx_lock_bh(dev->net);
>         usbnet_start_xmit(NULL, dev->net);
>         spin_lock_bh(&ctx->mtx);
>         if (enable)
>
> expects some odd semantics from it. The proposed patch simply
> increases the drop counter, which is by itself questionable, as
> we drop nothing.
>
> But it definitely does no IO, so we flush nothing.
> That is, we clearly have bug(s) but the patch only papers over
> them.
> And frankly, the basic question needs to be answered:
> Are you allowed to call ndo_start_xmit() with a NULL skb?
>
> My understanding until now was that you must not.

Yuck.  I see that I'm to blame for that code, so I've tried to figure
out what the idea behind it could possibly have been.

I believe that code is based on the (safe?) assumption that the struct
usbnet driver_info->tx_fixup points to cdc_ncm_tx_fixup().  And
cdc_ncm_tx_fixup does lots of weird stuff, including special handling of
NULL skb. It might return a valid skb for further processing by
usbnet_start_xmit().  If it doesn't, then we jump straight to
"not_drop", like we do when cdc_ncm_tx_fixup decides to eat the passed
skb.

But "funky" is i precise description of all this...  If someone feels
like it, then all that open coded skb queing inside cdc_ncm should be
completely rewritten.

Bjørn

^ permalink raw reply

* PRP with VLAN support - or how to contribute to a Linux network driver
From: Heiko Gerstung @ 2023-11-06 11:01 UTC (permalink / raw)
  To: netdev@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 859 bytes --]

Hi All,

we are looking for a way to use the hsr/prp driver in our products and found out that it does not support VLANs at the moment. As I can see, the hsr driver is marked as “orphan”, i.e. there is no active maintainer for it. 

I would like to discuss if it makes sense to remove the PRP functionality from the HSR driver (which is based on the bridge kernel module AFAICS) and instead implement PRP as a separate module (based on the Bonding driver, which would make more sense for PRP). We have a working implementation for such a module for 4.14 and would only need help in porting it to newer kernels. We would volunteer to maintain that kernel module (or sponsor someone who could).  

Hoping for advise what the next steps could be. Happy to discuss this off-list as it may not be of interest for most people. 

Thank you
  Heiko

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 8165 bytes --]

^ permalink raw reply

* Re: [PATCH bpf 1/2] bpf: sockmap, af_unix sockets need to hold ref for pair sock
From: Jakub Sitnicki @ 2023-11-06 10:15 UTC (permalink / raw)
  To: John Fastabend; +Cc: Kuniyuki Iwashima, bpf, martin.lau, netdev, yangyingliang
In-Reply-To: <6545bc9f7e443_3358c208ae@john.notmuch>

On Fri, Nov 03, 2023 at 08:38 PM -07, John Fastabend wrote:
> Jakub Sitnicki wrote:
>> On Fri, Oct 27, 2023 at 10:38 AM -07, Kuniyuki Iwashima wrote:
>> > From: Jakub Sitnicki <jakub@cloudflare.com>
>> > Date: Fri, 27 Oct 2023 15:32:15 +0200
>> >> On Tue, Oct 24, 2023 at 02:39 PM -07, John Fastabend wrote:
>> >> > Jakub Sitnicki wrote:
>> >> >> On Mon, Oct 16, 2023 at 12:08 PM -07, John Fastabend wrote:
>> >> >> > AF_UNIX sockets are a paired socket. So sending on one of the pairs
>> >> >> > will lookup the paired socket as part of the send operation. It is
>> >> >> > possible however to put just one of the pairs in a BPF map. This
>> >> >> > currently increments the refcnt on the sock in the sockmap to
>> >> >> > ensure it is not free'd by the stack before sockmap cleans up its
>> >> >> > state and stops any skbs being sent/recv'd to that socket.
>> >> >> >
>> >> >> > But we missed a case. If the peer socket is closed it will be
>> >> >> > free'd by the stack. However, the paired socket can still be
>> >> >> > referenced from BPF sockmap side because we hold a reference
>> >> >> > there. Then if we are sending traffic through BPF sockmap to
>> >> >> > that socket it will try to dereference the free'd pair in its
>> >> >> > send logic creating a use after free.  And following splat,
>> >> >> >
>> >> >> >    [59.900375] BUG: KASAN: slab-use-after-free in sk_wake_async+0x31/0x1b0
>> >> >> >    [59.901211] Read of size 8 at addr ffff88811acbf060 by task kworker/1:2/954
>> >> >> >    [...]
>> >> >> >    [59.905468] Call Trace:
>> >> >> >    [59.905787]  <TASK>
>> >> >> >    [59.906066]  dump_stack_lvl+0x130/0x1d0
>> >> >> >    [59.908877]  print_report+0x16f/0x740
>> >> >> >    [59.910629]  kasan_report+0x118/0x160
>> >> >> >    [59.912576]  sk_wake_async+0x31/0x1b0
>> >> >> >    [59.913554]  sock_def_readable+0x156/0x2a0
>> >> >> >    [59.914060]  unix_stream_sendmsg+0x3f9/0x12a0
>> >> >> >    [59.916398]  sock_sendmsg+0x20e/0x250
>> >> >> >    [59.916854]  skb_send_sock+0x236/0xac0
>> >> >> >    [59.920527]  sk_psock_backlog+0x287/0xaa0
>> >> >> 
>> >> >> Isn't the problem here that unix_stream_sendmsg doesn't grab a ref to
>> >> >> peer sock? Unlike unix_dgram_sendmsg which uses the unix_peer_get
>> >> >> helper.
>> >> >
>> >> > It does by my read. In unix_stream_connect we have,
>> >> >
>> >> > 	sock_hold(sk);
>> >> > 	unix_peer(newsk)	= sk;
>> >> > 	newsk->sk_state		= TCP_ESTABLISHED;
>> >> >
>> >> > where it assigns the peer sock. unix_dgram_connect() also calls
>> >> > sock_hold() but through the path that does the socket lookup, such as
>> >> > unix_find_other().
>> >> >
>> >> > The problem I see is before the socket does the kfree on the
>> >> > sock we need to be sure the backlog is canceled and the skb list
>> >> > ingress_skb is purged. If we don't ensure this then the redirect
>> >> > will 
>> >> >
>> >> > My model is this,
>> >> >
>> >> >          s1            c1
>> >> > refcnt    1             1
>> >> > connect   2             2
>> >> > psock     3             3
>> >> > send(s1) ...
>> >> > close(s1) 2             1 <- close drops psock count also
>> >> > close(c1) 0             0
>> >> >
>> >> > The important bit here is the psock has a refcnt on the
>> >> > underlying sock (psock->sk) and wont dec that until after
>> >> > cancel_delayed_work_sync() completes. This ensures the
>> >> > backlog wont try to sendmsg() on that sock after we free
>> >> > it. We also check for SOCK_DEAD and abort to avoid sending
>> >> > over a socket that has been marked DEAD.
>> >> >
>> >> > So... After close(s1) the only thing keeping that sock
>> >> > around is c1. Then we close(c1) that call path is
>> >> >
>> >> >  unix_release
>> >> >    close() 
>> >> >    unix_release_sock()
>> >> >      skpair = unix_peer(sk);
>> >> >      ...
>> >> >      sock_put(skpair);  <- trouble here
>> >> >
>> >> > The release will call sock_put() on the pair socket and
>> >> > dec it to 0 where it gets free'd through sk_free(). But
>> >> > now the trouble is we haven't waited for cancel_delayed_work_sync()
>> >> > on the c1 socket yet so backlog can still run. When it does
>> >> > run it may try to send a pkg over socket s1. OK right up until
>> >> > the sendmsg(s1, ...) does a peer lookup and derefs the peer
>> >> > socket. The peer socket was free'd earlier so use after free. 
>> >> >
>> >> > The question I had originally was this is odd, we are allowing
>> >> > a sendmsg(s1) over a socket while its in unix_release(). We
>> >> > used to take the sock lock from the backlog that was dropped
>> >> > in the name of performance, but it creates these races.
>> >> >
>> >> > Other fixes I considered. First adding sock lock back to
>> >> > backlog. But that punishes the UDP and TCP cases that don't
>> >> > have this problem. Set the SOCK_DEAD flag earlier or check
>> >> > later but this just makes the race smaller doesn't really
>> >> > eliminate it.
>> >> >
>> >> > So this patch is what I came up with. 
>> >> 
>> >> What I was getting at is that we could make it safe to call sendmsg on a
>> >> unix stream sock while its peer is being release. And not just for
>> >> sockmap. I expect io_uring might have the same problem. But I didn't
>> >> actually check yet.
>> >> 
>> >> For that we could keep a ref to peer for the duration of sendmsg call,
>> >> like unix dgram does. Then 'other' doesn't become a stale pointer before
>> >> we're done with it.
>> >> 
>> >> Bumping ref count on each sendmsg is not free, but maybe its
>> >> acceptable. Unix dgram sockets live with it.
>> >
>> > The reason why only dgram sk needs sock_hold() for each sendmsg() is
>> > that dgram sk can send data without connect().  unix_peer_get() in
>> > unix_dgram_sendmsg() is to reuse the same code when peer is not set.
>> >
>> > unix_stream_sendmsg() already holds a necessary refcnt and need not
>> > use sock_hold() there.
>> >
>> > The user who touches a peer without lookup must hold refcnt beforehand.
>
> Hi, we probably do need to get a fix for this. syzkaller hit it again
> and anyways it likely will crash some real systems if folks try to use
> it with enough systems.
>
>> 
>> Right. And this ownership scheme works well for unix stream because, as
>> John nicely explained, we serialize close() and sendmsg() ops with sock
>> lock.
>> 
>> Here, however, we have a case of deferred work which holds a ref to sock
>> but does not grab the sock lock. While it is doing its thing, the sk
>> gets closed/released and we drop the skpair ref. And bam, UAF.
>> 
>> If it wasn't for the reference cycle between sk and skpair, we could
>> defer the skpari ref drop until sk_destruct callback. But we can't.
>> 
>> If grabbing a ref on skpair on each sendmsg turns out to be not a viable
>> option, I didn't run any benchmarks so can't say what's the penatly
>> like, the next best thing is RCU.
>
> I think it really would be best to stay out of the presumably hotpath
> here if we can.
>
> I had considered marking the socket SOCK_RCU_FREE which should then
> wait an rcu grace period. But then it wasn't clear to me that would
> completely solve the race. The psock still needs to do the 
> cancel_delayed_work_sync() and this is also done from the rcu call
> back on the psock. Withtout the extra reference iirc the concern
> was we would basically have two rcu callbacks running that have
> an ordering requirement which I don't think is ensured.
>

I've checked io_uring and it keeps a ref to the related file for the
lifetime the I/O request. So I think we are good there and it's only
sockmap that is "bleeding".

I agree it would be best not to hamper unix_stream sendmsg performance.

At the same time, I think we can easily make the extra ref grab in
unix_stream_sendmsg optional, keeping the fast path fast (modulo 2x a
branch op).

I realize RCU would be a bigger, riskier overhaul. Perhaps not worth it,
if there are no benefits for "the regular" unix_stream users.

If we can avoid managing more state in psock, that would be my
preference. But I don't want to block any fixes that users are waiting
for.

^ permalink raw reply

* Re: [PATCH net 0/4] vsock: fix server prevents clients from reconnecting
From: Stefano Garzarella @ 2023-11-06 10:50 UTC (permalink / raw)
  To: f.storniolo95
  Cc: luigi.leonardi, kvm, davem, edumazet, mst, imbrenda, kuba, asias,
	stefanha, pabeni, netdev, linux-kernel, virtualization
In-Reply-To: <20231103175551.41025-1-f.storniolo95@gmail.com>

On Fri, Nov 03, 2023 at 06:55:47PM +0100, f.storniolo95@gmail.com wrote:
>From: Filippo Storniolo <f.storniolo95@gmail.com>
>
>This patch series introduce fix and tests for the following vsock bug:
>If the same remote peer, using the same port, tries to connect
>to a server on a listening port more than once, the server will
>reject the connection, causing a "connection reset by peer"
>error on the remote peer. This is due to the presence of a
>dangling socket from a previous connection in both the connected
>and bound socket lists.
>The inconsistency of the above lists only occurs when the remote
>peer disconnects and the server remains active.
>This bug does not occur when the server socket is closed.
>
>More details on the first patch changelog.
>The remaining patches are refactoring and test.

Thanks for the fix and the test!

I only left a small comment in patch 2 which I don't think justifies a
v2 by itself though. If for some other reason you have to send a v2,
then maybe I would fix it.

I reviewed the series and ran the tests. Everything seems to be fine.

Thanks,
Stefano


^ permalink raw reply

* Re: [PATCH net 4/4] test/vsock: add dobule bind connect test
From: Stefano Garzarella @ 2023-11-06 10:48 UTC (permalink / raw)
  To: f.storniolo95
  Cc: luigi.leonardi, kvm, davem, edumazet, mst, imbrenda, kuba, asias,
	stefanha, pabeni, netdev, linux-kernel, virtualization
In-Reply-To: <20231103175551.41025-5-f.storniolo95@gmail.com>

On Fri, Nov 03, 2023 at 06:55:51PM +0100, f.storniolo95@gmail.com wrote:
>From: Filippo Storniolo <f.storniolo95@gmail.com>
>
>This add bind connect test which creates a listening server socket
>and tries to connect a client with a bound local port to it twice.
>
>Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com>
>Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
>Signed-off-by: Filippo Storniolo <f.storniolo95@gmail.com>
>---
> tools/testing/vsock/util.c       | 47 ++++++++++++++++++++++++++++++
> tools/testing/vsock/util.h       |  3 ++
> tools/testing/vsock/vsock_test.c | 50 ++++++++++++++++++++++++++++++++
> 3 files changed, 100 insertions(+)

LGTM!

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

>
>diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
>index 2fc96f29bdf2..ae2b33c21c45 100644
>--- a/tools/testing/vsock/util.c
>+++ b/tools/testing/vsock/util.c
>@@ -85,6 +85,48 @@ void vsock_wait_remote_close(int fd)
> 	close(epollfd);
> }
>
>+/* Bind to <bind_port>, connect to <cid, port> and return the file descriptor. */
>+int vsock_bind_connect(unsigned int cid, unsigned int port, unsigned int bind_port, int type)
>+{
>+	struct sockaddr_vm sa_client = {
>+		.svm_family = AF_VSOCK,
>+		.svm_cid = VMADDR_CID_ANY,
>+		.svm_port = bind_port,
>+	};
>+	struct sockaddr_vm sa_server = {
>+		.svm_family = AF_VSOCK,
>+		.svm_cid = cid,
>+		.svm_port = port,
>+	};
>+
>+	int client_fd, ret;
>+
>+	client_fd = socket(AF_VSOCK, type, 0);
>+	if (client_fd < 0) {
>+		perror("socket");
>+		exit(EXIT_FAILURE);
>+	}
>+
>+	if (bind(client_fd, (struct sockaddr *)&sa_client, sizeof(sa_client))) {
>+		perror("bind");
>+		exit(EXIT_FAILURE);
>+	}
>+
>+	timeout_begin(TIMEOUT);
>+	do {
>+		ret = connect(client_fd, (struct sockaddr *)&sa_server, sizeof(sa_server));
>+		timeout_check("connect");
>+	} while (ret < 0 && errno == EINTR);
>+	timeout_end();
>+
>+	if (ret < 0) {
>+		perror("connect");
>+		exit(EXIT_FAILURE);
>+	}
>+
>+	return client_fd;
>+}
>+
> /* Connect to <cid, port> and return the file descriptor. */
> static int vsock_connect(unsigned int cid, unsigned int port, int type)
> {
>@@ -223,6 +265,11 @@ int vsock_stream_accept(unsigned int cid, unsigned int port,
> 	return vsock_accept(cid, port, clientaddrp, SOCK_STREAM);
> }
>
>+int vsock_stream_listen(unsigned int cid, unsigned int port)
>+{
>+	return vsock_listen(cid, port, SOCK_STREAM);
>+}
>+
> int vsock_seqpacket_accept(unsigned int cid, unsigned int port,
> 			   struct sockaddr_vm *clientaddrp)
> {
>diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
>index a77175d25864..03c88d0cb861 100644
>--- a/tools/testing/vsock/util.h
>+++ b/tools/testing/vsock/util.h
>@@ -36,9 +36,12 @@ struct test_case {
> void init_signals(void);
> unsigned int parse_cid(const char *str);
> int vsock_stream_connect(unsigned int cid, unsigned int port);
>+int vsock_bind_connect(unsigned int cid, unsigned int port,
>+		       unsigned int bind_port, int type);
> int vsock_seqpacket_connect(unsigned int cid, unsigned int port);
> int vsock_stream_accept(unsigned int cid, unsigned int port,
> 			struct sockaddr_vm *clientaddrp);
>+int vsock_stream_listen(unsigned int cid, unsigned int port);
> int vsock_seqpacket_accept(unsigned int cid, unsigned int port,
> 			   struct sockaddr_vm *clientaddrp);
> void vsock_wait_remote_close(int fd);
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index c1f7bc9abd22..5b0e93f9996c 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -1180,6 +1180,51 @@ static void test_stream_shutrd_server(const struct test_opts *opts)
> 	close(fd);
> }
>
>+static void test_double_bind_connect_server(const struct test_opts *opts)
>+{
>+	int listen_fd, client_fd, i;
>+	struct sockaddr_vm sa_client;
>+	socklen_t socklen_client = sizeof(sa_client);
>+
>+	listen_fd = vsock_stream_listen(VMADDR_CID_ANY, 1234);
>+
>+	for (i = 0; i < 2; i++) {
>+		control_writeln("LISTENING");
>+
>+		timeout_begin(TIMEOUT);
>+		do {
>+			client_fd = accept(listen_fd, (struct sockaddr *)&sa_client,
>+					   &socklen_client);
>+			timeout_check("accept");
>+		} while (client_fd < 0 && errno == EINTR);
>+		timeout_end();
>+
>+		if (client_fd < 0) {
>+			perror("accept");
>+			exit(EXIT_FAILURE);
>+		}
>+
>+		/* Waiting for remote peer to close connection */
>+		vsock_wait_remote_close(client_fd);
>+	}
>+
>+	close(listen_fd);
>+}
>+
>+static void test_double_bind_connect_client(const struct test_opts *opts)
>+{
>+	int i, client_fd;
>+
>+	for (i = 0; i < 2; i++) {
>+		/* Wait until server is ready to accept a new connection */
>+		control_expectln("LISTENING");
>+
>+		client_fd = vsock_bind_connect(opts->peer_cid, 1234, 4321, SOCK_STREAM);
>+
>+		close(client_fd);
>+	}
>+}
>+
> static struct test_case test_cases[] = {
> 	{
> 		.name = "SOCK_STREAM connection reset",
>@@ -1285,6 +1330,11 @@ static struct test_case test_cases[] = {
> 		.run_client = test_stream_msgzcopy_empty_errq_client,
> 		.run_server = test_stream_msgzcopy_empty_errq_server,
> 	},
>+	{
>+		.name = "SOCK_STREAM double bind connect",
>+		.run_client = test_double_bind_connect_client,
>+		.run_server = test_double_bind_connect_server,
>+	},
> 	{},
> };
>
>-- 
>2.41.0
>


^ permalink raw reply

* Re: [PATCH net 3/4] test/vsock: refactor vsock_accept
From: Stefano Garzarella @ 2023-11-06 10:47 UTC (permalink / raw)
  To: f.storniolo95
  Cc: luigi.leonardi, kvm, davem, edumazet, mst, imbrenda, kuba, asias,
	stefanha, pabeni, netdev, linux-kernel, virtualization
In-Reply-To: <20231103175551.41025-4-f.storniolo95@gmail.com>

On Fri, Nov 03, 2023 at 06:55:50PM +0100, f.storniolo95@gmail.com wrote:
>From: Filippo Storniolo <f.storniolo95@gmail.com>
>
>This is a preliminary patch to introduce SOCK_STREAM bind connect test.
>vsock_accept() is split into vsock_listen() and vsock_accept().
>
>Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com>
>Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
>Signed-off-by: Filippo Storniolo <f.storniolo95@gmail.com>
>---
> tools/testing/vsock/util.c | 32 ++++++++++++++++++++------------
> 1 file changed, 20 insertions(+), 12 deletions(-)

LGTM!

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

>
>diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
>index 698b0b44a2ee..2fc96f29bdf2 100644
>--- a/tools/testing/vsock/util.c
>+++ b/tools/testing/vsock/util.c
>@@ -136,11 +136,8 @@ int vsock_seqpacket_connect(unsigned int cid, unsigned int port)
> 	return vsock_connect(cid, port, SOCK_SEQPACKET);
> }
>
>-/* Listen on <cid, port> and return the first incoming connection.  The remote
>- * address is stored to clientaddrp.  clientaddrp may be NULL.
>- */
>-static int vsock_accept(unsigned int cid, unsigned int port,
>-			struct sockaddr_vm *clientaddrp, int type)
>+/* Listen on <cid, port> and return the file descriptor. */
>+static int vsock_listen(unsigned int cid, unsigned int port, int type)
> {
> 	union {
> 		struct sockaddr sa;
>@@ -152,14 +149,7 @@ static int vsock_accept(unsigned int cid, unsigned int port,
> 			.svm_cid = cid,
> 		},
> 	};
>-	union {
>-		struct sockaddr sa;
>-		struct sockaddr_vm svm;
>-	} clientaddr;
>-	socklen_t clientaddr_len = sizeof(clientaddr.svm);
> 	int fd;
>-	int client_fd;
>-	int old_errno;
>
> 	fd = socket(AF_VSOCK, type, 0);
> 	if (fd < 0) {
>@@ -177,6 +167,24 @@ static int vsock_accept(unsigned int cid, unsigned int port,
> 		exit(EXIT_FAILURE);
> 	}
>
>+	return fd;
>+}
>+
>+/* Listen on <cid, port> and return the first incoming connection.  The remote
>+ * address is stored to clientaddrp.  clientaddrp may be NULL.
>+ */
>+static int vsock_accept(unsigned int cid, unsigned int port,
>+			struct sockaddr_vm *clientaddrp, int type)
>+{
>+	union {
>+		struct sockaddr sa;
>+		struct sockaddr_vm svm;
>+	} clientaddr;
>+	socklen_t clientaddr_len = sizeof(clientaddr.svm);
>+	int fd, client_fd, old_errno;
>+
>+	fd = vsock_listen(cid, port, type);
>+
> 	control_writeln("LISTENING");
>
> 	timeout_begin(TIMEOUT);
>-- 
>2.41.0
>


^ permalink raw reply

* Re: [PATCH net 2/4] test/vsock fix: add missing check on socket creation
From: Stefano Garzarella @ 2023-11-06 10:46 UTC (permalink / raw)
  To: f.storniolo95
  Cc: luigi.leonardi, kvm, davem, edumazet, mst, imbrenda, kuba, asias,
	stefanha, pabeni, netdev, linux-kernel, virtualization
In-Reply-To: <20231103175551.41025-3-f.storniolo95@gmail.com>

On Fri, Nov 03, 2023 at 06:55:49PM +0100, f.storniolo95@gmail.com wrote:
>From: Filippo Storniolo <f.storniolo95@gmail.com>
>
>Add check on socket() return value in vsock_listen()
>and vsock_connect()
>
>Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com>
>Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
>Signed-off-by: Filippo Storniolo <f.storniolo95@gmail.com>
>---
> tools/testing/vsock/util.c | 8 ++++++++
> 1 file changed, 8 insertions(+)

If you need to resend the entire series, maybe you can remove "fix"
from the commit title.

But it's a minor thing, so I would only change it if there's something
else that justifies sending a v2:

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

>
>diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
>index 92336721321a..698b0b44a2ee 100644
>--- a/tools/testing/vsock/util.c
>+++ b/tools/testing/vsock/util.c
>@@ -104,6 +104,10 @@ static int vsock_connect(unsigned int cid, unsigned int port, int type)
> 	control_expectln("LISTENING");
>
> 	fd = socket(AF_VSOCK, type, 0);
>+	if (fd < 0) {
>+		perror("socket");
>+		exit(EXIT_FAILURE);
>+	}
>
> 	timeout_begin(TIMEOUT);
> 	do {
>@@ -158,6 +162,10 @@ static int vsock_accept(unsigned int cid, unsigned int port,
> 	int old_errno;
>
> 	fd = socket(AF_VSOCK, type, 0);
>+	if (fd < 0) {
>+		perror("socket");
>+		exit(EXIT_FAILURE);
>+	}
>
> 	if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) {
> 		perror("bind");
>-- 
>2.41.0
>


^ permalink raw reply

* Re: [PATCH net 1/4] vsock/virtio: remove socket from connected/bound list on shutdown
From: Stefano Garzarella @ 2023-11-06 10:43 UTC (permalink / raw)
  To: f.storniolo95
  Cc: luigi.leonardi, kvm, davem, edumazet, mst, imbrenda, kuba, asias,
	stefanha, pabeni, netdev, linux-kernel, virtualization,
	Daan De Meyer
In-Reply-To: <20231103175551.41025-2-f.storniolo95@gmail.com>

On Fri, Nov 03, 2023 at 06:55:48PM +0100, f.storniolo95@gmail.com wrote:
>From: Filippo Storniolo <f.storniolo95@gmail.com>
>
>If the same remote peer, using the same port, tries to connect
>to a server on a listening port more than once, the server will
>reject the connection, causing a "connection reset by peer"
>error on the remote peer. This is due to the presence of a
>dangling socket from a previous connection in both the connected
>and bound socket lists.
>The inconsistency of the above lists only occurs when the remote
>peer disconnects and the server remains active.
>
>This bug does not occur when the server socket is closed:
>virtio_transport_release() will eventually schedule a call to
>virtio_transport_do_close() and the latter will remove the socket
>from the bound and connected socket lists and clear the sk_buff.
>
>However, virtio_transport_do_close() will only perform the above
>actions if it has been scheduled, and this will not happen
>if the server is processing the shutdown message from a remote peer.
>
>To fix this, introduce a call to vsock_remove_sock()
>when the server is handling a client disconnect.
>This is to remove the socket from the bound and connected socket
>lists without clearing the sk_buff.
>
>Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
>Reported-by: Daan De Meyer <daan.j.demeyer@gmail.com>
>Tested-by: Daan De Meyer <daan.j.demeyer@gmail.com>
>Co-developed-by: Luigi Leonardi <luigi.leonardi@outlook.com>
>Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com>
>Signed-off-by: Filippo Storniolo <f.storniolo95@gmail.com>
>---
> net/vmw_vsock/virtio_transport_common.c | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index e22c81435ef7..4c595dd1fd64 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -1369,11 +1369,17 @@ virtio_transport_recv_connected(struct sock *sk,
> 			vsk->peer_shutdown |= RCV_SHUTDOWN;
> 		if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SHUTDOWN_SEND)
> 			vsk->peer_shutdown |= SEND_SHUTDOWN;
>-		if (vsk->peer_shutdown == SHUTDOWN_MASK &&
>-		    vsock_stream_has_data(vsk) <= 0 &&
>-		    !sock_flag(sk, SOCK_DONE)) {
>-			(void)virtio_transport_reset(vsk, NULL);
>-			virtio_transport_do_close(vsk, true);
>+		if (vsk->peer_shutdown == SHUTDOWN_MASK) {
>+			if (vsock_stream_has_data(vsk) <= 0 && !sock_flag(sk, SOCK_DONE)) {
>+				(void)virtio_transport_reset(vsk, NULL);
>+				virtio_transport_do_close(vsk, true);
>+			}
>+			/* Remove this socket anyway because the remote peer sent
>+			 * the shutdown. This way a new connection will succeed
>+			 * if the remote peer uses the same source port,
>+			 * even if the old socket is still unreleased, but now disconnected.
>+			 */
>+			vsock_remove_sock(vsk);
> 		}
> 		if (le32_to_cpu(virtio_vsock_hdr(skb)->flags))
> 			sk->sk_state_change(sk);
>-- 
>2.41.0
>

Thanks for fixing this issue! LGTM.

Just to inform other maintainers as well. Daan reported this issue to me
at DevConf.cz, I shared it with Filippo and Luigi who analyzed and
solved it.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>



^ permalink raw reply

* Re: [PATCH] rfkill: return ENOTTY on invalid ioctl
From: Przemek Kitszel @ 2023-11-06 10:30 UTC (permalink / raw)
  To: Thomas Weißschuh
  Cc: Johannes Berg, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-wireless, netdev, linux-kernel
In-Reply-To: <613039a4-41af-48ff-8113-3b0ee8077bcf@t-8ch.de>

On 11/2/23 20:14, Thomas Weißschuh wrote:
> Hi!
> 
> On 2023-11-02 09:57:45+0100, Przemek Kitszel wrote:
>> On 11/1/23 20:41, Thomas Weißschuh wrote:
>>> For unknown ioctls the correct error is
>>> ENOTTY "Inappropriate ioctl for device".
>>
>> For sure!
>>
>> I would like to learn more of why this is not an UAPI breaking change?
> 
> "break" would mean that some user application worked correctly before
> but does not do so anymore with this change.
> 
> This seems highly unlikely and I was not able to find such an
> application via Debian code search.
> 
> In general I did *not* mark this change for stable so if some
> application would indeed break it gets detected before the patch hits
> a release.
> 
>>>
>>> ENOSYS as returned before should only be used to indicate that a syscall
>>> is not available at all.
>>>
>>> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
>>> ---
>>>    net/rfkill/core.c | 4 ++--
>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/rfkill/core.c b/net/rfkill/core.c
>>> index 14cc8fe8584b..c3feb4f49d09 100644
>>> --- a/net/rfkill/core.c
>>> +++ b/net/rfkill/core.c
>>> @@ -1351,11 +1351,11 @@ static long rfkill_fop_ioctl(struct file *file, unsigned int cmd,
>>>    			     unsigned long arg)
>>>    {
>>>    	struct rfkill_data *data = file->private_data;
>>> -	int ret = -ENOSYS;
>>> +	int ret = -ENOTTY;
>>>    	u32 size;
>>>    	if (_IOC_TYPE(cmd) != RFKILL_IOC_MAGIC)
>>> -		return -ENOSYS;
>>> +		return -ENOTTY;
>>>    	mutex_lock(&data->mtx);
>>>    	switch (_IOC_NR(cmd)) {
>>>
>>> ---
>>> base-commit: 7d461b291e65938f15f56fe58da2303b07578a76
>>> change-id: 20231101-rfkill-ioctl-enosys-00a2bb0a4ab1
>>>
>>> Best regards,
>>

Thanks!

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>

^ permalink raw reply

* Re: [PATCH] net: usbnet: Fix potential NULL pointer dereference
From: Oliver Neukum @ 2023-11-06 10:18 UTC (permalink / raw)
  To: Ren Mingshuai, kuba, Bjørn Mork
  Cc: caowangbao, davem, khlebnikov, liaichun, linux-kernel, netdev,
	oneukum, yanan
In-Reply-To: <20231102090630.938759-1-renmingshuai@huawei.com>



On 02.11.23 10:06, Ren Mingshuai wrote:
>>>> 23ba07991dad said SKB can be NULL without describing the triggering
>>>> scenario. Always Check it before dereference to void potential NULL
>>>> pointer dereference.
>>> I've tried to find out the scenarios where SKB is NULL, but failed.
>>> It seems impossible for SKB to be NULL. If SKB can be NULL, please
>>> tell me the reason and I'd be very grateful.
>>
>> What do you mean? Grepping the function name shows call sites with NULL getting passed as skb.
> 
> Yes And I just learned that during the cdc_ncm_driver.probe, it is possible to pass a NULL SKB to usbnet_start_xmit().

Hi,

yes it looks like NCM does funky things, but what does that mean?

ndp_to_end_store()

         /* flush pending data before changing flag */
         netif_tx_lock_bh(dev->net);
         usbnet_start_xmit(NULL, dev->net);
         spin_lock_bh(&ctx->mtx);
         if (enable)

expects some odd semantics from it. The proposed patch simply
increases the drop counter, which is by itself questionable, as
we drop nothing.

But it definitely does no IO, so we flush nothing.
That is, we clearly have bug(s) but the patch only papers over
them.
And frankly, the basic question needs to be answered:
Are you allowed to call ndo_start_xmit() with a NULL skb?

My understanding until now was that you must not.

	Regards
		Oliver



^ permalink raw reply

* Re: [PATCH net v2] net: ti: icssg-prueth: Add missing icss_iep_put to error path
From: Wojciech Drewek @ 2023-11-06 10:15 UTC (permalink / raw)
  To: Jan Kiszka, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, MD Danish Anwar
  Cc: netdev, linux-kernel, Lopes Ivo, Diogo Miguel (T CED IFD-PT),
	Nishanth Menon, Su, Bao Cheng (RC-CN DF FA R&D)
In-Reply-To: <c8081537-f26a-4147-83f3-0c890e824192@siemens.com>



On 05.11.2023 10:51, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> Analogously to prueth_remove.
> 
> Fixes: 186734c15886 ("net: ti: icssg-prueth: add packet timestamping and ptp support")
> Fixes: 443a2367ba3c ("net: ti: icssg-prueth: am65x SR2.0 add 10M full duplex support")
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
> 
> Changes in v2:
>  - add proper tags
> 
> This was lost from the TI SDK version while ripping out SR1.0 support - 
> which we are currently restoring for upstream.
> 
>  drivers/net/ethernet/ti/icssg/icssg_prueth.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> index 6c4b64227ac8..d119b2bb8158 100644
> --- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> +++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> @@ -2206,6 +2206,9 @@ static int prueth_probe(struct platform_device *pdev)
>  	if (prueth->pdata.quirk_10m_link_issue)
>  		icss_iep_exit_fw(prueth->iep1);
>  
> +	icss_iep_put(prueth->iep1);
> +	icss_iep_put(prueth->iep0);

We could improve it even more IMO.
If we fail to get prueth->iep0 we go to the free_pool (already done)
If we fail to get prueth->iep1 we should go to something like "put_iep0" which calls icss_iep_put(prueth->iep0);
And in case of exit_iep we can call icss_iep_put(prueth->iep1);

> +
>  free_pool:
>  	gen_pool_free(prueth->sram_pool,
>  		      (unsigned long)prueth->msmcram.va, msmc_ram_size);

^ permalink raw reply

* Re: [PATCH net v2 0/3] bugfixs for smc
From: patchwork-bot+netdevbpf @ 2023-11-06 10:10 UTC (permalink / raw)
  To: D. Wythe
  Cc: kgraul, wenjia, jaka, wintera, kuba, davem, netdev, linux-s390,
	linux-rdma
In-Reply-To: <1698991660-82957-1-git-send-email-alibuda@linux.alibaba.com>

Hello:

This series was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:

On Fri,  3 Nov 2023 14:07:37 +0800 you wrote:
> From: "D. Wythe" <alibuda@linux.alibaba.com>
> 
> This patches includes bugfix following:
> 
> 1. hung state
> 2. sock leak
> 3. potential panic
> 
> [...]

Here is the summary with links:
  - [net,v2,1/3] net/smc: fix dangling sock under state SMC_APPFINCLOSEWAIT
    https://git.kernel.org/netdev/net/c/5211c9729484
  - [net,v2,2/3] net/smc: allow cdc msg send rather than drop it with NULL sndbuf_desc
    https://git.kernel.org/netdev/net/c/c5bf605ba4f9
  - [net,v2,3/3] net/smc: put sk reference if close work was canceled
    https://git.kernel.org/netdev/net/c/aa96fbd6d78d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net v2] net: ti: icssg-prueth: Fix error cleanup on failing pruss_request_mem_region
From: Wojciech Drewek @ 2023-11-06 10:07 UTC (permalink / raw)
  To: Jan Kiszka, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, MD Danish Anwar
  Cc: netdev, linux-kernel, Lopes Ivo, Diogo Miguel (T CED IFD-PT),
	Nishanth Menon, Su, Bao Cheng (RC-CN DF FA R&D)
In-Reply-To: <06ed13ca-9f52-4b49-9178-aae245bfd958@siemens.com>



On 05.11.2023 10:51, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> We were just continuing in this case, surely not desired.
> 
> Fixes: 128d5874c082 ("net: ti: icssg-prueth: Add ICSSG ethernet driver")
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>

> 
> Changes in v2:
>  - add proper tags
> 
>  drivers/net/ethernet/ti/icssg/icssg_prueth.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> index d119b2bb8158..845e8a782d3a 100644
> --- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> +++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
> @@ -2063,7 +2063,7 @@ static int prueth_probe(struct platform_device *pdev)
>  				       &prueth->shram);
>  	if (ret) {
>  		dev_err(dev, "unable to get PRUSS SHRD RAM2: %d\n", ret);
> -		pruss_put(prueth->pruss);
> +		goto put_pruss;
>  	}
>  
>  	prueth->sram_pool = of_gen_pool_get(np, "sram", 0);
> @@ -2215,6 +2215,8 @@ static int prueth_probe(struct platform_device *pdev)
>  
>  put_mem:
>  	pruss_release_mem_region(prueth->pruss, &prueth->shram);
> +
> +put_pruss:
>  	pruss_put(prueth->pruss);
>  
>  put_cores:

^ permalink raw reply

* Inquiry from MikolajGroup Hungary
From: Bozidar Damyan @ 2023-11-06  9:54 UTC (permalink / raw)
  To: netdev

Greetings from MikolajGroup.

We would like to know if you export to Hungary, as we need some 
of your products.

Please review the attached drawings and specifications and quote
with your best price and availability.

We would appreciate your prompt attention to this request, as we
are hoping to start our project as soon as possible.

If you require any further information or have any questions,
please do not hesitate to contact us.

Best regards
Bozidar Damyan
Commercial Assistant
Email: bozidar@mikolajgroup.com
Rákóczi út 34. Balatonudvari, Veszprém
8242 Hungary.

^ permalink raw reply

* Re: [PATCH] net: bcmasp: Use common error handling code in bcmasp_probe()
From: Wojciech Drewek @ 2023-11-06 10:02 UTC (permalink / raw)
  To: Markus Elfring, Julia Lawall, David S. Miller, Eric Dumazet,
	Florian Fainelli, Jakub Kicinski, Justin Chen, Paolo Abeni,
	bcm-kernel-feedback-list, netdev, kernel-janitors
  Cc: cocci, LKML
In-Reply-To: <0b2972cb-03b2-40c7-a728-6ebe2512637f@web.de>



On 05.11.2023 17:33, Markus Elfring wrote:
> From: Markus Elfring <elfring@users.sourceforge.net>
> Date: Sun, 5 Nov 2023 17:24:01 +0100
> 
> Add a jump target so that a bit of exception handling can be better
> reused at the end of this function.
> 
> Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
> ---
>  drivers/net/ethernet/broadcom/asp2/bcmasp.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/asp2/bcmasp.c b/drivers/net/ethernet/broadcom/asp2/bcmasp.c
> index 29b04a274d07..675437e44b94 100644
> --- a/drivers/net/ethernet/broadcom/asp2/bcmasp.c
> +++ b/drivers/net/ethernet/broadcom/asp2/bcmasp.c
> @@ -1304,9 +1304,8 @@ static int bcmasp_probe(struct platform_device *pdev)
>  		intf = bcmasp_interface_create(priv, intf_node, i);
>  		if (!intf) {
>  			dev_err(dev, "Cannot create eth interface %d\n", i);
> -			bcmasp_remove_intfs(priv);
>  			of_node_put(intf_node);
> -			goto of_put_exit;
> +			goto remove_intfs;
>  		}
>  		list_add_tail(&intf->list, &priv->intfs);
>  		i++;
> @@ -1331,8 +1330,7 @@ static int bcmasp_probe(struct platform_device *pdev)
>  			netdev_err(intf->ndev,
>  				   "failed to register net_device: %d\n", ret);
>  			priv->destroy_wol(priv);
> -			bcmasp_remove_intfs(priv);
> -			goto of_put_exit;
> +			goto remove_intfs;
>  		}
>  		count++;
>  	}
> @@ -1342,6 +1340,10 @@ static int bcmasp_probe(struct platform_device *pdev)
>  of_put_exit:
>  	of_node_put(ports_node);
>  	return ret;
> +
> +remove_intfs:
> +	bcmasp_remove_intfs(priv);
> +	goto of_put_exit;

Why is it at the end of the function? Just move it above of_put_exit and it will naturally 
go to the of_node_put call. No need for "goto of_put_exit".

>  }
> 
>  static void bcmasp_remove(struct platform_device *pdev)
> --
> 2.42.0
> 
> 

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH iwl-net] ice: fix DDP package download for packages without signature segment
From: Wojciech Drewek @ 2023-11-06  9:44 UTC (permalink / raw)
  To: Paul Greenwalt, intel-wired-lan
  Cc: Dan Nowlin, Maciej, netdev, jesse.brandeburg, Fijalkowski,
	anthony.l.nguyen, horms, kuba, davem
In-Reply-To: <20231104182908.15389-1-paul.greenwalt@intel.com>



On 04.11.2023 19:29, Paul Greenwalt wrote:
> From: Dan Nowlin <dan.nowlin@intel.com>
> 
> Commit 3cbdb0343022 ("ice: Add support for E830 DDP package segment")
> incorrectly removed support for package download for packages without a
> signature segment. These packages include the signature buffer inline
> in the configurations buffers, and do not in a signature segment.
> 
> Fix package download by providing download support for both packages
> with (ice_download_pkg_with_sig_seg()) and without signature segment
> (ice_download_pkg_without_sig_seg()).
> 
> Fixes: 3cbdb0343022 ("ice: Add support for E830 DDP package segment")
> Reported-by: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>
> Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>

> ---
>  drivers/net/ethernet/intel/ice/ice_ddp.c | 106 ++++++++++++++++++++++-
>  1 file changed, 103 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_ddp.c b/drivers/net/ethernet/intel/ice/ice_ddp.c
> index cfb1580f5850..3f1a11d0252c 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ddp.c
> +++ b/drivers/net/ethernet/intel/ice/ice_ddp.c
> @@ -1479,14 +1479,14 @@ ice_post_dwnld_pkg_actions(struct ice_hw *hw)
>  }
>  
>  /**
> - * ice_download_pkg
> + * ice_download_pkg_with_sig_seg
>   * @hw: pointer to the hardware structure
>   * @pkg_hdr: pointer to package header
>   *
>   * Handles the download of a complete package.
>   */
>  static enum ice_ddp_state
> -ice_download_pkg(struct ice_hw *hw, struct ice_pkg_hdr *pkg_hdr)
> +ice_download_pkg_with_sig_seg(struct ice_hw *hw, struct ice_pkg_hdr *pkg_hdr)
>  {
>  	enum ice_aq_err aq_err = hw->adminq.sq_last_status;
>  	enum ice_ddp_state state = ICE_DDP_PKG_ERR;
> @@ -1519,6 +1519,106 @@ ice_download_pkg(struct ice_hw *hw, struct ice_pkg_hdr *pkg_hdr)
>  		state = ice_post_dwnld_pkg_actions(hw);
>  
>  	ice_release_global_cfg_lock(hw);
> +
> +	return state;
> +}
> +
> +/**
> + * ice_dwnld_cfg_bufs
> + * @hw: pointer to the hardware structure
> + * @bufs: pointer to an array of buffers
> + * @count: the number of buffers in the array
> + *
> + * Obtains global config lock and downloads the package configuration buffers
> + * to the firmware.
> + */
> +static enum ice_ddp_state
> +ice_dwnld_cfg_bufs(struct ice_hw *hw, struct ice_buf *bufs, u32 count)
> +{
> +	enum ice_ddp_state state = ICE_DDP_PKG_SUCCESS;
> +	struct ice_buf_hdr *bh;
> +	int status;
> +
> +	if (!bufs || !count)
> +		return ICE_DDP_PKG_ERR;
> +
> +	/* If the first buffer's first section has its metadata bit set
> +	 * then there are no buffers to be downloaded, and the operation is
> +	 * considered a success.
> +	 */
> +	bh = (struct ice_buf_hdr *)bufs;
> +	if (le32_to_cpu(bh->section_entry[0].type) & ICE_METADATA_BUF)
> +		return ICE_DDP_PKG_SUCCESS;
> +
> +	status = ice_acquire_global_cfg_lock(hw, ICE_RES_WRITE);
> +	if (status) {
> +		if (status == -EALREADY)
> +			return ICE_DDP_PKG_ALREADY_LOADED;
> +		return ice_map_aq_err_to_ddp_state(hw->adminq.sq_last_status);
> +	}
> +
> +	state = ice_dwnld_cfg_bufs_no_lock(hw, bufs, 0, count, true);
> +	if (!state)
> +		state = ice_post_dwnld_pkg_actions(hw);
> +
> +	ice_release_global_cfg_lock(hw);
> +
> +	return state;
> +}
> +
> +/**
> + * ice_download_pkg_without_sig_seg
> + * @hw: pointer to the hardware structure
> + * @ice_seg: pointer to the segment of the package to be downloaded
> + *
> + * Handles the download of a complete package without signature segment.
> + */
> +static enum ice_ddp_state
> +ice_download_pkg_without_sig_seg(struct ice_hw *hw, struct ice_seg *ice_seg)
> +{
> +	struct ice_buf_table *ice_buf_tbl;
> +	enum ice_ddp_state state;
> +
> +	ice_debug(hw, ICE_DBG_PKG, "Segment format version: %d.%d.%d.%d\n",
> +		  ice_seg->hdr.seg_format_ver.major,
> +		  ice_seg->hdr.seg_format_ver.minor,
> +		  ice_seg->hdr.seg_format_ver.update,
> +		  ice_seg->hdr.seg_format_ver.draft);
> +
> +	ice_debug(hw, ICE_DBG_PKG, "Seg: type 0x%X, size %d, name %s\n",
> +		  le32_to_cpu(ice_seg->hdr.seg_type),
> +		  le32_to_cpu(ice_seg->hdr.seg_size), ice_seg->hdr.seg_id);
> +
> +	ice_buf_tbl = ice_find_buf_table(ice_seg);
> +
> +	ice_debug(hw, ICE_DBG_PKG, "Seg buf count: %d\n",
> +		  le32_to_cpu(ice_buf_tbl->buf_count));
> +
> +	state = ice_dwnld_cfg_bufs(hw, ice_buf_tbl->buf_array,
> +				   le32_to_cpu(ice_buf_tbl->buf_count));
> +
> +	return state;
> +}
> +
> +/**
> + * ice_download_pkg
> + * @hw: pointer to the hardware structure
> + * @pkg_hdr: pointer to package header
> + * @ice_seg: pointer to the segment of the package to be downloaded
> + *
> + * Handles the download of a complete package.
> + */
> +static enum ice_ddp_state
> +ice_download_pkg(struct ice_hw *hw, struct ice_pkg_hdr *pkg_hdr,
> +		 struct ice_seg *ice_seg)
> +{
> +	enum ice_ddp_state state;
> +
> +	if (hw->pkg_has_signing_seg)
> +		state = ice_download_pkg_with_sig_seg(hw, pkg_hdr);
> +	else
> +		state = ice_download_pkg_without_sig_seg(hw, ice_seg);
> +
>  	ice_post_pkg_dwnld_vlan_mode_cfg(hw);
>  
>  	return state;
> @@ -2083,7 +2183,7 @@ enum ice_ddp_state ice_init_pkg(struct ice_hw *hw, u8 *buf, u32 len)
>  
>  	/* initialize package hints and then download package */
>  	ice_init_pkg_hints(hw, seg);
> -	state = ice_download_pkg(hw, pkg);
> +	state = ice_download_pkg(hw, pkg, seg);
>  	if (state == ICE_DDP_PKG_ALREADY_LOADED) {
>  		ice_debug(hw, ICE_DBG_INIT,
>  			  "package previously loaded - no work.\n");
> 
> base-commit: 016b9332a3346e97a6cacffea0f9dc10e1235a75

^ permalink raw reply

* [PATCH v2] drivers/net/ppp: use standard array-copy-function
From: Philipp Stanner @ 2023-11-06  9:16 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Stanislav Fomichev, Greg Kroah-Hartman, Benjamin Tissoires,
	Al Viro
  Cc: linux-ppp, netdev, linux-kernel, Philipp Stanner, Dave Airlie

In ppp_generic.c, memdup_user() is utilized to copy a userspace array.
This is done without an overflow-check, which is, however, not critical
because the multiplicands are an unsigned short and struct sock_filter,
which is currently of size 8.

Regardless, string.h now provides memdup_array_user(), a wrapper for
copying userspace arrays in a standardized manner, which has the
advantage of making it more obvious to the reader that an array is being
copied.
The wrapper additionally performs an obligatory overflow check, saving
the reader the effort of analyzing the potential for overflow, and
making the code a bit more robust in case of future changes to the
multiplicands len * size.

Replace memdup_user() with memdup_array_user().

Suggested-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
---
Changes in v2:
- Rename the commit and rephrase its message completely so that it
  becomes a) obvious that we're not fixing an actual overflow here and
  b) emphasize that the goal is increasing readability. (Al Viro)
---
 drivers/net/ppp/ppp_generic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index a9beacd552cf..0193af2d31c9 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -570,8 +570,8 @@ static struct bpf_prog *get_filter(struct sock_fprog *uprog)

 	/* uprog->len is unsigned short, so no overflow here */
 	fprog.len = uprog->len;
-	fprog.filter = memdup_user(uprog->filter,
-				   uprog->len * sizeof(struct sock_filter));
+	fprog.filter = memdup_array_user(uprog->filter,
+					 uprog->len, sizeof(struct sock_filter));
 	if (IS_ERR(fprog.filter))
 		return ERR_CAST(fprog.filter);

-- 
2.41.0

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox