From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: <mhal@rbox.co>
Cc: <cong.wang@bytedance.com>, <davem@davemloft.net>,
<edumazet@google.com>, <kuba@kernel.org>, <kuni1840@gmail.com>,
<kuniyu@amazon.com>, <netdev@vger.kernel.org>,
<pabeni@redhat.com>
Subject: Re: [PATCH v2 net 01/15] af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer.
Date: Wed, 19 Jun 2024 12:19:30 -0700 [thread overview]
Message-ID: <20240619191930.99009-1-kuniyu@amazon.com> (raw)
In-Reply-To: <17997c8f-bba1-4597-85c7-5d76de63a7a7@rbox.co>
From: Michal Luczaj <mhal@rbox.co>
Date: Wed, 19 Jun 2024 20:14:48 +0200
> On 6/17/24 20:21, Kuniyuki Iwashima wrote:
> > From: Michal Luczaj <mhal@rbox.co>
> > Date: Mon, 17 Jun 2024 01:28:52 +0200
> >> (...)
> >> Another AF_UNIX sockmap issue is with OOB. When OOB packet is sent, skb is
> >> added to recv queue, but also u->oob_skb is set. Here's the problem: when
> >> this skb goes through bpf_sk_redirect_map() and is moved between socks,
> >> oob_skb remains set on the original sock.
> >
> > Good catch!
> >
> >>
> >> [ 23.688994] WARNING: CPU: 2 PID: 993 at net/unix/garbage.c:351 unix_collect_queue+0x6c/0xb0
> >> [ 23.689019] CPU: 2 PID: 993 Comm: kworker/u32:13 Not tainted 6.10.0-rc2+ #137
> >> [ 23.689021] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
> >> [ 23.689024] Workqueue: events_unbound __unix_gc
> >> [ 23.689027] RIP: 0010:unix_collect_queue+0x6c/0xb0
> >>
> >> I wanted to write a patch, but then I realized I'm not sure what's the
> >> expected behaviour. Should the oob_skb setting follow to the skb's new sock
> >> or should it be dropped (similarly to what is happening today with
> >> scm_fp_list, i.e. redirect strips inflights)?
> >
> > The former will require large refactoring as we need to check if the
> > redirect happens for BPF_F_INGRESS and if the redirected sk is also
> > SOCK_STREAM etc.
> >
> > So, I'd go with the latter. Probably we can check if skb is u->oob_skb
> > and drop OOB data and retry next in unix_stream_read_skb(), and forbid
> > MSG_OOB in unix_bpf_recvmsg().
> > (...)
>
> Yeah, sounds reasonable. I'm just not sure I understand the retry part. For
> each skb_queue_tail() there's one ->sk_data_ready() (which does
> ->read_skb()). Why bother with a retry?
Exactly.
>
> This is what I was thinking:
>
When you post it, please make sure to CC bpf@ and sockmap maintainers too.
next prev parent reply other threads:[~2024-06-19 19:19 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-04 16:52 [PATCH v2 net 00/15] af_unix: Fix lockless access of sk->sk_state and others fields Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 01/15] af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer Kuniyuki Iwashima
2024-06-09 11:28 ` Michal Luczaj
2024-06-09 19:53 ` Kuniyuki Iwashima
2024-06-09 21:03 ` Kuniyuki Iwashima
2024-06-10 12:55 ` Michal Luczaj
2024-06-10 17:49 ` Kuniyuki Iwashima
2024-06-16 23:28 ` Michal Luczaj
2024-06-17 18:21 ` Kuniyuki Iwashima
2024-06-19 18:14 ` Michal Luczaj
2024-06-19 19:19 ` Kuniyuki Iwashima [this message]
2024-06-20 20:35 ` Michal Luczaj
2024-06-20 21:56 ` Kuniyuki Iwashima
2024-06-22 22:43 ` Michal Luczaj
2024-06-23 5:19 ` Kuniyuki Iwashima
2024-06-26 10:48 ` Michal Luczaj
2024-06-26 21:56 ` Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 02/15] af_unix: Annodate data-races around sk->sk_state for writers Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 03/15] af_unix: Annotate data-race of sk->sk_state in unix_inq_len() Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 04/15] af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll() Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 05/15] af_unix: Annotate data-race of sk->sk_state in unix_stream_connect() Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 06/15] af_unix: Annotate data-race of sk->sk_state in unix_accept() Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 07/15] af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg() Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 08/15] af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb() Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 09/15] af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 10/15] af_unix: Annotate data-races around sk->sk_sndbuf Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 11/15] af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 12/15] af_unix: Use unix_recvq_full_lockless() in unix_stream_connect() Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 13/15] af_unix: Use skb_queue_empty_lockless() in unix_release_sock() Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 14/15] af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen() Kuniyuki Iwashima
2024-06-04 16:52 ` [PATCH v2 net 15/15] af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill() Kuniyuki Iwashima
2024-06-06 11:10 ` [PATCH v2 net 00/15] af_unix: Fix lockless access of sk->sk_state and others fields patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240619191930.99009-1-kuniyu@amazon.com \
--to=kuniyu@amazon.com \
--cc=cong.wang@bytedance.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=kuni1840@gmail.com \
--cc=mhal@rbox.co \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).