From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stefano Garzarella <sgarzare@redhat.com>
Cc: "Hyunwoo Kim" <v4bel@theori.io>,
netdev@vger.kernel.org, "Simon Horman" <horms@kernel.org>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
linux-kernel@vger.kernel.org,
"Eric Dumazet" <edumazet@google.com>,
"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
"Wongi Lee" <qwerty@theori.io>,
"David S. Miller" <davem@davemloft.net>,
"Paolo Abeni" <pabeni@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Bobby Eshleman" <bobby.eshleman@bytedance.com>,
virtualization@lists.linux.dev,
"Eugenio Pérez" <eperezma@redhat.com>,
"Luigi Leonardi" <leonardi@redhat.com>,
bpf@vger.kernel.org, "Jakub Kicinski" <kuba@kernel.org>,
"Michal Luczaj" <mhal@rbox.co>,
kvm@vger.kernel.org, imv4bel@gmail.com
Subject: Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes
Date: Thu, 9 Jan 2025 04:06:18 -0500 [thread overview]
Message-ID: <20250109040604-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <77plpkw3mp4r3ue4ubmh4yhqfo777koiu65dqfqfxmjgc5uq57@aifi6mhtgtuj>
On Thu, Jan 09, 2025 at 10:01:31AM +0100, Stefano Garzarella wrote:
> On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote:
> > On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote:
> > > If the socket has been de-assigned or assigned to another transport,
> > > we must discard any packets received because they are not expected
> > > and would cause issues when we access vsk->transport.
> > >
> > > A possible scenario is described by Hyunwoo Kim in the attached link,
> > > where after a first connect() interrupted by a signal, and a second
> > > connect() failed, we can find `vsk->transport` at NULL, leading to a
> > > NULL pointer dereference.
> > >
> > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
> > > Reported-by: Hyunwoo Kim <v4bel@theori.io>
> > > Reported-by: Wongi Lee <qwerty@theori.io>
> > > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/
> > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > > ---
> > > net/vmw_vsock/virtio_transport_common.c | 7 +++++--
> > > 1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> > > index 9acc13ab3f82..51a494b69be8 100644
> > > --- a/net/vmw_vsock/virtio_transport_common.c
> > > +++ b/net/vmw_vsock/virtio_transport_common.c
> > > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t,
> > >
> > > lock_sock(sk);
> > >
> > > - /* Check if sk has been closed before lock_sock */
> > > - if (sock_flag(sk, SOCK_DONE)) {
> > > + /* Check if sk has been closed or assigned to another transport before
> > > + * lock_sock (note: listener sockets are not assigned to any transport)
> > > + */
> > > + if (sock_flag(sk, SOCK_DONE) ||
> > > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) {
> >
> > If a race scenario with vsock_listen() is added to the existing
> > race scenario, the patch can be bypassed.
> >
> > In addition to the existing scenario:
> > ```
> > cpu0 cpu1
> >
> > socket(A)
> >
> > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024})
> > vsock_bind()
> >
> > listen(A)
> > vsock_listen()
> > socket(B)
> >
> > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024})
> > vsock_connect()
> > lock_sock(sk);
> > virtio_transport_connect()
> > virtio_transport_connect()
> > virtio_transport_send_pkt_info()
> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST)
> > queue_work(vsock_loopback_work)
> > sk->sk_state = TCP_SYN_SENT;
> > release_sock(sk);
> > vsock_loopback_work()
> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST)
> > sk = vsock_find_bound_socket(&dst);
> > virtio_transport_recv_listen(sk, skb)
> > child = vsock_create_connected(sk);
> > vsock_assign_transport()
> > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL);
> > vsock_insert_connected(vchild);
> > list_add(&vsk->connected_table, list);
> > virtio_transport_send_response(vchild, skb);
> > virtio_transport_send_pkt_info()
> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE)
> > queue_work(vsock_loopback_work)
> >
> > vsock_loopback_work()
> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE)
> > sk = vsock_find_bound_socket(&dst);
> > lock_sock(sk);
> > case TCP_SYN_SENT:
> > virtio_transport_recv_connecting()
> > sk->sk_state = TCP_ESTABLISHED;
> > release_sock(sk);
> >
> > kill(connect(B));
> > lock_sock(sk);
> > if (signal_pending(current)) {
> > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE;
> > sock->state = SS_UNCONNECTED; // [1]
> > release_sock(sk);
> >
> > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024})
> > vsock_connect(B)
> > lock_sock(sk);
> > vsock_assign_transport()
> > virtio_transport_release()
> > virtio_transport_close()
> > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING))
> > virtio_transport_shutdown()
> > virtio_transport_send_pkt_info()
> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN)
> > queue_work(vsock_loopback_work)
> > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5]
> > vsock_deassign_transport()
> > vsk->transport = NULL;
> > return -ESOCKTNOSUPPORT;
> > release_sock(sk);
> > vsock_loopback_work()
> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN)
> > virtio_transport_recv_connected()
> > virtio_transport_reset()
> > virtio_transport_send_pkt_info()
> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST)
> > queue_work(vsock_loopback_work)
> > listen(B)
> > vsock_listen()
> > if (sock->state != SS_UNCONNECTED) // [2]
> > sk->sk_state = TCP_LISTEN; // [3]
> >
> > vsock_loopback_work()
> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST)
> > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4]
> > ...
> >
> > virtio_transport_close_timeout()
> > virtio_transport_do_close()
> > vsock_stream_has_data()
> > return vsk->transport->stream_has_data(vsk); // null-ptr-deref
> >
> > ```
> > (Yes, This is quite a crazy scenario, but it can actually be induced)
> >
> > Since sock->state is set to SS_UNCONNECTED during the first connect()[1],
> > it can pass the sock->state check[2] in vsock_listen() and set
> > sk->sk_state to TCP_LISTEN[3].
> > If this happens, the check in the patch with
> > `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can
> > still occur.)
> >
> > More specifically, because the sk_state has changed to TCP_LISTEN,
> > virtio_transport_recv_disconnecting() will not be called by the
> > loopback worker. However, a null-ptr-deref may occur in
> > virtio_transport_close_timeout(), which is scheduled by
> > virtio_transport_close() called in the flow of the second connect()[5].
> > (The patch no longer cancels the virtio_transport_close_timeout() worker)
> >
> > And even if the `sk->sk_state != TCP_LISTEN` check is removed from the
> > patch, it seems that a null-ptr-deref will still occur due to
> > virtio_transport_close_timeout().
> > It might be necessary to add worker cancellation at the
> > appropriate location.
>
> Thanks for the analysis!
>
> Do you have time to cook a proper patch to cover this scenario?
> Or we should mix this fix together with your patch (return 0 in
> vsock_stream_has_data()) while we investigate a better handling?
>
> Thanks,
> Stefano
better combine them imho.
next prev parent reply other threads:[~2025-01-09 9:06 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-08 18:06 [PATCH net 0/2] vsock: some fixes due to transport de-assignment Stefano Garzarella
2025-01-08 18:06 ` [PATCH net 1/2] vsock/virtio: discard packets if the transport changes Stefano Garzarella
2025-01-08 19:31 ` Hyunwoo Kim
2025-01-09 9:01 ` Stefano Garzarella
2025-01-09 9:06 ` Michael S. Tsirkin [this message]
2025-01-09 9:13 ` Hyunwoo Kim
2025-01-09 10:59 ` Stefano Garzarella
2025-01-09 11:10 ` Hyunwoo Kim
2025-01-09 13:34 ` Michal Luczaj
2025-01-09 13:42 ` Stefano Garzarella
2025-01-09 15:27 ` Michal Luczaj
2025-01-10 8:39 ` Stefano Garzarella
2025-01-21 17:30 ` Luigi Leonardi
2025-01-21 18:06 ` Michal Luczaj
2025-01-08 18:06 ` [PATCH net 2/2] vsock/bpf: return early if transport is not assigned Stefano Garzarella
2025-01-08 19:37 ` Hyunwoo Kim
2025-01-09 9:07 ` Michael S. Tsirkin
2025-01-09 9:24 ` Luigi Leonardi
2025-01-09 13:14 ` Michal Luczaj
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250109040604-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=bobby.eshleman@bytedance.com \
--cc=bpf@vger.kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eperezma@redhat.com \
--cc=horms@kernel.org \
--cc=imv4bel@gmail.com \
--cc=jasowang@redhat.com \
--cc=kuba@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=leonardi@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mhal@rbox.co \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=qwerty@theori.io \
--cc=sgarzare@redhat.com \
--cc=stefanha@redhat.com \
--cc=v4bel@theori.io \
--cc=virtualization@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.