* [PATCH net 0/2] vsock: some fixes due to transport de-assignment @ 2025-01-08 18:06 Stefano Garzarella 2025-01-08 18:06 ` [PATCH net 1/2] vsock/virtio: discard packets if the transport changes Stefano Garzarella 2025-01-08 18:06 ` [PATCH net 2/2] vsock/bpf: return early if transport is not assigned Stefano Garzarella 0 siblings, 2 replies; 19+ messages in thread From: Stefano Garzarella @ 2025-01-08 18:06 UTC (permalink / raw) To: netdev Cc: Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Stefano Garzarella, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, Michal Luczaj, kvm This series includes two patches discussed in the thread started by Hyunwoo Kim a few weeks ago [1]. The first patch is a fix more appropriate to the problem reported in that thread, the second patch on the other hand is a related fix but of a different problem highlighted by Michal Luczaj. It's present only in vsock_bpf and already handled in af_vsock.c Hyunwoo Kim, Michal, if you can test and report your Tested-by that would be great! [1] https://lore.kernel.org/netdev/Z2K%2FI4nlHdfMRTZC@v4bel-B760M-AORUS-ELITE-AX/ Stefano Garzarella (2): vsock/virtio: discard packets if the transport changes vsock/bpf: return early if transport is not assigned net/vmw_vsock/virtio_transport_common.c | 7 +++++-- net/vmw_vsock/vsock_bpf.c | 9 +++++++++ 2 files changed, 14 insertions(+), 2 deletions(-) -- 2.47.1 ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-08 18:06 [PATCH net 0/2] vsock: some fixes due to transport de-assignment Stefano Garzarella @ 2025-01-08 18:06 ` Stefano Garzarella 2025-01-08 19:31 ` Hyunwoo Kim 2025-01-09 13:34 ` Michal Luczaj 2025-01-08 18:06 ` [PATCH net 2/2] vsock/bpf: return early if transport is not assigned Stefano Garzarella 1 sibling, 2 replies; 19+ messages in thread From: Stefano Garzarella @ 2025-01-08 18:06 UTC (permalink / raw) To: netdev Cc: Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Stefano Garzarella, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, Michal Luczaj, kvm If the socket has been de-assigned or assigned to another transport, we must discard any packets received because they are not expected and would cause issues when we access vsk->transport. A possible scenario is described by Hyunwoo Kim in the attached link, where after a first connect() interrupted by a signal, and a second connect() failed, we can find `vsk->transport` at NULL, leading to a NULL pointer dereference. Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Reported-by: Hyunwoo Kim <v4bel@theori.io> Reported-by: Wongi Lee <qwerty@theori.io> Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> --- net/vmw_vsock/virtio_transport_common.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 9acc13ab3f82..51a494b69be8 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, lock_sock(sk); - /* Check if sk has been closed before lock_sock */ - if (sock_flag(sk, SOCK_DONE)) { + /* Check if sk has been closed or assigned to another transport before + * lock_sock (note: listener sockets are not assigned to any transport) + */ + if (sock_flag(sk, SOCK_DONE) || + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { (void)virtio_transport_reset_no_sock(t, skb); release_sock(sk); sock_put(sk); -- 2.47.1 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-08 18:06 ` [PATCH net 1/2] vsock/virtio: discard packets if the transport changes Stefano Garzarella @ 2025-01-08 19:31 ` Hyunwoo Kim 2025-01-09 9:01 ` Stefano Garzarella 2025-01-09 13:34 ` Michal Luczaj 1 sibling, 1 reply; 19+ messages in thread From: Hyunwoo Kim @ 2025-01-08 19:31 UTC (permalink / raw) To: Stefano Garzarella Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Michal Luczaj, kvm, v4bel, imv4bel On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: > If the socket has been de-assigned or assigned to another transport, > we must discard any packets received because they are not expected > and would cause issues when we access vsk->transport. > > A possible scenario is described by Hyunwoo Kim in the attached link, > where after a first connect() interrupted by a signal, and a second > connect() failed, we can find `vsk->transport` at NULL, leading to a > NULL pointer dereference. > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") > Reported-by: Hyunwoo Kim <v4bel@theori.io> > Reported-by: Wongi Lee <qwerty@theori.io> > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > --- > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c > index 9acc13ab3f82..51a494b69be8 100644 > --- a/net/vmw_vsock/virtio_transport_common.c > +++ b/net/vmw_vsock/virtio_transport_common.c > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, > > lock_sock(sk); > > - /* Check if sk has been closed before lock_sock */ > - if (sock_flag(sk, SOCK_DONE)) { > + /* Check if sk has been closed or assigned to another transport before > + * lock_sock (note: listener sockets are not assigned to any transport) > + */ > + if (sock_flag(sk, SOCK_DONE) || > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { If a race scenario with vsock_listen() is added to the existing race scenario, the patch can be bypassed. In addition to the existing scenario: ``` cpu0 cpu1 socket(A) bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) vsock_bind() listen(A) vsock_listen() socket(B) connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) vsock_connect() lock_sock(sk); virtio_transport_connect() virtio_transport_connect() virtio_transport_send_pkt_info() vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) queue_work(vsock_loopback_work) sk->sk_state = TCP_SYN_SENT; release_sock(sk); vsock_loopback_work() virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) sk = vsock_find_bound_socket(&dst); virtio_transport_recv_listen(sk, skb) child = vsock_create_connected(sk); vsock_assign_transport() vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); vsock_insert_connected(vchild); list_add(&vsk->connected_table, list); virtio_transport_send_response(vchild, skb); virtio_transport_send_pkt_info() vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) queue_work(vsock_loopback_work) vsock_loopback_work() virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) sk = vsock_find_bound_socket(&dst); lock_sock(sk); case TCP_SYN_SENT: virtio_transport_recv_connecting() sk->sk_state = TCP_ESTABLISHED; release_sock(sk); kill(connect(B)); lock_sock(sk); if (signal_pending(current)) { sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; sock->state = SS_UNCONNECTED; // [1] release_sock(sk); connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) vsock_connect(B) lock_sock(sk); vsock_assign_transport() virtio_transport_release() virtio_transport_close() if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) virtio_transport_shutdown() virtio_transport_send_pkt_info() vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) queue_work(vsock_loopback_work) schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] vsock_deassign_transport() vsk->transport = NULL; return -ESOCKTNOSUPPORT; release_sock(sk); vsock_loopback_work() virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) virtio_transport_recv_connected() virtio_transport_reset() virtio_transport_send_pkt_info() vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) queue_work(vsock_loopback_work) listen(B) vsock_listen() if (sock->state != SS_UNCONNECTED) // [2] sk->sk_state = TCP_LISTEN; // [3] vsock_loopback_work() virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] ... virtio_transport_close_timeout() virtio_transport_do_close() vsock_stream_has_data() return vsk->transport->stream_has_data(vsk); // null-ptr-deref ``` (Yes, This is quite a crazy scenario, but it can actually be induced) Since sock->state is set to SS_UNCONNECTED during the first connect()[1], it can pass the sock->state check[2] in vsock_listen() and set sk->sk_state to TCP_LISTEN[3]. If this happens, the check in the patch with `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can still occur. More specifically, because the sk_state has changed to TCP_LISTEN, virtio_transport_recv_disconnecting() will not be called by the loopback worker. However, a null-ptr-deref may occur in virtio_transport_close_timeout(), which is scheduled by virtio_transport_close() called in the flow of the second connect()[5]. (The patch no longer cancels the virtio_transport_close_timeout() worker) And even if the `sk->sk_state != TCP_LISTEN` check is removed from the patch, it seems that a null-ptr-deref will still occur due to virtio_transport_close_timeout(). It might be necessary to add worker cancellation at the appropriate location. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-08 19:31 ` Hyunwoo Kim @ 2025-01-09 9:01 ` Stefano Garzarella 2025-01-09 9:06 ` Michael S. Tsirkin 2025-01-09 9:13 ` Hyunwoo Kim 0 siblings, 2 replies; 19+ messages in thread From: Stefano Garzarella @ 2025-01-09 9:01 UTC (permalink / raw) To: Hyunwoo Kim Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Michal Luczaj, kvm, imv4bel On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote: >On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: >> If the socket has been de-assigned or assigned to another transport, >> we must discard any packets received because they are not expected >> and would cause issues when we access vsk->transport. >> >> A possible scenario is described by Hyunwoo Kim in the attached link, >> where after a first connect() interrupted by a signal, and a second >> connect() failed, we can find `vsk->transport` at NULL, leading to a >> NULL pointer dereference. >> >> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") >> Reported-by: Hyunwoo Kim <v4bel@theori.io> >> Reported-by: Wongi Lee <qwerty@theori.io> >> Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ >> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> >> --- >> net/vmw_vsock/virtio_transport_common.c | 7 +++++-- >> 1 file changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c >> index 9acc13ab3f82..51a494b69be8 100644 >> --- a/net/vmw_vsock/virtio_transport_common.c >> +++ b/net/vmw_vsock/virtio_transport_common.c >> @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, >> >> lock_sock(sk); >> >> - /* Check if sk has been closed before lock_sock */ >> - if (sock_flag(sk, SOCK_DONE)) { >> + /* Check if sk has been closed or assigned to another transport before >> + * lock_sock (note: listener sockets are not assigned to any transport) >> + */ >> + if (sock_flag(sk, SOCK_DONE) || >> + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { > >If a race scenario with vsock_listen() is added to the existing >race scenario, the patch can be bypassed. > >In addition to the existing scenario: >``` > cpu0 cpu1 > > socket(A) > > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) > vsock_bind() > > listen(A) > vsock_listen() > socket(B) > > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) > vsock_connect() > lock_sock(sk); > virtio_transport_connect() > virtio_transport_connect() > virtio_transport_send_pkt_info() > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) > queue_work(vsock_loopback_work) > sk->sk_state = TCP_SYN_SENT; > release_sock(sk); > vsock_loopback_work() > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) > sk = vsock_find_bound_socket(&dst); > virtio_transport_recv_listen(sk, skb) > child = vsock_create_connected(sk); > vsock_assign_transport() > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); > vsock_insert_connected(vchild); > list_add(&vsk->connected_table, list); > virtio_transport_send_response(vchild, skb); > virtio_transport_send_pkt_info() > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) > queue_work(vsock_loopback_work) > > vsock_loopback_work() > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) > sk = vsock_find_bound_socket(&dst); > lock_sock(sk); > case TCP_SYN_SENT: > virtio_transport_recv_connecting() > sk->sk_state = TCP_ESTABLISHED; > release_sock(sk); > > kill(connect(B)); > lock_sock(sk); > if (signal_pending(current)) { > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; > sock->state = SS_UNCONNECTED; // [1] > release_sock(sk); > > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) > vsock_connect(B) > lock_sock(sk); > vsock_assign_transport() > virtio_transport_release() > virtio_transport_close() > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) > virtio_transport_shutdown() > virtio_transport_send_pkt_info() > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > queue_work(vsock_loopback_work) > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] > vsock_deassign_transport() > vsk->transport = NULL; > return -ESOCKTNOSUPPORT; > release_sock(sk); > vsock_loopback_work() > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > virtio_transport_recv_connected() > virtio_transport_reset() > virtio_transport_send_pkt_info() > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) > queue_work(vsock_loopback_work) > listen(B) > vsock_listen() > if (sock->state != SS_UNCONNECTED) // [2] > sk->sk_state = TCP_LISTEN; // [3] > > vsock_loopback_work() > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] > ... > > virtio_transport_close_timeout() > virtio_transport_do_close() > vsock_stream_has_data() > return vsk->transport->stream_has_data(vsk); // null-ptr-deref > >``` >(Yes, This is quite a crazy scenario, but it can actually be induced) > >Since sock->state is set to SS_UNCONNECTED during the first connect()[1], >it can pass the sock->state check[2] in vsock_listen() and set >sk->sk_state to TCP_LISTEN[3]. >If this happens, the check in the patch with >`sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can >still occur.) > >More specifically, because the sk_state has changed to TCP_LISTEN, >virtio_transport_recv_disconnecting() will not be called by the >loopback worker. However, a null-ptr-deref may occur in >virtio_transport_close_timeout(), which is scheduled by >virtio_transport_close() called in the flow of the second connect()[5]. >(The patch no longer cancels the virtio_transport_close_timeout() worker) > >And even if the `sk->sk_state != TCP_LISTEN` check is removed from the >patch, it seems that a null-ptr-deref will still occur due to >virtio_transport_close_timeout(). >It might be necessary to add worker cancellation at the >appropriate location. Thanks for the analysis! Do you have time to cook a proper patch to cover this scenario? Or we should mix this fix together with your patch (return 0 in vsock_stream_has_data()) while we investigate a better handling? Thanks, Stefano ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-09 9:01 ` Stefano Garzarella @ 2025-01-09 9:06 ` Michael S. Tsirkin 2025-01-09 9:13 ` Hyunwoo Kim 1 sibling, 0 replies; 19+ messages in thread From: Michael S. Tsirkin @ 2025-01-09 9:06 UTC (permalink / raw) To: Stefano Garzarella Cc: Hyunwoo Kim, netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michal Luczaj, kvm, imv4bel On Thu, Jan 09, 2025 at 10:01:31AM +0100, Stefano Garzarella wrote: > On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote: > > On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: > > > If the socket has been de-assigned or assigned to another transport, > > > we must discard any packets received because they are not expected > > > and would cause issues when we access vsk->transport. > > > > > > A possible scenario is described by Hyunwoo Kim in the attached link, > > > where after a first connect() interrupted by a signal, and a second > > > connect() failed, we can find `vsk->transport` at NULL, leading to a > > > NULL pointer dereference. > > > > > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") > > > Reported-by: Hyunwoo Kim <v4bel@theori.io> > > > Reported-by: Wongi Lee <qwerty@theori.io> > > > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ > > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > > > --- > > > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c > > > index 9acc13ab3f82..51a494b69be8 100644 > > > --- a/net/vmw_vsock/virtio_transport_common.c > > > +++ b/net/vmw_vsock/virtio_transport_common.c > > > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, > > > > > > lock_sock(sk); > > > > > > - /* Check if sk has been closed before lock_sock */ > > > - if (sock_flag(sk, SOCK_DONE)) { > > > + /* Check if sk has been closed or assigned to another transport before > > > + * lock_sock (note: listener sockets are not assigned to any transport) > > > + */ > > > + if (sock_flag(sk, SOCK_DONE) || > > > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { > > > > If a race scenario with vsock_listen() is added to the existing > > race scenario, the patch can be bypassed. > > > > In addition to the existing scenario: > > ``` > > cpu0 cpu1 > > > > socket(A) > > > > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) > > vsock_bind() > > > > listen(A) > > vsock_listen() > > socket(B) > > > > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) > > vsock_connect() > > lock_sock(sk); > > virtio_transport_connect() > > virtio_transport_connect() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) > > queue_work(vsock_loopback_work) > > sk->sk_state = TCP_SYN_SENT; > > release_sock(sk); > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) > > sk = vsock_find_bound_socket(&dst); > > virtio_transport_recv_listen(sk, skb) > > child = vsock_create_connected(sk); > > vsock_assign_transport() > > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); > > vsock_insert_connected(vchild); > > list_add(&vsk->connected_table, list); > > virtio_transport_send_response(vchild, skb); > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > queue_work(vsock_loopback_work) > > > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > sk = vsock_find_bound_socket(&dst); > > lock_sock(sk); > > case TCP_SYN_SENT: > > virtio_transport_recv_connecting() > > sk->sk_state = TCP_ESTABLISHED; > > release_sock(sk); > > > > kill(connect(B)); > > lock_sock(sk); > > if (signal_pending(current)) { > > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; > > sock->state = SS_UNCONNECTED; // [1] > > release_sock(sk); > > > > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) > > vsock_connect(B) > > lock_sock(sk); > > vsock_assign_transport() > > virtio_transport_release() > > virtio_transport_close() > > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) > > virtio_transport_shutdown() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > queue_work(vsock_loopback_work) > > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] > > vsock_deassign_transport() > > vsk->transport = NULL; > > return -ESOCKTNOSUPPORT; > > release_sock(sk); > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > virtio_transport_recv_connected() > > virtio_transport_reset() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) > > queue_work(vsock_loopback_work) > > listen(B) > > vsock_listen() > > if (sock->state != SS_UNCONNECTED) // [2] > > sk->sk_state = TCP_LISTEN; // [3] > > > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) > > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] > > ... > > > > virtio_transport_close_timeout() > > virtio_transport_do_close() > > vsock_stream_has_data() > > return vsk->transport->stream_has_data(vsk); // null-ptr-deref > > > > ``` > > (Yes, This is quite a crazy scenario, but it can actually be induced) > > > > Since sock->state is set to SS_UNCONNECTED during the first connect()[1], > > it can pass the sock->state check[2] in vsock_listen() and set > > sk->sk_state to TCP_LISTEN[3]. > > If this happens, the check in the patch with > > `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can > > still occur.) > > > > More specifically, because the sk_state has changed to TCP_LISTEN, > > virtio_transport_recv_disconnecting() will not be called by the > > loopback worker. However, a null-ptr-deref may occur in > > virtio_transport_close_timeout(), which is scheduled by > > virtio_transport_close() called in the flow of the second connect()[5]. > > (The patch no longer cancels the virtio_transport_close_timeout() worker) > > > > And even if the `sk->sk_state != TCP_LISTEN` check is removed from the > > patch, it seems that a null-ptr-deref will still occur due to > > virtio_transport_close_timeout(). > > It might be necessary to add worker cancellation at the > > appropriate location. > > Thanks for the analysis! > > Do you have time to cook a proper patch to cover this scenario? > Or we should mix this fix together with your patch (return 0 in > vsock_stream_has_data()) while we investigate a better handling? > > Thanks, > Stefano better combine them imho. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-09 9:01 ` Stefano Garzarella 2025-01-09 9:06 ` Michael S. Tsirkin @ 2025-01-09 9:13 ` Hyunwoo Kim 2025-01-09 10:59 ` Stefano Garzarella 1 sibling, 1 reply; 19+ messages in thread From: Hyunwoo Kim @ 2025-01-09 9:13 UTC (permalink / raw) To: Stefano Garzarella Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Michal Luczaj, kvm, imv4bel, v4bel On Thu, Jan 09, 2025 at 10:01:31AM +0100, Stefano Garzarella wrote: > On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote: > > On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: > > > If the socket has been de-assigned or assigned to another transport, > > > we must discard any packets received because they are not expected > > > and would cause issues when we access vsk->transport. > > > > > > A possible scenario is described by Hyunwoo Kim in the attached link, > > > where after a first connect() interrupted by a signal, and a second > > > connect() failed, we can find `vsk->transport` at NULL, leading to a > > > NULL pointer dereference. > > > > > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") > > > Reported-by: Hyunwoo Kim <v4bel@theori.io> > > > Reported-by: Wongi Lee <qwerty@theori.io> > > > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ > > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > > > --- > > > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c > > > index 9acc13ab3f82..51a494b69be8 100644 > > > --- a/net/vmw_vsock/virtio_transport_common.c > > > +++ b/net/vmw_vsock/virtio_transport_common.c > > > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, > > > > > > lock_sock(sk); > > > > > > - /* Check if sk has been closed before lock_sock */ > > > - if (sock_flag(sk, SOCK_DONE)) { > > > + /* Check if sk has been closed or assigned to another transport before > > > + * lock_sock (note: listener sockets are not assigned to any transport) > > > + */ > > > + if (sock_flag(sk, SOCK_DONE) || > > > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { > > > > If a race scenario with vsock_listen() is added to the existing > > race scenario, the patch can be bypassed. > > > > In addition to the existing scenario: > > ``` > > cpu0 cpu1 > > > > socket(A) > > > > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) > > vsock_bind() > > > > listen(A) > > vsock_listen() > > socket(B) > > > > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) > > vsock_connect() > > lock_sock(sk); > > virtio_transport_connect() > > virtio_transport_connect() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) > > queue_work(vsock_loopback_work) > > sk->sk_state = TCP_SYN_SENT; > > release_sock(sk); > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) > > sk = vsock_find_bound_socket(&dst); > > virtio_transport_recv_listen(sk, skb) > > child = vsock_create_connected(sk); > > vsock_assign_transport() > > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); > > vsock_insert_connected(vchild); > > list_add(&vsk->connected_table, list); > > virtio_transport_send_response(vchild, skb); > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > queue_work(vsock_loopback_work) > > > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > sk = vsock_find_bound_socket(&dst); > > lock_sock(sk); > > case TCP_SYN_SENT: > > virtio_transport_recv_connecting() > > sk->sk_state = TCP_ESTABLISHED; > > release_sock(sk); > > > > kill(connect(B)); > > lock_sock(sk); > > if (signal_pending(current)) { > > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; > > sock->state = SS_UNCONNECTED; // [1] > > release_sock(sk); > > > > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) > > vsock_connect(B) > > lock_sock(sk); > > vsock_assign_transport() > > virtio_transport_release() > > virtio_transport_close() > > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) > > virtio_transport_shutdown() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > queue_work(vsock_loopback_work) > > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] > > vsock_deassign_transport() > > vsk->transport = NULL; > > return -ESOCKTNOSUPPORT; > > release_sock(sk); > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > virtio_transport_recv_connected() > > virtio_transport_reset() > > virtio_transport_send_pkt_info() > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) > > queue_work(vsock_loopback_work) > > listen(B) > > vsock_listen() > > if (sock->state != SS_UNCONNECTED) // [2] > > sk->sk_state = TCP_LISTEN; // [3] > > > > vsock_loopback_work() > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) > > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] > > ... > > > > virtio_transport_close_timeout() > > virtio_transport_do_close() > > vsock_stream_has_data() > > return vsk->transport->stream_has_data(vsk); // null-ptr-deref > > > > ``` > > (Yes, This is quite a crazy scenario, but it can actually be induced) > > > > Since sock->state is set to SS_UNCONNECTED during the first connect()[1], > > it can pass the sock->state check[2] in vsock_listen() and set > > sk->sk_state to TCP_LISTEN[3]. > > If this happens, the check in the patch with > > `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can > > still occur.) > > > > More specifically, because the sk_state has changed to TCP_LISTEN, > > virtio_transport_recv_disconnecting() will not be called by the > > loopback worker. However, a null-ptr-deref may occur in > > virtio_transport_close_timeout(), which is scheduled by > > virtio_transport_close() called in the flow of the second connect()[5]. > > (The patch no longer cancels the virtio_transport_close_timeout() worker) > > > > And even if the `sk->sk_state != TCP_LISTEN` check is removed from the > > patch, it seems that a null-ptr-deref will still occur due to > > virtio_transport_close_timeout(). > > It might be necessary to add worker cancellation at the > > appropriate location. > > Thanks for the analysis! > > Do you have time to cook a proper patch to cover this scenario? > Or we should mix this fix together with your patch (return 0 in > vsock_stream_has_data()) while we investigate a better handling? For now, it seems better to merge them together. It seems that covering this scenario will require more analysis and testing. Regards, Hyunwoo Kim ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-09 9:13 ` Hyunwoo Kim @ 2025-01-09 10:59 ` Stefano Garzarella 2025-01-09 11:10 ` Hyunwoo Kim 0 siblings, 1 reply; 19+ messages in thread From: Stefano Garzarella @ 2025-01-09 10:59 UTC (permalink / raw) To: Hyunwoo Kim Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Michal Luczaj, kvm, imv4bel On Thu, Jan 09, 2025 at 04:13:44AM -0500, Hyunwoo Kim wrote: >On Thu, Jan 09, 2025 at 10:01:31AM +0100, Stefano Garzarella wrote: >> On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote: >> > On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: >> > > If the socket has been de-assigned or assigned to another transport, >> > > we must discard any packets received because they are not expected >> > > and would cause issues when we access vsk->transport. >> > > >> > > A possible scenario is described by Hyunwoo Kim in the attached link, >> > > where after a first connect() interrupted by a signal, and a second >> > > connect() failed, we can find `vsk->transport` at NULL, leading to a >> > > NULL pointer dereference. >> > > >> > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") >> > > Reported-by: Hyunwoo Kim <v4bel@theori.io> >> > > Reported-by: Wongi Lee <qwerty@theori.io> >> > > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ >> > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> >> > > --- >> > > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- >> > > 1 file changed, 5 insertions(+), 2 deletions(-) >> > > >> > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c >> > > index 9acc13ab3f82..51a494b69be8 100644 >> > > --- a/net/vmw_vsock/virtio_transport_common.c >> > > +++ b/net/vmw_vsock/virtio_transport_common.c >> > > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, >> > > >> > > lock_sock(sk); >> > > >> > > - /* Check if sk has been closed before lock_sock */ >> > > - if (sock_flag(sk, SOCK_DONE)) { >> > > + /* Check if sk has been closed or assigned to another transport before >> > > + * lock_sock (note: listener sockets are not assigned to any transport) >> > > + */ >> > > + if (sock_flag(sk, SOCK_DONE) || >> > > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { >> > >> > If a race scenario with vsock_listen() is added to the existing >> > race scenario, the patch can be bypassed. >> > >> > In addition to the existing scenario: >> > ``` >> > cpu0 cpu1 >> > >> > socket(A) >> > >> > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) >> > vsock_bind() >> > >> > listen(A) >> > vsock_listen() >> > socket(B) >> > >> > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) >> > vsock_connect() >> > lock_sock(sk); >> > virtio_transport_connect() >> > virtio_transport_connect() >> > virtio_transport_send_pkt_info() >> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) >> > queue_work(vsock_loopback_work) >> > sk->sk_state = TCP_SYN_SENT; >> > release_sock(sk); >> > vsock_loopback_work() >> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) >> > sk = vsock_find_bound_socket(&dst); >> > virtio_transport_recv_listen(sk, skb) >> > child = vsock_create_connected(sk); >> > vsock_assign_transport() >> > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); >> > vsock_insert_connected(vchild); >> > list_add(&vsk->connected_table, list); >> > virtio_transport_send_response(vchild, skb); >> > virtio_transport_send_pkt_info() >> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) >> > queue_work(vsock_loopback_work) >> > >> > vsock_loopback_work() >> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) >> > sk = vsock_find_bound_socket(&dst); >> > lock_sock(sk); >> > case TCP_SYN_SENT: >> > virtio_transport_recv_connecting() >> > sk->sk_state = TCP_ESTABLISHED; >> > release_sock(sk); >> > >> > kill(connect(B)); >> > lock_sock(sk); >> > if (signal_pending(current)) { >> > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; >> > sock->state = SS_UNCONNECTED; // [1] >> > release_sock(sk); >> > >> > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) >> > vsock_connect(B) >> > lock_sock(sk); >> > vsock_assign_transport() >> > virtio_transport_release() >> > virtio_transport_close() >> > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) >> > virtio_transport_shutdown() >> > virtio_transport_send_pkt_info() >> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) >> > queue_work(vsock_loopback_work) >> > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] >> > vsock_deassign_transport() >> > vsk->transport = NULL; >> > return -ESOCKTNOSUPPORT; >> > release_sock(sk); >> > vsock_loopback_work() >> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) >> > virtio_transport_recv_connected() >> > virtio_transport_reset() >> > virtio_transport_send_pkt_info() >> > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) >> > queue_work(vsock_loopback_work) >> > listen(B) >> > vsock_listen() >> > if (sock->state != SS_UNCONNECTED) // [2] >> > sk->sk_state = TCP_LISTEN; // [3] >> > >> > vsock_loopback_work() >> > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) >> > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] >> > ... >> > >> > virtio_transport_close_timeout() >> > virtio_transport_do_close() >> > vsock_stream_has_data() >> > return vsk->transport->stream_has_data(vsk); // null-ptr-deref >> > >> > ``` >> > (Yes, This is quite a crazy scenario, but it can actually be induced) >> > >> > Since sock->state is set to SS_UNCONNECTED during the first connect()[1], >> > it can pass the sock->state check[2] in vsock_listen() and set >> > sk->sk_state to TCP_LISTEN[3]. >> > If this happens, the check in the patch with >> > `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can >> > still occur.) >> > >> > More specifically, because the sk_state has changed to TCP_LISTEN, >> > virtio_transport_recv_disconnecting() will not be called by the >> > loopback worker. However, a null-ptr-deref may occur in >> > virtio_transport_close_timeout(), which is scheduled by >> > virtio_transport_close() called in the flow of the second connect()[5]. >> > (The patch no longer cancels the virtio_transport_close_timeout() worker) >> > >> > And even if the `sk->sk_state != TCP_LISTEN` check is removed from the >> > patch, it seems that a null-ptr-deref will still occur due to >> > virtio_transport_close_timeout(). >> > It might be necessary to add worker cancellation at the >> > appropriate location. >> >> Thanks for the analysis! >> >> Do you have time to cook a proper patch to cover this scenario? >> Or we should mix this fix together with your patch (return 0 in >> vsock_stream_has_data()) while we investigate a better handling? > >For now, it seems better to merge them together. Okay, since both you and Michael agree on that, I'll include your changes in this series, but adding a warning message, since it should not happen. Is that fine with you? > >It seems that covering this scenario will require more analysis and >testing. Yeah, scheduling a task during the release is tricky, especially when we are changing the transport, so I think we should handle that better. One idea that I have it to cancel delayed works in virtio_transport_destruct(), I'll test it a bit and add a patch for that in the next version of this series. We also need to reset SOCK_DONE after changing the transports. Thanks, Stefano ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-09 10:59 ` Stefano Garzarella @ 2025-01-09 11:10 ` Hyunwoo Kim 0 siblings, 0 replies; 19+ messages in thread From: Hyunwoo Kim @ 2025-01-09 11:10 UTC (permalink / raw) To: Stefano Garzarella Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Michal Luczaj, kvm, imv4bel, v4bel On Thu, Jan 09, 2025 at 11:59:21AM +0100, Stefano Garzarella wrote: > On Thu, Jan 09, 2025 at 04:13:44AM -0500, Hyunwoo Kim wrote: > > On Thu, Jan 09, 2025 at 10:01:31AM +0100, Stefano Garzarella wrote: > > > On Wed, Jan 08, 2025 at 02:31:19PM -0500, Hyunwoo Kim wrote: > > > > On Wed, Jan 08, 2025 at 07:06:16PM +0100, Stefano Garzarella wrote: > > > > > If the socket has been de-assigned or assigned to another transport, > > > > > we must discard any packets received because they are not expected > > > > > and would cause issues when we access vsk->transport. > > > > > > > > > > A possible scenario is described by Hyunwoo Kim in the attached link, > > > > > where after a first connect() interrupted by a signal, and a second > > > > > connect() failed, we can find `vsk->transport` at NULL, leading to a > > > > > NULL pointer dereference. > > > > > > > > > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") > > > > > Reported-by: Hyunwoo Kim <v4bel@theori.io> > > > > > Reported-by: Wongi Lee <qwerty@theori.io> > > > > > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ > > > > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > > > > > --- > > > > > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- > > > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c > > > > > index 9acc13ab3f82..51a494b69be8 100644 > > > > > --- a/net/vmw_vsock/virtio_transport_common.c > > > > > +++ b/net/vmw_vsock/virtio_transport_common.c > > > > > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, > > > > > > > > > > lock_sock(sk); > > > > > > > > > > - /* Check if sk has been closed before lock_sock */ > > > > > - if (sock_flag(sk, SOCK_DONE)) { > > > > > + /* Check if sk has been closed or assigned to another transport before > > > > > + * lock_sock (note: listener sockets are not assigned to any transport) > > > > > + */ > > > > > + if (sock_flag(sk, SOCK_DONE) || > > > > > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { > > > > > > > > If a race scenario with vsock_listen() is added to the existing > > > > race scenario, the patch can be bypassed. > > > > > > > > In addition to the existing scenario: > > > > ``` > > > > cpu0 cpu1 > > > > > > > > socket(A) > > > > > > > > bind(A, {cid: VMADDR_CID_LOCAL, port: 1024}) > > > > vsock_bind() > > > > > > > > listen(A) > > > > vsock_listen() > > > > socket(B) > > > > > > > > connect(B, {cid: VMADDR_CID_LOCAL, port: 1024}) > > > > vsock_connect() > > > > lock_sock(sk); > > > > virtio_transport_connect() > > > > virtio_transport_connect() > > > > virtio_transport_send_pkt_info() > > > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_REQUEST) > > > > queue_work(vsock_loopback_work) > > > > sk->sk_state = TCP_SYN_SENT; > > > > release_sock(sk); > > > > vsock_loopback_work() > > > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_REQUEST) > > > > sk = vsock_find_bound_socket(&dst); > > > > virtio_transport_recv_listen(sk, skb) > > > > child = vsock_create_connected(sk); > > > > vsock_assign_transport() > > > > vvs = kzalloc(sizeof(*vvs), GFP_KERNEL); > > > > vsock_insert_connected(vchild); > > > > list_add(&vsk->connected_table, list); > > > > virtio_transport_send_response(vchild, skb); > > > > virtio_transport_send_pkt_info() > > > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > > > queue_work(vsock_loopback_work) > > > > > > > > vsock_loopback_work() > > > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RESPONSE) > > > > sk = vsock_find_bound_socket(&dst); > > > > lock_sock(sk); > > > > case TCP_SYN_SENT: > > > > virtio_transport_recv_connecting() > > > > sk->sk_state = TCP_ESTABLISHED; > > > > release_sock(sk); > > > > > > > > kill(connect(B)); > > > > lock_sock(sk); > > > > if (signal_pending(current)) { > > > > sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE; > > > > sock->state = SS_UNCONNECTED; // [1] > > > > release_sock(sk); > > > > > > > > connect(B, {cid: VMADDR_CID_HYPERVISOR, port: 1024}) > > > > vsock_connect(B) > > > > lock_sock(sk); > > > > vsock_assign_transport() > > > > virtio_transport_release() > > > > virtio_transport_close() > > > > if (!(sk->sk_state == TCP_ESTABLISHED || sk->sk_state == TCP_CLOSING)) > > > > virtio_transport_shutdown() > > > > virtio_transport_send_pkt_info() > > > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > > > queue_work(vsock_loopback_work) > > > > schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT); // [5] > > > > vsock_deassign_transport() > > > > vsk->transport = NULL; > > > > return -ESOCKTNOSUPPORT; > > > > release_sock(sk); > > > > vsock_loopback_work() > > > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_SHUTDOWN) > > > > virtio_transport_recv_connected() > > > > virtio_transport_reset() > > > > virtio_transport_send_pkt_info() > > > > vsock_loopback_send_pkt(VIRTIO_VSOCK_OP_RST) > > > > queue_work(vsock_loopback_work) > > > > listen(B) > > > > vsock_listen() > > > > if (sock->state != SS_UNCONNECTED) // [2] > > > > sk->sk_state = TCP_LISTEN; // [3] > > > > > > > > vsock_loopback_work() > > > > virtio_transport_recv_pkt(VIRTIO_VSOCK_OP_RST) > > > > if ((sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { // [4] > > > > ... > > > > > > > > virtio_transport_close_timeout() > > > > virtio_transport_do_close() > > > > vsock_stream_has_data() > > > > return vsk->transport->stream_has_data(vsk); // null-ptr-deref > > > > > > > > ``` > > > > (Yes, This is quite a crazy scenario, but it can actually be induced) > > > > > > > > Since sock->state is set to SS_UNCONNECTED during the first connect()[1], > > > > it can pass the sock->state check[2] in vsock_listen() and set > > > > sk->sk_state to TCP_LISTEN[3]. > > > > If this happens, the check in the patch with > > > > `sk->sk_state != TCP_LISTEN` will pass[4], and a null-ptr-deref can > > > > still occur.) > > > > > > > > More specifically, because the sk_state has changed to TCP_LISTEN, > > > > virtio_transport_recv_disconnecting() will not be called by the > > > > loopback worker. However, a null-ptr-deref may occur in > > > > virtio_transport_close_timeout(), which is scheduled by > > > > virtio_transport_close() called in the flow of the second connect()[5]. > > > > (The patch no longer cancels the virtio_transport_close_timeout() worker) > > > > > > > > And even if the `sk->sk_state != TCP_LISTEN` check is removed from the > > > > patch, it seems that a null-ptr-deref will still occur due to > > > > virtio_transport_close_timeout(). > > > > It might be necessary to add worker cancellation at the > > > > appropriate location. > > > > > > Thanks for the analysis! > > > > > > Do you have time to cook a proper patch to cover this scenario? > > > Or we should mix this fix together with your patch (return 0 in > > > vsock_stream_has_data()) while we investigate a better handling? > > > > For now, it seems better to merge them together. > > Okay, since both you and Michael agree on that, I'll include your changes in > this series, but adding a warning message, since it should not happen. > > Is that fine with you? Yes, I agree. > > > > > It seems that covering this scenario will require more analysis and > > testing. > > Yeah, scheduling a task during the release is tricky, especially when we are > changing the transport, so I think we should handle that better. > > One idea that I have it to cancel delayed works in > virtio_transport_destruct(), I'll test it a bit and add a patch for that in > the next version of this series. OK. once the patch is submitted, I will review it. > > We also need to reset SOCK_DONE after changing the transports. > > Thanks, > Stefano > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-08 18:06 ` [PATCH net 1/2] vsock/virtio: discard packets if the transport changes Stefano Garzarella 2025-01-08 19:31 ` Hyunwoo Kim @ 2025-01-09 13:34 ` Michal Luczaj 2025-01-09 13:42 ` Stefano Garzarella ` (2 more replies) 1 sibling, 3 replies; 19+ messages in thread From: Michal Luczaj @ 2025-01-09 13:34 UTC (permalink / raw) To: Stefano Garzarella, netdev Cc: Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, kvm On 1/8/25 19:06, Stefano Garzarella wrote: > If the socket has been de-assigned or assigned to another transport, > we must discard any packets received because they are not expected > and would cause issues when we access vsk->transport. > > A possible scenario is described by Hyunwoo Kim in the attached link, > where after a first connect() interrupted by a signal, and a second > connect() failed, we can find `vsk->transport` at NULL, leading to a > NULL pointer dereference. > > Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") > Reported-by: Hyunwoo Kim <v4bel@theori.io> > Reported-by: Wongi Lee <qwerty@theori.io> > Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > --- > net/vmw_vsock/virtio_transport_common.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c > index 9acc13ab3f82..51a494b69be8 100644 > --- a/net/vmw_vsock/virtio_transport_common.c > +++ b/net/vmw_vsock/virtio_transport_common.c > @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, > > lock_sock(sk); > > - /* Check if sk has been closed before lock_sock */ > - if (sock_flag(sk, SOCK_DONE)) { > + /* Check if sk has been closed or assigned to another transport before > + * lock_sock (note: listener sockets are not assigned to any transport) > + */ > + if (sock_flag(sk, SOCK_DONE) || > + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { > (void)virtio_transport_reset_no_sock(t, skb); > release_sock(sk); > sock_put(sk); FWIW, I've tried simplifying Hyunwoo's repro to toy with some tests. Ended up with ``` from threading import * from socket import * from signal import * def listener(tid): while True: s = socket(AF_VSOCK, SOCK_SEQPACKET) s.bind((1, 1234)) s.listen() pthread_kill(tid, SIGUSR1) signal(SIGUSR1, lambda *args: None) Thread(target=listener, args=[get_ident()]).start() while True: c = socket(AF_VSOCK, SOCK_SEQPACKET) c.connect_ex((1, 1234)) c.connect_ex((42, 1234)) ``` which gives me splats with or without this patch. That said, when I apply this patch, but drop the `sk->sk_state != TCP_LISTEN &&`: no more splats. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-09 13:34 ` Michal Luczaj @ 2025-01-09 13:42 ` Stefano Garzarella 2025-01-09 15:27 ` Michal Luczaj 2025-01-10 8:39 ` Stefano Garzarella 2025-01-21 17:30 ` Luigi Leonardi 2 siblings, 1 reply; 19+ messages in thread From: Stefano Garzarella @ 2025-01-09 13:42 UTC (permalink / raw) To: Michal Luczaj Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, kvm On Thu, Jan 09, 2025 at 02:34:28PM +0100, Michal Luczaj wrote: >On 1/8/25 19:06, Stefano Garzarella wrote: >> If the socket has been de-assigned or assigned to another transport, >> we must discard any packets received because they are not expected >> and would cause issues when we access vsk->transport. >> >> A possible scenario is described by Hyunwoo Kim in the attached link, >> where after a first connect() interrupted by a signal, and a second >> connect() failed, we can find `vsk->transport` at NULL, leading to a >> NULL pointer dereference. >> >> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") >> Reported-by: Hyunwoo Kim <v4bel@theori.io> >> Reported-by: Wongi Lee <qwerty@theori.io> >> Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ >> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> >> --- >> net/vmw_vsock/virtio_transport_common.c | 7 +++++-- >> 1 file changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c >> index 9acc13ab3f82..51a494b69be8 100644 >> --- a/net/vmw_vsock/virtio_transport_common.c >> +++ b/net/vmw_vsock/virtio_transport_common.c >> @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, >> >> lock_sock(sk); >> >> - /* Check if sk has been closed before lock_sock */ >> - if (sock_flag(sk, SOCK_DONE)) { >> + /* Check if sk has been closed or assigned to another transport before >> + * lock_sock (note: listener sockets are not assigned to any transport) >> + */ >> + if (sock_flag(sk, SOCK_DONE) || >> + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { >> (void)virtio_transport_reset_no_sock(t, skb); >> release_sock(sk); >> sock_put(sk); > >FWIW, I've tried simplifying Hyunwoo's repro to toy with some tests. Ended >up with > >``` >from threading import * >from socket import * >from signal import * > >def listener(tid): > while True: > s = socket(AF_VSOCK, SOCK_SEQPACKET) > s.bind((1, 1234)) > s.listen() > pthread_kill(tid, SIGUSR1) > >signal(SIGUSR1, lambda *args: None) >Thread(target=listener, args=[get_ident()]).start() > >while True: > c = socket(AF_VSOCK, SOCK_SEQPACKET) > c.connect_ex((1, 1234)) > c.connect_ex((42, 1234)) >``` > >which gives me splats with or without this patch. > >That said, when I apply this patch, but drop the `sk->sk_state != >TCP_LISTEN &&`: no more splats. We can't drop `sk->sk_state != TCP_LISTEN &&` because listener socket doesn't have any transport (vsk->transport == NULL), so every connection request will receive an error, so maybe this is the reason of no splats. I'm cooking some more patches to fix Hyunwoo's scenario handling better the close work when the virtio destructor is called. I'll run your reproduces to test it, thanks for that! Stefano ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-09 13:42 ` Stefano Garzarella @ 2025-01-09 15:27 ` Michal Luczaj 0 siblings, 0 replies; 19+ messages in thread From: Michal Luczaj @ 2025-01-09 15:27 UTC (permalink / raw) To: Stefano Garzarella Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, kvm On 1/9/25 14:42, Stefano Garzarella wrote: > On Thu, Jan 09, 2025 at 02:34:28PM +0100, Michal Luczaj wrote: >> ... >> That said, when I apply this patch, but drop the `sk->sk_state != >> TCP_LISTEN &&`: no more splats. > > We can't drop `sk->sk_state != TCP_LISTEN &&` because listener socket > doesn't have any transport (vsk->transport == NULL), so every connection > request will receive an error, so maybe this is the reason of no splats. Bah, sorry, I didn't run the test suit. > I'm cooking some more patches to fix Hyunwoo's scenario handling better > the close work when the virtio destructor is called. > > I'll run your reproduces to test it, thanks for that! > > Stefano > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-09 13:34 ` Michal Luczaj 2025-01-09 13:42 ` Stefano Garzarella @ 2025-01-10 8:39 ` Stefano Garzarella 2025-01-21 17:30 ` Luigi Leonardi 2 siblings, 0 replies; 19+ messages in thread From: Stefano Garzarella @ 2025-01-10 8:39 UTC (permalink / raw) To: Michal Luczaj Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, kvm On Thu, Jan 09, 2025 at 02:34:28PM +0100, Michal Luczaj wrote: >On 1/8/25 19:06, Stefano Garzarella wrote: >> If the socket has been de-assigned or assigned to another transport, >> we must discard any packets received because they are not expected >> and would cause issues when we access vsk->transport. >> >> A possible scenario is described by Hyunwoo Kim in the attached link, >> where after a first connect() interrupted by a signal, and a second >> connect() failed, we can find `vsk->transport` at NULL, leading to a >> NULL pointer dereference. >> >> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") >> Reported-by: Hyunwoo Kim <v4bel@theori.io> >> Reported-by: Wongi Lee <qwerty@theori.io> >> Closes: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ >> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> >> --- >> net/vmw_vsock/virtio_transport_common.c | 7 +++++-- >> 1 file changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c >> index 9acc13ab3f82..51a494b69be8 100644 >> --- a/net/vmw_vsock/virtio_transport_common.c >> +++ b/net/vmw_vsock/virtio_transport_common.c >> @@ -1628,8 +1628,11 @@ void virtio_transport_recv_pkt(struct virtio_transport *t, >> >> lock_sock(sk); >> >> - /* Check if sk has been closed before lock_sock */ >> - if (sock_flag(sk, SOCK_DONE)) { >> + /* Check if sk has been closed or assigned to another transport before >> + * lock_sock (note: listener sockets are not assigned to any transport) >> + */ >> + if (sock_flag(sk, SOCK_DONE) || >> + (sk->sk_state != TCP_LISTEN && vsk->transport != &t->transport)) { >> (void)virtio_transport_reset_no_sock(t, skb); >> release_sock(sk); >> sock_put(sk); > >FWIW, I've tried simplifying Hyunwoo's repro to toy with some tests. Ended >up with > >``` >from threading import * >from socket import * >from signal import * > >def listener(tid): > while True: > s = socket(AF_VSOCK, SOCK_SEQPACKET) > s.bind((1, 1234)) > s.listen() > pthread_kill(tid, SIGUSR1) > >signal(SIGUSR1, lambda *args: None) >Thread(target=listener, args=[get_ident()]).start() > >while True: > c = socket(AF_VSOCK, SOCK_SEQPACKET) > c.connect_ex((1, 1234)) > c.connect_ex((42, 1234)) >``` > >which gives me splats with or without this patch. Thanks again for the repro, it worked for me, only if a G2H transport wasn't loaeded (e.g. virtio-vsock driver). I tested on the v2 I just sent [1], and I can't see the kernel oops anymore, but please do test/review too. Thanks, Stefano [1] https://lore.kernel.org/netdev/20250110083511.30419-1-sgarzare@redhat.com/ ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-09 13:34 ` Michal Luczaj 2025-01-09 13:42 ` Stefano Garzarella 2025-01-10 8:39 ` Stefano Garzarella @ 2025-01-21 17:30 ` Luigi Leonardi 2025-01-21 18:06 ` Michal Luczaj 2 siblings, 1 reply; 19+ messages in thread From: Luigi Leonardi @ 2025-01-21 17:30 UTC (permalink / raw) To: Michal Luczaj Cc: Stefano Garzarella, netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, kvm On Thu, Jan 09, 2025 at 02:34:28PM +0100, Michal Luczaj wrote: >FWIW, I've tried simplifying Hyunwoo's repro to toy with some tests. >Ended >up with > >``` >from threading import * >from socket import * >from signal import * > >def listener(tid): > while True: > s = socket(AF_VSOCK, SOCK_SEQPACKET) > s.bind((1, 1234)) > s.listen() > pthread_kill(tid, SIGUSR1) > >signal(SIGUSR1, lambda *args: None) >Thread(target=listener, args=[get_ident()]).start() > >while True: > c = socket(AF_VSOCK, SOCK_SEQPACKET) > c.connect_ex((1, 1234)) > c.connect_ex((42, 1234)) >``` > >which gives me splats with or without this patch. > >That said, when I apply this patch, but drop the `sk->sk_state != >TCP_LISTEN &&`: no more splats. > Hi Michal, I think it would be nice to have this test in the vsock test suite. WDYT? If you don't have any plans to port this to C, I can take care of it :) Cheers, Luigi ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 1/2] vsock/virtio: discard packets if the transport changes 2025-01-21 17:30 ` Luigi Leonardi @ 2025-01-21 18:06 ` Michal Luczaj 0 siblings, 0 replies; 19+ messages in thread From: Michal Luczaj @ 2025-01-21 18:06 UTC (permalink / raw) To: Luigi Leonardi Cc: Stefano Garzarella, netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, kvm On 1/21/25 18:30, Luigi Leonardi wrote: > On Thu, Jan 09, 2025 at 02:34:28PM +0100, Michal Luczaj wrote: >> FWIW, I've tried simplifying Hyunwoo's repro to toy with some tests. >> Ended >> up with >> >> ``` >>from threading import * >>from socket import * >>from signal import * >> >> def listener(tid): >> while True: >> s = socket(AF_VSOCK, SOCK_SEQPACKET) >> s.bind((1, 1234)) >> s.listen() >> pthread_kill(tid, SIGUSR1) >> >> signal(SIGUSR1, lambda *args: None) >> Thread(target=listener, args=[get_ident()]).start() >> >> while True: >> c = socket(AF_VSOCK, SOCK_SEQPACKET) >> c.connect_ex((1, 1234)) >> c.connect_ex((42, 1234)) >> ``` >> >> which gives me splats with or without this patch. >> >> That said, when I apply this patch, but drop the `sk->sk_state != >> TCP_LISTEN &&`: no more splats. >> > Hi Michal, > > I think it would be nice to have this test in the vsock test suite. > WDYT? If you don't have any plans to port this to C, I can take care of > it :) Sure, go ahead, but note that this is just a (probably suboptimal) Python version of Hyunwoo's C repro[1]. [1]: https://lore.kernel.org/netdev/Z2LvdTTQR7dBmPb5@v4bel-B760M-AORUS-ELITE-AX/ ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH net 2/2] vsock/bpf: return early if transport is not assigned 2025-01-08 18:06 [PATCH net 0/2] vsock: some fixes due to transport de-assignment Stefano Garzarella 2025-01-08 18:06 ` [PATCH net 1/2] vsock/virtio: discard packets if the transport changes Stefano Garzarella @ 2025-01-08 18:06 ` Stefano Garzarella 2025-01-08 19:37 ` Hyunwoo Kim ` (3 more replies) 1 sibling, 4 replies; 19+ messages in thread From: Stefano Garzarella @ 2025-01-08 18:06 UTC (permalink / raw) To: netdev Cc: Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Stefano Garzarella, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, Michal Luczaj, kvm Some of the core functions can only be called if the transport has been assigned. As Michal reported, a socket might have the transport at NULL, for example after a failed connect(), causing the following trace: BUG: kernel NULL pointer dereference, address: 00000000000000a0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 12faf8067 P4D 12faf8067 PUD 113670067 PMD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 15 UID: 0 PID: 1198 Comm: a.out Not tainted 6.13.0-rc2+ RIP: 0010:vsock_connectible_has_data+0x1f/0x40 Call Trace: vsock_bpf_recvmsg+0xca/0x5e0 sock_recvmsg+0xb9/0xc0 __sys_recvfrom+0xb3/0x130 __x64_sys_recvfrom+0x20/0x30 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e So we need to check the `vsk->transport` in vsock_bpf_recvmsg(), especially for connected sockets (stream/seqpacket) as we already do in __vsock_connectible_recvmsg(). Fixes: 634f1a7110b4 ("vsock: support sockmap") Reported-by: Michal Luczaj <mhal@rbox.co> Closes: https://lore.kernel.org/netdev/5ca20d4c-1017-49c2-9516-f6f75fd331e9@rbox.co/ Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> --- net/vmw_vsock/vsock_bpf.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c index 4aa6e74ec295..f201d9eca1df 100644 --- a/net/vmw_vsock/vsock_bpf.c +++ b/net/vmw_vsock/vsock_bpf.c @@ -77,6 +77,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags, int *addr_len) { struct sk_psock *psock; + struct vsock_sock *vsk; int copied; psock = sk_psock_get(sk); @@ -84,6 +85,13 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, return __vsock_recvmsg(sk, msg, len, flags); lock_sock(sk); + vsk = vsock_sk(sk); + + if (!vsk->transport) { + copied = -ENODEV; + goto out; + } + if (vsock_has_data(sk, psock) && sk_psock_queue_empty(psock)) { release_sock(sk); sk_psock_put(sk, psock); @@ -108,6 +116,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, copied = sk_msg_recvmsg(sk, psock, msg, len, flags); } +out: release_sock(sk); sk_psock_put(sk, psock); -- 2.47.1 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH net 2/2] vsock/bpf: return early if transport is not assigned 2025-01-08 18:06 ` [PATCH net 2/2] vsock/bpf: return early if transport is not assigned Stefano Garzarella @ 2025-01-08 19:37 ` Hyunwoo Kim 2025-01-09 9:07 ` Michael S. Tsirkin ` (2 subsequent siblings) 3 siblings, 0 replies; 19+ messages in thread From: Hyunwoo Kim @ 2025-01-08 19:37 UTC (permalink / raw) To: Stefano Garzarella Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Michal Luczaj, kvm, v4bel, imv4bel On Wed, Jan 08, 2025 at 07:06:17PM +0100, Stefano Garzarella wrote: > Some of the core functions can only be called if the transport > has been assigned. > > As Michal reported, a socket might have the transport at NULL, > for example after a failed connect(), causing the following trace: > > BUG: kernel NULL pointer dereference, address: 00000000000000a0 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 12faf8067 P4D 12faf8067 PUD 113670067 PMD 0 > Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI > CPU: 15 UID: 0 PID: 1198 Comm: a.out Not tainted 6.13.0-rc2+ > RIP: 0010:vsock_connectible_has_data+0x1f/0x40 > Call Trace: > vsock_bpf_recvmsg+0xca/0x5e0 > sock_recvmsg+0xb9/0xc0 > __sys_recvfrom+0xb3/0x130 > __x64_sys_recvfrom+0x20/0x30 > do_syscall_64+0x93/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > So we need to check the `vsk->transport` in vsock_bpf_recvmsg(), > especially for connected sockets (stream/seqpacket) as we already > do in __vsock_connectible_recvmsg(). > > Fixes: 634f1a7110b4 ("vsock: support sockmap") > Reported-by: Michal Luczaj <mhal@rbox.co> > Closes: https://lore.kernel.org/netdev/5ca20d4c-1017-49c2-9516-f6f75fd331e9@rbox.co/ > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > --- > net/vmw_vsock/vsock_bpf.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c > index 4aa6e74ec295..f201d9eca1df 100644 > --- a/net/vmw_vsock/vsock_bpf.c > +++ b/net/vmw_vsock/vsock_bpf.c > @@ -77,6 +77,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > size_t len, int flags, int *addr_len) > { > struct sk_psock *psock; > + struct vsock_sock *vsk; > int copied; > > psock = sk_psock_get(sk); > @@ -84,6 +85,13 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > return __vsock_recvmsg(sk, msg, len, flags); > > lock_sock(sk); > + vsk = vsock_sk(sk); > + > + if (!vsk->transport) { > + copied = -ENODEV; > + goto out; > + } > + > if (vsock_has_data(sk, psock) && sk_psock_queue_empty(psock)) { > release_sock(sk); > sk_psock_put(sk, psock); > @@ -108,6 +116,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > copied = sk_msg_recvmsg(sk, psock, msg, len, flags); > } > > +out: > release_sock(sk); > sk_psock_put(sk, psock); > > -- > 2.47.1 > Looks good to me. Reviewed-by: Hyunwoo Kim <v4bel@theori.io> Regards, Hyunwoo Kim ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 2/2] vsock/bpf: return early if transport is not assigned 2025-01-08 18:06 ` [PATCH net 2/2] vsock/bpf: return early if transport is not assigned Stefano Garzarella 2025-01-08 19:37 ` Hyunwoo Kim @ 2025-01-09 9:07 ` Michael S. Tsirkin 2025-01-09 9:24 ` Luigi Leonardi 2025-01-09 13:14 ` Michal Luczaj 3 siblings, 0 replies; 19+ messages in thread From: Michael S. Tsirkin @ 2025-01-09 9:07 UTC (permalink / raw) To: Stefano Garzarella Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Hyunwoo Kim, Michal Luczaj, kvm On Wed, Jan 08, 2025 at 07:06:17PM +0100, Stefano Garzarella wrote: > Some of the core functions can only be called if the transport > has been assigned. > > As Michal reported, a socket might have the transport at NULL, > for example after a failed connect(), causing the following trace: > > BUG: kernel NULL pointer dereference, address: 00000000000000a0 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 12faf8067 P4D 12faf8067 PUD 113670067 PMD 0 > Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI > CPU: 15 UID: 0 PID: 1198 Comm: a.out Not tainted 6.13.0-rc2+ > RIP: 0010:vsock_connectible_has_data+0x1f/0x40 > Call Trace: > vsock_bpf_recvmsg+0xca/0x5e0 > sock_recvmsg+0xb9/0xc0 > __sys_recvfrom+0xb3/0x130 > __x64_sys_recvfrom+0x20/0x30 > do_syscall_64+0x93/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > So we need to check the `vsk->transport` in vsock_bpf_recvmsg(), > especially for connected sockets (stream/seqpacket) as we already > do in __vsock_connectible_recvmsg(). > > Fixes: 634f1a7110b4 ("vsock: support sockmap") > Reported-by: Michal Luczaj <mhal@rbox.co> > Closes: https://lore.kernel.org/netdev/5ca20d4c-1017-49c2-9516-f6f75fd331e9@rbox.co/ > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> > --- > net/vmw_vsock/vsock_bpf.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c > index 4aa6e74ec295..f201d9eca1df 100644 > --- a/net/vmw_vsock/vsock_bpf.c > +++ b/net/vmw_vsock/vsock_bpf.c > @@ -77,6 +77,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > size_t len, int flags, int *addr_len) > { > struct sk_psock *psock; > + struct vsock_sock *vsk; > int copied; > > psock = sk_psock_get(sk); > @@ -84,6 +85,13 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > return __vsock_recvmsg(sk, msg, len, flags); > > lock_sock(sk); > + vsk = vsock_sk(sk); > + > + if (!vsk->transport) { > + copied = -ENODEV; > + goto out; > + } > + > if (vsock_has_data(sk, psock) && sk_psock_queue_empty(psock)) { > release_sock(sk); > sk_psock_put(sk, psock); > @@ -108,6 +116,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > copied = sk_msg_recvmsg(sk, psock, msg, len, flags); > } > > +out: > release_sock(sk); > sk_psock_put(sk, psock); > > -- > 2.47.1 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 2/2] vsock/bpf: return early if transport is not assigned 2025-01-08 18:06 ` [PATCH net 2/2] vsock/bpf: return early if transport is not assigned Stefano Garzarella 2025-01-08 19:37 ` Hyunwoo Kim 2025-01-09 9:07 ` Michael S. Tsirkin @ 2025-01-09 9:24 ` Luigi Leonardi 2025-01-09 13:14 ` Michal Luczaj 3 siblings, 0 replies; 19+ messages in thread From: Luigi Leonardi @ 2025-01-09 9:24 UTC (permalink / raw) To: Stefano Garzarella Cc: netdev, Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, Michal Luczaj, kvm On Wed, Jan 08, 2025 at 07:06:17PM +0100, Stefano Garzarella wrote: >Some of the core functions can only be called if the transport >has been assigned. > >As Michal reported, a socket might have the transport at NULL, >for example after a failed connect(), causing the following trace: > > BUG: kernel NULL pointer dereference, address: 00000000000000a0 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 12faf8067 P4D 12faf8067 PUD 113670067 PMD 0 > Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI > CPU: 15 UID: 0 PID: 1198 Comm: a.out Not tainted 6.13.0-rc2+ > RIP: 0010:vsock_connectible_has_data+0x1f/0x40 > Call Trace: > vsock_bpf_recvmsg+0xca/0x5e0 > sock_recvmsg+0xb9/0xc0 > __sys_recvfrom+0xb3/0x130 > __x64_sys_recvfrom+0x20/0x30 > do_syscall_64+0x93/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > >So we need to check the `vsk->transport` in vsock_bpf_recvmsg(), >especially for connected sockets (stream/seqpacket) as we already >do in __vsock_connectible_recvmsg(). > >Fixes: 634f1a7110b4 ("vsock: support sockmap") >Reported-by: Michal Luczaj <mhal@rbox.co> >Closes: https://lore.kernel.org/netdev/5ca20d4c-1017-49c2-9516-f6f75fd331e9@rbox.co/ >Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> >--- > net/vmw_vsock/vsock_bpf.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > >diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c >index 4aa6e74ec295..f201d9eca1df 100644 >--- a/net/vmw_vsock/vsock_bpf.c >+++ b/net/vmw_vsock/vsock_bpf.c >@@ -77,6 +77,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > size_t len, int flags, int *addr_len) > { > struct sk_psock *psock; >+ struct vsock_sock *vsk; > int copied; > > psock = sk_psock_get(sk); >@@ -84,6 +85,13 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > return __vsock_recvmsg(sk, msg, len, flags); > > lock_sock(sk); >+ vsk = vsock_sk(sk); >+ >+ if (!vsk->transport) { >+ copied = -ENODEV; >+ goto out; >+ } >+ > if (vsock_has_data(sk, psock) && sk_psock_queue_empty(psock)) { > release_sock(sk); > sk_psock_put(sk, psock); >@@ -108,6 +116,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > copied = sk_msg_recvmsg(sk, psock, msg, len, flags); > } > >+out: > release_sock(sk); > sk_psock_put(sk, psock); > >-- >2.47.1 > LGTM! Reviewed-By: Luigi Leonardi <leonardi@redhat.com> ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH net 2/2] vsock/bpf: return early if transport is not assigned 2025-01-08 18:06 ` [PATCH net 2/2] vsock/bpf: return early if transport is not assigned Stefano Garzarella ` (2 preceding siblings ...) 2025-01-09 9:24 ` Luigi Leonardi @ 2025-01-09 13:14 ` Michal Luczaj 3 siblings, 0 replies; 19+ messages in thread From: Michal Luczaj @ 2025-01-09 13:14 UTC (permalink / raw) To: Stefano Garzarella, netdev Cc: Simon Horman, Stefan Hajnoczi, linux-kernel, Eric Dumazet, Xuan Zhuo, Wongi Lee, David S. Miller, Paolo Abeni, Jason Wang, Bobby Eshleman, virtualization, Eugenio Pérez, Luigi Leonardi, bpf, Jakub Kicinski, Michael S. Tsirkin, Hyunwoo Kim, kvm On 1/8/25 19:06, Stefano Garzarella wrote: > Some of the core functions can only be called if the transport > has been assigned. > > As Michal reported, a socket might have the transport at NULL, > for example after a failed connect(), causing the following trace: > > BUG: kernel NULL pointer dereference, address: 00000000000000a0 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 12faf8067 P4D 12faf8067 PUD 113670067 PMD 0 > Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI > CPU: 15 UID: 0 PID: 1198 Comm: a.out Not tainted 6.13.0-rc2+ > RIP: 0010:vsock_connectible_has_data+0x1f/0x40 > Call Trace: > vsock_bpf_recvmsg+0xca/0x5e0 > sock_recvmsg+0xb9/0xc0 > __sys_recvfrom+0xb3/0x130 > __x64_sys_recvfrom+0x20/0x30 > do_syscall_64+0x93/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > So we need to check the `vsk->transport` in vsock_bpf_recvmsg(), > especially for connected sockets (stream/seqpacket) as we already > do in __vsock_connectible_recvmsg(). > > Fixes: 634f1a7110b4 ("vsock: support sockmap") > Reported-by: Michal Luczaj <mhal@rbox.co> > Closes: https://lore.kernel.org/netdev/5ca20d4c-1017-49c2-9516-f6f75fd331e9@rbox.co/ > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Michal Luczaj <mhal@rbox.co> > --- > net/vmw_vsock/vsock_bpf.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c > index 4aa6e74ec295..f201d9eca1df 100644 > --- a/net/vmw_vsock/vsock_bpf.c > +++ b/net/vmw_vsock/vsock_bpf.c > @@ -77,6 +77,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > size_t len, int flags, int *addr_len) > { > struct sk_psock *psock; > + struct vsock_sock *vsk; > int copied; > > psock = sk_psock_get(sk); > @@ -84,6 +85,13 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > return __vsock_recvmsg(sk, msg, len, flags); > > lock_sock(sk); > + vsk = vsock_sk(sk); > + > + if (!vsk->transport) { > + copied = -ENODEV; > + goto out; > + } > + > if (vsock_has_data(sk, psock) && sk_psock_queue_empty(psock)) { > release_sock(sk); > sk_psock_put(sk, psock); > @@ -108,6 +116,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, > copied = sk_msg_recvmsg(sk, psock, msg, len, flags); > } > > +out: > release_sock(sk); > sk_psock_put(sk, psock); > ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2025-01-21 18:06 UTC | newest] Thread overview: 19+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-01-08 18:06 [PATCH net 0/2] vsock: some fixes due to transport de-assignment Stefano Garzarella 2025-01-08 18:06 ` [PATCH net 1/2] vsock/virtio: discard packets if the transport changes Stefano Garzarella 2025-01-08 19:31 ` Hyunwoo Kim 2025-01-09 9:01 ` Stefano Garzarella 2025-01-09 9:06 ` Michael S. Tsirkin 2025-01-09 9:13 ` Hyunwoo Kim 2025-01-09 10:59 ` Stefano Garzarella 2025-01-09 11:10 ` Hyunwoo Kim 2025-01-09 13:34 ` Michal Luczaj 2025-01-09 13:42 ` Stefano Garzarella 2025-01-09 15:27 ` Michal Luczaj 2025-01-10 8:39 ` Stefano Garzarella 2025-01-21 17:30 ` Luigi Leonardi 2025-01-21 18:06 ` Michal Luczaj 2025-01-08 18:06 ` [PATCH net 2/2] vsock/bpf: return early if transport is not assigned Stefano Garzarella 2025-01-08 19:37 ` Hyunwoo Kim 2025-01-09 9:07 ` Michael S. Tsirkin 2025-01-09 9:24 ` Luigi Leonardi 2025-01-09 13:14 ` Michal Luczaj
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).