Linux virtualization list
 help / color / mirror / Atom feed
* [PATCH] virtio_vsock: Fix race condition in virtio_transport_recv_pkt
@ 2020-05-29 13:31 Jia He
  2020-05-29 14:10 ` Stefano Garzarella
  0 siblings, 1 reply; 4+ messages in thread
From: Jia He @ 2020-05-29 13:31 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella
  Cc: David S. Miller, Jakub Kicinski, kvm, virtualization, netdev,
	linux-kernel, Kaly Xin, Jia He, stable

When client tries to connect(SOCK_STREAM) the server in the guest with NONBLOCK
mode, there will be a panic on a ThunderX2 (armv8a server):
[  463.718844][ T5040] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[  463.718848][ T5040] Mem abort info:
[  463.718849][ T5040]   ESR = 0x96000044
[  463.718852][ T5040]   EC = 0x25: DABT (current EL), IL = 32 bits
[  463.718853][ T5040]   SET = 0, FnV = 0
[  463.718854][ T5040]   EA = 0, S1PTW = 0
[  463.718855][ T5040] Data abort info:
[  463.718856][ T5040]   ISV = 0, ISS = 0x00000044
[  463.718857][ T5040]   CM = 0, WnR = 1
[  463.718859][ T5040] user pgtable: 4k pages, 48-bit VAs, pgdp=0000008f6f6e9000
[  463.718861][ T5040] [0000000000000000] pgd=0000000000000000
[  463.718866][ T5040] Internal error: Oops: 96000044 [#1] SMP
[...]
[  463.718977][ T5040] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G           O      5.7.0-rc7+ #139
[  463.718980][ T5040] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018
[  463.718982][ T5040] pstate: 60400009 (nZCv daif +PAN -UAO)
[  463.718995][ T5040] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
[  463.718999][ T5040] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common]
[  463.719000][ T5040] sp : ffff80002dbe3c40
[...]
[  463.719025][ T5040] Call trace:
[  463.719030][ T5040]  virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
[  463.719034][ T5040]  vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock]
[  463.719041][ T5040]  vhost_worker+0x100/0x1a0 [vhost]
[  463.719048][ T5040]  kthread+0x128/0x130
[  463.719052][ T5040]  ret_from_fork+0x10/0x18

The race condition as follows:
Task1                            Task2
=====                            =====
__sock_release                   virtio_transport_recv_pkt
  __vsock_release                  vsock_find_bound_socket (found)
    lock_sock_nested
    vsock_remove_sock
    sock_orphan
      sk_set_socket(sk, NULL)
    ...
    release_sock
                                lock_sock
                                   virtio_transport_recv_connecting
                                     sk->sk_socket->state (panic)

This fixes it by checking vsk again whether it is in bound/connected table.

Signed-off-by: Jia He <justin.he@arm.com>
Cc: stable@vger.kernel.org
---
 net/vmw_vsock/virtio_transport_common.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 69efc891885f..0dbd6a45f0ed 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -1132,6 +1132,17 @@ void virtio_transport_recv_pkt(struct virtio_transport *t,
 
 	lock_sock(sk);
 
+	/* Check it again if vsk is removed by vsock_remove_sock */
+	spin_lock_bh(&vsock_table_lock);
+	if (!__vsock_in_bound_table(vsk) && !__vsock_in_connected_table(vsk)) {
+		spin_unlock_bh(&vsock_table_lock);
+		(void)virtio_transport_reset_no_sock(t, pkt);
+		release_sock(sk);
+		sock_put(sk);
+		goto free_pkt;
+	}
+	spin_unlock_bh(&vsock_table_lock);
+
 	/* Update CID in case it has changed after a transport reset event */
 	vsk->local_addr.svm_cid = dst.svm_cid;
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] virtio_vsock: Fix race condition in virtio_transport_recv_pkt
  2020-05-29 13:31 [PATCH] virtio_vsock: Fix race condition in virtio_transport_recv_pkt Jia He
@ 2020-05-29 14:10 ` Stefano Garzarella
  2020-05-29 15:11   ` Justin He
  0 siblings, 1 reply; 4+ messages in thread
From: Stefano Garzarella @ 2020-05-29 14:10 UTC (permalink / raw)
  To: Jia He
  Cc: Stefan Hajnoczi, David S. Miller, Jakub Kicinski, kvm,
	virtualization, netdev, linux-kernel, Kaly Xin, stable

Hi Jia,
thanks for the patch! I have some comments.

On Fri, May 29, 2020 at 09:31:23PM +0800, Jia He wrote:
> When client tries to connect(SOCK_STREAM) the server in the guest with NONBLOCK
> mode, there will be a panic on a ThunderX2 (armv8a server):
> [  463.718844][ T5040] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> [  463.718848][ T5040] Mem abort info:
> [  463.718849][ T5040]   ESR = 0x96000044
> [  463.718852][ T5040]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  463.718853][ T5040]   SET = 0, FnV = 0
> [  463.718854][ T5040]   EA = 0, S1PTW = 0
> [  463.718855][ T5040] Data abort info:
> [  463.718856][ T5040]   ISV = 0, ISS = 0x00000044
> [  463.718857][ T5040]   CM = 0, WnR = 1
> [  463.718859][ T5040] user pgtable: 4k pages, 48-bit VAs, pgdp=0000008f6f6e9000
> [  463.718861][ T5040] [0000000000000000] pgd=0000000000000000
> [  463.718866][ T5040] Internal error: Oops: 96000044 [#1] SMP
> [...]
> [  463.718977][ T5040] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G           O      5.7.0-rc7+ #139
> [  463.718980][ T5040] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018
> [  463.718982][ T5040] pstate: 60400009 (nZCv daif +PAN -UAO)
> [  463.718995][ T5040] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
> [  463.718999][ T5040] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common]
> [  463.719000][ T5040] sp : ffff80002dbe3c40
> [...]
> [  463.719025][ T5040] Call trace:
> [  463.719030][ T5040]  virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
> [  463.719034][ T5040]  vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock]
> [  463.719041][ T5040]  vhost_worker+0x100/0x1a0 [vhost]
> [  463.719048][ T5040]  kthread+0x128/0x130
> [  463.719052][ T5040]  ret_from_fork+0x10/0x18
> 
> The race condition as follows:
> Task1                            Task2
> =====                            =====
> __sock_release                   virtio_transport_recv_pkt
>   __vsock_release                  vsock_find_bound_socket (found)
>     lock_sock_nested
>     vsock_remove_sock
>     sock_orphan
>       sk_set_socket(sk, NULL)
>     ...
>     release_sock
>                                 lock_sock
>                                    virtio_transport_recv_connecting
>                                      sk->sk_socket->state (panic)
> 
> This fixes it by checking vsk again whether it is in bound/connected table.
> 
> Signed-off-by: Jia He <justin.he@arm.com>
> Cc: stable@vger.kernel.org
> ---
>  net/vmw_vsock/virtio_transport_common.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> index 69efc891885f..0dbd6a45f0ed 100644
> --- a/net/vmw_vsock/virtio_transport_common.c
> +++ b/net/vmw_vsock/virtio_transport_common.c
> @@ -1132,6 +1132,17 @@ void virtio_transport_recv_pkt(struct virtio_transport *t,
>  
>  	lock_sock(sk);
>  
> +	/* Check it again if vsk is removed by vsock_remove_sock */
> +	spin_lock_bh(&vsock_table_lock);
> +	if (!__vsock_in_bound_table(vsk) && !__vsock_in_connected_table(vsk)) {
> +		spin_unlock_bh(&vsock_table_lock);
> +		(void)virtio_transport_reset_no_sock(t, pkt);
> +		release_sock(sk);
> +		sock_put(sk);
> +		goto free_pkt;
> +	}
> +	spin_unlock_bh(&vsock_table_lock);
> +

As an a simpler alternative, can we check the sk_shutdown or the socket
state without check again both bound and connected tables?

This is a data path, so we should take it faster.

I mean something like this:

	if (sk->sk_shutdown == SHUTDOWN_MASK) {
		...
	}

or

	if (sock_flag(sk, SOCK_DEAD)) {
		...
	}

I prefer the first option, but I think also the second option should
work.

Thanks,
Stefano

>  	/* Update CID in case it has changed after a transport reset event */
>  	vsk->local_addr.svm_cid = dst.svm_cid;
>  
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH] virtio_vsock: Fix race condition in virtio_transport_recv_pkt
  2020-05-29 14:10 ` Stefano Garzarella
@ 2020-05-29 15:11   ` Justin He
  0 siblings, 0 replies; 4+ messages in thread
From: Justin He @ 2020-05-29 15:11 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Stefan Hajnoczi, David S. Miller, Jakub Kicinski,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Kaly Xin,
	stable@vger.kernel.org

Hi Stefano

> -----Original Message-----
> From: Stefano Garzarella <sgarzare@redhat.com>
> Sent: Friday, May 29, 2020 10:11 PM
> To: Justin He <Justin.He@arm.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>; David S. Miller
> <davem@davemloft.net>; Jakub Kicinski <kuba@kernel.org>;
> kvm@vger.kernel.org; virtualization@lists.linux-foundation.org;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Kaly Xin
> <Kaly.Xin@arm.com>; stable@vger.kernel.org
> Subject: Re: [PATCH] virtio_vsock: Fix race condition in
> virtio_transport_recv_pkt
>
> Hi Jia,
> thanks for the patch! I have some comments.
>
> On Fri, May 29, 2020 at 09:31:23PM +0800, Jia He wrote:
> > When client tries to connect(SOCK_STREAM) the server in the guest with
> NONBLOCK
> > mode, there will be a panic on a ThunderX2 (armv8a server):
> > [  463.718844][ T5040] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000000
> > [  463.718848][ T5040] Mem abort info:
> > [  463.718849][ T5040]   ESR = 0x96000044
> > [  463.718852][ T5040]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [  463.718853][ T5040]   SET = 0, FnV = 0
> > [  463.718854][ T5040]   EA = 0, S1PTW = 0
> > [  463.718855][ T5040] Data abort info:
> > [  463.718856][ T5040]   ISV = 0, ISS = 0x00000044
> > [  463.718857][ T5040]   CM = 0, WnR = 1
> > [  463.718859][ T5040] user pgtable: 4k pages, 48-bit VAs,
> pgdp=0000008f6f6e9000
> > [  463.718861][ T5040] [0000000000000000] pgd=0000000000000000
> > [  463.718866][ T5040] Internal error: Oops: 96000044 [#1] SMP
> > [...]
> > [  463.718977][ T5040] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G
> O      5.7.0-rc7+ #139
> > [  463.718980][ T5040] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00,
> BIOS F06 09/25/2018
> > [  463.718982][ T5040] pstate: 60400009 (nZCv daif +PAN -UAO)
> > [  463.718995][ T5040] pc : virtio_transport_recv_pkt+0x4c8/0xd40
> [vmw_vsock_virtio_transport_common]
> > [  463.718999][ T5040] lr : virtio_transport_recv_pkt+0x1fc/0xd40
> [vmw_vsock_virtio_transport_common]
> > [  463.719000][ T5040] sp : ffff80002dbe3c40
> > [...]
> > [  463.719025][ T5040] Call trace:
> > [  463.719030][ T5040]  virtio_transport_recv_pkt+0x4c8/0xd40
> [vmw_vsock_virtio_transport_common]
> > [  463.719034][ T5040]  vhost_vsock_handle_tx_kick+0x360/0x408
> [vhost_vsock]
> > [  463.719041][ T5040]  vhost_worker+0x100/0x1a0 [vhost]
> > [  463.719048][ T5040]  kthread+0x128/0x130
> > [  463.719052][ T5040]  ret_from_fork+0x10/0x18
> >
> > The race condition as follows:
> > Task1                            Task2
> > =====                            =====
> > __sock_release                   virtio_transport_recv_pkt
> >   __vsock_release                  vsock_find_bound_socket (found)
> >     lock_sock_nested
> >     vsock_remove_sock
> >     sock_orphan
> >       sk_set_socket(sk, NULL)
> >     ...
> >     release_sock
> >                                 lock_sock
> >                                    virtio_transport_recv_connecting
> >                                      sk->sk_socket->state (panic)
> >
> > This fixes it by checking vsk again whether it is in bound/connected table.
> >
> > Signed-off-by: Jia He <justin.he@arm.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  net/vmw_vsock/virtio_transport_common.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/net/vmw_vsock/virtio_transport_common.c
> b/net/vmw_vsock/virtio_transport_common.c
> > index 69efc891885f..0dbd6a45f0ed 100644
> > --- a/net/vmw_vsock/virtio_transport_common.c
> > +++ b/net/vmw_vsock/virtio_transport_common.c
> > @@ -1132,6 +1132,17 @@ void virtio_transport_recv_pkt(struct
> virtio_transport *t,
> >
> >  lock_sock(sk);
> >
> > +/* Check it again if vsk is removed by vsock_remove_sock */
> > +spin_lock_bh(&vsock_table_lock);
> > +if (!__vsock_in_bound_table(vsk)
> && !__vsock_in_connected_table(vsk)) {
> > +spin_unlock_bh(&vsock_table_lock);
> > +(void)virtio_transport_reset_no_sock(t, pkt);
> > +release_sock(sk);
> > +sock_put(sk);
> > +goto free_pkt;
> > +}
> > +spin_unlock_bh(&vsock_table_lock);
> > +
>
> As an a simpler alternative, can we check the sk_shutdown or the socket
> state without check again both bound and connected tables?
>
> This is a data path, so we should take it faster.
>
> I mean something like this:
>
> if (sk->sk_shutdown == SHUTDOWN_MASK) {
> ...
> }
>
Thanks for the suggestion, I verified it worked fine. And it
is a more lightweight checking than mine.

I will send v2 with above change

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] virtio_vsock: Fix race condition in virtio_transport_recv_pkt
@ 2020-05-30 12:32 Jia He
  0 siblings, 0 replies; 4+ messages in thread
From: Jia He @ 2020-05-30 12:32 UTC (permalink / raw)
  To: Stefan Hajnoczi, Stefano Garzarella
  Cc: David S. Miller, Jakub Kicinski, kvm, virtualization, netdev,
	linux-kernel, Kaly Xin, Markus Elfring, Jia He, stable, Asias He

When client on the host tries to connect(SOCK_STREAM, O_NONBLOCK) to the
server on the guest, there will be a panic on a ThunderX2 (armv8a server):

[  463.718844] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[  463.718848] Mem abort info:
[  463.718849]   ESR = 0x96000044
[  463.718852]   EC = 0x25: DABT (current EL), IL = 32 bits
[  463.718853]   SET = 0, FnV = 0
[  463.718854]   EA = 0, S1PTW = 0
[  463.718855] Data abort info:
[  463.718856]   ISV = 0, ISS = 0x00000044
[  463.718857]   CM = 0, WnR = 1
[  463.718859] user pgtable: 4k pages, 48-bit VAs, pgdp=0000008f6f6e9000
[  463.718861] [0000000000000000] pgd=0000000000000000
[  463.718866] Internal error: Oops: 96000044 [#1] SMP
[...]
[  463.718977] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G           O      5.7.0-rc7+ #139
[  463.718980] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018
[  463.718982] pstate: 60400009 (nZCv daif +PAN -UAO)
[  463.718995] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
[  463.718999] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common]
[  463.719000] sp : ffff80002dbe3c40
[...]
[  463.719025] Call trace:
[  463.719030]  virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
[  463.719034]  vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock]
[  463.719041]  vhost_worker+0x100/0x1a0 [vhost]
[  463.719048]  kthread+0x128/0x130
[  463.719052]  ret_from_fork+0x10/0x18

The race condition is as follows:
Task1                                Task2
=====                                =====
__sock_release                       virtio_transport_recv_pkt
  __vsock_release                      vsock_find_bound_socket (found sk)
    lock_sock_nested
    vsock_remove_sock
    sock_orphan
      sk_set_socket(sk, NULL)
    sk->sk_shutdown = SHUTDOWN_MASK
    ...
    release_sock
                                    lock_sock
                                       virtio_transport_recv_connecting
                                         sk->sk_socket->state (panic!)

The root cause is that vsock_find_bound_socket can't hold the lock_sock,
so there is a small race window between vsock_find_bound_socket() and
lock_sock(). If __vsock_release() is running in another task,
sk->sk_socket will be set to NULL inadvertently.

Thus check the data structure member “sk_shutdown” (suggested by Stefano)
after a call of the function “lock_sock” since this field is set to
“SHUTDOWN_MASK” under the protection of “lock_sock_nested”.

Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
Signed-off-by: Jia He <justin.he@arm.com>
Cc: stable@vger.kernel.org
Cc: Asias He <asias@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
---
v4: refine the commit msg (from Markus)

 net/vmw_vsock/virtio_transport_common.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 69efc891885f..0edda1edf988 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -1132,6 +1132,14 @@ void virtio_transport_recv_pkt(struct virtio_transport *t,
 
 	lock_sock(sk);
 
+	/* Check if sk has been released before lock_sock */
+	if (sk->sk_shutdown == SHUTDOWN_MASK) {
+		(void)virtio_transport_reset_no_sock(t, pkt);
+		release_sock(sk);
+		sock_put(sk);
+		goto free_pkt;
+	}
+
 	/* Update CID in case it has changed after a transport reset event */
 	vsk->local_addr.svm_cid = dst.svm_cid;
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-05-30 12:32 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-05-29 13:31 [PATCH] virtio_vsock: Fix race condition in virtio_transport_recv_pkt Jia He
2020-05-29 14:10 ` Stefano Garzarella
2020-05-29 15:11   ` Justin He
  -- strict thread matches above, loose matches on Subject: below --
2020-05-30 12:32 Jia He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox