qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed()
@ 2024-06-07 15:00 Alexander Ivanov
  2024-06-08  9:36 ` Alexander Ivanov
  2024-06-28  9:58 ` Alexander Ivanov
  0 siblings, 2 replies; 7+ messages in thread
From: Alexander Ivanov @ 2024-06-07 15:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, den, andrey.drobyshev, eblake, vsementsov, kwolf,
	hreitz

In some cases, the NBD server can be stopped before
nbd_blockdev_client_closed() is called, causing the nbd_server variable
to be nullified. This leads to a NULL pointer dereference when accessing
nbd_server.

Add a NULL check for nbd_server to the nbd_blockdev_client_closed()
function to prevent NULL pointer dereference.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
---
 blockdev-nbd.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 213012435f..fb1f30ae0d 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -52,6 +52,9 @@ int nbd_server_max_connections(void)
 static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
 {
     nbd_client_put(client);
+    if (nbd_server == NULL) {
+        return;
+    }
     assert(nbd_server->connections > 0);
     nbd_server->connections--;
     nbd_update_server_watch(nbd_server);
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed()
  2024-06-07 15:00 [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed() Alexander Ivanov
@ 2024-06-08  9:36 ` Alexander Ivanov
  2024-06-10 12:33   ` Eric Blake
  2024-06-28  9:58 ` Alexander Ivanov
  1 sibling, 1 reply; 7+ messages in thread
From: Alexander Ivanov @ 2024-06-08  9:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, den, andrey.drobyshev, eblake, vsementsov, kwolf,
	hreitz

[-- Attachment #1: Type: text/plain, Size: 1089 bytes --]

There is a bug reproducer in the attachment.


On 6/7/24 17:00, Alexander Ivanov wrote:
> In some cases, the NBD server can be stopped before
> nbd_blockdev_client_closed() is called, causing the nbd_server variable
> to be nullified. This leads to a NULL pointer dereference when accessing
> nbd_server.
>
> Add a NULL check for nbd_server to the nbd_blockdev_client_closed()
> function to prevent NULL pointer dereference.
>
> Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
> ---
>   blockdev-nbd.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/blockdev-nbd.c b/blockdev-nbd.c
> index 213012435f..fb1f30ae0d 100644
> --- a/blockdev-nbd.c
> +++ b/blockdev-nbd.c
> @@ -52,6 +52,9 @@ int nbd_server_max_connections(void)
>   static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
>   {
>       nbd_client_put(client);
> +    if (nbd_server == NULL) {
> +        return;
> +    }
>       assert(nbd_server->connections > 0);
>       nbd_server->connections--;
>       nbd_update_server_watch(nbd_server);

-- 
Best regards,
Alexander Ivanov

[-- Attachment #2: reproducer.tar.gz --]
[-- Type: application/gzip, Size: 4478 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed()
  2024-06-08  9:36 ` Alexander Ivanov
@ 2024-06-10 12:33   ` Eric Blake
  2024-06-10 12:52     ` Alexander Ivanov
  2024-06-18  8:44     ` Alexander Ivanov
  0 siblings, 2 replies; 7+ messages in thread
From: Eric Blake @ 2024-06-10 12:33 UTC (permalink / raw)
  To: Alexander Ivanov
  Cc: qemu-devel, qemu-block, den, andrey.drobyshev, vsementsov, kwolf,
	hreitz

On Sat, Jun 08, 2024 at 11:36:59AM GMT, Alexander Ivanov wrote:
> There is a bug reproducer in the attachment.

Summarizing the reproducer, you are repeatedly calling QMP
nbd-server-start/nbd-server-stop on qemu as NBD server in one thread,
and repeatedly calling 'qemu-nbd -L' in another, to try and provoke
the race where the server is stopped while qemu-nbd -L is still trying
to grab information.

> 
> 
> On 6/7/24 17:00, Alexander Ivanov wrote:
> > In some cases, the NBD server can be stopped before
> > nbd_blockdev_client_closed() is called, causing the nbd_server variable
> > to be nullified. This leads to a NULL pointer dereference when accessing
> > nbd_server.

Am I correct that the NULL pointer deref was in qemu-nbd in your
reproducer, and not in qemu-kvm?

> > 
> > Add a NULL check for nbd_server to the nbd_blockdev_client_closed()
> > function to prevent NULL pointer dereference.
> > 
> > Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
> > ---
> >   blockdev-nbd.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/blockdev-nbd.c b/blockdev-nbd.c
> > index 213012435f..fb1f30ae0d 100644
> > --- a/blockdev-nbd.c
> > +++ b/blockdev-nbd.c
> > @@ -52,6 +52,9 @@ int nbd_server_max_connections(void)
> >   static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
> >   {
> >       nbd_client_put(client);
> > +    if (nbd_server == NULL) {
> > +        return;
> > +    }
> >       assert(nbd_server->connections > 0);

While this does indeed avoid the NULL dereference right here, I still
want to understand why nbd_server is getting set to NULL while there
is still an active client, and make sure we don't have any other NULL
derefs.  I'll respond again once I've studied the code a bit more.

> >       nbd_server->connections--;
> >       nbd_update_server_watch(nbd_server);
> 
> -- 
> Best regards,
> Alexander Ivanov

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed()
  2024-06-10 12:33   ` Eric Blake
@ 2024-06-10 12:52     ` Alexander Ivanov
  2024-06-18  8:44     ` Alexander Ivanov
  1 sibling, 0 replies; 7+ messages in thread
From: Alexander Ivanov @ 2024-06-10 12:52 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, den, andrey.drobyshev, vsementsov, kwolf,
	hreitz



On 6/10/24 14:33, Eric Blake wrote:
> On Sat, Jun 08, 2024 at 11:36:59AM GMT, Alexander Ivanov wrote:
>> There is a bug reproducer in the attachment.
> Summarizing the reproducer, you are repeatedly calling QMP
> nbd-server-start/nbd-server-stop on qemu as NBD server in one thread,
> and repeatedly calling 'qemu-nbd -L' in another, to try and provoke
> the race where the server is stopped while qemu-nbd -L is still trying
> to grab information.
Yes, you are right.
>
>>
>> On 6/7/24 17:00, Alexander Ivanov wrote:
>>> In some cases, the NBD server can be stopped before
>>> nbd_blockdev_client_closed() is called, causing the nbd_server variable
>>> to be nullified. This leads to a NULL pointer dereference when accessing
>>> nbd_server.
> Am I correct that the NULL pointer deref was in qemu-nbd in your
> reproducer, and not in qemu-kvm?
No, it happened in qemu-kvm.
>
>>> Add a NULL check for nbd_server to the nbd_blockdev_client_closed()
>>> function to prevent NULL pointer dereference.
>>>
>>> Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
>>> ---
>>>    blockdev-nbd.c | 3 +++
>>>    1 file changed, 3 insertions(+)
>>>
>>> diff --git a/blockdev-nbd.c b/blockdev-nbd.c
>>> index 213012435f..fb1f30ae0d 100644
>>> --- a/blockdev-nbd.c
>>> +++ b/blockdev-nbd.c
>>> @@ -52,6 +52,9 @@ int nbd_server_max_connections(void)
>>>    static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
>>>    {
>>>        nbd_client_put(client);
>>> +    if (nbd_server == NULL) {
>>> +        return;
>>> +    }
>>>        assert(nbd_server->connections > 0);
> While this does indeed avoid the NULL dereference right here, I still
> want to understand why nbd_server is getting set to NULL while there
> is still an active client, and make sure we don't have any other NULL
> derefs.  I'll respond again once I've studied the code a bit more.
I agree that the patch fixes only symptoms, but haven't succeeded with the
bug roots investigation, and decided to raise this issue with the community.

Also, later I reproduced another bug with the patch applied and with the
same reproducer. There was an

    assert(nbd_server->connections > 0);

violation in nbd_blockdev_client_closed(). It happens much less frequently,
however. Think the reasons are similar to the original bug.
>
>>>        nbd_server->connections--;
>>>        nbd_update_server_watch(nbd_server);
>> -- 
>> Best regards,
>> Alexander Ivanov

-- 
Best regards,
Alexander Ivanov



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed()
  2024-06-10 12:33   ` Eric Blake
  2024-06-10 12:52     ` Alexander Ivanov
@ 2024-06-18  8:44     ` Alexander Ivanov
  1 sibling, 0 replies; 7+ messages in thread
From: Alexander Ivanov @ 2024-06-18  8:44 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, den, andrey.drobyshev, vsementsov, kwolf,
	hreitz

Hello Eric,

Do you have any ideas about the bug?

Thank you.

On 6/10/24 14:33, Eric Blake wrote:
> On Sat, Jun 08, 2024 at 11:36:59AM GMT, Alexander Ivanov wrote:
>> There is a bug reproducer in the attachment.
> Summarizing the reproducer, you are repeatedly calling QMP
> nbd-server-start/nbd-server-stop on qemu as NBD server in one thread,
> and repeatedly calling 'qemu-nbd -L' in another, to try and provoke
> the race where the server is stopped while qemu-nbd -L is still trying
> to grab information.
>
>>
>> On 6/7/24 17:00, Alexander Ivanov wrote:
>>> In some cases, the NBD server can be stopped before
>>> nbd_blockdev_client_closed() is called, causing the nbd_server variable
>>> to be nullified. This leads to a NULL pointer dereference when accessing
>>> nbd_server.
> Am I correct that the NULL pointer deref was in qemu-nbd in your
> reproducer, and not in qemu-kvm?
>
>>> Add a NULL check for nbd_server to the nbd_blockdev_client_closed()
>>> function to prevent NULL pointer dereference.
>>>
>>> Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
>>> ---
>>>    blockdev-nbd.c | 3 +++
>>>    1 file changed, 3 insertions(+)
>>>
>>> diff --git a/blockdev-nbd.c b/blockdev-nbd.c
>>> index 213012435f..fb1f30ae0d 100644
>>> --- a/blockdev-nbd.c
>>> +++ b/blockdev-nbd.c
>>> @@ -52,6 +52,9 @@ int nbd_server_max_connections(void)
>>>    static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
>>>    {
>>>        nbd_client_put(client);
>>> +    if (nbd_server == NULL) {
>>> +        return;
>>> +    }
>>>        assert(nbd_server->connections > 0);
> While this does indeed avoid the NULL dereference right here, I still
> want to understand why nbd_server is getting set to NULL while there
> is still an active client, and make sure we don't have any other NULL
> derefs.  I'll respond again once I've studied the code a bit more.
>
>>>        nbd_server->connections--;
>>>        nbd_update_server_watch(nbd_server);
>> -- 
>> Best regards,
>> Alexander Ivanov

-- 
Best regards,
Alexander Ivanov



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed()
  2024-06-07 15:00 [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed() Alexander Ivanov
  2024-06-08  9:36 ` Alexander Ivanov
@ 2024-06-28  9:58 ` Alexander Ivanov
  2024-07-30  2:35   ` Eric Blake
  1 sibling, 1 reply; 7+ messages in thread
From: Alexander Ivanov @ 2024-06-28  9:58 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, den, andrey.drobyshev, eblake, vsementsov, kwolf,
	hreitz

Ping?

On 6/7/24 17:00, Alexander Ivanov wrote:
>   static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
>   {
>       nbd_client_put(client);
> +    if (nbd_server == NULL) {
> +        return;
> +    }
>       assert(nbd_server->connections > 0);
>       nbd_server->connections--;
>       nbd_update_server_watch(nbd_server);

-- 
Best regards,
Alexander Ivanov



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed()
  2024-06-28  9:58 ` Alexander Ivanov
@ 2024-07-30  2:35   ` Eric Blake
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Blake @ 2024-07-30  2:35 UTC (permalink / raw)
  To: Alexander Ivanov
  Cc: qemu-devel, qemu-block, den, andrey.drobyshev, vsementsov, kwolf,
	hreitz, berrange

On Fri, Jun 28, 2024 at 11:58:37AM GMT, Alexander Ivanov wrote:
> Ping?
> 
> On 6/7/24 17:00, Alexander Ivanov wrote:
> >   static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
> >   {
> >       nbd_client_put(client);
> > +    if (nbd_server == NULL) {
> > +        return;
> > +    }
> >       assert(nbd_server->connections > 0);
> >       nbd_server->connections--;
> >       nbd_update_server_watch(nbd_server);
>

I've spent more time looking at this, and I'm stumped.  Maybe Dan can
help out.  If I understand correctly, here's the race situation we
have:

qemu main loop                coroutine                    client
QMP nbd-server-start
 - calls nbd_server_start
   - assigns nbd_server = g_new0()
   - calls qio_net_listener_open_sync
   - calls nbd_server_update_watch
     - calls qio_net_listener_set_client_func(, nbd_accept, )
     - waiting for client is now in coroutine
- returns control to QMP main loop
                                                           opens TCP socket
                              - qio notices incoming connection
                                - calls nbd_accept callback
                                - starts NBD handshake nbd_client_new(, nbd_blockdev_client_closed)
QMP nbd-server-stop
 - calls nbd_server_stop
   - calls nbd_server_free
     - calls qio_net_listener_disconnect()
       - marks qio channel as useless
   - sets nbd_server = NULL
                                - qio sees channel is useless, disconnects
                                                           client deals gracefully with dead connection
                                - calls nbd_blockdev_client_closed callback
                                - segfault since nbd_server->connections derefs NULL


Thread 1 "qemu-kvm" received signal SIGSEGV, Segmentation fault.
0x000055555600af59 in nbd_blockdev_client_closed (client=0x55555810dfc0, 
    ignored=false) at ../blockdev-nbd.c:56
56	    assert(nbd_server->connections > 0);
(gdb) p nbd_server
$1 = (NBDServerData *) 0x0
(gdb) bt
#0  0x000055555600af59 in nbd_blockdev_client_closed
    (client=0x55555810dfc0, ignored=false) at ../blockdev-nbd.c:56
#1  0x0000555555ff72f9 in client_close
    (client=0x55555810dfc0, negotiated=false) at ../nbd/server.c:1624
#2  0x0000555555ffbc49 in nbd_co_client_start (opaque=0x55555810dfc0)
    at ../nbd/server.c:3198
#3  0x0000555556220629 in coroutine_trampoline (i0=1488434896, i1=21845)

Worse, the race could go another way: if another QMP nbd-server-start
is called fast enough before the coroutine finishes the nbd_accept
code from the first connection, it could attempt to modify the second
nbd_server object, which may be associated with a completely different
socket.

As far as I can tell, the problem stems from the fact that the attempt
to establish the connection with the client is continuing in a
background coroutine despite our efforts to honor the QMP
nbd-server-stop command.  But I'm not sure on the proper way to inform
qio to abandon all incoming traffic to a given net listener.  Or maybe
I should just change the semantics of QMP nbd-server-stop to fail if
there are known connections from and nbd_accept() - but I still don't
want that to wait indefinitely, so I still want some way to tell the
qio channel that the server is going away and to tear down sockets as
soon as possible.

As a stopgap, something like this might prevent the SEGV:

diff --git i/blockdev-nbd.c w/blockdev-nbd.c
index 213012435f4..2f5ea094407 100644
--- i/blockdev-nbd.c
+++ w/blockdev-nbd.c
@@ -277,6 +277,12 @@ void qmp_nbd_server_stop(Error **errp)

     blk_exp_close_all_type(BLOCK_EXPORT_TYPE_NBD);

+    if (qio_net_listener_is_connected(nbd_server->listener) ||
+        nbd_server->connections > 0) {
+        error_setg(errp, "NBD server still has connected clients");
+        return;
+    }
+
     nbd_server_free(nbd_server);
     nbd_server = NULL;
 }

but it's not as graceful as I'd like (it would be nicer to have the
nbd-server-stop command wait until it knows all connections are
cleaned, instead of failing up front).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-07-30  2:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-07 15:00 [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed() Alexander Ivanov
2024-06-08  9:36 ` Alexander Ivanov
2024-06-10 12:33   ` Eric Blake
2024-06-10 12:52     ` Alexander Ivanov
2024-06-18  8:44     ` Alexander Ivanov
2024-06-28  9:58 ` Alexander Ivanov
2024-07-30  2:35   ` Eric Blake

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).