From: Mike Christie <mchristi@redhat.com>
To: Sun Ke <sunke32@huawei.com>, josef@toxicpanda.com, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, nbd@other.debian.org,
linux-kernel@vger.kernel.org, Xiubo Li <xiubli@redhat.com>
Subject: Re: [PATCH] nbd: fix potential NULL pointer fault in connect and disconnect process
Date: Fri, 17 Jan 2020 11:32:06 -0600 [thread overview]
Message-ID: <5E21EF96.1010204@redhat.com> (raw)
In-Reply-To: <20200117115005.37006-1-sunke32@huawei.com>
On 01/17/2020 05:50 AM, Sun Ke wrote:
> Connect and disconnect a nbd device repeatedly, will cause
> NULL pointer fault.
>
> It will appear by the steps:
> 1. Connect the nbd device and disconnect it, but now nbd device
> is not disconnected totally.
> 2. Connect the same nbd device again immediately, it will fail
> in nbd_start_device with a EBUSY return value.
> 3. Wait a second to make sure the last config_refs is reduced
> and run nbd_config_put to disconnect the nbd device totally.
> 4. Start another process to open the nbd_device, config_refs
> will increase and at the same time disconnect it.
Just to make sure I understood this, for step 4 the process is doing:
open(/dev/nbdX);
ioctl(NBD_DISCONNECT, /dev/nbdX) or nbd_genl_disconnect(for /dev/nbdX)
?
There is no successful NBD_DO_IT / nbd_genl_connect between the open and
disconnect calls at step #4, because it would normally be done at #2 and
that failed. nbd_disconnect_and_put could then reference a null
recv_workq. If we are also racing with a close() then that could free
the device/config from under nbd_disconnect_and_put.
>
> To fix it, add a NBD_HAS_STARTED flag. Set it in nbd_start_device_ioctl
I'm not sure if we need the new bit. We could just add a check for a non
null task_recv in nbd_genl_disconnect like how nbd_start_device and
nbd_genl_disconnect do.
The new bit might be more clear which is nice. If we got this route,
should the new bit be a runtime_flag like other device state bits?
> and nbd_genl_connect if nbd device is started successfully.
> Clear it in nbd_config_put. Test it in nbd_genl_disconnect and
> nbd_genl_reconfigure.
>
> Signed-off-by: Sun Ke <sunke32@huawei.com>
> ---
> drivers/block/nbd.c | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index b4607dd96185..ddd364e208ab 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -83,6 +83,7 @@ struct link_dead_args {
>
> #define NBD_DESTROY_ON_DISCONNECT 0
> #define NBD_DISCONNECT_REQUESTED 1
> +#define NBD_HAS_STARTED 2
>
> struct nbd_config {
> u32 flags;
> @@ -1215,6 +1216,7 @@ static void nbd_config_put(struct nbd_device *nbd)
> nbd->disk->queue->limits.discard_alignment = 0;
> blk_queue_max_discard_sectors(nbd->disk->queue, UINT_MAX);
> blk_queue_flag_clear(QUEUE_FLAG_DISCARD, nbd->disk->queue);
> + clear_bit(NBD_HAS_STARTED, &nbd->flags);
>
> mutex_unlock(&nbd->config_lock);
> nbd_put(nbd);
> @@ -1290,6 +1292,8 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
> ret = nbd_start_device(nbd);
> if (ret)
> return ret;
> + else
> + set_bit(NBD_HAS_STARTED, &nbd->flags);
>
> if (max_part)
> bdev->bd_invalidated = 1;
> @@ -1961,6 +1965,7 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
> mutex_unlock(&nbd->config_lock);
> if (!ret) {
> set_bit(NBD_RT_HAS_CONFIG_REF, &config->runtime_flags);
> + set_bit(NBD_HAS_STARTED, &nbd->flags);
> refcount_inc(&nbd->config_refs);
> nbd_connect_reply(info, nbd->index);
> }
> @@ -2008,6 +2013,14 @@ static int nbd_genl_disconnect(struct sk_buff *skb, struct genl_info *info)
> index);
> return -EINVAL;
> }
> +
> + if (!test_bit(NBD_HAS_STARTED, &nbd->flags)) {
> + mutex_unlock(&nbd_index_mutex);
> + printk(KERN_ERR "nbd: device at index %d failed to start\n",
> + index);
> + return -EBUSY;
> + }
> +
> if (!refcount_inc_not_zero(&nbd->refs)) {
> mutex_unlock(&nbd_index_mutex);
> printk(KERN_ERR "nbd: device at index %d is going down\n",
> @@ -2049,6 +2062,14 @@ static int nbd_genl_reconfigure(struct sk_buff *skb, struct genl_info *info)
> index);
> return -EINVAL;
> }
> +
> + if (!test_bit(NBD_HAS_STARTED, &nbd->flags)) {
> + mutex_unlock(&nbd_index_mutex);
> + printk(KERN_ERR "nbd: device at index %d failed to start\n",
> + index);
> + return -EBUSY;
> + }
> +
> if (!refcount_inc_not_zero(&nbd->refs)) {
> mutex_unlock(&nbd_index_mutex);
> printk(KERN_ERR "nbd: device at index %d is going down\n",
>
next prev parent reply other threads:[~2020-01-17 17:32 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-17 11:50 [PATCH] nbd: fix potential NULL pointer fault in connect and disconnect process Sun Ke
2020-01-17 14:18 ` Josef Bacik
2020-01-19 6:27 ` sunke (E)
2020-01-17 17:32 ` Mike Christie [this message]
2020-01-19 7:10 ` sunke (E)
2020-02-10 3:16 ` Mike Christie
2020-02-10 9:15 ` sunke (E)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5E21EF96.1010204@redhat.com \
--to=mchristi@redhat.com \
--cc=axboe@kernel.dk \
--cc=josef@toxicpanda.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=nbd@other.debian.org \
--cc=sunke32@huawei.com \
--cc=xiubli@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.