From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752627AbcF2G5W (ORCPT ); Wed, 29 Jun 2016 02:57:22 -0400 Received: from metis.ext.4.pengutronix.de ([92.198.50.35]:45574 "EHLO metis.ext.4.pengutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751230AbcF2G5V (ORCPT ); Wed, 29 Jun 2016 02:57:21 -0400 From: Markus Pargmann To: Pranay Srivastava Cc: nbd-general@lists.sourceforge.net, linux-kernel@vger.kernel.org, Wouter Verhelst Subject: Re: [PATCH 1/2] nbd: make nbd device wait for its users Date: Wed, 29 Jun 2016 08:57:18 +0200 Message-ID: <2136142.XBTNBr86gI@adelgunde> User-Agent: KMail/4.14.1 (Linux/4.6.0-0.bpo.1-amd64; KDE/4.14.2; x86_64; ; ) In-Reply-To: References: <1466760574-1916-1-git-send-email-mpa@pengutronix.de> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart1614127.mNBS4XGkob"; micalg="pgp-sha256"; protocol="application/pgp-signature" X-SA-Exim-Connect-IP: 2001:67c:670:100:a61f:72ff:fe68:75ba X-SA-Exim-Mail-From: mpa@pengutronix.de X-SA-Exim-Scanned: No (on metis.ext.pengutronix.de); SAEximRunCond expanded to false X-PTX-Original-Recipient: linux-kernel@vger.kernel.org Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --nextPart1614127.mNBS4XGkob Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="us-ascii" Hi, On Saturday 25 June 2016 23:22:06 Pranay Srivastava wrote: > On Fri, Jun 24, 2016 at 2:59 PM, Markus Pargmann = wrote: > > From: "Pranay Kr. Srivastava" > > > > When a timeout occurs or a recv fails, then instead of abruplty kil= ling > > nbd block device wait for it's users to finish. > > > > This is more required when filesystem(s) like ext2 or ext3 don't ex= pect > > their buffer heads to disappear while the filesystem is mounted. > > > > Each open is counted. The blockdevice is kept open until the last u= ser > > closes the block device. This offers the possibility as well to ope= n a > > new socket to be used while the filesystems are mounted. > > > > Signed-off-by: Pranay Kr. Srivastava > > > > [mpa: Keep the blockdevice open until all users left] > > Signed-off-by: Markus Pargmann > > --- > > Hi, > > > > I used your patch and changed it a bit based on my ideas. The gener= al > > difference is that this keeps the block device open. After all user= s left, the > > device is reset. > > > > The followup patch then restricts access to ioctls after a disconne= ct. I wanted > > to avoid that anyone sets up anything new without all the old users= leaving. > > > > Please let me know what you think about this. > > > > Best Regards, > > > > Markus > > > > drivers/block/nbd.c | 62 ++++++++++++++++++++++++++++++++++++++---= =2D----------- > > 1 file changed, 45 insertions(+), 17 deletions(-) > > > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c > > index 1efc26bf1d21..620660f3ff0f 100644 > > --- a/drivers/block/nbd.c > > +++ b/drivers/block/nbd.c > > @@ -69,6 +69,8 @@ struct nbd_device { > > #if IS_ENABLED(CONFIG_DEBUG_FS) > > struct dentry *dbg_dir; > > #endif > > + atomic_t users; /* Users that opened the block device */ >=20 > We don't need to put bdev in nbd struct. We can get it via gendisk an= d > bdget/bdput. Indeed, thanks. > > + struct block_device *bdev; > > }; > > > > #if IS_ENABLED(CONFIG_DEBUG_FS) > > @@ -655,9 +657,26 @@ static int nbd_set_socket(struct nbd_device *n= bd, struct socket *sock) > > return ret; > > } > > > > +static void nbd_bdev_reset(struct block_device *bdev) > > +{ > > + set_device_ro(bdev, false); > > + bdev->bd_inode->i_size =3D 0; >=20 > i_size_write should be better I guess. Yes, but that is a separate patch. This code is just moved >=20 > > + if (max_part > 0) { > > + blkdev_reread_part(bdev); > > + bdev->bd_invalidated =3D 1; > > + } > > +} > > + > > /* Reset all properties of an NBD device */ > > static void nbd_reset(struct nbd_device *nbd) > > { > > + sock_shutdown(nbd); > > + nbd_clear_que(nbd); > > + if (nbd->bdev) { > > + kill_bdev(nbd->bdev); > > + nbd_bdev_reset(nbd->bdev); > > + } > > + > > nbd->disconnect =3D false; > > nbd->timedout =3D false; > > nbd->blksize =3D 1024; >=20 > I actually forgot to ask about this blksize. We do set_blocksize call= > for bdev, but > shouldn't we instead do blkdev_logicial_blocksize? >=20 > Mount would set the blocksize depending on the filesystem's block siz= e and the > block device logical block size supported. Should we just get rid of = this? I think we should perhaps set the logical blocksize as well. But that's= a separate topic as well. >=20 > > @@ -669,16 +688,6 @@ static void nbd_reset(struct nbd_device *nbd) > > del_timer_sync(&nbd->timeout_timer); > > } > > > > -static void nbd_bdev_reset(struct block_device *bdev) > > -{ > > - set_device_ro(bdev, false); > > - bdev->bd_inode->i_size =3D 0; > > - if (max_part > 0) { > > - blkdev_reread_part(bdev); > > - bdev->bd_invalidated =3D 1; > > - } > > -} > > - > > static void nbd_parse_flags(struct nbd_device *nbd, struct block_d= evice *bdev) > > { > > if (nbd->flags & NBD_FLAG_READ_ONLY) > > @@ -803,18 +812,11 @@ static int __nbd_ioctl(struct block_device *b= dev, struct nbd_device *nbd, >=20 > How about disallowing any ioctl until the nbd_device has been reset? > Perhaps, throw error > in open instead of checking here? This is a comment to patch 2?! The idea was to not allow any control over the nbd device unless some clear instructions for a cleanup are send, for example CLEAR_SOCK. Afte= r that nothing will work. >=20 > > mutex_lock(&nbd->tx_lock); > > nbd->task_recv =3D NULL; > > > > - sock_shutdown(nbd); > > - nbd_clear_que(nbd); > > - kill_bdev(bdev); > > - nbd_bdev_reset(bdev); > > - > > if (nbd->disconnect) /* user requested, ignore sock= et errors */ > > error =3D 0; > > if (nbd->timedout) > > error =3D -ETIMEDOUT; > > >=20 > Doesn't this change the semantics for user space? NBD_DO_IT is > supposed to be over either on error or a > user disconnect not before that right[?]. Yes it does. But it will automatically cleanup the remaining pieces. Fo= r example for the standard nbd-client implementation a CLEAR_SOCK is called afterwards which will end up with the same result as with the previous code, killing all bdev. Otherwise after all block device users= left, the cleanup is done. If someone tries to use the nbd before it wa= s cleaned up, it will return -EBUSY with an error message indicating the reason. >=20 > > - nbd_reset(nbd); > > - > > return error; > > } > > > > @@ -853,10 +855,35 @@ static int nbd_ioctl(struct block_device *bde= v, fmode_t mode, > > return error; > > } > > > > +static int nbd_open(struct block_device *bdev, fmode_t mode) > > +{ > > + struct nbd_device *nbd =3D bdev->bd_disk->private_data; > > + > > + atomic_inc(&nbd->users); > > + > > + if (!nbd->bdev) > > + nbd->bdev =3D bdev; > > + > > + return 0; > > +} > > + > > +static void nbd_release(struct gendisk *disk, fmode_t mode) > > +{ > > + struct nbd_device *nbd =3D disk->private_data; > > + > > + if (atomic_dec_and_test(&nbd->users)) { > > + mutex_lock(&nbd->tx_lock); > > + nbd_reset(nbd); > > + mutex_unlock(&nbd->tx_lock); > > + } > > +} > > + > What if the following happens, >=20 > 1) nbd-client [nbd-c1] attaches the block device >=20 > 2) mount this nbd device >=20 > 3) somebody goes crazy and does nbd-client -d >=20 > Step 3) would cause a DISCONNECT and a CLEAR_SOCK ioctl to be fired, > but in CLEAR_SOCK there's a call to kill_bdev, so I guess that needs > to go as well[?]. No, this is exactly what I wanted to happen. CLEAR_SOCK is an ioctl tha= t should ensure that the socket connection is gone. I would prefer a client that has a "--force" option. Depending on this option CLEAR_SOCK is called or not. This way the user can decide what happens on disconnect or when the nbd device is still busy. Best Regards, Markus >=20 > > static const struct block_device_operations nbd_fops =3D { > > .owner =3D THIS_MODULE, > > .ioctl =3D nbd_ioctl, > > .compat_ioctl =3D nbd_ioctl, > > + .open =3D nbd_open, > > + .release =3D nbd_release, > > }; > > > > #if IS_ENABLED(CONFIG_DEBUG_FS) > > @@ -1087,6 +1114,7 @@ static int __init nbd_init(void) > > disk->private_data =3D &nbd_dev[i]; > > sprintf(disk->disk_name, "nbd%d", i); > > nbd_reset(&nbd_dev[i]); > > + atomic_set(&nbd_dev[i].users, 0); > > add_disk(disk); > > } > > > > -- > > 2.1.4 > > >=20 >=20 =2D-=20 Pengutronix e.K. | = | Industrial Linux Solutions | http://www.pengutronix.de/= | Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 = | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-555= 5 | --nextPart1614127.mNBS4XGkob Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part. Content-Transfer-Encoding: 7Bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJXc3FOAAoJENnm3voMNZulNHQP/0rDB2Z78aoOp8DM00W7YWLy NQqfElwXgEA67Al9YtZtUEuogVGIrN2eCYQIeoCnTNpUaJ3970CfqKd9EeFTwyRX f+B/WK51nkV1xhcft8bWZszZ2wcWVtgWNrP8Wmph8XXdfWX6VZtm+c31EXQfyMke 4Nn+YeuNnOHdJo3avkF/K89Md14RjWZ08IN7O6Ek8t9kGLAM3eTcK5c9Haujfqqv MkPbhyEzgk7Sya7P6sTjTS93NYoPyBfT2LsyrrT6J7D/1WuMUJdj3we+PHNPHcU0 HCHOOJKnZzSWXynRjJkli1w9vkZoVV4/qeK1cFuriWZptgG/QYe+dVVohdx1NFw2 eyundnzwVmSJOMYJSiLatqoxQCSuCpSchzmRQSnSCe3nIZ9FnSgQkSMfTM82qqoY +azazgPKKgWlHTHbbTovUwlrzrl/TT2qQw45A9/N4jKP3g49YGq8HsbARmaUAUtd 7yyGRfCIl7/uge4h6vdFGL7phXkOYGeB6xKOTpYgxMGVhH3IXe0xjhgc69JFmA8+ 0cu/7HGL+a7jGB86fnkia58M69XB1uMwHHdrqLH2uVFQR4lnS45yznFwcrtCxocM K/xkbxmrqLvQJN3UEVH/J8fU1cr+SwOUJ6RLmpPOtq9nqPRtqINycS0yWGFoqWUl hldXKsVFqoA1aYE8eBxM =zHFc -----END PGP SIGNATURE----- --nextPart1614127.mNBS4XGkob--