From: Markus Pargmann <mpa@pengutronix.de>
To: Ratna Manoj <manoj.br@gmail.com>
Cc: pbonzini@redhat.com, jack@suse.cz, Gou Rao <grao@portworx.com>,
Vinod Jayaraman <jv@portworx.com>,
nbd-general@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] NBD: replace kill_bdev() with __invalidate_device()
Date: Thu, 28 Apr 2016 11:00:20 +0200 [thread overview]
Message-ID: <1809196.i4ZeAM11iZ@adelgunde> (raw)
In-Reply-To: <571ADB31.2000609@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 6370 bytes --]
Hi,
On Saturday 23 April 2016 07:47:21 Ratna Manoj wrote:
> Thanks for the review.
>
> Atleast for ext4 this crash happens on a sys_umount() call, timing of
> which is not in control of block driver. Block driver cannot force the
> filesystems to be unmounted, and the file system does not expect
> buffers to get unmapped under it.
Yes the block driver can't force a clean umount.
>
> Ext4 can be fixed with the this patch:
> http://www.spinics.net/lists/linux-ext4/msg51112.html
> It did not make to the kernel. It checks the state of the buffer head
> before committing.
>
> When we consider diskett/CD as user space thread that called NBD_DO_IT,
> this problem is analogous to changing disk with another or the same
> disk suddenly when the file system is still mounted.
>
> If we completely kill the block device we would loss some writes when
> same thread is reconnected.
I am not so sure about your exact use-case here.
If the NBD_DO_IT thread returns I am considering the connection and
block device as dead and disconnected. Securing any data afterwards with
a new connection is potentially dangerous as it may be a different
server.
>
> if we do not completely kill or if we only invalidate clean buffers,
> we will have inconsistency on re-attach with a different thread
> (analogous to replacing disk with different disk suddenly).
Yes exactly. That's why I suggested that NBD_DO_IT waits until all
blockdevice users are gone. This would avoid any issues with
writing/reading data to a wrong server.
Best Regards,
Markus
>
> Ratna.
>
>
> On Wed, Apr 20, 2016 at 4:36 PM, Markus Pargmann <mpa@pengutronix.de> wrote:
>
> > Hi,
> >
> > On Thursday 24 March 2016 07:04:10 Ratna Manoj wrote:
> > > From: Ratna Manoj Bolla <manoj.br@gmail.com>
> > >
> > > When a filesystem is mounted on a nbd device and on a disconnect, because
> > > of kill_bdev(), and resetting bdev size to zero, buffer_head mappings are
> > > getting destroyed under mounted filesystem.
> > >
> > > After a bdev size reset(i.e bdev->bd_inode->i_size = 0) on a disconnect,
> > > followed by a sys_umount(),
> > > generic_shutdown_super()->...
> > > ->__sync_blockdev()->...
> > > -> blkdev_writepages()->...
> > > ->do_invalidatepage()->...
> > > -> discard_buffer() is discarding superblock buffer_head
> > assumed
> > > to be in mapped state by ext4_commit_super().
> > >
> > >
> > >
> > > Signed-off-by: Ratna Manoj Bolla <manoj.br@gmail.com>
> > > ---
> > > This script reproduces both the kernel panic scenarios:
> > >
> > > $ qemu-img create -f qcow2 f.img 1G
> > > $ mkfs.ext4 f.img
> > > $ qemu-nbd -c /dev/nbd0 f.img
> > > $ mount /dev/nbd0 dir
> > > $ killall -KILL qemu-nbd
> > > $ sleep 1
> > > $ ls dir
> > > $ umount dir
> > >
> > > Bug reports:
> > > http://www.kernelhub.org/?p=2&msg=361407
> > >
> > https://www.mail-archive.com/nbd-general@lists.sourceforge.net/msg02388.html
> >
> > Thanks, please CC nbd-general@lists.sourceforge.net,
> > linux-kernel@vger.kernel.org as well.
> >
> > So this patch simply does not cleanup the blockdevice to avoid any
> > errors on the filesystem side. The userspace thread that called
> > NBD_DO_IT will exit immediately before the filesystem decided to release
> > the blockdevice. The nbd driver assumes that the shutdown was done and
> > accepts new clients setting up sockets and so on. Couldn't this lead to
> > a lot of problems?
> >
> > Currently NBD_DO_IT returns when it is save to use the NBD device again.
> > This patch changes this as the blockdevice may still be in use when
> > NBD_DO_IT returns. I think it would be better to delay NBD_DO_IT until
> > everything is cleaned up and all filesystems are closed.
> >
> > Best Regards,
> >
> > Markus
> >
> > >
> > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > index f6b51d7..6e77b3a 100644
> > > --- a/drivers/block/nbd.c
> > > +++ b/drivers/block/nbd.c
> > > @@ -119,7 +119,8 @@ static const char *nbdcmd_to_ascii(int cmd)
> > >
> > > static int nbd_size_clear(struct nbd_device *nbd, struct block_device
> > *bdev)
> > > {
> > > - bdev->bd_inode->i_size = 0;
> > > + if (bdev->bd_openers <= 1)
> > > + bdev->bd_inode->i_size = 0;
> > > set_capacity(nbd->disk, 0);
> > > kobject_uevent(&nbd_to_dev(nbd)->kobj, KOBJ_CHANGE);
> > >
> > > @@ -678,6 +679,9 @@ static void nbd_reset(struct nbd_device *nbd)
> > >
> > > static void nbd_bdev_reset(struct block_device *bdev)
> > > {
> > > + if (bdev->bd_openers > 1)
> > > + return;
> > > +
> > > set_device_ro(bdev, false);
> > > bdev->bd_inode->i_size = 0;
> > > if (max_part > 0) {
> > > @@ -735,7 +739,7 @@ static int __nbd_ioctl(struct block_device *bdev,
> > struct nbd_device *nbd,
> > > nbd_clear_que(nbd);
> > > BUG_ON(!list_empty(&nbd->queue_head));
> > > BUG_ON(!list_empty(&nbd->waiting_queue));
> > > - kill_bdev(bdev);
> > > + __invalidate_device(bdev, true);
> > > return 0;
> > >
> > > case NBD_SET_SOCK: {
> > > @@ -809,7 +813,7 @@ static int __nbd_ioctl(struct block_device *bdev,
> > struct nbd_device *nbd,
> > >
> > > sock_shutdown(nbd);
> > > nbd_clear_que(nbd);
> > > - kill_bdev(bdev);
> > > + __invalidate_device(bdev, true);
> > > nbd_bdev_reset(bdev);
> > >
> > > if (nbd->disconnect) /* user requested, ignore socket
> > errors */
> > >
> > >
> > >
> >
> > --
> > Pengutronix e.K. | |
> > Industrial Linux Solutions | http://www.pengutronix.de/ |
> > Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
> > Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
> >
>
>
>
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
next prev parent reply other threads:[~2016-04-28 9:00 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <56E711D1.9090708@gmail.com>
[not found] ` <2358965.KUbuAa1sts@adelgunde>
[not found] ` <56F34412.1050707@gmail.com>
2016-04-20 11:06 ` [PATCH] NBD: replace kill_bdev() with __invalidate_device() Markus Pargmann
2016-04-23 2:17 ` Ratna Manoj
2016-04-28 9:00 ` Markus Pargmann [this message]
2016-04-28 16:27 ` [Nbd] " Wouter Verhelst
2016-04-28 18:43 ` Ratna Manoj
2016-05-12 9:53 ` Markus Pargmann
2016-05-15 12:55 ` Wouter Verhelst
2016-05-19 6:35 ` Markus Pargmann
2016-05-24 15:17 ` Wouter Verhelst
2016-05-10 9:24 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1809196.i4ZeAM11iZ@adelgunde \
--to=mpa@pengutronix.de \
--cc=grao@portworx.com \
--cc=jack@suse.cz \
--cc=jv@portworx.com \
--cc=linux-kernel@vger.kernel.org \
--cc=manoj.br@gmail.com \
--cc=nbd-general@lists.sourceforge.net \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox