Re: [PATCH] NBD: replace kill_bdev() with __invalidate_device()

From: Markus Pargmann <mpa@pengutronix.de>
To: Ratna Manoj <manoj.br@gmail.com>
Cc: pbonzini@redhat.com, jack@suse.cz, Gou Rao <grao@portworx.com>,
	Vinod Jayaraman <jv@portworx.com>,
	nbd-general@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] NBD: replace kill_bdev() with __invalidate_device()
Date: Thu, 28 Apr 2016 11:00:20 +0200
Message-ID: <1809196.i4ZeAM11iZ@adelgunde>
In-Reply-To: <571ADB31.2000609@gmail.com>

Hi,

On Saturday 23 April 2016 07:47:21 Ratna Manoj wrote:
> Thanks for the review.
>
> At least for ext4, this crash happens on a sys_umount() call, the timing
> of which is not under the control of the block driver. The block driver
> cannot force the filesystems to be unmounted, and the filesystem does not
> expect buffers to get unmapped under it.

Yes, the block driver can't force a clean umount.

> 
> Ext4 can be fixed with this patch:
> http://www.spinics.net/lists/linux-ext4/msg51112.html
> It did not make it into the kernel. It checks the state of the buffer
> head before committing.
>
> If we think of the userspace thread that called NBD_DO_IT as a
> diskette/CD, this problem is analogous to suddenly swapping the disk for
> another one, or for the same one, while the filesystem is still mounted.
>
> If we completely kill the block device, we would lose some writes when
> the same thread is reconnected.

I am not so sure about your exact use case here.

If the NBD_DO_IT thread returns, I consider the connection and the
block device dead and disconnected. Securing any data afterwards with
a new connection is potentially dangerous, as it may be a different
server.

> 
> If we do not completely kill it, or if we only invalidate clean buffers,
> we will have an inconsistency on re-attach with a different thread
> (analogous to suddenly replacing the disk with a different disk).

Yes, exactly. That's why I suggested that NBD_DO_IT wait until all
block device users are gone. This would avoid any issues with
writing/reading data to/from a wrong server.
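
As a rough illustration of that ordering, here is a minimal userspace
model in C. It is not kernel code and not a proposed patch: struct
model_nbd and all model_* functions are hypothetical names invented for
this sketch, and "openers" is assumed to count only users other than the
thread that called NBD_DO_IT (for example a mounted filesystem). The
point is only that the size reset and the return of NBD_DO_IT are both
deferred until the last such user has released the device.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of these names exist in drivers/block/nbd.c. */
struct model_nbd {
	int openers;           /* assumed: users other than the NBD_DO_IT caller */
	long long size;        /* stands in for bdev->bd_inode->i_size */
	bool disconnected;     /* the socket is gone and teardown was requested */
	bool do_it_may_return; /* NBD_DO_IT would be allowed to return */
};

/* Complete the teardown only once nobody holds the device open. */
static void model_try_finish(struct model_nbd *nbd)
{
	if (nbd->disconnected && nbd->openers == 0) {
		nbd->size = 0;
		nbd->do_it_may_return = true;
	}
}

/* A user (e.g. a mount) opens the device. */
static void model_open(struct model_nbd *nbd)
{
	nbd->openers++;
}

/* A user releases the device; the last release may finish the teardown. */
static void model_release(struct model_nbd *nbd)
{
	assert(nbd->openers > 0);
	nbd->openers--;
	model_try_finish(nbd);
}

/* The connection dies; the teardown is deferred while users remain. */
static void model_disconnect(struct model_nbd *nbd)
{
	nbd->disconnected = true;
	model_try_finish(nbd);
}
```

In this model, calling model_disconnect() while a user still holds the
device leaves the size untouched and keeps do_it_may_return false; only
the final model_release() resets the size and lets NBD_DO_IT return. How
the real driver would block and wake the ioctl caller is left open here.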

Best Regards,

Markus

> 
> Ratna.    
>    
> 
> On Wed, Apr 20, 2016 at 4:36 PM, Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > Hi,
> >
> > On Thursday 24 March 2016 07:04:10 Ratna Manoj wrote:
> > > From: Ratna Manoj Bolla <manoj.br@gmail.com>
> > >
> > > When a filesystem is mounted on an nbd device and a disconnect occurs,
> > > kill_bdev() and the reset of the bdev size to zero destroy buffer_head
> > > mappings under the mounted filesystem.
> > >
> > > After a bdev size reset (i.e. bdev->bd_inode->i_size = 0) on a disconnect,
> > > followed by a sys_umount(), the call chain
> > >         generic_shutdown_super()->...
> > >         ->__sync_blockdev()->...
> > >         -> blkdev_writepages()->...
> > >         ->do_invalidatepage()->...
> > >         -> discard_buffer()
> > > discards the superblock buffer_head, which ext4_commit_super() assumes
> > > to be in the mapped state.
> > >
> > >
> > >
> > > Signed-off-by: Ratna Manoj Bolla <manoj.br@gmail.com>
> > > ---
> > > This script reproduces both the kernel panic scenarios:
> > >
> > > $ qemu-img create -f qcow2 f.img 1G
> > > $ mkfs.ext4 f.img
> > > $ qemu-nbd -c /dev/nbd0 f.img
> > > $ mount /dev/nbd0 dir
> > > $ killall -KILL qemu-nbd
> > > $ sleep 1
> > > $ ls dir
> > > $ umount dir
> > >
> > > Bug reports:
> > > http://www.kernelhub.org/?p=2&msg=361407
> > > https://www.mail-archive.com/nbd-general@lists.sourceforge.net/msg02388.html
> >
> > Thanks, please CC nbd-general@lists.sourceforge.net,
> > linux-kernel@vger.kernel.org as well.
> >
> > So this patch simply does not clean up the block device, to avoid any
> > errors on the filesystem side. The userspace thread that called
> > NBD_DO_IT will exit immediately, before the filesystem has decided to
> > release the block device. The nbd driver assumes that the shutdown was
> > done and accepts new clients setting up sockets and so on. Couldn't
> > this lead to a lot of problems?
> >
> > Currently NBD_DO_IT returns when it is safe to use the NBD device again.
> > This patch changes this, as the block device may still be in use when
> > NBD_DO_IT returns. I think it would be better to delay NBD_DO_IT until
> > everything is cleaned up and all filesystems are closed.
> >
> > Best Regards,
> >
> > Markus
> >
> > >
> > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > index f6b51d7..6e77b3a 100644
> > > --- a/drivers/block/nbd.c
> > > +++ b/drivers/block/nbd.c
> > > @@ -119,7 +119,8 @@ static const char *nbdcmd_to_ascii(int cmd)
> > >
> > >  static int nbd_size_clear(struct nbd_device *nbd, struct block_device *bdev)
> > >  {
> > > -     bdev->bd_inode->i_size = 0;
> > > +     if (bdev->bd_openers <= 1)
> > > +             bdev->bd_inode->i_size = 0;
> > >       set_capacity(nbd->disk, 0);
> > >       kobject_uevent(&nbd_to_dev(nbd)->kobj, KOBJ_CHANGE);
> > >
> > > @@ -678,6 +679,9 @@ static void nbd_reset(struct nbd_device *nbd)
> > >
> > >  static void nbd_bdev_reset(struct block_device *bdev)
> > >  {
> > > +     if (bdev->bd_openers > 1)
> > > +             return;
> > > +
> > >       set_device_ro(bdev, false);
> > >       bdev->bd_inode->i_size = 0;
> > >       if (max_part > 0) {
> > > @@ -735,7 +739,7 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
> > >               nbd_clear_que(nbd);
> > >               BUG_ON(!list_empty(&nbd->queue_head));
> > >               BUG_ON(!list_empty(&nbd->waiting_queue));
> > > -             kill_bdev(bdev);
> > > +             __invalidate_device(bdev, true);
> > >               return 0;
> > >
> > >       case NBD_SET_SOCK: {
> > > @@ -809,7 +813,7 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
> > >
> > >               sock_shutdown(nbd);
> > >               nbd_clear_que(nbd);
> > > -             kill_bdev(bdev);
> > > +             __invalidate_device(bdev, true);
> > >               nbd_bdev_reset(bdev);
> > >
> > >               if (nbd->disconnect) /* user requested, ignore socket errors */
> > >
> > >
> > >
> >
> > --
> > Pengutronix e.K.                           |                             |
> > Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> > Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> > Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
> >
> 
> 
> 

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
