All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Pargmann <mpa@pengutronix.de>
To: Ratna Manoj <manoj.br@gmail.com>
Cc: pbonzini@redhat.com, jack@suse.cz, Gou Rao <grao@portworx.com>,
	Vinod Jayaraman <jv@portworx.com>,
	nbd-general@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] NBD: replace kill_bdev() with __invalidate_device()
Date: Thu, 28 Apr 2016 11:00:20 +0200	[thread overview]
Message-ID: <1809196.i4ZeAM11iZ@adelgunde> (raw)
In-Reply-To: <571ADB31.2000609@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 6370 bytes --]

Hi,

On Saturday 23 April 2016 07:47:21 Ratna Manoj wrote:
> Thanks for the review. 
> 
> Atleast for ext4 this crash happens on a sys_umount() call, timing of
> which is not in control of block driver. Block driver cannot force the
> filesystems to be unmounted, and the file system does not expect 
> buffers to get unmapped under it.

Yes the block driver can't force a clean umount.

> 
> Ext4 can be fixed with the this patch:
> http://www.spinics.net/lists/linux-ext4/msg51112.html 
> It did not make to the kernel. It checks the state of the buffer head
> before committing.
> 
> When we consider diskett/CD as user space thread that called NBD_DO_IT,
> this problem is analogous to changing disk with another or the same
> disk suddenly when the file system is still mounted. 
> 
> If we completely kill the block device we would loss some writes when
> same thread is reconnected.

I am not so sure about your exact use-case here.

If the NBD_DO_IT thread returns I am considering the connection and
block device as dead and disconnected. Securing any data afterwards with
a new connection is potentially dangerous as it may be a different
server.

> 
> if we do not completely kill or if we only invalidate clean buffers, 
> we will have inconsistency on re-attach with a different thread
> (analogous to replacing disk with different disk suddenly). 

Yes exactly. That's why I suggested that NBD_DO_IT waits until all
blockdevice users are gone. This would avoid any issues with
writing/reading data to a wrong server.

Best Regards,

Markus

> 
> Ratna.    
>    
> 
> On Wed, Apr 20, 2016 at 4:36 PM, Markus Pargmann <mpa@pengutronix.de> wrote:
> 
> > Hi,
> >
> > On Thursday 24 March 2016 07:04:10 Ratna Manoj wrote:
> > > From: Ratna Manoj Bolla <manoj.br@gmail.com>
> > >
> > > When a filesystem is mounted on a nbd device and on a disconnect, because
> > > of kill_bdev(), and resetting bdev size to zero, buffer_head mappings are
> > > getting destroyed under mounted filesystem.
> > >
> > > After a bdev size reset(i.e bdev->bd_inode->i_size = 0) on a disconnect,
> > > followed by a sys_umount(),
> > >         generic_shutdown_super()->...
> > >         ->__sync_blockdev()->...
> > >         -> blkdev_writepages()->...
> > >         ->do_invalidatepage()->...
> > >         -> discard_buffer()   is discarding superblock buffer_head 
> > assumed
> > > to be in mapped state by ext4_commit_super().
> > >
> > >
> > >
> > > Signed-off-by: Ratna Manoj Bolla <manoj.br@gmail.com>
> > > ---
> > > This script reproduces both the kernel panic scenarios:
> > >
> > > $ qemu-img create -f qcow2 f.img 1G
> > > $ mkfs.ext4 f.img
> > > $ qemu-nbd -c /dev/nbd0 f.img
> > > $ mount /dev/nbd0 dir
> > > $ killall -KILL qemu-nbd
> > > $ sleep 1
> > > $ ls dir
> > > $ umount dir
> > >
> > > Bug reports:
> > > http://www.kernelhub.org/?p=2&msg=361407
> > > 
> > https://www.mail-archive.com/nbd-general@lists.sourceforge.net/msg02388.html
> >
> > Thanks, please CC nbd-general@lists.sourceforge.net,
> > linux-kernel@vger.kernel.org as well.
> >
> > So this patch simply does not cleanup the blockdevice to avoid any
> > errors on the filesystem side. The userspace thread that called
> > NBD_DO_IT will exit immediately before the filesystem decided to release
> > the blockdevice. The nbd driver assumes that the shutdown was done and
> > accepts new clients setting up sockets and so on. Couldn't this lead to
> > a lot of problems?
> >
> > Currently NBD_DO_IT returns when it is save to use the NBD device again.
> > This patch changes this as the blockdevice may still be in use when
> > NBD_DO_IT returns. I think it would be better to delay NBD_DO_IT until
> > everything is cleaned up and all filesystems are closed.
> >
> > Best Regards,
> >
> > Markus
> >
> > >
> > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > index f6b51d7..6e77b3a 100644
> > > --- a/drivers/block/nbd.c
> > > +++ b/drivers/block/nbd.c
> > > @@ -119,7 +119,8 @@ static const char *nbdcmd_to_ascii(int cmd)
> > >
> > >  static int nbd_size_clear(struct nbd_device *nbd, struct block_device 
> > *bdev)
> > >  {
> > > -     bdev->bd_inode->i_size = 0;
> > > +     if (bdev->bd_openers <= 1)
> > > +             bdev->bd_inode->i_size = 0;
> > >       set_capacity(nbd->disk, 0);
> > >       kobject_uevent(&nbd_to_dev(nbd)->kobj, KOBJ_CHANGE);
> > >
> > > @@ -678,6 +679,9 @@ static void nbd_reset(struct nbd_device *nbd)
> > >
> > >  static void nbd_bdev_reset(struct block_device *bdev)
> > >  {
> > > +     if (bdev->bd_openers > 1)
> > > +             return;
> > > +
> > >       set_device_ro(bdev, false);
> > >       bdev->bd_inode->i_size = 0;
> > >       if (max_part > 0) {
> > > @@ -735,7 +739,7 @@ static int __nbd_ioctl(struct block_device *bdev, 
> > struct nbd_device *nbd,
> > >               nbd_clear_que(nbd);
> > >               BUG_ON(!list_empty(&nbd->queue_head));
> > >               BUG_ON(!list_empty(&nbd->waiting_queue));
> > > -             kill_bdev(bdev);
> > > +             __invalidate_device(bdev, true);
> > >               return 0;
> > >
> > >       case NBD_SET_SOCK: {
> > > @@ -809,7 +813,7 @@ static int __nbd_ioctl(struct block_device *bdev, 
> > struct nbd_device *nbd,
> > >
> > >               sock_shutdown(nbd);
> > >               nbd_clear_que(nbd);
> > > -             kill_bdev(bdev);
> > > +             __invalidate_device(bdev, true);
> > >               nbd_bdev_reset(bdev);
> > >
> > >               if (nbd->disconnect) /* user requested, ignore socket 
> > errors */
> > >
> > >
> > >
> >
> > --
> > Pengutronix e.K.                           |                             |
> > Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> > Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> > Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
> >
> 
> 
> 

-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

  reply	other threads:[~2016-04-28  9:00 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <56E711D1.9090708@gmail.com>
     [not found] ` <2358965.KUbuAa1sts@adelgunde>
     [not found]   ` <56F34412.1050707@gmail.com>
2016-04-20 11:06     ` [PATCH] NBD: replace kill_bdev() with __invalidate_device() Markus Pargmann
2016-04-23  2:17       ` Ratna Manoj
2016-04-28  9:00         ` Markus Pargmann [this message]
2016-04-28 16:27           ` [Nbd] " Wouter Verhelst
2016-04-28 18:43             ` Ratna Manoj
2016-05-12  9:53             ` Markus Pargmann
2016-05-15 12:55               ` Wouter Verhelst
2016-05-19  6:35                 ` Markus Pargmann
2016-05-24 15:17                   ` Wouter Verhelst
2016-05-10  9:24         ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1809196.i4ZeAM11iZ@adelgunde \
    --to=mpa@pengutronix.de \
    --cc=grao@portworx.com \
    --cc=jack@suse.cz \
    --cc=jv@portworx.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manoj.br@gmail.com \
    --cc=nbd-general@lists.sourceforge.net \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.