From: Wouter Verhelst <w@uter.be>
To: Markus Pargmann <mpa@pengutronix.de>
Cc: nbd-general@lists.sourceforge.net,
Vinod Jayaraman <jv@portworx.com>,
jack@suse.cz, linux-kernel@vger.kernel.org,
Gou Rao <grao@portworx.com>,
pbonzini@redhat.com
Subject: Re: [Nbd] [PATCH] NBD: replace kill_bdev() with __invalidate_device()
Date: Tue, 24 May 2016 17:17:18 +0200 [thread overview]
Message-ID: <20160524151718.GA32656@grep.be> (raw)
In-Reply-To: <20160519063503.GE19642@pengutronix.de>
On Thu, May 19, 2016 at 08:35:03AM +0200, Markus Pargmann wrote:
> Hi Wouter,
>
> On Sun, May 15, 2016 at 02:55:39PM +0200, Wouter Verhelst wrote:
> > Hi Markus,
> >
> > On Thu, May 12, 2016 at 11:53:01AM +0200, Markus Pargmann wrote:
> > > On Thursday 28 April 2016 18:27:34 Wouter Verhelst wrote:
> > > > However, at some point I agreed with Paul (your predecessor) that when
> > > > this happens due to an error condition (as opposed to it being due to an
> > > > explicit disconnect), the kernel would block all reads from or writes to
> > > > the device, and the client may try to reconnect *from the same
> > > > PID* (i.e., it may not fork()). If that succeeds, the next NBD_DO_IT is
> > > > assumed to be connected to the same server; if instead the process
> > > > exits, then the block device is assumed to be dead, will be reset, and
> > > > all pending reads or writes would error.
> > > >
> > > > In principle, this allows for a proper reconnect from userspace if it
> > > > can be done. However, I'm not sure whether this ever worked well or
> > > > whether it was documented, so it's probably fine if you think it should
> > > > be replaced with something else.
> > >
> > > At least I was not aware of this possibility. As far as I know the
> > > previous code even had issues with the signals used to kill on timeouts
> > > which also killed the userspace program sometimes.
> > >
> > > Currently I can't see a code path that supports reconnects. But I may
> > > have removed that accidently in the past.
> >
> > Right. Like I said, I'm not sure if it ever worked well. The user space
> > client has a -persist option that tries to implement it, but I've been
> > getting some bug reports from people who've tried it (although that may
> > have been my fault rather than the kernel's).
> >
> > > > (obviously, userspace reconnecting the device to a different device is
> > > > wrong and should not be done, but that's a case of "if you break it, you
> > > > get to keep both pieces)
> > > >
> > > > At any rate, I think it makes sense for userspace to be given a chance
> > > > to *attempt* to reconnect a device when the connection drops
> > > > unexpectedly.
> > >
> > > Perhaps it would be better to setup the kernel driver explicitly for
> > > that. Perhaps some flag to let the kernel driver know that the client
> > > would like to keep the block device open? In that case the client could
> > > excplicitly use NBD_CLEAR_SOCK to cleanup everything.
> >
> > I'm not sure what you mean by this. Can you clarify?
>
> I meant that it might be better to have a separate way for NBD_DO_IT.
> Something where the client software can directly instruct the kernel to
> keep everything opened in case of an error so that the client may
> reconnect afterwards.
>
> This could be a new ioctl that sets it up, for example 'NBD_PERSISTENT'.
If we're going to add a new ioctl, I think it might be useful to have a
second "flags" type ioctl; one where the client can set options, and
where the return value of the ioctl indicates which options the kernel
understands and will/can honour.
This could then also be used for things like checking whether the kernel
supports structured writes, etc.
> The NBD_DO_IT afterwards would keep everything up and running in case of
> a connection error so that the client could set a new socket using
> NBD_SET_SOCK and reenter using NBD_DO_IT.
>
> For all clients that are not capable of this mechanism or don't use it,
> NBD_DO_IT would clean up everything properly on any error.
Right.
> > > Or perhaps a completely new ioctl that can transmit back some more
> > > information about what failures were seen and whether the blockdevice
> > > was closed or not?
> >
> > The intent was that ioctl(NBD_DO_IT) would return an error when the
> > disconnect was not requested, and would return 0 when the connection
> > dropped due to userspace doing ioctl(NBD_DISCONNECT), since dropping the
> > connection when userspace explicitly asks for it is not an error.
> >
> > drivers/block/nbd.c contains the following:
> >
> > static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
> > unsigned int cmd, unsigned long arg)
> > {
> > [...]
> > case NBD_DO_IT: {
> > [...]
> > if (nbd->disconnect) /* user requested, ignore socket errors */
> > return 0;
> > return error;
> > }
> > [...]
> >
> > so the signalling part of it is at least still there. Whether it works,
> > I haven't tested.
>
> I just looked up the kernel code from 4.0. This code was there as
> well. But the socket and blockdevice were both destroyed before leaving
> the NBD_DO_IT ioctl. So it seems to have never been really persistent.
> Filesystems would have still been killed.
>
> So for a persistent nbd device there is some more code necessary to do
> it.
Okay.
--
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
people in the world who think they really understand all of its rules,
and pretty much all of them are just lying to themselves too.
-- #debian-devel, OFTC, 2016-02-12
next prev parent reply other threads:[~2016-05-24 15:17 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <56E711D1.9090708@gmail.com>
[not found] ` <2358965.KUbuAa1sts@adelgunde>
[not found] ` <56F34412.1050707@gmail.com>
2016-04-20 11:06 ` [PATCH] NBD: replace kill_bdev() with __invalidate_device() Markus Pargmann
2016-04-23 2:17 ` Ratna Manoj
2016-04-28 9:00 ` Markus Pargmann
2016-04-28 16:27 ` [Nbd] " Wouter Verhelst
2016-04-28 18:43 ` Ratna Manoj
2016-05-12 9:53 ` Markus Pargmann
2016-05-15 12:55 ` Wouter Verhelst
2016-05-19 6:35 ` Markus Pargmann
2016-05-24 15:17 ` Wouter Verhelst [this message]
2016-05-10 9:24 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160524151718.GA32656@grep.be \
--to=w@uter.be \
--cc=grao@portworx.com \
--cc=jack@suse.cz \
--cc=jv@portworx.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mpa@pengutronix.de \
--cc=nbd-general@lists.sourceforge.net \
--cc=pbonzini@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.