From: Mike Snitzer <snitzer@redhat.com>
To: Eric Sandeen <sandeen@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>,
axboe@kernel.dk, linux-kernel@vger.kernel.org, xfs@oss.sgi.com,
dm-devel@redhat.com, linux-fsdevel@vger.kernel.org, hch@lst.de,
Vivek Goyal <vgoyal@redhat.com>
Subject: Re: [RFC PATCH] block: xfs: dm thin: train XFS to give up on retrying IO if thinp is out of space
Date: Wed, 22 Jul 2015 12:51:18 -0400 [thread overview]
Message-ID: <20150722165117.GA17738@redhat.com> (raw)
In-Reply-To: <55AFC496.4000009@redhat.com>
On Wed, Jul 22 2015 at 12:28pm -0400,
Eric Sandeen <sandeen@redhat.com> wrote:
> On 7/22/15 8:34 AM, Mike Snitzer wrote:
> > On Tue, Jul 21 2015 at 10:37pm -0400,
> > Dave Chinner <david@fromorbit.com> wrote:
> >
> >> On Tue, Jul 21, 2015 at 09:40:29PM -0400, Mike Snitzer wrote:
> >>
> >>> I'm open to considering alternative interfaces for getting you the info
> >>> you need. I just don't have a great sense for what mechanism you'd like
> >>> to use. Do we invent a new block device operations table method that
> >>> sets values in a 'struct no_space_strategy' passed in to the
> >>> blockdevice?
> >>
> >> It's long been frowned upon to have filesystems dig into block
> >> device structures. We have lots of wrapper functions for getting
> >> information from or performing operations on block devices (e.g.
> >> bdev_read_only(), bdev_get_queue(), blkdev_issue_flush(),
> >> blkdev_issue_zeroout(), etc.), and so I think this is the pattern we'd
> >> need to follow. If we do that - bdev_get_nospace_strategy() - then
> >> how that information gets to the filesystem is completely opaque
> >> at the fs level, and the block layer can implement it in whatever
> >> way is considered sane...
> >>
> >> And, realistically, all we really need returned is an enum to tell us
> >> how the bdev behaves on enospc:
> >> - bdev fails fast, (i.e. immediate ENOSPC)
> >> - bdev fails slow, (i.e. queue for some time, then ENOSPC)
> >> - bdev never fails (i.e. queue forever)
> >> - bdev doesn't support this (i.e. EOPNOTSUPP)
>
> I'm not sure how this is more useful than the bdev simply responding to
> a query of "should we keep trying IOs?"
>
> IOWs, do we really care if it's failing fast or slow, vs. simply knowing
> whether it has now permanently failed?
>
> So rather than "bdev_get_nospace_strategy" it seems like all we need
> to know is "bdev_has_failed" - do we really care about the details?
My bdev_has_space() proposal is no different than bdev_has_failed(). If
you prefer the more generic name, then fine. But bdev_has_failed() is of
limited utility outside of devices that provide such support. So I can
see why Dave is resisting it.
Anyway, the benefit of XFS tailoring its independent config to
dm-thinp's comparable config makes sense to me. The reason for XFS's
independent config is that it could be deployed on any storage (e.g. not
dm-thinp). That affords XFS the ability to defer to DM thinp while still
having comparable functionality on HW thinp or some other storage.
> > This 'struct no_space_strategy' would be invented purely for
> > informational purposes for upper layers' benefit -- I don't consider it
> > a "block device structure" it the traditional sense.
> >
> > I was thinking upper layers would like to know the actual timeout value
> > for the "fails slow" case. As such the 'struct no_space_strategy' would
> > have the enum and the timeout. And would be returned with a call:
> > bdev_get_nospace_strategy(bdev, &no_space_strategy)
>
> Asking for the timeout value seems to add complexity. It could change after
> we ask, and knowing it now requires another layer to be handling timeouts...
Dave is already saying XFS will have a timeout it'll be managing. It
stands to reason that XFS would base its timeout on DM thinp's timeout.
But yes, it does allow the stacked timeout that XFS uses to fall out of
sync if the lower timeout changes (no different than blk_stack_limits).
Please fix this however you see fit. I'll assist anywhere that makes
sense.
2015-07-20 15:18 [RFC PATCH] block: xfs: dm thin: train XFS to give up on retrying IO if thinp is out of space Mike Snitzer
2015-07-20 22:36 ` Dave Chinner
2015-07-20 23:20 ` Mike Snitzer
2015-07-21 0:36 ` Dave Chinner
2015-07-21 15:34 ` Eric Sandeen
2015-07-21 17:47 ` Mike Snitzer
2015-07-22 0:09 ` Dave Chinner
2015-07-22 1:00 ` Dave Chinner
2015-07-22 1:40 ` Mike Snitzer
2015-07-22 2:37 ` Dave Chinner
2015-07-22 13:34 ` Mike Snitzer
2015-07-22 16:28 ` Eric Sandeen
2015-07-22 16:51 ` Mike Snitzer [this message]
2015-07-23 5:10 ` Dave Chinner
2015-07-23 14:33 ` Mike Snitzer
2015-07-23 15:50 ` [RFC PATCH] block: dm thin: export how block device handles -ENOSPC Mike Snitzer
2015-07-23 16:43 ` [RFC PATCH] block: xfs: dm thin: train XFS to give up on retrying IO if thinp is out of space Vivek Goyal
2015-07-23 23:00 ` Dave Chinner
2015-07-24 2:34 ` Vivek Goyal
2015-07-23 17:08 ` [dm-devel] " Mikulas Patocka
2015-07-23 23:05 ` Dave Chinner