From: Mike Snitzer <snitzer@redhat.com>
To: Eric Sandeen <sandeen@redhat.com>
Cc: axboe@kernel.dk, linux-kernel@vger.kernel.org, xfs@oss.sgi.com,
dm-devel@redhat.com, linux-fsdevel@vger.kernel.org, hch@lst.de,
Vivek Goyal <vgoyal@redhat.com>
Subject: Re: [RFC PATCH] block: xfs: dm thin: train XFS to give up on retrying IO if thinp is out of space
Date: Tue, 21 Jul 2015 13:47:53 -0400
Message-ID: <20150721174753.GA8563@redhat.com>
In-Reply-To: <55AE6670.40903@redhat.com>
On Tue, Jul 21 2015 at 11:34am -0400,
Eric Sandeen <sandeen@redhat.com> wrote:
> On 7/20/15 5:36 PM, Dave Chinner wrote:
> > On Mon, Jul 20, 2015 at 11:18:49AM -0400, Mike Snitzer wrote:
> >> If XFS fails to write metadata it will retry the write indefinitely
> >> (with the hope that the write will succeed at some point in the future).
> >>
> >> Others can possibly speak to historic reason(s) why this is a sane
> >> default for XFS. But when XFS is deployed on top of DM thin provisioning
> >> this infinite retry is very unwelcome -- especially if DM thinp was
> >> configured to be automatically extended with free space but the admin
> >> hasn't provided (or restored) adequate free space.
> >>
> >> To fix this infinite retry a new bdev_has_space() hook is added to XFS
> >> to break out of its metadata retry loop if the underlying block device
> >> reports it no longer has free space. DM thin provisioning is now
> >> trained to respond accordingly, which enables XFS to not cause a cascade
> >> of tasks blocked on IO waiting for XFS's infinite retry.
> >>
> >> All other block devices, which don't implement a .has_space method in
> >> block_device_operations, will always return true for bdev_has_space().
> >>
> >> With this change XFS will fail the metadata IO, force shutdown, and the
> >> XFS filesystem may be unmounted. This enables an admin to recover from
> >> their oversight, of not having provided enough free space, without
> >> having to force a hard reset of the system to get XFS to unwedge.
> >>
> >> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> >
> > Shouldn't dm-thinp just return the bio with ENOSPC as its error?
> > The scsi layers already do this for hardware thinp ENOSPC failures,
> > so dm-thinp should behave exactly the same (i.e. via
> > __scsi_error_from_host_byte()). The behaviour of the filesystem
> > should be the same in all cases - making it conditional on whether
> > the thinp implementation can be polled for available space is wrong
> > as most hardware thinp can't be polled by the kernel for this info.
> >
> >
> > If dm-thinp just returns ENOSPC on the BIO like other hardware
> > thinp devices, then it is up to the filesystem to handle that
> > appropriately. i.e. whether an ENOSPC IO error is fatal to the
> > filesystem is determined by filesystem configuration and context of
> > the IO error, not whether the block device has no space (which we
> > should already know from the ENOSPC error delivered by IO
> > completion).
>
> The issue we had discussed previously is that there is no agreement
> across block devices about whether ENOSPC is a permanent or temporary
> condition. Asking the admin to tune the fs to each block device's
> behavior sucks, IMHO.
It does suck, but it beats the alternative of XFS continuing to do
nothing about the problem.
Discussing more with Vivek, it might be that XFS would be best served by
modeling what dm-thinp has provided with its 'no_space_timeout'. It
defaults to queueing IO for 60 seconds; once the timeout expires, the
queued IOs get errored. If set to 0, dm-thinp will queue IO
indefinitely.
So for XFS's use-case: s/queue/retry/
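
To make the s/queue/retry/ idea concrete, here is a minimal userspace
sketch of a bounded retry loop. It is not XFS or dm-thinp code. The
names retry_policy, retry_io, fake_dev and fake_write are hypothetical,
and time is simulated rather than slept, so the behaviour is easy to
reason about. The only semantics taken from the discussion above are: a
non-zero timeout bounds the retries, and 0 means retry indefinitely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch; not kernel definitions. */
enum retry_result { RETRY_OK = 0, RETRY_GAVE_UP = -1 };

/* Hypothetical knob, modeled on dm-thinp's no_space_timeout. */
struct retry_policy {
	uint32_t no_space_timeout_secs;	/* 0 means retry indefinitely */
	uint32_t retry_interval_secs;	/* simulated delay between attempts */
};

/* One write attempt; returns true if the IO succeeded. */
typedef bool (*io_attempt_fn)(void *ctx);

static enum retry_result retry_io(const struct retry_policy *p,
				  io_attempt_fn try_io, void *ctx,
				  uint64_t *elapsed_secs)
{
	/* Guard against a zero interval so simulated time always advances. */
	uint32_t step = p->retry_interval_secs ? p->retry_interval_secs : 1;
	uint64_t elapsed = 0;

	for (;;) {
		if (try_io(ctx)) {
			*elapsed_secs = elapsed;
			return RETRY_OK;
		}
		/* A non-zero timeout bounds the retries; 0 never gives up. */
		if (p->no_space_timeout_secs &&
		    elapsed >= p->no_space_timeout_secs) {
			*elapsed_secs = elapsed;
			return RETRY_GAVE_UP;
		}
		elapsed += step;	/* stand-in for sleeping between retries */
	}
}

/* Fake device for illustration: succeeds on attempt 'succeed_on' (0 = never). */
struct fake_dev {
	unsigned int attempts;
	unsigned int succeed_on;
};

static bool fake_write(void *ctx)
{
	struct fake_dev *d = ctx;

	d->attempts++;
	return d->succeed_on && d->attempts >= d->succeed_on;
}
```

With a 60 second timeout, a device that never recovers makes retry_io()
give up instead of looping forever. With a timeout of 0 the loop only
ends when a write finally succeeds, which is the current behaviour being
complained about.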
> This interface could at least be defined to reflect a permanent and
> unambiguous state...
The proposed bdev_has_space() interface enabled XFS to defer to the
block device. But it obviously doesn't help at all if the block device
isn't providing a .has_space method -- so I can see value in XFS
having something like a 'no_space_timeout' knob.
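
The limitation is easy to see in a small userspace model of the hook.
This is only a sketch, not the RFC patch itself: the model_* and thin_*
names and the free_blocks field are hypothetical stand-ins. The one
behaviour taken from the patch description is that a device without a
.has_space method is always reported as having space.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_bdev;

/* Hypothetical stand-in for block_device_operations with a .has_space hook. */
struct model_bdev_ops {
	bool (*has_space)(struct model_bdev *bdev);
};

struct model_bdev {
	const struct model_bdev_ops *ops;
	uint64_t free_blocks;	/* illustrative only; used by the thin model */
};

/* Devices that do not implement .has_space always report true. */
static bool model_bdev_has_space(struct model_bdev *bdev)
{
	if (!bdev->ops || !bdev->ops->has_space)
		return true;
	return bdev->ops->has_space(bdev);
}

/* Thin-provisioned model: out of space once no free blocks remain. */
static bool thin_has_space(struct model_bdev *bdev)
{
	return bdev->free_blocks > 0;
}

static const struct model_bdev_ops thin_ops = { .has_space = thin_has_space };
static const struct model_bdev_ops plain_ops = { .has_space = NULL };

/* Filesystem side: stop retrying only when the device says it is full. */
static bool model_should_stop_retrying(struct model_bdev *bdev)
{
	return !model_bdev_has_space(bdev);
}
```

For a device using plain_ops, model_should_stop_retrying() can never
return true, so the filesystem would keep retrying no matter what
happened underneath it. That gap is what a timeout knob would cover.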
But something needs to happen. No more bike-shedding allowed on this
one.. PLEASE DO SOMETHING! :)