dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Brian Foster <bfoster@redhat.com>,
	dm-devel@redhat.com, xfs@oss.sgi.com
Subject: Re: [RFC PATCH] block: wire blkdev_fallocate() to block_device_operations' reserve_space
Date: Tue, 12 Apr 2016 17:12:43 -0700	[thread overview]
Message-ID: <20160413001243.GA25280@birch.djwong.org> (raw)
In-Reply-To: <20160412210426.GA1845@redhat.com>

On Tue, Apr 12, 2016 at 05:04:27PM -0400, Mike Snitzer wrote:
> On Tue, Apr 12 2016 at  4:39pm -0400,
> Darrick J. Wong <darrick.wong@oracle.com> wrote:
> 
> > On Tue, Apr 12, 2016 at 04:04:59PM -0400, Mike Snitzer wrote:
> > > diff --git a/fs/block_dev.c b/fs/block_dev.c
> > > index 5a2c3ab..b34c07b 100644
> > > --- a/fs/block_dev.c
> > > +++ b/fs/block_dev.c
> > > @@ -1801,17 +1801,13 @@ long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len)
> > >  	struct request_queue *q = bdev_get_queue(bdev);
> > >  	struct address_space *mapping;
> > >  	loff_t end = start + len - 1;
> > > -	loff_t bs_mask, isize;
> > > +	loff_t isize;
> > >  	int error;
> > >  
> > >  	/* We only support zero range and punch hole. */
> > >  	if (mode & ~BLKDEV_FALLOC_FL_SUPPORTED)
> > >  		return -EOPNOTSUPP;
> > >  
> > > -	/* We haven't a primitive for "ensure space exists" right now. */
> > > -	if (!(mode & ~FALLOC_FL_KEEP_SIZE))
> > > -		return -EOPNOTSUPP;
> > > -
> > >  	/* Only punch if the device can do zeroing discard. */
> > >  	if ((mode & FALLOC_FL_PUNCH_HOLE) &&
> > >  	    (!blk_queue_discard(q) || !q->limits.discard_zeroes_data))
> > > @@ -1829,9 +1825,12 @@ long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len)
> > >  			return -EINVAL;
> > >  	}
> > >  
> > > -	/* Don't allow IO that isn't aligned to logical block size */
> > > -	bs_mask = bdev_logical_block_size(bdev) - 1;
> > > -	if ((start | len) & bs_mask)
> > > +	/*
> > > +	 * Don't allow IO that isn't aligned to minimum IO size (io_min)
> > > +	 * - for normal device's io_min is usually logical block size
> > > +	 * - but for more exotic devices (e.g. DM thinp) it may be larger
> > > +	 */
> > > +	if ((start | len) % bdev_io_min(bdev))

I started by noticing the 64-bit division.  However, in researching alignment
requirements for fallocate, I noticed that nothing says that we can return
-EINVAL for unaligned offset/len for allocate or punch.  For file allocations
ext4 and xfs simply enlarge the range so that the ends are aligned to the
logical block size; for punch they both shrink the range to deallocate until
the ends are aligned, and write zeroes to the partial blocks.

At least for user-visible fallocate we should do likewise, but for the internal
blkdev_ helpers I think it makes more sense to check lbs alignment and let the
lower level driver reject the IO if min_io alignment is a hard requirement.
Documentation/block/queue-sysfs.txt says that the min_io is the smallest
/preferred/ size.

But, before that, I'll push out some new fallocate patches for -rc3.

> > >  		return -EINVAL;
> > 
> > Noted.  Will update the original patch.
> 
> BTW, I just noticed your "block: require write_same and discard requests
> align to logical block size" -- doesn't look right.

What happens if we pass a request to thinp that isn't aligned to
minimum_io_size?  Does it reject the command?

> But maybe I'm just too hyper-focused on DM thinp's needs (which would
> much prefer these checks be done in terms of minimum_io_size, rather
> than logical_block_size, and _not_ assuming power-of-2 math will work).
> 
> But at least for discard: your lbs-based check is fine; since we have
> discard_granularity to cover thinp's more specific requirements.

--D

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2016-04-13  0:12 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-12 16:42 [RFC v2 PATCH 00/10] dm-thin/xfs: prototype a block reservation allocation model Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 01/10] xfs: refactor xfs_reserve_blocks() to handle ENOSPC correctly Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 02/10] xfs: replace xfs_mod_fdblocks() bool param with flags Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 03/10] block: add block_device_operations methods to set and get reserved space Brian Foster
2016-04-14  0:32   ` Dave Chinner
2016-04-12 16:42 ` [RFC v2 PATCH 04/10] dm: add " Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 05/10] dm thin: " Brian Foster
2016-04-13 17:44   ` Darrick J. Wong
2016-04-13 18:33     ` Brian Foster
2016-04-13 20:41       ` Brian Foster
2016-04-13 21:01         ` Darrick J. Wong
2016-04-14 15:10         ` Mike Snitzer
2016-04-14 16:23           ` Brian Foster
2016-04-14 20:18             ` Mike Snitzer
2016-04-15 11:48               ` Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 06/10] xfs: thin block device reservation mechanism Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 07/10] xfs: adopt a reserved allocation model on dm-thin devices Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 08/10] xfs: handle bdev reservation ENOSPC correctly from XFS reserved pool Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 09/10] xfs: support no block reservation transaction mode Brian Foster
2016-04-12 16:42 ` [RFC v2 PATCH 10/10] xfs: use contiguous bdev reservation for file preallocation Brian Foster
2016-04-12 20:04 ` [RFC PATCH] block: wire blkdev_fallocate() to block_device_operations' reserve_space Mike Snitzer
2016-04-12 20:39   ` Darrick J. Wong
2016-04-12 20:46     ` Mike Snitzer
2016-04-12 22:25       ` Darrick J. Wong
2016-04-12 21:04     ` Mike Snitzer
2016-04-13  0:12       ` Darrick J. Wong [this message]
2016-04-14 15:18         ` Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160413001243.GA25280@birch.djwong.org \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=dm-devel@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=snitzer@redhat.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).