linux-api.vger.kernel.org archive mirror
From: "Darrick J. Wong" <darrick.wong-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
To: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>,
	Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
	"Seymour, Shane M" <shane.seymour-ZPxbGqLxI0U@public.gmane.org>,
	"linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Jeff Layton <jlayton-vpEMnDpepFuMZCB2o+C8xQ@public.gmane.org>,
	"J. Bruce Fields"
	<bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>,
	"martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org"
	<martin.petersen-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH v3] block: create ioctl to discard-or-zeroout a range of blocks
Date: Tue, 17 Nov 2015 20:38:04 -0800
Message-ID: <20151118043804.GC32467@birch.djwong.org>
In-Reply-To: <x491tbt643m.fsf-RRHT56Q3PSP4kTEheFKJxxDDeQx5vsVwAInAS/Ez/D0@public.gmane.org>

On Fri, Nov 13, 2015 at 03:23:25PM -0500, Jeff Moyer wrote:
> "Darrick J. Wong" <darrick.wong-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> writes:
> 
> > Create a new ioctl to expose the block layer's newfound ability to
> > issue either a zeroing discard, a WRITE SAME with a zero page, or a
> > regular write with the zero page.  This BLKZEROOUT2 ioctl takes
> > {start, length, flags} as parameters.  So far, the only flag available
> > is to enable the zeroing discard part -- without it, the call invokes
> > the old BLKZEROOUT behavior.  start and length have the same meaning
> > as in BLKZEROOUT.
> >
> > Furthermore, because BLKZEROOUT2 issues commands directly to the
> > storage device, we must invalidate the page cache (as a regular
> > O_DIRECT write would do) to avoid returning stale cache contents at a
> > later time.
> >
> > v3: Add extra padding for future expansion, and check the padding is zero.
> 
> Is there someplace we document ioctls?  This stuff really could use some
> good documentation.

None that I know of.  I looked in man-pages.git but didn't see anything
promising.  There are what, something like ~2000 ioctls?

--D

> 
> Cheers,
> Jeff
> 
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> > ---
> >  block/ioctl.c           |   48 ++++++++++++++++++++++++++++++++++++++++-------
> >  include/uapi/linux/fs.h |    9 +++++++++
> >  2 files changed, 50 insertions(+), 7 deletions(-)
> >
> > diff --git a/block/ioctl.c b/block/ioctl.c
> > index 8061eba..8e67551 100644
> > --- a/block/ioctl.c
> > +++ b/block/ioctl.c
> > @@ -213,19 +213,39 @@ static int blk_ioctl_discard(struct block_device *bdev, uint64_t start,
> >  }
> >  
> >  static int blk_ioctl_zeroout(struct block_device *bdev, uint64_t start,
> > -			     uint64_t len)
> > +			     uint64_t len, uint32_t flags)
> >  {
> > +	int ret;
> > +	struct address_space *mapping;
> > +	uint64_t end = start + len - 1;
> > +
> > +	if (flags & ~BLKZEROOUT2_DISCARD_OK)
> > +		return -EINVAL;
> >  	if (start & 511)
> >  		return -EINVAL;
> >  	if (len & 511)
> >  		return -EINVAL;
> > -	start >>= 9;
> > -	len >>= 9;
> > -
> > -	if (start + len > (i_size_read(bdev->bd_inode) >> 9))
> > +	if (end >= i_size_read(bdev->bd_inode))
> >  		return -EINVAL;
> >  
> > -	return blkdev_issue_zeroout(bdev, start, len, GFP_KERNEL, false);
> > +	/* Invalidate the page cache, including dirty pages */
> > +	mapping = bdev->bd_inode->i_mapping;
> > +	truncate_inode_pages_range(mapping, start, end);
> > +
> > +	ret = blkdev_issue_zeroout(bdev, start >> 9, len >> 9, GFP_KERNEL,
> > +				   flags & BLKZEROOUT2_DISCARD_OK);
> > +	if (ret)
> > +		goto out;
> > +
> > +	/*
> > +	 * Invalidate again; if someone wandered in and dirtied a page,
> > +	 * the caller will be given -EBUSY.
> > +	 */
> > +	ret = invalidate_inode_pages2_range(mapping,
> > +					    start >> PAGE_CACHE_SHIFT,
> > +					    end >> PAGE_CACHE_SHIFT);
> > +out:
> > +	return ret;
> >  }
> >  
> >  static int put_ushort(unsigned long arg, unsigned short val)
> > @@ -353,7 +373,21 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
> >  		if (copy_from_user(range, (void __user *)arg, sizeof(range)))
> >  			return -EFAULT;
> >  
> > -		return blk_ioctl_zeroout(bdev, range[0], range[1]);
> > +		return blk_ioctl_zeroout(bdev, range[0], range[1], 0);
> > +	}
> > +	case BLKZEROOUT2: {
> > +		struct blkzeroout2 p;
> > +
> > +		if (!(mode & FMODE_WRITE))
> > +			return -EBADF;
> > +
> > +		if (copy_from_user(&p, (void __user *)arg, sizeof(p)))
> > +			return -EFAULT;
> > +
> > +		if (p.padding || p.padding2)
> > +			return -EINVAL;
> > +
> > +		return blk_ioctl_zeroout(bdev, p.start, p.length, p.flags);
> >  	}
> >  
> >  	case HDIO_GETGEO: {
> > diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> > index 9b964a5..b811fa4 100644
> > --- a/include/uapi/linux/fs.h
> > +++ b/include/uapi/linux/fs.h
> > @@ -152,6 +152,15 @@ struct inodes_stat_t {
> >  #define BLKSECDISCARD _IO(0x12,125)
> >  #define BLKROTATIONAL _IO(0x12,126)
> >  #define BLKZEROOUT _IO(0x12,127)
> > +struct blkzeroout2 {
> > +	__u64 start;
> > +	__u64 length;
> > +	__u32 flags;
> > +	__u32 padding;
> > +	__u64 padding2;
> > +};
> > +#define BLKZEROOUT2_DISCARD_OK	1
> > +#define BLKZEROOUT2 _IOR(0x12, 127, struct blkzeroout2)
> >  
> >  #define BMAP_IOCTL 1		/* obsolete - kept for compatibility */
> >  #define FIBMAP	   _IO(0x00,1)	/* bmap access */
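
For anyone who wants to exercise the interface from user space, the
following is a minimal sketch, not a definitive implementation.  The
struct, flag, and ioctl number are copied from the quoted patch and are
not assumed to exist in any installed <linux/fs.h>.  The helper names
zeroout2_check() and zeroout2() are hypothetical and exist only for this
illustration.  zeroout2_check() mirrors the argument checks in
blk_ioctl_zeroout() but is deliberately stricter: it also rejects a zero
length and a start + len that wraps around, which the patch does not
check explicitly.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Copied from the quoted patch; assumed absent from installed headers. */
struct blkzeroout2 {
	uint64_t start;
	uint64_t length;
	uint32_t flags;
	uint32_t padding;
	uint64_t padding2;
};
#define BLKZEROOUT2_DISCARD_OK	1
#define BLKZEROOUT2 _IOR(0x12, 127, struct blkzeroout2)

/* Two u64, two u32, one u64: 32 bytes with no implicit padding. */
static_assert(sizeof(struct blkzeroout2) == 32, "unexpected struct layout");
/* Same type and number as BLKZEROOUT, but the size/direction bits differ. */
static_assert(BLKZEROOUT2 != _IO(0x12, 127), "ioctl numbers must differ");

/*
 * Hypothetical helper: check arguments the way blk_ioctl_zeroout() does.
 * start and len are in bytes; dev_size is the device size in bytes.
 * Assumption: stricter than the patch (rejects len == 0 and wraparound).
 */
static int zeroout2_check(uint64_t start, uint64_t len, uint32_t flags,
			  uint64_t dev_size)
{
	if (flags & ~(uint32_t)BLKZEROOUT2_DISCARD_OK)
		return -EINVAL;
	if ((start & 511) || (len & 511))	/* 512-byte alignment */
		return -EINVAL;
	if (len == 0 || start + len < start)	/* not checked in the patch */
		return -EINVAL;
	if (start + len - 1 >= dev_size)	/* last byte must be on the device */
		return -EINVAL;
	return 0;
}

/*
 * Hypothetical wrapper: issue BLKZEROOUT2 on an open block device.
 * Returns 0 on success or a negative errno value.
 */
static int zeroout2(int fd, uint64_t start, uint64_t len, uint32_t flags)
{
	struct blkzeroout2 p = {
		.start = start,
		.length = len,
		.flags = flags,
		/* padding and padding2 stay zero; the patch rejects anything else */
	};

	return ioctl(fd, BLKZEROOUT2, &p) ? -errno : 0;
}
```

A caller would open the device for writing (the patch returns -EBADF
without FMODE_WRITE), then call something like
zeroout2(fd, 0, 1 << 20, BLKZEROOUT2_DISCARD_OK).  On a kernel without
this patch the ioctl is unknown and the call fails.  Per the patch, a
caller should also be prepared for -EBUSY if a page in the range was
dirtied while the zeroout was in flight.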
