linux-mm.kvack.org archive mirror
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v11 05/21] vfs,ext2: Introduce IS_DAX(inode)
Date: Thu, 16 Oct 2014 11:35:17 +0200
Message-ID: <20141016093517.GD19075@thinkos.etherlink>
In-Reply-To: <1411677218-29146-6-git-send-email-matthew.r.wilcox@intel.com>

On 25-Sep-2014 04:33:22 PM, Matthew Wilcox wrote:
> Use an inode flag to tag inodes which should avoid using the page cache.
> Convert ext2 to use it instead of mapping_is_xip().  Prevent I/Os to
> files tagged with the DAX flag from falling back to buffered I/O.

I agree that a DAX-enabled FS should not silently fall back to buffered
I/O, since doing so would void some of the guarantees about the
persistence of data that has been written to a DAX mmap()'d region.
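
To make the point concrete, here is a minimal user-space sketch of the
decision the quoted mm/filemap.c write hunk makes after a direct write
returns. It is not kernel code: struct demo_inode and
may_fall_back_to_buffered() are hypothetical names used only for
illustration, and only the S_DAX value and the shape of the condition are
taken from the patch.

```c
/* User-space model only; the real check lives in __generic_file_write_iter(). */
#define S_DAX 8192	/* value taken from the quoted include/linux/fs.h hunk */

/* Hypothetical stand-in for struct inode, reduced to the flags word. */
struct demo_inode {
	unsigned int i_flags;
};

#define IS_DAX(inode)	((inode)->i_flags & S_DAX)

/*
 * Returns 1 when the rest of a short direct write may be completed
 * through buffered I/O, 0 when the caller must stop and return.
 * Mirrors: if (written < 0 || written == count || IS_DAX(inode)) goto out;
 */
static int may_fall_back_to_buffered(const struct demo_inode *inode,
				     long written, long count)
{
	if (written < 0)	/* error from the direct write */
		return 0;
	if (written == count)	/* nothing left to write */
		return 0;
	if (IS_DAX(inode))	/* DAX files never touch the page cache */
		return 0;
	return 1;
}
```

With this model, a short write on a regular inode still falls back, while
the same short write on an inode with S_DAX set does not, which is the
behaviour I am agreeing with above.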

> 
> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
> Reviewed-by: Jan Kara <jack@suse.cz>

Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

> ---
>  fs/ext2/inode.c    |  9 ++++++---
>  fs/ext2/xip.h      |  2 --
>  include/linux/fs.h |  6 ++++++
>  mm/filemap.c       | 19 ++++++++++++-------
>  4 files changed, 24 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
> index 36d35c3..0cb0448 100644
> --- a/fs/ext2/inode.c
> +++ b/fs/ext2/inode.c
> @@ -731,7 +731,7 @@ static int ext2_get_blocks(struct inode *inode,
>  		goto cleanup;
>  	}
>  
> -	if (ext2_use_xip(inode->i_sb)) {
> +	if (IS_DAX(inode)) {
>  		/*
>  		 * we need to clear the block
>  		 */
> @@ -1201,7 +1201,7 @@ static int ext2_setsize(struct inode *inode, loff_t newsize)
>  
>  	inode_dio_wait(inode);
>  
> -	if (mapping_is_xip(inode->i_mapping))
> +	if (IS_DAX(inode))
>  		error = xip_truncate_page(inode->i_mapping, newsize);
>  	else if (test_opt(inode->i_sb, NOBH))
>  		error = nobh_truncate_page(inode->i_mapping,
> @@ -1273,7 +1273,8 @@ void ext2_set_inode_flags(struct inode *inode)
>  {
>  	unsigned int flags = EXT2_I(inode)->i_flags;
>  
> -	inode->i_flags &= ~(S_SYNC|S_APPEND|S_IMMUTABLE|S_NOATIME|S_DIRSYNC);
> +	inode->i_flags &= ~(S_SYNC | S_APPEND | S_IMMUTABLE | S_NOATIME |
> +				S_DIRSYNC | S_DAX);
>  	if (flags & EXT2_SYNC_FL)
>  		inode->i_flags |= S_SYNC;
>  	if (flags & EXT2_APPEND_FL)
> @@ -1284,6 +1285,8 @@ void ext2_set_inode_flags(struct inode *inode)
>  		inode->i_flags |= S_NOATIME;
>  	if (flags & EXT2_DIRSYNC_FL)
>  		inode->i_flags |= S_DIRSYNC;
> +	if (test_opt(inode->i_sb, XIP))
> +		inode->i_flags |= S_DAX;
>  }
>  
>  /* Propagate flags from i_flags to EXT2_I(inode)->i_flags */
> diff --git a/fs/ext2/xip.h b/fs/ext2/xip.h
> index 18b34d2..29be737 100644
> --- a/fs/ext2/xip.h
> +++ b/fs/ext2/xip.h
> @@ -16,9 +16,7 @@ static inline int ext2_use_xip (struct super_block *sb)
>  }
>  int ext2_get_xip_mem(struct address_space *, pgoff_t, int,
>  				void **, unsigned long *);
> -#define mapping_is_xip(map) unlikely(map->a_ops->get_xip_mem)
>  #else
> -#define mapping_is_xip(map)			0
>  #define ext2_xip_verify_sb(sb)			do { } while (0)
>  #define ext2_use_xip(sb)			0
>  #define ext2_clear_xip_target(inode, chain)	0
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 9418772..e99e5c4 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1605,6 +1605,7 @@ struct super_operations {
>  #define S_IMA		1024	/* Inode has an associated IMA struct */
>  #define S_AUTOMOUNT	2048	/* Automount/referral quasi-directory */
>  #define S_NOSEC		4096	/* no suid or xattr security attributes */
> +#define S_DAX		8192	/* Direct Access, avoiding the page cache */
>  
>  /*
>   * Note that nosuid etc flags are inode-specific: setting some file-system
> @@ -1642,6 +1643,11 @@ struct super_operations {
>  #define IS_IMA(inode)		((inode)->i_flags & S_IMA)
>  #define IS_AUTOMOUNT(inode)	((inode)->i_flags & S_AUTOMOUNT)
>  #define IS_NOSEC(inode)		((inode)->i_flags & S_NOSEC)
> +#ifdef CONFIG_FS_XIP
> +#define IS_DAX(inode)		((inode)->i_flags & S_DAX)
> +#else
> +#define IS_DAX(inode)		0
> +#endif
>  
>  /*
>   * Inode state bits.  Protected by inode->i_lock
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 90effcd..fec4db9 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1718,9 +1718,11 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
>  		 * we've already read everything we wanted to, or if
>  		 * there was a short read because we hit EOF, go ahead
>  		 * and return.  Otherwise fallthrough to buffered io for
> -		 * the rest of the read.
> +		 * the rest of the read.  Buffered reads will not work for
> +		 * DAX files, so don't bother trying.
>  		 */
> -		if (retval < 0 || !iov_iter_count(iter) || *ppos >= size) {
> +		if (retval < 0 || !iov_iter_count(iter) || *ppos >= size ||
> +		    IS_DAX(inode)) {
>  			file_accessed(file);
>  			goto out;
>  		}
> @@ -2584,13 +2586,16 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  		loff_t endbyte;
>  
>  		written = generic_file_direct_write(iocb, from, pos);
> -		if (written < 0 || written == count)
> -			goto out;
> -
>  		/*
> -		 * direct-io write to a hole: fall through to buffered I/O
> -		 * for completing the rest of the request.
> +		 * If the write stopped short of completing, fall back to
> +		 * buffered writes.  Some filesystems do this for writes to
> +		 * holes, for example.  For DAX files, a buffered write will
> +		 * not succeed (even if it did, DAX does not handle dirty
> +		 * page-cache pages correctly).
>  		 */
> +		if (written < 0 || written == count || IS_DAX(inode))
> +			goto out;
> +
>  		pos += written;
>  		count -= written;
>  
> -- 
> 2.1.0
> 

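For reference, the flag handling in the quoted ext2_set_inode_flags()
hunks can be modelled in user space as below. This is only a sketch under
assumptions: struct model_sb, struct model_inode, MODEL_SYNC_FL and
model_set_inode_flags() are hypothetical, the S_SYNC value is
illustrative, and the boolean xip_opt stands in for
test_opt(inode->i_sb, XIP). Only the S_DAX value and the clear-then-set
pattern come from the patch.

```c
#include <stdbool.h>

#define S_SYNC		1	/* illustrative value, not checked against fs.h */
#define S_DAX		8192	/* value taken from the quoted patch */
#define MODEL_SYNC_FL	0x8	/* hypothetical per-inode on-disk flag */

/* Hypothetical stand-in for the superblock mount options. */
struct model_sb {
	bool xip_opt;		/* models test_opt(sb, XIP) */
};

/* Hypothetical stand-in for the in-memory inode. */
struct model_inode {
	unsigned int i_flags;		/* VFS-level S_* flags */
	unsigned int disk_flags;	/* per-inode on-disk flags */
	const struct model_sb *sb;
};

/*
 * Clear every flag this function owns, then set each one again from its
 * source. S_DAX must be in the clear mask; otherwise a stale bit would
 * survive a call made while the mount option is off.
 */
static void model_set_inode_flags(struct model_inode *inode)
{
	inode->i_flags &= ~(S_SYNC | S_DAX);
	if (inode->disk_flags & MODEL_SYNC_FL)
		inode->i_flags |= S_SYNC;
	if (inode->sb->xip_opt)		/* mount-wide, not per-inode */
		inode->i_flags |= S_DAX;
}
```

Note that in this model, as in the patch, S_DAX is derived from the mount
option rather than from a per-inode on-disk flag, so every inode on such a
mount ends up tagged.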
-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Key fingerprint: 2A0B 4ED9 15F2 D3FA 45F5  B162 1728 0A97 8118 6ACF


