stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX
       [not found] <20170912050526.7627-1-ross.zwisler@linux.intel.com>
@ 2017-09-12  5:05 ` Ross Zwisler
  2017-09-12  6:38   ` Jan Kara
  2017-10-12 15:52   ` Theodore Ts'o
  2017-09-12  5:05 ` [PATCH v2 3/5] ext4: add sanity check for encryption " Ross Zwisler
  1 sibling, 2 replies; 6+ messages in thread
From: Ross Zwisler @ 2017-09-12  5:05 UTC (permalink / raw)
  To: Theodore Ts'o, Jan Kara, linux-kernel
  Cc: Ross Zwisler, Andreas Dilger, Christoph Hellwig, Dan Williams,
	Dave Chinner, linux-ext4, linux-nvdimm, stable

If an inode has inline data it is currently prevented from using DAX by a
check in ext4_set_inode_flags().  When the inode grows inline data via
ext4_create_inline_data() or removes its inline data via
ext4_destroy_inline_data_nolock(), the value of S_DAX can change.

Currently these changes are unsafe because we don't hold off page faults
and I/O, write back dirty radix tree entries and invalidate all mappings.
There are also issues with mm-level races when changing the value of S_DAX,
as well as issues with the VM_MIXEDMAP flag:

https://www.spinics.net/lists/linux-xfs/msg09859.html

The unsafe transition of S_DAX can reliably cause data corruption, as shown
by the following fstest:

https://patchwork.kernel.org/patch/9948381/

Fix this issue by preventing the DAX mount option from being used on
filesystems that were created to support inline data.  Inline data is an
option given to mkfs.ext4.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
CC: stable@vger.kernel.org
---
 fs/ext4/inline.c | 10 ----------
 fs/ext4/super.c  |  5 +++++
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 28c5c3a..fd95019 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -302,11 +302,6 @@ static int ext4_create_inline_data(handle_t *handle,
 	EXT4_I(inode)->i_inline_size = len + EXT4_MIN_INLINE_DATA_SIZE;
 	ext4_clear_inode_flag(inode, EXT4_INODE_EXTENTS);
 	ext4_set_inode_flag(inode, EXT4_INODE_INLINE_DATA);
-	/*
-	 * Propagate changes to inode->i_flags as well - e.g. S_DAX may
-	 * get cleared
-	 */
-	ext4_set_inode_flags(inode);
 	get_bh(is.iloc.bh);
 	error = ext4_mark_iloc_dirty(handle, inode, &is.iloc);
 
@@ -451,11 +446,6 @@ static int ext4_destroy_inline_data_nolock(handle_t *handle,
 		}
 	}
 	ext4_clear_inode_flag(inode, EXT4_INODE_INLINE_DATA);
-	/*
-	 * Propagate changes to inode->i_flags as well - e.g. S_DAX may
-	 * get set.
-	 */
-	ext4_set_inode_flags(inode);
 
 	get_bh(is.iloc.bh);
 	error = ext4_mark_iloc_dirty(handle, inode, &is.iloc);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index c9e7be5..4251e50 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3707,6 +3707,11 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 	}
 
 	if (sbi->s_mount_opt & EXT4_MOUNT_DAX) {
+		if (ext4_has_feature_inline_data(sb)) {
+			ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem"
+					" that may contain inline data");
+			goto failed_mount;
+		}
 		err = bdev_dax_supported(sb, blocksize);
 		if (err)
 			goto failed_mount;
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 3/5] ext4: add sanity check for encryption + DAX
       [not found] <20170912050526.7627-1-ross.zwisler@linux.intel.com>
  2017-09-12  5:05 ` [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX Ross Zwisler
@ 2017-09-12  5:05 ` Ross Zwisler
  2017-09-12  6:45   ` Jan Kara
  1 sibling, 1 reply; 6+ messages in thread
From: Ross Zwisler @ 2017-09-12  5:05 UTC (permalink / raw)
  To: Theodore Ts'o, Jan Kara, linux-kernel
  Cc: Ross Zwisler, Andreas Dilger, Christoph Hellwig, Dan Williams,
	Dave Chinner, linux-ext4, linux-nvdimm, stable

We prevent DAX from being used on inodes which are using ext4's built in
encryption via a check in ext4_set_inode_flags().  We do have what appears
to be an unsafe transition of S_DAX in ext4_set_context(), though, where
S_DAX can get disabled without us doing a proper writeback + invalidate.

There are also issues with mm-level races when changing the value of S_DAX,
as well as issues with the VM_MIXEDMAP flag:

https://www.spinics.net/lists/linux-xfs/msg09859.html

I actually think we are safe in this case because of the following:

1) You can't encrypt an existing file.  Encryption can only be set on an
empty directory, with new inodes in that directory being created with
encryption turned on, so I don't think it's possible to turn encryption on
for a file that has open DAX mmaps or outstanding I/Os.

2) There is no way to turn encryption off on a given file.  Once an inode
is encrypted, it stays encrypted for the life of that inode, so we don't
have to worry about the case where we turn encryption off and S_DAX
suddenly turns on.

3) The only way we end up in ext4_set_context() to turn on encryption is
when we are creating a new file in the encrypted directory.  This happens
as part of ext4_create() before the inode has been allowed to do any I/O.
Here's the call tree:

 ext4_create()
   __ext4_new_inode()
	 ext4_set_inode_flags() // sets S_DAX
	 fscrypt_inherit_context()
		fscrypt_get_encryption_info();
		ext4_set_context() // sets EXT4_INODE_ENCRYPT, clears S_DAX

So, I actually think it's safe to transition S_DAX in ext4_set_context()
without any locking, writebacks or invalidations.  I've added a
WARN_ON_ONCE() sanity check to make sure that we are notified if we ever
encounter a case where we are encrypting an inode that already has data,
in which case we need to add code to safely transition S_DAX.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
CC: stable@vger.kernel.org
---
 fs/ext4/super.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 4251e50..c090780 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1159,6 +1159,9 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
 	if (inode->i_ino == EXT4_ROOT_INO)
 		return -EPERM;
 
+	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
+		return -EINVAL;
+
 	res = ext4_convert_inline_data(inode);
 	if (res)
 		return res;
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX
  2017-09-12  5:05 ` [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX Ross Zwisler
@ 2017-09-12  6:38   ` Jan Kara
  2017-10-12 15:52   ` Theodore Ts'o
  1 sibling, 0 replies; 6+ messages in thread
From: Jan Kara @ 2017-09-12  6:38 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Theodore Ts'o, Jan Kara, linux-kernel, Andreas Dilger,
	Christoph Hellwig, Dan Williams, Dave Chinner, linux-ext4,
	linux-nvdimm, stable

On Mon 11-09-17 23:05:22, Ross Zwisler wrote:
> If an inode has inline data it is currently prevented from using DAX by a
> check in ext4_set_inode_flags().  When the inode grows inline data via
> ext4_create_inline_data() or removes its inline data via
> ext4_destroy_inline_data_nolock(), the value of S_DAX can change.
> 
> Currently these changes are unsafe because we don't hold off page faults
> and I/O, write back dirty radix tree entries and invalidate all mappings.
> There are also issues with mm-level races when changing the value of S_DAX,
> as well as issues with the VM_MIXEDMAP flag:
> 
> https://www.spinics.net/lists/linux-xfs/msg09859.html
> 
> The unsafe transition of S_DAX can reliably cause data corruption, as shown
> by the following fstest:
> 
> https://patchwork.kernel.org/patch/9948381/
> 
> Fix this issue by preventing the DAX mount option from being used on
> filesystems that were created to support inline data.  Inline data is an
> option given to mkfs.ext4.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> CC: stable@vger.kernel.org

Looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/inline.c | 10 ----------
>  fs/ext4/super.c  |  5 +++++
>  2 files changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
> index 28c5c3a..fd95019 100644
> --- a/fs/ext4/inline.c
> +++ b/fs/ext4/inline.c
> @@ -302,11 +302,6 @@ static int ext4_create_inline_data(handle_t *handle,
>  	EXT4_I(inode)->i_inline_size = len + EXT4_MIN_INLINE_DATA_SIZE;
>  	ext4_clear_inode_flag(inode, EXT4_INODE_EXTENTS);
>  	ext4_set_inode_flag(inode, EXT4_INODE_INLINE_DATA);
> -	/*
> -	 * Propagate changes to inode->i_flags as well - e.g. S_DAX may
> -	 * get cleared
> -	 */
> -	ext4_set_inode_flags(inode);
>  	get_bh(is.iloc.bh);
>  	error = ext4_mark_iloc_dirty(handle, inode, &is.iloc);
>  
> @@ -451,11 +446,6 @@ static int ext4_destroy_inline_data_nolock(handle_t *handle,
>  		}
>  	}
>  	ext4_clear_inode_flag(inode, EXT4_INODE_INLINE_DATA);
> -	/*
> -	 * Propagate changes to inode->i_flags as well - e.g. S_DAX may
> -	 * get set.
> -	 */
> -	ext4_set_inode_flags(inode);
>  
>  	get_bh(is.iloc.bh);
>  	error = ext4_mark_iloc_dirty(handle, inode, &is.iloc);
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index c9e7be5..4251e50 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -3707,6 +3707,11 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  	}
>  
>  	if (sbi->s_mount_opt & EXT4_MOUNT_DAX) {
> +		if (ext4_has_feature_inline_data(sb)) {
> +			ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem"
> +					" that may contain inline data");
> +			goto failed_mount;
> +		}
>  		err = bdev_dax_supported(sb, blocksize);
>  		if (err)
>  			goto failed_mount;
> -- 
> 2.9.5
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 3/5] ext4: add sanity check for encryption + DAX
  2017-09-12  5:05 ` [PATCH v2 3/5] ext4: add sanity check for encryption " Ross Zwisler
@ 2017-09-12  6:45   ` Jan Kara
  2017-09-12 15:39     ` Ross Zwisler
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Kara @ 2017-09-12  6:45 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Theodore Ts'o, Jan Kara, linux-kernel, Andreas Dilger,
	Christoph Hellwig, Dan Williams, Dave Chinner, linux-ext4,
	linux-nvdimm, stable

On Mon 11-09-17 23:05:24, Ross Zwisler wrote:
> We prevent DAX from being used on inodes which are using ext4's built in
> encryption via a check in ext4_set_inode_flags().  We do have what appears
> to be an unsafe transition of S_DAX in ext4_set_context(), though, where
> S_DAX can get disabled without us doing a proper writeback + invalidate.
> 
> There are also issues with mm-level races when changing the value of S_DAX,
> as well as issues with the VM_MIXEDMAP flag:
> 
> https://www.spinics.net/lists/linux-xfs/msg09859.html
> 
> I actually think we are safe in this case because of the following:
> 
> 1) You can't encrypt an existing file.  Encryption can only be set on an
> empty directory, with new inodes in that directory being created with
> encryption turned on, so I don't think it's possible to turn encryption on
> for a file that has open DAX mmaps or outstanding I/Os.
> 
> 2) There is no way to turn encryption off on a given file.  Once an inode
> is encrypted, it stays encrypted for the life of that inode, so we don't
> have to worry about the case where we turn encryption off and S_DAX
> suddenly turns on.
> 
> 3) The only way we end up in ext4_set_context() to turn on encryption is
> when we are creating a new file in the encrypted directory.  This happens
> as part of ext4_create() before the inode has been allowed to do any I/O.
> Here's the call tree:
> 
>  ext4_create()
>    __ext4_new_inode()
> 	 ext4_set_inode_flags() // sets S_DAX
> 	 fscrypt_inherit_context()
> 		fscrypt_get_encryption_info();
> 		ext4_set_context() // sets EXT4_INODE_ENCRYPT, clears S_DAX
> 
> So, I actually think it's safe to transition S_DAX in ext4_set_context()
> without any locking, writebacks or invalidations.  I've added a
> WARN_ON_ONCE() sanity check to make sure that we are notified if we ever
> encounter a case where we are encrypting an inode that already has data,
> in which case we need to add code to safely transition S_DAX.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> CC: stable@vger.kernel.org

Looks good to me - and frankly I think we can drop the stable CC here...
Anyway, you can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/super.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 4251e50..c090780 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1159,6 +1159,9 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
>  	if (inode->i_ino == EXT4_ROOT_INO)
>  		return -EPERM;
>  
> +	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> +		return -EINVAL;
> +
>  	res = ext4_convert_inline_data(inode);
>  	if (res)
>  		return res;
> -- 
> 2.9.5
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 3/5] ext4: add sanity check for encryption + DAX
  2017-09-12  6:45   ` Jan Kara
@ 2017-09-12 15:39     ` Ross Zwisler
  0 siblings, 0 replies; 6+ messages in thread
From: Ross Zwisler @ 2017-09-12 15:39 UTC (permalink / raw)
  To: Jan Kara
  Cc: Ross Zwisler, Theodore Ts'o, linux-kernel, Andreas Dilger,
	Christoph Hellwig, Dan Williams, Dave Chinner, linux-ext4,
	linux-nvdimm, stable

On Tue, Sep 12, 2017 at 08:45:00AM +0200, Jan Kara wrote:
> On Mon 11-09-17 23:05:24, Ross Zwisler wrote:
> > We prevent DAX from being used on inodes which are using ext4's built in
> > encryption via a check in ext4_set_inode_flags().  We do have what appears
> > to be an unsafe transition of S_DAX in ext4_set_context(), though, where
> > S_DAX can get disabled without us doing a proper writeback + invalidate.
> > 
> > There are also issues with mm-level races when changing the value of S_DAX,
> > as well as issues with the VM_MIXEDMAP flag:
> > 
> > https://www.spinics.net/lists/linux-xfs/msg09859.html
> > 
> > I actually think we are safe in this case because of the following:
> > 
> > 1) You can't encrypt an existing file.  Encryption can only be set on an
> > empty directory, with new inodes in that directory being created with
> > encryption turned on, so I don't think it's possible to turn encryption on
> > for a file that has open DAX mmaps or outstanding I/Os.
> > 
> > 2) There is no way to turn encryption off on a given file.  Once an inode
> > is encrypted, it stays encrypted for the life of that inode, so we don't
> > have to worry about the case where we turn encryption off and S_DAX
> > suddenly turns on.
> > 
> > 3) The only way we end up in ext4_set_context() to turn on encryption is
> > when we are creating a new file in the encrypted directory.  This happens
> > as part of ext4_create() before the inode has been allowed to do any I/O.
> > Here's the call tree:
> > 
> >  ext4_create()
> >    __ext4_new_inode()
> > 	 ext4_set_inode_flags() // sets S_DAX
> > 	 fscrypt_inherit_context()
> > 		fscrypt_get_encryption_info();
> > 		ext4_set_context() // sets EXT4_INODE_ENCRYPT, clears S_DAX
> > 
> > So, I actually think it's safe to transition S_DAX in ext4_set_context()
> > without any locking, writebacks or invalidations.  I've added a
> > WARN_ON_ONCE() sanity check to make sure that we are notified if we ever
> > encounter a case where we are encrypting an inode that already has data,
> > in which case we need to add code to safely transition S_DAX.
> > 
> > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > CC: stable@vger.kernel.org
> 
> Looks good to me - and frankly I think we can drop the stable CC here...

Sure, I'm fine to drop the CC to stable.

> Anyway, you can add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> 
> 								Honza
> 
> > ---
> >  fs/ext4/super.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 4251e50..c090780 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -1159,6 +1159,9 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
> >  	if (inode->i_ino == EXT4_ROOT_INO)
> >  		return -EPERM;
> >  
> > +	if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
> > +		return -EINVAL;
> > +
> >  	res = ext4_convert_inline_data(inode);
> >  	if (res)
> >  		return res;
> > -- 
> > 2.9.5
> > 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX
  2017-09-12  5:05 ` [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX Ross Zwisler
  2017-09-12  6:38   ` Jan Kara
@ 2017-10-12 15:52   ` Theodore Ts'o
  1 sibling, 0 replies; 6+ messages in thread
From: Theodore Ts'o @ 2017-10-12 15:52 UTC (permalink / raw)
  To: Ross Zwisler
  Cc: Jan Kara, linux-kernel, Andreas Dilger, Christoph Hellwig,
	Dan Williams, Dave Chinner, linux-ext4, linux-nvdimm, stable

On Mon, Sep 11, 2017 at 11:05:22PM -0600, Ross Zwisler wrote:
> If an inode has inline data it is currently prevented from using DAX by a
> check in ext4_set_inode_flags().  When the inode grows inline data via
> ext4_create_inline_data() or removes its inline data via
> ext4_destroy_inline_data_nolock(), the value of S_DAX can change.
> 
> Currently these changes are unsafe because we don't hold off page faults
> and I/O, write back dirty radix tree entries and invalidate all mappings.
> There are also issues with mm-level races when changing the value of S_DAX,
> as well as issues with the VM_MIXEDMAP flag:
> 
> https://www.spinics.net/lists/linux-xfs/msg09859.html
> 
> The unsafe transition of S_DAX can reliably cause data corruption, as shown
> by the following fstest:
> 
> https://patchwork.kernel.org/patch/9948381/
> 
> Fix this issue by preventing the DAX mount option from being used on
> filesystems that were created to support inline data.  Inline data is an
> option given to mkfs.ext4.
> 
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> CC: stable@vger.kernel.org

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-10-12 15:52 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20170912050526.7627-1-ross.zwisler@linux.intel.com>
2017-09-12  5:05 ` [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX Ross Zwisler
2017-09-12  6:38   ` Jan Kara
2017-10-12 15:52   ` Theodore Ts'o
2017-09-12  5:05 ` [PATCH v2 3/5] ext4: add sanity check for encryption " Ross Zwisler
2017-09-12  6:45   ` Jan Kara
2017-09-12 15:39     ` Ross Zwisler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).