* [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX [not found] <20170912050526.7627-1-ross.zwisler@linux.intel.com> @ 2017-09-12 5:05 ` Ross Zwisler 2017-09-12 6:38 ` Jan Kara 2017-10-12 15:52 ` Theodore Ts'o 2017-09-12 5:05 ` [PATCH v2 3/5] ext4: add sanity check for encryption " Ross Zwisler 1 sibling, 2 replies; 6+ messages in thread From: Ross Zwisler @ 2017-09-12 5:05 UTC (permalink / raw) To: Theodore Ts'o, Jan Kara, linux-kernel Cc: Ross Zwisler, Andreas Dilger, Christoph Hellwig, Dan Williams, Dave Chinner, linux-ext4, linux-nvdimm, stable If an inode has inline data it is currently prevented from using DAX by a check in ext4_set_inode_flags(). When the inode grows inline data via ext4_create_inline_data() or removes its inline data via ext4_destroy_inline_data_nolock(), the value of S_DAX can change. Currently these changes are unsafe because we don't hold off page faults and I/O, write back dirty radix tree entries and invalidate all mappings. There are also issues with mm-level races when changing the value of S_DAX, as well as issues with the VM_MIXEDMAP flag: https://www.spinics.net/lists/linux-xfs/msg09859.html The unsafe transition of S_DAX can reliably cause data corruption, as shown by the following fstest: https://patchwork.kernel.org/patch/9948381/ Fix this issue by preventing the DAX mount option from being used on filesystems that were created to support inline data. Inline data is an option given to mkfs.ext4. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> CC: stable@vger.kernel.org --- fs/ext4/inline.c | 10 ---------- fs/ext4/super.c | 5 +++++ 2 files changed, 5 insertions(+), 10 deletions(-) diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c index 28c5c3a..fd95019 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -302,11 +302,6 @@ static int ext4_create_inline_data(handle_t *handle, EXT4_I(inode)->i_inline_size = len + EXT4_MIN_INLINE_DATA_SIZE; ext4_clear_inode_flag(inode, EXT4_INODE_EXTENTS); ext4_set_inode_flag(inode, EXT4_INODE_INLINE_DATA); - /* - * Propagate changes to inode->i_flags as well - e.g. S_DAX may - * get cleared - */ - ext4_set_inode_flags(inode); get_bh(is.iloc.bh); error = ext4_mark_iloc_dirty(handle, inode, &is.iloc); @@ -451,11 +446,6 @@ static int ext4_destroy_inline_data_nolock(handle_t *handle, } } ext4_clear_inode_flag(inode, EXT4_INODE_INLINE_DATA); - /* - * Propagate changes to inode->i_flags as well - e.g. S_DAX may - * get set. - */ - ext4_set_inode_flags(inode); get_bh(is.iloc.bh); error = ext4_mark_iloc_dirty(handle, inode, &is.iloc); diff --git a/fs/ext4/super.c b/fs/ext4/super.c index c9e7be5..4251e50 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -3707,6 +3707,11 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) } if (sbi->s_mount_opt & EXT4_MOUNT_DAX) { + if (ext4_has_feature_inline_data(sb)) { + ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem" + " that may contain inline data"); + goto failed_mount; + } err = bdev_dax_supported(sb, blocksize); if (err) goto failed_mount; -- 2.9.5 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX 2017-09-12 5:05 ` [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX Ross Zwisler @ 2017-09-12 6:38 ` Jan Kara 2017-10-12 15:52 ` Theodore Ts'o 1 sibling, 0 replies; 6+ messages in thread From: Jan Kara @ 2017-09-12 6:38 UTC (permalink / raw) To: Ross Zwisler Cc: Theodore Ts'o, Jan Kara, linux-kernel, Andreas Dilger, Christoph Hellwig, Dan Williams, Dave Chinner, linux-ext4, linux-nvdimm, stable On Mon 11-09-17 23:05:22, Ross Zwisler wrote: > If an inode has inline data it is currently prevented from using DAX by a > check in ext4_set_inode_flags(). When the inode grows inline data via > ext4_create_inline_data() or removes its inline data via > ext4_destroy_inline_data_nolock(), the value of S_DAX can change. > > Currently these changes are unsafe because we don't hold off page faults > and I/O, write back dirty radix tree entries and invalidate all mappings. > There are also issues with mm-level races when changing the value of S_DAX, > as well as issues with the VM_MIXEDMAP flag: > > https://www.spinics.net/lists/linux-xfs/msg09859.html > > The unsafe transition of S_DAX can reliably cause data corruption, as shown > by the following fstest: > > https://patchwork.kernel.org/patch/9948381/ > > Fix this issue by preventing the DAX mount option from being used on > filesystems that were created to support inline data. Inline data is an > option given to mkfs.ext4. > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > CC: stable@vger.kernel.org Looks good. You can add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > fs/ext4/inline.c | 10 ---------- > fs/ext4/super.c | 5 +++++ > 2 files changed, 5 insertions(+), 10 deletions(-) > > diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c > index 28c5c3a..fd95019 100644 > --- a/fs/ext4/inline.c > +++ b/fs/ext4/inline.c > @@ -302,11 +302,6 @@ static int ext4_create_inline_data(handle_t *handle, > EXT4_I(inode)->i_inline_size = len + EXT4_MIN_INLINE_DATA_SIZE; > ext4_clear_inode_flag(inode, EXT4_INODE_EXTENTS); > ext4_set_inode_flag(inode, EXT4_INODE_INLINE_DATA); > - /* > - * Propagate changes to inode->i_flags as well - e.g. S_DAX may > - * get cleared > - */ > - ext4_set_inode_flags(inode); > get_bh(is.iloc.bh); > error = ext4_mark_iloc_dirty(handle, inode, &is.iloc); > > @@ -451,11 +446,6 @@ static int ext4_destroy_inline_data_nolock(handle_t *handle, > } > } > ext4_clear_inode_flag(inode, EXT4_INODE_INLINE_DATA); > - /* > - * Propagate changes to inode->i_flags as well - e.g. S_DAX may > - * get set. > - */ > - ext4_set_inode_flags(inode); > > get_bh(is.iloc.bh); > error = ext4_mark_iloc_dirty(handle, inode, &is.iloc); > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index c9e7be5..4251e50 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -3707,6 +3707,11 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) > } > > if (sbi->s_mount_opt & EXT4_MOUNT_DAX) { > + if (ext4_has_feature_inline_data(sb)) { > + ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem" > + " that may contain inline data"); > + goto failed_mount; > + } > err = bdev_dax_supported(sb, blocksize); > if (err) > goto failed_mount; > -- > 2.9.5 > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX 2017-09-12 5:05 ` [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX Ross Zwisler 2017-09-12 6:38 ` Jan Kara @ 2017-10-12 15:52 ` Theodore Ts'o 1 sibling, 0 replies; 6+ messages in thread From: Theodore Ts'o @ 2017-10-12 15:52 UTC (permalink / raw) To: Ross Zwisler Cc: Jan Kara, linux-kernel, Andreas Dilger, Christoph Hellwig, Dan Williams, Dave Chinner, linux-ext4, linux-nvdimm, stable On Mon, Sep 11, 2017 at 11:05:22PM -0600, Ross Zwisler wrote: > If an inode has inline data it is currently prevented from using DAX by a > check in ext4_set_inode_flags(). When the inode grows inline data via > ext4_create_inline_data() or removes its inline data via > ext4_destroy_inline_data_nolock(), the value of S_DAX can change. > > Currently these changes are unsafe because we don't hold off page faults > and I/O, write back dirty radix tree entries and invalidate all mappings. > There are also issues with mm-level races when changing the value of S_DAX, > as well as issues with the VM_MIXEDMAP flag: > > https://www.spinics.net/lists/linux-xfs/msg09859.html > > The unsafe transition of S_DAX can reliably cause data corruption, as shown > by the following fstest: > > https://patchwork.kernel.org/patch/9948381/ > > Fix this issue by preventing the DAX mount option from being used on > filesystems that were created to support inline data. Inline data is an > option given to mkfs.ext4. > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > CC: stable@vger.kernel.org Thanks, applied. - Ted ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH v2 3/5] ext4: add sanity check for encryption + DAX [not found] <20170912050526.7627-1-ross.zwisler@linux.intel.com> 2017-09-12 5:05 ` [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX Ross Zwisler @ 2017-09-12 5:05 ` Ross Zwisler 2017-09-12 6:45 ` Jan Kara 1 sibling, 1 reply; 6+ messages in thread From: Ross Zwisler @ 2017-09-12 5:05 UTC (permalink / raw) To: Theodore Ts'o, Jan Kara, linux-kernel Cc: Ross Zwisler, Andreas Dilger, Christoph Hellwig, Dan Williams, Dave Chinner, linux-ext4, linux-nvdimm, stable We prevent DAX from being used on inodes which are using ext4's built in encryption via a check in ext4_set_inode_flags(). We do have what appears to be an unsafe transition of S_DAX in ext4_set_context(), though, where S_DAX can get disabled without us doing a proper writeback + invalidate. There are also issues with mm-level races when changing the value of S_DAX, as well as issues with the VM_MIXEDMAP flag: https://www.spinics.net/lists/linux-xfs/msg09859.html I actually think we are safe in this case because of the following: 1) You can't encrypt an existing file. Encryption can only be set on an empty directory, with new inodes in that directory being created with encryption turned on, so I don't think it's possible to turn encryption on for a file that has open DAX mmaps or outstanding I/Os. 2) There is no way to turn encryption off on a given file. Once an inode is encrypted, it stays encrypted for the life of that inode, so we don't have to worry about the case where we turn encryption off and S_DAX suddenly turns on. 3) The only way we end up in ext4_set_context() to turn on encryption is when we are creating a new file in the encrypted directory. This happens as part of ext4_create() before the inode has been allowed to do any I/O. Here's the call tree: ext4_create() __ext4_new_inode() ext4_set_inode_flags() // sets S_DAX fscrypt_inherit_context() fscrypt_get_encryption_info(); ext4_set_context() // sets EXT4_INODE_ENCRYPT, clears S_DAX So, I actually think it's safe to transition S_DAX in ext4_set_context() without any locking, writebacks or invalidations. I've added a WARN_ON_ONCE() sanity check to make sure that we are notified if we ever encounter a case where we are encrypting an inode that already has data, in which case we need to add code to safely transition S_DAX. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> CC: stable@vger.kernel.org --- fs/ext4/super.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 4251e50..c090780 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1159,6 +1159,9 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len, if (inode->i_ino == EXT4_ROOT_INO) return -EPERM; + if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode))) + return -EINVAL; + res = ext4_convert_inline_data(inode); if (res) return res; -- 2.9.5 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 3/5] ext4: add sanity check for encryption + DAX 2017-09-12 5:05 ` [PATCH v2 3/5] ext4: add sanity check for encryption " Ross Zwisler @ 2017-09-12 6:45 ` Jan Kara 2017-09-12 15:39 ` Ross Zwisler 0 siblings, 1 reply; 6+ messages in thread From: Jan Kara @ 2017-09-12 6:45 UTC (permalink / raw) To: Ross Zwisler Cc: Theodore Ts'o, Jan Kara, linux-kernel, Andreas Dilger, Christoph Hellwig, Dan Williams, Dave Chinner, linux-ext4, linux-nvdimm, stable On Mon 11-09-17 23:05:24, Ross Zwisler wrote: > We prevent DAX from being used on inodes which are using ext4's built in > encryption via a check in ext4_set_inode_flags(). We do have what appears > to be an unsafe transition of S_DAX in ext4_set_context(), though, where > S_DAX can get disabled without us doing a proper writeback + invalidate. > > There are also issues with mm-level races when changing the value of S_DAX, > as well as issues with the VM_MIXEDMAP flag: > > https://www.spinics.net/lists/linux-xfs/msg09859.html > > I actually think we are safe in this case because of the following: > > 1) You can't encrypt an existing file. Encryption can only be set on an > empty directory, with new inodes in that directory being created with > encryption turned on, so I don't think it's possible to turn encryption on > for a file that has open DAX mmaps or outstanding I/Os. > > 2) There is no way to turn encryption off on a given file. Once an inode > is encrypted, it stays encrypted for the life of that inode, so we don't > have to worry about the case where we turn encryption off and S_DAX > suddenly turns on. > > 3) The only way we end up in ext4_set_context() to turn on encryption is > when we are creating a new file in the encrypted directory. This happens > as part of ext4_create() before the inode has been allowed to do any I/O. > Here's the call tree: > > ext4_create() > __ext4_new_inode() > ext4_set_inode_flags() // sets S_DAX > fscrypt_inherit_context() > fscrypt_get_encryption_info(); > ext4_set_context() // sets EXT4_INODE_ENCRYPT, clears S_DAX > > So, I actually think it's safe to transition S_DAX in ext4_set_context() > without any locking, writebacks or invalidations. I've added a > WARN_ON_ONCE() sanity check to make sure that we are notified if we ever > encounter a case where we are encrypting an inode that already has data, > in which case we need to add code to safely transition S_DAX. > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > CC: stable@vger.kernel.org Looks good to me - and frankly I think we can drop the stable CC here... Anyway, you can add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > fs/ext4/super.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index 4251e50..c090780 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -1159,6 +1159,9 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len, > if (inode->i_ino == EXT4_ROOT_INO) > return -EPERM; > > + if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode))) > + return -EINVAL; > + > res = ext4_convert_inline_data(inode); > if (res) > return res; > -- > 2.9.5 > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 3/5] ext4: add sanity check for encryption + DAX 2017-09-12 6:45 ` Jan Kara @ 2017-09-12 15:39 ` Ross Zwisler 0 siblings, 0 replies; 6+ messages in thread From: Ross Zwisler @ 2017-09-12 15:39 UTC (permalink / raw) To: Jan Kara Cc: Ross Zwisler, Theodore Ts'o, linux-kernel, Andreas Dilger, Christoph Hellwig, Dan Williams, Dave Chinner, linux-ext4, linux-nvdimm, stable On Tue, Sep 12, 2017 at 08:45:00AM +0200, Jan Kara wrote: > On Mon 11-09-17 23:05:24, Ross Zwisler wrote: > > We prevent DAX from being used on inodes which are using ext4's built in > > encryption via a check in ext4_set_inode_flags(). We do have what appears > > to be an unsafe transition of S_DAX in ext4_set_context(), though, where > > S_DAX can get disabled without us doing a proper writeback + invalidate. > > > > There are also issues with mm-level races when changing the value of S_DAX, > > as well as issues with the VM_MIXEDMAP flag: > > > > https://www.spinics.net/lists/linux-xfs/msg09859.html > > > > I actually think we are safe in this case because of the following: > > > > 1) You can't encrypt an existing file. Encryption can only be set on an > > empty directory, with new inodes in that directory being created with > > encryption turned on, so I don't think it's possible to turn encryption on > > for a file that has open DAX mmaps or outstanding I/Os. > > > > 2) There is no way to turn encryption off on a given file. Once an inode > > is encrypted, it stays encrypted for the life of that inode, so we don't > > have to worry about the case where we turn encryption off and S_DAX > > suddenly turns on. > > > > 3) The only way we end up in ext4_set_context() to turn on encryption is > > when we are creating a new file in the encrypted directory. This happens > > as part of ext4_create() before the inode has been allowed to do any I/O. > > Here's the call tree: > > > > ext4_create() > > __ext4_new_inode() > > ext4_set_inode_flags() // sets S_DAX > > fscrypt_inherit_context() > > fscrypt_get_encryption_info(); > > ext4_set_context() // sets EXT4_INODE_ENCRYPT, clears S_DAX > > > > So, I actually think it's safe to transition S_DAX in ext4_set_context() > > without any locking, writebacks or invalidations. I've added a > > WARN_ON_ONCE() sanity check to make sure that we are notified if we ever > > encounter a case where we are encrypting an inode that already has data, > > in which case we need to add code to safely transition S_DAX. > > > > Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> > > CC: stable@vger.kernel.org > > Looks good to me - and frankly I think we can drop the stable CC here... Sure, I'm fine to drop the CC to stable. > Anyway, you can add: > > Reviewed-by: Jan Kara <jack@suse.cz> > > Honza > > > --- > > fs/ext4/super.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > index 4251e50..c090780 100644 > > --- a/fs/ext4/super.c > > +++ b/fs/ext4/super.c > > @@ -1159,6 +1159,9 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len, > > if (inode->i_ino == EXT4_ROOT_INO) > > return -EPERM; > > > > + if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode))) > > + return -EINVAL; > > + > > res = ext4_convert_inline_data(inode); > > if (res) > > return res; > > -- > > 2.9.5 > > > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2017-10-12 15:52 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20170912050526.7627-1-ross.zwisler@linux.intel.com>
2017-09-12 5:05 ` [PATCH v2 1/5] ext4: prevent data corruption with inline data + DAX Ross Zwisler
2017-09-12 6:38 ` Jan Kara
2017-10-12 15:52 ` Theodore Ts'o
2017-09-12 5:05 ` [PATCH v2 3/5] ext4: add sanity check for encryption " Ross Zwisler
2017-09-12 6:45 ` Jan Kara
2017-09-12 15:39 ` Ross Zwisler
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).