linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/5] ext4 crypto: fix memory leak in ext4_bio_write_page()
@ 2015-10-04  5:38 Theodore Ts'o
  2015-10-04  5:38 ` [PATCH 2/5] ext4: optimize ext4_writepage() for attempted 4k delalloc writes Theodore Ts'o
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Theodore Ts'o @ 2015-10-04  5:38 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, stable

There are times when ext4_bio_write_page() is called even though we
don't actually need to do any I/O.  This happens when ext4_writepage()
gets called by the jbd2 commit path when an inode needs to force its
pages written out in order to provide data=ordered guarantees --- and
a page is backed by an unwritten (e.g., uninitialized) block on disk,
or if delayed allocation means the page's backing store hasn't been
allocated yet.  In that case, we need to skip the call to
ext4_encrypt_page(), since in addition to wasting CPU, it leads to a
bounce page and an ext4 crypto context getting leaked.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
 fs/ext4/page-io.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 84ba4d2..17fbe38 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -425,6 +425,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
 	struct buffer_head *bh, *head;
 	int ret = 0;
 	int nr_submitted = 0;
+	int nr_to_submit = 0;
 
 	blocksize = 1 << inode->i_blkbits;
 
@@ -477,11 +478,13 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
 			unmap_underlying_metadata(bh->b_bdev, bh->b_blocknr);
 		}
 		set_buffer_async_write(bh);
+		nr_to_submit++;
 	} while ((bh = bh->b_this_page) != head);
 
 	bh = head = page_buffers(page);
 
-	if (ext4_encrypted_inode(inode) && S_ISREG(inode->i_mode)) {
+	if (ext4_encrypted_inode(inode) && S_ISREG(inode->i_mode) &&
+	    nr_to_submit) {
 		data_page = ext4_encrypt(inode, page);
 		if (IS_ERR(data_page)) {
 			ret = PTR_ERR(data_page);
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/5] ext4: optimize ext4_writepage() for attempted 4k delalloc writes
  2015-10-04  5:38 [PATCH 1/5] ext4 crypto: fix memory leak in ext4_bio_write_page() Theodore Ts'o
@ 2015-10-04  5:38 ` Theodore Ts'o
  2015-10-04  5:38 ` [PATCH 3/5] ext4 crypto: ext4_page_crypto() doesn't need a encryption context Theodore Ts'o
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Theodore Ts'o @ 2015-10-04  5:38 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o

In cases where the file system block size is the same as the page
size, and ext4_writepage() is asked to write out a page which is
either has the unwritten bit set in the extent tree, or which does not
yet have a block assigned due to delayed allocation, we can bail out
early and, unlocking the page earlier and avoiding a round trip
through ext4_bio_write_page() with the attendant calls to
set_page_writeback() and redirty_page_for_writeback().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/ext4/inode.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 612fbcf..c95d406 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1815,11 +1815,22 @@ static int ext4_writepage(struct page *page,
 	 * the page. But we may reach here when we do a journal commit via
 	 * journal_submit_inode_data_buffers() and in that case we must write
 	 * allocated buffers to achieve data=ordered mode guarantees.
+	 *
+	 * Also, if there is only one buffer per page (the fs block
+	 * size == the page size), if one buffer needs block
+	 * allocation or needs to modify the extent tree to clear the
+	 * unwritten flag, we know that the page can't be written at
+	 * all, so we might as well refuse the write immediately.
+	 * Unfortunately if the block size != page size, we can't as
+	 * easily detect this case using ext4_walk_page_buffers(), but
+	 * for the extremely common case, this is an optimization that
+	 * skips a useless round trip through ext4_bio_write_page().
 	 */
 	if (ext4_walk_page_buffers(NULL, page_bufs, 0, len, NULL,
 				   ext4_bh_delay_or_unwritten)) {
 		redirty_page_for_writepage(wbc, page);
-		if (current->flags & PF_MEMALLOC) {
+		if ((current->flags & PF_MEMALLOC) ||
+		    (inode->i_sb->s_blocksize == PAGE_CACHE_SIZE)) {
 			/*
 			 * For memory cleaning there's no point in writing only
 			 * some buffers. So just bail out. Warn if we came here
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/5] ext4 crypto: ext4_page_crypto() doesn't need a encryption context
  2015-10-04  5:38 [PATCH 1/5] ext4 crypto: fix memory leak in ext4_bio_write_page() Theodore Ts'o
  2015-10-04  5:38 ` [PATCH 2/5] ext4: optimize ext4_writepage() for attempted 4k delalloc writes Theodore Ts'o
@ 2015-10-04  5:38 ` Theodore Ts'o
  2015-10-04  5:38 ` [PATCH 4/5] ext4 crypto: replace some BUG_ON()'s with error checks Theodore Ts'o
  2015-10-04  5:38 ` [PATCH 5/5] ext4 crypto: fix bugs in ext4_encrypted_zeroout() Theodore Ts'o
  3 siblings, 0 replies; 5+ messages in thread
From: Theodore Ts'o @ 2015-10-04  5:38 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o

Since ext4_page_crypto() doesn't need an encryption context (at least
not any more), this allows us to simplify a number function signature
and also allows us to avoid needing to allocate a context in
ext4_block_write_begin().  It also means we no longer need a separate
ext4_decrypt_one() function.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/ext4/crypto.c   | 28 +++++-----------------------
 fs/ext4/ext4.h     |  3 +--
 fs/ext4/inode.c    |  4 ++--
 fs/ext4/readpage.c |  2 +-
 4 files changed, 9 insertions(+), 28 deletions(-)

diff --git a/fs/ext4/crypto.c b/fs/ext4/crypto.c
index 4573155..3a5a7a2 100644
--- a/fs/ext4/crypto.c
+++ b/fs/ext4/crypto.c
@@ -253,8 +253,7 @@ typedef enum {
 	EXT4_ENCRYPT,
 } ext4_direction_t;
 
-static int ext4_page_crypto(struct ext4_crypto_ctx *ctx,
-			    struct inode *inode,
+static int ext4_page_crypto(struct inode *inode,
 			    ext4_direction_t rw,
 			    pgoff_t index,
 			    struct page *src_page,
@@ -353,7 +352,7 @@ struct page *ext4_encrypt(struct inode *inode,
 	if (IS_ERR(ciphertext_page))
 		goto errout;
 	ctx->w.control_page = plaintext_page;
-	err = ext4_page_crypto(ctx, inode, EXT4_ENCRYPT, plaintext_page->index,
+	err = ext4_page_crypto(inode, EXT4_ENCRYPT, plaintext_page->index,
 			       plaintext_page, ciphertext_page);
 	if (err) {
 		ciphertext_page = ERR_PTR(err);
@@ -378,31 +377,14 @@ struct page *ext4_encrypt(struct inode *inode,
  *
  * Return: Zero on success, non-zero otherwise.
  */
-int ext4_decrypt(struct ext4_crypto_ctx *ctx, struct page *page)
+int ext4_decrypt(struct page *page)
 {
 	BUG_ON(!PageLocked(page));
 
-	return ext4_page_crypto(ctx, page->mapping->host,
+	return ext4_page_crypto(page->mapping->host,
 				EXT4_DECRYPT, page->index, page, page);
 }
 
-/*
- * Convenience function which takes care of allocating and
- * deallocating the encryption context
- */
-int ext4_decrypt_one(struct inode *inode, struct page *page)
-{
-	int ret;
-
-	struct ext4_crypto_ctx *ctx = ext4_get_crypto_ctx(inode);
-
-	if (IS_ERR(ctx))
-		return PTR_ERR(ctx);
-	ret = ext4_decrypt(ctx, page);
-	ext4_release_crypto_ctx(ctx);
-	return ret;
-}
-
 int ext4_encrypted_zeroout(struct inode *inode, struct ext4_extent *ex)
 {
 	struct ext4_crypto_ctx	*ctx;
@@ -426,7 +408,7 @@ int ext4_encrypted_zeroout(struct inode *inode, struct ext4_extent *ex)
 	}
 
 	while (len--) {
-		err = ext4_page_crypto(ctx, inode, EXT4_ENCRYPT, lblk,
+		err = ext4_page_crypto(inode, EXT4_ENCRYPT, lblk,
 				       ZERO_PAGE(0), ciphertext_page);
 		if (err)
 			goto errout;
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 766b7f7..3f248c9 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2063,8 +2063,7 @@ void ext4_release_crypto_ctx(struct ext4_crypto_ctx *ctx);
 void ext4_restore_control_page(struct page *data_page);
 struct page *ext4_encrypt(struct inode *inode,
 			  struct page *plaintext_page);
-int ext4_decrypt(struct ext4_crypto_ctx *ctx, struct page *page);
-int ext4_decrypt_one(struct inode *inode, struct page *page);
+int ext4_decrypt(struct page *page);
 int ext4_encrypted_zeroout(struct inode *inode, struct ext4_extent *ex);
 
 #ifdef CONFIG_EXT4_FS_ENCRYPTION
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c95d406..ae52e32 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -965,7 +965,7 @@ static int ext4_block_write_begin(struct page *page, loff_t pos, unsigned len,
 	if (unlikely(err))
 		page_zero_new_buffers(page, from, to);
 	else if (decrypt)
-		err = ext4_decrypt_one(inode, page);
+		err = ext4_decrypt(page);
 	return err;
 }
 #endif
@@ -3404,7 +3404,7 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 			/* We expect the key to be set. */
 			BUG_ON(!ext4_has_encryption_key(inode));
 			BUG_ON(blocksize != PAGE_CACHE_SIZE);
-			WARN_ON_ONCE(ext4_decrypt_one(inode, page));
+			WARN_ON_ONCE(ext4_decrypt(page));
 		}
 	}
 	if (ext4_should_journal_data(inode)) {
diff --git a/fs/ext4/readpage.c b/fs/ext4/readpage.c
index e26803f..a91e4bf 100644
--- a/fs/ext4/readpage.c
+++ b/fs/ext4/readpage.c
@@ -62,7 +62,7 @@ static void completion_pages(struct work_struct *work)
 	bio_for_each_segment_all(bv, bio, i) {
 		struct page *page = bv->bv_page;
 
-		int ret = ext4_decrypt(ctx, page);
+		int ret = ext4_decrypt(page);
 		if (ret) {
 			WARN_ON_ONCE(1);
 			SetPageError(page);
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 4/5] ext4 crypto: replace some BUG_ON()'s with error checks
  2015-10-04  5:38 [PATCH 1/5] ext4 crypto: fix memory leak in ext4_bio_write_page() Theodore Ts'o
  2015-10-04  5:38 ` [PATCH 2/5] ext4: optimize ext4_writepage() for attempted 4k delalloc writes Theodore Ts'o
  2015-10-04  5:38 ` [PATCH 3/5] ext4 crypto: ext4_page_crypto() doesn't need a encryption context Theodore Ts'o
@ 2015-10-04  5:38 ` Theodore Ts'o
  2015-10-04  5:38 ` [PATCH 5/5] ext4 crypto: fix bugs in ext4_encrypted_zeroout() Theodore Ts'o
  3 siblings, 0 replies; 5+ messages in thread
From: Theodore Ts'o @ 2015-10-04  5:38 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, stable

Buggy (or hostile) userspace should not be able to cause the kernel to
crash.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
 fs/ext4/crypto.c        |  1 -
 fs/ext4/crypto_fname.c  |  2 --
 fs/ext4/crypto_key.c    | 16 +++++++++++++---
 fs/ext4/crypto_policy.c |  3 ++-
 4 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/crypto.c b/fs/ext4/crypto.c
index 3a5a7a2..879cb15 100644
--- a/fs/ext4/crypto.c
+++ b/fs/ext4/crypto.c
@@ -295,7 +295,6 @@ static int ext4_page_crypto(struct inode *inode,
 	else
 		res = crypto_ablkcipher_encrypt(req);
 	if (res == -EINPROGRESS || res == -EBUSY) {
-		BUG_ON(req->base.data != &ecr);
 		wait_for_completion(&ecr.completion);
 		res = ecr.res;
 	}
diff --git a/fs/ext4/crypto_fname.c b/fs/ext4/crypto_fname.c
index 847f919..2fbef8a 100644
--- a/fs/ext4/crypto_fname.c
+++ b/fs/ext4/crypto_fname.c
@@ -120,7 +120,6 @@ static int ext4_fname_encrypt(struct inode *inode,
 	ablkcipher_request_set_crypt(req, &src_sg, &dst_sg, ciphertext_len, iv);
 	res = crypto_ablkcipher_encrypt(req);
 	if (res == -EINPROGRESS || res == -EBUSY) {
-		BUG_ON(req->base.data != &ecr);
 		wait_for_completion(&ecr.completion);
 		res = ecr.res;
 	}
@@ -182,7 +181,6 @@ static int ext4_fname_decrypt(struct inode *inode,
 	ablkcipher_request_set_crypt(req, &src_sg, &dst_sg, iname->len, iv);
 	res = crypto_ablkcipher_decrypt(req);
 	if (res == -EINPROGRESS || res == -EBUSY) {
-		BUG_ON(req->base.data != &ecr);
 		wait_for_completion(&ecr.completion);
 		res = ecr.res;
 	}
diff --git a/fs/ext4/crypto_key.c b/fs/ext4/crypto_key.c
index 1d510c1..f9270ec 100644
--- a/fs/ext4/crypto_key.c
+++ b/fs/ext4/crypto_key.c
@@ -71,7 +71,6 @@ static int ext4_derive_key_aes(char deriving_key[EXT4_AES_128_ECB_KEY_SIZE],
 				     EXT4_AES_256_XTS_KEY_SIZE, NULL);
 	res = crypto_ablkcipher_encrypt(req);
 	if (res == -EINPROGRESS || res == -EBUSY) {
-		BUG_ON(req->base.data != &ecr);
 		wait_for_completion(&ecr.completion);
 		res = ecr.res;
 	}
@@ -208,7 +207,12 @@ retry:
 		goto out;
 	}
 	crypt_info->ci_keyring_key = keyring_key;
-	BUG_ON(keyring_key->type != &key_type_logon);
+	if (keyring_key->type != &key_type_logon) {
+		printk_once(KERN_WARNING
+			    "ext4: key type must be logon\n");
+		res = -ENOKEY;
+		goto out;
+	}
 	ukp = ((struct user_key_payload *)keyring_key->payload.data);
 	if (ukp->datalen != sizeof(struct ext4_encryption_key)) {
 		res = -EINVAL;
@@ -217,7 +221,13 @@ retry:
 	master_key = (struct ext4_encryption_key *)ukp->data;
 	BUILD_BUG_ON(EXT4_AES_128_ECB_KEY_SIZE !=
 		     EXT4_KEY_DERIVATION_NONCE_SIZE);
-	BUG_ON(master_key->size != EXT4_AES_256_XTS_KEY_SIZE);
+	if (master_key->size != EXT4_AES_256_XTS_KEY_SIZE) {
+		printk_once(KERN_WARNING
+			    "ext4: key size incorrect: %d\n",
+			    master_key->size);
+		res = -ENOKEY;
+		goto out;
+	}
 	res = ext4_derive_key_aes(ctx.nonce, master_key->raw,
 				  raw_key);
 	if (res)
diff --git a/fs/ext4/crypto_policy.c b/fs/ext4/crypto_policy.c
index a640ec2..ad05069 100644
--- a/fs/ext4/crypto_policy.c
+++ b/fs/ext4/crypto_policy.c
@@ -150,7 +150,8 @@ int ext4_is_child_context_consistent_with_parent(struct inode *parent,
 
 	if ((parent == NULL) || (child == NULL)) {
 		pr_err("parent %p child %p\n", parent, child);
-		BUG_ON(1);
+		WARN_ON(1);	/* Should never happen */
+		return 0;
 	}
 	/* no restrictions if the parent directory is not encrypted */
 	if (!ext4_encrypted_inode(parent))
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 5/5] ext4 crypto: fix bugs in ext4_encrypted_zeroout()
  2015-10-04  5:38 [PATCH 1/5] ext4 crypto: fix memory leak in ext4_bio_write_page() Theodore Ts'o
                   ` (2 preceding siblings ...)
  2015-10-04  5:38 ` [PATCH 4/5] ext4 crypto: replace some BUG_ON()'s with error checks Theodore Ts'o
@ 2015-10-04  5:38 ` Theodore Ts'o
  3 siblings, 0 replies; 5+ messages in thread
From: Theodore Ts'o @ 2015-10-04  5:38 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, stable

Fix multiple bugs in ext4_encrypted_zeroout(), including one that
could cause us to write an encrypted zero page to the wrong location
on disk, potentially causing data and file system corruption.
Fortunately, this tends to only show up in stress tests, but even with
these fixes, we are seeing some test failures with generic/127 --- but
these are now caused by data failures instead of metadata corruption.

Since ext4_encrypted_zeroout() is only used for some optimizations to
keep the extent tree from being too fragmented, and
ext4_encrypted_zeroout() itself isn't all that optimized from a time
or IOPS perspective, disable the extent tree optimization for
encrypted inodes for now.  This prevents the data corruption issues
reported by generic/127 until we can figure out what's going wrong.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
---
 fs/ext4/crypto.c  | 23 +++++++++++++++++++----
 fs/ext4/extents.c |  3 +++
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/crypto.c b/fs/ext4/crypto.c
index 879cb15..af06830 100644
--- a/fs/ext4/crypto.c
+++ b/fs/ext4/crypto.c
@@ -392,7 +392,13 @@ int ext4_encrypted_zeroout(struct inode *inode, struct ext4_extent *ex)
 	ext4_lblk_t		lblk = ex->ee_block;
 	ext4_fsblk_t		pblk = ext4_ext_pblock(ex);
 	unsigned int		len = ext4_ext_get_actual_len(ex);
-	int			err = 0;
+	int			ret, err = 0;
+
+#if 0
+	ext4_msg(inode->i_sb, KERN_CRIT,
+		 "ext4_encrypted_zeroout ino %lu lblk %u len %u",
+		 (unsigned long) inode->i_ino, lblk, len);
+#endif
 
 	BUG_ON(inode->i_sb->s_blocksize != PAGE_CACHE_SIZE);
 
@@ -418,17 +424,26 @@ int ext4_encrypted_zeroout(struct inode *inode, struct ext4_extent *ex)
 			goto errout;
 		}
 		bio->bi_bdev = inode->i_sb->s_bdev;
-		bio->bi_iter.bi_sector = pblk;
-		err = bio_add_page(bio, ciphertext_page,
+		bio->bi_iter.bi_sector =
+			pblk << (inode->i_sb->s_blocksize_bits - 9);
+		ret = bio_add_page(bio, ciphertext_page,
 				   inode->i_sb->s_blocksize, 0);
-		if (err) {
+		if (ret != inode->i_sb->s_blocksize) {
+			/* should never happen! */
+			ext4_msg(inode->i_sb, KERN_ERR,
+				 "bio_add_page failed: %d", ret);
+			WARN_ON(1);
 			bio_put(bio);
+			err = -EIO;
 			goto errout;
 		}
 		err = submit_bio_wait(WRITE, bio);
+		if ((err == 0) && bio->bi_error)
+			err = -EIO;
 		bio_put(bio);
 		if (err)
 			goto errout;
+		lblk++; pblk++;
 	}
 	err = 0;
 errout:
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 2553aa8..7f486e3 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3558,6 +3558,9 @@ static int ext4_ext_convert_to_initialized(handle_t *handle,
 		max_zeroout = sbi->s_extent_max_zeroout_kb >>
 			(inode->i_sb->s_blocksize_bits - 10);
 
+	if (ext4_encrypted_inode(inode))
+		max_zeroout = 0;
+
 	/* If extent is less than s_max_zeroout_kb, zeroout directly */
 	if (max_zeroout && (ee_len <= max_zeroout)) {
 		err = ext4_ext_zeroout(inode, ex);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-10-04  5:38 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-04  5:38 [PATCH 1/5] ext4 crypto: fix memory leak in ext4_bio_write_page() Theodore Ts'o
2015-10-04  5:38 ` [PATCH 2/5] ext4: optimize ext4_writepage() for attempted 4k delalloc writes Theodore Ts'o
2015-10-04  5:38 ` [PATCH 3/5] ext4 crypto: ext4_page_crypto() doesn't need a encryption context Theodore Ts'o
2015-10-04  5:38 ` [PATCH 4/5] ext4 crypto: replace some BUG_ON()'s with error checks Theodore Ts'o
2015-10-04  5:38 ` [PATCH 5/5] ext4 crypto: fix bugs in ext4_encrypted_zeroout() Theodore Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).