linux-fsdevel.vger.kernel.org archive mirror
From: Matthew Wilcox <willy@linux.intel.com>
To: Theodore Ts'o <tytso@mit.edu>,
	Matthew Wilcox <willy@linux.intel.com>,
	Dave Chinner <david@fromorbit.com>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Ross Zwisler <ross.zwisler@linux.intel.com>
Subject: Re: [PATCH v10 19/21] xip: Add xip_zero_page_range
Date: Mon, 8 Sep 2014 14:59:36 -0400
Message-ID: <20140908185936.GE27730@localhost.localdomain>
In-Reply-To: <20140904213641.GB4364@thunk.org>

On Thu, Sep 04, 2014 at 05:36:41PM -0400, Theodore Ts'o wrote:
> On Thu, Sep 04, 2014 at 05:08:02PM -0400, Matthew Wilcox wrote:
> > 
> > ext4 does (or did?) have this bug (expectation?).  I then take advantage
> > of the fact that we have to accommodate it, so there are now two places
> > that have to accommodate it.  I forget what the path was that has that
> > assumption, but xfstests used to display it.
> > 
> > I'm away this week (... bad timing), but I can certainly fix it elsewhere
> > in ext4 next week.
> 
> Huh?  Can you say more about what it is or was doing?  And where?
> 
> I tried to look for it, and I'm not seeing it, but I'm not entirely
> sure from your description whether I'm looking in the right place.

I wrote this patch:

diff --git a/fs/dax.c b/fs/dax.c
index 96c4fed..bdf6622 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -473,6 +473,7 @@ int dax_zero_page_range(struct inode *inode, loff_t from, unsigned length,
 	/* Block boundary? Nothing to do */
 	if (!length)
 		return 0;
+	BUG_ON((offset + length) > PAGE_CACHE_SIZE);
 
 	memset(&bh, 0, sizeof(bh));
 	bh.b_size = PAGE_CACHE_SIZE;
@@ -484,14 +485,31 @@ int dax_zero_page_range(struct inode *inode, loff_t from, unsigned length,
 		err = dax_get_addr(&bh, &addr, inode->i_blkbits);
 		if (err < 0)
 			return err;
-		/*
-		 * ext4 sometimes asks to zero past the end of a block.  It
-		 * really just wants to zero to the end of the block.
-		 */
-		length = min_t(unsigned, length, PAGE_CACHE_SIZE - offset);
 		memset(addr + offset, 0, length);
 	}
 
 	return 0;
 }
 EXPORT_SYMBOL_GPL(dax_zero_page_range);
+
+/**
+ * dax_truncate_page - handle a partial page being truncated in a DAX file
+ * @inode: The file being truncated
+ * @from: The file offset that is being truncated to
+ * @get_block: The filesystem method used to translate file offsets to blocks
+ *
+ * Similar to block_truncate_page(), this function can be called by a
+ * filesystem when it is truncating a DAX file to handle the partial page.
+ *
+ * We work in terms of PAGE_CACHE_SIZE here for commonality with
+ * block_truncate_page(), but we could go down to PAGE_SIZE if the filesystem
+ * took care of disposing of the unnecessary blocks.  Even if the filesystem
+ * block size is smaller than PAGE_SIZE, we have to zero the rest of the page
+ * since the file might be mmaped.
+ */
+int dax_truncate_page(struct inode *inode, loff_t from, get_block_t get_block)
+{
+	unsigned length = PAGE_CACHE_ALIGN(from) - from;
+	return dax_zero_page_range(inode, from, length, get_block);
+}
+EXPORT_SYMBOL_GPL(dax_truncate_page);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b0078df..d0182a5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2502,6 +2502,12 @@ static inline int dax_clear_blocks(struct inode *i, sector_t blk, long sz)
 	return 0;
 }
 
+static inline int dax_truncate_page(struct inode *inode, loff_t from,
+								get_block_t gb)
+{
+	return 0;
+}
+
 static inline int dax_zero_page_range(struct inode *inode, loff_t from,
 						unsigned len, get_block_t gb)
 {
@@ -2516,11 +2522,6 @@ static inline ssize_t dax_do_io(int rw, struct kiocb *iocb,
 }
 #endif
 
-/* Can't be a function because PAGE_CACHE_SIZE is defined in pagemap.h */
-#define dax_truncate_page(inode, from, get_block)	\
-	dax_zero_page_range(inode, from, PAGE_CACHE_SIZE, get_block)
-
-
 #ifdef CONFIG_BLOCK
 typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
 			    loff_t file_offset);
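
For illustration only, here is a userspace sketch of the length arithmetic
in the patch above. It is not kernel code: the sketch_* names are
hypothetical and the 4096-byte page size is an assumption. It shows why
PAGE_CACHE_ALIGN(from) - from never trips the new BUG_ON, while the old
macro's fixed PAGE_CACHE_SIZE length would whenever 'from' is not
page-aligned.

```c
#include <assert.h>

/* Assumption: 4096-byte page cache pages (the real value is per-arch). */
#define SKETCH_PAGE_SIZE 4096ULL

/* Userspace stand-in for PAGE_CACHE_ALIGN(): round up to a page boundary. */
static unsigned long long sketch_page_align(unsigned long long pos)
{
	return (pos + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Bytes from 'from' to the end of its page; 0 if 'from' is page-aligned. */
static unsigned sketch_truncate_length(unsigned long long from)
{
	unsigned long long len = sketch_page_align(from) - from;

	/* Postcondition: the result always fits inside one page. */
	assert(len < SKETCH_PAGE_SIZE);
	return (unsigned)len;
}

/* Same condition as the new BUG_ON: does offset + length leave the page? */
static int sketch_crosses_page(unsigned long long from, unsigned length)
{
	unsigned long long offset = from & (SKETCH_PAGE_SIZE - 1);

	return offset + length > SKETCH_PAGE_SIZE;
}
```

With from = 5000 the computed length is 3192, which ends exactly on the
page boundary; passing a full page from the same offset would cross it.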

With this patch applied, running generic/008 hit the new BUG_ON in
dax_zero_page_range():

[  506.752872] Call Trace:
[  506.752891]  [<ffffffffa02303cb>] ? __ext4_handle_dirty_metadata+0x9b/0x210 [ext4]
[  506.752910]  [<ffffffffa0200ffa>] ext4_block_zero_page_range+0x1ba/0x400 [ext4]
[  506.752930]  [<ffffffffa022f708>] ? ext4_fallocate+0x818/0xb70 [ext4]
[  506.752947]  [<ffffffffa020188e>] ext4_zero_partial_blocks+0xae/0xf0 [ext4]
[  506.752966]  [<ffffffffa022f719>] ext4_fallocate+0x829/0xb70 [ext4]
[  506.752980]  [<ffffffff811fee96>] do_fallocate+0x126/0x1b0
[  506.752992]  [<ffffffff811fef63>] SyS_fallocate+0x43/0x70

Someone appears to already know about this, since this code exists
in the current ext4_block_zero_page_range() [which I renamed to
__ext4_block_zero_page_range() in my patchset]:

        /*
         * correct length if it does not fall between
         * 'from' and the end of the block
         */
        if (length > max || length < 0)
                length = max;
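
A minimal userspace sketch of what that clamp computes, assuming a
4096-byte page and a power-of-two block size no larger than the page.
The sketch_clamp_to_block() name and its signature are hypothetical;
only the arithmetic is taken from the snippet above.

```c
#include <assert.h>

/* Assumption: 4096-byte page cache pages. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Return 'length' limited to the bytes between 'from' and the end of
 * the filesystem block containing 'from'.  Negative or oversized
 * lengths are replaced by that maximum, as in the ext4 snippet.
 */
static long long sketch_clamp_to_block(unsigned long long from,
				       long long length, unsigned blocksize)
{
	unsigned offset = (unsigned)(from & (SKETCH_PAGE_SIZE - 1));
	unsigned max;

	/* The mask arithmetic below only works for power-of-two sizes. */
	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
	assert(blocksize <= SKETCH_PAGE_SIZE);

	max = blocksize - (offset & (blocksize - 1));
	if (length > (long long)max || length < 0)
		length = max;
	return length;
}
```

Because a block never spans a page under these assumptions, a length
clamped this way also satisfies offset + length <= page size, which is
the property the BUG_ON checks.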

Applying the following patch on top of the DAX patchset and the above
patch fixes everything nicely, but does result in a small amount of
code duplication.

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index e71adf6..5edd903 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3231,7 +3231,7 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 {
 	ext4_fsblk_t index = from >> PAGE_CACHE_SHIFT;
 	unsigned offset = from & (PAGE_CACHE_SIZE-1);
-	unsigned blocksize, max, pos;
+	unsigned blocksize, pos;
 	ext4_lblk_t iblock;
 	struct inode *inode = mapping->host;
 	struct buffer_head *bh;
@@ -3244,14 +3244,6 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 		return -ENOMEM;
 
 	blocksize = inode->i_sb->s_blocksize;
-	max = blocksize - (offset & (blocksize - 1));
-
-	/*
-	 * correct length if it does not fall between
-	 * 'from' and the end of the block
-	 */
-	if (length > max || length < 0)
-		length = max;
 
 	iblock = index << (PAGE_CACHE_SHIFT - inode->i_sb->s_blocksize_bits);
 
@@ -3327,6 +3319,17 @@ static int ext4_block_zero_page_range(handle_t *handle,
 		struct address_space *mapping, loff_t from, loff_t length)
 {
 	struct inode *inode = mapping->host;
+	unsigned offset = from & (PAGE_CACHE_SIZE-1);
+	unsigned blocksize = inode->i_sb->s_blocksize;
+	unsigned max = blocksize - (offset & (blocksize - 1));
+
+	/*
+	 * correct length if it does not fall between
+	 * 'from' and the end of the block
+	 */
+	if (length > max || length < 0)
+		length = max;
+
 	if (IS_DAX(inode))
 		return dax_zero_page_range(inode, from, length, ext4_get_block);
 	return __ext4_block_zero_page_range(handle, mapping, from, length);
