All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mingming Cao <cmm@us.ibm.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: tytso@mit.edu, linux-ext4@vger.kernel.org
Subject: Re: [PATCH] ext4: When reading from fallocated blocks make sure we return zero.
Date: Mon, 18 Feb 2008 16:14:34 -0800	[thread overview]
Message-ID: <1203380074.3629.35.camel@localhost.localdomain> (raw)
In-Reply-To: <20080216032334.GA6501@skywalker>

On Sat, 2008-02-16 at 08:53 +0530, Aneesh Kumar K.V wrote:
> On Fri, Feb 15, 2008 at 11:43:04AM -0800, Mingming Cao wrote:
> > On Fri, 2008-02-15 at 23:46 +0530, Aneesh Kumar K.V wrote:
> > > fallocate blocks are considered as sparse area and read from them should
> > > return zero. ext4_ext_get_blocks should return zero for read request.
> > > 
> > 
> > The patch itself looks harmless, but I still don't see how this could
> > fix the problem you described at irc: a write hit a BUG_ON() in
> > fs/buffer.c saying the buffer is not mapped. Could you add more details
> > here?
> 
> Write will take the below call chain
> 
> ext4_write_begin
>   block_write_begin
>     __block_prepare_write
>        ext4_getblock
>          ext4_get_blocks_wrap
> (1)	   ext4_ext_get_blocks with create = 0 return allocated
>        ll_rw_block  if buffer not uptodate.
>          submit_bh
> 	   BUG_ON(!buffer_mapped(bh))
> 
> 
> ext4_ext_get_blocks at (1) should have returned 0. That would cause
> ext4_get_blocks_wrap to again call ext4_ext_get_blocks with create = 1
> and that would have returned us the buffer head which is mapped. This
> would also result in splitting the extent to initialized and
> uninitialized one.
> 

I see what is happening. Your fix seems treated preallocated blocks as
holes equally when get_blocks() with create = 0. This works for the
current case, but now I think the patch has side effect to delayed
allocation.  

How about the following patch?

Regards,
Mingming

ext4: ext4_get_blocks_wrap fix for writing to preallocated
From: Mingming Cao <cmm@us.ibm.com>

This patch fixed a issue with wrting to a preallocated blocks.
A write hit a BUG_ON() in fs/buffer.c saying the buffer is not mapped.

On the write path, ext4_get_block_wrap() is called with create=1, but it
will pass create=0 down to the underlying ext4ext_get_blocks()
to do a look up first.  In the preallocation case, ext4_ext_get_blocks()
with create = 0, will return number of blocks pre-allocated and buffer
head unmapped. ext4_get_blocks_wrap() thinks it succeeds too early, without
checking if it needs again call ext4_ext_get_blocks with create = 1
which would do proper handling for writing to preallocated blocks:
split the extent to initialized and uninitialized one and
returns the mapped buffer head.

Treating preallocated blocks as holes equally(i.e. ignoring the number of blocks
pre-allocated and returns 0) when get_blocks() is called with create = 0 is not enough.
ext4_ext_get_blocks() needs to differentiate these two cases for delayed allocation
purpose, as for holes it need to do reservation and prepare for later
delayed allocation, but for pre-allocated blocks it needs skip that work.

It would makes things more clear if we have clear definition of what
get_blocks() return value means.

Similar to ext4_get_blocks_handle(),  the following 
* return > 0, # of blocks already allocated
*          if these are pre-allocated blocks and create = 0
*             buffer head is unmapped
*          otherwise blocks are mapped.
*
* return = 0, if plain look up failed (blocks have not been allocated)
*          buffer head is unmapped
*
* return < 0, error case.

The for the write path, at ext4_ext_get_blocks_wrap(), it could check the
buffer_mapped() status for preallocated extent before quit too early.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>

---
 fs/ext4/extents.c |   13 +++++++++++++
 fs/ext4/inode.c   |   41 ++++++++++++++++++++++++++++++++++++++---
 2 files changed, 51 insertions(+), 3 deletions(-)

Index: linux-2.6.25-rc2/fs/ext4/extents.c
===================================================================
--- linux-2.6.25-rc2.orig/fs/ext4/extents.c	2008-02-18 15:39:52.000000000 -0800
+++ linux-2.6.25-rc2/fs/ext4/extents.c	2008-02-18 15:43:33.000000000 -0800
@@ -2285,9 +2285,22 @@ out:
 }
 
 /*
+ * Block allocation/map/preallocation routine for extents based files
+ *
+ *
  * Need to be called with
  * down_read(&EXT4_I(inode)->i_data_sem) if not allocating file system block
  * (ie, create is zero). Otherwise down_write(&EXT4_I(inode)->i_data_sem)
+ *
+ * return > 0, number of of blocks already mapped/allocated
+ *          if these are pre-allocated blocks, buffer head is unmapped if
+ *          create = 0 (look up only)
+ *          otherwise blocks are mapped
+ *
+ * return = 0, if plain look up failed (blocks have not been allocated)
+ *          buffer head is unmapped
+ *
+ * return < 0, error case.
  */
 int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
 			ext4_lblk_t iblock,
Index: linux-2.6.25-rc2/fs/ext4/inode.c
===================================================================
--- linux-2.6.25-rc2.orig/fs/ext4/inode.c	2008-02-18 15:07:00.000000000 -0800
+++ linux-2.6.25-rc2/fs/ext4/inode.c	2008-02-18 15:43:59.000000000 -0800
@@ -908,6 +908,26 @@ out:
  */
 #define DIO_CREDITS 25
 
+
+/*
+ * ext4 get_block() wrapper function
+ * It first do a look up, returns if the blocks already mapped. Otherwise
+ * it takes the write sem and do block allocation
+ *
+ * If file type is extents based, call with ext4_ext_get_blocks()
+ * Otherwise, call with ext4_get_blocks_handle() to handle indirect mapping
+ * based files
+ *
+ * return > 0, number of of blocks already mapped/allocated
+ *          if these are pre-allocated blocks, buffer head is unmapped if
+ *          create = 0 (look up only)
+ *          otherwise blocks are mapped
+ *
+ * return = 0, if plain look up failed (blocks have not been allocated)
+ *          buffer head is unmapped
+ *
+ * return < 0, error case.
+ */
 int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block,
 			unsigned long max_blocks, struct buffer_head *bh,
 			int create, int extend_disksize)
@@ -926,12 +946,27 @@ int ext4_get_blocks_wrap(handle_t *handl
 				inode, block, max_blocks, bh, 0, 0);
 	}
 	up_read((&EXT4_I(inode)->i_data_sem));
-	if (!create || (retval > 0))
+
+	/* If it is only a block(s) look up */
+	if (!create )
+		return retval;
+
+	/*
+	 * Returns if the blocks have already allocated
+	 *
+	 * Note that if blocks have been preallocated
+	 * ext4_ext_get_block() returns with buffer head unmapped.
+	 * Write to a preallocated space needs to split
+	 * the preallocated extents, thus needs to update
+	 * i_data
+	 */
+	if (retval > 0 && buffer_mapped(bh))
 		return retval;
 
 	/*
-	 * We need to allocate new blocks which will result
-	 * in i_data update
+	 * New blocks and preallocation handling will possiblely result
+	 * in i_data update, take the write sem, and call get_blocks()
+	 * with create = 1
 	 */
 	down_write((&EXT4_I(inode)->i_data_sem));
 	/*

  parent reply	other threads:[~2008-02-19  0:14 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-15 18:16 [PATCH] ext4: When reading from fallocated blocks make sure we return zero Aneesh Kumar K.V
2008-02-15 19:43 ` Mingming Cao
2008-02-16  3:23   ` Aneesh Kumar K.V
2008-02-18  7:45     ` Aneesh Kumar K.V
2008-02-19  0:14     ` Mingming Cao [this message]
2008-02-19  3:19       ` Aneesh Kumar K.V
2008-02-19  5:24         ` Mingming Cao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1203380074.3629.35.camel@localhost.localdomain \
    --to=cmm@us.ibm.com \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.