* [PATCH] ext4: When reading from fallocated blocks make sure we return zero. @ 2008-02-15 18:16 Aneesh Kumar K.V 2008-02-15 19:43 ` Mingming Cao 0 siblings, 1 reply; 7+ messages in thread From: Aneesh Kumar K.V @ 2008-02-15 18:16 UTC (permalink / raw) To: tytso, cmm; +Cc: linux-ext4, Aneesh Kumar K.V fallocate blocks are considered as sparse area and read from them should return zero. ext4_ext_get_blocks should return zero for read request. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> --- fs/ext4/extents.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 3efbfd1..5b22f71 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -2379,8 +2379,14 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode, } if (create == EXT4_CREATE_UNINITIALIZED_EXT) goto out; - if (!create) + if (!create) { + /* + * read request should return zero blocks + * allocated + */ + allocated = 0; goto out2; + } ret = ext4_ext_convert_to_initialized(handle, inode, path, iblock, -- 1.5.4.1.97.g40aab-dirty ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] ext4: When reading from fallocated blocks make sure we return zero. 2008-02-15 18:16 [PATCH] ext4: When reading from fallocated blocks make sure we return zero Aneesh Kumar K.V @ 2008-02-15 19:43 ` Mingming Cao 2008-02-16 3:23 ` Aneesh Kumar K.V 0 siblings, 1 reply; 7+ messages in thread From: Mingming Cao @ 2008-02-15 19:43 UTC (permalink / raw) To: Aneesh Kumar K.V; +Cc: tytso, linux-ext4 On Fri, 2008-02-15 at 23:46 +0530, Aneesh Kumar K.V wrote: > fallocate blocks are considered as sparse area and read from them should > return zero. ext4_ext_get_blocks should return zero for read request. > The patch itself looks harmless, but I still don't see how this could fix the problem you described at irc: a write hit a BUG_ON() in fs/buffer.c saying the buffer is not mapped. Could you add more details here? Thanks Mingming > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > --- > fs/ext4/extents.c | 8 +++++++- > 1 files changed, 7 insertions(+), 1 deletions(-) > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c > index 3efbfd1..5b22f71 100644 > --- a/fs/ext4/extents.c > +++ b/fs/ext4/extents.c > @@ -2379,8 +2379,14 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode, > } > if (create == EXT4_CREATE_UNINITIALIZED_EXT) > goto out; > - if (!create) > + if (!create) { > + /* > + * read request should return zero blocks > + * allocated > + */ > + allocated = 0; > goto out2; > + } > > ret = ext4_ext_convert_to_initialized(handle, inode, > path, iblock, ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] ext4: When reading from fallocated blocks make sure we return zero. 2008-02-15 19:43 ` Mingming Cao @ 2008-02-16 3:23 ` Aneesh Kumar K.V 2008-02-18 7:45 ` Aneesh Kumar K.V 2008-02-19 0:14 ` Mingming Cao 0 siblings, 2 replies; 7+ messages in thread From: Aneesh Kumar K.V @ 2008-02-16 3:23 UTC (permalink / raw) To: Mingming Cao; +Cc: tytso, linux-ext4 On Fri, Feb 15, 2008 at 11:43:04AM -0800, Mingming Cao wrote: > On Fri, 2008-02-15 at 23:46 +0530, Aneesh Kumar K.V wrote: > > fallocate blocks are considered as sparse area and read from them should > > return zero. ext4_ext_get_blocks should return zero for read request. > > > > The patch itself looks harmless, but I still don't see how this could > fix the problem you described at irc: a write hit a BUG_ON() in > fs/buffer.c saying the buffer is not mapped. Could you add more details > here? Write will take the below call chain ext4_write_begin block_write_begin __block_prepare_write ext4_getblock ext4_get_blocks_wrap (1) ext4_ext_get_blocks with create = 0 return allocated ll_rw_block if buffer not uptodate. submit_bh BUG_ON(!buffer_mapped(bh)) ext4_ext_get_blocks at (1) should have returned 0. That would cause ext4_get_blocks_wrap to again call ext4_ext_get_blocks with create = 1 and that would have returned us the buffer head which is mapped. This would also result in splitting the extent to initialized and uninitialized one. -aneesh ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] ext4: When reading from fallocated blocks make sure we return zero. 2008-02-16 3:23 ` Aneesh Kumar K.V @ 2008-02-18 7:45 ` Aneesh Kumar K.V 2008-02-19 0:14 ` Mingming Cao 1 sibling, 0 replies; 7+ messages in thread From: Aneesh Kumar K.V @ 2008-02-18 7:45 UTC (permalink / raw) To: Mingming Cao; +Cc: tytso, linux-ext4 On Sat, Feb 16, 2008 at 08:53:34AM +0530, Aneesh Kumar K.V wrote: > On Fri, Feb 15, 2008 at 11:43:04AM -0800, Mingming Cao wrote: > > On Fri, 2008-02-15 at 23:46 +0530, Aneesh Kumar K.V wrote: > > > fallocate blocks are considered as sparse area and read from them should > > > return zero. ext4_ext_get_blocks should return zero for read request. > > > > > > > The patch itself looks harmless, but I still don't see how this could > > fix the problem you described at irc: a write hit a BUG_ON() in > > fs/buffer.c saying the buffer is not mapped. Could you add more details > > here? > > Write will take the below call chain > > ext4_write_begin > block_write_begin > __block_prepare_write > ext4_getblock > ext4_get_blocks_wrap > (1) ext4_ext_get_blocks with create = 0 return allocated > ll_rw_block if buffer not uptodate. > submit_bh > BUG_ON(!buffer_mapped(bh)) > > > ext4_ext_get_blocks at (1) should have returned 0. That would cause > ext4_get_blocks_wrap to again call ext4_ext_get_blocks with create = 1 > and that would have returned us the buffer head which is mapped. This > would also result in splitting the extent to initialized and > uninitialized one. > The change is also needed to get mmap on fallocate space to work. ------------[ cut here ]------------ WARNING: at fs/buffer.c:1680 __block_write_full_page+0x101/0x2f1() Modules linked in: Pid: 2478, comm: mmaptest Not tainted 2.6.25-rc1 #12 [<c0120e84>] warn_on_slowpath+0x41/0x51 [<c0108c00>] ? native_sched_clock+0x2d/0x9f [<c013b44e>] ? __lock_acquire+0xacb/0xb13 [<c013b44e>] ? __lock_acquire+0xacb/0xb13 [<c0180f97>] __block_write_full_page+0x101/0x2f1 [<c01d053f>] ? ext4_get_block+0x0/0xc0 [<c018124f>] block_write_full_page+0xc8/0xd1 [<c01d053f>] ? ext4_get_block+0x0/0xc0 [<c01d1a36>] ext4_ordered_writepage+0xad/0x146 [<c01cec12>] ? bget_one+0x0/0xb [<c014c5dd>] __writepage+0xb/0x25 [<c014cab2>] write_cache_pages+0x180/0x287 [<c014c5d2>] ? __writepage+0x0/0x25 [<c01527d5>] ? __do_fault+0x2e2/0x324 [<c0147889>] ? unlock_page+0x25/0x28 [<c014cbd6>] generic_writepages+0x1d/0x27 [<c014cc0c>] do_writepages+0x2c/0x34 [<c0147fe1>] __filemap_fdatawrite_range+0x5b/0x67 [<c01481ba>] filemap_fdatawrite+0x15/0x17 [<c017ea3d>] do_fsync+0x2c/0x9a [<c017eacb>] __do_fsync+0x20/0x2f [<c017eaf9>] sys_fsync+0xd/0xf [<c0104992>] sysenter_past_esp+0x5f/0xa5 ======================= ---[ end trace 5ba60b430e0af601 ]--- -aneesh ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] ext4: When reading from fallocated blocks make sure we return zero. 2008-02-16 3:23 ` Aneesh Kumar K.V 2008-02-18 7:45 ` Aneesh Kumar K.V @ 2008-02-19 0:14 ` Mingming Cao 2008-02-19 3:19 ` Aneesh Kumar K.V 1 sibling, 1 reply; 7+ messages in thread From: Mingming Cao @ 2008-02-19 0:14 UTC (permalink / raw) To: Aneesh Kumar K.V; +Cc: tytso, linux-ext4 On Sat, 2008-02-16 at 08:53 +0530, Aneesh Kumar K.V wrote: > On Fri, Feb 15, 2008 at 11:43:04AM -0800, Mingming Cao wrote: > > On Fri, 2008-02-15 at 23:46 +0530, Aneesh Kumar K.V wrote: > > > fallocate blocks are considered as sparse area and read from them should > > > return zero. ext4_ext_get_blocks should return zero for read request. > > > > > > > The patch itself looks harmless, but I still don't see how this could > > fix the problem you described at irc: a write hit a BUG_ON() in > > fs/buffer.c saying the buffer is not mapped. Could you add more details > > here? > > Write will take the below call chain > > ext4_write_begin > block_write_begin > __block_prepare_write > ext4_getblock > ext4_get_blocks_wrap > (1) ext4_ext_get_blocks with create = 0 return allocated > ll_rw_block if buffer not uptodate. > submit_bh > BUG_ON(!buffer_mapped(bh)) > > > ext4_ext_get_blocks at (1) should have returned 0. That would cause > ext4_get_blocks_wrap to again call ext4_ext_get_blocks with create = 1 > and that would have returned us the buffer head which is mapped. This > would also result in splitting the extent to initialized and > uninitialized one. > I see what is happening. Your fix seems treated preallocated blocks as holes equally when get_blocks() with create = 0. This works for the current case, but now I think the patch has side effect to delayed allocation. How about the following patch? Regards, Mingming ext4: ext4_get_blocks_wrap fix for writing to preallocated From: Mingming Cao <cmm@us.ibm.com> This patch fixed a issue with wrting to a preallocated blocks. A write hit a BUG_ON() in fs/buffer.c saying the buffer is not mapped. On the write path, ext4_get_block_wrap() is called with create=1, but it will pass create=0 down to the underlying ext4ext_get_blocks() to do a look up first. In the preallocation case, ext4_ext_get_blocks() with create = 0, will return number of blocks pre-allocated and buffer head unmapped. ext4_get_blocks_wrap() thinks it succeeds too early, without checking if it needs again call ext4_ext_get_blocks with create = 1 which would do proper handling for writing to preallocated blocks: split the extent to initialized and uninitialized one and returns the mapped buffer head. Treating preallocated blocks as holes equally(i.e. ignoring the number of blocks pre-allocated and returns 0) when get_blocks() is called with create = 0 is not enough. ext4_ext_get_blocks() needs to differentiate these two cases for delayed allocation purpose, as for holes it need to do reservation and prepare for later delayed allocation, but for pre-allocated blocks it needs skip that work. It would makes things more clear if we have clear definition of what get_blocks() return value means. Similar to ext4_get_blocks_handle(), the following * return > 0, # of blocks already allocated * if these are pre-allocated blocks and create = 0 * buffer head is unmapped * otherwise blocks are mapped. * * return = 0, if plain look up failed (blocks have not been allocated) * buffer head is unmapped * * return < 0, error case. The for the write path, at ext4_ext_get_blocks_wrap(), it could check the buffer_mapped() status for preallocated extent before quit too early. Signed-off-by: Mingming Cao <cmm@us.ibm.com> --- fs/ext4/extents.c | 13 +++++++++++++ fs/ext4/inode.c | 41 ++++++++++++++++++++++++++++++++++++++--- 2 files changed, 51 insertions(+), 3 deletions(-) Index: linux-2.6.25-rc2/fs/ext4/extents.c =================================================================== --- linux-2.6.25-rc2.orig/fs/ext4/extents.c 2008-02-18 15:39:52.000000000 -0800 +++ linux-2.6.25-rc2/fs/ext4/extents.c 2008-02-18 15:43:33.000000000 -0800 @@ -2285,9 +2285,22 @@ out: } /* + * Block allocation/map/preallocation routine for extents based files + * + * * Need to be called with * down_read(&EXT4_I(inode)->i_data_sem) if not allocating file system block * (ie, create is zero). Otherwise down_write(&EXT4_I(inode)->i_data_sem) + * + * return > 0, number of of blocks already mapped/allocated + * if these are pre-allocated blocks, buffer head is unmapped if + * create = 0 (look up only) + * otherwise blocks are mapped + * + * return = 0, if plain look up failed (blocks have not been allocated) + * buffer head is unmapped + * + * return < 0, error case. */ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode, ext4_lblk_t iblock, Index: linux-2.6.25-rc2/fs/ext4/inode.c =================================================================== --- linux-2.6.25-rc2.orig/fs/ext4/inode.c 2008-02-18 15:07:00.000000000 -0800 +++ linux-2.6.25-rc2/fs/ext4/inode.c 2008-02-18 15:43:59.000000000 -0800 @@ -908,6 +908,26 @@ out: */ #define DIO_CREDITS 25 + +/* + * ext4 get_block() wrapper function + * It first do a look up, returns if the blocks already mapped. Otherwise + * it takes the write sem and do block allocation + * + * If file type is extents based, call with ext4_ext_get_blocks() + * Otherwise, call with ext4_get_blocks_handle() to handle indirect mapping + * based files + * + * return > 0, number of of blocks already mapped/allocated + * if these are pre-allocated blocks, buffer head is unmapped if + * create = 0 (look up only) + * otherwise blocks are mapped + * + * return = 0, if plain look up failed (blocks have not been allocated) + * buffer head is unmapped + * + * return < 0, error case. + */ int ext4_get_blocks_wrap(handle_t *handle, struct inode *inode, sector_t block, unsigned long max_blocks, struct buffer_head *bh, int create, int extend_disksize) @@ -926,12 +946,27 @@ int ext4_get_blocks_wrap(handle_t *handl inode, block, max_blocks, bh, 0, 0); } up_read((&EXT4_I(inode)->i_data_sem)); - if (!create || (retval > 0)) + + /* If it is only a block(s) look up */ + if (!create ) + return retval; + + /* + * Returns if the blocks have already allocated + * + * Note that if blocks have been preallocated + * ext4_ext_get_block() returns with buffer head unmapped. + * Write to a preallocated space needs to split + * the preallocated extents, thus needs to update + * i_data + */ + if (retval > 0 && buffer_mapped(bh)) return retval; /* - * We need to allocate new blocks which will result - * in i_data update + * New blocks and preallocation handling will possiblely result + * in i_data update, take the write sem, and call get_blocks() + * with create = 1 */ down_write((&EXT4_I(inode)->i_data_sem)); /* ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] ext4: When reading from fallocated blocks make sure we return zero. 2008-02-19 0:14 ` Mingming Cao @ 2008-02-19 3:19 ` Aneesh Kumar K.V 2008-02-19 5:24 ` Mingming Cao 0 siblings, 1 reply; 7+ messages in thread From: Aneesh Kumar K.V @ 2008-02-19 3:19 UTC (permalink / raw) To: Mingming Cao; +Cc: tytso, linux-ext4 On Mon, Feb 18, 2008 at 04:14:34PM -0800, Mingming Cao wrote: > On Sat, 2008-02-16 at 08:53 +0530, Aneesh Kumar K.V wrote: > > How about the following patch? > > Regards, > Mingming > > ext4: ext4_get_blocks_wrap fix for writing to preallocated > From: Mingming Cao <cmm@us.ibm.com> > > This patch fixed a issue with wrting to a preallocated blocks. > A write hit a BUG_ON() in fs/buffer.c saying the buffer is not mapped. > > On the write path, ext4_get_block_wrap() is called with create=1, but it > will pass create=0 down to the underlying ext4ext_get_blocks() > to do a look up first. In the preallocation case, ext4_ext_get_blocks() > with create = 0, will return number of blocks pre-allocated and buffer > head unmapped. ext4_get_blocks_wrap() thinks it succeeds too early, without > checking if it needs again call ext4_ext_get_blocks with create = 1 > which would do proper handling for writing to preallocated blocks: > split the extent to initialized and uninitialized one and > returns the mapped buffer head. > > Treating preallocated blocks as holes equally(i.e. ignoring the number of blocks > pre-allocated and returns 0) when get_blocks() is called with create = 0 is not enough. > ext4_ext_get_blocks() needs to differentiate these two cases for delayed allocation > purpose, as for holes it need to do reservation and prepare for later > delayed allocation, but for pre-allocated blocks it needs skip that work. > > It would makes things more clear if we have clear definition of what > get_blocks() return value means. > > Similar to ext4_get_blocks_handle(), the following > * return > 0, # of blocks already allocated > * if these are pre-allocated blocks and create = 0 > * buffer head is unmapped > * otherwise blocks are mapped. > * > * return = 0, if plain look up failed (blocks have not been allocated) > * buffer head is unmapped > * > * return < 0, error case. > > The for the write path, at ext4_ext_get_blocks_wrap(), it could check the > buffer_mapped() status for preallocated extent before quit too early. > > Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.co> I guess we also need to make sure the buffer head have the mapped bit set. Something like the patch below. diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index bc7081f..69ccda9 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -2294,6 +2294,7 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode, struct ext4_allocation_request ar; __clear_bit(BH_New, &bh_result->b_state); + __clear_bit(BH_Mapped, &bh_result->b_state); ext_debug("blocks %u/%lu requested for inode %u\n", iblock, max_blocks, inode->i_ino); ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] ext4: When reading from fallocated blocks make sure we return zero. 2008-02-19 3:19 ` Aneesh Kumar K.V @ 2008-02-19 5:24 ` Mingming Cao 0 siblings, 0 replies; 7+ messages in thread From: Mingming Cao @ 2008-02-19 5:24 UTC (permalink / raw) To: Aneesh Kumar K.V; +Cc: tytso, linux-ext4 On Tue, 2008-02-19 at 08:49 +0530, Aneesh Kumar K.V wrote: > On Mon, Feb 18, 2008 at 04:14:34PM -0800, Mingming Cao wrote: > > On Sat, 2008-02-16 at 08:53 +0530, Aneesh Kumar K.V wrote: > > > > How about the following patch? > > > > Regards, > > Mingming > > > > ext4: ext4_get_blocks_wrap fix for writing to preallocated > > From: Mingming Cao <cmm@us.ibm.com> > > > > This patch fixed a issue with wrting to a preallocated blocks. > > A write hit a BUG_ON() in fs/buffer.c saying the buffer is not mapped. > > > > On the write path, ext4_get_block_wrap() is called with create=1, but it > > will pass create=0 down to the underlying ext4ext_get_blocks() > > to do a look up first. In the preallocation case, ext4_ext_get_blocks() > > with create = 0, will return number of blocks pre-allocated and buffer > > head unmapped. ext4_get_blocks_wrap() thinks it succeeds too early, without > > checking if it needs again call ext4_ext_get_blocks with create = 1 > > which would do proper handling for writing to preallocated blocks: > > split the extent to initialized and uninitialized one and > > returns the mapped buffer head. > > > > Treating preallocated blocks as holes equally(i.e. ignoring the number of blocks > > pre-allocated and returns 0) when get_blocks() is called with create = 0 is not enough. > > ext4_ext_get_blocks() needs to differentiate these two cases for delayed allocation > > purpose, as for holes it need to do reservation and prepare for later > > delayed allocation, but for pre-allocated blocks it needs skip that work. > > > > It would makes things more clear if we have clear definition of what > > get_blocks() return value means. > > > > Similar to ext4_get_blocks_handle(), the following > > * return > 0, # of blocks already allocated > > * if these are pre-allocated blocks and create = 0 > > * buffer head is unmapped > > * otherwise blocks are mapped. > > * > > * return = 0, if plain look up failed (blocks have not been allocated) > > * buffer head is unmapped > > * > > * return < 0, error case. > > > > The for the write path, at ext4_ext_get_blocks_wrap(), it could check the > > buffer_mapped() status for preallocated extent before quit too early. > > > > Signed-off-by: Mingming Cao <cmm@us.ibm.com> > > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.co> > > > > I guess we also need to make sure the buffer head have the mapped bit > set. Something like the patch below. > Good point. I modified the patch with clear_buffer_mapped() added at the begining of the wrapper function. Mingming > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c > index bc7081f..69ccda9 100644 > --- a/fs/ext4/extents.c > +++ b/fs/ext4/extents.c > @@ -2294,6 +2294,7 @@ int ext4_ext_get_blocks(handle_t *handle, struct inode *inode, > struct ext4_allocation_request ar; > > __clear_bit(BH_New, &bh_result->b_state); > + __clear_bit(BH_Mapped, &bh_result->b_state); > ext_debug("blocks %u/%lu requested for inode %u\n", > iblock, max_blocks, inode->i_ino); > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-02-19 5:24 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-02-15 18:16 [PATCH] ext4: When reading from fallocated blocks make sure we return zero Aneesh Kumar K.V 2008-02-15 19:43 ` Mingming Cao 2008-02-16 3:23 ` Aneesh Kumar K.V 2008-02-18 7:45 ` Aneesh Kumar K.V 2008-02-19 0:14 ` Mingming Cao 2008-02-19 3:19 ` Aneesh Kumar K.V 2008-02-19 5:24 ` Mingming Cao
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox