* [PATCH 1/4] ext4: Use tag dirty lookup during mpage_da_submit_io
@ 2008-10-11 19:04 Aneesh Kumar K.V
  2008-10-11 19:04 ` [PATCH 2/4] vfs: Remove the range_cont writeback mode Aneesh Kumar K.V
  0 siblings, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2008-10-11 19:04 UTC
  To: cmm, tytso, sandeen, npiggin; +Cc: linux-ext4, Aneesh Kumar K.V

This enables us to drop the range_cont writeback mode
use from ext4_da_writepages.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/ext4/inode.c |   30 +++++++++++++-----------------
 1 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 7c2820e..cba7960 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1656,17 +1656,23 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
 
 	while (index <= end) {
 		/* XXX: optimize tail */
-		nr_pages = pagevec_lookup(&pvec, mapping, index, PAGEVEC_SIZE);
+		/*
+		 * We can use PAGECACHE_TAG_DIRTY lookup here because
+		 * even though we have cleared the dirty flag on the page
+		 * We still keep the page in the radix tree with tag
+		 * PAGECACHE_TAG_DIRTY. See clear_page_dirty_for_io.
+		 * The PAGECACHE_TAG_DIRTY is cleared in set_page_writeback
+		 * which is called via the below writepage callback.
+		 */
+		nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
+					PAGECACHE_TAG_DIRTY,
+					min(end - index,
+					(pgoff_t)PAGEVEC_SIZE-1) + 1);
 		if (nr_pages == 0)
 			break;
 		for (i = 0; i < nr_pages; i++) {
 			struct page *page = pvec.pages[i];
 
-			index = page->index;
-			if (index > end)
-				break;
-			index++;
-
 			err = mapping->a_ops->writepage(page, mpd->wbc);
 			if (!err)
 				mpd->pages_written++;
@@ -2361,7 +2367,6 @@ static int ext4_da_writepages(struct address_space *mapping,
 				 struct writeback_control *wbc)
 {
 	handle_t *handle = NULL;
-	loff_t range_start = 0;
 	struct mpage_da_data mpd;
 	struct inode *inode = mapping->host;
 	int needed_blocks, ret = 0, nr_to_writebump = 0;
@@ -2386,14 +2391,7 @@ static int ext4_da_writepages(struct address_space *mapping,
 		wbc->nr_to_write = sbi->s_mb_stream_request;
 	}
 
-	if (!wbc->range_cyclic)
-		/*
-		 * If range_cyclic is not set force range_cont
-		 * and save the old writeback_index
-		 */
-		wbc->range_cont = 1;
 
-	range_start = wbc->range_start;
 	pages_skipped = wbc->pages_skipped;
 
 	mpd.wbc = wbc;
@@ -2452,9 +2450,8 @@ static int ext4_da_writepages(struct address_space *mapping,
 		wbc->nr_to_write = to_write;
 	}
 
-	if (wbc->range_cont && (pages_skipped != wbc->pages_skipped)) {
+	if (!wbc->range_cyclic && (pages_skipped != wbc->pages_skipped)) {
 		/* We skipped pages in this loop */
-		wbc->range_start = range_start;
 		wbc->nr_to_write = to_write +
 				wbc->pages_skipped - pages_skipped;
 		wbc->pages_skipped = pages_skipped;
@@ -2463,7 +2460,6 @@ static int ext4_da_writepages(struct address_space *mapping,
 
 out_writepages:
 	wbc->nr_to_write = to_write - nr_to_writebump;
-	wbc->range_start = range_start;
 	return ret;
 }
 
-- 
1.6.0.2.514.g23abd3

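A note on the lookup bound in the first hunk of this patch: the count handed to pagevec_lookup_tag() is min(end - index, PAGEVEC_SIZE-1) + 1, which clamps first and adds one afterwards. The user-space sketch below only reproduces that arithmetic. The helper name lookup_batch and the value 14 for PAGEVEC_SIZE are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>

/* Assumed value for illustration; the kernel defines its own PAGEVEC_SIZE. */
#define PAGEVEC_SIZE 14UL

/*
 * Hypothetical helper: number of pages to request for the inclusive
 * page range [index, end], capped at one pagevec.
 */
unsigned long lookup_batch(unsigned long index, unsigned long end)
{
	unsigned long span;

	assert(index <= end);		/* same condition as the while loop */
	span = end - index;		/* pages in the range, minus one */
	if (span > PAGEVEC_SIZE - 1)
		span = PAGEVEC_SIZE - 1;
	return span + 1;		/* adding 1 after the clamp cannot wrap */
}
```

With these assumptions, a range of three pages yields a request for 3, and any range longer than a pagevec yields 14, including the extreme case where end - index is the largest unsigned long.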
* [PATCH 2/4] vfs: Remove the range_cont writeback mode.
  2008-10-11 19:04 [PATCH 1/4] ext4: Use tag dirty lookup during mpage_da_submit_io Aneesh Kumar K.V
@ 2008-10-11 19:04 ` Aneesh Kumar K.V
  2008-10-11 19:04 ` [PATCH 3/4] vfs: Add no_nrwrite_update and no_index_update writeback control flags Aneesh Kumar K.V
  0 siblings, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2008-10-11 19:04 UTC
  To: cmm, tytso, sandeen, npiggin; +Cc: linux-ext4, Aneesh Kumar K.V

Ext4 was the only user of range_cont writeback mode and ext4
switched to a different method. So remove the range_cont mode
which is not used in the kernel.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 include/linux/writeback.h |    1 -
 mm/page-writeback.c       |    2 --
 2 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 12b15c5..bd91987 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -63,7 +63,6 @@ struct writeback_control {
 	unsigned for_writepages:1;	/* This is a writepages() call */
 	unsigned range_cyclic:1;	/* range_start is cyclic */
 	unsigned more_io:1;		/* more io to be dispatched */
-	unsigned range_cont:1;
 };
 
 /*
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 24de8b6..718efa6 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -961,8 +961,6 @@ int write_cache_pages(struct address_space *mapping,
 	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
 		mapping->writeback_index = index;
 
-	if (wbc->range_cont)
-		wbc->range_start = index << PAGE_CACHE_SHIFT;
 	return ret;
 }
 EXPORT_SYMBOL(write_cache_pages);
-- 
1.6.0.2.514.g23abd3

* [PATCH 3/4] vfs: Add no_nrwrite_update and no_index_update writeback control flags
  2008-10-11 19:04 ` [PATCH 2/4] vfs: Remove the range_cont writeback mode Aneesh Kumar K.V
@ 2008-10-11 19:04 ` Aneesh Kumar K.V
  2008-10-11 19:04 ` [PATCH 4/4] ext4: Fix file fragmentation during large file write Aneesh Kumar K.V
  0 siblings, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2008-10-11 19:04 UTC
  To: cmm, tytso, sandeen, npiggin; +Cc: linux-ext4, Aneesh Kumar K.V

If no_nrwrite_update is set we don't update nr_to_write in
write_cache_pages. Similarly, if no_index_update is set we don't
update the address space writeback_index. These changes enable a
file system to skip these updates in write_cache_pages and do them
in the writepages() callback. This patch will be followed by an
ext4 patch that makes use of these new flags.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 include/linux/writeback.h |    4 ++++
 mm/page-writeback.c       |    9 +++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index bd91987..b04287e 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -63,6 +63,10 @@ struct writeback_control {
 	unsigned for_writepages:1;	/* This is a writepages() call */
 	unsigned range_cyclic:1;	/* range_start is cyclic */
 	unsigned more_io:1;		/* more io to be dispatched */
+
+	/* write_cache_pages() control */
+	unsigned no_nrwrite_update:1;	/* don't update nr_to_write */
+	unsigned no_index_update:1;	/* don't update writeback_index */
 };
 
 /*
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 718efa6..4f359f4 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -876,6 +876,7 @@ int write_cache_pages(struct address_space *mapping,
 	pgoff_t end;		/* Inclusive */
 	int scanned = 0;
 	int range_whole = 0;
+	long nr_to_write = wbc->nr_to_write;
 
 	if (wbc->nonblocking && bdi_write_congested(bdi)) {
 		wbc->encountered_congestion = 1;
@@ -939,7 +940,7 @@ int write_cache_pages(struct address_space *mapping,
 				unlock_page(page);
 				ret = 0;
 			}
-			if (ret || (--(wbc->nr_to_write) <= 0))
+			if (ret || (--nr_to_write <= 0))
 				done = 1;
 			if (wbc->nonblocking && bdi_write_congested(bdi)) {
 				wbc->encountered_congestion = 1;
@@ -958,8 +959,12 @@ int write_cache_pages(struct address_space *mapping,
 		index = 0;
 		goto retry;
 	}
-	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
+	if (!wbc->no_index_update &&
+		(wbc->range_cyclic || (range_whole && nr_to_write > 0))) {
 		mapping->writeback_index = index;
+	}
+	if (!wbc->no_nrwrite_update)
+		wbc->nr_to_write = nr_to_write;
 
 	return ret;
 }
-- 
1.6.0.2.514.g23abd3

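The effect of no_nrwrite_update in this patch can be modelled outside the kernel: the loop works on a local copy of the budget and publishes it back only when the flag is clear. The sketch below is a user-space model under stated assumptions. The struct wb_ctl and the function write_pages_sketch are hypothetical stand-ins, and "writing" a page is reduced to counting it.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for struct writeback_control. */
struct wb_ctl {
	long nr_to_write;		/* caller's page budget */
	unsigned no_nrwrite_update:1;	/* don't publish the local budget */
};

/*
 * Pretend to write up to `dirty` pages and return how many were written.
 * The budget is tracked in a local variable, as in the patch, and copied
 * back to wbc only when no_nrwrite_update is clear.
 */
long write_pages_sketch(struct wb_ctl *wbc, long dirty)
{
	long nr_to_write;
	long written = 0;
	int done = 0;

	assert(wbc);
	assert(dirty >= 0);
	nr_to_write = wbc->nr_to_write;	/* local working copy */

	while (!done && written < dirty) {
		written++;		/* stands in for the writepage call */
		if (--nr_to_write <= 0)
			done = 1;
	}
	if (!wbc->no_nrwrite_update)
		wbc->nr_to_write = nr_to_write;
	return written;
}
```

In this model a caller that sets the flag sees its nr_to_write unchanged after the call and has to do its own accounting from the return value, which is the division of labour the commit message describes.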
* [PATCH 4/4] ext4: Fix file fragmentation during large file write.
  2008-10-11 19:04 ` [PATCH 3/4] vfs: Add no_nrwrite_update and no_index_update writeback control flags Aneesh Kumar K.V
@ 2008-10-11 19:04 ` Aneesh Kumar K.V
  2008-10-12 20:31 ` Dmitri Monakhov
  0 siblings, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2008-10-11 19:04 UTC
  To: cmm, tytso, sandeen, npiggin; +Cc: linux-ext4, Aneesh Kumar K.V

The range_cyclic writeback mode uses the address_space writeback_index
as the start index for writeback. With delayed allocation we were
updating writeback_index wrongly resulting in highly fragmented file.
Number of extents reduced from 4000 to 27 for a 3GB file with the below
patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/ext4/inode.c |   83 +++++++++++++++++++++++++++++++++----------------------
 1 files changed, 50 insertions(+), 33 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index cba7960..844c136 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1648,6 +1648,7 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
 	int ret = 0, err, nr_pages, i;
 	unsigned long index, end;
 	struct pagevec pvec;
+	long pages_skipped;
 
 	BUG_ON(mpd->next_page <= mpd->first_page);
 	pagevec_init(&pvec, 0);
@@ -1655,7 +1656,6 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
 	end = mpd->next_page - 1;
 
 	while (index <= end) {
-		/* XXX: optimize tail */
 		/*
 		 * We can use PAGECACHE_TAG_DIRTY lookup here because
 		 * even though we have cleared the dirty flag on the page
@@ -1673,8 +1673,13 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
 		for (i = 0; i < nr_pages; i++) {
 			struct page *page = pvec.pages[i];
 
+			pages_skipped = mpd->wbc->pages_skipped;
 			err = mapping->a_ops->writepage(page, mpd->wbc);
-			if (!err)
+			if (!err && (pages_skipped == mpd->wbc->pages_skipped))
+				/*
+				 * have successfully written the page
+				 * without skipping the same
+				 */
 				mpd->pages_written++;
 			/*
 			 * In error case, we have to continue because
@@ -2110,7 +2115,6 @@ static int mpage_da_writepages(struct address_space *mapping,
 				struct writeback_control *wbc,
 				struct mpage_da_data *mpd)
 {
-	long to_write;
 	int ret;
 
 	if (!mpd->get_block)
@@ -2125,10 +2129,7 @@ static int mpage_da_writepages(struct address_space *mapping,
 	mpd->pages_written = 0;
 	mpd->retval = 0;
 
-	to_write = wbc->nr_to_write;
-
 	ret = write_cache_pages(mapping, wbc, __mpage_da_writepage, mpd);
-
 	/*
 	 * Handle last extent of pages
 	 */
@@ -2137,7 +2138,7 @@ static int mpage_da_writepages(struct address_space *mapping,
 		mpage_da_submit_io(mpd);
 	}
 
-	wbc->nr_to_write = to_write - mpd->pages_written;
+	wbc->nr_to_write -= mpd->pages_written;
 	return ret;
 }
 
@@ -2366,11 +2367,14 @@ static int ext4_da_writepages_trans_blocks(struct inode *inode)
 static int ext4_da_writepages(struct address_space *mapping,
 			      struct writeback_control *wbc)
 {
+	pgoff_t index;
+	int range_whole = 0;
 	handle_t *handle = NULL;
+	long pages_written = 0;
 	struct mpage_da_data mpd;
 	struct inode *inode = mapping->host;
+	int no_nrwrite_update, no_index_update;
 	int needed_blocks, ret = 0, nr_to_writebump = 0;
-	long to_write, pages_skipped = 0;
 	struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb);
 
 	/*
@@ -2390,16 +2394,27 @@ static int ext4_da_writepages(struct address_space *mapping,
 		nr_to_writebump = sbi->s_mb_stream_request - wbc->nr_to_write;
 		wbc->nr_to_write = sbi->s_mb_stream_request;
 	}
+	if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
+		range_whole = 1;
 
-
-	pages_skipped = wbc->pages_skipped;
+	if (wbc->range_cyclic)
+		index = mapping->writeback_index;
+	else
+		index = wbc->range_start >> PAGE_CACHE_SHIFT;
 
 	mpd.wbc = wbc;
 	mpd.inode = mapping->host;
 
-restart_loop:
-	to_write = wbc->nr_to_write;
-	while (!ret && to_write > 0) {
+	/*
+	 * we don't want write_cache_pages to update
+	 * nr_to_write and writeback_index
+	 */
+	no_nrwrite_update = wbc->no_nrwrite_update;
+	wbc->no_nrwrite_update = 1;
+	no_index_update = wbc->no_index_update;
+	wbc->no_index_update = 1;
+
+	while (!ret && wbc->nr_to_write > 0) {
 
 		/*
 		 * we insert one extent at a time. So we need
@@ -2420,46 +2435,48 @@ static int ext4_da_writepages(struct address_space *mapping,
 			dump_stack();
 			goto out_writepages;
 		}
-		to_write -= wbc->nr_to_write;
-
 		mpd.get_block = ext4_da_get_block_write;
 		ret = mpage_da_writepages(mapping, wbc, &mpd);
 
 		ext4_journal_stop(handle);
 
-		if (mpd.retval == -ENOSPC)
+		if (mpd.retval == -ENOSPC) {
+			/* commit the transaction which would
+			 * free blocks released in the transaction
+			 * and try again
+			 */
 			jbd2_journal_force_commit_nested(sbi->s_journal);
-
-		/* reset the retry count */
-		if (ret == MPAGE_DA_EXTENT_TAIL) {
+			ret = 0;
+		} else if (ret == MPAGE_DA_EXTENT_TAIL) {
 			/*
 			 * got one extent now try with
 			 * rest of the pages
 			 */
-			to_write += wbc->nr_to_write;
+			pages_written += mpd.pages_written;
 			ret = 0;
-		} else if (wbc->nr_to_write) {
+		} else if (wbc->nr_to_write)
 			/*
 			 * There is no more writeout needed
 			 * or we requested for a noblocking writeout
 			 * and we found the device congested
 			 */
-			to_write += wbc->nr_to_write;
 			break;
-		}
-		wbc->nr_to_write = to_write;
-	}
-
-	if (!wbc->range_cyclic && (pages_skipped != wbc->pages_skipped)) {
-		/* We skipped pages in this loop */
-		wbc->nr_to_write = to_write +
-				wbc->pages_skipped - pages_skipped;
-		wbc->pages_skipped = pages_skipped;
-		goto restart_loop;
 	}
+	/* Update index */
+	index += pages_written;
+	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
+		/*
+		 * set the writeback_index so that range_cyclic
+		 * mode will write it back later
+		 */
+		mapping->writeback_index = index;
 
 out_writepages:
-	wbc->nr_to_write = to_write - nr_to_writebump;
+	if (!no_nrwrite_update)
+		wbc->no_nrwrite_update = 0;
+	if (!no_index_update)
+		wbc->no_index_update = 0;
+	wbc->nr_to_write -= nr_to_writebump;
 	return ret;
 }
 
-- 
1.6.0.2.514.g23abd3

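The writeback_index bookkeeping that this patch moves into ext4_da_writepages() can be read as a small pure function: pick a start index, advance it by the pages actually written, and publish it only in cyclic mode or when a whole-file pass still has budget left. The sketch below is a user-space reading of that logic, not the kernel code. The struct wb_req, the function next_writeback_index and the 4 KiB page shift are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>

#define PAGE_SHIFT_SKETCH 12	/* assumed 4 KiB pages */

/* Hypothetical subset of the fields ext4_da_writepages() consults. */
struct wb_req {
	long long range_start;	/* byte offsets, as in writeback_control */
	long long range_end;
	long nr_to_write;	/* budget left after the write loop */
	unsigned range_cyclic:1;
};

/* Returns the value mapping->writeback_index would hold afterwards. */
unsigned long next_writeback_index(const struct wb_req *req,
				   unsigned long writeback_index,
				   long pages_written)
{
	unsigned long index;
	int range_whole;

	assert(req);
	assert(pages_written >= 0);
	range_whole = req->range_start == 0 && req->range_end == LLONG_MAX;

	if (req->range_cyclic)
		index = writeback_index;
	else
		index = (unsigned long)(req->range_start >> PAGE_SHIFT_SKETCH);

	index += (unsigned long)pages_written;	/* "Update index" */
	if (req->range_cyclic || (range_whole && req->nr_to_write > 0))
		return index;
	return writeback_index;			/* left untouched */
}
```

Under these assumptions a cyclic pass that starts at index 100 and writes 32 pages leaves writeback_index at 132, while a bounded-range request leaves it where it was.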
* Re: [PATCH 4/4] ext4: Fix file fragmentation during large file write.
  2008-10-11 19:04 ` [PATCH 4/4] ext4: Fix file fragmentation during large file write Aneesh Kumar K.V
@ 2008-10-12 20:31 ` Dmitri Monakhov
  2008-10-13 9:52 ` Aneesh Kumar K.V
  2008-10-13 13:34 ` Aneesh Kumar K.V
  0 siblings, 2 replies; 8+ messages in thread
From: Dmitri Monakhov @ 2008-10-12 20:31 UTC
  To: Aneesh Kumar K.V; +Cc: cmm, tytso, sandeen, npiggin, linux-ext4

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> The range_cyclic writeback mode uses the address_space writeback_index
> as the start index for writeback. With delayed allocation we were
> updating writeback_index wrongly resulting in highly fragmented file.
> Number of extents reduced from 4000 to 27 for a 3GB file with the below
> patch.
Hi, I've played with the fragmentation patches, with the following result:
I've had several crashes and deadlocks;
for example, objects weren't freed on umount:
EXT4-fs: mballoc: 12800 blocks 13 reqs (6 success)
EXT4-fs: mballoc: 7 extents scanned, 12 goal hits, 1 2^N hits, 0 breaks, 0 lost
EXT4-fs: mballoc: 1 generated and it took 3024
EXT4-fs: mballoc: 7608 preallocated, 1536 discarded
slab error in kmem_cache_destroy(): cache `ext4_prealloc_space': Can't free all objects
Pid: 7703, comm: rmmod Not tainted 2.6.27-rc8 #3

Call Trace:
 [<ffffffff8028b011>] kmem_cache_destroy+0x7d/0xc0
 [<ffffffffa03ca057>] exit_ext4_mballoc+0x10/0x1e [ext4dev]
 [<ffffffffa03d35b3>] exit_ext4_fs+0x1f/0x2f [ext4dev]
 [<ffffffff80250dff>] sys_delete_module+0x199/0x1f3
 [<ffffffff8025d06e>] audit_syscall_entry+0x12d/0x160
 [<ffffffff8020be0b>] system_call_fastpath+0x16/0x1b
Sometimes sync gets stuck.
I'm not sure whether it is because of the current patch set, but there is
at least one brand new bug. See later.
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
>  fs/ext4/inode.c |   83 +++++++++++++++++++++++++++++++++----------------------
>  1 files changed, 50 insertions(+), 33 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index cba7960..844c136 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1648,6 +1648,7 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
>  	int ret = 0, err, nr_pages, i;
>  	unsigned long index, end;
>  	struct pagevec pvec;
> +	long pages_skipped;
>
>  	BUG_ON(mpd->next_page <= mpd->first_page);
>  	pagevec_init(&pvec, 0);
> @@ -1655,7 +1656,6 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
>  	end = mpd->next_page - 1;
[1] In fact mpd->next_page may be bigger than (index + wbc->nr_to_write);
this results in incorrect math on exit from mpage_da_writepages().
Probably we have to bound end_index here:
     end = min(mpd->next_page, index + wbc->nr_to_write) - 1;
>
>  	while (index <= end) {
> -		/* XXX: optimize tail */
>  		/*
>  		 * We can use PAGECACHE_TAG_DIRTY lookup here because
>  		 * even though we have cleared the dirty flag on the page
> @@ -1673,8 +1673,13 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
>  		for (i = 0; i < nr_pages; i++) {
>  			struct page *page = pvec.pages[i];
>
> +			pages_skipped = mpd->wbc->pages_skipped;
>  			err = mapping->a_ops->writepage(page, mpd->wbc);
> -			if (!err)
> +			if (!err && (pages_skipped == mpd->wbc->pages_skipped))
> +				/*
> +				 * have successfully written the page
> +				 * without skipping the same
> +				 */
>  				mpd->pages_written++;
>  			/*
>  			 * In error case, we have to continue because
> @@ -2110,7 +2115,6 @@ static int mpage_da_writepages(struct address_space *mapping,
>  				struct writeback_control *wbc,
>  				struct mpage_da_data *mpd)
>  {
> -	long to_write;
>  	int ret;
>
>  	if (!mpd->get_block)
> @@ -2125,10 +2129,7 @@ static int mpage_da_writepages(struct address_space *mapping,
>  	mpd->pages_written = 0;
>  	mpd->retval = 0;
>
> -	to_write = wbc->nr_to_write;
> -
>  	ret = write_cache_pages(mapping, wbc, __mpage_da_writepage, mpd);
> -
>  	/*
>  	 * Handle last extent of pages
>  	 */
> @@ -2137,7 +2138,7 @@ static int mpage_da_writepages(struct address_space *mapping,
>  		mpage_da_submit_io(mpd);
>  	}
>
> -	wbc->nr_to_write = to_write - mpd->pages_written;
> +	wbc->nr_to_write -= mpd->pages_written;
Due to [1], wbc->nr_to_write becomes negative here.
>  	return ret;
>  }
>
> @@ -2366,11 +2367,14 @@ static int ext4_da_writepages_trans_blocks(struct inode *inode)
>  static int ext4_da_writepages(struct address_space *mapping,
>  			      struct writeback_control *wbc)
>  {
> +	pgoff_t index;
> +	int range_whole = 0;
>  	handle_t *handle = NULL;
> +	long pages_written = 0;
>  	struct mpage_da_data mpd;
>  	struct inode *inode = mapping->host;
> +	int no_nrwrite_update, no_index_update;
>  	int needed_blocks, ret = 0, nr_to_writebump = 0;
> -	long to_write, pages_skipped = 0;
>  	struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb);
>
>  	/*
> @@ -2390,16 +2394,27 @@ static int ext4_da_writepages(struct address_space *mapping,
>  		nr_to_writebump = sbi->s_mb_stream_request - wbc->nr_to_write;
>  		wbc->nr_to_write = sbi->s_mb_stream_request;
>  	}
> +	if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
> +		range_whole = 1;
>
> -
> -	pages_skipped = wbc->pages_skipped;
> +	if (wbc->range_cyclic)
> +		index = mapping->writeback_index;
> +	else
> +		index = wbc->range_start >> PAGE_CACHE_SHIFT;
>
>  	mpd.wbc = wbc;
>  	mpd.inode = mapping->host;
>
> -restart_loop:
> -	to_write = wbc->nr_to_write;
> -	while (!ret && to_write > 0) {
> +	/*
> +	 * we don't want write_cache_pages to update
> +	 * nr_to_write and writeback_index
> +	 */
> +	no_nrwrite_update = wbc->no_nrwrite_update;
> +	wbc->no_nrwrite_update = 1;
> +	no_index_update = wbc->no_index_update;
> +	wbc->no_index_update = 1;
> +
> +	while (!ret && wbc->nr_to_write > 0) {
>
>  		/*
>  		 * we insert one extent at a time. So we need
> @@ -2420,46 +2435,48 @@ static int ext4_da_writepages(struct address_space *mapping,
>  			dump_stack();
>  			goto out_writepages;
>  		}
> -		to_write -= wbc->nr_to_write;
> -
>  		mpd.get_block = ext4_da_get_block_write;
>  		ret = mpage_da_writepages(mapping, wbc, &mpd);
>
>  		ext4_journal_stop(handle);
>
> -		if (mpd.retval == -ENOSPC)
> +		if (mpd.retval == -ENOSPC) {
> +			/* commit the transaction which would
> +			 * free blocks released in the transaction
> +			 * and try again
> +			 */
>  			jbd2_journal_force_commit_nested(sbi->s_journal);
> -
> -		/* reset the retry count */
> -		if (ret == MPAGE_DA_EXTENT_TAIL) {
> +			ret = 0;
> +		} else if (ret == MPAGE_DA_EXTENT_TAIL) {
>  			/*
>  			 * got one extent now try with
>  			 * rest of the pages
>  			 */
> -			to_write += wbc->nr_to_write;
> +			pages_written += mpd.pages_written;
>  			ret = 0;
> -		} else if (wbc->nr_to_write) {
> +		} else if (wbc->nr_to_write)
>  			/*
>  			 * There is no more writeout needed
>  			 * or we requested for a noblocking writeout
>  			 * and we found the device congested
>  			 */
> -			to_write += wbc->nr_to_write;
>  			break;
> -		}
> -		wbc->nr_to_write = to_write;
> -	}
> -
> -	if (!wbc->range_cyclic && (pages_skipped != wbc->pages_skipped)) {
> -		/* We skipped pages in this loop */
> -		wbc->nr_to_write = to_write +
> -				wbc->pages_skipped - pages_skipped;
> -		wbc->pages_skipped = pages_skipped;
> -		goto restart_loop;
>  	}
> +	/* Update index */
> +	index += pages_written;
> +	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
> +		/*
> +		 * set the writeback_index so that range_cyclic
> +		 * mode will write it back later
> +		 */
> +		mapping->writeback_index = index;
>
>  out_writepages:
> -	wbc->nr_to_write = to_write - nr_to_writebump;
> +	if (!no_nrwrite_update)
> +		wbc->no_nrwrite_update = 0;
> +	if (!no_index_update)
> +		wbc->no_index_update = 0;
> +	wbc->nr_to_write -= nr_to_writebump;
>  	return ret;
>  }
BTW: please add the following cleanup fix.
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6efa4ca..c248cbc 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2603,7 +2603,7 @@ static int ext4_da_write_end(struct file *file,
 	handle_t *handle = ext4_journal_current_handle();
 	loff_t new_i_size;
 	unsigned long start, end;
-	int write_mode = (int)fsdata;
+	int write_mode = (int)(unsigned long)fsdata;
 
 	if (write_mode == FALL_BACK_TO_NONDELALLOC) {
 		if (ext4_should_order_data(inode)) {

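The concern raised as [1] in the message above is plain arithmetic: if the extent handed to mpage_da_submit_io() covers more pages than the remaining budget, subtracting pages_written from nr_to_write leaves a negative value. The sketch below works through that arithmetic with and without the proposed bound on end. It is a simplified model only: the function names are hypothetical, and it assumes every page in the extent is dirty and gets written.

```c
#include <assert.h>

/*
 * Hypothetical model: pages submitted for the extent
 * [first_page, next_page), optionally bounding the end of the
 * range by the remaining budget as suggested in [1].
 */
long pages_submitted(unsigned long first_page, unsigned long next_page,
		     long nr_to_write, int clamp)
{
	unsigned long end_excl = next_page;

	assert(next_page > first_page);	/* mirrors the BUG_ON in the patch */
	assert(nr_to_write > 0);
	if (clamp && first_page + (unsigned long)nr_to_write < end_excl)
		end_excl = first_page + (unsigned long)nr_to_write;
	return (long)(end_excl - first_page);
}

/* Budget left after "nr_to_write -= pages_written". */
long budget_after(unsigned long first_page, unsigned long next_page,
		  long nr_to_write, int clamp)
{
	return nr_to_write -
		pages_submitted(first_page, next_page, nr_to_write, clamp);
}
```

For a 50-page extent and a budget of 32, the unbounded case ends at -18 and the bounded case at 0. Whether a negative budget is actually a problem is a separate question that this model does not answer.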
* Re: [PATCH 4/4] ext4: Fix file fragmentation during large file write.
  2008-10-12 20:31 ` Dmitri Monakhov
@ 2008-10-13 9:52 ` Aneesh Kumar K.V
  2008-10-13 15:14 ` Dmitri Monakhov
  2008-10-13 13:34 ` Aneesh Kumar K.V
  1 sibling, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2008-10-13 9:52 UTC
  To: Dmitri Monakhov; +Cc: cmm, tytso, sandeen, npiggin, linux-ext4

On Mon, Oct 13, 2008 at 12:31:43AM +0400, Dmitri Monakhov wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>
> > The range_cyclic writeback mode uses the address_space writeback_index
> > as the start index for writeback. With delayed allocation we were
> > updating writeback_index wrongly resulting in highly fragmented file.
> > Number of extents reduced from 4000 to 27 for a 3GB file with the below
> > patch.
> Hi, I've played with the fragmentation patches, with the following result:
> I've had several crashes and deadlocks;
> for example, objects weren't freed on umount:
> EXT4-fs: mballoc: 12800 blocks 13 reqs (6 success)
> EXT4-fs: mballoc: 7 extents scanned, 12 goal hits, 1 2^N hits, 0 breaks, 0 lost
> EXT4-fs: mballoc: 1 generated and it took 3024
> EXT4-fs: mballoc: 7608 preallocated, 1536 discarded
> slab error in kmem_cache_destroy(): cache `ext4_prealloc_space': Can't free all objects
> Pid: 7703, comm: rmmod Not tainted 2.6.27-rc8 #3
>
> Call Trace:
>  [<ffffffff8028b011>] kmem_cache_destroy+0x7d/0xc0
>  [<ffffffffa03ca057>] exit_ext4_mballoc+0x10/0x1e [ext4dev]
>  [<ffffffffa03d35b3>] exit_ext4_fs+0x1f/0x2f [ext4dev]
>  [<ffffffff80250dff>] sys_delete_module+0x199/0x1f3
>  [<ffffffff8025d06e>] audit_syscall_entry+0x12d/0x160
>  [<ffffffff8020be0b>] system_call_fastpath+0x16/0x1b
> Sometimes sync gets stuck.
> I'm not sure whether it is because of the current patch set, but there is
> at least one brand new bug. See later.


I can't reproduce this. I built ext4 as a module and tried to unload the
module. Actually the umount should have released all the objects in
slab. Can you get the /proc/slabinfo output when this happens?

> > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> > ---
> >  fs/ext4/inode.c |   83 +++++++++++++++++++++++++++++++++----------------------
> >  1 files changed, 50 insertions(+), 33 deletions(-)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index cba7960..844c136 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -1648,6 +1648,7 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
> >  	int ret = 0, err, nr_pages, i;
> >  	unsigned long index, end;
> >  	struct pagevec pvec;
> > +	long pages_skipped;
> >
> >  	BUG_ON(mpd->next_page <= mpd->first_page);
> >  	pagevec_init(&pvec, 0);
> > @@ -1655,7 +1656,6 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
> >  	end = mpd->next_page - 1;
> [1] In fact mpd->next_page may be bigger than (index + wbc->nr_to_write);
> this results in incorrect math on exit from mpage_da_writepages().
> Probably we have to bound end_index here:

write_cache_pages will also write more than requested.
pagevec_lookup_tag can return more than nr_to_write pages, which
implies wbc->nr_to_write can go negative.

>      end = min(mpd->next_page, index + wbc->nr_to_write) - 1;
> >
> >  	while (index <= end) {
> > -		/* XXX: optimize tail */
> >  		/*
> >  		 * We can use PAGECACHE_TAG_DIRTY lookup here because
> >  		 * even though we have cleared the dirty flag on the page
> > @@ -1673,8 +1673,13 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
> >  		for (i = 0; i < nr_pages; i++) {
> >  			struct page *page = pvec.pages[i];
> >
> > +			pages_skipped = mpd->wbc->pages_skipped;
> >  			err = mapping->a_ops->writepage(page, mpd->wbc);
> > -			if (!err)
> > +			if (!err && (pages_skipped == mpd->wbc->pages_skipped))
> > +				/*
> > +				 * have successfully written the page
> > +				 * without skipping the same
> > +				 */
> >  				mpd->pages_written++;
> >  			/*
> >  			 * In error case, we have to continue because
> > @@ -2110,7 +2115,6 @@ static int mpage_da_writepages(struct address_space *mapping,
> >  				struct writeback_control *wbc,
> >  				struct mpage_da_data *mpd)
> >  {
> > -	long to_write;
> >  	int ret;
> >
> >  	if (!mpd->get_block)
> > @@ -2125,10 +2129,7 @@ static int mpage_da_writepages(struct address_space *mapping,
> >  	mpd->pages_written = 0;
> >  	mpd->retval = 0;
> >
> > -	to_write = wbc->nr_to_write;
> > -
> >  	ret = write_cache_pages(mapping, wbc, __mpage_da_writepage, mpd);
> > -
> >  	/*
> >  	 * Handle last extent of pages
> >  	 */
> > @@ -2137,7 +2138,7 @@ static int mpage_da_writepages(struct address_space *mapping,
> >  		mpage_da_submit_io(mpd);
> >  	}
> >
> > -	wbc->nr_to_write = to_write - mpd->pages_written;
> > +	wbc->nr_to_write -= mpd->pages_written;
> Due to [1], wbc->nr_to_write becomes negative here.

wbc->nr_to_write can go negative, right?

> >  	return ret;
> >  }
> >
> > @@ -2366,11 +2367,14 @@ static int ext4_da_writepages_trans_blocks(struct inode *inode)
> >  static int ext4_da_writepages(struct address_space *mapping,
> >  			      struct writeback_control *wbc)
> >  {
> > +	pgoff_t index;
> > +	int range_whole = 0;
> >  	handle_t *handle = NULL;
> > +	long pages_written = 0;
> >  	struct mpage_da_data mpd;
> >  	struct inode *inode = mapping->host;
> > +	int no_nrwrite_update, no_index_update;
> >  	int needed_blocks, ret = 0, nr_to_writebump = 0;
> > -	long to_write, pages_skipped = 0;
> >  	struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb);
> >
> >  	/*
> > @@ -2390,16 +2394,27 @@ static int ext4_da_writepages(struct address_space *mapping,
> >  		nr_to_writebump = sbi->s_mb_stream_request - wbc->nr_to_write;
> >  		wbc->nr_to_write = sbi->s_mb_stream_request;
> >  	}
> > +	if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
> > +		range_whole = 1;
> >
> > -
> > -	pages_skipped = wbc->pages_skipped;
> > +	if (wbc->range_cyclic)
> > +		index = mapping->writeback_index;
> > +	else
> > +		index = wbc->range_start >> PAGE_CACHE_SHIFT;
> >
> >  	mpd.wbc = wbc;
> >  	mpd.inode = mapping->host;
> >
> > -restart_loop:
> > -	to_write = wbc->nr_to_write;
> > -	while (!ret && to_write > 0) {
> > +	/*
> > +	 * we don't want write_cache_pages to update
> > +	 * nr_to_write and writeback_index
> > +	 */
> > +	no_nrwrite_update = wbc->no_nrwrite_update;
> > +	wbc->no_nrwrite_update = 1;
> > +	no_index_update = wbc->no_index_update;
> > +	wbc->no_index_update = 1;
> > +
> > +	while (!ret && wbc->nr_to_write > 0) {
> >
> >  		/*
> >  		 * we insert one extent at a time. So we need
> > @@ -2420,46 +2435,48 @@ static int ext4_da_writepages(struct address_space *mapping,
> >  			dump_stack();
> >  			goto out_writepages;
> >  		}
> > -		to_write -= wbc->nr_to_write;
> > -
> >  		mpd.get_block = ext4_da_get_block_write;
> >  		ret = mpage_da_writepages(mapping, wbc, &mpd);
> >
> >  		ext4_journal_stop(handle);
> >
> > -		if (mpd.retval == -ENOSPC)
> > +		if (mpd.retval == -ENOSPC) {
> > +			/* commit the transaction which would
> > +			 * free blocks released in the transaction
> > +			 * and try again
> > +			 */
> >  			jbd2_journal_force_commit_nested(sbi->s_journal);
> > -
> > -		/* reset the retry count */
> > -		if (ret == MPAGE_DA_EXTENT_TAIL) {
> > +			ret = 0;
> > +		} else if (ret == MPAGE_DA_EXTENT_TAIL) {
> >  			/*
> >  			 * got one extent now try with
> >  			 * rest of the pages
> >  			 */
> > -			to_write += wbc->nr_to_write;
> > +			pages_written += mpd.pages_written;
> >  			ret = 0;
> > -		} else if (wbc->nr_to_write) {
> > +		} else if (wbc->nr_to_write)
> >  			/*
> >  			 * There is no more writeout needed
> >  			 * or we requested for a noblocking writeout
> >  			 * and we found the device congested
> >  			 */
> > -			to_write += wbc->nr_to_write;
> >  			break;
> > -		}
> > -		wbc->nr_to_write = to_write;
> > -	}
> > -
> > -	if (!wbc->range_cyclic && (pages_skipped != wbc->pages_skipped)) {
> > -		/* We skipped pages in this loop */
> > -		wbc->nr_to_write = to_write +
> > -				wbc->pages_skipped - pages_skipped;
> > -		wbc->pages_skipped = pages_skipped;
> > -		goto restart_loop;
> >  	}
> > +	/* Update index */
> > +	index += pages_written;
> > +	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
> > +		/*
> > +		 * set the writeback_index so that range_cyclic
> > +		 * mode will write it back later
> > +		 */
> > +		mapping->writeback_index = index;
> >
> >  out_writepages:
> > -	wbc->nr_to_write = to_write - nr_to_writebump;
> > +	if (!no_nrwrite_update)
> > +		wbc->no_nrwrite_update = 0;
> > +	if (!no_index_update)
> > +		wbc->no_index_update = 0;
> > +	wbc->nr_to_write -= nr_to_writebump;
> >  	return ret;
> >  }
> BTW: please add the following cleanup fix.
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 6efa4ca..c248cbc 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2603,7 +2603,7 @@ static int ext4_da_write_end(struct file *file,
>  	handle_t *handle = ext4_journal_current_handle();
>  	loff_t new_i_size;
>  	unsigned long start, end;
> -	int write_mode = (int)fsdata;
> +	int write_mode = (int)(unsigned long)fsdata;
>
>  	if (write_mode == FALL_BACK_TO_NONDELALLOC) {
>  		if (ext4_should_order_data(inode)) {
>

Eric Sandeen already fixed it in the patch queue. I can also see that
mainline has the fix. Which kernel are you trying?

-aneesh

* Re: [PATCH 4/4] ext4: Fix file fragmentation during large file write.
  2008-10-13 9:52 ` Aneesh Kumar K.V
@ 2008-10-13 15:14 ` Dmitri Monakhov
  0 siblings, 0 replies; 8+ messages in thread
From: Dmitri Monakhov @ 2008-10-13 15:14 UTC
  To: Aneesh Kumar K.V; +Cc: cmm, tytso, sandeen, npiggin, linux-ext4

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> On Mon, Oct 13, 2008 at 12:31:43AM +0400, Dmitri Monakhov wrote:
>> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>>
>> > The range_cyclic writeback mode uses the address_space writeback_index
>> > as the start index for writeback. With delayed allocation we were
>> > updating writeback_index wrongly resulting in highly fragmented file.
>> > Number of extents reduced from 4000 to 27 for a 3GB file with the below
>> > patch.
>> Hi, I've played with the fragmentation patches, with the following result:
>> I've had several crashes and deadlocks;
>> for example, objects weren't freed on umount:
>> EXT4-fs: mballoc: 12800 blocks 13 reqs (6 success)
>> EXT4-fs: mballoc: 7 extents scanned, 12 goal hits, 1 2^N hits, 0 breaks, 0 lost
>> EXT4-fs: mballoc: 1 generated and it took 3024
>> EXT4-fs: mballoc: 7608 preallocated, 1536 discarded
>> slab error in kmem_cache_destroy(): cache `ext4_prealloc_space': Can't free all objects
>> Pid: 7703, comm: rmmod Not tainted 2.6.27-rc8 #3
>>
>> Call Trace:
>>  [<ffffffff8028b011>] kmem_cache_destroy+0x7d/0xc0
>>  [<ffffffffa03ca057>] exit_ext4_mballoc+0x10/0x1e [ext4dev]
>>  [<ffffffffa03d35b3>] exit_ext4_fs+0x1f/0x2f [ext4dev]
>>  [<ffffffff80250dff>] sys_delete_module+0x199/0x1f3
>>  [<ffffffff8025d06e>] audit_syscall_entry+0x12d/0x160
>>  [<ffffffff8020be0b>] system_call_fastpath+0x16/0x1b
>> Sometimes sync gets stuck.
>> I'm not sure whether it is because of the current patch set, but there is
>> at least one brand new bug. See later.
>
>
> I can't reproduce this. I built ext4 as a module and tried to unload the
> module. Actually the umount should have released all the objects in
> slab. Can you get the /proc/slabinfo output when this happens?
This results in an invalid pointer dereference.
Your kmem_cache patch has fixed the issue, at least for writes w/o fallocate.
>
>
>> > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> > Signed-off-by: Theodore Ts'o <tytso@mit.edu>
>> > ---
>> >  fs/ext4/inode.c |   83 +++++++++++++++++++++++++++++++++----------------------
>> >  1 files changed, 50 insertions(+), 33 deletions(-)
>> >
>> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> > index cba7960..844c136 100644
>> > --- a/fs/ext4/inode.c
>> > +++ b/fs/ext4/inode.c
>> > @@ -1648,6 +1648,7 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
>> >  	int ret = 0, err, nr_pages, i;
>> >  	unsigned long index, end;
>> >  	struct pagevec pvec;
>> > +	long pages_skipped;
>> >
>> >  	BUG_ON(mpd->next_page <= mpd->first_page);
>> >  	pagevec_init(&pvec, 0);
>> > @@ -1655,7 +1656,6 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd)
>> >  	end = mpd->next_page - 1;
>> [1] In fact mpd->next_page may be bigger than (index + wbc->nr_to_write);
>> this results in incorrect math on exit from mpage_da_writepages().
>> Probably we have to bound end_index here:
>
> write_cache_pages will also write more than requested.
> pagevec_lookup_tag can return more than nr_to_write pages, which
> implies wbc->nr_to_write can go negative.
> > >> end = min(mpd->next_page, index +wbc->nr_to_write) -1; >> > >> > while (index <= end) { >> > - /* XXX: optimize tail */ >> > /* >> > * We can use PAGECACHE_TAG_DIRTY lookup here because >> > * even though we have cleared the dirty flag on the page >> > @@ -1673,8 +1673,13 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd) >> > for (i = 0; i < nr_pages; i++) { >> > struct page *page = pvec.pages[i]; >> > >> > + pages_skipped = mpd->wbc->pages_skipped; >> > err = mapping->a_ops->writepage(page, mpd->wbc); >> > - if (!err) >> > + if (!err && (pages_skipped == mpd->wbc->pages_skipped)) >> > + /* >> > + * have successfully written the page >> > + * without skipping the same >> > + */ >> > mpd->pages_written++; >> > /* >> > * In error case, we have to continue because >> > @@ -2110,7 +2115,6 @@ static int mpage_da_writepages(struct address_space *mapping, >> > struct writeback_control *wbc, >> > struct mpage_da_data *mpd) >> > { >> > - long to_write; >> > int ret; >> > >> > if (!mpd->get_block) >> > @@ -2125,10 +2129,7 @@ static int mpage_da_writepages(struct address_space *mapping, >> > mpd->pages_written = 0; >> > mpd->retval = 0; >> > >> > - to_write = wbc->nr_to_write; >> > - >> > ret = write_cache_pages(mapping, wbc, __mpage_da_writepage, mpd); >> > - >> > /* >> > * Handle last extent of pages >> > */ >> > @@ -2137,7 +2138,7 @@ static int mpage_da_writepages(struct address_space *mapping, >> > mpage_da_submit_io(mpd); >> > } >> > >> > - wbc->nr_to_write = to_write - mpd->pages_written; >> > + wbc->nr_to_write -= mpd->pages_written; >> due to [1] wbc->nr_write becomes negative here > > > wbc->nr_ti_write can go negative right ? 
> > > >> > return ret; >> > } >> > >> > @@ -2366,11 +2367,14 @@ static int ext4_da_writepages_trans_blocks(struct inode *inode) >> > static int ext4_da_writepages(struct address_space *mapping, >> > struct writeback_control *wbc) >> > { >> > + pgoff_t index; >> > + int range_whole = 0; >> > handle_t *handle = NULL; >> > + long pages_written = 0; >> > struct mpage_da_data mpd; >> > struct inode *inode = mapping->host; >> > + int no_nrwrite_update, no_index_update; >> > int needed_blocks, ret = 0, nr_to_writebump = 0; >> > - long to_write, pages_skipped = 0; >> > struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb); >> > >> > /* >> > @@ -2390,16 +2394,27 @@ static int ext4_da_writepages(struct address_space *mapping, >> > nr_to_writebump = sbi->s_mb_stream_request - wbc->nr_to_write; >> > wbc->nr_to_write = sbi->s_mb_stream_request; >> > } >> > + if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX) >> > + range_whole = 1; >> > >> > - >> > - pages_skipped = wbc->pages_skipped; >> > + if (wbc->range_cyclic) >> > + index = mapping->writeback_index; >> > + else >> > + index = wbc->range_start >> PAGE_CACHE_SHIFT; >> > >> > mpd.wbc = wbc; >> > mpd.inode = mapping->host; >> > >> > -restart_loop: >> > - to_write = wbc->nr_to_write; >> > - while (!ret && to_write > 0) { >> > + /* >> > + * we don't want write_cache_pages to update >> > + * nr_to_write and writeback_index >> > + */ >> > + no_nrwrite_update = wbc->no_nrwrite_update; >> > + wbc->no_nrwrite_update = 1; >> > + no_index_update = wbc->no_index_update; >> > + wbc->no_index_update = 1; >> > + >> > + while (!ret && wbc->nr_to_write > 0) { >> > >> > /* >> > * we insert one extent at a time. 
So we need >> > @@ -2420,46 +2435,48 @@ static int ext4_da_writepages(struct address_space *mapping, >> > dump_stack(); >> > goto out_writepages; >> > } >> > - to_write -= wbc->nr_to_write; >> > - >> > mpd.get_block = ext4_da_get_block_write; >> > ret = mpage_da_writepages(mapping, wbc, &mpd); >> > >> > ext4_journal_stop(handle); >> > >> > - if (mpd.retval == -ENOSPC) >> > + if (mpd.retval == -ENOSPC) { >> > + /* commit the transaction which would >> > + * free blocks released in the transaction >> > + * and try again >> > + */ >> > jbd2_journal_force_commit_nested(sbi->s_journal); >> > - >> > - /* reset the retry count */ >> > - if (ret == MPAGE_DA_EXTENT_TAIL) { >> > + ret = 0; >> > + } else if (ret == MPAGE_DA_EXTENT_TAIL) { >> > /* >> > * got one extent now try with >> > * rest of the pages >> > */ >> > - to_write += wbc->nr_to_write; >> > + pages_written += mpd.pages_written; >> > ret = 0; >> > - } else if (wbc->nr_to_write) { >> > + } else if (wbc->nr_to_write) >> > /* >> > * There is no more writeout needed >> > * or we requested for a noblocking writeout >> > * and we found the device congested >> > */ >> > - to_write += wbc->nr_to_write; >> > break; >> > - } >> > - wbc->nr_to_write = to_write; >> > - } >> > - >> > - if (!wbc->range_cyclic && (pages_skipped != wbc->pages_skipped)) { >> > - /* We skipped pages in this loop */ >> > - wbc->nr_to_write = to_write + >> > - wbc->pages_skipped - pages_skipped; >> > - wbc->pages_skipped = pages_skipped; >> > - goto restart_loop; >> > } >> > + /* Update index */ >> > + index += pages_written; >> > + if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0)) >> > + /* >> > + * set the writeback_index so that range_cyclic >> > + * mode will write it back later >> > + */ >> > + mapping->writeback_index = index; >> > >> > out_writepages: >> > - wbc->nr_to_write = to_write - nr_to_writebump; >> > + if (!no_nrwrite_update) >> > + wbc->no_nrwrite_update = 0; >> > + if (!no_index_update) >> > + wbc->no_index_update = 
0; >> > + wbc->nr_to_write -= nr_to_writebump; >> > return ret; >> > } >> BTW: please add following cleanup fix. >> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c >> index 6efa4ca..c248cbc 100644 >> --- a/fs/ext4/inode.c >> +++ b/fs/ext4/inode.c >> @@ -2603,7 +2603,7 @@ static int ext4_da_write_end(struct file *file, >> handle_t *handle = ext4_journal_current_handle(); >> loff_t new_i_size; >> unsigned long start, end; >> - int write_mode = (int)fsdata; >> + int write_mode = (int)(unsigned long)fsdata; >> >> if (write_mode == FALL_BACK_TO_NONDELALLOC) { >> if (ext4_should_order_data(inode)) { >> > > Eric Sandeen already fixed it in the pathqueue.I can also > see mainline having the fix. Which kernel are you trying ? Ops.. i've missed it. > > -aneesh ^ permalink raw reply [flat|nested] 8+ messages in thread
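Point [1] in the message above is about arithmetic: if one extent of pages is larger than the remaining write budget, subtracting the pages written drives nr_to_write below zero, which the reply treats as acceptable. A simplified userspace model of both variants follows. It is only a sketch of the accounting, not the kernel code; remaining_unbounded and remaining_bounded are hypothetical names, and pages are assumed to occupy the half-open range [first_page, next_page) with a non-negative budget.

```c
/*
 * Write the whole extent and charge it against the budget.
 * The result goes negative when the extent exceeds the budget.
 */
long remaining_unbounded(unsigned long first_page, unsigned long next_page,
			 long nr_to_write)
{
	long written = (long)(next_page - first_page);

	return nr_to_write - written;
}

/*
 * Clamp the end of the range first, in the spirit of the suggested
 * "end = min(mpd->next_page, index + wbc->nr_to_write) - 1".
 * Assumes nr_to_write >= 0 on entry.
 */
long remaining_bounded(unsigned long first_page, unsigned long next_page,
		       long nr_to_write)
{
	unsigned long limit = first_page + (unsigned long)nr_to_write;
	unsigned long stop = next_page < limit ? next_page : limit;
	long written = (long)(stop - first_page);

	return nr_to_write - written;
}
```

With a 40-page extent starting at page 100 and a budget of 32, the unbounded variant leaves -8 and the bounded one leaves 0. When the extent fits in the budget, both agree.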
* Re: [PATCH 4/4] ext4: Fix file fragmentation during large file write.
  2008-10-12 20:31 ` Dmitri Monakhov
  2008-10-13  9:52   ` Aneesh Kumar K.V
@ 2008-10-13 13:34   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 8+ messages in thread
From: Aneesh Kumar K.V @ 2008-10-13 13:34 UTC (permalink / raw)
  To: Dmitri Monakhov
  Cc: Aneesh Kumar K.V, cmm, tytso, sandeen, npiggin, linux-ext4

On Mon, Oct 13, 2008 at 12:31:43AM +0400, Dmitri Monakhov wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>
> > The range_cyclic writeback mode uses the address_space writeback_index
> > as the start index for writeback. With delayed allocation we were
> > updating writeback_index wrongly resulting in highly fragmented file.
> > Number of extents reduced from 4000 to 27 for a 3GB file with the below
> > patch.
> Hi i've played with fragmentation patches with following result:
> I've had several crash and deadlocks
> for example objects wasn't freed on umount:
> EXT4-fs: mballoc: 12800 blocks 13 reqs (6 success)
> EXT4-fs: mballoc: 7 extents scanned, 12 goal hits, 1 2^N hits, 0 breaks, 0 lost
> EXT4-fs: mballoc: 1 generated and it took 3024
> EXT4-fs: mballoc: 7608 preallocated, 1536 discarded
> slab error in kmem_cache_destroy(): cache `ext4_prealloc_space': Can't free all objects
> Pid: 7703, comm: rmmod Not tainted 2.6.27-rc8 #3
>
> Call Trace:
> [<ffffffff8028b011>] kmem_cache_destroy+0x7d/0xc0
> [<ffffffffa03ca057>] exit_ext4_mballoc+0x10/0x1e [ext4dev]
> [<ffffffffa03d35b3>] exit_ext4_fs+0x1f/0x2f [ext4dev]
> [<ffffffff80250dff>] sys_delete_module+0x199/0x1f3
> [<ffffffff8025d06e>] audit_syscall_entry+0x12d/0x160
> [<ffffffff8020be0b>] system_call_fastpath+0x16/0x1b

Looking at the code I found this. I haven't tested the change yet.

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 2f38754..acf6a32 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2569,7 +2569,7 @@ static void ext4_mb_cleanup_pa(struct ext4_group_info *grp)
 		pa = list_entry(cur, struct ext4_prealloc_space, pa_group_list);
 		list_del(&pa->pa_group_list);
 		count++;
-		kfree(pa);
+		kmem_cache_free(ext4_pspace_cachep, pa);
 	}
 	if (count)
 		mb_debug("mballoc: %u PAs left\n", count);
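The one-line change above returns each preallocation object through the cache it belongs to (ext4_pspace_cachep) instead of through kfree. A toy userspace model of why the matching release path matters for a cache that counts its live objects is sketched below. Everything in it (toy_cache, toy_cache_alloc, toy_cache_free, leftover_after) is hypothetical, and it does not reproduce how the kernel slab allocators work internally; it only illustrates the bookkeeping idea behind a "Can't free all objects" style check.

```c
#include <stdlib.h>

/* Toy stand-in for an object cache: it only counts objects handed out. */
struct toy_cache {
	size_t obj_size;
	long live;	/* objects allocated and not yet returned to the cache */
};

void *toy_cache_alloc(struct toy_cache *c)
{
	void *obj = malloc(c->obj_size);

	if (obj)
		c->live++;
	return obj;
}

/* Matching release path: frees the memory and updates the accounting. */
void toy_cache_free(struct toy_cache *c, void *obj)
{
	if (!obj)
		return;
	free(obj);
	c->live--;
}

/*
 * Allocate two objects, release the first through the cache, and release
 * the second either through the cache or through plain free().
 * Returns how many objects the cache still believes are live.
 */
long leftover_after(int use_matching_free)
{
	struct toy_cache cache = { 64, 0 };
	void *a = toy_cache_alloc(&cache);
	void *b = toy_cache_alloc(&cache);

	toy_cache_free(&cache, a);
	if (use_matching_free)
		toy_cache_free(&cache, b);
	else
		free(b);	/* memory is released, the accounting is not */
	return cache.live;
}
```

In this model, leftover_after(1) leaves nothing outstanding, while leftover_after(0) leaves one object that a teardown check would still see as live.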
Thread overview: 8+ messages
2008-10-11 19:04 [PATCH 1/4] ext4: Use tag dirty lookup during mpage_da_submit_io Aneesh Kumar K.V
2008-10-11 19:04 ` [PATCH 2/4] vfs: Remove the range_cont writeback mode Aneesh Kumar K.V
2008-10-11 19:04   ` [PATCH 3/4] vfs: Add no_nrwrite_update and no_index_update writeback control flags Aneesh Kumar K.V
2008-10-11 19:04     ` [PATCH 4/4] ext4: Fix file fragmentation during large file write Aneesh Kumar K.V
2008-10-12 20:31       ` Dmitri Monakhov
2008-10-13  9:52         ` Aneesh Kumar K.V
2008-10-13 15:14           ` Dmitri Monakhov
2008-10-13 13:34         ` Aneesh Kumar K.V