linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: <gregkh@suse.de>
To: aneesh.kumar@linux.vnet.ibm.com, david@fromorbit.com,
	dev@jaysonking.com, gregkh@suse.de,
	Kay.Diederichs@uni-konstanz.de, linux-ext4@vger.kernel.org,
	tytso@mit.edu
Cc: <stable@kernel.org>, <stable-commits@vger.kernel.org>
Subject: Patch "ext4: Fix file fragmentation during large file write." has been added to the 2.6.27-stable tree
Date: Fri, 25 Jun 2010 16:32:15 -0700	[thread overview]
Message-ID: <12775087352667@site> (raw)
In-Reply-To: <4C001901.1070207@jaysonking.com>


This is a note to let you know that I've just added the patch titled

    ext4: Fix file fragmentation during large file write.

to the 2.6.27-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
    ext4-fix-file-fragmentation-during-large-file-write.patch
and it can be found in the queue-2.6.27 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@kernel.org> know about it.


>From dev@jaysonking.com  Fri Jun 25 15:33:09 2010
From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Date: Fri, 28 May 2010 14:26:57 -0500
Subject: ext4: Fix file fragmentation during large file write.
Cc: "Jayson R. King" <dev@jaysonking.com>, Theodore Ts'o <tytso@mit.edu>, "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>, Dave Chinner <david@fromorbit.com>, Ext4 Developers List <linux-ext4@vger.kernel.org>, Kay Diederichs <Kay.Diederichs@uni-konstanz.de>
Message-ID: <4C001901.1070207@jaysonking.com>


From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

commit 22208dedbd7626e5fc4339c417f8d24cc21f79d7 upstream.

The range_cyclic writeback mode uses the address_space writeback_index
as the start index for writeback.  With delayed allocation we were
updating writeback_index wrongly resulting in highly fragmented file.
This patch reduces the number of extents reduced from 4000 to 27 for a
3GB file.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[dev@jaysonking.com: Some changed lines from the original version of this patch were dropped, since they were rolled up with another cherry-picked patch applied to 2.6.27.y earlier.]
[dev@jaysonking.com: Use of wbc->no_nrwrite_index_update was dropped, since write_cache_pages_da() implies it.]
Signed-off-by: Jayson R. King <dev@jaysonking.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 fs/ext4/inode.c |   79 ++++++++++++++++++++++++++++++++------------------------
 1 file changed, 46 insertions(+), 33 deletions(-)

--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1721,7 +1721,11 @@ static int mpage_da_submit_io(struct mpa
 
 			pages_skipped = mpd->wbc->pages_skipped;
 			err = mapping->a_ops->writepage(page, mpd->wbc);
-			if (!err)
+			if (!err && (pages_skipped == mpd->wbc->pages_skipped))
+				/*
+				 * have successfully written the page
+				 * without skipping the same
+				 */
 				mpd->pages_written++;
 			/*
 			 * In error case, we have to continue because
@@ -2295,7 +2299,6 @@ static int mpage_da_writepages(struct ad
 			       struct writeback_control *wbc,
 			       struct mpage_da_data *mpd)
 {
-	long to_write;
 	int ret;
 
 	if (!mpd->get_block)
@@ -2310,19 +2313,18 @@ static int mpage_da_writepages(struct ad
 	mpd->pages_written = 0;
 	mpd->retval = 0;
 
-	to_write = wbc->nr_to_write;
-
 	ret = write_cache_pages_da(mapping, wbc, mpd);
-
 	/*
 	 * Handle last extent of pages
 	 */
 	if (!mpd->io_done && mpd->next_page != mpd->first_page) {
 		if (mpage_da_map_blocks(mpd) == 0)
 			mpage_da_submit_io(mpd);
-	}
 
-	wbc->nr_to_write = to_write - mpd->pages_written;
+		mpd->io_done = 1;
+		ret = MPAGE_DA_EXTENT_TAIL;
+	}
+	wbc->nr_to_write -= mpd->pages_written;
 	return ret;
 }
 
@@ -2567,11 +2569,13 @@ static int ext4_da_writepages_trans_bloc
 static int ext4_da_writepages(struct address_space *mapping,
 			      struct writeback_control *wbc)
 {
+	pgoff_t	index;
+	int range_whole = 0;
 	handle_t *handle = NULL;
 	struct mpage_da_data mpd;
 	struct inode *inode = mapping->host;
+	long pages_written = 0, pages_skipped;
 	int needed_blocks, ret = 0, nr_to_writebump = 0;
-	long to_write, pages_skipped = 0;
 	struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb);
 
 	/*
@@ -2605,16 +2609,20 @@ static int ext4_da_writepages(struct add
 		nr_to_writebump = sbi->s_mb_stream_request - wbc->nr_to_write;
 		wbc->nr_to_write = sbi->s_mb_stream_request;
 	}
+	if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
+		range_whole = 1;
 
-
-	pages_skipped = wbc->pages_skipped;
+	if (wbc->range_cyclic)
+		index = mapping->writeback_index;
+	else
+		index = wbc->range_start >> PAGE_CACHE_SHIFT;
 
 	mpd.wbc = wbc;
 	mpd.inode = mapping->host;
 
-restart_loop:
-	to_write = wbc->nr_to_write;
-	while (!ret && to_write > 0) {
+	pages_skipped = wbc->pages_skipped;
+
+	while (!ret && wbc->nr_to_write > 0) {
 
 		/*
 		 * we  insert one extent at a time. So we need
@@ -2647,46 +2655,51 @@ restart_loop:
 				goto out_writepages;
 			}
 		}
-		to_write -= wbc->nr_to_write;
-
 		mpd.get_block = ext4_da_get_block_write;
 		ret = mpage_da_writepages(mapping, wbc, &mpd);
 
 		ext4_journal_stop(handle);
 
-		if (mpd.retval == -ENOSPC)
+		if (mpd.retval == -ENOSPC) {
+			/* commit the transaction which would
+			 * free blocks released in the transaction
+			 * and try again
+			 */
 			jbd2_journal_force_commit_nested(sbi->s_journal);
-
-		/* reset the retry count */
-		if (ret == MPAGE_DA_EXTENT_TAIL) {
+			wbc->pages_skipped = pages_skipped;
+			ret = 0;
+		} else if (ret == MPAGE_DA_EXTENT_TAIL) {
 			/*
 			 * got one extent now try with
 			 * rest of the pages
 			 */
-			to_write += wbc->nr_to_write;
+			pages_written += mpd.pages_written;
+			wbc->pages_skipped = pages_skipped;
 			ret = 0;
-		} else if (wbc->nr_to_write) {
+		} else if (wbc->nr_to_write)
 			/*
 			 * There is no more writeout needed
 			 * or we requested for a noblocking writeout
 			 * and we found the device congested
 			 */
-			to_write += wbc->nr_to_write;
 			break;
-		}
-		wbc->nr_to_write = to_write;
-	}
-
-	if (!wbc->range_cyclic && (pages_skipped != wbc->pages_skipped)) {
-		/* We skipped pages in this loop */
-		wbc->nr_to_write = to_write +
-				wbc->pages_skipped - pages_skipped;
-		wbc->pages_skipped = pages_skipped;
-		goto restart_loop;
 	}
+	if (pages_skipped != wbc->pages_skipped)
+		printk(KERN_EMERG "This should not happen leaving %s "
+				"with nr_to_write = %ld ret = %d\n",
+				__func__, wbc->nr_to_write, ret);
+
+	/* Update index */
+	index += pages_written;
+	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
+		/*
+		 * set the writeback_index so that range_cyclic
+		 * mode will write it back later
+		 */
+		mapping->writeback_index = index;
 
 out_writepages:
-	wbc->nr_to_write = to_write - nr_to_writebump;
+	wbc->nr_to_write -= nr_to_writebump;
 	return ret;
 }
 


Patches currently in stable-queue which might be from aneesh.kumar@linux.vnet.ibm.com are

queue-2.6.27/ext4-fix-file-fragmentation-during-large-file-write.patch
queue-2.6.27/ext4-use-our-own-write_cache_pages.patch
queue-2.6.27/ext4-implement-range_cyclic-in-ext4_da_writepages-instead-of-write_cache_pages.patch

  parent reply	other threads:[~2010-06-25 23:32 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-28 19:24 [PATCH 2.6.27.y 0/3] ext4 fixes Jayson R. King
2010-05-28 19:26 ` [PATCH 2.6.27.y 1/3] ext4: Use our own write_cache_pages() Jayson R. King
2010-05-29  0:49   ` tytso
2010-05-29  1:41     ` Jayson R. King
2010-05-29  2:21       ` Jayson R. King
2010-05-30 21:25       ` tytso
2010-05-31  6:35         ` Kay Diederichs
2010-06-01 13:54           ` Greg Freemyer
2010-06-01 14:49             ` Theodore Tso
2010-06-01 15:23               ` Kay Diederichs
2010-06-01 20:06               ` Jayson R. King
2010-06-01 22:12                 ` tytso
2010-06-01 20:06         ` Jayson R. King
2010-06-25 23:32   ` Patch "ext4: Use our own write_cache_pages()" has been added to the 2.6.27-stable tree gregkh
2010-05-28 19:26 ` [PATCH 2.6.27.y 2/3] ext4: Fix file fragmentation during large file write Jayson R. King
2010-05-29  1:06   ` tytso
2010-05-29  2:12     ` Jayson R. King
2010-06-25 23:32   ` gregkh [this message]
2010-05-28 19:27 ` [PATCH 2.6.27.y 3/3] ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages Jayson R. King
2010-06-25 23:32   ` Patch "ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages" has been added to the 2.6.27-stable tree gregkh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=12775087352667@site \
    --to=gregkh@suse.de \
    --cc=Kay.Diederichs@uni-konstanz.de \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=david@fromorbit.com \
    --cc=dev@jaysonking.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=stable-commits@vger.kernel.org \
    --cc=stable@kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).