public inbox for linux-ext4@vger.kernel.org
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Theodore Tso <tytso@mit.edu>
Cc: cmm@us.ibm.com, sandeen@redhat.com, linux-ext4@vger.kernel.org
Subject: Re: [PATCH] ext4: Fix small file fragmentation
Date: Fri, 15 Aug 2008 22:01:12 +0530
Message-ID: <20080815163112.GA6511@skywalker>
In-Reply-To: <20080815133803.GL13048@mit.edu>

On Fri, Aug 15, 2008 at 09:38:03AM -0400, Theodore Tso wrote:
> Here's an interesting data point.  Using Chris Mason's compilebench:
> 
> http://oss.oracle.com/~mason/compilebench
> 
> If I use:
> 
> ./compilebench  -D /mnt -i 2 -r 0
> 
> on a 4GB machine such that I have plenty of memory (and nothing gets
> forced to disk due to memory pressure), I hardly see any of the
> small file fragmentation problem (0.8% of the inodes in use on the
> filesystem).  This is with your patch applied.
> 
> However, if I use:
> 
> ./compilebench  -D /mnt -i 10 -r 0
> 
> so that data blocks are getting pushed out due to memory pressure,
> then I see plenty of non-contiguous inodes (8.1% of the inodes in use
> on the filesystem).  So with your patch applied, it seems that we
> still have a problem related to delayed allocation and how the VM
> system is doing its page cleaning.

As I explained in my previous patch, the problem is due to pdflush
background_writeout. When pdflush does the writeout, we may have only
a few dirty pages for the file, and we attempt to write them to disk.
So my attempt in the last patch was to do the following:

a) When allocating blocks, try to stay close to the specified goal block.
b) When ext4_da_writepages is called, make sure nr_to_write is at least
  large enough that we allocate all dirty buffer_heads in a single go.
  nr_to_write is set to 1024 in pdflush background_writeout, and that
  means we may end up calling some inodes' writepages() with really
  small values even though they have more dirty buffer_heads.
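
A minimal userspace sketch of the idea in (b), not the actual ext4 code:
the function name clamp_nr_to_write and its two parameters are
hypothetical and only model raising the page budget so that one
writepages() call covers every dirty page of the inode.

```c
#include <assert.h>

/*
 * Hypothetical model of the nr_to_write adjustment: if the remaining
 * budget is smaller than the number of dirty pages the inode has,
 * raise it so that all of them are allocated in a single pass.
 * A budget that is already large enough is left alone.
 */
static long clamp_nr_to_write(long nr_to_write, long nr_dirty_pages)
{
	assert(nr_to_write >= 0 && nr_dirty_pages >= 0);

	if (nr_to_write < nr_dirty_pages)
		return nr_dirty_pages;
	return nr_to_write;
}
```

With a leftover budget of 3 and 9 dirty pages, the sketch returns 9. With
the full 1024 and 9 dirty pages, it leaves 1024 unchanged.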

What it doesn't handle is:
1) File A has 4 dirty buffer_heads.
2) pdflush tries to write them. We get 4 contiguous blocks.
3) File A now has 5 new dirty buffer_heads.
4) File B now has 6 dirty buffer_heads.
5) pdflush tries to write the 6 dirty buffer_heads of file B and allocates
them next to the earlier file A blocks.
6) pdflush tries to write the 5 dirty buffer_heads of file A and allocates
them after the file B blocks, resulting in a discontinuity.
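
The six steps above can be replayed with a toy model. This is only a
sketch under a strong assumption: the allocator always hands out the next
free blocks (a stand-in for "allocate close to the goal" on an otherwise
empty group). The names file_model, writeout and replay_scenario are made
up for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EXTENTS 8

struct extent { unsigned long start, len; };

struct file_model {
	struct extent ext[MAX_EXTENTS];
	size_t nr_extents;
};

/* Toy allocator state: the next block that has not been handed out yet. */
static unsigned long next_free;

/* Allocate nr_blocks for f, merging with its last extent when adjacent. */
static void writeout(struct file_model *f, unsigned long nr_blocks)
{
	struct extent *last = f->nr_extents ? &f->ext[f->nr_extents - 1] : NULL;

	if (last && last->start + last->len == next_free) {
		last->len += nr_blocks;		/* still contiguous */
	} else {
		assert(f->nr_extents < MAX_EXTENTS);
		f->ext[f->nr_extents].start = next_free;
		f->ext[f->nr_extents].len = nr_blocks;
		f->nr_extents++;
	}
	next_free += nr_blocks;
}

/*
 * Replay the scenario and return how many extents file A ends up with.
 * b_first != 0: file B is flushed before A's second batch (steps 5, 6).
 * b_first == 0: A's second batch is flushed before file B.
 */
static size_t replay_scenario(int b_first)
{
	struct file_model a = { .nr_extents = 0 }, b = { .nr_extents = 0 };

	next_free = 0;
	writeout(&a, 4);			/* step 2: A gets blocks 0-3 */
	if (b_first) {
		writeout(&b, 6);		/* step 5: B gets blocks 4-9 */
		writeout(&a, 5);		/* step 6: A gets blocks 10-14 */
	} else {
		writeout(&a, 5);		/* A gets blocks 4-8 */
		writeout(&b, 6);		/* B gets blocks 9-14 */
	}
	return a.nr_extents;
}
```

In this model replay_scenario(1) leaves file A with 2 extents, while
replay_scenario(0) leaves it with 1, so the order in which the inodes are
flushed decides whether file A stays contiguous.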

I am now testing the patch below, which makes sure newly dirtied inodes
are added to the tail of the dirty inode list:

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 25adfc3..a658690 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -163,7 +163,9 @@ void __mark_inode_dirty(struct inode *inode, int flags)
 		 */
 		if (!was_dirty) {
 			inode->dirtied_when = jiffies;
-			list_move(&inode->i_list, &sb->s_dirty);
+			//list_move(&inode->i_list, &sb->s_dirty);
+			__list_del(inode->i_list.prev, inode->i_list.next);
+			list_add_tail(&inode->i_list, &sb->s_dirty);
 		}
 	}
 out:
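
The effect of the change on list order can be seen with a small userspace
model of the list helpers. This is a sketch, not kernel code: list_init,
unlink_entry, insert_between, move_to_head, move_to_tail, inode_model and
first_after_dirtying are illustrative names. move_to_head mirrors what
list_move() does (insert right after the head) and move_to_tail mirrors
the unlink followed by list_add_tail() (insert right before the head).

```c
#include <assert.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

/* Unlink whatever sits between prev and next. */
static void unlink_entry(struct list_head *prev, struct list_head *next)
{
	next->prev = prev;
	prev->next = next;
}

static void insert_between(struct list_head *entry,
			   struct list_head *prev, struct list_head *next)
{
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

/* Models list_move(): the entry becomes the first element. */
static void move_to_head(struct list_head *entry, struct list_head *head)
{
	unlink_entry(entry->prev, entry->next);
	insert_between(entry, head, head->next);
}

/* Models unlink + list_add_tail(): the entry becomes the last element. */
static void move_to_tail(struct list_head *entry, struct list_head *head)
{
	unlink_entry(entry->prev, entry->next);
	insert_between(entry, head->prev, head);
}

struct inode_model {
	struct list_head i_list;	/* must stay the first member */
	int ino;
};

/*
 * Dirty inodes 1, 2, 3 in that order and return the ino that ends up
 * first on the dirty list.
 */
static int first_after_dirtying(int to_tail)
{
	struct list_head s_dirty;
	struct inode_model in[3];
	int i;

	list_init(&s_dirty);
	for (i = 0; i < 3; i++) {
		in[i].ino = i + 1;
		list_init(&in[i].i_list);	/* self-linked, so unlink is safe */
		if (to_tail)
			move_to_tail(&in[i].i_list, &s_dirty);
		else
			move_to_head(&in[i].i_list, &s_dirty);
	}
	/* i_list is the first member, so the pointers are interchangeable. */
	return ((struct inode_model *)s_dirty.next)->ino;
}
```

With head insertion the list reads 3, 2, 1 from the front. With tail
insertion it reads 1, 2, 3, so first_after_dirtying(0) gives 3 and
first_after_dirtying(1) gives 1.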
