* Large File Deletion Comparison (ext3, ext4, XFS)
From: Valerie Clement @ 2007-04-27 13:41 UTC
  To: ext4 development

As asked by Alex, I included in the test results the file fragmentation 
level and the number of I/Os done during the file deletion.

Here are the results obtained with a not very fragmented 100-GB file:

                  |     ext3       ext4 + extents      xfs
------------------------------------------------------------
  nb of fragments |     796             798             15
  elapsed time    |  2m0.306s        0m11.127s       0m0.553s
                  |
  blks read       |  206600            6416            352
  blks written    |   13592           13064            104
------------------------------------------------------------


And with a more fragmented 100-GB file:

                  |     ext3       ext4 + extents       xfs
------------------------------------------------------------
  nb of fragments |   20297           19841            234
  elapsed time    | 2m18.914s        0m27.429s      0m0.892s
                  |
  blks read       |  225624           25432            592
  blks written    |   52120           50664            872
------------------------------------------------------------


More details on our web site:
http://www.bullopensource.org/ext4/20070404/FileDeletion.html

    Valérie


* Re: Large File Deletion Comparison (ext3, ext4, XFS)
From: Theodore Tso @ 2007-04-27 18:33 UTC
  To: Valerie Clement; +Cc: ext4 development

On Fri, Apr 27, 2007 at 03:41:19PM +0200, Valerie Clement wrote:
> As asked by Alex, I included in the test results the file fragmentation 
> level and the number of I/Os done during the file deletion.
> 
> Here are the results obtained with a not very fragmented 100-GB file:
> 
>                  |     ext3       ext4 + extents      xfs
> ------------------------------------------------------------
>  nb of fragments |     796             798             15
>  elapsed time    |  2m0.306s        0m11.127s       0m0.553s
>                  |
>  blks read       |  206600            6416            352
>  blks written    |   13592           13064            104
> ------------------------------------------------------------

The metablockgroups feature should help the file fragmentation level
with extents.  It's easy enough to enable this for ext4 (we just need
to remove some checks in ext4_check_descriptors), so we should just do
it.

						- Ted


* Re: Large File Deletion Comparison (ext3, ext4, XFS)
From: Alex Tomas @ 2007-04-27 18:51 UTC
  To: Valerie Clement; +Cc: ext4 development

Valerie Clement wrote:
> As asked by Alex, I included in the test results the file fragmentation 
> level and the number of I/Os done during the file deletion.
> 
> Here are the results obtained with a not very fragmented 100-GB file:
> 
>                  |     ext3       ext4 + extents      xfs
> ------------------------------------------------------------
>  nb of fragments |     796             798             15
>  elapsed time    |  2m0.306s        0m11.127s       0m0.553s
>                  |
>  blks read       |  206600            6416            352
>  blks written    |   13592           13064            104
> ------------------------------------------------------------


Hmm, if I did the math right then, in theory, a 100-GB file could be
placed using ~850 extents: 100 * 1024 / 120, where 120 MB is the
amount of data one can allocate in a regular block group.  850 extents
would require 3 leaf blocks (340 extents per block) plus 1 index
block.  We'd need to read those 4 blocks, plus all ~850 involved
bitmaps, plus some blocks of group descriptors, so we probably need to
tune balloc.  Then we'd improve the removal time by a factor of six
(6400 blocks to read vs. ~900-1000 blocks to read)?
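
The same arithmetic as a quick userspace sketch (a minimal
illustration only, assuming 4-KB blocks, 12-byte on-disk extents and
~120 MB of allocatable data per 128-MB block group):

  /* Back-of-the-envelope estimate; not ext4 code. */
  #include <stdio.h>

  int main(void)
  {
          long file_mb   = 100 * 1024;  /* 100 GB expressed in MB */
          long per_group = 120;         /* assumed usable MB per block group */
          long extents   = (file_mb + per_group - 1) / per_group;  /* ~854 */
          long per_leaf  = 340;         /* ~4096/12 extents per leaf block */
          long leaves    = (extents + per_leaf - 1) / per_leaf;    /* 3 */
          long index     = 1;           /* one index block above the leaves */
          long bitmaps   = extents;     /* one block bitmap per group used */

          printf("extents      ~%ld\n", extents);
          printf("tree blocks   %ld\n", leaves + index);
          printf("blocks read  ~%ld (plus group descriptors)\n",
                 leaves + index + bitmaps);
          return 0;
  }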

thanks, Alex


* Re: Large File Deletion Comparison (ext3, ext4, XFS)
From: Andreas Dilger @ 2007-04-27 20:33 UTC
  To: Theodore Tso; +Cc: Valerie Clement, ext4 development

On Apr 27, 2007  14:33 -0400, Theodore Tso wrote:
> > Here are the results obtained with a not very fragmented 100-GB file:
> > 
> >                  |     ext3       ext4 + extents      xfs
> > ------------------------------------------------------------
> >  nb of fragments |     796             798             15
> >  elapsed time    |  2m0.306s        0m11.127s       0m0.553s
> >                  |
> >  blks read       |  206600            6416            352
> >  blks written    |   13592           13064            104
> > ------------------------------------------------------------
> 
> The metablockgroups feature should help the file fragmentation level
> with extents.  It's easy enough to enable this for ext4 (we just need
> to remove some checks in ext4_check_descriptors), so we should just do
> it.

While I agree that the META_BG feature would help in this case
(100 GB / 128 MB is in fact the ~800 fragments shown), I don't think
that is the major performance hit.

The fact that we need to read 6000 blocks and write 13000 blocks is
the more serious part.  I assume that since there are only 800
fragments there should be only 800 extents.  We can fit
(4096 / 12 - 1) = 340 extents into each leaf block, and 4 index
entries into the inode, so all 800 extents should fit in only 3 leaf
blocks.  It would be useful to know where those 6416 block reads are
going in the extent case.

I suspect that is because the "tail first" truncation mechanism of ext3
causes it to zero out FAR more blocks than needed.  With extents and a
default 128MB journal we should be able to truncate + unlink a file with
only writes to the inode and the 800 bitmap + gdt blocks.  The reads should
also be limited to the bitmap blocks and extent indexes (gdt being read at
mount time).

What is needed is for truncate to walk the inode block tree (extents
or indirect blocks), count the bitmap + gdt blocks dirtied, and then
try to do the whole truncate under a single transaction.  That avoids
any need for truncate to be "restartable", and then there is no need
to zero out the indirect blocks from the end one at a time.
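
Something along these lines, as a rough sketch only (all helper names
below are hypothetical, not actual ext4/JBD code):

  static int truncate_in_one_transaction(struct inode *inode)
  {
          unsigned long bitmaps = 0, gdt = 0;  /* blocks we will dirty */
          handle_t *handle;
          int err;

          /* Pass 1: walk the extent tree (or indirect blocks) read-only
           * and count the bitmap and group descriptor blocks that
           * freeing will touch.  (hypothetical helper) */
          err = count_blocks_to_dirty(inode, &bitmaps, &gdt);
          if (err)
                  return err;

          /* Pass 2: reserve credits for everything up front, so the
           * truncate never has to be restartable and no indirect or
           * leaf block needs to be zeroed from the tail one at a time. */
          handle = start_whole_truncate(inode, 1 /* inode */ + bitmaps + gdt);
          if (IS_ERR(handle))
                  return PTR_ERR(handle);

          err = free_all_inode_blocks(inode, handle);  /* hypothetical */
          stop_whole_truncate(handle);
          return err;
  }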

Doing the bitmap read/write will definitely be more efficient with META_BG,
but that doesn't explain the other 19k blocks undergoing IO.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


* Re: Large File Deletion Comparison (ext3, ext4, XFS)
From: Andreas Dilger @ 2007-04-27 20:38 UTC
  To: Valerie Clement; +Cc: ext4 development

On Apr 27, 2007  15:41 +0200, Valerie Clement wrote:
> As asked by Alex, I included in the test results the file fragmentation 
> level and the number of I/Os done during the file deletion.
> 
> Here are the results obtained with a not very fragmented 100-GB file:
> 
>                  |     ext3       ext4 + extents      xfs
> ------------------------------------------------------------
>  nb of fragments |     796             798             15
>  elapsed time    |  2m0.306s        0m11.127s       0m0.553s
>                  |
>  blks read       |  206600            6416            352
>  blks written    |   13592           13064            104
> ------------------------------------------------------------
> 
> 
> And with a more fragmented 100-GB file:
> 
>                  |     ext3       ext4 + extents       xfs
> ------------------------------------------------------------
>  nb of fragments |   20297           19841            234
>  elapsed time    | 2m18.914s        0m27.429s      0m0.892s
>                  |
>  blks read       |  225624           25432            592
>  blks written    |   52120           50664            872
> ------------------------------------------------------------
> 
> 
> More details on our web site:
> http://www.bullopensource.org/ext4/20070404/FileDeletion.html

Ah, one thing that is mentioned only at the URL is that the "IO count"
is in units of 512-byte sectors.  In the case of XFS, logical
journaling avoids a huge amount of double writes to the journal and
then to the filesystem.  I still think ext4 could do better than it
currently does.
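
For scale, taking those read counts as 512-byte sectors:

  ext3:  206600 sectors * 512 B  ~= 101 MB read
  ext4:    6416 sectors * 512 B  ~=   3 MB read (~800 4-KB blocks)
  xfs:      352 sectors * 512 B  ~= 176 KB read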

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


* Re: Large File Deletion Comparison (ext3, ext4, XFS)
From: Alex Tomas @ 2007-04-27 20:48 UTC
  To: Andreas Dilger; +Cc: Valerie Clement, ext4 development

Andreas Dilger wrote:
> Ah, one thing that is only mentioned in the URL is that the "IO count" is
> in units of 512-byte sectors.  In the case of XFS doing logical journaling
> this avoids a huge amount of double writes to the journal and then to the
> filesystem.  I still think ext4 could do better than it currently does.

I thought about this in the context of huge directories, where the
working set of blocks is very large and doesn't fit in the journal,
causing frequent commits.  Two ideas I was thinking of are:
1) journal the "change" where possible; 2) compress the whole
transaction to be written to the journal.
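
A toy userspace illustration of the second idea only (compressing a
transaction's blocks before they go to the journal); it uses zlib and
made-up buffer contents, nothing like real JBD code:

  #include <stdio.h>
  #include <string.h>
  #include <zlib.h>

  #define BLOCK_SIZE 4096
  #define NBLOCKS    64                 /* blocks in this fake transaction */

  int main(void)
  {
          static unsigned char txn[NBLOCKS * BLOCK_SIZE];  /* dirty blocks */
          unsigned char out[NBLOCKS * BLOCK_SIZE + 1024];  /* worst case + slack */
          uLongf outlen = sizeof(out);

          /* Metadata blocks are mostly zeroes and repeated patterns,
           * so they tend to compress very well. */
          memset(txn, 0, sizeof(txn));
          strcpy((char *)txn, "fake bitmap / group descriptor contents");

          if (compress(out, &outlen, txn, sizeof(txn)) != Z_OK)
                  return 1;

          /* One compressed record instead of NBLOCKS full blocks. */
          printf("journal write: %zu -> %lu bytes\n",
                 sizeof(txn), (unsigned long)outlen);
          return 0;
  }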

thanks, Alex

