linux-ext4.vger.kernel.org archive mirror
* [PATCH 0/2] ext3: Reduce calling ext3_mark_inode_dirty() for speedup
@ 2012-01-30  8:41 Kazuya Mio
  2012-01-30 20:36 ` Andreas Dilger
  0 siblings, 1 reply; 8+ messages in thread
From: Kazuya Mio @ 2012-01-30  8:41 UTC (permalink / raw)
  To: ext4, Jan Kara, Andreas Dilger

ext3 has a performance problem: parallel writes are too slow.
I looked into this and found that ext3 calls ext3_mark_inode_dirty()
unnecessarily.

The following results show the time taken to write 16 files of 3GB each
using 16 threads. The measurement was performed on Linux 3.3-rc1 on a
4-way server with 512GB of memory.

    filesystem        time(sec)  ext3_mark_inode_dirty() calls
    ---
    ext3              220.5      50,338,104
    ext3 (patched)    196.3      25,169,658
    ext4 (*1)         190.3      28,465,799

    *1 ext4-specific options (delalloc, extents, and so on) disabled
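
As a rough cross-check of the call counts in the table, the sketch below divides them by the number of pages written. It assumes 4 KiB pages and that "3GB" means 3 * 2**30 bytes; neither is stated in the message, and the function names are made up for illustration. Under those assumptions the unpatched count comes out at about four calls per page and the patched count at about two.

```python
# Assumptions (not stated in the thread): 4 KiB pages, "3GB" = 3 * 2**30 bytes.
PAGE_SIZE = 4096
FILE_SIZE = 3 * 2**30
NUM_FILES = 16


def pages_written(num_files=NUM_FILES, file_size=FILE_SIZE, page_size=PAGE_SIZE):
    """Total number of pages written by the whole benchmark run."""
    return num_files * file_size // page_size


def calls_per_page(total_calls):
    """Average number of ext3_mark_inode_dirty() calls per page written."""
    return total_calls / pages_written()


def speedup_percent(before_sec, after_sec):
    """Reduction in elapsed time, as a percentage of the original time."""
    return 100.0 * (before_sec - after_sec) / before_sec


if __name__ == "__main__":
    print("pages written:         ", pages_written())
    print("calls/page (ext3):     ", round(calls_per_page(50_338_104), 3))
    print("calls/page (patched):  ", round(calls_per_page(25_169_658), 3))
    print("time saved by patch (%):", round(speedup_percent(220.5, 196.3), 1))
```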

ext3 in RHEL5.5 shows the difference in performance clearly.
Writing by the same method takes 533 seconds, whereas writing with one thread
takes 191 seconds.

Every time we write one page, ext3 calls ext3_mark_inode_dirty() four times.
Two of these calls are unnecessary in many cases, so I added conditions so that
the function is called only when necessary.

      sys_write
        ...
          __generic_file_aio_write
            file_update_time
              mark_inode_dirty_sync
            generic_file_buffered_write
              ...
                ext3_write_begin
                  ext3_get_blocks_handle
                    ...
                      ext3_new_blocks
                        vfs_dq_alloc_block
    1)                    mark_inode_dirty
                        vfs_dq_free_block
    2)                    mark_inode_dirty      <-- patch 1/2
                    ext3_splice_branch
    3)                ext3_mark_inode_dirty     <-- patch 2/2
                ext3_ordered_write_end
                  update_file_sizes
    4)              mark_inode_dirty
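
The general idea of "call the function only when necessary" can be illustrated with a toy model. The patches themselves are not shown in this message, so the condition used here (skip the dirty call when zero blocks are released) is only an assumed example, and ToyInode and the release_* helpers are hypothetical names, not kernel code.

```python
class ToyInode:
    """Hypothetical stand-in for an inode; counts how often it is marked dirty."""

    def __init__(self):
        self.charged_blocks = 0
        self.dirty_calls = 0

    def mark_dirty(self):
        self.dirty_calls += 1


def release_unconditional(inode, nblocks):
    """Always marks the inode dirty, even when nothing changed."""
    inode.charged_blocks -= nblocks
    inode.mark_dirty()


def release_if_changed(inode, nblocks):
    """Marks the inode dirty only when the release actually changes it."""
    if nblocks == 0:
        return
    inode.charged_blocks -= nblocks
    inode.mark_dirty()


def simulate(release, pages):
    """Write `pages` pages; each charges one block and gives back zero."""
    inode = ToyInode()
    for _ in range(pages):
        inode.charged_blocks += 1  # the inode really changes here
        inode.mark_dirty()
        release(inode, 0)          # assumed common case: nothing to give back
    return inode.dirty_calls


if __name__ == "__main__":
    print("unconditional:", simulate(release_unconditional, 1000))
    print("conditional:  ", simulate(release_if_changed, 1000))
```

In this model the conditional variant halves the number of dirty calls, which is the same kind of saving the table reports, though for reasons the model does not attempt to reproduce.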

Regards,
Kazuya Mio


* Re: [PATCH 0/2] ext3: Reduce calling ext3_mark_inode_dirty() for speedup
  2012-01-30  8:41 [PATCH 0/2] ext3: Reduce calling ext3_mark_inode_dirty() for speedup Kazuya Mio
@ 2012-01-30 20:36 ` Andreas Dilger
  2012-01-31  5:03   ` Kazuya Mio
  2012-02-01  8:35   ` Kazuya Mio
  0 siblings, 2 replies; 8+ messages in thread
From: Andreas Dilger @ 2012-01-30 20:36 UTC (permalink / raw)
  To: Kazuya Mio; +Cc: ext4, Jan Kara

On 2012-01-30, at 1:41 AM, Kazuya Mio wrote:
> ext3 has a performance problem: parallel writes are too slow.
> I looked into this and found that ext3 calls ext3_mark_inode_dirty()
> unnecessarily.
> 
> The following results show the time taken to write 16 files of 3GB each
> using 16 threads. The measurement was performed on Linux 3.3-rc1 on a
> 4-way server with 512GB of memory.
> 
>    filesystem        time(sec)  ext3_mark_inode_dirty() calls
>    ---
>    ext3              220.5      50,338,104
>    ext3 (patched)    196.3      25,169,658
>    ext4 (*1)         190.3      28,465,799
> 
>    *1 ext4-specific options (delalloc, extents, and so on) disabled

Can you please run this same measurement on ext4 formatted and running
with the default options?  I'd like to know if this is still a problem
in ext4 or not.


There is a better mechanism to handle the inode updates that could be
implemented if there is still a real performance concern.  There are
journal pre-commit callbacks on the buffer heads that could be run to
copy the modified data from the VFS inodes to the buffer blocks.

This would reduce the ext4_mark_inode_dirty() to setting a single dirty
flag in the inode, and updating the VFS inode ctime.  Only once per
journal commit would the VFS inode be copied into the buffer, greatly
reducing the overhead of these operations.  This should also noticeably
reduce the overhead from metadata checksums, since the checksum would
only be computed once for each inode per journal commit.
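
The difference between copying the inode on every update and copying it once per journal commit can be sketched with a small counting model. This is not the jbd2 API: ToyJournal, ToyInode, and run() are hypothetical names, and the "copy" is just a counter standing in for copying the VFS inode into its buffer.

```python
class ToyInode:
    """Hypothetical in-memory inode with a single dirty flag."""

    def __init__(self):
        self.size = 0
        self.dirty = False


class ToyJournal:
    """Hypothetical journal that counts inode-to-buffer copies."""

    def __init__(self):
        self.dirty_inodes = []
        self.copies = 0

    def mark_inode_dirty_eager(self, inode):
        # Eager scheme: copy the inode into its buffer on every update.
        self.copies += 1

    def mark_inode_dirty_deferred(self, inode):
        # Deferred scheme: only set a flag and remember the inode.
        if not inode.dirty:
            inode.dirty = True
            self.dirty_inodes.append(inode)

    def commit(self):
        # Pre-commit step: copy each flagged inode exactly once.
        for inode in self.dirty_inodes:
            self.copies += 1
            inode.dirty = False
        self.dirty_inodes.clear()


def run(deferred, updates_per_commit, commits):
    """Return how many inode copies a sequence of updates and commits costs."""
    journal = ToyJournal()
    inode = ToyInode()
    for _ in range(commits):
        for _ in range(updates_per_commit):
            inode.size += 4096
            if deferred:
                journal.mark_inode_dirty_deferred(inode)
            else:
                journal.mark_inode_dirty_eager(inode)
        journal.commit()
    return journal.copies


if __name__ == "__main__":
    print("eager copies:   ", run(False, 1000, 3))
    print("deferred copies:", run(True, 1000, 3))
```

In the deferred scheme the number of copies depends only on the number of commits, not on how many times the inode was updated in between.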

> ext3 in RHEL5.5 shows the difference in performance clearly.
> Writing by the same method takes 533 seconds, whereas writing with one thread
> takes 191 seconds.
> 
> Every time we write one page, ext3 calls ext3_mark_inode_dirty() four times.
> Two of these calls are unnecessary in many cases, so I added conditions so that
> the function is called only when necessary.
> 
>      sys_write
>        ...
>          __generic_file_aio_write
>            file_update_time
>              mark_inode_dirty_sync
>            generic_file_buffered_write
>              ...
>                ext3_write_begin
>                  ext3_get_blocks_handle
>                    ...
>                      ext3_new_blocks
>                        vfs_dq_alloc_block
>    1)                    mark_inode_dirty
>                        vfs_dq_free_block
>    2)                    mark_inode_dirty      <-- patch 1/2
>                    ext3_splice_branch
>    3)                ext3_mark_inode_dirty     <-- patch 2/2
>                ext3_ordered_write_end
>                  update_file_sizes
>    4)              mark_inode_dirty
> 
> Regards,
> Kazuya Mio


Cheers, Andreas


* Re: [PATCH 0/2] ext3: Reduce calling ext3_mark_inode_dirty() for speedup
  2012-01-30 20:36 ` Andreas Dilger
@ 2012-01-31  5:03   ` Kazuya Mio
  2012-02-01  8:35   ` Kazuya Mio
  1 sibling, 0 replies; 8+ messages in thread
From: Kazuya Mio @ 2012-01-31  5:03 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: ext4, Jan Kara

2012/01/31 5:36, Andreas Dilger wrote:
> Can you please run this same measurement on ext4 formatted and running
> with the default options?  I'd like to know if this is still a problem
> in ext4 or not.

Sure. I will send the result as soon as possible.

Regards,
Kazuya Mio


* Re: [PATCH 0/2] ext3: Reduce calling ext3_mark_inode_dirty() for speedup
  2012-01-30 20:36 ` Andreas Dilger
  2012-01-31  5:03   ` Kazuya Mio
@ 2012-02-01  8:35   ` Kazuya Mio
  2012-02-02 22:36     ` Andreas Dilger
  1 sibling, 1 reply; 8+ messages in thread
From: Kazuya Mio @ 2012-02-01  8:35 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: ext4, Jan Kara

2012/01/31 5:36, Andreas Dilger wrote:
> Can you please run this same measurement on ext4 formatted and running
> with the default options?  I'd like to know if this is still a problem
> in ext4 or not.

I performed the same measurement on ext4 with the default options.
Here are the results:

    filesystem        time(sec)  extX_mark_inode_dirty() calls
    ---
    ext3              220.5      50,338,104
    ext3 (patched)    196.3      25,169,658
    ext4 (*1)         190.3      28,465,799
    ext4 (*2)         201.5      27,963,473
    ext4 (default)    223.3      14,026,118

    *1 ext4-specific options (delalloc, extents, and so on) disabled
    *2 only the delalloc option disabled

Regards,
Kazuya Mio


* Re: [PATCH 0/2] ext3: Reduce calling ext3_mark_inode_dirty() for speedup
  2012-02-01  8:35   ` Kazuya Mio
@ 2012-02-02 22:36     ` Andreas Dilger
  2012-02-03  7:49       ` Kazuya Mio
  0 siblings, 1 reply; 8+ messages in thread
From: Andreas Dilger @ 2012-02-02 22:36 UTC (permalink / raw)
  To: Kazuya Mio; +Cc: ext4, Jan Kara

On 2012-02-01, at 1:35 AM, Kazuya Mio wrote:
> 2012/01/31 5:36, Andreas Dilger wrote:
>> Can you please run this same measurement on ext4 formatted and running
>> with the default options?  I'd like to know if this is still a problem
>> in ext4 or not.
> 
> I performed the same measurement on ext4 with the default options.

Kazuya,
thank you for running this test.  I'm unfortunately confused by the result.

> Here is its result:
> 
>   filesystem        time(sec)  call extX_mark_inode_dirty(times)
>   ---
>   ext3              220.5      50,338,104
>   ext3 (patched)    196.3      25,169,658
>   ext4 (*1)         190.3      28,465,799
>   ext4 (*2)         201.5      27,963,473
>   ext4 (default)    223.3      14,026,118
> 
>   *1 disable ext4-specific options (delalloc, extent, and so on)
>   *2 disable only delalloc option

This shows that ext4 with extents+delalloc is _slower_ than ext3, which
is very strange.  In other similar tests of write performance (see
http://downloads.linux.hp.com/~enw/ext4/3.2/large_file_creates.html,
showing multi-threaded 1GB file writes) ext4 is much faster than ext3.

Looking at your original email, is ext4 being tested on a RHEL 5.5
(2.6.18) kernel, or a more recent kernel?  It would be more useful
to run this on a more modern kernel, since the ext4 code backported
to RHEL5 was barely supporting delalloc at all, if I remember correctly.

The good news is that the number of extN_mark_inode_dirty() calls is
far lower in ext4 than in ext3, though this doesn't seem to be the
primary factor in the performance in this case.
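
The observation that the call count is not the primary factor can be read straight off the quoted table. The sketch below only re-arranges those five rows relative to unpatched ext3; the helper names are made up for illustration, and no new measurements are involved.

```python
# Rows copied from the table quoted above: (configuration, seconds, calls).
ROWS = [
    ("ext3", 220.5, 50_338_104),
    ("ext3 (patched)", 196.3, 25_169_658),
    ("ext4 (*1)", 190.3, 28_465_799),
    ("ext4 (*2)", 201.5, 27_963_473),
    ("ext4 (default)", 223.3, 14_026_118),
]


def relative_to_baseline(rows, baseline="ext3"):
    """Express each row's time and call count as a fraction of the baseline's."""
    _, base_time, base_calls = next(r for r in rows if r[0] == baseline)
    return [(name, t / base_time, c / base_calls) for name, t, c in rows]


def fewest_calls(rows):
    """Name of the configuration with the lowest call count."""
    return min(rows, key=lambda r: r[2])[0]


def slowest(rows):
    """Name of the configuration with the longest elapsed time."""
    return max(rows, key=lambda r: r[1])[0]


if __name__ == "__main__":
    for name, rel_time, rel_calls in relative_to_baseline(ROWS):
        print(f"{name:16s} time x{rel_time:.2f}  calls x{rel_calls:.2f}")
    print("fewest calls:", fewest_calls(ROWS))
    print("slowest:     ", slowest(ROWS))
```

The configuration with the fewest calls is also the slowest one in this data set, which is why the call count alone cannot explain the timings.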

Cheers, Andreas



* Re: [PATCH 0/2] ext3: Reduce calling ext3_mark_inode_dirty() for speedup
  2012-02-02 22:36     ` Andreas Dilger
@ 2012-02-03  7:49       ` Kazuya Mio
  2012-02-03 13:28         ` Yongqiang Yang
  0 siblings, 1 reply; 8+ messages in thread
From: Kazuya Mio @ 2012-02-03  7:49 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: ext4, Jan Kara

2012/02/03 7:36, Andreas Dilger wrote:
>>   filesystem        time(sec)  call extX_mark_inode_dirty(times)
>>   ---
>>   ext3              220.5      50,338,104
>>   ext3 (patched)    196.3      25,169,658
>>   ext4 (*1)         190.3      28,465,799
>>   ext4 (*2)         201.5      27,963,473
>>   ext4 (default)    223.3      14,026,118
>>
>>   *1 disable ext4-specific options (delalloc, extent, and so on)
>>   *2 disable only delalloc option
> This shows that ext4 with extents+delalloc is _slower_ than ext3, which
> is very strange.  In other similar tests of write performance (see

Another point is that ext4+delalloc is slower than ext4+nodelalloc.

> http://downloads.linux.hp.com/~enw/ext4/3.2/large_file_creates.html,
> showing multi-threaded 1GB file writes) ext4 is much faster than ext3.

I guess the write buffer size in my test differs from ffsb's.
My test issues a write system call for every block that is allocated,
so I think it is close to a stress test.
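
A workload of this shape (several threads, each writing its own file one block per write call) might look like the sketch below. The actual test program is not shown in this thread, so everything here is an assumption: the 4 KiB block size, the function names, and the use of Python in place of whatever the real test was written in. The sizes in the example run are deliberately tiny.

```python
import os
import tempfile
import threading

BLOCK_SIZE = 4096  # assumed filesystem block size


def write_file_blockwise(path, file_size, block_size=BLOCK_SIZE):
    """Write file_size bytes to path, issuing one write() call per block."""
    buf = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < file_size:
            chunk = min(block_size, file_size - written)
            # os.write may write fewer bytes than requested; the loop copes.
            written += os.write(fd, buf[:chunk])
    finally:
        os.close(fd)


def run_parallel_writes(directory, num_threads, file_size):
    """Start one writer thread per file and wait for all of them to finish."""
    paths = [os.path.join(directory, f"file{i}") for i in range(num_threads)]
    threads = [
        threading.Thread(target=write_file_blockwise, args=(p, file_size))
        for p in paths
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Scaled-down run: 4 threads x 1 MiB instead of 16 threads x 3GB.
        paths = run_parallel_writes(d, num_threads=4, file_size=1 << 20)
        print([os.path.getsize(p) for p in paths])
```

With one write call per block, every call goes through the full write path for a single block, which is why such a test stresses per-write overhead far more than a benchmark that uses large write buffers.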

> Looking at your original email, is ext4 being tested on a RHEL 5.5
> (2.6.18) kernel, or a more recent kernel?  It would be more useful
> to run this on a more modern kernel, since the ext4 code backported
> to RHEL5 was barely supporting delalloc at all, if I remember correctly.

I tested on a recent kernel (3.3-rc1).
I also tested on RHEL5.5, where ext3 was much slower than on the
recent kernel.

   filesystem        time(sec)
   ---
   ext3(RHEL5.5)     438.6
   ext3(3.3-rc1)     220.5

Regards,
Kazuya Mio


* Re: [PATCH 0/2] ext3: Reduce calling ext3_mark_inode_dirty() for speedup
  2012-02-03  7:49       ` Kazuya Mio
@ 2012-02-03 13:28         ` Yongqiang Yang
  2012-02-06  4:13           ` Kazuya Mio
  0 siblings, 1 reply; 8+ messages in thread
From: Yongqiang Yang @ 2012-02-03 13:28 UTC (permalink / raw)
  To: Kazuya Mio; +Cc: Andreas Dilger, ext4, Jan Kara

On Fri, Feb 3, 2012 at 3:49 PM, Kazuya Mio <k-mio@sx.jp.nec.com> wrote:
> 2012/02/03 7:36, Andreas Dilger wrote:
>>>
>>>  filesystem        time(sec)  call extX_mark_inode_dirty(times)
>>>  ---
>>>  ext3              220.5      50,338,104
>>>  ext3 (patched)    196.3      25,169,658
>>>  ext4 (*1)         190.3      28,465,799
>>>  ext4 (*2)         201.5      27,963,473
>>>  ext4 (default)    223.3      14,026,118
>>>
>>>  *1 disable ext4-specific options (delalloc, extent, and so on)
>>>  *2 disable only delalloc option
>>
>> This shows that ext4 with extents+delalloc is _slower_ than ext3, which
>> is very strange.  In other similar tests of write performance (see
>
>
> One more thing is that ext4+delalloc is slower than ext4+nodelalloc.
And according to the data, ext4+extent may also be slower than ext4+noextent.

What is the size of the fs, and what kind of device was tested?

Yongqiang.
>
>
>> http://downloads.linux.hp.com/~enw/ext4/3.2/large_file_creates.html,
>> showing multi-threaded 1GB file writes) ext4 is much faster than ext3.
>
>
> I guess write buffer size of my test is different from ffsb's one.
> My test calls write systemcall every time one block is allocated,
> so it is close to the stress test I think.
>
>
>> Looking at your original email, is ext4 being tested on a RHEL 5.5
>> (2.6.18) kernel, or a more recent kernel?  It would be more useful
>> to run this on a more modern kernel, since the ext4 code backported
>> to RHEL5 was barely supporting delalloc at all, if I remember correctly.
>
>
> I tested on the recent kernel (3.3-rc1).
> I also tested on RHEL5.5, and its result showed that ext3 was much slower
> than the recent kernel's one.
>
>  filesystem        time(sec)
>  ---
>  ext3(RHEL5.5)     438.6
>  ext3(3.3-rc1)     220.5
>
> Regards,
> Kazuya Mio
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Best Wishes
Yongqiang Yang


* Re: [PATCH 0/2] ext3: Reduce calling ext3_mark_inode_dirty() for speedup
  2012-02-03 13:28         ` Yongqiang Yang
@ 2012-02-06  4:13           ` Kazuya Mio
  0 siblings, 0 replies; 8+ messages in thread
From: Kazuya Mio @ 2012-02-06  4:13 UTC (permalink / raw)
  To: Yongqiang Yang; +Cc: Andreas Dilger, ext4, Jan Kara

2012/02/03 22:28, Yongqiang Yang wrote:
> On Fri, Feb 3, 2012 at 3:49 PM, Kazuya Mio<k-mio@sx.jp.nec.com>  wrote:
>> 2012/02/03 7:36, Andreas Dilger wrote:
>>>>
>>>>   filesystem        time(sec)  call extX_mark_inode_dirty(times)
>>>>   ---
>>>>   ext3              220.5      50,338,104
>>>>   ext3 (patched)    196.3      25,169,658
>>>>   ext4 (*1)         190.3      28,465,799
>>>>   ext4 (*2)         201.5      27,963,473
>>>>   ext4 (default)    223.3      14,026,118
>>>>
>>>>   *1 disable ext4-specific options (delalloc, extent, and so on)
>>>>   *2 disable only delalloc option
>>>
>>> This shows that ext4 with extents+delalloc is _slower_ than ext3, which
>>> is very strange.  In other similar tests of write performance (see
>>
>>
>> One more thing is that ext4+delalloc is slower than ext4+nodelalloc.
> And according to the data, maybe ext4+extent is also slower than ext4+noextent.
>
> What's the size of the fs?  and what kind of the tested device?

I tested on Express5800/A1080a-S (4-way server with 8-core processors).
Filesystem size was 100GB. I used the 266GB LUN from the FC-SAN storage.

Regards,
Kazuya Mio

