From: Curt Wohlgemuth <curtw@google.com>
To: Eric Sandeen <sandeen@redhat.com>
Cc: Xiang Wang <xiangw@google.com>, linux-ext4@vger.kernel.org
Subject: Re: Using O_DIRECT in ext4
Date: Tue, 21 Jul 2009 07:45:58 -0700
Message-ID: <6601abe90907210745k3730f74dq62f1fe6539722b4d@mail.gmail.com>
In-Reply-To: <4A6538DB.5050202@redhat.com>

On Mon, Jul 20, 2009 at 8:41 PM, Eric Sandeen <sandeen@redhat.com> wrote:
> Xiang Wang wrote:
>> Hi,
>>
>> Recently I've been experimenting with O_DIRECT in ext4 to get a
>> feel for how much file fragmentation it generates.
>>
>> On a newly formatted ext4 partition (no journal), I created a top-level
>> directory and under this top-level directory I ran a test program to
>> generate some files.
>>
>> The test program does the following:
>> -- create multiple threads (in my test case: 16 threads)
>> -- each thread creates a file with the O_DIRECT flag and keeps
>> extending the file to 1MB
>> Since these threads run concurrently, they compete in block allocation.
>>
>> After the program ran to completion, I ran filefrag on each file and
>> measured how many extents there are in the file.
>> And here is a sample result:
>> file0: 6 extents found
>> file1: 20 extents found
>> file2: 7 extents found
>> file3: 6 extents found
>> file4: 6 extents found
>> file5: 5 extents found
>> file6: 6 extents found
>> file7: 20 extents found
>> file8: 20 extents found
>> file9: 20 extents found
>> file10: 20 extents found
>> file11: 20 extents found
>> file12: 20 extents found
>> file13: 19 extents found
>> file14: 19 extents found
>> file15: 19 extents found
>>
>> Looks like these files are quite heavily fragmented.
>
> Multiple parallel extending DIOs in a single dir is a tough case for a
> filesystem - it has no hints about what to do, and can't use delalloc to
> wait to see what's happening; it just has to allocate things as they
> come, more or less.
>
>> For comparison, I did the same experiment on an ext2 partition,
>> resulting in each file having only 1 extent.
>
> Interesting, not sure I would have expected that.

Same with us; we're looking into more variables to understand it.

>> I also ran the experiment using buffered writes (by removing the
>> O_DIRECT flag) on ext2 and ext4, both resulting in each file having
>> only 1 extent.
>
> Delayed allocation at work, I suppose.
>
>> I am wondering whether this kind of file fragmentation is already a
>> known issue in ext4 when O_DIRECT is used? Is it something by design?
>> Since it seems like ext2 does not have this issue under my test case,
>> is it necessary that we make the behavior of ext4 similar to ext2
>> under situations like this?
>
> Is this representative of a real workload?

Not exactly perhaps, but we do have apps that are showing
significantly more fragmentation in their files on ext4 than with
ext2, while using O_DIRECT (e.g., 8 extents on ext4 vs 1 on ext2, as
reported by filefrag). The experiment above is synthetic, but fairly
representative.

(Hence the related questions about fallocate, since this is one
possible, though ugly, workaround.)

Curt
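
For readers who want to try the experiment, the test described in the
quoted message, together with the fallocate workaround mentioned above,
might be sketched roughly as follows. This is not the original test
program, which was not posted in this message. The 4096-byte write size,
the file names, and the function names extend_file and run_test are
assumptions made for illustration, and Python threads may not contend
for block allocation as aggressively as native threads would.

```python
import mmap
import os
import threading

FILE_SIZE = 1 << 20   # 1 MB per file, as in the test described above
CHUNK = 4096          # assumed write size; the thread does not state it
NUM_THREADS = 16      # thread count from the described test


def extend_file(path, preallocate=False, direct=True):
    """Grow one file to FILE_SIZE in CHUNK-sized writes."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if direct:
        flags |= os.O_DIRECT  # Linux-only; needs aligned buffers and sizes
    fd = os.open(path, flags, 0o644)
    try:
        if preallocate:
            # Ask the filesystem to reserve the whole range up front,
            # instead of allocating a little on every extending write.
            os.posix_fallocate(fd, 0, FILE_SIZE)
        # An anonymous mmap is page-aligned, which O_DIRECT requires.
        buf = mmap.mmap(-1, CHUNK)
        try:
            buf.write(b"x" * CHUNK)
            for offset in range(0, FILE_SIZE, CHUNK):
                os.pwrite(fd, buf, offset)
        finally:
            buf.close()
    finally:
        os.close(fd)


def run_test(directory, preallocate=False, direct=True):
    """Run NUM_THREADS concurrent writers in one directory; return the paths."""
    paths = [os.path.join(directory, f"file{i}") for i in range(NUM_THREADS)]
    threads = [
        threading.Thread(target=extend_file, args=(p, preallocate, direct))
        for p in paths
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return paths
```

Running run_test() once with preallocate=False and once with
preallocate=True on a fresh ext4 directory, then running filefrag on the
resulting files, should show whether preallocation changes the extent
counts. No results are claimed here.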