From: "Lukáš Czerner" <lczerner@redhat.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Lukáš Czerner" <lczerner@redhat.com>,
linux-ext4@vger.kernel.org, gharm@google.com
Subject: Re: [PATCH] ext4: Do not normalize request from fallocate
Date: Mon, 25 Mar 2013 14:26:54 +0100 (CET)
Message-ID: <alpine.LFD.2.00.1303251420480.23176@localhost>
In-Reply-To: <20130325125309.GE26792@thunk.org>
On Mon, 25 Mar 2013, Theodore Ts'o wrote:
> Date: Mon, 25 Mar 2013 08:53:09 -0400
> From: Theodore Ts'o <tytso@mit.edu>
> To: Lukáš Czerner <lczerner@redhat.com>
> Cc: linux-ext4@vger.kernel.org, gharm@google.com
> Subject: Re: [PATCH] ext4: Do not normalize request from fallocate
>
> On Mon, Mar 25, 2013 at 11:09:35AM +0100, Lukáš Czerner wrote:
> >
> > Sorry for being dense, but I am trying to understand why this is so
> > bad and what is the "expected" column there.
> >
> > The physical offset of each extent below starts at the start of the
> > block group and it seems to me that it's perfectly aligned for every
> > power of two up to the block group size.
>
> Yes, but the logical offset isn't aligned. Consider the simplest
> workload, which is where we are writing the 1GB file sequentially.
> Let's assume that the raid stripe size is 8M. So ideally, we would
> want each write to be a multiple of 8M, starting at logical block 0.
>
> But look what happens here:
>
> > > File size of 1 is 1073741824 (262144 blocks of 4096 bytes)
> > >  ext:  logical_offset:    physical_offset:  length:  expected:  flags:
> > >    0:      0..  32766:   458752..  491518:   32767:             unwritten
> > >    1:  32767..  65533:   491520..  524286:   32767:    491519:  unwritten
> > >    2:  65534..  98300:   589824..  622590:   32767:    524287:  unwritten
>
> If we do 8M writes, then we would want to write in chunks of 2048
> blocks. So consider what happens when we write the 2048 block chunk
> starting with logical block 30720. The fact that there is a
> discontinuity between logical blocks 32766 and 32767 means that we
> will have to do a read-modify-write cycle for that particular RAID
> stripe.
>
> Does that make more sense?
Oh, now I get it :) Thanks a lot for the explanation. I kept thinking
about the physical layout and forgot that the logical offset is
actually misaligned.
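
A minimal sketch of that arithmetic, assuming the 8M stripe and 4096-byte
blocks from the example above and using only the three extents quoted
from the listing. The function name split_chunks is hypothetical and not
taken from the ext4 code; it only reports which stripe-sized logical
chunks straddle a boundary between two physically discontiguous extents.

```python
# Illustrative only: values come from the quoted extent listing.
BLOCK_SIZE = 4096
STRIPE_BLOCKS = (8 * 1024 * 1024) // BLOCK_SIZE  # 8M stripe -> 2048 blocks

# (logical_start, physical_start, length) for extents 0..2
EXTENTS = [
    (0, 458752, 32767),
    (32767, 491520, 32767),
    (65534, 589824, 32767),
]


def split_chunks(extents, stripe_blocks):
    """Return the logical start of each stripe-sized chunk that spans
    two logically adjacent but physically discontiguous extents."""
    split = []
    for (l0, p0, n0), (l1, p1, _n1) in zip(extents, extents[1:]):
        logically_adjacent = l0 + n0 == l1
        physically_contiguous = p0 + n0 == p1
        # A boundary that is not a multiple of the stripe size falls
        # inside a stripe-sized chunk instead of between two chunks.
        if (logically_adjacent and not physically_contiguous
                and l1 % stripe_blocks != 0):
            split.append((l1 // stripe_blocks) * stripe_blocks)
    return split


print(split_chunks(EXTENTS, STRIPE_BLOCKS))  # [30720, 63488]
```

The first result is the chunk starting at logical block 30720 mentioned
above. The second is the next chunk hit by the same problem, at the
boundary between extents 1 and 2.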
>
> Another reason for keeping the file as physically contiguous as
> possible is that we now do extent caching using the extent status
> tree. So if we can allocate the file using 2 physically contiguous
> extents instead of 9 or 10 physically contiguous extents, it means
> the extent status tree uses less memory, too. For a 1GB file, that
> might not make that much difference, but if we are caching 2048 of
> these 1G files (on a 2TB disk, for example), keeping the files as physically
> contiguous as possible means we can cache the logical to physical
> block mapping of all of these files much more easily.
Yes, that makes sense too.
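
A rough count for the example above, assuming one cached entry per
extent. The per-entry memory cost is not given in this thread, so the
sketch counts entries only, and the helper name cached_entries is
hypothetical.

```python
# Illustrative only: figures come from the 2TB-disk example above.
FILES = 2048             # number of cached files
FILE_SIZE = 1 << 30      # 1G per file
DISK_SIZE = 2 * (1 << 40)  # 2TB disk

# Sanity check: 2048 files of 1G each fill the 2TB disk exactly.
assert FILES * FILE_SIZE == DISK_SIZE


def cached_entries(files, extents_per_file):
    """Number of extents to cache, assuming one entry per extent."""
    return files * extents_per_file


for extents_per_file in (2, 9, 10):
    print(extents_per_file, cached_entries(FILES, extents_per_file))
# 2 4096
# 9 18432
# 10 20480
```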
>
> Regards,
>
> - Ted
>
Thanks!
-Lukas