From: Tao Ma <tm@tao.ma>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
Christoph Hellwig <hch@lst.de>, Al Viro <viro@zeniv.linux.org.uk>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH for 3.2] fs/direct-io.c: Calculate fs_count correctly in get_more_blocks.
Date: Tue, 01 Nov 2011 11:31:14 +0800 [thread overview]
Message-ID: <4EAF6802.2060005@tao.ma> (raw)
In-Reply-To: <x49wrblav46.fsf@segfault.boston.devel.redhat.com>
On 11/01/2011 02:12 AM, Jeff Moyer wrote:
> Tao Ma <tm@tao.ma> writes:
>
>> From: Tao Ma <boyu.mt@taobao.com>
>>
>> In get_more_blocks, we use dio_count to calculate fs_count and do some
>> tricky things to increase fs_count if dio_count isn't aligned. But
>> actually it still has some cornor case that can't be coverd. See the
>> following example:
>> ./dio_write foo -s 1024 -w 4096(direct write 4096 bytes at offset 1024).
>> The same goes if the offset isn't aligned to fs_blocksize.
>>
>> In this case, the old calculation counts fs_count to be 1, but actually
>> we will write into 2 different blocks(if fs_blocksize=4096). The old code
>> just works, since it will call get_block twice(and may have to allocate
>> and create extent twice for file systems like ext4). So we'd better call
>> get_block just once with the proper fs_count.
>
> This description was *really* hard for me to understand. It seems to me
> that right now there's an inefficiency in the code. It's not clear
> whether you're claiming that it was introduced recently, though. Was
> it, or has this problem been around for a while?
Actually it is there a long time ago. And the good thing is that it
isn't a bug, only some performance overhead.
>
> How did you notice this? Was there any evidence of a problem, such as
> performance overhead or less than ideal file layout?
I found it when I dig into some ext4 issues. The ext4 can't create the
whole 8K(in the above case) and ext4 has to create the blocks 2 times
for just one direct i/o write. In some of our test, it costs.
>
> Anyway, I agree that the code does not correctly calculate the number of
> file system blocks in a request. I also agree that your patch fixes
> that issue.
>
> Please ammend the description and then you can add my:
So how about the following commit log(please feel free to modify it if I
still don't describe it correctly).
In get_more_blocks, we use dio_count to calculate fs_count to let the
file system map(maybe also create) blocks. And some tricky things are
done to increase fs_count if dio_count isn't aligned.
But actually it still has some cornor case that can't be coverd. See the
following example:
./dio_write foo -s 1024 -w 4096(direct write 4096 bytes at offset 1024).
In this case, the old calculation counts fs_count to be 1, but actually
we will write into 2 different blocks(if fs_blocksize=4096). So the
underlying file system is called twice and leads to some performance
overhead. So fix it by calculating fs_count correctly and let the file
system knows what we really want to write.
Thanks
Tao
>
> Acked-by: Jeff Moyer <jmoyer@redhat.com>
>
> Cheers,
> Jeff
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2011-11-01 3:31 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-10-28 13:30 interims VFS queue Christoph Hellwig
2011-10-28 18:43 ` Stephen Rothwell
2011-10-28 19:08 ` Linus Torvalds
2011-10-28 19:31 ` Stephen Rothwell
2011-10-28 19:13 ` Stephen Rothwell
2011-10-28 22:09 ` Andrew Morton
2011-10-29 10:58 ` Christoph Hellwig
2011-10-29 11:49 ` caching the request queue was " Andi Kleen
2011-11-02 2:47 ` Vivek Goyal
2011-10-30 7:36 ` Tao Ma
2011-10-31 7:24 ` [PATCH for 3.2] fs/direct-io.c: Calculate fs_count correctly in get_more_blocks Tao Ma
2011-10-31 18:12 ` Jeff Moyer
2011-11-01 3:31 ` Tao Ma [this message]
2011-11-02 2:26 ` [PATCH V2 " Tao Ma
2011-11-02 7:36 ` Christoph Hellwig
2011-11-03 3:21 ` Tao Ma
2011-10-29 13:48 ` interims VFS queue Aneesh Kumar K.V
2011-10-29 14:37 ` Christoph Hellwig
2011-10-30 15:47 ` Hans Verkuil
2011-11-02 13:28 ` interims VFS queue, part2 Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4EAF6802.2060005@tao.ma \
--to=tm@tao.ma \
--cc=akpm@linux-foundation.org \
--cc=hch@lst.de \
--cc=jmoyer@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).