From: Tao Ma <tm@tao.ma>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Christoph Hellwig <hch@lst.de>, Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH for 3.2] fs/direct-io.c: Calculate fs_count correctly in get_more_blocks.
Date: Tue, 01 Nov 2011 11:31:14 +0800
Message-ID: <4EAF6802.2060005@tao.ma>
In-Reply-To: <x49wrblav46.fsf@segfault.boston.devel.redhat.com>

On 11/01/2011 02:12 AM, Jeff Moyer wrote:
> Tao Ma <tm@tao.ma> writes:
> 
>> From: Tao Ma <boyu.mt@taobao.com>
>>
>> In get_more_blocks, we use dio_count to calculate fs_count and do some
>> tricky things to increase fs_count if dio_count isn't aligned. But
>> actually it still has some corner cases that can't be covered. See the
>> following example:
>> ./dio_write foo -s 1024 -w 4096(direct write 4096 bytes at offset 1024).
>> The same goes if the offset isn't aligned to fs_blocksize.
>>
>> In this case, the old calculation counts fs_count to be 1, but actually
>> we will write into 2 different blocks(if fs_blocksize=4096). The old code
>> just works, since it will call get_block twice(and may have to allocate
>> and create extent twice for file systems like ext4). So we'd better call
>> get_block just once with the proper fs_count.
> 
> This description was *really* hard for me to understand.  It seems to me
> that right now there's an inefficiency in the code.  It's not clear
> whether you're claiming that it was introduced recently, though.  Was
> it, or has this problem been around for a while?
Actually, it has been there for a long time. The good thing is that it
isn't a bug, only some performance overhead.
> 
> How did you notice this?  Was there any evidence of a problem, such as
> performance overhead or less than ideal file layout?
I found it while digging into some ext4 issues. ext4 can't create the
whole 8K (in the above case) in one call and has to create the blocks
twice for just one direct I/O write. In some of our tests, this is costly.

> 
> Anyway, I agree that the code does not correctly calculate the number of
> file system blocks in a request.  I also agree that your patch fixes
> that issue.
> 
> Please amend the description and then you can add my:
So how about the following commit log? (Please feel free to modify it if
I still haven't described it correctly.)

In get_more_blocks, we use dio_count to calculate fs_count, which tells
the file system how many blocks to map (and possibly create). Some tricky
things are done to increase fs_count if dio_count isn't aligned.

But it still has some corner cases that aren't covered. See the
following example:
./dio_write foo -s 1024 -w 4096 (direct write of 4096 bytes at offset 1024).

In this case, the old calculation sets fs_count to 1, but we actually
write into 2 different blocks (if fs_blocksize=4096). So the underlying
file system is called twice, which leads to some performance overhead.
Fix it by calculating fs_count correctly so that the file system knows
what we really want to write.
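
To make the arithmetic in the example concrete, here is a minimal
user-space sketch. It is not the kernel code: the function names are
hypothetical, and it assumes the request is tracked in 512-byte units
(so blkfactor = 3 for fs_blocksize=4096), which this thread does not
state. The first function rounds the request length up, which is how I
read the "old calculation" described above; the second counts the span
from the first to the last file system block touched.

```c
#include <stdint.h>

/*
 * Illustrative only. first_blk and end_blk are in small "dio" blocks
 * (assumed 512 bytes here); end_blk is exclusive. blkfactor is
 * log2(fs block size / dio block size).
 */

/* Length-based count: round the length up, ignoring where the request starts. */
uint64_t fs_count_by_length(uint64_t first_blk, uint64_t end_blk,
			    unsigned int blkfactor)
{
	uint64_t dio_count = end_blk - first_blk;
	uint64_t blkmask = ((uint64_t)1 << blkfactor) - 1;
	uint64_t fs_count = dio_count >> blkfactor;

	if (dio_count & blkmask)
		fs_count++;
	return fs_count;
}

/* Span-based count: index of last fs block touched minus first, plus one. */
uint64_t fs_count_by_span(uint64_t first_blk, uint64_t end_blk,
			  unsigned int blkfactor)
{
	uint64_t fs_start = first_blk >> blkfactor;
	uint64_t fs_end = (end_blk - 1) >> blkfactor;

	return fs_end - fs_start + 1;
}
```

Under those assumptions, the write of 4096 bytes at offset 1024 covers
dio blocks 2 through 9 (first_blk = 2, end_blk = 10). The length-based
count gives 1, while the span-based count gives 2, matching the two
4096-byte blocks that are actually touched.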

Thanks
Tao

> 
> Acked-by: Jeff Moyer <jmoyer@redhat.com>
> 
> Cheers,
> Jeff

