From: Hans Reiser <reiser@namesys.com>
To: Andreas Dilger <adilger@clusterfs.com>
Cc: Andrew Morton <akpm@osdl.org>,
"Vladimir V. Saveliev" <vs@namesys.com>,
hch@infradead.org, Reiserfs-Dev@namesys.com,
Linux-Kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: batched write
Date: Mon, 19 Jun 2006 09:51:18 -0700
Message-ID: <4496D606.8070402@namesys.com>
In-Reply-To: <20060619162740.GA5817@schatzie.adilger.int>
Andreas Dilger wrote:
>On Jun 17, 2006 10:04 -0700, Andrew Morton wrote:
>
>
>>On Thu, 15 Jun 2006 02:08:32 +0400
>>"Vladimir V. Saveliev" <vs@namesys.com> wrote:
>>
>>
>>
>>>The core of generic_file_buffered_write is
>>>do {
>>> grab_cache_page();
>>> a_ops->prepare_write();
>>> copy_from_user();
>>> a_ops->commit_write();
>>>
>>> filemap_set_next_iovec();
>>> balance_dirty_pages_ratelimited();
>>>} while (count);
>>>
>>>
>>>Would it make sense to rework this code by adding a new address_space
>>>operation, fill_pages, so that it looks like:
>>>
>>>do {
>>> a_ops->fill_pages();
>>> filemap_set_next_iovec();
>>> balance_dirty_pages_ratelimited();
>>>} while (count);
>>>
>>>generic implementation of fill_pages would look like:
>>>
>>>generic_fill_pages()
>>>{
>>> grab_cache_page();
>>> a_ops->prepare_write();
>>> copy_from_user();
>>> a_ops->commit_write();
>>>}
>>>
>>>
>>>
>>There's nothing which leaps out and says "wrong" in this. But there's
>>nothing which leaps out and says "right", either. It seems somewhat
>>arbitrary, that's all.
>>
>>We have one filesystem which wants such a refactoring (although I don't
>>think you've adequately spelled out _why_ reiser4 wants this).
>>
>>To be able to say "yes, we want this" I think we'd need to understand which
>>other filesystems would benefit from exploiting it, and with what results?
>>
>>
>
>With the caveat that I didn't see the original patch, if this can be a step
>down the road toward supporting delayed allocation at the VFS level then
>I'm all for such changes.
>
>
What do you mean by supporting delayed allocation at the VFS level? Do
you mean calling into the FS, or maybe just not stepping on the FS's toes
so much, or something else? Delayed allocation is very FS-specific,
insofar as I can imagine it.

>Lustre goes to some lengths to batch up reads and writes on the client into
>large (1MB+) RPCs in order to maximize performance. Similarly on the
>server we essentially bypass the VFS in order to allocate all of the RPC's
>blocks in one call and do a large bio write in a second. It just isn't
>possible to maximize performance if everything is split into PAGE_SIZE
>chunks.
>
>I believe XFS would benefit from delayed allocation, and the ext3-delalloc
>patches from Alex also provide a large part of the performance wins for
>userspace IO, when they allow large sys_write() and VM cache flush to
>efficiently call into the filesystem to allocate many blocks at once, and
>then push them out to disk in large chunks.
>
>Cheers, Andreas
>--
>Andreas Dilger
>Principal Software Engineer
>Cluster File Systems, Inc.
>
>
>
>
>
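To make the call-count argument in the quoted proposal concrete, here is a
minimal user-space sketch in C. It is not kernel code and not the patch under
discussion: sim_write_per_page, sim_write_batched, struct sim_stats and
SIM_PAGE_SIZE are hypothetical names invented for illustration. One counter
increment stands in for a prepare_write/commit_write pair in the per-page
loop, or for a single fill_pages call in the batched loop. The sketch only
counts how often the filesystem would be entered for a given write.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define SIM_PAGE_SIZE 4096u

struct sim_stats {
    size_t fs_calls;     /* times the "filesystem" was entered */
    size_t bytes_copied; /* bytes accounted as copied from the user buffer */
};

/* Per-page loop: one filesystem entry per page touched by the write. */
static void sim_write_per_page(size_t pos, size_t count, struct sim_stats *st)
{
    while (count) {
        size_t offset = pos % SIM_PAGE_SIZE;    /* offset inside this page */
        size_t bytes = SIM_PAGE_SIZE - offset;  /* room left in this page */

        if (bytes > count)
            bytes = count;

        st->fs_calls++;            /* stands in for prepare_write + commit_write */
        st->bytes_copied += bytes; /* stands in for copy_from_user */

        pos += bytes;
        count -= bytes;
    }
}

/* Batched loop: one filesystem entry per run of up to max_pages pages. */
static void sim_write_batched(size_t pos, size_t count, size_t max_pages,
                              struct sim_stats *st)
{
    if (max_pages == 0)
        max_pages = 1; /* a batch always covers at least one page */

    while (count) {
        size_t offset = pos % SIM_PAGE_SIZE;
        size_t bytes = max_pages * SIM_PAGE_SIZE - offset; /* room in batch */

        if (bytes > count)
            bytes = count;

        st->fs_calls++;            /* stands in for one fill_pages call */
        st->bytes_copied += bytes;

        pos += bytes;
        count -= bytes;
    }
}
```

Under these assumptions, a 1 MiB write starting at offset 0 enters the
filesystem 256 times in the per-page loop and once in the batched loop when
max_pages is 256. An unaligned 10000-byte write starting at offset 100 takes
3 entries per-page and 2 entries with max_pages set to 2.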