From: Neil Brown <neilb@suse.de>
To: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Linux Filesystems <linux-fsdevel@vger.kernel.org>,
Mark Fasheh <mark.fasheh@oracle.com>,
Linux Memory Management <linux-mm@kvack.org>
Subject: Re: [patch 12/44] fs: introduce write_begin, write_end, and perform_write aops
Date: Tue, 24 Apr 2007 17:49:48 +1000 [thread overview]
Message-ID: <17965.46748.634169.563467@notabene.brown> (raw)
In-Reply-To: message from Nick Piggin on Tuesday April 24
On Tuesday April 24, npiggin@suse.de wrote:
>
> BTW. AOP_FLAG_UNINTERRUPTIBLE can be used by filesystems to avoid
> an initial read or other sequence they might be using to handle the
> case of a short write. ecryptfs uses it, others can too.
>
> For buffered writes, this doesn't get passed in (unless they are
> coming from kernel space), so I was debating whether to have it at
> all. However, in the previous API, _nobody_ had to worry about
> short writes, so this flag means I avoid making an API that is
> fundamentally less performant in some situations.
Ahhh I think I get it now.
In general, the address_space must cope with the possibility that
fewer than the expected number of bytes is copied. This may leave
parts of the page with invalid data. This can be handled by
pre-loading the page with valid data, however this may cause a
significant performance cost.
The write_begin/write_end interface provide two mechanism by which
this case can be handled more efficiently.
1/ The AOP_FLAG_UNINTERRUPTIBLE flag declares that the write will
not be partial (maybe a different name? AOP_FLAG_NO_PARTIAL).
If that is set, inefficient preparation can be avoided. However the
most common write paths will never set this flag.
2/ The return from write_end can declare that fewer bytes have been
accepted. e.g. part of the page may have been loaded from backing
store, overwriting some of the newly written bytes. If this
return value is reduced, a new write_begin/write_end cycle
may be called to attempt to write the bytes again.
Also
+ write_end: After a successful write_begin, and data copy, write_end must
+ be called. len is the original len passed to write_begin, and copied
+ is the amount that was able to be copied (they must be equal if
+ write_begin was called with intr == 0).
+
That should be "... called without AOP_FLAG_UNINTERRUPTIBLE being
set".
And "that was able to be copied" is misleading, as the copy is not done in
write_end. Maybe "that was accepted".
It seems to make sense now. I might try re-reviewing the patches based
on this improved understanding.... only a public holiday looms :-)
NeilBrown
next prev parent reply other threads:[~2007-04-24 7:49 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-04-24 1:23 [patch 00/44] Buffered write deadlock fix and new aops for 2.6.21-rc6-mm1 Nick Piggin
2007-04-24 1:23 ` [patch 01/44] mm: revert KERNEL_DS buffered write optimisation Nick Piggin
2007-04-24 1:23 ` [patch 02/44] Revert 81b0c8713385ce1b1b9058e916edcf9561ad76d6 Nick Piggin
2007-04-24 1:23 ` [patch 03/44] Revert 6527c2bdf1f833cc18e8f42bd97973d583e4aa83 Nick Piggin
2007-04-24 1:23 ` [patch 04/44] mm: clean up buffered write code Nick Piggin
2007-04-24 1:23 ` [patch 05/44] mm: debug write deadlocks Nick Piggin
2007-04-24 1:23 ` [patch 06/44] mm: trim more holes Nick Piggin
2007-04-24 6:07 ` Neil Brown
2007-04-24 6:17 ` Nick Piggin
2007-04-24 1:23 ` [patch 07/44] mm: buffered write cleanup Nick Piggin
2007-04-24 1:23 ` [patch 08/44] mm: write iovec cleanup Nick Piggin
2007-04-24 1:23 ` [patch 09/44] mm: fix pagecache write deadlocks Nick Piggin
2007-04-24 1:23 ` [patch 10/44] mm: buffered write iterator Nick Piggin
2007-04-24 1:23 ` [patch 11/44] fs: fix data-loss on error Nick Piggin
2007-04-24 1:23 ` [patch 12/44] fs: introduce write_begin, write_end, and perform_write aops Nick Piggin
2007-04-24 6:59 ` Neil Brown
2007-04-24 7:23 ` Nick Piggin
2007-04-24 7:49 ` Neil Brown [this message]
2007-04-24 10:37 ` Nick Piggin
2007-04-24 1:23 ` [patch 13/44] mm: restore KERNEL_DS optimisations Nick Piggin
2007-04-24 10:43 ` Christoph Hellwig
2007-04-24 11:03 ` Nick Piggin
2007-04-24 1:24 ` [patch 14/44] implement simple fs aops Nick Piggin
2007-04-24 1:24 ` [patch 15/44] block_dev convert to new aops Nick Piggin
2007-04-24 1:24 ` [patch 16/44] rd " Nick Piggin
2007-04-24 10:46 ` Christoph Hellwig
2007-04-24 11:05 ` Nick Piggin
2007-04-24 11:11 ` Christoph Hellwig
2007-04-24 11:16 ` Nick Piggin
2007-04-24 11:18 ` Christoph Hellwig
2007-04-24 11:20 ` Nick Piggin
2007-04-24 11:42 ` Neil Brown
2007-04-24 1:24 ` [patch 17/44] ext2 " Nick Piggin
2007-04-24 1:24 ` [patch 18/44] ext3 " Nick Piggin
2007-04-24 1:24 ` [patch 19/44] ext4 " Nick Piggin
2007-04-24 1:24 ` [patch 20/44] xfs " Nick Piggin
2007-04-24 1:24 ` [patch 21/44] fs: new cont helpers Nick Piggin
2007-04-24 1:24 ` [patch 22/44] fat convert to new aops Nick Piggin
2007-04-24 1:24 ` [patch 23/44] adfs " Nick Piggin
2007-04-24 1:24 ` [patch 24/44] affs " Nick Piggin
2007-04-24 1:24 ` [patch 25/44] hfs " Nick Piggin
2007-04-24 1:24 ` [patch 26/44] hfsplus " Nick Piggin
2007-04-24 1:24 ` [patch 27/44] hpfs " Nick Piggin
2007-04-24 1:24 ` [patch 28/44] bfs " Nick Piggin
2007-04-24 1:24 ` [patch 29/44] qnx4 " Nick Piggin
2007-04-24 1:24 ` [patch 30/44] nfs " Nick Piggin
2007-04-24 1:24 ` [patch 31/44] smb " Nick Piggin
2007-04-24 1:24 ` [patch 32/44] ocfs2: " Nick Piggin
2007-04-24 1:24 ` [patch 33/44] gfs2 " Nick Piggin
2007-04-24 1:24 ` [patch 34/44] fs: no AOP_TRUNCATED_PAGE for writes Nick Piggin
2007-04-24 1:24 ` [patch 35/44] ecryptfs convert to new aops Nick Piggin
2007-04-24 1:24 ` [patch 36/44] fuse " Nick Piggin
2007-04-24 1:24 ` [patch 37/44] hostfs " Nick Piggin
2007-04-27 16:11 ` Jeff Dike
2007-04-24 1:24 ` [patch 38/44] jffs2 " Nick Piggin
2007-04-24 1:24 ` [patch 39/44] cifs " Nick Piggin
2007-04-24 1:24 ` [patch 40/44] ufs " Nick Piggin
2007-04-24 1:24 ` [patch 41/44] udf " Nick Piggin
2007-04-24 1:24 ` [patch 42/44] sysv " Nick Piggin
2007-04-24 1:24 ` [patch 43/44] minix " Nick Piggin
2007-04-24 1:24 ` [patch 44/44] jfs " Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=17965.46748.634169.563467@notabene.brown \
--to=neilb@suse.de \
--cc=akpm@linux-foundation.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mark.fasheh@oracle.com \
--cc=npiggin@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).