From: Andrew Morton <akpm@digeo.com>
To: Steven French <sfrench@us.ibm.com>
Cc: linux-fsdevel@vger.kernel.org
Subject: Re: readpages & writepages
Date: Wed, 22 Jan 2003 16:37:49 -0800
Message-ID: <20030122163749.62e4861e.akpm@digeo.com>
In-Reply-To: <OFFDAAAA75.32250A66-ON87256CB5.00700F45-86256CB6.0050D220@us.ibm.com>
Steven French <sfrench@us.ibm.com> wrote:
>
> Some questions about implementing readpages & writepages for network
> filesystems ...
>
> I didn't find many examples of readpages & writepages in 2.5 yet (so far
> just the local filesystem examples that mostly use fs/mpage.c common code
> and the nfs case) as I looked through trying to implement these optional
> vfs entry points for the cifs case.
My approach there was to implement sufficient functionality for the kernel to
be able to assemble multipage BIOs for ext2 filesystems, and no more: keep it
simple, clean and well documented, in the expectation that the functionality
would grow on demand as other filesystem developers got into it.
And indeed this has happened - enhancements have been added as AFS, reiserfs3
and NFS have started to use this code.
I expect more changes will be needed to accommodate other filesystems.
That's fine.
> Doing larger reads, each of size 16K,
> turns out to be reasonably efficient to typical CIFS servers (larger is of
> course also possible), and would have the benefit of reducing the network
> traffic noticeably when doing readahead. There are a few difficulties
> though - there does not appear to be an obvious mapping of the list of
> pages (passed in to readpages/writepages), at least to something that I
> could memcopy my 16K network read buffer into - so the pages have to
> be individually mapped and copied in 4K chunks, losing a small amount of
> the benefit of having readpages/writepages in the first place
Well if you want a single 16k physically contiguous chunk of memory then yup,
there are tons of problems around that; it won't be happening in 2.6.
The overhead of mapping and unmapping 4 pages is very small, especially when
compared with the cost of the copy itself!
You'd be better off looking into avoiding that copy altogether: feed four 4k
pages down to the network stack and get them filled in direct from the
busmastering receive.
(And bear in mind that 1 single 16k chunk could perform worse than 4x4k
chunks, if the network receive and the copy are serialised. With 4k pages,
the CPU can be copying one page _while_ the network is pulling in the next
one...)
> and the common routines (read_cache_pages and
> generic_writepages/mpage_writepages respectively) don't seem to be
> written for the case in which more than 4K is copied in one call to the
> filesystem. For nfs, the mpage_writepages call seems to default to
> calling the vfs op writepage (since a null get_block_t routine is
> passed in), which could have been achieved by simply not supporting the
> writepages entry point - perhaps this is just an intermediate coding
> step, a staging of the eventual function.
>
> Has there been much discussion or much written down (other than a little
> bit in documentation/filesystems/locking and the readahead function
> comments in mm/readahead.c) on the suitability of readpages & writepages
> for network filesystems? There obviously can be significant benefits in
> reducing the number of network roundtrips if it can be made to work
> efficiently ...
I don't understand how this affects the number of roundtrips? You seem to be
implying that CIFS has an fs-private receive buffer, the contents of which
are copied into the VFS's pagecache?
If so, then making that buffer be 16k should work OK?
If not, then some more details on the general data flow would be needed,
please.