From: Chuck Lever <cel@citi.umich.edu>
To: Neil Brown <neilb@suse.de>
Cc: Olaf Kirch <okir@suse.de>,
Trond Myklebust <trond.myklebust@fys.uio.no>,
nfs@lists.sourceforge.net
Subject: Re: NFS directio
Date: Sun, 09 Apr 2006 18:09:27 -0400
Message-ID: <44398617.2000208@citi.umich.edu>
In-Reply-To: <17465.67.550070.247218@cse.unsw.edu.au>

Neil Brown wrote:
> On Friday March 31, cel@citi.umich.edu wrote:
>> Olaf Kirch wrote:
>>> On Fri, Mar 31, 2006 at 09:35:34AM -0500, Chuck Lever wrote:
>>>> the check isn't in 2.6.16. it was removed sometime after 2.6.5.
>>> It is still in the 2.6.16 tree I'm looking at; else I wouldn't ask :)
>> it's been in my trees since 2.6.13 or even earlier, my mistake.
>>
>> that change is part of the aio+dio patches that were just included in
>> 2.6.17-rc1. instead of creating a single patch for this change, you
>> should consider taking those patches, since they were tested as a unit.
>>
>> if you can guarantee that atomic_t is 32 bits on every platform you
>> support, then it should be safe to change that #define to 2^31.
>> otherwise, the work to eliminate the limit entirely has already been
>> done by the above-mentioned patches.
>
> (Coming into the conversation a bit late....)
>
> What about the kmalloc in nfs_get_user_pages:
>
> array_size = (page_count * sizeof(struct page *));
> *pages = kmalloc(array_size, GFP_KERNEL);
>
> With a page_count of 1024, this allocates one page (on 32bit) which is
> easy.
> With a page_count of 4096 (the previous MAX_DIRECTIO_SIZE), this
> allocates 4 consecutive pages, which won't always succeed.
>
> If you want to go higher than that (which was the point of the start
> of this thread) then you need a large-order allocation which doesn't
> (in my understanding) have a good chance of success due to
> fragmentation.
>
> So I guess my question is: how hard would it be to use a more scalable
> data structure so that very large IO sizes would be reliably
> practical?
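
To make the arithmetic in the quoted question concrete, here is a small userspace sketch. It assumes 4-byte pointers and 4 KiB pages (as on 32-bit x86). The helper names page_array_bytes() and alloc_order() are made up for illustration and are not kernel functions. It only computes how many contiguous pages an array of page pointers would span, rounded up to a power of two.

```c
#include <stddef.h>

/* Bytes needed for an array of page_count pointers, each ptr_size bytes.
 * Mirrors "page_count * sizeof(struct page *)" from the quoted code. */
static size_t page_array_bytes(size_t page_count, size_t ptr_size)
{
    return page_count * ptr_size;
}

/* Smallest order such that (page_size << order) >= bytes, i.e. the
 * power-of-two number of contiguous pages the array would occupy.
 * Hypothetical helper; page_size is an assumption supplied by the caller. */
static unsigned int alloc_order(size_t bytes, size_t page_size)
{
    unsigned int order = 0;
    size_t span = page_size;

    while (span < bytes) {
        span <<= 1;
        order++;
    }
    return order;
}
```

Under these assumptions, page_array_bytes(1024, 4) is 4096 bytes (order 0, a single page) and page_array_bytes(4096, 4) is 16384 bytes (order 2, four contiguous pages), which matches the figures in the question. Anything larger pushes the order up further.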

howdy neil-

usually I/O is broken up into smaller chunks by the time it gets down to
this level, so it's never been much of an issue. it's pretty
challenging to generate a test case for extremely large I/O sizes (for
example, the size of the entire process address space).

and until now, there really hasn't been much call for doing NFS O_DIRECT
with very large requests. it's been a matter of meeting the
requirements of database I/O, which is generally 4KB to 16KB for data
files, and about a megabyte for log writes.

at this point we don't have both a test case and a use case that
reliably break this, so addressing it hasn't been a priority.

the structure of this code was adapted (i.e., stolen) from other parts of
the kernel that also employ get_user_pages(). you can probably take a
look at those other callers of get_user_pages() and see how they've
since tackled the issue.

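
For illustration only, one way to avoid a single large contiguous allocation is a two-level table whose chunks never exceed one page. The sketch below is userspace C under stated assumptions: calloc()/free() stand in for the kernel allocator, void * stands in for struct page *, the page size is assumed to be 4096 bytes, and every name (page_table_sketch, pts_alloc, pts_slot, pts_free) is hypothetical. It is not what the NFS client or any other get_user_pages() caller actually does.

```c
#include <stdlib.h>
#include <stddef.h>

#define CHUNK_BYTES    4096u                          /* assumed page size */
#define PTRS_PER_CHUNK (CHUNK_BYTES / sizeof(void *)) /* slots per chunk */

/* Hypothetical two-level table: a small top-level array pointing at
 * page-sized chunks, each holding PTRS_PER_CHUNK page pointers. */
struct page_table_sketch {
    size_t count;    /* number of usable slots */
    size_t nchunks;  /* number of page-sized chunks */
    void ***chunks;  /* top-level array of chunk pointers */
};

static void pts_free(struct page_table_sketch *t)
{
    size_t i;

    if (!t)
        return;
    if (t->chunks)
        for (i = 0; i < t->nchunks; i++)
            free(t->chunks[i]);  /* free(NULL) is a no-op */
    free(t->chunks);
    free(t);
}

static struct page_table_sketch *pts_alloc(size_t count)
{
    struct page_table_sketch *t = calloc(1, sizeof(*t));
    size_t i;

    if (!t)
        return NULL;
    t->count = count;
    t->nchunks = (count + PTRS_PER_CHUNK - 1) / PTRS_PER_CHUNK;
    t->chunks = calloc(t->nchunks ? t->nchunks : 1, sizeof(*t->chunks));
    if (!t->chunks) {
        free(t);
        return NULL;
    }
    /* Every chunk is exactly CHUNK_BYTES, so no request exceeds one page. */
    for (i = 0; i < t->nchunks; i++) {
        t->chunks[i] = calloc(PTRS_PER_CHUNK, sizeof(void *));
        if (!t->chunks[i]) {
            pts_free(t);
            return NULL;
        }
    }
    return t;
}

/* Address of slot idx, or NULL if idx is out of range. */
static void **pts_slot(struct page_table_sketch *t, size_t idx)
{
    if (idx >= t->count)
        return NULL;
    return &t->chunks[idx / PTRS_PER_CHUNK][idx % PTRS_PER_CHUNK];
}
```

The trade-off is one extra indirection per lookup. The top-level array itself stays within a single page until the table holds more than PTRS_PER_CHUNK chunks, which is far beyond the 4096-page case discussed above.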
--
corporate: <cel at netapp dot com>
personal: <chucklever at bigfoot dot com>