From: Ben Greear <greearb@candelatech.com>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: Reading NFS file without copying to user-space?
Date: Fri, 04 Sep 2009 16:03:05 -0700
Message-ID: <4AA19CA9.5090702@candelatech.com>
In-Reply-To: <1252104582.5274.16.camel@heimdal.trondhjem.org>
On 09/04/2009 03:49 PM, Trond Myklebust wrote:
> On Fri, 2009-09-04 at 15:30 -0700, Ben Greear wrote:
>> I was thinking that the kernel might take the data received in the skb's from
>> the file-server and send it to /dev/null, ie basically just immediately
>> discard the received data. If it could do that, it would be a zero-copy
>> read: The only copying would be the NIC DMA'ing the packet into the skb.
>
> No... The RPC layer will always copy the data from the socket into a
> buffer. If you are using O_DIRECT reads, then that buffer will be the
> same one that you supplied in userland (the kernel just uses page table
> trickery to map those pages into the kernel address space). If you are
> using any other type of read (even if it is being piped using sendfile()
> or splice()) then it will copy that data into the NFS filesystem's page
> cache.

OK, I think I understand that better now. It seems like one could have
the RPC layer use a list of skbs as its data store instead of copying the
data, but perhaps that would be optimizing for something no one would
ever really want in the real world.
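
For reference, the O_DIRECT path described above can be exercised from
userspace with a read-and-discard loop. This is only a rough sketch, not a
benchmark tool: the function name read_and_discard and the 1 MiB chunk size
are my own choices, it assumes Linux, and it falls back to ordinary buffered
reads if the filesystem rejects O_DIRECT. The data still lands in the
userland buffer, so this avoids the page cache copy, not the copy out of the
socket.

```python
import errno
import mmap
import os


def read_and_discard(path, chunk=1 << 20, direct=True):
    """Read the whole file, throw the data away, and return the byte count.

    Hypothetical helper for illustration. `chunk` should be a multiple of
    the page size so that O_DIRECT alignment rules are satisfied.
    """
    # os.O_DIRECT only exists on Linux; without it this is a buffered read.
    flags = os.O_RDONLY | (getattr(os, "O_DIRECT", 0) if direct else 0)
    try:
        fd = os.open(path, flags)
    except OSError as exc:
        # Some filesystems refuse O_DIRECT at open time with EINVAL.
        if direct and exc.errno == errno.EINVAL:
            return read_and_discard(path, chunk, direct=False)
        raise

    # An anonymous mmap gives a page-aligned buffer, which O_DIRECT needs.
    buf = mmap.mmap(-1, chunk)
    total = 0
    try:
        while True:
            # Read at an explicit, chunk-aligned offset into the same buffer,
            # overwriting (that is, discarding) the previous contents.
            n = os.preadv(fd, [buf], total)
            total += n
            # Assumption: on a regular file a short read means end of file.
            if n < chunk:
                break
    except OSError as exc:
        if direct and exc.errno == errno.EINVAL:
            return read_and_discard(path, chunk, direct=False)
        raise
    finally:
        buf.close()
        os.close(fd)
    return total
```

Timing a call such as read_and_discard("/mnt/nfs/bigfile") (the path is a
placeholder) against the same call with direct=False would give a rough
idea of what the page cache copy costs on a given setup.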
>> Out of curiosity, does anyone have any benchmarks for NFS on 10G hardware?
>
> I'm not aware of any public figures. I'd be interested to hear how you
> max out.
>
>> Based on testing against another vendor's NFS server, it seems that the client
>> is losing packets (the server shows TCP retransmits).
>
> Is the data being lost at the client, the switch or the server? Assuming
> that you are using a managed switch, then a look at its statistics
> should be able to answer that question.

At least for my local Linux-to-Linux tests, I'm connecting the two machines
directly with fibre optic cable, so it's definitely not a switch problem here.
Neither NIC reports any obvious errors, and pktgen tests show that they can
easily sustain 9Gbps. I need to look at the netstat counters in more detail.
I suspect my network buffers may be too small: I last set up their defaults
when a 1GB RAM system was 'high end', and now I'm using 12GB systems :P
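
As a rough sketch of the checks I have in mind: the TCP retransmit counters
that netstat reports come from /proc/net/snmp, and a lower bound for the
socket buffer size is the bandwidth-delay product of the link. The helper
names below are hypothetical, and the 200 microsecond round-trip time in the
example is an assumed figure, not a measurement from these machines.

```python
def parse_proc_net_snmp(text):
    """Parse /proc/net/snmp style text into {protocol: {counter: value}}.

    The file consists of line pairs: a header line naming the counters,
    followed by a line with the matching values.
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    tables = {}
    for header, values in zip(lines[0::2], lines[1::2]):
        proto = header[0].rstrip(":")
        tables[proto] = dict(zip(header[1:], (int(v) for v in values[1:])))
    return tables


def retransmit_ratio(tcp):
    """Fraction of sent TCP segments that were retransmissions."""
    sent = tcp["OutSegs"]
    return tcp["RetransSegs"] / sent if sent else 0.0


def bdp_bytes(bits_per_second, rtt_microseconds):
    """Bandwidth-delay product in bytes: the amount of data in flight that
    a socket buffer must hold to keep the link full."""
    return bits_per_second * rtt_microseconds // 8_000_000


# Example with an assumed 200 us round trip on a 10 Gbit/s link.
needed = bdp_bytes(10_000_000_000, 200)  # 250000 bytes
```

On a live system the parser can be fed open("/proc/net/snmp").read(), and
the result of bdp_bytes() compared against the limits in
/proc/sys/net/core/rmem_max and /proc/sys/net/ipv4/tcp_rmem. Sampling the
ratio before and after a test run shows whether the run itself caused
retransmits.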
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com