From: Ben Greear <greearb@candelatech.com>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: Reading NFS file without copying to user-space?
Date: Fri, 04 Sep 2009 16:03:05 -0700
Message-ID: <4AA19CA9.5090702@candelatech.com>
In-Reply-To: <1252104582.5274.16.camel@heimdal.trondhjem.org>
On 09/04/2009 03:49 PM, Trond Myklebust wrote:
> On Fri, 2009-09-04 at 15:30 -0700, Ben Greear wrote:
>> I was thinking that the kernel might take the data received in the skb's from
>> the file-server and send it to /dev/null, ie basically just immediately
>> discard the received data. If it could do that, it would be a zero-copy
>> read: The only copying would be the NIC DMA'ing the packet into the skb.
>
> No... The RPC layer will always copy the data from the socket into a
> buffer. If you are using O_DIRECT reads, then that buffer will be the
> same one that you supplied in userland (the kernel just uses page table
> trickery to map those pages into the kernel address space). If you are
> using any other type of read (even if it is being piped using sendfile()
> or splice()) then it will copy that data into the NFS filesystem's page
> cache.
OK, I think I understand that better now. It seems the RPC layer could
use a list of skbs as its data store instead of copying the data,
but perhaps that would be optimizing for something no one would
ever really want in the real world.
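
For reference, here is a minimal sketch of the O_DIRECT read path described above, assuming Linux and Python 3. The function name read_start and the 4096-byte alignment are illustrative assumptions, not anything from the NFS client code; the real alignment requirement depends on the filesystem. It shows only the userland side: a page-aligned buffer is handed to the kernel, and the code falls back to an ordinary page-cache read if the filesystem rejects O_DIRECT.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on the filesystem


def read_start(path, length=BLOCK):
    """Return (data, used_direct) for the first `length` bytes of `path`.

    Tries O_DIRECT first so the kernel fills our own buffer, then falls
    back to a normal read that goes through the page cache.
    """
    # Round the buffer size up to a multiple of BLOCK.
    size = -(-length // BLOCK) * BLOCK
    # An anonymous mmap is page-aligned, which O_DIRECT generally needs.
    buf = mmap.mmap(-1, size)
    try:
        attempts = ((os.O_RDONLY | os.O_DIRECT, True), (os.O_RDONLY, False))
        for flags, direct in attempts:
            try:
                fd = os.open(path, flags)
            except OSError:
                continue  # e.g. the filesystem refuses O_DIRECT at open time
            try:
                n = os.readv(fd, [buf])
                return buf[:min(n, length)], direct
            except OSError:
                continue  # e.g. the filesystem refuses O_DIRECT at read time
            finally:
                os.close(fd)
        raise OSError("could not read " + path)
    finally:
        buf.close()
```

Calling read_start("/mnt/nfs/somefile", 65536) on a hypothetical NFS mount would return the data plus a flag saying whether the direct path was actually used. Either way the data is still copied out of the socket once, as Trond says; O_DIRECT only changes which buffer it lands in.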
>> Out of curiosity, does anyone have any benchmarks for NFS on 10G hardware?
>
> I'm not aware of any public figures. I'd be interested to hear how you
> max out.
>
>> Based on testing against another vendor's NFS server, it seems that the client
>> is losing packets (the server shows TCP retransmits).
>
> Is the data being lost at the client, the switch or the server? Assuming
> that you are using a managed switch, then a look at its statistics
> should be able to answer that question.
At least for my local Linux-to-Linux tests, I'm using just a fibre optic
cable to connect them, so it's definitely not a switch problem here. Neither
NIC reports any obvious errors, and pktgen tests show that they can easily
sustain 9Gbps. I need to look more closely at the netstat
counters and such. I suspect my network buffers may be too small: I last
set up their defaults when a 1GB RAM system was 'high end', and now
I'm using 12GB systems :P
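
A rough way to check the buffer suspicion is to compare the bandwidth-delay product of the link with the TCP receive buffer limits. The sketch below assumes Linux and Python 3; the helper names and the 1000 microsecond round-trip time are hypothetical placeholders (a direct fibre link will likely have a much lower RTT), and /proc/sys/net/ipv4/tcp_rmem is the standard location of the min/default/max receive buffer sizes.

```python
def bdp_bytes(rate_bps, rtt_us):
    """Bandwidth-delay product in bytes for a link rate (bits/s) and RTT (microseconds)."""
    # bits/s * seconds / 8 bits per byte, kept in integer arithmetic.
    return rate_bps * rtt_us // 8_000_000


def parse_tcp_rmem(text):
    """Parse the 'min default max' triple found in /proc/sys/net/ipv4/tcp_rmem."""
    low, default, high = (int(field) for field in text.split())
    return low, default, high


def current_tcp_rmem(path="/proc/sys/net/ipv4/tcp_rmem"):
    """Read the live limits; only meaningful on a Linux host."""
    with open(path) as f:
        return parse_tcp_rmem(f.read())


def max_buffer_covers_link(rate_bps, rtt_us, rmem):
    """True if the maximum receive buffer is at least one bandwidth-delay product."""
    return rmem[2] >= bdp_bytes(rate_bps, rtt_us)
```

For example, max_buffer_covers_link(10_000_000_000, 1000, current_tcp_rmem()) asks whether a single connection could keep a 10G link full at the assumed RTT. A False result would be consistent with too-small buffers, though it does not by itself explain the retransmits.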
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com