From: "J. Bruce Fields" <bfields@fieldses.org>
To: Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
Cc: Frank van Maarseveen <frankvm@frankvm.com>,
Chuck Lever <chuck.lever@oracle.com>,
Trond Myklebust <trond.myklebust@fys.uio.no>,
linux-nfs@vger.kernel.org
Subject: Re: [PATCH 4/5] NFSD: Remove NFSD_TCP kernel build option
Date: Tue, 5 Feb 2008 18:08:28 -0500
Message-ID: <20080205230828.GV8210@fieldses.org>
In-Reply-To: <47A8EBC3.7050900-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
On Wed, Feb 06, 2008 at 10:05:39AM +1100, Greg Banks wrote:
> Frank van Maarseveen wrote:
> > Last time I checked (around 2.6.22) writing large files on NFSv3 over
> > UDP was 20% faster compared to TCP (Gb LAN with one switch connecting
> > all machines).
> >
> Did all of your file arrive at the server, and in the same order it left
> the client? NFS on UDP relies on IP fragmentation, which is known to
> introduce silent data corruption at high data rates (google for "IPID aliasing").
The right query appears to be "IPID aliasing NFS", which (at least for
me) gets you a nice explanation from Olaf's 2006 OLS paper as the first
hit....
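
For a rough sense of why the aliasing bites at gigabit rates, here is a
back-of-the-envelope sketch. It is not taken from the paper; the figures
are assumptions: a 16-bit IPv4 Identification field, one fragmented UDP
datagram per 32 KB NFS write, a saturated 1 Gb/s link, header overhead
ignored, and a single ID counter shared by all datagrams to the server.
The function name is made up for illustration.

```python
# Assumption-laden estimate of how quickly the IPv4 ID space wraps.
IPID_SPACE = 2 ** 16  # the IPv4 Identification field is 16 bits wide


def ipid_wrap_seconds(link_bits_per_sec, datagram_bytes):
    """Seconds until IP IDs repeat if every datagram consumes one ID."""
    datagrams_per_sec = (link_bits_per_sec / 8) / datagram_bytes
    return IPID_SPACE / datagrams_per_sec


# Assumed workload: 1 Gb/s link, 32 KB UDP datagrams (hypothetical wsize).
wrap = ipid_wrap_seconds(1e9, 32 * 1024)
print(f"IP ID space wraps after about {wrap:.1f} s")
```

Under those assumptions the IDs repeat in well under a minute. If the
receiver keeps incomplete datagrams around for longer than that (I
believe the Linux default reassembly timeout is 30 seconds, but check
your own ipfrag_time), a fragment from a later datagram can be spliced
into an older one that lost a fragment, and only the weak 16-bit UDP
checksum stands in the way.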
--b.
> Also, last time I checked, UDP support in the server uses a single socket
> for all traffic, and processes need to serialise on the svc_sock lock to send,
> so aggregate UDP throughput is strictly limited compared to TCP. As in, 145 MB/s
> for UDP compared to filling 12 1gige pipes for TCP. I have a patch to fix this,
> but given the inherent data corruption issues of UDP I haven't bothered posting
> the most recent version.
>
>
>
> > TCP and its timeout/retransmission behavior isn't always the best choice.
> >
> >
> The timeout & retrans that sunrpc implements on top of UDP is arguably worse,
> especially if you use the "soft" mount option.
>
> --
> Greg Banks, R&D Software Engineer, SGI Australian Software Group.
> The cake is *not* a lie.
> I don't speak for SGI.
>