linux-nfs.vger.kernel.org archive mirror
From: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
To: Jim Rees <rees@umich.edu>
Cc: "Myklebust, Trond" <Trond.Myklebust@netapp.com>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [RFC PATCH 1/2] NFSv4.1: Convert slotid from u8 to u32
Date: Thu, 9 Feb 2012 09:37:22 +0100
Message-ID: <CAGue13qU7OJ+Obnc0vtGt8_2BsydeOXmwhHNzctJbWitV4AVvA@mail.gmail.com>
In-Reply-To: <20120208210150.GB29238@umich.edu>

Putting my 'high energy physics community' hat on, let me comment on this.

As soon as we try to use NFS over high-latency networks, application
efficiency drops rapidly. Efficiency here is cpuTime/wallTime. We have
solved this in our home-grown protocols by adding vector read and
vector write, where a vector is a set of (offset, length) pairs. As
most of our files have a DB-like structure, after reading the header
(something like an index) we know where the data is located. For some
workloads this lets us perform 100 times better than NFS.
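
To make the vector read idea concrete, here is a minimal sketch in
Python. It only illustrates the interface (a list of (offset, length)
pairs in, a list of byte strings out); it is not our protocol, and the
names vector_read and read_exact are made up for this example. A real
vector read would ship the whole list to the server in one request,
while this sketch just loops over os.pread() locally.

```python
import os
import tempfile
from typing import List, Tuple


def read_exact(fd: int, offset: int, length: int) -> bytes:
    """pread() may return fewer bytes than requested; loop until done or EOF."""
    chunks = []
    while length > 0:
        data = os.pread(fd, length, offset)
        if not data:  # end of file reached
            break
        chunks.append(data)
        offset += len(data)
        length -= len(data)
    return b"".join(chunks)


def vector_read(fd: int, segments: List[Tuple[int, int]]) -> List[bytes]:
    """Return one byte string per (offset, length) pair, in the given order."""
    return [read_exact(fd, offset, length) for offset, length in segments]


if __name__ == "__main__":
    # Hypothetical file layout: 1024 bytes of repeating 0..255 values.
    with tempfile.TemporaryFile() as f:
        f.write(bytes(range(256)) * 4)
        f.flush()
        print(vector_read(f.fileno(), [(0, 4), (257, 2)]))
```

In the DB-like case, the segment list would be built from the index
read out of the file header.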

POSIX does not provide such an interface, but we can simulate it with
fadvise calls (and we do). Since NFSv4.0 we have compound operations,
and you can (in theory) build a compound with multiple READ or WRITE
ops. Nevertheless, this does not work well for several reasons: there
is a maximum reply size, and you still have to wait for the full
reply. Some replies may be up to 100MB in size.
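
As a rough sketch of the fadvise trick, assuming Linux and Python's
os.posix_fadvise() wrapper: announce every segment with
POSIX_FADV_WILLNEED first, then read them. The function name
prefetch_then_read is made up for this example, and whether the hints
actually turn into overlapping READs depends on the client and the
file system.

```python
import os
import tempfile
from typing import List, Tuple


def prefetch_then_read(fd: int, segments: List[Tuple[int, int]]) -> List[bytes]:
    """Hint all (offset, length) segments first, then read them in order."""
    for offset, length in segments:
        try:
            # Advice only: asks the kernel to start reading this range ahead.
            os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)
        except OSError:
            pass  # a rejected hint does not affect correctness
    # The reads may be short near EOF; this sketch does not retry.
    return [os.pread(fd, length, offset) for offset, length in segments]


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        f.write(b"header" + b"x" * 100 + b"record")
        f.flush()
        print(prefetch_then_read(f.fileno(), [(0, 6), (106, 6)]))
```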

The solution here is to issue multiple requests in parallel, and this
is possible only if you have enough session slots. The server can reply
out of order and populate the client's file system cache.
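
A user-space analogy of this, sketched in Python: each worker thread
stands in for one session slot, so max_in_flight bounds how many reads
are outstanding at once. This is not how the kernel client is
implemented; parallel_read and max_in_flight are names made up for the
example. Reads may complete in any order, and the results are collected
back in request order.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def parallel_read(fd: int, segments: List[Tuple[int, int]],
                  max_in_flight: int = 8) -> List[bytes]:
    """Issue up to max_in_flight reads at once; return data in request order."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # os.pread() takes an explicit offset, so the threads can share
        # one descriptor without disturbing a common file position.
        futures = [pool.submit(os.pread, fd, length, offset)
                   for offset, length in segments]
        # Completion order is arbitrary; result() restores request order.
        return [future.result() for future in futures]


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        f.write(bytes(range(256)) * 4)
        f.flush()
        print(parallel_read(f.fileno(), [(0, 2), (512, 2), (255, 1)]))
```

With too few slots (a small max_in_flight), the requests queue up
behind each other and every round trip is paid again.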

Tigran.

On Wed, Feb 8, 2012 at 10:01 PM, Jim Rees <rees@umich.edu> wrote:
> Myklebust, Trond wrote:
>
>  On Wed, 2012-02-08 at 15:31 -0500, Jim Rees wrote:
>  > J. Bruce Fields wrote:
>  >
>  >   On Wed, Feb 08, 2012 at 12:49:01PM -0500, Jim Rees wrote:
>  >   > Myklebust, Trond wrote:
>  >   >
>  >   >   10GigE + high latencies is exactly where we're seeing the value. Andy
>  >   >   has been working with the high energy physics community doing NFS
>  >   >   traffic between the US and CERN...
>  >   >
>  >   > CITI to CERN is just over 120ms.  I don't know what it would be from Andy's
>  >   > house.  Does he have 10G at home yet?
>  >
>  >   That still seems short of what you'd need to get a 255MB bandwidth-delay
>  >   product.
>  >
>  >   I'm just curious what the experiment is here and whether there's a
>  >   possibility the real problem is elsewhere.
>  >
>  > In my opinion, any fix that involves allocating multiple parallel data
>  > streams (rpc slots, tcp connections) is masking the real problem.  But it's
>  > an effective fix.
>
>  Who said anything about multiple tcp connections? All the slots do is
>  allow the server to process more RPC calls in parallel by feeding it
>  more work. How is that masking a problem?
>
> Sorry, the comma was intended to be "or".  I realize there is just one tcp
> connection.
