public inbox for linux-nfs@vger.kernel.org
From: Peter Staubach <staubach@redhat.com>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: chucklever@gmail.com, Andrew Bell <andrew.bell.ia@gmail.com>,
	linux-nfs@vger.kernel.org
Subject: Re: Performance Diagnosis
Date: Tue, 15 Jul 2008 15:21:58 -0400
Message-ID: <487CF8D6.2090908@redhat.com>
In-Reply-To: <1216147879.7981.44.camel@localhost>

Trond Myklebust wrote:
> On Tue, 2008-07-15 at 14:17 -0400, Chuck Lever wrote:
>   
>> On Tue, Jul 15, 2008 at 1:44 PM, Peter Staubach <staubach@redhat.com> wrote:
>>     
>>> Chuck Lever wrote:
>>>       
>>>> On Tue, Jul 15, 2008 at 11:58 AM, Peter Staubach <staubach@redhat.com>
>>>> wrote:
>>>>
>>>>         
>>>>> If it is the notion described above, sometimes called
>>>>> head-of-line blocking, then we could think about ways to duplex
>>>>> operations over multiple TCP connections, perhaps with one
>>>>> connection for small, low-latency operations and another
>>>>> connection for larger, higher-latency operations.
>>>>>
>>>>>           
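(A minimal sketch of the dispatch idea being floated here, assuming
two already-established TCP connections; the names pick_fd,
SMALL_OP_MAX, and struct duplexed_xprt are purely illustrative and
do not exist in the RPC client.)

/* Hypothetical sketch: route each request to one of two TCP
 * connections by expected payload size, so small, latency-sensitive
 * operations never queue behind bulk READ/WRITE traffic. */
#include <stddef.h>

#define SMALL_OP_MAX 4096           /* illustrative cutoff, in bytes */

struct duplexed_xprt {
        int latency_fd;             /* connection for small ops */
        int bulk_fd;                /* connection for large ops */
};

static int pick_fd(struct duplexed_xprt *xprt, size_t payload_bytes)
{
        /* Small requests ride the low-latency connection; everything
         * else shares the bulk connection. */
        return payload_bytes <= SMALL_OP_MAX ? xprt->latency_fd
                                             : xprt->bulk_fd;
}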
>>>> I've dreamed about that for years.  I don't think it would be too
>>>> difficult, but one thing that has held it back is that the shortage
>>>> of ephemeral ports on the client may reduce the number of concurrent
>>>> mount points we can support.
>>>>
>>>> One way to avoid the port issue is to construct an SCTP transport
>>>> for NFS.  SCTP allows multiple streams on the same connection,
>>>> effectively eliminating head-of-line blocking.
>>>>         
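(For reference, the user-space SCTP API from lksctp-tools already
exposes this: one association can carry several independently
ordered streams, and the sender picks a stream per message.  A
minimal sketch, with error handling omitted; link with -lsctp.)

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/sctp.h>

int sctp_two_streams(struct sockaddr_in *srv)
{
        int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);

        /* Ask for two outbound streams when the association is set up. */
        struct sctp_initmsg init = { .sinit_num_ostreams = 2 };
        setsockopt(fd, IPPROTO_SCTP, SCTP_INITMSG, &init, sizeof(init));

        connect(fd, (struct sockaddr *)srv, sizeof(*srv));

        /* Stream 0 for small operations, stream 1 for bulk data; a
         * large message on stream 1 does not delay stream 0. */
        sctp_sendmsg(fd, "getattr", 7, NULL, 0, 0, 0, 0 /* stream */, 0, 0);
        sctp_sendmsg(fd, "bigwrite", 8, NULL, 0, 0, 0, 1 /* stream */, 0, 0);
        return fd;
}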
>>> I like the idea of combining this work with implementing a proper
>>> connection manager so that we don't need a connection per mount.
>>> We really only need one connection per client and server, no matter
>>> how many individual mounts there might be from that single server.
>>> (Or two connections, if we want to do something like this...)
>>>
>>> We could also manage the connection space and thus never run into
>>> a shortage of ports again.  When the port space is full or
>>> we've run into some other artificial limit, we simply close
>>> down some other connection to make space.
>>>       
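(A hedged sketch of the bookkeeping such a connection manager might
need: a cache keyed by server address, refcounted so that every
mount of the same server shares one connection, with the
least-recently-used idle entry closed when an arbitrary cap is hit.
None of these structures, nor the 16-entry cap, correspond to real
kernel code.)

#include <string.h>
#include <time.h>
#include <unistd.h>
#include <netinet/in.h>

#define CONN_CACHE_MAX 16           /* illustrative limit */

struct conn_entry {
        struct sockaddr_in server;  /* cache key */
        int fd;                     /* the one shared connection */
        int refcount;               /* mounts currently using it */
        time_t last_used;           /* for LRU eviction */
};

static struct conn_entry cache[CONN_CACHE_MAX];
static int cache_used;

/* Reuse an existing connection to this server, if any. */
static struct conn_entry *conn_lookup(const struct sockaddr_in *srv)
{
        for (int i = 0; i < cache_used; i++)
                if (!memcmp(&cache[i].server, srv, sizeof(*srv)))
                        return &cache[i];
        return NULL;
}

/* When the cache is full, close the oldest idle connection. */
static void conn_evict_one(void)
{
        struct conn_entry *victim = NULL;

        for (int i = 0; i < cache_used; i++)
                if (cache[i].refcount == 0 &&
                    (!victim || cache[i].last_used < victim->last_used))
                        victim = &cache[i];
        if (victim) {
                close(victim->fd);
                *victim = cache[--cache_used];
        }
}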
>> I think we should do this for text-based mounts; however, this would
>> mean the connection management would happen in the kernel, which
>> (only slightly) complicates things.
>>
>> I was thinking about this a little last week when Trond mentioned
>> implementing a connected UDP socket transport...
>>
>> It would be nice if all the kernel RPC services that needed to send a
>> single RPC request (like mount, rpcbind, and so on) could share a
>> small managed pool of sockets (a pool of TCP sockets, or a pool of
>> connected UDP sockets).  Connected sockets have the ostensible
>> advantage that they can quickly detect the absence of a remote
>> listener.  But such a pool would be a good idea because multiple mount
>> requests to the same server could all flow over the same set of
>> connections.
>>
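(The quick-failure property of connected UDP sockets is documented
in udp(7): once connect(2) has fixed the peer, an ICMP
port-unreachable from the server surfaces as ECONNREFUSED on the
socket instead of being silently dropped.  A minimal user-space
demonstration; a real caller would set a receive timeout.)

#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

void probe_udp_listener(const char *ip, unsigned short port)
{
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in peer = {
                .sin_family = AF_INET,
                .sin_port   = htons(port),
        };
        inet_pton(AF_INET, ip, &peer.sin_addr);

        connect(fd, (struct sockaddr *)&peer, sizeof(peer));
        send(fd, "ping", 4, 0);

        /* If nothing listens on the port, the ICMP error comes back
         * as ECONNREFUSED on the next socket call. */
        char buf[16];
        if (recv(fd, buf, sizeof(buf), 0) < 0 && errno == ECONNREFUSED)
                printf("no listener at %s:%u\n", ip, port);
        close(fd);
}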
>> But we might be able to get away with something nearly as efficient if
>> the RPC client would always invoke a connect(AF_UNSPEC) before
>> destroying the socket.  Wouldn't that free the ephemeral port
>> immediately?  What are the risks of trying something like this?
>>     
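(The AF_UNSPEC trick itself is documented for datagram sockets:
connect(2) with the address family set to AF_UNSPEC dissolves the
association.  Whether it frees a TCP ephemeral port early, as asked
above, is exactly the open question; this only shows the call.)

#include <string.h>
#include <sys/socket.h>

int rpc_disconnect(int fd)
{
        struct sockaddr sa;

        /* Per connect(2), a null address family disconnects the
         * socket from its peer. */
        memset(&sa, 0, sizeof(sa));
        sa.sa_family = AF_UNSPEC;
        return connect(fd, &sa, sizeof(sa));
}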
>
>
> Why is all the talk here only about RPC level solutions?
>
> Newer kernels already have a good deal of extra throttling of writes at
> the NFS superblock level, and there is even a sysctl to control the
> amount of outstanding writes before the VM congestion control sets in.
> Please see /proc/sys/fs/nfs/nfs_congestion_kb
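(That sysctl can be inspected like any other; a trivial read of the
current threshold, which is expressed in kilobytes of outstanding
NFS writeback.)

#include <stdio.h>

long nfs_congestion_kb(void)
{
        long kb = -1;
        FILE *f = fopen("/proc/sys/fs/nfs/nfs_congestion_kb", "r");

        if (f) {
                if (fscanf(f, "%ld", &kb) != 1)
                        kb = -1;
                fclose(f);
        }
        return kb;
}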

The throttling of writes definitely seems like an NFS-level issue,
so that's a good thing.  (RHEL-5 might be a tad far enough behind
to not be able to take advantage of all of these modern
things...  :-))

The connection manager would seem to be an RPC-level thing, although
I haven't sufficiently thought through the ramifications of the
NFSv4.1 stuff and how it might impact a connection manager.

       ps

Thread overview: 16+ messages

2008-07-15 15:34 Performance Diagnosis Andrew Bell
2008-07-15 15:49   ` Chuck Lever
2008-07-15 15:58   ` Peter Staubach
2008-07-15 16:23     ` Chuck Lever
2008-07-15 16:34         ` Andrew Bell
2008-07-15 17:20             ` Chuck Lever
2008-07-15 17:44         ` Peter Staubach
2008-07-15 18:17           ` Chuck Lever
2008-07-15 18:51               ` Trond Myklebust
2008-07-15 19:21                 ` Peter Staubach [this message]
2008-07-15 19:35                   ` Trond Myklebust
2008-07-15 19:55                     ` Peter Staubach
2008-07-15 20:27                       ` Trond Myklebust
2008-07-15 20:48                         ` Peter Staubach
2008-07-15 21:15                       ` Talpey, Thomas
2008-07-16  7:35                     ` Benny Halevy
