From: Peter Staubach <staubach@redhat.com>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: chucklever@gmail.com, Andrew Bell <andrew.bell.ia@gmail.com>,
	linux-nfs@vger.kernel.org
Subject: Re: Performance Diagnosis
Date: Tue, 15 Jul 2008 15:21:58 -0400
Message-ID: <487CF8D6.2090908@redhat.com>
In-Reply-To: <1216147879.7981.44.camel@localhost>

Trond Myklebust wrote:
> On Tue, 2008-07-15 at 14:17 -0400, Chuck Lever wrote:
>   
>> On Tue, Jul 15, 2008 at 1:44 PM, Peter Staubach <staubach@redhat.com> wrote:
>>     
>>> Chuck Lever wrote:
>>>       
>>>> On Tue, Jul 15, 2008 at 11:58 AM, Peter Staubach <staubach@redhat.com>
>>>> wrote:
>>>>
>>>>         
>>>>> If it is the notion described above, sometimes called head
>>>>> of line blocking, then we could think about ways to duplex
>>>>> operations over multiple TCP connections, perhaps with one
>>>>> connection for small, low latency operations, and another
>>>>> connection for larger, higher latency operations.
>>>>>
>>>>>           
>>>> I've dreamed about that for years.  I don't think it would be too
>>>> difficult, but one thing that has held it back is that the shortage
>>>> of ephemeral ports on the client may reduce the number of concurrent
>>>> mount points we can support.
>>>>
>>>> One way to avoid the port issue is to construct an SCTP transport for
>>>> NFS.  SCTP allows multiple streams on the same connection, effectively
>>>> eliminating head of line blocking.
>>>>         
>>> I like the idea of combining this work with implementing a proper
>>> connection manager so that we don't need a connection per mount.
>>> We really only need one connection per client and server, no matter
>>> how many individual mounts there might be from that single server.
>>> (Or two connections, if we want to do something like this...)
>>>
>>> We could also manage the connection space and thus, never run into
>>> the shortage of ports ever again.  When the port space is full or
>>> we've run into some other artificial limit, then we simply close
>>> down some other connection to make space.
>>>       
>> I think we should do this for text-based mounts; however this would
>> mean the connection management would happen in the kernel, which (only
>> slightly) complicates things.
>>
>> I was thinking about this a little last week when Trond mentioned
>> implementing a connected UDP socket transport...
>>
>> It would be nice if all the kernel RPC services that needed to send a
>> single RPC request (like mount, rpcbind, and so on) could share a
>> small managed pool of sockets (a pool of TCP sockets, or a pool of
>> connected UDP sockets).  Connected sockets have the ostensible
>> advantage that they can quickly detect the absence of a remote
>> listener.  But such a pool would be a good idea because multiple mount
>> requests to the same server could all flow over the same set of
>> connections.
>>
>> But we might be able to get away with something nearly as efficient if
>> the RPC client would always invoke a connect(AF_UNSPEC) before
>> destroying the socket.  Wouldn't that free the ephemeral port
>> immediately?  What are the risks of trying something like this?
>>     
>
>
> Why is all the talk here only about RPC level solutions?
>
> Newer kernels already have a good deal of extra throttling of writes at
> the NFS superblock level, and there is even a sysctl to control the
> amount of outstanding writes before the VM congestion control sets in.
> Please see /proc/sys/fs/nfs/nfs_congestion_kb

The throttling of writes definitely seems like an NFS-level issue,
so that's a good thing.  (RHEL-5 might be a tad too far behind
to be able to take advantage of all of these modern
things...  :-))
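
For anyone who wants to check whether their kernel exposes the knob
Trond mentions, here is a minimal userspace sketch.  The path is the
one quoted above; the helper name read_nfs_congestion_kb is made up
for illustration, and I am assuming the file holds a single integer
in kilobytes, as the name suggests.  On kernels without the sysctl
(or without the NFS client loaded) the helper just returns None.

```python
from pathlib import Path

# Path as quoted in this thread; it may be absent on older kernels
# or when the NFS client module is not loaded.
NFS_CONGESTION_KB = Path("/proc/sys/fs/nfs/nfs_congestion_kb")


def read_nfs_congestion_kb(path=NFS_CONGESTION_KB):
    """Return the threshold as an int (assumed to be kB), or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (FileNotFoundError, PermissionError, ValueError):
        # Missing sysctl, unreadable file, or unexpected contents.
        return None


if __name__ == "__main__":
    value = read_nfs_congestion_kb()
    if value is None:
        print("nfs_congestion_kb is not available on this system")
    else:
        print(f"nfs_congestion_kb = {value}")
```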

The connection manager would seem to be an RPC-level thing, although
I haven't sufficiently thought through the ramifications of the
NFSv4.1 stuff and how it might impact a connection manager.
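
To make the connection manager idea a bit more concrete, here is a
rough userspace sketch of the policy described earlier in the thread:
one shared connection per server regardless of the number of mounts,
and, when an artificial limit is hit, closing some other connection
to make space.  This is only an illustration, not kernel code.  The
names ConnectionManager and FakeConnection, the max_connections
limit, and the least-recently-used choice of victim are all my
assumptions; the thread only says "some other connection".  A real
implementation would also have to deal with RPCs still in flight on
the connection being closed.

```python
from collections import OrderedDict


class FakeConnection:
    """Stand-in for a transport connection (hypothetical, for illustration)."""

    def __init__(self, server):
        self.server = server
        self.closed = False

    def close(self):
        self.closed = True


class ConnectionManager:
    """Share one connection per server; evict the least recently used when full."""

    def __init__(self, connect, max_connections):
        if max_connections < 1:
            raise ValueError("max_connections must be at least 1")
        self._connect = connect        # factory: server -> object with close()
        self._max = max_connections    # artificial limit (assumed parameter)
        self._conns = OrderedDict()    # server -> connection, oldest use first

    def get(self, server):
        conn = self._conns.get(server)
        if conn is not None:
            # Every mount from the same server reuses the same connection.
            self._conns.move_to_end(server)
            return conn
        if len(self._conns) >= self._max:
            # Out of space: close down some other connection to make room.
            _, victim = self._conns.popitem(last=False)
            victim.close()
        conn = self._connect(server)
        self._conns[server] = conn
        return conn


if __name__ == "__main__":
    mgr = ConnectionManager(FakeConnection, max_connections=2)
    first = mgr.get("server-a")
    second = mgr.get("server-a")       # a second mount from the same server
    print(first is second)
```

The only point of the sketch is that the number of mounts and the
number of ports in use become independent of each other once the
connections are managed in one place.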

       ps
