* RE: [NFS] Re: multiple servers per automount
@ 2003-10-10 15:43 Ogden, Aaron A.
2003-10-10 15:54 ` Mike Waychison
0 siblings, 1 reply; 7+ messages in thread
From: Ogden, Aaron A. @ 2003-10-10 15:43 UTC (permalink / raw)
To: Lever, Charles, Ian Kent, Mike Waychison; +Cc: autofs mailing list, nfs
Aha! Wisdom from the heavens... :-)
I assume that the RPC code is doing that to comply with reserved-port
restrictions, ie. ports < 1024. Solaris needs to do the same thing
(with nfssrv:nfs_portmon=1) so it seems that there would be an inherent
limit of 1024 ports or mountpoints to work with. Actually less, since
some ports will be in use. How does Sun get 260,000 active mounts if
they can only use ports < 1024? Do we really need one port for each
mountpoint?
Perhaps this has something to do with the fact that solaris autofs is
multithreaded (ie. one process) whereas linux autofs has many processes,
one for each mountpoint. Feel free to correct me if I'm wrong...
-A
-----Original Message-----
From: Lever, Charles [mailto:Charles.Lever@netapp.com]
Sent: Friday, October 10, 2003 10:10 AM
To: Ian Kent; Mike Waychison
Cc: Ogden, Aaron A.; autofs mailing list; nfs@lists.sourceforge.net
Subject: RE: [NFS] Re: [autofs] multiple servers per automount
the problem is likely the algorithm used to allocate
ports for the RPC transport sockets. it starts at
port 800 and goes down to zero.
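
To see why such a scan caps the number of transports, here is a minimal Python sketch. It is a simulation under stated assumptions, not the kernel's RPC code: the function names are hypothetical, each transport is assumed to need its own port, and port 0 is excluded because binding to port 0 asks the kernel for an arbitrary port.

```python
def allocate_descending(in_use, start=800):
    """Return the first free port scanning start, start-1, ..., 1, or None."""
    for port in range(start, 0, -1):
        if port not in in_use:
            return port
    return None  # every candidate port is taken


def max_transports(preexisting, start=800):
    """Count how many sockets can be bound before the scan runs dry."""
    in_use = set(preexisting)
    count = 0
    while True:
        port = allocate_descending(in_use, start)
        if port is None:
            return count
        in_use.add(port)
        count += 1
```

With nothing else bound, the scan yields exactly 800 ports, and every reserved port at or below 800 that another service already holds lowers that ceiling by one.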
> -----Original Message-----
> From: Ian Kent [mailto:raven@themaw.net]
> Sent: Thursday, October 09, 2003 6:09 PM
> To: Mike Waychison
> Cc: Ogden, Aaron A.; autofs mailing list; nfs@lists.sourceforge.net
> Subject: [NFS] Re: [autofs] multiple servers per automount
>
>
> On Thu, 9 Oct 2003, Mike Waychison wrote:
>
> > Ogden, Aaron A. wrote:
> >
> > >Ouch. As you may know, the limit is *much* lower in linux. Something
> > >that I've been struggling with recently...
> > >
> > >Under normal circumstances I would not be concerned with 'limitations'
> > >of a few hundred active NFS mounts, but such limitations certainly limit
> > >scalability for the extreme cases.
> > >
> > >
> >
> > The maximum number of plain pseudo-block device filesystems on a given
> > filesystem is limited to 256. (This includes proc, autofs, nfs..).
> >
> > This is because pseudo-block filesystems all use major 0, and each have
> > a different minor (thus the 256 limit).
> >
> > There are however patches floating around (look at SuSE's kernels, I'm
> > not sure about RH) that allow n majors to be used (default 5). This
> > gives you 1280 mounts, a big step up :)
> >
>
> But as Aaron and I know things go pear shaped at just shy of 800 mounts
> with RedHat kernels. They have the more-unnamed patch.
>
> So this would indicate that even if there is a device system that can
> increase the number of unnamed devices that subsystems like NFS cannot
> handle this many mounts.
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [NFS] Re: multiple servers per automount
2003-10-10 15:43 [NFS] Re: multiple servers per automount Ogden, Aaron A.
@ 2003-10-10 15:54 ` Mike Waychison
0 siblings, 0 replies; 7+ messages in thread
From: Mike Waychison @ 2003-10-10 15:54 UTC (permalink / raw)
To: Ogden, Aaron A.; +Cc: autofs mailing list, nfs, Lever, Charles, Ian Kent
Ogden, Aaron A. wrote:
>Aha! Wisdom from the heavens... :-)
>I assume that the RPC code is doing that to comply with reserved-port
>restrictions, ie. ports < 1024. Solaris needs to do the same thing
>(with nfssrv:nfs_portmon=1) so it seems that there would be an inherent
>limit of 1024 ports or mountpoints to work with. Actually less, since
>some ports will be in use. How does Sun get 260,000 active mounts if
>they can only use ports < 1024? Do we really need one port for each
>mountpoint?
>
>
Don't take my word for it, because I don't know any better, but
Solaris may multiplex different NFS servers on the same UDP port. They
may also have run their tests with TCP instead of UDP, which solves
that problem elegantly.
>Perhaps this has something to do with the fact that solaris autofs is
>multithreaded (ie. one process) whereas linux autofs has many processes,
>one for each mountpoint. Feel free to correct me if I'm wrong...
>
>
Nah, this sounds a lot like an NFS issue. See Charles Lever's post.
Mike Waychison
* Re: [NFS] Re: multiple servers per automount
@ 2003-10-10 17:02 Eric Werme USG
0 siblings, 0 replies; 7+ messages in thread
From: Eric Werme USG @ 2003-10-10 17:02 UTC (permalink / raw)
To: aogden; +Cc: autofs
Ogden, Aaron A. wrote:
>Aha! Wisdom from the heavens... :-)
>I assume that the RPC code is doing that to comply with reserved-port
>restrictions, ie. ports < 1024. Solaris needs to do the same thing
>(with nfssrv:nfs_portmon=1) so it seems that there would be an inherent
>limit of 1024 ports or mountpoints to work with. Actually less, since
>some ports will be in use. How does Sun get 260,000 active mounts if
>they can only use ports < 1024? Do we really need one port for each
>mountpoint?
I can't speak for Solaris, but on HP's Tru64 UNIX we use one TCP
connection for all traffic per mount, and we close connections that
have been idle for 5 minutes and when there are "too many" connections
to one server. For UDP, the NFS client uses a single port, in large part
due to problems with port number space exhaustion and the ripple effects
on other consumers of that space. (We don't throttle the number of outstanding
NFS requests, but we have a fixed limit on the read/write nfsiod helper
threads.) We generally ran into port number exhaustion on our mail server
which uses NFS (via automount) to access /home/user/.forward files. If one
production system went down, then the mail server would wind up with a
big flock of sendmails all trying to access the .forwards until the port
number space was chewed up, then automount couldn't issue new mounts
whereupon no mail got delivered to anyone.
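
The connection policy described above (one connection per mount, closed after 5 idle minutes or when a server has "too many") can be sketched as follows. This is a hedged Python illustration, not Tru64 code: the class name, the `max_per_server` value, and the choice to close the least recently used connection first are all assumptions, since the post does not say what "too many" means or which connection is picked.

```python
import time


class MountConnCache:
    """Track one connection per mount and decide which ones to close."""

    def __init__(self, idle_timeout=300.0, max_per_server=8, clock=time.monotonic):
        self.idle_timeout = idle_timeout      # 5 minutes, as stated in the post
        self.max_per_server = max_per_server  # hypothetical stand-in for "too many"
        self._clock = clock
        self._conns = {}                      # mount -> (server, last_used)

    def touch(self, mount, server):
        """Record traffic on the connection belonging to `mount`."""
        self._conns[mount] = (server, self._clock())

    def reap(self):
        """Drop idle and surplus connections; return the mounts that were closed."""
        now = self._clock()
        closed = [m for m, (_, used) in self._conns.items()
                  if now - used >= self.idle_timeout]
        for m in closed:
            del self._conns[m]
        by_server = {}
        for m, (server, used) in self._conns.items():
            by_server.setdefault(server, []).append((used, m))
        for entries in by_server.values():
            entries.sort()  # least recently used first (an assumed policy)
            surplus = len(entries) - self.max_per_server
            for _, m in entries[:max(surplus, 0)]:
                del self._conns[m]
                closed.append(m)
        return closed
```

A periodic timer would call `reap()` and close whatever it returns; a later request on a closed mount would simply reconnect.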
The NFS client gets its first look at a reply via a callback from UDP
code when it finds the port has been registered. The callback figures
out what thread is waiting for the XID, saves the reply address in a
data structure and issues the wakeup. When the code is processed for
real, it's NFS code that does the UDP checksum, thereby loading
the local cache with the data. The inspiration was pretty simple as I
had to do the same demultiplexing in the NFS over TCP client.
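
The XID demultiplexing described above can be illustrated with a small Python sketch. It is not the Tru64 implementation: the class and method names are hypothetical, and a threading.Event stands in for the kernel wakeup. The one protocol fact it relies on is that an ONC RPC message carried over UDP begins with the 32-bit big-endian XID.

```python
import struct
import threading


class XidDemux:
    """Route replies arriving on one shared port to the caller that owns the XID."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}  # xid -> [Event, reply bytes or None]

    def register(self, xid):
        """Called by a requesting thread before it sends; returns its wait slot."""
        slot = [threading.Event(), None]
        with self._lock:
            self._pending[xid] = slot
        return slot

    def deliver(self, datagram):
        """Called from the receive path; returns True if a waiter was woken."""
        if len(datagram) < 4:
            return False
        (xid,) = struct.unpack("!I", datagram[:4])  # leading 32-bit XID
        with self._lock:
            slot = self._pending.pop(xid, None)
        if slot is None:
            return False  # nobody is waiting: a duplicate or stale reply
        slot[1] = datagram  # save the reply for the waiter
        slot[0].set()       # issue the wakeup
        return True
```

Because the lookup key is the XID rather than the port, any number of mounts can share the single port without their replies getting mixed up.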
BTW, the rationale behind the one TCP connection per mount was to
conform to TCP's congestion control design, but limit the amount of
cross mount locking and code complexity. Typical NFS traffic
has multiple accesses on a mount at a time, so I figured it would be
a good compromise. I know Solaris has one connection per server; I don't
know what other vendors do.
-Ric Werme
--
Eric (Ric) Werme | werme@zk3.dec.com
Hewlett-Packard Co. | http://werme.8m.net/
* RE: [NFS] Re: multiple servers per automount
2003-10-10 15:10 Re: [autofs] " Lever, Charles
@ 2003-10-13 3:05 ` Ian Kent
0 siblings, 0 replies; 7+ messages in thread
From: Ian Kent @ 2003-10-13 3:05 UTC (permalink / raw)
To: Lever, Charles; +Cc: Ogden, Aaron A., autofs mailing list, Mike Waychison, nfs
On Fri, 10 Oct 2003, Lever, Charles wrote:
> the problem is likely the algorithm used to allocate
> ports for the RPC transport sockets. it starts at
> port 800 and goes down to zero.
Don't think so.
It appears that a single connection is maintained for nfs comms for both
udp and tcp.
However, if a number of mount requests are fired in rapid succession then
multiple portmap connections are made. They end up in a TIME_WAIT state which is
probably causing the port allocation starvation.
This doesn't appear to happen under Solaris.
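
One way to check the TIME_WAIT theory is to list which reserved local ports are sitting in that state. Below is a minimal Python sketch that parses text in the format of Linux's /proc/net/tcp, where the fourth column is the socket state and 06 denotes TIME_WAIT. The function name is hypothetical, and the sketch only reads the table; it does not prove that these sockets belong to portmap traffic.

```python
TIME_WAIT = "06"  # state code used in /proc/net/tcp


def reserved_time_wait_ports(proc_net_tcp_text, limit=1024):
    """Return sorted local ports below `limit` whose socket is in TIME_WAIT."""
    ports = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local, state = fields[1], fields[3]
        port = int(local.rsplit(":", 1)[1], 16)  # port is hex after the colon
        if state == TIME_WAIT and port < limit:
            ports.add(port)
    return sorted(ports)
```

On a live system the argument would be the contents of /proc/net/tcp, read right after a burst of mount requests. A long list clustered just below 800 would support the starvation explanation.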
>
> > -----Original Message-----
> > From: Ian Kent [mailto:raven@themaw.net]
> > Sent: Thursday, October 09, 2003 6:09 PM
> > To: Mike Waychison
> > Cc: Ogden, Aaron A.; autofs mailing list; nfs@lists.sourceforge.net
> > Subject: [NFS] Re: [autofs] multiple servers per automount
> >
> >
> > On Thu, 9 Oct 2003, Mike Waychison wrote:
> >
> > > Ogden, Aaron A. wrote:
> > >
> > > >Ouch. As you may know, the limit is *much* lower in linux. Something
> > > >that I've been struggling with recently...
> > > >
> > > >Under normal circumstances I would not be concerned with 'limitations'
> > > >of a few hundred active NFS mounts, but such limitations certainly limit
> > > >scalability for the extreme cases.
> > > >scalability for the extreme cases.
> > > >
> > > >
> > >
> > > The maximum number of plain pseudo-block device filesystems on a given
> > > filesystem is limited to 256. (This includes proc, autofs, nfs..).
> > >
> > > This is because pseudo-block filesystems all use major 0, and each have
> > > a different minor (thus the 256 limit).
> > >
> > > There are however patches floating around (look at SuSE's kernels, I'm
> > > not sure about RH) that allow n majors to be used (default 5). This
> > > gives you 1280 mounts, a big step up :)
> > >
> >
> > But as Aaron and I know things go pear shaped at just shy of 800 mounts
> > with RedHat kernels. They have the more-unnamed patch.
> >
> > So this would indicate that even if there is a device system that can
> > increase the number of unnamed devices that subsystems like NFS cannot
> > handle this many mounts.
> >
> > --
> >
> > ,-._|\ Ian Kent
> > / \ Perth, Western Australia
> > *_.--._/ E-mail: raven@themaw.net
> > v Web: http://themaw.net/
> >
> >
> >
> >
>
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: raven@themaw.net
v Web: http://themaw.net/
* Re: [NFS] RE: multiple servers per automount
2003-10-10 15:16 Ogden, Aaron A.
@ 2003-10-13 3:23 ` Ian Kent
2003-10-14 7:05 ` Joseph V Moss
0 siblings, 1 reply; 7+ messages in thread
From: Ian Kent @ 2003-10-13 3:23 UTC (permalink / raw)
To: Ogden, Aaron A.; +Cc: autofs mailing list, nfs, Mike Waychison
On Fri, 10 Oct 2003, Ogden, Aaron A. wrote:
>
>
> > So this would indicate that even if there is a device system that can
> > increase the number of unnamed devices that subsystems like NFS cannot
> > handle this many mounts.
>
> Maybe. I'm not 100% certain though. Currently I am holding steady at
> 710 active mounts, I am going to write a little script to mount more in
> small increments, ie. read a list of ~1000 mountpoints from /home, mount
> a few of them, check the filesystems, and repeat... this way I will know
> exactly where things break down.
Interesting.
If you can edge it up then it's probably not an available port
restriction.
There may be more than one issue at work here.
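
The incremental test Aaron describes (mount a few, check everything, repeat) could be sketched like this in Python. The function and its `mount` and `healthy` callbacks are hypothetical placeholders, not an existing tool; on an autofs system `mount` might be as simple as listing the directory to trigger the automounter, and `healthy` might wrap os.path.ismount.

```python
def mount_in_steps(mountpoints, mount, healthy, step=10):
    """Mount `step` entries at a time and re-check all mounts after each batch.

    Returns the number of mounts that were active the last time every
    check passed, which brackets the breaking point to within one batch.
    """
    mounted = []
    for i in range(0, len(mountpoints), step):
        batch = mountpoints[i:i + step]
        for path in batch:
            mount(path)
        mounted.extend(batch)
        if not all(healthy(p) for p in mounted):
            return len(mounted) - len(batch)  # last count known to be good
    return len(mounted)
```

Re-running with a smaller `step`, starting from the returned count, narrows the limit down to an exact number.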
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: raven@themaw.net
v Web: http://themaw.net/
* Re: [NFS] RE: multiple servers per automount
2003-10-13 3:23 ` [NFS] " Ian Kent
@ 2003-10-14 7:05 ` Joseph V Moss
0 siblings, 0 replies; 7+ messages in thread
From: Joseph V Moss @ 2003-10-14 7:05 UTC (permalink / raw)
To: Ian Kent; +Cc: Ogden, Aaron A., autofs mailing list, nfs, Mike Waychison
> On Fri, 10 Oct 2003, Ogden, Aaron A. wrote:
>
> >
> >
> > > So this would indicate that even if there is a device system that can
> > > increase the number of unnamed devices that subsystems like NFS cannot
> > > handle this many mounts.
> >
> > Maybe. I'm not 100% certain though. Currently I am holding steady at
> > 710 active mounts, I am going to write a little script to mount more in
> > small increments, ie. read a list of ~1000 mountpoints from /home, mount
> > a few of them, check the filesystems, and repeat... this way I will know
> > exactly where things break down.
>
> Interesting.
>
> If you can edge it up then it's probably not an available port
> restriction.
>
> There may be more than one issue at work here.
>
The limit is 800 as others have stated. Although, it can be less than that
if something else is already using up some of the reserved UDP ports.
I wrote a patch long ago against a 2.2.x kernel to enable it to use
multiple majors for NFS mounts (like the patches now common in several
distros). I then ran into the 800 limit in the RPC layer. After changing
the RPC layer to count up from 0, instead of down from 800, with no real
upper limit, I was able to mount more than 2000 NFS filesystems simultaneously.
I'm sure I could have done many thousand if I had had that many filesystems
around to mount. Obviously, after 1024, it wasn't using reserved ports
anymore, but it didn't seem to matter.
Unfortunately, while the changes to NFS were easy to port to the 2.4 kernel,
the RPC layer is different enough between 2.2 and 2.4 that it didn't work
right off. Bumping it up to somewhere around 1024 should work, but using
non-reserved ports didn't seem to work when I made a simple attempt.
Of course, the real fix for the NFS layer is the expansion of the minor
numbers that's already occurred in 2.6 and the RPC layer problems should
be fixed by multiplexing multiple mounts on the same port.
* Re: [NFS] RE: multiple servers per automount
2003-10-14 13:37 RE: [autofs] " Ian Kent
@ 2003-10-14 15:52 ` Mike Waychison
0 siblings, 0 replies; 7+ messages in thread
From: Mike Waychison @ 2003-10-14 15:52 UTC (permalink / raw)
To: Ian Kent
Cc: Ogden, Aaron A., autofs mailing list, nfs, Kernel Mailing List,
Joseph V Moss
Ian Kent wrote:
>On Tue, 14 Oct 2003, Joseph V Moss wrote:
>
>
>
>>The limit is 800 as others have stated. Although, it can be less than that
>>if something else is already using up some of the reserved UDP ports.
>>
>>I wrote a patch long ago against a 2.2.x kernel to enable it to use
>>multiple majors for NFS mounts (like the patches now common in several
>>distros). I then ran into the 800 limit in the RPC layer. After changing
>>the RPC layer to count up from 0, instead of down from 800, with no real
>>upper limit, I was able to mount more than 2000 NFS filesystems simultaneously.
>>I'm sure I could have done many thousand if I had had that many filesystems
>>around to mount. Obviously, after 1024, it wasn't using reserved ports
>>anymore, but it didn't seem to matter.
>>
>>Unfortunately, while the changes to NFS were easy to port to the 2.4 kernel,
>>the RPC layer is different enough between 2.2 and 2.4 that it didn't work
>>right off. Bumping it up to somewhere around 1024 should work, but using
>>non-reserved ports didn't seem to work when I made a simple attempt.
>>
>>Of course, the real fix for the NFS layer is the expansion of the minor
>>numbers that's already occurred in 2.6 and the RPC layer problems should
>>be fixed by multiplexing multiple mounts on the same port.
>>
>>
>>
>>
>
>I don't see that expansion in 2.6 (test6). It looks to me like the
>allocation is done in set_anon_super (in fs/super.c) and that looks like
>it is restricted to 256. Please correct this for me. I can't see how there
>is any change to the number of unnamed devices.
>
>
>
Here is the quick fix for this in RH 2.1AS kernels:
http://www.kernelnewbies.org/kernels/rh21as/SOURCES/linux-2.4.9-moreunnamed.patch
It makes unnamed block devices use majors 12, 14, 38, 39, as well as 0.
I don't know if anyone is working out a better scheme for
get_unnamed_dev in 2.6 yet. It does need to be done though. A simple
patch for 2.6 would maybe see the unnamed_dev_in_use bitmap grow to
PAGE_SIZE, automatically allowing for 32768 unnamed devices.
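
The numbers in this thread follow from simple bitmap arithmetic: one bit per unnamed device, so 32 bytes give 256 devices, 5 majors of 256 minors give 1280, and a 4096-byte page gives 32768. The Python sketch below models a first-free-bit allocator to make that concrete. It is an illustration only, not the kernel's get_unnamed_dev; the class name is hypothetical and a 4096-byte PAGE_SIZE is assumed.

```python
class UnnamedDevBitmap:
    """First-free-bit allocator over a bitmap of `nbytes` bytes."""

    def __init__(self, nbytes):
        self.bits = bytearray(nbytes)
        self.capacity = nbytes * 8  # one device number per bit

    def get(self):
        """Claim the lowest clear bit and return its index, or None if full."""
        for i, byte in enumerate(self.bits):
            if byte != 0xFF:  # at least one clear bit in this byte
                for bit in range(8):
                    if not byte & (1 << bit):
                        self.bits[i] |= 1 << bit
                        return i * 8 + bit
        return None

    def put(self, index):
        """Release a previously claimed index."""
        self.bits[index // 8] &= ~(1 << (index % 8)) & 0xFF
```

`UnnamedDevBitmap(32).capacity` is 256 and `UnnamedDevBitmap(4096).capacity` is 32768, so growing the bitmap to one page raises the ceiling without touching the allocation logic.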
Mike Waychison
end of thread, other threads:[~2003-10-14 15:52 UTC | newest]
Thread overview: 7+ messages
2003-10-10 15:43 [NFS] Re: multiple servers per automount Ogden, Aaron A.
2003-10-10 15:54 ` Mike Waychison
-- strict thread matches above, loose matches on Subject: below --
2003-10-14 13:37 RE: [autofs] " Ian Kent
2003-10-14 15:52 ` [NFS] " Mike Waychison
2003-10-10 17:02 [NFS] " Eric Werme USG
2003-10-10 15:16 Ogden, Aaron A.
2003-10-13 3:23 ` [NFS] " Ian Kent
2003-10-14 7:05 ` Joseph V Moss
2003-10-10 15:10 Re: [autofs] " Lever, Charles
2003-10-13 3:05 ` [NFS] " Ian Kent