* RE: multiple servers per automount
@ 2003-10-10 15:16 Ogden, Aaron A.
2003-10-13 3:23 ` [NFS] " Ian Kent
0 siblings, 1 reply; 23+ messages in thread
From: Ogden, Aaron A. @ 2003-10-10 15:16 UTC (permalink / raw)
To: Ian Kent, Mike Waychison; +Cc: autofs mailing list, nfs
-----Original Message-----
From: Ian Kent [mailto:raven@themaw.net]
Sent: Thursday, October 09, 2003 8:09 PM
To: Mike Waychison
Cc: Ogden, Aaron A.; autofs mailing list; nfs@lists.sourceforge.net
Subject: Re: [autofs] multiple servers per automount
>> The maximum number of plain pseudo-block device filesystems on a given
>> filesystem is limited to 256. (This includes proc, autofs, nfs..).
>>
>> This is because pseudo-block filesystems all use major 0, and each have
>> a different minor (thus the 256 limit).
>>
>> There are however patches floating around (look at SuSE's kernels, I'm
>> not sure about RH) that allow n majors to be used (default 5). This
>> gives you 1280 mounts, a big step up :)
>>
>
> But as Aaron and I know things go pear shaped at just shy of 800 mounts
> with RedHat kernels. They have the more-unnamed patch.
>
> So this would indicate that even if there is a device system that can
> increase the number of unnamed devices that subsystems like NFS cannot
> handle this many mounts.
Maybe. I'm not 100% certain though. Currently I am holding steady at
710 active mounts. I am going to write a little script to mount more in
small increments, i.e. read a list of ~1000 mountpoints from /home, mount
a few of them, check the filesystems, and repeat. This way I will know
exactly where things break down.
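
For reference, the arithmetic behind the 256 and 1280 figures quoted above can be sketched in C. This assumes the 16-bit device number of 2.4-era kernels (8-bit major, 8-bit minor); the names OLD_MINORBITS and max_unnamed_devs are made up for illustration and are not kernel identifiers.

```c
#include <assert.h>

/* Assumption: 2.4-style 16-bit device numbers, 8 bits major + 8 bits minor. */
enum { OLD_MINORBITS = 8 };

/* Hypothetical helper: unnamed devices available when `majors` majors are used. */
static unsigned int max_unnamed_devs(unsigned int majors)
{
    assert(majors > 0);
    return majors * (1u << OLD_MINORBITS);
}
```

With one major this gives the 256 limit, and with the five majors of the distro patches it gives the 1280 mentioned in the quoted text.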
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: multiple servers per automount
2003-10-10 15:16 multiple servers per automount Ogden, Aaron A.
@ 2003-10-13 3:23 ` Ian Kent
2003-10-14 7:05 ` Joseph V Moss
0 siblings, 1 reply; 23+ messages in thread
From: Ian Kent @ 2003-10-13 3:23 UTC (permalink / raw)
To: Ogden, Aaron A.; +Cc: autofs mailing list, nfs, Mike Waychison
On Fri, 10 Oct 2003, Ogden, Aaron A. wrote:
>
>
> > So this would indicate that even if there is a device system that can
> > increase the number of unnamed devices that subsystems like NFS cannot
> > handle this many mounts.
>
> Maybe. I'm not 100% certain though. Currently I am holding steady at
> 710 active mounts, I am going to write a little script to mount more in
> small increments, ie. read a list of ~1000 mountpoints from /home, mount
> a few of them, check the filesystems, and repeat... this way I will know
> exactly where things break down.
Interesting.
If you can edge it up then it's probably not an available port
restriction.
There may be more than one issue at work here.
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: raven@themaw.net
v Web: http://themaw.net/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: multiple servers per automount
2003-10-13 3:23 ` [NFS] " Ian Kent
@ 2003-10-14 7:05 ` Joseph V Moss
2003-10-14 13:37 ` Ian Kent
0 siblings, 1 reply; 23+ messages in thread
From: Joseph V Moss @ 2003-10-14 7:05 UTC (permalink / raw)
To: Ian Kent; +Cc: Ogden, Aaron A., autofs mailing list, nfs, Mike Waychison
> On Fri, 10 Oct 2003, Ogden, Aaron A. wrote:
>
> >
> >
> > > So this would indicate that even if there is a device system that can
> > > increase the number of unnamed devices that subsystems like NFS cannot
> > > handle this many mounts.
> >
> > Maybe. I'm not 100% certain though. Currently I am holding steady at
> > 710 active mounts, I am going to write a little script to mount more in
> > small increments, ie. read a list of ~1000 mountpoints from /home, mount
> > a few of them, check the filesystems, and repeat... this way I will know
> > exactly where things break down.
>
> Interesting.
>
> If you can edge it up then it's probably not an available port
> restriction.
>
> There may be more than one issue at work here.
>
The limit is 800 as others have stated. Although, it can be less than that
if something else is already using up some of the reserved UDP ports.
I wrote a patch long ago against a 2.2.x kernel to enable it to use
multiple majors for NFS mounts (like the patches now common in several
distros). I then ran into the 800 limit in the RPC layer. After changing
the RPC layer to count up from 0, instead of down from 800, with no real
upper limit, I was able to mount more than 2000 NFS filesystems simultaneously.
I'm sure I could have done many thousand if I had had that many filesystems
around to mount. Obviously, after 1024, it wasn't using reserved ports
anymore, but it didn't seem to matter.
Unfortunately, while the changes to NFS were easy to port to the 2.4 kernel,
the RPC layer is different enough between 2.2 and 2.4 that it didn't work
right off. Bumping it up to somewhere around 1024 should work, but using
non-reserved ports didn't seem to work when I made a simple attempt.
Of course, the real fix for the NFS layer is the expansion of the minor
numbers that's already occurred in 2.6 and the RPC layer problems should
be fixed by multiplexing multiple mounts on the same port.
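
To make the 800 figure concrete, here is a toy userspace model of a port allocator that counts down from 800, as described above. It is only a sketch, not the kernel's RPC code; bind_resv_port and port_in_use are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the RPC client tries reserved ports from 800 downwards. */
enum { RESV_START = 800 };

static bool port_in_use[RESV_START + 1];    /* index 0 is never handed out */

/* Hypothetical model: return the first free port at or below 800, or -1. */
static int bind_resv_port(void)
{
    for (int port = RESV_START; port > 0; port--) {
        if (!port_in_use[port]) {
            port_in_use[port] = true;
            return port;
        }
    }
    return -1;    /* all 800 candidates are taken: the mount would fail */
}
```

Each mount consumes one port in this model, so the ceiling is 800 mounts, or fewer if other services already hold ports in that range.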
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: [autofs] multiple servers per automount
2003-10-14 7:05 ` Joseph V Moss
@ 2003-10-14 13:37 ` Ian Kent
0 siblings, 0 replies; 23+ messages in thread
From: Ian Kent @ 2003-10-14 13:37 UTC (permalink / raw)
To: Joseph V Moss; +Cc: Ogden, Aaron A., autofs mailing list, nfs, Mike Waychison
On Tue, 14 Oct 2003, Joseph V Moss wrote:
> The limit is 800 as others have stated. Although, it can be less than that
> if something else is already using up some of the reserved UDP ports.
>
> I wrote a patch long ago against a 2.2.x kernel to enable it to use
> multiple majors for NFS mounts (like the patches now common in several
> distros). I then ran into the 800 limit in the RPC layer. After changing
> the RPC layer to count up from 0, instead of down from 800, with no real
> upper limit, I was able to mount more than 2000 NFS filesystems simultaneously.
> I'm sure I could have done many thousand if I had had that many filesystems
> around to mount. Obviously, after 1024, it wasn't using reserved ports
> anymore, but it didn't seem to matter.
>
> Unfortunately, while the changes to NFS were easy to port to the 2.4 kernel,
> the RPC layer is different enough between 2.2 and 2.4 that it didn't work
> right off. Bumping it up to somewhere around 1024 should work, but using
> non-reserved ports didn't seem to work when I made a simple attempt.
>
> Of course, the real fix for the NFS layer is the expansion of the minor
> numbers that's already occurred in 2.6 and the RPC layer problems should
> be fixed by multiplexing multiple mounts on the same port.
>
>
I don't see that expansion in 2.6 (test6). It looks to me like the
allocation is done in set_anon_super (in fs/super.c) and that looks like
it is restricted to 256. Please correct me if I'm wrong. I can't see any
change to the number of unnamed devices.
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: raven@themaw.net
v Web: http://themaw.net/
-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
SourceForge.net hosts over 70,000 Open Source Projects.
See the people who have HELPED US provide better services:
Click here: http://sourceforge.net/supporters.php
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: multiple servers per automount
2003-10-14 13:37 ` Ian Kent
@ 2003-10-14 15:52 ` Mike Waychison
-1 siblings, 0 replies; 23+ messages in thread
From: Mike Waychison @ 2003-10-14 15:52 UTC (permalink / raw)
To: Ian Kent
Cc: Ogden, Aaron A., autofs mailing list, nfs, Kernel Mailing List,
Joseph V Moss
Ian Kent wrote:
>On Tue, 14 Oct 2003, Joseph V Moss wrote:
>
>
>
>>The limit is 800 as others have stated. Although, it can be less than that
>>if something else is already using up some of the reserved UDP ports.
>>
>>I wrote a patch long ago against a 2.2.x kernel to enable it to use
>>multiple majors for NFS mounts (like the patches now common in several
>>distros). I then ran into the 800 limit in the RPC layer. After changing
>>the RPC layer to count up from 0, instead of down from 800, with no real
>>upper limit, I was able to mount more than 2000 NFS filesystems simultaneously.
>>I'm sure I could have done many thousand if I had had that many filesystems
>>around to mount. Obviously, after 1024, it wasn't using reserved ports
>>anymore, but it didn't seem to matter.
>>
>>Unfortunately, while the changes to NFS were easy to port to the 2.4 kernel,
>>the RPC layer is different enough between 2.2 and 2.4 that it didn't work
>>right off. Bumping it up to somewhere around 1024 should work, but using
>>non-reserved ports didn't seem to work when I made a simple attempt.
>>
>>Of course, the real fix for the NFS layer is the expansion of the minor
>>numbers that's already occurred in 2.6 and the RPC layer problems should
>>be fixed by multiplexing multiple mounts on the same port.
>>
>>
>>
>>
>
>I don't see that expansion in 2.6 (test6). It looks to me like the
>allocation is done in set_anon_super (in fs/super.c) and that looks like
>it is restricted to 256. Please correct this for me. I can't see how there
>is any change to the number of unnmaed devices.
>
>
>
Here is the quick fix for this in RH 2.1AS kernels:
http://www.kernelnewbies.org/kernels/rh21as/SOURCES/linux-2.4.9-moreunnamed.patch
It makes unnamed block devices use majors 12, 14, 38, 39, as well as 0.
I don't know if anyone is working out a better scheme for
get_unnamed_dev in 2.6 yet. It does need to be done though. A simple
patch for 2.6 would maybe see the unnamed_dev_in_use bitmap grow to
PAGE_SIZE, automatically allowing for 32768 unnamed devices.
Mike Waychison
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: [autofs] multiple servers per automount
2003-10-14 15:52 ` [NFS] RE: [autofs] " Mike Waychison
(?)
@ 2003-10-14 20:44 ` H. Peter Anvin
2003-10-14 23:12 ` Mike Waychison
-1 siblings, 1 reply; 23+ messages in thread
From: H. Peter Anvin @ 2003-10-14 20:44 UTC (permalink / raw)
To: linux-kernel
Followup to: <3F8C1BB6.9010202@sun.com>
By author: Mike Waychison <Michael.Waychison@Sun.COM>
In newsgroup: linux.dev.kernel
>
> Here is the quick fix for this in RH 2.1AS kernels:
>
> http://www.kernelnewbies.org/kernels/rh21as/SOURCES/linux-2.4.9-moreunnamed.patch
>
> It makes unnamed block devices use majors 12, 14, 38, 39, as well as 0.
>
> I don't know if anyone is working out a better scheme for
> get_unnamed_dev in 2.6 yet. It does need to be done though. A simple
> patch for 2.6 would maybe see the unnamed_dev_in_use bitmap grow to
> PAGE_SIZE, automatically allowing for 32768 unnamed devices.
>
dev_t enlargement, which solves this without a bunch of auxiliary
majors, should be in 2.6.
-hpa
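
A rough sketch of what the enlargement buys, assuming a 32-bit dev_t split into a 12-bit major and a 20-bit minor. The helper names below are illustrative stand-ins, not the kernel's own macros.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 32-bit dev_t with a 12-bit major and a 20-bit minor. */
enum { NEW_MINORBITS = 20 };
#define NEW_MINORMASK ((1u << NEW_MINORBITS) - 1)

/* Pack a major/minor pair into one device number. */
static uint32_t mk_dev(uint32_t major, uint32_t minor)
{
    assert(minor <= NEW_MINORMASK);
    return (major << NEW_MINORBITS) | minor;
}

static uint32_t dev_major(uint32_t dev) { return dev >> NEW_MINORBITS; }
static uint32_t dev_minor(uint32_t dev) { return dev & NEW_MINORMASK; }
```

Under that assumption major 0 alone has room for 2^20 minors, so no auxiliary majors are needed, provided the allocator actually hands out more than 256 of them.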
--
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
If you send me mail in HTML format I will assume it's spam.
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: [autofs] multiple servers per automount
2003-10-14 20:44 ` H. Peter Anvin
@ 2003-10-14 23:12 ` Mike Waychison
2003-10-15 10:28 ` Ingo Oeser
0 siblings, 1 reply; 23+ messages in thread
From: Mike Waychison @ 2003-10-14 23:12 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: linux-kernel, Ian Kent
[-- Attachment #1: Type: text/plain, Size: 1092 bytes --]
H. Peter Anvin wrote:
> Followup to: <3F8C1BB6.9010202@sun.com>
> By author: Mike Waychison <Michael.Waychison@Sun.COM>
> In newsgroup: linux.dev.kernel
>
>>Here is the quick fix for this in RH 2.1AS kernels:
>>
>>http://www.kernelnewbies.org/kernels/rh21as/SOURCES/linux-2.4.9-moreunnamed.patch
>>
>>It makes unnamed block devices use majors 12, 14, 38, 39, as well as 0.
>>
>>I don't know if anyone is working out a better scheme for
>>get_unnamed_dev in 2.6 yet. It does need to be done though. A simple
>>patch for 2.6 would maybe see the unnamed_dev_in_use bitmap grow to
>>PAGE_SIZE, automatically allowing for 32768 unnamed devices.
>>
>
>
> dev_t enlargement, which solves this without a bunch of auxilliary
> majors, should be in 2.6.
>
> -hpa
The problem still remains in 2.6 that we limit the count to 256. I've
attached a quick patch that I've compiled and tested. I don't know if
there is a better way to handle dynamic assignment of minors (haven't
kept up to date in that realm), but if there is, then we should probably
use it instead.
Mike Waychison
[-- Attachment #2: max_anon.patch --]
[-- Type: text/plain, Size: 881 bytes --]
===== fs/super.c 1.108 vs edited =====
--- 1.108/fs/super.c Wed Oct 1 15:36:45 2003
+++ edited/fs/super.c Tue Oct 14 22:52:12 2003
@@ -528,14 +528,22 @@
* filesystems which don't use real block-devices. -- jrs
*/
-enum {Max_anon = 256};
-static unsigned long unnamed_dev_in_use[Max_anon/(8*sizeof(unsigned long))];
+enum {Max_anon = PAGE_SIZE * 8};
+static void *unnamed_dev_in_use = NULL;
static spinlock_t unnamed_dev_lock = SPIN_LOCK_UNLOCKED;/* protects the above */
int set_anon_super(struct super_block *s, void *data)
{
int dev;
spin_lock(&unnamed_dev_lock);
+
+ if (!unnamed_dev_in_use)
+ unnamed_dev_in_use = (void *)get_zeroed_page(GFP_KERNEL);
+ if (!unnamed_dev_in_use) {
+ spin_unlock(&unnamed_dev_lock);
+ return -ENOMEM;
+ }
+
dev = find_first_zero_bit(unnamed_dev_in_use, Max_anon);
if (dev == Max_anon) {
spin_unlock(&unnamed_dev_lock);
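
The idea of the patch above, one bit per unnamed device in a page-sized bitmap, can be modelled in userspace roughly as follows. This is a sketch under stated assumptions: a 4096-byte page, no locking, calloc standing in for get_zeroed_page, and hypothetical names (anon_dev_get, anon_dev_put) that do not exist in the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Assumption: 4096-byte pages; the real PAGE_SIZE is architecture-specific. */
enum { SKETCH_PAGE_SIZE = 4096, MAX_ANON = SKETCH_PAGE_SIZE * CHAR_BIT };

static unsigned char *anon_map;    /* one bit per unnamed device minor */

/* Hypothetical stand-in for the minor allocation step in set_anon_super(). */
static int anon_dev_get(void)
{
    if (!anon_map) {
        anon_map = calloc(1, SKETCH_PAGE_SIZE);    /* zeroed "page" */
        if (!anon_map)
            return -1;
    }
    /* Linear scan for the first clear bit, like find_first_zero_bit(). */
    for (int dev = 0; dev < MAX_ANON; dev++) {
        if (!(anon_map[dev / CHAR_BIT] & (1u << (dev % CHAR_BIT)))) {
            anon_map[dev / CHAR_BIT] |= 1u << (dev % CHAR_BIT);
            return dev;
        }
    }
    return -1;    /* every minor is taken */
}

/* Release a minor so that it can be handed out again. */
static void anon_dev_put(int dev)
{
    assert(anon_map && dev >= 0 && dev < MAX_ANON);
    anon_map[dev / CHAR_BIT] &= ~(1u << (dev % CHAR_BIT));
}
```

With a 4096-byte page the bitmap covers 32768 minors, which matches the figure given earlier in the thread.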
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: [autofs] multiple servers per automount
2003-10-14 23:12 ` Mike Waychison
@ 2003-10-15 10:28 ` Ingo Oeser
2003-10-15 16:16 ` Mike Waychison
2003-10-23 13:37 ` Ian Kent
0 siblings, 2 replies; 23+ messages in thread
From: Ingo Oeser @ 2003-10-15 10:28 UTC (permalink / raw)
To: Mike Waychison
Cc: linux-kernel, Ian Kent
On Wednesday 15 October 2003 01:12, Mike Waychison wrote:
> The problem still remains in 2.6 that we limit the count to 256. I've
> attached a quick patch that I've compiled and tested. I don't know if
> there is a better way to handle dynamic assignment of minors (haven't
> kept up to date in that realm), but if there is, then we should probably
> use it instead.
In your patch you allocate inside the spinlock.
I would suggest doing something like the following:

    void *local = NULL;

    if (!unnamed_dev_in_use) {
        local = (void *)get_zeroed_page(GFP_KERNEL);
        if (!local)
            return -ENOMEM;
    }

    spin_lock(&unnamed_dev_lock);
    mb();
    if (!unnamed_dev_in_use) {
        unnamed_dev_in_use = local;
        /* Used globally, don't free now */
        local = NULL;
    }

    /*
     * Do the lookup and alloc
     */

    spin_unlock(&unnamed_dev_lock);

    /* Free page, because of race on allocation. */
    if (local)
        free_page((unsigned long)local);

This swaps the pointer in under the lock while still allocating outside
the non-sleeping lock.
Regards
Ingo Oeser
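
The pattern suggested above (allocate with no lock held, install under the lock, free the page if somebody else got there first) can be tried out in userspace. The following is a sketch only: a C11 atomic_flag stands in for the kernel spinlock, calloc for get_zeroed_page, and all names are hypothetical. Unlike the fragment above, it peeks at the pointer under the lock so that the model has no unsynchronised read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

static atomic_flag map_lock = ATOMIC_FLAG_INIT;    /* toy spinlock */
static void *shared_map;                           /* protected by map_lock */

static void take_lock(void) { while (atomic_flag_test_and_set(&map_lock)) ; }
static void drop_lock(void) { atomic_flag_clear(&map_lock); }

/* Returns 0 once shared_map is set, -1 if the allocation failed. */
static int ensure_map(size_t size)
{
    void *local = NULL;
    int need;

    take_lock();
    need = (shared_map == NULL);
    drop_lock();

    if (need) {
        local = calloc(1, size);    /* may block: no lock is held here */
        if (!local)
            return -1;
    }

    take_lock();
    if (!shared_map) {
        shared_map = local;    /* we won the race: publish our page */
        local = NULL;
    }
    drop_lock();

    free(local);    /* no-op unless another caller installed first */
    return 0;
}
```

A second call finds the map already installed and allocates nothing, which is the property that matters here.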
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: [autofs] multiple servers per automount
2003-10-15 10:28 ` Ingo Oeser
@ 2003-10-15 16:16 ` Mike Waychison
2003-10-23 13:37 ` Ian Kent
1 sibling, 0 replies; 23+ messages in thread
From: Mike Waychison @ 2003-10-15 16:16 UTC (permalink / raw)
To: Ingo Oeser; +Cc: Mike Waychison, linux-kernel, Ian Kent
[-- Attachment #1: Type: text/plain, Size: 717 bytes --]
Ingo Oeser wrote:
> On Wednesday 15 October 2003 01:12, Mike Waychison wrote:
>
>>The problem still remains in 2.6 that we limit the count to 256. I've
>>attached a quick patch that I've compiled and tested. I don't know if
>>there is a better way to handle dynamic assignment of minors (haven't
>>kept up to date in that realm), but if there is, then we should probably
>> use it instead.
>
>
>
> In your patch you allocate inside the spinlock.
>
> I would suggest to do sth. like the following:
>
Better yet.. we could move it into an __init section that will panic if
the allocation fails (this should be the desired behaviour..). This way
we don't even have to grab the lock either.
Mike Waychison
[-- Attachment #2: max_anon_2.patch --]
[-- Type: text/plain, Size: 1592 bytes --]
===== fs/namespace.c 1.49 vs edited =====
--- 1.49/fs/namespace.c Thu Jul 17 22:30:49 2003
+++ edited/fs/namespace.c Wed Oct 15 15:59:11 2003
@@ -23,6 +23,7 @@
#include <linux/mount.h>
#include <asm/uaccess.h>
+extern void __init super_init(void);
extern int __init init_rootfs(void);
extern int __init sysfs_init(void);
@@ -1154,6 +1155,7 @@
d++;
i--;
} while (i);
+ super_init();
sysfs_init();
init_rootfs();
init_mount_tree();
===== fs/super.c 1.108 vs edited =====
--- 1.108/fs/super.c Wed Oct 1 15:36:45 2003
+++ edited/fs/super.c Wed Oct 15 15:59:50 2003
@@ -24,6 +24,7 @@
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/smp_lock.h>
+#include <linux/init.h>
#include <linux/acct.h>
#include <linux/blkdev.h>
#include <linux/quotaops.h>
@@ -527,15 +528,22 @@
* Unnamed block devices are dummy devices used by virtual
* filesystems which don't use real block-devices. -- jrs
*/
-
-enum {Max_anon = 256};
-static unsigned long unnamed_dev_in_use[Max_anon/(8*sizeof(unsigned long))];
+enum {Max_anon = PAGE_SIZE * 8};
+static void *unnamed_dev_in_use;
static spinlock_t unnamed_dev_lock = SPIN_LOCK_UNLOCKED;/* protects the above */
+void __init super_init(void)
+{
+ unnamed_dev_in_use = (void *)get_zeroed_page(GFP_KERNEL);
+ if (!unnamed_dev_in_use)
+ panic("Could not allocate anonymous device map");
+}
+
int set_anon_super(struct super_block *s, void *data)
{
int dev;
spin_lock(&unnamed_dev_lock);
+
dev = find_first_zero_bit(unnamed_dev_in_use, Max_anon);
if (dev == Max_anon) {
spin_unlock(&unnamed_dev_lock);
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: [autofs] multiple servers per automount
2003-10-15 10:28 ` Ingo Oeser
2003-10-15 16:16 ` Mike Waychison
@ 2003-10-23 13:37 ` Ian Kent
2003-10-23 17:00 ` Mike Waychison
1 sibling, 1 reply; 23+ messages in thread
From: Ian Kent @ 2003-10-23 13:37 UTC (permalink / raw)
To: Ingo Oeser; +Cc: Mike Waychison, Kernel Mailing List
Please forgive my ignorance Ingo but ...
I suffer from race condition blindness. A terrible affliction when one is
trying to understand the subtleties of the kernel, but I'm trying.
While I am not questioning your suggestion, I have thought about the code
and fail to see the race you point out. Please help me along.
On Wed, 15 Oct 2003, Ingo Oeser wrote:
> On Wednesday 15 October 2003 01:12, Mike Waychison wrote:
> > The problem still remains in 2.6 that we limit the count to 256. I've
> > attached a quick patch that I've compiled and tested. I don't know if
> > there is a better way to handle dynamic assignment of minors (haven't
> > kept up to date in that realm), but if there is, then we should probably
> > use it instead.
>
>
> In your patch you allocate inside the spinlock.
Do you mean we don't want to sleep under the spin lock?
Would a GFP_ATOMIC make a difference to the analysis?
>
> I would suggest to do sth. like the following:
>
> void *local;
> if (!unamed_dev_inuse) {
> local = get_zeroed_page(GFP_KERNEL);
>
> if (!local)
> return -ENOMEM;
> }
>
> spinlock(&unamed_dev_lock);
> mb();
> if (!unamed_dev_inuse) {
> unamed_dev_inuse = local;
>
> /* Used globally, don't free now */
> local = NULL;
> }
>
> /*
> Do the lookup and alloc
> */
>
> spinunlock(&unamed_dev_lock);
>
> /* Free page, because of race on allocation. */
> if (local)
> free_page(local);
>
>
> Which will swap the pointers atomically and still alloc outside the
> non-sleeping locking.
As I said please give me a hint about your thinking here.
And the use of a memory barrier as well ... umm?
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: raven@themaw.net
v Web: http://themaw.net/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: [autofs] multiple servers per automount
2003-10-23 13:37 ` Ian Kent
@ 2003-10-23 17:00 ` Mike Waychison
2003-10-23 17:09 ` Tim Hockin
2003-10-24 0:47 ` Ian Kent
0 siblings, 2 replies; 23+ messages in thread
From: Mike Waychison @ 2003-10-23 17:00 UTC (permalink / raw)
To: Ian Kent; +Cc: Ingo Oeser, Kernel Mailing List
Ian Kent wrote:
>On Wed, 15 Oct 2003, Ingo Oeser wrote:
>
>
>>In your patch you allocate inside the spinlock.
>>
>>
>
>Do you mean we don't want to sleep under the spin lock?
>Would a GFP_ATOMIC make a difference to the analysis?
>
>
Yes, sleeping within a spinlock is bad practice because it may
eventually deadlock. Pretend that the lock is taken, the allocation call
is made, the mm system doesn't have any immediately free memory, and
through some flow of execution some pseudo-block device backed
filesystem needs to be mounted -> deadlock. I have no idea if this is
currently a likely scenario; however, not sleeping within a lock is 'The
Right Thing', and sleeping there should be avoided at all costs.
GFP_ATOMIC should be avoided in most circumstances, particularly in
environments where the code can be refactored to allow for the sleep.
An atomic allocation is less likely to find free memory and is thus more
likely to fail.
>>I would suggest to do sth. like the following:
>>
>>void *local;
>>if (!unamed_dev_inuse) {
>> local = get_zeroed_page(GFP_KERNEL);
>>
>> if (!local)
>> return -ENOMEM;
>>}
>>
>>spinlock(&unamed_dev_lock);
>>mb();
>>if (!unamed_dev_inuse) {
>> unamed_dev_inuse = local;
>>
>> /* Used globally, don't free now */
>> local = NULL;
>>}
>>
>>/*
>> Do the lookup and alloc
>> */
>>
>>spinunlock(&unamed_dev_lock);
>>
>>/* Free page, because of race on allocation. */
>>if (local)
>> free_page(local);
>>
>>
>>Which will swap the pointers atomically and still alloc outside the
>>non-sleeping locking.
>>
>>
>
>As I said please give me a hint about your thinking here.
>And the use of a memory barrier as well ... umm?
>
>
>
Ingo's patch simply moved the allocation outside the spinlock. See my
later patch about moving the allocation to an __init section, which is
probably the cleaner thing to do and doesn't require grabbing the page
and using it conditionally.
As for the mb(), I *thought* that a spinlock implied a memory barrier;
however, I think he put it there because it solves the age-old badness of
double-checked locking (search Google for good explanations of the badness).
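
The double-checked locking problem comes down to ordering: a reader that checks the pointer without holding the lock must not see the pointer before the data it points to. A minimal C11 sketch of that publish/consume pairing follows. The names are hypothetical, and kernel code would use its own barrier primitives rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct anon_map {
    int nbits;    /* stands in for the initialised contents */
};

static struct anon_map storage;
static _Atomic(struct anon_map *) published;    /* NULL until publish() */

/* Writer: initialise the object first, then publish the pointer. */
static void publish(void)
{
    storage.nbits = 32768;
    /* Release: the store to nbits is visible before the pointer is. */
    atomic_store_explicit(&published, &storage, memory_order_release);
}

/* Reader: the unlocked first check of a double-checked scheme. */
static int peek_nbits(void)
{
    /* Acquire pairs with the release store in publish(). */
    struct anon_map *m =
        atomic_load_explicit(&published, memory_order_acquire);

    return m ? m->nbits : -1;
}
```

Without the release/acquire pair (or an equivalent barrier) a reader on another CPU could in principle observe the non-NULL pointer and still read stale contents.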
--
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE: The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: [autofs] multiple servers per automount
2003-10-23 17:00 ` Mike Waychison
@ 2003-10-23 17:09 ` Tim Hockin
2003-10-24 0:47 ` Ian Kent
1 sibling, 0 replies; 23+ messages in thread
From: Tim Hockin @ 2003-10-23 17:09 UTC (permalink / raw)
To: Mike Waychison; +Cc: Ian Kent, Ingo Oeser, Kernel Mailing List
On Thu, Oct 23, 2003 at 01:00:57PM -0400, Mike Waychison wrote:
> >Would a GFP_ATOMIC make a difference to the analysis?
> Yes, sleeping within a spinlock is bad practice because it may
> eventually deadlock. Pretend that the lock is taken, the call to
> kmalloc is made, the mm system doesn't have any immidiately free memory
> and through some flow of execution requires that a some pseudo-block
> device backed filesystem needs to be mounted -> deadlock. I have no
> idea if this is currently a likely scenario, however not sleeping within
> a lock is 'The Right Thing' and should be avoided at all costs.
It's worse than that. It's forbidden. It's a VERY likely deadlock scenario
in the general sense, even if this particular case is not. If you need to
lock something and you need to sleep holding that lock, use a semaphore.
--
Notice that as computers are becoming easier and easier to use,
suddenly there's a big market for "Dummies" books. Cause and effect,
or merely an ironic juxtaposition of unrelated facts?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: [autofs] multiple servers per automount
2003-10-23 17:00 ` Mike Waychison
2003-10-23 17:09 ` Tim Hockin
@ 2003-10-24 0:47 ` Ian Kent
2003-10-24 1:42 ` Tim Hockin
1 sibling, 1 reply; 23+ messages in thread
From: Ian Kent @ 2003-10-24 0:47 UTC (permalink / raw)
To: Mike Waychison; +Cc: Ingo Oeser, Kernel Mailing List
Thanks for the description.
I thought it was bad to call a function that could block while
holding a lock. At least I was close to right this time.
I wasn't aware of the badness. I'll see what I can find.
On Thu, 23 Oct 2003, Mike Waychison wrote:
>
> Ingo's patch simply moved the allocation outside the spinlock.. See my
> later patch about moving the allocation to and __init section, which is
> probably the cleaner thing to do and doesn't require grabbing the page
> and using it conditionally.
>
Missed that when I returned to it. Found it now.
That is clearly a better way to do it.
Is there any chance this would be accepted into 2.6.0?
I think it's quite important, hopefully others do as well.
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: raven@themaw.net
v Web: http://themaw.net/
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] RE: [autofs] multiple servers per automount
2003-10-24 0:47 ` Ian Kent
@ 2003-10-24 1:42 ` Tim Hockin
0 siblings, 0 replies; 23+ messages in thread
From: Tim Hockin @ 2003-10-24 1:42 UTC (permalink / raw)
To: Ian Kent; +Cc: Mike Waychison, Ingo Oeser, Kernel Mailing List, torvalds
Recap: Mike Waychison posted a simple patch to make the Max_anon bit array
(used for NFS mounts etc.) use exactly one page.
On Fri, Oct 24, 2003 at 08:47:57AM +0800, Ian Kent wrote:
> I there any chance this would be accepted into 2.6.0?
>
> I think it's quite important, hopefully others do as well.
Wouldn't it be saner to have a sysctl to adjust that? From 1 page to
2^20/(PAGE_SIZE * CHAR_BIT) pages? Perhaps just in page-sized increments?
This would be a simple patch... But maybe it's not 'stabilization' for
2.6.0.
Maybe the simple version in 2.6.0 and the right version in 2.6.1?
Linus?
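
For a sense of the range such a sysctl would cover, here is the arithmetic as a small C sketch. It assumes the 2^20 above is the size of the minor space; the function names and the clamping behaviour are hypothetical, not an existing interface.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: at most 2^20 unnamed devices (a 20-bit minor space). */
enum { MINOR_SPACE = 1 << 20 };

/* Pages of bitmap needed to cover the whole minor space. */
static unsigned long max_map_pages(unsigned long page_size)
{
    return MINOR_SPACE / (page_size * CHAR_BIT);
}

/* Hypothetical sysctl-style clamp: keep a requested page count in range. */
static unsigned long clamp_map_pages(unsigned long want, unsigned long page_size)
{
    unsigned long max = max_map_pages(page_size);

    if (want < 1)
        return 1;
    return want > max ? max : want;
}
```

With 4096-byte pages the upper bound works out to 32 pages, so the tunable would range from 32768 devices (one page) up to the full 2^20.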
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: [autofs] multiple servers per automount
2003-10-14 15:52 ` [NFS] RE: [autofs] " Mike Waychison
(?)
@ 2003-10-15 7:22 ` Ian Kent
-1 siblings, 0 replies; 23+ messages in thread
From: Ian Kent @ 2003-10-15 7:22 UTC (permalink / raw)
To: Mike Waychison
Cc: Joseph V Moss, Ogden, Aaron A., autofs mailing list, nfs,
Kernel Mailing List
On Tue, 14 Oct 2003, Mike Waychison wrote:
> Ian Kent wrote:
>
> >On Tue, 14 Oct 2003, Joseph V Moss wrote:
> >
> >
> >
> >>The limit is 800 as others have stated. Although, it can be less than that
> >>if something else is already using up some of the reserved UDP ports.
> >>
> >>I wrote a patch long ago against a 2.2.x kernel to enable it to use
> >>multiple majors for NFS mounts (like the patches now common in several
> >>distros). I then ran into the 800 limit in the RPC layer. After changing
> >>the RPC layer to count up from 0, instead of down from 800, with no real
> >>upper limit, I was able to mount more than 2000 NFS filesystems simultaneously.
> >>I'm sure I could have done many thousand if I had had that many filesystems
> >>around to mount. Obviously, after 1024, it wasn't using reserved ports
> >>anymore, but it didn't seem to matter.
> >>
> >>Unfortunately, while the changes to NFS were easy to port to the 2.4 kernel,
> >>the RPC layer is different enough between 2.2 and 2.4 that it didn't work
> >>right off. Bumping it up to somewhere around 1024 should work, but using
> >>non-reserved ports didn't seem to work when I made a simple attempt.
> >>
> >>Of course, the real fix for the NFS layer is the expansion of the minor
> >>numbers that's already occurred in 2.6 and the RPC layer problems should
> >>be fixed by multiplexing multiple mounts on the same port.
> >>
> >>
> >>
> >>
> >
> >I don't see that expansion in 2.6 (test6). It looks to me like the
> >allocation is done in set_anon_super (in fs/super.c) and that looks like
> >it is restricted to 256. Please correct this for me. I can't see how there
> >is any change to the number of unnamed devices.
> >
> >
> >
>
> Here is the quick fix for this in RH 2.1AS kernels:
>
> http://www.kernelnewbies.org/kernels/rh21as/SOURCES/linux-2.4.9-moreunnamed.patch
>
> It makes unnamed block devices use majors 12, 14, 38, 39, as well as 0.
>
> I don't know if anyone is working out a better scheme for
> get_unnamed_dev in 2.6 yet. It does need to be done though. A simple
> patch for 2.6 would maybe see the unnamed_dev_in_use bitmap grow to
> PAGE_SIZE, automatically allowing for 32768 unnamed devices.
>
OK. Sounds like a good job for me to do (simple - maybe).
I'll spend a while looking for possible side effects.
Do you think that the possible NFS port allocation problems should hold up
this work or should it drive updates to NFS?
Comments from anyone about where to check and what to watch out for are
welcome.
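For anyone following along, the multiple-majors scheme quoted above can be modelled roughly as below. This is a toy sketch, not the kernel's get_unnamed_dev: the class name, the ordering of the majors, and the flag array standing in for the in-use bitmap are all assumptions for illustration.

```python
class UnnamedDevAllocator:
    """Toy first-free-slot allocator spread over several majors."""

    def __init__(self, majors=(0, 12, 14, 38, 39), minors_per_major=256):
        # Major order is an assumption; the message only lists which are used.
        self.majors = majors
        self.minors_per_major = minors_per_major
        # One flag per (major, minor) slot, standing in for the bitmap.
        self.in_use = bytearray(len(majors) * minors_per_major)

    def capacity(self):
        return len(self.in_use)

    def get(self):
        """Claim the lowest free slot; return (major, minor) or None if full."""
        for slot, used in enumerate(self.in_use):
            if not used:
                self.in_use[slot] = 1
                major = self.majors[slot // self.minors_per_major]
                return major, slot % self.minors_per_major
        return None

    def put(self, major, minor):
        """Release a previously claimed (major, minor) pair."""
        slot = self.majors.index(major) * self.minors_per_major + minor
        self.in_use[slot] = 0
```

With five majors of 256 minors each, capacity() comes to 1280, matching the figure given earlier in the thread; a single PAGE_SIZE bitmap would replace the major juggling with one larger flat range.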
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: raven@themaw.net
v Web: http://themaw.net/
-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
SourceForge.net hosts over 70,000 Open Source Projects.
See the people who have HELPED US provide better services:
Click here: http://sourceforge.net/supporters.php
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] Re: multiple servers per automount
@ 2003-10-10 17:02 Eric Werme USG
0 siblings, 0 replies; 23+ messages in thread
From: Eric Werme USG @ 2003-10-10 17:02 UTC (permalink / raw)
To: aogden; +Cc: autofs
Ogden, Aaron A. wrote:
>Aha! Wisdom from the heavens... :-)
>I assume that the RPC code is doing that to comply with reserved-port
>restrictions, ie. ports < 1024. Solaris needs to do the same thing
>(with nfssrv:nfs_portmon=1) so it seems that there would be an inherent
>limit of 1024 ports or mountpoints to work with. Actually less, since
>some ports will be in use. How does Sun get 260,000 active mounts if
>they can only use ports < 1024? Do we really need one port for each
>mountpoint?
I can't speak for Solaris, but on HP's Tru64 UNIX we use one TCP
connection for all traffic per mount, and we close connections that
have been idle for 5 minutes and when there are "too many" connections
to one server. For UDP, the NFS client uses a single port, in large part
due to problems with port number space exhaustion and the ripple effects
on other consumers of that space. (We don't throttle the number of outstanding
NFS requests, but we have a fixed limit on the read/write nfsiod helper
threads.) We generally ran into port number exhaustion on our mail server
which uses NFS (via automount) to access /home/user/.forward files. If one
production system went down, then the mail server would wind up with a
big flock of sendmails all trying to access the .forwards until the port
number space was chewed up, then automount couldn't issue new mounts
whereupon no mail got delivered to anyone.
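The connection policy described above (close after five idle minutes, and close when a server has "too many" connections) might be sketched like this. It is not Tru64 code: the class name, the per-server cap of 8, and the choice to close the least recently used connection first are assumptions, since the message does not say what "too many" means.

```python
IDLE_TIMEOUT = 5 * 60  # seconds, from the five-minute figure above


class ConnectionTable:
    """Tracks one connection per mount and decides which ones to close."""

    def __init__(self, idle_timeout=IDLE_TIMEOUT, max_per_server=8):
        self.idle_timeout = idle_timeout
        self.max_per_server = max_per_server  # hypothetical threshold
        self.last_used = {}  # mount -> (server, time of last activity)

    def touch(self, mount, server, now):
        """Record traffic on a mount's connection."""
        self.last_used[mount] = (server, now)

    def reap(self, now):
        """Return the mounts whose connections should be closed."""
        closed = []
        # Pass 1: anything idle for the full timeout.
        for mount, (server, stamp) in list(self.last_used.items()):
            if now - stamp >= self.idle_timeout:
                del self.last_used[mount]
                closed.append(mount)
        # Pass 2: trim servers that still hold too many connections,
        # oldest activity first (an assumed policy).
        by_server = {}
        for mount, (server, stamp) in self.last_used.items():
            by_server.setdefault(server, []).append((stamp, mount))
        for entries in by_server.values():
            entries.sort()
            excess = max(len(entries) - self.max_per_server, 0)
            for stamp, mount in entries[:excess]:
                del self.last_used[mount]
                closed.append(mount)
        return closed
```

A closed connection would simply be reopened on the next access to that mount, which is what keeps the port space from filling up with idle mounts.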
The NFS client gets its first look at a reply via a callback from UDP
code when it finds the port has been registered. The callback figures
out what thread is waiting for the XID, saves the reply address in a
data structure and issues the wakeup. When the code is processed for
real, it's NFS code that does the UDP checksum, thereby loading
the local cache with the data. The inspiration was pretty simple as I
had to do the same demultiplexing in the NFS over TCP client.
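The demultiplexing step described above can be sketched as a table keyed by XID. This is an illustration, not the Tru64 implementation: the class and method names are invented, and the only protocol detail relied on is that an RPC message begins with a 4-byte big-endian XID.

```python
import struct


class XidDemux:
    """Routes replies arriving on one shared port to the waiting caller."""

    def __init__(self):
        self.next_xid = 1
        self.pending = {}  # xid -> identifier of the waiting thread

    def register(self, waiter):
        """Allocate an XID for an outgoing request and remember who waits."""
        xid = self.next_xid
        self.next_xid = (self.next_xid + 1) & 0xFFFFFFFF
        self.pending[xid] = waiter
        return xid

    def on_datagram(self, data):
        """Callback for a received datagram.

        Returns (waiter, data) so the waiter can be woken, or None when the
        XID is unknown (for example a duplicate or late reply).
        """
        if len(data) < 4:
            return None
        (xid,) = struct.unpack("!I", data[:4])
        waiter = self.pending.pop(xid, None)
        if waiter is None:
            return None
        return waiter, data
```

Because every mount shares the one table, the number of mounts no longer determines how many ports are consumed.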
BTW, the rationale behind the one TCP connection per mount was to
conform to TCP's congestion control design, but limit the amount of
cross mount locking and code complexity. Typical NFS traffic
has multiple accesses on a mount at a time, so I figured it would be
a good compromise. I know Solaris has one connection per server, I don't
know what other vendors do.
-Ric Werme
--
Eric (Ric) Werme | werme@zk3.dec.com
Hewlett-Packard Co. | http://werme.8m.net/
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [NFS] Re: multiple servers per automount
@ 2003-10-10 15:43 Ogden, Aaron A.
2003-10-10 15:54 ` Mike Waychison
0 siblings, 1 reply; 23+ messages in thread
From: Ogden, Aaron A. @ 2003-10-10 15:43 UTC (permalink / raw)
To: Lever, Charles, Ian Kent, Mike Waychison; +Cc: autofs mailing list, nfs
Aha! Wisdom from the heavens... :-)
I assume that the RPC code is doing that to comply with reserved-port
restrictions, ie. ports < 1024. Solaris needs to do the same thing
(with nfssrv:nfs_portmon=1) so it seems that there would be an inherent
limit of 1024 ports or mountpoints to work with. Actually less, since
some ports will be in use. How does Sun get 260,000 active mounts if
they can only use ports < 1024? Do we really need one port for each
mountpoint?
Perhaps this has something to do with the fact that solaris autofs is
multithreaded (ie. one process) whereas linux autofs has many processes,
one for each mountpoint. Feel free to correct me if I'm wrong...
-A
-----Original Message-----
From: Lever, Charles [mailto:Charles.Lever@netapp.com]
Sent: Friday, October 10, 2003 10:10 AM
To: Ian Kent; Mike Waychison
Cc: Ogden, Aaron A.; autofs mailing list; nfs@lists.sourceforge.net
Subject: RE: [NFS] Re: [autofs] multiple servers per automount
the problem is likely the algorithm used to allocate
ports for the RPC transport sockets. it starts at
port 800 and goes down to zero.
> -----Original Message-----
> From: Ian Kent [mailto:raven@themaw.net]
> Sent: Thursday, October 09, 2003 6:09 PM
> To: Mike Waychison
> Cc: Ogden, Aaron A.; autofs mailing list; nfs@lists.sourceforge.net
> Subject: [NFS] Re: [autofs] multiple servers per automount
>
>
> On Thu, 9 Oct 2003, Mike Waychison wrote:
>
> > Ogden, Aaron A. wrote:
> >
> > >Ouch. As you may know, the limit is *much* lower in linux. Something
> > >that I've been struggling with recently...
> > >
> > >Under normal circumstances I would not be concerned with 'limitations'
> > >of a few hundred active NFS mounts, but such limitations certainly limit
> > >scalability for the extreme cases.
> > >
> > >
> >
> > The maximum number of plain pseudo-block device filesystems on a given
> > filesystem is limited to 256. (This includes proc, autofs, nfs..).
> >
> > This is because pseudo-block filesystems all use major 0, and each have
> > a different minor (thus the 256 limit).
> >
> > There are however patches floating around (look at SuSe's kernels, I'm
> > not sure about RH) that allow n majors to be used (default 5). This
> > gives you 1280 mounts, a big step up :)
> >
>
> But as Aaron and I know things go pear shaped at just shy of 800 mounts
> with RedHat kernels. They have the more-unnamed patch.
>
> So this would indicate that even if there is a device system that can
> increase the number of unnamed devices that subsystems like NFS cannot
> handle this many mounts.
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [NFS] Re: multiple servers per automount
2003-10-10 15:43 Ogden, Aaron A.
@ 2003-10-10 15:54 ` Mike Waychison
0 siblings, 0 replies; 23+ messages in thread
From: Mike Waychison @ 2003-10-10 15:54 UTC (permalink / raw)
To: Ogden, Aaron A.; +Cc: autofs mailing list, nfs, Lever, Charles, Ian Kent
Ogden, Aaron A. wrote:
>Aha! Wisdom from the heavens... :-)
>I assume that the RPC code is doing that to comply with reserved-port
>restrictions, ie. ports < 1024. Solaris needs to do the same thing
>(with nfssrv:nfs_portmon=1) so it seems that there would be an inherent
>limit of 1024 ports or mountpoints to work with. Actually less, since
>some ports will be in use. How does Sun get 260,000 active mounts if
>they can only use ports < 1024? Do we really need one port for each
>mountpoint?
>
>
Don't take my word for it, because I don't know any better.. But
Solaris may multiplex different NFS servers on the same udp port. They
may also have their tests done with TCP instead of udp, which solves
that problem elegantly.
>Perhaps this has something to do with the fact that solaris autofs is
>multithreaded (ie. one process) whereas linux autofs has many processes,
>one for each mountpoint. Feel free to correct me if I'm wrong...
>
>
Nah, this sounds a lot like an NFS issue. See Charles Lever's post.
Mike Waychison
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: Re: [autofs] multiple servers per automount
@ 2003-10-10 15:10 Lever, Charles
2003-10-13 3:05 ` [NFS] " Ian Kent
0 siblings, 1 reply; 23+ messages in thread
From: Lever, Charles @ 2003-10-10 15:10 UTC (permalink / raw)
To: Ian Kent, Mike Waychison; +Cc: Ogden, Aaron A., autofs mailing list, nfs
the problem is likely the algorithm used to allocate
ports for the RPC transport sockets. it starts at
port 800 and goes down to zero.
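To make the ceiling concrete, here is a small simulation of a downward port scan. It is a sketch of the behaviour as described, not the actual RPC code: the function name is made up, ports are modelled as a set rather than real sockets, and stopping at port 1 rather than 0 is an assumption.

```python
def allocate_reserved_port(in_use, start=800, floor=1):
    """Scan downward from `start` and claim the first port not in `in_use`.

    `in_use` is a set of port numbers already taken by this or any other
    consumer. Returns the claimed port, or None once the range is exhausted.
    """
    for port in range(start, floor - 1, -1):
        if port not in in_use:
            in_use.add(port)
            return port
    return None


if __name__ == "__main__":
    used = set()
    mounts = 0
    while allocate_reserved_port(used) is not None:
        mounts += 1
    print("transports before exhaustion:", mounts)
```

If each mount takes its own port, nothing is available after 800 of them, and fewer if other services already hold ports in that range.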
> -----Original Message-----
> From: Ian Kent [mailto:raven@themaw.net]
> Sent: Thursday, October 09, 2003 6:09 PM
> To: Mike Waychison
> Cc: Ogden, Aaron A.; autofs mailing list; nfs@lists.sourceforge.net
> Subject: [NFS] Re: [autofs] multiple servers per automount
>
>
> On Thu, 9 Oct 2003, Mike Waychison wrote:
>
> > Ogden, Aaron A. wrote:
> >
> > >Ouch. As you may know, the limit is *much* lower in linux. Something
> > >that I've been struggling with recently...
> > >
> > >Under normal circumstances I would not be concerned with 'limitations'
> > >of a few hundred active NFS mounts, but such limitations certainly limit
> > >scalability for the extreme cases.
> > >
> > >
> >
> > The maximum number of plain pseudo-block device filesystems on a given
> > filesystem is limited to 256. (This includes proc, autofs, nfs..).
> >
> > This is because pseudo-block filesystems all use major 0, and each have
> > a different minor (thus the 256 limit).
> >
> > There are however patches floating around (look at SuSe's kernels, I'm
> > not sure about RH) that allow n majors to be used (default 5). This
> > gives you 1280 mounts, a big step up :)
> >
>
> But as Aaron and I know things go pear shaped at just shy of 800 mounts
> with RedHat kernels. They have the more-unnamed patch.
>
> So this would indicate that even if there is a device system that can
> increase the number of unnamed devices that subsystems like NFS cannot
> handle this many mounts.
>
> --
>
> ,-._|\ Ian Kent
> / \ Perth, Western Australia
> *_.--._/ E-mail: raven@themaw.net
> v Web: http://themaw.net/
>
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [NFS] Re: multiple servers per automount
2003-10-10 15:10 Re: [autofs] " Lever, Charles
@ 2003-10-13 3:05 ` Ian Kent
0 siblings, 0 replies; 23+ messages in thread
From: Ian Kent @ 2003-10-13 3:05 UTC (permalink / raw)
To: Lever, Charles; +Cc: Ogden, Aaron A., autofs mailing list, Mike Waychison, nfs
On Fri, 10 Oct 2003, Lever, Charles wrote:
> the problem is likely the algorithm used to allocate
> ports for the RPC transport sockets. it starts at
> port 800 and goes down to zero.
Don't think so.
It appears that a single connection is maintained for nfs comms for both
udp and tcp.
However, if mount requests are fired in rapid succession then multiple
portmap connections are made. They end up in a TIME_WAIT state which is
probably causing the port allocation starvation.
This doesn't appear to happen under Solaris.
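One way to check the TIME_WAIT theory on a Linux client is to count sockets in that state whose local port is in the reserved range. The sketch below parses text in the /proc/net/tcp layout; the function name and the 1023 cutoff are my own choices, and it assumes the lingering portmap connections were bound to reserved ports.

```python
TIME_WAIT = 0x06  # state code used in the "st" column of /proc/net/tcp


def count_time_wait(proc_net_tcp_text, max_port=1023):
    """Count TIME_WAIT sockets whose local port is <= max_port."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # local_address looks like "0100007F:031F" (hex address : hex port)
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        state = int(fields[3], 16)
        if state == TIME_WAIT and local_port <= max_port:
            count += 1
    return count
```

On a live system the input would come from reading /proc/net/tcp; sampling the count while firing a burst of mounts should show whether the reserved range is filling up.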
>
> > -----Original Message-----
> > From: Ian Kent [mailto:raven@themaw.net]
> > Sent: Thursday, October 09, 2003 6:09 PM
> > To: Mike Waychison
> > Cc: Ogden, Aaron A.; autofs mailing list; nfs@lists.sourceforge.net
> > Subject: [NFS] Re: [autofs] multiple servers per automount
> >
> >
> > On Thu, 9 Oct 2003, Mike Waychison wrote:
> >
> > > Ogden, Aaron A. wrote:
> > >
> > > >Ouch. As you may know, the limit is *much* lower in linux. Something
> > > >that I've been struggling with recently...
> > > >
> > > >Under normal circumstances I would not be concerned with 'limitations'
> > > >of a few hundred active NFS mounts, but such limitations certainly limit
> > > >scalability for the extreme cases.
> > > >
> > > >
> > >
> > > The maximum number of plain pseudo-block device filesystems on a given
> > > filesystem is limited to 256. (This includes proc, autofs, nfs..).
> > >
> > > This is because pseudo-block filesystems all use major 0, and each have
> > > a different minor (thus the 256 limit).
> > >
> > > There are however patches floating around (look at SuSe's kernels, I'm
> > > not sure about RH) that allow n majors to be used (default 5). This
> > > gives you 1280 mounts, a big step up :)
> > >
> >
> > But as Aaron and I know things go pear shaped at just shy of 800 mounts
> > with RedHat kernels. They have the more-unnamed patch.
> >
> > So this would indicate that even if there is a device system that can
> > increase the number of unnamed devices that subsystems like NFS cannot
> > handle this many mounts.
> >
> > --
> >
> > ,-._|\ Ian Kent
> > / \ Perth, Western Australia
> > *_.--._/ E-mail: raven@themaw.net
> > v Web: http://themaw.net/
> >
> >
> >
> >
>
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: raven@themaw.net
v Web: http://themaw.net/
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread (newest message ~2003-10-24 1:52 UTC)
Thread overview: 23+ messages
2003-10-10 15:16 multiple servers per automount Ogden, Aaron A.
2003-10-13 3:23 ` [NFS] " Ian Kent
2003-10-14 7:05 ` Joseph V Moss
2003-10-14 13:37 ` RE: [autofs] " Ian Kent
2003-10-14 13:37 ` Ian Kent
2003-10-14 15:52 ` [NFS] " Mike Waychison
2003-10-14 15:52 ` [NFS] RE: [autofs] " Mike Waychison
2003-10-14 20:44 ` H. Peter Anvin
2003-10-14 23:12 ` Mike Waychison
2003-10-15 10:28 ` Ingo Oeser
2003-10-15 16:16 ` Mike Waychison
2003-10-23 13:37 ` Ian Kent
2003-10-23 17:00 ` Mike Waychison
2003-10-23 17:09 ` Tim Hockin
2003-10-24 0:47 ` Ian Kent
2003-10-24 1:42 ` Tim Hockin
2003-10-15 7:22 ` Ian Kent
2003-10-15 7:22 ` [NFS] " Ian Kent
2003-10-15 7:22 ` Ian Kent
-- strict thread matches above, loose matches on Subject: below --
2003-10-10 17:02 [NFS] " Eric Werme USG
2003-10-10 15:43 Ogden, Aaron A.
2003-10-10 15:54 ` Mike Waychison
2003-10-10 15:10 Re: [autofs] " Lever, Charles
2003-10-13 3:05 ` [NFS] " Ian Kent