* autofs linux 3.8.13 and "Too many levels of symbolic links"
@ 2014-01-29 16:02 Donald Buczek
2014-01-29 17:16 ` Leonardo Chiquitto
` (3 more replies)
0 siblings, 4 replies; 50+ messages in thread
From: Donald Buczek @ 2014-01-29 16:02 UTC (permalink / raw)
To: autofs
Hello,
we are trying to switch from amd to autofs. After successfully testing
and rolling it out to the first several machines, from time to time we
get directories stuck with "Too many levels of symbolic links" on a path
which should be automounted via an indirect map.
linux 3.8.13
autofs 5.0.8
As an example, here is data from a system where the path /scratch/tmp is
stuck:
http://www.molgen.mpg.de/~buczek/autofs-demo/
auto.master   # master map
auto.scratch  # indirect map for /scratch
autofs        # from /etc/defaults
typescript    # shows the problem and a bit of gdb dump of kernel structures
typescript.l  # same with line numbers for reference
gdb-macros    # macros used in the gdb session
From typescript.l, line 122ff, it is clear that /scratch/tmp is not
currently mounted. On the other hand, the gdb session finds the dentry
of /scratch/tmp, which has d_flags 0x70080 (lines 99, 120). This is
DCACHE_MANAGE_TRANSIT+DCACHE_NEED_AUTOMOUNT+DCACHE_MOUNTED+DCACHE_RCUACCESS,
with DCACHE_MOUNTED indicating that there should be something mounted
there(?). I think this state is faulty and necessarily leads to ELOOP
during path walk. Probably the situation is known by the gurus here?
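
The decomposition of 0x70080 can be checked mechanically. Below is a
minimal Python sketch. The bit values are assumptions, written down as
recalled from include/linux/dcache.h for kernels of this era, and the
helper name decode_d_flags is made up for illustration. Verify the
constants against the headers of the kernel that is actually running.

```python
# Assumed bit values (include/linux/dcache.h, 3.x era); verify before relying on them.
DCACHE_FLAGS = {
    0x00000080: "DCACHE_RCUACCESS",
    0x00010000: "DCACHE_MOUNTED",
    0x00020000: "DCACHE_NEED_AUTOMOUNT",
    0x00040000: "DCACHE_MANAGE_TRANSIT",
}


def decode_d_flags(d_flags):
    """Return (names of the known bits that are set, remaining unknown bits)."""
    names = [name for bit, name in sorted(DCACHE_FLAGS.items()) if d_flags & bit]
    known = sum(bit for bit in DCACHE_FLAGS if d_flags & bit)
    return names, d_flags & ~known


names, unknown = decode_d_flags(0x70080)
print("+".join(names), "unknown bits:", hex(unknown))
```

Any bit that is not in the table ends up in the second return value, so
a flag word from a different kernel version shows up as "unknown bits"
instead of being silently ignored.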
Is there any known bug which can lead to this situation? Any advice?
Thank you
Donald
--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
From: Leonardo Chiquitto @ 2014-01-29 17:16 UTC
To: Donald Buczek; +Cc: autofs

On Wed, Jan 29, 2014 at 2:02 PM, Donald Buczek <buczek@molgen.mpg.de> wrote:
> we are trying to switch from amd to autofs. After successfully testing and
> rolling it out to the first several machines, from time to time we get
> directories stuck with "Too many levels of symbolic links" on a path which
> should be automounted via an indirect map.
>
> linux 3.8.13
> autofs 5.0.8
>
> Is there any known bug which can lead to this situation? Any advice?

I've seen this case at least once while investigating another problem, but
unfortunately haven't had the time to dig into it.

In the case I've seen, it failed to trigger the mount (with the "Too many
levels of symbolic links" error) but worked a few minutes (seconds?) later
if I tried again without any other action in between. Seems to suggest some
race condition.

Can you reproduce the problem at will? Have you tried with a more recent
3.13+ kernel?

Thanks,
Leonardo

* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
From: Ian Kent @ 2014-01-30 0:19 UTC
To: Donald Buczek; +Cc: autofs

On Wed, 2014-01-29 at 17:02 +0100, Donald Buczek wrote:
> Hello,
>
> we are trying to switch from amd to autofs. After successfully testing
> and rolling it out to the first several machines, from time to time we
> get directories stuck with "Too many levels of symbolic links" on a path
> which should be automounted via an indirect map.
>
> linux 3.8.13

What is linux 3.8.13?
Oh right, an old kernel.
You need to reproduce this with a current kernel, 3.13.0 for example.
OTOH I have had a couple of recent reports of this, not including
Leonardo's, so any information is useful.

> autofs 5.0.8
>
> From typescript.l, line 122ff, it is clear that /scratch/tmp is not
> currently mounted. On the other hand, the gdb session finds the dentry
> of /scratch/tmp, which has d_flags 0x70080 (lines 99, 120). This is
> DCACHE_MANAGE_TRANSIT+DCACHE_NEED_AUTOMOUNT+DCACHE_MOUNTED+DCACHE_RCUACCESS,
> with DCACHE_MOUNTED indicating that there should be something mounted
> there(?). I think this state is faulty and necessarily leads to ELOOP
> during path walk. Probably the situation is known by the gurus here?

Well, at least I believe there's a bug to be found now.

From this output it does show a dentry that, according to the config,
shouldn't exist (but might still), is fully visible and claims it's
mounted (and definitely should be).

> Is there any known bug which can lead to this situation? Any advice?

Any more information you gather would be good.
How frequently does this occur?
Any idea of the activity leading to this?
A full debug log and a time the mount was discovered inoperable might
help.

* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
From: Donald Buczek @ 2014-01-30 10:28 UTC
To: Ian Kent, leonardo.lists; +Cc: autofs

Thanks, Leonardo and Ian.

In contrast to what Leonardo described, in our case the problem doesn't go
away after some time. If the daemon is restarted and able to unmount the
automount root (/scratch here), then everything looks fine after the
restart (however, the visible problem might just be (lazy?) unmounted
away?).

Sadly, I am not able to reproduce it at will. The problem occurs rarely: We
have about 12 active (and 24 most-of-the-time idle) machines running this
code since mid December and had about 8 of these issues. Of these, three
were on one workstation and two were on another one, so there is a
dependency on the hardware or usage pattern which is not yet identified. We
have very active machines which mount and unmount a lot more than these two
and didn't have an issue.

I know it's an old kernel. Sure, latest and greatest first is the
systematic way to go, but I thought I'd ask for ideas first, because the
kernel upgrade will take much time and work (legacy graphic cards,
netfilter functionality...) and surely will bring new bugs and problems as
well. It always did.

I hoped to get autofs running cleanly before that. There isn't so much
change in "git log -p v3.8.13..master fs/autofs4" anyway.

The logs I currently have are loglevel 1 only and there is nothing unusual
logged. I can change the loglevel to 9 on the currently hung system, but
there are no messages when the directory is accessed.

I forgot to dump the autofs_info and autofs_sb_info struct the last time.
Here they are just for completeness:
http://owww.molgen.mpg.de/~buczek/autofs-demo/typescript_2.l

Oh yes, another info: We've seen this on various automount maps with
various nfs-servers, so it doesn't depend on that.
And we rebuild the maps and kill -HUP the daemon a lot.

I plan to go the long way to 3.13 now and let you know if I have any new
information.

Thanks again

Donald

--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
From: Ian Kent @ 2014-01-30 14:30 UTC
To: Donald Buczek; +Cc: leonardo.lists@gmail.com, autofs

> On 30 Jan 2014, at 6:28 pm, Donald Buczek <buczek@molgen.mpg.de> wrote:
>
> Sadly, I am not able to reproduce it at will. The problem occurs rarely:
> We have about 12 active (and 24 most-of-the-time idle) machines running
> this code since mid December and had about 8 of these issues. Of these,
> three were on one workstation and two were on another one, so there is a
> dependency on the hardware or usage pattern which is not yet identified.
> We have very active machines which mount and unmount a lot more than
> these two and didn't have an issue.

And without any leads as to where to look I'm stuck at guessing where to
look. Understanding what leads to the symptom would be a big help.

> I know it's an old kernel. Sure, latest and greatest first is the
> systematic way to go, but I thought I'd ask for ideas first, because the
> kernel upgrade will take much time and work (legacy graphic cards,
> netfilter functionality...) and surely will bring new bugs and problems
> as well. It always did.

Yeah, I understand that.
The reason for asking if it can be reproduced on a current kernel is that
it could already be fixed. Don't think that is the case here though, so
continue to profile the problem.

I'm pretty sure I've looked at this before and have been left thinking,
what needs to happen to make this happen can't happen!

> I hoped to get autofs running cleanly before that. There isn't so much
> change in "git log -p v3.8.13..master fs/autofs4" anyway.

This probably isn't the autofs module, it's probably in the automount code
in the VFS.

Specific automount support has been added to the VFS around 2.6.32 so this
sort of problem could be in the NFS module (assuming you're seeing this
with autofs NFS auto mounts), the autofs module itself or somewhere in the
VFS (most likely the path walking code). Since this automount support was
added the path walking code has been continuously changed, pretty much
re-written, so there's a lot of ground to cover.

> Oh yes, another info: We've seen this on various automount maps with
> various nfs-servers, so it doesn't depend on that.
> And we rebuild the maps and kill -HUP the daemon a lot.

I wonder, mmm.

We need more information.
Check things for inconsistencies when it happens.
Things like /proc/mounts for duplicate mounts etc.

I don't think I've ever got a full autofs debug log from anyone who's seen
this. TBH I don't think it will give any clues but not having seen it is
just another variable I can't eliminate.
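
One way to follow the suggestion to look for duplicate mounts is to group
the /proc/mounts entries by mount point. The following is a minimal Python
sketch, not a finished tool: the function name duplicate_mountpoints is
hypothetical, and the only assumption about the input is the usual
whitespace-separated /proc/mounts layout (device, mount point, type, ...).

```python
from collections import defaultdict


def duplicate_mountpoints(mounts_text):
    """Map every mount point that occurs more than once to its (device, fstype) entries."""
    seen = defaultdict(list)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype = fields[:3]
        seen[mountpoint].append((device, fstype))
    return {mp: entries for mp, entries in seen.items() if len(entries) > 1}


def report(path="/proc/mounts"):
    """Print each mount point that is listed more than once in the given file."""
    with open(path) as f:
        for mp, entries in sorted(duplicate_mountpoints(f.read()).items()):
            print(mp, entries)
```

Calling report() on an affected machine prints only the paths that appear
more than once. A path listed twice is not automatically wrong (as far as
I know, a direct-map trigger with the real filesystem mounted on top of it
shows up twice), so the output only narrows down where to look.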

* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
From: Ian Kent @ 2014-01-31 1:36 UTC
To: Donald Buczek; +Cc: leonardo.lists@gmail.com, autofs

On Thu, 2014-01-30 at 22:30 +0800, Ian Kent wrote:
> > I hoped to get autofs running cleanly before that. There isn't so
> > much change in "git log -p v3.8.13..master fs/autofs4" anyway.
>
> This probably isn't the autofs module, it's probably in the automount
> code in the VFS.
>
> Specific automount support has been added to the VFS around 2.6.32 so
> this sort of problem could be in the NFS module (assuming you're seeing
> this with autofs NFS auto mounts), the autofs module itself or
> somewhere in the VFS (most likely the path walking code). Since this
> automount support was added the path walking code has been continuously
> changed, pretty much re-written, so there's a lot of ground to cover.

Actually I think the merge was at 2.6.38 not 32.

Ian

* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
From: Ian Kent @ 2014-01-31 3:31 UTC
To: Donald Buczek; +Cc: autofs

On Wed, 2014-01-29 at 17:02 +0100, Donald Buczek wrote:
> From typescript.l, line 122ff, it is clear that /scratch/tmp is not
> currently mounted. On the other hand, the gdb session finds the dentry
> of /scratch/tmp, which has d_flags 0x70080 (lines 99, 120). This is
> DCACHE_MANAGE_TRANSIT+DCACHE_NEED_AUTOMOUNT+DCACHE_MOUNTED+DCACHE_RCUACCESS,
> with DCACHE_MOUNTED indicating that there should be something mounted
> there(?). I think this state is faulty and necessarily leads to ELOOP
> during path walk. Probably the situation is known by the gurus here?

Yes, I can see how DCACHE_MOUNTED being set would lead to ELOOP in this
case. But, having been there before too, I couldn't see any way the
DCACHE_MOUNTED would not be cleared on umount. Also, DCACHE_MOUNTED is
only changed within the VFS and isn't changed very often. I can't see
how a code path that should lead to one of those changes doesn't go
there.

I'll have another look .....

Ian

* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
From: Ian Kent @ 2014-01-31 5:13 UTC
To: Donald Buczek; +Cc: autofs

On Fri, 2014-01-31 at 11:31 +0800, Ian Kent wrote:
> Yes, I can see how DCACHE_MOUNTED being set would lead to ELOOP in this
> case. But, having been there before too, I couldn't see any way the
> DCACHE_MOUNTED would not be cleared on umount. Also, DCACHE_MOUNTED is
> only changed within the VFS and isn't changed very often. I can't see
> how a code path that should lead to one of those changes doesn't go
> there.
>
> I'll have another look .....

Then the question becomes ....

Can a dentry be a mount point for more than one mount ....
Obviously not you say ... but what about clone(2) with CLONE_NEWNS?

If you still have that kernel you used to get the info above could you
check the mount (ie. struct mount not struct vfsmount) structures to see
if there is one with its mnt_mountpoint set to the dentry in question?

Ian

* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
From: Donald Buczek @ 2014-01-31 10:10 UTC
To: Ian Kent; +Cc: autofs

On 01/31/14 06:13, Ian Kent wrote:
> On Fri, 2014-01-31 at 11:31 +0800, Ian Kent wrote:
>> Yes, I can see how DCACHE_MOUNTED being set would lead to ELOOP in this
>> case. But, having been there before too, I couldn't see any way the
>> DCACHE_MOUNTED would not be cleared on umount. Also, DCACHE_MOUNTED is
>> only changed within the VFS and isn't changed very often. I can't see
>> how a code path that should lead to one of those changes doesn't go
>> there.
>>
>> I'll have another look .....
> Then the question becomes ....
>
> Can a dentry be a mount point for more than one mount ....
> Obviously not you say ... but what about clone(2) with CLONE_NEWNS?
>
> If you still have that kernel you used to get the info above could you
> check the mount (ie. struct mount not struct vfsmount) structures to see
> if there is one with its mnt_mountpoint set to the dentry in question?
>
> Ian

Hello, Ian,

you said, "how DCACHE_MOUNTED would not be cleared on umount", so you are
thinking about the unmount path.

I asked my users and in two cases (including the one described in this
thread) they think it happened the very first time they accessed the path
after boot. This suggests the problem might appear on the mount path.
Also, both were on workstations (single user!) and they both used a shell
("cd /failing/path" and "do_something > /failing/path/bla"), so collisions
(other threads accessing the same path at the same time) are unlikely.
We don't have any hints which would suggest that there might have been a
problem with the fileserver or network involved (which would imply a bug
in the "mount failure" path).

Oh... Just found another important piece of information:

> root:thehawk:~/# date
> Fri Jan 31 10:27:48 CET 2014
> root:thehawk:~/# uptime
> 10:27:51 up 8 days, 21:58, 3 users, load average: 0.37, 0.30, 0.26

The system was booted Jan 22, 12:00 something

> root:thehawk:~/# ls -al /scratch/
> total 2
> drwxr-xr-x 4 root system 0 Jan 27 13:37 .
> drwxr-xr-x 35 root system 888 Jan 20 10:28 ..
> drwxrwxrwt 16 root system 1136 Jan 29 14:39 local
> dr-xr-xr-x 2 root system 0 Jan 27 13:37 tmp
> root:thehawk:~/# ^C

The creation of the dentry was Jan 27, 13:37

And here's from the fileserver:

> root:moep:~/# fgrep thehawk /var/log/messages |tail -5
> 2014-01-09T14:09:35+01:00 moep rpc.mountd[646]: authenticated unmount
> request from thehawk.molgen.mpg.de:797 for
> /amd/moep/X/X2016/scratch/tolzmann (/amd/moep/X/X2016)
> 2014-01-13T15:43:22+01:00 moep rpc.mountd[646]: authenticated mount
> request from thehawk.molgen.mpg.de:922 for
> /amd/moep/X/X2016/scratch/tmp (/amd/moep/X/X2016)
> 2014-01-13T15:48:36+01:00 moep rpc.mountd[646]: authenticated unmount
> request from thehawk.molgen.mpg.de:660 for
> /amd/moep/X/X2016/scratch/tmp (/amd/moep/X/X2016)
> 2014-01-16T15:52:18+01:00 moep rpc.mountd[646]: authenticated mount
> request from thehawk.molgen.mpg.de:877 for
> /amd/moep/X/X2016/scratch/tmp (/amd/moep/X/X2016)
> 2014-01-16T15:57:30+01:00 moep rpc.mountd[646]: authenticated unmount
> request from thehawk.molgen.mpg.de:745 for
> /amd/moep/X/X2016/scratch/tmp (/amd/moep/X/X2016)

Last access seen on the fileserver (what would be mounted on /scratch/tmp
if everything went well) was days before that. So /scratch/tmp has never
been mounted.

I've checked the mounts as you asked
( http://owww.molgen.mpg.de/~buczek/autofs-demo/typescript_3.l ):
the dentry 0xffff88016a31c440 identified in the previous sessions (and
still there) is not in any mnt_mountpoint.

How can DCACHE_MOUNTED be set when there was no mount? The problem appears
rarely and (until now) randomly. Locking failure?

Okay, I've managed to get the nvidia bullshit drivers to work on linux
3.13.1, so I'm going to reboot this workstation (with the three failures)
to the latest kernel now with DEBUG set in the autofs4 directory. Perhaps
we shouldn't waste too much time analyzing code which is obsoleted already.

I'll surely tell you when the problem is seen again with 8.13.

Regards

Donald

--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
From: Donald Buczek @ 2014-01-31 10:29 UTC
To: Ian Kent; +Cc: autofs

On 01/31/14 11:10, Donald Buczek wrote:
> I'll surely tell you when the problem is seen again with 8.13.

3.13

--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

* "Too many levels of symbolic links"
From: Donald Buczek @ 2014-02-19 10:17 UTC
To: Ian Kent; +Cc: autofs

Hello,

On 01/31/14 11:29, Donald Buczek wrote:
> On 01/31/14 11:10, Donald Buczek wrote:
>
>> I'll surely tell you when the problem is seen again with 8.13.
>
> 3.13

now we've seen the "Too many levels of symbolic links" also on latest and
greatest kernel 3.13.1 (and, btw, 3.10.29 as well).

Debug script session in
http://www.molgen.mpg.de/~buczek/autofs-demo/typescript.l
System log (excerpt) in
http://www.molgen.mpg.de/~buczek/autofs-demo/messages.l (16MB!)

Again I've put line numbers into the files for easier reference...

What is different this time is that multiple directories in multiple autofs
trees exhibit the problem at the same time. Interestingly, the crafted
dentry nodes have a similar (but not identical) mtime:

44 root:pate:/home/web/buczek/autofs-demo/# ls -l /home /project /package /scratch
45 /home:
46 total 144
47 drwx--x--x 361 buczek users 32768 Feb 19 08:20 buczek <--- working
48 drwxr-x--x 161 haas abt_vin 40960 Feb 13 15:41 haas <--- working
49 drwxr-xrwt 198 root system 12288 Jan 27 20:26 web <--- working
50
51 /package:
52 total 0
53 dr-xr-xr-x 2 root system 0 Feb 17 11:03 usr <---- dead
54
55 /project:
56 total 0
57 dr-xr-xr-x 2 root system 0 Feb 17 11:06 gbrowse <--- dead
58 dr-xr-xr-x 2 root system 0 Feb 17 11:06 ngs_haas <--- dead
59 dr-xr-xr-x 2 root system 0 Feb 17 11:03 postgres <--- dead
60 dr-xr-xr-x 2 root system 0 Feb 17 11:03 splicenest <--- dead
61
62 /scratch:
63 total 4
64 drwxrwxrwt 24 root system 4096 Sep 18 14:58 ngsvin <----- working
65 dr-xr-xr-x 2 root system 0 Feb 17 11:03 ngsvin2 <----- dead

The dentry flags on one of these (/project/gbrowse) are 1523840:

1969 (gdb) print *(struct dentry *) 0xffff88007fd06c50
1970 $3 = {d_flags = 1523840, d_seq = {sequence = 4}, d_hash = {next = 0xffff880214025c98, pprev = 0xffffc9000013d570}, d_parent = 0xffff8800caa66810, d_name = {{{hash = 1876415966, len = 7}, hash_len = 31941187038}, name = 0xffff88007fd06c88 "gbrowse"}, d_inode = 0xffff8800961ad250, d_iname = "gbrowse", '\000' <repeats 24 times>, d_lockref = {{lock_count = 8610971969, {
1971 lock = {{rlock = {raw_lock = {{head_tail = 21037377, tickets = {head = 321, tail = 321}}}}}}, count = 2}}}, d_op = 0xffffffff81c45b40, d_sb = 0xffff880222a33800, d_time = 0, d_fsdata = 0xffff8800ca443b80, d_lru = {next = 0xffff88007fd06cd0, prev = 0xffff88007fd06cd0}, d_u = {d_child = {next = 0xffff88007fc9f420, prev = 0xffff8800caa668b0}, d_rcu = {
1972 next = 0xffff88007fc9f420, func = 0xffff8800caa668b0}}, d_subdirs = {next = 0xffff88007fd06cf0, prev = 0xffff88007fd06cf0}, d_alias = {next = 0x0, pprev = 0xffff8800961ad360}}

which is 0x174080, which is

DCACHE_RCUACCESS+
DCACHE_FSNOTIFY_PARENT_WATCHED+
DCACHE_MOUNTED+
DCACHE_NEED_AUTOMOUNT+
DCACHE_MANAGE_TRANSIT+
DCACHE_DIRECTORY_TYPE

so again we have DCACHE_MOUNTED but no mount (line 1753ff).

Looking just at "grep gbrowse messages.l", the interesting part might be
here:

1056818 2014-02-17T11:11:09.290343+01:00 pate kernel: [518408.741228] pid 15221: autofs4_expire_indirect: checking mountpoint ffff88007fd06c50 gbrowse
1056819 2014-02-17T11:11:09.290344+01:00 pate kernel: [518408.741229] pid 15221: autofs4_mount_busy: dentry ffff88007fd06c50 gbrowse
1056830 2014-02-17T11:11:09.290364+01:00 pate kernel: [518408.741242] pid 15221: autofs4_expire_indirect: checking mountpoint ffff88007fd06c50 gbrowse
1056831 2014-02-17T11:11:09.290371+01:00 pate kernel: [518408.741243] pid 15221: autofs4_mount_busy: dentry ffff88007fd06c50 gbrowse
1056885 2014-02-17T11:12:24.290190+01:00 pate kernel: [518483.724082] pid 15245: autofs4_expire_indirect: checking mountpoint ffff88007fd06c50 gbrowse
1056886 2014-02-17T11:12:24.290201+01:00 pate kernel: [518483.724084] pid 15245: autofs4_mount_busy: dentry ffff88007fd06c50 gbrowse
1056888 2014-02-17T11:12:24.290204+01:00 pate kernel: [518483.724088] pid 15245: autofs4_expire_indirect: returning ffff88007fd06c50 gbrowse
1056889 2014-02-17T11:12:24.290205+01:00 pate kernel: [518483.724093] pid 15245: autofs4_wait: new wait id = 0x000017b5, name = gbrowse, nfy=2
1056890 2014-02-17T11:12:24.290206+01:00 pate kernel: [518483.724095] pid 15245: autofs4_notify_daemon: wait id = 0x000017b5, name = gbrowse, type=4
1056891 2014-02-17T11:12:24+01:00 pate automount[531]: expiring path /project/gbrowse
1056892 2014-02-17T11:12:24.291190+01:00 pate kernel: [518483.724971] pid 15247: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056893 2014-02-17T11:12:24+01:00 pate automount[531]: expired /project/gbrowse
1056894 2014-02-17T11:12:24.311242+01:00 pate kernel: [518483.744358] pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056895 2014-02-17T11:12:24.311247+01:00 pate kernel: [518483.744360] pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056896 2014-02-17T11:12:24.311248+01:00 pate kernel: [518483.744364] pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056897 2014-02-17T11:12:24.311249+01:00 pate kernel: [518483.744365] pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056898 2014-02-17T11:12:24.311250+01:00 pate kernel: [518483.744366] pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse
1056899 2014-02-17T11:12:24.311251+01:00 pate kernel: [518483.744367] pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056900 2014-02-17T11:12:24.311252+01:00 pate kernel: [518483.744368] pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse
1056901 2014-02-17T11:12:24.311253+01:00 pate kernel: [518483.744369] pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056902 2014-02-17T11:12:24.311254+01:00 pate kernel: [518483.744370] pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse
1056903 2014-02-17T11:12:24.311255+01:00 pate kernel: [518483.744372] pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056904 2014-02-17T11:12:24.311255+01:00 pate kernel: [518483.744373] pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse
1056905 2014-02-17T11:12:24.311256+01:00 pate kernel: [518483.744374] pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056906 2014-02-17T11:12:24.311257+01:00 pate kernel: [518483.744375] pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse
1056907 2014-02-17T11:12:24.311258+01:00 pate kernel: [518483.744376] pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056908 2014-02-17T11:12:24.311259+01:00 pate kernel: [518483.744377] pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse
1056909 2014-02-17T11:12:24.311260+01:00 pate kernel: [518483.744378] pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse
1056910 2014-02-17T11:12:24.311261+01:00 pate kernel: [518483.744379] pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse

where the daemon dismounts /project/gbrowse (but it looks like dentry=ffff88007fd06c50 stays in place and has DCACHE_MOUNTED even a day later.
The log from the fileserver confirms the unmount at Feb 17 11:12:24 : root@genome:~# fgrep gbrowse /var/log/messages |grep pate|tail -6 Feb 17 10:53:42 genome mountd[3074]: authenticated mount request from pate.molgen.mpg.de:759 for /amd/genome/0/project/gbrowse (/amd/genome/0/project/gbrowse) Feb 17 10:59:54 genome mountd[3079]: authenticated unmount request from pate.molgen.mpg.de:1004 for /amd/genome/0/project/gbrowse (/amd/genome/0/project/gbrowse) Feb 17 10:59:58 genome mountd[3074]: authenticated mount request from pate.molgen.mpg.de:1005 for /amd/genome/0/project/gbrowse (/amd/genome/0/project/gbrowse) Feb 17 11:06:09 genome mountd[3079]: authenticated unmount request from pate.molgen.mpg.de:786 for /amd/genome/0/project/gbrowse (/amd/genome/0/project/gbrowse) Feb 17 11:06:14 genome mountd[3070]: authenticated mount request from pate.molgen.mpg.de:985 for /amd/genome/0/project/gbrowse (/amd/genome/0/project/gbrowse) Feb 17 11:12:24 genome mountd[3070]: authenticated unmount request from pate.molgen.mpg.de:1008 for /amd/genome/0/project/gbrowse (/amd/genome/0/project/gbrowse) The system log on the web server is only tail -n +8256 /var/log/messages |egrep "Linux version|autofs4|automount" > messages But of course I had a glimpse at the complete log around these times for other messages which might be related. Nothing... Other hints: * Until now we only had a single failure on kernel 3.13.1 (described here) and another single failure on kernel 3.10.29. I think the bug frequency is lower than it used to be with 3.8 * The two systems who have problems now belond to the group of systems which had problems on kernel 3.8. * This time it does looks like a problem during unmount. Previously we had some bad dentries on the very first access never received by the nfs server. 
* All systems are multicore of course root:pate:/home/web/buczek/autofs-demo/# gunzip -c /proc/config.gz |egrep -i 'preempt|smp' CONFIG_X86_64_SMP=y # CONFIG_PREEMPT_RCU is not set CONFIG_GENERIC_SMP_IDLE_THREAD=y CONFIG_PREEMPT_NOTIFIERS=y CONFIG_SMP=y # CONFIG_X86_VSMP is not set # CONFIG_MAXSMP is not set # CONFIG_PREEMPT_NONE is not set CONFIG_PREEMPT_VOLUNTARY=y # CONFIG_PREEMPT is not set CONFIG_PM_SLEEP_SMP=y CONFIG_SCSI_SAS_HOST_SMP=y So, who can make head and tail of it? Regards Donald -- Donald Buczek buczek@molgen.mpg.de Tel: +49 30 8413 1433 ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links" 2014-02-19 10:17 ` Donald Buczek @ 2014-02-19 10:21 ` Donald Buczek 2014-02-20 11:41 ` Ian Kent 1 sibling, 0 replies; 50+ messages in thread From: Donald Buczek @ 2014-02-19 10:21 UTC (permalink / raw) To: Ian Kent; +Cc: autofs On 02/19/14 11:17, Donald Buczek wrote: > > * The two systems who have problems now belond to the group of > systems which had problems on kernel 3.8. great typing. sorry. "The two systems which have problems now belong to the group of systems, which also had problems on kernel 3.8." -- Donald Buczek buczek@molgen.mpg.de Tel: +49 30 8413 1433 ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-02-19 10:17 ` Donald Buczek
2014-02-19 10:21 ` Donald Buczek
@ 2014-02-20 11:41 ` Ian Kent
2014-02-20 12:18 ` Ian Kent
1 sibling, 1 reply; 50+ messages in thread
From: Ian Kent @ 2014-02-20 11:41 UTC (permalink / raw)
To: Donald Buczek; +Cc: autofs, Alexander Viro
On Wed, 2014-02-19 at 11:17 +0100, Donald Buczek wrote:
> Hello,
>
>
> On 01/31/14 11:29, Donald Buczek wrote:
> > On 01/31/14 11:10, Donald Buczek wrote:
> >
> >> I'll surly tell you, when the problem is seen again with 8.13.
> >
> > 3.13
> >
> >
>
> now we've seen the "Too many levels of symbolic links" also on latest
> and greatest kernel 3.13.1 (and, btw, 3.10.29 as well)

Right, I'll have a look again, but I couldn't see how this could possibly
happen any of the other times I've looked at it.

It just doesn't seem possible, and I couldn't find any broken assignments
anywhere. Apart from a really odd compiler optimization or bug, I can't
see how it happens.

I could work around it, but I use d_mountpoint() quite a bit and there
would be much more overhead if I had to follow the mount to check whether
it's mounted. Not only that, I expect such a change would not be well
received, since we should find and fix the problem, .... s*!

>
> Debug script session in
> http://www.molgen.mpg.de/~buczek/autofs-demo/typescript.l
> System log (excerpt) in
> http://www.molgen.mpg.de/~buczek/autofs-demo/messages.l (16MB!)
>
> Again I've put line numbers into the files for easier reference...
>
> What is different this time is that multiple directories in multiple
> autofs trees exhibit the problem at the same time.
Interestingly, the > crafted dentry nodes have a similar (but not identical) mtime: > > 44 root:pate:/home/web/buczek/autofs-demo/# .]2;root@pate.ls -l /home /project /package /scratch > 45 /home: > 46 total 144 > 47 drwx--x--x 361 buczek users 32768 Feb 19 08:20 buczek <--- working > 48 drwxr-x--x 161 haas abt_vin 40960 Feb 13 15:41 haas <--- working > 49 drwxr-xrwt 198 root system 12288 Jan 27 20:26 web <--- working > 50 > 51 /package: > 52 total 0 > 53 dr-xr-xr-x 2 root system 0 Feb 17 11:03 usr <---- dead > 54 > 55 /project: > 56 total 0 > 57 dr-xr-xr-x 2 root system 0 Feb 17 11:06 gbrowse <--- dead > 58 dr-xr-xr-x 2 root system 0 Feb 17 11:06 ngs_haas <--- dead > 59 dr-xr-xr-x 2 root system 0 Feb 17 11:03 postgres <--- dead > 60 dr-xr-xr-x 2 root system 0 Feb 17 11:03 splicenest <--- dead > 61 > 62 /scratch: > 63 total 4 > 64 drwxrwxrwt 24 root system 4096 Sep 18 14:58 ngsvin <----- working > 65 dr-xr-xr-x 2 root system 0 Feb 17 11:03 ngsvin2 <----- dead > > dentry flags on one of theses "/project/gbrowse) are 1523840 : > > 1969 (gdb) print *(struct dentry *) 0xffff88007fd06c50 > 1970 $3 = {d_flags = 1523840, d_seq = {sequence = 4}, d_hash = {next = 0xffff880214025c98, pprev = 0xffffc9000013d570}, d_parent = 0xffff8800caa66810, d_name = {{{hash = 1876415966, len = 7}, hash_len = 31941187038}, name = 0xffff88007fd06c88 "gbrowse"}, d_inode = 0xffff8800961ad250, d_iname = "gbrowse", '\000' <repeats 24 times>, d_lockref = {{lock_count = 8610971969, { > 1971 lock = {{rlock = {raw_lock = {{head_tail = 21037377, tickets = {head = 321, tail = 321}}}}}}, count = 2}}}, d_op = 0xffffffff81c45b40, d_sb = 0xffff880222a33800, d_time = 0, d_fsdata = 0xffff8800ca443b80, d_lru = {next = 0xffff88007fd06cd0, prev = 0xffff88007fd06cd0}, d_u = {d_child = {next = 0xffff88007fc9f420, prev = 0xffff8800caa668b0}, d_rcu = { > 1972 next = 0xffff88007fc9f420, func = 0xffff8800caa668b0}}, d_subdirs = {next = 0xffff88007fd06cf0, prev = 0xffff88007fd06cf0}, d_alias = {next = 0x0, pprev = 
0xffff8800961ad360}} > > > which is 0x174080 which is > > DCACHE_RCUACCESS+ > DCACHE_FSNOTIFY_PARENT_WATCHED+ > DCACHE_MOUNTED+ > DCACHE_NEED_AUTOMOUNT+ > DCACHE_MANAGE_TRANSIT+ > DCACHE_DIRECTORY_TYPE > > so again we have DCACHE_MOUNTED but no mount ( line 1753ff ) > > looking just at "grep gbrowse messages.l" the interesting part might be > here: > > > 1056818 2014-02-17T11:11:09.290343+01:00 pate kernel: [518408.741228] > pid 15221: autofs4_expire_indirect: checking mountpoint ffff88007fd06c50 > gbrowse > 1056819 2014-02-17T11:11:09.290344+01:00 pate kernel: [518408.741229] > pid 15221: autofs4_mount_busy: dentry ffff88007fd06c50 gbrowse > 1056830 2014-02-17T11:11:09.290364+01:00 pate kernel: [518408.741242] > pid 15221: autofs4_expire_indirect: checking mountpoint ffff88007fd06c50 > gbrowse > 1056831 2014-02-17T11:11:09.290371+01:00 pate kernel: [518408.741243] > pid 15221: autofs4_mount_busy: dentry ffff88007fd06c50 gbrowse > 1056885 2014-02-17T11:12:24.290190+01:00 pate kernel: [518483.724082] > pid 15245: autofs4_expire_indirect: checking mountpoint ffff88007fd06c50 > gbrowse > 1056886 2014-02-17T11:12:24.290201+01:00 pate kernel: [518483.724084] > pid 15245: autofs4_mount_busy: dentry ffff88007fd06c50 gbrowse > 1056888 2014-02-17T11:12:24.290204+01:00 pate kernel: [518483.724088] > pid 15245: autofs4_expire_indirect: returning ffff88007fd06c50 gbrowse > 1056889 2014-02-17T11:12:24.290205+01:00 pate kernel: [518483.724093] > pid 15245: autofs4_wait: new wait id = 0x000017b5, name = gbrowse, nfy=2 > 1056890 2014-02-17T11:12:24.290206+01:00 pate kernel: [518483.724095] > pid 15245: autofs4_notify_daemon: wait id = 0x000017b5, name = gbrowse, > type=4 > 1056891 2014-02-17T11:12:24+01:00 pate automount[531]: expiring path > /project/gbrowse > 1056892 2014-02-17T11:12:24.291190+01:00 pate kernel: [518483.724971] > pid 15247: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056893 2014-02-17T11:12:24+01:00 pate automount[531]: expired > /project/gbrowse > 
1056894 2014-02-17T11:12:24.311242+01:00 pate kernel: [518483.744358] > pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056895 2014-02-17T11:12:24.311247+01:00 pate kernel: [518483.744360] > pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056896 2014-02-17T11:12:24.311248+01:00 pate kernel: [518483.744364] > pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056897 2014-02-17T11:12:24.311249+01:00 pate kernel: [518483.744365] > pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056898 2014-02-17T11:12:24.311250+01:00 pate kernel: [518483.744366] > pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse > 1056899 2014-02-17T11:12:24.311251+01:00 pate kernel: [518483.744367] > pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056900 2014-02-17T11:12:24.311252+01:00 pate kernel: [518483.744368] > pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse > 1056901 2014-02-17T11:12:24.311253+01:00 pate kernel: [518483.744369] > pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056902 2014-02-17T11:12:24.311254+01:00 pate kernel: [518483.744370] > pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse > 1056903 2014-02-17T11:12:24.311255+01:00 pate kernel: [518483.744372] > pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056904 2014-02-17T11:12:24.311255+01:00 pate kernel: [518483.744373] > pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse > 1056905 2014-02-17T11:12:24.311256+01:00 pate kernel: [518483.744374] > pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056906 2014-02-17T11:12:24.311257+01:00 pate kernel: [518483.744375] > pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse > 1056907 2014-02-17T11:12:24.311258+01:00 pate kernel: [518483.744376] > pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056908 2014-02-17T11:12:24.311259+01:00 pate kernel: [518483.744377] > pid 15246: 
autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse > 1056909 2014-02-17T11:12:24.311260+01:00 pate kernel: [518483.744378] > pid 15246: autofs4_d_manage: dentry=ffff88007fd06c50 gbrowse > 1056910 2014-02-17T11:12:24.311261+01:00 pate kernel: [518483.744379] > pid 15246: autofs4_d_automount: dentry=ffff88007fd06c50 gbrowse > > > where the daemon dismounts /project/gbrowse (but it looks like > dentry=ffff88007fd06c50 stays in place and has DCACHE_MOUNTED even a day > later. > > The log from the fileserver confirms the unmount at Feb 17 11:12:24 : > > root@genome:~# fgrep gbrowse /var/log/messages |grep pate|tail -6 > Feb 17 10:53:42 genome mountd[3074]: authenticated mount request from > pate.molgen.mpg.de:759 for /amd/genome/0/project/gbrowse > (/amd/genome/0/project/gbrowse) > Feb 17 10:59:54 genome mountd[3079]: authenticated unmount request from > pate.molgen.mpg.de:1004 for /amd/genome/0/project/gbrowse > (/amd/genome/0/project/gbrowse) > Feb 17 10:59:58 genome mountd[3074]: authenticated mount request from > pate.molgen.mpg.de:1005 for /amd/genome/0/project/gbrowse > (/amd/genome/0/project/gbrowse) > Feb 17 11:06:09 genome mountd[3079]: authenticated unmount request from > pate.molgen.mpg.de:786 for /amd/genome/0/project/gbrowse > (/amd/genome/0/project/gbrowse) > Feb 17 11:06:14 genome mountd[3070]: authenticated mount request from > pate.molgen.mpg.de:985 for /amd/genome/0/project/gbrowse > (/amd/genome/0/project/gbrowse) > Feb 17 11:12:24 genome mountd[3070]: authenticated unmount request from > pate.molgen.mpg.de:1008 for /amd/genome/0/project/gbrowse > (/amd/genome/0/project/gbrowse) > > > The system log on the web server is only > > tail -n +8256 /var/log/messages |egrep "Linux > version|autofs4|automount" > messages > > But of course I had a glimpse at the complete log around these times for > other messages which might be related. Nothing... 
> > > Other hints: > > * Until now we only had a single failure on kernel 3.13.1 (described > here) and another single failure on kernel 3.10.29. I think the bug > frequency is lower than it used to be with 3.8 > * The two systems who have problems now belond to the group of > systems which had problems on kernel 3.8. > * This time it does looks like a problem during unmount. Previously > we had some bad dentries on the very first access never received by the > nfs server. > * All systems are multicore of course > > root:pate:/home/web/buczek/autofs-demo/# gunzip -c /proc/config.gz > |egrep -i 'preempt|smp' > CONFIG_X86_64_SMP=y > # CONFIG_PREEMPT_RCU is not set > CONFIG_GENERIC_SMP_IDLE_THREAD=y > CONFIG_PREEMPT_NOTIFIERS=y > CONFIG_SMP=y > # CONFIG_X86_VSMP is not set > # CONFIG_MAXSMP is not set > # CONFIG_PREEMPT_NONE is not set > CONFIG_PREEMPT_VOLUNTARY=y > # CONFIG_PREEMPT is not set > CONFIG_PM_SLEEP_SMP=y > CONFIG_SCSI_SAS_HOST_SMP=y > > > > So, who can make head and tail of it? > > > Regards > Donald > ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-02-20 11:41 ` Ian Kent
@ 2014-02-20 12:18 ` Ian Kent
2014-02-20 15:57 ` Donald Buczek
0 siblings, 1 reply; 50+ messages in thread
From: Ian Kent @ 2014-02-20 12:18 UTC (permalink / raw)
To: Donald Buczek; +Cc: autofs, Alexander Viro
On Thu, 2014-02-20 at 19:41 +0800, Ian Kent wrote:
> >
> > 1969 (gdb) print *(struct dentry *) 0xffff88007fd06c50
> > 1970 $3 = {d_flags = 1523840, d_seq = {sequence = 4}, d_hash = {next = 0xffff880214025c98, pprev = 0xffffc9000013d570}, d_parent = 0xffff8800caa66810, d_name = {{{hash = 1876415966, len = 7}, hash_len = 31941187038}, name = 0xffff88007fd06c88 "gbrowse"}, d_inode = 0xffff8800961ad250, d_iname = "gbrowse", '\000' <repeats 24 times>, d_lockref = {{lock_count = 8610971969, {
> > 1971 lock = {{rlock = {raw_lock = {{head_tail = 21037377, tickets = {head = 321, tail = 321}}}}}}, count = 2}}}, d_op = 0xffffffff81c45b40, d_sb = 0xffff880222a33800, d_time = 0, d_fsdata = 0xffff8800ca443b80, d_lru = {next = 0xffff88007fd06cd0, prev = 0xffff88007fd06cd0}, d_u = {d_child = {next = 0xffff88007fc9f420, prev = 0xffff8800caa668b0}, d_rcu = {
> > 1972 next = 0xffff88007fc9f420, func = 0xffff8800caa668b0}}, d_subdirs = {next = 0xffff88007fd06cf0, prev = 0xffff88007fd06cf0}, d_alias = {next = 0x0, pprev = 0xffff8800961ad360}}

I wonder if there's a struct mount for which this dentry is
mnt_mountpoint and, if so, what the value of m_count is for the struct
mountpoint of that struct mount?

Oh boy ... this stuff has changed so much.

It does look like mount->mountpoint->m_count controls the setting and
clearing of DCACHE_MOUNTED.

If there is one and its m_count is non-zero, then there may be a
reference counting problem somewhere or some other weirdness. It would be
worth looking at the mount struct and its neighbors if you can find it.
> > > > > > which is 0x174080 which is > > > > DCACHE_RCUACCESS+ > > DCACHE_FSNOTIFY_PARENT_WATCHED+ > > DCACHE_MOUNTED+ > > DCACHE_NEED_AUTOMOUNT+ > > DCACHE_MANAGE_TRANSIT+ > > DCACHE_DIRECTORY_TYPE > > > > so again we have DCACHE_MOUNTED but no mount ( line 1753ff ) > > Ian ^ permalink raw reply [flat|nested] 50+ messages in thread
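The d_flags arithmetic quoted above (1523840 = 0x174080) can be checked mechanically. The following is only a minimal sketch: the numeric flag values are assumptions recalled from include/linux/dcache.h of 3.13-era kernels and should be verified against the tree actually in use, and decode_d_flags() is a hypothetical helper name, not anything from the kernel or from gdb.

```python
# Flag values are ASSUMPTIONS based on include/linux/dcache.h of 3.13-era
# kernels; check them against the kernel source you are debugging.
# DCACHE_DIRECTORY_TYPE is really one value of a multi-bit type field;
# treating it as a single bit here is a simplification.
DCACHE_FLAGS = {
    0x00000080: "DCACHE_RCUACCESS",
    0x00004000: "DCACHE_FSNOTIFY_PARENT_WATCHED",
    0x00010000: "DCACHE_MOUNTED",
    0x00020000: "DCACHE_NEED_AUTOMOUNT",
    0x00040000: "DCACHE_MANAGE_TRANSIT",
    0x00100000: "DCACHE_DIRECTORY_TYPE",
}


def decode_d_flags(d_flags):
    """Return (names of known flags that are set, leftover unknown bits)."""
    names = [name for bit, name in sorted(DCACHE_FLAGS.items()) if d_flags & bit]
    known = 0
    for bit in DCACHE_FLAGS:
        known |= bit
    return names, d_flags & ~known


if __name__ == "__main__":
    # 1523840 is the d_flags value from the gdb dump of the gbrowse dentry.
    names, leftover = decode_d_flags(1523840)
    print(hex(1523840), "=", "+".join(names), "leftover bits:", hex(leftover))
```

If the assumed values are right, the leftover is zero and DCACHE_MOUNTED is among the decoded names, which matches the list given in the message.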
* Re: "Too many levels of symbolic links" 2014-02-20 12:18 ` Ian Kent @ 2014-02-20 15:57 ` Donald Buczek 2014-02-21 1:42 ` Ian Kent 0 siblings, 1 reply; 50+ messages in thread From: Donald Buczek @ 2014-02-20 15:57 UTC (permalink / raw) To: Ian Kent; +Cc: autofs, Alexander Viro On 02/20/14 13:18, Ian Kent wrote: > On Thu, 2014-02-20 at 19:41 +0800, Ian Kent wrote: >>> 1969 (gdb) print *(struct dentry *) 0xffff88007fd06c50 >>> 1970 $3 = {d_flags = 1523840, d_seq = {sequence = 4}, d_hash = {next = 0xffff880214025c98, pprev = 0xffffc9000013d570}, d_parent = 0xffff8800caa66810, d_name = {{{hash = 1876415966, len = 7}, hash_len = 31941187038}, name = 0xffff88007fd06c88 "gbrowse"}, d_inode = 0xffff8800961ad250, d_iname = "gbrowse", '\000' <repeats 24 times>, d_lockref = {{lock_count = 8610971969, { >>> 1971 lock = {{rlock = {raw_lock = {{head_tail = 21037377, tickets = {head = 321, tail = 321}}}}}}, count = 2}}}, d_op = 0xffffffff81c45b40, d_sb = 0xffff880222a33800, d_time = 0, d_fsdata = 0xffff8800ca443b80, d_lru = {next = 0xffff88007fd06cd0, prev = 0xffff88007fd06cd0}, d_u = {d_child = {next = 0xffff88007fc9f420, prev = 0xffff8800caa668b0}, d_rcu = { >>> 1972 next = 0xffff88007fc9f420, func = 0xffff8800caa668b0}}, d_subdirs = {next = 0xffff88007fd06cf0, prev = 0xffff88007fd06cf0}, d_alias = {next = 0x0, pprev = 0xffff8800961ad360}} > I wonder if there's a struct mount for which this dentry is > mnt_mountpoint and if so what the value of m_count is for the struct > mountpoint of the struct mount? No there isn't. Already checked in http://owww.molgen.mpg.de/~buczek/autofs-demo/typescript.l lines 1795ff Regards Donald -- Donald Buczek buczek@molgen.mpg.de Tel: +49 30 8413 1433 ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links" 2014-02-20 15:57 ` Donald Buczek @ 2014-02-21 1:42 ` Ian Kent 2014-02-21 15:15 ` Donald Buczek 0 siblings, 1 reply; 50+ messages in thread From: Ian Kent @ 2014-02-21 1:42 UTC (permalink / raw) To: Donald Buczek; +Cc: autofs, Alexander Viro On Thu, 2014-02-20 at 16:57 +0100, Donald Buczek wrote: > On 02/20/14 13:18, Ian Kent wrote: > > On Thu, 2014-02-20 at 19:41 +0800, Ian Kent wrote: > >>> 1969 (gdb) print *(struct dentry *) 0xffff88007fd06c50 > >>> 1970 $3 = {d_flags = 1523840, d_seq = {sequence = 4}, d_hash = {next = 0xffff880214025c98, pprev = 0xffffc9000013d570}, d_parent = 0xffff8800caa66810, d_name = {{{hash = 1876415966, len = 7}, hash_len = 31941187038}, name = 0xffff88007fd06c88 "gbrowse"}, d_inode = 0xffff8800961ad250, d_iname = "gbrowse", '\000' <repeats 24 times>, d_lockref = {{lock_count = 8610971969, { > >>> 1971 lock = {{rlock = {raw_lock = {{head_tail = 21037377, tickets = {head = 321, tail = 321}}}}}}, count = 2}}}, d_op = 0xffffffff81c45b40, d_sb = 0xffff880222a33800, d_time = 0, d_fsdata = 0xffff8800ca443b80, d_lru = {next = 0xffff88007fd06cd0, prev = 0xffff88007fd06cd0}, d_u = {d_child = {next = 0xffff88007fc9f420, prev = 0xffff8800caa668b0}, d_rcu = { > >>> 1972 next = 0xffff88007fc9f420, func = 0xffff8800caa668b0}}, d_subdirs = {next = 0xffff88007fd06cf0, prev = 0xffff88007fd06cf0}, d_alias = {next = 0x0, pprev = 0xffff8800961ad360}} > > I wonder if there's a struct mount for which this dentry is > > mnt_mountpoint and if so what the value of m_count is for the struct > > mountpoint of the struct mount? > > No there isn't. Already checked in > http://owww.molgen.mpg.de/~buczek/autofs-demo/typescript.l lines 1795ff Oh right, I forgot about that, must have been the seemingly endless list of dentrys above it, ;) LOL, and if there was a mount that's unlinked we'd never find it, but yes, a ref counting problem would most likely leave it on the list. 
> > Regards > Donald > ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links" 2014-02-21 1:42 ` Ian Kent @ 2014-02-21 15:15 ` Donald Buczek 2014-02-28 12:12 ` Donald Buczek 0 siblings, 1 reply; 50+ messages in thread From: Donald Buczek @ 2014-02-21 15:15 UTC (permalink / raw) To: Ian Kent; +Cc: autofs, Alexander Viro Hello, I've taken your idea "Apart from a really odd compiler optimization or bug" and compiled with gcc 4.8.2 (instead of 4.7.3) Also I've put some additional DPRINTKs into the routines setting or clearing DCACHE_MOUNTED So lets wait for the next event ..... Regards Donald On 02/21/14 02:42, Ian Kent wrote: > On Thu, 2014-02-20 at 16:57 +0100, Donald Buczek wrote: >> On 02/20/14 13:18, Ian Kent wrote: >>> On Thu, 2014-02-20 at 19:41 +0800, Ian Kent wrote: >>>>> 1969 (gdb) print *(struct dentry *) 0xffff88007fd06c50 >>>>> 1970 $3 = {d_flags = 1523840, d_seq = {sequence = 4}, d_hash = {next = 0xffff880214025c98, pprev = 0xffffc9000013d570}, d_parent = 0xffff8800caa66810, d_name = {{{hash = 1876415966, len = 7}, hash_len = 31941187038}, name = 0xffff88007fd06c88 "gbrowse"}, d_inode = 0xffff8800961ad250, d_iname = "gbrowse", '\000' <repeats 24 times>, d_lockref = {{lock_count = 8610971969, { >>>>> 1971 lock = {{rlock = {raw_lock = {{head_tail = 21037377, tickets = {head = 321, tail = 321}}}}}}, count = 2}}}, d_op = 0xffffffff81c45b40, d_sb = 0xffff880222a33800, d_time = 0, d_fsdata = 0xffff8800ca443b80, d_lru = {next = 0xffff88007fd06cd0, prev = 0xffff88007fd06cd0}, d_u = {d_child = {next = 0xffff88007fc9f420, prev = 0xffff8800caa668b0}, d_rcu = { >>>>> 1972 next = 0xffff88007fc9f420, func = 0xffff8800caa668b0}}, d_subdirs = {next = 0xffff88007fd06cf0, prev = 0xffff88007fd06cf0}, d_alias = {next = 0x0, pprev = 0xffff8800961ad360}} >>> I wonder if there's a struct mount for which this dentry is >>> mnt_mountpoint and if so what the value of m_count is for the struct >>> mountpoint of the struct mount? >> No there isn't. 
Already checked in >> http://owww.molgen.mpg.de/~buczek/autofs-demo/typescript.l lines 1795ff > Oh right, I forgot about that, must have been the seemingly endless list > of dentrys above it, ;) > > LOL, and if there was a mount that's unlinked we'd never find it, but > yes, a ref counting problem would most likely leave it on the list. > >> Regards >> Donald >> > -- Donald Buczek buczek@molgen.mpg.de Tel: +49 30 8413 1433 ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links" 2014-02-21 15:15 ` Donald Buczek @ 2014-02-28 12:12 ` Donald Buczek 2014-02-28 13:29 ` Alexander Viro 0 siblings, 1 reply; 50+ messages in thread From: Donald Buczek @ 2014-02-28 12:12 UTC (permalink / raw) To: Ian Kent; +Cc: autofs, Alexander Viro Hello, bug hit again with new gcc. (one of my added) debug is > *** namespace.c 2014/02/21 14:41:22 1.1 > --- namespace.c 2014/02/21 14:49:25 > *************** > *** 665,674 **** > --- 665,677 ---- > > static void put_mountpoint(struct mountpoint *mp) > { > + DPRINTK("mp=%p"); > + > if (!--mp->m_count) { > struct dentry *dentry = mp->m_dentry; > spin_lock(&dentry->d_lock); > dentry->d_flags &= ~DCACHE_MOUNTED; > + DPRINTK("cleared mounted on dentry=%p %.*s",dentry, > dentry->d_name.len, dentry->d_name.name); > spin_unlock(&dentry->d_lock); > list_del(&mp->m_hash); > kfree(mp); > > Here is an extract from the logfile from a successful dismount of /package/sequencer: > 939 2014-02-24T07:54:30.278237+01:00 kasslerbraten kernel: [ > 695.352155] pid 1425: autofs4_expire_indirect: checking mountpoint > ffff8800c2433c90 sequencer > 940 2014-02-24T07:54:30.278238+01:00 kasslerbraten kernel: [ > 695.352157] pid 1425: autofs4_mount_busy: dentry ffff8800c2433c90 > sequencer > 941 2014-02-24T07:54:30.278239+01:00 kasslerbraten kernel: [ > 695.352159] pid 1425: autofs4_mount_busy: returning = 0 > 942 2014-02-24T07:54:30.278240+01:00 kasslerbraten kernel: [ > 695.352160] pid 1425: autofs4_expire_indirect: returning > ffff8800c2433c90 sequencer > 943 2014-02-24T07:54:30.278241+01:00 kasslerbraten kernel: [ > 695.352161] pid 1425: autofs4_wait: new wait id = 0x0000000a, name = > sequencer, nfy=2 > 944 2014-02-24T07:54:30.278242+01:00 kasslerbraten kernel: [ > 695.352162] pid 1425: autofs4_notify_daemon: wait id = 0x0000000a, > name = sequencer, type=4 > > 969 2014-02-24T07:54:30.312515+01:00 kasslerbraten kernel: [ > 695.386185] pid 1433: put_mountpoint: mp=ffff8800c9e215d0 > 970 
2014-02-24T07:54:30.312527+01:00 kasslerbraten kernel: [ > 695.386188] pid 1433: put_mountpoint: cleared mounted on > dentry=ffff8800c2433c90 sequencer > > 971 2014-02-24T07:54:30+01:00 kasslerbraten automount[626]: expired > /package/sequencer > > 972 2014-02-24T07:54:30.324519+01:00 kasslerbraten kernel: [ > 695.398095] pid 1428: autofs4_dir_open: file=ffff880126f43ec0 > dentry=ffff8800c2433c90 sequencer > 973 2014-02-24T07:54:30.324531+01:00 kasslerbraten kernel: [ > 695.398105] pid 1428: autofs4_dentry_release: releasing ffff8800ad2d4a50 > 974 2014-02-24T07:54:30.324534+01:00 kasslerbraten kernel: [ > 695.398109] pid 1428: autofs4_dir_rmdir: dentry ffff8800c2433c90, > removing sequencer > > 975 2014-02-24T07:54:30.324536+01:00 kasslerbraten kernel: [ > 695.398157] pid 1425: autofs4_dentry_release: releasing ffff8800c2433c90 Here is an extract from the logfile from a failed dismount of /package/sequencer: > 91779 2014-02-26T14:13:16.157522+01:00 kasslerbraten kernel: > [196176.289030] pid 8739: autofs4_expire_indirect: checking mountpoint > ffff880046a45810 sequencer > 91780 2014-02-26T14:13:16.157523+01:00 kasslerbraten kernel: > [196176.289033] pid 8739: autofs4_mount_busy: dentry ffff880046a45810 > sequencer > 91781 2014-02-26T14:13:16.157524+01:00 kasslerbraten kernel: > [196176.289034] pid 8739: autofs4_mount_busy: returning = 0 > 91782 2014-02-26T14:13:16.157525+01:00 kasslerbraten kernel: > [196176.289036] pid 8739: autofs4_expire_indirect: returning > ffff880046a45810 sequencer > 91783 2014-02-26T14:13:16.157526+01:00 kasslerbraten kernel: > [196176.289039] pid 8739: autofs4_wait: new wait id = 0x00000085, name > = sequencer, nfy=2 > 91784 2014-02-26T14:13:16.157527+01:00 kasslerbraten kernel: > [196176.289040] pid 8739: autofs4_notify_daemon: wait id = 0x00000085, > name = sequencer, type=4 > > 91785 2014-02-26T14:13:16.178496+01:00 kasslerbraten kernel: > [196176.310371] pid 8742: put_mountpoint: mp=ffff8800c9e215d0 > > 91786 2014-02-26T14:13:16+01:00 
kasslerbraten automount[626]: expired
> /package/sequencer
>
> 91787 2014-02-26T14:13:16.183536+01:00 kasslerbraten kernel:
> [196176.314881] pid 8740: autofs4_d_automount: dentry=ffff880046a45810
> sequencer
> 91788 2014-02-26T14:13:16.183542+01:00 kasslerbraten kernel:
> [196176.314883] pid 8740: autofs4_d_automount: dentry=ffff880046a45810
> sequencer
> 91789 2014-02-26T14:13:16.183544+01:00 kasslerbraten kernel:
> [196176.314884] pid 8740: autofs4_d_automount: dentry=ffff880046a45810
> sequencer
> 91790 2014-02-26T14:13:16.183545+01:00 kasslerbraten kernel:
> [196176.314885] pid 8740: autofs4_d_automount: dentry=ffff880046a45810
> sequencer

( Full file from boot to shutdown with egrep
'automount|autofs|d_set_mounted|put_mountpoint' in
http://owww.molgen.mpg.de/~buczek/autofs-demo/kasslerbraten.2014-02-26.l )

Obviously, "cleared mounted on dentry" is missing.

It looks like we enter put_mountpoint() but don't get to
dentry->d_flags &= ~DCACHE_MOUNTED;

mp->m_count is not zero probably.

What does it mean? The mount is still locked but not in the mount hash?

Alas, the user has rebooted the system, so I cannot look at the
mountpoint struct in memory.

I don't understand the VFS model well enough. Anyone? Ideas?

Regards
Donald

On 02/21/14 16:15, Donald Buczek wrote:
> Hello,
>
> I've taken your idea "Apart from a really odd compiler optimization or
> bug" and compiled with gcc 4.8.2 (instead of 4.7.3)
>
> Also I've put some additional DPRINTKs into the routines setting or
> clearing DCACHE_MOUNTED
>
> So lets wait for the next event .....
> > Regards > Donald > > > > On 02/21/14 02:42, Ian Kent wrote: >> On Thu, 2014-02-20 at 16:57 +0100, Donald Buczek wrote: >>> On 02/20/14 13:18, Ian Kent wrote: >>>> On Thu, 2014-02-20 at 19:41 +0800, Ian Kent wrote: >>>>>> 1969 (gdb) print *(struct dentry *) 0xffff88007fd06c50 >>>>>> 1970 $3 = {d_flags = 1523840, d_seq = {sequence = 4}, d_hash = >>>>>> {next = 0xffff880214025c98, pprev = 0xffffc9000013d570}, d_parent >>>>>> = 0xffff8800caa66810, d_name = {{{hash = 1876415966, len = 7}, >>>>>> hash_len = 31941187038}, name = 0xffff88007fd06c88 "gbrowse"}, >>>>>> d_inode = 0xffff8800961ad250, d_iname = "gbrowse", '\000' >>>>>> <repeats 24 times>, d_lockref = {{lock_count = 8610971969, { >>>>>> 1971 lock = {{rlock = {raw_lock = {{head_tail = 21037377, >>>>>> tickets = {head = 321, tail = 321}}}}}}, count = 2}}}, d_op = >>>>>> 0xffffffff81c45b40, d_sb = 0xffff880222a33800, d_time = 0, >>>>>> d_fsdata = 0xffff8800ca443b80, d_lru = {next = >>>>>> 0xffff88007fd06cd0, prev = 0xffff88007fd06cd0}, d_u = {d_child = >>>>>> {next = 0xffff88007fc9f420, prev = 0xffff8800caa668b0}, d_rcu = { >>>>>> 1972 next = 0xffff88007fc9f420, func = >>>>>> 0xffff8800caa668b0}}, d_subdirs = {next = 0xffff88007fd06cf0, >>>>>> prev = 0xffff88007fd06cf0}, d_alias = {next = 0x0, pprev = >>>>>> 0xffff8800961ad360}} >>>> I wonder if there's a struct mount for which this dentry is >>>> mnt_mountpoint and if so what the value of m_count is for the struct >>>> mountpoint of the struct mount? >>> No there isn't. Already checked in >>> http://owww.molgen.mpg.de/~buczek/autofs-demo/typescript.l lines 1795ff >> Oh right, I forgot about that, must have been the seemingly endless list >> of dentrys above it, ;) >> >> LOL, and if there was a mount that's unlinked we'd never find it, but >> yes, a ref counting problem would most likely leave it on the list. >> >>> Regards >>> Donald >>> >> > > -- Donald Buczek buczek@molgen.mpg.de Tel: +49 30 8413 1433 ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-02-28 12:12 ` Donald Buczek
@ 2014-02-28 13:29 ` Alexander Viro
2014-02-28 20:35 ` Donald Buczek
2014-03-02 2:22 ` Ian Kent
0 siblings, 2 replies; 50+ messages in thread
From: Alexander Viro @ 2014-02-28 13:29 UTC (permalink / raw)
To: Donald Buczek; +Cc: Ian Kent, autofs, Alexander Viro

On Fri, Feb 28, 2014 at 01:12:58PM +0100, Donald Buczek wrote:

> Obviously, "cleared mounted on dentry" is missing.
>
> It looks like we enter put_mountpoint() but don't get to
> dentry->d_flags &= ~DCACHE_MOUNTED;
>
> mp->m_count is not zero probably.
>
> What does it mean? The mount is still locked but not in the mount hash?

No, it means that something else is mounted on the same dentry (in another
part of mount tree, obviously).

If you mount the same fs on two different mountpoints, e.g.
        mount /dev/sda1 /mnt
        mount /dev/sda1 /tmp/foo
you will have the same dentries seen in two places. Now,
        mount /dev/sdb11 /mnt/a
        mount /dev/sdc5 /tmp/foo/a

and you've got two different filesystems mounted on two different places
(/mnt/a and /tmp/foo/a). These two places have different vfsmounts,
but the same dentry. struct mountpoint is associated with dentry, so
it's also the same for both. And it serves as a mountpoint for two
vfsmounts - one for fs from sdb11, another for fs from sdc5.

Now umount /mnt/a; one of those two vfsmounts is gone now. struct mountpoint
survives, of course, and dentry is *still* a mountpoint. sdc5 is still
mounted on /tmp/foo/a, after all...

^ permalink raw reply [flat|nested] 50+ messages in thread
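The reference counting described in this message can be sketched in userspace C. This is only a toy model under stated assumptions: the toy_* names are hypothetical, the hash-table lookup is replaced by a pointer stored in the dentry, and there is no locking. It is meant to show why the "mounted" flag on a dentry survives the first umount when two vfsmounts share that dentry, not to reproduce fs/namespace.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy userspace model; all toy_* names are hypothetical, not kernel API. */
struct toy_mountpoint;

struct toy_dentry {
    bool mounted;               /* stands in for DCACHE_MOUNTED in d_flags */
    struct toy_mountpoint *mp;  /* stands in for the mountpoint hash lookup */
};

struct toy_mountpoint {
    struct toy_dentry *m_dentry;
    int m_count;                /* number of mounts sitting on m_dentry */
};

/* Take a reference: reuse the existing mountpoint or create a new one. */
static struct toy_mountpoint *toy_get_mountpoint(struct toy_dentry *d)
{
    struct toy_mountpoint *mp = d->mp;

    if (mp) {
        mp->m_count++;
        return mp;
    }
    mp = malloc(sizeof(*mp));
    if (!mp)
        return NULL;
    mp->m_dentry = d;
    mp->m_count = 1;
    d->mp = mp;
    d->mounted = true;
    return mp;
}

/* Drop a reference: the flag is cleared only when the last mount goes away. */
static void toy_put_mountpoint(struct toy_mountpoint *mp)
{
    assert(mp && mp->m_count > 0);
    if (!--mp->m_count) {
        mp->m_dentry->mounted = false;
        mp->m_dentry->mp = NULL;
        free(mp);
    }
}
```

In this model, taking two references on the same dentry corresponds to the two mounts on /mnt/a and /tmp/foo/a. Dropping one leaves the flag set; only the second drop clears it, which matches the explanation that "cleared mounted on dentry" is not expected after the first umount.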
* Re: "Too many levels of symbolic links"
2014-02-28 13:29 ` Alexander Viro
@ 2014-02-28 20:35 ` Donald Buczek
2014-03-01 21:56 ` Donald Buczek
2014-03-02 2:22 ` Ian Kent
1 sibling, 1 reply; 50+ messages in thread
From: Donald Buczek @ 2014-02-28 20:35 UTC (permalink / raw)
To: Alexander Viro; +Cc: Ian Kent, autofs

[-- Attachment #1: Type: text/plain, Size: 4132 bytes --]

On 28.02.2014 14:29, Alexander Viro wrote:
> On Fri, Feb 28, 2014 at 01:12:58PM +0100, Donald Buczek wrote:
>
>> Obviously, "cleared mounted on dentry" is missing.
>>
>> It looks like we enter put_mountpoint() but don't get to
>> dentry->d_flags &= ~DCACHE_MOUNTED;
>>
>> mp->m_count is not zero probably.
>>
>> What does it mean? The mount is still locked but not in the mount hash?
> No, it means that something else is mounted on the same dentry (in another
> part of mount tree, obviously).
>
> If you mount the same fs on two different mountpoints, e.g.
>         mount /dev/sda1 /mnt
>         mount /dev/sda1 /tmp/foo
> you will have the same dentries seen in two places. Now,
>         mount /dev/sdb11 /mnt/a
>         mount /dev/sdc5 /tmp/foo/a
>
> and you've got two different filesystems mounted on two different places
> (/mnt/a and /tmp/foo/a). These two places have different vfsmounts,
> but the same dentry. struct mountpoint is associated with dentry, so
> it's also the same for both. And it serves as a mountpoint for two
> vfsmounts - one for fs from sdb11, another for fs from sdc5.
>
> Now umount /mnt/a; one of those two vfsmounts is gone now. struct mountpoint
> survives, of course, and dentry is *still* a mountpoint. sdc5 is still
> mounted on /tmp/foo/a, after all...

Thanks. So I guess the idea of "struct mountpoint" is to keep the
dentries smaller by not embedding the mount count in each one, since
99.9% of them don't need it?

OMG, I've just found this in the log:

> 91286 2014-02-26T14:09:56.830515+01:00 kasslerbraten kernel:
> [195977.007799] pid 8644: d_set_mounted: dentry=ffff88004690c710 root
> 91287 2014-02-26T14:09:56.830527+01:00 kasslerbraten kernel:
> [195977.007802] pid 8644: d_set_mounted: set mounted on
> dentry=ffff88004690c710 root
> 91288 2014-02-26T14:09:56.830529+01:00 kasslerbraten kernel:
> [195977.007873] pid 8644: put_mountpoint: mp=ffff8800b8aa1a38
> 91289 2014-02-26T14:09:56.830530+01:00 kasslerbraten kernel:
> [195977.007877] pid 8644: d_set_mounted: dentry=ffff8800ca45e810 tmp
> 91290 2014-02-26T14:09:56.830535+01:00 kasslerbraten kernel:
> [195977.007878] pid 8644: d_set_mounted: set mounted on
> dentry=ffff8800ca45e810 tmp
> 91291 2014-02-26T14:09:56.830536+01:00 kasslerbraten kernel:
> [195977.007881] pid 8644: put_mountpoint: mp=ffff8800b8aa1a38
> 91292 2014-02-26T14:09:56.830537+01:00 kasslerbraten kernel:
> [195977.007900] pid 8644: d_set_mounted: dentry=ffff880046960450
> old-root-mjn70Q
> 91293 2014-02-26T14:09:56.830538+01:00 kasslerbraten kernel:
> [195977.007901] pid 8644: d_set_mounted: set mounted on
> dentry=ffff880046960450 old-root-mjn70Q
> 91294 2014-02-26T14:09:56.830539+01:00 kasslerbraten kernel:
> [195977.007903] pid 8644: put_mountpoint: mp=ffff8800a5f1dbd0
> 91295 2014-02-26T14:09:56.830540+01:00 kasslerbraten kernel:
> [195977.007904] pid 8644: put_mountpoint: cleared mounted on
> dentry=ffff88004690c710 root
> 91296 2014-02-26T14:09:56.830541+01:00 kasslerbraten kernel:
> [195977.007905] pid 8644: put_mountpoint: mp=ffff8800a5f1da90
> 91297 2014-02-26T14:09:56.830541+01:00 kasslerbraten kernel:
> [195977.007954] pid 8644: put_mountpoint: mp=0000014f00490049
> 91298 2014-02-26T14:09:56.830542+01:00 kasslerbraten kernel:
> [195977.007955] pid 8644: put_mountpoint: mp=0000000000000006
> 91299 2014-02-26T14:09:56.830543+01:00 kasslerbraten kernel:
> [195977.007961] pid 8644: put_mountpoint: mp=ffff8800a5f1dbd0
> 91300 2014-02-26T14:09:56.830544+01:00 kasslerbraten kernel:
> [195977.007963] pid 8644: put_mountpoint: mp=ffff8800a6023d10
> 91301 2014-02-26T14:09:56.830544+01:00 kasslerbraten kernel:
> [195977.007963] pid 8644: put_mountpoint: mp=ffff8800a6023d10
> 91302 2014-02-26T14:09:56.830545+01:00 kasslerbraten kernel:
> [195977.007964] pid 8644: put_mountpoint: mp=ffff8800a5f1dbd0

What is this? Where do "root" and "old-root-" come from? Why does pid 8644
survive dereferencing mp=0000000000000006 in the kernel? Is this still
related to autofs?

D.

--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4541 bytes --]

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-02-28 20:35 ` Donald Buczek
@ 2014-03-01 21:56 ` Donald Buczek
2014-03-02 0:52 ` Donald Buczek
0 siblings, 1 reply; 50+ messages in thread
From: Donald Buczek @ 2014-03-01 21:56 UTC (permalink / raw)
To: Alexander Viro; +Cc: Ian Kent, autofs

[-- Attachment #1: Type: text/plain, Size: 1075 bytes --]

On 28.02.2014 21:35, Donald Buczek wrote:
>> 91298 2014-02-26T14:09:56.830542+01:00 kasslerbraten kernel:
>> [195977.007955] pid 8644: put_mountpoint: mp=0000000000000006

At least this mystery is solved. My fault, of course: the argument is
missing in my DPRINTK("mp=%p"), so the logged "mp" values are garbage and
nothing was actually dereferenced at those addresses.

Sorry!

Donald

>
>> *** namespace.c 2014/02/21 14:41:22 1.1
>> --- namespace.c 2014/02/21 14:49:25
>> ***************
>> *** 665,674 ****
>> --- 665,677 ----
>>
>>   static void put_mountpoint(struct mountpoint *mp)
>>   {
>> +     DPRINTK("mp=%p");
>> +
>>       if (!--mp->m_count) {
>>               struct dentry *dentry = mp->m_dentry;
>>               spin_lock(&dentry->d_lock);
>>               dentry->d_flags &= ~DCACHE_MOUNTED;
>> +             DPRINTK("cleared mounted on dentry=%p %.*s",dentry,
>> dentry->d_name.len, dentry->d_name.name);
>>               spin_unlock(&dentry->d_lock);
>>               list_del(&mp->m_hash);
>>               kfree(mp);
>>

--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4541 bytes --]

^ permalink raw reply [flat|nested] 50+ messages in thread
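A format string with a conversion but no matching argument, as in the DPRINTK("mp=%p") call quoted in this message, is undefined behaviour in C, so the value printed is whatever happens to be in the argument slot. A minimal userspace sketch of one way to have the compiler catch this class of mistake follows. It assumes GCC or Clang (the format attribute is a compiler extension), and toy_dbg and TOY_PRINTF_LIKE are hypothetical names, not the DPRINTK macro from the patch, whose definition is not shown in this thread.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper macro: enable printf-style checking where available. */
#if defined(__GNUC__)
#define TOY_PRINTF_LIKE(f, a) __attribute__((format(printf, f, a)))
#else
#define TOY_PRINTF_LIKE(f, a)
#endif

/*
 * Format a debug message into buf. With the attribute above, a call such as
 * toy_dbg(buf, len, "mp=%p") is reported by -Wformat (part of -Wall)
 * because the %p conversion has no argument.
 */
TOY_PRINTF_LIKE(3, 4)
static int toy_dbg(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf && len > 0);
    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return n;
}
```

A correct call passes one argument per conversion, for example toy_dbg(buf, sizeof(buf), "mp=%p", (void *)mp). Building debug patches with -Wall -Werror=format would turn the missing-argument case into a compile error instead of a misleading log line.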
* Re: "Too many levels of symbolic links" 2014-03-01 21:56 ` Donald Buczek @ 2014-03-02 0:52 ` Donald Buczek 2014-03-02 2:17 ` Ian Kent 0 siblings, 1 reply; 50+ messages in thread From: Donald Buczek @ 2014-03-02 0:52 UTC (permalink / raw) To: Alexander Viro; +Cc: Ian Kent, autofs [-- Attachment #1: Type: text/plain, Size: 22437 bytes --] Okay, same bug hit again. New info this time: There _is_ a "struct mount" for the failing dentry , but neither "cat /proc/mounts" nor "mount" or the crash utility show it. Demo follows: The Problem: > root:kasslerbraten:/home/buczek/autofs/# uname -a > Linux kasslerbraten.molgen.mpg.de 3.13.1.mx64.1 #1 SMP Fri Feb 21 > 15:54:38 CET 2014 x86_64 GNU/Linux > root:kasslerbraten:/home/buczek/autofs/# ls /project/mariux32/ > ls: cannot access /project/mariux32/: Too many levels of symbolic links Logfile: > root:kasslerbraten:/home/buczek/autofs/# fgrep mariux32 > /var/log/messages|tail > 2014-03-02T01:33:02.051118+01:00 kasslerbraten kernel: [146256.094196] > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > 2014-03-02T01:33:02.051119+01:00 kasslerbraten kernel: [146256.094197] > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > 2014-03-02T01:33:02.051132+01:00 kasslerbraten kernel: [146256.094199] > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > 2014-03-02T01:33:02.051133+01:00 kasslerbraten kernel: [146256.094200] > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > 2014-03-02T01:33:02.051134+01:00 kasslerbraten kernel: [146256.094201] > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > 2014-03-02T01:33:02.051135+01:00 kasslerbraten kernel: [146256.094202] > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > 2014-03-02T01:33:02.051136+01:00 kasslerbraten kernel: [146256.094203] > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > 2014-03-02T01:33:02.051137+01:00 kasslerbraten kernel: [146256.094204] > pid 23244: 
autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > 2014-03-02T01:33:02.051138+01:00 kasslerbraten kernel: [146256.094205] > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > 2014-03-02T01:33:02.051139+01:00 kasslerbraten kernel: [146256.094206] > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 Mount on /project/mariux32 not visible in /proc/mounts: > root:kasslerbraten:/home/buczek/autofs/# cat /proc/mounts > rootfs / rootfs rw 0 0 > /dev/root / reiserfs rw,relatime 0 0 > devtmpfs /dev devtmpfs > rw,relatime,size=2001060k,nr_inodes=500265,mode=755 0 0 > proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0 > sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0 > tmpfs /dev/shm tmpfs rw,nosuid,nodev,relatime 0 0 > devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620 0 0 > tmpfs /run tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0 > tmpfs /sys/fs/cgroup tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0 > cgroup /sys/fs/cgroup/systemd cgroup > rw,nosuid,nodev,noexec,relatime,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd > 0 0 > cgroup /sys/fs/cgroup/cpuset cgroup > rw,nosuid,nodev,noexec,relatime,cpuset 0 0 > cgroup /sys/fs/cgroup/debug cgroup > rw,nosuid,nodev,noexec,relatime,debug 0 0 > cgroup /sys/fs/cgroup/cpu cgroup rw,nosuid,nodev,noexec,relatime,cpu 0 0 > cgroup /sys/fs/cgroup/cpuacct cgroup > rw,nosuid,nodev,noexec,relatime,cpuacct 0 0 > cgroup /sys/fs/cgroup/devices cgroup > rw,nosuid,nodev,noexec,relatime,devices 0 0 > cgroup /sys/fs/cgroup/freezer cgroup > rw,nosuid,nodev,noexec,relatime,freezer 0 0 > cgroup /sys/fs/cgroup/blkio cgroup > rw,nosuid,nodev,noexec,relatime,blkio 0 0 > systemd-1 /dev/hugepages autofs > rw,relatime,fd=26,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0 > systemd-1 /proc/sys/fs/binfmt_misc autofs > rw,relatime,fd=27,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0 > systemd-1 /sys/kernel/debug autofs > 
rw,relatime,fd=28,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0 > systemd-1 /dev/mqueue autofs > rw,relatime,fd=29,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0 > systemd-1 /sys/kernel/security autofs > rw,relatime,fd=30,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0 > tmpfs /var/run tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0 > rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0 > tmpfs /media tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0 > nfsd /proc/fs/nfsd nfsd rw,relatime 0 0 > fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0 > /dev/sda2 /amd/kasslerbraten/0 reiserfs rw,relatime 0 0 > /etc/automount/auto.home /home autofs > rw,relatime,fd=7,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > /etc/automount/auto.jbod /jbod autofs > rw,relatime,fd=13,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > /etc/automount/auto.confidential /confidential autofs > rw,relatime,fd=19,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > /etc/automount/auto.project /project autofs > rw,relatime,fd=25,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > /etc/automount/auto.package /package autofs > rw,relatime,fd=31,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > /etc/automount/auto.scratch /scratch autofs > rw,relatime,fd=37,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > /etc/automount/auto.src /src autofs > rw,relatime,fd=43,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > pummelfee:/amd/pummelfee/X/X3009/home/abt_srv/klages /home/klages nfs4 > rw,nosuid,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=141.14.19.40,local_lock=none,addr=141.14.16.18 > 0 0 > palle:/amd/palle/1/home/abt_srv/buczek /home/buczek nfs > 
rw,nosuid,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=141.14.28.251,mountvers=3,mountport=58602,mountproto=udp,local_lock=none,addr=141.14.28.251 > 0 0 > binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0 > erdnuckel:/amd/erdnuckel/X/X0008/package/sequencer /package/sequencer > nfs > rw,nosuid,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=141.14.28.246,mountvers=3,mountport=53128,mountproto=udp,local_lock=none,addr=141.14.28.246 > 0 0 > root:kasslerbraten:/home/buczek/autofs/# But visible to my little perl program ( http://owww.molgen.mpg.de/~buczek/autofs-demo/peekmounts ) which walks the mountpoint hashtable and the mount_hashtable ) : > root:kasslerbraten:/home/buczek/autofs/# ./peekmounts > mountpoint 0xffff8801280ffc20 : count= 2 denty=0xffff8800c8996890 (src) > mountpoint 0xffff8801285b6c20 : count= 2 denty=0xffff8800ca457c90 > (systemd) > mountpoint 0xffff8801285b6a60 : count= 2 denty=0xffff8800ca457d50 > (cpuset) > mountpoint 0xffff8801285e9d00 : count= 2 denty=0xffff8800ca4cf710 > (hugepages) > mountpoint 0xffff8801285b6c60 : count= 2 denty=0xffff8800ca457ed0 > (cgroup) > mountpoint 0xffff8801285e9da0 : count= 2 denty=0xffff8800ca4a3c50 > (blkio) > mountpoint 0xffff8800a39282a0 : count= 1 denty=0xffff880094483e90 > (buczek) > mountpoint 0xffff8801280ffc80 : count= 2 denty=0xffff8800c8996dd0 > (scratch) > mountpoint 0xffff8800c8ed8480 : count= 2 denty=0xffff8800c88284d0 > (connections) > mountpoint 0xffff8801285e99e0 : count= 2 denty=0xffff8800bd874110 > (klages) > mountpoint 0xffff8801285e9c80 : count= 2 denty=0xffff8800ca4cfdd0 > (binfmt_misc) > mountpoint 0xffff8801285e9e40 : count= 2 denty=0xffff8800ca4a0510 > (freezer) > mountpoint 0xffff8800aa2d6880 : count= 1 denty=0xffff8800ca4cc390 (/) > mountpoint 0xffff8801285e9c20 : count= 2 denty=0xffff8800ca4cc5d0 > (debug) > mountpoint 0xffff8801285e9f60 : count= 2 
denty=0xffff8800ca49c950 > (cpuacct) > mountpoint 0xffff8800c1d06920 : count= 2 denty=0xffff880129187dd0 > (confidential) > mountpoint 0xffff8800ca1736c0 : count= 2 denty=0xffff8800ca4c4a50 > (rpc_pipefs) > mountpoint 0xffff8800c1d068a0 : count= 2 denty=0xffff8800c8b2e450 > (project) > mountpoint 0xffff8801285e9ec0 : count= 2 denty=0xffff8800ca4a0ed0 > (devices) > mountpoint 0xffff8801285e9ac0 : count= 2 denty=0xffff8800ca4c8dd0 > (security) > mountpoint 0xffff8801285e9fe0 : count= 2 denty=0xffff8800ca4992d0 (cpu) > mountpoint 0xffff8800a3efeba0 : count= 1 denty=0xffff8800c890ce10 (tmp) > mountpoint 0xffff880125e44480 : count= 1 denty=0xffff88007c86da10 > (mariux32) > mountpoint 0xffff8800c1d06940 : count= 2 denty=0xffff8800c89b0690 > (home) > mountpoint 0xffff8800c2841820 : count= 2 denty=0xffff8800c8adf450 (0) > mountpoint 0xffff8800c2841400 : count= 2 denty=0xffff8800c89dc8d0 > (jbod) > mountpoint 0xffff8801285b6ce0 : count= 2 denty=0xffff8800ca4521d0 (run) > mountpoint 0xffff8801285b6d40 : count= 2 denty=0xffff8800ca452290 (pts) > mountpoint 0xffff8801285b69e0 : count= 2 denty=0xffff8800ca499ed0 > (debug) > mountpoint 0xffff8801285b6d80 : count= 2 denty=0xffff8800ca4524d0 (shm) > mountpoint 0xffff880128fef2e0 : count= 3 denty=0xffff880129002ad0 (/) > mountpoint 0xffff8801285b6e00 : count= 2 denty=0xffff8800ca452710 (sys) > mountpoint 0xffff880125d17800 : count= 1 denty=0xffff8800944ae1d0 > (roche454) > mountpoint 0xffff8801280ffce0 : count= 2 denty=0xffff8800c8979590 > (package) > mountpoint 0xffff8801285b6e80 : count= 2 denty=0xffff8800ca452a10 > (proc) > mountpoint 0xffff8801285e9600 : count= 2 denty=0xffff8800ca4e21d0 > (media) > mountpoint 0xffff8801285e9740 : count= 2 denty=0xffff88012900f050 (dev) > mountpoint 0xffff8800ca173fc0 : count= 2 denty=0xffff880129047450 > (nfsd) > mountpoint 0xffff8800b9493d00 : count= 1 denty=0xffff8800b59ade10 > (local) > mountpoint 0xffff8800ca1439c0 : count= 2 denty=0xffff8800ca4e2d10 (run) > mountpoint 0xffff8800aa146b60 : 
count= 1 denty=0xffff880038b8c450 (web) > mountpoint 0xffff8801285e9b40 : count= 2 denty=0xffff8801290478d0 > (mqueue) > struct mount 0xffff880128fd2e00 : mountpoint dentry > 0xffff880129002ad0 (/) mountpoint struct 0xffff880128fef2e0 > struct mount 0xffff8800c9e5e0c0 : mountpoint dentry > 0xffff880129187dd0 (confidential) mountpoint struct 0xffff8800c1d06920 > struct mount 0xffff880128fd2cc0 : mountpoint dentry > 0xffff8800ca4c4a50 (rpc_pipefs) mountpoint struct 0xffff8800ca1736c0 > struct mount 0xffff8800c9e5e200 : mountpoint dentry > 0xffff8800c8b2e450 (project) mountpoint struct 0xffff8800c1d068a0 > struct mount 0xffff8800c9dda300 : mountpoint dentry > 0xffff8800ca4cf710 (hugepages) mountpoint struct 0xffff8801285e9d00 > struct mount 0xffff8800aa103840 : mountpoint dentry > 0xffff880129047450 (nfsd) mountpoint struct 0xffff8800ca173fc0 > struct mount 0xffff8800a3a9a300 : mountpoint dentry > 0xffff8800b59ade10 (local) mountpoint struct 0xffff8800b9493d00 > struct mount 0xffff8800aa103480 : mountpoint dentry > 0xffff8800ca457c90 (systemd) mountpoint struct 0xffff8801285b6c20 > struct mount 0xffff8800aa103340 : mountpoint dentry > 0xffff8800ca457d50 (cpuset) mountpoint struct 0xffff8801285b6a60 > struct mount 0xffff8800aa13ebc0 : mountpoint dentry > 0xffff880038b8c450 (web) mountpoint struct 0xffff8800aa146b60 > struct mount 0xffff8800aa1035c0 : mountpoint dentry > 0xffff8800ca457ed0 (cgroup) mountpoint struct 0xffff8801285b6c60 > struct mount 0xffff8800b5e40a40 : mountpoint dentry > 0xffff8800ca4a3c50 (blkio) mountpoint struct 0xffff8801285e9da0 > struct mount 0xffff8800c9e5e840 : mountpoint dentry > 0xffff8800c89b0690 (home) mountpoint struct 0xffff8800c1d06940 > struct mount 0xffff8800a3a9aa80 : mountpoint dentry > 0xffff880129187dd0 (confidential) mountpoint struct 0xffff8800c1d06920 > struct mount 0xffff8800cac2b840 : mountpoint dentry > 0xffff8800c8adf450 (0) mountpoint struct 0xffff8800c2841820 > struct mount 0xffff8800b5e402c0 : mountpoint dentry > 
0xffff8800ca4c4a50 (rpc_pipefs) mountpoint struct 0xffff8800ca1736c0 > struct mount 0xffff8800c9ddad00 : mountpoint dentry > 0xffff8800ca457c90 (systemd) mountpoint struct 0xffff8801285b6c20 > struct mount 0xffff8800a3a9a940 : mountpoint dentry > 0xffff8800c8b2e450 (project) mountpoint struct 0xffff8800c1d068a0 > struct mount 0xffff8800b5e40680 : mountpoint dentry > 0xffff8800c88284d0 (connections) mountpoint struct 0xffff8800c8ed8480 > struct mount 0xffff8800c9ddabc0 : mountpoint dentry > 0xffff8800ca457d50 (cpuset) mountpoint struct 0xffff8801285b6a60 > struct mount 0xffff8800c1d97b80 : mountpoint dentry > 0xffff8800ca4cc390 (/) mountpoint struct 0xffff8800aa2d6880 > struct mount 0xffff880128fd2a40 : mountpoint dentry > 0xffff8800c89dc8d0 (jbod) mountpoint struct 0xffff8800c2841400 > struct mount 0xffff880128fee580 : mountpoint dentry > 0xffff8800ca4521d0 (run) mountpoint struct 0xffff8801285b6ce0 > struct mount 0xffff8800c9dda440 : mountpoint dentry > 0xffff8800ca4a3c50 (blkio) mountpoint struct 0xffff8801285e9da0 > struct mount 0xffff8800b5e40b80 : mountpoint dentry > 0xffff8800ca4a0510 (freezer) mountpoint struct 0xffff8801285e9e40 > struct mount 0xffff8800c9e5e340 : mountpoint dentry > 0xffff880094483e90 (buczek) mountpoint struct 0xffff8800a39282a0 > struct mount 0xffff8800aa103980 : mountpoint dentry > 0xffff8800ca4cfdd0 (binfmt_misc) mountpoint struct 0xffff8801285e9c80 > struct mount 0xffff8800a3a9a080 : mountpoint dentry > 0xffff8800c890ce10 (tmp) mountpoint struct 0xffff8800a3efeba0 > struct mount 0xffff8800c1d97180 : mountpoint dentry > 0xffff8800bd874110 (klages) mountpoint struct 0xffff8801285e99e0 > struct mount 0xffff880128fee1c0 : mountpoint dentry > 0xffff8800ca452710 (sys) mountpoint struct 0xffff8801285b6e00 > struct mount 0xffff880128feed00 : mountpoint dentry > 0xffff880129002ad0 (/) mountpoint struct 0xffff880128fef2e0 > struct mount 0xffff8800c9e5ec00 : mountpoint dentry > 0xffff880129047450 (nfsd) mountpoint struct 0xffff8800ca173fc0 > 
struct mount 0xffff8800c9e5eac0 : mountpoint dentry > 0xffff8800c8979590 (package) mountpoint struct 0xffff8801280ffce0 > struct mount 0xffff8800a3a9ae40 : mountpoint dentry > 0xffff8800c89b0690 (home) mountpoint struct 0xffff8800c1d06940 > struct mount 0xffff8800b5e40e00 : mountpoint dentry > 0xffff8800ca49c950 (cpuacct) mountpoint struct 0xffff8801285e9f60 > struct mount 0xffff8800b5e40040 : mountpoint dentry > 0xffff8800c8adf450 (0) mountpoint struct 0xffff8800c2841820 > struct mount 0xffff8800b5e40900 : mountpoint dentry > 0xffff8800ca4cc5d0 (debug) mountpoint struct 0xffff8801285e9c20 > struct mount 0xffff880128fee080 : mountpoint dentry > 0xffff8800ca452a10 (proc) mountpoint struct 0xffff8801285b6e80 > struct mount 0xffff8800c9dda580 : mountpoint dentry > 0xffff8800ca4a0510 (freezer) mountpoint struct 0xffff8801285e9e40 > struct mount 0xffff8800cac2bac0 : mountpoint dentry > 0xffff8800ca4e21d0 (media) mountpoint struct 0xffff8801285e9600 > struct mount 0xffff8800cac2bc00 : mountpoint dentry > 0xffff88012900f050 (dev) mountpoint struct 0xffff8801285e9740 > struct mount 0xffff8800aa103e80 : mountpoint dentry > 0xffff8800ca452290 (pts) mountpoint struct 0xffff8801285b6d40 > struct mount 0xffff8800a3a9abc0 : mountpoint dentry > 0xffff8800c89dc8d0 (jbod) mountpoint struct 0xffff8800c2841400 > struct mount 0xffff8800b5e40540 : mountpoint dentry > 0xffff8800ca4521d0 (run) mountpoint struct 0xffff8801285b6ce0 > struct mount 0xffff8800c9dda800 : mountpoint dentry > 0xffff8800ca49c950 (cpuacct) mountpoint struct 0xffff8801285e9f60 > struct mount 0xffff8800b5e40cc0 : mountpoint dentry > 0xffff8800ca4a0ed0 (devices) mountpoint struct 0xffff8801285e9ec0 > struct mount 0xffff8800aa13e080 : mountpoint dentry > 0xffff8800ca4524d0 (shm) mountpoint struct 0xffff8801285b6d80 > struct mount 0xffff8800a3a9ad00 : mountpoint dentry > 0xffff8800bd874110 (klages) mountpoint struct 0xffff8801285e99e0 > struct mount 0xffff8800b5e407c0 : mountpoint dentry > 0xffff8800ca4c8dd0 (security) 
mountpoint struct 0xffff8801285e9ac0 > struct mount 0xffff8800aa1030c0 : mountpoint dentry > 0xffff8800ca4992d0 (cpu) mountpoint struct 0xffff8801285e9fe0 > struct mount 0xffff8800aa103700 : mountpoint dentry > 0xffff8800ca452710 (sys) mountpoint struct 0xffff8801285b6e00 > struct mount 0xffff8800a3a9a580 : mountpoint dentry > 0xffff8800c8979590 (package) mountpoint struct 0xffff8801280ffce0 > struct mount 0xffff8800c9dda6c0 : mountpoint dentry > 0xffff8800ca4a0ed0 (devices) mountpoint struct 0xffff8801285e9ec0 > struct mount 0xffff8800c9ddae40 : mountpoint dentry > 0xffff8800ca457ed0 (cgroup) mountpoint struct 0xffff8801285b6c60 > struct mount 0xffff8800aa103ac0 : mountpoint dentry > 0xffff8800ca452a10 (proc) mountpoint struct 0xffff8801285b6e80 > struct mount 0xffff880128fee440 : mountpoint dentry > 0xffff8800ca452290 (pts) mountpoint struct 0xffff8801285b6d40 > struct mount 0xffff880128feebc0 : mountpoint dentry > 0xffff8800ca4e2d10 (run) mountpoint struct 0xffff8800ca1439c0 > struct mount 0xffff8800b5e40180 : mountpoint dentry > 0xffff8800ca4e21d0 (media) mountpoint struct 0xffff8801285e9600 > struct mount 0xffff8800cac2b980 : mountpoint dentry > 0xffff8800c88284d0 (connections) mountpoint struct 0xffff8800c8ed8480 > struct mount 0xffff8800aa13e1c0 : mountpoint dentry > 0xffff88012900f050 (dev) mountpoint struct 0xffff8801285e9740 > struct mount 0xffff8800c9dda940 : mountpoint dentry > 0xffff8800ca4992d0 (cpu) mountpoint struct 0xffff8801285e9fe0 > struct mount 0xffff880128fee300 : mountpoint dentry > 0xffff8800ca4524d0 (shm) mountpoint struct 0xffff8801285b6d80 > struct mount 0xffff8800c9dda1c0 : mountpoint dentry > 0xffff8800ca4cfdd0 (binfmt_misc) mountpoint struct 0xffff8801285e9c80 > struct mount 0xffff8800c1d97e00 : mountpoint dentry > 0xffff8800c8996890 (src) mountpoint struct 0xffff8801280ffc20 > struct mount 0xffff8800aa103200 : mountpoint dentry > 0xffff8800ca499ed0 (debug) mountpoint struct 0xffff8801285b69e0 > struct mount 0xffff8800b5e40400 : 
mountpoint dentry > 0xffff8800ca4e2d10 (run) mountpoint struct 0xffff8800ca1439c0 > struct mount 0xffff8800c9e5e980 : mountpoint dentry > 0xffff8800c8996dd0 (scratch) mountpoint struct 0xffff8801280ffc80 > struct mount 0xffff8800aa103c00 : mountpoint dentry > 0xffff8801290478d0 (mqueue) mountpoint struct 0xffff8801285e9b40 > struct mount 0xffff8800c9dda080 : mountpoint dentry > 0xffff8800ca4cc5d0 (debug) mountpoint struct 0xffff8801285e9c20 > struct mount 0xffff8800a3a9a6c0 : mountpoint dentry > 0xffff88007c86da10 (mariux32) mountpoint struct 0xffff880125e44480 > struct mount 0xffff8800c9ddaa80 : mountpoint dentry > 0xffff8800ca499ed0 (debug) mountpoint struct 0xffff8801285b69e0 > struct mount 0xffff8800aa13e300 : mountpoint dentry > 0xffff880129002ad0 (/) mountpoint struct 0xffff880128fef2e0 > struct mount 0xffff8800a3a9a1c0 : mountpoint dentry > 0xffff8800c8996890 (src) mountpoint struct 0xffff8801280ffc20 > struct mount 0xffff8800aa103d40 : mountpoint dentry > 0xffff8800ca4cf710 (hugepages) mountpoint struct 0xffff8801285e9d00 > struct mount 0xffff8800c9e5ed40 : mountpoint dentry > 0xffff8800ca4c8dd0 (security) mountpoint struct 0xffff8801285e9ac0 > struct mount 0xffff8800c9e5ee80 : mountpoint dentry > 0xffff8801290478d0 (mqueue) mountpoint struct 0xffff8801285e9b40 > struct mount 0xffff8800a3a9a440 : mountpoint dentry > 0xffff8800c8996dd0 (scratch) mountpoint struct 0xffff8801280ffc80 > struct mount 0xffff8800a3a9a800 : mountpoint dentry > 0xffff8800944ae1d0 (roche454) mountpoint struct 0xffff880125d17800 This is the struct mount of "mariux32" : > (gdb) print *(struct mount *)0xffff8800a3a9a6c0 > $2 = {mnt_hash = {next = 0xffff880128e87cf0, prev = > 0xffff880128e87cf0}, mnt_parent = 0xffff8800a3a9a940, mnt_mountpoint = > 0xffff88007c86da10, mnt = {mnt_root = 0xffff88007c8ed590, > mnt_sb = 0xffff880125fef800, mnt_flags = 33}, mnt_rcu = {next = > 0x0, func = 0}, mnt_pcp = 0x60fed2000ab4, mnt_mounts = {next = > 0xffff8800a3a9a710, prev = 0xffff8800a3a9a710}, > 
mnt_child = {next = 0xffff8800a3a9a990, prev = 0xffff8800a3a9a860}, > mnt_instance = {next = 0xffff880125fef8b0, prev = 0xffff880125fef8b0}, > mnt_devname = 0xffff880120c69c80 > "pille:/amd/pille/1/project/mariux32", mnt_list = {next = > 0xffff8800a3a9a608, prev = 0xffff8800a3a9a888}, mnt_expire = {next = > 0xffff8800a3a9a758, > prev = 0xffff8800a3a9a758}, mnt_share = {next = > 0xffff8800a3a9a768, prev = 0xffff8800a3a9a768}, mnt_slave_list = {next > = 0xffff8800a3a9a778, prev = 0xffff8800a3a9a778}, > mnt_slave = {next = 0xffff8800a3a9a788, prev = 0xffff8800a3a9a788}, > mnt_master = 0x0, mnt_ns = 0xffff8801271f9300, mnt_mp = > 0xffff880125e44480, mnt_fsnotify_marks = {first = 0x0}, > mnt_fsnotify_mask = 0, mnt_id = 126, mnt_group_id = 0, > mnt_expiry_mark = 0, mnt_pinned = 0, mnt_ex_mountpoint = {mnt = 0x0, > dentry = 0x0}} > This is the struct mount of the parent ( "/project") : > (gdb) print *((struct mount *)0xffff8800a3a9a6c0)->mnt_parent > $3 = {mnt_hash = {next = 0xffff880128e87380, prev = > 0xffff880128e87380}, mnt_parent = 0xffff8800aa13e300, mnt_mountpoint = > 0xffff8800c8b2e450, mnt = {mnt_root = 0xffff8800c8b2e810, > mnt_sb = 0xffff8800c8f44000, mnt_flags = 32}, mnt_rcu = {next = > 0x0, func = 0}, mnt_pcp = 0x60fed2000aa4, mnt_mounts = {next = > 0xffff8800a3a9a860, prev = 0xffff8800a3a9a720}, > mnt_child = {next = 0xffff8800a3a9a5e0, prev = 0xffff8800a3a9aae0}, > mnt_instance = {next = 0xffff8800c8f440b0, prev = 0xffff8800c9e5e270}, > mnt_devname = 0xffff8800a3efe1e0 "/etc/automount/auto.project", > mnt_list = {next = 0xffff8800a3a9a888, prev = 0xffff8800a3a9ab08}, > mnt_expire = {next = 0xffff8800a3a9a9d8, > prev = 0xffff8800a3a9a9d8}, mnt_share = {next = > 0xffff8800a3a9a9e8, prev = 0xffff8800a3a9a9e8}, mnt_slave_list = {next > = 0xffff8800a3a9a9f8, prev = 0xffff8800a3a9a9f8}, > mnt_slave = {next = 0xffff8800a3a9aa08, prev = 0xffff8800a3a9aa08}, > mnt_master = 0x0, mnt_ns = 0xffff8801271f9300, mnt_mp = > 0xffff8800c1d068a0, mnt_fsnotify_marks = {first = 
0x0}, > mnt_fsnotify_mask = 0, mnt_id = 124, mnt_group_id = 0, > mnt_expiry_mark = 0, mnt_pinned = 0, mnt_ex_mountpoint = {mnt = 0x0, > dentry = 0x0}} Regards Donald -- Donald Buczek buczek@molgen.mpg.de Tel: +49 30 8413 1433 [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4541 bytes --] ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links" 2014-03-02 0:52 ` Donald Buczek @ 2014-03-02 2:17 ` Ian Kent 2014-03-02 8:28 ` Donald Buczek 0 siblings, 1 reply; 50+ messages in thread From: Ian Kent @ 2014-03-02 2:17 UTC (permalink / raw) To: Donald Buczek; +Cc: Alexander Viro, autofs On Sun, 2014-03-02 at 01:52 +0100, Donald Buczek wrote: > Okay, same bug hit again. > > New info this time: There _is_ a "struct mount" for the failing dentry , > but neither "cat /proc/mounts" nor "mount" or the crash utility show it. > > Demo follows: > > The Problem: > > > > root:kasslerbraten:/home/buczek/autofs/# uname -a > > Linux kasslerbraten.molgen.mpg.de 3.13.1.mx64.1 #1 SMP Fri Feb 21 > > 15:54:38 CET 2014 x86_64 GNU/Linux > > root:kasslerbraten:/home/buczek/autofs/# ls /project/mariux32/ > > ls: cannot access /project/mariux32/: Too many levels of symbolic links > > Logfile: > > > root:kasslerbraten:/home/buczek/autofs/# fgrep mariux32 > > /var/log/messages|tail > > 2014-03-02T01:33:02.051118+01:00 kasslerbraten kernel: [146256.094196] > > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > > 2014-03-02T01:33:02.051119+01:00 kasslerbraten kernel: [146256.094197] > > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > > 2014-03-02T01:33:02.051132+01:00 kasslerbraten kernel: [146256.094199] > > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > > 2014-03-02T01:33:02.051133+01:00 kasslerbraten kernel: [146256.094200] > > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > > 2014-03-02T01:33:02.051134+01:00 kasslerbraten kernel: [146256.094201] > > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > > 2014-03-02T01:33:02.051135+01:00 kasslerbraten kernel: [146256.094202] > > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > > 2014-03-02T01:33:02.051136+01:00 kasslerbraten kernel: [146256.094203] > > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > > 
2014-03-02T01:33:02.051137+01:00 kasslerbraten kernel: [146256.094204] > > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > > 2014-03-02T01:33:02.051138+01:00 kasslerbraten kernel: [146256.094205] > > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > > 2014-03-02T01:33:02.051139+01:00 kasslerbraten kernel: [146256.094206] > > pid 23244: autofs4_d_automount: dentry=ffff88007c86da10 mariux32 > > Mount on /project/mariux32 not visible in /proc/mounts: > > > root:kasslerbraten:/home/buczek/autofs/# cat /proc/mounts > > rootfs / rootfs rw 0 0 > > /dev/root / reiserfs rw,relatime 0 0 > > devtmpfs /dev devtmpfs > > rw,relatime,size=2001060k,nr_inodes=500265,mode=755 0 0 > > proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0 > > sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0 > > tmpfs /dev/shm tmpfs rw,nosuid,nodev,relatime 0 0 > > devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620 0 0 > > tmpfs /run tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0 > > tmpfs /sys/fs/cgroup tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0 > > cgroup /sys/fs/cgroup/systemd cgroup > > rw,nosuid,nodev,noexec,relatime,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd > > 0 0 > > cgroup /sys/fs/cgroup/cpuset cgroup > > rw,nosuid,nodev,noexec,relatime,cpuset 0 0 > > cgroup /sys/fs/cgroup/debug cgroup > > rw,nosuid,nodev,noexec,relatime,debug 0 0 > > cgroup /sys/fs/cgroup/cpu cgroup rw,nosuid,nodev,noexec,relatime,cpu 0 0 > > cgroup /sys/fs/cgroup/cpuacct cgroup > > rw,nosuid,nodev,noexec,relatime,cpuacct 0 0 > > cgroup /sys/fs/cgroup/devices cgroup > > rw,nosuid,nodev,noexec,relatime,devices 0 0 > > cgroup /sys/fs/cgroup/freezer cgroup > > rw,nosuid,nodev,noexec,relatime,freezer 0 0 > > cgroup /sys/fs/cgroup/blkio cgroup > > rw,nosuid,nodev,noexec,relatime,blkio 0 0 > > systemd-1 /dev/hugepages autofs > > rw,relatime,fd=26,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0 > > systemd-1 /proc/sys/fs/binfmt_misc autofs > > 
rw,relatime,fd=27,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0 > > systemd-1 /sys/kernel/debug autofs > > rw,relatime,fd=28,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0 > > systemd-1 /dev/mqueue autofs > > rw,relatime,fd=29,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0 > > systemd-1 /sys/kernel/security autofs > > rw,relatime,fd=30,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0 > > tmpfs /var/run tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0 > > rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0 > > tmpfs /media tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0 > > nfsd /proc/fs/nfsd nfsd rw,relatime 0 0 > > fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0 > > /dev/sda2 /amd/kasslerbraten/0 reiserfs rw,relatime 0 0 > > /etc/automount/auto.home /home autofs > > rw,relatime,fd=7,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > > /etc/automount/auto.jbod /jbod autofs > > rw,relatime,fd=13,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > > /etc/automount/auto.confidential /confidential autofs > > rw,relatime,fd=19,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > > /etc/automount/auto.project /project autofs > > rw,relatime,fd=25,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > > /etc/automount/auto.package /package autofs > > rw,relatime,fd=31,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > > /etc/automount/auto.scratch /scratch autofs > > rw,relatime,fd=37,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > > /etc/automount/auto.src /src autofs > > rw,relatime,fd=43,pgrp=542,timeout=300,minproto=5,maxproto=5,indirect 0 0 > > pummelfee:/amd/pummelfee/X/X3009/home/abt_srv/klages /home/klages nfs4 > > rw,nosuid,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=141.14.19.40,local_lock=none,addr=141.14.16.18 > > 0 0 > > palle:/amd/palle/1/home/abt_srv/buczek /home/buczek nfs > > 
rw,nosuid,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=141.14.28.251,mountvers=3,mountport=58602,mountproto=udp,local_lock=none,addr=141.14.28.251 > > 0 0 > > binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0 > > erdnuckel:/amd/erdnuckel/X/X0008/package/sequencer /package/sequencer > > nfs > > rw,nosuid,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=141.14.28.246,mountvers=3,mountport=53128,mountproto=udp,local_lock=none,addr=141.14.28.246 > > 0 0 > > root:kasslerbraten:/home/buczek/autofs/# > > But visible to my little perl program ( > http://owww.molgen.mpg.de/~buczek/autofs-demo/peekmounts ) which walks > the mountpoint hashtable and the mount_hashtable ) : > > > root:kasslerbraten:/home/buczek/autofs/# ./peekmounts > > mountpoint 0xffff8801280ffc20 : count= 2 denty=0xffff8800c8996890 (src) > > mountpoint 0xffff8801285b6c20 : count= 2 denty=0xffff8800ca457c90 > > (systemd) > > mountpoint 0xffff8801285b6a60 : count= 2 denty=0xffff8800ca457d50 > > (cpuset) > > mountpoint 0xffff8801285e9d00 : count= 2 denty=0xffff8800ca4cf710 > > (hugepages) > > mountpoint 0xffff8801285b6c60 : count= 2 denty=0xffff8800ca457ed0 > > (cgroup) > > mountpoint 0xffff8801285e9da0 : count= 2 denty=0xffff8800ca4a3c50 > > (blkio) > > mountpoint 0xffff8800a39282a0 : count= 1 denty=0xffff880094483e90 > > (buczek) > > mountpoint 0xffff8801280ffc80 : count= 2 denty=0xffff8800c8996dd0 > > (scratch) > > mountpoint 0xffff8800c8ed8480 : count= 2 denty=0xffff8800c88284d0 > > (connections) > > mountpoint 0xffff8801285e99e0 : count= 2 denty=0xffff8800bd874110 > > (klages) > > mountpoint 0xffff8801285e9c80 : count= 2 denty=0xffff8800ca4cfdd0 > > (binfmt_misc) > > mountpoint 0xffff8801285e9e40 : count= 2 denty=0xffff8800ca4a0510 > > (freezer) > > mountpoint 0xffff8800aa2d6880 : count= 1 denty=0xffff8800ca4cc390 (/) > > mountpoint 0xffff8801285e9c20 : count= 2 
denty=0xffff8800ca4cc5d0 > > (debug) > > mountpoint 0xffff8801285e9f60 : count= 2 denty=0xffff8800ca49c950 > > (cpuacct) > > mountpoint 0xffff8800c1d06920 : count= 2 denty=0xffff880129187dd0 > > (confidential) > > mountpoint 0xffff8800ca1736c0 : count= 2 denty=0xffff8800ca4c4a50 > > (rpc_pipefs) > > mountpoint 0xffff8800c1d068a0 : count= 2 denty=0xffff8800c8b2e450 > > (project) > > mountpoint 0xffff8801285e9ec0 : count= 2 denty=0xffff8800ca4a0ed0 > > (devices) > > mountpoint 0xffff8801285e9ac0 : count= 2 denty=0xffff8800ca4c8dd0 > > (security) > > mountpoint 0xffff8801285e9fe0 : count= 2 denty=0xffff8800ca4992d0 (cpu) > > mountpoint 0xffff8800a3efeba0 : count= 1 denty=0xffff8800c890ce10 (tmp) > > mountpoint 0xffff880125e44480 : count= 1 denty=0xffff88007c86da10 > > (mariux32) > > mountpoint 0xffff8800c1d06940 : count= 2 denty=0xffff8800c89b0690 > > (home) > > mountpoint 0xffff8800c2841820 : count= 2 denty=0xffff8800c8adf450 (0) > > mountpoint 0xffff8800c2841400 : count= 2 denty=0xffff8800c89dc8d0 > > (jbod) > > mountpoint 0xffff8801285b6ce0 : count= 2 denty=0xffff8800ca4521d0 (run) > > mountpoint 0xffff8801285b6d40 : count= 2 denty=0xffff8800ca452290 (pts) > > mountpoint 0xffff8801285b69e0 : count= 2 denty=0xffff8800ca499ed0 > > (debug) > > mountpoint 0xffff8801285b6d80 : count= 2 denty=0xffff8800ca4524d0 (shm) > > mountpoint 0xffff880128fef2e0 : count= 3 denty=0xffff880129002ad0 (/) > > mountpoint 0xffff8801285b6e00 : count= 2 denty=0xffff8800ca452710 (sys) > > mountpoint 0xffff880125d17800 : count= 1 denty=0xffff8800944ae1d0 > > (roche454) > > mountpoint 0xffff8801280ffce0 : count= 2 denty=0xffff8800c8979590 > > (package) > > mountpoint 0xffff8801285b6e80 : count= 2 denty=0xffff8800ca452a10 > > (proc) > > mountpoint 0xffff8801285e9600 : count= 2 denty=0xffff8800ca4e21d0 > > (media) > > mountpoint 0xffff8801285e9740 : count= 2 denty=0xffff88012900f050 (dev) > > mountpoint 0xffff8800ca173fc0 : count= 2 denty=0xffff880129047450 > > (nfsd) > > mountpoint 
0xffff8800b9493d00 : count= 1 denty=0xffff8800b59ade10 > > (local) > > mountpoint 0xffff8800ca1439c0 : count= 2 denty=0xffff8800ca4e2d10 (run) > > mountpoint 0xffff8800aa146b60 : count= 1 denty=0xffff880038b8c450 (web) > > mountpoint 0xffff8801285e9b40 : count= 2 denty=0xffff8801290478d0 > > (mqueue) > > struct mount 0xffff880128fd2e00 : mountpoint dentry > > 0xffff880129002ad0 (/) mountpoint struct 0xffff880128fef2e0 > > struct mount 0xffff8800c9e5e0c0 : mountpoint dentry > > 0xffff880129187dd0 (confidential) mountpoint struct 0xffff8800c1d06920 > > struct mount 0xffff880128fd2cc0 : mountpoint dentry > > 0xffff8800ca4c4a50 (rpc_pipefs) mountpoint struct 0xffff8800ca1736c0 > > struct mount 0xffff8800c9e5e200 : mountpoint dentry > > 0xffff8800c8b2e450 (project) mountpoint struct 0xffff8800c1d068a0 > > struct mount 0xffff8800c9dda300 : mountpoint dentry > > 0xffff8800ca4cf710 (hugepages) mountpoint struct 0xffff8801285e9d00 > > struct mount 0xffff8800aa103840 : mountpoint dentry > > 0xffff880129047450 (nfsd) mountpoint struct 0xffff8800ca173fc0 > > struct mount 0xffff8800a3a9a300 : mountpoint dentry > > 0xffff8800b59ade10 (local) mountpoint struct 0xffff8800b9493d00 > > struct mount 0xffff8800aa103480 : mountpoint dentry > > 0xffff8800ca457c90 (systemd) mountpoint struct 0xffff8801285b6c20 > > struct mount 0xffff8800aa103340 : mountpoint dentry > > 0xffff8800ca457d50 (cpuset) mountpoint struct 0xffff8801285b6a60 > > struct mount 0xffff8800aa13ebc0 : mountpoint dentry > > 0xffff880038b8c450 (web) mountpoint struct 0xffff8800aa146b60 > > struct mount 0xffff8800aa1035c0 : mountpoint dentry > > 0xffff8800ca457ed0 (cgroup) mountpoint struct 0xffff8801285b6c60 > > struct mount 0xffff8800b5e40a40 : mountpoint dentry > > 0xffff8800ca4a3c50 (blkio) mountpoint struct 0xffff8801285e9da0 > > struct mount 0xffff8800c9e5e840 : mountpoint dentry > > 0xffff8800c89b0690 (home) mountpoint struct 0xffff8800c1d06940 > > struct mount 0xffff8800a3a9aa80 : mountpoint dentry > > 
0xffff880129187dd0 (confidential) mountpoint struct 0xffff8800c1d06920 > > struct mount 0xffff8800cac2b840 : mountpoint dentry > > 0xffff8800c8adf450 (0) mountpoint struct 0xffff8800c2841820 > > struct mount 0xffff8800b5e402c0 : mountpoint dentry > > 0xffff8800ca4c4a50 (rpc_pipefs) mountpoint struct 0xffff8800ca1736c0 > > struct mount 0xffff8800c9ddad00 : mountpoint dentry > > 0xffff8800ca457c90 (systemd) mountpoint struct 0xffff8801285b6c20 > > struct mount 0xffff8800a3a9a940 : mountpoint dentry > > 0xffff8800c8b2e450 (project) mountpoint struct 0xffff8800c1d068a0 > > struct mount 0xffff8800b5e40680 : mountpoint dentry > > 0xffff8800c88284d0 (connections) mountpoint struct 0xffff8800c8ed8480 > > struct mount 0xffff8800c9ddabc0 : mountpoint dentry > > 0xffff8800ca457d50 (cpuset) mountpoint struct 0xffff8801285b6a60 > > struct mount 0xffff8800c1d97b80 : mountpoint dentry > > 0xffff8800ca4cc390 (/) mountpoint struct 0xffff8800aa2d6880 > > struct mount 0xffff880128fd2a40 : mountpoint dentry > > 0xffff8800c89dc8d0 (jbod) mountpoint struct 0xffff8800c2841400 > > struct mount 0xffff880128fee580 : mountpoint dentry > > 0xffff8800ca4521d0 (run) mountpoint struct 0xffff8801285b6ce0 > > struct mount 0xffff8800c9dda440 : mountpoint dentry > > 0xffff8800ca4a3c50 (blkio) mountpoint struct 0xffff8801285e9da0 > > struct mount 0xffff8800b5e40b80 : mountpoint dentry > > 0xffff8800ca4a0510 (freezer) mountpoint struct 0xffff8801285e9e40 > > struct mount 0xffff8800c9e5e340 : mountpoint dentry > > 0xffff880094483e90 (buczek) mountpoint struct 0xffff8800a39282a0 > > struct mount 0xffff8800aa103980 : mountpoint dentry > > 0xffff8800ca4cfdd0 (binfmt_misc) mountpoint struct 0xffff8801285e9c80 > > struct mount 0xffff8800a3a9a080 : mountpoint dentry > > 0xffff8800c890ce10 (tmp) mountpoint struct 0xffff8800a3efeba0 > > struct mount 0xffff8800c1d97180 : mountpoint dentry > > 0xffff8800bd874110 (klages) mountpoint struct 0xffff8801285e99e0 > > struct mount 0xffff880128fee1c0 : mountpoint dentry 
> > 0xffff8800ca452710 (sys) mountpoint struct 0xffff8801285b6e00 > > struct mount 0xffff880128feed00 : mountpoint dentry > > 0xffff880129002ad0 (/) mountpoint struct 0xffff880128fef2e0 > > struct mount 0xffff8800c9e5ec00 : mountpoint dentry > > 0xffff880129047450 (nfsd) mountpoint struct 0xffff8800ca173fc0 > > struct mount 0xffff8800c9e5eac0 : mountpoint dentry > > 0xffff8800c8979590 (package) mountpoint struct 0xffff8801280ffce0 > > struct mount 0xffff8800a3a9ae40 : mountpoint dentry > > 0xffff8800c89b0690 (home) mountpoint struct 0xffff8800c1d06940 > > struct mount 0xffff8800b5e40e00 : mountpoint dentry > > 0xffff8800ca49c950 (cpuacct) mountpoint struct 0xffff8801285e9f60 > > struct mount 0xffff8800b5e40040 : mountpoint dentry > > 0xffff8800c8adf450 (0) mountpoint struct 0xffff8800c2841820 > > struct mount 0xffff8800b5e40900 : mountpoint dentry > > 0xffff8800ca4cc5d0 (debug) mountpoint struct 0xffff8801285e9c20 > > struct mount 0xffff880128fee080 : mountpoint dentry > > 0xffff8800ca452a10 (proc) mountpoint struct 0xffff8801285b6e80 > > struct mount 0xffff8800c9dda580 : mountpoint dentry > > 0xffff8800ca4a0510 (freezer) mountpoint struct 0xffff8801285e9e40 > > struct mount 0xffff8800cac2bac0 : mountpoint dentry > > 0xffff8800ca4e21d0 (media) mountpoint struct 0xffff8801285e9600 > > struct mount 0xffff8800cac2bc00 : mountpoint dentry > > 0xffff88012900f050 (dev) mountpoint struct 0xffff8801285e9740 > > struct mount 0xffff8800aa103e80 : mountpoint dentry > > 0xffff8800ca452290 (pts) mountpoint struct 0xffff8801285b6d40 > > struct mount 0xffff8800a3a9abc0 : mountpoint dentry > > 0xffff8800c89dc8d0 (jbod) mountpoint struct 0xffff8800c2841400 > > struct mount 0xffff8800b5e40540 : mountpoint dentry > > 0xffff8800ca4521d0 (run) mountpoint struct 0xffff8801285b6ce0 > > struct mount 0xffff8800c9dda800 : mountpoint dentry > > 0xffff8800ca49c950 (cpuacct) mountpoint struct 0xffff8801285e9f60 > > struct mount 0xffff8800b5e40cc0 : mountpoint dentry > > 0xffff8800ca4a0ed0 
(devices) mountpoint struct 0xffff8801285e9ec0 > > struct mount 0xffff8800aa13e080 : mountpoint dentry > > 0xffff8800ca4524d0 (shm) mountpoint struct 0xffff8801285b6d80 > > struct mount 0xffff8800a3a9ad00 : mountpoint dentry > > 0xffff8800bd874110 (klages) mountpoint struct 0xffff8801285e99e0 > > struct mount 0xffff8800b5e407c0 : mountpoint dentry > > 0xffff8800ca4c8dd0 (security) mountpoint struct 0xffff8801285e9ac0 > > struct mount 0xffff8800aa1030c0 : mountpoint dentry > > 0xffff8800ca4992d0 (cpu) mountpoint struct 0xffff8801285e9fe0 > > struct mount 0xffff8800aa103700 : mountpoint dentry > > 0xffff8800ca452710 (sys) mountpoint struct 0xffff8801285b6e00 > > struct mount 0xffff8800a3a9a580 : mountpoint dentry > > 0xffff8800c8979590 (package) mountpoint struct 0xffff8801280ffce0 > > struct mount 0xffff8800c9dda6c0 : mountpoint dentry > > 0xffff8800ca4a0ed0 (devices) mountpoint struct 0xffff8801285e9ec0 > > struct mount 0xffff8800c9ddae40 : mountpoint dentry > > 0xffff8800ca457ed0 (cgroup) mountpoint struct 0xffff8801285b6c60 > > struct mount 0xffff8800aa103ac0 : mountpoint dentry > > 0xffff8800ca452a10 (proc) mountpoint struct 0xffff8801285b6e80 > > struct mount 0xffff880128fee440 : mountpoint dentry > > 0xffff8800ca452290 (pts) mountpoint struct 0xffff8801285b6d40 > > struct mount 0xffff880128feebc0 : mountpoint dentry > > 0xffff8800ca4e2d10 (run) mountpoint struct 0xffff8800ca1439c0 > > struct mount 0xffff8800b5e40180 : mountpoint dentry > > 0xffff8800ca4e21d0 (media) mountpoint struct 0xffff8801285e9600 > > struct mount 0xffff8800cac2b980 : mountpoint dentry > > 0xffff8800c88284d0 (connections) mountpoint struct 0xffff8800c8ed8480 > > struct mount 0xffff8800aa13e1c0 : mountpoint dentry > > 0xffff88012900f050 (dev) mountpoint struct 0xffff8801285e9740 > > struct mount 0xffff8800c9dda940 : mountpoint dentry > > 0xffff8800ca4992d0 (cpu) mountpoint struct 0xffff8801285e9fe0 > > struct mount 0xffff880128fee300 : mountpoint dentry > > 0xffff8800ca4524d0 (shm) 
mountpoint struct 0xffff8801285b6d80 > > struct mount 0xffff8800c9dda1c0 : mountpoint dentry > > 0xffff8800ca4cfdd0 (binfmt_misc) mountpoint struct 0xffff8801285e9c80 > > struct mount 0xffff8800c1d97e00 : mountpoint dentry > > 0xffff8800c8996890 (src) mountpoint struct 0xffff8801280ffc20 > > struct mount 0xffff8800aa103200 : mountpoint dentry > > 0xffff8800ca499ed0 (debug) mountpoint struct 0xffff8801285b69e0 > > struct mount 0xffff8800b5e40400 : mountpoint dentry > > 0xffff8800ca4e2d10 (run) mountpoint struct 0xffff8800ca1439c0 > > struct mount 0xffff8800c9e5e980 : mountpoint dentry > > 0xffff8800c8996dd0 (scratch) mountpoint struct 0xffff8801280ffc80 > > struct mount 0xffff8800aa103c00 : mountpoint dentry > > 0xffff8801290478d0 (mqueue) mountpoint struct 0xffff8801285e9b40 > > struct mount 0xffff8800c9dda080 : mountpoint dentry > > 0xffff8800ca4cc5d0 (debug) mountpoint struct 0xffff8801285e9c20 > > struct mount 0xffff8800a3a9a6c0 : mountpoint dentry > > 0xffff88007c86da10 (mariux32) mountpoint struct 0xffff880125e44480 > > struct mount 0xffff8800c9ddaa80 : mountpoint dentry > > 0xffff8800ca499ed0 (debug) mountpoint struct 0xffff8801285b69e0 > > struct mount 0xffff8800aa13e300 : mountpoint dentry > > 0xffff880129002ad0 (/) mountpoint struct 0xffff880128fef2e0 > > struct mount 0xffff8800a3a9a1c0 : mountpoint dentry > > 0xffff8800c8996890 (src) mountpoint struct 0xffff8801280ffc20 > > struct mount 0xffff8800aa103d40 : mountpoint dentry > > 0xffff8800ca4cf710 (hugepages) mountpoint struct 0xffff8801285e9d00 > > struct mount 0xffff8800c9e5ed40 : mountpoint dentry > > 0xffff8800ca4c8dd0 (security) mountpoint struct 0xffff8801285e9ac0 > > struct mount 0xffff8800c9e5ee80 : mountpoint dentry > > 0xffff8801290478d0 (mqueue) mountpoint struct 0xffff8801285e9b40 > > struct mount 0xffff8800a3a9a440 : mountpoint dentry > > 0xffff8800c8996dd0 (scratch) mountpoint struct 0xffff8801280ffc80 > > struct mount 0xffff8800a3a9a800 : mountpoint dentry > > 0xffff8800944ae1d0 (roche454) 
mountpoint struct 0xffff880125d17800
>
> This is the struct mount of "mariux32" :
>
> > (gdb) print *(struct mount *)0xffff8800a3a9a6c0
> > $2 = {mnt_hash = {next = 0xffff880128e87cf0, prev =
> > 0xffff880128e87cf0}, mnt_parent = 0xffff8800a3a9a940, mnt_mountpoint =
> > 0xffff88007c86da10, mnt = {mnt_root = 0xffff88007c8ed590,
> > mnt_sb = 0xffff880125fef800, mnt_flags = 33}, mnt_rcu = {next =
> > 0x0, func = 0}, mnt_pcp = 0x60fed2000ab4, mnt_mounts = {next =
> > 0xffff8800a3a9a710, prev = 0xffff8800a3a9a710},
> > mnt_child = {next = 0xffff8800a3a9a990, prev = 0xffff8800a3a9a860},
> > mnt_instance = {next = 0xffff880125fef8b0, prev = 0xffff880125fef8b0},
> > mnt_devname = 0xffff880120c69c80
> > "pille:/amd/pille/1/project/mariux32", mnt_list = {next =
> > 0xffff8800a3a9a608, prev = 0xffff8800a3a9a888}, mnt_expire = {next =
> > 0xffff8800a3a9a758,
> > prev = 0xffff8800a3a9a758}, mnt_share = {next =
> > 0xffff8800a3a9a768, prev = 0xffff8800a3a9a768}, mnt_slave_list = {next
> > = 0xffff8800a3a9a778, prev = 0xffff8800a3a9a778},
> > mnt_slave = {next = 0xffff8800a3a9a788, prev = 0xffff8800a3a9a788},
> > mnt_master = 0x0, mnt_ns = 0xffff8801271f9300, mnt_mp =
> > 0xffff880125e44480, mnt_fsnotify_marks = {first = 0x0},
> > mnt_fsnotify_mask = 0, mnt_id = 126, mnt_group_id = 0,
> > mnt_expiry_mark = 0, mnt_pinned = 0, mnt_ex_mountpoint = {mnt = 0x0,
> > dentry = 0x0}}
> >
>
> This is the struct mount of the parent ( "/project") :
>
> > (gdb) print *((struct mount *)0xffff8800a3a9a6c0)->mnt_parent
> > $3 = {mnt_hash = {next = 0xffff880128e87380, prev =
> > 0xffff880128e87380}, mnt_parent = 0xffff8800aa13e300, mnt_mountpoint =

mnt_hash->next == mnt_hash->prev, mount has been unlinked from the mount
tree so is not "visible". As far as we are concerned this mount has
gone.

> > 0xffff8800c8b2e450, mnt = {mnt_root = 0xffff8800c8b2e810,
> > mnt_sb = 0xffff8800c8f44000, mnt_flags = 32}, mnt_rcu = {next =
> > 0x0, func = 0}, mnt_pcp = 0x60fed2000aa4, mnt_mounts = {next =
> > 0xffff8800a3a9a860, prev = 0xffff8800a3a9a720},
> > mnt_child = {next = 0xffff8800a3a9a5e0, prev = 0xffff8800a3a9aae0},
> > mnt_instance = {next = 0xffff8800c8f440b0, prev = 0xffff8800c9e5e270},
> > mnt_devname = 0xffff8800a3efe1e0 "/etc/automount/auto.project",
> > mnt_list = {next = 0xffff8800a3a9a888, prev = 0xffff8800a3a9ab08},
> > mnt_expire = {next = 0xffff8800a3a9a9d8,
> > prev = 0xffff8800a3a9a9d8}, mnt_share = {next =
> > 0xffff8800a3a9a9e8, prev = 0xffff8800a3a9a9e8}, mnt_slave_list = {next
> > = 0xffff8800a3a9a9f8, prev = 0xffff8800a3a9a9f8},
> > mnt_slave = {next = 0xffff8800a3a9aa08, prev = 0xffff8800a3a9aa08},
> > mnt_master = 0x0, mnt_ns = 0xffff8801271f9300, mnt_mp =
> > 0xffff8800c1d068a0, mnt_fsnotify_marks = {first = 0x0},
> > mnt_fsnotify_mask = 0, mnt_id = 124, mnt_group_id = 0,
> > mnt_expiry_mark = 0, mnt_pinned = 0, mnt_ex_mountpoint = {mnt = 0x0,
> > dentry = 0x0}}
> >
> Regards
> Donald
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-03-02 2:17 ` Ian Kent
@ 2014-03-02 8:28 ` Donald Buczek
2014-03-02 9:41 ` Ian Kent
0 siblings, 1 reply; 50+ messages in thread
From: Donald Buczek @ 2014-03-02 8:28 UTC (permalink / raw)
To: Ian Kent; +Cc: Alexander Viro, autofs
On 02.03.2014 03:17, Ian Kent wrote:
> mnt_hash->next == mnt_hash->prev, mount has been unlinked from the mount
> tree so is not "visible". As far as we are concerned this mount has
> gone.
No, prev and next both point to the list_head in the mount_hashtable,
i.e. the mount is the only entry in its hash bucket and is still hashed.
--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-03-02 8:28 ` Donald Buczek
@ 2014-03-02 9:41 ` Ian Kent
2014-03-02 10:22 ` Donald Buczek
0 siblings, 1 reply; 50+ messages in thread
From: Ian Kent @ 2014-03-02 9:41 UTC (permalink / raw)
To: Donald Buczek; +Cc: Alexander Viro, autofs
On Sun, 2014-03-02 at 09:28 +0100, Donald Buczek wrote:
> On 02.03.2014 03:17, Ian Kent wrote:
> > mnt_hash->next == mnt_hash->prev, mount has been unlinked from the mount
> > tree so is not "visible". As far as we are concerned this mount has
> > gone.
>
> No, prev and next both point to the list_head in the mount_hashtable.
Fair call; ->mnt_mp != NULL too, which implies the mount hasn't been
unlinked.
^ permalink raw reply [flat|nested] 50+ messages in thread
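The point settled in the exchange above can be illustrated with a stripped-down list. What follows is a minimal userspace sketch modelled on the kernel's circular struct list_head; it is a simplified re-implementation written for illustration, not the kernel source. It assumes, as the gdb output suggests for this kernel, that mnt_hash is a plain list_head which gets re-initialised to point at itself when the mount is unhashed. The helpers sole_entry_on() and is_unlinked() are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace copy of a circular doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* An initialised (or unlinked) node points at itself in both directions. */
void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert entry directly after head. */
void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Remove entry from its list and make it point at itself again. */
void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    init_list_head(entry);
}

bool list_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Hypothetical helper: entry is the only element on the chain at head. */
bool sole_entry_on(const struct list_head *entry, const struct list_head *head)
{
    return entry->next == head && entry->prev == head;
}

/* Hypothetical helper: entry is on no list at all (points at itself). */
bool is_unlinked(const struct list_head *entry)
{
    return list_empty(entry);
}

/* Walks through both cases discussed in the thread; returns 0 on success. */
int list_self_check(void)
{
    struct list_head bucket, entry;

    init_list_head(&bucket);
    init_list_head(&entry);

    list_add(&entry, &bucket);
    assert(entry.next == entry.prev);       /* next == prev ...            */
    assert(sole_entry_on(&entry, &bucket)); /* ... because both are bucket */
    assert(!is_unlinked(&entry));           /* still hashed                */

    list_del_init(&entry);
    assert(entry.next == entry.prev);       /* next == prev again ...      */
    assert(is_unlinked(&entry));            /* ... but now both are &entry */
    return 0;
}
```

In both states next == prev holds, so that comparison alone cannot distinguish them. What matters is whether the pointers refer to the entry itself (unlinked) or to some other list_head such as a hash bucket (still linked), which is the case in the dump, where mnt_hash.next differs from the address of the struct mount.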
* Re: "Too many levels of symbolic links"
2014-03-02 9:41 ` Ian Kent
@ 2014-03-02 10:22 ` Donald Buczek
2014-03-02 11:03 ` Ian Kent
0 siblings, 1 reply; 50+ messages in thread
From: Donald Buczek @ 2014-03-02 10:22 UTC (permalink / raw)
To: Ian Kent; +Cc: Alexander Viro, autofs
On 02.03.2014 10:41, Ian Kent wrote:
> On Sun, 2014-03-02 at 09:28 +0100, Donald Buczek wrote:
>> On 02.03.2014 03:17, Ian Kent wrote:
>>> mnt_hash->next == mnt_hash->prev, mount has been unlinked from the mount
>>> tree so is not "visible". As far as we are concerned this mount has
>>> gone.
>> No, prev and next both point to the list_head in the mount_hashtable.
> Fair call, ->mnt_mp != NULL too which implies the mount hasn't been
> unlinked.
The problem is that the mount is in another namespace. I've put mnt_ns
into my perl script:
> root:kasslerbraten:/home/buczek/autofs/# ./peekmounts |grep mariux32
> mountpoint 0xffff880125e44480 : count= 1 denty=0xffff88007c86da10
> (mariux32)
> struct mount 0xffff8800a3a9a6c0 : mountpoint dentry
> 0xffff88007c86da10 (mariux32) mountpoint struct 0xffff880125e44480 NS
> 0xffff8801271f9300
> root:kasslerbraten:/home/buczek/autofs/# ./peekmounts |grep project
> mountpoint 0xffff8800c1d068a0 : count= 2 denty=0xffff8800c8b2e450
> (project)
> struct mount 0xffff8800c9e5e200 : mountpoint dentry
> 0xffff8800c8b2e450 (project) mountpoint struct 0xffff8800c1d068a0 NS
> 0xffff88012d00ed00
> struct mount 0xffff8800a3a9a940 : mountpoint dentry
> 0xffff8800c8b2e450 (project) mountpoint struct 0xffff8800c1d068a0 NS
> 0xffff8801271f9300
We have a /project without a mounted /project/mariux32 and a /project
with a mounted /project/mariux32 in another namespace.

This goes in the direction you mentioned in your other mail ("Illegal as
far as autofs is concerned because an autofs mount is strictly
associated with a path defined by its map"). The system-wide, absolute
semantics of pathnames in the automount world don't fit well into the
process-local, relative mount semantics of the kernel.

I still don't know where these "root" or "old-root" messages come from,
but again the error occurred after these strange messages appeared:
> 2014-02-28T12:33:08.461073+01:00 kasslerbraten kernel: [13093.129511]
> pid 7670: d_set_mounted: set mounted on dentry=ffff88007ca58a50 root
> 2014-02-28T12:33:08.461074+01:00 kasslerbraten kernel: [13093.129569]
> pid 7670: put_mountpoint: mp=ffff8801271f9338
> 2014-02-28T12:33:08.461075+01:00 kasslerbraten kernel: [13093.129574]
> pid 7670: d_set_mounted: dentry=ffff8800c890ce10 tmp
> 2014-02-28T12:33:08.461076+01:00 kasslerbraten kernel: [13093.129575]
> pid 7670: d_set_mounted: set mounted on dentry=ffff8800c890ce10 tmp
> 2014-02-28T12:33:08.461077+01:00 kasslerbraten kernel: [13093.129578]
> pid 7670: put_mountpoint: mp=ffff8801271f9338
> 2014-02-28T12:33:08.461078+01:00 kasslerbraten kernel: [13093.129599]
> pid 7670: d_set_mounted: dentry=ffff88007ca407d0 old-root-D0k5jB
> 2014-02-28T12:33:08.461079+01:00 kasslerbraten kernel: [13093.129601]
> pid 7670: d_set_mounted: set mounted on dentry=ffff88007ca407d0
> old-root-D0k5jB
> 2014-02-28T12:33:08.461080+01:00 kasslerbraten kernel: [13093.129602]
> pid 7670: put_mountpoint: mp=ffff8800c9e5e750
> 2014-02-28T12:33:08.461081+01:00 kasslerbraten kernel: [13093.129603]
> pid 7670: put_mountpoint: cleared mounted on dentry=ffff88007ca58a50 root
> 2014-02-28T12:33:08.461082+01:00 kasslerbraten kernel: [13093.129604]
> pid 7670: put_mountpoint: mp=ffff8800c9e5e610
> 2014-02-28T12:33:08.461082+01:00 kasslerbraten kernel: [13093.129662]
> pid 7670: put_mountpoint: mp=0000014800450045
> 2014-02-28T12:33:08.461083+01:00 kasslerbraten kernel: [13093.129663]
> pid 7670: put_mountpoint: mp=0000000000000006
(As explained in another mail, the addresses of "mp=" are wrong, so
don't worry about these.)

This looks like chroot or some such, but I have no idea. I don't find
\"root or \"old- in any sources. There is no "root" in any map.

Hmmm. Isn't the daemon doing lazy umounts? Could it be this?
I've compiled the daemon with --enable-ignore-busy

Anything forcing the daemon to restart?
systemd doing stupid things to the daemon or the filesystem?
> [Service]
> Type=forking
> ExecStartPre=/sbin/make-automaps
> ExecStart=/usr/sbin/automount -v
> PIDFile=/run/autofs-running
> ExecReload=/bin/kill -HUP $MAINPID
Donald
--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433
^ permalink raw reply [flat|nested] 50+ messages in thread
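The observation in the message above (the same mountpoint carrying mounts that belong to two different mnt_ns values) can also be cross-checked from user space without gdb. The following is a minimal sketch, assuming a kernel that exposes /proc/<pid>/ns/mnt as a symlink whose target looks like mnt:[4026531840]; two processes are in the same mount namespace when the numbers match. The function names are hypothetical and exist only for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Parse a link target such as "mnt:[4026531840]".
 * Returns the number in brackets, or 0 if the text has another shape. */
unsigned long parse_ns_inode(const char *link)
{
    unsigned long ino = 0;

    if (sscanf(link, "mnt:[%lu]", &ino) != 1)
        return 0;
    return ino;
}

/* Read /proc/<pid>/ns/mnt (assumed to exist on this kernel).
 * Returns the namespace identifier, or 0 on any error. */
unsigned long mnt_ns_inode(pid_t pid)
{
    char path[64];
    char target[64];
    ssize_t n;

    snprintf(path, sizeof(path), "/proc/%ld/ns/mnt", (long)pid);
    n = readlink(path, target, sizeof(target) - 1);
    if (n < 0)
        return 0;
    target[n] = '\0';   /* readlink() does not terminate the string */
    return parse_ns_inode(target);
}

/* 1: same mount namespace, 0: different, -1: could not be determined. */
int same_mnt_ns(pid_t a, pid_t b)
{
    unsigned long ia = mnt_ns_inode(a);
    unsigned long ib = mnt_ns_inode(b);

    if (ia == 0 || ib == 0)
        return -1;
    return ia == ib;
}

/* Small self-check of the parser; returns 0 on success. */
int ns_self_check(void)
{
    assert(parse_ns_inode("mnt:[4026531840]") == 4026531840UL);
    assert(parse_ns_inode("not a namespace link") == 0);
    return 0;
}
```

Calling same_mnt_ns() with the PID of the automount daemon and the PID of a service started by systemd would show whether the two see the same mount table. Reading another process's ns link usually requires sufficient privileges, so a result of -1 means "unknown" rather than "different".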
* Re: "Too many levels of symbolic links"
2014-03-02 10:22 ` Donald Buczek
@ 2014-03-02 11:03 ` Ian Kent
2014-03-02 11:15 ` Donald Buczek
2014-03-02 11:25 ` Ian Kent
0 siblings, 2 replies; 50+ messages in thread
From: Ian Kent @ 2014-03-02 11:03 UTC (permalink / raw)
To: Donald Buczek; +Cc: Alexander Viro, autofs
On Sun, 2014-03-02 at 11:22 +0100, Donald Buczek wrote:
> On 02.03.2014 10:41, Ian Kent wrote:
> > On Sun, 2014-03-02 at 09:28 +0100, Donald Buczek wrote:
> >> On 02.03.2014 03:17, Ian Kent wrote:
> >>> mnt_hash->next == mnt_hash->prev, mount has been unlinked from the mount
> >>> tree so is not "visible". As far as we are concerned this mount has
> >>> gone.
> >> No, prev and next both point to the list_head in the mount_hashtable.
> > Fair call, ->mnt_mp != NULL too which implies the mount hasn't been
> > unlinked.
>
> The problem is that the mount is in another namespace. I've put mnt_ns
> into my perl script:
>
> > root:kasslerbraten:/home/buczek/autofs/# ./peekmounts |grep mariux32
> > mountpoint 0xffff880125e44480 : count= 1 denty=0xffff88007c86da10
> > (mariux32)
> > struct mount 0xffff8800a3a9a6c0 : mountpoint dentry
> > 0xffff88007c86da10 (mariux32) mountpoint struct 0xffff880125e44480 NS
> > 0xffff8801271f9300
> > root:kasslerbraten:/home/buczek/autofs/# ./peekmounts |grep project
> > mountpoint 0xffff8800c1d068a0 : count= 2 denty=0xffff8800c8b2e450
> > (project)
> > struct mount 0xffff8800c9e5e200 : mountpoint dentry
> > 0xffff8800c8b2e450 (project) mountpoint struct 0xffff8800c1d068a0 NS
> > 0xffff88012d00ed00
> > struct mount 0xffff8800a3a9a940 : mountpoint dentry
> > 0xffff8800c8b2e450 (project) mountpoint struct 0xffff8800c1d068a0 NS
> > 0xffff8801271f9300
>
> We have a /project without a mounted /project/mariux32 and a /project
> with a mounted /project/mariux32 in another namespace.
Yep, I'm struggling to follow the namespace list handling atm.
It is something I'm going to need to come to terms with because of issues
like this.
>
> This goes in the direction you mentioned in your other mail ("Illegal as
> far as autofs is concerned because an autofs mount is strictly
> associated with a path defined by its map") The system-wide, absolute
> semantics of pathnames in the automount world don't fit well into the
> process-local, relative mount semantics of the kernel.
Yes, but a bigger issue is that the autofs semantics of multiple
namespaces aren't defined, which means all I can do for now is make
statements like the one above.

No, asking folks concerned with namespaces didn't result in useful
feedback. Perhaps I'm asking the question in the wrong way, I don't know
yet.
>
> I still don't know where these "root" or "old-root" messages come from,
> but again the error occurred after these strange messages appeared
>
> > 2014-02-28T12:33:08.461073+01:00 kasslerbraten kernel: [13093.129511]
> > pid 7670: d_set_mounted: set mounted on dentry=ffff88007ca58a50 root
> > 2014-02-28T12:33:08.461074+01:00 kasslerbraten kernel: [13093.129569]
> > pid 7670: put_mountpoint: mp=ffff8801271f9338
> > 2014-02-28T12:33:08.461075+01:00 kasslerbraten kernel: [13093.129574]
> > pid 7670: d_set_mounted: dentry=ffff8800c890ce10 tmp
> > 2014-02-28T12:33:08.461076+01:00 kasslerbraten kernel: [13093.129575]
> > pid 7670: d_set_mounted: set mounted on dentry=ffff8800c890ce10 tmp
> > 2014-02-28T12:33:08.461077+01:00 kasslerbraten kernel: [13093.129578]
> > pid 7670: put_mountpoint: mp=ffff8801271f9338
> > 2014-02-28T12:33:08.461078+01:00 kasslerbraten kernel: [13093.129599]
> > pid 7670: d_set_mounted: dentry=ffff88007ca407d0 old-root-D0k5jB
> > 2014-02-28T12:33:08.461079+01:00 kasslerbraten kernel: [13093.129601]
> > pid 7670: d_set_mounted: set mounted on dentry=ffff88007ca407d0
> > old-root-D0k5jB
> > 2014-02-28T12:33:08.461080+01:00 kasslerbraten kernel: [13093.129602]
> > pid 7670: put_mountpoint: mp=ffff8800c9e5e750
> > 2014-02-28T12:33:08.461081+01:00 kasslerbraten kernel: [13093.129603]
> > pid 7670: put_mountpoint: cleared mounted on dentry=ffff88007ca58a50 root
> > 2014-02-28T12:33:08.461082+01:00 kasslerbraten kernel: [13093.129604]
> > pid 7670: put_mountpoint: mp=ffff8800c9e5e610
> > 2014-02-28T12:33:08.461082+01:00 kasslerbraten kernel: [13093.129662]
> > pid 7670: put_mountpoint: mp=0000014800450045
> > 2014-02-28T12:33:08.461083+01:00 kasslerbraten kernel: [13093.129663]
> > pid 7670: put_mountpoint: mp=0000000000000006
>
> (as explained in another mail, the addresses of "mp=" are wrong, so
> don't worry about these)
>
> This looks like chroot or some such. But I have no idea. I don't find
> \"root or \"old- in any sources. There is no "root" in any map.
Can't see anything myself but I've lost track of what kernel version
we're using here, what is it again?
>
> Hmmm. Isn't the daemon doing lazy umounts? Could it be this?
You would need to have a fairly old version of autofs for it to be doing
lazy umounts and you'd probably be seeing different problems as a
result. In particular, processes unable to successfully call getcwd() or
scripts unable to get pwd from /proc/<pid>/cwd.
> I've compiled the daemon with --enable-ignore-busy
Umm .. now you're making me think.

IIRC I added that so the daemon would not refuse to exit when it
encountered mounts that were in use. The idea being to reconstruct the
user space data structures at startup, essentially re-connecting to the
mounts left mounted. Sure, that has its own set of difficulties but they
are much less offensive than the problems seen by using lazy umount.

In short it isn't related to lazy umounts. Although I think there's a
case where they could be used if you don't use the miscellaneous device
for ioctl control. But you need to explicitly remove the device file (or
make it inaccessible) to make that happen. There isn't anything like
that in any systemd units I'm aware of so autofs will use the device
file by default.
> Anything forcing the daemon to restart?
> systemd doing stupid things to the daemon or the filesystem?
>
> > [Service]
> > Type=forking
> > ExecStartPre=/sbin/make-automaps
> > ExecStart=/usr/sbin/automount -v
> > PIDFile=/run/autofs-running
> > ExecReload=/bin/kill -HUP $MAINPID
This isn't the systemd unit that's included in the package tar.
I have no idea what /sbin/make-automaps is or does.
Other than the unknown make-automaps it looks like it should be OK.
>
> Donald
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-03-02 11:03 ` Ian Kent
@ 2014-03-02 11:15 ` Donald Buczek
2014-03-02 11:30 ` Ian Kent
2014-03-02 11:25 ` Ian Kent
1 sibling, 1 reply; 50+ messages in thread
From: Donald Buczek @ 2014-03-02 11:15 UTC (permalink / raw)
To: Ian Kent; +Cc: Alexander Viro, autofs
I've followed up on the mysterious "root":
> (gdb) print ((struct dentry *)0xffff88007ca407d0)->d_name->name
> $10 = (const unsigned char *) 0xffff88007ca40808 "old-root-D0k5jB"
> (gdb) print ((struct dentry *)0xffff88007ca407d0)->d_parent->d_name->name
> $11 = (const unsigned char *) 0xffff88007ca6c748 "private"
> (gdb) print ((struct dentry *)0xffff88007ca407d0)->d_parent->d_parent->d_name->name
> $12 = (const unsigned char *) 0xffff88007ca58b48 "systemd-namespace-os99ZC"
> (gdb) print ((struct dentry *)0xffff88007ca407d0)->d_parent->d_parent->d_parent->d_name->name
> $13 = (const unsigned char *) 0xffff8800c890ce48 "tmp"
> (gdb) print ((struct dentry *)0xffff88007ca407d0)->d_parent->d_parent->d_parent->d_parent->d_name->name
> $14 = (const unsigned char *) 0xffff88012900f2c8 "/"
> root:kasslerbraten:/home/buczek/autofs/# ls -lR /tmp/systemd-namespace-os99ZC/
> /tmp/systemd-namespace-os99ZC/:
> total 0
> drwxrwxr-t 2 root system 48 Feb 28 12:33 private
> drwxrwxr-x 2 root system 48 Feb 28 12:33 root
>
> /tmp/systemd-namespace-os99ZC/private:
> total 0
>
> /tmp/systemd-namespace-os99ZC/root:
> total 0
So it's systemd which is doing some strange namespace stuff in /tmp
(the dentry is /tmp/systemd-namespace-os99ZC/private/old-root-D0k5jB).
This probably collides in some way with the autofs model of having
global pathnames.

Still not clear and not solved, but we're really coming closer...
Donald Am 02.03.2014 12:03, schrieb Ian Kent: > On Sun, 2014-03-02 at 11:22 +0100, Donald Buczek wrote: >> Am 02.03.2014 10:41, schrieb Ian Kent: >>> On Sun, 2014-03-02 at 09:28 +0100, Donald Buczek wrote: >>>> Am 02.03.2014 03:17, schrieb Ian Kent: >>>>> mnt_hash->next == mnt_hash->prev, mount has been unlinked from the mount >>>>> tree so is not "visible". As far as we are concerned this mount has >>>>> gone. >>>> No, prev and next both point to the list_head in the mount_hashtable. >>> Fair call, ->mnt_mp != NULL too which implies the mount hasn't been >>> unlinked. >> The problem is, that the mount is in another namespace. I've put mnt_ns >> into my perl script: >> >>> root:kasslerbraten:/home/buczek/autofs/# ./peekmounts |grep mariux32 >>> mountpoint 0xffff880125e44480 : count= 1 denty=0xffff88007c86da10 >>> (mariux32) >>> struct mount 0xffff8800a3a9a6c0 : mountpoint dentry >>> 0xffff88007c86da10 (mariux32) mountpoint struct 0xffff880125e44480 NS >>> 0xffff8801271f9300 >>> root:kasslerbraten:/home/buczek/autofs/# ./peekmounts |grep project >>> mountpoint 0xffff8800c1d068a0 : count= 2 denty=0xffff8800c8b2e450 >>> (project) >>> struct mount 0xffff8800c9e5e200 : mountpoint dentry >>> 0xffff8800c8b2e450 (project) mountpoint struct 0xffff8800c1d068a0 NS >>> 0xffff88012d00ed00 >>> struct mount 0xffff8800a3a9a940 : mountpoint dentry >>> 0xffff8800c8b2e450 (project) mountpoint struct 0xffff8800c1d068a0 NS >>> 0xffff8801271f9300 >> We have a /project without a mounted /project/mariux32 and a /project >> with a mounted /project/mariux32 in another namespace. > Yep, I'm struggling to follow the namespace list handling atm. > It is something I'm going to need get to terms with because of issues > like this. 
> >> This goes in the direction you mentioned in your other mail ("Illegal as >> far as autofs is concerned because an autofs mount is strictly >> associated with a path defined by its map") The system-wide, absolute >> semantics of pathnames in the automount world don't fit well into the >> process-local, relative mount semantics of the kernel. > Yes, but a bigger issue is that the autofs semantics of multiple name > spaces aren't defined which means all I can do for now is make > statements like the one above. > > No, asking folks concerned with namespaces didn't result in useful > feedback. Perhaps I'm asking the question in the wrong way, I don't know > yet. > >> I still don't know, where these "root" or "old-root" messages come from, >> but again the error occured after these strange messages appeared >> >>> 2014-02-28T12:33:08.461073+01:00 kasslerbraten kernel: [13093.129511] >>> pid 7670: d_set_mounted: set mounted on dentry=ffff88007ca58a50 root >>> 2014-02-28T12:33:08.461074+01:00 kasslerbraten kernel: [13093.129569] >>> pid 7670: put_mountpoint: mp=ffff8801271f9338 >>> 2014-02-28T12:33:08.461075+01:00 kasslerbraten kernel: [13093.129574] >>> pid 7670: d_set_mounted: dentry=ffff8800c890ce10 tmp >>> 2014-02-28T12:33:08.461076+01:00 kasslerbraten kernel: [13093.129575] >>> pid 7670: d_set_mounted: set mounted on dentry=ffff8800c890ce10 tmp >>> 2014-02-28T12:33:08.461077+01:00 kasslerbraten kernel: [13093.129578] >>> pid 7670: put_mountpoint: mp=ffff8801271f9338 >>> 2014-02-28T12:33:08.461078+01:00 kasslerbraten kernel: [13093.129599] >>> pid 7670: d_set_mounted: dentry=ffff88007ca407d0 old-root-D0k5jB >>> 2014-02-28T12:33:08.461079+01:00 kasslerbraten kernel: [13093.129601] >>> pid 7670: d_set_mounted: set mounted on dentry=ffff88007ca407d0 >>> old-root-D0k5jB >>> 2014-02-28T12:33:08.461080+01:00 kasslerbraten kernel: [13093.129602] >>> pid 7670: put_mountpoint: mp=ffff8800c9e5e750 >>> 2014-02-28T12:33:08.461081+01:00 kasslerbraten kernel: [13093.129603] >>> 
pid 7670: put_mountpoint: cleared mounted on dentry=ffff88007ca58a50 root >>> 2014-02-28T12:33:08.461082+01:00 kasslerbraten kernel: [13093.129604] >>> pid 7670: put_mountpoint: mp=ffff8800c9e5e610 >>> 2014-02-28T12:33:08.461082+01:00 kasslerbraten kernel: [13093.129662] >>> pid 7670: put_mountpoint: mp=0000014800450045 >>> 2014-02-28T12:33:08.461083+01:00 kasslerbraten kernel: [13093.129663] >>> pid 7670: put_mountpoint: mp=0000000000000006 >> (as explained in another mail , the addresses of "mp=" are wrong, so >> don't worry about these) >> >> This looks like chroot or somesuch. But I have no idea. I don't find >> \"root or \"old- in any sources. There is not "root" in any map. > Can't see anything myself but I've lost track of what kernel version > we're using here, what is it again? > >> Hmmm. Isn't the daemon doing lazy umounts? Could it be this? > You would need to have a fairly old version of autofs for it to be doing > lazy umounts and you'd probably be seeing different problems as a > result. In particular, processes unable to successfully call getcwd() or > scripts unable to get pwd from /proc/<pid>/cwd. > >> I've compiled the daemon with --enable-ignore-busy > Umm .. now your making me think. > > IIRC I added that so the daemon would not refuse to exit when it > encountered mounts that were in use. The idea being to reconstruct the > user space data structures at startup essentially re-connecting to the > mounts left mounted. Sure, that has it's own set of difficulties but > they are much less offensive than the problems seen by using lazy > umount. > > In short it isn't related to lazy umounts. > > Although I think there's a case were they could be used if you don't use > the miscellaneous device for ioctl control. But you need to explicitly > remove the device file (or make it inaccessible) to make that happen. > There isn't anything like that in any systemd units I'm aware of so > autofs will use the device file by default. 
> >> Anything forcing the daemon to restart? >> systemd doing stupid things to the daemon or the filesystem? >> >>> [Service] >>> Type=forking >>> ExecStartPre=/sbin/make-automaps >>> ExecStart=/usr/sbin/automount -v >>> PIDFile=/run/autofs-running >>> ExecReload=/bin/kill -HUP $MAINPID > This isn't the systemd unit that's included in the package tar. > I have no idea what /sbin/make-automaps is or does. > > Other than the unknown make-automaps it looks like it should be OK. > >> Donald >> > -- Donald Buczek buczek@molgen.mpg.de Tel: +49 30 8413 1433 [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4541 bytes --] ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links" 2014-03-02 11:15 ` Donald Buczek @ 2014-03-02 11:30 ` Ian Kent 2014-03-02 11:35 ` Ian Kent 0 siblings, 1 reply; 50+ messages in thread From: Ian Kent @ 2014-03-02 11:30 UTC (permalink / raw) To: Donald Buczek; +Cc: Alexander Viro, autofs On Sun, 2014-03-02 at 12:15 +0100, Donald Buczek wrote: > I've follow the mysterious "root" up: > > > (gdb) print ((struct dentry *)0xffff88007ca407d0)->d_name->name > > $10 = (const unsigned char *) 0xffff88007ca40808 "old-root-D0k5jB" > > (gdb) print ((struct dentry *)0xffff88007ca407d0)->d_parent->d_name->name > > $11 = (const unsigned char *) 0xffff88007ca6c748 "private" > > (gdb) print ((struct dentry > > *)0xffff88007ca407d0)->d_parent->d_parent->d_name->name > > $12 = (const unsigned char *) 0xffff88007ca58b48 > > "systemd-namespace-os99ZC" > > (gdb) print ((struct dentry > > *)0xffff88007ca407d0)->d_parent->d_parent->d_parent->d_name->name > > $13 = (const unsigned char *) 0xffff8800c890ce48 "tmp" > > (gdb) print ((struct dentry > > *)0xffff88007ca407d0)->d_parent->d_parent->d_parent->d_parent->d_name->name > > $14 = (const unsigned char *) 0xffff88012900f2c8 "/" > > root:kasslerbraten:/home/buczek/autofs/# ls -lR > > /tmp/systemd-namespace-os99ZC/ > > /tmp/systemd-namespace-os99ZC/: > > total 0 > > drwxrwxr-t 2 root system 48 Feb 28 12:33 private > > drwxrwxr-x 2 root system 48 Feb 28 12:33 root > > > > /tmp/systemd-namespace-os99ZC/private: > > total 0 > > > > /tmp/systemd-namespace-os99ZC/root: > > total 0 > > > So its systemd which is doing some strange namespace stuff in /tmp. > This probably collides in some way with the autofs model of autofs > having global pathnames. > > Still not clear and not solved, but we're really coming closer... LOL, forgive me for thinking that systemd just sets "/" shared in a simple way. That little pain in the a*** might be what it's doing there. 
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links" 2014-03-02 11:30 ` Ian Kent @ 2014-03-02 11:35 ` Ian Kent 0 siblings, 0 replies; 50+ messages in thread From: Ian Kent @ 2014-03-02 11:35 UTC (permalink / raw) To: Donald Buczek; +Cc: Alexander Viro, autofs On Sun, 2014-03-02 at 19:30 +0800, Ian Kent wrote: > On Sun, 2014-03-02 at 12:15 +0100, Donald Buczek wrote: > > I've follow the mysterious "root" up: > > > > > (gdb) print ((struct dentry *)0xffff88007ca407d0)->d_name->name > > > $10 = (const unsigned char *) 0xffff88007ca40808 "old-root-D0k5jB" > > > (gdb) print ((struct dentry *)0xffff88007ca407d0)->d_parent->d_name->name > > > $11 = (const unsigned char *) 0xffff88007ca6c748 "private" > > > (gdb) print ((struct dentry > > > *)0xffff88007ca407d0)->d_parent->d_parent->d_name->name > > > $12 = (const unsigned char *) 0xffff88007ca58b48 > > > "systemd-namespace-os99ZC" > > > (gdb) print ((struct dentry > > > *)0xffff88007ca407d0)->d_parent->d_parent->d_parent->d_name->name > > > $13 = (const unsigned char *) 0xffff8800c890ce48 "tmp" > > > (gdb) print ((struct dentry > > > *)0xffff88007ca407d0)->d_parent->d_parent->d_parent->d_parent->d_name->name > > > $14 = (const unsigned char *) 0xffff88012900f2c8 "/" > > > root:kasslerbraten:/home/buczek/autofs/# ls -lR > > > /tmp/systemd-namespace-os99ZC/ > > > /tmp/systemd-namespace-os99ZC/: > > > total 0 > > > drwxrwxr-t 2 root system 48 Feb 28 12:33 private > > > drwxrwxr-x 2 root system 48 Feb 28 12:33 root > > > > > > /tmp/systemd-namespace-os99ZC/private: > > > total 0 > > > > > > /tmp/systemd-namespace-os99ZC/root: > > > total 0 > > > > > > So its systemd which is doing some strange namespace stuff in /tmp. > > This probably collides in some way with the autofs model of autofs > > having global pathnames. > > > > Still not clear and not solved, but we're really coming closer... > > LOL, forgive me for thinking that systemd just sets "/" shared in a > simple way. > > That little pain in the a*** might be what it's doing there. 
>

Mind you, making "/" shared should be done well before autofs is started.

Container implementations do similar things, such as a pivot_root onto a
new root, and may change the propagation type as well.

Ian

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-03-02 11:03 ` Ian Kent
2014-03-02 11:15 ` Donald Buczek
@ 2014-03-02 11:25 ` Ian Kent
1 sibling, 0 replies; 50+ messages in thread
From: Ian Kent @ 2014-03-02 11:25 UTC (permalink / raw)
To: Donald Buczek; +Cc: Alexander Viro, autofs

On Sun, 2014-03-02 at 19:03 +0800, Ian Kent wrote:
> On Sun, 2014-03-02 at 11:22 +0100, Donald Buczek wrote:
> >
> > This goes in the direction you mentioned in your other mail ("Illegal as
> > far as autofs is concerned because an autofs mount is strictly
> > associated with a path defined by its map") The system-wide, absolute
> > semantics of pathnames in the automount world don't fit well into the
> > process-local, relative mount semantics of the kernel.
>
> Yes, but a bigger issue is that the autofs semantics of multiple name
> spaces aren't defined which means all I can do for now is make
> statements like the one above.
>
> No, asking folks concerned with namespaces didn't result in useful
> feedback. Perhaps I'm asking the question in the wrong way, I don't know
> yet.

Or asking the wrong people!

But there are two namespace-related patches in the 3.14.0 rc kernel.

TBH I can't see how they would make a difference here, but we probably
should try a patched kernel to find out for sure.

See commits 6eaba35b and fbff0870 in the current Linus kernel tree.

If you need me to send you the patches for the kernel you're using I can
do that, but I'll need the specific kernel version.

Ian

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links" 2014-02-28 13:29 ` Alexander Viro 2014-02-28 20:35 ` Donald Buczek @ 2014-03-02 2:22 ` Ian Kent 2014-03-02 7:10 ` Ian Kent 1 sibling, 1 reply; 50+ messages in thread From: Ian Kent @ 2014-03-02 2:22 UTC (permalink / raw) To: Alexander Viro; +Cc: Donald Buczek, autofs On Fri, 2014-02-28 at 08:29 -0500, Alexander Viro wrote: > On Fri, Feb 28, 2014 at 01:12:58PM +0100, Donald Buczek wrote: > > > Obviously, "cleared mounted on dentry" is missing. > > > > It looks like we enter put_mountpoint() but don't get to > > dentry->d_flags &= ~DCACHE_MOUNTED; > > > > mp->m_count is not zero probably. > > > > What does it mean? The mount is still locked but not in the mount hash? > > No, it means that something else is mounted on the same dentry (in another > part of mount tree, obviously). > > If you mount the same fs on two different mountpoints, e.g. > mount /dev/sda1 /mnt > mount /dev/sda1 /tmp/foo > you will have the same dentries seen in two places. Now, > mount /dev/sdb11 /mnt/a > mount /dev/sdc5 /tmp/foo/a > > and you've got two different filesystems mounted on two different places > (/mnt/a and /tmp/foo/a). These two places have different vfsmounts, > but the same dentry. struct mountpoint is associated with dentry, so > it's also the same for both. And it serves as a mountpoint for two > vfsmounts - one for fs from sdb11, another for fs from sdc5. > > Now umount /mnt/a; one of those two vfsmounts is gone now. struct mountpoint > survives, of course, and dentry is *still* a mountpoint. sdc5 is still > mounted on /tmp/foo/a, after all... Ahh, right ... I'll need to think about my use (misuse) of d_mountpoint(). Thanks Al. ^ permalink raw reply [flat|nested] 50+ messages in thread
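Al's explanation above can be illustrated with a small toy model. This is only a sketch in Python, not kernel code: the `Dentry` class and the `attach`/`detach` helpers are hypothetical names, and the `mounted` and `m_count` fields merely stand in for DCACHE_MOUNTED and the reference count of struct mountpoint. It assumes the flag is cleared only when the last mount on the dentry goes away, as described in the message.

```python
# Toy model (NOT kernel code): one mountpoint record per dentry, shared by
# every mount stacked on that dentry, with a reference count like m_count.

class Dentry:
    def __init__(self, name):
        self.name = name
        self.mounted = False   # stands in for DCACHE_MOUNTED
        self.m_count = 0       # stands in for struct mountpoint's m_count


def attach(dentry):
    """Model mounting something on `dentry`, anywhere in the mount tree."""
    dentry.m_count += 1
    dentry.mounted = True


def detach(dentry):
    """Model put_mountpoint(): clear the flag only when the last user goes."""
    dentry.m_count -= 1
    if dentry.m_count == 0:
        dentry.mounted = False


a = Dentry("a")   # the single dentry seen as both /mnt/a and /tmp/foo/a
attach(a)         # mount /dev/sdb11 /mnt/a
attach(a)         # mount /dev/sdc5 /tmp/foo/a
detach(a)         # umount /mnt/a
print(a.mounted, a.m_count)   # the dentry is still a mountpoint
detach(a)         # umount /tmp/foo/a
print(a.mounted, a.m_count)   # only now is the flag cleared
```

The point of the sketch is that the flag answers "is anything mounted on this dentry anywhere?", not "is something mounted at this particular path?".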
* Re: "Too many levels of symbolic links"
2014-03-02 2:22 ` Ian Kent
@ 2014-03-02 7:10 ` Ian Kent
2014-03-02 14:55 ` Donald Buczek
0 siblings, 1 reply; 50+ messages in thread
From: Ian Kent @ 2014-03-02 7:10 UTC (permalink / raw)
To: Alexander Viro; +Cc: Donald Buczek, autofs

On Sun, 2014-03-02 at 10:22 +0800, Ian Kent wrote:
> On Fri, 2014-02-28 at 08:29 -0500, Alexander Viro wrote:
> > On Fri, Feb 28, 2014 at 01:12:58PM +0100, Donald Buczek wrote:
> >
> > > Obviously, "cleared mounted on dentry" is missing.
> > >
> > > It looks like we enter put_mountpoint() but don't get to
> > > dentry->d_flags &= ~DCACHE_MOUNTED;
> > >
> > > mp->m_count is not zero probably.
> > >
> > > What does it mean? The mount is still locked but not in the mount hash?
> >
> > No, it means that something else is mounted on the same dentry (in another
> > part of mount tree, obviously).
> >
> > If you mount the same fs on two different mountpoints, e.g.
> > mount /dev/sda1 /mnt
> > mount /dev/sda1 /tmp/foo
> > you will have the same dentries seen in two places. Now,
> > mount /dev/sdb11 /mnt/a
> > mount /dev/sdc5 /tmp/foo/a
> >
> > and you've got two different filesystems mounted on two different places
> > (/mnt/a and /tmp/foo/a). These two places have different vfsmounts,
> > but the same dentry. struct mountpoint is associated with dentry, so
> > it's also the same for both. And it serves as a mountpoint for two
> > vfsmounts - one for fs from sdb11, another for fs from sdc5.
> >
> > Now umount /mnt/a; one of those two vfsmounts is gone now. struct mountpoint
> > survives, of course, and dentry is *still* a mountpoint. sdc5 is still
> > mounted on /tmp/foo/a, after all...

Good example, but for autofs file systems doesn't this amount to saying
it's been bound somewhere else?

That is illegal as far as autofs is concerned, because an autofs mount is
strictly associated with a path defined by its map.

And, yes, bind mounting an autofs file system elsewhere isn't vetoed by
the kernel.

This makes me start thinking about implications wrt. containers ....

>
> Ahh, right ... I'll need to think about my use (misuse) of
> d_mountpoint().

So maybe I don't need to worry about this just yet.

>
> Thanks Al.

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-03-02 7:10 ` Ian Kent
@ 2014-03-02 14:55 ` Donald Buczek
2014-03-02 18:51 ` Donald Buczek
2014-03-03 2:40 ` Ian Kent
0 siblings, 2 replies; 50+ messages in thread
From: Donald Buczek @ 2014-03-02 14:55 UTC (permalink / raw)
To: Ian Kent, Alexander Viro; +Cc: autofs

[-- Attachment #1: Type: text/plain, Size: 5325 bytes --]

On 02.03.2014 08:10, Ian Kent wrote:
> On Sun, 2014-03-02 at 10:22 +0800, Ian Kent wrote:
>> On Fri, 2014-02-28 at 08:29 -0500, Alexander Viro wrote:
>>> On Fri, Feb 28, 2014 at 01:12:58PM +0100, Donald Buczek wrote:
>>>
>>>> Obviously, "cleared mounted on dentry" is missing.
>>>>
>>>> It looks like we enter put_mountpoint() but don't get to
>>>> dentry->d_flags &= ~DCACHE_MOUNTED;
>>>>
>>>> mp->m_count is not zero probably.
>>>>
>>>> What does it mean? The mount is still locked but not in the mount hash?
>>> No, it means that something else is mounted on the same dentry (in another
>>> part of mount tree, obviously).
>>>
>>> If you mount the same fs on two different mountpoints, e.g.
>>> mount /dev/sda1 /mnt
>>> mount /dev/sda1 /tmp/foo
>>> you will have the same dentries seen in two places. Now,
>>> mount /dev/sdb11 /mnt/a
>>> mount /dev/sdc5 /tmp/foo/a
>>>
>>> and you've got two different filesystems mounted on two different places
>>> (/mnt/a and /tmp/foo/a). These two places have different vfsmounts,
>>> but the same dentry. struct mountpoint is associated with dentry, so
>>> it's also the same for both. And it serves as a mountpoint for two
>>> vfsmounts - one for fs from sdb11, another for fs from sdc5.
>>>
>>> Now umount /mnt/a; one of those two vfsmounts is gone now. struct mountpoint
>>> survives, of course, and dentry is *still* a mountpoint. sdc5 is still
>>> mounted on /tmp/foo/a, after all...
> Good example but for autofs file systems doesn't this amount to saying
> its been bound somewhere else?
>
> Illegal as far as autofs is concerned because an autofs mount is
> strictly associated with a path defined by its map.
>
> And, yes, bind mounting an autofs file system elsewhere isn't vetoed by
> the kernel.
>
> This makes be start thinking about implications wrt. containers ....
>
>> Ahh, right ... I'll need to think about my use (misuse) of
>> d_mountpoint().
> So maybe I don't need to worry about this just yet.

I think you should, because this is exactly the bug.
d_mountpoint(dentry) only says that we have a struct mountpoint for the
dentry. It does not say that the path is mounted in the current
namespace. The struct mountpoint might exist because the path is
mounted in other namespaces but not in ours.

The problem at our site is clear now:

We have only one service with PrivateTmp=yes, which is colord.service.
And here is the missing mount:

> root:kasslerbraten:/lib/systemd/system/# ps -Af|fgrep colord
> root 7670 1 0 Feb28 ? 00:00:00 /usr/lib/colord/colord
> root 7897 7329 0 14:46 pts/8 00:00:00 fgrep colord
> root:kasslerbraten:/lib/systemd/system/# cat /proc/7670/mounts|grep mariux32
> pille:/amd/pille/1/project/mariux32 /project/mariux32 nfs rw,nosuid,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=141.14.28.250,mountvers=3,mountport=56263,mountproto=udp,local_lock=none,addr=141.14.28.250 0 0

colord.service is dbus-started. So it is started quite randomly,
depending on user usage patterns, mostly but not exclusively on
workstations. That is exactly how we've seen the bug appear.

When the service is started, systemd uses unshare(CLONE_NEWNS) to clone
the namespace. This new namespace inherits existing mounts, including
automounted ones.
These mounts might eventually expire at a later time. When this occurs,
they are unmounted from the automount daemon's namespace, which is the
global, pid 1 namespace. But because they are still mounted in another
namespace, the dentry stays flagged as DCACHE_MOUNTED, which prevents
autofs from remounting it on access. The mount, however, exists only in
another namespace and is useless to anybody else.

Final proof that this is the true story:

> root:kasslerbraten:/lib/systemd/system/# ls /project/mariux32
> ls: cannot open directory /project/mariux32: Too many levels of symbolic links
> root:kasslerbraten:/lib/systemd/system/# kill -9 7670
> root:kasslerbraten:/lib/systemd/system/# ls /project/mariux32
> beeroot home i686 svnroot
> root:kasslerbraten:/lib/systemd/system/#

Of course, I can easily work around that in our environment (e.g. just
remove PrivateTmp=yes from the service). So I'm pretty sure it will
work for me now.
The bug, however, is in autofs. systemd is doing perfectly legal
user-mode things.

Perhaps autofs should use lookup_mnt() to decide along this pattern:

    if ( dentry->d_flags & DCACHE_MOUNTED && lookup_mnt(path) ) {
        /* mounted */
    } else {
        /* not mounted */
    }

That doesn't solve the problem, however, that mounts cloned by an
unshare(CLONE_NEWNS) would never expire. Also there is another bug
somewhere, because I see that the mount visible to the
/usr/lib/colord/colord process was logged as "unmounted" on the NFS
server when it expired in the global namespace. So I doubt it would be
working even for that process.

So possibly automounted mounts shouldn't be cloned at all? Together with
chroot or pivot_root the semantics would be more than unclear anyway.

Your problem now :-)

Thanks for your help with this!

Regards
Donald
--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4541 bytes --]

^ permalink raw reply [flat|nested] 50+ messages in thread
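The stuck state described in this message can be sketched with a toy model. This is Python, not kernel code, and every name in it (`Dentry`, `unshare`, `mounted_here`, the namespace labels "init" and "colord") is hypothetical. It assumes, as the message argues, that the DCACHE_MOUNTED-style flag is namespace-blind, while a lookup_mnt()-style check would only see mounts of the caller's own namespace.

```python
# Toy model (NOT kernel code) of a mount that expired in the global
# namespace but survives in a namespace cloned by unshare(CLONE_NEWNS).

class Dentry:
    def __init__(self, name):
        self.name = name
        self.mount_namespaces = []   # one entry per mount stacked on this dentry

    @property
    def dcache_mounted(self):
        # Namespace-blind, like the DCACHE_MOUNTED flag / d_mountpoint().
        return bool(self.mount_namespaces)


def unshare(dentries, old_ns, new_ns):
    """Model unshare(CLONE_NEWNS): copy every mount of old_ns into new_ns."""
    for d in dentries:
        copies = d.mount_namespaces.count(old_ns)
        d.mount_namespaces.extend([new_ns] * copies)


def mounted_here(dentry, ns):
    """Hypothetical namespace-aware check, in the spirit of lookup_mnt()."""
    return dentry.dcache_mounted and ns in dentry.mount_namespaces


d = Dentry("/project/mariux32")
d.mount_namespaces.append("init")    # automount mounts it in the global namespace
unshare([d], "init", "colord")       # a service with PrivateTmp=yes starts
d.mount_namespaces.remove("init")    # expire: unmounted in the global namespace only

print(d.dcache_mounted)              # flag is still set by the private copy
print(mounted_here(d, "init"))       # nothing is mounted for the daemon's namespace
print(mounted_here(d, "colord"))     # the copy lives on in the private namespace
```

In this model the first check stays true after the expire, which corresponds to the path walk never triggering a new automount, while the namespace-aware check reports the path as not mounted for the global namespace. The sketch does not address the other concern raised above, namely that the cloned mount would never expire.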
* Re: "Too many levels of symbolic links"
2014-03-02 14:55 ` Donald Buczek
@ 2014-03-02 18:51 ` Donald Buczek
2014-03-03 2:40 ` Ian Kent
2014-03-03 2:40 ` Ian Kent
1 sibling, 1 reply; 50+ messages in thread
From: Donald Buczek @ 2014-03-02 18:51 UTC (permalink / raw)
To: Ian Kent, Alexander Viro; +Cc: autofs

[-- Attachment #1: Type: text/plain, Size: 840 bytes --]

Addendum:

The error can be reproduced with this script (and /project/mariux32
being an automount point of course):
---------------
#! /bin/sh

ls -ld /project/mariux32/.
unshare -m -- bash -c "sleep 6;ls -ld /project/mariux32/.;echo exit..." &
kill -USR1 `cat /var/run/autofs-running`
sleep 3
ls -ld /project/mariux32/.
wait
---------------

Output:

> root:dose:/home/buczek/autofs/# ./test.sh
> drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
> ls: cannot access /project/mariux32/.: Too many levels of symbolic links
> drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
> exit...

And yes, the fileserver logs "authenticated unmount request", but the
(later!) ls succeeds anyway.

Regards
Donald
--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4541 bytes --]

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-03-02 18:51 ` Donald Buczek
@ 2014-03-03 2:40 ` Ian Kent
0 siblings, 0 replies; 50+ messages in thread
From: Ian Kent @ 2014-03-03 2:40 UTC (permalink / raw)
To: Donald Buczek; +Cc: Alexander Viro, autofs

On Sun, 2014-03-02 at 19:51 +0100, Donald Buczek wrote:
> Addendum:
>
> The error can be reproduced with this script (and /project/mariux32
> being an automount point of course) :
> ---------------
> #! /bin/sh
>
> ls -ld /project/mariux32/.
> unshare -m -- bash -c "sleep 6;ls -ld /project/mariux32/.;echo exit..." &
> kill -USR1 `cat /var/run/autofs-running`
> sleep 3
> ls -ld /project/mariux32/.
> wait
> ---------------
>
> Output:
>
> > root:dose:/home/buczek/autofs/# ./test.sh
> > drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
> > ls: cannot access /project/mariux32/.: Too many levels of symbolic links
> > drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
> > exit...

Thanks, this is useful too.

>
> And yes, the fileserver logs "authenticated unmount request" , but the
> (later!) ls succeeds anyway.
>
> Regards
> Donald
>
>

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-03-02 14:55 ` Donald Buczek
2014-03-02 18:51 ` Donald Buczek
@ 2014-03-03 2:40 ` Ian Kent
2014-03-04 6:06 ` Ian Kent
1 sibling, 1 reply; 50+ messages in thread
From: Ian Kent @ 2014-03-03 2:40 UTC (permalink / raw)
To: Donald Buczek; +Cc: Alexander Viro, autofs

On Sun, 2014-03-02 at 15:55 +0100, Donald Buczek wrote:
> On 02.03.2014 08:10, Ian Kent wrote:
> > On Sun, 2014-03-02 at 10:22 +0800, Ian Kent wrote:
> >> On Fri, 2014-02-28 at 08:29 -0500, Alexander Viro wrote:
> >>> On Fri, Feb 28, 2014 at 01:12:58PM +0100, Donald Buczek wrote:
> >>>
> >>>> Obviously, "cleared mounted on dentry" is missing.
> >>>>
> >>>> It looks like we enter put_mountpoint() but don't get to
> >>>> dentry->d_flags &= ~DCACHE_MOUNTED;
> >>>>
> >>>> mp->m_count is not zero probably.
> >>>>
> >>>> What does it mean? The mount is still locked but not in the mount hash?
> >>> No, it means that something else is mounted on the same dentry (in another
> >>> part of mount tree, obviously).
> >>>
> >>> If you mount the same fs on two different mountpoints, e.g.
> >>> mount /dev/sda1 /mnt
> >>> mount /dev/sda1 /tmp/foo
> >>> you will have the same dentries seen in two places. Now,
> >>> mount /dev/sdb11 /mnt/a
> >>> mount /dev/sdc5 /tmp/foo/a
> >>>
> >>> and you've got two different filesystems mounted on two different places
> >>> (/mnt/a and /tmp/foo/a). These two places have different vfsmounts,
> >>> but the same dentry. struct mountpoint is associated with dentry, so
> >>> it's also the same for both. And it serves as a mountpoint for two
> >>> vfsmounts - one for fs from sdb11, another for fs from sdc5.
> >>>
> >>> Now umount /mnt/a; one of those two vfsmounts is gone now. struct mountpoint
> >>> survives, of course, and dentry is *still* a mountpoint. sdc5 is still
> >>> mounted on /tmp/foo/a, after all...
> > Good example but for autofs file systems doesn't this amount to saying
> > it's been bound somewhere else?
> >
> > Illegal as far as autofs is concerned because an autofs mount is
> > strictly associated with a path defined by its map.
> >
> > And, yes, bind mounting an autofs file system elsewhere isn't vetoed by
> > the kernel.
> >
> > This makes me start thinking about implications wrt. containers ....
> >
> >> Ahh, right ... I'll need to think about my use (misuse) of
> >> d_mountpoint().
> > So maybe I don't need to worry about this just yet.

I think you've hit on almost all the current problems I'm struggling
with and added to them, ;)

>
> I think you should, because exactly this is the bug.
> d_mountpoint(dentry) just says, that we have a struct mountpoint for the
> dentry. It does not say, that the path is mounted in the current
> namespace. The struct mountpoint might exist, because the path is
> mounted in other namespaces but not ours.

Yes, and this adds a new case to the list of problems.

>
> The problem at our site is clear now:
>
> We have only one service with PrivateTmp=yes which is colord.service.
> And here is the missing mount:
>
> > root:kasslerbraten:/lib/systemd/system/# ps -Af|fgrep colord
> > root 7670 1 0 Feb28 ? 00:00:00 /usr/lib/colord/colord
> > root 7897 7329 0 14:46 pts/8 00:00:00 fgrep colord
> > root:kasslerbraten:/lib/systemd/system/# cat /proc/7670/mounts|grep mariux32
> > pille:/amd/pille/1/project/mariux32 /project/mariux32 nfs rw,nosuid,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=141.14.28.250,mountvers=3,mountport=56263,mountproto=udp,local_lock=none,addr=141.14.28.250 0 0
>
> colord.service is dbus-started. So it is started quite randomly and
> depending on user usage pattern, mostly but not exclusively on
> workstations. That is exactly how we've seen the bug appear.
>
> When the service is started, systemd uses unshare(CLONE_NEWNS) to clone
> the namespace. This new namespace inherits existing mounts, including
> automounted ones.
> These mounts might eventually expire at a later time. When this occurs,
> they are dismounted from the automount daemon's namespace, which is the
> global, pid 1 namespace. But because they are still mounted in another
> namespace, the dentry stays flagged as DCACHE_MOUNTED, which prevents
> autofs from remounting it on access. The mount, however, just exists in
> another namespace and is useless for anybody else.

Useless yes, but there is currently no way to mount something so that it
won't be propagated.

No, MS_PRIVATE says "I'm private don't propagate my children".

To add a flag to do this isn't a simple task either AFAICS.

And then there are those that explicitly want the propagation and expect
it to work. I think they will eventually be disappointed.

>
> Final proof, that this is the true story:
>
> > root:kasslerbraten:/lib/systemd/system/# ls /project/mariux32
> > ls: cannot open directory /project/mariux32: Too many levels of symbolic links
> > root:kasslerbraten:/lib/systemd/system/# kill -9 7670
> > root:kasslerbraten:/lib/systemd/system/# ls /project/mariux32
> > beeroot home i686 svnroot
> > root:kasslerbraten:/lib/systemd/system/#
>
> Of course, I can easily work around that in our environment (eg. just
> remove PrivateTmp=yes from the service). So I'm pretty sure, it will
> work for me now.
> The bug, however, is in autofs. systemd is doing perfectly legal
> user-mode things.
>
> Perhaps autofs should use lookup_mnt() to decide along this pattern:
>
> if ( dentry->d_flags & DCACHE_MOUNTED && lookup_mnt(path) ) {
>     /* mounted */
> } else {
>     /* not mounted */
> }

Also, not as simple as you might think.

First, lookup_mnt() isn't exported and I believe the preference is that
that doesn't change. But follow_down_one() is exported and could be
used.

Next, it would involve changing the function signature of a dentry
operation function. That function could be used by other modules that we
don't know about and they would break.

>
> That doesn't solve the problem, however, that mounts cloned by a
> unshare(CLONE_NEWNS) would never expire. Also there is another bug
> somewhere, because I see, that the mount, visible to the
> /usr/lib/colord/colord process was logged as "unmounted" in the nfs
> server when it expired in the global namespace. So I doubt it would be
> working even for that process. So possibly automounted mounts shouldn't
> be cloned at all? Together with chroot or pivot_root the semantics would
> be more than unclear anyway. Your problem now :-)

Hehe, like I said some people are going to be disappointed.

There's just one question about this that remains.

Assuming systemd is setting "/" shared what happens if "mount
--make-rprivate /" is run before autofs is started?

So if you can spend a little more time on this an answer to this would
be helpful.

>
> Thanks for your help with this!

Actually, thank you.
This investigation has given me quite a bit of new insight into the
current difficulties I have with namespace handling.

Ian

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2014-03-03 2:40 ` Ian Kent
@ 2014-03-04 6:06 ` Ian Kent
2016-03-09 17:44 ` Donald Buczek
0 siblings, 1 reply; 50+ messages in thread
From: Ian Kent @ 2014-03-04 6:06 UTC (permalink / raw)
To: Donald Buczek; +Cc: Alexander Viro, autofs

On Mon, 2014-03-03 at 10:40 +0800, Ian Kent wrote:
> >
> > That doesn't solve the problem, however, that mounts cloned by a
> > unshare(CLONE_NEWNS) would never expire. Also there is another bug
> > somewhere, because I see, that the mount, visible to the
> > /usr/lib/colord/colord process was logged as "unmounted" in the nfs
> > server when it expired in the global namespace. So I doubt it would be
> > working even for that process. So possibly automounted mounts shouldn't
> > be cloned at all? Together with chroot or pivot_root the semantics would
> > be more than unclear anyway. Your problem now :-)
>
> Hehe, like I said some people are going to be disappointed.
>
> There's just one question about this that remains.
>
> Assuming systemd is setting "/" shared what happens if "mount
> --make-rprivate /" is run before autofs is started?
>
> So if you can spend a little more time on this an answer to this would
> be helpful.

No need for this, thanks to your reproducer.

In fact the problem doesn't appear to happen if "/" is set shared so in
your case "/" must be set either slave or private.

And expanding the reproducer a bit I see another failure case too, and
it doesn't appear to be the unreliable d_mountpoint() check, not sure
yet exactly what it is.

Thanks
Ian

^ permalink raw reply [flat|nested] 50+ messages in thread
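Whether "/" is shared, slave or private can be read from the optional
fields of /proc/self/mountinfo (a "shared:N" tag marks a shared mount, a
"master:N" tag marks a slave, neither tag means private). The following
is only a minimal sketch under that assumption; the function name
propagation_of and the labels it prints are made up for illustration, and
an unbindable mount is simply reported as private.

```shell
#!/bin/sh
# propagation_of MOUNTPOINT [MOUNTINFO]
# Print the propagation type of MOUNTPOINT as recorded in a mountinfo
# file (default: /proc/self/mountinfo). Hypothetical helper, not part of
# autofs or util-linux.
propagation_of() {
    awk -v mp="$1" '
        $5 == mp {
            tags = ""
            # Optional fields start at field 7 and end at the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i ~ /^shared:/)
                    tags = tags " shared"
                else if ($i ~ /^master:/)
                    tags = tags " slave"
            }
            if (tags == "")
                tags = " private"
            # If several records match, the last one wins.
            result = substr(tags, 2)
        }
        END {
            if (result == "")
                exit 1
            print result
        }
    ' "${2:-/proc/self/mountinfo}"
}
```

Running "propagation_of /" before starting automount shows which of the
cases discussed above applies on a given machine. A mount that is both
shared and a slave is reported as "shared slave".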
* Re: "Too many levels of symbolic links"
2014-03-04 6:06 ` Ian Kent
@ 2016-03-09 17:44 ` Donald Buczek
2016-03-16 1:32 ` Ian Kent
` (2 more replies)
0 siblings, 3 replies; 50+ messages in thread
From: Donald Buczek @ 2016-03-09 17:44 UTC (permalink / raw)
To: Ian Kent; +Cc: autofs

Hi, Kent,

in 2014 we analyzed and discussed a problem which in my view boiled down
to "autofs refuses to mount on a path (dentry) which already is mounted
in another namespace." This is because it uses d_mountpoint ( =
DCACHE_MOUNTED) to decide whether a mount should be attempted or not. At
that point I selfishly changed our setting to avoid use of mount
namespaces and left you alone with the problem.

But now we need mount namespaces ourselves using kernel 4.4.2 and the
old problem reoccurred.

So my questions:

* am I right, that this problem is still unresolved?
* is this considered a bug? and if so, do you already have an idea,
  which way this could be resolved?

As a reminder, here is the script we used to demonstrate the problem
(assuming /project/mariux32 is served by autofs):

===
#! /bin/sh

ls -ld /project/mariux32/.
unshare -m -- bash -c "sleep 6;ls -ld /project/mariux32/.;echo exit..." &
kill -USR1 `cat /var/run/autofs-running`
sleep 3
ls -ld /project/mariux32/.
wait
===

In your mail quoted below you wrote, the error would be avoided if "/"
was set to shared before automount is started, but I can't confirm this.

===
root:nsa:/scratch/local/# systemctl stop automount.service
root:nsa:/scratch/local/# mount --make-rshared /
root:nsa:/scratch/local/# systemctl start automount.service
root:nsa:/scratch/local/# ./test.sh
drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
ls: cannot access /project/mariux32/.: Too many levels of symbolic links
drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
exit...
root:nsa:/scratch/local/#
===

Thank you!

Donald

On 03/04/14 07:06, Ian Kent wrote:
> On Mon, 2014-03-03 at 10:40 +0800, Ian Kent wrote:
>>> That doesn't solve the problem, however, that mounts cloned by a
>>> unshare(CLONE_NEWNS) would never expire. Also there is another bug
>>> somewhere, because I see, that the mount, visible to the
>>> /usr/lib/colord/colord process was logged as "unmounted" in the nfs
>>> server when it expired in the global namespace. So I doubt it would be
>>> working even for that process. So possibly automounted mounts shouldn't
>>> be cloned at all? Together with chroot or pivot_root the semantics would
>>> be more than unclear anyway. Your problem now :-)
>> Hehe, like I said some people are going to be disappointed.
>>
>> There's just one question about this that remains.
>>
>> Assuming systemd is setting "/" shared what happens if "mount
>> --make-rprivate /" is run before autofs is started?
>>
>> So if you can spend a little more time on this an answer to this would
>> be helpful.
> No need for this, thanks to your reproducer.
>
> In fact the problem doesn't appear to happen if "/" is set shared so in
> your case "/" must be set either slave or private.
>
> And expanding the reproducer a bit I see another failure case too, and
> it doesn't appear to be the unreliable d_mountpoint() check, not sure
> yet exactly what it is.
>
> Thanks
> Ian
>

--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

^ permalink raw reply [flat|nested] 50+ messages in thread
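In the 2014 diagnosis the stale mount was found by hand in
/proc/7670/mounts of the colord process. The same check can be run over
all processes to see who still holds a path that autofs refuses to
remount. This is a minimal sketch, not an autofs tool: the function name
mount_holders is made up, and the optional second argument exists only so
that the function can be tried against a copy of /proc or a test tree.
Paths containing spaces appear escaped (\040) in these files and are not
handled here.

```shell
#!/bin/sh
# mount_holders PATH [PROCROOT]
# Print the PID of every process whose mount table lists a mount exactly
# at PATH. PROCROOT defaults to /proc. Hypothetical helper for
# illustration only.
mount_holders() {
    mh_path=$1
    mh_root=${2:-/proc}
    for mh_file in "$mh_root"/[0-9]*/mounts; do
        # Skips unreadable entries and the unexpanded pattern itself.
        [ -r "$mh_file" ] || continue
        # Field 2 of each record is the mount point.
        if awk -v p="$mh_path" '$2 == p { found = 1 } END { exit !found }' \
                "$mh_file" 2>/dev/null; then
            mh_pid=${mh_file%/mounts}
            echo "${mh_pid##*/}"
        fi
    done
}
```

For example, "mount_holders /project/mariux32" run as root while the path
returns ELOOP in the daemon's namespace lists the processes whose
namespace still carries the expired mount.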
* Re: "Too many levels of symbolic links"
2016-03-09 17:44 ` Donald Buczek
@ 2016-03-16 1:32 ` Ian Kent
2016-03-16 1:58 ` Ian Kent
2016-03-16 2:10 ` Ian Kent
2 siblings, 0 replies; 50+ messages in thread
From: Ian Kent @ 2016-03-16 1:32 UTC (permalink / raw)
To: Donald Buczek; +Cc: autofs

On Wed, 2016-03-09 at 18:44 +0100, Donald Buczek wrote:
> Hi, Kent,
>
> in 2014 we analyzed and discussed a problem which in my view boiled down
> to "autofs refuses to mount on a path (dentry) which already is mounted
> in another namespace." This is because it uses d_mountpoint ( =
> DCACHE_MOUNTED) to decide whether a mount should be attempted or not. At
> that point I selfishly changed our setting to avoid use of mount
> namespaces and left you alone with the problem.
>
> But now we need mount namespaces ourselves using kernel 4.4.2 and the
> old problem reoccurred
>
> So my questions:
>
> * am I right, that this problem is still unresolved?
> * is this considered a bug?

I originally made a couple of patches to make autofs namespace aware for
this case but I'm still holding on to them because, as I did them, I
realized there's quite a bit more going on with this.

For example, suppose autofs is namespace aware, the autofs file system
has been cloned as part of the namespace creation, the filesystem in the
new namespace is propagation private and the automount daemon is running
in the root namespace. In this case there's no limit on the number of
times the namespace can attempt to trigger a mount, which is possibly
open to be used as a denial of service attack. So the current ELOOP
behaviour is probably needed in this case.

Another example, assume the automount daemon is running in the root
namespace, there are multiple containers where an indirect mount map has
been passed as a volume and the container implementation sets its
mounts as propagation slave. In this case the mounts are mounted in the
root namespace and propagated to the containers. And similarly, if
there's a bad mount the containers are capped on the number of mount
attempts by the current ELOOP behaviour.

But ELOOP probably isn't the error return the containers should be
getting either and allowing unabated callbacks is probably not good
either.

There are more cases, some of which I haven't properly investigated.

So I ended up holding onto the patches.

What exactly is your usage need?

Ian

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2016-03-09 17:44 ` Donald Buczek
2016-03-16 1:32 ` Ian Kent
@ 2016-03-16 1:58 ` Ian Kent
2016-03-16 2:10 ` Ian Kent
2 siblings, 0 replies; 50+ messages in thread
From: Ian Kent @ 2016-03-16 1:58 UTC (permalink / raw)
To: Donald Buczek; +Cc: autofs

On Wed, 2016-03-09 at 18:44 +0100, Donald Buczek wrote:
>
> As a reminder, here is the script we used to demonstrate the problem (
> assuming /project/mariux32 is served by autofs) :
>
> ===
> #! /bin/sh
>
> ls -ld /project/mariux32/.
> unshare -m -- bash -c "sleep 6;ls -ld /project/mariux32/.;echo exit..." &
> kill -USR1 `cat /var/run/autofs-running`
> sleep 3
> ls -ld /project/mariux32/.
> wait
> ===
>
> In your mail quoted below you wrote, the error would be avoided if "/"
> was set to shared before automount is started, but I can't confirm
> this.
>
> ===
> root:nsa:/scratch/local/# systemctl stop automount.service
> root:nsa:/scratch/local/# mount --make-rshared /
> root:nsa:/scratch/local/# systemctl start automount.service
> root:nsa:/scratch/local/# ./test.sh
> drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
> ls: cannot access /project/mariux32/.: Too many levels of symbolic links
> drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
> exit...
> root:nsa:/scratch/local/#

And, sadly, it's not that simple, as I tried to describe with the cases
in the previous mail.

One thing that occurred to me long ago was setting the autofs mounts
propagation private at mount so they wouldn't be cloned to namespaces.
So you'd think, problem solved for some selected cases, but other cases,
such as container implementations that need mounts to be propagation
slave, broken.

But setting the autofs mounts themselves propagation private doesn't
stop them being cloned, it only prevents the child mount from
propagating, which would just force the ELOOP behaviour regardless of the
namespace mount propagation status.

And, as I found out, it isn't possible to set a mount so it doesn't
propagate, it can only be done by setting the parent to not propagate
(all of) its children. So there's no way to selectively set individual
mounts to not propagate at mount time, so they don't show up in a
created namespace.

So it is quite a difficult problem.

Ian

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2016-03-09 17:44 ` Donald Buczek
2016-03-16 1:32 ` Ian Kent
2016-03-16 1:58 ` Ian Kent
@ 2016-03-16 2:10 ` Ian Kent
2016-05-20 14:12 ` Donald Buczek
2 siblings, 1 reply; 50+ messages in thread
From: Ian Kent @ 2016-03-16 2:10 UTC (permalink / raw)
To: Donald Buczek; +Cc: autofs

On Wed, 2016-03-09 at 18:44 +0100, Donald Buczek wrote:
> As a reminder, here is the script we used to demonstrate the problem (
> assuming /project/mariux32 is served by autofs) :
>
> ===
> #! /bin/sh
>
> ls -ld /project/mariux32/.
> unshare -m -- bash -c "sleep 6;ls -ld /project/mariux32/.;echo exit..." &
> kill -USR1 `cat /var/run/autofs-running`
> sleep 3
> ls -ld /project/mariux32/.
> wait
> ===
>
> In your mail quoted below you wrote, the error would be avoided if "/"
> was set to shared before automount is started, but I can't confirm
> this.
>
> ===
> root:nsa:/scratch/local/# systemctl stop automount.service
> root:nsa:/scratch/local/# mount --make-rshared /
> root:nsa:/scratch/local/# systemctl start automount.service
> root:nsa:/scratch/local/# ./test.sh
> drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
> ls: cannot access /project/mariux32/.: Too many levels of symbolic links
> drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
> exit...
> root:nsa:/scratch/local/#
> ===

Actually, assuming /project is an indirect mount, I think that should
have been OK.

Again, there's more going on, probably unshare not cloning file handles
(the -m clones only the mount namespace I think). Without a way of
sending mount requests to the automount daemon, which needs to be done
via a file handle (the current pipe implementation or even if it was
socket based) the symptom will be the same as when / is propagation
private.

This is just a guess though.

>
> Thank you!
>
> Donald
>
>
> On 03/04/14 07:06, Ian Kent wrote:
> > On Mon, 2014-03-03 at 10:40 +0800, Ian Kent wrote:
> > > > That doesn't solve the problem, however, that mounts cloned by a
> > > > unshare(CLONE_NEWNS) would never expire. Also there is another bug
> > > > somewhere, because I see, that the mount, visible to the
> > > > /usr/lib/colord/colord process was logged as "unmounted" in the nfs
> > > > server when it expired in the global namespace. So I doubt it would be
> > > > working even for that process. So possibly automounted mounts shouldn't
> > > > be cloned at all? Together with chroot or pivot_root the semantics would
> > > > be more than unclear anyway. Your problem now :-)
> > > Hehe, like I said some people are going to be disappointed.
> > >
> > > There's just one question about this that remains.
> > >
> > > Assuming systemd is setting "/" shared what happens if "mount
> > > --make-rprivate /" is run before autofs is started?
> > >
> > > So if you can spend a little more time on this an answer to this would
> > > be helpful.
> > No need for this, thanks to your reproducer.
> >
> > In fact the problem doesn't appear to happen if "/" is set shared so in
> > your case "/" must be set either slave or private.
> >
> > And expanding the reproducer a bit I see another failure case too, and
> > it doesn't appear to be the unreliable d_mountpoint() check, not sure
> > yet exactly what it is.
> >
> > Thanks
> > Ian
> >
>

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: "Too many levels of symbolic links"
2016-03-16 2:10 ` Ian Kent
@ 2016-05-20 14:12 ` Donald Buczek
2016-05-23 1:53 ` Ian Kent
0 siblings, 1 reply; 50+ messages in thread
From: Donald Buczek @ 2016-05-20 14:12 UTC (permalink / raw)
To: Ian Kent; +Cc: autofs

On 03/16/16 03:10, Ian Kent wrote:
> On Wed, 2016-03-09 at 18:44 +0100, Donald Buczek wrote:
>> As a reminder, here is the script we used to demonstrate the problem (
>> assuming /project/mariux32 is served by autofs) :
>>
>> ===
>> #! /bin/sh
>>
>> ls -ld /project/mariux32/.
>> unshare -m -- bash -c "sleep 6;ls -ld /project/mariux32/.;echo exit..." &
>> kill -USR1 `cat /var/run/autofs-running`
>> sleep 3
>> ls -ld /project/mariux32/.
>> wait
>> ===
>>
>> In your mail quoted below you wrote, the error would be avoided if "/"
>> was set to shared before automount is started, but I can't confirm
>> this.
>>
>> ===
>> root:nsa:/scratch/local/# systemctl stop automount.service
>> root:nsa:/scratch/local/# mount --make-rshared /
>> root:nsa:/scratch/local/# systemctl start automount.service
>> root:nsa:/scratch/local/# ./test.sh
>> drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
>> ls: cannot access /project/mariux32/.: Too many levels of symbolic links
>> drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
>> exit...
>> root:nsa:/scratch/local/#
>> ===
> Actually, assuming /project is an indirect mount, I think that should
> have been OK.

My fault. It does work. I've overlooked

> unshare since util-linux version 2.27 automatically sets propagation
> to private in a new mount namespace to make sure that the new
> namespace is really unshared. It's possible to disable this feature
> with option --propagation unchanged.

So in fact it does

> unshare(CLONE_NEWNS) = 0
> mount("none", "/", NULL, MS_REC|MS_PRIVATE, NULL) = 0

If I add "--propagation unchanged" to the test script, it works. Might
be a solution for our environment.

I understand, that the combined semantics of automount and mount
namespaces are yet to be defined and there is no clear, unique way to do
that.

But independent of that, I think it might be better if autofs would
continue to try to mount on top of a dentry with DCACHE_MOUNTED set and
this might be a requirement for most of the thinkable solutions.

> Again, there's more going on, probably unshare not cloning file handles
> (the -m clones only the mount namespace I think). Without a way of
> sending mount requests to the automount daemon, which needs to be done
> via a file handle (the current pipe implementation or even if it was
> socket based) the symptom will be the same as when / is propagation
> private.

Well, with "mount --make-rshared /" and "unshare -m --propagation
unchanged" it looks like everything seems to be working. And the mounts
can be triggered from the new namespace as well. There is only one
cosmetic thing: When an automount expires, the unmount fails when the
(shared mount) in the other namespace still is busy. This generates some
log.

2016-05-20T15:43:39+02:00 deadbird automount[938]: expiring path /project/mariux32
2016-05-20T15:43:39+02:00 deadbird automount[938]: >> umount.nfs: /project/mariux32: device is busy
2016-05-20T15:43:39+02:00 deadbird automount[938]: >> umount.nfs: /project/mariux32: device is busy
2016-05-20T15:43:39+02:00 deadbird automount[938]: >> umount.nfs: /project/mariux32: device is busy
2016-05-20T15:43:39+02:00 deadbird automount[938]: Unable to update the mtab file, /proc/mounts and /etc/mtab will differ
2016-05-20T15:43:39+02:00 deadbird automount[938]: expired /project/mariux32

Regards
Donald

>
> This is just a guess though.
>
>> Thank you!
>>
>> Donald
>>
>>
>> On 03/04/14 07:06, Ian Kent wrote:
>>> On Mon, 2014-03-03 at 10:40 +0800, Ian Kent wrote:
>>>>> That doesn't solve the problem, however, that mounts cloned by a
>>>>> unshare(CLONE_NEWNS) would never expire. Also there is another bug
>>>>> somewhere, because I see, that the mount, visible to the
>>>>> /usr/lib/colord/colord process was logged as "unmounted" in the nfs
>>>>> server when it expired in the global namespace. So I doubt it would be
>>>>> working even for that process. So possibly automounted mounts shouldn't
>>>>> be cloned at all? Together with chroot or pivot_root the semantics would
>>>>> be more than unclear anyway. Your problem now :-)
>>>> Hehe, like I said some people are going to be disappointed.
>>>>
>>>> There's just one question about this that remains.
>>>>
>>>> Assuming systemd is setting "/" shared what happens if "mount
>>>> --make-rprivate /" is run before autofs is started?
>>>>
>>>> So if you can spend a little more time on this an answer to this would
>>>> be helpful.
>>> No need for this, thanks to your reproducer.
>>>
>>> In fact the problem doesn't appear to happen if "/" is set shared so in
>>> your case "/" must be set either slave or private.
>>>
>>> And expanding the reproducer a bit I see another failure case too, and
>>> it doesn't appear to be the unreliable d_mountpoint() check, not sure
>>> yet exactly what it is.
>>>
>>> Thanks
>>> Ian
>>>
>>

--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

^ permalink raw reply [flat|nested] 50+ messages in thread
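The reproducer from this thread can be adapted to the finding above: with
"/" made recursively shared before automount starts, and unshare told to
leave propagation unchanged, the namespace copy is reported to work. The
following is only a sketch of such a variant. The mount point and pid
file are the site-specific values used in the thread, the function names
are made up, and probing "unshare --help" for the option is an assumption
about how to cope with util-linux versions that predate it.

```shell
#!/bin/sh
# Site-specific values taken from the thread; override via environment.
MNT=${MNT:-/project/mariux32}
PIDFILE=${PIDFILE:-/var/run/autofs-running}

# Succeeds if the help text on stdin mentions the --propagation option.
has_propagation_opt() {
    grep -q -e '--propagation'
}

# Must be run as root on a machine where $MNT is an autofs mount point.
run_reproducer() {
    opts=-m
    if unshare --help 2>&1 | has_propagation_opt; then
        # Keep the inherited (shared) propagation in the new namespace.
        opts="$opts --propagation unchanged"
    fi
    ls -ld "$MNT/."
    # $opts is left unquoted on purpose so that it splits into words.
    unshare $opts -- sh -c "sleep 6; ls -ld '$MNT/.'; echo exit..." &
    # As in the original script: make automount expire unused mounts.
    kill -USR1 "$(cat "$PIDFILE")"
    sleep 3
    ls -ld "$MNT/."
    wait
}
```

Calling run_reproducer after "mount --make-rshared /" and a restart of
automount corresponds to the setup described as working above; without
the option, a recent unshare makes the new namespace private and the
second ls is expected to fail with ELOOP as before.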
* Re: "Too many levels of symbolic links"
2016-05-20 14:12 ` Donald Buczek
@ 2016-05-23 1:53 ` Ian Kent
0 siblings, 0 replies; 50+ messages in thread
From: Ian Kent @ 2016-05-23 1:53 UTC (permalink / raw)
To: Donald Buczek; +Cc: autofs

On Fri, 2016-05-20 at 16:12 +0200, Donald Buczek wrote:
> On 03/16/16 03:10, Ian Kent wrote:
> > On Wed, 2016-03-09 at 18:44 +0100, Donald Buczek wrote:
> > > As a reminder, here is the script we used to demonstrate the problem (
> > > assuming /project/mariux32 is served by autofs) :
> > >
> > > ===
> > > #! /bin/sh
> > >
> > > ls -ld /project/mariux32/.
> > > unshare -m -- bash -c "sleep 6;ls -ld /project/mariux32/.;echo exit..." &
> > > kill -USR1 `cat /var/run/autofs-running`
> > > sleep 3
> > > ls -ld /project/mariux32/.
> > > wait
> > > ===
> > >
> > > In your mail quoted below you wrote, the error would be avoided if "/"
> > > was set to shared before automount is started, but I can't confirm
> > > this.
> > >
> > > ===
> > > root:nsa:/scratch/local/# systemctl stop automount.service
> > > root:nsa:/scratch/local/# mount --make-rshared /
> > > root:nsa:/scratch/local/# systemctl start automount.service
> > > root:nsa:/scratch/local/# ./test.sh
> > > drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
> > > ls: cannot access /project/mariux32/.: Too many levels of symbolic links
> > > drwxrwsr-x 6 mx32prj mx32grp 56 Feb 24 2011 /project/mariux32/.
> > > exit...
> > > root:nsa:/scratch/local/#
> > > ===
> > Actually, assuming /project is an indirect mount, I think that should
> > have been OK.
>
> My fault. It does work. I've overlooked
>
> > unshare since util-linux version 2.27 automatically sets propagation
> > to private in a new mount namespace to make sure that the new
> > namespace is really unshared. It's possible to disable this feature
> > with option --propagation unchanged.
>
> So in fact it does
>
> > unshare(CLONE_NEWNS) = 0
> > mount("none", "/", NULL, MS_REC|MS_PRIVATE, NULL) = 0
>
> If I add "--propagation unchanged" to the test script, it works. Might
> be a solution for our environment.
>
> I understand, that the combined semantics of automount and mount
> namespaces are yet to be defined and there is no clear, unique way to do
> that.
>
> But independent of that, I think it might be better if autofs would
> continue to try to mount on top of a dentry with DCACHE_MOUNTED set and
> this might be a requirement for most of the thinkable solutions.
>
> > Again, there's more going on, probably unshare not cloning file handles
> > (the -m clones only the mount namespace I think). Without a way of
> > sending mount requests to the automount daemon, which needs to be done
> > via a file handle (the current pipe implementation or even if it was
> > socket based) the symptom will be the same as when / is propagation
> > private.
>
> Well, with "mount --make-rshared /" and "unshare -m --propagation
> unchanged" it looks like everything seems to be working. And the mounts
> can be triggered from the new namespace as well. There is only one
> cosmetic thing: When an automount expires, the unmount fails when the
> (shared mount) in the other namespace still is busy. This generates some
> log.
>
> 2016-05-20T15:43:39+02:00 deadbird automount[938]: expiring path /project/mariux32
> 2016-05-20T15:43:39+02:00 deadbird automount[938]: >> umount.nfs: /project/mariux32: device is busy
> 2016-05-20T15:43:39+02:00 deadbird automount[938]: >> umount.nfs: /project/mariux32: device is busy
> 2016-05-20T15:43:39+02:00 deadbird automount[938]: >> umount.nfs: /project/mariux32: device is busy
> 2016-05-20T15:43:39+02:00 deadbird automount[938]: Unable to update the mtab file, /proc/mounts and /etc/mtab will differ
> 2016-05-20T15:43:39+02:00 deadbird automount[938]: expired /project/mariux32

I've looked at the mount propagation code many times and it always ends
in confusion.

For example, in this case, it looks like the reference count on all the
mounts "of the parent mount" is checked, which seems wrong to me since a
specific mount contained in the parent mount is being umounted. But I
have to assume the actual outcome is that the mount would be umounted if
it (itself) isn't busy in any branch of the propagation tree.

On the face of it that appears to conflict with the notion of
independent namespaces where mount and umount need to be able to be done
independently within namespaces.

It looks as though the changes that decoupled the kernel mount itself
from the vfsmount structure are in 3.8 and those are needed for
mount/umount independence, at a minimum. Maybe they weren't complete or
maybe the propagation code is conflicting with the usage (it came some
time before the namespace implementation) and this case hasn't been
covered.

This is the type of side effect that worries me most.

>
> Regards
> Donald
>
> >
> > This is just a guess though.
> >
> > > Thank you!
> > >
> > > Donald
> > >
> > >
> > > On 03/04/14 07:06, Ian Kent wrote:
> > > > On Mon, 2014-03-03 at 10:40 +0800, Ian Kent wrote:
> > > > > > That doesn't solve the problem, however, that mounts cloned by a
> > > > > > unshare(CLONE_NEWNS) would never expire. Also there is another bug
> > > > > > somewhere, because I see, that the mount, visible to the
> > > > > > /usr/lib/colord/colord process was logged as "unmounted" in the nfs
> > > > > > server when it expired in the global namespace. So I doubt it would be
> > > > > > working even for that process. So possibly automounted mounts shouldn't
> > > > > > be cloned at all? Together with chroot or pivot_root the semantics would
> > > > > > be more than unclear anyway. Your problem now :-)
> > > > > Hehe, like I said some people are going to be disappointed.
> > > > >
> > > > > There's just one question about this that remains.
> > > > >
> > > > > Assuming systemd is setting "/" shared what happens if "mount
> > > > > --make-rprivate /" is run before autofs is started?
> > > > >
> > > > > So if you can spend a little more time on this an answer to this would
> > > > > be helpful.
> > > > No need for this, thanks to your reproducer.
> > > >
> > > > In fact the problem doesn't appear to happen if "/" is set shared so in
> > > > your case "/" must be set either slave or private.
> > > >
> > > > And expanding the reproducer a bit I see another failure case too, and
> > > > it doesn't appear to be the unreliable d_mountpoint() check, not sure
> > > > yet exactly what it is.
> > > >
> > > > Thanks
> > > > Ian
> > > >
> > >
>

^ permalink raw reply [flat|nested] 50+ messages in thread
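When an expire reports "device is busy" although nothing in the daemon's
own namespace uses the mount, a first step is to see how many mount
namespaces exist and how many processes live in each. A minimal sketch,
assuming the kernel exposes the per-process /proc/PID/ns/mnt links; the
function name mntns_summary and its optional argument (an alternative
proc root, useful only for trying it out) are made up for illustration.

```shell
#!/bin/sh
# mntns_summary [PROCROOT]
# Print one line per mount namespace: "<namespace id> <process count>".
# Hypothetical helper; links of processes we may not inspect are skipped.
mntns_summary() {
    ms_root=${1:-/proc}
    for ms_link in "$ms_root"/[0-9]*/ns/mnt; do
        readlink "$ms_link" 2>/dev/null
    done | sort | uniq -c | awk '{ print $2, $1 }'
}
```

More than one line in the output of "mntns_summary" (run as root) means
that some process has its own copy of the mount table, and therefore
possibly its own reference to an automounted file system.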
* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
2014-01-31 10:10 ` Donald Buczek
2014-01-31 10:29 ` Donald Buczek
@ 2014-02-01 1:47 ` Ian Kent
2014-02-01 3:32 ` Ian Kent
2 siblings, 0 replies; 50+ messages in thread
From: Ian Kent @ 2014-02-01 1:47 UTC (permalink / raw)
To: Donald Buczek; +Cc: autofs

On Fri, 2014-01-31 at 11:10 +0100, Donald Buczek wrote:
>
> Hello, Ian,
>
> you said, "how DCACHE_MOUNTED would not be cleared on umount", so you
> are thinking about the unmount path. I asked my users and in two cases
> (including the one described in this thread) they think it happened the
> very first time they accessed the path after boot. This suggests the
> problem might appear on the mount path.
>
> Also, both were on workstations (single user!) and they both used a
> shell ( "cd /failing/path" and "do_something > /failing/path/bla" ), so
> collisions (other threads accessing the same path at the same time) are
> unlikely.
>
> We don't have any hints which would suggest that there might have been
> a problem with the fileserver or network involved (which would imply a
> bug in the "mount failure" path)

Thanks for all this, I appreciate it.

snip ...

> I've checked the mounts as you asked (
> http://owww.molgen.mpg.de/~buczek/autofs-demo/typescript_3.l ) the
> dentry 0xffff88016a31c440 identified in the previous sessions (and still
> there) is not in any mnt_mountpoint

LOL, it was a long shot and now I know for sure, good.

>
> How can DCACHE_MOUNTED be set when there was no mount?
> The problem appears rarely and (until now) randomly. Locking failure?
>
> Okay, I've managed to get the nvidia bullshit drivers to work on linux
> 3.13.1 , so I'm going to reboot this workstation (with the three
> failures) to the latest kernel now with DEBUG set in the autofs4 directory.

It's probably a good idea for you to get to some later kernel anyway.

See this (not my doing) in 3.8 fs/namespace.c:

/*
 * Clear dentry's mounted state if it has no remaining mounts.
 * vfsmount_lock must be held for write.
 */
static void dentry_reset_mounted(struct dentry *dentry)
{
	unsigned u;

	for (u = 0; u < HASH_SIZE; u++) {
		struct mount *p;

		list_for_each_entry(p, &mount_hashtable[u], mnt_hash) {
			if (p->mnt_mountpoint == dentry)
				return;
		}
	}
	spin_lock(&dentry->d_lock);
	dentry->d_flags &= ~DCACHE_MOUNTED;
	spin_unlock(&dentry->d_lock);
}

Which means all mounts in the system need to be scanned before
DCACHE_MOUNTED is cleared. Now that's not a problem for a smallish
number of mounts but can be a problem for larger numbers of mounts or
if there's a large number of dentries in the system.

Don't remember exactly when Al Viro fixed it but it's easy to check.

>
> Perhaps we shouldn't waste too much time analyzing code which is
> obsoleted already. I'll surely tell you when the problem is seen again
> with 3.13.

Thanks again.
Ian

^ permalink raw reply [flat|nested] 50+ messages in thread
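For readers following the dentry_reset_mounted() discussion in the message above, here is a minimal user-space sketch of the same idea: the mounted flag on a dentry is cleared only when a full scan of the mount hash table finds no mount still pointing at it. Everything prefixed with toy_, the tiny HASH_SIZE, and the simplified structs are assumptions made for illustration; this is not the kernel code and it ignores locking entirely.

```c
#include <assert.h>
#include <stddef.h>

#define DCACHE_MOUNTED 0x10000u
#define HASH_SIZE 8 /* toy size, chosen for the sketch only */

/* Heavily simplified stand-ins for the kernel structures. */
struct dentry {
	unsigned int d_flags;
};

struct mount {
	struct dentry *mnt_mountpoint;
	struct mount *next; /* singly linked hash chain */
};

static struct mount *mount_hashtable[HASH_SIZE];

/* Clear DCACHE_MOUNTED only if no mount in any bucket uses this dentry. */
static void toy_reset_mounted(struct dentry *dentry)
{
	unsigned int u;

	for (u = 0; u < HASH_SIZE; u++) {
		struct mount *p;

		for (p = mount_hashtable[u]; p != NULL; p = p->next) {
			if (p->mnt_mountpoint == dentry)
				return; /* still a mount point somewhere */
		}
	}
	dentry->d_flags &= ~DCACHE_MOUNTED;
}

/* Hypothetical helper: hash a mount onto a dentry and set the flag. */
static void toy_attach(struct mount *m, struct dentry *d, unsigned int bucket)
{
	assert(bucket < HASH_SIZE);
	m->mnt_mountpoint = d;
	m->next = mount_hashtable[bucket];
	mount_hashtable[bucket] = m;
	d->d_flags |= DCACHE_MOUNTED;
}

/* Hypothetical helper: unhash a mount, then re-evaluate the flag. */
static void toy_detach(struct mount *m, unsigned int bucket)
{
	struct mount **pp;

	assert(bucket < HASH_SIZE);
	for (pp = &mount_hashtable[bucket]; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == m) {
			*pp = m->next;
			m->next = NULL;
			break;
		}
	}
	toy_reset_mounted(m->mnt_mountpoint);
}
```

In this model, a dentry that is the mount point of two mounts (as could happen with cloned namespaces) keeps DCACHE_MOUNTED after the first detach and loses it only after the second. The cost of every detach is a walk over the whole table, which is the scaling concern raised in the message.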
* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
2014-01-31 10:10 ` Donald Buczek
2014-01-31 10:29 ` Donald Buczek
2014-02-01 1:47 ` autofs linux 3.8.13 and " Ian Kent
@ 2014-02-01 3:32 ` Ian Kent
2014-02-01 13:08 ` Donald Buczek
2 siblings, 1 reply; 50+ messages in thread
From: Ian Kent @ 2014-02-01 3:32 UTC (permalink / raw)
To: David Howells; +Cc: autofs, Donald Buczek

Hi David,

Wondering if you could perhaps lend a hand with this analysis.

The "Too many levels of symbolic links" error has been reported against
the rhel-6 back port and a number of kernel versions (over time) but
has not yet been reported against the most recent kernels. So it may
still be an issue.

Donald has provided quite a bit of useful information in the foregoing
discussion.

Have a look at this link for debug information he has provided so far:
http://www.molgen.mpg.de/~buczek/autofs-demo/

I can forward mails from earlier posts if you need to see them.

On Fri, 2014-01-31 at 11:10 +0100, Donald Buczek wrote:
> On 01/31/14 06:13, Ian Kent wrote:
> > On Fri, 2014-01-31 at 11:31 +0800, Ian Kent wrote:
> >> On Wed, 2014-01-29 at 17:02 +0100, Donald Buczek wrote:
> >>> Hello,
> >>>
> >>> we are trying to switch from amd to autofs. After successfully testing
> >>> and rolling it out to the first several machines, from time to time we
> >>> get directories stuck with "Too many levels of symbolic links" on a path
> >>> which should be automounted via an indirect map.
> >>>
> >>> linux 3.8.13
> >>> autofs 5.0.8
> >>>
> >>> As an example, here is data from a system where the path /scratch/tmp is
> >>> stuck:
> >>>
> >>> http://www.molgen.mpg.de/~buczek/autofs-demo/
> >>>
> >>> auto.master # master map
> >>> auto.scratch # indirect map for /scratch
> >>> autofs # from /etc/defaults
> >>> typescript # shows the problem and a bit of gdb dump of kernel
> >>> structures
> >>> typescript.l # same with line numbers for reference
> >>> gdb-macros # macros used in the gdb session
> >>>
> >>> From typescript.l , line 122ff it is clear, that /scratch/tmp is not
> >>> currently mounted. On the other hand, the gdb session finds the dentry
> >>> of /scratch/tmp which has d_flags 0x70080 (line 99,120). This is
> >>> DCACHE_MANAGE_TRANSIT+DCACHE_NEED_AUTOMOUNT+DCACHE_MOUNTED+DCACHE_RCUACCESS
> >>> with DCACHE_MOUNTED indicating that there should be something mounted
> >>> there(?). I think, this state is faulty and necessarily leads to ELOOP
> >>> during path walk. Probably the situation is known by the gurus here?
> >> Yes, I can see how DCACHE_MOUNTED being set would lead to ELOOP in this
> >> case. But, having been there before too, I couldn't see any way the
> >> DCACHE_MOUNTED would not be cleared on umount. Also, DCACHE_MOUNTED is
> >> only changed within the VFS and isn't changed very often. I can't see
> >> how a code path that should lead to one of those changes doesn't go
> >> there.
> >>
> >> I'll have another look .....
> > Then the question becomes ....
> >
> > Can a dentry be a mount point for more than one mount ....
> > Obviously not you say ... but what about clone(2) with CLONE_NEWNS?
> >
> > If you still have that kernel you used to get the info above could you
> > check the mount (ie. struct mount not struct vfsmount) structures to see
> > if there is one with its mnt_mountpoint set to the dentry in question?
> >
> > Ian
> >
> >
>
> Hello, Ian,
>
> you said, "how DCACHE_MOUNTED would not be cleared on umount", so you
> are thinking about the unmount path. I asked my users and in two cases
> (including the one described in this thread) they think it happened the
> very first time they accessed the path after boot. This suggests the
> problem might appear on the mount path.
>
> Also, both were on workstations (single user!) and they both used a
> shell ( "cd /failing/path" and "do_something > /failing/path/bla" ), so
> collisions (other threads accessing the same path at the same time) are
> unlikely.
>
> We don't have any hints which would suggest that there might have been
> a problem with the fileserver or network involved (which would imply a
> bug in the "mount failure" path)
>
> Oh... Just found another important piece of information :
>
> > root:thehawk:~/# date
> > Fri Jan 31 10:27:48 CET 2014
> > root:thehawk:~/# uptime
> > 10:27:51 up 8 days, 21:58, 3 users, load average: 0.37, 0.30, 0.26
>
> The system was booted Jan 22, 12:00 something
>
> > root:thehawk:~/# ls -al /scratch/
> > total 2
> > drwxr-xr-x 4 root system 0 Jan 27 13:37 .
> > drwxr-xr-x 35 root system 888 Jan 20 10:28 ..
> > drwxrwxrwt 16 root system 1136 Jan 29 14:39 local
> > dr-xr-xr-x 2 root system 0 Jan 27 13:37 tmp
> > root:thehawk:~/# ^C
>
> The creation of the dentry was Jan 27, 13:37
>
> And here's from the fileserver:
> > root:moep:~/# fgrep thehawk /var/log/messages |tail -5
> > 2014-01-09T14:09:35+01:00 moep rpc.mountd[646]: authenticated unmount
> > request from thehawk.molgen.mpg.de:797 for
> > /amd/moep/X/X2016/scratch/tolzmann (/amd/moep/X/X2016)
> > 2014-01-13T15:43:22+01:00 moep rpc.mountd[646]: authenticated mount
> > request from thehawk.molgen.mpg.de:922 for
> > /amd/moep/X/X2016/scratch/tmp (/amd/moep/X/X2016)
> > 2014-01-13T15:48:36+01:00 moep rpc.mountd[646]: authenticated unmount
> > request from thehawk.molgen.mpg.de:660 for
> > /amd/moep/X/X2016/scratch/tmp (/amd/moep/X/X2016)
> > 2014-01-16T15:52:18+01:00 moep rpc.mountd[646]: authenticated mount
> > request from thehawk.molgen.mpg.de:877 for
> > /amd/moep/X/X2016/scratch/tmp (/amd/moep/X/X2016)
> > 2014-01-16T15:57:30+01:00 moep rpc.mountd[646]: authenticated unmount
> > request from thehawk.molgen.mpg.de:745 for
> > /amd/moep/X/X2016/scratch/tmp (/amd/moep/X/X2016)
>
> Last access seen on the fileserver (what would be mounted on /scratch/tmp
> if everything went well) was days before that.
>
> So /scratch/tmp has never been mounted.

This is the most interesting information so far.

As you know the mounted flag is only ever changed at mount and umount.
The implication that it is set on a dentry that's never been mounted is
very strange.

But first, a question for Donald. Given that the autofs configuration
has BROWSE_MODE="no" we don't know how the tmp directory in /scratch
got created since it has never been mounted. It shouldn't exist, any
idea how it got created?

Unfortunately we probably need a full autofs debug log to answer that.

Anyway, ignoring that for now and assuming tmp was never mounted
there's only one place I can see where this might happen and only if
there were some strange compiler optimization badness and that's in
fs/namei.c:follow_managed():

	while (managed = ACCESS_ONCE(path->dentry->d_flags),
	       managed &= DCACHE_MANAGED_DENTRY,
	       unlikely(managed != 0)) {

I just can't see how this incorrect flags setting could happen at all
so I'm clutching at straws.

Any further thoughts on how this might be happening David?

>
> I've checked the mounts as you asked (
> http://owww.molgen.mpg.de/~buczek/autofs-demo/typescript_3.l ) the
> dentry 0xffff88016a31c440 identified in the previous sessions (and still
> there) is not in any mnt_mountpoint
>
> How can DCACHE_MOUNTED be set when there was no mount?
> The problem appears rarely and (until now) randomly. Locking failure?
>
> Okay, I've managed to get the nvidia bullshit drivers to work on linux
> 3.13.1 , so I'm going to reboot this workstation (with the three
> failures) to the latest kernel now with DEBUG set in the autofs4 directory.
>
> Perhaps we shouldn't waste too much time analyzing code which is
> obsoleted already. I'll surely tell you when the problem is seen again
> with 3.13.
>
> Regards
> Donald
>

^ permalink raw reply [flat|nested] 50+ messages in thread
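To make the d_flags value from Donald's report easier to check by hand, the following sketch decodes 0x70080 into flag names and applies the DCACHE_MANAGED_DENTRY mask that follow_managed() loops on. The numeric flag values are an assumption recalled from include/linux/dcache.h of roughly the 3.8 era and should be verified against the tree in use; decode_d_flags() and managed_bits() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values (3.8-era include/linux/dcache.h); verify in your tree. */
#define DCACHE_RCUACCESS      0x00080u
#define DCACHE_MOUNTED        0x10000u
#define DCACHE_NEED_AUTOMOUNT 0x20000u
#define DCACHE_MANAGE_TRANSIT 0x40000u
#define DCACHE_MANAGED_DENTRY \
	(DCACHE_MOUNTED | DCACHE_NEED_AUTOMOUNT | DCACHE_MANAGE_TRANSIT)

struct flag_name {
	unsigned int bit;
	const char *name;
};

/* Only the four flags named in the report are listed here. */
static const struct flag_name known_flags[] = {
	{ DCACHE_MANAGE_TRANSIT, "DCACHE_MANAGE_TRANSIT" },
	{ DCACHE_NEED_AUTOMOUNT, "DCACHE_NEED_AUTOMOUNT" },
	{ DCACHE_MOUNTED,        "DCACHE_MOUNTED" },
	{ DCACHE_RCUACCESS,      "DCACHE_RCUACCESS" },
};

/*
 * Hypothetical helper: write the names of all known bits set in d_flags
 * into buf, joined by '+'. Returns the bits that were not recognised.
 */
static unsigned int decode_d_flags(unsigned int d_flags, char *buf, size_t len)
{
	unsigned int rest = d_flags;
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(known_flags) / sizeof(known_flags[0]); i++) {
		if (!(d_flags & known_flags[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, "+", len - strlen(buf) - 1);
		strncat(buf, known_flags[i].name, len - strlen(buf) - 1);
		rest &= ~known_flags[i].bit;
	}
	return rest;
}

/* The subset of d_flags that keeps the follow_managed() loop running. */
static unsigned int managed_bits(unsigned int d_flags)
{
	return d_flags & DCACHE_MANAGED_DENTRY;
}
```

With these assumed values, decode_d_flags(0x70080, buf, sizeof(buf)) leaves the four flag names from the report in buf and returns 0 (no unknown bits), and managed_bits(0x70080) is 0x70000, so all three managed bits, including DCACHE_MOUNTED, are set on a dentry that reportedly has nothing mounted on it.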
* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
2014-02-01 3:32 ` Ian Kent
@ 2014-02-01 13:08 ` Donald Buczek
0 siblings, 0 replies; 50+ messages in thread
From: Donald Buczek @ 2014-02-01 13:08 UTC (permalink / raw)
To: Ian Kent, David Howells; +Cc: autofs

On 01.02.2014 04:32, Ian Kent wrote:
> But first, a question for Donald. Given that the autofs configuration
> has BROWSE_MODE="no" we don't know how the tmp directory in /scratch
> got created since it has never been mounted. It shouldn't exist, any
> idea how it got created?

The users or scripts know the complete path, or it is referenced by
symlinks, or it is a home directory. So the first access goes directly to
"/scratch/tmp/whatever" without any need to browse and discover "tmp" in
/scratch. I assume the directory "tmp" was created by autofs code after
the user tried to access "/scratch/tmp/something", to give the daemon
something to mount on. It looks like this mount attempt didn't reach the
fileserver. We don't know yet whether it reached the daemon or not.

Regards
Donald

--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
2014-01-29 16:02 autofs linux 3.8.13 and "Too many levels of symbolic links" Donald Buczek
` (2 preceding siblings ...)
2014-01-31 3:31 ` Ian Kent
@ 2014-02-01 2:57 ` Ian Kent
2014-02-01 13:01 ` Donald Buczek
3 siblings, 1 reply; 50+ messages in thread
From: Ian Kent @ 2014-02-01 2:57 UTC (permalink / raw)
To: Donald Buczek; +Cc: autofs

On Wed, 2014-01-29 at 17:02 +0100, Donald Buczek wrote:
> Hello,
>
> we are trying to switch from amd to autofs. After successfully testing

I didn't notice this before, so as an aside from our discussion

You were (perhaps are) an amd user.

As it happens I'm working on adding an amd map parser to autofs,
probably for similar reasons to you needing to switch away from using
it. I guess it won't make much difference to you since many people use
amd for its cross platform abilities. And, well, there's that bug we're
discussing ....

But, fyi, let me tell you where it's at.

I'm not sure how much of the functionality will end up in autofs quite
yet but so far the things that likely won't be available are:

- type program mounts
  - needed in autofs too.
  - but can't be done without significant autofs infrastructure
    change (or they would have been done in autofs ages ago).
  - would like to add this, probably some time later.

- type nfsx mounts
  - might (but probably not) get done for the initial commit.
  - a bit hard to do within autofs.

- type lustre mounts
  - would like to do for initial commit but ....

- type direct mounts
  - I think I understand how these are supposed to work.
  - don't work in amd on linux so can't check.
  - can't find any references to users of them either.
  - a bit difficult to do in autofs.
  - undecided as yet, at best will do some time later.

- map type passwd
  - seems too prescriptive to be useful.
  - unlikely to be implemented.

- map type ndbm
  - may implement later if people use it.
  - would need to add to autofs as well for sane implementation.

- configuration options not yet implemented
  - fully_qualified_hosts
  - unmount_on_exit
  - browsable_dirs
  - probably a couple of others I've missed.
  - many of the configuration options aren't used or aren't
    sensible within autofs.

- man page for amd options
  - autofs(5) will need to be updated before initial commit.


I've probably missed a few things but this will give you an idea.
My plan is to commit initial changes, announce this to try and get some
testers and continue to work on it after that. Eventually the amd map
changes will be autofs-5.1.0.

Any interest in this?
Ian

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
2014-02-01 2:57 ` Ian Kent
@ 2014-02-01 13:01 ` Donald Buczek
2014-02-02 3:45 ` Ian Kent
0 siblings, 1 reply; 50+ messages in thread
From: Donald Buczek @ 2014-02-01 13:01 UTC (permalink / raw)
To: Ian Kent; +Cc: autofs

On 01.02.2014 03:57, Ian Kent wrote:
> On Wed, 2014-01-29 at 17:02 +0100, Donald Buczek wrote:
>> Hello,
>>
>> we are trying to switch from amd to autofs. After successfully testing
> I didn't notice this before, so as an aside from our discussion
>
> You were (perhaps are) an amd user.
>
> As it happens I'm working on adding an amd map parser to autofs,
> probably for similar reasons to you needing to switch away from using
> it. I guess it won't make much difference to you since many people use
> amd for its cross platform abilities. And, well, there's that bug we're
> discussing ....
>
> But, fyi, let me tell you where it's at.
>
> I'm not sure how much of the functionality will end up in autofs quite
> yet but so far the things that likely won't be available are:
>
> - type program mounts
>   - needed in autofs too.
>   - but can't be done without significant autofs infrastructure
>     change (or they would have been done in autofs ages ago).
>   - would like to add this, probably some time later.
>
> - type nfsx mounts
>   - might (but probably not) get done for the initial commit.
>   - a bit hard to do within autofs.
>
> - type lustre mounts
>   - would like to do for initial commit but ....
>
> - type direct mounts
>   - I think I understand how these are supposed to work.
>   - don't work in amd on linux so can't check.
>   - can't find any references to users of them either.
>   - a bit difficult to do in autofs.
>   - undecided as yet, at best will do some time later.
>
> - map type passwd
>   - seems too prescriptive to be useful.
>   - unlikely to be implemented.
>
> - map type ndbm
>   - may implement later if people use it.
>   - would need to add to autofs as well for sane implementation.
>
> - configuration options not yet implemented
>   - fully_qualified_hosts
>   - unmount_on_exit
>   - browsable_dirs
>   - probably a couple of others I've missed.
>   - many of the configuration options aren't used or aren't
>     sensible within autofs.
>
> - man page for amd options
>   - autofs(5) will need to be updated before initial commit.
>
>
> I've probably missed a few things but this will give you an idea.
> My plan is to commit initial changes, announce this to try and get some
> testers and continue to work on it after that. Eventually the amd map
> changes will be autofs-5.1.0.
>
> Any interest in this?
> Ian
>

Well, to be honest, we don't need any of this. Where we have used
some of the more complex map features of amd, I consider this a mistake
and I'm happy to find another solution for that. As small and simple as
possible is what we would appreciate the most.

Even with the current autofs code, much of the complexity comes from
features we don't need (though I surely acknowledge that others do):
direct maps, nested mountpoints, ldap maps, nis maps... We could even
live without multithreading in the daemon. We could live without
automatic selection between nfs and local bind mounts (because we build
the maps on each node individually after we pushed around the
configuration data). We could live without expiration support in kernel
and daemon and just try to /bin/umount from our own scripts ourselves.
I'm not too happy with the uid/gid of the original mount persisting in
kernel data structures, perhaps to see the daemon die on getpwuid() in
two years, because the original account expired some months ago. For
what? For /${uid}/ in the paths? Our goal is to provide a stable, global
namespace to our nodes and not a "depends on who asked for it" namespace
:-)

In our current setting, amd sits as a local nfs-server on the automount
directories, so any path resolution of an already mounted subdirectory
still goes through RPC serialisation and context switches and a big,
buggy user-mode daemon. This is the great performance and stability
problem we want to get rid of, so we go for in-kernel autofs. I think
in the latest versions of amd they are working on supporting autofs too,
but we had too many bad experiences, so we don't want to upgrade amd away
from our more or less stable version. So, please, autofs/automount,
don't become a second amd, stay as small and simple and fast and bugfree
as possible. Just my view.

Regards
Donald

--
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: autofs linux 3.8.13 and "Too many levels of symbolic links"
2014-02-01 13:01 ` Donald Buczek
@ 2014-02-02 3:45 ` Ian Kent
0 siblings, 0 replies; 50+ messages in thread
From: Ian Kent @ 2014-02-02 3:45 UTC (permalink / raw)
To: Donald Buczek; +Cc: autofs

On Sat, 2014-02-01 at 14:01 +0100, Donald Buczek wrote:
> On 01.02.2014 03:57, Ian Kent wrote:
> > On Wed, 2014-01-29 at 17:02 +0100, Donald Buczek wrote:
> >> Hello,
> >>
> >> we are trying to switch from amd to autofs. After successfully testing
> > I didn't notice this before, so as an aside from our discussion
> >
> > You were (perhaps are) an amd user.
> >
> > As it happens I'm working on adding an amd map parser to autofs,
> > probably for similar reasons to you needing to switch away from using
> > it. I guess it won't make much difference to you since many people use
> > amd for its cross platform abilities. And, well, there's that bug we're
> > discussing ....
> >
> > But, fyi, let me tell you where it's at.
> >
> > I'm not sure how much of the functionality will end up in autofs quite
> > yet but so far the things that likely won't be available are:
> >
> > - type program mounts
> >   - needed in autofs too.
> >   - but can't be done without significant autofs infrastructure
> >     change (or they would have been done in autofs ages ago).
> >   - would like to add this, probably some time later.
> >
> > - type nfsx mounts
> >   - might (but probably not) get done for the initial commit.
> >   - a bit hard to do within autofs.
> >
> > - type lustre mounts
> >   - would like to do for initial commit but ....
> >
> > - type direct mounts
> >   - I think I understand how these are supposed to work.
> >   - don't work in amd on linux so can't check.
> >   - can't find any references to users of them either.
> >   - a bit difficult to do in autofs.
> >   - undecided as yet, at best will do some time later.
> >
> > - map type passwd
> >   - seems too prescriptive to be useful.
> >   - unlikely to be implemented.
> >
> > - map type ndbm
> >   - may implement later if people use it.
> >   - would need to add to autofs as well for sane implementation.
> >
> > - configuration options not yet implemented
> >   - fully_qualified_hosts
> >   - unmount_on_exit
> >   - browsable_dirs
> >   - probably a couple of others I've missed.
> >   - many of the configuration options aren't used or aren't
> >     sensible within autofs.
> >
> > - man page for amd options
> >   - autofs(5) will need to be updated before initial commit.
> >
> >
> > I've probably missed a few things but this will give you an idea.
> > My plan is to commit initial changes, announce this to try and get some
> > testers and continue to work on it after that. Eventually the amd map
> > changes will be autofs-5.1.0.
> >
> > Any interest in this?
> > Ian
> >
>
> Well, to be honest, we don't need any of this. Where we have used
> some of the more complex map features of amd, I consider this a mistake
> and I'm happy to find another solution for that. As small and simple as
> possible is what we would appreciate the most.

Yeah, when I first saw amd maps (quite a long time ago now) my thought
was they were more complex than they need to be and, at the time,
recommended against using amd for that reason. There was also the
potential support headache that could result from using something that
people could bend and twist in hard to understand ways.

>
> Even with the current autofs code, much of the complexity comes from
> features we don't need (though I surely acknowledge that others do):
> direct maps, nested mountpoints, ldap maps, nis maps... We could even
> live without multithreading in the daemon. We could live without
> automatic selection between nfs and local bind mounts (because we build
> the maps on each node individually after we pushed around the
> configuration data). We could live without expiration support in kernel
> and daemon and just try to /bin/umount from our own scripts ourselves.
> I'm not too happy with the uid/gid of the original mount persisting in
> kernel data structures, perhaps to see the daemon die on getpwuid() in
> two years, because the original account expired some months ago. For
> what? For /${uid}/ in the paths? Our goal is to provide a stable, global
> namespace to our nodes and not a "depends on who asked for it" namespace
> :-)

Yes, autofs is too complex, I agree.

But, it's been my experience that trying to simplify something mostly
leads to even more complex code which misses the point completely. So
I'll most likely leave it the way it is until I have a reason to change
it.

Other people do use the features though so it's worth spending time to
make them function as best I can.

You're saying that if an automounted mount remains mounted for a long
time the uid/gid in the dentry info struct can become stale and cause
subsequent failures. I guess that could happen but a stale uid/gid
shouldn't cause a mount failure AFAICS, except for those whose maps use
those settings. In which case they do need to (and should) be concerned
with mounts that stay mounted for long amounts of time.

Besides the macro usage it's used to trigger dependent mounts in the
mount location path, since the daemon doesn't trigger mounts itself. I
guess the requesting uid/gid isn't really needed for that, I suppose I
could make the spawned mount its own process group leader and that
would be enough but I have the uid/gid.

Sure, something else you don't need but .....

>
> In our current setting, amd sits as a local nfs-server on the automount
> directories, so any path resolution of an already mounted subdirectory
> still goes through RPC serialisation and context switches and a big,
> buggy user-mode daemon. This is the great performance and stability
> problem we want to get rid of, so we go for in-kernel autofs. I think
> in the latest versions of amd they are working on supporting autofs too,
> but we had too many bad experiences, so we don't want to upgrade amd away
> from our more or less stable version. So, please, autofs/automount,
> don't become a second amd, stay as small and simple and fast and bugfree
> as possible. Just my view.

It's way too late for autofs to be small and simple, ;)

In amd, autofs support has been present for quite a while. What is
recent is support for sun format maps but I don't expect that will ever
make it into an actual release given the current lack of activity in
the project.

It's too late, the amd parser is nearly ready for initial commit.

In my defense, there's quite good code separation between the amd and
autofs specific parts due to how a parser is added to autofs. Sure, the
map sources (like file, nis, etc.) necessarily need changes for the key
matching logic but that isn't a great deal of code. And the changes to
other parts of the code base are relatively small.

I'm not saying there won't be problems with it initially but I don't
think it is as bad as you think it will be. The implementation itself
is much smaller than it is in amd.

Indeed, the amd parser implementation is, IMHO, much better than the
autofs parser implementation from a design POV and I'm thinking of
re-writing the autofs parser with a similar design.

As I said above I've resisted doing that to date because I've spent so
much time getting it adequately stable and it has been doing its job
well enough. But I need to add a feature for the amd parser that has
been requested for autofs also and the best way to do it means doing
the re-write (rather than just adding a hard to maintain hack). Not
only that, there's a real possibility for simplification of the autofs
parser (in this case anyway) that usually doesn't actually happen
during a re-write. So I have some compelling reasons to do it.

Ian

^ permalink raw reply [flat|nested] 50+ messages in thread