* When ghosting is enabled, failed mounts by AutoFS leave broken directories @ 2012-09-24 20:14 Leonardo Chiquitto 2012-09-25 2:13 ` Ian Kent 0 siblings, 1 reply; 10+ messages in thread From: Leonardo Chiquitto @ 2012-09-24 20:14 UTC (permalink / raw) To: autofs; +Cc: linux-fsdevel, miklos Hi, If you try to access an automounted directory and the mount operation fails, the attributes of the directory (owner, group, mode, access time) are lost. This happens only when AutoFS' browse mode (ghosting) is enabled. How to reproduce the problem: Make sure you have BROWSE_MODE="yes" in /etc/sysconfig/autofs. 1. Add a simple indirect mount map (it's required that the server does NOT export the /notexported directory): # cat /etc/auto.master /nfs /etc/auto.nfs # cat /etc/auto.nfs v2 server:/notexported 2. Restart AutoFS and try to access the directory: # cd /nfs/v2 -bash: cd: /nfs/v2: No such file or directory # ls -la /nfs ls: cannot access /nfs/v2: No such file or directory total 4 drwxr-xr-x 3 root root 0 Sep 24 21:43 . drwxr-xr-x 28 root root 4096 Sep 24 21:25 .. d????????? ? ? ? ? ? v2 This problem is reproducible on recent upstream kernels such as 3.6.0-rc6. I found out that the last kernel version that does not exhibit the bug is 2.6.37. Between 2.6.37 and 2.6.38 there were quite some changes in AutoFS/VFS and my attempts to bisect the problem were frustrated by early crashes. Any ideas? Thanks, Leonardo ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories 2012-09-24 20:14 When ghosting is enabled, failed mounts by AutoFS leave broken directories Leonardo Chiquitto @ 2012-09-25 2:13 ` Ian Kent 2012-09-25 21:42 ` Leonardo Chiquitto 0 siblings, 1 reply; 10+ messages in thread From: Ian Kent @ 2012-09-25 2:13 UTC (permalink / raw) To: Leonardo Chiquitto; +Cc: autofs, linux-fsdevel, miklos On Mon, 2012-09-24 at 17:14 -0300, Leonardo Chiquitto wrote: > Hi, > > If you try to access an automounted directory and the mount operation fails, > the attributes of the directory (owner, group, mode, access time) are lost. > This happens only when AutoFS' browse mode (ghosting) is enabled. > > How to reproduce the problem: > > Make sure you have BROWSE_MODE="yes" in /etc/sysconfig/autofs. > > 1. Add a simple indirect mount map (it's required that the server does NOT > export the /notexported directory): > > # cat /etc/auto.master > /nfs /etc/auto.nfs > > # cat /etc/auto.nfs > v2 server:/notexported > > 2. Restart AutoFS and try to access the directory: > > # cd /nfs/v2 > -bash: cd: /nfs/v2: No such file or directory > > # ls -la /nfs > ls: cannot access /nfs/v2: No such file or directory > total 4 > drwxr-xr-x 3 root root 0 Sep 24 21:43 . > drwxr-xr-x 28 root root 4096 Sep 24 21:25 .. > d????????? ? ? ? ? ? v2 > > This problem is reproducible on recent upstream kernels such as 3.6.0-rc6. > I found out that the last kernel version that does not exhibit the bug is > 2.6.37. Between 2.6.37 and 2.6.38 there were quite some changes in > AutoFS/VFS and my attempts to bisect the problem were frustrated by > early crashes. > > Any ideas? Not really. It would be best to start with an full autofs debug log so we can see what is happening within automount and work back from there. We'll probably need to look at the dentry itself as well to see if we can get any info about where it came from possibly, with a kernel dump. Ian ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories 2012-09-25 2:13 ` Ian Kent @ 2012-09-25 21:42 ` Leonardo Chiquitto 2012-09-26 9:15 ` Ian Kent 0 siblings, 1 reply; 10+ messages in thread From: Leonardo Chiquitto @ 2012-09-25 21:42 UTC (permalink / raw) To: Ian Kent; +Cc: autofs, linux-fsdevel, miklos On Mon, Sep 24, 2012 at 11:13 PM, Ian Kent <raven@themaw.net> wrote: > On Mon, 2012-09-24 at 17:14 -0300, Leonardo Chiquitto wrote: >> Hi, >> >> If you try to access an automounted directory and the mount operation fails, >> the attributes of the directory (owner, group, mode, access time) are lost. >> This happens only when AutoFS' browse mode (ghosting) is enabled. >> >> How to reproduce the problem: >> >> Make sure you have BROWSE_MODE="yes" in /etc/sysconfig/autofs. >> >> 1. Add a simple indirect mount map (it's required that the server does NOT >> export the /notexported directory): >> >> # cat /etc/auto.master >> /nfs /etc/auto.nfs >> >> # cat /etc/auto.nfs >> v2 server:/notexported >> >> 2. Restart AutoFS and try to access the directory: >> >> # cd /nfs/v2 >> -bash: cd: /nfs/v2: No such file or directory >> >> # ls -la /nfs >> ls: cannot access /nfs/v2: No such file or directory >> total 4 >> drwxr-xr-x 3 root root 0 Sep 24 21:43 . >> drwxr-xr-x 28 root root 4096 Sep 24 21:25 .. >> d????????? ? ? ? ? ? v2 >> >> This problem is reproducible on recent upstream kernels such as 3.6.0-rc6. >> I found out that the last kernel version that does not exhibit the bug is >> 2.6.37. Between 2.6.37 and 2.6.38 there were quite some changes in >> AutoFS/VFS and my attempts to bisect the problem were frustrated by >> early crashes. >> >> Any ideas? > > Not really. > > It would be best to start with an full autofs debug log so we can see > what is happening within automount and work back from there. We'll > probably need to look at the dentry itself as well to see if we can get > any info about where it came from possibly, with a kernel dump. > > Ian Hi Ian, Here are the debug logs from automount and a snapshot of the dentry and inode of the "broken" directory (running 2.6.38). I tried to compare it with 2.6.37 (which works), but the structures changed a lot between these kernels. ## starting up Sep 25 13:44:30 n1 automount[2337]: Starting automounter version 5.0.7, master map auto.master Sep 25 13:44:30 n1 automount[2337]: using kernel protocol version 5.02 Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_master: reading master files auto.master Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init gathered global options: (null) Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, passing -n to mount Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, passing -n to mount Sep 25 13:44:30 n1 automount[2337]: lookup_read_master: lookup(file): read entry /nfs Sep 25 13:44:30 n1 automount[2337]: master_do_mount: mounting /nfs Sep 25 13:44:30 n1 automount[2337]: automount_path_to_fifo: fifo name /run/autofs.fifo-nfs Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_map: reading map file /etc/auto.nfs Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init gathered global options: (null) Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, passing -n to mount Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, passing -n to mount Sep 25 13:44:30 n1 automount[2337]: mounted indirect on /nfs with timeout 60, freq 15 seconds Sep 25 13:44:30 n1 automount[2337]: st_ready: st_ready(): state = 0 path /nfs Sep 25 13:44:30 n1 automount[2337]: ghosting enabled ## ls /nfs/v2 Sep 25 13:44:44 n1 automount[2337]: handle_packet: type = 3 Sep 25 13:44:44 n1 automount[2337]: handle_packet_missing_indirect: token 6, name v2, request pid 2352 Sep 25 13:44:44 n1 automount[2337]: attempting to mount entry /nfs/v2 Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): looking up v2 Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): v2 -> server:/nfs Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): expanded entry: server:/nfs Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): gathered options: Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): dequote("server:/nfs") -> server:/nfs Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): core of entry: options=, loc=server:/nfs Sep 25 13:44:44 n1 automount[2337]: sun_mount: parse(sun): mounting root /nfs, mountpoint v2, what server:/nfs, fstype nfs, options (null) Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): root=/nfs name=v2 what=server:/nfs, fstype=nfs, options=(null) Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host server(10.1.1.4) proto tcp version 0x30 Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.000214 Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000114 Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 164 weight 0 Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host server(10.1.1.4) proto udp version 0x30 Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.001836 Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000095 Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 965 weight 0 Sep 25 13:44:44 n1 automount[2337]: prune_host_list: selected subset of hosts that support NFS3 over TCP Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling mkdir_path /nfs/v2 Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling mount -t nfs server:/nfs /nfs/v2 Sep 25 13:44:44 n1 automount[2337]: spawn_mount: mtab link detected, passing -n to mount Sep 25 13:44:44 n1 checkproc: checkproc: Empty pid file /var/run/rpc.statd.pid for /usr/sbin/rpc.statd Sep 25 13:44:44 n1 start_daemon: start_daemon: Empty pid file /var/run/rpc.statd.pid for /usr/sbin/rpc.statd Sep 25 13:44:44 n1 rpc.statd[2369]: Version 1.2.6 starting Sep 25 13:44:44 n1 rpc.statd[2369]: Flags: TI-RPC Sep 25 13:44:44 n1 automount[2337]: >> Starting rpc.statd .....done Sep 25 13:44:44 n1 automount[2337]: >> mount.nfs: access denied by server while mounting server:/nfs Sep 25 13:44:44 n1 automount[2337]: mount(nfs): nfs: mount failure server:/nfs on /nfs/v2 Sep 25 13:44:44 n1 automount[2337]: dev_ioctl_send_fail: token = 6 Sep 25 13:44:44 n1 automount[2337]: failed to mount /nfs/v2 Sep 25 13:44:54 n1 automount[2337]: st_expire: state 1 path /nfs Sep 25 13:44:54 n1 automount[2337]: expire_proc: exp_proc = 140484187907840 path /nfs Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: got thid 140484187907840 path /nfs stat 0 Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: sigchld: exp 140484187907840 finished, switching from 2 to 1 Sep 25 13:44:54 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs Sep 25 13:45:09 n1 automount[2337]: st_expire: state 1 path /nfs Sep 25 13:45:09 n1 automount[2337]: expire_proc: exp_proc = 140484187907840 path /nfs Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: got thid 140484187907840 path /nfs stat 0 Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: sigchld: exp 140484187907840 finished, switching from 2 to 1 Sep 25 13:45:09 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs ## ls -la /nfs Sep 25 13:45:12 n1 automount[2337]: handle_packet: type = 3 Sep 25 13:45:12 n1 automount[2337]: handle_packet_missing_indirect: token 7, name v2, request pid 2397 Sep 25 13:45:12 n1 automount[2337]: dev_ioctl_send_fail: token = 7 dentry and inode (/nfs/v2) from a v2.6.38 kernel, after triggering the mount: struct dentry { d_flags = 393216, d_seq = { sequence = 4 }, d_hash = { next = 0x0, pprev = 0xffffc900011f5680 }, d_parent = 0xffff880478fcb0c0, d_name = { hash = 238128, len = 2, name = 0xffff880478fcb3f8 "v2" }, d_inode = 0xffff8804944693e0, d_iname = "v2\000\000\065\060\066\062\061\062fb\000Q\246\357\001\000\000\000\000\000\000\000\000\000\000\000\000\000\000", d_count = 1, d_lock = { { rlock = { raw_lock = { slock = 8650884 } } } }, d_op = 0xffffffffa01f0940, d_sb = 0xffff880496c8f400, d_time = 0, d_fsdata = 0xffff880497262540, d_lru = { next = 0xffff880478fcb440, prev = 0xffff880478fcb440 }, d_u = { d_child = { next = 0xffff880478fcb2d0, prev = 0xffff880478fcb160 }, d_u = { d_child = { next = 0xffff880478fcb2d0, prev = 0xffff880478fcb160 }, d_rcu = { next = 0xffff880478fcb2d0, func = 0xffff880478fcb160 } }, d_subdirs = { next = 0xffff880478fcb460, prev = 0xffff880478fcb460 }, d_alias = { next = 0xffff880494469478, prev = 0xffff880494469478 } } struct inode { i_mode = 16749, i_uid = 0, i_gid = 0, i_op = 0xffffffffa01f0880, i_sb = 0xffff880496c8f400, i_lock = { { rlock = { raw_lock = { slock = 65537 } } } }, i_flags = 0, i_mutex = { count = { counter = 1 }, wait_lock = { { rlock = { raw_lock = { slock = 0 } } } }, wait_list = { next = 0xffff880494469410, prev = 0xffff880494469410 }, owner = 0x0 }, i_state = 0, dirtied_when = 0, i_hash = { next = 0x0, pprev = 0x0 }, i_wb_list = { next = 0xffff880494469448, prev = 0xffff880494469448 }, i_lru = { next = 0xffff880494469458, prev = 0xffff880494469458 }, i_sb_list = { next = 0xffff8804944a68d8, prev = 0xffff880496c8f4c0 }, { i_dentry = { next = 0xffff880478fcb470, prev = 0xffff880478fcb470 }, i_rcu = { next = 0xffff880478fcb470, func = 0xffff880478fcb470 } }, i_ino = 120539, i_count = { counter = 1 }, i_nlink = 2, i_rdev = 0, i_blkbits = 10, i_version = 0, i_size = 0, i_atime = { tv_sec = 1348605652, tv_nsec = 807650231 }, i_mtime = { tv_sec = 1348605652, tv_nsec = 807650231 }, i_ctime = { tv_sec = 1348605652, tv_nsec = 807650231 }, i_blocks = 0, i_bytes = 0, i_alloc_sem = { count = 0, wait_lock = { { rlock = { raw_lock = { slock = 0 } } } }, wait_list = { next = 0xffff880494469500, prev = 0xffff880494469500 } }, i_fop = 0xffffffffa01f07a0, i_flock = 0x0, i_mapping = 0xffff880494469528, i_data = { host = 0xffff8804944693e0, page_tree = { height = 0, gfp_mask = 32, rnode = 0x0 }, tree_lock = { { rlock = { raw_lock = { slock = 0 } } } }, i_mmap_writable = 0, i_mmap = { prio_tree_node = 0x0, index_bits = 1, raw = 1 }, i_mmap_nonlinear = { next = 0xffff880494469558, prev = 0xffff880494469558 }, i_mmap_lock = { { rlock = { raw_lock = { slock = 0 } } } }, truncate_count = 0, nrpages = 0, writeback_index = 0, a_ops = 0xffffffff81d79300, flags = 131290, backing_dev_info = 0xffffffff81a261c0, private_lock = { { rlock = { raw_lock = { slock = 0 } } } }, private_list = { next = 0xffff8804944695a0, prev = 0xffff8804944695a0 }, assoc_mapping = 0x0, unmap_mutex = { count = { counter = 1 }, wait_lock = { { rlock = { raw_lock = { slock = 0 } } } }, wait_list = { next = 0xffff8804944695c0, prev = 0xffff8804944695c0 }, owner = 0x0 } }, i_dquot = {0x0, 0x0}, i_devices = { next = 0xffff8804944695e8, prev = 0xffff8804944695e8 }, { i_pipe = 0x0, i_bdev = 0x0, i_cdev = 0x0 }, i_generation = 0, i_fsnotify_mask = 0, i_fsnotify_marks = { first = 0x0 }, i_writecount = { counter = 0 }, i_security = 0x0, i_acl = 0xffffffffffffffff, i_default_acl = 0xffffffffffffffff, i_private = 0x0 } Thanks, Leonardo ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories 2012-09-25 21:42 ` Leonardo Chiquitto @ 2012-09-26 9:15 ` Ian Kent 2012-09-26 19:39 ` Leonardo Chiquitto 0 siblings, 1 reply; 10+ messages in thread From: Ian Kent @ 2012-09-26 9:15 UTC (permalink / raw) To: Leonardo Chiquitto; +Cc: autofs, linux-fsdevel, miklos On Tue, 2012-09-25 at 18:42 -0300, Leonardo Chiquitto wrote: > On Mon, Sep 24, 2012 at 11:13 PM, Ian Kent <raven@themaw.net> wrote: > > On Mon, 2012-09-24 at 17:14 -0300, Leonardo Chiquitto wrote: > >> Hi, > >> > >> If you try to access an automounted directory and the mount operation fails, > >> the attributes of the directory (owner, group, mode, access time) are lost. > >> This happens only when AutoFS' browse mode (ghosting) is enabled. > >> > >> How to reproduce the problem: > >> > >> Make sure you have BROWSE_MODE="yes" in /etc/sysconfig/autofs. > >> > >> 1. Add a simple indirect mount map (it's required that the server does NOT > >> export the /notexported directory): > >> > >> # cat /etc/auto.master > >> /nfs /etc/auto.nfs > >> > >> # cat /etc/auto.nfs > >> v2 server:/notexported > >> > >> 2. Restart AutoFS and try to access the directory: > >> > >> # cd /nfs/v2 > >> -bash: cd: /nfs/v2: No such file or directory > >> > >> # ls -la /nfs > >> ls: cannot access /nfs/v2: No such file or directory > >> total 4 > >> drwxr-xr-x 3 root root 0 Sep 24 21:43 . > >> drwxr-xr-x 28 root root 4096 Sep 24 21:25 .. > >> d????????? ? ? ? ? ? v2 > >> > >> This problem is reproducible on recent upstream kernels such as 3.6.0-rc6. > >> I found out that the last kernel version that does not exhibit the bug is > >> 2.6.37. Between 2.6.37 and 2.6.38 there were quite some changes in > >> AutoFS/VFS and my attempts to bisect the problem were frustrated by > >> early crashes. > >> > >> Any ideas? > > > > Not really. > > > > It would be best to start with an full autofs debug log so we can see > > what is happening within automount and work back from there. We'll > > probably need to look at the dentry itself as well to see if we can get > > any info about where it came from possibly, with a kernel dump. > > > > Ian > > Hi Ian, > > Here are the debug logs from automount and a snapshot of the dentry > and inode of the "broken" directory (running 2.6.38). > > I tried to compare it with 2.6.37 (which works), but the structures > changed a lot between these kernels. A great deal changed in 2.6.38 in the VFS. You really can't use 2.6.38 because there were a number of fixes that went in after this and I think the VFS code up to around 3.0 or 3.1 could have had this problem anyway. What your looking for is kernels with fs/namei.c:follow_automount() with this near the top of the function, which shouldn't cause this type of problem: /* We don't want to mount if someone's just doing a stat - * unless they're stat'ing a directory and appended a '/' to * the name. * * We do, however, want to mount if someone wants to open or * create a file of any type under the mountpoint, wants to * traverse through the mountpoint or wants to open the * mounted directory. Also, autofs may mark negative dentries * as being automount points. These will need the attentions * of the daemon to instantiate them before they can be used. */ if (!(flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY | LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) && path->dentry->d_inode) return -EISDIR; But I'm seeing this with 3.4.9-2.fc16 which does have this in follow_automount() so that's a puzzle. > > ## starting up > > Sep 25 13:44:30 n1 automount[2337]: Starting automounter version > 5.0.7, master map auto.master > Sep 25 13:44:30 n1 automount[2337]: using kernel protocol version 5.02 > Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_master: reading > master files auto.master > Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init > gathered global options: (null) > Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, > passing -n to mount > Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, > passing -n to mount > Sep 25 13:44:30 n1 automount[2337]: lookup_read_master: lookup(file): > read entry /nfs > Sep 25 13:44:30 n1 automount[2337]: master_do_mount: mounting /nfs > Sep 25 13:44:30 n1 automount[2337]: automount_path_to_fifo: fifo name > /run/autofs.fifo-nfs > Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_map: reading map > file /etc/auto.nfs > Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init > gathered global options: (null) > Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, > passing -n to mount > Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, > passing -n to mount > Sep 25 13:44:30 n1 automount[2337]: mounted indirect on /nfs with > timeout 60, freq 15 seconds > Sep 25 13:44:30 n1 automount[2337]: st_ready: st_ready(): state = 0 path /nfs > Sep 25 13:44:30 n1 automount[2337]: ghosting enabled > > ## ls /nfs/v2 > > Sep 25 13:44:44 n1 automount[2337]: handle_packet: type = 3 > Sep 25 13:44:44 n1 automount[2337]: handle_packet_missing_indirect: > token 6, name v2, request pid 2352 > Sep 25 13:44:44 n1 automount[2337]: attempting to mount entry /nfs/v2 > Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): looking up v2 > Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): v2 -> > server:/nfs > Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): expanded > entry: server:/nfs > Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): gathered options: > Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): > dequote("server:/nfs") -> server:/nfs > Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): core of > entry: options=, loc=server:/nfs > Sep 25 13:44:44 n1 automount[2337]: sun_mount: parse(sun): mounting > root /nfs, mountpoint v2, what server:/nfs, fstype nfs, options (null) > Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): root=/nfs > name=v2 what=server:/nfs, fstype=nfs, options=(null) > Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host > server(10.1.1.4) proto tcp version 0x30 > Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.000214 > Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000114 > Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 164 weight 0 > Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host > server(10.1.1.4) proto udp version 0x30 > Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.001836 > Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000095 > Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 965 weight 0 > Sep 25 13:44:44 n1 automount[2337]: prune_host_list: selected subset > of hosts that support NFS3 over TCP > Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling > mkdir_path /nfs/v2 > Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling > mount -t nfs server:/nfs /nfs/v2 > Sep 25 13:44:44 n1 automount[2337]: spawn_mount: mtab link detected, > passing -n to mount > Sep 25 13:44:44 n1 checkproc: checkproc: Empty pid file > /var/run/rpc.statd.pid for /usr/sbin/rpc.statd > Sep 25 13:44:44 n1 start_daemon: start_daemon: Empty pid file > /var/run/rpc.statd.pid for /usr/sbin/rpc.statd > Sep 25 13:44:44 n1 rpc.statd[2369]: Version 1.2.6 starting > Sep 25 13:44:44 n1 rpc.statd[2369]: Flags: TI-RPC > Sep 25 13:44:44 n1 automount[2337]: >> Starting rpc.statd .....done > Sep 25 13:44:44 n1 automount[2337]: >> mount.nfs: access denied by > server while mounting server:/nfs > Sep 25 13:44:44 n1 automount[2337]: mount(nfs): nfs: mount failure > server:/nfs on /nfs/v2 > Sep 25 13:44:44 n1 automount[2337]: dev_ioctl_send_fail: token = 6 > Sep 25 13:44:44 n1 automount[2337]: failed to mount /nfs/v2 > Sep 25 13:44:54 n1 automount[2337]: st_expire: state 1 path /nfs > Sep 25 13:44:54 n1 automount[2337]: expire_proc: exp_proc = > 140484187907840 path /nfs > Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: got thid > 140484187907840 path /nfs stat 0 > Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: sigchld: exp > 140484187907840 finished, switching from 2 to 1 > Sep 25 13:44:54 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs > Sep 25 13:45:09 n1 automount[2337]: st_expire: state 1 path /nfs > Sep 25 13:45:09 n1 automount[2337]: expire_proc: exp_proc = > 140484187907840 path /nfs > Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: got thid > 140484187907840 path /nfs stat 0 > Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: sigchld: exp > 140484187907840 finished, switching from 2 to 1 > Sep 25 13:45:09 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs > > ## ls -la /nfs > > Sep 25 13:45:12 n1 automount[2337]: handle_packet: type = 3 > Sep 25 13:45:12 n1 automount[2337]: handle_packet_missing_indirect: > token 7, name v2, request pid 2397 > Sep 25 13:45:12 n1 automount[2337]: dev_ioctl_send_fail: token = 7 This callback shouldn't happen. > > dentry and inode (/nfs/v2) from a v2.6.38 kernel, after triggering > the mount: This certainly looks like an autofs dentry. > > struct dentry { > d_flags = 393216, > d_seq = { > sequence = 4 > }, > d_hash = { > next = 0x0, > pprev = 0xffffc900011f5680 > }, This looks a bit odd but I don't think it is the cause of the problem. Other than that these look ok. snip ... I setup a simple map, auto.test, with a single key "this" to check this out and did an "strace -ff -o ls.trace /bin/ls --color=auto /test" against the 3.4.9-2.fc16 I currently have installed. Ignoring all the cruft, near the bottom we see: stat64("/test", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0 openat(AT_FDCWD, "/test", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3 getdents64(3, /* 3 entries */, 32768) = 72 lstat64("/test/this", 0x95aba40) = -1 ENOENT (No such file or directory) This is the problem I think, it should not attempt a mount and should return 0. I've only looked at the current upstream source so far and I can't see why this is happening. I've built upstream 3.5.4 so I can look at the actual source of the kernel I'm using to test. Ian ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories 2012-09-26 9:15 ` Ian Kent @ 2012-09-26 19:39 ` Leonardo Chiquitto 2012-09-27 1:04 ` Ian Kent 0 siblings, 1 reply; 10+ messages in thread From: Leonardo Chiquitto @ 2012-09-26 19:39 UTC (permalink / raw) To: Ian Kent; +Cc: autofs, linux-fsdevel, miklos On Wed, Sep 26, 2012 at 6:15 AM, Ian Kent <raven@themaw.net> wrote: > On Tue, 2012-09-25 at 18:42 -0300, Leonardo Chiquitto wrote: >> On Mon, Sep 24, 2012 at 11:13 PM, Ian Kent <raven@themaw.net> wrote: >> > On Mon, 2012-09-24 at 17:14 -0300, Leonardo Chiquitto wrote: >> >> Hi, >> >> >> >> If you try to access an automounted directory and the mount operation fails, >> >> the attributes of the directory (owner, group, mode, access time) are lost. >> >> This happens only when AutoFS' browse mode (ghosting) is enabled. >> >> >> >> How to reproduce the problem: >> >> >> >> Make sure you have BROWSE_MODE="yes" in /etc/sysconfig/autofs. >> >> >> >> 1. Add a simple indirect mount map (it's required that the server does NOT >> >> export the /notexported directory): >> >> >> >> # cat /etc/auto.master >> >> /nfs /etc/auto.nfs >> >> >> >> # cat /etc/auto.nfs >> >> v2 server:/notexported >> >> >> >> 2. Restart AutoFS and try to access the directory: >> >> >> >> # cd /nfs/v2 >> >> -bash: cd: /nfs/v2: No such file or directory >> >> >> >> # ls -la /nfs >> >> ls: cannot access /nfs/v2: No such file or directory >> >> total 4 >> >> drwxr-xr-x 3 root root 0 Sep 24 21:43 . >> >> drwxr-xr-x 28 root root 4096 Sep 24 21:25 .. >> >> d????????? ? ? ? ? ? v2 >> >> >> >> This problem is reproducible on recent upstream kernels such as 3.6.0-rc6. >> >> I found out that the last kernel version that does not exhibit the bug is >> >> 2.6.37. Between 2.6.37 and 2.6.38 there were quite some changes in >> >> AutoFS/VFS and my attempts to bisect the problem were frustrated by >> >> early crashes. >> >> >> >> Any ideas? >> > >> > Not really. >> > >> > It would be best to start with an full autofs debug log so we can see >> > what is happening within automount and work back from there. We'll >> > probably need to look at the dentry itself as well to see if we can get >> > any info about where it came from possibly, with a kernel dump. >> > >> > Ian >> >> Hi Ian, >> >> Here are the debug logs from automount and a snapshot of the dentry >> and inode of the "broken" directory (running 2.6.38). >> >> I tried to compare it with 2.6.37 (which works), but the structures >> changed a lot between these kernels. > > A great deal changed in 2.6.38 in the VFS. > > You really can't use 2.6.38 because there were a number of fixes that > went in after this and I think the VFS code up to around 3.0 or 3.1 > could have had this problem anyway. I'm using 3.6.0-rc6 here. I collected the data from 2.6.38 because this is the first kernel that shows the bug and I thought it would be easier to isolate the fix this way (smaller distance). > What your looking for is kernels with fs/namei.c:follow_automount() with > this near the top of the function, which shouldn't cause this type of > problem: > > /* We don't want to mount if someone's just doing a stat - > * unless they're stat'ing a directory and appended a '/' to > * the name. > * > * We do, however, want to mount if someone wants to open or > * create a file of any type under the mountpoint, wants to > * traverse through the mountpoint or wants to open the > * mounted directory. Also, autofs may mark negative dentries > * as being automount points. These will need the attentions > * of the daemon to instantiate them before they can be used. > */ > if (!(flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY | > LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) && > path->dentry->d_inode) > return -EISDIR; > > But I'm seeing this with 3.4.9-2.fc16 which does have this in > follow_automount() so that's a puzzle. > >> >> ## starting up >> >> Sep 25 13:44:30 n1 automount[2337]: Starting automounter version >> 5.0.7, master map auto.master >> Sep 25 13:44:30 n1 automount[2337]: using kernel protocol version 5.02 >> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_master: reading >> master files auto.master >> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init >> gathered global options: (null) >> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, >> passing -n to mount >> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, >> passing -n to mount >> Sep 25 13:44:30 n1 automount[2337]: lookup_read_master: lookup(file): >> read entry /nfs >> Sep 25 13:44:30 n1 automount[2337]: master_do_mount: mounting /nfs >> Sep 25 13:44:30 n1 automount[2337]: automount_path_to_fifo: fifo name >> /run/autofs.fifo-nfs >> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_map: reading map >> file /etc/auto.nfs >> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init >> gathered global options: (null) >> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, >> passing -n to mount >> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, >> passing -n to mount >> Sep 25 13:44:30 n1 automount[2337]: mounted indirect on /nfs with >> timeout 60, freq 15 seconds >> Sep 25 13:44:30 n1 automount[2337]: st_ready: st_ready(): state = 0 path /nfs >> Sep 25 13:44:30 n1 automount[2337]: ghosting enabled >> >> ## ls /nfs/v2 >> >> Sep 25 13:44:44 n1 automount[2337]: handle_packet: type = 3 >> Sep 25 13:44:44 n1 automount[2337]: handle_packet_missing_indirect: >> token 6, name v2, request pid 2352 >> Sep 25 13:44:44 n1 automount[2337]: attempting to mount entry /nfs/v2 >> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): looking up v2 >> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): v2 -> >> server:/nfs >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): expanded >> entry: server:/nfs >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): gathered options: >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): >> dequote("server:/nfs") -> server:/nfs >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): core of >> entry: options=, loc=server:/nfs >> Sep 25 13:44:44 n1 automount[2337]: sun_mount: parse(sun): mounting >> root /nfs, mountpoint v2, what server:/nfs, fstype nfs, options (null) >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): root=/nfs >> name=v2 what=server:/nfs, fstype=nfs, options=(null) >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host >> server(10.1.1.4) proto tcp version 0x30 >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.000214 >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000114 >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 164 weight 0 >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host >> server(10.1.1.4) proto udp version 0x30 >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.001836 >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000095 >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 965 weight 0 >> Sep 25 13:44:44 n1 automount[2337]: prune_host_list: selected subset >> of hosts that support NFS3 over TCP >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling >> mkdir_path /nfs/v2 >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling >> mount -t nfs server:/nfs /nfs/v2 >> Sep 25 13:44:44 n1 automount[2337]: spawn_mount: mtab link detected, >> passing -n to mount >> Sep 25 13:44:44 n1 checkproc: checkproc: Empty pid file >> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd >> Sep 25 13:44:44 n1 start_daemon: start_daemon: Empty pid file >> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd >> Sep 25 13:44:44 n1 rpc.statd[2369]: Version 1.2.6 starting >> Sep 25 13:44:44 n1 rpc.statd[2369]: Flags: TI-RPC >> Sep 25 13:44:44 n1 automount[2337]: >> Starting rpc.statd .....done >> Sep 25 13:44:44 n1 automount[2337]: >> mount.nfs: access denied by >> server while mounting server:/nfs >> Sep 25 13:44:44 n1 automount[2337]: mount(nfs): nfs: mount failure >> server:/nfs on /nfs/v2 >> Sep 25 13:44:44 n1 automount[2337]: dev_ioctl_send_fail: token = 6 >> Sep 25 13:44:44 n1 automount[2337]: failed to mount /nfs/v2 >> Sep 25 13:44:54 n1 automount[2337]: st_expire: state 1 path /nfs >> Sep 25 13:44:54 n1 automount[2337]: expire_proc: exp_proc = >> 140484187907840 path /nfs >> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: got thid >> 140484187907840 path /nfs stat 0 >> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: sigchld: exp >> 140484187907840 finished, switching from 2 to 1 >> Sep 25 13:44:54 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs >> Sep 25 13:45:09 n1 automount[2337]: st_expire: state 1 path /nfs >> Sep 25 13:45:09 n1 automount[2337]: expire_proc: exp_proc = >> 140484187907840 path /nfs >> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: got thid >> 140484187907840 path /nfs stat 0 >> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: sigchld: exp >> 140484187907840 finished, switching from 2 to 1 >> Sep 25 13:45:09 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs >> >> ## ls -la /nfs >> >> Sep 25 13:45:12 n1 automount[2337]: handle_packet: type = 3 >> Sep 25 13:45:12 n1 automount[2337]: handle_packet_missing_indirect: >> token 7, name v2, request pid 2397 >> Sep 25 13:45:12 n1 automount[2337]: dev_ioctl_send_fail: token = 7 > > This callback shouldn't happen. > >> >> dentry and inode (/nfs/v2) from a v2.6.38 kernel, after triggering >> the mount: > > This certainly looks like an autofs dentry. > >> >> struct dentry { >> d_flags = 393216, >> d_seq = { >> sequence = 4 >> }, >> d_hash = { >> next = 0x0, >> pprev = 0xffffc900011f5680 >> }, > > This looks a bit odd but I don't think it is the cause of the problem. > > Other than that these look ok. > > snip ... > > I setup a simple map, auto.test, with a single key "this" to check this > out and did an "strace -ff -o ls.trace /bin/ls --color=auto /test" > against the 3.4.9-2.fc16 I currently have installed. > > Ignoring all the cruft, near the bottom we see: > stat64("/test", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0 > openat(AT_FDCWD, "/test", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3 > getdents64(3, /* 3 entries */, 32768) = 72 > lstat64("/test/this", 0x95aba40) = -1 ENOENT (No such file or directory) This happens only after there was an attempt to mount /test/this and the directory is already in the "incomplete" state, right? > This is the problem I think, it should not attempt a mount and should > return 0. My theory about the root cause is that when the mount fails, the kernel forgets to "reset" the state of that dentry. Makes any sense? > I've only looked at the current upstream source so far and I can't see > why this is happening. > > I've built upstream 3.5.4 so I can look at the actual source of the > kernel I'm using to test. Thanks for your help Leonardo ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories 2012-09-26 19:39 ` Leonardo Chiquitto @ 2012-09-27 1:04 ` Ian Kent 2012-09-27 4:35 ` Ian Kent 0 siblings, 1 reply; 10+ messages in thread From: Ian Kent @ 2012-09-27 1:04 UTC (permalink / raw) To: Leonardo Chiquitto; +Cc: autofs, linux-fsdevel, miklos On Wed, 2012-09-26 at 16:39 -0300, Leonardo Chiquitto wrote: > On Wed, Sep 26, 2012 at 6:15 AM, Ian Kent <raven@themaw.net> wrote: > > On Tue, 2012-09-25 at 18:42 -0300, Leonardo Chiquitto wrote: > >> On Mon, Sep 24, 2012 at 11:13 PM, Ian Kent <raven@themaw.net> wrote: > >> > On Mon, 2012-09-24 at 17:14 -0300, Leonardo Chiquitto wrote: > >> >> Hi, > >> >> > >> >> If you try to access an automounted directory and the mount operation fails, > >> >> the attributes of the directory (owner, group, mode, access time) are lost. > >> >> This happens only when AutoFS' browse mode (ghosting) is enabled. > >> >> > >> >> How to reproduce the problem: > >> >> > >> >> Make sure you have BROWSE_MODE="yes" in /etc/sysconfig/autofs. > >> >> > >> >> 1. Add a simple indirect mount map (it's required that the server does NOT > >> >> export the /notexported directory): > >> >> > >> >> # cat /etc/auto.master > >> >> /nfs /etc/auto.nfs > >> >> > >> >> # cat /etc/auto.nfs > >> >> v2 server:/notexported > >> >> > >> >> 2. Restart AutoFS and try to access the directory: > >> >> > >> >> # cd /nfs/v2 > >> >> -bash: cd: /nfs/v2: No such file or directory > >> >> > >> >> # ls -la /nfs > >> >> ls: cannot access /nfs/v2: No such file or directory > >> >> total 4 > >> >> drwxr-xr-x 3 root root 0 Sep 24 21:43 . > >> >> drwxr-xr-x 28 root root 4096 Sep 24 21:25 .. > >> >> d????????? ? ? ? ? ? v2 > >> >> > >> >> This problem is reproducible on recent upstream kernels such as 3.6.0-rc6. > >> >> I found out that the last kernel version that does not exhibit the bug is > >> >> 2.6.37. Between 2.6.37 and 2.6.38 there were quite some changes in > >> >> AutoFS/VFS and my attempts to bisect the problem were frustrated by > >> >> early crashes. > >> >> > >> >> Any ideas? > >> > > >> > Not really. > >> > > >> > It would be best to start with an full autofs debug log so we can see > >> > what is happening within automount and work back from there. We'll > >> > probably need to look at the dentry itself as well to see if we can get > >> > any info about where it came from possibly, with a kernel dump. > >> > > >> > Ian > >> > >> Hi Ian, > >> > >> Here are the debug logs from automount and a snapshot of the dentry > >> and inode of the "broken" directory (running 2.6.38). > >> > >> I tried to compare it with 2.6.37 (which works), but the structures > >> changed a lot between these kernels. > > > > A great deal changed in 2.6.38 in the VFS. > > > > You really can't use 2.6.38 because there were a number of fixes that > > went in after this and I think the VFS code up to around 3.0 or 3.1 > > could have had this problem anyway. > > I'm using 3.6.0-rc6 here. I collected the data from 2.6.38 because this > is the first kernel that shows the bug and I thought it would be easier > to isolate the fix this way (smaller distance). > > > What your looking for is kernels with fs/namei.c:follow_automount() with > > this near the top of the function, which shouldn't cause this type of > > problem: > > > > /* We don't want to mount if someone's just doing a stat - > > * unless they're stat'ing a directory and appended a '/' to > > * the name. > > * > > * We do, however, want to mount if someone wants to open or > > * create a file of any type under the mountpoint, wants to > > * traverse through the mountpoint or wants to open the > > * mounted directory. Also, autofs may mark negative dentries > > * as being automount points. These will need the attentions > > * of the daemon to instantiate them before they can be used. > > */ > > if (!(flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY | > > LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) && > > path->dentry->d_inode) > > return -EISDIR; > > > > But I'm seeing this with 3.4.9-2.fc16 which does have this in > > follow_automount() so that's a puzzle. > > > >> > >> ## starting up > >> > >> Sep 25 13:44:30 n1 automount[2337]: Starting automounter version > >> 5.0.7, master map auto.master > >> Sep 25 13:44:30 n1 automount[2337]: using kernel protocol version 5.02 > >> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_master: reading > >> master files auto.master > >> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init > >> gathered global options: (null) > >> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, > >> passing -n to mount > >> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, > >> passing -n to mount > >> Sep 25 13:44:30 n1 automount[2337]: lookup_read_master: lookup(file): > >> read entry /nfs > >> Sep 25 13:44:30 n1 automount[2337]: master_do_mount: mounting /nfs > >> Sep 25 13:44:30 n1 automount[2337]: automount_path_to_fifo: fifo name > >> /run/autofs.fifo-nfs > >> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_map: reading map > >> file /etc/auto.nfs > >> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init > >> gathered global options: (null) > >> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, > >> passing -n to mount > >> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, > >> passing -n to mount > >> Sep 25 13:44:30 n1 automount[2337]: mounted indirect on /nfs with > >> timeout 60, freq 15 seconds > >> Sep 25 13:44:30 n1 automount[2337]: st_ready: st_ready(): state = 0 path /nfs > >> Sep 25 13:44:30 n1 automount[2337]: ghosting enabled > >> > >> ## ls /nfs/v2 > >> > >> Sep 25 13:44:44 n1 automount[2337]: handle_packet: type = 3 > >> Sep 25 13:44:44 n1 automount[2337]: handle_packet_missing_indirect: > >> token 6, name v2, request pid 2352 > >> Sep 25 13:44:44 n1 automount[2337]: attempting to mount entry /nfs/v2 > >> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): looking up v2 > >> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): v2 -> > >> server:/nfs > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): expanded > >> entry: server:/nfs > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): gathered options: > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): > >> dequote("server:/nfs") -> server:/nfs > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): core of > >> entry: options=, loc=server:/nfs > >> Sep 25 13:44:44 n1 automount[2337]: sun_mount: parse(sun): mounting > >> root /nfs, mountpoint v2, what server:/nfs, fstype nfs, options (null) > >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): root=/nfs > >> name=v2 what=server:/nfs, fstype=nfs, options=(null) > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host > >> server(10.1.1.4) proto tcp version 0x30 > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.000214 > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000114 > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 164 weight 0 > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host > >> server(10.1.1.4) proto udp version 0x30 > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.001836 > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000095 > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 965 weight 0 > >> Sep 25 13:44:44 n1 automount[2337]: prune_host_list: selected subset > >> of hosts that support NFS3 over TCP > >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling > >> mkdir_path /nfs/v2 > >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling > >> mount -t nfs server:/nfs /nfs/v2 > >> Sep 25 13:44:44 n1 automount[2337]: spawn_mount: mtab link detected, > >> passing -n to mount > >> Sep 25 13:44:44 n1 checkproc: checkproc: Empty pid file > >> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd > >> Sep 25 13:44:44 n1 start_daemon: start_daemon: Empty pid file > >> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd > >> Sep 25 13:44:44 n1 rpc.statd[2369]: Version 1.2.6 starting > >> Sep 25 13:44:44 n1 rpc.statd[2369]: Flags: TI-RPC > >> Sep 25 13:44:44 n1 automount[2337]: >> Starting rpc.statd .....done > >> Sep 25 13:44:44 n1 automount[2337]: >> mount.nfs: access denied by > >> server while mounting server:/nfs > >> Sep 25 13:44:44 n1 automount[2337]: mount(nfs): nfs: mount failure > >> server:/nfs on /nfs/v2 > >> Sep 25 13:44:44 n1 automount[2337]: dev_ioctl_send_fail: token = 6 > >> Sep 25 13:44:44 n1 automount[2337]: failed to mount /nfs/v2 > >> Sep 25 13:44:54 n1 automount[2337]: st_expire: state 1 path /nfs > >> Sep 25 13:44:54 n1 automount[2337]: expire_proc: exp_proc = > >> 140484187907840 path /nfs > >> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: got thid > >> 140484187907840 path /nfs stat 0 > >> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: sigchld: exp > >> 140484187907840 finished, switching from 2 to 1 > >> Sep 25 13:44:54 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs > >> Sep 25 13:45:09 n1 automount[2337]: st_expire: state 1 path /nfs > >> Sep 25 13:45:09 n1 automount[2337]: expire_proc: exp_proc = > >> 140484187907840 path /nfs > >> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: got thid > >> 140484187907840 path /nfs stat 0 > >> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: sigchld: exp > >> 140484187907840 finished, switching from 2 to 1 > >> Sep 25 13:45:09 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs > >> > >> ## ls -la /nfs > >> > >> Sep 25 13:45:12 n1 automount[2337]: handle_packet: type = 3 > >> Sep 25 13:45:12 n1 automount[2337]: handle_packet_missing_indirect: > >> token 7, name v2, request pid 2397 > >> Sep 25 13:45:12 n1 automount[2337]: dev_ioctl_send_fail: token = 7 > > > > This callback shouldn't happen. > > > >> > >> dentry and inode (/nfs/v2) from a v2.6.38 kernel, after triggering > >> the mount: > > > > This certainly looks like an autofs dentry. > > > >> > >> struct dentry { > >> d_flags = 393216, > >> d_seq = { > >> sequence = 4 > >> }, > >> d_hash = { > >> next = 0x0, > >> pprev = 0xffffc900011f5680 > >> }, > > > > This looks a bit odd but I don't think it is the cause of the problem. > > > > Other than that these look ok. > > > > snip ... > > > > I setup a simple map, auto.test, with a single key "this" to check this > > out and did an "strace -ff -o ls.trace /bin/ls --color=auto /test" > > against the 3.4.9-2.fc16 I currently have installed. > > > > Ignoring all the cruft, near the bottom we see: > > stat64("/test", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0 > > openat(AT_FDCWD, "/test", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3 > > getdents64(3, /* 3 entries */, 32768) = 72 > > lstat64("/test/this", 0x95aba40) = -1 ENOENT (No such file or directory) > > This happens only after there was an attempt to mount /test/this > and the directory is already in the "incomplete" state, right? But the directory is complete before the failed mount and should remain so when using the "browse" option. You an see that by listing the contents of the directory before the mount attempt. > > > This is the problem I think, it should not attempt a mount and should > > return 0. > > My theory about the root cause is that when the mount fails, the > kernel forgets to "reset" the state of that dentry. Makes any sense? Not really. There's no need to reset anything, all the kernel does is do a path walk to obtain the existing dentry and fill in the information from it. But the lstat gets a fail on the walk and doesn't fill in the information so user space doesn't get it. The problem is the fact that there's a call back occurring which shouldn't happen so maybe something (perhaps the daemon) has removed the directory when it shouldn't have. That would cause a call back to happen and when it sees a fail the remove isn't done because of the browse option. Essentially an unmatched create/remove perhaps but with a prior incorrect remove somewhere. I looked a bit but can't see how that would happen either. > > > I've only looked at the current upstream source so far and I can't see > > why this is happening. > > > > I've built upstream 3.5.4 so I can look at the actual source of the > > kernel I'm using to test. > > Thanks for your help > Leonardo ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories 2012-09-27 1:04 ` Ian Kent @ 2012-09-27 4:35 ` Ian Kent 2012-09-27 18:35 ` Leonardo Chiquitto 0 siblings, 1 reply; 10+ messages in thread From: Ian Kent @ 2012-09-27 4:35 UTC (permalink / raw) To: Leonardo Chiquitto; +Cc: autofs, linux-fsdevel, miklos On Thu, 2012-09-27 at 09:04 +0800, Ian Kent wrote: > On Wed, 2012-09-26 at 16:39 -0300, Leonardo Chiquitto wrote: > > On Wed, Sep 26, 2012 at 6:15 AM, Ian Kent <raven@themaw.net> wrote: > > > On Tue, 2012-09-25 at 18:42 -0300, Leonardo Chiquitto wrote: > > >> On Mon, Sep 24, 2012 at 11:13 PM, Ian Kent <raven@themaw.net> wrote: > > >> > On Mon, 2012-09-24 at 17:14 -0300, Leonardo Chiquitto wrote: > > >> >> Hi, > > >> >> > > >> >> If you try to access an automounted directory and the mount operation fails, > > >> >> the attributes of the directory (owner, group, mode, access time) are lost. > > >> >> This happens only when AutoFS' browse mode (ghosting) is enabled. > > >> >> > > >> >> How to reproduce the problem: > > >> >> > > >> >> Make sure you have BROWSE_MODE="yes" in /etc/sysconfig/autofs. > > >> >> > > >> >> 1. Add a simple indirect mount map (it's required that the server does NOT > > >> >> export the /notexported directory): > > >> >> > > >> >> # cat /etc/auto.master > > >> >> /nfs /etc/auto.nfs > > >> >> > > >> >> # cat /etc/auto.nfs > > >> >> v2 server:/notexported > > >> >> > > >> >> 2. Restart AutoFS and try to access the directory: > > >> >> > > >> >> # cd /nfs/v2 > > >> >> -bash: cd: /nfs/v2: No such file or directory > > >> >> > > >> >> # ls -la /nfs > > >> >> ls: cannot access /nfs/v2: No such file or directory > > >> >> total 4 > > >> >> drwxr-xr-x 3 root root 0 Sep 24 21:43 . > > >> >> drwxr-xr-x 28 root root 4096 Sep 24 21:25 .. > > >> >> d????????? ? ? ? ? ? v2 > > >> >> > > >> >> This problem is reproducible on recent upstream kernels such as 3.6.0-rc6. > > >> >> I found out that the last kernel version that does not exhibit the bug is > > >> >> 2.6.37. Between 2.6.37 and 2.6.38 there were quite some changes in > > >> >> AutoFS/VFS and my attempts to bisect the problem were frustrated by > > >> >> early crashes. > > >> >> > > >> >> Any ideas? > > >> > > > >> > Not really. > > >> > > > >> > It would be best to start with an full autofs debug log so we can see > > >> > what is happening within automount and work back from there. We'll > > >> > probably need to look at the dentry itself as well to see if we can get > > >> > any info about where it came from possibly, with a kernel dump. > > >> > > > >> > Ian > > >> > > >> Hi Ian, > > >> > > >> Here are the debug logs from automount and a snapshot of the dentry > > >> and inode of the "broken" directory (running 2.6.38). > > >> > > >> I tried to compare it with 2.6.37 (which works), but the structures > > >> changed a lot between these kernels. > > > > > > A great deal changed in 2.6.38 in the VFS. > > > > > > You really can't use 2.6.38 because there were a number of fixes that > > > went in after this and I think the VFS code up to around 3.0 or 3.1 > > > could have had this problem anyway. > > > > I'm using 3.6.0-rc6 here. I collected the data from 2.6.38 because this > > is the first kernel that shows the bug and I thought it would be easier > > to isolate the fix this way (smaller distance). > > > > > What your looking for is kernels with fs/namei.c:follow_automount() with > > > this near the top of the function, which shouldn't cause this type of > > > problem: > > > > > > /* We don't want to mount if someone's just doing a stat - > > > * unless they're stat'ing a directory and appended a '/' to > > > * the name. > > > * > > > * We do, however, want to mount if someone wants to open or > > > * create a file of any type under the mountpoint, wants to > > > * traverse through the mountpoint or wants to open the > > > * mounted directory. Also, autofs may mark negative dentries > > > * as being automount points. These will need the attentions > > > * of the daemon to instantiate them before they can be used. > > > */ > > > if (!(flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY | > > > LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) && > > > path->dentry->d_inode) > > > return -EISDIR; > > > > > > But I'm seeing this with 3.4.9-2.fc16 which does have this in > > > follow_automount() so that's a puzzle. > > > > > >> > > >> ## starting up > > >> > > >> Sep 25 13:44:30 n1 automount[2337]: Starting automounter version > > >> 5.0.7, master map auto.master > > >> Sep 25 13:44:30 n1 automount[2337]: using kernel protocol version 5.02 > > >> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_master: reading > > >> master files auto.master > > >> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init > > >> gathered global options: (null) > > >> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, > > >> passing -n to mount > > >> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, > > >> passing -n to mount > > >> Sep 25 13:44:30 n1 automount[2337]: lookup_read_master: lookup(file): > > >> read entry /nfs > > >> Sep 25 13:44:30 n1 automount[2337]: master_do_mount: mounting /nfs > > >> Sep 25 13:44:30 n1 automount[2337]: automount_path_to_fifo: fifo name > > >> /run/autofs.fifo-nfs > > >> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_map: reading map > > >> file /etc/auto.nfs > > >> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init > > >> gathered global options: (null) > > >> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, > > >> passing -n to mount > > >> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, > > >> passing -n to mount > > >> Sep 25 13:44:30 n1 automount[2337]: mounted indirect on /nfs with > > >> timeout 60, freq 15 seconds > > >> Sep 25 13:44:30 n1 automount[2337]: st_ready: st_ready(): state = 0 path /nfs > > >> Sep 25 13:44:30 n1 automount[2337]: ghosting enabled > > >> > > >> ## ls /nfs/v2 > > >> > > >> Sep 25 13:44:44 n1 automount[2337]: handle_packet: type = 3 > > >> Sep 25 13:44:44 n1 automount[2337]: handle_packet_missing_indirect: > > >> token 6, name v2, request pid 2352 > > >> Sep 25 13:44:44 n1 automount[2337]: attempting to mount entry /nfs/v2 > > >> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): looking up v2 > > >> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): v2 -> > > >> server:/nfs > > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): expanded > > >> entry: server:/nfs > > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): gathered options: > > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): > > >> dequote("server:/nfs") -> server:/nfs > > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): core of > > >> entry: options=, loc=server:/nfs > > >> Sep 25 13:44:44 n1 automount[2337]: sun_mount: parse(sun): mounting > > >> root /nfs, mountpoint v2, what server:/nfs, fstype nfs, options (null) > > >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): root=/nfs > > >> name=v2 what=server:/nfs, fstype=nfs, options=(null) > > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host > > >> server(10.1.1.4) proto tcp version 0x30 > > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.000214 > > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000114 > > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 164 weight 0 > > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host > > >> server(10.1.1.4) proto udp version 0x30 > > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.001836 > > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000095 > > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 965 weight 0 > > >> Sep 25 13:44:44 n1 automount[2337]: prune_host_list: selected subset > > >> of hosts that support NFS3 over TCP > > >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling > > >> mkdir_path /nfs/v2 > > >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling > > >> mount -t nfs server:/nfs /nfs/v2 > > >> Sep 25 13:44:44 n1 automount[2337]: spawn_mount: mtab link detected, > > >> passing -n to mount > > >> Sep 25 13:44:44 n1 checkproc: checkproc: Empty pid file > > >> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd > > >> Sep 25 13:44:44 n1 start_daemon: start_daemon: Empty pid file > > >> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd > > >> Sep 25 13:44:44 n1 rpc.statd[2369]: Version 1.2.6 starting > > >> Sep 25 13:44:44 n1 rpc.statd[2369]: Flags: TI-RPC > > >> Sep 25 13:44:44 n1 automount[2337]: >> Starting rpc.statd .....done > > >> Sep 25 13:44:44 n1 automount[2337]: >> mount.nfs: access denied by > > >> server while mounting server:/nfs > > >> Sep 25 13:44:44 n1 automount[2337]: mount(nfs): nfs: mount failure > > >> server:/nfs on /nfs/v2 > > >> Sep 25 13:44:44 n1 automount[2337]: dev_ioctl_send_fail: token = 6 > > >> Sep 25 13:44:44 n1 automount[2337]: failed to mount /nfs/v2 > > >> Sep 25 13:44:54 n1 automount[2337]: st_expire: state 1 path /nfs > > >> Sep 25 13:44:54 n1 automount[2337]: expire_proc: exp_proc = > > >> 140484187907840 path /nfs > > >> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: got thid > > >> 140484187907840 path /nfs stat 0 > > >> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: sigchld: exp > > >> 140484187907840 finished, switching from 2 to 1 > > >> Sep 25 13:44:54 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs > > >> Sep 25 13:45:09 n1 automount[2337]: st_expire: state 1 path /nfs > > >> Sep 25 13:45:09 n1 automount[2337]: expire_proc: exp_proc = > > >> 140484187907840 path /nfs > > >> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: got thid > > >> 140484187907840 path /nfs stat 0 > > >> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: sigchld: exp > > >> 140484187907840 finished, switching from 2 to 1 > > >> Sep 25 13:45:09 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs > > >> > > >> ## ls -la /nfs > > >> > > >> Sep 25 13:45:12 n1 automount[2337]: handle_packet: type = 3 > > >> Sep 25 13:45:12 n1 automount[2337]: handle_packet_missing_indirect: > > >> token 7, name v2, request pid 2397 > > >> Sep 25 13:45:12 n1 automount[2337]: dev_ioctl_send_fail: token = 7 > > > > > > This callback shouldn't happen. > > > > > >> > > >> dentry and inode (/nfs/v2) from a v2.6.38 kernel, after triggering > > >> the mount: > > > > > > This certainly looks like an autofs dentry. > > > > > >> > > >> struct dentry { > > >> d_flags = 393216, > > >> d_seq = { > > >> sequence = 4 > > >> }, > > >> d_hash = { > > >> next = 0x0, > > >> pprev = 0xffffc900011f5680 > > >> }, > > > > > > This looks a bit odd but I don't think it is the cause of the problem. > > > > > > Other than that these look ok. > > > > > > snip ... > > > > > > I setup a simple map, auto.test, with a single key "this" to check this > > > out and did an "strace -ff -o ls.trace /bin/ls --color=auto /test" > > > against the 3.4.9-2.fc16 I currently have installed. > > > > > > Ignoring all the cruft, near the bottom we see: > > > stat64("/test", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0 > > > openat(AT_FDCWD, "/test", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3 > > > getdents64(3, /* 3 entries */, 32768) = 72 > > > lstat64("/test/this", 0x95aba40) = -1 ENOENT (No such file or directory) > > > > This happens only after there was an attempt to mount /test/this > > and the directory is already in the "incomplete" state, right? > > But the directory is complete before the failed mount and should remain > so when using the "browse" option. You an see that by listing the > contents of the directory before the mount attempt. > > > > > > This is the problem I think, it should not attempt a mount and should > > > return 0. > > > > My theory about the root cause is that when the mount fails, the > > kernel forgets to "reset" the state of that dentry. Makes any sense? > > Not really. There's no need to reset anything, all the kernel does is do > a path walk to obtain the existing dentry and fill in the information > from it. But the lstat gets a fail on the walk and doesn't fill in the > information so user space doesn't get it. > > The problem is the fact that there's a call back occurring which > shouldn't happen so maybe something (perhaps the daemon) has removed the > directory when it shouldn't have. That would cause a call back to happen > and when it sees a fail the remove isn't done because of the browse > option. Essentially an unmatched create/remove perhaps but with a prior > incorrect remove somewhere. I looked a bit but can't see how that would > happen either. I can confirm that there is no directory removal and re-create going on. Checking for that lead me to discover this mistake which fixes the symptom we're seeing: autofs4 - fix reset pending flag on mount fail From: Ian Kent <raven@themaw.net> In autofs4_d_automount(), if a mount fail occurs the AUTOFS_INF_PENDING mount pending flag is not cleared. One effect of this is when using the "browse" option, directory entry attributes show up with all "?"s due to the incorrect callback and subsequent failure return (when in fact no callback should be made). --- fs/autofs4/root.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c index 75e5f1c..8c4292f 100644 --- a/fs/autofs4/root.c +++ b/fs/autofs4/root.c @@ -392,10 +392,12 @@ static struct vfsmount *autofs4_d_automount(struct path *path) ino->flags |= AUTOFS_INF_PENDING; spin_unlock(&sbi->fs_lock); status = autofs4_mount_wait(dentry); - if (status) - return ERR_PTR(status); spin_lock(&sbi->fs_lock); ino->flags &= ~AUTOFS_INF_PENDING; + if (status) { + spin_unlock(&sbi->fs_lock); + return ERR_PTR(status); + } } done: if (!(ino->flags & AUTOFS_INF_EXPIRING)) { ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories 2012-09-27 4:35 ` Ian Kent @ 2012-09-27 18:35 ` Leonardo Chiquitto 2012-09-27 18:59 ` Michael Tokarev 0 siblings, 1 reply; 10+ messages in thread From: Leonardo Chiquitto @ 2012-09-27 18:35 UTC (permalink / raw) To: Ian Kent; +Cc: autofs, linux-fsdevel, miklos On Thu, Sep 27, 2012 at 1:35 AM, Ian Kent <raven@themaw.net> wrote: > On Thu, 2012-09-27 at 09:04 +0800, Ian Kent wrote: >> On Wed, 2012-09-26 at 16:39 -0300, Leonardo Chiquitto wrote: >> > On Wed, Sep 26, 2012 at 6:15 AM, Ian Kent <raven@themaw.net> wrote: >> > > On Tue, 2012-09-25 at 18:42 -0300, Leonardo Chiquitto wrote: >> > >> On Mon, Sep 24, 2012 at 11:13 PM, Ian Kent <raven@themaw.net> wrote: >> > >> > On Mon, 2012-09-24 at 17:14 -0300, Leonardo Chiquitto wrote: >> > >> >> Hi, >> > >> >> >> > >> >> If you try to access an automounted directory and the mount operation fails, >> > >> >> the attributes of the directory (owner, group, mode, access time) are lost. >> > >> >> This happens only when AutoFS' browse mode (ghosting) is enabled. >> > >> >> >> > >> >> How to reproduce the problem: >> > >> >> >> > >> >> Make sure you have BROWSE_MODE="yes" in /etc/sysconfig/autofs. >> > >> >> >> > >> >> 1. Add a simple indirect mount map (it's required that the server does NOT >> > >> >> export the /notexported directory): >> > >> >> >> > >> >> # cat /etc/auto.master >> > >> >> /nfs /etc/auto.nfs >> > >> >> >> > >> >> # cat /etc/auto.nfs >> > >> >> v2 server:/notexported >> > >> >> >> > >> >> 2. Restart AutoFS and try to access the directory: >> > >> >> >> > >> >> # cd /nfs/v2 >> > >> >> -bash: cd: /nfs/v2: No such file or directory >> > >> >> >> > >> >> # ls -la /nfs >> > >> >> ls: cannot access /nfs/v2: No such file or directory >> > >> >> total 4 >> > >> >> drwxr-xr-x 3 root root 0 Sep 24 21:43 . >> > >> >> drwxr-xr-x 28 root root 4096 Sep 24 21:25 .. >> > >> >> d????????? ? ? ? ? ? v2 >> > >> >> >> > >> >> This problem is reproducible on recent upstream kernels such as 3.6.0-rc6. >> > >> >> I found out that the last kernel version that does not exhibit the bug is >> > >> >> 2.6.37. Between 2.6.37 and 2.6.38 there were quite some changes in >> > >> >> AutoFS/VFS and my attempts to bisect the problem were frustrated by >> > >> >> early crashes. >> > >> >> >> > >> >> Any ideas? >> > >> > >> > >> > Not really. >> > >> > >> > >> > It would be best to start with an full autofs debug log so we can see >> > >> > what is happening within automount and work back from there. We'll >> > >> > probably need to look at the dentry itself as well to see if we can get >> > >> > any info about where it came from possibly, with a kernel dump. >> > >> > >> > >> > Ian >> > >> >> > >> Hi Ian, >> > >> >> > >> Here are the debug logs from automount and a snapshot of the dentry >> > >> and inode of the "broken" directory (running 2.6.38). >> > >> >> > >> I tried to compare it with 2.6.37 (which works), but the structures >> > >> changed a lot between these kernels. >> > > >> > > A great deal changed in 2.6.38 in the VFS. >> > > >> > > You really can't use 2.6.38 because there were a number of fixes that >> > > went in after this and I think the VFS code up to around 3.0 or 3.1 >> > > could have had this problem anyway. >> > >> > I'm using 3.6.0-rc6 here. I collected the data from 2.6.38 because this >> > is the first kernel that shows the bug and I thought it would be easier >> > to isolate the fix this way (smaller distance). >> > >> > > What your looking for is kernels with fs/namei.c:follow_automount() with >> > > this near the top of the function, which shouldn't cause this type of >> > > problem: >> > > >> > > /* We don't want to mount if someone's just doing a stat - >> > > * unless they're stat'ing a directory and appended a '/' to >> > > * the name. >> > > * >> > > * We do, however, want to mount if someone wants to open or >> > > * create a file of any type under the mountpoint, wants to >> > > * traverse through the mountpoint or wants to open the >> > > * mounted directory. Also, autofs may mark negative dentries >> > > * as being automount points. These will need the attentions >> > > * of the daemon to instantiate them before they can be used. >> > > */ >> > > if (!(flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY | >> > > LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) && >> > > path->dentry->d_inode) >> > > return -EISDIR; >> > > >> > > But I'm seeing this with 3.4.9-2.fc16 which does have this in >> > > follow_automount() so that's a puzzle. >> > > >> > >> >> > >> ## starting up >> > >> >> > >> Sep 25 13:44:30 n1 automount[2337]: Starting automounter version >> > >> 5.0.7, master map auto.master >> > >> Sep 25 13:44:30 n1 automount[2337]: using kernel protocol version 5.02 >> > >> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_master: reading >> > >> master files auto.master >> > >> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init >> > >> gathered global options: (null) >> > >> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, >> > >> passing -n to mount >> > >> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, >> > >> passing -n to mount >> > >> Sep 25 13:44:30 n1 automount[2337]: lookup_read_master: lookup(file): >> > >> read entry /nfs >> > >> Sep 25 13:44:30 n1 automount[2337]: master_do_mount: mounting /nfs >> > >> Sep 25 13:44:30 n1 automount[2337]: automount_path_to_fifo: fifo name >> > >> /run/autofs.fifo-nfs >> > >> Sep 25 13:44:30 n1 automount[2337]: lookup_nss_read_map: reading map >> > >> file /etc/auto.nfs >> > >> Sep 25 13:44:30 n1 automount[2337]: parse_init: parse(sun): init >> > >> gathered global options: (null) >> > >> Sep 25 13:44:30 n1 automount[2337]: spawn_mount: mtab link detected, >> > >> passing -n to mount >> > >> Sep 25 13:44:30 n1 automount[2337]: spawn_umount: mtab link detected, >> > >> passing -n to mount >> > >> Sep 25 13:44:30 n1 automount[2337]: mounted indirect on /nfs with >> > >> timeout 60, freq 15 seconds >> > >> Sep 25 13:44:30 n1 automount[2337]: st_ready: st_ready(): state = 0 path /nfs >> > >> Sep 25 13:44:30 n1 automount[2337]: ghosting enabled >> > >> >> > >> ## ls /nfs/v2 >> > >> >> > >> Sep 25 13:44:44 n1 automount[2337]: handle_packet: type = 3 >> > >> Sep 25 13:44:44 n1 automount[2337]: handle_packet_missing_indirect: >> > >> token 6, name v2, request pid 2352 >> > >> Sep 25 13:44:44 n1 automount[2337]: attempting to mount entry /nfs/v2 >> > >> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): looking up v2 >> > >> Sep 25 13:44:44 n1 automount[2337]: lookup_mount: lookup(file): v2 -> >> > >> server:/nfs >> > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): expanded >> > >> entry: server:/nfs >> > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): gathered options: >> > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): >> > >> dequote("server:/nfs") -> server:/nfs >> > >> Sep 25 13:44:44 n1 automount[2337]: parse_mount: parse(sun): core of >> > >> entry: options=, loc=server:/nfs >> > >> Sep 25 13:44:44 n1 automount[2337]: sun_mount: parse(sun): mounting >> > >> root /nfs, mountpoint v2, what server:/nfs, fstype nfs, options (null) >> > >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): root=/nfs >> > >> name=v2 what=server:/nfs, fstype=nfs, options=(null) >> > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host >> > >> server(10.1.1.4) proto tcp version 0x30 >> > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.000214 >> > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000114 >> > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 164 weight 0 >> > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: called with host >> > >> server(10.1.1.4) proto udp version 0x30 >> > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v3 rpc ping time: 0.001836 >> > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: nfs v2 rpc ping time: 0.000095 >> > >> Sep 25 13:44:44 n1 automount[2337]: get_nfs_info: host server cost 965 weight 0 >> > >> Sep 25 13:44:44 n1 automount[2337]: prune_host_list: selected subset >> > >> of hosts that support NFS3 over TCP >> > >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling >> > >> mkdir_path /nfs/v2 >> > >> Sep 25 13:44:44 n1 automount[2337]: mount_mount: mount(nfs): calling >> > >> mount -t nfs server:/nfs /nfs/v2 >> > >> Sep 25 13:44:44 n1 automount[2337]: spawn_mount: mtab link detected, >> > >> passing -n to mount >> > >> Sep 25 13:44:44 n1 checkproc: checkproc: Empty pid file >> > >> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd >> > >> Sep 25 13:44:44 n1 start_daemon: start_daemon: Empty pid file >> > >> /var/run/rpc.statd.pid for /usr/sbin/rpc.statd >> > >> Sep 25 13:44:44 n1 rpc.statd[2369]: Version 1.2.6 starting >> > >> Sep 25 13:44:44 n1 rpc.statd[2369]: Flags: TI-RPC >> > >> Sep 25 13:44:44 n1 automount[2337]: >> Starting rpc.statd .....done >> > >> Sep 25 13:44:44 n1 automount[2337]: >> mount.nfs: access denied by >> > >> server while mounting server:/nfs >> > >> Sep 25 13:44:44 n1 automount[2337]: mount(nfs): nfs: mount failure >> > >> server:/nfs on /nfs/v2 >> > >> Sep 25 13:44:44 n1 automount[2337]: dev_ioctl_send_fail: token = 6 >> > >> Sep 25 13:44:44 n1 automount[2337]: failed to mount /nfs/v2 >> > >> Sep 25 13:44:54 n1 automount[2337]: st_expire: state 1 path /nfs >> > >> Sep 25 13:44:54 n1 automount[2337]: expire_proc: exp_proc = >> > >> 140484187907840 path /nfs >> > >> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: got thid >> > >> 140484187907840 path /nfs stat 0 >> > >> Sep 25 13:44:54 n1 automount[2337]: expire_cleanup: sigchld: exp >> > >> 140484187907840 finished, switching from 2 to 1 >> > >> Sep 25 13:44:54 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs >> > >> Sep 25 13:45:09 n1 automount[2337]: st_expire: state 1 path /nfs >> > >> Sep 25 13:45:09 n1 automount[2337]: expire_proc: exp_proc = >> > >> 140484187907840 path /nfs >> > >> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: got thid >> > >> 140484187907840 path /nfs stat 0 >> > >> Sep 25 13:45:09 n1 automount[2337]: expire_cleanup: sigchld: exp >> > >> 140484187907840 finished, switching from 2 to 1 >> > >> Sep 25 13:45:09 n1 automount[2337]: st_ready: st_ready(): state = 2 path /nfs >> > >> >> > >> ## ls -la /nfs >> > >> >> > >> Sep 25 13:45:12 n1 automount[2337]: handle_packet: type = 3 >> > >> Sep 25 13:45:12 n1 automount[2337]: handle_packet_missing_indirect: >> > >> token 7, name v2, request pid 2397 >> > >> Sep 25 13:45:12 n1 automount[2337]: dev_ioctl_send_fail: token = 7 >> > > >> > > This callback shouldn't happen. >> > > >> > >> >> > >> dentry and inode (/nfs/v2) from a v2.6.38 kernel, after triggering >> > >> the mount: >> > > >> > > This certainly looks like an autofs dentry. >> > > >> > >> >> > >> struct dentry { >> > >> d_flags = 393216, >> > >> d_seq = { >> > >> sequence = 4 >> > >> }, >> > >> d_hash = { >> > >> next = 0x0, >> > >> pprev = 0xffffc900011f5680 >> > >> }, >> > > >> > > This looks a bit odd but I don't think it is the cause of the problem. >> > > >> > > Other than that these look ok. >> > > >> > > snip ... >> > > >> > > I setup a simple map, auto.test, with a single key "this" to check this >> > > out and did an "strace -ff -o ls.trace /bin/ls --color=auto /test" >> > > against the 3.4.9-2.fc16 I currently have installed. >> > > >> > > Ignoring all the cruft, near the bottom we see: >> > > stat64("/test", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0 >> > > openat(AT_FDCWD, "/test", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3 >> > > getdents64(3, /* 3 entries */, 32768) = 72 >> > > lstat64("/test/this", 0x95aba40) = -1 ENOENT (No such file or directory) >> > >> > This happens only after there was an attempt to mount /test/this >> > and the directory is already in the "incomplete" state, right? >> >> But the directory is complete before the failed mount and should remain >> so when using the "browse" option. You an see that by listing the >> contents of the directory before the mount attempt. >> >> > >> > > This is the problem I think, it should not attempt a mount and should >> > > return 0. >> > >> > My theory about the root cause is that when the mount fails, the >> > kernel forgets to "reset" the state of that dentry. Makes any sense? >> >> Not really. There's no need to reset anything, all the kernel does is do >> a path walk to obtain the existing dentry and fill in the information >> from it. But the lstat gets a fail on the walk and doesn't fill in the >> information so user space doesn't get it. >> >> The problem is the fact that there's a call back occurring which >> shouldn't happen so maybe something (perhaps the daemon) has removed the >> directory when it shouldn't have. That would cause a call back to happen >> and when it sees a fail the remove isn't done because of the browse >> option. Essentially an unmatched create/remove perhaps but with a prior >> incorrect remove somewhere. I looked a bit but can't see how that would >> happen either. > > I can confirm that there is no directory removal and re-create going on. > > Checking for that lead me to discover this mistake which fixes the > symptom we're seeing: > > autofs4 - fix reset pending flag on mount fail > > From: Ian Kent <raven@themaw.net> > > In autofs4_d_automount(), if a mount fail occurs the AUTOFS_INF_PENDING > mount pending flag is not cleared. > > One effect of this is when using the "browse" option, directory entry > attributes show up with all "?"s due to the incorrect callback and > subsequent failure return (when in fact no callback should be made). Reported-and-tested-by: Leonardo Chiquitto <lchiquitto@suse.de> Ian, I just tested the patch here and confirmed it works perfectly. Thanks! Leonardo > fs/autofs4/root.c | 6 ++++-- > 1 files changed, 4 insertions(+), 2 deletions(-) > > > diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c > index 75e5f1c..8c4292f 100644 > --- a/fs/autofs4/root.c > +++ b/fs/autofs4/root.c > @@ -392,10 +392,12 @@ static struct vfsmount *autofs4_d_automount(struct path *path) > ino->flags |= AUTOFS_INF_PENDING; > spin_unlock(&sbi->fs_lock); > status = autofs4_mount_wait(dentry); > - if (status) > - return ERR_PTR(status); > spin_lock(&sbi->fs_lock); > ino->flags &= ~AUTOFS_INF_PENDING; > + if (status) { > + spin_unlock(&sbi->fs_lock); > + return ERR_PTR(status); > + } > } > done: > if (!(ino->flags & AUTOFS_INF_EXPIRING)) { > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories 2012-09-27 18:35 ` Leonardo Chiquitto @ 2012-09-27 18:59 ` Michael Tokarev 2012-10-09 11:08 ` Leonardo Chiquitto 0 siblings, 1 reply; 10+ messages in thread From: Michael Tokarev @ 2012-09-27 18:59 UTC (permalink / raw) To: Leonardo Chiquitto; +Cc: Ian Kent, autofs, linux-fsdevel, miklos On 27.09.2012 22:35, Leonardo Chiquitto wrote: > On Thu, Sep 27, 2012 at 1:35 AM, Ian Kent <raven@themaw.net> wrote: [] >> I can confirm that there is no directory removal and re-create going on. >> >> Checking for that lead me to discover this mistake which fixes the >> symptom we're seeing: >> >> autofs4 - fix reset pending flag on mount fail >> >> From: Ian Kent <raven@themaw.net> >> >> In autofs4_d_automount(), if a mount fail occurs the AUTOFS_INF_PENDING >> mount pending flag is not cleared. >> >> One effect of this is when using the "browse" option, directory entry >> attributes show up with all "?"s due to the incorrect callback and >> subsequent failure return (when in fact no callback should be made). > > Reported-and-tested-by: Leonardo Chiquitto <lchiquitto@suse.de> > > Ian, > > I just tested the patch here and confirmed it works perfectly. Is it a -stable (kernel) material? This and the previous change, the locking logic fix/rework? Both appears to be needed for all current 3.x stable trees... Thanks, /mjt >> fs/autofs4/root.c | 6 ++++-- >> 1 files changed, 4 insertions(+), 2 deletions(-) >> >> >> diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c >> index 75e5f1c..8c4292f 100644 >> --- a/fs/autofs4/root.c >> +++ b/fs/autofs4/root.c >> @@ -392,10 +392,12 @@ static struct vfsmount *autofs4_d_automount(struct path *path) >> ino->flags |= AUTOFS_INF_PENDING; >> spin_unlock(&sbi->fs_lock); >> status = autofs4_mount_wait(dentry); >> - if (status) >> - return ERR_PTR(status); >> spin_lock(&sbi->fs_lock); >> ino->flags &= ~AUTOFS_INF_PENDING; >> + if (status) { >> + spin_unlock(&sbi->fs_lock); >> + return ERR_PTR(status); >> + } >> } >> done: >> if (!(ino->flags & AUTOFS_INF_EXPIRING)) { ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: When ghosting is enabled, failed mounts by AutoFS leave broken directories 2012-09-27 18:59 ` Michael Tokarev @ 2012-10-09 11:08 ` Leonardo Chiquitto 0 siblings, 0 replies; 10+ messages in thread From: Leonardo Chiquitto @ 2012-10-09 11:08 UTC (permalink / raw) To: Michael Tokarev; +Cc: Ian Kent, autofs, linux-fsdevel, miklos On Thu, Sep 27, 2012 at 3:59 PM, Michael Tokarev <mjt@tls.msk.ru> wrote: > On 27.09.2012 22:35, Leonardo Chiquitto wrote: >> On Thu, Sep 27, 2012 at 1:35 AM, Ian Kent <raven@themaw.net> wrote: > [] >>> I can confirm that there is no directory removal and re-create going on. >>> >>> Checking for that lead me to discover this mistake which fixes the >>> symptom we're seeing: >>> >>> autofs4 - fix reset pending flag on mount fail >>> >>> From: Ian Kent <raven@themaw.net> >>> >>> In autofs4_d_automount(), if a mount fail occurs the AUTOFS_INF_PENDING >>> mount pending flag is not cleared. >>> >>> One effect of this is when using the "browse" option, directory entry >>> attributes show up with all "?"s due to the incorrect callback and >>> subsequent failure return (when in fact no callback should be made). >> >> Reported-and-tested-by: Leonardo Chiquitto <lchiquitto@suse.de> >> >> Ian, >> >> I just tested the patch here and confirmed it works perfectly. > > Is it a -stable (kernel) material? This and the previous change, > the locking logic fix/rework? Both appears to be needed for all > current 3.x stable trees... IMO, at least this one should be submitted to stable after it reaches mainline. Leonardo >>> fs/autofs4/root.c | 6 ++++-- >>> 1 files changed, 4 insertions(+), 2 deletions(-) >>> >>> >>> diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c >>> index 75e5f1c..8c4292f 100644 >>> --- a/fs/autofs4/root.c >>> +++ b/fs/autofs4/root.c >>> @@ -392,10 +392,12 @@ static struct vfsmount *autofs4_d_automount(struct path *path) >>> ino->flags |= AUTOFS_INF_PENDING; >>> spin_unlock(&sbi->fs_lock); >>> status = autofs4_mount_wait(dentry); >>> - if (status) >>> - return ERR_PTR(status); >>> spin_lock(&sbi->fs_lock); >>> ino->flags &= ~AUTOFS_INF_PENDING; >>> + if (status) { >>> + spin_unlock(&sbi->fs_lock); >>> + return ERR_PTR(status); >>> + } >>> } >>> done: >>> if (!(ino->flags & AUTOFS_INF_EXPIRING)) { ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2012-10-09 11:08 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-09-24 20:14 When ghosting is enabled, failed mounts by AutoFS leave broken directories Leonardo Chiquitto 2012-09-25 2:13 ` Ian Kent 2012-09-25 21:42 ` Leonardo Chiquitto 2012-09-26 9:15 ` Ian Kent 2012-09-26 19:39 ` Leonardo Chiquitto 2012-09-27 1:04 ` Ian Kent 2012-09-27 4:35 ` Ian Kent 2012-09-27 18:35 ` Leonardo Chiquitto 2012-09-27 18:59 ` Michael Tokarev 2012-10-09 11:08 ` Leonardo Chiquitto
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).