From: Ian Kent <raven@themaw.net>
To: "Erwan Loaëc" <erwan.loaec@cgin.fr>
Cc: autofs@linux.kernel.org
Subject: Re: Problem of concurrent mount/umount call
Date: Wed, 27 Oct 2010 10:10:13 +0800 [thread overview]
Message-ID: <1288145413.3188.7.camel@localhost> (raw)
In-Reply-To: <4CC6D613.2050502@cgin.fr>
On Tue, 2010-10-26 at 15:22 +0200, Erwan Loaëc wrote:
> Hi,
>
> First, thank you for spending time to analyse the problem.
>
> I've apply the patch in our dev environnement, and defined a slow
> timeout for the mount (60s)
>
> FYI, in production, in order to limit this problem, the timeout is set
> to 2 hours.
>
> When it will be possible, I'll update every autofs. The problem is quite
> rare, so I won't be able to attest that the problem is solve or not
> immediatly.
Right, that sounds sensible since I can't work out how it gets through.
I also need to know that the change doesn't cause any unwanted side
effects so let me know how things go.
Adding an is_mounted() check in a frequently called function like this
can have an overhead if your using the old ioctl interface, which you
are by the look of the log. This should only show up if you have a
largish number of mounts. How many is a large number is very much site
dependent but if the CPU usage of the daemon is acceptable to you then
you don't need to worry about it.
>
> In addition, can you tell me why the patch is "autofs-5.0.3 - fix expire
> race" and not autofs-5.0.5 ?
That's just a typo, I'll rename the patch before committing it.
>
> Thanks,
>
> Erwan
>
>
> Ian Kent wrote:
> > On Tue, 2010-10-26 at 09:22 +0200, Erwan Loaëc wrote:
> >> Sorry for my previous mail with missing parts...
> >>
> >>
> >> Ian Kent wrote:
> >>> On Fri, 2010-10-22 at 17:19 +0200, Erwan Loaëc wrote:
> >>>> Hello,
> >>>>
> >>>> Sorry for the time, but before posting again I've upgrade every
> >>>> production servers with newest autofs4 module (with last patch) and last
> >>> What does "newest autofs4 module" mean exactly?
> >>
> >> Yes it is not the "newest module" but the module recompiled with the
> >> patch autofs4-2.6.26-v5-update-20090903.patch
> >
> > OK.
> >
> >>>> automount with all patch EXCEPT these:
> >>>>
> >>>> autofs-5.0.5-fix-restart.patch
> >>>> autofs-5.0.5-fix-status-privilege-error.patch
> >>>> autofs-5.0.4-always-read-file-maps-mount-lookup-map-read-fix.patch
> >>>> autofs-5.0.5-fix-direct-map-not-updating-on-reread.patch
> >>>> autofs-5.0.5-add-external-bind-method.patch
> >>>> autofs-5.0.5-fix-add-simple-bind-auth.patch
> >>>> autofs-5.0.5-add-dump-maps-option.patch
> >>>> autofs-5.0.5-fix-submount-shutdown-wait.patch
> >>>>
> >>>>
> >>>> Today I had the same behaviour than the issue explained my previous mail.
> >>>>
> >>>> --------------------
> >>>> Oct 22 16:44:06 SERVERNAME automount[2665]: umount_autofs_offset:
> >>>> couldn't get ioctl fd for offset /cifs/XXXXXXX/volume: No such file or
> >>>> directory
> >>>> Oct 22 16:44:06 SERVERNAME automount[2665]:
> >>>> handle_packet_missing_direct:1363: can't find map entry for (20,3548545)
> >>>> --------------------
> >>> This could be caused by a umount returning success when in fact it
> >>> didn't succeed with the umount. Are you sure umount is returning correct
> >>> status?
> >> Unfortunaly the last case occured on production server with debug
> >> disable. I can't find more information in logfile...
> >>
> >> But, as I've explained in my previous mail, this could logical with the
> >> bad sequence found in my previous case:
> >>
> >> *Call to the share /cifs/XXXXXXX/volume
> >> *Mount /cifs/XXXXXXX/volume expiring
> >> *New call to the share /cifs/XXXXXXX/volume
> >> *Umount /cifs/XXXXXXX/volume
> >
> > But that isn't quite what the problem is.
> >
> > I've had a closer look at the log and:
> >
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet_expire_direct: token 1526, name /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: expiring path /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi: path /cifs/SERV2/TM_termoz incl 1
> > Aug 26 10:11:42 bacchus automount[17827]: umount_subtree_mounts: unmounting dir = /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: expired /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: ioctl_send_ready: token = 1526
> >
> > /cifs/SERV2/TM_termoz is umounted.
> >
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 5
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet_missing_direct: token 1527, name /cifs/SERV2/TM_termoz, request pid 29523
> > Aug 26 10:11:42 bacchus automount[17827]: attempting to mount entry /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: lookup_mount: lookup(program): /cifs/SERV2/TM_termoz -> -fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile ://SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: parse_mount: parse(sun): expanded entry: -fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile ://SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: parse_mount: parse(sun): gathered options: fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
> > Aug 26 10:11:42 bacchus automount[17827]: sun_mount: parse(sun): mounting root /tmp/auto1I7gB7, mountpoint /cifs/SERV2/TM_termoz, what //SERV2/TM_termoz, fstype cifs, options file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
> > Aug 26 10:11:42 bacchus automount[17827]: do_mount: //SERV2/TM_termoz /cifs/SERV2/TM_termoz type cifs options file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile using module generic
> > Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): calling mkdir_path /tmp/auto1I7gB7
> > Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): calling mount -t cifs -s -o file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile //SERV2/TM_termoz /tmp/auto1I7gB7
> > Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): mounted //SERV2/TM_termoz type cifs on /tmp/auto1I7gB7
> > Aug 26 10:11:42 bacchus automount[17827]: move_mount: moved mount tree from /tmp/auto1I7gB7 to /cifs/SERV2/TM_termoz
> >
> > And is then mounted again here, but the request is not complete and
> > we get an expire, which means that the kernel saw the tree of mounts
> > below /cifs/SERV2 as not busy. This means the dentry was selected
> > for expire before the mount, and in fact before the path walk which
> > caused it even started.
> >
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 4
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet_expire_indirect: token 1528, name SERV2
> > Aug 26 10:11:42 bacchus automount[17827]: expiring path /cifs/SERV2
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi: path /cifs/SERV2 incl 1
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/01_MARKETING
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/01_MARKETING
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/02_ADMINISTRATIF
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/02_ADMINISTRATIF
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/03_EXPLOITATION
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/03_EXPLOITATION
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/04_SERVICE-CLIENTS
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/04_SERVICE-CLIENTS
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/05_TELECHARGEMENTS
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/05_TELECHARGEMENTS
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/06_UTILISATEURS
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/06_UTILISATEURS
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/07_PRODUCTION
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/07_PRODUCTION
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/08_BDD
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/08_BDD
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/ADMIN$
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/ADMIN$
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/C$
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/C$
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/CGIN
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/CGIN
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/D$
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/D$
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/F$
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/F$
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/ImageMagick-6.3.3V6
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/ImageMagick-6.3.3V6
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: umount_autofs_offset: couldn't get ioctl fd for offset /cifs/SERV2/TM_termoz: No such file or directory
> > Aug 26 10:11:42 bacchus automount[17827]: forcing umount of offset mount /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: failed to remove dir /cifs/SERV2/TM_termoz: Device or resource busy
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/Projets
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/Projets
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/Projets_Air
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/Projets_Air
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/termozs
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/termozs
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/packages
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/packages
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/powervault
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/powervault
> > Aug 26 10:11:42 bacchus automount[17827]: umount_multi_triggers: umount offset /cifs/SERV2/temp
> > Aug 26 10:11:42 bacchus automount[17827]: umounted offset mount /cifs/SERV2/temp
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/01_MARKETING
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/02_ADMINISTRATIF
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/03_EXPLOITATION
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/04_SERVICE-CLIENTS
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/05_TELECHARGEMENTS
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/06_UTILISATEURS
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/07_PRODUCTION
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/08_BDD
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/ADMIN$
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/C$
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/CGIN
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/D$
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/F$
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/ImageMagick-6.3.3V6
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/TM_termoz
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/Projets
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/Projets_Air
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/termozs
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/packages
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/powervault
> > Aug 26 10:11:42 bacchus automount[17827]: cache_delete_offset_list: deleting offset key /cifs/SERV2/temp
> >
> > Now we are trying to umount the tree of autofs trigger mounts but one of
> > the trigger mounts now has a mount on it which causes the problem.
> >
> > Aug 26 10:11:42 bacchus automount[17827]: ioctl_send_ready: token = 1527
> > Aug 26 10:11:42 bacchus automount[17827]: mounted /cifs/SERV2/TM_termoz
> >
> > Now the mount completes.
> >
> > It is the mount not completing before the expire which causes autofs to
> > not recognize that the tree has become busy which should prevent it from
> > attempting to umount it.
> >
> > The check to determine if the umount of the tree can go ahead is done by
> > checking if any of these autofs trigger mounts have a open file handle
> > (which is used for control operations when they have a mount on top of
> > them). But, during the mount, the file handle is not set until the time
> > we see the "ioctl_send_ready: token = 1527" message so it is missed by
> > the expire check.
> >
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 5
> > Aug 26 10:11:42 bacchus automount[17827]: rm_unwanted_fn: removing directory /cifs/SERV2
> > Aug 26 10:11:42 bacchus automount[17827]: unable to remove directory /cifs/SERV2: Directory not empty
> > Aug 26 10:11:42 bacchus automount[17827]: expired /cifs/SERV2
> > Aug 26 10:11:42 bacchus automount[17827]: handle_packet_missing_direct:1323: can't find map entry for (46,12396497)
> >
> > If we ever get any of these then it is game over.
> > They are fatal and, although you will see a hang because a lock is not
> > released, you must start over anyway. There is no other option.
> >
> > I don't know why the above expire is getting through, it shouldn't, but
> > user space does need to handle this case anyway.
> >
> > I've done some basic testing and this still needs work (like a problem
> > description) but I'm interested to know if this helps.
> >
> > autofs-5.0.3 - fix expire race
> >
> > From: Ian Kent <raven@themaw.net>
> >
> >
> > ---
> >
> > lib/mounts.c | 5 ++++-
> > 1 files changed, 4 insertions(+), 1 deletions(-)
> >
> >
> > diff --git a/lib/mounts.c b/lib/mounts.c
> > index 4c44982..f26579e 100644
> > --- a/lib/mounts.c
> > +++ b/lib/mounts.c
> > @@ -1525,8 +1525,11 @@ int umount_multi_triggers(struct autofs_point *ap, struct mapent *me, char *root
> > oe_base = oe->key + strlen(root);
> > left += umount_multi_triggers(ap, oe, root, oe_base);
> >
> > - if (oe->ioctlfd != -1)
> > + if (oe->ioctlfd != -1 ||
> > + is_mounted(_PROC_MOUNTS, oe->key, MNTS_REAL)) {
> > left++;
> > + break;
> > + }
> > }
> >
> > if (left)
> >
> >
>
> _______________________________________________
> autofs mailing list
> autofs@linux.kernel.org
> http://linux.kernel.org/mailman/listinfo/autofs
_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
next prev parent reply other threads:[~2010-10-27 2:10 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-08-26 10:39 Problem of concurrent mount/umount call Erwan Loaëc
2010-09-20 1:58 ` Ian Kent
2010-09-20 2:08 ` Ian Kent
2010-10-22 15:19 ` Erwan Loaëc
2010-10-26 0:10 ` Ian Kent
2010-10-26 0:18 ` Ian Kent
2010-10-26 6:59 ` Erwan Loaëc
2010-10-26 7:22 ` Erwan Loaëc
2010-10-26 12:36 ` Ian Kent
2010-10-26 13:22 ` Erwan Loaëc
2010-10-27 2:10 ` Ian Kent [this message]
2010-12-07 8:54 ` Erwan Loaëc
2010-12-08 2:42 ` Ian Kent
2011-02-17 11:01 ` Erwan Loaëc
2011-02-19 2:56 ` Ian Kent
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1288145413.3188.7.camel@localhost \
--to=raven@themaw.net \
--cc=autofs@linux.kernel.org \
--cc=erwan.loaec@cgin.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.