* [patch] buffer overflow, failure to auto-dismount, log file diarrhea
@ 2003-11-21 18:56 Jim Carter
2003-11-22 8:20 ` Ian Kent
0 siblings, 1 reply; 17+ messages in thread
From: Jim Carter @ 2003-11-21 18:56 UTC (permalink / raw)
To: autofs
Here at UCLA-Mathnet we use the automounter extensively, and we have had
a couple of associated problems:
A. Our NFS mounted filesystems invariably require submounts, and they
are not auto-dismounted. Ever. This proved to be due to a duplicated
slash between the main and sub mount points. Once this was fixed,
auto-dismount started working. There is a rumor that a related fix was
posted somewhere, but I couldn't find it.
B. If the daemon is hit with SIGUSR1, it goes into an infinite loop
trying unsuccessfully to dismount eligible filesystems, spitting out
typically 1000 syslog messages over 2 seconds until item C (below)
supervenes. I put in both a rate throttle (20/second) and a dynamic
limit on the number of dismounts.
C. Upon auto-dismount or SIGUSR1 looping, st_prepare_shutdown is called
when ap.state != ST_READY and an assertion fails, killing the thread.
I changed it to die on ST_SHUTDOWN_PENDING, i.e. a recursive call. I'm
not 100% sure that this is the correct contingency, but automount does
dismount the unused filesystems and does exit.
D. It would appear that a maliciously constructed directory name could
overflow a buffer in several places, at least causing denial of service
and possibly allowing the execution of code. Perhaps other O.S. limits
on the length of a filename (PATH_MAX) invariably protect the daemon
from this exposure, but defense in depth in this area seems both
warranted and not burdensome. I changed sprintf to snprintf wherever
occurring, and the subroutine which joins the dirname and basename
checks the buffer size.
E. There were a few cases where size_t and int were mixed together,
causing compiler warnings. As there have been exploits against mixed
signed-unsigned variables, I took the opportunity to fix this issue.
On one machine we have been running the patched automount for a week,
and all Linux boxes have had it in production for 48 hours. No peculiar
log messages or crashes have been seen. In one test, various random NFS
filesystems were automounted and allowed to time out, with varying
numbers of filesystems simultaneously mounted. Almost 500
mount-autodismount pairs were done; processes always exited when they
should, and no error messages appeared. There's a pretty good chance
that this patch is working.
The patches follow. They are against autofs-4.0.0pre10, which is the
version distributed with SuSE 8.2, the distro we are using.
James F. Carter Voice 310 825 2897 FAX 310 206 6673
UCLA-Mathnet; 6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu http://www.math.ucla.edu/~jimc (q.v. for PGP key)
diff -r -u autofs-4.0.0pre10.orig/daemon/automount.c autofs-4.0.0pre10/daemon/automount.c
--- autofs-4.0.0pre10.orig/daemon/automount.c 2001-03-27 21:08:23.000000000 -0800
+++ autofs-4.0.0pre10/daemon/automount.c 2003-11-19 13:39:39.000000000 -0800
@@ -30,6 +30,7 @@
#include <stdlib.h>
#include <string.h>
#include <syslog.h>
+#include <time.h>
#include <unistd.h>
#include <mntent.h>
#include <sys/ioctl.h>
@@ -106,8 +107,8 @@
int pipefd; /* File descriptor for pipe */
int ioctlfd; /* File descriptor for ioctls */
dev_t dev; /* "Device" number assigned by kernel */
- unsigned exp_timeout; /* Timeout for expiring mounts */
- unsigned exp_runfreq; /* Frequency for polling for timeouts */
+ time_t exp_timeout; /* Timeout for expiring mounts */
+ time_t exp_runfreq; /* Frequency for polling for timeouts */
volatile pid_t exp_process; /* Process that is currently expiring */
volatile struct pending_mount *mounts; /* Pending mount queue */
struct lookup_mod *lookup; /* Lookup module */
@@ -123,6 +124,50 @@
static void cleanup_exit(int exit_code);
static int handle_packet_expire(const struct autofs_packet_expire *pkt);
+/* sum = "dir/base" with attention to buffer overflows, and multiple slashes
+ at the joint are avoided. */
+static void cat_path(char* sum, size_t szsum, const char* dir, const char* base)
+{
+ const char* parts[3] = {dir, "", base};
+ const char* part;
+ size_t len;
+ int i, nsl;
+ for (i = 0; i < 3; ++i) {
+ part = parts[i];
+ len = strlen(part) + 1;
+ if (len > szsum)
+ len = szsum;
+ strncpy(sum, part, len);
+ if (len > 1) {
+ sum += len-1;
+ szsum -= len-1;
+ }
+ /* Suppress the duplicate slash(es) between parts */
+ if (i == 0) {
+ nsl = (sum[-1] == '/') + (base[0] == '/');
+ if (nsl == 0) {
+ parts[1] = "/"; /*No slashes, insert one*/
+ } else if (nsl > 1) {
+ parts[2]++; /*Too many slashes, lose one*/
+ }
+ }
+ }
+ if (! szsum > 0)
+ sum[-1] = '\0'; /*Should come from base, but it was truncated*/
+}
+
+/* sum = "dir/base" with attention to buffer overflows, and multiple slashes
+ at the joint are avoided. The length of base is specified explicitly. */
+static void ncat_path(char* sum, size_t szsum, const char* dir,
+ const char* base, size_t szbase)
+{
+ char pktname[PATH_MAX+1];
+ int namelen = ((PATH_MAX < szbase) ? PATH_MAX : szbase)+1;
+ strncpy(pktname, base, namelen);
+ pktname[namelen-1] = '\0';
+ cat_path(sum, szsum, dir, pktname);
+}
+
int mkdir_path(const char *path, mode_t mode)
{
char *buf = alloca(strlen(path)+1);
@@ -185,7 +230,7 @@
struct stat st;
int rv = 0;
- sprintf(path_buf, "%s/%s", root, name);
+ cat_path(path_buf, sizeof(path_buf), root, name);
if ( !lstat(path_buf,&st) ) {
if ( S_ISDIR(st.st_mode) ) {
if ( st.st_dev != ap.dev ) {
@@ -222,7 +267,7 @@
strcmp(de->d_name, "..") == 0)
continue;
- sprintf(buf, "%s/%s", base, de->d_name);
+ cat_path(buf, sizeof(buf), base, de->d_name);
if (walk_tree(buf, fn, 1, arg)) {
closedir(dir);
return -1;
@@ -271,7 +316,7 @@
const char *path;
struct mntlist *next;
} *mntlist = NULL, *mptr;
- int pathlen = strlen(path);
+ size_t pathlen = strlen(path);
DB(syslog(LOG_INFO, "umount_multi: path=%s incl=%d\n", path, incl));
@@ -284,7 +329,7 @@
/* Construct a list of eligible dirs ordered longest->shortest
so that umount will work */
while((mnt = getmntent(mtab)) != NULL) {
- int len = strlen(mnt->mnt_dir);
+ size_t len = strlen(mnt->mnt_dir);
struct mntlist *m, **prev;
char *p;
@@ -432,9 +477,10 @@
return -1;
}
- sprintf(options, "fd=%d,pgrp=%u,minproto=2,maxproto=%d", pipefd[1],
+ snprintf(options, sizeof(options),
+ "fd=%d,pgrp=%u,minproto=2,maxproto=%d", pipefd[1],
(unsigned)my_pgrp, AUTOFS_MAX_PROTO_VERSION);
- sprintf(our_name, "automount(pid%u)", (unsigned)my_pid);
+ snprintf(our_name, sizeof(our_name), "automount(pid%u)", (unsigned)my_pid);
if (spawnl(LOG_DEBUG, PATH_MOUNT, PATH_MOUNT, "-t", "autofs", "-o",
options, our_name, path, NULL) != 0) {
@@ -728,14 +774,17 @@
close(ap.state_pipe[0]);
close(ap.state_pipe[1]);
- /* Generate expire messages until there's
- nothing more to expire */
- while(ioctl(ap.ioctlfd, AUTOFS_IOC_EXPIRE_MULTI, &now) == 0)
- ;
-
- /* EXPIRE_MULTI is synchronous, so we can
- be sure the umounts are done by the time we reach
- here */
+ /* Generate expire messages until there's nothing more to
+ expire. If a bug prevents unmounting, limit attempts to
+ 20/second and a few more than the known number of mounts. */
+ count = count_mounts(ap.path) + 3;
+ while(ioctl(ap.ioctlfd, AUTOFS_IOC_EXPIRE_MULTI, &now) == 0 && --count > 0) {
+ struct timespec nap = { 0, 50000000 }; /*5e-2 seconds*/
+ nanosleep(&nap, NULL); /*Don't care if it ends early*/
+ }
+
+ /* EXPIRE_MULTI is synchronous, so we can be sure (famous last
+ words) the umounts are done by the time we reach here */
if ((count = count_mounts(ap.path))) {
DB(syslog(LOG_INFO,
"expire_proc: %d remaining in %s\n",
@@ -764,7 +813,7 @@
DB(syslog(LOG_INFO, "prep_shutdown: state = %d\n",
ap.state));
- assert(ap.state == ST_READY);
+ assert(ap.state != ST_SHUTDOWN_PENDING);
/* Turn off timeouts */
alarm(0);
@@ -858,7 +907,7 @@
char *buf = (char *)ptr;
while(len > 0) {
- size_t r = read(fd, buf, len);
+ ssize_t r = read(fd, buf, len);
if (r == -1) {
if (errno == EINTR)
@@ -942,9 +991,10 @@
struct stat st;
sigset_t oldsig;
pid_t f;
+ char buf[PATH_MAX+1];
DB(syslog(LOG_INFO, "handle_packet_missing: token %d, name %s\n",
- pkt->wait_queue_token, pkt->name));
+ (int)pkt->wait_queue_token, pkt->name));
/* Ignore packet if we're trying to shut down */
if (ap.state == ST_SHUTDOWN_PENDING ||
@@ -971,8 +1021,8 @@
}
sigprocmask(SIG_UNBLOCK, &sigchld_mask, NULL);
- syslog(LOG_INFO, "attempting to mount entry %s/%s",
- ap.path, pkt->name);
+ ncat_path(buf, sizeof(buf), ap.path, pkt->name, pkt->len);
+ syslog(LOG_INFO, "attempting to mount entry %s", buf);
sigprocmask(SIG_BLOCK, &lock_sigs, &oldsig);
@@ -984,7 +1034,6 @@
return 1;
} else if ( !f ) {
int err;
- char buf[PATH_MAX+1];
ignore_signals(); /* Set up a sensible signal environment */
close(ap.pipefd);
@@ -995,7 +1044,7 @@
err = ap.lookup->lookup_mount(ap.path, pkt->name, pkt->len,
ap.lookup->context);
- sprintf(buf, "%s/%.*s", ap.path, pkt->len, pkt->name);
+ ncat_path(buf, sizeof(buf), ap.path, pkt->name, pkt->len);
/* If at first you don't succeed, hide all evidence you ever tried */
if (err) {
@@ -1031,7 +1080,7 @@
{
char buf[PATH_MAX];
- sprintf(buf, "%s/%.*s", ap.path, namelen, name);
+ ncat_path(buf, sizeof(buf), ap.path, name, namelen);
syslog(LOG_DEBUG, "running expiration on path %s", buf);
if ( umount_multi(buf, 1) == 0 ) {
@@ -1112,7 +1161,7 @@
int ret;
DB(syslog(LOG_INFO, "handle_packet_expire_multi: token %d, name %s\n",
- pkt->wait_queue_token, pkt->name));
+ (int)(pkt->wait_queue_token), pkt->name));
ret = handle_expire(pkt->name, pkt->len, pkt->wait_queue_token);
@@ -1423,14 +1472,14 @@
ap.exp_timeout = ap.exp_runfreq = 0;
syslog(LOG_INFO, "kernel does not support timeouts");
} else {
- unsigned long timeout;
+ time_t timeout;
ap.exp_runfreq = (ap.exp_timeout+CHECK_RATIO-1) / CHECK_RATIO;
timeout = ap.exp_timeout;
syslog(LOG_INFO, "using timeout %d seconds; freq %d secs",
- timeout, ap.exp_runfreq);
+ (int)timeout, (int)ap.exp_runfreq);
ioctl(ap.ioctlfd, AUTOFS_IOC_SETTIMEOUT, &timeout);
diff -r -u autofs-4.0.0pre10.orig/daemon/module.c autofs-4.0.0pre10/daemon/module.c
--- autofs-4.0.0pre10.orig/daemon/module.c 2001-03-27 21:08:23.000000000 -0800
+++ autofs-4.0.0pre10/daemon/module.c 2003-11-13 11:45:54.000000000 -0800
@@ -26,6 +26,7 @@
{
struct lookup_mod *mod;
char *fnbuf;
+ size_t szfnbuf;
void *dh;
int *ver;
@@ -34,13 +35,14 @@
if ( err_prefix ) syslog(LOG_CRIT, "%s%m", err_prefix);
return NULL;
}
- fnbuf = alloca(strlen(name) + strlen(AUTOFS_LIB_DIR) + 13);
+ szfnbuf = strlen(name) + strlen(AUTOFS_LIB_DIR) + 13;
+ fnbuf = alloca(szfnbuf);
if ( !fnbuf ) {
free(mod);
if ( err_prefix ) syslog(LOG_CRIT, "%s%m", err_prefix);
return NULL;
}
- sprintf(fnbuf, "%s//lookup_%s.so", AUTOFS_LIB_DIR, name);
+ snprintf(fnbuf, szfnbuf, "%s//lookup_%s.so", AUTOFS_LIB_DIR, name);
if ( !(dh = dlopen(fnbuf, RTLD_NOW)) ) {
if ( err_prefix )
@@ -91,6 +93,7 @@
{
struct parse_mod *mod;
char *fnbuf;
+ size_t szfnbuf;
void *dh;
int *ver;
@@ -99,13 +102,14 @@
if ( err_prefix ) syslog(LOG_CRIT, "%s%m", err_prefix);
return NULL;
}
- fnbuf = alloca(strlen(name) + strlen(AUTOFS_LIB_DIR) + 12);
+ szfnbuf = strlen(name) + strlen(AUTOFS_LIB_DIR) + 12;
+ fnbuf = alloca(szfnbuf);
if ( !fnbuf ) {
free(mod);
if ( err_prefix ) syslog(LOG_CRIT, "%s%m", err_prefix);
return NULL;
}
- sprintf(fnbuf, "%s//parse_%s.so", AUTOFS_LIB_DIR, name);
+ snprintf(fnbuf, szfnbuf, "%s//parse_%s.so", AUTOFS_LIB_DIR, name);
if ( !(dh = dlopen(fnbuf, RTLD_NOW)) ) {
if ( err_prefix )
@@ -155,6 +159,7 @@
{
struct mount_mod *mod;
char *fnbuf;
+ size_t szfnbuf;
void *dh;
int *ver;
@@ -163,14 +168,15 @@
if ( err_prefix ) syslog(LOG_CRIT, "%s%m", err_prefix);
return NULL;
}
- fnbuf = alloca(strlen(name) + strlen(AUTOFS_LIB_DIR) + 12);
+ szfnbuf = strlen(name) + strlen(AUTOFS_LIB_DIR) + 12;
+ fnbuf = alloca(szfnbuf);
if ( !fnbuf ) {
free(mod);
if ( err_prefix )
syslog(LOG_CRIT, "%s%m", err_prefix);
return NULL;
}
- sprintf(fnbuf, "%s//mount_%s.so", AUTOFS_LIB_DIR, name);
+ snprintf(fnbuf, szfnbuf, "%s//mount_%s.so", AUTOFS_LIB_DIR, name);
if ( !(dh = dlopen(fnbuf, RTLD_NOW)) ) {
if ( err_prefix )
diff -r -u autofs-4.0.0pre10.orig/daemon/mount.c autofs-4.0.0pre10/daemon/mount.c
--- autofs-4.0.0pre10.orig/daemon/mount.c 2001-03-27 21:08:23.000000000 -0800
+++ autofs-4.0.0pre10/daemon/mount.c 2003-11-14 11:50:16.000000000 -0800
@@ -21,6 +21,7 @@
#include <syslog.h>
#include <stdlib.h>
+#include <string.h>
#include "automount.h"
/* These filesystems are known not to work with the "generic" module */
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2003-11-21 18:56 [patch] buffer overflow, failure to auto-dismount, log file diarrhea Jim Carter
@ 2003-11-22 8:20 ` Ian Kent
2003-11-26 18:39 ` Jim Carter
0 siblings, 1 reply; 17+ messages in thread
From: Ian Kent @ 2003-11-22 8:20 UTC (permalink / raw)
To: Jim Carter; +Cc: autofs mailing list
On Fri, 21 Nov 2003, Jim Carter wrote:
> Here at UCLA-Mathnet we use the automounter extensively, and we have had
> a couple of associated problems:
Thanks for your efforts but ...
>
> A. Our NFS mounted filesystems invariably require submounts, and they
> are not auto-dismounted. Ever. This proved to be due to a duplicated
> slash between the main and sub mount points. Once this was fixed,
> auto-dismount started working. There is a rumor that a related fix was
> posted somewhere, but I couldn't find it.
I believe I fixed this in the 4.0.0 release. There are a couple of other
fixes in it also.
>
> B. If the daemon is hit with SIGUSR1, it goes into an infinite loop
> trying unsuccessfully to dismount eligible filesystems, spitting out
> typically 1000 syslog messages over 2 seconds until item C (below)
> supervenes. I put in both a rate throttle (20/second) and a dynamic
> limit on the number of dismounts.
This sounds like a problem that needs to be identified and fixed.
Rate throttling seems more of a workaround that a solution.
Can you give more information please.
Come to think about it this might have already been fixed in ????
Maybe it's not in the 4.0.0 release. I'll check.
>
> C. Upon auto-dismount or SIGUSR1 looping, st_prepare_shutdown is called
> when ap.state != ST_READY and an assertion fails, killing the thread.
> I changed it to die on ST_SHUTDOWN_PENDING, i.e. a recursive call. I'm
> not 100% sure that this is the correct contingency, but automount does
> dismount the unused filesystems and does exit.
Have seen this. I'm not sure if I fixed this in the 4.0.0 release either.
Will check into it.
>
> D. It would appear that a maliciously constructed directory name could
> overflow a buffer in several places, at least causing denial of service
> and possibly allowing the execution of code. Perhaps other O.S. limits
> on the length of a filename (PATH_MAX) invariably protect the daemon
> from this exposure, but defense in depth in this area seems both
> warranted and not burdensome. I changed sprintf to snprintf wherever
> occurring, and the subroutine which joins the dirname and basename
> checks the buffer size.
Definately will check into this for 4.0 and 4.1.
>
> E. There were a few cases where size_t and int were mixed together,
> causing compiler warnings. As there have been exploits against mixed
> signed-unsigned variables, I took the opportunity to fix this issue.
Ditto.
>
> On one machine we have been running the patched automount for a week,
> and all Linux boxes have had it in production for 48 hours. No peculiar
> log messages or crashes have been seen. In one test, various random NFS
> filesystems were automounted and allowed to time out, with varying
> numbers of filesystems simultaneously mounted. Almost 500
> mount-autodismount pairs were done; processes always exited when they
> should, and no error messages appeared. There's a pretty good chance
> that this patch is working.
Assume you mean the attached patch.
>
> The patches follow. They are against autofs-4.0.0pre10, which is the
> version distributed with SuSE 8.2, the distro we are using.
The SuSE maintainer contacted me a while ago, sent me a copy of his
autofs which was much appreciated. I merged some of the SuSE patches into
the current 4.1.0 beta.
I hope to encourage him to adopt 4.1.0 when a final version is released.
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: raven@themaw.net
v Web: http://themaw.net/
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2003-11-22 8:20 ` Ian Kent
@ 2003-11-26 18:39 ` Jim Carter
2003-11-27 14:08 ` Ian Kent
2004-01-01 13:12 ` Ian Kent
0 siblings, 2 replies; 17+ messages in thread
From: Jim Carter @ 2003-11-26 18:39 UTC (permalink / raw)
To: Ian Kent; +Cc: autofs mailing list
On Sat, 22 Nov 2003, Ian Kent wrote:
> On Fri, 21 Nov 2003, Jim Carter wrote:
> > B. If the daemon is hit with SIGUSR1, it goes into an infinite loop
> > trying unsuccessfully to dismount eligible filesystems, spitting out
> > typically 1000 syslog messages over 2 seconds until item C (below)
> > supervenes. I put in both a rate throttle (20/second) and a dynamic
> > limit on the number of dismounts.
>
> This sounds like a problem that needs to be identified and fixed.
> Rate throttling seems more of a workaround that a solution.
> Can you give more information please.
This part of the patch is efinitely a kludge. The daemon's logic goes like
this: When it's time to purge mounts, it sends a packet to the driver
saying "find an expired mount". The driver sends up a packet saying
"dismount /net/tupelo//h1". The daemon tries to do that, but the filesystem
is not actually dismounted (lots of possible reasons this could happen).
Repeating the loop, the daemon asks "find an expired mount". The driver
sends up a packet saying "dismount /net/tupelo//h1"...
A possible non-kludge fix might go like this. The daemon walks the tree of
(its own sub-) mounts and for each, it may or may not make a judgment that
the mount might (or might not) be expired. On likely-looking mounts, it
asks the driver "is this expired" or "when was it really last used"?
If the mount is really expired, the daemon attempts to dismounts it. But,
if the filesystem fails to go away, the daemon will not return to it until
the next USR1 or ALRM, avoiding the infinite loop.
Here's another possibility: you shouldn't go around updating the atime of
the mounted filesystem, but the mount point belongs to the driver, and if
you stat the mount point's inode, the driver can provide the last access
time (what it uses to decide about expiration) as the atime of that inode.
Then the daemon can do the entire logic of picking expired mounts. That
would be preferable as design, and it avoids all infinite loop
possibilities. Presumably to stat the inode, you would open(2) the mount
point directory before mounting on it, and then use lstat. I hope that
will actually work. Of course, both of these fixes require protocol
changes in the driver.
> > C. Upon auto-dismount or SIGUSR1 looping, st_prepare_shutdown is called
> > when ap.state != ST_READY and an assertion fails, killing the thread.
> > I changed it to die on ST_SHUTDOWN_PENDING, i.e. a recursive call. I'm
> > not 100% sure that this is the correct contingency, but automount does
> > dismount the unused filesystems and does exit.
>
> Have seen this. I'm not sure if I fixed this in the 4.0.0 release either.
> Will check into it.
When the submounted daemon dismounts its last filesystem, it's in ST_EXPIRE
(I think that's the spelling), but correctly calls st_prepare_shutdown. I
don't know if there are any other non-obvious but correct transitions into
SHUTDOWN state.
> > The patches follow. They are against autofs-4.0.0pre10, which is the
> > version distributed with SuSE 8.2, the distro we are using.
>
> The SuSE maintainer contacted me a while ago, sent me a copy of his
> autofs which was much appreciated. I merged some of the SuSE patches into
> the current 4.1.0 beta.
>
> I hope to encourage him to adopt 4.1.0 when a final version is released.
We're definitely looking forward to it. We're pretty aggressive about
patching machines and auditing software, and when we make private patches a
big problem is making sure they stay installed.
James F. Carter Voice 310 825 2897 FAX 310 206 6673
UCLA-Mathnet; 6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu http://www.math.ucla.edu/~jimc (q.v. for PGP key)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2003-11-26 18:39 ` Jim Carter
@ 2003-11-27 14:08 ` Ian Kent
2004-01-01 13:12 ` Ian Kent
1 sibling, 0 replies; 17+ messages in thread
From: Ian Kent @ 2003-11-27 14:08 UTC (permalink / raw)
To: Jim Carter; +Cc: autofs mailing list
On Wed, 26 Nov 2003, Jim Carter wrote:
> On Sat, 22 Nov 2003, Ian Kent wrote:
> > On Fri, 21 Nov 2003, Jim Carter wrote:
> > > B. If the daemon is hit with SIGUSR1, it goes into an infinite loop
> > > trying unsuccessfully to dismount eligible filesystems, spitting out
> > > typically 1000 syslog messages over 2 seconds until item C (below)
> > > supervenes. I put in both a rate throttle (20/second) and a dynamic
> > > limit on the number of dismounts.
> >
> > This sounds like a problem that needs to be identified and fixed.
> > Rate throttling seems more of a workaround that a solution.
> > Can you give more information please.
>
> This part of the patch is efinitely a kludge. The daemon's logic goes like
> this: When it's time to purge mounts, it sends a packet to the driver
> saying "find an expired mount". The driver sends up a packet saying
> "dismount /net/tupelo//h1". The daemon tries to do that, but the filesystem
> is not actually dismounted (lots of possible reasons this could happen).
> Repeating the loop, the daemon asks "find an expired mount". The driver
> sends up a packet saying "dismount /net/tupelo//h1"...
Ahhh. I remember now. This was something I can across ages ago. I altered
the kernel module.
The situation was that during a normal expire the candidates' info struct
last_used field is updated to the current time to stop this. I saw that
change months later and couldn't work out what it was for and removed it.
You are right the problem still exists.
I take the point that many do not want to alter the kernel source anyway.
I don't want to regress back to a develop, test and beta again so I will
add this to the growing list of stuff for 4.1.1 and work on it as a bug
fix for 4.0.1 when I get to the next development cycle (hope in early
December).
Thanks for you contribution.
> > > C. Upon auto-dismount or SIGUSR1 looping, st_prepare_shutdown is called
> > > when ap.state != ST_READY and an assertion fails, killing the thread.
> > > I changed it to die on ST_SHUTDOWN_PENDING, i.e. a recursive call. I'm
> > > not 100% sure that this is the correct contingency, but automount does
> > > dismount the unused filesystems and does exit.
> >
> > Have seen this. I'm not sure if I fixed this in the 4.0.0 release either.
> > Will check into it.
>
> When the submounted daemon dismounts its last filesystem, it's in ST_EXPIRE
> (I think that's the spelling), but correctly calls st_prepare_shutdown. I
> don't know if there are any other non-obvious but correct transitions into
> SHUTDOWN state.
As I said previously I've seen this. I believe this was an error in the
state machine that Jeremy originally implemented. It seemed to work fine
with the change to the assertion.
Again I'll check and merge your patch as a bug fix in 4.0.1 if it's
not already there.
>
> We're definitely looking forward to it. We're pretty aggressive about
> patching machines and auditing software, and when we make private patches a
> big problem is making sure they stay installed.
Understand that very well.
I really must contact the maintainer to see what his plans are for SuSE
antofs. I'll do my best.
Thanks again.
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: raven@themaw.net
v Web: http://themaw.net/
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2003-11-26 18:39 ` Jim Carter
2003-11-27 14:08 ` Ian Kent
@ 2004-01-01 13:12 ` Ian Kent
2004-01-01 22:49 ` Jim Carter
1 sibling, 1 reply; 17+ messages in thread
From: Ian Kent @ 2004-01-01 13:12 UTC (permalink / raw)
To: Jim Carter; +Cc: autofs mailing list
I've just been re-reading your mail after looking at your patch ....
On Wed, 26 Nov 2003, Jim Carter wrote:
>
> A possible non-kludge fix might go like this. The daemon walks the tree of
> (its own sub-) mounts and for each, it may or may not make a judgment that
> the mount might (or might not) be expired. On likely-looking mounts, it
> asks the driver "is this expired" or "when was it really last used"?
> If the mount is really expired, the daemon attempts to dismounts it. But,
> if the filesystem fails to go away, the daemon will not return to it until
> the next USR1 or ALRM, avoiding the infinite loop.
I'm thinking I like this approach.
It may be fairly straight forward to do and should be quite effective.
This approach has the advantage that it can be done entirely in the
daemon and should also work for older versions of the kernel module.
>
> Here's another possibility: you shouldn't go around updating the atime of
> the mounted filesystem, but the mount point belongs to the driver, and if
> you stat the mount point's inode, the driver can provide the last access
> time (what it uses to decide about expiration) as the atime of that inode.
> Then the daemon can do the entire logic of picking expired mounts. That
> would be preferable as design, and it avoids all infinite loop
> possibilities. Presumably to stat the inode, you would open(2) the mount
> point directory before mounting on it, and then use lstat. I hope that
> will actually work. Of course, both of these fixes require protocol
> changes in the driver.
This sounds a bit complicated considering what needs to be achieved. I
keep thinking that there would be changes needed in the kernel module for
this.
Have you had any other thoughts on this?
Ian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-01 13:12 ` Ian Kent
@ 2004-01-01 22:49 ` Jim Carter
2004-01-02 0:35 ` Ian Kent
0 siblings, 1 reply; 17+ messages in thread
From: Jim Carter @ 2004-01-01 22:49 UTC (permalink / raw)
To: Ian Kent; +Cc: autofs mailing list
On Thu, 1 Jan 2004, Ian Kent wrote:
> > asks the driver "is this expired" or "when was it really last used"?
-- snip --
> This approach has the advantage that it can be done entirely in the
> daemon and should also work for older versions of the kernel module.
Definitely an advantage that it can be done entirely in the daemon. But I
didn't notice if you can make the older (and newer) drivers divulge the
last access time. If that will work, great. I was just reading some of
the sources (see below), and I didn't spot the reference to last_usage.
> > possibilities. Presumably to stat the inode, you would open(2) the mount
> > point directory before mounting on it, and then use lstat. I hope that
> > will actually work. Of course, both of these fixes require protocol
> > changes in the driver.
>
> This sounds a bit complicated considering what needs to be achieved. I
> keep thinking that there would be changes needed in the kernel module for
> this.
Oops, I meant fstat (works on open file descriptors) rather than lstat.
Complicated? The module has to return something when you stat the inode,
and a newly improved module can easily fill in the last used time rather
than whatever it provides now. Presently (kernel 2.4.20) in
fs/autofs/inode.c, autofs_read_inode() puts CURRENT_TIME in all of mtime,
atime, ctime. I don't quite see how to get from its argument (struct
inode*) to struct autofs_dir_ent* which has the last_usage field, but it
should be doable without a lot of drama.
Also, with autofs_oz_mode(), the userspace daemon would not even need to do
the open() - fstat() thing; it should be getting the mount point's inode
directly, and not that which is mounted on it. I think.
James F. Carter Voice 310 825 2897 FAX 310 206 6673
UCLA-Mathnet; 6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu http://www.math.ucla.edu/~jimc (q.v. for PGP key)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-01 22:49 ` Jim Carter
@ 2004-01-02 0:35 ` Ian Kent
2004-01-03 1:03 ` Jim Carter
0 siblings, 1 reply; 17+ messages in thread
From: Ian Kent @ 2004-01-02 0:35 UTC (permalink / raw)
To: Jim Carter; +Cc: autofs mailing list
On Thu, 1 Jan 2004, Jim Carter wrote:
> On Thu, 1 Jan 2004, Ian Kent wrote:
> > > asks the driver "is this expired" or "when was it really last used"?
> -- snip --
> > This approach has the advantage that it can be done entirely in the
> > daemon and should also work for older versions of the kernel module.
>
> Definitely an advantage that it can be done entirely in the daemon. But I
> didn't notice if you can make the older (and newer) drivers divulge the
> last access time. If that will work, great. I was just reading some of
> the sources (see below), and I didn't spot the reference to last_usage.
Sorry I was wrong about that being easy. I don't know what I was thinking.
Stat would divulge that info as the access time is generally along with
last_used.
>
> > > possibilities. Presumably to stat the inode, you would open(2) the mount
> > > point directory before mounting on it, and then use lstat. I hope that
> > > will actually work. Of course, both of these fixes require protocol
> > > changes in the driver.
> >
> > This sounds a bit complicated considering what needs to be achieved. I
> > keep thinking that there would be changes needed in the kernel module for
> > this.
>
> Oops, I meant fstat (works on open file descriptors) rather than lstat.
>
> Complicated? The module has to return something when you stat the inode,
> and a newly improved module can easily fill in the last used time rather
> than whatever it provides now. Presently (kernel 2.4.20) in
> fs/autofs/inode.c, autofs_read_inode() puts CURRENT_TIME in all of mtime,
> atime, ctime. I don't quite see how to get from its argument (struct
> inode*) to struct autofs_dir_ent* which has the last_usage field, but it
> should be doable without a lot of drama.
I thought you were using autofs4. autofs_dir_ent stuff is in v3.
I think the autofs_read_inode is used for initialization. Tracking the
code for sys_stat and sys_fstat it apears that this is not used.
One other thing which makes this hard is that the kernel module
will never see the opening of a path that is a mountpoint. The path walk
in the kernel will send this to the filesystem routines of the mount.
>
> Also, with autofs_oz_mode(), the userspace daemon would not even need to do
> the open() - fstat() thing; it should be getting the mount point's inode
> directly, and not that which is mounted on it. I think.
Yes but the path walk is very good at getting things right so we never
actually get a lookin.
Thinking more about it what are the situations that can cause this. If a
mount is busy it should never be offered as expired, regardless of its
last_used. There is the possibility of a broken NFS server causing it but
perhaps that can be dealt with using an rpc_ping type thing and adding a
-f to the umount if it doesn't respond.
Ian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-02 0:35 ` Ian Kent
@ 2004-01-03 1:03 ` Jim Carter
2004-01-03 3:47 ` Ian Kent
0 siblings, 1 reply; 17+ messages in thread
From: Jim Carter @ 2004-01-03 1:03 UTC (permalink / raw)
To: Ian Kent; +Cc: autofs mailing list
On Fri, 2 Jan 2004, Ian Kent wrote:
> Stat would divulge that info as the access time is generally along with
> last_used.
I wish!
> I thought you were using autofs4. autofs_dir_ent stuff is in v3.
Oops, took a wrong turn in the source tree. I looked through the *right*
sources, and inode.c autofs4_get_inode() works the same as in v3, i.e.
sets inode->i_[amc]time = CURRENT_TIME.
> I think the autofs_read_inode is used for initialization. Tracking the
> code for sys_stat and sys_fstat it apears that this is not used.
I think you're right. Hiss, boo. Hmm, the dentry points to the inode
storage, and so does the autofs_info structure. If autofs4_update_usage
updated the inode fields as well as (or instead of) inf->last_used, then
(presumably) stat() would get the right times. Or does sys_stat go
directly to the mounted inode, ignoring the mount point? Probably it's
smart enough to do that, but even so the mount point inode is revalidated
(autofs4_revalidate is called). We know this because the autofs4 module
successfully keeps track of the last usage time. So sys_fstat on a FD open
on that inode should get the inode and not what's mounted on it. It says
here.
> One other thing which makes this hard is that the kernel module
> will never see the opening of a path that is a mountpoint. The path walk
> in the kernel will send this to the filesystem routines of the mount.
Well, autofs4_revalidate() does get called on each open, even if not much
else gets called.
> Thinking more about it what are the situations that can cause this. If a
> mount is busy it should never be offered as expired, regardless of its
> last_used.
Right, expire.c :: autofs4_expire() bypasses a mount point if its last_used
field is too recent, and then it checks is_tree_busy(). Oh, look, a kludge
that could be added, how wonderful! Assuming that it's feasible to get
module-provided data into the inode data returned from sys_stat or
sys_fstat, a mode bit could be set to indicate busyness, e.g. user write
implies not busy.
Of course a race condition is possible: the userspace daemon is assured by
the module that the mount point is not busy, someone uses it, the daemon
then unmounts, and the unmount fails. As it should. Its response should
be to poll the module again, and the module will confirm that the mount
point is really busy (and with last_used artificially advanced to
CURRENT_TIME).
> There is the possibility of a broken NFS server causing it but
> perhaps that can be dealt with using an rpc_ping type thing and adding a
> -f to the umount if it doesn't respond.
Hmm, I think you're saying (and I'm agreeing) that we're trying to solve
the wrong problem. Before the NFS server crashes, the mount point is busy
with useful programs. Then it crashes. The ideal is, when it comes back
it accepts the client's NFS handle and everything starts working again, and
autofs doesn't do anything violent that would prevent that. The reality,
at least with soft mounts, is that the programs using the mount point are
fatally wounded, but they linger.
So the key task is to bulldoze aside the wreckage so a new instance of the
revitalized NFS server can be mounted. Killing the moribund processes and
forcibly dismounting the destroyed filesystem instance are important so as
to recover resources, but the client can continue to function for quite a
long time even if that's not done; the only totally vital action is to be
able to remount, and that can possibly be done by renaming, or by
remounting the wreckage elsewhere, deferring the actual dismount.
NFS ping should be enough for recognizing a dead server. That means each
client's autofs would ping each mounted server once every 5 minutes.
Presumably a lot more NFS activity is done by the active programs.
James F. Carter Voice 310 825 2897 FAX 310 206 6673
UCLA-Mathnet; 6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu http://www.math.ucla.edu/~jimc (q.v. for PGP key)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-03 1:03 ` Jim Carter
@ 2004-01-03 3:47 ` Ian Kent
2004-01-09 19:58 ` Jim Carter
0 siblings, 1 reply; 17+ messages in thread
From: Ian Kent @ 2004-01-03 3:47 UTC (permalink / raw)
To: Jim Carter; +Cc: autofs mailing list
On Fri, 2 Jan 2004, Jim Carter wrote:
> On Fri, 2 Jan 2004, Ian Kent wrote:
> > Stat would divulge that info as the access time is generally along with
> > last_used.
>
> I wish!
It is. I don't have a copy of the vanilla module source handy but I haven't
changed autofs4_update_usage so it should be the same as mine.
>
> > I thought you were using autofs4. autofs_dir_ent stuff is in v3.
>
> Oops, took a wrong turn in the source tree. I looked through the *right*
> sources, and inode.c autofs4_get_inode() works the same as in v3, i.e.
> sets inode->i_[amc]time = CURRENT_TIME.
>
> > I think the autofs_read_inode is used for initialization. Tracking the
> > code for sys_stat and sys_fstat it apears that this is not used.
>
> I think you're right. Hiss, boo. Hmm, the dentry points to the inode
> storage, and so does the autofs_info structure. If autofs4_update_usage
> updated the inode fields as well as (or instead of) inf->last_used, then
> (presumably) stat() would get the right times. Or does sys_stat go
> directly to the mounted inode, ignoring the mount point? Probably it's
> smart enough to do that, but even so the mount point inode is revalidated
> (autofs4_revalidate is called). We know this because the autofs4 module
> successfully keeps track of the last usage time. So sys_fstat on a FD open
> on that inode should get the inode and not what's mounted on it. It says
> here.
I believe it does. We can rely on the inode atime being equal to
last_used. sys_stat goes directly to the inode.
Ah. reavalidate is called frequently. If you define DEBUG in
the source header you'll see what I mean. It's much more busy when
directories exist in the autofs4 tree (ie. mounted filesystems). The
difficulty here has always been that you don't know what the reason of the
call is (ie. an ls, open, stat ....). All you know is that there is a
path_walk going on to resolve to a dentry. Better, the path_walk
always follows the mounts, so stat gets its values from the mount point
inode.
>
> > One other thing which makes this hard is that the kernel module
> > will never see the opening of a path that is a mountpoint. The path walk
> > in the kernel will send this to the filesystem routines of the mount.
>
> Well, autofs4_revalidate() does get called on each open, even if not much
> else gets called.
See above.
>
> > Thinking more about it what are the situations that can cause this. If a
> > mount is busy it should never be offered as expired, regardless of its
> > last_used.
>
> Right, expire.c :: autofs4_expire() bypasses a mount point if its last_used
> field is too recent, and then it checks is_tree_busy(). Oh, look, a kludge
> that could be added, how wonderful! Assuming that it's feasible to get
> module-provided data into the inode data returned from sys_stat or
> sys_fstat, a mode bit could be set to indicate busyness, e.g. user write
> implies not busy.
A user write implies an open file. The expire routine must identify this
dentry as busy or it's a bug that needs to be fixed.
I believe that we should never update kernel data structures in ways that
are not, either provided by a kernel API call or are commonly accepted
methods.
>
> Of course a race condition is possible: the userspace daemon is assured by
> the module that the mount point is not busy, someone uses it, the daemon
> then unmounts, and the unmount fails. As it should. Its response should
> be to poll the module again, and the module will confirm that the mount
> point is really busy (and with last_used artificially advanced to
> CURRENT_TIME).
Not sure about this.
This has been a source of pain for me recently. On the face of it, the
module blocks mount requests for an expired dentry during the callback to
the daemon.
>
> > There is the possibility of a broken NFS server causing it but
> > perhaps that can be dealt with using an rpc_ping type thing and adding a
> > -f to the umount if it doesn't respond.
>
> Hmm, I think you're saying (and I'm agreeing) that we're trying to solve
> the wrong problem. Before the NFS server crashes, the mount point is busy
> with useful programs. Then it crashes. The ideal is, when it comes back
> it accepts the client's NFS handle and everything starts working again, and
> autofs doesn't do anything violent that would prevent that. The reality,
> at least with soft mounts, is that the programs using the mount point are
> fatally wounded, but they linger.
Stale file handles abound. Not really autofss' problem, but yes, what can
we do to improve the situation?
>
> So the key task is to bulldoze aside the wreckage so a new instance of the
> revitalized NFS server can be mounted. Killing the moribund processes and
> forcibly dismounting the destroyed filesystem instance are important so as
> to recover resources, but the client can continue to function for quite a
> long time even if that's not done; the only totally vital action is to be
> able to remount, and that can possibly be done by renaming, or by
> remounting the wreckage elsewhere, deferring the actual dismount.
I'll have to think about this for a while. The rename idea might cause all
sorts of problems.
Anyway the problem should only occur if the mount point is no longer busy
and has expired in which case we should be free to forcibly umount it.
A process with stuff in use on the broken mount point will have a cow when
it tries to do something like access an open file. In this case it's
likely to need to be put down, leaving the way clear for a forced umount.
A kill -9 on the blocked process needs further investigation. I'm not sure
what the situation is there (probably not good).
>
> NFS ping should be enough for recognizing a dead server. That means each
> client's autofs would ping each mounted server once every 5 minutes.
> Presumably a lot more NFS activity is done by the active programs.
>
But only for mounts that are not busy and have expired. And then every
expiry if they fail to umount and then we return to what to do about the
original problem you described. Perhaps the limit your original patch
implemented is in fact a good idea and possibly all that is needed. If the
looping is detected a warning could be printed to alert one to the
possibility of a bug in the module.
One thing I noticed in the module expire routine is that it doesn't
progress past a failed mount on subsequent calls. A change to the
adjustment of the d_subdirs list is probably needed (a bit difficult, but
not impossible, in my implementation of the expire routine).
By the way. Since you are obviously interested in autofs, it would
be best if you could use the latest version. Please see kernel.org if
you are able to update.
Ian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-03 3:47 ` Ian Kent
@ 2004-01-09 19:58 ` Jim Carter
2004-01-09 20:17 ` H. Peter Anvin
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Jim Carter @ 2004-01-09 19:58 UTC (permalink / raw)
To: Ian Kent; +Cc: autofs mailing list
On Sat, 3 Jan 2004, Ian Kent wrote:
> On Fri, 2 Jan 2004, Jim Carter wrote:
> > Right, expire.c :: autofs4_expire() bypasses a mount point if its last_used
> > field is too recent, and then it checks is_tree_busy(). Oh, look, a kludge
> > that could be added, how wonderful! Assuming that it's feasible to get
> > module-provided data into the inode data returned from sys_stat or
> > sys_fstat, a mode bit could be set to indicate busyness, e.g. user write
> > implies not busy.
>
> A user write implies an open file. The expire routine must identify this
> dentry as busy or it's a bug that needs to be fixed.
No, I meant that when the sys_stat method fills in the mode field of
the returning stat structure, it would be reported as 555 for a busy
filesystem, or 755 if it looks idle and dismountable. The mode field as
seen by the VFS layer can stay at 755. This would be when statting
autofs's directory inode, not whatever is mounted on it.
Is it really true that the VFS layer doesn't maintain use counts for
directories where you can find them easily?
> > So the key task is to bulldoze aside the wreckage so a new instance of the
> > revitalized NFS server can be mounted. Killing the moribund processes and
> > forcibly dismounting the destroyed filesystem instance are important so as
> > to recover resources, but the client can continue to function for quite a
> > long time even if that's not done; the only totally vital action is to be
> > able to remount, and that can possibly be done by renaming, or by
> > remounting the wreckage elsewhere, deferring the actual dismount.
>
> I'll have to think about this for a while. The rename idea might cause all
> sorts of problems.
I can imagine. Maybe a -move remount? Can you even do this when the
mounted FS is busy? Well...
mkdir mtpt1 ; mkdir mtpt2
mount -t nfs hollyfs:/m1 mtpt1
cd mtpt1
sleep 3600 & (PID is 2486)
cd ..
umount mtpt1 (device is busy -- correct)
mount --move mtpt1 mtpt2 (works -- yay!)
ls -l /proc/2486/cwd (says: /proc/2486/cwd -> /tmp/root.jimc/mtpt2/)
grep mtpt /proc/mounts (says: hollyfs:/m1 /tmp/root.jimc/mtpt2 nfs...)
grep mtpt /etc/mtab (says: hollyfs:/m1 /tmp/root.jimc/mtpt1 nfs...
/tmp/root.jimc/mtpt1 /tmp/root.jimc/mtpt2 none rw 0 0
kill 2486
umount mtpt2 (works -- correct)
grep mtpt /proc/mounts (says: nothing)
grep mtpt /etc/mtab (says: hollyfs:/m1 /tmp/root.jimc/mtpt1, hiss, boo)
So you can do a move mount on a busy filesystem; the only downside is that
/etc/mtab has a stale entry in it -- obviously a bug, not an essential
behavior.
> A kill -9 on the blocked process needs further investigation. I'm not sure
> what the situation is there (probably not good).
I would really like it if, when autofs decided to forcibly unmount, it
could kill the using processes, as in "fuser -k -m /net/host/mountpoint".
Actually this is a general issue for unmount(8), not specific to autofs.
But one wonders whether fuser is going to run wild...
> One thing I noticed in the module expire routine is that it doesn't
> progress past a failed mount on subsequent calls. A change to the
> adjustment of the d_subdirs list is probably needed (a bit difficult, but
> not impossible, in my implementation of the expire routine).
I thought it moved the mount point to the end, before attempting to unmount
it. Maybe that behavior got changed in a subsequent version, or maybe I'm
remembering wrong.
> By the way. Since you are obviously interested in autofs, it would
> be best if you could use the latest version. Please see kernel.org if
> you are able to update.
I'll discuss this with my management. By the way, for RPM-based distros, a
lot of updating happens via mindless scripts, and it helps if ad-hoc fixes
are packed up as a RPM. Then vendor updates to retro versions don't
overwrite the latest patches. Example:
Currently installed: autofs4-4.0.0pre10-504
I create and install: autofs4-4.1.0-001
Vendor issues patch: autofs4-4.0.5-678
The scripts know that the vendor's patch should be bypassed. If I get the
go-ahead to install 4.1.0, I'll fake up a spec file (like steal it from the
SuSE package) and send in some notes for how to use it.
By the way, sorry for the late reply. Your message just arrived
(2003-01-09) -- probably stuck on some MX somewhere for 3 days.
James F. Carter Voice 310 825 2897 FAX 310 206 6673
UCLA-Mathnet; 6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu http://www.math.ucla.edu/~jimc (q.v. for PGP key)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-09 19:58 ` Jim Carter
@ 2004-01-09 20:17 ` H. Peter Anvin
2004-01-09 21:11 ` Jim Carter
2004-01-09 21:27 ` Mike Waychison
2004-01-10 5:06 ` Ian Kent
2 siblings, 1 reply; 17+ messages in thread
From: H. Peter Anvin @ 2004-01-09 20:17 UTC (permalink / raw)
To: Jim Carter; +Cc: autofs mailing list, Ian Kent
Jim Carter wrote:
>
>>A kill -9 on the blocked process needs further investigation. I'm not sure
>>what the situation is there (probably not good).
>
> I would really like it if, when autofs decided to forcibly unmount, it
> could kill the using processes, as in "fuser -k -m /net/host/mountpoint".
> Actually this is a general issue for unmount(8), not specific to autofs.
> But one wonders whether fuser is going to run wild...
>
How on the face of planet Earth do you expect autofs should ever take
such a drastic action? This could only by a human deciding to do so,
and if so, it's not an issue for autofs, as long as human umounts are
properly handled (they should be.)
-hpa
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-09 20:17 ` H. Peter Anvin
@ 2004-01-09 21:11 ` Jim Carter
2004-01-10 5:09 ` Ian Kent
0 siblings, 1 reply; 17+ messages in thread
From: Jim Carter @ 2004-01-09 21:11 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: autofs mailing list, Ian Kent
On Fri, 9 Jan 2004, H. Peter Anvin wrote:
> Jim Carter wrote:
> > I would really like it if, when autofs decided to forcibly unmount, it
> > could kill the using processes, as in "fuser -k -m /net/host/mountpoint".
> > Actually this is a general issue for unmount(8), not specific to autofs.
> > But one wonders whether fuser is going to run wild...
>
> How on the face of planet Earth do you expect autofs should ever take
> such a drastic action? This could only by a human deciding to do so,
> and if so, it's not an issue for autofs, as long as human umounts are
> properly handled (they should be.)
Just yesterday I had five Linux boxes hang while rebooting, while
trying to shut down autofs4, because processes using the filesystems had
been killed but wouldn't die. This human said, I've already decided that
these machines are idle, and so I just killed the shutdown script, rather
than researching more drastic action to get rid of the processes, or to
find out why they weren't exiting, or to do "umount -l", etc.
It's lucky that autofs is shut off before sshd, or else I would have had to
visit every machine.
James F. Carter Voice 310 825 2897 FAX 310 206 6673
UCLA-Mathnet; 6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu http://www.math.ucla.edu/~jimc (q.v. for PGP key)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-09 19:58 ` Jim Carter
2004-01-09 20:17 ` H. Peter Anvin
@ 2004-01-09 21:27 ` Mike Waychison
2004-01-09 21:43 ` Jim Carter
2004-01-10 5:06 ` Ian Kent
2 siblings, 1 reply; 17+ messages in thread
From: Mike Waychison @ 2004-01-09 21:27 UTC (permalink / raw)
To: Jim Carter; +Cc: autofs mailing list, Ian Kent
[-- Attachment #1.1: Type: text/plain, Size: 4555 bytes --]
Jim Carter wrote:
>On Sat, 3 Jan 2004, Ian Kent wrote:
>
>
>>On Fri, 2 Jan 2004, Jim Carter wrote:
>>
>>
>>>Right, expire.c :: autofs4_expire() bypasses a mount point if its last_used
>>>field is too recent, and then it checks is_tree_busy(). Oh, look, a kludge
>>>that could be added, how wonderful! Assuming that it's feasible to get
>>>module-provided data into the inode data returned from sys_stat or
>>>sys_fstat, a mode bit could be set to indicate busyness, e.g. user write
>>>implies not busy.
>>>
>>>
>>A user write implies an open file. The expire routine must identify this
>>dentry as busy or it's a bug that needs to be fixed.
>>
>>
>
>No, I meant that when the sys_stat method fills in the mode field of
>the returning stat structure, it would be reported as 555 for a busy
>filesystem, or 755 if it looks idle and dismountable. The mode field as
>seen by the VFS layer can stay at 755. This would be when statting
>autofs's directory inode, not whatever is mounted on it.
>
>
>
This hijacking of the st_mode is just that, hijacking. You'd be
changing the mode of the directory on the NFS filesystem.. This
wouldn't work well in the VFS when it tries to check permissions, but it
also wouldn't work for more than one NFS client accessing the same
filehandle.
>Is it really true that the VFS layer doesn't maintain use counts for
>directories where you can find them easily?
>
>
>
Use counts for directories? The of a file has no real consistent
corelation with which directory that file belongs.
>>>So the key task is to bulldoze aside the wreckage so a new instance of the
>>>revitalized NFS server can be mounted. Killing the moribund processes and
>>>forcibly dismounting the destroyed filesystem instance are important so as
>>>to recover resources, but the client can continue to function for quite a
>>>long time even if that's not done; the only totally vital action is to be
>>>able to remount, and that can possibly be done by renaming, or by
>>>remounting the wreckage elsewhere, deferring the actual dismount.
>>>
>>>
>>I'll have to think about this for a while. The rename idea might cause all
>>sorts of problems.
>>
>>
>
>I can imagine. Maybe a -move remount? Can you even do this when the
>mounted FS is busy? Well...
>
>mkdir mtpt1 ; mkdir mtpt2
>mount -t nfs hollyfs:/m1 mtpt1
>cd mtpt1
>sleep 3600 & (PID is 2486)
>cd ..
>umount mtpt1 (device is busy -- correct)
>mount --move mtpt1 mtpt2 (works -- yay!)
>ls -l /proc/2486/cwd (says: /proc/2486/cwd -> /tmp/root.jimc/mtpt2/)
>grep mtpt /proc/mounts (says: hollyfs:/m1 /tmp/root.jimc/mtpt2 nfs...)
>grep mtpt /etc/mtab (says: hollyfs:/m1 /tmp/root.jimc/mtpt1 nfs...
> /tmp/root.jimc/mtpt1 /tmp/root.jimc/mtpt2 none rw 0 0
>kill 2486
>umount mtpt2 (works -- correct)
>grep mtpt /proc/mounts (says: nothing)
>grep mtpt /etc/mtab (says: hollyfs:/m1 /tmp/root.jimc/mtpt1, hiss, boo)
>
>So you can do a move mount on a busy filesystem; the only downside is that
>/etc/mtab has a stale entry in it -- obviously a bug, not an essential
>behavior.
>
>
>
Yup, it's a bug, however now that you can mount two filesystems on top
of each other, there really is no reliable way to know which entry in
/etc/mtab has moved. Eg:
# mkdir /foo
# mount -o ro host:/path /foo
# mount -o rw host:/path /foo
# cat /etc/mtab
(two lines for /foo are printed)
# mkdir /bar
# mount --move /foo /bar
Which entry in /etc/mtab is changed from /foo to /bar? Unless you can
make steadfast rules on entry ordering, you can't reliably decide which
entry has changed.
Even worse, /etc/mtab is broken when using multiple namespaces.
>>A kill -9 on the blocked process needs further investigation. I'm not sure
>>what the situation is there (probably not good).
>>
>>
>
>I would really like it if, when autofs decided to forcibly unmount, it
>could kill the using processes, as in "fuser -k -m /net/host/mountpoint".
>Actually this is a general issue for unmount(8), not specific to autofs.
>But one wonders whether fuser is going to run wild...
>
>
>
Like HPA already mentioned, this is bad. I don't even think you could
reliably detect this.
--
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE: The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[-- Attachment #1.2: Type: application/pgp-signature, Size: 251 bytes --]
[-- Attachment #2: Type: text/plain, Size: 140 bytes --]
_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-09 21:27 ` Mike Waychison
@ 2004-01-09 21:43 ` Jim Carter
2004-01-09 22:00 ` Mike Waychison
0 siblings, 1 reply; 17+ messages in thread
From: Jim Carter @ 2004-01-09 21:43 UTC (permalink / raw)
To: Mike Waychison; +Cc: autofs mailing list, Ian Kent
On Fri, 9 Jan 2004, Mike Waychison wrote:
> Jim Carter wrote:
> >No, I meant that when the sys_stat method fills in the mode field of
> >the returning stat structure, it would be reported as 555 for a busy
> >filesystem, or 755 if it looks idle and dismountable. The mode field as
> >seen by the VFS layer can stay at 755. This would be when statting
> >autofs's directory inode, not whatever is mounted on it.
> This hijacking of the st_mode is just that, hijacking. You'd be
> changing the mode of the directory on the NFS filesystem.. This
> wouldn't work well in the VFS when it tries to check permissions, but it
> also wouldn't work for more than one NFS client accessing the same
> filehandle.
There are 2 inodes here: the mount point (owned by autofs) and what's
mounted on it. We mustn't monkey with the mountee's data, of course. I'm
talking about statting the mount point itself. You can't do that by name,
but in a prior discussion we were talking about opening the mount point
before the mount, and later doing fstat on the file descriptor. HPA
mentioned providing a readlink method and doing lstat on the mount point by
name, which would intercept the path walk before the mountee was noticed.
So the caller gets the mode of the mount point, which autofs can create or
interpret any way it wants, rather than the mountee's data.
> >I would really like it if, when autofs decided to forcibly unmount, it
> >could kill the using processes, as in "fuser -k -m /net/host/mountpoint".
> >Actually this is a general issue for unmount(8), not specific to autofs.
> >But one wonders whether fuser is going to run wild...
> Like HPA already mentioned, this is bad. I don't even think you could
> reliably detect this.
Agreed, you wonder if the right processes are going to be picked. But...
"Ask for the moon; they might give it to you." In the present case it
should be possible to stick the fuser execution into the autofs shutdown
script -- it already is aware that unmounts haven't finished, and it
probably knows or can find out which filesystems are involved.
P.S. "Be careful what you wish for; you might get it."
James F. Carter Voice 310 825 2897 FAX 310 206 6673
UCLA-Mathnet; 6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu http://www.math.ucla.edu/~jimc (q.v. for PGP key)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-09 21:43 ` Jim Carter
@ 2004-01-09 22:00 ` Mike Waychison
0 siblings, 0 replies; 17+ messages in thread
From: Mike Waychison @ 2004-01-09 22:00 UTC (permalink / raw)
To: Jim Carter; +Cc: autofs mailing list, Ian Kent
[-- Attachment #1.1: Type: text/plain, Size: 3058 bytes --]
Jim Carter wrote:
>On Fri, 9 Jan 2004, Mike Waychison wrote:
>
>
>
>>Jim Carter wrote:
>>
>>
>>>No, I meant that when the sys_stat method fills in the mode field of
>>>the returning stat structure, it would be reported as 555 for a busy
>>>filesystem, or 755 if it looks idle and dismountable. The mode field as
>>>seen by the VFS layer can stay at 755. This would be when statting
>>>autofs's directory inode, not whatever is mounted on it.
>>>
>>>
>
>
>
>>This hijacking of the st_mode is just that, hijacking. You'd be
>>changing the mode of the directory on the NFS filesystem.. This
>>wouldn't work well in the VFS when it tries to check permissions, but it
>>also wouldn't work for more than one NFS client accessing the same
>>filehandle.
>>
>>
>
>There are 2 inodes here: the mount point (owned by autofs) and what's
>mounted on it. We mustn't monkey with the mountee's data, of course. I'm
>talking about statting the mount point itself. You can't do that by name,
>but in a prior discussion we were talking about opening the mount point
>before the mount, and later doing fstat on the file descriptor. HPA
>mentioned providing a readlink method and doing lstat on the mount point by
>name, which would intercept the path walk before the mountee was noticed.
>
>So the caller gets the mode of the mount point, which autofs can create or
>interpret any way it wants, rather than the mountee's data.
>
>
>
Ok. Then the kernel would need to periodically make these checks.. It
could work, but I see it as yet another communication kludge.
>>>I would really like it if, when autofs decided to forcibly unmount, it
>>>could kill the using processes, as in "fuser -k -m /net/host/mountpoint".
>>>Actually this is a general issue for unmount(8), not specific to autofs.
>>>But one wonders whether fuser is going to run wild...
>>>
>>>
>
>
>
>>Like HPA already mentioned, this is bad. I don't even think you could
>>reliably detect this.
>>
>>
>
>Agreed, you wonder if the right processes are going to be picked. But...
>"Ask for the moon; they might give it to you." In the present case it
>should be possible to stick the fuser execution into the autofs shutdown
>script -- it already is aware that unmounts haven't finished, and it
>probably knows or can find out which filesystems are involved.
>
>
>
This probably isn't best placed in the initscript. The automount daemon
should instead have a timeout setup for unmounting these filesystems.
If the unmount hangs for more than x seconds, move on to the next one.
Let init take care of killing off processes if they haven't died in time
for shutdown.
--
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice
mailto: Michael.Waychison@Sun.COM
http://www.sun.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE: The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[-- Attachment #1.2: Type: application/pgp-signature, Size: 251 bytes --]
[-- Attachment #2: Type: text/plain, Size: 140 bytes --]
_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-09 19:58 ` Jim Carter
2004-01-09 20:17 ` H. Peter Anvin
2004-01-09 21:27 ` Mike Waychison
@ 2004-01-10 5:06 ` Ian Kent
2 siblings, 0 replies; 17+ messages in thread
From: Ian Kent @ 2004-01-10 5:06 UTC (permalink / raw)
To: Jim Carter; +Cc: autofs mailing list
On Fri, 9 Jan 2004, Jim Carter wrote:
> >
> > A user write implies an open file. The expire routine must identify this
> > dentry as busy or it's a bug that needs to be fixed.
>
> No, I meant that when the sys_stat method fills in the mode field of
> the returning stat structure, it would be reported as 555 for a busy
> filesystem, or 755 if it looks idle and dismountable. The mode field as
> seen by the VFS layer can stay at 755. This would be when statting
> autofs's directory inode, not whatever is mounted on it.
Don't see how the modes can identify whether a directory is busy. Its
modes could just as well be 555 anyway.
>
> Is it really true that the VFS layer doesn't maintain use counts for
> directories where you can find them easily?
Good question. I'm not sure but I believe the reference count
should be incremented when the dentry is actually being used, so it
doesn't go away. For example, during a mount a dentry is obtained for the
root directory and, since it is obviously in use it has a none zero
reference count as long as the mount is present.
>
> > > So the key task is to bulldoze aside the wreckage so a new instance of the
> > > revitalized NFS server can be mounted. Killing the moribund processes and
> > > forcibly dismounting the destroyed filesystem instance are important so as
> > > to recover resources, but the client can continue to function for quite a
> > > long time even if that's not done; the only totally vital action is to be
> > > able to remount, and that can possibly be done by renaming, or by
> > > remounting the wreckage elsewhere, deferring the actual dismount.
> >
> > I'll have to think about this for a while. The rename idea might cause all
> > sorts of problems.
>
> I can imagine. Maybe a -move remount? Can you even do this when the
> mounted FS is busy? Well...
>
> mkdir mtpt1 ; mkdir mtpt2
> mount -t nfs hollyfs:/m1 mtpt1
> cd mtpt1
> sleep 3600 & (PID is 2486)
> cd ..
> umount mtpt1 (device is busy -- correct)
> mount --move mtpt1 mtpt2 (works -- yay!)
> ls -l /proc/2486/cwd (says: /proc/2486/cwd -> /tmp/root.jimc/mtpt2/)
> grep mtpt /proc/mounts (says: hollyfs:/m1 /tmp/root.jimc/mtpt2 nfs...)
> grep mtpt /etc/mtab (says: hollyfs:/m1 /tmp/root.jimc/mtpt1 nfs...
> /tmp/root.jimc/mtpt1 /tmp/root.jimc/mtpt2 none rw 0 0
> kill 2486
> umount mtpt2 (works -- correct)
> grep mtpt /proc/mounts (says: nothing)
> grep mtpt /etc/mtab (says: hollyfs:/m1 /tmp/root.jimc/mtpt1, hiss, boo)
>
> So you can do a move mount on a busy filesystem; the only downside is that
> /etc/mtab has a stale entry in it -- obviously a bug, not an essential
> behavior.
>
> > A kill -9 on the blocked process needs further investigation. I'm not sure
> > what the situation is there (probably not good).
>
> I would really like it if, when autofs decided to forcibly unmount, it
> could kill the using processes, as in "fuser -k -m /net/host/mountpoint".
> Actually this is a general issue for unmount(8), not specific to autofs.
> But one wonders whether fuser is going to run wild...
>
> > One thing I noticed in the module expire routine is that it doesn't
> > progress past a failed mount on subsequent calls. A change to the
> > adjustment of the d_subdirs list is probably needed (a bit difficult, but
> > not impossible, in my implementation of the expire routine).
>
> I thought it moved the mount point to the end, before attempting to unmount
> it. Maybe that behavior got changed in a subsequent version, or maybe I'm
> remembering wrong.
I thought it was the front of the list. In fact I thought putting the
expire candidate on the end of the list would help with the infinite loop
problem as well, along with the limit patch you sent. In fact I think that
using the limit patch is a preferable approach to manage this bug for now.
I need to do more work on this in my module patch. Changes I needed to
make have caused this manipulation to become ineffective.
>
> > By the way. Since you are obviously interested in autofs, it would
> > be best if you could use the latest version. Please see kernel.org if
> > you are able to update.
>
> I'll discuss this with my management. By the way, for RPM-based distros, a
> lot of updating happens via mindless scripts, and it helps if ad-hoc fixes
> are packed up as a RPM. Then vendor updates to retro versions don't
> overwrite the latest patches. Example:
> Currently installed: autofs4-4.0.0pre10-504
> I create and install: autofs4-4.1.0-001
> Vendor issues patch: autofs4-4.0.5-678
>
SuSE is a bit difficult. The rpm I have is very much RedHat centric.
Ian
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [patch] buffer overflow, failure to auto-dismount, log file diarrhea
2004-01-09 21:11 ` Jim Carter
@ 2004-01-10 5:09 ` Ian Kent
0 siblings, 0 replies; 17+ messages in thread
From: Ian Kent @ 2004-01-10 5:09 UTC (permalink / raw)
To: Jim Carter; +Cc: autofs mailing list, H. Peter Anvin
On Fri, 9 Jan 2004, Jim Carter wrote:
>
> Just yesterday I had five Linux boxes hang while rebooting, while
> trying to shut down autofs4, because processes using the filesystems had
> been killed but wouldn't die. This human said, I've already decided that
> these machines are idle, and so I just killed the shutdown script, rather
> than researching more drastic action to get rid of the processes, or to
> find out why they weren't exiting, or to do "umount -l", etc.
Please try 4.1 and the kernel module patch (you are using 2.4?). If it
doesn't improve your situation you are no worse of and I will get more
bug info. On the other hand ....
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2004-01-10 5:09 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-11-21 18:56 [patch] buffer overflow, failure to auto-dismount, log file diarrhea Jim Carter
2003-11-22 8:20 ` Ian Kent
2003-11-26 18:39 ` Jim Carter
2003-11-27 14:08 ` Ian Kent
2004-01-01 13:12 ` Ian Kent
2004-01-01 22:49 ` Jim Carter
2004-01-02 0:35 ` Ian Kent
2004-01-03 1:03 ` Jim Carter
2004-01-03 3:47 ` Ian Kent
2004-01-09 19:58 ` Jim Carter
2004-01-09 20:17 ` H. Peter Anvin
2004-01-09 21:11 ` Jim Carter
2004-01-10 5:09 ` Ian Kent
2004-01-09 21:27 ` Mike Waychison
2004-01-09 21:43 ` Jim Carter
2004-01-09 22:00 ` Mike Waychison
2004-01-10 5:06 ` Ian Kent
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.