* [uml-devel] Help? (Filesystem race condition in 2.6.13.2 in -skas0 and -tt mode.)
@ 2005-09-20 20:24 Rob Landley
2005-10-08 22:00 ` Rob Landley
0 siblings, 1 reply; 2+ messages in thread
From: Rob Landley @ 2005-09-20 20:24 UTC (permalink / raw)
To: user-mode-linux-devel
I have a system build that happens under UML. The build is a series of bash
scripts that compile and run a heavily modified Linux From Scratch system.
The previous version was working fine under 2.6.11 UML in -tt mode (with one
patch applied, the one to fix the permissions on block devices in hostfs).
I just upgraded the build to use 2.6.13.2, and now the build breaks with
random "file not found" errors. I've tried both -tt mode and -skas0 mode,
and it fails in both.
It seems to be a filesystem problem. The filesystem it's using is an ext2
partition that lives on a loopback mounted file living in hostfs. Somewhere
along the way, random "file not found" errors are sneaking in, on files the
build created seconds earlier.
Here's the end of the last three build attempts. These are from re-running
the build with no changes. It dies in different places, but with the same
_kind_ of error each time. The build process re-creates the loopback
partition from scratch each time: it deletes the working directory with all
the tempfiles in it, then runs dd if=/dev/zero followed by mke2fs. The image
file lives on a hostfs mount and is then loopback mounted.
What info do you need to be able to reproduce this problem? (I'll happily
send you a tarball of my whole build, but it's 100 megs, the vast majority
of that being the source tarballs for linux, gcc, and binutils.)
---------------------------------------
make[2]: Nothing to be done for `install-data-am'.
make[2]: Leaving directory `/tools/sources/diffutils-2.8.1/ms'
make[1]: Leaving directory `/tools/sources/diffutils-2.8.1/ms'
Making install in src
make[1]: Entering directory `/tools/sources/diffutils-2.8.1/src'
make[2]: Entering directory `/tools/sources/diffutils-2.8.1/src'
/bin/sh ../config/mkinstalldirs /tools/bin
/tools/bin/install -c cmp /tools/bin/cmp
/tools/bin/install -c diff /tools/bin/diff
/tools/bin/install -c diff3 /tools/bin/diff3
/tools/bin/install -c sdiff /tools/bin/sdiff
make[2]: Nothing to be done for `install-data-am'.
make[2]: Leaving directory `/tools/sources/diffutils-2.8.1/src'
make[1]: Leaving directory `/tools/sources/diffutils-2.8.1/src'
Making install in man
make[1]: Entering directory `/tools/sources/diffutils-2.8.1/man'
make[2]: Entering directory `/tools/sources/diffutils-2.8.1/man'
make[2]: Nothing to be done for `install-exec-am'.
/bin/sh ../config/mkinstalldirs /tools/man/man1
/tools/bin/install -c -m 644 ./cmp.1 /tools/man/man1/cmp.1
/tools/bin/install -c -m 644 ./diff.1 /tools/man/man1/diff.1
/tools/bin/install -c -m 644 ./diff3.1 /tools/man/man1/diff3.1
/tools/bin/install -c -m 644 ./sdiff.1 /tools/man/man1/sdiff.1
install: unable to open `/tools/man/man1/sdiff.1/sdiff.1': No such file or
directory
install: cannot change permissions of /tools/man/man1/sdiff.1/sdiff.1: No such
file or directory
install: cannot change ownership of /tools/man/man1/sdiff.1/sdiff.1: No such
file or directory
make[2]: *** [install-man1] Error 1
make[2]: Leaving directory `/tools/sources/diffutils-2.8.1/man'
make[1]: *** [install-am] Error 2
make[1]: Leaving directory `/tools/sources/diffutils-2.8.1/man'
make: *** [install-recursive] Error 1
System halted.
=== Toolchain build script exiting.
bin/ln -f /usr/man/man3/uuid_generate.3 /usr/man/man3/uuid_generate_random.3
/bin/ln -f /usr/man/man3/uuid_generate.3 /usr/man/man3/uuid_generate_time.3
make[1]: Leaving directory `/var/tmp/e2fsprogs-build/lib/uuid'
making install in lib/blkid
make[1]: Entering directory `/var/tmp/e2fsprogs-build/lib/blkid'
../../../e2fsprogs-1.34/mkinstalldirs /lib \
/usr/lib
/bin/install -c libblkid.so.1.0 /lib/libblkid.so.1.0
strip --strip-debug \
/lib/libblkid.so.1.0
ln -s -f libblkid.so.1.0 /lib/libblkid.so.1
ln -s -f /lib/libblkid.so.1 \
/usr/lib/libblkid.so
/sbin/ldconfig
../../../e2fsprogs-1.34/mkinstalldirs /usr/lib \
/usr/include/blkid
mkdir /usr/include/blkid
/bin/install -c -m 644 libblkid.a /usr/lib/libblkid.a
/bin/chmod 644 /usr/lib/libblkid.a
ranlib /usr/lib/libblkid.a
/bin/chmod 444 /usr/lib/libblkid.a
set -e; for i in blkid.h; do \
/bin/install -c -m 644 ../../../e2fsprogs-1.34/lib/blkid/$i /usr/include/blkid/$i; \
done
install: unable to open `/usr/include/blkid/blkid.h/blkid.h': No such file or
directory
install: cannot change permissions of /usr/include/blkid/blkid.h/blkid.h: No
such file or directory
install: cannot change ownership of /usr/include/blkid/blkid.h/blkid.h: No
such file or directory
make[1]: *** [install] Error 1
make[1]: Leaving directory `/var/tmp/e2fsprogs-build/lib/blkid'
make: *** [install-libs-recursive] Error 1
Kernel panic - not syncing: Attempted to kill init!
=== Toolchain build script exiting.
make[2]: Leaving directory `/tools/sources/diffutils-2.8.1/lib'
make[1]: Leaving directory `/tools/sources/diffutils-2.8.1/lib'
Making install in m4
make[1]: Entering directory `/tools/sources/diffutils-2.8.1/m4'
make[2]: Entering directory `/tools/sources/diffutils-2.8.1/m4'
make[2]: Nothing to be done for `install-exec-am'.
make[2]: Nothing to be done for `install-data-am'.
make[2]: Leaving directory `/tools/sources/diffutils-2.8.1/m4'
make[1]: Leaving directory `/tools/sources/diffutils-2.8.1/m4'
Making install in ms
make[1]: Entering directory `/tools/sources/diffutils-2.8.1/ms'
make[2]: Entering directory `/tools/sources/diffutils-2.8.1/ms'
make[2]: Nothing to be done for `install-exec-am'.
make[2]: Nothing to be done for `install-data-am'.
make[2]: Leaving directory `/tools/sources/diffutils-2.8.1/ms'
make[1]: Leaving directory `/tools/sources/diffutils-2.8.1/ms'
Making install in src
make[1]: Entering directory `/tools/sources/diffutils-2.8.1/src'
make[2]: Entering directory `/tools/sources/diffutils-2.8.1/src'
/bin/sh ../config/mkinstalldirs /tools/bin
/tools/bin/install -c cmp /tools/bin/cmp
/tools/bin/install -c diff /tools/bin/diff
/tools/bin/install -c diff3 /tools/bin/diff3
/tools/bin/install -c sdiff /tools/bin/sdiff
install: unable to open `/tools/bin/sdiff/sdiff': No such file or directory
install: cannot change permissions of /tools/bin/sdiff/sdiff: No such file or
directory
install: cannot change ownership of /tools/bin/sdiff/sdiff: No such file or
directory
make[2]: *** [install-binPROGRAMS] Error 1
make[2]: Leaving directory `/tools/sources/diffutils-2.8.1/src'
make[1]: *** [install-am] Error 2
make[1]: Leaving directory `/tools/sources/diffutils-2.8.1/src'
make: *** [install-recursive] Error 1
System halted.
=== Toolchain build script exiting.
-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP. Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
* Re: [uml-devel] Help? (Filesystem race condition in 2.6.13.2 in -skas0 and -tt mode.)
2005-09-20 20:24 [uml-devel] Help? (Filesystem race condition in 2.6.13.2 in -skas0 and -tt mode.) Rob Landley
@ 2005-10-08 22:00 ` Rob Landley
0 siblings, 0 replies; 2+ messages in thread
From: Rob Landley @ 2005-10-08 22:00 UTC (permalink / raw)
To: user-mode-linux-devel
On Tuesday 20 September 2005 15:24, Rob Landley wrote:
> I have a system build that happens under UML. The build is a series of
> bash scripts that compile and run a heavily modified Linux From Scratch
> system. The previous version was working fine under 2.6.11 UML in -tt mode
> (with one patch applied, the one to fix the permissions on block devices in
> hostfs).
>
> I just upgraded the build to use the 2.6.13.2, and now I'm getting the
> build breaking with random file not found errors. I've tried both -tt mode
> and -skas0 mode, and it doesn't like either one.
>
> It seems to be a filesystem problem.
Update: the filesystem seems to be reporting something that isn't a
directory as a directory (S_ISDIR() comes back true). Looking closer at the
failure cases I posted:
> /tools/bin/install -c -m 644 ./sdiff.1 /tools/man/man1/sdiff.1
> install: unable to open `/tools/man/man1/sdiff.1/sdiff.1': No such file or
> directory
> install: cannot change permissions of /tools/man/man1/sdiff.1/sdiff.1: No
> such file or directory
> install: cannot change ownership of /tools/man/man1/sdiff.1/sdiff.1: No
> such file or directory
Note how the last argument, "/tools/man/man1/sdiff.1", has an extra
"/sdiff.1" appended to the end.
and
> /tools/bin/install -c sdiff /tools/bin/sdiff
> install: unable to open `/tools/bin/sdiff/sdiff': No such file or directory
> install: cannot change permissions of /tools/bin/sdiff/sdiff: No such file
> or directory
> install: cannot change ownership of /tools/bin/sdiff/sdiff: No such file or
> directory
Again, the string "/tools/bin/sdiff" is turning into
"/tools/bin/sdiff/sdiff"...
The busybox source code in question can be viewed out of our source control
here:
http://www.busybox.net/cgi-bin/viewcvs.cgi/*checkout*/trunk/busybox/coreutils/install.c?content-type=text%2Fplain&rev=11515
And the relevant snippet is:
	cp_mv_stat2(argv[argc - 1], &statbuf, lstat);
	for (i = optind; i < argc - 1; i++) {
		unsigned char *dest;
		if (S_ISDIR(statbuf.st_mode)) {
			dest = concat_path_file(argv[argc - 1], basename(argv[i]));
		} else {
			dest = argv[argc - 1];
		}
		ret |= copy_file(argv[i], dest, copy_flags);
		/* Set the file mode */
		if (chmod(dest, mode) == -1) {
			bb_perror_msg("cannot change permissions of %s", dest);
			ret = EXIT_FAILURE;
		}
		/* Set the user and group id */
		if (lchown(dest, uid, gid) == -1) {
			bb_perror_msg("cannot change ownership of %s", dest);
			ret = EXIT_FAILURE;
		}
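The effect of that S_ISDIR() branch on the destination path can be sketched
on its own. This is only an illustration: choose_dest() and base_name() are
hypothetical stand-ins, not the busybox helpers, and the "is a directory"
answer is passed in as a flag instead of coming from lstat().

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for basename(): return the part after the last '/'. */
static const char *base_name(const char *path)
{
	const char *slash = strrchr(path, '/');
	return slash ? slash + 1 : path;
}

/* Hypothetical stand-in for the destination logic in the install loop:
 * if the target is believed to be a directory, append the source's
 * basename; otherwise use the target path unchanged. */
static void choose_dest(const char *target, const char *src,
			int target_is_dir, char *out, size_t outlen)
{
	if (target_is_dir)
		snprintf(out, outlen, "%s/%s", target, base_name(src));
	else
		snprintf(out, outlen, "%s", target);
}
```

Calling choose_dest("/tools/bin/sdiff", "sdiff", 1, buf, sizeof buf) yields
the doubled path from the error messages, while passing 0 for the flag
leaves the path alone. So a single wrong "is a directory" answer is enough
to explain the exact strings in the logs.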
and cp_mv_stat2 is:
extern int cp_mv_stat2(const char *fn, struct stat *fn_stat, stat_func sf)
{
	if (sf(fn, fn_stat) < 0) {
		if (errno != ENOENT) {
			bb_perror_msg("unable to stat `%s'", fn);
			return -1;
		}
		return 0;
	} else if (S_ISDIR(fn_stat->st_mode)) {
		return 3;
	}
	return 1;
}
(Yeah, it needs a cleanup, I know...)
So the uClibc lstat() is producing a result for which S_ISDIR() is true on a
freshly created normal file. It's definitely an intermittent problem. It
_seems_ like the stat structure is not zeroed out or something.
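One thing worth ruling out, going only by the two snippets above: the
install loop discards the return value of cp_mv_stat2(), and cp_mv_stat2()
returns 0 without writing to the buffer when the stat call fails with
ENOENT. If the destination does not exist yet at that point, S_ISDIR() would
be testing whatever happened to be on the stack, which could easily look
intermittent. A minimal sketch of a check that avoids that, where
dest_is_dir() is a hypothetical helper and not part of busybox or uClibc:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper, not busybox code: report whether `path` is an
 * existing directory. Returns 1 for a directory, 0 for anything else
 * (including a path that does not exist yet), -1 on other errors. */
static int dest_is_dir(const char *path)
{
	struct stat st;

	memset(&st, 0, sizeof st);	/* never test stack garbage */
	if (lstat(path, &st) < 0)
		return errno == ENOENT ? 0 : -1;
	return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

If swapping something like this in for the bare S_ISDIR(statbuf.st_mode)
test makes the failures go away, the problem is probably in the applet and
not in the filesystem stack. If they persist, that points back at lstat()
or the layers underneath it.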
This is linked against uClibc 0.9.27, compiled with "large file support" and
thus using the following implementation for lstat:
int lstat64(const char *file_name, struct stat64 *buf)
{
	int result;
	struct kernel_stat64 kbuf;
	result = __syscall_lstat64(file_name, &kbuf);
	if (result == 0) {
		__xstat64_conv(&kbuf, buf);
	}
	return result;
}
And the opaque bit of the above is:
void __xstat64_conv(struct kernel_stat64 *kbuf, struct stat64 *buf)
{
	/* Convert to current kernel version of `struct stat64'. */
	buf->st_dev = kbuf->st_dev;
	buf->st_ino = kbuf->st_ino;
#ifdef _HAVE_STAT64___ST_INO
	buf->__st_ino = kbuf->__st_ino;
#endif
	buf->st_mode = kbuf->st_mode;
	buf->st_nlink = kbuf->st_nlink;
	buf->st_uid = kbuf->st_uid;
	buf->st_gid = kbuf->st_gid;
	buf->st_rdev = kbuf->st_rdev;
	buf->st_size = kbuf->st_size;
	buf->st_blksize = kbuf->st_blksize;
	buf->st_blocks = kbuf->st_blocks;
	buf->st_atime = kbuf->st_atime;
	buf->st_mtime = kbuf->st_mtime;
	buf->st_ctime = kbuf->st_ctime;
}
Um, this might be relevant too:
#define __NR___syscall_lstat64 __NR_lstat64
_syscall2(int, __syscall_lstat64, const char *, file_name,
	  struct stat64 *, buf);
Now: given all that, do any of you guys have the foggiest idea what's going
on? This is running install on an ext2-formatted file that is loopback
mounted and lives in a hostfs partition. So the stack of things getting
exercised is ext2 on /dev/loop0, attached to a file in hostfs, which is
exporting an ext3 partition from the host system (Ubuntu "Hoary Hedgehog").
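To separate the libc/kernel path from the install applet, a small standalone
test could create a regular file and lstat() it immediately, many times
over, on the suspect filesystem. The following is a sketch under that
assumption. count_misreports() and the file naming scheme are made up for
illustration, and it has not been run under UML.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stress test: create a regular file `iterations` times in
 * `dir`, lstat() it right away, and count how often the result is not a
 * regular file. Returns that count, or -1 if a call fails outright. */
static int count_misreports(const char *dir, int iterations)
{
	char path[4096];
	int bad = 0;

	for (int i = 0; i < iterations; i++) {
		struct stat st;
		int fd;

		snprintf(path, sizeof path, "%s/lstat-test-%ld-%d",
			 dir, (long)getpid(), i);
		fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
		if (fd < 0)
			return -1;
		close(fd);
		if (lstat(path, &st) < 0) {
			unlink(path);
			return -1;
		}
		if (!S_ISREG(st.st_mode))
			bad++;
		unlink(path);
	}
	return bad;
}
```

Pointing this at a directory on the loopback-mounted ext2 image, with a
large iteration count, should return 0 if lstat() is behaving. A nonzero
count would reproduce the problem without busybox in the picture.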
By the way, I've cherry-picked out all the binaries and libraries involved in
this, and got it down to a 650k tarball. Anybody want that?
Rob