From: rmorell@nvidia.com
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: mmap vs. ctime bug?
Date: Mon, 9 May 2011 18:23:48 -0700 [thread overview]
Message-ID: <20110510012348.GJ3848@morell.nvidia.com> (raw)
[-- Attachment #1: Type: text/plain, Size: 3076 bytes --]
I tracked an intermittent failure in one of our build systems down to
questionable kernel behavior.
The makefile for the build seems completely reasonable. It essentially does
this (greatly simplified):
output: $(OBJECTS)
ld -o output $(OBJECTS)
$(POSTPROCESS) output
tarball.tgz: output
tar zcf tarball.tgz output
$(POSTPROCESS) in this case is just a program that modifies some ELF headers.
This program does so through libelf, but the important part is that libelf
operates on the file using mmap().
The problem is that the "tar" step sometimes fails with the error:
/bin/tar: output: file changed as we read it
As tar is adding a file to a tarball, it first stats the file, reads the entire
file, then stats it again. It reports the above error if the ctime does not
match between the two stat calls. In the case of the intermittent failure, the
ctime does not match for the file as reported by stat(1).
Adding a sync between the postprocess program's termination and the tar
invocation "fixes" the problem, but adds a significant delay to the overall
build time, so I'd prefer to not do that.
I was able to reproduce the behavior with a simple test case (attached) with
the latest git kernel built from 26822eebb25. To run the test, simply
put test.c and the Makefile in a new directory and run "make runtest".
Note that the filesystem blocks and ctime change between the two stat
invocations, although the mtime remains the same:
# make runtest
gcc test.c -o test
rm -f out
./test out
stat out
File: `out'
Size: 268435456 Blocks: 377096 IO Block: 4096 regular file
Device: 304h/772d Inode: 655367 Links: 1
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2011-05-09 18:06:24.000000000 -0700
Modify: 2011-05-09 18:06:27.000000000 -0700
Change: 2011-05-09 18:06:27.000000000 -0700
sync
stat out
File: `out'
Size: 268435456 Blocks: 524808 IO Block: 4096 regular file
Device: 304h/772d Inode: 655367 Links: 1
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2011-05-09 18:06:24.000000000 -0700
Modify: 2011-05-09 18:06:27.000000000 -0700
Change: 2011-05-09 18:06:28.000000000 -0700
(note: depending on your system, you may need to tweak the "SIZE" constant in
test.c up to see ctime actually change at a resolution of 1s)
Does this seem like a bug to anyone else? For the normal "make" flow to work
properly, files really need to be done changing by the time a process exits and
wait(3) returns to the parent. The heavy-hammer workaround of adding a
sync(1) throws away a ton of potential benefit from the filesystem cache.
Adding an msync(MS_SYNC) in the toy test app also "fixes" the problem, but
that's not feasible in the production environment since libelf is doing the
modification internally and besides, it seems like it shouldn't be necessary.
If it matters, the filesystem is a dead simple ext3 with no special mount
flags, but I suspect this is not specific to FS:
# mount
/dev/hda4 on / type ext3 (rw)
[...]
Thanks,
Robert
[-- Attachment #2: test.c --]
[-- Type: text/x-c, Size: 892 bytes --]
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define SIZE (256 * 1024 * 1024)
int main(int argc, char **argv)
{
char *ptr;
int fd, i;
if (argc < 2) {
printf("Usage: %s <filename>\n", argv[0]);
return 1;
}
fd = open(argv[1], O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
if (fd < 0) {
perror("Failed to open file");
return 1;
}
if (ftruncate(fd, SIZE)) {
perror("Failed to truncate to full size");
return 1;
}
ptr = mmap(NULL, SIZE, PROT_WRITE, MAP_SHARED, fd, 0);
if (ptr == MAP_FAILED) {
perror("Failed to mmap");
return 1;
}
/* touch each page */
for (i = 0; i < SIZE; i += 4096)
ptr[i] = 1;
/* Uncommenting the below line masks the problem */
/* msync(ptr, SIZE, MS_SYNC); */
if (munmap(ptr, SIZE)) {
perror("Failed to munmap");
return 1;
}
close(fd);
return 0;
}
[-- Attachment #3: Makefile --]
[-- Type: text/plain, Size: 114 bytes --]
test: test.c
gcc test.c -o test
runtest: test
rm -f out
./test out
stat out
sync
stat out
.PHONY: runtest
next reply other threads:[~2011-05-10 1:23 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-05-10 1:23 rmorell [this message]
2011-05-11 10:43 ` mmap vs. ctime bug? Jan Kara
2011-05-11 16:24 ` rmorell
2011-05-11 17:01 ` Jan Kara
2011-05-11 17:37 ` Robert Morell
2011-05-12 12:24 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110510012348.GJ3848@morell.nvidia.com \
--to=rmorell@nvidia.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox