From: bugzilla-daemon@bugzilla.kernel.org
To: linux-ext4@vger.kernel.org
Subject: [Bug 14354] Bad corruption with 2.6.32-rc1 and upwards
Date: Sat, 17 Oct 2009 10:51:49 GMT
Message-ID: <200910171051.n9HApnhw015305@demeter.kernel.org>
In-Reply-To: <bug-14354-13602@http.bugzilla.kernel.org/>

http://bugzilla.kernel.org/show_bug.cgi?id=14354





--- Comment #78 from Theodore Tso <tytso@mit.edu>  2009-10-17 10:51:41 ---
Alexey,

There is a very big difference between _files_ being corrupted and the
_file_ _system_ being corrupted.  Your test, as I understand it, is a
"make modules_install" from a kernel source tree, followed immediately
by a forced crash of the system, correct?  Are you doing an "rm -rf
/lib/modules/2.6.32-XXXX" first, or are you just doing a "make
modules_install" and overwriting files?

In any case, if you don't do a forced sync of the filesystem, some of
the recently written files will be corrupted.  (Specifically, they may
be only partially written, or truncated to zero-length.)  This is
normal and to be expected.  If you want to make sure files are written
to stable storage, you *must* use sync or fsync(2).

This is true for pretty much any file system, by the way.  If you have
a script that looks something like this

#!/bin/sh
rm -rf /lib/modules/`uname -r`
make modules_install
echo c > /proc/sysrq-trigger

you _will_ end up with some files being missing, or not fully written
out.  Try it with ext3, xfs, btrfs, reiserfs.  All Unix filesystems
have some amount of asynchronous writes, because otherwise performance
would suck donkey gonads.  You can try to mount with -o sync, just to
see how horrible things would be.

So what do you do if you have a "precious" file --- a file where you
want to update its contents, but you want to make absolutely sure
either the old file or the new file's contents will still be present?
Well, you have to use fsync().  Well-written text editors and things
like mail transfer agents tend to get this right.  Here's one right
way of doing it:

1)  fd = open("foobar.new", O_WRONLY|O_CREAT|O_TRUNC, mode_of_foobar);
2)  /* copy acl's, extended attributes from foobar to foobar.new */
3)  write(fd, buf, bufsize); /* Write the new contents of foobar */
4)  fsync(fd);
5)  close(fd);
6)  rename("foobar.new", "foobar");

The basic idea is you write the new file, then you use fsync() to
guarantee that the contents have been written to disk, and then
finally you rename the new file on top of the old one.
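
A minimal C sketch of the six steps above, with error checking added.
The helper names replace_file() and write_all(), the ".new" suffix
handling, and the fixed-size path buffer are assumptions made for
illustration; they are not taken from any particular application.
Copying ACLs and extended attributes (step 2) is left out.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all of buf, retrying on short writes and EINTR. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Remove the temporary file but report the original error. */
static int fail_and_cleanup(const char *tmp)
{
    int saved = errno;
    unlink(tmp);
    errno = saved;
    return -1;
}

/*
 * Hypothetical helper: replace the contents of `path` so that after a
 * crash either the old or the new contents are present.
 * Returns 0 on success, -1 with errno set on failure.
 */
int replace_file(const char *path, const char *buf, size_t len, mode_t mode)
{
    char tmp[4096];
    int fd, n;

    assert(path != NULL && buf != NULL);

    n = snprintf(tmp, sizeof(tmp), "%s.new", path);
    if (n < 0 || (size_t)n >= sizeof(tmp)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* Step 1: create the new file next to the old one. */
    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, mode);
    if (fd < 0)
        return -1;

    /* Steps 3 and 4: write the new contents, then force them to disk. */
    if (write_all(fd, buf, len) < 0 || fsync(fd) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return fail_and_cleanup(tmp);
    }

    /* Step 5: close, still checking for a late error. */
    if (close(fd) < 0)
        return fail_and_cleanup(tmp);

    /* Step 6: only now put the new file in place of the old one. */
    if (rename(tmp, path) < 0)
        return fail_and_cleanup(tmp);

    return 0;
}
```

A caller would use it as replace_file("foobar", buf, bufsize, 0644).
The ordering is the point: rename() is not reached unless fsync() has
already succeeded on the new file.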

As it turns out, for a long time Linux systems were drop dead
reliable.  Unfortunately, recently with the advent of ACPI
suspend/resume, which assumed that BIOS authors were competent and
would test on OS's other than Windows, and proprietary video drivers
that tend to be super unreliable, Linux systems have started crashing
more often.  Worse yet, application writers started getting
sloppy, and would write code sequences like this when they want to
update files:

1)  fd = open("foobar", O_WRONLY|O_CREAT|O_TRUNC, default_mode);
2)  write(fd, buf, bufsize); /* write the new contents of foobar */
3)  close(fd);

Or this:

1)  fd = open("foobar.new", O_WRONLY|O_CREAT|O_TRUNC, mode_of_foobar);
2)  write(fd, buf, bufsize); /* Write the new contents of foobar */
3)  close(fd);
4)  rename("foobar.new", "foobar");

I call the first "update-via-truncate" and the second
"update-via-rename".  Because with delayed allocation, files have a
tendency to become zero-length if you update them without using
fsync() and then an errant ACPI BIOS or buggy video driver takes your
system down --- and because KDE was updating many more dot files than
necessary, and Firefox was writing half a megabyte of disk files for
every single web click, people really started to notice problems.

As a result, we have heuristics that detect update-via-rename and
update-via-truncate, and if we detect this write pattern, we force a
background writeback of that file.  It's not a synchronous writeback,
since that would destroy performance, but a very small amount of time
after close(2)'ing a file descriptor that was opened with O_TRUNC or
which had been explicitly truncated down to zero using ftruncate(2)
--- i.e., update-via-truncate --- or after a rename(2) which causes
an inode to be unlinked --- i.e., update-via-rename --- the contents
of that file will be written to disk.  This is what auto_da_alloc=0
inhibits.

So why is it that you apparently had no data loss when you used
auto_da_alloc=0?  I'm guessing because the entire script's file system
activity fit within a single jbd2 transaction, and the transaction
never committed before the script forced a system crash.  (Normally a
transaction will contain five seconds of filesystem activity, unless
(a) a program calls fsync(), or (b) there's been enough file system
activity that a significant chunk of the journal space has been
consumed.)

One of the changes between 2.6.31 and 2.6.32-rc1 was a bugfix that
fixed a problem in 2.6.31 where update-via-truncate wasn't getting
detected.  This got fixed in 2.6.32-rc1, and that does change when
data gets forced out to disk.

But in any case, if it's just a matter of the file contents not
getting written to disk, that's expected if you don't use fsync() and
you crash immediately afterwards.  As I said earlier, all file systems
will tend to lose data if you crash without first using fsync().

The bug which I'm interested in replicating is one where the actual
_file_ _system_ is getting corrupted.  But if it's just a matter of
not using sync() or fsync() before a crash, that's not a bug.

                                   - Ted

