From: bugzilla-daemon@bugzilla.kernel.org
To: linux-ext4@vger.kernel.org
Subject: [Bug 15910] zero-length files and performance degradation
Date: Wed, 5 May 2010 18:54:48 GMT
Message-ID: <201005051854.o45IsmJk027144@demeter.kernel.org>
In-Reply-To: <bug-15910-13602@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=15910
Theodore Tso <tytso@mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu
--- Comment #1 from Theodore Tso <tytso@mit.edu> 2010-05-05 18:54:23 ---
Why can't you #1, just fsync after writing the control file, if that's the
primary problem?
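
A minimal sketch of option #1, assuming a POSIX system. The helper name
write_file_durably is hypothetical and is not taken from dpkg; it only
shows the write-then-fsync ordering that keeps the file from coming back
empty after a crash.

```python
import os


def write_file_durably(path: str, data: bytes) -> None:
    """Write data to path and do not return until it has been fsync()ed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file's data and metadata to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)
```

For a brand-new file, the directory entry may also need an fsync() on the
containing directory before the name itself is guaranteed to survive.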
Or #2, make the dpkg recover more gracefully if it finds that the control file
has been truncated down to zero?
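
Option #2 could start with a check along these lines. This is only a
sketch: control_file_is_usable is a hypothetical name, and what the
package manager does when the check fails (re-unpack, re-download, ask the
user) is deliberately left out.

```python
import os


def control_file_is_usable(path: str) -> bool:
    """Return False if the file is missing or was truncated to zero length."""
    try:
        return os.stat(path).st_size > 0
    except FileNotFoundError:
        return False
```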
The reality is that all of the newer file systems are going to have this
property. XFS has always behaved this way. Btrfs will as well. We are _all_
using the same heuristic to force sync a file which is replaced via a rename()
system call, but that's really considered a workaround for buggy application
programs that don't call fsync(), because there are more stupid application
programmers than there are of us file system developers.
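
For reference, a sketch of the replace-via-rename pattern done so that it
does not depend on that heuristic, assuming a POSIX system. The function
name atomic_replace and the ".tmp" suffix are illustrative choices, not
anything dpkg is known to use.

```python
import os


def atomic_replace(path: str, data: bytes) -> None:
    """Replace path with data so a crash leaves either the old or new file."""
    tmp = path + ".tmp"  # hypothetical temporary name in the same directory
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Flush the new contents BEFORE the rename makes them visible.
        os.fsync(fd)
    finally:
        os.close(fd)

    # On POSIX, rename() atomically replaces an existing destination.
    os.rename(tmp, path)

    # fsync the directory so the rename itself is on stable storage.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The point of the ordering is that the data reaches the disk before the new
name does, so the rename can never expose a zero-length file.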
As far as the rest of the files are concerned, what I would suggest doing is
set a sentinel value which is used to indicate that a package is being installed,
and if the system crashes, either in the init scripts or the next time dpkg
runs, it should reinstall that package. That way you're not fsync()'ing every
single file in the package, and you're also not optimizing for the exception
condition. You just have appropriate application-level retries in case of a
crash.
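
One way the sentinel idea might look, as a sketch under several
assumptions: the sentinel is a marker file named "<package>.installing" in
a state directory, the unpack step is passed in as a callable, and a
single os.sync() is used to flush everything before the marker is cleared.
None of these details come from dpkg or from this message; they are
illustrative only.

```python
import os
from typing import Callable, List

SUFFIX = ".installing"  # hypothetical marker-file suffix


def _fsync_dir(directory: str) -> None:
    """Persist directory entries (file creation or removal) in directory."""
    fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def install_with_sentinel(state_dir: str, package: str,
                          unpack: Callable[[], None]) -> None:
    """Run unpack() bracketed by a durable 'install in progress' marker."""
    sentinel = os.path.join(state_dir, package + SUFFIX)
    fd = os.open(sentinel, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    _fsync_dir(state_dir)  # the marker must be on disk before unpacking

    unpack()  # writes the package files with no per-file fsync()

    os.sync()  # assumption: one global flush instead of many fsync() calls
    os.unlink(sentinel)
    _fsync_dir(state_dir)


def packages_needing_reinstall(state_dir: str) -> List[str]:
    """List packages whose marker survived, i.e. an install was interrupted."""
    return sorted(name[:-len(SUFFIX)]
                  for name in os.listdir(state_dir)
                  if name.endswith(SUFFIX))
```

At boot or at the next run, anything returned by
packages_needing_reinstall() would simply be installed again, which is the
application-level retry described above.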
So Debian and Ubuntu have a choice. You can just stick with ext3, and not
upgrade, but this is one place where you can't blackmail file system developers
by saying, "if you don't do this, I'll go use some other file system" ---
because we are *all* doing delayed allocation. It's allowed by POSIX, and
it's the only way to get much better file system performance --- and there are
intelligent ways you can design your applications so the right thing happens on
a power failure. Programmers used to be familiar with these in the days
before ext3, because that's how the world has always worked in Unix.
Ext3 has lousy performance precisely because it guaranteed more semantics than
what was promised by POSIX, and unfortunately, people have gotten flabby
(think: the humans in the movie Wall-E) and lazy about how to write programs
that write to the file system defensively. So if people are upset about the
performance of ext3, great, upgrade to newer file systems. But then you will
need to be careful about how you code applications like dpkg.
In retrospect, I really wish we hadn't given programmers the data=ordered
guarantees in ext3, because they both trashed ext3's performance and caused
application programmers to get the wrong idea about how the world worked.
Unfortunately, the damage has been done....
--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.