* Re: Which FileSystem do you use on your postfix server?
[not found] <20081031121002.D94A11F3E98@spike.porcupine.org>
@ 2008-10-31 12:13 ` Justin Piszcz
2008-10-31 13:32 ` Justin Piszcz
2008-10-31 14:56 ` Eric Sandeen
0 siblings, 2 replies; 7+ messages in thread
From: Justin Piszcz @ 2008-10-31 12:13 UTC (permalink / raw)
To: Postfix users; +Cc: xfs
On Fri, 31 Oct 2008, Wietse Venema wrote:
> Does XFS still overwrite existing files with zeros, when those
> files were open for write at the time of unclean shutdown? This
I believe this was fixed in an early 2.6.2x release, cc'ing xfs mailing
list to confirm.
> would violate a basic requirement of Postfix (don't lose data after
> fsync). Postfix updates existing files all the time: it updates
> queue files as it marks recipients as done, and it updates mailbox
> files as it appends mail.
* Re: Which FileSystem do you use on your postfix server?
From: Justin Piszcz @ 2008-10-31 13:32 UTC (permalink / raw)
Cc: xfs
On Fri, 31 Oct 2008, Justin Piszcz wrote:
>
>
> On Fri, 31 Oct 2008, Wietse Venema wrote:
>
>> Does XFS still overwrite existing files with zeros, when those
>> files were open for write at the time of unclean shutdown? This
> I believe this was fixed in an early 2.6.2x release, cc'ing xfs mailing list
> to confirm.
>
>> would violate a basic requirement of Postfix (don't lose data after
>> fsync). Postfix updates existing files all the time: it updates
>> queue files as it marks recipients as done, and it updates mailbox
>> files as it appends mail.
>
>
No need to respond to this, sent a reply to the postfix list:
http://oss.sgi.com/projects/xfs/faq.html#nulls
Q: Why do I see binary NULLS in some files after recovery when I unplugged
the power?
Update: This issue has been addressed with a CVS fix on the 29th March
2007 and merged into mainline on 8th May 2007 for 2.6.22-rc1.
XFS journals metadata updates, not data updates. After a crash you are
supposed to get a consistent filesystem which looks like the state
sometime shortly before the crash, NOT what the in memory image looked
like the instant before the crash.
Since XFS does not write data out immediately unless you tell it to with
fsync, an O_SYNC or O_DIRECT open (the same is true of other filesystems),
you are looking at an inode which was flushed out, but whose data was not.
Typically you'll find that the inode is not taking any space since all it
has is a size but no extents allocated (try examining the file with the
xfs_bmap(8) command).
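The "size but no extents" state the FAQ describes can be reproduced without a crash. The following is a minimal Python sketch, assuming a Unix-like system and a filesystem that supports sparse files; `make_sparse_file` and `describe` are hypothetical helper names used only for illustration, not part of any XFS tool.

```python
import os
import tempfile


def make_sparse_file(path, size):
    """Create a file that has a size but no written data (assumption: sparse files are supported)."""
    with open(path, "wb") as f:
        f.truncate(size)  # sets the file size only; no data is written


def describe(path):
    """Report the logical size, the allocated bytes, and whether the content reads back as zeros."""
    st = os.stat(path)
    with open(path, "rb") as f:
        data = f.read()
    return {
        "size": st.st_size,
        "allocated_bytes": st.st_blocks * 512,  # st_blocks counts 512-byte units
        "all_zero": data.count(0) == len(data),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sparse.dat")
        make_sparse_file(path, 1 << 20)
        print(describe(path))
```

On a filesystem with sparse-file support the allocated byte count is typically far smaller than the size, and every byte reads back as zero. That is what a file looks like when its size reached disk but its data did not.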
* Re: Which FileSystem do you use on your postfix server?
From: Eric Sandeen @ 2008-10-31 14:56 UTC (permalink / raw)
To: Justin Piszcz; +Cc: Postfix users, xfs, wietse
(Please bear with me; quoting a previous postfix-users email, but I'm
not on that list. Feel free to put this back on the postfix-users list
if it'd otherwise bounce)
> Nikita Kipriyanov:
>> DULMANDAKH Sukhbaatar wrote:
>> > For me XFS seemed very fast. But usually I use ext3, which is
>> > proven to be stable enough for most situations.
>> >
>> >
>> >
>> I feel also that xfs is much faster than ext3 and reiserfs, especially
>> when it deals with metadata. In some bulk operation (bulk changing
>> attributes of ~100000 files) it was approx. 15 times faster than ext3
>> (20 sec xfs, 5 min ext3).
>>
>> xfs's journal covers only metadata, so you probably lose some of the latest
>> not-synced data on power loss, but you will stay with a consistent fs.
>
> Does XFS still overwrite existing files with zeros, when those
> files were open for write at the time of unclean shutdown?
XFS has never done this. (explicitly overwrite with zeros, that is).
There was a time in the past when after a truncate + size update +
crash, the log would replay these metadata operations (truncate+size
update) but the data blocks had never hit the disk (this is assuming
there was no fsync complete), so there were no data blocks (extents)
associated with the file - you wound up with a sparse file as a result.
Reading this led to zeros, of course.
This is NOT the same as "overwriting existing files with zeros" which
xfs has *never* done.
This particular behavior has been fixed in 2 ways, though. One, if a
file has been truncated down, it will be synced on close:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7d4fb40ad7efe4586d1341d4731377fb4530836f
[XFS] Start writeout earlier (on last close) in the case where we have a
truncate down followed by delayed allocation (buffered writes) - worst
case scenario for the notorious NULL files problem. This reduces the
window where we are exposed to that problem significantly.
Two, a separate in-memory vs. on-disk size is now tracked:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ba87ea699ebd9dd577bf055ebc4a98200e337542
[XFS] Fix to prevent the notorious 'NULL files' problem after a crash.
The problem that has been addressed is that of synchronising updates of
the file size with writes that extend a file. Without the fix the update
of a file's size, as a result of a write beyond eof, is independent of
when the cached data is flushed to disk. Often the file size update
would be written to the filesystem log before the data is flushed to
disk. When a system crashes between these two events and the filesystem
log is replayed on mount the file's size will be set but since the
contents never made it to disk the file is full of holes. If some of the
cached data was flushed to disk then it may just be a section of the
file at the end that has holes.
There are existing fixes to help alleviate this problem, particularly in
the case where a file has been truncated, that force cached data to be
flushed to disk when the file is closed. If the system crashes while the
file(s) are still open then this flushing will never occur.
The fix that we have implemented is to introduce a second file size,
called the in-memory file size, that represents the current file size as
viewed by the user. The existing file size, called the on-disk file
size, is the one that gets written to the filesystem log and we only
update it when it is safe to do so. When we write to a file beyond eof
we only update the in-memory file size in the write operation. Later
when the I/O operation, that flushes the cached data to disk completes,
an I/O completion routine will update the on-disk file size. The on-disk
file size will be updated to the maximum offset of the I/O or to the
value of the in-memory file size if the I/O includes eof.
========
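The two-size scheme in the commit message above can be illustrated with a toy model. This is a sketch under stated assumptions, not XFS code: the class `TwoSizeInode` and its method names are hypothetical, and the model ignores out-of-order I/O completion and truncation.

```python
class TwoSizeInode:
    """Toy model of the in-memory vs. on-disk file size idea (illustration only)."""

    def __init__(self):
        self.in_memory_size = 0  # size as seen by the user
        self.on_disk_size = 0    # size that would be written to the log

    def buffered_write(self, offset, length):
        """A write beyond eof moves only the in-memory size."""
        self.in_memory_size = max(self.in_memory_size, offset + length)

    def io_complete(self, offset, length):
        """Data for [offset, offset + length) reached disk; now the on-disk size may grow.

        It grows to the end of the I/O, capped at the in-memory size when
        the I/O covers eof. It never shrinks here.
        """
        end = min(offset + length, self.in_memory_size)
        self.on_disk_size = max(self.on_disk_size, end)

    def size_after_crash(self):
        """Log replay would restore the on-disk size, not the in-memory one."""
        return self.on_disk_size


if __name__ == "__main__":
    inode = TwoSizeInode()
    inode.buffered_write(0, 100)     # user sees 100 bytes
    print(inode.size_after_crash())  # 0: nothing flushed yet, so no holes are exposed
    inode.io_complete(0, 60)
    print(inode.size_after_crash())  # 60: only flushed data is covered by the size
```

In this model a crash before the flush leaves a shorter file rather than a file whose size covers data that never reached disk.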
> This
> would violate a basic requirement of Postfix (don't lose data after
> fsync). Postfix updates existing files all the time: it updates
> queue files as it marks recipients as done, and it updates mailbox
> files as it appends mail.
As long as postfix is looking after data properly with fsyncs etc, xfs
should be perfectly safe w.r.t. data integrity on a crash. If you see
any other behavior, it's a *bug* which should be reported, and I'm sure
it would be fixed. As far as I know, though, there is no issue here.
> Wietse
>
> To: Private List <evals@tux.org>
> From: "Theodore Ts'o" <tytso@mit.edu>
> Date: Sun, 19 Dec 2004 23:10:09 -0500
> Subject: Re: [evals] ext3 vs reiser with quotas
>
> [...]
This email has been quoted too many times, and it's just not accurate.
> This issue is completely different from the XFS issue of zero'ing
> all open files on an unclean shutdown, of course.
As stated above, this does not happen, at least not in the active
zeroing sense.
> [..] The reason
> why it is done is to avoid a potential security problem, where a
> file could be left with someone else's data.
No. The file simply did not have extents on it, because the crash
happened before the data was flushed.
> Ext3 solves this
> problem by delaying the journal commit until the data blocks are
> written, as opposed to trashing all open files. Again, it's a
> solution which can impact performance, but at least in my opinion,
> for a filesystem, performance is Job #2. Making sure you don't lose
> data is Job #1.
And it's equally the job of the application; if an application uses the
proper calls to sync data on xfs, xfs will not lose that data on a crash.
Thanks,
-Eric (a happy postfix+xfs user for years) :)
* Re: Which FileSystem do you use on your postfix server?
From: Wietse Venema @ 2008-10-31 15:37 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Justin Piszcz, Postfix users, xfs, wietse
Eric Sandeen:
> > This
> > would violate a basic requirement of Postfix (don't lose data after
> > fsync). Postfix updates existing files all the time: it updates
> > queue files as it marks recipients as done, and it updates mailbox
> > files as it appends mail.
>
> As long as postfix is looking after data properly with fsyncs etc, xfs
> should be perfectly safe w.r.t. data integrity on a crash. If you see
> any other behavior, it's a *bug* which should be reported, and I'm sure
> it would be fixed. As far as I know, though, there is no issue here.
The specific question is, will unclean shutdown cause loss of data
that was already fsynced, when the file was updated after the fsync.
For example, if the on-disk file metadata is updated after the file
data is appended, then there is no need to have a zero-fill problem
after crash during append.
What if the crash happens after Postfix requests a 1-byte write in
the middle of a file, i.e. without changing the size? A reasonable
implementation would not corrupt the file, but would either update
the file data or not change it. I can deal with that.
Wietse
* Re: Which FileSystem do you use on your postfix server?
From: Dave Chinner @ 2008-10-31 22:18 UTC (permalink / raw)
To: Wietse Venema; +Cc: Eric Sandeen, Justin Piszcz, Postfix users, xfs
On Fri, Oct 31, 2008 at 11:37:58AM -0400, Wietse Venema wrote:
> Eric Sandeen:
> > > This
> > > would violate a basic requirement of Postfix (don't lose data after
> > > fsync). Postfix updates existing files all the time: it updates
> > > queue files as it marks recipients as done, and it updates mailbox
> > > files as it appends mail.
> >
> > As long as postfix is looking after data properly with fsyncs etc, xfs
> > should be perfectly safe w.r.t. data integrity on a crash. If you see
> > any other behavior, it's a *bug* which should be reported, and I'm sure
> > it would be fixed. As far as I know, though, there is no issue here.
>
> The specific question is, will unclean shutdown cause loss of data
> that was already fsynced,
No.
> when the file was updated after the fsync.
and no.
XFS guarantees that you won't lose anything you fsync()d. You might
lose what you wrote after the fsync(), though, because you haven't
fsync()d it. Obvious, yes?
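From the application side, that guarantee translates into "do not report success until fsync() has returned". A minimal Python sketch of a durable append follows, assuming a Unix-like system; `append_durably` is a hypothetical helper name and this is not how Postfix itself is implemented.

```python
import os
import tempfile


def append_durably(path, record):
    """Append bytes to a file and return only after fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        view = memoryview(record)
        while view:                 # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                # data written before this point is now covered
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "mailbox")
        append_durably(path, b"first message\n")
        append_durably(path, b"second message\n")
        with open(path, "rb") as f:
            print(f.read())
```

Anything written after the last fsync() call is not covered, which is the distinction drawn above.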
> For example, if the on-disk file metadata is updated after the file
> data is appended, then there is no need to have a zero-fill problem
> after crash during append.
In case you didn't read Eric's response - that's exactly how we
fixed XFS to prevent this problem. And please stop propagating
this erroneous "zero-fill" meme - Eric addressed how wrong that
FUD is as well.
> What if the crash happens after Postfix requests a 1-byte write in
> the middle of a file, i.e. without changing the size? A
> reasonable implementation would not corrupt the file, but would
> either update the file data or not change it. I can deal with
> that.
That is exactly how XFS has always behaved for non-extending data
overwrite, i.e. exactly the same as pretty much every filesystem that
has ever existed.
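The non-extending overwrite case can be sketched as a positional one-byte write followed by fsync(). This is a minimal Python illustration, assuming a Unix-like system where `os.pwrite` is available; `mark_done` and the one-byte flag layout are hypothetical and do not reflect the real Postfix queue file format.

```python
import os
import tempfile


def mark_done(path, offset, flag=b"D"):
    """Overwrite exactly one byte in place, then fsync. Never extends the file."""
    if len(flag) != 1:
        raise ValueError("flag must be exactly one byte")
    fd = os.open(path, os.O_WRONLY)
    try:
        size = os.fstat(fd).st_size
        if not 0 <= offset < size:
            raise ValueError("offset is outside the file; the write would extend it")
        os.pwrite(fd, flag, offset)  # positional write; the file size is unchanged
        os.fsync(fd)                 # the update is durable once this returns
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "queue_file")
        with open(path, "wb") as f:
            f.write(b"RRRR")         # four hypothetical "recipient pending" flags
        mark_done(path, 2)
        with open(path, "rb") as f:
            print(f.read())          # b'RRDR'
```

After a crash during such a write, the byte holds either the old or the new value; the size and the rest of the file are not involved.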
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Which FileSystem do you use on your postfix server?
From: Wietse Venema @ 2008-10-31 22:56 UTC (permalink / raw)
To: Dave Chinner
Cc: Wietse Venema, Eric Sandeen, Justin Piszcz, Postfix users, xfs
Dave Chinner:
> On Fri, Oct 31, 2008 at 11:37:58AM -0400, Wietse Venema wrote:
> > Eric Sandeen:
> > > > This
> > > > would violate a basic requirement of Postfix (don't lose data after
> > > > fsync). Postfix updates existing files all the time: it updates
> > > > queue files as it marks recipients as done, and it updates mailbox
> > > > files as it appends mail.
> > >
> > > As long as postfix is looking after data properly with fsyncs etc, xfs
> > > should be perfectly safe w.r.t. data integrity on a crash. If you see
> > > any other behavior, it's a *bug* which should be reported, and I'm sure
> > > it would be fixed. As far as I know, though, there is no issue here.
> >
> > The specific question is, will unclean shutdown cause loss of data
> > that was already fsynced,
>
> No.
>
> > when the file was updated after the fsync.
>
> and no.
>
> XFS guarantees that you won't lose anything you fsync()d. You might
> lose what you wrote after the fsync(), though, because you haven't
> fsync()d it. Obvious, yes?
This is how I hoped any reasonable implementation would work. The
stories about null files made me wonder if there was something
unusual about XFS that I should be aware of.
> > For example, if the on-disk file metadata is updated after the file
> > data is appended, then there is no need to have a zero-fill problem
> > after crash during append.
>
> In case you didn't read Eric's response - that's exactly how we
> fixed XFS to prevent this problem. And please stop propagating
> this erroneous "zero-fill" meme - Eric addressed how wrong that
> FUD is as well.
Just confirming a specific case that I care about.
Here's something I would like to know regarding the order of
directory updates:
- Does fsync(file) guarantee the file's directory entry is safe?
Some file systems complete directory updates before the open/link/rename
system call returns, so fsync() doesn't have to worry about it.
- Does rename() guarantee that at least one directory entry will
exist even when the system crashes in the middle of the operation?
Postfix assumes both answers are "yes"; old ext2fs violated both
assumptions.
> > What if the crash happens after Postfix requests a 1-byte write in
> > the middle of a file, i.e. without changing the size? A
> > reasonable implementation would not corrupt the file, but would
> > either update the file data or not change it. I can deal with
> > that.
>
> That is exactly how XFS has always behaved for non-extending data
> overwrite, i.e. exactly the same as pretty much every filesystem that
> has ever existed.
Good. Thanks for confirming that XFS is not unusual.
Wietse
* Re: Which FileSystem do you use on your postfix server?
From: Dave Chinner @ 2008-11-02 21:44 UTC (permalink / raw)
To: Wietse Venema; +Cc: Eric Sandeen, Justin Piszcz, Postfix users, xfs
On Fri, Oct 31, 2008 at 06:56:15PM -0400, Wietse Venema wrote:
> Dave Chinner:
> Here's something I would like to know regarding the order of
> directory updates:
>
> - Does fsync(file) guarantee the file's directory entry is safe?
No.
> Some file systems complete directory updates before the open/link/rename
> system call returns, so fsync() doesn't have to worry about it.
If you run with '-o dirsync', all directory transactions are guaranteed
to be in the log on disk by the time the syscall returns. Note that
this means you do at least one log write per create/link/rename/unlink
syscall, which has performance impact....
> - Does rename() guarantee that at least one directory entry will
> exist even when the system crashes in the middle of the operation?
Yes - either it will complete atomically or no change will occur at
all.
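Taken together, the two answers suggest a common application-side pattern when the filesystem is not mounted with dirsync: fsync the file, rename it into place, then fsync the containing directory. The Python sketch below assumes a Unix-like system where a directory can be opened read-only and passed to fsync(), as the Linux fsync(2) manual page describes; `fsync_dir` and `publish` are hypothetical helper names, and whether this is sufficient on a particular filesystem is an assumption to verify.

```python
import os
import tempfile


def fsync_dir(dirpath):
    """Flush a directory so that entries created or renamed in it reach disk."""
    fd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def publish(dirpath, name, data):
    """Write data to a temporary name, fsync it, rename it into place, fsync the directory."""
    tmp = os.path.join(dirpath, "." + name + ".tmp")
    final = os.path.join(dirpath, name)
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:                  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                 # file contents first
    finally:
        os.close(fd)
    os.rename(tmp, final)            # old or new entry exists, never neither
    fsync_dir(dirpath)               # then the directory entry itself


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        publish(d, "message", b"hello\n")
        print(sorted(os.listdir(d)))  # ['message']
```

The extra directory fsync costs one more flush per operation, which is the same kind of trade-off noted above for dirsync.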
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com