* Same magic in statfs() call for ext?
From: Jan Kara @ 2009-03-16 13:36 UTC
To: linux-ext4

  Hi,

  I've just noticed that EXT2_SUPER_MAGIC == EXT3_SUPER_MAGIC ==
EXT4_SUPER_MAGIC. That is just fine for the disk format but as a result we
also return the same magic in statfs() syscall and thus a simple
application has a hard time recognizing whether it works on ext2, ext3 or
ext4 (it would have to parse /proc/mounts and that is non-trivial if not
impossible when it comes to bind mounts). So should not we return different
magic numbers depending on how the filesystem is currently mounted?
  Now you may ask why should the application care - and I agree that in the
ideal world it should not. But for example there's a thread on GTK mailing
list [1] where they discuss the problem that with delayed allocation and
ext4, user can easily lose his data after crash (Ted wrote about it here in
some other mail some time ago). So they would like to call fsync() after
the file is written but on ext3 that is quite heavy and because of autosave
saving happens quite often. So they'd do fsync() only if the filesystem
is mounted as ext4...
  So I'm writing here to hear some opinions on returning different magic
numbers from statfs().

								Honza

[1] http://mail.gnome.org/archives/gtk-devel-list/2009-March/msg00082.html
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
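
[Illustration, not part of the original message.] A minimal sketch, in Python, of the /proc/mounts fallback that the message calls non-trivial. The function name `fstype_from_mounts` and its longest-prefix rule are assumptions made for this example. It takes the mount table as text so the logic can be read on its own, and it deliberately does not handle the bind-mount problem raised above, octal-escaped mount points, or a later mount on a parent directory that hides an earlier one.

```python
import os


def fstype_from_mounts(path, mounts_text):
    """Guess the filesystem type for `path` from /proc/mounts-style text.

    Each line has the form: device mountpoint fstype options dump pass.
    The longest mount point that is a prefix of `path` wins; when the same
    mount point appears twice, the later line (mounted on top) wins.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fstype = fields[1], fields[2]
        if mount_point == "/":
            is_prefix = True
        else:
            # Match whole path components only, so /home does not match /homework.
            is_prefix = path == mount_point or path.startswith(mount_point + "/")
        if is_prefix and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type
```

On a live system the text would come from reading /proc/mounts. Even then the result only names the filesystem driver; it says nothing about mount options such as delayed allocation.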
* Re: Same magic in statfs() call for ext?
From: Eric Sandeen @ 2009-03-16 16:13 UTC
To: Jan Kara; +Cc: linux-ext4

Jan Kara wrote:
>   Hi,
>
>   I've just noticed that EXT2_SUPER_MAGIC == EXT3_SUPER_MAGIC ==
> EXT4_SUPER_MAGIC.

Just noticed?  *grin*

> That is just fine for the disk format but as a result we
> also return the same magic in statfs() syscall and thus a simple
> application has a hard time recognizing whether it works on ext2, ext3 or
> ext4 (it would have to parse /proc/mounts and that is non-trivial if not
> impossible when it comes to bind mounts).

I have a guess as to why they want to know, and ...

> So should not we return different
> magic numbers depending on how the filesystem is currently mounted?
>   Now you may ask why should the application care - and I agree that in the
> ideal world it should not. But for example there's a thread on GTK mailing
> list [1] where they discuss the problem that with delayed allocation and
> ext4, user can easily lose his data after crash

... sadly I was right.  :)

> (Ted wrote about it here in
> some other mail some time ago). So they would like to call fsync() after
> the file is written but on ext3 that is quite heavy and because of autosave
> saving happens quite often. So they'd do fsync() only if the filesystem
> is mounted as ext4...
>   So I'm writing here to hear some opinions on returning different magic
> numbers from statfs().
>
> 								Honza
>
> [1] http://mail.gnome.org/archives/gtk-devel-list/2009-March/msg00082.html

As an aside, Ted also pointed out that ext4-without-delalloc also hurts
on fsync just like ext3 does, so testing "ext3 vs. ext4" isn't quite
enough in general.

I have been a bit dismayed that app writers just want the old ext3
behavior (which still has a window for loss, doesn't it?) so that they
can get away without fsyncing.  And talking to KDE folks and others, I
think that if ext3 didn't hurt so much w/ fsync, they would just happily
do the right posix-defined thing and add fsync() when needed.

But instead, since they are now justifiably afraid of fsync, we are in
this quandary.  (maybe this is over-simplifying a bit).

But off the top of my head, I think that I would prefer to see
applications generally do the right, posix-conformant thing w.r.t. data
integrity (i.e. fsync()) unless, via statfs, they find out "fsync hurts,
and we're likely to be reasonably safe without it"

IOW, adding exceptions for ext3 sounds better to me than munging ext4,
xfs, btrfs, and all future filesystems to conform to some behavior which
isn't in any API or spec ...

-Eric
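
[Illustration, not part of the original message.] The policy described above (fsync() by default, with an opt-out only where fsync is known to hurt) could look roughly like this in Python. The names `FSYNC_HURTS`, `should_fsync` and `save_file`, the ".tmp" suffix, and the write-to-temporary-then-rename save pattern are all assumptions made for this sketch; none of them come from the thread or from any toolkit's real API. How the application learns the filesystem type is left to the caller.

```python
import os

# Hypothetical exception list: filesystems where the application has decided
# that fsync() is too expensive and the risk without it is acceptable.
FSYNC_HURTS = {"ext3"}


def should_fsync(fstype):
    """Default to fsync(); skip it only for explicitly listed filesystems.

    An unknown type (None or anything unlisted) gets the safe behavior.
    """
    return fstype not in FSYNC_HURTS


def save_file(path, data, fstype=None):
    """Write `data` (bytes) to a temporary file, then rename it over `path`."""
    tmp_path = path + ".tmp"  # assumed naming scheme for the temporary file
    with open(tmp_path, "wb") as f:
        f.write(data)
        if should_fsync(fstype):
            f.flush()              # push Python's buffer to the kernel
            os.fsync(f.fileno())   # ask the kernel to push it to stable storage
    os.replace(tmp_path, path)     # atomically swap in the new contents
```

The point of the design is the direction of the default: a new or unrecognized filesystem gets fsync(), and only the listed exception skips it. As noted in the message, a type name alone does not capture cases such as ext4 mounted without delalloc.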
* Re: Same magic in statfs() call for ext?
From: Jan Kara @ 2009-03-16 16:27 UTC
To: Eric Sandeen; +Cc: linux-ext4

On Mon 16-03-09 11:13:13, Eric Sandeen wrote:
> Jan Kara wrote:
> >   Hi,
> >
> >   I've just noticed that EXT2_SUPER_MAGIC == EXT3_SUPER_MAGIC ==
> > EXT4_SUPER_MAGIC.
> Just noticed?  *grin*
  ;-)

> > That is just fine for the disk format but as a result we
> > also return the same magic in statfs() syscall and thus a simple
> > application has a hard time recognizing whether it works on ext2, ext3 or
> > ext4 (it would have to parse /proc/mounts and that is non-trivial if not
> > impossible when it comes to bind mounts).
>
> I have a guess as to why they want to know, and ...
>
> > So should not we return different
> > magic numbers depending on how the filesystem is currently mounted?
> >   Now you may ask why should the application care - and I agree that in the
> > ideal world it should not. But for example there's a thread on GTK mailing
> > list [1] where they discuss the problem that with delayed allocation and
> > ext4, user can easily lose his data after crash
>
> ... sadly I was right.  :)
>
> > (Ted wrote about it here in
> > some other mail some time ago). So they would like to call fsync() after
> > the file is written but on ext3 that is quite heavy and because of autosave
> > saving happens quite often. So they'd do fsync() only if the filesystem
> > is mounted as ext4...
> >   So I'm writing here to hear some opinions on returning different magic
> > numbers from statfs().
> >
> > 								Honza
> >
> > [1] http://mail.gnome.org/archives/gtk-devel-list/2009-March/msg00082.html
>
> As an aside, Ted also pointed out that ext4-without-delalloc also hurts
> on fsync just like ext3 does, so testing "ext3 vs. ext4" isn't quite
> enough in general.
  Yes, I know but it's at least some approximation.

> I have been a bit dismayed that app writers just want the old ext3
> behavior (which still has a window for loss, doesn't it?) so that they
> can get away without fsyncing.  And talking to KDE folks and others, I
> think that if ext3 didn't hurt so much w/ fsync, they would just happily
> do the right posix-defined thing and add fsync() when needed.
>
> But instead, since they are now justifiably afraid of fsync, we are in
> this quandary.  (maybe this is over-simplifying a bit).
>
> But off the top of my head, I think that I would prefer to see
> applications generally do the right, posix-conformant thing w.r.t. data
> integrity (i.e. fsync()) unless, via statfs, they find out "fsync hurts,
> and we're likely to be reasonably safe without it"
>
> IOW, adding exceptions for ext3 sounds better to me than munging ext4,
> xfs, btrfs, and all future filesystems to conform to some behavior which
> isn't in any API or spec ...
  Yes, I agree that if they want data on disk, they should use fsync(). But
as you say for ext3 this is not really usable so they have to somehow
recognize that "they are on a filesystem where fsync() sucks" and avoid it
as much as possible. And I feel slightly in favor of giving them enough rope
(i.e., different magic numbers in statfs) to hang themselves ;-).

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: Same magic in statfs() call for ext?
From: Andreas Dilger @ 2009-03-30 18:23 UTC
To: Jan Kara; +Cc: Eric Sandeen, linux-ext4

On Mar 16, 2009  17:27 +0100, Jan Kara wrote:
> On Mon 16-03-09 11:13:13, Eric Sandeen wrote:
> > But off the top of my head, I think that I would prefer to see
> > applications generally do the right, posix-conformant thing w.r.t. data
> > integrity (i.e. fsync()) unless, via statfs, they find out "fsync hurts,
> > and we're likely to be reasonably safe without it"
> >
> > IOW, adding exceptions for ext3 sounds better to me than munging ext4,
> > xfs, btrfs, and all future filesystems to conform to some behavior which
> > isn't in any API or spec ...
>
>   Yes, I agree that if they want data on disk, they should use fsync(). But
> as you say for ext3 this is not really usable so they have to somehow
> recognize that "they are on a filesystem where fsync() sucks" and avoid it
> as much as possible. And I feel slightly in favor of giving them enough rope
> (i.e., different magic numbers in statfs) to hang themselves ;-).

One possibility that I've thought of in the past is to have "dynamic
data=journal" mode when fsync is being called and files are small.
What this means is that small file data will be written to the journal
on fsync instead of journaling only the metadata and flushing the data
to the filesystem in ordered mode.

While it means data is written twice to disk (once to journal, once
to fs), if there is a lot of fsync going on and the files are small
then it may still be faster than doing the seeks.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.