* [PATCH] ext3: skip orphan cleanup on rocompat fs @ 2011-02-26 20:40 Amir Goldstein 2011-02-28 10:14 ` Rogier Wolff 2011-02-28 18:09 ` Jan Kara 0 siblings, 2 replies; 11+ messages in thread From: Amir Goldstein @ 2011-02-26 20:40 UTC (permalink / raw) To: Jan Kara; +Cc: Theodore Tso, Ext4 Developers List Orphan cleanup is currently executed even on readonly mount. It deletes inodes and frees blocks, which could be very bad for some RO_COMPAT features, HAS_SNAPSHOT to name one. Now the damage is done, because current stock kernels may corrupt future fs with readonly compatible features, when doing orphan cleanup. This patch skips the orphan cleanup if readonly compatible features would prevent the fs from being mounted (or remounted) readwrite. Signed-off-by: Amir Goldstein <amir73il@users.sf.net> --- fs/ext3/super.c | 7 +++++++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/fs/ext3/super.c b/fs/ext3/super.c index 85c8cc8..6cdd575 100644 --- a/fs/ext3/super.c +++ b/fs/ext3/super.c @@ -1464,6 +1464,13 @@ static void ext3_orphan_cleanup (struct super_block * sb, return; } + /* Check if feature set allows readwrite operations */ + if (EXT3_HAS_RO_COMPAT_FEATURE(sb, ~EXT3_FEATURE_RO_COMPAT_SUPP)) { + ext3_msg(sb, KERN_INFO, "Skipping orphan cleanup on readonly-" + "compatible fs"); + return; + } + if (EXT3_SB(sb)->s_mount_state & EXT3_ERROR_FS) { if (es->s_last_orphan) jbd_debug(1, "Errors on filesystem, " -- 1.7.0.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] ext3: skip orphan cleanup on rocompat fs 2011-02-26 20:40 [PATCH] ext3: skip orphan cleanup on rocompat fs Amir Goldstein @ 2011-02-28 10:14 ` Rogier Wolff 2011-02-28 13:10 ` Amir Goldstein ` (2 more replies) 2011-02-28 18:09 ` Jan Kara 1 sibling, 3 replies; 11+ messages in thread From: Rogier Wolff @ 2011-02-28 10:14 UTC (permalink / raw) To: Amir Goldstein; +Cc: Jan Kara, Theodore Tso, Ext4 Developers List On Sat, Feb 26, 2011 at 10:40:19PM +0200, Amir Goldstein wrote: > This patch skips the orphan cleanup if readonly compatible features > would prevent the fs from being mounted (or remounted) readwrite. I use the "mount readonly" option to, for instance, view/check the filesystem to determine wether or not I need to fsck first. I use the "readonly" feature to prevent the mounting to be a mistake-prone situation. It prevents e.g. applications from dropping temporary files in my current directory. Every time fsck or such a cleanup does something, there is the option of the cleanup or fixup being wrong. When you honour the "readonly" request from the user, the careful user can go back to the situation where he/she started. If the cleanup/fixup is really neccesary, do so in in-core buffers of the filesystem. Write the infrastructure that allows us to have dirty buffers that MAY NOT (yet?!?) be written to the device. This will also solve the problem of journal recovery on readonly mount of a root filesystem. when it has been fscked, and it's remounted rw, we can remove the ban on the writeback of the dirty buffers. So I stronly disagree with your patch: It should not only prevent the cleanup when writing is not allows due to ro-compat situation, but also when requested by the user. Roger. ------------------ Back in the old days I was still using minix. Linux didn't exist or wasn't usable enough. I had something that needed removing, so I typed rm -rf *, thinking I was in the directory that needed removing. I wasn't! There went my (modified) kernel tree! It took me some three seconds to find, verify and execute the solution: Powerswitch. In such an incident, cleaning up inodes that you think were deleted anyway removes information about files that may need recovery. Imagine that accidentally my big database file was unlinked. Ooops. But the database server is still keeping the file open. Phew. We can continue to run until we find a solution.... So the inode is now orphaned. But we can recover it with some filesystem magic. Maybe not by answering yes to fsck questions, but it is recoverable without dataloss, right? Then the power goes out... Ooops. Instead of two more days to get everything ready for the recovery we have to do it NOW. There goes. Boot from CD, let's just mount the partition readonly to get access to our tools and binaries that may facilitate the recovery of our database. BAM! Away goes the nicely allocated inode. (ext2 used to just mark the inode as not in use, ext3 cleans it up so that we lose the pointers to the 7 datablocks, the indirect block and double indirect block!) Now we will have to guess where the file started etc etc. The principle is: Do as I say. That keeps things predictable. If you try to outguess the user, it will be horribly wrong every once in a while. You're right. In 99% of the cases, the system just crashed/rebooted while some temporary files were still open, but already deleted. And in 99% of the cases, the system will boot, perform an automatic fsck and eventually remount rw. So writing those orphaned inodes back early doesn't make a real difference. -- ** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 ** ** Delftechpark 26 2628 XH Delft, The Netherlands. KVK: 27239233 ** *-- BitWizard writes Linux device drivers for any device you may have! --* Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. Does it sit on the couch all day? Is it unemployed? Please be specific! Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] ext3: skip orphan cleanup on rocompat fs 2011-02-28 10:14 ` Rogier Wolff @ 2011-02-28 13:10 ` Amir Goldstein 2011-02-28 18:22 ` Jan Kara 2011-02-28 18:22 ` Ted Ts'o 2 siblings, 0 replies; 11+ messages in thread From: Amir Goldstein @ 2011-02-28 13:10 UTC (permalink / raw) To: Rogier Wolff; +Cc: Jan Kara, Theodore Tso, Ext4 Developers List On Mon, Feb 28, 2011 at 12:14 PM, Rogier Wolff <R.E.Wolff@bitwizard.nl> wrote: > > On Sat, Feb 26, 2011 at 10:40:19PM +0200, Amir Goldstein wrote: >> This patch skips the orphan cleanup if readonly compatible features >> would prevent the fs from being mounted (or remounted) readwrite. > > I use the "mount readonly" option to, for instance, view/check the > filesystem to determine wether or not I need to fsck first. I use the > "readonly" feature to prevent the mounting to be a mistake-prone > situation. It prevents e.g. applications from dropping temporary files > in my current directory. > > Every time fsck or such a cleanup does something, there is the option > of the cleanup or fixup being wrong. When you honour the "readonly" > request from the user, the careful user can go back to the situation > where he/she started. > > If the cleanup/fixup is really neccesary, do so in in-core buffers of > the filesystem. Write the infrastructure that allows us to have dirty > buffers that MAY NOT (yet?!?) be written to the device. This will also > solve the problem of journal recovery on readonly mount of a root > filesystem. when it has been fscked, and it's remounted rw, we can > remove the ban on the writeback of the dirty buffers. > > So I stronly disagree with your patch: It should not only prevent the > cleanup when writing is not allows due to ro-compat situation, but > also when requested by the user. > I am not disagreeing with your disagreement, but implementing what you desire has more implications than my patch, so I rather that my patch gets merged, because it is mostly-harmless and we can continue the discussion, whether or not file systems need to honor the readonly mount option to the word in parallel. Amir. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] ext3: skip orphan cleanup on rocompat fs 2011-02-28 10:14 ` Rogier Wolff 2011-02-28 13:10 ` Amir Goldstein @ 2011-02-28 18:22 ` Jan Kara 2011-02-28 18:49 ` Ted Ts'o 2011-02-28 19:05 ` Eric Sandeen 2011-02-28 18:22 ` Ted Ts'o 2 siblings, 2 replies; 11+ messages in thread From: Jan Kara @ 2011-02-28 18:22 UTC (permalink / raw) To: Rogier Wolff; +Cc: Amir Goldstein, Jan Kara, Theodore Tso, Ext4 Developers List On Mon 28-02-11 11:14:55, Rogier Wolff wrote: > > On Sat, Feb 26, 2011 at 10:40:19PM +0200, Amir Goldstein wrote: > > This patch skips the orphan cleanup if readonly compatible features > > would prevent the fs from being mounted (or remounted) readwrite. > > I use the "mount readonly" option to, for instance, view/check the > filesystem to determine wether or not I need to fsck first. I use the > "readonly" feature to prevent the mounting to be a mistake-prone > situation. It prevents e.g. applications from dropping temporary files > in my current directory. > > Every time fsck or such a cleanup does something, there is the option > of the cleanup or fixup being wrong. When you honour the "readonly" > request from the user, the careful user can go back to the situation > where he/she started. > > If the cleanup/fixup is really neccesary, do so in in-core buffers of Mounting (even read-only) without recovering the journal will give you a view of a corrupted filesystem. Usually not what you want (although I agree with you that there are occasions where this *is* what you want). > the filesystem. Write the infrastructure that allows us to have dirty > buffers that MAY NOT (yet?!?) be written to the device. This will also > solve the problem of journal recovery on readonly mount of a root > filesystem. when it has been fscked, and it's remounted rw, we can > remove the ban on the writeback of the dirty buffers. Yes, this would be a nice feature but noone ever got to implementing it. You are welcome to contribute it :). > So I stronly disagree with your patch: It should not only prevent the > cleanup when writing is not allows due to ro-compat situation, but > also when requested by the user. As Amir said, the patch is trivial and a clear improvement. So it goes in. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] ext3: skip orphan cleanup on rocompat fs 2011-02-28 18:22 ` Jan Kara @ 2011-02-28 18:49 ` Ted Ts'o 2011-02-28 19:32 ` Jan Kara 2011-02-28 19:05 ` Eric Sandeen 1 sibling, 1 reply; 11+ messages in thread From: Ted Ts'o @ 2011-02-28 18:49 UTC (permalink / raw) To: Jan Kara; +Cc: Rogier Wolff, Amir Goldstein, Ext4 Developers List On Mon, Feb 28, 2011 at 07:22:01PM +0100, Jan Kara wrote: > As Amir said, the patch is trivial and a clear improvement. So it goes > in. You might want to look at the improvements I made to the ext4 version of the patch, in terms of a slightly better error message and commit description, and make similar changes to the ext3 version of the patch. See: http://article.gmane.org/gmane.comp.file-systems.ext4/23620 - Ted ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] ext3: skip orphan cleanup on rocompat fs 2011-02-28 18:49 ` Ted Ts'o @ 2011-02-28 19:32 ` Jan Kara 0 siblings, 0 replies; 11+ messages in thread From: Jan Kara @ 2011-02-28 19:32 UTC (permalink / raw) To: Ted Ts'o; +Cc: Jan Kara, Rogier Wolff, Amir Goldstein, Ext4 Developers List On Mon 28-02-11 13:49:29, Ted Ts'o wrote: > On Mon, Feb 28, 2011 at 07:22:01PM +0100, Jan Kara wrote: > > As Amir said, the patch is trivial and a clear improvement. So it goes > > in. > > You might want to look at the improvements I made to the ext4 version > of the patch, in terms of a slightly better error message and commit > description, and make similar changes to the ext3 version of the > patch. > > See: http://article.gmane.org/gmane.comp.file-systems.ext4/23620 Yes, I took your version and ported it to ext3 in the end :). Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] ext3: skip orphan cleanup on rocompat fs 2011-02-28 18:22 ` Jan Kara 2011-02-28 18:49 ` Ted Ts'o @ 2011-02-28 19:05 ` Eric Sandeen 1 sibling, 0 replies; 11+ messages in thread From: Eric Sandeen @ 2011-02-28 19:05 UTC (permalink / raw) To: Jan Kara; +Cc: Rogier Wolff, Amir Goldstein, Theodore Tso, Ext4 Developers List On 2/28/11 12:22 PM, Jan Kara wrote: > On Mon 28-02-11 11:14:55, Rogier Wolff wrote: >> >> On Sat, Feb 26, 2011 at 10:40:19PM +0200, Amir Goldstein wrote: >>> This patch skips the orphan cleanup if readonly compatible features >>> would prevent the fs from being mounted (or remounted) readwrite. >> >> I use the "mount readonly" option to, for instance, view/check the >> filesystem to determine wether or not I need to fsck first. I use the >> "readonly" feature to prevent the mounting to be a mistake-prone >> situation. It prevents e.g. applications from dropping temporary files >> in my current directory. >> >> Every time fsck or such a cleanup does something, there is the option >> of the cleanup or fixup being wrong. When you honour the "readonly" >> request from the user, the careful user can go back to the situation >> where he/she started. >> >> If the cleanup/fixup is really neccesary, do so in in-core buffers of > Mounting (even read-only) without recovering the journal will give you a > view of a corrupted filesystem. Usually not what you want (although I agree > with you that there are occasions where this *is* what you want). Also, you can tell ext3/4 -not- to recover the journal, or, you can mark the device RO before mount. So while not the default behavior, it is at least possible. -Eric ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] ext3: skip orphan cleanup on rocompat fs 2011-02-28 10:14 ` Rogier Wolff 2011-02-28 13:10 ` Amir Goldstein 2011-02-28 18:22 ` Jan Kara @ 2011-02-28 18:22 ` Ted Ts'o 2 siblings, 0 replies; 11+ messages in thread From: Ted Ts'o @ 2011-02-28 18:22 UTC (permalink / raw) To: Rogier Wolff; +Cc: Amir Goldstein, Jan Kara, Ext4 Developers List On Mon, Feb 28, 2011 at 11:14:55AM +0100, Rogier Wolff wrote: > If the cleanup/fixup is really neccesary, do so in in-core buffers of > the filesystem. Write the infrastructure that allows us to have dirty > buffers that MAY NOT (yet?!?) be written to the device. This will also > solve the problem of journal recovery on readonly mount of a root > filesystem. when it has been fscked, and it's remounted rw, we can > remove the ban on the writeback of the dirty buffers. That would be an interesting and useful thing to add, but it's not a trivial change. There is the risk that if the journal is very large, and system memory is very, that there might not be enough memory to hold all of the dirty buffers (or it might handicap the machine). That shouldn't be an issue on reasonably configured machines, but as we know, not all file servers are reasonably configured (see previous discussions about tdb :-) If someone want to work on this as a project, that would be great, but to be honest it's not high on my priority list at the moment. - Ted ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] ext3: skip orphan cleanup on rocompat fs 2011-02-26 20:40 [PATCH] ext3: skip orphan cleanup on rocompat fs Amir Goldstein 2011-02-28 10:14 ` Rogier Wolff @ 2011-02-28 18:09 ` Jan Kara 2011-03-24 10:34 ` Amir Goldstein 1 sibling, 1 reply; 11+ messages in thread From: Jan Kara @ 2011-02-28 18:09 UTC (permalink / raw) To: Amir Goldstein; +Cc: Jan Kara, Theodore Tso, Ext4 Developers List On Sat 26-02-11 22:40:19, Amir Goldstein wrote: > Orphan cleanup is currently executed even on readonly mount. > It deletes inodes and frees blocks, which could be very bad for some > RO_COMPAT features, HAS_SNAPSHOT to name one. > > Now the damage is done, because current stock kernels may > corrupt future fs with readonly compatible features, > when doing orphan cleanup. > > This patch skips the orphan cleanup if readonly compatible features > would prevent the fs from being mounted (or remounted) readwrite. Thanks Amir. I actually took the version Ted committed to ext4 and ported it for ext3. Anyway, the patch is in my tree. Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] ext3: skip orphan cleanup on rocompat fs 2011-02-28 18:09 ` Jan Kara @ 2011-03-24 10:34 ` Amir Goldstein 2011-03-24 16:07 ` [stable] " Greg KH 0 siblings, 1 reply; 11+ messages in thread From: Amir Goldstein @ 2011-03-24 10:34 UTC (permalink / raw) To: stable; +Cc: Theodore Tso, Ext4 Developers List, Jan Kara Hi, Please apply commit ce654b37 to any stable kernel out there. The relevant ext4 patch is in Ted's tree and has not reached mainline yet. Ted, can you please add CC: stable to my patch. Thanks, Amir. On Mon, Feb 28, 2011 at 8:09 PM, Jan Kara <jack@suse.cz> wrote: > On Sat 26-02-11 22:40:19, Amir Goldstein wrote: >> Orphan cleanup is currently executed even on readonly mount. >> It deletes inodes and frees blocks, which could be very bad for some >> RO_COMPAT features, HAS_SNAPSHOT to name one. >> >> Now the damage is done, because current stock kernels may >> corrupt future fs with readonly compatible features, >> when doing orphan cleanup. >> >> This patch skips the orphan cleanup if readonly compatible features >> would prevent the fs from being mounted (or remounted) readwrite. > Thanks Amir. I actually took the version Ted committed to ext4 and ported > it for ext3. Anyway, the patch is in my tree. > > Honza > > -- > Jan Kara <jack@suse.cz> > SUSE Labs, CR > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [stable] [PATCH] ext3: skip orphan cleanup on rocompat fs 2011-03-24 10:34 ` Amir Goldstein @ 2011-03-24 16:07 ` Greg KH 0 siblings, 0 replies; 11+ messages in thread From: Greg KH @ 2011-03-24 16:07 UTC (permalink / raw) To: Amir Goldstein; +Cc: stable, Jan Kara, Ext4 Developers List, Theodore Tso On Thu, Mar 24, 2011 at 12:34:05PM +0200, Amir Goldstein wrote: > Hi, > > Please apply commit ce654b37 to any stable kernel out there. Now applied, thanks. greg k-h ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2011-03-24 16:07 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-02-26 20:40 [PATCH] ext3: skip orphan cleanup on rocompat fs Amir Goldstein 2011-02-28 10:14 ` Rogier Wolff 2011-02-28 13:10 ` Amir Goldstein 2011-02-28 18:22 ` Jan Kara 2011-02-28 18:49 ` Ted Ts'o 2011-02-28 19:32 ` Jan Kara 2011-02-28 19:05 ` Eric Sandeen 2011-02-28 18:22 ` Ted Ts'o 2011-02-28 18:09 ` Jan Kara 2011-03-24 10:34 ` Amir Goldstein 2011-03-24 16:07 ` [stable] " Greg KH
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).