From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeff Mahoney
Subject: Re: Error handling: How to "lose" a transaction
Date: Wed, 21 Dec 2011 21:59:24 -0500
Message-ID: <4EF29D0C.90709@suse.de>
References: <4EE7C7F2.2080905@suse.com> <20111214001319.GX31158@shiny>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
To: Chris Mason, Jeff Mahoney, Mark Fasheh, Btrfs Development List
Return-path:
In-Reply-To: <20111214001319.GX31158@shiny>
List-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/13/2011 07:13 PM, Chris Mason wrote:
> On Tue, Dec 13, 2011 at 04:47:30PM -0500, Jeff Mahoney wrote:
>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>
>>
>> Hi Chris -
>>
>> I'm starting to dig into the fun part of error handling and
>> btrfs_commit_transaction is a minefield right now.
>>
>> I've been thinking about how I would go about recovering from a
>> serious error like an -EIO while writing out or an -ENOMEM in a
>> deep part of the code that it's prohibitively expensive to
>> recover from. Mostly I'm looking for the best way to make calling
>> btrfs_std_error() be functionally equivalent to killing the power
>> on the disk. We already block off new writers, but that's
>> obviously nowhere near enough. We could have an open transaction
>> floating around, uncommitted transactions queued, and then an
>> unrecoverable error hits, forcing us to shut it all down.
>>
>> It seems to me that a similar method of recovery that I
>> wrote for reiserfs can be used here as well. Am I understanding
>> correctly that if I go through the motions of committing the
>> transaction *except* for updating the tree roots, or maybe even
>> doing that but declining to write the superblocks out, that the
>> transaction essentially doesn't exist on disk? Including the
>> allocations? The in-memory representation will not match what's
>> on disk, but that's what happens with every file system in
>> RO-failure mode.
>> With CoW even for data, data is essentially
>> frozen in time as well. (I suppose with nodatacow that's not
>> true, but that's for another day.)
>
> Hi Jeff,
>
> Thanks for taking another pass at this.
>
> It should be possible to just skip the step where we update the
> roots in the super and you'll keep a fully consistent FS on disk.
> The only rule would be that you're not allowed to take a block that
> we've freed in the aborted transaction and reuse it.

Perfect.

Sorry I haven't responded to this yet. I started digging right in and
I'm starting to get some good results. It turns out there's already a
btrfs_cleanup_transaction call that will tear down outstanding
transactions. It's not perfect and I've fixed a few bugs in there, but
it saved me a bunch of effort. I just wish I had noticed it a day
earlier, since I had it half implemented myself. :)

This afternoon I started running xfstests on a dm-linear mapped
partition. Halfway through a sufficiently long test, I swap out the
linear mapping for an error mapping. It still crashes, but somewhat
less spectacularly. There are still a ton of BUG_ONs I need to
eliminate, and I still need to work out the usual I/O error-recovery
issue of uninterruptible, unrecoverable writeback contexts and
still-locked pages holding up exit.

I'm pretty pleased with the results so far and am pretty optimistic.

- -Jeff

- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJO8p0MAAoJEB57S2MheeWyF4YP/A6uUzuP4zui+iwenSP44trw
FkONuTidZRjSgA4tXNrdsnIF2txtiewzp0HvWWudw5rnMNQzznyO0WynKHSPG3ep
xFZnfpvaYoCaMQt70IxAQFDsZpowbPAI8194mbJqKAql4f2RNzlg/3fR4k+Fz6Ye
Gu824uEbtyHghy96C37e/E30Zizu6+S7xrx8jwmnKbq44docoIV3Pw9LZGOU99Db
1IFipExd0Z/ZhTTiK4gZ787nPhM9QNfxw/9+h1g4gUfJqlcmRrcwGJmOj5iOBGBt
Man51ZCI8hYBpubTgvTQalut+uLq9lCoBZQGTbKHLNLd21qM+Ji4KCAQzMBUtqGn
pzSfs3Gdwa1WjYszINAS6gqA+0ubh1F/WxGwJKW85JnAYy8OjTJHru7GlYzt3C9Y
gouU7xgrneVn+lZFwV9X0gwX8yLQx5Lh9YEF6AJLXJuXHg4zGZyhpFjVkmTlle93
dFUblB92q9lxdw5V8f1Uw+EDIlACZZRo7MFDSypjdTTryRFiAjhCtBdBpnu54Mrb
fH2kdhPCBm4YqAQLlo43aOPAbkOYElAr0rgPvqaLzimZLAW0kd/nGU/if3mhMMa2
7ad7tKTQyktyGKuEkMPnSCU8SqFNGA750aeFG22uJJjbdCytyzkJmeqYQD5oykqm
vDpKh0g20Fcqb98q+qbt
=jjDk
-----END PGP SIGNATURE-----