linux-btrfs.vger.kernel.org archive mirror
From: Jeff Mahoney <jeffm@suse.de>
To: Chris Mason <chris.mason@oracle.com>,
	Jeff Mahoney <jeffm@suse.com>, Mark Fasheh <mfasheh@suse.de>,
	Btrfs Development List <linux-btrfs@vger.kernel.org>
Subject: Re: Error handling: How to "lose" a transaction
Date: Wed, 21 Dec 2011 21:59:24 -0500
Message-ID: <4EF29D0C.90709@suse.de>
In-Reply-To: <20111214001319.GX31158@shiny>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/13/2011 07:13 PM, Chris Mason wrote:
> On Tue, Dec 13, 2011 at 04:47:30PM -0500, Jeff Mahoney wrote:
>> Hi Chris -
>> 
>> I'm starting to dig into the fun part of error handling and 
>> btrfs_commit_transaction is a minefield right now.
>> 
>> I've been thinking about how I would go about recovering from a 
>> serious error like an -EIO while writing out or an -ENOMEM in a
>> deep part of the code that it's prohibitively expensive to
>> recover from. Mostly I'm looking for the best way to make calling
>> btrfs_std_error() be functionally equivalent to killing the power
>> on the disk. We already block off new writers, but that's
>> obviously nowhere near enough. We could have an open transaction
>> floating around, uncommitted transactions queued, and then an
>> unrecoverable error hits, forcing us to shut it all down.
>> 
>> It seems to me that a recovery method similar to the one I
>> wrote for reiserfs can be used here as well. Am I understanding
>> correctly that if I go through the motions of committing the
>> transaction *except* for updating the tree roots, or maybe even
>> doing that but declining to write the superblocks out, that the
>> transaction essentially doesn't exist on disk? Including the
>> allocations? The in-memory representation will not match what's
>> on disk, but that's what happens with every file system in
>> RO-failure mode. With CoW even for data, data is essentially 
>> frozen in time as well. (I suppose with nodatacow that's not
>> true, but that's for another day.)
> 
> Hi Jeff,
> 
> Thanks for taking another pass at this.
> 
> It should be possible to just skip the step where we update the
> roots in the super and you'll keep a fully consistent FS on disk.
> The only rule would be that you're not allowed to take a block that
> we've freed in the aborted transaction and reuse it.

Perfect.
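
To illustrate the rule quoted above, here is a toy Python model. It is
not btrfs code: the ToyFS class and its begin/cow_write/commit/abort
methods are hypothetical names invented for this sketch. It assumes the
superblock's root pointer is the only commit point, so an aborted
transaction that never updates it leaves the previous tree intact on
disk, provided blocks freed in that transaction are never handed out
again.

```python
class ToyFS:
    """Toy copy-on-write store; the super's root pointer is the commit point."""

    def __init__(self, nblocks):
        self.disk = {}                    # block number -> contents
        self.free = set(range(nblocks))   # blocks available for allocation
        self.super_root = None            # root recorded in the "superblock"
        self.aborted = False

    def begin(self):
        if self.aborted:
            raise OSError("read-only after an aborted transaction")
        return {"root": self.super_root, "allocated": set(), "freed": set()}

    def cow_write(self, trans, contents):
        # Never overwrite in place: allocate a new block for the new root.
        new = min(self.free)
        self.free.remove(new)
        self.disk[new] = contents
        trans["allocated"].add(new)
        old = trans["root"]
        if old is not None:
            # The old root is only *logically* freed until commit.
            trans["freed"].add(old)
        trans["root"] = new

    def commit(self, trans):
        # Updating the root in the super makes the transaction visible.
        self.super_root = trans["root"]
        # Only now is it safe to reuse blocks the transaction freed.
        self.free |= trans["freed"]

    def abort(self, trans):
        # Skip the super update: on disk the transaction never happened.
        # Blocks freed in the aborted transaction stay unavailable, since
        # the still-current tree on disk references them.
        self.aborted = True


fs = ToyFS(8)
t = fs.begin()
fs.cow_write(t, "v1")
fs.commit(t)
committed_root = fs.super_root

t = fs.begin()
fs.cow_write(t, "v2")
fs.abort(t)
```

After the abort, fs.super_root still points at the block holding "v1",
and that block is not in fs.free, which is the in-memory equivalent of
not reusing a block freed by the aborted transaction.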

Sorry I haven't responded to this yet. I started digging right in and
I've started to have some good results. It turns out there's already a
btrfs_cleanup_transaction call that will tear down outstanding
transactions. It's not perfect and I've fixed a few bugs in there, but
it saved me a bunch of effort. I just wish I'd noticed it a day
earlier, since I had already half implemented it myself. :)

This afternoon I started running xfstests on a dm-linear mapped
partition. Halfway through a sufficiently long test, I swap the
linear mapping for an error mapping. It still crashes, but somewhat
less spectacularly. There are still a ton of BUG_ONs to eliminate,
and I need to work out the usual I/O error-recovery issue of
uninterruptible, unrecoverable writeback contexts and still-locked
pages holding up exit. I'm pleased with the results so far and
pretty optimistic.
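
The message does not show the exact commands used for the mapping
swap. The Python sketch below is one plausible way to prepare it,
under stated assumptions: the device name "errtest", the backing
partition "/dev/sdb1", and the sector count are all hypothetical, and
the helper functions are invented for illustration. It only builds
device-mapper table strings and dmsetup argument lists; nothing is
executed.

```python
def linear_table(sectors, backing_dev, offset=0):
    """Table line mapping the whole device 1:1 onto backing_dev."""
    return f"0 {sectors} linear {backing_dev} {offset}"


def error_table(sectors):
    """Table line that fails every I/O to the device."""
    return f"0 {sectors} error"


def swap_commands(name, table):
    """dmsetup invocations that replace the live table of an existing device."""
    return [
        # --nolockfs avoids freezing the mounted filesystem on suspend
        # (an assumption about what the test wants, not from the message).
        ["dmsetup", "suspend", "--nolockfs", name],
        ["dmsetup", "load", name, "--table", table],
        ["dmsetup", "resume", name],
    ]


# Hypothetical values: a 1 GiB partition is 2097152 sectors of 512 bytes.
SECTORS = 2097152
create_cmd = ["dmsetup", "create", "errtest",
              "--table", linear_table(SECTORS, "/dev/sdb1")]
fail_cmds = swap_commands("errtest", error_table(SECTORS))
```

Each list could be passed to subprocess.run() as root. The filesystem
under test would be created and mounted on /dev/mapper/errtest before
the swap, so that every read and write issued afterwards fails with an
I/O error.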

- -Jeff


- -- 
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJO8p0MAAoJEB57S2MheeWyF4YP/A6uUzuP4zui+iwenSP44trw
FkONuTidZRjSgA4tXNrdsnIF2txtiewzp0HvWWudw5rnMNQzznyO0WynKHSPG3ep
xFZnfpvaYoCaMQt70IxAQFDsZpowbPAI8194mbJqKAql4f2RNzlg/3fR4k+Fz6Ye
Gu824uEbtyHghy96C37e/E30Zizu6+S7xrx8jwmnKbq44docoIV3Pw9LZGOU99Db
1IFipExd0Z/ZhTTiK4gZ787nPhM9QNfxw/9+h1g4gUfJqlcmRrcwGJmOj5iOBGBt
Man51ZCI8hYBpubTgvTQalut+uLq9lCoBZQGTbKHLNLd21qM+Ji4KCAQzMBUtqGn
pzSfs3Gdwa1WjYszINAS6gqA+0ubh1F/WxGwJKW85JnAYy8OjTJHru7GlYzt3C9Y
gouU7xgrneVn+lZFwV9X0gwX8yLQx5Lh9YEF6AJLXJuXHg4zGZyhpFjVkmTlle93
dFUblB92q9lxdw5V8f1Uw+EDIlACZZRo7MFDSypjdTTryRFiAjhCtBdBpnu54Mrb
fH2kdhPCBm4YqAQLlo43aOPAbkOYElAr0rgPvqaLzimZLAW0kd/nGU/if3mhMMa2
7ad7tKTQyktyGKuEkMPnSCU8SqFNGA750aeFG22uJJjbdCytyzkJmeqYQD5oykqm
vDpKh0g20Fcqb98q+qbt
=jjDk
-----END PGP SIGNATURE-----

Thread overview: 8+ messages
2011-12-13 21:47 Error handling: How to "lose" a transaction Jeff Mahoney
2011-12-14  0:13 ` Chris Mason
2011-12-22  2:59   ` Jeff Mahoney [this message]
2011-12-22  3:21     ` Liu Bo
2011-12-22  3:38       ` Jeff Mahoney
2011-12-23  5:12         ` Jeff Mahoney
2011-12-23  5:43           ` Liu Bo
2011-12-23 14:17           ` Chris Mason