linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ian Kent <raven@themaw.net>
To: Josef Bacik <josef@redhat.com>
Cc: Yoshinori Sano <yoshinori.sano@gmail.com>,
	chris.mason@oracle.com, linux-btrfs@vger.kernel.org
Subject: Re: On Removing BUG_ON macros
Date: Mon, 08 Nov 2010 23:02:04 +0800	[thread overview]
Message-ID: <1289228524.3309.51.camel@localhost> (raw)
In-Reply-To: <20101108141539.GD2515@localhost.localdomain>

On Mon, 2010-11-08 at 09:15 -0500, Josef Bacik wrote:
> On Mon, Nov 08, 2010 at 10:06:13PM +0800, Ian Kent wrote:
> > On Mon, 2010-11-08 at 07:42 -0500, Josef Bacik wrote:
> > > On Mon, Nov 08, 2010 at 10:54:07AM +0800, Ian Kent wrote:
> > > > On Sun, 2010-11-07 at 09:51 -0500, Josef Bacik wrote:
> > > > > On Sun, Nov 07, 2010 at 04:16:47PM +0900, Yoshinori Sano wrote:
> > > > > > This is a question I've posted on the #btrfs IRC channel today.
> > > > > > hyperair adviced me to contact with Josef Bacik or Chris Mason.
> > > > > > So, I post my question to this maling list.
> > > > > > 
> > > > > > Here are my post on the IRC:
> > > > > > 
> > > > > > Actually, I want to remove BUG_ON(ret) around the Btrfs code.
> > > > > > The motivation is to make the Btrfs code more robust.
> > > > > > First of all, is this meaningless?
> > > > > > 
> > > > > > For example, there are code like the following:
> > > > > > 
> > > > > >     struct btrfs_path *path;
> > > > > >     path = btrfs_alloc_path();
> > > > > >     BUG_ON(!path);
> > > > > > 
> > > > > > This is a frequenty used pattern of current Btrfs code.
> > > > > > A btrfs_alloc_path()'s caller has to deal with the allocation failure
> > > > > > instead of using BUG_ON.  However, (this is what most interesting
> > > > > > thing for me) can the caller do any proper error handlings here?
> > > > > > I mean, is this a critical situation where we cannot recover from?
> > > > > >
> > > > > 
> > > > > No we're just lazy ;).  Tho making sure the caller can recover from getting
> > > > > -ENOMEM is very important, which is why in some of these paths we just do BUG_ON
> > > > > since fixing the callers is tricky.  A good strategy for things like this is to
> > > > > do something like
> > > > > 
> > > > > static int foo = 1;
> > > > > 
> > > > > path = btrfs_alloc_path();
> > > > > if (!path || !(foo % 1000))
> > > > > 	return -ENOMEM;
> > > > > foo++;
> > > > 
> > > > Hahaha, I love it.
> > > > 
> > > > So, return ENOMEM every 1000 times we call the containing function!
> > > > 
> > > > > 
> > > > > that way you can catch all the callers and make sure we're handling the error
> > > > > all the way up the chain properly.  Thanks,
> > > > 
> > > > Yeah, I suspect this approach will be a bit confusing though.
> > > > 
> > > > I believe that it will be more effective, although time consuming, to
> > > > work through the call tree function by function. Although, as I have
> > > > said, the problem is working out what needs to be done to recover,
> > > > rather than working out what the callers are. I'm not at all sure yet
> > > > but I also suspect that it may not be possible to recover in some cases,
> > > > which will likely lead to serious rework of some subsystems (but, hey,
> > > > who am I to say, I really don't have any clue yet).
> > > >
> > > 
> > > So we talked about this at plumbers.  First thing we need is a way to flip the
> > > filesystem read only, that way we can deal with the simple corruption cases.
> > 
> > Right, yes.
> > 
> > > And then we can start looking at these harder cases where it's really unclear
> > > about how to recover.
> > 
> > I have a way to go before I will even understand these cases.
> > 
> > > 
> > > Thankfully because we're COW we really shouldn't have any cases that we have to
> > > unwind anything, we just fail the operation and go on our happy merry way.  The
> > > only tricky thing is where we get ENOMEM when say inserting the metadata for
> > > data after writing out the data, since that will leave data just sitting around.
> > > Probably should look at what NFS does with dirty pages when the server hangs up.
> > 
> > OK, that's a though for me to focus on while I'm trying to work out
> > what's going on ... mmm.
> > 
> > Indeed, a large proportion of these are handling ENOMEM.
> > 
> > I somehow suspect your heavily focused on disk io itself when I'm still
> > back thinking about house keeping of operations, in the process of being
> > queued and those currently being processed, the later being the
> > difficult case. But I'll eventually get to worrying about io as part of
> > that process. It's also worth mentioning that my scope is also quite
> > narrow at this stage, focusing largely on the transaction subsystem,
> > although that tends to pull in a fair amount of other code too.
> > 
> 
> So the transaction stuff should be relatively simple since we shouldn't have too
> much to clean up if the transaction fails to allocate.  Maybe point out some
> places where you are having trouble and I can frame up what we'd want to do to
> give you an idea of where to go?  Thanks,

Thanks, I will when I have something to discuss or maybe I'll start with
the couple I have, when I get a chance to get back to it anyway.

Ian


  reply	other threads:[~2010-11-08 15:02 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-07  7:16 On Removing BUG_ON macros Yoshinori Sano
2010-11-07 14:51 ` Josef Bacik
2010-11-08  2:54   ` Ian Kent
2010-11-08 12:42     ` Josef Bacik
2010-11-08 14:06       ` Ian Kent
2010-11-08 14:15         ` Josef Bacik
2010-11-08 15:02           ` Ian Kent [this message]
2010-11-11  4:32             ` Ian Kent
2010-12-01 18:31               ` Josef Bacik
2010-11-09  6:13       ` Yoshinori Sano
2010-11-08 13:17   ` Yoshinori Sano
2010-11-08 13:28     ` Josef Bacik
2010-11-08 23:02       ` Yoshinori Sano
2010-11-08  2:36 ` Ian Kent

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1289228524.3309.51.camel@localhost \
    --to=raven@themaw.net \
    --cc=chris.mason@oracle.com \
    --cc=josef@redhat.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=yoshinori.sano@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).