Re: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Theodore Ts'o <tytso@mit.edu>
To: Josef Bacik <jbacik@fusionio.com>
Cc: Younger Liu <younger.liu@huawei.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-ext4@vger.kernel.org,
	Ocfs2-Devel <ocfs2-devel@oss.oracle.com>,
	Li Zefan <lizefan@huawei.com>,
	jack@suse.cz
Subject: Re: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()
Date: Thu, 20 Jun 2013 14:12:15 -0400	[thread overview]
Message-ID: <20130620181215.GD4982@thunk.org> (raw)
In-Reply-To: <20130620172609.GC4288@localhost.localdomain>

On Thu, Jun 20, 2013 at 01:26:09PM -0400, Josef Bacik wrote:
> I realize it's been a little bit since I've looked at jbd but I'll offer my
> opinion.  Callers of jbd2_journal_restart() may not be the ones who originated
> the handle, so doing what Jan has done with jbd2_journal_start_reserved() isn't
> going to work because all the guy at the top is going to see is an error and
> have no way to tell if his handle is invalid or not.

Yeah, that's what I meant by "it would require changing all of the
callers".

> What I would suggest is getting a unified way to mark that the handle has
> already been cleaned up and can just be free'd.

The problem though is we need to make sure none of the callers don't
try to do anything else with handle besides calling
jbd2_journal_stop().  In particular, we can't allow a call to
jbd2_journal_get_write_access(), jbd2_journal_revoke() to operate on
the handle, because its transaction pointer is (potentially) invalid.

> Then fix jbd2_journal_start_reserved() and jbd2_journal_restart() to
> set that in the handle and make jbd2_journal_stop() just free up the
> handle and reset current->journal_info but not return an error.
> It's important to not return an error from jbd2_journal_stop() so
> that it doesn't invoke the ext4 error handling stuff and you get a
> read only file system when the error may not be read only file
> system worthy.

The handle->h_aborted bit, which is currently not used, does most of
the right thing, modulo the question of the fact that
jbd2_journal_stop() will return an error.  What's important from my
perspective is that the various callers that operate on a handle check
is_handle_aborted() before trying to use the it.  We'll still need to
audit the callers to make sure there isn't some uncommon-taken code
path where ext4_handle_dirty_metadata() gets called after
ext4_journal_restart() has returned an error.

As a FAST paper once opined, "EIO: Error Handling Is Occasionally correct".  :-)

> This way you have a nice clean way of dealing with handle errors that allow you
> to pass back a real error to the caller and the caller can just do its normal
> jbd2_journal_stop() and cleanup and do its own error handling the way it feels.
> This keeps the yucky details of no longer valid handles all internal to jbd2 and
> ext4/ocfs2 don't have to worry about it.  Thanks,

Yes, that could work, although we'll need to check to make sure all of
the code paths that invoke jbd2_journal_restart() handle errors
appropriately, and don't rely on jbd2_journal_stop() returning an
error.  Thanks for your thoughts!

Regards,

	       	   	      	 		     - Ted

WARNING: multiple messages have this Message-ID (diff)

From: Theodore Ts'o <tytso@mit.edu>
To: Josef Bacik <jbacik@fusionio.com>
Cc: Younger Liu <younger.liu@huawei.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-ext4@vger.kernel.org,
	Ocfs2-Devel <ocfs2-devel@oss.oracle.com>,
	Li Zefan <lizefan@huawei.com>,
	jack@suse.cz
Subject: [Ocfs2-devel] [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()
Date: Thu, 20 Jun 2013 14:12:15 -0400	[thread overview]
Message-ID: <20130620181215.GD4982@thunk.org> (raw)
In-Reply-To: <20130620172609.GC4288@localhost.localdomain>

On Thu, Jun 20, 2013 at 01:26:09PM -0400, Josef Bacik wrote:
> I realize it's been a little bit since I've looked at jbd but I'll offer my
> opinion.  Callers of jbd2_journal_restart() may not be the ones who originated
> the handle, so doing what Jan has done with jbd2_journal_start_reserved() isn't
> going to work because all the guy at the top is going to see is an error and
> have no way to tell if his handle is invalid or not.

Yeah, that's what I meant by "it would require changing all of the
callers".

> What I would suggest is getting a unified way to mark that the handle has
> already been cleaned up and can just be free'd.

The problem though is we need to make sure none of the callers don't
try to do anything else with handle besides calling
jbd2_journal_stop().  In particular, we can't allow a call to
jbd2_journal_get_write_access(), jbd2_journal_revoke() to operate on
the handle, because its transaction pointer is (potentially) invalid.

> Then fix jbd2_journal_start_reserved() and jbd2_journal_restart() to
> set that in the handle and make jbd2_journal_stop() just free up the
> handle and reset current->journal_info but not return an error.
> It's important to not return an error from jbd2_journal_stop() so
> that it doesn't invoke the ext4 error handling stuff and you get a
> read only file system when the error may not be read only file
> system worthy.

The handle->h_aborted bit, which is currently not used, does most of
the right thing, modulo the question of the fact that
jbd2_journal_stop() will return an error.  What's important from my
perspective is that the various callers that operate on a handle check
is_handle_aborted() before trying to use the it.  We'll still need to
audit the callers to make sure there isn't some uncommon-taken code
path where ext4_handle_dirty_metadata() gets called after
ext4_journal_restart() has returned an error.

As a FAST paper once opined, "EIO: Error Handling Is Occasionally correct".  :-)

> This way you have a nice clean way of dealing with handle errors that allow you
> to pass back a real error to the caller and the caller can just do its normal
> jbd2_journal_stop() and cleanup and do its own error handling the way it feels.
> This keeps the yucky details of no longer valid handles all internal to jbd2 and
> ext4/ocfs2 don't have to worry about it.  Thanks,

Yes, that could work, although we'll need to check to make sure all of
the code paths that invoke jbd2_journal_restart() handle errors
appropriately, and don't rely on jbd2_journal_stop() returning an
error.  Thanks for your thoughts!

Regards,

	       	   	      	 		     - Ted

next prev parent reply	other threads:[~2013-06-20 18:12 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-19  4:48 [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart() Younger Liu
2013-06-19  4:48 ` [Ocfs2-devel] " Younger Liu
2013-06-20 15:55 ` Theodore Ts'o
2013-06-20 15:55   ` Theodore Ts'o
2013-06-20 15:55   ` [Ocfs2-devel] " Theodore Ts'o
2013-06-20 16:08   ` [PATCH] jbd2: fix theoretical race in jbd2__journal_restart Theodore Ts'o
2013-06-20 16:08     ` [Ocfs2-devel] " Theodore Ts'o
2013-06-20 17:26   ` [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart() Josef Bacik
2013-06-20 17:26     ` [Ocfs2-devel] " Josef Bacik
2013-06-20 18:12     ` Theodore Ts'o [this message]
2013-06-20 18:12       ` Theodore Ts'o
2013-06-21 23:26       ` Jan Kara
2013-06-21 23:26         ` [Ocfs2-devel] " Jan Kara
2013-06-21 13:29   ` Younger Liu
2013-06-21 13:29     ` [Ocfs2-devel] " Younger Liu
2013-06-23 17:36     ` Theodore Ts'o
2013-06-23 17:36       ` [Ocfs2-devel] " Theodore Ts'o
2013-06-23 17:44       ` [PATCH] jbd2: invalidate handle if jbd2_journal_restart() fails Theodore Ts'o
2013-06-24  9:53         ` Jan Kara
2013-06-25  9:42         ` Younger Liu
2013-06-29 23:46           ` Theodore Ts'o
2013-07-03 12:22             ` Younger Liu
2013-07-03 12:36               ` Younger Liu
2013-06-25  8:30       ` [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart() Younger Liu
2013-06-25  8:30         ` [Ocfs2-devel] " Younger Liu
2013-06-29 13:22       ` Joel Becker
2013-06-29 13:22         ` [Ocfs2-devel] " Joel Becker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130620181215.GD4982@thunk.org \
    --to=tytso@mit.edu \
    --cc=akpm@linux-foundation.org \
    --cc=jack@suse.cz \
    --cc=jbacik@fusionio.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=ocfs2-devel@oss.oracle.com \
    --cc=younger.liu@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.