From: Jan Kara <jack@suse.cz>
To: Eric Sandeen <sandeen@redhat.com>
Cc: ext4 development <linux-ext4@vger.kernel.org>,
Jan Kara <jack@suse.cz>, Dave Wysochanski <dwysocha@redhat.com>
Subject: Re: [PATCH RFC] jbd: don't wake kjournald unnecessarily
Date: Wed, 19 Dec 2012 02:27:10 +0100 [thread overview]
Message-ID: <20121219012710.GF5987@quack.suse.cz> (raw)
In-Reply-To: <50D0A1FD.7040203@redhat.com>
On Tue 18-12-12 11:03:57, Eric Sandeen wrote:
> Commit d9b0193 jbd: fix fsync() tid wraparound bug
> changed the logic for whether __log_start_commit() should wake up
> kjournald.
>
> After backporting this to RHEL6, I had a report of a performance regression
> on a large benchmark, and it was narrowed down to the change above.
Strange. I wonder what really happened that those additional wakeups had
influence on performance. They should be pretty cheap.
> I did a little investigation of jbd behavior while running xfstest
> 013, which just does a large fsstress run, and found that we were
> waking up kjournald more often than before; specifically,
> in the case where
>
> target == j_commit_request == journal->j_running_transaction
>
> It seems to me that the wakeup is not needed if we already have
> the right target on the commit request, so I tested with the
> additional condition added in the patch below; this brought
> performance back up to prior levels.
Correct.
> I also tested it with tid_t defined to a u8, to get frequent wraps.
> If I back out the wraparound patch, it will easily provoke
> the original ASSERT that prompted the prior commit. With
> the commit in place and the patch below, I survived running
> fsstress for 10 hours without problems even with a frequently-wrapping
> tid_t.
Thanks for throughout testing!
> A couple questions remain:
>
> With a u8 tid_t, the "else" clause from commit d9b0193 fires
> frequently; I really think the underlying problem is that tid_geq()
> etc does not properly handle wraparounds - if, say, target is 255
> and j_commit_request is 0, we don't know if j_commit_request
> is 255 tids behind, or 1 tid ahead. I have to think about that
> some more, unless it's obvious to someone else.
Well, there's no way to handle wraps better AFAICT. Tids eventually wrap
and if someone has stored away tid of a transaction he wants committed and
keeps it for a long time before using it, it can end up being anywhere
before / after current j_commit_request. The hope was that it takes long
enough to wrap around 32-bit tids. If this happens often in practice we may
have to switch to 64-bit tids (in memory, on disk 32-bit tids are enough
because of limited journal size).
> FWIW, some people have indeed seen that else clause fire upstream,
> both in the case where j_commit_request is > 2^31 and the
> target is 0.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=46031
> http://forums.debian.net/viewtopic.php?f=5&t=80741
This is actually curious. The fact that i_datasync_tid was 0 means that
either journal was not initialized during ext3_iget() or j_commit_sequence
was 0 during ext3_iget() - note that j_commit_sequence is initialized to
j_transaction_sequence in journal_reset()... Hum, but in a case when
ext3_load_journal() calls journal_wipe() and that finds j_tail != 0, we
call journal_skip_recovery(). That ends up setting j_transaction_sequence
to the last transaction in the log but j_commit_sequence is left at 0.
I see that explains how we could hit the warning. I think we should
initialize j_commit_sequence properly also when skipping recovery and that
will solve the problem.
BTW if we find j_tail == 0 in journal_wipe(), we skip setting
j_transaction_sequence to the last transaction in the journal. So
j_transaction_sequence ends up being 0 but j_tail_sequence is set by
load_superblock() to sb->s_sequence so there's a mismatch between reality
and what j_tail_sequence claims to be the oldest transaction in the log.
That reminds me of the report where bogus journal replay corrupted a
filesystem...
I'll fix both these issues.
> Anyway, I think this patch helps on the "don't send extra wakeups"
> side of things. Does anyone see a problem with it?
The patch is fine. I'll queue it up.
Honza
> =============
> [PATCH] jbd: don't wake kjournald unnecessarily
>
> Don't send an extra wakeup to kjournald in the case where we
> already have the proper target in j_commit_request, i.e. that
> commit has already been requested for commit.
>
> commit d9b0193 "jbd: fix fsync() tid wraparound bug" changed
> the logic leading to a wakeup, but it caused some extra wakeups
> which were found to lead to a measurable performance regression.
>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
>
> diff --git a/fs/jbd/journal.c b/fs/jbd/journal.c
> index a286233..81cc7ea 100644
> --- a/fs/jbd/journal.c
> +++ b/fs/jbd/journal.c
> @@ -446,7 +446,8 @@ int __log_start_commit(journal_t *journal, tid_t target)
> * currently running transaction (if it exists). Otherwise,
> * the target tid must be an old one.
> */
> - if (journal->j_running_transaction &&
> + if (journal->j_commit_request != target &&
> + journal->j_running_transaction &&
> journal->j_running_transaction->t_tid == target) {
> /*
> * We want a new commit: OK, mark the request and wakeup the
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
next prev parent reply other threads:[~2012-12-19 1:27 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-18 17:03 [PATCH RFC] jbd: don't wake kjournald unnecessarily Eric Sandeen
2012-12-19 1:27 ` Jan Kara [this message]
2012-12-19 2:05 ` Jan Kara
2012-12-19 3:08 ` Eric Sandeen
2012-12-19 8:13 ` Jan Kara
2012-12-19 15:37 ` Theodore Ts'o
2012-12-19 17:14 ` Jan Kara
2012-12-19 20:27 ` Theodore Ts'o
2012-12-19 21:19 ` Eric Sandeen
2012-12-21 17:01 ` Eric Sandeen
2012-12-21 17:46 ` Theodore Ts'o
2013-01-08 19:19 ` Eric Sandeen
2013-01-11 16:42 ` Eric Sandeen
2013-01-11 19:03 ` Jan Kara
2013-01-11 19:06 ` Eric Sandeen
2012-12-19 15:46 ` Eric Sandeen
2012-12-19 17:11 ` Jan Kara
2012-12-19 2:36 ` Eric Sandeen
2012-12-19 2:59 ` [PATCH] jbd2: " Eric Sandeen
2012-12-19 8:09 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121219012710.GF5987@quack.suse.cz \
--to=jack@suse.cz \
--cc=dwysocha@redhat.com \
--cc=linux-ext4@vger.kernel.org \
--cc=sandeen@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.