From: Jan Kara <jack@suse.cz>
To: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Cc: tytso@mit.edu, adilger.kernel@dilger.ca,
Yongqiang Yang <xiaoqiangnk@gmail.com>,
linux-ext4@vger.kernel.org
Subject: Re: [PATCH 1/2] jbd2: jbd2_journal_stop needs an exclusive control to synchronize around t_update operations
Date: Tue, 3 Jan 2012 16:32:45 +0100 [thread overview]
Message-ID: <20120103153245.GE31457@quack.suse.cz> (raw)
In-Reply-To: <20111222205650.fe9d1b36.toshi.okajima@jp.fujitsu.com>
[-- Attachment #1: Type: text/plain, Size: 5234 bytes --]
Hello,
On Thu 22-12-11 20:56:50, Toshiyuki Okajima wrote:
> The following statements need an exclusive control for the critical code section
> around t_update operations:
> [jbd2_journal_stop()]
> 1445 /*
> 1446 * Once we drop t_updates, if it goes to zero the transaction
> 1447 * could start committing on us and eventually disappear. So
> 1448 * once we do this, we must not dereference transaction
> 1449 * pointer again.
> 1450 */
> 1451 tid = transaction->t_tid;
> + read_lock(&journal->j_state_lock);
> ----- critical code section ------------------------------------------------
> 1452 if (atomic_dec_and_test(&transaction->t_updates)) {
> 1453 wake_up(&journal->j_wait_updates);
> 1454 if (journal->j_barrier_count)
> 1455 wake_up(&journal->j_wait_transaction_locked);
> 1456 }
> -----------------------------------------------------------------------------
> + read_unlock(&journal->j_state_lock);
> 1457
>
> Because the functions which have the other critical code sections around t_update
> operations,
> - jbd2_journal_commit_transaction
> - start_this_handle
> - jbd2_journal_lock_updates
> can not synchronize with jbd2_journal_stop.
>
> ex) jbd2_journal_lock_updates
> 505 void jbd2_journal_lock_updates(journal_t *journal)
> 506 {
> 507 DEFINE_WAIT(wait);
> 508
> 509 write_lock(&journal->j_state_lock);
> 510 ++journal->j_barrier_count;
> 511
> 512 /* Wait until there are no running updates */
> 513 while (1) {
> 514 transaction_t *transaction = journal->j_running_transaction;
> 515
> 516 if (!transaction)
> 517 break;
> 518
> 519 spin_lock(&transaction->t_handle_lock);
> ----- critical code section ------------------------------------------------
> 520 if (!atomic_read(&transaction->t_updates)) {
> 521 spin_unlock(&transaction->t_handle_lock);
> 522 break;
> 523 }
> 524 prepare_to_wait(&journal->j_wait_updates, &wait,
> 525 TASK_UNINTERRUPTIBLE);
> -----------------------------------------------------------------------------
> 526 spin_unlock(&transaction->t_handle_lock);
> 527 write_unlock(&journal->j_state_lock);
> 528 schedule();
> 529 finish_wait(&journal->j_wait_updates, &wait);
> 530 write_lock(&journal->j_state_lock);
> 531 }
> 532 write_unlock(&journal->j_state_lock);
>
> Thefore, the following steps causes a hang-up of process1:
> 1) (process1) line 520 in jbd2_journal_lock_updates
> transaction->t_updates is equal to 1, and then goto 4).
> 2) (process2) line 1452 in jbd2_journal_stop
> transaction->t_updates becomes to 0, and then goto 3).
> 3) (process2) line 1453 in jbd2_journal_stop
> wake_up(&journal->j_wait_updates) tries to wake someone up.
> 4) (process1) line 524 in jbd2_journal_lock_updates
> prepare to sleep itself, and then goto 5).
> 5) (process1) line 528 in jbd2_journal_lock_updates
> sleep forever because process2 doesn't wake it up anymore.
Thanks for the analysis. Actually, you fix adds unnecessary overhead.
The problem really is the wrong ordering of prepare_to_wait() and t_updates
check. So attached patch should fix the issue as well without introducing
the overhead.
> Similar problem also exists for j_barrier_count operations but it can be
> fixed, too:
> [jbd2_journal_lock_updates]
> 505 void jbd2_journal_lock_updates(journal_t *journal)
> 506 {
> 507 DEFINE_WAIT(wait);
> 508
> 509 write_lock(&journal->j_state_lock);
> ----------------------------------------------------------------------------
> 510 ++journal->j_barrier_count;
> ----------------------------------------------------------------------------
> ...
> 532 write_unlock(&journal->j_state_lock);
>
> [jbd2_journal_stop]
> 1445 /*
> 1446 * Once we drop t_updates, if it goes to zero the transaction
> 1447 * could start committing on us and eventually disappear. So
> 1448 * once we do this, we must not dereference transaction
> 1449 * pointer again.
> 1450 */
> 1451 tid = transaction->t_tid;
> + read_lock(&journal->j_state_lock);
> 1452 if (atomic_dec_and_test(&transaction->t_updates)) {
> 1453 wake_up(&journal->j_wait_updates);
> ----------------------------------------------------------------------------
> 1454 if (journal->j_barrier_count)
> 1455 wake_up(&journal->j_wait_transaction_locked);
> ----------------------------------------------------------------------------
> 1456 }
> + read_unlock(&journal->j_state_lock);
> 1457
Here I don't agree. We use wait_event() to wait for j_barrier_count to
drop to zero and wait_event() has proper ordering of prepare_to_wait() and
test.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
[-- Attachment #2: 0001-jbd2-Fix-hung-processes-in-jbd2_journal_lock_updates.patch --]
[-- Type: text/x-patch, Size: 2346 bytes --]
>From 1cd5b8178893f3f186ce93eb1f47664a1a3e81fc Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Tue, 3 Jan 2012 16:13:29 +0100
Subject: [PATCH] jbd2: Fix hung processes in jbd2_journal_lock_updates()
Toshiyuki Okajima found out that when running
for ((i=0; i < 100000; i++)); do
if ((i%2 == 0)); then
chattr +j /mnt/file
else
chattr -j /mnt/file
fi
echo "0" >> /mnt/file
done
process sometimes hangs indefinitely in jbd2_journal_lock_updates().
Toshiyuki identified that the following race happens:
jbd2_journal_lock_updates() |jbd2_journal_stop()
---------------------------------------+---------------------------------------
write_lock(&journal->j_state_lock) | .
++journal->j_barrier_count | .
spin_lock(&tran->t_handle_lock) | .
atomic_read(&tran->t_updates) //not 0 |
| atomic_dec_and_test(&tran->t_updates)
| // t_updates = 0
| wake_up(&journal->j_wait_updates)
prepare_to_wait() | // no process is woken up.
spin_unlock(&tran->t_handle_lock) |
write_unlock(&journal->j_state_lock) |
schedule() // never return |
We fix the problem by first calling prepare_to_wait() and only after that
checking t_updates in jbd2_journal_lock_updates().
Reported-and-analyzed-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/jbd2/transaction.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index a0e41a4..35ae096 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -517,12 +517,13 @@ void jbd2_journal_lock_updates(journal_t *journal)
break;
spin_lock(&transaction->t_handle_lock);
+ prepare_to_wait(&journal->j_wait_updates, &wait,
+ TASK_UNINTERRUPTIBLE);
if (!atomic_read(&transaction->t_updates)) {
spin_unlock(&transaction->t_handle_lock);
+ finish_wait(&journal->j_wait_updates, &wait);
break;
}
- prepare_to_wait(&journal->j_wait_updates, &wait,
- TASK_UNINTERRUPTIBLE);
spin_unlock(&transaction->t_handle_lock);
write_unlock(&journal->j_state_lock);
schedule();
--
1.7.1
next prev parent reply other threads:[~2012-01-04 17:00 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-16 11:19 [PATCH][RFC] jbd2: jbd2_journal_stop needs an exclusive control to synchronize around t_updates operations Toshiyuki Okajima
2011-12-16 12:22 ` Yongqiang Yang
2011-12-20 10:44 ` Toshiyuki Okajima
2011-12-22 11:36 ` [PATCH 0/2 take2][RFC] hangup in jbd2_journal_lock_updates Toshiyuki Okajima
2011-12-22 11:56 ` [PATCH 1/2] jbd2: jbd2_journal_stop needs an exclusive control to synchronize around t_update operations Toshiyuki Okajima
2012-01-03 15:32 ` Jan Kara [this message]
2012-01-05 3:12 ` Ted Ts'o
2011-12-22 12:00 ` [PATCH 2/2] jbd2: remove all t_handle_lock statements Toshiyuki Okajima
2011-12-22 14:07 ` Yongqiang Yang
2011-12-26 0:20 ` Toshiyuki Okajima
2011-12-26 1:17 ` [PATCH 2/2 take2] jbd2: delete spin_lock(t_handle_lock) inside wirte_lock(j_state_lock) Toshiyuki Okajima
2011-12-31 6:35 ` Yongqiang Yang
2012-01-03 15:35 ` Jan Kara
2012-01-05 3:27 ` Ted Ts'o
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120103153245.GE31457@quack.suse.cz \
--to=jack@suse.cz \
--cc=adilger.kernel@dilger.ca \
--cc=linux-ext4@vger.kernel.org \
--cc=toshi.okajima@jp.fujitsu.com \
--cc=tytso@mit.edu \
--cc=xiaoqiangnk@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.