All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Cc: tytso@mit.edu, adilger.kernel@dilger.ca,
	Yongqiang Yang <xiaoqiangnk@gmail.com>,
	linux-ext4@vger.kernel.org
Subject: Re: [PATCH 1/2] jbd2: jbd2_journal_stop needs an exclusive control to synchronize around t_update operations
Date: Tue, 3 Jan 2012 16:32:45 +0100	[thread overview]
Message-ID: <20120103153245.GE31457@quack.suse.cz> (raw)
In-Reply-To: <20111222205650.fe9d1b36.toshi.okajima@jp.fujitsu.com>

[-- Attachment #1: Type: text/plain, Size: 5234 bytes --]

  Hello,

On Thu 22-12-11 20:56:50, Toshiyuki Okajima wrote:
> The following statements need an exclusive control for the critical code section
> around t_update operations:
> [jbd2_journal_stop()]
> 1445         /*
> 1446          * Once we drop t_updates, if it goes to zero the transaction
> 1447          * could start committing on us and eventually disappear.  So
> 1448          * once we do this, we must not dereference transaction
> 1449          * pointer again.
> 1450          */
> 1451         tid = transaction->t_tid;
> +    read_lock(&journal->j_state_lock);
> ----- critical code section ------------------------------------------------
> 1452         if (atomic_dec_and_test(&transaction->t_updates)) {
> 1453                 wake_up(&journal->j_wait_updates);
> 1454                 if (journal->j_barrier_count)
> 1455                         wake_up(&journal->j_wait_transaction_locked);
> 1456         }
> -----------------------------------------------------------------------------
> +    read_unlock(&journal->j_state_lock);
> 1457 
> 
> Because the functions which have the other critical code sections around t_update 
> operations, 
>  - jbd2_journal_commit_transaction
>  - start_this_handle
>  - jbd2_journal_lock_updates
> can not synchronize with jbd2_journal_stop.
> 
> ex) jbd2_journal_lock_updates
>  505 void jbd2_journal_lock_updates(journal_t *journal)
>  506 {
>  507         DEFINE_WAIT(wait);
>  508 
>  509         write_lock(&journal->j_state_lock);
>  510         ++journal->j_barrier_count;
>  511 
>  512         /* Wait until there are no running updates */
>  513         while (1) {
>  514                transaction_t *transaction = journal->j_running_transaction;
>  515 
>  516                 if (!transaction)
>  517                         break;
>  518 
>  519                 spin_lock(&transaction->t_handle_lock);
> ----- critical code section ------------------------------------------------
>  520                 if (!atomic_read(&transaction->t_updates)) {
>  521                         spin_unlock(&transaction->t_handle_lock);
>  522                         break;
>  523                 }
>  524                 prepare_to_wait(&journal->j_wait_updates, &wait,
>  525                                 TASK_UNINTERRUPTIBLE);
> -----------------------------------------------------------------------------
>  526                 spin_unlock(&transaction->t_handle_lock);
>  527                 write_unlock(&journal->j_state_lock);
>  528                 schedule();
>  529                 finish_wait(&journal->j_wait_updates, &wait);
>  530                 write_lock(&journal->j_state_lock);
>  531         }
>  532         write_unlock(&journal->j_state_lock);
> 
> Thefore, the following steps causes a hang-up of process1:
> 1) (process1) line 520 in jbd2_journal_lock_updates
>       transaction->t_updates is equal to 1, and then goto 4).
> 2) (process2) line 1452 in jbd2_journal_stop
>       transaction->t_updates becomes to 0, and then goto 3).
> 3) (process2) line 1453 in jbd2_journal_stop
>       wake_up(&journal->j_wait_updates) tries to wake someone up.
> 4) (process1) line 524 in jbd2_journal_lock_updates
>       prepare to sleep itself, and then goto 5).
> 5) (process1) line 528 in jbd2_journal_lock_updates
>       sleep forever because process2 doesn't wake it up anymore.
  Thanks for the analysis. Actually, you fix adds unnecessary overhead.
The problem really is the wrong ordering of prepare_to_wait() and t_updates
check. So attached patch should fix the issue as well without introducing
the overhead.

> Similar problem also exists for j_barrier_count operations but it can be
> fixed, too:
> [jbd2_journal_lock_updates]
>  505 void jbd2_journal_lock_updates(journal_t *journal)
>  506 {
>  507         DEFINE_WAIT(wait);
>  508 
>  509         write_lock(&journal->j_state_lock);
> ----------------------------------------------------------------------------
>  510         ++journal->j_barrier_count;
> ----------------------------------------------------------------------------
> ...
>  532         write_unlock(&journal->j_state_lock);
> 
> [jbd2_journal_stop]
> 1445         /*
> 1446          * Once we drop t_updates, if it goes to zero the transaction
> 1447          * could start committing on us and eventually disappear.  So
> 1448          * once we do this, we must not dereference transaction
> 1449          * pointer again.
> 1450          */
> 1451         tid = transaction->t_tid;
> +    read_lock(&journal->j_state_lock);
> 1452         if (atomic_dec_and_test(&transaction->t_updates)) {
> 1453                 wake_up(&journal->j_wait_updates);
> ----------------------------------------------------------------------------
> 1454                 if (journal->j_barrier_count)
> 1455                         wake_up(&journal->j_wait_transaction_locked);
> ----------------------------------------------------------------------------
> 1456         }
> +    read_unlock(&journal->j_state_lock);
> 1457 
  Here I don't agree. We use wait_event() to wait for j_barrier_count to
drop to zero and wait_event() has proper ordering of prepare_to_wait() and
test.

								Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR

[-- Attachment #2: 0001-jbd2-Fix-hung-processes-in-jbd2_journal_lock_updates.patch --]
[-- Type: text/x-patch, Size: 2346 bytes --]

>From 1cd5b8178893f3f186ce93eb1f47664a1a3e81fc Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Tue, 3 Jan 2012 16:13:29 +0100
Subject: [PATCH] jbd2: Fix hung processes in jbd2_journal_lock_updates()

Toshiyuki Okajima found out that when running

for ((i=0; i < 100000; i++)); do
        if ((i%2 == 0)); then
                chattr +j /mnt/file
        else
                chattr -j /mnt/file
        fi
        echo "0" >> /mnt/file
done

process sometimes hangs indefinitely in jbd2_journal_lock_updates().

Toshiyuki identified that the following race happens:

jbd2_journal_lock_updates()            |jbd2_journal_stop()
---------------------------------------+---------------------------------------
 write_lock(&journal->j_state_lock)    |    .
 ++journal->j_barrier_count            |    .
 spin_lock(&tran->t_handle_lock)       |    .
 atomic_read(&tran->t_updates) //not 0 |
                                       | atomic_dec_and_test(&tran->t_updates)
                                       |    // t_updates = 0
                                       | wake_up(&journal->j_wait_updates)
 prepare_to_wait()                     |    // no process is woken up.
 spin_unlock(&tran->t_handle_lock)     |
 write_unlock(&journal->j_state_lock)  |
 schedule() // never return            |

We fix the problem by first calling prepare_to_wait() and only after that
checking t_updates in jbd2_journal_lock_updates().

Reported-and-analyzed-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/jbd2/transaction.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index a0e41a4..35ae096 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -517,12 +517,13 @@ void jbd2_journal_lock_updates(journal_t *journal)
 			break;
 
 		spin_lock(&transaction->t_handle_lock);
+		prepare_to_wait(&journal->j_wait_updates, &wait,
+				TASK_UNINTERRUPTIBLE);
 		if (!atomic_read(&transaction->t_updates)) {
 			spin_unlock(&transaction->t_handle_lock);
+			finish_wait(&journal->j_wait_updates, &wait);
 			break;
 		}
-		prepare_to_wait(&journal->j_wait_updates, &wait,
-				TASK_UNINTERRUPTIBLE);
 		spin_unlock(&transaction->t_handle_lock);
 		write_unlock(&journal->j_state_lock);
 		schedule();
-- 
1.7.1


  reply	other threads:[~2012-01-04 17:00 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-16 11:19 [PATCH][RFC] jbd2: jbd2_journal_stop needs an exclusive control to synchronize around t_updates operations Toshiyuki Okajima
2011-12-16 12:22 ` Yongqiang Yang
2011-12-20 10:44   ` Toshiyuki Okajima
2011-12-22 11:36     ` [PATCH 0/2 take2][RFC] hangup in jbd2_journal_lock_updates Toshiyuki Okajima
2011-12-22 11:56       ` [PATCH 1/2] jbd2: jbd2_journal_stop needs an exclusive control to synchronize around t_update operations Toshiyuki Okajima
2012-01-03 15:32         ` Jan Kara [this message]
2012-01-05  3:12           ` Ted Ts'o
2011-12-22 12:00       ` [PATCH 2/2] jbd2: remove all t_handle_lock statements Toshiyuki Okajima
2011-12-22 14:07         ` Yongqiang Yang
2011-12-26  0:20           ` Toshiyuki Okajima
2011-12-26  1:17             ` [PATCH 2/2 take2] jbd2: delete spin_lock(t_handle_lock) inside wirte_lock(j_state_lock) Toshiyuki Okajima
2011-12-31  6:35               ` Yongqiang Yang
2012-01-03 15:35               ` Jan Kara
2012-01-05  3:27               ` Ted Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120103153245.GE31457@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=adilger.kernel@dilger.ca \
    --cc=linux-ext4@vger.kernel.org \
    --cc=toshi.okajima@jp.fujitsu.com \
    --cc=tytso@mit.edu \
    --cc=xiaoqiangnk@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.