From: Alexey Lyahkov <alexey.lyashkov@gmail.com>
To: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: linux-ext4 <linux-ext4@vger.kernel.org>,
Theodore Ts'o <tytso@mit.edu>, Andreas Dilger <adilger@dilger.ca>,
Artem Blagodarenko <artem.blagodarenko@gmail.com>,
Andrew Perepechko <anserper@ya.ru>
Subject: Re: [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
Date: Thu, 8 Sep 2022 08:51:40 +0300 [thread overview]
Message-ID: <B32B956C-E851-42A2-9419-2947C442E2AA@gmail.com> (raw)
In-Reply-To: <20220908054611.vjcb27wmq4dggqmv@riteshh-domain>
Hi Ritesh,
This was hit on the Lustre OSS node when we have ton’s of short write with sync/(journal commit) in parallel.
Each write was done from own thread (like 1k-2k threads in parallel).
It caused a situation when only few/some threads make a wakeup and enter to the transaction until it will be T_LOCKED.
In our’s observation all handles from head was waked and it’s handles added recently, while old handles still in list and
It caused a soft lockup messages on console.
Alex
> On 8 Sep 2022, at 08:46, Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
>
> On 22/09/07 07:59PM, Alexey Lyashkov wrote:
>> From: Andrew Perepechko <anserper@ya.ru>
>>
>> LIFO wakeup order is unfair and sometimes leads to a journal
>> user not being able to get a journal handle for hundreds of
>> transactions in a row.
>>
>> FIFO wakeup can make things more fair.
>
> prepare_to_wait() will always add the task to the head of the list.
> While prepare_to_wait_exclusive() will add the task to the tail since all of the
> exclusive tasks are added to the tail.
> wake_up() function will wake up all non-exclusive and single exclusive task
> v/s
> wake_up_all() function will wake up all tasks irrespective.
>
> So your change does makes the ordering to FIFO, in which the task which came in
> first will be woken up first.
>
> Although I was wondering about 2 things -
> 1. In what scenario this was observed to become a problem/bottleneck for you?
> Could you kindly give more details of your problem?
>
> 2. What about start_this_handle() function where we call wait_event()
> for j_barrier_count to be 0? I guess that doesn't happen often.
>
> -ritesh
>
>
>>
>> Signed-off-by: Alexey Lyashkov <alexey.lyashkov@gmail.com>
>> ---
>> fs/jbd2/commit.c | 2 +-
>> fs/jbd2/transaction.c | 6 +++---
>> 2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
>> index b2b2bc9b88d9..ec2b55879e3a 100644
>> --- a/fs/jbd2/commit.c
>> +++ b/fs/jbd2/commit.c
>> @@ -570,7 +570,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
>> journal->j_running_transaction = NULL;
>> start_time = ktime_get();
>> commit_transaction->t_log_start = journal->j_head;
>> - wake_up(&journal->j_wait_transaction_locked);
>> + wake_up_all(&journal->j_wait_transaction_locked);
>> write_unlock(&journal->j_state_lock);
>>
>> jbd2_debug(3, "JBD2: commit phase 2a\n");
>> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
>> index e1be93ccd81c..6a404ac1c178 100644
>> --- a/fs/jbd2/transaction.c
>> +++ b/fs/jbd2/transaction.c
>> @@ -168,7 +168,7 @@ static void wait_transaction_locked(journal_t *journal)
>> int need_to_start;
>> tid_t tid = journal->j_running_transaction->t_tid;
>>
>> - prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
>> + prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
>> TASK_UNINTERRUPTIBLE);
>> need_to_start = !tid_geq(journal->j_commit_request, tid);
>> read_unlock(&journal->j_state_lock);
>> @@ -194,7 +194,7 @@ static void wait_transaction_switching(journal_t *journal)
>> read_unlock(&journal->j_state_lock);
>> return;
>> }
>> - prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
>> + prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
>> TASK_UNINTERRUPTIBLE);
>> read_unlock(&journal->j_state_lock);
>> /*
>> @@ -920,7 +920,7 @@ void jbd2_journal_unlock_updates (journal_t *journal)
>> write_lock(&journal->j_state_lock);
>> --journal->j_barrier_count;
>> write_unlock(&journal->j_state_lock);
>> - wake_up(&journal->j_wait_transaction_locked);
>> + wake_up_all(&journal->j_wait_transaction_locked);
>> }
>>
>> static void warn_dirty_buffer(struct buffer_head *bh)
>> --
>> 2.31.1
>>
next prev parent reply other threads:[~2022-09-08 5:51 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-07 16:59 [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO Alexey Lyashkov
2022-09-08 5:46 ` Ritesh Harjani (IBM)
2022-09-08 5:51 ` Alexey Lyahkov [this message]
2022-09-08 6:11 ` Ritesh Harjani (IBM)
2022-09-08 8:21 ` Alexey Lyahkov
2022-09-08 8:28 ` Andrew
2022-09-08 9:11 ` Ritesh Harjani (IBM)
2022-09-08 9:13 ` Ritesh Harjani (IBM)
-- strict thread matches above, loose matches on Subject: below --
2022-09-12 18:01 Alexey Lyashkov
2022-09-30 3:19 ` Theodore Ts'o
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=B32B956C-E851-42A2-9419-2947C442E2AA@gmail.com \
--to=alexey.lyashkov@gmail.com \
--cc=adilger@dilger.ca \
--cc=anserper@ya.ru \
--cc=artem.blagodarenko@gmail.com \
--cc=linux-ext4@vger.kernel.org \
--cc=ritesh.list@gmail.com \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox