* [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
@ 2022-09-07 16:59 Alexey Lyashkov
2022-09-08 5:46 ` Ritesh Harjani (IBM)
0 siblings, 1 reply; 10+ messages in thread
From: Alexey Lyashkov @ 2022-09-07 16:59 UTC (permalink / raw)
To: linux-ext4, Theodore Ts'o, Andreas Dilger, Artem Blagodarenko
Cc: Andrew Perepechko, Alexey Lyashkov
From: Andrew Perepechko <anserper@ya.ru>
LIFO wakeup order is unfair and sometimes leads to a journal
user not being able to get a journal handle for hundreds of
transactions in a row.
FIFO wakeup can make things more fair.
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@gmail.com>
---
fs/jbd2/commit.c | 2 +-
fs/jbd2/transaction.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index b2b2bc9b88d9..ec2b55879e3a 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -570,7 +570,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
journal->j_running_transaction = NULL;
start_time = ktime_get();
commit_transaction->t_log_start = journal->j_head;
- wake_up(&journal->j_wait_transaction_locked);
+ wake_up_all(&journal->j_wait_transaction_locked);
write_unlock(&journal->j_state_lock);
jbd2_debug(3, "JBD2: commit phase 2a\n");
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index e1be93ccd81c..6a404ac1c178 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -168,7 +168,7 @@ static void wait_transaction_locked(journal_t *journal)
int need_to_start;
tid_t tid = journal->j_running_transaction->t_tid;
- prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
+ prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
TASK_UNINTERRUPTIBLE);
need_to_start = !tid_geq(journal->j_commit_request, tid);
read_unlock(&journal->j_state_lock);
@@ -194,7 +194,7 @@ static void wait_transaction_switching(journal_t *journal)
read_unlock(&journal->j_state_lock);
return;
}
- prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
+ prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
TASK_UNINTERRUPTIBLE);
read_unlock(&journal->j_state_lock);
/*
@@ -920,7 +920,7 @@ void jbd2_journal_unlock_updates (journal_t *journal)
write_lock(&journal->j_state_lock);
--journal->j_barrier_count;
write_unlock(&journal->j_state_lock);
- wake_up(&journal->j_wait_transaction_locked);
+ wake_up_all(&journal->j_wait_transaction_locked);
}
static void warn_dirty_buffer(struct buffer_head *bh)
--
2.31.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
2022-09-07 16:59 [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO Alexey Lyashkov
@ 2022-09-08 5:46 ` Ritesh Harjani (IBM)
2022-09-08 5:51 ` Alexey Lyahkov
2022-09-08 9:13 ` Ritesh Harjani (IBM)
0 siblings, 2 replies; 10+ messages in thread
From: Ritesh Harjani (IBM) @ 2022-09-08 5:46 UTC (permalink / raw)
To: Alexey Lyashkov
Cc: linux-ext4, Theodore Ts'o, Andreas Dilger, Artem Blagodarenko,
Andrew Perepechko
On 22/09/07 07:59PM, Alexey Lyashkov wrote:
> From: Andrew Perepechko <anserper@ya.ru>
>
> LIFO wakeup order is unfair and sometimes leads to a journal
> user not being able to get a journal handle for hundreds of
> transactions in a row.
>
> FIFO wakeup can make things more fair.
prepare_to_wait() always adds the task to the head of the wait list, while
prepare_to_wait_exclusive() adds it to the tail, since all exclusive waiters
are queued at the tail.
wake_up() wakes every non-exclusive waiter but at most one exclusive waiter,
whereas wake_up_all() wakes all waiters regardless of the exclusive flag.
So your change does make the ordering FIFO, in which the task which came in
first will be woken up first.
Although I was wondering about 2 things -
1. In what scenario was this observed to become a problem/bottleneck for you?
Could you kindly give more details of your problem?
2. What about start_this_handle() function where we call wait_event()
for j_barrier_count to be 0? I guess that doesn't happen often.
-ritesh
>
> Signed-off-by: Alexey Lyashkov <alexey.lyashkov@gmail.com>
> ---
> fs/jbd2/commit.c | 2 +-
> fs/jbd2/transaction.c | 6 +++---
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index b2b2bc9b88d9..ec2b55879e3a 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -570,7 +570,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
> journal->j_running_transaction = NULL;
> start_time = ktime_get();
> commit_transaction->t_log_start = journal->j_head;
> - wake_up(&journal->j_wait_transaction_locked);
> + wake_up_all(&journal->j_wait_transaction_locked);
> write_unlock(&journal->j_state_lock);
>
> jbd2_debug(3, "JBD2: commit phase 2a\n");
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index e1be93ccd81c..6a404ac1c178 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -168,7 +168,7 @@ static void wait_transaction_locked(journal_t *journal)
> int need_to_start;
> tid_t tid = journal->j_running_transaction->t_tid;
>
> - prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
> + prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
> TASK_UNINTERRUPTIBLE);
> need_to_start = !tid_geq(journal->j_commit_request, tid);
> read_unlock(&journal->j_state_lock);
> @@ -194,7 +194,7 @@ static void wait_transaction_switching(journal_t *journal)
> read_unlock(&journal->j_state_lock);
> return;
> }
> - prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
> + prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
> TASK_UNINTERRUPTIBLE);
> read_unlock(&journal->j_state_lock);
> /*
> @@ -920,7 +920,7 @@ void jbd2_journal_unlock_updates (journal_t *journal)
> write_lock(&journal->j_state_lock);
> --journal->j_barrier_count;
> write_unlock(&journal->j_state_lock);
> - wake_up(&journal->j_wait_transaction_locked);
> + wake_up_all(&journal->j_wait_transaction_locked);
> }
>
> static void warn_dirty_buffer(struct buffer_head *bh)
> --
> 2.31.1
>
* Re: [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
2022-09-08 5:46 ` Ritesh Harjani (IBM)
@ 2022-09-08 5:51 ` Alexey Lyahkov
2022-09-08 6:11 ` Ritesh Harjani (IBM)
2022-09-08 9:13 ` Ritesh Harjani (IBM)
1 sibling, 1 reply; 10+ messages in thread
From: Alexey Lyahkov @ 2022-09-08 5:51 UTC (permalink / raw)
To: Ritesh Harjani (IBM)
Cc: linux-ext4, Theodore Ts'o, Andreas Dilger, Artem Blagodarenko,
Andrew Perepechko
Hi Ritesh,
This was hit on a Lustre OSS node when we had tons of short writes with sync/(journal commit) in parallel.
Each write was done from its own thread (around 1k-2k threads in parallel).
It caused a situation where only a few of the threads were woken and able to enter the transaction before it became T_LOCKED.
In our observation, the handles at the head of the list, i.e. the most recently added ones, were the ones woken, while the old handles stayed in the list.
This caused soft lockup messages on the console.
Alex
> On 8 Sep 2022, at 08:46, Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
>
> On 22/09/07 07:59PM, Alexey Lyashkov wrote:
>> From: Andrew Perepechko <anserper@ya.ru>
>>
>> LIFO wakeup order is unfair and sometimes leads to a journal
>> user not being able to get a journal handle for hundreds of
>> transactions in a row.
>>
>> FIFO wakeup can make things more fair.
>
> prepare_to_wait() always adds the task to the head of the wait list, while
> prepare_to_wait_exclusive() adds it to the tail, since all exclusive waiters
> are queued at the tail.
> wake_up() wakes every non-exclusive waiter but at most one exclusive waiter,
> whereas wake_up_all() wakes all waiters regardless of the exclusive flag.
>
> So your change does make the ordering FIFO, in which the task which came in
> first will be woken up first.
>
> Although I was wondering about 2 things -
> 1. In what scenario was this observed to become a problem/bottleneck for you?
> Could you kindly give more details of your problem?
>
> 2. What about start_this_handle() function where we call wait_event()
> for j_barrier_count to be 0? I guess that doesn't happen often.
>
> -ritesh
>
>
>>
>> Signed-off-by: Alexey Lyashkov <alexey.lyashkov@gmail.com>
>> ---
>> fs/jbd2/commit.c | 2 +-
>> fs/jbd2/transaction.c | 6 +++---
>> 2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
>> index b2b2bc9b88d9..ec2b55879e3a 100644
>> --- a/fs/jbd2/commit.c
>> +++ b/fs/jbd2/commit.c
>> @@ -570,7 +570,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
>> journal->j_running_transaction = NULL;
>> start_time = ktime_get();
>> commit_transaction->t_log_start = journal->j_head;
>> - wake_up(&journal->j_wait_transaction_locked);
>> + wake_up_all(&journal->j_wait_transaction_locked);
>> write_unlock(&journal->j_state_lock);
>>
>> jbd2_debug(3, "JBD2: commit phase 2a\n");
>> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
>> index e1be93ccd81c..6a404ac1c178 100644
>> --- a/fs/jbd2/transaction.c
>> +++ b/fs/jbd2/transaction.c
>> @@ -168,7 +168,7 @@ static void wait_transaction_locked(journal_t *journal)
>> int need_to_start;
>> tid_t tid = journal->j_running_transaction->t_tid;
>>
>> - prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
>> + prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
>> TASK_UNINTERRUPTIBLE);
>> need_to_start = !tid_geq(journal->j_commit_request, tid);
>> read_unlock(&journal->j_state_lock);
>> @@ -194,7 +194,7 @@ static void wait_transaction_switching(journal_t *journal)
>> read_unlock(&journal->j_state_lock);
>> return;
>> }
>> - prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
>> + prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
>> TASK_UNINTERRUPTIBLE);
>> read_unlock(&journal->j_state_lock);
>> /*
>> @@ -920,7 +920,7 @@ void jbd2_journal_unlock_updates (journal_t *journal)
>> write_lock(&journal->j_state_lock);
>> --journal->j_barrier_count;
>> write_unlock(&journal->j_state_lock);
>> - wake_up(&journal->j_wait_transaction_locked);
>> + wake_up_all(&journal->j_wait_transaction_locked);
>> }
>>
>> static void warn_dirty_buffer(struct buffer_head *bh)
>> --
>> 2.31.1
>>
* Re: [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
2022-09-08 5:51 ` Alexey Lyahkov
@ 2022-09-08 6:11 ` Ritesh Harjani (IBM)
2022-09-08 8:21 ` Alexey Lyahkov
0 siblings, 1 reply; 10+ messages in thread
From: Ritesh Harjani (IBM) @ 2022-09-08 6:11 UTC (permalink / raw)
To: Alexey Lyahkov
Cc: linux-ext4, Theodore Ts'o, Andreas Dilger, Artem Blagodarenko,
Andrew Perepechko
On 22/09/08 08:51AM, Alexey Lyahkov wrote:
> Hi Ritesh,
>
> This was hit on a Lustre OSS node when we had tons of short writes with sync/(journal commit) in parallel.
> Each write was done from its own thread (around 1k-2k threads in parallel).
> It caused a situation where only a few of the threads were woken and able to enter the transaction before it became T_LOCKED.
> In our observation, the handles at the head of the list, i.e. the most recently added ones, were the ones woken, while the old handles stayed in the list.
Thanks Alexey for providing the details.
> This caused soft lockup messages on the console.
Did you mean a hung task timeout? I was wondering why there would be a soft
lockup warning, because these old handles are in a waiting state anyway, right?
Am I missing something?
-ritesh
* Re: [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
2022-09-08 6:11 ` Ritesh Harjani (IBM)
@ 2022-09-08 8:21 ` Alexey Lyahkov
2022-09-08 8:28 ` Andrew
2022-09-08 9:11 ` Ritesh Harjani (IBM)
0 siblings, 2 replies; 10+ messages in thread
From: Alexey Lyahkov @ 2022-09-08 8:21 UTC (permalink / raw)
To: Ritesh Harjani (IBM)
Cc: linux-ext4, Theodore Ts'o, Andreas Dilger, Artem Blagodarenko,
Andrew Perepechko
> On 8 Sep 2022, at 09:11, Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
>
> On 22/09/08 08:51AM, Alexey Lyahkov wrote:
>> Hi Ritesh,
>>
>> This was hit on a Lustre OSS node when we had tons of short writes with sync/(journal commit) in parallel.
>> Each write was done from its own thread (around 1k-2k threads in parallel).
>> It caused a situation where only a few of the threads were woken and able to enter the transaction before it became T_LOCKED.
>> In our observation, the handles at the head of the list, i.e. the most recently added ones, were the ones woken, while the old handles stayed in the list.
>
> Thanks Alexey for providing the details.
>
>> This caused soft lockup messages on the console.
>
> Did you mean a hung task timeout? I was wondering why there would be a soft
> lockup warning, because these old handles are in a waiting state anyway, right?
> Am I missing something?
>
Oh. I asked colleagues for details. It was Lustre's internal hung-thread detector, not a kernel one:
[ 2221.036503] Lustre: ll_ost_io04_080: service thread pid 55122 was inactive for 80.284 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[ 2221.036677] Pid: 55212, comm: ll_ost_io05_074 4.18.0-305.10.2.x6.1.010.19.x86_64 #1 SMP Thu Jun 30 13:42:51 MDT 2022
[ 2221.056673] Lustre: Skipped 2 previous similar messages
[ 2221.067821] Call Trace TBD:
[ 2221.067855] [<0>] wait_transaction_locked+0x89/0xc0 [jbd2]
[ 2221.099175] [<0>] add_transaction_credits+0xd4/0x290 [jbd2]
[ 2221.105266] [<0>] start_this_handle+0x10a/0x520 [jbd2]
[ 2221.110904] [<0>] jbd2__journal_start+0xea/0x1f0 [jbd2]
[ 2221.116679] [<0>] __ldiskfs_journal_start_sb+0x6e/0x130 [ldiskfs]
[ 2221.123316] [<0>] osd_trans_start+0x13b/0x4f0 [osd_ldiskfs]
[ 2221.129417] [<0>] ofd_commitrw_write+0x620/0x1830 [ofd]
[ 2221.135147] [<0>] ofd_commitrw+0x731/0xd80 [ofd]
[ 2221.140420] [<0>] obd_commitrw+0x1ac/0x370 [ptlrpc]
[ 2221.145858] [<0>] tgt_brw_write+0x1913/0x1d50 [ptlrpc]
[ 2221.151561] [<0>] tgt_request_handle+0xc93/0x1a40 [ptlrpc]
[ 2221.157622] [<0>] ptlrpc_server_handle_request+0x323/0xbd0 [ptlrpc]
[ 2221.164454] [<0>] ptlrpc_main+0xc06/0x1560 [ptlrpc]
[ 2221.169860] [<0>] kthread+0x116/0x130
[ 2221.174033] [<0>] ret_from_fork+0x1f/0x40
Other logs have shown this thread couldn't take a handle, while other threads were able to do so many times.
The kernel detector didn't trigger because the thread woke up many times, but each time it saw T_LOCKED and went back to sleep.
Alex
> -ritesh
* Re: [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
2022-09-08 8:21 ` Alexey Lyahkov
@ 2022-09-08 8:28 ` Andrew
2022-09-08 9:11 ` Ritesh Harjani (IBM)
1 sibling, 0 replies; 10+ messages in thread
From: Andrew @ 2022-09-08 8:28 UTC (permalink / raw)
To: Alexey Lyahkov, Ritesh Harjani (IBM)
Cc: linux-ext4, Theodore Ts'o, Andreas Dilger, Artem Blagodarenko
> Other logs have shown this thread couldn't take a handle, while other threads were able to do so many times.
> The kernel detector didn't trigger because the thread woke up many times, but each time it saw T_LOCKED and went back to sleep.
Also, the transactions are small because the workload is sync writes.
* Re: [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
2022-09-08 8:21 ` Alexey Lyahkov
2022-09-08 8:28 ` Andrew
@ 2022-09-08 9:11 ` Ritesh Harjani (IBM)
1 sibling, 0 replies; 10+ messages in thread
From: Ritesh Harjani (IBM) @ 2022-09-08 9:11 UTC (permalink / raw)
To: Alexey Lyahkov
Cc: linux-ext4, Theodore Ts'o, Andreas Dilger, Artem Blagodarenko,
Andrew Perepechko
On 22/09/08 11:21AM, Alexey Lyahkov wrote:
>
>
> > On 8 Sep 2022, at 09:11, Ritesh Harjani (IBM) <ritesh.list@gmail.com> wrote:
> >
> > On 22/09/08 08:51AM, Alexey Lyahkov wrote:
> >> Hi Ritesh,
> >>
> >> This was hit on a Lustre OSS node when we had tons of short writes with sync/(journal commit) in parallel.
> >> Each write was done from its own thread (around 1k-2k threads in parallel).
> >> It caused a situation where only a few of the threads were woken and able to enter the transaction before it became T_LOCKED.
> >> In our observation, the handles at the head of the list, i.e. the most recently added ones, were the ones woken, while the old handles stayed in the list.
> >
> > Thanks Alexey for providing the details.
> >
> >> This caused soft lockup messages on the console.
> >
> > Did you mean a hung task timeout? I was wondering why there would be a soft
> > lockup warning, because these old handles are in a waiting state anyway, right?
> > Am I missing something?
> >
> Oh. I asked colleagues for details. It was Lustre's internal hung-thread detector, not a kernel one:
Thanks again for sharing the details. It indeed looks like a task can remain
waiting for a handle for a long time due to the unfair wakeup order when there
are many threads.
-ritesh
>
> [ 2221.036503] Lustre: ll_ost_io04_080: service thread pid 55122 was inactive for 80.284 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
> [ 2221.036677] Pid: 55212, comm: ll_ost_io05_074 4.18.0-305.10.2.x6.1.010.19.x86_64 #1 SMP Thu Jun 30 13:42:51 MDT 2022
> [ 2221.056673] Lustre: Skipped 2 previous similar messages
> [ 2221.067821] Call Trace TBD:
> [ 2221.067855] [<0>] wait_transaction_locked+0x89/0xc0 [jbd2]
> [ 2221.099175] [<0>] add_transaction_credits+0xd4/0x290 [jbd2]
> [ 2221.105266] [<0>] start_this_handle+0x10a/0x520 [jbd2]
> [ 2221.110904] [<0>] jbd2__journal_start+0xea/0x1f0 [jbd2]
> [ 2221.116679] [<0>] __ldiskfs_journal_start_sb+0x6e/0x130 [ldiskfs]
> [ 2221.123316] [<0>] osd_trans_start+0x13b/0x4f0 [osd_ldiskfs]
> [ 2221.129417] [<0>] ofd_commitrw_write+0x620/0x1830 [ofd]
> [ 2221.135147] [<0>] ofd_commitrw+0x731/0xd80 [ofd]
> [ 2221.140420] [<0>] obd_commitrw+0x1ac/0x370 [ptlrpc]
> [ 2221.145858] [<0>] tgt_brw_write+0x1913/0x1d50 [ptlrpc]
> [ 2221.151561] [<0>] tgt_request_handle+0xc93/0x1a40 [ptlrpc]
> [ 2221.157622] [<0>] ptlrpc_server_handle_request+0x323/0xbd0 [ptlrpc]
> [ 2221.164454] [<0>] ptlrpc_main+0xc06/0x1560 [ptlrpc]
> [ 2221.169860] [<0>] kthread+0x116/0x130
> [ 2221.174033] [<0>] ret_from_fork+0x1f/0x40
>
>
> Other logs have shown this thread couldn't take a handle, while other threads were able to do so many times.
> The kernel detector didn't trigger because the thread woke up many times, but each time it saw T_LOCKED and went back to sleep.
>
> Alex
>
>
>
> > -ritesh
>
* Re: [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
2022-09-08 5:46 ` Ritesh Harjani (IBM)
2022-09-08 5:51 ` Alexey Lyahkov
@ 2022-09-08 9:13 ` Ritesh Harjani (IBM)
1 sibling, 0 replies; 10+ messages in thread
From: Ritesh Harjani (IBM) @ 2022-09-08 9:13 UTC (permalink / raw)
To: Alexey Lyashkov
Cc: linux-ext4, Theodore Ts'o, Andreas Dilger, Artem Blagodarenko,
Andrew Perepechko
On 22/09/08 11:16AM, Ritesh Harjani (IBM) wrote:
> On 22/09/07 07:59PM, Alexey Lyashkov wrote:
> > From: Andrew Perepechko <anserper@ya.ru>
> >
> > LIFO wakeup order is unfair and sometimes leads to a journal
> > user not being able to get a journal handle for hundreds of
> > transactions in a row.
> >
> > FIFO wakeup can make things more fair.
>
> prepare_to_wait() always adds the task to the head of the wait list, while
> prepare_to_wait_exclusive() adds it to the tail, since all exclusive waiters
> are queued at the tail.
> wake_up() wakes every non-exclusive waiter but at most one exclusive waiter,
> whereas wake_up_all() wakes all waiters regardless of the exclusive flag.
>
> So your change does make the ordering FIFO, in which the task which came in
> first will be woken up first.
With all the other details that were discussed in the other subthreads,
this looks like the right thing to do.
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
* [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
@ 2022-09-12 18:01 Alexey Lyashkov
2022-09-30 3:19 ` Theodore Ts'o
0 siblings, 1 reply; 10+ messages in thread
From: Alexey Lyashkov @ 2022-09-12 18:01 UTC (permalink / raw)
To: linux-ext4, Theodore Ts'o, Andreas Dilger, Artem Blagodarenko
Cc: Andrew Perepechko, Ritesh Harjani, Alexey Lyashkov
From: Andrew Perepechko <anserper@ya.ru>
LIFO wakeup order is unfair and sometimes leads to a journal
user not being able to get a journal handle for hundreds of
transactions in a row.
FIFO wakeup can make things more fair.
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@gmail.com>
---
fs/jbd2/commit.c | 2 +-
fs/jbd2/transaction.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index b2b2bc9b88d9..ec2b55879e3a 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -570,7 +570,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
journal->j_running_transaction = NULL;
start_time = ktime_get();
commit_transaction->t_log_start = journal->j_head;
- wake_up(&journal->j_wait_transaction_locked);
+ wake_up_all(&journal->j_wait_transaction_locked);
write_unlock(&journal->j_state_lock);
jbd2_debug(3, "JBD2: commit phase 2a\n");
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index e1be93ccd81c..6a404ac1c178 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -168,7 +168,7 @@ static void wait_transaction_locked(journal_t *journal)
int need_to_start;
tid_t tid = journal->j_running_transaction->t_tid;
- prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
+ prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
TASK_UNINTERRUPTIBLE);
need_to_start = !tid_geq(journal->j_commit_request, tid);
read_unlock(&journal->j_state_lock);
@@ -194,7 +194,7 @@ static void wait_transaction_switching(journal_t *journal)
read_unlock(&journal->j_state_lock);
return;
}
- prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
+ prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
TASK_UNINTERRUPTIBLE);
read_unlock(&journal->j_state_lock);
/*
@@ -920,7 +920,7 @@ void jbd2_journal_unlock_updates (journal_t *journal)
write_lock(&journal->j_state_lock);
--journal->j_barrier_count;
write_unlock(&journal->j_state_lock);
- wake_up(&journal->j_wait_transaction_locked);
+ wake_up_all(&journal->j_wait_transaction_locked);
}
static void warn_dirty_buffer(struct buffer_head *bh)
--
2.31.1
* Re: [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO
2022-09-12 18:01 Alexey Lyashkov
@ 2022-09-30 3:19 ` Theodore Ts'o
0 siblings, 0 replies; 10+ messages in thread
From: Theodore Ts'o @ 2022-09-30 3:19 UTC (permalink / raw)
To: linux-ext4, adilger, artem.blagodarenko, alexey.lyashkov
Cc: Theodore Ts'o, anserper, ritesh.list
On Mon, 12 Sep 2022 21:01:37 +0300, Alexey Lyashkov wrote:
> From: Andrew Perepechko <anserper@ya.ru>
>
> LIFO wakeup order is unfair and sometimes leads to a journal
> user not being able to get a journal handle for hundreds of
> transactions in a row.
>
> FIFO wakeup can make things more fair.
>
> [...]
Applied, thanks!
[1/1] jbd2: wake up journal waiters in FIFO order, not LIFO
commit: 2bf6739025dcfca3a24fbc73b8f2dbd979c0e4c9
Best regards,
--
Theodore Ts'o <tytso@mit.edu>
end of thread, other threads:[~2022-09-30 3:20 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-07 16:59 [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO Alexey Lyashkov
2022-09-08 5:46 ` Ritesh Harjani (IBM)
2022-09-08 5:51 ` Alexey Lyahkov
2022-09-08 6:11 ` Ritesh Harjani (IBM)
2022-09-08 8:21 ` Alexey Lyahkov
2022-09-08 8:28 ` Andrew
2022-09-08 9:11 ` Ritesh Harjani (IBM)
2022-09-08 9:13 ` Ritesh Harjani (IBM)
-- strict thread matches above, loose matches on Subject: below --
2022-09-12 18:01 Alexey Lyashkov
2022-09-30 3:19 ` Theodore Ts'o