* jbd2_clear_buffer_revoked_flags() takes a long time @ 2018-10-10 13:43 Adrian Hunter 2018-10-10 17:49 ` Theodore Y. Ts'o 0 siblings, 1 reply; 6+ messages in thread From: Adrian Hunter @ 2018-10-10 13:43 UTC (permalink / raw) To: Andreas Dilger, linux-ext4@vger.kernel.org, Theodore Ts'o Hi I have a case on a v4.14 kernel where the EXT4 journal commit disables preemption for 30ms due to jbd2_clear_buffer_revoked_flags(). That in turn disables preemption on other CPUs as they come to spin waiting for the same lock. The side-effect of that is that it periodically blocks high priority tasks from running. I see jbd2_clear_buffer_revoked_flags() iterating 32768 times calling __find_get_block(). Is there any way to make jbd2_clear_buffer_revoked_flags() take less time, or move its work out from under write_lock(&journal->j_state_lock)? Or do you have any other suggestions? Regards Adrian ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: jbd2_clear_buffer_revoked_flags() takes a long time 2018-10-10 13:43 jbd2_clear_buffer_revoked_flags() takes a long time Adrian Hunter @ 2018-10-10 17:49 ` Theodore Y. Ts'o 2018-10-11 11:12 ` Jan Kara 0 siblings, 1 reply; 6+ messages in thread From: Theodore Y. Ts'o @ 2018-10-10 17:49 UTC (permalink / raw) To: Adrian Hunter; +Cc: Andreas Dilger, linux-ext4@vger.kernel.org On Wed, Oct 10, 2018 at 04:43:27PM +0300, Adrian Hunter wrote: > Hi > > I have a case on a v4.14 kernel where the EXT4 journal commit disables > preemption for 30ms due to jbd2_clear_buffer_revoked_flags(). That in turn > disables preemption on other CPUs as they come to spin waiting for the same > lock. The side-effect of that is that it periodically blocks high priority > tasks from running. > > I see jbd2_clear_buffer_revoked_flags() iterating 32768 times calling > __find_get_block(). > > Is there any way to make jbd2_clear_buffer_revoked_flags() take less time, > or move its work out from under write_lock(&journal->j_state_lock)? Hmm.... I'd have to look a bit more carefully and then run some tests, but I *think* we can drop the j_state_lock at the beginning of JBD2 commit phase 1, and then grab it again right before we set commit_transaction->t_state to T_FLUSH. That should be safe because while the transaction state is T_LOCKED, we can't start any new handles, so there can't be any new blocks added to the revoke table. Can you give that a try and see whether that solves your priority inversion problem? Cheers, - Ted ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: jbd2_clear_buffer_revoked_flags() takes a long time 2018-10-10 17:49 ` Theodore Y. Ts'o @ 2018-10-11 11:12 ` Jan Kara 2018-10-11 12:38 ` Adrian Hunter 0 siblings, 1 reply; 6+ messages in thread From: Jan Kara @ 2018-10-11 11:12 UTC (permalink / raw) To: Theodore Y. Ts'o Cc: Adrian Hunter, Andreas Dilger, linux-ext4@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 1357 bytes --] On Wed 10-10-18 13:49:34, Theodore Y. Ts'o wrote: > On Wed, Oct 10, 2018 at 04:43:27PM +0300, Adrian Hunter wrote: > > Hi > > > > I have a case on a v4.14 kernel where the EXT4 journal commit disables > > preemption for 30ms due to jbd2_clear_buffer_revoked_flags(). That in turn > > disables preemption on other CPUs as they come to spin waiting for the same > > lock. The side-effect of that is that it periodically blocks high priority > > tasks from running. > > > > I see jbd2_clear_buffer_revoked_flags() iterating 32768 times calling > > __find_get_block(). > > > > Is there any way to make jbd2_clear_buffer_revoked_flags() take less time, > > or move its work out from under write_lock(&journal->j_state_lock)? > > Hmm.... I'd have to look a bit more carefully and then run some tests, > but I *think* we can drop the j_state_lock at the beginning of JBD2 > commit phase 1, and then grab it again right before we set > commit_transaction->t_state to T_FLUSH. > > That should be safe because while the transaction state is T_LOCKED, > we can't start any new handles, so there can't be any new blocks added > to the revoke table. > > Can you give that a try and see whether that solves your priority > inversion problem? Agreed. Something like attached patch (compile-tested only)? Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR [-- Attachment #2: 0001-jbd2-Avoid-long-hold-times-of-j_state_lock-while-com.patch --] [-- Type: text/x-patch, Size: 2286 bytes --] >From 3627b5f30996504019cd84f326402fccbb9a298b Mon Sep 17 00:00:00 2001 From: Jan Kara <jack@suse.cz> Date: Thu, 11 Oct 2018 13:04:44 +0200 Subject: [PATCH] jbd2: Avoid long hold times of j_state_lock while committing a transaction We can hold j_state_lock for writing at the beginning of jbd2_journal_commit_transaction() for a rather long time (reportedly for 30 ms) due cleaning revoke bits of all revoked buffers under it. The handling of revoke tables as well as cleaning of t_reserved_list, and checkpoint lists does not need j_state_lock for anything. Furthermore the transaction is in T_LOCKED state and we waited for all outstanding handles so nobody is going to be adding anything to the transaction. Just drop the lock for unnecessary operations. Reported-by: Adrian Hunter <adrian.hunter@intel.com> Suggested-by: "Theodore Y. Ts'o" <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz> --- fs/jbd2/commit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c index 150cc030b4d7..356b75fa3101 100644 --- a/fs/jbd2/commit.c +++ b/fs/jbd2/commit.c @@ -422,6 +422,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) stats.run.rs_locked); stats.run.rs_running = jbd2_time_diff(commit_transaction->t_start, stats.run.rs_locked); + write_unlock(&journal->j_state_lock); spin_lock(&commit_transaction->t_handle_lock); while (atomic_read(&commit_transaction->t_updates)) { @@ -431,9 +432,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) TASK_UNINTERRUPTIBLE); if (atomic_read(&commit_transaction->t_updates)) { spin_unlock(&commit_transaction->t_handle_lock); - write_unlock(&journal->j_state_lock); schedule(); - write_lock(&journal->j_state_lock); spin_lock(&commit_transaction->t_handle_lock); } finish_wait(&journal->j_wait_updates, &wait); @@ -505,6 +504,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) atomic_sub(atomic_read(&journal->j_reserved_credits), &commit_transaction->t_outstanding_credits); + write_lock(&journal->j_state_lock); trace_jbd2_commit_flushing(journal, commit_transaction); stats.run.rs_flushing = jiffies; stats.run.rs_locked = jbd2_time_diff(stats.run.rs_locked, -- 2.16.4 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: jbd2_clear_buffer_revoked_flags() takes a long time 2018-10-11 11:12 ` Jan Kara @ 2018-10-11 12:38 ` Adrian Hunter 2018-10-16 8:49 ` Adrian Hunter 0 siblings, 1 reply; 6+ messages in thread From: Adrian Hunter @ 2018-10-11 12:38 UTC (permalink / raw) To: Jan Kara, Theodore Y. Ts'o; +Cc: Andreas Dilger, linux-ext4@vger.kernel.org On 11/10/18 2:12 PM, Jan Kara wrote: > On Wed 10-10-18 13:49:34, Theodore Y. Ts'o wrote: >> On Wed, Oct 10, 2018 at 04:43:27PM +0300, Adrian Hunter wrote: >>> Hi >>> >>> I have a case on a v4.14 kernel where the EXT4 journal commit disables >>> preemption for 30ms due to jbd2_clear_buffer_revoked_flags(). That in turn >>> disables preemption on other CPUs as they come to spin waiting for the same >>> lock. The side-effect of that is that it periodically blocks high priority >>> tasks from running. >>> >>> I see jbd2_clear_buffer_revoked_flags() iterating 32768 times calling >>> __find_get_block(). >>> >>> Is there any way to make jbd2_clear_buffer_revoked_flags() take less time, >>> or move its work out from under write_lock(&journal->j_state_lock)? >> Hmm.... I'd have to look a bit more carefully and then run some tests, >> but I *think* we can drop the j_state_lock at the beginning of JBD2 >> commit phase 1, and then grab it again right before we set >> commit_transaction->t_state to T_FLUSH. >> >> That should be safe because while the transaction state is T_LOCKED, >> we can't start any new handles, so there can't be any new blocks added >> to the revoke table. >> >> Can you give that a try and see whether that solves your priority >> inversion problem? > Agreed. Something like attached patch (compile-tested only)? I have been testing a patch with the unlock/lock at slightly different positions, and it definitely helps. The incidence of my problem drops from nearly every writeback, to a few an hour. I haven't had time to find out what is causing the remaining cases yet - it may not be related to EXT4. I should be able to test this patch tomorrow. > > Honza > > -- Jan Kara <jack@suse.com> SUSE Labs, CR > > > 0001-jbd2-Avoid-long-hold-times-of-j_state_lock-while-com.patch > >>From 3627b5f30996504019cd84f326402fccbb9a298b Mon Sep 17 00:00:00 2001 > From: Jan Kara <jack@suse.cz> > Date: Thu, 11 Oct 2018 13:04:44 +0200 > Subject: [PATCH] jbd2: Avoid long hold times of j_state_lock while committing > a transaction > > We can hold j_state_lock for writing at the beginning of > jbd2_journal_commit_transaction() for a rather long time (reportedly for > 30 ms) due cleaning revoke bits of all revoked buffers under it. The > handling of revoke tables as well as cleaning of t_reserved_list, and > checkpoint lists does not need j_state_lock for anything. Furthermore > the transaction is in T_LOCKED state and we waited for all outstanding > handles so nobody is going to be adding anything to the transaction. > > Just drop the lock for unnecessary operations. > > Reported-by: Adrian Hunter <adrian.hunter@intel.com> > Suggested-by: "Theodore Y. Ts'o" <tytso@mit.edu> > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/jbd2/commit.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c > index 150cc030b4d7..356b75fa3101 100644 > --- a/fs/jbd2/commit.c > +++ b/fs/jbd2/commit.c > @@ -422,6 +422,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) > stats.run.rs_locked); > stats.run.rs_running = jbd2_time_diff(commit_transaction->t_start, > stats.run.rs_locked); > + write_unlock(&journal->j_state_lock); > > spin_lock(&commit_transaction->t_handle_lock); > while (atomic_read(&commit_transaction->t_updates)) { > @@ -431,9 +432,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) > TASK_UNINTERRUPTIBLE); > if (atomic_read(&commit_transaction->t_updates)) { > spin_unlock(&commit_transaction->t_handle_lock); > - write_unlock(&journal->j_state_lock); > schedule(); > - write_lock(&journal->j_state_lock); > spin_lock(&commit_transaction->t_handle_lock); > } > finish_wait(&journal->j_wait_updates, &wait); > @@ -505,6 +504,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) > atomic_sub(atomic_read(&journal->j_reserved_credits), > &commit_transaction->t_outstanding_credits); > > + write_lock(&journal->j_state_lock); > trace_jbd2_commit_flushing(journal, commit_transaction); > stats.run.rs_flushing = jiffies; > stats.run.rs_locked = jbd2_time_diff(stats.run.rs_locked, > -- 2.16.4 > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: jbd2_clear_buffer_revoked_flags() takes a long time 2018-10-11 12:38 ` Adrian Hunter @ 2018-10-16 8:49 ` Adrian Hunter 2018-10-16 9:50 ` Jan Kara 0 siblings, 1 reply; 6+ messages in thread From: Adrian Hunter @ 2018-10-16 8:49 UTC (permalink / raw) To: Jan Kara, Theodore Y. Ts'o; +Cc: Andreas Dilger, linux-ext4@vger.kernel.org On 11/10/18 3:38 PM, Adrian Hunter wrote: > On 11/10/18 2:12 PM, Jan Kara wrote: >> On Wed 10-10-18 13:49:34, Theodore Y. Ts'o wrote: >>> On Wed, Oct 10, 2018 at 04:43:27PM +0300, Adrian Hunter wrote: >>>> Hi >>>> >>>> I have a case on a v4.14 kernel where the EXT4 journal commit disables >>>> preemption for 30ms due to jbd2_clear_buffer_revoked_flags(). That in turn >>>> disables preemption on other CPUs as they come to spin waiting for the same >>>> lock. The side-effect of that is that it periodically blocks high priority >>>> tasks from running. >>>> >>>> I see jbd2_clear_buffer_revoked_flags() iterating 32768 times calling >>>> __find_get_block(). >>>> >>>> Is there any way to make jbd2_clear_buffer_revoked_flags() take less time, >>>> or move its work out from under write_lock(&journal->j_state_lock)? >>> Hmm.... I'd have to look a bit more carefully and then run some tests, >>> but I *think* we can drop the j_state_lock at the beginning of JBD2 >>> commit phase 1, and then grab it again right before we set >>> commit_transaction->t_state to T_FLUSH. >>> >>> That should be safe because while the transaction state is T_LOCKED, >>> we can't start any new handles, so there can't be any new blocks added >>> to the revoke table. >>> >>> Can you give that a try and see whether that solves your priority >>> inversion problem? >> Agreed. Something like attached patch (compile-tested only)? > > I have been testing a patch with the unlock/lock at slightly different > positions, and it definitely helps. The incidence of my problem drops from > nearly every writeback, to a few an hour. I haven't had time to find out > what is causing the remaining cases yet - it may not be related to EXT4. I > should be able to test this patch tomorrow. Thanks very much for the quick response and patch! Tested-by: Adrian Hunter <adrian.hunter@intel.com> With more stress I also found move_expired_inodes() (wb_writeback->queue_io->move_expired_inodes) to take up to 16ms using 230,000 branches while under spin lock. AFAICT we weren't hitting that in practice so I am not following it up at this stage. > >> >> Honza >> >> -- Jan Kara <jack@suse.com> SUSE Labs, CR >> >> >> 0001-jbd2-Avoid-long-hold-times-of-j_state_lock-while-com.patch >> >> >From 3627b5f30996504019cd84f326402fccbb9a298b Mon Sep 17 00:00:00 2001 >> From: Jan Kara <jack@suse.cz> >> Date: Thu, 11 Oct 2018 13:04:44 +0200 >> Subject: [PATCH] jbd2: Avoid long hold times of j_state_lock while committing >> a transaction >> >> We can hold j_state_lock for writing at the beginning of >> jbd2_journal_commit_transaction() for a rather long time (reportedly for >> 30 ms) due cleaning revoke bits of all revoked buffers under it. The >> handling of revoke tables as well as cleaning of t_reserved_list, and >> checkpoint lists does not need j_state_lock for anything. Furthermore >> the transaction is in T_LOCKED state and we waited for all outstanding >> handles so nobody is going to be adding anything to the transaction. >> >> Just drop the lock for unnecessary operations. >> >> Reported-by: Adrian Hunter <adrian.hunter@intel.com> >> Suggested-by: "Theodore Y. Ts'o" <tytso@mit.edu> >> Signed-off-by: Jan Kara <jack@suse.cz> >> --- >> fs/jbd2/commit.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c >> index 150cc030b4d7..356b75fa3101 100644 >> --- a/fs/jbd2/commit.c >> +++ b/fs/jbd2/commit.c >> @@ -422,6 +422,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) >> stats.run.rs_locked); >> stats.run.rs_running = jbd2_time_diff(commit_transaction->t_start, >> stats.run.rs_locked); >> + write_unlock(&journal->j_state_lock); >> >> spin_lock(&commit_transaction->t_handle_lock); >> while (atomic_read(&commit_transaction->t_updates)) { >> @@ -431,9 +432,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) >> TASK_UNINTERRUPTIBLE); >> if (atomic_read(&commit_transaction->t_updates)) { >> spin_unlock(&commit_transaction->t_handle_lock); >> - write_unlock(&journal->j_state_lock); >> schedule(); >> - write_lock(&journal->j_state_lock); >> spin_lock(&commit_transaction->t_handle_lock); >> } >> finish_wait(&journal->j_wait_updates, &wait); >> @@ -505,6 +504,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) >> atomic_sub(atomic_read(&journal->j_reserved_credits), >> &commit_transaction->t_outstanding_credits); >> >> + write_lock(&journal->j_state_lock); >> trace_jbd2_commit_flushing(journal, commit_transaction); >> stats.run.rs_flushing = jiffies; >> stats.run.rs_locked = jbd2_time_diff(stats.run.rs_locked, >> -- 2.16.4 >> > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: jbd2_clear_buffer_revoked_flags() takes a long time 2018-10-16 8:49 ` Adrian Hunter @ 2018-10-16 9:50 ` Jan Kara 0 siblings, 0 replies; 6+ messages in thread From: Jan Kara @ 2018-10-16 9:50 UTC (permalink / raw) To: Adrian Hunter Cc: Jan Kara, Theodore Y. Ts'o, Andreas Dilger, linux-ext4@vger.kernel.org On Tue 16-10-18 11:49:45, Adrian Hunter wrote: > On 11/10/18 3:38 PM, Adrian Hunter wrote: > > On 11/10/18 2:12 PM, Jan Kara wrote: > >> On Wed 10-10-18 13:49:34, Theodore Y. Ts'o wrote: > >>> On Wed, Oct 10, 2018 at 04:43:27PM +0300, Adrian Hunter wrote: > >>>> Hi > >>>> > >>>> I have a case on a v4.14 kernel where the EXT4 journal commit disables > >>>> preemption for 30ms due to jbd2_clear_buffer_revoked_flags(). That in turn > >>>> disables preemption on other CPUs as they come to spin waiting for the same > >>>> lock. The side-effect of that is that it periodically blocks high priority > >>>> tasks from running. > >>>> > >>>> I see jbd2_clear_buffer_revoked_flags() iterating 32768 times calling > >>>> __find_get_block(). > >>>> > >>>> Is there any way to make jbd2_clear_buffer_revoked_flags() take less time, > >>>> or move its work out from under write_lock(&journal->j_state_lock)? > >>> Hmm.... I'd have to look a bit more carefully and then run some tests, > >>> but I *think* we can drop the j_state_lock at the beginning of JBD2 > >>> commit phase 1, and then grab it again right before we set > >>> commit_transaction->t_state to T_FLUSH. > >>> > >>> That should be safe because while the transaction state is T_LOCKED, > >>> we can't start any new handles, so there can't be any new blocks added > >>> to the revoke table. > >>> > >>> Can you give that a try and see whether that solves your priority > >>> inversion problem? > >> Agreed. Something like attached patch (compile-tested only)? > > > > I have been testing a patch with the unlock/lock at slightly different > > positions, and it definitely helps. The incidence of my problem drops from > > nearly every writeback, to a few an hour. I haven't had time to find out > > what is causing the remaining cases yet - it may not be related to EXT4. I > > should be able to test this patch tomorrow. > > Thanks very much for the quick response and patch! > > Tested-by: Adrian Hunter <adrian.hunter@intel.com> Thanks. I've officially posted the patch. > With more stress I also found move_expired_inodes() > (wb_writeback->queue_io->move_expired_inodes) to take up to 16ms using > 230,000 branches while under spin lock. AFAICT we weren't hitting that in > practice so I am not following it up at this stage. Interesting. I actually have a patch simplifying that area as well sitting in some branch in my tree. So I can dust it off if you are interested. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-10-16 17:40 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-10-10 13:43 jbd2_clear_buffer_revoked_flags() takes a long time Adrian Hunter 2018-10-10 17:49 ` Theodore Y. Ts'o 2018-10-11 11:12 ` Jan Kara 2018-10-11 12:38 ` Adrian Hunter 2018-10-16 8:49 ` Adrian Hunter 2018-10-16 9:50 ` Jan Kara
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).