linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3 v2] jbd2 performance improvements
@ 2013-01-31 17:53 Theodore Ts'o
  2013-01-31 17:53 ` [PATCH 1/3] jbd2: don't wake kjournald unnecessarily Theodore Ts'o
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Theodore Ts'o @ 2013-01-31 17:53 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o

I've added a patch which tracks the delay between when a commit was
first requested, and when the commit thread actually starts processing
the commit.  In addition to helping to assess the effects of the "commit
as soon as possible after log_start_commit", it is also useful in
debugging problems caused by real time processes which might interfere
with the kjournald thread (since we've had one user ask a question about
this already).

There are other performance metrics that we should consider adding,
including:

*) using a per-cpu variable to track t_max_wait, so we can measure this
   performance metric without incurring a scalability penalty; we
   currently disable this unless debugging is enabled, which is
   unfortunate.
*) how many handles are getting blocked waiting for a transaction to
   finish, and not just the max wait time
*) keeping statistics of why various commits have been started (was it
   due to an explicit fsync, or due to O_SYNC, or due to the transaction
   time expiring --- and in the latter case, whether it is due to an
   expiry while the file system was idle, and hence was picked up by
   kjournald, or during the stop_this_handle processing)

But these are things we can save for another time...

        	      	     	    	     - Ted

Eric Sandeen (1):
  jbd2: don't wake kjournald unnecessarily

Theodore Ts'o (2):
  jbd2: commit as soon as possible after log_start_commit
  jbd2: track request delay statistics

 fs/jbd2/commit.c            | 13 ++++++++++++-
 fs/jbd2/journal.c           | 17 ++++++++++++++---
 fs/jbd2/transaction.c       |  7 +++++--
 include/linux/jbd2.h        |  8 ++++++++
 include/trace/events/jbd2.h |  8 ++++++--
 5 files changed, 45 insertions(+), 8 deletions(-)

-- 
1.7.12.rc0.22.gcdd159b


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] jbd2: don't wake kjournald unnecessarily
  2013-01-31 17:53 [PATCH 0/3 v2] jbd2 performance improvements Theodore Ts'o
@ 2013-01-31 17:53 ` Theodore Ts'o
  2013-01-31 17:53 ` [PATCH 2/3] jbd2: commit as soon as possible after log_start_commit Theodore Ts'o
  2013-01-31 17:53 ` [PATCH 3/3] jbd2: track request delay statistics Theodore Ts'o
  2 siblings, 0 replies; 5+ messages in thread
From: Theodore Ts'o @ 2013-01-31 17:53 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Eric Sandeen, Theodore Ts'o

From: Eric Sandeen <sandeen@redhat.com>

Don't send an extra wakeup to kjournald in the case where we
already have the proper target in j_commit_request, i.e. that
transaction has already been requested for commit.

commit deeeaf13 "jbd2: fix fsync() tid wraparound bug" changed
the logic leading to a wakeup, but it caused some extra wakeups
which were found to lead to a measurable performance regression.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
[tytso@mit.edu: reworked check to make it clearer]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 fs/jbd2/journal.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index dbf41f9..1a80e31 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -513,6 +513,10 @@ int __jbd2_log_space_left(journal_t *journal)
  */
 int __jbd2_log_start_commit(journal_t *journal, tid_t target)
 {
+	/* Return if the txn has already requested to be committed */
+	if (journal->j_commit_request == target)
+		return 0;
+
 	/*
 	 * The only transaction we can possibly wait upon is the
 	 * currently running transaction (if it exists).  Otherwise,
-- 
1.7.12.rc0.22.gcdd159b


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] jbd2: commit as soon as possible after log_start_commit
  2013-01-31 17:53 [PATCH 0/3 v2] jbd2 performance improvements Theodore Ts'o
  2013-01-31 17:53 ` [PATCH 1/3] jbd2: don't wake kjournald unnecessarily Theodore Ts'o
@ 2013-01-31 17:53 ` Theodore Ts'o
  2013-02-04 21:02   ` Theodore Ts'o
  2013-01-31 17:53 ` [PATCH 3/3] jbd2: track request delay statistics Theodore Ts'o
  2 siblings, 1 reply; 5+ messages in thread
From: Theodore Ts'o @ 2013-01-31 17:53 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o

Once a transaction has been requested to be committed, don't let any
other handles start under that transaction, and don't allow any
pending transactions to be extended (i.e., in the case of
unlink/ftruncate).

The idea is once the transaction has had log_start_commit() called on
it, at least one thread is blocked waiting for that transaction to
commit, and over time, more and more threads will end up getting
blocked.  In order to avoid high variability in file system operations
getting blocked behind the a blocked start_this_handle(), we should
try to get the commit started as soon as possible.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 fs/jbd2/commit.c      | 3 ++-
 fs/jbd2/journal.c     | 1 +
 fs/jbd2/transaction.c | 6 ++++--
 include/linux/jbd2.h  | 1 +
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 3091d42..fd2ac94 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -424,7 +424,8 @@ void jbd2_journal_commit_transaction(journal_t *journal)
 	J_ASSERT(journal->j_committing_transaction == NULL);
 
 	commit_transaction = journal->j_running_transaction;
-	J_ASSERT(commit_transaction->t_state == T_RUNNING);
+	J_ASSERT(commit_transaction->t_state == T_REQUESTED ||
+		 commit_transaction->t_state == T_RUNNING);
 
 	trace_jbd2_start_commit(journal, commit_transaction);
 	jbd_debug(1, "JBD2: starting commit of transaction %d\n",
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 1a80e31..c22773b 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -533,6 +533,7 @@ int __jbd2_log_start_commit(journal_t *journal, tid_t target)
 		jbd_debug(1, "JBD2: requesting commit %d/%d\n",
 			  journal->j_commit_request,
 			  journal->j_commit_sequence);
+		journal->j_running_transaction->t_state = T_REQUESTED;
 		wake_up(&journal->j_wait_commit);
 		return 1;
 	} else if (!tid_geq(journal->j_commit_request, target))
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index df9f297..6daff29 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -224,7 +224,8 @@ repeat:
 	 * If the current transaction is locked down for commit, wait for the
 	 * lock to be released.
 	 */
-	if (transaction->t_state == T_LOCKED) {
+	if ((transaction->t_state == T_LOCKED) ||
+	    (transaction->t_state == T_REQUESTED)) {
 		DEFINE_WAIT(wait);
 
 		prepare_to_wait(&journal->j_wait_transaction_locked,
@@ -2179,7 +2180,8 @@ void __jbd2_journal_refile_buffer(struct journal_head *jh)
 	else
 		jlist = BJ_Reserved;
 	__jbd2_journal_file_buffer(jh, jh->b_transaction, jlist);
-	J_ASSERT_JH(jh, jh->b_transaction->t_state == T_RUNNING);
+	J_ASSERT_JH(jh, (jh->b_transaction->t_state == T_RUNNING ||
+			 jh->b_transaction->t_state == T_REQUESTED));
 
 	if (was_dirty)
 		set_buffer_jbddirty(bh);
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index e30b663..920a8d0 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -493,6 +493,7 @@ struct transaction_s
 	 */
 	enum {
 		T_RUNNING,
+		T_REQUESTED,
 		T_LOCKED,
 		T_FLUSH,
 		T_COMMIT,
-- 
1.7.12.rc0.22.gcdd159b


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] jbd2: track request delay statistics
  2013-01-31 17:53 [PATCH 0/3 v2] jbd2 performance improvements Theodore Ts'o
  2013-01-31 17:53 ` [PATCH 1/3] jbd2: don't wake kjournald unnecessarily Theodore Ts'o
  2013-01-31 17:53 ` [PATCH 2/3] jbd2: commit as soon as possible after log_start_commit Theodore Ts'o
@ 2013-01-31 17:53 ` Theodore Ts'o
  2 siblings, 0 replies; 5+ messages in thread
From: Theodore Ts'o @ 2013-01-31 17:53 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o

Track the delay between when we first request that the commit begin
and when it actually begins, so we can see how much of a gap exists.
In theory, this should just be the remaining time scheduling quantuum
of the thread which requested the commit (assuming it was not a
synchronous operation which triggered the commit request) plus
scheduling overhead; however, it's possible that real time processes
might get in the way of letting the kjournald thread from executing.

In addition, since we now prohibit handles from being newly started or
extended once a commit has been requested, calculate the commit
running time from when the commit was first requested, and include the
request delay as part of the transaction lockdown delay.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 fs/jbd2/commit.c            | 10 ++++++++++
 fs/jbd2/journal.c           | 12 +++++++++---
 fs/jbd2/transaction.c       |  1 +
 include/linux/jbd2.h        |  7 +++++++
 include/trace/events/jbd2.h |  8 ++++++--
 5 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index fd2ac94..20dde86 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -436,7 +436,14 @@ void jbd2_journal_commit_transaction(journal_t *journal)
 
 	trace_jbd2_commit_locking(journal, commit_transaction);
 	stats.run.rs_wait = commit_transaction->t_max_wait;
+	stats.run.rs_request_delay = 0;
 	stats.run.rs_locked = jiffies;
+	if (commit_transaction->t_requested) {
+		stats.run.rs_request_delay =
+			jbd2_time_diff(commit_transaction->t_requested,
+				       stats.run.rs_locked);
+		stats.run.rs_locked = commit_transaction->t_requested;
+	}
 	stats.run.rs_running = jbd2_time_diff(commit_transaction->t_start,
 					      stats.run.rs_locked);
 
@@ -1117,7 +1124,10 @@ restart_loop:
 	 */
 	spin_lock(&journal->j_history_lock);
 	journal->j_stats.ts_tid++;
+	if (commit_transaction->t_requested)
+		journal->j_stats.ts_requested++;
 	journal->j_stats.run.rs_wait += stats.run.rs_wait;
+	journal->j_stats.run.rs_request_delay += stats.run.rs_request_delay;
 	journal->j_stats.run.rs_running += stats.run.rs_running;
 	journal->j_stats.run.rs_locked += stats.run.rs_locked;
 	journal->j_stats.run.rs_flushing += stats.run.rs_flushing;
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index c22773b..f6ad0c0 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -534,6 +534,7 @@ int __jbd2_log_start_commit(journal_t *journal, tid_t target)
 			  journal->j_commit_request,
 			  journal->j_commit_sequence);
 		journal->j_running_transaction->t_state = T_REQUESTED;
+		journal->j_running_transaction->t_requested = jiffies;
 		wake_up(&journal->j_wait_commit);
 		return 1;
 	} else if (!tid_geq(journal->j_commit_request, target))
@@ -899,13 +900,18 @@ static int jbd2_seq_info_show(struct seq_file *seq, void *v)
 
 	if (v != SEQ_START_TOKEN)
 		return 0;
-	seq_printf(seq, "%lu transaction, each up to %u blocks\n",
-			s->stats->ts_tid,
-			s->journal->j_max_transaction_buffers);
+	seq_printf(seq, "%lu transactions (%lu requested), "
+		   "each up to %u blocks\n",
+		   s->stats->ts_tid, s->stats->ts_requested,
+		   s->journal->j_max_transaction_buffers);
 	if (s->stats->ts_tid == 0)
 		return 0;
 	seq_printf(seq, "average: \n  %ums waiting for transaction\n",
 	    jiffies_to_msecs(s->stats->run.rs_wait / s->stats->ts_tid));
+	seq_printf(seq, "  %ums request delay\n",
+	    (s->stats->ts_requested == 0) ? 0 :
+	    jiffies_to_msecs(s->stats->run.rs_request_delay /
+			     s->stats->ts_requested));
 	seq_printf(seq, "  %ums running transaction\n",
 	    jiffies_to_msecs(s->stats->run.rs_running / s->stats->ts_tid));
 	seq_printf(seq, "  %ums transaction was being locked\n",
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 6daff29..7fbc68f 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -100,6 +100,7 @@ jbd2_get_transaction(journal_t *journal, transaction_t *transaction)
 	journal->j_running_transaction = transaction;
 	transaction->t_max_wait = 0;
 	transaction->t_start = jiffies;
+	transaction->t_requested = 0;
 
 	return transaction;
 }
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 920a8d0..95c1ca8 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -582,6 +582,11 @@ struct transaction_s
 	unsigned long		t_start;
 
 	/*
+	 * When commit was requested
+	 */
+	unsigned long		t_requested;
+
+	/*
 	 * Checkpointing stats [j_checkpoint_sem]
 	 */
 	struct transaction_chp_stats_s t_chp_stats;
@@ -638,6 +643,7 @@ struct transaction_s
 
 struct transaction_run_stats_s {
 	unsigned long		rs_wait;
+	unsigned long		rs_request_delay;
 	unsigned long		rs_running;
 	unsigned long		rs_locked;
 	unsigned long		rs_flushing;
@@ -650,6 +656,7 @@ struct transaction_run_stats_s {
 
 struct transaction_stats_s {
 	unsigned long		ts_tid;
+	unsigned long		ts_requested;
 	struct transaction_run_stats_s run;
 };
 
diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
index 127993d..5419f57 100644
--- a/include/trace/events/jbd2.h
+++ b/include/trace/events/jbd2.h
@@ -142,6 +142,7 @@ TRACE_EVENT(jbd2_run_stats,
 		__field(		dev_t,	dev		)
 		__field(	unsigned long,	tid		)
 		__field(	unsigned long,	wait		)
+		__field(	unsigned long,	request_delay	)
 		__field(	unsigned long,	running		)
 		__field(	unsigned long,	locked		)
 		__field(	unsigned long,	flushing	)
@@ -155,6 +156,7 @@ TRACE_EVENT(jbd2_run_stats,
 		__entry->dev		= dev;
 		__entry->tid		= tid;
 		__entry->wait		= stats->rs_wait;
+		__entry->request_delay	= stats->rs_request_delay;
 		__entry->running	= stats->rs_running;
 		__entry->locked		= stats->rs_locked;
 		__entry->flushing	= stats->rs_flushing;
@@ -164,10 +166,12 @@ TRACE_EVENT(jbd2_run_stats,
 		__entry->blocks_logged	= stats->rs_blocks_logged;
 	),
 
-	TP_printk("dev %d,%d tid %lu wait %u running %u locked %u flushing %u "
-		  "logging %u handle_count %u blocks %u blocks_logged %u",
+	TP_printk("dev %d,%d tid %lu wait %u request_delay %u running %u "
+		  "locked %u flushing %u logging %u handle_count %u "
+		  "blocks %u blocks_logged %u",
 		  MAJOR(__entry->dev), MINOR(__entry->dev), __entry->tid,
 		  jiffies_to_msecs(__entry->wait),
+		  jiffies_to_msecs(__entry->request_delay),
 		  jiffies_to_msecs(__entry->running),
 		  jiffies_to_msecs(__entry->locked),
 		  jiffies_to_msecs(__entry->flushing),
-- 
1.7.12.rc0.22.gcdd159b


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/3] jbd2: commit as soon as possible after log_start_commit
  2013-01-31 17:53 ` [PATCH 2/3] jbd2: commit as soon as possible after log_start_commit Theodore Ts'o
@ 2013-02-04 21:02   ` Theodore Ts'o
  0 siblings, 0 replies; 5+ messages in thread
From: Theodore Ts'o @ 2013-02-04 21:02 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Barry J. Marson

On Thu, Jan 31, 2013 at 12:53:07PM -0500, Theodore Ts'o wrote:
> Once a transaction has been requested to be committed, don't let any
> other handles start under that transaction, and don't allow any
> pending transactions to be extended (i.e., in the case of
> unlink/ftruncate).
> 
> The idea is once the transaction has had log_start_commit() called on
> it, at least one thread is blocked waiting for that transaction to
> commit, and over time, more and more threads will end up getting
> blocked.  In order to avoid high variability in file system operations
> getting blocked behind the a blocked start_this_handle(), we should
> try to get the commit started as soon as possible.

I'm going to drop this patch because thanks to some performance
measurement works by Barry Marson at Red Hat, it shows that this patch
apparently makes things worse by 17% with AIM7.

At a guess, it looks like some AIM7 has enough threads competing for
the CPU that it can take a good 80ms or more before kjournald can get
scheduled, and then start locking down the transaction.  During those
80ms, a number of short transactions get in and out and make forward
progress on the benchmark.  However, there are some "long pole"
handles that will take up to 100-250ms to complete, and during that
time, nothing else can get done.  By starting to lock down the
transaction as soon as the commit is requested, instead of waiting
until the kjournald thread can be scheduled, we end up limiting
forward progress made by the quick "in and out" handles, and this far
outweighs the benefits of stopping a long-lived handle from getting
started.

It may be that after we do some work to conclusively identify what
these "long pole" handles might be --- I strongly suspect they come
from truncate/unlink calls, but we need to make sure.  If in fact they
are coming from truncate/unlink calls, it might be possible to let a
truncate processing stop its handle once it notices that we are
requesting that a journal commit should start, so we can lower the
average amount of time start_this_handle() gets blocked waiting for a
commit to complete.

Thanks again to Barry for doing the benchmarking work!  As always,
people who provide us with benchmarking support provide an absolutely
invaluable service.

					- Ted

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-02-04 21:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-31 17:53 [PATCH 0/3 v2] jbd2 performance improvements Theodore Ts'o
2013-01-31 17:53 ` [PATCH 1/3] jbd2: don't wake kjournald unnecessarily Theodore Ts'o
2013-01-31 17:53 ` [PATCH 2/3] jbd2: commit as soon as possible after log_start_commit Theodore Ts'o
2013-02-04 21:02   ` Theodore Ts'o
2013-01-31 17:53 ` [PATCH 3/3] jbd2: track request delay statistics Theodore Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).