linux-ext4.vger.kernel.org archive mirror
* [patch/rft] jbd2: tag journal writes as metadata I/O
@ 2010-04-01 19:04 Jeff Moyer
  2010-04-01 19:48 ` Jan Kara
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Jeff Moyer @ 2010-04-01 19:04 UTC (permalink / raw)
  To: linux-ext4; +Cc: linux-kernel, jens.axboe, esandeen

Hi,

In running iozone for writes to small files, we noticed a pretty big
discrepancy between the performance of the deadline and cfq I/O
schedulers.  Investigation showed that I/O was being issued from 2
different contexts: the iozone process itself, and the jbd2/sdh-8 thread
(as expected).  Because of the way cfq performs slice idling, the delays
introduced between the metadata and data I/Os were significant.  For
example, cfq would see about 7MB/s versus deadline's 35MB/s for the same
workload.  I also ran fs_mark, writing and fsyncing 1000 64k files, and
observed a similar 5x performance difference.  Eric
Sandeen suggested that I flag the journal writes as metadata, and once I
did that, the performance difference went away completely (cfq has
special logic to prioritize metadata I/O).
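
The small-file write-and-fsync pattern described above can be approximated
in user space. What follows is a minimal sketch assuming a POSIX system; it
is not fs_mark, and the function names, file naming scheme and fill byte are
hypothetical choices made for illustration only.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose fsync() and clock_gettime() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical naming scheme shared by the two helpers below. */
void sketch_path(char *out, size_t len, const char *dir, int i)
{
	snprintf(out, len, "%s/fsync_sketch_%d.dat", dir, i);
}

/*
 * Create nfiles files of file_size bytes each under dir and fsync() every
 * file before closing it.  Returns the total number of bytes written, or
 * -1 on the first error.  If elapsed_sec is non-NULL, it receives the
 * wall-clock duration of the write loop in seconds.
 */
long long write_and_fsync_files(const char *dir, int nfiles,
				size_t file_size, double *elapsed_sec)
{
	char path[4096];
	struct timespec t0, t1;
	long long total = 0;
	char *buf;

	assert(dir != NULL && file_size > 0);
	buf = malloc(file_size);
	if (buf == NULL)
		return -1;
	memset(buf, 0xa5, file_size);	/* arbitrary fill pattern */

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < nfiles; i++) {
		size_t done = 0;
		int fd;

		sketch_path(path, sizeof(path), dir, i);
		fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
		if (fd < 0)
			goto fail;
		while (done < file_size) {	/* handle short writes */
			ssize_t n = write(fd, buf + done, file_size - done);

			if (n <= 0) {
				close(fd);
				goto fail;
			}
			done += (size_t)n;
		}
		if (fsync(fd) != 0) {	/* force data and metadata to disk */
			close(fd);
			goto fail;
		}
		close(fd);
		total += (long long)file_size;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);
	if (elapsed_sec != NULL)
		*elapsed_sec = (double)(t1.tv_sec - t0.tv_sec) +
			       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	free(buf);
	return total;
fail:
	free(buf);
	return -1;
}

/* Remove the files created by write_and_fsync_files(). */
void remove_sketch_files(const char *dir, int nfiles)
{
	char path[4096];

	for (int i = 0; i < nfiles; i++) {
		sketch_path(path, sizeof(path), dir, i);
		unlink(path);
	}
}
```

Calling write_and_fsync_files(dir, 1000, 64 * 1024, &secs) on the file
system under test roughly matches the 1000-file, 64k case.  Dividing the
returned byte count by secs gives a throughput figure that can be compared
across schedulers, though the absolute numbers will depend on the hardware
and kernel in use.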

So, I'm submitting this patch for comments and testing.  I have a
similar patch for jbd that I will submit if folks agree that this is a
good idea.

Cheers,
Jeff

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 671da7f..1998265 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -139,7 +139,7 @@ static int journal_submit_commit_record(journal_t *journal,
 		set_buffer_ordered(bh);
 		barrier_done = 1;
 	}
-	ret = submit_bh(WRITE_SYNC_PLUG, bh);
+	ret = submit_bh(WRITE_SYNC_PLUG | (1<<BIO_RW_META), bh);
 	if (barrier_done)
 		clear_buffer_ordered(bh);
 
@@ -160,7 +160,7 @@ static int journal_submit_commit_record(journal_t *journal,
 		lock_buffer(bh);
 		set_buffer_uptodate(bh);
 		clear_buffer_dirty(bh);
-		ret = submit_bh(WRITE_SYNC_PLUG, bh);
+		ret = submit_bh(WRITE_SYNC_PLUG | (1<<BIO_RW_META), bh);
 	}
 	*cbh = bh;
 	return ret;
@@ -369,7 +369,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
 	int tag_bytes = journal_tag_bytes(journal);
 	struct buffer_head *cbh = NULL; /* For transactional checksums */
 	__u32 crc32_sum = ~0;
-	int write_op = WRITE;
+	int write_op = WRITE_META;
 
 	/*
 	 * First job: lock down the current transaction and wait for
@@ -409,7 +409,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
 	 * instead we rely on sync_buffer() doing the unplug for us.
 	 */
 	if (commit_transaction->t_synchronous_commit)
-		write_op = WRITE_SYNC_PLUG;
+		write_op = WRITE_SYNC_PLUG | (1<<BIO_RW_META);
 	trace_jbd2_commit_locking(journal, commit_transaction);
 	stats.run.rs_wait = commit_transaction->t_max_wait;
 	stats.run.rs_locked = jiffies;
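
The hunks above all do the same thing: OR a metadata bit into whatever write
op the commit path already uses.  A stand-alone sketch of that selection
logic follows.  The SK_* names and bit positions are hypothetical stand-ins
so the example compiles in user space; they are not the kernel's BIO_RW_*
values, and SK_WRITE_SYNC_PLUG is simplified to a single sync bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit positions for illustration; not the kernel's values. */
enum {
	SK_BIT_WRITE = 0,	/* direction: write */
	SK_BIT_SYNC  = 1,	/* stand-in for the sync-related bits */
	SK_BIT_META  = 2,	/* stand-in for BIO_RW_META */
};

#define SK_WRITE		(1u << SK_BIT_WRITE)
#define SK_WRITE_META		(SK_WRITE | (1u << SK_BIT_META))
#define SK_WRITE_SYNC_PLUG	(SK_WRITE | (1u << SK_BIT_SYNC))

bool sk_is_meta(unsigned int op)
{
	return (op & (1u << SK_BIT_META)) != 0;
}

bool sk_is_sync(unsigned int op)
{
	return (op & (1u << SK_BIT_SYNC)) != 0;
}

/*
 * Mirror the shape of the patched commit path: default to a metadata
 * write, and keep the metadata bit when upgrading to a synchronous write.
 */
unsigned int sk_commit_write_op(bool synchronous_commit)
{
	unsigned int write_op = SK_WRITE_META;

	if (synchronous_commit)
		write_op = SK_WRITE_SYNC_PLUG | (1u << SK_BIT_META);

	/* Either way, the journal write is tagged as metadata. */
	assert(sk_is_meta(write_op));
	return write_op;
}
```

The point of the sketch is that the sync variant does not carry the metadata
bit on its own, so it has to be OR'd in explicitly at each call site.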


Thread overview: 15+ messages
2010-04-01 19:04 [patch/rft] jbd2: tag journal writes as metadata I/O Jeff Moyer
2010-04-01 19:48 ` Jan Kara
2010-04-05 15:24   ` Jeff Moyer
2010-04-05 17:46     ` tytso
2010-04-06 15:20       ` Jan Kara
2010-04-06 18:25       ` Vivek Goyal
2010-04-06 18:45         ` tytso
2010-04-06 19:04           ` Jeff Moyer
2010-04-02  7:00 ` Jens Axboe
2010-04-05 17:52 ` tytso
2010-04-05 18:36   ` Jeff Moyer
2010-04-05 19:48     ` tytso
2010-04-05 20:34       ` Jeff Moyer
2010-04-05 20:41         ` Jeff Moyer
2010-04-05 21:01           ` tytso
