linux-fsdevel.vger.kernel.org archive mirror
From: Alex Tomas <alex@clusterfs.com>
To: Badari Pulavarty <pbadari@us.ibm.com>
Cc: suparna@in.ibm.com, sct@redhat.com,
	akpmext2-devel@lists.sourceforge.net,
	linux-fsdevel@vger.kernel.org
Subject: Re: [Ext2-devel] Reviewing ext3 improvement patches (delalloc, mballoc, extents)
Date: Fri, 4 Mar 2005 18:02:35 +0300
Message-ID: <20050304180235.0a8ff966.alex@clusterfs.com>
In-Reply-To: <1109898734.4961.11.camel@dyn318077bld.beaverton.ibm.com>

On 03 Mar 2005 17:12:14 -0800
Badari Pulavarty <pbadari@us.ibm.com> wrote:

> One more thing, we need to keep in mind is - we need to make sure
> that "ordered" mode also improved - since all our testcode 
> focuses on "writeback" mode and the default mode is "ordered" :(
> 

I've just cooked up a patch that implements ordered mode for the
delayed allocation path. Please take a look:

ftp://ftp.clusterfs.com/pub/people/alex/2.6.11/ext3-delalloc-ordered-2.6.11-0.1.patch

Stephen, Andrew, could you review it, please?

thanks, Alex


Index: linux-2.6.11/include/linux/jbd.h
===================================================================
--- linux-2.6.11.orig/include/linux/jbd.h	2005-03-02 20:49:13.000000000 +0300
+++ linux-2.6.11/include/linux/jbd.h	2005-03-04 17:03:52.000000000 +0300
@@ -486,6 +486,12 @@
 	struct journal_head	*t_sync_datalist;
 
 	/*
+	 * Number of BIOs submitted in the context of this transaction
+	 * that must complete before we commit
+	 */
+	atomic_t		t_bios_in_flight;
+
+	/*
 	 * Doubly-linked circular list of all forget buffers (superseded
 	 * buffers which we can un-checkpoint once this transaction commits)
 	 * [j_list_lock]
@@ -678,6 +684,9 @@
 	/* Wait queue to wait for updates to complete */
 	wait_queue_head_t	j_wait_updates;
 
+	/* Wait queue to wait for all BIOs to complete */
+	wait_queue_head_t	j_wait_bios;
+
 	/* Semaphore for locking against concurrent checkpoints */
 	struct semaphore 	j_checkpoint_sem;
 
Index: linux-2.6.11/fs/jbd/commit.c
===================================================================
--- linux-2.6.11.orig/fs/jbd/commit.c	2005-03-02 20:49:09.000000000 +0300
+++ linux-2.6.11/fs/jbd/commit.c	2005-03-04 17:53:52.000000000 +0300
@@ -619,6 +620,13 @@
 	if (is_journal_aborted(journal))
 		goto skip_commit;
 
+	/*
+	 * Before writing the commit record, we have to wait for all bios
+	 * that ext3_wb_writepages() issued against newly allocated blocks
+	 */
+	wait_event(journal->j_wait_bios,
+		atomic_read(&commit_transaction->t_bios_in_flight) == 0);
+
 	/* Done it all: now write the commit record.  We should have
 	 * cleaned up our previous buffers by now, so if we are in abort
 	 * mode we can now just skip the rest of the journal write
Index: linux-2.6.11/fs/jbd/transaction.c
===================================================================
--- linux-2.6.11.orig/fs/jbd/transaction.c	2005-03-02 20:49:09.000000000 +0300
+++ linux-2.6.11/fs/jbd/transaction.c	2005-03-04 17:05:28.000000000 +0300
@@ -51,6 +51,7 @@
 	transaction->t_tid = journal->j_transaction_sequence++;
 	transaction->t_expires = jiffies + journal->j_commit_interval;
 	spin_lock_init(&transaction->t_handle_lock);
+	atomic_set(&transaction->t_bios_in_flight, 0);
 
 	/* Set up the commit timer for the new transaction. */
 	journal->j_commit_timer->expires = transaction->t_expires;
Index: linux-2.6.11/fs/jbd/journal.c
===================================================================
--- linux-2.6.11.orig/fs/jbd/journal.c	2005-03-04 17:04:29.000000000 +0300
+++ linux-2.6.11/fs/jbd/journal.c	2005-03-04 17:04:40.000000000 +0300
@@ -671,6 +671,7 @@
 	init_waitqueue_head(&journal->j_wait_checkpoint);
 	init_waitqueue_head(&journal->j_wait_commit);
 	init_waitqueue_head(&journal->j_wait_updates);
+	init_waitqueue_head(&journal->j_wait_bios);
 	init_MUTEX(&journal->j_barrier);
 	init_MUTEX(&journal->j_checkpoint_sem);
 	spin_lock_init(&journal->j_revoke_lock);
Index: linux-2.6.11/fs/ext3/writeback.c
===================================================================
--- linux-2.6.11.orig/fs/ext3/writeback.c	2005-03-04 15:10:01.000000000 +0300
+++ linux-2.6.11/fs/ext3/writeback.c	2005-03-04 17:33:05.000000000 +0300
@@ -145,6 +145,17 @@
 	if (bio->bi_size)
 		return 1;
 
+	if (bio->bi_private) {
+		transaction_t *transaction = bio->bi_private;
+
+		/* 
+		 * journal_commit_transaction() may be awaiting
+		 * the bio to complete.
+		 */
+		if (atomic_dec_and_test(&transaction->t_bios_in_flight))
+			wake_up(&transaction->t_journal->j_wait_bios);
+	}
+
 	do {
 		struct page *page = bvec->bv_page;
 
@@ -162,6 +173,16 @@
 static struct bio *ext3_wb_bio_submit(struct bio *bio, handle_t *handle)
 {
 	bio->bi_end_io = ext3_wb_end_io;
+	if (handle) {
+		/*
+		 * In data=ordered mode we must not commit the
+		 * transaction until all data related to it has
+		 * reached the platter.
+		 */
+		atomic_inc(&handle->h_transaction->t_bios_in_flight);
+		bio->bi_private = handle->h_transaction;
+	} else
+		bio->bi_private = NULL;
 	submit_bio(WRITE, bio);
 	return NULL;
 }
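
For readers who want to try the synchronisation pattern outside the kernel, here is a minimal userspace sketch. It is not the patch's code: it assumes POSIX threads and C11 atomics in place of atomic_t and the j_wait_bios wait queue, and the names struct txn, txn_io_submit(), txn_io_complete(), txn_commit() and run_demo() are hypothetical. It shows the same three steps as the patch: count each write on submission, decrement on completion and wake the committer when the count reaches zero, and have the commit path sleep until then.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_BIOS 64

/* Hypothetical stand-in for transaction_t plus the journal's wait queue. */
struct txn {
	atomic_int	bios_in_flight;	/* plays the role of t_bios_in_flight */
	atomic_int	data_written;	/* completed writes, for checking only */
	pthread_mutex_t	lock;		/* lock + cond stand in for j_wait_bios */
	pthread_cond_t	wait_bios;
};

static void txn_init(struct txn *t)
{
	atomic_init(&t->bios_in_flight, 0);
	atomic_init(&t->data_written, 0);
	pthread_mutex_init(&t->lock, NULL);
	pthread_cond_init(&t->wait_bios, NULL);
}

/* Submission side: account for the write before it is issued. */
static void txn_io_submit(struct txn *t)
{
	atomic_fetch_add(&t->bios_in_flight, 1);
}

/* Completion side: the last write to finish wakes the committer. */
static void txn_io_complete(struct txn *t)
{
	atomic_fetch_add(&t->data_written, 1);
	/* atomic_fetch_sub returns the old value, so 1 means "now zero" */
	if (atomic_fetch_sub(&t->bios_in_flight, 1) == 1) {
		pthread_mutex_lock(&t->lock);
		pthread_cond_broadcast(&t->wait_bios);
		pthread_mutex_unlock(&t->lock);
	}
}

/* Commit side: sleep until no data writes remain, then "commit". */
static int txn_commit(struct txn *t)
{
	pthread_mutex_lock(&t->lock);
	while (atomic_load(&t->bios_in_flight) != 0)
		pthread_cond_wait(&t->wait_bios, &t->lock);
	pthread_mutex_unlock(&t->lock);
	/* Every counted write has completed by the time we get here. */
	return atomic_load(&t->data_written);
}

/* Each thread simulates one asynchronous write completing. */
static void *io_worker(void *arg)
{
	txn_io_complete(arg);
	return NULL;
}

/* Returns the number of writes seen as complete at commit time, or -1. */
static int run_demo(int nbios)
{
	struct txn t;
	pthread_t th[MAX_BIOS];
	int i, rc, written;

	if (nbios < 0 || nbios > MAX_BIOS)
		return -1;
	txn_init(&t);
	for (i = 0; i < nbios; i++) {
		txn_io_submit(&t);
		rc = pthread_create(&th[i], NULL, io_worker, &t);
		assert(rc == 0);
		(void)rc;
	}
	written = txn_commit(&t);
	for (i = 0; i < nbios; i++)
		pthread_join(th[i], NULL);
	pthread_cond_destroy(&t.wait_bios);
	pthread_mutex_destroy(&t.lock);
	return written;
}
```

Calling run_demo(8) from a main() should return 8: the commit step cannot return until all eight simulated writes have finished. The counter is re-checked under the mutex before sleeping, so a completion that lands between the check and the sleep cannot be missed; in the kernel, wait_event() provides the equivalent guarantee.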

