linux-fsdevel.vger.kernel.org archive mirror
From: Chris Mason <chris.mason@oracle.com>
To: linux-fsdevel@vger.kernel.org, akpm@osdl.org, zach.brown@oracle.com
Subject: [PATCH 5 of 8] Make ext3 safe for the new DIO locking rules
Date: Thu, 21 Dec 2006 21:00:57 -0500
Message-ID: <20061222020057.GQ11354@think.oraclecorp.com>
In-Reply-To: <20061222014552.GA26388@think.oraclecorp.com>

This creates a version of ext3_get_block that starts and ends a transaction.

Starting and ending the transaction inside get_block avoids lock
inversion when the DIO code takes page locks inside blockdev_direct_IO
(transaction locks must always be taken after page locks).
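
The ordering rule can be illustrated with a small userspace sketch. This is
not kernel code: the lock ranks, take()/drop(), dio_loop() and the two
map_block_* callbacks are hypothetical stand-ins invented for illustration,
and the model only counts ordering violations in a single thread. It assumes
the DIO loop takes a page lock around each mapping call, which simplifies
what the placeholder code really does.

```c
#include <assert.h>

/* Hypothetical lock ranks: a lower rank must always be taken first. */
enum lock_rank { RANK_PAGE = 1, RANK_TXN = 2 };

static unsigned held;   /* bitmask of ranks currently held */
static int violations;  /* number of out-of-order acquisitions seen */

static void take(enum lock_rank r)
{
	/* Taking a lock while a higher-ranked one is held is an inversion. */
	if (held >> (r + 1))
		violations++;
	held |= 1u << r;
}

static void drop(enum lock_rank r)
{
	assert(held & (1u << r));
	held &= ~(1u << r);
}

/* New style: the callback opens and closes its own transaction. */
static int map_block_own_txn(long iblock, long *phys)
{
	take(RANK_TXN);
	*phys = iblock + 1000;	/* placeholder mapping, not a real lookup */
	drop(RANK_TXN);
	return 0;
}

/* Old style: the callback relies on a transaction the caller holds. */
static int map_block_caller_txn(long iblock, long *phys)
{
	*phys = iblock + 1000;
	return 0;
}

/* Stand-in for the DIO loop: a page lock around each mapping call. */
static int dio_loop(long first, long nr, int (*get_block)(long, long *))
{
	long phys;

	for (long i = 0; i < nr; i++) {
		take(RANK_PAGE);
		int ret = get_block(first + i, &phys);
		drop(RANK_PAGE);
		if (ret)
			return ret;
	}
	return 0;
}

/* Transaction held across the whole loop: page locks nest inside it. */
static int run_old_pattern(void)
{
	held = 0;
	violations = 0;
	take(RANK_TXN);
	dio_loop(0, 4, map_block_caller_txn);
	drop(RANK_TXN);
	return violations;
}

/* Transaction started and stopped per call: ordering is preserved. */
static int run_new_pattern(void)
{
	held = 0;
	violations = 0;
	dio_loop(0, 4, map_block_own_txn);
	return violations;
}
```

In this model run_old_pattern() records one violation for each page lock
taken under the open transaction, while run_new_pattern() records none,
because every transaction begins only after the page lock is already held.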

Signed-off-by: Chris Mason <chris.mason@oracle.com>

diff -r 385bc75d9266 -r bebaf8972a31 fs/ext3/inode.c
--- a/fs/ext3/inode.c	Thu Dec 21 15:31:30 2006 -0500
+++ b/fs/ext3/inode.c	Thu Dec 21 15:31:30 2006 -0500
@@ -1673,6 +1673,30 @@ static int ext3_releasepage(struct page 
 	return journal_try_to_free_buffers(journal, page, wait);
 }
 
+static int ext3_get_block_direct_IO(struct inode *inode, sector_t iblock,
+			struct buffer_head *bh_result, int create)
+{
+	int ret = 0;
+	handle_t *handle = ext3_journal_start(inode, DIO_CREDITS);
+	if (IS_ERR(handle)) {
+		ret = PTR_ERR(handle);
+		goto out;
+	}
+	ret = ext3_get_block(inode, iblock, bh_result, create);
+	/*
+	 * Reacquire the handle: ext3_get_block() can restart the transaction
+	 */
+	handle = journal_current_handle();
+	if (handle) {
+		int err;
+		err = ext3_journal_stop(handle);
+		if (!ret)
+			ret = err;
+	}
+out:
+	return ret;
+}
+
 /*
  * If the O_DIRECT write will extend the file then add this inode to the
  * orphan list.  So recovery will truncate it back to the original size
@@ -1693,39 +1717,58 @@ static ssize_t ext3_direct_IO(int rw, st
 	int orphan = 0;
 	size_t count = iov_length(iov, nr_segs);
 
-	if (rw == WRITE) {
-		loff_t final_size = offset + count;
-
+	if (rw == WRITE && (offset + count > inode->i_size)) { 
 		handle = ext3_journal_start(inode, DIO_CREDITS);
 		if (IS_ERR(handle)) {
 			ret = PTR_ERR(handle);
 			goto out;
 		}
-		if (final_size > inode->i_size) {
-			ret = ext3_orphan_add(handle, inode);
-			if (ret)
-				goto out_stop;
-			orphan = 1;
-			ei->i_disksize = inode->i_size;
-		}
-	}
-
+		ret = ext3_orphan_add(handle, inode);
+		if (ret) {
+			ext3_journal_stop(handle);
+			goto out;
+		}
+		ei->i_disksize = inode->i_size;
+		ret = ext3_journal_stop(handle);
+		if (ret) {
+			/* something has gone horribly wrong, cleanup
+			 * the orphan list in ram
+			 */
+			if (inode->i_nlink)
+				ext3_orphan_del(NULL, inode);
+			goto out;
+		}
+		orphan = 1;
+	}
+
+	/*
+	 * the placeholder page code may take a page lock, so we have
+	 * to stop any running transactions before calling
+	 * blockdev_direct_IO.  Use ext3_get_block_direct_IO to start
+	 * and stop a transaction on each get_block call.
+	 */
 	ret = blockdev_direct_IO(rw, iocb, inode, inode->i_sb->s_bdev, iov,
 				 offset, nr_segs,
-				 ext3_get_block, NULL);
+				 ext3_get_block_direct_IO, NULL);
 
 	/*
 	 * Reacquire the handle: ext3_get_block() can restart the transaction
 	 */
 	handle = journal_current_handle();
 
-out_stop:
-	if (handle) {
+	if (orphan) {
 		int err;
-
-		if (orphan && inode->i_nlink)
+		handle = ext3_journal_start(inode, DIO_CREDITS);
+		if (IS_ERR(handle)) {
+			ret = PTR_ERR(handle);
+			if (inode->i_nlink)
+				ext3_orphan_del(NULL, inode);
+			goto out;
+		}
+
+		if (inode->i_nlink)
 			ext3_orphan_del(handle, inode);
-		if (orphan && ret > 0) {
+		if (ret > 0) {
 			loff_t end = offset + ret;
 			if (end > inode->i_size) {
 				ei->i_disksize = end;