From: Chris Mason <mason@suse.com>
To: akpm@osdl.org, linux-kernel@vger.kernel.org
Subject: [PATCH RFC] __bd_forget should wait for inodes using the mapping
Date: Thu, 17 Jun 2004 21:54:28 -0400
Message-ID: <1087523668.8002.103.camel@watt.suse.com>
__bd_forget will change the mapping for filesystem inodes without
waiting to make sure no users of the block device address space are
using that mapping.

In the case of background writeout, it is possible for __bd_forget
to free the block device inode while mpage_writepages is still
looking through the mapping for dirty pages. This is because
each device node in the filesystem has a pointer to the block
device address space, and __bd_forget is used to reset those pointers
before the block device inode is freed.

There is no locking to make sure __bd_forget isn't running
at the same time as __writeback_single_inode is run on the
filesystem device node.
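
To make the window concrete, here is a rough userspace model of the
interleaving, not kernel code. Every name in it (struct mapping,
struct node_inode, writeback_begin, forget_checked, scan_is_safe) is a
hypothetical stand-in, the "freed" flag only simulates freeing the
block device inode, and the "locked" flag loosely plays the role of
I_LOCK. It is a single-threaded sketch that replays one ordering of
events, assuming writeback samples i_mapping before the pointer is reset.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects; not the real structures. */
struct mapping {
	bool freed;			/* set once the owning bdev inode is gone */
};

struct node_inode {
	bool locked;			/* models I_LOCK in i_state */
	struct mapping own;		/* models inode->i_data */
	struct mapping *mapping;	/* models inode->i_mapping */
};

static void node_init(struct node_inode *n, struct mapping *bdev_map)
{
	n->locked = false;
	n->own.freed = false;
	n->mapping = bdev_map;
}

/* Writeback, first half: lock the inode and sample its mapping pointer. */
static struct mapping *writeback_begin(struct node_inode *n)
{
	n->locked = true;
	return n->mapping;
}

/* Writeback, second half: scan the mapping; false means use-after-free. */
static bool writeback_finish(struct node_inode *n, struct mapping *m)
{
	bool ok = !m->freed;

	n->locked = false;
	return ok;
}

/* Reset the pointer with no check, then "free" the bdev mapping. */
static void forget_unchecked(struct node_inode *n, struct mapping *bdev_map)
{
	n->mapping = &n->own;
	bdev_map->freed = true;
}

/* Refuse to reset while the inode is locked; the caller must retry. */
static bool forget_checked(struct node_inode *n, struct mapping *bdev_map)
{
	if (n->locked)
		return false;
	forget_unchecked(n, bdev_map);
	return true;
}

/* Replay: writeback begins, forget runs, writeback finishes. */
static bool scan_is_safe(bool wait_for_lock)
{
	struct mapping bdev_map = { .freed = false };
	struct node_inode node;
	struct mapping *m;
	bool ok;

	node_init(&node, &bdev_map);
	m = writeback_begin(&node);

	if (!wait_for_lock) {
		forget_unchecked(&node, &bdev_map);
		return writeback_finish(&node, m);
	}

	/* The first attempt must back off: writeback holds the lock. */
	if (forget_checked(&node, &bdev_map))
		return false;
	ok = writeback_finish(&node, m);
	/* Retry once writeback is done; now the reset is allowed. */
	return ok && forget_checked(&node, &bdev_map);
}
```

In this model scan_is_safe(false) returns false, because the scan touches
a mapping that has already been marked freed, while scan_is_safe(true)
returns true, because the reset is held off until the lock flag clears.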

Here's an example patch that should fix things. Andi just found
a race where I wasn't holding onto the filesystem inode correctly,
so this rev got a last-minute fix before I wander off for the night.

It's quite ugly; I'm hoping we can hash out something better.

Index: linux.t/fs/block_dev.c
===================================================================
--- linux.t.orig/fs/block_dev.c	2004-06-17 21:14:08.000000000 -0400
+++ linux.t/fs/block_dev.c	2004-06-17 21:46:46.203782616 -0400
@@ -24,6 +24,7 @@
 #include <linux/uio.h>
 #include <linux/namei.h>
 #include <asm/uaccess.h>
+#include <linux/writeback.h>
 
 struct bdev_inode {
 	struct block_device bdev;
@@ -258,11 +259,31 @@ static void init_once(void * foo, kmem_c
 	}
 }
 
+/*
+ * we have to make sure that we don't free the block
+ * device inode and mapping while one of the inodes using
+ * it is in background writeback.
+ *
+ * The lock ordering required elsewhere is bdev_lock->inode_lock.
+ */
 static inline void __bd_forget(struct inode *inode)
 {
+	spin_lock(&inode_lock);
+	__iget(inode);
+	while (inode->i_state & I_LOCK) {
+		spin_unlock(&bdev_lock);
+		spin_unlock(&inode_lock);
+		__wait_on_inode(inode);
+		spin_lock(&bdev_lock);
+		spin_lock(&inode_lock);
+	}
 	list_del_init(&inode->i_devices);
 	inode->i_bdev = NULL;
 	inode->i_mapping = &inode->i_data;
+	spin_unlock(&inode_lock);
+	spin_unlock(&bdev_lock);
+	iput(inode);
+	spin_lock(&bdev_lock);
 }
 
 static void bdev_clear_inode(struct inode *inode)