linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] BDI handling fixes
@ 2010-09-16 22:44 Jan Kara
  2010-09-16 22:44 ` [PATCH 1/3] bdi: Initialize noop_backing_dev_info properly Jan Kara
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Jan Kara @ 2010-09-16 22:44 UTC (permalink / raw)
  To: Jens, "Axboe <axboe"; +Cc: Christoph Hellwig, LKML, linux-fsdevel


  Hi,

  series of these three patches fixes the warning in __mark_inode_dirty()
which happens when I do e.g. touch /dev/zero. The first two patches should
be obvious enough and probably worth merging independently of the third
patch. The third patch is upto a discussion whether we want to solve the
problem that way or differently. Christoph, I know we spoke at LSF that
inode_to_bdi() could be a per-sb method but the current version of the
patch seems clean enough to me that we could maybe go even without the
special callback?

								Honza

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] bdi: Initialize noop_backing_dev_info properly
  2010-09-16 22:44 [PATCH 0/3] BDI handling fixes Jan Kara
@ 2010-09-16 22:44 ` Jan Kara
  2010-09-16 22:44 ` [PATCH 2/3] char: Mark /dev/zero and /dev/kmem as not capable of writeback Jan Kara
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Jan Kara @ 2010-09-16 22:44 UTC (permalink / raw)
  To: Jens, "Axboe <axboe"
  Cc: Christoph Hellwig, LKML, linux-fsdevel, Jan Kara

Properly initialize this backing dev info so that writeback code does not
barf when getting to it e.g. via sb->s_bdi.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/backing-dev.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index eaa4a5b..69ae8ca 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -30,6 +30,7 @@ EXPORT_SYMBOL_GPL(default_backing_dev_info);
 
 struct backing_dev_info noop_backing_dev_info = {
 	.name		= "noop",
+	.capabilities	= BDI_CAP_NO_ACCT_AND_WRITEBACK,
 };
 EXPORT_SYMBOL_GPL(noop_backing_dev_info);
 
@@ -243,6 +244,7 @@ static int __init default_bdi_init(void)
 	err = bdi_init(&default_backing_dev_info);
 	if (!err)
 		bdi_register(&default_backing_dev_info, NULL, "default");
+	err = bdi_init(&noop_backing_dev_info);
 
 	return err;
 }
-- 
1.6.4.2

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] char: Mark /dev/zero and /dev/kmem as not capable of writeback
  2010-09-16 22:44 [PATCH 0/3] BDI handling fixes Jan Kara
  2010-09-16 22:44 ` [PATCH 1/3] bdi: Initialize noop_backing_dev_info properly Jan Kara
@ 2010-09-16 22:44 ` Jan Kara
  2010-09-16 22:44 ` [PATCH 3/3] bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends Jan Kara
  2010-09-16 23:47 ` [PATCH 0/3] BDI handling fixes Christoph Hellwig
  3 siblings, 0 replies; 7+ messages in thread
From: Jan Kara @ 2010-09-16 22:44 UTC (permalink / raw)
  To: Jens, "Axboe <axboe"
  Cc: Christoph Hellwig, LKML, linux-fsdevel, Jan Kara

These devices don't do any writeback but their device inodes still can get
dirty so mark bdi appropriately so that bdi code does the right thing and files
inodes to lists of bdi carrying the device inodes.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 drivers/char/mem.c |    3 ++-
 fs/char_dev.c      |    4 +++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index a398ecd..1f528fa 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -788,10 +788,11 @@ static const struct file_operations zero_fops = {
 /*
  * capabilities for /dev/zero
  * - permits private mappings, "copies" are taken of the source of zeros
+ * - no writeback happens
  */
 static struct backing_dev_info zero_bdi = {
 	.name		= "char/mem",
-	.capabilities	= BDI_CAP_MAP_COPY,
+	.capabilities	= BDI_CAP_MAP_COPY | BDI_CAP_NO_ACCT_AND_WRITEBACK,
 };
 
 static const struct file_operations full_fops = {
diff --git a/fs/char_dev.c b/fs/char_dev.c
index f80a4f2..143d393 100644
--- a/fs/char_dev.c
+++ b/fs/char_dev.c
@@ -40,7 +40,9 @@ struct backing_dev_info directly_mappable_cdev_bdi = {
 #endif
 		/* permit direct mmap, for read, write or exec */
 		BDI_CAP_MAP_DIRECT |
-		BDI_CAP_READ_MAP | BDI_CAP_WRITE_MAP | BDI_CAP_EXEC_MAP),
+		BDI_CAP_READ_MAP | BDI_CAP_WRITE_MAP | BDI_CAP_EXEC_MAP |
+		/* no writeback happens */
+		BDI_CAP_NO_ACCT_AND_WRITEBACK),
 };
 
 static struct kobj_map *cdev_map;
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends
  2010-09-16 22:44 [PATCH 0/3] BDI handling fixes Jan Kara
  2010-09-16 22:44 ` [PATCH 1/3] bdi: Initialize noop_backing_dev_info properly Jan Kara
  2010-09-16 22:44 ` [PATCH 2/3] char: Mark /dev/zero and /dev/kmem as not capable of writeback Jan Kara
@ 2010-09-16 22:44 ` Jan Kara
  2010-09-16 23:47 ` [PATCH 0/3] BDI handling fixes Christoph Hellwig
  3 siblings, 0 replies; 7+ messages in thread
From: Jan Kara @ 2010-09-16 22:44 UTC (permalink / raw)
  To: Jens, "Axboe <axboe"
  Cc: Christoph Hellwig, LKML, linux-fsdevel, Jan Kara

Inodes of devices such as /dev/zero can get dirty for example via utime(2)
syscall or due to atime update. Backing device of such inodes (zero_bdi, etc.)
is however unable to handle dirty inodes and thus __mark_inode_dirty complains.
In fact, inode should be rather dirtied against backing device of the
filesystem holding it. This is generally a good rule except for filesystems
such as 'bdev' or 'mtd_inodefs'. Inodes in these pseudofilesystems are
referenced from ordinary filesystem inodes and carry mapping with real data of
the device. Thus for these inodes we have to use
inode->i_mapping->backing_dev_info as we did so far. We distinguish these
filesystems by checking whether sb->s_bdi points to a non-trivial backing
device or not.

Example: Assume we have an ext3 filesystem on /dev/sda1 mounted on /. There's a
device inode A described by a path "/dev/sdb" on this filesystem. This inode
will be dirtied against backing device "8:0" after this patch. bdev filesystem
contains block device inode B coupled with our inode A. When someone modifies a
page of /dev/sdb, it's B that gets dirtied and the dirtying happens against the
backing device "8:16". Thus both inodes get filed to a correct bdi list.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/fs-writeback.c |   23 +++++++++++++++++++++--
 1 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 49fd41a..7dad208 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -52,8 +52,6 @@ struct wb_writeback_work {
 #define CREATE_TRACE_POINTS
 #include <trace/events/writeback.h>
 
-#define inode_to_bdi(inode)	((inode)->i_mapping->backing_dev_info)
-
 /*
  * We don't actually have pdflush, but this one is exported though /proc...
  */
@@ -71,6 +69,27 @@ int writeback_in_progress(struct backing_dev_info *bdi)
 	return test_bit(BDI_writeback_running, &bdi->state);
 }
 
+static inline struct backing_dev_info *inode_to_bdi(struct inode *inode)
+{
+	struct super_block *sb = inode->i_sb;
+	struct backing_dev_info *bdi = inode->i_mapping->backing_dev_info;
+
+	/*
+	 * For inodes on standard filesystems, we use superblock's bdi. For
+	 * inodes on virtual filesystems, we want to use inode mapping's bdi
+	 * because they can possibly point to something useful (think about
+	 * block_dev filesystem).
+	 */
+	if (sb->s_bdi && sb->s_bdi != &noop_backing_dev_info) {
+		/* Some device inodes could play dirty tricks. Catch them... */
+		WARN(bdi != sb->s_bdi && bdi_cap_writeback_dirty(bdi),
+			"Dirtiable inode bdi %s != sb bdi %s\n",
+			bdi->name, sb->s_bdi->name);
+		return sb->s_bdi;
+	}
+	return bdi;
+}
+
 /* Wakeup flusher thread or forker thread to fork it. Requires bdi->wb_lock. */
 static void _bdi_wakeup_flusher(struct backing_dev_info *bdi)
 {
-- 
1.6.4.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] BDI handling fixes
  2010-09-16 22:44 [PATCH 0/3] BDI handling fixes Jan Kara
                   ` (2 preceding siblings ...)
  2010-09-16 22:44 ` [PATCH 3/3] bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends Jan Kara
@ 2010-09-16 23:47 ` Christoph Hellwig
  2010-09-17 12:52   ` Jan Kara
  3 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2010-09-16 23:47 UTC (permalink / raw)
  To: Jan Kara
  Cc: Jens, "Axboe <axboe", Christoph Hellwig, LKML,
	linux-fsdevel

On Fri, Sep 17, 2010 at 12:44:08AM +0200, Jan Kara wrote:
> 
>   Hi,
> 
>   series of these three patches fixes the warning in __mark_inode_dirty()
> which happens when I do e.g. touch /dev/zero. The first two patches should
> be obvious enough and probably worth merging independently of the third
> patch. The third patch is upto a discussion whether we want to solve the
> problem that way or differently. Christoph, I know we spoke at LSF that
> inode_to_bdi() could be a per-sb method but the current version of the
> patch seems clean enough to me that we could maybe go even without the
> special callback?

Feel free to go with the simpler one.  But what I think really needs to
be changes is the no writeback flag - it's exactly the wrong way around.

Instead just add a flag to allow writeback for the block device and
fs-specific bdi structures.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] BDI handling fixes
  2010-09-16 23:47 ` [PATCH 0/3] BDI handling fixes Christoph Hellwig
@ 2010-09-17 12:52   ` Jan Kara
  2010-09-17 13:44     ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kara @ 2010-09-17 12:52 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Jens, "Axboe <axboe", LKML, linux-fsdevel

On Thu 16-09-10 19:47:42, Christoph Hellwig wrote:
> On Fri, Sep 17, 2010 at 12:44:08AM +0200, Jan Kara wrote:
> > 
> >   Hi,
> > 
> >   series of these three patches fixes the warning in __mark_inode_dirty()
> > which happens when I do e.g. touch /dev/zero. The first two patches should
> > be obvious enough and probably worth merging independently of the third
> > patch. The third patch is upto a discussion whether we want to solve the
> > problem that way or differently. Christoph, I know we spoke at LSF that
> > inode_to_bdi() could be a per-sb method but the current version of the
> > patch seems clean enough to me that we could maybe go even without the
> > special callback?
> 
> Feel free to go with the simpler one.  But what I think really needs to
> be changes is the no writeback flag - it's exactly the wrong way around.
> 
> Instead just add a flag to allow writeback for the block device and
> fs-specific bdi structures.
  Agreed. I just think that I'll first make the flags right and then just
mechanically flip NO_WRITEBACK and WRITEBACK...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] BDI handling fixes
  2010-09-17 12:52   ` Jan Kara
@ 2010-09-17 13:44     ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2010-09-17 13:44 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Jens, "Axboe <axboe", LKML,
	linux-fsdevel

On Fri, Sep 17, 2010 at 02:52:45PM +0200, Jan Kara wrote:
>   Agreed. I just think that I'll first make the flags right and then just
> mechanically flip NO_WRITEBACK and WRITEBACK...

Okay, go with the minimal version for now and I'll sort it out later.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-09-17 13:44 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-16 22:44 [PATCH 0/3] BDI handling fixes Jan Kara
2010-09-16 22:44 ` [PATCH 1/3] bdi: Initialize noop_backing_dev_info properly Jan Kara
2010-09-16 22:44 ` [PATCH 2/3] char: Mark /dev/zero and /dev/kmem as not capable of writeback Jan Kara
2010-09-16 22:44 ` [PATCH 3/3] bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends Jan Kara
2010-09-16 23:47 ` [PATCH 0/3] BDI handling fixes Christoph Hellwig
2010-09-17 12:52   ` Jan Kara
2010-09-17 13:44     ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).