linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2][RFC] Provide accounting for dirty metadata
@ 2016-08-09 19:08 Josef Bacik
  2016-08-09 19:08 ` [PATCH 1/2] remove mapping from balance_dirty_pages*() Josef Bacik
  2016-08-09 19:08 ` [PATCH 2/2] writeback: allow for dirty metadata accounting Josef Bacik
  0 siblings, 2 replies; 15+ messages in thread
From: Josef Bacik @ 2016-08-09 19:08 UTC (permalink / raw)
  To: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

Btrfs has always had a dummy inode that we used to allocate pages for our
metadata.  This has allowed us to take advantage of balance_dirty_pages() since
our dirty metadata is unbounded otherwise.  This has worked fine for years, but
now we want to add sub pagesize blocksize support.  The easiest way to do this
would be to just kmalloc() our pages, since we already do all of our buffer
management ourselves anyway.  But in order to switch over to that we need to
kill the metadata inode and allow us to make our own allocations for metadata.

Enter these two patches.  The first one is fairly straightforward.
balance_dirty_pages() works on a per-bdi case, the only reason we pass mapping
around is so we can check and see if the fs we are working against has writeback
cgroups enabled.  So this change is just changing function arguments and has no
behavior change at all.

The second patch is where I'd like a little more attention.  I've added some
helpers to deal with the dirty metadata page accounting.  Basically this just
adds some page stats that balance_dirty_pages() can take into account when
deciding whether to kick the background writeback.  Then I've added a sb
callback to handle writing back dirty metadata, and I've added a list to the bdi
for the sb's attached to that bdi.  My plan is to just add/remove the btrfs sb
to it when we mount/umount.

Any suggestions are welcome, this is relatively self contained and doesn't
complicate things too much so I think works well, but I may not be thinking of
other cases.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] remove mapping from balance_dirty_pages*()
  2016-08-09 19:08 [PATCH 0/2][RFC] Provide accounting for dirty metadata Josef Bacik
@ 2016-08-09 19:08 ` Josef Bacik
  2016-08-09 19:30   ` kbuild test robot
                     ` (5 more replies)
  2016-08-09 19:08 ` [PATCH 2/2] writeback: allow for dirty metadata accounting Josef Bacik
  1 sibling, 6 replies; 15+ messages in thread
From: Josef Bacik @ 2016-08-09 19:08 UTC (permalink / raw)
  To: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

The only reason we pass in the mapping is to get the inode in order to see if
writeback cgroups is enabled, and even then it only checks the bdi and a super
block flag.  balance_dirty_pages() doesn't even use the mapping.  Since
balance_dirty_pages*() works on a bdi level, just pass in the bdi and super
block directly so we can avoid using mapping.  This will allow us to still use
balance_dirty_pages for dirty metadata pages that are not backed by an
address_mapping.

Signed-off-by: Josef Bacik <jbacik@fb.com>
---
 fs/btrfs/disk-io.c          |  4 ++--
 fs/btrfs/file.c             |  3 ++-
 fs/btrfs/ioctl.c            |  3 ++-
 fs/btrfs/relocation.c       |  3 ++-
 fs/buffer.c                 |  3 ++-
 fs/iomap.c                  |  3 ++-
 include/linux/backing-dev.h | 23 +++++++++++++++++------
 include/linux/writeback.h   |  3 ++-
 mm/filemap.c                |  4 +++-
 mm/memory.c                 |  9 +++++++--
 mm/page-writeback.c         | 15 +++++++--------
 11 files changed, 48 insertions(+), 25 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 17346f7..c1d951a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4024,8 +4024,8 @@ static void __btrfs_btree_balance_dirty(struct btrfs_root *root,
 	ret = percpu_counter_compare(&root->fs_info->dirty_metadata_bytes,
 				     BTRFS_DIRTY_METADATA_THRESH);
 	if (ret > 0) {
-		balance_dirty_pages_ratelimited(
-				   root->fs_info->btree_inode->i_mapping);
+		balance_dirty_pages_ratelimited(&root->fs_info->bdi,
+						root->fs_info->sb);
 	}
 }
 
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 9404121..f060b08 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1686,7 +1686,8 @@ again:
 
 		cond_resched();
 
-		balance_dirty_pages_ratelimited(inode->i_mapping);
+		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
+						inode->i_sb);
 		if (dirty_pages < (root->nodesize >> PAGE_SHIFT) + 1)
 			btrfs_btree_balance_dirty(root);
 
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 14ed1e9..a222bad 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1410,7 +1410,8 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
 		}
 
 		defrag_count += ret;
-		balance_dirty_pages_ratelimited(inode->i_mapping);
+		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
+						inode->i_sb);
 		inode_unlock(inode);
 
 		if (newer_than) {
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 4598e29..7fc6ea7 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -3202,7 +3202,8 @@ static int relocate_file_extent_cluster(struct inode *inode,
 		put_page(page);
 
 		index++;
-		balance_dirty_pages_ratelimited(inode->i_mapping);
+		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
+						inode->i_sb);
 		btrfs_throttle(BTRFS_I(inode)->root);
 	}
 	WARN_ON(nr != cluster->nr);
diff --git a/fs/buffer.c b/fs/buffer.c
index 9c8eb9b..9bbe30d 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2386,7 +2386,8 @@ static int cont_expand_zero(struct file *file, struct address_space *mapping,
 		BUG_ON(err != len);
 		err = 0;
 
-		balance_dirty_pages_ratelimited(mapping);
+		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
+						inode->i_sb);
 
 		if (unlikely(fatal_signal_pending(current))) {
 			err = -EINTR;
diff --git a/fs/iomap.c b/fs/iomap.c
index 48141b8..937e266 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -226,7 +226,8 @@ again:
 		written += copied;
 		length -= copied;
 
-		balance_dirty_pages_ratelimited(inode->i_mapping);
+		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
+						inode->i_sb);
 	} while (iov_iter_count(i) && length);
 
 	return written ? written : status;
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 491a917..3b76eeb 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -252,8 +252,9 @@ void wb_blkcg_offline(struct blkcg *blkcg);
 int inode_congested(struct inode *inode, int cong_bits);
 
 /**
- * inode_cgwb_enabled - test whether cgroup writeback is enabled on an inode
- * @inode: inode of interest
+ * bdi_cgwb_enabled - test wether cgroup writeback is enabled on a filesystem
+ * @bdi: the bdi we care about
+ * @sb: the super for the bdi
  *
  * cgroup writeback requires support from both the bdi and filesystem.
  * Also, both memcg and iocg have to be on the default hierarchy.  Test
@@ -262,15 +263,25 @@ int inode_congested(struct inode *inode, int cong_bits);
  * Note that the test result may change dynamically on the same inode
  * depending on how memcg and iocg are configured.
  */
-static inline bool inode_cgwb_enabled(struct inode *inode)
+static inline bool bdi_cgwb_enabled(struct backing_dev_info *bdi,
+				    struct super_block *sb)
 {
-	struct backing_dev_info *bdi = inode_to_bdi(inode);
-
 	return cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
 		cgroup_subsys_on_dfl(io_cgrp_subsys) &&
 		bdi_cap_account_dirty(bdi) &&
 		(bdi->capabilities & BDI_CAP_CGROUP_WRITEBACK) &&
-		(inode->i_sb->s_iflags & SB_I_CGROUPWB);
+		(sb->s_iflags & SB_I_CGROUPWB);
+}
+
+/**
+ * inode_cgwb_enabled - test whether cgroup writeback is enabled on an inode
+ * @inode: inode of interest
+ *
+ * Does the inode have cgroup writeback support.
+ */
+static inline bool inode_cgwb_enabled(struct inode *inode)
+{
+	return bdi_cgwb_enabled(inode_to_bdi(inode), inode->i_sb);
 }
 
 /**
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index fc1e16c..256ffc3 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -364,7 +364,8 @@ unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
 
 void wb_update_bandwidth(struct bdi_writeback *wb, unsigned long start_time);
 void page_writeback_init(void);
-void balance_dirty_pages_ratelimited(struct address_space *mapping);
+void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
+				     struct super_block *sb);
 bool wb_over_bg_thresh(struct bdi_writeback *wb);
 
 typedef int (*writepage_t)(struct page *page, struct writeback_control *wbc,
diff --git a/mm/filemap.c b/mm/filemap.c
index 3083ded..abb0e98 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2667,6 +2667,7 @@ ssize_t generic_perform_write(struct file *file,
 				struct iov_iter *i, loff_t pos)
 {
 	struct address_space *mapping = file->f_mapping;
+	struct inode *inode = mapping->host;
 	const struct address_space_operations *a_ops = mapping->a_ops;
 	long status = 0;
 	ssize_t written = 0;
@@ -2746,7 +2747,8 @@ again:
 		pos += copied;
 		written += copied;
 
-		balance_dirty_pages_ratelimited(mapping);
+		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
+						inode->i_sb);
 	} while (iov_iter_count(i));
 
 	return written ? written : status;
diff --git a/mm/memory.c b/mm/memory.c
index 83be99d..d43e73b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -64,6 +64,7 @@
 #include <linux/debugfs.h>
 #include <linux/userfaultfd_k.h>
 #include <linux/dax.h>
+#include <linux/backing-dev.h>
 
 #include <asm/io.h>
 #include <asm/mmu_context.h>
@@ -2105,11 +2106,13 @@ static inline int wp_page_reuse(struct fault_env *fe, pte_t orig_pte,
 		put_page(page);
 
 		if ((dirtied || page_mkwrite) && mapping) {
+			struct inode *inode = mapping->host;
 			/*
 			 * Some device drivers do not set page.mapping
 			 * but still dirty their pages
 			 */
-			balance_dirty_pages_ratelimited(mapping);
+			balance_dirty_pages_ratelimited(inode_to_bdi(inode),
+							inode->i_sb);
 		}
 
 		if (!page_mkwrite)
@@ -3291,11 +3294,13 @@ static int do_shared_fault(struct fault_env *fe, pgoff_t pgoff)
 	mapping = page_rmapping(fault_page);
 	unlock_page(fault_page);
 	if ((dirtied || vma->vm_ops->page_mkwrite) && mapping) {
+		struct inode *inode = mapping->host;
 		/*
 		 * Some device drivers do not set page.mapping but still
 		 * dirty their pages
 		 */
-		balance_dirty_pages_ratelimited(mapping);
+		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
+						inode->i_sb);
 	}
 
 	if (!vma->vm_ops->page_mkwrite)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index f4cd7d8..121a6e3 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1559,8 +1559,7 @@ static inline void wb_dirty_limits(struct dirty_throttle_control *dtc)
  * If we're over `background_thresh' then the writeback threads are woken to
  * perform some writeout.
  */
-static void balance_dirty_pages(struct address_space *mapping,
-				struct bdi_writeback *wb,
+static void balance_dirty_pages(struct bdi_writeback *wb,
 				unsigned long pages_dirtied)
 {
 	struct dirty_throttle_control gdtc_stor = { GDTC_INIT(wb) };
@@ -1849,7 +1848,8 @@ DEFINE_PER_CPU(int, dirty_throttle_leaks) = 0;
 
 /**
  * balance_dirty_pages_ratelimited - balance dirty memory state
- * @mapping: address_space which was dirtied
+ * @bdi: the bdi that was dirtied
+ * @sb: the super block that was dirtied
  *
  * Processes which are dirtying memory should call in here once for each page
  * which was newly dirtied.  The function will periodically check the system's
@@ -1860,10 +1860,9 @@ DEFINE_PER_CPU(int, dirty_throttle_leaks) = 0;
  * limit we decrease the ratelimiting by a lot, to prevent individual processes
  * from overshooting the limit by (ratelimit_pages) each.
  */
-void balance_dirty_pages_ratelimited(struct address_space *mapping)
+void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
+				     struct super_block *sb)
 {
-	struct inode *inode = mapping->host;
-	struct backing_dev_info *bdi = inode_to_bdi(inode);
 	struct bdi_writeback *wb = NULL;
 	int ratelimit;
 	int *p;
@@ -1871,7 +1870,7 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping)
 	if (!bdi_cap_account_dirty(bdi))
 		return;
 
-	if (inode_cgwb_enabled(inode))
+	if (bdi_cgwb_enabled(bdi, sb))
 		wb = wb_get_create_current(bdi, GFP_KERNEL);
 	if (!wb)
 		wb = &bdi->wb;
@@ -1909,7 +1908,7 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping)
 	preempt_enable();
 
 	if (unlikely(current->nr_dirtied >= ratelimit))
-		balance_dirty_pages(mapping, wb, current->nr_dirtied);
+		balance_dirty_pages(wb, current->nr_dirtied);
 
 	wb_put(wb);
 }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] writeback: allow for dirty metadata accounting
  2016-08-09 19:08 [PATCH 0/2][RFC] Provide accounting for dirty metadata Josef Bacik
  2016-08-09 19:08 ` [PATCH 1/2] remove mapping from balance_dirty_pages*() Josef Bacik
@ 2016-08-09 19:08 ` Josef Bacik
  2016-08-10 10:09   ` Jan Kara
  2016-08-10 20:12   ` Tejun Heo
  1 sibling, 2 replies; 15+ messages in thread
From: Josef Bacik @ 2016-08-09 19:08 UTC (permalink / raw)
  To: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

Provide a mechanism for file systems to indicate how much dirty metadata they
are holding.  This introduces a few things

1) Zone stats for dirty metadata, which is the same as the NR_FILE_DIRTY.
2) WB stat for dirty metadata.  This way we know if we need to try and call into
the file system to write out metadata.  This could potentially be used in the
future to make balancing of dirty pages smarter.
3) A super callback to handle writing back dirty metadata.

A future patch will take advantage of this work in btrfs.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
---
 fs/fs-writeback.c                | 57 +++++++++++++++++++++++++++--
 fs/super.c                       |  1 +
 include/linux/backing-dev-defs.h |  3 ++
 include/linux/fs.h               |  4 +++
 include/linux/mm.h               |  4 +++
 include/linux/mmzone.h           |  1 +
 mm/backing-dev.c                 |  3 ++
 mm/page-writeback.c              | 78 ++++++++++++++++++++++++++++++++++++++--
 8 files changed, 145 insertions(+), 6 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 56c8fda..c670d45 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1615,11 +1615,36 @@ static long writeback_sb_inodes(struct super_block *sb,
 	return wrote;
 }
 
+static long writeback_sb_metadata(struct super_block *sb,
+				  struct bdi_writeback *wb,
+				  struct wb_writeback_work *work)
+{
+	struct writeback_control wbc = {
+		.sync_mode		= work->sync_mode,
+		.tagged_writepages	= work->tagged_writepages,
+		.for_kupdate		= work->for_kupdate,
+		.for_background		= work->for_background,
+		.for_sync		= work->for_sync,
+		.range_cyclic		= work->range_cyclic,
+		.range_start		= 0,
+		.range_end		= LLONG_MAX,
+	};
+	long write_chunk;
+
+	write_chunk = writeback_chunk_size(wb, work);
+	wbc.nr_to_write = write_chunk;
+	sb->s_op->write_metadata(sb, &wbc);
+	work->nr_pages -= write_chunk - wbc.nr_to_write;
+
+	return write_chunk - wbc.nr_to_write;
+}
+
 static long __writeback_inodes_wb(struct bdi_writeback *wb,
 				  struct wb_writeback_work *work)
 {
 	unsigned long start_time = jiffies;
 	long wrote = 0;
+	bool done = false;
 
 	while (!list_empty(&wb->b_io)) {
 		struct inode *inode = wb_inode(wb->b_io.prev);
@@ -1639,11 +1664,37 @@ static long __writeback_inodes_wb(struct bdi_writeback *wb,
 
 		/* refer to the same tests at the end of writeback_sb_inodes */
 		if (wrote) {
-			if (time_is_before_jiffies(start_time + HZ / 10UL))
-				break;
-			if (work->nr_pages <= 0)
+			if (time_is_before_jiffies(start_time + HZ / 10UL) ||
+			    work->nr_pages <= 0) {
+				done = true;
 				break;
+			}
+		}
+	}
+
+	if (!done && wb_stat(wb, WB_METADATA_DIRTY)) {
+		struct list_head list;
+
+		spin_unlock(&wb->list_lock);
+		spin_lock(&wb->bdi->sb_list_lock);
+		list_splice_init(&wb->bdi->sb_list, &list);
+		while (!list_empty(&list)) {
+			struct super_block *sb;
+
+			sb = list_first_entry(&list, struct super_block,
+					      s_bdi_list);
+			list_move_tail(&wb->bdi->sb_list, &sb->s_bdi_list);
+			if (!sb->s_op->write_metadata)
+				continue;
+			if (!trylock_super(sb))
+				continue;
+			spin_unlock(&wb->bdi->sb_list_lock);
+			wrote += writeback_sb_metadata(sb, wb, work);
+			spin_lock(&wb->bdi->sb_list_lock);
+			up_read(&sb->s_umount);
 		}
+		spin_unlock(&wb->bdi->sb_list_lock);
+		spin_lock(&wb->list_lock);
 	}
 	/* Leave any unwritten inodes on b_io */
 	return wrote;
diff --git a/fs/super.c b/fs/super.c
index c2ff475..940696d 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -215,6 +215,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
 	spin_lock_init(&s->s_inode_list_lock);
 	INIT_LIST_HEAD(&s->s_inodes_wb);
 	spin_lock_init(&s->s_inode_wblist_lock);
+	INIT_LIST_HEAD(&s->s_bdi_list);
 
 	if (list_lru_init_memcg(&s->s_dentry_lru))
 		goto fail;
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 3f10307..0e731a3 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -34,6 +34,7 @@ typedef int (congested_fn)(void *, int);
 enum wb_stat_item {
 	WB_RECLAIMABLE,
 	WB_WRITEBACK,
+	WB_METADATA_DIRTY,
 	WB_DIRTIED,
 	WB_WRITTEN,
 	NR_WB_STAT_ITEMS
@@ -166,6 +167,8 @@ struct backing_dev_info {
 
 	struct timer_list laptop_mode_wb_timer;
 
+	spinlock_t sb_list_lock;
+	struct list_head sb_list;
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *debug_dir;
 	struct dentry *debug_stats;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f3f0b4c8..015b06d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1430,6 +1430,8 @@ struct super_block {
 
 	spinlock_t		s_inode_wblist_lock;
 	struct list_head	s_inodes_wb;	/* writeback inodes */
+
+	struct list_head	s_bdi_list;
 };
 
 /* Helper functions so that in most cases filesystems will
@@ -1805,6 +1807,8 @@ struct super_operations {
 				  struct shrink_control *);
 	long (*free_cached_objects)(struct super_block *,
 				    struct shrink_control *);
+	long (*write_metadata)(struct super_block *sb,
+			       struct writeback_control *wbc);
 };
 
 /*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 08ed53e..0e90fde 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -31,6 +31,7 @@ struct file_ra_state;
 struct user_struct;
 struct writeback_control;
 struct bdi_writeback;
+struct backing_dev_info;
 
 #ifndef CONFIG_NEED_MULTIPLE_NODES	/* Don't use mapnrs, do it properly */
 extern unsigned long max_mapnr;
@@ -1363,6 +1364,9 @@ int redirty_page_for_writepage(struct writeback_control *wbc,
 void account_page_dirtied(struct page *page, struct address_space *mapping);
 void account_page_cleaned(struct page *page, struct address_space *mapping,
 			  struct bdi_writeback *wb);
+void account_metadata_dirtied(struct page *page, struct backing_dev_info *bdi);
+void account_metadata_cleaned(struct page *page, struct backing_dev_info *bdi);
+void account_metadata_writeback(struct page *page, struct backing_dev_info *bdi);
 int set_page_dirty(struct page *page);
 int set_page_dirty_lock(struct page *page);
 void cancel_dirty_page(struct page *page);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f2e4e90..c4177ef 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -167,6 +167,7 @@ enum node_stat_item {
 	NR_VMSCAN_IMMEDIATE,	/* Prioritise for reclaim when writeback ends */
 	NR_DIRTIED,		/* page dirtyings since bootup */
 	NR_WRITTEN,		/* page writings since bootup */
+	NR_METADATA_DIRTY,	/* Metadata dirty pages */
 	NR_VM_NODE_STAT_ITEMS
 };
 
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index efe2377..0f1692c 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -78,6 +78,7 @@ static int bdi_debug_stats_show(struct seq_file *m, void *v)
 		   "BackgroundThresh:   %10lu kB\n"
 		   "BdiDirtied:         %10lu kB\n"
 		   "BdiWritten:         %10lu kB\n"
+		   "BdiMetadataDirty:	%10lu kB\n"
 		   "BdiWriteBandwidth:  %10lu kBps\n"
 		   "b_dirty:            %10lu\n"
 		   "b_io:               %10lu\n"
@@ -92,6 +93,7 @@ static int bdi_debug_stats_show(struct seq_file *m, void *v)
 		   K(background_thresh),
 		   (unsigned long) K(wb_stat(wb, WB_DIRTIED)),
 		   (unsigned long) K(wb_stat(wb, WB_WRITTEN)),
+		   (unsigned long) K(wb_stat(wb, WB_METADATA_DIRTY)),
 		   (unsigned long) K(wb->write_bandwidth),
 		   nr_dirty,
 		   nr_io,
@@ -780,6 +782,7 @@ int bdi_init(struct backing_dev_info *bdi)
 	bdi->max_prop_frac = FPROP_FRAC_BASE;
 	INIT_LIST_HEAD(&bdi->bdi_list);
 	INIT_LIST_HEAD(&bdi->wb_list);
+	INIT_LIST_HEAD(&bdi->sb_list);
 	init_waitqueue_head(&bdi->wb_waitq);
 
 	ret = cgwb_bdi_init(bdi);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 121a6e3..c6492de1a 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -506,6 +506,7 @@ bool node_dirty_ok(struct pglist_data *pgdat)
 	nr_pages += node_page_state(pgdat, NR_FILE_DIRTY);
 	nr_pages += node_page_state(pgdat, NR_UNSTABLE_NFS);
 	nr_pages += node_page_state(pgdat, NR_WRITEBACK);
+	nr_pages += node_page_state(pgdat, NR_METADATA_DIRTY);
 
 	return nr_pages <= limit;
 }
@@ -1595,7 +1596,8 @@ static void balance_dirty_pages(struct bdi_writeback *wb,
 		 * been flushed to permanent storage.
 		 */
 		nr_reclaimable = global_node_page_state(NR_FILE_DIRTY) +
-					global_node_page_state(NR_UNSTABLE_NFS);
+				global_node_page_state(NR_UNSTABLE_NFS) +
+				global_node_page_state(NR_METADATA_DIRTY);
 		gdtc->avail = global_dirtyable_memory();
 		gdtc->dirty = nr_reclaimable + global_node_page_state(NR_WRITEBACK);
 
@@ -1935,7 +1937,8 @@ bool wb_over_bg_thresh(struct bdi_writeback *wb)
 	 */
 	gdtc->avail = global_dirtyable_memory();
 	gdtc->dirty = global_node_page_state(NR_FILE_DIRTY) +
-		      global_node_page_state(NR_UNSTABLE_NFS);
+		      global_node_page_state(NR_UNSTABLE_NFS) +
+		      global_node_page_state(NR_METADATA_DIRTY);
 	domain_dirty_limits(gdtc);
 
 	if (gdtc->dirty > gdtc->bg_thresh)
@@ -2009,7 +2012,8 @@ void laptop_mode_timer_fn(unsigned long data)
 {
 	struct request_queue *q = (struct request_queue *)data;
 	int nr_pages = global_node_page_state(NR_FILE_DIRTY) +
-		global_node_page_state(NR_UNSTABLE_NFS);
+		global_node_page_state(NR_UNSTABLE_NFS) +
+		global_node_page_state(NR_METADATA_DIRTY);
 	struct bdi_writeback *wb;
 
 	/*
@@ -2473,6 +2477,74 @@ void account_page_dirtied(struct page *page, struct address_space *mapping)
 EXPORT_SYMBOL(account_page_dirtied);
 
 /*
+ * account_metadata_dirtied
+ * @page - the page being dirited
+ * @bdi - the bdi that owns this page
+ *
+ * Do the dirty page accounting for metadata pages that aren't backed by an
+ * address_space.
+ */
+void account_metadata_dirtied(struct page *page, struct backing_dev_info *bdi)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__inc_node_page_state(page, NR_METADATA_DIRTY);
+	__inc_node_page_state(page, NR_FILE_DIRTY);
+	__inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
+	__inc_node_page_state(page, NR_DIRTIED);
+	__inc_wb_stat(&bdi->wb, WB_RECLAIMABLE);
+	__inc_wb_stat(&bdi->wb, WB_DIRTIED);
+	__inc_wb_stat(&bdi->wb, WB_METADATA_DIRTY);
+	current->nr_dirtied++;
+	task_io_account_write(PAGE_SIZE);
+	this_cpu_inc(bdp_ratelimits);
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL(account_metadata_dirtied);
+
+/*
+ * account_metadata_cleaned
+ * @page - the page being cleaned
+ * @bdi - the bdi that owns this page
+ *
+ * Called on a no longer dirty metadata page.
+ */
+void account_metadata_cleaned(struct page *page, struct backing_dev_info *bdi)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__dec_node_page_state(page, NR_METADATA_DIRTY);
+	__dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
+	__dec_wb_stat(&bdi->wb, WB_RECLAIMABLE);
+	__dec_wb_stat(&bdi->wb, WB_METADATA_DIRTY);
+	task_io_account_cancelled_write(PAGE_SIZE);
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL(account_metadata_cleaned);
+
+/*
+ * account_metadata_writeback
+ * @page - the page being marked as writeback
+ * @bdi - the bdi that owns this page
+ *
+ * Called on a metadata page that has been marked writeback.
+ */
+void account_metadata_writeback(struct page *page, struct backing_dev_info *bdi)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__inc_wb_stat(&bdi->wb, WB_WRITEBACK);
+	__dec_wb_stat(&bdi->wb, WB_METADATA_DIRTY);
+	__inc_node_page_state(page, NR_WRITEBACK);
+	__dec_node_page_state(page, NR_METADATA_DIRTY);
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL(account_metadata_writeback);
+
+/*
  * Helper function for deaccounting dirty page without writeback.
  *
  * Caller must hold lock_page_memcg().
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] remove mapping from balance_dirty_pages*()
  2016-08-09 19:08 ` [PATCH 1/2] remove mapping from balance_dirty_pages*() Josef Bacik
@ 2016-08-09 19:30   ` kbuild test robot
  2016-08-09 19:32   ` kbuild test robot
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: kbuild test robot @ 2016-08-09 19:30 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kbuild-all, linux-btrfs, linux-fsdevel, kernel-team, jack, viro,
	dchinner, hch

[-- Attachment #1: Type: text/plain, Size: 2431 bytes --]

Hi Josef,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.8-rc1 next-20160809]
[cannot apply to linux/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Josef-Bacik/Provide-accounting-for-dirty-metadata/20160810-031219
config: x86_64-randconfig-x019-201632 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   In file included from include/linux/linkage.h:4:0,
                    from include/linux/kernel.h:6,
                    from mm/page-writeback.c:14:
   mm/page-writeback.c: In function 'balance_dirty_pages_ratelimited':
   mm/page-writeback.c:1873:6: error: implicit declaration of function 'bdi_cgwb_enabled' [-Werror=implicit-function-declaration]
     if (bdi_cgwb_enabled(bdi, sb))
         ^
   include/linux/compiler.h:149:30: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                 ^~~~
>> mm/page-writeback.c:1873:2: note: in expansion of macro 'if'
     if (bdi_cgwb_enabled(bdi, sb))
     ^~
   cc1: some warnings being treated as errors

vim +/if +1873 mm/page-writeback.c

  1857	 *
  1858	 * On really big machines, get_writeback_state is expensive, so try to avoid
  1859	 * calling it too often (ratelimiting).  But once we're over the dirty memory
  1860	 * limit we decrease the ratelimiting by a lot, to prevent individual processes
  1861	 * from overshooting the limit by (ratelimit_pages) each.
  1862	 */
  1863	void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
  1864					     struct super_block *sb)
  1865	{
  1866		struct bdi_writeback *wb = NULL;
  1867		int ratelimit;
  1868		int *p;
  1869	
  1870		if (!bdi_cap_account_dirty(bdi))
  1871			return;
  1872	
> 1873		if (bdi_cgwb_enabled(bdi, sb))
  1874			wb = wb_get_create_current(bdi, GFP_KERNEL);
  1875		if (!wb)
  1876			wb = &bdi->wb;
  1877	
  1878		ratelimit = current->nr_dirtied_pause;
  1879		if (wb->dirty_exceeded)
  1880			ratelimit = min(ratelimit, 32 >> (PAGE_SHIFT - 10));
  1881	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 26211 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] remove mapping from balance_dirty_pages*()
  2016-08-09 19:08 ` [PATCH 1/2] remove mapping from balance_dirty_pages*() Josef Bacik
  2016-08-09 19:30   ` kbuild test robot
@ 2016-08-09 19:32   ` kbuild test robot
  2016-08-09 20:12   ` kbuild test robot
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: kbuild test robot @ 2016-08-09 19:32 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kbuild-all, linux-btrfs, linux-fsdevel, kernel-team, jack, viro,
	dchinner, hch

[-- Attachment #1: Type: text/plain, Size: 8203 bytes --]

Hi Josef,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.8-rc1 next-20160809]
[cannot apply to linux/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Josef-Bacik/Provide-accounting-for-dirty-metadata/20160810-031219
config: x86_64-randconfig-x013-201632 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   mm/page-writeback.c: In function 'balance_dirty_pages_ratelimited':
>> mm/page-writeback.c:1873:6: error: implicit declaration of function 'bdi_cgwb_enabled' [-Werror=implicit-function-declaration]
     if (bdi_cgwb_enabled(bdi, sb))
         ^~~~~~~~~~~~~~~~
   Cyclomatic Complexity 5 include/linux/compiler.h:__read_once_size
   Cyclomatic Complexity 2 arch/x86/include/asm/bitops.h:set_bit
   Cyclomatic Complexity 2 arch/x86/include/asm/bitops.h:clear_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:test_and_set_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:test_and_set_bit_lock
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:test_and_clear_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:constant_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:variable_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:fls64
   Cyclomatic Complexity 1 include/linux/log2.h:__ilog2_u64
   Cyclomatic Complexity 1 arch/x86/include/asm/current.h:get_current
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:atomic_inc
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:atomic_dec_and_test
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:atomic64_read
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:atomic64_inc
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:atomic64_dec
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic64_64.h:atomic64_add_return
   Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_read
   Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_add_return
   Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_inc
   Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_dec
   Cyclomatic Complexity 2 arch/x86/include/asm/jump_label.h:arch_static_branch
   Cyclomatic Complexity 1 include/linux/jump_label.h:static_key_false
   Cyclomatic Complexity 1 arch/x86/include/asm/paravirt.h:arch_local_save_flags
   Cyclomatic Complexity 1 arch/x86/include/asm/paravirt.h:arch_local_irq_restore
   Cyclomatic Complexity 1 arch/x86/include/asm/paravirt.h:arch_local_irq_disable
   Cyclomatic Complexity 1 arch/x86/include/asm/paravirt.h:arch_local_irq_save
   Cyclomatic Complexity 1 include/linux/math64.h:div_u64_rem
   Cyclomatic Complexity 1 include/linux/math64.h:div64_u64
   Cyclomatic Complexity 1 include/linux/math64.h:div64_s64
   Cyclomatic Complexity 1 include/linux/math64.h:div_u64
   Cyclomatic Complexity 1 arch/x86/include/asm/irqflags.h:arch_irqs_disabled_flags
   Cyclomatic Complexity 2 include/linux/thread_info.h:test_ti_thread_flag
   Cyclomatic Complexity 5 arch/x86/include/asm/preempt.h:__preempt_count_add
   Cyclomatic Complexity 5 arch/x86/include/asm/preempt.h:__preempt_count_sub
   Cyclomatic Complexity 1 include/linux/spinlock.h:spinlock_check
   Cyclomatic Complexity 1 include/linux/spinlock.h:spin_lock
   Cyclomatic Complexity 1 include/linux/spinlock.h:spin_lock_bh
   Cyclomatic Complexity 1 include/linux/spinlock.h:spin_lock_irq
   Cyclomatic Complexity 1 include/linux/spinlock.h:spin_unlock
   Cyclomatic Complexity 1 include/linux/spinlock.h:spin_unlock_bh
   Cyclomatic Complexity 1 include/linux/spinlock.h:spin_unlock_irq
   Cyclomatic Complexity 1 include/linux/spinlock.h:spin_unlock_irqrestore
   Cyclomatic Complexity 1 include/linux/rcupdate.h:rcu_read_lock_sched_notrace
   Cyclomatic Complexity 1 include/linux/rcupdate.h:rcu_read_unlock_sched_notrace
   Cyclomatic Complexity 1 include/linux/mmzone.h:populated_zone
   Cyclomatic Complexity 1 include/linux/percpu_counter.h:percpu_counter_add
   Cyclomatic Complexity 1 include/linux/percpu_counter.h:__percpu_counter_add
   Cyclomatic Complexity 1 include/linux/percpu_counter.h:percpu_counter_read
   Cyclomatic Complexity 1 include/linux/percpu_counter.h:percpu_counter_read_positive
   Cyclomatic Complexity 1 include/linux/percpu_counter.h:percpu_counter_sum_positive
   Cyclomatic Complexity 3 include/linux/page-flags.h:compound_head
   Cyclomatic Complexity 1 include/linux/page-flags.h:PageLocked
   Cyclomatic Complexity 1 include/linux/page-flags.h:PageError
   Cyclomatic Complexity 1 include/linux/page-flags.h:PageDirty
   Cyclomatic Complexity 1 include/linux/page-flags.h:ClearPageDirty
   Cyclomatic Complexity 1 include/linux/page-flags.h:TestSetPageDirty
   Cyclomatic Complexity 1 include/linux/page-flags.h:TestClearPageDirty
   Cyclomatic Complexity 1 include/linux/page-flags.h:PagePrivate
   Cyclomatic Complexity 1 include/linux/page-flags.h:PageWriteback
   Cyclomatic Complexity 1 include/linux/page-flags.h:TestSetPageWriteback
   Cyclomatic Complexity 1 include/linux/page-flags.h:TestClearPageWriteback
   Cyclomatic Complexity 1 include/linux/page-flags.h:PageReclaim
   Cyclomatic Complexity 1 include/linux/page-flags.h:ClearPageReclaim
   Cyclomatic Complexity 1 include/linux/page-flags.h:PageSwapCache
   Cyclomatic Complexity 2 include/linux/page-flags.h:PageUptodate
   Cyclomatic Complexity 2 include/linux/page_ref.h:page_ref_inc
   Cyclomatic Complexity 2 include/linux/page_ref.h:page_ref_dec_and_test
   Cyclomatic Complexity 1 include/linux/mm.h:put_page_testzero
   Cyclomatic Complexity 1 include/linux/mm.h:page_zonenum
   Cyclomatic Complexity 1 include/linux/mm.h:get_zone_device_page
   Cyclomatic Complexity 1 include/linux/mm.h:put_zone_device_page
   Cyclomatic Complexity 1 include/linux/mm.h:is_zone_device_page
   Cyclomatic Complexity 2 include/linux/mm.h:get_page
   Cyclomatic Complexity 3 include/linux/mm.h:put_page
   Cyclomatic Complexity 1 include/linux/mm.h:page_zone
   Cyclomatic Complexity 1 include/linux/mm.h:page_pgdat
   Cyclomatic Complexity 1 include/linux/vmstat.h:global_page_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:global_node_page_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:zone_page_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__inc_zone_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__inc_node_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__dec_zone_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__dec_node_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__inc_zone_page_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__inc_node_page_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__dec_zone_page_state
   Cyclomatic Complexity 1 include/linux/vmstat.h:__dec_node_page_state
   Cyclomatic Complexity 2 include/linux/mm.h:page_index
   Cyclomatic Complexity 1 include/linux/signal.h:sigismember
   Cyclomatic Complexity 1 include/linux/sched.h:test_tsk_thread_flag
   Cyclomatic Complexity 1 include/linux/sched.h:signal_pending
   Cyclomatic Complexity 1 include/linux/sched.h:__fatal_signal_pending
   Cyclomatic Complexity 3 include/linux/sched.h:fatal_signal_pending
   Cyclomatic Complexity 1 include/linux/backing-dev-defs.h:wb_put
   Cyclomatic Complexity 1 include/linux/writeback.h:inode_attach_wb
   Cyclomatic Complexity 1 include/linux/memcontrol.h:lock_page_memcg
   Cyclomatic Complexity 1 include/linux/memcontrol.h:unlock_page_memcg

vim +/bdi_cgwb_enabled +1873 mm/page-writeback.c

  1867		int ratelimit;
  1868		int *p;
  1869	
  1870		if (!bdi_cap_account_dirty(bdi))
  1871			return;
  1872	
> 1873		if (bdi_cgwb_enabled(bdi, sb))
  1874			wb = wb_get_create_current(bdi, GFP_KERNEL);
  1875		if (!wb)
  1876			wb = &bdi->wb;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 25495 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] remove mapping from balance_dirty_pages*()
  2016-08-09 19:08 ` [PATCH 1/2] remove mapping from balance_dirty_pages*() Josef Bacik
  2016-08-09 19:30   ` kbuild test robot
  2016-08-09 19:32   ` kbuild test robot
@ 2016-08-09 20:12   ` kbuild test robot
  2016-08-09 20:50   ` kbuild test robot
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: kbuild test robot @ 2016-08-09 20:12 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kbuild-all, linux-btrfs, linux-fsdevel, kernel-team, jack, viro,
	dchinner, hch

[-- Attachment #1: Type: text/plain, Size: 7611 bytes --]

Hi Josef,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.8-rc1 next-20160809]
[cannot apply to linux/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Josef-Bacik/Provide-accounting-for-dirty-metadata/20160810-031219
config: x86_64-randconfig-s3-08100114 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   fs/ntfs/attrib.c: In function 'ntfs_attr_set':
>> fs/ntfs/attrib.c:2548:35: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
      balance_dirty_pages_ratelimited(mapping);
                                      ^~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> fs/ntfs/attrib.c:2548:3: error: too few arguments to function 'balance_dirty_pages_ratelimited'
      balance_dirty_pages_ratelimited(mapping);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/attrib.c:2589:35: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
      balance_dirty_pages_ratelimited(mapping);
                                      ^~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/attrib.c:2589:3: error: too few arguments to function 'balance_dirty_pages_ratelimited'
      balance_dirty_pages_ratelimited(mapping);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/attrib.c:2606:35: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
      balance_dirty_pages_ratelimited(mapping);
                                      ^~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/attrib.c:2606:3: error: too few arguments to function 'balance_dirty_pages_ratelimited'
      balance_dirty_pages_ratelimited(mapping);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors
--
   fs/ntfs/file.c: In function 'ntfs_attr_extend_initialized':
>> fs/ntfs/file.c:279:35: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
      balance_dirty_pages_ratelimited(mapping);
                                      ^~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from fs/ntfs/file.c:22:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> fs/ntfs/file.c:279:3: error: too few arguments to function 'balance_dirty_pages_ratelimited'
      balance_dirty_pages_ratelimited(mapping);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from fs/ntfs/file.c:22:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/file.c: In function 'ntfs_perform_write':
   fs/ntfs/file.c:1917:35: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
      balance_dirty_pages_ratelimited(mapping);
                                      ^~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from fs/ntfs/file.c:22:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/file.c:1917:3: error: too few arguments to function 'balance_dirty_pages_ratelimited'
      balance_dirty_pages_ratelimited(mapping);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from fs/ntfs/file.c:22:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/balance_dirty_pages_ratelimited +2548 fs/ntfs/attrib.c

a3ac1414 Cong Wang          2011-11-25  2542  		kaddr = kmap_atomic(page);
^1da177e Linus Torvalds     2005-04-16  2543  		memset(kaddr + start_ofs, val, size - start_ofs);
^1da177e Linus Torvalds     2005-04-16  2544  		flush_dcache_page(page);
a3ac1414 Cong Wang          2011-11-25  2545  		kunmap_atomic(kaddr);
^1da177e Linus Torvalds     2005-04-16  2546  		set_page_dirty(page);
09cbfeaf Kirill A. Shutemov 2016-04-01  2547  		put_page(page);
bfab36e8 Anton Altaparmakov 2007-10-12 @2548  		balance_dirty_pages_ratelimited(mapping);
bfab36e8 Anton Altaparmakov 2007-10-12  2549  		cond_resched();
^1da177e Linus Torvalds     2005-04-16  2550  		if (idx == end)
^1da177e Linus Torvalds     2005-04-16  2551  			goto done;

:::::: The code at line 2548 was first introduced by commit
:::::: bfab36e81611e60573b84eb4e4b4c8d8545b2320 NTFS: Fix a mount time deadlock.

:::::: TO: Anton Altaparmakov <aia21@cam.ac.uk>
:::::: CC: Linus Torvalds <torvalds@woody.linux-foundation.org>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 30232 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] remove mapping from balance_dirty_pages*()
  2016-08-09 19:08 ` [PATCH 1/2] remove mapping from balance_dirty_pages*() Josef Bacik
                     ` (2 preceding siblings ...)
  2016-08-09 20:12   ` kbuild test robot
@ 2016-08-09 20:50   ` kbuild test robot
  2016-08-10  8:27   ` Jan Kara
  2016-08-10 19:56   ` Tejun Heo
  5 siblings, 0 replies; 15+ messages in thread
From: kbuild test robot @ 2016-08-09 20:50 UTC (permalink / raw)
  To: Josef Bacik
  Cc: kbuild-all, linux-btrfs, linux-fsdevel, kernel-team, jack, viro,
	dchinner, hch

Hi Josef,

[auto build test WARNING on linus/master]
[also build test WARNING on v4.8-rc1 next-20160809]
[cannot apply to linux/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Josef-Bacik/Provide-accounting-for-dirty-metadata/20160810-031219
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   include/linux/compiler.h:230:8: sparse: attribute 'no_sanitize_address': unknown attribute
>> drivers/mtd/devices/block2mtd.c:74:64: sparse: not enough arguments for function balance_dirty_pages_ratelimited
   drivers/mtd/devices/block2mtd.c:165:56: sparse: not enough arguments for function balance_dirty_pages_ratelimited
   drivers/mtd/devices/block2mtd.c: In function '_block2mtd_erase':
   drivers/mtd/devices/block2mtd.c:74:37: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
        balance_dirty_pages_ratelimited(mapping);
                                        ^~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from drivers/mtd/devices/block2mtd.c:23:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/mtd/devices/block2mtd.c:74:5: error: too few arguments to function 'balance_dirty_pages_ratelimited'
        balance_dirty_pages_ratelimited(mapping);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from drivers/mtd/devices/block2mtd.c:23:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/mtd/devices/block2mtd.c: In function '_block2mtd_write':
   drivers/mtd/devices/block2mtd.c:165:36: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
       balance_dirty_pages_ratelimited(mapping);
                                       ^~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from drivers/mtd/devices/block2mtd.c:23:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/mtd/devices/block2mtd.c:165:4: error: too few arguments to function 'balance_dirty_pages_ratelimited'
       balance_dirty_pages_ratelimited(mapping);
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from drivers/mtd/devices/block2mtd.c:23:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors
--
   include/linux/compiler.h:230:8: sparse: attribute 'no_sanitize_address': unknown attribute
>> fs/ntfs/attrib.c:2548:48: sparse: not enough arguments for function balance_dirty_pages_ratelimited
   fs/ntfs/attrib.c:2589:48: sparse: not enough arguments for function balance_dirty_pages_ratelimited
   fs/ntfs/attrib.c:2606:48: sparse: not enough arguments for function balance_dirty_pages_ratelimited
   fs/ntfs/attrib.c: In function 'ntfs_attr_set':
   fs/ntfs/attrib.c:2548:35: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
      balance_dirty_pages_ratelimited(mapping);
                                      ^~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/attrib.c:2548:3: error: too few arguments to function 'balance_dirty_pages_ratelimited'
      balance_dirty_pages_ratelimited(mapping);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/attrib.c:2589:35: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
      balance_dirty_pages_ratelimited(mapping);
                                      ^~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/attrib.c:2589:3: error: too few arguments to function 'balance_dirty_pages_ratelimited'
      balance_dirty_pages_ratelimited(mapping);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/attrib.c:2606:35: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
      balance_dirty_pages_ratelimited(mapping);
                                      ^~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/attrib.c:2606:3: error: too few arguments to function 'balance_dirty_pages_ratelimited'
      balance_dirty_pages_ratelimited(mapping);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/memcontrol.h:30:0,
                    from include/linux/swap.h:8,
                    from fs/ntfs/attrib.c:26:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors
--
   include/linux/compiler.h:230:8: sparse: attribute 'no_sanitize_address': unknown attribute
>> fs/ntfs/file.c:279:48: sparse: not enough arguments for function balance_dirty_pages_ratelimited
   fs/ntfs/file.c:1917:48: sparse: not enough arguments for function balance_dirty_pages_ratelimited
   fs/ntfs/file.c: In function 'ntfs_attr_extend_initialized':
   fs/ntfs/file.c:279:35: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
      balance_dirty_pages_ratelimited(mapping);
                                      ^~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from fs/ntfs/file.c:22:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/file.c:279:3: error: too few arguments to function 'balance_dirty_pages_ratelimited'
      balance_dirty_pages_ratelimited(mapping);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from fs/ntfs/file.c:22:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/file.c: In function 'ntfs_perform_write':
   fs/ntfs/file.c:1917:35: error: passing argument 1 of 'balance_dirty_pages_ratelimited' from incompatible pointer type [-Werror=incompatible-pointer-types]
      balance_dirty_pages_ratelimited(mapping);
                                      ^~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from fs/ntfs/file.c:22:
   include/linux/writeback.h:367:6: note: expected 'struct backing_dev_info *' but argument is of type 'struct address_space *'
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   fs/ntfs/file.c:1917:3: error: too few arguments to function 'balance_dirty_pages_ratelimited'
      balance_dirty_pages_ratelimited(mapping);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/backing-dev.h:15:0,
                    from fs/ntfs/file.c:22:
   include/linux/writeback.h:367:6: note: declared here
    void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +74 drivers/mtd/devices/block2mtd.c

^1da177e Linus Torvalds     2005-04-16  58  	int pages = len >> PAGE_SHIFT;
^1da177e Linus Torvalds     2005-04-16  59  	u_long *p;
^1da177e Linus Torvalds     2005-04-16  60  	u_long *max;
^1da177e Linus Torvalds     2005-04-16  61  
^1da177e Linus Torvalds     2005-04-16  62  	while (pages) {
21d31f1f Joern Engel        2007-02-20  63  		page = page_read(mapping, index);
^1da177e Linus Torvalds     2005-04-16  64  		if (IS_ERR(page))
^1da177e Linus Torvalds     2005-04-16  65  			return PTR_ERR(page);
^1da177e Linus Torvalds     2005-04-16  66  
0ffb74cc Joern Engel        2007-02-20  67  		max = page_address(page) + PAGE_SIZE;
0ffb74cc Joern Engel        2007-02-20  68  		for (p=page_address(page); p<max; p++)
^1da177e Linus Torvalds     2005-04-16  69  			if (*p != -1UL) {
^1da177e Linus Torvalds     2005-04-16  70  				lock_page(page);
^1da177e Linus Torvalds     2005-04-16  71  				memset(page_address(page), 0xff, PAGE_SIZE);
^1da177e Linus Torvalds     2005-04-16  72  				set_page_dirty(page);
^1da177e Linus Torvalds     2005-04-16  73  				unlock_page(page);
031da73f NeilBrown          2012-12-12 @74  				balance_dirty_pages_ratelimited(mapping);
^1da177e Linus Torvalds     2005-04-16  75  				break;
^1da177e Linus Torvalds     2005-04-16  76  			}
^1da177e Linus Torvalds     2005-04-16  77  
09cbfeaf Kirill A. Shutemov 2016-04-01  78  		put_page(page);
^1da177e Linus Torvalds     2005-04-16  79  		pages--;
^1da177e Linus Torvalds     2005-04-16  80  		index++;
^1da177e Linus Torvalds     2005-04-16  81  	}
^1da177e Linus Torvalds     2005-04-16  82  	return 0;

:::::: The code at line 74 was first introduced by commit
:::::: 031da73fca29ebba3eea5a512746e01d52e542cd mtd: block2mtd: throttle writes by calling balance_dirty_pages_ratelimited.

:::::: TO: NeilBrown <neilb@suse.de>
:::::: CC: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] remove mapping from balance_dirty_pages*()
  2016-08-09 19:08 ` [PATCH 1/2] remove mapping from balance_dirty_pages*() Josef Bacik
                     ` (3 preceding siblings ...)
  2016-08-09 20:50   ` kbuild test robot
@ 2016-08-10  8:27   ` Jan Kara
  2016-08-10  8:29     ` Jan Kara
  2016-08-10 19:56   ` Tejun Heo
  5 siblings, 1 reply; 15+ messages in thread
From: Jan Kara @ 2016-08-10  8:27 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

On Tue 09-08-16 15:08:26, Josef Bacik wrote:
> The only reason we pass in the mapping is to get the inode in order to see if
> writeback cgroups is enabled, and even then it only checks the bdi and a super
> block flag.  balance_dirty_pages() doesn't even use the mapping.  Since
> balance_dirty_pages*() works on a bdi level, just pass in the bdi and super
> block directly so we can avoid using mapping.  This will allow us to still use
> balance_dirty_pages for dirty metadata pages that are not backed by an
> address_mapping.
>
> Signed-off-by: Josef Bacik <jbacik@fb.com>

The patch looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/btrfs/disk-io.c          |  4 ++--
>  fs/btrfs/file.c             |  3 ++-
>  fs/btrfs/ioctl.c            |  3 ++-
>  fs/btrfs/relocation.c       |  3 ++-
>  fs/buffer.c                 |  3 ++-
>  fs/iomap.c                  |  3 ++-
>  include/linux/backing-dev.h | 23 +++++++++++++++++------
>  include/linux/writeback.h   |  3 ++-
>  mm/filemap.c                |  4 +++-
>  mm/memory.c                 |  9 +++++++--
>  mm/page-writeback.c         | 15 +++++++--------
>  11 files changed, 48 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 17346f7..c1d951a 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -4024,8 +4024,8 @@ static void __btrfs_btree_balance_dirty(struct btrfs_root *root,
>  	ret = percpu_counter_compare(&root->fs_info->dirty_metadata_bytes,
>  				     BTRFS_DIRTY_METADATA_THRESH);
>  	if (ret > 0) {
> -		balance_dirty_pages_ratelimited(
> -				   root->fs_info->btree_inode->i_mapping);
> +		balance_dirty_pages_ratelimited(&root->fs_info->bdi,
> +						root->fs_info->sb);
>  	}
>  }
>  
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 9404121..f060b08 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1686,7 +1686,8 @@ again:
>  
>  		cond_resched();
>  
> -		balance_dirty_pages_ratelimited(inode->i_mapping);
> +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> +						inode->i_sb);
>  		if (dirty_pages < (root->nodesize >> PAGE_SHIFT) + 1)
>  			btrfs_btree_balance_dirty(root);
>  
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 14ed1e9..a222bad 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1410,7 +1410,8 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
>  		}
>  
>  		defrag_count += ret;
> -		balance_dirty_pages_ratelimited(inode->i_mapping);
> +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> +						inode->i_sb);
>  		inode_unlock(inode);
>  
>  		if (newer_than) {
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 4598e29..7fc6ea7 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -3202,7 +3202,8 @@ static int relocate_file_extent_cluster(struct inode *inode,
>  		put_page(page);
>  
>  		index++;
> -		balance_dirty_pages_ratelimited(inode->i_mapping);
> +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> +						inode->i_sb);
>  		btrfs_throttle(BTRFS_I(inode)->root);
>  	}
>  	WARN_ON(nr != cluster->nr);
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 9c8eb9b..9bbe30d 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -2386,7 +2386,8 @@ static int cont_expand_zero(struct file *file, struct address_space *mapping,
>  		BUG_ON(err != len);
>  		err = 0;
>  
> -		balance_dirty_pages_ratelimited(mapping);
> +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> +						inode->i_sb);
>  
>  		if (unlikely(fatal_signal_pending(current))) {
>  			err = -EINTR;
> diff --git a/fs/iomap.c b/fs/iomap.c
> index 48141b8..937e266 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -226,7 +226,8 @@ again:
>  		written += copied;
>  		length -= copied;
>  
> -		balance_dirty_pages_ratelimited(inode->i_mapping);
> +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> +						inode->i_sb);
>  	} while (iov_iter_count(i) && length);
>  
>  	return written ? written : status;
> diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
> index 491a917..3b76eeb 100644
> --- a/include/linux/backing-dev.h
> +++ b/include/linux/backing-dev.h
> @@ -252,8 +252,9 @@ void wb_blkcg_offline(struct blkcg *blkcg);
>  int inode_congested(struct inode *inode, int cong_bits);
>  
>  /**
> - * inode_cgwb_enabled - test whether cgroup writeback is enabled on an inode
> - * @inode: inode of interest
> + * bdi_cgwb_enabled - test wether cgroup writeback is enabled on a filesystem
> + * @bdi: the bdi we care about
> + * @sb: the super for the bdi
>   *
>   * cgroup writeback requires support from both the bdi and filesystem.
>   * Also, both memcg and iocg have to be on the default hierarchy.  Test
> @@ -262,15 +263,25 @@ int inode_congested(struct inode *inode, int cong_bits);
>   * Note that the test result may change dynamically on the same inode
>   * depending on how memcg and iocg are configured.
>   */
> -static inline bool inode_cgwb_enabled(struct inode *inode)
> +static inline bool bdi_cgwb_enabled(struct backing_dev_info *bdi,
> +				    struct super_block *sb)
>  {
> -	struct backing_dev_info *bdi = inode_to_bdi(inode);
> -
>  	return cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
>  		cgroup_subsys_on_dfl(io_cgrp_subsys) &&
>  		bdi_cap_account_dirty(bdi) &&
>  		(bdi->capabilities & BDI_CAP_CGROUP_WRITEBACK) &&
> -		(inode->i_sb->s_iflags & SB_I_CGROUPWB);
> +		(sb->s_iflags & SB_I_CGROUPWB);
> +}
> +
> +/**
> + * inode_cgwb_enabled - test whether cgroup writeback is enabled on an inode
> + * @inode: inode of interest
> + *
> + * Does the inode have cgroup writeback support.
> + */
> +static inline bool inode_cgwb_enabled(struct inode *inode)
> +{
> +	return bdi_cgwb_enabled(inode_to_bdi(inode), inode->i_sb);
>  }
>  
>  /**
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index fc1e16c..256ffc3 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -364,7 +364,8 @@ unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
>  
>  void wb_update_bandwidth(struct bdi_writeback *wb, unsigned long start_time);
>  void page_writeback_init(void);
> -void balance_dirty_pages_ratelimited(struct address_space *mapping);
> +void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
> +				     struct super_block *sb);
>  bool wb_over_bg_thresh(struct bdi_writeback *wb);
>  
>  typedef int (*writepage_t)(struct page *page, struct writeback_control *wbc,
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 3083ded..abb0e98 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2667,6 +2667,7 @@ ssize_t generic_perform_write(struct file *file,
>  				struct iov_iter *i, loff_t pos)
>  {
>  	struct address_space *mapping = file->f_mapping;
> +	struct inode *inode = mapping->host;
>  	const struct address_space_operations *a_ops = mapping->a_ops;
>  	long status = 0;
>  	ssize_t written = 0;
> @@ -2746,7 +2747,8 @@ again:
>  		pos += copied;
>  		written += copied;
>  
> -		balance_dirty_pages_ratelimited(mapping);
> +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> +						inode->i_sb);
>  	} while (iov_iter_count(i));
>  
>  	return written ? written : status;
> diff --git a/mm/memory.c b/mm/memory.c
> index 83be99d..d43e73b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -64,6 +64,7 @@
>  #include <linux/debugfs.h>
>  #include <linux/userfaultfd_k.h>
>  #include <linux/dax.h>
> +#include <linux/backing-dev.h>
>  
>  #include <asm/io.h>
>  #include <asm/mmu_context.h>
> @@ -2105,11 +2106,13 @@ static inline int wp_page_reuse(struct fault_env *fe, pte_t orig_pte,
>  		put_page(page);
>  
>  		if ((dirtied || page_mkwrite) && mapping) {
> +			struct inode *inode = mapping->host;
>  			/*
>  			 * Some device drivers do not set page.mapping
>  			 * but still dirty their pages
>  			 */
> -			balance_dirty_pages_ratelimited(mapping);
> +			balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> +							inode->i_sb);
>  		}
>  
>  		if (!page_mkwrite)
> @@ -3291,11 +3294,13 @@ static int do_shared_fault(struct fault_env *fe, pgoff_t pgoff)
>  	mapping = page_rmapping(fault_page);
>  	unlock_page(fault_page);
>  	if ((dirtied || vma->vm_ops->page_mkwrite) && mapping) {
> +		struct inode *inode = mapping->host;
>  		/*
>  		 * Some device drivers do not set page.mapping but still
>  		 * dirty their pages
>  		 */
> -		balance_dirty_pages_ratelimited(mapping);
> +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> +						inode->i_sb);
>  	}
>  
>  	if (!vma->vm_ops->page_mkwrite)
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index f4cd7d8..121a6e3 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1559,8 +1559,7 @@ static inline void wb_dirty_limits(struct dirty_throttle_control *dtc)
>   * If we're over `background_thresh' then the writeback threads are woken to
>   * perform some writeout.
>   */
> -static void balance_dirty_pages(struct address_space *mapping,
> -				struct bdi_writeback *wb,
> +static void balance_dirty_pages(struct bdi_writeback *wb,
>  				unsigned long pages_dirtied)
>  {
>  	struct dirty_throttle_control gdtc_stor = { GDTC_INIT(wb) };
> @@ -1849,7 +1848,8 @@ DEFINE_PER_CPU(int, dirty_throttle_leaks) = 0;
>  
>  /**
>   * balance_dirty_pages_ratelimited - balance dirty memory state
> - * @mapping: address_space which was dirtied
> + * @bdi: the bdi that was dirtied
> + * @sb: the super block that was dirtied
>   *
>   * Processes which are dirtying memory should call in here once for each page
>   * which was newly dirtied.  The function will periodically check the system's
> @@ -1860,10 +1860,9 @@ DEFINE_PER_CPU(int, dirty_throttle_leaks) = 0;
>   * limit we decrease the ratelimiting by a lot, to prevent individual processes
>   * from overshooting the limit by (ratelimit_pages) each.
>   */
> -void balance_dirty_pages_ratelimited(struct address_space *mapping)
> +void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
> +				     struct super_block *sb)
>  {
> -	struct inode *inode = mapping->host;
> -	struct backing_dev_info *bdi = inode_to_bdi(inode);
>  	struct bdi_writeback *wb = NULL;
>  	int ratelimit;
>  	int *p;
> @@ -1871,7 +1870,7 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping)
>  	if (!bdi_cap_account_dirty(bdi))
>  		return;
>  
> -	if (inode_cgwb_enabled(inode))
> +	if (bdi_cgwb_enabled(bdi, sb))
>  		wb = wb_get_create_current(bdi, GFP_KERNEL);
>  	if (!wb)
>  		wb = &bdi->wb;
> @@ -1909,7 +1908,7 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping)
>  	preempt_enable();
>  
>  	if (unlikely(current->nr_dirtied >= ratelimit))
> -		balance_dirty_pages(mapping, wb, current->nr_dirtied);
> +		balance_dirty_pages(wb, current->nr_dirtied);
>  
>  	wb_put(wb);
>  }
> -- 
> 1.8.3.1
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] remove mapping from balance_dirty_pages*()
  2016-08-10  8:27   ` Jan Kara
@ 2016-08-10  8:29     ` Jan Kara
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Kara @ 2016-08-10  8:29 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

On Wed 10-08-16 10:27:33, Jan Kara wrote:
> On Tue 09-08-16 15:08:26, Josef Bacik wrote:
> > The only reason we pass in the mapping is to get the inode in order to see if
> > writeback cgroups is enabled, and even then it only checks the bdi and a super
> > block flag.  balance_dirty_pages() doesn't even use the mapping.  Since
> > balance_dirty_pages*() works on a bdi level, just pass in the bdi and super
> > block directly so we can avoid using mapping.  This will allow us to still use
> > balance_dirty_pages for dirty metadata pages that are not backed by an
> > address_mapping.
> >
> > Signed-off-by: Josef Bacik <jbacik@fb.com>
> 
> The patch looks good. You can add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>

Well, modulo those build failures which obviously need to be sorted out...
But they should be trivial to deal with.

								Honza

> > ---
> >  fs/btrfs/disk-io.c          |  4 ++--
> >  fs/btrfs/file.c             |  3 ++-
> >  fs/btrfs/ioctl.c            |  3 ++-
> >  fs/btrfs/relocation.c       |  3 ++-
> >  fs/buffer.c                 |  3 ++-
> >  fs/iomap.c                  |  3 ++-
> >  include/linux/backing-dev.h | 23 +++++++++++++++++------
> >  include/linux/writeback.h   |  3 ++-
> >  mm/filemap.c                |  4 +++-
> >  mm/memory.c                 |  9 +++++++--
> >  mm/page-writeback.c         | 15 +++++++--------
> >  11 files changed, 48 insertions(+), 25 deletions(-)
> > 
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index 17346f7..c1d951a 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -4024,8 +4024,8 @@ static void __btrfs_btree_balance_dirty(struct btrfs_root *root,
> >  	ret = percpu_counter_compare(&root->fs_info->dirty_metadata_bytes,
> >  				     BTRFS_DIRTY_METADATA_THRESH);
> >  	if (ret > 0) {
> > -		balance_dirty_pages_ratelimited(
> > -				   root->fs_info->btree_inode->i_mapping);
> > +		balance_dirty_pages_ratelimited(&root->fs_info->bdi,
> > +						root->fs_info->sb);
> >  	}
> >  }
> >  
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index 9404121..f060b08 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -1686,7 +1686,8 @@ again:
> >  
> >  		cond_resched();
> >  
> > -		balance_dirty_pages_ratelimited(inode->i_mapping);
> > +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> > +						inode->i_sb);
> >  		if (dirty_pages < (root->nodesize >> PAGE_SHIFT) + 1)
> >  			btrfs_btree_balance_dirty(root);
> >  
> > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> > index 14ed1e9..a222bad 100644
> > --- a/fs/btrfs/ioctl.c
> > +++ b/fs/btrfs/ioctl.c
> > @@ -1410,7 +1410,8 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
> >  		}
> >  
> >  		defrag_count += ret;
> > -		balance_dirty_pages_ratelimited(inode->i_mapping);
> > +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> > +						inode->i_sb);
> >  		inode_unlock(inode);
> >  
> >  		if (newer_than) {
> > diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> > index 4598e29..7fc6ea7 100644
> > --- a/fs/btrfs/relocation.c
> > +++ b/fs/btrfs/relocation.c
> > @@ -3202,7 +3202,8 @@ static int relocate_file_extent_cluster(struct inode *inode,
> >  		put_page(page);
> >  
> >  		index++;
> > -		balance_dirty_pages_ratelimited(inode->i_mapping);
> > +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> > +						inode->i_sb);
> >  		btrfs_throttle(BTRFS_I(inode)->root);
> >  	}
> >  	WARN_ON(nr != cluster->nr);
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index 9c8eb9b..9bbe30d 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -2386,7 +2386,8 @@ static int cont_expand_zero(struct file *file, struct address_space *mapping,
> >  		BUG_ON(err != len);
> >  		err = 0;
> >  
> > -		balance_dirty_pages_ratelimited(mapping);
> > +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> > +						inode->i_sb);
> >  
> >  		if (unlikely(fatal_signal_pending(current))) {
> >  			err = -EINTR;
> > diff --git a/fs/iomap.c b/fs/iomap.c
> > index 48141b8..937e266 100644
> > --- a/fs/iomap.c
> > +++ b/fs/iomap.c
> > @@ -226,7 +226,8 @@ again:
> >  		written += copied;
> >  		length -= copied;
> >  
> > -		balance_dirty_pages_ratelimited(inode->i_mapping);
> > +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> > +						inode->i_sb);
> >  	} while (iov_iter_count(i) && length);
> >  
> >  	return written ? written : status;
> > diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
> > index 491a917..3b76eeb 100644
> > --- a/include/linux/backing-dev.h
> > +++ b/include/linux/backing-dev.h
> > @@ -252,8 +252,9 @@ void wb_blkcg_offline(struct blkcg *blkcg);
> >  int inode_congested(struct inode *inode, int cong_bits);
> >  
> >  /**
> > - * inode_cgwb_enabled - test whether cgroup writeback is enabled on an inode
> > - * @inode: inode of interest
> > + * bdi_cgwb_enabled - test wether cgroup writeback is enabled on a filesystem
> > + * @bdi: the bdi we care about
> > + * @sb: the super for the bdi
> >   *
> >   * cgroup writeback requires support from both the bdi and filesystem.
> >   * Also, both memcg and iocg have to be on the default hierarchy.  Test
> > @@ -262,15 +263,25 @@ int inode_congested(struct inode *inode, int cong_bits);
> >   * Note that the test result may change dynamically on the same inode
> >   * depending on how memcg and iocg are configured.
> >   */
> > -static inline bool inode_cgwb_enabled(struct inode *inode)
> > +static inline bool bdi_cgwb_enabled(struct backing_dev_info *bdi,
> > +				    struct super_block *sb)
> >  {
> > -	struct backing_dev_info *bdi = inode_to_bdi(inode);
> > -
> >  	return cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
> >  		cgroup_subsys_on_dfl(io_cgrp_subsys) &&
> >  		bdi_cap_account_dirty(bdi) &&
> >  		(bdi->capabilities & BDI_CAP_CGROUP_WRITEBACK) &&
> > -		(inode->i_sb->s_iflags & SB_I_CGROUPWB);
> > +		(sb->s_iflags & SB_I_CGROUPWB);
> > +}
> > +
> > +/**
> > + * inode_cgwb_enabled - test whether cgroup writeback is enabled on an inode
> > + * @inode: inode of interest
> > + *
> > + * Does the inode have cgroup writeback support.
> > + */
> > +static inline bool inode_cgwb_enabled(struct inode *inode)
> > +{
> > +	return bdi_cgwb_enabled(inode_to_bdi(inode), inode->i_sb);
> >  }
> >  
> >  /**
> > diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> > index fc1e16c..256ffc3 100644
> > --- a/include/linux/writeback.h
> > +++ b/include/linux/writeback.h
> > @@ -364,7 +364,8 @@ unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
> >  
> >  void wb_update_bandwidth(struct bdi_writeback *wb, unsigned long start_time);
> >  void page_writeback_init(void);
> > -void balance_dirty_pages_ratelimited(struct address_space *mapping);
> > +void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
> > +				     struct super_block *sb);
> >  bool wb_over_bg_thresh(struct bdi_writeback *wb);
> >  
> >  typedef int (*writepage_t)(struct page *page, struct writeback_control *wbc,
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 3083ded..abb0e98 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -2667,6 +2667,7 @@ ssize_t generic_perform_write(struct file *file,
> >  				struct iov_iter *i, loff_t pos)
> >  {
> >  	struct address_space *mapping = file->f_mapping;
> > +	struct inode *inode = mapping->host;
> >  	const struct address_space_operations *a_ops = mapping->a_ops;
> >  	long status = 0;
> >  	ssize_t written = 0;
> > @@ -2746,7 +2747,8 @@ again:
> >  		pos += copied;
> >  		written += copied;
> >  
> > -		balance_dirty_pages_ratelimited(mapping);
> > +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> > +						inode->i_sb);
> >  	} while (iov_iter_count(i));
> >  
> >  	return written ? written : status;
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 83be99d..d43e73b 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -64,6 +64,7 @@
> >  #include <linux/debugfs.h>
> >  #include <linux/userfaultfd_k.h>
> >  #include <linux/dax.h>
> > +#include <linux/backing-dev.h>
> >  
> >  #include <asm/io.h>
> >  #include <asm/mmu_context.h>
> > @@ -2105,11 +2106,13 @@ static inline int wp_page_reuse(struct fault_env *fe, pte_t orig_pte,
> >  		put_page(page);
> >  
> >  		if ((dirtied || page_mkwrite) && mapping) {
> > +			struct inode *inode = mapping->host;
> >  			/*
> >  			 * Some device drivers do not set page.mapping
> >  			 * but still dirty their pages
> >  			 */
> > -			balance_dirty_pages_ratelimited(mapping);
> > +			balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> > +							inode->i_sb);
> >  		}
> >  
> >  		if (!page_mkwrite)
> > @@ -3291,11 +3294,13 @@ static int do_shared_fault(struct fault_env *fe, pgoff_t pgoff)
> >  	mapping = page_rmapping(fault_page);
> >  	unlock_page(fault_page);
> >  	if ((dirtied || vma->vm_ops->page_mkwrite) && mapping) {
> > +		struct inode *inode = mapping->host;
> >  		/*
> >  		 * Some device drivers do not set page.mapping but still
> >  		 * dirty their pages
> >  		 */
> > -		balance_dirty_pages_ratelimited(mapping);
> > +		balance_dirty_pages_ratelimited(inode_to_bdi(inode),
> > +						inode->i_sb);
> >  	}
> >  
> >  	if (!vma->vm_ops->page_mkwrite)
> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index f4cd7d8..121a6e3 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -1559,8 +1559,7 @@ static inline void wb_dirty_limits(struct dirty_throttle_control *dtc)
> >   * If we're over `background_thresh' then the writeback threads are woken to
> >   * perform some writeout.
> >   */
> > -static void balance_dirty_pages(struct address_space *mapping,
> > -				struct bdi_writeback *wb,
> > +static void balance_dirty_pages(struct bdi_writeback *wb,
> >  				unsigned long pages_dirtied)
> >  {
> >  	struct dirty_throttle_control gdtc_stor = { GDTC_INIT(wb) };
> > @@ -1849,7 +1848,8 @@ DEFINE_PER_CPU(int, dirty_throttle_leaks) = 0;
> >  
> >  /**
> >   * balance_dirty_pages_ratelimited - balance dirty memory state
> > - * @mapping: address_space which was dirtied
> > + * @bdi: the bdi that was dirtied
> > + * @sb: the super block that was dirtied
> >   *
> >   * Processes which are dirtying memory should call in here once for each page
> >   * which was newly dirtied.  The function will periodically check the system's
> > @@ -1860,10 +1860,9 @@ DEFINE_PER_CPU(int, dirty_throttle_leaks) = 0;
> >   * limit we decrease the ratelimiting by a lot, to prevent individual processes
> >   * from overshooting the limit by (ratelimit_pages) each.
> >   */
> > -void balance_dirty_pages_ratelimited(struct address_space *mapping)
> > +void balance_dirty_pages_ratelimited(struct backing_dev_info *bdi,
> > +				     struct super_block *sb)
> >  {
> > -	struct inode *inode = mapping->host;
> > -	struct backing_dev_info *bdi = inode_to_bdi(inode);
> >  	struct bdi_writeback *wb = NULL;
> >  	int ratelimit;
> >  	int *p;
> > @@ -1871,7 +1870,7 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping)
> >  	if (!bdi_cap_account_dirty(bdi))
> >  		return;
> >  
> > -	if (inode_cgwb_enabled(inode))
> > +	if (bdi_cgwb_enabled(bdi, sb))
> >  		wb = wb_get_create_current(bdi, GFP_KERNEL);
> >  	if (!wb)
> >  		wb = &bdi->wb;
> > @@ -1909,7 +1908,7 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping)
> >  	preempt_enable();
> >  
> >  	if (unlikely(current->nr_dirtied >= ratelimit))
> > -		balance_dirty_pages(mapping, wb, current->nr_dirtied);
> > +		balance_dirty_pages(wb, current->nr_dirtied);
> >  
> >  	wb_put(wb);
> >  }
> > -- 
> > 1.8.3.1
> > 
> > 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] writeback: allow for dirty metadata accounting
  2016-08-09 19:08 ` [PATCH 2/2] writeback: allow for dirty metadata accounting Josef Bacik
@ 2016-08-10 10:09   ` Jan Kara
  2016-08-10 14:05     ` Josef Bacik
  2016-08-10 20:12   ` Tejun Heo
  1 sibling, 1 reply; 15+ messages in thread
From: Jan Kara @ 2016-08-10 10:09 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

On Tue 09-08-16 15:08:27, Josef Bacik wrote:
> Provide a mechanism for file systems to indicate how much dirty metadata they
> are holding.  This introduces a few things
> 
> 1) Zone stats for dirty metadata, which is the same as the NR_FILE_DIRTY.
> 2) WB stat for dirty metadata.  This way we know if we need to try and call into
> the file system to write out metadata.  This could potentially be used in the
> future to make balancing of dirty pages smarter.
> 3) A super callback to handle writing back dirty metadata.
> 
> A future patch will take advantage of this work in btrfs.  Thanks,

Hum, I once had a patch to allow filesystems to hook more into writeback
where a filesystem was just asked to do writeback and it could decide what
to do with it (it could use generic helpers to essentially replicate what
current writeback code does) but it could also choose some smarter strategy
of picking inodes to write. This scheme could easily accommodate your
metadata writeback as well and there are also other uses for it. But that
patch got broken by Tejun's cgroup aware writeback so one would have to
start from scratch.

We certainly have to think how to integrate this with cgroup aware
writeback. I guess your ->writeback_metadata() just does not bother and would
write anything in the root cgroup, right? After all you don't even pass the
information for which memcg the metadata writeback should be performed down
to the fs callback (that is encoded in the bdi_writeback structure). And
for now I think we could get away with that although it would need to be
handled properly in future I think.

If we created a generic filesystem writeback callback as I suggest, proper
integration with memcg writeback in unavoidable. But I have to think how to
do that best.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] writeback: allow for dirty metadata accounting
  2016-08-10 10:09   ` Jan Kara
@ 2016-08-10 14:05     ` Josef Bacik
  0 siblings, 0 replies; 15+ messages in thread
From: Josef Bacik @ 2016-08-10 14:05 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

On 08/10/2016 06:09 AM, Jan Kara wrote:
> On Tue 09-08-16 15:08:27, Josef Bacik wrote:
>> Provide a mechanism for file systems to indicate how much dirty metadata they
>> are holding.  This introduces a few things
>>
>> 1) Zone stats for dirty metadata, which is the same as the NR_FILE_DIRTY.
>> 2) WB stat for dirty metadata.  This way we know if we need to try and call into
>> the file system to write out metadata.  This could potentially be used in the
>> future to make balancing of dirty pages smarter.
>> 3) A super callback to handle writing back dirty metadata.
>>
>> A future patch will take advantage of this work in btrfs.  Thanks,
>
> Hum, I once had a patch to allow filesystems to hook more into writeback
> where a filesystem was just asked to do writeback and it could decide what
> to do with it (it could use generic helpers to essentially replicate what
> current writeback code does) but it could also choose some smarter strategy
> of picking inodes to write. This scheme could easily accommodate your
> metadata writeback as well and there are also other uses for it. But that
> patch got broken by Tejun's cgroup aware writeback so one would have to
> start from scratch.
>
> We certainly have to think how to integrate this with cgroup aware
> writeback. I guess your ->writeback_metadata() just does not bother and would
> write anything in the root cgroup, right? After all you don't even pass the
> information for which memcg the metadata writeback should be performed down
> to the fs callback (that is encoded in the bdi_writeback structure). And
> for now I think we could get away with that although it would need to be
> handled properly in future I think.
>

I thought about this some but I'm not sure how to work it out so it's sane. 
Currently no other file system's metadata is covered by the writeback cgroup. 
Btrfs is simply by accident, we have an inode where all of our metadata is 
attached.  This doesn't make a whole lot of sense as the inode is tied to 
whichever task dirited it last, so you are going to end up with weird writeback 
behavior on btrfs metadata if you are using writeback cgroups.  I think removing 
this capability for now is actually better overall so we can come up with a 
different solution.

> If we created a generic filesystem writeback callback as I suggest, proper
> integration with memcg writeback in unavoidable. But I have to think how to
> do that best.

So the reason I'm doing this is because the last time I tried to kill our btree 
inode I got bogged down trying to reproduce our own special writeback logic for 
metadata.  I basically constantly oom'ed the box because we'd fill up memory 
with dirty metadata, and then I started just wholesale copying 
mm/page-writeback.c and mm/fs-writeback.c to try and stop the madness and gave 
up because that was just as crazy.

I think that having writeback a little more modularized so file systems can be 
smarter about picking inodes if they want is a good long term goal, but for now 
I'd like to get this work in so I can go about killing our fs wide inode.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] remove mapping from balance_dirty_pages*()
  2016-08-09 19:08 ` [PATCH 1/2] remove mapping from balance_dirty_pages*() Josef Bacik
                     ` (4 preceding siblings ...)
  2016-08-10  8:27   ` Jan Kara
@ 2016-08-10 19:56   ` Tejun Heo
  5 siblings, 0 replies; 15+ messages in thread
From: Tejun Heo @ 2016-08-10 19:56 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

On Tue, Aug 09, 2016 at 03:08:26PM -0400, Josef Bacik wrote:
> The only reason we pass in the mapping is to get the inode in order to see if
> writeback cgroups is enabled, and even then it only checks the bdi and a super
> block flag.  balance_dirty_pages() doesn't even use the mapping.  Since
> balance_dirty_pages*() works on a bdi level, just pass in the bdi and super
> block directly so we can avoid using mapping.  This will allow us to still use
> balance_dirty_pages for dirty metadata pages that are not backed by an
> address_mapping.
> 
> Signed-off-by: Josef Bacik <jbacik@fb.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] writeback: allow for dirty metadata accounting
  2016-08-09 19:08 ` [PATCH 2/2] writeback: allow for dirty metadata accounting Josef Bacik
  2016-08-10 10:09   ` Jan Kara
@ 2016-08-10 20:12   ` Tejun Heo
  2016-08-10 21:16     ` Josef Bacik
  1 sibling, 1 reply; 15+ messages in thread
From: Tejun Heo @ 2016-08-10 20:12 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

Hello, Josef.

On Tue, Aug 09, 2016 at 03:08:27PM -0400, Josef Bacik wrote:
> Provide a mechanism for file systems to indicate how much dirty metadata they
> are holding.  This introduces a few things
> 
> 1) Zone stats for dirty metadata, which is the same as the NR_FILE_DIRTY.
> 2) WB stat for dirty metadata.  This way we know if we need to try and call into
> the file system to write out metadata.  This could potentially be used in the
> future to make balancing of dirty pages smarter.
> 3) A super callback to handle writing back dirty metadata.
> 
> A future patch will take advantage of this work in btrfs.  Thanks,
> 
> Signed-off-by: Josef Bacik <jbacik@fb.com>
...
> +	if (!done && wb_stat(wb, WB_METADATA_DIRTY)) {
> +		struct list_head list;
> +
> +		spin_unlock(&wb->list_lock);
> +		spin_lock(&wb->bdi->sb_list_lock);
> +		list_splice_init(&wb->bdi->sb_list, &list);
> +		while (!list_empty(&list)) {
> +			struct super_block *sb;
> +
> +			sb = list_first_entry(&list, struct super_block,
> +					      s_bdi_list);
> +			list_move_tail(&wb->bdi->sb_list, &sb->s_bdi_list);

It bothers me a bit that sb's can actually be off bdi->sb_list while
sb_list_lock is released.  Can we make this explicit?  e.g. keep
separate bdi sb list for sb's pending metadata writeout (like b_dirty)
or by just walking the list in a smart way?

> +			if (!sb->s_op->write_metadata)
> +				continue;
> +			if (!trylock_super(sb))
> +				continue;
> +			spin_unlock(&wb->bdi->sb_list_lock);
> +			wrote += writeback_sb_metadata(sb, wb, work);
> +			spin_lock(&wb->bdi->sb_list_lock);
> +			up_read(&sb->s_umount);
>  		}
> +		spin_unlock(&wb->bdi->sb_list_lock);
> +		spin_lock(&wb->list_lock);
>  	}
>  	/* Leave any unwritten inodes on b_io */
>  	return wrote;
> diff --git a/fs/super.c b/fs/super.c
> index c2ff475..940696d 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -215,6 +215,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
>  	spin_lock_init(&s->s_inode_list_lock);
>  	INIT_LIST_HEAD(&s->s_inodes_wb);
>  	spin_lock_init(&s->s_inode_wblist_lock);
> +	INIT_LIST_HEAD(&s->s_bdi_list);

I can't find where sb's are actually added to the list.  Is that
supposed to happen from specific filesystems?  Also, it might make
sense to split up additions of sb_list and stat into separate patches.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] writeback: allow for dirty metadata accounting
  2016-08-10 20:12   ` Tejun Heo
@ 2016-08-10 21:16     ` Josef Bacik
  2016-08-10 21:39       ` Tejun Heo
  0 siblings, 1 reply; 15+ messages in thread
From: Josef Bacik @ 2016-08-10 21:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

On 08/10/2016 04:12 PM, Tejun Heo wrote:
> Hello, Josef.
>
> On Tue, Aug 09, 2016 at 03:08:27PM -0400, Josef Bacik wrote:
>> Provide a mechanism for file systems to indicate how much dirty metadata they
>> are holding.  This introduces a few things
>>
>> 1) Zone stats for dirty metadata, which is the same as the NR_FILE_DIRTY.
>> 2) WB stat for dirty metadata.  This way we know if we need to try and call into
>> the file system to write out metadata.  This could potentially be used in the
>> future to make balancing of dirty pages smarter.
>> 3) A super callback to handle writing back dirty metadata.
>>
>> A future patch will take advantage of this work in btrfs.  Thanks,
>>
>> Signed-off-by: Josef Bacik <jbacik@fb.com>
> ...
>> +	if (!done && wb_stat(wb, WB_METADATA_DIRTY)) {
>> +		struct list_head list;
>> +
>> +		spin_unlock(&wb->list_lock);
>> +		spin_lock(&wb->bdi->sb_list_lock);
>> +		list_splice_init(&wb->bdi->sb_list, &list);
>> +		while (!list_empty(&list)) {
>> +			struct super_block *sb;
>> +
>> +			sb = list_first_entry(&list, struct super_block,
>> +					      s_bdi_list);
>> +			list_move_tail(&wb->bdi->sb_list, &sb->s_bdi_list);
>
> It bothers me a bit that sb's can actually be off bdi->sb_list while
> sb_list_lock is released.  Can we make this explicit?  e.g. keep
> separate bdi sb list for sb's pending metadata writeout (like b_dirty)
> or by just walking the list in a smart way?
>

Yeah I wasn't super thrilled with this either, but since I'm only using it for 
writeback I figured it was ok for now.  I suppose I could make it less sb_list 
and just make it dirty_sb_list, and only add if the super says it has dirty 
metadata.  Does that sound better?  Then us being off of the list during 
writeback isn't that big of a deal.

>> +			if (!sb->s_op->write_metadata)
>> +				continue;
>> +			if (!trylock_super(sb))
>> +				continue;
>> +			spin_unlock(&wb->bdi->sb_list_lock);
>> +			wrote += writeback_sb_metadata(sb, wb, work);
>> +			spin_lock(&wb->bdi->sb_list_lock);
>> +			up_read(&sb->s_umount);
>>  		}
>> +		spin_unlock(&wb->bdi->sb_list_lock);
>> +		spin_lock(&wb->list_lock);
>>  	}
>>  	/* Leave any unwritten inodes on b_io */
>>  	return wrote;
>> diff --git a/fs/super.c b/fs/super.c
>> index c2ff475..940696d 100644
>> --- a/fs/super.c
>> +++ b/fs/super.c
>> @@ -215,6 +215,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
>>  	spin_lock_init(&s->s_inode_list_lock);
>>  	INIT_LIST_HEAD(&s->s_inodes_wb);
>>  	spin_lock_init(&s->s_inode_wblist_lock);
>> +	INIT_LIST_HEAD(&s->s_bdi_list);
>
> I can't find where sb's are actually added to the list.  Is that
> supposed to happen from specific filesystems?  Also, it might make
> sense to split up additions of sb_list and stat into separate patches.
>

I was going to be lazy and only add it if we cared about writing out metadata, 
and then we could expand it to everybody if somebody else had a usecase.  But I 
think maybe the b_sb_dirty list idea is a better fit overall so I can just do 
that so it makes more sense.  I can split up these patches as well, thanks,

Josef

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] writeback: allow for dirty metadata accounting
  2016-08-10 21:16     ` Josef Bacik
@ 2016-08-10 21:39       ` Tejun Heo
  0 siblings, 0 replies; 15+ messages in thread
From: Tejun Heo @ 2016-08-10 21:39 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, linux-fsdevel, kernel-team, jack, viro, dchinner,
	hch

Hello, Josef.

On Wed, Aug 10, 2016 at 05:16:03PM -0400, Josef Bacik wrote:
> > It bothers me a bit that sb's can actually be off bdi->sb_list while
> > sb_list_lock is released.  Can we make this explicit?  e.g. keep
> > separate bdi sb list for sb's pending metadata writeout (like b_dirty)
> > or by just walking the list in a smart way?
> > 
> 
> Yeah I wasn't super thrilled with this either, but since I'm only using it
> for writeback I figured it was ok for now.  I suppose I could make it less
> sb_list and just make it dirty_sb_list, and only add if the super says it
> has dirty metadata.  Does that sound better?  Then us being off of the list
> during writeback isn't that big of a deal.

Yeah, that feels more logical to me.

> > I can't find where sb's are actually added to the list.  Is that
> > supposed to happen from specific filesystems?  Also, it might make
> > sense to split up additions of sb_list and stat into separate patches.
> > 
> 
> I was going to be lazy and only add it if we cared about writing out
> metadata, and then we could expand it to everybody if somebody else had a
> usecase.  But I think maybe the b_sb_dirty list idea is a better fit overall
> so I can just do that so it makes more sense.  I can split up these patches
> as well, thanks,

b_sb_dirty list sounds good and we wouldn't have to worry too much
about lifetime issues either as the list would be linked iff the sb is
dirty.  It might still make sense to ensure that the list is unlinked
when sb is putting the bdi tho.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-08-10 21:40 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-08-09 19:08 [PATCH 0/2][RFC] Provide accounting for dirty metadata Josef Bacik
2016-08-09 19:08 ` [PATCH 1/2] remove mapping from balance_dirty_pages*() Josef Bacik
2016-08-09 19:30   ` kbuild test robot
2016-08-09 19:32   ` kbuild test robot
2016-08-09 20:12   ` kbuild test robot
2016-08-09 20:50   ` kbuild test robot
2016-08-10  8:27   ` Jan Kara
2016-08-10  8:29     ` Jan Kara
2016-08-10 19:56   ` Tejun Heo
2016-08-09 19:08 ` [PATCH 2/2] writeback: allow for dirty metadata accounting Josef Bacik
2016-08-10 10:09   ` Jan Kara
2016-08-10 14:05     ` Josef Bacik
2016-08-10 20:12   ` Tejun Heo
2016-08-10 21:16     ` Josef Bacik
2016-08-10 21:39       ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).