From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx2.fusionio.com ([66.114.96.31]:60452 "EHLO mx2.fusionio.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756302Ab3AHNCo (ORCPT ); Tue, 8 Jan 2013 08:02:44 -0500
Date: Tue, 8 Jan 2013 08:02:41 -0500
From: Josef Bacik
To: Miao Xie
CC: Linux Btrfs , Josef Bacik
Subject: Re: [PATCH V2] Btrfs: flush all dirty inodes if writeback can not start
Message-ID: <20130108130241.GD2389@localhost.localdomain>
References: <50D2F42D.7070600@cn.fujitsu.com> <50D826FF.8040202@cn.fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
In-Reply-To: <50D826FF.8040202@cn.fujitsu.com>
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On Mon, Dec 24, 2012 at 02:57:19AM -0700, Miao Xie wrote:
> We may try to flush some dirty pages when there is no enough space to reserve.
> But it is possible that this operation fails, in order to get enough space to
> reserve successfully, we will sync all the delalloc file. This operation is
> safe, we needn't worry about the case that the filesystem goes from r/w to r/o.
> because the filesystem should guarantee all the dirty pages have been written
> into the disk after it becomes readonly, so the sync operation will do nothing
> if the filesystem is already readonly. Though it may waste lots of time,
> as a corner case, we needn't care.
>
> Signed-off-by: Miao Xie
> ---
> Changelog v1 -> v2:
> - make the function static
> ---
>  fs/btrfs/extent-tree.c | 40 +++++++++++++++++++++++++++++++---------
>  1 file changed, 31 insertions(+), 9 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index b6ed965..2d9fe27 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3695,12 +3695,15 @@ static int can_overcommit(struct btrfs_root *root,
>  	return 0;
>  }
>
> -static int writeback_inodes_sb_nr_if_idle_safe(struct super_block *sb,
> -					       unsigned long nr_pages,
> -					       enum wb_reason reason)
> +static inline int writeback_inodes_sb_nr_if_idle_safe(struct super_block *sb,
> +						      unsigned long nr_pages,
> +						      enum wb_reason reason)
>  {
> -	if (!writeback_in_progress(sb->s_bdi) &&
> -	    down_read_trylock(&sb->s_umount)) {
> +	/* the flusher is dealing with the dirty inodes now. */
> +	if (writeback_in_progress(sb->s_bdi))
> +		return 1;
> +
> +	if (down_read_trylock(&sb->s_umount)) {
>  		writeback_inodes_sb_nr(sb, nr_pages, reason);
>  		up_read(&sb->s_umount);
>  		return 1;
> @@ -3709,6 +3712,28 @@ static int writeback_inodes_sb_nr_if_idle_safe(struct super_block *sb,
>  	return 0;
>  }
>
> +static void btrfs_writeback_inodes_sb_nr(struct btrfs_root *root,
> +					 unsigned long nr_pages)
> +{
> +	struct super_block *sb = root->fs_info->sb;
> +	int started;
> +
> +	/* If we can not start writeback, just sync all the delalloc file. */
> +	started = writeback_inodes_sb_nr_if_idle_safe(sb, nr_pages,
> +						      WB_REASON_FS_FREE_SPACE);
> +	if (!started) {
> +		/*
> +		 * We needn't worry the filesystem going from r/w to r/o though
> +		 * we don't acquire ->s_umount mutex, because the filesystem
> +		 * should guarantee the delalloc inodes list be empty after
> +		 * the filesystem is readonly(all dirty pages are written to
> +		 * the disk).
> +		 */
> +		btrfs_start_delalloc_inodes(root, 0);
> +		btrfs_wait_ordered_extents(root, 0);

We can't just call btrfs_wait_ordered_extents() here: we may have an open
transaction handle, which could deadlock us if a transaction commit starts.

Thanks,

Josef
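
For readers following the control flow under review, here is a minimal userspace sketch of the "try targeted writeback, otherwise flush everything" decision in the quoted patch. Everything in it is a hypothetical stand-in: struct fake_sb, fake_down_read_trylock(), fake_up_read(), try_start_writeback() and flush_for_space() are illustration-only names, not kernel APIs. It does not model transaction handles, so it does not reproduce the deadlock concern raised above.

```c
/*
 * Userspace sketch only. All types and functions below are hypothetical
 * stand-ins for the kernel objects named in the patch; none of this is
 * the real btrfs or VFS code.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_sb {
	bool flusher_busy;            /* models writeback_in_progress() */
	bool umount_write_held;       /* models a writer holding s_umount */
	int readers;                  /* current read holders of s_umount */
	unsigned long targeted_pages; /* pages handed to targeted writeback */
	unsigned int full_flushes;    /* times the flush-everything path ran */
};

/* Models down_read_trylock(): fails only while a writer holds the lock. */
static bool fake_down_read_trylock(struct fake_sb *sb)
{
	if (sb->umount_write_held)
		return false;
	sb->readers++;
	return true;
}

/* Models up_read(): drops one read hold. */
static void fake_up_read(struct fake_sb *sb)
{
	sb->readers--;
}

/* Returns 1 if writeback is already running or was started, else 0. */
static int try_start_writeback(struct fake_sb *sb, unsigned long nr_pages)
{
	assert(sb != NULL);

	/* The flusher is already working on the dirty inodes. */
	if (sb->flusher_busy)
		return 1;

	if (fake_down_read_trylock(sb)) {
		sb->targeted_pages += nr_pages;
		fake_up_read(sb);
		return 1;
	}

	return 0;
}

/* Falls back to a full flush only when targeted writeback cannot start. */
static void flush_for_space(struct fake_sb *sb, unsigned long nr_pages)
{
	if (!try_start_writeback(sb, nr_pages))
		sb->full_flushes++; /* stands in for the delalloc flush + wait */
}
```

In this model the fallback runs only when the flusher is idle and the trylock fails, which is the one case the patch changes; the wait inside that fallback is the step the reply above objects to.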