From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Morton Subject: Re: [PATCH] 2.5.x write_super is not for syncing Date: Mon, 02 Dec 2002 14:07:00 -0800 Sender: linux-fsdevel-owner@vger.kernel.org Message-ID: <3DEBD984.BACA3099@digeo.com> References: <1034791206.18503.68.camel@tiny> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: "linux-fsdevel@vger.kernel.org" Return-path: Received: from digeo-nav01.digeo.com (digeo-nav01.digeo.com [192.168.1.233]) by packet.digeo.com (8.9.3+Sun/8.9.3) with SMTP id OAA12197 for ; Mon, 2 Dec 2002 14:07:02 -0800 (PST) To: Chris Mason List-Id: linux-fsdevel.vger.kernel.org Six weeks ago, Chris Mason wrote: > > Hello all, > > This patch adds a commit_super super operation that allows the journaled > filesystems to do something different for the periodic write_super calls > and sync. This is the "right fix" for the ext3 data=journal problem. > Based on comments from Al about my last patch, alloc_super sets a > default empty super_operations struct on each super. This allows us to > get rid of all checks for sb->s_ops == NULL. > > sync_supers is changed so that it doesn't loop on a single FS if the > write_super call leaves sb->s_dirt set. I did this by changing > generic_shutdown_super to use list_del_init(&sb->s_list), which allows > us to check for supers that have been removed from the super_blocks list > while we slept. The idea came from an sgi patch Hugh Dickins sent me. > > Anyway, this is against 2.5.43, please review: > > --- 1.161/fs/buffer.c Tue Oct 8 14:40:47 2002 > +++ edited/fs/buffer.c Wed Oct 16 11:39:04 2002 > @@ -217,8 +217,10 @@ > sync_inodes_sb(sb, 0); > DQUOT_SYNC(sb); > lock_super(sb); > - if (sb->s_dirt && sb->s_op && sb->s_op->write_super) > + if (sb->s_dirt && sb->s_op->write_super) > sb->s_op->write_super(sb); > + if (sb->s_dirt && sb->s_op->commit_super) > + sb->s_op->commit_super(sb); Except for the s_dirt test in here. If s_dirt is zero and we have dirty inodes, the filesystem _still_ is not told what to do. I don't think we'll ever get this right until we start telling the filesystem what's happening. So instead of all these little presumptuous micro-syncs which we're doing in there, we need to turn this inside out and just call sb->s_op->sync_everything_for_umount() and let the fs decide how to get everything tight on disk. That's a bit drastic. At a minimum we need to remove that s_dirt test and make the commit_super() call unconditional. What would that break?