From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nick Piggin
Subject: Re: [PATCH 6/6] writeback: limit write_cache_pages integrity scanning to current EOF
Date: Fri, 28 May 2010 15:06:55 +1000
Message-ID: <20100528050655.GY22536@laptop>
References: <1274784852-30502-1-git-send-email-david@fromorbit.com> <1274784852-30502-7-git-send-email-david@fromorbit.com> <20100527143341.d4258798.akpm@linux-foundation.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Dave Chinner, linux-kernel@vger.kernel.org, xfs@oss.sgi.com, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, tytso@mit.edu, jens.axboe@oracle.com
To: Andrew Morton
Return-path:
Received: from cantor2.suse.de ([195.135.220.15]:48388 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751088Ab0E1FHP (ORCPT ); Fri, 28 May 2010 01:07:15 -0400
Content-Disposition: inline
In-Reply-To: <20100527143341.d4258798.akpm@linux-foundation.org>
Sender: linux-ext4-owner@vger.kernel.org
List-ID:

On Thu, May 27, 2010 at 02:33:41PM -0700, Andrew Morton wrote:
> On Tue, 25 May 2010 20:54:12 +1000
> Dave Chinner wrote:
>
> > From: Dave Chinner
> >
> > sync can currently take a really long time if a concurrent writer is
> > extending a file. The problem is that the dirty pages on the address
> > space grow in the same direction as write_cache_pages scans, so if
> > the writer keeps ahead of writeback, the writeback will not
> > terminate until the writer stops adding dirty pages.

...

> That being said, I think the patch is insufficient. If I create an
> enormous (possibly sparse) file with a 16TB hole (or a run of clean
> pages) in the middle and then start busily writing into that hole (run
> of clean pages), the problem will still occur.

Yep.

> One obvious fix for that (a) would be to add another radix-tree tag and
> do two passes across the radix-tree.

Yes this is the method I tried. Jan has taken it further and should have
the latest patches around.
A good test case for the starvation would be helpful.

> Another fix (b) would be to track the number of dirty pages per
> address_space, and only write that number of pages.
>
> Another fix would be to work out how the code handled this situation
> before we broke it, and restore that in some fashion. I guess fix (b)
> above kinda does that.

I took that out (and offered fix (a) as a replacement, but it was turned
down at the time). Because b stands for broken.

IIRC we were writing out no more than 2x the dirty pages of the file
during sync. The problem with that is that more pages can be dirtied
after we calculate the number, and then we might write out those newly
dirty pages and miss old dirty pages.
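
To make the difference between the approaches concrete, here is a toy
userspace model of the three behaviours discussed above. It is only a
sketch, not kernel code: the set of page indices, the writer callback and
all function names are made up for illustration, and the 2x budget is
taken from the "IIRC" recollection above rather than from the old source.

```python
def hole_writer(dirty, pos):
    """Hypothetical concurrent writer: dirties the page at the scan position."""
    dirty.add(pos)


def next_dirty(dirty, pos):
    """Lowest dirty page index at or above pos, or None if there is none."""
    return min((p for p in dirty if p >= pos), default=None)


def naive_sync(dirty, writer, max_steps):
    """Single ascending scan that writes whatever is dirty when it gets there."""
    written, pos = [], 0
    for _ in range(max_steps):
        page = next_dirty(dirty, pos)
        if page is None:
            return written, True       # scan ran out of dirty pages
        dirty.discard(page)
        written.append(page)
        pos = page + 1
        writer(dirty, pos)             # writer keeps ahead of the scan
    return written, False              # gave up: the scan never caught up


def tagged_sync(dirty, writer):
    """Fix (a): pass 1 tags the currently dirty pages, pass 2 writes only those."""
    to_write = sorted(dirty)           # snapshot stands in for the extra tag
    written = []
    for page in to_write:
        if page in dirty:
            dirty.discard(page)
            written.append(page)
        writer(dirty, page + 1)
    return written


def counted_sync(dirty, writer, factor=2):
    """Fix (b): write at most factor * (dirty pages counted at the start)."""
    budget = factor * len(dirty)
    written, pos = [], 0
    while budget > 0:
        page = next_dirty(dirty, pos)
        if page is None:
            break
        dirty.discard(page)
        written.append(page)
        pos = page + 1
        budget -= 1
        writer(dirty, pos)
    return written


if __name__ == "__main__":
    old = {0, 1, 1000, 1001}           # dirty before sync; hole in between
    print(naive_sync(set(old), hole_writer, 100)[1])
    print(tagged_sync(set(old), hole_writer))
    print(counted_sync(set(old), hole_writer))
```

In this model the naive scan is still running when it hits the step
limit, the counted variant spends its whole budget on pages dirtied
inside the hole and never reaches pages 1000 and 1001, and the tagged
variant writes exactly the four pages that were dirty when sync started.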