From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Morton Subject: Re: [patch 9/9] mm: do_sync_mapping_range integrity fix Date: Fri, 31 Oct 2008 03:04:32 -0700 Message-ID: <20081031030432.426e3c51.akpm@linux-foundation.org> References: <20081028144715.683011000@suse.de> <20081028145734.706927000@nick.local0.net> <20081030161344.0ed5ca52.akpm@linux-foundation.org> <20081031091616.GF19268@wotan.suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: linux-fsdevel@vger.kernel.org, david@fromorbit.com, chris.mason@oracle.com To: Nick Piggin Return-path: Received: from smtp1.linux-foundation.org ([140.211.169.13]:52546 "EHLO smtp1.linux-foundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750870AbYJaKEx (ORCPT ); Fri, 31 Oct 2008 06:04:53 -0400 In-Reply-To: <20081031091616.GF19268@wotan.suse.de> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Fri, 31 Oct 2008 10:16:16 +0100 Nick Piggin wrote: > > > So all we need to guarantee here is that > > __filemap_fdatawrite_range(WB_SYNC_NONE) will start writeout on all > > dirty pages in the range. Probably that gets broken lower down as part > > of various hacks^woptimisations have gone in, but which ones, and > > where? Perhaps _this_ (if it's there) is what should be fixed. > > WB_SYNC_NONE has never (until this was introduced) been used for data > integrity AFAIKS. There is code littered throughout fs/ which assumes > WB_SYNC_NONE ~= "efficiency / background writeback". At least > definitely the buffer.c "trylock" will cause dirty pages to be > skipped. The change seriously wrecks sync_file_range(SYNC_FILE_RANGE_WRITE), the whole point of which is to as-nonblockingly-as-possible shove some data into the queues. This is useful. Perhaps we could use WB_SYNC_ALL if SYNC_FILE_RANGE_WAIT_BEFORE was specified, or WB_SYNC_NONE if it was not (need to think about SYNC_FILE_RANGE_WAIT_AFTER, too). That's a bit grubby, because userspace could do sync_file_range(SYNC_FILE_RANGE_WAIT_BEFORE); sync_file_range(SYNC_FILE_RANGE_WRITE); expecting it to have the same behaviour as sync_file_range(SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE); How the hell did that stupid sync_mode thing get into writeback_control? :( We should get rid of WB_SYNC_FOO and migrate to better-defined writeback_control fields.