From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Morton
Subject: Re: [patch 6/6] mm: fsync livelock avoidance
Date: Thu, 11 Dec 2008 14:45:02 -0800
Message-ID: <20081211144502.28ab9036.akpm@linux-foundation.org>
References: <20081210072454.GB27096@wotan.suse.de> <20081210074209.GG27096@wotan.suse.de> <20081211135111.cada5b8b.akpm@linux-foundation.org> <20081211223213.GC8294@wotan.suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: linux-fsdevel@vger.kernel.org, mpatocka@redhat.com
To: Nick Piggin
Return-path:
Received: from smtp1.linux-foundation.org ([140.211.169.13]:59307 "EHLO smtp1.linux-foundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756877AbYLKWub (ORCPT ); Thu, 11 Dec 2008 17:50:31 -0500
In-Reply-To: <20081211223213.GC8294@wotan.suse.de>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Thu, 11 Dec 2008 23:32:13 +0100
Nick Piggin wrote:

> On Thu, Dec 11, 2008 at 01:51:11PM -0800, Andrew Morton wrote:
> > On Wed, 10 Dec 2008 08:42:09 +0100
> > Nick Piggin wrote:
> >
> > >
> > > This lock also solves a real data integrity problem that I only noticed as
> > > I was writing the livelock avoidance code. If we consider the lock as the
> > > solution to this bug, this makes the livelock avoidance code much more
> > > attractive because then it does not introduce the new lock.
> > >
> > > The bug is that fsync errors do not get propagated back up to the caller
> > > properly in some cases. Consider where we write a page in the writeout path,
> > > then it encounters an IO error and finishes writeback, in the meantime, another
> > > process (eg. via sys_sync, or another fsync) clears the mapping error bits.
> > > Then our fsync will have appeared to finish successfully, but actually should
> > > have returned error.
> >
> > Has *anybody* *ever* complained about this behaviour? I think maybe
> > one person after sixish years?
>
> The livelock behaviour? (or the error propagation).
>
> I first heard about it from Mikulas, where some dm tool locks up because
> it does direct IO on the block device of a mounted filesystem (or something
> like that).

Does it actually lock up? Or does it just take a loooong time?
Presumably it can be worked around in userspace.

> That case is actually mostly solved by my first patch in the
> series.

mm-direct-io-starvation-improvement.patch? I guess that would help a
lot. I can't imagine why we didn't do that years ago??? Can we please
determine whether that optimisation was sufficient for Mikulas's example?

> > Why fix it?
>
> Good question. My earlier patches already in your tree removed some starvation
> avoidance code because they were breaking data integrity semantics. So in
> theory, your tree today is more susceptible to this sync/fsync starvation
> than mainline. I care most about the correctness, and it would be great if
> nobody cares about this starvation problem so we don't need the extra
> complexity.

Yes, it does add quite a bit of complexity and more code. It'd be good
if we could find some way of avoiding merging it.
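
For readers following the error-propagation race quoted earlier in this message, here is a minimal userspace sketch of the interleaving, not kernel code. It assumes a single sticky error flag per mapping that is cleared by whichever caller checks it first. The names Mapping, end_writeback, test_and_clear_error and fsync_result are hypothetical and chosen only for illustration. The "serialized" case stands in for holding a per-mapping lock across the write and the wait, which is how the quoted text suggests the lock would help.

```python
class Mapping:
    """Toy stand-in for a file mapping carrying one sticky error bit."""

    def __init__(self):
        self.io_error = False  # hypothetical flag, models "mapping error bits"

    def end_writeback(self, failed):
        # Writeback completion records a failure on the mapping.
        if failed:
            self.io_error = True

    def test_and_clear_error(self):
        # Whoever checks the flag also clears it, so only one caller sees it.
        seen, self.io_error = self.io_error, False
        return seen


def fsync_result(mapping):
    """Return 'EIO' if this caller observes the error, else 'ok'."""
    return "EIO" if mapping.test_and_clear_error() else "ok"


def racy_interleaving():
    m = Mapping()
    m.end_writeback(failed=True)  # our page hits an IO error
    other = fsync_result(m)       # another sync/fsync consumes the error
    ours = fsync_result(m)        # our fsync now sees a clean mapping
    return other, ours


def serialized_interleaving():
    m = Mapping()
    # Assumption: a per-mapping lock keeps the other caller out between
    # the failed writeback and our own check of the flag.
    m.end_writeback(failed=True)
    ours = fsync_result(m)
    other = fsync_result(m)
    return other, ours


if __name__ == "__main__":
    print("racy       (other, ours):", racy_interleaving())
    print("serialized (other, ours):", serialized_interleaving())
```

In the racy ordering the caller whose page failed is told everything is fine, which is the lost-error case described in the quoted text. The sketch says nothing about what the second caller ought to see; it only shows who consumes the flag.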