Date: Tue, 8 Sep 2009 19:00:02 +0200
From: Nick Piggin
Subject: Re: Why doesn't zap_pte_range() call page_mkwrite()
Message-ID: <20090908170002.GD29902@wotan.suse.de>
In-Reply-To: <20090908163149.GB2975@think>
To: Chris Mason, Trond Myklebust, Miklos Szeredi, holt@sgi.com,
    linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org

On Tue, Sep 08, 2009 at 12:31:49PM -0400, Chris Mason wrote:
> On Tue, Sep 08, 2009 at 05:41:32PM +0200, Nick Piggin wrote:
> > It hasn't fallen completely off my radar. fsblock has the same issue
> > (although I've just been ignoring gup writes into fsblock fs for the
> > time being).
>
> Ok, I'll change my detection code a bit then.

OK.

> > I have a basic idea of what to do... It would be nice to change calling
> > convention of get_user_pages and take the page lock. Database people might
> > scream, in which case we could only take the page lock for filesystems that
> > define ->page_mkwrite (so shared mem segments avoid the overhead). Lock
> > ordering might get a bit interesting, but if we can have callers ensure they
> > always submit and release partially fulfilled requests, then we can always
> > trylock them.
>
> I think everyone will have page_mkwrite eventually, at least everyone
> who the databases will care about ;)

Ah, the problem is not where the DIO write goes, it's where the read goes :)
(i.e. the read writes into get_user_pages pages). So for databases this should
typically be shared memory segments I'd say (tmpfs), or maybe anonymous
memory.
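
The trylock scheme discussed in the quoted text can be illustrated with a
small userspace model. This is only a sketch under stated assumptions, not
kernel code and not an actual patch: struct model_page and every model_*
function are hypothetical names invented here, the page lock is reduced to a
boolean, and a contended first page simply yields an empty batch (a real
implementation could presumably sleep on it, since no other page locks are
held at that point).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel's definition. */
struct model_page {
    bool locked;           /* models the page lock */
    bool has_page_mkwrite; /* backing fs defines ->page_mkwrite */
};

/* Non-blocking lock attempt; returns true if we now hold the lock. */
static bool model_trylock_page(struct model_page *page)
{
    if (page->locked)
        return false;
    page->locked = true;
    return true;
}

/*
 * Model of a get_user_pages variant that also takes the page lock.
 * Pages whose filesystem has no ->page_mkwrite (shared memory or
 * anonymous memory in the discussion) are taken without locking.
 * Stops at the first contended page and returns how many pages the
 * caller now owns, so the caller gets a partially fulfilled request.
 */
static size_t model_gup_lock_pages(struct model_page *pages, size_t nr)
{
    size_t i;

    for (i = 0; i < nr; i++) {
        if (!pages[i].has_page_mkwrite)
            continue; /* no page lock needed for this page */
        if (!model_trylock_page(&pages[i]))
            break;    /* contended: hand back a partial batch */
    }
    return i;
}

/* Release the first nr pages of a batch obtained from the function above. */
static void model_release_pages(struct model_page *pages, size_t nr)
{
    for (size_t i = 0; i < nr; i++) {
        if (pages[i].has_page_mkwrite) {
            assert(pages[i].locked); /* we must be the lock holder */
            pages[i].locked = false;
        }
    }
}
```

A caller in this model would loop: take a batch, submit I/O for the pages it
got, release them, and then retry from the first page it did not get. Because
no page lock is held across a retry, only trylock is ever needed after the
first page, which is the property the lock-ordering argument relies on.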