From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758949Ab2CHXNF (ORCPT ); Thu, 8 Mar 2012 18:13:05 -0500 Received: from mail-tul01m020-f174.google.com ([209.85.214.174]:58602 "EHLO mail-tul01m020-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753165Ab2CHXNC (ORCPT ); Thu, 8 Mar 2012 18:13:02 -0500 Message-ID: <4F593CF8.2000105@amacapital.net> Date: Thu, 08 Mar 2012 15:12:56 -0800 From: Andy Lutomirski User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.1) Gecko/20120216 Thunderbird/10.0.1 MIME-Version: 1.0 To: Jan Kara CC: Andrew Morton , linux-mm@kvack.org, LKML , Al Viro , linux-fsdevel@vger.kernel.org, dchinner@redhat.com, Jaya Kumar , Sage Weil , ceph-devel@vger.kernel.org, Steve French , linux-cifs@vger.kernel.org, Eric Van Hensbergen , Ron Minnich , Latchesar Ionkov , v9fs-developer@lists.sourceforge.net, Miklos Szeredi , fuse-devel@lists.sourceforge.net, Steven Whitehouse , cluster-devel@redhat.com, Greg Kroah-Hartman Subject: Re: [PATCH 00/11 v2] Push file_update_time() into .page_mkwrite References: <1330602103-8851-1-git-send-email-jack@suse.cz> In-Reply-To: <1330602103-8851-1-git-send-email-jack@suse.cz> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 03/01/2012 03:41 AM, Jan Kara wrote: > Hello, > > to provide reliable support for filesystem freezing, filesystems need to have > complete control over when metadata is changed. In particular, > file_update_time() calls from page fault code make it impossible for > filesystems to prevent inodes from being dirtied while the filesystem is > frozen. > > To fix the issue, this patch set changes page fault code to call > file_update_time() only when ->page_mkwrite() callback is not provided. If the > callback is provided, it is the responsibility of the filesystem to perform > update of i_mtime / i_ctime if needed. We also push file_update_time() call > to all existing ->page_mkwrite() implementations if the time update does not > obviously happen by other means. If you know your filesystem does not need > update of modification times in ->page_mkwrite() handler, please speak up and > I'll drop the patch for your filesystem. > > As a side note, an alternative would be to remove call of file_update_time() > from page fault code altogether and require all filesystems needing it to do > that in their ->page_mkwrite() implementation. That is certainly possible > although maybe slightly inefficient and would require auditting 100+ > vm_operations_structs *shiver*. IMO updating file times should happen when changes get written out, not when a page is made writable, for two reasons: 1. Correctness. With the current approach, it's very easy for files to be changed after the last mtime update -- any changes between mkwrite and actual writeback won't affect mtime. 2. Performance. I have an application (presumably guessable from my email address) for which blocking in page_mkwrite is an absolute show-stopper. (In fact it's so bad that we reverted back to running on Windows until I hacked up a kernel to not do this.) I have an incorrect patch [1] to fix it, but I haven't gotten around to a real fix. (I also have stable pages reverted in my kernel. Some day I'll submit a patch to make it a filesystem option. Or maybe it should even be a block device / queue property like the alignment offset and optimal io size -- there are plenty of block device and file combinations which don't benefit at all from stable pages.) I'd prefer if file_update_time in page_mkwrite didn't proliferate. A better fix is probably to introduce a new inode flag, update it when a page is undirtied, and then dirty and write the inode from the writeback path. (Kind of like my patch, but with an inode flag instead of a page flag, and with the file_update_time done from the fs.) [1] http://patchwork.ozlabs.org/patch/122516/ --Andy