From mboxrd@z Thu Jan 1 00:00:00 1970
From: jim owens
Subject: Re: [PATCH 0/4] Fiemap, an extent mapping ioctl - round 2
Date: Mon, 07 Jul 2008 19:01:24 -0400
Message-ID: <4872A044.7070001@hp.com>
References: <20080625221835.GQ28100@wotan.suse.de>
 <1214489061.6237.16.camel@norville.austin.ibm.com>
 <4863A483.5060303@redhat.com>
 <1214490465.6237.24.camel@norville.austin.ibm.com>
 <486C13B7.4030402@hp.com>
 <20080703111726.GZ29319@disturbed>
 <486CC446.1050602@hp.com>
 <0371AF9C-A83A-4584-83D2-6EE9DCEACD77@cam.ac.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Cc: linux-fsdevel@vger.kernel.org
To: Anton Altaparmakov , Dave Chinner
Return-path:
Received: from g4t0016.houston.hp.com ([15.201.24.19]:32946 "EHLO
 g4t0016.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1757616AbYGGXBx (ORCPT );
 Mon, 7 Jul 2008 19:01:53 -0400
In-Reply-To: <0371AF9C-A83A-4584-83D2-6EE9DCEACD77@cam.ac.uk>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

Anton Altaparmakov wrote:
> It is completely irrelevant whether the information is still valid
> after the fiemap returns.

So if that is true, any XFS utility that does more than PRINT
the extent map based on doing JUST a fiemap is subject to
erroneous results.

I agree with everyone who says that to do useful work with
the output of fiemap, you need a set of syscall functions
that have this effect:

    mandatory_exclusive_file_lock();
    [optional] fsync(); or force_allocation();
    fiemap();
    [do ugly userspace stuff]
    release_mandatory_exclusive_file_lock();

Without the locking steps, any code that acts on the fiemap
output is just guessing. If XFS utilities do an unlocked fiemap,
it doesn't matter that they have forced an atomic fsync: their
extent map is no more valid than in the non-atomic case.

So why bother having it allocate and sync storage (besides
so you don't have to add code to handle unknown extent types)?

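For concreteness, here is a minimal C sketch of the locked sequence
above. It assumes the FS_IOC_FIEMAP ioctl and struct fiemap as declared
in <linux/fs.h> and <linux/fiemap.h>. Two loud assumptions: flock() is
only an ADVISORY lock standing in for the mandatory exclusive lock the
sequence calls for, so it excludes cooperating processes only; and the
names with_locked_fiemap(), extent_is_directly_readable() and extent_cb
are hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/file.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_* flags */

/* Hypothetical callback type: called once per returned extent, lock held. */
typedef void (*extent_cb)(const struct fiemap_extent *ext, void *arg);

/* Hypothetical helper: nonzero if the extent has a known, written on-disk
 * location, i.e. none of the states a caller needs extra code to handle. */
int extent_is_directly_readable(uint32_t flags)
{
    return (flags & (FIEMAP_EXTENT_UNKNOWN | FIEMAP_EXTENT_DELALLOC |
                     FIEMAP_EXTENT_UNWRITTEN)) == 0;
}

/* lock -> fsync -> fiemap -> callback -> unlock.
 * Returns the extent count reported by the kernel, or -1 with errno set.
 * The callback sees at most max_extents extents. */
int with_locked_fiemap(const char *path, unsigned max_extents,
                       extent_cb cb, void *arg)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int ret = -1;
    struct fiemap *fm = NULL;

    /* ASSUMPTION: advisory flock() stands in for the mandatory lock. */
    if (flock(fd, LOCK_EX) < 0)
        goto out_close;

    if (fsync(fd) < 0)                    /* the [optional] flush step */
        goto out_unlock;

    fm = calloc(1, sizeof(*fm) + max_extents * sizeof(struct fiemap_extent));
    if (fm == NULL)
        goto out_unlock;
    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;    /* map the whole file */
    fm->fm_extent_count = max_extents;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0)
        goto out_unlock;

    /* [do ugly userspace stuff] while the lock is still held */
    for (uint32_t i = 0;
         cb != NULL && i < fm->fm_mapped_extents && i < max_extents; i++)
        cb(&fm->fm_extents[i], arg);
    ret = (int)fm->fm_mapped_extents;

out_unlock:
    {
        int saved = errno;                /* keep the first error */
        free(fm);
        flock(fd, LOCK_UN);
        errno = saved;
    }
out_close:
    {
        int saved = errno;
        close(fd);
        errno = saved;
    }
    return ret;
}
```

A caller would pass a callback that checks fe_flags with
extent_is_directly_readable() before trusting fe_physical. Even then,
the result is only as good as the lock: anything that ignores the
advisory lock can still change the map underneath you.
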
Dave Chinner wrote:
> On Fri, Jul 04, 2008 at 01:13:25PM +0100, Jamie Lokier wrote:
>>You can only read blocks if the mapping remains stable after returning
>>it, which means the application _must_ ensure no process is modifying
>>the file, and that it's on a filesystem which doesn't arbitrarily move
>>blocks when it feels like it.
>
> Like:
>
> # xfs_freeze -f
> # xfs_bmap -vvp
> #
> # xfs_freeze -u
>
>>You've explained that it does provide a
>>guarantee: the resulting map will be valid for a consistent snapshot
>>of the file at some instant in time during the FIEMAP call. In other
>>words, with concurrent modifiers, atomic sync+map ensures no delalloc
>>regions (is there anything else?) in the map, while fsync() + map gets
>>close but does not ensure it.
>
> Synchronisation with direct I/O, ensures unwritten extent conversion
> completion with concurrent async direct I/O before mapping, space
> preallocation, etc.

So the sequence above seems to match my locked sequence, and it
only needs the fsync() instead of counting on fiemap-with-sync.

However, I will point out that the FREEZE-FILESYSTEM commands
I am used to using (which I assume is your semantic as it is
using ) do not allow any metadata changes on the storage.
This is because the device snapshot code needs it stable.

So if xfs_bmap and fiemap() are expected to ignore the freeze and
change metadata to do allocations, that is semantically incorrect too.

jim