From mboxrd@z Thu Jan  1 00:00:00 1970
From: jim owens <jowens@hp.com>
Subject: Re: [PATCH 0/4] Fiemap, an extent mapping ioctl - round 2
Date: Mon, 07 Jul 2008 19:01:24 -0400
Message-ID: <4872A044.7070001@hp.com>
References: <20080625221835.GQ28100@wotan.suse.de> <1214489061.6237.16.camel@norville.austin.ibm.com> <4863A483.5060303@redhat.com> <1214490465.6237.24.camel@norville.austin.ibm.com> <486C13B7.4030402@hp.com> <20080703111726.GZ29319@disturbed> <486CC446.1050602@hp.com> <0371AF9C-A83A-4584-83D2-6EE9DCEACD77@cam.ac.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Cc: linux-fsdevel@vger.kernel.org
To: Anton Altaparmakov <aia21@cam.ac.uk>,
	Dave Chinner <david@fromorbit.com>
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from g4t0016.houston.hp.com ([15.201.24.19]:32946 "EHLO
	g4t0016.houston.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1757616AbYGGXBx (ORCPT
	<rfc822;linux-fsdevel@vger.kernel.org>);
	Mon, 7 Jul 2008 19:01:53 -0400
In-Reply-To: <0371AF9C-A83A-4584-83D2-6EE9DCEACD77@cam.ac.uk>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

Anton Altaparmakov wrote:

> It is completely irrelevant whether the information is still valid  
> after the fiemap returns.

So if that is true, any XFS utility that does more than PRINT
the extent map based on doing JUST a fiemap is subject to
erroneous results.

I agree with everyone who says that to do useful work with
the output of fiemap, you need a set of syscall functions
that have this effect:

    mandatory_exclusive_file_lock();
      [optional] fsync(); or force_allocation();
    fiemap();
      [do ugly userspace stuff]
    release_mandatory_exclusive_file_lock();

Without the locking steps, any code that acts on the
fiemap output is just guessing.  If XFS utilities do an
unlocked fiemap, it doesn't matter that they have forced
an atomic fsync; their extent map is no more valid than
in the non-atomic case.  So why bother having it allocate
and sync storage (besides so you don't have to add code
to handle unknown extent types)?
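
The locked sequence above might be sketched in C roughly as follows.
This is only an illustration under stated assumptions: flock(LOCK_EX)
is an ADVISORY lock standing in for the mandatory exclusive lock, so
it only excludes cooperating processes; FS_IOC_FIEMAP and struct
fiemap are the names from <linux/fs.h> and <linux/fiemap.h> in later
mainline kernels, not necessarily those of the patch under discussion;
and with_locked_extent_map(), extent_fn and count_extents() are
hypothetical names invented for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/file.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

/* Hypothetical callback type: the "ugly userspace stuff" that runs
 * while the lock is still held. */
typedef int (*extent_fn)(const struct fiemap_extent *ext,
                         unsigned int count, void *arg);

/* Example callback: stores the number of mapped extents in *arg. */
int count_extents(const struct fiemap_extent *ext,
                  unsigned int count, void *arg)
{
    (void)ext;
    *(unsigned int *)arg = count;
    return 0;
}

/* Hypothetical helper: lock, fsync, fiemap, run fn, unlock.
 * Returns fn's result, or -1 with errno set on failure. */
int with_locked_extent_map(const char *path, unsigned int max_extents,
                           extent_fn fn, void *arg)
{
    struct fiemap *fm = NULL;
    int ret = -1;
    int saved_errno;
    int fd;

    if (fn == NULL || max_extents == 0) {
        errno = EINVAL;
        return -1;
    }

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* ASSUMPTION: advisory lock as a stand-in for a mandatory one. */
    if (flock(fd, LOCK_EX) < 0)
        goto out_close;

    /* Flush dirty data first so the map reflects on-disk state. */
    if (fsync(fd) < 0)
        goto out_unlock;

    fm = calloc(1, sizeof(*fm) +
                   max_extents * sizeof(struct fiemap_extent));
    if (fm == NULL)
        goto out_unlock;
    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;   /* map the whole file */
    fm->fm_extent_count = max_extents;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0)
        goto out_unlock;

    /* Act on the map only while the lock is held. */
    ret = fn(fm->fm_extents, fm->fm_mapped_extents, arg);

out_unlock:
    saved_errno = errno;
    free(fm);
    flock(fd, LOCK_UN);
    errno = saved_errno;
out_close:
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return ret;
}
```

A caller would pass its own callback, for example
with_locked_extent_map(path, 32, count_extents, &n).  The callback
form is a deliberate choice here: returning the map to the caller
after dropping the lock would put it back in the "just guessing" case.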

Dave Chinner wrote:
> On Fri, Jul 04, 2008 at 01:13:25PM +0100, Jamie Lokier wrote:
>>You can only read blocks if the mapping remains stable after returning
>>it, which means the application _must_ ensure no process is modifying
>>the file, and that it's on a filesystem which doesn't arbitrarily move
>>blocks when it feels like it.
> 
> Like:
> 
> # xfs_freeze -f <mntpt>
> # xfs_bmap -vvp <file>
> # <do something nasty with direct block access>
> # xfs_freeze -u <mntpt>
> 
>>You've explained that it does provide a
>>guarantee: the resulting map will be valid for a consistent snapshot
>>of the file at some instant in time during the FIEMAP call.  In other
>>words, with concurrent modifiers, atomic sync+map ensures no delalloc
>>regions (is there anything else?) in the map, while fsync() + map gets
>>close but does not ensure it.
> 
> Synchronisation with direct I/O, ensures unwritten extent conversion
> completion with concurrent async direct I/O before mapping, space
> preallocation, etc.

So the sequence above seems to match my locked sequence and
only needs the fsync() instead of counting on fiemap-with-sync.

However, I will point out that the FREEZE-FILESYSTEM commands
I am used to (which I assume is your semantic, since it takes
<mntpt>) do not allow any metadata changes on the storage.
This is because the device snapshot code needs it stable.

So if xfs_bmap and fiemap() are expected to ignore freeze and
change metadata to do allocations, that is semantically incorrect too.

jim