From mboxrd@z Thu Jan 1 00:00:00 1970
From: jim owens
Subject: Re: [PATCH 0/4] Fiemap, an extent mapping ioctl - round 2
Date: Wed, 02 Jul 2008 19:48:07 -0400
Message-ID: <486C13B7.4030402@hp.com>
References: <20080625221835.GQ28100@wotan.suse.de> <1214489061.6237.16.camel@norville.austin.ibm.com> <4863A483.5060303@redhat.com> <1214490465.6237.24.camel@norville.austin.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
To: linux-fsdevel@vger.kernel.org
Return-path:
Received: from g5t0009.atlanta.hp.com ([15.192.0.46]:15938 "EHLO
	g5t0009.atlanta.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751492AbYGBXsN (ORCPT );
	Wed, 2 Jul 2008 19:48:13 -0400
Received: from g4t0009.houston.hp.com (g4t0009.houston.hp.com [16.234.32.26])
	by g5t0009.atlanta.hp.com (Postfix) with ESMTP id C3770300C8
	for ; Wed, 2 Jul 2008 23:48:12 +0000 (UTC)
Received: from ldl.fc.hp.com (ldl.fc.hp.com [15.11.146.30])
	by g4t0009.houston.hp.com (Postfix) with ESMTP id 939C57C34A
	for ; Wed, 2 Jul 2008 23:48:12 +0000 (UTC)
In-Reply-To: <1214490465.6237.24.camel@norville.austin.ibm.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

I'm back from vacation and ready to cause fiemap() trouble.

Dave Kleikamp wrote:
> On Thu, 2008-06-26 at 09:15 -0500, Eric Sandeen wrote:
>
>>>SYNC really doesn't look like it belongs, and it's only there so that
>>>the new ioctl acts like the xfs ioctl.
>>
>>I disagree, while it may have been inspired by the xfs behavior, it's
>>not at all xfs specific.
>>
>>If a filesystem implements delalloc, you may want to know which ranges
>>are still delalloc in the fiemap output, or you may want to put them on
>>disk and know the actual physical location.  And if you want a snapshot
>>of an actual, consistent layout of the file at a point in time, then you
>>need an atomic sync+map - for any filesystem.
>
> This makes sense.  In fact, I could see always doing the sync if there
> are delalloc blocks to ensure that the location of the blocks will
> always be returned.
>
> I guess I was put off by Andreas' response that FIEMAP_FLAG_SYNC is
> there because xfsbmap had it "isn't harmful either".  This seemed a bit
> weak, but I see that there is a better justification than just that.

I say IT IS HARMFUL to have the FIEMAP_FLAG_SYNC.

The email trail points out how this so-called atomic sync+map
will lead programmers to write bad code because it leads
them to think there is some valuable guarantee of consistency
by using the SYNC flag.

This is not true.  The fiemap by itself is equivalent in all
cases to reading multiple disk blocks, while someone else is
writing some random subset of the same blocks.  You have data,
but it is not a clean "before" or "after" picture.

The only way to get a true useful snapshot is to have a set
of commands doing:

    freeze_metadata()
    read_metadata()
    ... userspace operate on metadata ...
    unfreeze_metadata()

If you are going to define fiemap to have an internal
freeze_metadata(), then I say that is even MORE HARMFUL
because it makes every (de)allocate/(de)compress/move
code path take a giant lock just so fiemap can get a
static picture that encompasses all in-range extents.

And that static picture can be invalid the moment the
giant lock is released.

jim
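
To make the "not a clean before or after picture" point concrete, here
is a rough userspace sketch.  It does not call the real ioctl; every
name in it (sketch_extent, sketch_file, sketch_map_range,
sketch_relocate, sketch_torn_map) is made up for illustration, and it
models one mapping pass as two batches with a relocation landing in
between.  That is only an assumed stand-in for what an allocator or
defragmenter could do while a mapper walks a file without any freeze.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NEXTENTS 4

/* Hypothetical, simplified extent record: logical offset -> physical block. */
struct sketch_extent {
    unsigned long logical;
    unsigned long physical;
};

/* Toy file layout that a concurrent writer may change at any time. */
struct sketch_file {
    struct sketch_extent ext[NEXTENTS];
};

/* Copy extents [from, to) into out, as one batch of a mapping pass might. */
static void sketch_map_range(const struct sketch_file *f, size_t from,
                             size_t to, struct sketch_extent *out)
{
    assert(from <= to && to <= NEXTENTS);
    for (size_t i = from; i < to; i++)
        out[i] = f->ext[i];
}

/* Stand-in for an allocator or defragmenter moving every extent. */
static void sketch_relocate(struct sketch_file *f, unsigned long delta)
{
    for (size_t i = 0; i < NEXTENTS; i++)
        f->ext[i].physical += delta;
}

/*
 * Map the file in two batches with a relocation in between.
 * Returns 1 if the collected map matches neither the layout before
 * the relocation nor the layout after it.
 */
static int sketch_torn_map(void)
{
    struct sketch_file f = { {
        { 0, 100 }, { 4096, 101 }, { 8192, 102 }, { 12288, 103 }
    } };
    struct sketch_file before, after;
    struct sketch_extent seen[NEXTENTS];

    before = f;
    sketch_map_range(&f, 0, 2, seen);        /* first half: old layout  */
    sketch_relocate(&f, 1000);               /* writer sneaks in        */
    sketch_map_range(&f, 2, NEXTENTS, seen); /* second half: new layout */
    after = f;

    return memcmp(seen, before.ext, sizeof(seen)) != 0 &&
           memcmp(seen, after.ext, sizeof(seen)) != 0;
}
```

In this toy model sketch_torn_map() reports a map that is half old
layout and half new.  A sync before the first batch would not change
that, since the relocation happens after it; only holding the layout
still for the whole pass (the freeze/unfreeze pair above) would.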