From: Andreas Dilger
Subject: Re: [PATCH 0/4] Fiemap, an extent mapping ioctl - round 2
Date: Fri, 04 Jul 2008 02:49:20 -0600
Message-ID: <20080704084920.GP6239@webber.adilger.int>
References: <20080625221835.GQ28100@wotan.suse.de> <486CE430.9010902@hp.com> <20080703151731.GD1390@shareable.org>
In-reply-to: <20080703151731.GD1390@shareable.org>
To: Jamie Lokier
Cc: jim owens, linux-fsdevel@vger.kernel.org, mfasheh@suse.com

On Jul 03, 2008 16:17 +0100, Jamie Lokier wrote:
> jim owens wrote:
> > FIEMAP_EXTENT_NO_BYPASS
> >
> > As in "you can't bypass the filesystem" to directly access it.
>
> Can we also commit to this, when FIEMAP_EXTENT_NO_BYPASS is *not* set:
>
>    1. The data at fe_physical, and *will not move* so long as nothing
>       modifies *that particular file*?
>
>    2. Both reading *and writing* the file bypassing the filesystem are ok.

I don't think any such guarantee can be made. What if the file is
truncated and rewritten after the FIEMAP is called? The filesystem
can't guarantee that will not happen.
I think the only way to make sure of constant mapping is to call FIEMAP
before and after the blocks are read.

> The reason for 2 is that some filesystems checksum the data and/or
> replicate it, and won't be readable if you write to it directly.

EEEEEK. The _intent_ of FIEMAP is mostly for reporting fragmentation,
and possibly to allow a "generic" defragmenter to be written. At an
outside stretch I could imagine some tools like "dump" wanting direct
read access to the file data.

Directly writing underneath a filesystem is major bad news and will
likely corrupt the filesystem because you can never be sure that there
aren't dirty pages in the page cache that will overwrite your "direct"
write, or that your write isn't racy with an unlink or truncate.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
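
The before-and-after check suggested above could be sketched roughly as follows. This assumes the FIEMAP interface as it was later merged into mainline Linux (FS_IOC_FIEMAP, struct fiemap, and FIEMAP_FLAG_SYNC from <linux/fiemap.h>), which may differ from the patch under review here. The helper names get_map, same_map, and read_with_stable_map, the reader callback, and the MAX_EXTENTS cap are hypothetical and chosen only for illustration.

```c
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdlib.h>
#include <sys/ioctl.h>

/* Hypothetical cap; files with more extents are only partially mapped. */
#define MAX_EXTENTS 64

/* Fetch the extent map of the whole file. Returns NULL on failure;
 * the caller frees the result. */
struct fiemap *get_map(int fd)
{
    size_t sz = sizeof(struct fiemap) +
                MAX_EXTENTS * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, sz);

    if (!fm)
        return NULL;
    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;   /* map to end of file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;     /* flush dirty data before mapping */
    fm->fm_extent_count = MAX_EXTENTS;
    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        free(fm);
        return NULL;
    }
    return fm;
}

/* Return 1 if both maps describe the same extents, 0 otherwise. */
int same_map(const struct fiemap *a, const struct fiemap *b)
{
    if (a->fm_mapped_extents != b->fm_mapped_extents)
        return 0;
    for (unsigned int i = 0; i < a->fm_mapped_extents; i++) {
        const struct fiemap_extent *x = &a->fm_extents[i];
        const struct fiemap_extent *y = &b->fm_extents[i];

        if (x->fe_logical != y->fe_logical ||
            x->fe_physical != y->fe_physical ||
            x->fe_length != y->fe_length ||
            x->fe_flags != y->fe_flags)
            return 0;
    }
    return 1;
}

/* Map, let the caller-supplied reader fetch the blocks, then map again.
 * Returns 1 if the mapping was unchanged, 0 if it changed (the data read
 * should be discarded), -1 on error. */
int read_with_stable_map(int fd,
                         int (*reader)(const struct fiemap *, void *),
                         void *ctx)
{
    int rc = -1;
    struct fiemap *after = NULL;
    struct fiemap *before = get_map(fd);

    if (!before)
        return -1;
    if (reader(before, ctx) == 0) {
        after = get_map(fd);
        if (after)
            rc = same_map(before, after);
    }
    free(before);
    free(after);
    return rc;
}
```

Even when the two maps match, this is only a best-effort check for a read-only tool: the file could in principle be rewritten and land on the same blocks between the two calls. It does nothing to make writing underneath the filesystem safe.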