From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zach Brown
Subject: Re: Efficient handling of sparse files
Date: Mon, 28 Feb 2005 11:55:54 -0800
Message-ID: <4223774A.5020906@zabbo.net>
References: <20050228174149.GA28741@parcelfarce.linux.theplanet.co.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: linux-fsdevel@vger.kernel.org
Received: from tetsuo.zabbo.net ([207.173.201.20]:5575 "EHLO tetsuo.zabbo.net") by vger.kernel.org with ESMTP id S261680AbVB1Tz6 (ORCPT ); Mon, 28 Feb 2005 14:55:58 -0500
To: Matthew Wilcox
In-Reply-To: <20050228174149.GA28741@parcelfarce.linux.theplanet.co.uk>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

> I was wondering if we could introduce a new system call (or ioctl?) that,
> given an fd would find the next block with data in it. We could use the
> ->bmap method ... except that has dire warnings about adding new callers
> and viro may soon be in testicle-gouging range.

Hmm. What you're talking about reminds me of some ioctl()s Alex has for
ext3+extents and it feels like the pagevec apis that want to find
populated pages across the page cache index space. Sooo.

	struct fs_extent {
		u64 file_start;
		u64 block_start;
		u64 contig;
	};

(I don't really care if those are in bytes or blocks or whatever.
someone with strong opinions can pick a unit :))

	long sys_find_extents_please(int fd, off_t file_start,
				     struct fs_extent *extents,
				     long nr_extents);

so it'll fill in as many extent structs in the caller as it finds
contiguous regions on disk starting with the given file position,
returning the number populated.

I'd, somewhat obviously, want to push this into an fs method perhaps
with a generic_ that just spins on bmap(). I think this would let Alex
kill his ioctl() and ocfs2 could certainly fill this with reasonable
results.
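
For concreteness, here is a rough userspace sketch of what a generic_
helper that just spins on bmap() might look like. Only struct fs_extent
comes from the proposal above. The names generic_find_extents(), bmap_fn
and toy_file are hypothetical stand-ins made up for illustration, the
unit is assumed to be blocks, and a physical block of 0 is assumed to
mean a hole. It is not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extent record from the proposal; uint64_t stands in for the kernel's u64.
 * Units are assumed to be blocks (the proposal leaves the unit open). */
struct fs_extent {
	uint64_t file_start;	/* first logical block of the run */
	uint64_t block_start;	/* first physical block of the run */
	uint64_t contig;	/* number of contiguous blocks in the run */
};

/* Hypothetical stand-in for ->bmap(): logical block to physical block,
 * returning 0 for a hole. */
typedef uint64_t (*bmap_fn)(const void *file, uint64_t logical);

/* Toy "file" for illustration: physical block numbers indexed by logical block. */
struct toy_file {
	const uint64_t *map;
	uint64_t nr_blocks;
};

uint64_t toy_bmap(const void *file, uint64_t logical)
{
	const struct toy_file *f = file;

	return logical < f->nr_blocks ? f->map[logical] : 0;
}

/*
 * Walk logical blocks from file_start, skip holes, and coalesce runs that
 * are contiguous both logically and physically into one extent each.
 * Returns the number of extents populated (at most nr_extents).
 */
long generic_find_extents(const void *file, bmap_fn bmap, uint64_t nr_blocks,
			  uint64_t file_start, struct fs_extent *extents,
			  long nr_extents)
{
	uint64_t lblk = file_start;
	long n = 0;

	assert(extents != NULL || nr_extents <= 0);

	while (lblk < nr_blocks && n < nr_extents) {
		uint64_t pblk = bmap(file, lblk);
		struct fs_extent *e;

		if (pblk == 0) {	/* hole: nothing mapped here */
			lblk++;
			continue;
		}

		e = &extents[n++];
		e->file_start = lblk;
		e->block_start = pblk;
		e->contig = 1;
		lblk++;

		/* Extend while the next logical block maps to the next physical block. */
		while (lblk < nr_blocks && bmap(file, lblk) == pblk + e->contig) {
			e->contig++;
			lblk++;
		}
	}
	return n;
}
```

With a toy map of {0, 0, 100, 101, 102, 0, 200, 50, 51} and a start of
block 0, this would populate three extents: (file_start 2, block_start
100, contig 3), (6, 200, 1) and (7, 50, 2). A real fs method could
presumably answer from its extent tree directly instead of probing one
block at a time.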

To move into lala land, I wonder if we would want to consider the
difference between mapped blocks with data and mapped blocks that
haven't been touched and which are going to return zeros. One could
argue that it's marginally ridiculous that there isn't a shared
interface to reserve blocks without having to manually zero them. If we
did have such an interface, something like rsync doesn't actually care
that they're mapped if the fs knows that they're still just zeros.

I acknowledge that this is a bit out there :)

In any case, if you want help rolling proofs-of-concept I could lend a
few hours.

- z