From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matthew Wilcox
Subject: Efficient handling of sparse files
Date: Mon, 28 Feb 2005 17:41:49 +0000
Message-ID: <20050228174149.GA28741@parcelfarce.linux.theplanet.co.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Received: from parcelfarce.linux.theplanet.co.uk ([195.92.249.252]:32169
	"EHLO parcelfarce.linux.theplanet.co.uk") by vger.kernel.org
	with ESMTP id S261663AbVB1Rlz (ORCPT );
	Mon, 28 Feb 2005 12:41:55 -0500
Received: from willy by parcelfarce.linux.theplanet.co.uk with local (Exim 4.33)
	id 1D5otx-0001hZ-CE
	for linux-fsdevel@vger.kernel.org; Mon, 28 Feb 2005 17:41:49 +0000
To: linux-fsdevel@vger.kernel.org
Content-Disposition: inline
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

This problem came up with the systemimager program which uses rsync to
install files from a master server to many clients. Red Hat has a system
user with uid 2^32-1 which causes lastlog to grow to 1.2GB in size.
rsync does understand the concept of sparse files (with the -S flag),
but it has to read every block to discover that it is indeed empty.
This sucks.

I was wondering if we could introduce a new system call (or ioctl?) that,
given an fd would find the next block with data in it. We could use the
->bmap method ... except that has dire warnings about adding new callers
and viro may soon be in testicle-gouging range.

One system interface hack would be to introduce lseek(fd, 0, SEEK_DATA)
... but without permission to reuse ->bmap for this purpose, it's
pointless to discuss user interfaces.

Suggestions?
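
To make the user-space side concrete, here is a rough sketch of how a
tool like rsync might walk only the populated parts of a file. It is an
illustration under assumptions, not a proposed implementation: it assumes
SEEK_DATA takes an absolute starting offset and is paired with a
companion SEEK_HOLE, which are the semantics lseek(2) documents on
systems that provide these whence values. The names walk_data_extents,
extent_fn and print_extent are hypothetical.

```c
#define _GNU_SOURCE             /* glibc exposes SEEK_DATA/SEEK_HOLE under this */
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: fall back to the Linux values if the libc headers lack them. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Hypothetical callback invoked once per data extent. */
typedef void (*extent_fn)(off_t start, off_t len, void *ctx);

/*
 * Visit every data extent of fd without reading the holes.
 * Returns the total number of data bytes, or -1 on error (errno is set).
 * The file offset of fd is left wherever the last lseek() put it.
 */
off_t walk_data_extents(int fd, extent_fn fn, void *ctx)
{
    off_t total = 0;
    off_t pos = 0;

    for (;;) {
        /* Find the next offset >= pos that contains data. */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data == (off_t)-1) {
            if (errno == ENXIO)
                break;          /* nothing but hole (or EOF) past pos */
            return -1;
        }

        /* Find where that run of data ends; EOF counts as a hole. */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole == (off_t)-1)
            return -1;
        if (hole <= data)
            break;              /* defensive: never loop without progress */

        if (fn)
            fn(data, hole - data, ctx);
        total += hole - data;
        pos = hole;
    }
    return total;
}

/* Example callback: print each extent as "offset length". */
void print_extent(off_t start, off_t len, void *ctx)
{
    (void)ctx;
    printf("%lld %lld\n", (long long)start, (long long)len);
}
```

A caller would open the file and run walk_data_extents(fd, print_extent,
NULL). The cost then scales with the number of extents rather than the
apparent file size, which is the point for something like lastlog. A
copying tool would still have to set the destination length itself (for
example with ftruncate), because a trailing hole produces no extent. On
a filesystem that does not track holes, the whole file may simply be
reported as one data extent.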

-- 
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain