From: Matt Mackall
Subject: Re: [RFC] TileFS - a proposal for scalable integrity checking
Date: Wed, 2 May 2007 10:37:38 -0500
Message-ID: <20070502153738.GJ11115@waste.org>
In-Reply-To: <20070502133205.GB20776@lazybastard.org>
References: <20070428220522.GN11166@waste.org> <20070429232349.GA19937@thunk.org> <20070430014042.GL11115@waste.org> <20070502133205.GB20776@lazybastard.org>
To: Jörn Engel
Cc: Theodore Tso, linux-fsdevel@vger.kernel.org

On Wed, May 02, 2007 at 03:32:05PM +0200, Jörn Engel wrote:
> On Sun, 29 April 2007 20:40:42 -0500, Matt Mackall wrote:
> >
> > So we should have no trouble checking an exabyte-sized filesystem on a
> > 4MB box. Even if it has one exabyte-sized file! We check the first
> > tile, see that it points to our file, then iterate through that file,
> > checking that the forward and reverse pointers for each block match
> > and all CRCs match, etc. We cache the file's inode as clean, finish
> > checking anything else in the first tile, then mark it clean. When we get
> > to the next tile (and the next billion after that!), we notice that
> > each block points back to our cached inode and skip rechecking it.
>
> How would you catch the case where some block in tile 2 claims to belong
> to your just-checked inode but the inode has no reference to it?

You're right, that is a problem. Without the known-clean inode cache,
we would revisit the file in its entirety when checking tile 2, thus
ensuring that both forward and reverse pointers were intact..

> How would you catch the inode referencing the same block twice with just
> 4MB of memory?

..which would also let us catch instances of the above, but would be
very slow for files that span many tiles.

> I believe you need the fpos field in your rmap for both problems.

fpos does allow us to check just a subset of the file efficiently,
yes. It also makes things more strictly 1:1, because each rmap entry
unambiguously matches a single forward pointer in the file. OK, I'm
warming to the idea.

But indirect blocks don't have an fpos, per se. They'd need a special
encoding. As the fpos entries will all be block aligned, we'll have 12
extra bits to play with, so that may be easy enough.

It's a bit frustrating to have 96-bit (inode+fpos) pointers in one
direction and 32-bit (blockno) pointers in the other though. This
doubles the overhead to 0.4%. Still not fatal - regular ext2 overhead
is somewhere between 1% and 3% depending on inode usage.

-- 
Mathematics is the supreme nostalgia of our time.
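
To make the fpos discussion above concrete, here is a rough Python
sketch of the two checks and the spare-bit encoding. It is only an
illustration under stated assumptions, not TileFS code: every name in
it (encode_fpos, check_tile, check_file_range, overhead) is
hypothetical, the block size is taken to be 4 KiB (which is what "12
extra bits" implies), and storing an indirection level in the low bits
is just one possible special encoding for indirect blocks. The
overhead helper shows one reading that reproduces the 0.4% figure: 96
rmap bits plus 32 forward-pointer bits per 4 KiB block.

```python
BLOCK_SHIFT = 12                     # assumed 4 KiB blocks: 12 free low bits
TAG_MASK = (1 << BLOCK_SHIFT) - 1

def encode_fpos(fpos, level=0):
    """Pack a block-aligned file offset with a hypothetical indirection
    level (0 = data block, 1 = single indirect, ...) in the low bits."""
    if fpos & TAG_MASK:
        raise ValueError("fpos must be block aligned")
    if not 0 <= level <= TAG_MASK:
        raise ValueError("level does not fit in the spare bits")
    return fpos | level

def decode_fpos(word):
    """Split an encoded word back into (fpos, level)."""
    return word & ~TAG_MASK, word & TAG_MASK

def check_tile(tile_rmap, forward):
    """Reverse-to-forward check for one tile.

    tile_rmap maps blockno -> (inode, encoded fpos); forward(inode, key)
    returns the block that inode points to at that position, or None.
    Only one forward pointer is looked up per block, so the whole file
    never has to be walked. Returns blocks whose claim is not honoured."""
    return [blockno for blockno, (inode, key) in tile_rmap.items()
            if forward(inode, key) != blockno]

def check_file_range(inode, fwd_entries, rmap):
    """Forward-to-reverse check for part of a file.

    fwd_entries yields (encoded fpos, blockno); rmap(blockno) returns the
    (inode, encoded fpos) stored for that block. A block referenced twice
    fails here, because its rmap entry can only name one position."""
    return [(key, blockno) for key, blockno in fwd_entries
            if rmap(blockno) != (inode, key)]

def overhead(rmap_bits=96, fwd_bits=32, block_bytes=1 << BLOCK_SHIFT):
    """Pointer bytes per block as a fraction of the block size."""
    return (rmap_bits + fwd_bits) / 8 / block_bytes

# Toy data: inode 7 references block 100 twice, and block 102 claims a
# position in inode 7 that the inode does not actually reference.
fwd = {encode_fpos(0): 100, encode_fpos(4096): 101, encode_fpos(8192): 100}
rmap = {100: (7, encode_fpos(0)),
        101: (7, encode_fpos(4096)),
        102: (7, encode_fpos(12288))}

stray = check_tile(rmap, lambda ino, key: fwd.get(key) if ino == 7 else None)
doubled = check_file_range(7, fwd.items(), rmap.get)
```

With this toy data, stray holds the block that claims inode 7 without
being referenced, and doubled holds the second reference to block 100.
Both checks touch only the pointers for the blocks at hand, which is
the point of carrying fpos in the rmap.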