From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Mason
Subject: Re: Data Deduplication with the help of an online filesystem check
Date: Wed, 29 Apr 2009 09:14:35 -0400
Message-ID: <1241010875.20099.2.camel@think.oraclecorp.com>
References: <20090427033331.GC17677@cip.informatik.uni-erlangen.de>
 <200904281945.10274.hjclaes@web.de>
 <20090428201619.GK7217@cip.informatik.uni-erlangen.de>
 <200904282236.07428.hjclaes@web.de>
 <20090428205242.GA13112@cip.informatik.uni-erlangen.de>
 <1240952295.15136.73.camel@think.oraclecorp.com>
 <20090428211255.GB13112@cip.informatik.uni-erlangen.de>
 <1240953977.15136.76.camel@think.oraclecorp.com>
 <20090428221455.GA27794@cip.informatik.uni-erlangen.de>
 <1240960687.15136.88.camel@think.oraclecorp.com>
 <20090429120300.GG22917@cip.informatik.uni-erlangen.de>
Mime-Version: 1.0
Content-Type: text/plain
Cc: Heinz-Josef Claes, Edward Shishkin, Tomasz Chmielewski, linux-btrfs@vger.kernel.org
To: Thomas Glanzmann
Return-path:
In-Reply-To: <20090429120300.GG22917@cip.informatik.uni-erlangen.de>
List-ID:

On Wed, 2009-04-29 at 14:03 +0200, Thomas Glanzmann wrote:
> Hello Chris,
>
> > You can start with the code documentation section on
> > http://btrfs.wiki.kernel.org
>
> I read through this and at the moment one questions come in my mind:
>
> http://btrfs.wiki.kernel.org/images-btrfs/7/72/Chunks-overview.png
>
> Looking at this picture, when I'm going to implement the dedup code, do I also
> have to take care to spread the blocks over the different devices or is
> there already infrastructure in place that automates that process?

The layering inside of btrfs means that you don't need to worry about
chunks or multiple devices.

But, in your ioctls you want to deal with [file, offset, len], not
directly with block numbers.  COW means that blocks can move around
without you knowing, and some of the btrfs internals will COW files in
order to relocate storage.
So, what you want is a dedup file (or files) where your DB knows a given
offset in the file has a given csum.

-chris
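
To make the [file, offset, len] idea concrete, here is a minimal
user-space sketch in Python. It is only an illustration under stated
assumptions, not btrfs code and not an existing ioctl interface: the
4096-byte block size, the SHA-256 checksum, the in-memory dict standing
in for the DB, and the names index_file and duplicate_extents are all
hypothetical choices made for the example.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size; pick whatever your dedup unit is


def index_file(path, db, block_size=BLOCK_SIZE):
    """Record (path, offset, length) under the checksum of each block.

    The DB never stores physical block numbers, only file-relative
    locations, so it stays valid even if COW relocates the storage.
    """
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            csum = hashlib.sha256(block).hexdigest()
            db[csum].append((path, offset, len(block)))
            offset += len(block)


def duplicate_extents(db):
    """Return only the checksums that were seen at more than one location."""
    return {csum: locs for csum, locs in db.items() if len(locs) > 1}


# Hypothetical usage: db maps csum -> list of (file, offset, len)
# db = defaultdict(list)
# index_file("/some/file", db)
# candidates = duplicate_extents(db)
```

Each entry in the result is only a dedup candidate. Since the file
contents can change after indexing, anything acting on these entries
would presumably need to re-read and compare the ranges before merging
them.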