From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andreas Dilger
Subject: Re: bdar: efficiently backup allocated bytes in file systems
Date: Wed, 19 Mar 2008 10:58:43 +0800
Message-ID: <20080319025843.GE2971@webber.adilger.int>
References: <47DF1737.2050700@zabbo.net>
In-reply-to: <47DF1737.2050700@zabbo.net>
To: Zach Brown
Cc: linux-fsdevel@vger.kernel.org

On Mar 17, 2008  18:13 -0700, Zach Brown wrote:
> So, I had a fun time throwing together a utility last weekend.  I
> thought I'd share it sooner rather than later.
>
> I found myself wanting to backup a copy of an ancient ~75g ext3 file
> system.  I got frustrated by of our utilities which don't saturate
> storage.  I wanted dd line rates but I also only wanted to copy
> referenced data.
>
> So I threw something together which does that.  I made it work roughly
> like tar so that people have some idea what to expect.  So you can do
> something like:
>
>  $ bdar -cf - /dev/sda3 | gzip -c > /tmp/sda3-backup.bdar.gz
>  ...
>  $ zcat /tmp/sda3-backup.bdar.gz | bdar -xf - /dev/sda3
>
> and it will do exactly what you would guess it would do after reading
> those command lines.
>
> The bdar file format is just a header and then a series of regions of
> bytes described by their length and offset.  To create a bdar file from
> a file system bdar needs to know enough to figure out what extents are
> referenced.  Restoring a bdar is generic, though, it just stamps bytes
> into the target file.

So the question is whether the ".bdar" file is specific to the
filesystem being backed up, and whether it only allows backing up the
whole filesystem.  Does it create a dense output file or a sparse one?
Does it store the data as chunks of blocks in a full-device map or on a
per-file basis?

If you can't restore a .bdar backup file to a smaller device than the
source device, that makes it less useful than most of the other tools.

> I only taught it the most basic knowledge of ext[234].  Just enough to
> show that generating the bdar is ~4x faster than tar and ~2x faster than
> dump :).  There's still some available disk bandwidth to consume with
> read-ahead, but it's pretty close.  (single spindle, ~5g of kernel
> trees, beefy cpus.)

The question is whether the 2x speed improvement is worth the lack of
portability compared to even dump.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
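
For reference, the "header plus (offset, length, bytes) regions" layout
quoted above can be sketched roughly as follows.  This is only an
illustration of the idea, not bdar's actual on-disk format: the magic
string, the little-endian 64-bit fields, and the names create() and
restore() are all assumptions made up for the example, and the extent
list is passed in by hand rather than derived from ext[234] metadata.

```python
import io
import struct

MAGIC = b"BDARDEMO"             # hypothetical header, not the real bdar magic
REGION = struct.Struct("<QQ")   # hypothetical record header: offset, length


def create(src, extents, out):
    """Write the header, then one (offset, length, bytes) record per extent."""
    out.write(MAGIC)
    for offset, length in extents:
        src.seek(offset)
        data = src.read(length)
        out.write(REGION.pack(offset, len(data)))
        out.write(data)


def restore(archive, target):
    """Stamp each region's bytes into the target at its recorded offset."""
    if archive.read(len(MAGIC)) != MAGIC:
        raise ValueError("not a demo archive")
    while True:
        head = archive.read(REGION.size)
        if not head:
            break                       # clean end of archive
        if len(head) != REGION.size:
            raise ValueError("truncated region header")
        offset, length = REGION.unpack(head)
        data = archive.read(length)
        if len(data) != length:
            raise ValueError("truncated region data")
        target.seek(offset)
        target.write(data)


# Usage: a 16-byte "device" with two referenced extents and a hole between.
device = io.BytesIO(b"AAAA" + b"\0" * 8 + b"BBBB")
archive = io.BytesIO()
create(device, [(0, 4), (12, 4)], archive)

archive.seek(0)
restored = io.BytesIO()
restore(archive, restored)
```

In this sketch the offsets are absolute positions on the source device,
so the restore target has to be at least as large as the highest
offset plus length, which is the restore-to-a-smaller-device concern
raised above.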