linux-ext4.vger.kernel.org archive mirror
From: Rogier Wolff <R.E.Wolff@BitWizard.nl>
To: Ric Wheeler <rwheeler@redhat.com>
Cc: Christian Brandt <brandtc@psi5.com>, linux-ext4@vger.kernel.org
Subject: Re: fsck.ext4 taking months
Date: Tue, 29 Mar 2011 08:03:00 +0200
Message-ID: <20110329060300.GA27142@bitwizard.nl>
In-Reply-To: <4D909E92.4080209@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 1979 bytes --]

On Mon, Mar 28, 2011 at 10:43:30AM -0400, Ric Wheeler wrote:
> On 03/27/2011 07:28 AM, Christian Brandt wrote:
> >Situation: External 500GB drive holds lots of snapshots using lots of
> >hard links made by rsync --link-dest. The controller went bad and
> >destroyed superblock and directory structures. The drive contains
> >roughly a million files and four complete directory-tree-snapshots with
> >each roughly a million hardlinks.
> >
> >Tried
> >
> >e2fsck 1.41.12 (17-May-2010)
> >         Using EXT2FS Library version 1.41.12, 17-May-2010
> >
> >e2fsck 1.41.11 (14-Mar-2010)
> >         Using EXT2FS Library version 1.41.11, 14-Mar-2010
> >
> >Symptoms: fsck.ext4 -y -f takes nearly a month to fix the structures on
> >a P4 @ 2.8 GHz, with very little access to the drive and 100% CPU use.
> >
> >output of fsck looks much like this:
> >
> >File ??? (Inode #123456, modify time Wed Jul 22 16:20:23 2009)
> >   block Nr. 6144 double block(s), used with four file(s):
> >     <filesystem metadata>
> >     ??? (Inode #123457, mod time Wed Jul 22 16:20:23 2009)
> >     ??? (Inode #123458, mod time Wed Jul 22 16:20:23 2009)
> >     ...
> >multiply claimed block map? Yes
> >
> >Is there an adhoc method of getting my data back faster?
> >
> >Is the slow performance with lots of hard links a known issue?

Yes, it is a known issue. 

You get to test my patch. :-)

I strongly suspect that (just like me) sometime in the past you've
seen e2fsck run out of memory and were advised to enable the
on-disk databases.

	Roger. 



-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ

[-- Attachment #2: tdb_init_fix.diff --]
[-- Type: text/x-diff, Size: 2610 bytes --]

diff --git a/e2fsck/dirinfo.c b/e2fsck/dirinfo.c
index 901235c..9b29f23 100644
--- a/e2fsck/dirinfo.c
+++ b/e2fsck/dirinfo.c
@@ -62,7 +62,7 @@ static void setup_tdb(e2fsck_t ctx, ext2_ino_t num_dirs)
 	uuid_unparse(ctx->fs->super->s_uuid, uuid);
 	sprintf(db->tdb_fn, "%s/%s-dirinfo-XXXXXX", tdb_dir, uuid);
 	fd = mkstemp(db->tdb_fn);
-	db->tdb = tdb_open(db->tdb_fn, 0, TDB_CLEAR_IF_FIRST,
+	db->tdb = tdb_open(db->tdb_fn, 999931, TDB_NOLOCK | TDB_NOSYNC,
 			   O_RDWR | O_CREAT | O_TRUNC, 0600);
 	close(fd);
 }
diff --git a/lib/ext2fs/icount.c b/lib/ext2fs/icount.c
index bec0f5f..bdd5b26 100644
--- a/lib/ext2fs/icount.c
+++ b/lib/ext2fs/icount.c
@@ -173,6 +173,19 @@ static void uuid_unparse(void *uu, char *out)
 		uuid.node[3], uuid.node[4], uuid.node[5]);
 }
 
+static unsigned int my_tdb_hash(TDB_DATA *key)
+{
+        unsigned int value;      /* Used to compute the hash value.  */
+        int   i;        /* Index into the bytes of the key. */
+
+        /* initial value 0 is as good as any one. */
+        for (value = 0, i=0; i < key->dsize; i++)
+                value = value * 256 + key->dptr[i] + (value >> 24) * 241;
+
+        return value;
+}
+
+
 errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
 				   int flags, ext2_icount_t *ret)
 {
@@ -180,6 +193,7 @@ errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
 	errcode_t	retval;
 	char 		*fn, uuid[40];
 	int		fd;
+	int		hash_size;
 
 	retval = alloc_icount(fs, flags,  &icount);
 	if (retval)
@@ -192,9 +206,20 @@ errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
 	sprintf(fn, "%s/%s-icount-XXXXXX", tdb_dir, uuid);
 	fd = mkstemp(fn);
 
+	/*
+	 * hash_size should be of the same order as the number of entries
+	 * actually used. The tdb default used to be 131, which gives us a big
+	 * performance penalty with normal inode numbers. We now trust the
+	 * superblock. If it's wrong, don't worry: tdb will manage, it will
+	 * just cost a little more CPU time.
+	 * If the hash function is good and distributes the values uniformly
+	 * across the 32-bit output space, it doesn't really matter that we
+	 * didn't choose a prime. The default tdb hash function is pretty
+	 * worthless. Someone didn't read Knuth. */
+	hash_size = fs->super->s_inodes_count - fs->super->s_free_inodes_count;
 	icount->tdb_fn = fn;
-	icount->tdb = tdb_open(fn, 0, TDB_CLEAR_IF_FIRST,
-			       O_RDWR | O_CREAT | O_TRUNC, 0600);
+	icount->tdb = tdb_open_ex(fn, hash_size, TDB_NOLOCK | TDB_NOSYNC,
+			       O_RDWR | O_CREAT | O_TRUNC, 0600, NULL, my_tdb_hash);
 	if (icount->tdb) {
 		close(fd);
 		*ret = icount;
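
To illustrate why the table size matters, here is a minimal, hedged sketch
in C. It is not part of the patch. It reuses the arithmetic of
my_tdb_hash() and reports the longest chain that results when a set of
keys is spread over a given number of chains. Assumptions: tdb selects a
chain as hash % hash_size, and each icount key is the raw 4 bytes of an
inode number. The names sketch_hash and longest_chain, and the key range
1..n_keys, are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same arithmetic as my_tdb_hash() in the patch, on a raw byte buffer. */
static unsigned int sketch_hash(const unsigned char *p, size_t len)
{
	unsigned int value = 0;
	size_t i;

	for (i = 0; i < len; i++)
		value = value * 256 + p[i] + (value >> 24) * 241;
	return value;
}

/*
 * Hash the hypothetical inode numbers 1..n_keys (4 raw bytes each) into
 * hash_size chains and return the length of the longest chain.
 * Returns 0 if hash_size is 0 or the allocation fails.
 */
static unsigned long longest_chain(uint32_t n_keys, unsigned int hash_size)
{
	unsigned long *chains, longest = 0;
	uint32_t i;

	if (hash_size == 0)
		return 0;
	chains = calloc(hash_size, sizeof(*chains));
	if (!chains)
		return 0;

	for (i = 0; i < n_keys; i++) {
		uint32_t ino = i + 1;
		unsigned char key[sizeof(ino)];
		unsigned int bucket;

		/* Assumed key layout: the inode number's bytes as stored. */
		memcpy(key, &ino, sizeof(ino));
		bucket = sketch_hash(key, sizeof(key)) % hash_size;
		if (++chains[bucket] > longest)
			longest = chains[bucket];
	}
	free(chains);
	return longest;
}
```

With one million keys, longest_chain(1000000, 131) cannot be below 7634
(pigeonhole), so a single lookup may have to walk thousands of records.
With 999931 chains, the size hard-coded in the dirinfo.c hunk, the same
keys land in far shorter chains. The exact figures depend on the key bytes
and on byte order, so treat this as an illustration, not a benchmark.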
