From: Rogier Wolff <R.E.Wolff@BitWizard.nl>
To: Ric Wheeler <rwheeler@redhat.com>
Cc: Christian Brandt <brandtc@psi5.com>, linux-ext4@vger.kernel.org
Subject: Re: fsck.ext4 taking months
Date: Tue, 29 Mar 2011 08:03:00 +0200
Message-ID: <20110329060300.GA27142@bitwizard.nl>
In-Reply-To: <4D909E92.4080209@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 1979 bytes --]

On Mon, Mar 28, 2011 at 10:43:30AM -0400, Ric Wheeler wrote:
> On 03/27/2011 07:28 AM, Christian Brandt wrote:
> >Situation: External 500GB drive holds lots of snapshots using lots of
> >hard links made by rsync --link-dest. The controller went bad and
> >destroyed the superblock and directory structures. The drive contains
> >roughly a million files and four complete directory-tree snapshots,
> >each with roughly a million hard links.
> >
> >Tried
> >
> >e2fsck 1.41.12 (17-May-2010)
> >         Using EXT2FS Library version 1.41.12, 17-May-2010
> >
> >e2fsck 1.41.11 (14-Mar-2010)
> >         Using EXT2FS Library version 1.41.11, 14-Mar-2010
> >
> >Symptoms: fsck.ext4 -y -f takes nearly a month to fix the structures on
> >a P4 @ 2.8 GHz, with very little access to the drive and 100% CPU use.
> >
> >output of fsck looks much like this:
> >
> >File ??? (Inode #123456, modify time Wed Jul 22 16:20:23 2009)
> >   block Nr. 6144 double block(s), used with four file(s):
> >     <filesystem metadata>
> >     ??? (Inode #123457, mod time Wed Jul 22 16:20:23 2009)
> >     ??? (Inode #123458, mod time Wed Jul 22 16:20:23 2009)
> >     ...
> >multiply claimed block map? Yes
> >
> >Is there an adhoc method of getting my data back faster?
> >
> >Is the slow performance with lots of hard links a known issue?

Yes, it is a known issue. 

You get to test my patch. :-)

I strongly suspect that (just like me) sometime in the past you've
seen e2fsck run out of memory and were advised to enable the
on-disk-databases.

	Roger. 



-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ

[-- Attachment #2: tdb_init_fix.diff --]
[-- Type: text/x-diff, Size: 2610 bytes --]

diff --git a/e2fsck/dirinfo.c b/e2fsck/dirinfo.c
index 901235c..9b29f23 100644
--- a/e2fsck/dirinfo.c
+++ b/e2fsck/dirinfo.c
@@ -62,7 +62,7 @@ static void setup_tdb(e2fsck_t ctx, ext2_ino_t num_dirs)
 	uuid_unparse(ctx->fs->super->s_uuid, uuid);
 	sprintf(db->tdb_fn, "%s/%s-dirinfo-XXXXXX", tdb_dir, uuid);
 	fd = mkstemp(db->tdb_fn);
-	db->tdb = tdb_open(db->tdb_fn, 0, TDB_CLEAR_IF_FIRST,
+	db->tdb = tdb_open(db->tdb_fn, 999931, TDB_NOLOCK | TDB_NOSYNC,
 			   O_RDWR | O_CREAT | O_TRUNC, 0600);
 	close(fd);
 }
diff --git a/lib/ext2fs/icount.c b/lib/ext2fs/icount.c
index bec0f5f..bdd5b26 100644
--- a/lib/ext2fs/icount.c
+++ b/lib/ext2fs/icount.c
@@ -173,6 +173,19 @@ static void uuid_unparse(void *uu, char *out)
 		uuid.node[3], uuid.node[4], uuid.node[5]);
 }
 
+static unsigned int my_tdb_hash(TDB_DATA *key)
+{
+        unsigned int value;      /* Used to compute the hash value.  */
+        int   i;        /* Index into the bytes of the key. */
+
+        /* initial value 0 is as good as any one. */
+        for (value = 0, i=0; i < key->dsize; i++)
+                value = value * 256 + key->dptr[i] + (value >> 24) * 241;
+
+        return value;
+}
+
+
 errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
 				   int flags, ext2_icount_t *ret)
 {
@@ -180,6 +193,7 @@ errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
 	errcode_t	retval;
 	char 		*fn, uuid[40];
 	int		fd;
+	int		hash_size;
 
 	retval = alloc_icount(fs, flags,  &icount);
 	if (retval)
@@ -192,9 +206,20 @@ errcode_t ext2fs_create_icount_tdb(ext2_filsys fs, char *tdb_dir,
 	sprintf(fn, "%s/%s-icount-XXXXXX", tdb_dir, uuid);
 	fd = mkstemp(fn);
 
+	/*
+	hash_size should be of the same order as the number of entries actually
+	used. The tdb default used to be 131, which gives us a big performance
+	penalty with normal inode numbers. We now trust the superblock. If it's
+	wrong, don't worry, tdb will manage, it will just cost a little bit more
+	CPU time.
+	If the hash function is good and distributes the values uniformly across
+	the 32-bit output space, it doesn't really matter that we didn't choose a
+	prime.  The default tdb hash function is pretty worthless. Someone didn't
+	read Knuth. */
+	hash_size = fs->super->s_inodes_count - fs->super->s_free_inodes_count;
 	icount->tdb_fn = fn;
-	icount->tdb = tdb_open(fn, 0, TDB_CLEAR_IF_FIRST,
-			       O_RDWR | O_CREAT | O_TRUNC, 0600);
+	icount->tdb = tdb_open_ex(fn, hash_size, TDB_NOLOCK | TDB_NOSYNC,
+			       O_RDWR | O_CREAT | O_TRUNC, 0600, NULL, my_tdb_hash);
 	if (icount->tdb) {
 		close(fd);
 		*ret = icount;
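
To make the hash_size argument in the patch comment concrete, here is a
minimal standalone sketch, not part of the patch. It reuses the mixing
step from my_tdb_hash() and counts how long the longest bucket chain gets
for a given table size. Assumptions: struct key_buf is a local stand-in
for TDB_DATA so that it builds without tdb.h, the key is the raw bytes of
a 32-bit inode number, and a bucket is picked as hash modulo table size.
The names sketch_hash() and longest_chain() are made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for tdb's TDB_DATA (assumption: only these two fields
 * matter for hashing). */
struct key_buf {
	unsigned char *dptr;
	size_t dsize;
};

/* Same mixing step as my_tdb_hash() in the patch above. */
static unsigned int sketch_hash(const struct key_buf *key)
{
	unsigned int value = 0;
	size_t i;

	for (i = 0; i < key->dsize; i++)
		value = value * 256 + key->dptr[i] + (value >> 24) * 241;
	return value;
}

/*
 * Hash inode numbers 1..n_keys into n_buckets chains and return the
 * length of the longest chain. Returns 0 if n_buckets is 0 or the
 * counter array cannot be allocated.
 */
static unsigned long longest_chain(unsigned int n_keys, unsigned int n_buckets)
{
	unsigned long *count, longest = 0;
	unsigned int ino, b;

	if (n_buckets == 0)
		return 0;
	count = calloc(n_buckets, sizeof(*count));
	if (!count)
		return 0;

	for (ino = 1; ino <= n_keys; ino++) {
		unsigned char raw[sizeof(ino)];
		struct key_buf key = { raw, sizeof(raw) };

		/* Assumed key layout: the inode number's native bytes. */
		memcpy(raw, &ino, sizeof(ino));
		count[sketch_hash(&key) % n_buckets]++;
	}

	for (b = 0; b < n_buckets; b++)
		if (count[b] > longest)
			longest = count[b];

	free(count);
	return longest;
}
```

Calling longest_chain(1000000, 131) from a small main() shows the problem
with the old default: a million keys in 131 buckets means at least one
chain of 7634 entries or more, whatever the hash function, and every
lookup has to walk such a chain. longest_chain(1000000, 1000000) gives a
much shorter worst case; how short depends on the hash function and on
the byte order of the machine.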

