All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Barry Naujok" <bnaujok@melbourne.sgi.com>
To: 'Madan Valluri' <mvalluri@sgi.com>, 'Nathan Scott' <nathans@sgi.com>
Cc: xfs@oss.sgi.com
Subject: RE: Review: xfs_repair fixes for dir2 corruption
Date: Mon, 31 Jul 2006 17:18:56 +1000	[thread overview]
Message-ID: <200607310714.RAA03415@larry.melbourne.sgi.com> (raw)
In-Reply-To: <44CA22F8.2040507@sgi.com>

[-- Attachment #1: Type: text/plain, Size: 2203 bytes --]

 

> -----Original Message-----
> From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] 
> On Behalf Of Madan Valluri
> Sent: Saturday, 29 July 2006 12:45 AM
> To: Nathan Scott
> Cc: Barry Naujok; xfs@oss.sgi.com
> Subject: Re: Review: xfs_repair fixes for dir2 corruption
> 
> Nathan Scott wrote:
> > On Fri, Jul 28, 2006 at 11:58:52AM +1000, Barry Naujok wrote:
> >   
> >> This patch addresses the following xfs_repair issues:
> >>     
> >
> > The libxfs cache stuff looks good to me.  Maybe Madan can cast
> > an eye over the repair changes for ya?
> >
> > cheers.
> >
> >   
> >>  
> 1) Since dir_hash_add can be called for both V1 and V2 
> directories its 
> second parameter type xfs_dir2_dataptr_t should be neutral.

Ok, made it a __uint32_t.

> 2) In dir_hash_add when dup is set, do you still need to add to the 
> nextbyhash by list?

Good pickup. I set junk to 1 as well in the dup check.

> 3) The following statement in   longform_dir2_rebuild looks odd. 
> Besides, FWIW, you can match "/."
> 
>                 if (p->name[0] == '/' || (p->name[0] == '.' && 
> (p->namelen == 1
>                                 || (p->namelen == 2 && 
> p->name[1] == '.'))))
>                         continue;
> 
> Consider:
> 
>                if (((p->name[0] == '/' || p->name[0] == '.') && 
> p->namelen == 1) ||
>                         (p->name[0] == '.' && p->name[1] == '.' && 
> p->namelen == 2))
>                         continue;
> 
> 4) Related to items 2&3, shouldn't the code be skipping 
> duplicate entries?

That's exactly what the above is doing, it's skipping all entries starting with a "/" as they have been marked as bad,
and the "." and ".." entries which already exist after the libxfs_dir2_init() call.

> 5) Can we do anything to minimize the do_error calls in 
> longform_dir2_rebuild? Seems like on a full file system, while 
> rebuilding say the root directory, matters can get wacky - 
> The directory 
> is being rebuilt and we have no further room. Sounds like 
> that this how 
> it has been....

Done, now it will do what it can, and I do a flush before the lost+found creation which still has the do_error() calls.

Updated diff attached.

Thanks,
Barry

[-- Attachment #2: diff.txt --]
[-- Type: text/plain, Size: 51023 bytes --]


===========================================================================
xfsprogs/include/cache.h
===========================================================================

--- a/xfsprogs/include/cache.h	2006-07-31 17:09:22.000000000 +1000
+++ b/xfsprogs/include/cache.h	2006-07-27 16:17:47.804986322 +1000
@@ -30,6 +30,7 @@ struct cache_node;
 typedef void *cache_key_t;
 typedef void (*cache_walk_t)(struct cache_node *);
 typedef struct cache_node * (*cache_node_alloc_t)(void);
+typedef void (*cache_node_flush_t)(struct cache_node *);
 typedef void (*cache_node_relse_t)(struct cache_node *);
 typedef unsigned int (*cache_node_hash_t)(cache_key_t, unsigned int);
 typedef int (*cache_node_compare_t)(struct cache_node *, cache_key_t);
@@ -38,6 +39,7 @@ typedef unsigned int (*cache_bulk_relse_
 struct cache_operations {
 	cache_node_hash_t	hash;
 	cache_node_alloc_t	alloc;
+	cache_node_flush_t	flush;
 	cache_node_relse_t	relse;
 	cache_node_compare_t	compare;
 	cache_bulk_relse_t	bulkrelse;	/* optional */
@@ -49,6 +51,7 @@ struct cache {
 	pthread_mutex_t		c_mutex;	/* node count mutex */
 	cache_node_hash_t	hash;		/* node hash function */
 	cache_node_alloc_t	alloc;		/* allocation function */
+	cache_node_flush_t	flush;		/* flush dirty data function */
 	cache_node_relse_t	relse;		/* memory free function */
 	cache_node_compare_t	compare;	/* comparison routine */
 	cache_bulk_relse_t	bulkrelse;	/* bulk release routine */
@@ -75,6 +78,7 @@ struct cache *cache_init(unsigned int, s
 void cache_destroy(struct cache *);
 void cache_walk(struct cache *, cache_walk_t);
 void cache_purge(struct cache *);
+void cache_flush(struct cache *);
 
 int cache_node_get(struct cache *, cache_key_t, struct cache_node **);
 void cache_node_put(struct cache_node *);

===========================================================================
xfsprogs/include/libxfs.h
===========================================================================

--- a/xfsprogs/include/libxfs.h	2006-07-31 17:09:22.000000000 +1000
+++ b/xfsprogs/include/libxfs.h	2006-07-31 16:42:58.384131150 +1000
@@ -257,6 +257,7 @@ extern int	libxfs_writebuf_int (xfs_buf_
 extern struct cache	*libxfs_bcache;
 extern struct cache_operations	libxfs_bcache_operations;
 extern void	libxfs_bcache_purge (void);
+extern void	libxfs_bcache_flush (void);
 extern xfs_buf_t	*libxfs_getbuf (dev_t, xfs_daddr_t, int);
 extern void	libxfs_putbuf (xfs_buf_t *);
 extern void	libxfs_purgebuf (xfs_buf_t *);
@@ -465,8 +466,11 @@ extern int	libxfs_bmapi_single(xfs_trans
 				xfs_fsblock_t *, xfs_fileoff_t);
 extern int	libxfs_bmap_finish (xfs_trans_t **, xfs_bmap_free_t *,
 				xfs_fsblock_t, int *);
+extern void	libxfs_bmap_cancel(xfs_bmap_free_t *);
 extern int	libxfs_bmap_next_offset (xfs_trans_t *, xfs_inode_t *,
 				xfs_fileoff_t *, int);
+extern int	libxfs_bmap_last_offset(xfs_trans_t *, xfs_inode_t *, 
+				xfs_fileoff_t *, int);
 extern int	libxfs_bunmapi (xfs_trans_t *, xfs_inode_t *, xfs_fileoff_t,
 				xfs_filblks_t, int, xfs_extnum_t,
 				xfs_fsblock_t *, xfs_bmap_free_t *, int *);

===========================================================================
xfsprogs/libxfs/cache.c
===========================================================================

--- a/xfsprogs/libxfs/cache.c	2006-07-31 17:09:22.000000000 +1000
+++ b/xfsprogs/libxfs/cache.c	2006-07-27 17:42:43.812685388 +1000
@@ -60,6 +60,7 @@ cache_init(
 	cache->c_hashsize = hashsize;
 	cache->hash = cache_operations->hash;
 	cache->alloc = cache_operations->alloc;
+	cache->flush = cache_operations->flush;
 	cache->relse = cache_operations->relse;
 	cache->compare = cache_operations->compare;
 	cache->bulkrelse = cache_operations->bulkrelse ?
@@ -422,6 +423,39 @@ cache_purge(
 		cache_abort();
 	}
 #endif
+	/* flush any remaining nodes to disk */
+	cache_flush(cache);
+}
+
+/*
+ * Flush all nodes in the cache to disk. 
+ */
+void
+cache_flush(
+	struct cache *		cache)
+{
+	struct cache_hash *	hash;
+	struct list_head *	head;
+	struct list_head *	pos;
+	struct cache_node *	node;
+	int			i;
+	
+	if (!cache->flush)
+		return;
+	
+	for (i = 0; i < cache->c_hashsize; i++) {
+		hash = &cache->c_hash[i];
+		
+		pthread_mutex_lock(&hash->ch_mutex);
+		head = &hash->ch_list;
+		for (pos = head->next; pos != head; pos = pos->next) {
+			node = (struct cache_node *)pos;
+			pthread_mutex_lock(&node->cn_mutex);
+			cache->flush(node);
+			pthread_mutex_unlock(&node->cn_mutex);
+		}
+		pthread_mutex_unlock(&hash->ch_mutex);
+	}
 }
 
 #define	HASH_REPORT	(3*HASH_CACHE_RATIO)

===========================================================================
xfsprogs/libxfs/rdwr.c
===========================================================================

--- a/xfsprogs/libxfs/rdwr.c	2006-07-31 17:09:22.000000000 +1000
+++ b/xfsprogs/libxfs/rdwr.c	2006-07-27 16:40:56.612373938 +1000
@@ -416,6 +416,15 @@ libxfs_iomove(xfs_buf_t *bp, uint boff, 
 }
 
 static void
+libxfs_bflush(struct cache_node *node)
+{
+	xfs_buf_t		*bp = (xfs_buf_t *)node;
+
+	if ((bp != NULL) && (bp->b_flags & LIBXFS_B_DIRTY))
+		libxfs_writebufr(bp);
+}
+
+static void
 libxfs_brelse(struct cache_node *node)
 {
 	xfs_buf_t		*bp = (xfs_buf_t *)node;
@@ -442,9 +451,16 @@ libxfs_bcache_purge(void)
 	cache_purge(libxfs_bcache);
 }
 
+void 
+libxfs_bcache_flush(void)
+{
+	cache_flush(libxfs_bcache);
+}
+
 struct cache_operations libxfs_bcache_operations = {
 	/* .hash */	libxfs_bhash,
 	/* .alloc */	libxfs_balloc,
+	/* .flush */	libxfs_bflush,
 	/* .relse */	libxfs_brelse,
 	/* .compare */	libxfs_bcompare,
 	/* .bulkrelse */ NULL	/* TODO: lio_listio64 interface? */
@@ -649,6 +665,7 @@ libxfs_icache_purge(void)
 struct cache_operations libxfs_icache_operations = {
 	/* .hash */	libxfs_ihash,
 	/* .alloc */	libxfs_ialloc,
+	/* .flush */	NULL,
 	/* .relse */	libxfs_irelse,
 	/* .compare */	libxfs_icompare,
 	/* .bulkrelse */ NULL

===========================================================================
xfsprogs/libxfs/xfs.h
===========================================================================

--- a/xfsprogs/libxfs/xfs.h	2006-07-31 17:09:22.000000000 +1000
+++ b/xfsprogs/libxfs/xfs.h	2006-07-31 16:26:28.040456032 +1000
@@ -97,7 +97,9 @@
 #define xfs_bmapi			libxfs_bmapi
 #define xfs_bmapi_single		libxfs_bmapi_single
 #define xfs_bmap_finish			libxfs_bmap_finish
+#define xfs_bmap_cancel			libxfs_bmap_cancel
 #define xfs_bmap_del_free		libxfs_bmap_del_free
+#define xfs_bmap_last_offset		libxfs_bmap_last_offset
 #define xfs_bunmapi			libxfs_bunmapi
 #define xfs_free_extent			libxfs_free_extent
 #define xfs_rtfree_extent		libxfs_rtfree_extent

===========================================================================
xfsprogs/repair/phase6.c
===========================================================================

--- a/xfsprogs/repair/phase6.c	2006-07-31 17:09:22.000000000 +1000
+++ b/xfsprogs/repair/phase6.c	2006-07-31 16:37:03.134895653 +1000
@@ -36,43 +36,36 @@ static int orphanage_entered;
 
 /*
  * Data structures and routines to keep track of directory entries
- * and whether their leaf entry has been seen
+ * and whether their leaf entry has been seen. Also used for name
+ * duplicate checking and rebuilding step if required.
  */
 typedef struct dir_hash_ent {
-	struct dir_hash_ent	*next;	/* pointer to next entry */
-	xfs_dir2_leaf_entry_t	ent;	/* address and hash value */
-	short			junkit;	/* name starts with / */
-	short			seen;	/* have seen leaf entry */
+	struct dir_hash_ent	*nextbyaddr;	/* next in addr bucket */
+	struct dir_hash_ent	*nextbyhash;	/* next in name bucket */
+	struct dir_hash_ent	*nextbyorder;	/* next in order added */
+	xfs_dahash_t		hashval;	/* hash value of name */
+	__uint32_t		address;	/* offset of data entry */
+	xfs_ino_t 		inum;		/* inode num of entry */
+	short			junkit;		/* name starts with / */
+	short			seen;		/* have seen leaf entry */
+	int	  	    	namelen;	/* length of name */
+	uchar_t    	    	*name;		/* pointer to name (no NULL) */
 } dir_hash_ent_t;
 
 typedef struct dir_hash_tab {
-	int			size;	/* size of hash table */
-	dir_hash_ent_t		*tab[1];/* actual hash table, variable size */
+	int			size;		/* size of hash tables */
+	int			names_duped;	/* 1 = ent names malloced */
+	dir_hash_ent_t		*first;		/* ptr to first added entry */
+	dir_hash_ent_t		*last;		/* ptr to last added entry */
+	dir_hash_ent_t		**byhash;	/* ptr to name hash buckets */
+	dir_hash_ent_t		**byaddr;	/* ptr to addr hash buckets */
 } dir_hash_tab_t;
+
 #define	DIR_HASH_TAB_SIZE(n)	\
-	(offsetof(dir_hash_tab_t, tab) + (sizeof(dir_hash_ent_t *) * (n)))
+	(sizeof(dir_hash_tab_t) + (sizeof(dir_hash_ent_t *) * (n) * 2))
 #define	DIR_HASH_FUNC(t,a)	((a) % (t)->size)
 
 /*
- * Track names to check for duplicates in a directory.
- */
-
-typedef struct name_hash_ent {
-	struct name_hash_ent	*next;	/* pointer to next entry */
-	xfs_dahash_t		hashval;/* hash value of name */
-	int	  	    	namelen;/* length of name */
-	uchar_t    	    	*name;	/* pointer to name (no NULL) */
-} name_hash_ent_t;		
-
-typedef struct name_hash_tab {
-	int			size;	/* size of hash table */
-	name_hash_ent_t		*tab[1];/* actual hash table, variable size */
-} name_hash_tab_t;
-#define	NAME_HASH_TAB_SIZE(n)	\
-	(offsetof(name_hash_tab_t, tab) + (sizeof(name_hash_ent_t *) * (n)))
-#define	NAME_HASH_FUNC(t,a)	((a) % (t)->size)
-
-/*
  * Track the contents of the freespace table in a directory.
  */
 typedef struct freetab {
@@ -94,28 +87,78 @@ typedef struct freetab {
 #define	DIR_HASH_CK_BADSTALE	5
 #define	DIR_HASH_CK_TOTAL	6
 
-static void
+/*
+ * Returns 0 if the name already exists (ie. a duplicate)
+ */
+static int
 dir_hash_add(
 	dir_hash_tab_t		*hashtab,
-	xfs_dahash_t		hash,
-	xfs_dir2_dataptr_t	addr,
-	int			junk)
-{
-	int			i;
+	__uint32_t		addr,	
+	xfs_ino_t		inum,
+	int			namelen,
+	uchar_t			*name)
+{
+	xfs_dahash_t		hash = 0;
+	int			byaddr;
+	int			byhash = 0;
 	dir_hash_ent_t		*p;
-
-	i = DIR_HASH_FUNC(hashtab, addr);
+	int			dup;
+	short			junk;
+	
+	ASSERT(!hashtab->names_duped);
+	
+	junk = name[0] == '/';
+	byaddr = DIR_HASH_FUNC(hashtab, addr);
+	dup = 0;
+
+	if (!junk) {
+		hash = libxfs_da_hashname(name, namelen);
+		byhash = DIR_HASH_FUNC(hashtab, hash);
+
+		/* 
+		 * search hash bucket for existing name.
+		 */
+		for (p = hashtab->byhash[byhash]; p; p = p->nextbyhash) {
+			if (p->hashval == hash && p->namelen == namelen) {
+				if (memcmp(p->name, name, namelen) == 0) {
+					dup = 1;
+					junk = 1;
+					break;
+				}
+			}
+		}
+	}
+	
 	if ((p = malloc(sizeof(*p))) == NULL)
 		do_error(_("malloc failed in dir_hash_add (%u bytes)\n"),
 			sizeof(*p));
-	p->next = hashtab->tab[i];
-	hashtab->tab[i] = p;
-	if (!(p->junkit = junk))
-		p->ent.hashval = hash;
-	p->ent.address = addr;
+	
+	p->nextbyaddr = hashtab->byaddr[byaddr];
+	hashtab->byaddr[byaddr] = p;
+	if (hashtab->last) 
+		hashtab->last->nextbyorder = p;
+	else
+		hashtab->first = p;
+	p->nextbyorder = NULL;
+	hashtab->last = p;
+	
+	if (!(p->junkit = junk)) {
+		p->hashval = hash;
+		p->nextbyhash = hashtab->byhash[byhash];
+		hashtab->byhash[byhash] = p;
+	}
+	p->address = addr;
+	p->inum = inum;
 	p->seen = 0;
+	p->namelen = namelen;
+	p->name = name;
+	
+	return !dup;
 }
 
+/*
+ * checks to see if any data entries are not in the leaf blocks 
+ */
 static int
 dir_hash_unseen(
 	dir_hash_tab_t	*hashtab)
@@ -124,7 +167,7 @@ dir_hash_unseen(
 	dir_hash_ent_t	*p;
 
 	for (i = 0; i < hashtab->size; i++) {
-		for (p = hashtab->tab[i]; p; p = p->next) {
+		for (p = hashtab->byaddr[i]; p; p = p->nextbyaddr) {
 			if (p->seen == 0)
 				return 1;
 		}
@@ -173,8 +216,10 @@ dir_hash_done(
 	dir_hash_ent_t	*p;
 
 	for (i = 0; i < hashtab->size; i++) {
-		for (p = hashtab->tab[i]; p; p = n) {
-			n = p->next;
+		for (p = hashtab->byaddr[i]; p; p = n) {
+			n = p->nextbyaddr;
+			if (hashtab->names_duped)
+				free(p->name);
 			free(p);
 		}
 	}
@@ -196,6 +241,10 @@ dir_hash_init(
 	if ((hashtab = calloc(DIR_HASH_TAB_SIZE(hsize), 1)) == NULL)
 		do_error(_("calloc failed in dir_hash_init\n"));
 	hashtab->size = hsize;
+	hashtab->byhash = (dir_hash_ent_t**)((char *)hashtab + 
+		sizeof(dir_hash_tab_t));
+	hashtab->byaddr = (dir_hash_ent_t**)((char *)hashtab + 
+		sizeof(dir_hash_tab_t) + sizeof(dir_hash_ent_t*) * hsize);
 	return hashtab;
 }
 
@@ -209,12 +258,12 @@ dir_hash_see(
 	dir_hash_ent_t		*p;
 
 	i = DIR_HASH_FUNC(hashtab, addr);
-	for (p = hashtab->tab[i]; p; p = p->next) {
-		if (p->ent.address != addr)
+	for (p = hashtab->byaddr[i]; p; p = p->nextbyaddr) {
+		if (p->address != addr)
 			continue;
 		if (p->seen)
 			return DIR_HASH_CK_DUPLEAF;
-		if (p->junkit == 0 && p->ent.hashval != hash)
+		if (p->junkit == 0 && p->hashval != hash)
 			return DIR_HASH_CK_BADHASH;
 		p->seen = 1;
 		return DIR_HASH_CK_OK;
@@ -222,6 +271,10 @@ dir_hash_see(
 	return DIR_HASH_CK_NODATA;
 }
 
+/*
+ * checks to make sure leafs match a data entry, and that the stale
+ * count is valid.
+ */
 static int
 dir_hash_see_all(
 	dir_hash_tab_t		*hashtab,
@@ -246,81 +299,26 @@ dir_hash_see_all(
 }
 
 /*
- * Returns 0 if the name already exists (ie. a duplicate)
+ * Convert name pointers into locally allocated memory.
+ * This must only be done after all the entries have been added.
  */
-static int
-name_hash_add(
-	name_hash_tab_t		*nametab,
-	uchar_t			*name,
-	int			namelen)
+static void
+dir_hash_dup_names(dir_hash_tab_t *hashtab)
 {
-	xfs_dahash_t		hash;
-	int			i;
-	name_hash_ent_t		*p;
-
-	hash = libxfs_da_hashname(name, namelen);
-			
-	i = NAME_HASH_FUNC(nametab, hash);
-	
-	/* 
-	 * search hash bucket for existing name.
-	 */
-	for (p = nametab->tab[i]; p; p = p->next) {
-		if (p->hashval == hash && p->namelen == namelen) {
-			if (memcmp(p->name, name, namelen) == 0) 
-				return 0; /* exists */
-		}
-	}
-	
-	if ((p = malloc(sizeof(*p))) == NULL)
-		do_error(_("malloc failed in name_hash_add (%u bytes)\n"),
-			sizeof(*p));
+	uchar_t			*name;
+	dir_hash_ent_t		*p;
 	
-	p->next = nametab->tab[i];
-	p->hashval = hash;
-	p->name = name;
-	p->namelen = namelen;
-	nametab->tab[i] = p;
+	if (hashtab->names_duped)
+		return;
 	
-	return 1;	/* success, no duplicate */
-}
-
-static name_hash_tab_t *
-name_hash_init(
-	xfs_fsize_t	size)
-{
-	name_hash_tab_t	*nametab;
-	int		hsize;
-
-	hsize = size / (16 * 4);
-	if (hsize > 1024)
-		hsize = 1024;
-	else if (hsize < 16)
-		hsize = 16;
-	if ((nametab = calloc(NAME_HASH_TAB_SIZE(hsize), 1)) == NULL)
-		do_error(_("calloc failed in name_hash_init\n"));
-	nametab->size = hsize;
-	return nametab;
-}
-
-static void
-name_hash_done(
-	name_hash_tab_t	*nametab)
-{
-	int		i;
-	name_hash_ent_t	*n;
-	name_hash_ent_t	*p;
-
-	for (i = 0; i < nametab->size; i++) {
-		for (p = nametab->tab[i]; p; p = n) {
-			n = p->next;
-			free(p);
-		}
+	for (p = hashtab->first; p; p = p->nextbyorder) {
+		name = malloc(p->namelen);
+		memcpy(name, p->name, p->namelen);
+		p->name = name;
 	}
-	free(nametab);
+	hashtab->names_duped = 1;
 }
 
-
 /*
  * Version 1 or 2 directory routine wrappers
 */
@@ -1385,7 +1383,8 @@ lf_block_dir_entry_check(xfs_mount_t		*m
 			dir_stack_t		*stack,
 			ino_tree_node_t		*current_irec,
 			int			current_ino_offset,
-			name_hash_tab_t		*nametab)
+			dir_hash_tab_t		*hashtab,
+			xfs_dablk_t		da_bno)
 {
 	xfs_dir_leaf_entry_t	*entry;
 	ino_tree_node_t		*irec;
@@ -1545,7 +1544,9 @@ lf_block_dir_entry_check(xfs_mount_t		*m
 		/*
 		 * check for duplicate names in directory.
 		 */ 
-		if (!name_hash_add(nametab, namest->name, entry->namelen)) {
+		if (!dir_hash_add(hashtab, (da_bno << mp->m_sb.sb_blocklog) + 
+						entry->nameidx, 
+				lino, entry->namelen, namest->name)) {
 			do_warn(
 		_("entry \"%s\" (ino %llu) in dir %llu is a duplicate name"),
 				fname, lino, ino);
@@ -1635,7 +1636,7 @@ longform_dir_entry_check(xfs_mount_t	*mp
 			dir_stack_t	*stack,
 			ino_tree_node_t	*irec,
 			int		ino_offset,
-			name_hash_tab_t	*nametab)
+			dir_hash_tab_t	*hashtab)
 {
 	xfs_dir_leafblock_t	*leaf;
 	xfs_buf_t		*bp;
@@ -1677,8 +1678,6 @@ longform_dir_entry_check(xfs_mount_t	*mp
 
 		leaf = (xfs_dir_leafblock_t *)XFS_BUF_PTR(bp);
 
-		da_bno = INT_GET(leaf->hdr.info.forw, ARCH_CONVERT);
-
 		if (INT_GET(leaf->hdr.info.magic, ARCH_CONVERT) !=
 		    XFS_DIR_LEAF_MAGIC)  {
 			if (!no_modify)  {
@@ -1699,9 +1698,11 @@ _("bad magic # (0x%x) for dir ino %llu l
 		}
 
 		if (!skipit)
-			lf_block_dir_entry_check(mp, ino, leaf, &dirty,
-						num_illegal, need_dot, stack,
-						irec, ino_offset, nametab);
+			lf_block_dir_entry_check(mp, ino, leaf, &dirty, 
+					num_illegal, need_dot, stack, irec, 
+					ino_offset, hashtab, da_bno);
+
+		da_bno = INT_GET(leaf->hdr.info.forw, ARCH_CONVERT);
 
 		ASSERT(dirty == 0 || (dirty && !no_modify));
 
@@ -1745,6 +1746,152 @@ _("can't map leaf block %d in dir %llu, 
 }
 
 /*
+ * Unexpected failure during the rebuild will leave the entries in
+ * lost+found on the next run
+ */
+
+static void 
+longform_dir2_rebuild(
+	xfs_mount_t	*mp,
+	xfs_ino_t	ino,
+	xfs_inode_t	*ip,
+	dir_hash_tab_t	*hashtab)
+{
+	int			error;
+	int			nres;
+	xfs_trans_t		*tp;
+	xfs_fileoff_t		lastblock;
+	xfs_fsblock_t		firstblock;
+	xfs_bmap_free_t		flist;
+	xfs_ino_t		parentino;
+	xfs_inode_t		*pip;
+	int			byhash;
+	dir_hash_ent_t		*p;
+	int			committed;
+	int			done;
+	
+	/* 
+	 * trash directory completely and rebuild from scratch using the
+	 * name/inode pairs in the hash table
+	 */
+	 
+	do_warn(_("rebuilding directory inode %llu\n"), ino);
+	
+	/* 
+	 * first attempt to locate the parent inode, if it can't be found,
+	 * we'll use the lost+found inode 
+	 */
+	byhash = DIR_HASH_FUNC(hashtab, libxfs_da_hashname((uchar_t*)"..", 2));
+	parentino = orphanage_ino;
+	for (p = hashtab->byhash[byhash]; p; p = p->nextbyhash) {
+		if (p->namelen == 2 && p->name[0] == '.' && p->name[1] == '.') {
+			parentino = p->inum;
+			break;
+		}
+	}
+
+	XFS_BMAP_INIT(&flist, &firstblock);
+		
+	tp = libxfs_trans_alloc(mp, 0);
+	nres = XFS_REMOVE_SPACE_RES(mp);
+	error = libxfs_trans_reserve(tp, nres, XFS_REMOVE_LOG_RES(mp), 0,
+			XFS_TRANS_PERM_LOG_RES, XFS_REMOVE_LOG_COUNT);
+	if (error)
+		res_failed(error);
+	libxfs_trans_ijoin(tp, ip, 0);
+	libxfs_trans_ihold(tp, ip);
+	
+	if ((error = libxfs_bmap_last_offset(tp, ip, &lastblock, 
+						XFS_DATA_FORK)))
+		do_error(_("xfs_bmap_last_offset failed -- error - %d\n"), 
+			error);
+	
+	/* re-init the directory to shortform */
+	if ((error = libxfs_trans_iget(mp, tp, parentino, 0, 0, &pip))) {
+		do_warn(
+		_("couldn't iget parent inode %llu -- error - %d\n"),
+			parentino, error);
+		/* we'll try to use the orphanage ino then */
+		parentino = orphanage_ino;
+		if ((error = libxfs_trans_iget(mp, tp, parentino, 0, 0, &pip)))
+			do_error(
+		_("couldn't iget lost+found inode %llu -- error - %d\n"),
+				parentino, error);
+	}
+
+	/* free all data, leaf, node and freespace blocks */
+	
+	if ((error = libxfs_bunmapi(tp, ip, 0, lastblock, 
+			XFS_BMAPI_METADATA, 0, &firstblock, &flist,
+			&done))) {
+		do_warn(_("xfs_bunmapi failed -- error - %d\n"), error);
+		libxfs_trans_cancel(tp, XFS_TRANS_RELEASE_LOG_RES |
+					XFS_TRANS_ABORT);
+		return;
+	}
+		
+	ASSERT(done);
+
+	libxfs_dir2_init(tp, ip, pip);
+	
+	error = libxfs_bmap_finish(&tp, &flist, firstblock, &committed);
+				
+	libxfs_trans_commit(tp, 
+			XFS_TRANS_RELEASE_LOG_RES|XFS_TRANS_SYNC, 0);
+		
+	/* go through the hash list and re-add the inodes */
+
+	for (p = hashtab->first; p; p = p->nextbyorder) {
+		
+		if (p->name[0] == '/' || (p->name[0] == '.' && (p->namelen == 1 
+				|| (p->namelen == 2 && p->name[1] == '.'))))
+			continue;
+		
+		tp = libxfs_trans_alloc(mp, 0);
+		nres = XFS_CREATE_SPACE_RES(mp, p->namelen);
+		if ((error = libxfs_trans_reserve(tp, nres, 
+				XFS_CREATE_LOG_RES(mp), 0,
+				XFS_TRANS_PERM_LOG_RES, XFS_CREATE_LOG_COUNT))) {
+			do_warn(
+	_("space reservation failed (%d), filesystem may be out of space\n"),
+				error);
+			break;
+		}
+
+		libxfs_trans_ijoin(tp, ip, 0);
+		libxfs_trans_ihold(tp, ip);
+
+		XFS_BMAP_INIT(&flist, &firstblock);
+		if ((error = libxfs_dir2_createname(tp, ip, (char*)p->name, 
+				p->namelen, p->inum, &firstblock, &flist, 
+				nres))) {
+			do_warn(
+_("name create failed in ino %llu (%d), filesystem may be out of space\n"),
+				ino, error);
+			libxfs_trans_cancel(tp, XFS_TRANS_RELEASE_LOG_RES |
+						XFS_TRANS_ABORT);
+			break;
+		}
+
+		if ((error = libxfs_bmap_finish(&tp, &flist, firstblock, 
+				&committed))) {
+			do_warn(
+	_("bmap finish failed (%d), filesystem may be out of space\n"),
+				error);
+			libxfs_bmap_cancel(&flist);
+			libxfs_trans_cancel(tp, XFS_TRANS_RELEASE_LOG_RES |
+						XFS_TRANS_ABORT);
+			break;
+		}
+			
+
+		libxfs_trans_commit(tp, 
+				XFS_TRANS_RELEASE_LOG_RES|XFS_TRANS_SYNC, 0);
+	}
+}
+
+
+/*
  * Kill a block in a version 2 inode.
  * Makes its own transaction.
  */
@@ -1807,7 +1954,6 @@ longform_dir2_entry_check_data(
 	xfs_dabuf_t		**bpp,
 	dir_hash_tab_t		*hashtab,
 	freetab_t		**freetabp,
-	name_hash_tab_t		*nametab,
 	xfs_dablk_t		da_bno,
 	int			isblock)
 {
@@ -1828,6 +1974,7 @@ longform_dir2_entry_check_data(
 	freetab_t		*freetab;
 	int			i;
 	int			ino_offset;
+	xfs_ino_t		inum;
 	ino_tree_node_t		*irec;
 	int			junkit;
 	int			lastfree;
@@ -1956,8 +2103,7 @@ longform_dir2_entry_check_data(
 	libxfs_trans_ijoin(tp, ip, 0);
 	libxfs_trans_ihold(tp, ip);
 	libxfs_da_bjoin(tp, bp);
-	if (isblock)
-		libxfs_da_bhold(tp, bp);
+	libxfs_da_bhold(tp, bp);
 	XFS_BMAP_INIT(&flist, &firstblock);
 	if (INT_GET(d->hdr.magic, ARCH_CONVERT) != wantmagic) {
 		do_warn(_("bad directory block magic # %#x for directory inode "
@@ -1987,7 +2133,7 @@ longform_dir2_entry_check_data(
 	while (ptr < endptr) {
 		dup = (xfs_dir2_data_unused_t *)ptr;
 		if (INT_GET(dup->freetag, ARCH_CONVERT) ==
-		    XFS_DIR2_DATA_FREE_TAG) {
+		    				XFS_DIR2_DATA_FREE_TAG) {
 			if (lastfree) {
 				do_warn(_("directory inode %llu block %u has "
 					  "consecutive free entries: "),
@@ -2011,10 +2157,24 @@ longform_dir2_entry_check_data(
 		addr = XFS_DIR2_DB_OFF_TO_DATAPTR(mp, db, ptr - (char *)d);
 		dep = (xfs_dir2_data_entry_t *)ptr;
 		ptr += XFS_DIR2_DATA_ENTSIZE(dep->namelen);
+		inum = INT_GET(dep->inumber, ARCH_CONVERT);
 		lastfree = 0;
-		dir_hash_add(hashtab,
-			libxfs_da_hashname((uchar_t *)dep->name, dep->namelen),
-			addr, dep->name[0] == '/');
+		if (!dir_hash_add(hashtab, addr, inum, dep->namelen, 
+				dep->name)) {
+			do_warn(
+		_("entry \"%s\" (ino %llu) in dir %llu is a duplicate name"),
+				fname, inum, ip->i_ino);
+			if (!no_modify) {
+				if (verbose)
+					do_warn(
+					_(", marking entry to be junked\n"));
+				else
+					do_warn("\n");
+			} else {
+				do_warn(_(", would junk entry\n"));
+			}
+			dep->name[0] = '/';
+		}
 		/*
 		 * skip bogus entries (leading '/').  they'll be deleted
 		 * later.  must still log it, else we leak references to
@@ -2029,7 +2189,7 @@ longform_dir2_entry_check_data(
 		junkit = 0;
 		bcopy(dep->name, fname, dep->namelen);
 		fname[dep->namelen] = '\0';
-		ASSERT(INT_GET(dep->inumber, ARCH_CONVERT) != NULLFSINO);
+		ASSERT(inum != NULLFSINO);
 		/*
 		 * skip the '..' entry since it's checked when the
 		 * directory is reached by something else.  if it never
@@ -2039,7 +2199,7 @@ longform_dir2_entry_check_data(
 		if (dep->namelen == 2 && dep->name[0] == '.' &&
 		    dep->name[1] == '.')
 			continue;
-		ASSERT(no_modify || !verify_inum(mp, INT_GET(dep->inumber, ARCH_CONVERT)));
+		ASSERT(no_modify || !verify_inum(mp, inum));
 		/*
 		 * special case the . entry.  we know there's only one
 		 * '.' and only '.' points to itself because bogus entries
@@ -2049,7 +2209,7 @@ longform_dir2_entry_check_data(
 		 * '..' is already accounted for or will be taken care
 		 * of when directory is moved to orphanage.
 		 */
-		if (ip->i_ino == INT_GET(dep->inumber, ARCH_CONVERT))  {
+		if (ip->i_ino == inum)  {
 			ASSERT(dep->name[0] == '.' && dep->namelen == 1);
 			add_inode_ref(current_irec, current_ino_offset);
 			*need_dot = 0;
@@ -2062,23 +2222,18 @@ longform_dir2_entry_check_data(
 		 * just skip it.  no need to process it and it's ..
 		 * link is already accounted for.
 		 */
-		if (INT_GET(dep->inumber, ARCH_CONVERT) == orphanage_ino &&
-		    strcmp(fname, ORPHANAGE) == 0)
+		if (inum == orphanage_ino && strcmp(fname, ORPHANAGE) == 0)
 			continue;
 		/*
 		 * skip entries with bogus inumbers if we're in no modify mode
 		 */
-		if (no_modify &&
-		    verify_inum(mp, INT_GET(dep->inumber, ARCH_CONVERT)))
+		if (no_modify && verify_inum(mp, inum))
 			continue;
 		/*
 		 * ok, now handle the rest of the cases besides '.' and '..'
 		 */
-		irec = find_inode_rec(
-			XFS_INO_TO_AGNO(mp,
-				INT_GET(dep->inumber, ARCH_CONVERT)),
-			XFS_INO_TO_AGINO(mp,
-				INT_GET(dep->inumber, ARCH_CONVERT)));
+		irec = find_inode_rec(XFS_INO_TO_AGNO(mp, inum),
+					XFS_INO_TO_AGINO(mp, inum));
 		if (irec == NULL)  {
 			nbad++;
 			do_warn(_("entry \"%s\" in directory inode %llu points "
@@ -2093,9 +2248,7 @@ longform_dir2_entry_check_data(
 			}
 			continue;
 		}
-		ino_offset = XFS_INO_TO_AGINO(mp,
-				INT_GET(dep->inumber, ARCH_CONVERT)) -
-					irec->ino_startnum;
+		ino_offset = XFS_INO_TO_AGINO(mp, inum) - irec->ino_startnum;
 		/*
 		 * if it's a free inode, blow out the entry.
 		 * by now, any inode that we think is free
@@ -2106,18 +2259,13 @@ longform_dir2_entry_check_data(
 			 * don't complain if this entry points to the old
 			 * and now-free lost+found inode
 			 */
-			if (verbose || no_modify ||
-			    INT_GET(dep->inumber, ARCH_CONVERT) !=
-			    old_orphanage_ino)
+			if (verbose || no_modify || inum != old_orphanage_ino)
 				do_warn(
 	_("entry \"%s\" in directory inode %llu points to free inode %llu"),
-					fname, ip->i_ino,
-					INT_GET(dep->inumber, ARCH_CONVERT));
+					fname, ip->i_ino, inum);
 			nbad++;
 			if (!no_modify)  {
-				if (verbose ||
-				    INT_GET(dep->inumber, ARCH_CONVERT) !=
-				    old_orphanage_ino)
+				if (verbose || inum != old_orphanage_ino)
 					do_warn(
 					_(", marking entry to be junked\n"));
 				else
@@ -2130,28 +2278,6 @@ longform_dir2_entry_check_data(
 			continue;
 		}
 		/*
-		 * check for duplicate names in directory.
-		 */ 
-		if (!name_hash_add(nametab, dep->name, dep->namelen)) {
-			do_warn(
-		_("entry \"%s\" (ino %llu) in dir %llu is a duplicate name"),
-				fname, INT_GET(dep->inumber, ARCH_CONVERT),
-				ip->i_ino);
-			nbad++;
-			if (!no_modify) {
-				if (verbose)
-					do_warn(
-					_(", marking entry to be junked\n"));
-				else
-					do_warn("\n");
-				dep->name[0] = '/';
-				libxfs_dir2_data_log_entry(tp, bp, dep);
-			} else {
-				do_warn(_(", would junk entry\n"));
-			}
-			continue;
-		}
-		/*
 		 * check easy case first, regular inode, just bump
 		 * the link count and continue
 		 */
@@ -2172,22 +2298,17 @@ longform_dir2_entry_check_data(
 			junkit = 1;
 			do_warn(
 _("entry \"%s\" in dir %llu points to an already connected directory inode %llu,\n"),
-				fname, ip->i_ino,
-				INT_GET(dep->inumber, ARCH_CONVERT));
+				fname, ip->i_ino, inum);
 		} else if (parent == ip->i_ino)  {
 			add_inode_reached(irec, ino_offset);
 			add_inode_ref(current_irec, current_ino_offset);
-			if (!is_inode_refchecked(
-				INT_GET(dep->inumber, ARCH_CONVERT), irec,
-					ino_offset))
-				push_dir(stack,
-					INT_GET(dep->inumber, ARCH_CONVERT));
+			if (!is_inode_refchecked(inum, irec, ino_offset))
+				push_dir(stack, inum);
 		} else  {
 			junkit = 1;
 			do_warn(
 _("entry \"%s\" in dir inode %llu inconsistent with .. value (%llu) in ino %llu,\n"),
-				fname, ip->i_ino, parent,
-				INT_GET(dep->inumber, ARCH_CONVERT));
+				fname, ip->i_ino, parent, inum);
 		}
 		if (junkit)  {
 			junkit = 0;
@@ -2195,9 +2316,7 @@ _("entry \"%s\" in dir inode %llu incons
 			if (!no_modify)  {
 				dep->name[0] = '/';
 				libxfs_dir2_data_log_entry(tp, bp, dep);
-				if (verbose ||
-				    INT_GET(dep->inumber, ARCH_CONVERT) !=
-				    old_orphanage_ino)
+				if (verbose || inum != old_orphanage_ino)
 					do_warn(
 					_("\twill clear entry \"%s\"\n"),
 						fname);
@@ -2212,8 +2331,6 @@ _("entry \"%s\" in dir inode %llu incons
 		libxfs_dir2_data_freescan(mp, d, &needlog, NULL);
 	if (needlog)
 		libxfs_dir2_data_log_header(tp, bp);
-	else if (!isblock && !nbad)
-		libxfs_da_brelse(tp, bp);
 	libxfs_bmap_finish(&tp, &flist, firstblock, &committed);
 	libxfs_trans_commit(tp, 0, 0);
 	freetab->ents[db].v = INT_GET(d->hdr.bestfree[0].length, ARCH_CONVERT);
@@ -2306,19 +2423,19 @@ longform_dir2_check_node(
 	xfs_fileoff_t		next_da_bno;
 	int			seeval = 0;
 	int			used;
-
+	
 	for (da_bno = mp->m_dirleafblk, next_da_bno = 0;
 	     next_da_bno != NULLFILEOFF && da_bno < mp->m_dirfreeblk;
 	     da_bno = (xfs_dablk_t)next_da_bno) {
 		next_da_bno = da_bno + mp->m_dirblkfsbs - 1;
-		if (libxfs_bmap_next_offset(NULL, ip, &next_da_bno, XFS_DATA_FORK))
+		if (libxfs_bmap_next_offset(NULL, ip, &next_da_bno, XFS_DATA_FORK)) 
 			break;
 		if (libxfs_da_read_bufr(NULL, ip, da_bno, -1, &bp,
 				XFS_DATA_FORK)) {
-			do_error(
-			_("can't read block %u for directory inode %llu\n"),
+			do_warn(
+			_("can't read leaf block %u for directory inode %llu\n"),
 				da_bno, ip->i_ino);
-			/* NOTREACHED */
+			return 1;
 		}
 		leaf = bp->data;
 		if (INT_GET(leaf->hdr.info.magic, ARCH_CONVERT) !=
@@ -2348,23 +2465,24 @@ longform_dir2_check_node(
 		seeval = dir_hash_see_all(hashtab, leaf->ents, INT_GET(leaf->hdr.count, ARCH_CONVERT),
 			INT_GET(leaf->hdr.stale, ARCH_CONVERT));
 		libxfs_da_brelse(NULL, bp);
-		if (seeval != DIR_HASH_CK_OK)
+		if (seeval != DIR_HASH_CK_OK) 
 			return 1;
 	}
-	if (dir_hash_check(hashtab, ip, seeval))
+	if (dir_hash_check(hashtab, ip, seeval)) 
 		return 1;
+	
 	for (da_bno = mp->m_dirfreeblk, next_da_bno = 0;
 	     next_da_bno != NULLFILEOFF;
 	     da_bno = (xfs_dablk_t)next_da_bno) {
 		next_da_bno = da_bno + mp->m_dirblkfsbs - 1;
-		if (libxfs_bmap_next_offset(NULL, ip, &next_da_bno, XFS_DATA_FORK))
+		if (libxfs_bmap_next_offset(NULL, ip, &next_da_bno, XFS_DATA_FORK)) 
 			break;
 		if (libxfs_da_read_bufr(NULL, ip, da_bno, -1, &bp,
 				XFS_DATA_FORK)) {
-			do_error(_("can't read block %u for directory inode "
-				   "%llu\n"),
+			do_warn(
+		_("can't read freespace block %u for directory inode %llu\n"),
 				da_bno, ip->i_ino);
-			/* NOTREACHED */
+			return 1;
 		}
 		free = bp->data;
 		fdb = XFS_DIR2_DA_TO_DB(mp, da_bno);
@@ -2418,388 +2536,9 @@ longform_dir2_check_node(
 }
 
 /*
- * Rebuild a directory: set up.
- * Turn it into a node-format directory with no contents in the
- * upper area.  Also has correct freespace blocks.
- */
-void
-longform_dir2_rebuild_setup(
-	xfs_mount_t		*mp,
-	xfs_ino_t		ino,
-	xfs_inode_t		*ip,
-	freetab_t		*freetab)
-{
-	xfs_da_args_t		args;
-	int			committed;
-	xfs_dir2_data_t		*data = NULL;
-	xfs_dabuf_t		*dbp;
-	int			error;
-	xfs_dir2_db_t		fbno;
-	xfs_dabuf_t		*fbp;
-	xfs_fsblock_t		firstblock;
-	xfs_bmap_free_t		flist;
-	xfs_dir2_free_t		*free;
-	int			i;
-	int			j;
-	xfs_dablk_t		lblkno;
-	xfs_dabuf_t		*lbp;
-	xfs_dir2_leaf_t		*leaf;
-	int			nres;
-	xfs_trans_t		*tp;
-
-	/* read first directory block */
-	tp = libxfs_trans_alloc(mp, 0);
-	nres = XFS_DAENTER_SPACE_RES(mp, XFS_DATA_FORK);
-	error = libxfs_trans_reserve(tp,
-		nres, XFS_CREATE_LOG_RES(mp), 0, XFS_TRANS_PERM_LOG_RES,
-		XFS_CREATE_LOG_COUNT);
-	if (error)
-		res_failed(error);
-	libxfs_trans_ijoin(tp, ip, 0);
-	libxfs_trans_ihold(tp, ip);
-	XFS_BMAP_INIT(&flist, &firstblock);
-	if (libxfs_da_read_buf(tp, ip, mp->m_dirdatablk, -2, &dbp,
-			XFS_DATA_FORK)) {
-		do_error(_("can't read block %u for directory inode %llu\n"),
-			mp->m_dirdatablk, ino);
-		/* NOTREACHED */
-	}
-
-	if (dbp)
-		data = dbp->data;
-
-	/* check for block format directory */
-	if (data &&
-	    INT_GET((data)->hdr.magic, ARCH_CONVERT) == XFS_DIR2_BLOCK_MAGIC) {
-		xfs_dir2_block_t	*block;
-		xfs_dir2_leaf_entry_t	*blp;
-		xfs_dir2_block_tail_t	*btp;
-		int			needlog;
-		int			needscan;
-
-		/* convert directory block from block format to data format */
-		INT_SET(data->hdr.magic, ARCH_CONVERT, XFS_DIR2_DATA_MAGIC);
-
-		/* construct freelist */
-		block = (xfs_dir2_block_t *)data;
-		btp = XFS_DIR2_BLOCK_TAIL_P(mp, block);
-		blp = XFS_DIR2_BLOCK_LEAF_P(btp);
-		needlog = needscan = 0;
-		libxfs_dir2_data_make_free(tp, dbp, (char *)blp - (char *)block,
-			(char *)block + mp->m_dirblksize - (char *)blp,
-			&needlog, &needscan);
-		if (needscan)
-			libxfs_dir2_data_freescan(mp, data, &needlog, NULL);
-		libxfs_da_log_buf(tp, dbp, 0, mp->m_dirblksize - 1);
-	} else if (dbp) {
-		libxfs_da_brelse(tp, dbp);
-	}
-
-	/* allocate blocks for btree */
-	bzero(&args, sizeof(args));
-	args.trans = tp;
-	args.dp = ip;
-	args.whichfork = XFS_DATA_FORK;
-	args.firstblock = &firstblock;
-	args.flist = &flist;
-	args.total = nres;
-	if ((error = libxfs_da_grow_inode(&args, &lblkno)) ||
-	    (error = libxfs_da_get_buf(tp, ip, lblkno, -1, &lbp, XFS_DATA_FORK))) {
-		do_error(_("can't add btree block to directory inode %llu\n"),
-			ino);
-		/* NOTREACHED */
-	}
-	leaf = lbp->data;
-	bzero(leaf, mp->m_dirblksize);
-	INT_SET(leaf->hdr.info.magic, ARCH_CONVERT, XFS_DIR2_LEAFN_MAGIC);
-	libxfs_da_log_buf(tp, lbp, 0, mp->m_dirblksize - 1);
-	libxfs_bmap_finish(&tp, &flist, firstblock, &committed);
-	libxfs_trans_commit(tp, 0, 0);
-
-	for (i = 0; i < freetab->nents; i += XFS_DIR2_MAX_FREE_BESTS(mp)) {
-		tp = libxfs_trans_alloc(mp, 0);
-		nres = XFS_DAENTER_SPACE_RES(mp, XFS_DATA_FORK);
-		error = libxfs_trans_reserve(tp,
-			nres, XFS_CREATE_LOG_RES(mp), 0, XFS_TRANS_PERM_LOG_RES,
-			XFS_CREATE_LOG_COUNT);
-		if (error)
-			res_failed(error);
-		libxfs_trans_ijoin(tp, ip, 0);
-		libxfs_trans_ihold(tp, ip);
-		XFS_BMAP_INIT(&flist, &firstblock);
-		bzero(&args, sizeof(args));
-		args.trans = tp;
-		args.dp = ip;
-		args.whichfork = XFS_DATA_FORK;
-		args.firstblock = &firstblock;
-		args.flist = &flist;
-		args.total = nres;
-		if ((error = libxfs_dir2_grow_inode(&args, XFS_DIR2_FREE_SPACE,
-						 &fbno)) ||
-		    (error = libxfs_da_get_buf(tp, ip, XFS_DIR2_DB_TO_DA(mp, fbno),
-					    -1, &fbp, XFS_DATA_FORK))) {
-			do_error(_("can't add free block to directory inode "
-				   "%llu\n"),
-				ino);
-			/* NOTREACHED */
-		}
-		free = fbp->data;
-		bzero(free, mp->m_dirblksize);
-		INT_SET(free->hdr.magic, ARCH_CONVERT, XFS_DIR2_FREE_MAGIC);
-		INT_SET(free->hdr.firstdb, ARCH_CONVERT, i);
-		INT_SET(free->hdr.nvalid, ARCH_CONVERT, XFS_DIR2_MAX_FREE_BESTS(mp));
-		if (i + INT_GET(free->hdr.nvalid, ARCH_CONVERT) > freetab->nents)
-			INT_SET(free->hdr.nvalid, ARCH_CONVERT, freetab->nents - i);
-		for (j = 0; j < INT_GET(free->hdr.nvalid, ARCH_CONVERT); j++) {
-			INT_SET(free->bests[j], ARCH_CONVERT, freetab->ents[i + j].v);
-			if (INT_GET(free->bests[j], ARCH_CONVERT) != NULLDATAOFF)
-				INT_MOD(free->hdr.nused, ARCH_CONVERT, +1);
-		}
-		libxfs_da_log_buf(tp, fbp, 0, mp->m_dirblksize - 1);
-		libxfs_bmap_finish(&tp, &flist, firstblock, &committed);
-		libxfs_trans_commit(tp, 0, 0);
-	}
-}
-
-/*
- * Rebuild the entries from a single data block.
- */
-void
-longform_dir2_rebuild_data(
-	xfs_mount_t		*mp,
-	xfs_ino_t		ino,
-	xfs_inode_t		*ip,
-	xfs_dablk_t		da_bno)
-{
-	xfs_dabuf_t		*bp;
-	xfs_dir2_block_tail_t	*btp;
-	int			committed;
-	xfs_dir2_data_t		*data;
-	xfs_dir2_db_t		dbno;
-	xfs_dir2_data_entry_t	*dep;
-	xfs_dir2_data_unused_t	*dup;
-	char			*endptr;
-	int			error;
-	xfs_dir2_free_t		*fblock;
-	xfs_dabuf_t		*fbp;
-	xfs_dir2_db_t		fdb;
-	int			fi;
-	xfs_fsblock_t		firstblock;
-	xfs_bmap_free_t		flist;
-	int			needlog;
-	int			needscan;
-	int			nres;
-	char			*ptr;
-	xfs_trans_t		*tp;
-
-	if (libxfs_da_read_buf(NULL, ip, da_bno, da_bno == 0 ? -2 : -1, &bp,
-			XFS_DATA_FORK)) {
-		do_error(_("can't read block %u for directory inode %llu\n"),
-			da_bno, ino);
-		/* NOTREACHED */
-	}
-	if (da_bno == 0 && bp == NULL)
-		/*
-		 * The block was punched out.
-		 */
-		return;
-	ASSERT(bp);
-	dbno = XFS_DIR2_DA_TO_DB(mp, da_bno);
-	fdb = XFS_DIR2_DB_TO_FDB(mp, dbno);
-	if (libxfs_da_read_buf(NULL, ip, XFS_DIR2_DB_TO_DA(mp, fdb), -1, &fbp,
-			XFS_DATA_FORK)) {
-		do_error(_("can't read block %u for directory inode %llu\n"),
-			XFS_DIR2_DB_TO_DA(mp, fdb), ino);
-		/* NOTREACHED */
-	}
-	data = malloc(mp->m_dirblksize);
-	if (!data) {
-		do_error(
-		_("malloc failed in longform_dir2_rebuild_data (%u bytes)\n"),
-			mp->m_dirblksize);
-		exit(1);
-	}
-	bcopy(bp->data, data, mp->m_dirblksize);
-	ptr = (char *)data->u;
-	if (INT_GET(data->hdr.magic, ARCH_CONVERT) == XFS_DIR2_BLOCK_MAGIC) {
-		btp = XFS_DIR2_BLOCK_TAIL_P(mp, (xfs_dir2_block_t *)data);
-		endptr = (char *)XFS_DIR2_BLOCK_LEAF_P(btp);
-	} else
-		endptr = (char *)data + mp->m_dirblksize;
-	fblock = fbp->data;
-	fi = XFS_DIR2_DB_TO_FDINDEX(mp, dbno);
-	tp = libxfs_trans_alloc(mp, 0);
-	error = libxfs_trans_reserve(tp, 0, XFS_CREATE_LOG_RES(mp), 0,
-		XFS_TRANS_PERM_LOG_RES, XFS_CREATE_LOG_COUNT);
-	if (error)
-		res_failed(error);
-	libxfs_trans_ijoin(tp, ip, 0);
-	libxfs_trans_ihold(tp, ip);
-	libxfs_da_bjoin(tp, bp);
-	libxfs_da_bhold(tp, bp);
-	libxfs_da_bjoin(tp, fbp);
-	libxfs_da_bhold(tp, fbp);
-	XFS_BMAP_INIT(&flist, &firstblock);
-	needlog = needscan = 0;
-	bzero(((xfs_dir2_data_t *)(bp->data))->hdr.bestfree,
-		sizeof(data->hdr.bestfree));
-	libxfs_dir2_data_make_free(tp, bp, (xfs_dir2_data_aoff_t)sizeof(data->hdr),
-		mp->m_dirblksize - sizeof(data->hdr), &needlog, &needscan);
-	ASSERT(needscan == 0);
-	libxfs_dir2_data_log_header(tp, bp);
-	INT_SET(fblock->bests[fi], ARCH_CONVERT,
-		INT_GET(((xfs_dir2_data_t *)(bp->data))->hdr.bestfree[0].length, ARCH_CONVERT));
-	libxfs_dir2_free_log_bests(tp, fbp, fi, fi);
-	libxfs_bmap_finish(&tp, &flist, firstblock, &committed);
-	libxfs_trans_commit(tp, 0, 0);
-
-	while (ptr < endptr) {
-		dup = (xfs_dir2_data_unused_t *)ptr;
-		if (INT_GET(dup->freetag, ARCH_CONVERT) == XFS_DIR2_DATA_FREE_TAG) {
-			ptr += INT_GET(dup->length, ARCH_CONVERT);
-			continue;
-		}
-		dep = (xfs_dir2_data_entry_t *)ptr;
-		ptr += XFS_DIR2_DATA_ENTSIZE(dep->namelen);
-		if (dep->name[0] == '/')
-			continue;
-		tp = libxfs_trans_alloc(mp, 0);
-		nres = XFS_CREATE_SPACE_RES(mp, dep->namelen);
-		error = libxfs_trans_reserve(tp, nres, XFS_CREATE_LOG_RES(mp), 0,
-			XFS_TRANS_PERM_LOG_RES, XFS_CREATE_LOG_COUNT);
-		if (error)
-			res_failed(error);
-		libxfs_trans_ijoin(tp, ip, 0);
-		libxfs_trans_ihold(tp, ip);
-		libxfs_da_bjoin(tp, bp);
-		libxfs_da_bhold(tp, bp);
-		libxfs_da_bjoin(tp, fbp);
-		libxfs_da_bhold(tp, fbp);
-		XFS_BMAP_INIT(&flist, &firstblock);
-		error = dir_createname(mp, tp, ip, (char *)dep->name,
-			dep->namelen, INT_GET(dep->inumber, ARCH_CONVERT),
-			&firstblock, &flist, nres);
-		ASSERT(error == 0);
-		libxfs_bmap_finish(&tp, &flist, firstblock, &committed);
-		libxfs_trans_commit(tp, 0, 0);
-	}
-	libxfs_da_brelse(NULL, bp);
-	libxfs_da_brelse(NULL, fbp);
-	free(data);
-}
-
-/*
- * Finish the rebuild of a directory.
- * Stuff / in and then remove it, this forces the directory to end
- * up in the right format.
- */
-void
-longform_dir2_rebuild_finish(
-	xfs_mount_t		*mp,
-	xfs_ino_t		ino,
-	xfs_inode_t		*ip)
-{
-	int			committed;
-	int			error;
-	xfs_fsblock_t		firstblock;
-	xfs_bmap_free_t		flist;
-	int			nres;
-	xfs_trans_t		*tp;
-
-	tp = libxfs_trans_alloc(mp, 0);
-	nres = XFS_CREATE_SPACE_RES(mp, 1);
-	error = libxfs_trans_reserve(tp, nres, XFS_CREATE_LOG_RES(mp), 0,
-		XFS_TRANS_PERM_LOG_RES, XFS_CREATE_LOG_COUNT);
-	if (error)
-		res_failed(error);
-	libxfs_trans_ijoin(tp, ip, 0);
-	libxfs_trans_ihold(tp, ip);
-	XFS_BMAP_INIT(&flist, &firstblock);
-	error = dir_createname(mp, tp, ip, "/", 1, ino,
-			&firstblock, &flist, nres);
-	ASSERT(error == 0);
-	libxfs_bmap_finish(&tp, &flist, firstblock, &committed);
-	libxfs_trans_commit(tp, 0, 0);
-
-	/* could kill trailing empty data blocks here */
-
-	tp = libxfs_trans_alloc(mp, 0);
-	nres = XFS_REMOVE_SPACE_RES(mp);
-	error = libxfs_trans_reserve(tp, nres, XFS_REMOVE_LOG_RES(mp), 0,
-		XFS_TRANS_PERM_LOG_RES, XFS_REMOVE_LOG_COUNT);
-	if (error)
-		res_failed(error);
-	libxfs_trans_ijoin(tp, ip, 0);
-	libxfs_trans_ihold(tp, ip);
-	XFS_BMAP_INIT(&flist, &firstblock);
-	error = dir_removename(mp, tp, ip, "/", 1, ino,
-			&firstblock, &flist, nres);
-	ASSERT(error == 0);
-	libxfs_bmap_finish(&tp, &flist, firstblock, &committed);
-	libxfs_trans_commit(tp, 0, 0);
-}
-
-/*
- * Rebuild a directory.
- * Remove all the non-data blocks.
- * Re-initialize to (empty) node form.
- * Loop over the data blocks reinserting each entry.
- * Force the directory into the right format.
- */
-void
-longform_dir2_rebuild(
-	xfs_mount_t	*mp,
-	xfs_ino_t	ino,
-	xfs_inode_t	*ip,
-	int		*num_illegal,
-	freetab_t	*freetab,
-	int		isblock)
-{
-	xfs_dabuf_t	*bp;
-	xfs_dablk_t	da_bno;
-	xfs_fileoff_t	next_da_bno;
-
-	do_warn(_("rebuilding directory inode %llu\n"), ino);
-
-	/* kill leaf blocks */
-	for (da_bno = mp->m_dirleafblk, next_da_bno = isblock ? NULLFILEOFF : 0;
-	     next_da_bno != NULLFILEOFF;
-	     da_bno = (xfs_dablk_t)next_da_bno) {
-		next_da_bno = da_bno + mp->m_dirblkfsbs - 1;
-		if (libxfs_bmap_next_offset(NULL, ip, &next_da_bno, XFS_DATA_FORK))
-			break;
-		if (libxfs_da_get_buf(NULL, ip, da_bno, -1, &bp, XFS_DATA_FORK)) {
-			do_error(_("can't get block %u for directory inode "
-				   "%llu\n"),
-				da_bno, ino);
-			/* NOTREACHED */
-		}
-		dir2_kill_block(mp, ip, da_bno, bp);
-	}
-
-	/* rebuild empty btree and freelist */
-	longform_dir2_rebuild_setup(mp, ino, ip, freetab);
-
-	/* rebuild directory */
-	for (da_bno = mp->m_dirdatablk, next_da_bno = 0;
-	     da_bno < mp->m_dirleafblk && next_da_bno != NULLFILEOFF;
-	     da_bno = (xfs_dablk_t)next_da_bno) {
-		next_da_bno = da_bno + mp->m_dirblkfsbs - 1;
-		if (libxfs_bmap_next_offset(NULL, ip, &next_da_bno, XFS_DATA_FORK))
-			break;
-		longform_dir2_rebuild_data(mp, ino, ip, da_bno);
-	}
-
-	/* put the directory in the appropriate on-disk format */
-	longform_dir2_rebuild_finish(mp, ino, ip);
-	*num_illegal = 0;
-}
-
-/*
- * succeeds or dies, inode never gets dirtied since all changes
- * happen in file blocks.  the inode size and other core info
- * is already correct, it's just the leaf entries that get altered.
- * XXX above comment is wrong for v2 - need to see why it matters
+ * If a directory is corrupt, we need to read in as many entries as possible,
+ * destroy the entry and create a new one with recovered name/inode pairs.
+ * (ie. get libxfs to do all the grunt work)
  */
 void
 longform_dir2_entry_check(xfs_mount_t	*mp,
@@ -2810,15 +2549,14 @@ longform_dir2_entry_check(xfs_mount_t	*m
 			dir_stack_t	*stack,
 			ino_tree_node_t	*irec,
 			int		ino_offset,
-			name_hash_tab_t	*nametab)
+			dir_hash_tab_t	*hashtab)
 {
 	xfs_dir2_block_t	*block;
 	xfs_dir2_leaf_entry_t	*blp;
-	xfs_dabuf_t		*bp;
+	xfs_dabuf_t		**bplist;
 	xfs_dir2_block_tail_t	*btp;
 	xfs_dablk_t		da_bno;
 	freetab_t		*freetab;
-	dir_hash_tab_t		*hashtab;
 	int			i;
 	int			isblock;
 	int			isleaf;
@@ -2840,6 +2578,7 @@ longform_dir2_entry_check(xfs_mount_t	*m
 		freetab->ents[i].v = NULLDATAOFF;
 		freetab->ents[i].s = 0;
 	}
+	bplist = calloc(freetab->naents, sizeof(xfs_dabuf_t*));
 	/* is this a block, leaf, or node directory? */
 	libxfs_dir2_isblock(NULL, ip, &isblock);
 	libxfs_dir2_isleaf(NULL, ip, &isleaf);
@@ -2847,50 +2586,58 @@ longform_dir2_entry_check(xfs_mount_t	*m
 	if (do_prefetch && !isblock)
 		prefetch_p6_dir2(mp, ip);
 
-	/* check directory data */
-	hashtab = dir_hash_init(ip->i_d.di_size);
+	/* check directory "data" blocks (ie. name/inode pairs) */
 	for (da_bno = 0, next_da_bno = 0;
 	     next_da_bno != NULLFILEOFF && da_bno < mp->m_dirleafblk;
 	     da_bno = (xfs_dablk_t)next_da_bno) {
 		next_da_bno = da_bno + mp->m_dirblkfsbs - 1;
+		ASSERT(XFS_DIR2_DA_TO_DB(mp, da_bno) < freetab->naents);
 		if (libxfs_bmap_next_offset(NULL, ip, &next_da_bno, XFS_DATA_FORK))
 			break;
-		if (libxfs_da_read_bufr(NULL, ip, da_bno,
-				da_bno == 0 ? -2 : -1, &bp, XFS_DATA_FORK)) {
-			do_error(_("can't read block %u for directory inode "
-				   "%llu\n"),
+		if (libxfs_da_read_bufr(NULL, ip, da_bno, -1, 
+				&bplist[XFS_DIR2_DA_TO_DB(mp, da_bno)], 
+				XFS_DATA_FORK)) {
+			do_warn(_(
+			"can't read data block %u for directory inode %llu\n"),
 				da_bno, ino);
-			/* NOTREACHED */
+			*num_illegal++;
+			continue;	/* try and read all "data" blocks */
 		}
-		/* is there a hole at the start? */
-		if (da_bno == 0 && bp == NULL)
-			continue;
 		longform_dir2_entry_check_data(mp, ip, num_illegal, need_dot,
-			stack, irec, ino_offset, &bp, hashtab, &freetab, 
-			nametab, da_bno, isblock);
-		/* it releases the buffer unless isblock is set */
+				stack, irec, ino_offset, 
+				&bplist[XFS_DIR2_DA_TO_DB(mp, da_bno)], hashtab,  
+				&freetab, da_bno, isblock);
 	}
 	fixit = (*num_illegal != 0) || dir2_is_badino(ino);
 
 	/* check btree and freespace */
 	if (isblock) {
-		ASSERT(bp);
-		block = bp->data;
+		block = bplist[0]->data;
 		btp = XFS_DIR2_BLOCK_TAIL_P(mp, block);
 		blp = XFS_DIR2_BLOCK_LEAF_P(btp);
-		seeval = dir_hash_see_all(hashtab, blp, INT_GET(btp->count, ARCH_CONVERT), INT_GET(btp->stale, ARCH_CONVERT));
+		seeval = dir_hash_see_all(hashtab, blp, 
+				INT_GET(btp->count, ARCH_CONVERT), 
+				INT_GET(btp->stale, ARCH_CONVERT));
 		if (dir_hash_check(hashtab, ip, seeval))
 			fixit |= 1;
-		libxfs_da_brelse(NULL, bp);
 	} else if (isleaf) {
 		fixit |= longform_dir2_check_leaf(mp, ip, hashtab, freetab);
 	} else {
 		fixit |= longform_dir2_check_node(mp, ip, hashtab, freetab);
 	}
-	dir_hash_done(hashtab);
-	if (!no_modify && fixit)
-		longform_dir2_rebuild(mp, ino, ip, num_illegal, freetab,
-			isblock);
+	if (!no_modify && fixit) {
+		dir_hash_dup_names(hashtab);
+		for (i = 0; i < freetab->naents; i++) 
+			if (bplist[i])
+				libxfs_da_brelse(NULL, bplist[i]);
+		longform_dir2_rebuild(mp, ino, ip, hashtab);
+		*num_illegal = 0;
+	} else {
+		for (i = 0; i < freetab->naents; i++) 
+			if (bplist[i])
+				libxfs_da_brelse(NULL, bplist[i]);
+	}
+	
 	free(freetab);
 }
 
@@ -2906,7 +2653,7 @@ shortform_dir_entry_check(xfs_mount_t	*m
 			dir_stack_t	*stack,
 			ino_tree_node_t	*current_irec,
 			int		current_ino_offset,
-			name_hash_tab_t	*nametab)
+			dir_hash_tab_t	*hashtab)
 {
 	xfs_ino_t		lino;
 	xfs_ino_t		parent;
@@ -3044,7 +2791,7 @@ _("entry \"%s\" in shortform dir %llu re
 		ASSERT(irec != NULL);
 
 		ino_offset = XFS_INO_TO_AGINO(mp, lino) - irec->ino_startnum;
-
+		
 		/*
 		 * if it's a free inode, blow out the entry.
 		 * by now, any inode that we think is free
@@ -3066,8 +2813,9 @@ _("entry \"%s\" in shortform dir inode %
 				do_warn(_("would junk entry \"%s\"\n"),
 					fname);
 			}
-		} else if (!name_hash_add(nametab, sf_entry->name, 
-					sf_entry->namelen)) {
+		} else if (!dir_hash_add(hashtab, 
+				(xfs_dir2_dataptr_t)(sf_entry - &sf->list[0]),
+				lino, sf_entry->namelen, sf_entry->name)) {
 			/*
 			 * check for duplicate names in directory.
 			 */ 
@@ -3311,7 +3059,7 @@ shortform_dir2_entry_check(xfs_mount_t	*
 			dir_stack_t	*stack,
 			ino_tree_node_t	*current_irec,
 			int		current_ino_offset,
-			name_hash_tab_t	*nametab)
+			dir_hash_tab_t	*hashtab)
 {
 	xfs_ino_t		lino;
 	xfs_ino_t		parent;
@@ -3484,7 +3232,9 @@ shortform_dir2_entry_check(xfs_mount_t	*
 				do_warn(_("would junk entry \"%s\"\n"),
 					fname);
 			}
-		} else if (!name_hash_add(nametab, sfep->name, sfep->namelen)) {
+		} else if (!dir_hash_add(hashtab, (xfs_dir2_dataptr_t)
+					(sfep - XFS_DIR2_SF_FIRSTENTRY(sfp)),
+				lino, sfep->namelen, sfep->name)) {
 			/*
 			 * check for duplicate names in directory.
 			 */ 
@@ -3650,7 +3400,7 @@ process_dirstack(xfs_mount_t *mp, dir_st
 	xfs_trans_t		*tp;
 	xfs_dahash_t		hashval;
 	ino_tree_node_t		*irec;
-	name_hash_tab_t		*nametab;
+	dir_hash_tab_t		*hashtab;
 	int			ino_offset, need_dot, committed;
 	int			dirty, num_illegal, error, nres;
 
@@ -3731,7 +3481,7 @@ process_dirstack(xfs_mount_t *mp, dir_st
 
 		add_inode_refchecked(ino, irec, ino_offset);
 
-		nametab = name_hash_init(ip->i_d.di_size);
+		hashtab = dir_hash_init(ip->i_d.di_size);
 
 		/*
 		 * look for bogus entries
@@ -3750,13 +3500,13 @@ process_dirstack(xfs_mount_t *mp, dir_st
 							&num_illegal, &need_dot,
 							stack, irec,
 							ino_offset,
-							nametab);
+							hashtab);
 			else
 				longform_dir_entry_check(mp, ino, ip,
 							&num_illegal, &need_dot,
 							stack, irec,
 							ino_offset,
-							nametab);
+							hashtab);
 			break;
 		case XFS_DINODE_FMT_LOCAL:
 			tp = libxfs_trans_alloc(mp, 0);
@@ -3781,12 +3531,12 @@ process_dirstack(xfs_mount_t *mp, dir_st
 				shortform_dir2_entry_check(mp, ino, ip, &dirty,
 							stack, irec,
 							ino_offset,
-							nametab);
+							hashtab);
 			else
 				shortform_dir_entry_check(mp, ino, ip, &dirty,
 							stack, irec,
 							ino_offset,
-							nametab);
+							hashtab);
 
 			ASSERT(dirty == 0 || (dirty && !no_modify));
 			if (dirty)  {
@@ -3801,7 +3551,7 @@ process_dirstack(xfs_mount_t *mp, dir_st
 		default:
 			break;
 		}
-		name_hash_done(nametab);
+		dir_hash_done(hashtab);
 
 		hashval = 0;
 
@@ -4223,6 +3973,10 @@ _("        - skipping filesystem travers
 	}
 
 	do_log(_("        - traversals finished ... \n"));
+	
+	/* flush all dirty data before doing lost+found search */
+	libxfs_bcache_flush();
+	
 	do_log(_("        - moving disconnected inodes to lost+found ... \n"));
 
 	/*

  reply	other threads:[~2006-07-31  9:36 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-07-25  3:50 review: increase bulkstat readahead window Nathan Scott
2006-07-25  9:40 ` Christoph Hellwig
2006-07-25 22:37   ` Nathan Scott
2006-07-26 10:25     ` Christoph Hellwig
2006-07-27 23:17       ` Nathan Scott
2006-07-28  1:58         ` Review: xfs_repair fixes for dir2 corruption Barry Naujok
2006-07-28  8:10           ` Nathan Scott
2006-07-28 14:45             ` Madan Valluri
2006-07-31  7:18               ` Barry Naujok [this message]
2006-07-30  5:19           ` christian
2006-08-01 21:50           ` Adam Sjøgren
2006-08-01 23:06             ` Christian Guggenberger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200607310714.RAA03415@larry.melbourne.sgi.com \
    --to=bnaujok@melbourne.sgi.com \
    --cc=mvalluri@sgi.com \
    --cc=nathans@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.