* [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf
@ 2013-05-14 16:21 Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 02/47] libgfs2: allow dir_split_leaf to receive a leaf buffer Bob Peterson
` (45 more replies)
0 siblings, 46 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC
To: cluster-devel.redhat.com
This patch externalizes the libgfs2 function dir_split_leaf (it drops
the static qualifier and adds a declaration to libgfs2.h) so that
fsck.gfs2 can split leaf blocks in a future patch.
rhbz#902920
---
gfs2/libgfs2/fs_ops.c | 3 +--
gfs2/libgfs2/libgfs2.h | 2 ++
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
index 51e6abf..8b67d2a 100644
--- a/gfs2/libgfs2/fs_ops.c
+++ b/gfs2/libgfs2/fs_ops.c
@@ -925,8 +925,7 @@ void gfs2_put_leaf_nr(struct gfs2_inode *dip, uint32_t inx, uint64_t leaf_out)
}
}
-static void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex,
- uint64_t leaf_no)
+void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex, uint64_t leaf_no)
{
struct gfs2_buffer_head *nbh, *obh;
struct gfs2_leaf *nleaf, *oleaf;
diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index 997e23f..8f298ea 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -468,6 +468,8 @@ extern void block_map(struct gfs2_inode *ip, uint64_t lblock, int *new,
extern void gfs2_get_leaf_nr(struct gfs2_inode *dip, uint32_t index,
uint64_t *leaf_out);
extern void gfs2_put_leaf_nr(struct gfs2_inode *dip, uint32_t inx, uint64_t leaf_out);
+extern void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex,
+ uint64_t leaf_no);
extern void gfs2_free_block(struct gfs2_sbd *sdp, uint64_t block);
extern int gfs2_freedi(struct gfs2_sbd *sdp, uint64_t block);
extern int gfs2_get_leaf(struct gfs2_inode *dip, uint64_t leaf_no,
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 02/47] libgfs2: allow dir_split_leaf to receive a leaf buffer
From: Bob Peterson @ 2013-05-14 16:21 UTC
To: cluster-devel.redhat.com
This is a small performance improvement. Rather than having function
dir_split_leaf read in the leaf to be split, this patch lets the
calling function, which has already read the leaf in, pass its
buffer_head in. The caller is now also responsible for releasing
that buffer.
rhbz#902920
---
gfs2/libgfs2/fs_ops.c | 9 ++++-----
gfs2/libgfs2/libgfs2.h | 2 +-
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
index 8b67d2a..89adf32 100644
--- a/gfs2/libgfs2/fs_ops.c
+++ b/gfs2/libgfs2/fs_ops.c
@@ -925,9 +925,10 @@ void gfs2_put_leaf_nr(struct gfs2_inode *dip, uint32_t inx, uint64_t leaf_out)
}
}
-void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex, uint64_t leaf_no)
+void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex, uint64_t leaf_no,
+ struct gfs2_buffer_head *obh)
{
- struct gfs2_buffer_head *nbh, *obh;
+ struct gfs2_buffer_head *nbh;
struct gfs2_leaf *nleaf, *oleaf;
struct gfs2_dirent *dent, *prev = NULL, *next = NULL, *new;
uint32_t start, len, half_len, divider;
@@ -951,7 +952,6 @@ void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex, uint64_t leaf_no)
nleaf = (struct gfs2_leaf *)nbh->b_data;
nleaf->lf_dirent_format = cpu_to_be32(GFS2_FORMAT_DE);
- obh = bread(dip->i_sbd, leaf_no);
oleaf = (struct gfs2_leaf *)obh->b_data;
len = 1 << (dip->i_di.di_depth - be16_to_cpu(oleaf->lf_depth));
@@ -1036,7 +1036,6 @@ void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex, uint64_t leaf_no)
bmodified(dip->i_bh);
bmodified(obh); /* Need to do this in case nothing was moved */
- brelse(obh);
bmodified(nbh);
brelse(nbh);
}
@@ -1183,8 +1182,8 @@ restart:
if (dirent_alloc(dip, bh, len, &dent)) {
if (be16_to_cpu(leaf->lf_depth) < dip->i_di.di_depth) {
+ dir_split_leaf(dip, lindex, leaf_no, bh);
brelse(bh);
- dir_split_leaf(dip, lindex, leaf_no);
goto restart;
} else if (dip->i_di.di_depth < GFS2_DIR_MAX_DEPTH) {
diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index 8f298ea..3147c83 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -469,7 +469,7 @@ extern void gfs2_get_leaf_nr(struct gfs2_inode *dip, uint32_t index,
uint64_t *leaf_out);
extern void gfs2_put_leaf_nr(struct gfs2_inode *dip, uint32_t inx, uint64_t leaf_out);
extern void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex,
- uint64_t leaf_no);
+ uint64_t leaf_no, struct gfs2_buffer_head *obh);
extern void gfs2_free_block(struct gfs2_sbd *sdp, uint64_t block);
extern int gfs2_freedi(struct gfs2_sbd *sdp, uint64_t block);
extern int gfs2_get_leaf(struct gfs2_inode *dip, uint64_t leaf_no,
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 03/47] libgfs2: let dir_split_leaf receive a "broken" lindex
From: Bob Peterson @ 2013-05-14 16:21 UTC
To: cluster-devel.redhat.com
For ordinary leaf blocks, the hash table must follow the rule that
a leaf's pointers start on a power-of-two boundary. In other words,
it must hold that: start = (lindex & ~(len - 1));
But when doing repairs, fsck will need to detect hash tables that
violate this rule and fix them. In that case, it may need to pass
in an invalid starting offset for a leaf to split. This patch
moves the responsibility for aligning the starting offset to the
calling function.
rhbz#902920
---
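Note: the boundary rule from the message above can be illustrated with
a small standalone sketch. The helper names below (leaf_span,
leaf_start, leaf_start_is_aligned) are hypothetical and are not part of
libgfs2; they only restate the arithmetic that dir_e_add performs
before calling dir_split_leaf.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not in libgfs2: number of consecutive hash
 * table slots that point at a leaf of depth leaf_depth when the
 * directory's hash table has depth dir_depth. */
static uint32_t leaf_span(uint16_t dir_depth, uint16_t leaf_depth)
{
	assert(leaf_depth <= dir_depth && dir_depth - leaf_depth < 32);
	return (uint32_t)1 << (dir_depth - leaf_depth);
}

/* Round a hash table index down to the first slot of its run.
 * len must be a power of two for the mask to be valid. */
static uint32_t leaf_start(uint32_t lindex, uint32_t len)
{
	assert(len != 0 && (len & (len - 1)) == 0);
	return lindex & ~(len - 1);
}

/* Returns 1 if start already sits on the power-of-two boundary. */
static int leaf_start_is_aligned(uint32_t start, uint32_t len)
{
	return leaf_start(start, len) == start;
}
```

A normal caller would pass leaf_start(lindex, len); a repair tool could
use something like leaf_start_is_aligned to detect a misplaced run and
then pass the unaligned index through unchanged.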
gfs2/libgfs2/fs_ops.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
index 89adf32..11ef6b4 100644
--- a/gfs2/libgfs2/fs_ops.c
+++ b/gfs2/libgfs2/fs_ops.c
@@ -957,7 +957,7 @@ void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex, uint64_t leaf_no,
len = 1 << (dip->i_di.di_depth - be16_to_cpu(oleaf->lf_depth));
half_len = len >> 1;
- start = (lindex & ~(len - 1));
+ start = lindex;
lp = calloc(1, half_len * sizeof(uint64_t));
if (lp == NULL) {
@@ -1160,7 +1160,7 @@ static int dir_e_add(struct gfs2_inode *dip, const char *filename, int len,
struct gfs2_buffer_head *bh, *nbh;
struct gfs2_leaf *leaf, *nleaf;
struct gfs2_dirent *dent;
- uint32_t lindex;
+ uint32_t lindex, llen;
uint32_t hash;
uint64_t leaf_no, bn;
int err = 0;
@@ -1182,7 +1182,10 @@ restart:
if (dirent_alloc(dip, bh, len, &dent)) {
if (be16_to_cpu(leaf->lf_depth) < dip->i_di.di_depth) {
- dir_split_leaf(dip, lindex, leaf_no, bh);
+ llen = 1 << (dip->i_di.di_depth -
+ be16_to_cpu(leaf->lf_depth));
+ dir_split_leaf(dip, lindex & ~(llen - 1),
+ leaf_no, bh);
brelse(bh);
goto restart;
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 04/47] fsck.gfs2: Move function find_free_blk to util.c
From: Bob Peterson @ 2013-05-14 16:21 UTC
To: cluster-devel.redhat.com
In a future patch to fsck, function find_free_blk will be called in
order to correctly report blocks that will need to be allocated for
things such as leaf splits. This patch moves function find_free_blk
to a more centralized place, util.c, to that end.
rhbz#902920
---
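Note: for readers unfamiliar with the bitmap layout that find_free_blk
walks, here is a minimal standalone sketch of the inner scan. It
assumes two bits of state per block, four blocks per byte, and a state
value of 0 meaning "free"; the macro names are local stand-ins chosen
for this sketch, and first_free_in_bitmap is a hypothetical function,
not something in gfs2-utils.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins (assumed values) for the bitmap constants. */
#define SKETCH_BIT_SIZE      2	/* bits of state per block */
#define SKETCH_BLKS_PER_BYTE 4	/* blocks described by one byte */
#define SKETCH_BLKST_FREE    0	/* state value meaning "free" */

/* Return the index of the first free block described by the bitmap,
 * or -1 if every block is in use. The lowest two bits of each byte
 * describe the first block covered by that byte. */
static int64_t first_free_in_bitmap(const unsigned char *bitmap,
				    size_t nbytes)
{
	size_t x;
	unsigned int y, state;

	assert(bitmap != NULL);
	for (x = 0; x < nbytes; x++) {
		for (y = 0; y < SKETCH_BLKS_PER_BYTE; y++) {
			state = (bitmap[x] >> (SKETCH_BIT_SIZE * y)) & 0x03;
			if (state == SKETCH_BLKST_FREE)
				return (int64_t)(x * SKETCH_BLKS_PER_BYTE + y);
		}
	}
	return -1;
}
```

The real function additionally skips the rgrp or metadata header at
the start of each bitmap block and adds ri_data0 to turn the index
into a file system block number.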
gfs2/fsck/lost_n_found.c | 39 ---------------------------------------
gfs2/fsck/util.c | 39 +++++++++++++++++++++++++++++++++++++++
gfs2/fsck/util.h | 2 ++
3 files changed, 41 insertions(+), 39 deletions(-)
diff --git a/gfs2/fsck/lost_n_found.c b/gfs2/fsck/lost_n_found.c
index 570f3a8..1fb5076 100644
--- a/gfs2/fsck/lost_n_found.c
+++ b/gfs2/fsck/lost_n_found.c
@@ -88,45 +88,6 @@ static void add_dotdot(struct gfs2_inode *ip)
}
}
-static uint64_t find_free_blk(struct gfs2_sbd *sdp)
-{
- struct osi_node *n, *next = NULL;
- struct rgrp_tree *rl = NULL;
- struct gfs2_rindex *ri;
- struct gfs2_rgrp *rg;
- unsigned int block, bn = 0, x = 0, y = 0;
- unsigned int state;
- struct gfs2_buffer_head *bh;
-
- memset(&rg, 0, sizeof(rg));
- for (n = osi_first(&sdp->rgtree); n; n = next) {
- next = osi_next(n);
- rl = (struct rgrp_tree *)n;
- if (rl->rg.rg_free)
- break;
- }
-
- if (n == NULL)
- return 0;
-
- ri = &rl->ri;
- rg = &rl->rg;
-
- for (block = 0; block < ri->ri_length; block++) {
- bh = rl->bh[block];
- x = (block) ? sizeof(struct gfs2_meta_header) : sizeof(struct gfs2_rgrp);
-
- for (; x < sdp->bsize; x++)
- for (y = 0; y < GFS2_NBBY; y++) {
- state = (bh->b_data[x] >> (GFS2_BIT_SIZE * y)) & 0x03;
- if (state == GFS2_BLKST_FREE)
- return ri->ri_data0 + bn;
- bn++;
- }
- }
- return 0;
-}
-
/* add_inode_to_lf - Add dir entry to lost+found for the inode
* @ip: inode to add to lost + found
*
diff --git a/gfs2/fsck/util.c b/gfs2/fsck/util.c
index 7c89155..94d532e 100644
--- a/gfs2/fsck/util.c
+++ b/gfs2/fsck/util.c
@@ -615,3 +615,42 @@ bad_dinode:
stack;
return -EPERM;
}
+
+uint64_t find_free_blk(struct gfs2_sbd *sdp)
+{
+ struct osi_node *n, *next = NULL;
+ struct rgrp_tree *rl = NULL;
+ struct gfs2_rindex *ri;
+ struct gfs2_rgrp *rg;
+ unsigned int block, bn = 0, x = 0, y = 0;
+ unsigned int state;
+ struct gfs2_buffer_head *bh;
+
+ memset(&rg, 0, sizeof(rg));
+ for (n = osi_first(&sdp->rgtree); n; n = next) {
+ next = osi_next(n);
+ rl = (struct rgrp_tree *)n;
+ if (rl->rg.rg_free)
+ break;
+ }
+
+ if (n == NULL)
+ return 0;
+
+ ri = &rl->ri;
+ rg = &rl->rg;
+
+ for (block = 0; block < ri->ri_length; block++) {
+ bh = rl->bh[block];
+ x = (block) ? sizeof(struct gfs2_meta_header) : sizeof(struct gfs2_rgrp);
+
+ for (; x < sdp->bsize; x++)
+ for (y = 0; y < GFS2_NBBY; y++) {
+ state = (bh->b_data[x] >> (GFS2_BIT_SIZE * y)) & 0x03;
+ if (state == GFS2_BLKST_FREE)
+ return ri->ri_data0 + bn;
+ bn++;
+ }
+ }
+ return 0;
+}
diff --git a/gfs2/fsck/util.h b/gfs2/fsck/util.h
index 80ed0c4..1a4811c 100644
--- a/gfs2/fsck/util.h
+++ b/gfs2/fsck/util.h
@@ -184,6 +184,8 @@ extern char generic_interrupt(const char *caller, const char *where,
const char *progress, const char *question,
const char *answers);
extern char gfs2_getch(void);
+extern uint64_t find_free_blk(struct gfs2_sbd *sdp);
+
#define stack log_debug("<backtrace> - %s()\n", __func__)
#endif /* __UTIL_H__ */
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 05/47] fsck.gfs2: Split out function to make sure lost+found exists
From: Bob Peterson @ 2013-05-14 16:21 UTC
To: cluster-devel.redhat.com
This patch extracts a section of code from the lost+found functions
and makes a new make_sure_lf_exists function that can be called
from more places.
---
gfs2/fsck/lost_n_found.c | 130 +++++++++++++++++++++++------------------------
gfs2/fsck/lost_n_found.h | 1 +
2 files changed, 66 insertions(+), 65 deletions(-)
diff --git a/gfs2/fsck/lost_n_found.c b/gfs2/fsck/lost_n_found.c
index 1fb5076..f379646 100644
--- a/gfs2/fsck/lost_n_found.c
+++ b/gfs2/fsck/lost_n_found.c
@@ -88,6 +88,70 @@ static void add_dotdot(struct gfs2_inode *ip)
}
}
+void make_sure_lf_exists(struct gfs2_inode *ip)
+{
+ uint8_t q;
+ struct dir_info *di;
+ struct gfs2_sbd *sdp = ip->i_sbd;
+ uint32_t mode;
+
+ if (lf_dip)
+ return;
+
+ log_info( _("Locating/Creating lost+found directory\n"));
+
+ /* if this is gfs1, we have to trick createi into using
+ no_formal_ino = no_addr, so we set next_inum to the
+ free block we're about to allocate. */
+ if (sdp->gfs1)
+ sdp->md.next_inum = find_free_blk(sdp);
+ mode = (sdp->gfs1 ? DT2IF(GFS_FILE_DIR) : S_IFDIR) | 0700;
+ if (sdp->gfs1)
+ lf_dip = gfs_createi(sdp->md.rooti, "lost+found", mode, 0);
+ else
+ lf_dip = createi(sdp->md.rooti, "lost+found",
+ S_IFDIR | 0700, 0);
+ if (lf_dip == NULL) {
+ log_crit(_("Error creating lost+found: %s\n"),
+ strerror(errno));
+ exit(FSCK_ERROR);
+ }
+
+ /* createi will have incremented the di_nlink link count for the root
+ directory. We must set the nlink value in the hash table to keep
+ them in sync so that pass4 can detect and fix any descrepancies. */
+ set_di_nlink(sdp->md.rooti);
+
+ q = block_type(lf_dip->i_di.di_num.no_addr);
+ if (q != gfs2_inode_dir) {
+ /* This is a new lost+found directory, so set its block type
+ and increment link counts for the directories */
+ /* FIXME: i'd feel better about this if fs_mkdir returned
+ whether it created a new directory or just found an old one,
+ and we used that instead of the block_type to run this */
+ fsck_blockmap_set(ip, lf_dip->i_di.di_num.no_addr,
+ _("lost+found dinode"), gfs2_inode_dir);
+ dirtree_insert(lf_dip->i_di.di_num);
+ /* root inode links to lost+found */
+ incr_link_count(sdp->md.rooti->i_di.di_num, lf_dip, _("root"));
+ /* lost+found link for '.' from itself */
+ incr_link_count(lf_dip->i_di.di_num, lf_dip, "\".\"");
+ /* lost+found link for '..' back to root */
+ incr_link_count(lf_dip->i_di.di_num, sdp->md.rooti, "\"..\"");
+ if (sdp->gfs1)
+ lf_dip->i_di.__pad1 = GFS_FILE_DIR;
+ }
+ log_info( _("lost+found directory is dinode %lld (0x%llx)\n"),
+ (unsigned long long)lf_dip->i_di.di_num.no_addr,
+ (unsigned long long)lf_dip->i_di.di_num.no_addr);
+ di = dirtree_find(lf_dip->i_di.di_num.no_addr);
+ if (di) {
+ log_info( _("Marking lost+found inode connected\n"));
+ di->checked = 1;
+ di = NULL;
+ }
+}
+
/* add_inode_to_lf - Add dir entry to lost+found for the inode
* @ip: inode to add to lost + found
*
@@ -102,74 +166,10 @@ int add_inode_to_lf(struct gfs2_inode *ip){
__be32 inode_type;
uint64_t lf_blocks;
struct gfs2_sbd *sdp = ip->i_sbd;
- struct dir_info *di;
int err = 0;
uint32_t mode;
- if (!lf_dip) {
- uint8_t q;
-
- log_info( _("Locating/Creating lost+found directory\n"));
-
- /* if this is gfs1, we have to trick createi into using
- no_formal_ino = no_addr, so we set next_inum to the
- free block we're about to allocate. */
- if (sdp->gfs1)
- sdp->md.next_inum = find_free_blk(sdp);
- mode = (sdp->gfs1 ? DT2IF(GFS_FILE_DIR) : S_IFDIR) | 0700;
- if (sdp->gfs1)
- lf_dip = gfs_createi(sdp->md.rooti, "lost+found",
- mode, 0);
- else
- lf_dip = createi(sdp->md.rooti, "lost+found",
- S_IFDIR | 0700, 0);
- if (lf_dip == NULL) {
- log_crit(_("Error creating lost+found: %s\n"),
- strerror(errno));
- exit(FSCK_ERROR);
- }
-
- /* createi will have incremented the di_nlink link count for
- the root directory. We must set the nlink value
- in the hash table to keep them in sync so that pass4 can
- detect and fix any descrepancies. */
- set_di_nlink(sdp->md.rooti);
-
- q = block_type(lf_dip->i_di.di_num.no_addr);
- if (q != gfs2_inode_dir) {
- /* This is a new lost+found directory, so set its
- * block type and increment link counts for
- * the directories */
- /* FIXME: i'd feel better about this if
- * fs_mkdir returned whether it created a new
- * directory or just found an old one, and we
- * used that instead of the block_type to run
- * this */
- fsck_blockmap_set(ip, lf_dip->i_di.di_num.no_addr,
- _("lost+found dinode"),
- gfs2_inode_dir);
- /* root inode links to lost+found */
- incr_link_count(sdp->md.rooti->i_di.di_num,
- lf_dip, _("root"));
- /* lost+found link for '.' from itself */
- incr_link_count(lf_dip->i_di.di_num,
- lf_dip, "\".\"");
- /* lost+found link for '..' back to root */
- incr_link_count(lf_dip->i_di.di_num, sdp->md.rooti,
- "\"..\"");
- if (sdp->gfs1)
- lf_dip->i_di.__pad1 = GFS_FILE_DIR;
- }
- log_info( _("lost+found directory is dinode %lld (0x%llx)\n"),
- (unsigned long long)lf_dip->i_di.di_num.no_addr,
- (unsigned long long)lf_dip->i_di.di_num.no_addr);
- di = dirtree_find(lf_dip->i_di.di_num.no_addr);
- if (di) {
- log_info( _("Marking lost+found inode connected\n"));
- di->checked = 1;
- di = NULL;
- }
- }
+ make_sure_lf_exists(ip);
if (ip->i_di.di_num.no_addr == lf_dip->i_di.di_num.no_addr) {
log_err( _("Trying to add lost+found to itself...skipping"));
return 0;
diff --git a/gfs2/fsck/lost_n_found.h b/gfs2/fsck/lost_n_found.h
index f28a1d9..2b76cc2 100644
--- a/gfs2/fsck/lost_n_found.h
+++ b/gfs2/fsck/lost_n_found.h
@@ -4,5 +4,6 @@
#include "libgfs2.h"
int add_inode_to_lf(struct gfs2_inode *ip);
+void make_sure_lf_exists(struct gfs2_inode *ip);
#endif /* __LOST_N_FOUND_H__ */
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 06/47] fsck.gfs2: Check for formal inode mismatch when adding to lost+found
From: Bob Peterson @ 2013-05-14 16:21 UTC
To: cluster-devel.redhat.com
This patch adds a check to the code that adds inodes to lost+found
so that dinodes with formal inode mismatches are logged, but not added.
rhbz#902920
---
gfs2/fsck/lost_n_found.c | 44 ++++++++++++++++++++++++++++----------------
1 file changed, 28 insertions(+), 16 deletions(-)
diff --git a/gfs2/fsck/lost_n_found.c b/gfs2/fsck/lost_n_found.c
index f379646..3d9acb5 100644
--- a/gfs2/fsck/lost_n_found.c
+++ b/gfs2/fsck/lost_n_found.c
@@ -40,24 +40,36 @@ static void add_dotdot(struct gfs2_inode *ip)
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)di->dotdot_parent.no_addr,
(unsigned long long)di->dotdot_parent.no_addr);
- decr_link_count(di->dotdot_parent.no_addr,
- ip->i_di.di_num.no_addr,
- _(".. unlinked, moving to lost+found"));
dip = fsck_load_inode(sdp, di->dotdot_parent.no_addr);
- if (dip->i_di.di_nlink > 0) {
- dip->i_di.di_nlink--;
- set_di_nlink(dip); /* keep inode tree in sync */
- log_debug(_("Decrementing its links to %d\n"),
- dip->i_di.di_nlink);
- bmodified(dip->i_bh);
- } else if (!dip->i_di.di_nlink) {
- log_debug(_("Its link count is zero.\n"));
+ if (dip->i_di.di_num.no_formal_ino ==
+ di->dotdot_parent.no_formal_ino) {
+ decr_link_count(di->dotdot_parent.no_addr,
+ ip->i_di.di_num.no_addr,
+ _(".. unlinked, moving to lost+found"));
+ if (dip->i_di.di_nlink > 0) {
+ dip->i_di.di_nlink--;
+ set_di_nlink(dip); /* keep inode tree in sync */
+ log_debug(_("Decrementing its links to %d\n"),
+ dip->i_di.di_nlink);
+ bmodified(dip->i_bh);
+ } else if (!dip->i_di.di_nlink) {
+ log_debug(_("Its link count is zero.\n"));
+ } else {
+ log_debug(_("Its link count is %d! Changing "
+ "it to 0.\n"), dip->i_di.di_nlink);
+ dip->i_di.di_nlink = 0;
+ set_di_nlink(dip); /* keep inode tree in sync */
+ bmodified(dip->i_bh);
+ }
} else {
- log_debug(_("Its link count is %d! Changing "
- "it to 0.\n"), dip->i_di.di_nlink);
- dip->i_di.di_nlink = 0;
- set_di_nlink(dip); /* keep inode tree in sync */
- bmodified(dip->i_bh);
+ log_debug(_("Directory (0x%llx)'s link to parent "
+ "(0x%llx) had a formal inode discrepancy: "
+ "was 0x%llx, expected 0x%llx\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)di->dotdot_parent.no_addr,
+ di->dotdot_parent.no_formal_ino,
+ dip->i_di.di_num.no_formal_ino);
+ log_debug(_("The parent directory was not changed.\n"));
}
fsck_inode_put(&dip);
di = NULL;
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 07/47] fsck.gfs2: shorten some debug messages in lost+found
From: Bob Peterson @ 2013-05-14 16:21 UTC
To: cluster-devel.redhat.com
This patch changes the debug output of lost+found such that it
only prints the block number in hexadecimal. This shortens the output
and makes debug output easier to read.
rhbz#902920
---
gfs2/fsck/lost_n_found.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/gfs2/fsck/lost_n_found.c b/gfs2/fsck/lost_n_found.c
index 3d9acb5..3226ac9 100644
--- a/gfs2/fsck/lost_n_found.c
+++ b/gfs2/fsck/lost_n_found.c
@@ -34,11 +34,9 @@ static void add_dotdot(struct gfs2_inode *ip)
if (di && valid_block(sdp, di->dotdot_parent.no_addr)) {
struct gfs2_inode *dip;
- log_debug(_("Directory %lld (0x%llx) already had a "
- "\"..\" link to %lld (0x%llx).\n"),
+ log_debug(_("Directory (0x%llx) already had a "
+ "\"..\" link to (0x%llx).\n"),
(unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)di->dotdot_parent.no_addr,
(unsigned long long)di->dotdot_parent.no_addr);
dip = fsck_load_inode(sdp, di->dotdot_parent.no_addr);
if (dip->i_di.di_num.no_formal_ino ==
@@ -76,15 +74,13 @@ static void add_dotdot(struct gfs2_inode *ip)
} else {
if (di)
log_debug(_("Couldn't find a valid \"..\" entry "
- "for orphan directory %lld (0x%llx): "
+ "for orphan directory (0x%llx): "
"'..' = 0x%llx\n"),
(unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)di->dotdot_parent.no_addr);
else
- log_debug(_("Couldn't find directory %lld (0x%llx) "
+ log_debug(_("Couldn't find directory (0x%llx) "
"in directory tree.\n"),
- (unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
}
if (gfs2_dirent_del(ip, "..", 2))
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 08/47] fsck.gfs2: Move basic directory entry checks to separate function
From: Bob Peterson @ 2013-05-14 16:21 UTC
To: cluster-devel.redhat.com
This patch moves a large chunk of code out of the bloated function
check_dentry and into a new function, basic_dentry_checks. The moved
code performs basic directory entry checks. The code is basically
unchanged, except that the clear_eattrs metawalk functions are now
global.
rhbz#902920
---
gfs2/fsck/pass2.c | 160 ++++++++++++++++++++++++++++++++----------------------
1 file changed, 94 insertions(+), 66 deletions(-)
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index 3053711..8d66ff4 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -19,6 +19,13 @@
#define MAX_FILENAME 256
+struct metawalk_fxns clear_eattrs = {
+ .check_eattr_indir = delete_eattr_indir,
+ .check_eattr_leaf = delete_eattr_leaf,
+ .check_eattr_entry = clear_eattr_entry,
+ .check_eattr_extentry = clear_eattr_extentry,
+};
+
/* Set children's parent inode in dir_info structure - ext2 does not set
* dotdot inode here, but instead in pass3 - should we? */
static int set_parent_dir(struct gfs2_sbd *sdp, struct gfs2_inum child,
@@ -288,51 +295,35 @@ static int bad_formal_ino(struct gfs2_inode *ip, struct gfs2_dirent *dent,
return 0;
}
-/* FIXME: should maybe refactor this a bit - but need to deal with
- * FIXMEs internally first */
-static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
- struct gfs2_dirent *prev_de,
- struct gfs2_buffer_head *bh, char *filename,
- uint32_t *count, void *priv)
+/* basic_dentry_checks - fundamental checks for directory entries
+ *
+ * @ip: pointer to the incode inode structure
+ * @entry: pointer to the inum info
+ * @tmp_name: user-friendly file name
+ * @count: pointer to the entry count
+ * @de: pointer to the directory entry
+ *
+ * Returns: 1 means corruption, nuke the dentry, 0 means checks pass
+ */
+static int basic_dentry_checks(struct gfs2_inode *ip, struct gfs2_dirent *dent,
+ struct gfs2_inum *entry, const char *tmp_name,
+ uint32_t *count, struct gfs2_dirent *de,
+ struct dir_status *ds, uint8_t *q,
+ struct gfs2_buffer_head *bh)
{
struct gfs2_sbd *sdp = ip->i_sbd;
- uint8_t q = 0;
- char tmp_name[MAX_FILENAME];
- struct gfs2_inum entry;
- struct dir_status *ds = (struct dir_status *) priv;
- int error;
- struct gfs2_inode *entry_ip = NULL;
- struct metawalk_fxns clear_eattrs = {0};
- struct gfs2_dirent dentry, *de;
uint32_t calculated_hash;
+ struct gfs2_inode *entry_ip = NULL;
+ int error;
- memset(&dentry, 0, sizeof(struct gfs2_dirent));
- gfs2_dirent_in(&dentry, (char *)dent);
- de = &dentry;
-
- clear_eattrs.check_eattr_indir = delete_eattr_indir;
- clear_eattrs.check_eattr_leaf = delete_eattr_leaf;
- clear_eattrs.check_eattr_entry = clear_eattr_entry;
- clear_eattrs.check_eattr_extentry = clear_eattr_extentry;
-
- entry.no_addr = de->de_inum.no_addr;
- entry.no_formal_ino = de->de_inum.no_formal_ino;
-
- /* Start of checks */
- memset(tmp_name, 0, MAX_FILENAME);
- if (de->de_name_len < MAX_FILENAME)
- strncpy(tmp_name, filename, de->de_name_len);
- else
- strncpy(tmp_name, filename, MAX_FILENAME - 1);
-
- if (!valid_block(ip->i_sbd, entry.no_addr)) {
+ if (!valid_block(ip->i_sbd, entry->no_addr)) {
log_err( _("Block # referenced by directory entry %s in inode "
"%lld (0x%llx) is invalid\n"),
tmp_name, (unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
if (query( _("Clear directory entry to out of range block? "
"(y/n) "))) {
- goto nuke_dentry;
+ return 1;
} else {
log_err( _("Directory entry to out of range block remains\n"));
(*count)++;
@@ -349,7 +340,7 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
de->de_rec_len, de->de_name_len);
if (!query( _("Clear the directory entry? (y/n) "))) {
log_err( _("Directory entry not fixed.\n"));
- goto dentry_is_valid;
+ return 0;
}
fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
_("corrupt directory entry"),
@@ -371,7 +362,7 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
tmp_name)) {
log_err( _("Directory entry hash for %s not "
"fixed.\n"), tmp_name);
- goto dentry_is_valid;
+ return 0;
}
de->de_hash = calculated_hash;
gfs2_dirent_out(de, (char *)dent);
@@ -380,7 +371,7 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
tmp_name);
}
- q = block_type(entry.no_addr);
+ *q = block_type(entry->no_addr);
/* Get the status of the directory inode */
/**
* 1. Blocks marked "invalid" were invalidated due to duplicate
@@ -394,25 +385,25 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
* 2. Blocks marked "bad" need to have their entire
* metadata tree deleted.
*/
- if (q == gfs2_inode_invalid || q == gfs2_bad_block) {
+ if (*q == gfs2_inode_invalid || *q == gfs2_bad_block) {
/* This entry's inode has bad blocks in it */
/* Handle bad blocks */
log_err( _("Found directory entry '%s' pointing to invalid "
"block %lld (0x%llx)\n"), tmp_name,
- (unsigned long long)entry.no_addr,
- (unsigned long long)entry.no_addr);
+ (unsigned long long)entry->no_addr,
+ (unsigned long long)entry->no_addr);
if (!query( _("Delete inode containing bad blocks? (y/n)"))) {
log_warn( _("Entry to inode containing bad blocks remains\n"));
- goto dentry_is_valid;
+ return 0;
}
- if (q == gfs2_bad_block) {
- if (ip->i_di.di_num.no_addr == entry.no_addr)
+ if (*q == gfs2_bad_block) {
+ if (ip->i_di.di_num.no_addr == entry->no_addr)
entry_ip = ip;
else
- entry_ip = fsck_load_inode(sdp, entry.no_addr);
+ entry_ip = fsck_load_inode(sdp, entry->no_addr);
if (ip->i_di.di_eattr) {
check_inode_eattr(entry_ip,
&pass2_fxns_delete);
@@ -421,29 +412,29 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
if (entry_ip != ip)
fsck_inode_put(&entry_ip);
}
- fsck_blockmap_set(ip, entry.no_addr,
+ fsck_blockmap_set(ip, entry->no_addr,
_("bad directory entry"), gfs2_block_free);
log_err( _("Inode %lld (0x%llx) was deleted.\n"),
- (unsigned long long)entry.no_addr,
- (unsigned long long)entry.no_addr);
- goto nuke_dentry;
+ (unsigned long long)entry->no_addr,
+ (unsigned long long)entry->no_addr);
+ return 1;
}
- if (q < gfs2_inode_dir || q > gfs2_inode_sock) {
+ if (*q < gfs2_inode_dir || *q > gfs2_inode_sock) {
log_err( _("Directory entry '%s' referencing inode %llu "
"(0x%llx) in dir inode %llu (0x%llx) block type "
"%d: %s.\n"), tmp_name,
- (unsigned long long)entry.no_addr,
- (unsigned long long)entry.no_addr,
+ (unsigned long long)entry->no_addr,
+ (unsigned long long)entry->no_addr,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr,
- q, q == gfs2_inode_invalid ?
+ *q, *q == gfs2_inode_invalid ?
_("was previously marked invalid") :
_("was deleted or is not an inode"));
if (!query( _("Clear directory entry to non-inode block? "
"(y/n) "))) {
log_err( _("Directory entry to non-inode block remains\n"));
- goto dentry_is_valid;
+ return 0;
}
/* Don't decrement the link here: Here in pass2, we increment
@@ -462,20 +453,20 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
count on "delete_block_if_notdup" knowing whether it's
really a duplicate block if we never traversed the metadata
tree for the invalid inode. */
- goto nuke_dentry;
+ return 1;
}
- error = check_file_type(de->de_type, q, sdp->gfs1);
+ error = check_file_type(de->de_type, *q, sdp->gfs1);
if (error < 0) {
log_err( _("Error: directory entry type is "
"incompatible with block type@block %lld "
"(0x%llx) in directory inode %llu (0x%llx).\n"),
- (unsigned long long)entry.no_addr,
- (unsigned long long)entry.no_addr,
+ (unsigned long long)entry->no_addr,
+ (unsigned long long)entry->no_addr,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
log_err( _("Directory entry type is %d, block type is %d.\n"),
- de->de_type, q);
+ de->de_type, *q);
stack;
return -1;
}
@@ -483,22 +474,59 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
log_err( _("Type '%s' in dir entry (%s, %llu/0x%llx) conflicts"
" with type '%s' in dinode. (Dir entry is stale.)\n"),
de_type_string(de->de_type), tmp_name,
- (unsigned long long)entry.no_addr,
- (unsigned long long)entry.no_addr,
- block_type_string(q));
+ (unsigned long long)entry->no_addr,
+ (unsigned long long)entry->no_addr,
+ block_type_string(*q));
if (!query( _("Clear stale directory entry? (y/n) "))) {
log_err( _("Stale directory entry remains\n"));
- goto dentry_is_valid;
+ return 0;
}
- if (ip->i_di.di_num.no_addr == entry.no_addr)
+ if (ip->i_di.di_num.no_addr == entry->no_addr)
entry_ip = ip;
else
- entry_ip = fsck_load_inode(sdp, entry.no_addr);
+ entry_ip = fsck_load_inode(sdp, entry->no_addr);
check_inode_eattr(entry_ip, &clear_eattrs);
if (entry_ip != ip)
fsck_inode_put(&entry_ip);
- goto nuke_dentry;
+ return 1;
}
+ return 0;
+}
+
+/* FIXME: should maybe refactor this a bit - but need to deal with
+ * FIXMEs internally first */
+static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
+ struct gfs2_dirent *prev_de,
+ struct gfs2_buffer_head *bh, char *filename,
+ uint32_t *count, void *priv)
+{
+ struct gfs2_sbd *sdp = ip->i_sbd;
+ uint8_t q = 0;
+ char tmp_name[MAX_FILENAME];
+ struct gfs2_inum entry;
+ struct dir_status *ds = (struct dir_status *) priv;
+ int error;
+ struct gfs2_inode *entry_ip = NULL;
+ struct gfs2_dirent dentry, *de;
+
+ memset(&dentry, 0, sizeof(struct gfs2_dirent));
+ gfs2_dirent_in(&dentry, (char *)dent);
+ de = &dentry;
+
+ entry.no_addr = de->de_inum.no_addr;
+ entry.no_formal_ino = de->de_inum.no_formal_ino;
+
+ /* Start of checks */
+ memset(tmp_name, 0, MAX_FILENAME);
+ if (de->de_name_len < MAX_FILENAME)
+ strncpy(tmp_name, filename, de->de_name_len);
+ else
+ strncpy(tmp_name, filename, MAX_FILENAME - 1);
+
+ error = basic_dentry_checks(ip, dent, &entry, tmp_name, count, de,
+ ds, &q, bh);
+ if (error)
+ goto nuke_dentry;
if (!strcmp(".", tmp_name)) {
log_debug( _("Found . dentry in directory %lld (0x%llx)\n"),
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 09/47] fsck.gfs2: Add formal inode check to basic dirent checks
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch adds a check to the basic directory entry checks which
compares the formal inode number of the directory entry to the
formal inode number in the inode tree that was set up by pass1.
If the numbers don't match, this directory entry is corrupt.
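The comparison described above can be sketched in isolation as follows.
This is only an illustration, not the fsck.gfs2 code: struct inum_sketch
and formal_ino_mismatch() are hypothetical names, and the pass1 inode tree
lookup is reduced to a pointer that is NULL when pass1 recorded nothing
for the block.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode number pair (formal number + block). */
struct inum_sketch {
	uint64_t no_formal_ino;
	uint64_t no_addr;
};

/*
 * Return 1 if the directory entry should be treated as corrupt because
 * its formal inode number disagrees with what pass1 recorded for the
 * same block, 0 otherwise. A NULL pass1 record means "nothing to
 * compare against", so the entry is left alone.
 */
static int formal_ino_mismatch(const struct inum_sketch *dirent_inum,
			       const struct inum_sketch *pass1_inum)
{
	if (pass1_inum == NULL)
		return 0;
	return pass1_inum->no_formal_ino != dirent_inum->no_formal_ino;
}
```

Note that a missing pass1 record is not reported as a mismatch here, which
mirrors the "ii &&" guard in the patch below.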
rhbz#902920
---
gfs2/fsck/pass2.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index 8d66ff4..7c0c104 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -315,6 +315,7 @@ static int basic_dentry_checks(struct gfs2_inode *ip, struct gfs2_dirent *dent,
uint32_t calculated_hash;
struct gfs2_inode *entry_ip = NULL;
int error;
+ struct inode_info *ii;
if (!valid_block(ip->i_sbd, entry->no_addr)) {
log_err( _("Block # referenced by directory entry %s in inode "
@@ -490,6 +491,25 @@ static int basic_dentry_checks(struct gfs2_inode *ip, struct gfs2_dirent *dent,
fsck_inode_put(&entry_ip);
return 1;
}
+ /* We need to verify the formal inode number matches. If it doesn't,
+ it needs to be deleted. */
+ ii = inodetree_find(entry->no_addr);
+ if (ii && ii->di_num.no_formal_ino != entry->no_formal_ino) {
+ log_err( _("Directory entry '%s' pointing to block %llu "
+ "(0x%llx) in directory %llu (0x%llx) has the "
+ "wrong 'formal' inode number.\n"), tmp_name,
+ (unsigned long long)entry->no_addr,
+ (unsigned long long)entry->no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr);
+ log_err( _("The directory entry has %llu (0x%llx) but the "
+ "inode has %llu (0x%llx)\n"),
+ (unsigned long long)entry->no_formal_ino,
+ (unsigned long long)entry->no_formal_ino,
+ (unsigned long long)ii->di_num.no_formal_ino,
+ (unsigned long long)ii->di_num.no_formal_ino);
+ return 1;
+ }
return 0;
}
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 10/47] fsck.gfs2: Add new function to check dir hash tables
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
It's very important that fsck.gfs2 checks for a valid directory hash
table before operating further on the directory. Before this patch,
we were doing some incomplete testing, after we had already operated
on the directory, with function check_num_ptrs. This patch replaces
that scheme with a new one that preemptively checks the hash table
with a new function called check_hash_tbl.
We've got to make sure the hash table is sane. The number of pointers
to each leaf must be a proper power of 2. We can't just have 3 pointers
to a leaf. The number of pointers must correspond to the proper leaf
depth, and they must all fall on power-of-two boundaries. For example,
suppose we have a directory that's had one of its middle leaf blocks
split several times. The split history might look something like this:
                       leaf
split  lindex  length  depth  pointers  leaf split
-----  ------  ------  -----  --------  -------------
0:     0x00    0x100   0      00 - ff   <--split to 1:
1:     0x00    0x80    1      00 - 7f   <--split to 2:
       0x80    0x80    1      80 - ff
2:     0x00    0x40    2      00 - 3f   <--split to 3:
       0x40    0x40    2      40 - 7f
       0x80    0x80    1      80 - ff
3:     0x00    0x20    3      00 - 1f
       0x20    0x20    3      20 - 3f   <--split to 4
       0x40    0x40    2      40 - 7f
       0x80    0x80    1      80 - ff
4:     0x00    0x20    3      00 - 1f
       0x20    0x10    4      20 - 2f
       0x30    0x10    4      30 - 3f   <--split to 5
       0x40    0x40    2      40 - 7f
       0x80    0x80    1      80 - ff
5:     0x00    0x20    3      00 - 1f
       0x20    0x10    4      20 - 2f
       0x30    0x8     5      30 - 37   <--split to 6
       0x38    0x8     5      38 - 3f
       0x40    0x40    2      40 - 7f
       0x80    0x80    1      80 - ff
6:     0x00    0x20    3      00 - 1f
       0x20    0x10    4      20 - 2f
       0x30    0x4     6      30 - 33
       0x34    0x4     6      34 - 37
       0x38    0x8     5      38 - 3f
       0x40    0x40    2      40 - 7f
       0x80    0x80    1      80 - ff
You can see from this example that it's impossible for a leaf block to have
an lf_depth of 5 and lindex 0x34. As shown in "5:" above, a leaf depth of 5
can only fall at offset 0x30 or 0x38. If it somehow falls elsewhere, say
0x34, the proper depth should be 6, and there should only be 4 pointers,
as per split "6:" above. The leaf block pointers all need to fall properly
on these boundaries; otherwise the kernel code's calculations will land it
on the wrong leaf block while it's searching, and the result will be files
you can see with ls but can't open, delete, or use.
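The boundary rule in the split history above can be written down as a
small check. This is a minimal sketch under stated assumptions, not the
check_hash_tbl code from this patch: expected_ptrs() and leaf_start_ok()
are hypothetical helper names, and the depths are assumed to be small
enough that the shift stays below 32 bits.

```c
#include <stdint.h>

/*
 * Number of hash table pointers a leaf of depth lf_depth should own in
 * a directory whose hash table depth is di_depth: 2^(di_depth - lf_depth).
 * Callers must ensure lf_depth <= di_depth and di_depth - lf_depth < 32.
 */
static uint32_t expected_ptrs(unsigned di_depth, unsigned lf_depth)
{
	return (uint32_t)1 << (di_depth - lf_depth);
}

/*
 * Return 1 if a leaf of depth lf_depth may legally start at hash table
 * index lindex, i.e. lindex is a multiple of the leaf's pointer count.
 * Return 0 if the depth is out of range or the start is misaligned.
 */
static int leaf_start_ok(unsigned di_depth, unsigned lf_depth, uint32_t lindex)
{
	if (lf_depth > di_depth || di_depth - lf_depth >= 32)
		return 0;
	return (lindex & (expected_ptrs(di_depth, lf_depth) - 1)) == 0;
}
```

With di_depth 8 as in the table, a depth-5 leaf owns 8 pointers, so
leaf_start_ok(8, 5, 0x30) and leaf_start_ok(8, 5, 0x38) hold while
leaf_start_ok(8, 5, 0x34) does not; at depth 6 the leaf owns 4 pointers
and 0x34 becomes a legal start.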
rhbz#902920
---
gfs2/fsck/metawalk.c | 173 +++++++++++++++--
gfs2/fsck/metawalk.h | 12 +-
gfs2/fsck/pass1.c | 142 --------------
gfs2/fsck/pass2.c | 532 +++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 699 insertions(+), 160 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index a40a6af..e76c35f 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -714,6 +714,24 @@ static int check_leaf_blks(struct gfs2_inode *ip, struct metawalk_fxns *pass)
/* Readahead */
dir_leaf_reada(ip, tbl, hsize);
+ if (pass->check_hash_tbl) {
+ error = pass->check_hash_tbl(ip, tbl, hsize, pass->private);
+ if (error < 0) {
+ free(tbl);
+ posix_fadvise(sdp->device_fd, 0, 0, POSIX_FADV_NORMAL);
+ return error;
+ }
+ /* If hash table changes were made, read it in again. */
+ if (error) {
+ free(tbl);
+ tbl = get_dir_hash(ip);
+ if (tbl == NULL) {
+ perror("get_dir_hash");
+ return -1;
+ }
+ }
+ }
+
/* Find the first valid leaf pointer in range and use it as our "old"
leaf. That way, bad blocks at the beginning will be overwritten
with the first valid leaf. */
@@ -766,21 +784,6 @@ static int check_leaf_blks(struct gfs2_inode *ip, struct metawalk_fxns *pass)
posix_fadvise(sdp->device_fd, 0, 0, POSIX_FADV_NORMAL);
return 0;
}
- /* If the old leaf was a duplicate referenced by a
- previous dinode, we can't check the number of
- pointers because the number of pointers may be for
- that other dinode's reference, not this one. */
- if (pass->check_num_ptrs && !old_was_dup &&
- valid_block(ip->i_sbd, old_leaf)) {
- error = pass->check_num_ptrs(ip, old_leaf,
- &ref_count,
- &lindex,
- &oldleaf);
- if (error) {
- free(tbl);
- return error;
- }
- }
error = check_leaf(ip, lindex, pass, &ref_count,
&leaf_no, old_leaf, &bad_leaf,
first_ok_leaf, &leaf, &oldleaf);
@@ -1687,3 +1690,143 @@ void reprocess_inode(struct gfs2_inode *ip, const char *desc)
log_err( _("Error %d reprocessing the %s metadata tree.\n"),
error, desc);
}
+
+/*
+ * write_new_leaf - allocate and write a new leaf to cover a gap in hash table
+ * @dip: the directory inode
+ * @start_lindex: where in the hash table to start writing
+ * @num_copies: number of copies of the pointer to write into hash table
+ * @before_or_after: desc. of whether this is being added before/after/etc.
+ * @bn: pointer to return the newly allocated leaf's block number
+ */
+int write_new_leaf(struct gfs2_inode *dip, int start_lindex, int num_copies,
+ const char *before_or_after, uint64_t *bn)
+{
+ struct gfs2_buffer_head *nbh;
+ struct gfs2_leaf *leaf;
+ struct gfs2_dirent *dent;
+ int count, i;
+ int factor = 0, pad_size;
+ uint64_t *cpyptr;
+ char *padbuf;
+ int divisor = num_copies;
+ int end_lindex = start_lindex + num_copies;
+
+ padbuf = malloc(num_copies * sizeof(uint64_t));
+ /* calculate the depth needed for the new leaf */
+ while (divisor > 1) {
+ factor++;
+ divisor /= 2;
+ }
+ /* Make sure the number of copies is properly a factor of 2 */
+ if ((1 << factor) != num_copies) {
+ log_err(_("Program error: num_copies not a factor of 2.\n"));
+ log_err(_("num_copies=%d, dinode = %lld (0x%llx)\n"),
+ num_copies,
+ (unsigned long long)dip->i_di.di_num.no_addr,
+ (unsigned long long)dip->i_di.di_num.no_addr);
+ log_err(_("lindex = %d (0x%x)\n"), start_lindex, start_lindex);
+ stack;
+ return -1;
+ }
+
+ /* allocate and write out a new leaf block */
+ *bn = meta_alloc(dip);
+ fsck_blockmap_set(dip, *bn, _("directory leaf"), gfs2_leaf_blk);
+ log_err(_("A new directory leaf was allocated at block %lld "
+ "(0x%llx) to fill the %d (0x%x) pointer gap %s the existing "
+ "pointer at index %d (0x%x).\n"), (unsigned long long)*bn,
+ (unsigned long long)*bn, num_copies, num_copies,
+ before_or_after, start_lindex, start_lindex);
+ dip->i_di.di_blocks++;
+ bmodified(dip->i_bh);
+ nbh = bget(dip->i_sbd, *bn);
+ memset(nbh->b_data, 0, dip->i_sbd->bsize);
+ leaf = (struct gfs2_leaf *)nbh->b_data;
+ leaf->lf_header.mh_magic = cpu_to_be32(GFS2_MAGIC);
+ leaf->lf_header.mh_type = cpu_to_be32(GFS2_METATYPE_LF);
+ leaf->lf_header.mh_format = cpu_to_be32(GFS2_FORMAT_LF);
+ leaf->lf_depth = cpu_to_be16(dip->i_di.di_depth - factor);
+
+ /* initialize the first dirent on the new leaf block */
+ dent = (struct gfs2_dirent *)(nbh->b_data + sizeof(struct gfs2_leaf));
+ dent->de_rec_len = cpu_to_be16(dip->i_sbd->bsize -
+ sizeof(struct gfs2_leaf));
+ bmodified(nbh);
+ brelse(nbh);
+
+ /* pad the hash table with the new leaf block */
+ cpyptr = (uint64_t *)padbuf;
+ for (i = start_lindex; i < end_lindex; i++) {
+ *cpyptr = cpu_to_be64(*bn);
+ cpyptr++;
+ }
+ pad_size = num_copies * sizeof(uint64_t);
+ log_err(_("Writing to the hash table of directory %lld "
+ "(0x%llx) at index: 0x%x for 0x%lx pointers.\n"),
+ (unsigned long long)dip->i_di.di_num.no_addr,
+ (unsigned long long)dip->i_di.di_num.no_addr,
+ start_lindex, pad_size / sizeof(uint64_t));
+ if (dip->i_sbd->gfs1)
+ count = gfs1_writei(dip, padbuf, start_lindex *
+ sizeof(uint64_t), pad_size);
+ else
+ count = gfs2_writei(dip, padbuf, start_lindex *
+ sizeof(uint64_t), pad_size);
+ free(padbuf);
+ if (count != pad_size) {
+ log_err( _("Error: bad write while fixing directory leaf "
+ "pointers.\n"));
+ return -1;
+ }
+ return 0;
+}
+
+/* repair_leaf - Warn the user of an error and ask permission to fix it
+ * Process a bad leaf pointer and ask to repair the first time.
+ * The repair process involves extending the previous leaf's entries
+ * so that they replace the bad ones. We have to hack up the old
+ * leaf a bit, but it's better than deleting the whole directory,
+ * which is what used to happen before. */
+int repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no, int lindex,
+ int ref_count, const char *msg)
+{
+ int new_leaf_blks = 0, error, refs;
+ uint64_t bn = 0;
+
+ log_err( _("Directory Inode %llu (0x%llx) points to leaf %llu"
+ " (0x%llx) %s.\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)*leaf_no,
+ (unsigned long long)*leaf_no, msg);
+ if (!query( _("Attempt to patch around it? (y/n) "))) {
+ log_err( _("Bad leaf left in place.\n"));
+ goto out;
+ }
+ /* We can only write leafs in quantities that are factors of
+ two, since leaves are doubled, not added sequentially.
+ So if we have a hole that's not a factor of 2, we have to
+ break it down into separate leaf blocks that are. */
+ while (ref_count) {
+ refs = 1;
+ while (refs <= ref_count) {
+ if (refs * 2 > ref_count)
+ break;
+ refs *= 2;
+ }
+ error = write_new_leaf(ip, lindex, refs, _("replacing"), &bn);
+ if (error)
+ return error;
+
+ new_leaf_blks++;
+ lindex += refs;
+ ref_count -= refs;
+ }
+ log_err( _("Directory Inode %llu (0x%llx) repaired.\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr);
+out:
+ *leaf_no = bn;
+ return new_leaf_blks;
+}
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index e114427..c43baf0 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -39,6 +39,11 @@ extern struct gfs2_inode *fsck_system_inode(struct gfs2_sbd *sdp,
uint64_t block);
extern int find_remove_dup(struct gfs2_inode *ip, uint64_t block,
const char *btype);
+extern int write_new_leaf(struct gfs2_inode *dip, int start_lindex,
+ int num_copies, const char *before_or_after,
+ uint64_t *bn);
+extern int repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no, int lindex,
+ int ref_count, const char *msg);
extern int free_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
const char *btype);
@@ -95,9 +100,10 @@ struct metawalk_fxns {
int (*finish_eattr_indir) (struct gfs2_inode *ip, int leaf_pointers,
int leaf_pointer_errors, void *private);
void (*big_file_msg) (struct gfs2_inode *ip, uint64_t blks_checked);
- int (*check_num_ptrs) (struct gfs2_inode *ip, uint64_t leafno,
- int *ref_count, int *lindex,
- struct gfs2_leaf *leaf);
+ int (*check_hash_tbl) (struct gfs2_inode *ip, uint64_t *tbl,
+ unsigned hsize, void *private);
+ int (*repair_leaf) (struct gfs2_inode *ip, uint64_t *leaf_no,
+ int lindex, int ref_count, const char *msg);
};
#endif /* _METAWALK_H */
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 9a34e97..cc69e84 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -60,8 +60,6 @@ static int check_extended_leaf_eattr(struct gfs2_inode *ip, uint64_t *data_ptr,
struct gfs2_ea_header *ea_hdr,
struct gfs2_ea_header *ea_hdr_prev,
void *private);
-static int check_num_ptrs(struct gfs2_inode *ip, uint64_t leafno,
- int *ref_count, int *lindex, struct gfs2_leaf *leaf);
static int finish_eattr_indir(struct gfs2_inode *ip, int leaf_pointers,
int leaf_pointer_errors, void *private);
static int invalidate_metadata(struct gfs2_inode *ip, uint64_t block,
@@ -92,7 +90,6 @@ struct metawalk_fxns pass1_fxns = {
.check_eattr_extentry = check_extended_leaf_eattr,
.finish_eattr_indir = finish_eattr_indir,
.big_file_msg = big_file_comfort,
- .check_num_ptrs = check_num_ptrs,
};
struct metawalk_fxns undo_fxns = {
@@ -204,145 +201,6 @@ struct metawalk_fxns sysdir_fxns = {
.check_dentry = resuscitate_dentry,
};
-/*
- * fix_leaf_pointers - fix a directory dinode that has a number of pointers
- * that is not a multiple of 2.
- * dip - the directory inode having the problem
- * lindex - the index of the leaf right after the problem (need to back up)
- * cur_numleafs - current (incorrect) number of instances of the leaf block
- * correct_numleafs - the correct number instances of the leaf block
- */
-static int fix_leaf_pointers(struct gfs2_inode *dip, int *lindex,
- int cur_numleafs, int correct_numleafs)
-{
- int count;
- char *ptrbuf;
- int start_lindex = *lindex - cur_numleafs; /* start of bad ptrs */
- int tot_num_ptrs = (1 << dip->i_di.di_depth) - start_lindex;
- int bufsize = tot_num_ptrs * sizeof(uint64_t);
- int off_by = cur_numleafs - correct_numleafs;
-
- ptrbuf = malloc(bufsize);
- if (!ptrbuf) {
- log_err( _("Error: Cannot allocate memory to fix the leaf "
- "pointers.\n"));
- return -1;
- }
- /* Read all the pointers, starting with the first bad one */
- count = gfs2_readi(dip, ptrbuf, start_lindex * sizeof(uint64_t),
- bufsize);
- if (count != bufsize) {
- log_err( _("Error: bad read while fixing leaf pointers.\n"));
- free(ptrbuf);
- return -1;
- }
-
- bufsize -= off_by * sizeof(uint64_t); /* We need to write fewer */
- /* Write the same pointers, but offset them so they fit within the
- smaller factor of 2. So if we have 12 pointers, write out only
- the last 8 of them. If we have 7, write the last 4, etc.
- We need to write these starting at the current lindex and adjust
- lindex accordingly. */
- if (dip->i_sbd->gfs1)
- count = gfs1_writei(dip, ptrbuf + (off_by * sizeof(uint64_t)),
- start_lindex * sizeof(uint64_t), bufsize);
- else
- count = gfs2_writei(dip, ptrbuf + (off_by * sizeof(uint64_t)),
- start_lindex * sizeof(uint64_t), bufsize);
- if (count != bufsize) {
- log_err( _("Error: bad read while fixing leaf pointers.\n"));
- free(ptrbuf);
- return -1;
- }
- /* Now zero out the hole left at the end */
- memset(ptrbuf, 0, off_by * sizeof(uint64_t));
- if (dip->i_sbd->gfs1)
- gfs1_writei(dip, ptrbuf, (start_lindex * sizeof(uint64_t)) +
- bufsize, off_by * sizeof(uint64_t));
- else
- gfs2_writei(dip, ptrbuf, (start_lindex * sizeof(uint64_t)) +
- bufsize, off_by * sizeof(uint64_t));
- free(ptrbuf);
- *lindex -= off_by; /* adjust leaf index to account for the change */
- return 0;
-}
-
-/**
- * check_num_ptrs - check a previously processed leaf's pointer count in the
- * hash table.
- *
- * The number of pointers in a directory hash table that point to any given
- * leaf block should always be a factor of two. The difference between the
- * leaf block's depth and the dinode's di_depth gives us the factor.
- * This function makes sure the leaf follows the rules properly.
- *
- * ip - pointer to the in-core inode structure
- * leafno - the leaf number we're operating on
- * ref_count - the number of pointers to this leaf we actually counted.
- * exp_count - the number of pointers to this leaf we expect based on
- * ip depth minus leaf depth.
- * lindex - leaf index number
- * leaf - the leaf structure for the leaf block to check
- */
-static int check_num_ptrs(struct gfs2_inode *ip, uint64_t leafno,
- int *ref_count, int *lindex, struct gfs2_leaf *leaf)
-{
- int factor = 0, divisor = *ref_count, multiple = 1, error = 0;
- struct gfs2_buffer_head *lbh;
- int exp_count;
-
- /* Check to see if the number of pointers we found is a power of 2.
- It needs to be and if it's not we need to fix it.*/
- while (divisor > 1) {
- factor++;
- divisor /= 2;
- multiple = multiple << 1;
- }
- if (*ref_count != multiple) {
- log_err( _("Directory #%llu (0x%llx) has an invalid number of "
- "pointers to leaf #%llu (0x%llx)\n\tFound: %u, "
- "which is not a factor of 2.\n"),
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)leafno,
- (unsigned long long)leafno, *ref_count);
- if (!query( _("Attempt to fix it? (y/n) "))) {
- log_err( _("Directory inode was not fixed.\n"));
- return 1;
- }
- error = fix_leaf_pointers(ip, lindex, *ref_count, multiple);
- if (error)
- return error;
- *ref_count = multiple;
- log_err( _("Directory inode was fixed.\n"));
- }
- /* Check to see if the counted number of leaf pointers is what we
- expect based on the leaf depth. */
- exp_count = (1 << (ip->i_di.di_depth - leaf->lf_depth));
- if (*ref_count != exp_count) {
- log_err( _("Directory #%llu (0x%llx) has an incorrect number "
- "of pointers to leaf #%llu (0x%llx)\n\tFound: "
- "%u, Expected: %u\n"),
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)leafno,
- (unsigned long long)leafno, *ref_count, exp_count);
- if (!query( _("Attempt to fix it? (y/n) "))) {
- log_err( _("Directory leaf was not fixed.\n"));
- return 1;
- }
- lbh = bread(ip->i_sbd, leafno);
- gfs2_leaf_in(leaf, lbh);
- log_err( _("Leaf depth was %d, changed to %d\n"),
- leaf->lf_depth, ip->i_di.di_depth - factor);
- leaf->lf_depth = ip->i_di.di_depth - factor;
- gfs2_leaf_out(leaf, lbh);
- brelse(lbh);
- log_err( _("Directory leaf was fixed.\n"));
- }
- return 0;
-}
-
static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
{
struct block_count *bc = (struct block_count *) private;
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index 7c0c104..a71be4b 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -15,6 +15,7 @@
#include "eattr.h"
#include "metawalk.h"
#include "link.h"
+#include "lost_n_found.h"
#include "inode_hash.h"
#define MAX_FILENAME 256
@@ -295,6 +296,11 @@ static int bad_formal_ino(struct gfs2_inode *ip, struct gfs2_dirent *dent,
return 0;
}
+static int hash_table_index(uint32_t hash, struct gfs2_inode *ip)
+{
+ return hash >> (32 - ip->i_di.di_depth);
+}
+
/* basic_dentry_checks - fundamental checks for directory entries
*
* @ip: pointer to the incode inode structure
@@ -715,6 +721,530 @@ nuke_dentry:
return 1;
}
+/* pad_with_leafblks - pad a hash table with pointers to new leaf blocks
+ *
+ * @ip: pointer to the dinode structure
+ * @tbl: pointer to the hash table in memory
+ * @lindex: index location within the hash table to pad
+ * @len: number of pointers to be padded
+ */
+static void pad_with_leafblks(struct gfs2_inode *ip, uint64_t *tbl,
+ int lindex, int len)
+{
+ int new_len, i;
+ uint32_t proper_start = lindex;
+ uint64_t new_leaf_blk;
+
+ log_err(_("Padding inode %llu (0x%llx) hash table at offset %d (0x%x) "
+ "for %d pointers.\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr, lindex, lindex,
+ len);
+ while (len) {
+ new_len = 1;
+ /* Determine the next factor of 2 down from extras. We can't
+ just write out a leaf block on a power-of-two boundary.
+ We also need to make sure it has a length that will
+ ensure a "proper start" block as well. */
+ while ((new_len << 1) <= len) {
+ /* Translation: If doubling the size of the new leaf
+ will make its start boundary wrong, we have to
+ settle for a smaller length (and iterate more). */
+ proper_start = (lindex & ~((new_len << 1) - 1));
+ if (lindex != proper_start)
+ break;
+ new_len <<= 1;
+ }
+ write_new_leaf(ip, lindex, new_len, "after", &new_leaf_blk);
+ log_err(_("New leaf block was allocated at %llu (0x%llx) for "
+ "index %d (0x%x), length %d\n"),
+ (unsigned long long)new_leaf_blk,
+ (unsigned long long)new_leaf_blk,
+ lindex, lindex, new_len);
+ fsck_blockmap_set(ip, new_leaf_blk, _("pad leaf"),
+ gfs2_leaf_blk);
+ /* Fix the hash table in memory to have the new leaf */
+ for (i = 0; i < new_len; i++)
+ tbl[lindex + i] = cpu_to_be64(new_leaf_blk);
+ len -= new_len;
+ lindex += new_len;
+ }
+}
+
+/* lost_leaf - repair a leaf block that's on the wrong directory inode
+ *
+ * If the correct index is less than the starting index, we have a problem.
+ * Since we process the index sequentially, the previous index has already
+ * been processed, fixed, and is now correct. But this leaf wants to overwrite
+ * a previously written good leaf. The only thing we can do is move all the
+ * directory entries to lost+found so we don't overwrite the good leaf. Then
+ * we need to pad the gap we leave.
+ */
+static int lost_leaf(struct gfs2_inode *ip, uint64_t *tbl, uint64_t leafno,
+ int ref_count, int lindex, struct gfs2_buffer_head *bh)
+{
+ char *filename;
+ char *bh_end = bh->b_data + ip->i_sbd->bsize;
+ struct gfs2_dirent de, *dent;
+ int error;
+
+ log_err(_("Leaf block %llu (0x%llx) seems to be out of place and its "
+ "contents need to be moved to lost+found.\n"),
+ (unsigned long long)leafno, (unsigned long long)leafno);
+ if (!query( _("Attempt to fix it? (y/n) "))) {
+ log_err( _("Directory leaf was not fixed.\n"));
+ return 0;
+ }
+ make_sure_lf_exists(ip);
+
+ dent = (struct gfs2_dirent *)(bh->b_data + sizeof(struct gfs2_leaf));
+ while (1) {
+ char tmp_name[PATH_MAX];
+
+ memset(&de, 0, sizeof(struct gfs2_dirent));
+ gfs2_dirent_in(&de, (char *)dent);
+ filename = (char *)dent + sizeof(struct gfs2_dirent);
+ memset(tmp_name, 0, sizeof(tmp_name));
+ if (de.de_name_len > sizeof(filename)) {
+ log_debug(_("Encountered bad filename length; "
+ "stopped processing.\n"));
+ break;
+ }
+ memcpy(tmp_name, filename, de.de_name_len);
+ if ((de.de_name_len == 1 && filename[0] == '.')) {
+ log_debug(_("Skipping entry '.'\n"));
+ } else if (de.de_name_len == 2 && filename[0] == '.' &&
+ filename[1] == '.') {
+ log_debug(_("Skipping entry '..'\n"));
+ } else if (!de.de_inum.no_formal_ino) { /* sentinel */
+ log_debug(_("Skipping sentinel '%s'\n"), tmp_name);
+ } else {
+ uint32_t count;
+ struct dir_status ds = {0};
+ uint8_t q = 0;
+
+ error = basic_dentry_checks(ip, dent, &de.de_inum,
+ tmp_name, &count, &de,
+ &ds, &q, bh);
+ if (error) {
+ log_err(_("Not relocating corrupt entry "
+ "\"%s\".\n"), tmp_name);
+ } else {
+ error = dir_add(lf_dip, filename,
+ de.de_name_len, &de.de_inum,
+ de.de_type);
+ if (error && error != -EEXIST) {
+ log_err(_("Error %d encountered while "
+ "trying to relocate \"%s\" "
+ "to lost+found.\n"), error,
+ tmp_name);
+ return error;
+ }
+ /* This inode is linked from lost+found */
+ incr_link_count(de.de_inum, lf_dip,
+ _("from lost+found"));
+ /* If it's a directory, lost+found is
+ back-linked to it via .. */
+ if (q == gfs2_inode_dir)
+ incr_link_count(lf_dip->i_di.di_num,
+ NULL,
+ _("to lost+found"));
+ log_err(_("Relocated \"%s\", block %llu "
+ "(0x%llx) to lost+found.\n"),
+ tmp_name,
+ (unsigned long long)de.de_inum.no_addr,
+ (unsigned long long)de.de_inum.no_addr);
+ }
+ }
+ if ((char *)dent + de.de_rec_len >= bh_end)
+ break;
+ dent = (struct gfs2_dirent *)((char *)dent + de.de_rec_len);
+ }
+ log_err(_("Directory entries from misplaced leaf block were relocated "
+ "to lost+found.\n"));
+ /* Free the lost leaf. */
+ fsck_blockmap_set(ip, leafno, _("lost leaf"), gfs2_block_free);
+ ip->i_di.di_blocks--;
+ bmodified(ip->i_bh);
+ /* Now we have to deal with the bad hash table entries pointing to the
+ misplaced leaf block. But we can't just fill the gap with a single
+ leaf. We have to write on nice power-of-two boundaries, and we have
+ to pad out any extra pointers. */
+ pad_with_leafblks(ip, tbl, lindex, ref_count);
+ return 1;
+}
+
+/* fix_hashtable - fix a corrupt hash table
+ *
+ * The main intent of this function is to sort out hash table problems.
+ * That is, it needs to determine if leaf blocks are in the wrong place,
+ * if the count of pointers is wrong, and if there are extra pointers.
+ * Everything should be placed on correct power-of-two boundaries appropriate
+ * to their leaf depth, and extra pointers should be correctly padded with new
+ * leaf blocks.
+ *
+ * @ip: the directory dinode structure pointer
+ * @tbl: hash table that's already read into memory
+ * @hsize: hash table size, as dictated by the dinode's di_depth
+ * @leafblk: the leaf block number that appears at this lindex in the tbl
+ * @lindex: leaf index that has a problem
+ * @proper_start: where this leaf's pointers should start, as far as the
+ * hash table is concerned (sight unseen; trusting the leaf
+ * really belongs here).
+ * @len: count of pointers in the hash table to this leafblk
+ * @proper_len: pointer to return the proper number of pointers, as the kernel
+ * calculates it, based on the leaf depth.
+ * @factor: the proper depth, given this number of pointers (rounded down).
+ *
+ * Returns: 0 - no changes made, or X if changes were made
+ */
+static int fix_hashtable(struct gfs2_inode *ip, uint64_t *tbl, unsigned hsize,
+ uint64_t leafblk, int lindex, uint32_t proper_start,
+ int len, int *proper_len, int factor)
+{
+ struct gfs2_buffer_head *lbh;
+ struct gfs2_leaf *leaf;
+ struct gfs2_dirent dentry, *de;
+ int changes = 0, error, i, extras, hash_index;
+ uint64_t new_leaf_blk;
+ uint32_t leaf_proper_start;
+
+ *proper_len = len;
+ log_err(_("Dinode %llu (0x%llx) has a hash table error at index "
+ "0x%x, length 0x%x: leaf block %llu (0x%llx)\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr, lindex, len,
+ (unsigned long long)leafblk, (unsigned long long)leafblk);
+ if (!query( _("Fix the hash table? (y/n) "))) {
+ log_err(_("Hash table not fixed.\n"));
+ return 0;
+ }
+
+ lbh = bread(ip->i_sbd, leafblk);
+ leaf = (struct gfs2_leaf *)lbh->b_data;
+ /* If the leaf's depth is out of range for this dinode, it's obviously
+ attached to the wrong dinode. Move the dirents to lost+found. */
+ if (be16_to_cpu(leaf->lf_depth) > ip->i_di.di_depth) {
+ log_err(_("This leaf block's depth (%d) is too big for this "
+ "dinode's depth (%d)\n"),
+ be16_to_cpu(leaf->lf_depth), ip->i_di.di_depth);
+ error = lost_leaf(ip, tbl, leafblk, len, lindex, lbh);
+ brelse(lbh);
+ return error;
+ }
+
+ memset(&dentry, 0, sizeof(struct gfs2_dirent));
+ de = (struct gfs2_dirent *)(lbh->b_data + sizeof(struct gfs2_leaf));
+ gfs2_dirent_in(&dentry, (char *)de);
+
+ /* If this is an empty leaf, we can just delete it and pad. */
+ if ((dentry.de_rec_len == cpu_to_be16(ip->i_sbd->bsize -
+ sizeof(struct gfs2_leaf))) &&
+ (dentry.de_inum.no_formal_ino == 0)) {
+ brelse(lbh);
+ gfs2_free_block(ip->i_sbd, leafblk);
+ log_err(_("Out of place leaf block %llu (0x%llx) had no "
+ "entries, so it was deleted.\n"),
+ (unsigned long long)leafblk,
+ (unsigned long long)leafblk);
+ pad_with_leafblks(ip, tbl, lindex, len);
+ log_err(_("Reprocessing index 0x%x (case 1).\n"), lindex);
+ return 1;
+ }
+
+ /* Calculate the proper number of pointers based on the leaf depth. */
+ *proper_len = 1 << (ip->i_di.di_depth - be16_to_cpu(leaf->lf_depth));
+
+ /* Look at the first dirent and check its hash value to see if it's
+ at the proper starting offset. */
+ hash_index = hash_table_index(dentry.de_hash, ip);
+ if (hash_index < lindex || hash_index > lindex + len) {
+ log_err(_("This leaf block has hash index %d, which is out of "
+ "bounds for where it appears in the hash table "
+ "(%d - %d)\n"),
+ hash_index, lindex, lindex + len);
+ error = lost_leaf(ip, tbl, leafblk, len, lindex, lbh);
+ brelse(lbh);
+ return error;
+ }
+
+ /* Now figure out where this leaf should start, and pad any pointers
+ up to that point with new leaf blocks. */
+ leaf_proper_start = (hash_index & ~(*proper_len - 1));
+ if (lindex < leaf_proper_start) {
+ log_err(_("Leaf pointers start at %d (0x%x), should be %d "
+ "(%x).\n"), lindex, lindex,
+ leaf_proper_start, leaf_proper_start);
+ pad_with_leafblks(ip, tbl, lindex, leaf_proper_start - lindex);
+ brelse(lbh);
+ return 1; /* reprocess the starting lindex */
+ }
+ /* If the proper start according to the leaf's hash index is later
+ than the proper start according to the hash table, it's once
+ again lost and we have to relocate it. The same applies if the
+ leaf's hash index is prior to the proper state, but the leaf is
+ already at its maximum depth. */
+ if ((leaf_proper_start < proper_start) ||
+ ((*proper_len > len || lindex > leaf_proper_start) &&
+ be16_to_cpu(leaf->lf_depth) == ip->i_di.di_depth)) {
+ log_err(_("Leaf block should start at 0x%x, but it appears at "
+ "0x%x in the hash table.\n"), leaf_proper_start,
+ proper_start);
+ error = lost_leaf(ip, tbl, leafblk, len, lindex, lbh);
+ brelse(lbh);
+ return error;
+ }
+
+ /* If we SHOULD have more pointers than we do, we can solve the
+ problem by splitting the block to a lower depth. Then we may have
+ the right number of pointers. If the leaf block pointers start
+ later than they should, we can split the leaf to give it a smaller
+ footprint in the hash table. */
+ if ((*proper_len > len || lindex > leaf_proper_start) &&
+ ip->i_di.di_depth > be16_to_cpu(leaf->lf_depth)) {
+ log_err(_("For depth %d, length %d, the proper start is: "
+ "0x%x.\n"), factor, len, proper_start);
+ changes++;
+ new_leaf_blk = find_free_blk(ip->i_sbd);
+ dir_split_leaf(ip, lindex, leafblk, lbh);
+ /* re-read the leaf to pick up dir_split_leaf's changes */
+ gfs2_leaf_in(leaf, lbh);
+ *proper_len = 1 << (ip->i_di.di_depth -
+ be16_to_cpu(leaf->lf_depth));
+ log_err(_("Leaf block %llu (0x%llx) was split from length "
+ "%d to %d\n"), (unsigned long long)leafblk,
+ (unsigned long long)leafblk, len, *proper_len);
+ if (*proper_len < 0) {
+ log_err(_("Programming error: proper_len=%d, "
+ "di_depth = %d, lf_depth = %d.\n"),
+ *proper_len, ip->i_di.di_depth,
+ be16_to_cpu(leaf->lf_depth));
+ exit(FSCK_ERROR);
+ }
+ log_err(_("New split-off leaf block was allocated at %lld "
+ "(0x%llx) for index %d (0x%x)\n"),
+ (unsigned long long)new_leaf_blk,
+ (unsigned long long)new_leaf_blk, lindex, lindex);
+ fsck_blockmap_set(ip, new_leaf_blk, _("split leaf"),
+ gfs2_leaf_blk);
+ log_err(_("Hash table repaired.\n"));
+ /* Fix up the hash table in memory to include the new leaf */
+ for (i = 0; i < *proper_len; i++)
+ tbl[lindex + i] = cpu_to_be64(new_leaf_blk);
+ if (*proper_len < (len >> 1)) {
+ log_err(_("One leaf split is not enough. The hash "
+ "table will need to be reprocessed.\n"));
+ brelse(lbh);
+ return changes;
+ }
+ lindex += (*proper_len); /* skip the new leaf from the split */
+ len -= (*proper_len);
+ }
+ if (*proper_len < len) {
+ log_err(_("There are %d pointers, but leaf 0x%llx's "
+ "depth, %d, only allows %d\n"),
+ len, (unsigned long long)leafblk,
+ be16_to_cpu(leaf->lf_depth), *proper_len);
+ }
+ brelse(lbh);
+ /* At this point, lindex should be at the proper end of the pointers.
+ Now we need to replace any extra duplicate pointers to the old
+ (original) leafblk (that ran off the end) with new leaf blocks. */
+ lindex += (*proper_len); /* Skip past the normal good pointers */
+ len -= (*proper_len);
+ extras = 0;
+ for (i = 0; i < len; i++) {
+ if (be64_to_cpu(tbl[lindex + i]) == leafblk)
+ extras++;
+ else
+ break;
+ }
+ if (extras) {
+ log_err(_("Found %d extra pointers to leaf %llu (0x%llx)\n"),
+ extras, (unsigned long long)leafblk,
+ (unsigned long long)leafblk);
+ pad_with_leafblks(ip, tbl, lindex, extras);
+ log_err(_("Reprocessing index 0x%x (case 2).\n"), lindex);
+ return 1;
+ }
+ return changes;
+}
+
+/* check_hash_tbl - check that the hash table is sane
+ *
+ * We've got to make sure the hash table is sane. Each leaf needs to
+ * be counted a proper power of 2. We can't just have 3 pointers to a leaf.
+ * The number of pointers must correspond to the proper leaf depth, and they
+ * must all fall on power-of-two boundaries. The leaf block pointers all need
+ * to fall properly on these boundaries, otherwise the kernel code's
+ * calculations will land it on the wrong leaf block while it's searching,
+ * and the result will be files you can see with ls, but can't open, delete
+ * or use them.
+ *
+ * The goal of this function is to check the hash table to make sure the
+ * boundaries and lengths all line up properly, and if not, to fix it.
+ *
+ * Note: There's a delicate balance here, because this function gets called
+ * BEFORE leaf blocks are checked by function check_leaf from function
+ * check_leaf_blks: the hash table has to be sane before we can start
+ * checking all the leaf blocks. And yet if there's hash table corruption
+ * we may need to reference leaf blocks to fix it, which means we need
+ * to check and/or fix a leaf block along the way.
+ */
+static int check_hash_tbl(struct gfs2_inode *ip, uint64_t *tbl,
+ unsigned hsize, void *private)
+{
+ int error = 0;
+ int lindex, len, proper_len, i, changes = 0;
+ uint64_t leafblk;
+ struct gfs2_leaf leaf;
+ struct gfs2_buffer_head *lbh;
+ int factor;
+ uint32_t proper_start;
+
+ lindex = 0;
+ while (lindex < hsize) {
+ if (fsck_abort)
+ return changes;
+ len = 1;
+ factor = 0;
+ leafblk = be64_to_cpu(tbl[lindex]);
+ while (lindex + (len << 1) - 1 < hsize) {
+ if (be64_to_cpu(tbl[lindex + (len << 1) - 1]) !=
+ leafblk)
+ break;
+ len <<= 1;
+ factor++;
+ }
+
+ /* Check for leftover pointers after the factor of two: */
+ proper_len = len; /* A factor of 2 that fits nicely */
+ while (lindex + len < hsize &&
+ be64_to_cpu(tbl[lindex + len]) == leafblk)
+ len++;
+
+ /* See if that leaf block is valid. If not, write a new one
+ that falls on a proper boundary. If it doesn't naturally,
+ we may need more. */
+ if (!valid_block(ip->i_sbd, leafblk)) {
+ uint64_t new_leafblk;
+
+ log_err(_("Dinode %llu (0x%llx) has bad leaf pointers "
+ "at offset %d for %d\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ lindex, len);
+ if (!query( _("Fix the hash table? (y/n) "))) {
+ log_err(_("Hash table not fixed.\n"));
+ lindex += len;
+ continue;
+ }
+ error = write_new_leaf(ip, lindex, proper_len,
+ _("replacing"), &new_leafblk);
+ if (error)
+ return error;
+
+ for (i = lindex; i < lindex + proper_len; i++)
+ tbl[i] = cpu_to_be64(new_leafblk);
+ lindex += proper_len;
+ continue;
+ }
+ /* Make sure they fall on proper leaf-split boundaries. This
+ is the calculation used by the kernel, and dir_split_leaf */
+ proper_start = (lindex & ~(proper_len - 1));
+ if (lindex != proper_start) {
+ log_debug(_("lindex 0x%llx is not a proper starting "
+ "point for this leaf: 0x%llx\n"),
+ (unsigned long long)lindex,
+ (unsigned long long)proper_start);
+ changes = fix_hashtable(ip, tbl, hsize, leafblk,
+ lindex, proper_start, len,
+ &proper_len, factor);
+ /* Check if we need to split more leaf blocks */
+ if (changes) {
+ if (proper_len < (len >> 1))
+ log_err(_("More leaf splits are "
+ "needed; "));
+ log_err(_("Reprocessing index 0x%x (case 3).\n"),
+ lindex);
+ continue; /* Make it reprocess the lindex */
+ }
+ }
+ /* Check for extra pointers to this leaf. At this point, len
+ is the number of pointers we have. proper_len is the proper
+ number of pointers if the hash table is assumed correct.
+ Function fix_hashtable will read in the leaf block and
+ determine the "actual" proper length based on the leaf
+ depth, and adjust the hash table accordingly. */
+ if (len != proper_len) {
+ log_err(_("Length %d (0x%x) is not a proper length "
+ "for this leaf. Valid boundary assumed to "
+ "be %d (0x%x).\n"),
+ len, len, proper_len, proper_len);
+ lbh = bread(ip->i_sbd, leafblk);
+ gfs2_leaf_in(&leaf, lbh);
+ brelse(lbh);
+ if (gfs2_check_meta(lbh, GFS2_METATYPE_LF) ||
+ leaf.lf_depth > ip->i_di.di_depth)
+ leaf.lf_depth = factor;
+ changes = fix_hashtable(ip, tbl, hsize, leafblk,
+ lindex, lindex, len,
+ &proper_len, leaf.lf_depth);
+ /* If fixing the hash table made changes, we can no
+ longer count on the leaf block pointers all pointing
+ to the same leaf (which is checked below). To avoid
+ flagging another error, reprocess the offset. */
+ if (changes) {
+ log_err(_("Reprocessing index 0x%x (case 4).\n"),
+ lindex);
+ continue; /* Make it reprocess the lindex */
+ }
+ }
+
+ /* Now make sure they're all the same pointer */
+ for (i = lindex; i < lindex + proper_len; i++) {
+ if (fsck_abort)
+ return changes;
+
+ if (be64_to_cpu(tbl[i]) == leafblk) /* No problem */
+ continue;
+
+ log_err(_("Dinode %llu (0x%llx) has a hash table "
+ "inconsistency at index %d (0x%x) for %d\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ i, i, len);
+ if (!query( _("Fix the hash table? (y/n) "))) {
+ log_err(_("Hash table not fixed.\n"));
+ continue;
+ }
+ changes++;
+ /* Now we have to determine if the hash table is
+ corrupt, or if the leaf has the wrong depth. */
+ lbh = bread(ip->i_sbd, leafblk);
+ gfs2_leaf_in(&leaf, lbh);
+ brelse(lbh);
+ /* Calculate the expected pointer count based on the
+ leaf depth. */
+ proper_len = 1 << (ip->i_di.di_depth - leaf.lf_depth);
+ if (proper_len != len) {
+ log_debug(_("Length 0x%x is not proper for "
+ "this leaf: 0x%x"),
+ len, proper_len);
+ changes = fix_hashtable(ip, tbl, hsize,
+ leafblk, lindex,
+ lindex, len,
+ &proper_len,
+ leaf.lf_depth);
+ break;
+ }
+ }
+ lindex += proper_len;
+ }
+ if (!error && changes)
+ error = 1;
+ return error;
+}
struct metawalk_fxns pass2_fxns = {
.private = NULL,
@@ -725,6 +1255,8 @@ struct metawalk_fxns pass2_fxns = {
.check_eattr_leaf = check_eattr_leaf,
.check_dentry = check_dentry,
.check_eattr_entry = NULL,
+ .check_hash_tbl = check_hash_tbl,
+ .repair_leaf = repair_leaf,
};
/* Check system directory inode */
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 11/47] fsck.gfs2: Special case '..' when processing bad formal inode number
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (8 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 10/47] fsck.gfs2: Add new function to check dir hash tables Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 12/47] fsck.gfs2: Move function to read directory hash table to util.c Bob Peterson
` (35 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
In a recent patch to fsck.gfs2, we added the ability to make sure the
formal inode number in each directory entry matches the formal inode
number in the dinode. If it doesn't match, fsck tries to fix it up.
We can't do much for regular files, but we can fix up directories.
If the directory linkage is intact, it just fixes the formal inode number.
But to check if the directory linkage is intact, we were checking to make
sure the child directory points to the parent with its "..". For example,
suppose we have gfs2 mounted as /mnt/gfs2, and at the root, we have
directory "a", and within "a" we have a subdirectory "b". In other words:
/mnt/gfs2/a/b/...
Now suppose fsck.gfs2 finds a formal inode number mismatch between the
dirent inside "a" which points to "b" and the inode "b" itself. Since
both "a" and "b" are directories, it tries to determine if the directory
linkage is intact by testing whether b's ".." dirent actually points
back to "a". And if it's good, we can just fix the formal inode number
so that they match.
That's all well and good, and works for the most part. However, if
the dirent found to be wrong isn't "b" but ".." we've got a problem.
Today's algorithm would look up the ".." of ".." which won't be
pointing back to what we want.
For this patch, I'm special-casing ".." and making it just delete the
corrupt directory entry. However, we have to do it in such a way that
it doesn't decrement di_entries, since the entry is invalid.
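As a rough illustration of the rule described above, the decision to offer removal (rather than attempt a formal inode number fix-up) could be sketched as follows. The function name offer_entry_removal and its boolean parameter are hypothetical stand-ins for the `q != gfs2_inode_dir` test in the real code; this is a sketch of the condition only, not of the surrounding fsck logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true when the mismatched dirent should be
 * offered for removal instead of being fixed up in place.
 * target_is_dir stands in for "q == gfs2_inode_dir" in pass2.c. */
static bool offer_entry_removal(bool target_is_dir, const char *name)
{
	assert(name != NULL);
	/* Non-directories cannot be verified through "..", and ".." itself
	 * cannot be verified by looking up the ".." of "..". */
	return !target_is_dir || strcmp(name, "..") == 0;
}
```

With this shape, a mismatched dirent named "b" that points at a directory still takes the linkage-check path, while a mismatched ".." goes straight to the removal query.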
rhbz#902920
---
gfs2/fsck/pass2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index a71be4b..5d8c2b6 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -260,7 +260,7 @@ static int bad_formal_ino(struct gfs2_inode *ip, struct gfs2_dirent *dent,
(unsigned long long)entry.no_formal_ino,
(unsigned long long)ii->di_num.no_formal_ino,
(unsigned long long)ii->di_num.no_formal_ino);
- if (q != gfs2_inode_dir) {
+ if (q != gfs2_inode_dir || !strcmp("..", tmp_name)) {
if (query( _("Remove the corrupt directory entry? (y/n) ")))
return 1;
log_err( _("Corrupt directory entry not removed.\n"));
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 12/47] fsck.gfs2: Move function to read directory hash table to util.c
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (9 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 11/47] fsck.gfs2: Special case '..' when processing bad formal inode number Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 13/47] fsck.gfs2: Misc cleanups Bob Peterson
` (34 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch moves function get_dir_hash from metawalk.c to util.c.
This was done because a future patch will need access to the function.
---
gfs2/fsck/metawalk.c | 18 ------------------
gfs2/fsck/util.c | 19 +++++++++++++++++++
gfs2/fsck/util.h | 1 +
3 files changed, 20 insertions(+), 18 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index e76c35f..5a13c6f 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -639,24 +639,6 @@ out_copy_old_leaf:
return 1;
}
-static uint64_t *get_dir_hash(struct gfs2_inode *ip)
-{
- unsigned hsize = (1 << ip->i_di.di_depth) * sizeof(uint64_t);
- int ret;
- uint64_t *tbl = malloc(hsize);
-
- if (tbl == NULL)
- return NULL;
-
- ret = gfs2_readi(ip, tbl, 0, hsize);
- if (ret != hsize) {
- free(tbl);
- return NULL;
- }
-
- return tbl;
-}
-
static int u64cmp(const void *p1, const void *p2)
{
uint64_t a = *(uint64_t *)p1;
diff --git a/gfs2/fsck/util.c b/gfs2/fsck/util.c
index 94d532e..5be260c 100644
--- a/gfs2/fsck/util.c
+++ b/gfs2/fsck/util.c
@@ -654,3 +654,22 @@ uint64_t find_free_blk(struct gfs2_sbd *sdp)
}
return 0;
}
+
+uint64_t *get_dir_hash(struct gfs2_inode *ip)
+{
+ unsigned hsize = (1 << ip->i_di.di_depth) * sizeof(uint64_t);
+ int ret;
+ uint64_t *tbl = malloc(hsize);
+
+ if (tbl == NULL)
+ return NULL;
+
+ ret = gfs2_readi(ip, tbl, 0, hsize);
+ if (ret != hsize) {
+ free(tbl);
+ return NULL;
+ }
+
+ return tbl;
+}
+
diff --git a/gfs2/fsck/util.h b/gfs2/fsck/util.h
index 1a4811c..7b587d4 100644
--- a/gfs2/fsck/util.h
+++ b/gfs2/fsck/util.h
@@ -185,6 +185,7 @@ extern char generic_interrupt(const char *caller, const char *where,
const char *answers);
extern char gfs2_getch(void);
extern uint64_t find_free_blk(struct gfs2_sbd *sdp);
+extern uint64_t *get_dir_hash(struct gfs2_inode *ip);
#define stack log_debug("<backtrace> - %s()\n", __func__)
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 13/47] fsck.gfs2: Misc cleanups
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (10 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 12/47] fsck.gfs2: Move function to read directory hash table to util.c Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 14/47] fsck.gfs2: Verify dirent hash values correspond to proper leaf block Bob Peterson
` (33 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch contains some trivial cleanups.
---
gfs2/fsck/metawalk.c | 6 ++++++
gfs2/fsck/pass1.c | 4 ++--
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 5a13c6f..ce6bdbe 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -335,6 +335,7 @@ static void dirblk_truncate(struct gfs2_inode *ip, struct gfs2_dirent *fixb,
* bh - buffer for the leaf block
* type - type of block this is (linear or exhash)
* @count - set to the count entries
+ * @lindex - the last inde
* @pass - structure pointing to pass-specific functions
*
* returns: 0 - good block or it was repaired to be good
@@ -515,6 +516,8 @@ static int warn_and_patch(struct gfs2_inode *ip, uint64_t *leaf_no,
/**
* check_leaf - check a leaf block for errors
+ * Reads in the leaf block
+ * Leaves the buffer around for further analysis (caller must brelse)
*/
static int check_leaf(struct gfs2_inode *ip, int lindex,
struct metawalk_fxns *pass, int *ref_count,
@@ -1170,6 +1173,9 @@ static void free_metalist(struct gfs2_inode *ip, osi_list_t *mlp)
* This includes hash table blocks for directories
* which are technically "data" in the bitmap.
*
+ * Returns: 0 - all is well, process the blocks this metadata references
+ * 1 - something went wrong, but process the sub-blocks anyway
+ * -1 - something went wrong, so don't process the sub-blocks
* @ip:
* @mlp:
*/
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index cc69e84..0d4da5d 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -211,7 +211,7 @@ static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
So we know it's a leaf block. */
q = block_type(block);
if (q != gfs2_block_free) {
- log_err( _("Found duplicate block %llu (0x%llx) referenced "
+ log_err( _("Found duplicate block #%llu (0x%llx) referenced "
"as a directory leaf in dinode "
"%llu (0x%llx) - was marked %d (%s)\n"),
(unsigned long long)block,
@@ -264,7 +264,7 @@ static int check_metalist(struct gfs2_inode *ip, uint64_t block,
}
q = block_type(block);
if (q != gfs2_block_free) {
- log_err( _("Found duplicate block %llu (0x%llx) referenced "
+ log_err( _("Found duplicate block #%llu (0x%llx) referenced "
"as metadata in indirect block for dinode "
"%llu (0x%llx) - was marked %d (%s)\n"),
(unsigned long long)block,
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 14/47] fsck.gfs2: Verify dirent hash values correspond to proper leaf block
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (11 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 13/47] fsck.gfs2: Misc cleanups Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 15/47] fsck.gfs2: re-read hash table if directory height or depth changes Bob Peterson
` (32 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch checks to make sure all the dirents on a leaf block have
hash values that are actually appropriate to the leaf block.
With extended hashing, the file name is used to generate a hash value.
Today's fsck checks that the hash value is proper to the file name,
but that's not enough. The hash value is also shifted by a certain
amount (determined by i_depth) to produce an index into the hash table.
For example, suppose di_depth == 8. Valid indexes into the hash table
go from 0 to (1 << di_depth) - 1, which is (1 << 8) - 1 = 256 - 1 = 255.
Now suppose we have four actual leaf blocks, and each leaf block is
repeated 64 (0x40) times in the index. So the hash table, indexed by
file name would be something like:
entries 00-3f = first leaf block
entries 40-7f = second leaf block
entries 80-bf = third leaf block
entries c0-ff = fourth leaf block
So ht index = name->hash >> (32 - ip->i_depth).
In our example, i_depth is 8, so:
ht index == hash >> (32 - 8) == hash >> 24
In this case, the hash value is shifted by a certain amount to get
the index into the table. For example, file name "Solar" has hash
value 0x59f4dde1. So the ht index == 0x59f4dde1 >> 24 == 0x59.
Therefore, a file with the name "Solar" better appear on the second
leaf, which covers index values from 0x40 to 0x7f.
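The arithmetic in the "Solar" example can be sketched in a few lines of C. The helper names ht_index and run_start are made up for this illustration (fsck.gfs2 has its own hash_table_index), and the guard for a depth of zero is an assumption added only to keep the shift well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Index into the exhash table for a 32-bit name hash.
 * di_depth is the directory depth; the table has 1 << di_depth slots. */
static uint32_t ht_index(uint32_t hash, unsigned di_depth)
{
	assert(di_depth <= 32);
	if (di_depth == 0)	/* shifting a 32-bit value by 32 is undefined */
		return 0;
	return hash >> (32 - di_depth);
}

/* First slot of the aligned run of 'len' slots that contains 'index'.
 * len must be a power of two (64 in the four-leaf example). */
static uint32_t run_start(uint32_t index, uint32_t len)
{
	assert(len != 0 && (len & (len - 1)) == 0);
	return index & ~(len - 1);
}
```

For hash 0x59f4dde1 and depth 8, ht_index gives 0x59, and run_start(0x59, 0x40) gives 0x40, the first slot of the second leaf block.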
What this patch does is verify that all the dirents on the first
leaf block have a hash value starting with 0x00 to 0x3f, and all
the dirents on the second leaf block have a hash value starting with
0x40 to 0x7f, and so forth. If they appear on the wrong leaf block,
they need to be relocated to the proper leaf block.
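The range check itself might look roughly like the sketch below. It assumes the leaf's first pointer sits at hash table index lindex and that its lf_depth is trustworthy; the names leaf_last_index and dirent_on_right_leaf are hypothetical and only illustrate the comparison, not the relocation that follows a failed check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Last hash table index served by a leaf whose first pointer is at lindex.
 * A leaf of depth lf_depth owns 1 << (di_depth - lf_depth) pointers. */
static uint32_t leaf_last_index(uint32_t lindex, unsigned di_depth,
				unsigned lf_depth)
{
	assert(lf_depth <= di_depth && di_depth - lf_depth < 32);
	return lindex + ((uint32_t)1 << (di_depth - lf_depth)) - 1;
}

/* True if a dirent whose hash maps to hash_index belongs on this leaf. */
static bool dirent_on_right_leaf(uint32_t hash_index, uint32_t lindex,
				 unsigned di_depth, unsigned lf_depth)
{
	return hash_index >= lindex &&
	       hash_index <= leaf_last_index(lindex, di_depth, lf_depth);
}
```

In the four-leaf example (di_depth 8, so each leaf has lf_depth 2), index 0x59 passes the check for the leaf starting at 0x40 and fails it for the leaves starting at 0x00 and 0x80.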
rhbz#902920
---
gfs2/fsck/metawalk.c | 12 ++--
gfs2/fsck/metawalk.h | 3 +-
gfs2/fsck/pass1.c | 2 +-
gfs2/fsck/pass1b.c | 5 +-
gfs2/fsck/pass2.c | 190 +++++++++++++++++++++++++++++++++++++++++++++++++--
5 files changed, 197 insertions(+), 15 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index ce6bdbe..40ef766 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -342,7 +342,8 @@ static void dirblk_truncate(struct gfs2_inode *ip, struct gfs2_dirent *fixb,
* -1 - error occurred
*/
static int check_entries(struct gfs2_inode *ip, struct gfs2_buffer_head *bh,
- int type, uint32_t *count, struct metawalk_fxns *pass)
+ int type, uint32_t *count, int lindex,
+ struct metawalk_fxns *pass)
{
struct gfs2_dirent *dent;
struct gfs2_dirent de, *prev;
@@ -450,6 +451,7 @@ static int check_entries(struct gfs2_inode *ip, struct gfs2_buffer_head *bh,
} else {
error = pass->check_dentry(ip, dent, prev, bh,
filename, count,
+ lindex,
pass->private);
if (error < 0) {
stack;
@@ -589,7 +591,8 @@ static int check_leaf(struct gfs2_inode *ip, int lindex,
}
if (pass->check_dentry && is_dir(&ip->i_di, sdp->gfs1)) {
- error = check_entries(ip, lbh, DIR_EXHASH, &count, pass);
+ error = check_entries(ip, lbh, DIR_EXHASH, &count, lindex,
+ pass);
if (skip_this_pass || fsck_abort)
goto out;
@@ -1450,7 +1453,7 @@ int check_linear_dir(struct gfs2_inode *ip, struct gfs2_buffer_head *bh,
int error = 0;
uint32_t count = 0;
- error = check_entries(ip, bh, DIR_LINEAR, &count, pass);
+ error = check_entries(ip, bh, DIR_LINEAR, &count, 0, pass);
if (error < 0) {
stack;
return -1;
@@ -1481,7 +1484,8 @@ int check_dir(struct gfs2_sbd *sdp, uint64_t block, struct metawalk_fxns *pass)
static int remove_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
struct gfs2_dirent *prev_de,
struct gfs2_buffer_head *bh,
- char *filename, uint32_t *count, void *private)
+ char *filename, uint32_t *count, int lindex,
+ void *private)
{
/* the metawalk_fxn's private field must be set to the dentry
* block we want to clear */
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index c43baf0..bef99ae 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -85,7 +85,8 @@ struct metawalk_fxns {
int (*check_dentry) (struct gfs2_inode *ip, struct gfs2_dirent *de,
struct gfs2_dirent *prev,
struct gfs2_buffer_head *bh,
- char *filename, uint32_t *count, void *private);
+ char *filename, uint32_t *count,
+ int lindex, void *private);
int (*check_eattr_entry) (struct gfs2_inode *ip,
struct gfs2_buffer_head *leaf_bh,
struct gfs2_ea_header *ea_hdr,
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 0d4da5d..dd6b958 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -148,7 +148,7 @@ static int resuscitate_metalist(struct gfs2_inode *ip, uint64_t block,
static int resuscitate_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
struct gfs2_dirent *prev_de,
struct gfs2_buffer_head *bh, char *filename,
- uint32_t *count, void *priv)
+ uint32_t *count, int lindex, void *priv)
{
struct gfs2_sbd *sdp = ip->i_sbd;
struct gfs2_dirent dentry, *de;
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index e8c39be..f3f90ef 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -50,7 +50,8 @@ static int check_eattr_extentry(struct gfs2_inode *ip, uint64_t *ea_data_ptr,
void *private);
static int find_dentry(struct gfs2_inode *ip, struct gfs2_dirent *de,
struct gfs2_dirent *prev, struct gfs2_buffer_head *bh,
- char *filename, uint32_t *count, void *priv);
+ char *filename, uint32_t *count, int lindex,
+ void *priv);
struct metawalk_fxns find_refs = {
.private = NULL,
@@ -174,7 +175,7 @@ static int check_dir_dup_ref(struct gfs2_inode *ip, struct gfs2_dirent *de,
static int find_dentry(struct gfs2_inode *ip, struct gfs2_dirent *de,
struct gfs2_dirent *prev,
struct gfs2_buffer_head *bh, char *filename,
- uint32_t *count, void *priv)
+ uint32_t *count, int lindex, void *priv)
{
struct osi_node *n, *next = NULL;
osi_list_t *tmp2;
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index 5d8c2b6..527071c 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -301,6 +301,166 @@ static int hash_table_index(uint32_t hash, struct gfs2_inode *ip)
return hash >> (32 - ip->i_di.di_depth);
}
+static int hash_table_max(int lindex, struct gfs2_inode *ip,
+ struct gfs2_buffer_head *bh)
+{
+ struct gfs2_leaf *leaf = (struct gfs2_leaf *)bh->b_data;
+ return (1 << (ip->i_di.di_depth - be16_to_cpu(leaf->lf_depth))) +
+ lindex - 1;
+}
+
+static int check_leaf_depth(struct gfs2_inode *ip, uint64_t leaf_no,
+ int ref_count, struct gfs2_buffer_head *lbh)
+{
+ struct gfs2_leaf *leaf = (struct gfs2_leaf *)lbh->b_data;
+ int cur_depth = be16_to_cpu(leaf->lf_depth);
+ int exp_count = 1 << (ip->i_di.di_depth - cur_depth);
+ int divisor;
+ int factor, correct_depth;
+
+ if (exp_count == ref_count)
+ return 0;
+
+ factor = 0;
+ divisor = ref_count;
+ while (divisor > 1) {
+ factor++;
+ divisor >>= 1;
+ }
+ correct_depth = ip->i_di.di_depth - factor;
+ if (cur_depth == correct_depth)
+ return 0;
+
+ log_err(_("Leaf block %llu (0x%llx) in dinode %llu (0x%llx) has the "
+ "wrong depth: is %d (length %d), should be %d (length "
+ "%d).\n"),
+ (unsigned long long)leaf_no, (unsigned long long)leaf_no,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ cur_depth, ref_count, correct_depth, exp_count);
+ if (!query( _("Fix the leaf block? (y/n)"))) {
+ log_err( _("The leaf block was not fixed.\n"));
+ return 0;
+ }
+
+ leaf->lf_depth = cpu_to_be16(correct_depth);
+ bmodified(lbh);
+ log_err( _("The leaf block depth was fixed.\n"));
+ return 1;
+}
+
+/* wrong_leaf: Deal with a dirent discovered to be on the wrong leaf block
+ *
+ * Returns: 1 if the dirent is to be removed, 0 if it needs to be kept,
+ * or -1 on error
+ */
+static int wrong_leaf(struct gfs2_inode *ip, struct gfs2_inum *entry,
+ const char *tmp_name, int lindex, int lindex_max,
+ int hash_index, struct gfs2_buffer_head *bh,
+ struct dir_status *ds, struct gfs2_dirent *dent,
+ struct gfs2_dirent *de, struct gfs2_dirent *prev_de,
+ uint32_t *count, uint8_t q)
+{
+ struct gfs2_sbd *sdp = ip->i_sbd;
+ struct gfs2_buffer_head *dest_lbh;
+ uint64_t planned_leaf, real_leaf;
+ int li, dest_ref, error;
+ uint64_t *tbl;
+
+ log_err(_("Directory entry '%s' at block %lld (0x%llx) is on the "
+ "wrong leaf block.\n"), tmp_name,
+ (unsigned long long)entry->no_addr,
+ (unsigned long long)entry->no_addr);
+ log_err(_("Leaf index is: 0x%x. The range for this leaf block is "
+ "0x%x - 0x%x\n"), hash_index, lindex, lindex_max);
+ if (!query( _("Move the misplaced directory entry to "
+ "a valid leaf block? (y/n) "))) {
+ log_err( _("Misplaced directory entry not moved.\n"));
+ return 0;
+ }
+
+ /* check the destination leaf block's depth */
+ tbl = get_dir_hash(ip);
+ if (tbl == NULL) {
+ perror("get_dir_hash");
+ return -1;
+ }
+ planned_leaf = be64_to_cpu(tbl[hash_index]);
+ log_err(_("Moving it from leaf %llu (0x%llx) to %llu (0x%llx)\n"),
+ (unsigned long long)be64_to_cpu(tbl[lindex]),
+ (unsigned long long)be64_to_cpu(tbl[lindex]),
+ (unsigned long long)planned_leaf,
+ (unsigned long long)planned_leaf);
+ /* Can't trust lf_depth; we have to count */
+ dest_ref = 0;
+ for (li = 0; li < (1 << ip->i_di.di_depth); li++) {
+ if (be64_to_cpu(tbl[li]) == planned_leaf)
+ dest_ref++;
+ else if (dest_ref)
+ break;
+ }
+ dest_lbh = bread(sdp, planned_leaf);
+ check_leaf_depth(ip, planned_leaf, dest_ref, dest_lbh);
+ brelse(dest_lbh);
+ free(tbl);
+
+ /* check if it's already on the correct leaf block */
+ error = dir_search(ip, tmp_name, de->de_name_len, NULL, &de->de_inum);
+ if (!error) {
+ log_err(_("The misplaced directory entry already appears on "
+ "the correct leaf block.\n"));
+ log_err( _("The bad duplicate directory entry "
+ "'%s' was cleared.\n"), tmp_name);
+ return 1; /* nuke the dent upon return */
+ }
+
+ if (dir_add(ip, tmp_name, de->de_name_len, &de->de_inum,
+ de->de_type) == 0) {
+ log_err(_("The misplaced directory entry was moved to a "
+ "valid leaf block.\n"));
+ gfs2_get_leaf_nr(ip, hash_index, &real_leaf);
+ if (real_leaf != planned_leaf) {
+ log_err(_("The planned leaf was split. The new leaf "
+ "is: %llu (0x%llx)"),
+ (unsigned long long)real_leaf,
+ (unsigned long long)real_leaf);
+ fsck_blockmap_set(ip, real_leaf, _("split leaf"),
+ gfs2_indir_blk);
+ }
+ /* If the misplaced dirent was supposed to be earlier in the
+ hash table, we need to adjust our counts for the blocks
+ that have already been processed. If it's supposed to
+ appear later, we'll count it as part of our normal
+ processing when we get to that leaf block later on in the
+ hash table. */
+ if (hash_index > lindex) {
+ log_err(_("Accounting deferred.\n"));
+ return 1; /* nuke the dent upon return */
+ }
+ /* If we get here, it's because we moved a dent to another
+ leaf, but that leaf has already been processed. So we have
+ to nuke the dent from this leaf when we return, but we
+ still need to do the "good dent" accounting. */
+ error = incr_link_count(*entry, ip, _("valid reference"));
+ if (error > 0 &&
+ bad_formal_ino(ip, dent, *entry, tmp_name, q, de, bh) == 1)
+ return 1; /* nuke it */
+
+ /* You cannot do this:
+ (*count)++;
+ The reason is: *count is the count of dentries on the leaf,
+ and we moved the dentry to a previous leaf within the same
+ directory dinode. So the directory counts still get
+ incremented, but not leaf entries. When we called dir_add
+ above, it should have fixed that prev leaf's lf_entries. */
+ ds->entry_count++;
+ return 1;
+ } else {
+ log_err(_("Error moving directory entry.\n"));
+ return 1; /* nuke it */
+ }
+}
+
/* basic_dentry_checks - fundamental checks for directory entries
*
* @ip: pointer to the incode inode structure
@@ -522,9 +682,9 @@ static int basic_dentry_checks(struct gfs2_inode *ip, struct gfs2_dirent *dent,
/* FIXME: should maybe refactor this a bit - but need to deal with
* FIXMEs internally first */
static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
- struct gfs2_dirent *prev_de,
- struct gfs2_buffer_head *bh, char *filename,
- uint32_t *count, void *priv)
+ struct gfs2_dirent *prev_de,
+ struct gfs2_buffer_head *bh, char *filename,
+ uint32_t *count, int lindex, void *priv)
{
struct gfs2_sbd *sdp = ip->i_sbd;
uint8_t q = 0;
@@ -534,6 +694,8 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
int error;
struct gfs2_inode *entry_ip = NULL;
struct gfs2_dirent dentry, *de;
+ int hash_index; /* index into the hash table based on the hash */
+ int lindex_max; /* largest acceptable hash table index for hash */
memset(&dentry, 0, sizeof(struct gfs2_dirent));
gfs2_dirent_in(&dentry, (char *)dent);
@@ -674,6 +836,21 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
ds->dotdotdir = 1;
goto dentry_is_valid;
}
+ /* If this is an exhash directory, make sure the dentries in the leaf
+ block have a hash table index that fits */
+ if (ip->i_di.di_flags & GFS2_DIF_EXHASH) {
+ hash_index = hash_table_index(de->de_hash, ip);
+ lindex_max = hash_table_max(lindex, ip, bh);
+ if (hash_index < lindex || hash_index > lindex_max) {
+ int nuke_dent;
+
+ nuke_dent = wrong_leaf(ip, &entry, tmp_name, lindex,
+ lindex_max, hash_index, bh, ds,
+ dent, de, prev_de, count, q);
+ if (nuke_dent)
+ goto nuke_dentry;
+ }
+ }
/* After this point we're only concerned with directories */
if (q != gfs2_inode_dir) {
@@ -705,10 +882,9 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
dentry_is_valid:
/* This directory inode links to this inode via this dentry */
error = incr_link_count(entry, ip, _("valid reference"));
- if (error > 0) {
- if (bad_formal_ino(ip, dent, entry, tmp_name, q, de, bh) == 1)
- goto nuke_dentry;
- }
+ if (error > 0 &&
+ bad_formal_ino(ip, dent, entry, tmp_name, q, de, bh) == 1)
+ goto nuke_dentry;
(*count)++;
ds->entry_count++;
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 15/47] fsck.gfs2: re-read hash table if directory height or depth changes
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (12 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 14/47] fsck.gfs2: Verify dirent hash values correspond to proper leaf block Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 16/47] fsck.gfs2: fix leaf blocks, don't try to patch the hash table Bob Peterson
` (31 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
There may be times when fsck.gfs2 wants to move things around. For
example, if it finds dirent entries on the wrong leaf block, it
may want to move them to a different leaf. If it does, it may need
to split the leaf, which means we're adding another block. We may
in fact have doubled our exhash table, so the table in cache is no
longer valid. In this case, we need to discard the old one and read
it in again. This patch checks for these things and re-reads the
hash table as appropriate.
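
The staleness test described above can be sketched in isolation. This is
only an illustration, not the fsck.gfs2 code: struct dir_geometry and
hash_table_is_stale() are hypothetical names standing in for the
di_depth, di_height and di_blocks comparisons the patch makes.

```c
#include <stdint.h>

/* Hypothetical snapshot of the dinode fields the patch compares
 * (di_depth, di_height, di_blocks); not a libgfs2 structure. */
struct dir_geometry {
	uint16_t depth;   /* log2 of the number of hash table entries */
	uint16_t height;  /* height of the metadata tree */
	uint64_t blocks;  /* blocks allocated to the directory */
};

/* Return nonzero if any of the three fields changed, in which case a
 * hash table read before the change can no longer be trusted. */
static int hash_table_is_stale(const struct dir_geometry *orig,
			       const struct dir_geometry *now)
{
	return orig->depth != now->depth ||
	       orig->height != now->height ||
	       orig->blocks != now->blocks;
}
```

A caller would take the snapshot right after reading the table, and
re-read the table (then take a new snapshot) whenever the function
returns nonzero.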
rhbz#902920
---
gfs2/fsck/metawalk.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 47 insertions(+), 2 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 40ef766..73bdba0 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -683,18 +683,23 @@ static int check_leaf_blks(struct gfs2_inode *ip, struct metawalk_fxns *pass)
struct gfs2_leaf leaf, oldleaf;
unsigned hsize = (1 << ip->i_di.di_depth);
uint64_t leaf_no, old_leaf, bad_leaf = -1;
- uint64_t first_ok_leaf;
+ uint64_t first_ok_leaf, orig_di_blocks;
struct gfs2_buffer_head *lbh;
int lindex;
struct gfs2_sbd *sdp = ip->i_sbd;
- int ref_count = 0, old_was_dup;
+ int ref_count = 0, orig_ref_count, orig_di_depth, orig_di_height, old_was_dup;
uint64_t *tbl;
+ int tbl_valid;
tbl = get_dir_hash(ip);
if (tbl == NULL) {
perror("get_dir_hash");
return -1;
}
+ tbl_valid = 1;
+ orig_di_depth = ip->i_di.di_depth;
+ orig_di_height = ip->i_di.di_height;
+ orig_di_blocks = ip->i_di.di_blocks;
/* Turn off system readahead */
posix_fadvise(sdp->device_fd, 0, 0, POSIX_FADV_RANDOM);
@@ -752,6 +757,21 @@ static int check_leaf_blks(struct gfs2_inode *ip, struct metawalk_fxns *pass)
for (lindex = 0; lindex < hsize; lindex++) {
if (fsck_abort)
break;
+
+ if (!tbl_valid) {
+ free(tbl);
+ log_debug(_("Re-reading 0x%llx hash table.\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr);
+ tbl = get_dir_hash(ip);
+ if (tbl == NULL) {
+ perror("get_dir_hash");
+ return -1;
+ }
+ tbl_valid = 1;
+ orig_di_depth = ip->i_di.di_depth;
+ orig_di_height = ip->i_di.di_height;
+ orig_di_blocks = ip->i_di.di_blocks;
+ }
leaf_no = be64_to_cpu(tbl[lindex]);
/* GFS has multiple indirect pointers to the same leaf
@@ -765,6 +785,7 @@ static int check_leaf_blks(struct gfs2_inode *ip, struct metawalk_fxns *pass)
ref_count++;
continue;
}
+ orig_ref_count = ref_count;
do {
if (fsck_abort) {
@@ -775,6 +796,8 @@ static int check_leaf_blks(struct gfs2_inode *ip, struct metawalk_fxns *pass)
error = check_leaf(ip, lindex, pass, &ref_count,
&leaf_no, old_leaf, &bad_leaf,
first_ok_leaf, &leaf, &oldleaf);
+ if (ref_count != orig_ref_count)
+ tbl_valid = 0;
old_was_dup = (error == -EEXIST);
old_leaf = leaf_no;
memcpy(&oldleaf, &leaf, sizeof(oldleaf));
@@ -784,6 +807,28 @@ static int check_leaf_blks(struct gfs2_inode *ip, struct metawalk_fxns *pass)
log_debug( _("Leaf chain (0x%llx) detected.\n"),
(unsigned long long)leaf_no);
} while (1); /* while we have chained leaf blocks */
+ if (orig_di_depth != ip->i_di.di_depth) {
+ log_debug(_("Depth of 0x%llx changed from %d to %d\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ orig_di_depth, ip->i_di.di_depth);
+ tbl_valid = 0;
+ }
+ if (orig_di_height != ip->i_di.di_height) {
+ log_debug(_("Height of 0x%llx changed from %d to "
+ "%d\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ orig_di_height, ip->i_di.di_height);
+ tbl_valid = 0;
+ }
+ if (orig_di_blocks != ip->i_di.di_blocks) {
+ log_debug(_("Block count of 0x%llx changed from %llu "
+ "to %llu\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)orig_di_blocks,
+ (unsigned long long)ip->i_di.di_blocks);
+ tbl_valid = 0;
+ }
+ lindex += ref_count;
} /* for every leaf block */
free(tbl);
posix_fadvise(sdp->device_fd, 0, 0, POSIX_FADV_NORMAL);
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 16/47] fsck.gfs2: fix leaf blocks, don't try to patch the hash table
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (13 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 15/47] fsck.gfs2: re-read hash table if directory height or depth changes Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 17/47] fsck.gfs2: check leaf depth when validating leaf blocks Bob Peterson
` (30 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, when we detected a bad leaf block, fsck.gfs2 would
try to patch the hash table. That's very wrong, because the hash table
needs to be on nice power-of-two boundaries. This patch changes the
code so that the hash table is actually repaired.
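
To illustrate the power-of-two constraint, here is a minimal sketch of
counting the pointers to one leaf and checking that the run is well
formed. It assumes a table already converted to host byte order, and the
alignment rule in run_is_aligned() is my reading of "nice power-of-two
boundaries", not something taken from the patch. Both function names are
hypothetical.

```c
#include <stdint.h>

/* Count how many consecutive hash table slots, starting at lindex,
 * point at the same leaf block. The caller must ensure
 * lindex < hsize. tbl is assumed to be in host byte order (the
 * on-disk table is big-endian). */
static unsigned count_leaf_refs(const uint64_t *tbl, unsigned hsize,
				unsigned lindex)
{
	unsigned refs = 1;  /* the slot at lindex itself */
	unsigned l;

	for (l = lindex + 1; l < hsize && tbl[l] == tbl[lindex]; l++)
		refs++;
	return refs;
}

/* Assumed rule: a run of pointers is well formed only if its length
 * is a power of two and it starts on a multiple of that length. */
static int run_is_aligned(unsigned lindex, unsigned refs)
{
	if (refs == 0 || (refs & (refs - 1)) != 0)
		return 0;
	return (lindex % refs) == 0;
}
```

Overwriting a single bad pointer with a neighbouring leaf number, as the
old code did, changes the neighbour's run length and can easily produce
a run that fails this kind of check.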
rhbz#902920
---
gfs2/fsck/metawalk.c | 135 ++++++++++++++++++---------------------------------
gfs2/fsck/metawalk.h | 3 +-
gfs2/fsck/pass1.c | 14 ++++++
gfs2/fsck/pass2.c | 9 +++-
4 files changed, 72 insertions(+), 89 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 73bdba0..4cd712e 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -480,52 +480,14 @@ static int check_entries(struct gfs2_inode *ip, struct gfs2_buffer_head *bh,
return 0;
}
-/* warn_and_patch - Warn the user of an error and ask permission to fix it
- * Process a bad leaf pointer and ask to repair the first time.
- * The repair process involves extending the previous leaf's entries
- * so that they replace the bad ones. We have to hack up the old
- * leaf a bit, but it's better than deleting the whole directory,
- * which is what used to happen before. */
-static int warn_and_patch(struct gfs2_inode *ip, uint64_t *leaf_no,
- uint64_t *bad_leaf, uint64_t old_leaf,
- uint64_t first_ok_leaf, int pindex, const char *msg)
-{
- int okay_to_fix = 0;
-
- if (*bad_leaf != *leaf_no) {
- log_err( _("Directory Inode %llu (0x%llx) points to leaf %llu"
- " (0x%llx) %s.\n"),
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)*leaf_no,
- (unsigned long long)*leaf_no, msg);
- }
- if (*leaf_no == *bad_leaf ||
- (okay_to_fix = query( _("Attempt to patch around it? (y/n) ")))) {
- if (valid_block(ip->i_sbd, old_leaf))
- gfs2_put_leaf_nr(ip, pindex, old_leaf);
- else
- gfs2_put_leaf_nr(ip, pindex, first_ok_leaf);
- log_err( _("Directory Inode %llu (0x%llx) repaired.\n"),
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr);
- } else
- log_err( _("Bad leaf left in place.\n"));
- *bad_leaf = *leaf_no;
- *leaf_no = old_leaf;
- return okay_to_fix;
-}
-
/**
* check_leaf - check a leaf block for errors
* Reads in the leaf block
* Leaves the buffer around for further analysis (caller must brelse)
*/
static int check_leaf(struct gfs2_inode *ip, int lindex,
- struct metawalk_fxns *pass, int *ref_count,
- uint64_t *leaf_no, uint64_t old_leaf, uint64_t *bad_leaf,
- uint64_t first_ok_leaf, struct gfs2_leaf *leaf,
- struct gfs2_leaf *oldleaf)
+ struct metawalk_fxns *pass,
+ uint64_t *leaf_no, struct gfs2_leaf *leaf, int *ref_count)
{
int error = 0, fix;
struct gfs2_buffer_head *lbh = NULL;
@@ -533,7 +495,6 @@ static int check_leaf(struct gfs2_inode *ip, int lindex,
struct gfs2_sbd *sdp = ip->i_sbd;
const char *msg;
- *ref_count = 1;
/* Make sure the block number is in range. */
if (!valid_block(ip->i_sbd, *leaf_no)) {
log_err( _("Leaf block #%llu (0x%llx) is out of range for "
@@ -543,7 +504,7 @@ static int check_leaf(struct gfs2_inode *ip, int lindex,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
msg = _("that is out of range");
- goto out_copy_old_leaf;
+ goto bad_leaf;
}
/* Try to read in the leaf block. */
@@ -551,7 +512,7 @@ static int check_leaf(struct gfs2_inode *ip, int lindex,
/* Make sure it's really a valid leaf block. */
if (gfs2_check_meta(lbh, GFS2_METATYPE_LF)) {
msg = _("that is not really a leaf");
- goto out_copy_old_leaf;
+ goto bad_leaf;
}
if (pass->check_leaf) {
error = pass->check_leaf(ip, *leaf_no, pass->private);
@@ -587,7 +548,7 @@ static int check_leaf(struct gfs2_inode *ip, int lindex,
(unsigned long long)*leaf_no,
(unsigned long long)*leaf_no);
msg = _("that is not a leaf");
- goto out_copy_old_leaf;
+ goto bad_leaf;
}
if (pass->check_dentry && is_dir(&ip->i_di, sdp->gfs1)) {
@@ -599,16 +560,18 @@ static int check_leaf(struct gfs2_inode *ip, int lindex,
if (error < 0) {
stack;
- goto out;
+ goto out; /* This seems wrong: needs investigation */
}
- if (count != leaf->lf_entries) {
- /* release and re-read the leaf in case check_entries
- changed it. */
- brelse(lbh);
- lbh = bread(sdp, *leaf_no);
- gfs2_leaf_in(leaf, lbh);
+ if (count == leaf->lf_entries)
+ goto out;
+ /* release and re-read the leaf in case check_entries
+ changed it. */
+ brelse(lbh);
+ lbh = bread(sdp, *leaf_no);
+ gfs2_leaf_in(leaf, lbh);
+ if (count != leaf->lf_entries) {
log_err( _("Leaf %llu (0x%llx) entry count in "
"directory %llu (0x%llx) does not match "
"number of entries found - is %u, found %u\n"),
@@ -630,17 +593,16 @@ out:
brelse(lbh);
return 0;
-out_copy_old_leaf:
- /* The leaf we read in is bad. So we'll copy the old leaf into the
- * new one. However, that will make us shift our ref count. */
- fix = warn_and_patch(ip, leaf_no, bad_leaf, old_leaf,
- first_ok_leaf, lindex, msg);
- (*ref_count)++;
- memcpy(leaf, oldleaf, sizeof(struct gfs2_leaf));
- if (lbh) {
- if (fix)
- bmodified(lbh);
+bad_leaf:
+ if (lbh)
brelse(lbh);
+ if (pass->repair_leaf) {
+ /* The leaf we read in is bad so we need to repair it. */
+ fix = pass->repair_leaf(ip, leaf_no, lindex, *ref_count, msg,
+ pass->private);
+ if (fix < 0)
+ return fix;
+
}
return 1;
}
@@ -679,17 +641,17 @@ static void dir_leaf_reada(struct gfs2_inode *ip, uint64_t *tbl, unsigned hsize)
/* Checks exhash directory entries */
static int check_leaf_blks(struct gfs2_inode *ip, struct metawalk_fxns *pass)
{
- int error;
- struct gfs2_leaf leaf, oldleaf;
+ int error = 0;
+ struct gfs2_leaf leaf;
unsigned hsize = (1 << ip->i_di.di_depth);
- uint64_t leaf_no, old_leaf, bad_leaf = -1;
+ uint64_t leaf_no, leaf_next;
uint64_t first_ok_leaf, orig_di_blocks;
struct gfs2_buffer_head *lbh;
int lindex;
struct gfs2_sbd *sdp = ip->i_sbd;
- int ref_count = 0, orig_ref_count, orig_di_depth, orig_di_height, old_was_dup;
+ int ref_count, orig_ref_count, orig_di_depth, orig_di_height;
uint64_t *tbl;
- int tbl_valid;
+ int chained_leaf, tbl_valid;
tbl = get_dir_hash(ip);
if (tbl == NULL) {
@@ -751,10 +713,11 @@ static int check_leaf_blks(struct gfs2_inode *ip, struct metawalk_fxns *pass)
posix_fadvise(sdp->device_fd, 0, 0, POSIX_FADV_NORMAL);
return 1;
}
- old_leaf = -1;
- memset(&oldleaf, 0, sizeof(oldleaf));
- old_was_dup = 0;
- for (lindex = 0; lindex < hsize; lindex++) {
+ lindex = 0;
+ leaf_next = -1;
+ while (lindex < hsize) {
+ int l;
+
if (fsck_abort)
break;
@@ -774,38 +737,36 @@ static int check_leaf_blks(struct gfs2_inode *ip, struct metawalk_fxns *pass)
}
leaf_no = be64_to_cpu(tbl[lindex]);
- /* GFS has multiple indirect pointers to the same leaf
- * until those extra pointers are needed, so skip the dups */
- if (leaf_no == bad_leaf) {
- tbl[lindex] = cpu_to_be64(old_leaf);
- gfs2_put_leaf_nr(ip, lindex, old_leaf);
- ref_count++;
- continue;
- } else if (old_leaf == leaf_no) {
+ /* count the number of block pointers to this leaf. We don't
+ need to count the current lindex, because we already know
+ it's a reference */
+ ref_count = 1;
+
+ for (l = lindex + 1; l < hsize; l++) {
+ leaf_next = be64_to_cpu(tbl[l]);
+ if (leaf_next != leaf_no)
+ break;
ref_count++;
- continue;
}
orig_ref_count = ref_count;
+ chained_leaf = 0;
do {
if (fsck_abort) {
free(tbl);
posix_fadvise(sdp->device_fd, 0, 0, POSIX_FADV_NORMAL);
return 0;
}
- error = check_leaf(ip, lindex, pass, &ref_count,
- &leaf_no, old_leaf, &bad_leaf,
- first_ok_leaf, &leaf, &oldleaf);
+ error = check_leaf(ip, lindex, pass, &leaf_no, &leaf,
+ &ref_count);
if (ref_count != orig_ref_count)
tbl_valid = 0;
- old_was_dup = (error == -EEXIST);
- old_leaf = leaf_no;
- memcpy(&oldleaf, &leaf, sizeof(oldleaf));
if (!leaf.lf_next || error)
break;
leaf_no = leaf.lf_next;
- log_debug( _("Leaf chain (0x%llx) detected.\n"),
- (unsigned long long)leaf_no);
+ chained_leaf++;
+ log_debug( _("Leaf chain #%d (0x%llx) detected.\n"),
+ chained_leaf, (unsigned long long)leaf_no);
} while (1); /* while we have chained leaf blocks */
if (orig_di_depth != ip->i_di.di_depth) {
log_debug(_("Depth of 0x%llx changed from %d to %d\n"),
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index bef99ae..e11b5e0 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -104,7 +104,8 @@ struct metawalk_fxns {
int (*check_hash_tbl) (struct gfs2_inode *ip, uint64_t *tbl,
unsigned hsize, void *private);
int (*repair_leaf) (struct gfs2_inode *ip, uint64_t *leaf_no,
- int lindex, int ref_count, const char *msg);
+ int lindex, int ref_count, const char *msg,
+ void *private);
};
#endif /* _METAWALK_H */
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index dd6b958..e827a55 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -78,6 +78,19 @@ static int invalidate_eattr_leaf(struct gfs2_inode *ip, uint64_t block,
void *private);
static int handle_ip(struct gfs2_sbd *sdp, struct gfs2_inode *ip);
+static int pass1_repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no,
+ int lindex, int ref_count, const char *msg,
+ void *private)
+{
+ struct block_count *bc = (struct block_count *)private;
+ int new_leaf_blks;
+
+ new_leaf_blks = repair_leaf(ip, leaf_no, lindex, ref_count, msg);
+ bc->indir_count += new_leaf_blks;
+
+ return new_leaf_blks;
+}
+
struct metawalk_fxns pass1_fxns = {
.private = NULL,
.check_leaf = check_leaf,
@@ -90,6 +103,7 @@ struct metawalk_fxns pass1_fxns = {
.check_eattr_extentry = check_extended_leaf_eattr,
.finish_eattr_indir = finish_eattr_indir,
.big_file_msg = big_file_comfort,
+ .repair_leaf = pass1_repair_leaf,
};
struct metawalk_fxns undo_fxns = {
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index 527071c..e0b1350 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -1422,6 +1422,13 @@ static int check_hash_tbl(struct gfs2_inode *ip, uint64_t *tbl,
return error;
}
+static int pass2_repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no,
+ int lindex, int ref_count, const char *msg,
+ void *private)
+{
+ return repair_leaf(ip, leaf_no, lindex, ref_count, msg);
+}
+
struct metawalk_fxns pass2_fxns = {
.private = NULL,
.check_leaf = NULL,
@@ -1432,7 +1439,7 @@ struct metawalk_fxns pass2_fxns = {
.check_dentry = check_dentry,
.check_eattr_entry = NULL,
.check_hash_tbl = check_hash_tbl,
- .repair_leaf = repair_leaf,
+ .repair_leaf = pass2_repair_leaf,
};
/* Check system directory inode */
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 17/47] fsck.gfs2: check leaf depth when validating leaf blocks
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (14 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 16/47] fsck.gfs2: fix leaf blocks, don't try to patch the hash table Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 18/47] fsck.gfs2: small cleanups Bob Peterson
` (29 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Now that fsck.gfs2 can validate the hash table is relatively sane
without actually reading the leaf blocks, when it does need to read
in those leaf blocks, we need to check the leaf depth is
appropriate for the (now sane) number of pointers we encountered in
the hash table. This patch adds a call to check the leaf depth from
pass2.
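
As a rough sketch of the relationship being checked: assuming the usual
GFS2 extendible-hash rule that a leaf of depth lf_depth is referenced by
2^(di_depth - lf_depth) hash table pointers, the expected leaf depth can
be derived from the pointer count. The function below is hypothetical;
the real check_leaf_depth() is not part of this message and may differ.

```c
/* Return the leaf depth implied by ref_count pointers in a hash table
 * of 2^di_depth entries, or -1 if ref_count is not a power of two or
 * is larger than the table itself. */
static int expected_leaf_depth(unsigned di_depth, unsigned ref_count)
{
	unsigned log2_refs = 0;

	if (ref_count == 0 || (ref_count & (ref_count - 1)) != 0)
		return -1;
	while ((1u << log2_refs) < ref_count)
		log2_refs++;
	if (log2_refs > di_depth)
		return -1;
	return (int)(di_depth - log2_refs);
}
```

For a directory with di_depth 3 (eight table entries), a leaf with four
pointers would be expected to have lf_depth 1 under this assumption.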
rhbz#902920
---
gfs2/fsck/metawalk.c | 3 +++
gfs2/fsck/metawalk.h | 2 ++
gfs2/fsck/pass2.c | 1 +
3 files changed, 6 insertions(+)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 4cd712e..fd4ec93 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -514,6 +514,9 @@ static int check_leaf(struct gfs2_inode *ip, int lindex,
msg = _("that is not really a leaf");
goto bad_leaf;
}
+ if (pass->check_leaf_depth)
+ error = pass->check_leaf_depth(ip, *leaf_no, *ref_count, lbh);
+
if (pass->check_leaf) {
error = pass->check_leaf(ip, *leaf_no, pass->private);
if (error) {
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index e11b5e0..486c6eb 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -69,6 +69,8 @@ extern int free_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
*/
struct metawalk_fxns {
void *private;
+ int (*check_leaf_depth) (struct gfs2_inode *ip, uint64_t leaf_no,
+ int ref_count, struct gfs2_buffer_head *lbh);
int (*check_leaf) (struct gfs2_inode *ip, uint64_t block,
void *private);
int (*check_metalist) (struct gfs2_inode *ip, uint64_t block,
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index e0b1350..48b20f5 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -1431,6 +1431,7 @@ static int pass2_repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no,
struct metawalk_fxns pass2_fxns = {
.private = NULL,
+ .check_leaf_depth = check_leaf_depth,
.check_leaf = NULL,
.check_metalist = NULL,
.check_data = NULL,
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 18/47] fsck.gfs2: small cleanups
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (15 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 17/47] fsck.gfs2: check leaf depth when validating leaf blocks Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 19/47] fsck.gfs2: reprocess inodes when blocks are added Bob Peterson
` (28 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch just fixes some messages that were wrong, adds comments,
and so forth.
rhbz#902920
---
gfs2/fsck/metawalk.c | 24 +++++++++++++-----------
gfs2/fsck/pass2.c | 9 +++------
gfs2/fsck/util.h | 2 +-
3 files changed, 17 insertions(+), 18 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index fd4ec93..05706da 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -356,14 +356,12 @@ static int check_entries(struct gfs2_inode *ip, struct gfs2_buffer_head *bh,
if (type == DIR_LINEAR) {
dent = (struct gfs2_dirent *)(bh->b_data + sizeof(struct gfs2_dinode));
- }
- else if (type == DIR_EXHASH) {
+ } else if (type == DIR_EXHASH) {
dent = (struct gfs2_dirent *)(bh->b_data + sizeof(struct gfs2_leaf));
- log_debug( _("Checking leaf %llu (0x%llu)\n"),
+ log_debug( _("Checking leaf %llu (0x%llx)\n"),
(unsigned long long)bh->b_blocknr,
(unsigned long long)bh->b_blocknr);
- }
- else {
+ } else {
log_err( _("Invalid directory type %d specified\n"), type);
return -1;
}
@@ -498,11 +496,12 @@ static int check_leaf(struct gfs2_inode *ip, int lindex,
/* Make sure the block number is in range. */
if (!valid_block(ip->i_sbd, *leaf_no)) {
log_err( _("Leaf block #%llu (0x%llx) is out of range for "
- "directory #%llu (0x%llx).\n"),
+ "directory #%llu (0x%llx) at index %d (0x%x).\n"),
(unsigned long long)*leaf_no,
(unsigned long long)*leaf_no,
(unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr);
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ lindex, lindex);
msg = _("that is out of range");
goto bad_leaf;
}
@@ -1334,8 +1333,8 @@ static int check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
/**
* check_metatree
- * @ip:
- * @rgd:
+ * @ip: inode structure in memory
+ * @pass: structure passed in from caller to determine the sub-functions
*
*/
int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
@@ -1684,8 +1683,11 @@ void reprocess_inode(struct gfs2_inode *ip, const char *desc)
int error;
alloc_fxns.private = (void *)desc;
- log_info( _("%s had blocks added; reprocessing its metadata tree "
- "at height=%d.\n"), desc, ip->i_di.di_height);
+ log_info( _("%s inode %llu (0x%llx) had blocks added; reprocessing "
+ "its metadata tree at height=%d.\n"), desc,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ ip->i_di.di_height);
error = check_metatree(ip, &alloc_fxns);
if (error)
log_err( _("Error %d reprocessing the %s metadata tree.\n"),
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index 48b20f5..5572fa3 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -46,9 +46,9 @@ static int set_parent_dir(struct gfs2_sbd *sdp, struct gfs2_inum child,
if (di->dinode.no_addr == child.no_addr &&
di->dinode.no_formal_ino == child.no_formal_ino) {
if (di->treewalk_parent) {
- log_err( _("Another directory at block %llx (0x%llx) "
- "already contains this child %lld (%llx) - "
- "checking parent %llx (0x%llx)\n"),
+ log_err( _("Another directory at block %lld (0x%llx) "
+ "already contains this child %lld (0x%llx)"
+ " - checking parent %lld (0x%llx)\n"),
(unsigned long long)di->treewalk_parent,
(unsigned long long)di->treewalk_parent,
(unsigned long long)child.no_addr,
@@ -1784,6 +1784,3 @@ int pass2(struct gfs2_sbd *sdp)
gfs2_dup_free();
return FSCK_OK;
}
-
-
-
diff --git a/gfs2/fsck/util.h b/gfs2/fsck/util.h
index 7b587d4..00c2239 100644
--- a/gfs2/fsck/util.h
+++ b/gfs2/fsck/util.h
@@ -65,7 +65,7 @@ static const inline char *block_type_string(uint8_t q)
const char *blktyp[] = {
"free",
"data",
- "indirect data",
+ "indirect meta",
"directory",
"file",
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 19/47] fsck.gfs2: reprocess inodes when blocks are added
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (16 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 18/47] fsck.gfs2: small cleanups Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 20/47] fsck.gfs2: Remove redundant leaf depth check Bob Peterson
` (27 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch adds several calls to reprocess_inode when functions
may have potentially added blocks to a dinode. This happens, for
example, when leaf blocks are split or new leaf blocks are added.
The purpose of reprocessing the inode is to properly mark the new
blocks in the fsck blockmap. If we don't, the new blocks may be
flagged as wrong in the bitmap, and set free in pass5.
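
The pattern used at each call site is "remember the block count, run the
operation, reprocess if the count changed". A self-contained sketch of
that pattern follows. Everything prefixed toy_ is a hypothetical
stand-in (struct gfs2_inode and reprocess_inode() are far richer), and
returning early on error mirrors the check_system_dir hunk rather than
every call site.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct gfs2_inode. */
struct toy_inode {
	uint64_t blocks;      /* plays the role of i_di.di_blocks */
	int reprocessed;      /* number of times toy_reprocess() ran */
};

typedef int (*toy_inode_op)(struct toy_inode *ip);

/* Stand-in for reprocess_inode(): just records that it was called. */
static void toy_reprocess(struct toy_inode *ip)
{
	ip->reprocessed++;
}

/* Example operations: one adds a block (like a leaf split), one
 * leaves the inode alone, one fails after adding a block. */
static int toy_split_leaf(struct toy_inode *ip) { ip->blocks++; return 0; }
static int toy_noop(struct toy_inode *ip) { (void)ip; return 0; }
static int toy_fail(struct toy_inode *ip) { ip->blocks++; return -1; }

/* Run op and reprocess the inode only if op changed its block count. */
static int toy_run_and_reprocess(struct toy_inode *ip, toy_inode_op op)
{
	uint64_t cur_blks = ip->blocks;
	int error = op(ip);

	if (error < 0)
		return error;
	if (ip->blocks != cur_blks)
		toy_reprocess(ip);
	return error;
}
```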
rhbz#902920
---
gfs2/fsck/metawalk.c | 6 ++++++
gfs2/fsck/pass1.c | 11 +++++++++++
gfs2/fsck/pass2.c | 15 +++++++++------
3 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 05706da..e985dbc 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -1474,9 +1474,12 @@ int check_dir(struct gfs2_sbd *sdp, uint64_t block, struct metawalk_fxns *pass)
{
struct gfs2_inode *ip;
int error = 0;
+ uint64_t cur_blks;
ip = fsck_load_inode(sdp, block);
+ cur_blks = ip->i_di.di_blocks;
+
if (ip->i_di.di_flags & GFS2_DIF_EXHASH)
error = check_leaf_blks(ip, pass);
else
@@ -1485,6 +1488,9 @@ int check_dir(struct gfs2_sbd *sdp, uint64_t block, struct metawalk_fxns *pass)
if (error < 0)
stack;
+ if (ip->i_di.di_blocks != cur_blks)
+ reprocess_inode(ip, _("Current"));
+
fsck_inode_put(&ip); /* does a brelse */
return error;
}
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index e827a55..5137559 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -1022,6 +1022,7 @@ static int handle_ip(struct gfs2_sbd *sdp, struct gfs2_inode *ip)
struct block_count bc = {0};
long bad_pointers;
uint64_t block = ip->i_bh->b_blocknr;
+ uint64_t lf_blks = 0;
bad_pointers = 0L;
@@ -1083,8 +1084,18 @@ static int handle_ip(struct gfs2_sbd *sdp, struct gfs2_inode *ip)
}
}
+ if (lf_dip)
+ lf_blks = lf_dip->i_di.di_blocks;
+
pass1_fxns.private = &bc;
error = check_metatree(ip, &pass1_fxns);
+
+ /* Pass1 may have added some blocks to lost+found by virtue of leafs
+ that were misplaced. If it did, we need to reprocess lost+found
+ to correctly account for its blocks. */
+ if (lf_dip && lf_dip->i_di.di_blocks != lf_blks)
+ reprocess_inode(lf_dip, "lost+found");
+
if (fsck_abort || error < 0)
return 0;
if (error > 0) {
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index 5572fa3..1e7f884 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -1448,7 +1448,7 @@ struct metawalk_fxns pass2_fxns = {
static int check_system_dir(struct gfs2_inode *sysinode, const char *dirname,
int builder(struct gfs2_sbd *sdp))
{
- uint64_t iblock = 0;
+ uint64_t iblock = 0, cur_blks;
struct dir_status ds = {0};
char *filename;
int filename_len;
@@ -1468,12 +1468,15 @@ static int check_system_dir(struct gfs2_inode *sysinode, const char *dirname,
pass2_fxns.private = (void *) &ds;
if (ds.q == gfs2_bad_block) {
+ cur_blks = sysinode->i_di.di_blocks;
/* First check that the directory's metatree is valid */
error = check_metatree(sysinode, &pass2_fxns);
if (error < 0) {
stack;
return error;
}
+ if (sysinode->i_di.di_blocks != cur_blks)
+ reprocess_inode(sysinode, _("System inode"));
}
error = check_dir(sysinode->i_sbd, iblock, &pass2_fxns);
if (skip_this_pass || fsck_abort) /* if asked to skip the rest */
@@ -1493,8 +1496,7 @@ static int check_system_dir(struct gfs2_inode *sysinode, const char *dirname,
if (!ds.dotdir) {
log_err( _("No '.' entry found for %s directory.\n"), dirname);
if (query( _("Is it okay to add '.' entry? (y/n) "))) {
- uint64_t cur_blks = sysinode->i_di.di_blocks;
-
+ cur_blks = sysinode->i_di.di_blocks;
sprintf(tmp_name, ".");
filename_len = strlen(tmp_name); /* no trailing NULL */
if (!(filename = malloc(sizeof(char) * filename_len))) {
@@ -1585,7 +1587,7 @@ static inline int is_system_dir(struct gfs2_sbd *sdp, uint64_t block)
*/
int pass2(struct gfs2_sbd *sdp)
{
- uint64_t dirblk;
+ uint64_t dirblk, cur_blks;
uint8_t q;
struct dir_status ds = {0};
struct gfs2_inode *ip;
@@ -1647,12 +1649,15 @@ int pass2(struct gfs2_sbd *sdp)
/* First check that the directory's metatree
* is valid */
ip = fsck_load_inode(sdp, dirblk);
+ cur_blks = ip->i_di.di_blocks;
error = check_metatree(ip, &pass2_fxns);
fsck_inode_put(&ip);
if (error < 0) {
stack;
return error;
}
+ if (ip->i_di.di_blocks != cur_blks)
+ reprocess_inode(ip, "current");
}
error = check_dir(sdp, dirblk, &pass2_fxns);
if (skip_this_pass || fsck_abort) /* if asked to skip the rest */
@@ -1711,8 +1716,6 @@ int pass2(struct gfs2_sbd *sdp)
(unsigned long long)dirblk);
if (query( _("Is it okay to add '.' entry? (y/n) "))) {
- uint64_t cur_blks;
-
sprintf(tmp_name, ".");
filename_len = strlen(tmp_name); /* no trailing
NULL */
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 20/47] fsck.gfs2: Remove redundant leaf depth check
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (17 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 19/47] fsck.gfs2: reprocess inodes when blocks are added Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 21/47] fsck.gfs2: link dinodes that only have extended attribute problems Bob Peterson
` (26 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
A previous patch changed the way we check leaf block depth.
This patch removes the redundant check from pass1.
rhbz#902920
---
gfs2/fsck/pass1.c | 16 ----------------
1 file changed, 16 deletions(-)
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 5137559..04e5289 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -1021,7 +1021,6 @@ static int handle_ip(struct gfs2_sbd *sdp, struct gfs2_inode *ip)
int error;
struct block_count bc = {0};
long bad_pointers;
- uint64_t block = ip->i_bh->b_blocknr;
uint64_t lf_blks = 0;
bad_pointers = 0L;
@@ -1069,21 +1068,6 @@ static int handle_ip(struct gfs2_sbd *sdp, struct gfs2_inode *ip)
if (set_di_nlink(ip))
goto bad_dinode;
- if (is_dir(&ip->i_di, sdp->gfs1) && (ip->i_di.di_flags & GFS2_DIF_EXHASH)) {
- if (((1 << ip->i_di.di_depth) * sizeof(uint64_t)) != ip->i_di.di_size){
- log_warn( _("Directory dinode block #%llu (0x%llx"
- ") has bad depth. Found %u, Expected %u\n"),
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr,
- ip->i_di.di_depth,
- (1 >> (ip->i_di.di_size/sizeof(uint64_t))));
- if (fsck_blockmap_set(ip, block, _("bad depth"),
- gfs2_block_free))
- goto bad_dinode;
- return 0;
- }
- }
-
if (lf_dip)
lf_blks = lf_dip->i_di.di_blocks;
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 21/47] fsck.gfs2: link dinodes that only have extended attribute problems
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (18 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 20/47] fsck.gfs2: Remove redundant leaf depth check Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 22/47] fsck.gfs2: Add clarifying message to duplicate processing Bob Peterson
` (25 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
The job of pass1b is to resolve duplicate references to the same block.
Eventually it does a fair job of determining the rightful owner of the
block, and then it has to deal with the other dinode(s) that referenced
the block improperly. If another dinode improperly referenced the block
as data or metadata, it's obvious file corruption and the dinode should
be deleted. However, if the other dinode improperly referenced the
block as an extended attribute, it can fix the situation by removing
the extended attributes from the dinode. Prior to this patch, there
was a check in the code for this situation so that the dinode was only
deleted if the bad block reference was as data or metadata. However,
regardless of the situation, the code removed the inode from the
inode rbtree. That resulted in the dinode being considered unlinked,
so it would get improperly tossed into lost+found and left in an
indeterminate state. Subsequent runs of fsck.gfs2 could find the
discrepancy and flag it as unlinked again. This patch adds another
check so that the inode is not removed from the inode rbtree, so it
is linked properly during pass2.
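
The rule described above can be sketched in isolation. This is a
minimal, hypothetical model and not code from fsck.gfs2: only the names
reftypecount, ref_as_data and ref_as_meta come from the patch, while
struct dup_ref_counts, the ref_as_ea and ref_types values, and
must_unlink_dinode() are assumptions made for illustration.

```c
#include <stdbool.h>

/* Simplified stand-in for fsck.gfs2's reference types (assumed layout). */
enum dup_ref_type {
	ref_as_data = 0,
	ref_as_meta = 1,
	ref_as_ea = 2,
	ref_types = 3
};

/* Hypothetical: per-dinode count of bad references, by reference type. */
struct dup_ref_counts {
	int reftypecount[ref_types];
};

/*
 * Returns true if the dinode should be dropped from the inode tree,
 * which is only the case when it referenced the block as data or
 * metadata. A dinode with only extended attribute references is kept,
 * so a later pass can still link it.
 */
static bool must_unlink_dinode(const struct dup_ref_counts *id)
{
	return id->reftypecount[ref_as_data] ||
	       id->reftypecount[ref_as_meta];
}
```

Under these assumptions, a dinode with counts {0, 0, 1} is kept in the
tree and only loses its extended attributes, while {1, 0, 0} or
{0, 1, 0} is removed.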
rhbz#902920
---
gfs2/fsck/pass1b.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index f3f90ef..bd60d84 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -523,9 +523,12 @@ static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *b,
(unsigned long long)id->block_no);
ip = fsck_load_inode(sdp, id->block_no);
- ii = inodetree_find(ip->i_di.di_num.no_addr);
- if (ii)
- inodetree_delete(ii);
+ if (id->reftypecount[ref_as_data] ||
+ id->reftypecount[ref_as_meta]) {
+ ii = inodetree_find(ip->i_di.di_num.no_addr);
+ if (ii)
+ inodetree_delete(ii);
+ }
clear_dup_fxns.private = (void *) dh;
/* Clear the EAs for the inode first */
check_inode_eattr(ip, &clear_dup_fxns);
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 22/47] fsck.gfs2: Add clarifying message to duplicate processing
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (19 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 21/47] fsck.gfs2: link dinodes that only have extended attribute problems Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 23/47] fsck.gfs2: separate function to calculate metadata block header size Bob Peterson
` (24 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch adds a message to the fsck output that indicates which
block reference is acceptable. That helps to determine if fsck made
the right decision when a duplicate is resolved.
rhbz#902920
---
gfs2/fsck/pass1b.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index bd60d84..56b77f5 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -485,6 +485,15 @@ static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *b,
q = block_type(id->block_no);
if (q != gfs2_inode_invalid) {
found_good_ref = 1;
+ log_warn( _("Inode %s (%lld/0x%llx)'s "
+ "reference to block %llu (0x%llx) "
+ "as '%s' is acceptable.\n"),
+ id->name,
+ (unsigned long long)id->block_no,
+ (unsigned long long)id->block_no,
+ (unsigned long long)b->block,
+ (unsigned long long)b->block,
+ reftypes[this_ref]);
continue; /* don't delete the dinode */
}
}
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 23/47] fsck.gfs2: separate function to calculate metadata block header size
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (20 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 22/47] fsck.gfs2: Add clarifying message to duplicate processing Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions Bob Peterson
` (23 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch creates a new function hdr_size that calculates the size
of a GFS2 metadata header depending on the height and type of block.
rhbz#902920
---
gfs2/fsck/metawalk.c | 44 +++++++++++++++++++++++---------------------
1 file changed, 23 insertions(+), 21 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index e985dbc..d1b12f1 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -1331,6 +1331,23 @@ static int check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
return error;
}
+static int hdr_size(struct gfs2_buffer_head *bh, int height)
+{
+ if (height > 1) {
+ if (gfs2_check_meta(bh, GFS2_METATYPE_IN))
+ return 0;
+ if (bh->sdp->gfs1)
+ return sizeof(struct gfs_indirect);
+ else
+ return sizeof(struct gfs2_meta_header);
+ }
+ /* if this isn't really a dinode, skip it */
+ if (gfs2_check_meta(bh, GFS2_METATYPE_DI))
+ return 0;
+
+ return sizeof(struct gfs2_dinode);
+}
+
/**
* check_metatree
* @ip: inode structure in memory
@@ -1401,28 +1418,13 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
bh = osi_list_entry(list->next, struct gfs2_buffer_head,
b_altlist);
- if (height > 1) {
- if (gfs2_check_meta(bh, GFS2_METATYPE_IN)) {
- if (bh == ip->i_bh)
- osi_list_del(&bh->b_altlist);
- else
- brelse(bh);
- continue;
- }
- if (ip->i_sbd->gfs1)
- head_size = sizeof(struct gfs_indirect);
+ head_size = hdr_size(bh, height);
+ if (!head_size) {
+ if (bh == ip->i_bh)
+ osi_list_del(&bh->b_altlist);
else
- head_size = sizeof(struct gfs2_meta_header);
- } else {
- /* if this isn't really a dinode, skip it */
- if (gfs2_check_meta(bh, GFS2_METATYPE_DI)) {
- if (bh == ip->i_bh)
- osi_list_del(&bh->b_altlist);
- else
- brelse(bh);
- continue;
- }
- head_size = sizeof(struct gfs2_dinode);
+ brelse(bh);
+ continue;
}
if (pass->check_data)
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (21 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 23/47] fsck.gfs2: separate function to calculate metadata block header size Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-16 13:27 ` Steven Whitehouse
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 25/47] fsck.gfs2: Check for interrupt when resolving duplicates Bob Peterson
` (22 subsequent siblings)
45 siblings, 1 reply; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
In pass1, fsck.gfs2 traverses the metadata tree, processing each dinode and
marking which blocks are used by that dinode. If a dinode is found
to have unrecoverable errors, it does a bunch of work to "undo" the
things it did. This is especially important for the processing of
duplicate block references. Suppose dinode X references block 1234.
Later in pass1, suppose a different dinode, Y, also references block
1234. This is flagged as a duplicate block reference. Still later,
suppose pass1 determines dinode Y is bad. Now it has to undo the
work it did. It needs to properly unmark the data and metadata
blocks it marked as no longer "free" so that valid references that
follow aren't flagged as duplicate references. At the same time,
it needs to unflag block 1234 as a duplicate reference as well, so
that dinode X's original reference is still considered valid.
Before this patch, fsck.gfs2 was trying to traverse the entire
metadata tree for the bad dinode, trying to "undo" the designations.
That becomes a huge problem if the damage was discovered in the
middle of the metadata, in which case it may never have flagged any
of the data blocks as "in use as data" in its blockmap. "Undoing"
the designations sometimes left blocks improperly marked as "free"
when they were, in fact, referenced by other valid dinodes.
For example, suppose corrupt dinode Y references metadata blocks
1234, 1235, and 1236. Now suppose a serious problem is found as part
of its processing of block 1234, and so it stopped its metadata tree
traversal there. Metadata blocks 1235 and 1236 are still listed as
metadata for the bad dinode, but if we traverse the entire tree,
those two blocks may be improperly processed. If another dinode
actually uses blocks 1235 or 1236, the improper "undo" processing
of those two blocks can screw up the valid references.
This patch reworks the "undo" functions so that the "undo" functions
don't get called on the entire metadata and data of the defective
dinode. Instead, only the metadata and data blocks queued onto the
metadata list are processed. This should ensure that the "undo"
functions only operate on blocks that were processed in the first
place.
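
The idea of undoing only what was actually processed can be sketched
with a toy model. Everything here is hypothetical (the toy_ names, the
NBLOCKS size and the reference-count blockmap are assumptions, not
fsck.gfs2 structures); it only illustrates why the undo should walk the
queued blocks and not the whole tree of the bad dinode.

```c
#include <stddef.h>

#define NBLOCKS 8	/* hypothetical toy filesystem size */

/* refs[b]: how many dinodes have been recorded as referencing block b.
 * 0 means free, 1 means in use, more than 1 means duplicate reference. */
struct toy_blockmap {
	int refs[NBLOCKS];
};

/* The blocks one dinode was actually processed for before the walk
 * stopped (the analogue of the queued metadata list). */
struct toy_queue {
	size_t blocks[NBLOCKS];
	size_t count;
};

/* Record a reference and remember it so it can be undone later.
 * Returns 0 on success, -1 if the block is out of range or the queue
 * is full. */
static int toy_mark(struct toy_blockmap *bm, struct toy_queue *q,
		    size_t block)
{
	if (block >= NBLOCKS || q->count >= NBLOCKS)
		return -1;
	bm->refs[block]++;
	q->blocks[q->count++] = block;
	return 0;
}

/* Undo only the references that were queued. Blocks the dinode lists
 * but never got to are left untouched. */
static void toy_undo(struct toy_blockmap *bm, struct toy_queue *q)
{
	size_t i;

	for (i = 0; i < q->count; i++)
		bm->refs[q->blocks[i]]--;
	q->count = 0;
}
```

In this model, if dinode X marks block 4, dinode Z marks block 5, and
bad dinode Y marks block 4 but stops before reaching block 5, then
undoing Y's queue returns block 4 to a single reference and leaves Z's
reference to block 5 alone.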
rhbz#902920
---
gfs2/fsck/metawalk.c | 109 ++++++++++++++++++++++----------
gfs2/fsck/metawalk.h | 4 ++
gfs2/fsck/pass1.c | 172 ++++++++++++++++-----------------------------------
3 files changed, 135 insertions(+), 150 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index d1b12f1..b9d9f89 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -1259,7 +1259,7 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
if (err < 0) {
stack;
error = err;
- goto fail;
+ return error;
}
if (err > 0) {
if (!error)
@@ -1278,14 +1278,11 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
}
if (!nbh)
nbh = bread(ip->i_sbd, block);
- osi_list_add(&nbh->b_altlist, cur_list);
+ osi_list_add_prev(&nbh->b_altlist, cur_list);
} /* for all data on the indirect block */
} /* for blocks at that height */
} /* for height */
- return error;
-fail:
- free_metalist(ip, mlp);
- return error;
+ return 0;
}
/**
@@ -1331,6 +1328,27 @@ static int check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
return error;
}
+static int undo_check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
+ uint64_t *ptr_start, char *ptr_end)
+{
+ int rc = 0;
+ uint64_t block, *ptr;
+
+ /* If there isn't much pointer corruption check the pointers */
+ for (ptr = ptr_start ; (char *)ptr < ptr_end && !fsck_abort; ptr++) {
+ if (!*ptr)
+ continue;
+
+ if (skip_this_pass || fsck_abort)
+ return 1;
+ block = be64_to_cpu(*ptr);
+ rc = pass->undo_check_data(ip, block, pass->private);
+ if (rc < 0)
+ return rc;
+ }
+ return 0;
+}
+
static int hdr_size(struct gfs2_buffer_head *bh, int height)
{
if (height > 1) {
@@ -1363,6 +1381,7 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
int i, head_size;
uint64_t blks_checked = 0;
int error, rc;
+ int metadata_clean = 0;
if (!height && !is_dir(&ip->i_di, ip->i_sbd->gfs1))
return 0;
@@ -1374,35 +1393,21 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
error = build_and_check_metalist(ip, &metalist[0], pass);
if (error) {
stack;
- free_metalist(ip, &metalist[0]);
- return error;
+ goto undo_metalist;
}
+ metadata_clean = 1;
/* For directories, we've already checked the "data" blocks which
* comprise the directory hash table, so we perform the directory
* checks and exit. */
if (is_dir(&ip->i_di, ip->i_sbd->gfs1)) {
- free_metalist(ip, &metalist[0]);
if (!(ip->i_di.di_flags & GFS2_DIF_EXHASH))
- return 0;
+ goto out;
/* check validity of leaf blocks and leaf chains */
error = check_leaf_blks(ip, pass);
- return error;
- }
-
- /* Free the metalist buffers from heights we don't need to check.
- For the rest we'll free as we check them to save time.
- metalist[0] will only have the dinode bh, so we can skip it. */
- for (i = 1; i < height - 1; i++) {
- list = &metalist[i];
- while (!osi_list_empty(list)) {
- bh = osi_list_entry(list->next,
- struct gfs2_buffer_head, b_altlist);
- if (bh == ip->i_bh)
- osi_list_del(&bh->b_altlist);
- else
- brelse(bh);
- }
+ if (error)
+ goto undo_metalist;
+ goto out;
}
/* check data blocks */
@@ -1435,14 +1440,12 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
else
rc = 0;
- if (rc && (!error || rc < 0))
+ if (rc && (!error || rc < 0)) {
error = rc;
+ break;
+ }
if (pass->big_file_msg && ip->i_di.di_blocks > COMFORTABLE_BLKS)
pass->big_file_msg(ip, blks_checked);
- if (bh == ip->i_bh)
- osi_list_del(&bh->b_altlist);
- else
- brelse(bh);
}
if (pass->big_file_msg && ip->i_di.di_blocks > COMFORTABLE_BLKS) {
log_notice( _("\rLarge file at %lld (0x%llx) - 100 percent "
@@ -1452,6 +1455,50 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
(unsigned long long)ip->i_di.di_num.no_addr);
fflush(stdout);
}
+undo_metalist:
+ if (!error)
+ goto out;
+ log_err( _("Error: inode %llu (0x%llx) had unrecoverable errors.\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr);
+ if (!query( _("Remove the invalid inode? (y/n) "))) {
+ free_metalist(ip, &metalist[0]);
+ log_err(_("Invalid inode not deleted.\n"));
+ return error;
+ }
+ for (i = 0; pass->undo_check_meta && i < height; i++) {
+ while (!osi_list_empty(&metalist[i])) {
+ list = &metalist[i];
+ bh = osi_list_entry(list->next,
+ struct gfs2_buffer_head,
+ b_altlist);
+ log_err(_("Undoing metadata work for block %llu "
+ "(0x%llx)\n"),
+ (unsigned long long)bh->b_blocknr,
+ (unsigned long long)bh->b_blocknr);
+ if (i)
+ rc = pass->undo_check_meta(ip, bh->b_blocknr,
+ i, pass->private);
+ else
+ rc = 0;
+ if (metadata_clean && rc == 0 && i == height - 1) {
+ head_size = hdr_size(bh, height);
+ if (head_size)
+ undo_check_data(ip, pass, (uint64_t *)
+ (bh->b_data + head_size),
+ (bh->b_data + ip->i_sbd->bsize));
+ }
+ if (bh == ip->i_bh)
+ osi_list_del(&bh->b_altlist);
+ else
+ brelse(bh);
+ }
+ }
+ /* Set the dinode as "bad" so it gets deleted */
+ fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
+ _("corrupt"), gfs2_block_free);
+ log_err(_("The corrupt inode was invalidated.\n"));
+out:
free_metalist(ip, &metalist[0]);
return error;
}
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index 486c6eb..f5e71e1 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -108,6 +108,10 @@ struct metawalk_fxns {
int (*repair_leaf) (struct gfs2_inode *ip, uint64_t *leaf_no,
int lindex, int ref_count, const char *msg,
void *private);
+ int (*undo_check_meta) (struct gfs2_inode *ip, uint64_t block,
+ int h, void *private);
+ int (*undo_check_data) (struct gfs2_inode *ip, uint64_t block,
+ void *private);
};
#endif /* _METAWALK_H */
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 04e5289..a88895f 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -39,8 +39,7 @@ static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private);
static int check_metalist(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h, void *private);
static int undo_check_metalist(struct gfs2_inode *ip, uint64_t block,
- struct gfs2_buffer_head **bh, int h,
- void *private);
+ int h, void *private);
static int check_data(struct gfs2_inode *ip, uint64_t block, void *private);
static int undo_check_data(struct gfs2_inode *ip, uint64_t block,
void *private);
@@ -104,12 +103,8 @@ struct metawalk_fxns pass1_fxns = {
.finish_eattr_indir = finish_eattr_indir,
.big_file_msg = big_file_comfort,
.repair_leaf = pass1_repair_leaf,
-};
-
-struct metawalk_fxns undo_fxns = {
- .private = NULL,
- .check_metalist = undo_check_metalist,
- .check_data = undo_check_data,
+ .undo_check_meta = undo_check_metalist,
+ .undo_check_data = undo_check_data,
};
struct metawalk_fxns invalidate_fxns = {
@@ -326,53 +321,67 @@ static int check_metalist(struct gfs2_inode *ip, uint64_t block,
return 0;
}
-static int undo_check_metalist(struct gfs2_inode *ip, uint64_t block,
- struct gfs2_buffer_head **bh, int h,
- void *private)
+/* undo_reference - undo previously processed data or metadata
+ * We've treated the metadata for this dinode as good so far, but now we
+ * realize it's bad. So we need to undo what we've done.
+ *
+ * Returns: 0 - We need to process the block as metadata. In other words,
+ * we need to undo any blocks it refers to.
+ * 1 - We can't process the block as metadata.
+ */
+
+static int undo_reference(struct gfs2_inode *ip, uint64_t block, int meta,
+ void *private)
{
- int found_dup = 0, iblk_type;
- struct gfs2_buffer_head *nbh;
struct block_count *bc = (struct block_count *)private;
-
- *bh = NULL;
+ struct duptree *dt;
+ struct inode_with_dups *id;
if (!valid_block(ip->i_sbd, block)) { /* blk outside of FS */
fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
_("bad block referencing"), gfs2_block_free);
return 1;
}
- if (is_dir(&ip->i_di, ip->i_sbd->gfs1) && h == ip->i_di.di_height)
- iblk_type = GFS2_METATYPE_JD;
- else
- iblk_type = GFS2_METATYPE_IN;
- found_dup = find_remove_dup(ip, block, _("Metadata"));
- nbh = bread(ip->i_sbd, block);
+ if (meta)
+ bc->indir_count--;
+ dt = dupfind(block);
+ if (dt) {
+ /* remove all duplicate reference structures from this inode */
+ do {
+ id = find_dup_ref_inode(dt, ip);
+ if (!id)
+ break;
- if (gfs2_check_meta(nbh, iblk_type)) {
- if (!found_dup) {
- fsck_blockmap_set(ip, block, _("bad indirect"),
- gfs2_block_free);
- brelse(nbh);
+ dup_listent_delete(id);
+ } while (id);
+
+ if (dt->refs) {
+ log_err(_("Block %llu (0x%llx) is still referenced "
+ "from another inode; not freeing.\n"),
+ (unsigned long long)block,
+ (unsigned long long)block);
return 1;
}
- brelse(nbh);
- nbh = NULL;
- } else /* blk check ok */
- *bh = nbh;
-
- bc->indir_count--;
- if (found_dup) {
- if (nbh)
- brelse(nbh);
- *bh = NULL;
- return 1; /* don't process the metadata again */
- } else
- fsck_blockmap_set(ip, block, _("bad indirect"),
- gfs2_block_free);
+ }
+ fsck_blockmap_set(ip, block,
+ meta ? _("bad indirect") : _("referenced data"),
+ gfs2_block_free);
return 0;
}
+static int undo_check_metalist(struct gfs2_inode *ip, uint64_t block,
+ int h, void *private)
+{
+ return undo_reference(ip, block, 1, private);
+}
+
+static int undo_check_data(struct gfs2_inode *ip, uint64_t block,
+ void *private)
+{
+ return undo_reference(ip, block, 0, private);
+}
+
static int check_data(struct gfs2_inode *ip, uint64_t block, void *private)
{
uint8_t q;
@@ -438,71 +447,9 @@ static int check_data(struct gfs2_inode *ip, uint64_t block, void *private)
return 0;
}
-static int undo_check_data(struct gfs2_inode *ip, uint64_t block,
- void *private)
-{
- struct block_count *bc = (struct block_count *) private;
-
- if (!valid_block(ip->i_sbd, block)) {
- /* Mark the owner of this block with the bad_block
- * designator so we know to check it for out of range
- * blocks later */
- fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
- _("bad (invalid or out of range) data"),
- gfs2_block_free);
- return 1;
- }
- bc->data_count--;
- return free_block_if_notdup(ip, block, _("data"));
-}
-
static int remove_inode_eattr(struct gfs2_inode *ip, struct block_count *bc)
{
- struct duptree *dt;
- struct inode_with_dups *id;
- osi_list_t *ref;
- int moved = 0;
-
- /* If it's a duplicate reference to the block, we need to check
- if the reference is on the valid or invalid inodes list.
- If it's on the valid inode's list, move it to the invalid
- inodes list. The reason is simple: This inode, although
- valid, has an now-invalid reference, so we should not give
- this reference preferential treatment over others. */
- dt = dupfind(ip->i_di.di_eattr);
- if (dt) {
- osi_list_foreach(ref, &dt->ref_inode_list) {
- id = osi_list_entry(ref, struct inode_with_dups, list);
- if (id->block_no == ip->i_di.di_num.no_addr) {
- log_debug( _("Moving inode %lld (0x%llx)'s "
- "duplicate reference to %lld "
- "(0x%llx) from the valid to the "
- "invalid reference list.\n"),
- (unsigned long long)
- ip->i_di.di_num.no_addr,
- (unsigned long long)
- ip->i_di.di_num.no_addr,
- (unsigned long long)
- ip->i_di.di_eattr,
- (unsigned long long)
- ip->i_di.di_eattr);
- /* Move from the normal to the invalid list */
- osi_list_del(&id->list);
- osi_list_add_prev(&id->list,
- &dt->ref_invinode_list);
- moved = 1;
- break;
- }
- }
- if (!moved)
- log_debug( _("Duplicate reference to %lld "
- "(0x%llx) not moved.\n"),
- (unsigned long long)ip->i_di.di_eattr,
- (unsigned long long)ip->i_di.di_eattr);
- } else {
- delete_block(ip, ip->i_di.di_eattr, NULL,
- "extended attribute", NULL);
- }
+ undo_reference(ip, ip->i_di.di_eattr, 0, bc);
ip->i_di.di_eattr = 0;
bc->ea_count = 0;
ip->i_di.di_blocks = 1 + bc->indir_count + bc->data_count;
@@ -1080,23 +1027,10 @@ static int handle_ip(struct gfs2_sbd *sdp, struct gfs2_inode *ip)
if (lf_dip && lf_dip->i_di.di_blocks != lf_blks)
reprocess_inode(lf_dip, "lost+found");
- if (fsck_abort || error < 0)
+ /* If there was an error, we return 0 because we want fsck to continue
+ and analyze the other dinodes as well. */
+ if (fsck_abort || error != 0)
return 0;
- if (error > 0) {
- log_err( _("Error: inode %llu (0x%llx) has unrecoverable "
- "errors; invalidating.\n"),
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr);
- undo_fxns.private = &bc;
- check_metatree(ip, &undo_fxns);
- /* If we undo the metadata accounting, including metadatas
- duplicate block status, we need to make sure later passes
- don't try to free up the metadata referenced by this inode.
- Therefore we mark the inode as free space. */
- fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
- _("corrupt"), gfs2_block_free);
- return 0;
- }
error = check_inode_eattr(ip, &pass1_fxns);
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 25/47] fsck.gfs2: Check for interrupt when resolving duplicates
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (22 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 26/47] fsck.gfs2: Consistent naming of struct duptree variables Bob Peterson
` (21 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch adds another check for interrupts while resolving duplicate
block references in pass1b.
rhbz#902920
---
gfs2/fsck/pass1b.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index 56b77f5..7108bb4 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -459,6 +459,9 @@ static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *b,
int found_good_ref = 0;
osi_list_foreach_safe(tmp, ref_list, x) {
+ if (skip_this_pass || fsck_abort)
+ return FSCK_OK;
+
id = osi_list_entry(tmp, struct inode_with_dups, list);
dh->b = b;
dh->id = id;
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 26/47] fsck.gfs2: Consistent naming of struct duptree variables
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (23 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 25/47] fsck.gfs2: Check for interrupt when resolving duplicates Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 27/47] fsck.gfs2: Keep proper counts when duplicates are found Bob Peterson
` (20 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
There were several places in the fsck.gfs2 code that referenced
variables of type struct duptree, but sometimes they were called
dt, d, b or even data. This patch achieves a level of consistency
and calls them all dt. This helps readability: when you see a
variable dt, you know it's a struct duptree.
---
gfs2/fsck/fsck.h | 2 +-
gfs2/fsck/metawalk.c | 24 +++++------
gfs2/fsck/pass1b.c | 113 ++++++++++++++++++++++++++-------------------------
gfs2/fsck/util.c | 42 +++++++++----------
4 files changed, 91 insertions(+), 90 deletions(-)
diff --git a/gfs2/fsck/fsck.h b/gfs2/fsck/fsck.h
index 5313bb3..b21a670 100644
--- a/gfs2/fsck/fsck.h
+++ b/gfs2/fsck/fsck.h
@@ -117,7 +117,7 @@ extern int fsck_query(const char *format, ...)
__attribute__((format(printf,1,2)));
extern struct dir_info *dirtree_find(uint64_t block);
extern void dup_listent_delete(struct inode_with_dups *id);
-extern void dup_delete(struct duptree *b);
+extern void dup_delete(struct duptree *dt);
extern void dirtree_delete(struct dir_info *b);
/* FIXME: Hack to get this going for pass2 - this should be pulled out
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index b9d9f89..d872ff3 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -179,14 +179,14 @@ struct duptree *dupfind(uint64_t block)
struct osi_node *node = dup_blocks.osi_node;
while (node) {
- struct duptree *data = (struct duptree *)node;
+ struct duptree *dt = (struct duptree *)node;
- if (block < data->block)
+ if (block < dt->block)
node = node->osi_left;
- else if (block > data->block)
+ else if (block > dt->block)
node = node->osi_right;
else
- return data;
+ return dt;
}
return NULL;
}
@@ -955,15 +955,15 @@ int delete_block(struct gfs2_inode *ip, uint64_t block,
*/
int find_remove_dup(struct gfs2_inode *ip, uint64_t block, const char *btype)
{
- struct duptree *d;
+ struct duptree *dt;
struct inode_with_dups *id;
- d = dupfind(block);
- if (!d)
+ dt = dupfind(block);
+ if (!dt)
return 0;
/* remove the inode reference id structure for this reference. */
- id = find_dup_ref_inode(d, ip);
+ id = find_dup_ref_inode(dt, ip);
if (!id)
return 0;
@@ -973,14 +973,14 @@ int find_remove_dup(struct gfs2_inode *ip, uint64_t block, const char *btype)
(unsigned long long)block, (unsigned long long)block,
btype, (unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
- d->refs--; /* one less reference */
- if (d->refs == 1) {
+ dt->refs--; /* one less reference */
+ if (dt->refs == 1) {
log_info( _("This leaves only one reference: it's "
"no longer a duplicate.\n"));
- dup_delete(d); /* not duplicate now */
+ dup_delete(dt); /* not duplicate now */
} else
log_info( _("%d block reference(s) remain.\n"),
- d->refs);
+ dt->refs);
return 1; /* but the original ref still exists so do not free it. */
}
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index 7108bb4..54c8649 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -22,7 +22,7 @@ struct fxn_info {
};
struct dup_handler {
- struct duptree *b;
+ struct duptree *dt;
struct inode_with_dups *id;
int ref_inode_count;
int ref_count;
@@ -179,21 +179,21 @@ static int find_dentry(struct gfs2_inode *ip, struct gfs2_dirent *de,
{
struct osi_node *n, *next = NULL;
osi_list_t *tmp2;
- struct duptree *b;
+ struct duptree *dt;
int found;
for (n = osi_first(&dup_blocks); n; n = next) {
next = osi_next(n);
- b = (struct duptree *)n;
+ dt = (struct duptree *)n;
found = 0;
- osi_list_foreach(tmp2, &b->ref_invinode_list) {
+ osi_list_foreach(tmp2, &dt->ref_invinode_list) {
if (check_dir_dup_ref(ip, de, tmp2, filename)) {
found = 1;
break;
}
}
if (!found) {
- osi_list_foreach(tmp2, &b->ref_inode_list) {
+ osi_list_foreach(tmp2, &dt->ref_inode_list) {
if (check_dir_dup_ref(ip, de, tmp2, filename))
break;
}
@@ -210,7 +210,7 @@ static int clear_dup_metalist(struct gfs2_inode *ip, uint64_t block,
void *private)
{
struct dup_handler *dh = (struct dup_handler *) private;
- struct duptree *d;
+ struct duptree *dt;
if (!valid_block(ip->i_sbd, block))
return 0;
@@ -225,14 +225,14 @@ static int clear_dup_metalist(struct gfs2_inode *ip, uint64_t block,
to delete it altogether. If the block is a duplicate referenced
block, we need to keep its type intact and let the caller sort
it out once we're down to a single reference. */
- d = dupfind(block);
- if (!d) {
+ dt = dupfind(block);
+ if (!dt) {
fsck_blockmap_set(ip, block, _("no longer valid"),
gfs2_block_free);
return 0;
}
/* This block, having failed the above test, is duplicated somewhere */
- if (block == dh->b->block) {
+ if (block == dh->dt->block) {
log_err( _("Not clearing duplicate reference in inode \"%s\" "
"at block #%llu (0x%llx) to block #%llu (0x%llx) "
"because it's valid for another inode.\n"),
@@ -400,7 +400,7 @@ static enum dup_ref_type get_ref_type(struct inode_with_dups *id)
return ref_types;
}
-static void log_inode_reference(struct duptree *b, osi_list_t *tmp, int inval)
+static void log_inode_reference(struct duptree *dt, osi_list_t *tmp, int inval)
{
char reftypestring[32];
struct inode_with_dups *id;
@@ -420,8 +420,8 @@ static void log_inode_reference(struct duptree *b, osi_list_t *tmp, int inval)
"block %llu (0x%llx) (%s)\n"), id->name,
(unsigned long long)id->block_no,
(unsigned long long)id->block_no, id->dup_count,
- (unsigned long long)b->block,
- (unsigned long long)b->block, reftypestring);
+ (unsigned long long)dt->block,
+ (unsigned long long)dt->block, reftypestring);
}
/*
* resolve_dup_references - resolve all but the last dinode that has a
@@ -436,7 +436,7 @@ static void log_inode_reference(struct duptree *b, osi_list_t *tmp, int inval)
* acceptable_ref - Delete dinodes that reference the given block as anything
* _but_ this type. Try to save references as this type.
*/
-static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *b,
+static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *dt,
osi_list_t *ref_list, struct dup_handler *dh,
int inval, int acceptable_ref)
{
@@ -463,7 +463,7 @@ static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *b,
return FSCK_OK;
id = osi_list_entry(tmp, struct inode_with_dups, list);
- dh->b = b;
+ dh->dt = dt;
dh->id = id;
if (dh->ref_inode_count == 1) /* down to the last reference */
@@ -494,8 +494,8 @@ static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *b,
id->name,
(unsigned long long)id->block_no,
(unsigned long long)id->block_no,
- (unsigned long long)b->block,
- (unsigned long long)b->block,
+ (unsigned long long)dt->block,
+ (unsigned long long)dt->block,
reftypes[this_ref]);
continue; /* don't delete the dinode */
}
@@ -513,8 +513,8 @@ static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *b,
"really %s.\n"),
id->name, (unsigned long long)id->block_no,
(unsigned long long)id->block_no,
- (unsigned long long)b->block,
- (unsigned long long)b->block,
+ (unsigned long long)dt->block,
+ (unsigned long long)dt->block,
reftypes[this_ref], reftypes[acceptable_ref]);
if (!(query( _("Okay to delete %s inode %lld (0x%llx)? "
"(y/n) "),
@@ -564,7 +564,7 @@ static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *b,
return 0;
}
-static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *b)
+static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *dt)
{
struct gfs2_inode *ip;
osi_list_t *tmp;
@@ -576,12 +576,12 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *b)
enum dup_ref_type acceptable_ref;
/* Count the duplicate references, both valid and invalid */
- osi_list_foreach(tmp, &b->ref_invinode_list) {
+ osi_list_foreach(tmp, &dt->ref_invinode_list) {
id = osi_list_entry(tmp, struct inode_with_dups, list);
dh.ref_inode_count++;
dh.ref_count += id->dup_count;
}
- osi_list_foreach(tmp, &b->ref_inode_list) {
+ osi_list_foreach(tmp, &dt->ref_inode_list) {
id = osi_list_entry(tmp, struct inode_with_dups, list);
dh.ref_inode_count++;
dh.ref_count += id->dup_count;
@@ -590,13 +590,14 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *b)
/* Log the duplicate references */
log_notice( _("Block %llu (0x%llx) has %d inodes referencing it"
" for a total of %d duplicate references:\n"),
- (unsigned long long)b->block, (unsigned long long)b->block,
- dh.ref_inode_count, dh.ref_count);
+ (unsigned long long)dt->block,
+ (unsigned long long)dt->block,
+ dh.ref_inode_count, dh.ref_count);
- osi_list_foreach(tmp, &b->ref_invinode_list)
- log_inode_reference(b, tmp, 1);
- osi_list_foreach(tmp, &b->ref_inode_list)
- log_inode_reference(b, tmp, 0);
+ osi_list_foreach(tmp, &dt->ref_invinode_list)
+ log_inode_reference(dt, tmp, 1);
+ osi_list_foreach(tmp, &dt->ref_inode_list)
+ log_inode_reference(dt, tmp, 0);
/* Figure out the block type to see if we can eliminate references
to a different type. In other words, if the duplicate block looks
@@ -605,7 +606,7 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *b)
references to it as metadata. Dinodes with such references are
clearly corrupt and need to be deleted.
And if we're left with a single reference, problem solved. */
- bh = bread(sdp, b->block);
+ bh = bread(sdp, dt->block);
cmagic = ((struct gfs2_meta_header *)(bh->b_data))->mh_magic;
ctype = ((struct gfs2_meta_header *)(bh->b_data))->mh_type;
brelse(bh);
@@ -650,10 +651,10 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *b)
"Step 1: Eliminate references to block %llu "
"(0x%llx) that were previously marked "
"invalid.\n"),
- (unsigned long long)b->block,
- (unsigned long long)b->block);
- last_reference = resolve_dup_references(sdp, b,
- &b->ref_invinode_list,
+ (unsigned long long)dt->block,
+ (unsigned long long)dt->block);
+ last_reference = resolve_dup_references(sdp, dt,
+ &dt->ref_invinode_list,
&dh, 1, ref_types);
}
/* Step 2 - eliminate reference from inodes that reference it as the
@@ -665,10 +666,10 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *b)
log_debug( _("----------------------------------------------\n"
"Step 2: Eliminate references to block %llu "
"(0x%llx) that need the wrong block type.\n"),
- (unsigned long long)b->block,
- (unsigned long long)b->block);
- last_reference = resolve_dup_references(sdp, b,
- &b->ref_inode_list,
+ (unsigned long long)dt->block,
+ (unsigned long long)dt->block);
+ last_reference = resolve_dup_references(sdp, dt,
+ &dt->ref_inode_list,
&dh, 0,
acceptable_ref);
}
@@ -680,20 +681,20 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *b)
log_debug( _("----------------------------------------------\n"
"Step 3: Choose one reference to block %llu "
"(0x%llx) to keep.\n"),
- (unsigned long long)b->block,
- (unsigned long long)b->block);
- last_reference = resolve_dup_references(sdp, b,
- &b->ref_inode_list,
+ (unsigned long long)dt->block,
+ (unsigned long long)dt->block);
+ last_reference = resolve_dup_references(sdp, dt,
+ &dt->ref_inode_list,
&dh, 0, ref_types);
}
/* Now fix the block type of the block in question. */
- if (osi_list_empty(&b->ref_inode_list)) {
+ if (osi_list_empty(&dt->ref_inode_list)) {
log_notice( _("Block %llu (0x%llx) has no more references; "
"Marking as 'free'.\n"),
- (unsigned long long)b->block,
- (unsigned long long)b->block);
- gfs2_blockmap_set(bl, b->block, gfs2_block_free);
- check_n_fix_bitmap(sdp, b->block, gfs2_block_free);
+ (unsigned long long)dt->block,
+ (unsigned long long)dt->block);
+ gfs2_blockmap_set(bl, dt->block, gfs2_block_free);
+ check_n_fix_bitmap(sdp, dt->block, gfs2_block_free);
return 0;
}
if (last_reference) {
@@ -701,14 +702,14 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *b)
log_notice( _("Block %llu (0x%llx) has only one remaining "
"reference.\n"),
- (unsigned long long)b->block,
- (unsigned long long)b->block);
+ (unsigned long long)dt->block,
+ (unsigned long long)dt->block);
/* If we're down to a single reference (and not all references
deleted, which may be the case of an inode that has only
itself and a reference), we need to reset the block type
from invalid to data or metadata. Start at the first one
in the list, not the structure's place holder. */
- tmp = (&b->ref_inode_list)->next;
+ tmp = (&dt->ref_inode_list)->next;
id = osi_list_entry(tmp, struct inode_with_dups, list);
log_debug( _("----------------------------------------------\n"
"Step 4. Set block type based on the remaining "
@@ -724,27 +725,27 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *b)
"the block as free.\n"),
(unsigned long long)id->block_no,
(unsigned long long)id->block_no);
- fsck_blockmap_set(ip, b->block,
+ fsck_blockmap_set(ip, dt->block,
_("reference-repaired leaf"),
gfs2_block_free);
} else if (id->reftypecount[ref_is_inode]) {
set_ip_blockmap(ip, 0); /* 0=do not add to dirtree */
} else if (id->reftypecount[ref_as_data]) {
- fsck_blockmap_set(ip, b->block,
+ fsck_blockmap_set(ip, dt->block,
_("reference-repaired data"),
gfs2_block_used);
} else if (id->reftypecount[ref_as_meta]) {
if (is_dir(&ip->i_di, sdp->gfs1))
- fsck_blockmap_set(ip, b->block,
+ fsck_blockmap_set(ip, dt->block,
_("reference-repaired leaf"),
gfs2_leaf_blk);
else
- fsck_blockmap_set(ip, b->block,
+ fsck_blockmap_set(ip, dt->block,
_("reference-repaired "
"indirect"),
gfs2_indir_blk);
} else
- fsck_blockmap_set(ip, b->block,
+ fsck_blockmap_set(ip, dt->block,
_("reference-repaired extended "
"attribute"),
gfs2_meta_eattr);
@@ -761,7 +762,7 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *b)
* use in pass2 */
int pass1b(struct gfs2_sbd *sdp)
{
- struct duptree *b;
+ struct duptree *dt;
uint64_t i;
uint8_t q;
struct osi_node *n, *next = NULL;
@@ -817,9 +818,9 @@ int pass1b(struct gfs2_sbd *sdp)
out:
for (n = osi_first(&dup_blocks); n; n = next) {
next = osi_next(n);
- b = (struct duptree *)n;
+ dt = (struct duptree *)n;
if (!skip_this_pass && !rc) /* no error & not asked to skip the rest */
- handle_dup_blk(sdp, b);
+ handle_dup_blk(sdp, dt);
/* Do not attempt to free the dup_blocks list or its parts
here because any func that calls check_metatree needs
to check duplicate status based on this linked list.
diff --git a/gfs2/fsck/util.c b/gfs2/fsck/util.c
index 5be260c..c11768f 100644
--- a/gfs2/fsck/util.c
+++ b/gfs2/fsck/util.c
@@ -234,7 +234,7 @@ int fsck_query(const char *format, ...)
static struct duptree *gfs2_dup_set(uint64_t dblock, int create)
{
struct osi_node **newn = &dup_blocks.osi_node, *parent = NULL;
- struct duptree *data;
+ struct duptree *dt;
/* Figure out where to put new node */
while (*newn) {
@@ -251,24 +251,24 @@ static struct duptree *gfs2_dup_set(uint64_t dblock, int create)
if (!create)
return NULL;
- data = malloc(sizeof(struct duptree));
- if (data == NULL) {
+ dt = malloc(sizeof(struct duptree));
+ if (dt == NULL) {
log_crit( _("Unable to allocate duptree structure\n"));
return NULL;
}
dups_found++;
- memset(data, 0, sizeof(struct duptree));
+ memset(dt, 0, sizeof(struct duptree));
/* Add new node and rebalance tree. */
- data->block = dblock;
- data->refs = 1; /* reference 1 is actually the reference we need to
- discover in pass1b. */
- data->first_ref_found = 0;
- osi_list_init(&data->ref_inode_list);
- osi_list_init(&data->ref_invinode_list);
- osi_link_node(&data->node, parent, newn);
- osi_insert_color(&data->node, &dup_blocks);
-
- return data;
+ dt->block = dblock;
+ dt->refs = 1; /* reference 1 is actually the reference we need to
+ discover in pass1b. */
+ dt->first_ref_found = 0;
+ osi_list_init(&dt->ref_inode_list);
+ osi_list_init(&dt->ref_invinode_list);
+ osi_link_node(&dt->node, parent, newn);
+ osi_insert_color(&dt->node, &dup_blocks);
+
+ return dt;
}
/**
@@ -453,23 +453,23 @@ void dup_listent_delete(struct inode_with_dups *id)
free(id);
}
-void dup_delete(struct duptree *b)
+void dup_delete(struct duptree *dt)
{
struct inode_with_dups *id;
osi_list_t *tmp;
- while (!osi_list_empty(&b->ref_invinode_list)) {
- tmp = (&b->ref_invinode_list)->next;
+ while (!osi_list_empty(&dt->ref_invinode_list)) {
+ tmp = (&dt->ref_invinode_list)->next;
id = osi_list_entry(tmp, struct inode_with_dups, list);
dup_listent_delete(id);
}
- while (!osi_list_empty(&b->ref_inode_list)) {
- tmp = (&b->ref_inode_list)->next;
+ while (!osi_list_empty(&dt->ref_inode_list)) {
+ tmp = (&dt->ref_inode_list)->next;
id = osi_list_entry(tmp, struct inode_with_dups, list);
dup_listent_delete(id);
}
- osi_erase(&b->node, &dup_blocks);
- free(b);
+ osi_erase(&dt->node, &dup_blocks);
+ free(dt);
}
void dirtree_delete(struct dir_info *b)
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 27/47] fsck.gfs2: Keep proper counts when duplicates are found
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
When fsck.gfs2 discovered a duplicate reference to a block, it did not
reliably increment the block counters for data and metadata. As a
result, once the duplicate was resolved, the remaining dinode was
likely to have the wrong block count. This patch increments the
counters regardless of whether the block is a duplicate reference.
rhbz#902920
---
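Note (illustration only, not part of the patch): a minimal sketch of
the counting rule described above. It assumes a toy blockmap array; the
names toy_state, toy_block_count and toy_check_data are hypothetical
stand-ins, not the fsck.gfs2 structures.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins; hypothetical names, not the fsck.gfs2 types. */
enum toy_state { TOY_FREE = 0, TOY_USED = 1 };

struct toy_block_count {
	uint64_t data_count;
};

/* Returns 1 if the block was already claimed (a duplicate), else 0.
   The counter is bumped before the duplicate test, so it matches the
   number of pointers seen in the dinode either way. */
int toy_check_data(enum toy_state *blockmap, uint64_t nblocks,
		   uint64_t block, struct toy_block_count *bc)
{
	assert(block < nblocks);
	bc->data_count++;
	if (blockmap[block] != TOY_FREE)
		return 1; /* duplicate: left for a later pass to resolve */
	blockmap[block] = TOY_USED;
	return 0;
}
```

Referencing the same block twice yields a count of 2 here, which is
the count the dinode should carry if it ends up keeping both pointers.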
gfs2/fsck/pass1.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index a88895f..27bb5d4 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -218,6 +218,7 @@ static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
/* Note if we've gotten this far, the block has already passed the
check in metawalk: gfs2_check_meta(lbh, GFS2_METATYPE_LF).
So we know it's a leaf block. */
+ bc->indir_count++;
q = block_type(block);
if (q != gfs2_block_free) {
log_err( _("Found duplicate block #%llu (0x%llx) referenced "
@@ -235,7 +236,6 @@ static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
return -EEXIST;
}
fsck_blockmap_set(ip, block, _("directory leaf"), gfs2_leaf_blk);
- bc->indir_count++;
return 0;
}
@@ -401,6 +401,7 @@ static int check_data(struct gfs2_inode *ip, uint64_t block, void *private)
gfs2_bad_block);
return 1;
}
+ bc->data_count++; /* keep the count sane anyway */
q = block_type(block);
if (q != gfs2_block_free) {
log_err( _("Found duplicate %s block %llu (0x%llx) "
@@ -415,16 +416,11 @@ static int check_data(struct gfs2_inode *ip, uint64_t block, void *private)
"sort it out in pass1b.\n"));
add_duplicate_ref(ip, block, ref_as_data, 0,
INODE_VALID);
- /* If the prev ref was as data, this is likely a data
- block, so keep the block count for both refs. */
- if (q == gfs2_block_used)
- bc->data_count++;
return 1;
}
log_info( _("The block was invalid as metadata but might be "
"okay as data. I'll sort it out in pass1b.\n"));
add_duplicate_ref(ip, block, ref_as_data, 0, INODE_VALID);
- bc->data_count++;
return 1;
}
/* In gfs1, rgrp indirect blocks are marked in the bitmap as "meta".
@@ -443,7 +439,6 @@ static int check_data(struct gfs2_inode *ip, uint64_t block, void *private)
fsck_blockmap_set(ip, block, _("jdata"), gfs2_jdata);
} else
fsck_blockmap_set(ip, block, _("data"), gfs2_block_used);
- bc->data_count++;
return 0;
}
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 28/47] fsck.gfs2: print metadata block reference on data errors
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, fsck.gfs2 would report data block errors without
saying which metadata block referenced the bad data block, which made
problems very difficult to trace back. This patch prints the metadata
block that referenced the bad data so the problem can be traced more
easily.
rhbz#902920
---
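Note (illustration only, not part of the patch): a rough sketch of the
message layout, assuming a hypothetical helper toy_format_bad_ptr that
writes into a caller-supplied buffer instead of calling log_err. The
metadata block is only named when it differs from the dinode address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds the error text in buf.  When the bad
   pointer sits in the dinode itself, naming the metadata block would
   only repeat the dinode address, so the suffix is left off. */
void toy_format_bad_ptr(char *buf, size_t len, uint64_t dinode,
			uint64_t metablock, uint64_t block)
{
	size_t used;

	assert(len > 0);
	snprintf(buf, len,
		 "inode %llu (0x%llx) has a bad data block pointer "
		 "%llu (0x%llx)",
		 (unsigned long long)dinode, (unsigned long long)dinode,
		 (unsigned long long)block, (unsigned long long)block);
	if (metablock == dinode)
		return;
	used = strlen(buf);
	snprintf(buf + used, len - used,
		 " from metadata block %llu (0x%llx)",
		 (unsigned long long)metablock,
		 (unsigned long long)metablock);
}
```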
gfs2/fsck/metawalk.c | 17 ++++++++++-------
gfs2/fsck/metawalk.h | 7 ++++---
gfs2/fsck/pass1.c | 37 ++++++++++++++++++++++++++-----------
gfs2/fsck/pass1b.c | 9 ++++++---
4 files changed, 46 insertions(+), 24 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index d872ff3..e1a685a 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -1299,11 +1299,14 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
* 2 (ENOENT) is there were too many bad pointers
*/
static int check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
- uint64_t *ptr_start, char *ptr_end,
+ struct gfs2_buffer_head *bh, int head_size,
uint64_t *blks_checked)
{
int error = 0, rc = 0;
uint64_t block, *ptr;
+ uint64_t *ptr_start = (uint64_t *)(bh->b_data + head_size);
+ char *ptr_end = (bh->b_data + ip->i_sbd->bsize);
+ uint64_t metablock = bh->b_blocknr;
/* If there isn't much pointer corruption check the pointers */
for (ptr = ptr_start ; (char *)ptr < ptr_end && !fsck_abort; ptr++) {
@@ -1318,7 +1321,7 @@ static int check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
would defeat the rangecheck_block related functions in
pass1. Therefore the individual check_data functions
should do a range check. */
- rc = pass->check_data(ip, block, pass->private);
+ rc = pass->check_data(ip, metablock, block, pass->private);
if (rc < 0)
return rc;
if (!error && rc)
@@ -1433,9 +1436,7 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
}
if (pass->check_data)
- rc = check_data(ip, pass, (uint64_t *)
- (bh->b_data + head_size),
- (bh->b_data + ip->i_sbd->bsize),
+ rc = check_data(ip, pass, bh, head_size,
&blks_checked);
else
rc = 0;
@@ -1609,7 +1610,8 @@ int delete_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
return delete_block_if_notdup(ip, block, NULL, _("leaf"), private);
}
-int delete_data(struct gfs2_inode *ip, uint64_t block, void *private)
+int delete_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private)
{
return delete_block_if_notdup(ip, block, NULL, _("data"), private);
}
@@ -1670,7 +1672,8 @@ static int alloc_metalist(struct gfs2_inode *ip, uint64_t block,
return 0;
}
-static int alloc_data(struct gfs2_inode *ip, uint64_t block, void *private)
+static int alloc_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private)
{
uint8_t q;
const char *desc = (const char *)private;
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index f5e71e1..05b0e7a 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -23,7 +23,8 @@ extern int delete_block(struct gfs2_inode *ip, uint64_t block,
extern int delete_metadata(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h, void *private);
extern int delete_leaf(struct gfs2_inode *ip, uint64_t block, void *private);
-extern int delete_data(struct gfs2_inode *ip, uint64_t block, void *private);
+extern int delete_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private);
extern int delete_eattr_indir(struct gfs2_inode *ip, uint64_t block, uint64_t parent,
struct gfs2_buffer_head **bh, void *private);
extern int delete_eattr_leaf(struct gfs2_inode *ip, uint64_t block, uint64_t parent,
@@ -76,8 +77,8 @@ struct metawalk_fxns {
int (*check_metalist) (struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h,
void *private);
- int (*check_data) (struct gfs2_inode *ip, uint64_t block,
- void *private);
+ int (*check_data) (struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private);
int (*check_eattr_indir) (struct gfs2_inode *ip, uint64_t block,
uint64_t parent,
struct gfs2_buffer_head **bh, void *private);
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 27bb5d4..3a47184 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -40,7 +40,8 @@ static int check_metalist(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h, void *private);
static int undo_check_metalist(struct gfs2_inode *ip, uint64_t block,
int h, void *private);
-static int check_data(struct gfs2_inode *ip, uint64_t block, void *private);
+static int check_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private);
static int undo_check_data(struct gfs2_inode *ip, uint64_t block,
void *private);
static int check_eattr_indir(struct gfs2_inode *ip, uint64_t indirect,
@@ -66,8 +67,8 @@ static int invalidate_metadata(struct gfs2_inode *ip, uint64_t block,
void *private);
static int invalidate_leaf(struct gfs2_inode *ip, uint64_t block,
void *private);
-static int invalidate_data(struct gfs2_inode *ip, uint64_t block,
- void *private);
+static int invalidate_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private);
static int invalidate_eattr_indir(struct gfs2_inode *ip, uint64_t block,
uint64_t parent,
struct gfs2_buffer_head **bh,
@@ -382,17 +383,24 @@ static int undo_check_data(struct gfs2_inode *ip, uint64_t block,
return undo_reference(ip, block, 0, private);
}
-static int check_data(struct gfs2_inode *ip, uint64_t block, void *private)
+static int check_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private)
{
uint8_t q;
struct block_count *bc = (struct block_count *) private;
if (!valid_block(ip->i_sbd, block)) {
log_err( _("inode %lld (0x%llx) has a bad data block pointer "
- "%lld (invalid or out of range)\n"),
+ "%lld (0x%llx) (invalid or out of range) "),
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)block);
+ (unsigned long long)block, (unsigned long long)block);
+ if (metablock == ip->i_di.di_num.no_addr)
+ log_err("\n");
+ else
+ log_err(_("from metadata block %llu (0x%llx)\n"),
+ (unsigned long long)metablock,
+ (unsigned long long)metablock);
/* Mark the owner of this block with the bad_block
* designator so we know to check it for out of range
* blocks later */
@@ -405,12 +413,19 @@ static int check_data(struct gfs2_inode *ip, uint64_t block, void *private)
q = block_type(block);
if (q != gfs2_block_free) {
log_err( _("Found duplicate %s block %llu (0x%llx) "
- "referenced as data by dinode %llu (0x%llx)\n"),
+ "referenced as data by dinode %llu (0x%llx) "),
block_type_string(q),
(unsigned long long)block,
(unsigned long long)block,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
+ if (metablock == ip->i_di.di_num.no_addr)
+ log_err("\n");
+ else
+ log_err(_("from metadata block %llu (0x%llx)\n"),
+ (unsigned long long)metablock,
+ (unsigned long long)metablock);
+
if (q != gfs2_meta_inval) {
log_info( _("Seems to be a normal duplicate; I'll "
"sort it out in pass1b.\n"));
@@ -837,8 +852,8 @@ static int invalidate_leaf(struct gfs2_inode *ip, uint64_t block,
return mark_block_invalid(ip, block, ref_as_meta, _("leaf"));
}
-static int invalidate_data(struct gfs2_inode *ip, uint64_t block,
- void *private)
+static int invalidate_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private)
{
return mark_block_invalid(ip, block, ref_as_data, _("data"));
}
@@ -926,8 +941,8 @@ static int rangecheck_leaf(struct gfs2_inode *ip, uint64_t block,
return rangecheck_block(ip, block, NULL, btype_leaf, private);
}
-static int rangecheck_data(struct gfs2_inode *ip, uint64_t block,
- void *private)
+static int rangecheck_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private)
{
return rangecheck_block(ip, block, NULL, btype_data, private);
}
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index 54c8649..6114ba3 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -31,7 +31,8 @@ struct dup_handler {
static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private);
static int check_metalist(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h, void *private);
-static int check_data(struct gfs2_inode *ip, uint64_t block, void *private);
+static int check_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private);
static int check_eattr_indir(struct gfs2_inode *ip, uint64_t block,
uint64_t parent, struct gfs2_buffer_head **bh,
void *private);
@@ -88,7 +89,8 @@ static int check_metalist(struct gfs2_inode *ip, uint64_t block,
return add_duplicate_ref(ip, block, ref_as_meta, 1, INODE_VALID);
}
-static int check_data(struct gfs2_inode *ip, uint64_t block, void *private)
+static int check_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private)
{
return add_duplicate_ref(ip, block, ref_as_data, 1, INODE_VALID);
}
@@ -255,7 +257,8 @@ static int clear_dup_metalist(struct gfs2_inode *ip, uint64_t block,
return 1;
}
-static int clear_dup_data(struct gfs2_inode *ip, uint64_t block, void *private)
+static int clear_dup_data(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private)
{
return clear_dup_metalist(ip, block, NULL, 0, private);
}
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 29/47] fsck.gfs2: print block count values when fixing them
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, fsck.gfs2 fixed block counts but did not log the
new value, which made block count problems very difficult to track
down. This patch changes the logging so that it prints the new block
count along with a breakdown of how many blocks were counted for
metadata, data, and extended attributes.
rhbz#902920
---
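Note (illustration only, not part of the patch): a small sketch of the
arithmetic behind the new message, assuming a toy_block_count struct
with the same three counters. The helper names are hypothetical, and
the sketch uses %llu to match its unsigned long long casts.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mirror of the three per-dinode counters. */
struct toy_block_count {
	uint64_t indir_count;
	uint64_t data_count;
	uint64_t ea_count;
};

/* The leading 1 accounts for the dinode block itself. */
uint64_t toy_total_blocks(const struct toy_block_count *bc)
{
	return 1 + bc->indir_count + bc->data_count + bc->ea_count;
}

/* Formats the breakdown; returns what snprintf returns. */
int toy_format_fix(char *buf, size_t len, const struct toy_block_count *bc)
{
	return snprintf(buf, len, "Block count fixed: 1+%llu+%llu+%llu = %llu.",
			(unsigned long long)bc->indir_count,
			(unsigned long long)bc->data_count,
			(unsigned long long)bc->ea_count,
			(unsigned long long)toy_total_blocks(bc));
}
```

With 2 indirect, 10 data and 1 extended attribute block, the total is
14 and the text reads "Block count fixed: 1+2+10+1 = 14.".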
gfs2/fsck/pass1.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 3a47184..964e60b 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -613,7 +613,11 @@ static int finish_eattr_indir(struct gfs2_inode *ip, int leaf_pointers,
ip->i_di.di_blocks = 1 + bc->indir_count +
bc->data_count + bc->ea_count;
bmodified(ip->i_bh);
- log_err( _("Block count fixed.\n"));
+ log_err(_("Block count fixed: 1+%lld+%lld+%lld = %lld.\n"),
+ (unsigned long long)bc->indir_count,
+ (unsigned long long)bc->data_count,
+ (unsigned long long)bc->ea_count,
+ (unsigned long long)ip->i_di.di_blocks);
return 1;
}
log_err( _("Block count not fixed.\n"));
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 30/47] fsck.gfs2: Do not invalidate metablocks of dinodes with invalid mode
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, when fsck.gfs2 encountered a dinode with an invalid
mode, it would take steps to invalidate its metadata. That's wrong
because if the mode is invalid, we can't know how to interpret the
dinode. It's especially wrong if its metadata references the same
blocks that other valid dinodes reference, because then we could end
up deleting blocks belonging to valid files and directories. With this
patch, fsck.gfs2 leaves the metadata alone and only marks the dinode
block itself as free.
rhbz#902920
---
gfs2/fsck/pass1.c | 25 ++++++++-----------------
1 file changed, 8 insertions(+), 17 deletions(-)
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 964e60b..0f3adfe 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -1005,23 +1005,14 @@ static int handle_ip(struct gfs2_sbd *sdp, struct gfs2_inode *ip)
error = set_ip_blockmap(ip, 1);
if (error == -EINVAL) {
- /* We found a dinode that has an invalid mode, so we can't
- tell if it's a data file, directory or a socket.
- Regardless, we have to invalidate its metadata in case there
- are duplicate blocks referenced. If we don't call
- check_metatree, the blocks it references will be deleted
- wholesale by pass2, and if any of those blocks are
- duplicates--referenced by another dinode for some reason--
- we will mark it free, even though it's in use. In other
- words, we would introduce file system corruption. So we
- need to keep track of the fact that it's invalid and
- skip parts that we can't be sure of based on dinode type. */
- log_debug("Invalid mode dinode found at block %lld (0x%llx): "
- "Invalidating all its metadata.\n",
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr);
- check_metatree(ip, &invalidate_fxns);
- check_inode_eattr(ip, &invalidate_fxns);
+ /* We found a dinode that has an invalid mode. At this point
+ set_ip_blockmap returned an error, which means it never
+ got inserted into the inode tree. Since we haven't even
+ processed its metadata with pass1_fxns, none of its
+ metadata will be flagged as metadata or data blocks yet.
+ Therefore, we don't need to invalidate anything. */
+ fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
+ _("invalid mode"), gfs2_block_free);
return 0;
} else if (error)
goto bad_dinode;
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 31/47] fsck.gfs2: Log when unrecoverable data block errors are encountered
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch adds a log message whenever unrecoverable data block errors
are found. Without it, the output doesn't say why fsck.gfs2 stopped
processing data or which block had the problem.
rhbz#902920
---
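Note (illustration only, not part of the patch): a simplified sketch
of the control flow, assuming a hypothetical callback type and a sample
callback toy_check. The first nonzero result is remembered and logged
once, and a negative result ends the walk immediately.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef int (*toy_check_fn)(uint64_t block);

/* Sample callback (hypothetical): block 0 is fatal, odd blocks are
   soft errors, everything else is fine. */
int toy_check(uint64_t block)
{
	if (block == 0)
		return -1;
	return (block & 1) ? 1 : 0;
}

/* Walks n block pointers.  Only the first nonzero rc is recorded and
   logged; a negative rc is returned right away. */
int toy_walk(const uint64_t *ptrs, size_t n, toy_check_fn check,
	     uint64_t *blks_checked)
{
	int error = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int rc = check(ptrs[i]);

		if (!error && rc) {
			error = rc;
			fprintf(stderr, "data block error %d on block "
				"%llu (0x%llx)\n", rc,
				(unsigned long long)ptrs[i],
				(unsigned long long)ptrs[i]);
		}
		if (rc < 0)
			return rc;
		(*blks_checked)++;
	}
	return error;
}
```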
gfs2/fsck/metawalk.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index e1a685a..4e18a7b 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -1322,10 +1322,15 @@ static int check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
pass1. Therefore the individual check_data functions
should do a range check. */
rc = pass->check_data(ip, metablock, block, pass->private);
+ if (!error && rc) {
+ error = rc;
+ log_info(_("\nUnrecoverable data block error %d on "
+ "block %llu (0x%llx).\n"), rc,
+ (unsigned long long)block,
+ (unsigned long long)block);
+ }
if (rc < 0)
return rc;
- if (!error && rc)
- error = rc;
(*blks_checked)++;
}
return error;
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 32/47] fsck.gfs2: don't remove buffers from the list when errors are found
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, if an error was encountered while marking the
data blocks, the blocks would be removed from the linked list.
Now that we've got "undo" functions, we need to be able to undo
the designations of those blocks, which means we need to keep those
buffers on the linked list so they're found later. If we don't,
the undo data block function won't process them, and therefore they'll
be marked as "data" blocks in the bitmap, but no files will reference
the blocks (because the error causes the inode to be deleted).
With this patch, the metadata that points to the faulty data is kept
on the linked list, and after the error is found, the undo function
will therefore find it and mark its blocks as "free".
rhbz#902920
---
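Note (illustration only, not part of the patch): a toy model of why
the buffers must stay on the list. It assumes a hand-rolled circular
list and hypothetical toy_buf nodes rather than osi_list_t and
gfs2_buffer_head. The check pass stops at the first bad node without
unlinking anything, so the undo pass can still reach every node that
was marked.

```c
#include <stddef.h>

struct toy_list {
	struct toy_list *next, *prev;
};

struct toy_buf {
	struct toy_list link; /* first member, so a link pointer casts back */
	int bad;              /* simulated error on this buffer */
	int marked;           /* set by the check pass, cleared by undo */
};

void toy_list_init(struct toy_list *head)
{
	head->next = head->prev = head;
}

void toy_list_add_tail(struct toy_list *head, struct toy_list *node)
{
	node->next = head;
	node->prev = head->prev;
	head->prev->next = node;
	head->prev = node;
}

/* Non-destructive walk: stops on error but leaves every node linked. */
int toy_check_all(struct toy_list *head)
{
	struct toy_list *tmp;

	for (tmp = head->next; tmp != head; tmp = tmp->next) {
		struct toy_buf *b = (struct toy_buf *)tmp;

		if (b->bad)
			return -1;
		b->marked = 1;
	}
	return 0;
}

/* Undo pass: finds the marked nodes because they are still listed. */
int toy_undo_all(struct toy_list *head)
{
	struct toy_list *tmp;
	int undone = 0;

	for (tmp = head->next; tmp != head; tmp = tmp->next) {
		struct toy_buf *b = (struct toy_buf *)tmp;

		if (b->marked) {
			b->marked = 0;
			undone++;
		}
	}
	return undone;
}
```

Had the check pass popped each node off the list as it went, the undo
pass would have found an empty list and left the marks in place.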
gfs2/fsck/metawalk.c | 17 +++++------------
1 file changed, 5 insertions(+), 12 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 4e18a7b..3aa1398 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -1383,7 +1383,7 @@ static int hdr_size(struct gfs2_buffer_head *bh, int height)
int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
{
osi_list_t metalist[GFS2_MAX_META_HEIGHT];
- osi_list_t *list;
+ osi_list_t *list, *tmp;
struct gfs2_buffer_head *bh;
uint32_t height = ip->i_di.di_height;
int i, head_size;
@@ -1423,23 +1423,16 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
if (ip->i_di.di_blocks > COMFORTABLE_BLKS)
last_reported_fblock = -10000000;
- while (error >= 0 && !osi_list_empty(list)) {
+ for (tmp = list->next; error >= 0 && tmp != list; tmp = tmp->next) {
if (fsck_abort) {
free_metalist(ip, &metalist[0]);
return 0;
}
- bh = osi_list_entry(list->next, struct gfs2_buffer_head,
- b_altlist);
-
+ bh = osi_list_entry(tmp, struct gfs2_buffer_head, b_altlist);
head_size = hdr_size(bh, height);
- if (!head_size) {
- if (bh == ip->i_bh)
- osi_list_del(&bh->b_altlist);
- else
- brelse(bh);
+ if (!head_size)
continue;
- }
-
+
if (pass->check_data)
rc = check_data(ip, pass, bh, head_size,
&blks_checked);
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 33/47] fsck.gfs2: Don't flag GFS1 non-dinode blocks as duplicates
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, fsck.gfs2 could run into problems when processing
a GFS1 file system. The issue goes back to the fact that all GFS1
metadata is marked as "Meta" in the bitmap, whereas that bitmap
designation is reserved for dinodes in GFS2. For example, take a
GFS1 file of height 2, which looks like this:
Block
------
0x1234 dinode
0x1235 |----> indirect meta
0x1236        |----> data at offset 0 of the file
Before this patch, fsck.gfs2 would:
1. Encounter the dinode at 0x1234 and mark it as "dinode" in the
blockmap.
2. Process its metadata, see block 0x1235, mark it as "indirect meta"
in the blockmap.
3. Process the metadata's data, see block 0x1236, mark it as "data".
4. When it's done with the dinode, it moves on to the next dinode.
But since GFS1 doesn't distinguish dinodes from other metadata,
the next block in the bitmap that has that designation is block
0x1235.
5. Since block 0x1235 was previously marked "indirect meta", pass1
gets confused and thinks the block is a duplicate reference that
is invalid as a dinode. This non-problem is treated as a problem,
and fsck.gfs2 makes bad decisions based on it, deleting what it
perceives to be corruption.
This patch adds special checks for this problem and assumes the block
is just normal GFS1 non-dinode metadata.
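A minimal sketch of the decision described above, under stated
assumptions: sketch_classify and the sketch_action enum are hypothetical
names used only for illustration and are not part of fsck.gfs2. The real
code reads the magic number from the buffer with be32_to_cpu() and
consults block_type(); here those results are passed in as plain
arguments. The magic value is what I believe gfs2_ondisk.h defines for
GFS2_MAGIC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match GFS2_MAGIC in gfs2_ondisk.h. */
#define SKETCH_GFS2_MAGIC 0x01161970u

/* Hypothetical outcome names; fsck.gfs2 has no such enum. */
enum sketch_action {
	SKETCH_PROCESS,        /* block not yet in the blockmap: handle normally */
	SKETCH_SKIP_GFS1_META, /* GFS1 non-dinode metadata already processed */
	SKETCH_DUPLICATE,      /* a genuine duplicate reference */
};

/*
 * already_marked: the blockmap entry for the block is not "free"
 * is_gfs1:        the file system being checked is GFS1
 * is_inode:       the block's metadata header says it is a dinode
 * magic_cpu:      the header magic, already converted to CPU byte order
 */
static enum sketch_action sketch_classify(bool already_marked, bool is_gfs1,
					  bool is_inode, uint32_t magic_cpu)
{
	if (!already_marked)
		return SKETCH_PROCESS;
	if (magic_cpu == SKETCH_GFS2_MAGIC && is_gfs1 && !is_inode)
		return SKETCH_SKIP_GFS1_META;
	return SKETCH_DUPLICATE;
}
```

In the example file above, block 0x1235 would take the
SKETCH_SKIP_GFS1_META path on a GFS1 file system, while the same
situation on GFS2 would still be reported as a duplicate.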
---
gfs2/fsck/pass1.c | 69 ++++++++++++++++++++++++++++++++++++-------------------
1 file changed, 45 insertions(+), 24 deletions(-)
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 0f3adfe..004ca78 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -1084,22 +1084,11 @@ bad_dinode:
*/
static int handle_di(struct gfs2_sbd *sdp, struct gfs2_buffer_head *bh)
{
- uint8_t q;
int error = 0;
uint64_t block = bh->b_blocknr;
struct gfs2_inode *ip;
ip = fsck_inode_get(sdp, bh);
- q = block_type(block);
- if (q != gfs2_block_free) {
- log_err( _("Found a duplicate inode block at #%llu"
- " (0x%llx) previously marked as a %s\n"),
- (unsigned long long)block,
- (unsigned long long)block, block_type_string(q));
- add_duplicate_ref(ip, block, ref_as_meta, 0, INODE_VALID);
- fsck_inode_put(&ip);
- return 0;
- }
if (ip->i_di.di_num.no_addr != block) {
log_err( _("Inode #%llu (0x%llx): Bad inode address found: %llu "
@@ -1359,8 +1348,13 @@ static int pass1_process_bitmap(struct gfs2_sbd *sdp, struct rgrp_tree *rgd, uin
struct gfs2_buffer_head *bh;
unsigned i;
uint64_t block;
+ struct gfs2_inode *ip;
+ uint8_t q;
for (i = 0; i < n; i++) {
+ int is_inode;
+ uint32_t check_magic;
+
block = ibuf[i];
/* skip gfs1 rindex indirect blocks */
@@ -1389,9 +1383,47 @@ static int pass1_process_bitmap(struct gfs2_sbd *sdp, struct rgrp_tree *rgd, uin
(unsigned long long)block);
continue;
}
+
bh = bread(sdp, block);
- if (gfs2_check_meta(bh, GFS2_METATYPE_DI)) {
+ is_inode = 0;
+ if (gfs2_check_meta(bh, GFS2_METATYPE_DI) == 0)
+ is_inode = 1;
+
+ check_magic = ((struct gfs2_meta_header *)
+ (bh->b_data))->mh_magic;
+
+ q = block_type(block);
+ if (q != gfs2_block_free) {
+ if (be32_to_cpu(check_magic) == GFS2_MAGIC &&
+ sdp->gfs1 && !is_inode) {
+ log_debug("Block 0x%llx assumed to be "
+ "previously processed GFS1 "
+ "non-dinode metadata.\n",
+ (unsigned long long)block);
+ brelse(bh);
+ continue;
+ }
+ log_err( _("Found a duplicate inode block at #%llu "
+ "(0x%llx) previously marked as a %s\n"),
+ (unsigned long long)block,
+ (unsigned long long)block,
+ block_type_string(q));
+ ip = fsck_inode_get(sdp, bh);
+ if (is_inode && ip->i_di.di_num.no_addr == block)
+ add_duplicate_ref(ip, block, ref_is_inode, 0,
+ INODE_VALID);
+ else
+ log_info(_("dinum.no_addr is wrong, so I "
+ "assume the bitmap is just "
+ "wrong.\n"));
+ fsck_inode_put(&ip);
+ brelse(bh);
+ continue;
+ }
+
+ if (!is_inode) {
+ if (be32_to_cpu(check_magic) == GFS2_MAGIC) {
/* In gfs2, a bitmap mark of 2 means an inode,
but in gfs1 it means any metadata. So if
this is gfs1 and not an inode, it may be
@@ -1399,12 +1431,7 @@ static int pass1_process_bitmap(struct gfs2_sbd *sdp, struct rgrp_tree *rgd, uin
be referenced by an inode, so we need to
skip it here and it will be sorted out
when the referencing inode is checked. */
- if (sdp->gfs1) {
- uint32_t check_magic;
-
- check_magic = ((struct gfs2_meta_header *)
- (bh->b_data))->mh_magic;
- if (be32_to_cpu(check_magic) == GFS2_MAGIC) {
+ if (sdp->gfs1) {
log_debug( _("Deferring GFS1 "
"metadata block #"
"%" PRIu64" (0x%"
@@ -1418,12 +1445,6 @@ static int pass1_process_bitmap(struct gfs2_sbd *sdp, struct rgrp_tree *rgd, uin
"%llu (0x%llx)\n"),
(unsigned long long)block,
(unsigned long long)block);
- if (gfs2_blockmap_set(bl, block, gfs2_block_free)) {
- stack;
- brelse(bh);
- gfs2_special_free(&gfs1_rindex_blks);
- return FSCK_ERROR;
- }
check_n_fix_bitmap(sdp, block, gfs2_block_free);
} else if (handle_di(sdp, bh) < 0) {
stack;
--
1.7.11.7
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 34/47] fsck.gfs2: externalize check_leaf
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (31 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 33/47] fsck.gfs2: Don't flag GFS1 non-dinode blocks as duplicates Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 35/47] fsck.gfs2: pass2: check leaf blocks when fixing hash table Bob Peterson
` (12 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch makes metawalk function check_leaf external so that it
may be called in a future patch for fixing hash tables.
rhbz#902920
---
gfs2/fsck/metawalk.c | 5 ++---
gfs2/fsck/metawalk.h | 3 +++
gfs2/fsck/pass1.c | 6 +++---
gfs2/fsck/pass1b.c | 6 +++---
4 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 3aa1398..772b210 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -483,9 +483,8 @@ static int check_entries(struct gfs2_inode *ip, struct gfs2_buffer_head *bh,
* Reads in the leaf block
* Leaves the buffer around for further analysis (caller must brelse)
*/
-static int check_leaf(struct gfs2_inode *ip, int lindex,
- struct metawalk_fxns *pass,
- uint64_t *leaf_no, struct gfs2_leaf *leaf, int *ref_count)
+int check_leaf(struct gfs2_inode *ip, int lindex, struct metawalk_fxns *pass,
+ uint64_t *leaf_no, struct gfs2_leaf *leaf, int *ref_count)
{
int error = 0, fix;
struct gfs2_buffer_head *lbh = NULL;
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index 05b0e7a..2ba0d72 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -15,6 +15,9 @@ extern int check_dir(struct gfs2_sbd *sdp, uint64_t block,
struct metawalk_fxns *pass);
extern int check_linear_dir(struct gfs2_inode *ip, struct gfs2_buffer_head *bh,
struct metawalk_fxns *pass);
+extern int check_leaf(struct gfs2_inode *ip, int lindex,
+ struct metawalk_fxns *pass, uint64_t *leaf_no,
+ struct gfs2_leaf *leaf, int *ref_count);
extern int remove_dentry_from_dir(struct gfs2_sbd *sdp, uint64_t dir,
uint64_t dentryblock);
extern int delete_block(struct gfs2_inode *ip, uint64_t block,
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 004ca78..a6fe9a7 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -35,7 +35,7 @@ struct block_count {
uint64_t ea_count;
};
-static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private);
+static int p1check_leaf(struct gfs2_inode *ip, uint64_t block, void *private);
static int check_metalist(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h, void *private);
static int undo_check_metalist(struct gfs2_inode *ip, uint64_t block,
@@ -93,7 +93,7 @@ static int pass1_repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no,
struct metawalk_fxns pass1_fxns = {
.private = NULL,
- .check_leaf = check_leaf,
+ .check_leaf = p1check_leaf,
.check_metalist = check_metalist,
.check_data = check_data,
.check_eattr_indir = check_eattr_indir,
@@ -211,7 +211,7 @@ struct metawalk_fxns sysdir_fxns = {
.check_dentry = resuscitate_dentry,
};
-static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
+static int p1check_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
{
struct block_count *bc = (struct block_count *) private;
uint8_t q;
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index 6114ba3..b2532fd 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -28,7 +28,7 @@ struct dup_handler {
int ref_count;
};
-static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private);
+static int check_leaf_refs(struct gfs2_inode *ip, uint64_t block, void *private);
static int check_metalist(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h, void *private);
static int check_data(struct gfs2_inode *ip, uint64_t metablock,
@@ -56,7 +56,7 @@ static int find_dentry(struct gfs2_inode *ip, struct gfs2_dirent *de,
struct metawalk_fxns find_refs = {
.private = NULL,
- .check_leaf = check_leaf,
+ .check_leaf = check_leaf_refs,
.check_metalist = check_metalist,
.check_data = check_data,
.check_eattr_indir = check_eattr_indir,
@@ -78,7 +78,7 @@ struct metawalk_fxns find_dirents = {
.check_eattr_extentry = NULL,
};
-static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
+static int check_leaf_refs(struct gfs2_inode *ip, uint64_t block, void *private)
{
return add_duplicate_ref(ip, block, ref_as_meta, 1, INODE_VALID);
}
--
1.7.11.7
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 35/47] fsck.gfs2: pass2: check leaf blocks when fixing hash table
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (32 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 34/47] fsck.gfs2: externalize check_leaf Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 36/47] fsck.gfs2: standardize check_metatree return codes Bob Peterson
` (11 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, pass2 would attempt to fix the hash table without
first checking the basic integrity of the leaf blocks it was working on.
A misplaced leaf might have its entries relocated as a matter of course.
But if that leaf block had a problem, it could cause all kinds of
errors, including segfaults. This patch gives the hash table repair
function the ability to do basic integrity checks on the leaf block,
and perform repairs if necessary.
rhbz#902920
---
gfs2/fsck/pass2.c | 100 ++++++++++++++++++++++++++++++++++++++++++------------
1 file changed, 79 insertions(+), 21 deletions(-)
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index 1e7f884..e38841e 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -1050,6 +1050,66 @@ static int lost_leaf(struct gfs2_inode *ip, uint64_t *tbl, uint64_t leafno,
return 1;
}
+static int basic_check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
+ struct gfs2_dirent *prev_de,
+ struct gfs2_buffer_head *bh, char *filename,
+ uint32_t *count, int lindex, void *priv)
+{
+ uint8_t q = 0;
+ char tmp_name[MAX_FILENAME];
+ struct gfs2_inum entry;
+ struct dir_status *ds = (struct dir_status *) priv;
+ struct gfs2_dirent dentry, *de;
+ int error;
+
+ memset(&dentry, 0, sizeof(struct gfs2_dirent));
+ gfs2_dirent_in(&dentry, (char *)dent);
+ de = &dentry;
+
+ entry.no_addr = de->de_inum.no_addr;
+ entry.no_formal_ino = de->de_inum.no_formal_ino;
+
+ /* Start of checks */
+ memset(tmp_name, 0, MAX_FILENAME);
+ if (de->de_name_len < MAX_FILENAME)
+ strncpy(tmp_name, filename, de->de_name_len);
+ else
+ strncpy(tmp_name, filename, MAX_FILENAME - 1);
+
+ error = basic_dentry_checks(ip, dent, &entry, tmp_name, count, de,
+ ds, &q, bh);
+ if (error) {
+ dirent2_del(ip, bh, prev_de, dent);
+ log_err( _("Bad directory entry '%s' cleared.\n"), tmp_name);
+ return 1;
+ } else {
+ return 0;
+ }
+}
+
+static int pass2_repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no,
+ int lindex, int ref_count, const char *msg,
+ void *private)
+{
+ return repair_leaf(ip, leaf_no, lindex, ref_count, msg);
+}
+
+/* The purpose of leafck_fxns is to provide a means for function fix_hashtable
+ * to do basic sanity checks on leaf blocks before manipulating them, for
+ * example, splitting them. If they're corrupt, splitting them or trying to
+ * move their contents can cause a segfault. We can't really use the standard
+ * pass2_fxns because that will do things we don't want. For example, it will
+ * find '.' and '..' and increment the directory link count, which would be
+ * done a second time when the dirent is really checked in pass2_fxns.
+ * We don't want it to do the "wrong leaf" thing, or set_parent_dir either.
+ * We just want a basic sanity check on pointers and lengths.
+ */
+struct metawalk_fxns leafck_fxns = {
+ .check_leaf_depth = check_leaf_depth,
+ .check_dentry = basic_check_dentry,
+ .repair_leaf = pass2_repair_leaf,
+};
+
/* fix_hashtable - fix a corrupt hash table
*
* The main intent of this function is to sort out hash table problems.
@@ -1079,10 +1139,11 @@ static int fix_hashtable(struct gfs2_inode *ip, uint64_t *tbl, unsigned hsize,
int len, int *proper_len, int factor)
{
struct gfs2_buffer_head *lbh;
- struct gfs2_leaf *leaf;
+ struct gfs2_leaf leaf;
struct gfs2_dirent dentry, *de;
int changes = 0, error, i, extras, hash_index;
uint64_t new_leaf_blk;
+ uint64_t leaf_no;
uint32_t leaf_proper_start;
*proper_len = len;
@@ -1096,14 +1157,20 @@ static int fix_hashtable(struct gfs2_inode *ip, uint64_t *tbl, unsigned hsize,
return 0;
}
+ memset(&leaf, 0, sizeof(leaf));
+ leaf_no = leafblk;
+ error = check_leaf(ip, lindex, &leafck_fxns, &leaf_no, &leaf, &len);
+ if (error) {
+ log_debug("Leaf repaired while fixing the hash table.\n");
+ error = 0;
+ }
lbh = bread(ip->i_sbd, leafblk);
- leaf = (struct gfs2_leaf *)lbh->b_data;
/* If the leaf's depth is out of range for this dinode, it's obviously
attached to the wrong dinode. Move the dirents to lost+found. */
- if (be16_to_cpu(leaf->lf_depth) > ip->i_di.di_depth) {
+ if (leaf.lf_depth > ip->i_di.di_depth) {
log_err(_("This leaf block's depth (%d) is too big for this "
"dinode's depth (%d)\n"),
- be16_to_cpu(leaf->lf_depth), ip->i_di.di_depth);
+ leaf.lf_depth, ip->i_di.di_depth);
error = lost_leaf(ip, tbl, leafblk, len, lindex, lbh);
brelse(lbh);
return error;
@@ -1129,7 +1196,7 @@ static int fix_hashtable(struct gfs2_inode *ip, uint64_t *tbl, unsigned hsize,
}
/* Calculate the proper number of pointers based on the leaf depth. */
- *proper_len = 1 << (ip->i_di.di_depth - be16_to_cpu(leaf->lf_depth));
+ *proper_len = 1 << (ip->i_di.di_depth - leaf.lf_depth);
/* Look at the first dirent and check its hash value to see if it's
at the proper starting offset. */
@@ -1162,7 +1229,7 @@ static int fix_hashtable(struct gfs2_inode *ip, uint64_t *tbl, unsigned hsize,
already at its maximum depth. */
if ((leaf_proper_start < proper_start) ||
((*proper_len > len || lindex > leaf_proper_start) &&
- be16_to_cpu(leaf->lf_depth) == ip->i_di.di_depth)) {
+ leaf.lf_depth == ip->i_di.di_depth)) {
log_err(_("Leaf block should start at 0x%x, but it appears at "
"0x%x in the hash table.\n"), leaf_proper_start,
proper_start);
@@ -1177,24 +1244,22 @@ static int fix_hashtable(struct gfs2_inode *ip, uint64_t *tbl, unsigned hsize,
later than they should, we can split the leaf to give it a smaller
footprint in the hash table. */
if ((*proper_len > len || lindex > leaf_proper_start) &&
- ip->i_di.di_depth > be16_to_cpu(leaf->lf_depth)) {
+ ip->i_di.di_depth > leaf.lf_depth) {
log_err(_("For depth %d, length %d, the proper start is: "
"0x%x.\n"), factor, len, proper_start);
changes++;
new_leaf_blk = find_free_blk(ip->i_sbd);
dir_split_leaf(ip, lindex, leafblk, lbh);
/* re-read the leaf to pick up dir_split_leaf's changes */
- gfs2_leaf_in(leaf, lbh);
- *proper_len = 1 << (ip->i_di.di_depth -
- be16_to_cpu(leaf->lf_depth));
+ gfs2_leaf_in(&leaf, lbh);
+ *proper_len = 1 << (ip->i_di.di_depth - leaf.lf_depth);
log_err(_("Leaf block %llu (0x%llx) was split from length "
"%d to %d\n"), (unsigned long long)leafblk,
(unsigned long long)leafblk, len, *proper_len);
if (*proper_len < 0) {
log_err(_("Programming error: proper_len=%d, "
"di_depth = %d, lf_depth = %d.\n"),
- *proper_len, ip->i_di.di_depth,
- be16_to_cpu(leaf->lf_depth));
+ *proper_len, ip->i_di.di_depth, leaf.lf_depth);
exit(FSCK_ERROR);
}
log_err(_("New split-off leaf block was allocated at %lld "
@@ -1219,8 +1284,8 @@ static int fix_hashtable(struct gfs2_inode *ip, uint64_t *tbl, unsigned hsize,
if (*proper_len < len) {
log_err(_("There are %d pointers, but leaf 0x%llx's "
"depth, %d, only allows %d\n"),
- len, (unsigned long long)leafblk,
- be16_to_cpu(leaf->lf_depth), *proper_len);
+ len, (unsigned long long)leafblk, leaf.lf_depth,
+ *proper_len);
}
brelse(lbh);
/* At this point, lindex should be at the proper end of the pointers.
@@ -1422,13 +1487,6 @@ static int check_hash_tbl(struct gfs2_inode *ip, uint64_t *tbl,
return error;
}
-static int pass2_repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no,
- int lindex, int ref_count, const char *msg,
- void *private)
-{
- return repair_leaf(ip, leaf_no, lindex, ref_count, msg);
-}
-
struct metawalk_fxns pass2_fxns = {
.private = NULL,
.check_leaf_depth = check_leaf_depth,
--
1.7.11.7
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 36/47] fsck.gfs2: standardize check_metatree return codes
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (33 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 35/47] fsck.gfs2: pass2: check leaf blocks when fixing hash table Bob Peterson
@ 2013-05-14 16:21 ` Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 37/47] fsck.gfs2: don't invalidate files with duplicate data block refs Bob Peterson
` (10 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:21 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch is not intended to change functionality at all. It adds
a standard set of three return codes with the following meanings:
meta_is_good - all is well, keep processing metadata normally
meta_skip_further - a non-fatal error occurred, so further metadata
processing for this inode should be skipped.
meta_error - a fatal error occurred in this metadata, so we need to
abort processing.
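A rough sketch of how a caller might combine these codes, modeled
loosely on the build_and_check_metalist hunk in this patch. The enum
values are the ones this patch adds to metawalk.h; the helper
sketch_fold_results is hypothetical and exists only to illustrate the
convention.

```c
#include <stddef.h>

/* Values as added to gfs2/fsck/metawalk.h by this patch. */
enum meta_check_rc {
	meta_error = -1,
	meta_is_good = 0,
	meta_skip_further = 1,
};

/*
 * Hypothetical helper: fold a series of per-block results into one.
 * A fatal error aborts immediately; the first non-fatal error is
 * remembered, but the remaining blocks are still examined.
 */
static int sketch_fold_results(const int *rcs, size_t n)
{
	int error = meta_is_good;

	for (size_t i = 0; i < n; i++) {
		if (rcs[i] == meta_error)
			return meta_error;
		if (rcs[i] == meta_skip_further && error == meta_is_good)
			error = meta_skip_further;
	}
	return error;
}
```

Comparing against the named constants rather than testing "< 0" and
"> 0" is what lets stray values such as ENOENT and -ENOENT be replaced
without changing behavior.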
rhbz#902920
---
gfs2/fsck/metawalk.c | 14 +++++++-------
gfs2/fsck/metawalk.h | 6 ++++++
gfs2/fsck/pass1.c | 28 ++++++++++++++--------------
gfs2/fsck/pass1b.c | 6 +++---
gfs2/fsck/util.c | 12 ++++++------
5 files changed, 36 insertions(+), 30 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 772b210..d285ee5 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -996,9 +996,9 @@ int free_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
{
if (!find_remove_dup(ip, block, btype)) { /* not a dup */
fsck_blockmap_set(ip, block, btype, gfs2_block_free);
- return 1;
+ return meta_skip_further;
}
- return 0;
+ return meta_is_good;
}
/**
@@ -1015,7 +1015,7 @@ static int delete_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
uint8_t q;
if (!valid_block(ip->i_sbd, block))
- return -EFAULT;
+ return meta_error;
q = block_type(block);
if (q == gfs2_block_free) {
@@ -1025,7 +1025,7 @@ static int delete_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)block,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
- return 0;
+ return meta_is_good;
}
return free_block_if_notdup(ip, block, btype);
}
@@ -1255,12 +1255,12 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
pass->private);
/* check_metalist should hold any buffers
it gets with "bread". */
- if (err < 0) {
+ if (err == meta_error) {
stack;
error = err;
return error;
}
- if (err > 0) {
+ if (err == meta_skip_further) {
if (!error)
error = err;
log_debug( _("Skipping block %llu (0x%llx)\n"),
@@ -1666,7 +1666,7 @@ static int alloc_metalist(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)block);
gfs2_blockmap_set(bl, block, gfs2_indir_blk);
}
- return 0;
+ return meta_is_good;
}
static int alloc_data(struct gfs2_inode *ip, uint64_t metablock,
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index 2ba0d72..49217cc 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -56,6 +56,12 @@ extern int free_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
#define fsck_blockmap_set(ip, b, bt, m) _fsck_blockmap_set(ip, b, bt, m, \
__FUNCTION__, __LINE__)
+enum meta_check_rc {
+ meta_error = -1,
+ meta_is_good = 0,
+ meta_skip_further = 1,
+};
+
/* metawalk_fxns: function pointers to check various parts of the fs
*
* The functions should return -1 on fatal errors, 1 if the block
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index a6fe9a7..3c4dc89 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -139,14 +139,14 @@ static int resuscitate_metalist(struct gfs2_inode *ip, uint64_t block,
"range) found in system inode %lld (0x%llx).\n"),
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
- return 1;
+ return meta_skip_further;
}
if (fsck_system_inode(ip->i_sbd, block))
fsck_blockmap_set(ip, block, _("system file"), gfs2_indir_blk);
else
check_n_fix_bitmap(ip->i_sbd, block, gfs2_indir_blk);
bc->indir_count++;
- return 0;
+ return meta_is_good;
}
/*
@@ -263,7 +263,7 @@ static int check_metalist(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
- return 1;
+ return meta_skip_further;
}
if (is_dir(&ip->i_di, ip->i_sbd->gfs1) && h == ip->i_di.di_height) {
iblk_type = GFS2_METATYPE_JD;
@@ -300,7 +300,7 @@ static int check_metalist(struct gfs2_inode *ip, uint64_t block,
gfs2_meta_inval);
brelse(nbh);
nbh = NULL;
- return 1;
+ return meta_skip_further;
}
brelse(nbh);
nbh = NULL;
@@ -314,12 +314,12 @@ static int check_metalist(struct gfs2_inode *ip, uint64_t block,
nbh = NULL;
*bh = NULL;
}
- return 1; /* don't process the metadata again */
+ return meta_skip_further; /* don't process the metadata again */
} else
fsck_blockmap_set(ip, block, _("indirect"),
gfs2_indir_blk);
- return 0;
+ return meta_is_good;
}
/* undo_reference - undo previously processed data or metadata
@@ -825,7 +825,7 @@ static int mark_block_invalid(struct gfs2_inode *ip, uint64_t block,
* and as a result, they'll be freed when this dinode is deleted,
* despite being used by another dinode as a valid block. */
if (!valid_block(ip->i_sbd, block))
- return 0;
+ return meta_is_good;
q = block_type(block);
if (q != gfs2_block_free) {
@@ -837,10 +837,10 @@ static int mark_block_invalid(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)block,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
- return 0;
+ return meta_is_good;
}
fsck_blockmap_set(ip, block, btype, gfs2_meta_inval);
- return 0;
+ return meta_is_good;
}
static int invalidate_metadata(struct gfs2_inode *ip, uint64_t block,
@@ -910,9 +910,9 @@ static int rangecheck_block(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
if ((*bad_pointers) <= BAD_POINTER_TOLERANCE)
- return ENOENT;
+ return meta_skip_further;
else
- return -ENOENT; /* Exits check_metatree quicker */
+ return meta_error; /* Exits check_metatree quicker */
}
/* See how many duplicate blocks it has */
q = block_type(block);
@@ -925,11 +925,11 @@ static int rangecheck_block(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
if ((*bad_pointers) <= BAD_POINTER_TOLERANCE)
- return ENOENT;
+ return meta_skip_further;
else
- return -ENOENT; /* Exits check_metatree quicker */
+ return meta_error; /* Exits check_metatree quicker */
}
- return 0;
+ return meta_is_good;
}
static int rangecheck_metadata(struct gfs2_inode *ip, uint64_t block,
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index b2532fd..b5da200 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -215,7 +215,7 @@ static int clear_dup_metalist(struct gfs2_inode *ip, uint64_t block,
struct duptree *dt;
if (!valid_block(ip->i_sbd, block))
- return 0;
+ return meta_is_good;
/* This gets tricky. We're traversing a metadata tree trying to
delete an inode based on it having a duplicate block reference
@@ -231,7 +231,7 @@ static int clear_dup_metalist(struct gfs2_inode *ip, uint64_t block,
if (!dt) {
fsck_blockmap_set(ip, block, _("no longer valid"),
gfs2_block_free);
- return 0;
+ return meta_is_good;
}
/* This block, having failed the above test, is duplicated somewhere */
if (block == dh->dt->block) {
@@ -254,7 +254,7 @@ static int clear_dup_metalist(struct gfs2_inode *ip, uint64_t block,
be mistakenly freed as "no longer valid" (in this function above)
even though it's valid metadata for a different inode. Returning
1 ensures that the metadata isn't processed again. */
- return 1;
+ return meta_skip_further;
}
static int clear_dup_data(struct gfs2_inode *ip, uint64_t metablock,
diff --git a/gfs2/fsck/util.c b/gfs2/fsck/util.c
index c11768f..078d5f6 100644
--- a/gfs2/fsck/util.c
+++ b/gfs2/fsck/util.c
@@ -316,19 +316,19 @@ int add_duplicate_ref(struct gfs2_inode *ip, uint64_t block,
struct duptree *dt;
if (!valid_block(ip->i_sbd, block))
- return 0;
+ return meta_is_good;
/* If this is not the first reference (i.e. all calls from pass1) we
need to create the duplicate reference. If this is pass1b, we want
to ignore references that aren't found. */
dt = gfs2_dup_set(block, !first);
if (!dt) /* If this isn't a duplicate */
- return 0;
+ return meta_is_good;
/* If we found the duplicate reference but we've already discovered
the first reference (in pass1b) and the other references in pass1,
we don't need to count it, so just return. */
if (dt->first_ref_found)
- return 0;
+ return meta_is_good;
/* The first time this is called from pass1 is actually the second
reference. When we go back in pass1b looking for the original
@@ -350,12 +350,12 @@ int add_duplicate_ref(struct gfs2_inode *ip, uint64_t block,
if (!(id = malloc(sizeof(*id)))) {
log_crit( _("Unable to allocate "
"inode_with_dups structure\n"));
- return -1;
+ return meta_error;
}
if (!(memset(id, 0, sizeof(*id)))) {
log_crit( _("Unable to zero inode_with_dups "
"structure\n"));
- return -1;
+ return meta_error;
}
id->block_no = ip->i_di.di_num.no_addr;
q = block_type(ip->i_di.di_num.no_addr);
@@ -389,7 +389,7 @@ int add_duplicate_ref(struct gfs2_inode *ip, uint64_t block,
else
log_info( _("This brings the total to: %d duplicate "
"references\n"), dt->refs);
- return 0;
+ return meta_is_good;
}
struct dir_info *dirtree_insert(struct gfs2_inum inum)
--
1.7.11.7
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 37/47] fsck.gfs2: don't invalidate files with duplicate data block refs
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (34 preceding siblings ...)
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 36/47] fsck.gfs2: standardize check_metatree return codes Bob Peterson
@ 2013-05-14 16:22 ` Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 38/47] fsck.gfs2: check for duplicate first references Bob Peterson
` (9 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, whenever pass1 encountered a duplicated data block
pointer, it would mark the file as invalid. But if the block was
duplicated because of a different bad inode, the inode with the
valid data block reference was still punished and deleted.
This patch adds an additional check to see if the previous reference
to the data block was as a _valid_ metadata block. If the previous
reference was as metadata, and the metadata checked out okay, then
it can't possibly be a data block for the second reference. In that
case, we know for a fact that the second reference is invalid. But
if the previous reference was also as data, the inode might be okay
and duplicate resolving in pass1b might sort it out and leave this
inode as the only valid reference. In that case, we should treat the
inode as valid, not invalid. So this patch basically treats duplicate
data block references as "innocent until proven guilty" rather than
just the opposite.
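The rule above can be sketched as follows. This is an illustration
only: the sketch_prev_ref enum and sketch_dup_data_verdict are
hypothetical names, and the real check_data compares the blockmap value
q against gfs2_indir_blk, gfs2_jdata and gfs2_meta_inval. In every case
the real code also records the duplicate reference for pass1b; only the
return value differs.

```c
/* Hypothetical summary of how the block was referenced earlier. */
enum sketch_prev_ref {
	SKETCH_PREV_VALID_META,   /* checked out okay as metadata */
	SKETCH_PREV_INVALID_META, /* was marked invalid as metadata */
	SKETCH_PREV_DATA,         /* was referenced as data */
};

/*
 * Returns 1 when this data reference is known to be wrong, so the
 * referencing inode is treated as bad. Returns 0 when the inode is
 * given the benefit of the doubt and pass1b sorts out the duplicate.
 */
static int sketch_dup_data_verdict(enum sketch_prev_ref prev)
{
	if (prev == SKETCH_PREV_VALID_META)
		return 1; /* valid metadata can't also be data */
	return 0;         /* innocent until proven guilty */
}
```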
rhbz#902920
---
gfs2/fsck/pass1.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 3c4dc89..df10089 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -425,18 +425,32 @@ static int check_data(struct gfs2_inode *ip, uint64_t metablock,
log_err(_("from metadata block %llu (0x%llx)\n"),
(unsigned long long)metablock,
(unsigned long long)metablock);
-
+
+ if (q >= gfs2_indir_blk && q <= gfs2_jdata) {
+ log_info(_("The block was processed earlier as valid "
+ "metadata, so it can't possibly be "
+ "data.\n"));
+ /* We still need to add a duplicate record here because
+ when check_metatree tries to delete the inode, we
+ can't have the "undo" functions freeing the block
+ out from under the original referencing inode. */
+ add_duplicate_ref(ip, block, ref_as_data, 0,
+ INODE_VALID);
+ return 1;
+ }
if (q != gfs2_meta_inval) {
log_info( _("Seems to be a normal duplicate; I'll "
"sort it out in pass1b.\n"));
add_duplicate_ref(ip, block, ref_as_data, 0,
INODE_VALID);
- return 1;
+ /* This inode references the block as data. So if this
+ all is validated, we want to keep this count. */
+ return 0;
}
log_info( _("The block was invalid as metadata but might be "
"okay as data. I'll sort it out in pass1b.\n"));
add_duplicate_ref(ip, block, ref_as_data, 0, INODE_VALID);
- return 1;
+ return 0;
}
/* In gfs1, rgrp indirect blocks are marked in the bitmap as "meta".
In gfs2, "meta" is only for dinodes. So here we dummy up the
--
1.7.11.7
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 38/47] fsck.gfs2: check for duplicate first references
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (35 preceding siblings ...)
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 37/47] fsck.gfs2: don't invalidate files with duplicate data block refs Bob Peterson
@ 2013-05-14 16:22 ` Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 39/47] fsck.gfs2: When flagging a duplicate reference, show valid or invalid Bob Peterson
` (8 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, fsck.gfs2 could get into situations where it's
in pass1b searching for the first reference to a block that it knows
has been referenced twice. However, for one reason or another, the
first reference has been deleted. It may seem unlikely because pass1
tries to "undo" its references when it deletes a bad dinode. But
it can still happen, for example, when pass1b decides to delete a
dinode because of a _different_ duplicate reference within the same
dinode. If the first reference was deleted prior to searching for the
original reference, pass1b won't find the original reference. So
prior to this patch, it would just keep on looking, until it found
the second reference. In other words, it would mistake the second
reference for the first reference. Then it would get confused and
treat the reference as a duplicate of itself. Later, it would choose
which reference to delete, and delete its dinode. But since they're
the same reference, it could delete a dinode with a perfectly good
reference (the first invalid reference having already been deleted).
The solution that this patch implements is to check if the first
reference we found is actually the second reference, and if so,
treat it as a first reference. That way, it avoids creating a
second duplicate reference structure, and later when it resolves
the references, it finds there's only one, and it doesn't need to
delete the valid dinode.
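A minimal sketch of the new check, assuming the three inputs shown
here. The names sketch_ref_action and sketch_decide are hypothetical;
the real add_duplicate_ref works on struct duptree and the result of
find_dup_ref_inode(), and does more bookkeeping than is shown.

```c
#include <stdbool.h>

/* Hypothetical outcome names, for illustration only. */
enum sketch_ref_action {
	SKETCH_IGNORE,         /* first reference already known */
	SKETCH_ADOPT_AS_FIRST, /* treat this known reference as the first */
	SKETCH_RECORD,         /* carry on and record the reference */
};

/*
 * first_ref_found:      the duplicate's first reference was already found
 * inode_already_listed: this inode already holds a recorded reference
 * searching_for_first:  the caller (pass1b) is looking for the original
 */
static enum sketch_ref_action sketch_decide(bool first_ref_found,
					    bool inode_already_listed,
					    bool searching_for_first)
{
	if (first_ref_found)
		return SKETCH_IGNORE;
	/* The original reference must have been deleted earlier, so the
	   reference we already know about becomes the first one. */
	if (inode_already_listed && searching_for_first)
		return SKETCH_ADOPT_AS_FIRST;
	return SKETCH_RECORD;
}
```

Adopting the known reference as the first one means no second record
is created for the same inode, so duplicate resolution later sees a
single reference and leaves the dinode alone.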
rhbz#902920
---
gfs2/fsck/util.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/gfs2/fsck/util.c b/gfs2/fsck/util.c
index 078d5f6..fc3a0ec 100644
--- a/gfs2/fsck/util.c
+++ b/gfs2/fsck/util.c
@@ -330,6 +330,28 @@ int add_duplicate_ref(struct gfs2_inode *ip, uint64_t block,
if (dt->first_ref_found)
return meta_is_good;
+ /* Check for a previous reference to this duplicate */
+ id = find_dup_ref_inode(dt, ip);
+
+ /* We have to be careful here. The original referencing dinode may have
+ been deemed to be bad and deleted/freed in pass1. In that case, pass1b
+ wouldn't discover the correct [deleted] original reference. In
+ that case, we don't want to be confused and consider this second
+ reference the same as the first. If we do, we'll never be able to
+ resolve it. The first reference can't be the second reference. */
+ if (id && first && !dt->first_ref_found) {
+ log_info(_("Original reference to block %llu (0x%llx) was "
+ "previously found to be bad and deleted.\n"),
+ (unsigned long long)block,
+ (unsigned long long)block);
+ log_info(_("I'll consider the reference from inode %llu "
+ "(0x%llx) the first reference.\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr);
+ dt->first_ref_found = 1;
+ return meta_is_good;
+ }
+
/* The first time this is called from pass1 is actually the second
reference. When we go back in pass1b looking for the original
reference, we don't want to increment the reference count because
@@ -341,8 +363,6 @@ int add_duplicate_ref(struct gfs2_inode *ip, uint64_t block,
dt->refs++;
}
- /* Check for a previous reference to this duplicate */
- id = find_dup_ref_inode(dt, ip);
if (id == NULL) {
/* Check for the inode on the invalid inode reference list. */
uint8_t q;
--
1.7.11.7
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 39/47] fsck.gfs2: When flagging a duplicate reference, show valid or invalid
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (36 preceding siblings ...)
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 38/47] fsck.gfs2: check for duplicate first references Bob Peterson
@ 2013-05-14 16:22 ` Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 40/47] fsck.gfs2: major duplicate reference reform Bob Peterson
` (7 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch changes the logging when duplicate block references are
flagged. The log message now states whether the inode holding the
reference is valid or invalid, which helps in diagnosing problems when
duplicate block references are resolved.
rhbz#902920
---
gfs2/fsck/util.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/gfs2/fsck/util.c b/gfs2/fsck/util.c
index fc3a0ec..ef59e6e 100644
--- a/gfs2/fsck/util.c
+++ b/gfs2/fsck/util.c
@@ -399,9 +399,10 @@ int add_duplicate_ref(struct gfs2_inode *ip, uint64_t block,
id->reftypecount[reftype]++;
id->dup_count++;
log_info( _("Found %d reference(s) to block %llu"
- " (0x%llx) as %s in inode #%llu (0x%llx)\n"),
+ " (0x%llx) as %s in %s inode #%llu (0x%llx)\n"),
id->dup_count, (unsigned long long)block,
(unsigned long long)block, reftypes[reftype],
+ inode_valid ? _("valid") : _("invalid"),
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
if (first)
--
1.7.11.7
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 40/47] fsck.gfs2: major duplicate reference reform
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (37 preceding siblings ...)
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 39/47] fsck.gfs2: When flagging a duplicate reference, show valid or invalid Bob Peterson
@ 2013-05-14 16:22 ` Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 41/47] fsck.gfs2: Remove all bad eattr blocks Bob Peterson
` (6 subsequent siblings)
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch is a large set of changes designed to rework how pass1b
resolves duplicate block references. There are basically two major
changes with this patch:
First, the metawalk functions were trying to attribute too much
information to the return codes of their callback functions: (1) Was
there an error? (2) Was the inode valid? (3) Was a duplicate block
reference encountered? (4) Should we keep going and process more of
its metadata? This often led to bad decisions made by metawalk:
For example, it would stop processing metadata when it should have
continued, thereby forgetting to mark blocks free that were no longer
in use. This patch introduces two new output parameters to the
check_metalist callbacks, *is_valid and *was_duplicate. The first one indicates
whether the dinode was valid or whether there is good cause to
delete it. The second indicates whether a duplicate block reference
was encountered. With this patch, the return code indicates simply
whether metadata processing should be skipped or not, and nothing
more. This is especially useful in pass1. For example, if it
encounters major corruption in a dinode, it doesn't do any good to
mark all its blocks as duplicates and have the undo functions try
to reverse all those decisions.
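To illustrate the split between the return code and the two output parameters, here is a minimal sketch. It assumes hypothetical names (`META_IS_GOOD`, `META_SKIP_FURTHER`, `META_ERROR`, `check_fn`, `walk_blocks`, `toy_check`) and a toy callback; it is not the build_and_check_metalist code from this patch, only the shape of the control flow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes; the values are assumptions for this sketch. */
enum meta_rc { META_IS_GOOD = 0, META_SKIP_FURTHER = 1, META_ERROR = -1 };

/* Callback shape: the return code only says whether to keep walking;
   validity and duplicate status come back through the pointers. */
typedef int (*check_fn)(uint64_t block, int *is_valid, int *was_duplicate);

struct walk_stats {
	int queued;		/* blocks accepted for further processing */
	int rejected;		/* blocks the callback declared invalid */
	int duplicates;		/* blocks flagged as duplicate references */
};

int walk_blocks(const uint64_t *blocks, int n, check_fn check,
		struct walk_stats *st)
{
	assert(check != NULL && st != NULL);
	for (int i = 0; i < n; i++) {
		int is_valid = 0, was_duplicate = 0;
		int rc = check(blocks[i], &is_valid, &was_duplicate);

		if (rc != META_IS_GOOD)
			return rc;	/* stop walking, nothing more implied */
		if (!is_valid) {
			st->rejected++;
			continue;	/* skip this block, keep walking */
		}
		if (was_duplicate) {
			st->duplicates++;
			continue;	/* do not process its contents again */
		}
		st->queued++;
	}
	return META_IS_GOOD;
}

/* Toy callback: block 0 is invalid, odd-numbered blocks are duplicates. */
int toy_check(uint64_t block, int *is_valid, int *was_duplicate)
{
	*is_valid = (block != 0);
	*was_duplicate = (block & 1) != 0;
	return META_IS_GOOD;
}
```

The point of the design is that an invalid or duplicate block no longer forces the walk to stop: the caller can skip one block and continue, so later blocks still get marked correctly.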
The second major change with this patch has to do with the
philosophy of how duplicate references are resolved. Before, pass1
would flag the duplicates and pass1b would try to resolve them all,
marking dinodes that should be deleted as "bad", and pass2 would
delete the bad dinodes. This becomes very problematic and messy
in pass1b, especially in cases where you have a number of duplicate
references that are common between multiple dinodes. For example,
suppose files A, B and C share some of the same blocks, but not
others:
A - 0x3000 0x3001 0x1233 0x1234 0x3004
B - 0x4000 0x4001 0x4002 0x1234 0x1235
C - 0x1231 0x1232 0x1233 0x1234 0x1235
The old strategy that got us into trouble was to log the three
duplicate blocks, delete invalid dinodes A and B, but leave the
duplicate reference structure around for 0x1233, 0x1234 and 0x1235
so that C would be left intact with the only references to all five
blocks. But cleaning up the leftover duplicate structures often
led to bad decisions where C wouldn't have all its blocks marked
as referenced. Often, you would end up with blocks that were marked
as free which were still in use, and blocks that were marked as
in use that should have been freed, and it was all due to the
existence of those duplicate structures that were still on the list
until pass2.
The new strategy is to resolve-as-you-go. In other words, pass1b
considers the three duplicate blocks, but when it decides that
file A should be deleted, it removes all its references from the
list, thereby making the decision between B and C easier: it no
longer has to worry about block 0x1233, and there's only one thing
to consider about blocks 0x1234 and 0x1235. When B is deleted, it
removes all its duplicate references, so block 0x1235 is no longer
considered to be in conflict. Once a file is deleted, all its
duplicate reference structures are removed so as not to confuse
other duplicates being resolved. The duplicate handler structure,
struct dup_handler, is revised with every reference that's resolved
so it's not working off a long list of possibles, most of
which were already taken care of by previous actions.
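The A/B/C example can be replayed with a small sketch. The types and helpers here (`struct ref`, `struct ref_list`, `count_refs`, `remove_file_refs`, `build_example`) are hypothetical and exist only for illustration; pass1b itself works on the duptree and struct dup_handler, not on a flat array.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_REFS 16

/* One reference from a file (identified by a letter) to a shared block. */
struct ref {
	char file;
	uint64_t block;
};

struct ref_list {
	struct ref refs[MAX_REFS];
	int count;
};

/* How many files still reference this block? */
int count_refs(const struct ref_list *rl, uint64_t block)
{
	int n = 0;

	for (int i = 0; i < rl->count; i++)
		if (rl->refs[i].block == block)
			n++;
	return n;
}

/* Resolve-as-you-go: when a file is deleted, drop all of its references
   right away. Returns the number of references removed. */
int remove_file_refs(struct ref_list *rl, char file)
{
	int kept = 0, removed;

	for (int i = 0; i < rl->count; i++)
		if (rl->refs[i].file != file)
			rl->refs[kept++] = rl->refs[i];
	removed = rl->count - kept;
	rl->count = kept;
	return removed;
}

/* Load the shared-block references from the A/B/C example. */
void build_example(struct ref_list *rl)
{
	static const struct ref shared[] = {
		{ 'A', 0x1233 }, { 'A', 0x1234 },
		{ 'B', 0x1234 }, { 'B', 0x1235 },
		{ 'C', 0x1233 }, { 'C', 0x1234 }, { 'C', 0x1235 },
	};
	int n = (int)(sizeof(shared) / sizeof(shared[0]));

	assert(n <= MAX_REFS);
	for (int i = 0; i < n; i++)
		rl->refs[i] = shared[i];
	rl->count = n;
}
```

After build_example, removing A's references leaves one reference to 0x1233 and two each to 0x1234 and 0x1235. Removing B's references then leaves exactly one reference to each of the three blocks, all held by C, so nothing is in conflict any more.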
rhbz#902920
Conflicts:
gfs2/fsck/pass1b.c
---
gfs2/fsck/fsck.h | 2 -
gfs2/fsck/initialize.c | 2 +-
gfs2/fsck/metawalk.c | 219 ++++++++-----
gfs2/fsck/metawalk.h | 31 +-
gfs2/fsck/pass1.c | 101 +++---
gfs2/fsck/pass1b.c | 810 ++++++++++++++++++++-----------------------------
gfs2/fsck/pass2.c | 60 ----
gfs2/fsck/util.c | 37 ++-
gfs2/fsck/util.h | 3 +-
9 files changed, 612 insertions(+), 653 deletions(-)
diff --git a/gfs2/fsck/fsck.h b/gfs2/fsck/fsck.h
index b21a670..6d888af 100644
--- a/gfs2/fsck/fsck.h
+++ b/gfs2/fsck/fsck.h
@@ -112,11 +112,9 @@ extern int pass4(struct gfs2_sbd *sdp);
extern int pass5(struct gfs2_sbd *sdp);
extern int rg_repair(struct gfs2_sbd *sdp, int trust_lvl, int *rg_count,
int *sane);
-extern void gfs2_dup_free(void);
extern int fsck_query(const char *format, ...)
__attribute__((format(printf,1,2)));
extern struct dir_info *dirtree_find(uint64_t block);
-extern void dup_listent_delete(struct inode_with_dups *id);
extern void dup_delete(struct duptree *dt);
extern void dirtree_delete(struct dir_info *b);
diff --git a/gfs2/fsck/initialize.c b/gfs2/fsck/initialize.c
index 7d64b0a..b01b240 100644
--- a/gfs2/fsck/initialize.c
+++ b/gfs2/fsck/initialize.c
@@ -66,7 +66,7 @@ static int block_mounters(struct gfs2_sbd *sdp, int block_em)
return 0;
}
-void gfs2_dup_free(void)
+static void gfs2_dup_free(void)
{
struct osi_node *n;
struct duptree *dt;
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index d285ee5..19593f3 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -950,7 +950,8 @@ int delete_block(struct gfs2_inode *ip, uint64_t block,
/**
* find_remove_dup - find out if this is a duplicate ref. If so, remove it.
- * Returns: 0 if not a duplicate reference, 1 if it is.
+ *
+ * Returns: 1 if there are any remaining references to this block, else 0.
*/
int find_remove_dup(struct gfs2_inode *ip, uint64_t block, const char *btype)
{
@@ -964,41 +965,18 @@ int find_remove_dup(struct gfs2_inode *ip, uint64_t block, const char *btype)
/* remove the inode reference id structure for this reference. */
id = find_dup_ref_inode(dt, ip);
if (!id)
- return 0;
-
- dup_listent_delete(id);
- log_err( _("Removing duplicate status of block %llu (0x%llx) "
- "referenced as %s by dinode %llu (0x%llx)\n"),
- (unsigned long long)block, (unsigned long long)block,
- btype, (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr);
- dt->refs--; /* one less reference */
- if (dt->refs == 1) {
- log_info( _("This leaves only one reference: it's "
- "no longer a duplicate.\n"));
+ goto more_refs;
+
+ dup_listent_delete(dt, id);
+ if (dt->refs == 0) {
+ log_info( _("This was the last reference: it's no longer a "
+ "duplicate.\n"));
dup_delete(dt); /* not duplicate now */
- } else
- log_info( _("%d block reference(s) remain.\n"),
- dt->refs);
- return 1; /* but the original ref still exists so do not free it. */
-}
-
-/**
- * free_block_if_notdup - free blocks associated with an inode, but if it's a
- * duplicate, just remove that designation instead.
- * Returns: 1 if the block was freed, 0 if a duplicate reference was removed
- * Note: The return code is handled this way because there are places in
- * metawalk.c that assume "1" means "change was made" and "0" means
- * change was not made.
- */
-int free_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
- const char *btype)
-{
- if (!find_remove_dup(ip, block, btype)) { /* not a dup */
- fsck_blockmap_set(ip, block, btype, gfs2_block_free);
- return meta_skip_further;
+ return 0;
}
- return meta_is_good;
+more_refs:
+ log_info( _("%d block reference(s) remain.\n"), dt->refs);
+ return 1; /* references still exist so do not free the block. */
}
/**
@@ -1010,7 +988,8 @@ int free_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
*/
static int delete_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh,
- const char *btype, void *private)
+ const char *btype, int *was_duplicate,
+ void *private)
{
uint8_t q;
@@ -1027,7 +1006,19 @@ static int delete_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)ip->i_di.di_num.no_addr);
return meta_is_good;
}
- return free_block_if_notdup(ip, block, btype);
+ if (find_remove_dup(ip, block, btype)) { /* a dup */
+ if (was_duplicate)
+ *was_duplicate = 1;
+ log_err( _("Not clearing duplicate reference in inode "
+ "at block #%llu (0x%llx) to block #%llu (0x%llx) "
+ "because it's referenced by another inode.\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)block, (unsigned long long)block);
+ } else {
+ fsck_blockmap_set(ip, block, btype, gfs2_block_free);
+ }
+ return meta_is_good;
}
/**
@@ -1197,7 +1188,7 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
osi_list_t *prev_list, *cur_list, *tmp;
int h, head_size, iblk_type;
uint64_t *ptr, block;
- int error = 0, err;
+ int error, was_duplicate, is_valid;
osi_list_add(&metabh->b_altlist, &mlp[0]);
@@ -1211,7 +1202,7 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
/* if (<there are no indirect blocks to check>) */
if (height < 2)
- return 0;
+ return meta_is_good;
for (h = 1; h < height; h++) {
if (h > 1) {
if (is_dir(&ip->i_di, ip->i_sbd->gfs1) &&
@@ -1243,7 +1234,7 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
ptr++) {
if (skip_this_pass || fsck_abort) {
free_metalist(ip, mlp);
- return FSCK_OK;
+ return meta_is_good;
}
nbh = NULL;
@@ -1251,19 +1242,41 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
continue;
block = be64_to_cpu(*ptr);
- err = pass->check_metalist(ip, block, &nbh, h,
- pass->private);
+ was_duplicate = 0;
+ error = pass->check_metalist(ip, block, &nbh,
+ h, &is_valid,
+ &was_duplicate,
+ pass->private);
/* check_metalist should hold any buffers
it gets with "bread". */
- if (err == meta_error) {
+ if (error == meta_error) {
stack;
- error = err;
+ log_info(_("\nSerious metadata "
+ "error on block %llu "
+ "(0x%llx).\n"),
+ (unsigned long long)block,
+ (unsigned long long)block);
+ return error;
+ }
+ if (error == meta_skip_further) {
+ log_info(_("\nUnrecoverable metadata "
+ "error on block %llu "
+ "(0x%llx). Further metadata"
+ " will be skipped.\n"),
+ (unsigned long long)block,
+ (unsigned long long)block);
return error;
}
- if (err == meta_skip_further) {
- if (!error)
- error = err;
- log_debug( _("Skipping block %llu (0x%llx)\n"),
+ if (!is_valid) {
+ log_debug( _("Skipping rejected block "
+ "%llu (0x%llx)\n"),
+ (unsigned long long)block,
+ (unsigned long long)block);
+ continue;
+ }
+ if (was_duplicate) {
+ log_debug( _("Skipping duplicate %llu "
+ "(0x%llx)\n"),
(unsigned long long)block,
(unsigned long long)block);
continue;
@@ -1597,34 +1610,52 @@ int remove_dentry_from_dir(struct gfs2_sbd *sdp, uint64_t dir,
}
int delete_metadata(struct gfs2_inode *ip, uint64_t block,
- struct gfs2_buffer_head **bh, int h, void *private)
+ struct gfs2_buffer_head **bh, int h, int *is_valid,
+ int *was_duplicate, void *private)
{
- return delete_block_if_notdup(ip, block, bh, _("metadata"), private);
+ *is_valid = 1;
+ *was_duplicate = 0;
+ return delete_block_if_notdup(ip, block, bh, _("metadata"),
+ was_duplicate, private);
}
int delete_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
{
- return delete_block_if_notdup(ip, block, NULL, _("leaf"), private);
+ return delete_block_if_notdup(ip, block, NULL, _("leaf"), NULL,
+ private);
}
int delete_data(struct gfs2_inode *ip, uint64_t metablock,
uint64_t block, void *private)
{
- return delete_block_if_notdup(ip, block, NULL, _("data"), private);
+ return delete_block_if_notdup(ip, block, NULL, _("data"), NULL,
+ private);
}
-int delete_eattr_indir(struct gfs2_inode *ip, uint64_t block, uint64_t parent,
- struct gfs2_buffer_head **bh, void *private)
+static int del_eattr_generic(struct gfs2_inode *ip, uint64_t block,
+ uint64_t parent, struct gfs2_buffer_head **bh,
+ void *private, const char *eatype)
{
- int ret;
+ int ret = 0;
+ int was_free = 0;
+ uint8_t q;
- ret = delete_block_if_notdup(ip, block, NULL,
- _("indirect extended attribute"),
- private);
+ if (valid_block(ip->i_sbd, block)) {
+ q = block_type(block);
+ if (q == gfs2_block_free)
+ was_free = 1;
+ ret = delete_block_if_notdup(ip, block, NULL, eatype,
+ NULL, private);
+ if (!ret) {
+ *bh = bread(ip->i_sbd, block);
+ if (!was_free)
+ ip->i_di.di_blocks--;
+ bmodified(ip->i_bh);
+ }
+ }
/* Even if it's a duplicate reference, we want to eliminate the
reference itself, and adjust di_blocks accordingly. */
if (ip->i_di.di_eattr) {
- ip->i_di.di_blocks--;
if (block == ip->i_di.di_eattr)
ip->i_di.di_eattr = 0;
bmodified(ip->i_bh);
@@ -1632,24 +1663,74 @@ int delete_eattr_indir(struct gfs2_inode *ip, uint64_t block, uint64_t parent,
return ret;
}
+int delete_eattr_indir(struct gfs2_inode *ip, uint64_t block, uint64_t parent,
+ struct gfs2_buffer_head **bh, void *private)
+{
+ return del_eattr_generic(ip, block, parent, bh, private,
+ _("extended attribute"));
+}
+
int delete_eattr_leaf(struct gfs2_inode *ip, uint64_t block, uint64_t parent,
struct gfs2_buffer_head **bh, void *private)
{
- int ret;
+ return del_eattr_generic(ip, block, parent, bh, private,
+ _("indirect extended attribute"));
+}
- ret = delete_block_if_notdup(ip, block, NULL, _("extended attribute"),
- private);
- if (ip->i_di.di_eattr) {
- ip->i_di.di_blocks--;
- if (block == ip->i_di.di_eattr)
- ip->i_di.di_eattr = 0;
- bmodified(ip->i_bh);
+int delete_eattr_entry(struct gfs2_inode *ip, struct gfs2_buffer_head *leaf_bh,
+ struct gfs2_ea_header *ea_hdr,
+ struct gfs2_ea_header *ea_hdr_prev, void *private)
+{
+ struct gfs2_sbd *sdp = ip->i_sbd;
+ char ea_name[256];
+ uint32_t avail_size;
+ int max_ptrs;
+
+ if (!ea_hdr->ea_name_len){
+ /* Skip this entry for now */
+ return 1;
}
- return ret;
+
+ memset(ea_name, 0, sizeof(ea_name));
+ strncpy(ea_name, (char *)ea_hdr + sizeof(struct gfs2_ea_header),
+ ea_hdr->ea_name_len);
+
+ if (!GFS2_EATYPE_VALID(ea_hdr->ea_type) &&
+ ((ea_hdr_prev) || (!ea_hdr_prev && ea_hdr->ea_type))){
+ /* Skip invalid entry */
+ return 1;
+ }
+
+ if (!ea_hdr->ea_num_ptrs)
+ return 0;
+
+ avail_size = sdp->sd_sb.sb_bsize - sizeof(struct gfs2_meta_header);
+ max_ptrs = (be32_to_cpu(ea_hdr->ea_data_len) + avail_size - 1) /
+ avail_size;
+
+ if (max_ptrs > ea_hdr->ea_num_ptrs)
+ return 1;
+
+ log_debug( _(" Pointers Required: %d\n Pointers Reported: %d\n"),
+ max_ptrs, ea_hdr->ea_num_ptrs);
+
+ return 0;
+}
+
+int delete_eattr_extentry(struct gfs2_inode *ip, uint64_t *ea_data_ptr,
+ struct gfs2_buffer_head *leaf_bh,
+ struct gfs2_ea_header *ea_hdr,
+ struct gfs2_ea_header *ea_hdr_prev, void *private)
+{
+ uint64_t block = be64_to_cpu(*ea_data_ptr);
+
+ return delete_block_if_notdup(ip, block, NULL, _("extended attribute"),
+ NULL, private);
}
static int alloc_metalist(struct gfs2_inode *ip, uint64_t block,
- struct gfs2_buffer_head **bh, int h, void *private)
+ struct gfs2_buffer_head **bh, int h, int *is_valid,
+ int *was_duplicate, void *private)
{
uint8_t q;
const char *desc = (const char *)private;
@@ -1657,6 +1738,8 @@ static int alloc_metalist(struct gfs2_inode *ip, uint64_t block,
/* No need to range_check here--if it was added, it's in range. */
/* We can't check the bitmap here because this function is called
after the bitmap has been set but before the blockmap has. */
+ *is_valid = 1;
+ *was_duplicate = 0;
*bh = bread(ip->i_sbd, block);
q = block_type(block);
if (blockmap_to_bitmap(q, ip->i_sbd->gfs1) == GFS2_BLKST_FREE) {
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index 49217cc..56f57d9 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -24,7 +24,8 @@ extern int delete_block(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, const char *btype,
void *private);
extern int delete_metadata(struct gfs2_inode *ip, uint64_t block,
- struct gfs2_buffer_head **bh, int h, void *private);
+ struct gfs2_buffer_head **bh, int h, int *is_valid,
+ int *was_duplicate, void *private);
extern int delete_leaf(struct gfs2_inode *ip, uint64_t block, void *private);
extern int delete_data(struct gfs2_inode *ip, uint64_t metablock,
uint64_t block, void *private);
@@ -32,6 +33,17 @@ extern int delete_eattr_indir(struct gfs2_inode *ip, uint64_t block, uint64_t pa
struct gfs2_buffer_head **bh, void *private);
extern int delete_eattr_leaf(struct gfs2_inode *ip, uint64_t block, uint64_t parent,
struct gfs2_buffer_head **bh, void *private);
+extern int delete_eattr_entry(struct gfs2_inode *ip,
+ struct gfs2_buffer_head *leaf_bh,
+ struct gfs2_ea_header *ea_hdr,
+ struct gfs2_ea_header *ea_hdr_prev,
+ void *private);
+extern int delete_eattr_extentry(struct gfs2_inode *ip, uint64_t *ea_data_ptr,
+ struct gfs2_buffer_head *leaf_bh,
+ struct gfs2_ea_header *ea_hdr,
+ struct gfs2_ea_header *ea_hdr_prev,
+ void *private);
+
extern int _fsck_blockmap_set(struct gfs2_inode *ip, uint64_t bblock,
const char *btype, enum gfs2_mark_block mark,
const char *caller, int line);
@@ -48,8 +60,6 @@ extern int write_new_leaf(struct gfs2_inode *dip, int start_lindex,
uint64_t *bn);
extern int repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no, int lindex,
int ref_count, const char *msg);
-extern int free_block_if_notdup(struct gfs2_inode *ip, uint64_t block,
- const char *btype);
#define is_duplicate(dblock) ((dupfind(dblock)) ? 1 : 0)
@@ -83,8 +93,23 @@ struct metawalk_fxns {
int ref_count, struct gfs2_buffer_head *lbh);
int (*check_leaf) (struct gfs2_inode *ip, uint64_t block,
void *private);
+ /* parameters to the check_metalist sub-functions:
+ ip: incore inode pointer
+ block: block number of the metadata block to be checked
+ bh: buffer_head to be returned
+ h: height
+ is_valid: returned as 1 if the metadata block is valid and should
+ be added to the metadata list for further processing.
+ was_duplicate: returns as 1 if the metadata block was determined
+ to be a duplicate reference, in which case we want to
+ skip adding it to the metadata list.
+ private: Pointer to pass-specific data
+ returns: 0 - everything is good, but there may be duplicates
+ 1 - skip further processing
+ */
int (*check_metalist) (struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h,
+ int *is_valid, int *was_duplicate,
void *private);
int (*check_data) (struct gfs2_inode *ip, uint64_t metablock,
uint64_t block, void *private);
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index df10089..ee7e2c5 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -37,7 +37,8 @@ struct block_count {
static int p1check_leaf(struct gfs2_inode *ip, uint64_t block, void *private);
static int check_metalist(struct gfs2_inode *ip, uint64_t block,
- struct gfs2_buffer_head **bh, int h, void *private);
+ struct gfs2_buffer_head **bh, int h, int *is_valid,
+ int *was_duplicate, void *private);
static int undo_check_metalist(struct gfs2_inode *ip, uint64_t block,
int h, void *private);
static int check_data(struct gfs2_inode *ip, uint64_t metablock,
@@ -64,6 +65,7 @@ static int finish_eattr_indir(struct gfs2_inode *ip, int leaf_pointers,
int leaf_pointer_errors, void *private);
static int invalidate_metadata(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h,
+ int *is_valid, int *was_duplicate,
void *private);
static int invalidate_leaf(struct gfs2_inode *ip, uint64_t block,
void *private);
@@ -127,10 +129,13 @@ struct metawalk_fxns invalidate_fxns = {
*/
static int resuscitate_metalist(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h,
+ int *is_valid, int *was_duplicate,
void *private)
{
struct block_count *bc = (struct block_count *)private;
+ *is_valid = 1;
+ *was_duplicate = 0;
*bh = NULL;
if (!valid_block(ip->i_sbd, block)){ /* blk outside of FS */
fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
@@ -139,7 +144,8 @@ static int resuscitate_metalist(struct gfs2_inode *ip, uint64_t block,
"range) found in system inode %lld (0x%llx).\n"),
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
- return meta_skip_further;
+ *is_valid = 0;
+ return meta_is_good;
}
if (fsck_system_inode(ip->i_sbd, block))
fsck_blockmap_set(ip, block, _("system file"), gfs2_indir_blk);
@@ -241,16 +247,19 @@ static int p1check_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
}
static int check_metalist(struct gfs2_inode *ip, uint64_t block,
- struct gfs2_buffer_head **bh, int h, void *private)
+ struct gfs2_buffer_head **bh, int h, int *is_valid,
+ int *was_duplicate, void *private)
{
uint8_t q;
- int found_dup = 0, iblk_type;
+ int iblk_type;
struct gfs2_buffer_head *nbh;
struct block_count *bc = (struct block_count *)private;
const char *blktypedesc;
*bh = NULL;
+ *was_duplicate = 0;
+ *is_valid = 0;
if (!valid_block(ip->i_sbd, block)) { /* blk outside of FS */
/* The bad dinode should be invalidated later due to
"unrecoverable" errors. The inode itself should be
@@ -282,12 +291,13 @@ static int check_metalist(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr, q,
block_type_string(q));
- add_duplicate_ref(ip, block, ref_as_meta, 0, INODE_VALID);
- found_dup = 1;
+ *was_duplicate = 1;
}
nbh = bread(ip->i_sbd, block);
- if (gfs2_check_meta(nbh, iblk_type)){
+ *is_valid = (gfs2_check_meta(nbh, iblk_type) == 0);
+
+ if (!(*is_valid)) {
log_err( _("Inode %lld (0x%llx) has a bad indirect block "
"pointer %lld (0x%llx) (points to something "
"that is not %s).\n"),
@@ -295,31 +305,23 @@ static int check_metalist(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)block,
(unsigned long long)block, blktypedesc);
- if (!found_dup) {
- fsck_blockmap_set(ip, block, _("bad indirect"),
- gfs2_meta_inval);
- brelse(nbh);
- nbh = NULL;
- return meta_skip_further;
- }
brelse(nbh);
- nbh = NULL;
- } else /* blk check ok */
- *bh = nbh;
+ return meta_skip_further;
+ }
bc->indir_count++;
- if (found_dup) {
- if (nbh) {
- brelse(nbh);
- nbh = NULL;
- *bh = NULL;
- }
- return meta_skip_further; /* don't process the metadata again */
- } else
- fsck_blockmap_set(ip, block, _("indirect"),
- gfs2_indir_blk);
+ if (*was_duplicate) {
+ add_duplicate_ref(ip, block, ref_as_meta, 0,
+ *is_valid ? INODE_VALID : INODE_INVALID);
+ brelse(nbh);
+ } else {
+ *bh = nbh;
+ fsck_blockmap_set(ip, block, _("indirect"), gfs2_indir_blk);
+ }
- return meta_is_good;
+ if (*is_valid)
+ return meta_is_good;
+ return meta_skip_further;
}
/* undo_reference - undo previously processed data or metadata
@@ -354,7 +356,7 @@ static int undo_reference(struct gfs2_inode *ip, uint64_t block, int meta,
if (!id)
break;
- dup_listent_delete(id);
+ dup_listent_delete(dt, id);
} while (id);
if (dt->refs) {
@@ -827,7 +829,8 @@ static int check_eattr_entries(struct gfs2_inode *ip,
* delete_block_if_notdup.
*/
static int mark_block_invalid(struct gfs2_inode *ip, uint64_t block,
- enum dup_ref_type reftype, const char *btype)
+ enum dup_ref_type reftype, const char *btype,
+ int *is_valid, int *was_duplicate)
{
uint8_t q;
@@ -838,11 +841,20 @@ static int mark_block_invalid(struct gfs2_inode *ip, uint64_t block,
* referenced elsewhere (duplicates) won't be flagged as such,
* and as a result, they'll be freed when this dinode is deleted,
* despite being used by another dinode as a valid block. */
- if (!valid_block(ip->i_sbd, block))
+ if (is_valid)
+ *is_valid = 1;
+ if (was_duplicate)
+ *was_duplicate = 0;
+ if (!valid_block(ip->i_sbd, block)) {
+ if (is_valid)
+ *is_valid = 0;
return meta_is_good;
+ }
q = block_type(block);
if (q != gfs2_block_free) {
+ if (was_duplicate)
+ *was_duplicate = 1;
add_duplicate_ref(ip, block, reftype, 0, INODE_INVALID);
log_info( _("%s block %lld (0x%llx), part of inode "
"%lld (0x%llx), was previously referenced so "
@@ -859,21 +871,27 @@ static int mark_block_invalid(struct gfs2_inode *ip, uint64_t block,
static int invalidate_metadata(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h,
+ int *is_valid, int *was_duplicate,
void *private)
{
- return mark_block_invalid(ip, block, ref_as_meta, _("metadata"));
+ *is_valid = 1;
+ *was_duplicate = 0;
+ return mark_block_invalid(ip, block, ref_as_meta, _("metadata"),
+ is_valid, was_duplicate);
}
static int invalidate_leaf(struct gfs2_inode *ip, uint64_t block,
void *private)
{
- return mark_block_invalid(ip, block, ref_as_meta, _("leaf"));
+ return mark_block_invalid(ip, block, ref_as_meta, _("leaf"),
+ NULL, NULL);
}
static int invalidate_data(struct gfs2_inode *ip, uint64_t metablock,
uint64_t block, void *private)
{
- return mark_block_invalid(ip, block, ref_as_data, _("data"));
+ return mark_block_invalid(ip, block, ref_as_data, _("data"),
+ NULL, NULL);
}
static int invalidate_eattr_indir(struct gfs2_inode *ip, uint64_t block,
@@ -881,7 +899,8 @@ static int invalidate_eattr_indir(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, void *private)
{
return mark_block_invalid(ip, block, ref_as_ea,
- _("indirect extended attribute"));
+ _("indirect extended attribute"),
+ NULL, NULL);
}
static int invalidate_eattr_leaf(struct gfs2_inode *ip, uint64_t block,
@@ -889,7 +908,8 @@ static int invalidate_eattr_leaf(struct gfs2_inode *ip, uint64_t block,
void *private)
{
return mark_block_invalid(ip, block, ref_as_ea,
- _("extended attribute"));
+ _("extended attribute"),
+ NULL, NULL);
}
/**
@@ -924,7 +944,7 @@ static int rangecheck_block(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
if ((*bad_pointers) <= BAD_POINTER_TOLERANCE)
- return meta_skip_further;
+ return meta_is_good;
else
return meta_error; /* Exits check_metatree quicker */
}
@@ -939,7 +959,7 @@ static int rangecheck_block(struct gfs2_inode *ip, uint64_t block,
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
if ((*bad_pointers) <= BAD_POINTER_TOLERANCE)
- return meta_skip_further;
+ return meta_is_good;
else
return meta_error; /* Exits check_metatree quicker */
}
@@ -948,8 +968,11 @@ static int rangecheck_block(struct gfs2_inode *ip, uint64_t block,
static int rangecheck_metadata(struct gfs2_inode *ip, uint64_t block,
struct gfs2_buffer_head **bh, int h,
+ int *is_valid, int *was_duplicate,
void *private)
{
+ *is_valid = 1;
+ *was_duplicate = 0;
return rangecheck_block(ip, block, bh, btype_meta, private);
}
@@ -1048,7 +1071,7 @@ static int handle_ip(struct gfs2_sbd *sdp, struct gfs2_inode *ip)
/* We there was an error, we return 0 because we want fsck to continue
and analyze the other dinodes as well. */
- if (fsck_abort || error != 0)
+ if (fsck_abort)
return 0;
error = check_inode_eattr(ip, &pass1_fxns);
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index b5da200..15a3f3a 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -23,386 +23,10 @@ struct fxn_info {
struct dup_handler {
struct duptree *dt;
- struct inode_with_dups *id;
int ref_inode_count;
int ref_count;
};
-static int check_leaf_refs(struct gfs2_inode *ip, uint64_t block, void *private);
-static int check_metalist(struct gfs2_inode *ip, uint64_t block,
- struct gfs2_buffer_head **bh, int h, void *private);
-static int check_data(struct gfs2_inode *ip, uint64_t metablock,
- uint64_t block, void *private);
-static int check_eattr_indir(struct gfs2_inode *ip, uint64_t block,
- uint64_t parent, struct gfs2_buffer_head **bh,
- void *private);
-static int check_eattr_leaf(struct gfs2_inode *ip, uint64_t block,
- uint64_t parent, struct gfs2_buffer_head **bh,
- void *private);
-static int check_eattr_entry(struct gfs2_inode *ip,
- struct gfs2_buffer_head *leaf_bh,
- struct gfs2_ea_header *ea_hdr,
- struct gfs2_ea_header *ea_hdr_prev,
- void *private);
-static int check_eattr_extentry(struct gfs2_inode *ip, uint64_t *ea_data_ptr,
- struct gfs2_buffer_head *leaf_bh,
- struct gfs2_ea_header *ea_hdr,
- struct gfs2_ea_header *ea_hdr_prev,
- void *private);
-static int find_dentry(struct gfs2_inode *ip, struct gfs2_dirent *de,
- struct gfs2_dirent *prev, struct gfs2_buffer_head *bh,
- char *filename, uint32_t *count, int lindex,
- void *priv);
-
-struct metawalk_fxns find_refs = {
- .private = NULL,
- .check_leaf = check_leaf_refs,
- .check_metalist = check_metalist,
- .check_data = check_data,
- .check_eattr_indir = check_eattr_indir,
- .check_eattr_leaf = check_eattr_leaf,
- .check_dentry = NULL,
- .check_eattr_entry = check_eattr_entry,
- .check_eattr_extentry = check_eattr_extentry,
-};
-
-struct metawalk_fxns find_dirents = {
- .private = NULL,
- .check_leaf = NULL,
- .check_metalist = NULL,
- .check_data = NULL,
- .check_eattr_indir = NULL,
- .check_eattr_leaf = NULL,
- .check_dentry = find_dentry,
- .check_eattr_entry = NULL,
- .check_eattr_extentry = NULL,
-};
-
-static int check_leaf_refs(struct gfs2_inode *ip, uint64_t block, void *private)
-{
- return add_duplicate_ref(ip, block, ref_as_meta, 1, INODE_VALID);
-}
-
-static int check_metalist(struct gfs2_inode *ip, uint64_t block,
- struct gfs2_buffer_head **bh, int h, void *private)
-{
- return add_duplicate_ref(ip, block, ref_as_meta, 1, INODE_VALID);
-}
-
-static int check_data(struct gfs2_inode *ip, uint64_t metablock,
- uint64_t block, void *private)
-{
- return add_duplicate_ref(ip, block, ref_as_data, 1, INODE_VALID);
-}
-
-static int check_eattr_indir(struct gfs2_inode *ip, uint64_t block,
- uint64_t parent, struct gfs2_buffer_head **bh,
- void *private)
-{
- struct gfs2_sbd *sdp = ip->i_sbd;
- int error;
-
- error = add_duplicate_ref(ip, block, ref_as_ea, 1, INODE_VALID);
- if (!error)
- *bh = bread(sdp, block);
-
- return error;
-}
-
-static int check_eattr_leaf(struct gfs2_inode *ip, uint64_t block,
- uint64_t parent, struct gfs2_buffer_head **bh,
- void *private)
-{
- struct gfs2_sbd *sdp = ip->i_sbd;
- int error;
-
- error = add_duplicate_ref(ip, block, ref_as_ea, 1, INODE_VALID);
- if (!error)
- *bh = bread(sdp, block);
- return error;
-}
-
-static int check_eattr_entry(struct gfs2_inode *ip,
- struct gfs2_buffer_head *leaf_bh,
- struct gfs2_ea_header *ea_hdr,
- struct gfs2_ea_header *ea_hdr_prev, void *private)
-{
- return 0;
-}
-
-static int check_eattr_extentry(struct gfs2_inode *ip, uint64_t *ea_data_ptr,
- struct gfs2_buffer_head *leaf_bh,
- struct gfs2_ea_header *ea_hdr,
- struct gfs2_ea_header *ea_hdr_prev,
- void *private)
-{
- uint64_t block = be64_to_cpu(*ea_data_ptr);
-
- return add_duplicate_ref(ip, block, ref_as_ea, 1, INODE_VALID);
-}
-
-/*
- * check_dir_dup_ref - check for a directory entry duplicate reference
- * and if found, set the name into the id.
- * Returns: 1 if filename was found, otherwise 0
- */
-static int check_dir_dup_ref(struct gfs2_inode *ip, struct gfs2_dirent *de,
- osi_list_t *tmp2, char *filename)
-{
- struct inode_with_dups *id;
-
- id = osi_list_entry(tmp2, struct inode_with_dups, list);
- if (id->name)
- /* We can only have one parent of inodes that contain duplicate
- * blocks...no need to keep looking for this one. */
- return 1;
- if (id->block_no == de->de_inum.no_addr) {
- id->name = strdup(filename);
- id->parent = ip->i_di.di_num.no_addr;
- log_debug( _("Duplicate block %llu (0x%llx"
- ") is in file or directory %llu"
- " (0x%llx) named %s\n"),
- (unsigned long long)id->block_no,
- (unsigned long long)id->block_no,
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr,
- filename);
- /* If there are duplicates of duplicates, I guess we'll miss
- them here. */
- return 1;
- }
- return 0;
-}
-
-static int find_dentry(struct gfs2_inode *ip, struct gfs2_dirent *de,
- struct gfs2_dirent *prev,
- struct gfs2_buffer_head *bh, char *filename,
- uint32_t *count, int lindex, void *priv)
-{
- struct osi_node *n, *next = NULL;
- osi_list_t *tmp2;
- struct duptree *dt;
- int found;
-
- for (n = osi_first(&dup_blocks); n; n = next) {
- next = osi_next(n);
- dt = (struct duptree *)n;
- found = 0;
- osi_list_foreach(tmp2, &dt->ref_invinode_list) {
- if (check_dir_dup_ref(ip, de, tmp2, filename)) {
- found = 1;
- break;
- }
- }
- if (!found) {
- osi_list_foreach(tmp2, &dt->ref_inode_list) {
- if (check_dir_dup_ref(ip, de, tmp2, filename))
- break;
- }
- }
- }
- /* Return the number of leaf entries so metawalk doesn't flag this
- leaf as having none. */
- *count = be16_to_cpu(((struct gfs2_leaf *)bh->b_data)->lf_entries);
- return 0;
-}
-
-static int clear_dup_metalist(struct gfs2_inode *ip, uint64_t block,
- struct gfs2_buffer_head **bh, int h,
- void *private)
-{
- struct dup_handler *dh = (struct dup_handler *) private;
- struct duptree *dt;
-
- if (!valid_block(ip->i_sbd, block))
- return meta_is_good;
-
- /* This gets tricky. We're traversing a metadata tree trying to
- delete an inode based on it having a duplicate block reference
- somewhere in its metadata. We know this block is listed as data
- or metadata for this inode, but it may or may not be one of the
- actual duplicate references that caused the problem. If it's not
- a duplicate, it's normal metadata that isn't referenced anywhere
- else, but we're deleting the inode out from under it, so we need
- to delete it altogether. If the block is a duplicate referenced
- block, we need to keep its type intact and let the caller sort
- it out once we're down to a single reference. */
- dt = dupfind(block);
- if (!dt) {
- fsck_blockmap_set(ip, block, _("no longer valid"),
- gfs2_block_free);
- return meta_is_good;
- }
- /* This block, having failed the above test, is duplicated somewhere */
- if (block == dh->dt->block) {
- log_err( _("Not clearing duplicate reference in inode \"%s\" "
- "at block #%llu (0x%llx) to block #%llu (0x%llx) "
- "because it's valid for another inode.\n"),
- dh->id->name ? dh->id->name : _("unknown name"),
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)block, (unsigned long long)block);
- log_err( _("Inode %s is in directory %llu (0x%llx)\n"),
- dh->id->name ? dh->id->name : "",
- (unsigned long long)dh->id->parent,
- (unsigned long long)dh->id->parent);
- }
- /* We return 1 not 0 because we need build_and_check_metalist to
- bypass adding the metadata below it to the metalist. If that
- were to happen, all the indirect blocks pointed to by the
- duplicate block would be processed twice, which means it might
- be mistakenly freed as "no longer valid" (in this function above)
- even though it's valid metadata for a different inode. Returning
- 1 ensures that the metadata isn't processed again. */
- return meta_skip_further;
-}
-
-static int clear_dup_data(struct gfs2_inode *ip, uint64_t metablock,
- uint64_t block, void *private)
-{
- return clear_dup_metalist(ip, block, NULL, 0, private);
-}
-
-static int clear_leaf(struct gfs2_inode *ip, uint64_t block, void *private)
-{
- return clear_dup_metalist(ip, block, NULL, 0, private);
-}
-
-static int clear_dup_eattr_indir(struct gfs2_inode *ip, uint64_t block,
- uint64_t parent, struct gfs2_buffer_head **bh,
- void *private)
-{
- return clear_dup_metalist(ip, block, NULL, 0, private);
-}
-
-static int clear_dup_eattr_leaf(struct gfs2_inode *ip, uint64_t block,
- uint64_t parent, struct gfs2_buffer_head **bh,
- void *private)
-{
- return clear_dup_metalist(ip, block, NULL, 0, private);
-}
-
-static int clear_eattr_entry (struct gfs2_inode *ip,
- struct gfs2_buffer_head *leaf_bh,
- struct gfs2_ea_header *ea_hdr,
- struct gfs2_ea_header *ea_hdr_prev,
- void *private)
-{
- struct gfs2_sbd *sdp = ip->i_sbd;
- char ea_name[256];
-
- if (!ea_hdr->ea_name_len){
- /* Skip this entry for now */
- return 1;
- }
-
- memset(ea_name, 0, sizeof(ea_name));
- strncpy(ea_name, (char *)ea_hdr + sizeof(struct gfs2_ea_header),
- ea_hdr->ea_name_len);
-
- if (!GFS2_EATYPE_VALID(ea_hdr->ea_type) &&
- ((ea_hdr_prev) || (!ea_hdr_prev && ea_hdr->ea_type))){
- /* Skip invalid entry */
- return 1;
- }
-
- if (ea_hdr->ea_num_ptrs){
- uint32_t avail_size;
- int max_ptrs;
-
- avail_size = sdp->sd_sb.sb_bsize - sizeof(struct gfs2_meta_header);
- max_ptrs = (be32_to_cpu(ea_hdr->ea_data_len) + avail_size - 1) /
- avail_size;
-
- if (max_ptrs > ea_hdr->ea_num_ptrs)
- return 1;
- else {
- log_debug( _(" Pointers Required: %d\n Pointers Reported: %d\n"),
- max_ptrs, ea_hdr->ea_num_ptrs);
- }
- }
- return 0;
-}
-
-static int clear_eattr_extentry(struct gfs2_inode *ip, uint64_t *ea_data_ptr,
- struct gfs2_buffer_head *leaf_bh,
- struct gfs2_ea_header *ea_hdr,
- struct gfs2_ea_header *ea_hdr_prev,
- void *private)
-{
- uint64_t block = be64_to_cpu(*ea_data_ptr);
-
- return clear_dup_metalist(ip, block, NULL, 0, private);
-}
-
-/* Finds all references to duplicate blocks in the metadata */
-static int find_block_ref(struct gfs2_sbd *sdp, uint64_t inode)
-{
- struct gfs2_inode *ip;
- int error = 0;
-
- ip = fsck_load_inode(sdp, inode); /* bread, inode_get */
- /* double-check the meta header just to be sure it's metadata */
- if (ip->i_di.di_header.mh_magic != GFS2_MAGIC ||
- ip->i_di.di_header.mh_type != GFS2_METATYPE_DI) {
- log_debug( _("Block %lld (0x%llx) is not gfs2 metadata.\n"),
- (unsigned long long)inode,
- (unsigned long long)inode);
- fsck_inode_put(&ip);
- return 1;
- }
- /* Check to see if this inode was referenced by another by mistake */
- add_duplicate_ref(ip, inode, ref_is_inode, 1, INODE_VALID);
-
- /* Check this dinode's metadata for references to known duplicates */
- error = check_metatree(ip, &find_refs);
- if (error < 0) {
- stack;
- fsck_inode_put(&ip); /* out, brelse, free */
- return error;
- }
-
- /* Exhash dir leafs will be checked by check_metatree (right after
- the "end:" label.) But if this is a linear directory we need to
- check the dir with check_linear_dir. */
- if (is_dir(&ip->i_di, sdp->gfs1) &&
- !(ip->i_di.di_flags & GFS2_DIF_EXHASH))
- error = check_linear_dir(ip, ip->i_bh, &find_dirents);
-
- /* Check for ea references in the inode */
- if (!error)
- error = check_inode_eattr(ip, &find_refs);
-
- fsck_inode_put(&ip); /* out, brelse, free */
-
- return error;
-}
-
-/* get_ref_type - figure out if all duplicate references from this inode
- are the same type, and if so, return the type. */
-static enum dup_ref_type get_ref_type(struct inode_with_dups *id)
-{
- enum dup_ref_type t, i;
- int found_type_with_ref;
- int found_other_types;
-
- for (t = ref_as_data; t < ref_types; t++) {
- found_type_with_ref = 0;
- found_other_types = 0;
- for (i = ref_as_data; i < ref_types; i++) {
- if (id->reftypecount[i]) {
- if (t == i)
- found_type_with_ref = 1;
- else
- found_other_types = 1;
- }
- }
- if (found_type_with_ref)
- return found_other_types ? ref_types : t;
- }
- return ref_types;
-}
-
static void log_inode_reference(struct duptree *dt, osi_list_t *tmp, int inval)
{
char reftypestring[32];
@@ -426,12 +50,74 @@ static void log_inode_reference(struct duptree *dt, osi_list_t *tmp, int inval)
(unsigned long long)dt->block,
(unsigned long long)dt->block, reftypestring);
}
+
+/* delete_all_dups - delete all duplicate records for a given inode */
+static void delete_all_dups(struct gfs2_inode *ip)
+{
+ struct osi_node *n, *next;
+ struct duptree *dt;
+ osi_list_t *tmp, *x;
+ struct inode_with_dups *id;
+ int found;
+
+ for (n = osi_first(&dup_blocks); n; n = next) {
+ next = osi_next(n);
+ dt = (struct duptree *)n;
+
+ found = 0;
+ id = NULL;
+
+ osi_list_foreach_safe(tmp, &dt->ref_invinode_list, x) {
+ id = osi_list_entry(tmp, struct inode_with_dups, list);
+ if (id->block_no == ip->i_di.di_num.no_addr) {
+ dup_listent_delete(dt, id);
+ found = 1;
+ }
+ }
+ osi_list_foreach_safe(tmp, &dt->ref_inode_list, x) {
+ id = osi_list_entry(tmp, struct inode_with_dups, list);
+ if (id->block_no == ip->i_di.di_num.no_addr) {
+ dup_listent_delete(dt, id);
+ found = 1;
+ }
+ }
+ if (!found)
+ continue;
+
+ if (dt->refs == 0) {
+ log_debug(_("This was the last reference: 0x%llx is "
+ "no longer a duplicate.\n"),
+ (unsigned long long)dt->block);
+ dup_delete(dt); /* not duplicate now */
+ } else {
+ log_debug(_("%d references remain to 0x%llx\n"),
+ dt->refs, (unsigned long long)dt->block);
+ if (dt->refs > 1)
+ continue;
+
+ id = NULL;
+ osi_list_foreach(tmp, &dt->ref_invinode_list)
+ id = osi_list_entry(tmp,
+ struct inode_with_dups,
+ list);
+ osi_list_foreach(tmp, &dt->ref_inode_list)
+ id = osi_list_entry(tmp,
+ struct inode_with_dups,
+ list);
+ if (id)
+ log_debug("Last reference is from inode "
+ "0x%llx\n",
+ (unsigned long long)id->block_no);
+ }
+ }
+}
+
/*
* resolve_dup_references - resolve all but the last dinode that has a
* duplicate reference to a given block.
*
* @sdp - pointer to the superblock structure
- * @b - pointer to the duplicate reference rbtree to use
+ * @dt - pointer to the duplicate reference rbtree to use
* @ref_list - list of duplicate references to be resolved (invalid or valid)
* @dh - duplicate handler
* inval - The references on this ref_list are invalid. We prefer to delete
@@ -439,40 +125,42 @@ static void log_inode_reference(struct duptree *dt, osi_list_t *tmp, int inval)
* acceptable_ref - Delete dinodes that reference the given block as anything
* _but_ this type. Try to save references as this type.
*/
-static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *dt,
- osi_list_t *ref_list, struct dup_handler *dh,
- int inval, int acceptable_ref)
+static void resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *dt,
+ osi_list_t *ref_list,
+ struct dup_handler *dh,
+ int inval, int acceptable_ref)
{
struct gfs2_inode *ip;
struct inode_with_dups *id;
osi_list_t *tmp, *x;
- struct metawalk_fxns clear_dup_fxns = {
+ struct metawalk_fxns pass1b_fxns_delete = {
.private = NULL,
- .check_leaf = clear_leaf,
- .check_metalist = clear_dup_metalist,
- .check_data = clear_dup_data,
- .check_eattr_indir = clear_dup_eattr_indir,
- .check_eattr_leaf = clear_dup_eattr_leaf,
- .check_dentry = NULL,
- .check_eattr_entry = clear_eattr_entry,
- .check_eattr_extentry = clear_eattr_extentry,
+ .check_metalist = delete_metadata,
+ .check_data = delete_data,
+ .check_leaf = delete_leaf,
+ .check_eattr_indir = delete_eattr_indir,
+ .check_eattr_leaf = delete_eattr_leaf,
+ .check_eattr_entry = delete_eattr_entry,
+ .check_eattr_extentry = delete_eattr_extentry,
};
enum dup_ref_type this_ref;
struct inode_info *ii;
int found_good_ref = 0;
+ uint64_t dup_block;
+ uint8_t q;
osi_list_foreach_safe(tmp, ref_list, x) {
if (skip_this_pass || fsck_abort)
- return FSCK_OK;
+ return;
id = osi_list_entry(tmp, struct inode_with_dups, list);
dh->dt = dt;
- dh->id = id;
if (dh->ref_inode_count == 1) /* down to the last reference */
- return 1;
+ return;
this_ref = get_ref_type(id);
+ q = block_type(id->block_no);
if (inval)
log_warn( _("Invalid "));
/* FIXME: If we already found an acceptable reference to this
@@ -484,11 +172,8 @@ static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *dt,
type and */
this_ref == acceptable_ref && /* this ref is acceptable */
!found_good_ref) { /* We haven't found a good reference */
- uint8_t q;
-
/* If this is an invalid inode, but not on the invalid
list, it's better to delete it. */
- q = block_type(id->block_no);
if (q != gfs2_inode_invalid) {
found_good_ref = 1;
log_warn( _("Inode %s (%lld/0x%llx)'s "
@@ -526,69 +211,124 @@ static int resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *dt,
(unsigned long long)id->block_no))) {
log_warn( _("The bad inode was not cleared."));
/* delete the list entry so we don't leak memory but
- leave the reference count. If the decrement the
+ leave the reference count. If we decrement the
ref count, we could get down to 1 and the dinode
would be changed without a 'Yes' answer. */
/* (dh->ref_inode_count)--;*/
- dup_listent_delete(id);
+ dup_listent_delete(dt, id);
continue;
}
- log_warn( _("Clearing inode %lld (0x%llx)...\n"),
- (unsigned long long)id->block_no,
- (unsigned long long)id->block_no);
-
+ if (q == gfs2_block_free)
+ log_warn( _("Inode %lld (0x%llx) was previously "
+ "deleted.\n"),
+ (unsigned long long)id->block_no,
+ (unsigned long long)id->block_no);
+ else
+ log_warn(_("Pass1b is deleting inode %lld (0x%llx).\n"),
+ (unsigned long long)id->block_no,
+ (unsigned long long)id->block_no);
+
+ dup_block = id->block_no;
ip = fsck_load_inode(sdp, id->block_no);
- if (id->reftypecount[ref_as_data] ||
- id->reftypecount[ref_as_meta]) {
- ii = inodetree_find(ip->i_di.di_num.no_addr);
- if (ii)
- inodetree_delete(ii);
- }
- clear_dup_fxns.private = (void *) dh;
- /* Clear the EAs for the inode first */
- check_inode_eattr(ip, &clear_dup_fxns);
- /* If the dup was in data or metadata, clear the dinode */
- if (id->reftypecount[ref_as_data] ||
- id->reftypecount[ref_as_meta]) {
- check_metatree(ip, &clear_dup_fxns);
- fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
- _("duplicate referencing bad"),
- gfs2_inode_invalid);
+ /* If we've already deleted this dinode, don't try to delete
+ it again. That could free blocks that used to be duplicate
+ references that are now resolved (and gone). */
+ if (q != gfs2_block_free) {
+ /* Clear the EAs for the inode first */
+ check_inode_eattr(ip, &pass1b_fxns_delete);
+ /* If the reference was as metadata or data, we've got
+ a corrupt dinode that will be deleted. */
+ if (inval || id->reftypecount[ref_as_data] ||
+ id->reftypecount[ref_as_meta]) {
+ /* Remove the inode from the inode tree */
+ ii = inodetree_find(ip->i_di.di_num.no_addr);
+ if (ii)
+ inodetree_delete(ii);
+ fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
+ _("duplicate referencing bad"),
+ gfs2_inode_invalid);
+ /* We delete the dup_handler inode count and
+ duplicate id BEFORE clearing the metadata,
+ because if this is the last reference to
+ this metadata block, we need to traverse the
+ tree and free the data blocks it references.
+ However, we don't want to delete other
+ duplicates that may be used by other
+ dinodes. */
+ (dh->ref_inode_count)--;
+ /* FIXME: other option should be to duplicate
+ the block for each duplicate and point the
+ metadata at the cloned blocks */
+ check_metatree(ip, &pass1b_fxns_delete);
+ }
}
+ /* Now we've got to go through and delete any other duplicate
+ references from this dinode we're deleting. If we don't,
+ pass1b will discover the other duplicate record, try to
+ delete this dinode a second time, and this time its earlier
+ duplicate references won't be seen as duplicates anymore
+ (because they were eliminated earlier in pass1b). And so
+ the blocks will be mistakenly freed, when, in fact, they're
+ still being referenced by a valid dinode. */
+ delete_all_dups(ip);
fsck_inode_put(&ip); /* out, brelse, free */
- (dh->ref_inode_count)--;
- /* FIXME: other option should be to duplicate the
- * block for each duplicate and point the metadata at
- * the cloned blocks */
- dup_listent_delete(id);
}
- if (dh->ref_inode_count == 1) /* down to the last reference */
- return 1;
- return 0;
+ return;
}
-static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *dt)
+/* revise_dup_handler - get current information about a duplicate reference
+ *
+ * Function resolve_dup_references can delete dinodes that reference blocks
+ * which may have duplicate references. Therefore, the duplicate tree is
+ * constantly being changed. This function revises the duplicate handler so
+ * that it accurately matches what's in the duplicate tree regarding this block
+ */
+static void revise_dup_handler(uint64_t dup_blk, struct dup_handler *dh)
{
- struct gfs2_inode *ip;
osi_list_t *tmp;
+ struct duptree *dt;
struct inode_with_dups *id;
- struct dup_handler dh = {0};
- int last_reference = 0;
- struct gfs2_buffer_head *bh;
- uint32_t cmagic, ctype;
- enum dup_ref_type acceptable_ref;
+ dh->ref_inode_count = 0;
+ dh->ref_count = 0;
+ dh->dt = NULL;
+
+ dt = dupfind(dup_blk);
+ if (!dt)
+ return;
+
+ dh->dt = dt;
/* Count the duplicate references, both valid and invalid */
osi_list_foreach(tmp, &dt->ref_invinode_list) {
id = osi_list_entry(tmp, struct inode_with_dups, list);
- dh.ref_inode_count++;
- dh.ref_count += id->dup_count;
+ dh->ref_inode_count++;
+ dh->ref_count += id->dup_count;
}
osi_list_foreach(tmp, &dt->ref_inode_list) {
id = osi_list_entry(tmp, struct inode_with_dups, list);
- dh.ref_inode_count++;
- dh.ref_count += id->dup_count;
+ dh->ref_inode_count++;
+ dh->ref_count += id->dup_count;
}
+}
+
+/* handle_dup_blk - handle a duplicate block reference.
+ *
+ * This function should resolve and delete the duplicate block reference given,
+ * iow dt.
+ */
+static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *dt)
+{
+ osi_list_t *tmp;
+ struct gfs2_inode *ip;
+ struct inode_with_dups *id;
+ struct dup_handler dh = {0};
+ struct gfs2_buffer_head *bh;
+ uint32_t cmagic, ctype;
+ enum dup_ref_type acceptable_ref;
+ uint64_t dup_blk;
+
+ dup_blk = dt->block;
+ revise_dup_handler(dup_blk, &dh);
/* Log the duplicate references */
log_notice( _("Block %llu (0x%llx) has %d inodes referencing it"
@@ -642,77 +382,67 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *dt)
invalidated for other reasons, such as bad pointers. So we need to
make sure at this point that any inode deletes reverse out any
duplicate reference before we get to this point. */
- if (dh.ref_count == 1)
- last_reference = 1;
/* Step 1 - eliminate references from inodes that are not valid.
* This may be because they were deleted due to corruption.
* All block types are unacceptable, so we use ref_types.
*/
- if (!last_reference) {
+ if (dh.ref_count > 1) {
log_debug( _("----------------------------------------------\n"
"Step 1: Eliminate references to block %llu "
"(0x%llx) that were previously marked "
"invalid.\n"),
(unsigned long long)dt->block,
(unsigned long long)dt->block);
- last_reference = resolve_dup_references(sdp, dt,
- &dt->ref_invinode_list,
- &dh, 1, ref_types);
+ resolve_dup_references(sdp, dt, &dt->ref_invinode_list,
+ &dh, 1, ref_types);
+ revise_dup_handler(dup_blk, &dh);
}
/* Step 2 - eliminate reference from inodes that reference it as the
* wrong type. For example, a data file referencing it as
* a data block, but it's really a metadata block. Or a
* directory inode referencing a data block as a leaf block.
*/
- if (!last_reference) {
+ if (dh.ref_count > 1) {
log_debug( _("----------------------------------------------\n"
"Step 2: Eliminate references to block %llu "
"(0x%llx) that need the wrong block type.\n"),
(unsigned long long)dt->block,
(unsigned long long)dt->block);
- last_reference = resolve_dup_references(sdp, dt,
- &dt->ref_inode_list,
- &dh, 0,
- acceptable_ref);
+ resolve_dup_references(sdp, dt, &dt->ref_inode_list, &dh, 0,
+ acceptable_ref);
+ revise_dup_handler(dup_blk, &dh);
}
/* Step 3 - We have multiple dinodes referencing it as the correct
* type. Just blast one of them.
* All block types are fair game, so we use ref_types.
*/
- if (!last_reference) {
+ if (dh.ref_count > 1) {
log_debug( _("----------------------------------------------\n"
"Step 3: Choose one reference to block %llu "
"(0x%llx) to keep.\n"),
(unsigned long long)dt->block,
(unsigned long long)dt->block);
- last_reference = resolve_dup_references(sdp, dt,
- &dt->ref_inode_list,
- &dh, 0, ref_types);
- }
- /* Now fix the block type of the block in question. */
- if (osi_list_empty(&dt->ref_inode_list)) {
- log_notice( _("Block %llu (0x%llx) has no more references; "
- "Marking as 'free'.\n"),
- (unsigned long long)dt->block,
- (unsigned long long)dt->block);
- gfs2_blockmap_set(bl, dt->block, gfs2_block_free);
- check_n_fix_bitmap(sdp, dt->block, gfs2_block_free);
- return 0;
+ resolve_dup_references(sdp, dt, &dt->ref_inode_list, &dh, 0,
+ ref_types);
+ revise_dup_handler(dup_blk, &dh);
}
- if (last_reference) {
+ /* If there's still a last remaining reference, and it's a valid
+ reference, use it to determine the correct block type for our
+ blockmap and bitmap. */
+ if (dh.ref_count == 1 && !osi_list_empty(&dt->ref_inode_list)) {
uint8_t q;
log_notice( _("Block %llu (0x%llx) has only one remaining "
- "reference.\n"),
- (unsigned long long)dt->block,
- (unsigned long long)dt->block);
+ "valid reference.\n"),
+ (unsigned long long)dup_blk,
+ (unsigned long long)dup_blk);
/* If we're down to a single reference (and not all references
deleted, which may be the case of an inode that has only
itself and a reference), we need to reset the block type
from invalid to data or metadata. Start at the first one
in the list, not the structure's place holder. */
- tmp = (&dt->ref_inode_list)->next;
+ tmp = dt->ref_inode_list.next;
id = osi_list_entry(tmp, struct inode_with_dups, list);
log_debug( _("----------------------------------------------\n"
"Step 4. Set block type based on the remaining "
@@ -753,13 +483,147 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *dt)
"attribute"),
gfs2_meta_eattr);
fsck_inode_put(&ip); /* out, brelse, free */
+ log_debug(_("Done with duplicate reference to block 0x%llx\n"),
+ (unsigned long long)dt->block);
+ dup_delete(dt);
} else {
/* They may have answered no and not fixed all references. */
- log_debug( _("All duplicate references were processed.\n"));
+ log_debug( _("All duplicate references to block 0x%llx were "
+ "processed.\n"), (unsigned long long)dup_blk);
+ if (dh.ref_count) {
+ log_debug(_("Done with duplicate reference to block "
+ "0x%llx, but %d references remain.\n"),
+ (unsigned long long)dup_blk, dh.ref_count);
+ } else {
+ log_notice( _("Block %llu (0x%llx) has no more "
+ "references; Marking as 'free'.\n"),
+ (unsigned long long)dup_blk,
+ (unsigned long long)dup_blk);
+ if (dh.dt)
+ dup_delete(dh.dt);
+ /* Now fix the block type of the block in question. */
+ gfs2_blockmap_set(bl, dup_blk, gfs2_block_free);
+ check_n_fix_bitmap(sdp, dup_blk, gfs2_block_free);
+ }
}
return 0;
}
+static int check_leaf_refs(struct gfs2_inode *ip, uint64_t block,
+ void *private)
+{
+ return add_duplicate_ref(ip, block, ref_as_meta, 1, INODE_VALID);
+}
+
+static int check_metalist_refs(struct gfs2_inode *ip, uint64_t block,
+ struct gfs2_buffer_head **bh, int h,
+ int *is_valid, int *was_duplicate,
+ void *private)
+{
+ *was_duplicate = 0;
+ *is_valid = 1;
+ return add_duplicate_ref(ip, block, ref_as_meta, 1, INODE_VALID);
+}
+
+static int check_data_refs(struct gfs2_inode *ip, uint64_t metablock,
+ uint64_t block, void *private)
+{
+ return add_duplicate_ref(ip, block, ref_as_data, 1, INODE_VALID);
+}
+
+static int check_eattr_indir_refs(struct gfs2_inode *ip, uint64_t block,
+ uint64_t parent,
+ struct gfs2_buffer_head **bh, void *private)
+{
+ struct gfs2_sbd *sdp = ip->i_sbd;
+ int error;
+
+ error = add_duplicate_ref(ip, block, ref_as_ea, 1, INODE_VALID);
+ if (!error)
+ *bh = bread(sdp, block);
+
+ return error;
+}
+
+static int check_eattr_leaf_refs(struct gfs2_inode *ip, uint64_t block,
+ uint64_t parent, struct gfs2_buffer_head **bh,
+ void *private)
+{
+ struct gfs2_sbd *sdp = ip->i_sbd;
+ int error;
+
+ error = add_duplicate_ref(ip, block, ref_as_ea, 1, INODE_VALID);
+ if (!error)
+ *bh = bread(sdp, block);
+ return error;
+}
+
+static int check_eattr_entry_refs(struct gfs2_inode *ip,
+ struct gfs2_buffer_head *leaf_bh,
+ struct gfs2_ea_header *ea_hdr,
+ struct gfs2_ea_header *ea_hdr_prev,
+ void *private)
+{
+ return 0;
+}
+
+static int check_eattr_extentry_refs(struct gfs2_inode *ip,
+ uint64_t *ea_data_ptr,
+ struct gfs2_buffer_head *leaf_bh,
+ struct gfs2_ea_header *ea_hdr,
+ struct gfs2_ea_header *ea_hdr_prev,
+ void *private)
+{
+ uint64_t block = be64_to_cpu(*ea_data_ptr);
+
+ return add_duplicate_ref(ip, block, ref_as_ea, 1, INODE_VALID);
+}
+
+/* Finds all references to duplicate blocks in the metadata */
+static int find_block_ref(struct gfs2_sbd *sdp, uint64_t inode)
+{
+ struct gfs2_inode *ip;
+ int error = 0;
+ struct metawalk_fxns find_refs = {
+ .private = NULL,
+ .check_leaf = check_leaf_refs,
+ .check_metalist = check_metalist_refs,
+ .check_data = check_data_refs,
+ .check_eattr_indir = check_eattr_indir_refs,
+ .check_eattr_leaf = check_eattr_leaf_refs,
+ .check_eattr_entry = check_eattr_entry_refs,
+ .check_eattr_extentry = check_eattr_extentry_refs,
+ };
+
+ ip = fsck_load_inode(sdp, inode); /* bread, inode_get */
+
+ /* double-check the meta header just to be sure it's metadata */
+ if (ip->i_di.di_header.mh_magic != GFS2_MAGIC ||
+ ip->i_di.di_header.mh_type != GFS2_METATYPE_DI) {
+ log_debug( _("Block %lld (0x%llx) is not gfs2 metadata.\n"),
+ (unsigned long long)inode,
+ (unsigned long long)inode);
+ error = 1;
+ goto out;
+ }
+ /* Check to see if this inode was referenced by another by mistake */
+ add_duplicate_ref(ip, inode, ref_is_inode, 1, INODE_VALID);
+
+ /* Check this dinode's metadata for references to known duplicates */
+ error = check_metatree(ip, &find_refs);
+ if (error < 0)
+ stack;
+
+ /* Check for ea references in the inode */
+ if (!error)
+ error = check_inode_eattr(ip, &find_refs);
+
+out:
+ fsck_inode_put(&ip); /* out, brelse, free */
+ return error;
+}
+
/* Pass 1b handles finding the previous inode for a duplicate block
* When found, store the inodes pointing to the duplicate block for
* use in pass2 */
@@ -768,7 +632,7 @@ int pass1b(struct gfs2_sbd *sdp)
struct duptree *dt;
uint64_t i;
uint8_t q;
- struct osi_node *n, *next = NULL;
+ struct osi_node *n;
int rc = FSCK_OK;
log_info( _("Looking for duplicate blocks...\n"));
@@ -819,17 +683,11 @@ int pass1b(struct gfs2_sbd *sdp)
* it later */
log_info( _("Handling duplicate blocks\n"));
out:
- for (n = osi_first(&dup_blocks); n; n = next) {
- next = osi_next(n);
+ /* Resolve all duplicates by clearing out the dup tree */
+ while ((n = osi_first(&dup_blocks))) {
dt = (struct duptree *)n;
if (!skip_this_pass && !rc) /* no error & not asked to skip the rest */
handle_dup_blk(sdp, dt);
- /* Do not attempt to free the dup_blocks list or its parts
- here because any func that calls check_metatree needs
- to check duplicate status based on this linked list.
- This is especially true for pass2 where it may delete "bad"
- inodes, and we can't delete an inode's indirect block if
- it was a duplicate (therefore in use by another dinode). */
}
return rc;
}
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index e38841e..8b38b43 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -169,59 +169,6 @@ static int check_file_type(uint8_t de_type, uint8_t blk_type, int gfs1)
return 0;
}
-static int delete_eattr_entry (struct gfs2_inode *ip,
- struct gfs2_buffer_head *leaf_bh,
- struct gfs2_ea_header *ea_hdr,
- struct gfs2_ea_header *ea_hdr_prev,
- void *private)
-{
- struct gfs2_sbd *sdp = ip->i_sbd;
- char ea_name[256];
-
- if (!ea_hdr->ea_name_len){
- /* Skip this entry for now */
- return 1;
- }
-
- memset(ea_name, 0, sizeof(ea_name));
- strncpy(ea_name, (char *)ea_hdr + sizeof(struct gfs2_ea_header),
- ea_hdr->ea_name_len);
-
- if (!GFS2_EATYPE_VALID(ea_hdr->ea_type) &&
- ((ea_hdr_prev) || (!ea_hdr_prev && ea_hdr->ea_type))){
- /* Skip invalid entry */
- return 1;
- }
-
- if (ea_hdr->ea_num_ptrs){
- uint32_t avail_size;
- int max_ptrs;
-
- avail_size = sdp->sd_sb.sb_bsize - sizeof(struct gfs2_meta_header);
- max_ptrs = (be32_to_cpu(ea_hdr->ea_data_len) + avail_size - 1) /
- avail_size;
-
- if (max_ptrs > ea_hdr->ea_num_ptrs)
- return 1;
- else {
- log_debug( _(" Pointers Required: %d\n Pointers Reported: %d\n"),
- max_ptrs, ea_hdr->ea_num_ptrs);
- }
- }
- return 0;
-}
-
-static int delete_eattr_extentry(struct gfs2_inode *ip, uint64_t *ea_data_ptr,
- struct gfs2_buffer_head *leaf_bh,
- struct gfs2_ea_header *ea_hdr,
- struct gfs2_ea_header *ea_hdr_prev,
- void *private)
-{
- uint64_t block = be64_to_cpu(*ea_data_ptr);
-
- return delete_metadata(ip, block, NULL, 0, private);
-}
-
struct metawalk_fxns pass2_fxns_delete = {
.private = NULL,
.check_metalist = delete_metadata,
@@ -1836,12 +1783,5 @@ int pass2(struct gfs2_sbd *sdp)
}
fsck_inode_put(&ip); /* does a gfs2_dinode_out, brelse */
}
- /* Now that we've deleted the inodes marked "bad" we can safely
- get rid of the duplicate block list. If we do it any sooner,
- we won't discover that a given block is a duplicate and avoid
- deleting it from both inodes referencing it. Note: The other
- returns from this function are premature exits of the program
- and gfs2_block_list_destroy should get rid of the list for us. */
- gfs2_dup_free();
return FSCK_OK;
}
diff --git a/gfs2/fsck/util.c b/gfs2/fsck/util.c
index ef59e6e..9d6f163 100644
--- a/gfs2/fsck/util.c
+++ b/gfs2/fsck/util.c
@@ -466,8 +466,39 @@ struct dir_info *dirtree_find(uint64_t block)
return NULL;
}
-void dup_listent_delete(struct inode_with_dups *id)
+/* get_ref_type - figure out if all duplicate references from this inode
+ are the same type, and if so, return the type. */
+enum dup_ref_type get_ref_type(struct inode_with_dups *id)
{
+ enum dup_ref_type t, i;
+ int found_type_with_ref;
+ int found_other_types;
+
+ for (t = ref_as_data; t < ref_types; t++) {
+ found_type_with_ref = 0;
+ found_other_types = 0;
+ for (i = ref_as_data; i < ref_types; i++) {
+ if (id->reftypecount[i]) {
+ if (t == i)
+ found_type_with_ref = 1;
+ else
+ found_other_types = 1;
+ }
+ }
+ if (found_type_with_ref)
+ return found_other_types ? ref_types : t;
+ }
+ return ref_types;
+}
+
+void dup_listent_delete(struct duptree *dt, struct inode_with_dups *id)
+{
+ log_err( _("Removing duplicate reference to block %llu (0x%llx) "
+ "referenced as %s by dinode %llu (0x%llx)\n"),
+ (unsigned long long)dt->block, (unsigned long long)dt->block,
+ reftypes[get_ref_type(id)], (unsigned long long)id->block_no,
+ (unsigned long long)id->block_no);
+ dt->refs--; /* one less reference */
if (id->name)
free(id->name);
osi_list_del(&id->list);
@@ -482,12 +513,12 @@ void dup_delete(struct duptree *dt)
while (!osi_list_empty(&dt->ref_invinode_list)) {
tmp = (&dt->ref_invinode_list)->next;
id = osi_list_entry(tmp, struct inode_with_dups, list);
- dup_listent_delete(id);
+ dup_listent_delete(dt, id);
}
while (!osi_list_empty(&dt->ref_inode_list)) {
tmp = (&dt->ref_inode_list)->next;
id = osi_list_entry(tmp, struct inode_with_dups, list);
- dup_listent_delete(id);
+ dup_listent_delete(dt, id);
}
osi_erase(&dt->node, &dup_blocks);
free(dt);
diff --git a/gfs2/fsck/util.h b/gfs2/fsck/util.h
index 00c2239..361b1a2 100644
--- a/gfs2/fsck/util.h
+++ b/gfs2/fsck/util.h
@@ -19,7 +19,7 @@ int add_duplicate_ref(struct gfs2_inode *ip, uint64_t block,
enum dup_ref_type reftype, int first, int inode_valid);
extern struct inode_with_dups *find_dup_ref_inode(struct duptree *dt,
struct gfs2_inode *ip);
-extern void dup_listent_delete(struct inode_with_dups *id);
+extern void dup_listent_delete(struct duptree *dt, struct inode_with_dups *id);
extern const char *reftypes[ref_types + 1];
@@ -174,6 +174,7 @@ static inline uint32_t gfs_to_gfs2_mode(struct gfs2_inode *ip)
}
}
+extern enum dup_ref_type get_ref_type(struct inode_with_dups *id);
extern struct gfs2_bmap *gfs2_bmap_create(struct gfs2_sbd *sdp, uint64_t size,
uint64_t *addl_mem_needed);
extern void *gfs2_bmap_destroy(struct gfs2_sbd *sdp, struct gfs2_bmap *il);
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 41/47] fsck.gfs2: Remove all bad eattr blocks
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, bad extended attributes were not properly removed
from a dinode, and blocks were not freed. This patch properly
removes them all.
rhbz#872564
---
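For readers following along, here is a minimal sketch of why clearing the
eattr pointer alone is not enough. It is a toy model, not the fsck.gfs2 code:
struct toy_inode, toy_delete_eattrs() and the in_use[] array are hypothetical
stand-ins for the dinode, the delete walk and the bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_LEAVES 4

/* Hypothetical, simplified dinode: one eattr block that points at leaves. */
struct toy_inode {
	uint64_t di_eattr;               /* eattr block number, 0 if none */
	uint64_t leaves[TOY_MAX_LEAVES]; /* leaf blocks the eattr block references */
	size_t nleaves;
	uint64_t di_blocks;              /* blocks charged to this inode */
};

/* Free every eattr block, then clear the pointer. Setting di_eattr = 0
   without this walk would leave the blocks marked in use forever.
   Returns the number of blocks freed. */
static size_t toy_delete_eattrs(struct toy_inode *ip, uint8_t *in_use,
				size_t nblocks)
{
	size_t freed = 0;

	assert(ip && in_use);
	if (ip->di_eattr == 0)
		return 0; /* eattr was removed prior to this call */

	for (size_t i = 0; i < ip->nleaves; i++) {
		uint64_t blk = ip->leaves[i];

		if (blk < nblocks && in_use[blk]) {
			in_use[blk] = 0;
			freed++;
		}
	}
	ip->nleaves = 0;

	if (ip->di_eattr < nblocks && in_use[ip->di_eattr]) {
		in_use[ip->di_eattr] = 0;
		freed++;
	}
	ip->di_eattr = 0;

	assert(ip->di_blocks >= freed);
	ip->di_blocks -= freed;
	return freed;
}
```

The early return mirrors the new guard in ask_remove_inode_eattr(): once the
eattr pointer is zero, a second removal attempt has nothing left to do.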
gfs2/fsck/pass1.c | 12 ++++++++----
gfs2/fsck/pass1c.c | 8 +++++++-
gfs2/fsck/pass2.c | 18 ++++++++++--------
3 files changed, 25 insertions(+), 13 deletions(-)
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index ee7e2c5..ad6690b 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -487,6 +487,8 @@ static int remove_inode_eattr(struct gfs2_inode *ip, struct block_count *bc)
static int ask_remove_inode_eattr(struct gfs2_inode *ip,
struct block_count *bc)
{
+ if (ip->i_di.di_eattr == 0)
+ return 0; /* eattr was removed prior to this call */
log_err( _("Inode %lld (0x%llx) has unrecoverable Extended Attribute "
"errors.\n"), (unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)ip->i_di.di_num.no_addr);
@@ -1074,11 +1076,13 @@ static int handle_ip(struct gfs2_sbd *sdp, struct gfs2_inode *ip)
if (fsck_abort)
return 0;
- error = check_inode_eattr(ip, &pass1_fxns);
+ if (!error) {
+ error = check_inode_eattr(ip, &pass1_fxns);
- if (error &&
- !(ip->i_di.di_flags & GFS2_DIF_EA_INDIRECT))
- ask_remove_inode_eattr(ip, &bc);
+ if (error &&
+ !(ip->i_di.di_flags & GFS2_DIF_EA_INDIRECT))
+ ask_remove_inode_eattr(ip, &bc);
+ }
if (ip->i_di.di_blocks !=
(1 + bc.indir_count + bc.data_count + bc.ea_count)) {
diff --git a/gfs2/fsck/pass1c.c b/gfs2/fsck/pass1c.c
index 26d47d5..b918de1 100644
--- a/gfs2/fsck/pass1c.c
+++ b/gfs2/fsck/pass1c.c
@@ -12,6 +12,12 @@
#include "util.h"
#include "metawalk.h"
+struct metawalk_fxns pass1c_fxns_delete = {
+ .private = NULL,
+ .check_eattr_indir = delete_eattr_indir,
+ .check_eattr_leaf = delete_eattr_leaf,
+};
+
static int remove_eattr_entry(struct gfs2_sbd *sdp,
struct gfs2_buffer_head *leaf_bh,
struct gfs2_ea_header *curr,
@@ -62,7 +68,7 @@ static int ask_remove_eattr_entry(struct gfs2_sbd *sdp,
static int ask_remove_eattr(struct gfs2_inode *ip)
{
if (query( _("Remove the bad Extended Attribute? (y/n) "))) {
- ip->i_di.di_eattr = 0;
+ check_inode_eattr(ip, &pass1c_fxns_delete);
bmodified(ip->i_bh);
log_err( _("Bad Extended Attribute removed.\n"));
return 1;
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index 8b38b43..5c27a35 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -20,11 +20,13 @@
#define MAX_FILENAME 256
-struct metawalk_fxns clear_eattrs = {
+struct metawalk_fxns pass2_fxns;
+
+struct metawalk_fxns delete_eattrs = {
.check_eattr_indir = delete_eattr_indir,
.check_eattr_leaf = delete_eattr_leaf,
- .check_eattr_entry = clear_eattr_entry,
- .check_eattr_extentry = clear_eattr_extentry,
+ .check_eattr_entry = delete_eattr_entry,
+ .check_eattr_extentry = delete_eattr_extentry,
};
/* Set children's parent inode in dir_info structure - ext2 does not set
@@ -599,7 +601,7 @@ static int basic_dentry_checks(struct gfs2_inode *ip, struct gfs2_dirent *dent,
entry_ip = ip;
else
entry_ip = fsck_load_inode(sdp, entry->no_addr);
- check_inode_eattr(entry_ip, &clear_eattrs);
+ check_inode_eattr(entry_ip, &delete_eattrs);
if (entry_ip != ip)
fsck_inode_put(&entry_ip);
return 1;
@@ -683,7 +685,7 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
entry_ip = ip;
else
entry_ip = fsck_load_inode(sdp, entry.no_addr);
- check_inode_eattr(entry_ip, &clear_eattrs);
+ check_inode_eattr(entry_ip, &delete_eattrs);
if (entry_ip != ip)
fsck_inode_put(&entry_ip);
goto nuke_dentry;
@@ -714,7 +716,7 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
entry_ip = ip;
else
entry_ip = fsck_load_inode(sdp, entry.no_addr);
- check_inode_eattr(entry_ip, &clear_eattrs);
+ check_inode_eattr(entry_ip, &delete_eattrs);
if (entry_ip != ip)
fsck_inode_put(&entry_ip);
goto nuke_dentry;
@@ -744,7 +746,7 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
entry_ip = ip;
else
entry_ip = fsck_load_inode(sdp, entry.no_addr);
- check_inode_eattr(entry_ip, &clear_eattrs);
+ check_inode_eattr(entry_ip, &delete_eattrs);
if (entry_ip != ip)
fsck_inode_put(&entry_ip);
@@ -764,7 +766,7 @@ static int check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
entry_ip = ip;
else
entry_ip = fsck_load_inode(sdp, entry.no_addr);
- check_inode_eattr(entry_ip, &clear_eattrs);
+ check_inode_eattr(entry_ip, &delete_eattrs);
if (entry_ip != ip)
fsck_inode_put(&entry_ip);
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 42/47] fsck.gfs2: Remove unused variable
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
This patch removes a variable that wasn't being used.
---
gfs2/fsck/pass1b.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index 15a3f3a..9c76eda 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -146,7 +146,6 @@ static void resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *dt,
enum dup_ref_type this_ref;
struct inode_info *ii;
int found_good_ref = 0;
- uint64_t dup_block;
uint8_t q;
osi_list_foreach_safe(tmp, ref_list, x) {
@@ -228,7 +227,6 @@ static void resolve_dup_references(struct gfs2_sbd *sdp, struct duptree *dt,
(unsigned long long)id->block_no,
(unsigned long long)id->block_no);
- dup_block = id->block_no;
ip = fsck_load_inode(sdp, id->block_no);
/* If we've already deleted this dinode, don't try to delete
it again. That could free blocks that used to be duplicate
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 43/47] fsck.gfs2: double-check transitions from dinode to data
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
If a corrupt dinode references a bunch of blocks as data blocks,
and those blocks occur later in the bitmap (as is usually the case)
but they're really dinodes, we have a problem. Before pass1 finds the
corruption, it can change the bitmap markings from 'dinode' to 'data'.
Later, when it determines the dinode is corrupt, it tries
to "undo" all those data blocks, but since pass1 hasn't processed
them yet, it marks them as 'free' in the bitmap, and we've lost the
fact that they're dinodes. The result is that the files/dinodes
being improperly referenced are deleted by mistake.
This patch adds a check for bitmap transitions in pass1 from 'dinode'
to 'data', where the block hasn't been checked yet. We don't care about
transitions from dinode to free because that's a normal delete of a
dinode. We also don't care about transitions from dinode to
metadata, because all those checks validate that the metadata type is
the correct type of metadata, so we know we're making the right
decision. So the only issue is data blocks referencing dinodes.
What this patch does is: when the bitmap is making a transition from
'dinode' to 'data' in pass1, it basically puts up a red flag.
The block is read in and checked to see if it really looks like a
dinode. We have to be careful here, because customer data is allowed
to look like a dinode. If the block really seems to be a dinode, we
DO NOT want to treat it as a data block and assume the duplicate
reference handler in pass1b will handle it, because the dinode's
metadata blocks will not have been checked in pass1.
Instead, we want to flag it as corruption in the referencing file
dinode, not change the bitmap or blockmap, and allow pass1 to treat
it properly as a dinode when it gets there. The corrupt dinode
referencing the dinode as 'data' should be deleted and the work done
thus far should be backed out by the pass1 'undo' functions.
Conflicts:
gfs2/fsck/pass1b.c
---
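A minimal sketch of the dinode-to-data check described above, for
illustration only. It is a toy model under stated assumptions, not the
patch's code: enum toy_state, struct toy_block and toy_set_as_data() are
hypothetical, and the "looks like a dinode" test is reduced to a header flag
plus the block's self-recorded address.

```c
#include <assert.h>
#include <stdint.h>

/* Toy bitmap states for this sketch; not the libgfs2 GFS2_BLKST_* values. */
enum toy_state { TOY_FREE, TOY_DATA, TOY_DINODE };

/* Toy view of a block's contents: whether its header claims to be a dinode,
   plus the address the block records as its own. */
struct toy_block {
	int looks_like_dinode_hdr; /* nonzero if the header says "dinode" */
	uint64_t self_addr;        /* address stored inside the block */
};

/* Mark blk as data unless it appears to be a dinode pass1 has not reached.
   Returns 0 if blk was marked as data, -1 if the reference is corrupt. */
static int toy_set_as_data(enum toy_state *bitmap,
			   const struct toy_block *blocks, uint64_t blk)
{
	assert(bitmap && blocks);
	if (bitmap[blk] == TOY_DINODE &&
	    blocks[blk].looks_like_dinode_hdr &&
	    blocks[blk].self_addr == blk)
		return -1; /* leave the bitmap alone; pass1 will get there */

	/* Either the bitmap did not say dinode, or this is user data that
	   merely resembles one: treat it as ordinary data. */
	bitmap[blk] = TOY_DATA;
	return 0;
}
```

The self-address comparison is what separates a real dinode from customer
data that happens to carry a dinode-like header: copied data would normally
record some other block number.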
gfs2/fsck/metawalk.c | 21 +++++++++++++----
gfs2/fsck/metawalk.h | 14 +++++++----
gfs2/fsck/pass1.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++-----
gfs2/fsck/pass1b.c | 2 +-
gfs2/fsck/pass2.c | 2 +-
gfs2/fsck/pass3.c | 4 ++--
6 files changed, 89 insertions(+), 19 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 19593f3..923a140 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -27,7 +27,7 @@
is used to set the latter. The two must be kept in sync, otherwise
you'll get bitmap mismatches. This function checks the status of the
bitmap whenever the blockmap changes, and fixes it accordingly. */
-int check_n_fix_bitmap(struct gfs2_sbd *sdp, uint64_t blk,
+int check_n_fix_bitmap(struct gfs2_sbd *sdp, uint64_t blk, int error_on_dinode,
enum gfs2_mark_block new_blockmap_state)
{
int old_bitmap_state, new_bitmap_state;
@@ -49,6 +49,16 @@ int check_n_fix_bitmap(struct gfs2_sbd *sdp, uint64_t blk,
/* gfs1 descriptions: */
{"free", "data", "free meta", "metadata", "reserved"}};
+ if (error_on_dinode && old_bitmap_state == GFS2_BLKST_DINODE &&
+ new_bitmap_state != GFS2_BLKST_FREE) {
+ log_debug(_("Reference as '%s' to block %llu (0x%llx) "
+ "which was marked as dinode. Needs "
+ "further investigation.\n"),
+ allocdesc[sdp->gfs1][new_bitmap_state],
+ (unsigned long long)blk,
+ (unsigned long long)blk);
+ return 1;
+ }
/* Keep these messages as short as possible, or the output
gets to be huge and unmanageable. */
log_err( _("Block %llu (0x%llx) was '%s', should be %s.\n"),
@@ -106,6 +116,7 @@ int check_n_fix_bitmap(struct gfs2_sbd *sdp, uint64_t blk,
*/
int _fsck_blockmap_set(struct gfs2_inode *ip, uint64_t bblock,
const char *btype, enum gfs2_mark_block mark,
+ int error_on_dinode,
const char *caller, int fline)
{
int error;
@@ -164,9 +175,11 @@ int _fsck_blockmap_set(struct gfs2_inode *ip, uint64_t bblock,
/* First, check the rgrp bitmap against what we think it should be.
If that fails, it's an invalid block--part of an rgrp. */
- error = check_n_fix_bitmap(ip->i_sbd, bblock, mark);
+ error = check_n_fix_bitmap(ip->i_sbd, bblock, error_on_dinode, mark);
if (error) {
- log_err( _("This block is not represented in the bitmap.\n"));
+ if (error < 0)
+ log_err( _("This block is not represented in the "
+ "bitmap.\n"));
return error;
}
@@ -517,7 +530,7 @@ int check_leaf(struct gfs2_inode *ip, int lindex, struct metawalk_fxns *pass,
if (pass->check_leaf) {
error = pass->check_leaf(ip, *leaf_no, pass->private);
- if (error) {
+ if (error == -EEXIST) {
log_info(_("Previous reference to leaf %lld (0x%llx) "
"has already checked it; skipping.\n"),
(unsigned long long)*leaf_no,
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index 56f57d9..aacb962 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -45,10 +45,12 @@ extern int delete_eattr_extentry(struct gfs2_inode *ip, uint64_t *ea_data_ptr,
void *private);
extern int _fsck_blockmap_set(struct gfs2_inode *ip, uint64_t bblock,
- const char *btype, enum gfs2_mark_block mark,
- const char *caller, int line);
+ const char *btype, enum gfs2_mark_block mark,
+ int error_on_dinode,
+ const char *caller, int line);
extern int check_n_fix_bitmap(struct gfs2_sbd *sdp, uint64_t blk,
- enum gfs2_mark_block new_blockmap_state);
+ int error_on_dinode,
+ enum gfs2_mark_block new_blockmap_state);
extern void reprocess_inode(struct gfs2_inode *ip, const char *desc);
extern struct duptree *dupfind(uint64_t block);
extern struct gfs2_inode *fsck_system_inode(struct gfs2_sbd *sdp,
@@ -63,8 +65,10 @@ extern int repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no, int lindex,
#define is_duplicate(dblock) ((dupfind(dblock)) ? 1 : 0)
-#define fsck_blockmap_set(ip, b, bt, m) _fsck_blockmap_set(ip, b, bt, m, \
- __FUNCTION__, __LINE__)
+#define fsck_blockmap_set(ip, b, bt, m) \
+ _fsck_blockmap_set(ip, b, bt, m, 0, __FUNCTION__, __LINE__)
+#define fsck_blkmap_set_noino(ip, b, bt, m) \
+ _fsck_blockmap_set(ip, b, bt, m, 1, __FUNCTION__, __LINE__)
enum meta_check_rc {
meta_error = -1,
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index ad6690b..ee828d8 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -150,7 +150,7 @@ static int resuscitate_metalist(struct gfs2_inode *ip, uint64_t block,
if (fsck_system_inode(ip->i_sbd, block))
fsck_blockmap_set(ip, block, _("system file"), gfs2_indir_blk);
else
- check_n_fix_bitmap(ip->i_sbd, block, gfs2_indir_blk);
+ check_n_fix_bitmap(ip->i_sbd, block, 0, gfs2_indir_blk);
bc->indir_count++;
return meta_is_good;
}
@@ -204,7 +204,7 @@ static int resuscitate_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
if (fsck_system_inode(sdp, block))
fsck_blockmap_set(ip, block, _("system file"), dinode_type);
else
- check_n_fix_bitmap(sdp, block, dinode_type);
+ check_n_fix_bitmap(sdp, block, 0, dinode_type);
/* Return the number of leaf entries so metawalk doesn't flag this
leaf as having none. */
*count = be16_to_cpu(((struct gfs2_leaf *)bh->b_data)->lf_entries);
@@ -339,6 +339,8 @@ static int undo_reference(struct gfs2_inode *ip, uint64_t block, int meta,
struct block_count *bc = (struct block_count *)private;
struct duptree *dt;
struct inode_with_dups *id;
+ int old_bitmap_state = 0;
+ struct rgrp_tree *rgd;
if (!valid_block(ip->i_sbd, block)) { /* blk outside of FS */
fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
@@ -367,6 +369,12 @@ static int undo_reference(struct gfs2_inode *ip, uint64_t block, int meta,
return 1;
}
}
+ if (!meta) {
+ rgd = gfs2_blk2rgrpd(ip->i_sbd, block);
+ old_bitmap_state = lgfs2_get_bitmap(ip->i_sbd, block, rgd);
+ if (old_bitmap_state == GFS2_BLKST_DINODE)
+ return -1;
+ }
fsck_blockmap_set(ip, block,
meta ? _("bad indirect") : _("referenced data"),
gfs2_block_free);
@@ -385,6 +393,51 @@ static int undo_check_data(struct gfs2_inode *ip, uint64_t block,
return undo_reference(ip, block, 0, private);
}
+/* blockmap_set_as_data - set block as 'data' in the blockmap, if not dinode
+ *
+ * This function tries to set a block that's referenced as data as 'data'
+ * in the fsck blockmap. But if that block is marked as 'dinode' in the
+ * rgrp bitmap, it does additional checks to see if it looks like a dinode.
+ * Note that previous checks were done for duplicate references, so this
+ * is checking for dinodes that we haven't processed yet.
+ */
+static int blockmap_set_as_data(struct gfs2_inode *ip, uint64_t block)
+{
+ int error;
+ struct gfs2_buffer_head *bh;
+ struct gfs2_dinode *di;
+
+ error = fsck_blkmap_set_noino(ip, block, _("data"), gfs2_block_used);
+ if (!error)
+ return 0;
+
+ error = 0;
+ /* The bitmap says it's a dinode, but a block reference begs to differ.
+ So which is it? */
+ bh = bread(ip->i_sbd, block);
+ if (gfs2_check_meta(bh, GFS2_METATYPE_DI) != 0)
+ goto out;
+
+ /* The meta header agrees it's a dinode. But it might be data in
+ disguise, so do some extra checks. */
+ di = (struct gfs2_dinode *)bh->b_data;
+ if (be64_to_cpu(di->di_num.no_addr) != block)
+ goto out;
+
+ log_err(_("Inode %lld (0x%llx) has a reference to block %lld (0x%llx) "
+ "as a data block, but it appears to be a dinode we "
+ "haven't checked yet.\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)block, (unsigned long long)block);
+ error = -1;
+out:
+ if (!error)
+ fsck_blockmap_set(ip, block, _("data"), gfs2_block_used);
+ brelse(bh);
+ return error;
+}
+
static int check_data(struct gfs2_inode *ip, uint64_t metablock,
uint64_t block, void *private)
{
@@ -469,7 +522,7 @@ static int check_data(struct gfs2_inode *ip, uint64_t metablock,
(unsigned long long)block, (unsigned long long)block);
fsck_blockmap_set(ip, block, _("jdata"), gfs2_jdata);
} else
- fsck_blockmap_set(ip, block, _("data"), gfs2_block_used);
+ return blockmap_set_as_data(ip, block);
return 0;
}
@@ -1199,7 +1252,7 @@ static int check_system_inode(struct gfs2_sbd *sdp,
(unsigned long long)iblock,
(unsigned long long)iblock);
gfs2_blockmap_set(bl, iblock, gfs2_block_free);
- check_n_fix_bitmap(sdp, iblock, gfs2_block_free);
+ check_n_fix_bitmap(sdp, iblock, 0, gfs2_block_free);
inode_put(sysinode);
}
}
@@ -1486,7 +1539,7 @@ static int pass1_process_bitmap(struct gfs2_sbd *sdp, struct rgrp_tree *rgd, uin
"%llu (0x%llx)\n"),
(unsigned long long)block,
(unsigned long long)block);
- check_n_fix_bitmap(sdp, block, gfs2_block_free);
+ check_n_fix_bitmap(sdp, block, 0, gfs2_block_free);
} else if (handle_di(sdp, bh) < 0) {
stack;
brelse(bh);
@@ -1596,7 +1649,7 @@ int pass1(struct gfs2_sbd *sdp)
}
/* rgrps and bitmaps don't have bits to represent
their blocks, so don't do this:
- check_n_fix_bitmap(sdp, rgd->ri.ri_addr + i,
+ check_n_fix_bitmap(sdp, rgd->ri.ri_addr + i, 0,
gfs2_meta_rgrp);*/
}
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index 9c76eda..9a23197 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -501,7 +501,7 @@ static int handle_dup_blk(struct gfs2_sbd *sdp, struct duptree *dt)
dup_delete(dh.dt);
/* Now fix the block type of the block in question. */
gfs2_blockmap_set(bl, dup_blk, gfs2_block_free);
- check_n_fix_bitmap(sdp, dup_blk, gfs2_block_free);
+ check_n_fix_bitmap(sdp, dup_blk, 0, gfs2_block_free);
}
}
return 0;
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index 5c27a35..c4b8356 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -1713,7 +1713,7 @@ int pass2(struct gfs2_sbd *sdp)
/* Can't use fsck_blockmap_set here because we don't
have an inode in memory. */
gfs2_blockmap_set(bl, dirblk, gfs2_inode_invalid);
- check_n_fix_bitmap(sdp, dirblk, gfs2_inode_invalid);
+ check_n_fix_bitmap(sdp, dirblk, 0, gfs2_inode_invalid);
}
ip = fsck_load_inode(sdp, dirblk);
if (!ds.dotdir) {
diff --git a/gfs2/fsck/pass3.c b/gfs2/fsck/pass3.c
index 53052b6..4894d8c 100644
--- a/gfs2/fsck/pass3.c
+++ b/gfs2/fsck/pass3.c
@@ -275,7 +275,7 @@ int pass3(struct gfs2_sbd *sdp)
gfs2_blockmap_set(bl, di->dinode.no_addr,
gfs2_block_free);
check_n_fix_bitmap(sdp, di->dinode.no_addr,
- gfs2_block_free);
+ 0, gfs2_block_free);
break;
} else
log_err( _("Unlinked directory with bad block remains\n"));
@@ -299,7 +299,7 @@ int pass3(struct gfs2_sbd *sdp)
because we don't have ip */
gfs2_blockmap_set(bl, di->dinode.no_addr,
gfs2_block_free);
- check_n_fix_bitmap(sdp, di->dinode.no_addr,
+ check_n_fix_bitmap(sdp, di->dinode.no_addr, 0,
gfs2_block_free);
log_err( _("The block was cleared\n"));
break;
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 44/47] fsck.gfs2: Stop "undo" process when error data block is reached
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
When fsck.gfs2 discovers a data block in error, it flags the error
and, especially in pass1, tries to "undo" the block designations
it previously marked in the blockmap. Before this patch, the "undo"
functions didn't know when to stop. So it could "undo" designations
in the blockmap that it hadn't "done" in the first place. With this
patch, if an error is encountered while processing data blocks
(not counting duplicate references--for example, blocks marked as
'data' that are really dinodes which it hasn't gotten to yet) it
saves off the block where the error occurred. Later, during the
"undo" processing, it stops when it reaches the block that flagged
the error.
rhbz#902920
---
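The stop condition can be sketched as follows. This is a toy model, not the
metawalk code: toy_mark(), toy_undo() and the marked[] array are
hypothetical, and the only error modelled is an out-of-range pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "Do" phase: mark each referenced block, stopping at the first
   out-of-range pointer and remembering it in *error_blk. */
static int toy_mark(const uint64_t *ptrs, size_t n, uint8_t *marked,
		    uint64_t nblocks, uint64_t *error_blk)
{
	assert(ptrs && marked && error_blk);
	for (size_t i = 0; i < n; i++) {
		if (ptrs[i] >= nblocks) {
			*error_blk = ptrs[i];
			return -1;
		}
		marked[ptrs[i]] = 1;
	}
	return 0;
}

/* "Undo" phase: clear marks in the same order, but stop at error_blk so
   nothing past the failure point is touched. Returns blocks undone. */
static size_t toy_undo(const uint64_t *ptrs, size_t n, uint8_t *marked,
		       uint64_t error_blk)
{
	size_t undone = 0;

	assert(ptrs && marked);
	for (size_t i = 0; i < n; i++) {
		if (ptrs[i] == error_blk)
			break;
		marked[ptrs[i]] = 0;
		undone++;
	}
	return undone;
}
```

With pointers {2, 3, 100, 5} on an 8-block device, marking fails at 100, and
the undo walk clears 2 and 3 only. Block 5 was never marked on behalf of this
inode, so any designation it already has is left untouched.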
gfs2/fsck/metawalk.c | 36 +++++++++++++++++++++++++++---------
gfs2/fsck/pass1.c | 2 +-
2 files changed, 28 insertions(+), 10 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 923a140..4a2dd50 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -1325,7 +1325,7 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
*/
static int check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
struct gfs2_buffer_head *bh, int head_size,
- uint64_t *blks_checked)
+ uint64_t *blks_checked, uint64_t *error_blk)
{
int error = 0, rc = 0;
uint64_t block, *ptr;
@@ -1349,8 +1349,13 @@ static int check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
rc = pass->check_data(ip, metablock, block, pass->private);
if (!error && rc) {
error = rc;
- log_info(_("\nUnrecoverable data block error %d on "
- "block %llu (0x%llx).\n"), rc,
+ log_info("\n");
+ if (rc < 0) {
+ *error_blk = block;
+ log_info(_("Unrecoverable "));
+ }
+ log_info(_("data block error %d on block %llu "
+ "(0x%llx).\n"), rc,
(unsigned long long)block,
(unsigned long long)block);
}
@@ -1362,7 +1367,8 @@ static int check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
}
static int undo_check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
- uint64_t *ptr_start, char *ptr_end)
+ uint64_t *ptr_start, char *ptr_end,
+ uint64_t error_blk)
{
int rc = 0;
uint64_t block, *ptr;
@@ -1375,6 +1381,8 @@ static int undo_check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
if (skip_this_pass || fsck_abort)
return 1;
block = be64_to_cpu(*ptr);
+ if (block == error_blk)
+ return 1;
rc = pass->undo_check_data(ip, block, pass->private);
if (rc < 0)
return rc;
@@ -1415,6 +1423,8 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
uint64_t blks_checked = 0;
int error, rc;
int metadata_clean = 0;
+ uint64_t error_blk = 0;
+ int hit_error_blk = 0;
if (!height && !is_dir(&ip->i_di, ip->i_sbd->gfs1))
return 0;
@@ -1460,7 +1470,7 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
if (pass->check_data)
rc = check_data(ip, pass, bh, head_size,
- &blks_checked);
+ &blks_checked, &error_blk);
else
rc = 0;
@@ -1505,12 +1515,20 @@ undo_metalist:
i, pass->private);
else
rc = 0;
- if (metadata_clean && rc == 0 && i == height - 1) {
+ if (metadata_clean && rc == 0 && i == height - 1 &&
+ !hit_error_blk) {
head_size = hdr_size(bh, height);
- if (head_size)
- undo_check_data(ip, pass, (uint64_t *)
+ if (head_size) {
+ rc = undo_check_data(ip, pass,
+ (uint64_t *)
(bh->b_data + head_size),
- (bh->b_data + ip->i_sbd->bsize));
+ (bh->b_data + ip->i_sbd->bsize),
+ error_blk);
+ if (rc > 0) {
+ hit_error_blk = 1;
+ rc = 0;
+ }
+ }
}
if (bh == ip->i_bh)
osi_list_del(&bh->b_altlist);
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index ee828d8..2c1c046 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -462,7 +462,7 @@ static int check_data(struct gfs2_inode *ip, uint64_t metablock,
fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
_("bad (out of range) data"),
gfs2_bad_block);
- return 1;
+ return -1;
}
bc->data_count++; /* keep the count sane anyway */
q = block_type(block);
--
1.7.11.7
* [Cluster-devel] [gfs2-utils PATCH 45/47] fsck.gfs2: Don't allocate leaf blocks in pass1
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
Before this patch, if leaf blocks were found to be corrupt, pass1
tried to fix them by allocating new leaf blocks in place of the bad
ones. That's a bad idea, because pass1 populates the blockmap and
sets the bitmap accordingly. In other words, the bitmap is still changing
while pass1 runs. Say, for example, that you're checking a directory at
dinode 0x1234, and
it has a corrupt hash table, and needs new leaf blocks inserted.
Now suppose you have a second directory that occurs later in the bitmap,
say at block 0x2345, and it references leaf block 0x2346, but for some
reason that block (0x2346) is improperly set to "free" in the bitmap.
If pass1 goes out looking for a free block in order to allocate a new
leaf for 0x1234, it will naturally find block 0x2346, because it's
marked free. It writes a new leaf at that block and adds a new
reference in the hash table of 0x1234. Later, when pass1 processes
directory 0x2345, it discovers the reference to 0x2346. Not only has
it wiped out the perfectly good leaf block, it has also created a
duplicate block reference that it needs to sort out in pass1b, which
will likely keep the replaced reference and throw away the good one we
had. Thus, we introduced corruption into the file system when we
should have kept the only good reference to 0x2346 and fixed the
bitmap.
The solution provided by this patch is to simply zero out the bad
hash table entries when pass1 comes across them. Later, when pass2
discovers the zero leaf blocks, it can safely allocate new blocks
(since pass1 synced the bitmap according to the blockmap) for the new
leaf blocks and replace the zeros with valid block references.
rhbz#902920
---
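A rough sketch of the two-phase repair, assuming the hash table is a flat
array of leaf block numbers. toy_zero_entries() and toy_fill_zeroed() are
hypothetical names; the fill step is deliberately simplified and ignores the
alignment rules the real pass2 code has to respect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* pass1-style repair: do not allocate anything, just zero the bad entries. */
static void toy_zero_entries(uint64_t *tbl, size_t tbl_len,
			     size_t lindex, size_t ref_count)
{
	assert(tbl && lindex <= tbl_len && ref_count <= tbl_len - lindex);
	memset(&tbl[lindex], 0, ref_count * sizeof(uint64_t));
}

/* pass2-style repair: the bitmap is trustworthy by now, so each run of
   zeroed entries may safely receive a freshly allocated leaf. The allocator
   is modelled as a simple counter. Returns the number of leaves allocated. */
static size_t toy_fill_zeroed(uint64_t *tbl, size_t tbl_len,
			      uint64_t *next_free)
{
	size_t allocated = 0;
	size_t i = 0;

	assert(tbl && next_free);
	while (i < tbl_len) {
		if (tbl[i] != 0) {
			i++;
			continue;
		}
		uint64_t leaf = (*next_free)++;

		while (i < tbl_len && tbl[i] == 0)
			tbl[i++] = leaf;
		allocated++;
	}
	return allocated;
}
```

The point of the split is ordering: the zeroing step never consults the
bitmap, so it cannot pick up a block such as 0x2346 that is wrongly marked
free while pass1 is still running.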
gfs2/fsck/metawalk.c | 31 ++++++++++++++++++++++++++++++-
gfs2/fsck/metawalk.h | 2 +-
gfs2/fsck/pass1.c | 9 ++-------
gfs2/fsck/pass2.c | 2 +-
4 files changed, 34 insertions(+), 10 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index 4a2dd50..ede32e7 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -1961,7 +1961,7 @@ int write_new_leaf(struct gfs2_inode *dip, int start_lindex, int num_copies,
* leaf a bit, but it's better than deleting the whole directory,
* which is what used to happen before. */
int repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no, int lindex,
- int ref_count, const char *msg)
+ int ref_count, const char *msg, int allow_alloc)
{
int new_leaf_blks = 0, error, refs;
uint64_t bn = 0;
@@ -1976,6 +1976,35 @@ int repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no, int lindex,
log_err( _("Bad leaf left in place.\n"));
goto out;
}
+ if (!allow_alloc) {
+ uint64_t *cpyptr;
+ char *padbuf;
+ int pad_size, i;
+
+ padbuf = malloc(ref_count * sizeof(uint64_t));
+ cpyptr = (uint64_t *)padbuf;
+ for (i = 0; i < ref_count; i++) {
+ *cpyptr = 0;
+ cpyptr++;
+ }
+ pad_size = ref_count * sizeof(uint64_t);
+ log_err(_("Writing zeros to the hash table of directory %lld "
+ "(0x%llx)@index: 0x%x for 0x%x pointers.\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ lindex, ref_count);
+ if (ip->i_sbd->gfs1)
+ gfs1_writei(ip, padbuf, lindex * sizeof(uint64_t),
+ pad_size);
+ else
+ gfs2_writei(ip, padbuf, lindex * sizeof(uint64_t),
+ pad_size);
+ free(padbuf);
+ log_err( _("Directory Inode %llu (0x%llx) patched.\n"),
+ (unsigned long long)ip->i_di.di_num.no_addr,
+ (unsigned long long)ip->i_di.di_num.no_addr);
+ goto out;
+ }
/* We can only write leafs in quantities that are factors of
two, since leaves are doubled, not added sequentially.
So if we have a hole that's not a factor of 2, we have to
diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
index aacb962..a5a51c2 100644
--- a/gfs2/fsck/metawalk.h
+++ b/gfs2/fsck/metawalk.h
@@ -61,7 +61,7 @@ extern int write_new_leaf(struct gfs2_inode *dip, int start_lindex,
int num_copies, const char *before_or_after,
uint64_t *bn);
extern int repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no, int lindex,
- int ref_count, const char *msg);
+ int ref_count, const char *msg, int allow_alloc);
#define is_duplicate(dblock) ((dupfind(dblock)) ? 1 : 0)
diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
index 2c1c046..df778ef 100644
--- a/gfs2/fsck/pass1.c
+++ b/gfs2/fsck/pass1.c
@@ -84,13 +84,8 @@ static int pass1_repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no,
int lindex, int ref_count, const char *msg,
void *private)
{
- struct block_count *bc = (struct block_count *)private;
- int new_leaf_blks;
-
- new_leaf_blks = repair_leaf(ip, leaf_no, lindex, ref_count, msg);
- bc->indir_count += new_leaf_blks;
-
- return new_leaf_blks;
+ repair_leaf(ip, leaf_no, lindex, ref_count, msg, 0);
+ return 0;
}
struct metawalk_fxns pass1_fxns = {
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index c4b8356..fba0f84 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -1040,7 +1040,7 @@ static int pass2_repair_leaf(struct gfs2_inode *ip, uint64_t *leaf_no,
int lindex, int ref_count, const char *msg,
void *private)
{
- return repair_leaf(ip, leaf_no, lindex, ref_count, msg);
+ return repair_leaf(ip, leaf_no, lindex, ref_count, msg, 1);
}
/* The purpose of leafck_fxns is to provide a means for function fix_hashtable
--
1.7.11.7
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 46/47] fsck.gfs2: take hash table start boundaries into account
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (43 preceding siblings ...)
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 45/47] fsck.gfs2: Don't allocate leaf blocks in pass1 Bob Peterson
@ 2013-05-14 16:22 ` Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 47/47] fsck.gfs2: delete all duplicates from unrecoverable damaged dinodes Bob Peterson
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
When checking the hash table in pass2, we can't just keep doubling
the length for each consecutive check because the number of pointer
copies (aka length) is also tied to the starting offset. If the
starting offset is invalid for the length, fsck might treat a chunk of
the hash table as bigger than it should be, eventually overwriting good
entries. Along the same lines, while we're trying to determine the
length, it's not good enough to double the length and check if the
hash table entry matches. The reason is that several entries can be
overwritten with the same value, 0x00, which marks places where
pass1 found an invalid leaf block pointer. To avoid that, we need to
check intermediate values as well, and stop if we find a gap.
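
As a rough illustration of the rule described above (a standalone sketch,
not the fsck.gfs2 code itself: the helper name run_length and the plain
host-endian tbl array are assumptions made for the example), the length of
a run of identical pointers is only doubled when the starting index is
aligned for the doubled length and every intermediate entry matches:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many consecutive copies of tbl[lindex] may be
 * treated as belonging to one leaf, starting at lindex. */
static uint32_t run_length(const uint64_t *tbl, uint32_t hsize,
			   uint32_t lindex)
{
	uint64_t leafblk;
	uint32_t len = 1, dbl, i;

	assert(lindex < hsize);
	leafblk = tbl[lindex];

	while (lindex + (len << 1) - 1 < hsize) {
		dbl = len << 1;

		/* The last entry of the doubled range must match. */
		if (tbl[lindex + dbl - 1] != leafblk)
			break;
		/* lindex must be a valid start for the doubled length. */
		if (lindex != (lindex & ~(dbl - 1)))
			break;
		/* No other values may sit between here and the next
		 * factor of two. */
		for (i = len; i < dbl; i++)
			if (tbl[lindex + i] != leafblk)
				break;
		if (i < dbl)
			break;
		len = dbl;
	}
	return len;
}
```

With the table { 5, 5, 7, 7, 7, 7, 0, 9 }, the run of 7s starting at index 2
is reported as length 2, because index 2 is not a valid start for a length
of 4. With { 0, 0, 3, 0, 1, 1, 1, 1 }, the zeros at index 0 are reported as
length 2, because the 3 at index 2 is a gap even though index 3 matches.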
---
gfs2/fsck/metawalk.c | 5 +++--
gfs2/fsck/pass2.c | 43 ++++++++++++++++++++++++++++++++++---------
2 files changed, 37 insertions(+), 11 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index ede32e7..c7122ac 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -473,11 +473,12 @@ static int check_entries(struct gfs2_inode *ip, struct gfs2_buffer_head *bh,
if ((char *)dent + de.de_rec_len >= bh_end){
log_debug( _("Last entry processed for %lld->%lld "
- "(0x%llx->0x%llx).\n"),
+ "(0x%llx->0x%llx), di_blocks=%llu.\n"),
(unsigned long long)ip->i_di.di_num.no_addr,
(unsigned long long)bh->b_blocknr,
(unsigned long long)ip->i_di.di_num.no_addr,
- (unsigned long long)bh->b_blocknr);
+ (unsigned long long)bh->b_blocknr,
+ (unsigned long long)ip->i_di.di_blocks);
break;
}
diff --git a/gfs2/fsck/pass2.c b/gfs2/fsck/pass2.c
index fba0f84..23046db 100644
--- a/gfs2/fsck/pass2.c
+++ b/gfs2/fsck/pass2.c
@@ -370,9 +370,10 @@ static int wrong_leaf(struct gfs2_inode *ip, struct gfs2_inum *entry,
gfs2_get_leaf_nr(ip, hash_index, &real_leaf);
if (real_leaf != planned_leaf) {
log_err(_("The planned leaf was split. The new leaf "
- "is: %llu (0x%llx)"),
+ "is: %llu (0x%llx). di_blocks=%llu\n"),
(unsigned long long)real_leaf,
- (unsigned long long)real_leaf);
+ (unsigned long long)real_leaf,
+ (unsigned long long)ip->i_di.di_blocks);
fsck_blockmap_set(ip, real_leaf, _("split leaf"),
gfs2_indir_blk);
}
@@ -1032,6 +1033,7 @@ static int basic_check_dentry(struct gfs2_inode *ip, struct gfs2_dirent *dent,
log_err( _("Bad directory entry '%s' cleared.\n"), tmp_name);
return 1;
} else {
+ (*count)++;
return 0;
}
}
@@ -1150,11 +1152,13 @@ static int fix_hashtable(struct gfs2_inode *ip, uint64_t *tbl, unsigned hsize,
/* Look at the first dirent and check its hash value to see if it's
at the proper starting offset. */
hash_index = hash_table_index(dentry.de_hash, ip);
+ /* Need to use len here, not *proper_len because the leaf block may
+ be valid within the range, but starts too soon in the hash table. */
if (hash_index < lindex || hash_index > lindex + len) {
log_err(_("This leaf block has hash index %d, which is out of "
"bounds for where it appears in the hash table "
"(%d - %d)\n"),
- hash_index, lindex, lindex + len);
+ hash_index, lindex, lindex + *proper_len);
error = lost_leaf(ip, tbl, leafblk, len, lindex, lbh);
brelse(lbh);
return error;
@@ -1291,6 +1295,8 @@ static int check_hash_tbl(struct gfs2_inode *ip, uint64_t *tbl,
struct gfs2_buffer_head *lbh;
int factor;
uint32_t proper_start;
+ uint32_t next_proper_start;
+ int anomaly;
lindex = 0;
while (lindex < hsize) {
@@ -1299,10 +1305,23 @@ static int check_hash_tbl(struct gfs2_inode *ip, uint64_t *tbl,
len = 1;
factor = 0;
leafblk = be64_to_cpu(tbl[lindex]);
+ next_proper_start = lindex;
+ anomaly = 0;
while (lindex + (len << 1) - 1 < hsize) {
if (be64_to_cpu(tbl[lindex + (len << 1) - 1]) !=
leafblk)
break;
+ next_proper_start = (lindex & ~((len << 1) - 1));
+ if (lindex != next_proper_start)
+ anomaly = 1;
+ /* Check if there are other values written between
+ here and the next factor. */
+ for (i = len; !anomaly && i + lindex < hsize &&
+ i < (len << 1); i++)
+ if (be64_to_cpu(tbl[lindex + i]) != leafblk)
+ anomaly = 1;
+ if (anomaly)
+ break;
len <<= 1;
factor++;
}
@@ -1344,8 +1363,10 @@ static int check_hash_tbl(struct gfs2_inode *ip, uint64_t *tbl,
proper_start = (lindex & ~(proper_len - 1));
if (lindex != proper_start) {
log_debug(_("lindex 0x%llx is not a proper starting "
- "point for this leaf: 0x%llx\n"),
+ "point for leaf %llu (0x%llx): 0x%llx\n"),
(unsigned long long)lindex,
+ (unsigned long long)leafblk,
+ (unsigned long long)leafblk,
(unsigned long long)proper_start);
changes = fix_hashtable(ip, tbl, hsize, leafblk,
lindex, proper_start, len,
@@ -1368,9 +1389,11 @@ static int check_hash_tbl(struct gfs2_inode *ip, uint64_t *tbl,
depth, and adjust the hash table accordingly. */
if (len != proper_len) {
log_err(_("Length %d (0x%x) is not a proper length "
- "for this leaf. Valid boundary assumed to "
- "be %d (0x%x).\n"),
- len, len, proper_len, proper_len);
+ "for leaf %llu (0x%llx). Valid boundary "
+ "assumed to be %d (0x%x).\n"), len, len,
+ (unsigned long long)leafblk,
+ (unsigned long long)leafblk,
+ proper_len, proper_len);
lbh = bread(ip->i_sbd, leafblk);
gfs2_leaf_in(&leaf, lbh);
brelse(lbh);
@@ -1419,8 +1442,10 @@ static int check_hash_tbl(struct gfs2_inode *ip, uint64_t *tbl,
proper_len = 1 << (ip->i_di.di_depth - leaf.lf_depth);
if (proper_len != len) {
log_debug(_("Length 0x%x is not proper for "
- "this leaf: 0x%x"),
- len, proper_len);
+ "leaf %llu (0x%llx): 0x%x"),
+ len, (unsigned long long)leafblk,
+ (unsigned long long)leafblk,
+ proper_len);
changes = fix_hashtable(ip, tbl, hsize,
leafblk, lindex,
lindex, len,
--
1.7.11.7
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 47/47] fsck.gfs2: delete all duplicates from unrecoverable damaged dinodes
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
` (44 preceding siblings ...)
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 46/47] fsck.gfs2: take hash table start boundaries into account Bob Peterson
@ 2013-05-14 16:22 ` Bob Peterson
45 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-14 16:22 UTC (permalink / raw)
To: cluster-devel.redhat.com
When pass1 encounters a dinode with unrecoverable damage, it tries
to "undo" the metadata and data block designations it marked in the
blockmap prior to finding the damage. That's all fine, but if the
damaged dinode has a duplicate reference, we also need to delete that
from the duplicate reference list. Otherwise pass1b may try to
resolve the duplicate reference and reinstate the damaged dinode.
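
A minimal sketch of the idea, assuming a simplified record in place of the
real duptree and inode_with_dups structures (struct dup_rec, MAX_REFS and
drop_refs are hypothetical names used only for illustration): every
reference held by the damaged dinode is dropped, and the caller can then
see whether the block is still a duplicate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REFS 8	/* illustrative limit, not from gfs2-utils */

/* Hypothetical stand-in for a duplicate-block record. */
struct dup_rec {
	uint64_t block;			/* the multiply-referenced block */
	uint64_t owners[MAX_REFS];	/* dinode addresses referencing it */
	int refs;			/* number of valid owners[] entries */
};

/* Remove every reference held by the dinode at addr.
 * Returns the number of references that remain. */
static int drop_refs(struct dup_rec *dt, uint64_t addr)
{
	int i, kept = 0;

	assert(dt != NULL && dt->refs >= 0 && dt->refs <= MAX_REFS);
	for (i = 0; i < dt->refs; i++)
		if (dt->owners[i] != addr)
			dt->owners[kept++] = dt->owners[i];
	dt->refs = kept;
	return kept;
}
```

If one reference remains after the call, the block belongs to a single
dinode again; if none remain, the record itself can be discarded.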
---
gfs2/fsck/metawalk.c | 5 +++++
gfs2/fsck/pass1b.c | 60 ----------------------------------------------------
gfs2/fsck/util.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++
gfs2/fsck/util.h | 1 +
4 files changed, 65 insertions(+), 60 deletions(-)
diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
index c7122ac..e4e3067 100644
--- a/gfs2/fsck/metawalk.c
+++ b/gfs2/fsck/metawalk.c
@@ -1537,6 +1537,11 @@ undo_metalist:
brelse(bh);
}
}
+ /* There may be leftover duplicate records, so we need to delete them.
+ For example, if a metadata block was found to be a duplicate, we
+ may not have added it to the metalist, which means it's not there
+ to undo. */
+ delete_all_dups(ip);
/* Set the dinode as "bad" so it gets deleted */
fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
_("corrupt"), gfs2_block_free);
diff --git a/gfs2/fsck/pass1b.c b/gfs2/fsck/pass1b.c
index 9a23197..0dcb306 100644
--- a/gfs2/fsck/pass1b.c
+++ b/gfs2/fsck/pass1b.c
@@ -52,66 +52,6 @@ static void log_inode_reference(struct duptree *dt, osi_list_t *tmp, int inval)
}
/* delete_all_dups - delete all duplicate records for a given inode */
-static void delete_all_dups(struct gfs2_inode *ip)
-{
- struct osi_node *n, *next;
- struct duptree *dt;
- osi_list_t *tmp, *x;
- struct inode_with_dups *id;
- int found;
-
- for (n = osi_first(&dup_blocks); n; n = next) {
- next = osi_next(n);
- dt = (struct duptree *)n;
-
- found = 0;
- id = NULL;
-
- osi_list_foreach_safe(tmp, &dt->ref_invinode_list, x) {
- id = osi_list_entry(tmp, struct inode_with_dups, list);
- if (id->block_no == ip->i_di.di_num.no_addr) {
- dup_listent_delete(dt, id);
- found = 1;
- }
- }
- osi_list_foreach_safe(tmp, &dt->ref_inode_list, x) {
- id = osi_list_entry(tmp, struct inode_with_dups, list);
- if (id->block_no == ip->i_di.di_num.no_addr) {
- dup_listent_delete(dt, id);
- found = 1;
- }
- }
- if (!found)
- continue;
-
- if (dt->refs == 0) {
- log_debug(_("This was the last reference: 0x%llx is "
- "no longer a duplicate.\n"),
- (unsigned long long)dt->block);
- dup_delete(dt); /* not duplicate now */
- } else {
- log_debug(_("%d references remain to 0x%llx\n"),
- dt->refs, (unsigned long long)dt->block);
- if (dt->refs > 1)
- continue;
-
- id = NULL;
- osi_list_foreach(tmp, &dt->ref_invinode_list)
- id = osi_list_entry(tmp,
- struct inode_with_dups,
- list);
- osi_list_foreach(tmp, &dt->ref_inode_list)
- id = osi_list_entry(tmp,
- struct inode_with_dups,
- list);
- if (id)
- log_debug("Last reference is from inode "
- "0x%llx\n",
- (unsigned long long)id->block_no);
- }
- }
-}
-
/*
* resolve_dup_references - resolve all but the last dinode that has a
* duplicate reference to a given block.
diff --git a/gfs2/fsck/util.c b/gfs2/fsck/util.c
index 9d6f163..fd1b292 100644
--- a/gfs2/fsck/util.c
+++ b/gfs2/fsck/util.c
@@ -725,3 +725,62 @@ uint64_t *get_dir_hash(struct gfs2_inode *ip)
return tbl;
}
+void delete_all_dups(struct gfs2_inode *ip)
+{
+ struct osi_node *n, *next;
+ struct duptree *dt;
+ osi_list_t *tmp, *x;
+ struct inode_with_dups *id;
+ int found;
+
+ for (n = osi_first(&dup_blocks); n; n = next) {
+ next = osi_next(n);
+ dt = (struct duptree *)n;
+
+ found = 0;
+ id = NULL;
+
+ osi_list_foreach_safe(tmp, &dt->ref_invinode_list, x) {
+ id = osi_list_entry(tmp, struct inode_with_dups, list);
+ if (id->block_no == ip->i_di.di_num.no_addr) {
+ dup_listent_delete(dt, id);
+ found = 1;
+ }
+ }
+ osi_list_foreach_safe(tmp, &dt->ref_inode_list, x) {
+ id = osi_list_entry(tmp, struct inode_with_dups, list);
+ if (id->block_no == ip->i_di.di_num.no_addr) {
+ dup_listent_delete(dt, id);
+ found = 1;
+ }
+ }
+ if (!found)
+ continue;
+
+ if (dt->refs == 0) {
+ log_debug(_("This was the last reference: 0x%llx is "
+ "no longer a duplicate.\n"),
+ (unsigned long long)dt->block);
+ dup_delete(dt); /* not duplicate now */
+ } else {
+ log_debug(_("%d references remain to 0x%llx\n"),
+ dt->refs, (unsigned long long)dt->block);
+ if (dt->refs > 1)
+ continue;
+
+ id = NULL;
+ osi_list_foreach(tmp, &dt->ref_invinode_list)
+ id = osi_list_entry(tmp,
+ struct inode_with_dups,
+ list);
+ osi_list_foreach(tmp, &dt->ref_inode_list)
+ id = osi_list_entry(tmp,
+ struct inode_with_dups,
+ list);
+ if (id)
+ log_debug("Last reference is from inode "
+ "0x%llx\n",
+ (unsigned long long)id->block_no);
+ }
+ }
+}
diff --git a/gfs2/fsck/util.h b/gfs2/fsck/util.h
index 361b1a2..580acd8 100644
--- a/gfs2/fsck/util.h
+++ b/gfs2/fsck/util.h
@@ -187,6 +187,7 @@ extern char generic_interrupt(const char *caller, const char *where,
extern char gfs2_getch(void);
extern uint64_t find_free_blk(struct gfs2_sbd *sdp);
extern uint64_t *get_dir_hash(struct gfs2_inode *ip);
+extern void delete_all_dups(struct gfs2_inode *ip);
#define stack log_debug("<backtrace> - %s()\n", __func__)
--
1.7.11.7
^ permalink raw reply related [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 03/47] libgfs2: let dir_split_leaf receive a "broken" lindex
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 03/47] libgfs2: let dir_split_leaf receive a "broken" lindex Bob Peterson
@ 2013-05-15 16:01 ` Steven Whitehouse
2013-05-20 16:02 ` Bob Peterson
0 siblings, 1 reply; 59+ messages in thread
From: Steven Whitehouse @ 2013-05-15 16:01 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
On Tue, 2013-05-14 at 11:21 -0500, Bob Peterson wrote:
> For ordinary leaf blocks, the hash table must follow the rules,
> which means it needs to follow a power-of-two boundary. In other
> words, it needs to enforce that: start = (lindex & ~(len - 1));
> But when doing repairs, fsck will need to detect when hash tables
> violate this rule and fix it. In that case, it may need to pass
> in an invalid starting offset for a leaf to split. This patch
> moves the responsibility for checking the starting block to the
> calling function.
>
> rhbz#902920
> ---
> gfs2/libgfs2/fs_ops.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
> index 89adf32..11ef6b4 100644
> --- a/gfs2/libgfs2/fs_ops.c
> +++ b/gfs2/libgfs2/fs_ops.c
> @@ -957,7 +957,7 @@ void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex, uint64_t leaf_no,
> len = 1 << (dip->i_di.di_depth - be16_to_cpu(oleaf->lf_depth));
> half_len = len >> 1;
>
> - start = (lindex & ~(len - 1));
> + start = lindex;
Why not just rename lindex to start? Otherwise it might be
confusing to have two names for the same variable,
Steve.
>
> lp = calloc(1, half_len * sizeof(uint64_t));
> if (lp == NULL) {
> @@ -1160,7 +1160,7 @@ static int dir_e_add(struct gfs2_inode *dip, const char *filename, int len,
> struct gfs2_buffer_head *bh, *nbh;
> struct gfs2_leaf *leaf, *nleaf;
> struct gfs2_dirent *dent;
> - uint32_t lindex;
> + uint32_t lindex, llen;
> uint32_t hash;
> uint64_t leaf_no, bn;
> int err = 0;
> @@ -1182,7 +1182,10 @@ restart:
> if (dirent_alloc(dip, bh, len, &dent)) {
>
> if (be16_to_cpu(leaf->lf_depth) < dip->i_di.di_depth) {
> - dir_split_leaf(dip, lindex, leaf_no, bh);
> + llen = 1 << (dip->i_di.di_depth -
> + be16_to_cpu(leaf->lf_depth));
> + dir_split_leaf(dip, lindex & ~(llen - 1),
> + leaf_no, bh);
> brelse(bh);
> goto restart;
>
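
For readers following the thread, the power-of-two rule quoted above can be
sketched on its own. This is only an illustration under assumed names
(leaf_span and proper_start are hypothetical helpers, not libgfs2
functions); the arithmetic is the same as in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Number of hash table pointers that refer to one leaf:
 * 2^(directory depth - leaf depth). */
static uint32_t leaf_span(uint16_t di_depth, uint16_t lf_depth)
{
	assert(lf_depth <= di_depth && di_depth - lf_depth < 32);
	return (uint32_t)1 << (di_depth - lf_depth);
}

/* Round lindex down to the first index of its span.
 * len must be a power of two. */
static uint32_t proper_start(uint32_t lindex, uint32_t len)
{
	assert(len != 0 && (len & (len - 1)) == 0);
	return lindex & ~(len - 1);
}
```

For example, with a directory depth of 5 and a leaf depth of 3, the leaf
spans 4 pointers, so hash index 6 rounds down to a starting index of 4.
After the patch, a caller such as dir_e_add does this rounding itself,
while a repair tool can pass an unaligned index on purpose.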
^ permalink raw reply [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 04/47] fsck.gfs2: Move function find_free_blk to util.c
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 04/47] fsck.gfs2: Move function find_free_blk to util.c Bob Peterson
@ 2013-05-15 16:04 ` Steven Whitehouse
0 siblings, 0 replies; 59+ messages in thread
From: Steven Whitehouse @ 2013-05-15 16:04 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
On Tue, 2013-05-14 at 11:21 -0500, Bob Peterson wrote:
> In a future patch to fsck, function find_free_blk will be called in
> order to correctly report blocks that will need to be allocated for
> things such as leaf splits. This patch moves function find_free_blk
> to a more centralized place, util.c, to that end.
>
> rhbz#902920
> ---
> gfs2/fsck/lost_n_found.c | 39 ---------------------------------------
> gfs2/fsck/util.c | 39 +++++++++++++++++++++++++++++++++++++++
> gfs2/fsck/util.h | 2 ++
> 3 files changed, 41 insertions(+), 39 deletions(-)
>
> diff --git a/gfs2/fsck/lost_n_found.c b/gfs2/fsck/lost_n_found.c
> index 570f3a8..1fb5076 100644
> --- a/gfs2/fsck/lost_n_found.c
> +++ b/gfs2/fsck/lost_n_found.c
> @@ -88,45 +88,6 @@ static void add_dotdot(struct gfs2_inode *ip)
> }
> }
>
> -static uint64_t find_free_blk(struct gfs2_sbd *sdp)
> -{
> - struct osi_node *n, *next = NULL;
> - struct rgrp_tree *rl = NULL;
> - struct gfs2_rindex *ri;
> - struct gfs2_rgrp *rg;
> - unsigned int block, bn = 0, x = 0, y = 0;
> - unsigned int state;
> - struct gfs2_buffer_head *bh;
> -
> - memset(&rg, 0, sizeof(rg));
> - for (n = osi_first(&sdp->rgtree); n; n = next) {
> - next = osi_next(n);
> - rl = (struct rgrp_tree *)n;
> - if (rl->rg.rg_free)
> - break;
> - }
> -
This just looks wrong - it seems to not care where it is allocating
blocks, just grabbing the first one that comes along. This should be
changed so that it can allocate extents and also so that it puts blocks
in a sensible place, but I guess we don't have to do that now - A
thought for the future at least,
Steve.
> - if (n == NULL)
> - return 0;
> -
> - ri = &rl->ri;
> - rg = &rl->rg;
> -
> - for (block = 0; block < ri->ri_length; block++) {
> - bh = rl->bh[block];
> - x = (block) ? sizeof(struct gfs2_meta_header) : sizeof(struct gfs2_rgrp);
> -
> - for (; x < sdp->bsize; x++)
> - for (y = 0; y < GFS2_NBBY; y++) {
> - state = (bh->b_data[x] >> (GFS2_BIT_SIZE * y)) & 0x03;
> - if (state == GFS2_BLKST_FREE)
> - return ri->ri_data0 + bn;
> - bn++;
> - }
> - }
> - return 0;
> -}
> -
> /* add_inode_to_lf - Add dir entry to lost+found for the inode
> * @ip: inode to add to lost + found
> *
> diff --git a/gfs2/fsck/util.c b/gfs2/fsck/util.c
> index 7c89155..94d532e 100644
> --- a/gfs2/fsck/util.c
> +++ b/gfs2/fsck/util.c
> @@ -615,3 +615,42 @@ bad_dinode:
> stack;
> return -EPERM;
> }
> +
> +uint64_t find_free_blk(struct gfs2_sbd *sdp)
> +{
> + struct osi_node *n, *next = NULL;
> + struct rgrp_tree *rl = NULL;
> + struct gfs2_rindex *ri;
> + struct gfs2_rgrp *rg;
> + unsigned int block, bn = 0, x = 0, y = 0;
> + unsigned int state;
> + struct gfs2_buffer_head *bh;
> +
> + memset(&rg, 0, sizeof(rg));
> + for (n = osi_first(&sdp->rgtree); n; n = next) {
> + next = osi_next(n);
> + rl = (struct rgrp_tree *)n;
> + if (rl->rg.rg_free)
> + break;
> + }
> +
> + if (n == NULL)
> + return 0;
> +
> + ri = &rl->ri;
> + rg = &rl->rg;
> +
> + for (block = 0; block < ri->ri_length; block++) {
> + bh = rl->bh[block];
> + x = (block) ? sizeof(struct gfs2_meta_header) : sizeof(struct gfs2_rgrp);
> +
> + for (; x < sdp->bsize; x++)
> + for (y = 0; y < GFS2_NBBY; y++) {
> + state = (bh->b_data[x] >> (GFS2_BIT_SIZE * y)) & 0x03;
> + if (state == GFS2_BLKST_FREE)
> + return ri->ri_data0 + bn;
> + bn++;
> + }
> + }
> + return 0;
> +}
> diff --git a/gfs2/fsck/util.h b/gfs2/fsck/util.h
> index 80ed0c4..1a4811c 100644
> --- a/gfs2/fsck/util.h
> +++ b/gfs2/fsck/util.h
> @@ -184,6 +184,8 @@ extern char generic_interrupt(const char *caller, const char *where,
> const char *progress, const char *question,
> const char *answers);
> extern char gfs2_getch(void);
> +extern uint64_t find_free_blk(struct gfs2_sbd *sdp);
> +
> #define stack log_debug("<backtrace> - %s()\n", __func__)
>
> #endif /* __UTIL_H__ */
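
The inner loops of the quoted function walk a bitmap that stores one
two-bit state per block. A minimal standalone sketch of that scan follows.
It assumes the constants mirror GFS2_NBBY, GFS2_BIT_SIZE and
GFS2_BLKST_FREE; first_free is a hypothetical helper that works on a bare
byte array rather than on resource group buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBBY		4	/* block states per byte (cf. GFS2_NBBY) */
#define BIT_SIZE	2	/* bits per block state (cf. GFS2_BIT_SIZE) */
#define BLKST_FREE	0	/* free state (cf. GFS2_BLKST_FREE) */

/* Hypothetical helper: return the index of the first free block
 * described by the bitmap, or -1 if every block is in use. */
static long first_free(const unsigned char *bitmap, size_t nbytes)
{
	size_t x;
	unsigned int y, state;
	long bn = 0;

	assert(bitmap != NULL);
	for (x = 0; x < nbytes; x++)
		for (y = 0; y < NBBY; y++) {
			/* Extract the y-th two-bit state from this byte. */
			state = (bitmap[x] >> (BIT_SIZE * y)) & 0x03;
			if (state == BLKST_FREE)
				return bn;
			bn++;
		}
	return -1;
}
```

As the review comment notes, a scan like this simply returns the first free
block it meets and takes no account of locality or extents.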
^ permalink raw reply [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 06/47] fsck.gfs2: Check for formal inode mismatch when adding to lost+found
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 06/47] fsck.gfs2: Check for formal inode mismatch when adding to lost+found Bob Peterson
@ 2013-05-15 16:08 ` Steven Whitehouse
2013-05-17 12:47 ` Bob Peterson
0 siblings, 1 reply; 59+ messages in thread
From: Steven Whitehouse @ 2013-05-15 16:08 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
On Tue, 2013-05-14 at 11:21 -0500, Bob Peterson wrote:
> This patch adds a check to the code that adds inodes to lost+found
> so that dinodes with formal inode mismatches are logged, but not added.
>
I'm not sure I understand what this one is doing. If there is a mismatch
between the dir entry and the inode that suggests that the dir entry and
inode are not related to the same thing,
Steve.
> rhbz#902920
> ---
> gfs2/fsck/lost_n_found.c | 44 ++++++++++++++++++++++++++++----------------
> 1 file changed, 28 insertions(+), 16 deletions(-)
>
> diff --git a/gfs2/fsck/lost_n_found.c b/gfs2/fsck/lost_n_found.c
> index f379646..3d9acb5 100644
> --- a/gfs2/fsck/lost_n_found.c
> +++ b/gfs2/fsck/lost_n_found.c
> @@ -40,24 +40,36 @@ static void add_dotdot(struct gfs2_inode *ip)
> (unsigned long long)ip->i_di.di_num.no_addr,
> (unsigned long long)di->dotdot_parent.no_addr,
> (unsigned long long)di->dotdot_parent.no_addr);
> - decr_link_count(di->dotdot_parent.no_addr,
> - ip->i_di.di_num.no_addr,
> - _(".. unlinked, moving to lost+found"));
> dip = fsck_load_inode(sdp, di->dotdot_parent.no_addr);
> - if (dip->i_di.di_nlink > 0) {
> - dip->i_di.di_nlink--;
> - set_di_nlink(dip); /* keep inode tree in sync */
> - log_debug(_("Decrementing its links to %d\n"),
> - dip->i_di.di_nlink);
> - bmodified(dip->i_bh);
> - } else if (!dip->i_di.di_nlink) {
> - log_debug(_("Its link count is zero.\n"));
> + if (dip->i_di.di_num.no_formal_ino ==
> + di->dotdot_parent.no_formal_ino) {
> + decr_link_count(di->dotdot_parent.no_addr,
> + ip->i_di.di_num.no_addr,
> + _(".. unlinked, moving to lost+found"));
> + if (dip->i_di.di_nlink > 0) {
> + dip->i_di.di_nlink--;
> + set_di_nlink(dip); /* keep inode tree in sync */
> + log_debug(_("Decrementing its links to %d\n"),
> + dip->i_di.di_nlink);
> + bmodified(dip->i_bh);
> + } else if (!dip->i_di.di_nlink) {
> + log_debug(_("Its link count is zero.\n"));
> + } else {
> + log_debug(_("Its link count is %d! Changing "
> + "it to 0.\n"), dip->i_di.di_nlink);
> + dip->i_di.di_nlink = 0;
> + set_di_nlink(dip); /* keep inode tree in sync */
> + bmodified(dip->i_bh);
> + }
> } else {
> - log_debug(_("Its link count is %d! Changing "
> - "it to 0.\n"), dip->i_di.di_nlink);
> - dip->i_di.di_nlink = 0;
> - set_di_nlink(dip); /* keep inode tree in sync */
> - bmodified(dip->i_bh);
> + log_debug(_("Directory (0x%llx)'s link to parent "
> + "(0x%llx) had a formal inode discrepancy: "
> + "was 0x%llx, expected 0x%llx\n"),
> + (unsigned long long)ip->i_di.di_num.no_addr,
> + (unsigned long long)di->dotdot_parent.no_addr,
> + di->dotdot_parent.no_formal_ino,
> + dip->i_di.di_num.no_formal_ino);
> + log_debug(_("The parent directory was not changed.\n"));
> }
> fsck_inode_put(&dip);
> di = NULL;
^ permalink raw reply [flat|nested] 59+ messages in thread
* [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions Bob Peterson
@ 2013-05-16 13:27 ` Steven Whitehouse
2013-05-16 13:49 ` Bob Peterson
0 siblings, 1 reply; 59+ messages in thread
From: Steven Whitehouse @ 2013-05-16 13:27 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
This sounds to me like we are doing things in the wrong order. We
shouldn't need to undo things that have been done, otherwise we'll just
land up in a tangle,
Steve.
On Tue, 2013-05-14 at 11:21 -0500, Bob Peterson wrote:
> In pass1, it traverses the metadata tree, processing each dinode and
> marking which blocks are used by that dinode. If a dinode is found
> to have unrecoverable errors, it does a bunch of work to "undo" the
> things it did. This is especially important for the processing of
> duplicate block references. Suppose dinode X references block 1234.
> Later in pass1, suppose a different dinode, Y, also references block
> 1234. This is flagged as a duplicate block reference. Still later,
> suppose pass1 determines dinode Y is bad. Now it has to undo the
> work it did. It needs to properly unmark the data and metadata
> blocks it marked as no longer "free" so that valid references that
> follow aren't flagged as duplicate references. At the same time,
> it needs to unflag block 1234 as a duplicate reference as well, so
> that dinode X's original reference is still considered valid.
>
> Before this patch, fsck.gfs2 was trying to traverse the entire
> metadata tree for the bad dinode, trying to "undo" the designations.
> That becomes a huge problem if the damage was discovered in the
> middle of the metadata, in which case it may never have flagged any
> of the data blocks as "in use as data" in its blockmap. "Undoing"
> the designations sometimes resulted in blocks improperly
> being marked as "free" when they were, in fact, referenced by other
> valid dinodes.
>
> For example, suppose corrupt dinode Y references metadata blocks
> 1234, 1235, and 1236. Now suppose a serious problem is found as part
> of its processing of block 1234, and so it stopped its metadata tree
> traversal there. Metadata blocks 1235 and 1236 are still listed as
> metadata for the bad dinode, but if we traverse the entire tree,
> those two blocks may be improperly processed. If another dinode
> actually uses blocks 1235 or 1236, the improper "undo" processing
> of those two blocks can screw up the valid references.
>
> This patch reworks the "undo" functions so that they don't get called
> on the entire metadata and data of the defective
> dinode. Instead, only the metadata and data blocks queued onto the
> metadata list are processed. This should ensure that the "undo"
> functions only operate on blocks that were processed in the first
> place.
>
> rhbz#902920
> ---
> gfs2/fsck/metawalk.c | 109 ++++++++++++++++++++++----------
> gfs2/fsck/metawalk.h | 4 ++
> gfs2/fsck/pass1.c | 172 ++++++++++++++++-----------------------------------
> 3 files changed, 135 insertions(+), 150 deletions(-)
>
> diff --git a/gfs2/fsck/metawalk.c b/gfs2/fsck/metawalk.c
> index d1b12f1..b9d9f89 100644
> --- a/gfs2/fsck/metawalk.c
> +++ b/gfs2/fsck/metawalk.c
> @@ -1259,7 +1259,7 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
> if (err < 0) {
> stack;
> error = err;
> - goto fail;
> + return error;
> }
> if (err > 0) {
> if (!error)
> @@ -1278,14 +1278,11 @@ static int build_and_check_metalist(struct gfs2_inode *ip, osi_list_t *mlp,
> }
> if (!nbh)
> nbh = bread(ip->i_sbd, block);
> - osi_list_add(&nbh->b_altlist, cur_list);
> + osi_list_add_prev(&nbh->b_altlist, cur_list);
> } /* for all data on the indirect block */
> } /* for blocks at that height */
> } /* for height */
> - return error;
> -fail:
> - free_metalist(ip, mlp);
> - return error;
> + return 0;
> }
>
> /**
> @@ -1331,6 +1328,27 @@ static int check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
> return error;
> }
>
> +static int undo_check_data(struct gfs2_inode *ip, struct metawalk_fxns *pass,
> + uint64_t *ptr_start, char *ptr_end)
> +{
> + int rc = 0;
> + uint64_t block, *ptr;
> +
> + /* If there isn't much pointer corruption check the pointers */
> + for (ptr = ptr_start ; (char *)ptr < ptr_end && !fsck_abort; ptr++) {
> + if (!*ptr)
> + continue;
> +
> + if (skip_this_pass || fsck_abort)
> + return 1;
> + block = be64_to_cpu(*ptr);
> + rc = pass->undo_check_data(ip, block, pass->private);
> + if (rc < 0)
> + return rc;
> + }
> + return 0;
> +}
> +
> static int hdr_size(struct gfs2_buffer_head *bh, int height)
> {
> if (height > 1) {
> @@ -1363,6 +1381,7 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
> int i, head_size;
> uint64_t blks_checked = 0;
> int error, rc;
> + int metadata_clean = 0;
>
> if (!height && !is_dir(&ip->i_di, ip->i_sbd->gfs1))
> return 0;
> @@ -1374,35 +1393,21 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
> error = build_and_check_metalist(ip, &metalist[0], pass);
> if (error) {
> stack;
> - free_metalist(ip, &metalist[0]);
> - return error;
> + goto undo_metalist;
> }
>
> + metadata_clean = 1;
> /* For directories, we've already checked the "data" blocks which
> * comprise the directory hash table, so we perform the directory
> * checks and exit. */
> if (is_dir(&ip->i_di, ip->i_sbd->gfs1)) {
> - free_metalist(ip, &metalist[0]);
> if (!(ip->i_di.di_flags & GFS2_DIF_EXHASH))
> - return 0;
> + goto out;
> /* check validity of leaf blocks and leaf chains */
> error = check_leaf_blks(ip, pass);
> - return error;
> - }
> -
> - /* Free the metalist buffers from heights we don't need to check.
> - For the rest we'll free as we check them to save time.
> - metalist[0] will only have the dinode bh, so we can skip it. */
> - for (i = 1; i < height - 1; i++) {
> - list = &metalist[i];
> - while (!osi_list_empty(list)) {
> - bh = osi_list_entry(list->next,
> - struct gfs2_buffer_head, b_altlist);
> - if (bh == ip->i_bh)
> - osi_list_del(&bh->b_altlist);
> - else
> - brelse(bh);
> - }
> + if (error)
> + goto undo_metalist;
> + goto out;
> }
>
> /* check data blocks */
> @@ -1435,14 +1440,12 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
> else
> rc = 0;
>
> - if (rc && (!error || rc < 0))
> + if (rc && (!error || rc < 0)) {
> error = rc;
> + break;
> + }
> if (pass->big_file_msg && ip->i_di.di_blocks > COMFORTABLE_BLKS)
> pass->big_file_msg(ip, blks_checked);
> - if (bh == ip->i_bh)
> - osi_list_del(&bh->b_altlist);
> - else
> - brelse(bh);
> }
> if (pass->big_file_msg && ip->i_di.di_blocks > COMFORTABLE_BLKS) {
> log_notice( _("\rLarge file at %lld (0x%llx) - 100 percent "
> @@ -1452,6 +1455,50 @@ int check_metatree(struct gfs2_inode *ip, struct metawalk_fxns *pass)
> (unsigned long long)ip->i_di.di_num.no_addr);
> fflush(stdout);
> }
> +undo_metalist:
> + if (!error)
> + goto out;
> + log_err( _("Error: inode %llu (0x%llx) had unrecoverable errors.\n"),
> + (unsigned long long)ip->i_di.di_num.no_addr,
> + (unsigned long long)ip->i_di.di_num.no_addr);
> + if (!query( _("Remove the invalid inode? (y/n) "))) {
> + free_metalist(ip, &metalist[0]);
> + log_err(_("Invalid inode not deleted.\n"));
> + return error;
> + }
> + for (i = 0; pass->undo_check_meta && i < height; i++) {
> + while (!osi_list_empty(&metalist[i])) {
> + list = &metalist[i];
> + bh = osi_list_entry(list->next,
> + struct gfs2_buffer_head,
> + b_altlist);
> + log_err(_("Undoing metadata work for block %llu "
> + "(0x%llx)\n"),
> + (unsigned long long)bh->b_blocknr,
> + (unsigned long long)bh->b_blocknr);
> + if (i)
> + rc = pass->undo_check_meta(ip, bh->b_blocknr,
> + i, pass->private);
> + else
> + rc = 0;
> + if (metadata_clean && rc == 0 && i == height - 1) {
> + head_size = hdr_size(bh, height);
> + if (head_size)
> + undo_check_data(ip, pass, (uint64_t *)
> + (bh->b_data + head_size),
> + (bh->b_data + ip->i_sbd->bsize));
> + }
> + if (bh == ip->i_bh)
> + osi_list_del(&bh->b_altlist);
> + else
> + brelse(bh);
> + }
> + }
> + /* Set the dinode as "bad" so it gets deleted */
> + fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
> + _("corrupt"), gfs2_block_free);
> + log_err(_("The corrupt inode was invalidated.\n"));
> +out:
> free_metalist(ip, &metalist[0]);
> return error;
> }
> diff --git a/gfs2/fsck/metawalk.h b/gfs2/fsck/metawalk.h
> index 486c6eb..f5e71e1 100644
> --- a/gfs2/fsck/metawalk.h
> +++ b/gfs2/fsck/metawalk.h
> @@ -108,6 +108,10 @@ struct metawalk_fxns {
> int (*repair_leaf) (struct gfs2_inode *ip, uint64_t *leaf_no,
> int lindex, int ref_count, const char *msg,
> void *private);
> + int (*undo_check_meta) (struct gfs2_inode *ip, uint64_t block,
> + int h, void *private);
> + int (*undo_check_data) (struct gfs2_inode *ip, uint64_t block,
> + void *private);
> };
>
> #endif /* _METAWALK_H */
> diff --git a/gfs2/fsck/pass1.c b/gfs2/fsck/pass1.c
> index 04e5289..a88895f 100644
> --- a/gfs2/fsck/pass1.c
> +++ b/gfs2/fsck/pass1.c
> @@ -39,8 +39,7 @@ static int check_leaf(struct gfs2_inode *ip, uint64_t block, void *private);
> static int check_metalist(struct gfs2_inode *ip, uint64_t block,
> struct gfs2_buffer_head **bh, int h, void *private);
> static int undo_check_metalist(struct gfs2_inode *ip, uint64_t block,
> - struct gfs2_buffer_head **bh, int h,
> - void *private);
> + int h, void *private);
> static int check_data(struct gfs2_inode *ip, uint64_t block, void *private);
> static int undo_check_data(struct gfs2_inode *ip, uint64_t block,
> void *private);
> @@ -104,12 +103,8 @@ struct metawalk_fxns pass1_fxns = {
> .finish_eattr_indir = finish_eattr_indir,
> .big_file_msg = big_file_comfort,
> .repair_leaf = pass1_repair_leaf,
> -};
> -
> -struct metawalk_fxns undo_fxns = {
> - .private = NULL,
> - .check_metalist = undo_check_metalist,
> - .check_data = undo_check_data,
> + .undo_check_meta = undo_check_metalist,
> + .undo_check_data = undo_check_data,
> };
>
> struct metawalk_fxns invalidate_fxns = {
> @@ -326,53 +321,67 @@ static int check_metalist(struct gfs2_inode *ip, uint64_t block,
> return 0;
> }
>
> -static int undo_check_metalist(struct gfs2_inode *ip, uint64_t block,
> - struct gfs2_buffer_head **bh, int h,
> - void *private)
> +/* undo_reference - undo previously processed data or metadata
> + * We've treated the metadata for this dinode as good so far, but now we
> + * realize it's bad. So we need to undo what we've done.
> + *
> + * Returns: 0 - We need to process the block as metadata. In other words,
> + * we need to undo any blocks it refers to.
> + * 1 - We can't process the block as metadata.
> + */
> +
> +static int undo_reference(struct gfs2_inode *ip, uint64_t block, int meta,
> + void *private)
> {
> - int found_dup = 0, iblk_type;
> - struct gfs2_buffer_head *nbh;
> struct block_count *bc = (struct block_count *)private;
> -
> - *bh = NULL;
> + struct duptree *dt;
> + struct inode_with_dups *id;
>
> if (!valid_block(ip->i_sbd, block)) { /* blk outside of FS */
> fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
> _("bad block referencing"), gfs2_block_free);
> return 1;
> }
> - if (is_dir(&ip->i_di, ip->i_sbd->gfs1) && h == ip->i_di.di_height)
> - iblk_type = GFS2_METATYPE_JD;
> - else
> - iblk_type = GFS2_METATYPE_IN;
>
> - found_dup = find_remove_dup(ip, block, _("Metadata"));
> - nbh = bread(ip->i_sbd, block);
> + if (meta)
> + bc->indir_count--;
> + dt = dupfind(block);
> + if (dt) {
> + /* remove all duplicate reference structures from this inode */
> + do {
> + id = find_dup_ref_inode(dt, ip);
> + if (!id)
> + break;
>
> - if (gfs2_check_meta(nbh, iblk_type)) {
> - if (!found_dup) {
> - fsck_blockmap_set(ip, block, _("bad indirect"),
> - gfs2_block_free);
> - brelse(nbh);
> + dup_listent_delete(id);
> + } while (id);
> +
> + if (dt->refs) {
> + log_err(_("Block %llu (0x%llx) is still referenced "
> + "from another inode; not freeing.\n"),
> + (unsigned long long)block,
> + (unsigned long long)block);
> return 1;
> }
> - brelse(nbh);
> - nbh = NULL;
> - } else /* blk check ok */
> - *bh = nbh;
> -
> - bc->indir_count--;
> - if (found_dup) {
> - if (nbh)
> - brelse(nbh);
> - *bh = NULL;
> - return 1; /* don't process the metadata again */
> - } else
> - fsck_blockmap_set(ip, block, _("bad indirect"),
> - gfs2_block_free);
> + }
> + fsck_blockmap_set(ip, block,
> + meta ? _("bad indirect") : _("referenced data"),
> + gfs2_block_free);
> return 0;
> }
>
> +static int undo_check_metalist(struct gfs2_inode *ip, uint64_t block,
> + int h, void *private)
> +{
> + return undo_reference(ip, block, 1, private);
> +}
> +
> +static int undo_check_data(struct gfs2_inode *ip, uint64_t block,
> + void *private)
> +{
> + return undo_reference(ip, block, 0, private);
> +}
> +
> static int check_data(struct gfs2_inode *ip, uint64_t block, void *private)
> {
> uint8_t q;
> @@ -438,71 +447,9 @@ static int check_data(struct gfs2_inode *ip, uint64_t block, void *private)
> return 0;
> }
>
> -static int undo_check_data(struct gfs2_inode *ip, uint64_t block,
> - void *private)
> -{
> - struct block_count *bc = (struct block_count *) private;
> -
> - if (!valid_block(ip->i_sbd, block)) {
> - /* Mark the owner of this block with the bad_block
> - * designator so we know to check it for out of range
> - * blocks later */
> - fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
> - _("bad (invalid or out of range) data"),
> - gfs2_block_free);
> - return 1;
> - }
> - bc->data_count--;
> - return free_block_if_notdup(ip, block, _("data"));
> -}
> -
> static int remove_inode_eattr(struct gfs2_inode *ip, struct block_count *bc)
> {
> - struct duptree *dt;
> - struct inode_with_dups *id;
> - osi_list_t *ref;
> - int moved = 0;
> -
> - /* If it's a duplicate reference to the block, we need to check
> - if the reference is on the valid or invalid inodes list.
> - If it's on the valid inode's list, move it to the invalid
> - inodes list. The reason is simple: This inode, although
> - valid, has an now-invalid reference, so we should not give
> - this reference preferential treatment over others. */
> - dt = dupfind(ip->i_di.di_eattr);
> - if (dt) {
> - osi_list_foreach(ref, &dt->ref_inode_list) {
> - id = osi_list_entry(ref, struct inode_with_dups, list);
> - if (id->block_no == ip->i_di.di_num.no_addr) {
> - log_debug( _("Moving inode %lld (0x%llx)'s "
> - "duplicate reference to %lld "
> - "(0x%llx) from the valid to the "
> - "invalid reference list.\n"),
> - (unsigned long long)
> - ip->i_di.di_num.no_addr,
> - (unsigned long long)
> - ip->i_di.di_num.no_addr,
> - (unsigned long long)
> - ip->i_di.di_eattr,
> - (unsigned long long)
> - ip->i_di.di_eattr);
> - /* Move from the normal to the invalid list */
> - osi_list_del(&id->list);
> - osi_list_add_prev(&id->list,
> - &dt->ref_invinode_list);
> - moved = 1;
> - break;
> - }
> - }
> - if (!moved)
> - log_debug( _("Duplicate reference to %lld "
> - "(0x%llx) not moved.\n"),
> - (unsigned long long)ip->i_di.di_eattr,
> - (unsigned long long)ip->i_di.di_eattr);
> - } else {
> - delete_block(ip, ip->i_di.di_eattr, NULL,
> - "extended attribute", NULL);
> - }
> + undo_reference(ip, ip->i_di.di_eattr, 0, bc);
> ip->i_di.di_eattr = 0;
> bc->ea_count = 0;
> ip->i_di.di_blocks = 1 + bc->indir_count + bc->data_count;
> @@ -1080,23 +1027,10 @@ static int handle_ip(struct gfs2_sbd *sdp, struct gfs2_inode *ip)
> if (lf_dip && lf_dip->i_di.di_blocks != lf_blks)
> reprocess_inode(lf_dip, "lost+found");
>
> - if (fsck_abort || error < 0)
> + /* If there was an error, we return 0 because we want fsck to continue
> + and analyze the other dinodes as well. */
> + if (fsck_abort || error != 0)
> return 0;
> - if (error > 0) {
> - log_err( _("Error: inode %llu (0x%llx) has unrecoverable "
> - "errors; invalidating.\n"),
> - (unsigned long long)ip->i_di.di_num.no_addr,
> - (unsigned long long)ip->i_di.di_num.no_addr);
> - undo_fxns.private = &bc;
> - check_metatree(ip, &undo_fxns);
> - /* If we undo the metadata accounting, including metadatas
> - duplicate block status, we need to make sure later passes
> - don't try to free up the metadata referenced by this inode.
> - Therefore we mark the inode as free space. */
> - fsck_blockmap_set(ip, ip->i_di.di_num.no_addr,
> - _("corrupt"), gfs2_block_free);
> - return 0;
> - }
>
> error = check_inode_eattr(ip, &pass1_fxns);
>
* [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions
2013-05-16 13:27 ` Steven Whitehouse
@ 2013-05-16 13:49 ` Bob Peterson
2013-05-16 14:02 ` Steven Whitehouse
0 siblings, 1 reply; 59+ messages in thread
From: Bob Peterson @ 2013-05-16 13:49 UTC (permalink / raw)
To: cluster-devel.redhat.com
----- Original Message -----
| Hi,
|
| This sounds to me like we are doing things in the wrong order. We
| shouldn't need to undo things that have been done, otherwise we'll just
| land up in a tangle,
|
| Steve.
Hi,
Pass1's job is to traverse the metadata tree of every dinode, marking
which blocks are metadata, which are data, which are ext. attributes, etc.
With its current design, it runs through that tree once (for each dinode),
marking the blocks as it goes in its blockmap. If it encounters damage it
can't recover from, it has to "undo" those designations, otherwise you
end up in situations where a severely damaged dinode causes a lot of
collateral damage because it references blocks that are in use by a
newer, healthier dinode with valid references.
The alternative is to run through each dinode's metadata tree twice:
Once to determine its general health, and a second time to remember the
blocks it used in the blockmap. This obviously would be a lot slower.
The slowness would affect every dinode, healthy or damaged, whereas the
current method only takes extra time for damaged dinodes.
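A back-of-the-envelope cost model makes the trade-off concrete. This is only a sketch: the struct, the function names, and the idea of counting one "visit" per block reference walked are assumptions made for illustration, not anything taken from fsck.gfs2. It assumes the worst case for the undo walk (the whole tree had already been marked) and that a pre-check pass walks a damaged dinode only once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a file system, for cost estimation only. */
struct toy_fs_shape {
    uint64_t healthy_dinodes;
    uint64_t damaged_dinodes;
    uint64_t refs_per_dinode;   /* average block references per dinode */
};

/* Mark as you go: every dinode is walked once, and each damaged dinode
 * is walked a second time to undo what was marked. */
static uint64_t toy_visits_mark_then_undo(const struct toy_fs_shape *s)
{
    assert(s != NULL);
    return (s->healthy_dinodes + 2 * s->damaged_dinodes) * s->refs_per_dinode;
}

/* Validate first: every dinode is walked once to validate it, and each
 * healthy dinode is walked a second time to mark its blocks. */
static uint64_t toy_visits_validate_then_mark(const struct toy_fs_shape *s)
{
    assert(s != NULL);
    return (2 * s->healthy_dinodes + s->damaged_dinodes) * s->refs_per_dinode;
}
```

With 1,000,000 healthy dinodes, 10 damaged ones and 100 references each, the first function gives 100,002,000 visits and the second gives 200,001,000.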
This ability to "undo" blockmap designations is not new to fsck.gfs2.
It's been doing that for many releases. Recent patches just restructured
it a bit to make better decisions and only affect pass1.
Regards,
Bob Peterson
Red Hat File Systems
* [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions
2013-05-16 13:49 ` Bob Peterson
@ 2013-05-16 14:02 ` Steven Whitehouse
2013-05-16 15:02 ` Bob Peterson
0 siblings, 1 reply; 59+ messages in thread
From: Steven Whitehouse @ 2013-05-16 14:02 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
On Thu, 2013-05-16 at 09:49 -0400, Bob Peterson wrote:
> ----- Original Message -----
> | Hi,
> |
> | This sounds to me like we are doing things in the wrong order. We
> | shouldn't need to undo things that have been done, otherwise we'll just
> | land up in a tangle,
> |
> | Steve.
>
> Hi,
>
> Pass1's job is to traverse the metadata tree of every dinode, marking
> which blocks are metadata, which are data, which are ext. attributes, etc.
> With its current design, it runs through that tree once (for each dinode),
> marking the blocks as it goes in its blockmap. If it encounters damage it
> can't recover from, it has to "undo" those designations, otherwise you
> end up in situations where a severely damaged dinode causes a lot of
> collateral damage because it references blocks that are in use by a
> newer, healthier dinode with valid references.
>
> The alternative is to run through each dinode's metadata tree twice:
> Once to determine its general health, and a second time to remember the
> blocks it used in the blockmap. This obviously would be a lot slower.
> The slowness would affect every dinode, healthy or damaged, whereas the
> current method only takes extra time for damaged dinodes.
>
> This ability to "undo" blockmap designations is not new to fsck.gfs2.
> It's been doing that for many releases. Recent patches just restructured
> it a bit to make better decisions and only affect pass1.
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems
>
Yes, but the undo side of things worries me... it is very easy to get
tied in knots doing that. The question is what is "damage it can't
recover from"? this is a bit vague and doesn't really explain what is
going on here.
I don't yet understand why we'd need to run through each inode's metadata
tree more than once in this case,
Steve.
* [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions
2013-05-16 14:02 ` Steven Whitehouse
@ 2013-05-16 15:02 ` Bob Peterson
2013-05-16 15:24 ` Steven Whitehouse
0 siblings, 1 reply; 59+ messages in thread
From: Bob Peterson @ 2013-05-16 15:02 UTC (permalink / raw)
To: cluster-devel.redhat.com
----- Original Message -----
| Yes, but the undo side of things worries me... it is very easy to get
| tied in knots doing that. The question is what is "damage it can't
| recover from"? this is a bit vague and doesn't really explain what is
| going on here.
|
| I don't yet understand why we'd need to run through each inode's metadata
| tree more than once in this case,
|
| Steve.
Hi,
One thing to bear in mind is that the fsck blockmap is supposed to
represent the correct state of all blocks in the on-disk bitmap.
The job of pass1 is to build the blockmap, which starts out entirely "free".
As the metadata is traversed, the blocks are filled in with the appropriate
type.
The job of pass5 is to synchronize the on-disk bitmap to the blockmap.
So we must ensure that the blockmap is accurate at ALL times after pass1.
One of the primary checks pass1 does is to make sure that a block is "free"
in the blockmap before changing its designation, otherwise it's a duplicate
block reference that must be resolved in pass1b.
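The "is it still free?" check can be sketched as follows. This is a toy model, not the fsck.gfs2 blockmap code: the enum values, the struct and toy_blockmap_claim() are hypothetical names, and a real implementation would also record who holds the duplicate reference so that pass1b can resolve it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_state { TOY_FREE = 0, TOY_DATA, TOY_META, TOY_DINODE };

struct toy_blockmap {
    uint8_t *state;     /* one byte per block */
    uint64_t nblocks;
};

/* Try to give a block its designation.
 * Returns  0 if the block was free and is now marked,
 *          1 if it was already claimed (a duplicate reference),
 *         -1 if the block number is outside the device. */
static int toy_blockmap_claim(struct toy_blockmap *bm, uint64_t block,
                              enum toy_state st)
{
    assert(bm != NULL && bm->state != NULL);
    if (block >= bm->nblocks)
        return -1;
    if (bm->state[block] != TOY_FREE)
        return 1;       /* leave the first designation in place */
    bm->state[block] = (uint8_t)st;
    return 0;
}
```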
Here's an example: Suppose you have a file with di_height==2, two levels of
indirection. Suppose the dinode is laid out something like this:
dinode    indirect    data
------    --------    ------
0x1000 - dinode
     ---> 0x1001
                 ---> 0x1002
                 ---> 0x1003
                 ...
                 ---> 0x1010
     ---> 0x1011
                 ---> 0x1012
                 ---> 0x1013
                 ...
                 ---> 0x1020
     ---> 0x1021
                 ---> 0x1022
                 ---> 0x1023
                 ---> 0x7777777777777777777
                 ---> 0x1025
                 ...
                 ---> 0x1030
Now let's further suppose that this file was supposed to be deleted,
and many of its blocks were in fact reused by a newer, valid dinode,
but somehow, the bitmap was corrupted into saying this dinode is still
alive (a dinode, not free or unlinked).
For the sake of argument, say that second dinode appears later in the
bitmap, so pass1 gets to corrupt dinode 0x1000 before it gets to the
valid dinode that correctly references the blocks.
As it traverses the metadata tree, it builds an array of lists, one for
each height. Each item in the linked list corresponds to a metadata block.
So pass1 traverses the array, marks down in its blockmap that block 0x1000 is
dinode, blocks 0x1001, 0x1011, and 0x1021 are metadata blocks. Then it
processes the data block pointers within the metadata blocks, marking
0x1002, 0x1003, all the way up to 0x1023 as "data" blocks.
When it hits the block 0x7777777777777777777, it determines that's out
of range for the device, and therefore the data file has an unrecoverable
data block error.
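The way the walk gets caught halfway can be sketched like this. It is a simplified illustration under stated assumptions: toy_mark_data() and the TOY_* constants are made-up names, the pointers are taken to be in CPU byte order already, a zero pointer is treated as unallocated, and duplicate detection is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FREE 0
#define TOY_DATA 1

/* Mark the blocks named by ptrs[0..n) as data in map[], stopping at the
 * first pointer that lies outside the device. Returns the index at which
 * the walk stopped; a return value of n means every pointer was fine. */
static size_t toy_mark_data(uint8_t *map, uint64_t nblocks,
                            const uint64_t *ptrs, size_t n)
{
    size_t i;

    assert(map != NULL && ptrs != NULL);
    for (i = 0; i < n; i++) {
        if (ptrs[i] == 0)
            continue;           /* unallocated: nothing to mark */
        if (ptrs[i] >= nblocks)
            break;              /* out of range: stop here */
        map[ptrs[i]] = TOY_DATA;
    }
    return i;
}
```

When the return value is less than n, everything before that index has already been written to the map, which is exactly the state that then has to be dealt with.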
At this point, it doesn't make sense to continue marking 0x1025 and beyond
as referenced data blocks, because that will only make matters worse.
Now we've got a problem: Before we knew 0x1000 was corrupt, we marked all
its references in the blockmap. We can't just delete the corrupt dinode
because most of its blocks are in-use by that other dinode.
One strategy is to keep the blocks it previously marked as "data" and "meta"
"as is" in the blockmap, mark the dinode as "invalid dinode" in the blockmap
and move along. Later, when we get to the other valid dinode, we'll see
potentially tens of thousands of duplicate references. Assuming we have
enough memory to record all these references, and time enough to resolve
them, these can all be checked in pass1b and resolved properly, due to
the fact that we marked the dinode as "invalid" (we favor the valid reference).
The problems with this strategy are (1) it takes lots of time and memory to
record and resolve all these duplicate references, and (2) when it gets to
pass5, the blocks that AREN'T referenced elsewhere are now set to "data"
in the blockmap, so pass5 will set the bitmap accordingly. But a subsequent
run of fsck.gfs2 will determine that no valid dinode references those
data blocks, and it will complain about blocks improperly marked as
"data" that should, in fact, be "free". This is bad behavior: a second run of
fsck.gfs2 should come up clean.
So to prevent this from happening, pass1, upon discovering the out-of-range
block, makes an effort to "undo" its blockmap designations. It traverses
the dinode's metadata tree once more, but sets all the blocks back to
"free". Well, not all of them, because if the invalid dinode referenced
blocks that were previously encountered, it would have recorded them as
duplicate references, so it has to "undo" that designation as well.
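A minimal sketch of that undo step, assuming a per-block reference count stands in for the duplicate-reference tree. The struct and toy_undo_ref() are hypothetical and much simpler than the patch's undo_reference(), but the return convention is modeled on it: a block that is still referenced elsewhere is left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FREE 0
#define TOY_DATA 1

struct toy_map {
    uint8_t *state;     /* designation of each block */
    uint8_t *refs;      /* how many dinodes have referenced each block */
    uint64_t nblocks;
};

/* Drop one dinode's reference to a block.
 * Returns  0 if that was the last reference and the block is free again,
 *          1 if another dinode still references it (state is kept),
 *         -1 if there was nothing to undo. */
static int toy_undo_ref(struct toy_map *m, uint64_t block)
{
    assert(m != NULL && m->state != NULL && m->refs != NULL);
    if (block >= m->nblocks || m->refs[block] == 0)
        return -1;
    if (--m->refs[block] > 0)
        return 1;
    m->state[block] = TOY_FREE;
    return 0;
}
```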
Another alternative is to do a pre-check for all possible types of
corruption of every block. This involves making two passes through the
metadata: The first pass validates the blocks are all valid, and there are
absolutely no problems with data or metadata. The second pass marks all the
blocks in the blockmap as their appropriate type.
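For comparison, the pre-check approach might look roughly like this. Again this is only an illustration with invented names (toy_validate_then_mark() and the TOY_* constants), and "valid" is reduced to a range check; the point is that the map is never touched unless the whole pointer list passes, at the price of reading the list twice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FREE 0
#define TOY_DATA 1

/* Returns 0 and marks every referenced block if all pointers are in
 * range; returns -1 and leaves map[] untouched otherwise. */
static int toy_validate_then_mark(uint8_t *map, uint64_t nblocks,
                                  const uint64_t *ptrs, size_t n)
{
    size_t i;

    assert(map != NULL && ptrs != NULL);

    /* First walk: look for problems, change nothing. */
    for (i = 0; i < n; i++)
        if (ptrs[i] >= nblocks)
            return -1;

    /* Second walk: the list is known to be clean, so mark it. */
    for (i = 0; i < n; i++)
        if (ptrs[i] != 0)
            map[ptrs[i]] = TOY_DATA;
    return 0;
}
```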
Yes, it gets very sticky, very messy. That's why it's taken so long to
get it right.
Regards,
Bob Peterson
Red Hat File Systems
* [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions
2013-05-16 15:02 ` Bob Peterson
@ 2013-05-16 15:24 ` Steven Whitehouse
2013-05-20 13:08 ` Bob Peterson
0 siblings, 1 reply; 59+ messages in thread
From: Steven Whitehouse @ 2013-05-16 15:24 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
On Thu, 2013-05-16 at 11:02 -0400, Bob Peterson wrote:
> ----- Original Message -----
> | Yes, but the undo side of things worries me... it is very easy to get
> | tied in knots doing that. The question is what is "damage it can't
> | recover from"? this is a bit vague and doesn't really explain what is
> | going on here.
> |
> | I don't yet understand why we'd need to run through each inode's metadata
> | tree more than once in this case,
> |
> | Steve.
>
> Hi,
>
> One thing to bear in mind is that the fsck blockmap is supposed to
> represent the correct state of all blocks in the on-disk bitmap.
> The job of pass1 is to build the blockmap, which starts out entirely "free".
> As the metadata is traversed, the blocks are filled in with the appropriate
> type.
>
Yes, but this cannot be true, as we don't actually always know the true
correct state of the blocks, so sometimes, blocks will need to be marked
as unknown until more evidence is available, as I think your example
shows.
> The job of pass5 is to synchronize the on-disk bitmap to the blockmap.
>
> So we must ensure that the blockmap is accurate at ALL times after pass1.
> One of the primary checks pass1 does is to make sure that a block is "free"
> in the blockmap before changing its designation, otherwise it's a duplicate
> block reference that must be resolved in pass1b.
>
> Here's an example: Suppose you have a file with di_height==2, two levels of
> indirection. Suppose the dinode is laid out something like this:
>
> dinode    indirect    data
> ------    --------    ------
> 0x1000 - dinode
>      ---> 0x1001
>                  ---> 0x1002
>                  ---> 0x1003
>                  ...
>                  ---> 0x1010
>      ---> 0x1011
>                  ---> 0x1012
>                  ---> 0x1013
>                  ...
>                  ---> 0x1020
>      ---> 0x1021
>                  ---> 0x1022
>                  ---> 0x1023
>                  ---> 0x7777777777777777777
>                  ---> 0x1025
>                  ...
>                  ---> 0x1030
>
> Now let's further suppose that this file was supposed to be deleted,
> and many of its blocks were in fact reused by a newer, valid dinode,
> but somehow, the bitmap was corrupted into saying this dinode is still
> alive (a dinode, not free or unlinked).
>
> For the sake of argument, say that second dinode appears later in the
> bitmap, so pass1 gets to corrupt dinode 0x1000 before it gets to the
> valid dinode that correctly references the blocks.
>
> As it traverses the metadata tree, it builds an array of lists, one for
> each height. Each item in the linked list corresponds to a metadata block.
> So pass1 traverses the array, marks down in its blockmap that block 0x1000 is
> dinode, blocks 0x1001, 0x1011, and 0x1021 are metadata blocks. Then it
> processes the data block pointers within the metadata blocks, marking
> 0x1002, 0x1003, all the way up to 0x1023 as "data" blocks.
> When it hits the block 0x7777777777777777777, it determines that's out
> of range for the device, and therefore the data file has an unrecoverable
> data block error.
>
> At this point, it doesn't make sense to continue marking 0x1025 and beyond
> as referenced data blocks, because that will only make matters worse.
>
I'm not convinced. In that case you have a single reference to a single
out of range block. All that we need to do there is to reset the pointer
to 0 (unallocated) and that's it. There is no need to stop processing the
remainder of the block unless there is some reason to believe that this
was not just a one off issue.
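In code, that kind of repair would be something along these lines. It is a sketch only: toy_zap_bad_pointers() is a made-up name, the pointers are assumed to be in CPU byte order, and writing the repaired block back to disk is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reset every out-of-range pointer in ptrs[0..n) to 0 (unallocated) and
 * carry on with the rest. Returns the number of pointers reset. */
static size_t toy_zap_bad_pointers(uint64_t *ptrs, size_t n, uint64_t nblocks)
{
    size_t i, zapped = 0;

    assert(ptrs != NULL);
    for (i = 0; i < n; i++) {
        if (ptrs[i] >= nblocks) {
            ptrs[i] = 0;
            zapped++;
        }
    }
    return zapped;
}
```

The return value could be used to decide whether this was a one-off or whether the block looks generally untrustworthy.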
> Now we've got a problem: Before we knew 0x1000 was corrupt, we marked all
> its references in the blockmap. We can't just delete the corrupt dinode
> because most of its blocks are in-use by that other dinode.
>
I think the confusion seems to stem from doing things in the wrong
order. What we should be doing is verifying the tree structure of the
filesystem first, and then after that is done, updating the bitmaps to
match, in case there is a mismatch between the actual fs structure and
the bitmaps.
Also, we can use the initial state of the bitmaps in order to find more
objects to look at (i.e. in case of unlinked inodes) and also use
pointers in the filesystem in the same way (in case of directory entries
which point to non-inode blocks) for example. But neither of those
things requires undoing anything so far as I can see.
So what we ought to have is something like this:
- Start looking at bitmaps to find initial set of things to check
- Start checking inodes, adding additional blocks to list of things to
check as we go
- Once finished, check bitmaps against list to ensure consistency
against the tree
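The three steps above can be sketched on a toy model. Everything here is hypothetical (struct toy_fs, the fixed fan-out of two pointers per block, toy_check()) and exists only to show the shape of the loop: seed a work list, follow pointers while adding to it, then compare what was reached against the bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NBLOCKS 16
#define TOY_FANOUT  2

struct toy_fs {
    uint64_t child[TOY_NBLOCKS][TOY_FANOUT];    /* 0 means "no pointer" */
    uint8_t bitmap_inuse[TOY_NBLOCKS];          /* what the bitmap claims */
};

/* Walk from the roots, recording every block reached in reached[], then
 * count the blocks on which the bitmap and the tree disagree. */
static unsigned toy_check(const struct toy_fs *fs, const uint64_t *roots,
                          size_t nroots, uint8_t *reached)
{
    uint64_t work[TOY_NBLOCKS];     /* each block is pushed at most once */
    size_t top = 0, i;
    unsigned mismatches = 0;
    uint64_t b;

    assert(fs != NULL && roots != NULL && reached != NULL);
    memset(reached, 0, TOY_NBLOCKS);
    for (i = 0; i < nroots; i++) {
        if (roots[i] < TOY_NBLOCKS && !reached[roots[i]]) {
            reached[roots[i]] = 1;
            work[top++] = roots[i];
        }
    }
    while (top > 0) {
        b = work[--top];
        for (i = 0; i < TOY_FANOUT; i++) {
            uint64_t c = fs->child[b][i];

            if (c == 0 || c >= TOY_NBLOCKS || reached[c])
                continue;
            reached[c] = 1;
            work[top++] = c;
        }
    }
    for (b = 0; b < TOY_NBLOCKS; b++)
        if ((fs->bitmap_inuse[b] != 0) != (reached[b] != 0))
            mismatches++;
    return mismatches;
}
```

Note that this sketch simply skips a block it has already reached, so it says nothing about which of two referencing inodes should own that block.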
Obviously that is rather simplified, since we need a few extras
to deal with some corner cases, but that should be the core of it. If
each rgrp is checked as we go, then it should be possible to do with
just one pass through the fs.
And I know that we will not be able to do that immediately, but that is
the kind of structure that we probably should be working towards over
time. So this isn't really something for this patch set, but something
that we should be looking into in due course,
Steve.
* [Cluster-devel] [gfs2-utils PATCH 06/47] fsck.gfs2: Check for formal inode mismatch when adding to lost+found
2013-05-15 16:08 ` Steven Whitehouse
@ 2013-05-17 12:47 ` Bob Peterson
2013-05-17 12:55 ` Steven Whitehouse
0 siblings, 1 reply; 59+ messages in thread
From: Bob Peterson @ 2013-05-17 12:47 UTC (permalink / raw)
To: cluster-devel.redhat.com
----- Original Message -----
| Hi,
|
| On Tue, 2013-05-14 at 11:21 -0500, Bob Peterson wrote:
| > This patch adds a check to the code that adds inodes to lost+found
| > so that dinodes with formal inode mismatches are logged, but not added.
| >
| I'm not sure I understand what this one is doing. If there is a mismatch
| between the dir entry and the inode that suggests that the dir entry and
| inode are not related to the same thing,
|
| Steve.
Hi,
Yes, you're correct, and this mismatch is precisely why it's wrong
to add the file to lost+found, and that's what this patch is avoiding.
Regards,
Bob Peterson
Red Hat File Systems
* [Cluster-devel] [gfs2-utils PATCH 06/47] fsck.gfs2: Check for formal inode mismatch when adding to lost+found
2013-05-17 12:47 ` Bob Peterson
@ 2013-05-17 12:55 ` Steven Whitehouse
0 siblings, 0 replies; 59+ messages in thread
From: Steven Whitehouse @ 2013-05-17 12:55 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
On Fri, 2013-05-17 at 08:47 -0400, Bob Peterson wrote:
> ----- Original Message -----
> | Hi,
> |
> | On Tue, 2013-05-14 at 11:21 -0500, Bob Peterson wrote:
> | > This patch adds a check to the code that adds inodes to lost+found
> | > so that dinodes with formal inode mismatches are logged, but not added.
> | >
> | I'm not sure I understand what this one is doing. If there is a mismatch
> | between the dir entry and the inode that suggests that the dir entry and
> | inode are not related to the same thing,
> |
> | Steve.
>
> Hi,
>
> Yes, you're correct, and this mismatch is precisely why it's wrong
> to add the file to lost+found, and that's what this patch is avoiding.
>
I'm not sure that makes it any clearer... if the inode and the dir entry
do not have the same formal inode number, then there are three possible
causes:
1. There is no relation between the two, so we must add the inode to
lost+found and delete the directory entry. This might happen if the
inode number has become corrupted in the directory entry for example.
2. The inode has a corrupt formal inode number, and we can potentially
fix that by setting it to the same as the directory entry
3. The directory entry has a corrupt formal inode number, and we can
potentially fix that by setting it to be the same as the one in the
inode
How can we tell the difference between the three situations? I suspect
in reality that will be quite tricky to do, and I'm not sure that I
follow the logic here,
Steve.
> Regards,
>
> Bob Peterson
> Red Hat File Systems
* [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions
2013-05-16 15:24 ` Steven Whitehouse
@ 2013-05-20 13:08 ` Bob Peterson
0 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-20 13:08 UTC (permalink / raw)
To: cluster-devel.redhat.com
----- Original Message -----
| > Hi,
| >
| > One thing to bear in mind is that the fsck blockmap is supposed to
| > represent the correct state of all blocks in the on-disk bitmap.
| > The job of pass1 is to build the blockmap, which starts out entirely
| > "free".
| > As the metadata is traversed, the blocks are filled in with the appropriate
| > type.
| >
| Yes, but this cannot be true, as we don't actually always know the true
| correct state of the blocks, so sometimes, blocks will need to be marked
| as unknown until more evidence is available, as I think your example
| shows.
While pass1 is running, we don't know the correct state of every block,
but that's what pass1 is trying to determine. At the end of pass1, we should
know the correct state of every block, except those with duplicate references.
You are correct about the "unknown" state, which is called "free" in the blockmap.
The bitmap and blockmap are two different things, with two different purposes.
The bitmap is the state of the block on the media, and "free" means "free".
The blockmap is what we know about each block, based on reference, and "free" means
the state is "unknown" and so far it has not been referenced by anything.
|
| >
| > At this point, it doesn't make sense to continue marking 0x1025 and beyond
| > as referenced data blocks, because that will only make matters worse.
| >
| I'm not convinced. In that case you have a single reference to a single
| out of range block. All that we need to do there is to reset the pointer
| to 0 (unallocated) and that's it. There is no need to stop processing the
| remainder of the block unless there is some reason to believe that this
| was not just a one off issue.
Actually, the way we have to approach this is a lot more complex.
The major problem here is, and probably always will be, duplicate block
references. If there's a badly corrupt dinode that references 1000 blocks,
and that file should have been deleted, but it's not, due to a bitmap problem,
and those 1000 blocks have been reallocated to 200 different healthy files,
many different problems arise, depending on whether you've processed that
corrupt file before or after the other 200 (and maybe 100 before, and 100 after),
so that some of the blocks are marked as duplicate references and some aren't.
| > Now we've got a problem: Before we knew 0x1000 was corrupt, we marked all
| > its references in the blockmap. We can't just delete the corrupt dinode
| > because most of its blocks are in-use by that other dinode.
| >
| I think the confusion seems to stem from doing things in the wrong
| order. What we should be doing is verifying the tree structure of the
| filesystem first, and then after that is done, updating the bitmaps to
| match, in case there is a mismatch between the actual fs structure and
| the bitmaps.
That's exactly what we do. Pass1 verifies the file system tree structure, by
building its blockmap. Pass5 updates the bitmaps to match the blockmap built
by pass1.
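The pass5 half of that can be sketched as below. This is a simplified model, not the real pass5: the helper names are invented, and the blockmap entries are assumed to already hold the 2-bit state the bitmap should end up with, whereas the real blockmap has more states that would need translating first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the 2-bit state of a block; four blocks are packed per byte. */
static unsigned toy_bitmap_get(const uint8_t *bits, uint64_t block)
{
    return (bits[block / 4] >> ((block % 4) * 2)) & 0x3u;
}

/* Write the 2-bit state of a block. */
static void toy_bitmap_set(uint8_t *bits, uint64_t block, unsigned st)
{
    unsigned shift = (unsigned)(block % 4) * 2;

    bits[block / 4] = (uint8_t)((bits[block / 4] & ~(0x3u << shift)) |
                                ((st & 0x3u) << shift));
}

/* Make the bitmap agree with the blockmap. Returns the number of blocks
 * whose bitmap state had to be changed. */
static uint64_t toy_sync_bitmap(uint8_t *bits, const uint8_t *blockmap,
                                uint64_t nblocks)
{
    uint64_t b, fixed = 0;

    assert(bits != NULL && blockmap != NULL);
    for (b = 0; b < nblocks; b++) {
        if (toy_bitmap_get(bits, b) != (blockmap[b] & 0x3u)) {
            toy_bitmap_set(bits, b, blockmap[b]);
            fixed++;
        }
    }
    return fixed;
}
```

Running the sync a second time against the same blockmap changes nothing, which is the "second run comes up clean" property in miniature.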
| Also, we can use the initial state of the bitmaps in order to find more
| objects to look at (i.e. in case of unlinked inodes) and also use
| pointers in the filesystem in the same way (in case of directory entries
| which point to non-inode blocks) for example. But neither of those
| things requires undoing anything so far as I can see.
|
| So what we ought to have is something like this:
|
| - Start looking at bitmaps to find initial set of things to check
| - Start checking inodes, adding additional blocks to list of things to
| check as we go
| - Once finished, check bitmaps against list to ensure consistency
| against the tree
Yes, this would work ideally, but only for metadata blocks. You still need
some way to determine the state of the bitmap versus the references in order
to ensure there aren't duplicate block references. Unless, of course, the
list includes the data blocks as well as metadata, but then it would take an
enormous amount of memory to accomplish: at least 8X more memory than the
entire file system size. If you don't include the data blocks, you still
need some means to ensure the same blocks aren't referenced multiple times,
and any proposed solution would have to take into account the fact that
blocks can be multiple-referenced as either metadata or data from multiple
sources. Using a 2-bit bitmap and a corresponding 8-bit blockmap requires
much less memory, which is what we have today.
Even if the list includes only metadata, I'm concerned about the amount
of memory it would take, especially for common use cases like email, where
the file system contains many terabytes of tiny files.
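To put rough numbers on the per-block costs, here is a small sketch. The function names are made up, and the list estimate assumes each entry holds nothing but a 64-bit block number, with no list pointers or other per-entry overhead, so it is a lower bound.

```c
#include <assert.h>
#include <stdint.h>

/* 2 bits per block, four blocks per byte, rounded up. */
static uint64_t toy_bitmap_bytes(uint64_t nblocks)
{
    return (nblocks + 3) / 4;
}

/* 8 bits per block. */
static uint64_t toy_blockmap_bytes(uint64_t nblocks)
{
    return nblocks;
}

/* One 64-bit block number per referenced block, nothing else. */
static uint64_t toy_list_bytes(uint64_t nrefs)
{
    assert(nrefs <= UINT64_MAX / sizeof(uint64_t));
    return nrefs * sizeof(uint64_t);
}
```

Assuming 4 KiB blocks, a 1 TiB file system has 2^28 blocks: the bitmap is 64 MiB, the blockmap 256 MiB, and a bare list naming every block once would be 2 GiB.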
| Obviously that is rather simplified, since we need a few extras
| to deal with some corner cases, but that should be the core of it. If
| each rgrp is checked as we go, then it should be possible to do with
| just one pass through the fs.
|
| And I know that we will not be able to do that immediately, but that is
| the kind of structure that we probably should be working towards over
| time. So this isn't really something for this patch set, but something
| that we should be looking into in due course,
|
| Steve.
Bob Peterson
Red Hat File Systems
* [Cluster-devel] [gfs2-utils PATCH 03/47] libgfs2: let dir_split_leaf receive a "broken" lindex
2013-05-15 16:01 ` Steven Whitehouse
@ 2013-05-20 16:02 ` Bob Peterson
0 siblings, 0 replies; 59+ messages in thread
From: Bob Peterson @ 2013-05-20 16:02 UTC (permalink / raw)
To: cluster-devel.redhat.com
----- Original Message -----
| Hi,
|
| On Tue, 2013-05-14 at 11:21 -0500, Bob Peterson wrote:
| > For ordinary leaf blocks, the hash table must follow the rules,
| > which means it needs to follow a power-of-two boundary. In other
| > words, it needs to enforce that: start = (lindex & ~(len - 1));
| > But when doing repairs, fsck will need to detect when hash tables
| > violate this rule and fix it. In that case, it may need to pass
| > in an invalid starting offset for a leaf to split. This patch
| > moves the responsibility for checking the starting block to the
| > calling function.
| >
| > rhbz#902920
| > ---
| > gfs2/libgfs2/fs_ops.c | 9 ++++++---
| > 1 file changed, 6 insertions(+), 3 deletions(-)
| >
| > diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
| > index 89adf32..11ef6b4 100644
| > --- a/gfs2/libgfs2/fs_ops.c
| > +++ b/gfs2/libgfs2/fs_ops.c
| > @@ -957,7 +957,7 @@ void dir_split_leaf(struct gfs2_inode *dip, uint32_t
| > lindex, uint64_t leaf_no,
| > len = 1 << (dip->i_di.di_depth - be16_to_cpu(oleaf->lf_depth));
| > half_len = len >> 1;
| >
| > - start = (lindex & ~(len - 1));
| > + start = lindex;
| Why not just rename the lindex as start? Otherwise this might be
| confusing to have two names for the same variable,
|
| Steve.
Hi,
Good idea. I've implemented your change, and the replacement patch is
given below.
Regards,
Bob Peterson
Red Hat File Systems
---
commit 7e1170eec23957b084ca80828eac9fd1c8988062
Author: Bob Peterson <rpeterso@redhat.com>
Date: Thu Feb 21 09:36:01 2013 -0700
libgfs2: let dir_split_leaf receive a "broken" lindex
For ordinary leaf blocks, the directory hash table must follow the
rule that each leaf's run of pointers starts on a power-of-two
boundary. In other words: start = (lindex & ~(len - 1)).
But when doing repairs, fsck needs to detect hash tables that
violate this rule and fix them. In that case, it may need to pass
in an invalid starting offset for the leaf to split. This patch
moves the responsibility for computing the aligned starting index
to the calling function.
diff --git a/gfs2/libgfs2/fs_ops.c b/gfs2/libgfs2/fs_ops.c
index 89adf32..d009e2f 100644
--- a/gfs2/libgfs2/fs_ops.c
+++ b/gfs2/libgfs2/fs_ops.c
@@ -925,13 +925,13 @@ void gfs2_put_leaf_nr(struct gfs2_inode *dip, uint32_t inx, uint64_t leaf_out)
}
}
-void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex, uint64_t leaf_no,
+void dir_split_leaf(struct gfs2_inode *dip, uint32_t start, uint64_t leaf_no,
struct gfs2_buffer_head *obh)
{
struct gfs2_buffer_head *nbh;
struct gfs2_leaf *nleaf, *oleaf;
struct gfs2_dirent *dent, *prev = NULL, *next = NULL, *new;
- uint32_t start, len, half_len, divider;
+ uint32_t len, half_len, divider;
uint64_t bn, *lp;
uint32_t name_len;
int x, moved = FALSE;
@@ -957,8 +957,6 @@ void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex, uint64_t leaf_no,
len = 1 << (dip->i_di.di_depth - be16_to_cpu(oleaf->lf_depth));
half_len = len >> 1;
- start = (lindex & ~(len - 1));
-
lp = calloc(1, half_len * sizeof(uint64_t));
if (lp == NULL) {
fprintf(stderr, "Out of memory in %s\n", __FUNCTION__);
@@ -1160,7 +1158,7 @@ static int dir_e_add(struct gfs2_inode *dip, const char *filename, int len,
struct gfs2_buffer_head *bh, *nbh;
struct gfs2_leaf *leaf, *nleaf;
struct gfs2_dirent *dent;
- uint32_t lindex;
+ uint32_t lindex, llen;
uint32_t hash;
uint64_t leaf_no, bn;
int err = 0;
@@ -1182,7 +1180,10 @@ restart:
if (dirent_alloc(dip, bh, len, &dent)) {
if (be16_to_cpu(leaf->lf_depth) < dip->i_di.di_depth) {
- dir_split_leaf(dip, lindex, leaf_no, bh);
+ llen = 1 << (dip->i_di.di_depth -
+ be16_to_cpu(leaf->lf_depth));
+ dir_split_leaf(dip, lindex & ~(llen - 1),
+ leaf_no, bh);
brelse(bh);
goto restart;
diff --git a/gfs2/libgfs2/libgfs2.h b/gfs2/libgfs2/libgfs2.h
index 3147c83..3055355 100644
--- a/gfs2/libgfs2/libgfs2.h
+++ b/gfs2/libgfs2/libgfs2.h
@@ -468,7 +468,7 @@ extern void block_map(struct gfs2_inode *ip, uint64_t lblock, int *new,
extern void gfs2_get_leaf_nr(struct gfs2_inode *dip, uint32_t index,
uint64_t *leaf_out);
extern void gfs2_put_leaf_nr(struct gfs2_inode *dip, uint32_t inx, uint64_t leaf_out);
-extern void dir_split_leaf(struct gfs2_inode *dip, uint32_t lindex,
+extern void dir_split_leaf(struct gfs2_inode *dip, uint32_t start,
uint64_t leaf_no, struct gfs2_buffer_head *obh);
extern void gfs2_free_block(struct gfs2_sbd *sdp, uint64_t block);
extern int gfs2_freedi(struct gfs2_sbd *sdp, uint64_t block);
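
The power-of-two rule described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not code from libgfs2: the helper names leaf_span_len, leaf_span_start and leaf_span_is_aligned are hypothetical, and plain integers stand in for dip->i_di.di_depth and the byte-swapped lf_depth. It assumes the leaf depth does not exceed the directory depth.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of hash table slots that point at one leaf.
 * Mirrors "len = 1 << (di_depth - lf_depth)" from the patch. */
static uint32_t leaf_span_len(uint32_t dir_depth, uint32_t leaf_depth)
{
	assert(leaf_depth <= dir_depth);
	return (uint32_t)1 << (dir_depth - leaf_depth);
}

/* Hypothetical helper: round a hash table index down to the first slot
 * of its span. This is what dir_e_add now does before calling
 * dir_split_leaf: lindex & ~(llen - 1). */
static uint32_t leaf_span_start(uint32_t lindex, uint32_t len)
{
	/* len must be a non-zero power of two for the mask to be valid. */
	assert(len != 0 && (len & (len - 1)) == 0);
	return lindex & ~(len - 1);
}

/* Hypothetical helper: a repair tool could use a check like this to
 * detect a span that does not begin on a power-of-two boundary. */
static int leaf_span_is_aligned(uint32_t start, uint32_t len)
{
	return (start & (len - 1)) == 0;
}
```

With a directory depth of 5 and a leaf depth of 3, the span is 4 slots long, so index 13 rounds down to a start of 12. An ordinary caller would pass that aligned value, while a repair path could deliberately pass an unaligned one, which is the case the patch is meant to allow.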
Thread overview: 59+ messages
2013-05-14 16:21 [Cluster-devel] [gfs2-utils PATCH 01/47] libgfs2: externalize dir_split_leaf Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 02/47] libgfs2: allow dir_split_leaf to receive a leaf buffer Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 03/47] libgfs2: let dir_split_leaf receive a "broken" lindex Bob Peterson
2013-05-15 16:01 ` Steven Whitehouse
2013-05-20 16:02 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 04/47] fsck.gfs2: Move function find_free_blk to util.c Bob Peterson
2013-05-15 16:04 ` Steven Whitehouse
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 05/47] fsck.gfs2: Split out function to make sure lost+found exists Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 06/47] fsck.gfs2: Check for formal inode mismatch when adding to lost+found Bob Peterson
2013-05-15 16:08 ` Steven Whitehouse
2013-05-17 12:47 ` Bob Peterson
2013-05-17 12:55 ` Steven Whitehouse
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 07/47] fsck.gfs2: shorten some debug messages in lost+found Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 08/47] fsck.gfs2: Move basic directory entry checks to separate function Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 09/47] fsck.gfs2: Add formal inode check to basic dirent checks Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 10/47] fsck.gfs2: Add new function to check dir hash tables Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 11/47] fsck.gfs2: Special case '..' when processing bad formal inode number Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 12/47] fsck.gfs2: Move function to read directory hash table to util.c Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 13/47] fsck.gfs2: Misc cleanups Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 14/47] fsck.gfs2: Verify dirent hash values correspond to proper leaf block Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 15/47] fsck.gfs2: re-read hash table if directory height or depth changes Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 16/47] fsck.gfs2: fix leaf blocks, don't try to patch the hash table Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 17/47] fsck.gfs2: check leaf depth when validating leaf blocks Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 18/47] fsck.gfs2: small cleanups Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 19/47] fsck.gfs2: reprocess inodes when blocks are added Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 20/47] fsck.gfs2: Remove redundant leaf depth check Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 21/47] fsck.gfs2: link dinodes that only have extended attribute problems Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 22/47] fsck.gfs2: Add clarifying message to duplicate processing Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 23/47] fsck.gfs2: separate function to calculate metadata block header size Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 24/47] fsck.gfs2: Rework the "undo" functions Bob Peterson
2013-05-16 13:27 ` Steven Whitehouse
2013-05-16 13:49 ` Bob Peterson
2013-05-16 14:02 ` Steven Whitehouse
2013-05-16 15:02 ` Bob Peterson
2013-05-16 15:24 ` Steven Whitehouse
2013-05-20 13:08 ` Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 25/47] fsck.gfs2: Check for interrupt when resolving duplicates Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 26/47] fsck.gfs2: Consistent naming of struct duptree variables Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 27/47] fsck.gfs2: Keep proper counts when duplicates are found Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 28/47] fsck.gfs2: print metadata block reference on data errors Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 29/47] fsck.gfs2: print block count values when fixing them Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 30/47] fsck.gfs2: Do not invalidate metablocks of dinodes with invalid mode Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 31/47] fsck.gfs2: Log when unrecoverable data block errors are encountered Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 32/47] fsck.gfs2: don't remove buffers from the list when errors are found Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 33/47] fsck.gfs2: Don't flag GFS1 non-dinode blocks as duplicates Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 34/47] fsck.gfs2: externalize check_leaf Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 35/47] fsck.gfs2: pass2: check leaf blocks when fixing hash table Bob Peterson
2013-05-14 16:21 ` [Cluster-devel] [gfs2-utils PATCH 36/47] fsck.gfs2: standardize check_metatree return codes Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 37/47] fsck.gfs2: don't invalidate files with duplicate data block refs Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 38/47] fsck.gfs2: check for duplicate first references Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 39/47] fsck.gfs2: When flagging a duplicate reference, show valid or invalid Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 40/47] fsck.gfs2: major duplicate reference reform Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 41/47] fsck.gfs2: Remove all bad eattr blocks Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 42/47] fsck.gfs2: Remove unused variable Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 43/47] fsck.gfs2: double-check transitions from dinode to data Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 44/47] fsck.gfs2: Stop "undo" process when error data block is reached Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 45/47] fsck.gfs2: Don't allocate leaf blocks in pass1 Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 46/47] fsck.gfs2: take hash table start boundaries into account Bob Peterson
2013-05-14 16:22 ` [Cluster-devel] [gfs2-utils PATCH 47/47] fsck.gfs2: delete all duplicates from unrecoverable damaged dinodes Bob Peterson