* [PATCH 01/28] xfs: create individual inode alloc. helper
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 02/28] xfs: update free inode record logic to support sparse inode records Brian Foster
` (27 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC
To: xfs
Inode allocation from sparse inode records must filter the ir_free mask
against ir_holemask. In preparation for this requirement, create a
helper to allocate an individual inode from an inode record.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
libxfs/xfs_ialloc.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index 2b4e4e0..a1cf1dd 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -727,6 +727,16 @@ xfs_ialloc_get_rec(
}
/*
+ * Return the offset of the first free inode in the record.
+ */
+STATIC int
+xfs_inobt_first_free_inode(
+ struct xfs_inobt_rec_incore *rec)
+{
+ return xfs_lowbit64(rec->ir_free);
+}
+
+/*
* Allocate an inode using the inobt-only algorithm.
*/
STATIC int
@@ -956,7 +966,7 @@ newino:
}
alloc_inode:
- offset = xfs_lowbit64(rec.ir_free);
+ offset = xfs_inobt_first_free_inode(&rec);
ASSERT(offset >= 0);
ASSERT(offset < XFS_INODES_PER_CHUNK);
ASSERT((XFS_AGINO_TO_OFFSET(mp, rec.ir_startino) %
@@ -1205,7 +1215,7 @@ xfs_dialloc_ag(
if (error)
goto error_cur;
- offset = xfs_lowbit64(rec.ir_free);
+ offset = xfs_inobt_first_free_inode(&rec);
ASSERT(offset >= 0);
ASSERT(offset < XFS_INODES_PER_CHUNK);
ASSERT((XFS_AGINO_TO_OFFSET(mp, rec.ir_startino) %
--
1.9.3
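
As a rough illustration of the filtering that the commit message above anticipates (this is not code from the series), the sketch below expands a 16-bit hole mask into a 64-bit mask of inodes that physically exist and applies it to the free mask before picking the first free inode. It assumes that a set holemask bit marks a hole, that holemask bit i covers inodes 4*i through 4*i+3, and that holes may appear as set bits in ir_free. All sketch_* names are hypothetical.

```c
#include <stdint.h>

#define INODES_PER_CHUNK        64
#define HOLEMASK_BITS           16
#define INODES_PER_HOLEMASK_BIT (INODES_PER_CHUNK / HOLEMASK_BITS)	/* 4 */

/*
 * Expand a 16-bit hole mask into a 64-bit mask with one bit per inode that
 * is backed by real blocks. Assumption: a set holemask bit means "hole".
 */
static uint64_t sketch_alloc_mask(uint16_t holemask)
{
	uint64_t mask = 0;
	int bit;

	for (bit = 0; bit < HOLEMASK_BITS; bit++) {
		if (holemask & (1u << bit))
			continue;	/* hole: these 4 inodes do not exist */
		mask |= (uint64_t)0xf << (bit * INODES_PER_HOLEMASK_BIT);
	}
	return mask;
}

/*
 * Return the offset of the first free inode that is not inside a hole,
 * or -1 if the record has no allocatable inode.
 */
static int sketch_first_free_inode(uint64_t ir_free, uint16_t ir_holemask)
{
	uint64_t realfree = ir_free & sketch_alloc_mask(ir_holemask);
	int offset;

	for (offset = 0; offset < INODES_PER_CHUNK; offset++) {
		if (realfree & ((uint64_t)1 << offset))
			return offset;
	}
	return -1;
}
```

With a holemask of 0x0001 and every ir_free bit set, the sketch skips the first four inodes and returns offset 4, whereas a plain lowest-set-bit lookup on ir_free would return 0.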
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* [PATCH 02/28] xfs: update free inode record logic to support sparse inode records
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
2015-06-02 18:41 ` [PATCH 01/28] xfs: create individual inode alloc. helper Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 03/28] xfs: support min/max agbno args in block allocator Brian Foster
` (26 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC
To: xfs
xfs_difree_inobt() uses logic in a couple of places that assumes inobt
records refer to fully allocated chunks. Specifically, the use of
mp->m_ialloc_inos can cause problems for inode chunks that are sparsely
allocated. Sparse inode chunks can, by definition, contain fewer
inodes than a full inode chunk.
Fix the logic that determines whether an inode record should be removed
from the inobt to use the ir_free mask rather than ir_freecount. Fix the
agi counters modification to use ir_freecount to add the actual number
of inodes freed rather than assuming a full inode chunk.
Also make sure that we preserve the behavior to not remove inode chunks
if the block size is large enough for multiple inode chunks (e.g.,
bsize=64k, isize=512). This behavior was previously implicit: in such
configurations, ir_freecount of a single record never matches
m_ialloc_inos. Hence, add some comments as well.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
libxfs/xfs_ialloc.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index a1cf1dd..673c0a7 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -1503,10 +1503,13 @@ xfs_difree_inobt(
rec.ir_freecount++;
/*
- * When an inode cluster is free, it becomes eligible for removal
+ * When an inode chunk is free, it becomes eligible for removal. Don't
+ * remove the chunk if the block size is large enough for multiple inode
+ * chunks (that might not be free).
*/
if (!(mp->m_flags & XFS_MOUNT_IKEEP) &&
- (rec.ir_freecount == mp->m_ialloc_inos)) {
+ rec.ir_free == XFS_INOBT_ALL_FREE &&
+ mp->m_sb.sb_inopblock <= XFS_INODES_PER_CHUNK) {
*deleted = 1;
*first_ino = XFS_AGINO_TO_INO(mp, agno, rec.ir_startino);
@@ -1516,7 +1519,7 @@ xfs_difree_inobt(
* AGI and Superblock inode counts, and mark the disk space
* to be freed when the transaction is committed.
*/
- ilen = mp->m_ialloc_inos;
+ ilen = rec.ir_freecount;
be32_add_cpu(&agi->agi_count, -ilen);
be32_add_cpu(&agi->agi_freecount, -(ilen - 1));
xfs_ialloc_log_agi(tp, agbp, XFS_AGI_COUNT | XFS_AGI_FREECOUNT);
@@ -1636,8 +1639,13 @@ xfs_difree_finobt(
* free inode. Hence, if all of the inodes are free and we aren't
* keeping inode chunks permanently on disk, remove the record.
* Otherwise, update the record with the new information.
+ *
+ * Note that we currently can't free chunks when the block size is large
+ * enough for multiple chunks. Leave the finobt record to remain in sync
+ * with the inobt.
*/
- if (rec.ir_freecount == mp->m_ialloc_inos &&
+ if (rec.ir_free == XFS_INOBT_ALL_FREE &&
+ mp->m_sb.sb_inopblock <= XFS_INODES_PER_CHUNK &&
!(mp->m_flags & XFS_MOUNT_IKEEP)) {
error = xfs_btree_delete(cur, &i);
if (error)
--
1.9.3
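
The removal test and counter update from the patch above can be restated in isolation. This is a simplified sketch rather than the libxfs code: the sketch_* structures are hypothetical stand-ins for the in-core record and the AGI, and it assumes that holes in a sparse record show up as set bits in ir_free, so that ir_free can be all ones while ir_freecount is less than 64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64
#define INOBT_ALL_FREE   UINT64_MAX

/* Hypothetical, trimmed-down stand-ins for the in-core record and AGI. */
struct sketch_rec {
	uint8_t  freecount;	/* free inodes that physically exist */
	uint64_t free;		/* free mask */
};

struct sketch_agi {
	uint32_t count;		/* allocated inodes in the AG */
	uint32_t freecount;	/* free inodes in the AG */
};

/* A chunk may be removed only if all three conditions from the patch hold. */
static bool sketch_can_remove_chunk(const struct sketch_rec *rec,
				    unsigned int inodes_per_block, bool ikeep)
{
	return !ikeep &&
	       rec->free == INOBT_ALL_FREE &&
	       inodes_per_block <= INODES_PER_CHUNK;
}

/*
 * Subtract the inodes actually described by the record. The "- 1" mirrors
 * -(ilen - 1) in the diff; presumably the inode being freed right now was
 * never added to the AGI free count.
 */
static void sketch_account_removal(struct sketch_agi *agi,
				   const struct sketch_rec *rec)
{
	uint32_t ilen = rec->freecount;

	assert(ilen >= 1 && agi->count >= ilen && agi->freecount >= ilen - 1);
	agi->count -= ilen;
	agi->freecount -= ilen - 1;
}
```

For a sparse record holding 32 inodes, removal drops the AGI inode count by 32 rather than by a full chunk of 64. With bsize=64k and isize=512 there are 128 inodes per block, so the third condition keeps the chunk in place.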
* [PATCH 03/28] xfs: support min/max agbno args in block allocator
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
2015-06-02 18:41 ` [PATCH 01/28] xfs: create individual inode alloc. helper Brian Foster
2015-06-02 18:41 ` [PATCH 02/28] xfs: update free inode record logic to support sparse inode records Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 04/28] xfs: add sparse inode chunk alignment superblock field Brian Foster
` (25 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC
To: xfs
The block allocator supports various arguments to tweak block allocation
behavior and set allocation requirements. The sparse inode chunk feature
introduces a new requirement not supported by the current arguments.
Sparse inode allocations must convert or merge into an inode record that
describes a fixed length chunk (64 inodes x inodesize). Full inode chunk
allocations by definition always result in valid inode records. Sparse
chunk allocations are smaller and the associated records can refer to
blocks not owned by the inode chunk. This model can result in invalid
inode records in certain cases.
For example, if a sparse allocation occurs near the start of an AG, the
aligned inode record for that chunk might refer to agbno 0. If an
allocation occurs towards the end of the AG and the AG size is not
aligned, the inode record could refer to blocks beyond the end of the
AG. While neither of these scenarios directly results in corruption,
both insert invalid inode records that, at minimum, cause repair to
complain, are unlikely to merge into full chunks over time, and set land
mines for other areas of code.
To guarantee sparse inode chunk allocation creates valid inode records,
support the ability to specify an agbno range limit for
XFS_ALLOCTYPE_NEAR_BNO block allocations. The min/max agbnos are
specified in the allocation arguments and limit the block allocation
algorithms to that range. The starting 'agbno' hint is clamped to the
range if the specified agbno is out of range. If no sufficient extent is
available within the range, the allocation fails. For backwards
compatibility, the min/max fields can be initialized to 0 to disable
range limiting (i.e., equivalent to min=0, max=agsize).
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
libxfs/xfs_alloc.c | 42 +++++++++++++++++++++++++++++++++++++-----
libxfs/xfs_alloc.h | 2 ++
2 files changed, 39 insertions(+), 5 deletions(-)
diff --git a/libxfs/xfs_alloc.c b/libxfs/xfs_alloc.c
index 23e3c53..5d4f094 100644
--- a/libxfs/xfs_alloc.c
+++ b/libxfs/xfs_alloc.c
@@ -145,13 +145,27 @@ xfs_alloc_compute_aligned(
{
xfs_agblock_t bno;
xfs_extlen_t len;
+ xfs_extlen_t diff;
/* Trim busy sections out of found extent */
xfs_extent_busy_trim(args, foundbno, foundlen, &bno, &len);
+ /*
+ * If we have a largish extent that happens to start before min_agbno,
+ * see if we can shift it into range...
+ */
+ if (bno < args->min_agbno && bno + len > args->min_agbno) {
+ diff = args->min_agbno - bno;
+ if (len > diff) {
+ bno += diff;
+ len -= diff;
+ }
+ }
+
if (args->alignment > 1 && len >= args->minlen) {
xfs_agblock_t aligned_bno = roundup(bno, args->alignment);
- xfs_extlen_t diff = aligned_bno - bno;
+
+ diff = aligned_bno - bno;
*resbno = aligned_bno;
*reslen = diff >= len ? 0 : len - diff;
@@ -791,9 +805,13 @@ xfs_alloc_find_best_extent(
* The good extent is closer than this one.
*/
if (!dir) {
+ if (*sbnoa > args->max_agbno)
+ goto out_use_good;
if (*sbnoa >= args->agbno + gdiff)
goto out_use_good;
} else {
+ if (*sbnoa < args->min_agbno)
+ goto out_use_good;
if (*sbnoa <= args->agbno - gdiff)
goto out_use_good;
}
@@ -880,6 +898,17 @@ xfs_alloc_ag_vextent_near(
dofirst = prandom_u32() & 1;
#endif
+ /* handle uninitialized agbno range so caller doesn't have to */
+ if (!args->min_agbno && !args->max_agbno)
+ args->max_agbno = args->mp->m_sb.sb_agblocks - 1;
+ ASSERT(args->min_agbno <= args->max_agbno);
+
+ /* clamp agbno to the range if it's outside */
+ if (args->agbno < args->min_agbno)
+ args->agbno = args->min_agbno;
+ if (args->agbno > args->max_agbno)
+ args->agbno = args->max_agbno;
+
restart:
bno_cur_lt = NULL;
bno_cur_gt = NULL;
@@ -972,6 +1001,8 @@ restart:
 &ltbnoa, &ltlena);
if (ltlena < args->minlen)
continue;
+ if (ltbnoa < args->min_agbno || ltbnoa > args->max_agbno)
+ continue;
args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen);
xfs_alloc_fix_len(args);
ASSERT(args->len >= args->minlen);
@@ -1092,11 +1123,11 @@ restart:
XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0);
 xfs_alloc_compute_aligned(args, ltbno, ltlen,
 &ltbnoa, &ltlena);
- if (ltlena >= args->minlen)
+ if (ltlena >= args->minlen && ltbnoa >= args->min_agbno)
break;
if ((error = xfs_btree_decrement(bno_cur_lt, 0, &i)))
goto error0;
- if (!i) {
+ if (!i || ltbnoa < args->min_agbno) {
xfs_btree_del_cursor(bno_cur_lt,
XFS_BTREE_NOERROR);
bno_cur_lt = NULL;
@@ -1108,11 +1139,11 @@ restart:
XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0);
 xfs_alloc_compute_aligned(args, gtbno, gtlen,
 &gtbnoa, &gtlena);
- if (gtlena >= args->minlen)
+ if (gtlena >= args->minlen && gtbnoa <= args->max_agbno)
break;
if ((error = xfs_btree_increment(bno_cur_gt, 0, &i)))
goto error0;
- if (!i) {
+ if (!i || gtbnoa > args->max_agbno) {
xfs_btree_del_cursor(bno_cur_gt,
XFS_BTREE_NOERROR);
bno_cur_gt = NULL;
@@ -1212,6 +1243,7 @@ restart:
ASSERT(ltnew >= ltbno);
ASSERT(ltnew + rlen <= ltbnoa + ltlena);
ASSERT(ltnew + rlen <= be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length));
+ ASSERT(ltnew >= args->min_agbno && ltnew <= args->max_agbno);
args->agbno = ltnew;
if ((error = xfs_alloc_fixup_trees(cnt_cur, bno_cur_lt, ltbno, ltlen,
diff --git a/libxfs/xfs_alloc.h b/libxfs/xfs_alloc.h
index db5da4a..fbe383f 100644
--- a/libxfs/xfs_alloc.h
+++ b/libxfs/xfs_alloc.h
@@ -112,6 +112,8 @@ typedef struct xfs_alloc_arg {
xfs_extlen_t total; /* total blocks needed in xaction */
xfs_extlen_t alignment; /* align answer to multiple of this */
xfs_extlen_t minalignslop; /* slop for minlen+alignment calcs */
+ xfs_agblock_t min_agbno; /* set an agbno range for NEAR allocs */
+ xfs_agblock_t max_agbno; /* ... */
xfs_extlen_t len; /* output: actual size of extent */
xfs_alloctype_t type; /* allocation type XFS_ALLOCTYPE_... */
xfs_alloctype_t otype; /* original allocation type */
--
1.9.3
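
The range handling described in the patch above boils down to three small steps: default an unset range to the whole AG, clamp the hint into the range, and trim a candidate extent that starts below the minimum. The sketch below shows those steps outside the allocator. The sketch_* names and the standalone struct are hypothetical; only the arithmetic follows the diff.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t agblock_t;

/* Hypothetical container for the two new allocation arguments. */
struct sketch_range {
	agblock_t min_agbno;
	agblock_t max_agbno;
};

/* min == max == 0 means "no limit": widen the range to the whole AG. */
static void sketch_init_range(struct sketch_range *r, agblock_t agblocks)
{
	if (!r->min_agbno && !r->max_agbno)
		r->max_agbno = agblocks - 1;
	assert(r->min_agbno <= r->max_agbno);
}

/* Clamp the starting hint into [min_agbno, max_agbno]. */
static agblock_t sketch_clamp_hint(const struct sketch_range *r,
				   agblock_t agbno)
{
	if (agbno < r->min_agbno)
		return r->min_agbno;
	if (agbno > r->max_agbno)
		return r->max_agbno;
	return agbno;
}

/*
 * If a free extent starts below min_agbno but reaches past it, shift the
 * start up to min_agbno and shorten the length by the same amount.
 */
static void sketch_trim_to_min(const struct sketch_range *r,
			       agblock_t *bno, uint32_t *len)
{
	if (*bno < r->min_agbno && *bno + *len > r->min_agbno) {
		uint32_t diff = r->min_agbno - *bno;

		*bno += diff;
		*len -= diff;
	}
}
```

The inner "len > diff" test from the diff is omitted here because the outer condition already implies it. An extent that lies entirely below min_agbno is left untouched; the later range checks in the allocator are what reject it.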
* [PATCH 04/28] xfs: add sparse inode chunk alignment superblock field
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (2 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 03/28] xfs: support min/max agbno args in block allocator Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 05/28] xfs: use sparse chunk alignment for min. inode allocation requirement Brian Foster
` (24 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC
To: xfs
Add sb_spino_align to the superblock to specify sparse inode chunk
alignment. This also currently represents the minimum allowable sparse
chunk allocation size.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
libxfs/xfs_format.h | 4 ++--
libxfs/xfs_sb.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index 4d313d3..20acbc4 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -170,7 +170,7 @@ typedef struct xfs_sb {
__uint32_t sb_features_log_incompat;
__uint32_t sb_crc; /* superblock crc */
- __uint32_t sb_pad;
+ xfs_extlen_t sb_spino_align; /* sparse inode chunk alignment */
xfs_ino_t sb_pquotino; /* project quota inode */
xfs_lsn_t sb_lsn; /* last write sequence */
@@ -256,7 +256,7 @@ typedef struct xfs_dsb {
__be32 sb_features_log_incompat;
__le32 sb_crc; /* superblock crc */
- __be32 sb_pad;
+ __be32 sb_spino_align; /* sparse inode chunk alignment */
__be64 sb_pquotino; /* project quota inode */
__be64 sb_lsn; /* last write sequence */
diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
index 6844cd8..f34c676 100644
--- a/libxfs/xfs_sb.c
+++ b/libxfs/xfs_sb.c
@@ -357,7 +357,7 @@ __xfs_sb_from_disk(
be32_to_cpu(from->sb_features_log_incompat);
/* crc is only used on disk, not in memory; just init to 0 here. */
to->sb_crc = 0;
- to->sb_pad = 0;
+ to->sb_spino_align = be32_to_cpu(from->sb_spino_align);
to->sb_pquotino = be64_to_cpu(from->sb_pquotino);
to->sb_lsn = be64_to_cpu(from->sb_lsn);
/* Convert on-disk flags to in-memory flags? */
@@ -499,7 +499,7 @@ xfs_sb_to_disk(
cpu_to_be32(from->sb_features_incompat);
to->sb_features_log_incompat =
cpu_to_be32(from->sb_features_log_incompat);
- to->sb_pad = 0;
+ to->sb_spino_align = cpu_to_be32(from->sb_spino_align);
to->sb_lsn = cpu_to_be64(from->sb_lsn);
}
}
--
1.9.3
* [PATCH 05/28] xfs: use sparse chunk alignment for min. inode allocation requirement
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (3 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 04/28] xfs: add sparse inode chunk alignment superblock field Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 06/28] xfs: sparse inode chunks feature helpers and mount requirements Brian Foster
` (23 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC
To: xfs
xfs_ialloc_ag_select() iterates through the allocation groups looking
for free inodes or free space to determine whether to allow an inode
allocation to proceed. If no free inodes are available, it assumes that
an AG must have an extent longer than mp->m_ialloc_blks.
Sparse inode chunk support currently allows for allocations smaller than
the traditional inode chunk size specified in m_ialloc_blks. The current
minimum sparse allocation is set in the superblock sb_spino_align field
at mkfs time. Create a new m_ialloc_min_blks field in xfs_mount and use
this to represent the minimum supported allocation size for inode
chunks. Initialize m_ialloc_min_blks at mount time based on whether
sparse inodes are supported.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
include/xfs_mount.h | 2 ++
libxfs/xfs_ialloc.c | 2 +-
libxfs/xfs_sb.c | 5 +++++
3 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/include/xfs_mount.h b/include/xfs_mount.h
index 70bdea0..998439e 100644
--- a/include/xfs_mount.h
+++ b/include/xfs_mount.h
@@ -73,6 +73,8 @@ typedef struct xfs_mount {
uint m_attroffset; /* inode attribute offset */
int m_ialloc_inos; /* inodes in inode allocation */
int m_ialloc_blks; /* blocks in inode allocation */
+ int m_ialloc_min_blks;/* min blocks in sparse inode
+ * allocation */
int m_litino; /* size of inode union area */
int m_inoalign_mask;/* mask sb_inoalignmt if used */
struct xfs_trans_resv m_resv; /* precomputed res values */
diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index 673c0a7..1be6d27 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -640,7 +640,7 @@ xfs_ialloc_ag_select(
* if we fail allocation due to alignment issues then it is most
* likely a real ENOSPC condition.
*/
- ineed = mp->m_ialloc_blks;
+ ineed = mp->m_ialloc_min_blks;
if (flags && ineed > 1)
ineed += xfs_ialloc_cluster_alignment(mp);
longest = pag->pagf_longest;
diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
index f34c676..fe16e8f 100644
--- a/libxfs/xfs_sb.c
+++ b/libxfs/xfs_sb.c
@@ -672,6 +672,11 @@ xfs_sb_mount_common(
mp->m_ialloc_inos = (int)MAX((__uint16_t)XFS_INODES_PER_CHUNK,
sbp->sb_inopblock);
mp->m_ialloc_blks = mp->m_ialloc_inos >> sbp->sb_inopblog;
+
+ if (sbp->sb_spino_align)
+ mp->m_ialloc_min_blks = sbp->sb_spino_align;
+ else
+ mp->m_ialloc_min_blks = mp->m_ialloc_blks;
}
/*
--
1.9.3
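
To make the mount-time arithmetic in the patch above concrete, the sketch below recomputes the chunk size in blocks and the minimum allocation size from a few geometry values. The sketch_geom struct is a hypothetical subset of the superblock, and the spino_align value of 2 used in the usage note is purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64

/* Hypothetical subset of the superblock geometry used by the calculation. */
struct sketch_geom {
	uint16_t inopblock;	/* inodes per block */
	uint8_t  inopblog;	/* log2(inopblock) */
	uint32_t spino_align;	/* 0 when sparse inodes are not in use */
};

/* Blocks in a full inode allocation: max(64, inopblock) inodes. */
static int sketch_ialloc_blks(const struct sketch_geom *g)
{
	int inos;

	assert(g->inopblock == (1u << g->inopblog));
	inos = g->inopblock > INODES_PER_CHUNK ? g->inopblock
					       : INODES_PER_CHUNK;
	return inos >> g->inopblog;
}

/* Minimum blocks an AG must offer before inode allocation is attempted. */
static int sketch_ialloc_min_blks(const struct sketch_geom *g)
{
	if (g->spino_align)
		return (int)g->spino_align;
	return sketch_ialloc_blks(g);
}
```

With 4k blocks and 512-byte inodes (8 inodes per block), a full chunk needs 8 contiguous blocks. If spino_align were 2, xfs_ialloc_ag_select() would accept an AG whose longest free extent covers only 2 blocks plus the alignment slop.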
* [PATCH 06/28] xfs: sparse inode chunks feature helpers and mount requirements
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (4 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 05/28] xfs: use sparse chunk alignment for min. inode allocation requirement Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 07/28] xfs: add fs geometry bit for sparse inode chunks Brian Foster
` (22 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC
To: xfs
The sparse inode chunks feature uses the xfs_sb_version_hassparseinodes()
helper function to enable the allocation of sparse inode chunks. The
incompatible feature bit is set on disk at mkfs time to prevent mounting
on unsupported kernels.
Also, enforce the inode alignment requirements for sparse inode
chunks at mount time. When enabled, full inode chunk (and all inode
record) alignment is increased from cluster size to inode chunk size.
Sparse inode alignment must match the cluster size of the fs. Both
superblock alignment fields are set as such by mkfs when sparse inode
support is enabled.
Finally, warn that sparse inode chunks is an experimental feature until
further notice.
[xfsprogs:
Dropped the experimental feature warning to reduce userspace noise and
facilitate testing.]
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
libxfs/xfs_format.h | 7 +++++++
libxfs/xfs_sb.c | 18 ++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index 20acbc4..f5b3499 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -457,6 +457,7 @@ xfs_sb_has_ro_compat_feature(
}
#define XFS_SB_FEAT_INCOMPAT_FTYPE (1 << 0) /* filetype in dirent */
+#define XFS_SB_FEAT_INCOMPAT_SPINODES (1 << 1) /* sparse inode chunks */
#define XFS_SB_FEAT_INCOMPAT_ALL \
(XFS_SB_FEAT_INCOMPAT_FTYPE)
@@ -506,6 +507,12 @@ static inline int xfs_sb_version_hasfinobt(xfs_sb_t *sbp)
(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_FINOBT);
}
+static inline bool xfs_sb_version_hassparseinodes(struct xfs_sb *sbp)
+{
+ return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
+ xfs_sb_has_incompat_feature(sbp, XFS_SB_FEAT_INCOMPAT_SPINODES);
+}
+
/*
* end of superblock version macros
*/
diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
index fe16e8f..d528a3f 100644
--- a/libxfs/xfs_sb.c
+++ b/libxfs/xfs_sb.c
@@ -172,6 +172,24 @@ xfs_mount_validate_sb(
return -EFSCORRUPTED;
}
+ /*
+ * Full inode chunks must be aligned to inode chunk size when
+ * sparse inodes are enabled to support the sparse chunk
+ * allocation algorithm and prevent overlapping inode records.
+ */
+ if (xfs_sb_version_hassparseinodes(sbp)) {
+ uint32_t align;
+
+ align = XFS_INODES_PER_CHUNK * sbp->sb_inodesize
+ >> sbp->sb_blocklog;
+ if (sbp->sb_inoalignmt != align) {
+ xfs_warn(mp,
+"Inode block alignment (%u) must match chunk size (%u) for sparse inodes.",
+ sbp->sb_inoalignmt, align);
+ return -EINVAL;
+ }
+ }
+
if (unlikely(
sbp->sb_logstart == 0 && mp->m_logdev_targp == mp->m_ddev_targp)) {
xfs_warn(mp,
--
1.9.3
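
The alignment check added to xfs_mount_validate_sb() above is a one-line calculation, and a worked example may help. The sketch below reproduces it with plain integers; the sketch_* function names and the -1 error value are hypothetical simplifications (the patch returns -EINVAL).

```c
#include <assert.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64

/* Size of a full inode chunk in filesystem blocks. */
static uint32_t sketch_chunk_align(uint16_t inodesize, uint8_t blocklog)
{
	assert(blocklog < 32);
	/* '*' binds tighter than '>>', matching the expression in the patch. */
	return ((uint32_t)INODES_PER_CHUNK * inodesize) >> blocklog;
}

/* Return 0 if sb_inoalignmt matches the chunk size, -1 otherwise. */
static int sketch_validate_inoalignmt(uint32_t inoalignmt,
				      uint16_t inodesize, uint8_t blocklog)
{
	return inoalignmt == sketch_chunk_align(inodesize, blocklog) ? 0 : -1;
}
```

For 512-byte inodes on 4k blocks (blocklog 12), 64 * 512 = 32768 bytes, which is 8 blocks, so sb_inoalignmt must be 8 for the mount to proceed.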
* [PATCH 07/28] xfs: add fs geometry bit for sparse inode chunks
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (5 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 06/28] xfs: sparse inode chunks feature helpers and mount requirements Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 08/28] xfs: introduce inode record hole mask " Brian Foster
` (21 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC
To: xfs
Define an fs geometry bit for sparse inode chunks such that the
characteristic of the fs can be identified by userspace.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
libxfs/xfs_fs.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/libxfs/xfs_fs.h b/libxfs/xfs_fs.h
index 18dc721..89689c6 100644
--- a/libxfs/xfs_fs.h
+++ b/libxfs/xfs_fs.h
@@ -239,6 +239,7 @@ typedef struct xfs_fsop_resblks {
#define XFS_FSOP_GEOM_FLAGS_V5SB 0x8000 /* version 5 superblock */
#define XFS_FSOP_GEOM_FLAGS_FTYPE 0x10000 /* inode directory types */
#define XFS_FSOP_GEOM_FLAGS_FINOBT 0x20000 /* free inode btree */
+#define XFS_FSOP_GEOM_FLAGS_SPINODES 0x40000 /* sparse inode chunks */
/*
* Minimum and maximum sizes need for growth checks.
--
1.9.3
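
A userspace consumer would test the new bit in the flags word of the geometry structure it gets back from the kernel. The sketch below shows only the bit test. The constant is copied from the patch above under a hypothetical SKETCH_ name, and how the flags word is obtained is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the patch; the SKETCH_ prefix marks it as a local copy. */
#define SKETCH_GEOM_FLAGS_SPINODES 0x40000

/* Report whether a geometry flags word advertises sparse inode chunks. */
static bool sketch_has_sparse_inodes(uint32_t geom_flags)
{
	return (geom_flags & SKETCH_GEOM_FLAGS_SPINODES) != 0;
}
```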
* [PATCH 08/28] xfs: introduce inode record hole mask for sparse inode chunks
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (6 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 07/28] xfs: add fs geometry bit for sparse inode chunks Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 09/28] xfs: pass inode count through ordered icreate log item Brian Foster
` (20 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC
To: xfs
The inode btrees track 64 inodes per record regardless of inode size.
Thus, inode chunks on disk vary in size depending on the size of the
inodes. This creates a contiguous allocation requirement for new inode
chunks that can be difficult to satisfy on aged and fragmented (free
space) filesystems.
The inode record freecount currently uses 4 bytes on disk to track the
free inode count. With a maximum freecount value of 64, only one byte is
required. Convert the freecount field to a single byte and use two of
the remaining three high-order bytes for the hole mask field. Use the
final leftover byte for the total count field.
The hole mask field tracks holes in the chunks of physical space that
the inode record refers to. This facilitates the sparse allocation of
inode chunks when contiguous chunks are not available and allows the
inode btrees to identify what portions of the chunk contain valid
inodes. The total count field contains the total number of valid inodes
referred to by the record. This can also be deduced from the hole mask.
The count field provides clarity and redundancy for internal record
verification.
Note that neither of the new fields can be written to disk on filesystems
without sparse inode support. Doing so writes to the high-order bytes of
freecount and causes corruption from the perspective of older kernels.
The on-disk inobt record data structure is updated with a union to
distinguish between the original, "full" format and the new, "sparse"
format. The conversion routines to get, insert and update records are
updated to translate to and from the on-disk record accordingly such
that freecount remains a 4-byte value on non-supported fs, yet the new
fields of the in-core record are always valid with respect to the
record. This means that higher level code can refer to the current
in-core record format unconditionally and lower level code ensures that
records are translated to/from disk according to the capabilities of the
fs.
[xfsprogs:
Fixed up struct xfs_inobt_rec accessors throughout the codebase to
handle the union added to the structure.]
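
The byte-level overlap between the two on-disk formats can be sketched as follows. This is an illustration of the layout described above, not the libxfs conversion code: the sketch_* names are hypothetical, and the four bytes are handled as a plain array instead of the union added by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-core view of the 4 bytes that follow ir_startino. */
struct sketch_irec {
	uint16_t holemask;
	uint8_t  count;
	uint8_t  freecount;
};

/* Encode the in-core fields into their 4 on-disk (big-endian) bytes. */
static void sketch_encode(const struct sketch_irec *irec, bool sparse,
			  uint8_t out[4])
{
	if (sparse) {
		out[0] = (uint8_t)(irec->holemask >> 8);	/* be16 holemask */
		out[1] = (uint8_t)(irec->holemask & 0xff);
		out[2] = irec->count;
		out[3] = irec->freecount;
	} else {
		/* be32 freecount: high bytes are zero since freecount <= 64 */
		out[0] = 0;
		out[1] = 0;
		out[2] = 0;
		out[3] = irec->freecount;
	}
}

/* Decode on-disk bytes; the full format gets hardcoded holemask/count. */
static void sketch_decode(const uint8_t in[4], bool sparse,
			  struct sketch_irec *irec)
{
	if (sparse) {
		irec->holemask = (uint16_t)((in[0] << 8) | in[1]);
		irec->count = in[2];
	} else {
		irec->holemask = 0;	/* full chunk, no holes */
		irec->count = 64;
	}
	irec->freecount = in[3];
}

/* How a kernel without sparse support would read the same 4 bytes. */
static uint32_t sketch_legacy_freecount(const uint8_t in[4])
{
	return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
	       ((uint32_t)in[2] << 8) | in[3];
}
```

A sparse record with holemask 0xff00, count 32 and freecount 5 encodes to the bytes ff 00 20 05. Read as a 4-byte freecount, that is 0xff002005, which is why the new fields must never reach disk on a filesystem without the feature bit.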
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
db/btblock.c | 9 +++++----
db/check.c | 8 ++++----
libxfs/xfs_format.h | 34 ++++++++++++++++++++++++++++++---
libxfs/xfs_ialloc.c | 48 +++++++++++++++++++++++++++++++++++++++--------
libxfs/xfs_ialloc_btree.c | 11 ++++++++++-
repair/phase5.c | 2 +-
repair/scan.c | 14 +++++++-------
7 files changed, 98 insertions(+), 28 deletions(-)
diff --git a/db/btblock.c b/db/btblock.c
index cdb8b1d..d87991d 100644
--- a/db/btblock.c
+++ b/db/btblock.c
@@ -435,11 +435,12 @@ const field_t inobt_key_flds[] = {
};
#undef KOFF
-#define ROFF(f) bitize(offsetof(xfs_inobt_rec_t, ir_ ## f))
+#define ROFF(f) bitize(offsetof(xfs_inobt_rec_t, f))
const field_t inobt_rec_flds[] = {
- { "startino", FLDT_AGINO, OI(ROFF(startino)), C1, 0, TYP_INODE },
- { "freecount", FLDT_INT32D, OI(ROFF(freecount)), C1, 0, TYP_NONE },
- { "free", FLDT_INOFREE, OI(ROFF(free)), C1, 0, TYP_NONE },
+ { "startino", FLDT_AGINO, OI(ROFF(ir_startino)), C1, 0, TYP_INODE },
+ { "freecount", FLDT_INT32D, OI(ROFF(ir_u.f.ir_freecount)), C1, 0,
+ TYP_NONE },
+ { "free", FLDT_INOFREE, OI(ROFF(ir_free)), C1, 0, TYP_NONE },
{ NULL }
};
#undef ROFF
diff --git a/db/check.c b/db/check.c
index 01f5b6e..1822905 100644
--- a/db/check.c
+++ b/db/check.c
@@ -4216,8 +4216,8 @@ scanfunc_ino(
}
icount += XFS_INODES_PER_CHUNK;
agicount += XFS_INODES_PER_CHUNK;
- ifree += be32_to_cpu(rp[i].ir_freecount);
- agifreecount += be32_to_cpu(rp[i].ir_freecount);
+ ifree += be32_to_cpu(rp[i].ir_u.f.ir_freecount);
+ agifreecount += be32_to_cpu(rp[i].ir_u.f.ir_freecount);
push_cur();
set_cur(&typtab[TYP_INODE],
XFS_AGB_TO_DADDR(mp, seqno,
@@ -4242,13 +4242,13 @@ scanfunc_ino(
(xfs_dinode_t *)((char *)iocur_top->data + ((off + j) << mp->m_sb.sb_inodelog)),
isfree);
}
- if (nfree != be32_to_cpu(rp[i].ir_freecount)) {
+ if (nfree != be32_to_cpu(rp[i].ir_u.f.ir_freecount)) {
if (!sflag)
dbprintf(_("ir_freecount/free mismatch, "
"inode chunk %u/%u, freecount "
"%d nfree %d\n"),
seqno, agino,
- be32_to_cpu(rp[i].ir_freecount), nfree);
+ be32_to_cpu(rp[i].ir_u.f.ir_freecount), nfree);
error++;
}
pop_cur();
diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index f5b3499..177a3fb 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -1223,26 +1223,54 @@ typedef __uint64_t xfs_inofree_t;
#define XFS_INOBT_ALL_FREE ((xfs_inofree_t)-1)
#define XFS_INOBT_MASK(i) ((xfs_inofree_t)1 << (i))
+#define XFS_INOBT_HOLEMASK_FULL 0 /* holemask for full chunk */
+#define XFS_INOBT_HOLEMASK_BITS (NBBY * sizeof(__uint16_t))
+#define XFS_INODES_PER_HOLEMASK_BIT \
+ (XFS_INODES_PER_CHUNK / (NBBY * sizeof(__uint16_t)))
+
static inline xfs_inofree_t xfs_inobt_maskn(int i, int n)
{
return ((n >= XFS_INODES_PER_CHUNK ? 0 : XFS_INOBT_MASK(n)) - 1) << i;
}
/*
- * Data record structure
+ * The on-disk inode record structure has two formats. The original "full"
+ * format uses a 4-byte freecount. The "sparse" format uses a 1-byte freecount
+ * and replaces the 3 high-order freecount bytes with the holemask and inode
+ * count.
+ *
+ * The holemask of the sparse record format allows an inode chunk to have holes
+ * that refer to blocks not owned by the inode record. This facilitates inode
+ * allocation in the event of severe free space fragmentation.
*/
typedef struct xfs_inobt_rec {
__be32 ir_startino; /* starting inode number */
- __be32 ir_freecount; /* count of free inodes (set bits) */
+ union {
+ struct {
+ __be32 ir_freecount; /* count of free inodes */
+ } f;
+ struct {
+ __be16 ir_holemask;/* hole mask for sparse chunks */
+ __u8 ir_count; /* total inode count */
+ __u8 ir_freecount; /* count of free inodes */
+ } sp;
+ } ir_u;
__be64 ir_free; /* free inode mask */
} xfs_inobt_rec_t;
typedef struct xfs_inobt_rec_incore {
xfs_agino_t ir_startino; /* starting inode number */
- __int32_t ir_freecount; /* count of free inodes (set bits) */
+ __uint16_t ir_holemask; /* hole mask for sparse chunks */
+ __uint8_t ir_count; /* total inode count */
+ __uint8_t ir_freecount; /* count of free inodes (set bits) */
xfs_inofree_t ir_free; /* free inode mask */
} xfs_inobt_rec_incore_t;
+static inline bool xfs_inobt_issparse(uint16_t holemask)
+{
+ /* non-zero holemask represents a sparse rec. */
+ return holemask;
+}
/*
* Key structure
diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index 1be6d27..00de739 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -60,6 +60,8 @@ xfs_inobt_lookup(
int *stat) /* success/failure */
{
cur->bc_rec.i.ir_startino = ino;
+ cur->bc_rec.i.ir_holemask = 0;
+ cur->bc_rec.i.ir_count = 0;
cur->bc_rec.i.ir_freecount = 0;
cur->bc_rec.i.ir_free = 0;
return xfs_btree_lookup(cur, dir, stat);
@@ -77,7 +79,14 @@ xfs_inobt_update(
union xfs_btree_rec rec;
rec.inobt.ir_startino = cpu_to_be32(irec->ir_startino);
- rec.inobt.ir_freecount = cpu_to_be32(irec->ir_freecount);
+ if (xfs_sb_version_hassparseinodes(&cur->bc_mp->m_sb)) {
+ rec.inobt.ir_u.sp.ir_holemask = cpu_to_be16(irec->ir_holemask);
+ rec.inobt.ir_u.sp.ir_count = irec->ir_count;
+ rec.inobt.ir_u.sp.ir_freecount = irec->ir_freecount;
+ } else {
+ /* ir_holemask/ir_count not supported on-disk */
+ rec.inobt.ir_u.f.ir_freecount = cpu_to_be32(irec->ir_freecount);
+ }
rec.inobt.ir_free = cpu_to_be64(irec->ir_free);
return xfs_btree_update(cur, &rec);
}
@@ -95,12 +104,27 @@ xfs_inobt_get_rec(
int error;
error = xfs_btree_get_rec(cur, &rec, stat);
- if (!error && *stat == 1) {
- irec->ir_startino = be32_to_cpu(rec->inobt.ir_startino);
- irec->ir_freecount = be32_to_cpu(rec->inobt.ir_freecount);
- irec->ir_free = be64_to_cpu(rec->inobt.ir_free);
+ if (error || *stat == 0)
+ return error;
+
+ irec->ir_startino = be32_to_cpu(rec->inobt.ir_startino);
+ if (xfs_sb_version_hassparseinodes(&cur->bc_mp->m_sb)) {
+ irec->ir_holemask = be16_to_cpu(rec->inobt.ir_u.sp.ir_holemask);
+ irec->ir_count = rec->inobt.ir_u.sp.ir_count;
+ irec->ir_freecount = rec->inobt.ir_u.sp.ir_freecount;
+ } else {
+ /*
+ * ir_holemask/ir_count not supported on-disk. Fill in hardcoded
+ * values for full inode chunks.
+ */
+ irec->ir_holemask = XFS_INOBT_HOLEMASK_FULL;
+ irec->ir_count = XFS_INODES_PER_CHUNK;
+ irec->ir_freecount =
+ be32_to_cpu(rec->inobt.ir_u.f.ir_freecount);
}
- return error;
+ irec->ir_free = be64_to_cpu(rec->inobt.ir_free);
+
+ return 0;
}
/*
@@ -109,10 +133,14 @@ xfs_inobt_get_rec(
STATIC int
xfs_inobt_insert_rec(
struct xfs_btree_cur *cur,
+ __uint16_t holemask,
+ __uint8_t count,
__int32_t freecount,
xfs_inofree_t free,
int *stat)
{
+ cur->bc_rec.i.ir_holemask = holemask;
+ cur->bc_rec.i.ir_count = count;
cur->bc_rec.i.ir_freecount = freecount;
cur->bc_rec.i.ir_free = free;
return xfs_btree_insert(cur, stat);
@@ -149,7 +177,9 @@ xfs_inobt_insert(
}
ASSERT(i == 0);
- error = xfs_inobt_insert_rec(cur, XFS_INODES_PER_CHUNK,
+ error = xfs_inobt_insert_rec(cur, XFS_INOBT_HOLEMASK_FULL,
+ XFS_INODES_PER_CHUNK,
+ XFS_INODES_PER_CHUNK,
XFS_INOBT_ALL_FREE, &i);
if (error) {
xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
@@ -1604,7 +1634,9 @@ xfs_difree_finobt(
*/
XFS_WANT_CORRUPTED_GOTO(mp, ibtrec->ir_freecount == 1, error);
- error = xfs_inobt_insert_rec(cur, ibtrec->ir_freecount,
+ error = xfs_inobt_insert_rec(cur, ibtrec->ir_holemask,
+ ibtrec->ir_count,
+ ibtrec->ir_freecount,
ibtrec->ir_free, &i);
if (error)
goto error;
diff --git a/libxfs/xfs_ialloc_btree.c b/libxfs/xfs_ialloc_btree.c
index 9ac143a..a58c1ea 100644
--- a/libxfs/xfs_ialloc_btree.c
+++ b/libxfs/xfs_ialloc_btree.c
@@ -166,7 +166,16 @@ xfs_inobt_init_rec_from_cur(
union xfs_btree_rec *rec)
{
rec->inobt.ir_startino = cpu_to_be32(cur->bc_rec.i.ir_startino);
- rec->inobt.ir_freecount = cpu_to_be32(cur->bc_rec.i.ir_freecount);
+ if (xfs_sb_version_hassparseinodes(&cur->bc_mp->m_sb)) {
+ rec->inobt.ir_u.sp.ir_holemask =
+ cpu_to_be16(cur->bc_rec.i.ir_holemask);
+ rec->inobt.ir_u.sp.ir_count = cur->bc_rec.i.ir_count;
+ rec->inobt.ir_u.sp.ir_freecount = cur->bc_rec.i.ir_freecount;
+ } else {
+ /* ir_holemask/ir_count not supported on-disk */
+ rec->inobt.ir_u.f.ir_freecount =
+ cpu_to_be32(cur->bc_rec.i.ir_freecount);
+ }
rec->inobt.ir_free = cpu_to_be64(cur->bc_rec.i.ir_free);
}
diff --git a/repair/phase5.c b/repair/phase5.c
index 1ce57a1..d01e72b 100644
--- a/repair/phase5.c
+++ b/repair/phase5.c
@@ -1240,7 +1240,7 @@ build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
inocnt += is_inode_free(ino_rec, k);
}
- bt_rec[j].ir_freecount = cpu_to_be32(inocnt);
+ bt_rec[j].ir_u.f.ir_freecount = cpu_to_be32(inocnt);
freecount += inocnt;
count += XFS_INODES_PER_CHUNK;
diff --git a/repair/scan.c b/repair/scan.c
index e7e05d1..e64d0e5 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -890,10 +890,10 @@ _("inode rec for ino %" PRIu64 " (%d/%d) overlaps existing rec (start %d/%d)\n")
}
}
- if (nfree != be32_to_cpu(rp->ir_freecount)) {
+ if (nfree != be32_to_cpu(rp->ir_u.f.ir_freecount)) {
do_warn(_("ir_freecount/free mismatch, inode "
"chunk %d/%u, freecount %d nfree %d\n"),
- agno, ino, be32_to_cpu(rp->ir_freecount), nfree);
+ agno, ino, be32_to_cpu(rp->ir_u.f.ir_freecount), nfree);
}
return suspect;
@@ -1089,10 +1089,10 @@ check_freecount:
* corruption). Issue a warning and continue the scan. The final btree
* reconstruction will correct this naturally.
*/
- if (nfree != be32_to_cpu(rp->ir_freecount)) {
+ if (nfree != be32_to_cpu(rp->ir_u.f.ir_freecount)) {
do_warn(
_("finobt ir_freecount/free mismatch, inode chunk %d/%u, freecount %d nfree %d\n"),
- agno, ino, be32_to_cpu(rp->ir_freecount), nfree);
+ agno, ino, be32_to_cpu(rp->ir_u.f.ir_freecount), nfree);
}
if (!nfree) {
@@ -1215,9 +1215,9 @@ _("inode btree block claimed (state %d), agno %d, bno %d, suspect %d\n"),
agcnts->agicount += XFS_INODES_PER_CHUNK;
agcnts->icount += XFS_INODES_PER_CHUNK;
agcnts->agifreecount +=
- be32_to_cpu(rp[i].ir_freecount);
+ be32_to_cpu(rp[i].ir_u.f.ir_freecount);
agcnts->ifreecount +=
- be32_to_cpu(rp[i].ir_freecount);
+ be32_to_cpu(rp[i].ir_u.f.ir_freecount);
suspect = scan_single_ino_chunk(agno, &rp[i],
suspect);
@@ -1228,7 +1228,7 @@ _("inode btree block claimed (state %d), agno %d, bno %d, suspect %d\n"),
* consistent with the agi
*/
agcnts->fibtfreecount +=
- be32_to_cpu(rp[i].ir_freecount);
+ be32_to_cpu(rp[i].ir_u.f.ir_freecount);
suspect = scan_single_finobt_chunk(agno, &rp[i],
suspect);
--
1.9.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 09/28] xfs: pass inode count through ordered icreate log item
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (7 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 08/28] xfs: introduce inode record hole mask " Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 10/28] xfs: enable sparse inode chunks for v5 superblocks Brian Foster
` (19 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
v5 superblocks use an ordered log item for logging the initialization of
inode chunks. The icreate log item is currently hardcoded to an inode
count of 64 inodes.
The agbno and extent length are used to initialize the inode chunk from
log recovery. While an incorrect inode count does not lead to bad inode
chunk initialization, we should pass the correct inode count such that log
recovery has enough data to perform meaningful validity checks on the
chunk.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
libxfs/xfs_ialloc.c | 7 ++++---
libxfs/xfs_ialloc.h | 2 +-
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index 00de739..34f0290 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -245,6 +245,7 @@ xfs_ialloc_inode_init(
struct xfs_mount *mp,
struct xfs_trans *tp,
struct list_head *buffer_list,
+ int icount,
xfs_agnumber_t agno,
xfs_agblock_t agbno,
xfs_agblock_t length,
@@ -300,7 +301,7 @@ xfs_ialloc_inode_init(
* they track in the AIL as if they were physically logged.
*/
if (tp)
- xfs_icreate_log(tp, agno, agbno, mp->m_ialloc_inos,
+ xfs_icreate_log(tp, agno, agbno, icount,
mp->m_sb.sb_inodesize, length, gen);
} else
version = 2;
@@ -520,8 +521,8 @@ xfs_ialloc_ag_alloc(
* rather than a linear progression to prevent the next generation
* number from being easily guessable.
*/
- error = xfs_ialloc_inode_init(args.mp, tp, NULL, agno, args.agbno,
- args.len, prandom_u32());
+ error = xfs_ialloc_inode_init(args.mp, tp, NULL, newlen, agno,
+ args.agbno, args.len, prandom_u32());
if (error)
return error;
diff --git a/libxfs/xfs_ialloc.h b/libxfs/xfs_ialloc.h
index 100007d..4d4b702 100644
--- a/libxfs/xfs_ialloc.h
+++ b/libxfs/xfs_ialloc.h
@@ -156,7 +156,7 @@ int xfs_inobt_get_rec(struct xfs_btree_cur *cur,
* Inode chunk initialisation routine
*/
int xfs_ialloc_inode_init(struct xfs_mount *mp, struct xfs_trans *tp,
- struct list_head *buffer_list,
+ struct list_head *buffer_list, int icount,
xfs_agnumber_t agno, xfs_agblock_t agbno,
xfs_agblock_t length, unsigned int gen);
--
1.9.3
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 10/28] xfs: enable sparse inode chunks for v5 superblocks
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (8 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 09/28] xfs: pass inode count through ordered icreate log item Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 11/28] mkfs: sparse inode chunk support Brian Foster
` (18 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
Enable mounting of filesystems with sparse inode support enabled. Add
the incompat. feature bit to the *_ALL mask.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
libxfs/xfs_format.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
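Note (illustration only, not part of the patch): a minimal sketch of how a
mask like this is typically consulted, assuming a superblock that carries any
incompat bit outside the *_ALL mask is treated as unsupported. The helper name
sketch_has_unknown_incompat() is hypothetical; the macro values are copied
from the hunk in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as they appear in the xfs_format.h hunk of this patch. */
#define XFS_SB_FEAT_INCOMPAT_FTYPE	(1 << 0)	/* filetype in dirent */
#define XFS_SB_FEAT_INCOMPAT_SPINODES	(1 << 1)	/* sparse inode chunks */
#define XFS_SB_FEAT_INCOMPAT_ALL \
		(XFS_SB_FEAT_INCOMPAT_FTYPE | \
		 XFS_SB_FEAT_INCOMPAT_SPINODES)
#define XFS_SB_FEAT_INCOMPAT_UNKNOWN	~XFS_SB_FEAT_INCOMPAT_ALL

/*
 * Hypothetical helper: returns true if the on-disk feature word contains
 * an incompat bit that is not listed in XFS_SB_FEAT_INCOMPAT_ALL.
 */
static inline bool
sketch_has_unknown_incompat(uint32_t sb_features_incompat)
{
	return (sb_features_incompat & XFS_SB_FEAT_INCOMPAT_UNKNOWN) != 0;
}
```

With the updated mask, a feature word that sets only the SPINODES bit passes
this check. Before the change, the same bit would have fallen into the
UNKNOWN set.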
diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index 177a3fb..1699b8f 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -459,7 +459,8 @@ xfs_sb_has_ro_compat_feature(
#define XFS_SB_FEAT_INCOMPAT_FTYPE (1 << 0) /* filetype in dirent */
#define XFS_SB_FEAT_INCOMPAT_SPINODES (1 << 1) /* sparse inode chunks */
#define XFS_SB_FEAT_INCOMPAT_ALL \
- (XFS_SB_FEAT_INCOMPAT_FTYPE)
+ (XFS_SB_FEAT_INCOMPAT_FTYPE| \
+ XFS_SB_FEAT_INCOMPAT_SPINODES)
#define XFS_SB_FEAT_INCOMPAT_UNKNOWN ~XFS_SB_FEAT_INCOMPAT_ALL
static inline bool
--
1.9.3
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 11/28] mkfs: sparse inode chunk support
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (9 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 10/28] xfs: enable sparse inode chunks for v5 superblocks Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 12/28] db: support sparse inode chunk inobt record and sb fields Brian Foster
` (17 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
Allow formatting of filesystems with sparse inode chunks enabled via the
'-i sparse' option. Note that sparse inode chunk support requires a v5
superblock (-m crc=1).
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
man/man8/mkfs.xfs.8 | 17 +++++++++++++++++
mkfs/xfs_mkfs.c | 37 ++++++++++++++++++++++++++++++++++---
2 files changed, 51 insertions(+), 3 deletions(-)
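Note (illustration only, not part of the patch): a worked sketch of the
alignment update made in the xfs_mkfs.c hunk of this patch. struct sketch_sb
and sketch_set_sparse_alignment() are hypothetical stand-ins for the real
superblock and the inline mkfs code, and the numbers in the note after the
code are assumed example values.

```c
#include <stdint.h>

#define XFS_INODES_PER_CHUNK	64	/* standard chunk size per the man page text */

/* Hypothetical stand-in for the two superblock fields the patch touches. */
struct sketch_sb {
	uint32_t	sb_inoalignmt;	/* full chunk alignment, in fs blocks */
	uint32_t	sb_spino_align;	/* sparse chunk alignment, in fs blocks */
};

/*
 * Mirror the mkfs logic: keep the previously calculated alignment as the
 * sparse chunk alignment, then raise the full chunk alignment to the size
 * of a whole inode chunk expressed in filesystem blocks.
 */
static void
sketch_set_sparse_alignment(struct sketch_sb *sbp, int isize, int blocklog)
{
	sbp->sb_spino_align = sbp->sb_inoalignmt;
	/* '*' binds tighter than '>>', so this is (64 * isize) >> blocklog */
	sbp->sb_inoalignmt = XFS_INODES_PER_CHUNK * isize >> blocklog;
}
```

For example, with 512-byte inodes and 4096-byte blocks (blocklog 12), a full
chunk is 64 * 512 = 32768 bytes, so sb_inoalignmt becomes 8 blocks while
sb_spino_align keeps whatever alignment was calculated earlier.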
diff --git a/man/man8/mkfs.xfs.8 b/man/man8/mkfs.xfs.8
index ad9ff3d..542dea9 100644
--- a/man/man8/mkfs.xfs.8
+++ b/man/man8/mkfs.xfs.8
@@ -409,6 +409,23 @@ This is used to enable 32bit quota project identifiers. The
is either 0 or 1, with 1 signifying that 32bit projid are to be enabled.
If the value is omitted, 1 is assumed. (This default changed
in release version 3.2.0.)
+.TP
+.BI sparse[= value ]
+Enable sparse inode chunk allocation. The
+.I value
+is either 0 or 1, with 1 signifying that sparse allocation is enabled.
+If the value is omitted, 1 is assumed. Sparse inode allocation is
+disabled by default. This feature is only available for filesystems
+formatted with
+.B \-m crc=1.
+.IP
+When enabled, sparse inode allocation allows the filesystem to allocate
+inode chunks smaller than the standard 64 inodes when free space is
+severely limited. This feature is useful for filesystems that might
+fragment free space over time such that no free extents are large enough
+to accommodate a chunk of 64 inodes. Without this feature enabled, inode
+allocations can fail with out of space errors under severely fragmented
+free space conditions.
.RE
.TP
.BI \-l " log_section_options"
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index 1770666..a3f29e0 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -109,6 +109,8 @@ char *iopts[] = {
"attr",
#define I_PROJID32BIT 6
"projid32bit",
+#define I_SPINODES 7
+ "sparse",
NULL
};
@@ -1004,6 +1006,7 @@ main(
int lazy_sb_counters;
int crcs_enabled;
int finobt;
+ int spinodes;
progname = basename(argv[0]);
setlocale(LC_ALL, "");
@@ -1038,6 +1041,7 @@ main(
lazy_sb_counters = 1;
crcs_enabled = 0;
finobt = 0;
+ spinodes = 0;
memset(&fsx, 0, sizeof(fsx));
memset(&xi, 0, sizeof(xi));
@@ -1359,6 +1363,13 @@ main(
illegal(value, "i projid32bit");
projid16bit = c ? 0 : 1;
break;
+ case I_SPINODES:
+ if (!value || *value == '\0')
+ value = "1";
+ spinodes = atoi(value);
+ if (spinodes < 0 || spinodes > 1)
+ illegal(value, "i sparse");
+ break;
default:
unknown('i', value);
}
@@ -1890,6 +1901,12 @@ _("warning: finobt not supported without CRC support, disabled.\n"));
finobt = 0;
}
+ if (spinodes && !crcs_enabled) {
+ fprintf(stderr,
+_("warning: sparse inodes not supported without CRC support, disabled.\n"));
+ spinodes = 0;
+ }
+
if (nsflag || nlflag) {
if (dirblocksize < blocksize ||
dirblocksize > XFS_MAX_BLOCKSIZE) {
@@ -2568,7 +2585,7 @@ _("size %s specified for log subvolume is too large, maximum is %lld blocks\n"),
printf(_(
"meta-data=%-22s isize=%-6d agcount=%lld, agsize=%lld blks\n"
" =%-22s sectsz=%-5u attr=%u, projid32bit=%u\n"
- " =%-22s crc=%-8u finobt=%u\n"
+ " =%-22s crc=%-8u finobt=%u, sparse=%u\n"
"data =%-22s bsize=%-6u blocks=%llu, imaxpct=%u\n"
" =%-22s sunit=%-6u swidth=%u blks\n"
"naming =version %-14u bsize=%-6u ascii-ci=%d ftype=%d\n"
@@ -2577,7 +2594,7 @@ _("size %s specified for log subvolume is too large, maximum is %lld blocks\n"),
"realtime =%-22s extsz=%-6d blocks=%lld, rtextents=%lld\n"),
dfile, isize, (long long)agcount, (long long)agsize,
"", sectorsize, attrversion, !projid16bit,
- "", crcs_enabled, finobt,
+ "", crcs_enabled, finobt, spinodes,
"", blocksize, (long long)dblocks, imaxpct,
"", dsunit, dswidth,
dirversion, dirblocksize, nci, dirftype,
@@ -2646,6 +2663,20 @@ _("size %s specified for log subvolume is too large, maximum is %lld blocks\n"),
sbp->sb_logsectsize = 0;
}
+ /*
+ * Sparse inode chunk support has two main inode alignment requirements.
+ * First, sparse chunk alignment must match the cluster size. Second,
+ * full chunk alignment must match the inode chunk size.
+ *
+ * Copy the already calculated/scaled inoalignmt to spino_align and
+ * update the former to the full inode chunk size.
+ */
+ if (spinodes) {
+ sbp->sb_spino_align = sbp->sb_inoalignmt;
+ sbp->sb_inoalignmt = XFS_INODES_PER_CHUNK * isize >> blocklog;
+ sbp->sb_features_incompat |= XFS_SB_FEAT_INCOMPAT_SPINODES;
+ }
+
if (force_overwrite)
zero_old_xfs_structures(&xi, sbp);
@@ -3193,7 +3224,7 @@ usage( void )
sectlog=n|sectsize=num\n\
/* force overwrite */ [-f]\n\
/* inode size */ [-i log=n|perblock=n|size=num,maxpct=n,attr=0|1|2,\n\
- projid32bit=0|1]\n\
+ projid32bit=0|1,sparse=0|1]\n\
/* no discard */ [-K]\n\
/* log subvol */ [-l agnum=n,internal,size=num,logdev=xxx,version=n\n\
sunit=value|su=num,sectlog=n|sectsize=num,\n\
--
1.9.3
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 12/28] db: support sparse inode chunk inobt record and sb fields
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (10 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 11/28] mkfs: sparse inode chunk support Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 13/28] db: show sparse inodes feature state in version command output Brian Foster
` (16 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
The sparse inode feature uses a different on-disk inobt record format.
Define the new record format in the xfs_db type infrastructure and use
this definition for filesystems that support sparse inodes.
Also update the superblock type structure with the sb_spino_align field.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
db/btblock.c | 36 ++++++++++++++++++++++++++++++++++++
db/btblock.h | 3 +++
db/field.c | 4 ++++
db/field.h | 2 ++
db/init.c | 4 +++-
db/sb.c | 1 +
db/type.c | 40 ++++++++++++++++++++++++++++++++++++++++
db/type.h | 1 +
8 files changed, 90 insertions(+), 1 deletion(-)
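Note (illustration only, not part of the patch): a sketch of the two record
layouts that the new xfs_db field table describes. Field names follow the
ir_u.f / ir_u.sp accessors used in this series; struct sketch_inobt_rec is a
hypothetical stand-in that uses plain fixed-width integers instead of the
big-endian on-disk types, so it models sizes and offsets only.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the on-disk inobt record; endianness is not modelled. */
struct sketch_inobt_rec {
	uint32_t	ir_startino;		/* starting inode number */
	union {
		struct {
			uint32_t ir_freecount;	/* full format: 4-byte free count */
		} f;
		struct {
			uint16_t ir_holemask;	/* sparse format: hole mask */
			uint8_t	 ir_count;	/* sparse format: inode count */
			uint8_t	 ir_freecount;	/* sparse format: 1-byte free count */
		} sp;
	} ir_u;
	uint64_t	ir_free;		/* free inode bitmask */
};

/* Both formats occupy the same 4 bytes, so the record size is unchanged. */
_Static_assert(sizeof(struct sketch_inobt_rec) == 16,
	       "record size must not depend on the format");
_Static_assert(offsetof(struct sketch_inobt_rec, ir_u.f.ir_freecount) ==
	       offsetof(struct sketch_inobt_rec, ir_u.sp.ir_holemask),
	       "the sparse fields overlay the full format free count");
```

Because the record size is the same in both formats, the new inobtsprec
field type can reuse bitsz(xfs_inobt_rec_t) and only the per-field table
differs.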
diff --git a/db/btblock.c b/db/btblock.c
index d87991d..982b52b 100644
--- a/db/btblock.c
+++ b/db/btblock.c
@@ -392,6 +392,11 @@ const field_t inobt_crc_hfld[] = {
{ NULL }
};
+const field_t inobt_spcrc_hfld[] = {
+ { "", FLDT_INOBT_SPCRC, OI(0), C1, 0, TYP_NONE },
+ { NULL }
+};
+
#define OFF(f) bitize(offsetof(struct xfs_btree_block, bb_ ## f))
const field_t inobt_flds[] = {
{ "magic", FLDT_UINT32X, OI(OFF(magic)), C1, 0, TYP_NONE },
@@ -426,6 +431,26 @@ const field_t inobt_crc_flds[] = {
FLD_ARRAY|FLD_ABASE1|FLD_COUNT|FLD_OFFSET, TYP_INOBT },
{ NULL }
};
+const field_t inobt_spcrc_flds[] = {
+ { "magic", FLDT_UINT32X, OI(OFF(magic)), C1, 0, TYP_NONE },
+ { "level", FLDT_UINT16D, OI(OFF(level)), C1, 0, TYP_NONE },
+ { "numrecs", FLDT_UINT16D, OI(OFF(numrecs)), C1, 0, TYP_NONE },
+ { "leftsib", FLDT_AGBLOCK, OI(OFF(u.s.bb_leftsib)), C1, 0, TYP_INOBT },
+ { "rightsib", FLDT_AGBLOCK, OI(OFF(u.s.bb_rightsib)), C1, 0, TYP_INOBT },
+ { "bno", FLDT_DFSBNO, OI(OFF(u.s.bb_blkno)), C1, 0, TYP_INOBT },
+ { "lsn", FLDT_UINT64X, OI(OFF(u.s.bb_lsn)), C1, 0, TYP_NONE },
+ { "uuid", FLDT_UUID, OI(OFF(u.s.bb_uuid)), C1, 0, TYP_NONE },
+ { "owner", FLDT_AGNUMBER, OI(OFF(u.s.bb_owner)), C1, 0, TYP_NONE },
+ { "crc", FLDT_CRC, OI(OFF(u.s.bb_crc)), C1, 0, TYP_NONE },
+ { "recs", FLDT_INOBTSPREC, btblock_rec_offset, btblock_rec_count,
+ FLD_ARRAY|FLD_ABASE1|FLD_COUNT|FLD_OFFSET, TYP_NONE },
+ { "keys", FLDT_INOBTKEY, btblock_key_offset, btblock_key_count,
+ FLD_ARRAY|FLD_ABASE1|FLD_COUNT|FLD_OFFSET, TYP_NONE },
+ { "ptrs", FLDT_INOBTPTR, btblock_ptr_offset, btblock_key_count,
+ FLD_ARRAY|FLD_ABASE1|FLD_COUNT|FLD_OFFSET, TYP_INOBT },
+ { NULL }
+};
+
#undef OFF
#define KOFF(f) bitize(offsetof(xfs_inobt_key_t, ir_ ## f))
@@ -443,6 +468,17 @@ const field_t inobt_rec_flds[] = {
{ "free", FLDT_INOFREE, OI(ROFF(ir_free)), C1, 0, TYP_NONE },
{ NULL }
};
+/* sparse inode on-disk format */
+const field_t inobt_sprec_flds[] = {
+ { "startino", FLDT_AGINO, OI(ROFF(ir_startino)), C1, 0, TYP_INODE },
+ { "holemask", FLDT_UINT16X, OI(ROFF(ir_u.sp.ir_holemask)), C1, 0,
+ TYP_NONE },
+ { "count", FLDT_UINT8D, OI(ROFF(ir_u.sp.ir_count)), C1, 0, TYP_NONE },
+ { "freecount", FLDT_INT8D, OI(ROFF(ir_u.sp.ir_freecount)), C1, 0,
+ TYP_NONE },
+ { "free", FLDT_INOFREE, OI(ROFF(ir_free)), C1, 0, TYP_NONE },
+ { NULL }
+};
#undef ROFF
diff --git a/db/btblock.h b/db/btblock.h
index daee060..228eb36 100644
--- a/db/btblock.h
+++ b/db/btblock.h
@@ -33,9 +33,12 @@ extern const struct field bmapbtd_rec_flds[];
extern const struct field inobt_flds[];
extern const struct field inobt_hfld[];
extern const struct field inobt_crc_flds[];
+extern const struct field inobt_spcrc_flds[];
extern const struct field inobt_crc_hfld[];
+extern const struct field inobt_spcrc_hfld[];
extern const struct field inobt_key_flds[];
extern const struct field inobt_rec_flds[];
+extern const struct field inobt_sprec_flds[];
extern const struct field bnobt_flds[];
extern const struct field bnobt_hfld[];
diff --git a/db/field.c b/db/field.c
index 816065e..52d9d9b 100644
--- a/db/field.c
+++ b/db/field.c
@@ -285,12 +285,16 @@ const ftattr_t ftattrtab[] = {
FTARG_SIZE, NULL, inobt_flds },
{ FLDT_INOBT_CRC, "inobt", NULL, (char *)inobt_crc_flds, btblock_size,
FTARG_SIZE, NULL, inobt_crc_flds },
+ { FLDT_INOBT_SPCRC, "inobt", NULL, (char *)inobt_spcrc_flds,
+ btblock_size, FTARG_SIZE, NULL, inobt_spcrc_flds },
{ FLDT_INOBTKEY, "inobtkey", fp_sarray, (char *)inobt_key_flds,
SI(bitsz(xfs_inobt_key_t)), 0, NULL, inobt_key_flds },
{ FLDT_INOBTPTR, "inobtptr", fp_num, "%u", SI(bitsz(xfs_inobt_ptr_t)),
0, fa_agblock, NULL },
{ FLDT_INOBTREC, "inobtrec", fp_sarray, (char *)inobt_rec_flds,
SI(bitsz(xfs_inobt_rec_t)), 0, NULL, inobt_rec_flds },
+ { FLDT_INOBTSPREC, "inobtsprec", fp_sarray, (char *) inobt_sprec_flds,
+ SI(bitsz(xfs_inobt_rec_t)), 0, NULL, inobt_sprec_flds },
{ FLDT_INODE, "inode", NULL, (char *)inode_flds, inode_size, FTARG_SIZE,
NULL, inode_flds },
{ FLDT_INODE_CRC, "inode", NULL, (char *)inode_crc_flds, inode_size,
diff --git a/db/field.h b/db/field.h
index 6343c9a..2546240 100644
--- a/db/field.h
+++ b/db/field.h
@@ -143,9 +143,11 @@ typedef enum fldt {
FLDT_INO,
FLDT_INOBT,
FLDT_INOBT_CRC,
+ FLDT_INOBT_SPCRC,
FLDT_INOBTKEY,
FLDT_INOBTPTR,
FLDT_INOBTREC,
+ FLDT_INOBTSPREC,
FLDT_INODE,
FLDT_INODE_CRC,
FLDT_INOFREE,
diff --git a/db/init.c b/db/init.c
index e7f536a..f93ab15 100644
--- a/db/init.c
+++ b/db/init.c
@@ -169,7 +169,9 @@ init(
}
}
- if (xfs_sb_version_hascrc(&mp->m_sb))
+ if (xfs_sb_version_hassparseinodes(&mp->m_sb))
+ type_set_tab_spcrc();
+ else if (xfs_sb_version_hascrc(&mp->m_sb))
type_set_tab_crc();
push_cur();
diff --git a/db/sb.c b/db/sb.c
index cd12f83..4208569 100644
--- a/db/sb.c
+++ b/db/sb.c
@@ -119,6 +119,7 @@ const field_t sb_flds[] = {
{ "features_log_incompat", FLDT_UINT32X, OI(OFF(features_log_incompat)),
C1, 0, TYP_NONE },
{ "crc", FLDT_CRC, OI(OFF(crc)), C1, 0, TYP_NONE },
+ { "spino_align", FLDT_EXTLEN, OI(OFF(spino_align)), C1, 0, TYP_NONE },
{ "pquotino", FLDT_INO, OI(OFF(pquotino)), C1, 0, TYP_INODE },
{ "lsn", FLDT_UINT64X, OI(OFF(lsn)), C1, 0, TYP_NONE },
{ NULL }
diff --git a/db/type.c b/db/type.c
index b29f2a4..28535de 100644
--- a/db/type.c
+++ b/db/type.c
@@ -107,6 +107,40 @@ static const typ_t __typtab_crc[] = {
{ TYP_NONE, NULL }
};
+static const typ_t __typtab_spcrc[] = {
+ { TYP_AGF, "agf", handle_struct, agf_hfld, &xfs_agf_buf_ops },
+ { TYP_AGFL, "agfl", handle_struct, agfl_crc_hfld, &xfs_agfl_buf_ops },
+ { TYP_AGI, "agi", handle_struct, agi_hfld, &xfs_agi_buf_ops },
+ { TYP_ATTR, "attr3", handle_struct, attr3_hfld,
+ &xfs_attr3_db_buf_ops },
+ { TYP_BMAPBTA, "bmapbta", handle_struct, bmapbta_crc_hfld,
+ &xfs_bmbt_buf_ops },
+ { TYP_BMAPBTD, "bmapbtd", handle_struct, bmapbtd_crc_hfld,
+ &xfs_bmbt_buf_ops },
+ { TYP_BNOBT, "bnobt", handle_struct, bnobt_crc_hfld,
+ &xfs_allocbt_buf_ops },
+ { TYP_CNTBT, "cntbt", handle_struct, cntbt_crc_hfld,
+ &xfs_allocbt_buf_ops },
+ { TYP_DATA, "data", handle_block, NULL, NULL },
+ { TYP_DIR2, "dir3", handle_struct, dir3_hfld,
+ &xfs_dir3_db_buf_ops },
+ { TYP_DQBLK, "dqblk", handle_struct, dqblk_hfld,
+ &xfs_dquot_buf_ops },
+ { TYP_INOBT, "inobt", handle_struct, inobt_spcrc_hfld,
+ &xfs_inobt_buf_ops },
+ { TYP_INODATA, "inodata", NULL, NULL, NULL },
+ { TYP_INODE, "inode", handle_struct, inode_crc_hfld,
+ &xfs_inode_buf_ops },
+ { TYP_LOG, "log", NULL, NULL, NULL },
+ { TYP_RTBITMAP, "rtbitmap", NULL, NULL, NULL },
+ { TYP_RTSUMMARY, "rtsummary", NULL, NULL, NULL },
+ { TYP_SB, "sb", handle_struct, sb_hfld, &xfs_sb_buf_ops },
+ { TYP_SYMLINK, "symlink", handle_struct, symlink_crc_hfld,
+ &xfs_symlink_buf_ops },
+ { TYP_TEXT, "text", handle_text, NULL, NULL },
+ { TYP_NONE, NULL }
+};
+
const typ_t *typtab = __typtab;
void
@@ -115,6 +149,12 @@ type_set_tab_crc(void)
typtab = __typtab_crc;
}
+void
+type_set_tab_spcrc(void)
+{
+ typtab = __typtab_spcrc;
+}
+
static const typ_t *
findtyp(
char *name)
diff --git a/db/type.h b/db/type.h
index 3bb26f1..c9421d1 100644
--- a/db/type.h
+++ b/db/type.h
@@ -48,6 +48,7 @@ extern const typ_t *typtab, *cur_typ;
extern void type_init(void);
extern void type_set_tab_crc(void);
+extern void type_set_tab_spcrc(void);
extern void handle_block(int action, const struct field *fields, int argc,
char **argv);
extern void handle_string(int action, const struct field *fields, int argc,
--
1.9.3
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 13/28] db: show sparse inodes feature state in version command output
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (11 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 12/28] db: support sparse inode chunk inobt record and sb fields Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 14/28] growfs: display sparse inode status from xfs_info Brian Foster
` (15 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
The xfs_db version command prints a string for each of the various
features supported by a filesystem. Include 'SPARSE_INODES' in the
version string when sparse inode chunk allocation is supported by the
fs.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
db/sb.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/db/sb.c b/db/sb.c
index 4208569..ff2318c 100644
--- a/db/sb.c
+++ b/db/sb.c
@@ -661,6 +661,8 @@ version_string(
strcat(s, ",CRC");
if (xfs_sb_version_hasftype(sbp))
strcat(s, ",FTYPE");
+ if (xfs_sb_version_hassparseinodes(sbp))
+ strcat(s, ",SPARSE_INODES");
return s;
}
--
1.9.3
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 14/28] growfs: display sparse inode status from xfs_info
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (12 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 13/28] db: show sparse inodes feature state in version command output Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 15/28] repair: handle sparse format inobt record freecount correctly Brian Foster
` (14 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
Check the sparse inode feature bit of the geometry flags and display
whether sparse inode chunks are supported by the fs.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
growfs/xfs_growfs.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/growfs/xfs_growfs.c b/growfs/xfs_growfs.c
index 8e611b6..4a344fe 100644
--- a/growfs/xfs_growfs.c
+++ b/growfs/xfs_growfs.c
@@ -57,12 +57,13 @@ report_info(
int crcs_enabled,
int cimode,
int ftype_enabled,
- int finobt_enabled)
+ int finobt_enabled,
+ int spinodes)
{
printf(_(
"meta-data=%-22s isize=%-6u agcount=%u, agsize=%u blks\n"
" =%-22s sectsz=%-5u attr=%u, projid32bit=%u\n"
- " =%-22s crc=%-8u finobt=%u\n"
+ " =%-22s crc=%-8u finobt=%u spinodes=%u\n"
"data =%-22s bsize=%-6u blocks=%llu, imaxpct=%u\n"
" =%-22s sunit=%-6u swidth=%u blks\n"
"naming =version %-14u bsize=%-6u ascii-ci=%d ftype=%d\n"
@@ -72,7 +73,7 @@ report_info(
mntpoint, geo.inodesize, geo.agcount, geo.agblocks,
"", geo.sectsize, attrversion, projid32bit,
- "", crcs_enabled, finobt_enabled,
+ "", crcs_enabled, finobt_enabled, spinodes,
"", geo.blocksize, (unsigned long long)geo.datablocks,
geo.imaxpct,
"", geo.sunit, geo.swidth,
@@ -125,6 +126,7 @@ main(int argc, char **argv)
int crcs_enabled;
int ftype_enabled = 0;
int finobt_enabled; /* free inode btree */
+ int spinodes;
progname = basename(argv[0]);
setlocale(LC_ALL, "");
@@ -247,11 +249,12 @@ main(int argc, char **argv)
crcs_enabled = geo.flags & XFS_FSOP_GEOM_FLAGS_V5SB ? 1 : 0;
ftype_enabled = geo.flags & XFS_FSOP_GEOM_FLAGS_FTYPE ? 1 : 0;
finobt_enabled = geo.flags & XFS_FSOP_GEOM_FLAGS_FINOBT ? 1 : 0;
+ spinodes = geo.flags & XFS_FSOP_GEOM_FLAGS_SPINODES ? 1 : 0;
if (nflag) {
report_info(geo, datadev, isint, logdev, rtdev,
lazycount, dirversion, logversion,
attrversion, projid32bit, crcs_enabled, ci,
- ftype_enabled, finobt_enabled);
+ ftype_enabled, finobt_enabled, spinodes);
exit(0);
}
@@ -289,7 +292,7 @@ main(int argc, char **argv)
report_info(geo, datadev, isint, logdev, rtdev,
lazycount, dirversion, logversion,
attrversion, projid32bit, crcs_enabled, ci, ftype_enabled,
- finobt_enabled);
+ finobt_enabled, spinodes);
ddsize = xi.dsize;
dlsize = ( xi.logBBsize? xi.logBBsize :
--
1.9.3
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 15/28] repair: handle sparse format inobt record freecount correctly
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (13 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 14/28] growfs: display sparse inode status from xfs_info Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-05 0:53 ` Dave Chinner
2015-06-02 18:41 ` [PATCH 16/28] repair: remove duplicate field from aghdr_cnts Brian Foster
` (13 subsequent siblings)
28 siblings, 1 reply; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
The sparse inode chunk feature introduces a new inobt record format that
converts ir_freecount from 4 bytes to 1 byte. ir_freecount references
throughout repair currently assume the 'full' format and endian-convert
from the 32-bit value.
Update the xfs_repair inobt scan and tree rebuild codepaths to use the
correct record format for ir_freecount when sparse inodes is enabled.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/phase5.c | 6 +++++-
repair/scan.c | 37 +++++++++++++++++++++++++------------
2 files changed, 30 insertions(+), 13 deletions(-)
diff --git a/repair/phase5.c b/repair/phase5.c
index d01e72b..04bf049 100644
--- a/repair/phase5.c
+++ b/repair/phase5.c
@@ -1240,7 +1240,11 @@ build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
inocnt += is_inode_free(ino_rec, k);
}
- bt_rec[j].ir_u.f.ir_freecount = cpu_to_be32(inocnt);
+ if (xfs_sb_version_hassparseinodes(&mp->m_sb))
+ bt_rec[j].ir_u.sp.ir_freecount = inocnt;
+ else
+ bt_rec[j].ir_u.f.ir_freecount =
+ cpu_to_be32(inocnt);
freecount += inocnt;
count += XFS_INODES_PER_CHUNK;
diff --git a/repair/scan.c b/repair/scan.c
index e64d0e5..f42459c 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -751,11 +751,16 @@ scan_single_ino_chunk(
int off;
int state;
ino_tree_node_t *ino_rec, *first_rec, *last_rec;
+ int freecount;
ino = be32_to_cpu(rp->ir_startino);
off = XFS_AGINO_TO_OFFSET(mp, ino);
agbno = XFS_AGINO_TO_AGBNO(mp, ino);
lino = XFS_AGINO_TO_INO(mp, agno, ino);
+ if (xfs_sb_version_hassparseinodes(&mp->m_sb))
+ freecount = rp->ir_u.sp.ir_freecount;
+ else
+ freecount = be32_to_cpu(rp->ir_u.f.ir_freecount);
/*
* on multi-block block chunks, all chunks start
@@ -890,10 +895,10 @@ _("inode rec for ino %" PRIu64 " (%d/%d) overlaps existing rec (start %d/%d)\n")
}
}
- if (nfree != be32_to_cpu(rp->ir_u.f.ir_freecount)) {
- do_warn(_("ir_freecount/free mismatch, inode "
- "chunk %d/%u, freecount %d nfree %d\n"),
- agno, ino, be32_to_cpu(rp->ir_u.f.ir_freecount), nfree);
+ if (nfree != freecount) {
+ do_warn(
+_("ir_freecount/free mismatch, inode chunk %d/%u, freecount %d nfree %d\n"),
+ agno, ino, freecount, nfree);
}
return suspect;
@@ -913,11 +918,16 @@ scan_single_finobt_chunk(
int off;
int state;
ino_tree_node_t *first_rec, *last_rec, *ino_rec;
+ int freecount;
ino = be32_to_cpu(rp->ir_startino);
off = XFS_AGINO_TO_OFFSET(mp, ino);
agbno = XFS_AGINO_TO_AGBNO(mp, ino);
lino = XFS_AGINO_TO_INO(mp, agno, ino);
+ if (xfs_sb_version_hassparseinodes(&mp->m_sb))
+ freecount = rp->ir_u.sp.ir_freecount;
+ else
+ freecount = be32_to_cpu(rp->ir_u.f.ir_freecount);
/*
* on multi-block block chunks, all chunks start at the beginning of the
@@ -1089,10 +1099,10 @@ check_freecount:
* corruption). Issue a warning and continue the scan. The final btree
* reconstruction will correct this naturally.
*/
- if (nfree != be32_to_cpu(rp->ir_u.f.ir_freecount)) {
+ if (nfree != freecount) {
do_warn(
_("finobt ir_freecount/free mismatch, inode chunk %d/%u, freecount %d nfree %d\n"),
- agno, ino, be32_to_cpu(rp->ir_u.f.ir_freecount), nfree);
+ agno, ino, freecount, nfree);
}
if (!nfree) {
@@ -1137,6 +1147,7 @@ scan_inobt(
xfs_inobt_ptr_t *pp;
xfs_inobt_rec_t *rp;
int hdr_errors;
+ int freecount;
hdr_errors = 0;
@@ -1210,14 +1221,17 @@ _("inode btree block claimed (state %d), agno %d, bno %d, suspect %d\n"),
* the block. skip processing of bogus records.
*/
for (i = 0; i < numrecs; i++) {
+ if (xfs_sb_version_hassparseinodes(&mp->m_sb))
+ freecount = rp[i].ir_u.sp.ir_freecount;
+ else
+ freecount = be32_to_cpu(rp[i].ir_u.f.ir_freecount);
+
if (magic == XFS_IBT_MAGIC ||
magic == XFS_IBT_CRC_MAGIC) {
agcnts->agicount += XFS_INODES_PER_CHUNK;
agcnts->icount += XFS_INODES_PER_CHUNK;
- agcnts->agifreecount +=
- be32_to_cpu(rp[i].ir_u.f.ir_freecount);
- agcnts->ifreecount +=
- be32_to_cpu(rp[i].ir_u.f.ir_freecount);
+ agcnts->agifreecount += freecount;
+ agcnts->ifreecount += freecount;
suspect = scan_single_ino_chunk(agno, &rp[i],
suspect);
@@ -1227,8 +1241,7 @@ _("inode btree block claimed (state %d), agno %d, bno %d, suspect %d\n"),
* so only the free inode count is expected to be
* consistent with the agi
*/
- agcnts->fibtfreecount +=
- be32_to_cpu(rp[i].ir_u.f.ir_freecount);
+ agcnts->fibtfreecount += freecount;
suspect = scan_single_finobt_chunk(agno, &rp[i],
suspect);
--
1.9.3
^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 15/28] repair: handle sparse format inobt record freecount correctly
2015-06-02 18:41 ` [PATCH 15/28] repair: handle sparse format inobt record freecount correctly Brian Foster
@ 2015-06-05 0:53 ` Dave Chinner
0 siblings, 0 replies; 38+ messages in thread
From: Dave Chinner @ 2015-06-05 0:53 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Tue, Jun 02, 2015 at 02:41:48PM -0400, Brian Foster wrote:
> The sparse inode chunk feature introduces a new inobt record format that
> converts ir_freecount from 4 bytes to 1 byte. ir_freecount references
> throughout repair currently assume the 'full' format and endian-convert
> from the 32-bit value.
>
> Update the xfs_repair inobt scan and tree rebuild codepaths to use the
> correct record format for ir_freecount when sparse inodes is enabled.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> repair/phase5.c | 6 +++++-
> repair/scan.c | 37 +++++++++++++++++++++++++------------
> 2 files changed, 30 insertions(+), 13 deletions(-)
>
> diff --git a/repair/phase5.c b/repair/phase5.c
> index d01e72b..04bf049 100644
> --- a/repair/phase5.c
> +++ b/repair/phase5.c
> @@ -1240,7 +1240,11 @@ build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
> inocnt += is_inode_free(ino_rec, k);
> }
>
> - bt_rec[j].ir_u.f.ir_freecount = cpu_to_be32(inocnt);
> + if (xfs_sb_version_hassparseinodes(&mp->m_sb))
> + bt_rec[j].ir_u.sp.ir_freecount = inocnt;
> + else
> + bt_rec[j].ir_u.f.ir_freecount =
> + cpu_to_be32(inocnt);
> freecount += inocnt;
> count += XFS_INODES_PER_CHUNK;
Can you make this a "inorec_set_freecount(mp, rec, count)" helper?
> diff --git a/repair/scan.c b/repair/scan.c
> index e64d0e5..f42459c 100644
> --- a/repair/scan.c
> +++ b/repair/scan.c
> @@ -751,11 +751,16 @@ scan_single_ino_chunk(
> int off;
> int state;
> ino_tree_node_t *ino_rec, *first_rec, *last_rec;
> + int freecount;
>
> ino = be32_to_cpu(rp->ir_startino);
> off = XFS_AGINO_TO_OFFSET(mp, ino);
> agbno = XFS_AGINO_TO_AGBNO(mp, ino);
> lino = XFS_AGINO_TO_INO(mp, agno, ino);
> + if (xfs_sb_version_hassparseinodes(&mp->m_sb))
> + freecount = rp->ir_u.sp.ir_freecount;
> + else
> + freecount = be32_to_cpu(rp->ir_u.f.ir_freecount);
And this a "freecount = inorec_get_freecount(mp, rec)" helper?
The code is otherwise fine, so I'll apply this patch as is to keep
working through the series. Can you send the helper update as a
delta patch that applies at the end of the entire series?
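
The two helpers requested above could look roughly like the following. This is only a sketch under stated assumptions, not the delta patch that was eventually posted: `union demo_ir_u` is a simplified stand-in for the on-disk `ir_u` union, the `sparse` flag stands in for `xfs_sb_version_hassparseinodes(&mp->m_sb)`, and the `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the record's ir_u union (not the real header). */
union demo_ir_u {
	struct {
		uint8_t	ir_freecount[4];	/* full format: 32-bit big-endian */
	} f;
	struct {
		uint8_t	ir_holemask[2];		/* sparse format: 16-bit big-endian */
		uint8_t	ir_count;
		uint8_t	ir_freecount;		/* sparse format: single byte */
	} sp;
};

/* Read the free count using whichever record format is in effect. */
static int
demo_inorec_get_freecount(bool sparse, const union demo_ir_u *u)
{
	if (sparse)
		return u->sp.ir_freecount;

	return (int)(((uint32_t)u->f.ir_freecount[0] << 24) |
		     ((uint32_t)u->f.ir_freecount[1] << 16) |
		     ((uint32_t)u->f.ir_freecount[2] << 8) |
		      (uint32_t)u->f.ir_freecount[3]);
}

/* Store the free count using whichever record format is in effect. */
static void
demo_inorec_set_freecount(bool sparse, union demo_ir_u *u, int count)
{
	assert(count >= 0 && count <= 64);

	if (sparse) {
		u->sp.ir_freecount = (uint8_t)count;
		return;
	}

	u->f.ir_freecount[0] = (uint8_t)((uint32_t)count >> 24);
	u->f.ir_freecount[1] = (uint8_t)((uint32_t)count >> 16);
	u->f.ir_freecount[2] = (uint8_t)((uint32_t)count >> 8);
	u->f.ir_freecount[3] = (uint8_t)count;
}
```

With helpers of this shape, each open-coded if/else on the superblock feature bit in phase5.c and scan.c would collapse to a single call.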
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* [PATCH 16/28] repair: remove duplicate field from aghdr_cnts
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (14 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 15/28] repair: handle sparse format inobt record freecount correctly Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 17/28] repair: use ir_count for filesystems with sparse inode support Brian Foster
` (12 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
The agicount and icount fields are used in separate parts of the AG scan
but both fields track the same data. agicount is used to compare with
the AGI header and icount is used to calculate the total inode count to
compare with sb_icount.
Use agicount rather than icount in scan_ags() and remove the icount
field.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/scan.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/repair/scan.c b/repair/scan.c
index f42459c..9daa488 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -44,7 +44,6 @@ struct aghdr_cnts {
__uint32_t agicount;
__uint32_t agifreecount;
__uint64_t fdblocks;
- __uint64_t icount;
__uint64_t ifreecount;
__uint32_t fibtfreecount;
};
@@ -1229,7 +1228,6 @@ _("inode btree block claimed (state %d), agno %d, bno %d, suspect %d\n"),
if (magic == XFS_IBT_MAGIC ||
magic == XFS_IBT_CRC_MAGIC) {
agcnts->agicount += XFS_INODES_PER_CHUNK;
- agcnts->icount += XFS_INODES_PER_CHUNK;
agcnts->agifreecount += freecount;
agcnts->ifreecount += freecount;
@@ -1668,7 +1666,7 @@ scan_ags(
/* tally up the counts */
for (i = 0; i < mp->m_sb.sb_agcount; i++) {
fdblocks += agcnts[i].fdblocks;
- icount += agcnts[i].icount;
+ icount += agcnts[i].agicount;
ifreecount += agcnts[i].ifreecount;
}
--
1.9.3
* [PATCH 17/28] repair: use ir_count for filesystems with sparse inode support
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (15 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 16/28] repair: remove duplicate field from aghdr_cnts Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 18/28] repair: scan and track sparse inode chunks correctly Brian Foster
` (11 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
Repair currently assumes each inobt record covers 64 inodes and uses
this value to validate inode counts in the AGI headers and superblock.
This is not always the case with sparse inode support.
Update scan_inobt() to check for sparse inode support and use the new
ir_count field for inode accounting. ir_count contains the total number
of inodes tracked by the record.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/scan.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/repair/scan.c b/repair/scan.c
index 9daa488..8677e41 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -1227,7 +1227,16 @@ _("inode btree block claimed (state %d), agno %d, bno %d, suspect %d\n"),
if (magic == XFS_IBT_MAGIC ||
magic == XFS_IBT_CRC_MAGIC) {
- agcnts->agicount += XFS_INODES_PER_CHUNK;
+ int icount = XFS_INODES_PER_CHUNK;
+
+ /*
+ * ir_count holds the inode count for all
+ * records on fs' with sparse inode support
+ */
+ if (xfs_sb_version_hassparseinodes(&mp->m_sb))
+ icount = rp[i].ir_u.sp.ir_count;
+
+ agcnts->agicount += icount;
agcnts->agifreecount += freecount;
agcnts->ifreecount += freecount;
--
1.9.3
* [PATCH 18/28] repair: scan and track sparse inode chunks correctly
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (16 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 17/28] repair: use ir_count for filesystems with sparse inode support Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-05 0:56 ` Dave Chinner
2015-06-02 18:41 ` [PATCH 19/28] repair: scan sparse finobt records correctly Brian Foster
` (10 subsequent siblings)
28 siblings, 1 reply; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
Phase 2 of xfs_repair scans the on-disk inobt and creates in-core
records for all inodes in the fs. This also involves marking
free/allocated state of all inodes, internal record verification and
block state management for the inode chunks tracked by inode records.
Various parts of the inobt scan mechanism assume fully allocated inode
records and thus lead to spurious errors when sparse inode records are
encountered.
Update the inobt scan to detect and handle sparse inode records
correctly. Do not set the allocation state of blocks in sparse inode
regions as these blocks do not belong to the record. Do not account
sparse inodes against the ir_freecount as these inodes do not exist and
are not available for allocation by the fs. Finally, track the sparse
status of each individual inode in the in-core inode records for future
reference.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
include/libxfs.h | 16 ++++++++++++++++
repair/incore.h | 14 ++++++++++++++
repair/incore_ino.c | 1 +
repair/scan.c | 48 +++++++++++++++++++++++++++++++++++++++++-------
4 files changed, 72 insertions(+), 7 deletions(-)
diff --git a/include/libxfs.h b/include/libxfs.h
index 6a59cc0..3321c50 100644
--- a/include/libxfs.h
+++ b/include/libxfs.h
@@ -183,6 +183,22 @@ extern unsigned long libxfs_physmem(void); /* in kilobytes */
#define XFS_INOBT_IS_FREE_DISK(rp,i) \
((be64_to_cpu((rp)->ir_free) & XFS_INOBT_MASK(i)) != 0)
+static inline bool
+XFS_INOBT_IS_SPARSE_DISK(
+ struct xfs_inobt_rec *rp,
+ int offset)
+{
+ int spshift;
+ uint16_t holemask;
+
+ holemask = be16_to_cpu(rp->ir_u.sp.ir_holemask);
+ spshift = offset / XFS_INODES_PER_HOLEMASK_BIT;
+ if ((1 << spshift) & holemask)
+ return true;
+
+ return false;
+}
+
static inline void
libxfs_bmbt_disk_get_all(
struct xfs_bmbt_rec *rp,
diff --git a/repair/incore.h b/repair/incore.h
index ba819b4..d4e44a7 100644
--- a/repair/incore.h
+++ b/repair/incore.h
@@ -285,6 +285,7 @@ typedef struct ino_tree_node {
avlnode_t avl_node;
xfs_agino_t ino_startnum; /* starting inode # */
xfs_inofree_t ir_free; /* inode free bit mask */
+ __uint64_t ir_sparse; /* sparse inode bitmask */
__uint64_t ino_confirmed; /* confirmed bitmask */
__uint64_t ino_isa_dir; /* bit == 1 if a directory */
__uint8_t nlink_size;
@@ -477,6 +478,19 @@ static inline int is_inode_free(struct ino_tree_node *irec, int offset)
}
/*
+ * set/test is inode sparse (not physically allocated)
+ */
+static inline void set_inode_sparse(struct ino_tree_node *irec, int offset)
+{
+ irec->ir_sparse |= XFS_INOBT_MASK(offset);
+}
+
+static inline bool is_inode_sparse(struct ino_tree_node *irec, int offset)
+{
+ return irec->ir_sparse & XFS_INOBT_MASK(offset);
+}
+
+/*
* add_inode_reached() is set on inode I only if I has been reached
* by an inode P claiming to be the parent and if I is a directory,
* the .. link in the I says that P is I's parent.
diff --git a/repair/incore_ino.c b/repair/incore_ino.c
index 9502648..cda6c2b 100644
--- a/repair/incore_ino.c
+++ b/repair/incore_ino.c
@@ -258,6 +258,7 @@ alloc_ino_node(
irec->ino_confirmed = 0;
irec->ino_isa_dir = 0;
irec->ir_free = (xfs_inofree_t) - 1;
+ irec->ir_sparse = 0;
irec->ino_un.ex_data = NULL;
irec->nlink_size = sizeof(__uint8_t);
irec->disk_nlinks.un8 = alloc_nlink_array(irec->nlink_size);
diff --git a/repair/scan.c b/repair/scan.c
index 8677e41..5b67e15 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -736,6 +736,17 @@ _("%s freespace btree block claimed (state %d), agno %d, bno %d, suspect %d\n"),
}
}
+static bool
+ino_issparse(
+ struct xfs_inobt_rec *rp,
+ int offset)
+{
+ if (!xfs_sb_version_hassparseinodes(&mp->m_sb))
+ return false;
+
+ return XFS_INOBT_IS_SPARSE_DISK(rp, offset);
+}
+
static int
scan_single_ino_chunk(
xfs_agnumber_t agno,
@@ -749,7 +760,8 @@ scan_single_ino_chunk(
int nfree;
int off;
int state;
- ino_tree_node_t *ino_rec, *first_rec, *last_rec;
+ ino_tree_node_t *ino_rec = NULL;
+ ino_tree_node_t *first_rec, *last_rec;
int freecount;
ino = be32_to_cpu(rp->ir_startino);
@@ -815,8 +827,12 @@ _("bad ending inode # (%" PRIu64 " (0x%x 0x%zx)) in ino rec, skipping rec\n"),
for (j = 0;
j < XFS_INODES_PER_CHUNK;
j += mp->m_sb.sb_inopblock) {
- agbno = XFS_AGINO_TO_AGBNO(mp, ino + j);
+ /* inodes in sparse chunks don't use blocks */
+ if (ino_issparse(rp, j))
+ continue;
+
+ agbno = XFS_AGINO_TO_AGBNO(mp, ino + j);
state = get_bmap(agno, agbno);
if (state == XR_E_UNKNOWN) {
set_bmap(agno, agbno, XR_E_INO);
@@ -861,8 +877,6 @@ _("inode rec for ino %" PRIu64 " (%d/%d) overlaps existing rec (start %d/%d)\n")
return suspect;
}
- nfree = 0;
-
/*
* now mark all the inodes as existing and free or used.
* if the tree is suspect, put them into the uncertain
@@ -870,14 +884,12 @@ _("inode rec for ino %" PRIu64 " (%d/%d) overlaps existing rec (start %d/%d)\n")
*/
if (!suspect) {
if (XFS_INOBT_IS_FREE_DISK(rp, 0)) {
- nfree++;
ino_rec = set_inode_free_alloc(mp, agno, ino);
} else {
ino_rec = set_inode_used_alloc(mp, agno, ino);
}
for (j = 1; j < XFS_INODES_PER_CHUNK; j++) {
if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
- nfree++;
set_inode_free(ino_rec, j);
} else {
set_inode_used(ino_rec, j);
@@ -886,7 +898,6 @@ _("inode rec for ino %" PRIu64 " (%d/%d) overlaps existing rec (start %d/%d)\n")
} else {
for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
- nfree++;
add_aginode_uncertain(mp, agno, ino + j, 1);
} else {
add_aginode_uncertain(mp, agno, ino + j, 0);
@@ -894,6 +905,29 @@ _("inode rec for ino %" PRIu64 " (%d/%d) overlaps existing rec (start %d/%d)\n")
}
}
+ /*
+ * Mark sparse inodes as such in the in-core tree. Verify that sparse
+ * inodes are free and that freecount is consistent with the free mask.
+ */
+ nfree = 0;
+ for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
+ if (ino_issparse(rp, j)) {
+ if (!suspect && !XFS_INOBT_IS_FREE_DISK(rp, j)) {
+ do_warn(
+_("ir_holemask/ir_free mismatch, inode chunk %d/%u, holemask 0x%x free 0x%llx\n"),
+ agno, ino,
+ be16_to_cpu(rp->ir_u.sp.ir_holemask),
+ be64_to_cpu(rp->ir_free));
+ suspect++;
+ }
+ if (!suspect && ino_rec)
+ set_inode_sparse(ino_rec, j);
+ } else if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
+ /* freecount only tracks non-sparse inos */
+ nfree++;
+ }
+ }
+
if (nfree != freecount) {
do_warn(
_("ir_freecount/free mismatch, inode chunk %d/%u, freecount %d nfree %d\n"),
--
1.9.3
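
The holemask test added by the patch above can be tried out on its own. The sketch below assumes a 64-inode chunk and a 16-bit holemask, so that each holemask bit covers 4 inodes; the `demo_` names are hypothetical and the holemask is taken in CPU byte order (already converted from big-endian).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INODES_PER_CHUNK		64
#define DEMO_INODES_PER_HOLEMASK_BIT	4	/* assumed: 64 inodes / 16 bits */

/* True if the inode at 'offset' falls in a hole (not physically allocated). */
static bool
demo_ino_issparse(uint16_t holemask, int offset)
{
	assert(offset >= 0 && offset < DEMO_INODES_PER_CHUNK);

	return (holemask >> (offset / DEMO_INODES_PER_HOLEMASK_BIT)) & 1;
}

/* Expand the 16-bit holemask into a per-inode 64-bit bitmask. */
static uint64_t
demo_sparse_mask(uint16_t holemask)
{
	uint64_t	mask = 0;
	int		j;

	for (j = 0; j < DEMO_INODES_PER_CHUNK; j++) {
		if (demo_ino_issparse(holemask, j))
			mask |= (uint64_t)1 << j;
	}
	return mask;
}
```

The expanded mask corresponds to what the in-core `ir_sparse` field accumulates one bit at a time through `set_inode_sparse()`.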
* Re: [PATCH 18/28] repair: scan and track sparse inode chunks correctly
2015-06-02 18:41 ` [PATCH 18/28] repair: scan and track sparse inode chunks correctly Brian Foster
@ 2015-06-05 0:56 ` Dave Chinner
0 siblings, 0 replies; 38+ messages in thread
From: Dave Chinner @ 2015-06-05 0:56 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Tue, Jun 02, 2015 at 02:41:51PM -0400, Brian Foster wrote:
> Phase 2 of xfs_repair scans the on-disk inobt and creates in-core
> records for all inodes in the fs. This also involves marking
> free/allocated state of all inodes, internal record verification and
> block state management for the inode chunks tracked by inode records.
> Various parts of the inobt scan mechanism assume fully allocated inode
> records and thus lead to spurious errors when sparse inode records are
> encountered.
>
> Update the inobt scan to detect and handle sparse inode records
> correctly. Do not set the allocation state of blocks in sparse inode
> regions as these blocks do not belong to the record. Do not account
> sparse inodes against the ir_freecount as these inodes do not exist and
> are not available for allocation by the fs. Finally, track the sparse
> status of each individual inode in the in-core inode records for future
> reference.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> include/libxfs.h | 16 ++++++++++++++++
> repair/incore.h | 14 ++++++++++++++
> repair/incore_ino.c | 1 +
> repair/scan.c | 48 +++++++++++++++++++++++++++++++++++++++++-------
> 4 files changed, 72 insertions(+), 7 deletions(-)
>
> diff --git a/include/libxfs.h b/include/libxfs.h
> index 6a59cc0..3321c50 100644
> --- a/include/libxfs.h
> +++ b/include/libxfs.h
> @@ -183,6 +183,22 @@ extern unsigned long libxfs_physmem(void); /* in kilobytes */
> #define XFS_INOBT_IS_FREE_DISK(rp,i) \
> ((be64_to_cpu((rp)->ir_free) & XFS_INOBT_MASK(i)) != 0)
>
> +static inline bool
> +XFS_INOBT_IS_SPARSE_DISK(
Shouty! ;)
I changed this to lower case, otherwise ok.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* [PATCH 19/28] repair: scan sparse finobt records correctly
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (17 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 18/28] repair: scan and track sparse inode chunks correctly Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-05 1:03 ` Dave Chinner
2015-06-02 18:41 ` [PATCH 20/28] repair: validate ir_count field for sparse format records Brian Foster
` (9 subsequent siblings)
28 siblings, 1 reply; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
The finobt scan performs checks similar to those of the inobt scan, including
internal record consistency checks, consistency with inobt records,
inode block state, etc. Various parts of this mechanism also assume
fully allocated inode records and thus lead to false errors with sparse
records.
Update the finobt scan to detect and handle sparse inode records
correctly. As for the inobt, do not assume that blocks associated with
sparse regions are allocated for inodes and do not account sparse inodes
against the freecount. Additionally, verify that sparse state is
consistent with the in-core record and set up any new in-core records
that might have been missing from the inobt correctly.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/scan.c | 51 +++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 43 insertions(+), 8 deletions(-)
diff --git a/repair/scan.c b/repair/scan.c
index 5b67e15..52c05e2 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -950,7 +950,8 @@ scan_single_finobt_chunk(
int nfree;
int off;
int state;
- ino_tree_node_t *first_rec, *last_rec, *ino_rec;
+ ino_tree_node_t *ino_rec = NULL;
+ ino_tree_node_t *first_rec, *last_rec;
int freecount;
ino = be32_to_cpu(rp->ir_startino);
@@ -1014,8 +1015,19 @@ _("bad ending inode # (%" PRIu64 " (0x%x 0x%zx)) in finobt rec, skipping rec\n")
j < XFS_INODES_PER_CHUNK;
j += mp->m_sb.sb_inopblock) {
agbno = XFS_AGINO_TO_AGBNO(mp, ino + j);
-
state = get_bmap(agno, agbno);
+
+ /* sparse inodes should not refer to inode blocks */
+ if (ino_issparse(rp, j)) {
+ if (state == XR_E_INO) {
+ do_warn(
+_("sparse inode chunk claims inode block, finobt block - agno %d, bno %d, inopb %d\n"),
+ agno, agbno, mp->m_sb.sb_inopblock);
+ suspect++;
+ }
+ continue;
+ }
+
if (state == XR_E_INO) {
continue;
} else if ((state == XR_E_UNKNOWN) ||
@@ -1060,8 +1072,9 @@ _("finobt rec for ino %" PRIu64 " (%d/%u) does not match existing rec (%d/%d)\n"
nfree = 0;
for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
int isfree = XFS_INOBT_IS_FREE_DISK(rp, j);
+ int issparse = ino_issparse(rp, j);
- if (isfree)
+ if (isfree && !issparse)
nfree++;
/*
@@ -1071,6 +1084,10 @@ _("finobt rec for ino %" PRIu64 " (%d/%u) does not match existing rec (%d/%d)\n"
if (!suspect &&
isfree != is_inode_free(first_rec, j))
suspect++;
+
+ if (!suspect &&
+ issparse != is_inode_sparse(first_rec, j))
+ suspect++;
}
goto check_freecount;
@@ -1088,16 +1105,13 @@ _("finobt rec for ino %" PRIu64 " (%d/%u) does not match existing rec (%d/%d)\n"
* inodes previously inserted into the uncertain tree should be
* superceded by these when the uncertain tree is processed
*/
- nfree = 0;
if (XFS_INOBT_IS_FREE_DISK(rp, 0)) {
- nfree++;
ino_rec = set_inode_free_alloc(mp, agno, ino);
} else {
ino_rec = set_inode_used_alloc(mp, agno, ino);
}
for (j = 1; j < XFS_INODES_PER_CHUNK; j++) {
if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
- nfree++;
set_inode_free(ino_rec, j);
} else {
set_inode_used(ino_rec, j);
@@ -1108,17 +1122,38 @@ _("finobt rec for ino %" PRIu64 " (%d/%u) does not match existing rec (%d/%d)\n"
* this should handle the case where the inobt scan may have
* already added uncertain inodes
*/
- nfree = 0;
for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
add_aginode_uncertain(mp, agno, ino + j, 1);
- nfree++;
} else {
add_aginode_uncertain(mp, agno, ino + j, 0);
}
}
}
+ /*
+ * Mark sparse inodes as such in the in-core tree. Verify that sparse
+ * inodes are free and that freecount is consistent with the free mask.
+ */
+ nfree = 0;
+ for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
+ if (ino_issparse(rp, j)) {
+ if (!suspect && !XFS_INOBT_IS_FREE_DISK(rp, j)) {
+ do_warn(
+_("finobt ir_holemask/ir_free mismatch, inode chunk %d/%u, holemask 0x%x free 0x%llx\n"),
+ agno, ino,
+ be16_to_cpu(rp->ir_u.sp.ir_holemask),
+ be64_to_cpu(rp->ir_free));
+ suspect++;
+ }
+ if (!suspect && ino_rec)
+ set_inode_sparse(ino_rec, j);
+ } else if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
+ /* freecount only tracks non-sparse inos */
+ nfree++;
+ }
+ }
+
check_freecount:
/*
--
1.9.3
* Re: [PATCH 19/28] repair: scan sparse finobt records correctly
2015-06-02 18:41 ` [PATCH 19/28] repair: scan sparse finobt records correctly Brian Foster
@ 2015-06-05 1:03 ` Dave Chinner
2015-06-05 16:52 ` Brian Foster
0 siblings, 1 reply; 38+ messages in thread
From: Dave Chinner @ 2015-06-05 1:03 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Tue, Jun 02, 2015 at 02:41:52PM -0400, Brian Foster wrote:
> The finobt scan performs checks similar to those of the inobt scan, including
> internal record consistency checks, consistency with inobt records,
> inode block state, etc. Various parts of this mechanism also assume
> fully allocated inode records and thus lead to false errors with sparse
> records.
>
> Update the finobt scan to detect and handle sparse inode records
> correctly. As for the inobt, do not assume that blocks associated with
> sparse regions are allocated for inodes and do not account sparse inodes
> against the freecount. Additionally, verify that sparse state is
> consistent with the in-core record and set up any new in-core records
> that might have been missing from the inobt correctly.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
....
>
> + /*
> + * Mark sparse inodes as such in the in-core tree. Verify that sparse
> + * inodes are free and that freecount is consistent with the free mask.
> + */
> + nfree = 0;
> + for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
> + if (ino_issparse(rp, j)) {
> + if (!suspect && !XFS_INOBT_IS_FREE_DISK(rp, j)) {
> + do_warn(
> +_("finobt ir_holemask/ir_free mismatch, inode chunk %d/%u, holemask 0x%x free 0x%llx\n"),
> + agno, ino,
> + be16_to_cpu(rp->ir_u.sp.ir_holemask),
> + be64_to_cpu(rp->ir_free));
> + suspect++;
> + }
> + if (!suspect && ino_rec)
> + set_inode_sparse(ino_rec, j);
> + } else if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
> + /* freecount only tracks non-sparse inos */
> + nfree++;
> + }
> + }
> +
This is the same checking code as used for the inobt. Can you factor
these into a helper? I'll apply as is, so delta patch again. ;)
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [PATCH 19/28] repair: scan sparse finobt records correctly
2015-06-05 1:03 ` Dave Chinner
@ 2015-06-05 16:52 ` Brian Foster
0 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-05 16:52 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Fri, Jun 05, 2015 at 11:03:02AM +1000, Dave Chinner wrote:
> On Tue, Jun 02, 2015 at 02:41:52PM -0400, Brian Foster wrote:
> > The finobt scan performs checks similar to those of the inobt scan, including
> > internal record consistency checks, consistency with inobt records,
> > inode block state, etc. Various parts of this mechanism also assume
> > fully allocated inode records and thus lead to false errors with sparse
> > records.
> >
> > Update the finobt scan to detect and handle sparse inode records
> > correctly. As for the inobt, do not assume that blocks associated with
> > sparse regions are allocated for inodes and do not account sparse inodes
> > against the freecount. Additionally, verify that sparse state is
> > consistent with the in-core record and set up any new in-core records
> > that might have been missing from the inobt correctly.
> >
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> ....
> >
> > + /*
> > + * Mark sparse inodes as such in the in-core tree. Verify that sparse
> > + * inodes are free and that freecount is consistent with the free mask.
> > + */
> > + nfree = 0;
> > + for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
> > + if (ino_issparse(rp, j)) {
> > + if (!suspect && !XFS_INOBT_IS_FREE_DISK(rp, j)) {
> > + do_warn(
> > +_("finobt ir_holemask/ir_free mismatch, inode chunk %d/%u, holemask 0x%x free 0x%llx\n"),
> > + agno, ino,
> > + be16_to_cpu(rp->ir_u.sp.ir_holemask),
> > + be64_to_cpu(rp->ir_free));
> > + suspect++;
> > + }
> > + if (!suspect && ino_rec)
> > + set_inode_sparse(ino_rec, j);
> > + } else if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
> > + /* freecount only tracks non-sparse inos */
> > + nfree++;
> > + }
> > + }
> > +
>
> This is the same checking code as used for the inobt. Can you factor
> these into a helper? I'll apply as is, so delta patch again. ;)
>
There's actually quite a bit of duplication throughout the inobt and
finobt record scan functions. I noticed this when doing the finobt stuff
but there were enough differences that it didn't seem worth the effort
at the time.
Looking back at it now with the sparse stuff added and whatnot, it's
probably worth refactoring. The only difference between much of the
logic is the error messages that distinguish between the trees (inobt
vs. finobt). I've made a pass through this code to create a couple
helpers for these functions and allow the caller to identify the tree
based on a simple enum parameter.
I'll get the patches for this and the other cleanups posted once they
get through some testing...
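
A shared helper with an enum parameter, as described above, might look roughly like this. It is only a sketch of the idea and not the refactoring that was later posted: the `demo_` names are hypothetical, the masks are assumed to be in CPU byte order, and each holemask bit is assumed to cover 4 inodes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_INODES_PER_CHUNK		64
#define DEMO_INODES_PER_HOLEMASK_BIT	4	/* assumed: 64 inodes / 16 bits */

enum demo_inobt_type {
	DEMO_INOBT,
	DEMO_FINOBT,
};

/*
 * Check that every sparse inode is marked free and count the free
 * non-sparse inodes into *nfree. Returns nonzero if the record is
 * suspect. The tree type only selects the message prefix.
 */
static int
demo_verify_sparse_rec(enum demo_inobt_type type, uint16_t holemask,
		       uint64_t ir_free, int *nfree)
{
	const char	*prefix = (type == DEMO_FINOBT) ? "finobt " : "";
	int		suspect = 0;
	int		j;

	assert(nfree != NULL);
	*nfree = 0;

	for (j = 0; j < DEMO_INODES_PER_CHUNK; j++) {
		int	issparse;
		int	isfree;

		issparse = (holemask >> (j / DEMO_INODES_PER_HOLEMASK_BIT)) & 1;
		isfree = (int)((ir_free >> j) & 1);

		if (issparse) {
			/* warn once per record, like the open-coded loops */
			if (!isfree && !suspect) {
				fprintf(stderr,
"%sir_holemask/ir_free mismatch, holemask 0x%x free 0x%llx\n",
					prefix, (unsigned int)holemask,
					(unsigned long long)ir_free);
				suspect++;
			}
		} else if (isfree) {
			/* freecount only tracks non-sparse inodes */
			(*nfree)++;
		}
	}
	return suspect;
}
```

Building the message from a prefix keeps the sketch short, but it would not work well with the translated `_()` strings repair uses; a real helper would more likely select between two complete format strings.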
Brian
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
* [PATCH 20/28] repair: validate ir_count field for sparse format records
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (18 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 19/28] repair: scan sparse finobt records correctly Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 21/28] repair: process sparse inode records correctly Brian Foster
` (8 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
Sparse format inobt records contain an additional count field that
records the number of physical inodes tracked by the record. Verify the
count is internally consistent according to the holemask, similar to how
freecount is validated against the free mask.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/scan.c | 43 ++++++++++++++++++++++++++++++++++---------
1 file changed, 34 insertions(+), 9 deletions(-)
diff --git a/repair/scan.c b/repair/scan.c
index 52c05e2..9b16199 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -758,6 +758,7 @@ scan_single_ino_chunk(
xfs_agblock_t agbno;
int j;
int nfree;
+ int ninodes;
int off;
int state;
ino_tree_node_t *ino_rec = NULL;
@@ -909,7 +910,7 @@ _("inode rec for ino %" PRIu64 " (%d/%d) overlaps existing rec (start %d/%d)\n")
* Mark sparse inodes as such in the in-core tree. Verify that sparse
* inodes are free and that freecount is consistent with the free mask.
*/
- nfree = 0;
+ nfree = ninodes = 0;
for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
if (ino_issparse(rp, j)) {
if (!suspect && !XFS_INOBT_IS_FREE_DISK(rp, j)) {
@@ -922,9 +923,11 @@ _("ir_holemask/ir_free mismatch, inode chunk %d/%u, holemask 0x%x free 0x%llx\n"
}
if (!suspect && ino_rec)
set_inode_sparse(ino_rec, j);
- } else if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
- /* freecount only tracks non-sparse inos */
- nfree++;
+ } else {
+ /* count fields track non-sparse inos */
+ if (XFS_INOBT_IS_FREE_DISK(rp, j))
+ nfree++;
+ ninodes++;
}
}
@@ -934,6 +937,14 @@ _("ir_freecount/free mismatch, inode chunk %d/%u, freecount %d nfree %d\n"),
agno, ino, freecount, nfree);
}
+ /* verify sparse record formats have a valid inode count */
+ if (xfs_sb_version_hassparseinodes(&mp->m_sb) &&
+ ninodes != rp->ir_u.sp.ir_count) {
+ do_warn(
+_("invalid inode count, inode chunk %d/%u, count %d ninodes %d\n"),
+ agno, ino, rp->ir_u.sp.ir_count, ninodes);
+ }
+
return suspect;
}
@@ -948,6 +959,7 @@ scan_single_finobt_chunk(
xfs_agblock_t agbno;
int j;
int nfree;
+ int ninodes;
int off;
int state;
ino_tree_node_t *ino_rec = NULL;
@@ -1069,11 +1081,13 @@ _("finobt rec for ino %" PRIu64 " (%d/%u) does not match existing rec (%d/%d)\n"
return ++suspect;
}
- nfree = 0;
+ nfree = ninodes = 0;
for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
int isfree = XFS_INOBT_IS_FREE_DISK(rp, j);
int issparse = ino_issparse(rp, j);
+ if (!issparse)
+ ninodes++;
if (isfree && !issparse)
nfree++;
@@ -1135,7 +1149,7 @@ _("finobt rec for ino %" PRIu64 " (%d/%u) does not match existing rec (%d/%d)\n"
* Mark sparse inodes as such in the in-core tree. Verify that sparse
* inodes are free and that freecount is consistent with the free mask.
*/
- nfree = 0;
+ nfree = ninodes = 0;
for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
if (ino_issparse(rp, j)) {
if (!suspect && !XFS_INOBT_IS_FREE_DISK(rp, j)) {
@@ -1148,10 +1162,13 @@ _("finobt ir_holemask/ir_free mismatch, inode chunk %d/%u, holemask 0x%x free 0x
}
if (!suspect && ino_rec)
set_inode_sparse(ino_rec, j);
- } else if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
- /* freecount only tracks non-sparse inos */
- nfree++;
+ } else {
+ /* count fields track non-sparse inos */
+ if (XFS_INOBT_IS_FREE_DISK(rp, j))
+ nfree++;
+ ninodes++;
}
+
}
check_freecount:
@@ -1178,6 +1195,14 @@ _("finobt ir_freecount/free mismatch, inode chunk %d/%u, freecount %d nfree %d\n
_("finobt record with no free inodes, inode chunk %d/%u\n"), agno, ino);
}
+ /* verify sparse record formats have a valid inode count */
+ if (xfs_sb_version_hassparseinodes(&mp->m_sb) &&
+ ninodes != rp->ir_u.sp.ir_count) {
+ do_warn(
+_("invalid inode count, inode chunk %d/%u, count %d ninodes %d\n"),
+ agno, ino, rp->ir_u.sp.ir_count, ninodes);
+ }
+
return suspect;
}
--
1.9.3
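
The patch above counts non-sparse inodes one at a time. The same expected value can be derived directly from the holemask, which is a convenient cross-check. The sketch below assumes a 64-inode chunk where each holemask bit covers 4 inodes; the `demo_` names are hypothetical and the holemask is in CPU byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INODES_PER_CHUNK		64
#define DEMO_INODES_PER_HOLEMASK_BIT	4	/* assumed: 64 inodes / 16 bits */

/* Count the set bits in a 16-bit value. */
static int
demo_popcount16(uint16_t v)
{
	int	n = 0;

	while (v) {
		n += v & 1;
		v >>= 1;
	}
	return n;
}

/* Number of physically allocated inodes implied by the holemask. */
static int
demo_expected_ir_count(uint16_t holemask)
{
	int	count;

	count = DEMO_INODES_PER_CHUNK -
		DEMO_INODES_PER_HOLEMASK_BIT * demo_popcount16(holemask);
	assert(count >= 0 && count <= DEMO_INODES_PER_CHUNK);
	return count;
}

/* True if the record's ir_count agrees with its holemask. */
static bool
demo_ir_count_valid(uint16_t holemask, uint8_t ir_count)
{
	return demo_expected_ir_count(holemask) == ir_count;
}
```

A record with holemask 0x000f has 16 inodes in holes, so an ir_count of 48 is consistent and an ir_count of 64 is not.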
* [PATCH 21/28] repair: process sparse inode records correctly
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (19 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 20/28] repair: validate ir_count field for sparse format records Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-05 1:12 ` Dave Chinner
2015-06-02 18:41 ` [PATCH 22/28] repair: factor out sparse inodes from finobt reconstruction Brian Foster
` (7 subsequent siblings)
28 siblings, 1 reply; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
The inode processing phases of xfs_repair (3 and 4) validate the actual
inodes referred to by the previously scanned inode btrees. The physical
inodes are read from disk and internally validated in various ways. The
inode block state is also verified and corrected if necessary.
Sparse inodes are not physically allocated and the associated blocks may
be allocated to any other area of the fs (file data, internal use,
etc.). Attempts to validate these blocks as inode blocks produce noisy
corruption errors.
Update the inode processing mechanism to handle sparse inode records
correctly. Since sparse inodes do not exist, the general approach here
is to simply skip validation of sparse inodes. Update
process_inode_chunk() to skip reads of sparse clusters and set the buf
pointer of associated clusters to NULL. Update the rest of the function
to only verify non-NULL cluster buffers. Also, skip the inode block
state checks for blocks in sparse inode clusters.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/dino_chunks.c | 162 +++++++++++++++++++++++++++++++--------------------
1 file changed, 98 insertions(+), 64 deletions(-)
diff --git a/repair/dino_chunks.c b/repair/dino_chunks.c
index a1ce9e7..9b7d017 100644
--- a/repair/dino_chunks.c
+++ b/repair/dino_chunks.c
@@ -615,6 +615,7 @@ process_inode_chunk(
* set up first irec
*/
ino_rec = first_irec;
+ irec_offset = 0;
bplist = malloc(cluster_count * sizeof(xfs_buf_t *));
if (bplist == NULL)
@@ -622,6 +623,18 @@ process_inode_chunk(
cluster_count * sizeof(xfs_buf_t *));
for (bp_index = 0; bp_index < cluster_count; bp_index++) {
+ /*
+ * Skip the cluster buffer if the first inode is sparse. The
+ * remaining inodes in the cluster share the same state as
+ * sparse inodes occur at cluster granularity.
+ */
+ if (is_inode_sparse(ino_rec, irec_offset)) {
+ pftrace("skip sparse inode, startnum 0x%x idx %d",
+ ino_rec->ino_startnum, irec_offset);
+ bplist[bp_index] = NULL;
+ goto next_readbuf;
+ }
+
pftrace("about to read off %llu in AG %d",
XFS_AGB_TO_DADDR(mp, agno, agbno), agno);
@@ -641,12 +654,16 @@ process_inode_chunk(
free(bplist);
return(1);
}
- agbno += blks_per_cluster;
- bplist[bp_index]->b_ops = &xfs_inode_buf_ops;
pftrace("readbuf %p (%llu, %d) in AG %d", bplist[bp_index],
(long long)XFS_BUF_ADDR(bplist[bp_index]),
XFS_BUF_COUNT(bplist[bp_index]), agno);
+
+ bplist[bp_index]->b_ops = &xfs_inode_buf_ops;
+
+next_readbuf:
+ irec_offset += mp->m_sb.sb_inopblock * blks_per_cluster;
+ agbno += blks_per_cluster;
}
agbno = XFS_AGINO_TO_AGBNO(mp, first_irec->ino_startnum);
@@ -665,24 +682,27 @@ process_inode_chunk(
*/
if (ino_discovery) {
for (;;) {
- /*
- * make inode pointer
- */
- dino = xfs_make_iptr(mp, bplist[bp_index], cluster_offset);
agino = irec_offset + ino_rec->ino_startnum;
- /*
- * we always think that the root and realtime
- * inodes are verified even though we may have
- * to reset them later to keep from losing the
- * chunk that they're in
- */
- if (verify_dinode(mp, dino, agno, agino) == 0 ||
- (agno == 0 &&
- (mp->m_sb.sb_rootino == agino ||
- mp->m_sb.sb_rsumino == agino ||
- mp->m_sb.sb_rbmino == agino)))
- status++;
+ /* no buffers for sparse clusters */
+ if (bplist[bp_index]) {
+ /* make inode pointer */
+ dino = xfs_make_iptr(mp, bplist[bp_index],
+ cluster_offset);
+
+ /*
+ * we always think that the root and realtime
+ * inodes are verified even though we may have
+ * to reset them later to keep from losing the
+ * chunk that they're in
+ */
+ if (verify_dinode(mp, dino, agno, agino) == 0 ||
+ (agno == 0 &&
+ (mp->m_sb.sb_rootino == agino ||
+ mp->m_sb.sb_rsumino == agino ||
+ mp->m_sb.sb_rbmino == agino)))
+ status++;
+ }
irec_offset++;
icnt++;
@@ -716,7 +736,8 @@ process_inode_chunk(
if (!status) {
*bogus = 1;
for (bp_index = 0; bp_index < cluster_count; bp_index++)
- libxfs_putbuf(bplist[bp_index]);
+ if (bplist[bp_index])
+ libxfs_putbuf(bplist[bp_index]);
free(bplist);
return(0);
}
@@ -736,35 +757,41 @@ process_inode_chunk(
/*
* mark block as an inode block in the incore bitmap
*/
- pthread_mutex_lock(&ag_locks[agno].lock);
- state = get_bmap(agno, agbno);
- switch (state) {
- case XR_E_INO: /* already marked */
- break;
- case XR_E_UNKNOWN:
- case XR_E_FREE:
- case XR_E_FREE1:
- set_bmap(agno, agbno, XR_E_INO);
- break;
- case XR_E_BAD_STATE:
- do_error(_("bad state in block map %d\n"), state);
- break;
- default:
- set_bmap(agno, agbno, XR_E_MULT);
- do_warn(_("inode block %" PRIu64 " multiply claimed, state was %d\n"),
- XFS_AGB_TO_FSB(mp, agno, agbno), state);
- break;
+ if (!is_inode_sparse(ino_rec, irec_offset)) {
+ pthread_mutex_lock(&ag_locks[agno].lock);
+ state = get_bmap(agno, agbno);
+ switch (state) {
+ case XR_E_INO: /* already marked */
+ break;
+ case XR_E_UNKNOWN:
+ case XR_E_FREE:
+ case XR_E_FREE1:
+ set_bmap(agno, agbno, XR_E_INO);
+ break;
+ case XR_E_BAD_STATE:
+ do_error(_("bad state in block map %d\n"), state);
+ break;
+ default:
+ set_bmap(agno, agbno, XR_E_MULT);
+ do_warn(
+ _("inode block %" PRIu64 " multiply claimed, state was %d\n"),
+ XFS_AGB_TO_FSB(mp, agno, agbno), state);
+ break;
+ }
+ pthread_mutex_unlock(&ag_locks[agno].lock);
}
- pthread_mutex_unlock(&ag_locks[agno].lock);
for (;;) {
- /*
- * make inode pointer
- */
- dino = xfs_make_iptr(mp, bplist[bp_index], cluster_offset);
agino = irec_offset + ino_rec->ino_startnum;
ino = XFS_AGINO_TO_INO(mp, agno, agino);
+ if (is_inode_sparse(ino_rec, irec_offset))
+ goto process_next;
+
+ /* make inode pointer */
+ dino = xfs_make_iptr(mp, bplist[bp_index], cluster_offset);
+
+
is_used = 3;
ino_dirty = 0;
parent = 0;
@@ -895,6 +922,7 @@ process_inode_chunk(
}
}
+process_next:
irec_offset++;
ibuf_offset++;
icnt++;
@@ -906,6 +934,9 @@ process_inode_chunk(
* done! - finished up irec and block simultaneously
*/
for (bp_index = 0; bp_index < cluster_count; bp_index++) {
+ if (!bplist[bp_index])
+ continue;
+
pftrace("put/writebuf %p (%llu) in AG %d",
bplist[bp_index], (long long)
XFS_BUF_ADDR(bplist[bp_index]), agno);
@@ -925,29 +956,32 @@ process_inode_chunk(
ibuf_offset = 0;
agbno++;
- pthread_mutex_lock(&ag_locks[agno].lock);
- state = get_bmap(agno, agbno);
- switch (state) {
- case XR_E_INO: /* already marked */
- break;
- case XR_E_UNKNOWN:
- case XR_E_FREE:
- case XR_E_FREE1:
- set_bmap(agno, agbno, XR_E_INO);
- break;
- case XR_E_BAD_STATE:
- do_error(_("bad state in block map %d\n"),
- state);
- break;
- default:
- set_bmap(agno, agbno, XR_E_MULT);
- do_warn(
- _("inode block %" PRIu64 " multiply claimed, state was %d\n"),
- XFS_AGB_TO_FSB(mp, agno, agbno), state);
- break;
+ if (!is_inode_sparse(ino_rec, irec_offset)) {
+ pthread_mutex_lock(&ag_locks[agno].lock);
+ state = get_bmap(agno, agbno);
+ switch (state) {
+ case XR_E_INO: /* already marked */
+ break;
+ case XR_E_UNKNOWN:
+ case XR_E_FREE:
+ case XR_E_FREE1:
+ set_bmap(agno, agbno, XR_E_INO);
+ break;
+ case XR_E_BAD_STATE:
+ do_error(
+ _("bad state in block map %d\n"),
+ state);
+ break;
+ default:
+ set_bmap(agno, agbno, XR_E_MULT);
+ do_warn(
+ _("inode block %" PRIu64 " multiply claimed, state was %d\n"),
+ XFS_AGB_TO_FSB(mp, agno, agbno),
+ state);
+ break;
+ }
+ pthread_mutex_unlock(&ag_locks[agno].lock);
}
- pthread_mutex_unlock(&ag_locks[agno].lock);
-
} else if (irec_offset == XFS_INODES_PER_CHUNK) {
/*
* get new irec (multiple chunks per block fs)
--
1.9.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 21/28] repair: process sparse inode records correctly
2015-06-02 18:41 ` [PATCH 21/28] repair: process sparse inode records correctly Brian Foster
@ 2015-06-05 1:12 ` Dave Chinner
0 siblings, 0 replies; 38+ messages in thread
From: Dave Chinner @ 2015-06-05 1:12 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Tue, Jun 02, 2015 at 02:41:54PM -0400, Brian Foster wrote:
> The inode processing phases of xfs_repair (3 and 4) validate the actual
> inodes referred to by the previously scanned inode btrees. The physical
> inodes are read from disk and internally validated in various ways. The
> inode block state is also verified and corrected if necessary.
>
> Sparse inodes are not physically allocated and the associated blocks may
> be allocated to any other area of the fs (file data, internal use,
> etc.). Attempts to validate these blocks as inode blocks produce noisy
> corruption errors.
>
> Update the inode processing mechanism to handle sparse inode records
> correctly. Since sparse inodes do not exist, the general approach here
> is to simply skip validation of sparse inodes. Update
> process_inode_chunk() to skip reads of sparse clusters and set the buf
> pointer of associated clusters to NULL. Update the rest of the function
> to only verify non-NULL cluster buffers. Also, skip the inode block
> state checks for blocks in sparse inode clusters.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Code looks good, but in looking at this, another helper is in order:
> @@ -736,35 +757,41 @@ process_inode_chunk(
> /*
> * mark block as an inode block in the incore bitmap
> */
> - pthread_mutex_lock(&ag_locks[agno].lock);
> - state = get_bmap(agno, agbno);
> - switch (state) {
> - case XR_E_INO: /* already marked */
> - break;
> - case XR_E_UNKNOWN:
> - case XR_E_FREE:
> - case XR_E_FREE1:
> - set_bmap(agno, agbno, XR_E_INO);
> - break;
> - case XR_E_BAD_STATE:
> - do_error(_("bad state in block map %d\n"), state);
> - break;
> - default:
> - set_bmap(agno, agbno, XR_E_MULT);
> - do_warn(_("inode block %" PRIu64 " multiply claimed, state was %d\n"),
> - XFS_AGB_TO_FSB(mp, agno, agbno), state);
> - break;
> + if (!is_inode_sparse(ino_rec, irec_offset)) {
> + pthread_mutex_lock(&ag_locks[agno].lock);
> + state = get_bmap(agno, agbno);
> + switch (state) {
> + case XR_E_INO: /* already marked */
> + break;
> + case XR_E_UNKNOWN:
> + case XR_E_FREE:
> + case XR_E_FREE1:
> + set_bmap(agno, agbno, XR_E_INO);
> + break;
> + case XR_E_BAD_STATE:
> + do_error(_("bad state in block map %d\n"), state);
> + break;
> + default:
> + set_bmap(agno, agbno, XR_E_MULT);
> + do_warn(
> + _("inode block %" PRIu64 " multiply claimed, state was %d\n"),
> + XFS_AGB_TO_FSB(mp, agno, agbno), state);
> + break;
> + }
> + pthread_mutex_unlock(&ag_locks[agno].lock);
> }
> - pthread_mutex_unlock(&ag_locks[agno].lock);
This state update code is repeated and has an identical
modification later in the patch, so can you factor
it into a helper again? (delta patch!)
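
One possible shape for such a helper, as a sketch only: the enum values and
claim_inode_block() below are hypothetical and merely mirror the XR_E_*
cases in the quoted hunk. A real helper would presumably also take the
per-AG lock and call do_error()/do_warn(); that is left out here so the
example stays self-contained.

```c
#include <assert.h>

/* Hypothetical block states, named after the XR_E_* values in the hunk. */
enum blk_state {
	ST_UNKNOWN, ST_FREE, ST_FREE1, ST_INO, ST_BAD_STATE, ST_DATA, ST_MULT
};

/* What the caller should report after the state transition. */
enum claim_result { CLAIM_OK, CLAIM_MULTIPLE, CLAIM_BAD_STATE };

/* Mark a block as an inode block, following the switch in the patch. */
static enum claim_result claim_inode_block(enum blk_state *state)
{
	switch (*state) {
	case ST_INO:		/* already marked */
		return CLAIM_OK;
	case ST_UNKNOWN:
	case ST_FREE:
	case ST_FREE1:
		*state = ST_INO;
		return CLAIM_OK;
	case ST_BAD_STATE:	/* caller would report a fatal error */
		return CLAIM_BAD_STATE;
	default:		/* claimed by something else already */
		*state = ST_MULT;
		return CLAIM_MULTIPLE;
	}
}
```

Both call sites would then reduce to a sparse check followed by one call.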
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH 22/28] repair: factor out sparse inodes from finobt reconstruction
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (20 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 21/28] repair: process sparse inode records correctly Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 23/28] repair: do not account sparse inodes in phase 5 cursor init Brian Foster
` (6 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
Phase 5 of xfs_repair recreates the on-disk btrees. The free inode btree
(finobt) contains inode records that contain one or more free inodes.
Sparse inodes are marked as free and therefore sparse inode records can
be incorrectly included in the finobt even when no real free inodes are
available in the record.
Update the finobt in-core record traversal helpers to factor out sparse
inodes and only consider inode records with allocated, free inodes for
finobt insertion.
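
To make the masking concrete, here is a small standalone sketch. The struct
demo_ino_rec and demo_rec_has_free() are hypothetical; only the
ir_free & ~ir_sparse expression is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down in-core record: one bit per inode in the chunk. */
struct demo_ino_rec {
	uint64_t ir_free;	/* set bit: inode is free (sparse ones too) */
	uint64_t ir_sparse;	/* set bit: inode is not physically allocated */
};

/* True only if at least one free inode is backed by real blocks. */
static bool demo_rec_has_free(const struct demo_ino_rec *rec)
{
	return (rec->ir_free & ~rec->ir_sparse) != 0;
}
```

A record whose only free bits are sparse bits is therefore left out of the
finobt.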
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/incore.h | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/repair/incore.h b/repair/incore.h
index d4e44a7..5a63e1e 100644
--- a/repair/incore.h
+++ b/repair/incore.h
@@ -384,6 +384,14 @@ void clear_uncertain_ino_cache(xfs_agnumber_t agno);
/*
* finobt helpers
*/
+
+static inline bool
+inode_rec_has_free(struct ino_tree_node *ino_rec)
+{
+ /* must have real, allocated inodes for finobt */
+ return ino_rec->ir_free & ~ino_rec->ir_sparse;
+}
+
static inline ino_tree_node_t *
findfirst_free_inode_rec(xfs_agnumber_t agno)
{
@@ -391,7 +399,7 @@ findfirst_free_inode_rec(xfs_agnumber_t agno)
ino_rec = findfirst_inode_rec(agno);
- while (ino_rec && !ino_rec->ir_free)
+ while (ino_rec && !inode_rec_has_free(ino_rec))
ino_rec = next_ino_rec(ino_rec);
return ino_rec;
@@ -402,7 +410,7 @@ next_free_ino_rec(ino_tree_node_t *ino_rec)
{
ino_rec = next_ino_rec(ino_rec);
- while (ino_rec && !ino_rec->ir_free)
+ while (ino_rec && !inode_rec_has_free(ino_rec))
ino_rec = next_ino_rec(ino_rec);
return ino_rec;
--
1.9.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 23/28] repair: do not account sparse inodes in phase 5 cursor init.
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (21 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 22/28] repair: factor out sparse inodes from finobt reconstruction Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 24/28] repair: reconstruct sparse inode records correctly on disk Brian Foster
` (5 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
The inode btrees are reconstructed in phase 5 of xfs_repair. The btree
cursor initialization counts the allocated and free inodes in the
in-core records and calculates the expected geometry of the resulting
btree. The free and total inode counts for each AG are also ultimately
aggregated to update the associated superblock counts.
Update init_ino_cursor() to not assume that every record holds 64 inodes
and to not count sparse inodes in the total or free inode count for each AG.
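
A minimal sketch of the per-record accounting, assuming one bit per inode in
64-bit free and sparse masks. count_rec_inodes() is a hypothetical name; the
loop body follows the order of checks in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64

/*
 * Count the physically allocated inodes in a record (ninos) and how many
 * of those are free (nfinos). Sparse inodes contribute to neither count.
 */
static void count_rec_inodes(uint64_t ir_free, uint64_t ir_sparse,
			     int *ninos, int *nfinos)
{
	*ninos = 0;
	*nfinos = 0;
	for (int i = 0; i < INODES_PER_CHUNK; i++) {
		uint64_t bit = 1ULL << i;

		if (ir_sparse & bit)
			continue;
		if (ir_free & bit)
			(*nfinos)++;
		(*ninos)++;
	}
}
```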
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/phase5.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/repair/phase5.c b/repair/phase5.c
index 04bf049..30f2d05 100644
--- a/repair/phase5.c
+++ b/repair/phase5.c
@@ -899,7 +899,8 @@ init_ino_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs,
{
__uint64_t ninos;
__uint64_t nfinos;
- __uint64_t rec_nfinos;
+ int rec_nfinos;
+ int rec_ninos;
ino_tree_node_t *ino_rec;
int num_recs;
int level;
@@ -919,11 +920,19 @@ init_ino_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs,
*/
ino_rec = findfirst_inode_rec(agno);
for (num_recs = 0; ino_rec != NULL; ino_rec = next_ino_rec(ino_rec)) {
+ rec_ninos = 0;
rec_nfinos = 0;
for (i = 0; i < XFS_INODES_PER_CHUNK; i++) {
ASSERT(is_inode_confirmed(ino_rec, i));
+ /*
+ * sparse inodes are not factored into superblock (free)
+ * inode counts
+ */
+ if (is_inode_sparse(ino_rec, i))
+ continue;
if (is_inode_free(ino_rec, i))
rec_nfinos++;
+ rec_ninos++;
}
/*
@@ -933,7 +942,7 @@ init_ino_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs,
continue;
nfinos += rec_nfinos;
- ninos += XFS_INODES_PER_CHUNK;
+ ninos += rec_ninos;
num_recs++;
}
--
1.9.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 24/28] repair: reconstruct sparse inode records correctly on disk
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (22 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 23/28] repair: do not account sparse inodes in phase 5 cursor init Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 25/28] repair: do not prefetch holes in sparse inode chunks Brian Foster
` (4 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
Phase 5 traverses all of the in-core inode records and regenerates the
inode btrees a record at a time. The record insertion code doesn't
account for sparse inodes which means the ir_holemask and ir_count
fields are not set on-disk and ir_freecount is set with an invalid value
for sparse inode records.
Update build_ino_tree() to handle sparse inode records correctly. We
must account real, allocated inodes only into the ir_freecount field.
The 64-bit in-core sparse inode bitmask must be converted to compressed
16-bit ir_holemask format. Finally, the ir_count field must be set to the
total (non-sparse) inode count of the record.
If the fs does not support sparse inodes, both the ir_holemask and
ir_count field are initialized to zero to preserve backwards
compatibility. These bytes historically landed in the high order bytes
of ir_freecount and must be 0 to be interpreted correctly by older XFS
implementations without sparse inode support.
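
The mask compression can be sketched on its own as follows. The constants
assume 16 holemask bits covering 4 inodes each (64 / 16), and
sparse_to_holemask() is a hypothetical name; the endian conversion done by
cpu_to_be16() in the patch is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define HOLEMASK_BITS		16	/* assumed width of ir_holemask */
#define INODES_PER_HOLEMASK_BIT	4	/* assumed: 64 inodes / 16 bits */

/*
 * Compress a 64-bit per-inode sparse mask into a 16-bit holemask. Each
 * holemask bit stands for a group of 4 inodes, so every group must be
 * either fully sparse or fully allocated.
 */
static uint16_t sparse_to_holemask(uint64_t sparse)
{
	const uint64_t spmask = (1ULL << INODES_PER_HOLEMASK_BIT) - 1;
	uint16_t holemask = 0;

	for (int k = 0; k < HOLEMASK_BITS; k++) {
		uint64_t group = sparse & spmask;

		assert(group == 0 || group == spmask);
		if (group)
			holemask |= (uint16_t)(1u << k);
		sparse >>= INODES_PER_HOLEMASK_BIT;
	}
	return holemask;
}
```

For example, a record whose upper 32 inodes are sparse maps to a holemask
with its upper 8 bits set.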
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/phase5.c | 47 +++++++++++++++++++++++++++++++++++++++--------
1 file changed, 39 insertions(+), 8 deletions(-)
diff --git a/repair/phase5.c b/repair/phase5.c
index 30f2d05..0601810 100644
--- a/repair/phase5.c
+++ b/repair/phase5.c
@@ -1158,8 +1158,12 @@ build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
xfs_agino_t count = 0;
xfs_agino_t freecount = 0;
int inocnt;
+ uint8_t finocnt;
int k;
int level = btree_curs->num_levels;
+ int spmask;
+ uint64_t sparse;
+ uint16_t holemask;
for (i = 0; i < level; i++) {
lptr = &btree_curs->level[i];
@@ -1243,19 +1247,46 @@ build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno,
cpu_to_be32(ino_rec->ino_startnum);
bt_rec[j].ir_free = cpu_to_be64(ino_rec->ir_free);
- inocnt = 0;
+ inocnt = finocnt = 0;
for (k = 0; k < sizeof(xfs_inofree_t)*NBBY; k++) {
ASSERT(is_inode_confirmed(ino_rec, k));
- inocnt += is_inode_free(ino_rec, k);
+
+ if (is_inode_sparse(ino_rec, k))
+ continue;
+ if (is_inode_free(ino_rec, k))
+ finocnt++;
+ inocnt++;
}
- if (xfs_sb_version_hassparseinodes(&mp->m_sb))
- bt_rec[j].ir_u.sp.ir_freecount = inocnt;
- else
+ if (!xfs_sb_version_hassparseinodes(&mp->m_sb)) {
bt_rec[j].ir_u.f.ir_freecount =
- cpu_to_be32(inocnt);
- freecount += inocnt;
- count += XFS_INODES_PER_CHUNK;
+ cpu_to_be32(finocnt);
+ goto nextrec;
+ }
+
+ /*
+ * Convert the 64-bit in-core sparse inode state to the
+ * 16-bit on-disk holemask.
+ */
+ holemask = 0;
+ spmask = (1 << XFS_INODES_PER_HOLEMASK_BIT) - 1;
+ sparse = ino_rec->ir_sparse;
+ for (k = 0; k < XFS_INOBT_HOLEMASK_BITS; k++) {
+ if (sparse & spmask) {
+ ASSERT((sparse & spmask) == spmask);
+ holemask |= (1 << k);
+ } else
+ ASSERT((sparse & spmask) == 0);
+ sparse >>= XFS_INODES_PER_HOLEMASK_BIT;
+ }
+
+ bt_rec[j].ir_u.sp.ir_freecount = finocnt;
+ bt_rec[j].ir_u.sp.ir_count = inocnt;
+ bt_rec[j].ir_u.sp.ir_holemask = cpu_to_be16(holemask);
+
+nextrec:
+ freecount += finocnt;
+ count += inocnt;
if (finobt)
ino_rec = next_free_ino_rec(ino_rec);
--
1.9.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 25/28] repair: do not prefetch holes in sparse inode chunks
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (23 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 24/28] repair: reconstruct sparse inode records correctly on disk Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:41 ` [PATCH 26/28] repair: handle sparse inode alignment Brian Foster
` (3 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
The repair prefetch mechanism reads all inode chunks in advance of
repair processing to improve performance. Inode buffer verification and
processing can occur within the prefetch mechanism such as when
directories are being processed. Prefetch currently assumes fully
populated inode chunks which leads to corruption errors attempting to
verify inode buffers that do not contain inodes.
Update prefetch to check the previously scanned sparse inode bits and
skip inode buffer reads of clusters that are sparse. We check sparse
state per-inode cluster because the cluster size is the min. allowable
inode chunk hole granularity.
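
The per-cluster check amounts to a shift-and-mask walk over the sparse
bitmap, roughly as sketched below. count_cluster_reads() is a hypothetical
name and the chunk is assumed to hold 64 inodes. The sketch builds the
cluster mask from a 64-bit constant, since shifting a plain int by 32 or
more bits is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64

/*
 * Return how many clusters of a chunk would be queued for I/O, i.e. how
 * many cluster-sized groups of the sparse mask are entirely clear.
 */
static int count_cluster_reads(uint64_t sparse, int inodes_per_cluster)
{
	uint64_t cluster_mask;
	int reads = 0;

	assert(inodes_per_cluster > 0 &&
	       inodes_per_cluster <= INODES_PER_CHUNK);
	assert(INODES_PER_CHUNK % inodes_per_cluster == 0);

	/* A shift by the full width is undefined, so special-case 64. */
	cluster_mask = inodes_per_cluster == 64 ?
			UINT64_MAX : (1ULL << inodes_per_cluster) - 1;

	for (int n = 0; n < INODES_PER_CHUNK; n += inodes_per_cluster) {
		if ((sparse & cluster_mask) == 0)
			reads++;
		sparse = inodes_per_cluster == 64 ?
				0 : sparse >> inodes_per_cluster;
	}
	return reads;
}
```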
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/prefetch.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/repair/prefetch.c b/repair/prefetch.c
index d6246ce..1577971 100644
--- a/repair/prefetch.c
+++ b/repair/prefetch.c
@@ -679,6 +679,7 @@ pf_queuing_worker(
xfs_agblock_t bno;
int i;
int err;
+ uint64_t sparse;
blks_per_cluster = mp->m_inode_cluster_size >> mp->m_sb.sb_blocklog;
if (blks_per_cluster == 0)
@@ -736,17 +737,27 @@ pf_queuing_worker(
num_inos = 0;
bno = XFS_AGINO_TO_AGBNO(mp, cur_irec->ino_startnum);
+ sparse = cur_irec->ir_sparse;
do {
struct xfs_buf_map map;
map.bm_bn = XFS_AGB_TO_DADDR(mp, args->agno, bno);
map.bm_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
- pf_queue_io(args, &map, 1,
- (cur_irec->ino_isa_dir != 0) ? B_DIR_INODE
- : B_INODE);
+
+ /*
+ * Queue I/O for each non-sparse cluster. We can check
+ * sparse state in cluster sized chunks as cluster size
+ * is the min. granularity of sparse irec regions.
+ */
+ if ((sparse & ((1ULL << inodes_per_cluster) - 1)) == 0)
+ pf_queue_io(args, &map, 1,
+ (cur_irec->ino_isa_dir != 0) ?
+ B_DIR_INODE : B_INODE);
+
bno += blks_per_cluster;
num_inos += inodes_per_cluster;
+ sparse >>= inodes_per_cluster;
} while (num_inos < mp->m_ialloc_inos);
}
--
1.9.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 26/28] repair: handle sparse inode alignment
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (24 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 25/28] repair: do not prefetch holes in sparse inode chunks Brian Foster
@ 2015-06-02 18:41 ` Brian Foster
2015-06-02 18:42 ` [PATCH 27/28] metadump: reorder inode record sanity checks and inode buffer read Brian Foster
` (2 subsequent siblings)
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:41 UTC (permalink / raw)
To: xfs
Sparse inode support requires inode alignment to match inode chunk size.
xfs_repair currently expects inode alignment to match the default
cluster size or a scaled factor thereof.
Update sb_validate_ino_align() to consider the superblock valid if
sparse inode support is enabled and alignment matches the chunk size.
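
As a worked example of the chunk-size rule, under the assumption of 64
inodes per chunk: with 512-byte inodes and 4096-byte blocks a chunk spans
(512 * 64) >> 12 = 8 blocks, so sb_inoalignmt must be 8. The function
sparse_align_valid() and its parameters are hypothetical; cluster_align
stands for the scaled cluster alignment that the existing code computes
before this check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64

/*
 * Sparse inode alignment check: the sparse alignment field must equal the
 * (separately computed) cluster alignment, and the inode alignment must
 * equal the size of a full inode chunk in filesystem blocks.
 */
static bool sparse_align_valid(uint32_t cluster_align, uint32_t spino_align,
			       uint32_t inodesize, unsigned int blocklog,
			       uint32_t inoalignmt)
{
	if (cluster_align != spino_align)
		return false;

	return ((inodesize * INODES_PER_CHUNK) >> blocklog) == inoalignmt;
}
```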
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/sb.c | 25 ++++++++++++++++++++-----
1 file changed, 20 insertions(+), 5 deletions(-)
diff --git a/repair/sb.c b/repair/sb.c
index 03be5e8..a291b3e 100644
--- a/repair/sb.c
+++ b/repair/sb.c
@@ -170,12 +170,13 @@ find_secondary_sb(xfs_sb_t *rsb)
}
/*
- * Calculate what inode alignment field ought to be
- * based on internal superblock info and determine if it is valid.
+ * Calculate what the inode alignment field ought to be based on internal
+ * superblock info and determine if it is valid.
*
- * For v5 superblocks, the inode alignment will either match that of the
- * standard XFS_INODE_BIG_CLUSTER_SIZE, or it will be scaled based on the inode
- * size. Either value is valid in this case.
+ * For standard v5 superblocks, the inode alignment must either match
+ * XFS_INODE_BIG_CLUSTER_SIZE or a multiple based on the inode size. For v5
+ * superblocks with sparse inode chunks enabled, inode alignment must match the
+ * inode chunk size.
*
* Return true if the alignment is valid, false otherwise.
*/
@@ -201,6 +202,20 @@ sb_validate_ino_align(struct xfs_sb *sb)
if (align == sb->sb_inoalignmt)
return true;
+ /*
+ * Sparse inodes require inoalignmt to match full inode chunk size and
+ * spino_align to match the scaled alignment (as calculated above).
+ */
+ if (xfs_sb_version_hassparseinodes(sb)) {
+ if (align != sb->sb_spino_align)
+ return false;
+
+ align = (sb->sb_inodesize * XFS_INODES_PER_CHUNK)
+ >> sb->sb_blocklog;
+ if (align == sb->sb_inoalignmt)
+ return true;
+ }
+
return false;
}
--
1.9.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 27/28] metadump: reorder inode record sanity checks and inode buffer read
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (25 preceding siblings ...)
2015-06-02 18:41 ` [PATCH 26/28] repair: handle sparse inode alignment Brian Foster
@ 2015-06-02 18:42 ` Brian Foster
2015-06-02 18:42 ` [PATCH 28/28] metadump: support sparse inode records Brian Foster
2015-06-16 0:33 ` [PATCH 00/28] xfsprogs: sparse inode chunks Dave Chinner
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:42 UTC (permalink / raw)
To: xfs
In preparation to support sparse inode records, refactor
copy_inode_chunk() to perform all record sanity checks before the cursor
is set to the inode chunk and the inode buffer is read.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
db/metadump.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/db/metadump.c b/db/metadump.c
index 94f92bc..e101501 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -1846,21 +1846,10 @@ copy_inode_chunk(
return 1;
}
- push_cur();
- set_cur(&typtab[TYP_INODE], XFS_AGB_TO_DADDR(mp, agno, agbno),
- XFS_FSB_TO_BB(mp, mp->m_ialloc_blks),
- DB_RING_IGN, NULL);
- if (iocur_top->data == NULL) {
- print_warning("cannot read inode block %u/%u", agno, agbno);
- rval = !stop_on_read_error;
- goto pop_out;
- }
-
/*
* check for basic assumptions about inode chunks, and if any
* assumptions fail, don't process the inode chunk.
*/
-
if ((mp->m_sb.sb_inopblock <= XFS_INODES_PER_CHUNK && off != 0) ||
(mp->m_sb.sb_inopblock > XFS_INODES_PER_CHUNK &&
off % XFS_INODES_PER_CHUNK != 0) ||
@@ -1870,7 +1859,17 @@ copy_inode_chunk(
if (show_warnings)
print_warning("badly aligned inode (start = %llu)",
XFS_AGINO_TO_INO(mp, agno, agino));
- goto skip_processing;
+ return 1;
+ }
+
+ push_cur();
+ set_cur(&typtab[TYP_INODE], XFS_AGB_TO_DADDR(mp, agno, agbno),
+ XFS_FSB_TO_BB(mp, mp->m_ialloc_blks),
+ DB_RING_IGN, NULL);
+ if (iocur_top->data == NULL) {
+ print_warning("cannot read inode block %u/%u", agno, agbno);
+ rval = !stop_on_read_error;
+ goto pop_out;
}
/*
@@ -1889,7 +1888,7 @@ copy_inode_chunk(
if (!process_inode(agno, agino + i, dip))
goto pop_out;
}
-skip_processing:
+
if (write_buf(iocur_top))
goto pop_out;
--
1.9.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 28/28] metadump: support sparse inode records
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (26 preceding siblings ...)
2015-06-02 18:42 ` [PATCH 27/28] metadump: reorder inode record sanity checks and inode buffer read Brian Foster
@ 2015-06-02 18:42 ` Brian Foster
2015-06-16 0:33 ` [PATCH 00/28] xfsprogs: sparse inode chunks Dave Chinner
28 siblings, 0 replies; 38+ messages in thread
From: Brian Foster @ 2015-06-02 18:42 UTC (permalink / raw)
To: xfs
xfs_metadump currently uses mp->m_ialloc_blks sized buffers to copy
inode chunks. If a filesystem supports sparse inodes, some clusters
within inode chunks can point to arbitrary data. If the buffer used to
read inodes includes these sparse clusters, inode read verification
fails and prints filesystem corruption warnings.
Update copy_inode_chunk() to support using a cluster sized buffer to
read a full inode chunk in multiple iterations if sparse inodes are
enabled. For each cluster read, check whether the first inode in the
cluster is sparse and skip the cluster if so. This is safe because
sparse records are allocated at cluster granularity.
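
The buffer sizing and iteration can be sketched as below. This is an
illustration only: walk_chunk() and struct walk_result are hypothetical,
a 64-bit per-inode sparse mask is assumed in place of the on-disk holemask,
and the result counts the inodes covered by each read rather than the
in-use inodes actually copied.

```c
#include <assert.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64

struct walk_result {
	int reads;		/* buffers that would be read */
	int inodes_covered;	/* inodes contained in those buffers */
};

/*
 * Walk one inode chunk in buffer-sized steps. blks_per_buf is the cluster
 * size in blocks when sparse inodes are enabled, else the whole chunk.
 */
static struct walk_result walk_chunk(uint64_t sparse, int blks_per_buf,
				     int inopblog, int chunk_blks)
{
	struct walk_result res = { 0, 0 };
	int inodes_per_buf = blks_per_buf << inopblog;
	int blk = 0;
	int ioff = 0;

	/* Never process more than the single record we were given. */
	if (inodes_per_buf > INODES_PER_CHUNK)
		inodes_per_buf = INODES_PER_CHUNK;
	assert(inodes_per_buf > 0);

	while (blk < chunk_blks && ioff < INODES_PER_CHUNK) {
		/* Skip the whole buffer if its first inode is sparse. */
		if (!((sparse >> ioff) & 1)) {
			res.reads++;
			res.inodes_covered += inodes_per_buf;
		}
		blk += blks_per_buf;
		ioff += inodes_per_buf;
	}
	return res;
}
```

With 8 inodes per block (inopblog of 3), an 8-block chunk and 4-block
clusters, a record whose upper half is sparse needs one read covering 32
inodes.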
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
db/metadump.c | 83 ++++++++++++++++++++++++++++++++++++++++++-----------------
1 file changed, 60 insertions(+), 23 deletions(-)
diff --git a/db/metadump.c b/db/metadump.c
index e101501..5391c4c 100644
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -1830,13 +1830,43 @@ copy_inode_chunk(
xfs_agino_t agino;
int off;
xfs_agblock_t agbno;
+ xfs_agblock_t end_agbno;
int i;
int rval = 0;
+ int blks_per_buf;
+ int inodes_per_buf;
+ int ioff;
agino = be32_to_cpu(rp->ir_startino);
agbno = XFS_AGINO_TO_AGBNO(mp, agino);
+ end_agbno = agbno + mp->m_ialloc_blks;
off = XFS_INO_TO_OFFSET(mp, agino);
+ /*
+ * If the fs supports sparse inode records, we must process inodes a
+ * cluster at a time because that is the sparse allocation granularity.
+ * Otherwise, we risk CRC corruption errors on reads of inode chunks.
+ *
+ * Also make sure that we don't process more than the single record
+ * we've been passed (large block sizes can hold multiple inode chunks).
+ */
+ if (xfs_sb_version_hassparseinodes(&mp->m_sb))
+ blks_per_buf = xfs_icluster_size_fsb(mp);
+ else
+ blks_per_buf = mp->m_ialloc_blks;
+ inodes_per_buf = min(blks_per_buf << mp->m_sb.sb_inopblog,
+ XFS_INODES_PER_CHUNK);
+
+ /*
+ * Sanity check that we only process a single buffer if ir_startino has
+ * a buffer offset. A non-zero offset implies that the entire chunk lies
+ * within a block.
+ */
+ if (off && inodes_per_buf != XFS_INODES_PER_CHUNK) {
+ print_warning("bad starting inode offset %d", off);
+ return 0;
+ }
+
if (agino == 0 || agino == NULLAGINO || !valid_bno(agno, agbno) ||
!valid_bno(agno, XFS_AGINO_TO_AGBNO(mp,
agino + XFS_INODES_PER_CHUNK - 1))) {
@@ -1863,36 +1893,43 @@ copy_inode_chunk(
}
push_cur();
- set_cur(&typtab[TYP_INODE], XFS_AGB_TO_DADDR(mp, agno, agbno),
- XFS_FSB_TO_BB(mp, mp->m_ialloc_blks),
- DB_RING_IGN, NULL);
- if (iocur_top->data == NULL) {
- print_warning("cannot read inode block %u/%u", agno, agbno);
- rval = !stop_on_read_error;
- goto pop_out;
- }
- /*
- * scan through inodes and copy any btree extent lists, directory
- * contents and extended attributes.
- */
- for (i = 0; i < XFS_INODES_PER_CHUNK; i++) {
- xfs_dinode_t *dip;
+ ioff = 0;
+ while (agbno < end_agbno && ioff < XFS_INODES_PER_CHUNK) {
+ if (XFS_INOBT_IS_SPARSE_DISK(rp, ioff))
+ goto next_bp;
+
+ set_cur(&typtab[TYP_INODE], XFS_AGB_TO_DADDR(mp, agno, agbno),
+ XFS_FSB_TO_BB(mp, blks_per_buf), DB_RING_IGN, NULL);
+ if (iocur_top->data == NULL) {
+ print_warning("cannot read inode block %u/%u",
+ agno, agbno);
+ rval = !stop_on_read_error;
+ goto pop_out;
+ }
- if (XFS_INOBT_IS_FREE_DISK(rp, i))
- continue;
+ for (i = 0; i < inodes_per_buf; i++) {
+ xfs_dinode_t *dip;
- dip = (xfs_dinode_t *)((char *)iocur_top->data +
+ if (XFS_INOBT_IS_FREE_DISK(rp, ioff + i))
+ continue;
+
+ dip = (xfs_dinode_t *)((char *)iocur_top->data +
((off + i) << mp->m_sb.sb_inodelog));
- if (!process_inode(agno, agino + i, dip))
- goto pop_out;
- }
+ if (!process_inode(agno, agino + ioff + i, dip))
+ goto pop_out;
- if (write_buf(iocur_top))
- goto pop_out;
+ inodes_copied++;
+ }
- inodes_copied += XFS_INODES_PER_CHUNK;
+ if (write_buf(iocur_top))
+ goto pop_out;
+
+next_bp:
+ agbno += blks_per_buf;
+ ioff += inodes_per_buf;
+ }
if (show_progress)
print_progress("Copied %u of %u inodes (%u of %u AGs)",
--
1.9.3
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 00/28] xfsprogs: sparse inode chunks
2015-06-02 18:41 [PATCH 00/28] xfsprogs: sparse inode chunks Brian Foster
` (27 preceding siblings ...)
2015-06-02 18:42 ` [PATCH 28/28] metadump: support sparse inode records Brian Foster
@ 2015-06-16 0:33 ` Dave Chinner
2015-06-16 0:39 ` Dave Chinner
28 siblings, 1 reply; 38+ messages in thread
From: Dave Chinner @ 2015-06-16 0:33 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Tue, Jun 02, 2015 at 02:41:33PM -0400, Brian Foster wrote:
> Hi all,
>
> Now that the sparse inode chunks feature is merged into the kernel tree
> for 4.2, here is the first official drop of userspace support. This
> series is based on the current libxfs-4.1-update branch.
>
> Patches 1-10 are libxfs infrastructure and correspond to the similarly
> named kernel patches. The bits not relevant to userspace are dropped
> along with the bulk of the sparse inode chunk allocation logic from the
> kernel due to the combination of non-existent dependencies in userspace
> (e.g., xfs_bit.c) and the fact that this code isn't invoked from
> userspace.
Ok, so this is causing problems with merging other code into
userspace. What I'm trying to do is keep the kernel fs/xfs/libxfs/
code as close to identical with the xfsprogs libxfs/ code so that
patches just port straight across. I came across this difference
because my rmap btree patches fail to apply cleanly to xfs_ialloc.c
and it's because of all this missing code in userspace.
Rather than wait another day or two for you to rework this, Brian,
I'm simply going to rework this series to pull all the kernel patches
across and make it compile in userspace so that I can pull all the
rmap btree stuff across without needing to rework bits and pieces of
the patchset.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 00/28] xfsprogs: sparse inode chunks
2015-06-16 0:33 ` [PATCH 00/28] xfsprogs: sparse inode chunks Dave Chinner
@ 2015-06-16 0:39 ` Dave Chinner
2015-06-16 10:55 ` Brian Foster
0 siblings, 1 reply; 38+ messages in thread
From: Dave Chinner @ 2015-06-16 0:39 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Tue, Jun 16, 2015 at 10:33:44AM +1000, Dave Chinner wrote:
> On Tue, Jun 02, 2015 at 02:41:33PM -0400, Brian Foster wrote:
> > Hi all,
> >
> > Now that the sparse inode chunks feature is merged into the kernel tree
> > for 4.2, here is the first official drop of userspace support. This
> > series is based on the current libxfs-4.1-update branch.
> >
> > Patches 1-10 are libxfs infrastructure and correspond to the similarly
> > named kernel patches. The bits not relevant to userspace are dropped
> > along with the bulk of the sparse inode chunk allocation logic from the
> > kernel due to the combination of non-existent dependencies in userspace
> > (e.g., xfs_bit.c) and the fact that this code isn't invoked from
> > userspace.
>
> Ok, so this is causing problems with merging other code into
> userspace. What I'm trying to do is keep the kernel fs/xfs/libxfs/
> code as close to identical to the xfsprogs libxfs/ code as possible so that
> patches just port straight across. I came across this difference
> because my rmap btree patches fail to apply cleanly to xfs_ialloc.c
> and it's because of all this missing code in userspace.
>
> Rather than wait another day or two for you to rework this, Brian,
> I'm simply going to rework this series to pull all the kernel patches
> across and make it compile in userspace so that I can pull all the
> rmap btree stuff across without needing to rework bits and pieces of
> the patchset.
BTW, with patches I'll soon commit:
$ tools/libxfs-apply --source ../kern/xfsdev/ --commit 22419ac..22ce1e1
Commits to apply:
d4cc540 xfs: create individual inode alloc. helper
999633d xfs: update free inode record logic to support sparse inode records
bfe46d4 xfs: support min/max agbno args in block allocator
fb4f2b4 xfs: add sparse inode chunk alignment superblock field
066a188 xfs: use sparse chunk alignment for min. inode allocation requirement
e5376fc xfs: sparse inode chunks feature helpers and mount requirements
502a4e7 xfs: add fs geometry bit for sparse inode chunks
5419040 xfs: introduce inode record hole mask for sparse inode chunks
12d0714 xfs: use actual inode count for sparse records in bulkstat/inumbers
463958a xfs: pass inode count through ordered icreate log item
7f43c90 xfs: handle sparse inode chunks in icreate log recovery
4148c34 xfs: helper to convert holemask to inode alloc. bitmap
56d1115 xfs: allocate sparse inode chunks on full chunk allocation failure
1cdadee xfs: randomly do sparse inode allocations in DEBUG mode
26dd521 xfs: filter out sparse regions from individual inode allocation
10ae3dc7 xfs: only free allocated regions of inode chunks
09b5660 xfs: skip unallocated regions of inode chunks in xfs_ifree_cluster()
22ce1e1 xfs: enable sparse inode chunks for v5 superblocks
Proceed [y|N]? y
Applying patch..xfs__create_individual_inode_alloc_helper
Patch applied.
Patch xfs__create_individual_inode_alloc_helper refreshed
Applying patch..xfs__update_free_inode_record_logic_to_support_sparse_inode_records
Patch applied.
Patch xfs__update_free_inode_record_logic_to_support_sparse_inode_records refreshed
Applying patch..xfs__support_min-max_agbno_args_in_block_allocator
Patch applied.
Patch xfs__support_min-max_agbno_args_in_block_allocator refreshed
Applying patch..xfs__add_sparse_inode_chunk_alignment_superblock_field
Patch applied.
Patch xfs__add_sparse_inode_chunk_alignment_superblock_field refreshed
Applying patch..xfs__use_sparse_chunk_alignment_for_min_inode_allocation_requirement
Patch applied.
Patch xfs__use_sparse_chunk_alignment_for_min_inode_allocation_requirement refreshed
Applying patch..xfs__sparse_inode_chunks_feature_helpers_and_mount_requirements
Patch applied.
Patch xfs__sparse_inode_chunks_feature_helpers_and_mount_requirements refreshed
Applying patch..xfs__add_fs_geometry_bit_for_sparse_inode_chunks
Patch applied.
Patch xfs__add_fs_geometry_bit_for_sparse_inode_chunks refreshed
Applying patch..xfs__introduce_inode_record_hole_mask_for_sparse_inode_chunks
Patch applied.
Patch xfs__introduce_inode_record_hole_mask_for_sparse_inode_chunks refreshed
Applying patch..xfs__use_actual_inode_count_for_sparse_records_in_bulkstat-inumbers
Patch applied.
Patch xfs__use_actual_inode_count_for_sparse_records_in_bulkstat-inumbers refreshed
Applying patch..xfs__pass_inode_count_through_ordered_icreate_log_item
Patch applied.
Patch xfs__pass_inode_count_through_ordered_icreate_log_item refreshed
Applying patch..xfs__handle_sparse_inode_chunks_in_icreate_log_recovery
Patch applied.
Patch xfs__handle_sparse_inode_chunks_in_icreate_log_recovery refreshed
Applying patch..xfs__helper_to_convert_holemask_to_inode_alloc_bitmap
Patch applied.
Patch xfs__helper_to_convert_holemask_to_inode_alloc_bitmap refreshed
Applying patch..xfs__allocate_sparse_inode_chunks_on_full_chunk_allocation_failure
Patch applied.
Patch xfs__allocate_sparse_inode_chunks_on_full_chunk_allocation_failure refreshed
Applying patch..xfs__randomly_do_sparse_inode_allocations_in_debug_mode
Patch applied.
Patch xfs__randomly_do_sparse_inode_allocations_in_debug_mode refreshed
Applying patch..xfs__filter_out_sparse_regions_from_individual_inode_allocation
Patch applied.
Patch xfs__filter_out_sparse_regions_from_individual_inode_allocation refreshed
Applying patch..xfs__only_free_allocated_regions_of_inode_chunks
Patch applied.
Patch xfs__only_free_allocated_regions_of_inode_chunks refreshed
Applying patch..xfs__skip_unallocated_regions_of_inode_chunks_in_xfs_ifree_cluster
Patch applied.
Patch xfs__skip_unallocated_regions_of_inode_chunks_in_xfs_ifree_cluster refreshed
Applying patch..xfs__enable_sparse_inode_chunks_for_v5_superblocks
Patch applied.
Patch xfs__enable_sparse_inode_chunks_for_v5_superblocks refreshed
$
The series has applied cleanly with less than 30s work, now I can go
and do the bits I need to make it compile....
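For anyone following the thread, here is a minimal sketch of what two of
the commits listed above ("helper to convert holemask to inode alloc.
bitmap" and "filter out sparse regions from individual inode allocation")
are about. It is not the kernel code: holemask_to_allocmask(), lowbit64()
and first_free_inode() are hypothetical names, and the sketch assumes a
16-bit holemask in which each set bit marks 4 unallocated inodes of a
64-inode chunk.

```c
#include <stdint.h>

/* Assumed geometry: 64 inodes per chunk, 16 holemask bits, 4 inodes per bit. */
#define INODES_PER_CHUNK	64
#define HOLEMASK_BITS		16
#define INODES_PER_HOLEMASK_BIT	(INODES_PER_CHUNK / HOLEMASK_BITS)

/*
 * Expand a 16-bit holemask into a 64-bit mask with one bit per inode.
 * A set holemask bit is a hole, so only clear bits contribute
 * allocated inodes to the result.
 */
static uint64_t holemask_to_allocmask(uint16_t holemask)
{
	uint64_t allocmask = 0;
	int bit;

	for (bit = 0; bit < HOLEMASK_BITS; bit++) {
		if (holemask & (1u << bit))
			continue;	/* sparse region: leave bits clear */
		allocmask |= (uint64_t)0xf << (bit * INODES_PER_HOLEMASK_BIT);
	}
	return allocmask;
}

/* Index of the lowest set bit, or -1 if no bit is set. */
static int lowbit64(uint64_t v)
{
	int n;

	for (n = 0; n < 64; n++) {
		if (v & ((uint64_t)1 << n))
			return n;
	}
	return -1;
}

/*
 * First free inode in a record: mask the free bitmap with the
 * allocated-inode mask so offsets inside holes are never returned.
 */
static int first_free_inode(uint64_t ir_free, uint16_t holemask)
{
	return lowbit64(ir_free & holemask_to_allocmask(holemask));
}
```

With the low half of the chunk sparse (holemask 0x00ff) and every bit of
ir_free set, first_free_inode() returns 32 rather than 0, which is the
filtering the plain lowest-set-bit lookup on ir_free does not do.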
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 00/28] xfsprogs: sparse inode chunks
2015-06-16 0:39 ` Dave Chinner
@ 2015-06-16 10:55 ` Brian Foster
2015-06-16 20:26 ` Dave Chinner
0 siblings, 1 reply; 38+ messages in thread
From: Brian Foster @ 2015-06-16 10:55 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Tue, Jun 16, 2015 at 10:39:23AM +1000, Dave Chinner wrote:
> On Tue, Jun 16, 2015 at 10:33:44AM +1000, Dave Chinner wrote:
> > On Tue, Jun 02, 2015 at 02:41:33PM -0400, Brian Foster wrote:
> > > Hi all,
> > >
> > > Now that the sparse inode chunks feature is merged into the kernel tree
> > > for 4.2, here is the first official drop of userspace support. This
> > > series is based on the current libxfs-4.1-update branch.
> > >
> > > Patches 1-10 are libxfs infrastructure and correspond to the similarly
> > > named kernel patches. The bits not relevant to userspace are dropped
> > > along with the bulk of the sparse inode chunk allocation logic from the
> > > kernel due to the combination of non-existent dependencies in userspace
> > > (e.g., xfs_bit.c) and the fact that this code isn't invoked from
> > > userspace.
> >
> > Ok, so this is causing problems with merging other code into
> > userspace. What I'm trying to do is keep the kernel fs/xfs/libxfs/
> > code as close to identical to the xfsprogs libxfs/ code as possible so that
> > patches just port straight across. I came across this difference
> > because my rmap btree patches fail to apply cleanly to xfs_ialloc.c
> > and it's because of all this missing code in userspace.
> >
> > Rather than wait another day or two for you to rework this, Brian,
> > I'm simply going to rework this series to pull all the kernel patches
> > across and make it compile in userspace so that I can pull all the
> > rmap btree stuff across without needing to rework bits and pieces of
> > the patchset.
>
> BTW, with patches I'll soon commit:
>
> $ tools/libxfs-apply --source ../kern/xfsdev/ --commit 22419ac..22ce1e1
> Commits to apply:
> d4cc540 xfs: create individual inode alloc. helper
> 999633d xfs: update free inode record logic to support sparse inode records
> bfe46d4 xfs: support min/max agbno args in block allocator
> fb4f2b4 xfs: add sparse inode chunk alignment superblock field
> 066a188 xfs: use sparse chunk alignment for min. inode allocation requirement
> e5376fc xfs: sparse inode chunks feature helpers and mount requirements
> 502a4e7 xfs: add fs geometry bit for sparse inode chunks
> 5419040 xfs: introduce inode record hole mask for sparse inode chunks
> 12d0714 xfs: use actual inode count for sparse records in bulkstat/inumbers
> 463958a xfs: pass inode count through ordered icreate log item
> 7f43c90 xfs: handle sparse inode chunks in icreate log recovery
> 4148c34 xfs: helper to convert holemask to inode alloc. bitmap
> 56d1115 xfs: allocate sparse inode chunks on full chunk allocation failure
> 1cdadee xfs: randomly do sparse inode allocations in DEBUG mode
> 26dd521 xfs: filter out sparse regions from individual inode allocation
> 10ae3dc7 xfs: only free allocated regions of inode chunks
> 09b5660 xfs: skip unallocated regions of inode chunks in xfs_ifree_cluster()
> 22ce1e1 xfs: enable sparse inode chunks for v5 superblocks
> Proceed [y|N]? y
> Applying patch..xfs__create_individual_inode_alloc_helper
> Patch applied.
> Patch xfs__create_individual_inode_alloc_helper refreshed
> Applying patch..xfs__update_free_inode_record_logic_to_support_sparse_inode_records
> Patch applied.
> Patch xfs__update_free_inode_record_logic_to_support_sparse_inode_records refreshed
> Applying patch..xfs__support_min-max_agbno_args_in_block_allocator
> Patch applied.
> Patch xfs__support_min-max_agbno_args_in_block_allocator refreshed
> Applying patch..xfs__add_sparse_inode_chunk_alignment_superblock_field
> Patch applied.
> Patch xfs__add_sparse_inode_chunk_alignment_superblock_field refreshed
> Applying patch..xfs__use_sparse_chunk_alignment_for_min_inode_allocation_requirement
> Patch applied.
> Patch xfs__use_sparse_chunk_alignment_for_min_inode_allocation_requirement refreshed
> Applying patch..xfs__sparse_inode_chunks_feature_helpers_and_mount_requirements
> Patch applied.
> Patch xfs__sparse_inode_chunks_feature_helpers_and_mount_requirements refreshed
> Applying patch..xfs__add_fs_geometry_bit_for_sparse_inode_chunks
> Patch applied.
> Patch xfs__add_fs_geometry_bit_for_sparse_inode_chunks refreshed
> Applying patch..xfs__introduce_inode_record_hole_mask_for_sparse_inode_chunks
> Patch applied.
> Patch xfs__introduce_inode_record_hole_mask_for_sparse_inode_chunks refreshed
> Applying patch..xfs__use_actual_inode_count_for_sparse_records_in_bulkstat-inumbers
> Patch applied.
> Patch xfs__use_actual_inode_count_for_sparse_records_in_bulkstat-inumbers refreshed
> Applying patch..xfs__pass_inode_count_through_ordered_icreate_log_item
> Patch applied.
> Patch xfs__pass_inode_count_through_ordered_icreate_log_item refreshed
> Applying patch..xfs__handle_sparse_inode_chunks_in_icreate_log_recovery
> Patch applied.
> Patch xfs__handle_sparse_inode_chunks_in_icreate_log_recovery refreshed
> Applying patch..xfs__helper_to_convert_holemask_to_inode_alloc_bitmap
> Patch applied.
> Patch xfs__helper_to_convert_holemask_to_inode_alloc_bitmap refreshed
> Applying patch..xfs__allocate_sparse_inode_chunks_on_full_chunk_allocation_failure
> Patch applied.
> Patch xfs__allocate_sparse_inode_chunks_on_full_chunk_allocation_failure refreshed
> Applying patch..xfs__randomly_do_sparse_inode_allocations_in_debug_mode
> Patch applied.
> Patch xfs__randomly_do_sparse_inode_allocations_in_debug_mode refreshed
> Applying patch..xfs__filter_out_sparse_regions_from_individual_inode_allocation
> Patch applied.
> Patch xfs__filter_out_sparse_regions_from_individual_inode_allocation refreshed
> Applying patch..xfs__only_free_allocated_regions_of_inode_chunks
> Patch applied.
> Patch xfs__only_free_allocated_regions_of_inode_chunks refreshed
> Applying patch..xfs__skip_unallocated_regions_of_inode_chunks_in_xfs_ifree_cluster
> Patch applied.
> Patch xfs__skip_unallocated_regions_of_inode_chunks_in_xfs_ifree_cluster refreshed
> Applying patch..xfs__enable_sparse_inode_chunks_for_v5_superblocks
> Patch applied.
> Patch xfs__enable_sparse_inode_chunks_for_v5_superblocks refreshed
> $
>
> The series has applied cleanly with less than 30s work, now I can go
> and do the bits I need to make it compile....
>
Yeah, the issue was never that it didn't apply. For the most part,
everything in the kernel series applied fine to userspace (save for
non-existent files in userspace, etc.). The issue was that some of the
bits didn't compile in userspace and it didn't seem worth the effort to
port over the dependent bits for code that ultimately doesn't execute in
userspace (and thus can't be tested easily either).
So what is the plan to handle that? I suppose we could comment out those
particular bits, just leave stub functions for the depending or
dependent code, move it to part of the shared libxfs assuming it doesn't
create further dependencies, etc. The first couple of options would
probably still eventually lead to libxfs conflicts going forward,
whereas outside stubs or pulling more into libxfs probably avoids
that.
Brian
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH 00/28] xfsprogs: sparse inode chunks
2015-06-16 10:55 ` Brian Foster
@ 2015-06-16 20:26 ` Dave Chinner
0 siblings, 0 replies; 38+ messages in thread
From: Dave Chinner @ 2015-06-16 20:26 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Tue, Jun 16, 2015 at 06:55:04AM -0400, Brian Foster wrote:
> On Tue, Jun 16, 2015 at 10:39:23AM +1000, Dave Chinner wrote:
> > On Tue, Jun 16, 2015 at 10:33:44AM +1000, Dave Chinner wrote:
> > > On Tue, Jun 02, 2015 at 02:41:33PM -0400, Brian Foster wrote:
> > > > Hi all,
> > > >
> > > > Now that the sparse inode chunks feature is merged into the kernel tree
> > > > for 4.2, here is the first official drop of userspace support. This
> > > > series is based on the current libxfs-4.1-update branch.
> > > >
> > > > Patches 1-10 are libxfs infrastructure and correspond to the similarly
> > > > named kernel patches. The bits not relevant to userspace are dropped
> > > > along with the bulk of the sparse inode chunk allocation logic from the
> > > > kernel due to the combination of non-existent dependencies in userspace
> > > > (e.g., xfs_bit.c) and the fact that this code isn't invoked from
> > > > userspace.
> > >
> > > Ok, so this is causing problems with merging other code into
> > > userspace. What I'm trying to do is keep the kernel fs/xfs/libxfs/
> > > code as close to identical to the xfsprogs libxfs/ code as possible so that
> > > patches just port straight across. I came across this difference
> > > because my rmap btree patches fail to apply cleanly to xfs_ialloc.c
> > > and it's because of all this missing code in userspace.
> > >
> > > Rather than wait another day or two for you to rework this, Brian,
> > > I'm simply going to rework this series to pull all the kernel patches
> > > across and make it compile in userspace so that I can pull all the
> > > rmap btree stuff across without needing to rework bits and pieces of
> > > the patchset.
> >
> > BTW, with patches I'll soon commit:
> >
> > $ tools/libxfs-apply --source ../kern/xfsdev/ --commit 22419ac..22ce1e1
> > Commits to apply:
> > d4cc540 xfs: create individual inode alloc. helper
> > 999633d xfs: update free inode record logic to support sparse inode records
> > bfe46d4 xfs: support min/max agbno args in block allocator
> > fb4f2b4 xfs: add sparse inode chunk alignment superblock field
> > 066a188 xfs: use sparse chunk alignment for min. inode allocation requirement
> > e5376fc xfs: sparse inode chunks feature helpers and mount requirements
> > 502a4e7 xfs: add fs geometry bit for sparse inode chunks
> > 5419040 xfs: introduce inode record hole mask for sparse inode chunks
> > 12d0714 xfs: use actual inode count for sparse records in bulkstat/inumbers
> > 463958a xfs: pass inode count through ordered icreate log item
> > 7f43c90 xfs: handle sparse inode chunks in icreate log recovery
> > 4148c34 xfs: helper to convert holemask to inode alloc. bitmap
> > 56d1115 xfs: allocate sparse inode chunks on full chunk allocation failure
> > 1cdadee xfs: randomly do sparse inode allocations in DEBUG mode
> > 26dd521 xfs: filter out sparse regions from individual inode allocation
> > 10ae3dc7 xfs: only free allocated regions of inode chunks
> > 09b5660 xfs: skip unallocated regions of inode chunks in xfs_ifree_cluster()
> > 22ce1e1 xfs: enable sparse inode chunks for v5 superblocks
> > Proceed [y|N]? y
> > Applying patch..xfs__create_individual_inode_alloc_helper
> > Patch applied.
> > Patch xfs__create_individual_inode_alloc_helper refreshed
> > Applying patch..xfs__update_free_inode_record_logic_to_support_sparse_inode_records
> > Patch applied.
> > Patch xfs__update_free_inode_record_logic_to_support_sparse_inode_records refreshed
> > Applying patch..xfs__support_min-max_agbno_args_in_block_allocator
> > Patch applied.
> > Patch xfs__support_min-max_agbno_args_in_block_allocator refreshed
> > Applying patch..xfs__add_sparse_inode_chunk_alignment_superblock_field
> > Patch applied.
> > Patch xfs__add_sparse_inode_chunk_alignment_superblock_field refreshed
> > Applying patch..xfs__use_sparse_chunk_alignment_for_min_inode_allocation_requirement
> > Patch applied.
> > Patch xfs__use_sparse_chunk_alignment_for_min_inode_allocation_requirement refreshed
> > Applying patch..xfs__sparse_inode_chunks_feature_helpers_and_mount_requirements
> > Patch applied.
> > Patch xfs__sparse_inode_chunks_feature_helpers_and_mount_requirements refreshed
> > Applying patch..xfs__add_fs_geometry_bit_for_sparse_inode_chunks
> > Patch applied.
> > Patch xfs__add_fs_geometry_bit_for_sparse_inode_chunks refreshed
> > Applying patch..xfs__introduce_inode_record_hole_mask_for_sparse_inode_chunks
> > Patch applied.
> > Patch xfs__introduce_inode_record_hole_mask_for_sparse_inode_chunks refreshed
> > Applying patch..xfs__use_actual_inode_count_for_sparse_records_in_bulkstat-inumbers
> > Patch applied.
> > Patch xfs__use_actual_inode_count_for_sparse_records_in_bulkstat-inumbers refreshed
> > Applying patch..xfs__pass_inode_count_through_ordered_icreate_log_item
> > Patch applied.
> > Patch xfs__pass_inode_count_through_ordered_icreate_log_item refreshed
> > Applying patch..xfs__handle_sparse_inode_chunks_in_icreate_log_recovery
> > Patch applied.
> > Patch xfs__handle_sparse_inode_chunks_in_icreate_log_recovery refreshed
> > Applying patch..xfs__helper_to_convert_holemask_to_inode_alloc_bitmap
> > Patch applied.
> > Patch xfs__helper_to_convert_holemask_to_inode_alloc_bitmap refreshed
> > Applying patch..xfs__allocate_sparse_inode_chunks_on_full_chunk_allocation_failure
> > Patch applied.
> > Patch xfs__allocate_sparse_inode_chunks_on_full_chunk_allocation_failure refreshed
> > Applying patch..xfs__randomly_do_sparse_inode_allocations_in_debug_mode
> > Patch applied.
> > Patch xfs__randomly_do_sparse_inode_allocations_in_debug_mode refreshed
> > Applying patch..xfs__filter_out_sparse_regions_from_individual_inode_allocation
> > Patch applied.
> > Patch xfs__filter_out_sparse_regions_from_individual_inode_allocation refreshed
> > Applying patch..xfs__only_free_allocated_regions_of_inode_chunks
> > Patch applied.
> > Patch xfs__only_free_allocated_regions_of_inode_chunks refreshed
> > Applying patch..xfs__skip_unallocated_regions_of_inode_chunks_in_xfs_ifree_cluster
> > Patch applied.
> > Patch xfs__skip_unallocated_regions_of_inode_chunks_in_xfs_ifree_cluster refreshed
> > Applying patch..xfs__enable_sparse_inode_chunks_for_v5_superblocks
> > Patch applied.
> > Patch xfs__enable_sparse_inode_chunks_for_v5_superblocks refreshed
> > $
> >
> > The series has applied cleanly with less than 30s work, now I can go
> > and do the bits I need to make it compile....
> >
>
> Yeah, the issue was never that it didn't apply. For the most part,
> everything in the kernel series applied fine to userspace (save for
> non-existent files in userspace, etc.). The issue was that some of the
> bits didn't compile in userspace and it didn't seem worth the effort to
> port over the dependent bits for code that ultimately doesn't execute in
> userspace (and thus can't be tested easily either).
>
> So what is the plan to handle that? I suppose we could comment out those
> particular bits, just leave stub functions for the depending or
> dependent code, move it to part of the shared libxfs assuming it doesn't
> create further dependencies, etc. The first couple of options would
> probably still eventually lead to libxfs conflicts going forward,
> whereas outside stubs or pulling more into libxfs probably avoids
> that.
No stubs or partial files - that's where we were before I started on
the libxfs unification and it made merges hell. libxfs/ files should
be as close to identical as possible in both code bases.
I just pulled in xfs_bit.c (xfs_bit.h is already in libxfs/), and
I copied the bitmap pieces from the kernel files across into
libxfs_priv.h, and everything compiles and seems to work.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 38+ messages in thread