* [PATCH 01/64] xfs: avoid redundant AGFL buffer invalidation
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
@ 2024-10-02 1:08 ` Darrick J. Wong
2024-10-02 1:08 ` [PATCH 02/64] xfs: don't walk off the end of a directory data block Darrick J. Wong
` (62 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:08 UTC
To: aalbersh, djwong, cem; +Cc: Gao Xiang, Dave Chinner, Chandan Babu R, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: d40c2865bdbbbba6418436b0a877daebe1d7c63e
Currently AGFL blocks can be filled from the following three sources:
- allocbt free blocks, as in xfs_allocbt_free_block();
- rmapbt free blocks, as in xfs_rmapbt_free_block();
- refilled from freespace btrees, as in xfs_alloc_fix_freelist().
Originally, allocbt free blocks were marked stale only when they were
put back in the general free space pool. As Dave mentioned on IRC, "we
don't stale AGF metadata btree blocks when they are returned to the
AGFL .. but once they get put back in the general free space pool, we
have to make sure the buffers are marked stale as the next user of
those blocks might be user data...."
However, after commit ca250b1b3d71 ("xfs: invalidate allocbt blocks
moved to the free list") and commit edfd9dd54921 ("xfs: move buffer
invalidation to xfs_btree_free_block"), even allocbt / bmapbt free
blocks are invalidated immediately, since they may fail V5 format
validation on writeback even though writeback to free space would be
safe.
IOWs, IMHO there is currently no real difference between free blocks
in the AGFL and free blocks in the general free space pool. So let's
avoid the redundant AGFL buffer invalidation, since otherwise we face
an unnecessary xfs_log_force() when xfs_trans_binval() is called
again on buffers that were already marked stale, as shown below:
[ 333.507469] Call Trace:
[ 333.507862] xfs_buf_find+0x371/0x6a0 <- xfs_buf_lock
[ 333.508451] xfs_buf_get_map+0x3f/0x230
[ 333.509062] xfs_trans_get_buf_map+0x11a/0x280
[ 333.509751] xfs_free_agfl_block+0xa1/0xd0
[ 333.510403] xfs_agfl_free_finish_item+0x16e/0x1d0
[ 333.511157] xfs_defer_finish_noroll+0x1ef/0x5c0
[ 333.511871] xfs_defer_finish+0xc/0xa0
[ 333.512471] xfs_itruncate_extents_flags+0x18a/0x5e0
[ 333.513253] xfs_inactive_truncate+0xb8/0x130
[ 333.513930] xfs_inactive+0x223/0x270
xfs_log_force() can take tens of milliseconds with the AGF buffer
locked. This becomes an unnecessarily long latency, especially on our
PMEM devices with FSDAX enabled, because fsops like
xfs_reflink_find_shared() running at the same time are stuck on the
same AGF lock. Removing the double invalidation of the AGFL blocks does
not make this issue go away, but this patch fixes it for our workloads
in practice, and code analysis also indicates that it should work.
Note that I'm not sure whether I need to remove another redundant
invalidation in xfs_alloc_ag_vextent_small(), since it's unrelated to
our workloads.
fstests also pass with this patch.
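To illustrate the double invalidation described above, here is a minimal
user-space sketch. It is a toy model under stated assumptions, not kernel
code: the toy_* names, the struct layout and the counters are all
hypothetical. The one behaviour it borrows from the trace is that looking
up a buffer which is already stale (and, as assumed here, still pinned in
the log) ends up forcing the log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names are real XFS interfaces. */
struct toy_buf {
	bool stale;	/* already invalidated in this transaction */
	bool pinned;	/* assumed: still pinned until the log is forced */
};

struct toy_stats {
	unsigned int invalidations;	/* calls that mark a buffer stale */
	unsigned int log_forces;	/* expensive waits on the log */
};

/* Model of a buffer lookup: a stale, pinned buffer costs a log force. */
static void toy_get_buf(struct toy_buf *bp, struct toy_stats *st)
{
	assert(bp && st);
	if (bp->stale && bp->pinned)
		st->log_forces++;
}

/* Model of invalidation: mark the buffer stale and pin it. */
static void toy_binval(struct toy_buf *bp, struct toy_stats *st)
{
	assert(bp && st);
	bp->stale = true;
	bp->pinned = true;
	st->invalidations++;
}

/* Old flow: freeing an AGFL block looks the buffer up and stales it again. */
static void toy_free_agfl_block_old(struct toy_buf *bp, struct toy_stats *st)
{
	toy_get_buf(bp, st);
	toy_binval(bp, st);
}

/* New flow: the block was staled when the btree freed it, so do nothing. */
static void toy_free_agfl_block_new(struct toy_buf *bp, struct toy_stats *st)
{
	(void)bp;
	(void)st;
}
```

In this model, calling toy_binval() once (the btree free) followed by the
old AGFL free path counts one log force and two invalidations, while the
new path counts no log force and a single invalidation.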
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
libxfs/defer_item.c | 4 ++--
libxfs/xfs_alloc.c | 28 +---------------------------
libxfs/xfs_alloc.h | 6 ++++--
3 files changed, 7 insertions(+), 31 deletions(-)
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index 8cdf57eac..77a368e6f 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -189,8 +189,8 @@ xfs_agfl_free_finish_item(
error = xfs_alloc_read_agf(xefi->xefi_pag, tp, 0, &agbp);
if (!error)
- error = xfs_free_agfl_block(tp, xefi->xefi_pag->pag_agno,
- agbno, agbp, &oinfo);
+ error = xfs_free_ag_extent(tp, agbp, xefi->xefi_pag->pag_agno,
+ agbno, 1, &oinfo, XFS_AG_RESV_AGFL);
xfs_extent_free_put_group(xefi);
kmem_cache_free(xfs_extfree_item_cache, xefi);
diff --git a/libxfs/xfs_alloc.c b/libxfs/xfs_alloc.c
index 45feff034..ab547d80c 100644
--- a/libxfs/xfs_alloc.c
+++ b/libxfs/xfs_alloc.c
@@ -1928,7 +1928,7 @@ xfs_alloc_ag_vextent_size(
/*
* Free the extent starting at agno/bno for length.
*/
-STATIC int
+int
xfs_free_ag_extent(
struct xfs_trans *tp,
struct xfs_buf *agbp,
@@ -2418,32 +2418,6 @@ xfs_alloc_space_available(
return true;
}
-int
-xfs_free_agfl_block(
- struct xfs_trans *tp,
- xfs_agnumber_t agno,
- xfs_agblock_t agbno,
- struct xfs_buf *agbp,
- struct xfs_owner_info *oinfo)
-{
- int error;
- struct xfs_buf *bp;
-
- error = xfs_free_ag_extent(tp, agbp, agno, agbno, 1, oinfo,
- XFS_AG_RESV_AGFL);
- if (error)
- return error;
-
- error = xfs_trans_get_buf(tp, tp->t_mountp->m_ddev_targp,
- XFS_AGB_TO_DADDR(tp->t_mountp, agno, agbno),
- tp->t_mountp->m_bsize, 0, &bp);
- if (error)
- return error;
- xfs_trans_binval(tp, bp);
-
- return 0;
-}
-
/*
* Check the agfl fields of the agf for inconsistency or corruption.
*
diff --git a/libxfs/xfs_alloc.h b/libxfs/xfs_alloc.h
index 0b956f8b9..3dc8e44fe 100644
--- a/libxfs/xfs_alloc.h
+++ b/libxfs/xfs_alloc.h
@@ -80,6 +80,10 @@ int xfs_alloc_get_freelist(struct xfs_perag *pag, struct xfs_trans *tp,
int xfs_alloc_put_freelist(struct xfs_perag *pag, struct xfs_trans *tp,
struct xfs_buf *agfbp, struct xfs_buf *agflbp,
xfs_agblock_t bno, int btreeblk);
+int xfs_free_ag_extent(struct xfs_trans *tp, struct xfs_buf *agbp,
+ xfs_agnumber_t agno, xfs_agblock_t bno,
+ xfs_extlen_t len, const struct xfs_owner_info *oinfo,
+ enum xfs_ag_resv_type type);
/*
* Compute and fill in value of m_alloc_maxlevels.
@@ -194,8 +198,6 @@ int xfs_alloc_read_agf(struct xfs_perag *pag, struct xfs_trans *tp, int flags,
struct xfs_buf **agfbpp);
int xfs_alloc_read_agfl(struct xfs_perag *pag, struct xfs_trans *tp,
struct xfs_buf **bpp);
-int xfs_free_agfl_block(struct xfs_trans *, xfs_agnumber_t, xfs_agblock_t,
- struct xfs_buf *, struct xfs_owner_info *);
int xfs_alloc_fix_freelist(struct xfs_alloc_arg *args, uint32_t alloc_flags);
int xfs_free_extent_fix_freelist(struct xfs_trans *tp, struct xfs_perag *pag,
struct xfs_buf **agbp);
* [PATCH 02/64] xfs: don't walk off the end of a directory data block
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
2024-10-02 1:08 ` [PATCH 01/64] xfs: avoid redundant AGFL buffer invalidation Darrick J. Wong
@ 2024-10-02 1:08 ` Darrick J. Wong
2024-10-02 1:08 ` [PATCH 03/64] xfs: Remove header files which are included more than once Darrick J. Wong
` (61 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:08 UTC
To: aalbersh, djwong, cem; +Cc: lei lu, Chandan Babu R, linux-xfs
From: lei lu <llfamsec@gmail.com>
Source kernel commit: 0c7fcdb6d06cdf8b19b57c17605215b06afa864a
This adds sanity checks for xfs_dir2_data_unused and xfs_dir2_data_entry
to make sure we don't stray beyond the valid memory region. Before
patching, the loop simply checks that the start offset of the dup and
dep is within the range. So in a crafted image, if the last entry is
xfs_dir2_data_unused, we can change dup->length to dup->length-1 and
leave 1 byte of space. In the next traversal, this space will be
considered as dup or dep. We may encounter an out-of-bounds read when
accessing the fixed members.
In the patch, we make sure that the remaining bytes are large enough to
hold an unused entry before accessing xfs_dir2_data_unused, and that
xfs_dir2_data_unused is XFS_DIR2_DATA_ALIGN byte aligned. We also make
sure that the remaining bytes are large enough to hold a dirent with a
single-byte name before accessing xfs_dir2_data_entry.
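The checks described above can be sketched in user space as follows. This
is a simplified illustration, not the libxfs code: the record layout
(2-byte tag, then 2-byte big-endian length), the toy_* names and the
TOY_ALIGN / TOY_MIN_REC constants are assumptions made for the example.
What it shares with the patch is the order of the checks: make sure the
smallest possible record fits before reading its header, require the
length to be aligned, and require the whole record to end inside the
block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ALIGN	8u	/* stand-in for XFS_DIR2_DATA_ALIGN */
#define TOY_MIN_REC	8u	/* assumed size of the smallest record */

/* Round len up to the next multiple of align. */
static unsigned int toy_round_up(unsigned int len, unsigned int align)
{
	return (len + align - 1) / align * align;
}

/* Read a big-endian 16-bit value. */
static unsigned int toy_get_be16(const uint8_t *p)
{
	return ((unsigned int)p[0] << 8) | p[1];
}

/*
 * Walk the records in buf[0..end). Returns 0 if every record fits,
 * -1 at the first inconsistency.
 */
static int toy_check_block(const uint8_t *buf, size_t end)
{
	size_t offset = 0;

	assert(buf != NULL);
	while (offset < end) {
		unsigned int len, reclen;

		/* Is there room for the smallest record before we read it? */
		if (end < TOY_MIN_REC || offset > end - TOY_MIN_REC)
			return -1;

		len = toy_get_be16(buf + offset + 2);
		reclen = toy_round_up(len, TOY_ALIGN);

		/* A zero or unaligned length cannot be a valid record. */
		if (len == 0 || len != reclen)
			return -1;
		/* The whole record must end inside the block. */
		if (offset + reclen > end)
			return -1;
		offset += reclen;
	}
	return 0;
}
```

With this walker, a 12-byte block holding one 8-byte record is rejected
because the 4 trailing bytes cannot hold a record header, which is the
same class of problem as the crafted dup->length-1 image.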
Signed-off-by: lei lu <llfamsec@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
libxfs/xfs_dir2_data.c | 31 ++++++++++++++++++++++++++-----
libxfs/xfs_dir2_priv.h | 7 +++++++
2 files changed, 33 insertions(+), 5 deletions(-)
diff --git a/libxfs/xfs_dir2_data.c b/libxfs/xfs_dir2_data.c
index 0c77245ee..65e6ed879 100644
--- a/libxfs/xfs_dir2_data.c
+++ b/libxfs/xfs_dir2_data.c
@@ -175,6 +175,14 @@ __xfs_dir3_data_check(
while (offset < end) {
struct xfs_dir2_data_unused *dup = bp->b_addr + offset;
struct xfs_dir2_data_entry *dep = bp->b_addr + offset;
+ unsigned int reclen;
+
+ /*
+ * Are the remaining bytes large enough to hold an
+ * unused entry?
+ */
+ if (offset > end - xfs_dir2_data_unusedsize(1))
+ return __this_address;
/*
* If it's unused, look for the space in the bestfree table.
@@ -184,9 +192,13 @@ __xfs_dir3_data_check(
if (be16_to_cpu(dup->freetag) == XFS_DIR2_DATA_FREE_TAG) {
xfs_failaddr_t fa;
+ reclen = xfs_dir2_data_unusedsize(
+ be16_to_cpu(dup->length));
if (lastfree != 0)
return __this_address;
- if (offset + be16_to_cpu(dup->length) > end)
+ if (be16_to_cpu(dup->length) != reclen)
+ return __this_address;
+ if (offset + reclen > end)
return __this_address;
if (be16_to_cpu(*xfs_dir2_data_unused_tag_p(dup)) !=
offset)
@@ -204,10 +216,18 @@ __xfs_dir3_data_check(
be16_to_cpu(bf[2].length))
return __this_address;
}
- offset += be16_to_cpu(dup->length);
+ offset += reclen;
lastfree = 1;
continue;
}
+
+ /*
+ * This is not an unused entry. Are the remaining bytes
+ * large enough for a dirent with a single-byte name?
+ */
+ if (offset > end - xfs_dir2_data_entsize(mp, 1))
+ return __this_address;
+
/*
* It's a real entry. Validate the fields.
* If this is a block directory then make sure it's
@@ -216,10 +236,11 @@ __xfs_dir3_data_check(
*/
if (dep->namelen == 0)
return __this_address;
+ reclen = xfs_dir2_data_entsize(mp, dep->namelen);
+ if (offset + reclen > end)
+ return __this_address;
if (!xfs_verify_dir_ino(mp, be64_to_cpu(dep->inumber)))
return __this_address;
- if (offset + xfs_dir2_data_entsize(mp, dep->namelen) > end)
- return __this_address;
if (be16_to_cpu(*xfs_dir2_data_entry_tag_p(mp, dep)) != offset)
return __this_address;
if (xfs_dir2_data_get_ftype(mp, dep) >= XFS_DIR3_FT_MAX)
@@ -242,7 +263,7 @@ __xfs_dir3_data_check(
if (i >= be32_to_cpu(btp->count))
return __this_address;
}
- offset += xfs_dir2_data_entsize(mp, dep->namelen);
+ offset += reclen;
}
/*
* Need to have seen all the entries and all the bestfree slots.
diff --git a/libxfs/xfs_dir2_priv.h b/libxfs/xfs_dir2_priv.h
index 3befb3250..100413502 100644
--- a/libxfs/xfs_dir2_priv.h
+++ b/libxfs/xfs_dir2_priv.h
@@ -189,6 +189,13 @@ void xfs_dir2_sf_put_ftype(struct xfs_mount *mp,
extern int xfs_readdir(struct xfs_trans *tp, struct xfs_inode *dp,
struct dir_context *ctx, size_t bufsize);
+static inline unsigned int
+xfs_dir2_data_unusedsize(
+ unsigned int len)
+{
+ return round_up(len, XFS_DIR2_DATA_ALIGN);
+}
+
static inline unsigned int
xfs_dir2_data_entsize(
struct xfs_mount *mp,
* [PATCH 03/64] xfs: Remove header files which are included more than once
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
2024-10-02 1:08 ` [PATCH 01/64] xfs: avoid redundant AGFL buffer invalidation Darrick J. Wong
2024-10-02 1:08 ` [PATCH 02/64] xfs: don't walk off the end of a directory data block Darrick J. Wong
@ 2024-10-02 1:08 ` Darrick J. Wong
2024-10-02 1:08 ` [PATCH 04/64] xfs: hoist extent size helpers to libxfs Darrick J. Wong
` (60 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:08 UTC
To: aalbersh, djwong, cem; +Cc: Wenchao Hao, Chandan Babu R, linux-xfs
From: Wenchao Hao <haowenchao22@gmail.com>
Source kernel commit: a330cae8a7147890262b06e1aa13db048e3b130f
The following warnings are reported, so remove the duplicated header
includes:
./fs/xfs/libxfs/xfs_trans_resv.c: xfs_da_format.h is included more than once.
./fs/xfs/scrub/quota_repair.c: xfs_format.h is included more than once.
./fs/xfs/xfs_handle.c: xfs_da_btree.h is included more than once.
./fs/xfs/xfs_qm_bhv.c: xfs_mount.h is included more than once.
./fs/xfs/xfs_trace.c: xfs_bmap.h is included more than once.
This is just a code cleanup; no logic is changed.
Signed-off-by: Wenchao Hao <haowenchao22@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
libxfs/xfs_trans_resv.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/libxfs/xfs_trans_resv.c b/libxfs/xfs_trans_resv.c
index dc405a943..a2cb4d63e 100644
--- a/libxfs/xfs_trans_resv.c
+++ b/libxfs/xfs_trans_resv.c
@@ -19,7 +19,6 @@
#include "xfs_trans_space.h"
#include "xfs_quota_defs.h"
#include "xfs_rtbitmap.h"
-#include "xfs_da_format.h"
#define _ALLOC true
#define _FREE false
* [PATCH 04/64] xfs: hoist extent size helpers to libxfs
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (2 preceding siblings ...)
2024-10-02 1:08 ` [PATCH 03/64] xfs: Remove header files which are included more than once Darrick J. Wong
@ 2024-10-02 1:08 ` Darrick J. Wong
2024-10-02 1:09 ` [PATCH 05/64] xfs: hoist inode flag conversion functions " Darrick J. Wong
` (59 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:08 UTC
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: acdddbe168040372a8b6b9b5876b92b715322910
Move the extent size helpers to xfs_bmap.c in libxfs since they're used
there already.
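For readers unfamiliar with the helpers being moved, the selection logic
can be sketched in isolation as below. This is a hedged user-space
approximation: struct toy_inode and its boolean fields are hypothetical
stand-ins for the inode flags and superblock fields, and the always-COW
check is left out because it is stubbed to false in userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_DEFAULT_COWEXTSZ_HINT 32u	/* mirrors XFS_DEFAULT_COWEXTSZ_HINT */

/* Hypothetical flattened view of the inode fields the helpers consult. */
struct toy_inode {
	bool		has_extsize_flag;	/* XFS_DIFLAG_EXTSIZE set */
	bool		has_cowextsize_flag;	/* XFS_DIFLAG2_COWEXTSIZE set */
	bool		is_realtime;		/* realtime inode */
	uint32_t	extsize;		/* per-inode extent size hint */
	uint32_t	cowextsize;		/* per-inode CoW extent size hint */
	uint32_t	rextsize;		/* realtime extent size of the fs */
};

/* Extent size hint: explicit hint first, then the realtime extent size. */
static uint32_t toy_get_extsz_hint(const struct toy_inode *ip)
{
	assert(ip);
	if (ip->has_extsize_flag && ip->extsize)
		return ip->extsize;
	if (ip->is_realtime && ip->rextsize > 1)
		return ip->rextsize;
	return 0;
}

/* CoW hint: the larger of the two hints, or the default if both are 0. */
static uint32_t toy_get_cowextsz_hint(const struct toy_inode *ip)
{
	uint32_t a = ip->has_cowextsize_flag ? ip->cowextsize : 0;
	uint32_t b = toy_get_extsz_hint(ip);

	if (b > a)
		a = b;
	return a ? a : TOY_DEFAULT_COWEXTSZ_HINT;
}
```

The point of the sketch is that the CoW hint never returns zero: an inode
with no hints at all still gets the default of 32 blocks.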
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
include/xfs_inode.h | 7 +++++++
libxfs/libxfs_priv.h | 2 --
libxfs/xfs_bmap.c | 42 ++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_bmap.h | 3 +++
4 files changed, 52 insertions(+), 2 deletions(-)
diff --git a/include/xfs_inode.h b/include/xfs_inode.h
index 9bbf37225..ec4eada81 100644
--- a/include/xfs_inode.h
+++ b/include/xfs_inode.h
@@ -345,6 +345,11 @@ static inline bool xfs_inode_has_bigrtalloc(struct xfs_inode *ip)
return XFS_IS_REALTIME_INODE(ip) && ip->i_mount->m_sb.sb_rextsize > 1;
}
+static inline bool xfs_is_always_cow_inode(struct xfs_inode *ip)
+{
+ return false;
+}
+
/* Always set the child's GID to this value, even if the parent is setgid. */
#define CRED_FORCE_GID (1U << 0)
struct cred {
@@ -370,4 +375,6 @@ extern int libxfs_iget(struct xfs_mount *, struct xfs_trans *, xfs_ino_t,
uint, struct xfs_inode **);
extern void libxfs_irele(struct xfs_inode *ip);
+#define XFS_DEFAULT_COWEXTSZ_HINT 32
+
#endif /* __XFS_INODE_H__ */
diff --git a/libxfs/libxfs_priv.h b/libxfs/libxfs_priv.h
index 5d1aa23c7..0bf0c54ac 100644
--- a/libxfs/libxfs_priv.h
+++ b/libxfs/libxfs_priv.h
@@ -468,8 +468,6 @@ xfs_buf_readahead(
#define xfs_rotorstep 1
#define xfs_bmap_rtalloc(a) (-ENOSYS)
-#define xfs_get_extsz_hint(ip) (0)
-#define xfs_get_cowextsz_hint(ip) (0)
#define xfs_inode_is_filestream(ip) (0)
#define xfs_filestream_lookup_ag(ip) (0)
#define xfs_filestream_new_ag(ip,ag) (0)
diff --git a/libxfs/xfs_bmap.c b/libxfs/xfs_bmap.c
index e60d11470..befbe0b07 100644
--- a/libxfs/xfs_bmap.c
+++ b/libxfs/xfs_bmap.c
@@ -6448,3 +6448,45 @@ xfs_bmap_query_all(
return xfs_btree_query_all(cur, xfs_bmap_query_range_helper, &query);
}
+
+/* Helper function to extract extent size hint from inode */
+xfs_extlen_t
+xfs_get_extsz_hint(
+ struct xfs_inode *ip)
+{
+ /*
+ * No point in aligning allocations if we need to COW to actually
+ * write to them.
+ */
+ if (xfs_is_always_cow_inode(ip))
+ return 0;
+ if ((ip->i_diflags & XFS_DIFLAG_EXTSIZE) && ip->i_extsize)
+ return ip->i_extsize;
+ if (XFS_IS_REALTIME_INODE(ip) &&
+ ip->i_mount->m_sb.sb_rextsize > 1)
+ return ip->i_mount->m_sb.sb_rextsize;
+ return 0;
+}
+
+/*
+ * Helper function to extract CoW extent size hint from inode.
+ * Between the extent size hint and the CoW extent size hint, we
+ * return the greater of the two. If the value is zero (automatic),
+ * use the default size.
+ */
+xfs_extlen_t
+xfs_get_cowextsz_hint(
+ struct xfs_inode *ip)
+{
+ xfs_extlen_t a, b;
+
+ a = 0;
+ if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE)
+ a = ip->i_cowextsize;
+ b = xfs_get_extsz_hint(ip);
+
+ a = max(a, b);
+ if (a == 0)
+ return XFS_DEFAULT_COWEXTSZ_HINT;
+ return a;
+}
diff --git a/libxfs/xfs_bmap.h b/libxfs/xfs_bmap.h
index 667b0c2b3..7592d46e9 100644
--- a/libxfs/xfs_bmap.h
+++ b/libxfs/xfs_bmap.h
@@ -296,4 +296,7 @@ typedef int (*xfs_bmap_query_range_fn)(
int xfs_bmap_query_all(struct xfs_btree_cur *cur, xfs_bmap_query_range_fn fn,
void *priv);
+xfs_extlen_t xfs_get_extsz_hint(struct xfs_inode *ip);
+xfs_extlen_t xfs_get_cowextsz_hint(struct xfs_inode *ip);
+
#endif /* __XFS_BMAP_H__ */
* [PATCH 05/64] xfs: hoist inode flag conversion functions to libxfs
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (3 preceding siblings ...)
2024-10-02 1:08 ` [PATCH 04/64] xfs: hoist extent size helpers to libxfs Darrick J. Wong
@ 2024-10-02 1:09 ` Darrick J. Wong
2024-10-02 1:09 ` [PATCH 06/64] xfs: hoist project id get/set " Darrick J. Wong
` (58 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:09 UTC
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: b7c477be396948ce88ea591b91070fa68ac12437
Hoist the inode flag conversion functions into libxfs so that we can
keep them in sync. Do this by creating a new xfs_inode_util.c file in
libxfs.
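As a rough sketch of what the conversion functions do, the reduced example
below maps a few API flags onto on-disk flags. The TOY_* bit values, the
toy_ftype enum and the function name are invented for illustration and do
not match the real FS_XFLAG_* / XFS_DIFLAG_* definitions. The shape
follows xfs_flags2diflags(): PREALLOC is preserved rather than set, and
some flags only apply to directories or to regular files.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical API-side flag bits. */
#define TOY_XFLAG_IMMUTABLE	(1u << 0)
#define TOY_XFLAG_EXTSIZE	(1u << 1)	/* regular files only */
#define TOY_XFLAG_PROJINHERIT	(1u << 2)	/* directories only */

/* Hypothetical on-disk flag bits. */
#define TOY_DIFLAG_PREALLOC	(1u << 0)
#define TOY_DIFLAG_IMMUTABLE	(1u << 1)
#define TOY_DIFLAG_EXTSIZE	(1u << 2)
#define TOY_DIFLAG_PROJINHERIT	(1u << 3)

enum toy_ftype { TOY_REG, TOY_DIR, TOY_OTHER };

/*
 * Build new on-disk flags from the requested API flags. PREALLOC cannot
 * be set this way, so it is carried over from the current flags.
 */
static uint16_t toy_flags2diflags(uint16_t cur, enum toy_ftype type,
				  unsigned int xflags)
{
	uint16_t di_flags = cur & TOY_DIFLAG_PREALLOC;

	if (xflags & TOY_XFLAG_IMMUTABLE)
		di_flags |= TOY_DIFLAG_IMMUTABLE;

	if (type == TOY_DIR) {
		if (xflags & TOY_XFLAG_PROJINHERIT)
			di_flags |= TOY_DIFLAG_PROJINHERIT;
	} else if (type == TOY_REG) {
		if (xflags & TOY_XFLAG_EXTSIZE)
			di_flags |= TOY_DIFLAG_EXTSIZE;
	}

	return di_flags;
}
```

Note that flags not requested in xflags are dropped, apart from PREALLOC,
so the result describes the complete new flag set rather than a delta.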
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
include/libxfs.h | 1
include/xfs_inode.h | 1
libxfs/Makefile | 2 +
libxfs/util.c | 60 -----------------------
libxfs/xfs_bmap.c | 1
libxfs/xfs_inode_util.c | 124 +++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_inode_util.h | 14 +++++
7 files changed, 143 insertions(+), 60 deletions(-)
create mode 100644 libxfs/xfs_inode_util.c
create mode 100644 libxfs/xfs_inode_util.h
diff --git a/include/libxfs.h b/include/libxfs.h
index 31d081191..17cf619f0 100644
--- a/include/libxfs.h
+++ b/include/libxfs.h
@@ -74,6 +74,7 @@ struct iomap;
#include "xfs_attr_sf.h"
#include "xfs_inode_fork.h"
#include "xfs_inode_buf.h"
+#include "xfs_inode_util.h"
#include "xfs_alloc.h"
#include "xfs_btree.h"
#include "xfs_bmap.h"
diff --git a/include/xfs_inode.h b/include/xfs_inode.h
index ec4eada81..17d3da6ae 100644
--- a/include/xfs_inode.h
+++ b/include/xfs_inode.h
@@ -10,6 +10,7 @@
/* These match kernel side includes */
#include "xfs_inode_buf.h"
#include "xfs_inode_fork.h"
+#include "xfs_inode_util.h"
struct xfs_trans;
struct xfs_mount;
diff --git a/libxfs/Makefile b/libxfs/Makefile
index 833c65092..cc3312b57 100644
--- a/libxfs/Makefile
+++ b/libxfs/Makefile
@@ -52,6 +52,7 @@ HFILES = \
xfs_ialloc_btree.h \
xfs_inode_buf.h \
xfs_inode_fork.h \
+ xfs_inode_util.h \
xfs_parent.h \
xfs_quota_defs.h \
xfs_refcount.h \
@@ -105,6 +106,7 @@ CFILES = buf_mem.c \
xfs_iext_tree.c \
xfs_inode_buf.c \
xfs_inode_fork.c \
+ xfs_inode_util.c \
xfs_ialloc_btree.c \
xfs_log_rlimit.c \
xfs_parent.c \
diff --git a/libxfs/util.c b/libxfs/util.c
index 373749457..4e96ba5ce 100644
--- a/libxfs/util.c
+++ b/libxfs/util.c
@@ -150,66 +150,6 @@ current_time(struct inode *inode)
return tv;
}
-STATIC uint16_t
-xfs_flags2diflags(
- struct xfs_inode *ip,
- unsigned int xflags)
-{
- /* can't set PREALLOC this way, just preserve it */
- uint16_t di_flags =
- (ip->i_diflags & XFS_DIFLAG_PREALLOC);
-
- if (xflags & FS_XFLAG_IMMUTABLE)
- di_flags |= XFS_DIFLAG_IMMUTABLE;
- if (xflags & FS_XFLAG_APPEND)
- di_flags |= XFS_DIFLAG_APPEND;
- if (xflags & FS_XFLAG_SYNC)
- di_flags |= XFS_DIFLAG_SYNC;
- if (xflags & FS_XFLAG_NOATIME)
- di_flags |= XFS_DIFLAG_NOATIME;
- if (xflags & FS_XFLAG_NODUMP)
- di_flags |= XFS_DIFLAG_NODUMP;
- if (xflags & FS_XFLAG_NODEFRAG)
- di_flags |= XFS_DIFLAG_NODEFRAG;
- if (xflags & FS_XFLAG_FILESTREAM)
- di_flags |= XFS_DIFLAG_FILESTREAM;
- if (S_ISDIR(VFS_I(ip)->i_mode)) {
- if (xflags & FS_XFLAG_RTINHERIT)
- di_flags |= XFS_DIFLAG_RTINHERIT;
- if (xflags & FS_XFLAG_NOSYMLINKS)
- di_flags |= XFS_DIFLAG_NOSYMLINKS;
- if (xflags & FS_XFLAG_EXTSZINHERIT)
- di_flags |= XFS_DIFLAG_EXTSZINHERIT;
- if (xflags & FS_XFLAG_PROJINHERIT)
- di_flags |= XFS_DIFLAG_PROJINHERIT;
- } else if (S_ISREG(VFS_I(ip)->i_mode)) {
- if (xflags & FS_XFLAG_REALTIME)
- di_flags |= XFS_DIFLAG_REALTIME;
- if (xflags & FS_XFLAG_EXTSIZE)
- di_flags |= XFS_DIFLAG_EXTSIZE;
- }
-
- return di_flags;
-}
-
-STATIC uint64_t
-xfs_flags2diflags2(
- struct xfs_inode *ip,
- unsigned int xflags)
-{
- uint64_t di_flags2 =
- (ip->i_diflags2 & (XFS_DIFLAG2_REFLINK |
- XFS_DIFLAG2_BIGTIME |
- XFS_DIFLAG2_NREXT64));
-
- if (xflags & FS_XFLAG_DAX)
- di_flags2 |= XFS_DIFLAG2_DAX;
- if (xflags & FS_XFLAG_COWEXTSIZE)
- di_flags2 |= XFS_DIFLAG2_COWEXTSIZE;
-
- return di_flags2;
-}
-
/* Propagate di_flags from a parent inode to a child inode. */
static void
xfs_inode_propagate_flags(
diff --git a/libxfs/xfs_bmap.c b/libxfs/xfs_bmap.c
index befbe0b07..5f4446104 100644
--- a/libxfs/xfs_bmap.c
+++ b/libxfs/xfs_bmap.c
@@ -33,6 +33,7 @@
#include "xfs_health.h"
#include "defer_item.h"
#include "xfs_symlink_remote.h"
+#include "xfs_inode_util.h"
struct kmem_cache *xfs_bmap_intent_cache;
diff --git a/libxfs/xfs_inode_util.c b/libxfs/xfs_inode_util.c
new file mode 100644
index 000000000..868a77caf
--- /dev/null
+++ b/libxfs/xfs_inode_util.c
@@ -0,0 +1,124 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2000-2006 Silicon Graphics, Inc.
+ * All Rights Reserved.
+ */
+#include "libxfs_priv.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_sb.h"
+#include "xfs_mount.h"
+#include "xfs_inode.h"
+#include "xfs_inode_util.h"
+
+uint16_t
+xfs_flags2diflags(
+ struct xfs_inode *ip,
+ unsigned int xflags)
+{
+ /* can't set PREALLOC this way, just preserve it */
+ uint16_t di_flags =
+ (ip->i_diflags & XFS_DIFLAG_PREALLOC);
+
+ if (xflags & FS_XFLAG_IMMUTABLE)
+ di_flags |= XFS_DIFLAG_IMMUTABLE;
+ if (xflags & FS_XFLAG_APPEND)
+ di_flags |= XFS_DIFLAG_APPEND;
+ if (xflags & FS_XFLAG_SYNC)
+ di_flags |= XFS_DIFLAG_SYNC;
+ if (xflags & FS_XFLAG_NOATIME)
+ di_flags |= XFS_DIFLAG_NOATIME;
+ if (xflags & FS_XFLAG_NODUMP)
+ di_flags |= XFS_DIFLAG_NODUMP;
+ if (xflags & FS_XFLAG_NODEFRAG)
+ di_flags |= XFS_DIFLAG_NODEFRAG;
+ if (xflags & FS_XFLAG_FILESTREAM)
+ di_flags |= XFS_DIFLAG_FILESTREAM;
+ if (S_ISDIR(VFS_I(ip)->i_mode)) {
+ if (xflags & FS_XFLAG_RTINHERIT)
+ di_flags |= XFS_DIFLAG_RTINHERIT;
+ if (xflags & FS_XFLAG_NOSYMLINKS)
+ di_flags |= XFS_DIFLAG_NOSYMLINKS;
+ if (xflags & FS_XFLAG_EXTSZINHERIT)
+ di_flags |= XFS_DIFLAG_EXTSZINHERIT;
+ if (xflags & FS_XFLAG_PROJINHERIT)
+ di_flags |= XFS_DIFLAG_PROJINHERIT;
+ } else if (S_ISREG(VFS_I(ip)->i_mode)) {
+ if (xflags & FS_XFLAG_REALTIME)
+ di_flags |= XFS_DIFLAG_REALTIME;
+ if (xflags & FS_XFLAG_EXTSIZE)
+ di_flags |= XFS_DIFLAG_EXTSIZE;
+ }
+
+ return di_flags;
+}
+
+uint64_t
+xfs_flags2diflags2(
+ struct xfs_inode *ip,
+ unsigned int xflags)
+{
+ uint64_t di_flags2 =
+ (ip->i_diflags2 & (XFS_DIFLAG2_REFLINK |
+ XFS_DIFLAG2_BIGTIME |
+ XFS_DIFLAG2_NREXT64));
+
+ if (xflags & FS_XFLAG_DAX)
+ di_flags2 |= XFS_DIFLAG2_DAX;
+ if (xflags & FS_XFLAG_COWEXTSIZE)
+ di_flags2 |= XFS_DIFLAG2_COWEXTSIZE;
+
+ return di_flags2;
+}
+
+uint32_t
+xfs_ip2xflags(
+ struct xfs_inode *ip)
+{
+ uint32_t flags = 0;
+
+ if (ip->i_diflags & XFS_DIFLAG_ANY) {
+ if (ip->i_diflags & XFS_DIFLAG_REALTIME)
+ flags |= FS_XFLAG_REALTIME;
+ if (ip->i_diflags & XFS_DIFLAG_PREALLOC)
+ flags |= FS_XFLAG_PREALLOC;
+ if (ip->i_diflags & XFS_DIFLAG_IMMUTABLE)
+ flags |= FS_XFLAG_IMMUTABLE;
+ if (ip->i_diflags & XFS_DIFLAG_APPEND)
+ flags |= FS_XFLAG_APPEND;
+ if (ip->i_diflags & XFS_DIFLAG_SYNC)
+ flags |= FS_XFLAG_SYNC;
+ if (ip->i_diflags & XFS_DIFLAG_NOATIME)
+ flags |= FS_XFLAG_NOATIME;
+ if (ip->i_diflags & XFS_DIFLAG_NODUMP)
+ flags |= FS_XFLAG_NODUMP;
+ if (ip->i_diflags & XFS_DIFLAG_RTINHERIT)
+ flags |= FS_XFLAG_RTINHERIT;
+ if (ip->i_diflags & XFS_DIFLAG_PROJINHERIT)
+ flags |= FS_XFLAG_PROJINHERIT;
+ if (ip->i_diflags & XFS_DIFLAG_NOSYMLINKS)
+ flags |= FS_XFLAG_NOSYMLINKS;
+ if (ip->i_diflags & XFS_DIFLAG_EXTSIZE)
+ flags |= FS_XFLAG_EXTSIZE;
+ if (ip->i_diflags & XFS_DIFLAG_EXTSZINHERIT)
+ flags |= FS_XFLAG_EXTSZINHERIT;
+ if (ip->i_diflags & XFS_DIFLAG_NODEFRAG)
+ flags |= FS_XFLAG_NODEFRAG;
+ if (ip->i_diflags & XFS_DIFLAG_FILESTREAM)
+ flags |= FS_XFLAG_FILESTREAM;
+ }
+
+ if (ip->i_diflags2 & XFS_DIFLAG2_ANY) {
+ if (ip->i_diflags2 & XFS_DIFLAG2_DAX)
+ flags |= FS_XFLAG_DAX;
+ if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE)
+ flags |= FS_XFLAG_COWEXTSIZE;
+ }
+
+ if (xfs_inode_has_attr_fork(ip))
+ flags |= FS_XFLAG_HASATTR;
+ return flags;
+}
diff --git a/libxfs/xfs_inode_util.h b/libxfs/xfs_inode_util.h
new file mode 100644
index 000000000..6ad1898a0
--- /dev/null
+++ b/libxfs/xfs_inode_util.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2000-2003,2005 Silicon Graphics, Inc.
+ * All Rights Reserved.
+ */
+#ifndef __XFS_INODE_UTIL_H__
+#define __XFS_INODE_UTIL_H__
+
+uint16_t xfs_flags2diflags(struct xfs_inode *ip, unsigned int xflags);
+uint64_t xfs_flags2diflags2(struct xfs_inode *ip, unsigned int xflags);
+uint32_t xfs_dic2xflags(struct xfs_inode *ip);
+uint32_t xfs_ip2xflags(struct xfs_inode *ip);
+
+#endif /* __XFS_INODE_UTIL_H__ */
* [PATCH 06/64] xfs: hoist project id get/set functions to libxfs
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (4 preceding siblings ...)
2024-10-02 1:09 ` [PATCH 05/64] xfs: hoist inode flag conversion functions " Darrick J. Wong
@ 2024-10-02 1:09 ` Darrick J. Wong
2024-10-02 1:09 ` [PATCH 07/64] libxfs: put all the inode functions in a single file Darrick J. Wong
` (57 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:09 UTC
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: fcea5b35f36233c04003ab8b3eb081b5e20e1aa4
Move the project id get and set functions into libxfs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_inode_util.c | 10 ++++++++++
libxfs/xfs_inode_util.h | 2 ++
2 files changed, 12 insertions(+)
diff --git a/libxfs/xfs_inode_util.c b/libxfs/xfs_inode_util.c
index 868a77caf..0a9ea03e2 100644
--- a/libxfs/xfs_inode_util.c
+++ b/libxfs/xfs_inode_util.c
@@ -122,3 +122,13 @@ xfs_ip2xflags(
flags |= FS_XFLAG_HASATTR;
return flags;
}
+
+prid_t
+xfs_get_initial_prid(struct xfs_inode *dp)
+{
+ if (dp->i_diflags & XFS_DIFLAG_PROJINHERIT)
+ return dp->i_projid;
+
+ /* Assign to the root project by default. */
+ return 0;
+}
diff --git a/libxfs/xfs_inode_util.h b/libxfs/xfs_inode_util.h
index 6ad1898a0..f7e4d5a82 100644
--- a/libxfs/xfs_inode_util.h
+++ b/libxfs/xfs_inode_util.h
@@ -11,4 +11,6 @@ uint64_t xfs_flags2diflags2(struct xfs_inode *ip, unsigned int xflags);
uint32_t xfs_dic2xflags(struct xfs_inode *ip);
uint32_t xfs_ip2xflags(struct xfs_inode *ip);
+prid_t xfs_get_initial_prid(struct xfs_inode *dp);
+
#endif /* __XFS_INODE_UTIL_H__ */
* [PATCH 07/64] libxfs: put all the inode functions in a single file
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (5 preceding siblings ...)
2024-10-02 1:09 ` [PATCH 06/64] xfs: hoist project id get/set " Darrick J. Wong
@ 2024-10-02 1:09 ` Darrick J. Wong
2024-10-02 1:09 ` [PATCH 08/64] libxfs: pass IGET flags through to xfs_iread Darrick J. Wong
` (56 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:09 UTC
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Move all the inode functions into a single source code file.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/Makefile | 1
libxfs/inode.c | 383 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
libxfs/rdwr.c | 95 --------------
libxfs/util.c | 257 -------------------------------------
4 files changed, 384 insertions(+), 352 deletions(-)
create mode 100644 libxfs/inode.c
diff --git a/libxfs/Makefile b/libxfs/Makefile
index cc3312b57..8c93d7b53 100644
--- a/libxfs/Makefile
+++ b/libxfs/Makefile
@@ -70,6 +70,7 @@ CFILES = buf_mem.c \
cache.c \
defer_item.c \
init.c \
+ inode.c \
kmem.c \
listxattr.c \
logitem.c \
diff --git a/libxfs/inode.c b/libxfs/inode.c
new file mode 100644
index 000000000..fffca7761
--- /dev/null
+++ b/libxfs/inode.c
@@ -0,0 +1,383 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2000-2005 Silicon Graphics, Inc.
+ * All Rights Reserved.
+ */
+
+#include "libxfs_priv.h"
+#include "libxfs.h"
+#include "libxfs_io.h"
+#include "init.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_inode_buf.h"
+#include "xfs_inode_fork.h"
+#include "xfs_inode.h"
+#include "xfs_trans.h"
+#include "xfs_bmap.h"
+#include "xfs_bmap_btree.h"
+#include "xfs_trans_space.h"
+#include "xfs_ialloc.h"
+#include "xfs_alloc.h"
+#include "xfs_bit.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
+#include "xfs_dir2_priv.h"
+
+/* Propagate di_flags from a parent inode to a child inode. */
+static void
+xfs_inode_propagate_flags(
+ struct xfs_inode *ip,
+ const struct xfs_inode *pip)
+{
+ unsigned int di_flags = 0;
+ umode_t mode = VFS_I(ip)->i_mode;
+
+ if ((mode & S_IFMT) == S_IFDIR) {
+ if (pip->i_diflags & XFS_DIFLAG_RTINHERIT)
+ di_flags |= XFS_DIFLAG_RTINHERIT;
+ if (pip->i_diflags & XFS_DIFLAG_EXTSZINHERIT) {
+ di_flags |= XFS_DIFLAG_EXTSZINHERIT;
+ ip->i_extsize = pip->i_extsize;
+ }
+ } else {
+ if ((pip->i_diflags & XFS_DIFLAG_RTINHERIT) &&
+ xfs_has_realtime(ip->i_mount))
+ di_flags |= XFS_DIFLAG_REALTIME;
+ if (pip->i_diflags & XFS_DIFLAG_EXTSZINHERIT) {
+ di_flags |= XFS_DIFLAG_EXTSIZE;
+ ip->i_extsize = pip->i_extsize;
+ }
+ }
+ if (pip->i_diflags & XFS_DIFLAG_PROJINHERIT)
+ di_flags |= XFS_DIFLAG_PROJINHERIT;
+ ip->i_diflags |= di_flags;
+}
+
+/*
+ * Increment the link count on an inode & log the change.
+ */
+void
+libxfs_bumplink(
+ struct xfs_trans *tp,
+ struct xfs_inode *ip)
+{
+ struct inode *inode = VFS_I(ip);
+
+ xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
+
+ if (inode->i_nlink != XFS_NLINK_PINNED)
+ inc_nlink(inode);
+
+ xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+}
+
+/*
+ * Initialise a newly allocated inode and return the in-core inode to the
+ * caller locked exclusively.
+ */
+static int
+libxfs_init_new_inode(
+ struct xfs_trans *tp,
+ struct xfs_inode *pip,
+ xfs_ino_t ino,
+ umode_t mode,
+ xfs_nlink_t nlink,
+ dev_t rdev,
+ struct cred *cr,
+ struct fsxattr *fsx,
+ struct xfs_inode **ipp)
+{
+ struct xfs_mount *mp = tp->t_mountp;
+ struct xfs_inode *ip;
+ unsigned int flags;
+ int error;
+
+ error = libxfs_iget(mp, tp, ino, XFS_IGET_CREATE, &ip);
+ if (error != 0)
+ return error;
+ ASSERT(ip != NULL);
+
+ VFS_I(ip)->i_mode = mode;
+ set_nlink(VFS_I(ip), nlink);
+ i_uid_write(VFS_I(ip), cr->cr_uid);
+ i_gid_write(VFS_I(ip), cr->cr_gid);
+ ip->i_projid = pip ? 0 : fsx->fsx_projid;
+ xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG | XFS_ICHGTIME_MOD);
+
+ if (pip && (VFS_I(pip)->i_mode & S_ISGID)) {
+ if (!(cr->cr_flags & CRED_FORCE_GID))
+ VFS_I(ip)->i_gid = VFS_I(pip)->i_gid;
+ if ((VFS_I(pip)->i_mode & S_ISGID) && (mode & S_IFMT) == S_IFDIR)
+ VFS_I(ip)->i_mode |= S_ISGID;
+ }
+
+ ip->i_disk_size = 0;
+ ip->i_df.if_nextents = 0;
+ ASSERT(ip->i_nblocks == 0);
+ ip->i_extsize = pip ? 0 : fsx->fsx_extsize;
+ ip->i_diflags = pip ? 0 : xfs_flags2diflags(ip, fsx->fsx_xflags);
+
+ if (xfs_has_v3inodes(ip->i_mount)) {
+ VFS_I(ip)->i_version = 1;
+ ip->i_diflags2 = ip->i_mount->m_ino_geo.new_diflags2;
+ if (!pip)
+ ip->i_diflags2 = xfs_flags2diflags2(ip,
+ fsx->fsx_xflags);
+ ip->i_crtime = inode_get_mtime(VFS_I(ip)); /* struct copy */
+ ip->i_cowextsize = pip ? 0 : fsx->fsx_cowextsize;
+ }
+
+ flags = XFS_ILOG_CORE;
+ switch (mode & S_IFMT) {
+ case S_IFIFO:
+ case S_IFSOCK:
+ /* doesn't make sense to set an rdev for these */
+ rdev = 0;
+ /* FALLTHROUGH */
+ case S_IFCHR:
+ case S_IFBLK:
+ ip->i_df.if_format = XFS_DINODE_FMT_DEV;
+ flags |= XFS_ILOG_DEV;
+ VFS_I(ip)->i_rdev = rdev;
+ break;
+ case S_IFREG:
+ case S_IFDIR:
+ if (pip && (pip->i_diflags & XFS_DIFLAG_ANY))
+ xfs_inode_propagate_flags(ip, pip);
+ /* FALLTHROUGH */
+ case S_IFLNK:
+ ip->i_df.if_format = XFS_DINODE_FMT_EXTENTS;
+ ip->i_df.if_bytes = 0;
+ ip->i_df.if_data = NULL;
+ break;
+ default:
+ ASSERT(0);
+ }
+
+ /*
+ * If we're going to set a parent pointer on this file, we need to
+ * create an attr fork to receive that parent pointer.
+ */
+ if (pip && xfs_has_parent(mp)) {
+ ip->i_forkoff = xfs_default_attroffset(ip) >> 3;
+ xfs_ifork_init_attr(ip, XFS_DINODE_FMT_EXTENTS, 0);
+
+ if (!xfs_has_attr(mp)) {
+ spin_lock(&mp->m_sb_lock);
+ xfs_add_attr(mp);
+ spin_unlock(&mp->m_sb_lock);
+ xfs_log_sb(tp);
+ }
+ }
+
+ /*
+ * Log the new values stuffed into the inode.
+ */
+ xfs_trans_ijoin(tp, ip, 0);
+ xfs_trans_log_inode(tp, ip, flags);
+ *ipp = ip;
+ return 0;
+}
+
+/*
+ * Writes a modified inode's changes out to the inode's on disk home.
+ * Originally based on xfs_iflush_int() from xfs_inode.c in the kernel.
+ */
+int
+libxfs_iflush_int(
+ struct xfs_inode *ip,
+ struct xfs_buf *bp)
+{
+ struct xfs_inode_log_item *iip;
+ struct xfs_dinode *dip;
+ struct xfs_mount *mp;
+
+ ASSERT(ip->i_df.if_format != XFS_DINODE_FMT_BTREE ||
+ ip->i_df.if_nextents > ip->i_df.if_ext_max);
+
+ iip = ip->i_itemp;
+ mp = ip->i_mount;
+
+ /* set *dip = inode's place in the buffer */
+ dip = xfs_buf_offset(bp, ip->i_imap.im_boffset);
+
+ if (XFS_ISREG(ip)) {
+ ASSERT( (ip->i_df.if_format == XFS_DINODE_FMT_EXTENTS) ||
+ (ip->i_df.if_format == XFS_DINODE_FMT_BTREE) );
+ } else if (XFS_ISDIR(ip)) {
+ ASSERT( (ip->i_df.if_format == XFS_DINODE_FMT_EXTENTS) ||
+ (ip->i_df.if_format == XFS_DINODE_FMT_BTREE) ||
+ (ip->i_df.if_format == XFS_DINODE_FMT_LOCAL) );
+ }
+ ASSERT(ip->i_df.if_nextents + ip->i_af.if_nextents <= ip->i_nblocks);
+ ASSERT(ip->i_forkoff <= mp->m_sb.sb_inodesize);
+
+ /* bump the change count on v3 inodes */
+ if (xfs_has_v3inodes(mp))
+ VFS_I(ip)->i_version++;
+
+ /*
+ * If there are inline format data / attr forks attached to this inode,
+ * make sure they are not corrupt.
+ */
+ if (ip->i_df.if_format == XFS_DINODE_FMT_LOCAL &&
+ xfs_ifork_verify_local_data(ip))
+ return -EFSCORRUPTED;
+ if (xfs_inode_has_attr_fork(ip) &&
+ ip->i_af.if_format == XFS_DINODE_FMT_LOCAL &&
+ xfs_ifork_verify_local_attr(ip))
+ return -EFSCORRUPTED;
+
+ /*
+ * Copy the dirty parts of the inode into the on-disk
+ * inode. We always copy out the core of the inode,
+ * because if the inode is dirty at all the core must
+ * be.
+ */
+ xfs_inode_to_disk(ip, dip, iip->ili_item.li_lsn);
+
+ xfs_iflush_fork(ip, dip, iip, XFS_DATA_FORK);
+ if (xfs_inode_has_attr_fork(ip))
+ xfs_iflush_fork(ip, dip, iip, XFS_ATTR_FORK);
+
+ /* generate the checksum. */
+ xfs_dinode_calc_crc(mp, dip);
+
+ return 0;
+}
+
+/*
+ * Wrapper around call to libxfs_ialloc. Takes care of committing and
+ * allocating a new transaction as needed.
+ *
+ * Originally there were two copies of this code - one in mkfs, the
+ * other in repair - now there is just the one.
+ */
+int
+libxfs_dir_ialloc(
+ struct xfs_trans **tpp,
+ struct xfs_inode *dp,
+ mode_t mode,
+ nlink_t nlink,
+ xfs_dev_t rdev,
+ struct cred *cr,
+ struct fsxattr *fsx,
+ struct xfs_inode **ipp)
+{
+ xfs_ino_t parent_ino = dp ? dp->i_ino : 0;
+ xfs_ino_t ino;
+ int error;
+
+ /*
+ * Call the space management code to pick the on-disk inode to be
+ * allocated.
+ */
+ error = xfs_dialloc(tpp, parent_ino, mode, &ino);
+ if (error)
+ return error;
+
+ return libxfs_init_new_inode(*tpp, dp, ino, mode, nlink, rdev, cr,
+ fsx, ipp);
+}
+
+/*
+ * Inode cache stubs.
+ */
+
+struct kmem_cache *xfs_inode_cache;
+extern struct kmem_cache *xfs_ili_cache;
+
+int
+libxfs_iget(
+ struct xfs_mount *mp,
+ struct xfs_trans *tp,
+ xfs_ino_t ino,
+ uint lock_flags,
+ struct xfs_inode **ipp)
+{
+ struct xfs_inode *ip;
+ struct xfs_buf *bp;
+ struct xfs_perag *pag;
+ int error = 0;
+
+ /* reject inode numbers outside existing AGs */
+ if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount)
+ return -EINVAL;
+
+ ip = kmem_cache_zalloc(xfs_inode_cache, 0);
+ if (!ip)
+ return -ENOMEM;
+
+ VFS_I(ip)->i_count = 1;
+ ip->i_ino = ino;
+ ip->i_mount = mp;
+ ip->i_af.if_format = XFS_DINODE_FMT_EXTENTS;
+ spin_lock_init(&VFS_I(ip)->i_lock);
+
+ pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
+ error = xfs_imap(pag, tp, ip->i_ino, &ip->i_imap, 0);
+ xfs_perag_put(pag);
+
+ if (error)
+ goto out_destroy;
+
+ error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &bp);
+ if (error)
+ goto out_destroy;
+
+ error = xfs_inode_from_disk(ip,
+ xfs_buf_offset(bp, ip->i_imap.im_boffset));
+ if (!error)
+ xfs_buf_set_ref(bp, XFS_INO_REF);
+ xfs_trans_brelse(tp, bp);
+
+ if (error)
+ goto out_destroy;
+
+ *ipp = ip;
+ return 0;
+
+out_destroy:
+ kmem_cache_free(xfs_inode_cache, ip);
+ *ipp = NULL;
+ return error;
+}
+
+static void
+libxfs_idestroy(
+ struct xfs_inode *ip)
+{
+ switch (VFS_I(ip)->i_mode & S_IFMT) {
+ case S_IFREG:
+ case S_IFDIR:
+ case S_IFLNK:
+ libxfs_idestroy_fork(&ip->i_df);
+ break;
+ }
+
+ libxfs_ifork_zap_attr(ip);
+
+ if (ip->i_cowfp) {
+ libxfs_idestroy_fork(ip->i_cowfp);
+ kmem_cache_free(xfs_ifork_cache, ip->i_cowfp);
+ }
+}
+
+void
+libxfs_irele(
+ struct xfs_inode *ip)
+{
+ VFS_I(ip)->i_count--;
+
+ if (VFS_I(ip)->i_count == 0) {
+ ASSERT(ip->i_itemp == NULL);
+ libxfs_idestroy(ip);
+ kmem_cache_free(xfs_inode_cache, ip);
+ }
+}
diff --git a/libxfs/rdwr.c b/libxfs/rdwr.c
index e430416f6..7d4d93e4f 100644
--- a/libxfs/rdwr.c
+++ b/libxfs/rdwr.c
@@ -1079,101 +1079,6 @@ xfs_verify_magic16(
return dmagic == bp->b_ops->magic16[idx];
}
-/*
- * Inode cache stubs.
- */
-
-struct kmem_cache *xfs_inode_cache;
-extern struct kmem_cache *xfs_ili_cache;
-
-int
-libxfs_iget(
- struct xfs_mount *mp,
- struct xfs_trans *tp,
- xfs_ino_t ino,
- uint lock_flags,
- struct xfs_inode **ipp)
-{
- struct xfs_inode *ip;
- struct xfs_buf *bp;
- struct xfs_perag *pag;
- int error = 0;
-
- /* reject inode numbers outside existing AGs */
- if (!ino || XFS_INO_TO_AGNO(mp, ino) >= mp->m_sb.sb_agcount)
- return -EINVAL;
-
- ip = kmem_cache_zalloc(xfs_inode_cache, 0);
- if (!ip)
- return -ENOMEM;
-
- VFS_I(ip)->i_count = 1;
- ip->i_ino = ino;
- ip->i_mount = mp;
- ip->i_af.if_format = XFS_DINODE_FMT_EXTENTS;
- spin_lock_init(&VFS_I(ip)->i_lock);
-
- pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
- error = xfs_imap(pag, tp, ip->i_ino, &ip->i_imap, 0);
- xfs_perag_put(pag);
-
- if (error)
- goto out_destroy;
-
- error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &bp);
- if (error)
- goto out_destroy;
-
- error = xfs_inode_from_disk(ip,
- xfs_buf_offset(bp, ip->i_imap.im_boffset));
- if (!error)
- xfs_buf_set_ref(bp, XFS_INO_REF);
- xfs_trans_brelse(tp, bp);
-
- if (error)
- goto out_destroy;
-
- *ipp = ip;
- return 0;
-
-out_destroy:
- kmem_cache_free(xfs_inode_cache, ip);
- *ipp = NULL;
- return error;
-}
-
-static void
-libxfs_idestroy(xfs_inode_t *ip)
-{
- switch (VFS_I(ip)->i_mode & S_IFMT) {
- case S_IFREG:
- case S_IFDIR:
- case S_IFLNK:
- libxfs_idestroy_fork(&ip->i_df);
- break;
- }
-
- libxfs_ifork_zap_attr(ip);
-
- if (ip->i_cowfp) {
- libxfs_idestroy_fork(ip->i_cowfp);
- kmem_cache_free(xfs_ifork_cache, ip->i_cowfp);
- }
-}
-
-void
-libxfs_irele(
- struct xfs_inode *ip)
-{
- VFS_I(ip)->i_count--;
-
- if (VFS_I(ip)->i_count == 0) {
- ASSERT(ip->i_itemp == NULL);
- libxfs_idestroy(ip);
- kmem_cache_free(xfs_inode_cache, ip);
- }
-}
-
/*
* Flush everything dirty in the kernel and disk write caches to stable media.
* Returns 0 for success or a negative error code.
diff --git a/libxfs/util.c b/libxfs/util.c
index 4e96ba5ce..7aa92c0e4 100644
--- a/libxfs/util.c
+++ b/libxfs/util.c
@@ -150,229 +150,6 @@ current_time(struct inode *inode)
return tv;
}
-/* Propagate di_flags from a parent inode to a child inode. */
-static void
-xfs_inode_propagate_flags(
- struct xfs_inode *ip,
- const struct xfs_inode *pip)
-{
- unsigned int di_flags = 0;
- umode_t mode = VFS_I(ip)->i_mode;
-
- if ((mode & S_IFMT) == S_IFDIR) {
- if (pip->i_diflags & XFS_DIFLAG_RTINHERIT)
- di_flags |= XFS_DIFLAG_RTINHERIT;
- if (pip->i_diflags & XFS_DIFLAG_EXTSZINHERIT) {
- di_flags |= XFS_DIFLAG_EXTSZINHERIT;
- ip->i_extsize = pip->i_extsize;
- }
- } else {
- if ((pip->i_diflags & XFS_DIFLAG_RTINHERIT) &&
- xfs_has_realtime(ip->i_mount))
- di_flags |= XFS_DIFLAG_REALTIME;
- if (pip->i_diflags & XFS_DIFLAG_EXTSZINHERIT) {
- di_flags |= XFS_DIFLAG_EXTSIZE;
- ip->i_extsize = pip->i_extsize;
- }
- }
- if (pip->i_diflags & XFS_DIFLAG_PROJINHERIT)
- di_flags |= XFS_DIFLAG_PROJINHERIT;
- ip->i_diflags |= di_flags;
-}
-
-/*
- * Increment the link count on an inode & log the change.
- */
-void
-libxfs_bumplink(
- struct xfs_trans *tp,
- struct xfs_inode *ip)
-{
- struct inode *inode = VFS_I(ip);
-
- xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
-
- if (inode->i_nlink != XFS_NLINK_PINNED)
- inc_nlink(inode);
-
- xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
-}
-
-/*
- * Initialise a newly allocated inode and return the in-core inode to the
- * caller locked exclusively.
- */
-static int
-libxfs_init_new_inode(
- struct xfs_trans *tp,
- struct xfs_inode *pip,
- xfs_ino_t ino,
- umode_t mode,
- xfs_nlink_t nlink,
- dev_t rdev,
- struct cred *cr,
- struct fsxattr *fsx,
- struct xfs_inode **ipp)
-{
- struct xfs_mount *mp = tp->t_mountp;
- struct xfs_inode *ip;
- unsigned int flags;
- int error;
-
- error = libxfs_iget(mp, tp, ino, XFS_IGET_CREATE, &ip);
- if (error != 0)
- return error;
- ASSERT(ip != NULL);
-
- VFS_I(ip)->i_mode = mode;
- set_nlink(VFS_I(ip), nlink);
- i_uid_write(VFS_I(ip), cr->cr_uid);
- i_gid_write(VFS_I(ip), cr->cr_gid);
- ip->i_projid = pip ? 0 : fsx->fsx_projid;
- xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG | XFS_ICHGTIME_MOD);
-
- if (pip && (VFS_I(pip)->i_mode & S_ISGID)) {
- if (!(cr->cr_flags & CRED_FORCE_GID))
- VFS_I(ip)->i_gid = VFS_I(pip)->i_gid;
- if ((VFS_I(pip)->i_mode & S_ISGID) && (mode & S_IFMT) == S_IFDIR)
- VFS_I(ip)->i_mode |= S_ISGID;
- }
-
- ip->i_disk_size = 0;
- ip->i_df.if_nextents = 0;
- ASSERT(ip->i_nblocks == 0);
- ip->i_extsize = pip ? 0 : fsx->fsx_extsize;
- ip->i_diflags = pip ? 0 : xfs_flags2diflags(ip, fsx->fsx_xflags);
-
- if (xfs_has_v3inodes(ip->i_mount)) {
- VFS_I(ip)->i_version = 1;
- ip->i_diflags2 = ip->i_mount->m_ino_geo.new_diflags2;
- if (!pip)
- ip->i_diflags2 = xfs_flags2diflags2(ip,
- fsx->fsx_xflags);
- ip->i_crtime = inode_get_mtime(VFS_I(ip)); /* struct copy */
- ip->i_cowextsize = pip ? 0 : fsx->fsx_cowextsize;
- }
-
- flags = XFS_ILOG_CORE;
- switch (mode & S_IFMT) {
- case S_IFIFO:
- case S_IFSOCK:
- /* doesn't make sense to set an rdev for these */
- rdev = 0;
- /* FALLTHROUGH */
- case S_IFCHR:
- case S_IFBLK:
- ip->i_df.if_format = XFS_DINODE_FMT_DEV;
- flags |= XFS_ILOG_DEV;
- VFS_I(ip)->i_rdev = rdev;
- break;
- case S_IFREG:
- case S_IFDIR:
- if (pip && (pip->i_diflags & XFS_DIFLAG_ANY))
- xfs_inode_propagate_flags(ip, pip);
- /* FALLTHROUGH */
- case S_IFLNK:
- ip->i_df.if_format = XFS_DINODE_FMT_EXTENTS;
- ip->i_df.if_bytes = 0;
- ip->i_df.if_data = NULL;
- break;
- default:
- ASSERT(0);
- }
-
- /*
- * If we're going to set a parent pointer on this file, we need to
- * create an attr fork to receive that parent pointer.
- */
- if (pip && xfs_has_parent(mp)) {
- ip->i_forkoff = xfs_default_attroffset(ip) >> 3;
- xfs_ifork_init_attr(ip, XFS_DINODE_FMT_EXTENTS, 0);
-
- if (!xfs_has_attr(mp)) {
- spin_lock(&mp->m_sb_lock);
- xfs_add_attr(mp);
- spin_unlock(&mp->m_sb_lock);
- xfs_log_sb(tp);
- }
- }
-
- /*
- * Log the new values stuffed into the inode.
- */
- xfs_trans_ijoin(tp, ip, 0);
- xfs_trans_log_inode(tp, ip, flags);
- *ipp = ip;
- return 0;
-}
-
-/*
- * Writes a modified inode's changes out to the inode's on disk home.
- * Originally based on xfs_iflush_int() from xfs_inode.c in the kernel.
- */
-int
-libxfs_iflush_int(
- xfs_inode_t *ip,
- struct xfs_buf *bp)
-{
- struct xfs_inode_log_item *iip;
- struct xfs_dinode *dip;
- xfs_mount_t *mp;
-
- ASSERT(ip->i_df.if_format != XFS_DINODE_FMT_BTREE ||
- ip->i_df.if_nextents > ip->i_df.if_ext_max);
-
- iip = ip->i_itemp;
- mp = ip->i_mount;
-
- /* set *dip = inode's place in the buffer */
- dip = xfs_buf_offset(bp, ip->i_imap.im_boffset);
-
- if (XFS_ISREG(ip)) {
- ASSERT( (ip->i_df.if_format == XFS_DINODE_FMT_EXTENTS) ||
- (ip->i_df.if_format == XFS_DINODE_FMT_BTREE) );
- } else if (XFS_ISDIR(ip)) {
- ASSERT( (ip->i_df.if_format == XFS_DINODE_FMT_EXTENTS) ||
- (ip->i_df.if_format == XFS_DINODE_FMT_BTREE) ||
- (ip->i_df.if_format == XFS_DINODE_FMT_LOCAL) );
- }
- ASSERT(ip->i_df.if_nextents+ip.i_af->if_nextents <= ip->i_nblocks);
- ASSERT(ip->i_forkoff <= mp->m_sb.sb_inodesize);
-
- /* bump the change count on v3 inodes */
- if (xfs_has_v3inodes(mp))
- VFS_I(ip)->i_version++;
-
- /*
- * If there are inline format data / attr forks attached to this inode,
- * make sure they are not corrupt.
- */
- if (ip->i_df.if_format == XFS_DINODE_FMT_LOCAL &&
- xfs_ifork_verify_local_data(ip))
- return -EFSCORRUPTED;
- if (xfs_inode_has_attr_fork(ip) &&
- ip->i_af.if_format == XFS_DINODE_FMT_LOCAL &&
- xfs_ifork_verify_local_attr(ip))
- return -EFSCORRUPTED;
-
- /*
- * Copy the dirty parts of the inode into the on-disk
- * inode. We always copy out the core of the inode,
- * because if the inode is dirty at all the core must
- * be.
- */
- xfs_inode_to_disk(ip, dip, iip->ili_item.li_lsn);
-
- xfs_iflush_fork(ip, dip, iip, XFS_DATA_FORK);
- if (xfs_inode_has_attr_fork(ip))
- xfs_iflush_fork(ip, dip, iip, XFS_ATTR_FORK);
-
- /* generate the checksum. */
- xfs_dinode_calc_crc(mp, dip);
-
- return 0;
-}
-
int
libxfs_mod_incore_sb(
struct xfs_mount *mp,
@@ -477,40 +254,6 @@ libxfs_alloc_file_space(
return error;
}
-/*
- * Wrapper around call to libxfs_ialloc. Takes care of committing and
- * allocating a new transaction as needed.
- *
- * Originally there were two copies of this code - one in mkfs, the
- * other in repair - now there is just the one.
- */
-int
-libxfs_dir_ialloc(
- struct xfs_trans **tpp,
- struct xfs_inode *dp,
- mode_t mode,
- nlink_t nlink,
- xfs_dev_t rdev,
- struct cred *cr,
- struct fsxattr *fsx,
- struct xfs_inode **ipp)
-{
- xfs_ino_t parent_ino = dp ? dp->i_ino : 0;
- xfs_ino_t ino;
- int error;
-
- /*
- * Call the space management code to pick the on-disk inode to be
- * allocated.
- */
- error = xfs_dialloc(tpp, parent_ino, mode, &ino);
- if (error)
- return error;
-
- return libxfs_init_new_inode(*tpp, dp, ino, mode, nlink, rdev, cr,
- fsx, ipp);
-}
-
void
cmn_err(int level, char *fmt, ...)
{
* [PATCH 08/64] libxfs: pass IGET flags through to xfs_iread
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (6 preceding siblings ...)
2024-10-02 1:09 ` [PATCH 07/64] libxfs: put all the inode functions in a single file Darrick J. Wong
@ 2024-10-02 1:09 ` Darrick J. Wong
2024-10-02 5:47 ` Christoph Hellwig
2024-10-02 1:10 ` [PATCH 09/64] xfs: pack icreate initialization parameters into a separate structure Darrick J. Wong
` (55 subsequent siblings)
63 siblings, 1 reply; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:09 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Change the lock_flags parameter to flags so that we can supply
XFS_IGET_ flags in future patches. All callers of libxfs_iget and
libxfs_trans_iget pass zero for this parameter and there are no inode
locks in xfsprogs, so there's no behavior change here.
Port the kernel's version of the xfs_inode_from_disk callsite.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
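For anyone skimming the hunk below, the decision being ported can be sketched as a small stand-alone predicate. This is only an illustration under stated assumptions: struct demo_mount, DEMO_IGET_CREATE and demo_iget_needs_disk_read() are hypothetical stand-ins, not libxfs names, and the real code consults xfs_has_v3inodes() and xfs_has_ikeep() on the mount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for XFS_IGET_CREATE; the value is illustrative. */
#define DEMO_IGET_CREATE	(1U << 0)

/* Hypothetical stand-in for the two mount features that matter here. */
struct demo_mount {
	bool	has_v3inodes;	/* V5 superblock, v3 inode format */
	bool	has_ikeep;	/* ikeep inode cluster mode is in use */
};

/*
 * Return true if the inode core has to be read from disk, false if a brand
 * new core can be built in memory with a random generation number.
 */
static bool
demo_iget_needs_disk_read(
	const struct demo_mount	*mp,
	unsigned int		flags)
{
	assert(mp != NULL);

	/* New inode on a v5 filesystem without ikeep: nothing to read. */
	if (mp->has_v3inodes && (flags & DEMO_IGET_CREATE) && !mp->has_ikeep)
		return false;

	/* Everything else, including all v4 filesystems, reads the buffer. */
	return true;
}
```

Only the combination "v3 inodes, create flag set, no ikeep" skips the buffer read; a caller that passes zero for the flags keeps reading the inode from disk as before.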
libxfs/inode.c | 40 ++++++++++++++++++++++++++++------------
1 file changed, 28 insertions(+), 12 deletions(-)
diff --git a/libxfs/inode.c b/libxfs/inode.c
index fffca7761..2af7e8fe9 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -298,11 +298,10 @@ libxfs_iget(
struct xfs_mount *mp,
struct xfs_trans *tp,
xfs_ino_t ino,
- uint lock_flags,
+ uint flags,
struct xfs_inode **ipp)
{
struct xfs_inode *ip;
- struct xfs_buf *bp;
struct xfs_perag *pag;
int error = 0;
@@ -327,18 +326,35 @@ libxfs_iget(
if (error)
goto out_destroy;
- error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &bp);
- if (error)
- goto out_destroy;
+ /*
+ * For version 5 superblocks, if we are initialising a new inode and we
+ * are not utilising the XFS_MOUNT_IKEEP inode cluster mode, we can
+ * simply build the new inode core with a random generation number.
+ *
+ * For version 4 (and older) superblocks, log recovery is dependent on
+ * the di_flushiter field being initialised from the current on-disk
+ * value and hence we must also read the inode off disk even when
+ * initializing new inodes.
+ */
+ if (xfs_has_v3inodes(mp) &&
+ (flags & XFS_IGET_CREATE) && !xfs_has_ikeep(mp)) {
+ VFS_I(ip)->i_generation = get_random_u32();
+ } else {
+ struct xfs_buf *bp;
- error = xfs_inode_from_disk(ip,
- xfs_buf_offset(bp, ip->i_imap.im_boffset));
- if (!error)
- xfs_buf_set_ref(bp, XFS_INO_REF);
- xfs_trans_brelse(tp, bp);
+ error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &bp);
+ if (error)
+ goto out_destroy;
- if (error)
- goto out_destroy;
+ error = xfs_inode_from_disk(ip,
+ xfs_buf_offset(bp, ip->i_imap.im_boffset));
+ if (!error)
+ xfs_buf_set_ref(bp, XFS_INO_REF);
+ xfs_trans_brelse(tp, bp);
+
+ if (error)
+ goto out_destroy;
+ }
*ipp = ip;
return 0;
* [PATCH 09/64] xfs: pack icreate initialization parameters into a separate structure
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (7 preceding siblings ...)
2024-10-02 1:09 ` [PATCH 08/64] libxfs: pass IGET flags through to xfs_iread Darrick J. Wong
@ 2024-10-02 1:10 ` Darrick J. Wong
2024-10-02 1:10 ` [PATCH 10/64] libxfs: " Darrick J. Wong
` (54 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:10 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: ba4b39fe4c011078469dcd28f51447d75852d21c
Callers that want to create an inode currently pass all possible file
attribute values for the new inode into xfs_init_new_inode as ten
separate parameters. This causes two code maintenance issues: first, we
have large multi-line call sites which programmers must read carefully
to make sure they did not accidentally invert a value. Second, all
three file id parameters must be passed separately to the quota
functions; any discrepancy results in quota count errors.
Clean this up by creating a new icreate_args structure to hold all this
information, some helpers to initialize them properly, and make the
callers pass this structure through to the creation function, whose name
we shorten to xfs_icreate. This eliminates the issues, enables us to
keep the inode init code in sync with userspace via libxfs, and is
needed for future metadata directory tree management.
(A subsequent cleanup will also fix the quota alloc calls.)
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_inode_util.h | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/libxfs/xfs_inode_util.h b/libxfs/xfs_inode_util.h
index f7e4d5a82..9226482fd 100644
--- a/libxfs/xfs_inode_util.h
+++ b/libxfs/xfs_inode_util.h
@@ -13,4 +13,26 @@ uint32_t xfs_ip2xflags(struct xfs_inode *ip);
prid_t xfs_get_initial_prid(struct xfs_inode *dp);
+/*
+ * File creation context.
+ *
+ * Due to our only partial reliance on the VFS to propagate uid and gid values
+ * according to accepted Unix behaviors, callers must initialize idmap to the
+ * correct idmapping structure to get the correct inheritance behaviors when
+ * XFS_MOUNT_GRPID is set.
+ *
+ * To create files detached from the directory tree (e.g. quota inodes), set
+ * idmap to NULL. To create a tree root, set pip to NULL.
+ */
+struct xfs_icreate_args {
+ struct mnt_idmap *idmap;
+ struct xfs_inode *pip; /* parent inode or null */
+ dev_t rdev;
+ umode_t mode;
+
+#define XFS_ICREATE_TMPFILE (1U << 0) /* create an unlinked file */
+#define XFS_ICREATE_INIT_XATTRS (1U << 1) /* will set xattrs immediately */
+ uint16_t flags;
+};
+
#endif /* __XFS_INODE_UTIL_H__ */
* [PATCH 10/64] libxfs: pack icreate initialization parameters into a separate structure
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (8 preceding siblings ...)
2024-10-02 1:10 ` [PATCH 09/64] xfs: pack icreate initialization parameters into a separate structure Darrick J. Wong
@ 2024-10-02 1:10 ` Darrick J. Wong
2024-10-02 1:10 ` [PATCH 11/64] xfs: implement atime updates in xfs_trans_ichgtime Darrick J. Wong
` (53 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:10 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: ba4b39fe4c011078469dcd28f51447d75852d21c
Callers that want to create an inode currently pass all possible file
attribute values for the new inode into xfs_init_new_inode as ten
separate parameters. This causes two code maintenance issues: first, we
have large multi-line call sites which programmers must read carefully
to make sure they did not accidentally invert a value. Second, all
three file id parameters must be passed separately to the quota
functions; any discrepancy results in quota count errors.
Clean this up by creating a new icreate_args structure to hold all this
information, some helpers to initialize them properly, and make the
callers pass this structure through to the creation function, whose name
we shorten to xfs_icreate. This eliminates the issues, enables us to
keep the inode init code in sync with userspace via libxfs, and is
needed for future metadata directory tree management.
(A subsequent cleanup will also fix the quota alloc calls.)
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
include/xfs_inode.h | 46 +++++++++++++++++----
libxfs/inode.c | 114 +++++++++++++++++++++++++++++++++++----------------
2 files changed, 117 insertions(+), 43 deletions(-)
diff --git a/include/xfs_inode.h b/include/xfs_inode.h
index 17d3da6ae..4142c45e4 100644
--- a/include/xfs_inode.h
+++ b/include/xfs_inode.h
@@ -7,6 +7,36 @@
#ifndef __XFS_INODE_H__
#define __XFS_INODE_H__
+/*
+ * Borrow the kernel's uid/gid types. These are used by xfs_inode_util.h, so
+ * they must come first in the header file.
+ */
+
+typedef struct {
+ uid_t val;
+} kuid_t;
+
+typedef struct {
+ gid_t val;
+} kgid_t;
+
+static inline kuid_t make_kuid(uid_t uid)
+{
+ kuid_t v = { .val = uid };
+ return v;
+}
+
+static inline kgid_t make_kgid(gid_t gid)
+{
+ kgid_t v = { .val = gid };
+ return v;
+}
+
+#define KUIDT_INIT(value) (kuid_t){ value }
+#define KGIDT_INIT(value) (kgid_t){ value }
+#define GLOBAL_ROOT_UID KUIDT_INIT(0)
+#define GLOBAL_ROOT_GID KGIDT_INIT(0)
+
/* These match kernel side includes */
#include "xfs_inode_buf.h"
#include "xfs_inode_fork.h"
@@ -34,8 +64,8 @@ static inline bool IS_I_VERSION(const struct inode *inode) { return false; }
*/
struct inode {
mode_t i_mode;
- uint32_t i_uid;
- uint32_t i_gid;
+ kuid_t i_uid;
+ kgid_t i_gid;
uint32_t i_nlink;
xfs_dev_t i_rdev; /* This actually holds xfs_dev_t */
unsigned int i_count;
@@ -50,19 +80,19 @@ struct inode {
static inline uint32_t i_uid_read(struct inode *inode)
{
- return inode->i_uid;
+ return inode->i_uid.val;
}
static inline uint32_t i_gid_read(struct inode *inode)
{
- return inode->i_gid;
+ return inode->i_gid.val;
}
-static inline void i_uid_write(struct inode *inode, uint32_t uid)
+static inline void i_uid_write(struct inode *inode, uid_t uid)
{
- inode->i_uid = uid;
+ inode->i_uid.val = uid;
}
-static inline void i_gid_write(struct inode *inode, uint32_t gid)
+static inline void i_gid_write(struct inode *inode, gid_t gid)
{
- inode->i_gid = gid;
+ inode->i_gid.val = gid;
}
static inline void ihold(struct inode *inode)
diff --git a/libxfs/inode.c b/libxfs/inode.c
index 2af7e8fe9..9ccc22adf 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -82,18 +82,16 @@ libxfs_bumplink(
* caller locked exclusively.
*/
static int
-libxfs_init_new_inode(
+libxfs_icreate(
struct xfs_trans *tp,
- struct xfs_inode *pip,
xfs_ino_t ino,
- umode_t mode,
- xfs_nlink_t nlink,
- dev_t rdev,
- struct cred *cr,
- struct fsxattr *fsx,
+ const struct xfs_icreate_args *args,
struct xfs_inode **ipp)
{
struct xfs_mount *mp = tp->t_mountp;
+ struct xfs_inode *pip = args->pip;
+ struct inode *dir = pip ? VFS_I(pip) : NULL;
+ struct inode *inode;
struct xfs_inode *ip;
unsigned int flags;
int error;
@@ -103,48 +101,47 @@ libxfs_init_new_inode(
return error;
ASSERT(ip != NULL);
- VFS_I(ip)->i_mode = mode;
- set_nlink(VFS_I(ip), nlink);
- i_uid_write(VFS_I(ip), cr->cr_uid);
- i_gid_write(VFS_I(ip), cr->cr_gid);
- ip->i_projid = pip ? 0 : fsx->fsx_projid;
+ inode = VFS_I(ip);
+ inode->i_mode = args->mode;
+ if (args->flags & XFS_ICREATE_TMPFILE)
+ set_nlink(inode, 0);
+ else if (S_ISDIR(args->mode))
+ set_nlink(inode, 2);
+ else
+ set_nlink(inode, 1);
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
+ ip->i_projid = 0;
xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG | XFS_ICHGTIME_MOD);
- if (pip && (VFS_I(pip)->i_mode & S_ISGID)) {
- if (!(cr->cr_flags & CRED_FORCE_GID))
- VFS_I(ip)->i_gid = VFS_I(pip)->i_gid;
- if ((VFS_I(pip)->i_mode & S_ISGID) && (mode & S_IFMT) == S_IFDIR)
- VFS_I(ip)->i_mode |= S_ISGID;
+ if (pip && (dir->i_mode & S_ISGID)) {
+ inode->i_gid = dir->i_gid;
+ if (S_ISDIR(args->mode))
+ inode->i_mode |= S_ISGID;
}
ip->i_disk_size = 0;
ip->i_df.if_nextents = 0;
ASSERT(ip->i_nblocks == 0);
- ip->i_extsize = pip ? 0 : fsx->fsx_extsize;
- ip->i_diflags = pip ? 0 : xfs_flags2diflags(ip, fsx->fsx_xflags);
+ ip->i_extsize = 0;
+ ip->i_diflags = 0;
if (xfs_has_v3inodes(ip->i_mount)) {
- VFS_I(ip)->i_version = 1;
+ inode->i_version = 1;
ip->i_diflags2 = ip->i_mount->m_ino_geo.new_diflags2;
- if (!pip)
- ip->i_diflags2 = xfs_flags2diflags2(ip,
- fsx->fsx_xflags);
- ip->i_crtime = inode_get_mtime(VFS_I(ip)); /* struct copy */
- ip->i_cowextsize = pip ? 0 : fsx->fsx_cowextsize;
+ ip->i_crtime = inode_get_mtime(inode); /* struct copy */
+ ip->i_cowextsize = 0;
}
flags = XFS_ILOG_CORE;
- switch (mode & S_IFMT) {
+ switch (args->mode & S_IFMT) {
case S_IFIFO:
case S_IFSOCK:
- /* doesn't make sense to set an rdev for these */
- rdev = 0;
- /* FALLTHROUGH */
case S_IFCHR:
case S_IFBLK:
ip->i_df.if_format = XFS_DINODE_FMT_DEV;
flags |= XFS_ILOG_DEV;
- VFS_I(ip)->i_rdev = rdev;
+ VFS_I(ip)->i_rdev = args->rdev;
break;
case S_IFREG:
case S_IFDIR:
@@ -161,10 +158,16 @@ libxfs_init_new_inode(
}
/*
- * If we're going to set a parent pointer on this file, we need to
- * create an attr fork to receive that parent pointer.
+ * If we need to create attributes immediately after allocating the
+ * inode, initialise an empty attribute fork right now. We use the
+ * default fork offset for attributes here as we don't know exactly what
+ * size or how many attributes we might be adding. We can do this
+ * safely here because we know the data fork is completely empty and
+ * this saves us from needing to run a separate transaction to set the
+ * fork offset in the immediate future.
*/
- if (pip && xfs_has_parent(mp)) {
+ if ((args->flags & XFS_ICREATE_INIT_XATTRS) &&
+ (xfs_has_attr(tp->t_mountp) || xfs_has_attr2(tp->t_mountp))) {
ip->i_forkoff = xfs_default_attroffset(ip) >> 3;
xfs_ifork_init_attr(ip, XFS_DINODE_FMT_EXTENTS, 0);
@@ -270,10 +273,27 @@ libxfs_dir_ialloc(
struct fsxattr *fsx,
struct xfs_inode **ipp)
{
+ struct xfs_icreate_args args = {
+ .pip = dp,
+ .mode = mode,
+ };
+ struct xfs_inode *ip;
+ struct inode *inode;
xfs_ino_t parent_ino = dp ? dp->i_ino : 0;
xfs_ino_t ino;
int error;
+ if (dp && xfs_has_parent(dp->i_mount))
+ args.flags |= XFS_ICREATE_INIT_XATTRS;
+
+ /* Only devices get rdev numbers */
+ switch (mode & S_IFMT) {
+ case S_IFCHR:
+ case S_IFBLK:
+ args.rdev = rdev;
+ break;
+ }
+
/*
* Call the space management code to pick the on-disk inode to be
* allocated.
@@ -282,8 +302,32 @@ libxfs_dir_ialloc(
if (error)
return error;
- return libxfs_init_new_inode(*tpp, dp, ino, mode, nlink, rdev, cr,
- fsx, ipp);
+ error = libxfs_icreate(*tpp, ino, &args, &ip);
+ if (error)
+ return error;
+
+ inode = VFS_I(ip);
+ i_uid_write(inode, cr->cr_uid);
+ if (cr->cr_flags & CRED_FORCE_GID)
+ i_gid_write(inode, cr->cr_gid);
+ set_nlink(inode, nlink);
+
+ /* If there is no parent dir, initialize the file from fsxattr data. */
+ if (dp == NULL) {
+ ip->i_projid = fsx->fsx_projid;
+ ip->i_extsize = fsx->fsx_extsize;
+ ip->i_diflags = xfs_flags2diflags(ip, fsx->fsx_xflags);
+
+ if (xfs_has_v3inodes(ip->i_mount)) {
+ ip->i_diflags2 = xfs_flags2diflags2(ip,
+ fsx->fsx_xflags);
+ ip->i_cowextsize = fsx->fsx_cowextsize;
+ }
+ }
+
+ xfs_trans_log_inode(*tpp, ip, XFS_ILOG_CORE);
+ *ipp = ip;
+ return 0;
}
/*
* [PATCH 11/64] xfs: implement atime updates in xfs_trans_ichgtime
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (9 preceding siblings ...)
2024-10-02 1:10 ` [PATCH 10/64] libxfs: " Darrick J. Wong
@ 2024-10-02 1:10 ` Darrick J. Wong
2024-10-02 1:10 ` [PATCH 12/64] libxfs: rearrange libxfs_trans_ichgtime call when creating inodes Darrick J. Wong
` (52 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:10 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 3d1dfb6df9b7b9ffc95499b9ddd92d949e5a60d2
Enable xfs_trans_ichgtime to change the inode access time so that we can
use this function to set inode times when allocating inodes instead of
open-coding it.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
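The flag dispatch that this patch extends can be sketched on its own as follows. It is an illustration under assumptions: struct demo_inode and demo_ichgtime() are hypothetical, the DEMO_ flag values simply copy the XFS_ICHGTIME_ values from xfs_shared.h, and the timestamp is passed in by the caller here so the result is deterministic, whereas the real function obtains the current time itself.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define DEMO_ICHGTIME_MOD	0x1	/* data fork modification timestamp */
#define DEMO_ICHGTIME_CHG	0x2	/* inode field change timestamp */
#define DEMO_ICHGTIME_CREATE	0x4	/* inode create timestamp */
#define DEMO_ICHGTIME_ACCESS	0x8	/* last access timestamp */

/* Hypothetical reduced inode holding only the four timestamps. */
struct demo_inode {
	struct timespec	mtime;
	struct timespec	ctime;
	struct timespec	atime;
	struct timespec	crtime;
};

/* Stamp every timestamp selected in @flags with the same value. */
static void
demo_ichgtime(
	struct demo_inode	*ip,
	int			flags,
	struct timespec		tv)
{
	assert(ip != NULL);

	if (flags & DEMO_ICHGTIME_MOD)
		ip->mtime = tv;
	if (flags & DEMO_ICHGTIME_CHG)
		ip->ctime = tv;
	if (flags & DEMO_ICHGTIME_ACCESS)
		ip->atime = tv;
	if (flags & DEMO_ICHGTIME_CREATE)
		ip->crtime = tv;
}
```

Because all selected fields receive the same tv, a caller that ORs the flags together gets identical timestamps from a single call, which is what makes the function usable at inode allocation time.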
libxfs/xfs_shared.h | 1 +
libxfs/xfs_trans_inode.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/libxfs/xfs_shared.h b/libxfs/xfs_shared.h
index 34f104ed3..9a705381f 100644
--- a/libxfs/xfs_shared.h
+++ b/libxfs/xfs_shared.h
@@ -183,6 +183,7 @@ void xfs_log_get_max_trans_res(struct xfs_mount *mp,
#define XFS_ICHGTIME_MOD 0x1 /* data fork modification timestamp */
#define XFS_ICHGTIME_CHG 0x2 /* inode field change timestamp */
#define XFS_ICHGTIME_CREATE 0x4 /* inode create timestamp */
+#define XFS_ICHGTIME_ACCESS 0x8 /* last access timestamp */
/* Computed inode geometry for the filesystem. */
struct xfs_ino_geometry {
diff --git a/libxfs/xfs_trans_inode.c b/libxfs/xfs_trans_inode.c
index f8484eb20..45b513bc5 100644
--- a/libxfs/xfs_trans_inode.c
+++ b/libxfs/xfs_trans_inode.c
@@ -65,6 +65,8 @@ xfs_trans_ichgtime(
inode_set_mtime_to_ts(inode, tv);
if (flags & XFS_ICHGTIME_CHG)
inode_set_ctime_to_ts(inode, tv);
+ if (flags & XFS_ICHGTIME_ACCESS)
+ inode_set_atime_to_ts(inode, tv);
if (flags & XFS_ICHGTIME_CREATE)
ip->i_crtime = tv;
}
* [PATCH 12/64] libxfs: rearrange libxfs_trans_ichgtime call when creating inodes
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (10 preceding siblings ...)
2024-10-02 1:10 ` [PATCH 11/64] xfs: implement atime updates in xfs_trans_ichgtime Darrick J. Wong
@ 2024-10-02 1:10 ` Darrick J. Wong
2024-10-02 5:48 ` Christoph Hellwig
2024-10-02 1:11 ` [PATCH 13/64] libxfs: set access time when creating files Darrick J. Wong
` (51 subsequent siblings)
63 siblings, 1 reply; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:10 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Rearrange the libxfs_trans_ichgtime call in libxfs_icreate so that we
call it once with the flags we want.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/inode.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/libxfs/inode.c b/libxfs/inode.c
index 9ccc22adf..b302bbbfd 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -94,6 +94,7 @@ libxfs_icreate(
struct inode *inode;
struct xfs_inode *ip;
unsigned int flags;
+ int times = XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG;
int error;
error = libxfs_iget(mp, tp, ino, XFS_IGET_CREATE, &ip);
@@ -112,7 +113,6 @@ libxfs_icreate(
inode->i_uid = GLOBAL_ROOT_UID;
inode->i_gid = GLOBAL_ROOT_GID;
ip->i_projid = 0;
- xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG | XFS_ICHGTIME_MOD);
if (pip && (dir->i_mode & S_ISGID)) {
inode->i_gid = dir->i_gid;
@@ -129,10 +129,12 @@ libxfs_icreate(
if (xfs_has_v3inodes(ip->i_mount)) {
inode->i_version = 1;
ip->i_diflags2 = ip->i_mount->m_ino_geo.new_diflags2;
- ip->i_crtime = inode_get_mtime(inode); /* struct copy */
ip->i_cowextsize = 0;
+ times |= XFS_ICHGTIME_CREATE;
}
+ xfs_trans_ichgtime(tp, ip, times);
+
flags = XFS_ILOG_CORE;
switch (args->mode & S_IFMT) {
case S_IFIFO:
* [PATCH 13/64] libxfs: set access time when creating files
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (11 preceding siblings ...)
2024-10-02 1:10 ` [PATCH 12/64] libxfs: rearrange libxfs_trans_ichgtime call when creating inodes Darrick J. Wong
@ 2024-10-02 1:11 ` Darrick J. Wong
2024-10-02 5:49 ` Christoph Hellwig
2024-10-02 1:11 ` [PATCH 14/64] libxfs: when creating a file in a directory, set the project id based on the parent Darrick J. Wong
` (50 subsequent siblings)
63 siblings, 1 reply; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:11 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Set the access time on files that we're creating, to match the behavior
of the kernel.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/inode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/libxfs/inode.c b/libxfs/inode.c
index b302bbbfd..132cf990d 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -94,7 +94,8 @@ libxfs_icreate(
struct inode *inode;
struct xfs_inode *ip;
unsigned int flags;
- int times = XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG;
+ int times = XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG |
+ XFS_ICHGTIME_ACCESS;
int error;
error = libxfs_iget(mp, tp, ino, XFS_IGET_CREATE, &ip);
* [PATCH 14/64] libxfs: when creating a file in a directory, set the project id based on the parent
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (12 preceding siblings ...)
2024-10-02 1:11 ` [PATCH 13/64] libxfs: set access time when creating files Darrick J. Wong
@ 2024-10-02 1:11 ` Darrick J. Wong
2024-10-02 5:49 ` Christoph Hellwig
2024-10-02 1:11 ` [PATCH 15/64] libxfs: pass flags2 from parent to child when creating files Darrick J. Wong
` (49 subsequent siblings)
63 siblings, 1 reply; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:11 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
When we're creating a file as a child of an existing directory, use
xfs_get_initial_prid to have the child inherit the project id of the
directory if the directory has PROJINHERIT set, just like the kernel
does. This fixes mkfs project id propagation with -d projinherit=X when
protofiles are in use.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
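A minimal sketch of the inheritance rule described above, assuming the child takes the parent's project id only when the parent has its project-inherit flag set. The flag value, the default id constant, and all `toy_*` names are hypothetical; the real helper is xfs_get_initial_prid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the on-disk XFS constants. */
#define TOY_DIFLAG_PROJINHERIT	(1U << 0)
#define TOY_PROJID_DEFAULT	0U

struct toy_inode {
	uint32_t	projid;
	uint32_t	diflags;
};

/* Pick the project id a new child of @dir should start with. */
static uint32_t
toy_get_initial_prid(const struct toy_inode *dir)
{
	assert(dir != NULL);

	if (dir->diflags & TOY_DIFLAG_PROJINHERIT)
		return dir->projid;
	return TOY_PROJID_DEFAULT;
}

/* Files created without a parent fall back to the default project id. */
static void
toy_init_projid(struct toy_inode *child, const struct toy_inode *parent)
{
	child->projid = parent ? toy_get_initial_prid(parent)
			       : TOY_PROJID_DEFAULT;
}
```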
libxfs/inode.c | 3 +++
libxfs/libxfs_api_defs.h | 1 +
2 files changed, 4 insertions(+)
diff --git a/libxfs/inode.c b/libxfs/inode.c
index 132cf990d..d022b41b6 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -121,6 +121,9 @@ libxfs_icreate(
inode->i_mode |= S_ISGID;
}
+ if (pip)
+ ip->i_projid = libxfs_get_initial_prid(pip);
+
ip->i_disk_size = 0;
ip->i_df.if_nextents = 0;
ASSERT(ip->i_nblocks == 0);
diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
index df316727b..a507904f2 100644
--- a/libxfs/libxfs_api_defs.h
+++ b/libxfs/libxfs_api_defs.h
@@ -166,6 +166,7 @@
#define xfs_free_extent_later libxfs_free_extent_later
#define xfs_free_perag libxfs_free_perag
#define xfs_fs_geometry libxfs_fs_geometry
+#define xfs_get_initial_prid libxfs_get_initial_prid
#define xfs_highbit32 libxfs_highbit32
#define xfs_highbit64 libxfs_highbit64
#define xfs_ialloc_calc_rootino libxfs_ialloc_calc_rootino
* [PATCH 15/64] libxfs: pass flags2 from parent to child when creating files
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (13 preceding siblings ...)
2024-10-02 1:11 ` [PATCH 14/64] libxfs: when creating a file in a directory, set the project id based on the parent Darrick J. Wong
@ 2024-10-02 1:11 ` Darrick J. Wong
2024-10-02 5:49 ` Christoph Hellwig
2024-10-02 1:12 ` [PATCH 16/64] xfs: split new inode creation into two pieces Darrick J. Wong
` (48 subsequent siblings)
63 siblings, 1 reply; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:11 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
When mkfs creates a new file as a child of an existing directory, we
should propagate the flags2 field from parent to child like the kernel
does. This ensures that mkfs propagates cowextsize hints properly when
protofiles are in use.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
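A standalone model of the flags2 propagation, mirroring the helper added in the diff below so that the behaviour can be exercised outside libxfs. The flag values and the `toy_*` names are hypothetical; only the logic follows the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the real ones are the XFS_DIFLAG2_* constants. */
#define TOY_DIFLAG2_DAX		(1U << 0)
#define TOY_DIFLAG2_COWEXTSIZE	(1U << 2)

struct toy_inode {
	uint64_t	diflags2;
	uint32_t	cowextsize;
};

/* Copy the inheritable flags2 bits (and the CoW hint) from parent to child. */
static void
toy_inherit_flags2(struct toy_inode *ip, const struct toy_inode *pip)
{
	assert(ip != NULL && pip != NULL);

	if (pip->diflags2 & TOY_DIFLAG2_COWEXTSIZE) {
		ip->diflags2 |= TOY_DIFLAG2_COWEXTSIZE;
		ip->cowextsize = pip->cowextsize;
	}
	if (pip->diflags2 & TOY_DIFLAG2_DAX)
		ip->diflags2 |= TOY_DIFLAG2_DAX;
}
```

The hint value is copied only together with its flag, so a parent without the CoW extent size flag leaves the child's hint untouched.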
libxfs/inode.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/libxfs/inode.c b/libxfs/inode.c
index d022b41b6..3e72b25cc 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -59,6 +59,20 @@ xfs_inode_propagate_flags(
ip->i_diflags |= di_flags;
}
+/* Propagate di_flags2 from a parent inode to a child inode. */
+static void
+xfs_inode_inherit_flags2(
+ struct xfs_inode *ip,
+ const struct xfs_inode *pip)
+{
+ if (pip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) {
+ ip->i_diflags2 |= XFS_DIFLAG2_COWEXTSIZE;
+ ip->i_cowextsize = pip->i_cowextsize;
+ }
+ if (pip->i_diflags2 & XFS_DIFLAG2_DAX)
+ ip->i_diflags2 |= XFS_DIFLAG2_DAX;
+}
+
/*
* Increment the link count on an inode & log the change.
*/
@@ -153,6 +167,8 @@ libxfs_icreate(
case S_IFDIR:
if (pip && (pip->i_diflags & XFS_DIFLAG_ANY))
xfs_inode_propagate_flags(ip, pip);
+ if (pip && (pip->i_diflags2 & XFS_DIFLAG2_ANY))
+ xfs_inode_inherit_flags2(ip, pip);
/* FALLTHROUGH */
case S_IFLNK:
ip->i_df.if_format = XFS_DINODE_FMT_EXTENTS;
* [PATCH 16/64] xfs: split new inode creation into two pieces
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (14 preceding siblings ...)
2024-10-02 1:11 ` [PATCH 15/64] libxfs: pass flags2 from parent to child when creating files Darrick J. Wong
@ 2024-10-02 1:12 ` Darrick J. Wong
2024-10-02 1:12 ` [PATCH 17/64] libxfs: " Darrick J. Wong
` (47 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:12 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 38fd3d6a956f1b104f11cd6eee116c54bfe458c4
There are two parts to initializing a newly allocated inode: setting up
the incore structures, and initializing the new inode core based on the
parent inode and the current user's environment. The initialization
code is not specific to the kernel, so we would like to share that with
userspace by hoisting it to libxfs. Therefore, split xfs_icreate into
separate functions to prepare for the next few patches.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_ialloc.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index d8697561e..cef2819aa 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -1941,6 +1941,21 @@ xfs_dialloc(
}
return -ENOSPC;
}
+
+ /*
+ * Protect against obviously corrupt allocation btree records. Later
+ * xfs_iget checks will catch re-allocation of other active in-memory
+ * and on-disk inodes. If we don't catch reallocating the parent inode
+ * here we will deadlock in xfs_iget() so we have to do these checks
+ * first.
+ */
+ if (ino == parent || !xfs_verify_dir_ino(mp, ino)) {
+ xfs_alert(mp, "Allocated a known in-use inode 0x%llx!", ino);
+ xfs_agno_mark_sick(mp, XFS_INO_TO_AGNO(mp, ino),
+ XFS_SICK_AG_INOBT);
+ return -EFSCORRUPTED;
+ }
+
*new_ino = ino;
return 0;
}
* [PATCH 17/64] libxfs: split new inode creation into two pieces
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (15 preceding siblings ...)
2024-10-02 1:12 ` [PATCH 16/64] xfs: split new inode creation into two pieces Darrick J. Wong
@ 2024-10-02 1:12 ` Darrick J. Wong
2024-10-02 1:12 ` [PATCH 18/64] libxfs: backport inode init code from the kernel Darrick J. Wong
` (46 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:12 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 38fd3d6a956f1b104f11cd6eee116c54bfe458c4
There are two parts to initializing a newly allocated inode: setting up
the incore structures, and initializing the new inode core based on the
parent inode and the current user's environment. The initialization
code is not specific to the kernel, so we would like to share that with
userspace by hoisting it to libxfs. Therefore, split xfs_icreate into
separate functions to prepare for the next few patches.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/inode.c | 51 ++++++++++++++++++++++++++++++---------------------
1 file changed, 30 insertions(+), 21 deletions(-)
diff --git a/libxfs/inode.c b/libxfs/inode.c
index 3e72b25cc..206b779a8 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -91,33 +91,21 @@ libxfs_bumplink(
xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
}
-/*
- * Initialise a newly allocated inode and return the in-core inode to the
- * caller locked exclusively.
- */
-static int
-libxfs_icreate(
+/* Initialise an inode's attributes. */
+static void
+xfs_inode_init(
struct xfs_trans *tp,
- xfs_ino_t ino,
const struct xfs_icreate_args *args,
- struct xfs_inode **ipp)
+ struct xfs_inode *ip)
{
struct xfs_mount *mp = tp->t_mountp;
struct xfs_inode *pip = args->pip;
struct inode *dir = pip ? VFS_I(pip) : NULL;
- struct inode *inode;
- struct xfs_inode *ip;
+ struct inode *inode = VFS_I(ip);
unsigned int flags;
int times = XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG |
XFS_ICHGTIME_ACCESS;
- int error;
- error = libxfs_iget(mp, tp, ino, XFS_IGET_CREATE, &ip);
- if (error != 0)
- return error;
- ASSERT(ip != NULL);
-
- inode = VFS_I(ip);
inode->i_mode = args->mode;
if (args->flags & XFS_ICREATE_TMPFILE)
set_nlink(inode, 0);
@@ -201,11 +189,32 @@ libxfs_icreate(
}
}
- /*
- * Log the new values stuffed into the inode.
- */
- xfs_trans_ijoin(tp, ip, 0);
xfs_trans_log_inode(tp, ip, flags);
+}
+
+/*
+ * Initialise a newly allocated inode and return the in-core inode to the
+ * caller locked exclusively.
+ */
+static int
+libxfs_icreate(
+ struct xfs_trans *tp,
+ xfs_ino_t ino,
+ const struct xfs_icreate_args *args,
+ struct xfs_inode **ipp)
+{
+ struct xfs_mount *mp = tp->t_mountp;
+ struct xfs_inode *ip = NULL;
+ int error;
+
+ error = libxfs_iget(mp, tp, ino, XFS_IGET_CREATE, &ip);
+ if (error)
+ return error;
+
+ ASSERT(ip != NULL);
+ xfs_trans_ijoin(tp, ip, 0);
+ xfs_inode_init(tp, args, ip);
+
*ipp = ip;
return 0;
}
* [PATCH 18/64] libxfs: backport inode init code from the kernel
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (16 preceding siblings ...)
2024-10-02 1:12 ` [PATCH 17/64] libxfs: " Darrick J. Wong
@ 2024-10-02 1:12 ` Darrick J. Wong
2024-10-02 5:50 ` Christoph Hellwig
2024-10-02 1:12 ` [PATCH 19/64] libxfs: remove libxfs_dir_ialloc Darrick J. Wong
` (45 subsequent siblings)
63 siblings, 1 reply; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:12 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Reorganize the userspace inode initialization code to more closely
resemble its kernel counterpart. This is preparation to hoist the
initialization routines to libxfs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
include/xfs_inode.h | 20 +++++++++++++++
include/xfs_mount.h | 8 ++++++
libxfs/inode.c | 68 +++++++++++++++++++++++++++++++++++++-------------
libxfs/libxfs_priv.h | 10 +++++++
4 files changed, 88 insertions(+), 18 deletions(-)
diff --git a/include/xfs_inode.h b/include/xfs_inode.h
index 4142c45e4..d2f391ea8 100644
--- a/include/xfs_inode.h
+++ b/include/xfs_inode.h
@@ -78,6 +78,12 @@ struct inode {
spinlock_t i_lock;
};
+static inline void
+inode_set_iversion(struct inode *inode, uint64_t version)
+{
+ inode->i_version = version;
+}
+
static inline uint32_t i_uid_read(struct inode *inode)
{
return inode->i_uid.val;
@@ -95,6 +101,18 @@ static inline void i_gid_write(struct inode *inode, gid_t gid)
inode->i_gid.val = gid;
}
+static inline void inode_fsuid_set(struct inode *inode,
+ struct mnt_idmap *idmap)
+{
+ inode->i_uid = make_kuid(0);
+}
+
+static inline void inode_fsgid_set(struct inode *inode,
+ struct mnt_idmap *idmap)
+{
+ inode->i_gid = make_kgid(0);
+}
+
static inline void ihold(struct inode *inode)
{
inode->i_count++;
@@ -408,4 +426,6 @@ extern void libxfs_irele(struct xfs_inode *ip);
#define XFS_DEFAULT_COWEXTSZ_HINT 32
+#define XFS_INHERIT_GID(pip) (VFS_I(pip)->i_mode & S_ISGID)
+
#endif /* __XFS_INODE_H__ */
diff --git a/include/xfs_mount.h b/include/xfs_mount.h
index a9525e4e0..4492a2f28 100644
--- a/include/xfs_mount.h
+++ b/include/xfs_mount.h
@@ -228,6 +228,7 @@ __XFS_UNSUPP_FEAT(ikeep)
__XFS_UNSUPP_FEAT(swalloc)
__XFS_UNSUPP_FEAT(small_inums)
__XFS_UNSUPP_FEAT(readonly)
+__XFS_UNSUPP_FEAT(grpid)
/* Operational mount state flags */
#define XFS_OPSTATE_INODE32 0 /* inode32 allocator active */
@@ -308,4 +309,11 @@ static inline void libxfs_buftarg_drain(struct xfs_buftarg *btp)
cache_purge(btp->bcache);
}
+struct mnt_idmap {
+ /* empty */
+};
+
+/* bogus idmapping so that mkfs can do directory inheritance correctly */
+#define libxfs_nop_idmap ((struct mnt_idmap *)1)
+
#endif /* __XFS_MOUNT_H__ */
diff --git a/libxfs/inode.c b/libxfs/inode.c
index 206b779a8..dda9b778d 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -31,7 +31,7 @@
/* Propagate di_flags from a parent inode to a child inode. */
static void
-xfs_inode_propagate_flags(
+xfs_inode_inherit_flags(
struct xfs_inode *ip,
const struct xfs_inode *pip)
{
@@ -106,35 +106,52 @@ xfs_inode_init(
int times = XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG |
XFS_ICHGTIME_ACCESS;
- inode->i_mode = args->mode;
if (args->flags & XFS_ICREATE_TMPFILE)
set_nlink(inode, 0);
else if (S_ISDIR(args->mode))
set_nlink(inode, 2);
else
set_nlink(inode, 1);
- inode->i_uid = GLOBAL_ROOT_UID;
- inode->i_gid = GLOBAL_ROOT_GID;
- ip->i_projid = 0;
+ inode->i_rdev = args->rdev;
- if (pip && (dir->i_mode & S_ISGID)) {
- inode->i_gid = dir->i_gid;
- if (S_ISDIR(args->mode))
- inode->i_mode |= S_ISGID;
+ if (!args->idmap || pip == NULL) {
+ /* creating a tree root, sb rooted, or detached file */
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
+ ip->i_projid = 0;
+ inode->i_mode = args->mode;
+ } else {
+ /* creating a child in the directory tree */
+ if (dir && !(dir->i_mode & S_ISGID) && xfs_has_grpid(mp)) {
+ inode_fsuid_set(inode, args->idmap);
+ inode->i_gid = dir->i_gid;
+ inode->i_mode = args->mode;
+ } else {
+ inode_init_owner(args->idmap, inode, dir, args->mode);
+ }
+
+ /*
+ * If the group ID of the new file does not match the effective
+ * group ID or one of the supplementary group IDs, the S_ISGID
+ * bit is cleared (and only if the irix_sgid_inherit
+ * compatibility variable is set).
+ */
+ if (irix_sgid_inherit && (inode->i_mode & S_ISGID) &&
+ !vfsgid_in_group_p(i_gid_into_vfsgid(args->idmap, inode)))
+ inode->i_mode &= ~S_ISGID;
+
+ ip->i_projid = pip ? xfs_get_initial_prid(pip) : 0;
}
- if (pip)
- ip->i_projid = libxfs_get_initial_prid(pip);
-
ip->i_disk_size = 0;
ip->i_df.if_nextents = 0;
ASSERT(ip->i_nblocks == 0);
+
ip->i_extsize = 0;
ip->i_diflags = 0;
- if (xfs_has_v3inodes(ip->i_mount)) {
- inode->i_version = 1;
- ip->i_diflags2 = ip->i_mount->m_ino_geo.new_diflags2;
+ if (xfs_has_v3inodes(mp)) {
+ inode_set_iversion(inode, 1);
ip->i_cowextsize = 0;
times |= XFS_ICHGTIME_CREATE;
}
@@ -149,15 +166,14 @@ xfs_inode_init(
case S_IFBLK:
ip->i_df.if_format = XFS_DINODE_FMT_DEV;
flags |= XFS_ILOG_DEV;
- VFS_I(ip)->i_rdev = args->rdev;
break;
case S_IFREG:
case S_IFDIR:
if (pip && (pip->i_diflags & XFS_DIFLAG_ANY))
- xfs_inode_propagate_flags(ip, pip);
+ xfs_inode_inherit_flags(ip, pip);
if (pip && (pip->i_diflags2 & XFS_DIFLAG2_ANY))
xfs_inode_inherit_flags2(ip, pip);
- /* FALLTHROUGH */
+ fallthrough;
case S_IFLNK:
ip->i_df.if_format = XFS_DINODE_FMT_EXTENTS;
ip->i_df.if_bytes = 0;
@@ -391,6 +407,7 @@ libxfs_iget(
VFS_I(ip)->i_count = 1;
ip->i_ino = ino;
ip->i_mount = mp;
+ ip->i_diflags2 = mp->m_ino_geo.new_diflags2;
ip->i_af.if_format = XFS_DINODE_FMT_EXTENTS;
spin_lock_init(&VFS_I(ip)->i_lock);
@@ -472,3 +489,18 @@ libxfs_irele(
kmem_cache_free(xfs_inode_cache, ip);
}
}
+
+void inode_init_owner(struct mnt_idmap *idmap, struct inode *inode,
+ const struct inode *dir, umode_t mode)
+{
+ inode_fsuid_set(inode, idmap);
+ if (dir && dir->i_mode & S_ISGID) {
+ inode->i_gid = dir->i_gid;
+
+ /* Directories are special, and always inherit S_ISGID */
+ if (S_ISDIR(mode))
+ mode |= S_ISGID;
+ } else
+ inode_fsgid_set(inode, idmap);
+ inode->i_mode = mode;
+}
diff --git a/libxfs/libxfs_priv.h b/libxfs/libxfs_priv.h
index 0bf0c54ac..ecacfff82 100644
--- a/libxfs/libxfs_priv.h
+++ b/libxfs/libxfs_priv.h
@@ -225,6 +225,12 @@ static inline bool WARN_ON(bool expr) {
(inode)->i_version = (version); \
} while (0)
+struct inode;
+struct mnt_idmap;
+
+void inode_init_owner(struct mnt_idmap *idmap, struct inode *inode,
+ const struct inode *dir, umode_t mode);
+
#define __must_check __attribute__((__warn_unused_result__))
/*
@@ -639,4 +645,8 @@ int xfs_bmap_last_extent(struct xfs_trans *tp, struct xfs_inode *ip,
#define cond_resched() ((void)0)
+/* xfs_linux.h */
+#define irix_sgid_inherit (false)
+#define vfsgid_in_group_p(...) (false)
+
#endif /* __LIBXFS_INTERNAL_XFS_H__ */
* [PATCH 19/64] libxfs: remove libxfs_dir_ialloc
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (17 preceding siblings ...)
2024-10-02 1:12 ` [PATCH 18/64] libxfs: backport inode init code from the kernel Darrick J. Wong
@ 2024-10-02 1:12 ` Darrick J. Wong
2024-10-02 5:51 ` Christoph Hellwig
2024-10-02 1:13 ` [PATCH 20/64] libxfs: implement get_random_u32 Darrick J. Wong
` (44 subsequent siblings)
63 siblings, 1 reply; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:12 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
This function no longer exists in the kernel, and it's not really needed
in userspace either. There are three users of it: db, repair, and mkfs.
xfs_repair and xfs_db do not have useful cred and fsxattr structures so
they can call libxfs_dialloc and libxfs_icreate directly. For mkfs
we'll move the guts of libxfs_dir_ialloc into proto.c as a creatproto
function that handles setting user/group ids, and move struct cred to
mkfs since it's now the only user.
This gets us ready to hoist the rest of the inode initialization code to
libxfs for metadata directories.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
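A sketch of the direct two-call sequence that replaces the wrapper: pick an inode number, then initialise the inode. The `toy_*` functions and the allocator state are hypothetical stand-ins for libxfs_dialloc and libxfs_icreate; the sign flip at each call follows the pattern visible in the diff below, where the library result is negated before being handed to strerror().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical allocator state. */
struct toy_fs {
	unsigned long	next_ino;
	unsigned int	free_inodes;
};

struct toy_inode {
	unsigned long	ino;
	unsigned int	mode;
};

/* Library-style call: returns 0 or a negative errno. */
static int
toy_dialloc(struct toy_fs *fs, unsigned long *inop)
{
	if (fs->free_inodes == 0)
		return -ENOSPC;
	fs->free_inodes--;
	*inop = fs->next_ino++;
	return 0;
}

/* Library-style call: returns 0 or a negative errno. */
static int
toy_icreate(unsigned long ino, unsigned int mode, struct toy_inode *ip)
{
	ip->ino = ino;
	ip->mode = mode;
	return 0;
}

/* Tool-style caller: negates results so it can report positive errnos. */
static int
toy_create_file(struct toy_fs *fs, unsigned int mode, struct toy_inode *ip)
{
	unsigned long	ino;
	int		error;

	assert(fs != NULL && ip != NULL);

	error = -toy_dialloc(fs, &ino);
	if (error)
		return error;

	error = -toy_icreate(ino, mode, ip);
	if (error)
		return error;
	return 0;
}
```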
db/iunlink.c | 17 ++++++--
include/xfs_inode.h | 14 +------
libxfs/inode.c | 77 ------------------------------------
libxfs/libxfs_api_defs.h | 1
mkfs/proto.c | 98 +++++++++++++++++++++++++++++++++++++---------
repair/phase6.c | 60 +++++++++++++---------------
6 files changed, 125 insertions(+), 142 deletions(-)
diff --git a/db/iunlink.c b/db/iunlink.c
index 3163036e6..fcc824d9a 100644
--- a/db/iunlink.c
+++ b/db/iunlink.c
@@ -312,10 +312,14 @@ static int
create_unlinked(
struct xfs_mount *mp)
{
- struct cred cr = { };
- struct fsxattr fsx = { };
+ struct xfs_icreate_args args = {
+ .idmap = libxfs_nop_idmap,
+ .mode = S_IFREG | 0600,
+ .flags = XFS_ICREATE_TMPFILE,
+ };
struct xfs_inode *ip;
struct xfs_trans *tp;
+ xfs_ino_t ino;
unsigned int resblks;
int error;
@@ -327,8 +331,13 @@ create_unlinked(
return error;
}
- error = -libxfs_dir_ialloc(&tp, NULL, S_IFREG | 0600, 0, 0, &cr, &fsx,
- &ip);
+ error = -libxfs_dialloc(&tp, 0, args.mode, &ino);
+ if (error) {
+ dbprintf(_("alloc inode: %s\n"), strerror(error));
+ goto out_cancel;
+ }
+
+ error = -libxfs_icreate(tp, ino, &args, &ip);
if (error) {
dbprintf(_("create inode: %s\n"), strerror(error));
goto out_cancel;
diff --git a/include/xfs_inode.h b/include/xfs_inode.h
index d2f391ea8..1f9b07a53 100644
--- a/include/xfs_inode.h
+++ b/include/xfs_inode.h
@@ -399,17 +399,6 @@ static inline bool xfs_is_always_cow_inode(struct xfs_inode *ip)
return false;
}
-/* Always set the child's GID to this value, even if the parent is setgid. */
-#define CRED_FORCE_GID (1U << 0)
-struct cred {
- uid_t cr_uid;
- gid_t cr_gid;
- unsigned int cr_flags;
-};
-
-extern int libxfs_dir_ialloc (struct xfs_trans **, struct xfs_inode *,
- mode_t, nlink_t, xfs_dev_t, struct cred *,
- struct fsxattr *, struct xfs_inode **);
extern void libxfs_trans_inode_alloc_buf (struct xfs_trans *,
struct xfs_buf *);
@@ -419,6 +408,9 @@ extern int libxfs_iflush_int (struct xfs_inode *, struct xfs_buf *);
void libxfs_bumplink(struct xfs_trans *tp, struct xfs_inode *ip);
+int libxfs_icreate(struct xfs_trans *tp, xfs_ino_t ino,
+ const struct xfs_icreate_args *args, struct xfs_inode **ipp);
+
/* Inode Cache Interfaces */
extern int libxfs_iget(struct xfs_mount *, struct xfs_trans *, xfs_ino_t,
uint, struct xfs_inode **);
diff --git a/libxfs/inode.c b/libxfs/inode.c
index dda9b778d..eb71f90bc 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -212,7 +212,7 @@ xfs_inode_init(
* Initialise a newly allocated inode and return the in-core inode to the
* caller locked exclusively.
*/
-static int
+int
libxfs_icreate(
struct xfs_trans *tp,
xfs_ino_t ino,
@@ -302,81 +302,6 @@ libxfs_iflush_int(
return 0;
}
-/*
- * Wrapper around call to libxfs_ialloc. Takes care of committing and
- * allocating a new transaction as needed.
- *
- * Originally there were two copies of this code - one in mkfs, the
- * other in repair - now there is just the one.
- */
-int
-libxfs_dir_ialloc(
- struct xfs_trans **tpp,
- struct xfs_inode *dp,
- mode_t mode,
- nlink_t nlink,
- xfs_dev_t rdev,
- struct cred *cr,
- struct fsxattr *fsx,
- struct xfs_inode **ipp)
-{
- struct xfs_icreate_args args = {
- .pip = dp,
- .mode = mode,
- };
- struct xfs_inode *ip;
- struct inode *inode;
- xfs_ino_t parent_ino = dp ? dp->i_ino : 0;
- xfs_ino_t ino;
- int error;
-
- if (dp && xfs_has_parent(dp->i_mount))
- args.flags |= XFS_ICREATE_INIT_XATTRS;
-
- /* Only devices get rdev numbers */
- switch (mode & S_IFMT) {
- case S_IFCHR:
- case S_IFBLK:
- args.rdev = rdev;
- break;
- }
-
- /*
- * Call the space management code to pick the on-disk inode to be
- * allocated.
- */
- error = xfs_dialloc(tpp, parent_ino, mode, &ino);
- if (error)
- return error;
-
- error = libxfs_icreate(*tpp, ino, &args, &ip);
- if (error)
- return error;
-
- inode = VFS_I(ip);
- i_uid_write(inode, cr->cr_uid);
- if (cr->cr_flags & CRED_FORCE_GID)
- i_gid_write(inode, cr->cr_gid);
- set_nlink(inode, nlink);
-
- /* If there is no parent dir, initialize the file from fsxattr data. */
- if (dp == NULL) {
- ip->i_projid = fsx->fsx_projid;
- ip->i_extsize = fsx->fsx_extsize;
- ip->i_diflags = xfs_flags2diflags(ip, fsx->fsx_xflags);
-
- if (xfs_has_v3inodes(ip->i_mount)) {
- ip->i_diflags2 = xfs_flags2diflags2(ip,
- fsx->fsx_xflags);
- ip->i_cowextsize = fsx->fsx_cowextsize;
- }
- }
-
- xfs_trans_log_inode(*tpp, ip, XFS_ILOG_CORE);
- *ipp = ip;
- return 0;
-}
-
/*
* Inode cache stubs.
*/
diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
index a507904f2..903f7dc69 100644
--- a/libxfs/libxfs_api_defs.h
+++ b/libxfs/libxfs_api_defs.h
@@ -117,6 +117,7 @@
#define xfs_da_shrink_inode libxfs_da_shrink_inode
#define xfs_defer_cancel libxfs_defer_cancel
#define xfs_defer_finish libxfs_defer_finish
+#define xfs_dialloc libxfs_dialloc
#define xfs_dinode_calc_crc libxfs_dinode_calc_crc
#define xfs_dinode_good_version libxfs_dinode_good_version
#define xfs_dinode_verify libxfs_dinode_verify
diff --git a/mkfs/proto.c b/mkfs/proto.c
index 8e16eb150..58edc59f7 100644
--- a/mkfs/proto.c
+++ b/mkfs/proto.c
@@ -405,6 +405,70 @@ newpptr(
return ret;
}
+struct cred {
+ uid_t cr_uid;
+ gid_t cr_gid;
+};
+
+static int
+creatproto(
+ struct xfs_trans **tpp,
+ struct xfs_inode *dp,
+ mode_t mode,
+ xfs_dev_t rdev,
+ struct cred *cr,
+ struct fsxattr *fsx,
+ struct xfs_inode **ipp)
+{
+ struct xfs_icreate_args args = {
+ .idmap = libxfs_nop_idmap,
+ .pip = dp,
+ .rdev = rdev,
+ .mode = mode,
+ };
+ struct xfs_inode *ip;
+ struct inode *inode;
+ xfs_ino_t parent_ino = dp ? dp->i_ino : 0;
+ xfs_ino_t ino;
+ int error;
+
+ if (dp && xfs_has_parent(dp->i_mount))
+ args.flags |= XFS_ICREATE_INIT_XATTRS;
+
+ /*
+ * Call the space management code to pick the on-disk inode to be
+ * allocated.
+ */
+ error = -libxfs_dialloc(tpp, parent_ino, mode, &ino);
+ if (error)
+ return error;
+
+ error = -libxfs_icreate(*tpp, ino, &args, &ip);
+ if (error)
+ return error;
+
+ inode = VFS_I(ip);
+ i_uid_write(inode, cr->cr_uid);
+ i_gid_write(inode, cr->cr_gid);
+
+ /* If there is no parent dir, initialize the file from fsxattr data. */
+ if (dp == NULL) {
+ ip->i_projid = fsx->fsx_projid;
+ ip->i_extsize = fsx->fsx_extsize;
+ ip->i_diflags = xfs_flags2diflags(ip, fsx->fsx_xflags);
+
+ if (xfs_has_v3inodes(ip->i_mount)) {
+ ip->i_diflags2 = xfs_flags2diflags2(ip,
+ fsx->fsx_xflags);
+ ip->i_cowextsize = fsx->fsx_cowextsize;
+ }
+ }
+
+ libxfs_trans_log_inode(*tpp, ip, XFS_ILOG_CORE);
+ *ipp = ip;
+ return 0;
+}
+
static void
parseproto(
xfs_mount_t *mp,
@@ -505,7 +569,6 @@ parseproto(
mode |= val;
creds.cr_uid = (int)getnum(getstr(pp), 0, 0, false);
creds.cr_gid = (int)getnum(getstr(pp), 0, 0, false);
- creds.cr_flags = CRED_FORCE_GID;
xname.name = (unsigned char *)name;
xname.len = name ? strlen(name) : 0;
xname.type = 0;
@@ -515,8 +578,8 @@ parseproto(
buf = newregfile(pp, &len);
tp = getres(mp, XFS_B_TO_FSB(mp, len));
ppargs = newpptr(mp);
- error = -libxfs_dir_ialloc(&tp, pip, mode|S_IFREG, 1, 0,
- &creds, fsxp, &ip);
+ error = creatproto(&tp, pip, mode | S_IFREG, 0, &creds, fsxp,
+ &ip);
if (error)
fail(_("Inode allocation failed"), error);
writefile(tp, ip, buf, len);
@@ -539,8 +602,8 @@ parseproto(
}
tp = getres(mp, XFS_B_TO_FSB(mp, llen));
ppargs = newpptr(mp);
- error = -libxfs_dir_ialloc(&tp, pip, mode|S_IFREG, 1, 0,
- &creds, fsxp, &ip);
+ error = creatproto(&tp, pip, mode | S_IFREG, 0, &creds, fsxp,
+ &ip);
if (error)
fail(_("Inode pre-allocation failed"), error);
@@ -562,7 +625,7 @@ parseproto(
ppargs = newpptr(mp);
majdev = getnum(getstr(pp), 0, 0, false);
mindev = getnum(getstr(pp), 0, 0, false);
- error = -libxfs_dir_ialloc(&tp, pip, mode|S_IFBLK, 1,
+ error = creatproto(&tp, pip, mode | S_IFBLK,
IRIX_MKDEV(majdev, mindev), &creds, fsxp, &ip);
if (error) {
fail(_("Inode allocation failed"), error);
@@ -578,7 +641,7 @@ parseproto(
ppargs = newpptr(mp);
majdev = getnum(getstr(pp), 0, 0, false);
mindev = getnum(getstr(pp), 0, 0, false);
- error = -libxfs_dir_ialloc(&tp, pip, mode|S_IFCHR, 1,
+ error = creatproto(&tp, pip, mode | S_IFCHR,
IRIX_MKDEV(majdev, mindev), &creds, fsxp, &ip);
if (error)
fail(_("Inode allocation failed"), error);
@@ -591,8 +654,8 @@ parseproto(
case IF_FIFO:
tp = getres(mp, 0);
ppargs = newpptr(mp);
- error = -libxfs_dir_ialloc(&tp, pip, mode|S_IFIFO, 1, 0,
- &creds, fsxp, &ip);
+ error = creatproto(&tp, pip, mode | S_IFIFO, 0, &creds, fsxp,
+ &ip);
if (error)
fail(_("Inode allocation failed"), error);
libxfs_trans_ijoin(tp, pip, 0);
@@ -604,8 +667,8 @@ parseproto(
len = (int)strlen(buf);
tp = getres(mp, XFS_B_TO_FSB(mp, len));
ppargs = newpptr(mp);
- error = -libxfs_dir_ialloc(&tp, pip, mode|S_IFLNK, 1, 0,
- &creds, fsxp, &ip);
+ error = creatproto(&tp, pip, mode | S_IFLNK, 0, &creds, fsxp,
+ &ip);
if (error)
fail(_("Inode allocation failed"), error);
writesymlink(tp, ip, buf, len);
@@ -615,11 +678,10 @@ parseproto(
break;
case IF_DIRECTORY:
tp = getres(mp, 0);
- error = -libxfs_dir_ialloc(&tp, pip, mode|S_IFDIR, 1, 0,
- &creds, fsxp, &ip);
+ error = creatproto(&tp, pip, mode | S_IFDIR, 0, &creds, fsxp,
+ &ip);
if (error)
fail(_("Inode allocation failed"), error);
- libxfs_bumplink(tp, ip); /* account for . */
if (!pip) {
pip = ip;
mp->m_sb.sb_rootino = ip->i_ino;
@@ -714,14 +776,13 @@ rtinit(
memset(&creds, 0, sizeof(creds));
memset(&fsxattrs, 0, sizeof(fsxattrs));
- error = -libxfs_dir_ialloc(&tp, NULL, S_IFREG, 1, 0,
- &creds, &fsxattrs, &rbmip);
+ error = creatproto(&tp, NULL, S_IFREG, 0, &creds, &fsxattrs, &rbmip);
if (error) {
fail(_("Realtime bitmap inode allocation failed"), error);
}
/*
* Do our thing with rbmip before allocating rsumip,
- * because the next call to ialloc() may
+ * because the next call to creatproto may
* commit the transaction in which rbmip was allocated.
*/
mp->m_sb.sb_rbmino = rbmip->i_ino;
@@ -731,8 +792,7 @@ rtinit(
libxfs_trans_log_inode(tp, rbmip, XFS_ILOG_CORE);
libxfs_log_sb(tp);
mp->m_rbmip = rbmip;
- error = -libxfs_dir_ialloc(&tp, NULL, S_IFREG, 1, 0,
- &creds, &fsxattrs, &rsumip);
+ error = creatproto(&tp, NULL, S_IFREG, 0, &creds, &fsxattrs, &rsumip);
if (error) {
fail(_("Realtime summary inode allocation failed"), error);
}
diff --git a/repair/phase6.c b/repair/phase6.c
index ad067ba0a..7a5694284 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -20,8 +20,6 @@
#include "versions.h"
#include "repair/pptr.h"
-static struct cred zerocr;
-static struct fsxattr zerofsx;
static xfs_ino_t orphanage_ino;
/*
@@ -891,20 +889,27 @@ mk_root_dir(xfs_mount_t *mp)
* orphanage name == lost+found
*/
static xfs_ino_t
-mk_orphanage(xfs_mount_t *mp)
+mk_orphanage(
+ struct xfs_mount *mp)
{
- xfs_ino_t ino;
- xfs_trans_t *tp;
- xfs_inode_t *ip;
- xfs_inode_t *pip;
- ino_tree_node_t *irec;
- int ino_offset = 0;
- int i;
- int error;
- const int mode = 0755;
- int nres;
- struct xfs_name xname;
- struct xfs_parent_args *ppargs = NULL;
+ struct xfs_icreate_args args = {
+ .idmap = libxfs_nop_idmap,
+ .mode = S_IFDIR | 0755,
+ };
+ struct xfs_trans *tp;
+ struct xfs_inode *ip;
+ struct xfs_inode *pip;
+ struct ino_tree_node *irec;
+ xfs_ino_t ino;
+ int ino_offset = 0;
+ int i;
+ int error;
+ int nres;
+ struct xfs_name xname;
+ struct xfs_parent_args *ppargs = NULL;
+
+ if (xfs_has_parent(mp))
+ args.flags |= XFS_ICREATE_INIT_XATTRS;
i = -libxfs_parent_start(mp, &ppargs);
if (i)
@@ -922,6 +927,7 @@ mk_orphanage(xfs_mount_t *mp)
do_error(_("%d - couldn't iget root inode to obtain %s\n"),
i, ORPHANAGE);
+ args.pip = pip;
xname.name = (unsigned char *)ORPHANAGE;
xname.len = strlen(ORPHANAGE);
xname.type = XFS_DIR3_FT_DIR;
@@ -939,23 +945,15 @@ mk_orphanage(xfs_mount_t *mp)
if (i)
res_failed(i);
- /*
- * use iget/ijoin instead of trans_iget because the ialloc
- * wrapper can commit the transaction and start a new one
- */
-/* i = -libxfs_iget(mp, NULL, mp->m_sb.sb_rootino, 0, &pip);
- if (i)
- do_error(_("%d - couldn't iget root inode to make %s\n"),
- i, ORPHANAGE);*/
-
- error = -libxfs_dir_ialloc(&tp, pip, mode|S_IFDIR,
- 1, 0, &zerocr, &zerofsx, &ip);
- if (error) {
+ error = -libxfs_dialloc(&tp, mp->m_sb.sb_rootino, args.mode, &ino);
+ if (error)
do_error(_("%s inode allocation failed %d\n"),
ORPHANAGE, error);
- }
- libxfs_bumplink(tp, ip); /* account for . */
- ino = ip->i_ino;
+
+ error = -libxfs_icreate(tp, ino, &args, &ip);
+ if (error)
+ do_error(_("%s inode initialization failed %d\n"),
+ ORPHANAGE, error);
irec = find_inode_rec(mp,
XFS_INO_TO_AGNO(mp, ino),
@@ -3344,8 +3342,6 @@ phase6(xfs_mount_t *mp)
parent_ptr_init(mp);
- memset(&zerocr, 0, sizeof(struct cred));
- memset(&zerofsx, 0, sizeof(struct fsxattr));
orphanage_ino = 0;
do_log(_("Phase 6 - check inode connectivity...\n"));
* [PATCH 20/64] libxfs: implement get_random_u32
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (18 preceding siblings ...)
2024-10-02 1:12 ` [PATCH 19/64] libxfs: remove libxfs_dir_ialloc Darrick J. Wong
@ 2024-10-02 1:13 ` Darrick J. Wong
2024-10-02 5:51 ` Christoph Hellwig
2024-10-02 1:13 ` [PATCH 21/64] xfs: hoist new inode initialization functions to libxfs Darrick J. Wong
` (43 subsequent siblings)
63 siblings, 1 reply; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:13 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Actually query the kernel for some random bytes instead of returning
zero, if that's possible. The most noticeable effect of this is that
mkfs will now create the rtbitmap file, the rtsummary file, and children
of the root directory with a nonzero generation. Apparently xfsdump
requires that the root directory have a generation number of zero.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
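The fallback behaviour described in the commit message can be sketched
outside of libxfs. This is only a minimal sketch, assuming a Linux libc
that provides getrandom() and GRND_NONBLOCK in <sys/random.h>;
sketch_random_u32() and its "ok" out-parameter are hypothetical names
used for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/random.h>

/*
 * Hypothetical stand-in for get_random_u32(): ask the kernel for four
 * random bytes without blocking.  On an error or a short read, fall
 * back to returning zero, which matches the old always-zero behaviour.
 * *ok is set to 1 if the kernel supplied the bytes and 0 otherwise.
 */
uint32_t
sketch_random_u32(int *ok)
{
	uint32_t	ret = 0;
	ssize_t		sz;

	assert(ok != NULL);

	sz = getrandom(&ret, sizeof(ret), GRND_NONBLOCK);
	*ok = (sz == (ssize_t)sizeof(ret));
	return *ok ? ret : 0;
}
```

A caller cannot tell a failed read from a genuinely random zero by the
return value alone, which is why the sketch adds the "ok" flag; the
patch itself does not need one because zero is an acceptable
generation number.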
configure.ac | 1 +
include/builddefs.in | 1 +
libxfs/Makefile | 4 ++++
libxfs/libxfs_priv.h | 11 +++++++----
libxfs/util.c | 19 +++++++++++++++++++
m4/package_libcdev.m4 | 15 +++++++++++++++
mkfs/proto.c | 3 +++
7 files changed, 50 insertions(+), 4 deletions(-)
diff --git a/configure.ac b/configure.ac
index d021c519d..1c9fa8173 100644
--- a/configure.ac
+++ b/configure.ac
@@ -152,6 +152,7 @@ AC_HAVE_DEVMAPPER
AC_HAVE_MALLINFO
AC_HAVE_MALLINFO2
AC_HAVE_MEMFD_CREATE
+AC_HAVE_GETRANDOM_NONBLOCK
if test "$enable_scrub" = "yes"; then
if test "$enable_libicu" = "yes" || test "$enable_libicu" = "probe"; then
AC_HAVE_LIBICU
diff --git a/include/builddefs.in b/include/builddefs.in
index 07c4a43f7..c8c7de7fd 100644
--- a/include/builddefs.in
+++ b/include/builddefs.in
@@ -102,6 +102,7 @@ HAVE_DEVMAPPER = @have_devmapper@
HAVE_MALLINFO = @have_mallinfo@
HAVE_MALLINFO2 = @have_mallinfo2@
HAVE_MEMFD_CREATE = @have_memfd_create@
+HAVE_GETRANDOM_NONBLOCK = @have_getrandom_nonblock@
HAVE_LIBICU = @have_libicu@
HAVE_SYSTEMD = @have_systemd@
SYSTEMD_SYSTEM_UNIT_DIR = @systemd_system_unit_dir@
diff --git a/libxfs/Makefile b/libxfs/Makefile
index 8c93d7b53..fd623cf40 100644
--- a/libxfs/Makefile
+++ b/libxfs/Makefile
@@ -135,6 +135,10 @@ ifeq ($(HAVE_MEMFD_CREATE),yes)
LCFLAGS += -DHAVE_MEMFD_CREATE
endif
+ifeq ($(HAVE_GETRANDOM_NONBLOCK),yes)
+LCFLAGS += -DHAVE_GETRANDOM_NONBLOCK
+endif
+
FCFLAGS = -I.
LTLIBS = $(LIBPTHREAD) $(LIBRT)
diff --git a/libxfs/libxfs_priv.h b/libxfs/libxfs_priv.h
index ecacfff82..8dd364b0d 100644
--- a/libxfs/libxfs_priv.h
+++ b/libxfs/libxfs_priv.h
@@ -63,6 +63,9 @@
#include "libfrog/crc32c.h"
#include <sys/xattr.h>
+#ifdef HAVE_GETRANDOM_NONBLOCK
+#include <sys/random.h>
+#endif
/* Zones used in libxfs allocations that aren't in shared header files */
extern struct kmem_cache *xfs_buf_item_cache;
@@ -212,11 +215,11 @@ static inline bool WARN_ON(bool expr) {
#define percpu_counter_read_positive(x) ((*x) > 0 ? (*x) : 0)
#define percpu_counter_sum_positive(x) ((*x) > 0 ? (*x) : 0)
-/*
- * get_random_u32 is used for di_gen inode allocation, it must be zero for
- * libxfs or all sorts of badness can occur!
- */
+#ifdef HAVE_GETRANDOM_NONBLOCK
+uint32_t get_random_u32(void);
+#else
#define get_random_u32() (0)
+#endif
#define PAGE_SIZE getpagesize()
diff --git a/libxfs/util.c b/libxfs/util.c
index 7aa92c0e4..a3f3ad299 100644
--- a/libxfs/util.c
+++ b/libxfs/util.c
@@ -462,3 +462,22 @@ void xfs_dirattr_mark_sick(struct xfs_inode *ip, int whichfork) { }
void xfs_da_mark_sick(struct xfs_da_args *args) { }
void xfs_inode_mark_sick(struct xfs_inode *ip, unsigned int mask) { }
void xfs_rt_mark_sick(struct xfs_mount *mp, unsigned int mask) { }
+
+#ifdef HAVE_GETRANDOM_NONBLOCK
+uint32_t
+get_random_u32(void)
+{
+ uint32_t ret;
+ ssize_t sz;
+
+ /*
+ * Try to extract a u32 of randomness from /dev/urandom. If that
+ * fails, fall back to returning zero like we used to do.
+ */
+ sz = getrandom(&ret, sizeof(ret), GRND_NONBLOCK);
+ if (sz != sizeof(ret))
+ return 0;
+
+ return ret;
+}
+#endif
diff --git a/m4/package_libcdev.m4 b/m4/package_libcdev.m4
index 6de8b33ee..13cb5156d 100644
--- a/m4/package_libcdev.m4
+++ b/m4/package_libcdev.m4
@@ -195,6 +195,21 @@ memfd_create(0, 0);
AC_SUBST(have_memfd_create)
])
+#
+# Check if we have a getrandom syscall with a GRND_NONBLOCK flag
+#
+AC_DEFUN([AC_HAVE_GETRANDOM_NONBLOCK],
+ [ AC_MSG_CHECKING([for getrandom and GRND_NONBLOCK])
+ AC_LINK_IFELSE([AC_LANG_PROGRAM([[
+#include <sys/random.h>
+ ]], [[
+ unsigned int moo;
+ return getrandom(&moo, sizeof(moo), GRND_NONBLOCK);
+ ]])],[have_getrandom_nonblock=yes
+ AC_MSG_RESULT(yes)],[AC_MSG_RESULT(no)])
+ AC_SUBST(have_getrandom_nonblock)
+ ])
+
AC_DEFUN([AC_PACKAGE_CHECK_LTO],
[ AC_MSG_CHECKING([if C compiler supports LTO])
OLD_CFLAGS="$CFLAGS"
diff --git a/mkfs/proto.c b/mkfs/proto.c
index 58edc59f7..96cb9f854 100644
--- a/mkfs/proto.c
+++ b/mkfs/proto.c
@@ -462,6 +462,9 @@ creatproto(
fsx->fsx_xflags);
ip->i_cowextsize = fsx->fsx_cowextsize;
}
+
+ /* xfsdump breaks if the root dir has a nonzero generation */
+ inode->i_generation = 0;
}
libxfs_trans_log_inode(*tpp, ip, XFS_ILOG_CORE);
* [PATCH 21/64] xfs: hoist new inode initialization functions to libxfs
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (19 preceding siblings ...)
2024-10-02 1:13 ` [PATCH 20/64] libxfs: implement get_random_u32 Darrick J. Wong
@ 2024-10-02 1:13 ` Darrick J. Wong
2024-10-02 1:13 ` [PATCH 22/64] xfs: hoist xfs_iunlink " Darrick J. Wong
` (42 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:13 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: e9d2b35bb9d3ff372fad27998fc3969ced3f563d
Move all the code that initializes a new inode's attributes from the
icreate_args structure and the parent directory into libxfs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
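The hoisted xfs_inode_inherit_flags() copies the extent size hint from
the parent and then validates it, clearing the hint if validation
fails. The toy model below sketches only that inherit-then-validate
pattern. Every name in it (the SK_* flags, struct sk_inode,
sk_extsize_ok, sk_inherit) is hypothetical, and the validator is a
deliberately simplified assumption rather than the real
xfs_inode_validate_extsize().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_FLAG_REALTIME	(1u << 0)
#define SK_FLAG_EXTSIZE		(1u << 1)
#define SK_FLAG_RTINHERIT	(1u << 2)
#define SK_FLAG_EXTSZINHERIT	(1u << 3)

/* Toy inode: just enough state to show hint inheritance. */
struct sk_inode {
	bool		is_dir;
	uint32_t	flags;
	uint32_t	extsize;
};

/*
 * Simplified validator (an assumption): a hint must be nonzero when a
 * hint flag is set, zero when none is set, and a multiple of the rt
 * extent size on realtime files.
 */
bool
sk_extsize_ok(const struct sk_inode *ip, uint32_t rextsize)
{
	assert(rextsize != 0);

	if (!(ip->flags & (SK_FLAG_EXTSIZE | SK_FLAG_EXTSZINHERIT)))
		return ip->extsize == 0;
	if (ip->extsize == 0)
		return false;
	if ((ip->flags & SK_FLAG_REALTIME) && ip->extsize % rextsize != 0)
		return false;
	return true;
}

/* Copy hints from the parent, then drop them if they do not validate. */
void
sk_inherit(struct sk_inode *ip, const struct sk_inode *pip, uint32_t rextsize)
{
	uint32_t	flags = 0;

	if (ip->is_dir) {
		if (pip->flags & SK_FLAG_RTINHERIT)
			flags |= SK_FLAG_RTINHERIT;
		if (pip->flags & SK_FLAG_EXTSZINHERIT) {
			flags |= SK_FLAG_EXTSZINHERIT;
			ip->extsize = pip->extsize;
		}
	} else {
		if (pip->flags & SK_FLAG_RTINHERIT)
			flags |= SK_FLAG_REALTIME;
		if (pip->flags & SK_FLAG_EXTSZINHERIT) {
			flags |= SK_FLAG_EXTSIZE;
			ip->extsize = pip->extsize;
		}
	}
	ip->flags |= flags;

	/* Don't let a misaligned hint propagate into the new inode. */
	if (!sk_extsize_ok(ip, rextsize)) {
		ip->flags &= ~(SK_FLAG_EXTSIZE | SK_FLAG_EXTSZINHERIT);
		ip->extsize = 0;
	}
}
```

With an rt extent size of 4, a parent directory carrying a hint of 5
yields a realtime child with no hint at all, while a hint of 8 is
inherited unchanged.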
include/xfs_inode.h | 6 +
libxfs/inode.c | 161 ------------------------------------
libxfs/xfs_inode_util.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_inode_util.h | 12 +++
libxfs/xfs_shared.h | 8 --
repair/phase6.c | 3 -
6 files changed, 231 insertions(+), 170 deletions(-)
diff --git a/include/xfs_inode.h b/include/xfs_inode.h
index 1f9b07a53..7ce6f0183 100644
--- a/include/xfs_inode.h
+++ b/include/xfs_inode.h
@@ -420,4 +420,10 @@ extern void libxfs_irele(struct xfs_inode *ip);
#define XFS_INHERIT_GID(pip) (VFS_I(pip)->i_mode & S_ISGID)
+#define xfs_inherit_noatime (false)
+#define xfs_inherit_nodump (false)
+#define xfs_inherit_sync (false)
+#define xfs_inherit_nosymlinks (false)
+#define xfs_inherit_nodefrag (false)
+
#endif /* __XFS_INODE_H__ */
diff --git a/libxfs/inode.c b/libxfs/inode.c
index eb71f90bc..61068078a 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -29,50 +29,6 @@
#include "xfs_da_btree.h"
#include "xfs_dir2_priv.h"
-/* Propagate di_flags from a parent inode to a child inode. */
-static void
-xfs_inode_inherit_flags(
- struct xfs_inode *ip,
- const struct xfs_inode *pip)
-{
- unsigned int di_flags = 0;
- umode_t mode = VFS_I(ip)->i_mode;
-
- if ((mode & S_IFMT) == S_IFDIR) {
- if (pip->i_diflags & XFS_DIFLAG_RTINHERIT)
- di_flags |= XFS_DIFLAG_RTINHERIT;
- if (pip->i_diflags & XFS_DIFLAG_EXTSZINHERIT) {
- di_flags |= XFS_DIFLAG_EXTSZINHERIT;
- ip->i_extsize = pip->i_extsize;
- }
- } else {
- if ((pip->i_diflags & XFS_DIFLAG_RTINHERIT) &&
- xfs_has_realtime(ip->i_mount))
- di_flags |= XFS_DIFLAG_REALTIME;
- if (pip->i_diflags & XFS_DIFLAG_EXTSZINHERIT) {
- di_flags |= XFS_DIFLAG_EXTSIZE;
- ip->i_extsize = pip->i_extsize;
- }
- }
- if (pip->i_diflags & XFS_DIFLAG_PROJINHERIT)
- di_flags |= XFS_DIFLAG_PROJINHERIT;
- ip->i_diflags |= di_flags;
-}
-
-/* Propagate di_flags2 from a parent inode to a child inode. */
-static void
-xfs_inode_inherit_flags2(
- struct xfs_inode *ip,
- const struct xfs_inode *pip)
-{
- if (pip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) {
- ip->i_diflags2 |= XFS_DIFLAG2_COWEXTSIZE;
- ip->i_cowextsize = pip->i_cowextsize;
- }
- if (pip->i_diflags2 & XFS_DIFLAG2_DAX)
- ip->i_diflags2 |= XFS_DIFLAG2_DAX;
-}
-
/*
* Increment the link count on an inode & log the change.
*/
@@ -91,123 +47,6 @@ libxfs_bumplink(
xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
}
-/* Initialise an inode's attributes. */
-static void
-xfs_inode_init(
- struct xfs_trans *tp,
- const struct xfs_icreate_args *args,
- struct xfs_inode *ip)
-{
- struct xfs_mount *mp = tp->t_mountp;
- struct xfs_inode *pip = args->pip;
- struct inode *dir = pip ? VFS_I(pip) : NULL;
- struct inode *inode = VFS_I(ip);
- unsigned int flags;
- int times = XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG |
- XFS_ICHGTIME_ACCESS;
-
- if (args->flags & XFS_ICREATE_TMPFILE)
- set_nlink(inode, 0);
- else if (S_ISDIR(args->mode))
- set_nlink(inode, 2);
- else
- set_nlink(inode, 1);
- inode->i_rdev = args->rdev;
-
- if (!args->idmap || pip == NULL) {
- /* creating a tree root, sb rooted, or detached file */
- inode->i_uid = GLOBAL_ROOT_UID;
- inode->i_gid = GLOBAL_ROOT_GID;
- ip->i_projid = 0;
- inode->i_mode = args->mode;
- } else {
- /* creating a child in the directory tree */
- if (dir && !(dir->i_mode & S_ISGID) && xfs_has_grpid(mp)) {
- inode_fsuid_set(inode, args->idmap);
- inode->i_gid = dir->i_gid;
- inode->i_mode = args->mode;
- } else {
- inode_init_owner(args->idmap, inode, dir, args->mode);
- }
-
- /*
- * If the group ID of the new file does not match the effective
- * group ID or one of the supplementary group IDs, the S_ISGID
- * bit is cleared (and only if the irix_sgid_inherit
- * compatibility variable is set).
- */
- if (irix_sgid_inherit && (inode->i_mode & S_ISGID) &&
- !vfsgid_in_group_p(i_gid_into_vfsgid(args->idmap, inode)))
- inode->i_mode &= ~S_ISGID;
-
- ip->i_projid = pip ? xfs_get_initial_prid(pip) : 0;
- }
-
- ip->i_disk_size = 0;
- ip->i_df.if_nextents = 0;
- ASSERT(ip->i_nblocks == 0);
-
- ip->i_extsize = 0;
- ip->i_diflags = 0;
-
- if (xfs_has_v3inodes(mp)) {
- inode_set_iversion(inode, 1);
- ip->i_cowextsize = 0;
- times |= XFS_ICHGTIME_CREATE;
- }
-
- xfs_trans_ichgtime(tp, ip, times);
-
- flags = XFS_ILOG_CORE;
- switch (args->mode & S_IFMT) {
- case S_IFIFO:
- case S_IFSOCK:
- case S_IFCHR:
- case S_IFBLK:
- ip->i_df.if_format = XFS_DINODE_FMT_DEV;
- flags |= XFS_ILOG_DEV;
- break;
- case S_IFREG:
- case S_IFDIR:
- if (pip && (pip->i_diflags & XFS_DIFLAG_ANY))
- xfs_inode_inherit_flags(ip, pip);
- if (pip && (pip->i_diflags2 & XFS_DIFLAG2_ANY))
- xfs_inode_inherit_flags2(ip, pip);
- fallthrough;
- case S_IFLNK:
- ip->i_df.if_format = XFS_DINODE_FMT_EXTENTS;
- ip->i_df.if_bytes = 0;
- ip->i_df.if_data = NULL;
- break;
- default:
- ASSERT(0);
- }
-
- /*
- * If we need to create attributes immediately after allocating the
- * inode, initialise an empty attribute fork right now. We use the
- * default fork offset for attributes here as we don't know exactly what
- * size or how many attributes we might be adding. We can do this
- * safely here because we know the data fork is completely empty and
- * this saves us from needing to run a separate transaction to set the
- * fork offset in the immediate future.
- */
- if ((args->flags & XFS_ICREATE_INIT_XATTRS) &&
- (xfs_has_attr(tp->t_mountp) || xfs_has_attr2(tp->t_mountp))) {
- ip->i_forkoff = xfs_default_attroffset(ip) >> 3;
- xfs_ifork_init_attr(ip, XFS_DINODE_FMT_EXTENTS, 0);
-
- if (!xfs_has_attr(mp)) {
- spin_lock(&mp->m_sb_lock);
- xfs_add_attr(mp);
- spin_unlock(&mp->m_sb_lock);
- xfs_log_sb(tp);
- }
- }
-
- xfs_trans_log_inode(tp, ip, flags);
-}
-
/*
* Initialise a newly allocated inode and return the in-core inode to the
* caller locked exclusively.
diff --git a/libxfs/xfs_inode_util.c b/libxfs/xfs_inode_util.c
index 0a9ea03e2..633c7616c 100644
--- a/libxfs/xfs_inode_util.c
+++ b/libxfs/xfs_inode_util.c
@@ -13,6 +13,10 @@
#include "xfs_mount.h"
#include "xfs_inode.h"
#include "xfs_inode_util.h"
+#include "xfs_trans.h"
+#include "xfs_ialloc.h"
+#include "xfs_health.h"
+#include "xfs_bmap.h"
uint16_t
xfs_flags2diflags(
@@ -132,3 +136,210 @@ xfs_get_initial_prid(struct xfs_inode *dp)
/* Assign to the root project by default. */
return 0;
}
+
+/* Propagate di_flags from a parent inode to a child inode. */
+static inline void
+xfs_inode_inherit_flags(
+ struct xfs_inode *ip,
+ const struct xfs_inode *pip)
+{
+ unsigned int di_flags = 0;
+ xfs_failaddr_t failaddr;
+ umode_t mode = VFS_I(ip)->i_mode;
+
+ if (S_ISDIR(mode)) {
+ if (pip->i_diflags & XFS_DIFLAG_RTINHERIT)
+ di_flags |= XFS_DIFLAG_RTINHERIT;
+ if (pip->i_diflags & XFS_DIFLAG_EXTSZINHERIT) {
+ di_flags |= XFS_DIFLAG_EXTSZINHERIT;
+ ip->i_extsize = pip->i_extsize;
+ }
+ if (pip->i_diflags & XFS_DIFLAG_PROJINHERIT)
+ di_flags |= XFS_DIFLAG_PROJINHERIT;
+ } else if (S_ISREG(mode)) {
+ if ((pip->i_diflags & XFS_DIFLAG_RTINHERIT) &&
+ xfs_has_realtime(ip->i_mount))
+ di_flags |= XFS_DIFLAG_REALTIME;
+ if (pip->i_diflags & XFS_DIFLAG_EXTSZINHERIT) {
+ di_flags |= XFS_DIFLAG_EXTSIZE;
+ ip->i_extsize = pip->i_extsize;
+ }
+ }
+ if ((pip->i_diflags & XFS_DIFLAG_NOATIME) &&
+ xfs_inherit_noatime)
+ di_flags |= XFS_DIFLAG_NOATIME;
+ if ((pip->i_diflags & XFS_DIFLAG_NODUMP) &&
+ xfs_inherit_nodump)
+ di_flags |= XFS_DIFLAG_NODUMP;
+ if ((pip->i_diflags & XFS_DIFLAG_SYNC) &&
+ xfs_inherit_sync)
+ di_flags |= XFS_DIFLAG_SYNC;
+ if ((pip->i_diflags & XFS_DIFLAG_NOSYMLINKS) &&
+ xfs_inherit_nosymlinks)
+ di_flags |= XFS_DIFLAG_NOSYMLINKS;
+ if ((pip->i_diflags & XFS_DIFLAG_NODEFRAG) &&
+ xfs_inherit_nodefrag)
+ di_flags |= XFS_DIFLAG_NODEFRAG;
+ if (pip->i_diflags & XFS_DIFLAG_FILESTREAM)
+ di_flags |= XFS_DIFLAG_FILESTREAM;
+
+ ip->i_diflags |= di_flags;
+
+ /*
+ * Inode verifiers on older kernels only check that the extent size
+ * hint is an integer multiple of the rt extent size on realtime files.
+ * They did not check the hint alignment on a directory with both
+ * rtinherit and extszinherit flags set. If the misaligned hint is
+ * propagated from a directory into a new realtime file, new file
+ * allocations will fail due to math errors in the rt allocator and/or
+ * trip the verifiers. Validate the hint settings in the new file so
+ * that we don't let broken hints propagate.
+ */
+ failaddr = xfs_inode_validate_extsize(ip->i_mount, ip->i_extsize,
+ VFS_I(ip)->i_mode, ip->i_diflags);
+ if (failaddr) {
+ ip->i_diflags &= ~(XFS_DIFLAG_EXTSIZE |
+ XFS_DIFLAG_EXTSZINHERIT);
+ ip->i_extsize = 0;
+ }
+}
+
+/* Propagate di_flags2 from a parent inode to a child inode. */
+static inline void
+xfs_inode_inherit_flags2(
+ struct xfs_inode *ip,
+ const struct xfs_inode *pip)
+{
+ xfs_failaddr_t failaddr;
+
+ if (pip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) {
+ ip->i_diflags2 |= XFS_DIFLAG2_COWEXTSIZE;
+ ip->i_cowextsize = pip->i_cowextsize;
+ }
+ if (pip->i_diflags2 & XFS_DIFLAG2_DAX)
+ ip->i_diflags2 |= XFS_DIFLAG2_DAX;
+
+ /* Don't let invalid cowextsize hints propagate. */
+ failaddr = xfs_inode_validate_cowextsize(ip->i_mount, ip->i_cowextsize,
+ VFS_I(ip)->i_mode, ip->i_diflags, ip->i_diflags2);
+ if (failaddr) {
+ ip->i_diflags2 &= ~XFS_DIFLAG2_COWEXTSIZE;
+ ip->i_cowextsize = 0;
+ }
+}
+
+/* Initialise an inode's attributes. */
+void
+xfs_inode_init(
+ struct xfs_trans *tp,
+ const struct xfs_icreate_args *args,
+ struct xfs_inode *ip)
+{
+ struct xfs_inode *pip = args->pip;
+ struct inode *dir = pip ? VFS_I(pip) : NULL;
+ struct xfs_mount *mp = tp->t_mountp;
+ struct inode *inode = VFS_I(ip);
+ unsigned int flags;
+ int times = XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG |
+ XFS_ICHGTIME_ACCESS;
+
+ if (args->flags & XFS_ICREATE_TMPFILE)
+ set_nlink(inode, 0);
+ else if (S_ISDIR(args->mode))
+ set_nlink(inode, 2);
+ else
+ set_nlink(inode, 1);
+ inode->i_rdev = args->rdev;
+
+ if (!args->idmap || pip == NULL) {
+ /* creating a tree root, sb rooted, or detached file */
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
+ ip->i_projid = 0;
+ inode->i_mode = args->mode;
+ } else {
+ /* creating a child in the directory tree */
+ if (dir && !(dir->i_mode & S_ISGID) && xfs_has_grpid(mp)) {
+ inode_fsuid_set(inode, args->idmap);
+ inode->i_gid = dir->i_gid;
+ inode->i_mode = args->mode;
+ } else {
+ inode_init_owner(args->idmap, inode, dir, args->mode);
+ }
+
+ /*
+ * If the group ID of the new file does not match the effective
+ * group ID or one of the supplementary group IDs, the S_ISGID
+ * bit is cleared (and only if the irix_sgid_inherit
+ * compatibility variable is set).
+ */
+ if (irix_sgid_inherit && (inode->i_mode & S_ISGID) &&
+ !vfsgid_in_group_p(i_gid_into_vfsgid(args->idmap, inode)))
+ inode->i_mode &= ~S_ISGID;
+
+ ip->i_projid = pip ? xfs_get_initial_prid(pip) : 0;
+ }
+
+ ip->i_disk_size = 0;
+ ip->i_df.if_nextents = 0;
+ ASSERT(ip->i_nblocks == 0);
+
+ ip->i_extsize = 0;
+ ip->i_diflags = 0;
+
+ if (xfs_has_v3inodes(mp)) {
+ inode_set_iversion(inode, 1);
+ ip->i_cowextsize = 0;
+ times |= XFS_ICHGTIME_CREATE;
+ }
+
+ xfs_trans_ichgtime(tp, ip, times);
+
+ flags = XFS_ILOG_CORE;
+ switch (args->mode & S_IFMT) {
+ case S_IFIFO:
+ case S_IFCHR:
+ case S_IFBLK:
+ case S_IFSOCK:
+ ip->i_df.if_format = XFS_DINODE_FMT_DEV;
+ flags |= XFS_ILOG_DEV;
+ break;
+ case S_IFREG:
+ case S_IFDIR:
+ if (pip && (pip->i_diflags & XFS_DIFLAG_ANY))
+ xfs_inode_inherit_flags(ip, pip);
+ if (pip && (pip->i_diflags2 & XFS_DIFLAG2_ANY))
+ xfs_inode_inherit_flags2(ip, pip);
+ fallthrough;
+ case S_IFLNK:
+ ip->i_df.if_format = XFS_DINODE_FMT_EXTENTS;
+ ip->i_df.if_bytes = 0;
+ ip->i_df.if_data = NULL;
+ break;
+ default:
+ ASSERT(0);
+ }
+
+ /*
+ * If we need to create attributes immediately after allocating the
+ * inode, initialise an empty attribute fork right now. We use the
+ * default fork offset for attributes here as we don't know exactly what
+ * size or how many attributes we might be adding. We can do this
+ * safely here because we know the data fork is completely empty and
+ * this saves us from needing to run a separate transaction to set the
+ * fork offset in the immediate future.
+ */
+ if (args->flags & XFS_ICREATE_INIT_XATTRS) {
+ ip->i_forkoff = xfs_default_attroffset(ip) >> 3;
+ xfs_ifork_init_attr(ip, XFS_DINODE_FMT_EXTENTS, 0);
+
+ if (!xfs_has_attr(mp)) {
+ spin_lock(&mp->m_sb_lock);
+ xfs_add_attr(mp);
+ spin_unlock(&mp->m_sb_lock);
+ xfs_log_sb(tp);
+ }
+ }
+
+ xfs_trans_log_inode(tp, ip, flags);
+}
diff --git a/libxfs/xfs_inode_util.h b/libxfs/xfs_inode_util.h
index 9226482fd..bf5393db4 100644
--- a/libxfs/xfs_inode_util.h
+++ b/libxfs/xfs_inode_util.h
@@ -35,4 +35,16 @@ struct xfs_icreate_args {
uint16_t flags;
};
+/*
+ * Flags for xfs_trans_ichgtime().
+ */
+#define XFS_ICHGTIME_MOD 0x1 /* data fork modification timestamp */
+#define XFS_ICHGTIME_CHG 0x2 /* inode field change timestamp */
+#define XFS_ICHGTIME_CREATE 0x4 /* inode create timestamp */
+#define XFS_ICHGTIME_ACCESS 0x8 /* last access timestamp */
+void xfs_trans_ichgtime(struct xfs_trans *tp, struct xfs_inode *ip, int flags);
+
+void xfs_inode_init(struct xfs_trans *tp, const struct xfs_icreate_args *args,
+ struct xfs_inode *ip);
+
#endif /* __XFS_INODE_UTIL_H__ */
diff --git a/libxfs/xfs_shared.h b/libxfs/xfs_shared.h
index 9a705381f..2f7413afb 100644
--- a/libxfs/xfs_shared.h
+++ b/libxfs/xfs_shared.h
@@ -177,14 +177,6 @@ void xfs_log_get_max_trans_res(struct xfs_mount *mp,
#define XFS_REFC_BTREE_REF 1
#define XFS_SSB_REF 0
-/*
- * Flags for xfs_trans_ichgtime().
- */
-#define XFS_ICHGTIME_MOD 0x1 /* data fork modification timestamp */
-#define XFS_ICHGTIME_CHG 0x2 /* inode field change timestamp */
-#define XFS_ICHGTIME_CREATE 0x4 /* inode create timestamp */
-#define XFS_ICHGTIME_ACCESS 0x8 /* last access timestamp */
-
/* Computed inode geometry for the filesystem. */
struct xfs_ino_geometry {
/* Maximum inode count in this filesystem. */
diff --git a/repair/phase6.c b/repair/phase6.c
index 7a5694284..52e42d4c0 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -842,7 +842,8 @@ mk_root_dir(xfs_mount_t *mp)
}
/*
- * take care of the core -- initialization from xfs_ialloc()
+ * take care of the core since we didn't call the libxfs ialloc function
+ * (comment changed to avoid tangling xfs/437)
*/
reset_inode_fields(ip);
* [PATCH 22/64] xfs: hoist xfs_iunlink to libxfs
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (20 preceding siblings ...)
2024-10-02 1:13 ` [PATCH 21/64] xfs: hoist new inode initialization functions to libxfs Darrick J. Wong
@ 2024-10-02 1:13 ` Darrick J. Wong
2024-10-02 1:13 ` [PATCH 23/64] xfs: hoist xfs_{bump,drop}link " Darrick J. Wong
` (41 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:13 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: b8a6107921ca799330ff3efdd154b7fa0ff54582
Move xfs_iunlink and xfs_iunlink_remove to libxfs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
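The "In-Core Unlinked List Lookups" comment in xfs_inode_util.c
describes a doubly linked list that stores inode numbers (aginos)
rather than pointers and resolves them through a cache lookup. The toy
model below is a minimal sketch of that idea, not the libxfs
implementation: struct sk_ag, the fixed-size array standing in for the
inode cache, and all sk_* function names are hypothetical, and no
logging or locking is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_NULLAGINO	UINT32_MAX
#define SK_NBUCKETS	64u	/* unlinked buckets per AGI */
#define SK_NINODES	256u	/* toy cache size, an assumption */

struct sk_inode {
	uint32_t	next_unlinked;
	uint32_t	prev_unlinked;
};

struct sk_ag {
	uint32_t	bucket[SK_NBUCKETS];
	struct sk_inode	cache[SK_NINODES];
};

void sk_ag_init(struct sk_ag *ag)
{
	uint32_t	i;

	assert(ag != NULL);
	for (i = 0; i < SK_NBUCKETS; i++)
		ag->bucket[i] = SK_NULLAGINO;
	for (i = 0; i < SK_NINODES; i++) {
		ag->cache[i].next_unlinked = SK_NULLAGINO;
		ag->cache[i].prev_unlinked = SK_NULLAGINO;
	}
}

/* Resolve an agino to its in-core inode, or NULL if not cached. */
struct sk_inode *sk_lookup(struct sk_ag *ag, uint32_t agino)
{
	return agino < SK_NINODES ? &ag->cache[agino] : NULL;
}

/* Insert an inode at the head of its bucket list. */
int sk_iunlink(struct sk_ag *ag, uint32_t agino)
{
	struct sk_inode	*ip = sk_lookup(ag, agino);
	struct sk_inode	*next;
	uint32_t	*head;

	if (!ip)
		return -1;
	head = &ag->bucket[agino % SK_NBUCKETS];
	if (*head == agino)
		return -1;	/* already at the head: corrupt list */
	if (*head != SK_NULLAGINO) {
		next = sk_lookup(ag, *head);
		if (!next)
			return -1;
		next->prev_unlinked = agino;
	}
	ip->next_unlinked = *head;
	ip->prev_unlinked = SK_NULLAGINO;
	*head = agino;
	return 0;
}

/* Remove an inode without walking the list, using the back pointer. */
int sk_iunlink_remove(struct sk_ag *ag, uint32_t agino)
{
	struct sk_inode	*ip = sk_lookup(ag, agino);
	struct sk_inode	*other;
	uint32_t	*head;

	if (!ip)
		return -1;
	head = &ag->bucket[agino % SK_NBUCKETS];
	if (ip->next_unlinked != SK_NULLAGINO) {
		other = sk_lookup(ag, ip->next_unlinked);
		if (!other)
			return -1;
		other->prev_unlinked = ip->prev_unlinked;
	}
	if (ip->prev_unlinked == SK_NULLAGINO) {
		if (*head != agino)
			return -1;	/* not on this list */
		*head = ip->next_unlinked;
	} else {
		other = sk_lookup(ag, ip->prev_unlinked);
		if (!other)
			return -1;
		other->next_unlinked = ip->next_unlinked;
	}
	ip->next_unlinked = SK_NULLAGINO;
	ip->prev_unlinked = SK_NULLAGINO;
	return 0;
}
```

Storing two 32-bit aginos per inode gives the 8 bytes of overhead the
comment mentions, and removal from the middle of a bucket touches only
the two neighbours instead of walking the chain from the bucket head.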
include/xfs_inode.h | 1
include/xfs_trace.h | 6 +
libxfs/Makefile | 2
libxfs/inode.c | 2
libxfs/iunlink.c | 163 +++++++++++++++++++++++++++
libxfs/iunlink.h | 24 ++++
libxfs/libxfs_priv.h | 2
libxfs/xfs_inode_util.c | 281 +++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_inode_util.h | 4 +
9 files changed, 485 insertions(+)
create mode 100644 libxfs/iunlink.c
create mode 100644 libxfs/iunlink.h
diff --git a/include/xfs_inode.h b/include/xfs_inode.h
index 7ce6f0183..19aaa78f3 100644
--- a/include/xfs_inode.h
+++ b/include/xfs_inode.h
@@ -237,6 +237,7 @@ typedef struct xfs_inode {
/* unlinked list pointers */
xfs_agino_t i_next_unlinked;
+ xfs_agino_t i_prev_unlinked;
xfs_extnum_t i_cnextents; /* # of extents in cow fork */
unsigned int i_cformat; /* format of cow fork */
diff --git a/include/xfs_trace.h b/include/xfs_trace.h
index fe0854b20..812fbb38e 100644
--- a/include/xfs_trace.h
+++ b/include/xfs_trace.h
@@ -361,4 +361,10 @@
#define trace_xlog_intent_recovery_failed(...) ((void) 0)
+#define trace_xfs_iunlink_update_bucket(...) ((void) 0)
+#define trace_xfs_iunlink_update_dinode(...) ((void) 0)
+#define trace_xfs_iunlink(...) ((void) 0)
+#define trace_xfs_iunlink_reload_next(...) ((void) 0)
+#define trace_xfs_iunlink_remove(...) ((void) 0)
+
#endif /* __TRACE_H__ */
diff --git a/libxfs/Makefile b/libxfs/Makefile
index fd623cf40..72e287b8b 100644
--- a/libxfs/Makefile
+++ b/libxfs/Makefile
@@ -25,6 +25,7 @@ HFILES = \
libxfs_api_defs.h \
listxattr.h \
init.h \
+ iunlink.h \
libxfs_priv.h \
linux-err.h \
topology.h \
@@ -71,6 +72,7 @@ CFILES = buf_mem.c \
defer_item.c \
init.c \
inode.c \
+ iunlink.c \
kmem.c \
listxattr.c \
logitem.c \
diff --git a/libxfs/inode.c b/libxfs/inode.c
index 61068078a..20b9c483a 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -173,6 +173,8 @@ libxfs_iget(
ip->i_mount = mp;
ip->i_diflags2 = mp->m_ino_geo.new_diflags2;
ip->i_af.if_format = XFS_DINODE_FMT_EXTENTS;
+ ip->i_next_unlinked = NULLAGINO;
+ ip->i_prev_unlinked = NULLAGINO;
spin_lock_init(&VFS_I(ip)->i_lock);
pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
diff --git a/libxfs/iunlink.c b/libxfs/iunlink.c
new file mode 100644
index 000000000..6d0554535
--- /dev/null
+++ b/libxfs/iunlink.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2020-2022, Red Hat, Inc.
+ * All Rights Reserved.
+ */
+
+#include "libxfs_priv.h"
+#include "libxfs.h"
+#include "libxfs_io.h"
+#include "init.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_inode.h"
+#include "xfs_trans.h"
+#include "xfs_ag.h"
+#include "iunlink.h"
+#include "xfs_trace.h"
+
+/* in memory log item structure */
+struct xfs_iunlink_item {
+ struct xfs_inode *ip;
+ struct xfs_perag *pag;
+ xfs_agino_t next_agino;
+ xfs_agino_t old_agino;
+};
+
+/*
+ * Look up the inode cluster buffer and log the on-disk unlinked inode change
+ * we need to make.
+ */
+static int
+xfs_iunlink_log_dinode(
+ struct xfs_trans *tp,
+ struct xfs_iunlink_item *iup)
+{
+ struct xfs_mount *mp = tp->t_mountp;
+ struct xfs_inode *ip = iup->ip;
+ struct xfs_dinode *dip;
+ struct xfs_buf *ibp;
+ int offset;
+ int error;
+
+ error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &ibp);
+ if (error)
+ return error;
+ /*
+ * Don't log the unlinked field on stale buffers as this may be the
+ * transaction that frees the inode cluster and relogging the buffer
+ * here will incorrectly remove the stale state.
+ */
+ if (ibp->b_flags & LIBXFS_B_STALE)
+ goto out;
+
+ dip = xfs_buf_offset(ibp, ip->i_imap.im_boffset);
+
+ /* Make sure the old pointer isn't garbage. */
+ if (be32_to_cpu(dip->di_next_unlinked) != iup->old_agino) {
+ xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
+ sizeof(*dip), __this_address);
+ error = -EFSCORRUPTED;
+ goto out;
+ }
+
+ trace_xfs_iunlink_update_dinode(mp, iup->pag->pag_agno,
+ XFS_INO_TO_AGINO(mp, ip->i_ino),
+ be32_to_cpu(dip->di_next_unlinked), iup->next_agino);
+
+ dip->di_next_unlinked = cpu_to_be32(iup->next_agino);
+ offset = ip->i_imap.im_boffset +
+ offsetof(struct xfs_dinode, di_next_unlinked);
+
+ xfs_dinode_calc_crc(mp, dip);
+ xfs_trans_inode_buf(tp, ibp);
+ xfs_trans_log_buf(tp, ibp, offset, offset + sizeof(xfs_agino_t) - 1);
+ return 0;
+out:
+ xfs_trans_brelse(tp, ibp);
+ return error;
+}
+
+/*
+ * Initialize the inode log item for a newly allocated (in-core) inode.
+ *
+ * Inode extents can only reside within an AG. Hence specify the starting
+ * block for the inode chunk by offset within an AG as well as the
+ * length of the allocated extent.
+ *
+ * This joins the item to the transaction and marks it dirty so
+ * that we don't need a separate call to do this, nor does the
+ * caller need to know anything about the iunlink item.
+ */
+int
+xfs_iunlink_log_inode(
+ struct xfs_trans *tp,
+ struct xfs_inode *ip,
+ struct xfs_perag *pag,
+ xfs_agino_t next_agino)
+{
+ struct xfs_iunlink_item iup = {
+ .ip = ip,
+ .pag = pag,
+ .next_agino = next_agino,
+ .old_agino = ip->i_next_unlinked,
+ };
+
+ ASSERT(xfs_verify_agino_or_null(pag, next_agino));
+ ASSERT(xfs_verify_agino_or_null(pag, ip->i_next_unlinked));
+
+ /*
+ * Since we're updating a linked list, we should never find that the
+ * current pointer is the same as the new value, unless we're
+ * terminating the list.
+ */
+ if (ip->i_next_unlinked == next_agino) {
+ if (next_agino != NULLAGINO)
+ return -EFSCORRUPTED;
+ return 0;
+ }
+
+ return xfs_iunlink_log_dinode(tp, &iup);
+}
+
+/*
+ * Load the inode @next_agino into the cache and set its prev_unlinked pointer
+ * to @prev_agino. Caller must hold the AGI to synchronize with other changes
+ * to the unlinked list.
+ */
+int
+xfs_iunlink_reload_next(
+ struct xfs_trans *tp,
+ struct xfs_buf *agibp,
+ xfs_agino_t prev_agino,
+ xfs_agino_t next_agino)
+{
+ struct xfs_perag *pag = agibp->b_pag;
+ struct xfs_mount *mp = pag->pag_mount;
+ struct xfs_inode *next_ip = NULL;
+ xfs_ino_t ino;
+ int error;
+
+ ASSERT(next_agino != NULLAGINO);
+
+ ino = XFS_AGINO_TO_INO(mp, pag->pag_agno, next_agino);
+ error = libxfs_iget(mp, tp, ino, XFS_IGET_UNTRUSTED, &next_ip);
+ if (error)
+ return error;
+
+ /* If this is not an unlinked inode, something is very wrong. */
+ if (VFS_I(next_ip)->i_nlink != 0) {
+ error = -EFSCORRUPTED;
+ goto rele;
+ }
+
+ next_ip->i_prev_unlinked = prev_agino;
+ trace_xfs_iunlink_reload_next(next_ip);
+rele:
+ xfs_irele(next_ip);
+ return error;
+}
diff --git a/libxfs/iunlink.h b/libxfs/iunlink.h
new file mode 100644
index 000000000..8d8032cf9
--- /dev/null
+++ b/libxfs/iunlink.h
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2020-2022, Red Hat, Inc.
+ * All Rights Reserved.
+ */
+#ifndef XFS_IUNLINK_ITEM_H
+#define XFS_IUNLINK_ITEM_H 1
+
+struct xfs_trans;
+struct xfs_inode;
+struct xfs_perag;
+
+static inline struct xfs_inode *
+xfs_iunlink_lookup(struct xfs_perag *pag, xfs_agino_t agino)
+{
+ return NULL;
+}
+
+int xfs_iunlink_log_inode(struct xfs_trans *tp, struct xfs_inode *ip,
+ struct xfs_perag *pag, xfs_agino_t next_agino);
+int xfs_iunlink_reload_next(struct xfs_trans *tp, struct xfs_buf *agibp,
+ xfs_agino_t prev_agino, xfs_agino_t next_agino);
+
+#endif /* XFS_IUNLINK_ITEM_H */
diff --git a/libxfs/libxfs_priv.h b/libxfs/libxfs_priv.h
index 8dd364b0d..a77524dfd 100644
--- a/libxfs/libxfs_priv.h
+++ b/libxfs/libxfs_priv.h
@@ -482,6 +482,8 @@ xfs_buf_readahead(
#define xfs_filestream_new_ag(ip,ag) (0)
#define xfs_filestream_select_ag(...) (-ENOSYS)
+#define xfs_trans_inode_buf(tp, bp) ((void) 0)
+
/* quota bits */
#define xfs_trans_mod_dquot_byino(t,i,f,d) ({ \
uint _f = (f); \
diff --git a/libxfs/xfs_inode_util.c b/libxfs/xfs_inode_util.c
index 633c7616c..2d7e970d7 100644
--- a/libxfs/xfs_inode_util.c
+++ b/libxfs/xfs_inode_util.c
@@ -17,6 +17,9 @@
#include "xfs_ialloc.h"
#include "xfs_health.h"
#include "xfs_bmap.h"
+#include "xfs_trace.h"
+#include "xfs_ag.h"
+#include "iunlink.h"
uint16_t
xfs_flags2diflags(
@@ -343,3 +346,281 @@ xfs_inode_init(
xfs_trans_log_inode(tp, ip, flags);
}
+
+/*
+ * In-Core Unlinked List Lookups
+ * =============================
+ *
+ * Every inode is supposed to be reachable from some other piece of metadata
+ * with the exception of the root directory. Inodes with a connection to a
+ * file descriptor but not linked from anywhere in the on-disk directory tree
+ * are collectively known as unlinked inodes, though the filesystem itself
+ * maintains links to these inodes so that on-disk metadata are consistent.
+ *
+ * XFS implements a per-AG on-disk hash table of unlinked inodes. The AGI
+ * header contains a number of buckets that point to an inode, and each inode
+ * record has a pointer to the next inode in the hash chain. This
+ * singly-linked list causes scaling problems in the iunlink remove function
+ * because we must walk that list to find the inode that points to the inode
+ * being removed from the unlinked hash bucket list.
+ *
+ * Hence we keep an in-memory double linked list to link each inode on an
+ * unlinked list. Because there are 64 unlinked lists per AGI, keeping pointer
+ * based lists would require having 64 list heads in the perag, one for each
+ * list. This is expensive in terms of memory (think millions of AGs) and cache
+ * misses on lookups. Instead, use the fact that inodes on the unlinked list
+ * must be referenced at the VFS level to keep them on the list and hence we
+ * have an existence guarantee for inodes on the unlinked list.
+ *
+ * Given we have an existence guarantee, we can use lockless inode cache lookups
+ * to resolve aginos to xfs inodes. This means we only need 8 bytes per inode
+ * for the double linked unlinked list, and we don't need any extra locking to
+ * keep the list safe as all manipulations are done under the AGI buffer lock.
+ * Keeping the list up to date does not require memory allocation, just finding
+ * the XFS inode and updating the next/prev unlinked list aginos.
+ */
+
+/*
+ * Update the prev pointer of the next agino. Returns -ENOLINK if the inode
+ * is not in cache.
+ */
+static int
+xfs_iunlink_update_backref(
+ struct xfs_perag *pag,
+ xfs_agino_t prev_agino,
+ xfs_agino_t next_agino)
+{
+ struct xfs_inode *ip;
+
+ /* No update necessary if we are at the end of the list. */
+ if (next_agino == NULLAGINO)
+ return 0;
+
+ ip = xfs_iunlink_lookup(pag, next_agino);
+ if (!ip)
+ return -ENOLINK;
+
+ ip->i_prev_unlinked = prev_agino;
+ return 0;
+}
+
+/*
+ * Point the AGI unlinked bucket at an inode and log the results. The caller
+ * is responsible for validating the old value.
+ */
+STATIC int
+xfs_iunlink_update_bucket(
+ struct xfs_trans *tp,
+ struct xfs_perag *pag,
+ struct xfs_buf *agibp,
+ unsigned int bucket_index,
+ xfs_agino_t new_agino)
+{
+ struct xfs_agi *agi = agibp->b_addr;
+ xfs_agino_t old_value;
+ int offset;
+
+ ASSERT(xfs_verify_agino_or_null(pag, new_agino));
+
+ old_value = be32_to_cpu(agi->agi_unlinked[bucket_index]);
+ trace_xfs_iunlink_update_bucket(tp->t_mountp, pag->pag_agno, bucket_index,
+ old_value, new_agino);
+
+ /*
+ * We should never find the head of the list already set to the value
+ * passed in because either we're adding or removing ourselves from the
+ * head of the list.
+ */
+ if (old_value == new_agino) {
+ xfs_buf_mark_corrupt(agibp);
+ xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI);
+ return -EFSCORRUPTED;
+ }
+
+ agi->agi_unlinked[bucket_index] = cpu_to_be32(new_agino);
+ offset = offsetof(struct xfs_agi, agi_unlinked) +
+ (sizeof(xfs_agino_t) * bucket_index);
+ xfs_trans_log_buf(tp, agibp, offset, offset + sizeof(xfs_agino_t) - 1);
+ return 0;
+}
+
+static int
+xfs_iunlink_insert_inode(
+ struct xfs_trans *tp,
+ struct xfs_perag *pag,
+ struct xfs_buf *agibp,
+ struct xfs_inode *ip)
+{
+ struct xfs_mount *mp = tp->t_mountp;
+ struct xfs_agi *agi = agibp->b_addr;
+ xfs_agino_t next_agino;
+ xfs_agino_t agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
+ short bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS;
+ int error;
+
+ /*
+ * Get the index into the agi hash table for the list this inode will
+ * go on. Make sure the pointer isn't garbage and that this inode
+ * isn't already on the list.
+ */
+ next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
+ if (next_agino == agino ||
+ !xfs_verify_agino_or_null(pag, next_agino)) {
+ xfs_buf_mark_corrupt(agibp);
+ xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI);
+ return -EFSCORRUPTED;
+ }
+
+ /*
+ * Update the prev pointer in the next inode to point back to this
+ * inode.
+ */
+ error = xfs_iunlink_update_backref(pag, agino, next_agino);
+ if (error == -ENOLINK)
+ error = xfs_iunlink_reload_next(tp, agibp, agino, next_agino);
+ if (error)
+ return error;
+
+ if (next_agino != NULLAGINO) {
+ /*
+ * There is already another inode in the bucket, so point this
+ * inode to the current head of the list.
+ */
+ error = xfs_iunlink_log_inode(tp, ip, pag, next_agino);
+ if (error)
+ return error;
+ ip->i_next_unlinked = next_agino;
+ }
+
+ /* Point the head of the list to point to this inode. */
+ ip->i_prev_unlinked = NULLAGINO;
+ return xfs_iunlink_update_bucket(tp, pag, agibp, bucket_index, agino);
+}
+
+/*
+ * This is called when the inode's link count has gone to 0 or we are creating
+ * a tmpfile via O_TMPFILE. The inode @ip must have nlink == 0.
+ *
+ * We place the on-disk inode on a list in the AGI. It will be pulled from this
+ * list when the inode is freed.
+ */
+int
+xfs_iunlink(
+ struct xfs_trans *tp,
+ struct xfs_inode *ip)
+{
+ struct xfs_mount *mp = tp->t_mountp;
+ struct xfs_perag *pag;
+ struct xfs_buf *agibp;
+ int error;
+
+ ASSERT(VFS_I(ip)->i_nlink == 0);
+ ASSERT(VFS_I(ip)->i_mode != 0);
+ trace_xfs_iunlink(ip);
+
+ pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
+
+ /* Get the agi buffer first. It ensures lock ordering on the list. */
+ error = xfs_read_agi(pag, tp, 0, &agibp);
+ if (error)
+ goto out;
+
+ error = xfs_iunlink_insert_inode(tp, pag, agibp, ip);
+out:
+ xfs_perag_put(pag);
+ return error;
+}
+
+static int
+xfs_iunlink_remove_inode(
+ struct xfs_trans *tp,
+ struct xfs_perag *pag,
+ struct xfs_buf *agibp,
+ struct xfs_inode *ip)
+{
+ struct xfs_mount *mp = tp->t_mountp;
+ struct xfs_agi *agi = agibp->b_addr;
+ xfs_agino_t agino = XFS_INO_TO_AGINO(mp, ip->i_ino);
+ xfs_agino_t head_agino;
+ short bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS;
+ int error;
+
+ trace_xfs_iunlink_remove(ip);
+
+ /*
+ * Get the index into the agi hash table for the list this inode will
+ * go on. Make sure the head pointer isn't garbage.
+ */
+ head_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
+ if (!xfs_verify_agino(pag, head_agino)) {
+ XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp,
+ agi, sizeof(*agi));
+ xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI);
+ return -EFSCORRUPTED;
+ }
+
+ /*
+ * Set our inode's next_unlinked pointer to NULL and then return
+ * the old pointer value so that we can update whatever was previous
+ * to us in the list to point to whatever was next in the list.
+ */
+ error = xfs_iunlink_log_inode(tp, ip, pag, NULLAGINO);
+ if (error)
+ return error;
+
+ /*
+ * Update the prev pointer in the next inode to point back to previous
+ * inode in the chain.
+ */
+ error = xfs_iunlink_update_backref(pag, ip->i_prev_unlinked,
+ ip->i_next_unlinked);
+ if (error == -ENOLINK)
+ error = xfs_iunlink_reload_next(tp, agibp, ip->i_prev_unlinked,
+ ip->i_next_unlinked);
+ if (error)
+ return error;
+
+ if (head_agino != agino) {
+ struct xfs_inode *prev_ip;
+
+ prev_ip = xfs_iunlink_lookup(pag, ip->i_prev_unlinked);
+ if (!prev_ip) {
+ xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
+ return -EFSCORRUPTED;
+ }
+
+ error = xfs_iunlink_log_inode(tp, prev_ip, pag,
+ ip->i_next_unlinked);
+ prev_ip->i_next_unlinked = ip->i_next_unlinked;
+ } else {
+ /* Point the head of the list to the next unlinked inode. */
+ error = xfs_iunlink_update_bucket(tp, pag, agibp, bucket_index,
+ ip->i_next_unlinked);
+ }
+
+ ip->i_next_unlinked = NULLAGINO;
+ ip->i_prev_unlinked = 0;
+ return error;
+}
+
+/*
+ * Pull the on-disk inode from the AGI unlinked list.
+ */
+int
+xfs_iunlink_remove(
+ struct xfs_trans *tp,
+ struct xfs_perag *pag,
+ struct xfs_inode *ip)
+{
+ struct xfs_buf *agibp;
+ int error;
+
+ trace_xfs_iunlink_remove(ip);
+
+ /* Get the agi buffer first. It ensures lock ordering on the list. */
+ error = xfs_read_agi(pag, tp, 0, &agibp);
+ if (error)
+ return error;
+
+ return xfs_iunlink_remove_inode(tp, pag, agibp, ip);
+}
diff --git a/libxfs/xfs_inode_util.h b/libxfs/xfs_inode_util.h
index bf5393db4..42a032afe 100644
--- a/libxfs/xfs_inode_util.h
+++ b/libxfs/xfs_inode_util.h
@@ -47,4 +47,8 @@ void xfs_trans_ichgtime(struct xfs_trans *tp, struct xfs_inode *ip, int flags);
void xfs_inode_init(struct xfs_trans *tp, const struct xfs_icreate_args *args,
struct xfs_inode *ip);
+int xfs_iunlink(struct xfs_trans *tp, struct xfs_inode *ip);
+int xfs_iunlink_remove(struct xfs_trans *tp, struct xfs_perag *pag,
+ struct xfs_inode *ip);
+
#endif /* __XFS_INODE_UTIL_H__ */
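For readers following along, the in-core unlinked list described in the comment block above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the libxfs code: every toy_* name is made up, the inode cache lookup is replaced by plain array indexing, the "no inode" sentinel is assumed to be all ones, and there is no logging, locking, or corruption reporting. It only shows why keeping a prev pointer makes removal O(1) instead of a walk down the bucket chain.

```c
#include <stdint.h>

#define TOY_BUCKETS	64u		/* the comment above cites 64 lists per AGI */
#define TOY_NULLAGINO	UINT32_MAX	/* assumed "no inode" sentinel */
#define TOY_NINODES	256u		/* arbitrary size for the toy inode table */

/* Hypothetical stand-in for the per-inode next/prev unlinked aginos. */
struct toy_inode {
	uint32_t	next_unlinked;
	uint32_t	prev_unlinked;
};

/* Hypothetical stand-in for the AGI buckets plus an "inode cache". */
struct toy_ag {
	uint32_t		bucket[TOY_BUCKETS];
	struct toy_inode	inodes[TOY_NINODES];
};

void toy_ag_init(struct toy_ag *ag)
{
	uint32_t i;

	for (i = 0; i < TOY_BUCKETS; i++)
		ag->bucket[i] = TOY_NULLAGINO;
	for (i = 0; i < TOY_NINODES; i++) {
		ag->inodes[i].next_unlinked = TOY_NULLAGINO;
		ag->inodes[i].prev_unlinked = TOY_NULLAGINO;
	}
}

/* Push agino onto the head of its bucket; -1 if it is already the head. */
int toy_iunlink_insert(struct toy_ag *ag, uint32_t agino)
{
	uint32_t b, head;

	if (agino >= TOY_NINODES)
		return -1;
	b = agino % TOY_BUCKETS;
	head = ag->bucket[b];
	if (head == agino)
		return -1;

	/* The old head (if any) now points back at the new head. */
	if (head != TOY_NULLAGINO)
		ag->inodes[head].prev_unlinked = agino;
	ag->inodes[agino].next_unlinked = head;
	ag->inodes[agino].prev_unlinked = TOY_NULLAGINO;
	ag->bucket[b] = agino;
	return 0;
}

/* Unlink agino without walking the chain; -1 if it is not on a list. */
int toy_iunlink_remove(struct toy_ag *ag, uint32_t agino)
{
	struct toy_inode *ip;
	uint32_t b, next, prev;

	if (agino >= TOY_NINODES)
		return -1;
	ip = &ag->inodes[agino];
	b = agino % TOY_BUCKETS;
	next = ip->next_unlinked;
	prev = ip->prev_unlinked;

	if (ag->bucket[b] == agino)
		ag->bucket[b] = next;		/* removing the head */
	else if (prev != TOY_NULLAGINO)
		ag->inodes[prev].next_unlinked = next;
	else
		return -1;

	/* Fix up the back reference of whatever followed us. */
	if (next != TOY_NULLAGINO)
		ag->inodes[next].prev_unlinked = prev;
	ip->next_unlinked = TOY_NULLAGINO;
	ip->prev_unlinked = TOY_NULLAGINO;
	return 0;
}
```

In this model, aginos 1, 65 and 129 all hash to bucket 1, so inserting them in that order leaves 129 at the head, and removing 65 from the middle touches only its two neighbours.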
* [PATCH 23/64] xfs: hoist xfs_{bump,drop}link to libxfs
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (21 preceding siblings ...)
2024-10-02 1:13 ` [PATCH 22/64] xfs: hoist xfs_iunlink " Darrick J. Wong
@ 2024-10-02 1:13 ` Darrick J. Wong
2024-10-02 1:14 ` [PATCH 24/64] xfs: separate the icreate logic around INIT_XATTRS Darrick J. Wong
` (40 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:13 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: a9e583d34facc64b6edf3c9afb2ff4891038176d
Move xfs_bumplink and xfs_droplink to libxfs.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
include/xfs_inode.h | 2 --
libxfs/inode.c | 18 ----------------
libxfs/libxfs_priv.h | 1 +
libxfs/xfs_inode_util.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_inode_util.h | 2 ++
5 files changed, 56 insertions(+), 20 deletions(-)
diff --git a/include/xfs_inode.h b/include/xfs_inode.h
index 19aaa78f3..170cc5288 100644
--- a/include/xfs_inode.h
+++ b/include/xfs_inode.h
@@ -407,8 +407,6 @@ extern void libxfs_trans_ichgtime(struct xfs_trans *,
struct xfs_inode *, int);
extern int libxfs_iflush_int (struct xfs_inode *, struct xfs_buf *);
-void libxfs_bumplink(struct xfs_trans *tp, struct xfs_inode *ip);
-
int libxfs_icreate(struct xfs_trans *tp, xfs_ino_t ino,
const struct xfs_icreate_args *args, struct xfs_inode **ipp);
diff --git a/libxfs/inode.c b/libxfs/inode.c
index 20b9c483a..2062ecf54 100644
--- a/libxfs/inode.c
+++ b/libxfs/inode.c
@@ -29,24 +29,6 @@
#include "xfs_da_btree.h"
#include "xfs_dir2_priv.h"
-/*
- * Increment the link count on an inode & log the change.
- */
-void
-libxfs_bumplink(
- struct xfs_trans *tp,
- struct xfs_inode *ip)
-{
- struct inode *inode = VFS_I(ip);
-
- xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
-
- if (inode->i_nlink != XFS_NLINK_PINNED)
- inc_nlink(inode);
-
- xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
-}
-
/*
* Initialise a newly allocated inode and return the in-core inode to the
* caller locked exclusively.
diff --git a/libxfs/libxfs_priv.h b/libxfs/libxfs_priv.h
index a77524dfd..b720cc5fa 100644
--- a/libxfs/libxfs_priv.h
+++ b/libxfs/libxfs_priv.h
@@ -135,6 +135,7 @@ extern void cmn_err(int, char *, ...);
enum ce { CE_DEBUG, CE_CONT, CE_NOTE, CE_WARN, CE_ALERT, CE_PANIC };
#define xfs_info(mp,fmt,args...) cmn_err(CE_CONT, _(fmt), ## args)
+#define xfs_info_ratelimited(mp,fmt,args...) cmn_err(CE_CONT, _(fmt), ## args)
#define xfs_notice(mp,fmt,args...) cmn_err(CE_NOTE, _(fmt), ## args)
#define xfs_warn(mp,fmt,args...) cmn_err((mp) ? CE_WARN : CE_WARN, _(fmt), ## args)
#define xfs_err(mp,fmt,args...) cmn_err(CE_ALERT, _(fmt), ## args)
diff --git a/libxfs/xfs_inode_util.c b/libxfs/xfs_inode_util.c
index 2d7e970d7..62af002b2 100644
--- a/libxfs/xfs_inode_util.c
+++ b/libxfs/xfs_inode_util.c
@@ -624,3 +624,56 @@ xfs_iunlink_remove(
return xfs_iunlink_remove_inode(tp, pag, agibp, ip);
}
+
+/*
+ * Decrement the link count on an inode & log the change. If this causes the
+ * link count to go to zero, move the inode to AGI unlinked list so that it can
+ * be freed when the last active reference goes away via xfs_inactive().
+ */
+int
+xfs_droplink(
+ struct xfs_trans *tp,
+ struct xfs_inode *ip)
+{
+ struct inode *inode = VFS_I(ip);
+
+ xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
+
+ if (inode->i_nlink == 0) {
+ xfs_info_ratelimited(tp->t_mountp,
+ "Inode 0x%llx link count dropped below zero. Pinning link count.",
+ ip->i_ino);
+ set_nlink(inode, XFS_NLINK_PINNED);
+ }
+ if (inode->i_nlink != XFS_NLINK_PINNED)
+ drop_nlink(inode);
+
+ xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+
+ if (inode->i_nlink)
+ return 0;
+
+ return xfs_iunlink(tp, ip);
+}
+
+/*
+ * Increment the link count on an inode & log the change.
+ */
+void
+xfs_bumplink(
+ struct xfs_trans *tp,
+ struct xfs_inode *ip)
+{
+ struct inode *inode = VFS_I(ip);
+
+ xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
+
+ if (inode->i_nlink == XFS_NLINK_PINNED - 1)
+ xfs_info_ratelimited(tp->t_mountp,
+ "Inode 0x%llx link count exceeded maximum. Pinning link count.",
+ ip->i_ino);
+ if (inode->i_nlink != XFS_NLINK_PINNED)
+ inc_nlink(inode);
+
+ xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+}
diff --git a/libxfs/xfs_inode_util.h b/libxfs/xfs_inode_util.h
index 42a032afe..50c14ba6c 100644
--- a/libxfs/xfs_inode_util.h
+++ b/libxfs/xfs_inode_util.h
@@ -50,5 +50,7 @@ void xfs_inode_init(struct xfs_trans *tp, const struct xfs_icreate_args *args,
int xfs_iunlink(struct xfs_trans *tp, struct xfs_inode *ip);
int xfs_iunlink_remove(struct xfs_trans *tp, struct xfs_perag *pag,
struct xfs_inode *ip);
+int xfs_droplink(struct xfs_trans *tp, struct xfs_inode *ip);
+void xfs_bumplink(struct xfs_trans *tp, struct xfs_inode *ip);
#endif /* __XFS_INODE_UTIL_H__ */
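The link count pinning done by xfs_droplink and xfs_bumplink in the hunk above can be sketched on a bare counter. This is a minimal model, not the libxfs code: the toy_* names are hypothetical, the pinned sentinel is assumed to be the all-ones value, and timestamps, logging and the ratelimited message are left out.

```c
#include <stdint.h>

#define TOY_NLINK_PINNED	UINT32_MAX	/* assumed value of the pinned sentinel */

/*
 * Drop one link.  Returns 1 if the count reached zero, which is the point
 * where the real code would put the inode on the unlinked list.
 */
int toy_droplink(uint32_t *nlink)
{
	/* An underflow attempt pins the count instead of wrapping. */
	if (*nlink == 0)
		*nlink = TOY_NLINK_PINNED;
	if (*nlink != TOY_NLINK_PINNED)
		(*nlink)--;
	return *nlink == 0;
}

/* Add one link; a pinned count stays pinned. */
void toy_bumplink(uint32_t *nlink)
{
	if (*nlink != TOY_NLINK_PINNED)
		(*nlink)++;
}
```

Once the counter reaches the pinned value, neither helper changes it again, so a pinned inode never reports a zero link count in this model.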
* [PATCH 24/64] xfs: separate the icreate logic around INIT_XATTRS
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (22 preceding siblings ...)
2024-10-02 1:13 ` [PATCH 23/64] xfs: hoist xfs_{bump,drop}link " Darrick J. Wong
@ 2024-10-02 1:14 ` Darrick J. Wong
2024-10-02 1:14 ` [PATCH 25/64] xfs: create libxfs helper to link a new inode into a directory Darrick J. Wong
` (39 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:14 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: b11b11e3b7a72606cfef527255a9467537bcaaa5
INIT_XATTRS is overloaded here -- it's set during the creat process when
we think that we're immediately going to set some ACL xattrs to save
time. However, it's also used by the parent pointers code to enable the
attr fork in preparation to receive pptr xattrs. This results in
xfs_has_parent() branches scattered around the codebase to turn on
INIT_XATTRS.
Linkable files are created far more commonly than unlinkable temporary
files or directory tree roots, so we should centralize this logic in
xfs_inode_init. For the three callers that don't want parent pointers
(online repair tempfiles, unlinkable tempfiles, rootdir creation) we
provide an UNLINKABLE flag to skip attr fork initialization.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_inode_util.c | 36 ++++++++++++++++++++++++++----------
libxfs/xfs_inode_util.h | 1 +
2 files changed, 27 insertions(+), 10 deletions(-)
diff --git a/libxfs/xfs_inode_util.c b/libxfs/xfs_inode_util.c
index 62af002b2..13c32d114 100644
--- a/libxfs/xfs_inode_util.c
+++ b/libxfs/xfs_inode_util.c
@@ -231,6 +231,31 @@ xfs_inode_inherit_flags2(
}
}
+/*
+ * If we need to create attributes immediately after allocating the inode,
+ * initialise an empty attribute fork right now. We use the default fork offset
+ * for attributes here as we don't know exactly what size or how many
+ * attributes we might be adding. We can do this safely here because we know
+ * the data fork is completely empty and this saves us from needing to run a
+ * separate transaction to set the fork offset in the immediate future.
+ *
+ * If we have parent pointers and the caller hasn't told us that the file will
+ * never be linked into a directory tree, we /must/ create the attr fork.
+ */
+static inline bool
+xfs_icreate_want_attrfork(
+ struct xfs_mount *mp,
+ const struct xfs_icreate_args *args)
+{
+ if (args->flags & XFS_ICREATE_INIT_XATTRS)
+ return true;
+
+ if (!(args->flags & XFS_ICREATE_UNLINKABLE) && xfs_has_parent(mp))
+ return true;
+
+ return false;
+}
+
/* Initialise an inode's attributes. */
void
xfs_inode_init(
@@ -323,16 +348,7 @@ xfs_inode_init(
ASSERT(0);
}
- /*
- * If we need to create attributes immediately after allocating the
- * inode, initialise an empty attribute fork right now. We use the
- * default fork offset for attributes here as we don't know exactly what
- * size or how many attributes we might be adding. We can do this
- * safely here because we know the data fork is completely empty and
- * this saves us from needing to run a separate transaction to set the
- * fork offset in the immediate future.
- */
- if (args->flags & XFS_ICREATE_INIT_XATTRS) {
+ if (xfs_icreate_want_attrfork(mp, args)) {
ip->i_forkoff = xfs_default_attroffset(ip) >> 3;
xfs_ifork_init_attr(ip, XFS_DINODE_FMT_EXTENTS, 0);
diff --git a/libxfs/xfs_inode_util.h b/libxfs/xfs_inode_util.h
index 50c14ba6c..1c54c3b0c 100644
--- a/libxfs/xfs_inode_util.h
+++ b/libxfs/xfs_inode_util.h
@@ -32,6 +32,7 @@ struct xfs_icreate_args {
#define XFS_ICREATE_TMPFILE (1U << 0) /* create an unlinked file */
#define XFS_ICREATE_INIT_XATTRS (1U << 1) /* will set xattrs immediately */
+#define XFS_ICREATE_UNLINKABLE (1U << 2) /* cannot link into dir tree */
uint16_t flags;
};
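The decision made by xfs_icreate_want_attrfork in this patch reduces to a small truth table over two flags and one filesystem feature. The sketch below restates it in standalone form; the TOY_* flag names and the has_parent parameter are stand-ins chosen for illustration (the flag bit positions are copied from the header hunk above), and nothing here touches a real inode.

```c
#include <stdbool.h>

/* Stand-ins for the XFS_ICREATE_* bits shown in the header hunk above. */
#define TOY_ICREATE_TMPFILE	(1U << 0)
#define TOY_ICREATE_INIT_XATTRS	(1U << 1)
#define TOY_ICREATE_UNLINKABLE	(1U << 2)

/*
 * has_parent models xfs_has_parent(mp), i.e. whether the filesystem has
 * parent pointers enabled.
 */
bool toy_want_attrfork(unsigned int flags, bool has_parent)
{
	/* The caller expects to set xattrs right away. */
	if (flags & TOY_ICREATE_INIT_XATTRS)
		return true;

	/* Linkable files on a parent pointer filesystem need the fork. */
	if (!(flags & TOY_ICREATE_UNLINKABLE) && has_parent)
		return true;

	return false;
}
```

Note that UNLINKABLE only suppresses the parent pointer case; a caller that passes both UNLINKABLE and INIT_XATTRS still gets an attr fork.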
* [PATCH 25/64] xfs: create libxfs helper to link a new inode into a directory
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (23 preceding siblings ...)
2024-10-02 1:14 ` [PATCH 24/64] xfs: separate the icreate logic around INIT_XATTRS Darrick J. Wong
@ 2024-10-02 1:14 ` Darrick J. Wong
2024-10-02 1:14 ` [PATCH 26/64] xfs: create libxfs helper to link an existing " Darrick J. Wong
` (38 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:14 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 1fa2e81957cf11620867729fb613b121692ee0d3
Create a new libxfs function to link a newly created inode into a
directory. The upcoming metadata directory feature will need this to
create a metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_dir2.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_dir2.h | 12 ++++++++++++
2 files changed, 65 insertions(+)
diff --git a/libxfs/xfs_dir2.c b/libxfs/xfs_dir2.c
index 9cf05ec51..e98b28024 100644
--- a/libxfs/xfs_dir2.c
+++ b/libxfs/xfs_dir2.c
@@ -18,6 +18,9 @@
#include "xfs_errortag.h"
#include "xfs_trace.h"
#include "xfs_health.h"
+#include "xfs_bmap_btree.h"
+#include "xfs_trans_space.h"
+#include "xfs_parent.h"
const struct xfs_name xfs_name_dotdot = {
.name = (const unsigned char *)"..",
@@ -755,3 +758,53 @@ xfs_dir2_compname(
return xfs_ascii_ci_compname(args, name, len);
return xfs_da_compname(args, name, len);
}
+
+/*
+ * Given a directory @dp, a newly allocated inode @ip, and a @name, link @ip
+ * into @dp under the given @name. If @ip is a directory, it will be
+ * initialized. Both inodes must have the ILOCK held and the transaction must
+ * have sufficient blocks reserved.
+ */
+int
+xfs_dir_create_child(
+ struct xfs_trans *tp,
+ unsigned int resblks,
+ struct xfs_dir_update *du)
+{
+ struct xfs_inode *dp = du->dp;
+ const struct xfs_name *name = du->name;
+ struct xfs_inode *ip = du->ip;
+ int error;
+
+ xfs_assert_ilocked(ip, XFS_ILOCK_EXCL);
+ xfs_assert_ilocked(dp, XFS_ILOCK_EXCL);
+
+ error = xfs_dir_createname(tp, dp, name, ip->i_ino, resblks);
+ if (error) {
+ ASSERT(error != -ENOSPC);
+ return error;
+ }
+
+ xfs_trans_ichgtime(tp, dp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+ xfs_trans_log_inode(tp, dp, XFS_ILOG_CORE);
+
+ if (S_ISDIR(VFS_I(ip)->i_mode)) {
+ error = xfs_dir_init(tp, ip, dp);
+ if (error)
+ return error;
+
+ xfs_bumplink(tp, dp);
+ }
+
+ /*
+ * If we have parent pointers, we need to add the attribute containing
+ * the parent information now.
+ */
+ if (du->ppargs) {
+ error = xfs_parent_addname(tp, du->ppargs, dp, name, ip);
+ if (error)
+ return error;
+ }
+
+ return 0;
+}
diff --git a/libxfs/xfs_dir2.h b/libxfs/xfs_dir2.h
index 6dbe6e9ec..a1ba6fd0a 100644
--- a/libxfs/xfs_dir2.h
+++ b/libxfs/xfs_dir2.h
@@ -309,4 +309,16 @@ static inline unsigned char xfs_ascii_ci_xfrm(unsigned char c)
return c;
}
+struct xfs_parent_args;
+
+struct xfs_dir_update {
+ struct xfs_inode *dp;
+ const struct xfs_name *name;
+ struct xfs_inode *ip;
+ struct xfs_parent_args *ppargs;
+};
+
+int xfs_dir_create_child(struct xfs_trans *tp, unsigned int resblks,
+ struct xfs_dir_update *du);
+
#endif /* __XFS_DIR2_H__ */
* [PATCH 26/64] xfs: create libxfs helper to link an existing inode into a directory
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (24 preceding siblings ...)
2024-10-02 1:14 ` [PATCH 25/64] xfs: create libxfs helper to link a new inode into a directory Darrick J. Wong
@ 2024-10-02 1:14 ` Darrick J. Wong
2024-10-02 1:14 ` [PATCH 27/64] xfs: hoist inode free function to libxfs Darrick J. Wong
` (37 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:14 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: c1f0bad4232fd309b2fe849153fcf473e775b1f7
Create a new libxfs function to link an existing inode into a directory.
The upcoming metadata directory feature will need this to create a
metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_dir2.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++--
libxfs/xfs_dir2.h | 4 ++-
2 files changed, 71 insertions(+), 4 deletions(-)
diff --git a/libxfs/xfs_dir2.c b/libxfs/xfs_dir2.c
index e98b28024..802b9a1b3 100644
--- a/libxfs/xfs_dir2.c
+++ b/libxfs/xfs_dir2.c
@@ -21,6 +21,7 @@
#include "xfs_bmap_btree.h"
#include "xfs_trans_space.h"
#include "xfs_parent.h"
+#include "xfs_ag.h"
const struct xfs_name xfs_name_dotdot = {
.name = (const unsigned char *)"..",
@@ -586,9 +587,9 @@ xfs_dir_replace(
*/
int
xfs_dir_canenter(
- xfs_trans_t *tp,
- xfs_inode_t *dp,
- struct xfs_name *name) /* name of entry to add */
+ struct xfs_trans *tp,
+ struct xfs_inode *dp,
+ const struct xfs_name *name) /* name of entry to add */
{
return xfs_dir_createname(tp, dp, name, 0, 0);
}
@@ -808,3 +809,67 @@ xfs_dir_create_child(
return 0;
}
+
+/*
+ * Given a directory @dp, an existing non-directory inode @ip, and a @name,
+ * link @ip into @dp under the given @name. Both inodes must have the ILOCK
+ * held.
+ */
+int
+xfs_dir_add_child(
+ struct xfs_trans *tp,
+ unsigned int resblks,
+ struct xfs_dir_update *du)
+{
+ struct xfs_inode *dp = du->dp;
+ const struct xfs_name *name = du->name;
+ struct xfs_inode *ip = du->ip;
+ struct xfs_mount *mp = tp->t_mountp;
+ int error;
+
+ xfs_assert_ilocked(ip, XFS_ILOCK_EXCL);
+ xfs_assert_ilocked(dp, XFS_ILOCK_EXCL);
+ ASSERT(!S_ISDIR(VFS_I(ip)->i_mode));
+
+ if (!resblks) {
+ error = xfs_dir_canenter(tp, dp, name);
+ if (error)
+ return error;
+ }
+
+ /*
+ * Handle initial link state of O_TMPFILE inode
+ */
+ if (VFS_I(ip)->i_nlink == 0) {
+ struct xfs_perag *pag;
+
+ pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
+ error = xfs_iunlink_remove(tp, pag, ip);
+ xfs_perag_put(pag);
+ if (error)
+ return error;
+ }
+
+ error = xfs_dir_createname(tp, dp, name, ip->i_ino, resblks);
+ if (error)
+ return error;
+
+ xfs_trans_ichgtime(tp, dp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+ xfs_trans_log_inode(tp, dp, XFS_ILOG_CORE);
+
+ xfs_bumplink(tp, ip);
+
+ /*
+ * If we have parent pointers, we now need to add the parent record to
+ * the attribute fork of the inode. If this is the initial parent
+ * attribute, we need to create it correctly, otherwise we can just add
+ * the parent to the inode.
+ */
+ if (du->ppargs) {
+ error = xfs_parent_addname(tp, du->ppargs, dp, name, ip);
+ if (error)
+ return error;
+ }
+
+ return 0;
+}
diff --git a/libxfs/xfs_dir2.h b/libxfs/xfs_dir2.h
index a1ba6fd0a..4f9711509 100644
--- a/libxfs/xfs_dir2.h
+++ b/libxfs/xfs_dir2.h
@@ -74,7 +74,7 @@ extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
const struct xfs_name *name, xfs_ino_t inum,
xfs_extlen_t tot);
extern int xfs_dir_canenter(struct xfs_trans *tp, struct xfs_inode *dp,
- struct xfs_name *name);
+ const struct xfs_name *name);
int xfs_dir_lookup_args(struct xfs_da_args *args);
int xfs_dir_createname_args(struct xfs_da_args *args);
@@ -320,5 +320,7 @@ struct xfs_dir_update {
int xfs_dir_create_child(struct xfs_trans *tp, unsigned int resblks,
struct xfs_dir_update *du);
+int xfs_dir_add_child(struct xfs_trans *tp, unsigned int resblks,
+ struct xfs_dir_update *du);
#endif /* __XFS_DIR2_H__ */
* [PATCH 27/64] xfs: hoist inode free function to libxfs
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (25 preceding siblings ...)
2024-10-02 1:14 ` [PATCH 26/64] xfs: create libxfs helper to link an existing " Darrick J. Wong
@ 2024-10-02 1:14 ` Darrick J. Wong
2024-10-02 1:15 ` [PATCH 28/64] xfs: create libxfs helper to remove an existing inode/name from a directory Darrick J. Wong
` (36 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:14 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 1964435d19d947b8626379d09db3e33b9669f333
Create a libxfs helper function that marks an inode free on disk.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_inode_util.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_inode_util.h | 5 +++++
2 files changed, 56 insertions(+)
diff --git a/libxfs/xfs_inode_util.c b/libxfs/xfs_inode_util.c
index 13c32d114..74d2b5960 100644
--- a/libxfs/xfs_inode_util.c
+++ b/libxfs/xfs_inode_util.c
@@ -693,3 +693,54 @@ xfs_bumplink(
xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
}
+
+/* Free an inode in the ondisk index and zero it out. */
+int
+xfs_inode_uninit(
+ struct xfs_trans *tp,
+ struct xfs_perag *pag,
+ struct xfs_inode *ip,
+ struct xfs_icluster *xic)
+{
+ struct xfs_mount *mp = ip->i_mount;
+ int error;
+
+ /*
+ * Free the inode first so that we guarantee that the AGI lock is going
+ * to be taken before we remove the inode from the unlinked list. This
+ * makes the AGI lock -> unlinked list modification order the same as
+ * used in O_TMPFILE creation.
+ */
+ error = xfs_difree(tp, pag, ip->i_ino, xic);
+ if (error)
+ return error;
+
+ error = xfs_iunlink_remove(tp, pag, ip);
+ if (error)
+ return error;
+
+ /*
+ * Free any local-format data sitting around before we reset the
+ * data fork to extents format. Note that the attr fork data has
+ * already been freed by xfs_attr_inactive.
+ */
+ if (ip->i_df.if_format == XFS_DINODE_FMT_LOCAL) {
+ kfree(ip->i_df.if_data);
+ ip->i_df.if_data = NULL;
+ ip->i_df.if_bytes = 0;
+ }
+
+ VFS_I(ip)->i_mode = 0; /* mark incore inode as free */
+ ip->i_diflags = 0;
+ ip->i_diflags2 = mp->m_ino_geo.new_diflags2;
+ ip->i_forkoff = 0; /* mark the attr fork not in use */
+ ip->i_df.if_format = XFS_DINODE_FMT_EXTENTS;
+
+ /*
+ * Bump the generation count so no one will be confused
+ * by reincarnations of this inode.
+ */
+ VFS_I(ip)->i_generation++;
+ xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+ return 0;
+}
diff --git a/libxfs/xfs_inode_util.h b/libxfs/xfs_inode_util.h
index 1c54c3b0c..060242998 100644
--- a/libxfs/xfs_inode_util.h
+++ b/libxfs/xfs_inode_util.h
@@ -6,6 +6,8 @@
#ifndef __XFS_INODE_UTIL_H__
#define __XFS_INODE_UTIL_H__
+struct xfs_icluster;
+
uint16_t xfs_flags2diflags(struct xfs_inode *ip, unsigned int xflags);
uint64_t xfs_flags2diflags2(struct xfs_inode *ip, unsigned int xflags);
uint32_t xfs_dic2xflags(struct xfs_inode *ip);
@@ -48,6 +50,9 @@ void xfs_trans_ichgtime(struct xfs_trans *tp, struct xfs_inode *ip, int flags);
void xfs_inode_init(struct xfs_trans *tp, const struct xfs_icreate_args *args,
struct xfs_inode *ip);
+int xfs_inode_uninit(struct xfs_trans *tp, struct xfs_perag *pag,
+ struct xfs_inode *ip, struct xfs_icluster *xic);
+
int xfs_iunlink(struct xfs_trans *tp, struct xfs_inode *ip);
int xfs_iunlink_remove(struct xfs_trans *tp, struct xfs_perag *pag,
struct xfs_inode *ip);
* [PATCH 28/64] xfs: create libxfs helper to remove an existing inode/name from a directory
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (26 preceding siblings ...)
2024-10-02 1:14 ` [PATCH 27/64] xfs: hoist inode free function to libxfs Darrick J. Wong
@ 2024-10-02 1:15 ` Darrick J. Wong
2024-10-02 1:15 ` [PATCH 29/64] xfs: create libxfs helper to exchange two directory entries Darrick J. Wong
` (35 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:15 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 90636e4531a8bfb5ef37d38a76eb97e5f5793deb
Create a new libxfs function to remove a (name, inode) entry from a
directory. The upcoming metadata directory feature will need this to
create a metadata directory tree.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_dir2.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_dir2.h | 2 +
2 files changed, 83 insertions(+)
diff --git a/libxfs/xfs_dir2.c b/libxfs/xfs_dir2.c
index 802b9a1b3..e46f7f489 100644
--- a/libxfs/xfs_dir2.c
+++ b/libxfs/xfs_dir2.c
@@ -873,3 +873,84 @@ xfs_dir_add_child(
return 0;
}
+
+/*
+ * Given a directory @dp, a child @ip, and a @name, remove the (@name, @ip)
+ * entry from the directory. Both inodes must have the ILOCK held.
+ */
+int
+xfs_dir_remove_child(
+ struct xfs_trans *tp,
+ unsigned int resblks,
+ struct xfs_dir_update *du)
+{
+ struct xfs_inode *dp = du->dp;
+ const struct xfs_name *name = du->name;
+ struct xfs_inode *ip = du->ip;
+ int error;
+
+ xfs_assert_ilocked(ip, XFS_ILOCK_EXCL);
+ xfs_assert_ilocked(dp, XFS_ILOCK_EXCL);
+
+ /*
+ * If we're removing a directory perform some additional validation.
+ */
+ if (S_ISDIR(VFS_I(ip)->i_mode)) {
+ ASSERT(VFS_I(ip)->i_nlink >= 2);
+ if (VFS_I(ip)->i_nlink != 2)
+ return -ENOTEMPTY;
+ if (!xfs_dir_isempty(ip))
+ return -ENOTEMPTY;
+
+ /* Drop the link from ip's "..". */
+ error = xfs_droplink(tp, dp);
+ if (error)
+ return error;
+
+ /* Drop the "." link from ip to self. */
+ error = xfs_droplink(tp, ip);
+ if (error)
+ return error;
+
+ /*
+ * Point the unlinked child directory's ".." entry to the root
+ * directory to eliminate back-references to inodes that may
+ * get freed before the child directory is closed. If the fs
+ * gets shrunk, this can lead to dirent inode validation errors.
+ */
+ if (dp->i_ino != tp->t_mountp->m_sb.sb_rootino) {
+ error = xfs_dir_replace(tp, ip, &xfs_name_dotdot,
+ tp->t_mountp->m_sb.sb_rootino, 0);
+ if (error)
+ return error;
+ }
+ } else {
+ /*
+ * When removing a non-directory we need to log the parent
+ * inode here. For a directory this is done implicitly
+ * by the xfs_droplink call for the ".." entry.
+ */
+ xfs_trans_log_inode(tp, dp, XFS_ILOG_CORE);
+ }
+ xfs_trans_ichgtime(tp, dp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+
+ /* Drop the link from dp to ip. */
+ error = xfs_droplink(tp, ip);
+ if (error)
+ return error;
+
+ error = xfs_dir_removename(tp, dp, name, ip->i_ino, resblks);
+ if (error) {
+ ASSERT(error != -ENOENT);
+ return error;
+ }
+
+ /* Remove parent pointer. */
+ if (du->ppargs) {
+ error = xfs_parent_removename(tp, du->ppargs, dp, name, ip);
+ if (error)
+ return error;
+ }
+
+ return 0;
+}
diff --git a/libxfs/xfs_dir2.h b/libxfs/xfs_dir2.h
index 4f9711509..c89916d1c 100644
--- a/libxfs/xfs_dir2.h
+++ b/libxfs/xfs_dir2.h
@@ -322,5 +322,7 @@ int xfs_dir_create_child(struct xfs_trans *tp, unsigned int resblks,
struct xfs_dir_update *du);
int xfs_dir_add_child(struct xfs_trans *tp, unsigned int resblks,
struct xfs_dir_update *du);
+int xfs_dir_remove_child(struct xfs_trans *tp, unsigned int resblks,
+ struct xfs_dir_update *du);
#endif /* __XFS_DIR2_H__ */
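The link count bookkeeping in xfs_dir_remove_child can be followed with a toy model that tracks only nlink. This is a sketch under loud assumptions: struct toy_inode and toy_remove_child are hypothetical, the emptiness check is reduced to a boolean, and the dirent removal, ".." rewrite, parent pointer update and unlinked list handling are all omitted. It follows the same order as the function above: validate the directory case, drop the ".." and "." links, then drop the link for the dirent itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in that keeps only what the accounting needs. */
struct toy_inode {
	unsigned int	nlink;
	bool		is_dir;
	bool		is_empty;	/* models xfs_dir_isempty() */
};

/* Returns 0 on success or -ENOTEMPTY, leaving both counts untouched. */
int toy_remove_child(struct toy_inode *dp, struct toy_inode *ip)
{
	if (ip->is_dir) {
		/* An empty directory has exactly two links: its dirent and ".". */
		if (ip->nlink != 2 || !ip->is_empty)
			return -ENOTEMPTY;

		dp->nlink--;	/* ip's ".." no longer references dp */
		ip->nlink--;	/* ip's "." self-reference */
	}

	ip->nlink--;		/* the dirent in dp that pointed at ip */
	return 0;
}
```

With a parent at nlink 3 and an empty child directory at nlink 2, the model ends with the parent at 2 and the child at 0, which is the state in which the real code hands the child to the unlinked list.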
* [PATCH 29/64] xfs: create libxfs helper to exchange two directory entries
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (27 preceding siblings ...)
2024-10-02 1:15 ` [PATCH 28/64] xfs: create libxfs helper to remove an existing inode/name from a directory Darrick J. Wong
@ 2024-10-02 1:15 ` Darrick J. Wong
2024-10-02 1:15 ` [PATCH 30/64] xfs: create libxfs helper to rename " Darrick J. Wong
` (34 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:15 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: a55712b35c065eee4ab1195233a5478fb7c93efa
Create a new libxfs function to exchange two directory entries.
The upcoming metadata directory feature will need this to replace a
metadata inode directory entry.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
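Reader's note, not part of the patch: the link-count side of the exchange can be hard to follow in the diff below, so here is a minimal userspace sketch of it. The `demo_inode` type and `demo_exchange_nlink()` are hypothetical names made up for illustration; the sketch only assumes what the patch itself shows, namely that a parent's link count moves when exactly one of the two children is a directory and the parents differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; only the fields the sketch needs. */
struct demo_inode {
	unsigned int	nlink;
	bool		is_dir;
};

/*
 * Model the nlink side of exchanging child ip1 (in dp1) with child ip2
 * (in dp2).  A subdirectory's ".." entry holds a link on its parent, so a
 * parent's count changes only when exactly one child is a directory and
 * the two parents differ.
 */
void
demo_exchange_nlink(
	struct demo_inode	*dp1,
	const struct demo_inode	*ip1,
	struct demo_inode	*dp2,
	const struct demo_inode	*ip2)
{
	if (dp1 == dp2)
		return;

	/* ip2 moves under dp1: its ".." reference leaves dp2 for dp1. */
	if (ip2->is_dir && !ip1->is_dir) {
		dp2->nlink--;
		dp1->nlink++;
	}

	/* ip1 moves under dp2: its ".." reference leaves dp1 for dp2. */
	if (ip1->is_dir && !ip2->is_dir) {
		dp1->nlink--;
		dp2->nlink++;
	}
}
```

When both children are directories, each parent loses one ".." reference and gains another, so neither count changes; that is why the patch guards each transfer with a check on the other child's type.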
libxfs/xfs_dir2.c | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_dir2.h | 3 +
2 files changed, 128 insertions(+)
diff --git a/libxfs/xfs_dir2.c b/libxfs/xfs_dir2.c
index e46f7f489..b47626815 100644
--- a/libxfs/xfs_dir2.c
+++ b/libxfs/xfs_dir2.c
@@ -954,3 +954,128 @@ xfs_dir_remove_child(
return 0;
}
+
+/*
+ * Exchange the entry (@name1, @ip1) in directory @dp1 with the entry (@name2,
+ * @ip2) in directory @dp2, and update @ip1's and @ip2's '..' entries as needed.
+ * @ip1 and @ip2 need not be of the same type.
+ *
+ * All inodes must have the ILOCK held, and both entries must already exist.
+ */
+int
+xfs_dir_exchange_children(
+ struct xfs_trans *tp,
+ struct xfs_dir_update *du1,
+ struct xfs_dir_update *du2,
+ unsigned int spaceres)
+{
+ struct xfs_inode *dp1 = du1->dp;
+ const struct xfs_name *name1 = du1->name;
+ struct xfs_inode *ip1 = du1->ip;
+ struct xfs_inode *dp2 = du2->dp;
+ const struct xfs_name *name2 = du2->name;
+ struct xfs_inode *ip2 = du2->ip;
+ int ip1_flags = 0;
+ int ip2_flags = 0;
+ int dp2_flags = 0;
+ int error;
+
+ /* Swap inode number for dirent in first parent */
+ error = xfs_dir_replace(tp, dp1, name1, ip2->i_ino, spaceres);
+ if (error)
+ return error;
+
+ /* Swap inode number for dirent in second parent */
+ error = xfs_dir_replace(tp, dp2, name2, ip1->i_ino, spaceres);
+ if (error)
+ return error;
+
+ /*
+ * If we're renaming one or more directories across different parents,
+ * update the respective ".." entries (and link counts) to match the new
+ * parents.
+ */
+ if (dp1 != dp2) {
+ dp2_flags = XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG;
+
+ if (S_ISDIR(VFS_I(ip2)->i_mode)) {
+ error = xfs_dir_replace(tp, ip2, &xfs_name_dotdot,
+ dp1->i_ino, spaceres);
+ if (error)
+ return error;
+
+ /* transfer ip2 ".." reference to dp1 */
+ if (!S_ISDIR(VFS_I(ip1)->i_mode)) {
+ error = xfs_droplink(tp, dp2);
+ if (error)
+ return error;
+ xfs_bumplink(tp, dp1);
+ }
+
+ /*
+ * Although ip1 isn't changed here, userspace needs
+ * to be told about the change so that applications
+ * relying on it (such as backup tools) will properly
+ * notice the change.
+ */
+ ip1_flags |= XFS_ICHGTIME_CHG;
+ ip2_flags |= XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG;
+ }
+
+ if (S_ISDIR(VFS_I(ip1)->i_mode)) {
+ error = xfs_dir_replace(tp, ip1, &xfs_name_dotdot,
+ dp2->i_ino, spaceres);
+ if (error)
+ return error;
+
+ /* transfer ip1 ".." reference to dp2 */
+ if (!S_ISDIR(VFS_I(ip2)->i_mode)) {
+ error = xfs_droplink(tp, dp1);
+ if (error)
+ return error;
+ xfs_bumplink(tp, dp2);
+ }
+
+ /*
+ * Although ip2 isn't changed here, userspace needs
+ * to be told about the change so that applications
+ * relying on it (such as backup tools) will properly
+ * notice the change.
+ */
+ ip1_flags |= XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG;
+ ip2_flags |= XFS_ICHGTIME_CHG;
+ }
+ }
+
+ if (ip1_flags) {
+ xfs_trans_ichgtime(tp, ip1, ip1_flags);
+ xfs_trans_log_inode(tp, ip1, XFS_ILOG_CORE);
+ }
+ if (ip2_flags) {
+ xfs_trans_ichgtime(tp, ip2, ip2_flags);
+ xfs_trans_log_inode(tp, ip2, XFS_ILOG_CORE);
+ }
+ if (dp2_flags) {
+ xfs_trans_ichgtime(tp, dp2, dp2_flags);
+ xfs_trans_log_inode(tp, dp2, XFS_ILOG_CORE);
+ }
+ xfs_trans_ichgtime(tp, dp1, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+ xfs_trans_log_inode(tp, dp1, XFS_ILOG_CORE);
+
+ /* Schedule parent pointer replacements */
+ if (du1->ppargs) {
+ error = xfs_parent_replacename(tp, du1->ppargs, dp1, name1,
+ dp2, name2, ip1);
+ if (error)
+ return error;
+ }
+
+ if (du2->ppargs) {
+ error = xfs_parent_replacename(tp, du2->ppargs, dp2, name2,
+ dp1, name1, ip2);
+ if (error)
+ return error;
+ }
+
+ return 0;
+}
diff --git a/libxfs/xfs_dir2.h b/libxfs/xfs_dir2.h
index c89916d1c..8b1e192bd 100644
--- a/libxfs/xfs_dir2.h
+++ b/libxfs/xfs_dir2.h
@@ -325,4 +325,7 @@ int xfs_dir_add_child(struct xfs_trans *tp, unsigned int resblks,
int xfs_dir_remove_child(struct xfs_trans *tp, unsigned int resblks,
struct xfs_dir_update *du);
+int xfs_dir_exchange_children(struct xfs_trans *tp, struct xfs_dir_update *du1,
+ struct xfs_dir_update *du2, unsigned int spaceres);
+
#endif /* __XFS_DIR2_H__ */
* [PATCH 30/64] xfs: create libxfs helper to rename two directory entries
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (28 preceding siblings ...)
2024-10-02 1:15 ` [PATCH 29/64] xfs: create libxfs helper to exchange two directory entries Darrick J. Wong
@ 2024-10-02 1:15 ` Darrick J. Wong
2024-10-02 1:15 ` [PATCH 31/64] xfs: move dirent update hooks to xfs_dir2.c Darrick J. Wong
` (33 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:15 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 28d0d813444645689fefa232bcf88e86a5a3a746
Create a new libxfs function to rename two directory entries. The
upcoming metadata directory feature will need this to replace a metadata
inode directory entry.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
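Reader's note, not part of the patch: the rename helper below adjusts up to four link counts depending on whether the parent changes, whether the source is a directory, whether a target already exists, and whether a whiteout is supplied. The following is a rough sketch that tabulates those adjustments as the diff appears to make them. `demo_rename_delta` and `demo_rename_nlink()` are hypothetical names; nothing here is XFS API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the link-count changes one rename makes. */
struct demo_rename_delta {
	int	src_dp;		/* old parent directory */
	int	target_dp;	/* new parent directory */
	int	target_ip;	/* inode being replaced, if any */
	int	wip;		/* whiteout inode, if any */
};

struct demo_rename_delta
demo_rename_nlink(
	bool	new_parent,
	bool	src_is_dir,
	bool	have_target,
	bool	have_whiteout)
{
	struct demo_rename_delta d = { 0, 0, 0, 0 };

	/* The whiteout leaves the unlinked list and gains a real link. */
	if (have_whiteout)
		d.wip++;

	if (!have_target) {
		/* A directory arriving in a new parent adds a ".." reference. */
		if (new_parent && src_is_dir)
			d.target_dp++;
	} else {
		/* The target loses its dirent link ... */
		d.target_ip--;
		/* ... and, when the source is a directory, its "." link too. */
		if (src_is_dir)
			d.target_ip--;
	}

	/*
	 * The old parent loses one ".." reference: from the moved directory,
	 * or from the replaced one when the parent is unchanged.
	 */
	if (src_is_dir && (new_parent || have_target))
		d.src_dp--;

	return d;
}
```

For example, moving a directory into a different parent with no existing target yields `src_dp == -1` and `target_dp == +1`, while renaming a directory over an empty directory in the same parent yields `target_ip == -2` and `src_dp == -1`.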
libxfs/xfs_dir2.c | 227 +++++++++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_dir2.h | 3 +
2 files changed, 230 insertions(+)
diff --git a/libxfs/xfs_dir2.c b/libxfs/xfs_dir2.c
index b47626815..c2bab8f03 100644
--- a/libxfs/xfs_dir2.c
+++ b/libxfs/xfs_dir2.c
@@ -22,6 +22,7 @@
#include "xfs_trans_space.h"
#include "xfs_parent.h"
#include "xfs_ag.h"
+#include "xfs_ialloc.h"
const struct xfs_name xfs_name_dotdot = {
.name = (const unsigned char *)"..",
@@ -1079,3 +1080,229 @@ xfs_dir_exchange_children(
return 0;
}
+
+/*
+ * Given an entry (@src_name, @src_ip) in directory @src_dp, make the entry
+ * @target_name in directory @target_dp point to @src_ip and remove the
+ * original entry, cleaning up everything left behind.
+ *
+ * Cleanup involves dropping a link count on @target_ip, and either removing
+ * the (@src_name, @src_ip) entry from @src_dp or simply replacing the entry
+ * with (@src_name, @wip) if a whiteout inode @wip is supplied.
+ *
+ * All inodes must have the ILOCK held. We assume that if @src_ip is a
+ * directory then its '..' doesn't already point to @target_dp, and that @wip
+ * is a freshly allocated whiteout.
+ */
+int
+xfs_dir_rename_children(
+ struct xfs_trans *tp,
+ struct xfs_dir_update *du_src,
+ struct xfs_dir_update *du_tgt,
+ unsigned int spaceres,
+ struct xfs_dir_update *du_wip)
+{
+ struct xfs_mount *mp = tp->t_mountp;
+ struct xfs_inode *src_dp = du_src->dp;
+ const struct xfs_name *src_name = du_src->name;
+ struct xfs_inode *src_ip = du_src->ip;
+ struct xfs_inode *target_dp = du_tgt->dp;
+ const struct xfs_name *target_name = du_tgt->name;
+ struct xfs_inode *target_ip = du_tgt->ip;
+ bool new_parent = (src_dp != target_dp);
+ bool src_is_directory;
+ int error;
+
+ src_is_directory = S_ISDIR(VFS_I(src_ip)->i_mode);
+
+ /*
+ * Check for expected errors before we dirty the transaction
+ * so we can return an error without a transaction abort.
+ */
+ if (target_ip == NULL) {
+ /*
+ * If there's no space reservation, check the entry will
+ * fit before actually inserting it.
+ */
+ if (!spaceres) {
+ error = xfs_dir_canenter(tp, target_dp, target_name);
+ if (error)
+ return error;
+ }
+ } else {
+ /*
+ * If target exists and it's a directory, check whether
+ * it can be destroyed.
+ */
+ if (S_ISDIR(VFS_I(target_ip)->i_mode) &&
+ (!xfs_dir_isempty(target_ip) ||
+ (VFS_I(target_ip)->i_nlink > 2)))
+ return -EEXIST;
+ }
+
+ /*
+ * Directory entry creation below may acquire the AGF. Remove
+ * the whiteout from the unlinked list first to preserve correct
+ * AGI/AGF locking order. This dirties the transaction so failures
+ * after this point will abort and log recovery will clean up the
+ * mess.
+ *
+ * For whiteouts, we need to bump the link count on the whiteout
+ * inode. After this point, we have a real link, clear the tmpfile
+ * state flag from the inode so it doesn't accidentally get misused
+ * in future.
+ */
+ if (du_wip->ip) {
+ struct xfs_perag *pag;
+
+ ASSERT(VFS_I(du_wip->ip)->i_nlink == 0);
+
+ pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, du_wip->ip->i_ino));
+ error = xfs_iunlink_remove(tp, pag, du_wip->ip);
+ xfs_perag_put(pag);
+ if (error)
+ return error;
+
+ xfs_bumplink(tp, du_wip->ip);
+ }
+
+ /*
+ * Set up the target.
+ */
+ if (target_ip == NULL) {
+ /*
+ * If target does not exist and the rename crosses
+ * directories, adjust the target directory link count
+ * to account for the ".." reference from the new entry.
+ */
+ error = xfs_dir_createname(tp, target_dp, target_name,
+ src_ip->i_ino, spaceres);
+ if (error)
+ return error;
+
+ xfs_trans_ichgtime(tp, target_dp,
+ XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+
+ if (new_parent && src_is_directory) {
+ xfs_bumplink(tp, target_dp);
+ }
+ } else { /* target_ip != NULL */
+ /*
+ * Link the source inode under the target name.
+ * If the source inode is a directory and we are moving
+ * it across directories, its ".." entry will be
+ * inconsistent until we replace that down below.
+ *
+ * In case there is already an entry with the same
+ * name at the destination directory, remove it first.
+ */
+ error = xfs_dir_replace(tp, target_dp, target_name,
+ src_ip->i_ino, spaceres);
+ if (error)
+ return error;
+
+ xfs_trans_ichgtime(tp, target_dp,
+ XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+
+ /*
+ * Decrement the link count on the target since the target
+ * dir no longer points to it.
+ */
+ error = xfs_droplink(tp, target_ip);
+ if (error)
+ return error;
+
+ if (src_is_directory) {
+ /*
+ * Drop the link from the old "." entry.
+ */
+ error = xfs_droplink(tp, target_ip);
+ if (error)
+ return error;
+ }
+ } /* target_ip != NULL */
+
+ /*
+ * Remove the source.
+ */
+ if (new_parent && src_is_directory) {
+ /*
+ * Rewrite the ".." entry to point to the new
+ * directory.
+ */
+ error = xfs_dir_replace(tp, src_ip, &xfs_name_dotdot,
+ target_dp->i_ino, spaceres);
+ ASSERT(error != -EEXIST);
+ if (error)
+ return error;
+ }
+
+ /*
+ * We always want to hit the ctime on the source inode.
+ *
+ * This isn't strictly required by the standards since the source
+ * inode isn't really being changed, but old unix file systems did
+ * it and some incremental backup programs won't work without it.
+ */
+ xfs_trans_ichgtime(tp, src_ip, XFS_ICHGTIME_CHG);
+ xfs_trans_log_inode(tp, src_ip, XFS_ILOG_CORE);
+
+ /*
+ * Adjust the link count on src_dp. This is necessary when
+ * renaming a directory, either within one parent when
+ * the target existed, or across two parent directories.
+ */
+ if (src_is_directory && (new_parent || target_ip != NULL)) {
+
+ /*
+ * Decrement link count on src_directory since the
+ * entry that's moved no longer points to it.
+ */
+ error = xfs_droplink(tp, src_dp);
+ if (error)
+ return error;
+ }
+
+ /*
+ * For whiteouts, we only need to update the source dirent with the
+ * inode number of the whiteout inode rather than removing it
+ * altogether.
+ */
+ if (du_wip->ip)
+ error = xfs_dir_replace(tp, src_dp, src_name, du_wip->ip->i_ino,
+ spaceres);
+ else
+ error = xfs_dir_removename(tp, src_dp, src_name, src_ip->i_ino,
+ spaceres);
+ if (error)
+ return error;
+
+ xfs_trans_ichgtime(tp, src_dp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
+ xfs_trans_log_inode(tp, src_dp, XFS_ILOG_CORE);
+ if (new_parent)
+ xfs_trans_log_inode(tp, target_dp, XFS_ILOG_CORE);
+
+ /* Schedule parent pointer updates. */
+ if (du_wip->ppargs) {
+ error = xfs_parent_addname(tp, du_wip->ppargs, src_dp,
+ src_name, du_wip->ip);
+ if (error)
+ return error;
+ }
+
+ if (du_src->ppargs) {
+ error = xfs_parent_replacename(tp, du_src->ppargs, src_dp,
+ src_name, target_dp, target_name, src_ip);
+ if (error)
+ return error;
+ }
+
+ if (du_tgt->ppargs) {
+ error = xfs_parent_removename(tp, du_tgt->ppargs, target_dp,
+ target_name, target_ip);
+ if (error)
+ return error;
+ }
+
+ return 0;
+}
diff --git a/libxfs/xfs_dir2.h b/libxfs/xfs_dir2.h
index 8b1e192bd..df6d4bbe3 100644
--- a/libxfs/xfs_dir2.h
+++ b/libxfs/xfs_dir2.h
@@ -327,5 +327,8 @@ int xfs_dir_remove_child(struct xfs_trans *tp, unsigned int resblks,
int xfs_dir_exchange_children(struct xfs_trans *tp, struct xfs_dir_update *du1,
struct xfs_dir_update *du2, unsigned int spaceres);
+int xfs_dir_rename_children(struct xfs_trans *tp, struct xfs_dir_update *du_src,
+ struct xfs_dir_update *du_tgt, unsigned int spaceres,
+ struct xfs_dir_update *du_wip);
#endif /* __XFS_DIR2_H__ */
* [PATCH 31/64] xfs: move dirent update hooks to xfs_dir2.c
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (29 preceding siblings ...)
2024-10-02 1:15 ` [PATCH 30/64] xfs: create libxfs helper to rename " Darrick J. Wong
@ 2024-10-02 1:15 ` Darrick J. Wong
2024-10-02 1:16 ` [PATCH 32/64] xfs: don't use the incore struct xfs_sb for offsets into struct xfs_dsb Darrick J. Wong
` (32 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:15 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 62bbf50bea21b1c76990fd1bae58a65660a11c27
Move the directory entry update hook code to xfs_dir2 so that it is
mostly consolidated with the higher level directory functions. Retain
the exports so that online fsck can still send notifications through the
hooks.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
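Reader's note, not part of the patch: a small userspace sketch of how a hook consumer might see an exchange, assuming the ordering used in the diff below (both removals reported before both additions). Everything prefixed `demo_` is hypothetical; the plain `int` switch merely stands in for the static key, and a single function pointer stands in for the notifier chain.

```c
#include <assert.h>

/* Hypothetical mirror of the fields in struct xfs_dir_update_params. */
struct demo_dir_update {
	int		dp;	/* directory id */
	int		ip;	/* child id */
	int		delta;	/* +1 entry added, -1 entry removed */
	const char	*name;
};

typedef void (*demo_hook_fn)(const struct demo_dir_update *p, void *priv);

/* One hook slot plus an on/off switch, standing in for the static key. */
static demo_hook_fn	demo_hook;
static void		*demo_hook_priv;
static int		demo_hooks_enabled;

void
demo_dir_update_hook(int dp, int ip, int delta, const char *name)
{
	struct demo_dir_update p = {
		.dp = dp, .ip = ip, .delta = delta, .name = name,
	};

	/* Skip the call entirely while nobody is listening. */
	if (demo_hooks_enabled && demo_hook)
		demo_hook(&p, demo_hook_priv);
}

/* Report an exchange as two removals followed by two additions. */
void
demo_notify_exchange(int dp1, const char *name1, int ip1,
		     int dp2, const char *name2, int ip2)
{
	demo_dir_update_hook(dp1, ip1, -1, name1);
	demo_dir_update_hook(dp2, ip2, -1, name2);
	demo_dir_update_hook(dp1, ip2, 1, name1);
	demo_dir_update_hook(dp2, ip1, 1, name2);
}

/* Example consumer: count events and track the net entry change. */
struct demo_tally {
	int	events;
	int	net;
};

void
demo_tally_fn(const struct demo_dir_update *p, void *priv)
{
	struct demo_tally *t = priv;

	t->events++;
	t->net += p->delta;
}
```

A consumer that sums `delta` sees a net change of zero for an exchange, and sees nothing at all while the switch is off.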
libxfs/xfs_dir2.c | 104 +++++++++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_dir2.h | 25 +++++++++++++
2 files changed, 129 insertions(+)
diff --git a/libxfs/xfs_dir2.c b/libxfs/xfs_dir2.c
index c2bab8f03..0b026d5f5 100644
--- a/libxfs/xfs_dir2.c
+++ b/libxfs/xfs_dir2.c
@@ -761,6 +761,81 @@ xfs_dir2_compname(
return xfs_da_compname(args, name, len);
}
+#ifdef CONFIG_XFS_LIVE_HOOKS
+/*
+ * Use a static key here to reduce the overhead of directory live update hooks.
+ * If the compiler supports jump labels, the static branch will be replaced by
+ * a nop sled when there are no hook users. Online fsck is currently the only
+ * caller, so this is a reasonable tradeoff.
+ *
+ * Note: Patching the kernel code requires taking the cpu hotplug lock. Other
+ * parts of the kernel allocate memory with that lock held, which means that
+ * XFS callers cannot hold any locks that might be used by memory reclaim or
+ * writeback when calling the static_branch_{inc,dec} functions.
+ */
+DEFINE_STATIC_XFS_HOOK_SWITCH(xfs_dir_hooks_switch);
+
+void
+xfs_dir_hook_disable(void)
+{
+ xfs_hooks_switch_off(&xfs_dir_hooks_switch);
+}
+
+void
+xfs_dir_hook_enable(void)
+{
+ xfs_hooks_switch_on(&xfs_dir_hooks_switch);
+}
+
+/* Call hooks for a directory update relating to a child dirent update. */
+inline void
+xfs_dir_update_hook(
+ struct xfs_inode *dp,
+ struct xfs_inode *ip,
+ int delta,
+ const struct xfs_name *name)
+{
+ if (xfs_hooks_switched_on(&xfs_dir_hooks_switch)) {
+ struct xfs_dir_update_params p = {
+ .dp = dp,
+ .ip = ip,
+ .delta = delta,
+ .name = name,
+ };
+ struct xfs_mount *mp = ip->i_mount;
+
+ xfs_hooks_call(&mp->m_dir_update_hooks, 0, &p);
+ }
+}
+
+/* Call the specified function during a directory update. */
+int
+xfs_dir_hook_add(
+ struct xfs_mount *mp,
+ struct xfs_dir_hook *hook)
+{
+ return xfs_hooks_add(&mp->m_dir_update_hooks, &hook->dirent_hook);
+}
+
+/* Stop calling the specified function during a directory update. */
+void
+xfs_dir_hook_del(
+ struct xfs_mount *mp,
+ struct xfs_dir_hook *hook)
+{
+ xfs_hooks_del(&mp->m_dir_update_hooks, &hook->dirent_hook);
+}
+
+/* Configure directory update hook functions. */
+void
+xfs_dir_hook_setup(
+ struct xfs_dir_hook *hook,
+ notifier_fn_t mod_fn)
+{
+ xfs_hook_setup(&hook->dirent_hook, mod_fn);
+}
+#endif /* CONFIG_XFS_LIVE_HOOKS */
+
/*
* Given a directory @dp, a newly allocated inode @ip, and a @name, link @ip
* into @dp under the given @name. If @ip is a directory, it will be
@@ -808,6 +883,7 @@ xfs_dir_create_child(
return error;
}
+ xfs_dir_update_hook(dp, ip, 1, name);
return 0;
}
@@ -872,6 +948,7 @@ xfs_dir_add_child(
return error;
}
+ xfs_dir_update_hook(dp, ip, 1, name);
return 0;
}
@@ -953,6 +1030,7 @@ xfs_dir_remove_child(
return error;
}
+ xfs_dir_update_hook(dp, ip, -1, name);
return 0;
}
@@ -1078,6 +1156,18 @@ xfs_dir_exchange_children(
return error;
}
+ /*
+ * Inform our hook clients that we've finished an exchange operation as
+ * follows: removed the source and target files from their directories;
+ * added the target to the source directory; and added the source to
+ * the target directory. All inodes are locked, so it's ok to model a
+ * rename this way so long as we say we deleted entries before we add
+ * new ones.
+ */
+ xfs_dir_update_hook(dp1, ip1, -1, name1);
+ xfs_dir_update_hook(dp2, ip2, -1, name2);
+ xfs_dir_update_hook(dp1, ip2, 1, name1);
+ xfs_dir_update_hook(dp2, ip1, 1, name2);
return 0;
}
@@ -1304,5 +1394,19 @@ xfs_dir_rename_children(
return error;
}
+ /*
+ * Inform our hook clients that we've finished a rename operation as
+ * follows: removed the source and target files from their directories;
+ * that we've added the source to the target directory; and finally
+ * that we've added the whiteout, if there was one. All inodes are
+ * locked, so it's ok to model a rename this way so long as we say we
+ * deleted entries before we add new ones.
+ */
+ if (target_ip)
+ xfs_dir_update_hook(target_dp, target_ip, -1, target_name);
+ xfs_dir_update_hook(src_dp, src_ip, -1, src_name);
+ xfs_dir_update_hook(target_dp, src_ip, 1, target_name);
+ if (du_wip->ip)
+ xfs_dir_update_hook(src_dp, du_wip->ip, 1, src_name);
return 0;
}
diff --git a/libxfs/xfs_dir2.h b/libxfs/xfs_dir2.h
index df6d4bbe3..576068ed8 100644
--- a/libxfs/xfs_dir2.h
+++ b/libxfs/xfs_dir2.h
@@ -309,6 +309,31 @@ static inline unsigned char xfs_ascii_ci_xfrm(unsigned char c)
return c;
}
+struct xfs_dir_update_params {
+ const struct xfs_inode *dp;
+ const struct xfs_inode *ip;
+ const struct xfs_name *name;
+ int delta;
+};
+
+#ifdef CONFIG_XFS_LIVE_HOOKS
+void xfs_dir_update_hook(struct xfs_inode *dp, struct xfs_inode *ip,
+ int delta, const struct xfs_name *name);
+
+struct xfs_dir_hook {
+ struct xfs_hook dirent_hook;
+};
+
+void xfs_dir_hook_disable(void);
+void xfs_dir_hook_enable(void);
+
+int xfs_dir_hook_add(struct xfs_mount *mp, struct xfs_dir_hook *hook);
+void xfs_dir_hook_del(struct xfs_mount *mp, struct xfs_dir_hook *hook);
+void xfs_dir_hook_setup(struct xfs_dir_hook *hook, notifier_fn_t mod_fn);
+#else
+# define xfs_dir_update_hook(dp, ip, delta, name) ((void)0)
+#endif /* CONFIG_XFS_LIVE_HOOKS */
+
struct xfs_parent_args;
struct xfs_dir_update {
* [PATCH 32/64] xfs: don't use the incore struct xfs_sb for offsets into struct xfs_dsb
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (30 preceding siblings ...)
2024-10-02 1:15 ` [PATCH 31/64] xfs: move dirent update hooks to xfs_dir2.c Darrick J. Wong
@ 2024-10-02 1:16 ` Darrick J. Wong
2024-10-02 1:16 ` [PATCH 33/64] xfs: clean up extent free log intent item tracepoint callsites Darrick J. Wong
` (31 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:16 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: ac3a0275165b4f80d9b7b516d6a8f8b308644fff
Currently, the XFS_SB_CRC_OFF macro uses the incore superblock struct
(xfs_sb) to compute the address of sb_crc within the ondisk superblock
struct (xfs_dsb). This is a landmine if we ever change the layout of
the incore superblock (as we're about to do), so redefine the macro
to use xfs_dsb to compute the layout of xfs_dsb.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
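Reader's note, not part of the patch: a toy illustration of the hazard described above. The two structs are hypothetical and far smaller than the real `xfs_sb` and `xfs_dsb`; the point is only that once the in-core struct gains a field the on-disk struct lacks, `offsetof()` on the in-core type no longer describes the disk buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-core superblock: free to grow fields that never hit disk. */
struct demo_sb {
	uint32_t	magic;
	uint64_t	incore_only;	/* no on-disk counterpart */
	uint32_t	crc;
};

/* Hypothetical on-disk superblock: layout is fixed by the format. */
struct demo_dsb {
	uint32_t	magic;
	uint32_t	crc;
};

/* Derive the offset from the struct that actually describes the buffer. */
#define DEMO_SB_CRC_OFF	offsetof(struct demo_dsb, crc)

/* Build-time layout check, in the spirit of XFS_CHECK_OFFSET. */
_Static_assert(offsetof(struct demo_dsb, crc) == 4, "demo_dsb.crc moved");

/* Locate the CRC inside a raw on-disk buffer. */
uint8_t *
demo_crc_ptr(uint8_t *disk_buf)
{
	return disk_buf + DEMO_SB_CRC_OFF;
}
```

Had the macro used `struct demo_sb`, the computed offset would have pointed past the real CRC as soon as `incore_only` was added, and nothing would have failed at build time.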
libxfs/xfs_format.h | 9 ++++-----
libxfs/xfs_ondisk.h | 1 +
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index 61f51becf..e1bfee0c3 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -90,8 +90,7 @@ struct xfs_ifork;
#define XFSLABEL_MAX 12
/*
- * Superblock - in core version. Must match the ondisk version below.
- * Must be padded to 64 bit alignment.
+ * Superblock - in core version. Must be padded to 64 bit alignment.
*/
typedef struct xfs_sb {
uint32_t sb_magicnum; /* magic number == XFS_SB_MAGIC */
@@ -178,10 +177,8 @@ typedef struct xfs_sb {
/* must be padded to 64 bit alignment */
} xfs_sb_t;
-#define XFS_SB_CRC_OFF offsetof(struct xfs_sb, sb_crc)
-
/*
- * Superblock - on disk version. Must match the in core version above.
+ * Superblock - on disk version.
* Must be padded to 64 bit alignment.
*/
struct xfs_dsb {
@@ -265,6 +262,8 @@ struct xfs_dsb {
/* must be padded to 64 bit alignment */
};
+#define XFS_SB_CRC_OFF offsetof(struct xfs_dsb, sb_crc)
+
/*
* Misc. Flags - warning - these will be cleared by xfs_repair unless
* a feature bit is set when the flag is used.
diff --git a/libxfs/xfs_ondisk.h b/libxfs/xfs_ondisk.h
index e8cdd77d0..23c133fd3 100644
--- a/libxfs/xfs_ondisk.h
+++ b/libxfs/xfs_ondisk.h
@@ -85,6 +85,7 @@ xfs_check_ondisk_structs(void)
XFS_CHECK_STRUCT_SIZE(xfs_attr_leaf_name_remote_t, 12);
*/
+ XFS_CHECK_OFFSET(struct xfs_dsb, sb_crc, 224);
XFS_CHECK_OFFSET(xfs_attr_leaf_name_local_t, valuelen, 0);
XFS_CHECK_OFFSET(xfs_attr_leaf_name_local_t, namelen, 2);
XFS_CHECK_OFFSET(xfs_attr_leaf_name_local_t, nameval, 3);
* [PATCH 33/64] xfs: clean up extent free log intent item tracepoint callsites
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (31 preceding siblings ...)
2024-10-02 1:16 ` [PATCH 32/64] xfs: don't use the incore struct xfs_sb for offsets into struct xfs_dsb Darrick J. Wong
@ 2024-10-02 1:16 ` Darrick J. Wong
2024-10-02 1:16 ` [PATCH 34/64] xfs: convert "skip_discard" to a proper flags bitset Darrick J. Wong
` (30 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:16 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 4e0e2c0fe35b44cd4db6a138ed4316178ed60b5c
Pass the incore EFI structure to the tracepoints instead of open-coding
the argument passing. This cleans up the call sites a bit.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
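Reader's note, not part of the patch: in userspace the tracepoints are stubbed out as macros, and the diff below switches one stub from a fixed five-parameter form to a variadic one. A minimal sketch of why the variadic form tolerates call-site signature changes follows; `trace_demo_free_defer`, `demo_efi` and `demo_trace_call_sites()` are hypothetical names.

```c
#include <assert.h>

/* Userspace-style stub: accepts any argument list and expands to nothing. */
#define trace_demo_free_defer(...)	((void) 0)

/* Hypothetical stand-in for the incore extent-free intent item. */
struct demo_efi {
	unsigned long long	startblock;
	unsigned int		blockcount;
};

unsigned int
demo_trace_call_sites(const struct demo_efi *efi, int *evaluated)
{
	/* Old style: each field open-coded at the call site. */
	trace_demo_free_defer(0, efi->startblock, efi->blockcount);

	/*
	 * New style: hand over the whole item.  The stub drops its arguments
	 * during preprocessing, so the increment below never executes.
	 */
	trace_demo_free_defer(efi, (*evaluated)++);

	return efi->blockcount;
}
```

Because the arguments vanish before compilation, expressions with side effects should not be passed to such stubs; the counter here stays untouched.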
include/xfs_trace.h | 5 ++---
libxfs/xfs_alloc.c | 7 +++----
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/include/xfs_trace.h b/include/xfs_trace.h
index 812fbb38e..f6d6a6ea1 100644
--- a/include/xfs_trace.h
+++ b/include/xfs_trace.h
@@ -14,7 +14,7 @@
#define trace_xfbtree_trans_commit_buf(...) ((void) 0)
#define trace_xfs_agfl_reset(a,b,c,d) ((void) 0)
-#define trace_xfs_agfl_free_defer(a,b,c,d,e) ((void) 0)
+#define trace_xfs_agfl_free_defer(...) ((void) 0)
#define trace_xfs_alloc_cur_check(...) ((void) 0)
#define trace_xfs_alloc_cur(a) ((void) 0)
#define trace_xfs_alloc_cur_left(a) ((void) 0)
@@ -243,8 +243,7 @@
#define trace_xfs_defer_item_pause(...) ((void) 0)
#define trace_xfs_defer_item_unpause(...) ((void) 0)
-#define trace_xfs_bmap_free_defer(...) ((void) 0)
-#define trace_xfs_bmap_free_deferred(...) ((void) 0)
+#define trace_xfs_extent_free_defer(...) ((void) 0)
#define trace_xfs_rmap_map(...) ((void) 0)
#define trace_xfs_rmap_map_error(...) ((void) 0)
diff --git a/libxfs/xfs_alloc.c b/libxfs/xfs_alloc.c
index ab547d80c..48fdffd46 100644
--- a/libxfs/xfs_alloc.c
+++ b/libxfs/xfs_alloc.c
@@ -2540,7 +2540,7 @@ xfs_defer_agfl_block(
xefi->xefi_owner = oinfo->oi_owner;
xefi->xefi_agresv = XFS_AG_RESV_AGFL;
- trace_xfs_agfl_free_defer(mp, agno, 0, agbno, 1);
+ trace_xfs_agfl_free_defer(mp, xefi);
xfs_extent_free_get_group(mp, xefi);
xfs_defer_add(tp, &xefi->xefi_list, &xfs_agfl_free_defer_type);
@@ -2602,9 +2602,8 @@ xfs_defer_extent_free(
} else {
xefi->xefi_owner = XFS_RMAP_OWN_NULL;
}
- trace_xfs_bmap_free_defer(mp,
- XFS_FSB_TO_AGNO(tp->t_mountp, bno), 0,
- XFS_FSB_TO_AGBNO(tp->t_mountp, bno), len);
+
+ trace_xfs_extent_free_defer(mp, xefi);
xfs_extent_free_get_group(mp, xefi);
*dfpp = xfs_defer_add(tp, &xefi->xefi_list, &xfs_extent_free_defer_type);
* [PATCH 34/64] xfs: convert "skip_discard" to a proper flags bitset
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (32 preceding siblings ...)
2024-10-02 1:16 ` [PATCH 33/64] xfs: clean up extent free log intent item tracepoint callsites Darrick J. Wong
@ 2024-10-02 1:16 ` Darrick J. Wong
2024-10-02 1:16 ` [PATCH 35/64] xfs: pass the fsbno to xfs_perag_intent_get Darrick J. Wong
` (29 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:16 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 980faece91a60c279e7c24cb1d1a378bbbb74bb9
Convert the boolean to skip discard on free into a proper flags field so
that we can add more flags in the next patch.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
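Reader's note, not part of the patch: a compact sketch of the bool-to-bitset pattern applied below. The `DEMO_*` names are hypothetical and only modelled on `XFS_FREE_EXTENT_SKIP_DISCARD`, `XFS_FREE_EXTENT_ALL_FLAGS` and `XFS_EFI_SKIP_DISCARD`. One deliberate difference: the patch checks for unknown bits with `ASSERT()`, whereas this sketch returns `-EINVAL` so the behaviour is observable in a plain userspace test.

```c
#include <assert.h>
#include <errno.h>

/* Caller-facing flags; new flags extend this set without changing callers. */
#define DEMO_FREE_SKIP_DISCARD	(1U << 0)
#define DEMO_FREE_ALL_FLAGS	(DEMO_FREE_SKIP_DISCARD)

/* Flag recorded on the deferred work item. */
#define DEMO_ITEM_SKIP_DISCARD	(1U << 0)

struct demo_item {
	unsigned int	flags;
};

int
demo_defer_free(struct demo_item *item, unsigned int free_flags)
{
	/* Reject bits this version does not know about. */
	if (free_flags & ~DEMO_FREE_ALL_FLAGS)
		return -EINVAL;

	/* Translate caller flags into item flags one bit at a time. */
	item->flags = 0;
	if (free_flags & DEMO_FREE_SKIP_DISCARD)
		item->flags |= DEMO_ITEM_SKIP_DISCARD;
	return 0;
}
```

Call sites that used to pass `false` now pass `0`, and those that passed `true` now name the flag, which also makes them easier to read.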
libxfs/xfs_ag.c | 2 +-
libxfs/xfs_alloc.c | 13 +++++++------
libxfs/xfs_alloc.h | 9 +++++++--
libxfs/xfs_bmap.c | 12 ++++++++----
libxfs/xfs_bmap_btree.c | 2 +-
libxfs/xfs_ialloc.c | 5 ++---
libxfs/xfs_ialloc_btree.c | 2 +-
libxfs/xfs_refcount.c | 6 +++---
libxfs/xfs_refcount_btree.c | 2 +-
repair/bulkload.c | 3 ++-
10 files changed, 33 insertions(+), 23 deletions(-)
diff --git a/libxfs/xfs_ag.c b/libxfs/xfs_ag.c
index 47522d0fc..ed9ac7f58 100644
--- a/libxfs/xfs_ag.c
+++ b/libxfs/xfs_ag.c
@@ -1006,7 +1006,7 @@ xfs_ag_shrink_space(
goto resv_err;
err2 = xfs_free_extent_later(*tpp, args.fsbno, delta, NULL,
- XFS_AG_RESV_NONE, true);
+ XFS_AG_RESV_NONE, XFS_FREE_EXTENT_SKIP_DISCARD);
if (err2)
goto resv_err;
diff --git a/libxfs/xfs_alloc.c b/libxfs/xfs_alloc.c
index 48fdffd46..6f792d280 100644
--- a/libxfs/xfs_alloc.c
+++ b/libxfs/xfs_alloc.c
@@ -2558,7 +2558,7 @@ xfs_defer_extent_free(
xfs_filblks_t len,
const struct xfs_owner_info *oinfo,
enum xfs_ag_resv_type type,
- bool skip_discard,
+ unsigned int free_flags,
struct xfs_defer_pending **dfpp)
{
struct xfs_extent_free_item *xefi;
@@ -2578,6 +2578,7 @@ xfs_defer_extent_free(
ASSERT(len < mp->m_sb.sb_agblocks);
ASSERT(agbno + len <= mp->m_sb.sb_agblocks);
#endif
+ ASSERT(!(free_flags & ~XFS_FREE_EXTENT_ALL_FLAGS));
ASSERT(xfs_extfree_item_cache != NULL);
ASSERT(type != XFS_AG_RESV_AGFL);
@@ -2589,7 +2590,7 @@ xfs_defer_extent_free(
xefi->xefi_startblock = bno;
xefi->xefi_blockcount = (xfs_extlen_t)len;
xefi->xefi_agresv = type;
- if (skip_discard)
+ if (free_flags & XFS_FREE_EXTENT_SKIP_DISCARD)
xefi->xefi_flags |= XFS_EFI_SKIP_DISCARD;
if (oinfo) {
ASSERT(oinfo->oi_offset == 0);
@@ -2617,11 +2618,11 @@ xfs_free_extent_later(
xfs_filblks_t len,
const struct xfs_owner_info *oinfo,
enum xfs_ag_resv_type type,
- bool skip_discard)
+ unsigned int free_flags)
{
struct xfs_defer_pending *dontcare = NULL;
- return xfs_defer_extent_free(tp, bno, len, oinfo, type, skip_discard,
+ return xfs_defer_extent_free(tp, bno, len, oinfo, type, free_flags,
&dontcare);
}
@@ -2646,13 +2647,13 @@ xfs_free_extent_later(
int
xfs_alloc_schedule_autoreap(
const struct xfs_alloc_arg *args,
- bool skip_discard,
+ unsigned int free_flags,
struct xfs_alloc_autoreap *aarp)
{
int error;
error = xfs_defer_extent_free(args->tp, args->fsbno, args->len,
- &args->oinfo, args->resv, skip_discard, &aarp->dfp);
+ &args->oinfo, args->resv, free_flags, &aarp->dfp);
if (error)
return error;
diff --git a/libxfs/xfs_alloc.h b/libxfs/xfs_alloc.h
index 3dc8e44fe..7f51b3cb0 100644
--- a/libxfs/xfs_alloc.h
+++ b/libxfs/xfs_alloc.h
@@ -235,7 +235,12 @@ xfs_buf_to_agfl_bno(
int xfs_free_extent_later(struct xfs_trans *tp, xfs_fsblock_t bno,
xfs_filblks_t len, const struct xfs_owner_info *oinfo,
- enum xfs_ag_resv_type type, bool skip_discard);
+ enum xfs_ag_resv_type type, unsigned int free_flags);
+
+/* Don't issue a discard for the blocks freed. */
+#define XFS_FREE_EXTENT_SKIP_DISCARD (1U << 0)
+
+#define XFS_FREE_EXTENT_ALL_FLAGS (XFS_FREE_EXTENT_SKIP_DISCARD)
/*
* List of extents to be free "later".
@@ -264,7 +269,7 @@ struct xfs_alloc_autoreap {
};
int xfs_alloc_schedule_autoreap(const struct xfs_alloc_arg *args,
- bool skip_discard, struct xfs_alloc_autoreap *aarp);
+ unsigned int free_flags, struct xfs_alloc_autoreap *aarp);
void xfs_alloc_cancel_autoreap(struct xfs_trans *tp,
struct xfs_alloc_autoreap *aarp);
void xfs_alloc_commit_autoreap(struct xfs_trans *tp,
diff --git a/libxfs/xfs_bmap.c b/libxfs/xfs_bmap.c
index 5f4446104..4b10f169f 100644
--- a/libxfs/xfs_bmap.c
+++ b/libxfs/xfs_bmap.c
@@ -599,7 +599,7 @@ xfs_bmap_btree_to_extents(
xfs_rmap_ino_bmbt_owner(&oinfo, ip->i_ino, whichfork);
error = xfs_free_extent_later(cur->bc_tp, cbno, 1, &oinfo,
- XFS_AG_RESV_NONE, false);
+ XFS_AG_RESV_NONE, 0);
if (error)
return error;
@@ -5375,11 +5375,15 @@ xfs_bmap_del_extent_real(
error = xfs_rtfree_blocks(tp, del->br_startblock,
del->br_blockcount);
} else {
+ unsigned int efi_flags = 0;
+
+ if ((bflags & XFS_BMAPI_NODISCARD) ||
+ del->br_state == XFS_EXT_UNWRITTEN)
+ efi_flags |= XFS_FREE_EXTENT_SKIP_DISCARD;
+
error = xfs_free_extent_later(tp, del->br_startblock,
del->br_blockcount, NULL,
- XFS_AG_RESV_NONE,
- ((bflags & XFS_BMAPI_NODISCARD) ||
- del->br_state == XFS_EXT_UNWRITTEN));
+ XFS_AG_RESV_NONE, efi_flags);
}
if (error)
return error;
diff --git a/libxfs/xfs_bmap_btree.c b/libxfs/xfs_bmap_btree.c
index 2a603b4d1..a14ca3595 100644
--- a/libxfs/xfs_bmap_btree.c
+++ b/libxfs/xfs_bmap_btree.c
@@ -281,7 +281,7 @@ xfs_bmbt_free_block(
xfs_rmap_ino_bmbt_owner(&oinfo, ip->i_ino, cur->bc_ino.whichfork);
error = xfs_free_extent_later(cur->bc_tp, fsbno, 1, &oinfo,
- XFS_AG_RESV_NONE, false);
+ XFS_AG_RESV_NONE, 0);
if (error)
return error;
diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index cef2819aa..c526f677e 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -1985,7 +1985,7 @@ xfs_difree_inode_chunk(
return xfs_free_extent_later(tp,
XFS_AGB_TO_FSB(mp, agno, sagbno),
M_IGEO(mp)->ialloc_blks, &XFS_RMAP_OINFO_INODES,
- XFS_AG_RESV_NONE, false);
+ XFS_AG_RESV_NONE, 0);
}
/* holemask is only 16-bits (fits in an unsigned long) */
@@ -2031,8 +2031,7 @@ xfs_difree_inode_chunk(
ASSERT(contigblk % mp->m_sb.sb_spino_align == 0);
error = xfs_free_extent_later(tp,
XFS_AGB_TO_FSB(mp, agno, agbno), contigblk,
- &XFS_RMAP_OINFO_INODES, XFS_AG_RESV_NONE,
- false);
+ &XFS_RMAP_OINFO_INODES, XFS_AG_RESV_NONE, 0);
if (error)
return error;
diff --git a/libxfs/xfs_ialloc_btree.c b/libxfs/xfs_ialloc_btree.c
index 5db9d0b33..5042cc62f 100644
--- a/libxfs/xfs_ialloc_btree.c
+++ b/libxfs/xfs_ialloc_btree.c
@@ -169,7 +169,7 @@ __xfs_inobt_free_block(
xfs_inobt_mod_blockcount(cur, -1);
fsbno = XFS_DADDR_TO_FSB(cur->bc_mp, xfs_buf_daddr(bp));
return xfs_free_extent_later(cur->bc_tp, fsbno, 1,
- &XFS_RMAP_OINFO_INOBT, resv, false);
+ &XFS_RMAP_OINFO_INOBT, resv, 0);
}
STATIC int
diff --git a/libxfs/xfs_refcount.c b/libxfs/xfs_refcount.c
index 47049488b..b4e6900be 100644
--- a/libxfs/xfs_refcount.c
+++ b/libxfs/xfs_refcount.c
@@ -1172,7 +1172,7 @@ xfs_refcount_adjust_extents(
tmp.rc_startblock);
error = xfs_free_extent_later(cur->bc_tp, fsbno,
tmp.rc_blockcount, NULL,
- XFS_AG_RESV_NONE, false);
+ XFS_AG_RESV_NONE, 0);
if (error)
goto out_error;
}
@@ -1236,7 +1236,7 @@ xfs_refcount_adjust_extents(
ext.rc_startblock);
error = xfs_free_extent_later(cur->bc_tp, fsbno,
ext.rc_blockcount, NULL,
- XFS_AG_RESV_NONE, false);
+ XFS_AG_RESV_NONE, 0);
if (error)
goto out_error;
}
@@ -2021,7 +2021,7 @@ xfs_refcount_recover_cow_leftovers(
/* Free the block. */
error = xfs_free_extent_later(tp, fsb,
rr->rr_rrec.rc_blockcount, NULL,
- XFS_AG_RESV_NONE, false);
+ XFS_AG_RESV_NONE, 0);
if (error)
goto out_trans;
diff --git a/libxfs/xfs_refcount_btree.c b/libxfs/xfs_refcount_btree.c
index 362b2a2d7..162f9e689 100644
--- a/libxfs/xfs_refcount_btree.c
+++ b/libxfs/xfs_refcount_btree.c
@@ -108,7 +108,7 @@ xfs_refcountbt_free_block(
be32_add_cpu(&agf->agf_refcount_blocks, -1);
xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_REFCOUNT_BLOCKS);
return xfs_free_extent_later(cur->bc_tp, fsbno, 1,
- &XFS_RMAP_OINFO_REFC, XFS_AG_RESV_METADATA, false);
+ &XFS_RMAP_OINFO_REFC, XFS_AG_RESV_METADATA, 0);
}
STATIC int
diff --git a/repair/bulkload.c b/repair/bulkload.c
index d36e32d99..c96e569ef 100644
--- a/repair/bulkload.c
+++ b/repair/bulkload.c
@@ -196,7 +196,8 @@ bulkload_free_extent(
*/
fsbno = XFS_AGB_TO_FSB(sc->mp, resv->pag->pag_agno, free_agbno);
error = -libxfs_free_extent_later(sc->tp, fsbno, free_aglen,
- &bkl->oinfo, XFS_AG_RESV_NONE, true);
+ &bkl->oinfo, XFS_AG_RESV_NONE,
+ XFS_FREE_EXTENT_SKIP_DISCARD);
if (error)
return error;
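For readers following the bool-to-flags conversion above, here is a minimal standalone sketch of the pattern. It is not the libxfs code: the DEMO_* names and demo_free_flags() are hypothetical stand-ins, and only the shape (build a flags word, then check it against an "all flags" mask) mirrors the hunks in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the XFS_FREE_EXTENT_* flag definitions. */
#define DEMO_FREE_EXTENT_SKIP_DISCARD	(1U << 0)	/* don't issue discard */
#define DEMO_FREE_EXTENT_ALL_FLAGS	(DEMO_FREE_EXTENT_SKIP_DISCARD)

/*
 * Build the flags word the way the xfs_bmap_del_extent_real hunk does:
 * start from zero and set SKIP_DISCARD when either condition holds.
 */
static unsigned int
demo_free_flags(
	bool		nodiscard,
	bool		unwritten)
{
	unsigned int	flags = 0;

	if (nodiscard || unwritten)
		flags |= DEMO_FREE_EXTENT_SKIP_DISCARD;

	/* A bitset lets the callee reject bits it does not know about. */
	assert(!(flags & ~DEMO_FREE_EXTENT_ALL_FLAGS));
	return flags;
}
```

The advantage over a bare true/false argument is that call sites name the behaviour they request, and new flags can be added without touching every caller.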
* [PATCH 35/64] xfs: pass the fsbno to xfs_perag_intent_get
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (33 preceding siblings ...)
2024-10-02 1:16 ` [PATCH 34/64] xfs: convert "skip_discard" to a proper flags bitset Darrick J. Wong
@ 2024-10-02 1:16 ` Darrick J. Wong
2024-10-02 1:17 ` [PATCH 36/64] xfs: add a xefi_entry helper Darrick J. Wong
` (28 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:16 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Source kernel commit: 62d597a197e390a89eadff60b98231e91b32ab83
All callers of xfs_perag_intent_get have a fsbno and need boilerplate
code to turn that into an agno. Just pass the fsbno to
xfs_perag_intent_get and look up the agno there.
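As a rough illustration of the interface change, the sketch below moves the fsbno-to-agno conversion into the lookup helper so callers no longer need a local agno. Everything here is a simplified, hypothetical model (struct demo_mount, demo_perag_intent_get() and the fixed-size perag table are invented for the example); the only assumption taken from XFS is that the AG number lives in the high bits of the fsbno above agblklog.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_AGS	4

/* Hypothetical, cut-down stand-ins for xfs_mount and xfs_perag. */
struct demo_mount {
	uint8_t		agblklog;	/* log2 of AG size in blocks */
	uint32_t	agcount;
};

struct demo_perag {
	uint32_t	agno;
	int		intent_refs;
};

static struct demo_perag demo_perags[DEMO_MAX_AGS];

/* The AG number is stored in the bits above agblklog. */
static inline uint32_t
demo_fsb_to_agno(
	const struct demo_mount	*mp,
	uint64_t		fsbno)
{
	return (uint32_t)(fsbno >> mp->agblklog);
}

/* Callers hand over the fsbno; the helper derives the agno itself. */
static struct demo_perag *
demo_perag_intent_get(
	const struct demo_mount	*mp,
	uint64_t		fsbno)
{
	uint32_t		agno = demo_fsb_to_agno(mp, fsbno);

	assert(agno < mp->agcount && agno < DEMO_MAX_AGS);
	demo_perags[agno].agno = agno;
	demo_perags[agno].intent_refs++;
	return &demo_perags[agno];
}
```

With this shape, each *_get_group function reduces to a single assignment, which is what the defer_item.c hunks in this patch do.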
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
include/xfs_mount.h | 3 ++-
libxfs/defer_item.c | 21 ++++-----------------
2 files changed, 6 insertions(+), 18 deletions(-)
diff --git a/include/xfs_mount.h b/include/xfs_mount.h
index 4492a2f28..a60474a8d 100644
--- a/include/xfs_mount.h
+++ b/include/xfs_mount.h
@@ -298,7 +298,8 @@ struct xfs_defer_drain { /* empty */ };
#define xfs_defer_drain_init(dr) ((void)0)
#define xfs_defer_drain_free(dr) ((void)0)
-#define xfs_perag_intent_get(mp, agno) xfs_perag_get((mp), (agno))
+#define xfs_perag_intent_get(mp, agno) \
+ xfs_perag_get((mp), XFS_FSB_TO_AGNO((mp), (agno)))
#define xfs_perag_intent_put(pag) xfs_perag_put(pag)
static inline void xfs_perag_intent_hold(struct xfs_perag *pag) {}
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index 77a368e6f..fb40a6625 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -79,10 +79,7 @@ xfs_extent_free_get_group(
struct xfs_mount *mp,
struct xfs_extent_free_item *xefi)
{
- xfs_agnumber_t agno;
-
- agno = XFS_FSB_TO_AGNO(mp, xefi->xefi_startblock);
- xefi->xefi_pag = xfs_perag_intent_get(mp, agno);
+ xefi->xefi_pag = xfs_perag_intent_get(mp, xefi->xefi_startblock);
}
/* Release an active AG ref after some freeing work. */
@@ -256,10 +253,7 @@ xfs_rmap_update_get_group(
struct xfs_mount *mp,
struct xfs_rmap_intent *ri)
{
- xfs_agnumber_t agno;
-
- agno = XFS_FSB_TO_AGNO(mp, ri->ri_bmap.br_startblock);
- ri->ri_pag = xfs_perag_intent_get(mp, agno);
+ ri->ri_pag = xfs_perag_intent_get(mp, ri->ri_bmap.br_startblock);
}
/* Release an active AG ref after finishing rmapping work. */
@@ -369,10 +363,7 @@ xfs_refcount_update_get_group(
struct xfs_mount *mp,
struct xfs_refcount_intent *ri)
{
- xfs_agnumber_t agno;
-
- agno = XFS_FSB_TO_AGNO(mp, ri->ri_startblock);
- ri->ri_pag = xfs_perag_intent_get(mp, agno);
+ ri->ri_pag = xfs_perag_intent_get(mp, ri->ri_startblock);
}
/* Release an active AG ref after finishing refcounting work. */
@@ -490,13 +481,9 @@ xfs_bmap_update_get_group(
struct xfs_mount *mp,
struct xfs_bmap_intent *bi)
{
- xfs_agnumber_t agno;
-
if (xfs_ifork_is_realtime(bi->bi_owner, bi->bi_whichfork))
return;
- agno = XFS_FSB_TO_AGNO(mp, bi->bi_bmap.br_startblock);
-
/*
* Bump the intent count on behalf of the deferred rmap and refcount
* intent items that that we can queue when we finish this bmap work.
@@ -504,7 +491,7 @@ xfs_bmap_update_get_group(
* intent drops the intent count, ensuring that the intent count
* remains nonzero across the transaction roll.
*/
- bi->bi_pag = xfs_perag_intent_get(mp, agno);
+ bi->bi_pag = xfs_perag_intent_get(mp, bi->bi_bmap.br_startblock);
}
/* Add this deferred BUI to the transaction. */
* [PATCH 36/64] xfs: add a xefi_entry helper
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (34 preceding siblings ...)
2024-10-02 1:16 ` [PATCH 35/64] xfs: pass the fsbno to xfs_perag_intent_get Darrick J. Wong
@ 2024-10-02 1:17 ` Darrick J. Wong
2024-10-02 1:17 ` [PATCH 37/64] xfs: reuse xfs_extent_free_cancel_item Darrick J. Wong
` (27 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:17 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Source kernel commit: 649c0c2b86ee944a1a9962b310b1b97ead12e97a
Add a helper to translate from the item list head to the
xfs_extent_free_item structure and use it to shorten assignments
and avoid the need for extra local variables.
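For anyone unfamiliar with the idiom, this is a self-contained sketch of what such an entry helper does. The demo_* types and the demo_container_of macro are hypothetical, userspace-only stand-ins for list_head, list_entry and xfs_extent_free_item; they only show how the embedded list node is mapped back to its containing structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal list node, standing in for struct list_head. */
struct demo_list_head {
	struct demo_list_head	*next;
	struct demo_list_head	*prev;
};

/* Step back from a member pointer to the structure that embeds it. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct xfs_extent_free_item. */
struct demo_free_item {
	unsigned long long	startblock;
	struct demo_list_head	list;
};

/* One typed helper replaces a local variable plus container_of() call. */
static inline struct demo_free_item *
demo_item_entry(
	const struct demo_list_head	*e)
{
	return demo_container_of(e, struct demo_free_item, list);
}
```

Because the helper returns a typed pointer, a caller can initialise its local variable at the declaration, as the hunks in this patch do.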
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/defer_item.c | 24 ++++++++++--------------
1 file changed, 10 insertions(+), 14 deletions(-)
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index fb40a6625..8cb27912f 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -32,6 +32,11 @@
/* Extent Freeing */
+static inline struct xfs_extent_free_item *xefi_entry(const struct list_head *e)
+{
+ return list_entry(e, struct xfs_extent_free_item, xefi_list);
+}
+
/* Sort bmap items by AG. */
static int
xfs_extent_free_diff_items(
@@ -39,11 +44,8 @@ xfs_extent_free_diff_items(
const struct list_head *a,
const struct list_head *b)
{
- const struct xfs_extent_free_item *ra;
- const struct xfs_extent_free_item *rb;
-
- ra = container_of(a, struct xfs_extent_free_item, xefi_list);
- rb = container_of(b, struct xfs_extent_free_item, xefi_list);
+ struct xfs_extent_free_item *ra = xefi_entry(a);
+ struct xfs_extent_free_item *rb = xefi_entry(b);
return ra->xefi_pag->pag_agno - rb->xefi_pag->pag_agno;
}
@@ -99,12 +101,10 @@ xfs_extent_free_finish_item(
struct xfs_btree_cur **state)
{
struct xfs_owner_info oinfo = { };
- struct xfs_extent_free_item *xefi;
+ struct xfs_extent_free_item *xefi = xefi_entry(item);
xfs_agblock_t agbno;
int error = 0;
- xefi = container_of(item, struct xfs_extent_free_item, xefi_list);
-
oinfo.oi_owner = xefi->xefi_owner;
if (xefi->xefi_flags & XFS_EFI_ATTR_FORK)
oinfo.oi_flags |= XFS_OWNER_INFO_ATTR_FORK;
@@ -143,9 +143,7 @@ STATIC void
xfs_extent_free_cancel_item(
struct list_head *item)
{
- struct xfs_extent_free_item *xefi;
-
- xefi = container_of(item, struct xfs_extent_free_item, xefi_list);
+ struct xfs_extent_free_item *xefi = xefi_entry(item);
xfs_extent_free_put_group(xefi);
kmem_cache_free(xfs_extfree_item_cache, xefi);
@@ -173,13 +171,11 @@ xfs_agfl_free_finish_item(
{
struct xfs_owner_info oinfo = { };
struct xfs_mount *mp = tp->t_mountp;
- struct xfs_extent_free_item *xefi;
+ struct xfs_extent_free_item *xefi = xefi_entry(item);
struct xfs_buf *agbp;
int error;
xfs_agblock_t agbno;
- xefi = container_of(item, struct xfs_extent_free_item, xefi_list);
-
ASSERT(xefi->xefi_blockcount == 1);
agbno = XFS_FSB_TO_AGBNO(mp, xefi->xefi_startblock);
oinfo.oi_owner = xefi->xefi_owner;
* [PATCH 37/64] xfs: reuse xfs_extent_free_cancel_item
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (35 preceding siblings ...)
2024-10-02 1:17 ` [PATCH 36/64] xfs: add a xefi_entry helper Darrick J. Wong
@ 2024-10-02 1:17 ` Darrick J. Wong
2024-10-02 1:17 ` [PATCH 38/64] xfs: remove duplicate asserts in xfs_defer_extent_free Darrick J. Wong
` (26 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:17 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Source kernel commit: 61665fae4e4302f2a48de56749640a9f1a4c2ec5
Reuse xfs_extent_free_cancel_item to put the AG/RTG and free the item in
a few places that currently open code the logic.
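The control flow after this change can be sketched in isolation as follows. This is a hypothetical userspace model (demo_group, demo_item and the counter are invented for the example, and malloc/free stand in for the slab cache); it only demonstrates that the finish path keeps the item on -EAGAIN and otherwise funnels through the single cancel helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the perag reference and the XEFI. */
struct demo_group {
	int			refs;
};

struct demo_item {
	struct demo_group	*grp;
};

static int demo_items_freed;

/* Single place that drops the group reference and frees the item. */
static void
demo_cancel_item(
	struct demo_item	*item)
{
	assert(item->grp->refs > 0);
	item->grp->refs--;
	free(item);
	demo_items_freed++;
}

/*
 * Finish an item. On -EAGAIN the item must survive so that it can be
 * retried in a new transaction; any other outcome releases it.
 */
static int
demo_finish_item(
	struct demo_item	*item,
	int			error)
{
	if (error != -EAGAIN)
		demo_cancel_item(item);
	return error;
}
```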
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/defer_item.c | 32 ++++++++++++++------------------
1 file changed, 14 insertions(+), 18 deletions(-)
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index 8cb27912f..dd88e75e9 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -92,6 +92,17 @@ xfs_extent_free_put_group(
xfs_perag_intent_put(xefi->xefi_pag);
}
+/* Cancel a free extent. */
+STATIC void
+xfs_extent_free_cancel_item(
+ struct list_head *item)
+{
+ struct xfs_extent_free_item *xefi = xefi_entry(item);
+
+ xfs_extent_free_put_group(xefi);
+ kmem_cache_free(xfs_extfree_item_cache, xefi);
+}
+
/* Process a free extent. */
STATIC int
xfs_extent_free_finish_item(
@@ -123,11 +134,8 @@ xfs_extent_free_finish_item(
* Don't free the XEFI if we need a new transaction to complete
* processing of it.
*/
- if (error == -EAGAIN)
- return error;
-
- xfs_extent_free_put_group(xefi);
- kmem_cache_free(xfs_extfree_item_cache, xefi);
+ if (error != -EAGAIN)
+ xfs_extent_free_cancel_item(item);
return error;
}
@@ -138,17 +146,6 @@ xfs_extent_free_abort_intent(
{
}
-/* Cancel a free extent. */
-STATIC void
-xfs_extent_free_cancel_item(
- struct list_head *item)
-{
- struct xfs_extent_free_item *xefi = xefi_entry(item);
-
- xfs_extent_free_put_group(xefi);
- kmem_cache_free(xfs_extfree_item_cache, xefi);
-}
-
const struct xfs_defer_op_type xfs_extent_free_defer_type = {
.name = "extent_free",
.create_intent = xfs_extent_free_create_intent,
@@ -185,8 +182,7 @@ xfs_agfl_free_finish_item(
error = xfs_free_ag_extent(tp, agbp, xefi->xefi_pag->pag_agno,
agbno, 1, &oinfo, XFS_AG_RESV_AGFL);
- xfs_extent_free_put_group(xefi);
- kmem_cache_free(xfs_extfree_item_cache, xefi);
+ xfs_extent_free_cancel_item(item);
return error;
}
* [PATCH 38/64] xfs: remove duplicate asserts in xfs_defer_extent_free
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (36 preceding siblings ...)
2024-10-02 1:17 ` [PATCH 37/64] xfs: reuse xfs_extent_free_cancel_item Darrick J. Wong
@ 2024-10-02 1:17 ` Darrick J. Wong
2024-10-02 1:17 ` [PATCH 39/64] xfs: remove xfs_defer_agfl_block Darrick J. Wong
` (25 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:17 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Source kernel commit: 851a6781895a0f6e0ba75168dc7aecc132d13e6a
The bno/len verification is already done by the calls to
xfs_verify_rtbext / xfs_verify_fsbext, and reporting a corruption error
seems like better handling than tripping an assert anyway.
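To make the distinction concrete, here is a small sketch of validating an extent and returning an error instead of asserting. It is a simplified model, not xfs_verify_fsbext: struct demo_geom, the helper names and the DEMO_EFSCORRUPTED value are all hypothetical, and the checks shown are only an approximation of the bounds the removed asserts covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error code standing in for EFSCORRUPTED. */
#define DEMO_EFSCORRUPTED	117

/* Hypothetical subset of the superblock geometry. */
struct demo_geom {
	uint32_t	agcount;
	uint32_t	agblocks;
	uint8_t		agblklog;
};

/* Simplified bounds check: the extent must sit inside one valid AG. */
static bool
demo_verify_fsbext(
	const struct demo_geom	*g,
	uint64_t		bno,
	uint64_t		len)
{
	uint64_t		agno = bno >> g->agblklog;
	uint64_t		agbno = bno & ((1ULL << g->agblklog) - 1);

	if (len == 0 || agno >= g->agcount)
		return false;
	if (agbno >= g->agblocks || len > g->agblocks - agbno)
		return false;
	return true;
}

static int
demo_defer_extent_free(
	const struct demo_geom	*g,
	uint64_t		bno,
	uint64_t		len)
{
	/* Caller bugs may still assert; bad block numbers must not. */
	assert(g != NULL);

	if (!demo_verify_fsbext(g, bno, len))
		return -DEMO_EFSCORRUPTED;
	return 0;
}
```

A bad extent that originates from on-disk metadata then surfaces as an error the caller can handle, rather than aborting a debug build.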
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/xfs_alloc.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/libxfs/xfs_alloc.c b/libxfs/xfs_alloc.c
index 6f792d280..93e628e8c 100644
--- a/libxfs/xfs_alloc.c
+++ b/libxfs/xfs_alloc.c
@@ -2563,23 +2563,10 @@ xfs_defer_extent_free(
{
struct xfs_extent_free_item *xefi;
struct xfs_mount *mp = tp->t_mountp;
-#ifdef DEBUG
- xfs_agnumber_t agno;
- xfs_agblock_t agbno;
- ASSERT(bno != NULLFSBLOCK);
- ASSERT(len > 0);
ASSERT(len <= XFS_MAX_BMBT_EXTLEN);
ASSERT(!isnullstartblock(bno));
- agno = XFS_FSB_TO_AGNO(mp, bno);
- agbno = XFS_FSB_TO_AGBNO(mp, bno);
- ASSERT(agno < mp->m_sb.sb_agcount);
- ASSERT(agbno < mp->m_sb.sb_agblocks);
- ASSERT(len < mp->m_sb.sb_agblocks);
- ASSERT(agbno + len <= mp->m_sb.sb_agblocks);
-#endif
ASSERT(!(free_flags & ~XFS_FREE_EXTENT_ALL_FLAGS));
- ASSERT(xfs_extfree_item_cache != NULL);
ASSERT(type != XFS_AG_RESV_AGFL);
if (XFS_IS_CORRUPT(mp, !xfs_verify_fsbext(mp, bno, len)))
* [PATCH 39/64] xfs: remove xfs_defer_agfl_block
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (37 preceding siblings ...)
2024-10-02 1:17 ` [PATCH 38/64] xfs: remove duplicate asserts in xfs_defer_extent_free Darrick J. Wong
@ 2024-10-02 1:17 ` Darrick J. Wong
2024-10-02 1:18 ` [PATCH 40/64] xfs: move xfs_extent_free_defer_add to xfs_extfree_item.c Darrick J. Wong
` (24 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:17 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Source kernel commit: 7272f77c67c0710918e5678266f8dad6e3bfc8d2
xfs_free_extent_later can handle the extra AGFL special casing with
very little extra logic.
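The "very little extra logic" amounts to choosing the deferred-op type from the reservation type. A minimal sketch of that dispatch, using hypothetical demo_* names rather than the real xfs_defer_op_type instances:

```c
#include <assert.h>

/* Hypothetical subset of the AG reservation types. */
enum demo_resv_type {
	DEMO_RESV_NONE,
	DEMO_RESV_METADATA,
	DEMO_RESV_AGFL,
};

/* Hypothetical stand-in for struct xfs_defer_op_type. */
struct demo_defer_type {
	const char	*name;
};

static const struct demo_defer_type demo_extent_free_type = { "extent_free" };
static const struct demo_defer_type demo_agfl_free_type = { "agfl_free" };

/* AGFL frees get their own op type; everything else shares one. */
static const struct demo_defer_type *
demo_pick_defer_type(
	enum demo_resv_type	resv)
{
	assert(resv <= DEMO_RESV_AGFL);

	if (resv == DEMO_RESV_AGFL)
		return &demo_agfl_free_type;
	return &demo_extent_free_type;
}
```

Keeping the choice in one function means callers such as the freelist fixup path only pass the reservation type and need no dedicated AGFL entry point.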
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/xfs_alloc.c | 68 +++++++++++++++++-----------------------------------
1 file changed, 22 insertions(+), 46 deletions(-)
diff --git a/libxfs/xfs_alloc.c b/libxfs/xfs_alloc.c
index 93e628e8c..60ac73828 100644
--- a/libxfs/xfs_alloc.c
+++ b/libxfs/xfs_alloc.c
@@ -2505,48 +2505,6 @@ xfs_agfl_reset(
clear_bit(XFS_AGSTATE_AGFL_NEEDS_RESET, &pag->pag_opstate);
}
-/*
- * Defer an AGFL block free. This is effectively equivalent to
- * xfs_free_extent_later() with some special handling particular to AGFL blocks.
- *
- * Deferring AGFL frees helps prevent log reservation overruns due to too many
- * allocation operations in a transaction. AGFL frees are prone to this problem
- * because for one they are always freed one at a time. Further, an immediate
- * AGFL block free can cause a btree join and require another block free before
- * the real allocation can proceed. Deferring the free disconnects freeing up
- * the AGFL slot from freeing the block.
- */
-static int
-xfs_defer_agfl_block(
- struct xfs_trans *tp,
- xfs_agnumber_t agno,
- xfs_agblock_t agbno,
- struct xfs_owner_info *oinfo)
-{
- struct xfs_mount *mp = tp->t_mountp;
- struct xfs_extent_free_item *xefi;
- xfs_fsblock_t fsbno = XFS_AGB_TO_FSB(mp, agno, agbno);
-
- ASSERT(xfs_extfree_item_cache != NULL);
- ASSERT(oinfo != NULL);
-
- if (XFS_IS_CORRUPT(mp, !xfs_verify_fsbno(mp, fsbno)))
- return -EFSCORRUPTED;
-
- xefi = kmem_cache_zalloc(xfs_extfree_item_cache,
- GFP_KERNEL | __GFP_NOFAIL);
- xefi->xefi_startblock = fsbno;
- xefi->xefi_blockcount = 1;
- xefi->xefi_owner = oinfo->oi_owner;
- xefi->xefi_agresv = XFS_AG_RESV_AGFL;
-
- trace_xfs_agfl_free_defer(mp, xefi);
-
- xfs_extent_free_get_group(mp, xefi);
- xfs_defer_add(tp, &xefi->xefi_list, &xfs_agfl_free_defer_type);
- return 0;
-}
-
/*
* Add the extent to the list of extents to be free at transaction end.
* The list is maintained sorted (by block number).
@@ -2567,7 +2525,6 @@ xfs_defer_extent_free(
ASSERT(len <= XFS_MAX_BMBT_EXTLEN);
ASSERT(!isnullstartblock(bno));
ASSERT(!(free_flags & ~XFS_FREE_EXTENT_ALL_FLAGS));
- ASSERT(type != XFS_AG_RESV_AGFL);
if (XFS_IS_CORRUPT(mp, !xfs_verify_fsbext(mp, bno, len)))
return -EFSCORRUPTED;
@@ -2594,7 +2551,13 @@ xfs_defer_extent_free(
trace_xfs_extent_free_defer(mp, xefi);
xfs_extent_free_get_group(mp, xefi);
- *dfpp = xfs_defer_add(tp, &xefi->xefi_list, &xfs_extent_free_defer_type);
+
+ if (xefi->xefi_agresv == XFS_AG_RESV_AGFL)
+ *dfpp = xfs_defer_add(tp, &xefi->xefi_list,
+ &xfs_agfl_free_defer_type);
+ else
+ *dfpp = xfs_defer_add(tp, &xefi->xefi_list,
+ &xfs_extent_free_defer_type);
return 0;
}
@@ -2852,8 +2815,21 @@ xfs_alloc_fix_freelist(
if (error)
goto out_agbp_relse;
- /* defer agfl frees */
- error = xfs_defer_agfl_block(tp, args->agno, bno, &targs.oinfo);
+ /*
+ * Defer the AGFL block free.
+ *
+ * This helps to prevent log reservation overruns due to too
+ * many allocation operations in a transaction. AGFL frees are
+ * prone to this problem because for one they are always freed
+ * one at a time. Further, an immediate AGFL block free can
+ * cause a btree join and require another block free before the
+ * real allocation can proceed.
+ * Deferring the free disconnects freeing up the AGFL slot from
+ * freeing the block.
+ */
+ error = xfs_free_extent_later(tp,
+ XFS_AGB_TO_FSB(mp, args->agno, bno), 1,
+ &targs.oinfo, XFS_AG_RESV_AGFL, 0);
if (error)
goto out_agbp_relse;
}
* [PATCH 40/64] xfs: move xfs_extent_free_defer_add to xfs_extfree_item.c
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (38 preceding siblings ...)
2024-10-02 1:17 ` [PATCH 39/64] xfs: remove xfs_defer_agfl_block Darrick J. Wong
@ 2024-10-02 1:18 ` Darrick J. Wong
2024-10-02 1:18 ` [PATCH 41/64] xfs: give rmap btree cursor error tracepoints their own class Darrick J. Wong
` (23 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:18 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 84a3c1576c5aade32170fae6c61d51bd2d16010f
Move the code that adds the incore xfs_extent_free_item deferred work
data to a transaction to live with the EFI log item code. This means
that the allocator code no longer has to know about the inner workings
of the EFI log items.
As a consequence, we can get rid of the _{get,put}_group helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/defer_item.c | 28 +++++++++++++++-------------
libxfs/defer_item.h | 7 +++++++
libxfs/xfs_alloc.c | 12 ++----------
libxfs/xfs_alloc.h | 3 ---
4 files changed, 24 insertions(+), 26 deletions(-)
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index dd88e75e9..2df0ce4e8 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -27,6 +27,7 @@
#include "defer_item.h"
#include "xfs_ag.h"
#include "xfs_exchmaps.h"
+#include "defer_item.h"
/* Dummy defer item ops, since we don't do logging. */
@@ -75,21 +76,22 @@ xfs_extent_free_create_done(
return NULL;
}
-/* Take an active ref to the AG containing the space we're freeing. */
+/* Add this deferred EFI to the transaction. */
void
-xfs_extent_free_get_group(
- struct xfs_mount *mp,
- struct xfs_extent_free_item *xefi)
+xfs_extent_free_defer_add(
+ struct xfs_trans *tp,
+ struct xfs_extent_free_item *xefi,
+ struct xfs_defer_pending **dfpp)
{
+ struct xfs_mount *mp = tp->t_mountp;
+
xefi->xefi_pag = xfs_perag_intent_get(mp, xefi->xefi_startblock);
-}
-
-/* Release an active AG ref after some freeing work. */
-static inline void
-xfs_extent_free_put_group(
- struct xfs_extent_free_item *xefi)
-{
- xfs_perag_intent_put(xefi->xefi_pag);
+ if (xefi->xefi_agresv == XFS_AG_RESV_AGFL)
+ *dfpp = xfs_defer_add(tp, &xefi->xefi_list,
+ &xfs_agfl_free_defer_type);
+ else
+ *dfpp = xfs_defer_add(tp, &xefi->xefi_list,
+ &xfs_extent_free_defer_type);
}
/* Cancel a free extent. */
@@ -99,7 +101,7 @@ xfs_extent_free_cancel_item(
{
struct xfs_extent_free_item *xefi = xefi_entry(item);
- xfs_extent_free_put_group(xefi);
+ xfs_perag_intent_put(xefi->xefi_pag);
kmem_cache_free(xfs_extfree_item_cache, xefi);
}
diff --git a/libxfs/defer_item.h b/libxfs/defer_item.h
index df2b8d68b..03f3f1505 100644
--- a/libxfs/defer_item.h
+++ b/libxfs/defer_item.h
@@ -23,4 +23,11 @@ struct xfs_exchmaps_intent;
void xfs_exchmaps_defer_add(struct xfs_trans *tp,
struct xfs_exchmaps_intent *xmi);
+struct xfs_extent_free_item;
+struct xfs_defer_pending;
+
+void xfs_extent_free_defer_add(struct xfs_trans *tp,
+ struct xfs_extent_free_item *xefi,
+ struct xfs_defer_pending **dfpp);
+
#endif /* __LIBXFS_DEFER_ITEM_H_ */
diff --git a/libxfs/xfs_alloc.c b/libxfs/xfs_alloc.c
index 60ac73828..063ac1973 100644
--- a/libxfs/xfs_alloc.c
+++ b/libxfs/xfs_alloc.c
@@ -23,6 +23,7 @@
#include "xfs_ag_resv.h"
#include "xfs_bmap.h"
#include "xfs_health.h"
+#include "defer_item.h"
struct kmem_cache *xfs_extfree_item_cache;
@@ -2548,16 +2549,7 @@ xfs_defer_extent_free(
xefi->xefi_owner = XFS_RMAP_OWN_NULL;
}
- trace_xfs_extent_free_defer(mp, xefi);
-
- xfs_extent_free_get_group(mp, xefi);
-
- if (xefi->xefi_agresv == XFS_AG_RESV_AGFL)
- *dfpp = xfs_defer_add(tp, &xefi->xefi_list,
- &xfs_agfl_free_defer_type);
- else
- *dfpp = xfs_defer_add(tp, &xefi->xefi_list,
- &xfs_extent_free_defer_type);
+ xfs_extent_free_defer_add(tp, xefi, dfpp);
return 0;
}
diff --git a/libxfs/xfs_alloc.h b/libxfs/xfs_alloc.h
index 7f51b3cb0..fae170825 100644
--- a/libxfs/xfs_alloc.h
+++ b/libxfs/xfs_alloc.h
@@ -256,9 +256,6 @@ struct xfs_extent_free_item {
enum xfs_ag_resv_type xefi_agresv;
};
-void xfs_extent_free_get_group(struct xfs_mount *mp,
- struct xfs_extent_free_item *xefi);
-
#define XFS_EFI_SKIP_DISCARD (1U << 0) /* don't issue discard */
#define XFS_EFI_ATTR_FORK (1U << 1) /* freeing attr fork block */
#define XFS_EFI_BMBT_BLOCK (1U << 2) /* freeing bmap btree block */
* [PATCH 41/64] xfs: give rmap btree cursor error tracepoints their own class
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (39 preceding siblings ...)
2024-10-02 1:18 ` [PATCH 40/64] xfs: move xfs_extent_free_defer_add to xfs_extfree_item.c Darrick J. Wong
@ 2024-10-02 1:18 ` Darrick J. Wong
2024-10-02 1:18 ` [PATCH 42/64] xfs: pass btree cursors to rmap btree tracepoints Darrick J. Wong
` (22 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:18 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 71f5a17e526775f001f643c9d54e5b59fa29d7ac
Create a new tracepoint class for btree-related errors, then convert all
the rmap tracepoints to use it. Also fix the one tracepoint that was
abusing the old class by making it a separate tracepoint.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_rmap.c | 33 +++++++++++----------------------
1 file changed, 11 insertions(+), 22 deletions(-)
diff --git a/libxfs/xfs_rmap.c b/libxfs/xfs_rmap.c
index c3195e532..74a30ed81 100644
--- a/libxfs/xfs_rmap.c
+++ b/libxfs/xfs_rmap.c
@@ -110,8 +110,7 @@ xfs_rmap_update(
xfs_rmap_irec_offset_pack(irec));
error = xfs_btree_update(cur, &rec);
if (error)
- trace_xfs_rmap_update_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_rmap_update_error(cur, error, _RET_IP_);
return error;
}
@@ -154,8 +153,7 @@ xfs_rmap_insert(
}
done:
if (error)
- trace_xfs_rmap_insert_error(rcur->bc_mp,
- rcur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_rmap_insert_error(rcur, error, _RET_IP_);
return error;
}
@@ -193,8 +191,7 @@ xfs_rmap_delete(
}
done:
if (error)
- trace_xfs_rmap_delete_error(rcur->bc_mp,
- rcur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_rmap_delete_error(rcur, error, _RET_IP_);
return error;
}
@@ -815,8 +812,7 @@ xfs_rmap_unmap(
unwritten, oinfo);
out_error:
if (error)
- trace_xfs_rmap_unmap_error(mp, cur->bc_ag.pag->pag_agno,
- error, _RET_IP_);
+ trace_xfs_rmap_unmap_error(cur, error, _RET_IP_);
return error;
}
@@ -1147,8 +1143,7 @@ xfs_rmap_map(
unwritten, oinfo);
out_error:
if (error)
- trace_xfs_rmap_map_error(mp, cur->bc_ag.pag->pag_agno,
- error, _RET_IP_);
+ trace_xfs_rmap_map_error(cur, error, _RET_IP_);
return error;
}
@@ -1343,8 +1338,7 @@ xfs_rmap_convert(
RIGHT.rm_blockcount > XFS_RMAP_LEN_MAX)
state &= ~RMAP_RIGHT_CONTIG;
- trace_xfs_rmap_convert_state(mp, cur->bc_ag.pag->pag_agno, state,
- _RET_IP_);
+ trace_xfs_rmap_convert_state(cur, state, _RET_IP_);
/* reset the cursor back to PREV */
error = xfs_rmap_lookup_le(cur, bno, owner, offset, oldext, NULL, &i);
@@ -1697,8 +1691,7 @@ xfs_rmap_convert(
unwritten, oinfo);
done:
if (error)
- trace_xfs_rmap_convert_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_rmap_convert_error(cur, error, _RET_IP_);
return error;
}
@@ -1821,8 +1814,7 @@ xfs_rmap_convert_shared(
RIGHT.rm_blockcount > XFS_RMAP_LEN_MAX)
state &= ~RMAP_RIGHT_CONTIG;
- trace_xfs_rmap_convert_state(mp, cur->bc_ag.pag->pag_agno, state,
- _RET_IP_);
+ trace_xfs_rmap_convert_state(cur, state, _RET_IP_);
/*
* Switch out based on the FILLING and CONTIG state bits.
*/
@@ -2124,8 +2116,7 @@ xfs_rmap_convert_shared(
unwritten, oinfo);
done:
if (error)
- trace_xfs_rmap_convert_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_rmap_convert_error(cur, error, _RET_IP_);
return error;
}
@@ -2324,8 +2315,7 @@ xfs_rmap_unmap_shared(
unwritten, oinfo);
out_error:
if (error)
- trace_xfs_rmap_unmap_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_rmap_unmap_error(cur, error, _RET_IP_);
return error;
}
@@ -2485,8 +2475,7 @@ xfs_rmap_map_shared(
unwritten, oinfo);
out_error:
if (error)
- trace_xfs_rmap_map_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_rmap_map_error(cur, error, _RET_IP_);
return error;
}
* [PATCH 42/64] xfs: pass btree cursors to rmap btree tracepoints
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (40 preceding siblings ...)
2024-10-02 1:18 ` [PATCH 41/64] xfs: give rmap btree cursor error tracepoints their own class Darrick J. Wong
@ 2024-10-02 1:18 ` Darrick J. Wong
2024-10-02 1:19 ` [PATCH 43/64] xfs: clean up rmap log intent item tracepoint callsites Darrick J. Wong
` (21 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:18 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 47492ed124219b37acf65cd931c1e45d5bc0c274
Prepare the rmap btree tracepoints for use with realtime rmap btrees by
making them take the btree cursor object as a parameter. This will save
us a lot of trouble later on.
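A rough model of why taking the cursor helps: the trace helper can pull whatever identifies the btree out of the cursor, so call sites stop spelling out mp and agno. The sketch below is hypothetical throughout (demo_cursor, demo_perag and the snprintf-based "tracepoint" are invented for the example and are unrelated to the real trace macros).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for xfs_perag and xfs_btree_cur. */
struct demo_perag {
	unsigned int		agno;
};

struct demo_cursor {
	struct demo_perag	*pag;
};

/*
 * The "tracepoint" derives the AG number from the cursor. If a
 * different cursor type later needs different identifying fields,
 * only this function has to change, not every caller.
 */
static int
demo_trace_rmap_error(
	char				*buf,
	size_t				len,
	const struct demo_cursor	*cur,
	int				error)
{
	assert(cur != NULL && cur->pag != NULL);

	return snprintf(buf, len, "agno 0x%x error %d",
			cur->pag->agno, error);
}
```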
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_rmap.c | 184 +++++++++++++++++++++--------------------------------
1 file changed, 73 insertions(+), 111 deletions(-)
diff --git a/libxfs/xfs_rmap.c b/libxfs/xfs_rmap.c
index 74a30ed81..46bee57cc 100644
--- a/libxfs/xfs_rmap.c
+++ b/libxfs/xfs_rmap.c
@@ -99,8 +99,7 @@ xfs_rmap_update(
union xfs_btree_rec rec;
int error;
- trace_xfs_rmap_update(cur->bc_mp, cur->bc_ag.pag->pag_agno,
- irec->rm_startblock, irec->rm_blockcount,
+ trace_xfs_rmap_update(cur, irec->rm_startblock, irec->rm_blockcount,
irec->rm_owner, irec->rm_offset, irec->rm_flags);
rec.rmap.rm_startblock = cpu_to_be32(irec->rm_startblock);
@@ -126,8 +125,7 @@ xfs_rmap_insert(
int i;
int error;
- trace_xfs_rmap_insert(rcur->bc_mp, rcur->bc_ag.pag->pag_agno, agbno,
- len, owner, offset, flags);
+ trace_xfs_rmap_insert(rcur, agbno, len, owner, offset, flags);
error = xfs_rmap_lookup_eq(rcur, agbno, len, owner, offset, flags, &i);
if (error)
@@ -169,8 +167,7 @@ xfs_rmap_delete(
int i;
int error;
- trace_xfs_rmap_delete(rcur->bc_mp, rcur->bc_ag.pag->pag_agno, agbno,
- len, owner, offset, flags);
+ trace_xfs_rmap_delete(rcur, agbno, len, owner, offset, flags);
error = xfs_rmap_lookup_eq(rcur, agbno, len, owner, offset, flags, &i);
if (error)
@@ -338,8 +335,7 @@ xfs_rmap_find_left_neighbor_helper(
{
struct xfs_find_left_neighbor_info *info = priv;
- trace_xfs_rmap_find_left_neighbor_candidate(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, rec->rm_startblock,
+ trace_xfs_rmap_find_left_neighbor_candidate(cur, rec->rm_startblock,
rec->rm_blockcount, rec->rm_owner, rec->rm_offset,
rec->rm_flags);
@@ -389,8 +385,8 @@ xfs_rmap_find_left_neighbor(
info.high.rm_blockcount = 0;
info.irec = irec;
- trace_xfs_rmap_find_left_neighbor_query(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, bno, 0, owner, offset, flags);
+ trace_xfs_rmap_find_left_neighbor_query(cur, bno, 0, owner, offset,
+ flags);
/*
* Historically, we always used the range query to walk every reverse
@@ -421,8 +417,7 @@ xfs_rmap_find_left_neighbor(
return error;
*stat = 1;
- trace_xfs_rmap_find_left_neighbor_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, irec->rm_startblock,
+ trace_xfs_rmap_find_left_neighbor_result(cur, irec->rm_startblock,
irec->rm_blockcount, irec->rm_owner, irec->rm_offset,
irec->rm_flags);
return 0;
@@ -437,8 +432,7 @@ xfs_rmap_lookup_le_range_helper(
{
struct xfs_find_left_neighbor_info *info = priv;
- trace_xfs_rmap_lookup_le_range_candidate(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, rec->rm_startblock,
+ trace_xfs_rmap_lookup_le_range_candidate(cur, rec->rm_startblock,
rec->rm_blockcount, rec->rm_owner, rec->rm_offset,
rec->rm_flags);
@@ -485,8 +479,7 @@ xfs_rmap_lookup_le_range(
*stat = 0;
info.irec = irec;
- trace_xfs_rmap_lookup_le_range(cur->bc_mp, cur->bc_ag.pag->pag_agno,
- bno, 0, owner, offset, flags);
+ trace_xfs_rmap_lookup_le_range(cur, bno, 0, owner, offset, flags);
/*
* Historically, we always used the range query to walk every reverse
@@ -517,8 +510,7 @@ xfs_rmap_lookup_le_range(
return error;
*stat = 1;
- trace_xfs_rmap_lookup_le_range_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, irec->rm_startblock,
+ trace_xfs_rmap_lookup_le_range_result(cur, irec->rm_startblock,
irec->rm_blockcount, irec->rm_owner, irec->rm_offset,
irec->rm_flags);
return 0;
@@ -630,8 +622,7 @@ xfs_rmap_unmap(
(flags & XFS_RMAP_BMBT_BLOCK);
if (unwritten)
flags |= XFS_RMAP_UNWRITTEN;
- trace_xfs_rmap_unmap(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_unmap(cur, bno, len, unwritten, oinfo);
/*
* We should always have a left record because there's a static record
@@ -647,10 +638,9 @@ xfs_rmap_unmap(
goto out_error;
}
- trace_xfs_rmap_lookup_le_range_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, ltrec.rm_startblock,
- ltrec.rm_blockcount, ltrec.rm_owner,
- ltrec.rm_offset, ltrec.rm_flags);
+ trace_xfs_rmap_lookup_le_range_result(cur, ltrec.rm_startblock,
+ ltrec.rm_blockcount, ltrec.rm_owner, ltrec.rm_offset,
+ ltrec.rm_flags);
ltoff = ltrec.rm_offset;
/*
@@ -717,10 +707,9 @@ xfs_rmap_unmap(
if (ltrec.rm_startblock == bno && ltrec.rm_blockcount == len) {
/* exact match, simply remove the record from rmap tree */
- trace_xfs_rmap_delete(mp, cur->bc_ag.pag->pag_agno,
- ltrec.rm_startblock, ltrec.rm_blockcount,
- ltrec.rm_owner, ltrec.rm_offset,
- ltrec.rm_flags);
+ trace_xfs_rmap_delete(cur, ltrec.rm_startblock,
+ ltrec.rm_blockcount, ltrec.rm_owner,
+ ltrec.rm_offset, ltrec.rm_flags);
error = xfs_btree_delete(cur, &i);
if (error)
goto out_error;
@@ -796,8 +785,7 @@ xfs_rmap_unmap(
else
cur->bc_rec.r.rm_offset = offset + len;
cur->bc_rec.r.rm_flags = flags;
- trace_xfs_rmap_insert(mp, cur->bc_ag.pag->pag_agno,
- cur->bc_rec.r.rm_startblock,
+ trace_xfs_rmap_insert(cur, cur->bc_rec.r.rm_startblock,
cur->bc_rec.r.rm_blockcount,
cur->bc_rec.r.rm_owner,
cur->bc_rec.r.rm_offset,
@@ -808,8 +796,7 @@ xfs_rmap_unmap(
}
out_done:
- trace_xfs_rmap_unmap_done(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_unmap_done(cur, bno, len, unwritten, oinfo);
out_error:
if (error)
trace_xfs_rmap_unmap_error(cur, error, _RET_IP_);
@@ -982,8 +969,7 @@ xfs_rmap_map(
(flags & XFS_RMAP_BMBT_BLOCK);
if (unwritten)
flags |= XFS_RMAP_UNWRITTEN;
- trace_xfs_rmap_map(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_map(cur, bno, len, unwritten, oinfo);
ASSERT(!xfs_rmap_should_skip_owner_update(oinfo));
/*
@@ -996,8 +982,7 @@ xfs_rmap_map(
if (error)
goto out_error;
if (have_lt) {
- trace_xfs_rmap_lookup_le_range_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, ltrec.rm_startblock,
+ trace_xfs_rmap_lookup_le_range_result(cur, ltrec.rm_startblock,
ltrec.rm_blockcount, ltrec.rm_owner,
ltrec.rm_offset, ltrec.rm_flags);
@@ -1035,10 +1020,10 @@ xfs_rmap_map(
error = -EFSCORRUPTED;
goto out_error;
}
- trace_xfs_rmap_find_right_neighbor_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, gtrec.rm_startblock,
- gtrec.rm_blockcount, gtrec.rm_owner,
- gtrec.rm_offset, gtrec.rm_flags);
+ trace_xfs_rmap_find_right_neighbor_result(cur,
+ gtrec.rm_startblock, gtrec.rm_blockcount,
+ gtrec.rm_owner, gtrec.rm_offset,
+ gtrec.rm_flags);
if (!xfs_rmap_is_mergeable(&gtrec, owner, flags))
have_gt = 0;
}
@@ -1075,12 +1060,9 @@ xfs_rmap_map(
* result: |rrrrrrrrrrrrrrrrrrrrrrrrrrrrr|
*/
ltrec.rm_blockcount += gtrec.rm_blockcount;
- trace_xfs_rmap_delete(mp, cur->bc_ag.pag->pag_agno,
- gtrec.rm_startblock,
- gtrec.rm_blockcount,
- gtrec.rm_owner,
- gtrec.rm_offset,
- gtrec.rm_flags);
+ trace_xfs_rmap_delete(cur, gtrec.rm_startblock,
+ gtrec.rm_blockcount, gtrec.rm_owner,
+ gtrec.rm_offset, gtrec.rm_flags);
error = xfs_btree_delete(cur, &i);
if (error)
goto out_error;
@@ -1127,8 +1109,7 @@ xfs_rmap_map(
cur->bc_rec.r.rm_owner = owner;
cur->bc_rec.r.rm_offset = offset;
cur->bc_rec.r.rm_flags = flags;
- trace_xfs_rmap_insert(mp, cur->bc_ag.pag->pag_agno, bno, len,
- owner, offset, flags);
+ trace_xfs_rmap_insert(cur, bno, len, owner, offset, flags);
error = xfs_btree_insert(cur, &i);
if (error)
goto out_error;
@@ -1139,8 +1120,7 @@ xfs_rmap_map(
}
}
- trace_xfs_rmap_map_done(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_map_done(cur, bno, len, unwritten, oinfo);
out_error:
if (error)
trace_xfs_rmap_map_error(cur, error, _RET_IP_);
@@ -1217,8 +1197,7 @@ xfs_rmap_convert(
(flags & (XFS_RMAP_ATTR_FORK | XFS_RMAP_BMBT_BLOCK))));
oldext = unwritten ? XFS_RMAP_UNWRITTEN : 0;
new_endoff = offset + len;
- trace_xfs_rmap_convert(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_convert(cur, bno, len, unwritten, oinfo);
/*
* For the initial lookup, look for an exact match or the left-adjacent
@@ -1234,10 +1213,9 @@ xfs_rmap_convert(
goto done;
}
- trace_xfs_rmap_lookup_le_range_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, PREV.rm_startblock,
- PREV.rm_blockcount, PREV.rm_owner,
- PREV.rm_offset, PREV.rm_flags);
+ trace_xfs_rmap_lookup_le_range_result(cur, PREV.rm_startblock,
+ PREV.rm_blockcount, PREV.rm_owner, PREV.rm_offset,
+ PREV.rm_flags);
ASSERT(PREV.rm_offset <= offset);
ASSERT(PREV.rm_offset + PREV.rm_blockcount >= new_endoff);
@@ -1278,10 +1256,9 @@ xfs_rmap_convert(
error = -EFSCORRUPTED;
goto done;
}
- trace_xfs_rmap_find_left_neighbor_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, LEFT.rm_startblock,
- LEFT.rm_blockcount, LEFT.rm_owner,
- LEFT.rm_offset, LEFT.rm_flags);
+ trace_xfs_rmap_find_left_neighbor_result(cur,
+ LEFT.rm_startblock, LEFT.rm_blockcount,
+ LEFT.rm_owner, LEFT.rm_offset, LEFT.rm_flags);
if (LEFT.rm_startblock + LEFT.rm_blockcount == bno &&
LEFT.rm_offset + LEFT.rm_blockcount == offset &&
xfs_rmap_is_mergeable(&LEFT, owner, newext))
@@ -1319,10 +1296,10 @@ xfs_rmap_convert(
error = -EFSCORRUPTED;
goto done;
}
- trace_xfs_rmap_find_right_neighbor_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, RIGHT.rm_startblock,
- RIGHT.rm_blockcount, RIGHT.rm_owner,
- RIGHT.rm_offset, RIGHT.rm_flags);
+ trace_xfs_rmap_find_right_neighbor_result(cur,
+ RIGHT.rm_startblock, RIGHT.rm_blockcount,
+ RIGHT.rm_owner, RIGHT.rm_offset,
+ RIGHT.rm_flags);
if (bno + len == RIGHT.rm_startblock &&
offset + len == RIGHT.rm_offset &&
xfs_rmap_is_mergeable(&RIGHT, owner, newext))
@@ -1369,10 +1346,9 @@ xfs_rmap_convert(
error = -EFSCORRUPTED;
goto done;
}
- trace_xfs_rmap_delete(mp, cur->bc_ag.pag->pag_agno,
- RIGHT.rm_startblock, RIGHT.rm_blockcount,
- RIGHT.rm_owner, RIGHT.rm_offset,
- RIGHT.rm_flags);
+ trace_xfs_rmap_delete(cur, RIGHT.rm_startblock,
+ RIGHT.rm_blockcount, RIGHT.rm_owner,
+ RIGHT.rm_offset, RIGHT.rm_flags);
error = xfs_btree_delete(cur, &i);
if (error)
goto done;
@@ -1389,10 +1365,9 @@ xfs_rmap_convert(
error = -EFSCORRUPTED;
goto done;
}
- trace_xfs_rmap_delete(mp, cur->bc_ag.pag->pag_agno,
- PREV.rm_startblock, PREV.rm_blockcount,
- PREV.rm_owner, PREV.rm_offset,
- PREV.rm_flags);
+ trace_xfs_rmap_delete(cur, PREV.rm_startblock,
+ PREV.rm_blockcount, PREV.rm_owner,
+ PREV.rm_offset, PREV.rm_flags);
error = xfs_btree_delete(cur, &i);
if (error)
goto done;
@@ -1421,10 +1396,9 @@ xfs_rmap_convert(
* Setting all of a previous oldext extent to newext.
* The left neighbor is contiguous, the right is not.
*/
- trace_xfs_rmap_delete(mp, cur->bc_ag.pag->pag_agno,
- PREV.rm_startblock, PREV.rm_blockcount,
- PREV.rm_owner, PREV.rm_offset,
- PREV.rm_flags);
+ trace_xfs_rmap_delete(cur, PREV.rm_startblock,
+ PREV.rm_blockcount, PREV.rm_owner,
+ PREV.rm_offset, PREV.rm_flags);
error = xfs_btree_delete(cur, &i);
if (error)
goto done;
@@ -1461,10 +1435,9 @@ xfs_rmap_convert(
error = -EFSCORRUPTED;
goto done;
}
- trace_xfs_rmap_delete(mp, cur->bc_ag.pag->pag_agno,
- RIGHT.rm_startblock, RIGHT.rm_blockcount,
- RIGHT.rm_owner, RIGHT.rm_offset,
- RIGHT.rm_flags);
+ trace_xfs_rmap_delete(cur, RIGHT.rm_startblock,
+ RIGHT.rm_blockcount, RIGHT.rm_owner,
+ RIGHT.rm_offset, RIGHT.rm_flags);
error = xfs_btree_delete(cur, &i);
if (error)
goto done;
@@ -1542,8 +1515,7 @@ xfs_rmap_convert(
NEW.rm_blockcount = len;
NEW.rm_flags = newext;
cur->bc_rec.r = NEW;
- trace_xfs_rmap_insert(mp, cur->bc_ag.pag->pag_agno, bno,
- len, owner, offset, newext);
+ trace_xfs_rmap_insert(cur, bno, len, owner, offset, newext);
error = xfs_btree_insert(cur, &i);
if (error)
goto done;
@@ -1601,8 +1573,7 @@ xfs_rmap_convert(
NEW.rm_blockcount = len;
NEW.rm_flags = newext;
cur->bc_rec.r = NEW;
- trace_xfs_rmap_insert(mp, cur->bc_ag.pag->pag_agno, bno,
- len, owner, offset, newext);
+ trace_xfs_rmap_insert(cur, bno, len, owner, offset, newext);
error = xfs_btree_insert(cur, &i);
if (error)
goto done;
@@ -1633,9 +1604,8 @@ xfs_rmap_convert(
NEW = PREV;
NEW.rm_blockcount = offset - PREV.rm_offset;
cur->bc_rec.r = NEW;
- trace_xfs_rmap_insert(mp, cur->bc_ag.pag->pag_agno,
- NEW.rm_startblock, NEW.rm_blockcount,
- NEW.rm_owner, NEW.rm_offset,
+ trace_xfs_rmap_insert(cur, NEW.rm_startblock,
+ NEW.rm_blockcount, NEW.rm_owner, NEW.rm_offset,
NEW.rm_flags);
error = xfs_btree_insert(cur, &i);
if (error)
@@ -1662,8 +1632,7 @@ xfs_rmap_convert(
/* new middle extent - newext */
cur->bc_rec.r.rm_flags &= ~XFS_RMAP_UNWRITTEN;
cur->bc_rec.r.rm_flags |= newext;
- trace_xfs_rmap_insert(mp, cur->bc_ag.pag->pag_agno, bno, len,
- owner, offset, newext);
+ trace_xfs_rmap_insert(cur, bno, len, owner, offset, newext);
error = xfs_btree_insert(cur, &i);
if (error)
goto done;
@@ -1687,8 +1656,7 @@ xfs_rmap_convert(
ASSERT(0);
}
- trace_xfs_rmap_convert_done(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_convert_done(cur, bno, len, unwritten, oinfo);
done:
if (error)
trace_xfs_rmap_convert_error(cur, error, _RET_IP_);
@@ -1727,8 +1695,7 @@ xfs_rmap_convert_shared(
(flags & (XFS_RMAP_ATTR_FORK | XFS_RMAP_BMBT_BLOCK))));
oldext = unwritten ? XFS_RMAP_UNWRITTEN : 0;
new_endoff = offset + len;
- trace_xfs_rmap_convert(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_convert(cur, bno, len, unwritten, oinfo);
/*
* For the initial lookup, look for and exact match or the left-adjacent
@@ -1797,10 +1764,10 @@ xfs_rmap_convert_shared(
error = -EFSCORRUPTED;
goto done;
}
- trace_xfs_rmap_find_right_neighbor_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, RIGHT.rm_startblock,
- RIGHT.rm_blockcount, RIGHT.rm_owner,
- RIGHT.rm_offset, RIGHT.rm_flags);
+ trace_xfs_rmap_find_right_neighbor_result(cur,
+ RIGHT.rm_startblock, RIGHT.rm_blockcount,
+ RIGHT.rm_owner, RIGHT.rm_offset,
+ RIGHT.rm_flags);
if (xfs_rmap_is_mergeable(&RIGHT, owner, newext))
state |= RMAP_RIGHT_CONTIG;
}
@@ -2112,8 +2079,7 @@ xfs_rmap_convert_shared(
ASSERT(0);
}
- trace_xfs_rmap_convert_done(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_convert_done(cur, bno, len, unwritten, oinfo);
done:
if (error)
trace_xfs_rmap_convert_error(cur, error, _RET_IP_);
@@ -2154,8 +2120,7 @@ xfs_rmap_unmap_shared(
xfs_owner_info_unpack(oinfo, &owner, &offset, &flags);
if (unwritten)
flags |= XFS_RMAP_UNWRITTEN;
- trace_xfs_rmap_unmap(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_unmap(cur, bno, len, unwritten, oinfo);
/*
* We should always have a left record because there's a static record
@@ -2311,8 +2276,7 @@ xfs_rmap_unmap_shared(
goto out_error;
}
- trace_xfs_rmap_unmap_done(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_unmap_done(cur, bno, len, unwritten, oinfo);
out_error:
if (error)
trace_xfs_rmap_unmap_error(cur, error, _RET_IP_);
@@ -2350,8 +2314,7 @@ xfs_rmap_map_shared(
xfs_owner_info_unpack(oinfo, &owner, &offset, &flags);
if (unwritten)
flags |= XFS_RMAP_UNWRITTEN;
- trace_xfs_rmap_map(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_map(cur, bno, len, unwritten, oinfo);
/* Is there a left record that abuts our range? */
error = xfs_rmap_find_left_neighbor(cur, bno, owner, offset, flags,
@@ -2376,10 +2339,10 @@ xfs_rmap_map_shared(
error = -EFSCORRUPTED;
goto out_error;
}
- trace_xfs_rmap_find_right_neighbor_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, gtrec.rm_startblock,
- gtrec.rm_blockcount, gtrec.rm_owner,
- gtrec.rm_offset, gtrec.rm_flags);
+ trace_xfs_rmap_find_right_neighbor_result(cur,
+ gtrec.rm_startblock, gtrec.rm_blockcount,
+ gtrec.rm_owner, gtrec.rm_offset,
+ gtrec.rm_flags);
if (!xfs_rmap_is_mergeable(&gtrec, owner, flags))
have_gt = 0;
@@ -2471,8 +2434,7 @@ xfs_rmap_map_shared(
goto out_error;
}
- trace_xfs_rmap_map_done(mp, cur->bc_ag.pag->pag_agno, bno, len,
- unwritten, oinfo);
+ trace_xfs_rmap_map_done(cur, bno, len, unwritten, oinfo);
out_error:
if (error)
trace_xfs_rmap_map_error(cur, error, _RET_IP_);
* [PATCH 43/64] xfs: clean up rmap log intent item tracepoint callsites
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (41 preceding siblings ...)
2024-10-02 1:18 ` [PATCH 42/64] xfs: pass btree cursors to rmap btree tracepoints Darrick J. Wong
@ 2024-10-02 1:19 ` Darrick J. Wong
2024-10-02 1:19 ` [PATCH 44/64] xfs: add a ri_entry helper Darrick J. Wong
` (20 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:19 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: fbe8c7e167a6b226ae0234c26ebb65d8401473a5
Pass the incore rmap structure to the tracepoints instead of open-coding
the argument passing.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_rmap.c | 22 +++++-----------------
libxfs/xfs_rmap.h | 10 ++++++++++
2 files changed, 15 insertions(+), 17 deletions(-)
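To illustrate the idea outside of the kernel tracing macros, here is a minimal userspace sketch. It assumes hypothetical stand-in types (struct demo_intent, demo_trace_intent() and the demo_intent_strings table are made up for this note and are not the XFS definitions). The trace helper takes the whole intent structure, and a value-to-name table in the spirit of XFS_RMAP_INTENT_STRINGS turns the operation type into a readable label.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real XFS types. */
enum demo_intent_type { DEMO_MAP, DEMO_UNMAP, DEMO_CONVERT };

struct demo_intent {
	enum demo_intent_type	type;
	unsigned long long	owner;
	unsigned long long	startblock;
	unsigned int		blockcount;
};

/* Value-to-name pairs, similar in shape to XFS_RMAP_INTENT_STRINGS. */
static const struct { int value; const char *name; } demo_intent_strings[] = {
	{ DEMO_MAP,	"map" },
	{ DEMO_UNMAP,	"unmap" },
	{ DEMO_CONVERT,	"cvt" },
};

/* Look up the symbolic name for an intent type. */
static const char *demo_intent_name(int value)
{
	size_t i;
	size_t n = sizeof(demo_intent_strings) / sizeof(demo_intent_strings[0]);

	for (i = 0; i < n; i++)
		if (demo_intent_strings[i].value == value)
			return demo_intent_strings[i].name;
	return "unknown";
}

/*
 * The "tracepoint" receives the structure itself, so callers no longer
 * unpack every field by hand at each callsite.
 */
static int demo_trace_intent(char *buf, size_t len,
			     const struct demo_intent *ri)
{
	return snprintf(buf, len, "op %s owner %llu startblock %llu len %u",
			demo_intent_name(ri->type), ri->owner,
			ri->startblock, ri->blockcount);
}
```

With this shape, adding a field to the trace output only touches the helper, not every caller.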
diff --git a/libxfs/xfs_rmap.c b/libxfs/xfs_rmap.c
index 46bee57cc..57c0d9418 100644
--- a/libxfs/xfs_rmap.c
+++ b/libxfs/xfs_rmap.c
@@ -2584,20 +2584,15 @@ xfs_rmap_finish_one(
struct xfs_rmap_intent *ri,
struct xfs_btree_cur **pcur)
{
+ struct xfs_owner_info oinfo;
struct xfs_mount *mp = tp->t_mountp;
struct xfs_btree_cur *rcur;
struct xfs_buf *agbp = NULL;
- int error = 0;
- struct xfs_owner_info oinfo;
xfs_agblock_t bno;
bool unwritten;
+ int error = 0;
- bno = XFS_FSB_TO_AGBNO(mp, ri->ri_bmap.br_startblock);
-
- trace_xfs_rmap_deferred(mp, ri->ri_pag->pag_agno, ri->ri_type, bno,
- ri->ri_owner, ri->ri_whichfork,
- ri->ri_bmap.br_startoff, ri->ri_bmap.br_blockcount,
- ri->ri_bmap.br_state);
+ trace_xfs_rmap_deferred(mp, ri);
if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_RMAP_FINISH_ONE))
return -EIO;
@@ -2672,15 +2667,6 @@ __xfs_rmap_add(
{
struct xfs_rmap_intent *ri;
- trace_xfs_rmap_defer(tp->t_mountp,
- XFS_FSB_TO_AGNO(tp->t_mountp, bmap->br_startblock),
- type,
- XFS_FSB_TO_AGBNO(tp->t_mountp, bmap->br_startblock),
- owner, whichfork,
- bmap->br_startoff,
- bmap->br_blockcount,
- bmap->br_state);
-
ri = kmem_cache_alloc(xfs_rmap_intent_cache, GFP_KERNEL | __GFP_NOFAIL);
INIT_LIST_HEAD(&ri->ri_list);
ri->ri_type = type;
@@ -2688,6 +2674,8 @@ __xfs_rmap_add(
ri->ri_whichfork = whichfork;
ri->ri_bmap = *bmap;
+ trace_xfs_rmap_defer(tp->t_mountp, ri);
+
xfs_rmap_update_get_group(tp->t_mountp, ri);
xfs_defer_add(tp, &ri->ri_list, &xfs_rmap_update_defer_type);
}
diff --git a/libxfs/xfs_rmap.h b/libxfs/xfs_rmap.h
index 9d01fe689..731c97137 100644
--- a/libxfs/xfs_rmap.h
+++ b/libxfs/xfs_rmap.h
@@ -157,6 +157,16 @@ enum xfs_rmap_intent_type {
XFS_RMAP_FREE,
};
+#define XFS_RMAP_INTENT_STRINGS \
+ { XFS_RMAP_MAP, "map" }, \
+ { XFS_RMAP_MAP_SHARED, "map_shared" }, \
+ { XFS_RMAP_UNMAP, "unmap" }, \
+ { XFS_RMAP_UNMAP_SHARED, "unmap_shared" }, \
+ { XFS_RMAP_CONVERT, "cvt" }, \
+ { XFS_RMAP_CONVERT_SHARED, "cvt_shared" }, \
+ { XFS_RMAP_ALLOC, "alloc" }, \
+ { XFS_RMAP_FREE, "free" }
+
struct xfs_rmap_intent {
struct list_head ri_list;
enum xfs_rmap_intent_type ri_type;
* [PATCH 44/64] xfs: add a ri_entry helper
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (42 preceding siblings ...)
2024-10-02 1:19 ` [PATCH 43/64] xfs: clean up rmap log intent item tracepoint callsites Darrick J. Wong
@ 2024-10-02 1:19 ` Darrick J. Wong
2024-10-02 1:19 ` [PATCH 45/64] xfs: reuse xfs_rmap_update_cancel_item Darrick J. Wong
` (19 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:19 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Source kernel commit: f93963779b438a33ca4b13384c070a6864ce2b2b
Add a helper to translate from the item list head to the
rmap_intent_item structure and use it to shorten assignments
and avoid the need for extra local variables.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/defer_item.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)
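For readers unfamiliar with the list_entry/container_of idiom that ri_entry wraps, the following is a rough userspace sketch. All demo_* names are hypothetical and only mirror the shape of the real code. The macro subtracts the offset of the embedded list member to recover the enclosing structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct list_head and struct xfs_rmap_intent. */
struct demo_list_head {
	struct demo_list_head	*next, *prev;
};

struct demo_intent {
	int			agno;
	struct demo_list_head	list;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* One typed helper replaces an open-coded container_of at each callsite. */
static inline struct demo_intent *demo_entry(const struct demo_list_head *e)
{
	return demo_container_of(e, struct demo_intent, list);
}

/* Comparison callback in the style of a "sort by AG" diff_items hook. */
static int demo_diff_items(const struct demo_list_head *a,
			   const struct demo_list_head *b)
{
	return demo_entry(a)->agno - demo_entry(b)->agno;
}
```

The helper lets locals be initialized at their declaration, which is what shrinks the callers in the diff below.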
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index 2df0ce4e8..013ce0304 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -200,6 +200,11 @@ const struct xfs_defer_op_type xfs_agfl_free_defer_type = {
/* Reverse Mapping */
+static inline struct xfs_rmap_intent *ri_entry(const struct list_head *e)
+{
+ return list_entry(e, struct xfs_rmap_intent, ri_list);
+}
+
/* Sort rmap intents by AG. */
static int
xfs_rmap_update_diff_items(
@@ -207,11 +212,8 @@ xfs_rmap_update_diff_items(
const struct list_head *a,
const struct list_head *b)
{
- const struct xfs_rmap_intent *ra;
- const struct xfs_rmap_intent *rb;
-
- ra = container_of(a, struct xfs_rmap_intent, ri_list);
- rb = container_of(b, struct xfs_rmap_intent, ri_list);
+ struct xfs_rmap_intent *ra = ri_entry(a);
+ struct xfs_rmap_intent *rb = ri_entry(b);
return ra->ri_pag->pag_agno - rb->ri_pag->pag_agno;
}
@@ -266,11 +268,9 @@ xfs_rmap_update_finish_item(
struct list_head *item,
struct xfs_btree_cur **state)
{
- struct xfs_rmap_intent *ri;
+ struct xfs_rmap_intent *ri = ri_entry(item);
int error;
- ri = container_of(item, struct xfs_rmap_intent, ri_list);
-
error = xfs_rmap_finish_one(tp, ri, state);
xfs_rmap_update_put_group(ri);
@@ -290,9 +290,7 @@ STATIC void
xfs_rmap_update_cancel_item(
struct list_head *item)
{
- struct xfs_rmap_intent *ri;
-
- ri = container_of(item, struct xfs_rmap_intent, ri_list);
+ struct xfs_rmap_intent *ri = ri_entry(item);
xfs_rmap_update_put_group(ri);
kmem_cache_free(xfs_rmap_intent_cache, ri);
* [PATCH 45/64] xfs: reuse xfs_rmap_update_cancel_item
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (43 preceding siblings ...)
2024-10-02 1:19 ` [PATCH 44/64] xfs: add a ri_entry helper Darrick J. Wong
@ 2024-10-02 1:19 ` Darrick J. Wong
2024-10-02 1:19 ` [PATCH 46/64] xfs: don't bother calling xfs_rmap_finish_one_cleanup in xfs_rmap_finish_one Darrick J. Wong
` (18 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:19 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Source kernel commit: 37f9d1db03ba0511403c5d25ba0baaddf5208ba7
Reuse xfs_rmap_update_cancel_item to put the AG/RTG and free the item in
a few places that currently open code the logic.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/defer_item.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
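As a rough illustration of the refactoring, here is a userspace sketch with hypothetical demo_* names and a plain counter standing in for the perag reference. The "finish" path drops the group reference and frees the item by calling the same routine the "cancel" path uses, rather than repeating those two steps.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; a bare counter models the group reference. */
struct demo_group {
	int			refs;
};

struct demo_intent {
	struct demo_group	*grp;
};

static int demo_frees;	/* counts freed items, for demonstration only */

/* Cancel a deferred item: drop the group reference, then free the item. */
static void demo_cancel_item(struct demo_intent *ri)
{
	ri->grp->refs--;
	free(ri);
	demo_frees++;
}

/*
 * Finish a deferred item. The result of the (omitted) work is passed in;
 * teardown is delegated to demo_cancel_item() instead of being open-coded.
 */
static int demo_finish_item(struct demo_intent *ri, int result)
{
	demo_cancel_item(ri);
	return result;
}
```

Keeping one teardown routine means a later change to how the reference is released only has to be made in one place.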
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index 013ce0304..f8b27c55c 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -260,6 +260,17 @@ xfs_rmap_update_put_group(
xfs_perag_intent_put(ri->ri_pag);
}
+/* Cancel a deferred rmap update. */
+STATIC void
+xfs_rmap_update_cancel_item(
+ struct list_head *item)
+{
+ struct xfs_rmap_intent *ri = ri_entry(item);
+
+ xfs_rmap_update_put_group(ri);
+ kmem_cache_free(xfs_rmap_intent_cache, ri);
+}
+
/* Process a deferred rmap update. */
STATIC int
xfs_rmap_update_finish_item(
@@ -273,8 +284,7 @@ xfs_rmap_update_finish_item(
error = xfs_rmap_finish_one(tp, ri, state);
- xfs_rmap_update_put_group(ri);
- kmem_cache_free(xfs_rmap_intent_cache, ri);
+ xfs_rmap_update_cancel_item(item);
return error;
}
@@ -285,17 +295,6 @@ xfs_rmap_update_abort_intent(
{
}
-/* Cancel a deferred rmap update. */
-STATIC void
-xfs_rmap_update_cancel_item(
- struct list_head *item)
-{
- struct xfs_rmap_intent *ri = ri_entry(item);
-
- xfs_rmap_update_put_group(ri);
- kmem_cache_free(xfs_rmap_intent_cache, ri);
-}
-
const struct xfs_defer_op_type xfs_rmap_update_defer_type = {
.name = "rmap",
.create_intent = xfs_rmap_update_create_intent,
* [PATCH 46/64] xfs: don't bother calling xfs_rmap_finish_one_cleanup in xfs_rmap_finish_one
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (44 preceding siblings ...)
2024-10-02 1:19 ` [PATCH 45/64] xfs: reuse xfs_rmap_update_cancel_item Darrick J. Wong
@ 2024-10-02 1:19 ` Darrick J. Wong
2024-10-02 1:20 ` [PATCH 47/64] xfs: simplify usage of the rcur local variable " Darrick J. Wong
` (17 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:19 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 8363b4361997044ecb99880a1a9bfdebf9145eed
In xfs_rmap_finish_one we know the cursor is non-zero when calling
xfs_rmap_finish_one_cleanup and we pass a 0 error variable. This means
xfs_rmap_finish_one_cleanup is just doing a xfs_btree_del_cursor.
Open code that and move xfs_rmap_finish_one_cleanup to
fs/xfs/xfs_rmap_item.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: minor porting changes]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/defer_item.c | 17 +++++++++++++++++
libxfs/xfs_rmap.c | 19 +------------------
libxfs/xfs_rmap.h | 2 --
3 files changed, 18 insertions(+), 20 deletions(-)
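The reasoning in the commit message can be checked with a small userspace model. Everything here is hypothetical (demo_* names, counters instead of real cursor teardown, and the cursor object is not actually freed). With a non-NULL cursor and error == 0, the cleanup helper does nothing beyond deleting the cursor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: counters record which teardown steps ran. */
struct demo_cursor {
	int	deleted;
	int	buf_released;
};

static void demo_del_cursor(struct demo_cursor *cur, int error)
{
	(void)error;		/* the model ignores the error code here */
	cur->deleted++;
}

static void demo_release_buf(struct demo_cursor *cur)
{
	cur->buf_released++;
}

/* Same control flow as the helper being moved out of xfs_rmap.c. */
static void demo_finish_one_cleanup(struct demo_cursor *cur, int error)
{
	if (cur == NULL)
		return;
	demo_del_cursor(cur, error);
	if (error)
		demo_release_buf(cur);
}
```

Since both branches that distinguish the helper from a bare cursor deletion are dead at that callsite, open-coding the deletion loses nothing.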
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index f8b27c55c..7721267e4 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -288,6 +288,23 @@ xfs_rmap_update_finish_item(
return error;
}
+/* Clean up after calling xfs_rmap_finish_one. */
+STATIC void
+xfs_rmap_finish_one_cleanup(
+ struct xfs_trans *tp,
+ struct xfs_btree_cur *rcur,
+ int error)
+{
+ struct xfs_buf *agbp = NULL;
+
+ if (rcur == NULL)
+ return;
+ agbp = rcur->bc_ag.agbp;
+ xfs_btree_del_cursor(rcur, error);
+ if (error && agbp)
+ xfs_trans_brelse(tp, agbp);
+}
+
/* Abort all pending RUIs. */
STATIC void
xfs_rmap_update_abort_intent(
diff --git a/libxfs/xfs_rmap.c b/libxfs/xfs_rmap.c
index 57c0d9418..1b5004b9c 100644
--- a/libxfs/xfs_rmap.c
+++ b/libxfs/xfs_rmap.c
@@ -2522,23 +2522,6 @@ xfs_rmap_query_all(
return xfs_btree_query_all(cur, xfs_rmap_query_range_helper, &query);
}
-/* Clean up after calling xfs_rmap_finish_one. */
-void
-xfs_rmap_finish_one_cleanup(
- struct xfs_trans *tp,
- struct xfs_btree_cur *rcur,
- int error)
-{
- struct xfs_buf *agbp;
-
- if (rcur == NULL)
- return;
- agbp = rcur->bc_ag.agbp;
- xfs_btree_del_cursor(rcur, error);
- if (error)
- xfs_trans_brelse(tp, agbp);
-}
-
/* Commit an rmap operation into the ondisk tree. */
int
__xfs_rmap_finish_intent(
@@ -2603,7 +2586,7 @@ xfs_rmap_finish_one(
*/
rcur = *pcur;
if (rcur != NULL && rcur->bc_ag.pag != ri->ri_pag) {
- xfs_rmap_finish_one_cleanup(tp, rcur, 0);
+ xfs_btree_del_cursor(rcur, 0);
rcur = NULL;
*pcur = NULL;
}
diff --git a/libxfs/xfs_rmap.h b/libxfs/xfs_rmap.h
index 731c97137..9d85dd2a6 100644
--- a/libxfs/xfs_rmap.h
+++ b/libxfs/xfs_rmap.h
@@ -192,8 +192,6 @@ void xfs_rmap_alloc_extent(struct xfs_trans *tp, xfs_agnumber_t agno,
void xfs_rmap_free_extent(struct xfs_trans *tp, xfs_agnumber_t agno,
xfs_agblock_t bno, xfs_extlen_t len, uint64_t owner);
-void xfs_rmap_finish_one_cleanup(struct xfs_trans *tp,
- struct xfs_btree_cur *rcur, int error);
int xfs_rmap_finish_one(struct xfs_trans *tp, struct xfs_rmap_intent *ri,
struct xfs_btree_cur **pcur);
int __xfs_rmap_finish_intent(struct xfs_btree_cur *rcur,
* [PATCH 47/64] xfs: simplify usage of the rcur local variable in xfs_rmap_finish_one
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (45 preceding siblings ...)
2024-10-02 1:19 ` [PATCH 46/64] xfs: don't bother calling xfs_rmap_finish_one_cleanup in xfs_rmap_finish_one Darrick J. Wong
@ 2024-10-02 1:20 ` Darrick J. Wong
2024-10-02 1:20 ` [PATCH 48/64] xfs: move xfs_rmap_update_defer_add to xfs_rmap_item.c Darrick J. Wong
` (16 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:20 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Christoph Hellwig <hch@lst.de>
Source kernel commit: 905af72610d90f58f994feff4ead1fc258f5d2b1
Only update rcur when we know the final *pcur value.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[djwong: don't leave the caller with a dangling ref]
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
libxfs/xfs_rmap.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
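The cursor-caching pattern this patch tidies up can be sketched in userspace as follows. The demo_* names are hypothetical, and malloc/free stand in for cursor setup and teardown. The local starts out as *pcur, a mismatched cursor is dropped and *pcur cleared at once, and *pcur is written again only when a new cursor has been created.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a btree cursor bound to one group. */
struct demo_cursor {
	int	group;
};

static struct demo_cursor *demo_init_cursor(int group)
{
	struct demo_cursor *cur = malloc(sizeof(*cur));

	if (cur)
		cur->group = group;
	return cur;
}

/* Return a cursor for @group, reusing the cached one in *pcur if it fits. */
static struct demo_cursor *demo_get_cursor(struct demo_cursor **pcur,
					   int group)
{
	struct demo_cursor *cur = *pcur;

	if (cur != NULL && cur->group != group) {
		free(cur);
		cur = NULL;
		*pcur = NULL;	/* never leave the caller a dangling pointer */
	}
	if (cur == NULL)
		*pcur = cur = demo_init_cursor(group);
	return cur;
}
```

Clearing *pcur right after the teardown matters in the real function because an early error return can occur before a replacement cursor exists.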
diff --git a/libxfs/xfs_rmap.c b/libxfs/xfs_rmap.c
index 1b5004b9c..d60edaa23 100644
--- a/libxfs/xfs_rmap.c
+++ b/libxfs/xfs_rmap.c
@@ -2569,7 +2569,7 @@ xfs_rmap_finish_one(
{
struct xfs_owner_info oinfo;
struct xfs_mount *mp = tp->t_mountp;
- struct xfs_btree_cur *rcur;
+ struct xfs_btree_cur *rcur = *pcur;
struct xfs_buf *agbp = NULL;
xfs_agblock_t bno;
bool unwritten;
@@ -2584,7 +2584,6 @@ xfs_rmap_finish_one(
* If we haven't gotten a cursor or the cursor AG doesn't match
* the startblock, get one now.
*/
- rcur = *pcur;
if (rcur != NULL && rcur->bc_ag.pag != ri->ri_pag) {
xfs_btree_del_cursor(rcur, 0);
rcur = NULL;
@@ -2606,9 +2605,8 @@ xfs_rmap_finish_one(
return -EFSCORRUPTED;
}
- rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, ri->ri_pag);
+ *pcur = rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, ri->ri_pag);
}
- *pcur = rcur;
xfs_rmap_ino_owner(&oinfo, ri->ri_owner, ri->ri_whichfork,
ri->ri_bmap.br_startoff);
* [PATCH 48/64] xfs: move xfs_rmap_update_defer_add to xfs_rmap_item.c
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (46 preceding siblings ...)
2024-10-02 1:20 ` [PATCH 47/64] xfs: simplify usage of the rcur local variable " Darrick J. Wong
@ 2024-10-02 1:20 ` Darrick J. Wong
2024-10-02 1:20 ` [PATCH 49/64] xfs: give refcount btree cursor error tracepoints their own class Darrick J. Wong
` (15 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:20 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: ea7b0820d960d5a3ee72bc67cbd8b5d47c67aa4c
Move the code that adds the incore xfs_rmap_update_item deferred work
data to a transaction to live with the RUI log item code. This means
that the rmap code no longer has to know about the inner workings of the
RUI log items.
As a consequence, we can get rid of the _{get,put}_group helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/defer_item.c | 21 +++++++++------------
libxfs/defer_item.h | 4 ++++
libxfs/xfs_rmap.c | 6 ++----
libxfs/xfs_rmap.h | 3 ---
4 files changed, 15 insertions(+), 19 deletions(-)
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index 7721267e4..1c106b844 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -243,21 +243,18 @@ xfs_rmap_update_create_done(
return NULL;
}
-/* Take an active ref to the AG containing the space we're rmapping. */
+/* Add this deferred RUI to the transaction. */
void
-xfs_rmap_update_get_group(
- struct xfs_mount *mp,
+xfs_rmap_defer_add(
+ struct xfs_trans *tp,
struct xfs_rmap_intent *ri)
{
+ struct xfs_mount *mp = tp->t_mountp;
+
+ trace_xfs_rmap_defer(mp, ri);
+
ri->ri_pag = xfs_perag_intent_get(mp, ri->ri_bmap.br_startblock);
-}
-
-/* Release an active AG ref after finishing rmapping work. */
-static inline void
-xfs_rmap_update_put_group(
- struct xfs_rmap_intent *ri)
-{
- xfs_perag_intent_put(ri->ri_pag);
+ xfs_defer_add(tp, &ri->ri_list, &xfs_rmap_update_defer_type);
}
/* Cancel a deferred rmap update. */
@@ -267,7 +264,7 @@ xfs_rmap_update_cancel_item(
{
struct xfs_rmap_intent *ri = ri_entry(item);
- xfs_rmap_update_put_group(ri);
+ xfs_perag_intent_put(ri->ri_pag);
kmem_cache_free(xfs_rmap_intent_cache, ri);
}
diff --git a/libxfs/defer_item.h b/libxfs/defer_item.h
index 03f3f1505..be354785b 100644
--- a/libxfs/defer_item.h
+++ b/libxfs/defer_item.h
@@ -30,4 +30,8 @@ void xfs_extent_free_defer_add(struct xfs_trans *tp,
struct xfs_extent_free_item *xefi,
struct xfs_defer_pending **dfpp);
+struct xfs_rmap_intent;
+
+void xfs_rmap_defer_add(struct xfs_trans *tp, struct xfs_rmap_intent *ri);
+
#endif /* __LIBXFS_DEFER_ITEM_H_ */
diff --git a/libxfs/xfs_rmap.c b/libxfs/xfs_rmap.c
index d60edaa23..22947e3c9 100644
--- a/libxfs/xfs_rmap.c
+++ b/libxfs/xfs_rmap.c
@@ -23,6 +23,7 @@
#include "xfs_inode.h"
#include "xfs_ag.h"
#include "xfs_health.h"
+#include "defer_item.h"
struct kmem_cache *xfs_rmap_intent_cache;
@@ -2655,10 +2656,7 @@ __xfs_rmap_add(
ri->ri_whichfork = whichfork;
ri->ri_bmap = *bmap;
- trace_xfs_rmap_defer(tp->t_mountp, ri);
-
- xfs_rmap_update_get_group(tp->t_mountp, ri);
- xfs_defer_add(tp, &ri->ri_list, &xfs_rmap_update_defer_type);
+ xfs_rmap_defer_add(tp, ri);
}
/* Map an extent into a file. */
diff --git a/libxfs/xfs_rmap.h b/libxfs/xfs_rmap.h
index 9d85dd2a6..b783dd4dd 100644
--- a/libxfs/xfs_rmap.h
+++ b/libxfs/xfs_rmap.h
@@ -176,9 +176,6 @@ struct xfs_rmap_intent {
struct xfs_perag *ri_pag;
};
-void xfs_rmap_update_get_group(struct xfs_mount *mp,
- struct xfs_rmap_intent *ri);
-
/* functions for updating the rmapbt based on bmbt map/unmap operations */
void xfs_rmap_map_extent(struct xfs_trans *tp, struct xfs_inode *ip,
int whichfork, struct xfs_bmbt_irec *imap);
* [PATCH 49/64] xfs: give refcount btree cursor error tracepoints their own class
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (47 preceding siblings ...)
2024-10-02 1:20 ` [PATCH 48/64] xfs: move xfs_rmap_update_defer_add to xfs_rmap_item.c Darrick J. Wong
@ 2024-10-02 1:20 ` Darrick J. Wong
2024-10-02 1:20 ` [PATCH 50/64] xfs: create specialized classes for refcount tracepoints Darrick J. Wong
` (14 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:20 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 7cf2663ff1cfb20f5fe025122016b68920b28041
Convert all the refcount tracepoints to use the btree error tracepoint
class.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_refcount.c | 42 ++++++++++++++----------------------------
1 file changed, 14 insertions(+), 28 deletions(-)
diff --git a/libxfs/xfs_refcount.c b/libxfs/xfs_refcount.c
index b4e6900be..c78d42728 100644
--- a/libxfs/xfs_refcount.c
+++ b/libxfs/xfs_refcount.c
@@ -210,8 +210,7 @@ xfs_refcount_update(
error = xfs_btree_update(cur, &rec);
if (error)
- trace_xfs_refcount_update_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_update_error(cur, error, _RET_IP_);
return error;
}
@@ -246,8 +245,7 @@ xfs_refcount_insert(
out_error:
if (error)
- trace_xfs_refcount_insert_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_insert_error(cur, error, _RET_IP_);
return error;
}
@@ -287,8 +285,7 @@ xfs_refcount_delete(
&found_rec);
out_error:
if (error)
- trace_xfs_refcount_delete_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_delete_error(cur, error, _RET_IP_);
return error;
}
@@ -437,8 +434,7 @@ xfs_refcount_split_extent(
return error;
out_error:
- trace_xfs_refcount_split_extent_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_split_extent_error(cur, error, _RET_IP_);
return error;
}
@@ -521,8 +517,7 @@ xfs_refcount_merge_center_extents(
return error;
out_error:
- trace_xfs_refcount_merge_center_extents_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_merge_center_extents_error(cur, error, _RET_IP_);
return error;
}
@@ -588,8 +583,7 @@ xfs_refcount_merge_left_extent(
return error;
out_error:
- trace_xfs_refcount_merge_left_extent_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_merge_left_extent_error(cur, error, _RET_IP_);
return error;
}
@@ -657,8 +651,7 @@ xfs_refcount_merge_right_extent(
return error;
out_error:
- trace_xfs_refcount_merge_right_extent_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_merge_right_extent_error(cur, error, _RET_IP_);
return error;
}
@@ -752,8 +745,7 @@ xfs_refcount_find_left_extents(
return error;
out_error:
- trace_xfs_refcount_find_left_extent_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_find_left_extent_error(cur, error, _RET_IP_);
return error;
}
@@ -847,8 +839,7 @@ xfs_refcount_find_right_extents(
return error;
out_error:
- trace_xfs_refcount_find_right_extent_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_find_right_extent_error(cur, error, _RET_IP_);
return error;
}
@@ -1253,8 +1244,7 @@ xfs_refcount_adjust_extents(
return error;
out_error:
- trace_xfs_refcount_modify_extent_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_modify_extent_error(cur, error, _RET_IP_);
return error;
}
@@ -1314,8 +1304,7 @@ xfs_refcount_adjust(
return 0;
out_error:
- trace_xfs_refcount_adjust_error(cur->bc_mp, cur->bc_ag.pag->pag_agno,
- error, _RET_IP_);
+ trace_xfs_refcount_adjust_error(cur, error, _RET_IP_);
return error;
}
@@ -1629,8 +1618,7 @@ xfs_refcount_find_shared(
out_error:
if (error)
- trace_xfs_refcount_find_shared_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_find_shared_error(cur, error, _RET_IP_);
return error;
}
@@ -1785,8 +1773,7 @@ xfs_refcount_adjust_cow_extents(
return error;
out_error:
- trace_xfs_refcount_modify_extent_error(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, error, _RET_IP_);
+ trace_xfs_refcount_modify_extent_error(cur, error, _RET_IP_);
return error;
}
@@ -1832,8 +1819,7 @@ xfs_refcount_adjust_cow(
return 0;
out_error:
- trace_xfs_refcount_adjust_cow_error(cur->bc_mp, cur->bc_ag.pag->pag_agno,
- error, _RET_IP_);
+ trace_xfs_refcount_adjust_cow_error(cur, error, _RET_IP_);
return error;
}
* [PATCH 50/64] xfs: create specialized classes for refcount tracepoints
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (48 preceding siblings ...)
2024-10-02 1:20 ` [PATCH 49/64] xfs: give refcount btree cursor error tracepoints their own class Darrick J. Wong
@ 2024-10-02 1:20 ` Darrick J. Wong
2024-10-02 1:21 ` [PATCH 51/64] xfs: pass btree cursors to refcount btree tracepoints Darrick J. Wong
` (13 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:20 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: bb0efb0d0a2885b4c65ca31e2815da2281b99153
The only user of the "ag" tracepoint event classes is the refcount
btree, so rename them to make that obvious and make them take the btree
cursor to simplify the arguments. This will save us a lot of trouble
later on.
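The change in calling convention can be sketched in userspace C as follows. This is only an illustration: struct mount, struct perag, struct btree_cur and both trace_lookup_* helpers are hypothetical stand-ins, not the kernel's tracepoint machinery.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct mount {
	const char	*name;
};

struct perag {
	unsigned int	pag_agno;
};

struct btree_cur {
	struct mount	*bc_mp;
	struct perag	*bc_pag;
};

/* Old shape: every caller unpacks the mount and AG number itself. */
static int
trace_lookup_old(char *buf, size_t len, struct mount *mp,
		 unsigned int agno, unsigned int bno)
{
	return snprintf(buf, len, "%s agno %u bno %u", mp->name, agno, bno);
}

/* New shape: the helper derives both values from the cursor. */
static int
trace_lookup_new(char *buf, size_t len, const struct btree_cur *cur,
		 unsigned int bno)
{
	return snprintf(buf, len, "%s agno %u bno %u", cur->bc_mp->name,
			cur->bc_pag->pag_agno, bno);
}
```

With the cursor as the only context argument, a later change in how the group number is found only has to touch the helper, not every call site.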
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_refcount.c | 24 +++++++++---------------
1 file changed, 9 insertions(+), 15 deletions(-)
diff --git a/libxfs/xfs_refcount.c b/libxfs/xfs_refcount.c
index c78d42728..4143aca5f 100644
--- a/libxfs/xfs_refcount.c
+++ b/libxfs/xfs_refcount.c
@@ -50,7 +50,7 @@ xfs_refcount_lookup_le(
xfs_agblock_t bno,
int *stat)
{
- trace_xfs_refcount_lookup(cur->bc_mp, cur->bc_ag.pag->pag_agno,
+ trace_xfs_refcount_lookup(cur,
xfs_refcount_encode_startblock(bno, domain),
XFS_LOOKUP_LE);
cur->bc_rec.rc.rc_startblock = bno;
@@ -70,7 +70,7 @@ xfs_refcount_lookup_ge(
xfs_agblock_t bno,
int *stat)
{
- trace_xfs_refcount_lookup(cur->bc_mp, cur->bc_ag.pag->pag_agno,
+ trace_xfs_refcount_lookup(cur,
xfs_refcount_encode_startblock(bno, domain),
XFS_LOOKUP_GE);
cur->bc_rec.rc.rc_startblock = bno;
@@ -90,7 +90,7 @@ xfs_refcount_lookup_eq(
xfs_agblock_t bno,
int *stat)
{
- trace_xfs_refcount_lookup(cur->bc_mp, cur->bc_ag.pag->pag_agno,
+ trace_xfs_refcount_lookup(cur,
xfs_refcount_encode_startblock(bno, domain),
XFS_LOOKUP_LE);
cur->bc_rec.rc.rc_startblock = bno;
@@ -1261,11 +1261,9 @@ xfs_refcount_adjust(
int error;
if (adj == XFS_REFCOUNT_ADJUST_INCREASE)
- trace_xfs_refcount_increase(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, *agbno, *aglen);
+ trace_xfs_refcount_increase(cur, *agbno, *aglen);
else
- trace_xfs_refcount_decrease(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, *agbno, *aglen);
+ trace_xfs_refcount_decrease(cur, *agbno, *aglen);
/*
* Ensure that no rcextents cross the boundary of the adjustment range.
@@ -1525,8 +1523,7 @@ xfs_refcount_find_shared(
int have;
int error;
- trace_xfs_refcount_find_shared(cur->bc_mp, cur->bc_ag.pag->pag_agno,
- agbno, aglen);
+ trace_xfs_refcount_find_shared(cur, agbno, aglen);
/* By default, skip the whole range */
*fbno = NULLAGBLOCK;
@@ -1613,8 +1610,7 @@ xfs_refcount_find_shared(
}
done:
- trace_xfs_refcount_find_shared_result(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, *fbno, *flen);
+ trace_xfs_refcount_find_shared_result(cur, *fbno, *flen);
out_error:
if (error)
@@ -1832,8 +1828,7 @@ __xfs_refcount_cow_alloc(
xfs_agblock_t agbno,
xfs_extlen_t aglen)
{
- trace_xfs_refcount_cow_increase(rcur->bc_mp, rcur->bc_ag.pag->pag_agno,
- agbno, aglen);
+ trace_xfs_refcount_cow_increase(rcur, agbno, aglen);
/* Add refcount btree reservation */
return xfs_refcount_adjust_cow(rcur, agbno, aglen,
@@ -1849,8 +1844,7 @@ __xfs_refcount_cow_free(
xfs_agblock_t agbno,
xfs_extlen_t aglen)
{
- trace_xfs_refcount_cow_decrease(rcur->bc_mp, rcur->bc_ag.pag->pag_agno,
- agbno, aglen);
+ trace_xfs_refcount_cow_decrease(rcur, agbno, aglen);
/* Remove refcount btree reservation */
return xfs_refcount_adjust_cow(rcur, agbno, aglen,
* [PATCH 51/64] xfs: pass btree cursors to refcount btree tracepoints
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (49 preceding siblings ...)
2024-10-02 1:20 ` [PATCH 50/64] xfs: create specialized classes for refcount tracepoints Darrick J. Wong
@ 2024-10-02 1:21 ` Darrick J. Wong
2024-10-02 1:21 ` [PATCH 52/64] xfs: clean up refcount log intent item tracepoint callsites Darrick J. Wong
` (12 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:21 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 8fbac2f1a0947dc45ecf13e9b5aa17b5942b4a2d
Prepare the rest of refcount btree tracepoints for use with realtime
reflink by making them take the btree cursor object as a parameter.
This will save us a lot of trouble later on.
Remove the xfs_refcount_recover_extent tracepoint since it's already
covered by other refcount tracepoints.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_refcount.c | 42 +++++++++++++++---------------------------
1 file changed, 15 insertions(+), 27 deletions(-)
diff --git a/libxfs/xfs_refcount.c b/libxfs/xfs_refcount.c
index 4143aca5f..31b6549f5 100644
--- a/libxfs/xfs_refcount.c
+++ b/libxfs/xfs_refcount.c
@@ -182,7 +182,7 @@ xfs_refcount_get_rec(
if (fa)
return xfs_refcount_complain_bad_rec(cur, fa, irec);
- trace_xfs_refcount_get(cur->bc_mp, cur->bc_ag.pag->pag_agno, irec);
+ trace_xfs_refcount_get(cur, irec);
return 0;
}
@@ -200,7 +200,7 @@ xfs_refcount_update(
uint32_t start;
int error;
- trace_xfs_refcount_update(cur->bc_mp, cur->bc_ag.pag->pag_agno, irec);
+ trace_xfs_refcount_update(cur, irec);
start = xfs_refcount_encode_startblock(irec->rc_startblock,
irec->rc_domain);
@@ -227,7 +227,7 @@ xfs_refcount_insert(
{
int error;
- trace_xfs_refcount_insert(cur->bc_mp, cur->bc_ag.pag->pag_agno, irec);
+ trace_xfs_refcount_insert(cur, irec);
cur->bc_rec.rc.rc_startblock = irec->rc_startblock;
cur->bc_rec.rc.rc_blockcount = irec->rc_blockcount;
@@ -272,7 +272,7 @@ xfs_refcount_delete(
error = -EFSCORRUPTED;
goto out_error;
}
- trace_xfs_refcount_delete(cur->bc_mp, cur->bc_ag.pag->pag_agno, &irec);
+ trace_xfs_refcount_delete(cur, &irec);
error = xfs_btree_delete(cur, i);
if (XFS_IS_CORRUPT(cur->bc_mp, *i != 1)) {
xfs_btree_mark_sick(cur);
@@ -409,8 +409,7 @@ xfs_refcount_split_extent(
return 0;
*shape_changed = true;
- trace_xfs_refcount_split_extent(cur->bc_mp, cur->bc_ag.pag->pag_agno,
- &rcext, agbno);
+ trace_xfs_refcount_split_extent(cur, &rcext, agbno);
/* Establish the right extent. */
tmp = rcext;
@@ -453,8 +452,7 @@ xfs_refcount_merge_center_extents(
int error;
int found_rec;
- trace_xfs_refcount_merge_center_extents(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, left, center, right);
+ trace_xfs_refcount_merge_center_extents(cur, left, center, right);
ASSERT(left->rc_domain == center->rc_domain);
ASSERT(right->rc_domain == center->rc_domain);
@@ -535,8 +533,7 @@ xfs_refcount_merge_left_extent(
int error;
int found_rec;
- trace_xfs_refcount_merge_left_extent(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, left, cleft);
+ trace_xfs_refcount_merge_left_extent(cur, left, cleft);
ASSERT(left->rc_domain == cleft->rc_domain);
@@ -600,8 +597,7 @@ xfs_refcount_merge_right_extent(
int error;
int found_rec;
- trace_xfs_refcount_merge_right_extent(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, cright, right);
+ trace_xfs_refcount_merge_right_extent(cur, cright, right);
ASSERT(right->rc_domain == cright->rc_domain);
@@ -740,8 +736,7 @@ xfs_refcount_find_left_extents(
cleft->rc_refcount = 1;
cleft->rc_domain = domain;
}
- trace_xfs_refcount_find_left_extent(cur->bc_mp, cur->bc_ag.pag->pag_agno,
- left, cleft, agbno);
+ trace_xfs_refcount_find_left_extent(cur, left, cleft, agbno);
return error;
out_error:
@@ -834,8 +829,8 @@ xfs_refcount_find_right_extents(
cright->rc_refcount = 1;
cright->rc_domain = domain;
}
- trace_xfs_refcount_find_right_extent(cur->bc_mp, cur->bc_ag.pag->pag_agno,
- cright, right, agbno + aglen);
+ trace_xfs_refcount_find_right_extent(cur, cright, right,
+ agbno + aglen);
return error;
out_error:
@@ -1138,8 +1133,7 @@ xfs_refcount_adjust_extents(
tmp.rc_refcount = 1 + adj;
tmp.rc_domain = XFS_REFC_DOMAIN_SHARED;
- trace_xfs_refcount_modify_extent(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, &tmp);
+ trace_xfs_refcount_modify_extent(cur, &tmp);
/*
* Either cover the hole (increment) or
@@ -1204,8 +1198,7 @@ xfs_refcount_adjust_extents(
if (ext.rc_refcount == MAXREFCOUNT)
goto skip;
ext.rc_refcount += adj;
- trace_xfs_refcount_modify_extent(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, &ext);
+ trace_xfs_refcount_modify_extent(cur, &ext);
cur->bc_refc.nr_ops++;
if (ext.rc_refcount > 1) {
error = xfs_refcount_update(cur, &ext);
@@ -1720,8 +1713,7 @@ xfs_refcount_adjust_cow_extents(
tmp.rc_refcount = 1;
tmp.rc_domain = XFS_REFC_DOMAIN_COW;
- trace_xfs_refcount_modify_extent(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, &tmp);
+ trace_xfs_refcount_modify_extent(cur, &tmp);
error = xfs_refcount_insert(cur, &tmp,
&found_tmp);
@@ -1752,8 +1744,7 @@ xfs_refcount_adjust_cow_extents(
}
ext.rc_refcount = 0;
- trace_xfs_refcount_modify_extent(cur->bc_mp,
- cur->bc_ag.pag->pag_agno, &ext);
+ trace_xfs_refcount_modify_extent(cur, &ext);
error = xfs_refcount_delete(cur, &found_rec);
if (error)
goto out_error;
@@ -1989,9 +1980,6 @@ xfs_refcount_recover_cow_leftovers(
if (error)
goto out_free;
- trace_xfs_refcount_recover_extent(mp, pag->pag_agno,
- &rr->rr_rrec);
-
/* Free the orphan record */
fsb = XFS_AGB_TO_FSB(mp, pag->pag_agno,
rr->rr_rrec.rc_startblock);
* [PATCH 52/64] xfs: clean up refcount log intent item tracepoint callsites
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (50 preceding siblings ...)
2024-10-02 1:21 ` [PATCH 51/64] xfs: pass btree cursors to refcount btree tracepoints Darrick J. Wong
@ 2024-10-02 1:21 ` Darrick J. Wong
2024-10-02 1:21 ` [PATCH 53/64] xfs: add a ci_entry helper Darrick J. Wong
` (11 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:21 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 886f11c797722650d98c554b28e66f12317a33e4
Pass the incore refcount intent structure to the tracepoints instead of
open-coding the argument passing.
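As a rough userspace sketch of the idea: the helper below takes the whole intent and unpacks it in one place, and a value-to-name table in the style of XFS_REFCOUNT_INTENT_STRINGS turns the type into text. The enum values, struct layout and function names are assumptions made for illustration; the kernel presumably feeds such a table to its symbolic trace printing instead of a hand-written lookup.

```c
#include <stdio.h>

/* Hypothetical mirror of the intent types; the values are illustrative. */
enum refcount_intent_type {
	REFCOUNT_INCREASE = 1,
	REFCOUNT_DECREASE,
	REFCOUNT_ALLOC_COW,
	REFCOUNT_FREE_COW,
};

/* Hypothetical, trimmed-down intent structure. */
struct refcount_intent {
	enum refcount_intent_type	ri_type;
	unsigned long long		ri_startblock;
	unsigned int			ri_blockcount;
};

/* Value-to-name pairs, same shape as XFS_REFCOUNT_INTENT_STRINGS. */
static const struct {
	int		value;
	const char	*name;
} intent_strings[] = {
	{ REFCOUNT_INCREASE,	"incr" },
	{ REFCOUNT_DECREASE,	"decr" },
	{ REFCOUNT_ALLOC_COW,	"alloc_cow" },
	{ REFCOUNT_FREE_COW,	"free_cow" },
};

/* Linear lookup; falls back to "unknown" for values not in the table. */
static const char *
intent_name(int type)
{
	size_t	i;

	for (i = 0; i < sizeof(intent_strings) / sizeof(intent_strings[0]); i++)
		if (intent_strings[i].value == type)
			return intent_strings[i].name;
	return "unknown";
}

/* Callers pass the intent itself instead of four unpacked arguments. */
static int
trace_refcount_deferred(char *buf, size_t len,
			const struct refcount_intent *ri)
{
	return snprintf(buf, len, "op %s startblock %llu len %u",
			intent_name(ri->ri_type), ri->ri_startblock,
			ri->ri_blockcount);
}
```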
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_refcount.c | 14 ++++----------
libxfs/xfs_refcount.h | 6 ++++++
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/libxfs/xfs_refcount.c b/libxfs/xfs_refcount.c
index 31b6549f5..14d1101b4 100644
--- a/libxfs/xfs_refcount.c
+++ b/libxfs/xfs_refcount.c
@@ -1366,9 +1366,7 @@ xfs_refcount_finish_one(
bno = XFS_FSB_TO_AGBNO(mp, ri->ri_startblock);
- trace_xfs_refcount_deferred(mp, XFS_FSB_TO_AGNO(mp, ri->ri_startblock),
- ri->ri_type, XFS_FSB_TO_AGBNO(mp, ri->ri_startblock),
- ri->ri_blockcount);
+ trace_xfs_refcount_deferred(mp, ri);
if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REFCOUNT_FINISH_ONE))
return -EIO;
@@ -1431,8 +1429,7 @@ xfs_refcount_finish_one(
return -EFSCORRUPTED;
}
if (!error && ri->ri_blockcount > 0)
- trace_xfs_refcount_finish_one_leftover(mp, ri->ri_pag->pag_agno,
- ri->ri_type, bno, ri->ri_blockcount);
+ trace_xfs_refcount_finish_one_leftover(mp, ri);
return error;
}
@@ -1448,11 +1445,6 @@ __xfs_refcount_add(
{
struct xfs_refcount_intent *ri;
- trace_xfs_refcount_defer(tp->t_mountp,
- XFS_FSB_TO_AGNO(tp->t_mountp, startblock),
- type, XFS_FSB_TO_AGBNO(tp->t_mountp, startblock),
- blockcount);
-
ri = kmem_cache_alloc(xfs_refcount_intent_cache,
GFP_KERNEL | __GFP_NOFAIL);
INIT_LIST_HEAD(&ri->ri_list);
@@ -1460,6 +1452,8 @@ __xfs_refcount_add(
ri->ri_startblock = startblock;
ri->ri_blockcount = blockcount;
+ trace_xfs_refcount_defer(tp->t_mountp, ri);
+
xfs_refcount_update_get_group(tp->t_mountp, ri);
xfs_defer_add(tp, &ri->ri_list, &xfs_refcount_update_defer_type);
}
diff --git a/libxfs/xfs_refcount.h b/libxfs/xfs_refcount.h
index 9b56768a5..01a206211 100644
--- a/libxfs/xfs_refcount.h
+++ b/libxfs/xfs_refcount.h
@@ -48,6 +48,12 @@ enum xfs_refcount_intent_type {
XFS_REFCOUNT_FREE_COW,
};
+#define XFS_REFCOUNT_INTENT_STRINGS \
+ { XFS_REFCOUNT_INCREASE, "incr" }, \
+ { XFS_REFCOUNT_DECREASE, "decr" }, \
+ { XFS_REFCOUNT_ALLOC_COW, "alloc_cow" }, \
+ { XFS_REFCOUNT_FREE_COW, "free_cow" }
+
struct xfs_refcount_intent {
struct list_head ri_list;
struct xfs_perag *ri_pag;
* [PATCH 53/64] xfs: add a ci_entry helper
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (51 preceding siblings ...)
2024-10-02 1:21 ` [PATCH 52/64] xfs: clean up refcount log intent item tracepoint callsites Darrick J. Wong
@ 2024-10-02 1:21 ` Darrick J. Wong
2024-10-02 1:21 ` [PATCH 54/64] xfs: reuse xfs_refcount_update_cancel_item Darrick J. Wong
` (10 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:21 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 0e9254861f980bd60a58b7c2b57ba0414c038409
Add a helper to translate from the item list head to the
refcount_intent_item structure and use it to shorten assignments and
avoid the need for extra local variables.
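For readers unfamiliar with the pattern, here is a minimal userspace sketch of such a helper. The list_head and container_of definitions are simplified stand-ins for the kernel's, and struct refcount_intent with its ri_agno field is a hypothetical, trimmed-down version of the real intent.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's list primitives. */
struct list_head {
	struct list_head	*next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down intent; the real structure has more fields. */
struct refcount_intent {
	struct list_head	ri_list;
	unsigned int		ri_agno;
};

/* One helper replaces the open-coded container_of() at each call site. */
static inline struct refcount_intent *
ci_entry(const struct list_head *e)
{
	return container_of(e, struct refcount_intent, ri_list);
}

/* Comparator in the style of xfs_refcount_update_diff_items(). */
static int
diff_items(const struct list_head *a, const struct list_head *b)
{
	return (int)ci_entry(a)->ri_agno - (int)ci_entry(b)->ri_agno;
}
```

Because the helper returns the structure directly, callers can initialize their local variable at the point of declaration, or skip the variable entirely as diff_items() does here.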
Inspired-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/defer_item.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index 1c106b844..53902d775 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -321,6 +321,11 @@ const struct xfs_defer_op_type xfs_rmap_update_defer_type = {
/* Reference Counting */
+static inline struct xfs_refcount_intent *ci_entry(const struct list_head *e)
+{
+ return list_entry(e, struct xfs_refcount_intent, ri_list);
+}
+
/* Sort refcount intents by AG. */
static int
xfs_refcount_update_diff_items(
@@ -328,11 +333,8 @@ xfs_refcount_update_diff_items(
const struct list_head *a,
const struct list_head *b)
{
- const struct xfs_refcount_intent *ra;
- const struct xfs_refcount_intent *rb;
-
- ra = container_of(a, struct xfs_refcount_intent, ri_list);
- rb = container_of(b, struct xfs_refcount_intent, ri_list);
+ struct xfs_refcount_intent *ra = ci_entry(a);
+ struct xfs_refcount_intent *rb = ci_entry(b);
return ra->ri_pag->pag_agno - rb->ri_pag->pag_agno;
}
@@ -387,10 +389,9 @@ xfs_refcount_update_finish_item(
struct list_head *item,
struct xfs_btree_cur **state)
{
- struct xfs_refcount_intent *ri;
+ struct xfs_refcount_intent *ri = ci_entry(item);
int error;
- ri = container_of(item, struct xfs_refcount_intent, ri_list);
error = xfs_refcount_finish_one(tp, ri, state);
/* Did we run out of reservation? Requeue what we didn't finish. */
@@ -417,9 +418,7 @@ STATIC void
xfs_refcount_update_cancel_item(
struct list_head *item)
{
- struct xfs_refcount_intent *ri;
-
- ri = container_of(item, struct xfs_refcount_intent, ri_list);
+ struct xfs_refcount_intent *ri = ci_entry(item);
xfs_refcount_update_put_group(ri);
kmem_cache_free(xfs_refcount_intent_cache, ri);
* [PATCH 54/64] xfs: reuse xfs_refcount_update_cancel_item
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (52 preceding siblings ...)
2024-10-02 1:21 ` [PATCH 53/64] xfs: add a ci_entry helper Darrick J. Wong
@ 2024-10-02 1:21 ` Darrick J. Wong
2024-10-02 1:22 ` [PATCH 55/64] xfs: don't bother calling xfs_refcount_finish_one_cleanup in xfs_refcount_finish_one Darrick J. Wong
` (9 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:21 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 8aef79928b3ddd8c10a3235f982933addc15a977
Reuse xfs_refcount_update_cancel_item to put the AG/RTG and free the
item in a few places that currently open code the logic.
Inspired-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/defer_item.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index 53902d775..8cf360567 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -381,6 +381,17 @@ xfs_refcount_update_put_group(
xfs_perag_intent_put(ri->ri_pag);
}
+/* Cancel a deferred refcount update. */
+STATIC void
+xfs_refcount_update_cancel_item(
+ struct list_head *item)
+{
+ struct xfs_refcount_intent *ri = ci_entry(item);
+
+ xfs_refcount_update_put_group(ri);
+ kmem_cache_free(xfs_refcount_intent_cache, ri);
+}
+
/* Process a deferred refcount update. */
STATIC int
xfs_refcount_update_finish_item(
@@ -401,8 +412,7 @@ xfs_refcount_update_finish_item(
return -EAGAIN;
}
- xfs_refcount_update_put_group(ri);
- kmem_cache_free(xfs_refcount_intent_cache, ri);
+ xfs_refcount_update_cancel_item(item);
return error;
}
@@ -413,17 +423,6 @@ xfs_refcount_update_abort_intent(
{
}
-/* Cancel a deferred refcount update. */
-STATIC void
-xfs_refcount_update_cancel_item(
- struct list_head *item)
-{
- struct xfs_refcount_intent *ri = ci_entry(item);
-
- xfs_refcount_update_put_group(ri);
- kmem_cache_free(xfs_refcount_intent_cache, ri);
-}
-
const struct xfs_defer_op_type xfs_refcount_update_defer_type = {
.name = "refcount",
.create_intent = xfs_refcount_update_create_intent,
* [PATCH 55/64] xfs: don't bother calling xfs_refcount_finish_one_cleanup in xfs_refcount_finish_one
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (53 preceding siblings ...)
2024-10-02 1:21 ` [PATCH 54/64] xfs: reuse xfs_refcount_update_cancel_item Darrick J. Wong
@ 2024-10-02 1:22 ` Darrick J. Wong
2024-10-02 1:22 ` [PATCH 56/64] xfs: simplify usage of the rcur local variable " Darrick J. Wong
` (8 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:22 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: bac3f784925299b5e69a857e7e03e59c88aa14be
In xfs_refcount_finish_one we know the cursor is non-NULL when calling
xfs_refcount_finish_one_cleanup, and we pass it an error value of 0. This
means xfs_refcount_finish_one_cleanup is just doing an
xfs_btree_del_cursor.
Open code that and move xfs_refcount_finish_one_cleanup to
fs/xfs/xfs_refcount_item.c.
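The equivalence can be checked with a small userspace model. Everything here is a hypothetical stand-in: del_cursor() and brelse() only count calls, and struct cursor carries a plain integer where the kernel cursor holds the AGF buffer.

```c
#include <stddef.h>

/* Hypothetical cursor; agbp_id stands in for the attached AGF buffer. */
struct cursor {
	int	agbp_id;
};

static int	cursors_deleted;
static int	buffers_released;

/* Stand-in for xfs_btree_del_cursor(): only counts the call. */
static void
del_cursor(struct cursor *cur, int error)
{
	(void)cur;
	(void)error;
	cursors_deleted++;
}

/* Stand-in for xfs_trans_brelse(): only counts the call. */
static void
brelse(int agbp_id)
{
	(void)agbp_id;
	buffers_released++;
}

/* Same control flow as the cleanup helper shown in the diff. */
static void
finish_one_cleanup(struct cursor *cur, int error)
{
	int	agbp_id;

	if (cur == NULL)
		return;
	agbp_id = cur->agbp_id;
	del_cursor(cur, error);
	if (error)
		brelse(agbp_id);
}
```

With a non-NULL cursor and error == 0, neither the early return nor the buffer release can trigger, so the only effect left is the del_cursor() call.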
Inspired-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/defer_item.c | 17 +++++++++++++++++
libxfs/xfs_refcount.c | 19 +------------------
libxfs/xfs_refcount.h | 2 --
3 files changed, 18 insertions(+), 20 deletions(-)
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index 8cf360567..f6560a6b3 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -423,6 +423,23 @@ xfs_refcount_update_abort_intent(
{
}
+/* Clean up after calling xfs_refcount_finish_one. */
+STATIC void
+xfs_refcount_finish_one_cleanup(
+ struct xfs_trans *tp,
+ struct xfs_btree_cur *rcur,
+ int error)
+{
+ struct xfs_buf *agbp;
+
+ if (rcur == NULL)
+ return;
+ agbp = rcur->bc_ag.agbp;
+ xfs_btree_del_cursor(rcur, error);
+ if (error)
+ xfs_trans_brelse(tp, agbp);
+}
+
const struct xfs_defer_op_type xfs_refcount_update_defer_type = {
.name = "refcount",
.create_intent = xfs_refcount_update_create_intent,
diff --git a/libxfs/xfs_refcount.c b/libxfs/xfs_refcount.c
index 14d1101b4..4b9a8be36 100644
--- a/libxfs/xfs_refcount.c
+++ b/libxfs/xfs_refcount.c
@@ -1299,23 +1299,6 @@ xfs_refcount_adjust(
return error;
}
-/* Clean up after calling xfs_refcount_finish_one. */
-void
-xfs_refcount_finish_one_cleanup(
- struct xfs_trans *tp,
- struct xfs_btree_cur *rcur,
- int error)
-{
- struct xfs_buf *agbp;
-
- if (rcur == NULL)
- return;
- agbp = rcur->bc_ag.agbp;
- xfs_btree_del_cursor(rcur, error);
- if (error)
- xfs_trans_brelse(tp, agbp);
-}
-
/*
* Set up a continuation a deferred refcount operation by updating the intent.
* Checks to make sure we're not going to run off the end of the AG.
@@ -1379,7 +1362,7 @@ xfs_refcount_finish_one(
if (rcur != NULL && rcur->bc_ag.pag != ri->ri_pag) {
nr_ops = rcur->bc_refc.nr_ops;
shape_changes = rcur->bc_refc.shape_changes;
- xfs_refcount_finish_one_cleanup(tp, rcur, 0);
+ xfs_btree_del_cursor(rcur, 0);
rcur = NULL;
*pcur = NULL;
}
diff --git a/libxfs/xfs_refcount.h b/libxfs/xfs_refcount.h
index 01a206211..c94b8f71d 100644
--- a/libxfs/xfs_refcount.h
+++ b/libxfs/xfs_refcount.h
@@ -82,8 +82,6 @@ void xfs_refcount_increase_extent(struct xfs_trans *tp,
void xfs_refcount_decrease_extent(struct xfs_trans *tp,
struct xfs_bmbt_irec *irec);
-extern void xfs_refcount_finish_one_cleanup(struct xfs_trans *tp,
- struct xfs_btree_cur *rcur, int error);
extern int xfs_refcount_finish_one(struct xfs_trans *tp,
struct xfs_refcount_intent *ri, struct xfs_btree_cur **pcur);
* [PATCH 56/64] xfs: simplify usage of the rcur local variable in xfs_refcount_finish_one
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (54 preceding siblings ...)
2024-10-02 1:22 ` [PATCH 55/64] xfs: don't bother calling xfs_refcount_finish_one_cleanup in xfs_refcount_finish_one Darrick J. Wong
@ 2024-10-02 1:22 ` Darrick J. Wong
2024-10-02 1:22 ` [PATCH 57/64] xfs: move xfs_refcount_update_defer_add to xfs_refcount_item.c Darrick J. Wong
` (7 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:22 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: e51987a12cb57ca3702bff5df8a615037b2c8f8a
Only update rcur when we know the final *pcur value.
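A minimal sketch of the in/out cursor pattern, assuming a hypothetical cursor that is simply heap-allocated; cursor_alloc(), cursor_free(), get_cursor() and the fail_setup flag are invented for the illustration and do not correspond to kernel functions.

```c
#include <stdlib.h>

/* Hypothetical cursor keyed by AG number. */
struct cursor {
	int	agno;
};

static struct cursor *
cursor_alloc(int agno)
{
	struct cursor	*cur = malloc(sizeof(*cur));

	if (cur)
		cur->agno = agno;
	return cur;
}

static void
cursor_free(struct cursor *cur)
{
	free(cur);
}

/*
 * Make *pcur point at a cursor for agno, reusing the existing one when it
 * matches.  *pcur is written at each point where the cursor changes, so
 * the caller never holds a pointer to a freed cursor, even when setup
 * fails.  fail_setup simulates an error while preparing the new cursor.
 */
static int
get_cursor(struct cursor **pcur, int agno, int fail_setup)
{
	struct cursor	*cur = *pcur;

	if (cur != NULL && cur->agno != agno) {
		cursor_free(cur);
		cur = NULL;
		*pcur = NULL;	/* no dangling reference for the caller */
	}
	if (cur == NULL) {
		if (fail_setup)
			return -1;	/* *pcur is NULL here, not stale */
		*pcur = cur = cursor_alloc(agno);
		if (cur == NULL)
			return -1;
	}
	return 0;
}
```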
Inspired-by: Christoph Hellwig <hch@lst.de>
[djwong: don't leave the caller with a dangling ref]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/xfs_refcount.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/libxfs/xfs_refcount.c b/libxfs/xfs_refcount.c
index 4b9a8be36..d0a057f5c 100644
--- a/libxfs/xfs_refcount.c
+++ b/libxfs/xfs_refcount.c
@@ -1340,7 +1340,7 @@ xfs_refcount_finish_one(
struct xfs_btree_cur **pcur)
{
struct xfs_mount *mp = tp->t_mountp;
- struct xfs_btree_cur *rcur;
+ struct xfs_btree_cur *rcur = *pcur;
struct xfs_buf *agbp = NULL;
int error = 0;
xfs_agblock_t bno;
@@ -1358,7 +1358,6 @@ xfs_refcount_finish_one(
* If we haven't gotten a cursor or the cursor AG doesn't match
* the startblock, get one now.
*/
- rcur = *pcur;
if (rcur != NULL && rcur->bc_ag.pag != ri->ri_pag) {
nr_ops = rcur->bc_refc.nr_ops;
shape_changes = rcur->bc_refc.shape_changes;
@@ -1372,11 +1371,11 @@ xfs_refcount_finish_one(
if (error)
return error;
- rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, ri->ri_pag);
+ *pcur = rcur = xfs_refcountbt_init_cursor(mp, tp, agbp,
+ ri->ri_pag);
rcur->bc_refc.nr_ops = nr_ops;
rcur->bc_refc.shape_changes = shape_changes;
}
- *pcur = rcur;
switch (ri->ri_type) {
case XFS_REFCOUNT_INCREASE:
* [PATCH 57/64] xfs: move xfs_refcount_update_defer_add to xfs_refcount_item.c
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (55 preceding siblings ...)
2024-10-02 1:22 ` [PATCH 56/64] xfs: simplify usage of the rcur local variable " Darrick J. Wong
@ 2024-10-02 1:22 ` Darrick J. Wong
2024-10-02 1:22 ` [PATCH 58/64] xfs: Avoid races with cnt_btree lastrec updates Darrick J. Wong
` (6 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:22 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Christoph Hellwig, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: 783e8a7c9cab6744ebc5dfe75081248ac39181b2
Move the code that adds the incore xfs_refcount_update_item deferred
work data to a transaction so that it lives with the CUI log item code.
This means
that the refcount code no longer has to know about the inner workings of
the CUI log items.
As a consequence, we can get rid of the _{get,put}_group helpers.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
libxfs/defer_item.c | 21 +++++++++------------
libxfs/defer_item.h | 5 +++++
libxfs/xfs_refcount.c | 6 ++----
libxfs/xfs_refcount.h | 3 ---
4 files changed, 16 insertions(+), 19 deletions(-)
diff --git a/libxfs/defer_item.c b/libxfs/defer_item.c
index f6560a6b3..98a291c7b 100644
--- a/libxfs/defer_item.c
+++ b/libxfs/defer_item.c
@@ -364,21 +364,18 @@ xfs_refcount_update_create_done(
return NULL;
}
-/* Take an active ref to the AG containing the space we're refcounting. */
+/* Add this deferred CUI to the transaction. */
void
-xfs_refcount_update_get_group(
- struct xfs_mount *mp,
+xfs_refcount_defer_add(
+ struct xfs_trans *tp,
struct xfs_refcount_intent *ri)
{
+ struct xfs_mount *mp = tp->t_mountp;
+
+ trace_xfs_refcount_defer(mp, ri);
+
ri->ri_pag = xfs_perag_intent_get(mp, ri->ri_startblock);
-}
-
-/* Release an active AG ref after finishing refcounting work. */
-static inline void
-xfs_refcount_update_put_group(
- struct xfs_refcount_intent *ri)
-{
- xfs_perag_intent_put(ri->ri_pag);
+ xfs_defer_add(tp, &ri->ri_list, &xfs_refcount_update_defer_type);
}
/* Cancel a deferred refcount update. */
@@ -388,7 +385,7 @@ xfs_refcount_update_cancel_item(
{
struct xfs_refcount_intent *ri = ci_entry(item);
- xfs_refcount_update_put_group(ri);
+ xfs_perag_intent_put(ri->ri_pag);
kmem_cache_free(xfs_refcount_intent_cache, ri);
}
diff --git a/libxfs/defer_item.h b/libxfs/defer_item.h
index be354785b..93cf1eed5 100644
--- a/libxfs/defer_item.h
+++ b/libxfs/defer_item.h
@@ -34,4 +34,9 @@ struct xfs_rmap_intent;
void xfs_rmap_defer_add(struct xfs_trans *tp, struct xfs_rmap_intent *ri);
+struct xfs_refcount_intent;
+
+void xfs_refcount_defer_add(struct xfs_trans *tp,
+ struct xfs_refcount_intent *ri);
+
#endif /* __LIBXFS_DEFER_ITEM_H_ */
diff --git a/libxfs/xfs_refcount.c b/libxfs/xfs_refcount.c
index d0a057f5c..22f8afb27 100644
--- a/libxfs/xfs_refcount.c
+++ b/libxfs/xfs_refcount.c
@@ -23,6 +23,7 @@
#include "xfs_rmap.h"
#include "xfs_ag.h"
#include "xfs_health.h"
+#include "defer_item.h"
struct kmem_cache *xfs_refcount_intent_cache;
@@ -1434,10 +1435,7 @@ __xfs_refcount_add(
ri->ri_startblock = startblock;
ri->ri_blockcount = blockcount;
- trace_xfs_refcount_defer(tp->t_mountp, ri);
-
- xfs_refcount_update_get_group(tp->t_mountp, ri);
- xfs_defer_add(tp, &ri->ri_list, &xfs_refcount_update_defer_type);
+ xfs_refcount_defer_add(tp, ri);
}
/*
diff --git a/libxfs/xfs_refcount.h b/libxfs/xfs_refcount.h
index c94b8f71d..68acb0b1b 100644
--- a/libxfs/xfs_refcount.h
+++ b/libxfs/xfs_refcount.h
@@ -74,9 +74,6 @@ xfs_refcount_check_domain(
return true;
}
-void xfs_refcount_update_get_group(struct xfs_mount *mp,
- struct xfs_refcount_intent *ri);
-
void xfs_refcount_increase_extent(struct xfs_trans *tp,
struct xfs_bmbt_irec *irec);
void xfs_refcount_decrease_extent(struct xfs_trans *tp,
* [PATCH 58/64] xfs: Avoid races with cnt_btree lastrec updates
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (56 preceding siblings ...)
2024-10-02 1:22 ` [PATCH 57/64] xfs: move xfs_refcount_update_defer_add to xfs_refcount_item.c Darrick J. Wong
@ 2024-10-02 1:22 ` Darrick J. Wong
2024-10-02 1:23 ` [PATCH 59/64] xfs: AIL doesn't need manual pushing Darrick J. Wong
` (5 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:22 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Zizhi Wo, Chandan Babu R, linux-xfs
From: Zizhi Wo <wozizhi@huawei.com>
Source kernel commit: 94a0333b9212a114d19096a77903f76d0d5bca26
A file creation running concurrently with a small write can unexpectedly
fail with -ENOSPC, because there is a race window in which the allocator
can read a wrong agf->agf_longest.
Write file process steps:
1) Find the entry that best meets the conditions, then calculate the start
address and length of the part of the entry that remains after allocation.
2) Delete this entry and update the -current- agf->agf_longest.
3) Insert the remaining unused part of this entry based on the
calculations in 1), and update agf->agf_longest again if necessary.
Create file process steps:
1) Check whether there are free inodes in the inode chunk.
2) If there is no free inode, check whether there is space for creating
inode chunks, performing the check without the lock first.
3) If the check succeeds, it is performed again with the agf lock held.
Otherwise, an error is returned directly.
If the write process has completed step 2) but has not yet reached 3) when
the create file process reaches its step 2), the create process may wrongly
conclude that there is no space, so file creation fails even though the
file system still has space.
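The window can be reproduced deterministically in a single-threaded model by running the lockless check between write steps 2) and 3). This is a deliberately simplified sketch: struct ag, its flat array of extent lengths and all function names are hypothetical, and real concurrency and locking are not modelled.

```c
#include <stddef.h>

#define MAX_EXTENTS	8

/* Hypothetical AG: free extent lengths plus the cached longest value. */
struct ag {
	unsigned int	free_len[MAX_EXTENTS];
	size_t		nr_free;
	unsigned int	longest;	/* plays the role of agf->agf_longest */
};

static unsigned int
ag_max_free(const struct ag *ag)
{
	unsigned int	max = 0;
	size_t		i;

	for (i = 0; i < ag->nr_free; i++)
		if (ag->free_len[i] > max)
			max = ag->free_len[i];
	return max;
}

/* Write step 2: delete the chosen extent, then refresh the cached longest. */
static void
write_step2_delete(struct ag *ag, size_t idx)
{
	ag->nr_free--;
	ag->free_len[idx] = ag->free_len[ag->nr_free];
	ag->longest = ag_max_free(ag);
}

/* Write step 3: insert the unused remainder, then refresh it again. */
static void
write_step3_insert(struct ag *ag, unsigned int remainder)
{
	ag->free_len[ag->nr_free++] = remainder;
	ag->longest = ag_max_free(ag);
}

/* Create step 2: the lockless check sees whatever is cached right now. */
static int
create_unlocked_check(const struct ag *ag, unsigned int need)
{
	return ag->longest >= need;
}
```

Starting from free extents of 1000 and 4 blocks, deleting the 1000-block extent in step 2) leaves a cached longest of 4 until step 3) reinserts the 992-block remainder; a check for 64 blocks made in between fails although the space is about to reappear.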
We have sent two different commits to the community in order to fix this
problem[1][2]. Unfortunately, both solutions have flaws. In [2], I
discussed this with Dave and Darrick and realized that a better solution to
this problem requires the "last cnt record tracking" to be ripped out of the
generic btree code. Surprisingly, Dave directly provided his fix code.
This patch includes appropriate modifications based on his tmp-code to
address this issue.
The entire fix can be roughly divided into two parts:
1) Delete the code related to lastrec-update in the generic btree code.
2) Move the process of updating the longest freespace from the cntbt to
the end of the cntbt modifications. First move the cursor to the rightmost
record, then update the longest free extent based on that record.
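The "move the cursor to the rightmost record" step can be pictured as a
less-than-or-equal lookup with every key field set to its maximum value. The
sketch below assumes a sorted array in place of the btree; struct toy_rec,
toy_lookup_le() and toy_longest() are hypothetical names used only for
illustration and are not libxfs functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, ordered like a by-count record: length, then start. */
struct toy_rec {
	uint32_t blockcount;	/* primary sort key */
	uint32_t startblock;	/* secondary sort key */
};

static int toy_rec_cmp(const struct toy_rec *a, const struct toy_rec *b)
{
	if (a->blockcount != b->blockcount)
		return a->blockcount < b->blockcount ? -1 : 1;
	if (a->startblock != b->startblock)
		return a->startblock < b->startblock ? -1 : 1;
	return 0;
}

/* Return the index of the last record <= key, or -1 if there is none. */
static long toy_lookup_le(const struct toy_rec *recs, size_t nr,
			  const struct toy_rec *key)
{
	size_t lo = 0, hi = nr;

	assert(recs != NULL || nr == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (toy_rec_cmp(&recs[mid], key) <= 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (long)lo - 1;
}

/* Longest free extent: look up the maximum possible key, read that record. */
static uint32_t toy_longest(const struct toy_rec *recs, size_t nr)
{
	struct toy_rec key = { UINT32_MAX, UINT32_MAX };
	long idx = toy_lookup_le(recs, nr, &key);

	return idx < 0 ? 0 : recs[idx].blockcount;	/* empty tree: 0 */
}
```

Because no real record can sort above the all-ones key, the lookup always
lands on the last record, and an empty set of records yields a longest extent
of zero.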
Note that we cannot update the longest with xfs_alloc_get_rec() after
finding the longest record, as xfs_verify_agbno() may not pass because
pag->block_count is updated on the outside. Therefore, use
xfs_btree_get_rec() as a replacement.
[1] https://lore.kernel.org/all/20240419061848.1032366-2-yebin10@huawei.com
[2] https://lore.kernel.org/all/20240604071121.3981686-1-wozizhi@huawei.com
Reported-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
libxfs/xfs_alloc.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++
libxfs/xfs_alloc_btree.c | 64 --------------------------
libxfs/xfs_btree.c | 51 ---------------------
libxfs/xfs_btree.h | 16 ------
4 files changed, 115 insertions(+), 130 deletions(-)
diff --git a/libxfs/xfs_alloc.c b/libxfs/xfs_alloc.c
index 063ac1973..3806a6bc0 100644
--- a/libxfs/xfs_alloc.c
+++ b/libxfs/xfs_alloc.c
@@ -462,6 +462,97 @@ xfs_alloc_fix_len(
args->len = rlen;
}
+/*
+ * Determine if the cursor points to the block that contains the right-most
+ * block of records in the by-count btree. This block contains the largest
+ * contiguous free extent in the AG, so if we modify a record in this block we
+ * need to call xfs_alloc_fixup_longest() once the modifications are done to
+ * ensure the agf->agf_longest field is kept up to date with the longest free
+ * extent tracked by the by-count btree.
+ */
+static bool
+xfs_alloc_cursor_at_lastrec(
+ struct xfs_btree_cur *cnt_cur)
+{
+ struct xfs_btree_block *block;
+ union xfs_btree_ptr ptr;
+ struct xfs_buf *bp;
+
+ block = xfs_btree_get_block(cnt_cur, 0, &bp);
+
+ xfs_btree_get_sibling(cnt_cur, block, &ptr, XFS_BB_RIGHTSIB);
+ return xfs_btree_ptr_is_null(cnt_cur, &ptr);
+}
+
+/*
+ * Find the rightmost record of the cntbt, and return the longest free space
+ * recorded in it. Simply set both the block number and the length to their
+ * maximum values before searching.
+ */
+static int
+xfs_cntbt_longest(
+ struct xfs_btree_cur *cnt_cur,
+ xfs_extlen_t *longest)
+{
+ struct xfs_alloc_rec_incore irec;
+ union xfs_btree_rec *rec;
+ int stat = 0;
+ int error;
+
+ memset(&cnt_cur->bc_rec, 0xFF, sizeof(cnt_cur->bc_rec));
+ error = xfs_btree_lookup(cnt_cur, XFS_LOOKUP_LE, &stat);
+ if (error)
+ return error;
+ if (!stat) {
+ /* totally empty tree */
+ *longest = 0;
+ return 0;
+ }
+
+ error = xfs_btree_get_rec(cnt_cur, &rec, &stat);
+ if (error)
+ return error;
+ if (XFS_IS_CORRUPT(cnt_cur->bc_mp, !stat)) {
+ xfs_btree_mark_sick(cnt_cur);
+ return -EFSCORRUPTED;
+ }
+
+ xfs_alloc_btrec_to_irec(rec, &irec);
+ *longest = irec.ar_blockcount;
+ return 0;
+}
+
+/*
+ * Update the longest contiguous free extent in the AG from the by-count cursor
+ * that is passed to us. This should be done at the end of any allocation or
+ * freeing operation that touches the longest extent in the btree.
+ *
+ * Needing to update the longest extent can be determined by calling
+ * xfs_alloc_cursor_at_lastrec() after the cursor is positioned for record
+ * modification but before the modification begins.
+ */
+static int
+xfs_alloc_fixup_longest(
+ struct xfs_btree_cur *cnt_cur)
+{
+ struct xfs_perag *pag = cnt_cur->bc_ag.pag;
+ struct xfs_buf *bp = cnt_cur->bc_ag.agbp;
+ struct xfs_agf *agf = bp->b_addr;
+ xfs_extlen_t longest = 0;
+ int error;
+
+ /* Lookup last rec in order to update AGF. */
+ error = xfs_cntbt_longest(cnt_cur, &longest);
+ if (error)
+ return error;
+
+ pag->pagf_longest = longest;
+ agf->agf_longest = cpu_to_be32(pag->pagf_longest);
+ xfs_alloc_log_agf(cnt_cur->bc_tp, bp, XFS_AGF_LONGEST);
+
+ return 0;
+}
+
/*
* Update the two btrees, logically removing from freespace the extent
* starting at rbno, rlen blocks. The extent is contained within the
@@ -486,6 +577,7 @@ xfs_alloc_fixup_trees(
xfs_extlen_t nflen1=0; /* first new free length */
xfs_extlen_t nflen2=0; /* second new free length */
struct xfs_mount *mp;
+ bool fixup_longest = false;
mp = cnt_cur->bc_mp;
@@ -574,6 +666,10 @@ xfs_alloc_fixup_trees(
nfbno2 = rbno + rlen;
nflen2 = (fbno + flen) - nfbno2;
}
+
+ if (xfs_alloc_cursor_at_lastrec(cnt_cur))
+ fixup_longest = true;
+
/*
* Delete the entry from the by-size btree.
*/
@@ -651,6 +747,10 @@ xfs_alloc_fixup_trees(
return -EFSCORRUPTED;
}
}
+
+ if (fixup_longest)
+ return xfs_alloc_fixup_longest(cnt_cur);
+
return 0;
}
@@ -1953,6 +2053,7 @@ xfs_free_ag_extent(
int i;
int error;
struct xfs_perag *pag = agbp->b_pag;
+ bool fixup_longest = false;
bno_cur = cnt_cur = NULL;
mp = tp->t_mountp;
@@ -2216,8 +2317,13 @@ xfs_free_ag_extent(
}
xfs_btree_del_cursor(bno_cur, XFS_BTREE_NOERROR);
bno_cur = NULL;
+
/*
* In all cases we need to insert the new freespace in the by-size tree.
+ *
+ * If this new freespace is being inserted in the block that contains
+ * the largest free space in the btree, make sure we also fix up the
+ * agf->agf-longest tracker field.
*/
if ((error = xfs_alloc_lookup_eq(cnt_cur, nbno, nlen, &i)))
goto error0;
@@ -2226,6 +2332,8 @@ xfs_free_ag_extent(
error = -EFSCORRUPTED;
goto error0;
}
+ if (xfs_alloc_cursor_at_lastrec(cnt_cur))
+ fixup_longest = true;
if ((error = xfs_btree_insert(cnt_cur, &i)))
goto error0;
if (XFS_IS_CORRUPT(mp, i != 1)) {
@@ -2233,6 +2341,12 @@ xfs_free_ag_extent(
error = -EFSCORRUPTED;
goto error0;
}
+ if (fixup_longest) {
+ error = xfs_alloc_fixup_longest(cnt_cur);
+ if (error)
+ goto error0;
+ }
+
xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR);
cnt_cur = NULL;
diff --git a/libxfs/xfs_alloc_btree.c b/libxfs/xfs_alloc_btree.c
index 949eb02cd..9140dec00 100644
--- a/libxfs/xfs_alloc_btree.c
+++ b/libxfs/xfs_alloc_btree.c
@@ -113,67 +113,6 @@ xfs_allocbt_free_block(
return 0;
}
-/*
- * Update the longest extent in the AGF
- */
-STATIC void
-xfs_allocbt_update_lastrec(
- struct xfs_btree_cur *cur,
- const struct xfs_btree_block *block,
- const union xfs_btree_rec *rec,
- int ptr,
- int reason)
-{
- struct xfs_agf *agf = cur->bc_ag.agbp->b_addr;
- struct xfs_perag *pag;
- __be32 len;
- int numrecs;
-
- ASSERT(!xfs_btree_is_bno(cur->bc_ops));
-
- switch (reason) {
- case LASTREC_UPDATE:
- /*
- * If this is the last leaf block and it's the last record,
- * then update the size of the longest extent in the AG.
- */
- if (ptr != xfs_btree_get_numrecs(block))
- return;
- len = rec->alloc.ar_blockcount;
- break;
- case LASTREC_INSREC:
- if (be32_to_cpu(rec->alloc.ar_blockcount) <=
- be32_to_cpu(agf->agf_longest))
- return;
- len = rec->alloc.ar_blockcount;
- break;
- case LASTREC_DELREC:
- numrecs = xfs_btree_get_numrecs(block);
- if (ptr <= numrecs)
- return;
- ASSERT(ptr == numrecs + 1);
-
- if (numrecs) {
- xfs_alloc_rec_t *rrp;
-
- rrp = XFS_ALLOC_REC_ADDR(cur->bc_mp, block, numrecs);
- len = rrp->ar_blockcount;
- } else {
- len = 0;
- }
-
- break;
- default:
- ASSERT(0);
- return;
- }
-
- agf->agf_longest = len;
- pag = cur->bc_ag.agbp->b_pag;
- pag->pagf_longest = be32_to_cpu(len);
- xfs_alloc_log_agf(cur->bc_tp, cur->bc_ag.agbp, XFS_AGF_LONGEST);
-}
-
STATIC int
xfs_allocbt_get_minrecs(
struct xfs_btree_cur *cur,
@@ -491,7 +430,6 @@ const struct xfs_btree_ops xfs_bnobt_ops = {
.set_root = xfs_allocbt_set_root,
.alloc_block = xfs_allocbt_alloc_block,
.free_block = xfs_allocbt_free_block,
- .update_lastrec = xfs_allocbt_update_lastrec,
.get_minrecs = xfs_allocbt_get_minrecs,
.get_maxrecs = xfs_allocbt_get_maxrecs,
.init_key_from_rec = xfs_allocbt_init_key_from_rec,
@@ -509,7 +447,6 @@ const struct xfs_btree_ops xfs_bnobt_ops = {
const struct xfs_btree_ops xfs_cntbt_ops = {
.name = "cnt",
.type = XFS_BTREE_TYPE_AG,
- .geom_flags = XFS_BTGEO_LASTREC_UPDATE,
.rec_len = sizeof(xfs_alloc_rec_t),
.key_len = sizeof(xfs_alloc_key_t),
@@ -523,7 +460,6 @@ const struct xfs_btree_ops xfs_cntbt_ops = {
.set_root = xfs_allocbt_set_root,
.alloc_block = xfs_allocbt_alloc_block,
.free_block = xfs_allocbt_free_block,
- .update_lastrec = xfs_allocbt_update_lastrec,
.get_minrecs = xfs_allocbt_get_minrecs,
.get_maxrecs = xfs_allocbt_get_maxrecs,
.init_key_from_rec = xfs_allocbt_init_key_from_rec,
diff --git a/libxfs/xfs_btree.c b/libxfs/xfs_btree.c
index a91441b46..bb53b6d7a 100644
--- a/libxfs/xfs_btree.c
+++ b/libxfs/xfs_btree.c
@@ -1329,30 +1329,6 @@ xfs_btree_init_block_cur(
xfs_btree_owner(cur));
}
-/*
- * Return true if ptr is the last record in the btree and
- * we need to track updates to this record. The decision
- * will be further refined in the update_lastrec method.
- */
-STATIC int
-xfs_btree_is_lastrec(
- struct xfs_btree_cur *cur,
- struct xfs_btree_block *block,
- int level)
-{
- union xfs_btree_ptr ptr;
-
- if (level > 0)
- return 0;
- if (!(cur->bc_ops->geom_flags & XFS_BTGEO_LASTREC_UPDATE))
- return 0;
-
- xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB);
- if (!xfs_btree_ptr_is_null(cur, &ptr))
- return 0;
- return 1;
-}
-
STATIC void
xfs_btree_buf_to_ptr(
struct xfs_btree_cur *cur,
@@ -2418,15 +2394,6 @@ xfs_btree_update(
xfs_btree_copy_recs(cur, rp, rec, 1);
xfs_btree_log_recs(cur, bp, ptr, ptr);
- /*
- * If we are tracking the last record in the tree and
- * we are at the far right edge of the tree, update it.
- */
- if (xfs_btree_is_lastrec(cur, block, 0)) {
- cur->bc_ops->update_lastrec(cur, block, rec,
- ptr, LASTREC_UPDATE);
- }
-
/* Pass new key value up to our parent. */
if (xfs_btree_needs_key_update(cur, ptr)) {
error = xfs_btree_update_keys(cur, 0);
@@ -3615,15 +3582,6 @@ xfs_btree_insrec(
goto error0;
}
- /*
- * If we are tracking the last record in the tree and
- * we are at the far right edge of the tree, update it.
- */
- if (xfs_btree_is_lastrec(cur, block, level)) {
- cur->bc_ops->update_lastrec(cur, block, rec,
- ptr, LASTREC_INSREC);
- }
-
/*
* Return the new block number, if any.
* If there is one, give back a record value and a cursor too.
@@ -3981,15 +3939,6 @@ xfs_btree_delrec(
xfs_btree_set_numrecs(block, --numrecs);
xfs_btree_log_block(cur, bp, XFS_BB_NUMRECS);
- /*
- * If we are tracking the last record in the tree and
- * we are at the far right edge of the tree, update it.
- */
- if (xfs_btree_is_lastrec(cur, block, level)) {
- cur->bc_ops->update_lastrec(cur, block, NULL,
- ptr, LASTREC_DELREC);
- }
-
/*
* We're at the root level. First, shrink the root block in-memory.
* Try to get rid of the next level down. If we can't then there's
diff --git a/libxfs/xfs_btree.h b/libxfs/xfs_btree.h
index f93374278..10b7ddc3b 100644
--- a/libxfs/xfs_btree.h
+++ b/libxfs/xfs_btree.h
@@ -154,12 +154,6 @@ struct xfs_btree_ops {
int *stat);
int (*free_block)(struct xfs_btree_cur *cur, struct xfs_buf *bp);
- /* update last record information */
- void (*update_lastrec)(struct xfs_btree_cur *cur,
- const struct xfs_btree_block *block,
- const union xfs_btree_rec *rec,
- int ptr, int reason);
-
/* records in block/level */
int (*get_minrecs)(struct xfs_btree_cur *cur, int level);
int (*get_maxrecs)(struct xfs_btree_cur *cur, int level);
@@ -222,15 +216,7 @@ struct xfs_btree_ops {
};
/* btree geometry flags */
-#define XFS_BTGEO_LASTREC_UPDATE (1U << 0) /* track last rec externally */
-#define XFS_BTGEO_OVERLAPPING (1U << 1) /* overlapping intervals */
-
-/*
- * Reasons for the update_lastrec method to be called.
- */
-#define LASTREC_UPDATE 0
-#define LASTREC_INSREC 1
-#define LASTREC_DELREC 2
+#define XFS_BTGEO_OVERLAPPING (1U << 0) /* overlapping intervals */
union xfs_btree_irec {
* [PATCH 59/64] xfs: AIL doesn't need manual pushing
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (57 preceding siblings ...)
2024-10-02 1:22 ` [PATCH 58/64] xfs: Avoid races with cnt_btree lastrec updates Darrick J. Wong
@ 2024-10-02 1:23 ` Darrick J. Wong
2024-10-02 1:23 ` [PATCH 60/64] xfs: background AIL push should target physical space Darrick J. Wong
` (4 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:23 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Dave Chinner, Chandan Babu R, linux-xfs
From: Dave Chinner <dchinner@redhat.com>
Source kernel commit: 9adf40249e6cfd7231c2973bb305f6c20902bfd9
We have a mechanism that checks the amount of log space remaining
available every time we make a transaction reservation. If the
amount of space is below a threshold (25% free) we push on the AIL
to tell it to do more work. To do this, we end up calculating the
LSN that the AIL needs to push to on every reservation and updating
the push target for the AIL with that new target LSN.
This is silly and expensive. The AIL is perfectly capable of
calculating the push target itself, and it will always be running
when the AIL contains objects.
What the target does is determine if the AIL needs to do
any work before it goes back to sleep. If we haven't run out of
reservation space or memory (or some other push all trigger), it
will simply go back to sleep for a while if there is more than 25%
of the journal space free without doing anything.
If there are items in the AIL at a lower LSN than the target, it
will try to push up to the target or to the point of getting stuck
before going back to sleep and trying again soon after.
Hence we can modify the AIL to calculate its own 25% push target
before it starts a push using the same reserve grant head based
calculation as is currently used, and remove all the places where we
ask the AIL to push to a new 25% free target. We can also drop the
minimum free space size of 256BBs from the calculation because the
25% of a minimum sized log is *always* going to be larger than
256BBs.
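A small sketch of the threshold arithmetic shows why the 256BB floor is
redundant. It assumes a basic block (BB) of 512 bytes; the toy_*() helpers
and TOY_MIN_FREE_BBS are hypothetical names modelling only the max(25%, 256)
rule described above, not the kernel functions themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MIN_FREE_BBS 256u	/* the old floor, in basic blocks */

/* Old rule: free-space threshold is max(25% of the log, 256 BBs). */
static uint32_t toy_threshold_old(uint32_t log_bbs)
{
	uint32_t quarter = log_bbs >> 2;

	return quarter > TOY_MIN_FREE_BBS ? quarter : TOY_MIN_FREE_BBS;
}

/* New rule: 25% of the log, with no floor. */
static uint32_t toy_threshold_new(uint32_t log_bbs)
{
	return log_bbs >> 2;
}

/* Pushing is wanted once free space drops below the threshold. */
static bool toy_needs_push(uint32_t log_bbs, uint32_t free_bbs)
{
	assert(free_bbs <= log_bbs);
	return free_bbs < toy_threshold_new(log_bbs);
}
```

In this model the two rules differ only for logs smaller than 1024 BBs
(512 KiB under the 512-byte assumption), so dropping the floor changes
nothing as long as every valid log is larger than that, which is what the
text above states.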
This does still require a manual push in certain circumstances.
These circumstances arise when the AIL is not full, but the
reservation grants consume all of the free space in the log.
In this case, we still need to push on the AIL to free up space, so
when we hit this condition (i.e. reservation going to sleep to wait
on log space) we do a single push to tell the AIL it should empty
itself. This will keep the AIL moving as new reservations come in
and want more space, rather than keep queuing them and having to
push the AIL repeatedly.
The reason for using the "push all" when grant space runs out is
that we can run out of grant space when there is more than 25% of
the log free. Small logs are notorious for this, and we have a hack
in the log callback code (xlog_state_set_callback()), where we push
the AIL because the *head* moved, to ensure that we kick the AIL
when we consume space in it because that can push us over the "less
than 25% available" threshold that starts tail pushing back up
again.
Hence when we run out of grant space and are going to sleep, we have
to consider that the grant space may be consuming almost all the log
space and there is almost nothing in the AIL. In this situation, the
AIL pins the tail and moving the tail forwards is the only way the
grant space will come available, so we have to force the AIL to push
everything to guarantee grant space will eventually be returned.
Hence triggering a "push all" just before sleeping removes all the
nasty corner cases we have in other parts of the code that work
around the "we didn't ask the AIL to push enough to free grant
space" condition that leads to log space hangs...
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
include/xfs_trans.h | 2 +-
libxfs/xfs_defer.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/xfs_trans.h b/include/xfs_trans.h
index b7f01ff07..912bd4085 100644
--- a/include/xfs_trans.h
+++ b/include/xfs_trans.h
@@ -163,7 +163,7 @@ libxfs_trans_read_buf(
#define xfs_log_item_in_current_chkpt(lip) (false)
/* Contorted mess to make gcc shut up about unused vars. */
-#define xlog_grant_push_threshold(log, need) \
+#define xfs_ail_push_target(ail) \
((log) == (log) ? NULLCOMMITLSN : NULLCOMMITLSN)
/* from xfs_log.h */
diff --git a/libxfs/xfs_defer.c b/libxfs/xfs_defer.c
index 7cf392e2f..56722da23 100644
--- a/libxfs/xfs_defer.c
+++ b/libxfs/xfs_defer.c
@@ -550,7 +550,7 @@ xfs_defer_relog(
* the log threshold once per call.
*/
if (threshold_lsn == NULLCOMMITLSN) {
- threshold_lsn = xlog_grant_push_threshold(log, 0);
+ threshold_lsn = xfs_ail_push_target(log->l_ailp);
if (threshold_lsn == NULLCOMMITLSN)
break;
}
* [PATCH 60/64] xfs: background AIL push should target physical space
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (58 preceding siblings ...)
2024-10-02 1:23 ` [PATCH 59/64] xfs: AIL doesn't need manual pushing Darrick J. Wong
@ 2024-10-02 1:23 ` Darrick J. Wong
2024-10-02 1:23 ` [PATCH 61/64] xfs: get rid of xfs_ag_resv_rmapbt_alloc Darrick J. Wong
` (3 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:23 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Dave Chinner, Chandan Babu R, linux-xfs
From: Dave Chinner <dchinner@redhat.com>
Source kernel commit: b50b4c49d8d79af05ac3bb3587f58589713139cc
Currently the AIL attempts to keep 25% of the "log space" free,
where the current used space is tracked by the reserve grant head.
That is, it tracks both physical space used plus the amount reserved
by transactions in progress.
When we start tail pushing, we are trying to make space for new
reservations by writing back older metadata and the log is generally
physically full of dirty metadata, and reservations for modifications
in flight take up whatever space the AIL can physically free up.
Hence we don't really need to take into account the reservation
space that has been used - we just need to keep the log tail moving
as fast as we can to free up space for more reservations to be made.
We know exactly how much physical space the journal is consuming in
the AIL (i.e. max LSN - min LSN) so we can base push thresholds
directly on this state rather than have to look at grant head
reservations to determine how much to physically push out of the
log.
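The "max LSN - min LSN" measure can be sketched as a subtraction on a
circular log. This assumes an LSN packs the cycle number in the upper 32 bits
and the basic block number in the lower 32 bits; toy_lsn_t and the toy_*()
helpers are hypothetical names, and the model only handles a head that is at
most one cycle ahead of the tail.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t toy_lsn_t;

static uint32_t toy_cycle(toy_lsn_t lsn)
{
	return (uint32_t)((uint64_t)lsn >> 32);
}

static uint32_t toy_block(toy_lsn_t lsn)
{
	return (uint32_t)lsn;
}

static toy_lsn_t toy_mk_lsn(uint32_t cycle, uint32_t block)
{
	return (toy_lsn_t)(((uint64_t)cycle << 32) | block);
}

/*
 * Basic blocks physically pinned between the oldest item (min_lsn) and the
 * newest item (max_lsn) in a circular log of log_bbs basic blocks.
 */
static uint32_t toy_ail_space(uint32_t log_bbs, toy_lsn_t max_lsn,
			      toy_lsn_t min_lsn)
{
	assert(max_lsn >= min_lsn);
	if (toy_cycle(max_lsn) == toy_cycle(min_lsn))
		return toy_block(max_lsn) - toy_block(min_lsn);

	/* The head has wrapped past the end of the log exactly once. */
	assert(toy_cycle(max_lsn) == toy_cycle(min_lsn) + 1);
	assert(toy_block(min_lsn) >= toy_block(max_lsn));
	return log_bbs - toy_block(min_lsn) + toy_block(max_lsn);
}
```

With a 1000-block log, a tail at cycle 5 block 900 and a head at cycle 6
block 100 pin 200 blocks. A push decision can compare that figure with the
log size directly, without consulting any reservation state.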
This also allows code that needs to know if log items in the current
transaction need to be pushed or re-logged to simply sample the
current target - they don't need to calculate the current target
themselves. This avoids the need for any locking when doing such
checks.
Further, moving to a physical target means we don't need the "push all
until empty" semantics that were introduced in the previous patch.
We can now test and clear the "push all" as a one-shot command to
set the target to the current head of the AIL. This allows the
xfsaild to maximise the use of log space right up to the point where
conditions indicate that the xfsaild is not keeping up with load and
it needs to work harder, and as soon as those constraints go away
(i.e. external code no longer needs everything pushed) the xfsaild
will return to maintaining the normal 25% free space thresholds.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
include/xfs_trans.h | 2 +-
libxfs/xfs_defer.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/xfs_trans.h b/include/xfs_trans.h
index 912bd4085..9bc4b1ef5 100644
--- a/include/xfs_trans.h
+++ b/include/xfs_trans.h
@@ -163,7 +163,7 @@ libxfs_trans_read_buf(
#define xfs_log_item_in_current_chkpt(lip) (false)
/* Contorted mess to make gcc shut up about unused vars. */
-#define xfs_ail_push_target(ail) \
+#define xfs_ail_get_push_target(ail) \
((log) == (log) ? NULLCOMMITLSN : NULLCOMMITLSN)
/* from xfs_log.h */
diff --git a/libxfs/xfs_defer.c b/libxfs/xfs_defer.c
index 56722da23..e3a608f64 100644
--- a/libxfs/xfs_defer.c
+++ b/libxfs/xfs_defer.c
@@ -550,7 +550,7 @@ xfs_defer_relog(
* the log threshold once per call.
*/
if (threshold_lsn == NULLCOMMITLSN) {
- threshold_lsn = xfs_ail_push_target(log->l_ailp);
+ threshold_lsn = xfs_ail_get_push_target(log->l_ailp);
if (threshold_lsn == NULLCOMMITLSN)
break;
}
* [PATCH 61/64] xfs: get rid of xfs_ag_resv_rmapbt_alloc
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (59 preceding siblings ...)
2024-10-02 1:23 ` [PATCH 60/64] xfs: background AIL push should target physical space Darrick J. Wong
@ 2024-10-02 1:23 ` Darrick J. Wong
2024-10-02 1:23 ` [PATCH 62/64] xfs: remove unused parameter in macro XFS_DQUOT_LOGRES Darrick J. Wong
` (2 subsequent siblings)
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:23 UTC (permalink / raw)
To: aalbersh, djwong, cem
Cc: Long Li, Christoph Hellwig, Chandan Babu R, linux-xfs
From: Long Li <leo.lilong@huawei.com>
Source kernel commit: 49cdc4e834e46d7c11a91d7adcfa04f56d19efaf
The pag in xfs_ag_resv_rmapbt_alloc() is already held when the struct
xfs_btree_cur is initialized in xfs_rmapbt_init_cursor(), so there is no
need to get pag again.
On the other hand, in xfs_rmapbt_free_block(), the similar function
xfs_ag_resv_rmapbt_free() was removed in commit 92a005448f6f ("xfs: get
rid of unnecessary xfs_perag_{get,put} pairs"). xfs_ag_resv_rmapbt_alloc()
was left because scrub used it, but scrub no longer does. Therefore, we
can get rid of xfs_ag_resv_rmapbt_alloc() just as was done for the rmap
free block path, making the code cleaner.
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
libxfs/xfs_ag_resv.h | 19 -------------------
libxfs/xfs_rmap_btree.c | 7 ++++++-
2 files changed, 6 insertions(+), 20 deletions(-)
diff --git a/libxfs/xfs_ag_resv.h b/libxfs/xfs_ag_resv.h
index ff20ed93d..f247eeff7 100644
--- a/libxfs/xfs_ag_resv.h
+++ b/libxfs/xfs_ag_resv.h
@@ -33,23 +33,4 @@ xfs_perag_resv(
}
}
-/*
- * RMAPBT reservation accounting wrappers. Since rmapbt blocks are sourced from
- * the AGFL, they are allocated one at a time and the reservation updates don't
- * require a transaction.
- */
-static inline void
-xfs_ag_resv_rmapbt_alloc(
- struct xfs_mount *mp,
- xfs_agnumber_t agno)
-{
- struct xfs_alloc_arg args = { NULL };
- struct xfs_perag *pag;
-
- args.len = 1;
- pag = xfs_perag_get(mp, agno);
- xfs_ag_resv_alloc_extent(pag, XFS_AG_RESV_RMAPBT, &args);
- xfs_perag_put(pag);
-}
-
#endif /* __XFS_AG_RESV_H__ */
diff --git a/libxfs/xfs_rmap_btree.c b/libxfs/xfs_rmap_btree.c
index a2730e29c..f1732b72d 100644
--- a/libxfs/xfs_rmap_btree.c
+++ b/libxfs/xfs_rmap_btree.c
@@ -87,6 +87,7 @@ xfs_rmapbt_alloc_block(
struct xfs_buf *agbp = cur->bc_ag.agbp;
struct xfs_agf *agf = agbp->b_addr;
struct xfs_perag *pag = cur->bc_ag.pag;
+ struct xfs_alloc_arg args = { .len = 1 };
int error;
xfs_agblock_t bno;
@@ -106,7 +107,11 @@ xfs_rmapbt_alloc_block(
be32_add_cpu(&agf->agf_rmap_blocks, 1);
xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_RMAP_BLOCKS);
- xfs_ag_resv_rmapbt_alloc(cur->bc_mp, pag->pag_agno);
+ /*
+ * Since rmapbt blocks are sourced from the AGFL, they are allocated one
+ * at a time and the reservation updates don't require a transaction.
+ */
+ xfs_ag_resv_alloc_extent(pag, XFS_AG_RESV_RMAPBT, &args);
*stat = 1;
return 0;
* [PATCH 62/64] xfs: remove unused parameter in macro XFS_DQUOT_LOGRES
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (60 preceding siblings ...)
2024-10-02 1:23 ` [PATCH 61/64] xfs: get rid of xfs_ag_resv_rmapbt_alloc Darrick J. Wong
@ 2024-10-02 1:23 ` Darrick J. Wong
2024-10-02 1:24 ` [PATCH 63/64] xfs: fix di_onlink checking for V1/V2 inodes Darrick J. Wong
2024-10-02 1:24 ` [PATCH 64/64] xfs: xfs_finobt_count_blocks() walks the wrong btree Darrick J. Wong
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:23 UTC (permalink / raw)
To: aalbersh, djwong, cem; +Cc: Julian Sun, Chandan Babu R, linux-xfs
From: Julian Sun <sunjunchao2870@gmail.com>
Source kernel commit: af5d92f2fad818663da2ce073b6fe15b9d56ffdc
In the macro definition of XFS_DQUOT_LOGRES, a parameter is accepted,
but it is not used. Hence, it should be removed.
This patch has only been compile-tested, but it should be fine.
Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
libxfs/xfs_quota_defs.h | 2 +-
libxfs/xfs_trans_resv.c | 28 ++++++++++++++--------------
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/libxfs/xfs_quota_defs.h b/libxfs/xfs_quota_defs.h
index cb035da3f..fb05f44f6 100644
--- a/libxfs/xfs_quota_defs.h
+++ b/libxfs/xfs_quota_defs.h
@@ -56,7 +56,7 @@ typedef uint8_t xfs_dqtype_t;
* And, of course, we also need to take into account the dquot log format item
* used to describe each dquot.
*/
-#define XFS_DQUOT_LOGRES(mp) \
+#define XFS_DQUOT_LOGRES \
((sizeof(struct xfs_dq_logformat) + sizeof(struct xfs_disk_dquot)) * 6)
#define XFS_IS_QUOTA_ON(mp) ((mp)->m_qflags & XFS_ALL_QUOTA_ACCT)
diff --git a/libxfs/xfs_trans_resv.c b/libxfs/xfs_trans_resv.c
index a2cb4d63e..6b87bf4d5 100644
--- a/libxfs/xfs_trans_resv.c
+++ b/libxfs/xfs_trans_resv.c
@@ -335,11 +335,11 @@ xfs_calc_write_reservation(
blksz);
t1 += adj;
t3 += adj;
- return XFS_DQUOT_LOGRES(mp) + max3(t1, t2, t3);
+ return XFS_DQUOT_LOGRES + max3(t1, t2, t3);
}
t4 = xfs_calc_refcountbt_reservation(mp, 1);
- return XFS_DQUOT_LOGRES(mp) + max(t4, max3(t1, t2, t3));
+ return XFS_DQUOT_LOGRES + max(t4, max3(t1, t2, t3));
}
unsigned int
@@ -407,11 +407,11 @@ xfs_calc_itruncate_reservation(
xfs_refcountbt_block_count(mp, 4),
blksz);
- return XFS_DQUOT_LOGRES(mp) + max3(t1, t2, t3);
+ return XFS_DQUOT_LOGRES + max3(t1, t2, t3);
}
t4 = xfs_calc_refcountbt_reservation(mp, 2);
- return XFS_DQUOT_LOGRES(mp) + max(t4, max3(t1, t2, t3));
+ return XFS_DQUOT_LOGRES + max(t4, max3(t1, t2, t3));
}
unsigned int
@@ -463,7 +463,7 @@ STATIC uint
xfs_calc_rename_reservation(
struct xfs_mount *mp)
{
- unsigned int overhead = XFS_DQUOT_LOGRES(mp);
+ unsigned int overhead = XFS_DQUOT_LOGRES;
struct xfs_trans_resv *resp = M_RES(mp);
unsigned int t1, t2, t3 = 0;
@@ -574,7 +574,7 @@ STATIC uint
xfs_calc_link_reservation(
struct xfs_mount *mp)
{
- unsigned int overhead = XFS_DQUOT_LOGRES(mp);
+ unsigned int overhead = XFS_DQUOT_LOGRES;
struct xfs_trans_resv *resp = M_RES(mp);
unsigned int t1, t2, t3 = 0;
@@ -638,7 +638,7 @@ STATIC uint
xfs_calc_remove_reservation(
struct xfs_mount *mp)
{
- unsigned int overhead = XFS_DQUOT_LOGRES(mp);
+ unsigned int overhead = XFS_DQUOT_LOGRES;
struct xfs_trans_resv *resp = M_RES(mp);
unsigned int t1, t2, t3 = 0;
@@ -726,7 +726,7 @@ xfs_calc_icreate_reservation(
struct xfs_mount *mp)
{
struct xfs_trans_resv *resp = M_RES(mp);
- unsigned int overhead = XFS_DQUOT_LOGRES(mp);
+ unsigned int overhead = XFS_DQUOT_LOGRES;
unsigned int t1, t2, t3 = 0;
t1 = xfs_calc_icreate_resv_alloc(mp);
@@ -744,7 +744,7 @@ STATIC uint
xfs_calc_create_tmpfile_reservation(
struct xfs_mount *mp)
{
- uint res = XFS_DQUOT_LOGRES(mp);
+ uint res = XFS_DQUOT_LOGRES;
res += xfs_calc_icreate_resv_alloc(mp);
return res + xfs_calc_iunlink_add_reservation(mp);
@@ -826,7 +826,7 @@ STATIC uint
xfs_calc_ifree_reservation(
struct xfs_mount *mp)
{
- return XFS_DQUOT_LOGRES(mp) +
+ return XFS_DQUOT_LOGRES +
xfs_calc_inode_res(mp, 1) +
xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) +
xfs_calc_iunlink_remove_reservation(mp) +
@@ -843,7 +843,7 @@ STATIC uint
xfs_calc_ichange_reservation(
struct xfs_mount *mp)
{
- return XFS_DQUOT_LOGRES(mp) +
+ return XFS_DQUOT_LOGRES +
xfs_calc_inode_res(mp, 1) +
xfs_calc_buf_res(1, mp->m_sb.sb_sectsize);
@@ -952,7 +952,7 @@ STATIC uint
xfs_calc_addafork_reservation(
struct xfs_mount *mp)
{
- return XFS_DQUOT_LOGRES(mp) +
+ return XFS_DQUOT_LOGRES +
xfs_calc_inode_res(mp, 1) +
xfs_calc_buf_res(2, mp->m_sb.sb_sectsize) +
xfs_calc_buf_res(1, mp->m_dir_geo->blksize) +
@@ -1000,7 +1000,7 @@ STATIC uint
xfs_calc_attrsetm_reservation(
struct xfs_mount *mp)
{
- return XFS_DQUOT_LOGRES(mp) +
+ return XFS_DQUOT_LOGRES +
xfs_calc_inode_res(mp, 1) +
xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) +
xfs_calc_buf_res(XFS_DA_NODE_MAXDEPTH, XFS_FSB_TO_B(mp, 1));
@@ -1040,7 +1040,7 @@ STATIC uint
xfs_calc_attrrm_reservation(
struct xfs_mount *mp)
{
- return XFS_DQUOT_LOGRES(mp) +
+ return XFS_DQUOT_LOGRES +
max((xfs_calc_inode_res(mp, 1) +
xfs_calc_buf_res(XFS_DA_NODE_MAXDEPTH,
XFS_FSB_TO_B(mp, 1)) +
* [PATCH 63/64] xfs: fix di_onlink checking for V1/V2 inodes
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (61 preceding siblings ...)
2024-10-02 1:23 ` [PATCH 62/64] xfs: remove unused parameter in macro XFS_DQUOT_LOGRES Darrick J. Wong
@ 2024-10-02 1:24 ` Darrick J. Wong
2024-10-02 1:24 ` [PATCH 64/64] xfs: xfs_finobt_count_blocks() walks the wrong btree Darrick J. Wong
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:24 UTC (permalink / raw)
To: aalbersh, djwong, cem
Cc: kjell.m.randa, Christoph Hellwig, Chandan Babu R, linux-xfs
From: Darrick J. Wong <djwong@kernel.org>
Source kernel commit: e21fea4ac3cf12eba1921fbbf7764bf69c6d4b2c
"KjellR" complained on IRC that an old V4 filesystem suddenly stopped
mounting after upgrading from 6.9.11 to 6.10.3, with the following splat
when trying to read the rt bitmap inode:
00000000: 49 4e 80 00 01 02 00 01 00 00 00 00 00 00 00 00 IN..............
00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000020: 00 00 00 00 00 00 00 00 43 d2 a9 da 21 0f d6 30 ........C...!..0
00000030: 43 d2 a9 da 21 0f d6 30 00 00 00 00 00 00 00 00 C...!..0........
00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000050: 00 00 00 02 00 00 00 00 00 00 00 04 00 00 00 00 ................
00000060: ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
As Dave Chinner points out, this is a V1 inode with both di_onlink and
di_nlink set to 1 and di_flushiter == 0. In other words, this inode was
formatted this way by mkfs and hasn't been touched since then.
Back in the old days of xfsprogs 3.2.3, I observed that libxfs_ialloc
would set di_nlink, but if the filesystem didn't have NLINK, it would
then set di_version = 1. libxfs_iflush_int later sees the V1 inode and
copies the value of di_nlink to di_onlink without zeroing di_onlink.
Eventually this filesystem must have been upgraded to support NLINK
because 6.10 doesn't support !NLINK filesystems, which is how we tripped
over this old behavior. The filesystem doesn't have a realtime section,
so that's why the rtbitmap inode has never been touched.
Fix this by removing the di_onlink/di_nlink checking for all V1/V2
inodes because this is a muddy mess. The V3 inode handling code has
always supported NLINK and written di_onlink==0 so keep that check.
The removal of the V1 inode handling code when we dropped support for
!NLINK obscured this old behavior.
Reported-by: kjell.m.randa@gmail.com
Fixes: 40cb8613d612 ("xfs: check unused nlink fields in the ondisk inode")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
libxfs/xfs_inode_buf.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
index 856659cc3..5970ee705 100644
--- a/libxfs/xfs_inode_buf.c
+++ b/libxfs/xfs_inode_buf.c
@@ -511,12 +511,18 @@ xfs_dinode_verify(
return __this_address;
}
- if (dip->di_version > 1) {
+ /*
+ * Historical note: xfsprogs in the 3.2 era set up its incore inodes to
+ * have di_nlink track the link count, even if the actual filesystem
+ * only supported V1 inodes (i.e. di_onlink). When writing out the
+ * ondisk inode, it would set both the ondisk di_nlink and di_onlink to
+ * the incore di_nlink value, which is why we cannot check for
+ * di_nlink==0 on a V1 inode. V2/3 inodes would get written out with
+ * di_onlink==0, so we can check that.
+ */
+ if (dip->di_version >= 2) {
if (dip->di_onlink)
return __this_address;
- } else {
- if (dip->di_nlink)
- return __this_address;
}
/* don't allow invalid i_size */
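For readers who want to see the effect of the hunk above in isolation, here is a minimal standalone sketch of the old and new link-count checks. It is not part of the patch: struct sketch_dinode and both function names are hypothetical stand-ins, and the fields are plain host-endian integers rather than the big-endian ondisk types (which does not matter for a zero test).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the ondisk inode link-count fields. */
struct sketch_dinode {
	uint8_t  di_version;	/* inode version: 1, 2 or 3 */
	uint16_t di_onlink;	/* old (V1) link count */
	uint32_t di_nlink;	/* V2+ link count */
};

/* Check as it stood before the patch: V1 inodes must have di_nlink == 0. */
static bool sketch_nlink_ok_old(const struct sketch_dinode *dip)
{
	if (dip->di_version > 1)
		return dip->di_onlink == 0;
	return dip->di_nlink == 0;
}

/*
 * Check after the patch: only V2/V3 inodes are required to have
 * di_onlink == 0.  V1 inodes are not checked, because old xfsprogs
 * wrote the same value to both fields.
 */
static bool sketch_nlink_ok_new(const struct sketch_dinode *dip)
{
	if (dip->di_version >= 2)
		return dip->di_onlink == 0;
	return true;
}
```

With a V1 inode that has di_onlink == 1 and di_nlink == 1, as in the hexdump from the report, the old check rejects the inode and the new check accepts it. A V3 inode with a nonzero di_onlink is still rejected.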
* [PATCH 64/64] xfs: xfs_finobt_count_blocks() walks the wrong btree
2024-10-02 1:04 ` [PATCHSET v2.5 3/6] libxfs: resync with 6.11 Darrick J. Wong
` (62 preceding siblings ...)
2024-10-02 1:24 ` [PATCH 63/64] xfs: fix di_onlink checking for V1/V2 inodes Darrick J. Wong
@ 2024-10-02 1:24 ` Darrick J. Wong
63 siblings, 0 replies; 111+ messages in thread
From: Darrick J. Wong @ 2024-10-02 1:24 UTC (permalink / raw)
To: aalbersh, djwong, cem
Cc: Anders Blomdell, Dave Chinner, Christoph Hellwig, Chandan Babu R,
linux-xfs
From: Dave Chinner <dchinner@redhat.com>
Source kernel commit: 95179935beadccaf0f0bb461adb778731e293da4
As a result of the factoring in commit 14dd46cf31f4 ("xfs: split
xfs_inobt_init_cursor"), mount started taking a long time on a
user's filesystem. For Anders, this made mount times regress from
under a second to over 15 minutes for a filesystem with only 30
million inodes in it.
Anders bisected it down to the above commit, but even then the bug
was not obvious. In this commit, over 20 calls to
xfs_inobt_init_cursor() were modified, and some were modified to call
a new function named xfs_finobt_init_cursor().
If it takes you a moment to reread those function names to see
what the rename was, then you have realised why this bug wasn't
spotted during review. And it wasn't spotted on inspection even
after the bisect pointed at this commit - a single missing "f" isn't
the easiest thing for a human eye to notice....
The result is that xfs_finobt_count_blocks() now incorrectly calls
xfs_inobt_init_cursor() so it is now walking the inobt instead of
the finobt. Hence when there are lots of allocated inodes in a
filesystem, mount takes a -long- time to run because it now walks the
massive allocated inode btrees instead of the small, nearly empty
free inode btrees. It also means all the finobt space reservations
are wrong, so mount could potentially fail with ENOSPC on kernel
upgrade.
In hindsight, commit 14dd46cf31f4 should have been two commits - the
first to convert the finobt callers to the new API, the second to
modify the xfs_inobt_init_cursor() API for the inobt callers. That
would have made the bug very obvious during review.
Fixes: 14dd46cf31f4 ("xfs: split xfs_inobt_init_cursor")
Reported-by: Anders Blomdell <anders.blomdell@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
libxfs/xfs_ialloc_btree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libxfs/xfs_ialloc_btree.c b/libxfs/xfs_ialloc_btree.c
index 5042cc62f..489c080fb 100644
--- a/libxfs/xfs_ialloc_btree.c
+++ b/libxfs/xfs_ialloc_btree.c
@@ -748,7 +748,7 @@ xfs_finobt_count_blocks(
if (error)
return error;
- cur = xfs_inobt_init_cursor(pag, tp, agbp);
+ cur = xfs_finobt_init_cursor(pag, tp, agbp);
error = xfs_btree_count_blocks(cur, tree_blocks);
xfs_btree_del_cursor(cur, error);
xfs_trans_brelse(tp, agbp);
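As an illustration only, the following toy model shows why a cursor on the wrong btree yields the wrong block count. Nothing here is the libxfs API: every sketch_* name is hypothetical, a "btree" is reduced to a stored block count, and the numbers used with it are made up.

```c
#include <stddef.h>

/* Which of the two per-AG inode btrees a cursor points at. */
enum sketch_btree { SKETCH_INOBT, SKETCH_FINOBT };

/* Toy allocation group: each btree is reduced to its block count. */
struct sketch_ag {
	size_t inobt_blocks;	/* allocated inode btree, large */
	size_t finobt_blocks;	/* free inode btree, usually small */
};

struct sketch_cursor {
	const struct sketch_ag *ag;
	enum sketch_btree which;
};

static struct sketch_cursor sketch_inobt_init_cursor(const struct sketch_ag *ag)
{
	struct sketch_cursor cur = { ag, SKETCH_INOBT };
	return cur;
}

static struct sketch_cursor sketch_finobt_init_cursor(const struct sketch_ag *ag)
{
	struct sketch_cursor cur = { ag, SKETCH_FINOBT };
	return cur;
}

/* Report the size of whichever tree the cursor selects. */
static size_t sketch_count_blocks(const struct sketch_cursor *cur)
{
	return cur->which == SKETCH_INOBT ? cur->ag->inobt_blocks
					  : cur->ag->finobt_blocks;
}

/* The bug: a finobt helper that builds an inobt cursor (one missing "f"). */
static size_t sketch_finobt_count_blocks_buggy(const struct sketch_ag *ag)
{
	struct sketch_cursor cur = sketch_inobt_init_cursor(ag);
	return sketch_count_blocks(&cur);
}

/* The fix: build the cursor for the tree the helper is named after. */
static size_t sketch_finobt_count_blocks_fixed(const struct sketch_ag *ag)
{
	struct sketch_cursor cur = sketch_finobt_init_cursor(ag);
	return sketch_count_blocks(&cur);
}
```

In the real code the count is produced by walking the tree, so the buggy variant is both slow (it visits every inobt block) and wrong (the result feeds the finobt space reservation).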