* [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator
@ 2023-06-20 21:32 Omar Sandoval
2023-06-20 21:32 ` [PATCH 1/6] xfs: cache last bitmap block in " Omar Sandoval
` (6 more replies)
0 siblings, 7 replies; 23+ messages in thread
From: Omar Sandoval @ 2023-06-20 21:32 UTC (permalink / raw)
To: linux-xfs; +Cc: Darrick J . Wong, kernel-team, Prashant Nema
Hello,
Our distributed storage system uses XFS's realtime device support as a
way to split an XFS filesystem between an SSD and an HDD -- we configure
the HDD as the realtime device so that metadata goes on the SSD and data
goes on the HDD.
We've been running this in production for a few years now, so we have
some fairly fragmented filesystems. This has exposed various CPU
inefficiencies in the realtime allocator. These became even worse when
we experimented with using XFS_XFLAG_EXTSIZE to force files to be
allocated contiguously.
This series adds several optimizations that don't change the realtime
allocator's decisions, but make them happen more efficiently, mainly by
avoiding redundant work. We've tested these in production and measured
~10% lower CPU utilization. Furthermore, it made it possible to use
XFS_XFLAG_EXTSIZE to force contiguous allocations -- without these
patches, our most fragmented systems would become unresponsive due to
high CPU usage in the realtime allocator, but with them, CPU utilization
is actually ~4-6% lower than before, and disk I/O utilization is 15-20%
lower.
Patches 2 and 3 are preparations for later optimizations; the remaining
patches are the optimizations themselves.
This is based on Linus' tree as of today (commit
692b7dc87ca6d55ab254f8259e6f970171dc9d01).
Thanks!
Omar Sandoval (6):
xfs: cache last bitmap block in realtime allocator
xfs: invert the realtime summary cache
xfs: return maximum free size from xfs_rtany_summary()
xfs: limit maxlen based on available space in
xfs_rtallocate_extent_near()
xfs: don't try redundant allocations in xfs_rtallocate_extent_near()
xfs: don't look for end of extent further than necessary in
xfs_rtallocate_extent_near()
fs/xfs/libxfs/xfs_rtbitmap.c | 173 ++++++++++++++--------------
fs/xfs/xfs_mount.h | 6 +-
fs/xfs/xfs_rtalloc.c | 215 ++++++++++++++++-------------------
fs/xfs/xfs_rtalloc.h | 28 +++--
4 files changed, 207 insertions(+), 215 deletions(-)
--
2.41.0
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 1/6] xfs: cache last bitmap block in realtime allocator
2023-06-20 21:32 [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
@ 2023-06-20 21:32 ` Omar Sandoval
2023-07-12 18:29 ` Darrick J. Wong
2023-06-20 21:32 ` [PATCH 2/6] xfs: invert the realtime summary cache Omar Sandoval
` (5 subsequent siblings)
6 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2023-06-20 21:32 UTC (permalink / raw)
To: linux-xfs; +Cc: Darrick J . Wong, kernel-team, Prashant Nema
From: Omar Sandoval <osandov@fb.com>
Profiling a workload on a highly fragmented realtime device showed a ton
of CPU cycles being spent in xfs_trans_read_buf() called by
xfs_rtbuf_get(). Further tracing showed that much of that was repeated
calls to xfs_rtbuf_get() for the same block of the realtime bitmap.
These come from xfs_rtallocate_extent_block(): as it walks through
ranges of free bits in the bitmap, each call to xfs_rtcheck_range() and
xfs_rtfind_{forw,back}() gets the same bitmap block. If the bitmap block
is very fragmented, then this is _a lot_ of buffer lookups.
The realtime allocator already passes around a cache of the last used
realtime summary block to avoid repeated reads (the parameters rbpp and
rsb). We can do the same for the realtime bitmap.
This replaces rbpp and rsb with a struct xfs_rtbuf_cache, which caches
the most recently used block for both the realtime bitmap and summary.
xfs_rtbuf_get() now handles the caching instead of the callers, which
requires plumbing xfs_rtbuf_cache to more functions but also makes sure
we don't miss anything.
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
fs/xfs/libxfs/xfs_rtbitmap.c | 167 ++++++++++++++++++-----------------
fs/xfs/xfs_rtalloc.c | 107 ++++++++++------------
fs/xfs/xfs_rtalloc.h | 28 ++++--
3 files changed, 154 insertions(+), 148 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index fa180ab66b73..1a832c9a412f 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -46,6 +46,21 @@ const struct xfs_buf_ops xfs_rtbuf_ops = {
.verify_write = xfs_rtbuf_verify_write,
};
+static void
+xfs_rtbuf_cache_relse(
+ xfs_trans_t *tp, /* transaction pointer */
+ struct xfs_rtbuf_cache *cache) /* cached blocks */
+{
+ if (cache->bbuf) {
+ xfs_trans_brelse(tp, cache->bbuf);
+ cache->bbuf = NULL;
+ }
+ if (cache->sbuf) {
+ xfs_trans_brelse(tp, cache->sbuf);
+ cache->sbuf = NULL;
+ }
+}
+
/*
* Get a buffer for the bitmap or summary file block specified.
* The buffer is returned read and locked.
@@ -56,14 +71,35 @@ xfs_rtbuf_get(
xfs_trans_t *tp, /* transaction pointer */
xfs_rtblock_t block, /* block number in bitmap or summary */
int issum, /* is summary not bitmap */
+ struct xfs_rtbuf_cache *cache, /* in/out: cached blocks */
struct xfs_buf **bpp) /* output: buffer for the block */
{
+ struct xfs_buf **cbpp; /* cached block buffer */
+ xfs_fsblock_t *cbp; /* cached block number */
struct xfs_buf *bp; /* block buffer, result */
xfs_inode_t *ip; /* bitmap or summary inode */
xfs_bmbt_irec_t map;
int nmap = 1;
int error; /* error value */
+ cbpp = issum ? &cache->bbuf : &cache->sbuf;
+ cbp = issum ? &cache->bblock : &cache->sblock;
+ /*
+ * If we have a cached buffer, and the block number matches, use that.
+ */
+ if (*cbpp && *cbp == block) {
+ *bpp = *cbpp;
+ return 0;
+ }
+ /*
+ * Otherwise we have to have to get the buffer. If there was an old
+ * one, get rid of it first.
+ */
+ if (*cbpp) {
+ xfs_trans_brelse(tp, *cbpp);
+ *cbpp = NULL;
+ }
+
ip = issum ? mp->m_rsumip : mp->m_rbmip;
error = xfs_bmapi_read(ip, block, 1, &map, &nmap, 0);
@@ -82,7 +118,8 @@ xfs_rtbuf_get(
xfs_trans_buf_set_type(tp, bp, issum ? XFS_BLFT_RTSUMMARY_BUF
: XFS_BLFT_RTBITMAP_BUF);
- *bpp = bp;
+ *cbpp = *bpp = bp;
+ *cbp = block;
return 0;
}
@@ -96,6 +133,7 @@ xfs_rtfind_back(
xfs_trans_t *tp, /* transaction pointer */
xfs_rtblock_t start, /* starting block to look at */
xfs_rtblock_t limit, /* last block to look at */
+ struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
xfs_rtblock_t *rtblock) /* out: start block found */
{
xfs_rtword_t *b; /* current word in buffer */
@@ -116,7 +154,7 @@ xfs_rtfind_back(
* Compute and read in starting bitmap block for starting block.
*/
block = XFS_BITTOBLOCK(mp, start);
- error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -153,7 +191,6 @@ xfs_rtfind_back(
/*
* Different. Mark where we are and return.
*/
- xfs_trans_brelse(tp, bp);
i = bit - XFS_RTHIBIT(wdiff);
*rtblock = start - i + 1;
return 0;
@@ -167,8 +204,7 @@ xfs_rtfind_back(
/*
* If done with this block, get the previous one.
*/
- xfs_trans_brelse(tp, bp);
- error = xfs_rtbuf_get(mp, tp, --block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, --block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -199,7 +235,6 @@ xfs_rtfind_back(
/*
* Different, mark where we are and return.
*/
- xfs_trans_brelse(tp, bp);
i += XFS_NBWORD - 1 - XFS_RTHIBIT(wdiff);
*rtblock = start - i + 1;
return 0;
@@ -213,8 +248,7 @@ xfs_rtfind_back(
/*
* If done with this block, get the previous one.
*/
- xfs_trans_brelse(tp, bp);
- error = xfs_rtbuf_get(mp, tp, --block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, --block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -246,7 +280,6 @@ xfs_rtfind_back(
/*
* Different, mark where we are and return.
*/
- xfs_trans_brelse(tp, bp);
i += XFS_NBWORD - 1 - XFS_RTHIBIT(wdiff);
*rtblock = start - i + 1;
return 0;
@@ -256,7 +289,6 @@ xfs_rtfind_back(
/*
* No match, return that we scanned the whole area.
*/
- xfs_trans_brelse(tp, bp);
*rtblock = start - i + 1;
return 0;
}
@@ -271,6 +303,7 @@ xfs_rtfind_forw(
xfs_trans_t *tp, /* transaction pointer */
xfs_rtblock_t start, /* starting block to look at */
xfs_rtblock_t limit, /* last block to look at */
+ struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
xfs_rtblock_t *rtblock) /* out: start block found */
{
xfs_rtword_t *b; /* current word in buffer */
@@ -291,7 +324,7 @@ xfs_rtfind_forw(
* Compute and read in starting bitmap block for starting block.
*/
block = XFS_BITTOBLOCK(mp, start);
- error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -327,7 +360,6 @@ xfs_rtfind_forw(
/*
* Different. Mark where we are and return.
*/
- xfs_trans_brelse(tp, bp);
i = XFS_RTLOBIT(wdiff) - bit;
*rtblock = start + i - 1;
return 0;
@@ -341,8 +373,7 @@ xfs_rtfind_forw(
/*
* If done with this block, get the previous one.
*/
- xfs_trans_brelse(tp, bp);
- error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -372,7 +403,6 @@ xfs_rtfind_forw(
/*
* Different, mark where we are and return.
*/
- xfs_trans_brelse(tp, bp);
i += XFS_RTLOBIT(wdiff);
*rtblock = start + i - 1;
return 0;
@@ -386,8 +416,7 @@ xfs_rtfind_forw(
/*
* If done with this block, get the next one.
*/
- xfs_trans_brelse(tp, bp);
- error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -416,7 +445,6 @@ xfs_rtfind_forw(
/*
* Different, mark where we are and return.
*/
- xfs_trans_brelse(tp, bp);
i += XFS_RTLOBIT(wdiff);
*rtblock = start + i - 1;
return 0;
@@ -426,7 +454,6 @@ xfs_rtfind_forw(
/*
* No match, return that we scanned the whole area.
*/
- xfs_trans_brelse(tp, bp);
*rtblock = start + i - 1;
return 0;
}
@@ -447,8 +474,7 @@ xfs_rtmodify_summary_int(
int log, /* log2 of extent size */
xfs_rtblock_t bbno, /* bitmap block number */
int delta, /* change to make to summary info */
- struct xfs_buf **rbpp, /* in/out: summary block buffer */
- xfs_fsblock_t *rsb, /* in/out: summary block number */
+ struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
xfs_suminfo_t *sum) /* out: summary info for this block */
{
struct xfs_buf *bp; /* buffer for the summary block */
@@ -465,30 +491,9 @@ xfs_rtmodify_summary_int(
* Compute the block number in the summary file.
*/
sb = XFS_SUMOFFSTOBLOCK(mp, so);
- /*
- * If we have an old buffer, and the block number matches, use that.
- */
- if (*rbpp && *rsb == sb)
- bp = *rbpp;
- /*
- * Otherwise we have to get the buffer.
- */
- else {
- /*
- * If there was an old one, get rid of it first.
- */
- if (*rbpp)
- xfs_trans_brelse(tp, *rbpp);
- error = xfs_rtbuf_get(mp, tp, sb, 1, &bp);
- if (error) {
- return error;
- }
- /*
- * Remember this buffer and block for the next call.
- */
- *rbpp = bp;
- *rsb = sb;
- }
+ error = xfs_rtbuf_get(mp, tp, sb, 1, rtbufc, &bp);
+ if (error)
+ return error;
/*
* Point to the summary information, modify/log it, and/or copy it out.
*/
@@ -517,11 +522,9 @@ xfs_rtmodify_summary(
int log, /* log2 of extent size */
xfs_rtblock_t bbno, /* bitmap block number */
int delta, /* change to make to summary info */
- struct xfs_buf **rbpp, /* in/out: summary block buffer */
- xfs_fsblock_t *rsb) /* in/out: summary block number */
+ struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
{
- return xfs_rtmodify_summary_int(mp, tp, log, bbno,
- delta, rbpp, rsb, NULL);
+ return xfs_rtmodify_summary_int(mp, tp, log, bbno, delta, rtbufc, NULL);
}
/*
@@ -534,7 +537,8 @@ xfs_rtmodify_range(
xfs_trans_t *tp, /* transaction pointer */
xfs_rtblock_t start, /* starting block to modify */
xfs_extlen_t len, /* length of extent to modify */
- int val) /* 1 for free, 0 for allocated */
+ int val, /* 1 for free, 0 for allocated */
+ struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
{
xfs_rtword_t *b; /* current word in buffer */
int bit; /* bit number in the word */
@@ -555,7 +559,7 @@ xfs_rtmodify_range(
/*
* Read the bitmap block, and point to its data.
*/
- error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -600,7 +604,7 @@ xfs_rtmodify_range(
xfs_trans_log_buf(tp, bp,
(uint)((char *)first - (char *)bufp),
(uint)((char *)b - (char *)bufp));
- error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -640,7 +644,7 @@ xfs_rtmodify_range(
xfs_trans_log_buf(tp, bp,
(uint)((char *)first - (char *)bufp),
(uint)((char *)b - (char *)bufp));
- error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -690,8 +694,7 @@ xfs_rtfree_range(
xfs_trans_t *tp, /* transaction pointer */
xfs_rtblock_t start, /* starting block to free */
xfs_extlen_t len, /* length to free */
- struct xfs_buf **rbpp, /* in/out: summary block buffer */
- xfs_fsblock_t *rsb) /* in/out: summary block number */
+ struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
{
xfs_rtblock_t end; /* end of the freed extent */
int error; /* error value */
@@ -702,7 +705,7 @@ xfs_rtfree_range(
/*
* Modify the bitmap to mark this extent freed.
*/
- error = xfs_rtmodify_range(mp, tp, start, len, 1);
+ error = xfs_rtmodify_range(mp, tp, start, len, 1, rtbufc);
if (error) {
return error;
}
@@ -711,7 +714,7 @@ xfs_rtfree_range(
* We need to find the beginning and end of the extent so we can
* properly update the summary.
*/
- error = xfs_rtfind_back(mp, tp, start, 0, &preblock);
+ error = xfs_rtfind_back(mp, tp, start, 0, rtbufc, &preblock);
if (error) {
return error;
}
@@ -719,7 +722,7 @@ xfs_rtfree_range(
* Find the next allocated block (end of allocated extent).
*/
error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1,
- &postblock);
+ rtbufc, &postblock);
if (error)
return error;
/*
@@ -729,7 +732,7 @@ xfs_rtfree_range(
if (preblock < start) {
error = xfs_rtmodify_summary(mp, tp,
XFS_RTBLOCKLOG(start - preblock),
- XFS_BITTOBLOCK(mp, preblock), -1, rbpp, rsb);
+ XFS_BITTOBLOCK(mp, preblock), -1, rtbufc);
if (error) {
return error;
}
@@ -741,7 +744,7 @@ xfs_rtfree_range(
if (postblock > end) {
error = xfs_rtmodify_summary(mp, tp,
XFS_RTBLOCKLOG(postblock - end),
- XFS_BITTOBLOCK(mp, end + 1), -1, rbpp, rsb);
+ XFS_BITTOBLOCK(mp, end + 1), -1, rtbufc);
if (error) {
return error;
}
@@ -752,7 +755,7 @@ xfs_rtfree_range(
*/
error = xfs_rtmodify_summary(mp, tp,
XFS_RTBLOCKLOG(postblock + 1 - preblock),
- XFS_BITTOBLOCK(mp, preblock), 1, rbpp, rsb);
+ XFS_BITTOBLOCK(mp, preblock), 1, rtbufc);
return error;
}
@@ -767,6 +770,7 @@ xfs_rtcheck_range(
xfs_rtblock_t start, /* starting block number of extent */
xfs_extlen_t len, /* length of extent */
int val, /* 1 for free, 0 for allocated */
+ struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
xfs_rtblock_t *new, /* out: first block not matching */
int *stat) /* out: 1 for matches, 0 for not */
{
@@ -789,7 +793,7 @@ xfs_rtcheck_range(
/*
* Read the bitmap block.
*/
- error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -824,7 +828,6 @@ xfs_rtcheck_range(
/*
* Different, compute first wrong bit and return.
*/
- xfs_trans_brelse(tp, bp);
i = XFS_RTLOBIT(wdiff) - bit;
*new = start + i;
*stat = 0;
@@ -839,8 +842,7 @@ xfs_rtcheck_range(
/*
* If done with this block, get the next one.
*/
- xfs_trans_brelse(tp, bp);
- error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -870,7 +872,6 @@ xfs_rtcheck_range(
/*
* Different, compute first wrong bit and return.
*/
- xfs_trans_brelse(tp, bp);
i += XFS_RTLOBIT(wdiff);
*new = start + i;
*stat = 0;
@@ -885,8 +886,7 @@ xfs_rtcheck_range(
/*
* If done with this block, get the next one.
*/
- xfs_trans_brelse(tp, bp);
- error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
+ error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
if (error) {
return error;
}
@@ -915,7 +915,6 @@ xfs_rtcheck_range(
/*
* Different, compute first wrong bit and return.
*/
- xfs_trans_brelse(tp, bp);
i += XFS_RTLOBIT(wdiff);
*new = start + i;
*stat = 0;
@@ -926,7 +925,6 @@ xfs_rtcheck_range(
/*
* Successful, return.
*/
- xfs_trans_brelse(tp, bp);
*new = start + i;
*stat = 1;
return 0;
@@ -941,20 +939,21 @@ xfs_rtcheck_alloc_range(
xfs_mount_t *mp, /* file system mount point */
xfs_trans_t *tp, /* transaction pointer */
xfs_rtblock_t bno, /* starting block number of extent */
- xfs_extlen_t len) /* length of extent */
+ xfs_extlen_t len, /* length of extent */
+ struct xfs_rtbuf_cache *rtbufc) /* in/out: cached blocks */
{
xfs_rtblock_t new; /* dummy for xfs_rtcheck_range */
int stat;
int error;
- error = xfs_rtcheck_range(mp, tp, bno, len, 0, &new, &stat);
+ error = xfs_rtcheck_range(mp, tp, bno, len, 0, rtbufc, &new, &stat);
if (error)
return error;
ASSERT(stat);
return 0;
}
#else
-#define xfs_rtcheck_alloc_range(m,t,b,l) (0)
+#define xfs_rtcheck_alloc_range(m,t,b,l,r) (0)
#endif
/*
* Free an extent in the realtime subvolume. Length is expressed in
@@ -968,22 +967,21 @@ xfs_rtfree_extent(
{
int error; /* error value */
xfs_mount_t *mp; /* file system mount structure */
- xfs_fsblock_t sb; /* summary file block number */
- struct xfs_buf *sumbp = NULL; /* summary file block buffer */
+ struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
mp = tp->t_mountp;
ASSERT(mp->m_rbmip->i_itemp != NULL);
ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
- error = xfs_rtcheck_alloc_range(mp, tp, bno, len);
+ error = xfs_rtcheck_alloc_range(mp, tp, bno, len, &rtbufc);
if (error)
return error;
/*
* Free the range of realtime blocks.
*/
- error = xfs_rtfree_range(mp, tp, bno, len, &sumbp, &sb);
+ error = xfs_rtfree_range(mp, tp, bno, len, &rtbufc);
if (error) {
return error;
}
@@ -1021,6 +1019,7 @@ xfs_rtalloc_query_range(
xfs_rtblock_t high_key;
int is_free;
int error = 0;
+ struct xfs_rtbuf_cache rtbufc = {};
if (low_rec->ar_startext > high_rec->ar_startext)
return -EINVAL;
@@ -1034,13 +1033,14 @@ xfs_rtalloc_query_range(
rtstart = low_rec->ar_startext;
while (rtstart <= high_key) {
/* Is the first block free? */
- error = xfs_rtcheck_range(mp, tp, rtstart, 1, 1, &rtend,
- &is_free);
+ error = xfs_rtcheck_range(mp, tp, rtstart, 1, 1, &rtbufc,
+ &rtend, &is_free);
if (error)
break;
/* How long does the extent go for? */
- error = xfs_rtfind_forw(mp, tp, rtstart, high_key, &rtend);
+ error = xfs_rtfind_forw(mp, tp, rtstart, high_key, &rtbufc,
+ &rtend);
if (error)
break;
@@ -1056,6 +1056,8 @@ xfs_rtalloc_query_range(
rtstart = rtend + 1;
}
+ xfs_rtbuf_cache_relse(tp, &rtbufc);
+
return error;
}
@@ -1085,11 +1087,14 @@ xfs_rtalloc_extent_is_free(
xfs_extlen_t len,
bool *is_free)
{
+ struct xfs_rtbuf_cache rtbufc = {};
xfs_rtblock_t end;
int matches;
int error;
- error = xfs_rtcheck_range(mp, tp, start, len, 1, &end, &matches);
+ error = xfs_rtcheck_range(mp, tp, start, len, 1, &rtbufc, &end,
+ &matches);
+ xfs_rtbuf_cache_relse(tp, &rtbufc);
if (error)
return error;
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 16534e9873f6..61ef13286654 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -32,11 +32,10 @@ xfs_rtget_summary(
xfs_trans_t *tp, /* transaction pointer */
int log, /* log2 of extent size */
xfs_rtblock_t bbno, /* bitmap block number */
- struct xfs_buf **rbpp, /* in/out: summary block buffer */
- xfs_fsblock_t *rsb, /* in/out: summary block number */
+ struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
xfs_suminfo_t *sum) /* out: summary info for this block */
{
- return xfs_rtmodify_summary_int(mp, tp, log, bbno, 0, rbpp, rsb, sum);
+ return xfs_rtmodify_summary_int(mp, tp, log, bbno, 0, rtbufc, sum);
}
/*
@@ -50,8 +49,7 @@ xfs_rtany_summary(
int low, /* low log2 extent size */
int high, /* high log2 extent size */
xfs_rtblock_t bbno, /* bitmap block number */
- struct xfs_buf **rbpp, /* in/out: summary block buffer */
- xfs_fsblock_t *rsb, /* in/out: summary block number */
+ struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
int *stat) /* out: any good extents here? */
{
int error; /* error value */
@@ -69,7 +67,7 @@ xfs_rtany_summary(
/*
* Get one summary datum.
*/
- error = xfs_rtget_summary(mp, tp, log, bbno, rbpp, rsb, &sum);
+ error = xfs_rtget_summary(mp, tp, log, bbno, rtbufc, &sum);
if (error) {
return error;
}
@@ -104,29 +102,27 @@ xfs_rtcopy_summary(
xfs_trans_t *tp) /* transaction pointer */
{
xfs_rtblock_t bbno; /* bitmap block number */
- struct xfs_buf *bp; /* summary buffer */
+ struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
int error; /* error return value */
int log; /* summary level number (log length) */
xfs_suminfo_t sum; /* summary data */
- xfs_fsblock_t sumbno; /* summary block number */
- bp = NULL;
for (log = omp->m_rsumlevels - 1; log >= 0; log--) {
for (bbno = omp->m_sb.sb_rbmblocks - 1;
(xfs_srtblock_t)bbno >= 0;
bbno--) {
- error = xfs_rtget_summary(omp, tp, log, bbno, &bp,
- &sumbno, &sum);
+ error = xfs_rtget_summary(omp, tp, log, bbno, &rtbufc,
+ &sum);
if (error)
return error;
if (sum == 0)
continue;
error = xfs_rtmodify_summary(omp, tp, log, bbno, -sum,
- &bp, &sumbno);
+ &rtbufc);
if (error)
return error;
error = xfs_rtmodify_summary(nmp, tp, log, bbno, sum,
- &bp, &sumbno);
+ &rtbufc);
if (error)
return error;
ASSERT(sum > 0);
@@ -144,8 +140,7 @@ xfs_rtallocate_range(
xfs_trans_t *tp, /* transaction pointer */
xfs_rtblock_t start, /* start block to allocate */
xfs_extlen_t len, /* length to allocate */
- struct xfs_buf **rbpp, /* in/out: summary block buffer */
- xfs_fsblock_t *rsb) /* in/out: summary block number */
+ struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
{
xfs_rtblock_t end; /* end of the allocated extent */
int error; /* error value */
@@ -158,14 +153,14 @@ xfs_rtallocate_range(
* We need to find the beginning and end of the extent so we can
* properly update the summary.
*/
- error = xfs_rtfind_back(mp, tp, start, 0, &preblock);
+ error = xfs_rtfind_back(mp, tp, start, 0, rtbufc, &preblock);
if (error) {
return error;
}
/*
* Find the next allocated block (end of free extent).
*/
- error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1,
+ error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1, rtbufc,
&postblock);
if (error) {
return error;
@@ -176,7 +171,7 @@ xfs_rtallocate_range(
*/
error = xfs_rtmodify_summary(mp, tp,
XFS_RTBLOCKLOG(postblock + 1 - preblock),
- XFS_BITTOBLOCK(mp, preblock), -1, rbpp, rsb);
+ XFS_BITTOBLOCK(mp, preblock), -1, rtbufc);
if (error) {
return error;
}
@@ -187,7 +182,7 @@ xfs_rtallocate_range(
if (preblock < start) {
error = xfs_rtmodify_summary(mp, tp,
XFS_RTBLOCKLOG(start - preblock),
- XFS_BITTOBLOCK(mp, preblock), 1, rbpp, rsb);
+ XFS_BITTOBLOCK(mp, preblock), 1, rtbufc);
if (error) {
return error;
}
@@ -199,7 +194,7 @@ xfs_rtallocate_range(
if (postblock > end) {
error = xfs_rtmodify_summary(mp, tp,
XFS_RTBLOCKLOG(postblock - end),
- XFS_BITTOBLOCK(mp, end + 1), 1, rbpp, rsb);
+ XFS_BITTOBLOCK(mp, end + 1), 1, rtbufc);
if (error) {
return error;
}
@@ -207,7 +202,7 @@ xfs_rtallocate_range(
/*
* Modify the bitmap to mark this extent allocated.
*/
- error = xfs_rtmodify_range(mp, tp, start, len, 0);
+ error = xfs_rtmodify_range(mp, tp, start, len, 0, rtbufc);
return error;
}
@@ -226,8 +221,7 @@ xfs_rtallocate_extent_block(
xfs_extlen_t maxlen, /* maximum length to allocate */
xfs_extlen_t *len, /* out: actual length allocated */
xfs_rtblock_t *nextp, /* out: next block to try */
- struct xfs_buf **rbpp, /* in/out: summary block buffer */
- xfs_fsblock_t *rsb, /* in/out: summary block number */
+ struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
xfs_extlen_t prod, /* extent product factor */
xfs_rtblock_t *rtblock) /* out: start block allocated */
{
@@ -254,7 +248,8 @@ xfs_rtallocate_extent_block(
* See if there's a free extent of maxlen starting at i.
* If it's not so then next will contain the first non-free.
*/
- error = xfs_rtcheck_range(mp, tp, i, maxlen, 1, &next, &stat);
+ error = xfs_rtcheck_range(mp, tp, i, maxlen, 1, rtbufc, &next,
+ &stat);
if (error) {
return error;
}
@@ -262,8 +257,7 @@ xfs_rtallocate_extent_block(
/*
* i for maxlen is all free, allocate and return that.
*/
- error = xfs_rtallocate_range(mp, tp, i, maxlen, rbpp,
- rsb);
+ error = xfs_rtallocate_range(mp, tp, i, maxlen, rtbufc);
if (error) {
return error;
}
@@ -290,7 +284,7 @@ xfs_rtallocate_extent_block(
* If not done yet, find the start of the next free space.
*/
if (next < end) {
- error = xfs_rtfind_forw(mp, tp, next, end, &i);
+ error = xfs_rtfind_forw(mp, tp, next, end, rtbufc, &i);
if (error) {
return error;
}
@@ -315,7 +309,7 @@ xfs_rtallocate_extent_block(
/*
* Allocate besti for bestlen & return that.
*/
- error = xfs_rtallocate_range(mp, tp, besti, bestlen, rbpp, rsb);
+ error = xfs_rtallocate_range(mp, tp, besti, bestlen, rtbufc);
if (error) {
return error;
}
@@ -344,9 +338,8 @@ xfs_rtallocate_extent_exact(
xfs_rtblock_t bno, /* starting block number to allocate */
xfs_extlen_t minlen, /* minimum length to allocate */
xfs_extlen_t maxlen, /* maximum length to allocate */
+ struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
xfs_extlen_t *len, /* out: actual length allocated */
- struct xfs_buf **rbpp, /* in/out: summary block buffer */
- xfs_fsblock_t *rsb, /* in/out: summary block number */
xfs_extlen_t prod, /* extent product factor */
xfs_rtblock_t *rtblock) /* out: start block allocated */
{
@@ -359,7 +352,8 @@ xfs_rtallocate_extent_exact(
/*
* Check if the range in question (for maxlen) is free.
*/
- error = xfs_rtcheck_range(mp, tp, bno, maxlen, 1, &next, &isfree);
+ error = xfs_rtcheck_range(mp, tp, bno, maxlen, 1, rtbufc, &next,
+ &isfree);
if (error) {
return error;
}
@@ -367,7 +361,7 @@ xfs_rtallocate_extent_exact(
/*
* If it is, allocate it and return success.
*/
- error = xfs_rtallocate_range(mp, tp, bno, maxlen, rbpp, rsb);
+ error = xfs_rtallocate_range(mp, tp, bno, maxlen, rtbufc);
if (error) {
return error;
}
@@ -402,7 +396,7 @@ xfs_rtallocate_extent_exact(
/*
* Allocate what we can and return it.
*/
- error = xfs_rtallocate_range(mp, tp, bno, maxlen, rbpp, rsb);
+ error = xfs_rtallocate_range(mp, tp, bno, maxlen, rtbufc);
if (error) {
return error;
}
@@ -424,8 +418,7 @@ xfs_rtallocate_extent_near(
xfs_extlen_t minlen, /* minimum length to allocate */
xfs_extlen_t maxlen, /* maximum length to allocate */
xfs_extlen_t *len, /* out: actual length allocated */
- struct xfs_buf **rbpp, /* in/out: summary block buffer */
- xfs_fsblock_t *rsb, /* in/out: summary block number */
+ struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
xfs_extlen_t prod, /* extent product factor */
xfs_rtblock_t *rtblock) /* out: start block allocated */
{
@@ -456,8 +449,8 @@ xfs_rtallocate_extent_near(
/*
* Try the exact allocation first.
*/
- error = xfs_rtallocate_extent_exact(mp, tp, bno, minlen, maxlen, len,
- rbpp, rsb, prod, &r);
+ error = xfs_rtallocate_extent_exact(mp, tp, bno, minlen, maxlen, rtbufc,
+ len, prod, &r);
if (error) {
return error;
}
@@ -481,7 +474,7 @@ xfs_rtallocate_extent_near(
* starting in this bitmap block.
*/
error = xfs_rtany_summary(mp, tp, log2len, mp->m_rsumlevels - 1,
- bbno + i, rbpp, rsb, &any);
+ bbno + i, rtbufc, &any);
if (error) {
return error;
}
@@ -499,8 +492,8 @@ xfs_rtallocate_extent_near(
* this block.
*/
error = xfs_rtallocate_extent_block(mp, tp,
- bbno + i, minlen, maxlen, len, &n, rbpp,
- rsb, prod, &r);
+ bbno + i, minlen, maxlen, len, &n,
+ rtbufc, prod, &r);
if (error) {
return error;
}
@@ -529,7 +522,7 @@ xfs_rtallocate_extent_near(
*/
error = xfs_rtany_summary(mp, tp,
log2len, mp->m_rsumlevels - 1,
- bbno + j, rbpp, rsb, &any);
+ bbno + j, rtbufc, &any);
if (error) {
return error;
}
@@ -545,7 +538,7 @@ xfs_rtallocate_extent_near(
continue;
error = xfs_rtallocate_extent_block(mp,
tp, bbno + j, minlen, maxlen,
- len, &n, rbpp, rsb, prod, &r);
+ len, &n, rtbufc, prod, &r);
if (error) {
return error;
}
@@ -566,8 +559,8 @@ xfs_rtallocate_extent_near(
* that we found.
*/
error = xfs_rtallocate_extent_block(mp, tp,
- bbno + i, minlen, maxlen, len, &n, rbpp,
- rsb, prod, &r);
+ bbno + i, minlen, maxlen, len, &n,
+ rtbufc, prod, &r);
if (error) {
return error;
}
@@ -626,8 +619,7 @@ xfs_rtallocate_extent_size(
xfs_extlen_t minlen, /* minimum length to allocate */
xfs_extlen_t maxlen, /* maximum length to allocate */
xfs_extlen_t *len, /* out: actual length allocated */
- struct xfs_buf **rbpp, /* in/out: summary block buffer */
- xfs_fsblock_t *rsb, /* in/out: summary block number */
+ struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
xfs_extlen_t prod, /* extent product factor */
xfs_rtblock_t *rtblock) /* out: start block allocated */
{
@@ -656,7 +648,7 @@ xfs_rtallocate_extent_size(
/*
* Get the summary for this level/block.
*/
- error = xfs_rtget_summary(mp, tp, l, i, rbpp, rsb,
+ error = xfs_rtget_summary(mp, tp, l, i, rtbufc,
&sum);
if (error) {
return error;
@@ -670,7 +662,7 @@ xfs_rtallocate_extent_size(
* Try allocating the extent.
*/
error = xfs_rtallocate_extent_block(mp, tp, i, maxlen,
- maxlen, len, &n, rbpp, rsb, prod, &r);
+ maxlen, len, &n, rtbufc, prod, &r);
if (error) {
return error;
}
@@ -715,7 +707,7 @@ xfs_rtallocate_extent_size(
/*
* Get the summary information for this level/block.
*/
- error = xfs_rtget_summary(mp, tp, l, i, rbpp, rsb,
+ error = xfs_rtget_summary(mp, tp, l, i, rtbufc,
&sum);
if (error) {
return error;
@@ -733,7 +725,7 @@ xfs_rtallocate_extent_size(
error = xfs_rtallocate_extent_block(mp, tp, i,
XFS_RTMAX(minlen, 1 << l),
XFS_RTMIN(maxlen, (1 << (l + 1)) - 1),
- len, &n, rbpp, rsb, prod, &r);
+ len, &n, rtbufc, prod, &r);
if (error) {
return error;
}
@@ -922,7 +914,6 @@ xfs_growfs_rt(
xfs_extlen_t rbmblocks; /* current number of rt bitmap blocks */
xfs_extlen_t rsumblocks; /* current number of rt summary blks */
xfs_sb_t *sbp; /* old superblock */
- xfs_fsblock_t sumbno; /* summary block number */
uint8_t *rsum_cache; /* old summary cache */
sbp = &mp->m_sb;
@@ -1025,6 +1016,7 @@ xfs_growfs_rt(
bmbno++) {
struct xfs_trans *tp;
xfs_rfsblock_t nrblocks_step;
+ struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
*nmp = *mp;
nsbp = &nmp->m_sb;
@@ -1111,9 +1103,8 @@ xfs_growfs_rt(
/*
* Free new extent.
*/
- bp = NULL;
error = xfs_rtfree_range(nmp, tp, sbp->sb_rextents,
- nsbp->sb_rextents - sbp->sb_rextents, &bp, &sumbno);
+ nsbp->sb_rextents - sbp->sb_rextents, &rtbufc);
if (error) {
error_cancel:
xfs_trans_cancel(tp);
@@ -1185,8 +1176,7 @@ xfs_rtallocate_extent(
xfs_mount_t *mp = tp->t_mountp;
int error; /* error value */
xfs_rtblock_t r; /* result allocated block */
- xfs_fsblock_t sb; /* summary file block number */
- struct xfs_buf *sumbp; /* summary file block buffer */
+ struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
ASSERT(minlen > 0 && minlen <= maxlen);
@@ -1208,13 +1198,14 @@ xfs_rtallocate_extent(
}
retry:
- sumbp = NULL;
+ rtbufc.bbuf = NULL;
+ rtbufc.sbuf = NULL;
if (bno == 0) {
error = xfs_rtallocate_extent_size(mp, tp, minlen, maxlen, len,
- &sumbp, &sb, prod, &r);
+ &rtbufc, prod, &r);
} else {
error = xfs_rtallocate_extent_near(mp, tp, bno, minlen, maxlen,
- len, &sumbp, &sb, prod, &r);
+ len, &rtbufc, prod, &r);
}
if (error)
diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h
index 62c7ad79cbb6..888552c4f287 100644
--- a/fs/xfs/xfs_rtalloc.h
+++ b/fs/xfs/xfs_rtalloc.h
@@ -101,29 +101,39 @@ xfs_growfs_rt(
/*
* From xfs_rtbitmap.c
*/
+struct xfs_rtbuf_cache {
+ struct xfs_buf *bbuf; /* bitmap block buffer */
+ xfs_fsblock_t bblock; /* bitmap block number */
+ struct xfs_buf *sbuf; /* summary block buffer */
+ xfs_fsblock_t sblock; /* summary block number */
+};
+
int xfs_rtbuf_get(struct xfs_mount *mp, struct xfs_trans *tp,
- xfs_rtblock_t block, int issum, struct xfs_buf **bpp);
+ xfs_rtblock_t block, int issum, struct xfs_rtbuf_cache *cache,
+ struct xfs_buf **bpp);
int xfs_rtcheck_range(struct xfs_mount *mp, struct xfs_trans *tp,
xfs_rtblock_t start, xfs_extlen_t len, int val,
- xfs_rtblock_t *new, int *stat);
+ struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *new,
+ int *stat);
int xfs_rtfind_back(struct xfs_mount *mp, struct xfs_trans *tp,
xfs_rtblock_t start, xfs_rtblock_t limit,
- xfs_rtblock_t *rtblock);
+ struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *rtblock);
int xfs_rtfind_forw(struct xfs_mount *mp, struct xfs_trans *tp,
xfs_rtblock_t start, xfs_rtblock_t limit,
- xfs_rtblock_t *rtblock);
+ struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *rtblock);
int xfs_rtmodify_range(struct xfs_mount *mp, struct xfs_trans *tp,
- xfs_rtblock_t start, xfs_extlen_t len, int val);
+ xfs_rtblock_t start, xfs_extlen_t len, int val,
+ struct xfs_rtbuf_cache *rtbufc);
int xfs_rtmodify_summary_int(struct xfs_mount *mp, struct xfs_trans *tp,
int log, xfs_rtblock_t bbno, int delta,
- struct xfs_buf **rbpp, xfs_fsblock_t *rsb,
+ struct xfs_rtbuf_cache *rtbufc,
xfs_suminfo_t *sum);
int xfs_rtmodify_summary(struct xfs_mount *mp, struct xfs_trans *tp, int log,
- xfs_rtblock_t bbno, int delta, struct xfs_buf **rbpp,
- xfs_fsblock_t *rsb);
+ xfs_rtblock_t bbno, int delta,
+ struct xfs_rtbuf_cache *rtbufc);
int xfs_rtfree_range(struct xfs_mount *mp, struct xfs_trans *tp,
xfs_rtblock_t start, xfs_extlen_t len,
- struct xfs_buf **rbpp, xfs_fsblock_t *rsb);
+ struct xfs_rtbuf_cache *rtbufc);
int xfs_rtalloc_query_range(struct xfs_mount *mp, struct xfs_trans *tp,
const struct xfs_rtalloc_rec *low_rec,
const struct xfs_rtalloc_rec *high_rec,
--
2.41.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 2/6] xfs: invert the realtime summary cache
2023-06-20 21:32 [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
2023-06-20 21:32 ` [PATCH 1/6] xfs: cache last bitmap block in " Omar Sandoval
@ 2023-06-20 21:32 ` Omar Sandoval
2023-07-12 22:40 ` Darrick J. Wong
2023-06-20 21:32 ` [PATCH 3/6] xfs: return maximum free size from xfs_rtany_summary() Omar Sandoval
` (4 subsequent siblings)
6 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2023-06-20 21:32 UTC (permalink / raw)
To: linux-xfs; +Cc: Darrick J . Wong, kernel-team, Prashant Nema
From: Omar Sandoval <osandov@fb.com>
In commit 355e3532132b ("xfs: cache minimum realtime summary level"), I
added a cache of the minimum level of the realtime summary that has any
free extents. However, it turns out that the _maximum_ level is more
useful for upcoming optimizations, and basically equivalent for the
existing usage. So, let's change the meaning of the cache to be the
maximum level + 1, or 0 if there are no free extents.
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
fs/xfs/libxfs/xfs_rtbitmap.c | 6 +++---
fs/xfs/xfs_mount.h | 6 +++---
fs/xfs/xfs_rtalloc.c | 31 +++++++++++++++++++------------
3 files changed, 25 insertions(+), 18 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index 1a832c9a412f..d9493f64adfc 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -503,10 +503,10 @@ xfs_rtmodify_summary_int(
*sp += delta;
if (mp->m_rsum_cache) {
- if (*sp == 0 && log == mp->m_rsum_cache[bbno])
- mp->m_rsum_cache[bbno]++;
- if (*sp != 0 && log < mp->m_rsum_cache[bbno])
+ if (*sp == 0 && log + 1 == mp->m_rsum_cache[bbno])
mp->m_rsum_cache[bbno] = log;
+ if (*sp != 0 && log >= mp->m_rsum_cache[bbno])
+ mp->m_rsum_cache[bbno] = log + 1;
}
xfs_trans_log_buf(tp, bp, first, first + sizeof(*sp) - 1);
}
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 6c09f89534d3..964541c36730 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -103,9 +103,9 @@ typedef struct xfs_mount {
/*
* Optional cache of rt summary level per bitmap block with the
- * invariant that m_rsum_cache[bbno] <= the minimum i for which
- * rsum[i][bbno] != 0. Reads and writes are serialized by the rsumip
- * inode lock.
+ * invariant that m_rsum_cache[bbno] > the maximum i for which
+ * rsum[i][bbno] != 0, or 0 if rsum[i][bbno] == 0 for all i.
+ * Reads and writes are serialized by the rsumip inode lock.
*/
uint8_t *m_rsum_cache;
struct xfs_mru_cache *m_filestream; /* per-mount filestream data */
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 61ef13286654..d3c76532d20e 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -56,14 +56,19 @@ xfs_rtany_summary(
int log; /* loop counter, log2 of ext. size */
xfs_suminfo_t sum; /* summary data */
- /* There are no extents at levels < m_rsum_cache[bbno]. */
- if (mp->m_rsum_cache && low < mp->m_rsum_cache[bbno])
- low = mp->m_rsum_cache[bbno];
+ /* There are no extents at levels >= m_rsum_cache[bbno]. */
+ if (mp->m_rsum_cache) {
+ high = min(high, mp->m_rsum_cache[bbno] - 1);
+ if (low > high) {
+ *stat = 0;
+ return 0;
+ }
+ }
/*
* Loop over logs of extent sizes.
*/
- for (log = low; log <= high; log++) {
+ for (log = high; log >= low; log--) {
/*
* Get one summary datum.
*/
@@ -84,9 +89,9 @@ xfs_rtany_summary(
*/
*stat = 0;
out:
- /* There were no extents at levels < log. */
- if (mp->m_rsum_cache && log > mp->m_rsum_cache[bbno])
- mp->m_rsum_cache[bbno] = log;
+ /* There were no extents at levels > log. */
+ if (mp->m_rsum_cache && log + 1 < mp->m_rsum_cache[bbno])
+ mp->m_rsum_cache[bbno] = log + 1;
return 0;
}
@@ -878,12 +883,14 @@ xfs_alloc_rsum_cache(
xfs_extlen_t rbmblocks) /* number of rt bitmap blocks */
{
/*
- * The rsum cache is initialized to all zeroes, which is trivially a
- * lower bound on the minimum level with any free extents. We can
- * continue without the cache if it couldn't be allocated.
+ * The rsum cache is initialized to the maximum value, which is
+ * trivially an upper bound on the maximum level with any free extents.
+ * We can continue without the cache if it couldn't be allocated.
*/
- mp->m_rsum_cache = kvzalloc(rbmblocks, GFP_KERNEL);
- if (!mp->m_rsum_cache)
+ mp->m_rsum_cache = kvmalloc(rbmblocks, GFP_KERNEL);
+ if (mp->m_rsum_cache)
+ memset(mp->m_rsum_cache, -1, rbmblocks);
+ else
xfs_warn(mp, "could not allocate realtime summary cache");
}
--
2.41.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 3/6] xfs: return maximum free size from xfs_rtany_summary()
2023-06-20 21:32 [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
2023-06-20 21:32 ` [PATCH 1/6] xfs: cache last bitmap block in " Omar Sandoval
2023-06-20 21:32 ` [PATCH 2/6] xfs: invert the realtime summary cache Omar Sandoval
@ 2023-06-20 21:32 ` Omar Sandoval
2023-07-12 22:44 ` Darrick J. Wong
2023-06-20 21:32 ` [PATCH 4/6] xfs: limit maxlen based on available space in xfs_rtallocate_extent_near() Omar Sandoval
` (3 subsequent siblings)
6 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2023-06-20 21:32 UTC (permalink / raw)
To: linux-xfs; +Cc: Darrick J . Wong, kernel-team, Prashant Nema
From: Omar Sandoval <osandov@fb.com>
Instead of only returning whether there is any free space, return the
maximum size, which is fast thanks to the previous commit. This will be
used by two upcoming optimizations.
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
fs/xfs/xfs_rtalloc.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index d3c76532d20e..ba7d42e0090f 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -50,7 +50,7 @@ xfs_rtany_summary(
int high, /* high log2 extent size */
xfs_rtblock_t bbno, /* bitmap block number */
struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
- int *stat) /* out: any good extents here? */
+ int *maxlog) /* out: maximum log2 extent size free */
{
int error; /* error value */
int log; /* loop counter, log2 of ext. size */
@@ -60,7 +60,7 @@ xfs_rtany_summary(
if (mp->m_rsum_cache) {
high = min(high, mp->m_rsum_cache[bbno] - 1);
if (low > high) {
- *stat = 0;
+ *maxlog = -1;
return 0;
}
}
@@ -80,14 +80,14 @@ xfs_rtany_summary(
* If there are any, return success.
*/
if (sum) {
- *stat = 1;
+ *maxlog = log;
goto out;
}
}
/*
* Found nothing, return failure.
*/
- *stat = 0;
+ *maxlog = -1;
out:
/* There were no extents at levels > log. */
if (mp->m_rsum_cache && log + 1 < mp->m_rsum_cache[bbno])
@@ -427,7 +427,7 @@ xfs_rtallocate_extent_near(
xfs_extlen_t prod, /* extent product factor */
xfs_rtblock_t *rtblock) /* out: start block allocated */
{
- int any; /* any useful extents from summary */
+ int maxlog; /* maximum useful extent from summary */
xfs_rtblock_t bbno; /* bitmap block number */
int error; /* error value */
int i; /* bitmap block offset (loop control) */
@@ -479,7 +479,7 @@ xfs_rtallocate_extent_near(
* starting in this bitmap block.
*/
error = xfs_rtany_summary(mp, tp, log2len, mp->m_rsumlevels - 1,
- bbno + i, rtbufc, &any);
+ bbno + i, rtbufc, &maxlog);
if (error) {
return error;
}
@@ -487,7 +487,7 @@ xfs_rtallocate_extent_near(
* If there are any useful extents starting here, try
* allocating one.
*/
- if (any) {
+ if (maxlog >= 0) {
/*
* On the positive side of the starting location.
*/
@@ -527,7 +527,7 @@ xfs_rtallocate_extent_near(
*/
error = xfs_rtany_summary(mp, tp,
log2len, mp->m_rsumlevels - 1,
- bbno + j, rtbufc, &any);
+ bbno + j, rtbufc, &maxlog);
if (error) {
return error;
}
@@ -539,7 +539,7 @@ xfs_rtallocate_extent_near(
* extent given, we've already tried
* that allocation, don't do it again.
*/
- if (any)
+ if (maxlog >= 0)
continue;
error = xfs_rtallocate_extent_block(mp,
tp, bbno + j, minlen, maxlen,
--
2.41.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 4/6] xfs: limit maxlen based on available space in xfs_rtallocate_extent_near()
2023-06-20 21:32 [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
` (2 preceding siblings ...)
2023-06-20 21:32 ` [PATCH 3/6] xfs: return maximum free size from xfs_rtany_summary() Omar Sandoval
@ 2023-06-20 21:32 ` Omar Sandoval
2023-07-12 23:01 ` Darrick J. Wong
2023-06-20 21:32 ` [PATCH 5/6] xfs: don't try redundant allocations " Omar Sandoval
` (2 subsequent siblings)
6 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2023-06-20 21:32 UTC (permalink / raw)
To: linux-xfs; +Cc: Darrick J . Wong, kernel-team, Prashant Nema
From: Omar Sandoval <osandov@fb.com>
xfs_rtallocate_extent_near() calls xfs_rtallocate_extent_block() with
the minlen and maxlen that were passed to it.
xfs_rtallocate_extent_block() then scans the bitmap block looking for a
free range of size maxlen. If there is none, it has to scan the whole
bitmap block before returning the largest range of at least size minlen.
For a fragmented realtime device and a large allocation request, it's
almost certain that this will have to search the whole bitmap block,
leading to high CPU usage.
However, the realtime summary tells us the maximum size available in the
bitmap block. We can limit the search in xfs_rtallocate_extent_block()
to that size and often stop before scanning the whole bitmap block.
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
fs/xfs/xfs_rtalloc.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index ba7d42e0090f..d079dfb77c73 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -488,6 +488,8 @@ xfs_rtallocate_extent_near(
* allocating one.
*/
if (maxlog >= 0) {
+ xfs_extlen_t maxavail =
+ min(maxlen, ((xfs_extlen_t)1 << (maxlog + 1)) - 1);
/*
* On the positive side of the starting location.
*/
@@ -497,7 +499,7 @@ xfs_rtallocate_extent_near(
* this block.
*/
error = xfs_rtallocate_extent_block(mp, tp,
- bbno + i, minlen, maxlen, len, &n,
+ bbno + i, minlen, maxavail, len, &n,
rtbufc, prod, &r);
if (error) {
return error;
@@ -542,7 +544,7 @@ xfs_rtallocate_extent_near(
if (maxlog >= 0)
continue;
error = xfs_rtallocate_extent_block(mp,
- tp, bbno + j, minlen, maxlen,
+ tp, bbno + j, minlen, maxavail,
len, &n, rtbufc, prod, &r);
if (error) {
return error;
@@ -564,7 +566,7 @@ xfs_rtallocate_extent_near(
* that we found.
*/
error = xfs_rtallocate_extent_block(mp, tp,
- bbno + i, minlen, maxlen, len, &n,
+ bbno + i, minlen, maxavail, len, &n,
rtbufc, prod, &r);
if (error) {
return error;
--
2.41.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 5/6] xfs: don't try redundant allocations in xfs_rtallocate_extent_near()
2023-06-20 21:32 [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
` (3 preceding siblings ...)
2023-06-20 21:32 ` [PATCH 4/6] xfs: limit maxlen based on available space in xfs_rtallocate_extent_near() Omar Sandoval
@ 2023-06-20 21:32 ` Omar Sandoval
2023-07-12 23:34 ` Darrick J. Wong
2023-06-20 21:32 ` [PATCH 6/6] xfs: don't look for end of extent further than necessary " Omar Sandoval
2023-07-06 21:39 ` [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
6 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2023-06-20 21:32 UTC (permalink / raw)
To: linux-xfs; +Cc: Darrick J . Wong, kernel-team, Prashant Nema
From: Omar Sandoval <osandov@fb.com>
xfs_rtallocate_extent_near() tries to find a free extent as close to a
target bitmap block given by bbno as possible, which may be before or
after bbno. Searching backwards has a complication: the realtime summary
accounts for free space _starting_ in a bitmap block, but not straddling
or ending in a bitmap block. So, when the negative search finds a free
extent in the realtime summary, in order to end up closer to the target,
it looks for the end of the free extent. For example, if bbno - 2 has a
free extent, then it will check bbno - 1, then bbno - 2. But then if
bbno - 3 has a free extent, it will check bbno - 1 again, then bbno - 2
again, and then bbno - 3. This results in a quadratic loop, which is
completely pointless since the repeated checks won't find anything new.
Fix it by remembering where we last checked up to and continue from
there. This also obviates the need for a check of the realtime summary.
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
fs/xfs/xfs_rtalloc.c | 46 +++-----------------------------------------
1 file changed, 3 insertions(+), 43 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index d079dfb77c73..4d9d0be2e616 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -468,6 +468,7 @@ xfs_rtallocate_extent_near(
}
bbno = XFS_BITTOBLOCK(mp, bno);
i = 0;
+ j = -1;
ASSERT(minlen != 0);
log2len = xfs_highbit32(minlen);
/*
@@ -518,31 +519,11 @@ xfs_rtallocate_extent_near(
else { /* i < 0 */
/*
* Loop backwards through the bitmap blocks from
- * the starting point-1 up to where we are now.
+ * where we last checked up to where we are now.
* There should be an extent which ends in this
* bitmap block and is long enough.
*/
- for (j = -1; j > i; j--) {
- /*
- * Grab the summary information for
- * this bitmap block.
- */
- error = xfs_rtany_summary(mp, tp,
- log2len, mp->m_rsumlevels - 1,
- bbno + j, rtbufc, &maxlog);
- if (error) {
- return error;
- }
- /*
- * If there's no extent given in the
- * summary that means the extent we
- * found must carry over from an
- * earlier block. If there is an
- * extent given, we've already tried
- * that allocation, don't do it again.
- */
- if (maxlog >= 0)
- continue;
+ for (; j >= i; j--) {
error = xfs_rtallocate_extent_block(mp,
tp, bbno + j, minlen, maxavail,
len, &n, rtbufc, prod, &r);
@@ -557,27 +538,6 @@ xfs_rtallocate_extent_near(
return 0;
}
}
- /*
- * There weren't intervening bitmap blocks
- * with a long enough extent, or the
- * allocation didn't work for some reason
- * (i.e. it's a little * too short).
- * Try to allocate from the summary block
- * that we found.
- */
- error = xfs_rtallocate_extent_block(mp, tp,
- bbno + i, minlen, maxavail, len, &n,
- rtbufc, prod, &r);
- if (error) {
- return error;
- }
- /*
- * If it works, return the extent.
- */
- if (r != NULLRTBLOCK) {
- *rtblock = r;
- return 0;
- }
}
}
/*
--
2.41.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 6/6] xfs: don't look for end of extent further than necessary in xfs_rtallocate_extent_near()
2023-06-20 21:32 [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
` (4 preceding siblings ...)
2023-06-20 21:32 ` [PATCH 5/6] xfs: don't try redundant allocations " Omar Sandoval
@ 2023-06-20 21:32 ` Omar Sandoval
2023-08-01 23:40 ` Darrick J. Wong
2023-07-06 21:39 ` [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
6 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2023-06-20 21:32 UTC (permalink / raw)
To: linux-xfs; +Cc: Darrick J . Wong, kernel-team, Prashant Nema
From: Omar Sandoval <osandov@fb.com>
As explained in the previous commit, xfs_rtallocate_extent_near() looks
for the end of a free extent when searching backwards from the target
bitmap block. Since the previous commit, it searches from the last
bitmap block it checked to the bitmap block containing the start of the
extent.
This may still be more than necessary, since the free extent may not be
that long. We know the maximum size of the free extent from the realtime
summary. Use that to compute how many bitmap blocks we actually need to
check.
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
fs/xfs/xfs_rtalloc.c | 25 +++++++++++++++++++++----
1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 4d9d0be2e616..2e2eb7c4a648 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -517,12 +517,29 @@ xfs_rtallocate_extent_near(
* On the negative side of the starting location.
*/
else { /* i < 0 */
+ int maxblocks;
+
/*
- * Loop backwards through the bitmap blocks from
- * where we last checked up to where we are now.
- * There should be an extent which ends in this
- * bitmap block and is long enough.
+ * Loop backwards to find the end of the extent
+ * we found in the realtime summary.
+ *
+ * maxblocks is the maximum possible number of
+ * bitmap blocks from the start of the extent to
+ * the end of the extent.
*/
+ if (maxlog == 0)
+ maxblocks = 0;
+ else if (maxlog < mp->m_blkbit_log)
+ maxblocks = 1;
+ else
+ maxblocks = 2 << (maxlog - mp->m_blkbit_log);
+ /*
+ * We need to check bbno + i + maxblocks down to
+ * bbno + i. We already checked bbno down to
+ * bbno + j + 1, so we don't need to check those
+ * again.
+ */
+ j = min(i + maxblocks, j);
for (; j >= i; j--) {
error = xfs_rtallocate_extent_block(mp,
tp, bbno + j, minlen, maxavail,
--
2.41.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator
2023-06-20 21:32 [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
` (5 preceding siblings ...)
2023-06-20 21:32 ` [PATCH 6/6] xfs: don't look for end of extent further than necessary " Omar Sandoval
@ 2023-07-06 21:39 ` Omar Sandoval
2023-07-07 0:36 ` Dave Chinner
6 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2023-07-06 21:39 UTC (permalink / raw)
To: linux-xfs; +Cc: Darrick J . Wong, kernel-team, Prashant Nema
On Tue, Jun 20, 2023 at 02:32:10PM -0700, Omar Sandoval wrote:
> Hello,
>
> Our distributed storage system uses XFS's realtime device support as a
> way to split an XFS filesystem between an SSD and an HDD -- we configure
> the HDD as the realtime device so that metadata goes on the SSD and data
> goes on the HDD.
>
> We've been running this in production for a few years now, so we have
> some fairly fragmented filesystems. This has exposed various CPU
> inefficiencies in the realtime allocator. These became even worse when
> we experimented with using XFS_XFLAG_EXTSIZE to force files to be
> allocated contiguously.
>
> This series adds several optimizations that don't change the realtime
> allocator's decisions, but make them happen more efficiently, mainly by
> avoiding redundant work. We've tested these in production and measured
> ~10% lower CPU utilization. Furthermore, it made it possible to use
> XFS_XFLAG_EXTSIZE to force contiguous allocations -- without these
> patches, our most fragmented systems would become unresponsive due to
> high CPU usage in the realtime allocator, but with them, CPU utilization
> is actually ~4-6% lower than before, and disk I/O utilization is 15-20%
> lower.
>
> Patches 2 and 3 are preparations for later optimizations; the remaining
> patches are the optimizations themselves.
>
> This is based on Linus' tree as of today (commit
> 692b7dc87ca6d55ab254f8259e6f970171dc9d01).
>
> Thanks!
>
> Omar Sandoval (6):
> xfs: cache last bitmap block in realtime allocator
> xfs: invert the realtime summary cache
> xfs: return maximum free size from xfs_rtany_summary()
> xfs: limit maxlen based on available space in
> xfs_rtallocate_extent_near()
> xfs: don't try redundant allocations in xfs_rtallocate_extent_near()
> xfs: don't look for end of extent further than necessary in
> xfs_rtallocate_extent_near()
>
> fs/xfs/libxfs/xfs_rtbitmap.c | 173 ++++++++++++++--------------
> fs/xfs/xfs_mount.h | 6 +-
> fs/xfs/xfs_rtalloc.c | 215 ++++++++++++++++-------------------
> fs/xfs/xfs_rtalloc.h | 28 +++--
> 4 files changed, 207 insertions(+), 215 deletions(-)
Gentle ping.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator
2023-07-06 21:39 ` [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
@ 2023-07-07 0:36 ` Dave Chinner
0 siblings, 0 replies; 23+ messages in thread
From: Dave Chinner @ 2023-07-07 0:36 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-xfs, Darrick J . Wong, kernel-team, Prashant Nema
On Thu, Jul 06, 2023 at 02:39:08PM -0700, Omar Sandoval wrote:
> On Tue, Jun 20, 2023 at 02:32:10PM -0700, Omar Sandoval wrote:
> > Hello,
> >
> > Our distributed storage system uses XFS's realtime device support as a
> > way to split an XFS filesystem between an SSD and an HDD -- we configure
> > the HDD as the realtime device so that metadata goes on the SSD and data
> > goes on the HDD.
> >
> > We've been running this in production for a few years now, so we have
> > some fairly fragmented filesystems. This has exposed various CPU
> > inefficiencies in the realtime allocator. These became even worse when
> > we experimented with using XFS_XFLAG_EXTSIZE to force files to be
> > allocated contiguously.
> >
> > This series adds several optimizations that don't change the realtime
> > allocator's decisions, but make them happen more efficiently, mainly by
> > avoiding redundant work. We've tested these in production and measured
> > ~10% lower CPU utilization. Furthermore, it made it possible to use
> > XFS_XFLAG_EXTSIZE to force contiguous allocations -- without these
> > patches, our most fragmented systems would become unresponsive due to
> > high CPU usage in the realtime allocator, but with them, CPU utilization
> > is actually ~4-6% lower than before, and disk I/O utilization is 15-20%
> > lower.
> >
> > Patches 2 and 3 are preparations for later optimizations; the remaining
> > patches are the optimizations themselves.
> >
> > This is based on Linus' tree as of today (commit
> > 692b7dc87ca6d55ab254f8259e6f970171dc9d01).
> >
> > Thanks!
> >
> > Omar Sandoval (6):
> > xfs: cache last bitmap block in realtime allocator
> > xfs: invert the realtime summary cache
> > xfs: return maximum free size from xfs_rtany_summary()
> > xfs: limit maxlen based on available space in
> > xfs_rtallocate_extent_near()
> > xfs: don't try redundant allocations in xfs_rtallocate_extent_near()
> > xfs: don't look for end of extent further than necessary in
> > xfs_rtallocate_extent_near()
> >
> > fs/xfs/libxfs/xfs_rtbitmap.c | 173 ++++++++++++++--------------
> > fs/xfs/xfs_mount.h | 6 +-
> > fs/xfs/xfs_rtalloc.c | 215 ++++++++++++++++-------------------
> > fs/xfs/xfs_rtalloc.h | 28 +++--
> > 4 files changed, 207 insertions(+), 215 deletions(-)
>
> Gentle ping.
Sorry, I haven't had time to get to this yet. I've still got a
couple more bug reports to work through before I can really start
thinking about looking at anything else..
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/6] xfs: cache last bitmap block in realtime allocator
2023-06-20 21:32 ` [PATCH 1/6] xfs: cache last bitmap block in " Omar Sandoval
@ 2023-07-12 18:29 ` Darrick J. Wong
2023-07-17 18:18 ` Omar Sandoval
0 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2023-07-12 18:29 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-xfs, kernel-team, Prashant Nema
On Tue, Jun 20, 2023 at 02:32:11PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Profiling a workload on a highly fragmented realtime device showed a ton
> of CPU cycles being spent in xfs_trans_read_buf() called by
> xfs_rtbuf_get(). Further tracing showed that much of that was repeated
> calls to xfs_rtbuf_get() for the same block of the realtime bitmap.
> These come from xfs_rtallocate_extent_block(): as it walks through
> ranges of free bits in the bitmap, each call to xfs_rtcheck_range() and
> xfs_rtfind_{forw,back}() gets the same bitmap block. If the bitmap block
> is very fragmented, then this is _a lot_ of buffer lookups.
>
> The realtime allocator already passes around a cache of the last used
> realtime summary block to avoid repeated reads (the parameters rbpp and
> rsb). We can do the same for the realtime bitmap.
>
> This replaces rbpp and rsb with a struct xfs_rtbuf_cache, which caches
> the most recently used block for both the realtime bitmap and summary.
> xfs_rtbuf_get() now handles the caching instead of the callers, which
> requires plumbing xfs_rtbuf_cache to more functions but also makes sure
> we don't miss anything.
Hmm. I initially wondered why rtbitmap blocks wouldn't just stay
attached to the transaction, but then I realized -- this is the
/scanning/ phase that's repeatedly getting and releasing the rtbmp
block, right? The allocator hasn't dirtied any rtbitmap blocks yet, so
each xfs_trans_brelse actually drops the buffer, and that's how the
xfs_rtbuf_get ends up pounding on the bmapi and then the buffer cache?
Repeatedly?
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
> fs/xfs/libxfs/xfs_rtbitmap.c | 167 ++++++++++++++++++-----------------
> fs/xfs/xfs_rtalloc.c | 107 ++++++++++------------
> fs/xfs/xfs_rtalloc.h | 28 ++++--
> 3 files changed, 154 insertions(+), 148 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
> index fa180ab66b73..1a832c9a412f 100644
> --- a/fs/xfs/libxfs/xfs_rtbitmap.c
> +++ b/fs/xfs/libxfs/xfs_rtbitmap.c
> @@ -46,6 +46,21 @@ const struct xfs_buf_ops xfs_rtbuf_ops = {
> .verify_write = xfs_rtbuf_verify_write,
> };
>
> +static void
> +xfs_rtbuf_cache_relse(
> + xfs_trans_t *tp, /* transaction pointer */
Please don't introduce more typedef usage.
struct xfs_trans *tp,
struct xfs_rtbuf_cache *cache)
(Also there's a lot of broken indenting in the lines added by this
patch.)
> + struct xfs_rtbuf_cache *cache) /* cached blocks */
> +{
> + if (cache->bbuf) {
> + xfs_trans_brelse(tp, cache->bbuf);
> + cache->bbuf = NULL;
> + }
> + if (cache->sbuf) {
> + xfs_trans_brelse(tp, cache->sbuf);
> + cache->sbuf = NULL;
> + }
> +}
> +
> /*
> * Get a buffer for the bitmap or summary file block specified.
> * The buffer is returned read and locked.
> @@ -56,14 +71,35 @@ xfs_rtbuf_get(
> xfs_trans_t *tp, /* transaction pointer */
> xfs_rtblock_t block, /* block number in bitmap or summary */
> int issum, /* is summary not bitmap */
> + struct xfs_rtbuf_cache *cache, /* in/out: cached blocks */
> struct xfs_buf **bpp) /* output: buffer for the block */
> {
> + struct xfs_buf **cbpp; /* cached block buffer */
> + xfs_fsblock_t *cbp; /* cached block number */
> struct xfs_buf *bp; /* block buffer, result */
> xfs_inode_t *ip; /* bitmap or summary inode */
> xfs_bmbt_irec_t map;
> int nmap = 1;
> int error; /* error value */
>
> + cbpp = issum ? &cache->bbuf : &cache->sbuf;
> + cbp = issum ? &cache->bblock : &cache->sblock;
> + /*
> + * If we have a cached buffer, and the block number matches, use that.
> + */
> + if (*cbpp && *cbp == block) {
> + *bpp = *cbpp;
> + return 0;
> + }
> + /*
> + * Otherwise we have to have to get the buffer. If there was an old
> + * one, get rid of it first.
> + */
> + if (*cbpp) {
> + xfs_trans_brelse(tp, *cbpp);
> + *cbpp = NULL;
> + }
> +
> ip = issum ? mp->m_rsumip : mp->m_rbmip;
>
> error = xfs_bmapi_read(ip, block, 1, &map, &nmap, 0);
> @@ -82,7 +118,8 @@ xfs_rtbuf_get(
>
> xfs_trans_buf_set_type(tp, bp, issum ? XFS_BLFT_RTSUMMARY_BUF
> : XFS_BLFT_RTBITMAP_BUF);
> - *bpp = bp;
> + *cbpp = *bpp = bp;
> + *cbp = block;
> return 0;
> }
>
> @@ -96,6 +133,7 @@ xfs_rtfind_back(
> xfs_trans_t *tp, /* transaction pointer */
> xfs_rtblock_t start, /* starting block to look at */
> xfs_rtblock_t limit, /* last block to look at */
> + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> xfs_rtblock_t *rtblock) /* out: start block found */
> {
> xfs_rtword_t *b; /* current word in buffer */
> @@ -116,7 +154,7 @@ xfs_rtfind_back(
> * Compute and read in starting bitmap block for starting block.
> */
> block = XFS_BITTOBLOCK(mp, start);
> - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -153,7 +191,6 @@ xfs_rtfind_back(
> /*
> * Different. Mark where we are and return.
> */
> - xfs_trans_brelse(tp, bp);
> i = bit - XFS_RTHIBIT(wdiff);
> *rtblock = start - i + 1;
> return 0;
> @@ -167,8 +204,7 @@ xfs_rtfind_back(
> /*
> * If done with this block, get the previous one.
> */
> - xfs_trans_brelse(tp, bp);
> - error = xfs_rtbuf_get(mp, tp, --block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, --block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -199,7 +235,6 @@ xfs_rtfind_back(
> /*
> * Different, mark where we are and return.
> */
> - xfs_trans_brelse(tp, bp);
> i += XFS_NBWORD - 1 - XFS_RTHIBIT(wdiff);
> *rtblock = start - i + 1;
> return 0;
> @@ -213,8 +248,7 @@ xfs_rtfind_back(
> /*
> * If done with this block, get the previous one.
> */
> - xfs_trans_brelse(tp, bp);
> - error = xfs_rtbuf_get(mp, tp, --block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, --block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -246,7 +280,6 @@ xfs_rtfind_back(
> /*
> * Different, mark where we are and return.
> */
> - xfs_trans_brelse(tp, bp);
> i += XFS_NBWORD - 1 - XFS_RTHIBIT(wdiff);
> *rtblock = start - i + 1;
> return 0;
> @@ -256,7 +289,6 @@ xfs_rtfind_back(
> /*
> * No match, return that we scanned the whole area.
> */
> - xfs_trans_brelse(tp, bp);
> *rtblock = start - i + 1;
> return 0;
> }
> @@ -271,6 +303,7 @@ xfs_rtfind_forw(
> xfs_trans_t *tp, /* transaction pointer */
> xfs_rtblock_t start, /* starting block to look at */
> xfs_rtblock_t limit, /* last block to look at */
> + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> xfs_rtblock_t *rtblock) /* out: start block found */
> {
> xfs_rtword_t *b; /* current word in buffer */
> @@ -291,7 +324,7 @@ xfs_rtfind_forw(
> * Compute and read in starting bitmap block for starting block.
> */
> block = XFS_BITTOBLOCK(mp, start);
> - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -327,7 +360,6 @@ xfs_rtfind_forw(
> /*
> * Different. Mark where we are and return.
> */
> - xfs_trans_brelse(tp, bp);
> i = XFS_RTLOBIT(wdiff) - bit;
> *rtblock = start + i - 1;
> return 0;
> @@ -341,8 +373,7 @@ xfs_rtfind_forw(
> /*
> * If done with this block, get the previous one.
> */
> - xfs_trans_brelse(tp, bp);
> - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -372,7 +403,6 @@ xfs_rtfind_forw(
> /*
> * Different, mark where we are and return.
> */
> - xfs_trans_brelse(tp, bp);
> i += XFS_RTLOBIT(wdiff);
> *rtblock = start + i - 1;
> return 0;
> @@ -386,8 +416,7 @@ xfs_rtfind_forw(
> /*
> * If done with this block, get the next one.
> */
> - xfs_trans_brelse(tp, bp);
> - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -416,7 +445,6 @@ xfs_rtfind_forw(
> /*
> * Different, mark where we are and return.
> */
> - xfs_trans_brelse(tp, bp);
> i += XFS_RTLOBIT(wdiff);
> *rtblock = start + i - 1;
> return 0;
> @@ -426,7 +454,6 @@ xfs_rtfind_forw(
> /*
> * No match, return that we scanned the whole area.
> */
> - xfs_trans_brelse(tp, bp);
> *rtblock = start + i - 1;
> return 0;
> }
> @@ -447,8 +474,7 @@ xfs_rtmodify_summary_int(
> int log, /* log2 of extent size */
> xfs_rtblock_t bbno, /* bitmap block number */
> int delta, /* change to make to summary info */
> - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> - xfs_fsblock_t *rsb, /* in/out: summary block number */
> + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> xfs_suminfo_t *sum) /* out: summary info for this block */
> {
> struct xfs_buf *bp; /* buffer for the summary block */
> @@ -465,30 +491,9 @@ xfs_rtmodify_summary_int(
> * Compute the block number in the summary file.
> */
> sb = XFS_SUMOFFSTOBLOCK(mp, so);
> - /*
> - * If we have an old buffer, and the block number matches, use that.
> - */
> - if (*rbpp && *rsb == sb)
> - bp = *rbpp;
> - /*
> - * Otherwise we have to get the buffer.
> - */
> - else {
> - /*
> - * If there was an old one, get rid of it first.
> - */
> - if (*rbpp)
> - xfs_trans_brelse(tp, *rbpp);
> - error = xfs_rtbuf_get(mp, tp, sb, 1, &bp);
> - if (error) {
> - return error;
> - }
> - /*
> - * Remember this buffer and block for the next call.
> - */
> - *rbpp = bp;
> - *rsb = sb;
> - }
> + error = xfs_rtbuf_get(mp, tp, sb, 1, rtbufc, &bp);
> + if (error)
> + return error;
> /*
> * Point to the summary information, modify/log it, and/or copy it out.
> */
> @@ -517,11 +522,9 @@ xfs_rtmodify_summary(
> int log, /* log2 of extent size */
> xfs_rtblock_t bbno, /* bitmap block number */
> int delta, /* change to make to summary info */
> - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> - xfs_fsblock_t *rsb) /* in/out: summary block number */
> + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> {
> - return xfs_rtmodify_summary_int(mp, tp, log, bbno,
> - delta, rbpp, rsb, NULL);
> + return xfs_rtmodify_summary_int(mp, tp, log, bbno, delta, rtbufc, NULL);
> }
>
> /*
> @@ -534,7 +537,8 @@ xfs_rtmodify_range(
> xfs_trans_t *tp, /* transaction pointer */
> xfs_rtblock_t start, /* starting block to modify */
> xfs_extlen_t len, /* length of extent to modify */
> - int val) /* 1 for free, 0 for allocated */
> + int val, /* 1 for free, 0 for allocated */
> + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> {
> xfs_rtword_t *b; /* current word in buffer */
> int bit; /* bit number in the word */
> @@ -555,7 +559,7 @@ xfs_rtmodify_range(
> /*
> * Read the bitmap block, and point to its data.
> */
> - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -600,7 +604,7 @@ xfs_rtmodify_range(
> xfs_trans_log_buf(tp, bp,
> (uint)((char *)first - (char *)bufp),
> (uint)((char *)b - (char *)bufp));
> - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -640,7 +644,7 @@ xfs_rtmodify_range(
> xfs_trans_log_buf(tp, bp,
> (uint)((char *)first - (char *)bufp),
> (uint)((char *)b - (char *)bufp));
> - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -690,8 +694,7 @@ xfs_rtfree_range(
> xfs_trans_t *tp, /* transaction pointer */
> xfs_rtblock_t start, /* starting block to free */
> xfs_extlen_t len, /* length to free */
> - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> - xfs_fsblock_t *rsb) /* in/out: summary block number */
> + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> {
> xfs_rtblock_t end; /* end of the freed extent */
> int error; /* error value */
> @@ -702,7 +705,7 @@ xfs_rtfree_range(
> /*
> * Modify the bitmap to mark this extent freed.
> */
> - error = xfs_rtmodify_range(mp, tp, start, len, 1);
> + error = xfs_rtmodify_range(mp, tp, start, len, 1, rtbufc);
> if (error) {
> return error;
> }
> @@ -711,7 +714,7 @@ xfs_rtfree_range(
> * We need to find the beginning and end of the extent so we can
> * properly update the summary.
> */
> - error = xfs_rtfind_back(mp, tp, start, 0, &preblock);
> + error = xfs_rtfind_back(mp, tp, start, 0, rtbufc, &preblock);
> if (error) {
> return error;
> }
> @@ -719,7 +722,7 @@ xfs_rtfree_range(
> * Find the next allocated block (end of allocated extent).
> */
> error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1,
> - &postblock);
> + rtbufc, &postblock);
> if (error)
> return error;
> /*
> @@ -729,7 +732,7 @@ xfs_rtfree_range(
> if (preblock < start) {
> error = xfs_rtmodify_summary(mp, tp,
> XFS_RTBLOCKLOG(start - preblock),
> - XFS_BITTOBLOCK(mp, preblock), -1, rbpp, rsb);
> + XFS_BITTOBLOCK(mp, preblock), -1, rtbufc);
> if (error) {
> return error;
> }
> @@ -741,7 +744,7 @@ xfs_rtfree_range(
> if (postblock > end) {
> error = xfs_rtmodify_summary(mp, tp,
> XFS_RTBLOCKLOG(postblock - end),
> - XFS_BITTOBLOCK(mp, end + 1), -1, rbpp, rsb);
> + XFS_BITTOBLOCK(mp, end + 1), -1, rtbufc);
> if (error) {
> return error;
> }
> @@ -752,7 +755,7 @@ xfs_rtfree_range(
> */
> error = xfs_rtmodify_summary(mp, tp,
> XFS_RTBLOCKLOG(postblock + 1 - preblock),
> - XFS_BITTOBLOCK(mp, preblock), 1, rbpp, rsb);
> + XFS_BITTOBLOCK(mp, preblock), 1, rtbufc);
> return error;
> }
>
> @@ -767,6 +770,7 @@ xfs_rtcheck_range(
> xfs_rtblock_t start, /* starting block number of extent */
> xfs_extlen_t len, /* length of extent */
> int val, /* 1 for free, 0 for allocated */
> + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> xfs_rtblock_t *new, /* out: first block not matching */
> int *stat) /* out: 1 for matches, 0 for not */
> {
> @@ -789,7 +793,7 @@ xfs_rtcheck_range(
> /*
> * Read the bitmap block.
> */
> - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -824,7 +828,6 @@ xfs_rtcheck_range(
> /*
> * Different, compute first wrong bit and return.
> */
> - xfs_trans_brelse(tp, bp);
> i = XFS_RTLOBIT(wdiff) - bit;
> *new = start + i;
> *stat = 0;
> @@ -839,8 +842,7 @@ xfs_rtcheck_range(
> /*
> * If done with this block, get the next one.
> */
> - xfs_trans_brelse(tp, bp);
> - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -870,7 +872,6 @@ xfs_rtcheck_range(
> /*
> * Different, compute first wrong bit and return.
> */
> - xfs_trans_brelse(tp, bp);
> i += XFS_RTLOBIT(wdiff);
> *new = start + i;
> *stat = 0;
> @@ -885,8 +886,7 @@ xfs_rtcheck_range(
> /*
> * If done with this block, get the next one.
> */
> - xfs_trans_brelse(tp, bp);
> - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> if (error) {
> return error;
> }
> @@ -915,7 +915,6 @@ xfs_rtcheck_range(
> /*
> * Different, compute first wrong bit and return.
> */
> - xfs_trans_brelse(tp, bp);
> i += XFS_RTLOBIT(wdiff);
> *new = start + i;
> *stat = 0;
> @@ -926,7 +925,6 @@ xfs_rtcheck_range(
> /*
> * Successful, return.
> */
> - xfs_trans_brelse(tp, bp);
> *new = start + i;
> *stat = 1;
> return 0;
> @@ -941,20 +939,21 @@ xfs_rtcheck_alloc_range(
> xfs_mount_t *mp, /* file system mount point */
> xfs_trans_t *tp, /* transaction pointer */
> xfs_rtblock_t bno, /* starting block number of extent */
> - xfs_extlen_t len) /* length of extent */
> + xfs_extlen_t len, /* length of extent */
> + struct xfs_rtbuf_cache *rtbufc) /* in/out: cached blocks */
> {
> xfs_rtblock_t new; /* dummy for xfs_rtcheck_range */
> int stat;
> int error;
>
> - error = xfs_rtcheck_range(mp, tp, bno, len, 0, &new, &stat);
> + error = xfs_rtcheck_range(mp, tp, bno, len, 0, rtbufc, &new, &stat);
> if (error)
> return error;
> ASSERT(stat);
> return 0;
> }
> #else
> -#define xfs_rtcheck_alloc_range(m,t,b,l) (0)
> +#define xfs_rtcheck_alloc_range(m,t,b,l,r) (0)
> #endif
> /*
> * Free an extent in the realtime subvolume. Length is expressed in
> @@ -968,22 +967,21 @@ xfs_rtfree_extent(
> {
> int error; /* error value */
> xfs_mount_t *mp; /* file system mount structure */
> - xfs_fsblock_t sb; /* summary file block number */
> - struct xfs_buf *sumbp = NULL; /* summary file block buffer */
> + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
>
> mp = tp->t_mountp;
>
> ASSERT(mp->m_rbmip->i_itemp != NULL);
> ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
>
> - error = xfs_rtcheck_alloc_range(mp, tp, bno, len);
> + error = xfs_rtcheck_alloc_range(mp, tp, bno, len, &rtbufc);
> if (error)
> return error;
>
> /*
> * Free the range of realtime blocks.
> */
> - error = xfs_rtfree_range(mp, tp, bno, len, &sumbp, &sb);
> + error = xfs_rtfree_range(mp, tp, bno, len, &rtbufc);
> if (error) {
> return error;
> }
> @@ -1021,6 +1019,7 @@ xfs_rtalloc_query_range(
> xfs_rtblock_t high_key;
> int is_free;
> int error = 0;
> + struct xfs_rtbuf_cache rtbufc = {};
>
> if (low_rec->ar_startext > high_rec->ar_startext)
> return -EINVAL;
> @@ -1034,13 +1033,14 @@ xfs_rtalloc_query_range(
> rtstart = low_rec->ar_startext;
> while (rtstart <= high_key) {
> /* Is the first block free? */
> - error = xfs_rtcheck_range(mp, tp, rtstart, 1, 1, &rtend,
> - &is_free);
> + error = xfs_rtcheck_range(mp, tp, rtstart, 1, 1, &rtbufc,
> + &rtend, &is_free);
> if (error)
> break;
>
> /* How long does the extent go for? */
> - error = xfs_rtfind_forw(mp, tp, rtstart, high_key, &rtend);
> + error = xfs_rtfind_forw(mp, tp, rtstart, high_key, &rtbufc,
> + &rtend);
> if (error)
> break;
>
> @@ -1056,6 +1056,8 @@ xfs_rtalloc_query_range(
> rtstart = rtend + 1;
> }
>
> + xfs_rtbuf_cache_relse(tp, &rtbufc);
> +
> return error;
> }
>
> @@ -1085,11 +1087,14 @@ xfs_rtalloc_extent_is_free(
> xfs_extlen_t len,
> bool *is_free)
> {
> + struct xfs_rtbuf_cache rtbufc = {};
> xfs_rtblock_t end;
> int matches;
> int error;
>
> - error = xfs_rtcheck_range(mp, tp, start, len, 1, &end, &matches);
> + error = xfs_rtcheck_range(mp, tp, start, len, 1, &rtbufc, &end,
> + &matches);
> + xfs_rtbuf_cache_relse(tp, &rtbufc);
> if (error)
> return error;
>
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index 16534e9873f6..61ef13286654 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -32,11 +32,10 @@ xfs_rtget_summary(
> xfs_trans_t *tp, /* transaction pointer */
> int log, /* log2 of extent size */
> xfs_rtblock_t bbno, /* bitmap block number */
> - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> - xfs_fsblock_t *rsb, /* in/out: summary block number */
> + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> xfs_suminfo_t *sum) /* out: summary info for this block */
> {
> - return xfs_rtmodify_summary_int(mp, tp, log, bbno, 0, rbpp, rsb, sum);
> + return xfs_rtmodify_summary_int(mp, tp, log, bbno, 0, rtbufc, sum);
> }
>
> /*
> @@ -50,8 +49,7 @@ xfs_rtany_summary(
> int low, /* low log2 extent size */
> int high, /* high log2 extent size */
> xfs_rtblock_t bbno, /* bitmap block number */
> - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> - xfs_fsblock_t *rsb, /* in/out: summary block number */
> + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> int *stat) /* out: any good extents here? */
> {
> int error; /* error value */
> @@ -69,7 +67,7 @@ xfs_rtany_summary(
> /*
> * Get one summary datum.
> */
> - error = xfs_rtget_summary(mp, tp, log, bbno, rbpp, rsb, &sum);
> + error = xfs_rtget_summary(mp, tp, log, bbno, rtbufc, &sum);
> if (error) {
> return error;
> }
> @@ -104,29 +102,27 @@ xfs_rtcopy_summary(
> xfs_trans_t *tp) /* transaction pointer */
> {
> xfs_rtblock_t bbno; /* bitmap block number */
> - struct xfs_buf *bp; /* summary buffer */
> + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
> int error; /* error return value */
> int log; /* summary level number (log length) */
> xfs_suminfo_t sum; /* summary data */
> - xfs_fsblock_t sumbno; /* summary block number */
>
> - bp = NULL;
> for (log = omp->m_rsumlevels - 1; log >= 0; log--) {
> for (bbno = omp->m_sb.sb_rbmblocks - 1;
> (xfs_srtblock_t)bbno >= 0;
> bbno--) {
> - error = xfs_rtget_summary(omp, tp, log, bbno, &bp,
> - &sumbno, &sum);
> + error = xfs_rtget_summary(omp, tp, log, bbno, &rtbufc,
> + &sum);
> if (error)
> return error;
> if (sum == 0)
> continue;
> error = xfs_rtmodify_summary(omp, tp, log, bbno, -sum,
> - &bp, &sumbno);
> + &rtbufc);
> if (error)
> return error;
> error = xfs_rtmodify_summary(nmp, tp, log, bbno, sum,
> - &bp, &sumbno);
> + &rtbufc);
> if (error)
> return error;
> ASSERT(sum > 0);
> @@ -144,8 +140,7 @@ xfs_rtallocate_range(
> xfs_trans_t *tp, /* transaction pointer */
> xfs_rtblock_t start, /* start block to allocate */
> xfs_extlen_t len, /* length to allocate */
> - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> - xfs_fsblock_t *rsb) /* in/out: summary block number */
> + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> {
> xfs_rtblock_t end; /* end of the allocated extent */
> int error; /* error value */
> @@ -158,14 +153,14 @@ xfs_rtallocate_range(
> * We need to find the beginning and end of the extent so we can
> * properly update the summary.
> */
> - error = xfs_rtfind_back(mp, tp, start, 0, &preblock);
> + error = xfs_rtfind_back(mp, tp, start, 0, rtbufc, &preblock);
> if (error) {
> return error;
> }
> /*
> * Find the next allocated block (end of free extent).
> */
> - error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1,
> + error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1, rtbufc,
> &postblock);
> if (error) {
> return error;
> @@ -176,7 +171,7 @@ xfs_rtallocate_range(
> */
> error = xfs_rtmodify_summary(mp, tp,
> XFS_RTBLOCKLOG(postblock + 1 - preblock),
> - XFS_BITTOBLOCK(mp, preblock), -1, rbpp, rsb);
> + XFS_BITTOBLOCK(mp, preblock), -1, rtbufc);
> if (error) {
> return error;
> }
> @@ -187,7 +182,7 @@ xfs_rtallocate_range(
> if (preblock < start) {
> error = xfs_rtmodify_summary(mp, tp,
> XFS_RTBLOCKLOG(start - preblock),
> - XFS_BITTOBLOCK(mp, preblock), 1, rbpp, rsb);
> + XFS_BITTOBLOCK(mp, preblock), 1, rtbufc);
> if (error) {
> return error;
> }
> @@ -199,7 +194,7 @@ xfs_rtallocate_range(
> if (postblock > end) {
> error = xfs_rtmodify_summary(mp, tp,
> XFS_RTBLOCKLOG(postblock - end),
> - XFS_BITTOBLOCK(mp, end + 1), 1, rbpp, rsb);
> + XFS_BITTOBLOCK(mp, end + 1), 1, rtbufc);
> if (error) {
> return error;
> }
> @@ -207,7 +202,7 @@ xfs_rtallocate_range(
> /*
> * Modify the bitmap to mark this extent allocated.
> */
> - error = xfs_rtmodify_range(mp, tp, start, len, 0);
> + error = xfs_rtmodify_range(mp, tp, start, len, 0, rtbufc);
> return error;
> }
>
> @@ -226,8 +221,7 @@ xfs_rtallocate_extent_block(
> xfs_extlen_t maxlen, /* maximum length to allocate */
> xfs_extlen_t *len, /* out: actual length allocated */
> xfs_rtblock_t *nextp, /* out: next block to try */
> - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> - xfs_fsblock_t *rsb, /* in/out: summary block number */
> + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> xfs_extlen_t prod, /* extent product factor */
> xfs_rtblock_t *rtblock) /* out: start block allocated */
> {
> @@ -254,7 +248,8 @@ xfs_rtallocate_extent_block(
> * See if there's a free extent of maxlen starting at i.
> * If it's not so then next will contain the first non-free.
> */
> - error = xfs_rtcheck_range(mp, tp, i, maxlen, 1, &next, &stat);
> + error = xfs_rtcheck_range(mp, tp, i, maxlen, 1, rtbufc, &next,
> + &stat);
> if (error) {
> return error;
> }
> @@ -262,8 +257,7 @@ xfs_rtallocate_extent_block(
> /*
> * i for maxlen is all free, allocate and return that.
> */
> - error = xfs_rtallocate_range(mp, tp, i, maxlen, rbpp,
> - rsb);
> + error = xfs_rtallocate_range(mp, tp, i, maxlen, rtbufc);
> if (error) {
> return error;
> }
> @@ -290,7 +284,7 @@ xfs_rtallocate_extent_block(
> * If not done yet, find the start of the next free space.
> */
> if (next < end) {
> - error = xfs_rtfind_forw(mp, tp, next, end, &i);
> + error = xfs_rtfind_forw(mp, tp, next, end, rtbufc, &i);
> if (error) {
> return error;
> }
> @@ -315,7 +309,7 @@ xfs_rtallocate_extent_block(
> /*
> * Allocate besti for bestlen & return that.
> */
> - error = xfs_rtallocate_range(mp, tp, besti, bestlen, rbpp, rsb);
> + error = xfs_rtallocate_range(mp, tp, besti, bestlen, rtbufc);
> if (error) {
> return error;
> }
> @@ -344,9 +338,8 @@ xfs_rtallocate_extent_exact(
> xfs_rtblock_t bno, /* starting block number to allocate */
> xfs_extlen_t minlen, /* minimum length to allocate */
> xfs_extlen_t maxlen, /* maximum length to allocate */
> + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> xfs_extlen_t *len, /* out: actual length allocated */
> - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> - xfs_fsblock_t *rsb, /* in/out: summary block number */
> xfs_extlen_t prod, /* extent product factor */
> xfs_rtblock_t *rtblock) /* out: start block allocated */
> {
> @@ -359,7 +352,8 @@ xfs_rtallocate_extent_exact(
> /*
> * Check if the range in question (for maxlen) is free.
> */
> - error = xfs_rtcheck_range(mp, tp, bno, maxlen, 1, &next, &isfree);
> + error = xfs_rtcheck_range(mp, tp, bno, maxlen, 1, rtbufc, &next,
> + &isfree);
> if (error) {
> return error;
> }
> @@ -367,7 +361,7 @@ xfs_rtallocate_extent_exact(
> /*
> * If it is, allocate it and return success.
> */
> - error = xfs_rtallocate_range(mp, tp, bno, maxlen, rbpp, rsb);
> + error = xfs_rtallocate_range(mp, tp, bno, maxlen, rtbufc);
> if (error) {
> return error;
> }
> @@ -402,7 +396,7 @@ xfs_rtallocate_extent_exact(
> /*
> * Allocate what we can and return it.
> */
> - error = xfs_rtallocate_range(mp, tp, bno, maxlen, rbpp, rsb);
> + error = xfs_rtallocate_range(mp, tp, bno, maxlen, rtbufc);
> if (error) {
> return error;
> }
> @@ -424,8 +418,7 @@ xfs_rtallocate_extent_near(
> xfs_extlen_t minlen, /* minimum length to allocate */
> xfs_extlen_t maxlen, /* maximum length to allocate */
> xfs_extlen_t *len, /* out: actual length allocated */
> - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> - xfs_fsblock_t *rsb, /* in/out: summary block number */
> + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> xfs_extlen_t prod, /* extent product factor */
> xfs_rtblock_t *rtblock) /* out: start block allocated */
> {
> @@ -456,8 +449,8 @@ xfs_rtallocate_extent_near(
> /*
> * Try the exact allocation first.
> */
> - error = xfs_rtallocate_extent_exact(mp, tp, bno, minlen, maxlen, len,
> - rbpp, rsb, prod, &r);
> + error = xfs_rtallocate_extent_exact(mp, tp, bno, minlen, maxlen, rtbufc,
> + len, prod, &r);
> if (error) {
> return error;
> }
> @@ -481,7 +474,7 @@ xfs_rtallocate_extent_near(
> * starting in this bitmap block.
> */
> error = xfs_rtany_summary(mp, tp, log2len, mp->m_rsumlevels - 1,
> - bbno + i, rbpp, rsb, &any);
> + bbno + i, rtbufc, &any);
> if (error) {
> return error;
> }
> @@ -499,8 +492,8 @@ xfs_rtallocate_extent_near(
> * this block.
> */
> error = xfs_rtallocate_extent_block(mp, tp,
> - bbno + i, minlen, maxlen, len, &n, rbpp,
> - rsb, prod, &r);
> + bbno + i, minlen, maxlen, len, &n,
> + rtbufc, prod, &r);
> if (error) {
> return error;
> }
> @@ -529,7 +522,7 @@ xfs_rtallocate_extent_near(
> */
> error = xfs_rtany_summary(mp, tp,
> log2len, mp->m_rsumlevels - 1,
> - bbno + j, rbpp, rsb, &any);
> + bbno + j, rtbufc, &any);
> if (error) {
> return error;
> }
> @@ -545,7 +538,7 @@ xfs_rtallocate_extent_near(
> continue;
> error = xfs_rtallocate_extent_block(mp,
> tp, bbno + j, minlen, maxlen,
> - len, &n, rbpp, rsb, prod, &r);
> + len, &n, rtbufc, prod, &r);
> if (error) {
> return error;
> }
> @@ -566,8 +559,8 @@ xfs_rtallocate_extent_near(
> * that we found.
> */
> error = xfs_rtallocate_extent_block(mp, tp,
> - bbno + i, minlen, maxlen, len, &n, rbpp,
> - rsb, prod, &r);
> + bbno + i, minlen, maxlen, len, &n,
> + rtbufc, prod, &r);
> if (error) {
> return error;
> }
> @@ -626,8 +619,7 @@ xfs_rtallocate_extent_size(
> xfs_extlen_t minlen, /* minimum length to allocate */
> xfs_extlen_t maxlen, /* maximum length to allocate */
> xfs_extlen_t *len, /* out: actual length allocated */
> - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> - xfs_fsblock_t *rsb, /* in/out: summary block number */
> + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> xfs_extlen_t prod, /* extent product factor */
> xfs_rtblock_t *rtblock) /* out: start block allocated */
> {
> @@ -656,7 +648,7 @@ xfs_rtallocate_extent_size(
> /*
> * Get the summary for this level/block.
> */
> - error = xfs_rtget_summary(mp, tp, l, i, rbpp, rsb,
> + error = xfs_rtget_summary(mp, tp, l, i, rtbufc,
> &sum);
> if (error) {
> return error;
> @@ -670,7 +662,7 @@ xfs_rtallocate_extent_size(
> * Try allocating the extent.
> */
> error = xfs_rtallocate_extent_block(mp, tp, i, maxlen,
> - maxlen, len, &n, rbpp, rsb, prod, &r);
> + maxlen, len, &n, rtbufc, prod, &r);
> if (error) {
> return error;
> }
> @@ -715,7 +707,7 @@ xfs_rtallocate_extent_size(
> /*
> * Get the summary information for this level/block.
> */
> - error = xfs_rtget_summary(mp, tp, l, i, rbpp, rsb,
> + error = xfs_rtget_summary(mp, tp, l, i, rtbufc,
> &sum);
> if (error) {
> return error;
> @@ -733,7 +725,7 @@ xfs_rtallocate_extent_size(
> error = xfs_rtallocate_extent_block(mp, tp, i,
> XFS_RTMAX(minlen, 1 << l),
> XFS_RTMIN(maxlen, (1 << (l + 1)) - 1),
> - len, &n, rbpp, rsb, prod, &r);
> + len, &n, rtbufc, prod, &r);
> if (error) {
> return error;
> }
> @@ -922,7 +914,6 @@ xfs_growfs_rt(
> xfs_extlen_t rbmblocks; /* current number of rt bitmap blocks */
> xfs_extlen_t rsumblocks; /* current number of rt summary blks */
> xfs_sb_t *sbp; /* old superblock */
> - xfs_fsblock_t sumbno; /* summary block number */
> uint8_t *rsum_cache; /* old summary cache */
>
> sbp = &mp->m_sb;
> @@ -1025,6 +1016,7 @@ xfs_growfs_rt(
> bmbno++) {
> struct xfs_trans *tp;
> xfs_rfsblock_t nrblocks_step;
> + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
>
> *nmp = *mp;
> nsbp = &nmp->m_sb;
> @@ -1111,9 +1103,8 @@ xfs_growfs_rt(
> /*
> * Free new extent.
> */
> - bp = NULL;
> error = xfs_rtfree_range(nmp, tp, sbp->sb_rextents,
> - nsbp->sb_rextents - sbp->sb_rextents, &bp, &sumbno);
> + nsbp->sb_rextents - sbp->sb_rextents, &rtbufc);
> if (error) {
> error_cancel:
> xfs_trans_cancel(tp);
> @@ -1185,8 +1176,7 @@ xfs_rtallocate_extent(
> xfs_mount_t *mp = tp->t_mountp;
> int error; /* error value */
> xfs_rtblock_t r; /* result allocated block */
> - xfs_fsblock_t sb; /* summary file block number */
> - struct xfs_buf *sumbp; /* summary file block buffer */
> + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
>
> ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
> ASSERT(minlen > 0 && minlen <= maxlen);
> @@ -1208,13 +1198,14 @@ xfs_rtallocate_extent(
> }
>
> retry:
> - sumbp = NULL;
> + rtbufc.bbuf = NULL;
> + rtbufc.sbuf = NULL;
xfs_rtbuf_cache_relse?
> if (bno == 0) {
> error = xfs_rtallocate_extent_size(mp, tp, minlen, maxlen, len,
> - &sumbp, &sb, prod, &r);
> + &rtbufc, prod, &r);
> } else {
> error = xfs_rtallocate_extent_near(mp, tp, bno, minlen, maxlen,
> - len, &sumbp, &sb, prod, &r);
> + len, &rtbufc, prod, &r);
> }
>
> if (error)
> diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h
> index 62c7ad79cbb6..888552c4f287 100644
> --- a/fs/xfs/xfs_rtalloc.h
> +++ b/fs/xfs/xfs_rtalloc.h
> @@ -101,29 +101,39 @@ xfs_growfs_rt(
> /*
> * From xfs_rtbitmap.c
> */
> +struct xfs_rtbuf_cache {
> + struct xfs_buf *bbuf; /* bitmap block buffer */
> + xfs_fsblock_t bblock; /* bitmap block number */
> + struct xfs_buf *sbuf; /* summary block buffer */
> + xfs_fsblock_t sblock; /* summary block number */
These are really offsets into the data fork, right? They ought to be
xfs_fileoff_t, except that none of the rtalloc code uses the correct
type?
And yes I do want to merge the fixes for that:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=clean-up-realtime-units
--D
> +};
> +
> int xfs_rtbuf_get(struct xfs_mount *mp, struct xfs_trans *tp,
> - xfs_rtblock_t block, int issum, struct xfs_buf **bpp);
> + xfs_rtblock_t block, int issum, struct xfs_rtbuf_cache *cache,
> + struct xfs_buf **bpp);
> int xfs_rtcheck_range(struct xfs_mount *mp, struct xfs_trans *tp,
> xfs_rtblock_t start, xfs_extlen_t len, int val,
> - xfs_rtblock_t *new, int *stat);
> + struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *new,
> + int *stat);
> int xfs_rtfind_back(struct xfs_mount *mp, struct xfs_trans *tp,
> xfs_rtblock_t start, xfs_rtblock_t limit,
> - xfs_rtblock_t *rtblock);
> + struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *rtblock);
> int xfs_rtfind_forw(struct xfs_mount *mp, struct xfs_trans *tp,
> xfs_rtblock_t start, xfs_rtblock_t limit,
> - xfs_rtblock_t *rtblock);
> + struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *rtblock);
> int xfs_rtmodify_range(struct xfs_mount *mp, struct xfs_trans *tp,
> - xfs_rtblock_t start, xfs_extlen_t len, int val);
> + xfs_rtblock_t start, xfs_extlen_t len, int val,
> + struct xfs_rtbuf_cache *rtbufc);
> int xfs_rtmodify_summary_int(struct xfs_mount *mp, struct xfs_trans *tp,
> int log, xfs_rtblock_t bbno, int delta,
> - struct xfs_buf **rbpp, xfs_fsblock_t *rsb,
> + struct xfs_rtbuf_cache *rtbufc,
> xfs_suminfo_t *sum);
> int xfs_rtmodify_summary(struct xfs_mount *mp, struct xfs_trans *tp, int log,
> - xfs_rtblock_t bbno, int delta, struct xfs_buf **rbpp,
> - xfs_fsblock_t *rsb);
> + xfs_rtblock_t bbno, int delta,
> + struct xfs_rtbuf_cache *rtbufc);
> int xfs_rtfree_range(struct xfs_mount *mp, struct xfs_trans *tp,
> xfs_rtblock_t start, xfs_extlen_t len,
> - struct xfs_buf **rbpp, xfs_fsblock_t *rsb);
> + struct xfs_rtbuf_cache *rtbufc);
> int xfs_rtalloc_query_range(struct xfs_mount *mp, struct xfs_trans *tp,
> const struct xfs_rtalloc_rec *low_rec,
> const struct xfs_rtalloc_rec *high_rec,
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/6] xfs: invert the realtime summary cache
2023-06-20 21:32 ` [PATCH 2/6] xfs: invert the realtime summary cache Omar Sandoval
@ 2023-07-12 22:40 ` Darrick J. Wong
2023-07-17 19:54 ` Omar Sandoval
0 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2023-07-12 22:40 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-xfs, kernel-team, Prashant Nema
On Tue, Jun 20, 2023 at 02:32:12PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> In commit 355e3532132b ("xfs: cache minimum realtime summary level"), I
> added a cache of the minimum level of the realtime summary that has any
> free extents. However, it turns out that the _maximum_ level is more
> useful for upcoming optimizations, and basically equivalent for the
> existing usage. So, let's change the meaning of the cache to be the
> maximum level + 1, or 0 if there are no free extents.
Hmm. If I'm reading xfs_rtmodify_summary_int right, m_rsum_cache[b] now
tells us the maximum log2(length) of the free extents starting in
rtbitmap block b?
IOWs, let's say the cache contents are:
{2, 3, 2, 15, 8}
Someone asks for a 400rtx (realtime extent) allocation, so we want to
find a free space of at least magnitude floor(log2(400)) == 8.
The cache tells us that there aren't any free extents longer than 2^1
blocks in rtbitmap blocks 0 and 2; longer than 2^2 blocks in rtbmp block
1; longer than 2^7 blocks in rtbmp block 4; nor longer than 2^14 blocks
in rtbmp block 3?
From the cache contents, we should therefore examine rtbitmap block 3.
If the cache contents were instead:
{2, 3, 2, 8, 8}
Then we instead might scan rtbitmap blocks 3 and 4 for the longest
allocation that we can get? Looking back at the original commit, that
seems to make more sense to me...
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
> fs/xfs/libxfs/xfs_rtbitmap.c | 6 +++---
> fs/xfs/xfs_mount.h | 6 +++---
> fs/xfs/xfs_rtalloc.c | 31 +++++++++++++++++++------------
> 3 files changed, 25 insertions(+), 18 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
> index 1a832c9a412f..d9493f64adfc 100644
> --- a/fs/xfs/libxfs/xfs_rtbitmap.c
> +++ b/fs/xfs/libxfs/xfs_rtbitmap.c
> @@ -503,10 +503,10 @@ xfs_rtmodify_summary_int(
>
> *sp += delta;
> if (mp->m_rsum_cache) {
> - if (*sp == 0 && log == mp->m_rsum_cache[bbno])
> - mp->m_rsum_cache[bbno]++;
> - if (*sp != 0 && log < mp->m_rsum_cache[bbno])
> + if (*sp == 0 && log + 1 == mp->m_rsum_cache[bbno])
> mp->m_rsum_cache[bbno] = log;
> + if (*sp != 0 && log >= mp->m_rsum_cache[bbno])
> + mp->m_rsum_cache[bbno] = log + 1;
> }
> xfs_trans_log_buf(tp, bp, first, first + sizeof(*sp) - 1);
> }
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index 6c09f89534d3..964541c36730 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -103,9 +103,9 @@ typedef struct xfs_mount {
>
> /*
> * Optional cache of rt summary level per bitmap block with the
> - * invariant that m_rsum_cache[bbno] <= the minimum i for which
> - * rsum[i][bbno] != 0. Reads and writes are serialized by the rsumip
> - * inode lock.
> + * invariant that m_rsum_cache[bbno] > the maximum i for which
> + * rsum[i][bbno] != 0, or 0 if rsum[i][bbno] == 0 for all i.
> + * Reads and writes are serialized by the rsumip inode lock.
> */
> uint8_t *m_rsum_cache;
> struct xfs_mru_cache *m_filestream; /* per-mount filestream data */
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index 61ef13286654..d3c76532d20e 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -56,14 +56,19 @@ xfs_rtany_summary(
> int log; /* loop counter, log2 of ext. size */
> xfs_suminfo_t sum; /* summary data */
>
> - /* There are no extents at levels < m_rsum_cache[bbno]. */
> - if (mp->m_rsum_cache && low < mp->m_rsum_cache[bbno])
> - low = mp->m_rsum_cache[bbno];
> + /* There are no extents at levels >= m_rsum_cache[bbno]. */
> + if (mp->m_rsum_cache) {
> + high = min(high, mp->m_rsum_cache[bbno] - 1);
> + if (low > high) {
> + *stat = 0;
> + return 0;
> + }
> + }
>
> /*
> * Loop over logs of extent sizes.
> */
> - for (log = low; log <= high; log++) {
> + for (log = high; log >= low; log--) {
> /*
> * Get one summary datum.
> */
> @@ -84,9 +89,9 @@ xfs_rtany_summary(
> */
> *stat = 0;
> out:
> - /* There were no extents at levels < log. */
> - if (mp->m_rsum_cache && log > mp->m_rsum_cache[bbno])
> - mp->m_rsum_cache[bbno] = log;
> + /* There were no extents at levels > log. */
> + if (mp->m_rsum_cache && log + 1 < mp->m_rsum_cache[bbno])
> + mp->m_rsum_cache[bbno] = log + 1;
> return 0;
> }
>
> @@ -878,12 +883,14 @@ xfs_alloc_rsum_cache(
> xfs_extlen_t rbmblocks) /* number of rt bitmap blocks */
> {
> /*
> - * The rsum cache is initialized to all zeroes, which is trivially a
> - * lower bound on the minimum level with any free extents. We can
> - * continue without the cache if it couldn't be allocated.
> + * The rsum cache is initialized to the maximum value, which is
> + * trivially an upper bound on the maximum level with any free extents.
> + * We can continue without the cache if it couldn't be allocated.
> */
> - mp->m_rsum_cache = kvzalloc(rbmblocks, GFP_KERNEL);
> - if (!mp->m_rsum_cache)
> + mp->m_rsum_cache = kvmalloc(rbmblocks, GFP_KERNEL);
> + if (mp->m_rsum_cache)
> + memset(mp->m_rsum_cache, -1, rbmblocks);
> + else
> xfs_warn(mp, "could not allocate realtime summary cache");
> }
>
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 3/6] xfs: return maximum free size from xfs_rtany_summary()
2023-06-20 21:32 ` [PATCH 3/6] xfs: return maximum free size from xfs_rtany_summary() Omar Sandoval
@ 2023-07-12 22:44 ` Darrick J. Wong
0 siblings, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2023-07-12 22:44 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-xfs, kernel-team, Prashant Nema
On Tue, Jun 20, 2023 at 02:32:13PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Instead of only returning whether there is any free space, return the
> maximum size, which is fast thanks to the previous commit. This will be
> used by two upcoming optimizations.
>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
Assuming I understood the changes in the /last/ two patches, this seems
like a reasonable thing to pass upwards...
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> ---
> fs/xfs/xfs_rtalloc.c | 18 +++++++++---------
> 1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index d3c76532d20e..ba7d42e0090f 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -50,7 +50,7 @@ xfs_rtany_summary(
> int high, /* high log2 extent size */
> xfs_rtblock_t bbno, /* bitmap block number */
> struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> - int *stat) /* out: any good extents here? */
> + int *maxlog) /* out: maximum log2 extent size free */
> {
> int error; /* error value */
> int log; /* loop counter, log2 of ext. size */
> @@ -60,7 +60,7 @@ xfs_rtany_summary(
> if (mp->m_rsum_cache) {
> high = min(high, mp->m_rsum_cache[bbno] - 1);
> if (low > high) {
> - *stat = 0;
> + *maxlog = -1;
> return 0;
> }
> }
> @@ -80,14 +80,14 @@ xfs_rtany_summary(
> * If there are any, return success.
> */
> if (sum) {
> - *stat = 1;
> + *maxlog = log;
> goto out;
> }
> }
> /*
> * Found nothing, return failure.
> */
> - *stat = 0;
> + *maxlog = -1;
> out:
> /* There were no extents at levels > log. */
> if (mp->m_rsum_cache && log + 1 < mp->m_rsum_cache[bbno])
> @@ -427,7 +427,7 @@ xfs_rtallocate_extent_near(
> xfs_extlen_t prod, /* extent product factor */
> xfs_rtblock_t *rtblock) /* out: start block allocated */
> {
> - int any; /* any useful extents from summary */
> + int maxlog; /* maximum useful extent from summary */
> xfs_rtblock_t bbno; /* bitmap block number */
> int error; /* error value */
> int i; /* bitmap block offset (loop control) */
> @@ -479,7 +479,7 @@ xfs_rtallocate_extent_near(
> * starting in this bitmap block.
> */
> error = xfs_rtany_summary(mp, tp, log2len, mp->m_rsumlevels - 1,
> - bbno + i, rtbufc, &any);
> + bbno + i, rtbufc, &maxlog);
> if (error) {
> return error;
> }
> @@ -487,7 +487,7 @@ xfs_rtallocate_extent_near(
> * If there are any useful extents starting here, try
> * allocating one.
> */
> - if (any) {
> + if (maxlog >= 0) {
> /*
> * On the positive side of the starting location.
> */
> @@ -527,7 +527,7 @@ xfs_rtallocate_extent_near(
> */
> error = xfs_rtany_summary(mp, tp,
> log2len, mp->m_rsumlevels - 1,
> - bbno + j, rtbufc, &any);
> + bbno + j, rtbufc, &maxlog);
> if (error) {
> return error;
> }
> @@ -539,7 +539,7 @@ xfs_rtallocate_extent_near(
> * extent given, we've already tried
> * that allocation, don't do it again.
> */
> - if (any)
> + if (maxlog >= 0)
> continue;
> error = xfs_rtallocate_extent_block(mp,
> tp, bbno + j, minlen, maxlen,
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/6] xfs: limit maxlen based on available space in xfs_rtallocate_extent_near()
2023-06-20 21:32 ` [PATCH 4/6] xfs: limit maxlen based on available space in xfs_rtallocate_extent_near() Omar Sandoval
@ 2023-07-12 23:01 ` Darrick J. Wong
2023-07-17 20:33 ` Omar Sandoval
0 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2023-07-12 23:01 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-xfs, kernel-team, Prashant Nema
On Tue, Jun 20, 2023 at 02:32:14PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> xfs_rtallocate_extent_near() calls xfs_rtallocate_extent_block() with
> the minlen and maxlen that were passed to it.
> xfs_rtallocate_extent_block() then scans the bitmap block looking for a
> free range of size maxlen. If there is none, it has to scan the whole
> bitmap block before returning the largest range of at least size minlen.
> For a fragmented realtime device and a large allocation request, it's
> almost certain that this will have to search the whole bitmap block,
> leading to high CPU usage.
>
> However, the realtime summary tells us the maximum size available in the
> bitmap block. We can limit the search in xfs_rtallocate_extent_block()
> to that size and often stop before scanning the whole bitmap block.
>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
> fs/xfs/xfs_rtalloc.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index ba7d42e0090f..d079dfb77c73 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -488,6 +488,8 @@ xfs_rtallocate_extent_near(
> * allocating one.
> */
> if (maxlog >= 0) {
> + xfs_extlen_t maxavail =
> + min(maxlen, ((xfs_extlen_t)1 << (maxlog + 1)) - 1);
There can be up to 2^52rtx (realtime extents) in the filesystem, right?
xfs_extlen_t is a u32, which will overflow this calculation if the
realtime volume is seriously huge. IOWs, doesn't this need to be:
xfs_extlen_t maxavail = max_t(xfs_rtblock_t, maxlen,
(1ULL << (maxlog + 1)) - 1);
(The rest of the patch looks ok)
--D
> /*
> * On the positive side of the starting location.
> */
> @@ -497,7 +499,7 @@ xfs_rtallocate_extent_near(
> * this block.
> */
> error = xfs_rtallocate_extent_block(mp, tp,
> - bbno + i, minlen, maxlen, len, &n,
> + bbno + i, minlen, maxavail, len, &n,
> rtbufc, prod, &r);
> if (error) {
> return error;
> @@ -542,7 +544,7 @@ xfs_rtallocate_extent_near(
> if (maxlog >= 0)
> continue;
> error = xfs_rtallocate_extent_block(mp,
> - tp, bbno + j, minlen, maxlen,
> + tp, bbno + j, minlen, maxavail,
> len, &n, rtbufc, prod, &r);
> if (error) {
> return error;
> @@ -564,7 +566,7 @@ xfs_rtallocate_extent_near(
> * that we found.
> */
> error = xfs_rtallocate_extent_block(mp, tp,
> - bbno + i, minlen, maxlen, len, &n,
> + bbno + i, minlen, maxavail, len, &n,
> rtbufc, prod, &r);
> if (error) {
> return error;
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 5/6] xfs: don't try redundant allocations in xfs_rtallocate_extent_near()
2023-06-20 21:32 ` [PATCH 5/6] xfs: don't try redundant allocations " Omar Sandoval
@ 2023-07-12 23:34 ` Darrick J. Wong
2023-07-17 21:06 ` Omar Sandoval
0 siblings, 1 reply; 23+ messages in thread
From: Darrick J. Wong @ 2023-07-12 23:34 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-xfs, kernel-team, Prashant Nema
On Tue, Jun 20, 2023 at 02:32:15PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> xfs_rtallocate_extent_near() tries to find a free extent as close to a
> target bitmap block given by bbno as possible, which may be before or
> after bbno. Searching backwards has a complication: the realtime summary
> accounts for free space _starting_ in a bitmap block, but not straddling
> or ending in a bitmap block. So, when the negative search finds a free
> extent in the realtime summary, in order to end up closer to the target,
> it looks for the end of the free extent. For example, if bbno - 2 has a
> free extent, then it will check bbno - 1, then bbno - 2. But then if
> bbno - 3 has a free extent, it will check bbno - 1 again, then bbno - 2
> again, and then bbno - 3. This results in a quadratic loop, which is
> completely pointless since the repeated checks won't find anything new.
>
> Fix it by remembering where we last checked up to and continue from
> there. This also obviates the need for a check of the realtime summary.
>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
> fs/xfs/xfs_rtalloc.c | 46 +++-----------------------------------------
> 1 file changed, 3 insertions(+), 43 deletions(-)
>
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index d079dfb77c73..4d9d0be2e616 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -468,6 +468,7 @@ xfs_rtallocate_extent_near(
> }
> bbno = XFS_BITTOBLOCK(mp, bno);
> i = 0;
> + j = -1;
> ASSERT(minlen != 0);
> log2len = xfs_highbit32(minlen);
> /*
> @@ -518,31 +519,11 @@ xfs_rtallocate_extent_near(
> else { /* i < 0 */
> /*
> * Loop backwards through the bitmap blocks from
> - * the starting point-1 up to where we are now.
> + * where we last checked up to where we are now.
I find this comment a bit unclear -- aren't we looping backwards from
where we last checked *downwards*? I was reading "where we are now" to
mean @i, which contains a negative value.
"When @i is negative, we try to find a free extent that starts in the
bitmap blocks before bbno. Starting from the last bitmap block that we
checked in a negative scan (initially bbno - 1) and walking downwards
towards (bbno + i), try to allocate an extent of satisfactory length."
But now having worked my way through that, now I'm wondering what the j
loop is even doing. Doesn't the sequence of blocks that we call
xfs_rtallocate_extent_block on alternate backwards and forwards? e.g.
Try to find a satisfactory free extent that starts in:
bbno
bbno + 1
bbno - 1
bbno + 2
bbno - 2
...
etc?
Why not avoid the loop entirely by calling xfs_rtallocate_extent_block
on bbno + i once before switching back to positive @i? What am I
missing here?
> * There should be an extent which ends in this
> * bitmap block and is long enough.
> */
> - for (j = -1; j > i; j--) {
> - /*
> - * Grab the summary information for
> - * this bitmap block.
> - */
> - error = xfs_rtany_summary(mp, tp,
> - log2len, mp->m_rsumlevels - 1,
> - bbno + j, rtbufc, &maxlog);
> - if (error) {
> - return error;
> - }
> - /*
> - * If there's no extent given in the
> - * summary that means the extent we
> - * found must carry over from an
> - * earlier block. If there is an
> - * extent given, we've already tried
> - * that allocation, don't do it again.
> - */
> - if (maxlog >= 0)
> - continue;
> + for (; j >= i; j--) {
Changing the j > i to j >= i is what obviates the extra call to
xfs_rtallocate_extent_block below, correct?
--D
> error = xfs_rtallocate_extent_block(mp,
> tp, bbno + j, minlen, maxavail,
> len, &n, rtbufc, prod, &r);
> @@ -557,27 +538,6 @@ xfs_rtallocate_extent_near(
> return 0;
> }
> }
> - /*
> - * There weren't intervening bitmap blocks
> - * with a long enough extent, or the
> - * allocation didn't work for some reason
> - * (i.e. it's a little * too short).
> - * Try to allocate from the summary block
> - * that we found.
> - */
> - error = xfs_rtallocate_extent_block(mp, tp,
> - bbno + i, minlen, maxavail, len, &n,
> - rtbufc, prod, &r);
> - if (error) {
> - return error;
> - }
> - /*
> - * If it works, return the extent.
> - */
> - if (r != NULLRTBLOCK) {
> - *rtblock = r;
> - return 0;
> - }
> }
> }
> /*
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/6] xfs: cache last bitmap block in realtime allocator
2023-07-12 18:29 ` Darrick J. Wong
@ 2023-07-17 18:18 ` Omar Sandoval
2023-08-01 22:48 ` Darrick J. Wong
0 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2023-07-17 18:18 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, kernel-team, Prashant Nema
On Wed, Jul 12, 2023 at 11:29:26AM -0700, Darrick J. Wong wrote:
> On Tue, Jun 20, 2023 at 02:32:11PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> >
> > Profiling a workload on a highly fragmented realtime device showed a ton
> > of CPU cycles being spent in xfs_trans_read_buf() called by
> > xfs_rtbuf_get(). Further tracing showed that much of that was repeated
> > calls to xfs_rtbuf_get() for the same block of the realtime bitmap.
> > These come from xfs_rtallocate_extent_block(): as it walks through
> > ranges of free bits in the bitmap, each call to xfs_rtcheck_range() and
> > xfs_rtfind_{forw,back}() gets the same bitmap block. If the bitmap block
> > is very fragmented, then this is _a lot_ of buffer lookups.
> >
> > The realtime allocator already passes around a cache of the last used
> > realtime summary block to avoid repeated reads (the parameters rbpp and
> > rsb). We can do the same for the realtime bitmap.
> >
> > This replaces rbpp and rsb with a struct xfs_rtbuf_cache, which caches
> > the most recently used block for both the realtime bitmap and summary.
> > xfs_rtbuf_get() now handles the caching instead of the callers, which
> > requires plumbing xfs_rtbuf_cache to more functions but also makes sure
> > we don't miss anything.
>
> Hmm. I initially wondered why rtbitmap blocks wouldn't just stay
> attached to the transaction, but then I realized -- this is the
> /scanning/ phase that's repeatedly getting and releasing the rtbmp
> block, right? The allocator hasn't dirtied any rtbitmap blocks yet, so
> each xfs_trans_brelse actually drops the buffer, and that's how the
> xfs_rtbuf_get ends up pounding on the bmapi and then the buffer cache?
> Repeatedly?
That's right.
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > ---
> > fs/xfs/libxfs/xfs_rtbitmap.c | 167 ++++++++++++++++++-----------------
> > fs/xfs/xfs_rtalloc.c | 107 ++++++++++------------
> > fs/xfs/xfs_rtalloc.h | 28 ++++--
> > 3 files changed, 154 insertions(+), 148 deletions(-)
> >
> > diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
> > index fa180ab66b73..1a832c9a412f 100644
> > --- a/fs/xfs/libxfs/xfs_rtbitmap.c
> > +++ b/fs/xfs/libxfs/xfs_rtbitmap.c
> > @@ -46,6 +46,21 @@ const struct xfs_buf_ops xfs_rtbuf_ops = {
> > .verify_write = xfs_rtbuf_verify_write,
> > };
> >
> > +static void
> > +xfs_rtbuf_cache_relse(
> > + xfs_trans_t *tp, /* transaction pointer */
>
> Please don't introduce more typedef usage.
Oops, I was just copying the surrounding code. I'll fix this in my
changes.
> struct xfs_trans *tp,
> struct xfs_rtbuf_cache *cache)
>
> (Also there's a lot of broken indenting in the lines added by this
> patch.)
I'll fix it in this new function. Should I also be reindenting all of
the parameter lists where I added this parameter?
> > + struct xfs_rtbuf_cache *cache) /* cached blocks */
> > +{
> > + if (cache->bbuf) {
> > + xfs_trans_brelse(tp, cache->bbuf);
> > + cache->bbuf = NULL;
> > + }
> > + if (cache->sbuf) {
> > + xfs_trans_brelse(tp, cache->sbuf);
> > + cache->sbuf = NULL;
> > + }
> > +}
> > +
> > /*
> > * Get a buffer for the bitmap or summary file block specified.
> > * The buffer is returned read and locked.
> > @@ -56,14 +71,35 @@ xfs_rtbuf_get(
> > xfs_trans_t *tp, /* transaction pointer */
> > xfs_rtblock_t block, /* block number in bitmap or summary */
> > int issum, /* is summary not bitmap */
> > + struct xfs_rtbuf_cache *cache, /* in/out: cached blocks */
> > struct xfs_buf **bpp) /* output: buffer for the block */
> > {
> > + struct xfs_buf **cbpp; /* cached block buffer */
> > + xfs_fsblock_t *cbp; /* cached block number */
> > struct xfs_buf *bp; /* block buffer, result */
> > xfs_inode_t *ip; /* bitmap or summary inode */
> > xfs_bmbt_irec_t map;
> > int nmap = 1;
> > int error; /* error value */
> >
> > + cbpp = issum ? &cache->bbuf : &cache->sbuf;
> > + cbp = issum ? &cache->bblock : &cache->sblock;
> > + /*
> > + * If we have a cached buffer, and the block number matches, use that.
> > + */
> > + if (*cbpp && *cbp == block) {
> > + *bpp = *cbpp;
> > + return 0;
> > + }
> > + /*
> > + * Otherwise we have to have to get the buffer. If there was an old
> > + * one, get rid of it first.
> > + */
> > + if (*cbpp) {
> > + xfs_trans_brelse(tp, *cbpp);
> > + *cbpp = NULL;
> > + }
> > +
> > ip = issum ? mp->m_rsumip : mp->m_rbmip;
> >
> > error = xfs_bmapi_read(ip, block, 1, &map, &nmap, 0);
> > @@ -82,7 +118,8 @@ xfs_rtbuf_get(
> >
> > xfs_trans_buf_set_type(tp, bp, issum ? XFS_BLFT_RTSUMMARY_BUF
> > : XFS_BLFT_RTBITMAP_BUF);
> > - *bpp = bp;
> > + *cbpp = *bpp = bp;
> > + *cbp = block;
> > return 0;
> > }
> >
> > @@ -96,6 +133,7 @@ xfs_rtfind_back(
> > xfs_trans_t *tp, /* transaction pointer */
> > xfs_rtblock_t start, /* starting block to look at */
> > xfs_rtblock_t limit, /* last block to look at */
> > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > xfs_rtblock_t *rtblock) /* out: start block found */
> > {
> > xfs_rtword_t *b; /* current word in buffer */
> > @@ -116,7 +154,7 @@ xfs_rtfind_back(
> > * Compute and read in starting bitmap block for starting block.
> > */
> > block = XFS_BITTOBLOCK(mp, start);
> > - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -153,7 +191,6 @@ xfs_rtfind_back(
> > /*
> > * Different. Mark where we are and return.
> > */
> > - xfs_trans_brelse(tp, bp);
> > i = bit - XFS_RTHIBIT(wdiff);
> > *rtblock = start - i + 1;
> > return 0;
> > @@ -167,8 +204,7 @@ xfs_rtfind_back(
> > /*
> > * If done with this block, get the previous one.
> > */
> > - xfs_trans_brelse(tp, bp);
> > - error = xfs_rtbuf_get(mp, tp, --block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, --block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -199,7 +235,6 @@ xfs_rtfind_back(
> > /*
> > * Different, mark where we are and return.
> > */
> > - xfs_trans_brelse(tp, bp);
> > i += XFS_NBWORD - 1 - XFS_RTHIBIT(wdiff);
> > *rtblock = start - i + 1;
> > return 0;
> > @@ -213,8 +248,7 @@ xfs_rtfind_back(
> > /*
> > * If done with this block, get the previous one.
> > */
> > - xfs_trans_brelse(tp, bp);
> > - error = xfs_rtbuf_get(mp, tp, --block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, --block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -246,7 +280,6 @@ xfs_rtfind_back(
> > /*
> > * Different, mark where we are and return.
> > */
> > - xfs_trans_brelse(tp, bp);
> > i += XFS_NBWORD - 1 - XFS_RTHIBIT(wdiff);
> > *rtblock = start - i + 1;
> > return 0;
> > @@ -256,7 +289,6 @@ xfs_rtfind_back(
> > /*
> > * No match, return that we scanned the whole area.
> > */
> > - xfs_trans_brelse(tp, bp);
> > *rtblock = start - i + 1;
> > return 0;
> > }
> > @@ -271,6 +303,7 @@ xfs_rtfind_forw(
> > xfs_trans_t *tp, /* transaction pointer */
> > xfs_rtblock_t start, /* starting block to look at */
> > xfs_rtblock_t limit, /* last block to look at */
> > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > xfs_rtblock_t *rtblock) /* out: start block found */
> > {
> > xfs_rtword_t *b; /* current word in buffer */
> > @@ -291,7 +324,7 @@ xfs_rtfind_forw(
> > * Compute and read in starting bitmap block for starting block.
> > */
> > block = XFS_BITTOBLOCK(mp, start);
> > - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -327,7 +360,6 @@ xfs_rtfind_forw(
> > /*
> > * Different. Mark where we are and return.
> > */
> > - xfs_trans_brelse(tp, bp);
> > i = XFS_RTLOBIT(wdiff) - bit;
> > *rtblock = start + i - 1;
> > return 0;
> > @@ -341,8 +373,7 @@ xfs_rtfind_forw(
> > /*
> > * If done with this block, get the previous one.
> > */
> > - xfs_trans_brelse(tp, bp);
> > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -372,7 +403,6 @@ xfs_rtfind_forw(
> > /*
> > * Different, mark where we are and return.
> > */
> > - xfs_trans_brelse(tp, bp);
> > i += XFS_RTLOBIT(wdiff);
> > *rtblock = start + i - 1;
> > return 0;
> > @@ -386,8 +416,7 @@ xfs_rtfind_forw(
> > /*
> > * If done with this block, get the next one.
> > */
> > - xfs_trans_brelse(tp, bp);
> > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -416,7 +445,6 @@ xfs_rtfind_forw(
> > /*
> > * Different, mark where we are and return.
> > */
> > - xfs_trans_brelse(tp, bp);
> > i += XFS_RTLOBIT(wdiff);
> > *rtblock = start + i - 1;
> > return 0;
> > @@ -426,7 +454,6 @@ xfs_rtfind_forw(
> > /*
> > * No match, return that we scanned the whole area.
> > */
> > - xfs_trans_brelse(tp, bp);
> > *rtblock = start + i - 1;
> > return 0;
> > }
> > @@ -447,8 +474,7 @@ xfs_rtmodify_summary_int(
> > int log, /* log2 of extent size */
> > xfs_rtblock_t bbno, /* bitmap block number */
> > int delta, /* change to make to summary info */
> > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > xfs_suminfo_t *sum) /* out: summary info for this block */
> > {
> > struct xfs_buf *bp; /* buffer for the summary block */
> > @@ -465,30 +491,9 @@ xfs_rtmodify_summary_int(
> > * Compute the block number in the summary file.
> > */
> > sb = XFS_SUMOFFSTOBLOCK(mp, so);
> > - /*
> > - * If we have an old buffer, and the block number matches, use that.
> > - */
> > - if (*rbpp && *rsb == sb)
> > - bp = *rbpp;
> > - /*
> > - * Otherwise we have to get the buffer.
> > - */
> > - else {
> > - /*
> > - * If there was an old one, get rid of it first.
> > - */
> > - if (*rbpp)
> > - xfs_trans_brelse(tp, *rbpp);
> > - error = xfs_rtbuf_get(mp, tp, sb, 1, &bp);
> > - if (error) {
> > - return error;
> > - }
> > - /*
> > - * Remember this buffer and block for the next call.
> > - */
> > - *rbpp = bp;
> > - *rsb = sb;
> > - }
> > + error = xfs_rtbuf_get(mp, tp, sb, 1, rtbufc, &bp);
> > + if (error)
> > + return error;
> > /*
> > * Point to the summary information, modify/log it, and/or copy it out.
> > */
> > @@ -517,11 +522,9 @@ xfs_rtmodify_summary(
> > int log, /* log2 of extent size */
> > xfs_rtblock_t bbno, /* bitmap block number */
> > int delta, /* change to make to summary info */
> > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > - xfs_fsblock_t *rsb) /* in/out: summary block number */
> > + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> > {
> > - return xfs_rtmodify_summary_int(mp, tp, log, bbno,
> > - delta, rbpp, rsb, NULL);
> > + return xfs_rtmodify_summary_int(mp, tp, log, bbno, delta, rtbufc, NULL);
> > }
> >
> > /*
> > @@ -534,7 +537,8 @@ xfs_rtmodify_range(
> > xfs_trans_t *tp, /* transaction pointer */
> > xfs_rtblock_t start, /* starting block to modify */
> > xfs_extlen_t len, /* length of extent to modify */
> > - int val) /* 1 for free, 0 for allocated */
> > + int val, /* 1 for free, 0 for allocated */
> > + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> > {
> > xfs_rtword_t *b; /* current word in buffer */
> > int bit; /* bit number in the word */
> > @@ -555,7 +559,7 @@ xfs_rtmodify_range(
> > /*
> > * Read the bitmap block, and point to its data.
> > */
> > - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -600,7 +604,7 @@ xfs_rtmodify_range(
> > xfs_trans_log_buf(tp, bp,
> > (uint)((char *)first - (char *)bufp),
> > (uint)((char *)b - (char *)bufp));
> > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -640,7 +644,7 @@ xfs_rtmodify_range(
> > xfs_trans_log_buf(tp, bp,
> > (uint)((char *)first - (char *)bufp),
> > (uint)((char *)b - (char *)bufp));
> > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -690,8 +694,7 @@ xfs_rtfree_range(
> > xfs_trans_t *tp, /* transaction pointer */
> > xfs_rtblock_t start, /* starting block to free */
> > xfs_extlen_t len, /* length to free */
> > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > - xfs_fsblock_t *rsb) /* in/out: summary block number */
> > + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> > {
> > xfs_rtblock_t end; /* end of the freed extent */
> > int error; /* error value */
> > @@ -702,7 +705,7 @@ xfs_rtfree_range(
> > /*
> > * Modify the bitmap to mark this extent freed.
> > */
> > - error = xfs_rtmodify_range(mp, tp, start, len, 1);
> > + error = xfs_rtmodify_range(mp, tp, start, len, 1, rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -711,7 +714,7 @@ xfs_rtfree_range(
> > * We need to find the beginning and end of the extent so we can
> > * properly update the summary.
> > */
> > - error = xfs_rtfind_back(mp, tp, start, 0, &preblock);
> > + error = xfs_rtfind_back(mp, tp, start, 0, rtbufc, &preblock);
> > if (error) {
> > return error;
> > }
> > @@ -719,7 +722,7 @@ xfs_rtfree_range(
> > * Find the next allocated block (end of allocated extent).
> > */
> > error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1,
> > - &postblock);
> > + rtbufc, &postblock);
> > if (error)
> > return error;
> > /*
> > @@ -729,7 +732,7 @@ xfs_rtfree_range(
> > if (preblock < start) {
> > error = xfs_rtmodify_summary(mp, tp,
> > XFS_RTBLOCKLOG(start - preblock),
> > - XFS_BITTOBLOCK(mp, preblock), -1, rbpp, rsb);
> > + XFS_BITTOBLOCK(mp, preblock), -1, rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -741,7 +744,7 @@ xfs_rtfree_range(
> > if (postblock > end) {
> > error = xfs_rtmodify_summary(mp, tp,
> > XFS_RTBLOCKLOG(postblock - end),
> > - XFS_BITTOBLOCK(mp, end + 1), -1, rbpp, rsb);
> > + XFS_BITTOBLOCK(mp, end + 1), -1, rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -752,7 +755,7 @@ xfs_rtfree_range(
> > */
> > error = xfs_rtmodify_summary(mp, tp,
> > XFS_RTBLOCKLOG(postblock + 1 - preblock),
> > - XFS_BITTOBLOCK(mp, preblock), 1, rbpp, rsb);
> > + XFS_BITTOBLOCK(mp, preblock), 1, rtbufc);
> > return error;
> > }
> >
> > @@ -767,6 +770,7 @@ xfs_rtcheck_range(
> > xfs_rtblock_t start, /* starting block number of extent */
> > xfs_extlen_t len, /* length of extent */
> > int val, /* 1 for free, 0 for allocated */
> > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > xfs_rtblock_t *new, /* out: first block not matching */
> > int *stat) /* out: 1 for matches, 0 for not */
> > {
> > @@ -789,7 +793,7 @@ xfs_rtcheck_range(
> > /*
> > * Read the bitmap block.
> > */
> > - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -824,7 +828,6 @@ xfs_rtcheck_range(
> > /*
> > * Different, compute first wrong bit and return.
> > */
> > - xfs_trans_brelse(tp, bp);
> > i = XFS_RTLOBIT(wdiff) - bit;
> > *new = start + i;
> > *stat = 0;
> > @@ -839,8 +842,7 @@ xfs_rtcheck_range(
> > /*
> > * If done with this block, get the next one.
> > */
> > - xfs_trans_brelse(tp, bp);
> > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -870,7 +872,6 @@ xfs_rtcheck_range(
> > /*
> > * Different, compute first wrong bit and return.
> > */
> > - xfs_trans_brelse(tp, bp);
> > i += XFS_RTLOBIT(wdiff);
> > *new = start + i;
> > *stat = 0;
> > @@ -885,8 +886,7 @@ xfs_rtcheck_range(
> > /*
> > * If done with this block, get the next one.
> > */
> > - xfs_trans_brelse(tp, bp);
> > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > if (error) {
> > return error;
> > }
> > @@ -915,7 +915,6 @@ xfs_rtcheck_range(
> > /*
> > * Different, compute first wrong bit and return.
> > */
> > - xfs_trans_brelse(tp, bp);
> > i += XFS_RTLOBIT(wdiff);
> > *new = start + i;
> > *stat = 0;
> > @@ -926,7 +925,6 @@ xfs_rtcheck_range(
> > /*
> > * Successful, return.
> > */
> > - xfs_trans_brelse(tp, bp);
> > *new = start + i;
> > *stat = 1;
> > return 0;
> > @@ -941,20 +939,21 @@ xfs_rtcheck_alloc_range(
> > xfs_mount_t *mp, /* file system mount point */
> > xfs_trans_t *tp, /* transaction pointer */
> > xfs_rtblock_t bno, /* starting block number of extent */
> > - xfs_extlen_t len) /* length of extent */
> > + xfs_extlen_t len, /* length of extent */
> > + struct xfs_rtbuf_cache *rtbufc) /* in/out: cached blocks */
> > {
> > xfs_rtblock_t new; /* dummy for xfs_rtcheck_range */
> > int stat;
> > int error;
> >
> > - error = xfs_rtcheck_range(mp, tp, bno, len, 0, &new, &stat);
> > + error = xfs_rtcheck_range(mp, tp, bno, len, 0, rtbufc, &new, &stat);
> > if (error)
> > return error;
> > ASSERT(stat);
> > return 0;
> > }
> > #else
> > -#define xfs_rtcheck_alloc_range(m,t,b,l) (0)
> > +#define xfs_rtcheck_alloc_range(m,t,b,l,r) (0)
> > #endif
> > /*
> > * Free an extent in the realtime subvolume. Length is expressed in
> > @@ -968,22 +967,21 @@ xfs_rtfree_extent(
> > {
> > int error; /* error value */
> > xfs_mount_t *mp; /* file system mount structure */
> > - xfs_fsblock_t sb; /* summary file block number */
> > - struct xfs_buf *sumbp = NULL; /* summary file block buffer */
> > + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
> >
> > mp = tp->t_mountp;
> >
> > ASSERT(mp->m_rbmip->i_itemp != NULL);
> > ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
> >
> > - error = xfs_rtcheck_alloc_range(mp, tp, bno, len);
> > + error = xfs_rtcheck_alloc_range(mp, tp, bno, len, &rtbufc);
> > if (error)
> > return error;
> >
> > /*
> > * Free the range of realtime blocks.
> > */
> > - error = xfs_rtfree_range(mp, tp, bno, len, &sumbp, &sb);
> > + error = xfs_rtfree_range(mp, tp, bno, len, &rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -1021,6 +1019,7 @@ xfs_rtalloc_query_range(
> > xfs_rtblock_t high_key;
> > int is_free;
> > int error = 0;
> > + struct xfs_rtbuf_cache rtbufc = {};
> >
> > if (low_rec->ar_startext > high_rec->ar_startext)
> > return -EINVAL;
> > @@ -1034,13 +1033,14 @@ xfs_rtalloc_query_range(
> > rtstart = low_rec->ar_startext;
> > while (rtstart <= high_key) {
> > /* Is the first block free? */
> > - error = xfs_rtcheck_range(mp, tp, rtstart, 1, 1, &rtend,
> > - &is_free);
> > + error = xfs_rtcheck_range(mp, tp, rtstart, 1, 1, &rtbufc,
> > + &rtend, &is_free);
> > if (error)
> > break;
> >
> > /* How long does the extent go for? */
> > - error = xfs_rtfind_forw(mp, tp, rtstart, high_key, &rtend);
> > + error = xfs_rtfind_forw(mp, tp, rtstart, high_key, &rtbufc,
> > + &rtend);
> > if (error)
> > break;
> >
> > @@ -1056,6 +1056,8 @@ xfs_rtalloc_query_range(
> > rtstart = rtend + 1;
> > }
> >
> > + xfs_rtbuf_cache_relse(tp, &rtbufc);
> > +
> > return error;
> > }
> >
> > @@ -1085,11 +1087,14 @@ xfs_rtalloc_extent_is_free(
> > xfs_extlen_t len,
> > bool *is_free)
> > {
> > + struct xfs_rtbuf_cache rtbufc = {};
> > xfs_rtblock_t end;
> > int matches;
> > int error;
> >
> > - error = xfs_rtcheck_range(mp, tp, start, len, 1, &end, &matches);
> > + error = xfs_rtcheck_range(mp, tp, start, len, 1, &rtbufc, &end,
> > + &matches);
> > + xfs_rtbuf_cache_relse(tp, &rtbufc);
> > if (error)
> > return error;
> >
> > diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> > index 16534e9873f6..61ef13286654 100644
> > --- a/fs/xfs/xfs_rtalloc.c
> > +++ b/fs/xfs/xfs_rtalloc.c
> > @@ -32,11 +32,10 @@ xfs_rtget_summary(
> > xfs_trans_t *tp, /* transaction pointer */
> > int log, /* log2 of extent size */
> > xfs_rtblock_t bbno, /* bitmap block number */
> > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > xfs_suminfo_t *sum) /* out: summary info for this block */
> > {
> > - return xfs_rtmodify_summary_int(mp, tp, log, bbno, 0, rbpp, rsb, sum);
> > + return xfs_rtmodify_summary_int(mp, tp, log, bbno, 0, rtbufc, sum);
> > }
> >
> > /*
> > @@ -50,8 +49,7 @@ xfs_rtany_summary(
> > int low, /* low log2 extent size */
> > int high, /* high log2 extent size */
> > xfs_rtblock_t bbno, /* bitmap block number */
> > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > int *stat) /* out: any good extents here? */
> > {
> > int error; /* error value */
> > @@ -69,7 +67,7 @@ xfs_rtany_summary(
> > /*
> > * Get one summary datum.
> > */
> > - error = xfs_rtget_summary(mp, tp, log, bbno, rbpp, rsb, &sum);
> > + error = xfs_rtget_summary(mp, tp, log, bbno, rtbufc, &sum);
> > if (error) {
> > return error;
> > }
> > @@ -104,29 +102,27 @@ xfs_rtcopy_summary(
> > xfs_trans_t *tp) /* transaction pointer */
> > {
> > xfs_rtblock_t bbno; /* bitmap block number */
> > - struct xfs_buf *bp; /* summary buffer */
> > + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
> > int error; /* error return value */
> > int log; /* summary level number (log length) */
> > xfs_suminfo_t sum; /* summary data */
> > - xfs_fsblock_t sumbno; /* summary block number */
> >
> > - bp = NULL;
> > for (log = omp->m_rsumlevels - 1; log >= 0; log--) {
> > for (bbno = omp->m_sb.sb_rbmblocks - 1;
> > (xfs_srtblock_t)bbno >= 0;
> > bbno--) {
> > - error = xfs_rtget_summary(omp, tp, log, bbno, &bp,
> > - &sumbno, &sum);
> > + error = xfs_rtget_summary(omp, tp, log, bbno, &rtbufc,
> > + &sum);
> > if (error)
> > return error;
> > if (sum == 0)
> > continue;
> > error = xfs_rtmodify_summary(omp, tp, log, bbno, -sum,
> > - &bp, &sumbno);
> > + &rtbufc);
> > if (error)
> > return error;
> > error = xfs_rtmodify_summary(nmp, tp, log, bbno, sum,
> > - &bp, &sumbno);
> > + &rtbufc);
> > if (error)
> > return error;
> > ASSERT(sum > 0);
> > @@ -144,8 +140,7 @@ xfs_rtallocate_range(
> > xfs_trans_t *tp, /* transaction pointer */
> > xfs_rtblock_t start, /* start block to allocate */
> > xfs_extlen_t len, /* length to allocate */
> > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > - xfs_fsblock_t *rsb) /* in/out: summary block number */
> > + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> > {
> > xfs_rtblock_t end; /* end of the allocated extent */
> > int error; /* error value */
> > @@ -158,14 +153,14 @@ xfs_rtallocate_range(
> > * We need to find the beginning and end of the extent so we can
> > * properly update the summary.
> > */
> > - error = xfs_rtfind_back(mp, tp, start, 0, &preblock);
> > + error = xfs_rtfind_back(mp, tp, start, 0, rtbufc, &preblock);
> > if (error) {
> > return error;
> > }
> > /*
> > * Find the next allocated block (end of free extent).
> > */
> > - error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1,
> > + error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1, rtbufc,
> > &postblock);
> > if (error) {
> > return error;
> > @@ -176,7 +171,7 @@ xfs_rtallocate_range(
> > */
> > error = xfs_rtmodify_summary(mp, tp,
> > XFS_RTBLOCKLOG(postblock + 1 - preblock),
> > - XFS_BITTOBLOCK(mp, preblock), -1, rbpp, rsb);
> > + XFS_BITTOBLOCK(mp, preblock), -1, rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -187,7 +182,7 @@ xfs_rtallocate_range(
> > if (preblock < start) {
> > error = xfs_rtmodify_summary(mp, tp,
> > XFS_RTBLOCKLOG(start - preblock),
> > - XFS_BITTOBLOCK(mp, preblock), 1, rbpp, rsb);
> > + XFS_BITTOBLOCK(mp, preblock), 1, rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -199,7 +194,7 @@ xfs_rtallocate_range(
> > if (postblock > end) {
> > error = xfs_rtmodify_summary(mp, tp,
> > XFS_RTBLOCKLOG(postblock - end),
> > - XFS_BITTOBLOCK(mp, end + 1), 1, rbpp, rsb);
> > + XFS_BITTOBLOCK(mp, end + 1), 1, rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -207,7 +202,7 @@ xfs_rtallocate_range(
> > /*
> > * Modify the bitmap to mark this extent allocated.
> > */
> > - error = xfs_rtmodify_range(mp, tp, start, len, 0);
> > + error = xfs_rtmodify_range(mp, tp, start, len, 0, rtbufc);
> > return error;
> > }
> >
> > @@ -226,8 +221,7 @@ xfs_rtallocate_extent_block(
> > xfs_extlen_t maxlen, /* maximum length to allocate */
> > xfs_extlen_t *len, /* out: actual length allocated */
> > xfs_rtblock_t *nextp, /* out: next block to try */
> > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > xfs_extlen_t prod, /* extent product factor */
> > xfs_rtblock_t *rtblock) /* out: start block allocated */
> > {
> > @@ -254,7 +248,8 @@ xfs_rtallocate_extent_block(
> > * See if there's a free extent of maxlen starting at i.
> > * If it's not so then next will contain the first non-free.
> > */
> > - error = xfs_rtcheck_range(mp, tp, i, maxlen, 1, &next, &stat);
> > + error = xfs_rtcheck_range(mp, tp, i, maxlen, 1, rtbufc, &next,
> > + &stat);
> > if (error) {
> > return error;
> > }
> > @@ -262,8 +257,7 @@ xfs_rtallocate_extent_block(
> > /*
> > * i for maxlen is all free, allocate and return that.
> > */
> > - error = xfs_rtallocate_range(mp, tp, i, maxlen, rbpp,
> > - rsb);
> > + error = xfs_rtallocate_range(mp, tp, i, maxlen, rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -290,7 +284,7 @@ xfs_rtallocate_extent_block(
> > * If not done yet, find the start of the next free space.
> > */
> > if (next < end) {
> > - error = xfs_rtfind_forw(mp, tp, next, end, &i);
> > + error = xfs_rtfind_forw(mp, tp, next, end, rtbufc, &i);
> > if (error) {
> > return error;
> > }
> > @@ -315,7 +309,7 @@ xfs_rtallocate_extent_block(
> > /*
> > * Allocate besti for bestlen & return that.
> > */
> > - error = xfs_rtallocate_range(mp, tp, besti, bestlen, rbpp, rsb);
> > + error = xfs_rtallocate_range(mp, tp, besti, bestlen, rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -344,9 +338,8 @@ xfs_rtallocate_extent_exact(
> > xfs_rtblock_t bno, /* starting block number to allocate */
> > xfs_extlen_t minlen, /* minimum length to allocate */
> > xfs_extlen_t maxlen, /* maximum length to allocate */
> > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > xfs_extlen_t *len, /* out: actual length allocated */
> > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > xfs_extlen_t prod, /* extent product factor */
> > xfs_rtblock_t *rtblock) /* out: start block allocated */
> > {
> > @@ -359,7 +352,8 @@ xfs_rtallocate_extent_exact(
> > /*
> > * Check if the range in question (for maxlen) is free.
> > */
> > - error = xfs_rtcheck_range(mp, tp, bno, maxlen, 1, &next, &isfree);
> > + error = xfs_rtcheck_range(mp, tp, bno, maxlen, 1, rtbufc, &next,
> > + &isfree);
> > if (error) {
> > return error;
> > }
> > @@ -367,7 +361,7 @@ xfs_rtallocate_extent_exact(
> > /*
> > * If it is, allocate it and return success.
> > */
> > - error = xfs_rtallocate_range(mp, tp, bno, maxlen, rbpp, rsb);
> > + error = xfs_rtallocate_range(mp, tp, bno, maxlen, rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -402,7 +396,7 @@ xfs_rtallocate_extent_exact(
> > /*
> > * Allocate what we can and return it.
> > */
> > - error = xfs_rtallocate_range(mp, tp, bno, maxlen, rbpp, rsb);
> > + error = xfs_rtallocate_range(mp, tp, bno, maxlen, rtbufc);
> > if (error) {
> > return error;
> > }
> > @@ -424,8 +418,7 @@ xfs_rtallocate_extent_near(
> > xfs_extlen_t minlen, /* minimum length to allocate */
> > xfs_extlen_t maxlen, /* maximum length to allocate */
> > xfs_extlen_t *len, /* out: actual length allocated */
> > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > xfs_extlen_t prod, /* extent product factor */
> > xfs_rtblock_t *rtblock) /* out: start block allocated */
> > {
> > @@ -456,8 +449,8 @@ xfs_rtallocate_extent_near(
> > /*
> > * Try the exact allocation first.
> > */
> > - error = xfs_rtallocate_extent_exact(mp, tp, bno, minlen, maxlen, len,
> > - rbpp, rsb, prod, &r);
> > + error = xfs_rtallocate_extent_exact(mp, tp, bno, minlen, maxlen, rtbufc,
> > + len, prod, &r);
> > if (error) {
> > return error;
> > }
> > @@ -481,7 +474,7 @@ xfs_rtallocate_extent_near(
> > * starting in this bitmap block.
> > */
> > error = xfs_rtany_summary(mp, tp, log2len, mp->m_rsumlevels - 1,
> > - bbno + i, rbpp, rsb, &any);
> > + bbno + i, rtbufc, &any);
> > if (error) {
> > return error;
> > }
> > @@ -499,8 +492,8 @@ xfs_rtallocate_extent_near(
> > * this block.
> > */
> > error = xfs_rtallocate_extent_block(mp, tp,
> > - bbno + i, minlen, maxlen, len, &n, rbpp,
> > - rsb, prod, &r);
> > + bbno + i, minlen, maxlen, len, &n,
> > + rtbufc, prod, &r);
> > if (error) {
> > return error;
> > }
> > @@ -529,7 +522,7 @@ xfs_rtallocate_extent_near(
> > */
> > error = xfs_rtany_summary(mp, tp,
> > log2len, mp->m_rsumlevels - 1,
> > - bbno + j, rbpp, rsb, &any);
> > + bbno + j, rtbufc, &any);
> > if (error) {
> > return error;
> > }
> > @@ -545,7 +538,7 @@ xfs_rtallocate_extent_near(
> > continue;
> > error = xfs_rtallocate_extent_block(mp,
> > tp, bbno + j, minlen, maxlen,
> > - len, &n, rbpp, rsb, prod, &r);
> > + len, &n, rtbufc, prod, &r);
> > if (error) {
> > return error;
> > }
> > @@ -566,8 +559,8 @@ xfs_rtallocate_extent_near(
> > * that we found.
> > */
> > error = xfs_rtallocate_extent_block(mp, tp,
> > - bbno + i, minlen, maxlen, len, &n, rbpp,
> > - rsb, prod, &r);
> > + bbno + i, minlen, maxlen, len, &n,
> > + rtbufc, prod, &r);
> > if (error) {
> > return error;
> > }
> > @@ -626,8 +619,7 @@ xfs_rtallocate_extent_size(
> > xfs_extlen_t minlen, /* minimum length to allocate */
> > xfs_extlen_t maxlen, /* maximum length to allocate */
> > xfs_extlen_t *len, /* out: actual length allocated */
> > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > xfs_extlen_t prod, /* extent product factor */
> > xfs_rtblock_t *rtblock) /* out: start block allocated */
> > {
> > @@ -656,7 +648,7 @@ xfs_rtallocate_extent_size(
> > /*
> > * Get the summary for this level/block.
> > */
> > - error = xfs_rtget_summary(mp, tp, l, i, rbpp, rsb,
> > + error = xfs_rtget_summary(mp, tp, l, i, rtbufc,
> > &sum);
> > if (error) {
> > return error;
> > @@ -670,7 +662,7 @@ xfs_rtallocate_extent_size(
> > * Try allocating the extent.
> > */
> > error = xfs_rtallocate_extent_block(mp, tp, i, maxlen,
> > - maxlen, len, &n, rbpp, rsb, prod, &r);
> > + maxlen, len, &n, rtbufc, prod, &r);
> > if (error) {
> > return error;
> > }
> > @@ -715,7 +707,7 @@ xfs_rtallocate_extent_size(
> > /*
> > * Get the summary information for this level/block.
> > */
> > - error = xfs_rtget_summary(mp, tp, l, i, rbpp, rsb,
> > + error = xfs_rtget_summary(mp, tp, l, i, rtbufc,
> > &sum);
> > if (error) {
> > return error;
> > @@ -733,7 +725,7 @@ xfs_rtallocate_extent_size(
> > error = xfs_rtallocate_extent_block(mp, tp, i,
> > XFS_RTMAX(minlen, 1 << l),
> > XFS_RTMIN(maxlen, (1 << (l + 1)) - 1),
> > - len, &n, rbpp, rsb, prod, &r);
> > + len, &n, rtbufc, prod, &r);
> > if (error) {
> > return error;
> > }
> > @@ -922,7 +914,6 @@ xfs_growfs_rt(
> > xfs_extlen_t rbmblocks; /* current number of rt bitmap blocks */
> > xfs_extlen_t rsumblocks; /* current number of rt summary blks */
> > xfs_sb_t *sbp; /* old superblock */
> > - xfs_fsblock_t sumbno; /* summary block number */
> > uint8_t *rsum_cache; /* old summary cache */
> >
> > sbp = &mp->m_sb;
> > @@ -1025,6 +1016,7 @@ xfs_growfs_rt(
> > bmbno++) {
> > struct xfs_trans *tp;
> > xfs_rfsblock_t nrblocks_step;
> > + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
> >
> > *nmp = *mp;
> > nsbp = &nmp->m_sb;
> > @@ -1111,9 +1103,8 @@ xfs_growfs_rt(
> > /*
> > * Free new extent.
> > */
> > - bp = NULL;
> > error = xfs_rtfree_range(nmp, tp, sbp->sb_rextents,
> > - nsbp->sb_rextents - sbp->sb_rextents, &bp, &sumbno);
> > + nsbp->sb_rextents - sbp->sb_rextents, &rtbufc);
> > if (error) {
> > error_cancel:
> > xfs_trans_cancel(tp);
> > @@ -1185,8 +1176,7 @@ xfs_rtallocate_extent(
> > xfs_mount_t *mp = tp->t_mountp;
> > int error; /* error value */
> > xfs_rtblock_t r; /* result allocated block */
> > - xfs_fsblock_t sb; /* summary file block number */
> > - struct xfs_buf *sumbp; /* summary file block buffer */
> > + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
> >
> > ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
> > ASSERT(minlen > 0 && minlen <= maxlen);
> > @@ -1208,13 +1198,14 @@ xfs_rtallocate_extent(
> > }
> >
> > retry:
> > - sumbp = NULL;
> > + rtbufc.bbuf = NULL;
> > + rtbufc.sbuf = NULL;
>
> xfs_rtbuf_cache_relse?
In this case, the cached blocks have been dirtied and attached to the
transaction, so it's not strictly necessary (which is also why the old
code didn't call xfs_trans_brelse()). But it's probably clearer to just
use xfs_rtbuf_cache_relse() anyways rather than needing to justify that.
> > if (bno == 0) {
> > error = xfs_rtallocate_extent_size(mp, tp, minlen, maxlen, len,
> > - &sumbp, &sb, prod, &r);
> > + &rtbufc, prod, &r);
> > } else {
> > error = xfs_rtallocate_extent_near(mp, tp, bno, minlen, maxlen,
> > - len, &sumbp, &sb, prod, &r);
> > + len, &rtbufc, prod, &r);
> > }
> >
> > if (error)
> > diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h
> > index 62c7ad79cbb6..888552c4f287 100644
> > --- a/fs/xfs/xfs_rtalloc.h
> > +++ b/fs/xfs/xfs_rtalloc.h
> > @@ -101,29 +101,39 @@ xfs_growfs_rt(
> > /*
> > * From xfs_rtbitmap.c
> > */
> > +struct xfs_rtbuf_cache {
> > + struct xfs_buf *bbuf; /* bitmap block buffer */
> > + xfs_fsblock_t bblock; /* bitmap block number */
> > + struct xfs_buf *sbuf; /* summary block buffer */
> > + xfs_fsblock_t sblock; /* summary block number */
>
> These are really offsets into the data fork, right? They ought to be
> xfs_fileoff_t, except that none of the rtalloc code uses the correct
> type?
Yup, once again I was mimicking the surrounding code. I can fix it here.
Thanks!
> And yes I do want to merge the fixes for that:
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=clean-up-realtime-units
>
> --D
>
> > +};
> > +
> > int xfs_rtbuf_get(struct xfs_mount *mp, struct xfs_trans *tp,
> > - xfs_rtblock_t block, int issum, struct xfs_buf **bpp);
> > + xfs_rtblock_t block, int issum, struct xfs_rtbuf_cache *cache,
> > + struct xfs_buf **bpp);
> > int xfs_rtcheck_range(struct xfs_mount *mp, struct xfs_trans *tp,
> > xfs_rtblock_t start, xfs_extlen_t len, int val,
> > - xfs_rtblock_t *new, int *stat);
> > + struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *new,
> > + int *stat);
> > int xfs_rtfind_back(struct xfs_mount *mp, struct xfs_trans *tp,
> > xfs_rtblock_t start, xfs_rtblock_t limit,
> > - xfs_rtblock_t *rtblock);
> > + struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *rtblock);
> > int xfs_rtfind_forw(struct xfs_mount *mp, struct xfs_trans *tp,
> > xfs_rtblock_t start, xfs_rtblock_t limit,
> > - xfs_rtblock_t *rtblock);
> > + struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *rtblock);
> > int xfs_rtmodify_range(struct xfs_mount *mp, struct xfs_trans *tp,
> > - xfs_rtblock_t start, xfs_extlen_t len, int val);
> > + xfs_rtblock_t start, xfs_extlen_t len, int val,
> > + struct xfs_rtbuf_cache *rtbufc);
> > int xfs_rtmodify_summary_int(struct xfs_mount *mp, struct xfs_trans *tp,
> > int log, xfs_rtblock_t bbno, int delta,
> > - struct xfs_buf **rbpp, xfs_fsblock_t *rsb,
> > + struct xfs_rtbuf_cache *rtbufc,
> > xfs_suminfo_t *sum);
> > int xfs_rtmodify_summary(struct xfs_mount *mp, struct xfs_trans *tp, int log,
> > - xfs_rtblock_t bbno, int delta, struct xfs_buf **rbpp,
> > - xfs_fsblock_t *rsb);
> > + xfs_rtblock_t bbno, int delta,
> > + struct xfs_rtbuf_cache *rtbufc);
> > int xfs_rtfree_range(struct xfs_mount *mp, struct xfs_trans *tp,
> > xfs_rtblock_t start, xfs_extlen_t len,
> > - struct xfs_buf **rbpp, xfs_fsblock_t *rsb);
> > + struct xfs_rtbuf_cache *rtbufc);
> > int xfs_rtalloc_query_range(struct xfs_mount *mp, struct xfs_trans *tp,
> > const struct xfs_rtalloc_rec *low_rec,
> > const struct xfs_rtalloc_rec *high_rec,
> > --
> > 2.41.0
> >
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/6] xfs: invert the realtime summary cache
2023-07-12 22:40 ` Darrick J. Wong
@ 2023-07-17 19:54 ` Omar Sandoval
2023-08-01 23:17 ` Darrick J. Wong
0 siblings, 1 reply; 23+ messages in thread
From: Omar Sandoval @ 2023-07-17 19:54 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, kernel-team, Prashant Nema
On Wed, Jul 12, 2023 at 03:40:01PM -0700, Darrick J. Wong wrote:
> On Tue, Jun 20, 2023 at 02:32:12PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> >
> > In commit 355e3532132b ("xfs: cache minimum realtime summary level"), I
> > added a cache of the minimum level of the realtime summary that has any
> > free extents. However, it turns out that the _maximum_ level is more
> > useful for upcoming optimizations, and basically equivalent for the
> > existing usage. So, let's change the meaning of the cache to be the
> > maximum level + 1, or 0 if there are no free extents.
>
> Hmm. If I'm reading xfs_rtmodify_summary_int right, m_rsum_cache[b] now
> tells us the maximum log2(length) of the free extents starting in
> rtbitmap block b?
>
> IOWs, let's say the cache contents are:
>
> {2, 3, 2, 15, 8}
>
> Someone asks for a 400rtx (realtime extent) allocation, so we want to
> find a free space of at least magnitude floor(log2(400)) == 8.
>
> The cache tells us that there aren't any free extents longer than 2^1
> blocks in rtbitmap blocks 0 and 2; longer than 2^2 blocks in rtbmp block
> 1; longer than 2^7 blocks in rtbmp block 4; nor longer than 2^14 blocks
> in rtbmp block 3?
There's a potential for an off-by-one bug here, so just to make sure
we're saying the same thing: the realtime summary for level n contains
the number of free extents starting in a bitmap block such that
floor(log2(size_in_realtime_extents)) == n. The maximum size of a free
extent in level n is therefore 2^(n + 1) - 1 realtime extents.
So in your example, the cache is telling us that realtime bitmap blocks
0 and 2 don't have anything free in levels 2 or above, and therefore
don't have any free extents longer than _or equal to_ 2^2.
I'll try to reword the commit message and comments to make this
unambiguous.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/6] xfs: limit maxlen based on available space in xfs_rtallocate_extent_near()
2023-07-12 23:01 ` Darrick J. Wong
@ 2023-07-17 20:33 ` Omar Sandoval
0 siblings, 0 replies; 23+ messages in thread
From: Omar Sandoval @ 2023-07-17 20:33 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, kernel-team, Prashant Nema
On Wed, Jul 12, 2023 at 04:01:59PM -0700, Darrick J. Wong wrote:
> On Tue, Jun 20, 2023 at 02:32:14PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> >
> > xfs_rtallocate_extent_near() calls xfs_rtallocate_extent_block() with
> > the minlen and maxlen that were passed to it.
> > xfs_rtallocate_extent_block() then scans the bitmap block looking for a
> > free range of size maxlen. If there is none, it has to scan the whole
> > bitmap block before returning the largest range of at least size minlen.
> > For a fragmented realtime device and a large allocation request, it's
> > almost certain that this will have to search the whole bitmap block,
> > leading to high CPU usage.
> >
> > However, the realtime summary tells us the maximum size available in the
> > bitmap block. We can limit the search in xfs_rtallocate_extent_block()
> > to that size and often stop before scanning the whole bitmap block.
> >
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > ---
> > fs/xfs/xfs_rtalloc.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> > index ba7d42e0090f..d079dfb77c73 100644
> > --- a/fs/xfs/xfs_rtalloc.c
> > +++ b/fs/xfs/xfs_rtalloc.c
> > @@ -488,6 +488,8 @@ xfs_rtallocate_extent_near(
> > * allocating one.
> > */
> > if (maxlog >= 0) {
> > + xfs_extlen_t maxavail =
> > + min(maxlen, ((xfs_extlen_t)1 << (maxlog + 1)) - 1);
>
> There can be up to 2^52rtx (realtime extents) in the filesystem, right?
> xfs_extlen_t is a u32, which will overflow this calculation if the
> realtime volume is seriously huge. IOWs, doesn't this need to be:
>
> xfs_extlen_t maxavail = max_t(xfs_rtblock_t, maxlen,
> (1ULL << (maxlog + 1)) - 1);
>
> (The rest of the patch looks ok)
min_t instead of max_t, but good catch, fixed.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 5/6] xfs: don't try redundant allocations in xfs_rtallocate_extent_near()
2023-07-12 23:34 ` Darrick J. Wong
@ 2023-07-17 21:06 ` Omar Sandoval
2023-07-31 20:58 ` Omar Sandoval
2023-08-01 23:00 ` Darrick J. Wong
0 siblings, 2 replies; 23+ messages in thread
From: Omar Sandoval @ 2023-07-17 21:06 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, kernel-team, Prashant Nema
On Wed, Jul 12, 2023 at 04:34:03PM -0700, Darrick J. Wong wrote:
> On Tue, Jun 20, 2023 at 02:32:15PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> >
> > xfs_rtallocate_extent_near() tries to find a free extent as close to a
> > target bitmap block given by bbno as possible, which may be before or
> > after bbno. Searching backwards has a complication: the realtime summary
> > accounts for free space _starting_ in a bitmap block, but not straddling
> > or ending in a bitmap block. So, when the negative search finds a free
> > extent in the realtime summary, in order to end up closer to the target,
> > it looks for the end of the free extent. For example, if bbno - 2 has a
> > free extent, then it will check bbno - 1, then bbno - 2. But then if
> > bbno - 3 has a free extent, it will check bbno - 1 again, then bbno - 2
> > again, and then bbno - 3. This results in a quadratic loop, which is
> > completely pointless since the repeated checks won't find anything new.
> >
> > Fix it by remembering where we last checked up to and continue from
> > there. This also obviates the need for a check of the realtime summary.
> >
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > ---
> > fs/xfs/xfs_rtalloc.c | 46 +++-----------------------------------------
> > 1 file changed, 3 insertions(+), 43 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> > index d079dfb77c73..4d9d0be2e616 100644
> > --- a/fs/xfs/xfs_rtalloc.c
> > +++ b/fs/xfs/xfs_rtalloc.c
> > @@ -468,6 +468,7 @@ xfs_rtallocate_extent_near(
> > }
> > bbno = XFS_BITTOBLOCK(mp, bno);
> > i = 0;
> > + j = -1;
> > ASSERT(minlen != 0);
> > log2len = xfs_highbit32(minlen);
> > /*
> > @@ -518,31 +519,11 @@ xfs_rtallocate_extent_near(
> > else { /* i < 0 */
> > /*
> > * Loop backwards through the bitmap blocks from
> > - * the starting point-1 up to where we are now.
> > + * where we last checked up to where we are now.
>
> I find this comment a bit unclear -- aren't we looping backwards from
> where we last checked *downwards*? I was reading "where we are now" to
> mean @i, which contains a negative value.
Yes, "where we last checked down to where we are now" might be better
wording.
> "When @i is negative, we try to find a free extent that starts in the
> bitmap blocks before bbno. Starting from the last bitmap block that we
> checked in a negative scan (initially bbno - 1) and walking downwards
> towards (bbno + i), try to allocate an extent of satisfactory length."
>
> But now having worked my way through that, now I'm wondering what the j
> loop is even doing. Doesn't the sequence of blocks that we call
> xfs_rtallocate_extent_block on alternate backwards and forwards? e.g.
>
> Try to find a satisfactory free extent that starts in:
>
> bbno
> bbno + 1
> bbno - 1
> bbno + 2
> bbno - 2
> ...
> etc?
>
> Why not avoid the loop entirely by calling xfs_rtallocate_extent_block
> on bbno + i once before switching back to positive @i? What am I
> missing here?
There are two ways I can think of to remove the j loop, so I'll address
both.
If you mean: make the i >= 0 and i < 0 branches the same and call
xfs_rtallocate_extent_block() if and only if xfs_rtany_summary() returns
a non-zero maxlog, i.e.:
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 4ab03eafd39f..9d7296c40ddd 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -495,10 +495,6 @@ xfs_rtallocate_extent_near(
xfs_extlen_t maxavail =
min_t(xfs_rtblock_t, maxlen,
(1ULL << (maxlog + 1)) - 1);
- /*
- * On the positive side of the starting location.
- */
- if (i >= 0) {
/*
* Try to allocate an extent starting in
* this block.
@@ -517,33 +513,6 @@ xfs_rtallocate_extent_near(
return 0;
}
}
- /*
- * On the negative side of the starting location.
- */
- else { /* i < 0 */
- /*
- * Loop backwards through the bitmap blocks from
- * where we last checked up to where we are now.
- * There should be an extent which ends in this
- * bitmap block and is long enough.
- */
- for (; j >= i; j--) {
- error = xfs_rtallocate_extent_block(mp,
- tp, bbno + j, minlen, maxavail,
- len, &n, rtbufc, prod, &r);
- if (error) {
- return error;
- }
- /*
- * If it works, return the extent.
- */
- if (r != NULLRTBLOCK) {
- *rtblock = r;
- return 0;
- }
- }
- }
- }
/*
* Loop control. If we were on the positive side, and there's
* still more blocks on the negative side, go there.
Then when i < 0, this will only find the _beginning_ of a free extent
before bbno rather than the apparent goal of trying to allocate as close
as possible to bbno, i.e., the _end_ of the free extent. (This is what I
tried to explain in the commit message.)
If instead you mean: unconditionally call xfs_rtallocate_extent_block()
for bbno + i when i < 0, i.e.:
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 4ab03eafd39f..1cf42910c0e8 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -491,14 +491,10 @@ xfs_rtallocate_extent_near(
* If there are any useful extents starting here, try
* allocating one.
*/
- if (maxlog >= 0) {
+ if (maxlog >= 0 || i < 0) {
xfs_extlen_t maxavail =
min_t(xfs_rtblock_t, maxlen,
(1ULL << (maxlog + 1)) - 1);
- /*
- * On the positive side of the starting location.
- */
- if (i >= 0) {
/*
* Try to allocate an extent starting in
* this block.
@@ -517,33 +513,6 @@ xfs_rtallocate_extent_near(
return 0;
}
}
- /*
- * On the negative side of the starting location.
- */
- else { /* i < 0 */
- /*
- * Loop backwards through the bitmap blocks from
- * where we last checked up to where we are now.
- * There should be an extent which ends in this
- * bitmap block and is long enough.
- */
- for (; j >= i; j--) {
- error = xfs_rtallocate_extent_block(mp,
- tp, bbno + j, minlen, maxavail,
- len, &n, rtbufc, prod, &r);
- if (error) {
- return error;
- }
- /*
- * If it works, return the extent.
- */
- if (r != NULLRTBLOCK) {
- *rtblock = r;
- return 0;
- }
- }
- }
- }
/*
* Loop control. If we were on the positive side, and there's
* still more blocks on the negative side, go there.
Then this will find the end of the extent, but we will waste a lot of
time searching bitmap blocks that don't have any usable free space. (In
fact, this is something that patch 6 tries to reduce further.)
> > * There should be an extent which ends in this
> > * bitmap block and is long enough.
> > */
> > - for (j = -1; j > i; j--) {
> > - /*
> > - * Grab the summary information for
> > - * this bitmap block.
> > - */
> > - error = xfs_rtany_summary(mp, tp,
> > - log2len, mp->m_rsumlevels - 1,
> > - bbno + j, rtbufc, &maxlog);
> > - if (error) {
> > - return error;
> > - }
> > - /*
> > - * If there's no extent given in the
> > - * summary that means the extent we
> > - * found must carry over from an
> > - * earlier block. If there is an
> > - * extent given, we've already tried
> > - * that allocation, don't do it again.
> > - */
> > - if (maxlog >= 0)
> > - continue;
> > + for (; j >= i; j--) {
>
> Changing the j > i to j >= i is what obviates the extra call to
> xfs_rtallocate_extent_block below, correct?
Correct. Before, the loop body was different because it contained a call
to xfs_rtany_summary(). But now there's no check, so the extra call can
be included in the loop.
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 5/6] xfs: don't try redundant allocations in xfs_rtallocate_extent_near()
2023-07-17 21:06 ` Omar Sandoval
@ 2023-07-31 20:58 ` Omar Sandoval
2023-08-01 23:00 ` Darrick J. Wong
1 sibling, 0 replies; 23+ messages in thread
From: Omar Sandoval @ 2023-07-31 20:58 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, kernel-team, Prashant Nema
On Mon, Jul 17, 2023 at 02:06:36PM -0700, Omar Sandoval wrote:
> On Wed, Jul 12, 2023 at 04:34:03PM -0700, Darrick J. Wong wrote:
> > On Tue, Jun 20, 2023 at 02:32:15PM -0700, Omar Sandoval wrote:
> > > From: Omar Sandoval <osandov@fb.com>
> > >
> > > xfs_rtallocate_extent_near() tries to find a free extent as close to a
> > > target bitmap block given by bbno as possible, which may be before or
> > > after bbno. Searching backwards has a complication: the realtime summary
> > > accounts for free space _starting_ in a bitmap block, but not straddling
> > > or ending in a bitmap block. So, when the negative search finds a free
> > > extent in the realtime summary, in order to end up closer to the target,
> > > it looks for the end of the free extent. For example, if bbno - 2 has a
> > > free extent, then it will check bbno - 1, then bbno - 2. But then if
> > > bbno - 3 has a free extent, it will check bbno - 1 again, then bbno - 2
> > > again, and then bbno - 3. This results in a quadratic loop, which is
> > > completely pointless since the repeated checks won't find anything new.
> > >
> > > Fix it by remembering where we last checked up to and continue from
> > > there. This also obviates the need for a check of the realtime summary.
> > >
> > > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > > ---
> > > fs/xfs/xfs_rtalloc.c | 46 +++-----------------------------------------
> > > 1 file changed, 3 insertions(+), 43 deletions(-)
> > >
> > > diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> > > index d079dfb77c73..4d9d0be2e616 100644
> > > --- a/fs/xfs/xfs_rtalloc.c
> > > +++ b/fs/xfs/xfs_rtalloc.c
> > > @@ -468,6 +468,7 @@ xfs_rtallocate_extent_near(
> > > }
> > > bbno = XFS_BITTOBLOCK(mp, bno);
> > > i = 0;
> > > + j = -1;
> > > ASSERT(minlen != 0);
> > > log2len = xfs_highbit32(minlen);
> > > /*
> > > @@ -518,31 +519,11 @@ xfs_rtallocate_extent_near(
> > > else { /* i < 0 */
> > > /*
> > > * Loop backwards through the bitmap blocks from
> > > - * the starting point-1 up to where we are now.
> > > + * where we last checked up to where we are now.
> >
> > I find this comment a bit unclear -- aren't we looping backwards from
> > where we last checked *downwards*? I was reading "where we are now" to
> > mean @i, which contains a negative value.
>
> Yes, "where we last checked down to where we are now" might be better
> wording.
>
> > "When @i is negative, we try to find a free extent that starts in the
> > bitmap blocks before bbno. Starting from the last bitmap block that we
> > checked in a negative scan (initially bbno - 1) and walking downwards
> > towards (bbno + i), try to allocate an extent of satisfactory length."
> >
> > But now having worked my way through that, now I'm wondering what the j
> > loop is even doing. Doesn't the sequence of blocks that we call
> > xfs_rtallocate_extent_block on alternate backwards and forwards? e.g.
> >
> > Try to find a satisfactory free extent that starts in:
> >
> > bbno
> > bbno + 1
> > bbno - 1
> > bbno + 2
> > bbno - 2
> > ...
> > etc?
> >
> > Why not avoid the loop entirely by calling xfs_rtallocate_extent_block
> > on bbno + i once before switching back to positive @i? What am I
> > missing here?
>
> There are two ways I can think of to remove the j loop, so I'll address
> both.
>
> If you mean: make the i >= 0 and i < 0 branches the same and call
> xfs_rtallocate_extent_block() if and only if xfs_rtany_summary() returns
> a non-zero maxlog, i.e.:
>
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index 4ab03eafd39f..9d7296c40ddd 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -495,10 +495,6 @@ xfs_rtallocate_extent_near(
> xfs_extlen_t maxavail =
> min_t(xfs_rtblock_t, maxlen,
> (1ULL << (maxlog + 1)) - 1);
> - /*
> - * On the positive side of the starting location.
> - */
> - if (i >= 0) {
> /*
> * Try to allocate an extent starting in
> * this block.
> @@ -517,33 +513,6 @@ xfs_rtallocate_extent_near(
> return 0;
> }
> }
> - /*
> - * On the negative side of the starting location.
> - */
> - else { /* i < 0 */
> - /*
> - * Loop backwards through the bitmap blocks from
> - * where we last checked up to where we are now.
> - * There should be an extent which ends in this
> - * bitmap block and is long enough.
> - */
> - for (; j >= i; j--) {
> - error = xfs_rtallocate_extent_block(mp,
> - tp, bbno + j, minlen, maxavail,
> - len, &n, rtbufc, prod, &r);
> - if (error) {
> - return error;
> - }
> - /*
> - * If it works, return the extent.
> - */
> - if (r != NULLRTBLOCK) {
> - *rtblock = r;
> - return 0;
> - }
> - }
> - }
> - }
> /*
> * Loop control. If we were on the positive side, and there's
> * still more blocks on the negative side, go there.
>
> Then when i < 0, this will only find the _beginning_ of a free extent
> before bbno rather than the apparent goal of trying to allocate as close
> as possible to bbno, i.e., the _end_ of the free extent. (This is what I
> tried to explain in the commit message.)
>
> If instead you mean: unconditionally call xfs_rtallocate_extent_block()
> for bbno + i when i < 0, i.e.:
>
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index 4ab03eafd39f..1cf42910c0e8 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -491,14 +491,10 @@ xfs_rtallocate_extent_near(
> * If there are any useful extents starting here, try
> * allocating one.
> */
> - if (maxlog >= 0) {
> + if (maxlog >= 0 || i < 0) {
> xfs_extlen_t maxavail =
> min_t(xfs_rtblock_t, maxlen,
> (1ULL << (maxlog + 1)) - 1);
> - /*
> - * On the positive side of the starting location.
> - */
> - if (i >= 0) {
> /*
> * Try to allocate an extent starting in
> * this block.
> @@ -517,33 +513,6 @@ xfs_rtallocate_extent_near(
> return 0;
> }
> }
> - /*
> - * On the negative side of the starting location.
> - */
> - else { /* i < 0 */
> - /*
> - * Loop backwards through the bitmap blocks from
> - * where we last checked up to where we are now.
> - * There should be an extent which ends in this
> - * bitmap block and is long enough.
> - */
> - for (; j >= i; j--) {
> - error = xfs_rtallocate_extent_block(mp,
> - tp, bbno + j, minlen, maxavail,
> - len, &n, rtbufc, prod, &r);
> - if (error) {
> - return error;
> - }
> - /*
> - * If it works, return the extent.
> - */
> - if (r != NULLRTBLOCK) {
> - *rtblock = r;
> - return 0;
> - }
> - }
> - }
> - }
> /*
> * Loop control. If we were on the positive side, and there's
> * still more blocks on the negative side, go there.
>
>
> Then this will find the end of the extent, but we will waste a lot of
> time searching bitmap blocks that don't have any usable free space. (In
> fact, this is something that patch 6 tries to reduce further.)
Ping, I hope this clarified things.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/6] xfs: cache last bitmap block in realtime allocator
2023-07-17 18:18 ` Omar Sandoval
@ 2023-08-01 22:48 ` Darrick J. Wong
0 siblings, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2023-08-01 22:48 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-xfs, kernel-team, Prashant Nema
On Mon, Jul 17, 2023 at 11:18:12AM -0700, Omar Sandoval wrote:
> On Wed, Jul 12, 2023 at 11:29:26AM -0700, Darrick J. Wong wrote:
> > On Tue, Jun 20, 2023 at 02:32:11PM -0700, Omar Sandoval wrote:
> > > From: Omar Sandoval <osandov@fb.com>
> > >
> > > Profiling a workload on a highly fragmented realtime device showed a ton
> > > of CPU cycles being spent in xfs_trans_read_buf() called by
> > > xfs_rtbuf_get(). Further tracing showed that much of that was repeated
> > > calls to xfs_rtbuf_get() for the same block of the realtime bitmap.
> > > These come from xfs_rtallocate_extent_block(): as it walks through
> > > ranges of free bits in the bitmap, each call to xfs_rtcheck_range() and
> > > xfs_rtfind_{forw,back}() gets the same bitmap block. If the bitmap block
> > > is very fragmented, then this is _a lot_ of buffer lookups.
> > >
> > > The realtime allocator already passes around a cache of the last used
> > > realtime summary block to avoid repeated reads (the parameters rbpp and
> > > rsb). We can do the same for the realtime bitmap.
> > >
> > > This replaces rbpp and rsb with a struct xfs_rtbuf_cache, which caches
> > > the most recently used block for both the realtime bitmap and summary.
> > > xfs_rtbuf_get() now handles the caching instead of the callers, which
> > > requires plumbing xfs_rtbuf_cache to more functions but also makes sure
> > > we don't miss anything.
> >
> > Hmm. I initially wondered why rtbitmap blocks wouldn't just stay
> > attached to the transaction, but then I realized -- this is the
> > /scanning/ phase that's repeatedly getting and releasing the rtbmp
> > block, right? The allocator hasn't dirtied any rtbitmap blocks yet, so
> > each xfs_trans_brelse actually drops the buffer, and that's how the
> > xfs_rtbuf_get ends up pounding on the bmapi and then the buffer cache?
> > Repeatedly?
>
> That's right.
>
> > > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > > ---
> > > fs/xfs/libxfs/xfs_rtbitmap.c | 167 ++++++++++++++++++-----------------
> > > fs/xfs/xfs_rtalloc.c | 107 ++++++++++------------
> > > fs/xfs/xfs_rtalloc.h | 28 ++++--
> > > 3 files changed, 154 insertions(+), 148 deletions(-)
> > >
> > > diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
> > > index fa180ab66b73..1a832c9a412f 100644
> > > --- a/fs/xfs/libxfs/xfs_rtbitmap.c
> > > +++ b/fs/xfs/libxfs/xfs_rtbitmap.c
> > > @@ -46,6 +46,21 @@ const struct xfs_buf_ops xfs_rtbuf_ops = {
> > > .verify_write = xfs_rtbuf_verify_write,
> > > };
> > >
> > > +static void
> > > +xfs_rtbuf_cache_relse(
> > > + xfs_trans_t *tp, /* transaction pointer */
> >
> > Please don't introduce more typedef usage.
>
> Oops, I was just copying the surrounding code. I'll fix this in my
> changes.
>
> > struct xfs_trans *tp,
> > struct xfs_rtbuf_cache *cache)
> >
> > (Also there's a lot of broken indenting in the lines added by this
> > patch.)
>
> I'll fix it in this new function. Should I also be reindenting all of
> the parameter lists where I added this parameter?
Others are likely to have opinions, but I only care about the grottier
things like mixing tabs and spaces in an indent.
At some point I'll push my patchset to fix all the ambiguous type usages
like fsblock for fileoff, and mixing of rtblock for rtextent and the
like.
> > > + struct xfs_rtbuf_cache *cache) /* cached blocks */
> > > +{
> > > + if (cache->bbuf) {
> > > + xfs_trans_brelse(tp, cache->bbuf);
> > > + cache->bbuf = NULL;
> > > + }
> > > + if (cache->sbuf) {
> > > + xfs_trans_brelse(tp, cache->sbuf);
> > > + cache->sbuf = NULL;
> > > + }
> > > +}
> > > +
> > > /*
> > > * Get a buffer for the bitmap or summary file block specified.
> > > * The buffer is returned read and locked.
> > > @@ -56,14 +71,35 @@ xfs_rtbuf_get(
> > > xfs_trans_t *tp, /* transaction pointer */
> > > xfs_rtblock_t block, /* block number in bitmap or summary */
> > > int issum, /* is summary not bitmap */
> > > + struct xfs_rtbuf_cache *cache, /* in/out: cached blocks */
> > > struct xfs_buf **bpp) /* output: buffer for the block */
> > > {
> > > + struct xfs_buf **cbpp; /* cached block buffer */
> > > + xfs_fsblock_t *cbp; /* cached block number */
> > > struct xfs_buf *bp; /* block buffer, result */
> > > xfs_inode_t *ip; /* bitmap or summary inode */
> > > xfs_bmbt_irec_t map;
> > > int nmap = 1;
> > > int error; /* error value */
> > >
> > > + cbpp = issum ? &cache->bbuf : &cache->sbuf;
> > > + cbp = issum ? &cache->bblock : &cache->sblock;
> > > + /*
> > > + * If we have a cached buffer, and the block number matches, use that.
> > > + */
> > > + if (*cbpp && *cbp == block) {
> > > + *bpp = *cbpp;
> > > + return 0;
> > > + }
> > > + /*
> > > + * Otherwise we have to have to get the buffer. If there was an old
> > > + * one, get rid of it first.
> > > + */
> > > + if (*cbpp) {
> > > + xfs_trans_brelse(tp, *cbpp);
> > > + *cbpp = NULL;
> > > + }
> > > +
> > > ip = issum ? mp->m_rsumip : mp->m_rbmip;
> > >
> > > error = xfs_bmapi_read(ip, block, 1, &map, &nmap, 0);
> > > @@ -82,7 +118,8 @@ xfs_rtbuf_get(
> > >
> > > xfs_trans_buf_set_type(tp, bp, issum ? XFS_BLFT_RTSUMMARY_BUF
> > > : XFS_BLFT_RTBITMAP_BUF);
> > > - *bpp = bp;
> > > + *cbpp = *bpp = bp;
> > > + *cbp = block;
> > > return 0;
> > > }
> > >
> > > @@ -96,6 +133,7 @@ xfs_rtfind_back(
> > > xfs_trans_t *tp, /* transaction pointer */
> > > xfs_rtblock_t start, /* starting block to look at */
> > > xfs_rtblock_t limit, /* last block to look at */
> > > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > > xfs_rtblock_t *rtblock) /* out: start block found */
> > > {
> > > xfs_rtword_t *b; /* current word in buffer */
> > > @@ -116,7 +154,7 @@ xfs_rtfind_back(
> > > * Compute and read in starting bitmap block for starting block.
> > > */
> > > block = XFS_BITTOBLOCK(mp, start);
> > > - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -153,7 +191,6 @@ xfs_rtfind_back(
> > > /*
> > > * Different. Mark where we are and return.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > i = bit - XFS_RTHIBIT(wdiff);
> > > *rtblock = start - i + 1;
> > > return 0;
> > > @@ -167,8 +204,7 @@ xfs_rtfind_back(
> > > /*
> > > * If done with this block, get the previous one.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > - error = xfs_rtbuf_get(mp, tp, --block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, --block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -199,7 +235,6 @@ xfs_rtfind_back(
> > > /*
> > > * Different, mark where we are and return.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > i += XFS_NBWORD - 1 - XFS_RTHIBIT(wdiff);
> > > *rtblock = start - i + 1;
> > > return 0;
> > > @@ -213,8 +248,7 @@ xfs_rtfind_back(
> > > /*
> > > * If done with this block, get the previous one.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > - error = xfs_rtbuf_get(mp, tp, --block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, --block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -246,7 +280,6 @@ xfs_rtfind_back(
> > > /*
> > > * Different, mark where we are and return.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > i += XFS_NBWORD - 1 - XFS_RTHIBIT(wdiff);
> > > *rtblock = start - i + 1;
> > > return 0;
> > > @@ -256,7 +289,6 @@ xfs_rtfind_back(
> > > /*
> > > * No match, return that we scanned the whole area.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > *rtblock = start - i + 1;
> > > return 0;
> > > }
> > > @@ -271,6 +303,7 @@ xfs_rtfind_forw(
> > > xfs_trans_t *tp, /* transaction pointer */
> > > xfs_rtblock_t start, /* starting block to look at */
> > > xfs_rtblock_t limit, /* last block to look at */
> > > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > > xfs_rtblock_t *rtblock) /* out: start block found */
> > > {
> > > xfs_rtword_t *b; /* current word in buffer */
> > > @@ -291,7 +324,7 @@ xfs_rtfind_forw(
> > > * Compute and read in starting bitmap block for starting block.
> > > */
> > > block = XFS_BITTOBLOCK(mp, start);
> > > - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -327,7 +360,6 @@ xfs_rtfind_forw(
> > > /*
> > > * Different. Mark where we are and return.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > i = XFS_RTLOBIT(wdiff) - bit;
> > > *rtblock = start + i - 1;
> > > return 0;
> > > @@ -341,8 +373,7 @@ xfs_rtfind_forw(
> > > /*
> > > * If done with this block, get the previous one.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -372,7 +403,6 @@ xfs_rtfind_forw(
> > > /*
> > > * Different, mark where we are and return.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > i += XFS_RTLOBIT(wdiff);
> > > *rtblock = start + i - 1;
> > > return 0;
> > > @@ -386,8 +416,7 @@ xfs_rtfind_forw(
> > > /*
> > > * If done with this block, get the next one.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -416,7 +445,6 @@ xfs_rtfind_forw(
> > > /*
> > > * Different, mark where we are and return.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > i += XFS_RTLOBIT(wdiff);
> > > *rtblock = start + i - 1;
> > > return 0;
> > > @@ -426,7 +454,6 @@ xfs_rtfind_forw(
> > > /*
> > > * No match, return that we scanned the whole area.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > *rtblock = start + i - 1;
> > > return 0;
> > > }
> > > @@ -447,8 +474,7 @@ xfs_rtmodify_summary_int(
> > > int log, /* log2 of extent size */
> > > xfs_rtblock_t bbno, /* bitmap block number */
> > > int delta, /* change to make to summary info */
> > > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > > xfs_suminfo_t *sum) /* out: summary info for this block */
> > > {
> > > struct xfs_buf *bp; /* buffer for the summary block */
> > > @@ -465,30 +491,9 @@ xfs_rtmodify_summary_int(
> > > * Compute the block number in the summary file.
> > > */
> > > sb = XFS_SUMOFFSTOBLOCK(mp, so);
> > > - /*
> > > - * If we have an old buffer, and the block number matches, use that.
> > > - */
> > > - if (*rbpp && *rsb == sb)
> > > - bp = *rbpp;
> > > - /*
> > > - * Otherwise we have to get the buffer.
> > > - */
> > > - else {
> > > - /*
> > > - * If there was an old one, get rid of it first.
> > > - */
> > > - if (*rbpp)
> > > - xfs_trans_brelse(tp, *rbpp);
> > > - error = xfs_rtbuf_get(mp, tp, sb, 1, &bp);
> > > - if (error) {
> > > - return error;
> > > - }
> > > - /*
> > > - * Remember this buffer and block for the next call.
> > > - */
> > > - *rbpp = bp;
> > > - *rsb = sb;
> > > - }
> > > + error = xfs_rtbuf_get(mp, tp, sb, 1, rtbufc, &bp);
> > > + if (error)
> > > + return error;
> > > /*
> > > * Point to the summary information, modify/log it, and/or copy it out.
> > > */
> > > @@ -517,11 +522,9 @@ xfs_rtmodify_summary(
> > > int log, /* log2 of extent size */
> > > xfs_rtblock_t bbno, /* bitmap block number */
> > > int delta, /* change to make to summary info */
> > > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > > - xfs_fsblock_t *rsb) /* in/out: summary block number */
> > > + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> > > {
> > > - return xfs_rtmodify_summary_int(mp, tp, log, bbno,
> > > - delta, rbpp, rsb, NULL);
> > > + return xfs_rtmodify_summary_int(mp, tp, log, bbno, delta, rtbufc, NULL);
> > > }
> > >
> > > /*
> > > @@ -534,7 +537,8 @@ xfs_rtmodify_range(
> > > xfs_trans_t *tp, /* transaction pointer */
> > > xfs_rtblock_t start, /* starting block to modify */
> > > xfs_extlen_t len, /* length of extent to modify */
> > > - int val) /* 1 for free, 0 for allocated */
> > > + int val, /* 1 for free, 0 for allocated */
> > > + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> > > {
> > > xfs_rtword_t *b; /* current word in buffer */
> > > int bit; /* bit number in the word */
> > > @@ -555,7 +559,7 @@ xfs_rtmodify_range(
> > > /*
> > > * Read the bitmap block, and point to its data.
> > > */
> > > - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -600,7 +604,7 @@ xfs_rtmodify_range(
> > > xfs_trans_log_buf(tp, bp,
> > > (uint)((char *)first - (char *)bufp),
> > > (uint)((char *)b - (char *)bufp));
> > > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -640,7 +644,7 @@ xfs_rtmodify_range(
> > > xfs_trans_log_buf(tp, bp,
> > > (uint)((char *)first - (char *)bufp),
> > > (uint)((char *)b - (char *)bufp));
> > > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -690,8 +694,7 @@ xfs_rtfree_range(
> > > xfs_trans_t *tp, /* transaction pointer */
> > > xfs_rtblock_t start, /* starting block to free */
> > > xfs_extlen_t len, /* length to free */
> > > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > > - xfs_fsblock_t *rsb) /* in/out: summary block number */
> > > + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> > > {
> > > xfs_rtblock_t end; /* end of the freed extent */
> > > int error; /* error value */
> > > @@ -702,7 +705,7 @@ xfs_rtfree_range(
> > > /*
> > > * Modify the bitmap to mark this extent freed.
> > > */
> > > - error = xfs_rtmodify_range(mp, tp, start, len, 1);
> > > + error = xfs_rtmodify_range(mp, tp, start, len, 1, rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -711,7 +714,7 @@ xfs_rtfree_range(
> > > * We need to find the beginning and end of the extent so we can
> > > * properly update the summary.
> > > */
> > > - error = xfs_rtfind_back(mp, tp, start, 0, &preblock);
> > > + error = xfs_rtfind_back(mp, tp, start, 0, rtbufc, &preblock);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -719,7 +722,7 @@ xfs_rtfree_range(
> > > * Find the next allocated block (end of allocated extent).
> > > */
> > > error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1,
> > > - &postblock);
> > > + rtbufc, &postblock);
> > > if (error)
> > > return error;
> > > /*
> > > @@ -729,7 +732,7 @@ xfs_rtfree_range(
> > > if (preblock < start) {
> > > error = xfs_rtmodify_summary(mp, tp,
> > > XFS_RTBLOCKLOG(start - preblock),
> > > - XFS_BITTOBLOCK(mp, preblock), -1, rbpp, rsb);
> > > + XFS_BITTOBLOCK(mp, preblock), -1, rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -741,7 +744,7 @@ xfs_rtfree_range(
> > > if (postblock > end) {
> > > error = xfs_rtmodify_summary(mp, tp,
> > > XFS_RTBLOCKLOG(postblock - end),
> > > - XFS_BITTOBLOCK(mp, end + 1), -1, rbpp, rsb);
> > > + XFS_BITTOBLOCK(mp, end + 1), -1, rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -752,7 +755,7 @@ xfs_rtfree_range(
> > > */
> > > error = xfs_rtmodify_summary(mp, tp,
> > > XFS_RTBLOCKLOG(postblock + 1 - preblock),
> > > - XFS_BITTOBLOCK(mp, preblock), 1, rbpp, rsb);
> > > + XFS_BITTOBLOCK(mp, preblock), 1, rtbufc);
> > > return error;
> > > }
> > >
> > > @@ -767,6 +770,7 @@ xfs_rtcheck_range(
> > > xfs_rtblock_t start, /* starting block number of extent */
> > > xfs_extlen_t len, /* length of extent */
> > > int val, /* 1 for free, 0 for allocated */
> > > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > > xfs_rtblock_t *new, /* out: first block not matching */
> > > int *stat) /* out: 1 for matches, 0 for not */
> > > {
> > > @@ -789,7 +793,7 @@ xfs_rtcheck_range(
> > > /*
> > > * Read the bitmap block.
> > > */
> > > - error = xfs_rtbuf_get(mp, tp, block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -824,7 +828,6 @@ xfs_rtcheck_range(
> > > /*
> > > * Different, compute first wrong bit and return.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > i = XFS_RTLOBIT(wdiff) - bit;
> > > *new = start + i;
> > > *stat = 0;
> > > @@ -839,8 +842,7 @@ xfs_rtcheck_range(
> > > /*
> > > * If done with this block, get the next one.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -870,7 +872,6 @@ xfs_rtcheck_range(
> > > /*
> > > * Different, compute first wrong bit and return.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > i += XFS_RTLOBIT(wdiff);
> > > *new = start + i;
> > > *stat = 0;
> > > @@ -885,8 +886,7 @@ xfs_rtcheck_range(
> > > /*
> > > * If done with this block, get the next one.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > - error = xfs_rtbuf_get(mp, tp, ++block, 0, &bp);
> > > + error = xfs_rtbuf_get(mp, tp, ++block, 0, rtbufc, &bp);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -915,7 +915,6 @@ xfs_rtcheck_range(
> > > /*
> > > * Different, compute first wrong bit and return.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > i += XFS_RTLOBIT(wdiff);
> > > *new = start + i;
> > > *stat = 0;
> > > @@ -926,7 +925,6 @@ xfs_rtcheck_range(
> > > /*
> > > * Successful, return.
> > > */
> > > - xfs_trans_brelse(tp, bp);
> > > *new = start + i;
> > > *stat = 1;
> > > return 0;
> > > @@ -941,20 +939,21 @@ xfs_rtcheck_alloc_range(
> > > xfs_mount_t *mp, /* file system mount point */
> > > xfs_trans_t *tp, /* transaction pointer */
> > > xfs_rtblock_t bno, /* starting block number of extent */
> > > - xfs_extlen_t len) /* length of extent */
> > > + xfs_extlen_t len, /* length of extent */
> > > + struct xfs_rtbuf_cache *rtbufc) /* in/out: cached blocks */
> > > {
> > > xfs_rtblock_t new; /* dummy for xfs_rtcheck_range */
> > > int stat;
> > > int error;
> > >
> > > - error = xfs_rtcheck_range(mp, tp, bno, len, 0, &new, &stat);
> > > + error = xfs_rtcheck_range(mp, tp, bno, len, 0, rtbufc, &new, &stat);
> > > if (error)
> > > return error;
> > > ASSERT(stat);
> > > return 0;
> > > }
> > > #else
> > > -#define xfs_rtcheck_alloc_range(m,t,b,l) (0)
> > > +#define xfs_rtcheck_alloc_range(m,t,b,l,r) (0)
> > > #endif
> > > /*
> > > * Free an extent in the realtime subvolume. Length is expressed in
> > > @@ -968,22 +967,21 @@ xfs_rtfree_extent(
> > > {
> > > int error; /* error value */
> > > xfs_mount_t *mp; /* file system mount structure */
> > > - xfs_fsblock_t sb; /* summary file block number */
> > > - struct xfs_buf *sumbp = NULL; /* summary file block buffer */
> > > + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
> > >
> > > mp = tp->t_mountp;
> > >
> > > ASSERT(mp->m_rbmip->i_itemp != NULL);
> > > ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
> > >
> > > - error = xfs_rtcheck_alloc_range(mp, tp, bno, len);
> > > + error = xfs_rtcheck_alloc_range(mp, tp, bno, len, &rtbufc);
> > > if (error)
> > > return error;
> > >
> > > /*
> > > * Free the range of realtime blocks.
> > > */
> > > - error = xfs_rtfree_range(mp, tp, bno, len, &sumbp, &sb);
> > > + error = xfs_rtfree_range(mp, tp, bno, len, &rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -1021,6 +1019,7 @@ xfs_rtalloc_query_range(
> > > xfs_rtblock_t high_key;
> > > int is_free;
> > > int error = 0;
> > > + struct xfs_rtbuf_cache rtbufc = {};
> > >
> > > if (low_rec->ar_startext > high_rec->ar_startext)
> > > return -EINVAL;
> > > @@ -1034,13 +1033,14 @@ xfs_rtalloc_query_range(
> > > rtstart = low_rec->ar_startext;
> > > while (rtstart <= high_key) {
> > > /* Is the first block free? */
> > > - error = xfs_rtcheck_range(mp, tp, rtstart, 1, 1, &rtend,
> > > - &is_free);
> > > + error = xfs_rtcheck_range(mp, tp, rtstart, 1, 1, &rtbufc,
> > > + &rtend, &is_free);
> > > if (error)
> > > break;
> > >
> > > /* How long does the extent go for? */
> > > - error = xfs_rtfind_forw(mp, tp, rtstart, high_key, &rtend);
> > > + error = xfs_rtfind_forw(mp, tp, rtstart, high_key, &rtbufc,
> > > + &rtend);
> > > if (error)
> > > break;
> > >
> > > @@ -1056,6 +1056,8 @@ xfs_rtalloc_query_range(
> > > rtstart = rtend + 1;
> > > }
> > >
> > > + xfs_rtbuf_cache_relse(tp, &rtbufc);
> > > +
> > > return error;
> > > }
> > >
> > > @@ -1085,11 +1087,14 @@ xfs_rtalloc_extent_is_free(
> > > xfs_extlen_t len,
> > > bool *is_free)
> > > {
> > > + struct xfs_rtbuf_cache rtbufc = {};
> > > xfs_rtblock_t end;
> > > int matches;
> > > int error;
> > >
> > > - error = xfs_rtcheck_range(mp, tp, start, len, 1, &end, &matches);
> > > + error = xfs_rtcheck_range(mp, tp, start, len, 1, &rtbufc, &end,
> > > + &matches);
> > > + xfs_rtbuf_cache_relse(tp, &rtbufc);
> > > if (error)
> > > return error;
> > >
> > > diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> > > index 16534e9873f6..61ef13286654 100644
> > > --- a/fs/xfs/xfs_rtalloc.c
> > > +++ b/fs/xfs/xfs_rtalloc.c
> > > @@ -32,11 +32,10 @@ xfs_rtget_summary(
> > > xfs_trans_t *tp, /* transaction pointer */
> > > int log, /* log2 of extent size */
> > > xfs_rtblock_t bbno, /* bitmap block number */
> > > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > > xfs_suminfo_t *sum) /* out: summary info for this block */
> > > {
> > > - return xfs_rtmodify_summary_int(mp, tp, log, bbno, 0, rbpp, rsb, sum);
> > > + return xfs_rtmodify_summary_int(mp, tp, log, bbno, 0, rtbufc, sum);
> > > }
> > >
> > > /*
> > > @@ -50,8 +49,7 @@ xfs_rtany_summary(
> > > int low, /* low log2 extent size */
> > > int high, /* high log2 extent size */
> > > xfs_rtblock_t bbno, /* bitmap block number */
> > > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > > int *stat) /* out: any good extents here? */
> > > {
> > > int error; /* error value */
> > > @@ -69,7 +67,7 @@ xfs_rtany_summary(
> > > /*
> > > * Get one summary datum.
> > > */
> > > - error = xfs_rtget_summary(mp, tp, log, bbno, rbpp, rsb, &sum);
> > > + error = xfs_rtget_summary(mp, tp, log, bbno, rtbufc, &sum);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -104,29 +102,27 @@ xfs_rtcopy_summary(
> > > xfs_trans_t *tp) /* transaction pointer */
> > > {
> > > xfs_rtblock_t bbno; /* bitmap block number */
> > > - struct xfs_buf *bp; /* summary buffer */
> > > + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
> > > int error; /* error return value */
> > > int log; /* summary level number (log length) */
> > > xfs_suminfo_t sum; /* summary data */
> > > - xfs_fsblock_t sumbno; /* summary block number */
> > >
> > > - bp = NULL;
> > > for (log = omp->m_rsumlevels - 1; log >= 0; log--) {
> > > for (bbno = omp->m_sb.sb_rbmblocks - 1;
> > > (xfs_srtblock_t)bbno >= 0;
> > > bbno--) {
> > > - error = xfs_rtget_summary(omp, tp, log, bbno, &bp,
> > > - &sumbno, &sum);
> > > + error = xfs_rtget_summary(omp, tp, log, bbno, &rtbufc,
> > > + &sum);
> > > if (error)
> > > return error;
> > > if (sum == 0)
> > > continue;
> > > error = xfs_rtmodify_summary(omp, tp, log, bbno, -sum,
> > > - &bp, &sumbno);
> > > + &rtbufc);
> > > if (error)
> > > return error;
> > > error = xfs_rtmodify_summary(nmp, tp, log, bbno, sum,
> > > - &bp, &sumbno);
> > > + &rtbufc);
> > > if (error)
> > > return error;
> > > ASSERT(sum > 0);
> > > @@ -144,8 +140,7 @@ xfs_rtallocate_range(
> > > xfs_trans_t *tp, /* transaction pointer */
> > > xfs_rtblock_t start, /* start block to allocate */
> > > xfs_extlen_t len, /* length to allocate */
> > > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > > - xfs_fsblock_t *rsb) /* in/out: summary block number */
> > > + struct xfs_rtbuf_cache *rtbufc) /* in/out: cache of realtime blocks */
> > > {
> > > xfs_rtblock_t end; /* end of the allocated extent */
> > > int error; /* error value */
> > > @@ -158,14 +153,14 @@ xfs_rtallocate_range(
> > > * We need to find the beginning and end of the extent so we can
> > > * properly update the summary.
> > > */
> > > - error = xfs_rtfind_back(mp, tp, start, 0, &preblock);
> > > + error = xfs_rtfind_back(mp, tp, start, 0, rtbufc, &preblock);
> > > if (error) {
> > > return error;
> > > }
> > > /*
> > > * Find the next allocated block (end of free extent).
> > > */
> > > - error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1,
> > > + error = xfs_rtfind_forw(mp, tp, end, mp->m_sb.sb_rextents - 1, rtbufc,
> > > &postblock);
> > > if (error) {
> > > return error;
> > > @@ -176,7 +171,7 @@ xfs_rtallocate_range(
> > > */
> > > error = xfs_rtmodify_summary(mp, tp,
> > > XFS_RTBLOCKLOG(postblock + 1 - preblock),
> > > - XFS_BITTOBLOCK(mp, preblock), -1, rbpp, rsb);
> > > + XFS_BITTOBLOCK(mp, preblock), -1, rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -187,7 +182,7 @@ xfs_rtallocate_range(
> > > if (preblock < start) {
> > > error = xfs_rtmodify_summary(mp, tp,
> > > XFS_RTBLOCKLOG(start - preblock),
> > > - XFS_BITTOBLOCK(mp, preblock), 1, rbpp, rsb);
> > > + XFS_BITTOBLOCK(mp, preblock), 1, rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -199,7 +194,7 @@ xfs_rtallocate_range(
> > > if (postblock > end) {
> > > error = xfs_rtmodify_summary(mp, tp,
> > > XFS_RTBLOCKLOG(postblock - end),
> > > - XFS_BITTOBLOCK(mp, end + 1), 1, rbpp, rsb);
> > > + XFS_BITTOBLOCK(mp, end + 1), 1, rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -207,7 +202,7 @@ xfs_rtallocate_range(
> > > /*
> > > * Modify the bitmap to mark this extent allocated.
> > > */
> > > - error = xfs_rtmodify_range(mp, tp, start, len, 0);
> > > + error = xfs_rtmodify_range(mp, tp, start, len, 0, rtbufc);
> > > return error;
> > > }
> > >
> > > @@ -226,8 +221,7 @@ xfs_rtallocate_extent_block(
> > > xfs_extlen_t maxlen, /* maximum length to allocate */
> > > xfs_extlen_t *len, /* out: actual length allocated */
> > > xfs_rtblock_t *nextp, /* out: next block to try */
> > > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > > xfs_extlen_t prod, /* extent product factor */
> > > xfs_rtblock_t *rtblock) /* out: start block allocated */
> > > {
> > > @@ -254,7 +248,8 @@ xfs_rtallocate_extent_block(
> > > * See if there's a free extent of maxlen starting at i.
> > > * If it's not so then next will contain the first non-free.
> > > */
> > > - error = xfs_rtcheck_range(mp, tp, i, maxlen, 1, &next, &stat);
> > > + error = xfs_rtcheck_range(mp, tp, i, maxlen, 1, rtbufc, &next,
> > > + &stat);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -262,8 +257,7 @@ xfs_rtallocate_extent_block(
> > > /*
> > > * i for maxlen is all free, allocate and return that.
> > > */
> > > - error = xfs_rtallocate_range(mp, tp, i, maxlen, rbpp,
> > > - rsb);
> > > + error = xfs_rtallocate_range(mp, tp, i, maxlen, rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -290,7 +284,7 @@ xfs_rtallocate_extent_block(
> > > * If not done yet, find the start of the next free space.
> > > */
> > > if (next < end) {
> > > - error = xfs_rtfind_forw(mp, tp, next, end, &i);
> > > + error = xfs_rtfind_forw(mp, tp, next, end, rtbufc, &i);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -315,7 +309,7 @@ xfs_rtallocate_extent_block(
> > > /*
> > > * Allocate besti for bestlen & return that.
> > > */
> > > - error = xfs_rtallocate_range(mp, tp, besti, bestlen, rbpp, rsb);
> > > + error = xfs_rtallocate_range(mp, tp, besti, bestlen, rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -344,9 +338,8 @@ xfs_rtallocate_extent_exact(
> > > xfs_rtblock_t bno, /* starting block number to allocate */
> > > xfs_extlen_t minlen, /* minimum length to allocate */
> > > xfs_extlen_t maxlen, /* maximum length to allocate */
> > > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > > xfs_extlen_t *len, /* out: actual length allocated */
> > > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > > xfs_extlen_t prod, /* extent product factor */
> > > xfs_rtblock_t *rtblock) /* out: start block allocated */
> > > {
> > > @@ -359,7 +352,8 @@ xfs_rtallocate_extent_exact(
> > > /*
> > > * Check if the range in question (for maxlen) is free.
> > > */
> > > - error = xfs_rtcheck_range(mp, tp, bno, maxlen, 1, &next, &isfree);
> > > + error = xfs_rtcheck_range(mp, tp, bno, maxlen, 1, rtbufc, &next,
> > > + &isfree);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -367,7 +361,7 @@ xfs_rtallocate_extent_exact(
> > > /*
> > > * If it is, allocate it and return success.
> > > */
> > > - error = xfs_rtallocate_range(mp, tp, bno, maxlen, rbpp, rsb);
> > > + error = xfs_rtallocate_range(mp, tp, bno, maxlen, rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -402,7 +396,7 @@ xfs_rtallocate_extent_exact(
> > > /*
> > > * Allocate what we can and return it.
> > > */
> > > - error = xfs_rtallocate_range(mp, tp, bno, maxlen, rbpp, rsb);
> > > + error = xfs_rtallocate_range(mp, tp, bno, maxlen, rtbufc);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -424,8 +418,7 @@ xfs_rtallocate_extent_near(
> > > xfs_extlen_t minlen, /* minimum length to allocate */
> > > xfs_extlen_t maxlen, /* maximum length to allocate */
> > > xfs_extlen_t *len, /* out: actual length allocated */
> > > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > > xfs_extlen_t prod, /* extent product factor */
> > > xfs_rtblock_t *rtblock) /* out: start block allocated */
> > > {
> > > @@ -456,8 +449,8 @@ xfs_rtallocate_extent_near(
> > > /*
> > > * Try the exact allocation first.
> > > */
> > > - error = xfs_rtallocate_extent_exact(mp, tp, bno, minlen, maxlen, len,
> > > - rbpp, rsb, prod, &r);
> > > + error = xfs_rtallocate_extent_exact(mp, tp, bno, minlen, maxlen, rtbufc,
> > > + len, prod, &r);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -481,7 +474,7 @@ xfs_rtallocate_extent_near(
> > > * starting in this bitmap block.
> > > */
> > > error = xfs_rtany_summary(mp, tp, log2len, mp->m_rsumlevels - 1,
> > > - bbno + i, rbpp, rsb, &any);
> > > + bbno + i, rtbufc, &any);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -499,8 +492,8 @@ xfs_rtallocate_extent_near(
> > > * this block.
> > > */
> > > error = xfs_rtallocate_extent_block(mp, tp,
> > > - bbno + i, minlen, maxlen, len, &n, rbpp,
> > > - rsb, prod, &r);
> > > + bbno + i, minlen, maxlen, len, &n,
> > > + rtbufc, prod, &r);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -529,7 +522,7 @@ xfs_rtallocate_extent_near(
> > > */
> > > error = xfs_rtany_summary(mp, tp,
> > > log2len, mp->m_rsumlevels - 1,
> > > - bbno + j, rbpp, rsb, &any);
> > > + bbno + j, rtbufc, &any);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -545,7 +538,7 @@ xfs_rtallocate_extent_near(
> > > continue;
> > > error = xfs_rtallocate_extent_block(mp,
> > > tp, bbno + j, minlen, maxlen,
> > > - len, &n, rbpp, rsb, prod, &r);
> > > + len, &n, rtbufc, prod, &r);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -566,8 +559,8 @@ xfs_rtallocate_extent_near(
> > > * that we found.
> > > */
> > > error = xfs_rtallocate_extent_block(mp, tp,
> > > - bbno + i, minlen, maxlen, len, &n, rbpp,
> > > - rsb, prod, &r);
> > > + bbno + i, minlen, maxlen, len, &n,
> > > + rtbufc, prod, &r);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -626,8 +619,7 @@ xfs_rtallocate_extent_size(
> > > xfs_extlen_t minlen, /* minimum length to allocate */
> > > xfs_extlen_t maxlen, /* maximum length to allocate */
> > > xfs_extlen_t *len, /* out: actual length allocated */
> > > - struct xfs_buf **rbpp, /* in/out: summary block buffer */
> > > - xfs_fsblock_t *rsb, /* in/out: summary block number */
> > > + struct xfs_rtbuf_cache *rtbufc, /* in/out: cache of realtime blocks */
> > > xfs_extlen_t prod, /* extent product factor */
> > > xfs_rtblock_t *rtblock) /* out: start block allocated */
> > > {
> > > @@ -656,7 +648,7 @@ xfs_rtallocate_extent_size(
> > > /*
> > > * Get the summary for this level/block.
> > > */
> > > - error = xfs_rtget_summary(mp, tp, l, i, rbpp, rsb,
> > > + error = xfs_rtget_summary(mp, tp, l, i, rtbufc,
> > > &sum);
> > > if (error) {
> > > return error;
> > > @@ -670,7 +662,7 @@ xfs_rtallocate_extent_size(
> > > * Try allocating the extent.
> > > */
> > > error = xfs_rtallocate_extent_block(mp, tp, i, maxlen,
> > > - maxlen, len, &n, rbpp, rsb, prod, &r);
> > > + maxlen, len, &n, rtbufc, prod, &r);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -715,7 +707,7 @@ xfs_rtallocate_extent_size(
> > > /*
> > > * Get the summary information for this level/block.
> > > */
> > > - error = xfs_rtget_summary(mp, tp, l, i, rbpp, rsb,
> > > + error = xfs_rtget_summary(mp, tp, l, i, rtbufc,
> > > &sum);
> > > if (error) {
> > > return error;
> > > @@ -733,7 +725,7 @@ xfs_rtallocate_extent_size(
> > > error = xfs_rtallocate_extent_block(mp, tp, i,
> > > XFS_RTMAX(minlen, 1 << l),
> > > XFS_RTMIN(maxlen, (1 << (l + 1)) - 1),
> > > - len, &n, rbpp, rsb, prod, &r);
> > > + len, &n, rtbufc, prod, &r);
> > > if (error) {
> > > return error;
> > > }
> > > @@ -922,7 +914,6 @@ xfs_growfs_rt(
> > > xfs_extlen_t rbmblocks; /* current number of rt bitmap blocks */
> > > xfs_extlen_t rsumblocks; /* current number of rt summary blks */
> > > xfs_sb_t *sbp; /* old superblock */
> > > - xfs_fsblock_t sumbno; /* summary block number */
> > > uint8_t *rsum_cache; /* old summary cache */
> > >
> > > sbp = &mp->m_sb;
> > > @@ -1025,6 +1016,7 @@ xfs_growfs_rt(
> > > bmbno++) {
> > > struct xfs_trans *tp;
> > > xfs_rfsblock_t nrblocks_step;
> > > + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
> > >
> > > *nmp = *mp;
> > > nsbp = &nmp->m_sb;
> > > @@ -1111,9 +1103,8 @@ xfs_growfs_rt(
> > > /*
> > > * Free new extent.
> > > */
> > > - bp = NULL;
> > > error = xfs_rtfree_range(nmp, tp, sbp->sb_rextents,
> > > - nsbp->sb_rextents - sbp->sb_rextents, &bp, &sumbno);
> > > + nsbp->sb_rextents - sbp->sb_rextents, &rtbufc);
> > > if (error) {
> > > error_cancel:
> > > xfs_trans_cancel(tp);
> > > @@ -1185,8 +1176,7 @@ xfs_rtallocate_extent(
> > > xfs_mount_t *mp = tp->t_mountp;
> > > int error; /* error value */
> > > xfs_rtblock_t r; /* result allocated block */
> > > - xfs_fsblock_t sb; /* summary file block number */
> > > - struct xfs_buf *sumbp; /* summary file block buffer */
> > > + struct xfs_rtbuf_cache rtbufc = {}; /* cache of realtime blocks */
> > >
> > > ASSERT(xfs_isilocked(mp->m_rbmip, XFS_ILOCK_EXCL));
> > > ASSERT(minlen > 0 && minlen <= maxlen);
> > > @@ -1208,13 +1198,14 @@ xfs_rtallocate_extent(
> > > }
> > >
> > > retry:
> > > - sumbp = NULL;
> > > + rtbufc.bbuf = NULL;
> > > + rtbufc.sbuf = NULL;
> >
> > xfs_rtbuf_cache_relse?
>
> In this case, the cached blocks have been dirtied and attached to the
> transaction, so it's not strictly necessary (which is also why the old
> code didn't call xfs_trans_brelse()). But it's probably clearer to just
> use xfs_rtbuf_cache_relse() anyways rather than needing to justify that.
>
> > > if (bno == 0) {
> > > error = xfs_rtallocate_extent_size(mp, tp, minlen, maxlen, len,
> > > - &sumbp, &sb, prod, &r);
> > > + &rtbufc, prod, &r);
> > > } else {
> > > error = xfs_rtallocate_extent_near(mp, tp, bno, minlen, maxlen,
> > > - len, &sumbp, &sb, prod, &r);
> > > + len, &rtbufc, prod, &r);
> > > }
> > >
> > > if (error)
> > > diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h
> > > index 62c7ad79cbb6..888552c4f287 100644
> > > --- a/fs/xfs/xfs_rtalloc.h
> > > +++ b/fs/xfs/xfs_rtalloc.h
> > > @@ -101,29 +101,39 @@ xfs_growfs_rt(
> > > /*
> > > * From xfs_rtbitmap.c
> > > */
> > > +struct xfs_rtbuf_cache {
> > > + struct xfs_buf *bbuf; /* bitmap block buffer */
> > > + xfs_fsblock_t bblock; /* bitmap block number */
> > > + struct xfs_buf *sbuf; /* summary block buffer */
> > > + xfs_fsblock_t sblock; /* summary block number */
> >
> > These are really offsets into the data fork, right? They ought to be
> > xfs_fileoff_t, except that none of the rtalloc code uses the correct
> > type?
>
> Yup, once again I was mimicking the surrounding code. I can fix it here.
Ok.
> Thanks!
NP. Sorry I've been slow at getting back to this. :/
--D
>
> > And yes I do want to merge the fixes for that:
> > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=clean-up-realtime-units
> >
> > --D
> >
> > > +};
> > > +
> > > int xfs_rtbuf_get(struct xfs_mount *mp, struct xfs_trans *tp,
> > > - xfs_rtblock_t block, int issum, struct xfs_buf **bpp);
> > > + xfs_rtblock_t block, int issum, struct xfs_rtbuf_cache *cache,
> > > + struct xfs_buf **bpp);
> > > int xfs_rtcheck_range(struct xfs_mount *mp, struct xfs_trans *tp,
> > > xfs_rtblock_t start, xfs_extlen_t len, int val,
> > > - xfs_rtblock_t *new, int *stat);
> > > + struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *new,
> > > + int *stat);
> > > int xfs_rtfind_back(struct xfs_mount *mp, struct xfs_trans *tp,
> > > xfs_rtblock_t start, xfs_rtblock_t limit,
> > > - xfs_rtblock_t *rtblock);
> > > + struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *rtblock);
> > > int xfs_rtfind_forw(struct xfs_mount *mp, struct xfs_trans *tp,
> > > xfs_rtblock_t start, xfs_rtblock_t limit,
> > > - xfs_rtblock_t *rtblock);
> > > + struct xfs_rtbuf_cache *rtbufc, xfs_rtblock_t *rtblock);
> > > int xfs_rtmodify_range(struct xfs_mount *mp, struct xfs_trans *tp,
> > > - xfs_rtblock_t start, xfs_extlen_t len, int val);
> > > + xfs_rtblock_t start, xfs_extlen_t len, int val,
> > > + struct xfs_rtbuf_cache *rtbufc);
> > > int xfs_rtmodify_summary_int(struct xfs_mount *mp, struct xfs_trans *tp,
> > > int log, xfs_rtblock_t bbno, int delta,
> > > - struct xfs_buf **rbpp, xfs_fsblock_t *rsb,
> > > + struct xfs_rtbuf_cache *rtbufc,
> > > xfs_suminfo_t *sum);
> > > int xfs_rtmodify_summary(struct xfs_mount *mp, struct xfs_trans *tp, int log,
> > > - xfs_rtblock_t bbno, int delta, struct xfs_buf **rbpp,
> > > - xfs_fsblock_t *rsb);
> > > + xfs_rtblock_t bbno, int delta,
> > > + struct xfs_rtbuf_cache *rtbufc);
> > > int xfs_rtfree_range(struct xfs_mount *mp, struct xfs_trans *tp,
> > > xfs_rtblock_t start, xfs_extlen_t len,
> > > - struct xfs_buf **rbpp, xfs_fsblock_t *rsb);
> > > + struct xfs_rtbuf_cache *rtbufc);
> > > int xfs_rtalloc_query_range(struct xfs_mount *mp, struct xfs_trans *tp,
> > > const struct xfs_rtalloc_rec *low_rec,
> > > const struct xfs_rtalloc_rec *high_rec,
> > > --
> > > 2.41.0
> > >
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 5/6] xfs: don't try redundant allocations in xfs_rtallocate_extent_near()
2023-07-17 21:06 ` Omar Sandoval
2023-07-31 20:58 ` Omar Sandoval
@ 2023-08-01 23:00 ` Darrick J. Wong
1 sibling, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2023-08-01 23:00 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-xfs, kernel-team, Prashant Nema
On Mon, Jul 17, 2023 at 02:06:36PM -0700, Omar Sandoval wrote:
> On Wed, Jul 12, 2023 at 04:34:03PM -0700, Darrick J. Wong wrote:
> > On Tue, Jun 20, 2023 at 02:32:15PM -0700, Omar Sandoval wrote:
> > > From: Omar Sandoval <osandov@fb.com>
> > >
> > > xfs_rtallocate_extent_near() tries to find a free extent as close to a
> > > target bitmap block given by bbno as possible, which may be before or
> > > after bbno. Searching backwards has a complication: the realtime summary
> > > accounts for free space _starting_ in a bitmap block, but not straddling
> > > or ending in a bitmap block. So, when the negative search finds a free
> > > extent in the realtime summary, in order to end up closer to the target,
> > > it looks for the end of the free extent. For example, if bbno - 2 has a
> > > free extent, then it will check bbno - 1, then bbno - 2. But then if
> > > bbno - 3 has a free extent, it will check bbno - 1 again, then bbno - 2
> > > again, and then bbno - 3. This results in a quadratic loop, which is
> > > completely pointless since the repeated checks won't find anything new.
> > >
> > > Fix it by remembering where we last checked up to and continue from
> > > there. This also obviates the need for a check of the realtime summary.
> > >
> > > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > > ---
> > > fs/xfs/xfs_rtalloc.c | 46 +++-----------------------------------------
> > > 1 file changed, 3 insertions(+), 43 deletions(-)
> > >
> > > diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> > > index d079dfb77c73..4d9d0be2e616 100644
> > > --- a/fs/xfs/xfs_rtalloc.c
> > > +++ b/fs/xfs/xfs_rtalloc.c
> > > @@ -468,6 +468,7 @@ xfs_rtallocate_extent_near(
> > > }
> > > bbno = XFS_BITTOBLOCK(mp, bno);
> > > i = 0;
> > > + j = -1;
> > > ASSERT(minlen != 0);
> > > log2len = xfs_highbit32(minlen);
> > > /*
> > > @@ -518,31 +519,11 @@ xfs_rtallocate_extent_near(
> > > else { /* i < 0 */
> > > /*
> > > * Loop backwards through the bitmap blocks from
> > > - * the starting point-1 up to where we are now.
> > > + * where we last checked up to where we are now.
> >
> > I find this comment a bit unclear -- aren't we looping backwards from
> > where we last checked *downwards*? I was reading "where we are now" to
> > mean @i, which contains a negative value.
>
> Yes, "where we last checked down to where we are now" might be better
> wording.
<nod>
> > "When @i is negative, we try to find a free extent that starts in the
> > bitmap blocks before bbno. Starting from the last bitmap block that we
> > checked in a negative scan (initially bbno - 1) and walking downwards
> > towards (bbno + i), try to allocate an extent of satisfactory length."
> >
> > But now having worked my way through that, now I'm wondering what the j
> > loop is even doing. Doesn't the sequence of blocks that we call
> > xfs_rtallocate_extent_block on alternate backwards and forwards? e.g.
> >
> > Try to find a satisfactory free extent that starts in:
> >
> > bbno
> > bbno + 1
> > bbno - 1
> > bbno + 2
> > bbno - 2
> > ...
> > etc?
> >
> > Why not avoid the loop entirely by calling xfs_rtallocate_extent_block
> > on bbno + i once before switching back to positive @i? What am I
> > missing here?
>
> There are two ways I can think of to remove the j loop, so I'll address
> both.
>
> If you mean: make the i >= 0 and i < 0 branches the same and call
> xfs_rtallocate_extent_block() if and only if xfs_rtany_summary() returns
> a non-zero maxlog, i.e.:
>
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index 4ab03eafd39f..9d7296c40ddd 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -495,10 +495,6 @@ xfs_rtallocate_extent_near(
> xfs_extlen_t maxavail =
> min_t(xfs_rtblock_t, maxlen,
> (1ULL << (maxlog + 1)) - 1);
> - /*
> - * On the positive side of the starting location.
> - */
> - if (i >= 0) {
> /*
> * Try to allocate an extent starting in
> * this block.
> @@ -517,33 +513,6 @@ xfs_rtallocate_extent_near(
> return 0;
> }
> }
> - /*
> - * On the negative side of the starting location.
> - */
> - else { /* i < 0 */
> - /*
> - * Loop backwards through the bitmap blocks from
> - * where we last checked up to where we are now.
> - * There should be an extent which ends in this
> - * bitmap block and is long enough.
> - */
> - for (; j >= i; j--) {
> - error = xfs_rtallocate_extent_block(mp,
> - tp, bbno + j, minlen, maxavail,
> - len, &n, rtbufc, prod, &r);
> - if (error) {
> - return error;
> - }
> - /*
> - * If it works, return the extent.
> - */
> - if (r != NULLRTBLOCK) {
> - *rtblock = r;
> - return 0;
> - }
> - }
> - }
> - }
> /*
> * Loop control. If we were on the positive side, and there's
> * still more blocks on the negative side, go there.
>
> Then when i < 0, this will only find the _beginning_ of a free extent
> before bbno rather than the apparent goal of trying to allocate as close
> as possible to bbno, i.e., the _end_ of the free extent. (This is what I
> tried to explain in the commit message.)
Hmm. It's hard to remember what I was thinking 15 minutes ago let alone
2 weeks ago, but I /think/ this is what I was driving at. I agree with
your statement that this would find any free extent starting before
bbno, instead of a free extent *ending* as close as possible to bbno.
I now understand what this patch is trying to accomplish, and it looks
good to me, modulo whatever comment changes you want to make. :)
> If instead you mean: unconditionally call xfs_rtallocate_extent_block()
> for bbno + i when i < 0, i.e.:
>
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index 4ab03eafd39f..1cf42910c0e8 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -491,14 +491,10 @@ xfs_rtallocate_extent_near(
> * If there are any useful extents starting here, try
> * allocating one.
> */
> - if (maxlog >= 0) {
> + if (maxlog >= 0 || i < 0) {
> xfs_extlen_t maxavail =
> min_t(xfs_rtblock_t, maxlen,
> (1ULL << (maxlog + 1)) - 1);
> - /*
> - * On the positive side of the starting location.
> - */
> - if (i >= 0) {
> /*
> * Try to allocate an extent starting in
> * this block.
> @@ -517,33 +513,6 @@ xfs_rtallocate_extent_near(
> return 0;
> }
> }
> - /*
> - * On the negative side of the starting location.
> - */
> - else { /* i < 0 */
> - /*
> - * Loop backwards through the bitmap blocks from
> - * where we last checked up to where we are now.
> - * There should be an extent which ends in this
> - * bitmap block and is long enough.
> - */
> - for (; j >= i; j--) {
> - error = xfs_rtallocate_extent_block(mp,
> - tp, bbno + j, minlen, maxavail,
> - len, &n, rtbufc, prod, &r);
> - if (error) {
> - return error;
> - }
> - /*
> - * If it works, return the extent.
> - */
> - if (r != NULLRTBLOCK) {
> - *rtblock = r;
> - return 0;
> - }
> - }
> - }
> - }
> /*
> * Loop control. If we were on the positive side, and there's
> * still more blocks on the negative side, go there.
>
>
> Then this will find the end of the extent, but we will waste a lot of
> time searching bitmap blocks that don't have any usable free space. (In
> fact, this is something that patch 6 tries to reduce further.)
<nod>
> > > * There should be an extent which ends in this
> > > * bitmap block and is long enough.
> > > */
> > > - for (j = -1; j > i; j--) {
> > > - /*
> > > - * Grab the summary information for
> > > - * this bitmap block.
> > > - */
> > > - error = xfs_rtany_summary(mp, tp,
> > > - log2len, mp->m_rsumlevels - 1,
> > > - bbno + j, rtbufc, &maxlog);
> > > - if (error) {
> > > - return error;
> > > - }
> > > - /*
> > > - * If there's no extent given in the
> > > - * summary that means the extent we
> > > - * found must carry over from an
> > > - * earlier block. If there is an
> > > - * extent given, we've already tried
> > > - * that allocation, don't do it again.
> > > - */
> > > - if (maxlog >= 0)
> > > - continue;
> > > + for (; j >= i; j--) {
> >
> > Changing the j > i to j >= i is what obviates the extra call to
> > xfs_rtallocate_extent_block below, correct?
>
> Correct. Before, the loop body was different because it contained a call
> to xfs_rtany_summary(). But now there's no check, so the extra call can
> be included in the loop.
Ok, thanks. That makes sense to me now.
--D
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/6] xfs: invert the realtime summary cache
2023-07-17 19:54 ` Omar Sandoval
@ 2023-08-01 23:17 ` Darrick J. Wong
0 siblings, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2023-08-01 23:17 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-xfs, kernel-team, Prashant Nema
On Mon, Jul 17, 2023 at 12:54:24PM -0700, Omar Sandoval wrote:
> On Wed, Jul 12, 2023 at 03:40:01PM -0700, Darrick J. Wong wrote:
> > On Tue, Jun 20, 2023 at 02:32:12PM -0700, Omar Sandoval wrote:
> > > From: Omar Sandoval <osandov@fb.com>
> > >
> > > In commit 355e3532132b ("xfs: cache minimum realtime summary level"), I
> > > added a cache of the minimum level of the realtime summary that has any
> > > free extents. However, it turns out that the _maximum_ level is more
> > > useful for upcoming optimizations, and basically equivalent for the
> > > existing usage. So, let's change the meaning of the cache to be the
> > > maximum level + 1, or 0 if there are no free extents.
> >
> > Hmm. If I'm reading xfs_rtmodify_summary_int right, m_rsum_cache[b] now
> > tells us the maximum log2(length) of the free extents starting in
> > rtbitmap block b?
> >
> > IOWs, let's say the cache contents are:
> >
> > {2, 3, 2, 15, 8}
> >
> > Someone asks for a 400rtx (realtime extent) allocation, so we want to
> > find a free space of at least magnitude floor(log2(400)) == 8.
> >
> > The cache tells us that there aren't any free extents longer than 2^1
> > blocks in rtbitmap blocks 0 and 2; longer than 2^2 blocks in rtbmp block
> > 1; longer than 2^7 blocks in rtbmp block 4; nor longer than 2^14 blocks
> > in rtbmp block 3?
>
> There's a potential for an off-by-one bug here, so just to make sure
> we're saying the same thing: the realtime summary for level n contains
> the number of free extents starting in a bitmap block such that
> floor(log2(size_in_realtime_extents)) == n. The maximum size of a free
> extent in level n is therefore 2^(n + 1) - 1 realtime extents.
>
> So in your example, the cache is telling us that realtime bitmap blocks
> 0 and 2 don't have anything free in levels 2 or above, and therefore
> don't have any free extents longer than _or equal to_ 2^2.
D'oh. I forgot that subtlety that the maximum size of a free
extent in level n is therefore 2^(n + 1) - 1 realtime extents.
> I'll try to reword the commit message and comments to make this
> unambiguous.
Ok, thanks. A couple of quick examples (feel free to use mine) would be
helpful for descrambling my brain. :)
--D
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 6/6] xfs: don't look for end of extent further than necessary in xfs_rtallocate_extent_near()
2023-06-20 21:32 ` [PATCH 6/6] xfs: don't look for end of extent further than necessary " Omar Sandoval
@ 2023-08-01 23:40 ` Darrick J. Wong
0 siblings, 0 replies; 23+ messages in thread
From: Darrick J. Wong @ 2023-08-01 23:40 UTC (permalink / raw)
To: Omar Sandoval; +Cc: linux-xfs, kernel-team, Prashant Nema
On Tue, Jun 20, 2023 at 02:32:16PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> As explained in the previous commit, xfs_rtallocate_extent_near() looks
> for the end of a free extent when searching backwards from the target
> bitmap block. Since the previous commit, it searches from the last
> bitmap block it checked to the bitmap block containing the start of the
> extent.
>
> This may still be more than necessary, since the free extent may not be
> that long. We know the maximum size of the free extent from the realtime
> summary. Use that to compute how many bitmap blocks we actually need to
> check.
>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
> fs/xfs/xfs_rtalloc.c | 25 +++++++++++++++++++++----
> 1 file changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
> index 4d9d0be2e616..2e2eb7c4a648 100644
> --- a/fs/xfs/xfs_rtalloc.c
> +++ b/fs/xfs/xfs_rtalloc.c
> @@ -517,12 +517,29 @@ xfs_rtallocate_extent_near(
> * On the negative side of the starting location.
> */
> else { /* i < 0 */
> + int maxblocks;
> +
> /*
> - * Loop backwards through the bitmap blocks from
> - * where we last checked up to where we are now.
> - * There should be an extent which ends in this
> - * bitmap block and is long enough.
> + * Loop backwards to find the end of the extent
> + * we found in the realtime summary.
> + *
> + * maxblocks is the maximum possible number of
> + * bitmap blocks from the start of the extent to
> + * the end of the extent.
> */
> + if (maxlog == 0)
> + maxblocks = 0;
> + else if (maxlog < mp->m_blkbit_log)
> + maxblocks = 1;
> + else
> + maxblocks = 2 << (maxlog - mp->m_blkbit_log);
> + /*
> + * We need to check bbno + i + maxblocks down to
> + * bbno + i. We already checked bbno down to
> + * bbno + j + 1, so we don't need to check those
> + * again.
> + */
> + j = min(i + maxblocks, j);
Makes sense now with a fresher head...
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
What does the xfsprogs version of this patchset look like?
--D
> for (; j >= i; j--) {
> error = xfs_rtallocate_extent_block(mp,
> tp, bbno + j, minlen, maxavail,
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2023-08-01 23:40 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-06-20 21:32 [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
2023-06-20 21:32 ` [PATCH 1/6] xfs: cache last bitmap block in " Omar Sandoval
2023-07-12 18:29 ` Darrick J. Wong
2023-07-17 18:18 ` Omar Sandoval
2023-08-01 22:48 ` Darrick J. Wong
2023-06-20 21:32 ` [PATCH 2/6] xfs: invert the realtime summary cache Omar Sandoval
2023-07-12 22:40 ` Darrick J. Wong
2023-07-17 19:54 ` Omar Sandoval
2023-08-01 23:17 ` Darrick J. Wong
2023-06-20 21:32 ` [PATCH 3/6] xfs: return maximum free size from xfs_rtany_summary() Omar Sandoval
2023-07-12 22:44 ` Darrick J. Wong
2023-06-20 21:32 ` [PATCH 4/6] xfs: limit maxlen based on available space in xfs_rtallocate_extent_near() Omar Sandoval
2023-07-12 23:01 ` Darrick J. Wong
2023-07-17 20:33 ` Omar Sandoval
2023-06-20 21:32 ` [PATCH 5/6] xfs: don't try redundant allocations " Omar Sandoval
2023-07-12 23:34 ` Darrick J. Wong
2023-07-17 21:06 ` Omar Sandoval
2023-07-31 20:58 ` Omar Sandoval
2023-08-01 23:00 ` Darrick J. Wong
2023-06-20 21:32 ` [PATCH 6/6] xfs: don't look for end of extent further than necessary " Omar Sandoval
2023-08-01 23:40 ` Darrick J. Wong
2023-07-06 21:39 ` [PATCH 0/6] xfs: CPU usage optimizations for realtime allocator Omar Sandoval
2023-07-07 0:36 ` Dave Chinner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).