* [RFC 0/4] xfs: prototype dynamic AG size grow for image mode
From: Brian Foster @ 2024-10-08 13:13 UTC (permalink / raw)
To: linux-xfs; +Cc: djwong, sandeen
Hi all,
This is a follow-up to the discussion here [1] on ideas for better
dealing with the growfs agcount scalability problem that cloud use
cases tend to run into. This series prototypes the concept of using an
agcount=1 mkfs format to facilitate more dynamic growfs behavior. More
specifically, the AG size of the filesystem can be grown up until a
second AG is added, so we can use the target growfs size to set a more
suitable AG size at growfs time.
As per the previous discussion, there are multiple different ways this
can go, in xfsprogs and the kernel. For example, a size hint could be
provided to mkfs to avoid growfs time changes, a feature bit could be
used to manage functionality, AG size changes could be separated into a
different ioctl to lift the heuristic into userspace, etc. The purpose
here is simply to implement some of the core mechanism as conveniently
as possible and to explore whether it is a workable and potentially
useful improvement.
Patches 1-3 are prep/cleanup patches and not worth digging too much
into. Patch 4 hacks AG size growth into the typical growfs path and uses
a simple heuristic to provide fairly conservative behavior in the case
of unexpectedly small grows. See the commit logs and code comments for
more details and discussion points. Finally, note that this has only
seen light and targeted testing. Thoughts?
Brian
[1] https://lore.kernel.org/linux-xfs/20240812135652.250798-1-bfoster@redhat.com/
Brian Foster (4):
xfs: factor out sb_agblocks usage in growfs
xfs: transaction support for sb_agblocks updates
xfs: factor out a helper to calculate post-growfs agcount
xfs: support dynamic AG size growing on single AG filesystems
fs/xfs/libxfs/xfs_shared.h | 1 +
fs/xfs/xfs_fsops.c | 137 ++++++++++++++++++++++++++++++++-----
fs/xfs/xfs_trans.c | 15 ++++
fs/xfs/xfs_trans.h | 1 +
4 files changed, 137 insertions(+), 17 deletions(-)
--
2.46.2
* [RFC 1/4] xfs: factor out sb_agblocks usage in growfs
From: Brian Foster @ 2024-10-08 13:13 UTC (permalink / raw)
To: linux-xfs; +Cc: djwong, sandeen
Factor out usage of sb_agblocks in the growfs path. This is in
preparation to support growing AG size.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/xfs/xfs_fsops.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 3643cc843f62..6401424303c5 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -38,6 +38,7 @@ static int
xfs_resizefs_init_new_ags(
struct xfs_trans *tp,
struct aghdr_init_data *id,
+ xfs_agblock_t agblocks,
xfs_agnumber_t oagcount,
xfs_agnumber_t nagcount,
xfs_rfsblock_t delta,
@@ -57,9 +58,9 @@ xfs_resizefs_init_new_ags(
if (id->agno == nagcount - 1)
id->agsize = nb - (id->agno *
- (xfs_rfsblock_t)mp->m_sb.sb_agblocks);
+ (xfs_rfsblock_t)agblocks);
else
- id->agsize = mp->m_sb.sb_agblocks;
+ id->agsize = agblocks;
error = xfs_ag_init_headers(mp, id);
if (error) {
@@ -89,6 +90,7 @@ xfs_growfs_data_private(
{
struct xfs_buf *bp;
int error;
+ xfs_agblock_t nagblocks;
xfs_agnumber_t nagcount;
xfs_agnumber_t nagimax = 0;
xfs_rfsblock_t nb, nb_div, nb_mod;
@@ -113,16 +115,18 @@ xfs_growfs_data_private(
xfs_buf_relse(bp);
}
+ nagblocks = mp->m_sb.sb_agblocks;
+
nb_div = nb;
- nb_mod = do_div(nb_div, mp->m_sb.sb_agblocks);
+ nb_mod = do_div(nb_div, nagblocks);
if (nb_mod && nb_mod >= XFS_MIN_AG_BLOCKS)
nb_div++;
else if (nb_mod)
- nb = nb_div * mp->m_sb.sb_agblocks;
+ nb = nb_div * nagblocks;
if (nb_div > XFS_MAX_AGNUMBER + 1) {
nb_div = XFS_MAX_AGNUMBER + 1;
- nb = nb_div * mp->m_sb.sb_agblocks;
+ nb = nb_div * nagblocks;
}
nagcount = nb_div;
delta = nb - mp->m_sb.sb_dblocks;
@@ -161,8 +165,8 @@ xfs_growfs_data_private(
last_pag = xfs_perag_get(mp, oagcount - 1);
if (delta > 0) {
- error = xfs_resizefs_init_new_ags(tp, &id, oagcount, nagcount,
- delta, last_pag, &lastag_extended);
+ error = xfs_resizefs_init_new_ags(tp, &id, nagblocks, oagcount,
+ nagcount, delta, last_pag, &lastag_extended);
} else {
xfs_warn_mount(mp, XFS_OPSTATE_WARNED_SHRINK,
"EXPERIMENTAL online shrink feature in use. Use at your own risk!");
--
2.46.2
* [RFC 2/4] xfs: transaction support for sb_agblocks updates
From: Brian Foster @ 2024-10-08 13:13 UTC (permalink / raw)
To: linux-xfs; +Cc: djwong, sandeen
Support transactional changes to superblock agblocks and related
fields.
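The "related fields" here include sb_agblklog, which the patch recomputes as ilog2(roundup_pow_of_two(agblocks)) whenever sb_agblocks changes. A minimal userspace sketch of that derivation follows, assuming host byte order and using hypothetical helper names (agblklog_for, apply_agblocks_delta) that do not exist in the kernel:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's ilog2(roundup_pow_of_two(n)):
 * the smallest shift such that (1 << shift) >= n.
 * Hypothetical helper, for illustration only.
 */
static uint8_t agblklog_for(uint32_t agblocks)
{
	uint8_t log = 0;

	assert(agblocks > 0);
	while (((uint64_t)1 << log) < agblocks)
		log++;
	return log;
}

/*
 * Apply a signed agblocks delta and refresh the derived log2 field,
 * mirroring the shape of the update in the patch (without the
 * big-endian conversions of the on-disk superblock).
 */
static uint32_t apply_agblocks_delta(uint32_t agblocks, int64_t delta,
				     uint8_t *agblklog)
{
	int64_t updated = (int64_t)agblocks + delta;

	assert(updated > 0 && updated <= UINT32_MAX);
	*agblklog = agblklog_for((uint32_t)updated);
	return (uint32_t)updated;
}
```

For instance, with 4k blocks a 128MB AG is 32768 blocks (agblklog 15), and growing it to 5GB (1310720 blocks) rounds up to the next power of two and yields agblklog 21.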
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/xfs/libxfs/xfs_shared.h | 1 +
fs/xfs/xfs_trans.c | 15 +++++++++++++++
fs/xfs/xfs_trans.h | 1 +
3 files changed, 17 insertions(+)
diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
index 33b84a3a83ff..b8e80827a010 100644
--- a/fs/xfs/libxfs/xfs_shared.h
+++ b/fs/xfs/libxfs/xfs_shared.h
@@ -157,6 +157,7 @@ void xfs_log_get_max_trans_res(struct xfs_mount *mp,
#define XFS_TRANS_SB_RBLOCKS 0x00000800
#define XFS_TRANS_SB_REXTENTS 0x00001000
#define XFS_TRANS_SB_REXTSLOG 0x00002000
+#define XFS_TRANS_SB_AGBLOCKS 0x00004000
/*
* Here we centralize the specification of XFS meta-data buffer reference count
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index bdf3704dc301..34a9896ec398 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -433,6 +433,9 @@ xfs_trans_mod_sb(
case XFS_TRANS_SB_DBLOCKS:
tp->t_dblocks_delta += delta;
break;
+ case XFS_TRANS_SB_AGBLOCKS:
+ tp->t_agblocks_delta += delta;
+ break;
case XFS_TRANS_SB_AGCOUNT:
ASSERT(delta > 0);
tp->t_agcount_delta += delta;
@@ -526,6 +529,16 @@ xfs_trans_apply_sb_deltas(
be64_add_cpu(&sbp->sb_dblocks, tp->t_dblocks_delta);
whole = 1;
}
+ if (tp->t_agblocks_delta) {
+ xfs_agblock_t agblocks;
+
+ agblocks = be32_to_cpu(sbp->sb_agblocks);
+ agblocks += tp->t_agblocks_delta;
+
+ sbp->sb_agblocks = cpu_to_be32(agblocks);
+ sbp->sb_agblklog = ilog2(roundup_pow_of_two(agblocks));
+ whole = 1;
+ }
if (tp->t_agcount_delta) {
be32_add_cpu(&sbp->sb_agcount, tp->t_agcount_delta);
whole = 1;
@@ -657,6 +670,8 @@ xfs_trans_unreserve_and_mod_sb(
* incore reservations.
*/
mp->m_sb.sb_dblocks += tp->t_dblocks_delta;
+ mp->m_sb.sb_agblocks += tp->t_agblocks_delta;
+ mp->m_sb.sb_agblklog = ilog2(roundup_pow_of_two(mp->m_sb.sb_agblocks));
mp->m_sb.sb_agcount += tp->t_agcount_delta;
mp->m_sb.sb_imax_pct += tp->t_imaxpct_delta;
mp->m_sb.sb_rextsize += tp->t_rextsize_delta;
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index f06cc0f41665..11462406988d 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -141,6 +141,7 @@ typedef struct xfs_trans {
int64_t t_frextents_delta;/* superblock freextents chg*/
int64_t t_res_frextents_delta; /* on-disk only chg */
int64_t t_dblocks_delta;/* superblock dblocks change */
+ int64_t t_agblocks_delta;/* superblock agblocks change */
int64_t t_agcount_delta;/* superblock agcount change */
int64_t t_imaxpct_delta;/* superblock imaxpct change */
int64_t t_rextsize_delta;/* superblock rextsize chg */
--
2.46.2
* [RFC 3/4] xfs: factor out a helper to calculate post-growfs agcount
From: Brian Foster @ 2024-10-08 13:13 UTC (permalink / raw)
To: linux-xfs; +Cc: djwong, sandeen
Factor out the new agcount calculation logic into a helper.
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/xfs/xfs_fsops.c | 42 +++++++++++++++++++++++++++++-------------
1 file changed, 29 insertions(+), 13 deletions(-)
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 6401424303c5..3b95a368584e 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -80,6 +80,33 @@ xfs_resizefs_init_new_ags(
return error;
}
+/*
+ * Calculate new AG count based on provided AG size. May adjust final nblocks
+ * count if necessary for a valid AG count.
+ */
+static xfs_agnumber_t
+xfs_growfs_calc_agcount(
+ struct xfs_mount *mp,
+ xfs_agblock_t nagblocks,
+ xfs_rfsblock_t *nblocks)
+{
+ xfs_rfsblock_t nb_div, nb_mod;
+
+ nb_div = *nblocks;
+ nb_mod = do_div(nb_div, nagblocks);
+ if (nb_mod && nb_mod >= XFS_MIN_AG_BLOCKS)
+ nb_div++;
+ else if (nb_mod)
+ *nblocks = nb_div * nagblocks;
+
+ if (nb_div > XFS_MAX_AGNUMBER + 1) {
+ nb_div = XFS_MAX_AGNUMBER + 1;
+ *nblocks = nb_div * nagblocks;
+ }
+
+ return nb_div;
+}
+
/*
* growfs operations
*/
@@ -93,7 +120,7 @@ xfs_growfs_data_private(
xfs_agblock_t nagblocks;
xfs_agnumber_t nagcount;
xfs_agnumber_t nagimax = 0;
- xfs_rfsblock_t nb, nb_div, nb_mod;
+ xfs_rfsblock_t nb;
int64_t delta;
bool lastag_extended = false;
xfs_agnumber_t oagcount;
@@ -117,18 +144,7 @@ xfs_growfs_data_private(
nagblocks = mp->m_sb.sb_agblocks;
- nb_div = nb;
- nb_mod = do_div(nb_div, nagblocks);
- if (nb_mod && nb_mod >= XFS_MIN_AG_BLOCKS)
- nb_div++;
- else if (nb_mod)
- nb = nb_div * nagblocks;
-
- if (nb_div > XFS_MAX_AGNUMBER + 1) {
- nb_div = XFS_MAX_AGNUMBER + 1;
- nb = nb_div * nagblocks;
- }
- nagcount = nb_div;
+ nagcount = xfs_growfs_calc_agcount(mp, nagblocks, &nb);
delta = nb - mp->m_sb.sb_dblocks;
/*
* Reject filesystems with a single AG because they are not
--
2.46.2
* [RFC 4/4] xfs: support dynamic AG size growing on single AG filesystems
From: Brian Foster @ 2024-10-08 13:13 UTC (permalink / raw)
To: linux-xfs; +Cc: djwong, sandeen
This is a prototype for AG size growing of single AG filesystems.
The intent is to experiment with a potential solution to the
recurring problem where cloud-oriented filesystem images are
initially formatted to very small sizes and then copied/deployed and
grown to excessively high AG counts. This ultimately leads to
performance and scalability problems and can only currently be
resolved through a reformat and data migration.
Since the use case for a cloud image filesystem is known at creation
time, nothing prevents mkfs from starting with a geometry that is
more suitable to the post-deployment size. For example, the image
creator could use a larger file size if sparse files are handled
efficiently, or mkfs could in theory support creating a single AG
filesystem where the AG size is larger than the current fs size.
While mkfs doesn't currently support this, it is trivially enabled
and growfs already works as expected.
These options require enough familiarity with filesystem specific
geometry that image creators might not take these steps. Therefore,
the purpose of this prototype is to propose a growfs scheme that
would cooperate with a special mkfs time option that is specifically
designed for the cloud image use case. For example, consider a mkfs
command like 'mkfs.xfs --image <file>' where mkfs knows to create a
single AG filesystem with a larger than default log under the
implication that the image file is to be grown as part of a
deployment process.
The purpose of formatting with a single AG is that the AG size can
increase with no impact on existing data and functionality up until
a second AG is created. Therefore, kernel growfs of a single AG
filesystem can optionally decide to increase the AG size before
physically growing the fs. If the AG size is grown, the first AG is
extended just the same as a final runt AG is on a multi-AG
filesystem.
As an example, consider a 512MB filesystem image formatted and then
grown to 20GB. The standard mkfs and growfs sequence produces a
filesystem with over 150 AGs. A dynamic growfs can increase the AG
size to 5GB and produce a 4xAG filesystem more typical of how a 20GB
filesystem is formatted from the start.
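The arithmetic behind that example can be checked with a short sketch. It assumes the 512MB image was formatted by default into 4 AGs of 128MB each (an assumption about mkfs defaults, not something stated above) and ignores the minimum runt AG size rule. The helper name ag_count is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1ULL << 20)
#define GiB (1ULL << 30)

/* Number of AGs needed to cover fs_bytes, counting a partial last AG. */
static uint64_t ag_count(uint64_t fs_bytes, uint64_t ag_bytes)
{
	assert(ag_bytes > 0);
	return (fs_bytes + ag_bytes - 1) / ag_bytes;
}
```

Under those assumptions, ag_count(20 * GiB, 128 * MiB) gives 160 AGs for the standard sequence, while ag_count(20 * GiB, 5 * GiB) gives 4 for the dynamic grow.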
This patch implements a simple AG size grow mechanism and sample
heuristic for resizing small, single AG filesystems. The heuristic
defines a minimum AG size of 4GB and otherwise targets a standard
4xAG geometry. This means that a small filesystem grown to anything
less than ~16GB will see an enforced 4GB AG size at the cost of
reduced redundancy (i.e. AG count). On the other hand, as the target
grow size increases beyond 16GB, the AG size is increased to
maintain a 4xAG geometry up until the maximum AG size is reached.
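The heuristic described above can be sketched in userspace C as follows. This is an illustration rather than the kernel code: it uses 64-bit arithmetic throughout, takes the block size and stripe unit as plain parameters, and the MIN_AG_BLOCKS and MAX_AG_BYTES values are assumptions meant to mirror XFS_MIN_AG_BLOCKS and XFS_MAX_AG_BYTES.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values mirroring the kernel definitions. */
#define MIN_AG_BLOCKS		64ULL		/* XFS_MIN_AG_BLOCKS */
#define MAX_AG_BYTES		(1ULL << 40)	/* XFS_MAX_AG_BYTES, 1TB */
#define AGSIZE_THRESHOLD	(4ULL << 30)	/* 4GB minimum AG size */

/*
 * Return the post-grow AG size in blocks for a filesystem being grown
 * to nblocks. Hypothetical userspace helper for illustration.
 */
static uint64_t calc_agblocks(uint64_t nblocks, uint64_t cur_agblocks,
			      uint32_t agcount, uint32_t blocksize,
			      uint32_t sunit)
{
	uint64_t threshold, max_agblocks, nagblocks;

	assert(blocksize > 0);
	threshold = AGSIZE_THRESHOLD / blocksize;
	max_agblocks = MAX_AG_BYTES / blocksize;

	/* Only unaligned single AG filesystems below the threshold change. */
	if (agcount > 1 || sunit || cur_agblocks >= threshold)
		return cur_agblocks;

	if (nblocks < threshold + MIN_AG_BLOCKS) {
		/* Grow too small for a second AG: remain single AG. */
		nagblocks = nblocks;
	} else {
		/* Larger of 25% of the target size and the threshold. */
		nagblocks = nblocks / 4;
		if (nagblocks < threshold)
			nagblocks = threshold;
	}

	/* Clamp to the maximum AG size, and never shrink the AG. */
	if (nagblocks > max_agblocks)
		nagblocks = max_agblocks;
	return nagblocks > cur_agblocks ? nagblocks : cur_agblocks;
}
```

With 4k blocks and a 512MB single AG image (131072 blocks), a grow to 8GB selects the 4GB minimum (1048576 blocks), a grow to 20GB selects 5GB (1310720 blocks), and a grow to 8TB is clamped to the assumed 1TB maximum.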
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
fs/xfs/xfs_fsops.c | 89 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 86 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 3b95a368584e..9cd70989fa1c 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -107,6 +107,64 @@ xfs_growfs_calc_agcount(
return nb_div;
}
+/*
+ * Calculate post-grow AG size. AG size remains unchanged for everything other
+ * than agcount=1 filesystems with no format time alignment constraints.
+ *
+ * Otherwise, agcount=1 implies an "image mode" filesystem is being deployed and
+ * grown. To help prevent tiny AG size filesystems from being grown to excessive
+ * AG counts, we have the ability to extend the AG size before growing the
+ * physical size of the fs. The objective is to set a reasonable enough size to
+ * end up with multiple AGs for metadata redundancy.
+ */
+#define XFS_AGSIZE_THRESHOLD (4ULL << 30) /* 4GB */
+static xfs_agblock_t
+xfs_growfs_calc_agblocks(
+ struct xfs_mount *mp,
+ xfs_rfsblock_t nblocks)
+{
+ xfs_agblock_t nagblocks = XFS_B_TO_FSB(mp, XFS_AGSIZE_THRESHOLD);
+
+ if (mp->m_sb.sb_agcount > 1 || mp->m_sb.sb_unit ||
+ mp->m_sb.sb_agblocks >= nagblocks)
+ return mp->m_sb.sb_agblocks;
+
+ /*
+ * This is a sample image mode growfs heuristic that reuses the 4GB
+ * threshold from mkfs concurrency logic as a minimum AG size. AG size
+ * is set to the maximum of 4GB or 25% of the target grow size. IOW,
+ * filesystems remain single AG until grown to at least 4GB plus the
+ * minimum number of blocks required to create a runt second AG. The AG
+ * size is grown larger for grows beyond the 16GB (4 x 4GB AGs) total
+ * size threshold to target typical 4xAG mkfs time geometry.
+ *
+ * The end result is that grows from tiny to very large end up with a
+ * more typical geometry. Smaller grows may not, but the 4GB minimum AG
+ * size prevents the situation of growing MB sized AGs to pathological
+ * AG counts.
+ *
+ * XXX: We need to decide how to handle filesystems that remain single
+ * AG after grow. It should be rare enough to grow a filesystem to a
+ * sub-4GB size that we may not have to be too paranoid about it, but a
+ * warning or kernel message is probably warranted at minimum.
+ */
+ if (nblocks < (nagblocks + XFS_MIN_AG_BLOCKS)) {
+ /* grow too small, remain single AG */
+ nagblocks = nblocks;
+ } else {
+ /*
+ * Enough space for at least a runt second AG. Use the larger of
+ * 25% of the new target size and the threshold size.
+ */
+ do_div(nblocks, 4);
+ nagblocks = max_t(xfs_rfsblock_t, nagblocks, nblocks);
+ }
+
+ /* clamp to current ag size and max allowed */
+ nagblocks = min_t(xfs_rfsblock_t, nagblocks, XFS_B_TO_FSB(mp, XFS_MAX_AG_BYTES));
+ return max_t(xfs_rfsblock_t, nagblocks, mp->m_sb.sb_agblocks);
+}
+
/*
* growfs operations
*/
@@ -117,7 +175,7 @@ xfs_growfs_data_private(
{
struct xfs_buf *bp;
int error;
- xfs_agblock_t nagblocks;
+ xfs_agblock_t oagblocks, nagblocks;
xfs_agnumber_t nagcount;
xfs_agnumber_t nagimax = 0;
xfs_rfsblock_t nb;
@@ -142,7 +200,9 @@ xfs_growfs_data_private(
xfs_buf_relse(bp);
}
- nagblocks = mp->m_sb.sb_agblocks;
+ oagcount = mp->m_sb.sb_agcount;
+ oagblocks = mp->m_sb.sb_agblocks;
+ nagblocks = xfs_growfs_calc_agblocks(mp, nb);
nagcount = xfs_growfs_calc_agcount(mp, nagblocks, &nb);
delta = nb - mp->m_sb.sb_dblocks;
@@ -158,7 +218,30 @@ xfs_growfs_data_private(
if (delta == 0)
return 0;
- oagcount = mp->m_sb.sb_agcount;
+ /*
+ * Grow agblocks in a separate transaction to ensure that the
+ * subsequent grow transaction sees the updated superblock. We only
+ * grow agblocks for single AG filesystems where an outsized AG size is
+ * harmless, so this doesn't necessarily need to be atomic with the
+ * broader growfs operation.
+ *
+ * Nonetheless, this is included here mainly for prototyping
+ * convenience. We might want to consider splitting this off into a
+ * separate FSGROWFSAG operation, but that's open for discussion.
+ * Single AG fs' may also be exclusive enough to handle here as such.
+ */
+ if (nagblocks > oagblocks) {
+ error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growdata,
+ XFS_GROWFS_SPACE_RES(mp), 0, XFS_TRANS_RESERVE,
+ &tp);
+ xfs_trans_mod_sb(tp, XFS_TRANS_SB_AGBLOCKS, nagblocks - oagblocks);
+ xfs_trans_set_sync(tp);
+ error = xfs_trans_commit(tp);
+ if (error)
+ return error;
+ oagblocks = nagblocks;
+ }
+
/* allocate the new per-ag structures */
if (nagcount > oagcount) {
error = xfs_initialize_perag(mp, nagcount, nb, &nagimax);
--
2.46.2
* [PATCH] xfsprogs/mkfs: prototype XFS image mode format for scalable AG growth
From: Brian Foster @ 2024-10-08 13:25 UTC (permalink / raw)
To: linux-xfs; +Cc: djwong, sandeen
Tweak a few checks to facilitate experimentation with an agcount=1
filesystem format with a larger agsize than the filesystem data
size. The purpose of this is to prototype a filesystem image mode
format for XFS that better supports the typical cloud filesystem
image deployment use case, where a very small fs image is created
and then immediately grown by orders of magnitude once deployed to
container environments. The large grow size delta produces
filesystems with excessive AG counts, and various other functional
problems eventually derive from this sort of pathological geometry.
To experiment with this patch, format a small fs with something like
the following:
mkfs.xfs -f -lsize=64m -dsize=512m,agcount=1,agsize=8g <imgfile>
Increase the underlying image file size, mount and grow. The
filesystem will grow according to the format time AG size as if the
AG was a typical runt AG on a traditional multi-AG fs.
This means that the filesystem remains with an AG count of 1 until
fs size grows beyond AG size. Since the typical deployment workflow
is an immediate very small -> very large, one-time grow, the image
fs can set a reasonable enough default or configurable AG size
(based on user input) that ensures deployed filesystems end up in a
generally supportable geometry (i.e. with multiple AGs for
superblock redundancy) before seeing production workloads.
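To illustrate how the AG count follows from a fixed format time AG size, here is a rough userspace sketch modeled on the kernel's agcount calculation in growfs. The helper name calc_agcount is hypothetical, MIN_AG_BLOCKS is assumed to match XFS_MIN_AG_BLOCKS, and the clamp to the maximum AG number is omitted for brevity:

```c
#include <assert.h>
#include <stdint.h>

#define MIN_AG_BLOCKS	64ULL	/* assumed to match XFS_MIN_AG_BLOCKS */

/*
 * Return the AG count for a filesystem of *nblocks blocks with the
 * given AG size. A trailing remainder too small to form a runt AG is
 * trimmed from *nblocks instead of becoming a new AG.
 */
static uint64_t calc_agcount(uint64_t agblocks, uint64_t *nblocks)
{
	uint64_t count, rem;

	assert(agblocks > 0);
	count = *nblocks / agblocks;
	rem = *nblocks % agblocks;
	if (rem >= MIN_AG_BLOCKS)
		count++;		/* remainder becomes a runt AG */
	else if (rem)
		*nblocks = count * agblocks;	/* too small, drop it */
	return count;
}
```

With 4k blocks and agsize=8g (2097152 blocks), a 512MB data size is a single runt AG, and growing to 20GB yields three AGs: two full 8GB AGs and a 4GB runt.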
Further optional changes are possible on the kernel side to help
provide some simple guardrails against misuse of this mechanism. For
example, the kernel could do anything from warning or failing to
restricting runtime functionality for an insufficient grow. The image
mode itself could set a backwards-incompatible feature bit that
requires a mount option to enable full functionality (with the
exception of growfs). More discussion is required to determine whether this
provides a usable solution for the common cloud workflows that
exhibit this problem and what the right interface and/or limitations
are to ensure it is used correctly.
Not-Signed-off-by: Brian Foster <bfoster@redhat.com>
---
This is mostly a repost of the previous RFD patch to allow mkfs to
create single AG filesystems with AG sizes larger than the filesystem
itself. The main tweak in this version is that agcount=1 is allowed not
just for explicitly outsized AG sizes, but for any file-based target
device. This supports either setting a large AG size at format time or
sticking with the default size and letting the kernel set a new AG size
at growfs time.
Brian
mkfs/xfs_mkfs.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
index bbd0dbb6c..20168b58d 100644
--- a/mkfs/xfs_mkfs.c
+++ b/mkfs/xfs_mkfs.c
@@ -329,8 +329,7 @@ static struct opt_params dopts = {
},
.subopt_params = {
{ .index = D_AGCOUNT,
- .conflicts = { { &dopts, D_AGSIZE },
- { &dopts, D_CONCURRENCY },
+ .conflicts = { { &dopts, D_CONCURRENCY },
{ NULL, LAST_CONFLICT } },
.minval = 1,
.maxval = XFS_MAX_AGNUMBER,
@@ -372,8 +371,7 @@ static struct opt_params dopts = {
.defaultval = SUBOPT_NEEDS_VAL,
},
{ .index = D_AGSIZE,
- .conflicts = { { &dopts, D_AGCOUNT },
- { &dopts, D_CONCURRENCY },
+ .conflicts = { { &dopts, D_CONCURRENCY },
{ NULL, LAST_CONFLICT } },
.convert = true,
.minval = XFS_AG_MIN_BYTES,
@@ -1264,7 +1262,7 @@ validate_ag_geometry(
usage();
}
- if (agsize > dblocks) {
+ if (agsize > dblocks && agcount != 1) {
fprintf(stderr,
_("agsize (%lld blocks) too big, data area is %lld blocks\n"),
(long long)agsize, (long long)dblocks);
@@ -2812,12 +2810,20 @@ validate_supported(
/*
* Filesystems should not have fewer than two AGs, because we need to
- * have redundant superblocks.
+ * have redundant superblocks. The exception is filesystem image files
+ * that are intended to be grown on deployment before production use.
+ *
+ * A single AG provides more flexibility to grow the filesystem because
+ * the AG size can be grown until a second AG is added. This helps
+ * prevent tiny image filesystems being grown to unwieldy AG counts.
*/
if (mp->m_sb.sb_agcount < 2) {
fprintf(stderr,
- _("Filesystem must have at least 2 superblocks for redundancy!\n"));
- usage();
+ _("Filesystem must have at least 2 superblocks for redundancy.\n"));
+ if (!cli->xi->data.isfile)
+ usage();
+ fprintf(stderr,
+ _("Proceeding for image file, grow before use.\n"));
}
}
--
2.46.2
* Re: [RFC 2/4] xfs: transaction support for sb_agblocks updates
From: Christoph Hellwig @ 2024-10-09 8:05 UTC (permalink / raw)
To: Brian Foster; +Cc: linux-xfs, djwong, sandeen
On Tue, Oct 08, 2024 at 09:13:46AM -0400, Brian Foster wrote:
> Support transactional changes to superblock agblocks and related
> fields.
The growfs log recovery fix requires moving all the growfs sb updates
out of the transaction deltas. (It also desperately needs a review or
two)
* Re: [RFC 2/4] xfs: transaction support for sb_agblocks updates
From: Brian Foster @ 2024-10-09 12:38 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-xfs, djwong, sandeen
On Wed, Oct 09, 2024 at 01:05:32AM -0700, Christoph Hellwig wrote:
> On Tue, Oct 08, 2024 at 09:13:46AM -0400, Brian Foster wrote:
> > Support transactional changes to superblock agblocks and related
> > fields.
>
> The growfs log recovery fix requires moving all the growfs sb updates
> out of the transaction deltas. (It also desperately needs a review or
> two)
>
Ok, got a link to that fix? Is this the same as for that growfs related
fstest?
Anyways, this patch is really just doing updates as updates are done. It
can change if needed, but that's an implementation detail depending on
the high level direction this whole thing goes, if anywhere..
Brian
* Re: [RFC 2/4] xfs: transaction support for sb_agblocks updates
From: Christoph Hellwig @ 2024-10-09 12:44 UTC (permalink / raw)
To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs, djwong, sandeen
On Wed, Oct 09, 2024 at 08:38:00AM -0400, Brian Foster wrote:
> Ok, got a link to that fix?
https://lore.kernel.org/linux-xfs/20240930164211.2357358-1-hch@lst.de/
> Is this the same as for that growfs related fstest?
Yes.
Thread overview: 9+ messages
2024-10-08 13:13 [RFC 0/4] xfs: prototype dynamic AG size grow for image mode Brian Foster
2024-10-08 13:13 ` [RFC 1/4] xfs: factor out sb_agblocks usage in growfs Brian Foster
2024-10-08 13:13 ` [RFC 2/4] xfs: transaction support for sb_agblocks updates Brian Foster
2024-10-09 8:05 ` Christoph Hellwig
2024-10-09 12:38 ` Brian Foster
2024-10-09 12:44 ` Christoph Hellwig
2024-10-08 13:13 ` [RFC 3/4] xfs: factor out a helper to calculate post-growfs agcount Brian Foster
2024-10-08 13:13 ` [RFC 4/4] xfs: support dynamic AG size growing on single AG filesystems Brian Foster
2024-10-08 13:25 ` [PATCH] xfsprogs/mkfs: prototype XFS image mode format for scalable AG growth Brian Foster