* [PATCH] xfs: reserve enough blocks to handle btree splits when remapping
@ 2017-04-12 21:48 Darrick J. Wong
2017-04-13 9:53 ` Christoph Hellwig
2017-04-17 19:37 ` Darrick J. Wong
0 siblings, 2 replies; 4+ messages in thread
From: Darrick J. Wong @ 2017-04-12 21:48 UTC
To: xfs; +Cc: Christoph Hellwig
In xfs_reflink_end_cow, we erroneously reserve only enough blocks to
handle adding 1 extent. This is problematic if we fragment free space,
have to do CoW, and then have to perform multiple bmap btree expansions.
Furthermore, the BUI recovery routine doesn't reserve /any/ blocks to
handle btree splits, so log recovery fails after our first error causes
the filesystem to go down.
Therefore, refactor the transaction block reservation macros until we
have a macro that works for our deferred (re)mapping activities, and fix
both problems by using that macro.
With 1k blocks we can hit this fairly often in g/187 if the scratch fs
is big enough.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
fs/xfs/libxfs/xfs_trans_space.h | 18 ++++++++++++------
fs/xfs/xfs_bmap_item.c | 7 ++++++-
fs/xfs/xfs_reflink.c | 3 ++-
3 files changed, 20 insertions(+), 8 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_trans_space.h b/fs/xfs/libxfs/xfs_trans_space.h
index 7917f6e..04278cf 100644
--- a/fs/xfs/libxfs/xfs_trans_space.h
+++ b/fs/xfs/libxfs/xfs_trans_space.h
@@ -23,6 +23,16 @@
*/
#define XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) \
(((mp)->m_rmap_mxr[0]) - ((mp)->m_rmap_mnr[0]))
+static inline unsigned int
+XFS_RMAPADD_SPACE_RES(
+ struct xfs_mount *mp)
+{
+ return xfs_sb_version_hasrmapbt(&mp->m_sb) ? mp->m_rmap_maxlevels : 0;
+}
+#define XFS_NRMAPADD_SPACE_RES(mp,b,w)\
+ (((b + XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) - 1) / \
+ XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp)) * \
+ XFS_RMAPADD_SPACE_RES(mp))
#define XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp) \
(((mp)->m_alloc_mxr[0]) - ((mp)->m_alloc_mnr[0]))
#define XFS_EXTENTADD_SPACE_RES(mp,w) (XFS_BM_MAXLEVELS(mp,w) - 1)
@@ -31,12 +41,8 @@
XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp)) * \
XFS_EXTENTADD_SPACE_RES(mp,w))
#define XFS_SWAP_RMAP_SPACE_RES(mp,b,w)\
- (((b + XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp) - 1) / \
- XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp)) * \
- XFS_EXTENTADD_SPACE_RES(mp,w) + \
- ((b + XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) - 1) / \
- XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp)) * \
- (mp)->m_rmap_maxlevels)
+ (XFS_NEXTENTADD_SPACE_RES((mp), (b), (w)) + \
+ XFS_NRMAPADD_SPACE_RES((mp), (b), (w)))
#define XFS_DAENTER_1B(mp,w) \
((w) == XFS_DATA_FORK ? (mp)->m_dir_geo->fsbcount : 1)
#define XFS_DAENTER_DBS(mp,w) \
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index 9bf57c7..055ab8f 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -34,6 +34,8 @@
#include "xfs_bmap.h"
#include "xfs_icache.h"
#include "xfs_trace.h"
+#include "xfs_bmap_btree.h"
+#include "xfs_trans_space.h"
kmem_zone_t *xfs_bui_zone;
@@ -402,6 +404,7 @@ xfs_bui_recover(
struct xfs_inode *ip = NULL;
struct xfs_defer_ops dfops;
xfs_fsblock_t firstfsb;
+ unsigned int resblks;
ASSERT(!test_bit(XFS_BUI_RECOVERED, &buip->bui_flags));
@@ -446,7 +449,9 @@ xfs_bui_recover(
return -EIO;
}
- error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
+ resblks = XFS_SWAP_RMAP_SPACE_RES(mp, 1, XFS_DATA_FORK);
+ error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, resblks, 0,
+ 0, &tp);
if (error)
return error;
budp = xfs_trans_get_bud(tp, buip);
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index c0f3754..b811742 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -706,7 +706,8 @@ xfs_reflink_end_cow(
end_fsb = XFS_B_TO_FSB(ip->i_mount, offset + count);
/* Start a rolling transaction to switch the mappings */
- resblks = XFS_EXTENTADD_SPACE_RES(ip->i_mount, XFS_DATA_FORK);
+ resblks = XFS_SWAP_RMAP_SPACE_RES(ip->i_mount,
+ end_fsb - offset_fsb + 1, XFS_DATA_FORK);
error = xfs_trans_alloc(ip->i_mount, &M_RES(ip->i_mount)->tr_write,
resblks, 0, 0, &tp);
if (error)
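The arithmetic behind the refactored reservation can be sketched in userspace C. This is a minimal model, not the kernel code: `struct demo_mount` and every `demo_*` name are hypothetical stand-ins for the `struct xfs_mount` fields and macros that the patch touches, and the field values are whatever the caller supplies.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the few struct xfs_mount values that the
 * reservation macros consult. None of these names exist in the kernel.
 */
struct demo_mount {
	unsigned int bmap_maxlevels;    /* models XFS_BM_MAXLEVELS(mp, w) */
	unsigned int rmap_maxlevels;    /* models mp->m_rmap_maxlevels */
	unsigned int extents_per_block; /* models XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp) */
	unsigned int rmaps_per_block;   /* models XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) */
	int has_rmapbt;                 /* models xfs_sb_version_hasrmapbt() */
};

/* Round-up division: how many btree blocks b new records could land in. */
static unsigned int demo_howmany(unsigned int b, unsigned int per_block)
{
	return (b + per_block - 1) / per_block;
}

/* Models XFS_NEXTENTADD_SPACE_RES: one bmbt split chain per touched block. */
static unsigned int demo_nextentadd_res(const struct demo_mount *mp,
					unsigned int b)
{
	return demo_howmany(b, mp->extents_per_block) *
	       (mp->bmap_maxlevels - 1);
}

/* Models XFS_NRMAPADD_SPACE_RES: contributes nothing without an rmap btree. */
static unsigned int demo_nrmapadd_res(const struct demo_mount *mp,
				      unsigned int b)
{
	unsigned int per_split = mp->has_rmapbt ? mp->rmap_maxlevels : 0;

	return demo_howmany(b, mp->rmaps_per_block) * per_split;
}

/* Models XFS_SWAP_RMAP_SPACE_RES after the refactor: the sum of both parts. */
static unsigned int demo_swap_rmap_res(const struct demo_mount *mp,
				       unsigned int b)
{
	return demo_nextentadd_res(mp, b) + demo_nrmapadd_res(mp, b);
}
```

With the made-up values of 5 bmap levels, 4 rmap levels, 10 extent records and 8 rmap records per block, and rmapbt enabled, `demo_swap_rmap_res(&mp, 1)` comes to 8 blocks and `demo_swap_rmap_res(&mp, 25)` to 28, whereas the old single-extent reservation (`bmap_maxlevels - 1`) would be 4 regardless of how many blocks are remapped.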
* Re: [PATCH] xfs: reserve enough blocks to handle btree splits when remapping
2017-04-12 21:48 [PATCH] xfs: reserve enough blocks to handle btree splits when remapping Darrick J. Wong
@ 2017-04-13 9:53 ` Christoph Hellwig
2017-04-13 15:52 ` Darrick J. Wong
2017-04-17 19:37 ` Darrick J. Wong
1 sibling, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2017-04-13 9:53 UTC
To: Darrick J. Wong; +Cc: xfs, Christoph Hellwig
> +static inline unsigned int
> +XFS_RMAPADD_SPACE_RES(
> + struct xfs_mount *mp)
> +{
> + return xfs_sb_version_hasrmapbt(&mp->m_sb) ? mp->m_rmap_maxlevels : 0;
> +}
All the other helpers are macros. While I much prefer inlines, it might
be a good idea to do it in one go and also convert them to lower case
instead of screaming.
> +#define XFS_NRMAPADD_SPACE_RES(mp,b,w)\
> + (((b + XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) - 1) / \
> + XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp)) * \
> + XFS_RMAPADD_SPACE_RES(mp))
But then again the above is so much more readable than this, so maybe
it's a good idea after all and this one should be an inline as well.
Also I think future developers will really appreciate comments
explaining what we add up here.
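Beyond readability, an inline also sidesteps a macro pitfall: the macro above pastes `b` into the expression without parentheses. The callers in this patch pass arguments for which that is harmless, so the following is only an illustration of the general hazard, with hypothetical `DEMO_`/`demo_` names and a made-up records-per-block constant.

```c
/* Hypothetical per-block record count; the kernel derives it from the mount. */
#define DEMO_RECS_PER_BLOCK 8u

/* Same shape as the macro in the patch: b is used without parentheses. */
#define DEMO_BLOCKS_MACRO(b) \
	((b + DEMO_RECS_PER_BLOCK - 1) / DEMO_RECS_PER_BLOCK)

/* Inline equivalent: the argument is evaluated once, as a whole, before use. */
static inline unsigned int demo_blocks_inline(unsigned int b)
{
	return (b + DEMO_RECS_PER_BLOCK - 1) / DEMO_RECS_PER_BLOCK;
}
```

Passing a conditional expression shows the difference. With a nonzero `flag`, `DEMO_BLOCKS_MACRO(flag ? 1u : 100u)` expands so that the `+ 8u - 1` binds to the `100u` branch only and the whole thing evaluates to 0, while `demo_blocks_inline(flag ? 1u : 100u)` yields 1.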
Otherwise this looks fine and tests fine for me:
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH] xfs: reserve enough blocks to handle btree splits when remapping
2017-04-13 9:53 ` Christoph Hellwig
@ 2017-04-13 15:52 ` Darrick J. Wong
0 siblings, 0 replies; 4+ messages in thread
From: Darrick J. Wong @ 2017-04-13 15:52 UTC
To: Christoph Hellwig; +Cc: xfs
On Thu, Apr 13, 2017 at 02:53:11AM -0700, Christoph Hellwig wrote:
> > +static inline unsigned int
> > +XFS_RMAPADD_SPACE_RES(
> > + struct xfs_mount *mp)
> > +{
> > + return xfs_sb_version_hasrmapbt(&mp->m_sb) ? mp->m_rmap_maxlevels : 0;
> > +}
>
> All the other helpers are macros. While I much prefer inlines, it might
> be a good idea to do it in one go and also convert them to lower case
> instead of screaming.
>
> > +#define XFS_NRMAPADD_SPACE_RES(mp,b,w)\
> > + (((b + XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) - 1) / \
> > + XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp)) * \
> > + XFS_RMAPADD_SPACE_RES(mp))
>
> But then again the above is so much more readable than this, so maybe
> it's a good idea after all and this one should be an inline as well.
>
> Also I think future developers will really appreciate comments
> explaining what we add up here.
Yeah, I was going to clean this up too but then I distracted myself with
that other bunmapi max_len thing, and by the time I sent it, it was late
evening already. I'll try to squeeze it in today.
--D
>
> Otherwise this looks fine and tests fine for me:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Tested-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH] xfs: reserve enough blocks to handle btree splits when remapping
2017-04-12 21:48 [PATCH] xfs: reserve enough blocks to handle btree splits when remapping Darrick J. Wong
2017-04-13 9:53 ` Christoph Hellwig
@ 2017-04-17 19:37 ` Darrick J. Wong
1 sibling, 0 replies; 4+ messages in thread
From: Darrick J. Wong @ 2017-04-17 19:37 UTC
To: xfs; +Cc: Christoph Hellwig
On Wed, Apr 12, 2017 at 02:48:52PM -0700, Darrick J. Wong wrote:
> In xfs_reflink_end_cow, we erroneously reserve only enough blocks to
> handle adding 1 extent. This is problematic if we fragment free space,
> have to do CoW, and then have to perform multiple bmap btree expansions.
> Furthermore, the BUI recovery routine doesn't reserve /any/ blocks to
> handle btree splits, so log recovery fails after our first error causes
> the filesystem to go down.
>
> Therefore, refactor the transaction block reservation macros until we
> have a macro that works for our deferred (re)mapping activities, and fix
> both problems by using that macro.
>
> With 1k blocks we can hit this fairly often in g/187 if the scratch fs
> is big enough.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> fs/xfs/libxfs/xfs_trans_space.h | 18 ++++++++++++------
> fs/xfs/xfs_bmap_item.c | 7 ++++++-
> fs/xfs/xfs_reflink.c | 3 ++-
> 3 files changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_trans_space.h b/fs/xfs/libxfs/xfs_trans_space.h
> index 7917f6e..04278cf 100644
> --- a/fs/xfs/libxfs/xfs_trans_space.h
> +++ b/fs/xfs/libxfs/xfs_trans_space.h
> @@ -23,6 +23,16 @@
> */
> #define XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) \
> (((mp)->m_rmap_mxr[0]) - ((mp)->m_rmap_mnr[0]))
> +static inline unsigned int
> +XFS_RMAPADD_SPACE_RES(
> + struct xfs_mount *mp)
> +{
> + return xfs_sb_version_hasrmapbt(&mp->m_sb) ? mp->m_rmap_maxlevels : 0;
> +}
> +#define XFS_NRMAPADD_SPACE_RES(mp,b,w)\
> + (((b + XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) - 1) / \
> + XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp)) * \
> + XFS_RMAPADD_SPACE_RES(mp))
> #define XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp) \
> (((mp)->m_alloc_mxr[0]) - ((mp)->m_alloc_mnr[0]))
> #define XFS_EXTENTADD_SPACE_RES(mp,w) (XFS_BM_MAXLEVELS(mp,w) - 1)
> @@ -31,12 +41,8 @@
> XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp)) * \
> XFS_EXTENTADD_SPACE_RES(mp,w))
> #define XFS_SWAP_RMAP_SPACE_RES(mp,b,w)\
> - (((b + XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp) - 1) / \
> - XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp)) * \
> - XFS_EXTENTADD_SPACE_RES(mp,w) + \
> - ((b + XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) - 1) / \
> - XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp)) * \
> - (mp)->m_rmap_maxlevels)
> + (XFS_NEXTENTADD_SPACE_RES((mp), (b), (w)) + \
> + XFS_NRMAPADD_SPACE_RES((mp), (b), (w)))
> #define XFS_DAENTER_1B(mp,w) \
> ((w) == XFS_DATA_FORK ? (mp)->m_dir_geo->fsbcount : 1)
> #define XFS_DAENTER_DBS(mp,w) \
> diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
> index 9bf57c7..055ab8f 100644
> --- a/fs/xfs/xfs_bmap_item.c
> +++ b/fs/xfs/xfs_bmap_item.c
> @@ -34,6 +34,8 @@
> #include "xfs_bmap.h"
> #include "xfs_icache.h"
> #include "xfs_trace.h"
> +#include "xfs_bmap_btree.h"
> +#include "xfs_trans_space.h"
>
>
> kmem_zone_t *xfs_bui_zone;
> @@ -402,6 +404,7 @@ xfs_bui_recover(
> struct xfs_inode *ip = NULL;
> struct xfs_defer_ops dfops;
> xfs_fsblock_t firstfsb;
> + unsigned int resblks;
>
> ASSERT(!test_bit(XFS_BUI_RECOVERED, &buip->bui_flags));
>
> @@ -446,7 +449,9 @@ xfs_bui_recover(
> return -EIO;
> }
>
> - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
> + resblks = XFS_SWAP_RMAP_SPACE_RES(mp, 1, XFS_DATA_FORK);
> + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, resblks, 0,
> + 0, &tp);
> if (error)
> return error;
> budp = xfs_trans_get_bud(tp, buip);
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index c0f3754..b811742 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -706,7 +706,8 @@ xfs_reflink_end_cow(
> end_fsb = XFS_B_TO_FSB(ip->i_mount, offset + count);
>
> /* Start a rolling transaction to switch the mappings */
> - resblks = XFS_EXTENTADD_SPACE_RES(ip->i_mount, XFS_DATA_FORK);
> + resblks = XFS_SWAP_RMAP_SPACE_RES(ip->i_mount,
> + end_fsb - offset_fsb + 1, XFS_DATA_FORK);
I forgot that (end_fsb - offset_fsb + 1) is a 64-bit quantity, so this
patch breaks the build on 32-bit systems. Assuming that we're unlikely
ever to need to remap 16T of single-block extents, we could just clamp
the worst case extent count to 2^32-1 and throw in a warning in case
this assumption ever gets violated...
--D
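The clamping idea above could look roughly like this. It is a userspace sketch under assumptions: `demo_clamp_extent_count` is a hypothetical name, and `fprintf` stands in for whatever warning mechanism the kernel code would use. The build failure itself is presumably caused by the 64-bit value becoming the dividend of the plain `/` inside the reservation macro, which 32-bit kernels do not support; narrowing the count first avoids that.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: narrow a 64-bit block count to the 32-bit value the
 * reservation arithmetic works in, saturating at 2^32-1 instead of wrapping.
 */
static uint32_t demo_clamp_extent_count(uint64_t nblocks)
{
	if (nblocks > UINT32_MAX) {
		/* Userspace stand-in for a kernel warning. */
		fprintf(stderr,
			"extent count %" PRIu64 " clamped to %" PRIu32 "\n",
			nblocks, UINT32_MAX);
		return UINT32_MAX;
	}
	return (uint32_t)nblocks;
}
```

A caller would pass the clamped value, rather than the raw 64-bit difference, as the block-count argument of the reservation macro.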
> error = xfs_trans_alloc(ip->i_mount, &M_RES(ip->i_mount)->tr_write,
> resblks, 0, 0, &tp);
> if (error)