* [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim @ 2024-08-14 4:23 Christoph Hellwig 2024-08-14 4:23 ` [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard Christoph Hellwig 2024-08-14 5:12 ` [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim Darrick J. Wong 0 siblings, 2 replies; 7+ messages in thread From: Christoph Hellwig @ 2024-08-14 4:23 UTC (permalink / raw) To: chandan.babu; +Cc: djwong, linux-xfs There is no truncating down going on here, the code has changed multiple times since the comment was added with the initial FITRIM implementation and it doesn't make sense in the current context. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_discard.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c index 6f0fc7fe1f2ba9..6516afecce0979 100644 --- a/fs/xfs/xfs_discard.c +++ b/fs/xfs/xfs_discard.c @@ -689,13 +689,6 @@ xfs_ioc_trim( range.minlen = max_t(u64, granularity, range.minlen); minlen = XFS_B_TO_FSB(mp, range.minlen); - /* - * Truncating down the len isn't actually quite correct, but using - * BBTOB would mean we trivially get overflows for values - * of ULLONG_MAX or slightly lower. And ULLONG_MAX is the default - * used by the fstrim application. In the end it really doesn't - * matter as trimming blocks is an advisory interface. - */ max_blocks = mp->m_sb.sb_dblocks + mp->m_sb.sb_rblocks; if (range.start >= XFS_FSB_TO_B(mp, max_blocks) || range.minlen > XFS_FSB_TO_B(mp, mp->m_ag_max_usable) || -- 2.43.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard 2024-08-14 4:23 [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim Christoph Hellwig @ 2024-08-14 4:23 ` Christoph Hellwig 2024-08-14 5:41 ` Darrick J. Wong 2024-08-14 5:12 ` [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim Darrick J. Wong 1 sibling, 1 reply; 7+ messages in thread From: Christoph Hellwig @ 2024-08-14 4:23 UTC (permalink / raw) To: chandan.babu; +Cc: djwong, linux-xfs Fix the newly added discard support to only offer a FITRIM range that spans the RT device in addition to the main device if the RT device actually supports discard. Without this we'll incorrectly accept a larger range than actually supported and confuse user space if the RT device does not support discard. This can easily happen when the main device is a SSD but the RT device is a hard driver. Move the code around a bit to keep the max_blocks and granularity assignments together and to handle the case where only the RT device supports discard as well. Fixes: 3ba3ab1f6719 ("xfs: enable FITRIM on the realtime device") Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_discard.c | 37 ++++++++++++++++++++++++------------- 1 file changed, 24 insertions(+), 13 deletions(-) diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c index 6516afecce0979..46e28499a7d3ce 100644 --- a/fs/xfs/xfs_discard.c +++ b/fs/xfs/xfs_discard.c @@ -654,9 +654,9 @@ xfs_ioc_trim( struct xfs_mount *mp, struct fstrim_range __user *urange) { - unsigned int granularity = - bdev_discard_granularity(mp->m_ddev_targp->bt_bdev); + struct block_device *bdev = mp->m_ddev_targp->bt_bdev; struct block_device *rt_bdev = NULL; + unsigned int granularity = 0; struct fstrim_range range; xfs_daddr_t start, end; xfs_extlen_t minlen; @@ -666,15 +666,6 @@ xfs_ioc_trim( if (!capable(CAP_SYS_ADMIN)) return -EPERM; - if (mp->m_rtdev_targp && - bdev_max_discard_sectors(mp->m_rtdev_targp->bt_bdev)) - rt_bdev = mp->m_rtdev_targp->bt_bdev; - if (!bdev_max_discard_sectors(mp->m_ddev_targp->bt_bdev) && !rt_bdev) - return -EOPNOTSUPP; - - if (rt_bdev) - granularity = max(granularity, - bdev_discard_granularity(rt_bdev)); /* * We haven't recovered the log, so we cannot use our bnobt-guided @@ -683,13 +674,33 @@ xfs_ioc_trim( if (xfs_has_norecovery(mp)) return -EROFS; + if (bdev_max_discard_sectors(bdev)) { + max_blocks = mp->m_sb.sb_dblocks; + granularity = bdev_discard_granularity(bdev); + } else { + bdev = NULL; + } + + if (mp->m_rtdev_targp) { + rt_bdev = mp->m_rtdev_targp->bt_bdev; + if (!bdev_max_discard_sectors(rt_bdev)) { + if (!bdev) + return -EOPNOTSUPP; + rt_bdev = NULL; + } + } + if (rt_bdev) { + max_blocks += mp->m_sb.sb_rblocks; + granularity = + max(granularity, bdev_discard_granularity(rt_bdev)); + } + if (copy_from_user(&range, urange, sizeof(range))) return -EFAULT; range.minlen = max_t(u64, granularity, range.minlen); minlen = XFS_B_TO_FSB(mp, range.minlen); - max_blocks = mp->m_sb.sb_dblocks + mp->m_sb.sb_rblocks; if (range.start >= XFS_FSB_TO_B(mp, max_blocks) || range.minlen > XFS_FSB_TO_B(mp, mp->m_ag_max_usable) || range.len < mp->m_sb.sb_blocksize) @@ -698,7 +709,7 @@ xfs_ioc_trim( start = BTOBB(range.start); end = start + BTOBBT(range.len) - 1; - if (bdev_max_discard_sectors(mp->m_ddev_targp->bt_bdev)) { + if (bdev) { error = xfs_trim_datadev_extents(mp, start, end, minlen, &blocks_trimmed); if (error) -- 2.43.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard 2024-08-14 4:23 ` [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard Christoph Hellwig @ 2024-08-14 5:41 ` Darrick J. Wong 2024-08-14 5:48 ` Christoph Hellwig 0 siblings, 1 reply; 7+ messages in thread From: Darrick J. Wong @ 2024-08-14 5:41 UTC (permalink / raw) To: Christoph Hellwig; +Cc: chandan.babu, linux-xfs On Wed, Aug 14, 2024 at 06:23:58AM +0200, Christoph Hellwig wrote: > Fix the newly added discard support to only offer a FITRIM range that > spans the RT device in addition to the main device if the RT device > actually supports discard. Without this we'll incorrectly accept > a larger range than actually supported and confuse user space if the > RT device does not support discard. This can easily happen when the > main device is a SSD but the RT device is a hard driver. > > Move the code around a bit to keep the max_blocks and granularity > assignments together and to handle the case where only the RT device > supports discard as well. > > Fixes: 3ba3ab1f6719 ("xfs: enable FITRIM on the realtime device") > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/xfs/xfs_discard.c | 37 ++++++++++++++++++++++++------------- > 1 file changed, 24 insertions(+), 13 deletions(-) > > diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c > index 6516afecce0979..46e28499a7d3ce 100644 > --- a/fs/xfs/xfs_discard.c > +++ b/fs/xfs/xfs_discard.c > @@ -654,9 +654,9 @@ xfs_ioc_trim( > struct xfs_mount *mp, > struct fstrim_range __user *urange) > { > - unsigned int granularity = > - bdev_discard_granularity(mp->m_ddev_targp->bt_bdev); > + struct block_device *bdev = mp->m_ddev_targp->bt_bdev; > struct block_device *rt_bdev = NULL; > + unsigned int granularity = 0; > struct fstrim_range range; > xfs_daddr_t start, end; > xfs_extlen_t minlen; > @@ -666,15 +666,6 @@ xfs_ioc_trim( > > if (!capable(CAP_SYS_ADMIN)) > return -EPERM; > - if (mp->m_rtdev_targp && > - bdev_max_discard_sectors(mp->m_rtdev_targp->bt_bdev)) > - rt_bdev = mp->m_rtdev_targp->bt_bdev; > - if (!bdev_max_discard_sectors(mp->m_ddev_targp->bt_bdev) && !rt_bdev) > - return -EOPNOTSUPP; Does this still return EOPNOTSUPP if there's no rt device and the data device doesn't support discard? I only see an EOPNOTSUPP return if there is a rt device and both devices don't support discard. > - > - if (rt_bdev) > - granularity = max(granularity, > - bdev_discard_granularity(rt_bdev)); > > /* > * We haven't recovered the log, so we cannot use our bnobt-guided > @@ -683,13 +674,33 @@ xfs_ioc_trim( > if (xfs_has_norecovery(mp)) > return -EROFS; > > + if (bdev_max_discard_sectors(bdev)) { > + max_blocks = mp->m_sb.sb_dblocks; > + granularity = bdev_discard_granularity(bdev); > + } else { > + bdev = NULL; > + } > + > + if (mp->m_rtdev_targp) { > + rt_bdev = mp->m_rtdev_targp->bt_bdev; > + if (!bdev_max_discard_sectors(rt_bdev)) { > + if (!bdev) > + return -EOPNOTSUPP; > + rt_bdev = NULL; > + } > + } > + if (rt_bdev) { > + max_blocks += mp->m_sb.sb_rblocks; I think this will break xfs_scrub, which (unlike fstrim) breaks up its FITRIM requests into smaller pieces. The (afwul) FITRIM interface says that [0, dblocks) trims the data device, and [dblocks, dblocks + rblocks) trims the realtime device. If the data device doesn't support discard, max_blocks will be rblocks, and that's what we use to validate the @start parameter. For example, if the data device is 10T spinning rust and the rt device is a 10G SSD, max_blocks will be 10G. A FITRIM request for just the rt device will be [10T, 10G), which now fails with EINVAL. I don't have a fix to suggest for this yet, but let me play around with this tomorrow and see if I can come up with something better, or figure out how I'm being thick. ;) My guess is that what we really want is if either device supports discard we allow the full range, but if a specific device doesn't support discard then we skip it and don't add anything to the outgoing range.len. But that's what I thought the current code does. <shrug> --D > + granularity = > + max(granularity, bdev_discard_granularity(rt_bdev)); > + } > + > if (copy_from_user(&range, urange, sizeof(range))) > return -EFAULT; > > range.minlen = max_t(u64, granularity, range.minlen); > minlen = XFS_B_TO_FSB(mp, range.minlen); > > - max_blocks = mp->m_sb.sb_dblocks + mp->m_sb.sb_rblocks; > if (range.start >= XFS_FSB_TO_B(mp, max_blocks) || > range.minlen > XFS_FSB_TO_B(mp, mp->m_ag_max_usable) || > range.len < mp->m_sb.sb_blocksize) > @@ -698,7 +709,7 @@ xfs_ioc_trim( > start = BTOBB(range.start); > end = start + BTOBBT(range.len) - 1; > > - if (bdev_max_discard_sectors(mp->m_ddev_targp->bt_bdev)) { > + if (bdev) { > error = xfs_trim_datadev_extents(mp, start, end, minlen, > &blocks_trimmed); > if (error) > -- > 2.43.0 > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard 2024-08-14 5:41 ` Darrick J. Wong @ 2024-08-14 5:48 ` Christoph Hellwig 2024-08-14 6:16 ` Darrick J. Wong 0 siblings, 1 reply; 7+ messages in thread From: Christoph Hellwig @ 2024-08-14 5:48 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Christoph Hellwig, chandan.babu, linux-xfs On Tue, Aug 13, 2024 at 10:41:18PM -0700, Darrick J. Wong wrote: > Does this still return EOPNOTSUPP if there's no rt device and the data > device doesn't support discard? Yes, I'll need to fix that. > > + if (rt_bdev) { > > + max_blocks += mp->m_sb.sb_rblocks; > > I think this will break xfs_scrub, which (unlike fstrim) breaks up its > FITRIM requests into smaller pieces. No breakage noticed during my testing, but I'm not sure how that would have materialized anyway.. > The (afwul) FITRIM interface says > that [0, dblocks) trims the data device, and [dblocks, dblocks + > rblocks) trims the realtime device. > > If the data device doesn't support discard, max_blocks will be rblocks, > and that's what we use to validate the @start parameter. For example, > if the data device is 10T spinning rust and the rt device is a 10G SSD, > max_blocks will be 10G. A FITRIM request for just the rt device will be > [10T, 10G), which now fails with EINVAL. > > I don't have a fix to suggest for this yet, but let me play around with > this tomorrow and see if I can come up with something better, or figure > out how I'm being thick. ;) > > My guess is that what we really want is if either device supports > discard we allow the full range, but if a specific device doesn't > support discard then we skip it and don't add anything to the outgoing > range.len. But that's what I thought the current code does. <shrug> The problem is that if we allow the full range, but return a smaller around generic/260 fails. That is with your patches to not fail if FITRIM for the RT device is supported, without them it always fails as soon as FITRIM is supported on the RT device. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard 2024-08-14 5:48 ` Christoph Hellwig @ 2024-08-14 6:16 ` Darrick J. Wong 0 siblings, 0 replies; 7+ messages in thread From: Darrick J. Wong @ 2024-08-14 6:16 UTC (permalink / raw) To: Christoph Hellwig; +Cc: chandan.babu, linux-xfs On Wed, Aug 14, 2024 at 07:48:38AM +0200, Christoph Hellwig wrote: > On Tue, Aug 13, 2024 at 10:41:18PM -0700, Darrick J. Wong wrote: > > Does this still return EOPNOTSUPP if there's no rt device and the data > > device doesn't support discard? > > Yes, I'll need to fix that. > > > > + if (rt_bdev) { > > > + max_blocks += mp->m_sb.sb_rblocks; > > > > I think this will break xfs_scrub, which (unlike fstrim) breaks up its > > FITRIM requests into smaller pieces. > > No breakage noticed during my testing, but I'm not sure how that > would have materialized anyway.. scrub silently starts discarding less than it does now. My guess is nobody would notice because meh and who cares? ;) > > The (afwul) FITRIM interface says > > that [0, dblocks) trims the data device, and [dblocks, dblocks + > > rblocks) trims the realtime device. > > > > If the data device doesn't support discard, max_blocks will be rblocks, > > and that's what we use to validate the @start parameter. For example, > > if the data device is 10T spinning rust and the rt device is a 10G SSD, > > max_blocks will be 10G. A FITRIM request for just the rt device will be > > [10T, 10G), which now fails with EINVAL. > > > > I don't have a fix to suggest for this yet, but let me play around with > > this tomorrow and see if I can come up with something better, or figure > > out how I'm being thick. ;) > > > > My guess is that what we really want is if either device supports > > discard we allow the full range, but if a specific device doesn't > > support discard then we skip it and don't add anything to the outgoing > > range.len. But that's what I thought the current code does. <shrug> > > The problem is that if we allow the full range, but return a smaller > around generic/260 fails. That is with your patches to not fail if > FITRIM for the RT device is supported, without them it always fails > as soon as FITRIM is supported on the RT device. Ugh. We can't just lie and add unsupported ranges to range.len because (a) that's dishonest even if there's no expectation that the discards did anything and (b) we'd have walk the freespace metadata just to lie. OTOH generic/260 is butt-ugly with all the accounting special cases in there. Maybe there's a way to change the test to figure out how much discarding is possible on an xfs filesystem so that it can handle hetergeneous filesystems? --D ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim 2024-08-14 4:23 [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim Christoph Hellwig 2024-08-14 4:23 ` [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard Christoph Hellwig @ 2024-08-14 5:12 ` Darrick J. Wong 1 sibling, 0 replies; 7+ messages in thread From: Darrick J. Wong @ 2024-08-14 5:12 UTC (permalink / raw) To: Christoph Hellwig; +Cc: chandan.babu, linux-xfs On Wed, Aug 14, 2024 at 06:23:57AM +0200, Christoph Hellwig wrote: > There is no truncating down going on here, the code has changed multiple > times since the comment was added with the initial FITRIM implementation > and it doesn't make sense in the current context. > > Signed-off-by: Christoph Hellwig <hch@lst.de> Heh, whoops. Reviewed-by: Darrick J. Wong <djwong@kernel.org> --D > --- > fs/xfs/xfs_discard.c | 7 ------- > 1 file changed, 7 deletions(-) > > diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c > index 6f0fc7fe1f2ba9..6516afecce0979 100644 > --- a/fs/xfs/xfs_discard.c > +++ b/fs/xfs/xfs_discard.c > @@ -689,13 +689,6 @@ xfs_ioc_trim( > range.minlen = max_t(u64, granularity, range.minlen); > minlen = XFS_B_TO_FSB(mp, range.minlen); > > - /* > - * Truncating down the len isn't actually quite correct, but using > - * BBTOB would mean we trivially get overflows for values > - * of ULLONG_MAX or slightly lower. And ULLONG_MAX is the default > - * used by the fstrim application. In the end it really doesn't > - * matter as trimming blocks is an advisory interface. > - */ > max_blocks = mp->m_sb.sb_dblocks + mp->m_sb.sb_rblocks; > if (range.start >= XFS_FSB_TO_B(mp, max_blocks) || > range.minlen > XFS_FSB_TO_B(mp, mp->m_ag_max_usable) || > -- > 2.43.0 > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* fix FITRIM with non-discard capable RT device v2 @ 2024-08-16 8:18 Christoph Hellwig 2024-08-16 8:18 ` [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim Christoph Hellwig 0 siblings, 1 reply; 7+ messages in thread From: Christoph Hellwig @ 2024-08-16 8:18 UTC (permalink / raw) To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs Hi all, this series fixes FITRIM when an RT device that does not support discards is present. Changes since v1: - explicitly reject the discard capable RT device and non-discard capable main device case because it is too much of a mess to handle and not practically useful Diffstat: xfs_discard.c | 57 +++++++++++++++++++++++++++++++-------------------------- 1 file changed, 31 insertions(+), 26 deletions(-) ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim 2024-08-16 8:18 fix FITRIM with non-discard capable RT device v2 Christoph Hellwig @ 2024-08-16 8:18 ` Christoph Hellwig 0 siblings, 0 replies; 7+ messages in thread From: Christoph Hellwig @ 2024-08-16 8:18 UTC (permalink / raw) To: Chandan Babu R; +Cc: Darrick J. Wong, linux-xfs There is no truncating down going on here, the code has changed multiple times since the comment was added with the initial FITRIM implementation and it doesn't make sense in the current context. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> --- fs/xfs/xfs_discard.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c index 80586336276c19..d56efe9eae2cef 100644 --- a/fs/xfs/xfs_discard.c +++ b/fs/xfs/xfs_discard.c @@ -720,13 +720,6 @@ xfs_ioc_trim( range.minlen = max_t(u64, granularity, range.minlen); minlen = XFS_B_TO_FSB(mp, range.minlen); - /* - * Truncating down the len isn't actually quite correct, but using - * BBTOB would mean we trivially get overflows for values - * of ULLONG_MAX or slightly lower. And ULLONG_MAX is the default - * used by the fstrim application. In the end it really doesn't - * matter as trimming blocks is an advisory interface. - */ max_blocks = mp->m_sb.sb_dblocks + mp->m_sb.sb_rblocks; if (range.start >= XFS_FSB_TO_B(mp, max_blocks) || range.minlen > XFS_FSB_TO_B(mp, mp->m_ag_max_usable) || -- 2.43.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-08-16 8:19 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-08-14 4:23 [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim Christoph Hellwig 2024-08-14 4:23 ` [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard Christoph Hellwig 2024-08-14 5:41 ` Darrick J. Wong 2024-08-14 5:48 ` Christoph Hellwig 2024-08-14 6:16 ` Darrick J. Wong 2024-08-14 5:12 ` [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim Darrick J. Wong -- strict thread matches above, loose matches on Subject: below -- 2024-08-16 8:18 fix FITRIM with non-discard capable RT device v2 Christoph Hellwig 2024-08-16 8:18 ` [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox