From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: chandan.babu@oracle.com, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard
Date: Tue, 13 Aug 2024 22:41:18 -0700 [thread overview]
Message-ID: <20240814054118.GE865349@frogsfrogsfrogs> (raw)
In-Reply-To: <20240814042358.19297-2-hch@lst.de>
On Wed, Aug 14, 2024 at 06:23:58AM +0200, Christoph Hellwig wrote:
> Fix the newly added discard support to only offer a FITRIM range that
> spans the RT device in addition to the main device if the RT device
> actually supports discard. Without this we'll incorrectly accept
> a larger range than actually supported and confuse user space if the
> RT device does not support discard. This can easily happen when the
> main device is a SSD but the RT device is a hard driver.
>
> Move the code around a bit to keep the max_blocks and granularity
> assignments together and to handle the case where only the RT device
> supports discard as well.
>
> Fixes: 3ba3ab1f6719 ("xfs: enable FITRIM on the realtime device")
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_discard.c | 37 ++++++++++++++++++++++++-------------
> 1 file changed, 24 insertions(+), 13 deletions(-)
>
> diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
> index 6516afecce0979..46e28499a7d3ce 100644
> --- a/fs/xfs/xfs_discard.c
> +++ b/fs/xfs/xfs_discard.c
> @@ -654,9 +654,9 @@ xfs_ioc_trim(
> struct xfs_mount *mp,
> struct fstrim_range __user *urange)
> {
> - unsigned int granularity =
> - bdev_discard_granularity(mp->m_ddev_targp->bt_bdev);
> + struct block_device *bdev = mp->m_ddev_targp->bt_bdev;
> struct block_device *rt_bdev = NULL;
> + unsigned int granularity = 0;
> struct fstrim_range range;
> xfs_daddr_t start, end;
> xfs_extlen_t minlen;
> @@ -666,15 +666,6 @@ xfs_ioc_trim(
>
> if (!capable(CAP_SYS_ADMIN))
> return -EPERM;
> - if (mp->m_rtdev_targp &&
> - bdev_max_discard_sectors(mp->m_rtdev_targp->bt_bdev))
> - rt_bdev = mp->m_rtdev_targp->bt_bdev;
> - if (!bdev_max_discard_sectors(mp->m_ddev_targp->bt_bdev) && !rt_bdev)
> - return -EOPNOTSUPP;
Does this still return EOPNOTSUPP if there's no rt device and the data
device doesn't support discard? I only see an EOPNOTSUPP return if
there is a rt device and both devices don't support discard.
> -
> - if (rt_bdev)
> - granularity = max(granularity,
> - bdev_discard_granularity(rt_bdev));
>
> /*
> * We haven't recovered the log, so we cannot use our bnobt-guided
> @@ -683,13 +674,33 @@ xfs_ioc_trim(
> if (xfs_has_norecovery(mp))
> return -EROFS;
>
> + if (bdev_max_discard_sectors(bdev)) {
> + max_blocks = mp->m_sb.sb_dblocks;
> + granularity = bdev_discard_granularity(bdev);
> + } else {
> + bdev = NULL;
> + }
> +
> + if (mp->m_rtdev_targp) {
> + rt_bdev = mp->m_rtdev_targp->bt_bdev;
> + if (!bdev_max_discard_sectors(rt_bdev)) {
> + if (!bdev)
> + return -EOPNOTSUPP;
> + rt_bdev = NULL;
> + }
> + }
> + if (rt_bdev) {
> + max_blocks += mp->m_sb.sb_rblocks;
I think this will break xfs_scrub, which (unlike fstrim) breaks up its
FITRIM requests into smaller pieces. The (afwul) FITRIM interface says
that [0, dblocks) trims the data device, and [dblocks, dblocks +
rblocks) trims the realtime device.
If the data device doesn't support discard, max_blocks will be rblocks,
and that's what we use to validate the @start parameter. For example,
if the data device is 10T spinning rust and the rt device is a 10G SSD,
max_blocks will be 10G. A FITRIM request for just the rt device will be
[10T, 10G), which now fails with EINVAL.
I don't have a fix to suggest for this yet, but let me play around with
this tomorrow and see if I can come up with something better, or figure
out how I'm being thick. ;)
My guess is that what we really want is if either device supports
discard we allow the full range, but if a specific device doesn't
support discard then we skip it and don't add anything to the outgoing
range.len. But that's what I thought the current code does. <shrug>
--D
> + granularity =
> + max(granularity, bdev_discard_granularity(rt_bdev));
> + }
> +
> if (copy_from_user(&range, urange, sizeof(range)))
> return -EFAULT;
>
> range.minlen = max_t(u64, granularity, range.minlen);
> minlen = XFS_B_TO_FSB(mp, range.minlen);
>
> - max_blocks = mp->m_sb.sb_dblocks + mp->m_sb.sb_rblocks;
> if (range.start >= XFS_FSB_TO_B(mp, max_blocks) ||
> range.minlen > XFS_FSB_TO_B(mp, mp->m_ag_max_usable) ||
> range.len < mp->m_sb.sb_blocksize)
> @@ -698,7 +709,7 @@ xfs_ioc_trim(
> start = BTOBB(range.start);
> end = start + BTOBBT(range.len) - 1;
>
> - if (bdev_max_discard_sectors(mp->m_ddev_targp->bt_bdev)) {
> + if (bdev) {
> error = xfs_trim_datadev_extents(mp, start, end, minlen,
> &blocks_trimmed);
> if (error)
> --
> 2.43.0
>
>
next prev parent reply other threads:[~2024-08-14 5:41 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-14 4:23 [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim Christoph Hellwig
2024-08-14 4:23 ` [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard Christoph Hellwig
2024-08-14 5:41 ` Darrick J. Wong [this message]
2024-08-14 5:48 ` Christoph Hellwig
2024-08-14 6:16 ` Darrick J. Wong
2024-08-14 5:12 ` [PATCH 1/2] xfs: remove a stale comment in xfs_ioc_trim Darrick J. Wong
-- strict thread matches above, loose matches on Subject: below --
2024-08-16 8:18 fix FITRIM with non-discard capable RT device v2 Christoph Hellwig
2024-08-16 8:18 ` [PATCH 2/2] xfs: don't extend the FITRIM range if the rt device does not support discard Christoph Hellwig
2024-08-16 21:50 ` Darrick J. Wong
2024-08-19 12:44 ` Christoph Hellwig
2024-08-19 15:00 ` Darrick J. Wong
2024-08-19 15:08 ` Christoph Hellwig
2024-08-20 16:19 ` Darrick J. Wong
2024-08-20 16:35 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240814054118.GE865349@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=chandan.babu@oracle.com \
--cc=hch@lst.de \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox