From: Yongpeng Yang <yangyongpeng.storage@gmail.com>
To: Chao Yu <chao@kernel.org>,
Yongpeng Yang <yangyongpeng.storage@gmail.com>,
Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Yongpeng Yang <yangyongpeng@xiaomi.com>,
linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH] f2fs: ensure minimum trim granularity accounts for all devices
Date: Tue, 28 Oct 2025 17:27:20 +0800
Message-ID: <b603b6bb-e772-469d-b439-f79e83c9964b@gmail.com>
In-Reply-To: <207c2ba6-49ef-490b-9896-0458abbf93e7@kernel.org>
On 10/28/25 10:30, Chao Yu via Linux-f2fs-devel wrote:
> On 10/27/25 21:06, Yongpeng Yang wrote:
>> On 10/27/25 16:35, Chao Yu via Linux-f2fs-devel wrote:
>>> On 10/24/25 22:37, Yongpeng Yang wrote:
>>>> From: Yongpeng Yang <yangyongpeng@xiaomi.com>
>>>>
>>>> When F2FS uses multiple block devices, each device may have a
>>>> different discard granularity. The minimum trim granularity must be
>>>> at least the maximum discard granularity of all devices, excluding
>>>> zoned devices. Use max_t instead of the max() macro to compute the
>>>> maximum value.
>>>>
>>>> Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
>>>> ---
>>>> fs/f2fs/f2fs.h | 12 ++++++++++++
>>>> fs/f2fs/file.c | 12 ++++++------
>>>> 2 files changed, 18 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>> index 32fb2e7338b7..064bdbf463f7 100644
>>>> --- a/fs/f2fs/f2fs.h
>>>> +++ b/fs/f2fs/f2fs.h
>>>> @@ -4762,6 +4762,18 @@ static inline bool f2fs_hw_support_discard(struct f2fs_sb_info *sbi)
>>>> return false;
>>>> }
>>>> +static inline unsigned int f2fs_hw_discard_granularity(struct f2fs_sb_info *sbi)
>>>> +{
>>>> + int i = 1;
>>>> + unsigned int discard_granularity = bdev_discard_granularity(sbi->sb->s_bdev);
>>>
>>> Yongpeng,
>>>
>>> The patch makes sense to me.
>>>
>>> One extra question, if a zoned device contains both conventional zones and
>>> sequential zones, what discard granularity will it expose?
>>>
>>> Thanks,
>> I don't have such a device. I think the exposed discard granularity should be that of the conventional zones, since the sequential zones have a default reset granularity of 1 zone, and no additional information is needed to indicate that.
>
> I guess you can have a virtual one simulated by null_blk driver?
>
> https://zonedstorage.io/docs/getting-started/zbd-emulation#zoned-block-device-emulation-with-null_blk
1. When using QEMU to emulate a ZNS SSD, a namespace cannot contain
both conventional zones and sequential zones. In addition,
discard_granularity cannot be configured manually for the emulated
zoned device; it defaults to the larger of logical_block_size and
4KiB.
static int nvme_ns_init_blk(NvmeNamespace *ns, Error **errp)
{
    ...
    if (ns->blkconf.discard_granularity == -1) {
        ns->blkconf.discard_granularity =
            MAX(ns->blkconf.logical_block_size, MIN_DISCARD_GRANULARITY);
    }
    ...
}
In the NVMe driver, discard_granularity defaults to
logical_block_size.
static void nvme_config_discard(struct nvme_ns *ns, struct queue_limits *lim)
{
	...
	lim->discard_granularity = lim->logical_block_size;
	...
}
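Whatever the driver ends up choosing is visible from user space in the
queue limits under sysfs, which is what the `cat
/sys/block/nullb1/queue/discard_*` check quoted further down relies on.
A minimal sketch of reading that value programmatically follows. The
helper names read_queue_limit() and discard_granularity_of() are made up
for illustration and are not part of any existing tool:

```c
#include <stdio.h>

/*
 * Read one unsigned integer from a sysfs-style file such as
 * /sys/block/<dev>/queue/discard_granularity.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_queue_limit(const char *path, unsigned long long *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%llu", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Build the sysfs path for a block device name and read its granularity. */
static int discard_granularity_of(const char *dev, unsigned long long *value)
{
	char path[256];
	int n = snprintf(path, sizeof(path),
			 "/sys/block/%s/queue/discard_granularity", dev);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	return read_queue_limit(path, value);
}
```

Calling discard_granularity_of() with a device name fills in the value
in bytes; a result of 0 means the device does not advertise discard
support.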
2. QEMU cannot emulate SMR HDDs, so I looked at the SCSI driver code
instead. It computes the discard_granularity of a SCSI device as
follows. The value of sdkp->unmap_granularity is shared across multiple
LUNs, so conventional LUNs and sequential LUNs have the same
sdkp->unmap_granularity, and the discard_granularity is therefore also
the same for both types of zones. From the driver's perspective, a
zoned device that contains both conventional zones and sequential zones
has the same discard_granularity as a conventional device would.
static void sd_config_discard(struct scsi_disk *sdkp, struct queue_limits *lim,
		unsigned int mode)
{
	...
	lim->discard_granularity = max(sdkp->physical_block_size,
			sdkp->unmap_granularity * logical_block_size);
	...
}
static void sd_read_block_limits(struct scsi_disk *sdkp,
		struct queue_limits *lim)
{
	...
	sdkp->unmap_granularity = get_unaligned_be32(&vpd->data[28]);
	...
}
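To make the arithmetic concrete, here is a small userspace model of the
two excerpts above. It is only a sketch: be32_at() stands in for the
kernel's get_unaligned_be32(), and sd_discard_granularity() is a
made-up name for the max() expression in sd_config_discard(). Note that
no zone type appears among the inputs.

```c
#include <stdint.h>

/* Decode a big-endian 32-bit field from a byte buffer. */
static uint32_t be32_at(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Larger of the physical block size and the unmap granularity
 * converted from logical blocks to bytes. The multiplication is
 * done in 32 bits here and wraps on overflow.
 */
static uint32_t sd_discard_granularity(uint32_t physical_block_size,
				       uint32_t unmap_granularity,
				       uint32_t logical_block_size)
{
	uint32_t unmap_bytes = unmap_granularity * logical_block_size;

	return physical_block_size > unmap_bytes ?
	       physical_block_size : unmap_bytes;
}
```

For example, with a 4096-byte physical block, a 512-byte logical block
and an unmap granularity of 2048 blocks, the result is 2048 * 512 =
1048576 bytes (1 MiB). With an unmap granularity of 1 block, the
physical block size wins and the result is 4096 bytes.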
3. discard_granularity appears to be derived from logical_block_size
and physical_block_size, and is not tied to the zone size. For a zoned
device, discard_granularity is meaningless.
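This is why the helper in the patch leaves zoned devices out. A minimal
userspace sketch of that logic follows, assuming a made-up struct
dev_model in place of the real block device and plain unsigned int in
place of the kernel types. Like the posted loop, it stops at the first
zoned device rather than skipping over it:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block device; not a kernel structure. */
struct dev_model {
	unsigned int discard_granularity; /* bytes */
	bool zoned;
};

/*
 * Model of the loop in f2fs_hw_discard_granularity(): start from
 * device 0 and fold in each following device until the first zoned
 * one. The caller must pass ndevs >= 1.
 */
static unsigned int hw_discard_granularity(const struct dev_model *devs,
					   size_t ndevs)
{
	unsigned int granularity = devs[0].discard_granularity;
	size_t i;

	for (i = 1; i < ndevs && !devs[i].zoned; i++)
		if (devs[i].discard_granularity > granularity)
			granularity = devs[i].discard_granularity;
	return granularity;
}

/* Model of the minlen clamp applied in f2fs_ioc_fitrim(). */
static unsigned int clamp_minlen(unsigned int minlen,
				 unsigned int granularity)
{
	return minlen > granularity ? minlen : granularity;
}
```

With devices of 4096 and 65536 bytes followed by a zoned device, the
model returns 65536, so a FITRIM request with minlen = 512 would be
raised to 65536.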
>
> - nullblk_create.sh 512 2 1024 1024
> - cat /sys/block/nullb1/queue/discard_*
> 0
> 0
> 0
> 0
>
> I didn't dig into more details, though. :)
>
> Thanks,
>
I found that the null_blk device does not configure the discard_* limits.
static int null_add_dev(struct nullb_device *dev)
{
	...
	struct queue_limits lim = {
		.logical_block_size	= dev->blocksize,
		.physical_block_size	= dev->blocksize,
		.max_hw_sectors		= dev->max_sectors,
	};
	...
}
>>
>> Yongpeng
>
>>>> +
>>>> + if (f2fs_is_multi_device(sbi))
>>>> + for (; i < sbi->s_ndevs && !bdev_is_zoned(FDEV(i).bdev); i++)
>>>> + discard_granularity = max_t(unsigned int, discard_granularity,
>>>> + bdev_discard_granularity(FDEV(i).bdev));
>>>> + return discard_granularity;
>>>> +}
>>>> +
>>>> static inline bool f2fs_realtime_discard_enable(struct f2fs_sb_info *sbi)
>>>> {
>>>> return (test_opt(sbi, DISCARD) && f2fs_hw_support_discard(sbi)) ||
>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>>> index 6d42e2d28861..ced0f78532c9 100644
>>>> --- a/fs/f2fs/file.c
>>>> +++ b/fs/f2fs/file.c
>>>> @@ -2588,14 +2588,14 @@ static int f2fs_keep_noreuse_range(struct inode *inode,
>>>> static int f2fs_ioc_fitrim(struct file *filp, unsigned long arg)
>>>> {
>>>> struct inode *inode = file_inode(filp);
>>>> - struct super_block *sb = inode->i_sb;
>>>> + struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
>>>> struct fstrim_range range;
>>>> int ret;
>>>> if (!capable(CAP_SYS_ADMIN))
>>>> return -EPERM;
>>>> - if (!f2fs_hw_support_discard(F2FS_SB(sb)))
>>>> + if (!f2fs_hw_support_discard(sbi))
>>>> return -EOPNOTSUPP;
>>>> if (copy_from_user(&range, (struct fstrim_range __user *)arg,
>>>> @@ -2606,9 +2606,9 @@ static int f2fs_ioc_fitrim(struct file *filp, unsigned long arg)
>>>> if (ret)
>>>> return ret;
>>>> - range.minlen = max((unsigned int)range.minlen,
>>>> - bdev_discard_granularity(sb->s_bdev));
>>>> - ret = f2fs_trim_fs(F2FS_SB(sb), &range);
>>>> + range.minlen = max_t(unsigned int, range.minlen,
>>>> + f2fs_hw_discard_granularity(sbi));
>>>> + ret = f2fs_trim_fs(sbi, &range);
>>>> mnt_drop_write_file(filp);
>>>> if (ret < 0)
>>>> return ret;
>>>> @@ -2616,7 +2616,7 @@ static int f2fs_ioc_fitrim(struct file *filp, unsigned long arg)
>>>> if (copy_to_user((struct fstrim_range __user *)arg, &range,
>>>> sizeof(range)))
>>>> return -EFAULT;
>>>> - f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
>>>> + f2fs_update_time(sbi, REQ_TIME);
>>>> return 0;
>>>> }
>>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> Linux-f2fs-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel