* [PATCH 0/5] Ioctl to clear unused space in various ways
@ 2025-02-28 14:49 David Sterba
From: David Sterba @ 2025-02-28 14:49 UTC (permalink / raw)
To: linux-btrfs; +Cc: David Sterba
Add an ioctl that is similar to FITRIM and that, in addition to trim, can
also zero the unused space (either by plain overwrite, or by unmapping the
blocks if the device supports it) and do a secure erase.
This can be used to zero the unused space in e.g. VM images (when run
from inside the guest, if fstrim is not supported) or free space on
thin-provisioned devices.
Secure erase is provided by the blkdiscard command, but I'm not aware of
an equivalent that can be run on a filesystem, so this is for parity.
David Sterba (5):
btrfs: extend trim callchains to pass the operation type
btrfs: add new ioctl CLEAR_FREE
btrfs: add zeroout mode to CLEAR_FREE ioctl
btrfs: add secure erase mode to CLEAR_FREE ioctl
btrfs: add more zeroout modes to CLEAR_FREE ioctl
fs/btrfs/discard.c | 4 +-
fs/btrfs/extent-tree.c | 159 +++++++++++++++++++++++++++++++-----
fs/btrfs/extent-tree.h | 5 +-
fs/btrfs/free-space-cache.c | 29 ++++---
fs/btrfs/free-space-cache.h | 8 +-
fs/btrfs/inode.c | 2 +-
fs/btrfs/ioctl.c | 42 ++++++++++
fs/btrfs/volumes.c | 3 +-
include/uapi/linux/btrfs.h | 46 +++++++++++
9 files changed, 258 insertions(+), 40 deletions(-)
--
2.47.1
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH 1/5] btrfs: extend trim callchains to pass the operation type 2025-02-28 14:49 [PATCH 0/5] Ioctl to clear unused space in various ways David Sterba @ 2025-02-28 14:49 ` David Sterba 2025-02-28 14:49 ` [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE David Sterba ` (4 subsequent siblings) 5 siblings, 0 replies; 14+ messages in thread From: David Sterba @ 2025-02-28 14:49 UTC (permalink / raw) To: linux-btrfs; +Cc: David Sterba Preparatory work for more than trim/discard operation that can be performed on the unused space from an ioctl. As FITRIM is not extensible, we'll need a new one. Now we extend any caller that takes part in the trim/discard to take one parameter defining the type of operation. The operation multiplexer btrfs_issue_clear_op() will be extended in followup patches. Signed-off-by: David Sterba <dsterba@suse.com> --- fs/btrfs/discard.c | 4 +-- fs/btrfs/extent-tree.c | 51 +++++++++++++++++++++++-------------- fs/btrfs/extent-tree.h | 3 ++- fs/btrfs/free-space-cache.c | 29 +++++++++++---------- fs/btrfs/free-space-cache.h | 8 +++--- fs/btrfs/inode.c | 2 +- fs/btrfs/volumes.c | 3 ++- include/uapi/linux/btrfs.h | 8 ++++++ 8 files changed, 68 insertions(+), 40 deletions(-) diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c index e815d165cccc..cd7220465f1f 100644 --- a/fs/btrfs/discard.c +++ b/fs/btrfs/discard.c @@ -518,13 +518,13 @@ static void btrfs_discard_workfn(struct work_struct *work) btrfs_trim_block_group_bitmaps(block_group, &trimmed, block_group->discard_cursor, btrfs_block_group_end(block_group), - minlen, maxlen, true); + minlen, maxlen, true, BTRFS_CLEAR_OP_DISCARD); discard_ctl->discard_bitmap_bytes += trimmed; } else { btrfs_trim_block_group_extents(block_group, &trimmed, block_group->discard_cursor, btrfs_block_group_end(block_group), - minlen, true); + minlen, true, BTRFS_CLEAR_OP_DISCARD); discard_ctl->discard_extent_bytes += trimmed; } diff --git a/fs/btrfs/extent-tree.c 
b/fs/btrfs/extent-tree.c index 5de1a1293c93..df86ffde478b 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1247,8 +1247,20 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans, return ret; } +static int btrfs_issue_clear_op(struct block_device *bdev, u64 start, u64 size, + enum btrfs_clear_op_type clear) +{ + switch (clear) { + case BTRFS_CLEAR_OP_DISCARD: + return blkdev_issue_discard(bdev, start >> SECTOR_SHIFT, + size >> SECTOR_SHIFT, GFP_NOFS); + default: + return -EOPNOTSUPP; + } +} + static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len, - u64 *discarded_bytes) + u64 *discarded_bytes, enum btrfs_clear_op_type clear) { int j, ret = 0; u64 bytes_left, end; @@ -1293,11 +1305,8 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len, bytes_left = end - start; continue; } - if (size) { - ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT, - size >> SECTOR_SHIFT, - GFP_NOFS); + ret = btrfs_issue_clear_op(bdev, start, size, clear); if (!ret) *discarded_bytes += size; else if (ret != -EOPNOTSUPP) @@ -1315,9 +1324,7 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len, while (bytes_left) { u64 bytes_to_discard = min(BTRFS_MAX_DISCARD_CHUNK_SIZE, bytes_left); - ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT, - bytes_to_discard >> SECTOR_SHIFT, - GFP_NOFS); + ret = btrfs_issue_clear_op(bdev, start, bytes_to_discard, clear); if (ret) { if (ret != -EOPNOTSUPP) @@ -1338,7 +1345,8 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len, return ret; } -static int do_discard_extent(struct btrfs_discard_stripe *stripe, u64 *bytes) +static int do_discard_extent(struct btrfs_discard_stripe *stripe, u64 *bytes, + enum btrfs_clear_op_type clear) { struct btrfs_device *dev = stripe->dev; struct btrfs_fs_info *fs_info = dev->fs_info; @@ -1367,7 +1375,7 @@ static int do_discard_extent(struct btrfs_discard_stripe *stripe, u64 *bytes) &discarded); 
discarded += src_disc; } else if (bdev_max_discard_sectors(stripe->dev->bdev)) { - ret = btrfs_issue_discard(dev->bdev, phys, len, &discarded); + ret = btrfs_issue_discard(dev->bdev, phys, len, &discarded, clear); } else { ret = 0; *bytes = 0; @@ -1379,7 +1387,8 @@ static int do_discard_extent(struct btrfs_discard_stripe *stripe, u64 *bytes) } int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr, - u64 num_bytes, u64 *actual_bytes) + u64 num_bytes, u64 *actual_bytes, + enum btrfs_clear_op_type clear) { int ret = 0; u64 discarded_bytes = 0; @@ -1418,7 +1427,7 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr, &stripe->dev->dev_state)) continue; - ret = do_discard_extent(stripe, &bytes); + ret = do_discard_extent(stripe, &bytes, clear); if (ret) { /* * Keep going if discard is not supported by the @@ -2837,7 +2846,8 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans) if (btrfs_test_opt(fs_info, DISCARD_SYNC)) ret = btrfs_discard_extent(fs_info, start, - end + 1 - start, NULL); + end + 1 - start, NULL, + BTRFS_CLEAR_OP_DISCARD); clear_extent_dirty(unpin, start, end, &cached_state); ret = unpin_extent_range(fs_info, start, end, true); @@ -2866,7 +2876,8 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans) ret = btrfs_discard_extent(fs_info, block_group->start, block_group->length, - &trimmed); + &trimmed, + BTRFS_CLEAR_OP_DISCARD); list_del_init(&block_group->bg_list); btrfs_unfreeze_block_group(block_group); @@ -6360,7 +6371,8 @@ void btrfs_error_unpin_extent_range(struct btrfs_fs_info *fs_info, u64 start, u6 * it while performing the free space search since we have already * held back allocations. 
*/ -static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed) +static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed, + enum btrfs_clear_op_type clear) { u64 start = BTRFS_DEVICE_RANGE_RESERVED, len = 0, end = 0; int ret; @@ -6425,8 +6437,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device, u64 *trimmed) break; } - ret = btrfs_issue_discard(device->bdev, start, len, - &bytes); + ret = btrfs_issue_discard(device->bdev, start, len, &bytes, clear); if (!ret) set_extent_bit(&device->alloc_state, start, start + bytes - 1, CHUNK_TRIMMED, NULL); @@ -6508,7 +6519,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range) &group_trimmed, start, end, - range->minlen); + range->minlen, + BTRFS_CLEAR_OP_DISCARD); trimmed += group_trimmed; if (ret) { @@ -6529,7 +6541,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range) if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state)) continue; - ret = btrfs_trim_free_extents(device, &group_trimmed); + ret = btrfs_trim_free_extents(device, &group_trimmed, + BTRFS_CLEAR_OP_DISCARD); trimmed += group_trimmed; if (ret) { diff --git a/fs/btrfs/extent-tree.h b/fs/btrfs/extent-tree.h index 0ed682d9ed7b..c8e1a30309ab 100644 --- a/fs/btrfs/extent-tree.h +++ b/fs/btrfs/extent-tree.h @@ -163,7 +163,8 @@ int btrfs_drop_subtree(struct btrfs_trans_handle *trans, struct extent_buffer *parent); void btrfs_error_unpin_extent_range(struct btrfs_fs_info *fs_info, u64 start, u64 end); int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr, - u64 num_bytes, u64 *actual_bytes); + u64 num_bytes, u64 *actual_bytes, + enum btrfs_clear_op_type clear); int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range); #endif diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index 3095cce904b5..161fdcab07e4 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -3652,7 +3652,8 @@ static 
int do_trimming(struct btrfs_block_group *block_group, u64 *total_trimmed, u64 start, u64 bytes, u64 reserved_start, u64 reserved_bytes, enum btrfs_trim_state reserved_trim_state, - struct btrfs_trim_range *trim_entry) + struct btrfs_trim_range *trim_entry, + enum btrfs_clear_op_type clear) { struct btrfs_space_info *space_info = block_group->space_info; struct btrfs_fs_info *fs_info = block_group->fs_info; @@ -3674,7 +3675,7 @@ static int do_trimming(struct btrfs_block_group *block_group, spin_unlock(&block_group->lock); spin_unlock(&space_info->lock); - ret = btrfs_discard_extent(fs_info, start, bytes, &trimmed); + ret = btrfs_discard_extent(fs_info, start, bytes, &trimmed, clear); if (!ret) { *total_trimmed += trimmed; trim_state = BTRFS_TRIM_STATE_TRIMMED; @@ -3711,7 +3712,7 @@ static int do_trimming(struct btrfs_block_group *block_group, */ static int trim_no_bitmap(struct btrfs_block_group *block_group, u64 *total_trimmed, u64 start, u64 end, u64 minlen, - bool async) + bool async, enum btrfs_clear_op_type clear) { struct btrfs_discard_ctl *discard_ctl = &block_group->fs_info->discard_ctl; @@ -3800,7 +3801,7 @@ static int trim_no_bitmap(struct btrfs_block_group *block_group, ret = do_trimming(block_group, total_trimmed, start, bytes, extent_start, extent_bytes, extent_trim_state, - &trim_entry); + &trim_entry, clear); if (ret) { block_group->discard_cursor = start + bytes; break; @@ -3877,7 +3878,7 @@ static void end_trimming_bitmap(struct btrfs_free_space_ctl *ctl, */ static int trim_bitmaps(struct btrfs_block_group *block_group, u64 *total_trimmed, u64 start, u64 end, u64 minlen, - u64 maxlen, bool async) + u64 maxlen, bool async, enum btrfs_clear_op_type clear) { struct btrfs_discard_ctl *discard_ctl = &block_group->fs_info->discard_ctl; @@ -3986,7 +3987,7 @@ static int trim_bitmaps(struct btrfs_block_group *block_group, mutex_unlock(&ctl->cache_writeout_mutex); ret = do_trimming(block_group, total_trimmed, start, bytes, - start, bytes, 0, &trim_entry); + 
start, bytes, 0, &trim_entry, clear); if (ret) { reset_trimming_bitmap(ctl, offset); block_group->discard_cursor = @@ -4020,7 +4021,8 @@ static int trim_bitmaps(struct btrfs_block_group *block_group, } int btrfs_trim_block_group(struct btrfs_block_group *block_group, - u64 *trimmed, u64 start, u64 end, u64 minlen) + u64 *trimmed, u64 start, u64 end, u64 minlen, + enum btrfs_clear_op_type clear) { struct btrfs_free_space_ctl *ctl = block_group->free_space_ctl; int ret; @@ -4038,11 +4040,11 @@ int btrfs_trim_block_group(struct btrfs_block_group *block_group, btrfs_freeze_block_group(block_group); spin_unlock(&block_group->lock); - ret = trim_no_bitmap(block_group, trimmed, start, end, minlen, false); + ret = trim_no_bitmap(block_group, trimmed, start, end, minlen, false, clear); if (ret) goto out; - ret = trim_bitmaps(block_group, trimmed, start, end, minlen, 0, false); + ret = trim_bitmaps(block_group, trimmed, start, end, minlen, 0, false, clear); div64_u64_rem(end, BITS_PER_BITMAP * ctl->unit, &rem); /* If we ended in the middle of a bitmap, reset the trimming flag */ if (rem) @@ -4054,7 +4056,7 @@ int btrfs_trim_block_group(struct btrfs_block_group *block_group, int btrfs_trim_block_group_extents(struct btrfs_block_group *block_group, u64 *trimmed, u64 start, u64 end, u64 minlen, - bool async) + bool async, enum btrfs_clear_op_type clear) { int ret; @@ -4068,7 +4070,7 @@ int btrfs_trim_block_group_extents(struct btrfs_block_group *block_group, btrfs_freeze_block_group(block_group); spin_unlock(&block_group->lock); - ret = trim_no_bitmap(block_group, trimmed, start, end, minlen, async); + ret = trim_no_bitmap(block_group, trimmed, start, end, minlen, async, clear); btrfs_unfreeze_block_group(block_group); return ret; @@ -4076,7 +4078,8 @@ int btrfs_trim_block_group_extents(struct btrfs_block_group *block_group, int btrfs_trim_block_group_bitmaps(struct btrfs_block_group *block_group, u64 *trimmed, u64 start, u64 end, u64 minlen, - u64 maxlen, bool async) + u64 
maxlen, bool async, + enum btrfs_clear_op_type clear) { int ret; @@ -4091,7 +4094,7 @@ int btrfs_trim_block_group_bitmaps(struct btrfs_block_group *block_group, spin_unlock(&block_group->lock); ret = trim_bitmaps(block_group, trimmed, start, end, minlen, maxlen, - async); + async, clear); btrfs_unfreeze_block_group(block_group); diff --git a/fs/btrfs/free-space-cache.h b/fs/btrfs/free-space-cache.h index 9f1dbfdee8ca..c4c2e5571355 100644 --- a/fs/btrfs/free-space-cache.h +++ b/fs/btrfs/free-space-cache.h @@ -159,13 +159,15 @@ void btrfs_return_cluster_to_free_space( struct btrfs_block_group *block_group, struct btrfs_free_cluster *cluster); int btrfs_trim_block_group(struct btrfs_block_group *block_group, - u64 *trimmed, u64 start, u64 end, u64 minlen); + u64 *trimmed, u64 start, u64 end, u64 minlen, + enum btrfs_clear_op_type clear); int btrfs_trim_block_group_extents(struct btrfs_block_group *block_group, u64 *trimmed, u64 start, u64 end, u64 minlen, - bool async); + bool async, enum btrfs_clear_op_type clear); int btrfs_trim_block_group_bitmaps(struct btrfs_block_group *block_group, u64 *trimmed, u64 start, u64 end, u64 minlen, - u64 maxlen, bool async); + u64 maxlen, bool async, + enum btrfs_clear_op_type clear); bool btrfs_free_space_cache_v1_active(struct btrfs_fs_info *fs_info); int btrfs_set_free_space_cache_v1_active(struct btrfs_fs_info *fs_info, bool active); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index f9bd9242acd3..282ccd1869c0 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3289,7 +3289,7 @@ int btrfs_finish_one_ordered(struct btrfs_ordered_extent *ordered_extent) btrfs_discard_extent(fs_info, ordered_extent->disk_bytenr, ordered_extent->disk_num_bytes, - NULL); + NULL, BTRFS_CLEAR_OP_DISCARD); btrfs_free_reserved_extent(fs_info, ordered_extent->disk_bytenr, ordered_extent->disk_num_bytes, 1); diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 515003a23e65..b6d007e358d4 100644 --- a/fs/btrfs/volumes.c +++ 
b/fs/btrfs/volumes.c @@ -3534,7 +3534,8 @@ int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset) * filesystem's point of view. */ if (btrfs_is_zoned(fs_info)) { - ret = btrfs_discard_extent(fs_info, chunk_offset, length, NULL); + ret = btrfs_discard_extent(fs_info, chunk_offset, length, NULL, + BTRFS_CLEAR_OP_DISCARD); if (ret) btrfs_info(fs_info, "failed to reset zone %llu after relocation", diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h index d3b222d7af24..64f971a6bcb2 100644 --- a/include/uapi/linux/btrfs.h +++ b/include/uapi/linux/btrfs.h @@ -1086,6 +1086,14 @@ enum btrfs_err_code { BTRFS_ERROR_DEV_RAID1C4_MIN_NOT_MET, }; +/* + * Type of operation that will be used to clear unused blocks. + */ +enum btrfs_clear_op_type { + BTRFS_CLEAR_OP_DISCARD, + BTRFS_NR_CLEAR_OP_TYPES, +}; + #define BTRFS_IOC_SNAP_CREATE _IOW(BTRFS_IOCTL_MAGIC, 1, \ struct btrfs_ioctl_vol_args) #define BTRFS_IOC_DEFRAG _IOW(BTRFS_IOCTL_MAGIC, 2, \ -- 2.47.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
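For readers following the call chain, the chunking loop in btrfs_issue_discard() together with the new multiplexer can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: issue_clear_op(), clear_range(), struct op_log and the max_chunk parameter are hypothetical stand-ins and not kernel APIs, and the model assumes that every iteration hands the per-chunk size to the operation, the way the pre-patch blkdev_issue_discard() call used bytes_to_discard.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical mirror of enum btrfs_clear_op_type from the patch. */
enum clear_op { CLEAR_OP_DISCARD, NR_CLEAR_OPS };

/* Records what the model "device" was asked to do. */
struct op_log {
	uint64_t calls;   /* number of operations issued */
	uint64_t bytes;   /* total bytes covered */
	uint64_t largest; /* largest single request */
};

/* Stand-in for btrfs_issue_clear_op(): dispatch on the operation type. */
static int issue_clear_op(struct op_log *log, uint64_t start, uint64_t size,
			  enum clear_op op)
{
	(void)start; /* a real implementation would convert this to sectors */

	switch (op) {
	case CLEAR_OP_DISCARD:
		log->calls++;
		log->bytes += size;
		if (size > log->largest)
			log->largest = size;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}

/* Split [start, start + len) into chunks of at most max_chunk bytes. */
static int clear_range(struct op_log *log, uint64_t start, uint64_t len,
		       uint64_t max_chunk, enum clear_op op)
{
	uint64_t bytes_left = len;

	while (bytes_left) {
		uint64_t chunk = bytes_left < max_chunk ? bytes_left : max_chunk;
		int ret = issue_clear_op(log, start, chunk, op);

		if (ret)
			return ret;
		start += chunk;
		bytes_left -= chunk;
	}
	return 0;
}
```

Keeping the dispatch in one small function means that later modes only add a case to the switch, while the chunking loop stays unchanged.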
* [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE 2025-02-28 14:49 [PATCH 0/5] Ioctl to clear unused space in various ways David Sterba 2025-02-28 14:49 ` [PATCH 1/5] btrfs: extend trim callchains to pass the operation type David Sterba @ 2025-02-28 14:49 ` David Sterba 2025-03-01 3:19 ` Sun YangKai ` (2 more replies) 2025-02-28 14:49 ` [PATCH 3/5] btrfs: add zeroout mode to CLEAR_FREE ioctl David Sterba ` (3 subsequent siblings) 5 siblings, 3 replies; 14+ messages in thread From: David Sterba @ 2025-02-28 14:49 UTC (permalink / raw) To: linux-btrfs; +Cc: David Sterba Add a new ioctl that is an extensible version of FITRIM. It currently does only the trim/discard and will be extended by other modes like zeroing or block unmapping. We need a new ioctl for that because struct fstrim_range does not provide any existing or reserved member for extensions. The new ioctl also supports TRIM as the operation type. Signed-off-by: David Sterba <dsterba@suse.com> --- fs/btrfs/extent-tree.c | 92 ++++++++++++++++++++++++++++++++++++++ fs/btrfs/extent-tree.h | 2 + fs/btrfs/ioctl.c | 42 +++++++++++++++++ include/uapi/linux/btrfs.h | 20 +++++++++ 4 files changed, 156 insertions(+) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index df86ffde478b..4ab9850b7383 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -6562,3 +6562,95 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range) return bg_ret; return dev_ret; } + +int btrfs_clear_free_space(struct btrfs_fs_info *fs_info, + struct btrfs_ioctl_clear_free_args *args) +{ + struct btrfs_fs_devices *fs_devices = fs_info->fs_devices; + struct btrfs_device *device; + struct btrfs_block_group *cache = NULL; + u64 group_cleared; + u64 range_end = U64_MAX; + u64 start; + u64 end; + u64 cleared = 0; + u64 bg_failed = 0; + u64 dev_failed = 0; + int bg_ret = 0; + int dev_ret = 0; + int ret = 0; + + if (args->start == U64_MAX) + return -EINVAL; + + /* + * Check range overflow if args->length is 
set. The default args->length + * is U64_MAX. + */ + if (args->length != U64_MAX && + check_add_overflow(args->start, args->length, &range_end)) + return -EINVAL; + + cache = btrfs_lookup_first_block_group(fs_info, args->start); + for (; cache; cache = btrfs_next_block_group(cache)) { + if (cache->start >= range_end) { + btrfs_put_block_group(cache); + break; + } + + start = max(args->start, cache->start); + end = min(range_end, cache->start + cache->length); + + if (end - start >= args->minlen) { + if (!btrfs_block_group_done(cache)) { + ret = btrfs_cache_block_group(cache, true); + if (ret) { + bg_failed++; + bg_ret = ret; + continue; + } + } + ret = btrfs_trim_block_group(cache, &group_cleared, + start, end, args->minlen, + args->type); + + cleared += group_cleared; + if (ret) { + bg_failed++; + bg_ret = ret; + continue; + } + } + } + + if (bg_failed) + btrfs_warn(fs_info, + "failed to clear %llu block group(s), last error %d", + bg_failed, bg_ret); + + mutex_lock(&fs_devices->device_list_mutex); + list_for_each_entry(device, &fs_devices->devices, dev_list) { + if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state)) + continue; + + ret = btrfs_trim_free_extents(device, &group_cleared, args->type); + if (ret) { + dev_failed++; + dev_ret = ret; + break; + } + + cleared += group_cleared; + } + mutex_unlock(&fs_devices->device_list_mutex); + + if (dev_failed) + btrfs_warn(fs_info, + "failed to trim %llu device(s), last error %d", + dev_failed, dev_ret); + args->length = cleared; + if (bg_ret) + return bg_ret; + + return dev_ret; +} diff --git a/fs/btrfs/extent-tree.h b/fs/btrfs/extent-tree.h index c8e1a30309ab..e0702b276825 100644 --- a/fs/btrfs/extent-tree.h +++ b/fs/btrfs/extent-tree.h @@ -166,5 +166,7 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr, u64 num_bytes, u64 *actual_bytes, enum btrfs_clear_op_type clear); int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range); +int btrfs_clear_free_space(struct 
btrfs_fs_info *fs_info, + struct btrfs_ioctl_clear_free_args *args); #endif diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index f3ce82d113be..203e8a23d6c2 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -5213,6 +5213,46 @@ static int btrfs_ioctl_subvol_sync(struct btrfs_fs_info *fs_info, void __user *a return 0; } +static int btrfs_ioctl_clear_free(struct file *file, void __user *arg) +{ + struct btrfs_fs_info *fs_info; + struct btrfs_ioctl_clear_free_args args; + u64 total_bytes; + int ret; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + if (copy_from_user(&args, arg, sizeof(args))) + return -EFAULT; + + if (args.type >= BTRFS_NR_CLEAR_OP_TYPES) + return -EOPNOTSUPP; + + ret = mnt_want_write_file(file); + if (ret) + return ret; + + fs_info = inode_to_fs_info(file_inode(file)); + total_bytes = btrfs_super_total_bytes(fs_info->super_copy); + if (args.start > total_bytes) { + ret = -EINVAL; + goto out_drop_write; + } + + ret = btrfs_clear_free_space(fs_info, &args); + if (ret < 0) + goto out_drop_write; + + if (copy_to_user(arg, &args, sizeof(args))) + ret = -EFAULT; + +out_drop_write: + mnt_drop_write_file(file); + + return 0; +} + long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -5368,6 +5408,8 @@ long btrfs_ioctl(struct file *file, unsigned int #endif case BTRFS_IOC_SUBVOL_SYNC_WAIT: return btrfs_ioctl_subvol_sync(fs_info, argp); + case BTRFS_IOC_CLEAR_FREE: + return btrfs_ioctl_clear_free(file, argp); } return -ENOTTY; diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h index 64f971a6bcb2..278010aff02e 100644 --- a/include/uapi/linux/btrfs.h +++ b/include/uapi/linux/btrfs.h @@ -1094,6 +1094,24 @@ enum btrfs_clear_op_type { BTRFS_NR_CLEAR_OP_TYPES, }; +struct btrfs_ioctl_clear_free_args { + /* In, type of clearing operation, enumerated in btrfs_clear_op_type. */ + __u32 type; + /* Reserved, must be zero. */ + __u32 reserved1; + /* + * In. 
Starting offset to clear from in the logical address space (same + * as fstrim_range::start). + */ + __u64 start; /* in */ + /* In, out. Length from the start to clear (same as fstrim_range::length). */ + __u64 length; + /* In. Minimal length to clear (same as fstrim_range::minlen). */ + __u64 minlen; + /* Reserved, must be zero. */ + __u64 reserved2[4]; +}; + #define BTRFS_IOC_SNAP_CREATE _IOW(BTRFS_IOCTL_MAGIC, 1, \ struct btrfs_ioctl_vol_args) #define BTRFS_IOC_DEFRAG _IOW(BTRFS_IOCTL_MAGIC, 2, \ @@ -1200,6 +1218,8 @@ enum btrfs_clear_op_type { struct btrfs_ioctl_vol_args_v2) #define BTRFS_IOC_LOGICAL_INO_V2 _IOWR(BTRFS_IOCTL_MAGIC, 59, \ struct btrfs_ioctl_logical_ino_args) +#define BTRFS_IOC_CLEAR_FREE _IOW(BTRFS_IOCTL_MAGIC, 90, \ + struct btrfs_ioctl_clear_free_args) #define BTRFS_IOC_GET_SUBVOL_INFO _IOR(BTRFS_IOCTL_MAGIC, 60, \ struct btrfs_ioctl_get_subvol_info_args) #define BTRFS_IOC_GET_SUBVOL_ROOTREF _IOWR(BTRFS_IOCTL_MAGIC, 61, \ -- 2.47.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
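The range handling in btrfs_clear_free_space() (overflow check on start + length, then clamping the request to each block group and applying minlen) can be illustrated with a small userspace model. This is a sketch, not the kernel code: compute_range_end() and clamp_to_group() are hypothetical names, __builtin_add_overflow() is the GCC/Clang builtin standing in for the kernel's check_add_overflow(), and the extra end < start guard is an addition of the model.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the exclusive end of the requested range.
 * Returns false where the patch would return -EINVAL.
 * A length of UINT64_MAX means "to the end" and leaves range_end at UINT64_MAX.
 */
static bool compute_range_end(uint64_t start, uint64_t length,
			      uint64_t *range_end)
{
	*range_end = UINT64_MAX;

	if (start == UINT64_MAX)
		return false;
	if (length != UINT64_MAX &&
	    __builtin_add_overflow(start, length, range_end))
		return false;
	return true;
}

/*
 * Clamp the request to one block group [bg_start, bg_start + bg_len).
 * Returns true if the clamped range is at least minlen bytes long and
 * should therefore be processed.
 */
static bool clamp_to_group(uint64_t req_start, uint64_t range_end,
			   uint64_t bg_start, uint64_t bg_len, uint64_t minlen,
			   uint64_t *start, uint64_t *end)
{
	uint64_t bg_end = bg_start + bg_len;

	if (bg_start >= range_end)
		return false; /* past the requested range, iteration stops */

	*start = req_start > bg_start ? req_start : bg_start;
	*end = range_end < bg_end ? range_end : bg_end;

	if (*end < *start)
		return false; /* model-only guard against underflow */
	return *end - *start >= minlen;
}
```

For example, a request of start 100 and length 50 against a block group covering bytes 0 to 120 is clamped to the range 100 to 120, which is skipped when minlen is larger than 20.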
* Re: [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE 2025-02-28 14:49 ` [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE David Sterba @ 2025-03-01 3:19 ` Sun YangKai 2025-03-03 8:43 ` David Sterba 2025-03-03 8:47 ` David Sterba 2025-03-10 16:38 ` Johannes Thumshirn 2 siblings, 1 reply; 14+ messages in thread From: Sun YangKai @ 2025-03-01 3:19 UTC (permalink / raw) To: dsterba; +Cc: linux-btrfs New to lkml, please correct me if I made any mistake :P > +static int btrfs_ioctl_clear_free(struct file *file, void __user *arg) > +{ > + struct btrfs_fs_info *fs_info; > + struct btrfs_ioctl_clear_free_args args; > + u64 total_bytes; > + int ret; > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + if (copy_from_user(&args, arg, sizeof(args))) > + return -EFAULT; > + > + if (args.type >= BTRFS_NR_CLEAR_OP_TYPES) > + return -EOPNOTSUPP; > + > + ret = mnt_want_write_file(file); > + if (ret) > + return ret; > + > + fs_info = inode_to_fs_info(file_inode(file)); > + total_bytes = btrfs_super_total_bytes(fs_info->super_copy); > + if (args.start > total_bytes) { > + ret = -EINVAL; > + goto out_drop_write; > + } > + > + ret = btrfs_clear_free_space(fs_info, &args); > + if (ret < 0) > + goto out_drop_write; > + > + if (copy_to_user(arg, &args, sizeof(args))) > + ret = -EFAULT; > + > +out_drop_write: > + mnt_drop_write_file(file); > + > + return 0; previous stored return value int `ret` is not used here. > +} ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE 2025-03-01 3:19 ` Sun YangKai @ 2025-03-03 8:43 ` David Sterba 0 siblings, 0 replies; 14+ messages in thread From: David Sterba @ 2025-03-03 8:43 UTC (permalink / raw) To: Sun YangKai; +Cc: dsterba, linux-btrfs On Sat, Mar 01, 2025 at 11:19:13AM +0800, Sun YangKai wrote: > New to lkml, please correct me if I made any mistake :P > > > +static int btrfs_ioctl_clear_free(struct file *file, void __user *arg) > > +{ > > + struct btrfs_fs_info *fs_info; > > + struct btrfs_ioctl_clear_free_args args; > > + u64 total_bytes; > > + int ret; > > + > > + if (!capable(CAP_SYS_ADMIN)) > > + return -EPERM; > > + > > + if (copy_from_user(&args, arg, sizeof(args))) > > + return -EFAULT; > > + > > + if (args.type >= BTRFS_NR_CLEAR_OP_TYPES) > > + return -EOPNOTSUPP; > > + > > + ret = mnt_want_write_file(file); > > + if (ret) > > + return ret; > > + > > + fs_info = inode_to_fs_info(file_inode(file)); > > + total_bytes = btrfs_super_total_bytes(fs_info->super_copy); > > + if (args.start > total_bytes) { > > + ret = -EINVAL; > > + goto out_drop_write; > > + } > > + > > + ret = btrfs_clear_free_space(fs_info, &args); > > + if (ret < 0) > > + goto out_drop_write; > > + > > + if (copy_to_user(arg, &args, sizeof(args))) > > + ret = -EFAULT; > > + > > +out_drop_write: > > + mnt_drop_write_file(file); > > + > > + return 0; > previous stored return value int `ret` is not used here. Right, that's a mistake, it should have been. I'll fix it, thanks. ^ permalink raw reply [flat|nested] 14+ messages in thread
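The fix agreed on above amounts to returning ret from the common exit path instead of a constant 0. A minimal userspace model of that goto-cleanup pattern is sketched below; clear_free_model(), drop_write() and the clear_result parameter are hypothetical stand-ins for the ioctl handler, mnt_drop_write_file() and the result of btrfs_clear_free_space(), and are not taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Counts how often the cleanup stand-in ran. */
static int drop_count;

/* Stand-in for mnt_drop_write_file(). */
static void drop_write(void)
{
	drop_count++;
}

/*
 * Model of the handler's control flow after write access was taken.
 * clear_result plays the role of btrfs_clear_free_space()'s return value.
 */
static int clear_free_model(uint64_t start, uint64_t total_bytes,
			    int clear_result)
{
	int ret;

	if (start > total_bytes) {
		ret = -EINVAL;
		goto out_drop_write;
	}

	ret = clear_result;
	if (ret < 0)
		goto out_drop_write;

	/* The copy back to userspace would happen here. */

out_drop_write:
	drop_write();

	return ret; /* the posted version returned 0 here */
}
```

With the constant 0, every error recorded in ret after mnt_want_write_file() would be reported to the caller as success.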
* Re: [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE 2025-02-28 14:49 ` [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE David Sterba 2025-03-01 3:19 ` Sun YangKai @ 2025-03-03 8:47 ` David Sterba 2025-03-10 16:38 ` Johannes Thumshirn 2 siblings, 0 replies; 14+ messages in thread From: David Sterba @ 2025-03-03 8:47 UTC (permalink / raw) To: David Sterba; +Cc: linux-btrfs On Fri, Feb 28, 2025 at 03:49:31PM +0100, David Sterba wrote: > @@ -1200,6 +1218,8 @@ enum btrfs_clear_op_type { > struct btrfs_ioctl_vol_args_v2) > #define BTRFS_IOC_LOGICAL_INO_V2 _IOWR(BTRFS_IOCTL_MAGIC, 59, \ > struct btrfs_ioctl_logical_ino_args) > +#define BTRFS_IOC_CLEAR_FREE _IOW(BTRFS_IOCTL_MAGIC, 90, \ > + struct btrfs_ioctl_clear_free_args) This will be moved to the end and renumbered to nr. 66. This just shows the age of the patchset when this was the last one. > #define BTRFS_IOC_GET_SUBVOL_INFO _IOR(BTRFS_IOCTL_MAGIC, 60, \ > struct btrfs_ioctl_get_subvol_info_args) > #define BTRFS_IOC_GET_SUBVOL_ROOTREF _IOWR(BTRFS_IOCTL_MAGIC, 61, \ > -- > 2.47.1 > ^ permalink raw reply [flat|nested] 14+ messages in thread
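From the userspace side, a caller of the proposed ioctl might look roughly like the sketch below. Everything prefixed with MODEL_ or model_ is a local, hypothetical mirror of the definitions in the posted patch, because the interface is not in any installed header: the struct layout follows the patch, the magic 0x94 is BTRFS_IOCTL_MAGIC, and the number 90 is the value as posted, which is expected to change as noted above.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of struct btrfs_ioctl_clear_free_args as posted. */
struct model_clear_free_args {
	uint32_t type;         /* in: operation, 0 = discard in the patch */
	uint32_t reserved1;    /* must be zero */
	uint64_t start;        /* in: logical start offset */
	uint64_t length;       /* in/out: bytes to clear, bytes cleared */
	uint64_t minlen;       /* in: skip free ranges shorter than this */
	uint64_t reserved2[4]; /* must be zero */
};

_Static_assert(sizeof(struct model_clear_free_args) == 64,
	       "layout assumed by this sketch");

#define MODEL_BTRFS_IOCTL_MAGIC 0x94
/* Number as posted; the final number may differ. */
#define MODEL_IOC_CLEAR_FREE \
	_IOW(MODEL_BTRFS_IOCTL_MAGIC, 90, struct model_clear_free_args)

/* Request clearing of the whole filesystem with the given operation. */
static void model_clear_free_init(struct model_clear_free_args *args,
				  uint32_t type, uint64_t minlen)
{
	memset(args, 0, sizeof(*args)); /* zeroes the reserved fields */
	args->type = type;
	args->start = 0;
	args->length = UINT64_MAX;      /* default: no upper bound */
	args->minlen = minlen;
}

/* fd is an open file descriptor on the mounted filesystem. */
static int model_clear_free(int fd, struct model_clear_free_args *args)
{
	return ioctl(fd, MODEL_IOC_CLEAR_FREE, args);
}
```

On success the kernel side of the patch stores the number of cleared bytes in length, mirroring how FITRIM reports the trimmed amount in fstrim_range::len.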
* Re: [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE 2025-02-28 14:49 ` [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE David Sterba 2025-03-01 3:19 ` Sun YangKai 2025-03-03 8:47 ` David Sterba @ 2025-03-10 16:38 ` Johannes Thumshirn 2025-03-10 20:18 ` David Sterba 2025-03-11 0:11 ` Damien Le Moal 2 siblings, 2 replies; 14+ messages in thread From: Johannes Thumshirn @ 2025-03-10 16:38 UTC (permalink / raw) To: David Sterba, linux-btrfs@vger.kernel.org, Naohiro Aota On 28.02.25 15:49, David Sterba wrote: > Add a new ioctl that is an extensible version of FITRIM. It currently > does only the trim/discard and will be extended by other modes like > zeroing or block unmapping. > > We need a new ioctl for that because struct fstrim_range does not > provide any existing or reserved member for extensions. The new ioctl > also supports TRIM as the operation type. > > Signed-off-by: David Sterba <dsterba@suse.com> [...] I /think/ we need some extra checks for zoned here. blkdev_issue_zeroout won't work on zoned devices IFF the 'start' range is not aligned to the write pointer. Also blkdev_issue_discard() on 'unused space' of a zoned filesystem won't do what a user is expecting. 
I think this needs: > +static int btrfs_ioctl_clear_free(struct file *file, void __user *arg) > +{ > + struct btrfs_fs_info *fs_info; > + struct btrfs_ioctl_clear_free_args args; > + u64 total_bytes; > + int ret; > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; if (btrfs_is_zoned(fs_info)) return -EOPNOTSUPP; > + > + if (copy_from_user(&args, arg, sizeof(args))) > + return -EFAULT; > + > + if (args.type >= BTRFS_NR_CLEAR_OP_TYPES) > + return -EOPNOTSUPP; > + > + ret = mnt_want_write_file(file); > + if (ret) > + return ret; > + > + fs_info = inode_to_fs_info(file_inode(file)); > + total_bytes = btrfs_super_total_bytes(fs_info->super_copy); > + if (args.start > total_bytes) { > + ret = -EINVAL; > + goto out_drop_write; > + } > + > + ret = btrfs_clear_free_space(fs_info, &args); > + if (ret < 0) > + goto out_drop_write; > + > + if (copy_to_user(arg, &args, sizeof(args))) > + ret = -EFAULT; > + > +out_drop_write: > + mnt_drop_write_file(file); > + > + return 0; > +} > + ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE 2025-03-10 16:38 ` Johannes Thumshirn @ 2025-03-10 20:18 ` David Sterba 2025-03-11 0:11 ` Damien Le Moal 1 sibling, 0 replies; 14+ messages in thread From: David Sterba @ 2025-03-10 20:18 UTC (permalink / raw) To: Johannes Thumshirn Cc: David Sterba, linux-btrfs@vger.kernel.org, Naohiro Aota On Mon, Mar 10, 2025 at 04:38:25PM +0000, Johannes Thumshirn wrote: > On 28.02.25 15:49, David Sterba wrote: > > Add a new ioctl that is an extensible version of FITRIM. It currently > > does only the trim/discard and will be extended by other modes like > > zeroing or block unmapping. > > > > We need a new ioctl for that because struct fstrim_range does not > > provide any existing or reserved member for extensions. The new ioctl > > also supports TRIM as the operation type. > > > > Signed-off-by: David Sterba <dsterba@suse.com> > > [...] > > I /think/ we need some extra checks for zoned here. blkdev_issue_zeroout > won't work on zoned devices IFF the 'start' range is not aligned to the > write pointer. Also blkdev_issue_discard() on 'unused space' of a zoned > filesystem won't do what a user is expecting. > > I think this needs: > > > +static int btrfs_ioctl_clear_free(struct file *file, void __user *arg) > > +{ > > + struct btrfs_fs_info *fs_info; > > + struct btrfs_ioctl_clear_free_args args; > > + u64 total_bytes; > > + int ret; > > + > > + if (!capable(CAP_SYS_ADMIN)) > > + return -EPERM; > > if (btrfs_is_zoned(fs_info)) > return -EOPNOTSUPP; Right, I'll add it, thanks. As the ioctl is extension of FITRIM, the same checks should be done (btrfs_ioctl_fitrim). ^ permalink raw reply [flat|nested] 14+ messages in thread
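One detail about the placement of the suggested check: in the quoted function, fs_info is only assigned after mnt_want_write_file(), so a zoned check near the top would presumably need fs_info to be looked up first. The ordering of the early checks can be modelled as follows; struct fs_model, clear_free_precheck() and MODEL_NR_CLEAR_OP_TYPES are hypothetical names used only for this sketch, and the exact order in a later version of the patch may differ.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in for the bits of fs_info the checks look at. */
struct fs_model {
	bool zoned;           /* what btrfs_is_zoned() would report */
	uint64_t total_bytes; /* what btrfs_super_total_bytes() would report */
};

/* Only the discard type exists after patch 2/5. */
enum { MODEL_NR_CLEAR_OP_TYPES = 1 };

/*
 * Early validation in the order discussed in the thread. The filesystem
 * description must already be available when the zoned check runs.
 */
static int clear_free_precheck(const struct fs_model *fs, bool is_admin,
			       uint32_t type, uint64_t start)
{
	if (!is_admin)
		return -EPERM;
	if (fs->zoned)
		return -EOPNOTSUPP;
	if (type >= MODEL_NR_CLEAR_OP_TYPES)
		return -EOPNOTSUPP;
	if (start > fs->total_bytes)
		return -EINVAL;
	return 0;
}
```

Rejecting zoned filesystems before taking write access keeps the error path free of cleanup, which matches the early returns already used for the capability and type checks.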
* Re: [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE 2025-03-10 16:38 ` Johannes Thumshirn 2025-03-10 20:18 ` David Sterba @ 2025-03-11 0:11 ` Damien Le Moal 2025-03-11 18:39 ` David Sterba 1 sibling, 1 reply; 14+ messages in thread From: Damien Le Moal @ 2025-03-11 0:11 UTC (permalink / raw) To: Johannes Thumshirn, David Sterba, linux-btrfs@vger.kernel.org, Naohiro Aota On 3/11/25 01:38, Johannes Thumshirn wrote: > On 28.02.25 15:49, David Sterba wrote: >> Add a new ioctl that is an extensible version of FITRIM. It currently >> does only the trim/discard and will be extended by other modes like >> zeroing or block unmapping. >> >> We need a new ioctl for that because struct fstrim_range does not >> provide any existing or reserved member for extensions. The new ioctl >> also supports TRIM as the operation type. >> >> Signed-off-by: David Sterba <dsterba@suse.com> > > [...] > > I /think/ we need some extra checks for zoned here. blkdev_issue_zeroout > won't work on zoned devices IFF the 'start' range is not aligned to the > write pointer. Also blkdev_issue_discard() on 'unused space' of a zoned > filesystem won't do what a user is expecting. Zoned SAS HDDs can support discard on conventional zones. And if they do not, we can still do the emulated zeroout on conventional zones. For sequential zones, we can do a zone reset of all zones that have been written but have all blocks free (so unused). > > I think this needs: > >> +static int btrfs_ioctl_clear_free(struct file *file, void __user *arg) >> +{ >> + struct btrfs_fs_info *fs_info; >> + struct btrfs_ioctl_clear_free_args args; >> + u64 total_bytes; >> + int ret; >> + >> + if (!capable(CAP_SYS_ADMIN)) >> + return -EPERM; > > if (btrfs_is_zoned(fs_info)) > return -EOPNOTSUPP; With the above observations, this may be a bit too harsh. Though it is probably fine for now for zones, but adding a comment mentioning the above things we could do may be good as a reminder for later improvements. 
> >> + >> + if (copy_from_user(&args, arg, sizeof(args))) >> + return -EFAULT; >> + >> + if (args.type >= BTRFS_NR_CLEAR_OP_TYPES) >> + return -EOPNOTSUPP; >> + >> + ret = mnt_want_write_file(file); >> + if (ret) >> + return ret; >> + >> + fs_info = inode_to_fs_info(file_inode(file)); >> + total_bytes = btrfs_super_total_bytes(fs_info->super_copy); >> + if (args.start > total_bytes) { >> + ret = -EINVAL; >> + goto out_drop_write; >> + } >> + >> + ret = btrfs_clear_free_space(fs_info, &args); >> + if (ret < 0) >> + goto out_drop_write; >> + >> + if (copy_to_user(arg, &args, sizeof(args))) >> + ret = -EFAULT; >> + >> +out_drop_write: >> + mnt_drop_write_file(file); >> + >> + return 0; >> +} >> + -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE 2025-03-11 0:11 ` Damien Le Moal @ 2025-03-11 18:39 ` David Sterba 0 siblings, 0 replies; 14+ messages in thread From: David Sterba @ 2025-03-11 18:39 UTC (permalink / raw) To: Damien Le Moal Cc: Johannes Thumshirn, David Sterba, linux-btrfs@vger.kernel.org, Naohiro Aota On Tue, Mar 11, 2025 at 09:11:52AM +0900, Damien Le Moal wrote: > On 3/11/25 01:38, Johannes Thumshirn wrote: > > On 28.02.25 15:49, David Sterba wrote: > >> Add a new ioctl that is an extensible version of FITRIM. It currently > >> does only the trim/discard and will be extended by other modes like > >> zeroing or block unmapping. > >> > >> We need a new ioctl for that because struct fstrim_range does not > >> provide any existing or reserved member for extensions. The new ioctl > >> also supports TRIM as the operation type. > >> > >> Signed-off-by: David Sterba <dsterba@suse.com> > > > > [...] > > > > I /think/ we need some extra checks for zoned here. blkdev_issue_zeroout > > won't work on zoned devices IFF the 'start' range is not aligned to the > > write pointer. Also blkdev_issue_discard() on 'unused space' of a zoned > > filesystem won't do what a user is expecting. > > Zoned SAS HDDs can support discard on conventional zones. And if they do not, we > can still do the emulated zeroout on conventional zones. > For sequential zones, we can do a zone reset of all zones that have been written > but have all blocks free (so unused). > > > > > I think this needs: > > > >> +static int btrfs_ioctl_clear_free(struct file *file, void __user *arg) > >> +{ > >> + struct btrfs_fs_info *fs_info; > >> + struct btrfs_ioctl_clear_free_args args; > >> + u64 total_bytes; > >> + int ret; > >> + > >> + if (!capable(CAP_SYS_ADMIN)) > >> + return -EPERM; > > > > if (btrfs_is_zoned(fs_info)) > > return -EOPNOTSUPP; > > With the above observations, this may be a bit too harsh. 
Though it is probably > fine for now for zones, but adding a comment mentioning the above things we > could do may be good as a reminder for later improvements. Thanks for the info, I'll update the comment. ^ permalink raw reply [flat|nested] 14+ messages in thread
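The ordering of the checks discussed above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the `sketch_*` names, the struct layouts and the `SKETCH_NR_CLEAR_OP_TYPES` value are hypothetical stand-ins, not the kernel types. It also assumes the zoned test has to come after the filesystem info lookup, since in the quoted function `fs_info` is assigned only after `copy_from_user()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the number of operation types. */
#define SKETCH_NR_CLEAR_OP_TYPES 5

/* Hypothetical, reduced view of the filesystem state the checks need. */
struct sketch_fs_info {
	bool zoned;           /* models btrfs_is_zoned() */
	uint64_t total_bytes; /* models btrfs_super_total_bytes() */
};

/* Hypothetical, reduced view of the ioctl arguments. */
struct sketch_args {
	uint64_t start;
	uint64_t type;
};

/*
 * Run the argument checks in the order discussed in the thread:
 * capability, zoned mode, operation type, start offset.
 * Returns 0 when the request may proceed, a negative errno otherwise.
 */
int sketch_clear_free_checks(bool is_admin,
			     const struct sketch_fs_info *fs_info,
			     const struct sketch_args *args)
{
	/* The caller must have looked up fs_info before calling. */
	assert(fs_info != NULL && args != NULL);

	if (!is_admin)
		return -EPERM;
	/* Zoned filesystems are rejected for now, same as the suggestion. */
	if (fs_info->zoned)
		return -EOPNOTSUPP;
	if (args->type >= SKETCH_NR_CLEAR_OP_TYPES)
		return -EOPNOTSUPP;
	if (args->start > fs_info->total_bytes)
		return -EINVAL;
	return 0;
}
```

A later relaxation for conventional zones or fully unused sequential zones, as Damien describes, would replace the single zoned test with per-zone decisions.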
* [PATCH 3/5] btrfs: add zeroout mode to CLEAR_FREE ioctl
2025-02-28 14:49 [PATCH 0/5] Ioctl to clear unused space in various ways David Sterba
2025-02-28 14:49 ` [PATCH 1/5] btrfs: extend trim callchains to pass the operation type David Sterba
2025-02-28 14:49 ` [PATCH 2/5] btrfs: add new ioctl CLEAR_FREE David Sterba
@ 2025-02-28 14:49 ` David Sterba
2025-02-28 14:49 ` [PATCH 4/5] btrfs: add secure erase " David Sterba
` (2 subsequent siblings)
5 siblings, 0 replies; 14+ messages in thread
From: David Sterba @ 2025-02-28 14:49 UTC (permalink / raw)
To: linux-btrfs; +Cc: David Sterba

Add a new type of clearing that writes zeros to the unused space
(similar to what trim/discard would do). The mode is implemented by
blkdev_issue_zeroout(), which writes zeros to the blocks explicitly
unless the hardware implements the UNMAP command, in which case the
blocks are unmapped and effectively appear as zeroed. This is handled
transparently.

As a special case, thin-provisioned devices usually handle UNMAP and
can free the underlying space.
Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/extent-tree.c     | 6 ++++++
 include/uapi/linux/btrfs.h | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 4ab9850b7383..6c45625a293a 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1250,10 +1250,16 @@ static int remove_extent_backref(struct btrfs_trans_handle *trans,
 static int btrfs_issue_clear_op(struct block_device *bdev, u64 start, u64 size,
 				enum btrfs_clear_op_type clear)
 {
+	unsigned int flags = BLKDEV_ZERO_KILLABLE;
+
 	switch (clear) {
 	case BTRFS_CLEAR_OP_DISCARD:
 		return blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
 					    size >> SECTOR_SHIFT, GFP_NOFS);
+	case BTRFS_CLEAR_OP_ZERO:
+		return blkdev_issue_zeroout(bdev, start >> SECTOR_SHIFT,
+					    size >> SECTOR_SHIFT, GFP_NOFS,
+					    flags);
 	default:
 		return -EOPNOTSUPP;
 	}
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 278010aff02e..65700c289140 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -1091,6 +1091,12 @@ enum btrfs_err_code {
  */
 enum btrfs_clear_op_type {
 	BTRFS_CLEAR_OP_DISCARD,
+	/*
+	 * Write zeros to the range, either overwrite or with hardware offload
+	 * that can unmap the blocks internally.
+	 * (Same as blkdev_issue_zeroout() with 0 flags).
+	 */
+	BTRFS_CLEAR_OP_ZERO,
 	BTRFS_NR_CLEAR_OP_TYPES,
 };
-- 
2.47.1

^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/5] btrfs: add secure erase mode to CLEAR_FREE ioctl
2025-02-28 14:49 [PATCH 0/5] Ioctl to clear unused space in various ways David Sterba
` (2 preceding siblings ...)
2025-02-28 14:49 ` [PATCH 3/5] btrfs: add zeroout mode to CLEAR_FREE ioctl David Sterba
@ 2025-02-28 14:49 ` David Sterba
2025-02-28 14:49 ` [PATCH 5/5] btrfs: add more zeroout modes " David Sterba
2025-03-05 11:01 ` [PATCH 0/5] Ioctl to clear unused space in various ways David Sterba
5 siblings, 0 replies; 14+ messages in thread
From: David Sterba @ 2025-02-28 14:49 UTC (permalink / raw)
To: linux-btrfs; +Cc: David Sterba

Add another type of clearing that will do secure erase on the unused
space. This requires hardware support and works as a regular discard
while also deleting any copied or cached blocks. Same as "blkdiscard
--secure".

The unused space ranges may not be aligned to the secure erase block or
be of sufficient length, so the exact result depends on the device. Some
blocks may still contain valid data even after this ioctl.
Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/extent-tree.c     | 3 +++
 include/uapi/linux/btrfs.h | 7 +++++++
 2 files changed, 10 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 6c45625a293a..e38760fbf324 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1253,6 +1253,9 @@ static int btrfs_issue_clear_op(struct block_device *bdev, u64 start, u64 size,
 	unsigned int flags = BLKDEV_ZERO_KILLABLE;
 
 	switch (clear) {
+	case BTRFS_CLEAR_OP_SECURE_ERASE:
+		return blkdev_issue_secure_erase(bdev, start >> SECTOR_SHIFT,
+						 size >> SECTOR_SHIFT, GFP_NOFS);
 	case BTRFS_CLEAR_OP_DISCARD:
 		return blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
 					    size >> SECTOR_SHIFT, GFP_NOFS);
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 65700c289140..018f0f1bbd5f 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -1097,6 +1097,13 @@ enum btrfs_clear_op_type {
 	 * (Same as blkdev_issue_zeroout() with 0 flags).
 	 */
 	BTRFS_CLEAR_OP_ZERO,
+	/*
+	 * Do a secure erase operation on the range. If supported by the
+	 * underlying hardware, this works as regular discard except that all
+	 * copies of the discarded blocks that were possibly created by
+	 * garbage collection must also be erased.
+	 */
+	BTRFS_CLEAR_OP_SECURE_ERASE,
 	BTRFS_NR_CLEAR_OP_TYPES,
 };
-- 
2.47.1

^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/5] btrfs: add more zeroout modes to CLEAR_FREE ioctl
2025-02-28 14:49 [PATCH 0/5] Ioctl to clear unused space in various ways David Sterba
` (3 preceding siblings ...)
2025-02-28 14:49 ` [PATCH 4/5] btrfs: add secure erase " David Sterba
@ 2025-02-28 14:49 ` David Sterba
2025-03-05 11:01 ` [PATCH 0/5] Ioctl to clear unused space in various ways David Sterba
5 siblings, 0 replies; 14+ messages in thread
From: David Sterba @ 2025-02-28 14:49 UTC (permalink / raw)
To: linux-btrfs; +Cc: David Sterba

The zeroing mode BTRFS_CLEAR_OP_ZERO is safe for use regardless of the
underlying device capabilities: either zeros are written or the device
will unmap the blocks. This is a safe behaviour.

In case it's desired to do one or the other, add modes that can enforce
that or fail when unsupported:

- CLEAR_OP_ZERO_NOUNMAP - overwrite by zero blocks, forbid unmapping
  blocks by the device
- CLEAR_OP_ZERO_NOFALLBACK - unmap the blocks by device and do not fall
  back to overwriting by zeros

Implemented by __blkdev_issue_zeroout() and also documented there.
Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/extent-tree.c     | 11 +++++++++--
 include/uapi/linux/btrfs.h |  5 +++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index e38760fbf324..779216aa8ce0 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1259,10 +1259,17 @@ static int btrfs_issue_clear_op(struct block_device *bdev, u64 start, u64 size,
 	case BTRFS_CLEAR_OP_DISCARD:
 		return blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
 					    size >> SECTOR_SHIFT, GFP_NOFS);
+	case BTRFS_CLEAR_OP_ZERO_NOUNMAP:
+		flags |= BLKDEV_ZERO_NOUNMAP;
+		return blkdev_issue_zeroout(bdev, start >> SECTOR_SHIFT,
+					    size >> SECTOR_SHIFT, GFP_NOFS, flags);
+	case BTRFS_CLEAR_OP_ZERO_NOFALLBACK:
+		flags |= BLKDEV_ZERO_NOFALLBACK;
+		return blkdev_issue_zeroout(bdev, start >> SECTOR_SHIFT,
+					    size >> SECTOR_SHIFT, GFP_NOFS, flags);
 	case BTRFS_CLEAR_OP_ZERO:
 		return blkdev_issue_zeroout(bdev, start >> SECTOR_SHIFT,
-					    size >> SECTOR_SHIFT, GFP_NOFS,
-					    flags);
+					    size >> SECTOR_SHIFT, GFP_NOFS, flags);
 	default:
 		return -EOPNOTSUPP;
 	}
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 018f0f1bbd5f..12e54f3b0a13 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -1104,6 +1104,11 @@ enum btrfs_clear_op_type {
 	 * garbage collection must also be erased.
 	 */
 	BTRFS_CLEAR_OP_SECURE_ERASE,
+
+	/* Overwrite by zeros, do not try to unmap blocks. */
+	BTRFS_CLEAR_OP_ZERO_NOUNMAP,
+	/* Request unmapping the blocks and don't fall back to writing zeros. */
+	BTRFS_CLEAR_OP_ZERO_NOFALLBACK,
 	BTRFS_NR_CLEAR_OP_TYPES,
 };
-- 
2.47.1

^ permalink raw reply related [flat|nested] 14+ messages in thread
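The way the three zeroing modes differ only in the flags handed to blkdev_issue_zeroout() can be summarized in a small userspace sketch. The `SKETCH_*` names are hypothetical, and the flag values are stand-ins assumed for illustration; the enum order mirrors enum btrfs_clear_op_type as it looks after patch 5/5.

```c
#include <errno.h>

/* Stand-ins for the kernel BLKDEV_ZERO_* flags; the values are assumed. */
#define SKETCH_ZERO_NOUNMAP    (1u << 0) /* always write zeros, never unmap */
#define SKETCH_ZERO_NOFALLBACK (1u << 1) /* offload only, never write zeros */
#define SKETCH_ZERO_KILLABLE   (1u << 2) /* interruptible by fatal signals */

/* Same order as enum btrfs_clear_op_type after patch 5/5. */
enum sketch_clear_op {
	SKETCH_OP_DISCARD,
	SKETCH_OP_ZERO,
	SKETCH_OP_SECURE_ERASE,
	SKETCH_OP_ZERO_NOUNMAP,
	SKETCH_OP_ZERO_NOFALLBACK,
	SKETCH_NR_OPS,
};

/*
 * Return the zeroout flags for a zeroing mode, or -EINVAL when the
 * operation is not one of the zeroing modes (discard, secure erase).
 */
int sketch_zeroout_flags(enum sketch_clear_op op)
{
	/* Every zeroing mode starts from the killable flag. */
	unsigned int flags = SKETCH_ZERO_KILLABLE;

	switch (op) {
	case SKETCH_OP_ZERO:
		break;
	case SKETCH_OP_ZERO_NOUNMAP:
		flags |= SKETCH_ZERO_NOUNMAP;
		break;
	case SKETCH_OP_ZERO_NOFALLBACK:
		flags |= SKETCH_ZERO_NOFALLBACK;
		break;
	default:
		return -EINVAL;
	}
	return (int)flags;
}
```

Keeping the flag selection in one helper like this is only one possible layout; the patch itself sets the flags inline in each case of the switch.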
* Re: [PATCH 0/5] Ioctl to clear unused space in various ways
2025-02-28 14:49 [PATCH 0/5] Ioctl to clear unused space in various ways David Sterba
` (4 preceding siblings ...)
2025-02-28 14:49 ` [PATCH 5/5] btrfs: add more zeroout modes " David Sterba
@ 2025-03-05 11:01 ` David Sterba
5 siblings, 0 replies; 14+ messages in thread
From: David Sterba @ 2025-03-05 11:01 UTC (permalink / raw)
To: David Sterba; +Cc: linux-btrfs

On Fri, Feb 28, 2025 at 03:49:20PM +0100, David Sterba wrote:
> Add ioctl that is similar to FITRIM and in addition to trim can do also
> zeroing (either plain overwrite, or unmap the blocks if the device
> supports it) and secure erase.
>
> This can be used to zero the unused space in e.g. VM images (when run
> from inside the guest, if fstrim is not supported) or free space on
> thin-provisioned devices.
>
> The secure erase is provided by blkdiscard command but I'm not aware of
> equivalent that can be run on a filesystem, so this is for parity.

For the current implementation of TRIM we have an in-memory cache that
tracks which chunks have been trimmed, so the IO is not duplicated. As
the CLEAR code builds on the trim code, the cache would apply to any of
the new modes, which is probably not what we want.

^ permalink raw reply [flat|nested] 14+ messages in thread
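To illustrate the concern, here is a minimal userspace sketch of one possible direction: remembering per chunk which operations have already been applied, so that an earlier trim does not cause a later zeroout to be skipped. This is not something proposed in the thread; the `sketch_*` names and the one-bit-per-operation layout are purely hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-chunk state: one bit per clear operation type. */
struct sketch_chunk {
	uint32_t cleared_ops;
};

/* True when the given operation has not yet been applied to the chunk. */
bool sketch_needs_clear(const struct sketch_chunk *chunk, unsigned int op)
{
	assert(op < 32);
	return !(chunk->cleared_ops & (UINT32_C(1) << op));
}

/* Remember that the operation was applied to the chunk's free space. */
void sketch_mark_cleared(struct sketch_chunk *chunk, unsigned int op)
{
	assert(op < 32);
	chunk->cleared_ops |= UINT32_C(1) << op;
}

/* Any new allocation or free in the chunk invalidates all cached state. */
void sketch_mark_dirty(struct sketch_chunk *chunk)
{
	chunk->cleared_ops = 0;
}
```

With a single shared "trimmed" flag, the second and third functions would collapse into one bit and a discard would hide the chunk from a subsequent zeroing pass, which is the behaviour the message above calls out as probably unwanted.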