From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Daeho Jeong <daeho43@gmail.com>
Cc: linux-kernel@vger.kernel.org,
linux-f2fs-devel@lists.sourceforge.net, kernel-team@android.com,
Daeho Jeong <daehojeong@google.com>
Subject: Re: [PATCH v3] f2fs: support dynamic include/exclude for device aliasing
Date: Mon, 15 Jun 2026 16:11:03 +0000 [thread overview]
Message-ID: <ajAkF5FYJAUyuz2Z@google.com> (raw)
In-Reply-To: <20260606011742.1205390-1-daeho43@gmail.com>
Hi Daeho,
Could you please rebase on top of dev-test?
Thanks,
On 06/05, Daeho Jeong wrote:
> From: Daeho Jeong <daehojeong@google.com>
>
> This patch adds a dynamic management feature to the existing device
> aliasing functionality. It allows users to dynamically exclude or
> include specific devices from the filesystem's free pool at runtime
> through new ioctls.
>
> To support this, three new ioctls are introduced:
> - F2FS_IOC_EXCLUDE_DEV_ALIAS: This reclaims the space occupied by a
> device aliasing file. It first performs a capacity check, resets GC
> victim information for the target range, marks the segments as in-use
> to prevent new allocations, and then triggers GC to migrate existing
> valid data out of the range. Finally, it reserves these blocks in the
> SIT to effectively exclude the device from the usable capacity.
>
> - F2FS_IOC_INCLUDE_DEV_ALIAS: This releases the reserved space of a
> previously excluded device aliasing file. It truncates the blocks
> associated with the file, which makes them available for general
> filesystem allocation again.
>
> - F2FS_IOC_GET_DEV_ALIAS_STATUS: This retrieves the current aliasing
> status of a device aliasing file, returning whether the file is
> included (active alias) or excluded (inactive alias, with blocks
> fully allocated on the device).
>
> Signed-off-by: Daeho Jeong <daehojeong@google.com>
> ---
> v3: add CAP_SYS_ADMIN and checkpoint=disabled check.
> remove a f2fs specific flag exposed with getflags.
> v2: prevent operations during checkpoint=disabled.
> ---
> Documentation/filesystems/f2fs.rst | 35 ++++
> fs/f2fs/f2fs.h | 9 +-
> fs/f2fs/file.c | 272 ++++++++++++++++++++++++++++-
> fs/f2fs/gc.c | 30 ++--
> fs/f2fs/namei.c | 11 ++
> fs/f2fs/segment.c | 178 +++++++++++++------
> fs/f2fs/segment.h | 11 ++
> fs/f2fs/super.c | 34 ++++
> include/uapi/linux/f2fs.h | 7 +
> 9 files changed, 520 insertions(+), 67 deletions(-)
>
> diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
> index 7e4031631286..d154c8ac0cd7 100644
> --- a/Documentation/filesystems/f2fs.rst
> +++ b/Documentation/filesystems/f2fs.rst
> @@ -1036,6 +1036,41 @@ So, the key idea is, user can do any file operations on /dev/vdc, and
> reclaim the space after the use, while the space is counted as /data.
> That doesn't require modifying partition size and filesystem format.
>
> +Dynamic Device Aliasing Management
> +----------------------------------
> +
> +In addition to static device aliasing by deleting the aliasing file, F2FS
> +supports dynamic management of device aliasing. This mechanism allows the system
> +to dynamically transition partition ownership between F2FS userdata and external
> +entities (e.g., zRAM, raw partition) based on system requirements without
> +deleting the master aliasing file or requiring unmount/remount.
> +
> +The master aliasing file is created during the initial format of the file system
> +and remains as a persistent control entity (ioctl gateway) in the root directory.
> +
> +- Partition Exclusion (In-service to Aliased)
> + When a specific partition needs to be dedicated to external services (e.g., zRAM),
> + a user can exclude the device alias range via ioctl. The kernel resets GC victim
> + information for the target range, marks segments as in-use to prevent new
> + allocations, and triggers forced GC to migrate existing valid data out of the
> + range. Finally, it reserves these blocks in the SIT to effectively exclude the
> + device from the usable capacity.
> +
> +- Partition Inclusion (Aliased to In-service)
> + When external usage concludes, the space is reclaimed not by deleting the file,
> + but through the inclusion ioctl. The kernel truncates blocks associated with
> + the file, releasing them back to general filesystem allocation.
> +
> +.. code-block::
> +
> + # f2fs_io dev_alias include /mnt/f2fs/vdc.file
> + # df -h
> + /dev/vdb 64G 753M 64G 2% /mnt/f2fs
> +
> + # f2fs_io dev_alias exclude /mnt/f2fs/vdc.file
> + # df -h
> + /dev/vdb 64G 33G 32G 52% /mnt/f2fs
> +
> Per-file Read-Only Large Folio Support
> --------------------------------------
>
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 29f81a496b72..5e0c5701c088 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1398,6 +1398,8 @@ struct f2fs_dev_info {
> unsigned int total_segments;
> block_t start_blk;
> block_t end_blk;
> + bool has_alias;
> + bool is_excluding;
> #ifdef CONFIG_BLK_DEV_ZONED
> unsigned int nr_blkz; /* Total number of zones */
> unsigned long *blkz_seq; /* Bitmap indicating sequential zones */
> @@ -3970,7 +3972,10 @@ int f2fs_create_flush_cmd_control(struct f2fs_sb_info *sbi);
> int f2fs_flush_device_cache(struct f2fs_sb_info *sbi);
> void f2fs_destroy_flush_cmd_control(struct f2fs_sb_info *sbi, bool free);
> void f2fs_invalidate_blocks(struct f2fs_sb_info *sbi, block_t addr,
> - unsigned int len);
> + unsigned int len);
> +void f2fs_reserve_device_alias(struct f2fs_sb_info *sbi, block_t addr,
> + unsigned int len);
> +
> bool f2fs_is_checkpointed_data(struct f2fs_sb_info *sbi, block_t blkaddr);
> int f2fs_start_discard_thread(struct f2fs_sb_info *sbi);
> void f2fs_drop_discard_cmd(struct f2fs_sb_info *sbi);
> @@ -4189,6 +4194,8 @@ void f2fs_build_gc_manager(struct f2fs_sb_info *sbi);
> int f2fs_gc_range(struct f2fs_sb_info *sbi,
> unsigned int start_seg, unsigned int end_seg,
> bool dry_run, unsigned int dry_run_sections);
> +void f2fs_reset_gc_victim_resource(struct f2fs_sb_info *sbi,
> + unsigned int start, unsigned int end);
> int f2fs_resize_fs(struct file *filp, __u64 block_count);
> int __init f2fs_create_garbage_collection_cache(void);
> void f2fs_destroy_garbage_collection_cache(void);
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index abcf6f486dd7..8a25467ca4f4 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -803,13 +803,25 @@ int f2fs_do_truncate_blocks(struct inode *inode, u64 from, bool lock)
>
> if (IS_DEVICE_ALIASING(inode)) {
> struct extent_tree *et = F2FS_I(inode)->extent_tree[EX_READ];
> - struct extent_info ei = et->largest;
> + struct extent_info ei;
> +
> + if (!et) {
> + f2fs_folio_put(ifolio, true);
> + err = -ENODATA;
> + goto out;
> + }
> +
> + read_lock(&et->lock);
> + ei = et->largest;
> + read_unlock(&et->lock);
>
> f2fs_invalidate_blocks(sbi, ei.blk, ei.len);
>
> dec_valid_block_count(sbi, inode, ei.len);
> f2fs_update_time(sbi, REQ_TIME);
>
> + f2fs_drop_extent_tree(inode);
> +
> f2fs_folio_put(ifolio, true);
> goto out;
> }
> @@ -1092,8 +1104,9 @@ int f2fs_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
> return -EPERM;
>
> if ((attr->ia_valid & ATTR_SIZE)) {
> - if (!f2fs_is_compress_backend_ready(inode) ||
> - IS_DEVICE_ALIASING(inode))
> + if (IS_DEVICE_ALIASING(inode))
> + return -EPERM;
> + if (!f2fs_is_compress_backend_ready(inode))
> return -EOPNOTSUPP;
> if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED) &&
> !IS_ALIGNED(attr->ia_size,
> @@ -2115,6 +2128,9 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask)
> if (IS_NOQUOTA(inode))
> return -EPERM;
>
> + if (IS_DEVICE_ALIASING(inode))
> + return -EPERM;
> +
> if ((iflags ^ masked_flags) & F2FS_CASEFOLD_FL) {
> if (!f2fs_sb_has_casefold(F2FS_I_SB(inode)))
> return -EOPNOTSUPP;
> @@ -2663,6 +2679,17 @@ static int f2fs_ioc_get_encryption_policy(struct file *filp, unsigned long arg)
> return fscrypt_ioctl_get_policy(filp, (void __user *)arg);
> }
>
> +static int f2fs_ioc_get_dev_alias_status(struct file *filp, unsigned long arg)
> +{
> + struct inode *inode = file_inode(filp);
> +
> + if (!IS_DEVICE_ALIASING(inode))
> + return -EINVAL;
> +
> + return put_user(F2FS_HAS_BLOCKS(inode) ? F2FS_DEV_ALIAS_STATUS_EXCLUDED :
> + F2FS_DEV_ALIAS_STATUS_INCLUDED, (u32 __user *)arg);
> +}
> +
> static int f2fs_ioc_get_encryption_pwsalt(struct file *filp, unsigned long arg)
> {
> struct inode *inode = file_inode(filp);
> @@ -3599,6 +3626,236 @@ static int f2fs_ioc_get_dev_alias_file(struct file *filp, unsigned long arg)
> (u32 __user *)arg);
> }
>
> +static int f2fs_ioc_exclude_dev_alias(struct file *filp)
> +{
> + struct inode *inode = file_inode(filp);
> + struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> + struct extent_tree *et = F2FS_I(inode)->extent_tree[EX_READ];
> + struct extent_info ei;
> + struct cp_control cpc = { CP_SYNC, 0, 0, 0 };
> + struct f2fs_lock_context lc;
> + blkcnt_t count;
> + unsigned int start, end, segno;
> + int type, i, err;
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return -EPERM;
> +
> + if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
> + return -EINVAL;
> +
> + err = mnt_want_write_file(filp);
> + if (err)
> + return err;
> +
> + inode_lock(inode);
> +
> + if (!IS_DEVICE_ALIASING(inode)) {
> + err = -EINVAL;
> + goto out_inode_unlock;
> + }
> +
> + if (F2FS_HAS_BLOCKS(inode)) {
> + err = 0;
> + goto out_inode_unlock;
> + }
> +
> + for (i = 1; i < sbi->s_ndevs; i++) {
> + char *name = strrchr(FDEV(i).path, '/');
> +
> + name = name ? name + 1 : FDEV(i).path;
> + if (!strcmp(name, filp->f_path.dentry->d_name.name)) {
> + ei.blk = FDEV(i).start_blk;
> + ei.len = FDEV(i).total_segments << sbi->log_blocks_per_seg;
> + ei.fofs = 0;
> + break;
> + }
> + }
> +
> + if (i == sbi->s_ndevs) {
> + err = -ENODATA;
> + goto out_inode_unlock;
> + }
> +
> + count = ei.len;
> + err = inc_valid_block_count(sbi, inode, &count, false);
> + if (err)
> + goto out_inode_unlock;
> +
> + f2fs_down_write(&sbi->gc_lock);
> + f2fs_lock_op(sbi, &lc);
> +
> + FDEV(f2fs_target_device_index(sbi, ei.blk)).is_excluding = true;
> +
> + start = GET_SEGNO(sbi, ei.blk);
> + end = GET_SEGNO(sbi, ei.blk + ei.len - 1);
> +
> + /* Reset the victim information to prevent GC from targeting the range */
> + f2fs_reset_gc_victim_resource(sbi, start, end);
> +
> + /* Mark the range as inuse to prevent new allocations in it */
> + for (segno = start; segno <= end; segno++)
> + __set_test_and_inuse(sbi, segno);
> +
> + /* Move out cursegs from the target range */
> + for (type = CURSEG_HOT_DATA; type < NR_CURSEG_PERSIST_TYPE; type++) {
> + err = f2fs_allocate_segment_for_resize(sbi, type, start, end);
> + if (err) {
> + f2fs_unlock_op(sbi, &lc);
> + goto out_gc_unlock;
> + }
> + }
> +
> + f2fs_unlock_op(sbi, &lc);
> + f2fs_up_write(&sbi->gc_lock);
> +
> + /* Write checkpoint synchronously to flush all pending writes and free space */
> + err = f2fs_write_checkpoint(sbi, &cpc);
> + if (err) {
> + f2fs_down_write(&sbi->gc_lock);
> + goto out_gc_unlock;
> + }
> +
> + /* Re-acquire gc_lock and cp_rwsem read lock for the entire range GC */
> + f2fs_down_write(&sbi->gc_lock);
> + f2fs_lock_op(sbi, &lc);
> +
> + /* do GC to move out valid blocks in the range all at once! */
> + err = f2fs_gc_range(sbi, start, end, false, 0);
> + if (err) {
> + f2fs_unlock_op(sbi, &lc);
> + goto out_gc_unlock;
> + }
> +
> + if (et) {
> + write_lock(&et->lock);
> + et->largest = ei;
> + write_unlock(&et->lock);
> + }
> + clear_inode_flag(inode, FI_NO_EXTENT);
> +
> + f2fs_reserve_device_alias(sbi, ei.blk, ei.len);
> +
> + i_size_write(inode, (loff_t)ei.len << PAGE_SHIFT);
> + f2fs_update_inode_page(inode);
> +
> + FDEV(f2fs_target_device_index(sbi, ei.blk)).is_excluding = false;
> +
> + f2fs_unlock_op(sbi, &lc);
> + f2fs_up_write(&sbi->gc_lock);
> +
> + inode_unlock(inode);
> + mnt_drop_write_file(filp);
> +
> + err = f2fs_write_checkpoint(sbi, &cpc);
> + return err;
> +
> +out_gc_unlock:
> + FDEV(f2fs_target_device_index(sbi, ei.blk)).is_excluding = false;
> + f2fs_up_write(&sbi->gc_lock);
> +
> + /*
> + * Put successfully GC'ed segments back into PRE list so checkpoint
> + * commits and frees them!
> + */
> + f2fs_lock_op(sbi, &lc);
> + for (segno = start; segno <= end; segno++) {
> + if (get_valid_blocks(sbi, segno, false) == 0) {
> + mutex_lock(&DIRTY_I(sbi)->seglist_lock);
> + if (!test_and_set_bit(segno, DIRTY_I(sbi)->dirty_segmap[PRE]))
> + DIRTY_I(sbi)->nr_dirty[PRE]++;
> + mutex_unlock(&DIRTY_I(sbi)->seglist_lock);
> + }
> + }
> + f2fs_unlock_op(sbi, &lc);
> +
> + count = ei.len;
> + dec_valid_block_count(sbi, inode, count);
> +
> + inode_unlock(inode);
> + mnt_drop_write_file(filp);
> +
> + f2fs_write_checkpoint(sbi, &cpc);
> + return err;
> +
> +out_inode_unlock:
> + inode_unlock(inode);
> + mnt_drop_write_file(filp);
> + return err;
> +}
> +
> +static int f2fs_ioc_include_dev_alias(struct file *filp)
> +{
> + struct inode *inode = file_inode(filp);
> + struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> + struct extent_tree *et = F2FS_I(inode)->extent_tree[EX_READ];
> + struct extent_info ei = {0, };
> + struct cp_control cpc = { CP_SYNC, 0, 0, 0 };
> + struct f2fs_lock_context lc;
> + int err;
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return -EPERM;
> +
> + if (unlikely(is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
> + return -EINVAL;
> +
> + err = mnt_want_write_file(filp);
> + if (err)
> + return err;
> +
> + inode_lock(inode);
> +
> + if (!IS_DEVICE_ALIASING(inode)) {
> + err = -EINVAL;
> + goto out_inode_unlock;
> + }
> +
> + if (!F2FS_HAS_BLOCKS(inode)) {
> + err = 0;
> + goto out_inode_unlock;
> + }
> +
> + err = filemap_write_and_wait(inode->i_mapping);
> + if (err)
> + goto out_inode_unlock;
> +
> + if (et) {
> + read_lock(&et->lock);
> + ei = et->largest;
> + read_unlock(&et->lock);
> + }
> +
> + f2fs_down_write(&sbi->gc_lock);
> + f2fs_lock_op(sbi, &lc);
> +
> + truncate_setsize(inode, 0);
> +
> + err = f2fs_truncate_blocks(inode, 0, false);
> + if (err) {
> + i_size_write(inode, (loff_t)ei.len << PAGE_SHIFT);
> + f2fs_unlock_op(sbi, &lc);
> + f2fs_up_write(&sbi->gc_lock);
> + goto out_inode_unlock;
> + }
> +
> + f2fs_update_inode_page(inode);
> +
> + f2fs_unlock_op(sbi, &lc);
> + f2fs_up_write(&sbi->gc_lock);
> +
> + inode_unlock(inode);
> + mnt_drop_write_file(filp);
> +
> + err = f2fs_write_checkpoint(sbi, &cpc);
> + return err;
> +
> +out_inode_unlock:
> + inode_unlock(inode);
> + mnt_drop_write_file(filp);
> + return err;
> +}
> +
> static int f2fs_ioc_io_prio(struct file *filp, unsigned long arg)
> {
> struct inode *inode = file_inode(filp);
> @@ -4721,8 +4978,14 @@ static long __f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> return f2fs_ioc_compress_file(filp);
> case F2FS_IOC_GET_DEV_ALIAS_FILE:
> return f2fs_ioc_get_dev_alias_file(filp, arg);
> + case F2FS_IOC_GET_DEV_ALIAS_STATUS:
> + return f2fs_ioc_get_dev_alias_status(filp, arg);
> case F2FS_IOC_IO_PRIO:
> return f2fs_ioc_io_prio(filp, arg);
> + case F2FS_IOC_EXCLUDE_DEV_ALIAS:
> + return f2fs_ioc_exclude_dev_alias(filp);
> + case F2FS_IOC_INCLUDE_DEV_ALIAS:
> + return f2fs_ioc_include_dev_alias(filp);
> default:
> return -ENOTTY;
> }
> @@ -5447,7 +5710,10 @@ long f2fs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> case F2FS_IOC_DECOMPRESS_FILE:
> case F2FS_IOC_COMPRESS_FILE:
> case F2FS_IOC_GET_DEV_ALIAS_FILE:
> + case F2FS_IOC_GET_DEV_ALIAS_STATUS:
> case F2FS_IOC_IO_PRIO:
> + case F2FS_IOC_EXCLUDE_DEV_ALIAS:
> + case F2FS_IOC_INCLUDE_DEV_ALIAS:
> break;
> default:
> return -ENOIOCTLCMD;
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 60378614bc54..755df9b6bbaa 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -2143,29 +2143,37 @@ int f2fs_gc_range(struct f2fs_sb_info *sbi,
> return 0;
> }
>
> +void f2fs_reset_gc_victim_resource(struct f2fs_sb_info *sbi,
> + unsigned int start, unsigned int end)
> +{
> + int i;
> +
> + mutex_lock(&DIRTY_I(sbi)->seglist_lock);
> + for (i = 0; i < MAX_GC_POLICY; i++)
> + if (SIT_I(sbi)->last_victim[i] >= start &&
> + SIT_I(sbi)->last_victim[i] <= end)
> + SIT_I(sbi)->last_victim[i] = 0;
> +
> + for (i = BG_GC; i <= FG_GC; i++)
> + if (sbi->next_victim_seg[i] >= start &&
> + sbi->next_victim_seg[i] <= end)
> + sbi->next_victim_seg[i] = NULL_SEGNO;
> + mutex_unlock(&DIRTY_I(sbi)->seglist_lock);
> +}
> +
> static int free_segment_range(struct f2fs_sb_info *sbi,
> unsigned int secs, bool dry_run)
> {
> unsigned int next_inuse, start, end;
> struct cp_control cpc = { CP_RESIZE, 0, 0, 0 };
> - int gc_mode, gc_type;
> int err = 0;
> int type;
>
> - /* Force block allocation for GC */
> MAIN_SECS(sbi) -= secs;
> start = MAIN_SECS(sbi) * SEGS_PER_SEC(sbi);
> end = MAIN_SEGS(sbi) - 1;
>
> - mutex_lock(&DIRTY_I(sbi)->seglist_lock);
> - for (gc_mode = 0; gc_mode < MAX_GC_POLICY; gc_mode++)
> - if (SIT_I(sbi)->last_victim[gc_mode] >= start)
> - SIT_I(sbi)->last_victim[gc_mode] = 0;
> -
> - for (gc_type = BG_GC; gc_type <= FG_GC; gc_type++)
> - if (sbi->next_victim_seg[gc_type] >= start)
> - sbi->next_victim_seg[gc_type] = NULL_SEGNO;
> - mutex_unlock(&DIRTY_I(sbi)->seglist_lock);
> + f2fs_reset_gc_victim_resource(sbi, start, end);
>
> /* Move out cursegs from the target range */
> for (type = CURSEG_HOT_DATA; type < NR_CURSEG_PERSIST_TYPE; type++) {
> diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
> index e360f08a9586..b7974242ead1 100644
> --- a/fs/f2fs/namei.c
> +++ b/fs/f2fs/namei.c
> @@ -553,6 +553,9 @@ static int f2fs_unlink(struct inode *dir, struct dentry *dentry)
>
> trace_f2fs_unlink_enter(dir, dentry);
>
> + if (IS_DEVICE_ALIASING(inode))
> + return -EPERM;
> +
> if (unlikely(f2fs_cp_error(sbi))) {
> err = -EIO;
> goto out;
> @@ -931,6 +934,9 @@ static int f2fs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
> bool old_is_dir = S_ISDIR(old_inode->i_mode);
> int err;
>
> + if (IS_DEVICE_ALIASING(old_inode))
> + return -EPERM;
> +
> if (unlikely(f2fs_cp_error(sbi)))
> return -EIO;
> if (!f2fs_is_checkpoint_ready(sbi))
> @@ -1000,6 +1006,8 @@ static int f2fs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
> }
>
> if (new_inode) {
> + if (IS_DEVICE_ALIASING(new_inode))
> + return -EPERM;
>
> err = -ENOTEMPTY;
> if (old_is_dir && !f2fs_empty_dir(new_inode))
> @@ -1127,6 +1135,9 @@ static int f2fs_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
> int old_nlink = 0, new_nlink = 0;
> int err;
>
> + if (IS_DEVICE_ALIASING(old_inode) || IS_DEVICE_ALIASING(new_inode))
> + return -EPERM;
> +
> if (unlikely(f2fs_cp_error(sbi)))
> return -EIO;
> if (!f2fs_is_checkpoint_ready(sbi))
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 6a97fe76712b..c0ddc09adc51 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -2498,44 +2498,51 @@ static int update_sit_entry_for_alloc(struct f2fs_sb_info *sbi, struct seg_entry
> #ifdef CONFIG_F2FS_CHECK_FS
> bool mir_exist;
> #endif
> + int del_count = del;
> + int i;
> +
> + f2fs_bug_on(sbi, GET_SEGNO(sbi, blkaddr) != GET_SEGNO(sbi, blkaddr + del_count - 1));
>
> - exist = f2fs_test_and_set_bit(offset, se->cur_valid_map);
> + for (i = 0; i < del_count; i++) {
> + exist = f2fs_test_and_set_bit(offset + i, se->cur_valid_map);
> #ifdef CONFIG_F2FS_CHECK_FS
> - mir_exist = f2fs_test_and_set_bit(offset,
> - se->cur_valid_map_mir);
> - if (unlikely(exist != mir_exist)) {
> - f2fs_err(sbi, "Inconsistent error when setting bitmap, blk:%u, old bit:%d",
> - blkaddr, exist);
> - f2fs_bug_on(sbi, 1);
> - }
> + mir_exist = f2fs_test_and_set_bit(offset + i,
> + se->cur_valid_map_mir);
> + if (unlikely(exist != mir_exist)) {
> + f2fs_err(sbi, "Inconsistent error when setting bitmap, blk:%u, old bit:%d",
> + blkaddr + i, exist);
> + f2fs_bug_on(sbi, 1);
> + }
> #endif
> - if (unlikely(exist)) {
> - f2fs_err(sbi, "Bitmap was wrongly set, blk:%u", blkaddr);
> - f2fs_bug_on(sbi, 1);
> - se->valid_blocks--;
> - del = 0;
> - }
> + if (unlikely(exist)) {
> + f2fs_err(sbi, "Bitmap was wrongly set, blk:%u", blkaddr + i);
> + f2fs_bug_on(sbi, 1);
> + se->valid_blocks--;
> + del -= 1;
> + continue;
> + }
>
> - if (f2fs_block_unit_discard(sbi) &&
> - !f2fs_test_and_set_bit(offset, se->discard_map))
> - sbi->discard_blks--;
> + if (f2fs_block_unit_discard(sbi) &&
> + !f2fs_test_and_set_bit(offset + i, se->discard_map))
> + sbi->discard_blks--;
>
> - /*
> - * SSR should never reuse block which is checkpointed
> - * or newly invalidated.
> - */
> - if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> - if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map)) {
> - se->ckpt_valid_blocks++;
> - if (__is_large_section(sbi))
> - get_sec_entry(sbi, segno)->ckpt_valid_blocks++;
> + /*
> + * SSR should never reuse block which is checkpointed
> + * or newly invalidated.
> + */
> + if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> + if (!f2fs_test_and_set_bit(offset + i, se->ckpt_valid_map)) {
> + se->ckpt_valid_blocks++;
> + if (__is_large_section(sbi))
> + get_sec_entry(sbi, segno)->ckpt_valid_blocks++;
> + }
> }
> - }
>
> - if (!f2fs_test_bit(offset, se->ckpt_valid_map)) {
> - se->ckpt_valid_blocks += del;
> - if (__is_large_section(sbi))
> - get_sec_entry(sbi, segno)->ckpt_valid_blocks += del;
> + if (!f2fs_test_bit(offset + i, se->ckpt_valid_map)) {
> + se->ckpt_valid_blocks += 1;
> + if (__is_large_section(sbi))
> + get_sec_entry(sbi, segno)->ckpt_valid_blocks += 1;
> + }
> }
>
> if (__is_large_section(sbi))
> @@ -2590,9 +2597,14 @@ void f2fs_invalidate_blocks(struct f2fs_sb_info *sbi, block_t addr,
> unsigned int segno = GET_SEGNO(sbi, addr);
> struct sit_info *sit_i = SIT_I(sbi);
> block_t addr_start = addr, addr_end = addr + len - 1;
> - unsigned int seg_num = GET_SEGNO(sbi, addr_end) - segno + 1;
> + unsigned int seg_num;
> unsigned int i = 1, max_blocks = sbi->blocks_per_seg, cnt;
>
> + if (len == 0)
> + return;
> +
> + seg_num = GET_SEGNO(sbi, addr_end) - segno + 1;
> +
> f2fs_bug_on(sbi, addr == NULL_ADDR);
> if (addr == NEW_ADDR || addr == COMPRESS_ADDR)
> return;
> @@ -2625,6 +2637,51 @@ void f2fs_invalidate_blocks(struct f2fs_sb_info *sbi, block_t addr,
> up_write(&sit_i->sentry_lock);
> }
>
> +void f2fs_reserve_device_alias(struct f2fs_sb_info *sbi, block_t addr,
> + unsigned int len)
> +{
> + unsigned int segno = GET_SEGNO(sbi, addr);
> + struct sit_info *sit_i = SIT_I(sbi);
> + block_t addr_start = addr, addr_end = addr + len - 1;
> + unsigned int seg_num;
> + unsigned int i = 1, max_blocks = sbi->blocks_per_seg, cnt;
> +
> + if (len == 0)
> + return;
> +
> + seg_num = GET_SEGNO(sbi, addr_end) - segno + 1;
> +
> + down_write(&sit_i->sentry_lock);
> +
> + if (seg_num == 1)
> + cnt = len;
> + else
> + cnt = max_blocks - GET_BLKOFF_FROM_SEG0(sbi, addr);
> +
> + do {
> + update_segment_mtime(sbi, addr_start, 0);
> + update_sit_entry(sbi, addr_start, cnt);
> +
> + /* Remove the segment from PRE (prefree) to prevent checkpoint from freeing it! */
> + mutex_lock(&DIRTY_I(sbi)->seglist_lock);
> + if (test_and_clear_bit(segno, DIRTY_I(sbi)->dirty_segmap[PRE]))
> + DIRTY_I(sbi)->nr_dirty[PRE]--;
> + mutex_unlock(&DIRTY_I(sbi)->seglist_lock);
> +
> + /* add it into dirty seglist */
> + locate_dirty_segment(sbi, segno);
> +
> + /* update @addr_start and @cnt and @segno */
> + addr_start = START_BLOCK(sbi, ++segno);
> + if (++i == seg_num)
> + cnt = GET_BLKOFF_FROM_SEG0(sbi, addr_end) + 1;
> + else
> + cnt = max_blocks;
> + } while (i <= seg_num);
> +
> + up_write(&sit_i->sentry_lock);
> +}
> +
> bool f2fs_is_checkpointed_data(struct f2fs_sb_info *sbi, block_t blkaddr)
> {
> struct sit_info *sit_i = SIT_I(sbi);
> @@ -2783,6 +2840,7 @@ static int get_new_segment(struct f2fs_sb_info *sbi,
> unsigned int alloc_policy = sbi->allocate_section_policy;
> unsigned int alloc_hint = sbi->allocate_section_hint;
> bool init = true;
> + bool looped = false;
> int i;
> int ret = 0;
>
> @@ -2833,33 +2891,49 @@ static int get_new_segment(struct f2fs_sb_info *sbi,
> find_other_zone:
> secno = find_next_zero_bit(free_i->free_secmap, MAIN_SECS(sbi), hint);
>
> -#ifdef CONFIG_BLK_DEV_ZONED
> - if (secno >= MAIN_SECS(sbi) && f2fs_sb_has_blkzoned(sbi)) {
> - /* Write only to sequential zones */
> - if (sbi->blkzone_alloc_policy == BLKZONE_ALLOC_ONLY_SEQ) {
> - hint = GET_SEC_FROM_SEG(sbi, sbi->first_seq_zone_segno);
> - secno = find_next_zero_bit(free_i->free_secmap, MAIN_SECS(sbi), hint);
> - } else
> - secno = find_first_zero_bit(free_i->free_secmap,
> - MAIN_SECS(sbi));
> - if (secno >= MAIN_SECS(sbi)) {
> - ret = -ENOSPC;
> - f2fs_bug_on(sbi, 1);
> - goto out_unlock;
> - }
> - }
> -#endif
> -
> if (secno >= MAIN_SECS(sbi)) {
> - secno = find_first_zero_bit(free_i->free_secmap,
> - MAIN_SECS(sbi));
> - if (secno >= MAIN_SECS(sbi)) {
> + if (looped) {
> ret = -ENOSPC;
> f2fs_bug_on(sbi, !pinning);
> goto out_unlock;
> }
> +#ifdef CONFIG_BLK_DEV_ZONED
> + /* Write only to sequential zones */
> + if (f2fs_sb_has_blkzoned(sbi) &&
> + sbi->blkzone_alloc_policy == BLKZONE_ALLOC_ONLY_SEQ)
> + hint = GET_SEC_FROM_SEG(sbi, sbi->first_seq_zone_segno);
> + else
> +#endif
> + hint = 0;
> + looped = true;
> + goto find_other_zone;
> }
> +
> segno = GET_SEG_FROM_SEC(sbi, secno);
> +
> + if (f2fs_sb_has_device_alias(sbi) && pinning && f2fs_is_multi_device(sbi)) {
> + int devi = f2fs_target_device_index(sbi, START_BLOCK(sbi, segno));
> +
> + if (FDEV(devi).has_alias) {
> + unsigned int end_segno;
> +
> + while (devi < sbi->s_ndevs && FDEV(devi).has_alias) {
> + block_t next_blk;
> +
> + end_segno = GET_SEGNO(sbi, FDEV(devi).end_blk);
> + hint = GET_SEC_FROM_SEG(sbi, end_segno) + 1;
> +
> + if (hint >= MAIN_SECS(sbi) || ++devi >= sbi->s_ndevs)
> + break;
> +
> + next_blk = START_BLOCK(sbi, GET_SEG_FROM_SEC(sbi, hint));
> + if (next_blk < FDEV(devi).start_blk ||
> + next_blk > FDEV(devi).end_blk)
> + break;
> + }
> + goto find_other_zone;
> + }
> + }
> zoneno = GET_ZONE_FROM_SEC(sbi, secno);
>
> /* give up on finding another zone */
> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> index 068845660b0f..914523f5d3ea 100644
> --- a/fs/f2fs/segment.h
> +++ b/fs/f2fs/segment.h
> @@ -980,6 +980,17 @@ static inline bool sec_usage_check(struct f2fs_sb_info *sbi, unsigned int secno)
> {
> if (is_cursec(sbi, secno) || (sbi->cur_victim_sec == secno))
> return true;
> + if (f2fs_sb_has_device_alias(sbi) && f2fs_is_multi_device(sbi)) {
> + int i;
> + block_t start_blk = START_BLOCK(sbi, GET_SEG_FROM_SEC(sbi, secno));
> +
> + for (i = 0; i < sbi->s_ndevs; i++) {
> + if (FDEV(i).is_excluding &&
> + start_blk >= FDEV(i).start_blk &&
> + start_blk <= FDEV(i).end_blk)
> + return true;
> + }
> + }
> return false;
> }
>
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 9d421a07d2d5..ee599d202fc9 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -4916,6 +4916,38 @@ static void f2fs_tuning_parameters(struct f2fs_sb_info *sbi)
> sbi->readdir_ra = true;
> }
>
> +static void f2fs_restore_device_alias(struct f2fs_sb_info *sbi)
> +{
> + struct inode *root = d_inode(sbi->sb->s_root);
> + struct f2fs_dir_entry *de;
> + struct folio *folio;
> + int i;
> +
> + if (!f2fs_sb_has_device_alias(sbi))
> + return;
> +
> + for (i = 1; i < sbi->s_ndevs; i++) {
> + char *name = strrchr(FDEV(i).path, '/');
> + struct qstr qstr;
> +
> + name = name ? name + 1 : FDEV(i).path;
> + qstr.name = name;
> + qstr.len = strlen(name);
> +
> + de = f2fs_find_entry(root, &qstr, &folio);
> + if (de) {
> + struct inode *inode = f2fs_iget(sbi->sb, le32_to_cpu(de->ino));
> +
> + if (!IS_ERR(inode)) {
> + if (IS_DEVICE_ALIASING(inode))
> + FDEV(i).has_alias = true;
> + iput(inode);
> + }
> + f2fs_folio_put(folio, 0);
> + }
> + }
> +}
> +
> static int f2fs_fill_super(struct super_block *sb, struct fs_context *fc)
> {
> struct f2fs_fs_context *ctx = fc->fs_private;
> @@ -5341,6 +5373,8 @@ static int f2fs_fill_super(struct super_block *sb, struct fs_context *fc)
> f2fs_update_time(sbi, REQ_TIME);
> clear_sbi_flag(sbi, SBI_CP_DISABLED_QUICK);
>
> + f2fs_restore_device_alias(sbi);
> +
> sbi->umount_lock_holder = NULL;
> return 0;
>
> diff --git a/include/uapi/linux/f2fs.h b/include/uapi/linux/f2fs.h
> index 795e26258355..6ca6ae06918e 100644
> --- a/include/uapi/linux/f2fs.h
> +++ b/include/uapi/linux/f2fs.h
> @@ -45,6 +45,9 @@
> #define F2FS_IOC_START_ATOMIC_REPLACE _IO(F2FS_IOCTL_MAGIC, 25)
> #define F2FS_IOC_GET_DEV_ALIAS_FILE _IOR(F2FS_IOCTL_MAGIC, 26, __u32)
> #define F2FS_IOC_IO_PRIO _IOW(F2FS_IOCTL_MAGIC, 27, __u32)
> +#define F2FS_IOC_EXCLUDE_DEV_ALIAS _IO(F2FS_IOCTL_MAGIC, 28)
> +#define F2FS_IOC_INCLUDE_DEV_ALIAS _IO(F2FS_IOCTL_MAGIC, 29)
> +#define F2FS_IOC_GET_DEV_ALIAS_STATUS _IOR(F2FS_IOCTL_MAGIC, 30, __u32)
>
> /*
> * should be same as XFS_IOC_GOINGDOWN.
> @@ -70,6 +73,10 @@ enum {
> F2FS_IOPRIO_MAX,
> };
>
> +/* for F2FS_IOC_GET_DEV_ALIAS_STATUS */
> +#define F2FS_DEV_ALIAS_STATUS_INCLUDED 0
> +#define F2FS_DEV_ALIAS_STATUS_EXCLUDED 1
> +
> struct f2fs_gc_range {
> __u32 sync;
> __u64 start;
> --
> 2.54.0.1032.g2f8565e1d1-goog
>
prev parent reply other threads:[~2026-06-15 16:11 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-06 1:17 [PATCH v3] f2fs: support dynamic include/exclude for device aliasing Daeho Jeong
2026-06-15 16:11 ` Jaegeuk Kim [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ajAkF5FYJAUyuz2Z@google.com \
--to=jaegeuk@kernel.org \
--cc=daeho43@gmail.com \
--cc=daehojeong@google.com \
--cc=kernel-team@android.com \
--cc=linux-f2fs-devel@lists.sourceforge.net \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox