linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2 v8] add ioctl/sysfs to donate file-backed pages
@ 2025-01-31 22:27 Jaegeuk Kim
  2025-01-31 22:27 ` [PATCH 1/2] f2fs: register inodes which is able to donate pages Jaegeuk Kim
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Jaegeuk Kim @ 2025-01-31 22:27 UTC
  To: linux-kernel, linux-f2fs-devel; +Cc: Jaegeuk Kim

Note: I will keep improving this patch set while trying to get feedback from
the MM and API folks in [1].

If users know, from a system-wide view, which file-backed pages should be
reclaimed, they can register them in advance with this ioctl() and reclaim them
all at once later.
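
The "reclaim all at once later" step goes through the sysfs entry added in
patch 2/2. A minimal userspace sketch of triggering it, assuming the entry
lives at /sys/fs/f2fs/tuning/reclaim_caches_kb as in that patch. The helper
name request_reclaim_kb() is hypothetical and not part of this series; it takes
the path as a parameter so it can be pointed at another file.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Path as proposed in patch 2/2; it may change in later revisions. */
#define RECLAIM_CACHES_KB_PATH "/sys/fs/f2fs/tuning/reclaim_caches_kb"

/*
 * Hypothetical helper: write "<kb>\n" to @path, like
 * "echo <kb> > path" would. Returns 0 on success or -errno.
 */
static int request_reclaim_kb(const char *path, unsigned long kb)
{
	char buf[32];
	int len = snprintf(buf, sizeof(buf), "%lu\n", kb);
	int fd = open(path, O_WRONLY);
	ssize_t written;
	int err = 0;

	if (fd < 0)
		return -errno;

	written = write(fd, buf, len);
	if (written < 0)
		err = -errno;
	else if (written != len)
		err = -EIO;	/* short write: treat as an I/O error */

	close(fd);
	return err;
}
```

Calling request_reclaim_kb(RECLAIM_CACHES_KB_PATH, 1024) would then correspond
to "echo 1024 > /sys/fs/f2fs/tuning/reclaim_caches_kb".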

I'd like to propose this API in F2FS only, since
1) the use case is currently limited to Android. Once it is generally accepted
with more use cases, I'm happy to propose a generic API such as fadvise.
Please chime in if there is any need.

2) it deals with file-backed pages, which requires maintaining a list of inode
objects. I'm not sure this fits in MM, though, and I'm happy to hear any feedback.

[1] https://lore.kernel.org/lkml/Z4qmF2n2pzuHqad_@google.com/

Change log from v7:
 - change the sysfs entry to reclaim pages in all f2fs mounts

Change log from v6:
 - change sysfs entry name to reclaim_caches_kb

Jaegeuk Kim (2):
  f2fs: register inodes which is able to donate pages
  f2fs: add a sysfs entry to request donate file-backed pages

 Documentation/ABI/testing/sysfs-fs-f2fs |  7 ++
 fs/f2fs/debug.c                         |  3 +
 fs/f2fs/f2fs.h                          | 14 +++-
 fs/f2fs/file.c                          | 60 +++++++++++++++++
 fs/f2fs/inode.c                         | 14 ++++
 fs/f2fs/shrinker.c                      | 90 +++++++++++++++++++++++++
 fs/f2fs/super.c                         |  1 +
 fs/f2fs/sysfs.c                         | 63 +++++++++++++++++
 include/uapi/linux/f2fs.h               |  7 ++
 9 files changed, 258 insertions(+), 1 deletion(-)

-- 
2.48.1.362.g079036d154-goog



* [PATCH 1/2] f2fs: register inodes which is able to donate pages
  2025-01-31 22:27 [PATCH 0/2 v8] add ioctl/sysfs to donate file-backed pages Jaegeuk Kim
@ 2025-01-31 22:27 ` Jaegeuk Kim
  2025-01-31 22:27 ` [PATCH 2/2] f2fs: add a sysfs entry to request donate file-backed pages Jaegeuk Kim
  2025-02-04  5:49 ` [PATCH 0/2 v8] add ioctl/sysfs to " Christoph Hellwig
  2 siblings, 0 replies; 9+ messages in thread
From: Jaegeuk Kim @ 2025-01-31 22:27 UTC
  To: linux-kernel, linux-f2fs-devel; +Cc: Jaegeuk Kim, Chao Yu

This patch introduces an inode list that tracks the page cache ranges users
are willing to donate, so that those pages can be reclaimed together.

 #define F2FS_IOC_DONATE_RANGE		_IOW(F2FS_IOCTL_MAGIC, 27,	\
						struct f2fs_donate_range)
 struct f2fs_donate_range {
	__u64 start;
	__u64 len;
 };

e.g., ioctl(fd, F2FS_IOC_DONATE_RANGE, &range);
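
As a rough userspace illustration of the interface above, the sketch below
wraps the ioctl in two helpers. It is not part of the patch: the helper names
donate_range() and undonate_range() are hypothetical, the struct and request
number are copied from this patch's uapi change rather than taken from an
installed header, and F2FS_IOCTL_MAGIC is assumed to be 0xf5 as in
include/uapi/linux/f2fs.h. start and len are byte values, and len = 0 removes
the registration, as in f2fs_ioc_donate_range().

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>
#include <linux/types.h>

/* Assumption: same value as in include/uapi/linux/f2fs.h. */
#define F2FS_IOCTL_MAGIC	0xf5

/* Copied from this patch; installed headers may not have it yet. */
struct f2fs_donate_range {
	__u64 start;	/* byte offset into the file */
	__u64 len;	/* byte length; 0 unregisters the file */
};

#define F2FS_IOC_DONATE_RANGE	_IOW(F2FS_IOCTL_MAGIC, 27,	\
					struct f2fs_donate_range)

/*
 * Hypothetical helper: register the byte range [start, start + len)
 * of @fd for donation. Returns 0 on success or -errno.
 */
static int donate_range(int fd, __u64 start, __u64 len)
{
	struct f2fs_donate_range range = { .start = start, .len = len };

	if (ioctl(fd, F2FS_IOC_DONATE_RANGE, &range) < 0)
		return -errno;
	return 0;
}

/* Hypothetical helper: drop the registration by passing len = 0. */
static int undonate_range(int fd)
{
	return donate_range(fd, 0, 0);
}
```

A caller would typically open a regular file on an f2fs mount, call
donate_range(fd, 0, size) once, and later call undonate_range(fd) if the pages
should no longer be candidates. Only one range is kept per inode, so a second
call replaces the first.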

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 fs/f2fs/debug.c           |  3 ++
 fs/f2fs/f2fs.h            | 12 +++++++-
 fs/f2fs/file.c            | 60 +++++++++++++++++++++++++++++++++++++++
 fs/f2fs/inode.c           | 14 +++++++++
 fs/f2fs/super.c           |  1 +
 include/uapi/linux/f2fs.h |  7 +++++
 6 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index 468828288a4a..16c2dfb4f595 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -164,6 +164,7 @@ static void update_general_status(struct f2fs_sb_info *sbi)
 	si->ndirty_imeta = get_pages(sbi, F2FS_DIRTY_IMETA);
 	si->ndirty_dirs = sbi->ndirty_inode[DIR_INODE];
 	si->ndirty_files = sbi->ndirty_inode[FILE_INODE];
+	si->ndonate_files = sbi->donate_files;
 	si->nquota_files = sbi->nquota_files;
 	si->ndirty_all = sbi->ndirty_inode[DIRTY_META];
 	si->aw_cnt = atomic_read(&sbi->atomic_files);
@@ -501,6 +502,8 @@ static int stat_show(struct seq_file *s, void *v)
 			   si->compr_inode, si->compr_blocks);
 		seq_printf(s, "  - Swapfile Inode: %u\n",
 			   si->swapfile_inode);
+		seq_printf(s, "  - Donate Inode: %u\n",
+			   si->ndonate_files);
 		seq_printf(s, "  - Orphan/Append/Update Inode: %u, %u, %u\n",
 			   si->orphans, si->append, si->update);
 		seq_printf(s, "\nMain area: %d segs, %d secs %d zones\n",
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 1afa7be16e7d..805585a7d2b6 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -849,6 +849,11 @@ struct f2fs_inode_info {
 #endif
 	struct list_head dirty_list;	/* dirty list for dirs and files */
 	struct list_head gdirty_list;	/* linked in global dirty list */
+
+	/* linked in global inode list for cache donation */
+	struct list_head gdonate_list;
+	pgoff_t donate_start, donate_end; /* inclusive */
+
 	struct task_struct *atomic_write_task;	/* store atomic write task */
 	struct extent_tree *extent_tree[NR_EXTENT_CACHES];
 					/* cached extent_tree entry */
@@ -1273,6 +1278,7 @@ enum inode_type {
 	DIR_INODE,			/* for dirty dir inode */
 	FILE_INODE,			/* for dirty regular/symlink inode */
 	DIRTY_META,			/* for all dirtied inode metadata */
+	DONATE_INODE,			/* for all inode to donate pages */
 	NR_INODE_TYPE,
 };
 
@@ -1628,6 +1634,9 @@ struct f2fs_sb_info {
 	unsigned int warm_data_age_threshold;
 	unsigned int last_age_weight;
 
+	/* control donate caches */
+	unsigned int donate_files;
+
 	/* basic filesystem units */
 	unsigned int log_sectors_per_block;	/* log2 sectors per block */
 	unsigned int log_blocksize;		/* log2 block size */
@@ -3966,7 +3975,8 @@ struct f2fs_stat_info {
 	unsigned long long allocated_data_blocks;
 	int ndirty_node, ndirty_dent, ndirty_meta, ndirty_imeta;
 	int ndirty_data, ndirty_qdata;
-	unsigned int ndirty_dirs, ndirty_files, nquota_files, ndirty_all;
+	unsigned int ndirty_dirs, ndirty_files, ndirty_all;
+	unsigned int nquota_files, ndonate_files;
 	int nats, dirty_nats, sits, dirty_sits;
 	int free_nids, avail_nids, alloc_nids;
 	int total_count, utilization;
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index f92a9fba9991..642b8d85a035 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2448,6 +2448,63 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg)
 	return ret;
 }
 
+static int f2fs_ioc_donate_range(struct file *filp, unsigned long arg)
+{
+	struct inode *inode = file_inode(filp);
+	struct mnt_idmap *idmap = file_mnt_idmap(filp);
+	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+	struct f2fs_donate_range range;
+	u64 max_bytes = F2FS_BLK_TO_BYTES(max_file_blocks(inode));
+	u64 start, end;
+
+	if (copy_from_user(&range, (struct f2fs_donate_range __user *)arg,
+							sizeof(range)))
+		return -EFAULT;
+
+	if (!inode_owner_or_capable(idmap, inode))
+		return -EACCES;
+
+	if (!S_ISREG(inode->i_mode))
+		return -EINVAL;
+
+	if (range.start >= max_bytes || range.len > max_bytes ||
+	    (range.start + range.len) > max_bytes)
+		return -EINVAL;
+
+	start = range.start >> PAGE_SHIFT;
+	end = DIV_ROUND_UP(range.start + range.len, PAGE_SIZE);
+
+	inode_lock(inode);
+
+	if (f2fs_is_atomic_file(inode)) {
+		inode_unlock(inode);
+		return -EINVAL;
+	}
+
+	spin_lock(&sbi->inode_lock[DONATE_INODE]);
+	/* let's remove the range, if len = 0 */
+	if (!range.len) {
+		if (!list_empty(&F2FS_I(inode)->gdonate_list)) {
+			list_del_init(&F2FS_I(inode)->gdonate_list);
+			sbi->donate_files--;
+		}
+	} else {
+		if (list_empty(&F2FS_I(inode)->gdonate_list)) {
+			list_add_tail(&F2FS_I(inode)->gdonate_list,
+					&sbi->inode_list[DONATE_INODE]);
+			sbi->donate_files++;
+		} else {
+			list_move_tail(&F2FS_I(inode)->gdonate_list,
+					&sbi->inode_list[DONATE_INODE]);
+		}
+		F2FS_I(inode)->donate_start = start;
+		F2FS_I(inode)->donate_end = end - 1;
+	}
+	spin_unlock(&sbi->inode_lock[DONATE_INODE]);
+	inode_unlock(inode);
+	return 0;
+}
+
 static int f2fs_ioc_fitrim(struct file *filp, unsigned long arg)
 {
 	struct inode *inode = file_inode(filp);
@@ -4477,6 +4534,8 @@ static long __f2fs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 		return -EOPNOTSUPP;
 	case F2FS_IOC_SHUTDOWN:
 		return f2fs_ioc_shutdown(filp, arg);
+	case F2FS_IOC_DONATE_RANGE:
+		return f2fs_ioc_donate_range(filp, arg);
 	case FITRIM:
 		return f2fs_ioc_fitrim(filp, arg);
 	case FS_IOC_SET_ENCRYPTION_POLICY:
@@ -5228,6 +5287,7 @@ long f2fs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	case F2FS_IOC_RELEASE_VOLATILE_WRITE:
 	case F2FS_IOC_ABORT_ATOMIC_WRITE:
 	case F2FS_IOC_SHUTDOWN:
+	case F2FS_IOC_DONATE_RANGE:
 	case FITRIM:
 	case FS_IOC_SET_ENCRYPTION_POLICY:
 	case FS_IOC_GET_ENCRYPTION_PWSALT:
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 3dd25f64d6f1..cba2f6bacde4 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -804,6 +804,19 @@ int f2fs_write_inode(struct inode *inode, struct writeback_control *wbc)
 	return 0;
 }
 
+static void f2fs_remove_donate_inode(struct inode *inode)
+{
+	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+
+	if (list_empty(&F2FS_I(inode)->gdonate_list))
+		return;
+
+	spin_lock(&sbi->inode_lock[DONATE_INODE]);
+	list_del_init(&F2FS_I(inode)->gdonate_list);
+	sbi->donate_files--;
+	spin_unlock(&sbi->inode_lock[DONATE_INODE]);
+}
+
 /*
  * Called at the last iput() if i_nlink is zero
  */
@@ -838,6 +851,7 @@ void f2fs_evict_inode(struct inode *inode)
 
 	f2fs_bug_on(sbi, get_dirty_pages(inode));
 	f2fs_remove_dirty_inode(inode);
+	f2fs_remove_donate_inode(inode);
 
 	if (!IS_DEVICE_ALIASING(inode))
 		f2fs_destroy_extent_tree(inode);
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 19b67828ae32..24ded06c8980 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1441,6 +1441,7 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb)
 	spin_lock_init(&fi->i_size_lock);
 	INIT_LIST_HEAD(&fi->dirty_list);
 	INIT_LIST_HEAD(&fi->gdirty_list);
+	INIT_LIST_HEAD(&fi->gdonate_list);
 	init_f2fs_rwsem(&fi->i_gc_rwsem[READ]);
 	init_f2fs_rwsem(&fi->i_gc_rwsem[WRITE]);
 	init_f2fs_rwsem(&fi->i_xattr_sem);
diff --git a/include/uapi/linux/f2fs.h b/include/uapi/linux/f2fs.h
index f7aaf8d23e20..cd38a7c166e6 100644
--- a/include/uapi/linux/f2fs.h
+++ b/include/uapi/linux/f2fs.h
@@ -44,6 +44,8 @@
 #define F2FS_IOC_COMPRESS_FILE		_IO(F2FS_IOCTL_MAGIC, 24)
 #define F2FS_IOC_START_ATOMIC_REPLACE	_IO(F2FS_IOCTL_MAGIC, 25)
 #define F2FS_IOC_GET_DEV_ALIAS_FILE	_IOR(F2FS_IOCTL_MAGIC, 26, __u32)
+#define F2FS_IOC_DONATE_RANGE		_IOW(F2FS_IOCTL_MAGIC, 27,	\
+						struct f2fs_donate_range)
 
 /*
  * should be same as XFS_IOC_GOINGDOWN.
@@ -97,4 +99,9 @@ struct f2fs_comp_option {
 	__u8 log_cluster_size;
 };
 
+struct f2fs_donate_range {
+	__u64 start;
+	__u64 len;
+};
+
 #endif /* _UAPI_LINUX_F2FS_H */
-- 
2.48.1.362.g079036d154-goog



* [PATCH 2/2] f2fs: add a sysfs entry to request donate file-backed pages
  2025-01-31 22:27 [PATCH 0/2 v8] add ioctl/sysfs to donate file-backed pages Jaegeuk Kim
  2025-01-31 22:27 ` [PATCH 1/2] f2fs: register inodes which is able to donate pages Jaegeuk Kim
@ 2025-01-31 22:27 ` Jaegeuk Kim
  2025-02-06  2:23   ` [f2fs-dev] " Chao Yu
  2025-02-07 16:28   ` [PATCH 2/2 v2] " Jaegeuk Kim
  2025-02-04  5:49 ` [PATCH 0/2 v8] add ioctl/sysfs to " Christoph Hellwig
  2 siblings, 2 replies; 9+ messages in thread
From: Jaegeuk Kim @ 2025-01-31 22:27 UTC
  To: linux-kernel, linux-f2fs-devel; +Cc: Jaegeuk Kim

1. ioctl(fd1, F2FS_IOC_DONATE_RANGE, {0,3});
2. ioctl(fd2, F2FS_IOC_DONATE_RANGE, {1,2});
3. ioctl(fd3, F2FS_IOC_DONATE_RANGE, {3,1});
4. echo 1024 > /sys/fs/f2fs/tuning/reclaim_caches_kb

This gives a way to reclaim file-backed pages: step 4 iterates over all f2fs
mounts until 1MB worth of the page cache ranges registered by #1, #2, and #3
has been reclaimed.

5. cat /sys/fs/f2fs/tuning/reclaim_caches_kb
-> gives the total number of registered file ranges across all f2fs mounts.
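
To make the numbers in the steps above concrete, here is a small userspace
sketch of the arithmetic, assuming 4KB pages (PAGE_SHIFT = 12). It mirrors the
start/end computation in f2fs_ioc_donate_range() from patch 1/2 and the
KB-to-pages conversion and budget accounting in do_reclaim_caches(); the helper
names and SKETCH_ constants are made up for illustration. Since start and len
are byte values, under this assumption the ranges in #1, #2, and #3 each cover
only page 0 of their file, and writing 1024 yields a budget of 256 pages.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12	/* assumption: 4KB pages */
#define SKETCH_PAGE_SIZE	(1ULL << SKETCH_PAGE_SHIFT)

/* First page index of a byte range: range.start >> PAGE_SHIFT. */
static uint64_t donate_first_page(uint64_t start)
{
	return start >> SKETCH_PAGE_SHIFT;
}

/*
 * Last page index (inclusive) of a byte range:
 * DIV_ROUND_UP(start + len, PAGE_SIZE) - 1. Caller ensures len > 0.
 */
static uint64_t donate_last_page(uint64_t start, uint64_t len)
{
	return (start + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE - 1;
}

/* Page budget for a value written to reclaim_caches_kb. */
static uint64_t reclaim_kb_to_pages(uint64_t kb)
{
	return kb >> (SKETCH_PAGE_SHIFT - 10);
}

/*
 * Budget left after one file's range [first, last] is invalidated.
 * The budget is charged by the registered range length and clamps at 0.
 */
static uint64_t budget_after(uint64_t npages, uint64_t first, uint64_t last)
{
	uint64_t len = last - first + 1;

	return npages < len ? 0 : npages - len;
}
```

One consequence visible in the sketch is that the budget is reduced by the
length of each registered range, not by the number of pages that were actually
cached and dropped.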

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
 Documentation/ABI/testing/sysfs-fs-f2fs |  7 ++
 fs/f2fs/f2fs.h                          |  2 +
 fs/f2fs/shrinker.c                      | 90 +++++++++++++++++++++++++
 fs/f2fs/sysfs.c                         | 63 +++++++++++++++++
 4 files changed, 162 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
index 3e1630c70d8a..81deae2af84d 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -828,3 +828,10 @@ Date:		November 2024
 Contact:	"Chao Yu" <chao@kernel.org>
 Description:	It controls max read extent count for per-inode, the value of threshold
 		is 10240 by default.
+
+What:		/sys/fs/f2fs/tuning/reclaim_caches_kb
+Date:		February 2025
+Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:	It reclaims the given KBs of file-backed pages registered by
+		ioctl(F2FS_IOC_DONATE_RANGE).
+		For example, writing N tries to drop N KBs spaces in LRU.
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 805585a7d2b6..bd0d8138b71d 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4241,6 +4241,8 @@ unsigned long f2fs_shrink_count(struct shrinker *shrink,
 			struct shrink_control *sc);
 unsigned long f2fs_shrink_scan(struct shrinker *shrink,
 			struct shrink_control *sc);
+unsigned int f2fs_donate_files(void);
+void f2fs_reclaim_caches(unsigned int reclaim_caches_kb);
 void f2fs_join_shrinker(struct f2fs_sb_info *sbi);
 void f2fs_leave_shrinker(struct f2fs_sb_info *sbi);
 
diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c
index 83d6fb97dcae..45efff635d8e 100644
--- a/fs/f2fs/shrinker.c
+++ b/fs/f2fs/shrinker.c
@@ -130,6 +130,96 @@ unsigned long f2fs_shrink_scan(struct shrinker *shrink,
 	return freed;
 }
 
+unsigned int f2fs_donate_files(void)
+{
+	struct f2fs_sb_info *sbi;
+	struct list_head *p;
+	unsigned int donate_files = 0;
+
+	spin_lock(&f2fs_list_lock);
+	p = f2fs_list.next;
+	while (p != &f2fs_list) {
+		sbi = list_entry(p, struct f2fs_sb_info, s_list);
+
+		/* stop f2fs_put_super */
+		if (!mutex_trylock(&sbi->umount_mutex)) {
+			p = p->next;
+			continue;
+		}
+		spin_unlock(&f2fs_list_lock);
+
+		donate_files += sbi->donate_files;
+
+		spin_lock(&f2fs_list_lock);
+		p = p->next;
+		mutex_unlock(&sbi->umount_mutex);
+	}
+	spin_unlock(&f2fs_list_lock);
+
+	return donate_files;
+}
+
+static unsigned int do_reclaim_caches(struct f2fs_sb_info *sbi,
+				unsigned int reclaim_caches_kb)
+{
+	struct inode *inode;
+	struct f2fs_inode_info *fi;
+	unsigned int nfiles = sbi->donate_files;
+	pgoff_t npages = reclaim_caches_kb >> (PAGE_SHIFT - 10);
+
+	while (npages && nfiles--) {
+		pgoff_t len;
+
+		spin_lock(&sbi->inode_lock[DONATE_INODE]);
+		if (list_empty(&sbi->inode_list[DONATE_INODE])) {
+			spin_unlock(&sbi->inode_lock[DONATE_INODE]);
+			break;
+		}
+		fi = list_first_entry(&sbi->inode_list[DONATE_INODE],
+					struct f2fs_inode_info, gdonate_list);
+		list_move_tail(&fi->gdonate_list, &sbi->inode_list[DONATE_INODE]);
+		inode = igrab(&fi->vfs_inode);
+		spin_unlock(&sbi->inode_lock[DONATE_INODE]);
+
+		if (!inode)
+			continue;
+
+		len = fi->donate_end - fi->donate_start + 1;
+		npages = npages < len ? 0 : npages - len;
+		invalidate_inode_pages2_range(inode->i_mapping,
+					fi->donate_start, fi->donate_end);
+		iput(inode);
+		cond_resched();
+	}
+	return npages << (PAGE_SHIFT - 10);
+}
+
+void f2fs_reclaim_caches(unsigned int reclaim_caches_kb)
+{
+	struct f2fs_sb_info *sbi;
+	struct list_head *p;
+
+	spin_lock(&f2fs_list_lock);
+	p = f2fs_list.next;
+	while (p != &f2fs_list && reclaim_caches_kb) {
+		sbi = list_entry(p, struct f2fs_sb_info, s_list);
+
+		/* stop f2fs_put_super */
+		if (!mutex_trylock(&sbi->umount_mutex)) {
+			p = p->next;
+			continue;
+		}
+		spin_unlock(&f2fs_list_lock);
+
+		reclaim_caches_kb = do_reclaim_caches(sbi, reclaim_caches_kb);
+
+		spin_lock(&f2fs_list_lock);
+		p = p->next;
+		mutex_unlock(&sbi->umount_mutex);
+	}
+	spin_unlock(&f2fs_list_lock);
+}
+
 void f2fs_join_shrinker(struct f2fs_sb_info *sbi)
 {
 	spin_lock(&f2fs_list_lock);
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 4bd7b17a20c8..579226a05a69 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -916,6 +916,39 @@ static struct f2fs_base_attr f2fs_base_attr_##_name = {		\
 	.show	= f2fs_feature_show,				\
 }
 
+static ssize_t f2fs_tune_show(struct f2fs_base_attr *a, char *buf)
+{
+	unsigned int res;
+
+	if (!strcmp(a->attr.name, "reclaim_caches_kb"))
+		res = f2fs_donate_files();
+
+	return sysfs_emit(buf, "%u\n", res);
+}
+
+static ssize_t f2fs_tune_store(struct f2fs_base_attr *a,
+			const char *buf, size_t count)
+{
+	unsigned long t;
+	int ret;
+
+	ret = kstrtoul(skip_spaces(buf), 0, &t);
+	if (ret)
+		return ret;
+
+	if (!strcmp(a->attr.name, "reclaim_caches_kb"))
+		f2fs_reclaim_caches(t);
+
+	return ret ? ret : count;
+}
+
+#define F2FS_TUNE_RW_ATTR(_name)				\
+static struct f2fs_base_attr f2fs_base_attr_##_name = {		\
+	.attr = {.name = __stringify(_name), .mode = 0644 },	\
+	.show	= f2fs_tune_show,				\
+	.store	= f2fs_tune_store,				\
+}
+
 static ssize_t f2fs_sb_feature_show(struct f2fs_attr *a,
 		struct f2fs_sb_info *sbi, char *buf)
 {
@@ -1368,6 +1401,14 @@ static struct attribute *f2fs_sb_feat_attrs[] = {
 };
 ATTRIBUTE_GROUPS(f2fs_sb_feat);
 
+F2FS_TUNE_RW_ATTR(reclaim_caches_kb);
+
+static struct attribute *f2fs_tune_attrs[] = {
+	BASE_ATTR_LIST(reclaim_caches_kb),
+	NULL,
+};
+ATTRIBUTE_GROUPS(f2fs_tune);
+
 static const struct sysfs_ops f2fs_attr_ops = {
 	.show	= f2fs_attr_show,
 	.store	= f2fs_attr_store,
@@ -1401,6 +1442,20 @@ static struct kobject f2fs_feat = {
 	.kset	= &f2fs_kset,
 };
 
+static const struct sysfs_ops f2fs_tune_attr_ops = {
+	.show	= f2fs_base_attr_show,
+	.store	= f2fs_base_attr_store,
+};
+
+static const struct kobj_type f2fs_tune_ktype = {
+	.default_groups = f2fs_tune_groups,
+	.sysfs_ops	= &f2fs_tune_attr_ops,
+};
+
+static struct kobject f2fs_tune = {
+	.kset	= &f2fs_kset,
+};
+
 static ssize_t f2fs_stat_attr_show(struct kobject *kobj,
 				struct attribute *attr, char *buf)
 {
@@ -1637,6 +1692,11 @@ int __init f2fs_init_sysfs(void)
 	if (ret)
 		goto unregister_out;
 
+	ret = kobject_init_and_add(&f2fs_tune, &f2fs_tune_ktype,
+				   NULL, "tuning");
+	if (ret)
+		goto put_feat;
+
 	f2fs_proc_root = proc_mkdir("fs/f2fs", NULL);
 	if (!f2fs_proc_root) {
 		ret = -ENOMEM;
@@ -1645,6 +1705,8 @@ int __init f2fs_init_sysfs(void)
 
 	return 0;
 put_kobject:
+	kobject_put(&f2fs_tune);
+put_feat:
 	kobject_put(&f2fs_feat);
 unregister_out:
 	kset_unregister(&f2fs_kset);
@@ -1653,6 +1715,7 @@ int __init f2fs_init_sysfs(void)
 
 void f2fs_exit_sysfs(void)
 {
+	kobject_put(&f2fs_tune);
 	kobject_put(&f2fs_feat);
 	kset_unregister(&f2fs_kset);
 	remove_proc_entry("fs/f2fs", NULL);
-- 
2.48.1.362.g079036d154-goog



* Re: [PATCH 0/2 v8] add ioctl/sysfs to donate file-backed pages
  2025-01-31 22:27 [PATCH 0/2 v8] add ioctl/sysfs to donate file-backed pages Jaegeuk Kim
  2025-01-31 22:27 ` [PATCH 1/2] f2fs: register inodes which is able to donate pages Jaegeuk Kim
  2025-01-31 22:27 ` [PATCH 2/2] f2fs: add a sysfs entry to request donate file-backed pages Jaegeuk Kim
@ 2025-02-04  5:49 ` Christoph Hellwig
  2025-02-04 16:26   ` Jaegeuk Kim
  2 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2025-02-04  5:49 UTC
  To: Jaegeuk Kim
  Cc: linux-kernel, linux-f2fs-devel, linux-mm, linux-api,
	linux-fsdevel

On Fri, Jan 31, 2025 at 10:27:55PM +0000, Jaegeuk Kim wrote:
> Note: I will keep improving this patch set while trying to get feedback from
> the MM and API folks in [1].

Please actually drive it instead of only interacting once after
I told you to.  The feedback is clearly that it is a MM thing, so please
drive it forward instead of going back to the hacky file system version.



* Re: [PATCH 0/2 v8] add ioctl/sysfs to donate file-backed pages
  2025-02-04  5:49 ` [PATCH 0/2 v8] add ioctl/sysfs to " Christoph Hellwig
@ 2025-02-04 16:26   ` Jaegeuk Kim
  0 siblings, 0 replies; 9+ messages in thread
From: Jaegeuk Kim @ 2025-02-04 16:26 UTC
  To: Christoph Hellwig
  Cc: linux-kernel, linux-f2fs-devel, linux-mm, linux-api,
	linux-fsdevel

On 02/03, Christoph Hellwig wrote:
> On Fri, Jan 31, 2025 at 10:27:55PM +0000, Jaegeuk Kim wrote:
> > Note: I will keep improving this patch set while trying to get feedback from
> > the MM and API folks in [1].
> 
> Please actually drive it instead of only interacting once after
> I told you to.  The feedback is clearly that it is a MM thing, so please
> drive it forward instead of going back to the hacky file system version.

As I keep saying, I'm working on this in parallel for production. And no
worries, I won't merge this to -next until I get feedback from the MM folks.
I waited a couple of weeks before bothering them, so I will ping them there.



* Re: [f2fs-dev] [PATCH 2/2] f2fs: add a sysfs entry to request donate file-backed pages
  2025-01-31 22:27 ` [PATCH 2/2] f2fs: add a sysfs entry to request donate file-backed pages Jaegeuk Kim
@ 2025-02-06  2:23   ` Chao Yu
  2025-02-06 18:33     ` Jaegeuk Kim
  2025-02-07 16:28   ` [PATCH 2/2 v2] " Jaegeuk Kim
  1 sibling, 1 reply; 9+ messages in thread
From: Chao Yu @ 2025-02-06  2:23 UTC
  To: Jaegeuk Kim, linux-kernel, linux-f2fs-devel; +Cc: chao

On 2/1/25 06:27, Jaegeuk Kim via Linux-f2fs-devel wrote:
> +static ssize_t f2fs_tune_store(struct f2fs_base_attr *a,
> +			const char *buf, size_t count)
> +{
> +	unsigned long t;
> +	int ret;
> +
> +	ret = kstrtoul(skip_spaces(buf), 0, &t);
> +	if (ret)
> +		return ret;
> +
> +	if (!strcmp(a->attr.name, "reclaim_caches_kb"))
> +		f2fs_reclaim_caches(t);
> +
> +	return ret ? ret : count;

return count;

Thanks,




* Re: [f2fs-dev] [PATCH 2/2] f2fs: add a sysfs entry to request donate file-backed pages
  2025-02-06  2:23   ` [f2fs-dev] " Chao Yu
@ 2025-02-06 18:33     ` Jaegeuk Kim
  0 siblings, 0 replies; 9+ messages in thread
From: Jaegeuk Kim @ 2025-02-06 18:33 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-kernel, linux-f2fs-devel

On 02/06, Chao Yu wrote:
> On 2/1/25 06:27, Jaegeuk Kim via Linux-f2fs-devel wrote:
> > 1. ioctl(fd1, F2FS_IOC_DONATE_RANGE, {0,3});
> > 2. ioctl(fd2, F2FS_IOC_DONATE_RANGE, {1,2});
> > 3. ioctl(fd3, F2FS_IOC_DONATE_RANGE, {3,1});
> > 4. echo 1024 > /sys/fs/f2fs/tuning/reclaim_caches_kb
> > 
> > This gives a way to reclaim file-backed pages by iterating all f2fs mounts until
> > reclaiming 1MB page cache ranges, registered by #1, #2, and #3.
> > 
> > 5. cat /sys/fs/f2fs/tuning/reclaim_caches_kb
> > -> gives total number of registered file ranges.
> > 
> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > ---
> >  Documentation/ABI/testing/sysfs-fs-f2fs |  7 ++
> >  fs/f2fs/f2fs.h                          |  2 +
> >  fs/f2fs/shrinker.c                      | 90 +++++++++++++++++++++++++
> >  fs/f2fs/sysfs.c                         | 63 +++++++++++++++++
> >  4 files changed, 162 insertions(+)
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
> > index 3e1630c70d8a..81deae2af84d 100644
> > --- a/Documentation/ABI/testing/sysfs-fs-f2fs
> > +++ b/Documentation/ABI/testing/sysfs-fs-f2fs
> > @@ -828,3 +828,10 @@ Date:		November 2024
> >  Contact:	"Chao Yu" <chao@kernel.org>
> >  Description:	It controls max read extent count for per-inode, the value of threshold
> >  		is 10240 by default.
> > +
> > +What:		/sys/fs/f2fs/tuning/reclaim_caches_kb
> > +Date:		February 2025
> > +Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
> > +Description:	It reclaims the given KBs of file-backed pages registered by
> > +		ioctl(F2FS_IOC_DONATE_RANGE).
> > +		For example, writing N tries to drop N KBs spaces in LRU.
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index 805585a7d2b6..bd0d8138b71d 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -4241,6 +4241,8 @@ unsigned long f2fs_shrink_count(struct shrinker *shrink,
> >  			struct shrink_control *sc);
> >  unsigned long f2fs_shrink_scan(struct shrinker *shrink,
> >  			struct shrink_control *sc);
> > +unsigned int f2fs_donate_files(void);
> > +void f2fs_reclaim_caches(unsigned int reclaim_caches_kb);
> >  void f2fs_join_shrinker(struct f2fs_sb_info *sbi);
> >  void f2fs_leave_shrinker(struct f2fs_sb_info *sbi);
> >  
> > diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c
> > index 83d6fb97dcae..45efff635d8e 100644
> > --- a/fs/f2fs/shrinker.c
> > +++ b/fs/f2fs/shrinker.c
> > @@ -130,6 +130,96 @@ unsigned long f2fs_shrink_scan(struct shrinker *shrink,
> >  	return freed;
> >  }
> >  
> > +unsigned int f2fs_donate_files(void)
> > +{
> > +	struct f2fs_sb_info *sbi;
> > +	struct list_head *p;
> > +	unsigned int donate_files = 0;
> > +
> > +	spin_lock(&f2fs_list_lock);
> > +	p = f2fs_list.next;
> > +	while (p != &f2fs_list) {
> > +		sbi = list_entry(p, struct f2fs_sb_info, s_list);
> > +
> > +		/* stop f2fs_put_super */
> > +		if (!mutex_trylock(&sbi->umount_mutex)) {
> > +			p = p->next;
> > +			continue;
> > +		}
> > +		spin_unlock(&f2fs_list_lock);
> > +
> > +		donate_files += sbi->donate_files;
> > +
> > +		spin_lock(&f2fs_list_lock);
> > +		p = p->next;
> > +		mutex_unlock(&sbi->umount_mutex);
> > +	}
> > +	spin_unlock(&f2fs_list_lock);
> > +
> > +	return donate_files;
> > +}
> > +
> > +static unsigned int do_reclaim_caches(struct f2fs_sb_info *sbi,
> > +				unsigned int reclaim_caches_kb)
> > +{
> > +	struct inode *inode;
> > +	struct f2fs_inode_info *fi;
> > +	unsigned int nfiles = sbi->donate_files;
> > +	pgoff_t npages = reclaim_caches_kb >> (PAGE_SHIFT - 10);
> > +
> > +	while (npages && nfiles--) {
> > +		pgoff_t len;
> > +
> > +		spin_lock(&sbi->inode_lock[DONATE_INODE]);
> > +		if (list_empty(&sbi->inode_list[DONATE_INODE])) {
> > +			spin_unlock(&sbi->inode_lock[DONATE_INODE]);
> > +			break;
> > +		}
> > +		fi = list_first_entry(&sbi->inode_list[DONATE_INODE],
> > +					struct f2fs_inode_info, gdonate_list);
> > +		list_move_tail(&fi->gdonate_list, &sbi->inode_list[DONATE_INODE]);
> > +		inode = igrab(&fi->vfs_inode);
> > +		spin_unlock(&sbi->inode_lock[DONATE_INODE]);
> > +
> > +		if (!inode)
> > +			continue;
> > +
> > +		len = fi->donate_end - fi->donate_start + 1;
> > +		npages = npages < len ? 0 : npages - len;
> > +		invalidate_inode_pages2_range(inode->i_mapping,
> > +					fi->donate_start, fi->donate_end);
> > +		iput(inode);
> > +		cond_resched();
> > +	}
> > +	return npages << (PAGE_SHIFT - 10);
> > +}
> > +
> > +void f2fs_reclaim_caches(unsigned int reclaim_caches_kb)
> > +{
> > +	struct f2fs_sb_info *sbi;
> > +	struct list_head *p;
> > +
> > +	spin_lock(&f2fs_list_lock);
> > +	p = f2fs_list.next;
> > +	while (p != &f2fs_list && reclaim_caches_kb) {
> > +		sbi = list_entry(p, struct f2fs_sb_info, s_list);
> > +
> > +		/* stop f2fs_put_super */
> > +		if (!mutex_trylock(&sbi->umount_mutex)) {
> > +			p = p->next;
> > +			continue;
> > +		}
> > +		spin_unlock(&f2fs_list_lock);
> > +
> > +		reclaim_caches_kb = do_reclaim_caches(sbi, reclaim_caches_kb);
> > +
> > +		spin_lock(&f2fs_list_lock);
> > +		p = p->next;
> > +		mutex_unlock(&sbi->umount_mutex);
> > +	}
> > +	spin_unlock(&f2fs_list_lock);
> > +}
> > +
> >  void f2fs_join_shrinker(struct f2fs_sb_info *sbi)
> >  {
> >  	spin_lock(&f2fs_list_lock);
> > diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
> > index 4bd7b17a20c8..579226a05a69 100644
> > --- a/fs/f2fs/sysfs.c
> > +++ b/fs/f2fs/sysfs.c
> > @@ -916,6 +916,39 @@ static struct f2fs_base_attr f2fs_base_attr_##_name = {		\
> >  	.show	= f2fs_feature_show,				\
> >  }
> >  
> > +static ssize_t f2fs_tune_show(struct f2fs_base_attr *a, char *buf)
> > +{
> > +	unsigned int res;
> > +
> > +	if (!strcmp(a->attr.name, "reclaim_caches_kb"))
> > +		res = f2fs_donate_files();
> > +
> > +	return sysfs_emit(buf, "%u\n", res);
> > +}
> > +
> > +static ssize_t f2fs_tune_store(struct f2fs_base_attr *a,
> > +			const char *buf, size_t count)
> > +{
> > +	unsigned long t;
> > +	int ret;
> > +
> > +	ret = kstrtoul(skip_spaces(buf), 0, &t);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (!strcmp(a->attr.name, "reclaim_caches_kb"))
> > +		f2fs_reclaim_caches(t);
> > +
> > +	return ret ? ret : count;
> 
> return count;

Applied. Thanks,

> 
> Thanks,
> 
> > +}
> > +
> > +#define F2FS_TUNE_RW_ATTR(_name)				\
> > +static struct f2fs_base_attr f2fs_base_attr_##_name = {		\
> > +	.attr = {.name = __stringify(_name), .mode = 0644 },	\
> > +	.show	= f2fs_tune_show,				\
> > +	.store	= f2fs_tune_store,				\
> > +}
> > +
> >  static ssize_t f2fs_sb_feature_show(struct f2fs_attr *a,
> >  		struct f2fs_sb_info *sbi, char *buf)
> >  {
> > @@ -1368,6 +1401,14 @@ static struct attribute *f2fs_sb_feat_attrs[] = {
> >  };
> >  ATTRIBUTE_GROUPS(f2fs_sb_feat);
> >  
> > +F2FS_TUNE_RW_ATTR(reclaim_caches_kb);
> > +
> > +static struct attribute *f2fs_tune_attrs[] = {
> > +	BASE_ATTR_LIST(reclaim_caches_kb),
> > +	NULL,
> > +};
> > +ATTRIBUTE_GROUPS(f2fs_tune);
> > +
> >  static const struct sysfs_ops f2fs_attr_ops = {
> >  	.show	= f2fs_attr_show,
> >  	.store	= f2fs_attr_store,
> > @@ -1401,6 +1442,20 @@ static struct kobject f2fs_feat = {
> >  	.kset	= &f2fs_kset,
> >  };
> >  
> > +static const struct sysfs_ops f2fs_tune_attr_ops = {
> > +	.show	= f2fs_base_attr_show,
> > +	.store	= f2fs_base_attr_store,
> > +};
> > +
> > +static const struct kobj_type f2fs_tune_ktype = {
> > +	.default_groups = f2fs_tune_groups,
> > +	.sysfs_ops	= &f2fs_tune_attr_ops,
> > +};
> > +
> > +static struct kobject f2fs_tune = {
> > +	.kset	= &f2fs_kset,
> > +};
> > +
> >  static ssize_t f2fs_stat_attr_show(struct kobject *kobj,
> >  				struct attribute *attr, char *buf)
> >  {
> > @@ -1637,6 +1692,11 @@ int __init f2fs_init_sysfs(void)
> >  	if (ret)
> >  		goto unregister_out;
> >  
> > +	ret = kobject_init_and_add(&f2fs_tune, &f2fs_tune_ktype,
> > +				   NULL, "tuning");
> > +	if (ret)
> > +		goto put_feat;
> > +
> >  	f2fs_proc_root = proc_mkdir("fs/f2fs", NULL);
> >  	if (!f2fs_proc_root) {
> >  		ret = -ENOMEM;
> > @@ -1645,6 +1705,8 @@ int __init f2fs_init_sysfs(void)
> >  
> >  	return 0;
> >  put_kobject:
> > +	kobject_put(&f2fs_tune);
> > +put_feat:
> >  	kobject_put(&f2fs_feat);
> >  unregister_out:
> >  	kset_unregister(&f2fs_kset);
> > @@ -1653,6 +1715,7 @@ int __init f2fs_init_sysfs(void)
> >  
> >  void f2fs_exit_sysfs(void)
> >  {
> > +	kobject_put(&f2fs_tune);
> >  	kobject_put(&f2fs_feat);
> >  	kset_unregister(&f2fs_kset);
> >  	remove_proc_entry("fs/f2fs", NULL);
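
The `return count;` suggestion above follows from the early return: once `ret` has been checked, it is always zero at the final statement, so `ret ? ret : count` reduces to `count`. A minimal userspace sketch of the same parse-then-dispatch shape, assuming `strtoul()` as a rough stand-in for the kernel's `kstrtoul()`; `tune_store()` and `last_request_kb` are hypothetical names, not part of the patch:

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical: records the last accepted request, in KB. */
static unsigned long last_request_kb;

/* Hypothetical userspace stand-in for a sysfs store handler. */
static ssize_t tune_store(const char *buf, size_t count)
{
	char *end;
	unsigned long t;

	/* Skip leading whitespace, as skip_spaces() does in the patch. */
	while (isspace((unsigned char)*buf))
		buf++;
	/* Reject empty input and negative numbers. */
	if (*buf == '\0' || *buf == '-')
		return -EINVAL;

	errno = 0;
	t = strtoul(buf, &end, 0);	/* base 0: accepts 10, 0x10, 010 */
	if (errno == ERANGE)
		return -ERANGE;
	if (end == buf)
		return -EINVAL;
	/* Allow one trailing newline, as produced by echo. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	last_request_kb = t;
	/* Every error path returned early, so success always reports count. */
	return (ssize_t)count;
}
```

Calling `tune_store("1024\n", 5)` in this sketch returns 5 and records 1024; malformed input returns a negative errno without touching the recorded value.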


* Re: [PATCH 2/2 v2] f2fs: add a sysfs entry to request donate file-backed pages
  2025-01-31 22:27 ` [PATCH 2/2] f2fs: add a sysfs entry to request donate file-backed pages Jaegeuk Kim
  2025-02-06  2:23   ` [f2fs-dev] " Chao Yu
@ 2025-02-07 16:28   ` Jaegeuk Kim
  2025-02-10  9:59     ` [f2fs-dev] " Chao Yu
  1 sibling, 1 reply; 9+ messages in thread
From: Jaegeuk Kim @ 2025-02-07 16:28 UTC (permalink / raw)
  To: linux-kernel, linux-f2fs-devel

1. ioctl(fd1, F2FS_IOC_DONATE_RANGE, {0,3});
2. ioctl(fd2, F2FS_IOC_DONATE_RANGE, {1,2});
3. ioctl(fd3, F2FS_IOC_DONATE_RANGE, {3,1});
4. echo 1024 > /sys/fs/f2fs/tuning/reclaim_caches_kb

This gives a way to reclaim file-backed pages by iterating over all f2fs mounts
until 1MB of the page cache ranges registered by #1, #2, and #3 has been reclaimed.

5. cat /sys/fs/f2fs/tuning/reclaim_caches_kb
-> gives the total number of registered file ranges.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---

 From v1:
   - Minor revision to clean up the flow.

 Documentation/ABI/testing/sysfs-fs-f2fs |  7 ++
 fs/f2fs/f2fs.h                          |  2 +
 fs/f2fs/shrinker.c                      | 90 +++++++++++++++++++++++++
 fs/f2fs/sysfs.c                         | 63 +++++++++++++++++
 4 files changed, 162 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
index 3e1630c70d8a..81deae2af84d 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -828,3 +828,10 @@ Date:		November 2024
 Contact:	"Chao Yu" <chao@kernel.org>
 Description:	It controls max read extent count for per-inode, the value of threshold
 		is 10240 by default.
+
+What:		/sys/fs/f2fs/tuning/reclaim_caches_kb
+Date:		February 2025
+Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
+Description:	It reclaims the given number of KBs of file-backed pages registered
+		by ioctl(F2FS_IOC_DONATE_RANGE).
+		For example, writing N tries to drop N KBs of pages in the LRU.
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 805585a7d2b6..bd0d8138b71d 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4241,6 +4241,8 @@ unsigned long f2fs_shrink_count(struct shrinker *shrink,
 			struct shrink_control *sc);
 unsigned long f2fs_shrink_scan(struct shrinker *shrink,
 			struct shrink_control *sc);
+unsigned int f2fs_donate_files(void);
+void f2fs_reclaim_caches(unsigned int reclaim_caches_kb);
 void f2fs_join_shrinker(struct f2fs_sb_info *sbi);
 void f2fs_leave_shrinker(struct f2fs_sb_info *sbi);
 
diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c
index 83d6fb97dcae..45efff635d8e 100644
--- a/fs/f2fs/shrinker.c
+++ b/fs/f2fs/shrinker.c
@@ -130,6 +130,96 @@ unsigned long f2fs_shrink_scan(struct shrinker *shrink,
 	return freed;
 }
 
+unsigned int f2fs_donate_files(void)
+{
+	struct f2fs_sb_info *sbi;
+	struct list_head *p;
+	unsigned int donate_files = 0;
+
+	spin_lock(&f2fs_list_lock);
+	p = f2fs_list.next;
+	while (p != &f2fs_list) {
+		sbi = list_entry(p, struct f2fs_sb_info, s_list);
+
+		/* stop f2fs_put_super */
+		if (!mutex_trylock(&sbi->umount_mutex)) {
+			p = p->next;
+			continue;
+		}
+		spin_unlock(&f2fs_list_lock);
+
+		donate_files += sbi->donate_files;
+
+		spin_lock(&f2fs_list_lock);
+		p = p->next;
+		mutex_unlock(&sbi->umount_mutex);
+	}
+	spin_unlock(&f2fs_list_lock);
+
+	return donate_files;
+}
+
+static unsigned int do_reclaim_caches(struct f2fs_sb_info *sbi,
+				unsigned int reclaim_caches_kb)
+{
+	struct inode *inode;
+	struct f2fs_inode_info *fi;
+	unsigned int nfiles = sbi->donate_files;
+	pgoff_t npages = reclaim_caches_kb >> (PAGE_SHIFT - 10);
+
+	while (npages && nfiles--) {
+		pgoff_t len;
+
+		spin_lock(&sbi->inode_lock[DONATE_INODE]);
+		if (list_empty(&sbi->inode_list[DONATE_INODE])) {
+			spin_unlock(&sbi->inode_lock[DONATE_INODE]);
+			break;
+		}
+		fi = list_first_entry(&sbi->inode_list[DONATE_INODE],
+					struct f2fs_inode_info, gdonate_list);
+		list_move_tail(&fi->gdonate_list, &sbi->inode_list[DONATE_INODE]);
+		inode = igrab(&fi->vfs_inode);
+		spin_unlock(&sbi->inode_lock[DONATE_INODE]);
+
+		if (!inode)
+			continue;
+
+		len = fi->donate_end - fi->donate_start + 1;
+		npages = npages < len ? 0 : npages - len;
+		invalidate_inode_pages2_range(inode->i_mapping,
+					fi->donate_start, fi->donate_end);
+		iput(inode);
+		cond_resched();
+	}
+	return npages << (PAGE_SHIFT - 10);
+}
+
+void f2fs_reclaim_caches(unsigned int reclaim_caches_kb)
+{
+	struct f2fs_sb_info *sbi;
+	struct list_head *p;
+
+	spin_lock(&f2fs_list_lock);
+	p = f2fs_list.next;
+	while (p != &f2fs_list && reclaim_caches_kb) {
+		sbi = list_entry(p, struct f2fs_sb_info, s_list);
+
+		/* stop f2fs_put_super */
+		if (!mutex_trylock(&sbi->umount_mutex)) {
+			p = p->next;
+			continue;
+		}
+		spin_unlock(&f2fs_list_lock);
+
+		reclaim_caches_kb = do_reclaim_caches(sbi, reclaim_caches_kb);
+
+		spin_lock(&f2fs_list_lock);
+		p = p->next;
+		mutex_unlock(&sbi->umount_mutex);
+	}
+	spin_unlock(&f2fs_list_lock);
+}
+
 void f2fs_join_shrinker(struct f2fs_sb_info *sbi)
 {
 	spin_lock(&f2fs_list_lock);
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index b419555e1ea7..b27336acf519 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -916,6 +916,39 @@ static struct f2fs_base_attr f2fs_base_attr_##_name = {		\
 	.show	= f2fs_feature_show,				\
 }
 
+static ssize_t f2fs_tune_show(struct f2fs_base_attr *a, char *buf)
+{
+	unsigned int res = 0;
+
+	if (!strcmp(a->attr.name, "reclaim_caches_kb"))
+		res = f2fs_donate_files();
+
+	return sysfs_emit(buf, "%u\n", res);
+}
+
+static ssize_t f2fs_tune_store(struct f2fs_base_attr *a,
+			const char *buf, size_t count)
+{
+	unsigned long t;
+	int ret;
+
+	ret = kstrtoul(skip_spaces(buf), 0, &t);
+	if (ret)
+		return ret;
+
+	if (!strcmp(a->attr.name, "reclaim_caches_kb"))
+		f2fs_reclaim_caches(t);
+
+	return count;
+}
+
+#define F2FS_TUNE_RW_ATTR(_name)				\
+static struct f2fs_base_attr f2fs_base_attr_##_name = {		\
+	.attr = {.name = __stringify(_name), .mode = 0644 },	\
+	.show	= f2fs_tune_show,				\
+	.store	= f2fs_tune_store,				\
+}
+
 static ssize_t f2fs_sb_feature_show(struct f2fs_attr *a,
 		struct f2fs_sb_info *sbi, char *buf)
 {
@@ -1368,6 +1401,14 @@ static struct attribute *f2fs_sb_feat_attrs[] = {
 };
 ATTRIBUTE_GROUPS(f2fs_sb_feat);
 
+F2FS_TUNE_RW_ATTR(reclaim_caches_kb);
+
+static struct attribute *f2fs_tune_attrs[] = {
+	BASE_ATTR_LIST(reclaim_caches_kb),
+	NULL,
+};
+ATTRIBUTE_GROUPS(f2fs_tune);
+
 static const struct sysfs_ops f2fs_attr_ops = {
 	.show	= f2fs_attr_show,
 	.store	= f2fs_attr_store,
@@ -1401,6 +1442,20 @@ static struct kobject f2fs_feat = {
 	.kset	= &f2fs_kset,
 };
 
+static const struct sysfs_ops f2fs_tune_attr_ops = {
+	.show	= f2fs_base_attr_show,
+	.store	= f2fs_base_attr_store,
+};
+
+static const struct kobj_type f2fs_tune_ktype = {
+	.default_groups = f2fs_tune_groups,
+	.sysfs_ops	= &f2fs_tune_attr_ops,
+};
+
+static struct kobject f2fs_tune = {
+	.kset	= &f2fs_kset,
+};
+
 static ssize_t f2fs_stat_attr_show(struct kobject *kobj,
 				struct attribute *attr, char *buf)
 {
@@ -1637,6 +1692,11 @@ int __init f2fs_init_sysfs(void)
 	if (ret)
 		goto put_kobject;
 
+	ret = kobject_init_and_add(&f2fs_tune, &f2fs_tune_ktype,
+				   NULL, "tuning");
+	if (ret)
+		goto put_kobject;
+
 	f2fs_proc_root = proc_mkdir("fs/f2fs", NULL);
 	if (!f2fs_proc_root) {
 		ret = -ENOMEM;
@@ -1644,7 +1704,9 @@ int __init f2fs_init_sysfs(void)
 	}
 
 	return 0;
+
 put_kobject:
+	kobject_put(&f2fs_tune);
 	kobject_put(&f2fs_feat);
 	kset_unregister(&f2fs_kset);
 	return ret;
@@ -1652,6 +1714,7 @@ int __init f2fs_init_sysfs(void)
 
 void f2fs_exit_sysfs(void)
 {
+	kobject_put(&f2fs_tune);
 	kobject_put(&f2fs_feat);
 	kset_unregister(&f2fs_kset);
 	remove_proc_entry("fs/f2fs", NULL);
-- 
2.48.1.502.g6dc24dfdaf-goog
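
The budget arithmetic in do_reclaim_caches() above can be sketched in userspace. This is only an illustration under stated assumptions: 4 KiB pages (`SKETCH_PAGE_SHIFT` is an assumption; the kernel's PAGE_SHIFT depends on the architecture), and a hypothetical `struct range` array standing in for the per-superblock donate list. It performs the accounting only and invalidates nothing.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for one registered inode's donated range. */
struct range {
	unsigned long start;	/* first page index, inclusive */
	unsigned long end;	/* last page index, inclusive */
};

/*
 * Walk the ranges until the KB budget is used up.
 * Returns the unused budget in KB, mirroring do_reclaim_caches().
 */
static unsigned int consume_budget_kb(const struct range *r, size_t n,
				      unsigned int budget_kb)
{
	/* KB -> pages: shift by (PAGE_SHIFT - 10), i.e. divide by 4 here. */
	unsigned long npages = budget_kb >> (SKETCH_PAGE_SHIFT - 10);
	size_t i;

	for (i = 0; i < n && npages; i++) {
		unsigned long len = r[i].end - r[i].start + 1;

		/* A range larger than the remaining budget zeroes it. */
		npages = npages < len ? 0 : npages - len;
	}
	/* pages -> KB for the caller, which moves on to the next mount. */
	return (unsigned int)(npages << (SKETCH_PAGE_SHIFT - 10));
}
```

Two properties of the posted code show up in the sketch: a request smaller than one page converts to zero pages, so nothing is attempted, and each range is charged at its full length whether or not its pages were actually cached.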



* Re: [f2fs-dev] [PATCH 2/2 v2] f2fs: add a sysfs entry to request donate file-backed pages
  2025-02-07 16:28   ` [PATCH 2/2 v2] " Jaegeuk Kim
@ 2025-02-10  9:59     ` Chao Yu
  0 siblings, 0 replies; 9+ messages in thread
From: Chao Yu @ 2025-02-10  9:59 UTC (permalink / raw)
  To: Jaegeuk Kim, linux-kernel, linux-f2fs-devel; +Cc: chao

On 2/8/25 00:28, Jaegeuk Kim via Linux-f2fs-devel wrote:
> 1. ioctl(fd1, F2FS_IOC_DONATE_RANGE, {0,3});
> 2. ioctl(fd2, F2FS_IOC_DONATE_RANGE, {1,2});
> 3. ioctl(fd3, F2FS_IOC_DONATE_RANGE, {3,1});
> 4. echo 1024 > /sys/fs/f2fs/tuning/reclaim_caches_kb
> 
> This gives a way to reclaim file-backed pages by iterating all f2fs mounts until
> reclaiming 1MB page cache ranges, registered by #1, #2, and #3.
> 
> 5. cat /sys/fs/f2fs/tuning/reclaim_caches_kb
> -> gives total number of registered file ranges.
> 
> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Reviewed-by: Chao Yu <chao@kernel.org>

Thanks,
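
Steps 4 and 5 quoted above use the shell; the same can be done from C. A minimal sketch that takes the knob path as a parameter (in the thread it is /sys/fs/f2fs/tuning/reclaim_caches_kb); the helper names `knob_write_kb()` and `knob_read()` are hypothetical and not part of the patch:

```c
#include <stdio.h>

/* Write a KB amount to the knob; returns 0 on success, -1 on failure. */
static int knob_write_kb(const char *path, unsigned long kb)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%lu\n", kb) > 0;
	/* fclose() flushes the buffer, so a rejected write surfaces here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read an unsigned value from the knob; returns 0 and fills *val on success. */
static int knob_read(const char *path, unsigned int *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%u", val) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

On the real sysfs entry the two directions are not symmetric: a write requests reclaim of that many KB, while a read returns the number of registered files, not the value last written. The attribute mode is 0644, so writing requires root.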


Thread overview: 9+ messages
2025-01-31 22:27 [PATCH 0/2 v8] add ioctl/sysfs to donate file-backed pages Jaegeuk Kim
2025-01-31 22:27 ` [PATCH 1/2] f2fs: register inodes which is able to donate pages Jaegeuk Kim
2025-01-31 22:27 ` [PATCH 2/2] f2fs: add a sysfs entry to request donate file-backed pages Jaegeuk Kim
2025-02-06  2:23   ` [f2fs-dev] " Chao Yu
2025-02-06 18:33     ` Jaegeuk Kim
2025-02-07 16:28   ` [PATCH 2/2 v2] " Jaegeuk Kim
2025-02-10  9:59     ` [f2fs-dev] " Chao Yu
2025-02-04  5:49 ` [PATCH 0/2 v8] add ioctl/sysfs to " Christoph Hellwig
2025-02-04 16:26   ` Jaegeuk Kim
