* [PATCH v5 0/5] data placement hints and FDP
@ 2024-09-10 15:01 ` Kanchan Joshi
` (5 more replies)
0 siblings, 6 replies; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-10 15:01 UTC (permalink / raw)
To: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche
Cc: linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Kanchan Joshi
Current write-hint infrastructure supports 6 temperature-based data
lifetime hints.
The series extends the infrastructure with a new temperature-agnostic
placement-type hint. New fcntl codes F_{SET/GET}_RW_HINT_EX allow
sending the hint type/value on a file. See the patch #3 commit description
and the interface example below [*].
Overall this creates 127 placement hint values that users can pass.
Patch #5 adds the ability to map these new hint values to nvme-specific
placement-identifiers.
Patch #4 restricts SCSI to use only lifetime hint values.
Patch #1 and #2 are simple prep patches.
[*]
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <inttypes.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <linux/fcntl.h>
int main(int argc, char *argv[])
{
struct rw_hint_ex set_hint_ex={}, get_hint_ex={};
int fd, ret;
if (argc < 4) {
fprintf(stderr, "Usage: %s file <hint type> <hint value>\n",
argv[0]);
return 1;
}
fd = open(argv[1], O_CREAT|O_RDWR|O_DIRECT, 0644);
if (fd < 0) {
perror("open");
return 1;
}
set_hint_ex.type = atoi(argv[2]);
set_hint_ex.val = atol(argv[3]);
ret = fcntl(fd, F_SET_RW_HINT_EX, &set_hint_ex);
if (ret < 0) {
perror("fcntl: Error, F_SET_RW_HINT_EX");
goto close_fd;
}
ret = fcntl(fd, F_GET_RW_HINT_EX, &get_hint_ex);
if (ret < 0) {
perror("fcntl: Error, F_GET_RW_HINT_EX");
goto close_fd;
}
printf("set_hint (%d,%llu)\nget_hint (%d,%llu)\n",
set_hint_ex.type, set_hint_ex.val,
get_hint_ex.type, get_hint_ex.val);
close_fd:
close(fd);
return 0;
}
/* set placement hint (type 2) with value 126 */
# ./a.out /dev/nvme0n1 2 126
set_hint (2,126)
get_hint (2,126)
/* invalid placement hint value */
# ./a.out /dev/nvme0n1 2 128
fcntl: Error, F_SET_RW_HINT_EX: Invalid argument
Changes since v4:
- Retain the size/type checking on the enum (Bart)
- Use the name "*_lifetime_hint" rather than "*_life_hint" (Bart)
Changes since v3:
- 4 new patches to introduce placement hints
- Make nvme patch use the placement hints rather than lifetime hints
Changes since v2:
- Base it on nvme-6.11 and resolve a merge conflict
Changes since v1:
- Reduce the fetched plids from 128 to 6 (Keith)
- Use struct_size for a calculation (Keith)
- Handle robot/sparse warning
Kanchan Joshi (4):
fs, block: refactor enum rw_hint
fcntl: rename rw_hint_* to rw_lifetime_hint_*
fcntl: add F_{SET/GET}_RW_HINT_EX
nvme: enable FDP support
Nitesh Shetty (1):
sd: limit to use write life hints
drivers/nvme/host/core.c | 81 ++++++++++++++++++++++++++++++++++++++
drivers/nvme/host/nvme.h | 4 ++
drivers/scsi/sd.c | 7 ++--
fs/buffer.c | 4 +-
fs/f2fs/f2fs.h | 5 ++-
fs/f2fs/segment.c | 5 ++-
fs/fcntl.c | 79 ++++++++++++++++++++++++++++++++++---
include/linux/blk-mq.h | 2 +-
include/linux/blk_types.h | 2 +-
include/linux/fs.h | 2 +-
include/linux/nvme.h | 19 +++++++++
include/linux/rw_hint.h | 17 +++++++-
include/uapi/linux/fcntl.h | 14 +++++++
13 files changed, 221 insertions(+), 20 deletions(-)
--
2.25.1
* [PATCH v5 1/5] fs, block: refactor enum rw_hint
@ 2024-09-10 15:01 ` Kanchan Joshi
2024-09-12 12:53 ` Christoph Hellwig
0 siblings, 1 reply; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-10 15:01 UTC (permalink / raw)
To: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche
Cc: linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Kanchan Joshi
Rename enum rw_hint to rw_lifetime_hint.
Change i_write_hint (in inode), bi_write_hint(in bio), and write_hint
(in request) to use u8 data-type rather than this enum.
This is in preparation to introduce a new write hint type.
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
fs/buffer.c | 4 ++--
fs/f2fs/f2fs.h | 5 +++--
fs/f2fs/segment.c | 5 +++--
include/linux/blk-mq.h | 2 +-
include/linux/blk_types.h | 2 +-
include/linux/fs.h | 2 +-
include/linux/rw_hint.h | 4 ++--
7 files changed, 13 insertions(+), 11 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index e55ad471c530..0c6bc9b7d4ad 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -55,7 +55,7 @@
static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);
static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
- enum rw_hint hint, struct writeback_control *wbc);
+ u8 hint, struct writeback_control *wbc);
#define BH_ENTRY(list) list_entry((list), struct buffer_head, b_assoc_buffers)
@@ -2778,7 +2778,7 @@ static void end_bio_bh_io_sync(struct bio *bio)
}
static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
- enum rw_hint write_hint,
+ u8 write_hint,
struct writeback_control *wbc)
{
const enum req_op op = opf & REQ_OP_MASK;
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ac19c61f0c3e..9b843b57dba1 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3756,8 +3756,9 @@ int f2fs_build_segment_manager(struct f2fs_sb_info *sbi);
void f2fs_destroy_segment_manager(struct f2fs_sb_info *sbi);
int __init f2fs_create_segment_manager_caches(void);
void f2fs_destroy_segment_manager_caches(void);
-int f2fs_rw_hint_to_seg_type(struct f2fs_sb_info *sbi, enum rw_hint hint);
-enum rw_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
+int f2fs_rw_hint_to_seg_type(struct f2fs_sb_info *sbi,
+ enum rw_lifetime_hint hint);
+enum rw_lifetime_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
enum page_type type, enum temp_type temp);
unsigned int f2fs_usable_segs_in_sec(struct f2fs_sb_info *sbi,
unsigned int segno);
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 78c3198a6308..6802e82f9ffd 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3381,7 +3381,8 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
return err;
}
-int f2fs_rw_hint_to_seg_type(struct f2fs_sb_info *sbi, enum rw_hint hint)
+int f2fs_rw_hint_to_seg_type(struct f2fs_sb_info *sbi,
+ enum rw_lifetime_hint hint)
{
if (F2FS_OPTION(sbi).active_logs == 2)
return CURSEG_HOT_DATA;
@@ -3425,7 +3426,7 @@ int f2fs_rw_hint_to_seg_type(struct f2fs_sb_info *sbi, enum rw_hint hint)
* WRITE_LIFE_MEDIUM " WRITE_LIFE_MEDIUM
* WRITE_LIFE_LONG " WRITE_LIFE_LONG
*/
-enum rw_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
+enum rw_lifetime_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
enum page_type type, enum temp_type temp)
{
switch (type) {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 8d304b1d16b1..1e5ce84ccf0a 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -159,7 +159,7 @@ struct request {
struct blk_crypto_keyslot *crypt_keyslot;
#endif
- enum rw_hint write_hint;
+ u8 write_hint;
unsigned short ioprio;
enum mq_rq_state state;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 36ed96133217..446c847bb3b3 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -216,7 +216,7 @@ struct bio {
*/
unsigned short bi_flags; /* BIO_* below */
unsigned short bi_ioprio;
- enum rw_hint bi_write_hint;
+ u8 bi_write_hint;
blk_status_t bi_status;
atomic_t __bi_remaining;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index fb0426f349fc..f9a7a2a80661 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -674,7 +674,7 @@ struct inode {
spinlock_t i_lock; /* i_blocks, i_bytes, maybe i_size */
unsigned short i_bytes;
u8 i_blkbits;
- enum rw_hint i_write_hint;
+ u8 i_write_hint;
blkcnt_t i_blocks;
#ifdef __NEED_I_SIZE_ORDERED
diff --git a/include/linux/rw_hint.h b/include/linux/rw_hint.h
index 309ca72f2dfb..b9942f5f13d3 100644
--- a/include/linux/rw_hint.h
+++ b/include/linux/rw_hint.h
@@ -7,7 +7,7 @@
#include <uapi/linux/fcntl.h>
/* Block storage write lifetime hint values. */
-enum rw_hint {
+enum rw_lifetime_hint {
WRITE_LIFE_NOT_SET = RWH_WRITE_LIFE_NOT_SET,
WRITE_LIFE_NONE = RWH_WRITE_LIFE_NONE,
WRITE_LIFE_SHORT = RWH_WRITE_LIFE_SHORT,
@@ -18,7 +18,7 @@ enum rw_hint {
/* Sparse ignores __packed annotations on enums, hence the #ifndef below. */
#ifndef __CHECKER__
-static_assert(sizeof(enum rw_hint) == 1);
+static_assert(sizeof(enum rw_lifetime_hint) == 1);
#endif
#endif /* _LINUX_RW_HINT_H */
--
2.25.1
* [PATCH v5 2/5] fcntl: rename rw_hint_* to rw_lifetime_hint_*
@ 2024-09-10 15:01 ` Kanchan Joshi
2024-09-12 12:54 ` Christoph Hellwig
0 siblings, 1 reply; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-10 15:01 UTC (permalink / raw)
To: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche
Cc: linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Kanchan Joshi
F_GET/SET_RW_HINT fcntl handlers query/set write life hints.
Rename the handlers/helpers to be explicit that write life hints are
being handled.
This is in preparation to introduce a new interface that supports more
than one type of write hint.
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
fs/fcntl.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 300e5d9ad913..9df35e7ff754 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -269,7 +269,7 @@ static int f_getowner_uids(struct file *filp, unsigned long arg)
}
#endif
-static bool rw_hint_valid(u64 hint)
+static bool rw_lifetime_hint_valid(u64 hint)
{
BUILD_BUG_ON(WRITE_LIFE_NOT_SET != RWH_WRITE_LIFE_NOT_SET);
BUILD_BUG_ON(WRITE_LIFE_NONE != RWH_WRITE_LIFE_NONE);
@@ -291,7 +291,7 @@ static bool rw_hint_valid(u64 hint)
}
}
-static long fcntl_get_rw_hint(struct file *file, unsigned int cmd,
+static long fcntl_get_rw_lifetime_hint(struct file *file, unsigned int cmd,
unsigned long arg)
{
struct inode *inode = file_inode(file);
@@ -303,7 +303,7 @@ static long fcntl_get_rw_hint(struct file *file, unsigned int cmd,
return 0;
}
-static long fcntl_set_rw_hint(struct file *file, unsigned int cmd,
+static long fcntl_set_rw_lifetime_hint(struct file *file, unsigned int cmd,
unsigned long arg)
{
struct inode *inode = file_inode(file);
@@ -312,7 +312,7 @@ static long fcntl_set_rw_hint(struct file *file, unsigned int cmd,
if (copy_from_user(&hint, argp, sizeof(hint)))
return -EFAULT;
- if (!rw_hint_valid(hint))
+ if (!rw_lifetime_hint_valid(hint))
return -EINVAL;
WRITE_ONCE(inode->i_write_hint, hint);
@@ -449,10 +449,10 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
err = memfd_fcntl(filp, cmd, argi);
break;
case F_GET_RW_HINT:
- err = fcntl_get_rw_hint(filp, cmd, arg);
+ err = fcntl_get_rw_lifetime_hint(filp, cmd, arg);
break;
case F_SET_RW_HINT:
- err = fcntl_set_rw_hint(filp, cmd, arg);
+ err = fcntl_set_rw_lifetime_hint(filp, cmd, arg);
break;
default:
break;
--
2.25.1
* [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX
@ 2024-09-10 15:01 ` Kanchan Joshi
2024-09-10 18:48 ` Jens Axboe
` (2 more replies)
0 siblings, 3 replies; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-10 15:01 UTC (permalink / raw)
To: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche
Cc: linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Kanchan Joshi,
Nitesh Shetty
This is similar to existing F_{SET/GET}_RW_HINT but more
generic/extensible.
F_SET/GET_RW_HINT_EX take a pointer to a struct rw_hint_ex as argument:
struct rw_hint_ex {
__u8 type;
__u8 pad[7];
__u64 val;
};
With F_SET_RW_HINT_EX, the user passes the hint type and its value.
The hint type can be either a lifetime hint (TYPE_RW_LIFETIME_HINT) or a
placement hint (TYPE_RW_PLACEMENT_HINT). The interface allows more hint
types to be added in the future.
Valid values for lifetime hints are the same as the values supported by
the existing fcntl(F_SET_RW_HINT).
Valid values for placement hints range from 0 to 126, both inclusive.
The inode retains either the lifetime hint or the placement hint, whichever
is set later. The set hint type and its value can be queried by
F_GET_RW_HINT_EX.
The i_write_hint field of the inode is a 1-byte field. Use the most
significant bit as the hint type. This bit is set for placement hint.
For lifetime hint, this bit remains zero.
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
---
fs/fcntl.c | 67 ++++++++++++++++++++++++++++++++++++++
include/linux/rw_hint.h | 13 ++++++++
include/uapi/linux/fcntl.h | 14 ++++++++
3 files changed, 94 insertions(+)
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 9df35e7ff754..b35aec56981a 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -291,6 +291,14 @@ static bool rw_lifetime_hint_valid(u64 hint)
}
}
+static inline bool rw_placement_hint_valid(u64 val)
+{
+ if (val <= MAX_PLACEMENT_HINT_VAL)
+ return true;
+
+ return false;
+}
+
static long fcntl_get_rw_lifetime_hint(struct file *file, unsigned int cmd,
unsigned long arg)
{
@@ -327,6 +335,59 @@ static long fcntl_set_rw_lifetime_hint(struct file *file, unsigned int cmd,
return 0;
}
+static long fcntl_get_rw_hint_ex(struct file *file, unsigned int cmd,
+ unsigned long arg)
+{
+ struct rw_hint_ex __user *rw_hint_ex_p = (void __user *)arg;
+ struct rw_hint_ex rwh = {};
+ struct inode *inode = file_inode(file);
+ u8 hint = READ_ONCE(inode->i_write_hint);
+
+ rwh.type = WRITE_HINT_TYPE(hint);
+ rwh.val = WRITE_HINT_VAL(hint);
+
+ if (copy_to_user(rw_hint_ex_p, &rwh, sizeof(rwh)))
+ return -EFAULT;
+
+ return 0;
+}
+
+static long fcntl_set_rw_hint_ex(struct file *file, unsigned int cmd,
+ unsigned long arg)
+{
+ struct rw_hint_ex __user *rw_hint_ex_p = (void __user *)arg;
+ struct rw_hint_ex rwh;
+ struct inode *inode = file_inode(file);
+ u64 hint;
+ int i;
+
+ if (copy_from_user(&rwh, rw_hint_ex_p, sizeof(rwh)))
+ return -EFAULT;
+ for (i = 0; i < ARRAY_SIZE(rwh.pad); i++)
+ if (rwh.pad[i])
+ return -EINVAL;
+ switch (rwh.type) {
+ case TYPE_RW_LIFETIME_HINT:
+ if (!rw_lifetime_hint_valid(rwh.val))
+ return -EINVAL;
+ hint = rwh.val;
+ break;
+ case TYPE_RW_PLACEMENT_HINT:
+ if (!rw_placement_hint_valid(rwh.val))
+ return -EINVAL;
+ hint = PLACEMENT_HINT_TYPE | rwh.val;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ WRITE_ONCE(inode->i_write_hint, hint);
+ if (file->f_mapping->host != inode)
+ WRITE_ONCE(file->f_mapping->host->i_write_hint, hint);
+
+ return 0;
+}
+
/* Is the file descriptor a dup of the file? */
static long f_dupfd_query(int fd, struct file *filp)
{
@@ -454,6 +515,12 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
case F_SET_RW_HINT:
err = fcntl_set_rw_lifetime_hint(filp, cmd, arg);
break;
+ case F_GET_RW_HINT_EX:
+ err = fcntl_get_rw_hint_ex(filp, cmd, arg);
+ break;
+ case F_SET_RW_HINT_EX:
+ err = fcntl_set_rw_hint_ex(filp, cmd, arg);
+ break;
default:
break;
}
diff --git a/include/linux/rw_hint.h b/include/linux/rw_hint.h
index b9942f5f13d3..ff708a75e2f6 100644
--- a/include/linux/rw_hint.h
+++ b/include/linux/rw_hint.h
@@ -21,4 +21,17 @@ enum rw_lifetime_hint {
static_assert(sizeof(enum rw_lifetime_hint) == 1);
#endif
+#define WRITE_HINT_TYPE_BIT BIT(7)
+#define WRITE_HINT_VAL_MASK (WRITE_HINT_TYPE_BIT - 1)
+#define WRITE_HINT_TYPE(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
+ TYPE_RW_PLACEMENT_HINT : TYPE_RW_LIFETIME_HINT)
+#define WRITE_HINT_VAL(h) ((h) & WRITE_HINT_VAL_MASK)
+
+#define WRITE_PLACEMENT_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
+ WRITE_HINT_VAL(h) : 0)
+#define WRITE_LIFETIME_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
+ 0 : WRITE_HINT_VAL(h))
+
+#define PLACEMENT_HINT_TYPE WRITE_HINT_TYPE_BIT
+#define MAX_PLACEMENT_HINT_VAL (WRITE_HINT_VAL_MASK - 1)
#endif /* _LINUX_RW_HINT_H */
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index c0bcc185fa48..f758a7230419 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -57,6 +57,8 @@
#define F_SET_RW_HINT (F_LINUX_SPECIFIC_BASE + 12)
#define F_GET_FILE_RW_HINT (F_LINUX_SPECIFIC_BASE + 13)
#define F_SET_FILE_RW_HINT (F_LINUX_SPECIFIC_BASE + 14)
+#define F_GET_RW_HINT_EX (F_LINUX_SPECIFIC_BASE + 15)
+#define F_SET_RW_HINT_EX (F_LINUX_SPECIFIC_BASE + 16)
/*
* Valid hint values for F_{GET,SET}_RW_HINT. 0 is "not set", or can be
@@ -76,6 +78,18 @@
*/
#define RWF_WRITE_LIFE_NOT_SET RWH_WRITE_LIFE_NOT_SET
+enum rw_hint_type {
+ TYPE_RW_LIFETIME_HINT = 1,
+ TYPE_RW_PLACEMENT_HINT
+};
+
+/* Exchange information with F_{GET/SET}_RW_HINT_EX fcntl */
+struct rw_hint_ex {
+ __u8 type;
+ __u8 pad[7];
+ __u64 val;
+};
+
/*
* Types of directory notifications that may be requested.
*/
--
2.25.1
* [PATCH v5 4/5] sd: limit to use write life hints
@ 2024-09-10 15:01 ` Kanchan Joshi
2024-09-12 13:02 ` Christoph Hellwig
0 siblings, 1 reply; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-10 15:01 UTC (permalink / raw)
To: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche
Cc: linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Nitesh Shetty,
Kanchan Joshi
From: Nitesh Shetty <nj.shetty@samsung.com>
The incoming hint value may be either a lifetime hint or a placement hint.
Make SCSI interpret only temperature-based write lifetime hints.
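
To illustrate the effect, here is a rough userspace model of the group
number computation after this change. It is only a sketch: the macros are
copied from patch #3 with BIT(7) spelled out as 0x80u, while min3_u32() and
model_sd_group_number() are hypothetical names used for illustration, and
the sdkp->rscs check is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from patch #3 (include/linux/rw_hint.h), BIT(7) written as 0x80u. */
#define WRITE_HINT_TYPE_BIT	0x80u
#define WRITE_HINT_VAL_MASK	(WRITE_HINT_TYPE_BIT - 1)
#define WRITE_HINT_VAL(h)	((h) & WRITE_HINT_VAL_MASK)
#define WRITE_LIFETIME_HINT(h)	(((h) & WRITE_HINT_TYPE_BIT) ? \
				 0 : WRITE_HINT_VAL(h))

static_assert(WRITE_HINT_VAL_MASK == 0x7f, "low seven bits carry the value");

/* Hypothetical stand-in for the kernel's min3() on u32 operands. */
static uint32_t min3_u32(uint32_t a, uint32_t b, uint32_t c)
{
	uint32_t m = a < b ? a : b;

	return m < c ? m : c;
}

/*
 * Hypothetical model of sd_group_number(): a placement hint (type bit set)
 * is treated as "no lifetime hint" and therefore yields group number 0.
 */
static uint8_t model_sd_group_number(uint8_t write_hint,
				     uint32_t permanent_stream_count)
{
	return (uint8_t)min3_u32(WRITE_LIFETIME_HINT(write_hint),
				 permanent_stream_count, 0x3fu);
}
```

With five permanent streams, a lifetime hint of 3 maps to group 3, while
any placement hint such as 0x80 | 10 maps to group 0.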
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
drivers/scsi/sd.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index dad3991397cf..82bd4b07314e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1191,8 +1191,8 @@ static u8 sd_group_number(struct scsi_cmnd *cmd)
if (!sdkp->rscs)
return 0;
- return min3((u32)rq->write_hint, (u32)sdkp->permanent_stream_count,
- 0x3fu);
+ return min3((u32)WRITE_LIFETIME_HINT(rq->write_hint),
+ (u32)sdkp->permanent_stream_count, 0x3fu);
}
static blk_status_t sd_setup_rw32_cmnd(struct scsi_cmnd *cmd, bool write,
@@ -1390,7 +1390,8 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
ret = sd_setup_rw16_cmnd(cmd, write, lba, nr_blocks,
protect | fua, dld);
} else if ((nr_blocks > 0xff) || (lba > 0x1fffff) ||
- sdp->use_10_for_rw || protect || rq->write_hint) {
+ sdp->use_10_for_rw || protect ||
+ WRITE_LIFETIME_HINT(rq->write_hint)) {
ret = sd_setup_rw10_cmnd(cmd, write, lba, nr_blocks,
protect | fua);
} else {
--
2.25.1
* [PATCH v5 5/5] nvme: enable FDP support
@ 2024-09-10 15:02 ` Kanchan Joshi
0 siblings, 0 replies; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-10 15:02 UTC (permalink / raw)
To: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche
Cc: linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Kanchan Joshi,
Nitesh Shetty, Hui Qi
Flexible Data Placement (FDP), as ratified in TP 4146a, allows the host
to control the placement of logical blocks so as to reduce the SSD WAF.
Userspace can send the data placement information using the write hints.
Fetch the placement-identifiers if the device supports FDP.
The incoming placement hint is mapped to a placement-identifier, which
in turn is set in the DSPEC field of the write command.
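
The mapping can be sketched in userspace roughly as below. This is a
simplified model, not the driver code: struct model_rw_cmd and
model_assign_placement_id() are hypothetical, the fields are kept in host
byte order (the driver uses cpu_to_le16/cpu_to_le32), and the macros are
copied from patch #3 with BIT(7) written as 0x80u.

```c
#include <assert.h>
#include <stdint.h>

#define WRITE_HINT_TYPE_BIT	0x80u
#define WRITE_HINT_VAL_MASK	(WRITE_HINT_TYPE_BIT - 1)
#define WRITE_PLACEMENT_HINT(h)	(((h) & WRITE_HINT_TYPE_BIT) ? \
				 ((h) & WRITE_HINT_VAL_MASK) : 0)
#define NVME_RW_DTYPE_DPLCMT	(2 << 4)

/* Hypothetical host-endian stand-in for the two write command fields. */
struct model_rw_cmd {
	uint16_t control;
	uint32_t dsmgmt;
};

/*
 * Clamp the placement hint to the last fetched placement identifier and
 * store that identifier in the upper 16 bits of dsmgmt (the DSPEC field).
 * A lifetime hint carries no placement value, so it selects index 0.
 */
static void model_assign_placement_id(const uint16_t *plids,
				      uint16_t nr_plids, uint8_t write_hint,
				      struct model_rw_cmd *cmd)
{
	uint32_t h = WRITE_PLACEMENT_HINT(write_hint);

	assert(nr_plids > 0);	/* the driver only calls this when plids exist */
	if (h > (uint32_t)nr_plids - 1)
		h = (uint32_t)nr_plids - 1;

	cmd->control |= NVME_RW_DTYPE_DPLCMT;
	cmd->dsmgmt |= (uint32_t)plids[h] << 16;
}
```

Note that a placement hint larger than the number of fetched identifiers
is silently clamped to the last one rather than rejected.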
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Hui Qi <hui81.qi@samsung.com>
Acked-by: Keith Busch <kbusch@kernel.org>
---
drivers/nvme/host/core.c | 81 ++++++++++++++++++++++++++++++++++++++++
drivers/nvme/host/nvme.h | 4 ++
include/linux/nvme.h | 19 ++++++++++
3 files changed, 104 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index cb846521a77f..5fee63dbb80b 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -44,6 +44,20 @@ struct nvme_ns_info {
bool is_removed;
};
+struct nvme_fdp_ruh_status_desc {
+ u16 pid;
+ u16 ruhid;
+ u32 earutr;
+ u64 ruamw;
+ u8 rsvd16[16];
+};
+
+struct nvme_fdp_ruh_status {
+ u8 rsvd0[14];
+ __le16 nruhsd;
+ struct nvme_fdp_ruh_status_desc ruhsd[];
+};
+
unsigned int admin_timeout = 60;
module_param(admin_timeout, uint, 0644);
MODULE_PARM_DESC(admin_timeout, "timeout in seconds for admin commands");
@@ -657,6 +671,7 @@ static void nvme_free_ns_head(struct kref *ref)
ida_free(&head->subsys->ns_ida, head->instance);
cleanup_srcu_struct(&head->srcu);
nvme_put_subsystem(head->subsys);
+ kfree(head->plids);
kfree(head);
}
@@ -959,6 +974,17 @@ static bool nvme_valid_atomic_write(struct request *req)
return true;
}
+static inline void nvme_assign_placement_id(struct nvme_ns *ns,
+ struct request *req,
+ struct nvme_command *cmd)
+{
+ u8 h = umin(ns->head->nr_plids - 1,
+ WRITE_PLACEMENT_HINT(req->write_hint));
+
+ cmd->rw.control |= cpu_to_le16(NVME_RW_DTYPE_DPLCMT);
+ cmd->rw.dsmgmt |= cpu_to_le32(ns->head->plids[h] << 16);
+}
+
static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
struct request *req, struct nvme_command *cmnd,
enum nvme_opcode op)
@@ -1078,6 +1104,8 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
break;
case REQ_OP_WRITE:
ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_write);
+ if (!ret && ns->head->nr_plids)
+ nvme_assign_placement_id(ns, req, cmd);
break;
case REQ_OP_ZONE_APPEND:
ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_zone_append);
@@ -2114,6 +2142,52 @@ static int nvme_update_ns_info_generic(struct nvme_ns *ns,
return ret;
}
+static int nvme_fetch_fdp_plids(struct nvme_ns *ns, u32 nsid)
+{
+ struct nvme_command c = {};
+ struct nvme_fdp_ruh_status *ruhs;
+ struct nvme_fdp_ruh_status_desc *ruhsd;
+ int size, ret, i;
+
+refetch_plids:
+ size = struct_size(ruhs, ruhsd, ns->head->nr_plids);
+ ruhs = kzalloc(size, GFP_KERNEL);
+ if (!ruhs)
+ return -ENOMEM;
+
+ c.imr.opcode = nvme_cmd_io_mgmt_recv;
+ c.imr.nsid = cpu_to_le32(nsid);
+ c.imr.mo = 0x1;
+ c.imr.numd = cpu_to_le32((size >> 2) - 1);
+
+ ret = nvme_submit_sync_cmd(ns->queue, &c, ruhs, size);
+ if (ret)
+ goto out;
+
+ if (!ns->head->nr_plids) {
+ ns->head->nr_plids = le16_to_cpu(ruhs->nruhsd);
+ ns->head->nr_plids =
+ min_t(u16, ns->head->nr_plids, NVME_MAX_PLIDS);
+
+ if (!ns->head->nr_plids)
+ goto out;
+
+ kfree(ruhs);
+ goto refetch_plids;
+ }
+ ns->head->plids = kzalloc(ns->head->nr_plids * sizeof(u16), GFP_KERNEL);
+ if (!ns->head->plids)
+ return -ENOMEM;
+
+ for (i = 0; i < ns->head->nr_plids; i++) {
+ ruhsd = &ruhs->ruhsd[i];
+ ns->head->plids[i] = le16_to_cpu(ruhsd->pid);
+ }
+out:
+ kfree(ruhs);
+ return ret;
+}
+
static int nvme_update_ns_info_block(struct nvme_ns *ns,
struct nvme_ns_info *info)
{
@@ -2205,6 +2279,13 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
if (ret && !nvme_first_scan(ns->disk))
goto out;
}
+ if (ns->ctrl->ctratt & NVME_CTRL_ATTR_FDPS) {
+ ret = nvme_fetch_fdp_plids(ns, info->nsid);
+ if (ret)
+ dev_warn(ns->ctrl->device,
+ "FDP failure status:0x%x\n", ret);
+ }
+
ret = 0;
out:
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index c717c051c6fd..e7fe39598507 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -449,6 +449,8 @@ struct nvme_ns_ids {
u8 csi;
};
+#define NVME_MAX_PLIDS (MAX_PLACEMENT_HINT_VAL + 1)
+
/*
* Anchor structure for namespaces. There is one for each namespace in a
* NVMe subsystem that any of our controllers can see, and the namespace
@@ -470,6 +472,8 @@ struct nvme_ns_head {
struct kref ref;
bool shared;
bool passthru_err_log_enabled;
+ u16 nr_plids;
+ u16 *plids;
struct nvme_effects_log *effects;
u64 nuse;
unsigned ns_id;
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index b58d9405d65e..a954eaee5b0f 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -275,6 +275,7 @@ enum nvme_ctrl_attr {
NVME_CTRL_ATTR_HID_128_BIT = (1 << 0),
NVME_CTRL_ATTR_TBKAS = (1 << 6),
NVME_CTRL_ATTR_ELBAS = (1 << 15),
+ NVME_CTRL_ATTR_FDPS = (1 << 19),
};
struct nvme_id_ctrl {
@@ -843,6 +844,7 @@ enum nvme_opcode {
nvme_cmd_resv_register = 0x0d,
nvme_cmd_resv_report = 0x0e,
nvme_cmd_resv_acquire = 0x11,
+ nvme_cmd_io_mgmt_recv = 0x12,
nvme_cmd_resv_release = 0x15,
nvme_cmd_zone_mgmt_send = 0x79,
nvme_cmd_zone_mgmt_recv = 0x7a,
@@ -864,6 +866,7 @@ enum nvme_opcode {
nvme_opcode_name(nvme_cmd_resv_register), \
nvme_opcode_name(nvme_cmd_resv_report), \
nvme_opcode_name(nvme_cmd_resv_acquire), \
+ nvme_opcode_name(nvme_cmd_io_mgmt_recv), \
nvme_opcode_name(nvme_cmd_resv_release), \
nvme_opcode_name(nvme_cmd_zone_mgmt_send), \
nvme_opcode_name(nvme_cmd_zone_mgmt_recv), \
@@ -1015,6 +1018,7 @@ enum {
NVME_RW_PRINFO_PRCHK_GUARD = 1 << 12,
NVME_RW_PRINFO_PRACT = 1 << 13,
NVME_RW_DTYPE_STREAMS = 1 << 4,
+ NVME_RW_DTYPE_DPLCMT = 2 << 4,
NVME_WZ_DEAC = 1 << 9,
};
@@ -1102,6 +1106,20 @@ struct nvme_zone_mgmt_recv_cmd {
__le32 cdw14[2];
};
+struct nvme_io_mgmt_recv_cmd {
+ __u8 opcode;
+ __u8 flags;
+ __u16 command_id;
+ __le32 nsid;
+ __le64 rsvd2[2];
+ union nvme_data_ptr dptr;
+ __u8 mo;
+ __u8 rsvd11;
+ __u16 mos;
+ __le32 numd;
+ __le32 cdw12[4];
+};
+
enum {
NVME_ZRA_ZONE_REPORT = 0,
NVME_ZRASF_ZONE_REPORT_ALL = 0,
@@ -1822,6 +1840,7 @@ struct nvme_command {
struct nvmf_auth_receive_command auth_receive;
struct nvme_dbbuf dbbuf;
struct nvme_directive_cmd directive;
+ struct nvme_io_mgmt_recv_cmd imr;
};
};
--
2.25.1
* Re: [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX
2024-09-10 15:01 ` [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX Kanchan Joshi
@ 2024-09-10 18:48 ` Jens Axboe
2024-09-11 15:50 ` Kanchan Joshi
2024-09-12 13:01 ` Christoph Hellwig
2024-09-12 20:36 ` Bart Van Assche
2 siblings, 1 reply; 31+ messages in thread
From: Jens Axboe @ 2024-09-10 18:48 UTC (permalink / raw)
To: Kanchan Joshi, kbusch, hch, sagi, martin.petersen,
James.Bottomley, brauner, viro, jack, jaegeuk, jlayton,
chuck.lever, bvanassche
Cc: linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/10/24 9:01 AM, Kanchan Joshi wrote:
> +static inline bool rw_placement_hint_valid(u64 val)
> +{
> + if (val <= MAX_PLACEMENT_HINT_VAL)
> + return true;
> +
> + return false;
> +}
Nit, why not just:
static inline bool rw_placement_hint_valid(u64 val)
{
return val <= MAX_PLACEMENT_HINT_VAL;
}
> +static long fcntl_set_rw_hint_ex(struct file *file, unsigned int cmd,
> + unsigned long arg)
> +{
> + struct rw_hint_ex __user *rw_hint_ex_p = (void __user *)arg;
> + struct rw_hint_ex rwh;
> + struct inode *inode = file_inode(file);
> + u64 hint;
> + int i;
> +
> + if (copy_from_user(&rwh, rw_hint_ex_p, sizeof(rwh)))
> + return -EFAULT;
> + for (i = 0; i < ARRAY_SIZE(rwh.pad); i++)
> + if (rwh.pad[i])
> + return -EINVAL;
if (memchr_inv(rwh.pad, 0, sizeof(rwh.pad)))
return -EINVAL;
--
Jens Axboe
* Re: [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX
2024-09-10 18:48 ` Jens Axboe
@ 2024-09-11 15:50 ` Kanchan Joshi
0 siblings, 0 replies; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-11 15:50 UTC (permalink / raw)
To: Jens Axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche
Cc: linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/11/2024 12:18 AM, Jens Axboe wrote:
> On 9/10/24 9:01 AM, Kanchan Joshi wrote:
>> +static inline bool rw_placement_hint_valid(u64 val)
>> +{
>> + if (val <= MAX_PLACEMENT_HINT_VAL)
>> + return true;
>> +
>> + return false;
>> +}
> Nit, why not just:
>
> static inline bool rw_placement_hint_valid(u64 val)
> {
> return val <= MAX_PLACEMENT_HINT_VAL;
> }
>
Right, concise.
I can fold in both changes in the next respin.
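
For reference, a userspace model of how the two suggestions might look once
folded in. This is only a sketch: memchr_inv() is a kernel helper, so
pad_is_zero() below is a hypothetical stand-in for
!memchr_inv(p, 0, len), and MAX_PLACEMENT_HINT_VAL is written out as 126,
which is what (BIT(7) - 1) - 1 from the patch evaluates to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PLACEMENT_HINT_VAL	126

/* Userspace mirror of struct rw_hint_ex from the patch. */
struct rw_hint_ex {
	uint8_t type;
	uint8_t pad[7];
	uint64_t val;
};

static_assert(sizeof(struct rw_hint_ex) == 16, "type + pad + val is 16 bytes");

/* Single-expression form of the range check. */
static inline bool rw_placement_hint_valid(uint64_t val)
{
	return val <= MAX_PLACEMENT_HINT_VAL;
}

/* Hypothetical stand-in for !memchr_inv(p, 0, len): true if all bytes are 0. */
static bool pad_is_zero(const uint8_t *p, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (p[i])
			return false;
	return true;
}
```

A caller would reject the request with -EINVAL when
pad_is_zero(rwh.pad, sizeof(rwh.pad)) is false, before looking at the type.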
* Re: [PATCH v5 1/5] fs, block: refactor enum rw_hint
2024-09-10 15:01 ` [PATCH v5 1/5] fs, block: refactor enum rw_hint Kanchan Joshi
@ 2024-09-12 12:53 ` Christoph Hellwig
2024-09-12 15:50 ` Kanchan Joshi
0 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2024-09-12 12:53 UTC (permalink / raw)
To: Kanchan Joshi
Cc: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche,
linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz
On Tue, Sep 10, 2024 at 08:31:56PM +0530, Kanchan Joshi wrote:
> Rename enum rw_hint to rw_lifetime_hint.
> Change i_write_hint (in inode), bi_write_hint(in bio), and write_hint
> (in request) to use u8 data-type rather than this enum.
>
> This is in preparation to introduce a new write hint type.
The rationale seems a bit sparse. Why is it renamed? Because the
name fits better, because you need the same for something else?
> static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
> - enum rw_hint hint, struct writeback_control *wbc);
> + u8 hint, struct writeback_control *wbc);
And moving from the enum to a plain integer seems like a bit of a
retrograde step.
* Re: [PATCH v5 2/5] fcntl: rename rw_hint_* to rw_lifetime_hint_*
2024-09-10 15:01 ` [PATCH v5 2/5] fcntl: rename rw_hint_* to rw_lifetime_hint_* Kanchan Joshi
@ 2024-09-12 12:54 ` Christoph Hellwig
2024-09-12 15:51 ` Kanchan Joshi
0 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2024-09-12 12:54 UTC (permalink / raw)
To: Kanchan Joshi
Cc: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche,
linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz
On Tue, Sep 10, 2024 at 08:31:57PM +0530, Kanchan Joshi wrote:
> F_GET/SET_RW_HINT fcntl handlers query/set write life hints.
> Rename the handlers/helpers to be explicit that write life hints are
> being handled.
>
> This is in preparation to introduce a new interface that supports more
> than one type of write hint.
Wouldn't it make more sense to stick with the name as exposed in the
uapi? The same kinda applies to the previous patch - in fact IFF we
decide to do the rename I'd probably expect both parts to go together.
* Re: [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX
2024-09-10 15:01 ` [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX Kanchan Joshi
2024-09-10 18:48 ` Jens Axboe
@ 2024-09-12 13:01 ` Christoph Hellwig
2024-09-12 15:53 ` Kanchan Joshi
2024-09-12 20:36 ` Bart Van Assche
2 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2024-09-12 13:01 UTC (permalink / raw)
To: Kanchan Joshi
Cc: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche,
linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On Tue, Sep 10, 2024 at 08:31:58PM +0530, Kanchan Joshi wrote:
> This is similar to existing F_{SET/GET}_RW_HINT but more
> generic/extensible.
>
> F_SET/GET_RW_HINT_EX take a pointer to a struct rw_hint_ex as argument:
>
> struct rw_hint_ex {
> __u8 type;
> __u8 pad[7];
> __u64 val;
> };
>
> With F_SET_RW_HINT_EX, the user passes the hint type and its value.
> Hint type can be either lifetime hint (TYPE_RW_LIFETIME_HINT) or
> placement hint (TYPE_RW_PLACEMENT_HINT). The interface allows adding
> more hint types in future.
What is the point of multiplexing these into a single call vs having
one fcntl for each? It's not like the code points are a super
limited resource.
And the _EX name isn't exactly descriptive either and screams of horrible
Windows APIs :)
> + WRITE_ONCE(inode->i_write_hint, hint);
> + if (file->f_mapping->host != inode)
> + WRITE_ONCE(file->f_mapping->host->i_write_hint, hint);
This doesn't work. You need a file system method for this so that
the file system can intercept it, instead of storing it in completely
arbitrary inodes without any kind of checking for support or interception
point.
> --- a/include/linux/rw_hint.h
> +++ b/include/linux/rw_hint.h
> @@ -21,4 +21,17 @@ enum rw_lifetime_hint {
> static_assert(sizeof(enum rw_lifetime_hint) == 1);
> #endif
>
> +#define WRITE_HINT_TYPE_BIT BIT(7)
> +#define WRITE_HINT_VAL_MASK (WRITE_HINT_TYPE_BIT - 1)
> +#define WRITE_HINT_TYPE(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
> + TYPE_RW_PLACEMENT_HINT : TYPE_RW_LIFETIME_HINT)
> +#define WRITE_HINT_VAL(h) ((h) & WRITE_HINT_VAL_MASK)
> +
> +#define WRITE_PLACEMENT_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
> + WRITE_HINT_VAL(h) : 0)
> +#define WRITE_LIFETIME_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
> + 0 : WRITE_HINT_VAL(h))
> +
> +#define PLACEMENT_HINT_TYPE WRITE_HINT_TYPE_BIT
> +#define MAX_PLACEMENT_HINT_VAL (WRITE_HINT_VAL_MASK - 1)
That's a whole lot of undocumented macros. Please turn these into proper
inline functions and write documentation for them.
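For illustration only (this is not part of the posted series), the macros
quoted above could be expressed as documented inline helpers roughly as
follows. The lowercase helper names are hypothetical, and the sketch uses
userspace types so that it can be compiled standalone:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 7 of the packed hint selects the hint type (set: placement hint). */
#define WRITE_HINT_TYPE_BIT	0x80u
/* The low 7 bits of the packed hint carry the hint value. */
#define WRITE_HINT_VAL_MASK	0x7fu

static_assert(WRITE_HINT_VAL_MASK == WRITE_HINT_TYPE_BIT - 1,
	      "value mask must cover exactly the bits below the type bit");

/* Hypothetical helper: true if the packed hint carries a placement value. */
static inline bool write_hint_is_placement(uint8_t hint)
{
	return (hint & WRITE_HINT_TYPE_BIT) != 0;
}

/* Hypothetical helper: the 7-bit value, regardless of the hint type. */
static inline uint8_t write_hint_val(uint8_t hint)
{
	return hint & WRITE_HINT_VAL_MASK;
}

/* Placement value, or 0 if the packed hint is a lifetime hint. */
static inline uint8_t write_placement_hint(uint8_t hint)
{
	return write_hint_is_placement(hint) ? write_hint_val(hint) : 0;
}

/* Lifetime value, or 0 if the packed hint is a placement hint. */
static inline uint8_t write_lifetime_hint(uint8_t hint)
{
	return write_hint_is_placement(hint) ? 0 : write_hint_val(hint);
}
```

Unlike the macros, these helpers evaluate their argument once and give
each accessor a place to hang a comment.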
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-10 15:01 ` [PATCH v5 4/5] sd: limit to use write life hints Kanchan Joshi
@ 2024-09-12 13:02 ` Christoph Hellwig
2024-09-12 16:31 ` Kanchan Joshi
0 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2024-09-12 13:02 UTC (permalink / raw)
To: Kanchan Joshi
Cc: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche,
linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On Tue, Sep 10, 2024 at 08:31:59PM +0530, Kanchan Joshi wrote:
> From: Nitesh Shetty <nj.shetty@samsung.com>
>
> The incoming hint value maybe either lifetime hint or placement hint.
.. may either be .. ?
> Make SCSI interpret only temperature-based write lifetime hints.
>
> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> ---
> drivers/scsi/sd.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index dad3991397cf..82bd4b07314e 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1191,8 +1191,8 @@ static u8 sd_group_number(struct scsi_cmnd *cmd)
> if (!sdkp->rscs)
> return 0;
>
> - return min3((u32)rq->write_hint, (u32)sdkp->permanent_stream_count,
> - 0x3fu);
> + return min3((u32)WRITE_LIFETIME_HINT(rq->write_hint),
No fan of the screaming WRITE_LIFETIME_HINT. Or the fact that multiple
things are multiplexed into the single rq->write_hint field to
start with.
This code could also use a bit of documentation already in the existing
version, but even more so now.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 1/5] fs, block: refactor enum rw_hint
2024-09-12 12:53 ` Christoph Hellwig
@ 2024-09-12 15:50 ` Kanchan Joshi
2024-09-12 20:30 ` Bart Van Assche
0 siblings, 1 reply; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-12 15:50 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz
On 9/12/2024 6:23 PM, Christoph Hellwig wrote:
> On Tue, Sep 10, 2024 at 08:31:56PM +0530, Kanchan Joshi wrote:
>> Rename enum rw_hint to rw_lifetime_hint.
>> Change i_write_hint (in inode), bi_write_hint(in bio), and write_hint
>> (in request) to use u8 data-type rather than this enum.
>>
>> This is in preparation to introduce a new write hint type.
>
> The rationale seems a bit sparse. Why is it renamed? Because the
> name fits better, because you need the same for something else?
>
Right, new name fits better. Because 'enum rw_hint' is a generic name
that conveys 'any' hint. This was fine before. But once we start
supporting more than one hint type, we need to be specific what
hint-type is being handled. More below.
>> static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
>> - enum rw_hint hint, struct writeback_control *wbc);
>> + u8 hint, struct writeback_control *wbc);
>
> And moving from the enum to a plain integer seems like a bit of a
> retrograde step.
This particular enum is hardwired to take 6 temperature-hint values [*].
But this (and many other) functions act as a simple propagator, which do
not have to care whether hint type is lifetime or placement or anything
else.
The creator/originator of the hint decides what hint to pass (userspace
in this case). And the consumer (driver in this case) decides whether or
not it understands the hint that has been passed. The intermediate
components/functions only need to pass the hint, regardless of its type,
down.
Wherever hint is being used in generic way, u8 data type is being used.
Down the line if a component/function needs to care for a specific
type, it can start decoding the passed hint type/value (using the
appropriate macro similar to what this series does for SCSI and NVMe).
Overall, this also helps to avoid the churn. Otherwise we duplicate all
the propagation code that has been done for temperature hint across the
IO stack.
[*]
enum rw_hint {
WRITE_LIFE_NOT_SET = RWH_WRITE_LIFE_NOT_SET,
WRITE_LIFE_NONE = RWH_WRITE_LIFE_NONE,
WRITE_LIFE_SHORT = RWH_WRITE_LIFE_SHORT,
WRITE_LIFE_MEDIUM = RWH_WRITE_LIFE_MEDIUM,
WRITE_LIFE_LONG = RWH_WRITE_LIFE_LONG,
WRITE_LIFE_EXTREME = RWH_WRITE_LIFE_EXTREME,
} __packed;
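A minimal userspace sketch of the producer / propagator / consumer split
described above. All toy_* names are hypothetical stand-ins (the real
inode, bio and request carry far more state), and the bit layout assumes
the type-bit encoding proposed in patch 3:

```c
#include <assert.h>
#include <stdint.h>

#define HINT_TYPE_BIT	0x80u	/* set: placement hint, clear: lifetime hint */
#define HINT_VAL_MASK	0x7fu	/* low 7 bits hold the value */

/* Hypothetical stand-ins; only the one-byte hint field is modeled. */
struct toy_inode   { uint8_t i_write_hint; };
struct toy_bio     { uint8_t bi_write_hint; };
struct toy_request { uint8_t write_hint; };

static_assert(sizeof(struct toy_inode) == 1, "the hint stays a single byte");

/* Producer: pack a placement value into the opaque byte. */
static inline uint8_t toy_pack_placement_hint(uint8_t val)
{
	return (uint8_t)(HINT_TYPE_BIT | (val & HINT_VAL_MASK));
}

/* Propagators: copy the byte down the stack without interpreting it. */
static inline void toy_bio_inherit_hint(struct toy_bio *bio,
					const struct toy_inode *inode)
{
	bio->bi_write_hint = inode->i_write_hint;
}

static inline void toy_request_inherit_hint(struct toy_request *rq,
					    const struct toy_bio *bio)
{
	rq->write_hint = bio->bi_write_hint;
}

/* Consumer that only understands lifetime hints (placement decodes to 0). */
static inline uint8_t toy_consume_lifetime(const struct toy_request *rq)
{
	return (rq->write_hint & HINT_TYPE_BIT) ?
		0 : (rq->write_hint & HINT_VAL_MASK);
}

/* Consumer that only understands placement hints (lifetime decodes to 0). */
static inline uint8_t toy_consume_placement(const struct toy_request *rq)
{
	return (rq->write_hint & HINT_TYPE_BIT) ?
		(rq->write_hint & HINT_VAL_MASK) : 0;
}
```

Only the two ends of the chain know about hint types; the middle layers
would look the same for any future hint type that fits in the byte.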
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 2/5] fcntl: rename rw_hint_* to rw_lifetime_hint_*
2024-09-12 12:54 ` Christoph Hellwig
@ 2024-09-12 15:51 ` Kanchan Joshi
0 siblings, 0 replies; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-12 15:51 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz
On 9/12/2024 6:24 PM, Christoph Hellwig wrote:
> On Tue, Sep 10, 2024 at 08:31:57PM +0530, Kanchan Joshi wrote:
>> F_GET/SET_RW_HINT fcntl handlers query/set write life hints.
>> Rename the handlers/helpers to be explicit that write life hints are
>> being handled.
>>
>> This is in preparation to introduce a new interface that supports more
>> than one type of write hint.
>
> Wouldn't it make more sense to stick with the name as exposed in the
> uapi?
The uapi uses:
for opcode: F_GET/SET_RW_HINT
for values: RWH_WRITE_LIFE_*.
The kernel handlers were using the name rw_hint_* (e.g., rw_hint_valid,
fcntl_get/set_rw_hint etc.). Since rw_hint is a generic term, it seemed
clearer to call a spade a spade (e.g. rw_lifetime_hint_valid,
fcntl_get/set_lifetime_hint).
> The same kinda applies to the previous patch - in fact IFF we
> decide to do the rename I'd probably expect both parts to go together.
>
Sure. I can merge both the patches if that's what you prefer.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX
2024-09-12 13:01 ` Christoph Hellwig
@ 2024-09-12 15:53 ` Kanchan Joshi
0 siblings, 0 replies; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-12 15:53 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/12/2024 6:31 PM, Christoph Hellwig wrote:
> On Tue, Sep 10, 2024 at 08:31:58PM +0530, Kanchan Joshi wrote:
>> This is similar to existing F_{SET/GET}_RW_HINT but more
>> generic/extensible.
>>
>> F_SET/GET_RW_HINT_EX take a pointer to a struct rw_hint_ex as argument:
>>
>> struct rw_hint_ex {
>> __u8 type;
>> __u8 pad[7];
>> __u64 val;
>> };
>>
>> With F_SET_RW_HINT_EX, the user passes the hint type and its value.
>> Hint type can be either lifetime hint (TYPE_RW_LIFETIME_HINT) or
>> placement hint (TYPE_RW_PLACEMENT_HINT). The interface allows adding
>> more hint types in future.
>
> What is the point of multiplexing these into a single call vs having
> one fcntl for each? It's not like the code points are a super
> limited resource.
Do you mean a new fcntl code only for the placement hint?
I thought folks would prefer the user-interface to be future proof so
that they don't have to add a new fcntl opcode.
Had the existing fcntl accepted "hint type" as an argument, I would not
have resorted to adding a new one now.
You may have noticed that in io_uring metadata series also, even though
current meta type is 'integrity', we allow user interface to express
other types of metadata too.
> And the _EX name isn't exactly descriptive either and screams of horrible
> Windows APIs :)
I can change to what you prefer.
But my inspiration behind this name was Linux F_GET/SET_OWN_EX (which is
revamped version of F_GET/SET_OWN).
>> + WRITE_ONCE(inode->i_write_hint, hint);
>> + if (file->f_mapping->host != inode)
>> + WRITE_ONCE(file->f_mapping->host->i_write_hint, hint);
>
> This doesn't work. You need a file system method for this so that
> the file system can intercept it, instead of storing it in completely
> arbitrary inodes without any kind of checking for support or interception
> point.
>
I don't understand why it will not work. The hint is being set in the
same way as it is done in the current code (in existing fcntl handlers
for temperature hints).
>> --- a/include/linux/rw_hint.h
>> +++ b/include/linux/rw_hint.h
>> @@ -21,4 +21,17 @@ enum rw_lifetime_hint {
>> static_assert(sizeof(enum rw_lifetime_hint) == 1);
>> #endif
>>
>> +#define WRITE_HINT_TYPE_BIT BIT(7)
>> +#define WRITE_HINT_VAL_MASK (WRITE_HINT_TYPE_BIT - 1)
>> +#define WRITE_HINT_TYPE(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
>> + TYPE_RW_PLACEMENT_HINT : TYPE_RW_LIFETIME_HINT)
>> +#define WRITE_HINT_VAL(h) ((h) & WRITE_HINT_VAL_MASK)
>> +
>> +#define WRITE_PLACEMENT_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
>> + WRITE_HINT_VAL(h) : 0)
>> +#define WRITE_LIFETIME_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
>> + 0 : WRITE_HINT_VAL(h))
>> +
>> +#define PLACEMENT_HINT_TYPE WRITE_HINT_TYPE_BIT
>> +#define MAX_PLACEMENT_HINT_VAL (WRITE_HINT_VAL_MASK - 1)
>
> That's a whole lot of undocumented macros. Please turn these into proper
> inline functions and write documentation for them.
I can try doing that.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-12 13:02 ` Christoph Hellwig
@ 2024-09-12 16:31 ` Kanchan Joshi
2024-09-13 8:06 ` Christoph Hellwig
0 siblings, 1 reply; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-12 16:31 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/12/2024 6:32 PM, Christoph Hellwig wrote:
> On Tue, Sep 10, 2024 at 08:31:59PM +0530, Kanchan Joshi wrote:
>> From: Nitesh Shetty <nj.shetty@samsung.com>
>>
>> The incoming hint value maybe either lifetime hint or placement hint.
>
> .. may either be .. ?
Sure.
>> Make SCSI interpret only temperature-based write lifetime hints.
>>
>> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com>
>> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
>> ---
>> drivers/scsi/sd.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index dad3991397cf..82bd4b07314e 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -1191,8 +1191,8 @@ static u8 sd_group_number(struct scsi_cmnd *cmd)
>> if (!sdkp->rscs)
>> return 0;
>>
>> - return min3((u32)rq->write_hint, (u32)sdkp->permanent_stream_count,
>> - 0x3fu);
>> + return min3((u32)WRITE_LIFETIME_HINT(rq->write_hint),
>
> No fan of the screaming WRITE_LIFETIME_HINT.
Macros tend to. Once it becomes lowercase (inline function), it will
stop screaming.
> Or the fact that multiple
> things are multiplexed into the single rq->write_hint field to
> start with.
Please see the response in patch #1. My worries were:
(a) adding a new field and propagating it across the stack will cause
code duplication.
(b) to add a new field we need to carve space within inode, bio and
request.
We had a hole in request, but it is set to vanish after ongoing
integrity refactoring patch of Keith [1]. For inode also, there is no
liberty at this point [2].
I think the current multiplexing approach is similar to ioprio where
multiple io priority classes/values are expressed within an int type.
And a few kernel components choose to interpret certain ioprio values at will.
And all this is still in-kernel details. Which can be changed if/when
other factors start helping.
[1]
https://lore.kernel.org/linux-nvme/20240911201240.3982856-2-kbusch@meta.com/
[2]
https://lore.kernel.org/linux-nvme/20240903-erfassen-bandmitglieder-32dfaeee66b2@brauner/
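To make the ioprio comparison concrete, here is a small standalone sketch
modeled on how the ioprio uapi packs a priority class and per-class data
into one integer (class in the bits above a 13-bit shift, data below it).
The toy_* names are hypothetical and only mirror that layout; they are not
the kernel's own macros:

```c
#include <assert.h>
#include <stdint.h>

/* Class lives above bit 13, per-class data in the 13 bits below it. */
#define TOY_IOPRIO_CLASS_SHIFT	13
#define TOY_IOPRIO_DATA_MASK	((1u << TOY_IOPRIO_CLASS_SHIFT) - 1)

static_assert(TOY_IOPRIO_DATA_MASK == 0x1fff, "13 bits of per-class data");

/* Pack a class and its data into a single 16-bit priority value. */
static inline uint16_t toy_ioprio_value(unsigned int cls, unsigned int data)
{
	return (uint16_t)((cls << TOY_IOPRIO_CLASS_SHIFT) |
			  (data & TOY_IOPRIO_DATA_MASK));
}

/* Recover the class from a packed priority value. */
static inline unsigned int toy_ioprio_class(uint16_t prio)
{
	return prio >> TOY_IOPRIO_CLASS_SHIFT;
}

/* Recover the per-class data from a packed priority value. */
static inline unsigned int toy_ioprio_data(uint16_t prio)
{
	return prio & TOY_IOPRIO_DATA_MASK;
}
```

The write-hint proposal follows the same pattern at a smaller scale: one
type bit and seven value bits in a u8.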
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 1/5] fs, block: refactor enum rw_hint
2024-09-12 15:50 ` Kanchan Joshi
@ 2024-09-12 20:30 ` Bart Van Assche
2024-09-13 7:22 ` Kanchan Joshi
0 siblings, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2024-09-12 20:30 UTC (permalink / raw)
To: Kanchan Joshi, Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz
On 9/12/24 8:50 AM, Kanchan Joshi wrote:
> Wherever hint is being used in generic way, u8 data type is being used.
Has it been considered to introduce a new union and to use that as the
type of 'hint' instead of 'u8'?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX
2024-09-10 15:01 ` [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX Kanchan Joshi
2024-09-10 18:48 ` Jens Axboe
2024-09-12 13:01 ` Christoph Hellwig
@ 2024-09-12 20:36 ` Bart Van Assche
2024-09-13 7:15 ` Kanchan Joshi
2 siblings, 1 reply; 31+ messages in thread
From: Bart Van Assche @ 2024-09-12 20:36 UTC (permalink / raw)
To: Kanchan Joshi, axboe, kbusch, hch, sagi, martin.petersen,
James.Bottomley, brauner, viro, jack, jaegeuk, jlayton,
chuck.lever
Cc: linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/10/24 8:01 AM, Kanchan Joshi wrote:
> diff --git a/include/linux/rw_hint.h b/include/linux/rw_hint.h
> index b9942f5f13d3..ff708a75e2f6 100644
> --- a/include/linux/rw_hint.h
> +++ b/include/linux/rw_hint.h
> @@ -21,4 +21,17 @@ enum rw_lifetime_hint {
> static_assert(sizeof(enum rw_lifetime_hint) == 1);
> #endif
>
> +#define WRITE_HINT_TYPE_BIT BIT(7)
> +#define WRITE_HINT_VAL_MASK (WRITE_HINT_TYPE_BIT - 1)
> +#define WRITE_HINT_TYPE(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
> + TYPE_RW_PLACEMENT_HINT : TYPE_RW_LIFETIME_HINT)
> +#define WRITE_HINT_VAL(h) ((h) & WRITE_HINT_VAL_MASK)
> +
> +#define WRITE_PLACEMENT_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
> + WRITE_HINT_VAL(h) : 0)
> +#define WRITE_LIFETIME_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
> + 0 : WRITE_HINT_VAL(h))
> +
> +#define PLACEMENT_HINT_TYPE WRITE_HINT_TYPE_BIT
> +#define MAX_PLACEMENT_HINT_VAL (WRITE_HINT_VAL_MASK - 1)
> #endif /* _LINUX_RW_HINT_H */
The above macros implement a union of two 7-bit types in an 8-bit field.
Wouldn't we be better off by using two separate 8-bit values such that we
don't need the above macros?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX
2024-09-12 20:36 ` Bart Van Assche
@ 2024-09-13 7:15 ` Kanchan Joshi
0 siblings, 0 replies; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-13 7:15 UTC (permalink / raw)
To: Bart Van Assche, axboe, kbusch, hch, sagi, martin.petersen,
James.Bottomley, brauner, viro, jack, jaegeuk, jlayton,
chuck.lever
Cc: linux-nvme, linux-fsdevel, linux-f2fs-devel, linux-block,
linux-scsi, gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/13/2024 2:06 AM, Bart Van Assche wrote:
> On 9/10/24 8:01 AM, Kanchan Joshi wrote:
>> diff --git a/include/linux/rw_hint.h b/include/linux/rw_hint.h
>> index b9942f5f13d3..ff708a75e2f6 100644
>> --- a/include/linux/rw_hint.h
>> +++ b/include/linux/rw_hint.h
>> @@ -21,4 +21,17 @@ enum rw_lifetime_hint {
>> static_assert(sizeof(enum rw_lifetime_hint) == 1);
>> #endif
>> +#define WRITE_HINT_TYPE_BIT BIT(7)
>> +#define WRITE_HINT_VAL_MASK (WRITE_HINT_TYPE_BIT - 1)
>> +#define WRITE_HINT_TYPE(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
>> + TYPE_RW_PLACEMENT_HINT : TYPE_RW_LIFETIME_HINT)
>> +#define WRITE_HINT_VAL(h) ((h) & WRITE_HINT_VAL_MASK)
>> +
>> +#define WRITE_PLACEMENT_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
>> + WRITE_HINT_VAL(h) : 0)
>> +#define WRITE_LIFETIME_HINT(h) (((h) & WRITE_HINT_TYPE_BIT) ? \
>> + 0 : WRITE_HINT_VAL(h))
>> +
>> +#define PLACEMENT_HINT_TYPE WRITE_HINT_TYPE_BIT
>> +#define MAX_PLACEMENT_HINT_VAL (WRITE_HINT_VAL_MASK - 1)
>> #endif /* _LINUX_RW_HINT_H */
>
> The above macros implement a union of two 7-bit types in an 8-bit field.
> Wouldn't we be better off by using two separate 8-bit values such that we
> don't need the above macros?
I had considered that, but it requires two bytes of space. In inode,
bio, and request.
For example this change in inode:
@@ -674,7 +674,13 @@ struct inode {
spinlock_t i_lock; /* i_blocks, i_bytes, maybe
i_size */
unsigned short i_bytes;
u8 i_blkbits;
- u8 i_write_hint;
+ union {
+ struct {
+ enum rw_lifetime_hint lifetime_hint;
+ u8 placement_hint;
+ };
+ u16 i_write_hint;
+ };
With this, generic propagation code will continue to use
inode->i_write_hint. And specific places (that care) can use either
lifetime_hint or placement_hint.
That kills the need of type-bit and above macros, but we don't have the
two bytes of space currently.
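A standalone sketch of the two-byte layout above, assuming the space were
available. The toy_inode_hints type and toy_copy_hints() helper are
hypothetical; a uint8_t stands in for the packed lifetime enum:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two hint bytes overlaid with one u16. */
struct toy_inode_hints {
	union {
		struct {
			uint8_t lifetime_hint;	/* stands in for the packed enum */
			uint8_t placement_hint;
		};
		uint16_t i_write_hint;		/* both hints, for generic code */
	};
};

static_assert(sizeof(struct toy_inode_hints) == 2,
	      "both hints together need two bytes");
static_assert(offsetof(struct toy_inode_hints, placement_hint) == 1,
	      "placement hint sits right after the lifetime hint");

/* Generic propagation: copy both hints at once without decoding them. */
static inline void toy_copy_hints(struct toy_inode_hints *dst,
				  const struct toy_inode_hints *src)
{
	dst->i_write_hint = src->i_write_hint;
}
```

With this layout neither hint displaces the other, and each keeps its full
8-bit range.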
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 1/5] fs, block: refactor enum rw_hint
2024-09-12 20:30 ` Bart Van Assche
@ 2024-09-13 7:22 ` Kanchan Joshi
0 siblings, 0 replies; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-13 7:22 UTC (permalink / raw)
To: Bart Van Assche, Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz
On 9/13/2024 2:00 AM, Bart Van Assche wrote:
> On 9/12/24 8:50 AM, Kanchan Joshi wrote:
>> Wherever hint is being used in generic way, u8 data type is being used.
>
> Has it been considered to introduce a new union and to use that as the
> type of 'hint' instead of 'u8'?
>
Is it the same as your other question on patch 3? I commented there.
If not, can you expand on what you prefer (maybe with a code fragment).
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-12 16:31 ` Kanchan Joshi
@ 2024-09-13 8:06 ` Christoph Hellwig
2024-09-16 13:49 ` Kanchan Joshi
0 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2024-09-13 8:06 UTC (permalink / raw)
To: Kanchan Joshi
Cc: Christoph Hellwig, axboe, kbusch, sagi, martin.petersen,
James.Bottomley, brauner, viro, jack, jaegeuk, jlayton,
chuck.lever, bvanassche, linux-nvme, linux-fsdevel,
linux-f2fs-devel, linux-block, linux-scsi, gost.dev, vishak.g,
javier.gonz, Nitesh Shetty
On Thu, Sep 12, 2024 at 10:01:00PM +0530, Kanchan Joshi wrote:
> Please see the response in patch #1. My worries were:
> (a) adding a new field and propagating it across the stack will cause
> code duplication.
> (b) to add a new field we need to carve space within inode, bio and
> request.
> We had a hole in request, but it is set to vanish after ongoing
> integrity refactoring patch of Keith [1]. For inode also, there is no
> liberty at this point [2].
>
> I think current multiplexing approach is similar to ioprio where
> multiple io priority classes/values are expressed within an int type.
> And few kernel components choose to interpret certain ioprio values at will.
>
> And all this is still in-kernel details. Which can be changed if/when
> other factors start helping.
Maybe part of the problem is that the API is very confusing. A small
part of that is of course that the existing temperature hints already
have some issues, but this seems to be taking them and making it
significantly worse.
Note: this tries to include highlevel comments from the discussion of
the previous patches instead of splitting them over multiple threads.
F_{S,G}ET_RW_HINT works on arbitrary file descriptors with absolutely no
check for support by the device or file system and no check for the
file type. That's not exactly good API design, but not really a major
problem because they are clearly designed as hints with a fixed number of
values, allowing the implementation to map them if not enough are
supported.
But if we increase this to a variable number of hints that don't have
any meaning (and even if that is just the rough order of the temperature
hints assigned to them), that doesn't really work. We'll need an API
to check if these stream hints are supported and how many of them,
otherwise the applications can't make any sensible use of them.
If these aren't just stream hints of the file system but you actually
want them as an abstract API for FDP you'll also need to actually
expose even more information like the reclaim unit size, but let's
ignore that for this part of the discussion.
Back to the API: the existing lifetime hints have basically three
layers:
1) syscall ABI
2) the hint stored in the inode
3) the hint passed in the bio
1) is very much fixed for the temperature API, we just need to think if
we want to support it at the same time as a more general hints API.
Or if we can map one into another. Or if we can't support them at
the same time how that is communicated.
For 2) and 3) we can use an actual union if we decide to not support
both at the same time, keyed off a flag outside the field, but if not
we simply need space for both.
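As a rough sketch of the first option (a real union keyed off a flag that
lives outside the hint field), with hypothetical toy_* names and userspace
types so it compiles standalone:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Only one of the two members is meaningful at any given time. */
union toy_write_hint {
	uint8_t lifetime;
	uint8_t placement;
};

static_assert(sizeof(union toy_write_hint) == 1,
	      "the union costs no more than a single hint byte");

/* Hypothetical inode model: the flag lives outside the hint field. */
struct toy_inode {
	bool hint_is_placement;
	union toy_write_hint hint;
};

/* Setting one hint type replaces whatever hint was stored before. */
static inline void toy_set_lifetime(struct toy_inode *inode, uint8_t val)
{
	inode->hint_is_placement = false;
	inode->hint.lifetime = val;
}

static inline void toy_set_placement(struct toy_inode *inode, uint8_t val)
{
	inode->hint_is_placement = true;
	inode->hint.placement = val;
}

/* Getters return 0 when the other hint type is currently stored. */
static inline uint8_t toy_get_lifetime(const struct toy_inode *inode)
{
	return inode->hint_is_placement ? 0 : inode->hint.lifetime;
}

static inline uint8_t toy_get_placement(const struct toy_inode *inode)
{
	return inode->hint_is_placement ? inode->hint.placement : 0;
}
```

Compared with stealing bit 7 of the hint itself, this keeps the full
8-bit value range but still needs one spare flag bit somewhere in each
structure.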
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-13 8:06 ` Christoph Hellwig
@ 2024-09-16 13:49 ` Kanchan Joshi
2024-09-17 6:20 ` Christoph Hellwig
0 siblings, 1 reply; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-16 13:49 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/13/2024 1:36 PM, Christoph Hellwig wrote:
> On Thu, Sep 12, 2024 at 10:01:00PM +0530, Kanchan Joshi wrote:
>> Please see the response in patch #1. My worries were:
>> (a) adding a new field and propagating it across the stack will cause
>> code duplication.
>> (b) to add a new field we need to carve space within inode, bio and
>> request.
>> We had a hole in request, but it is set to vanish after ongoing
>> integrity refactoring patch of Keith [1]. For inode also, there is no
>> liberty at this point [2].
>>
>> I think current multiplexing approach is similar to ioprio where
>> multiple io priority classes/values are expressed within an int type.
>> And few kernel components choose to interpret certain ioprio values at will.
>>
>> And all this is still in-kernel details. Which can be changed if/when
>> other factors start helping.
>
> Maybe part of the problem is that the API is very confusing. A small
> part of that is of course that the existing temperature hints already
> have some issues, but this seems to be taking them and making it
> significantly worse.
Can you explain what part is confusing? This is a simple API that takes
a type/value pair. Two types (and respective values) are clearly defined
currently, and more can be added in future.
> Note: this tries to include highlevel comments from the discussion of
> the previous patches instead of splitting them over multiple threads.
>
> F_{S,G}ET_RW_HINT works on arbitrary file descriptors with absolutely no
> check for support by the device or file system and no check for the
> file type. That's not exactly good API design, but not really a major
> problem because they are clearly designed as hints with a fixed number of
> values, allowing the implementation to map them if not enough are
> supported.
>
> But if we increase this to a variable number of hints that don't have
> any meaning (and even if that is just the rough order of the temperature
> hints assigned to them), that doesn't really work. We'll need an API
> to check if these stream hints are supported and how many of them,
> otherwise the applications can't make any sensible use of them.
- Since writes are backward compatible, nothing bad happens if the
passed placement-hint value is not supported. Maybe desired outcome (in
terms of WAF reduction) may not come but that's not a kernel problem
anyway. It's rather about how well application is segregating and how
well device is doing its job.
- Device is perfectly happy to work with numbers (0 to 256 in current
spec) to produce some value (i.e., WAF reduction). Any extra
semantics/abstraction on these numbers only adds to the work without
increasing that value. If any application needs that, it's free to
attach any meaning/semantics to these numbers.
Extra abstraction has already been done with temperature-hint (over
multi-stream numbers). If that's useful somehow, we should consider
going back to using those (v3)? But if we are doing a new placement
hint, it's better to use plain numbers without any semantics. That will
be (a) more scalable, (b) be closer to what device can readily accept,
(c) justify why placement should be a different hint-type, and (d) help
Kernel because it has to do less (no intermediate mapping/transformation
etc).
IMHO sticking to the existing hint model and doing less (in terms of
abstraction, reporting and stuff) in the kernel may be a better path.
> If these aren't just stream hints of the file system but you actually
> want them as an abstract API for FDP you'll also need to actually
> expose even more information like the reclaim unit size, but let's
> ignore that for this part of the discussion.
>
> Back to the API: the existing lifetime hints have basically three
> layers:
>
> 1) syscall ABI
> 2) the hint stored in the inode
> 3) the hint passed in the bio
>
> 1) is very much fixed for the temperature API, we just need to think if
> we want to support it at the same time as a more general hints API.
> Or if we can map one into another. Or if we can't support them at
> the same time how that is communicated.
>
> For 2) and 3) we can use an actual union if we decide to not support
> both at the same time, keyed off a flag outside the field, but if not
> we simply need space for both.
>
Right, if there were space, we probably would have kept both.
But particularly for these two types (temperature and placement) it's
probably fine if one overwrites the other. This is not automatic and
will happen only at the behest of the user. And that's something we can
clearly document in the man page of the new fcntl. Hope that sounds fine?
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-16 13:49 ` Kanchan Joshi
@ 2024-09-17 6:20 ` Christoph Hellwig
2024-09-17 16:03 ` Kanchan Joshi
0 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2024-09-17 6:20 UTC (permalink / raw)
To: Kanchan Joshi
Cc: Christoph Hellwig, axboe, kbusch, sagi, martin.petersen,
James.Bottomley, brauner, viro, jack, jaegeuk, jlayton,
chuck.lever, bvanassche, linux-nvme, linux-fsdevel,
linux-f2fs-devel, linux-block, linux-scsi, gost.dev, vishak.g,
javier.gonz, Nitesh Shetty
On Mon, Sep 16, 2024 at 07:19:21PM +0530, Kanchan Joshi wrote:
> > Maybe part of the problem is that the API is very confusing. A small
> > part of that is of course that the existing temperature hints already
> > have some issues, but this seems to be taking them and making it
> > significantly worse.
>
> Can you explain what part is confusing. This is a simple API that takes
> type/value pair. Two types (and respective values) are clearly defined
> currently, and more can be added in future.
I thought I outlined that below.
> > But if we increase this to a variable number of hints that don't have
> > any meaning (and even if that is just the rough order of the temperature
> > hints assigned to them), that doesn't really work. We'll need an API
> > to check if these stream hints are supported and how many of them,
> > otherwise the applications can't make any sensible use of them.
>
> - Since writes are backward compatible, nothing bad happens if the
> passed placement-hint value is not supported. Maybe desired outcome (in
> terms of WAF reduction) may not come but that's not a kernel problem
> anyway. It's rather about how well application is segregating and how
> well device is doing its job.
What do you mean with "writes are backward compatible" ?
> - Device is perfectly happy to work with numbers (0 to 256 in current
> spec) to produce some value (i.e., WAF reduction). Any extra
> semantics/abstraction on these numbers only adds to the work without
> increasing that value. If any application needs that, it's free to
> attach any meaning/semantics to these numbers.
If the device (or file system, which really needs to be in control
for actual files vs just block devices) does not support all 256
we need to reduce them to less than that. The kernel can help with
that a bit if the streams have meanings (collapsing temperature levels
that are close), but not at all if they don't have meanings. The
application can and thus needs to know the number of separate
streams available.
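To illustrate what "collapsing temperature levels that are close" could
look like, here is a hedged sketch that maps the four ordered lifetime
levels (short to extreme) onto however many streams are available. The
toy_* names and the proportional-scaling policy are assumptions made for
this example, not something taken from the series:

```c
#include <assert.h>

/* Mirrors the ordering of the RWH_WRITE_LIFE_* values (0..5). */
enum toy_lifetime {
	TOY_LIFE_NOT_SET = 0,
	TOY_LIFE_NONE    = 1,
	TOY_LIFE_SHORT   = 2,
	TOY_LIFE_MEDIUM  = 3,
	TOY_LIFE_LONG    = 4,
	TOY_LIFE_EXTREME = 5,
};

/* Number of real temperature levels (short, medium, long, extreme). */
#define TOY_NR_TEMPERATURES 4

static_assert(TOY_LIFE_EXTREME - TOY_LIFE_SHORT + 1 == TOY_NR_TEMPERATURES,
	      "temperature levels must be contiguous");

/*
 * Map a lifetime hint onto one of nr_streams streams so that neighbouring
 * temperatures share a stream when there are fewer streams than levels.
 * Returns -1 when no temperature is set or no streams are available.
 */
static inline int toy_collapse_lifetime(enum toy_lifetime hint,
					unsigned int nr_streams)
{
	unsigned int level;

	if (hint < TOY_LIFE_SHORT || hint > TOY_LIFE_EXTREME || nr_streams == 0)
		return -1;
	if (nr_streams > TOY_NR_TEMPERATURES)
		nr_streams = TOY_NR_TEMPERATURES;

	level = (unsigned int)hint - TOY_LIFE_SHORT;
	return (int)(level * nr_streams / TOY_NR_TEMPERATURES);
}
```

This only works because the levels are ordered. For opaque placement
numbers there is no notion of "close", so any reduction the kernel applies
would group unrelated streams together.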
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-17 6:20 ` Christoph Hellwig
@ 2024-09-17 16:03 ` Kanchan Joshi
2024-09-17 17:00 ` Kanchan Joshi
2024-09-18 6:42 ` Christoph Hellwig
0 siblings, 2 replies; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-17 16:03 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/17/2024 11:50 AM, Christoph Hellwig wrote:
>>> But if we increase this to a variable number of hints that don't have
>>> any meaning (and even if that is just the rough order of the temperature
>>> hints assigned to them), that doesn't really work. We'll need an API
>>> to check if these stream hints are supported and how many of them,
>>> otherwise the applications can't make any sensible use of them.
>> - Since writes are backward compatible, nothing bad happens if the
>> passed placement-hint value is not supported. Maybe desired outcome (in
>> terms of WAF reduction) may not come but that's not a kernel problem
>> anyway. It's rather about how well application is segregating and how
>> well device is doing its job.
> What do you mean with "writes are backward compatible" ?
>
Writes are not going to fail even if you don't pass a placement-id or
pass a placement-id that is not valid. An FDP-enabled SSD will not
complain, and it completes writes fine even with FDP-unaware software.
I think that part is the same as how Linux write hints behave ATM. Writes
don't always have to carry the lifetime hint. And when they do, the hint
value never becomes the reason for a failure (e.g. lifetime hints on NVMe
vanish into thin air rather than causing any failure).
>> - Device is perfectly happy to work with numbers (0 to 256 in current
>> spec) to produce some value (i.e., WAF reduction). Any extra
>> semantics/abstraction on these numbers only adds to the work without
>> increasing that value. If any application needs that, it's free to
>> attach any meaning/semantics to these numbers.
> If the device (or file system, which really needs to be in control
> for actual files vs just block devices) does not support all 256
> we need to reduce them to less than that. The kernel can help with
> that a bit if the streams have meanings (collapsing temperature levels
> that are close), but not at all if they don't have meanings.
The current patch (nvme) does what you mentioned above.
Pasting the fragment that maps placement-hints beyond the supported range
to the last valid placement-id.
+static inline void nvme_assign_placement_id(struct nvme_ns *ns,
+                                            struct request *req,
+                                            struct nvme_command *cmd)
+{
+        u8 h = umin(ns->head->nr_plids - 1,
+                    WRITE_PLACEMENT_HINT(req->write_hint));
+
+        cmd->rw.control |= cpu_to_le16(NVME_RW_DTYPE_DPLCMT);
+        cmd->rw.dsmgmt |= cpu_to_le32(ns->head->plids[h] << 16);
+}
But this was just an implementation choice (and not a failure avoidance
fallback).
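To make the clamping behaviour of the fragment concrete, here is a
minimal userspace sketch of just the index computation. It is not code
from the series: umin_uint() and plid_index() are hypothetical stand-ins
for the kernel's umin() and for the first statement of
nvme_assign_placement_id(), and nr_plids is simply passed in as a
parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's umin() (assumption: plain unsigned min). */
static unsigned int umin_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Mirrors the index computation in the fragment above: a hint that is
 * larger than the last table slot is clamped to that last slot.
 * nr_plids is the number of placement ids a (hypothetical) namespace
 * exposes.
 */
static uint8_t plid_index(unsigned int nr_plids, unsigned int hint)
{
	assert(nr_plids > 0);	/* nr_plids - 1 would wrap around otherwise */
	return (uint8_t)umin_uint(nr_plids - 1, hint);
}
```

With nr_plids = 4, hints 0 to 3 select slots 0 to 3, and every larger
hint selects slot 3 as well.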
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-17 16:03 ` Kanchan Joshi
@ 2024-09-17 17:00 ` Kanchan Joshi
2024-09-18 6:42 ` Christoph Hellwig
1 sibling, 0 replies; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-17 17:00 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/17/2024 9:33 PM, Kanchan Joshi wrote:
> On 9/17/2024 11:50 AM, Christoph Hellwig wrote:
>>>> But if we increase this to a variable number of hints that don't have
>>>> any meaning (and even if that is just the rough order of the temperature
>>>> hints assigned to them), that doesn't really work. We'll need an API
>>>> to check if these stream hints are supported and how many of them,
>>>> otherwise the applications can't make any sensible use of them.
>>> - Since writes are backward compatible, nothing bad happens if the
>>> passed placement-hint value is not supported. Maybe desired outcome (in
>>> terms of WAF reduction) may not come but that's not a kernel problem
>>> anyway. It's rather about how well application is segregating and how
>>> well device is doing its job.
>> What do you mean with "writes are backward compatible" ?
>>
> Writes are not going to fail even if you don't pass the placement-id or
> pass a placement-id that is not valid. FDP-enabled SSD will not shout
> and complete writes fine even with FDP-unaware software.
>
> I think that part is same as how Linux write hints behave ATM. Writes
> don't have to carry the lifetime hint always. And when they do, the hint
> value never becomes the reason of failure (e.g. life hints on NVMe
> vanish in the thin air rather than causing any failure).
>
FWIW, I am not sure about current SCSI streams, but NVMe multi-stream did
not tolerate invalid values. A write command with an invalid stream was
aborted. So in that scheme of things, it was important to be pedantic
about what values were being passed.
But in FDP, things are closer to Linux hints, which don't cause failures.
With the plain-numbers interface, the similarities will increase.
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-17 16:03 ` Kanchan Joshi
2024-09-17 17:00 ` Kanchan Joshi
@ 2024-09-18 6:42 ` Christoph Hellwig
2024-09-18 8:12 ` Kanchan Joshi
1 sibling, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2024-09-18 6:42 UTC (permalink / raw)
To: Kanchan Joshi
Cc: Christoph Hellwig, axboe, kbusch, sagi, martin.petersen,
James.Bottomley, brauner, viro, jack, jaegeuk, jlayton,
chuck.lever, bvanassche, linux-nvme, linux-fsdevel,
linux-f2fs-devel, linux-block, linux-scsi, gost.dev, vishak.g,
javier.gonz, Nitesh Shetty
> > If the device (or file system, which really needs to be in control
> > for actual files vs just block devices) does not support all 256
> > we need to reduce them to less than that. The kernel can help with
> > that a bit if the streams have meanings (collapsing temperature levels
> > that are close), but not at all if they don't have meanings.
>
> Current patch (nvme) does what you mentioned above.
> Pasting the fragment that maps potentially large placement-hints to the
> last valid placement-id.
>
> +static inline void nvme_assign_placement_id(struct nvme_ns *ns,
> + struct request *req,
> + struct nvme_command *cmd)
> +{
> + u8 h = umin(ns->head->nr_plids - 1,
> + WRITE_PLACEMENT_HINT(req->write_hint));
> +
> + cmd->rw.control |= cpu_to_le16(NVME_RW_DTYPE_DPLCMT);
> + cmd->rw.dsmgmt |= cpu_to_le32(ns->head->plids[h] << 16);
> +}
>
> But this was just an implementation choice (and not a failure avoidance
> fallback).
And it completely fucks things up, as I said. If I have an application
that wants to separate streams I need to know how many streams I
have available, and not have all higher numbers folded into the last
one available.
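A rough sketch of the difference, purely for illustration: fold_clamp()
models the clamping in the quoted fragment, and fold_even() is a
hypothetical application-side mapping that is only possible when the
application knows nr_streams. Neither name, nor any way to query the
count, exists in the series.

```c
#include <assert.h>

/* Clamping as in the quoted fragment: everything past the end lands on the last stream. */
static unsigned int fold_clamp(unsigned int nr_streams, unsigned int cls)
{
	assert(nr_streams > 0);
	return cls < nr_streams - 1 ? cls : nr_streams - 1;
}

/*
 * Hypothetical application-side folding: spread nr_classes data classes
 * evenly over nr_streams streams, so that neighbouring classes share a
 * stream. This needs nr_streams to be known to the application.
 */
static unsigned int fold_even(unsigned int nr_streams, unsigned int nr_classes,
			      unsigned int cls)
{
	assert(nr_streams > 0 && nr_classes > 0 && cls < nr_classes);
	if (nr_classes <= nr_streams)
		return cls;	/* enough streams: one per class */
	return cls * nr_streams / nr_classes;
}
```

With 8 classes and 4 streams, fold_clamp() yields 0,1,2,3,3,3,3,3 (five
classes share the last stream), while fold_even() yields
0,0,1,1,2,2,3,3.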
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-18 6:42 ` Christoph Hellwig
@ 2024-09-18 8:12 ` Kanchan Joshi
2024-09-18 12:01 ` Christoph Hellwig
0 siblings, 1 reply; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-18 8:12 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/18/2024 12:12 PM, Christoph Hellwig wrote:
>>> If the device (or file system, which really needs to be in control
>>> for actual files vs just block devices) does not support all 256
>>> we need to reduce them to less than that. The kernel can help with
>>> that a bit if the streams have meanings (collapsing temperature levels
>>> that are close), but not at all if they don't have meanings.
>> Current patch (nvme) does what you mentioned above.
>> Pasting the fragment that maps potentially large placement-hints to the
>> last valid placement-id.
>>
>> +static inline void nvme_assign_placement_id(struct nvme_ns *ns,
>> + struct request *req,
>> + struct nvme_command *cmd)
>> +{
>> + u8 h = umin(ns->head->nr_plids - 1,
>> + WRITE_PLACEMENT_HINT(req->write_hint));
>> +
>> + cmd->rw.control |= cpu_to_le16(NVME_RW_DTYPE_DPLCMT);
>> + cmd->rw.dsmgmt |= cpu_to_le32(ns->head->plids[h] << 16);
>> +}
>>
>> But this was just an implementation choice (and not a failure avoidance
>> fallback).
> And it completely fucks thing up as I said. If I have an application
> that wants to separate streams I need to know how many stream I
> have available, and not fold all higher numbers into the last one
> available.
Would you prefer a new queue attribute (say nr_streams) that tells that?
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-18 8:12 ` Kanchan Joshi
@ 2024-09-18 12:01 ` Christoph Hellwig
2024-09-24 9:24 ` Kanchan Joshi
0 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2024-09-18 12:01 UTC (permalink / raw)
To: Kanchan Joshi
Cc: Christoph Hellwig, axboe, kbusch, sagi, martin.petersen,
James.Bottomley, brauner, viro, jack, jaegeuk, jlayton,
chuck.lever, bvanassche, linux-nvme, linux-fsdevel,
linux-f2fs-devel, linux-block, linux-scsi, gost.dev, vishak.g,
javier.gonz, Nitesh Shetty
On Wed, Sep 18, 2024 at 01:42:51PM +0530, Kanchan Joshi wrote:
> Would you prefer a new queue attribute (say nr_streams) that tells that?
No. For one, because using the same file descriptor as the one used
to set the hint actually makes it usable - finding the block device
does not. And second, as I have said about half a dozen times, for this
scheme to actually work on a regular file the file system needs to be
the arbiter, as it can work on top of multiple block devices, consumes
streams itself, might export streams even if the underlying devices
don't, and so on.
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-18 12:01 ` Christoph Hellwig
@ 2024-09-24 9:24 ` Kanchan Joshi
2024-09-24 9:28 ` Christoph Hellwig
0 siblings, 1 reply; 31+ messages in thread
From: Kanchan Joshi @ 2024-09-24 9:24 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, kbusch, sagi, martin.petersen, James.Bottomley, brauner,
viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche, linux-nvme,
linux-fsdevel, linux-f2fs-devel, linux-block, linux-scsi,
gost.dev, vishak.g, javier.gonz, Nitesh Shetty
On 9/18/2024 5:31 PM, Christoph Hellwig wrote:
> On Wed, Sep 18, 2024 at 01:42:51PM +0530, Kanchan Joshi wrote:
>> Would you prefer a new queue attribute (say nr_streams) that tells that?
>
> No. For one because using the same file descriptors as the one used
> to set the hint actually makes it usable - finding the block device
> does not. And second as told about half a dozen times for this scheme
> to actually work on a regular file the file system actually needs the
> arbiter, as it can work on top of multiple block devices, consumes
> streams, might export streams even if the underlying devices don't and
> so on.
>
FS-managed/created hints are a different topic altogether, and honestly
that is not in the scope of this series. That needs to be thought through
at a per-FS level due to the different data/metadata layouts.
The scope of this series is to enable application-managed hints passing
through the file system. The FS only needs to pass on what it receives,
with no active decision making (since the application is doing that).
Whether it works well or not is the application's problem. But due to its
simplicity it scales across filesystems. This is for the class of
applications that know about their data and have decided to be in control.
Regardless, since placement-hints are not getting the reception I
imagined, I will backtrack.
* Re: [PATCH v5 4/5] sd: limit to use write life hints
2024-09-24 9:24 ` Kanchan Joshi
@ 2024-09-24 9:28 ` Christoph Hellwig
0 siblings, 0 replies; 31+ messages in thread
From: Christoph Hellwig @ 2024-09-24 9:28 UTC (permalink / raw)
To: Kanchan Joshi
Cc: Christoph Hellwig, axboe, kbusch, sagi, martin.petersen,
James.Bottomley, brauner, viro, jack, jaegeuk, jlayton,
chuck.lever, bvanassche, linux-nvme, linux-fsdevel,
linux-f2fs-devel, linux-block, linux-scsi, gost.dev, vishak.g,
javier.gonz, Nitesh Shetty
On Tue, Sep 24, 2024 at 02:54:51PM +0530, Kanchan Joshi wrote:
> FS managed/created hints is a different topic altogether, and honestly
> that is not the scope of this series. That needs to be thought at per-FS
> level due to different data/meta layouts.
No, it is not. If you design an API where hints bypass the file
system you fundamentally do the wrong thing when there is a file
system. No one is asking to actually implement file system
support in this series, but we need to consider the fundamental
problem in the API design.
And yes, the actual implementation will be highly dependent on the
file system.
> This scope of this series is to enable application-managed hints passing
> through the file system. FS only needs to pass what it receives.
Which fundamentally can't work for even a semi-intelligent file system.
* Re: [f2fs-dev] [PATCH v5 0/5] data placement hints and FDP
2024-09-10 15:01 ` [PATCH v5 0/5] data placement hints and FDP Kanchan Joshi
` (4 preceding siblings ...)
[not found] ` <CGME20240910151101epcas5p1c4e90f7334125fc49106d58d43cffcec@epcas5p1.samsung.com>
@ 2025-01-29 0:56 ` patchwork-bot+f2fs
5 siblings, 0 replies; 31+ messages in thread
From: patchwork-bot+f2fs @ 2025-01-29 0:56 UTC (permalink / raw)
To: Kanchan Joshi
Cc: axboe, kbusch, hch, sagi, martin.petersen, James.Bottomley,
brauner, viro, jack, jaegeuk, jlayton, chuck.lever, bvanassche,
vishak.g, linux-scsi, gost.dev, linux-nvme, linux-f2fs-devel,
linux-block, linux-fsdevel, javier.gonz
Hello:
This series was applied to jaegeuk/f2fs.git (dev)
by David Sterba <dsterba@suse.com>:
On Tue, 10 Sep 2024 20:31:55 +0530 you wrote:
> Current write-hint infrastructure supports 6 temperature-based data
> lifetime hints.
> The series extends the infrastructure with a new temperature-agnostic
> placement-type hint. New fcntl codes F_{SET/GET}_RW_HINT_EX allow to
> send the hint type/value on file. See patch #3 commit description and
> interface example below [*].
>
> [...]
Here is the summary with links:
- [f2fs-dev,v5,1/5] fs, block: refactor enum rw_hint
(no matching commit)
- [f2fs-dev,v5,2/5] fcntl: rename rw_hint_* to rw_lifetime_hint_*
(no matching commit)
- [f2fs-dev,v5,3/5] fcntl: add F_{SET/GET}_RW_HINT_EX
(no matching commit)
- [f2fs-dev,v5,4/5] sd: limit to use write life hints
(no matching commit)
- [f2fs-dev,v5,5/5] nvme: enable FDP support
https://git.kernel.org/jaegeuk/f2fs/c/2fa07d7a0f00
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Thread overview: 31+ messages
[not found] <CGME20240910151040epcas5p3f47fa7ea37a35f8b44dd9174689e1bb9@epcas5p3.samsung.com>
2024-09-10 15:01 ` [PATCH v5 0/5] data placement hints and FDP Kanchan Joshi
[not found] ` <CGME20240910151044epcas5p37f61bb85ccf8b3eb875e77c3fc260c51@epcas5p3.samsung.com>
2024-09-10 15:01 ` [PATCH v5 1/5] fs, block: refactor enum rw_hint Kanchan Joshi
2024-09-12 12:53 ` Christoph Hellwig
2024-09-12 15:50 ` Kanchan Joshi
2024-09-12 20:30 ` Bart Van Assche
2024-09-13 7:22 ` Kanchan Joshi
[not found] ` <CGME20240910151048epcas5p3c610d63022362ec5fcc6fc362ad2fb9f@epcas5p3.samsung.com>
2024-09-10 15:01 ` [PATCH v5 2/5] fcntl: rename rw_hint_* to rw_lifetime_hint_* Kanchan Joshi
2024-09-12 12:54 ` Christoph Hellwig
2024-09-12 15:51 ` Kanchan Joshi
[not found] ` <CGME20240910151052epcas5p48b20962753b1e3171daf98f050d0b5af@epcas5p4.samsung.com>
2024-09-10 15:01 ` [PATCH v5 3/5] fcntl: add F_{SET/GET}_RW_HINT_EX Kanchan Joshi
2024-09-10 18:48 ` Jens Axboe
2024-09-11 15:50 ` Kanchan Joshi
2024-09-12 13:01 ` Christoph Hellwig
2024-09-12 15:53 ` Kanchan Joshi
2024-09-12 20:36 ` Bart Van Assche
2024-09-13 7:15 ` Kanchan Joshi
[not found] ` <CGME20240910151057epcas5p3369c6257a6f169b4caa6dd59548b538c@epcas5p3.samsung.com>
2024-09-10 15:01 ` [PATCH v5 4/5] sd: limit to use write life hints Kanchan Joshi
2024-09-12 13:02 ` Christoph Hellwig
2024-09-12 16:31 ` Kanchan Joshi
2024-09-13 8:06 ` Christoph Hellwig
2024-09-16 13:49 ` Kanchan Joshi
2024-09-17 6:20 ` Christoph Hellwig
2024-09-17 16:03 ` Kanchan Joshi
2024-09-17 17:00 ` Kanchan Joshi
2024-09-18 6:42 ` Christoph Hellwig
2024-09-18 8:12 ` Kanchan Joshi
2024-09-18 12:01 ` Christoph Hellwig
2024-09-24 9:24 ` Kanchan Joshi
2024-09-24 9:28 ` Christoph Hellwig
[not found] ` <CGME20240910151101epcas5p1c4e90f7334125fc49106d58d43cffcec@epcas5p1.samsung.com>
2024-09-10 15:02 ` [PATCH v5 5/5] nvme: enable FDP support Kanchan Joshi
2025-01-29 0:56 ` [f2fs-dev] [PATCH v5 0/5] data placement hints and FDP patchwork-bot+f2fs