* [PATCH for-next v5 0/4] add ioctl to query metadata and protection info capabilities
From: Anuj Gupta @ 2025-06-30 9:05 UTC
To: vincent.fu, jack, anuj1072538, axboe, viro, brauner, hch,
martin.petersen, ebiggers, adilger
Cc: linux-block, linux-fsdevel, joshi.k, linux-nvme, linux-scsi,
gost.dev, Anuj Gupta
Hi all,
This patch series adds a new ioctl to query the metadata and integrity
capabilities of a block device.
Patch 1 renames the tuple_size field to metadata_size.
Patch 2 adds a pi_tuple_size field to struct blk_integrity, which is later
exported to userspace as well.
Patch 3 sets pi_offset in NVMe only when the checksum type is not
BLK_INTEGRITY_CSUM_NONE, so that the right pi_offset value is reported.
Patch 4 introduces a new ioctl to query integrity capability.
v4->v5
add a pad field in the user structure to align it (Christoph)
get rid of overly long lines (Christoph)
add missing nvme prefix to the patch desc (Christoph)
v3->v4
rename tuple_size to metadata_size to reflect the right meaning (Martin)
rectify the condition in blk_validate_integrity_limits when csum type is
none (Christoph)
change uapi field comment to more friendly formats (Christoph)
add comments regarding ioctl behaviour when bi is NULL (Christoph)
remove the reserved fields and use different scheme for extensibility
(Christian)
Other misc code improvements (Christoph)
set pi_tuple_size and pi_offset in NVMe only if csum type is not NONE
v2->v3
better naming for uapi struct fields (Martin)
validate integrity fields in blk-settings.c (Christoph)
v1 -> v2
introduce metadata_size, storage_tag_size and ref_tag_size field in the
uapi struct (Martin)
uapi struct fields comment improvements (Martin)
add csum_type definitions to the uapi file (Martin)
add fpc_* prefix to uapi struct fields (Andreas)
bump the size of rsvd and hence the uapi struct to 32 bytes (Andreas)
use correct value for ioctl (Andreas)
use clearer names for CRC (Eric)
Anuj Gupta (4):
block: rename tuple_size field in blk_integrity to metadata_size
block: introduce pi_tuple_size field in blk_integrity
nvme: set pi_offset only when checksum type is not
BLK_INTEGRITY_CSUM_NONE
fs: add ioctl to query metadata and protection info capabilities
block/bio-integrity-auto.c | 4 +--
block/blk-integrity.c | 54 +++++++++++++++++++++++++++-
block/blk-settings.c | 44 +++++++++++++++++++++--
block/ioctl.c | 4 +++
block/t10-pi.c | 16 ++++-----
drivers/md/dm-crypt.c | 4 +--
drivers/md/dm-integrity.c | 12 +++----
drivers/nvdimm/btt.c | 2 +-
drivers/nvme/host/core.c | 7 ++--
drivers/nvme/target/io-cmd-bdev.c | 2 +-
drivers/scsi/sd_dif.c | 3 +-
include/linux/blk-integrity.h | 11 ++++--
include/linux/blkdev.h | 3 +-
include/uapi/linux/fs.h | 59 +++++++++++++++++++++++++++++++
14 files changed, 195 insertions(+), 30 deletions(-)
--
2.25.1
* [PATCH for-next v5 1/4] block: rename tuple_size field in blk_integrity to metadata_size
From: Anuj Gupta @ 2025-06-30 9:05 UTC
To: vincent.fu, jack, anuj1072538, axboe, viro, brauner, hch,
martin.petersen, ebiggers, adilger
Cc: linux-block, linux-fsdevel, joshi.k, linux-nvme, linux-scsi,
gost.dev, Anuj Gupta, Christoph Hellwig
The tuple_size field in blk_integrity currently represents the total
size of metadata associated with each data interval. To make the meaning
more explicit, rename tuple_size to metadata_size. This is a purely
mechanical rename with no functional changes.
Suggested-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
block/bio-integrity-auto.c | 4 ++--
block/blk-integrity.c | 2 +-
block/blk-settings.c | 6 +++---
block/t10-pi.c | 16 ++++++++--------
drivers/md/dm-crypt.c | 4 ++--
drivers/md/dm-integrity.c | 12 ++++++------
drivers/nvdimm/btt.c | 2 +-
drivers/nvme/host/core.c | 2 +-
drivers/nvme/target/io-cmd-bdev.c | 2 +-
drivers/scsi/sd_dif.c | 2 +-
include/linux/blk-integrity.h | 4 ++--
include/linux/blkdev.h | 2 +-
12 files changed, 29 insertions(+), 29 deletions(-)
diff --git a/block/bio-integrity-auto.c b/block/bio-integrity-auto.c
index 9c6657664792..687952f63bbb 100644
--- a/block/bio-integrity-auto.c
+++ b/block/bio-integrity-auto.c
@@ -54,10 +54,10 @@ static bool bi_offload_capable(struct blk_integrity *bi)
{
switch (bi->csum_type) {
case BLK_INTEGRITY_CSUM_CRC64:
- return bi->tuple_size == sizeof(struct crc64_pi_tuple);
+ return bi->metadata_size == sizeof(struct crc64_pi_tuple);
case BLK_INTEGRITY_CSUM_CRC:
case BLK_INTEGRITY_CSUM_IP:
- return bi->tuple_size == sizeof(struct t10_pi_tuple);
+ return bi->metadata_size == sizeof(struct t10_pi_tuple);
default:
pr_warn_once("%s: unknown integrity checksum type:%d\n",
__func__, bi->csum_type);
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index e4e2567061f9..c1102bf4cd8d 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -239,7 +239,7 @@ static ssize_t format_show(struct device *dev, struct device_attribute *attr,
{
struct blk_integrity *bi = dev_to_bi(dev);
- if (!bi->tuple_size)
+ if (!bi->metadata_size)
return sysfs_emit(page, "none\n");
return sysfs_emit(page, "%s\n", blk_integrity_profile_name(bi));
}
diff --git a/block/blk-settings.c b/block/blk-settings.c
index a000daafbfb4..787500ff00c3 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -114,7 +114,7 @@ static int blk_validate_integrity_limits(struct queue_limits *lim)
{
struct blk_integrity *bi = &lim->integrity;
- if (!bi->tuple_size) {
+ if (!bi->metadata_size) {
if (bi->csum_type != BLK_INTEGRITY_CSUM_NONE ||
bi->tag_size || ((bi->flags & BLK_INTEGRITY_REF_TAG))) {
pr_warn("invalid PI settings.\n");
@@ -875,7 +875,7 @@ bool queue_limits_stack_integrity(struct queue_limits *t,
return true;
if (ti->flags & BLK_INTEGRITY_STACKED) {
- if (ti->tuple_size != bi->tuple_size)
+ if (ti->metadata_size != bi->metadata_size)
goto incompatible;
if (ti->interval_exp != bi->interval_exp)
goto incompatible;
@@ -891,7 +891,7 @@ bool queue_limits_stack_integrity(struct queue_limits *t,
ti->flags |= (bi->flags & BLK_INTEGRITY_DEVICE_CAPABLE) |
(bi->flags & BLK_INTEGRITY_REF_TAG);
ti->csum_type = bi->csum_type;
- ti->tuple_size = bi->tuple_size;
+ ti->metadata_size = bi->metadata_size;
ti->pi_offset = bi->pi_offset;
ti->interval_exp = bi->interval_exp;
ti->tag_size = bi->tag_size;
diff --git a/block/t10-pi.c b/block/t10-pi.c
index 851db518ee5e..0c4ed9702146 100644
--- a/block/t10-pi.c
+++ b/block/t10-pi.c
@@ -56,7 +56,7 @@ static void t10_pi_generate(struct blk_integrity_iter *iter,
pi->ref_tag = 0;
iter->data_buf += iter->interval;
- iter->prot_buf += bi->tuple_size;
+ iter->prot_buf += bi->metadata_size;
iter->seed++;
}
}
@@ -105,7 +105,7 @@ static blk_status_t t10_pi_verify(struct blk_integrity_iter *iter,
next:
iter->data_buf += iter->interval;
- iter->prot_buf += bi->tuple_size;
+ iter->prot_buf += bi->metadata_size;
iter->seed++;
}
@@ -125,7 +125,7 @@ static blk_status_t t10_pi_verify(struct blk_integrity_iter *iter,
static void t10_pi_type1_prepare(struct request *rq)
{
struct blk_integrity *bi = &rq->q->limits.integrity;
- const int tuple_sz = bi->tuple_size;
+ const int tuple_sz = bi->metadata_size;
u32 ref_tag = t10_pi_ref_tag(rq);
u8 offset = bi->pi_offset;
struct bio *bio;
@@ -177,7 +177,7 @@ static void t10_pi_type1_complete(struct request *rq, unsigned int nr_bytes)
{
struct blk_integrity *bi = &rq->q->limits.integrity;
unsigned intervals = nr_bytes >> bi->interval_exp;
- const int tuple_sz = bi->tuple_size;
+ const int tuple_sz = bi->metadata_size;
u32 ref_tag = t10_pi_ref_tag(rq);
u8 offset = bi->pi_offset;
struct bio *bio;
@@ -234,7 +234,7 @@ static void ext_pi_crc64_generate(struct blk_integrity_iter *iter,
put_unaligned_be48(0ULL, pi->ref_tag);
iter->data_buf += iter->interval;
- iter->prot_buf += bi->tuple_size;
+ iter->prot_buf += bi->metadata_size;
iter->seed++;
}
}
@@ -289,7 +289,7 @@ static blk_status_t ext_pi_crc64_verify(struct blk_integrity_iter *iter,
next:
iter->data_buf += iter->interval;
- iter->prot_buf += bi->tuple_size;
+ iter->prot_buf += bi->metadata_size;
iter->seed++;
}
@@ -299,7 +299,7 @@ static blk_status_t ext_pi_crc64_verify(struct blk_integrity_iter *iter,
static void ext_pi_type1_prepare(struct request *rq)
{
struct blk_integrity *bi = &rq->q->limits.integrity;
- const int tuple_sz = bi->tuple_size;
+ const int tuple_sz = bi->metadata_size;
u64 ref_tag = ext_pi_ref_tag(rq);
u8 offset = bi->pi_offset;
struct bio *bio;
@@ -340,7 +340,7 @@ static void ext_pi_type1_complete(struct request *rq, unsigned int nr_bytes)
{
struct blk_integrity *bi = &rq->q->limits.integrity;
unsigned intervals = nr_bytes >> bi->interval_exp;
- const int tuple_sz = bi->tuple_size;
+ const int tuple_sz = bi->metadata_size;
u64 ref_tag = ext_pi_ref_tag(rq);
u8 offset = bi->pi_offset;
struct bio *bio;
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 381b37f181e5..bec91cfb5cb8 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1207,11 +1207,11 @@ static int crypt_integrity_ctr(struct crypt_config *cc, struct dm_target *ti)
return -EINVAL;
}
- if (bi->tuple_size < cc->used_tag_size) {
+ if (bi->metadata_size < cc->used_tag_size) {
ti->error = "Integrity profile tag size mismatch.";
return -EINVAL;
}
- cc->tuple_size = bi->tuple_size;
+ cc->tuple_size = bi->metadata_size;
if (1 << bi->interval_exp != cc->sector_size) {
ti->error = "Integrity profile sector size mismatch.";
return -EINVAL;
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 4395657fa583..efeee0a873c0 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -3906,8 +3906,8 @@ static void dm_integrity_io_hints(struct dm_target *ti, struct queue_limits *lim
struct blk_integrity *bi = &limits->integrity;
memset(bi, 0, sizeof(*bi));
- bi->tuple_size = ic->tag_size;
- bi->tag_size = bi->tuple_size;
+ bi->metadata_size = ic->tag_size;
+ bi->tag_size = bi->metadata_size;
bi->interval_exp =
ic->sb->log2_sectors_per_block + SECTOR_SHIFT;
}
@@ -4746,18 +4746,18 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned int argc, char **argv
ti->error = "Integrity profile not supported";
goto bad;
}
- /*printk("tag_size: %u, tuple_size: %u\n", bi->tag_size, bi->tuple_size);*/
- if (bi->tuple_size < ic->tag_size) {
+ /*printk("tag_size: %u, metadata_size: %u\n", bi->tag_size, bi->metadata_size);*/
+ if (bi->metadata_size < ic->tag_size) {
r = -EINVAL;
ti->error = "The integrity profile is smaller than tag size";
goto bad;
}
- if ((unsigned long)bi->tuple_size > PAGE_SIZE / 2) {
+ if ((unsigned long)bi->metadata_size > PAGE_SIZE / 2) {
r = -EINVAL;
ti->error = "Too big tuple size";
goto bad;
}
- ic->tuple_size = bi->tuple_size;
+ ic->tuple_size = bi->metadata_size;
if (1 << bi->interval_exp != ic->sectors_per_block << SECTOR_SHIFT) {
r = -EINVAL;
ti->error = "Integrity profile sector size mismatch";
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 423dcd190906..2a1aa32e6693 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1506,7 +1506,7 @@ static int btt_blk_init(struct btt *btt)
int rc;
if (btt_meta_size(btt) && IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY)) {
- lim.integrity.tuple_size = btt_meta_size(btt);
+ lim.integrity.metadata_size = btt_meta_size(btt);
lim.integrity.tag_size = btt_meta_size(btt);
}
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e533d791955d..de8d27ceefc4 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1866,7 +1866,7 @@ static bool nvme_init_integrity(struct nvme_ns_head *head,
break;
}
- bi->tuple_size = head->ms;
+ bi->metadata_size = head->ms;
bi->pi_offset = info->pi_offset;
return true;
}
diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index eba42df2f821..42fb19f94ab8 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -65,7 +65,7 @@ static void nvmet_bdev_ns_enable_integrity(struct nvmet_ns *ns)
return;
if (bi->csum_type == BLK_INTEGRITY_CSUM_CRC) {
- ns->metadata_size = bi->tuple_size;
+ ns->metadata_size = bi->metadata_size;
if (bi->flags & BLK_INTEGRITY_REF_TAG)
ns->pi_type = NVME_NS_DPS_PI_TYPE1;
else
diff --git a/drivers/scsi/sd_dif.c b/drivers/scsi/sd_dif.c
index ae6ce6f5d622..18bfca1f1c78 100644
--- a/drivers/scsi/sd_dif.c
+++ b/drivers/scsi/sd_dif.c
@@ -52,7 +52,7 @@ void sd_dif_config_host(struct scsi_disk *sdkp, struct queue_limits *lim)
if (type != T10_PI_TYPE3_PROTECTION)
bi->flags |= BLK_INTEGRITY_REF_TAG;
- bi->tuple_size = sizeof(struct t10_pi_tuple);
+ bi->metadata_size = sizeof(struct t10_pi_tuple);
if (dif && type) {
bi->flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
diff --git a/include/linux/blk-integrity.h b/include/linux/blk-integrity.h
index c7eae0bfb013..d27730da47f3 100644
--- a/include/linux/blk-integrity.h
+++ b/include/linux/blk-integrity.h
@@ -33,7 +33,7 @@ int blk_rq_integrity_map_user(struct request *rq, void __user *ubuf,
static inline bool
blk_integrity_queue_supports_integrity(struct request_queue *q)
{
- return q->limits.integrity.tuple_size;
+ return q->limits.integrity.metadata_size;
}
static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
@@ -74,7 +74,7 @@ static inline unsigned int bio_integrity_intervals(struct blk_integrity *bi,
static inline unsigned int bio_integrity_bytes(struct blk_integrity *bi,
unsigned int sectors)
{
- return bio_integrity_intervals(bi, sectors) * bi->tuple_size;
+ return bio_integrity_intervals(bi, sectors) * bi->metadata_size;
}
static inline bool blk_integrity_rq(struct request *rq)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a51f92b6c340..edc3b458fbd9 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -116,7 +116,7 @@ enum blk_integrity_checksum {
struct blk_integrity {
unsigned char flags;
enum blk_integrity_checksum csum_type;
- unsigned char tuple_size;
+ unsigned char metadata_size;
unsigned char pi_offset;
unsigned char interval_exp;
unsigned char tag_size;
--
2.25.1
* [PATCH for-next v5 2/4] block: introduce pi_tuple_size field in blk_integrity
From: Anuj Gupta @ 2025-06-30 9:05 UTC
To: vincent.fu, jack, anuj1072538, axboe, viro, brauner, hch,
martin.petersen, ebiggers, adilger
Cc: linux-block, linux-fsdevel, joshi.k, linux-nvme, linux-scsi,
gost.dev, Anuj Gupta, Christoph Hellwig
Introduce a new pi_tuple_size field in struct blk_integrity to
explicitly represent the size (in bytes) of the protection information
(PI) tuple. This is a prep patch.
Add validation in blk_validate_integrity_limits() to ensure that
pi_tuple_size matches the expected size for known checksum types and
never exceeds metadata_size.
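The checks described above can be modelled outside the kernel roughly as follows. This is only a sketch under stated assumptions: check_pi_tuple_size(), the enum, and the two mock tuple structs are hypothetical stand-ins for the kernel's blk_integrity checksum types, struct t10_pi_tuple and struct crc64_pi_tuple, not the code added by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's checksum types. */
enum csum_type { CSUM_NONE, CSUM_IP, CSUM_CRC, CSUM_CRC64 };

/* Mock of the 8-byte T10 PI tuple: guard, application tag, reference tag. */
struct mock_t10_tuple {
	uint16_t guard_tag;
	uint16_t app_tag;
	uint32_t ref_tag;
};

/* Mock of the 16-byte CRC64 PI tuple: 64-bit guard, app tag, 48-bit ref tag. */
struct mock_crc64_tuple {
	uint64_t guard_tag;
	uint16_t app_tag;
	uint8_t ref_tag[6];
};

static_assert(sizeof(struct mock_t10_tuple) == 8, "T10 PI tuple is 8 bytes");
static_assert(sizeof(struct mock_crc64_tuple) == 16, "CRC64 PI tuple is 16 bytes");

/* Returns 0 if the PI tuple size is consistent, -EINVAL otherwise. */
int check_pi_tuple_size(enum csum_type type, unsigned int pi_tuple_size,
			unsigned int metadata_size)
{
	/* The PI tuple lives inside the metadata, so it cannot be larger. */
	if (pi_tuple_size > metadata_size)
		return -EINVAL;

	switch (type) {
	case CSUM_NONE:
		/* Without a checksum there is no PI tuple at all. */
		return pi_tuple_size ? -EINVAL : 0;
	case CSUM_IP:
	case CSUM_CRC:
		return pi_tuple_size == sizeof(struct mock_t10_tuple) ?
			0 : -EINVAL;
	case CSUM_CRC64:
		return pi_tuple_size == sizeof(struct mock_crc64_tuple) ?
			0 : -EINVAL;
	}
	return -EINVAL;
}
```

For example, a 64-byte metadata area carrying a 16-byte CRC64 tuple passes, while a non-zero pi_tuple_size combined with CSUM_NONE is rejected.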
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
block/blk-settings.c | 38 ++++++++++++++++++++++++++++++++++++++
drivers/nvme/host/core.c | 2 ++
drivers/scsi/sd_dif.c | 1 +
include/linux/blkdev.h | 1 +
4 files changed, 42 insertions(+)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 787500ff00c3..32f3cdc9835a 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -14,6 +14,8 @@
#include <linux/jiffies.h>
#include <linux/gfp.h>
#include <linux/dma-mapping.h>
+#include <linux/t10-pi.h>
+#include <linux/crc64.h>
#include "blk.h"
#include "blk-rq-qos.h"
@@ -135,6 +137,42 @@ static int blk_validate_integrity_limits(struct queue_limits *lim)
return -EINVAL;
}
+ if (bi->pi_tuple_size > bi->metadata_size) {
+ pr_warn("pi_tuple_size (%u) exceeds metadata_size (%u)\n",
+ bi->pi_tuple_size,
+ bi->metadata_size);
+ return -EINVAL;
+ }
+
+ switch (bi->csum_type) {
+ case BLK_INTEGRITY_CSUM_NONE:
+ if (bi->pi_tuple_size) {
+ pr_warn("pi_tuple_size must be 0 when checksum type "
+ "is none\n");
+ return -EINVAL;
+ }
+ break;
+ case BLK_INTEGRITY_CSUM_CRC:
+ case BLK_INTEGRITY_CSUM_IP:
+ if (bi->pi_tuple_size != sizeof(struct t10_pi_tuple)) {
+ pr_warn("pi_tuple_size mismatch for T10 PI: expected "
+ "%zu, got %u\n",
+ sizeof(struct t10_pi_tuple),
+ bi->pi_tuple_size);
+ return -EINVAL;
+ }
+ break;
+ case BLK_INTEGRITY_CSUM_CRC64:
+ if (bi->pi_tuple_size != sizeof(struct crc64_pi_tuple)) {
+ pr_warn("pi_tuple_size mismatch for CRC64 PI: "
+ "expected %zu, got %u\n",
+ sizeof(struct crc64_pi_tuple),
+ bi->pi_tuple_size);
+ return -EINVAL;
+ }
+ break;
+ }
+
if (!bi->interval_exp)
bi->interval_exp = ilog2(lim->logical_block_size);
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index de8d27ceefc4..685dea0f23a3 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1867,6 +1867,8 @@ static bool nvme_init_integrity(struct nvme_ns_head *head,
}
bi->metadata_size = head->ms;
+ if (bi->csum_type)
+ bi->pi_tuple_size = head->pi_size;
bi->pi_offset = info->pi_offset;
return true;
}
diff --git a/drivers/scsi/sd_dif.c b/drivers/scsi/sd_dif.c
index 18bfca1f1c78..ff4217fef93b 100644
--- a/drivers/scsi/sd_dif.c
+++ b/drivers/scsi/sd_dif.c
@@ -53,6 +53,7 @@ void sd_dif_config_host(struct scsi_disk *sdkp, struct queue_limits *lim)
bi->flags |= BLK_INTEGRITY_REF_TAG;
bi->metadata_size = sizeof(struct t10_pi_tuple);
+ bi->pi_tuple_size = bi->metadata_size;
if (dif && type) {
bi->flags |= BLK_INTEGRITY_DEVICE_CAPABLE;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index edc3b458fbd9..82348fcc2455 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -120,6 +120,7 @@ struct blk_integrity {
unsigned char pi_offset;
unsigned char interval_exp;
unsigned char tag_size;
+ unsigned char pi_tuple_size;
};
typedef unsigned int __bitwise blk_mode_t;
--
2.25.1
* [PATCH for-next v5 3/4] nvme: set pi_offset only when checksum type is not BLK_INTEGRITY_CSUM_NONE
From: Anuj Gupta @ 2025-06-30 9:05 UTC
To: vincent.fu, jack, anuj1072538, axboe, viro, brauner, hch,
martin.petersen, ebiggers, adilger
Cc: linux-block, linux-fsdevel, joshi.k, linux-nvme, linux-scsi,
gost.dev, Anuj Gupta, Christoph Hellwig
Protection information is treated as opaque when the checksum type is
BLK_INTEGRITY_CSUM_NONE. To maintain the right metadata semantics, set
pi_offset only when the checksum type is not BLK_INTEGRITY_CSUM_NONE.
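The resulting behaviour can be illustrated with a small standalone model. It is a sketch only: struct integrity_layout and init_layout() are hypothetical names that mirror the three blk_integrity fields touched here, and the numbers in the usage note are made-up example values.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's checksum types. */
enum csum_type { CSUM_NONE, CSUM_IP, CSUM_CRC, CSUM_CRC64 };

/* Mirrors only the blk_integrity fields relevant to this patch. */
struct integrity_layout {
	unsigned char metadata_size;	/* total metadata bytes per interval */
	unsigned char pi_tuple_size;	/* bytes of that which are PI */
	unsigned char pi_offset;	/* where the PI tuple starts */
};

/*
 * Model of the namespace setup: metadata_size is always recorded, but the
 * PI size and offset are only meaningful when a checksum type is in use.
 */
struct integrity_layout init_layout(enum csum_type type, unsigned char ms,
				    unsigned char pi_size,
				    unsigned char pi_offset)
{
	struct integrity_layout l = { .metadata_size = ms };

	/* The PI tuple must fit inside the metadata area. */
	assert(pi_offset + pi_size <= ms);

	if (type != CSUM_NONE) {
		l.pi_tuple_size = pi_size;
		l.pi_offset = pi_offset;
	}
	return l;
}
```

With 64 bytes of metadata and an 8-byte tuple at offset 56, CSUM_CRC yields pi_tuple_size 8 and pi_offset 56, whereas CSUM_NONE leaves both at zero so the whole area is reported as opaque.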
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
drivers/nvme/host/core.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 685dea0f23a3..500a9e82d60e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1867,9 +1867,10 @@ static bool nvme_init_integrity(struct nvme_ns_head *head,
}
bi->metadata_size = head->ms;
- if (bi->csum_type)
+ if (bi->csum_type) {
bi->pi_tuple_size = head->pi_size;
- bi->pi_offset = info->pi_offset;
+ bi->pi_offset = info->pi_offset;
+ }
return true;
}
--
2.25.1
* [PATCH for-next v5 4/4] fs: add ioctl to query metadata and protection info capabilities
From: Anuj Gupta @ 2025-06-30 9:05 UTC
To: vincent.fu, jack, anuj1072538, axboe, viro, brauner, hch,
martin.petersen, ebiggers, adilger
Cc: linux-block, linux-fsdevel, joshi.k, linux-nvme, linux-scsi,
gost.dev, Anuj Gupta
Add a new ioctl, FS_IOC_GETLBMD_CAP, to query metadata and protection
info (PI) capabilities. This ioctl returns information about the file's
integrity profile. This is useful for userspace applications to
understand a file's end-to-end data protection support and configure the
I/O accordingly.
For now this interface is only supported by block devices. However, the
design and placement of this ioctl in the generic FS ioctl space allow us
to extend it to work over files as well. This may be useful when
filesystems start supporting PI-aware layouts.
A new structure struct logical_block_metadata_cap is introduced, which
contains the following fields:
1. lbmd_flags: bitmask of logical block metadata capability flags
2. lbmd_interval: the amount of data described by each unit of logical
block metadata
3. lbmd_size: size in bytes of the logical block metadata associated
with each interval
4. lbmd_opaque_size: size in bytes of the opaque block tag associated
with each interval
5. lbmd_opaque_offset: offset in bytes of the opaque block tag within
the logical block metadata
6. lbmd_pi_size: size in bytes of the T10 PI tuple associated with each
interval
7. lbmd_pi_offset: offset in bytes of T10 PI tuple within the logical
block metadata
8. lbmd_guard_tag_type: T10 PI guard tag type
9. lbmd_app_tag_size: size in bytes of the T10 PI application tag
10. lbmd_ref_tag_size: size in bytes of the T10 PI reference tag
11. lbmd_storage_tag_size: size in bytes of the T10 PI storage tag
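From userspace, the fields listed above could be read roughly as follows. This is a minimal sketch, not part of the patch: query_lbmd_cap() and print_lbmd_cap() are hypothetical helper names, and the fallback definitions are copied from the uapi hunk of this patch for toolchains whose <linux/fs.h> predates the series.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

#ifndef FS_IOC_GETLBMD_CAP
/* Fallback copied from this patch, for headers that predate the series. */
#define LBMD_PI_CAP_INTEGRITY	(1 << 0)
#define LBMD_PI_CAP_REFTAG	(1 << 1)
#define LBMD_SIZE_VER0		16
struct logical_block_metadata_cap {
	__u32 lbmd_flags;
	__u16 lbmd_interval;
	__u8 lbmd_size;
	__u8 lbmd_opaque_size;
	__u8 lbmd_opaque_offset;
	__u8 lbmd_pi_size;
	__u8 lbmd_pi_offset;
	__u8 lbmd_guard_tag_type;
	__u8 lbmd_app_tag_size;
	__u8 lbmd_ref_tag_size;
	__u8 lbmd_storage_tag_size;
	__u8 pad;
};
#define FS_IOC_GETLBMD_CAP _IOWR(0x15, 2, struct logical_block_metadata_cap)
#endif

static_assert(sizeof(struct logical_block_metadata_cap) >= LBMD_SIZE_VER0,
	      "struct must cover the first published layout");

/* Hypothetical helper: returns 0 on success or a negative errno value. */
int query_lbmd_cap(const char *path, struct logical_block_metadata_cap *cap)
{
	int fd = open(path, O_RDONLY);
	int ret = 0;

	if (fd < 0)
		return -errno;
	if (ioctl(fd, FS_IOC_GETLBMD_CAP, cap) < 0)
		ret = -errno;
	close(fd);
	return ret;
}

/* Hypothetical helper: dump the main fields of a filled-in descriptor. */
void print_lbmd_cap(const struct logical_block_metadata_cap *cap)
{
	printf("interval=%u metadata=%u opaque=%u@%u pi=%u@%u\n",
	       (unsigned int)cap->lbmd_interval,
	       (unsigned int)cap->lbmd_size,
	       (unsigned int)cap->lbmd_opaque_size,
	       (unsigned int)cap->lbmd_opaque_offset,
	       (unsigned int)cap->lbmd_pi_size,
	       (unsigned int)cap->lbmd_pi_offset);
	printf("integrity=%s reftag=%s guard_type=%u\n",
	       (cap->lbmd_flags & LBMD_PI_CAP_INTEGRITY) ? "yes" : "no",
	       (cap->lbmd_flags & LBMD_PI_CAP_REFTAG) ? "yes" : "no",
	       (unsigned int)cap->lbmd_guard_tag_type);
}
```

A caller would pass a block device node it is allowed to open (for instance a path such as /dev/nvme0n1, used here purely as an example) and call print_lbmd_cap() when query_lbmd_cap() returns 0.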
The internal logic to fetch the capability is encapsulated in a helper
function blk_get_meta_cap(), which uses the blk_integrity profile
associated with the device. The ioctl returns -EOPNOTSUPP if
CONFIG_BLK_DEV_INTEGRITY is not enabled.
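The helper sizes its copy-out from the struct size that the caller encoded in the ioctl number, which is what lets the structure grow later without reserved fields. A rough userspace model of that size-negotiated copy is sketched below. It is an illustration only: copy_sized() and SIZE_VER0 are hypothetical names, and the kernel side uses copy_struct_to_user() rather than this code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: smallest caller struct size the interface accepts. */
#define SIZE_VER0 16

/*
 * Model of a size-negotiated copy-out.
 *   usize: struct size the caller was built against
 *   ksize: struct size the provider currently knows
 * An older caller (usize < ksize) receives a truncated copy; a newer
 * caller (usize > ksize) sees the fields it knows beyond ksize zeroed.
 */
int copy_sized(void *dst, size_t usize, const void *src, size_t ksize)
{
	size_t n = usize < ksize ? usize : ksize;

	assert(src != NULL);
	if (!dst)
		return -EINVAL;
	if (usize < SIZE_VER0)
		return -EINVAL;

	memcpy(dst, src, n);
	if (usize > ksize)
		memset((char *)dst + n, 0, usize - ksize);
	return 0;
}
```

In this model a caller built against a 24-byte struct and served by a 16-byte provider gets the first 16 bytes filled in and the remaining 8 cleared, while a caller passing fewer than SIZE_VER0 bytes is rejected.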
Suggested-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
---
block/blk-integrity.c | 52 ++++++++++++++++++++++++++++++
block/ioctl.c | 4 +++
include/linux/blk-integrity.h | 7 +++++
include/uapi/linux/fs.h | 59 +++++++++++++++++++++++++++++++++++
4 files changed, 122 insertions(+)
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index c1102bf4cd8d..9d9dc9c32083 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -13,6 +13,7 @@
#include <linux/scatterlist.h>
#include <linux/export.h>
#include <linux/slab.h>
+#include <linux/t10-pi.h>
#include "blk.h"
@@ -54,6 +55,57 @@ int blk_rq_count_integrity_sg(struct request_queue *q, struct bio *bio)
return segments;
}
+int blk_get_meta_cap(struct block_device *bdev, unsigned int cmd,
+ struct logical_block_metadata_cap __user *argp)
+{
+ struct blk_integrity *bi = blk_get_integrity(bdev->bd_disk);
+ struct logical_block_metadata_cap meta_cap = {};
+ size_t usize = _IOC_SIZE(cmd);
+
+ if (!argp)
+ return -EINVAL;
+ if (usize < LBMD_SIZE_VER0)
+ return -EINVAL;
+ if (!bi)
+ goto out;
+
+ if (bi->flags & BLK_INTEGRITY_DEVICE_CAPABLE)
+ meta_cap.lbmd_flags |= LBMD_PI_CAP_INTEGRITY;
+ if (bi->flags & BLK_INTEGRITY_REF_TAG)
+ meta_cap.lbmd_flags |= LBMD_PI_CAP_REFTAG;
+ meta_cap.lbmd_interval = 1 << bi->interval_exp;
+ meta_cap.lbmd_size = bi->metadata_size;
+ meta_cap.lbmd_pi_size = bi->pi_tuple_size;
+ meta_cap.lbmd_pi_offset = bi->pi_offset;
+ meta_cap.lbmd_opaque_size = bi->metadata_size - bi->pi_tuple_size;
+ if (meta_cap.lbmd_opaque_size && !bi->pi_offset)
+ meta_cap.lbmd_opaque_offset = bi->pi_tuple_size;
+
+ meta_cap.lbmd_guard_tag_type = bi->csum_type;
+ if (bi->csum_type != BLK_INTEGRITY_CSUM_NONE)
+ meta_cap.lbmd_app_tag_size = 2;
+
+ if (bi->flags & BLK_INTEGRITY_REF_TAG) {
+ switch (bi->csum_type) {
+ case BLK_INTEGRITY_CSUM_CRC64:
+ meta_cap.lbmd_ref_tag_size =
+ sizeof_field(struct crc64_pi_tuple, ref_tag);
+ break;
+ case BLK_INTEGRITY_CSUM_CRC:
+ case BLK_INTEGRITY_CSUM_IP:
+ meta_cap.lbmd_ref_tag_size =
+ sizeof_field(struct t10_pi_tuple, ref_tag);
+ break;
+ default:
+ break;
+ }
+ }
+
+out:
+ return copy_struct_to_user(argp, usize, &meta_cap, sizeof(meta_cap),
+ NULL);
+}
+
/**
* blk_rq_map_integrity_sg - Map integrity metadata into a scatterlist
* @rq: request to map
diff --git a/block/ioctl.c b/block/ioctl.c
index e472cc1030c6..9ad403733e19 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -13,6 +13,7 @@
#include <linux/uaccess.h>
#include <linux/pagemap.h>
#include <linux/io_uring/cmd.h>
+#include <linux/blk-integrity.h>
#include <uapi/linux/blkdev.h>
#include "blk.h"
#include "blk-crypto-internal.h"
@@ -566,6 +567,9 @@ static int blkdev_common_ioctl(struct block_device *bdev, blk_mode_t mode,
{
unsigned int max_sectors;
+ if (_IOC_NR(cmd) == _IOC_NR(FS_IOC_GETLBMD_CAP))
+ return blk_get_meta_cap(bdev, cmd, argp);
+
switch (cmd) {
case BLKFLSBUF:
return blkdev_flushbuf(bdev, cmd, arg);
diff --git a/include/linux/blk-integrity.h b/include/linux/blk-integrity.h
index d27730da47f3..e04c6e5bf1c6 100644
--- a/include/linux/blk-integrity.h
+++ b/include/linux/blk-integrity.h
@@ -29,6 +29,8 @@ int blk_rq_map_integrity_sg(struct request *, struct scatterlist *);
int blk_rq_count_integrity_sg(struct request_queue *, struct bio *);
int blk_rq_integrity_map_user(struct request *rq, void __user *ubuf,
ssize_t bytes);
+int blk_get_meta_cap(struct block_device *bdev, unsigned int cmd,
+ struct logical_block_metadata_cap __user *argp);
static inline bool
blk_integrity_queue_supports_integrity(struct request_queue *q)
@@ -92,6 +94,11 @@ static inline struct bio_vec rq_integrity_vec(struct request *rq)
rq->bio->bi_integrity->bip_iter);
}
#else /* CONFIG_BLK_DEV_INTEGRITY */
+static inline int blk_get_meta_cap(struct block_device *bdev, unsigned int cmd,
+ struct logical_block_metadata_cap __user *argp)
+{
+ return -EOPNOTSUPP;
+}
static inline int blk_rq_count_integrity_sg(struct request_queue *q,
struct bio *b)
{
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 0098b0ce8ccb..83720a2fd20d 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -91,6 +91,63 @@ struct fs_sysfs_path {
__u8 name[128];
};
+/* Protection info capability flags */
+#define LBMD_PI_CAP_INTEGRITY (1 << 0)
+#define LBMD_PI_CAP_REFTAG (1 << 1)
+
+/* Checksum types for Protection Information */
+#define LBMD_PI_CSUM_NONE 0
+#define LBMD_PI_CSUM_IP 1
+#define LBMD_PI_CSUM_CRC16_T10DIF 2
+#define LBMD_PI_CSUM_CRC64_NVME 4
+
+/* sizeof first published struct */
+#define LBMD_SIZE_VER0 16
+
+/*
+ * Logical block metadata capability descriptor
+ * If the device does not support metadata, all the fields will be zero.
+ * Applications must check lbmd_flags to determine whether metadata is
+ * supported or not.
+ */
+struct logical_block_metadata_cap {
+ /* Bitmask of logical block metadata capability flags */
+ __u32 lbmd_flags;
+ /*
+ * The amount of data described by each unit of logical block
+ * metadata
+ */
+ __u16 lbmd_interval;
+ /*
+ * Size in bytes of the logical block metadata associated with each
+ * interval
+ */
+ __u8 lbmd_size;
+ /*
+ * Size in bytes of the opaque block tag associated with each
+ * interval
+ */
+ __u8 lbmd_opaque_size;
+ /*
+ * Offset in bytes of the opaque block tag within the logical block
+ * metadata
+ */
+ __u8 lbmd_opaque_offset;
+ /* Size in bytes of the T10 PI tuple associated with each interval */
+ __u8 lbmd_pi_size;
+ /* Offset in bytes of T10 PI tuple within the logical block metadata */
+ __u8 lbmd_pi_offset;
+ /* T10 PI guard tag type */
+ __u8 lbmd_guard_tag_type;
+ /* Size in bytes of the T10 PI application tag */
+ __u8 lbmd_app_tag_size;
+ /* Size in bytes of the T10 PI reference tag */
+ __u8 lbmd_ref_tag_size;
+ /* Size in bytes of the T10 PI storage tag */
+ __u8 lbmd_storage_tag_size;
+ __u8 pad;
+};
+
/* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions */
#define FILE_DEDUPE_RANGE_SAME 0
#define FILE_DEDUPE_RANGE_DIFFERS 1
@@ -247,6 +304,8 @@ struct fsxattr {
* also /sys/kernel/debug/ for filesystems with debugfs exports
*/
#define FS_IOC_GETFSSYSFSPATH _IOR(0x15, 1, struct fs_sysfs_path)
+/* Get logical block metadata capability details */
+#define FS_IOC_GETLBMD_CAP _IOWR(0x15, 2, struct logical_block_metadata_cap)
/*
* Inode flags (FS_IOC_GETFLAGS / FS_IOC_SETFLAGS)
--
2.25.1
* Re: [PATCH for-next v5 4/4] fs: add ioctl to query metadata and protection info capabilities
From: Christoph Hellwig @ 2025-06-30 13:30 UTC
To: Anuj Gupta
Cc: vincent.fu, jack, anuj1072538, axboe, viro, brauner, hch,
martin.petersen, ebiggers, adilger, linux-block, linux-fsdevel,
joshi.k, linux-nvme, linux-scsi, gost.dev
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH for-next v5 0/4] add ioctl to query metadata and protection info capabilities
From: Christian Brauner @ 2025-07-01 12:01 UTC
To: Anuj Gupta
Cc: Christian Brauner, linux-block, linux-fsdevel, joshi.k,
linux-nvme, linux-scsi, gost.dev, vincent.fu, jack, anuj1072538,
axboe, viro, hch, martin.petersen, ebiggers, adilger
On Mon, 30 Jun 2025 14:35:44 +0530, Anuj Gupta wrote:
> This patch series adds a new ioctl to query metadata and integrity
> capability.
>
> Patch 1 renames tuple_size field to metadata_size
> Patch 2 adds a pi_tuple_size field in blk_integrity struct which is later
> used to export this value to the user as well.
> Patch 3 allows computing right pi_offset value.
> Patch 4 introduces a new ioctl to query integrity capability.
>
> [...]
Applied to the vfs-6.17.integrity branch of the vfs/vfs.git tree.
Patches in the vfs-6.17.integrity branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.
Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-6.17.integrity
[1/4] block: rename tuple_size field in blk_integrity to metadata_size
https://git.kernel.org/vfs/vfs/c/c6603b1d6556
[2/4] block: introduce pi_tuple_size field in blk_integrity
https://git.kernel.org/vfs/vfs/c/76e45252a4ce
[3/4] nvme: set pi_offset only when checksum type is not BLK_INTEGRITY_CSUM_NONE
https://git.kernel.org/vfs/vfs/c/f3ee50659148
[4/4] fs: add ioctl to query metadata and protection info capabilities
https://git.kernel.org/vfs/vfs/c/9eb22f7fedfc
* Re: [PATCH for-next v5 0/4] add ioctl to query metadata and protection info capabilities
From: Christoph Hellwig @ 2025-07-22 6:36 UTC
To: Anuj Gupta
Cc: vincent.fu, jack, anuj1072538, axboe, viro, brauner, hch,
martin.petersen, ebiggers, adilger, linux-block, linux-fsdevel,
joshi.k, linux-nvme, linux-scsi, gost.dev
What's the status of the fio patches to support PI now that this
has landed?
* Re: [PATCH for-next v5 0/4] add ioctl to query metadata and protection info capabilities
From: Anuj Gupta/Anuj Gupta @ 2025-07-22 8:17 UTC
To: Christoph Hellwig
Cc: vincent.fu, jack, anuj1072538, axboe, viro, brauner,
martin.petersen, ebiggers, adilger, linux-block, linux-fsdevel,
joshi.k, linux-nvme, linux-scsi, gost.dev
On 7/22/2025 12:06 PM, Christoph Hellwig wrote:
> What's the status of the fio patches to support PI now that this
> has landed?
>
Hi Christoph,
Vincent and I have been working on the corresponding fio support, and it's
nearly ready. We plan to post the patches soon. In the meantime, here's
the WIP branch for reference:
https://github.com/vincentkfu/fio/tree/pi-block