* [PATCH 0/5] block: add tracepoints for ZBD specific operations
@ 2025-07-09 11:46 Johannes Thumshirn
2025-07-09 11:47 ` [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs Johannes Thumshirn
` (4 more replies)
0 siblings, 5 replies; 22+ messages in thread
From: Johannes Thumshirn @ 2025-07-09 11:46 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Christoph Hellwig, linux-block,
Shin'ichiro Kawasaki, Chaitanya Kulkarni, Johannes Thumshirn
Add tracepoints for operations specific to zoned block devices.
These tracepoints are also the foundation for adding zoned block device
support to blktrace, which will be sent in a follow up.
But even without the immediate support for ZBD operations in blktrace, these
tracepoints have prooven to be effective in debugging issues with filesystems
on zoned block devices.
Johannes Thumshirn (5):
blktrace: add zoned block commands to blk_fill_rwbs
block: split blk_zone_update_request_bio into two functions
block: add tracepoint for blk_zone_update_request_bio
block: add tracepoint for blkdev_zone_mgmt
block: add trace messages to zone write plugging
block/blk-mq.c | 6 ++-
block/blk-zoned.c | 23 ++++++++++
block/blk.h | 31 ++++++-------
include/trace/events/block.h | 89 ++++++++++++++++++++++++++++++++++++
kernel/trace/blktrace.c | 21 +++++++++
5 files changed, 151 insertions(+), 19 deletions(-)
--
2.50.0
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs
2025-07-09 11:46 [PATCH 0/5] block: add tracepoints for ZBD specific operations Johannes Thumshirn
@ 2025-07-09 11:47 ` Johannes Thumshirn
2025-07-09 15:30 ` Bart Van Assche
2025-07-10 4:13 ` Damien Le Moal
2025-07-09 11:47 ` [PATCH 2/5] block: split blk_zone_update_request_bio into two functions Johannes Thumshirn
` (3 subsequent siblings)
4 siblings, 2 replies; 22+ messages in thread
From: Johannes Thumshirn @ 2025-07-09 11:47 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Christoph Hellwig, linux-block,
Shin'ichiro Kawasaki, Chaitanya Kulkarni, Johannes Thumshirn
Add zoned block commands to blk_fill_rwbs:
- ZONE APPEND will be decoded as 'ZA'
- ZONE RESET and ZONE RESET ALL will be decoded as 'ZR'
- ZONE FINISH will be decoded as 'ZF'
- ZONE OPEN will be decoded as 'ZO'
- ZONE CLOSE will be decoded as 'ZC'
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
kernel/trace/blktrace.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 3f6a7bdc6edf..f1dc00c22e37 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1875,6 +1875,27 @@ void blk_fill_rwbs(char *rwbs, blk_opf_t opf)
case REQ_OP_READ:
rwbs[i++] = 'R';
break;
+ case REQ_OP_ZONE_APPEND:
+ rwbs[i++] = 'Z';
+ rwbs[i++] = 'A';
+ break;
+ case REQ_OP_ZONE_RESET:
+ case REQ_OP_ZONE_RESET_ALL:
+ rwbs[i++] = 'Z';
+ rwbs[i++] = 'R';
+ break;
+ case REQ_OP_ZONE_FINISH:
+ rwbs[i++] = 'Z';
+ rwbs[i++] = 'F';
+ break;
+ case REQ_OP_ZONE_OPEN:
+ rwbs[i++] = 'Z';
+ rwbs[i++] = 'O';
+ break;
+ case REQ_OP_ZONE_CLOSE:
+ rwbs[i++] = 'Z';
+ rwbs[i++] = 'C';
+ break;
default:
rwbs[i++] = 'N';
}
--
2.50.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/5] block: split blk_zone_update_request_bio into two functions
2025-07-09 11:46 [PATCH 0/5] block: add tracepoints for ZBD specific operations Johannes Thumshirn
2025-07-09 11:47 ` [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs Johannes Thumshirn
@ 2025-07-09 11:47 ` Johannes Thumshirn
2025-07-10 4:15 ` Damien Le Moal
2025-07-09 11:47 ` [PATCH 3/5] block: add tracepoint for blk_zone_update_request_bio Johannes Thumshirn
` (2 subsequent siblings)
4 siblings, 1 reply; 22+ messages in thread
From: Johannes Thumshirn @ 2025-07-09 11:47 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Christoph Hellwig, linux-block,
Shin'ichiro Kawasaki, Chaitanya Kulkarni, Johannes Thumshirn
blk_zone_update_request_bio() does two things. First it checks if the
request to be completed was written via ZONE APPEND and if yes it then
updates the sector to the one that the data was written to.
This is small enough to be an inline function. But upcoming changes adding
a tracepoint don't work if the function is inlined.
Split the function into two, the first is blk_req_bio_is_zone_append()
checking if the sector needs to be updated. This can still be an inline
function. The second is blk_zone_append_update_request_bio() doing the
sector update.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
block/blk-mq.c | 6 ++++--
block/blk-zoned.c | 13 +++++++++++++
block/blk.h | 31 ++++++++++++++-----------------
3 files changed, 31 insertions(+), 19 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4806b867e37d..755ea0335831 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -883,7 +883,8 @@ static void blk_complete_request(struct request *req)
/* Completion has already been traced */
bio_clear_flag(bio, BIO_TRACE_COMPLETION);
- blk_zone_update_request_bio(req, bio);
+ if (blk_req_bio_is_zone_append(req, bio))
+ blk_zone_append_update_request_bio(req, bio);
if (!is_flush)
bio_endio(bio);
@@ -982,7 +983,8 @@ bool blk_update_request(struct request *req, blk_status_t error,
/* Don't actually finish bio if it's part of flush sequence */
if (!bio->bi_iter.bi_size) {
- blk_zone_update_request_bio(req, bio);
+ if (blk_req_bio_is_zone_append(req, bio))
+ blk_zone_append_update_request_bio(req, bio);
if (!is_flush)
bio_endio(bio);
}
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 351d659280e1..e79e0357d1f1 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -1205,6 +1205,19 @@ static void disk_zone_wplug_unplug_bio(struct gendisk *disk,
spin_unlock_irqrestore(&zwplug->lock, flags);
}
+void blk_zone_append_update_request_bio(struct request *rq, struct bio *bio)
+{
+ /*
+ * For zone append requests, the request sector indicates the location
+ * at which the BIO data was written. Return this value to the BIO
+ * issuer through the BIO iter sector.
+ * For plugged zone writes, which include emulated zone append, we need
+ * the original BIO sector so that blk_zone_write_plug_bio_endio() can
+ * lookup the zone write plug.
+ */
+ bio->bi_iter.bi_sector = rq->__sector;
+}
+
void blk_zone_write_plug_bio_endio(struct bio *bio)
{
struct gendisk *disk = bio->bi_bdev->bd_disk;
diff --git a/block/blk.h b/block/blk.h
index 37ec459fe656..b4c01774df98 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -467,23 +467,15 @@ static inline bool bio_zone_write_plugging(struct bio *bio)
{
return bio_flagged(bio, BIO_ZONE_WRITE_PLUGGING);
}
-void blk_zone_write_plug_bio_merged(struct bio *bio);
-void blk_zone_write_plug_init_request(struct request *rq);
-static inline void blk_zone_update_request_bio(struct request *rq,
- struct bio *bio)
+static inline bool blk_req_bio_is_zone_append(struct request *rq,
+ struct bio *bio)
{
- /*
- * For zone append requests, the request sector indicates the location
- * at which the BIO data was written. Return this value to the BIO
- * issuer through the BIO iter sector.
- * For plugged zone writes, which include emulated zone append, we need
- * the original BIO sector so that blk_zone_write_plug_bio_endio() can
- * lookup the zone write plug.
- */
- if (req_op(rq) == REQ_OP_ZONE_APPEND ||
- bio_flagged(bio, BIO_EMULATES_ZONE_APPEND))
- bio->bi_iter.bi_sector = rq->__sector;
+ return req_op(rq) == REQ_OP_ZONE_APPEND ||
+ bio_flagged(bio, BIO_EMULATES_ZONE_APPEND);
}
+void blk_zone_write_plug_bio_merged(struct bio *bio);
+void blk_zone_write_plug_init_request(struct request *rq);
+void blk_zone_append_update_request_bio(struct request *rq, struct bio *bio);
void blk_zone_write_plug_bio_endio(struct bio *bio);
static inline void blk_zone_bio_endio(struct bio *bio)
{
@@ -516,14 +508,19 @@ static inline bool bio_zone_write_plugging(struct bio *bio)
{
return false;
}
+static inline bool blk_req_bio_is_zone_append(struct request *req,
+ struct bio *bio)
+{
+ return false;
+}
static inline void blk_zone_write_plug_bio_merged(struct bio *bio)
{
}
static inline void blk_zone_write_plug_init_request(struct request *rq)
{
}
-static inline void blk_zone_update_request_bio(struct request *rq,
- struct bio *bio)
+static inline void blk_zone_append_update_request_bio(struct request *rq,
+ struct bio *bio)
{
}
static inline void blk_zone_bio_endio(struct bio *bio)
--
2.50.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 3/5] block: add tracepoint for blk_zone_update_request_bio
2025-07-09 11:46 [PATCH 0/5] block: add tracepoints for ZBD specific operations Johannes Thumshirn
2025-07-09 11:47 ` [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs Johannes Thumshirn
2025-07-09 11:47 ` [PATCH 2/5] block: split blk_zone_update_request_bio into two functions Johannes Thumshirn
@ 2025-07-09 11:47 ` Johannes Thumshirn
2025-07-10 4:18 ` Damien Le Moal
2025-07-09 11:47 ` [PATCH 4/5] block: add tracepoint for blkdev_zone_mgmt Johannes Thumshirn
2025-07-09 11:47 ` [PATCH 5/5] block: add trace messages to zone write plugging Johannes Thumshirn
4 siblings, 1 reply; 22+ messages in thread
From: Johannes Thumshirn @ 2025-07-09 11:47 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Christoph Hellwig, linux-block,
Shin'ichiro Kawasaki, Chaitanya Kulkarni, Johannes Thumshirn
Add a tracepoint in blk_zone_update_request_bio() to trace the bio sector
update on ZONE APPEND completions.
An example for this tracepoint is as follows:
<idle>-0 [001] d.h1. 381.746444: blk_zone_update_request_bio: 259,5 ZAS 131072 () 1048832 + 256 none,0,0 [swapper/1]
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
block/blk-zoned.c | 3 +++
include/trace/events/block.h | 11 +++++++++++
2 files changed, 14 insertions(+)
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index e79e0357d1f1..2584ffb6b022 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -17,6 +17,8 @@
#include <linux/refcount.h>
#include <linux/mempool.h>
+#include <trace/events/block.h>
+
#include "blk.h"
#include "blk-mq-sched.h"
#include "blk-mq-debugfs.h"
@@ -1216,6 +1218,7 @@ void blk_zone_append_update_request_bio(struct request *rq, struct bio *bio)
* lookup the zone write plug.
*/
bio->bi_iter.bi_sector = rq->__sector;
+ trace_blk_zone_append_update_request_bio(rq);
}
void blk_zone_write_plug_bio_endio(struct bio *bio)
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 14a924c0e303..16d51f868064 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -404,6 +404,17 @@ DEFINE_EVENT(block_bio, block_getrq,
TP_ARGS(bio)
);
+/**
+ * block_zone_update_request_bio - update the bio sector after a zone append
+ * @bio: the completed block IO operation
+ *
+ * Update the bio's bi_sector after a zone append command has been completed.
+ */
+DEFINE_EVENT(block_rq, blk_zone_append_update_request_bio,
+ TP_PROTO(struct request *rq),
+ TP_ARGS(rq)
+);
+
/**
* block_plug - keep operations requests in request queue
* @q: request queue to plug
--
2.50.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 4/5] block: add tracepoint for blkdev_zone_mgmt
2025-07-09 11:46 [PATCH 0/5] block: add tracepoints for ZBD specific operations Johannes Thumshirn
` (2 preceding siblings ...)
2025-07-09 11:47 ` [PATCH 3/5] block: add tracepoint for blk_zone_update_request_bio Johannes Thumshirn
@ 2025-07-09 11:47 ` Johannes Thumshirn
2025-07-09 15:37 ` Bart Van Assche
2025-07-10 4:20 ` Damien Le Moal
2025-07-09 11:47 ` [PATCH 5/5] block: add trace messages to zone write plugging Johannes Thumshirn
4 siblings, 2 replies; 22+ messages in thread
From: Johannes Thumshirn @ 2025-07-09 11:47 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Christoph Hellwig, linux-block,
Shin'ichiro Kawasaki, Chaitanya Kulkarni, Johannes Thumshirn
Add a tracepoint for blkdev_zone_mgmt to trace zone management commands
submitted by higher layers like file systems or user space.
An example output for this tracepoint is as follows:
mkfs.btrfs-210 [000] ..... 116.727190: blkdev_zone_mgmt: 259,6 ZRS 524288 + 0
This example output shows a REQ_OP_ZONE_RESET_ALL operation submitted by
mkfs.btrfs.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
block/blk-zoned.c | 2 ++
include/trace/events/block.h | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 36 insertions(+)
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 2584ffb6b022..48f75f58d05e 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -179,6 +179,7 @@ static int blkdev_zone_reset_all(struct block_device *bdev)
struct bio bio;
bio_init(&bio, bdev, NULL, 0, REQ_OP_ZONE_RESET_ALL | REQ_SYNC);
+ trace_blkdev_zone_mgmt(&bio, 0);
return submit_bio_wait(&bio);
}
@@ -242,6 +243,7 @@ int blkdev_zone_mgmt(struct block_device *bdev, enum req_op op,
cond_resched();
}
+ trace_blkdev_zone_mgmt(bio, nr_sectors);
ret = submit_bio_wait(bio);
bio_put(bio);
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 16d51f868064..9a25b686fd09 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -599,6 +599,40 @@ TRACE_EVENT(block_rq_remap,
(unsigned long long)__entry->old_sector, __entry->nr_bios)
);
+/**
+ * blkdev_zone_mgmt - Execute a zone management operation on a range of zones
+ * @bio: The block IO operation sent down to the device
+ * @nr_sectors: The number of sectors affected by this operation
+ *
+ * Execute a zone management operation on a specified range of zones. This
+ * range is encoded in %nr_sectors, which has to be a multiple of the zone
+ * size.
+ */
+TRACE_EVENT(blkdev_zone_mgmt,
+
+ TP_PROTO(struct bio *bio, sector_t nr_sectors),
+
+ TP_ARGS(bio, nr_sectors),
+
+ TP_STRUCT__entry(
+ __field( dev_t, dev )
+ __field( sector_t, sector )
+ __field( sector_t, nr_sectors )
+ __array( char, rwbs, RWBS_LEN)
+ ),
+
+ TP_fast_assign(
+ __entry->dev = bio_dev(bio);
+ __entry->sector = bio->bi_iter.bi_sector;
+ __entry->nr_sectors = bio_sectors(bio);
+ blk_fill_rwbs(__entry->rwbs, bio->bi_opf);
+ ),
+
+ TP_printk("%d,%d %s %llu + %llu",
+ MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
+ (unsigned long long)__entry->sector,
+ __entry->nr_sectors)
+);
#endif /* _TRACE_BLOCK_H */
/* This part must be outside protection */
--
2.50.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 5/5] block: add trace messages to zone write plugging
2025-07-09 11:46 [PATCH 0/5] block: add tracepoints for ZBD specific operations Johannes Thumshirn
` (3 preceding siblings ...)
2025-07-09 11:47 ` [PATCH 4/5] block: add tracepoint for blkdev_zone_mgmt Johannes Thumshirn
@ 2025-07-09 11:47 ` Johannes Thumshirn
2025-07-10 4:23 ` Damien Le Moal
4 siblings, 1 reply; 22+ messages in thread
From: Johannes Thumshirn @ 2025-07-09 11:47 UTC (permalink / raw)
To: Jens Axboe
Cc: Damien Le Moal, Christoph Hellwig, linux-block,
Shin'ichiro Kawasaki, Chaitanya Kulkarni, Johannes Thumshirn
Add tracepoints to zone write plugging plug and unplug events.
kworker/u9:3-162 [000] d..1. 2231.939277: disk_zone_wplug_add_bio: 8,0 zone 12, BIO 6291456 + 14
kworker/0:1H-59 [000] d..1. 2231.939884: blk_zone_wplug_bio: 8,0 zone 24, BIO 12775168 + 4
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
block/blk-zoned.c | 5 ++++
include/trace/events/block.h | 44 ++++++++++++++++++++++++++++++++++++
2 files changed, 49 insertions(+)
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 48f75f58d05e..b881aadbe35f 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -822,6 +822,8 @@ static inline void disk_zone_wplug_add_bio(struct gendisk *disk,
* at the tail of the list to preserve the sequential write order.
*/
bio_list_add(&zwplug->bio_list, bio);
+ trace_disk_zone_wplug_add_bio(zwplug->disk->queue, zwplug->zone_no,
+ bio->bi_iter.bi_sector, nr_segs);
zwplug->flags |= BLK_ZONE_WPLUG_PLUGGED;
@@ -1317,6 +1319,9 @@ static void blk_zone_wplug_bio_work(struct work_struct *work)
goto put_zwplug;
}
+ trace_blk_zone_wplug_bio(zwplug->disk->queue, zwplug->zone_no,
+ bio->bi_iter.bi_sector, bio->__bi_nr_segments);
+
if (!blk_zone_wplug_prepare_bio(zwplug, bio)) {
blk_zone_wplug_bio_io_error(zwplug, bio);
goto again;
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 9a25b686fd09..16d5a87f3030 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -633,6 +633,50 @@ TRACE_EVENT(blkdev_zone_mgmt,
(unsigned long long)__entry->sector,
__entry->nr_sectors)
);
+
+DECLARE_EVENT_CLASS(block_zwplug,
+
+ TP_PROTO(struct request_queue *q, unsigned int zno, sector_t sector,
+ unsigned int nr_segs),
+
+ TP_ARGS(q, zno, sector, nr_segs),
+
+ TP_STRUCT__entry(
+ __field( dev_t, dev )
+ __field( unsigned int, zno )
+ __field( sector_t, sector )
+ __field( unsigned int, nr_segs )
+ ),
+
+ TP_fast_assign(
+ __entry->dev = disk_devt(q->disk);
+ __entry->zno = zno;
+ __entry->sector = sector;
+ __entry->nr_segs = nr_segs;
+ ),
+
+ TP_printk("%d,%d zone %u, BIO %llu + %u",
+ MAJOR(__entry->dev), MINOR(__entry->dev), __entry->zno,
+ (unsigned long long)__entry->sector,
+ __entry->nr_segs)
+);
+
+DEFINE_EVENT(block_zwplug, disk_zone_wplug_add_bio,
+
+ TP_PROTO(struct request_queue *q, unsigned int zno, sector_t sector,
+ unsigned int nr_segs),
+
+ TP_ARGS(q, zno, sector, nr_segs)
+);
+
+DEFINE_EVENT(block_zwplug, blk_zone_wplug_bio,
+
+ TP_PROTO(struct request_queue *q, unsigned int zno, sector_t sector,
+ unsigned int nr_segs),
+
+ TP_ARGS(q, zno, sector, nr_segs)
+);
+
#endif /* _TRACE_BLOCK_H */
/* This part must be outside protection */
--
2.50.0
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs
2025-07-09 11:47 ` [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs Johannes Thumshirn
@ 2025-07-09 15:30 ` Bart Van Assche
2025-07-10 5:55 ` Johannes Thumshirn
2025-07-10 4:13 ` Damien Le Moal
1 sibling, 1 reply; 22+ messages in thread
From: Bart Van Assche @ 2025-07-09 15:30 UTC (permalink / raw)
To: Johannes Thumshirn, Jens Axboe
Cc: Damien Le Moal, Christoph Hellwig, linux-block,
Shin'ichiro Kawasaki, Chaitanya Kulkarni
On 7/9/25 4:47 AM, Johannes Thumshirn wrote:
> Add zoned block commands to blk_fill_rwbs:
>
> - ZONE APPEND will be decoded as 'ZA'
> - ZONE RESET and ZONE RESET ALL will be decoded as 'ZR'
> - ZONE FINISH will be decoded as 'ZF'
> - ZONE OPEN will be decoded as 'ZO'
> - ZONE CLOSE will be decoded as 'ZC'
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
> kernel/trace/blktrace.c | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 3f6a7bdc6edf..f1dc00c22e37 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -1875,6 +1875,27 @@ void blk_fill_rwbs(char *rwbs, blk_opf_t opf)
> case REQ_OP_READ:
> rwbs[i++] = 'R';
> break;
> + case REQ_OP_ZONE_APPEND:
> + rwbs[i++] = 'Z';
> + rwbs[i++] = 'A';
> + break;
> + case REQ_OP_ZONE_RESET:
> + case REQ_OP_ZONE_RESET_ALL:
> + rwbs[i++] = 'Z';
> + rwbs[i++] = 'R';
> + break;
> + case REQ_OP_ZONE_FINISH:
> + rwbs[i++] = 'Z';
> + rwbs[i++] = 'F';
> + break;
> + case REQ_OP_ZONE_OPEN:
> + rwbs[i++] = 'Z';
> + rwbs[i++] = 'O';
> + break;
> + case REQ_OP_ZONE_CLOSE:
> + rwbs[i++] = 'Z';
> + rwbs[i++] = 'C';
> + break;
> default:
> rwbs[i++] = 'N';
> }
Has it been considered to add a warning statement in blk_fill_rwbs()
that verifies that blk_fill_rwbs() does not write outside the bounds of
the rwbs array? See also the RWBS_LEN definition.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 4/5] block: add tracepoint for blkdev_zone_mgmt
2025-07-09 11:47 ` [PATCH 4/5] block: add tracepoint for blkdev_zone_mgmt Johannes Thumshirn
@ 2025-07-09 15:37 ` Bart Van Assche
2025-07-10 4:10 ` Damien Le Moal
2025-07-10 4:20 ` Damien Le Moal
1 sibling, 1 reply; 22+ messages in thread
From: Bart Van Assche @ 2025-07-09 15:37 UTC (permalink / raw)
To: Johannes Thumshirn, Jens Axboe
Cc: Damien Le Moal, Christoph Hellwig, linux-block,
Shin'ichiro Kawasaki, Chaitanya Kulkarni
On 7/9/25 4:47 AM, Johannes Thumshirn wrote:
> + TP_printk("%d,%d %s %llu + %llu",
> + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
> + (unsigned long long)__entry->sector,
> + __entry->nr_sectors)
sector_t is a synonym for u64. u64 is defined as unsigned long in
include/uapi/asm-generic/int-l64.h and is defined as unsigned long long
in include/uapi/asm-generic/int-ll64.h. Kernel code always includes
the int-ll64.h header file. In other words, I think the above cast is
superfluous for all CPU architectures supported by the Linux kernel.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 4/5] block: add tracepoint for blkdev_zone_mgmt
2025-07-09 15:37 ` Bart Van Assche
@ 2025-07-10 4:10 ` Damien Le Moal
2025-07-10 15:46 ` Bart Van Assche
0 siblings, 1 reply; 22+ messages in thread
From: Damien Le Moal @ 2025-07-10 4:10 UTC (permalink / raw)
To: Bart Van Assche, Johannes Thumshirn, Jens Axboe
Cc: Christoph Hellwig, linux-block, Shin'ichiro Kawasaki,
Chaitanya Kulkarni
On 7/10/25 12:37 AM, Bart Van Assche wrote:
> On 7/9/25 4:47 AM, Johannes Thumshirn wrote:
>> + TP_printk("%d,%d %s %llu + %llu",
>> + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
>> + (unsigned long long)__entry->sector,
>> + __entry->nr_sectors)
>
> sector_t is a synonym for u64. u64 is defined as unsigned long in
> include/uapi/asm-generic/int-l64.h and is defined as unsigned long long
> in include/uapi/asm-generic/int-ll64.h. Kernel code always includes
> the int-ll64.h header file. In other words, I think the above cast is
> superfluous for all CPU architectures supported by the Linux kernel.
%llu format will not work on 32-bits arch.
>
> Thanks,
>
> Bart.
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs
2025-07-09 11:47 ` [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs Johannes Thumshirn
2025-07-09 15:30 ` Bart Van Assche
@ 2025-07-10 4:13 ` Damien Le Moal
2025-07-10 9:59 ` Johannes Thumshirn
1 sibling, 1 reply; 22+ messages in thread
From: Damien Le Moal @ 2025-07-10 4:13 UTC (permalink / raw)
To: Johannes Thumshirn, Jens Axboe
Cc: Christoph Hellwig, linux-block, Shin'ichiro Kawasaki,
Chaitanya Kulkarni
On 7/9/25 8:47 PM, Johannes Thumshirn wrote:
> Add zoned block commands to blk_fill_rwbs:
>
> - ZONE APPEND will be decoded as 'ZA'
> - ZONE RESET and ZONE RESET ALL will be decoded as 'ZR'
> - ZONE FINISH will be decoded as 'ZF'
> - ZONE OPEN will be decoded as 'ZO'
> - ZONE CLOSE will be decoded as 'ZC'
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
> kernel/trace/blktrace.c | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 3f6a7bdc6edf..f1dc00c22e37 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -1875,6 +1875,27 @@ void blk_fill_rwbs(char *rwbs, blk_opf_t opf)
> case REQ_OP_READ:
> rwbs[i++] = 'R';
> break;
May be enclode this new hunk under a #ifdef CONFIG_BLK_DEV_ZONED ?
> + case REQ_OP_ZONE_APPEND:
> + rwbs[i++] = 'Z';
> + rwbs[i++] = 'A';
> + break;
> + case REQ_OP_ZONE_RESET:
> + case REQ_OP_ZONE_RESET_ALL:
> + rwbs[i++] = 'Z';
> + rwbs[i++] = 'R';
I would really prefer the ability to distinguish single zone reset and all
zones reset... Are we limited to 2 chars for the operation name ? If not,
making REQ_OP_ZONE_RESET_ALL be "ZRA" would be better I think. If you want to
preserve the 2 chars for the op name, then maybe ... no goo idea... Naming is
hard :)
> + break;
> + case REQ_OP_ZONE_FINISH:
> + rwbs[i++] = 'Z';
> + rwbs[i++] = 'F';
> + break;
> + case REQ_OP_ZONE_OPEN:
> + rwbs[i++] = 'Z';
> + rwbs[i++] = 'O';
> + break;
> + case REQ_OP_ZONE_CLOSE:
> + rwbs[i++] = 'Z';
> + rwbs[i++] = 'C';
> + break;
> default:
> rwbs[i++] = 'N';
> }
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/5] block: split blk_zone_update_request_bio into two functions
2025-07-09 11:47 ` [PATCH 2/5] block: split blk_zone_update_request_bio into two functions Johannes Thumshirn
@ 2025-07-10 4:15 ` Damien Le Moal
0 siblings, 0 replies; 22+ messages in thread
From: Damien Le Moal @ 2025-07-10 4:15 UTC (permalink / raw)
To: Johannes Thumshirn, Jens Axboe
Cc: Christoph Hellwig, linux-block, Shin'ichiro Kawasaki,
Chaitanya Kulkarni
On 7/9/25 8:47 PM, Johannes Thumshirn wrote:
> blk_zone_update_request_bio() does two things. First it checks if the
> request to be completed was written via ZONE APPEND and if yes it then
> updates the sector to the one that the data was written to.
>
> This is small enough to be an inline function. But upcoming changes adding
> a tracepoint don't work if the function is inlined.
>
> Split the function into two, the first is blk_req_bio_is_zone_append()
> checking if the sector needs to be updated. This can still be an inline
> function. The second is blk_zone_append_update_request_bio() doing the
> sector update.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 3/5] block: add tracepoint for blk_zone_update_request_bio
2025-07-09 11:47 ` [PATCH 3/5] block: add tracepoint for blk_zone_update_request_bio Johannes Thumshirn
@ 2025-07-10 4:18 ` Damien Le Moal
0 siblings, 0 replies; 22+ messages in thread
From: Damien Le Moal @ 2025-07-10 4:18 UTC (permalink / raw)
To: Johannes Thumshirn, Jens Axboe
Cc: Christoph Hellwig, linux-block, Shin'ichiro Kawasaki,
Chaitanya Kulkarni
On 7/9/25 8:47 PM, Johannes Thumshirn wrote:
> Add a tracepoint in blk_zone_update_request_bio() to trace the bio sector
> update on ZONE APPEND completions.
>
> An example for this tracepoint is as follows:
>
> <idle>-0 [001] d.h1. 381.746444: blk_zone_update_request_bio: 259,5 ZAS 131072 () 1048832 + 256 none,0,0 [swapper/1]
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 4/5] block: add tracepoint for blkdev_zone_mgmt
2025-07-09 11:47 ` [PATCH 4/5] block: add tracepoint for blkdev_zone_mgmt Johannes Thumshirn
2025-07-09 15:37 ` Bart Van Assche
@ 2025-07-10 4:20 ` Damien Le Moal
1 sibling, 0 replies; 22+ messages in thread
From: Damien Le Moal @ 2025-07-10 4:20 UTC (permalink / raw)
To: Johannes Thumshirn, Jens Axboe
Cc: Christoph Hellwig, linux-block, Shin'ichiro Kawasaki,
Chaitanya Kulkarni
On 7/9/25 8:47 PM, Johannes Thumshirn wrote:
> Add a tracepoint for blkdev_zone_mgmt to trace zone management commands
> submitted by higher layers like file systems or user space.
>
> An example output for this tracepoint is as follows:
>
> mkfs.btrfs-210 [000] ..... 116.727190: blkdev_zone_mgmt: 259,6 ZRS 524288 + 0
>
> This example output shows a REQ_OP_ZONE_RESET_ALL operation submitted by
> mkfs.btrfs.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Modulo the remark about zone reset all op name in the trace, this looks OK to me.
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 5/5] block: add trace messages to zone write plugging
2025-07-09 11:47 ` [PATCH 5/5] block: add trace messages to zone write plugging Johannes Thumshirn
@ 2025-07-10 4:23 ` Damien Le Moal
2025-07-10 6:27 ` Johannes Thumshirn
0 siblings, 1 reply; 22+ messages in thread
From: Damien Le Moal @ 2025-07-10 4:23 UTC (permalink / raw)
To: Johannes Thumshirn, Jens Axboe
Cc: Christoph Hellwig, linux-block, Shin'ichiro Kawasaki,
Chaitanya Kulkarni
On 7/9/25 8:47 PM, Johannes Thumshirn wrote:
> Add tracepoints to zone write plugging plug and unplug events.
>
> kworker/u9:3-162 [000] d..1. 2231.939277: disk_zone_wplug_add_bio: 8,0 zone 12, BIO 6291456 + 14
> kworker/0:1H-59 [000] d..1. 2231.939884: blk_zone_wplug_bio: 8,0 zone 24, BIO 12775168 + 4
This is showing "+ bio->__bi_nr_segments", which is odd...
Is this intentional ? Why not the "+ bio_sectors(bio)" ?
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
> block/blk-zoned.c | 5 ++++
> include/trace/events/block.h | 44 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 49 insertions(+)
>
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index 48f75f58d05e..b881aadbe35f 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -822,6 +822,8 @@ static inline void disk_zone_wplug_add_bio(struct gendisk *disk,
> * at the tail of the list to preserve the sequential write order.
> */
> bio_list_add(&zwplug->bio_list, bio);
> + trace_disk_zone_wplug_add_bio(zwplug->disk->queue, zwplug->zone_no,
> + bio->bi_iter.bi_sector, nr_segs);
>
> zwplug->flags |= BLK_ZONE_WPLUG_PLUGGED;
>
> @@ -1317,6 +1319,9 @@ static void blk_zone_wplug_bio_work(struct work_struct *work)
> goto put_zwplug;
> }
>
> + trace_blk_zone_wplug_bio(zwplug->disk->queue, zwplug->zone_no,
> + bio->bi_iter.bi_sector, bio->__bi_nr_segments);
> +
> if (!blk_zone_wplug_prepare_bio(zwplug, bio)) {
> blk_zone_wplug_bio_io_error(zwplug, bio);
> goto again;
> diff --git a/include/trace/events/block.h b/include/trace/events/block.h
> index 9a25b686fd09..16d5a87f3030 100644
> --- a/include/trace/events/block.h
> +++ b/include/trace/events/block.h
> @@ -633,6 +633,50 @@ TRACE_EVENT(blkdev_zone_mgmt,
> (unsigned long long)__entry->sector,
> __entry->nr_sectors)
> );
> +
> +DECLARE_EVENT_CLASS(block_zwplug,
> +
> + TP_PROTO(struct request_queue *q, unsigned int zno, sector_t sector,
> + unsigned int nr_segs),
> +
> + TP_ARGS(q, zno, sector, nr_segs),
> +
> + TP_STRUCT__entry(
> + __field( dev_t, dev )
> + __field( unsigned int, zno )
> + __field( sector_t, sector )
> + __field( unsigned int, nr_segs )
> + ),
> +
> + TP_fast_assign(
> + __entry->dev = disk_devt(q->disk);
> + __entry->zno = zno;
> + __entry->sector = sector;
> + __entry->nr_segs = nr_segs;
> + ),
> +
> + TP_printk("%d,%d zone %u, BIO %llu + %u",
> + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->zno,
> + (unsigned long long)__entry->sector,
> + __entry->nr_segs)
> +);
> +
> +DEFINE_EVENT(block_zwplug, disk_zone_wplug_add_bio,
> +
> + TP_PROTO(struct request_queue *q, unsigned int zno, sector_t sector,
> + unsigned int nr_segs),
> +
> + TP_ARGS(q, zno, sector, nr_segs)
> +);
> +
> +DEFINE_EVENT(block_zwplug, blk_zone_wplug_bio,
> +
> + TP_PROTO(struct request_queue *q, unsigned int zno, sector_t sector,
> + unsigned int nr_segs),
> +
> + TP_ARGS(q, zno, sector, nr_segs)
> +);
> +
> #endif /* _TRACE_BLOCK_H */
>
> /* This part must be outside protection */
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs
2025-07-09 15:30 ` Bart Van Assche
@ 2025-07-10 5:55 ` Johannes Thumshirn
2025-07-10 15:48 ` Bart Van Assche
0 siblings, 1 reply; 22+ messages in thread
From: Johannes Thumshirn @ 2025-07-10 5:55 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe
Cc: Damien Le Moal, hch, linux-block@vger.kernel.org,
Shinichiro Kawasaki, Chaitanya Kulkarni
On 09.07.25 17:31, Bart Van Assche wrote:
> On 7/9/25 4:47 AM, Johannes Thumshirn wrote:
>> Add zoned block commands to blk_fill_rwbs:
>>
>> - ZONE APPEND will be decoded as 'ZA'
>> - ZONE RESET and ZONE RESET ALL will be decoded as 'ZR'
>> - ZONE FINISH will be decoded as 'ZF'
>> - ZONE OPEN will be decoded as 'ZO'
>> - ZONE CLOSE will be decoded as 'ZC'
>>
>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>> ---
>> kernel/trace/blktrace.c | 21 +++++++++++++++++++++
>> 1 file changed, 21 insertions(+)
>>
>> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
>> index 3f6a7bdc6edf..f1dc00c22e37 100644
>> --- a/kernel/trace/blktrace.c
>> +++ b/kernel/trace/blktrace.c
>> @@ -1875,6 +1875,27 @@ void blk_fill_rwbs(char *rwbs, blk_opf_t opf)
>> case REQ_OP_READ:
>> rwbs[i++] = 'R';
>> break;
>> + case REQ_OP_ZONE_APPEND:
>> + rwbs[i++] = 'Z';
>> + rwbs[i++] = 'A';
>> + break;
>> + case REQ_OP_ZONE_RESET:
>> + case REQ_OP_ZONE_RESET_ALL:
>> + rwbs[i++] = 'Z';
>> + rwbs[i++] = 'R';
>> + break;
>> + case REQ_OP_ZONE_FINISH:
>> + rwbs[i++] = 'Z';
>> + rwbs[i++] = 'F';
>> + break;
>> + case REQ_OP_ZONE_OPEN:
>> + rwbs[i++] = 'Z';
>> + rwbs[i++] = 'O';
>> + break;
>> + case REQ_OP_ZONE_CLOSE:
>> + rwbs[i++] = 'Z';
>> + rwbs[i++] = 'C';
>> + break;
>> default:
>> rwbs[i++] = 'N';
>> }
>
> Has it been considered to add a warning statement in blk_fill_rwbs()
> that verifies that blk_fill_rwbs() does not write outside the bounds of
> the rwbs array? See also the RWBS_LEN definition.
$ git grep -E "#define\sRWBS_LEN"
include/trace/events/block.h:#define RWBS_LEN 9
So even if we would have
opf = (REQ_PREFLUSH | REQ_OP_ZONE_APPEND | REQ_FUA | REQ_RAHEAD |
REQ_SYNC | REQ_META | REQ_ATOMIC);
it'll be 8 including the trailing \0 it'll be 9.
If you look closely, REQ_OP_SECURE_ERASE already is 'DE' so no changes.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 5/5] block: add trace messages to zone write plugging
2025-07-10 4:23 ` Damien Le Moal
@ 2025-07-10 6:27 ` Johannes Thumshirn
0 siblings, 0 replies; 22+ messages in thread
From: Johannes Thumshirn @ 2025-07-10 6:27 UTC (permalink / raw)
To: Damien Le Moal, Jens Axboe
Cc: hch, linux-block@vger.kernel.org, Shinichiro Kawasaki,
Chaitanya Kulkarni
On 10.07.25 06:26, Damien Le Moal wrote:
> On 7/9/25 8:47 PM, Johannes Thumshirn wrote:
>> Add tracepoints to zone write plugging plug and unplug events.
>>
>> kworker/u9:3-162 [000] d..1. 2231.939277: disk_zone_wplug_add_bio: 8,0 zone 12, BIO 6291456 + 14
>> kworker/0:1H-59 [000] d..1. 2231.939884: blk_zone_wplug_bio: 8,0 zone 24, BIO 12775168 + 4
>
> This is showing "+ bio->__bi_nr_segments", which is odd...
> Is this intentional ? Why not the "+ bio_sectors(bio)" ?
Because I'm dumb I guess.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs
2025-07-10 4:13 ` Damien Le Moal
@ 2025-07-10 9:59 ` Johannes Thumshirn
2025-07-10 10:39 ` Damien Le Moal
0 siblings, 1 reply; 22+ messages in thread
From: Johannes Thumshirn @ 2025-07-10 9:59 UTC (permalink / raw)
To: Damien Le Moal, Jens Axboe
Cc: hch, linux-block@vger.kernel.org, Shinichiro Kawasaki,
Chaitanya Kulkarni
On 10.07.25 06:16, Damien Le Moal wrote:
> On 7/9/25 8:47 PM, Johannes Thumshirn wrote:
>> Add zoned block commands to blk_fill_rwbs:
>>
>> - ZONE APPEND will be decoded as 'ZA'
>> - ZONE RESET and ZONE RESET ALL will be decoded as 'ZR'
>> - ZONE FINISH will be decoded as 'ZF'
>> - ZONE OPEN will be decoded as 'ZO'
>> - ZONE CLOSE will be decoded as 'ZC'
>>
>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>> ---
>> kernel/trace/blktrace.c | 21 +++++++++++++++++++++
>> 1 file changed, 21 insertions(+)
>>
>> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
>> index 3f6a7bdc6edf..f1dc00c22e37 100644
>> --- a/kernel/trace/blktrace.c
>> +++ b/kernel/trace/blktrace.c
>> @@ -1875,6 +1875,27 @@ void blk_fill_rwbs(char *rwbs, blk_opf_t opf)
>> case REQ_OP_READ:
>> rwbs[i++] = 'R';
>> break;
>
> May be enclode this new hunk under a #ifdef CONFIG_BLK_DEV_ZONED ?
I'd prefer not to a) I dislike #ifdef in .c files and b) in other cases
(i.e. submit_bio_noacct() we don't ifdef as either.
Of cause none of this is a no-go, so it's all personal preference and
I'll do whatever Jens wants here.
>> + case REQ_OP_ZONE_APPEND:
>> + rwbs[i++] = 'Z';
>> + rwbs[i++] = 'A';
>> + break;
>> + case REQ_OP_ZONE_RESET:
>> + case REQ_OP_ZONE_RESET_ALL:
>> + rwbs[i++] = 'Z';
>> + rwbs[i++] = 'R';
>
> I would really prefer the ability to distinguish single zone reset and all
> zones reset... Are we limited to 2 chars for the operation name ? If not,
> making REQ_OP_ZONE_RESET_ALL be "ZRA" would be better I think. If you want to
> preserve the 2 chars for the op name, then maybe ... no goo idea... Naming is
> hard :)
Again, I'd prefer the 2 chars version, as a RESET ALL can be
distinguished form a plain RESET by the number of sectors (see patch
4/5) and we would need to bump RWBS_LEN.
But of cause this as well is personal preference and I can go either
way. Whatever is preferred by everyone.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs
2025-07-10 9:59 ` Johannes Thumshirn
@ 2025-07-10 10:39 ` Damien Le Moal
0 siblings, 0 replies; 22+ messages in thread
From: Damien Le Moal @ 2025-07-10 10:39 UTC (permalink / raw)
To: Johannes Thumshirn, Jens Axboe
Cc: hch, linux-block@vger.kernel.org, Shinichiro Kawasaki,
Chaitanya Kulkarni
On 7/10/25 6:59 PM, Johannes Thumshirn wrote:
>>> + case REQ_OP_ZONE_APPEND:
>>> + rwbs[i++] = 'Z';
>>> + rwbs[i++] = 'A';
>>> + break;
>>> + case REQ_OP_ZONE_RESET:
>>> + case REQ_OP_ZONE_RESET_ALL:
>>> + rwbs[i++] = 'Z';
>>> + rwbs[i++] = 'R';
>>
>> I would really prefer the ability to distinguish single zone reset and all
>> zones reset... Are we limited to 2 chars for the operation name ? If not,
>> making REQ_OP_ZONE_RESET_ALL be "ZRA" would be better I think. If you want to
>> preserve the 2 chars for the op name, then maybe ... no goo idea... Naming is
>> hard :)
>
> Again, I'd prefer the 2 chars version, as a RESET ALL can be
> distinguished form a plain RESET by the number of sectors (see patch
> 4/5) and we would need to bump RWBS_LEN.
Hu ? both reset and reset all use 0 sectors for the BIO. Reset all always also
use sector == 0, so that means that you cannot distinguish a reset all from a
single zone reset of the first zone... OK, this is really just that. But still,
given the different op code, a different name would be nice.
>
> But of cause this as well is personal preference and I can go either
> way. Whatever is preferred by everyone.
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 4/5] block: add tracepoint for blkdev_zone_mgmt
2025-07-10 4:10 ` Damien Le Moal
@ 2025-07-10 15:46 ` Bart Van Assche
0 siblings, 0 replies; 22+ messages in thread
From: Bart Van Assche @ 2025-07-10 15:46 UTC (permalink / raw)
To: Damien Le Moal, Johannes Thumshirn, Jens Axboe
Cc: Christoph Hellwig, linux-block, Shin'ichiro Kawasaki,
Chaitanya Kulkarni
On 7/9/25 9:10 PM, Damien Le Moal wrote:
> On 7/10/25 12:37 AM, Bart Van Assche wrote:
>> On 7/9/25 4:47 AM, Johannes Thumshirn wrote:
>>> + TP_printk("%d,%d %s %llu + %llu",
>>> + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs,
>>> + (unsigned long long)__entry->sector,
>>> + __entry->nr_sectors)
>>
>> sector_t is a synonym for u64. u64 is defined as unsigned long in
>> include/uapi/asm-generic/int-l64.h and is defined as unsigned long long
>> in include/uapi/asm-generic/int-ll64.h. Kernel code always includes
>> the int-ll64.h header file. In other words, I think the above cast is
>> superfluous for all CPU architectures supported by the Linux kernel.
>
> %llu format will not work on 32-bits arch.
Huh? I think it would be a severe bug in the code in lib/vsprintf.c if
it would not support the %llu format on 32-bits architectures. Is there
perhaps a misunderstanding?
Bart.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs
2025-07-10 5:55 ` Johannes Thumshirn
@ 2025-07-10 15:48 ` Bart Van Assche
2025-07-14 10:23 ` Johannes Thumshirn
0 siblings, 1 reply; 22+ messages in thread
From: Bart Van Assche @ 2025-07-10 15:48 UTC (permalink / raw)
To: Johannes Thumshirn, Jens Axboe
Cc: Damien Le Moal, hch, linux-block@vger.kernel.org,
Shinichiro Kawasaki, Chaitanya Kulkarni
On 7/9/25 10:55 PM, Johannes Thumshirn wrote:
> On 09.07.25 17:31, Bart Van Assche wrote:
>> Has it been considered to add a warning statement in blk_fill_rwbs()
>> that verifies that blk_fill_rwbs() does not write outside the bounds of
>> the rwbs array? See also the RWBS_LEN definition.
>
> $ git grep -E "#define\sRWBS_LEN"
> include/trace/events/block.h:#define RWBS_LEN 9
>
> So even if we would have
>
> opf = (REQ_PREFLUSH | REQ_OP_ZONE_APPEND | REQ_FUA | REQ_RAHEAD |
> REQ_SYNC | REQ_META | REQ_ATOMIC);
>
> it'll be 8 including the trailing \0 it'll be 9.
>
> If you look closely, REQ_OP_SECURE_ERASE already is 'DE' so no changes.
It seems like my comment was not clear enough. I am aware that the
current code does not trigger a buffer overflow. Adding a length check
would help in my opinion because:
- It would catch potential future changes of blk_fill_rwbs() that
introduce a buffer overflow.
- It would document the length of the rwbs output buffer. Today there
are no references to RWBS_LEN in the blk_fill_rwbs() function -
neither in the code nor in any comments.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs
2025-07-10 15:48 ` Bart Van Assche
@ 2025-07-14 10:23 ` Johannes Thumshirn
2025-07-14 16:00 ` Bart Van Assche
0 siblings, 1 reply; 22+ messages in thread
From: Johannes Thumshirn @ 2025-07-14 10:23 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe
Cc: Damien Le Moal, hch, linux-block@vger.kernel.org,
Shinichiro Kawasaki, Chaitanya Kulkarni
On 10.07.25 17:48, Bart Van Assche wrote:
> On 7/9/25 10:55 PM, Johannes Thumshirn wrote:
>> On 09.07.25 17:31, Bart Van Assche wrote:
>>> Has it been considered to add a warning statement in blk_fill_rwbs()
>>> that verifies that blk_fill_rwbs() does not write outside the bounds of
>>> the rwbs array? See also the RWBS_LEN definition.
>>
>> $ git grep -E "#define\sRWBS_LEN"
>> include/trace/events/block.h:#define RWBS_LEN 9
>>
>> So even if we would have
>>
>> opf = (REQ_PREFLUSH | REQ_OP_ZONE_APPEND | REQ_FUA | REQ_RAHEAD |
>> REQ_SYNC | REQ_META | REQ_ATOMIC);
>>
>> it'll be 8 including the trailing \0 it'll be 9.
>>
>> If you look closely, REQ_OP_SECURE_ERASE already is 'DE' so no changes.
>
> It seems like my comment was not clear enough. I am aware that the
> current code does not trigger a buffer overflow. Adding a length check
> would help in my opinion because:
> - It would catch potential future changes of blk_fill_rwbs() that
> introduce a buffer overflow.
> - It would document the length of the rwbs output buffer. Today there
> are no references to RWBS_LEN in the blk_fill_rwbs() function -
> neither in the code nor in any comments.
Do you mean something like this:
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index f1dc00c22e37..ac30cb83067f 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -1911,6 +1911,8 @@ void blk_fill_rwbs(char *rwbs, blk_opf_t opf)
if (opf & REQ_ATOMIC)
rwbs[i++] = 'U';
+ WARN_ON_ONCE(i >= RWBS_LEN);
+
rwbs[i] = '\0';
}
EXPORT_SYMBOL_GPL(blk_fill_rwbs);
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs
2025-07-14 10:23 ` Johannes Thumshirn
@ 2025-07-14 16:00 ` Bart Van Assche
0 siblings, 0 replies; 22+ messages in thread
From: Bart Van Assche @ 2025-07-14 16:00 UTC (permalink / raw)
To: Johannes Thumshirn, Jens Axboe
Cc: Damien Le Moal, hch, linux-block@vger.kernel.org,
Shinichiro Kawasaki, Chaitanya Kulkarni
On 7/14/25 3:23 AM, Johannes Thumshirn wrote:
> Do you mean something like this:
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index f1dc00c22e37..ac30cb83067f 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -1911,6 +1911,8 @@ void blk_fill_rwbs(char *rwbs, blk_opf_t opf)
> if (opf & REQ_ATOMIC)
> rwbs[i++] = 'U';
>
> + WARN_ON_ONCE(i >= RWBS_LEN);
> +
> rwbs[i] = '\0';
> }
> EXPORT_SYMBOL_GPL(blk_fill_rwbs);
Yes, thanks!
Bart.
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2025-07-14 16:00 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-09 11:46 [PATCH 0/5] block: add tracepoints for ZBD specific operations Johannes Thumshirn
2025-07-09 11:47 ` [PATCH 1/5] blktrace: add zoned block commands to blk_fill_rwbs Johannes Thumshirn
2025-07-09 15:30 ` Bart Van Assche
2025-07-10 5:55 ` Johannes Thumshirn
2025-07-10 15:48 ` Bart Van Assche
2025-07-14 10:23 ` Johannes Thumshirn
2025-07-14 16:00 ` Bart Van Assche
2025-07-10 4:13 ` Damien Le Moal
2025-07-10 9:59 ` Johannes Thumshirn
2025-07-10 10:39 ` Damien Le Moal
2025-07-09 11:47 ` [PATCH 2/5] block: split blk_zone_update_request_bio into two functions Johannes Thumshirn
2025-07-10 4:15 ` Damien Le Moal
2025-07-09 11:47 ` [PATCH 3/5] block: add tracepoint for blk_zone_update_request_bio Johannes Thumshirn
2025-07-10 4:18 ` Damien Le Moal
2025-07-09 11:47 ` [PATCH 4/5] block: add tracepoint for blkdev_zone_mgmt Johannes Thumshirn
2025-07-09 15:37 ` Bart Van Assche
2025-07-10 4:10 ` Damien Le Moal
2025-07-10 15:46 ` Bart Van Assche
2025-07-10 4:20 ` Damien Le Moal
2025-07-09 11:47 ` [PATCH 5/5] block: add trace messages to zone write plugging Johannes Thumshirn
2025-07-10 4:23 ` Damien Le Moal
2025-07-10 6:27 ` Johannes Thumshirn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).