* [PATCH v2 1/2] tracing/events: block: track and print if unplug was explicit or schedule
2018-04-13 16:36 [PATCH v2 0/2] tracing/events: block: bring more on a par with blktrace Steffen Maier
@ 2018-04-13 16:36 ` Steffen Maier
2018-04-13 16:36 ` [PATCH v2 2/2] tracing/events: block: dev_t via driver core for plug and unplug events Steffen Maier
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Steffen Maier @ 2018-04-13 16:36 UTC (permalink / raw)
To: linux-kernel, linux-block
Cc: Steven Rostedt, Ingo Molnar, Jens Axboe, Li Zefan,
Greg Kroah-Hartman, Steffen Maier
Just like blktrace distinguishes explicit and schedule by means of
BLK_TA_UNPLUG_IO and BLK_TA_UNPLUG_TIMER, actually make use of the
existing argument "explicit" to distinguish the two cases in the one
common tracepoint block_unplug.
Complements v2.6.39 commit 49cac01e1fa7 ("block: make unplug timer trace
event correspond to the schedule() unplug") and commit d9c978331790
("block: remove block_unplug_timer() trace point").
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
---
Changes since v1:
Use 0 and 1 instead of false and true for __print_symbolic table.
Now "trace-cmd report" can decode it. [Steven Rostedt]
include/trace/events/block.h | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 81b43f5bdf23..e90bb6eb8097 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -470,6 +470,11 @@ TRACE_EVENT(block_plug,
TP_printk("[%s]", __entry->comm)
);
+#define show_block_unplug_explicit(val) \
+ __print_symbolic(val, \
+ {0, "schedule"}, \
+ {1, "explicit"})
+
DECLARE_EVENT_CLASS(block_unplug,
TP_PROTO(struct request_queue *q, unsigned int depth, bool explicit),
@@ -478,15 +483,18 @@ DECLARE_EVENT_CLASS(block_unplug,
TP_STRUCT__entry(
__field( int, nr_rq )
+ __field( bool, explicit )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
__entry->nr_rq = depth;
+ __entry->explicit = explicit;
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
- TP_printk("[%s] %d", __entry->comm, __entry->nr_rq)
+ TP_printk("[%s] %d %s", __entry->comm, __entry->nr_rq,
+ show_block_unplug_explicit(__entry->explicit))
);
/**
--
2.13.5
^ permalink raw reply related [flat|nested] 5+ messages in thread* [PATCH v2 2/2] tracing/events: block: dev_t via driver core for plug and unplug events
2018-04-13 16:36 [PATCH v2 0/2] tracing/events: block: bring more on a par with blktrace Steffen Maier
2018-04-13 16:36 ` [PATCH v2 1/2] tracing/events: block: track and print if unplug was explicit or schedule Steffen Maier
@ 2018-04-13 16:36 ` Steffen Maier
2018-04-13 16:36 ` [RFC v2] tracing/events: block: also try to get dev_t via driver core for some events Steffen Maier
2018-04-13 17:50 ` [PATCH v2 0/2] tracing/events: block: bring more on a par with blktrace Steven Rostedt
3 siblings, 0 replies; 5+ messages in thread
From: Steffen Maier @ 2018-04-13 16:36 UTC (permalink / raw)
To: linux-kernel, linux-block
Cc: Steven Rostedt, Ingo Molnar, Jens Axboe, Li Zefan,
Greg Kroah-Hartman, Steffen Maier
Complements v2.6.31 commit 55782138e47d ("tracing/events: convert block
trace points to TRACE_EVENT()") to be equivalent to traditional blktrace
output. Also this allows event filtering to not always get all (un)plug
events.
NB: The NULL pointer check for q->kobj.parent is certainly racy and
I don't have enough experience if it's good enough for a trace event.
The change did work for my cases (block device read/write I/O on
zfcp-attached SCSI disks and dm-mpath on top).
While I haven't seen any prior art using driver core (parent) relations
for trace events, there are other cases using this when no direct pointer
exists between objects, such as:
#define to_scsi_target(d) container_of(d, struct scsi_target, dev)
static inline struct scsi_target *scsi_target(struct scsi_device *sdev)
{
return to_scsi_target(sdev->sdev_gendev.parent);
}
This is the object model we make use of here:
struct gendisk {
struct hd_struct {
struct device { /*container_of*/
struct kobject kobj; <--+
dev_t devt; /*deref*/ |
} __dev; |
} part0; |
struct request_queue *queue; ..+ |
} : |
: |
struct request_queue { <..............+ |
/* queue kobject */ |
struct kobject { |
struct kobject *parent; --------+
} kobj;
}
The parent pointer comes from:
#define disk_to_dev(disk) (&(disk)->part0.__dev)
int blk_register_queue(struct gendisk *disk)
struct device *dev = disk_to_dev(disk);
struct request_queue *q = disk->queue;
ret = kobject_add(&q->kobj, kobject_get(&dev->kobj), "%s", "queue");
^^^parent
$ ls -d /sys/block/sdf/queue
/sys/block/sda/queue
$ cat /sys/block/sdf/dev
80:0
A partition does not have its own request queue:
$ cat /sys/block/sdf/sdf1/dev
8:81
$ ls -d /sys/block/sdf/sdf1/queue
ls: cannot access '/sys/block/sdf/sdf1/queue': No such file or directory
The difference to blktrace parsed output is that block events don't use the
partition's minor number but the containing block device's minor number:
$ dd if=/dev/sdf1 count=1
$ cat /sys/kernel/debug/tracing/trace
block_bio_remap: 8,80 R 2048 + 32 <- (8,81) 0
block_bio_queue: 8,80 R 2048 + 32 [dd]
block_getrq: 8,80 R 2048 + 32 [dd]
block_plug: 8,80 [dd]
^^^^
block_rq_insert: 8,80 R 16384 () 2048 + 32 [dd]
block_unplug: 8,80 [dd] 1 explicit
^^^^
block_rq_issue: 8,80 R 16384 () 2048 + 32 [dd]
block_rq_complete: 8,80 R () 2048 + 32 [0]
$ btrace /dev/sdf1
8,80 1 1 0.000000000 240240 A R 2048 + 32 <- (8,81) 0
8,81 1 2 0.000220890 240240 Q R 2048 + 32 [dd]
8,81 1 3 0.000229639 240240 G R 2048 + 32 [dd]
8,81 1 4 0.000231805 240240 P N [dd]
^^
8,81 1 5 0.000234671 240240 I R 2048 + 32 [dd]
8,81 1 6 0.000236365 240240 U N [dd] 1
^^
8,81 1 7 0.000238527 240240 D R 2048 + 32 [dd]
8,81 2 2 0.000613741 0 C R 2048 + 32 [0]
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
---
include/trace/events/block.h | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index e90bb6eb8097..6ea5a3899c2e 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -460,14 +460,18 @@ TRACE_EVENT(block_plug,
TP_ARGS(q),
TP_STRUCT__entry(
+ __field( dev_t, dev )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
+ __entry->dev = q->kobj.parent ?
+ container_of(q->kobj.parent, struct device, kobj)->devt : 0;
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
- TP_printk("[%s]", __entry->comm)
+ TP_printk("%d,%d [%s]",
+ MAJOR(__entry->dev), MINOR(__entry->dev), __entry->comm)
);
#define show_block_unplug_explicit(val) \
@@ -482,18 +486,23 @@ DECLARE_EVENT_CLASS(block_unplug,
TP_ARGS(q, depth, explicit),
TP_STRUCT__entry(
+ __field( dev_t, dev )
__field( int, nr_rq )
__field( bool, explicit )
__array( char, comm, TASK_COMM_LEN )
),
TP_fast_assign(
+ __entry->dev = q->kobj.parent ?
+ container_of(q->kobj.parent, struct device, kobj)->devt : 0;
__entry->nr_rq = depth;
__entry->explicit = explicit;
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
- TP_printk("[%s] %d %s", __entry->comm, __entry->nr_rq,
+ TP_printk("%d,%d [%s] %d %s",
+ MAJOR(__entry->dev), MINOR(__entry->dev),
+ __entry->comm, __entry->nr_rq,
show_block_unplug_explicit(__entry->explicit))
);
--
2.13.5
^ permalink raw reply related [flat|nested] 5+ messages in thread* [RFC v2] tracing/events: block: also try to get dev_t via driver core for some events
2018-04-13 16:36 [PATCH v2 0/2] tracing/events: block: bring more on a par with blktrace Steffen Maier
2018-04-13 16:36 ` [PATCH v2 1/2] tracing/events: block: track and print if unplug was explicit or schedule Steffen Maier
2018-04-13 16:36 ` [PATCH v2 2/2] tracing/events: block: dev_t via driver core for plug and unplug events Steffen Maier
@ 2018-04-13 16:36 ` Steffen Maier
2018-04-13 17:50 ` [PATCH v2 0/2] tracing/events: block: bring more on a par with blktrace Steven Rostedt
3 siblings, 0 replies; 5+ messages in thread
From: Steffen Maier @ 2018-04-13 16:36 UTC (permalink / raw)
To: linux-kernel, linux-block
Cc: Steven Rostedt, Ingo Molnar, Jens Axboe, Li Zefan,
Greg Kroah-Hartman, Steffen Maier
Complements v2.6.31 commit 55782138e47d ("tracing/events: convert block
trace points to TRACE_EVENT()") for cases where rq->rq_disk == NULL:
block_rq_requeue, block_rq_insert, block_rq_issue;
and for cases where bio == NULL:
block_getrq, block_sleeprq.
NB: The NULL pointer check for q->kobj.parent is certainly racy and
I don't have enough experience if it's good enough for a trace event.
Since I don't know when above cases would occur,
I'm not sure this is worth it.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
---
include/trace/events/block.h | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 6ea5a3899c2e..001e4f369df9 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -86,7 +86,9 @@ TRACE_EVENT(block_rq_requeue,
),
TP_fast_assign(
- __entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) : 0;
+ __entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) :
+ (q->kobj.parent ?
+ container_of(q->kobj.parent, struct device, kobj)->devt : 0);
__entry->sector = blk_rq_trace_sector(rq);
__entry->nr_sector = blk_rq_trace_nr_sectors(rq);
@@ -162,7 +164,9 @@ DECLARE_EVENT_CLASS(block_rq,
),
TP_fast_assign(
- __entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) : 0;
+ __entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) :
+ (q->kobj.parent ?
+ container_of(q->kobj.parent, struct device, kobj)->devt : 0);
__entry->sector = blk_rq_trace_sector(rq);
__entry->nr_sector = blk_rq_trace_nr_sectors(rq);
__entry->bytes = blk_rq_bytes(rq);
@@ -397,7 +401,9 @@ DECLARE_EVENT_CLASS(block_get_rq,
),
TP_fast_assign(
- __entry->dev = bio ? bio_dev(bio) : 0;
+ __entry->dev = bio ? bio_dev(bio) :
+ (q->kobj.parent ?
+ container_of(q->kobj.parent, struct device, kobj)->devt : 0);
__entry->sector = bio ? bio->bi_iter.bi_sector : 0;
__entry->nr_sector = bio ? bio_sectors(bio) : 0;
blk_fill_rwbs(__entry->rwbs,
--
2.13.5
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH v2 0/2] tracing/events: block: bring more on a par with blktrace
2018-04-13 16:36 [PATCH v2 0/2] tracing/events: block: bring more on a par with blktrace Steffen Maier
` (2 preceding siblings ...)
2018-04-13 16:36 ` [RFC v2] tracing/events: block: also try to get dev_t via driver core for some events Steffen Maier
@ 2018-04-13 17:50 ` Steven Rostedt
3 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2018-04-13 17:50 UTC (permalink / raw)
To: Steffen Maier
Cc: linux-kernel, linux-block, Ingo Molnar, Jens Axboe, Li Zefan,
Greg Kroah-Hartman
On Fri, 13 Apr 2018 18:36:20 +0200
Steffen Maier <maier@linux.ibm.com> wrote:
> I had the need to understand I/O request processing in detail.
> But I also had the need to enrich block traces with other trace events
> including my own dynamic kprobe events. So I preferred block trace events
> over blktrace to get everything nicely sorted into one ftrace output.
> However, I missed device filtering for (un)plug events and also
> the difference between the two flavors of unplug.
>
> The first two patches bring block trace events closer to their
> counterpart in blktrace tooling.
>
> The last patch is just an RFC. I still kept it in this patch set because
> it is inspired by PATCH 2/2.
>
> Changes since v1:
> [PATCH v2 1/2]
> Use 0 and 1 instead of false and true for __print_symbolic table.
> Now "trace-cmd report" can decode it. [Steven Rostedt]
>
> Steffen Maier (3):
> tracing/events: block: track and print if unplug was explicit or
> schedule
> tracing/events: block: dev_t via driver core for plug and unplug
> events
> tracing/events: block: also try to get dev_t via driver core for some
> events
>
> include/trace/events/block.h | 33 ++++++++++++++++++++++++++++-----
> 1 file changed, 28 insertions(+), 5 deletions(-)
>
From just the tracing point of view:
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
as for the race conditions in the block layer, I'll let someone else
decided that.
-- Steve
^ permalink raw reply [flat|nested] 5+ messages in thread