All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	Ming Lei <ming.lei@redhat.com>, Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 5.4 47/57] block/diskstats: more accurate approximation of io_ticks for slow disks
Date: Mon,  5 Oct 2020 17:26:59 +0200	[thread overview]
Message-ID: <20201005142112.070155997@linuxfoundation.org> (raw)
In-Reply-To: <20201005142109.796046410@linuxfoundation.org>

From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

commit 2b8bd423614c595540eaadcfbc702afe8e155e50 upstream.

Currently io_ticks is approximated by adding one at each start and end of
requests if jiffies counter has changed. This works perfectly for requests
shorter than a jiffy or if one of requests starts/ends at each jiffy.

If disk executes just one request at a time and they are longer than two
jiffies then only first and last jiffies will be accounted.

Fix is simple: at the end of request add up into io_ticks jiffies passed
since last update rather than just one jiffy.

Example: common HDD executes random read 4k requests around 12ms.

fio --name=test --filename=/dev/sdb --rw=randread --direct=1 --runtime=30 &
iostat -x 10 sdb

Note changes of iostat's "%util" 8,43% -> 99,99% before/after patch:

Before:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0,00     0,00   82,60    0,00   330,40     0,00     8,00     0,96   12,09   12,09    0,00   1,02   8,43

After:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0,00     0,00   82,50    0,00   330,00     0,00     8,00     1,00   12,10   12,10    0,00  12,12  99,99

Now io_ticks does not loose time between start and end of requests, but
for queue-depth > 1 some I/O time between adjacent starts might be lost.

For load estimation "%util" is not as useful as average queue length,
but it clearly shows how often disk queue is completely empty.

Fixes: 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
From: "Banerjee, Debabrata" <dbanerje@akamai.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 Documentation/admin-guide/iostats.rst |    5 ++++-
 block/bio.c                           |    8 ++++----
 block/blk-core.c                      |    4 ++--
 include/linux/genhd.h                 |    2 +-
 4 files changed, 11 insertions(+), 8 deletions(-)

--- a/Documentation/admin-guide/iostats.rst
+++ b/Documentation/admin-guide/iostats.rst
@@ -99,7 +99,7 @@ Field 10 -- # of milliseconds spent doin
 
     Since 5.0 this field counts jiffies when at least one request was
     started or completed. If request runs more than 2 jiffies then some
-    I/O time will not be accounted unless there are other requests.
+    I/O time might be not accounted in case of concurrent requests.
 
 Field 11 -- weighted # of milliseconds spent doing I/Os
     This field is incremented at each I/O start, I/O completion, I/O
@@ -133,6 +133,9 @@ are summed (possibly overflowing the uns
 summed to) and the result given to the user.  There is no convenient
 user interface for accessing the per-CPU counters themselves.
 
+Since 4.19 request times are measured with nanoseconds precision and
+truncated to milliseconds before showing in this interface.
+
 Disks vs Partitions
 -------------------
 
--- a/block/bio.c
+++ b/block/bio.c
@@ -1754,14 +1754,14 @@ defer:
 	schedule_work(&bio_dirty_work);
 }
 
-void update_io_ticks(struct hd_struct *part, unsigned long now)
+void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
 {
 	unsigned long stamp;
 again:
 	stamp = READ_ONCE(part->stamp);
 	if (unlikely(stamp != now)) {
 		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp)) {
-			__part_stat_add(part, io_ticks, 1);
+			__part_stat_add(part, io_ticks, end ? now - stamp : 1);
 		}
 	}
 	if (part->partno) {
@@ -1777,7 +1777,7 @@ void generic_start_io_acct(struct reques
 
 	part_stat_lock();
 
-	update_io_ticks(part, jiffies);
+	update_io_ticks(part, jiffies, false);
 	part_stat_inc(part, ios[sgrp]);
 	part_stat_add(part, sectors[sgrp], sectors);
 	part_inc_in_flight(q, part, op_is_write(op));
@@ -1795,7 +1795,7 @@ void generic_end_io_acct(struct request_
 
 	part_stat_lock();
 
-	update_io_ticks(part, now);
+	update_io_ticks(part, now, true);
 	part_stat_add(part, nsecs[sgrp], jiffies_to_nsecs(duration));
 	part_stat_add(part, time_in_queue, duration);
 	part_dec_in_flight(q, part, op_is_write(req_op));
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1334,7 +1334,7 @@ void blk_account_io_done(struct request
 		part_stat_lock();
 		part = req->part;
 
-		update_io_ticks(part, jiffies);
+		update_io_ticks(part, jiffies, true);
 		part_stat_inc(part, ios[sgrp]);
 		part_stat_add(part, nsecs[sgrp], now - req->start_time_ns);
 		part_stat_add(part, time_in_queue, nsecs_to_jiffies64(now - req->start_time_ns));
@@ -1376,7 +1376,7 @@ void blk_account_io_start(struct request
 		rq->part = part;
 	}
 
-	update_io_ticks(part, jiffies);
+	update_io_ticks(part, jiffies, false);
 
 	part_stat_unlock();
 }
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -419,7 +419,7 @@ static inline void free_part_info(struct
 	kfree(part->info);
 }
 
-void update_io_ticks(struct hd_struct *part, unsigned long now);
+void update_io_ticks(struct hd_struct *part, unsigned long now, bool end);
 
 /* block/genhd.c */
 extern void device_add_disk(struct device *parent, struct gendisk *disk,



  parent reply	other threads:[~2020-10-05 15:31 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-05 15:26 [PATCH 5.4 00/57] 5.4.70-rc1 review Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 01/57] btrfs: fix filesystem corruption after a device replace Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 02/57] mmc: sdhci: Workaround broken command queuing on Intel GLK based IRBIS models Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 03/57] USB: gadget: f_ncm: Fix NDP16 datagram validation Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 04/57] gpio: siox: explicitly support only threaded irqs Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 05/57] gpio: mockup: fix resource leak in error path Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 06/57] gpio: tc35894: fix up tc35894 interrupt configuration Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 07/57] clk: socfpga: stratix10: fix the divider for the emac_ptp_free_clk Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 08/57] vsock/virtio: add transport parameter to the virtio_transport_reset_no_sock() Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 09/57] net: virtio_vsock: Enhance connection semantics Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 10/57] xfs: trim IO to found COW extent limit Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 11/57] Input: i8042 - add nopnp quirk for Acer Aspire 5 A515 Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 12/57] iio: adc: qcom-spmi-adc5: fix driver name Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 13/57] ftrace: Move RCU is watching check after recursion check Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 14/57] memstick: Skip allocating card when removing host Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 15/57] drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 16/57] clocksource/drivers/timer-gx6605s: Fixup counter reload Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 17/57] libbpf: Remove arch-specific include path in Makefile Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 18/57] drivers/net/wan/hdlc_fr: Add needed_headroom for PVC devices Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 19/57] drm/sun4i: mixer: Extend regmap max_register Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 20/57] net: dec: de2104x: Increase receive ring size for Tulip Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 21/57] rndis_host: increase sleep time in the query-response loop Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 22/57] nvme-core: get/put ctrl and transport module in nvme_dev_open/release() Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 23/57] fuse: fix the ->direct_IO() treatment of iov_iter Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 24/57] drivers/net/wan/lapbether: Make skb->protocol consistent with the header Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 25/57] drivers/net/wan/hdlc: Set skb->protocol before transmitting Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 26/57] mac80211: Fix radiotap header channel flag for 6GHz band Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 27/57] mac80211: do not allow bigger VHT MPDUs than the hardware supports Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 28/57] tracing: Make the space reserved for the pid wider Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 29/57] tools/io_uring: fix compile breakage Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 30/57] spi: fsl-espi: Only process interrupts for expected events Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 31/57] nvme-pci: fix NULL req in completion handler Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 32/57] nvme-fc: fail new connections to a deleted host or remote port Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 33/57] gpio: sprd: Clear interrupt when setting the type as edge Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 34/57] phy: ti: am654: Fix a leak in serdes_am654_probe() Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 35/57] pinctrl: mvebu: Fix i2c sda definition for 98DX3236 Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 36/57] nfs: Fix security label length not being reset Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 37/57] clk: tegra: Always program PLL_E when enabled Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 38/57] clk: samsung: exynos4: mark chipid clock as CLK_IGNORE_UNUSED Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 39/57] iommu/exynos: add missing put_device() call in exynos_iommu_of_xlate() Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 40/57] gpio/aspeed-sgpio: enable access to all 80 input & output sgpios Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 41/57] gpio/aspeed-sgpio: dont enable all interrupts by default Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 42/57] gpio: aspeed: fix ast2600 bank properties Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 43/57] i2c: cpm: Fix i2c_ram structure Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 44/57] Input: trackpoint - enable Synaptics trackpoints Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 45/57] scripts/dtc: only append to HOST_EXTRACFLAGS instead of overwriting Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 46/57] random32: Restore __latent_entropy attribute on net_rand_state Greg Kroah-Hartman
2020-10-05 15:26 ` Greg Kroah-Hartman [this message]
2020-10-05 15:27 ` [PATCH 5.4 48/57] mm: replace memmap_context by meminit_context Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 49/57] mm: dont rely on system state to detect hot-plug operations Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 50/57] nvme: Cleanup and rename nvme_block_nr() Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 51/57] nvme: Introduce nvme_lba_to_sect() Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 52/57] nvme: consolidate chunk_sectors settings Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 53/57] epoll: do not insert into poll queues until all sanity checks are done Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 54/57] epoll: replace ->visited/visited_list with generation count Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 55/57] epoll: EPOLL_CTL_ADD: close the race in decision to take fast path Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 56/57] ep_create_wakeup_source(): dentry name can change under you Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 57/57] netfilter: ctnetlink: add a range check for l3/l4 protonum Greg Kroah-Hartman
2020-10-05 17:50 ` [PATCH 5.4 00/57] 5.4.70-rc1 review Jon Hunter
2020-10-06  0:22 ` Shuah Khan
2020-10-06  5:54 ` Naresh Kamboju
2020-10-06  8:25   ` Naresh Kamboju
2020-10-06 14:49     ` Ben Hutchings
2020-10-06 18:17 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201005142112.070155997@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=axboe@kernel.dk \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.