stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	Ming Lei <ming.lei@redhat.com>, Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 5.4 47/57] block/diskstats: more accurate approximation of io_ticks for slow disks
Date: Mon,  5 Oct 2020 17:26:59 +0200	[thread overview]
Message-ID: <20201005142112.070155997@linuxfoundation.org> (raw)
In-Reply-To: <20201005142109.796046410@linuxfoundation.org>

From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

commit 2b8bd423614c595540eaadcfbc702afe8e155e50 upstream.

Currently io_ticks is approximated by adding one at each start and end of
requests if jiffies counter has changed. This works perfectly for requests
shorter than a jiffy or if one of requests starts/ends at each jiffy.

If disk executes just one request at a time and they are longer than two
jiffies then only first and last jiffies will be accounted.

Fix is simple: at the end of request add up into io_ticks jiffies passed
since last update rather than just one jiffy.

Example: common HDD executes random read 4k requests around 12ms.

fio --name=test --filename=/dev/sdb --rw=randread --direct=1 --runtime=30 &
iostat -x 10 sdb

Note changes of iostat's "%util" 8,43% -> 99,99% before/after patch:

Before:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0,00     0,00   82,60    0,00   330,40     0,00     8,00     0,96   12,09   12,09    0,00   1,02   8,43

After:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0,00     0,00   82,50    0,00   330,00     0,00     8,00     1,00   12,10   12,10    0,00  12,12  99,99

Now io_ticks does not loose time between start and end of requests, but
for queue-depth > 1 some I/O time between adjacent starts might be lost.

For load estimation "%util" is not as useful as average queue length,
but it clearly shows how often disk queue is completely empty.

Fixes: 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
From: "Banerjee, Debabrata" <dbanerje@akamai.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 Documentation/admin-guide/iostats.rst |    5 ++++-
 block/bio.c                           |    8 ++++----
 block/blk-core.c                      |    4 ++--
 include/linux/genhd.h                 |    2 +-
 4 files changed, 11 insertions(+), 8 deletions(-)

--- a/Documentation/admin-guide/iostats.rst
+++ b/Documentation/admin-guide/iostats.rst
@@ -99,7 +99,7 @@ Field 10 -- # of milliseconds spent doin
 
     Since 5.0 this field counts jiffies when at least one request was
     started or completed. If request runs more than 2 jiffies then some
-    I/O time will not be accounted unless there are other requests.
+    I/O time might be not accounted in case of concurrent requests.
 
 Field 11 -- weighted # of milliseconds spent doing I/Os
     This field is incremented at each I/O start, I/O completion, I/O
@@ -133,6 +133,9 @@ are summed (possibly overflowing the uns
 summed to) and the result given to the user.  There is no convenient
 user interface for accessing the per-CPU counters themselves.
 
+Since 4.19 request times are measured with nanoseconds precision and
+truncated to milliseconds before showing in this interface.
+
 Disks vs Partitions
 -------------------
 
--- a/block/bio.c
+++ b/block/bio.c
@@ -1754,14 +1754,14 @@ defer:
 	schedule_work(&bio_dirty_work);
 }
 
-void update_io_ticks(struct hd_struct *part, unsigned long now)
+void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
 {
 	unsigned long stamp;
 again:
 	stamp = READ_ONCE(part->stamp);
 	if (unlikely(stamp != now)) {
 		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp)) {
-			__part_stat_add(part, io_ticks, 1);
+			__part_stat_add(part, io_ticks, end ? now - stamp : 1);
 		}
 	}
 	if (part->partno) {
@@ -1777,7 +1777,7 @@ void generic_start_io_acct(struct reques
 
 	part_stat_lock();
 
-	update_io_ticks(part, jiffies);
+	update_io_ticks(part, jiffies, false);
 	part_stat_inc(part, ios[sgrp]);
 	part_stat_add(part, sectors[sgrp], sectors);
 	part_inc_in_flight(q, part, op_is_write(op));
@@ -1795,7 +1795,7 @@ void generic_end_io_acct(struct request_
 
 	part_stat_lock();
 
-	update_io_ticks(part, now);
+	update_io_ticks(part, now, true);
 	part_stat_add(part, nsecs[sgrp], jiffies_to_nsecs(duration));
 	part_stat_add(part, time_in_queue, duration);
 	part_dec_in_flight(q, part, op_is_write(req_op));
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1334,7 +1334,7 @@ void blk_account_io_done(struct request
 		part_stat_lock();
 		part = req->part;
 
-		update_io_ticks(part, jiffies);
+		update_io_ticks(part, jiffies, true);
 		part_stat_inc(part, ios[sgrp]);
 		part_stat_add(part, nsecs[sgrp], now - req->start_time_ns);
 		part_stat_add(part, time_in_queue, nsecs_to_jiffies64(now - req->start_time_ns));
@@ -1376,7 +1376,7 @@ void blk_account_io_start(struct request
 		rq->part = part;
 	}
 
-	update_io_ticks(part, jiffies);
+	update_io_ticks(part, jiffies, false);
 
 	part_stat_unlock();
 }
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -419,7 +419,7 @@ static inline void free_part_info(struct
 	kfree(part->info);
 }
 
-void update_io_ticks(struct hd_struct *part, unsigned long now);
+void update_io_ticks(struct hd_struct *part, unsigned long now, bool end);
 
 /* block/genhd.c */
 extern void device_add_disk(struct device *parent, struct gendisk *disk,



  parent reply	other threads:[~2020-10-05 15:31 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-05 15:26 [PATCH 5.4 00/57] 5.4.70-rc1 review Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 01/57] btrfs: fix filesystem corruption after a device replace Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 02/57] mmc: sdhci: Workaround broken command queuing on Intel GLK based IRBIS models Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 03/57] USB: gadget: f_ncm: Fix NDP16 datagram validation Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 04/57] gpio: siox: explicitly support only threaded irqs Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 05/57] gpio: mockup: fix resource leak in error path Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 06/57] gpio: tc35894: fix up tc35894 interrupt configuration Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 07/57] clk: socfpga: stratix10: fix the divider for the emac_ptp_free_clk Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 08/57] vsock/virtio: add transport parameter to the virtio_transport_reset_no_sock() Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 09/57] net: virtio_vsock: Enhance connection semantics Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 10/57] xfs: trim IO to found COW extent limit Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 11/57] Input: i8042 - add nopnp quirk for Acer Aspire 5 A515 Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 12/57] iio: adc: qcom-spmi-adc5: fix driver name Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 13/57] ftrace: Move RCU is watching check after recursion check Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 14/57] memstick: Skip allocating card when removing host Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 15/57] drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 16/57] clocksource/drivers/timer-gx6605s: Fixup counter reload Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 17/57] libbpf: Remove arch-specific include path in Makefile Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 18/57] drivers/net/wan/hdlc_fr: Add needed_headroom for PVC devices Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 19/57] drm/sun4i: mixer: Extend regmap max_register Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 20/57] net: dec: de2104x: Increase receive ring size for Tulip Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 21/57] rndis_host: increase sleep time in the query-response loop Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 22/57] nvme-core: get/put ctrl and transport module in nvme_dev_open/release() Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 23/57] fuse: fix the ->direct_IO() treatment of iov_iter Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 24/57] drivers/net/wan/lapbether: Make skb->protocol consistent with the header Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 25/57] drivers/net/wan/hdlc: Set skb->protocol before transmitting Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 26/57] mac80211: Fix radiotap header channel flag for 6GHz band Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 27/57] mac80211: do not allow bigger VHT MPDUs than the hardware supports Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 28/57] tracing: Make the space reserved for the pid wider Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 29/57] tools/io_uring: fix compile breakage Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 30/57] spi: fsl-espi: Only process interrupts for expected events Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 31/57] nvme-pci: fix NULL req in completion handler Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 32/57] nvme-fc: fail new connections to a deleted host or remote port Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 33/57] gpio: sprd: Clear interrupt when setting the type as edge Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 34/57] phy: ti: am654: Fix a leak in serdes_am654_probe() Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 35/57] pinctrl: mvebu: Fix i2c sda definition for 98DX3236 Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 36/57] nfs: Fix security label length not being reset Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 37/57] clk: tegra: Always program PLL_E when enabled Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 38/57] clk: samsung: exynos4: mark chipid clock as CLK_IGNORE_UNUSED Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 39/57] iommu/exynos: add missing put_device() call in exynos_iommu_of_xlate() Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 40/57] gpio/aspeed-sgpio: enable access to all 80 input & output sgpios Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 41/57] gpio/aspeed-sgpio: dont enable all interrupts by default Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 42/57] gpio: aspeed: fix ast2600 bank properties Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 43/57] i2c: cpm: Fix i2c_ram structure Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 44/57] Input: trackpoint - enable Synaptics trackpoints Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 45/57] scripts/dtc: only append to HOST_EXTRACFLAGS instead of overwriting Greg Kroah-Hartman
2020-10-05 15:26 ` [PATCH 5.4 46/57] random32: Restore __latent_entropy attribute on net_rand_state Greg Kroah-Hartman
2020-10-05 15:26 ` Greg Kroah-Hartman [this message]
2020-10-05 15:27 ` [PATCH 5.4 48/57] mm: replace memmap_context by meminit_context Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 49/57] mm: dont rely on system state to detect hot-plug operations Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 50/57] nvme: Cleanup and rename nvme_block_nr() Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 51/57] nvme: Introduce nvme_lba_to_sect() Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 52/57] nvme: consolidate chunk_sectors settings Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 53/57] epoll: do not insert into poll queues until all sanity checks are done Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 54/57] epoll: replace ->visited/visited_list with generation count Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 55/57] epoll: EPOLL_CTL_ADD: close the race in decision to take fast path Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 56/57] ep_create_wakeup_source(): dentry name can change under you Greg Kroah-Hartman
2020-10-05 15:27 ` [PATCH 5.4 57/57] netfilter: ctnetlink: add a range check for l3/l4 protonum Greg Kroah-Hartman
2020-10-05 17:50 ` [PATCH 5.4 00/57] 5.4.70-rc1 review Jon Hunter
2020-10-06  0:22 ` Shuah Khan
2020-10-06  5:54 ` Naresh Kamboju
2020-10-06  8:25   ` Naresh Kamboju
2020-10-06 14:49     ` Ben Hutchings
2020-10-06 18:17 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201005142112.070155997@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=axboe@kernel.dk \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).