qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Denis V. Lunev" <den@openvz.org>
To: Max Reitz <mreitz@redhat.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [PATCH 2/3] block: add logging facility for long standing IO requests
Date: Thu, 17 Sep 2020 18:27:11 +0300	[thread overview]
Message-ID: <063e05ed-a942-a8f4-354e-c88dcabd2663@openvz.org> (raw)
In-Reply-To: <1971fbff-b255-d0c1-bc07-553242512bcd@redhat.com>

On 9/17/20 4:56 PM, Max Reitz wrote:
> On 10.08.20 12:14, Denis V. Lunev wrote:
>> There are severe delays with IO requests processing if QEMU is running in
>> virtual machine or over software defined storage. Such delays potentially
>> results in unpredictable guest behavior. For example, guests over IDE or
>> SATA drive could remount filesystem read-only if write is performed
>> longer than 10 seconds.
>>
>> Such reports are very complex to process. Some good starting point for this
>> seems quite reasonable. This patch provides one. It adds logging of such
>> potentially dangerous long IO operations.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Max Reitz <mreitz@redhat.com>
>> ---
>>  block/accounting.c         | 59 +++++++++++++++++++++++++++++++++++++-
>>  blockdev.c                 |  7 ++++-
>>  include/block/accounting.h |  5 +++-
>>  3 files changed, 68 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/accounting.c b/block/accounting.c
>> index 8d41c8a83a..24f48c84b8 100644
>> --- a/block/accounting.c
>> +++ b/block/accounting.c
> [...]
>
>> @@ -182,6 +185,58 @@ void block_latency_histograms_clear(BlockAcctStats *stats)
>>      }
>>  }
>>  
>> +static const char *block_account_type(enum BlockAcctType type)
>> +{
>> +    switch (type) {
>> +    case BLOCK_ACCT_READ:
>> +        return "READ";
>> +    case BLOCK_ACCT_WRITE:
>> +        return "WRITE";
>> +    case BLOCK_ACCT_FLUSH:
>> +        return "DISCARD";
>> +    case BLOCK_ACCT_UNMAP:
>> +        return "TRUNCATE";
> I don’t understand how FLUSH translates to DISCARD, and UNMAP to TRUNCATE.
>

Ha, seems I have made copy/pasted and missed this.
Nice catch.

>> +    case BLOCK_ACCT_NONE:
>> +        return "NONE";
>> +    case BLOCK_MAX_IOTYPE:
>> +        break;
>> +    }
>> +    return "UNKNOWN";
>> +}
>> +
>> +static void block_acct_report_long(BlockAcctStats *stats,
>> +                                   BlockAcctCookie *cookie, int64_t latency_ns)
>> +{
>> +    unsigned latency_ms = latency_ns / SCALE_MS;
>> +    int64_t start_time_host_s;
>> +    char buf[64];
>> +    struct tm t;
>> +    BlockDriverState *bs;
>> +
>> +    if (cookie->type == BLOCK_ACCT_NONE) {
>> +        return;
>> +    }
>> +    if (stats->latency_log_threshold_ms == 0) {
>> +        return;
>> +    }
>> +    if (latency_ms < stats->latency_log_threshold_ms) {
>> +        return;
>> +    }
>> +
>> +    start_time_host_s = cookie->start_time_ns / NANOSECONDS_PER_SECOND;
>> +    strftime(buf, sizeof(buf), "%m-%d %H:%M:%S",
>> +             localtime_r(&start_time_host_s, &t));
>> +
>> +    bs = blk_bs(blk_stats2blk(stats));
>> +    qemu_log("long %s[%ld] IO request: %d.03%d since %s.%03d bs: %s(%s, %s)\n",
>> +             block_account_type(cookie->type), cookie->bytes,
>> +             (int)(latency_ms / 1000), (int)(latency_ms % 1000), buf,
> Why not just latency_ms * .001f and %.3f?
I personally dislike floating point, this leads to
a bit strange output.


>
>> +             (int)((cookie->start_time_ns / 1000000) % 1000),
> s/1000000/SCALE_MS/? (But I’m not sure whether that’s what the SCALE_?S
> are for.)
yes, you right.

>> +             bs == NULL ? "unknown" : bdrv_get_node_name(bs),
>> +             bs == NULL ? "unknown" : bdrv_get_format_name(bs),
>> +             bs == NULL ? "unknown" : bs->filename);
> Now that I’m writing this response already, I wonder whether a QMP event
> wouldn’t be nice.  (But considering that accounting apparently just
> doesn’t with -blockdev, I suppose that’s not that big of a worry.)
may be. Why not?

Though there is some note. Usually after the stall there
is a bunch of requests completed at the same time
once the storage is unfrozen, thus some aggregation
would be required.

I think that this should be made separately.
>> +}
>> +
>>  static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
>>                                   bool failed)
>>  {
> [...]
>
>> diff --git a/blockdev.c b/blockdev.c
>> index 3848a9c8ab..66158d1292 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -622,7 +622,8 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
>>  
>>          bs->detect_zeroes = detect_zeroes;
>>  
>> -        block_acct_setup(blk_get_stats(blk), account_invalid, account_failed);
>> +        block_acct_setup(blk_get_stats(blk), account_invalid, account_failed,
>> +                qemu_opt_get_number(opts, "latency-log-threshold", 0));
> latency_log_threshold_ms is an unsigned int and so this will silently
> overflow for values >= 2^32.
>
> (Or for user-specified values < 0, which are wrapped around.)
>
>>  
>>          if (!parse_stats_intervals(blk_get_stats(blk), interval_list, errp)) {
>>              blk_unref(blk);
>> @@ -3727,6 +3728,10 @@ QemuOptsList qemu_common_drive_opts = {
>>              .type = QEMU_OPT_BOOL,
>>              .help = "whether to account for failed I/O operations "
>>                      "in the statistics",
>> +        },{
>> +            .name = "latency-log-threshold",
>> +            .type = QEMU_OPT_NUMBER,
>> +            .help = "threshold for long I/O report (disabled if <=0), in ms",
> Because of that overflow, negative values will not necessarily disable
> reporting, because truncating them to 32 bit may make them small
> absolute values again.
>
> In any case, I’d just say “disabled if 0”, and not mention anything
> about <0.
will do.

Thanks for review!

Den


  reply	other threads:[~2020-09-17 15:29 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-10 10:14 [PATCH v2 for 5.2 0/3] block: add logging facility for long standing IO requests Denis V. Lunev
2020-08-10 10:14 ` [PATCH 1/3] block/block-backend: add converter from BlockAcctStats to BlockBackend Denis V. Lunev
2020-09-17 14:02   ` Max Reitz
2020-08-10 10:14 ` [PATCH 2/3] block: add logging facility for long standing IO requests Denis V. Lunev
2020-08-20  8:03   ` David Edmondson
2020-08-24  9:01     ` Denis V. Lunev
2020-09-17 13:56   ` Max Reitz
2020-09-17 15:27     ` Denis V. Lunev [this message]
2020-08-10 10:14 ` [PATCH 3/3] block: enable long IO requests report by default Denis V. Lunev
2020-09-17 14:03   ` Max Reitz
2020-08-12 14:00 ` [PATCH v2 for 5.2 0/3] block: add logging facility for long standing IO requests Stefan Hajnoczi
2020-08-20  7:37   ` Denis V. Lunev
  -- strict thread matches above, loose matches on Subject: below --
2020-08-05 10:08 [PATCH v2 " Denis V. Lunev
2020-08-05 10:08 ` [PATCH 2/3] " Denis V. Lunev
2020-08-05 13:51   ` Philippe Mathieu-Daudé
2020-08-05 14:00     ` Denis V. Lunev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=063e05ed-a942-a8f4-354e-c88dcabd2663@openvz.org \
    --to=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).