qemu-devel.nongnu.org archive mirror
From: "Benoît Canet" <benoit.canet@irqsave.net>
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, anshul.makkar@profitbricks.com,
	armbru@redhat.com, stefanha@redhat.com
Subject: [Qemu-devel] IO accounting overhaul
Date: Thu, 28 Aug 2014 16:38:09 +0200
Message-ID: <20140828143809.GB28789@irqsave.net>


Hi,

I collected some items from a cloud provider wishlist regarding I/O accounting.

In a cloud, I/O accounting can serve three purposes: billing, helping the customers,
and doing metrology to help the cloud provider look for hidden costs.

I'll cover the first two topics in this mail because they are the most important
business-wise.

1) preferred place to collect billing IO accounting data:
---------------------------------------------------------
For billing purposes the collected data must be as close as possible to what the
customer would see by running iostat in his VM.

The first conclusion we can draw is that the choice of collecting the I/O accounting
data used for billing in the block device models is right.

2) what to do with occurrences of rare events:
----------------------------------------------

Another point is that QEMU developers agree that they don't know which policy
to apply to some I/O accounting events.
Should QEMU discard invalid write I/Os or account them as done?
Should QEMU count a failed read I/O as done?

When discussing this with a cloud provider, the following emerged: these decisions
are really specific to each cloud provider and QEMU should not implement them.
The right thing to do is to add accounting counters to collect these events.

Moreover these rare events are precious troubleshooting data so it's an additional
reason not to toss them.
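
To make this concrete, here is a minimal sketch in C of policy-free accounting
for writes: each outcome gets its own counter, and the billing policy is applied
by whoever reads the counters. The type and function names (IoOutcome,
WriteStats, account_write and the two billable_* helpers) are hypothetical and
are not existing QEMU code; only the counter names come from the list in
section 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of one guest write request. */
typedef enum {
    IO_DONE,     /* completed successfully */
    IO_INVALID,  /* rejected before submission, e.g. out of range */
    IO_FAILED    /* submitted but completed with an error */
} IoOutcome;

/* One counter per outcome: nothing is discarded, nothing is merged. */
typedef struct {
    uint64_t write_ios;
    uint64_t invalid_write_ios;
    uint64_t write_ios_error;
} WriteStats;

/* Record the outcome without deciding whether it is billable. */
static void account_write(WriteStats *s, IoOutcome outcome)
{
    assert(s != NULL);
    switch (outcome) {
    case IO_DONE:    s->write_ios++;         break;
    case IO_INVALID: s->invalid_write_ios++; break;
    case IO_FAILED:  s->write_ios_error++;   break;
    }
}

/* The policy lives with the provider. Two example policies: */
static uint64_t billable_strict(const WriteStats *s)
{
    return s->write_ios;                      /* successful writes only */
}

static uint64_t billable_with_errors(const WriteStats *s)
{
    return s->write_ios + s->write_ios_error; /* failed writes count too */
}
```

Both policies read the same raw counters, so a provider can change its mind
without any change on the QEMU side.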

3) list of block I/O accounting metrics wanted for billing and helping the customers
------------------------------------------------------------------------------------

Basic I/O accounting data will end up on the customers' bills.
Extra I/O accounting information would be a precious help for the cloud provider
to implement a monitoring panel like Amazon CloudWatch.

Here is the list of counters and statistics I would like to help implement in QEMU.

This is the most important part of the mail and the one I would like the community
to review the most.

Once this list is settled I would proceed to implement the required infrastructure
in QEMU before using it in the device models.

/* volume of data transferred by the IOs */
read_bytes
write_bytes

/* operation count */
read_ios
write_ios
flush_ios

/* how many invalid IOs the guest submits */
invalid_read_ios
invalid_write_ios
invalid_flush_ios

/* how many I/O errors happened */
read_ios_error
write_ios_error
flush_ios_error

/* account the time spent doing IOs */
total_read_time
total_write_time
total_flush_time

/* since when the volume has been idle */
qvolume_iddleness_time

/* the following would compute latencies for slices of 1 second, then toss the
 * result and start a new slice. A weighted summation of the instant latencies
 * could help to implement this.
 */
1s_read_average_latency
1s_write_average_latency
1s_flush_average_latency

/* the previous three numbers could be used to further compute a 1 minute slice value */
1m_read_average_latency
1m_write_average_latency
1m_flush_average_latency

/* the previous three numbers could be used to further compute a 1 hour slice value */
1h_read_average_latency
1h_write_average_latency
1h_flush_average_latency
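
One possible way to compute these sliced latencies is sketched below in C. It
is only an illustration under several assumptions: the LatencySlice type and
slice_add() are hypothetical names, timestamps come from a monotonic clock in
nanoseconds, and a slice is closed lazily when the next request completes (a
real implementation would probably also need a timer so that an idle volume
does not keep a stale average). When a slice closes it hands its latency sum
and request count to the coarser slice, which is what makes the 1 minute value
a request-weighted combination of the 1 second values. The counter names above
start with a digit and so cannot be used as C identifiers as-is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/* Hypothetical averaging slice; chain 1s -> 1m -> 1h through 'parent'. */
typedef struct LatencySlice {
    uint64_t length_ns;          /* slice duration */
    uint64_t start_ns;           /* start of the current slice */
    uint64_t sum_ns;             /* sum of latencies in the current slice */
    uint64_t count;              /* requests seen in the current slice */
    uint64_t last_avg_ns;        /* average of the last closed slice */
    struct LatencySlice *parent; /* coarser slice, or NULL */
} LatencySlice;

/* Add 'count' requests totalling 'sum_ns' of latency, observed at now_ns. */
static void slice_add(LatencySlice *s, uint64_t now_ns,
                      uint64_t sum_ns, uint64_t count)
{
    assert(s != NULL && s->length_ns > 0 && now_ns >= s->start_ns);
    if (now_ns - s->start_ns >= s->length_ns) {
        /* Close the slice: publish its average, pass totals upwards. */
        s->last_avg_ns = s->count ? s->sum_ns / s->count : 0;
        if (s->parent) {
            slice_add(s->parent, now_ns, s->sum_ns, s->count);
        }
        s->sum_ns = 0;
        s->count = 0;
        /* Align the new slice start on a multiple of the slice length. */
        s->start_ns = now_ns - (now_ns - s->start_ns) % s->length_ns;
    }
    s->sum_ns += sum_ns;
    s->count += count;
}
```

A device model would call slice_add(&one_second, now, latency, 1) on each
completion, with one_second.parent pointing at the 1 minute slice and so on.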

/* 1 second average number of requests in flight */
1s_read_queue_depth
1s_write_queue_depth

/* 1 minute average number of requests in flight */
1m_read_queue_depth
1m_write_queue_depth

/* 1 hour average number of requests in flight */
1h_read_queue_depth
1h_write_queue_depth
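
For the queue depth counters, "average number of requests in flight" can be
read as a time-weighted average: integrate the number of in-flight requests
over the slice and divide by the slice duration. The C sketch below takes that
reading, which is an assumption on my side and not something settled in this
mail. QueueDepth and the qd_* functions are hypothetical names, and timestamps
are assumed to be monotonic nanoseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tracker for the average number of requests in flight. */
typedef struct {
    uint64_t in_flight;     /* requests currently in flight */
    uint64_t last_ns;       /* time of the last depth change */
    uint64_t start_ns;      /* start of the current slice */
    uint64_t depth_time_ns; /* integral of in_flight over the slice */
} QueueDepth;

/* Integrate the current depth up to now_ns. */
static void qd_advance(QueueDepth *q, uint64_t now_ns)
{
    assert(q != NULL && now_ns >= q->last_ns);
    q->depth_time_ns += q->in_flight * (now_ns - q->last_ns);
    q->last_ns = now_ns;
}

static void qd_submit(QueueDepth *q, uint64_t now_ns)
{
    qd_advance(q, now_ns);
    q->in_flight++;
}

static void qd_complete(QueueDepth *q, uint64_t now_ns)
{
    qd_advance(q, now_ns);
    assert(q->in_flight > 0);
    q->in_flight--;
}

/* Close the slice at now_ns and return its time-weighted average depth. */
static double qd_close_slice(QueueDepth *q, uint64_t now_ns)
{
    double avg = 0.0;
    qd_advance(q, now_ns);
    if (now_ns > q->start_ns) {
        avg = (double)q->depth_time_ns / (double)(now_ns - q->start_ns);
    }
    q->depth_time_ns = 0;
    q->start_ns = now_ns;
    return avg;
}
```

Requests still in flight when a slice closes carry over into the next slice,
so long-running I/Os are counted in every slice they span.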

4) Making this happen
---------------------

Outscale wants to make these I/O stats happen and gave me the go-ahead to do whatever
grunt work is required to do so.
That said, we could collaborate on some parts of the work.

Best regards

Benoît
