From: Eric Blake
Date: Wed, 03 Jun 2015 08:18:45 -0600
Subject: Re: [Qemu-devel] I/O accounting overhaul
To: Alberto Garcia, qemu-devel@nongnu.org
Cc: Kevin Wolf, Markus Armbruster, Stefan Hajnoczi, qemu-block@nongnu.org, Max Reitz
Message-ID: <556F0CC5.4040401@redhat.com>
In-Reply-To: <20150603134042.GA22770@igalia.com>

On 06/03/2015 07:40 AM, Alberto Garcia wrote:
> Hello,
>
> I would like to retake the work that Benoît was about to start last
> year and extend the I/O accounting in QEMU. I was reading the past
> discussions and I will try to summarize all the ideas.
>
> The current accounting code collects the following information:
>
>   typedef struct BlockAcctStats {
>       uint64_t nr_bytes[BLOCK_MAX_IOTYPE];
>       uint64_t nr_ops[BLOCK_MAX_IOTYPE];
>       uint64_t total_time_ns[BLOCK_MAX_IOTYPE];
>       uint64_t merged[BLOCK_MAX_IOTYPE];
>       uint64_t wr_highest_sector;
>   } BlockAcctStats;
>
> where the arrays hold information for read, write and flush
> operations.
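
To make the per-type arrays concrete, here is a minimal sketch of how such counters can be filled in between the start and the completion of a request. Everything with a Sketch/sketch_ prefix (the enum, the cookie struct, the two helpers) is a hypothetical stand-in invented for this illustration and is not QEMU's actual code; timestamps are passed in explicitly so the example stays deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operation kinds; the last entry only sizes the arrays. */
enum SketchIoType { SKETCH_READ, SKETCH_WRITE, SKETCH_FLUSH, SKETCH_MAX_IOTYPE };

/* Same layout idea as the quoted struct, reduced to three counters. */
typedef struct SketchStats {
    uint64_t nr_bytes[SKETCH_MAX_IOTYPE];
    uint64_t nr_ops[SKETCH_MAX_IOTYPE];
    uint64_t total_time_ns[SKETCH_MAX_IOTYPE];
} SketchStats;

/* Per-request state carried from submission to completion. */
typedef struct SketchCookie {
    enum SketchIoType type;
    uint64_t bytes;
    int64_t start_time_ns;
} SketchCookie;

/* Record what kind of request starts, how large it is, and when. */
static void sketch_acct_start(SketchCookie *c, enum SketchIoType type,
                              uint64_t bytes, int64_t now_ns)
{
    assert(type < SKETCH_MAX_IOTYPE);
    c->type = type;
    c->bytes = bytes;
    c->start_time_ns = now_ns;
}

/* On completion, add the request to the counters for its own type only. */
static void sketch_acct_done(SketchStats *s, const SketchCookie *c,
                             int64_t now_ns)
{
    s->nr_bytes[c->type] += c->bytes;
    s->nr_ops[c->type]++;
    s->total_time_ns[c->type] += (uint64_t)(now_ns - c->start_time_ns);
}
```

A caller would pair one sketch_acct_start() with one sketch_acct_done() per request; a request that is started but never completed simply leaves the counters untouched in this sketch.
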
>
> The accounting stats are stored in the BlockDriverState, but they're
> actually from the device backed by the BDS, so they could probably be
> moved there. For the interface we could extend BlockDeviceStats and
> add the new fields, but query-blockstats works on BDS, so maybe we
> need a new API?
>

We want stats per BDS (it would be nice to know how many reads are
satisfied from the active layer, vs. how many are satisfied from the
backing image, to know how stable and useful the backing image is).
But we also want stats per BB (how many reads did the guest attempt,
regardless of which BDS served the read). So any good solution needs
to work from both views (whether by two APIs, or by one with a flag,
is bike-shedding).

> The fields are mostly self-explanatory. merged counts the number of
> requests merged into a single one (using virtio_blk_submit_multireq),
> and wr_highest_sector is the number of the highest sector that has
> been written.

It would also be nice if wr_highest_sector could be populated even for
images that have not yet been written (right now, it starts life at 0
until a write, but if we can learn the current highest sector as part
of opening an image even for just reads, that would be a bit nicer).

>
> In addition to those we can have:
>
>   uint64_t nr_invalid_ops[BLOCK_MAX_IOTYPE];
>   uint64_t nr_failed_ops[BLOCK_MAX_IOTYPE];
>
> The decision about whether to count these two as done (e.g. for
> total_time_ns) could be configurable by the user.
>
>   int64_t last_access_time_ns;
>
> This would be updated after each operation, and would be useful to
> know for how long a particular device has been idle.
>
>   uint64_t latency[BLOCK_MAX_IOTYPE];
>
> What we added on average to total_time_ns[] in the past second (or
> minute, or hour; the interval would be configurable). We could also
> collect the maximum and minimum latencies for that period.
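
As a rough illustration of the per-interval latency idea quoted just above, the window can be rolled over lazily on the accounting path itself. This is only a sketch under stated assumptions: the SketchLatency struct and both helpers are hypothetical names made up for this example, one instance would be kept per operation type, and the caller supplies the current time.

```c
#include <stdint.h>

typedef struct SketchLatency {
    int64_t interval_ns;       /* configurable window length */
    int64_t window_start_ns;   /* start of the window being filled */
    uint64_t sum_ns, count;    /* accumulators for the current window */
    uint64_t min_ns, max_ns;
    /* Figures published for the last window that saw any traffic. */
    uint64_t last_avg_ns, last_min_ns, last_max_ns;
} SketchLatency;

static void sketch_latency_init(SketchLatency *l, int64_t interval_ns,
                                int64_t now_ns)
{
    *l = (SketchLatency){ .interval_ns = interval_ns,
                          .window_start_ns = now_ns,
                          .min_ns = UINT64_MAX };
}

/* Called once per completed operation with that operation's latency. */
static void sketch_latency_account(SketchLatency *l, uint64_t latency_ns,
                                   int64_t now_ns)
{
    int64_t elapsed = now_ns - l->window_start_ns;

    if (elapsed >= l->interval_ns) {
        /* The window has expired: publish its figures, if it had any. */
        if (l->count > 0) {
            l->last_avg_ns = l->sum_ns / l->count;
            l->last_min_ns = l->min_ns;
            l->last_max_ns = l->max_ns;
        }
        /* Skip over idle windows while keeping starts interval-aligned. */
        l->window_start_ns += elapsed / l->interval_ns * l->interval_ns;
        l->sum_ns = 0;
        l->count = 0;
        l->min_ns = UINT64_MAX;
        l->max_ns = 0;
    }

    l->sum_ns += latency_ns;
    l->count++;
    if (latency_ns < l->min_ns) {
        l->min_ns = latency_ns;
    }
    if (latency_ns > l->max_ns) {
        l->max_ns = latency_ns;
    }
}
```

One consequence of this design is that the published figures only change when another operation is accounted, so after a long idle period they describe the last window with traffic rather than the most recent wall-clock interval.
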
>
> This could be updated every time an operation is accounted, so I
> think it could be implemented without adding any timer.
>
>   uint64_t queue_depth[BLOCK_MAX_IOTYPE];
>
> Average number of requests. Similar to the previous one. It would
> require us to keep a count of ongoing requests as well.
>
> About the implementation, I read that it was possible to call
> block_acct_start() without calling block_acct_done(). I don't know if
> that's still the case, I need to check that.
>
> I don't know if I'm forgetting anything. I have a rough implementation
> covering most of the things I described, but of course it needs to be
> polished etc. before publishing.
>
> What do you think about this? Comments and suggestions are welcome.
>
> Thanks,
>
> Berto
>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org