* [Qemu-devel] QEMU-guestOS latencies.
@ 2016-07-28  9:25 Nir Levy
  2016-08-08 15:10 ` Stefan Hajnoczi
  0 siblings, 1 reply; 5+ messages in thread
From: Nir Levy @ 2016-07-28  9:25 UTC (permalink / raw)
  To: qemu-devel@nongnu.org; +Cc: Yan Fridland


Hi all,

First, thanks for your time and attention in reading this.

I wish to share with you some of my goals.
My main goal is to trace latencies across the QEMU-KVM interface (in order to see whether they are responsible for our delays).
My secondary goal is to figure out how QEMU threads are spawned.
In addition, I wish to understand RAM allocation and avoid host swaps.

So far I have mainly debugged libvirtd and QEMU,
but I do not always succeed in avoiding the following error:
virsh -k0 start  KPO
error: Failed to start domain KPO
error: monitor socket did not show up: No such file or directory

This happens when debugging QEMU. Although I have used -k0, is there any other way to overcome this?

My observations so far, from attaching to the QEMU process spawned by libvirtd, are that QEMU threads fall into several categories (see the listing sketch after the list):
- a block-device controller (via qcow2_open - the main I/O thread)
- a thread for each vCPU
- a trace thread that is launched for each report
- I/O worker threads (QEMU_AIO_READ, _WRITE, _IOCTL, _FLUSH, etc.), which are spawned regularly and whose main purpose I have so far failed to determine.
  These threads are spawned at a high rate once the guest application is running (traffic is mainly through DPDK).
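
For anyone who wants to reproduce this categorization, here is a minimal sketch (assuming Linux procfs; the helper name is mine, and the PID is the pid= value from the trace lines further down). Note that thread names are only distinct if QEMU was started with -name ...,debug-threads=on; otherwise every thread reports the process name:

import os

def list_qemu_threads(pid):
    # Print the TID and name of every thread in a running QEMU process;
    # with debug-threads=on the vCPU, I/O worker and other threads
    # show up under distinct names.
    task_dir = "/proc/%d/task" % pid
    for tid in sorted(os.listdir(task_dir), key=int):
        with open(os.path.join(task_dir, tid, "comm")) as f:
            print(tid, f.read().strip())

list_qemu_threads(15930)  # hypothetical: the QEMU PID from the traces below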

qemu_anon_ram_alloc summary:
    4G - pc.ram
  256K - pc.bios
  128K - pc.rom
  256K - virtio-net-pci.rom
    2M - /rom@etc/acpi/tables
    4K - /rom@etc/table-loader

I used simpletrace to capture events (I have not yet inserted my own), following Stefan Hajnoczi's instructions, and studied the output a bit.
There are time offsets ranging from
object_dynamic_cast_assert 1574595.371 pid=15930 type=qio-channel-file target=qio-channel-file file=qemu-char.c line=0x509 func=pty_chr_update_read_handler_locked to
object_dynamic_cast_assert -1.710 pid=15930 type=Haswell-noTSX-x86_64-cpu target=x86_64-cpu file=/home/nirl/qemu_instrumenting_build/qemu-2.6.0/target-i386/kvm.c line=0xac2 func=kvm_arch_post_run
which is very strange.

In addition, as far as I can tell, those offsets are only relative to the previous trace record.
Is there a simple way to adjust the log to show absolute uptime in nanoseconds instead of offsets?
What would you recommend for tracing guest-OS-related latencies?
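
For what it is worth, here is a minimal post-processing sketch of the kind of conversion I am after (an assumption on my part, based on the lines above: that the second column of simpletrace.py's text output is the inter-record delta in microseconds; this yields elapsed time since the first record, not true uptime):

import sys

def add_elapsed(lines):
    # Prepend a running elapsed-time column (microseconds) to
    # simpletrace.py text output by accumulating per-record deltas.
    elapsed_us = 0.0
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        elapsed_us += float(fields[1])
        print("%0.3f %s" % (elapsed_us, line.rstrip()))

if __name__ == "__main__":
    add_elapsed(sys.stdin)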


Regards and many thanks.
Nir Levy
SW Engineer

Web: www.asocstech.com




* Re: [Qemu-devel] QEMU-guestOS latencies.
@ 2016-07-28 16:37 Nir Levy
  0 siblings, 0 replies; 5+ messages in thread
From: Nir Levy @ 2016-07-28 16:37 UTC (permalink / raw)
  To: qemu-devel@nongnu.org; +Cc: Yan Fridland


After changing int timeout to MAX_INT inside the function qemuMonitorOpenUnix,
in file ./src/qemu/qemu_monitor.c (libvirt),
I am now able to debug kvm and malloc.

I would love to hear some tips regarding tracing latencies.

regards,
Nir.



* Re: [Qemu-devel] QEMU-guestOS latencies.
  2016-07-28  9:25 [Qemu-devel] QEMU-guestOS latencies Nir Levy
@ 2016-08-08 15:10 ` Stefan Hajnoczi
  2016-08-09 11:45   ` Nir Levy
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Hajnoczi @ 2016-08-08 15:10 UTC (permalink / raw)
  To: Nir Levy; +Cc: qemu-devel@nongnu.org, Yan Fridland


On Thu, Jul 28, 2016 at 09:25:41AM +0000, Nir Levy wrote:
> In addition, I wish to understand RAM allocation and avoid host swaps.

I didn't read everything but this stood out.  QEMU has a -realtime
mlock=on option if you wish to mlock(2) guest memory on the host.
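
For example (other options elided):

    qemu-system-x86_64 -realtime mlock=on ...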

Stefan



* Re: [Qemu-devel] QEMU-guestOS latencies.
  2016-08-08 15:10 ` Stefan Hajnoczi
@ 2016-08-09 11:45   ` Nir Levy
  2016-08-11  9:10     ` Stefan Hajnoczi
  0 siblings, 1 reply; 5+ messages in thread
From: Nir Levy @ 2016-08-09 11:45 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel@nongnu.org, Yan Fridland

Hi Stefan,

Thanks for getting back to me.
As for mlock, we have tried it here, and apparently mlock causes our system to freeze for some reason.
Using Linux tools and RAM-allocation accounting, we have verified that no swaps are occurring.
Still, we have delays that cause our system to lose track.
My goal, as I mentioned before, is to either clear QEMU of the delays or pin them on it.
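
For reference, the check boils down to something like this sketch (assuming Linux procfs; the helper name is mine):

def vm_swap_kb(pid):
    # Return the VmSwap value (kB) from /proc/<pid>/status, i.e. how
    # much of the process's memory is currently swapped out.
    with open("/proc/%d/status" % pid) as f:
        for line in f:
            if line.startswith("VmSwap:"):
                return int(line.split()[1])
    return 0  # field absent on kernels without VmSwap accounting

print(vm_swap_kb(15930))  # the QEMU PID from the earlier traces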

I am continuing to learn your simpletrace.
I have used it and seen how it works.
get_clock() is saved as 8 octets, but it is converted via ./scripts/simpletrace.py to delta_ns = timestamp - self.last_timestamp,
which results in illogical diffs (from negative to very large within a split file covering a few seconds).
It is worth mentioning that I am tracing all possible events, and maybe that is what causes the problems.
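
To illustrate, a hypothetical guard for post-processing the records (names modeled on the delta_ns line quoted above; this is not a patch to simpletrace.py):

def robust_delta_ns(timestamp, last_timestamp):
    # Reset the baseline when records arrive out of order (e.g. across
    # a split trace file), so a stale last_timestamp cannot produce a
    # negative or absurdly large delta.
    if last_timestamp is None or timestamp < last_timestamp:
        return 0
    return timestamp - last_timestamp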

Thanks again.
Nir.


* Re: [Qemu-devel] QEMU-guestOS latencies.
  2016-08-09 11:45   ` Nir Levy
@ 2016-08-11  9:10     ` Stefan Hajnoczi
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2016-08-11  9:10 UTC (permalink / raw)
  To: Nir Levy; +Cc: qemu-devel@nongnu.org, Yan Fridland


On Tue, Aug 09, 2016 at 11:45:54AM +0000, Nir Levy wrote:
> I am continuing to learn your simpletrace.
> I have used it and seen how it works.
> get_clock() is saved as 8 octets, but it is converted via ./scripts/simpletrace.py to delta_ns = timestamp - self.last_timestamp,
> which results in illogical diffs (from negative to very large within a split file covering a few seconds).
> It is worth mentioning that I am tracing all possible events, and maybe that is what causes the problems.

The number of events shouldn't affect the timestamp field, since it
simply fetches the get_clock() value.

I'm not sure what you mean by "split file"?

Stefan


