From: Avi Kivity <avi.kivity@gmail.com>
To: Etienne Martineau <etmartinau@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Peter Crosthwaite <peter.crosthwaite@xilinx.com>,
gonglei <arei.gonglei@huawei.com>, Fam <famz@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] memory: memory_region_transaction_commit() slow
Date: Sun, 29 Jun 2014 09:56:19 +0300
Message-ID: <53AFB893.1010105@gmail.com>
In-Reply-To: <53AC2EDF.5090408@gmail.com>
On 06/26/2014 05:31 PM, Etienne Martineau wrote:
> On 14-06-26 04:18 AM, Avi Kivity wrote:
>> On 06/25/2014 08:53 PM, Etienne Martineau wrote:
>>> Hi,
>>>
>>> It seems to me that there is an O(n) scaling issue in memory_region_transaction_commit().
>> It's actually O(n^3). Flatview is kept sorted but is just a vector, so if you insert n regions, you have n^2 operations. In addition every PCI device has an address space, so we get n^3 (technically the third n is different from the first two, but they are related).
>>
>> The first problem can be solved by implementing Flatview with an std::set<> or equivalent, the second by memoization - most pci address spaces are equal (they only differ based on whether bus mastering is enabled or not), so a clever cache can reduce the effort to generate them.
>>
>> However I'm not at all sure that the problem is cpu time in qemu. It could be due to rcu_synchronize delays when the new memory maps are fed to kvm and vfio. I recommend trying to isolate exactly where the time is spent.
>>
> It seems like the linear increase in CPU time comes from QEMU (at least from my measurements below).
In those code paths QEMU calls back into KVM (KVM_SET_MEMORY_REGION) and
vfio, so it would be good to understand exactly where the time is
spent. I doubt it's the computation (which is O(n^3), but very fast);
more likely it's waiting for something.
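
As an aside, here is a rough sketch of the memoization idea from my earlier
mail, quoted above. This is not QEMU code; all names are made up, and it only
shows the shape of the cache: since most PCI address spaces differ only by
whether bus mastering is enabled, the expensive view generation can be done
once per distinct configuration and the result shared and reference-counted.

/* Hypothetical sketch only -- not QEMU code.  Illustrates memoizing
 * generated views: address spaces that resolve to the same layout
 * share one generated, reference-counted view instead of each
 * rebuilding its own. */
#include <stdbool.h>
#include <stdlib.h>

struct flat_view {
    int refcount;
    /* ... sorted list of memory ranges would live here ... */
};

/* Expensive rebuild, stubbed out here.  In a real implementation this
 * is where the costly view-generation work would happen. */
static struct flat_view *generate_flat_view(bool bus_master_enabled)
{
    struct flat_view *v = calloc(1, sizeof(*v));
    (void)bus_master_enabled;
    return v;
}

/* Two cache slots: one per possible key value. */
static struct flat_view *view_cache[2];

/* Return a shared view for the given key, generating it at most once. */
static struct flat_view *get_flat_view(bool bus_master_enabled)
{
    struct flat_view **slot = &view_cache[bus_master_enabled ? 1 : 0];
    if (!*slot) {
        *slot = generate_flat_view(bus_master_enabled);  /* cache miss */
    }
    (*slot)->refcount++;                                 /* shared reference */
    return *slot;
}

With a cache like this, adding a device would regenerate a view once per
distinct configuration rather than once per address space.
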
> In QEMU's kvm_cpu_exec() I've added a hook that measures the time spent outside 'kvm_vcpu_ioctl(cpu, KVM_RUN, 0)'.
> From the logs below this is 'QEMU long exit vCPU n x(msec) exit_reason'.
>
> Similarly, in KVM's vcpu_enter_guest() I've added a new ftrace event that measures the time spent outside 'kvm_x86_ops->run(vcpu)'.
> From the logs below this is 'kvm_long_exit: x(msec)'. Please note that this is a trimmed-down view of the real ftrace output.
>
> Also, please note that the above hacks are useful (at least to me, since I haven't figured out a better way to do the same with the existing ftrace events) for measuring the RTT at both the QEMU and KVM levels.
>
> The time spent outside KVM's 'kvm_x86_ops->run(vcpu)' will always be greater than the time spent outside QEMU's 'kvm_vcpu_ioctl(cpu, KVM_RUN, 0)' for a given vCPU.
> The difference between the time spent outside KVM and the time spent outside QEMU (for a given vCPU) therefore tells us who is burning cycles (QEMU or KVM) and by how much (in msec).
>
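
For reference, a minimal sketch of what a hook like the one described above
could look like (not the actual patch; the names, the 10 ms threshold, and the
log format are all assumed, and the real hook would sit in QEMU's
kvm_cpu_exec() loop around kvm_vcpu_ioctl(cpu, KVM_RUN, 0)):

/* Hypothetical sketch of the QEMU-side measurement: time everything the
 * vCPU thread does outside the KVM_RUN ioctl and log anything above a
 * threshold. */
#include <stdio.h>
#include <time.h>

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

static double last_exit_ms;  /* timestamp of the last return from KVM_RUN */

/* Call just before re-entering KVM_RUN. */
static void report_time_outside_run(int cpu_index, int exit_reason)
{
    double outside = now_ms() - last_exit_ms;
    if (last_exit_ms != 0 && outside > 10.0) {
        fprintf(stderr, "QEMU long exit vCPU %d %d %d\n",
                cpu_index, (int)outside, exit_reason);
    }
}

/* Call right after KVM_RUN returns. */
static void mark_exit_from_run(void)
{
    last_exit_ms = now_ms();
}

Wrapping only the KVM_RUN ioctl this way means everything else the vCPU thread
does, including the memory-transaction work in QEMU and the ioctls back into
KVM and vfio, shows up in the "outside" time.
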
> In the experiment below I've put the QEMU and KVM RTT times side by side. We can see that the time to assign a device (same BAR size for all devices) increases
> linearly (as previously reported). Also, from the RTT measurements, QEMU and KVM are mostly within the same range, suggesting that the increase comes from QEMU and not KVM.
>
> The one exception is that for every device assignment there is one KVM operation that seems to take ~100 msec each time. Since this is O(1) I'm not too concerned.
>
>
> device assign #1:
> device_add pci-assign,host=28:10.2,bus=pciehp.3.8
>
> kvm_long_exit: 100
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 26
> QEMU long exit vCPU 0 20 2 kvm_long_exit: 20
> QEMU long exit vCPU 0 20 2 kvm_long_exit: 20
> QEMU long exit vCPU 0 20 2 kvm_long_exit: 20
> QEMU long exit vCPU 0 19 2 kvm_long_exit: 19
> QEMU long exit vCPU 0 19 2 kvm_long_exit: 19
> QEMU long exit vCPU 0 19 2 kvm_long_exit: 20
> QEMU long exit vCPU 0 19 2 kvm_long_exit: 19
> QEMU long exit vCPU 0 19 2 kvm_long_exit: 19
> QEMU long exit vCPU 0 19 2 kvm_long_exit: 19
> QEMU long exit vCPU 0 19 2 kvm_long_exit: 20
> QEMU long exit vCPU 0 42 2 kvm_long_exit: 42
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
>
> device assign #2:
> device_add pci-assign,host=28:10.3,bus=pciehp.3.9
>
> kvm_long_exit: 101
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 25
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
> QEMU long exit vCPU 0 21 2 kvm_long_exit: 21
> QEMU long exit vCPU 0 45 2 kvm_long_exit: 45
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
>
> device assign #3:
> device_add pci-assign,host=28:10.4,bus=pciehp.3.10
>
> kvm_long_exit: 100
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 25
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
> QEMU long exit vCPU 0 23 2 kvm_long_exit: 23
> QEMU long exit vCPU 0 48 2 kvm_long_exit: 48
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 25
>
> device assign #4:
> device_add pci-assign,host=28:10.5,bus=pciehp.3.11
>
> kvm_long_exit: 100
> QEMU long exit vCPU 0 27 2 kvm_long_exit: 27
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 25
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 25
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 25
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 25
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 25
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 25
> QEMU long exit vCPU 0 24 2 kvm_long_exit: 24
> QEMU long exit vCPU 0 25 2 kvm_long_exit: 25
> QEMU long exit vCPU 0 24 2 kvm_long_exit: 24
> QEMU long exit vCPU 0 24 2 kvm_long_exit: 25
> QEMU long exit vCPU 0 52 2 kvm_long_exit: 52
> QEMU long exit vCPU 0 26 2 kvm_long_exit: 26
>
> device assign #5:
> device_add pci-assign,host=28:10.6,bus=pciehp.3.12
>
> kvm_long_exit: 100
> QEMU long exit vCPU 0 28 2 kvm_long_exit: 28
> QEMU long exit vCPU 0 27 2 kvm_long_exit: 27
> QEMU long exit vCPU 0 26 2 kvm_long_exit: 26
> QEMU long exit vCPU 0 27 2 kvm_long_exit: 27
> QEMU long exit vCPU 0 26 2 kvm_long_exit: 26
> QEMU long exit vCPU 0 26 2 kvm_long_exit: 26
> QEMU long exit vCPU 0 26 2 kvm_long_exit: 26
> QEMU long exit vCPU 0 26 2 kvm_long_exit: 26
> QEMU long exit vCPU 0 26 2 kvm_long_exit: 26
> QEMU long exit vCPU 0 26 2 kvm_long_exit: 26
> QEMU long exit vCPU 0 26 2 kvm_long_exit: 26
> QEMU long exit vCPU 0 55 2 kvm_long_exit: 56
> QEMU long exit vCPU 0 28 2 kvm_long_exit: 28
>
> thanks,
> Etienne
>