Re: [Qemu-devel] memory: memory_region_transaction_commit() slow

From: Avi Kivity <avi.kivity@gmail.com>
To: Etienne Martineau <etmartinau@gmail.com>,
	gonglei <arei.gonglei@huawei.com>, Fam <famz@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Peter Crosthwaite <peter.crosthwaite@xilinx.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] memory: memory_region_transaction_commit() slow
Date: Thu, 26 Jun 2014 11:18:18 +0300
Message-ID: <53ABD74A.2070607@gmail.com>
In-Reply-To: <53AB0C7C.1050805@gmail.com>


On 06/25/2014 08:53 PM, Etienne Martineau wrote:
> Hi,
>
> It seems to me that there is an O(n) scaling issue in memory_region_transaction_commit().

It's actually O(n^3).  FlatView is kept sorted but is just a vector, so
inserting n regions costs O(n^2) operations.  In addition, every PCI
device has its own address space, so we get O(n^3) overall (technically
the third n is different from the first two, but they are related).
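
To make the quadratic term concrete, here is a minimal sketch of inserting
into a sorted array. SortedVec, sorted_vec_insert() and worst_case_shifts()
are hypothetical names for illustration only; this is not QEMU's FlatView
code, just the same data-structure shape.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a flat view: a sorted array of range starts. */
typedef struct {
    unsigned long *starts;
    size_t nr;
    size_t cap;
} SortedVec;

/* Insert `start`, keeping the array sorted.  Returns how many existing
 * elements had to be shifted to make room. */
static size_t sorted_vec_insert(SortedVec *v, unsigned long start)
{
    size_t i = 0;
    size_t moved;

    assert(v->nr < v->cap);
    while (i < v->nr && v->starts[i] < start) {
        i++;
    }
    moved = v->nr - i;
    memmove(&v->starts[i + 1], &v->starts[i], moved * sizeof(v->starts[0]));
    v->starts[i] = start;
    v->nr++;
    return moved;
}

/* Insert n keys in descending order (the worst case for a sorted array)
 * and return the total number of element shifts. */
static size_t worst_case_shifts(size_t n)
{
    SortedVec v;
    size_t total = 0;
    size_t k;

    assert(n > 0);
    v.starts = malloc(n * sizeof(unsigned long));
    v.nr = 0;
    v.cap = n;
    assert(v.starts != NULL);
    for (k = n; k > 0; k--) {
        total += sorted_vec_insert(&v, k);
    }
    free(v.starts);
    return total;
}
```

In this worst case each insert lands at the front, so the total number of
shifts for n inserts is 0 + 1 + ... + (n-1) = n(n-1)/2. Repeating that once
per address space is where the third factor would come from.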

The first problem can be solved by implementing FlatView with a
std::set<> or an equivalent ordered container, the second by memoization:
most PCI address spaces are identical (they differ only in whether bus
mastering is enabled), so a clever cache can reduce the effort to
generate them.
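
A rough sketch of the memoization idea, assuming the only thing that
distinguishes two views is the bus-master flag. PciView, build_view(),
get_view() and build_count are made-up names for illustration and do not
correspond to existing QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result of the expensive "generate address space" step. */
typedef struct {
    bool bus_master;
    int nr_ranges;
} PciView;

/* Counts how often the expensive build actually runs. */
static unsigned build_count;

/* Placeholder for the expensive computation. */
static PciView build_view(bool bus_master)
{
    PciView v = { bus_master, bus_master ? 2 : 1 };

    build_count++;
    return v;
}

/* Memoized lookup: at most one build per distinct key, however many
 * devices ask for a view. */
static const PciView *get_view(bool bus_master)
{
    static PciView cache[2];
    static bool valid[2];
    int key = bus_master ? 1 : 0;

    if (!valid[key]) {
        cache[key] = build_view(bus_master);
        valid[key] = true;
    }
    return &cache[key];
}
```

With this shape, n devices calling get_view() trigger at most two builds
instead of n. A real cache would also need to be invalidated whenever the
underlying memory topology changes, which this sketch leaves out.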

However, I'm not at all sure that the problem is CPU time in QEMU.  It
could be due to RCU synchronization delays when the new memory maps are
fed to KVM and VFIO.  I recommend trying to isolate exactly where the
time is spent.
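
One way to isolate the time is to accumulate wall-clock time per phase and
compare the totals. This is only a sketch: Phase, phase_begin() and
phase_end() are hypothetical helpers, and it uses POSIX clock_gettime()
rather than any QEMU clock API.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic wall-clock time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical accumulator for one phase (e.g. "rebuild view" versus
 * "push memory map to the kernel"). */
typedef struct {
    const char *name;
    int64_t total_ns;
    unsigned calls;
    int64_t start_ns;
} Phase;

/* Mark the start of one timed interval. */
static void phase_begin(Phase *p)
{
    p->start_ns = now_ns();
}

/* Close the interval and add it to the running total. */
static void phase_end(Phase *p)
{
    p->total_ns += now_ns() - p->start_ns;
    p->calls++;
}
```

Wrapping the view rebuild in one Phase and the ioctl calls that update the
kernel in another would show whether the time is spent computing in
userspace or waiting on the kernel.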

> Basically, the time it takes to rebuild the memory view during device assignment
> (pci_bridge_update_mappings()) increases linearly with the number of
> devices already assigned to the guest.
>
> I'm running on a recent qemu.git and I merged from git://github.com/bonzini/qemu.git memory
> the following patches, which seem to be related to the scaling issue I'm facing:
>    Fam Zheng (1):
>        memory: Don't call memory_region_update_coalesced_range if nothing changed
>    Gonglei (1):
>        memory: Don't update all memory region when ioeventfd changed
>
> Those patches help but don't fix the issue. The problem becomes more noticeable
> when lots of devices are assigned to the guest.
>
> I'm running my test on a QEMU q35 machine with the following topology:
>   ioh3420 ( root port )
>    x3130-upstream
>     xio3130-downstream
>     xio3130-downstream
>     xio3130-downstream
>     ...
>
> I have added instrumentation in kvm_cpu_exec() to track the amount of time spent
> in emulation (patch at the end, but not relevant for this discussion).
>
> Here is what I see when I assign devices one after the other. NOTE: the timestamp is in
> msec. The linear increase in the time comes from memory_region_transaction_commit().
>
> (qemu) device_add pci-assign,host=28:10.1,bus=pciehp.3.7
> QEMU long exit vCPU 0 25 2
> QEMU long exit vCPU 0 22 2
> QEMU long exit vCPU 0 21 2
> QEMU long exit vCPU 0 22 2
> QEMU long exit vCPU 0 21 2
> QEMU long exit vCPU 0 21 2
> QEMU long exit vCPU 0 21 2
> QEMU long exit vCPU 0 21 2
> QEMU long exit vCPU 0 21 2
> QEMU long exit vCPU 0 21 2
> QEMU long exit vCPU 0 21 2
> QEMU long exit vCPU 0 45 2  <<<
> QEMU long exit vCPU 0 23 2
>
> (qemu) device_add pci-assign,host=28:10.2,bus=pciehp.3.8
> QEMU long exit vCPU 0 26 2
> QEMU long exit vCPU 0 24 2
> QEMU long exit vCPU 0 23 2
> QEMU long exit vCPU 0 23 2
> QEMU long exit vCPU 0 23 2
> QEMU long exit vCPU 0 23 2
> QEMU long exit vCPU 0 23 2
> QEMU long exit vCPU 0 23 2
> QEMU long exit vCPU 0 23 2
> QEMU long exit vCPU 0 23 2
> QEMU long exit vCPU 0 23 2
> QEMU long exit vCPU 0 49 2 <<<
> QEMU long exit vCPU 0 25 2
>
>
> (qemu) device_add pci-assign,host=28:10.3,bus=pciehp.3.9
> QEMU long exit vCPU 0 28 2
> QEMU long exit vCPU 0 26 2
> QEMU long exit vCPU 0 25 2
> QEMU long exit vCPU 0 25 2
> QEMU long exit vCPU 0 25 2
> QEMU long exit vCPU 0 25 2
> QEMU long exit vCPU 0 24 2
> QEMU long exit vCPU 0 24 2
> QEMU long exit vCPU 0 24 2
> QEMU long exit vCPU 0 24 2
> QEMU long exit vCPU 0 24 2
> QEMU long exit vCPU 0 52 2 <<<
> QEMU long exit vCPU 0 26 2
>
> (qemu) device_add pci-assign,host=28:10.4,bus=pciehp.3.10
> QEMU long exit vCPU 0 35 2
> QEMU long exit vCPU 0 28 2
> QEMU long exit vCPU 0 26 2
> QEMU long exit vCPU 0 27 2
> QEMU long exit vCPU 0 26 2
> QEMU long exit vCPU 0 26 2
> QEMU long exit vCPU 0 26 2
> QEMU long exit vCPU 0 26 2
> QEMU long exit vCPU 0 26 2
> QEMU long exit vCPU 0 26 2
> QEMU long exit vCPU 0 26 2
> QEMU long exit vCPU 0 56 2 <<<
> QEMU long exit vCPU 0 28 2
>
> (qemu) device_add pci-assign,host=28:10.5,bus=pciehp.3.11
> QEMU long exit vCPU 0 33 2
> QEMU long exit vCPU 0 30 2
> QEMU long exit vCPU 0 28 2
> QEMU long exit vCPU 0 29 2
> QEMU long exit vCPU 0 28 2
> QEMU long exit vCPU 0 28 2
> QEMU long exit vCPU 0 28 2
> QEMU long exit vCPU 0 28 2
> QEMU long exit vCPU 0 28 2
> QEMU long exit vCPU 0 28 2
> QEMU long exit vCPU 0 28 2
> QEMU long exit vCPU 0 59 2 <<<
> QEMU long exit vCPU 0 30 2
>
>
>
> diff --git a/kvm-all.c b/kvm-all.c
> index 3ae30ee..e3a1964 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -1685,6 +1685,8 @@ int kvm_cpu_exec(CPUState *cpu)
>   {
>       struct kvm_run *run = cpu->kvm_run;
>       int ret, run_ret;
> +    int64_t clock_ns = get_clock(), delta_ms;
> +    __u32 last_exit_reason = 0, last_vcpu = cpu->cpu_index;
>   
>       DPRINTF("kvm_cpu_exec()\n");
>   
> @@ -1711,6 +1713,12 @@ int kvm_cpu_exec(CPUState *cpu)
>           }
>           qemu_mutex_unlock_iothread();
>   
> +        delta_ms = (get_clock() - clock_ns) / (1000 * 1000);
> +        if (delta_ms >= 10) {
> +            fprintf(stderr, "QEMU long exit vCPU %u %" PRId64 " %u\n",
> +                    last_vcpu, delta_ms, last_exit_reason);
> +        }
> +
>           run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
>   
>           qemu_mutex_lock_iothread();
> @@ -1727,7 +1735,15 @@ int kvm_cpu_exec(CPUState *cpu)
>               abort();
>           }
>   
> +        /*
> +         * Capture exit reasons
> +         */
> +        clock_ns = get_clock();
> +        last_exit_reason = run->exit_reason;
> +        last_vcpu = cpu->cpu_index;
> +
>
> thanks,
> Etienne
>
