From: Iurii Konovalenko <iurii.konovalenko@globallogic.com>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Julien Grall <julien.grall@linaro.org>,
xen-devel@lists.xen.org,
Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: Re: Question about high CPU load during iperf ethernet testing
Date: Sat, 27 Sep 2014 14:20:08 +0300 [thread overview]
Message-ID: <CABc08z+QNA1s3DoQFAuTcHrUaFFUBe495=rvmvrHc44RY31cxQ@mail.gmail.com> (raw)
In-Reply-To: <1411657808.2053.19.camel@kazak.uk.xensource.com>
On Thu, Sep 25, 2014 at 6:10 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Thu, 2014-09-25 at 17:12 +0300, Iurii Konovalenko wrote:
>> On Wed, Sep 24, 2014 at 8:57 PM, Stefano Stabellini
>> <stefano.stabellini@eu.citrix.com> wrote:
>> > On Wed, 24 Sep 2014, Iurii Konovalenko wrote:
>> >> Hi, Stefano!
>> >> Thank you for your reply!
>> >>
>> >> On Tue, Sep 23, 2014 at 7:41 PM, Stefano Stabellini
>> >> <stefano.stabellini@eu.citrix.com> wrote:
>> >> > On Mon, 22 Sep 2014, Iurii Konovalenko wrote:
>> >> >> Hello, all!
>> >> >>
>> >> >> I am running iperf ethernet tests on DRA7XX_EVM board (OMAP5).
>> >> >> Xen version is 4.4.
>> >> >> I run only Linux (kernel 3.8) as Dom0, with no other active domains (for clean test results I decided not to start a DomU).
>> >> >> iperf server is started on host, iperf client is started on board with command line "iperf -c 192.168.2.10 -w 256k -m
>> >> >> -f M -d -t 60".
>> >> >
>> >> > Just to double check: you are running the iperf test in Dom0, correct?
>> >>
>> >> Yes, iperf is running in Dom0.
>> >>
>> >> >> During the test I studied CPU load with the top tool in Dom0 and saw that one VCPU is fully loaded, spending about 50% in
>> >> >> software IRQs and 50% in system time.
>> >> >> Running the same test on native Linux without Xen, I saw that the CPU load is about 2-4%.
>> >> >>
>> >> >> I decided to debug a bit, so I used the expression "({register uint64_t _r; asm volatile("mrrc " "p15, 0, %0, %H0, c14" ";" : "=r"
>> >> >> (_r)); _r; })" to read the timer counter before and after the operations I wanted to measure.
>> >> >>
>> >> >> This way I found that most of the CPU time is spent in the functions enable_irq/disable_irq_nosync and
>> >> >> spin_lock_irqsave/spin_unlock_irqrestore (mostly in "mrs %0, cpsr @ arch_local_irq_save"/"msr cpsr_c, %0 @
>> >> >> local_irq_restore"). When running without Xen, they should not take so much time.
>> >> >
>> >> > There is nothing Xen specific in the Linux ARM implementation of
>> >> > spin_lock_irqsave/spin_unlock_irqrestore and
>> >> > enable_irq/disable_irq_nosync.
>> >> >
>> >>
>> >> That is strange, because my measurements show that a lot of time is
>> >> spent there when running in Dom0: for example, about 20% in
>> >> spin_unlock_irqrestore (mostly in the mrs instruction).
>> >
>> > Unless you are doing something wrong in your measurements, if you really
>> > narrowed it down to one instruction then I would try the same on a
>> > different SoC from another vendor to see if it is actually a hardware issue.
>> >
>> >
>> >> >> So, could anyone answer a few questions for me:
>> >> >> 1. Is this normal behaviour?
>> >> >
>> >> > No, it is not normal.
>> >> > Assuming that you assign all the memory to Dom0 and as many vcpus as
>> >> > there are physical cpus on your platform, you should get the same
>> >> > numbers as native.
>> >>
>> >> OK, so I might do something wrong.
>> >>
>> >> >> 2. Does the hypervisor trap the cpsr register? I suppose that the hypervisor traps accesses to the cpsr register, leading to
>> >> >> additional overhead, but I can't find the place in the sources where that happens.
>> >> >
>> >> > We don't trap cpsr.
>> >>
>> >> That is strange, because it was my only assumption about where the time could be spent.
>> >> So could you please advise where to look to understand the reason for
>> >> such a high VCPU load?
>> >
>> > I don't know. When you say that arch_local_irq_save is the one taking
>> > all the time, do you actually have something like:
>> >
>> > time1 = read CNTVCT;
>> > arch_local_irq_save();
>> > time2 = read CNTVCT;
>> > printk(time2-time1);
>> >
>> > in your code?
>>
>> Almost like this, with the only difference that I accumulate the time
>> differences in a variable and print once its value is greater than 1 s,
>> like this:
>>
>> volatile u64 total;
>>
>> #define SECONDS(_s) ((s64)((_s) * 1000000000ULL))
>>
>> static inline s64 ticks_to_ns(uint64_t ticks)
>> {
>>         return muldiv64(ticks, SECONDS(1), 1000 * 6144);
>> }
>>
>> time1 = read CNTVCT;
>> arch_local_irq_save();
>> time2 = read CNTVCT;
>> total += time2 - time1;
>> if (ticks_to_ns(total) > SECONDS(1)) {
>>         printk("1 second spent here");
>>         total = 0;
>> }
>
> Doesn't that just tell you that this code is called frequently and hence
> clocks up the total faster? As opposed to telling you that this
> operation is slow.
>
> I'd expect IRQs to go on and off quite a lot under normal operation:
> certain spinlock variants will disable irqs, for example, and might be
> frequently used.
>
> Ian.
Of course it could be due to frequent calls, but then why is there such a
difference with and without Xen? I thought the number of spinlock calls
would be more or less the same in both cases.
Best regards.
Iurii Konovalenko | Senior Software Engineer
GlobalLogic
Thread overview: 10+ messages
2014-09-22 13:01 Question about high CPU load during iperf ethernet testing Iurii Konovalenko
2014-09-23 16:41 ` Stefano Stabellini
2014-09-24 16:37 ` Iurii Konovalenko
2014-09-24 17:57 ` Stefano Stabellini
2014-09-25 14:12 ` Iurii Konovalenko
2014-09-25 15:10 ` Ian Campbell
2014-09-27 11:20 ` Iurii Konovalenko [this message]
2014-09-23 16:48 ` Ian Campbell
2014-09-24 17:01 ` Iurii Konovalenko
2014-09-24 17:51 ` Stefano Stabellini