From: "Alex Bennée" <alex.bennee@linaro.org>
To: Laurent Desnogues <laurent.desnogues@gmail.com>
Cc: "Lukáš Doktor" <ldoktor@redhat.com>,
"Peter Maydell" <peter.maydell@linaro.org>,
"Stefan Hajnoczi" <stefanha@gmail.com>,
"Richard Henderson" <richard.henderson@linaro.org>,
"QEMU Developers" <qemu-devel@nongnu.org>,
ahmedkhaledkaraman@gmail.com,
"Aleksandar Markovic" <aleksandar.qemu.devel@gmail.com>,
"Emilio G . Cota" <cota@braap.org>,
"Gerd Hoffmann" <kraxel@redhat.com>
Subject: Re: [INFO] Some preliminary performance data
Date: Sat, 09 May 2020 17:49:55 +0100
Message-ID: <87v9l5vyrw.fsf@linaro.org>
In-Reply-To: <CABoDooMJcJi0gKx4HNk50X4YZ4DazgoMLU3y5-KgNrwR_w6-jw@mail.gmail.com>
Laurent Desnogues <laurent.desnogues@gmail.com> writes:
> On Sat, May 9, 2020 at 2:38 PM Aleksandar Markovic
> <aleksandar.qemu.devel@gmail.com> wrote:
>>
>> On Sat, May 9, 2020 at 13:37 Laurent Desnogues
>> <laurent.desnogues@gmail.com> wrote:
>> >
>> > On Sat, May 9, 2020 at 12:17 PM Aleksandar Markovic
>> > <aleksandar.qemu.devel@gmail.com> wrote:
>> > > On Wed, May 6, 2020 at 13:26 Alex Bennée <alex.bennee@linaro.org> wrote:
>> > >
>> > > > This is very much driven by how much code generation vs running you see.
>> > > > In most of my personal benchmarks I never really notice code generation
>> > > > because I give my machines large amounts of RAM, so code tends to stay
>> > > > resident and does not need to be re-translated. When the optimiser shows up
>> > > > it's usually accompanied by high TB flush and invalidate counts in "info
>> > > > jit" because we are doing more translation than we usually do.
>> > > >
>> > >
>> > > Yes, I think the machine was set up with only 128MB RAM.
>> > >
>> > > That would be an interesting experiment for Ahmed actually - to
>> > > measure the impact of the amount of RAM on performance.
>> > >
>> > > But it looks like, at least for machines with small RAM, the translation
>> > > phase will take a significant percentage of the time.
>> > >
>> > > I am attaching a call graph of the translation phase for "Hello World" built
>> > > for mips and emulated by QEMU (tb_gen_code() and its callees).
>> >
>>
>> Hi, Laurent,
>>
>> "Hello world" was taken as an example where code generation is
>> dominant. It was taken to illustrate how performance-wise code
>> generation overhead is distributed (illustrating dominance of a
>> single function).
>>
>> While "Hello world" by itself is not a significant example, it conveys
>> useful information: it shows how much overhead comes from QEMU
>> linux-user executable initialization, and from code generation spent on
>> emulating the loading of the target executable and the printing of a
>> simple message. This can be roughly deducted from the result for
>> a meaningful benchmark.
>>
>> Booting a virtual machine is a legitimate scenario for measuring
>> performance, and perhaps even for attempting to improve it.
>>
>> Everything should be measured - code generation, JIT-ed code
>> execution, and helpers execution - in all cases, and checked
>> whether it departs from expected behavior.
>>
>> Let's say that we emulate a benchmark that basically runs some
>> code in a loop, or an algorithm - one would expect that after a
>> while, while increasing number of iterations of the loop, or the
>> size of data in the algorithm, code generation becomes less and
>> less significant, converging to zero. Well, this should be confirmed
>> with an experiment, and not taken for granted.
>>
>> I think limiting measurements to, let's say, execution of
>> JIT-ed code only (if that is what you implied) is a logical mistake.
>> The right conclusions should be drawn from the complete
>> picture, shouldn't they?
>
> I explicitly wrote that you should consider a wide spectrum of
> programs so I think we're in violent agreement ;-)
If you want a good example of a real world use case where we could
improve things then I suggest looking at compilers.

They are frequently instantiated once per compilation unit and once done
all the JIT translations are thrown away. While the code-path taken by a
compiler may be different for every unit it compiles I bet there are
savings we could make by caching compilation. The first step would be
identifying how similar the profiles of the generated code are.
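
As a rough illustration of that first step (a sketch only, not an
existing QEMU feature): assuming each compiler run could be reduced to a
profile of translated blocks keyed by guest PC plus a hash of the guest
code bytes, the similarity between two runs could be measured as below.
The profile format, the keys and the function names are all hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity of two collections of keys; 1.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def weighted_overlap(exec_a, exec_b):
    """Share of run A's block executions spent in blocks that run B also translated."""
    total = sum(exec_a.values())
    if total == 0:
        return 0.0
    shared = sum(count for key, count in exec_a.items() if key in exec_b)
    return shared / total


# Hypothetical profiles: (guest_pc, hash of guest code bytes) -> execution count.
run_a = Counter({(0x400000, "a1"): 10, (0x400040, "b2"): 5, (0x400080, "c3"): 1})
run_b = Counter({(0x400000, "a1"): 7, (0x400040, "b2"): 2, (0x4000C0, "d4"): 4})

print(jaccard(run_a, run_b))           # 2 shared blocks out of 4 distinct -> 0.5
print(weighted_overlap(run_a, run_b))  # 15 of 16 executions -> 0.9375
```

A high weighted overlap with a low Jaccard score would suggest that a
small set of hot blocks is shared between runs, which is the case where
caching translations would likely pay off most.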
--
Alex Bennée
Thread overview: 11+ messages
2020-05-02 23:20 [INFO] Some preliminary performance data Aleksandar Markovic
2020-05-02 23:24 ` Aleksandar Markovic
2020-05-03 6:47 ` Ahmed Karaman
2020-05-06 11:26 ` Alex Bennée
2020-05-09 10:16 ` Aleksandar Markovic
2020-05-09 10:26 ` Aleksandar Markovic
2020-05-09 11:36 ` Laurent Desnogues
2020-05-09 12:37 ` Aleksandar Markovic
2020-05-09 12:50 ` Laurent Desnogues
2020-05-09 12:55 ` Aleksandar Markovic
2020-05-09 16:49 ` Alex Bennée [this message]