From: Claudio Fontana <cfontana@suse.de>
To: "Alex Bennée" <alex.bennee@linaro.org>
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
"Philippe Mathieu-Daudé" <philmd@redhat.com>,
qemu-devel@nongnu.org, "Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: sharing intention for developing per-target, dynamically loadable accelerator modules
Date: Thu, 21 May 2020 13:28:37 +0200 [thread overview]
Message-ID: <d5ddb936-f243-5de3-fc03-2e2ac8452e6e@suse.de> (raw)
In-Reply-To: <350b2ad7-b92d-659b-481a-e41fe7cfc738@suse.de>
On 5/19/20 9:53 AM, Claudio Fontana wrote:
> On 5/18/20 8:18 PM, Alex Bennée wrote:
>>
>> Claudio Fontana <cfontana@suse.de> writes:
>>
>>> Hello all,
>>>
>>> my intention would be to develop per-target, dynamically loadable accelerator modules.
>>>
>>> This would allow to distribute a single QEMU base binary, and then provide accelerators as optional additional binary packages to install,
>>> with the first separate optional package being TCG.
>>>
>>> CONFIG_TCG would become 'm' as a result, but then also CONFIG_KVM, CONFIG_HAX, CONFIG_WHPX, CONFIG_HVF.
>>>
>>> Here are some elements that seem to be needed:
>>>
>>> 1 - The module CONFIG_MODULE part of the build system would need some extension to add per-target modules. I have some tentative results that shows that this is possible (but a bit clunky atm).
>>> There is some existing instability in the existing Makefile infrastructure of modules that shows up when trying to extend it.
>>>
>>> 2 - new "accelerator drivers" seems to be needed, either in addition or as additional functionality inside the current AccelState.
>>>
>>> 3 - for target/i386 in particular, there is some refactoring work needed to split even more different unrelated bits and pieces.
>>> dependencies of hw/i386 machine stuff with accelerator-specific
>>> stuff are also painful.
>>
>> There are a couple of per-arch hacks in the main TCG cpu loops it would
>> be good to excise from the code.
>>
>>>
>>> 4 - CPU Arch Classes could be extended with per-accelerator methods. Initial fooling around shows it should probably work.
>>> One alternative would be trying to play with the dynamic linker (weak symbols, creative use of dlsym etc), but I have not sorted out the details of this option.
>>>
>>> 5 - cputlb, in particular tlb_flush and friends is a separate problem
>>> since it is not part of the cpuclass. Should it be?
>>
>> tlb_flush and friends are TCG implementation details for softmmu that
>> ensure a lookup for any address will always return to a guest specific
>> tlb_fill to rebuild the cache. The behaviour is not guest-specific
>> per-se although we do partition the table entries based on the guest
>> size.
>>
>> Perhaps we can make it more dynamic but it would have to ensure both the
>> slow path and the fast path are using the same mask and shifts when
>> walking the table.
>>
>>> 6 - a painpoint is represented by the fact that in USER mode, the accel class is not applied, which causes a lot of uncleanliness all around
>>> (tcg_allowed outside of the AccelClass).
>>
>> The user-mode run loops are a whole separate chunk of code. I don't know
>> if it is worth trying to make them plugable as you will only ever have
>> one per linux-user binary.
>>
>>> 7 - I have not really thought about the KConfig aspects because I am not super-familiar
>>>
>>> 8 - cpus.c needs some good splitting
>>
>> Although there are several different run loops in there I think they
>> share a lot of commonality which is why they are bundled together. They
>> all share the same backend for dealing with ioevents and generic
>> pause/unpause machinery. But feel free to prove me wrong ;-)
>
> Hi Alex, I got my first compile, and it is currently in github, I still need to split the series though and there is still cleanup needed.
>
> https://github.com/hw-claudio/qemu.git
> branch "cpus-refactoring"
>
> just in case you are interested in a peek.
>
> The idea results currently in:
>
> cpu-throttle.c | 154 +++++++++
> cpu-timers.c | 784 +++++++++++++++++++++++++++++++++++++++++++++
> cpus-tcg.c | 515 ++++++++++++++++++++++++++++++
> cpus.c | 1405 +++++----------------------------------------------------------------------------
>
> New interfaces are in:
>
> include/sysemu/cpu-throttle.h | 50 +++
> include/sysemu/cpu-timers.h | 73 +++++
> include/sysemu/cpus.h | 44 ++-
>
> cpu-throttle (new) is self-explanatory, contains the cpu-throttle in cpus.c
> cpu-timers (new) contains the icount, ticks, clock timers from cpus.c
>
> cpus.h adds an interface to per-accel vcpu threads:
>
> qemu_register_start_vcpu(void (*f)(CPUState *cpu));
> bool all_cpu_threads_idle(void);
> bool cpu_can_run(CPUState *cpu);
> void qemu_wait_io_event_common(CPUState *cpu);
> void qemu_wait_io_event(CPUState *cpu);
> void cpu_thread_signal_created(CPUState *cpu);
> void cpu_thread_signal_destroyed(CPUState *cpu);
> void cpu_handle_guest_debug(CPUState *cpu);
>
> Very much still all WIP...
>
> Ciao,
>
> C
>
>
>>
>>> ... more things to find out and think about ...
>>>
>>> Overall, I think that the activity has the potential to provide benefits overall beyond the actual goal, in the form of cleanups, leaner configurations,
>>> minor fixes, maybe improving the CONFIG_MODULE instabilities if any
>>> etc.
>>
>> There are certainly some ugly bits we could shave down in such an
>> exercise.
>>
>>> As an example, the first activity I would plan to submit as RFC is point 8 above,
>>> there is the split between cpus.c and cpus-tcg.c that results in lots of TCG-specific code being removed from non-tcg builds (using CONFIG_TCG).
>>>
>>> One thing that should be kept in check is any performance impact of
>>> the changes, in particular for point 4, hot paths should probably
>>> avoid going through too many pointer indirections.
>>
>> Maybe - certainly for TCG you have pretty much lost if you have exited
>> the main execution loop I doubt it would show up much. Not so sure about
>> the HW accelerators. Most of the performance sensitive stuff is dealt
>> with close to the ioctls IIRC.
>>
>>> Does anybody share similar goals? Any major obstacle or blocker that would put the feasibility into question?
>>> Any suggestion on any of this? In particular point 4 and 5 come to
>>> mind, as well as some better understanding of the reasons behind 6, or
>>> even suggestions on how to best to 2.
>>
>> For linux-user each CPU run loop is it's own special snowflake because
>> pretty much every architecture has it's own special set of EXCP_ exits
>> to deal with various bits. There are per-arch EXCP_ flags for system
>> emulation as well but not nearly as many.
>>
>>>
>>> Anyway, I will continue to work on the first RFC for some smaller initial steps and hopefully have something to submit soon.
>>>
>>> Ciao ciao,
>>>
>>> Claudio
>>
>>
I have tried something a bit more ambitious at
https://github.com/hw-claudio/qemu.git branch: "cpus-interface"
starting to look nice, with each single accelerator registering an interface to cpus.c,
so that the cpu thread can reside within that specific accelerator.
Some workarounds are eliminated as a result as well, for example for the kick.
Removes some windows special casing (note: while doing this I noticed that windows has been bolted in initially for HAX, so there might be some assumptions there that became wrong with the addition of whpx.
In particular, the DUMMY_APC thing. Left the "bug" there, but added a comment so it can be discussed.
The most difficult part is to split the patch: everything is so knit together that "doing one accel at a time" for example is really challenging.
Will try to post what I have and then I will be looking for feedback on how to split it if necessary for easier reviewing, but already some comments on the approach could be valuable..
Ciao,
C
prev parent reply other threads:[~2020-05-21 11:29 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-17 11:30 sharing intention for developing per-target, dynamically loadable accelerator modules Claudio Fontana
2020-05-18 18:18 ` Alex Bennée
2020-05-19 7:53 ` Claudio Fontana
2020-05-21 11:28 ` Claudio Fontana [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d5ddb936-f243-5de3-fc03-2e2ac8452e6e@suse.de \
--to=cfontana@suse.de \
--cc=alex.bennee@linaro.org \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=philmd@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).