From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=36301 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Obtad-0007RP-KB
	for qemu-devel@nongnu.org; Thu, 22 Jul 2010 07:05:25 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from ) id 1ObtaZ-0003sj-0K
	for qemu-devel@nongnu.org; Thu, 22 Jul 2010 07:05:23 -0400
Received: from goliath.siemens.de ([192.35.17.28]:22995)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from ) id 1ObtaY-0003rx-MX
	for qemu-devel@nongnu.org; Thu, 22 Jul 2010 07:05:18 -0400
Message-ID: <4C482600.3000208@siemens.com>
Date: Thu, 22 Jul 2010 13:05:36 +0200
From: Jan Kiszka
MIME-Version: 1.0
References: <438ED7C1-AD7C-4946-99D2-B0E9A91B8DF1@gmail.com>
	<6BBDD0C9-39D5-435C-8CD7-4E3DD8BAF57D@gmail.com>
	<4C472889.5000407@mail.berlios.de>
	<2FAAD29F-6C3E-4128-A6E1-46EE15AF80FA@gmail.com>
In-Reply-To: <2FAAD29F-6C3E-4128-A6E1-46EE15AF80FA@gmail.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: [Qemu-devel] Re: Release of COREMU, a scalable and portable full-system emulator
List-Id: qemu-devel.nongnu.org
To: Chen Yufei
Cc: qemu-devel@nongnu.org

Chen Yufei wrote:
> On 2010-7-22, at 1:04 AM, Stefan Weil wrote:
>
>> On 21.07.2010 09:03, Chen Yufei wrote:
>>> On 2010-7-21, at 5:43 AM, Blue Swirl wrote:
>>>
>>>> On Sat, Jul 17, 2010 at 10:27 AM, Chen Yufei wrote:
>>>>
>>>>> We are pleased to announce COREMU, which is a "multicore-on-multicore" full-system emulator built on Qemu. (Simply speaking, we made Qemu parallel.)
>>>>>
>>>>> The project web page is located at:
>>>>> http://ppi.fudan.edu.cn/coremu
>>>>>
>>>>> You can also download the source code and images to play with from SourceForge:
>>>>> http://sf.net/p/coremu
>>>>>
>>>>> COREMU is composed of
>>>>> 1. a parallel emulation library
>>>>> 2. a set of patches to qemu
>>>>> (We worked on the master branch, commit 54d7cf136f040713095cbc064f62d753bff6f9d2)
>>>>>
>>>>> It currently supports full-system emulation of x64 and ARM MPcore platforms.
>>>>>
>>>>> By leveraging the underlying multicore resources, it can emulate up to 255 cores running commodity operating systems (even on a 4-core machine).
>>>>>
>>>>> Enjoy,
>>>>>
>>>> Nice work. Do you plan to submit the improvements back to upstream QEMU?
>>>>
>>> It would be great if we could submit our code to QEMU, but we do not know the process.
>>> Would you please give us some instructions?
>>>
>>> --
>>> Best regards,
>>> Chen Yufei
>>>
>> Some hints can be found here:
>> http://wiki.qemu.org/Contribute/StartHere
>>
>> Kind regards,
>> Stefan Weil
>
> The patch is in the attachment, produced with the command
> git diff 54d7cf136f040713095cbc064f62d753bff6f9d2
>
> In order to separate out what needs to be done to make QEMU parallel, we created a separate library, and the patched QEMU needs to be compiled and linked with that library. To submit our enhancement to QEMU, maybe we need to incorporate this library into QEMU. I don't know what would be the best solution.

For upstream QEMU, the goal should be to integrate your modifications
and enhancements into the existing architecture in a mostly seamless
way. The library approach may help you maintain your changes out of
tree, but it is unlikely to benefit an in-tree extension of QEMU for
parallel TCG VCPUs.

>
> Our approach to making QEMU parallel can be found at http://ppi.fudan.edu.cn/coremu
>
> I will give a short summary here:
>
> 1. Each emulated core thread runs a separate binary translator engine and has a private code cache. We marked some variables in TCG as thread local. We also modified the TB invalidation mechanism.
>
> 2. Each core has a queue holding pending interrupts. The COREMU library provides this queue, and interrupt notification is done by sending realtime signals to the emulated core thread.
>
> 3. Atomic instruction emulation has to be modified for parallel emulation. We use a lightweight memory transaction, which requires only the compare-and-swap instruction, to emulate atomic instructions.
>
> 4. Some code in the original QEMU may cause data race bugs after we make it parallel. We fixed these problems.
>

Upstream integration requires such iterative steps as well, in the form
of ideally small, focused patches that eventually convert QEMU into a
parallel emulator.

Also note that upstream already supports threaded VCPUs in KVM mode.
You have obviously resolved the major blocking points for applying this
to TCG mode as well, but I don't yet see why we would need a new VCPU
threading infrastructure for this. Only minor tuning of what KVM
already uses should suffice, if that is required at all.

To get started, you could identify some of the more trivial changes in
your patches, split them out, rebase them onto the latest qemu.git, and
then post them as a patch series for inclusion (see the mailing list
for various examples). Make sure to describe the reason for your
changes as clearly as possible, especially if they are not (yet)
obvious in the absence of COREMU features in upstream QEMU.

Be prepared for merging your code to be a lengthy process with quite a
few discussions about why and how things are done, likely also with
requests to change your current solution in some aspects. However, the
result should be an optimal solution for the overall goal, parallel
VCPU emulation, and you would no longer need to maintain your private
set of patches against a quickly evolving QEMU.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux