Date: Sun, 27 Nov 2016 20:32:44 +0100
From: Alessandro Di Federico
Sender: ale@clearmind.me
To: qemu-devel@nongnu.org
Cc: Yan, "Jonas Zaddach (jzaddach)"
Subject: [Qemu-devel] Support for using TCG frontend as a library

Hi all,

QEMU is a great emulator, but in recent years it has also been used for
instrumentation purposes [QIRA,AFL] or as a lifter for static analysis
purposes [rev.ng,angr,libqemu,S²E].

I'd like to hear your take on the second use case, and the possibility
of offering upstream support for it.

The general idea is to introduce a new build configuration which
produces a library for each supported input ISA exposing the TCG
frontend in a unified way. We could call it libtcg-$ARCH.so.

In practice, given a buffer containing code for a certain architecture,
the user program loads the appropriate version of this library and asks
it to produce the corresponding TCG instructions.

I've been investigating the needs of the various projects that might be
interested in using it and they sum up to the following:

* Be able to load in the same process multiple libtcg-$ARCH.so for
  different architectures.
* Obtain the TCG instructions from code in a memory buffer.
* Dump the assembly code of the code in a memory buffer.
* Dump the TCG instructions in textual form.

As for helpers, it would be nice to have some metadata about them, for
instance the parts of the CPU state they can change. It would also be
nice to have a build configuration which produces a library containing
all the helpers ready to be used, or, even better, a library as LLVM
bitcode, which can then be further processed/analyzed.

Here you can find some relevant parts of my draft implementation, which
is part of rev.ng:

* The interface exposed to users:
  https://polimicg.org/gitlab/revng/qemu/blob/develop/linux-user/ptc.h
* Implementation of the interface functions:
  https://polimicg.org/gitlab/revng/qemu/blob/develop/linux-user/ptc.c
* For the changes introduced elsewhere look for CONFIG_LIBTINYCODE:
  https://polimicg.org/gitlab/search?utf8=%E2%9C%93&search=CONFIG_LIBTINYCODE&group_id=&project_id=83&search_code=true&repository_ref=develop

It's rough but it works (see [rev.ng]).

I'm interested to hear your opinion and willingness to take patches.
Being able to unify the various efforts in this direction would be
good; having upstream support would be amazing.

-- 
Alessandro Di Federico
PhD student at Politecnico di Milano

[QIRA] http://qira.me/
[AFL] http://lcamtuf.coredump.cx/afl/ (for the black-box mode)
[rev.ng] https://rev.ng/
[angr] http://angr.io/ (currently using VEX IR, QEMU support planned)
[libqemu] https://github.com/zaddach/libqemu
[S²E] http://s2e.epfl.ch/
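
To make the first requirement in the list above more concrete (several
libtcg-$ARCH.so loaded in one process), here is a minimal sketch of what
a client might do. It only covers loading: the function names
libtcg_name and load_frontends are hypothetical, nothing here reflects
the actual ptc.h interface, and it assumes the libraries are findable by
the dynamic loader under the proposed libtcg-$ARCH.so name.

```python
import ctypes


def libtcg_name(arch: str) -> str:
    """Build a file name following the proposed libtcg-$ARCH.so scheme."""
    return f"libtcg-{arch}.so"


def load_frontends(arches, loader=ctypes.CDLL):
    """Try to load one frontend library per architecture into this process.

    Returns (loaded, missing): a dict mapping arch -> library handle, and
    a list of the architectures whose library could not be loaded.
    The `loader` parameter exists only so the function can be exercised
    without real libraries (an assumption made for this sketch).
    """
    loaded, missing = {}, []
    for arch in arches:
        try:
            # RTLD_LOCAL asks the dynamic loader not to expose this
            # library's symbols to libraries loaded afterwards, which is
            # what lets same-named entry points coexist across frontends.
            loaded[arch] = loader(libtcg_name(arch), mode=ctypes.RTLD_LOCAL)
        except OSError:
            # The library for this architecture is not installed.
            missing.append(arch)
    return loaded, missing


if __name__ == "__main__":
    loaded, missing = load_frontends(["arm", "x86_64"])
    print("loaded:", sorted(loaded))
    print("missing:", missing)
```

Loading with local symbol visibility is only half of the story: each
library would still have to be built so that its own global state does
not collide with that of the others, which is a property of the proposed
build configuration rather than of the client.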