From: Alex Bennée
To: "Emilio G. Cota"
Cc: qemu-devel@nongnu.org, Pavel.Dovgaluk@ispras.ru, vilanova@ac.upc.edu
Date: Mon, 08 Oct 2018 11:28:38 +0100
Subject: Re: [Qemu-devel] [RFC PATCH 00/21] Trace updates and plugin RFC
Message-ID: <875zycohpl.fsf@linaro.org>
In-reply-to: <20181007012337.GA23787@flamenco>
References: <20181005154910.3099-1-alex.bennee@linaro.org> <20181007012337.GA23787@flamenco>

Emilio G. Cota writes:

> On Fri, Oct 05, 2018 at 16:48:49 +0100, Alex Bennée wrote:
> (snip)
>> ==Known Limitations==
>>
>> Currently there is only one hook allowed per trace event. We could
>> make this more flexible or simply just error out if two plugins try
>> and hook to the same point. What are the expectations of running
>> multiple plugins hooking into the same point in QEMU?
>
> It's very common. All popular instrumentation tools (e.g. PANDA,
> DynamoRIO, Pin) support multiple plugins.

Fair enough.
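
As a rough sketch of what lifting the one-hook-per-event limit might look
like, a hook point could hold a small array of callbacks and call each one
in registration order. All names here (hook_point, hook_register and so on)
are hypothetical and for illustration only; none of this is an existing
QEMU or trace API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: this is not an existing QEMU API. */
#define MAX_HOOKS 8

typedef void (*hook_fn)(void *opaque, uint64_t pc);

struct hook_entry {
    hook_fn fn;
    void *opaque;   /* per-plugin state, handed back untouched */
};

struct hook_point {
    struct hook_entry entries[MAX_HOOKS];
    size_t count;
};

/* Register a callback; returns 0 on success, -1 if the point is full. */
static int hook_register(struct hook_point *hp, hook_fn fn, void *opaque)
{
    assert(hp != NULL && fn != NULL);
    if (hp->count == MAX_HOOKS) {
        return -1;
    }
    hp->entries[hp->count].fn = fn;
    hp->entries[hp->count].opaque = opaque;
    hp->count++;
    return 0;
}

/* Remove the first matching (fn, opaque) pair, keeping the others in order.
 * Returns 0 if an entry was removed, -1 if no match was found. */
static int hook_unregister(struct hook_point *hp, hook_fn fn, void *opaque)
{
    assert(hp != NULL);
    for (size_t i = 0; i < hp->count; i++) {
        if (hp->entries[i].fn == fn && hp->entries[i].opaque == opaque) {
            for (size_t j = i + 1; j < hp->count; j++) {
                hp->entries[j - 1] = hp->entries[j];
            }
            hp->count--;
            return 0;
        }
    }
    return -1;
}

/* Called at the event site: every subscriber sees the event. */
static void hook_dispatch(const struct hook_point *hp, uint64_t pc)
{
    assert(hp != NULL);
    for (size_t i = 0; i < hp->count; i++) {
        hp->entries[i].fn(hp->entries[i].opaque, pc);
    }
}

/* Example subscriber: bumps the uint64_t counter that opaque points at. */
static void count_event(void *opaque, uint64_t pc)
{
    (void)pc;
    (*(uint64_t *)opaque)++;
}
```

Two plugins would each call hook_register() on the same hook_point with
their own opaque pointer, and hook_dispatch() then reaches both. Walking
the list on every event is the simple option; for very frequent events the
per-call overhead of doing so is a real concern.
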
>
>> ==TCG Hooks==
>>
>> Thanks to Lluís' work the trace API already splits up TCG events into
>> translation time and exec time events and provides the machinery for
>> hooking a trace helper into the translation stream. Currently that
>> helper is unconditionally added although perhaps we could expand the
>> call convention a little for TCG events to allow the translation time
>> event to filter out planting the execution time helper?
>
> A TCG helper is suboptimal for these kind of events, e.g. instruction/TB/
> mem callbacks, because (1) these events happen *very* often, and
> (2) a helper then has to iterate over a list of callbacks (assuming
> we support multiple plugins). That is, one TCG helper call,
> plus cache misses for the callback pointers, plus function calls
> to call the callbacks. That adds up to 2x average slowdown
> for SPEC06int, instead of 1.5x slowdown when embedding the
> callbacks directly into the generated code. Yes, you have to
> flush the code when unsubscribing from the event, but that cost
> is amortized by the savings you get when the callbacks occur,
> which are way more frequent.

What would you want instead of a TCG helper? But certainly being able
to be selective about which instances of each trace point are
instrumented will be valuable.

> Besides performance, to provide a pleasant plugin experience we need
> something better than the current tracing callbacks.

What I hope to avoid in re-using trace points is having a whole bunch
of additional hook points just for plugins. However, nothing stops us
adding more tracepoints at more useful places for instrumentation. We
could also do it on a whitelist basis similar to the way the tcg events
are marked.

>
>> ===Instruction Tracing===
>>
>> Pavel's series had a specific hook for instrumenting individual
>> instructions.
>> I have not yet added it to this series but I think it be
>> done in a slightly cleaner way now we have the ability to insert TCG
>> ops into the instruction stream.
>
> I thought Peter explicitly disallowed TCG generation from plugins.
> Also, IIRC others also mentioned that exposing QEMU internals
> (e.g. "struct TranslationBlock", or "struct CPUState") to plugins
> was not on the table.

We definitely want to avoid plugin controlled code generation but the
tcg tracepoint mechanism is transparent to the plugin itself. I think
the pointers should really be treated as anonymous handles rather than
windows into QEMU's internals. Arguably some of the tracepoints should
be exporting more useful numbers (I used cpu->cpu_index in the TLB
trace points) but I don't know if we can change existing trace point
definitions to clean that up.

Again if we whitelist tracepoints for plugins we can be more careful
about the data exported.

>
>> If we add a tracepoint for post
>> instruction generation which passes a buffer with the instruction
>> translated and method to insert a helper before or after the
>> instruction. This would avoid exposing the cpu_ldst macros to the
>> plugins.
>
> Again, for performance you'd avoid the tracepoint (i.e. calling
> a helper to call another function) and embed directly the
> callback from TCG. Same thing applies to TB's.

OK I see what you mean. I think that is doable although it might take a
bit more tcg plumbing.

>
>> So what do people think? Could this be a viable way to extend QEMU
>> with plugins?
>
> For frequent events such as the ones mentioned above, I am
> not sure plugins can be efficiently implemented under
> tracing.

I assume some form of helper-per-instrumented-event/insn is still going
to be needed though? We are not considering some sort of EBF craziness?

> For others (e.g. cpu_init events), sure, they could.
> But still, differently from tracers, plugins can come and go
> anytime, so I am not convinced that merging the two features
> is a good idea.

I don't think we have to mirror tracepoints and plugin points but I'm in
favour of sharing the general mechanism and tooling rather than having a
whole separate set of hooks. We certainly don't want anything like:

          trace_exec_tb(tb, pc);
          plugin_exec_tb(tb, pc);

scattered throughout the code where the two do align.

>
> Thanks,
>
> Emilio


--
Alex Bennée