Date: Sat, 30 Sep 2017 14:09:41 -0400
From: "Emilio G. Cota"
Message-ID: <20170930180941.GA22048@flamenco>
References: <150529666493.10902.14830445134051381968.stgit@frigg.lan>
 <87poasgjyh.fsf@frigg.lan> <87d16o53xr.fsf@frigg.lan>
 <87o9pywt8k.fsf@frigg.lan> <87shf5zlty.fsf@frigg.lan>
 <20170929175943.GA25038@flamenco> <87vak1w53a.fsf@frigg.lan>
In-Reply-To: <87vak1w53a.fsf@frigg.lan>
Subject: Re: [Qemu-devel] [PATCH v6 01/22] instrument: Add documentation
To: Peter Maydell, QEMU Developers, Stefan Hajnoczi, Markus Armbruster

On Sat, Sep 30, 2017 at 00:46:33 +0300, Lluís Vilanova wrote:
> Emilio G Cota writes:
> > I'm not sure I understand this concept of filtering. Are you saying that in
> > the first case, all memory accesses are instrumented, and then in the
> > "access helper" we only call the user's callback if it's a memory write?
> > And in the second case, we simply just generate a "write helper" instead
> > of an "access helper". Am I understanding this correctly?
>
> In the previous case (no filtering), the user callback is always called when a
> memory access is *executed*, and the user then checks if the access mode is a
> write to decide whether to increment a counter.
>
> In this case (with filtering), a user callback is called when a memory access is
> *translated*, and if the access mode is a write, the user generates a call to a
> second callback that is executed every time a memory access is executed (only
> that it is only generated for memory writes, the ones we care about).
>
> Is this clearer?

I get it now, thanks!

> > FWIW my experiments so far show similar numbers for instrumenting each
> > instruction (haven't done the per-tb yet). The difference is that I'm
> > exposing to instrumenters a copy of the guest instructions (const void *data,
> > size_t size). These copies are kept around until TB's are flushed.
> > Luckily there seems to be very little overhead in keeping these around,
> > apart from the memory overhead -- but in terms of performance, the
> > necessary allocations do not induce significant overhead.
>
> To keep this use-case simpler, I added the memory access API I posted in this
> series, where instrumenters can read guest memory (more general than passing a
> copy of the current instruction).

I see some potential problems with this:
1. Instrumenters' accesses could generate exceptions. I presume we'd want to
   avoid this, or leave it as a debug-only kind of option.
2. Instrumenters won't know where the end of an instruction (for
   variable-length ISAs) or of a TB is (TB != basic block). For instructions
   one could have a loop where we read byte-by-byte and pass it to the
   decoder, something similar to what we have in the capstone code recently
   posted to the list (v4). For TBs, we really should have a way to delimit
   the length of the TB. This is further complicated if we want
   instrumentation to be inserted *before* a TB is translated.
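
For the per-instruction case in point 2, the byte-by-byte loop could look
roughly like the sketch below. Everything in it is hypothetical:
read_guest_byte() stands in for the memory access API from the series,
toy_decode() stands in for a real decoder such as capstone, and MAX_INSN_LEN
is an assumed x86-like bound. None of these are actual QEMU or capstone
interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSN_LEN 15 /* assumption: x86-like upper bound on insn length */

/* Hypothetical guest memory; the last byte starts a truncated instruction. */
static const uint8_t guest_mem[] = {
    0x90, 0x0f, 0x05, 0xb8, 1, 0, 0, 0, 0x0f
};

/*
 * Stand-in for the guest memory read API (hypothetical name).
 * Returns 0 on success, -1 if the access would fault.
 */
static int read_guest_byte(size_t addr, uint8_t *out)
{
    if (addr >= sizeof(guest_mem)) {
        return -1;
    }
    *out = guest_mem[addr];
    return 0;
}

/*
 * Toy decoder (hypothetical): returns the instruction length once
 * buf[0..len) holds a complete instruction, or 0 if more bytes are needed.
 */
static size_t toy_decode(const uint8_t *buf, size_t len)
{
    size_t need;

    switch (buf[0]) {
    case 0x0f: need = 2; break; /* two-byte opcode */
    case 0xb8: need = 5; break; /* opcode + 32-bit immediate */
    default:   need = 1; break;
    }
    return len >= need ? need : 0;
}

/*
 * Feed the decoder one more byte at a time until it accepts.
 * Returns the length of the instruction at pc, or 0 if a read faults
 * or nothing decodes within MAX_INSN_LEN bytes.
 */
static size_t insn_length(size_t pc)
{
    uint8_t buf[MAX_INSN_LEN];
    size_t n;

    for (n = 0; n < MAX_INSN_LEN; n++) {
        size_t len;

        if (read_guest_byte(pc + n, &buf[n]) != 0) {
            return 0; /* point 1: the instrumenter's own access faulted */
        }
        len = toy_decode(buf, n + 1);
        if (len != 0) {
            return len;
        }
    }
    return 0;
}
```

Note that the loop has to cope with a failed read partway through an
instruction (the last byte of guest_mem above), which is where points 1 and 2
meet. It also only delimits a single instruction; it says nothing about where
the TB ends.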

Some thoughts on the latter problem: if we want a tb_trans_pre callback, like
Pin/DynamoRIO provide, instead of doing two passes (one to delimit the TB and
call the tb_trans_pre callback, to then generate the translated TB), we could:

- have a tb_trans_pre callback. This callback inserts an exec-time callback
  with a user-defined pointer (let's call it **tb_info). The callback has no
  arguments, perhaps just the pc.
- have a tb_trans_post callback. This one passes a copy of the guest
  instructions. The instrumenter then can allocate whatever data structure
  to represent the TB (*tb_info), and copies this pointer to **tb_info, so
  that at execution time, we can obtain tb_info _before_ the TB is executed.
  After the callback returns, the copy of the guest instructions can be freed.

This has two disadvantages:

- We have an extra dereference to find tb_info
- If it turns out that the TB should not be instrumented, we have generated
  a callback for nothing.

		Emilio
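
PS: a rough sketch of the tb_trans_pre/tb_trans_post scheme described above,
in case it helps the discussion. This is only an illustration under stated
assumptions: struct tb_info, the three callback signatures and run_demo() are
all made up for this sketch and do not correspond to any API in the series.
The "TB" is simulated by calling the exec-time callback by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Per-TB data the instrumenter builds (hypothetical layout). */
struct tb_info {
    uint64_t pc;         /* guest pc of the TB */
    size_t size;         /* size of the TB in guest bytes */
    uint64_t exec_count; /* bumped by the exec-time callback */
};

/*
 * tb_trans_pre: the TB is not delimited yet, so only reserve a slot.
 * The returned pointer is what the exec-time callback will receive.
 */
static struct tb_info **tb_trans_pre(void)
{
    return calloc(1, sizeof(struct tb_info *));
}

/*
 * tb_trans_post: the guest instructions are known now. Allocate the
 * per-TB structure and publish it through the slot. The caller may free
 * its copy of the guest instructions once this returns.
 */
static int tb_trans_post(struct tb_info **slot, uint64_t pc,
                         const void *data, size_t size)
{
    struct tb_info *info = calloc(1, sizeof(*info));

    if (info == NULL) {
        return -1;
    }
    (void)data; /* a real instrumenter would decode the instructions here */
    info->pc = pc;
    info->size = size;
    *slot = info;
    return 0;
}

/* Exec-time callback: runs before the TB's translated code. */
static void tb_exec_cb(struct tb_info **slot)
{
    struct tb_info *info = *slot; /* disadvantage 1: extra dereference */

    if (info != NULL) {
        info->exec_count++;
    }
    /* info == NULL: disadvantage 2, the callback was generated for nothing */
}

/* Simulate translating one TB and executing it twice. */
static uint64_t run_demo(void)
{
    static const uint8_t insns[] = { 0x90, 0x90, 0xc3 };
    struct tb_info **slot = tb_trans_pre();
    uint64_t count;

    assert(slot != NULL);
    if (tb_trans_post(slot, 0x1000, insns, sizeof(insns)) != 0) {
        free(slot);
        return 0;
    }
    tb_exec_cb(slot);
    tb_exec_cb(slot);
    count = (*slot)->exec_count;
    free(*slot);
    free(slot);
    return count;
}
```

The slot is allocated separately from the tb_info so that its address is
already fixed when the exec-time callback is generated in tb_trans_pre; only
its contents change in tb_trans_post.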