From: "Alex Bennée" <alex.bennee@linaro.org>
To: Jon Wilson <jonwilson030981@googlemail.com>
Cc: qemu-devel@nongnu.org,
Richard Henderson <richard.henderson@linaro.org>,
Pierrick Bouvier <pierrick.bouvier@linaro.org>
Subject: Re: TCG Address Sanitizer Optimization.
Date: Mon, 02 Jun 2025 16:09:42 +0100 [thread overview]
Message-ID: <87h60y41u1.fsf@draig.linaro.org> (raw)
In-Reply-To: <CAJHT5-JwrQ31MEKKSNL-E0RPm+cg7UOiqzV5cPL-mnTOPa7eUA@mail.gmail.com> (Jon Wilson's message of "Mon, 2 Jun 2025 13:02:08 +0100")
Jon Wilson <jonwilson030981@googlemail.com> writes:
(Adding Richard, the TCG maintainer to CC)
> I am attempting to optimize some TCG code which I have previously written for
> qemu-libafl-bridge (https://github.com/AFLplusplus/qemu-libafl-bridge), the
> component used by the LibAFL project when fuzzing binaries using QEMU to
> provide runtime instrumentation. The code is used to write additional TCG into
> basic blocks whenever a load or store operation is performed in order to provide
> address sanitizer functionality.
I would like to figure out if we can push any of this instrumentation
into TCG plugins so you can avoid patching QEMU itself. I guess you need
something that allows you to hook a memory address into some sort of
gadget?
> Address sanitizer is quite simple and works by mapping each 8-byte region of
> address space to single byte within a region called the shadow map. The address
> (on a 64-bit platform) of the shadow map for a given address is:
>
> Shadow = (Mem >> 3) + 0x7fff8000;
>
> The value in the shadow map encodes the accessibility of an address:
>
> 0 - The whole 8 byte region is accessible.
> 1 .. 7 - The first n bytes are accessible.
> negative - The whole 8 byte region is inaccessible.
>
> The following pseudo code shows the algorithm:
>
> ////////////////////////////////////////////////////////////////////////////////
>
> https://github.com/google/sanitizers/wiki/addresssanitizeralgorithm
>
> byte *shadow_address = MemToShadow(address);
> byte shadow_value = *shadow_address;
> if (shadow_value) {
> if (SlowPathCheck(shadow_value, address, kAccessSize)) {
> ReportError(address, kAccessSize, kIsWrite);
> }
> }
>
> // Check the cases where we access first k bytes of the qword
> // and these k bytes are unpoisoned.
> bool SlowPathCheck(shadow_value, address, kAccessSize) {
> last_accessed_byte = (address & 7) + kAccessSize - 1;
> return (last_accessed_byte >= shadow_value);
> }
>
> ////////////////////////////////////////////////////////////////////////////////
>
> My current implementation makes use of conditional move instructions to trigger
> a segfault by way of null dereference in the event that the shadow map indicates
> that a memory access is invalid.
>
> ////////////////////////////////////////////////////////////////////////////////
>
> #if TARGET_LONG_BITS == 32
> #define SHADOW_BASE (0x20000000)
> #elif TARGET_LONG_BITS == 64
> #define SHADOW_BASE (0x7fff8000)
> #else
> #error Unhandled TARGET_LONG_BITS value
> #endif
>
> void libafl_tcg_gen_asan(TCGTemp * addr, size_t size)
> {
> if (size == 0)
> return;
>
> TCGv addr_val = temp_tcgv_tl(addr);
> TCGv k = tcg_temp_new();
> TCGv shadow_addr = tcg_temp_new();
> TCGv_ptr shadow_ptr = tcg_temp_new_ptr();
> TCGv shadow_val = tcg_temp_new();
> TCGv test_addr = tcg_temp_new();
> TCGv_ptr test_ptr = tcg_temp_new_ptr();
>
> tcg_gen_andi_tl(k, addr_val, 7);
> tcg_gen_addi_tl(k, k, size - 1);
>
> tcg_gen_shri_tl(shadow_addr, addr_val, 3);
> tcg_gen_addi_tl(shadow_addr, shadow_addr, SHADOW_BASE);
> tcg_gen_tl_ptr(shadow_ptr, shadow_addr);
> tcg_gen_ld8s_tl(shadow_val, shadow_ptr, 0);
>
> /*
> * Making conditional branches here appears to cause QEMU issues with dead
> * temporaries so we will instead avoid branches.
This sounds like a TCG bug that may have been fixed.
> We will cause the guest
> * to perform a NULL dereference in the event of an ASAN fault. Note that
> * we will do this by using a store rather than a load, since the TCG may
> * otherwise determine that the result of the load is unused and simply
> * discard the operation. In the event that the shadow memory doesn't
> * detect a fault, we will simply write the value read from the shadow
> * memory back to it's original location. If, however, the shadow memory
> * detects an invalid access, we will instead attempt to write the value
> * at 0x0.
> */
Why not conditionally call a helper here? Forcing the guest to actually
fault seems a bit hammer like.
> tcg_gen_movcond_tl(TCG_COND_EQ, test_addr,
> shadow_val, tcg_constant_tl(0),
> shadow_addr, tcg_constant_tl(0));
>
> if (size < 8)
> {
> tcg_gen_movcond_tl(TCG_COND_GE, test_addr,
> k, shadow_val,
> test_addr, shadow_addr);
> }
>
> tcg_gen_tl_ptr(test_ptr, test_addr);
> tcg_gen_st8_tl(shadow_val, test_ptr, 0);
> }
>
> ////////////////////////////////////////////////////////////////////////////////
>
> However, I would like test an implementation more like the following to see how
> the performance compares. Whilst this introduces branches, the fast path is much
> more likely to be executed than the slow path and hence bypassing the additional
> checks and unnecessary memory writes I am hopeful it will improve performance.
>
> ////////////////////////////////////////////////////////////////////////////////
>
> void libafl_tcg_gen_asan(TCGTemp* addr, size_t size)
> {
> if (size == 0) {
> return;
> }
>
> if (size > 8) {
> size = 8;
> }
>
> TCGLabel *done = gen_new_label();
>
> TCGv addr_val = temp_tcgv_tl(addr);
> TCGv shadow_addr = tcg_temp_new();
> TCGv_ptr shadow_ptr = tcg_temp_new_ptr();
> TCGv shadow_val = tcg_temp_new();
> TCGv k = tcg_temp_new();
> TCGv zero = tcg_constant_tl(0);
> TCGv_ptr null_ptr = tcg_temp_new_ptr();
>
> tcg_gen_shri_tl(shadow_addr, addr_val, 3);
> tcg_gen_addi_tl(shadow_addr, shadow_addr, SHADOW_BASE);
> tcg_gen_tl_ptr(shadow_ptr, shadow_addr);
> tcg_gen_ld8s_tl(shadow_val, shadow_ptr, 0);
>
> tcg_gen_brcond_tl(TCG_COND_EQ, shadow_val, zero, done);
>
> tcg_gen_andi_tl(k, addr_val, 7);
> tcg_gen_addi_tl(k, k, size - 1);
>
> tcg_gen_brcond_tl(TCG_COND_LT, shadow_val, k, done);
>
> tcg_gen_tl_ptr(null_ptr, zero);
> tcg_gen_st8_tl(zero, null_ptr, 0);
>
> gen_set_label(done);
> }
>
> ////////////////////////////////////////////////////////////////////////////////
>
> However, when I change to using this implementation, I get the following error.
> I have tested it with a trivial hello world implementation for x86_64 running in
> qemu-user. It doesn't occur the first time the block is executed, therefore I
> think the issue is caused by the surrounding TCG in the block it is injected
> into?
>
> ////////////////////////////////////////////////////////////////////////////////
> runner-x86_64: ../tcg/tcg.c:4852: tcg_reg_alloc_mov: Assertion `ts->val_type == TEMP_VAL_REG' failed.
> Aborted (core dumped)
> ////////////////////////////////////////////////////////////////////////////////
>
> I would be very grateful for any advice of how to resolve this issue, or any
> alternative approaches I could use to optimize my original implementation. The
> code is obviously a very hot path and so even a tiny performance improvement
> could result in a large performance gain overall.
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
next prev parent reply other threads:[~2025-06-02 15:10 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-02 12:02 TCG Address Sanitizer Optimization Jon Wilson
2025-06-02 15:09 ` Alex Bennée [this message]
2025-06-02 15:54 ` Jon Wilson
2025-06-02 15:58 ` Richard Henderson
2025-06-02 16:26 ` Alex Bennée
2025-06-03 8:20 ` Jon Wilson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87h60y41u1.fsf@draig.linaro.org \
--to=alex.bennee@linaro.org \
--cc=jonwilson030981@googlemail.com \
--cc=pierrick.bouvier@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).