* Re: [PATCH v2 01/28] accel/tcg: Introduce translator_use_goto_tb [not found] ` <12e54d71-e208-507c-c9d2-c313f9301fc3@intel.com> @ 2023-03-14 18:15 ` Richard Henderson 2023-03-16 2:07 ` Wu, Fei 0 siblings, 1 reply; 4+ messages in thread From: Richard Henderson @ 2023-03-14 18:15 UTC (permalink / raw) To: Wu, Fei; +Cc: qemu-devel On 3/14/23 06:47, Wu, Fei wrote: > On 3/13/2023 11:00 PM, Richard Henderson wrote: >> On 3/13/23 07:13, Wu, Fei2 wrote: >>> Hi Richard, >>> >>> Sorry for disturbing you. I'm doing some perf profiling on qemu-riscv64, >>> I see 10%+ faster to build stress-ng without the following patch. I know >>> it's incorrect to just skip this patch, I'm wondering if we can do >>> something on intercepting mmap/mprotect (very rare), e.g. even >>> invalidating all the TBs, but keep the cross-page block chaining. >> >> It also affects breakpoints. >> >> I have no good ideas for how to keep cross-page block chaining without >> breaking either of these use cases. If you come up with a good idea, >> please post on qemu-devel for discussion. >> > Thank you for reply. I am new to qemu/tcg, lots of details and > backgrounds need to catch up. > > If we only want to address user-mode qemu, and assume this cross-page > chain, first page -> second page: > > * breakpoints. If a new bp is added to second page, the chain is hard to > maintain, but it looks acceptable to flush all TBs and fall back to > current non-cross-page implementation during debugging? I think It's > different from the full system situation here: > https://gitlab.com/qemu-project/qemu/-/issues/404 > > * mprotect. If the 2nd page remains 'X' permission after mprotect, the > chain is still valid, if it's changed to non-X, then the syscall > interceptor will change the permission of corresponding host page to > non-X, it will be segfault as expected? > > * mmap. I cannot figure out the situation. Is there any unit test for > this, or could you please shed some light? Also munmap, but handled via the same path through page_set_flags, see if (inval_tb) { tb_invalidate_phys_range(start, end); } There is no unit test for mmap over an existing code page. I believe we do have one for mprotect. You could plausibly add a global variable choosing between link-all-pages and link-one-page modes; it would be protected by mmap_lock. For link-all-pages mode, the above tb_invalidate_phys_range becomes tb_flush. We probably want to start in link-one-page mode if gdbstub is active, which is the only way to set breakpoints in user-only mode. I expect mprotect/mmap over existing executable pages to be extremely rare. I expect munmap of existing executable pages to be rare-ish, with dlclose() being the most common case. You might wish to change from link-all-pages mode to link-one-page mode after one or more instances. And as I said, this discussion should happen on qemu-devel. r~ ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v2 01/28] accel/tcg: Introduce translator_use_goto_tb 2023-03-14 18:15 ` [PATCH v2 01/28] accel/tcg: Introduce translator_use_goto_tb Richard Henderson @ 2023-03-16 2:07 ` Wu, Fei 2023-03-16 3:07 ` Wu, Fei 0 siblings, 1 reply; 4+ messages in thread From: Wu, Fei @ 2023-03-16 2:07 UTC (permalink / raw) To: Richard Henderson; +Cc: qemu-devel On 3/15/2023 2:15 AM, Richard Henderson wrote: > On 3/14/23 06:47, Wu, Fei wrote: >> On 3/13/2023 11:00 PM, Richard Henderson wrote: >>> On 3/13/23 07:13, Wu, Fei2 wrote: >>>> Hi Richard, >>>> >>>> Sorry for disturbing you. I'm doing some perf profiling on >>>> qemu-riscv64, >>>> I see 10%+ faster to build stress-ng without the following patch. I >>>> know >>>> it's incorrect to just skip this patch, I'm wondering if we can do >>>> something on intercepting mmap/mprotect (very rare), e.g. even >>>> invalidating all the TBs, but keep the cross-page block chaining. >>> >>> It also affects breakpoints. >>> >>> I have no good ideas for how to keep cross-page block chaining without >>> breaking either of these use cases. If you come up with a good idea, >>> please post on qemu-devel for discussion. >>> >> Thank you for reply. I am new to qemu/tcg, lots of details and >> backgrounds need to catch up. >> >> If we only want to address user-mode qemu, and assume this cross-page >> chain, first page -> second page: >> >> * breakpoints. If a new bp is added to second page, the chain is hard to >> maintain, but it looks acceptable to flush all TBs and fall back to >> current non-cross-page implementation during debugging? I think It's >> different from the full system situation here: >> https://gitlab.com/qemu-project/qemu/-/issues/404 >> >> * mprotect. If the 2nd page remains 'X' permission after mprotect, the >> chain is still valid, if it's changed to non-X, then the syscall >> interceptor will change the permission of corresponding host page to >> non-X, it will be segfault as expected? >> >> * mmap. I cannot figure out the situation. Is there any unit test for >> this, or could you please shed some light? > Also munmap, but handled via the same path through page_set_flags, see > > if (inval_tb) { > tb_invalidate_phys_range(start, end); > } > > There is no unit test for mmap over an existing code page. > I believe we do have one for mprotect. > > You could plausibly add a global variable choosing between > link-all-pages and link-one-page modes; it would be protected by > mmap_lock. For link-all-pages mode, the above tb_invalidate_phys_range > becomes tb_flush. We probably want to start in link-one-page mode if > gdbstub is active, which is the only way to set breakpoints in user-only > mode. > > I expect mprotect/mmap over existing executable pages to be extremely > rare. I expect munmap of existing executable pages to be rare-ish, with > dlclose() being the most common case. You might wish to change from > link-all-pages mode to link-one-page mode after one or more instances. > > And as I said, this discussion should happen on qemu-devel. > My fault. I didn't notice the cc list, and initialized another thread: https://www.mail-archive.com/qemu-devel@nongnu.org/msg949625.html Would you prefer commenting there, or I move the content here? Thanks, Fei. > > r~ ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v2 01/28] accel/tcg: Introduce translator_use_goto_tb 2023-03-16 2:07 ` Wu, Fei @ 2023-03-16 3:07 ` Wu, Fei 0 siblings, 0 replies; 4+ messages in thread From: Wu, Fei @ 2023-03-16 3:07 UTC (permalink / raw) To: Richard Henderson; +Cc: qemu-devel On 3/16/2023 10:07 AM, Wu, Fei wrote: > On 3/15/2023 2:15 AM, Richard Henderson wrote: >> On 3/14/23 06:47, Wu, Fei wrote: >>> On 3/13/2023 11:00 PM, Richard Henderson wrote: >>>> On 3/13/23 07:13, Wu, Fei2 wrote: >>>>> Hi Richard, >>>>> >>>>> Sorry for disturbing you. I'm doing some perf profiling on >>>>> qemu-riscv64, >>>>> I see 10%+ faster to build stress-ng without the following patch. I >>>>> know >>>>> it's incorrect to just skip this patch, I'm wondering if we can do >>>>> something on intercepting mmap/mprotect (very rare), e.g. even >>>>> invalidating all the TBs, but keep the cross-page block chaining. >>>> >>>> It also affects breakpoints. >>>> >>>> I have no good ideas for how to keep cross-page block chaining without >>>> breaking either of these use cases. If you come up with a good idea, >>>> please post on qemu-devel for discussion. >>>> >>> Thank you for reply. I am new to qemu/tcg, lots of details and >>> backgrounds need to catch up. >>> >>> If we only want to address user-mode qemu, and assume this cross-page >>> chain, first page -> second page: >>> >>> * breakpoints. If a new bp is added to second page, the chain is hard to >>> maintain, but it looks acceptable to flush all TBs and fall back to >>> current non-cross-page implementation during debugging? I think It's >>> different from the full system situation here: >>> https://gitlab.com/qemu-project/qemu/-/issues/404 >>> >>> * mprotect. If the 2nd page remains 'X' permission after mprotect, the >>> chain is still valid, if it's changed to non-X, then the syscall >>> interceptor will change the permission of corresponding host page to >>> non-X, it will be segfault as expected? >>> >>> * mmap. I cannot figure out the situation. Is there any unit test for >>> this, or could you please shed some light? >> Also munmap, but handled via the same path through page_set_flags, see >> >> if (inval_tb) { >> tb_invalidate_phys_range(start, end); >> } >> >> There is no unit test for mmap over an existing code page. >> I believe we do have one for mprotect. >> >> You could plausibly add a global variable choosing between >> link-all-pages and link-one-page modes; it would be protected by >> mmap_lock. For link-all-pages mode, the above tb_invalidate_phys_range >> becomes tb_flush. We probably want to start in link-one-page mode if >> gdbstub is active, which is the only way to set breakpoints in user-only >> mode. >> This is a good solution for gdbstub case, clean and simple. Current code leverages tb_flush() during gdb, it looks ready to support link-all-pages mode, I tried to test gdb with link-all-pages mode, and didn't find any counter example yet. >> I expect mprotect/mmap over existing executable pages to be extremely >> rare. I expect munmap of existing executable pages to be rare-ish, with >> dlclose() being the most common case. You might wish to change from >> link-all-pages mode to link-one-page mode after one or more instances. >> Yes, I agree these calls are rare, so performance of this path is not crucial. If I understand correctly, we need to avoid the situation when the latter page is munmap-ed or changed to non executable protection, then the jump from preceding TB to this one shouldn't happen. In tb_invalidate_phys_range() -> do_tb_phys_invalidate(), it removes all relative TBs from cache, and also unlinks/unchains these TBs from preceding TBs, so next time guest attempts to run code in this munmap-ed page, the chain doesn't exist anymore, the protection will be checked and enforced. Thanks, Fei. >> And as I said, this discussion should happen on qemu-devel. >> > My fault. I didn't notice the cc list, and initialized another thread: > https://www.mail-archive.com/qemu-devel@nongnu.org/msg949625.html > > Would you prefer commenting there, or I move the content here? > > Thanks, > Fei. > > >> >> r~ > ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v2 00/28] accel/tcg: Introduce translator_use_goto_tb @ 2021-06-30 18:31 Richard Henderson 2021-06-30 18:31 ` [PATCH v2 01/28] " Richard Henderson 0 siblings, 1 reply; 4+ messages in thread From: Richard Henderson @ 2021-06-30 18:31 UTC (permalink / raw) To: qemu-devel Based-on: <20210629185455.3131172-1-richard.henderson@linaro.org> ("[PULL 00/63] tcg patch queue") There are a number of inconsistencies with goto_tb usage, and I plan to make changes in order to better support breakpoints. (1) Testing CF_LAST_IO is a hold-over from since before ba3e7926691 ("icount: clean up cpu_can_io at the entry to the block"). Several targets still have this test. (2) Testing singlestep is superfluous, as it doesn't mean anything besides limiting max_insns to 1. (3) Not testing page crossing for CONFIG_USER_ONLY is wrong, because mmap and mprotect can change page permissions. It's a very uncommon case wrt executables, but it's still wrong. (4) Not testing page crossing for non-mmu targets (where page permissions literally cannot change) is not currently wrong, but will be after the breakpoint changes. (5) When the TB does cross two pages, considering non-page crossing from the second page is not currently wrong, but will be after the breakpoint changes. Changes for v2: * Fix aarch32 ISB, SB insns vs single-stepping. * Drop use_goto_tb for aarch32 * Retain use_goto_tb for aarch64. Patches lacking review: 02-target-alpha-Remove-use_exit_tb.patch 03-target-alpha-Remove-in_superpage.patch 04-target-alpha-Use-translator_use_goto_tb.patch 05-target-arm-Use-gen_jmp-for-ISB-and-SB.patch 06-target-arm-Use-translator_use_goto_tb-for-aarch64.patch 07-target-arm-Use-translator_use_goto_tb-for-aarch32.patch 08-target-avr-Use-translator_use_goto_tb.patch 10-target-cris-Use-translator_use_goto_tb.patch 11-target-hppa-Use-translator_use_goto_tb.patch 12-target-i386-Use-translator_use_goto_tb.patch 14-target-microblaze-Use-translator_use_goto_tb.patch 15-target-mips-Use-translator_use_goto_tb.patch 17-target-nios2-Use-translator_use_goto_tb.patch 18-target-openrisc-Use-translator_use_goto_tb.patch 21-target-rx-Use-translator_use_goto_tb.patch 22-target-s390x-Use-translator_use_goto_tb.patch 23-target-s390x-Remove-use_exit_tb.patch 24-target-sh4-Use-translator_use_goto_tb.patch r~ Richard Henderson (28): accel/tcg: Introduce translator_use_goto_tb target/alpha: Remove use_exit_tb target/alpha: Remove in_superpage target/alpha: Use translator_use_goto_tb target/arm: Use gen_jmp for ISB and SB target/arm: Use translator_use_goto_tb for aarch64 target/arm: Use translator_use_goto_tb for aarch32 target/avr: Use translator_use_goto_tb target/avr: Mark some helpers noreturn target/cris: Use translator_use_goto_tb target/hppa: Use translator_use_goto_tb target/i386: Use translator_use_goto_tb target/m68k: Use translator_use_goto_tb target/microblaze: Use translator_use_goto_tb target/mips: Use translator_use_goto_tb target/mips: Fix missing else in gen_goto_tb target/nios2: Use translator_use_goto_tb target/openrisc: Use translator_use_goto_tb target/ppc: Use translator_use_goto_tb target/riscv: Use translator_use_goto_tb target/rx: Use translator_use_goto_tb target/s390x: Use translator_use_goto_tb target/s390x: Remove use_exit_tb target/sh4: Use translator_use_goto_tb target/sparc: Use translator_use_goto_tb target/tricore: Use translator_use_goto_tb target/tricore: Use tcg_gen_lookup_and_goto_ptr target/xtensa: Use translator_use_goto_tb include/exec/translator.h | 10 ++++++++ target/avr/helper.h | 8 +++--- accel/tcg/translator.c | 11 +++++++++ target/alpha/translate.c | 46 ++++------------------------------- target/arm/translate-a64.c | 25 ++++--------------- target/arm/translate.c | 16 +++--------- target/avr/translate.c | 9 ++++--- target/cris/translate.c | 5 ++-- target/hppa/translate.c | 5 +--- target/i386/tcg/translate.c | 14 ++--------- target/m68k/translate.c | 12 +-------- target/microblaze/translate.c | 11 +-------- target/mips/tcg/translate.c | 20 +++------------ target/nios2/translate.c | 15 +----------- target/openrisc/translate.c | 15 ++++++------ target/ppc/translate.c | 10 +------- target/riscv/translate.c | 20 +-------------- target/rx/translate.c | 11 +-------- target/s390x/translate.c | 18 +++----------- target/sh4/translate.c | 11 +++------ target/sparc/translate.c | 19 ++++----------- target/tricore/translate.c | 20 +++------------ target/xtensa/translate.c | 6 +---- 23 files changed, 83 insertions(+), 254 deletions(-) -- 2.25.1 ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v2 01/28] accel/tcg: Introduce translator_use_goto_tb 2021-06-30 18:31 [PATCH v2 00/28] " Richard Henderson @ 2021-06-30 18:31 ` Richard Henderson 0 siblings, 0 replies; 4+ messages in thread From: Richard Henderson @ 2021-06-30 18:31 UTC (permalink / raw) To: qemu-devel; +Cc: Max Filippov, Philippe Mathieu-Daudé Add a generic version of the common use_goto_tb test. Various targets avoid the page crossing test for CONFIG_USER_ONLY, but that is wrong: mmap and mprotect can change page permissions. Reviewed-by: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- include/exec/translator.h | 10 ++++++++++ accel/tcg/translator.c | 11 +++++++++++ 2 files changed, 21 insertions(+) diff --git a/include/exec/translator.h b/include/exec/translator.h index 24232ead41..dd9c06d40d 100644 --- a/include/exec/translator.h +++ b/include/exec/translator.h @@ -145,6 +145,16 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db, void translator_loop_temp_check(DisasContextBase *db); +/** + * translator_use_goto_tb + * @db: Disassembly context + * @dest: target pc of the goto + * + * Return true if goto_tb is allowed between the current TB + * and the destination PC. + */ +bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest); + /* * Translator Load Functions * diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c index 1d32732198..59804af37b 100644 --- a/accel/tcg/translator.c +++ b/accel/tcg/translator.c @@ -31,6 +31,17 @@ void translator_loop_temp_check(DisasContextBase *db) } } +bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest) +{ + /* Suppress goto_tb in the case of single-steping. */ + if (db->singlestep_enabled || singlestep) { + return false; + } + + /* Check for the dest on the same page as the start of the TB. */ + return ((db->pc_first ^ dest) & TARGET_PAGE_MASK) == 0; +} + void translator_loop(const TranslatorOps *ops, DisasContextBase *db, CPUState *cpu, TranslationBlock *tb, int max_insns) { -- 2.25.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-03-16 3:08 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <b160a8f7-10b0-5674-a040-b11ca4aed3c9@intel.com> [not found] ` <714313d8-7828-196b-50ac-fe12d2143135@linaro.org> [not found] ` <12e54d71-e208-507c-c9d2-c313f9301fc3@intel.com> 2023-03-14 18:15 ` [PATCH v2 01/28] accel/tcg: Introduce translator_use_goto_tb Richard Henderson 2023-03-16 2:07 ` Wu, Fei 2023-03-16 3:07 ` Wu, Fei 2021-06-30 18:31 [PATCH v2 00/28] " Richard Henderson 2021-06-30 18:31 ` [PATCH v2 01/28] " Richard Henderson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).