From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 354C2C25B08 for ; Sat, 13 Aug 2022 06:19:47 +0000 (UTC) Received: from localhost ([::1]:56338 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oMkUr-000062-Vu for qemu-devel@archiver.kernel.org; Sat, 13 Aug 2022 02:19:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41052) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oMkRi-0007CB-PC for qemu-devel@nongnu.org; Sat, 13 Aug 2022 02:16:31 -0400 Received: from mail-wr1-x429.google.com ([2a00:1450:4864:20::429]:46632) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1oMkRg-00016w-J7 for qemu-devel@nongnu.org; Sat, 13 Aug 2022 02:16:30 -0400 Received: by mail-wr1-x429.google.com with SMTP id l4so3178548wrm.13 for ; Fri, 12 Aug 2022 23:16:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:user-agent:from:to:cc; bh=J/yu8t65NuL3it7Gj/+Z69nS+R4m955LsB8cz2h6Gw4=; b=ARGCnlBaUIxYmgfDnXTN7BijtQCCeLvgt6DvmEknKsBxPBMrgKrbLl/HvAmi6cG5Yl SEfNfWj+TwZ7NR9yoP7aVc2YjxnDqXqvv84VFWQsqleZgYxoR7S7T9zpDok8Fj8H4duG hpjEkO4topD8Od6W30wAOOkxvX9B4idqnX5a6njt+eNOe8V9hpOWhBGKa23zmvQWIJ9A CXEE65RWFnQajY0ALtHveZ+yHmQg9nzfHfvcMnzCwMT9p6daP/+dqF40tR17byNcc8NU knZQD2kfUgQSHlOdQG3xulCctMP3BZPFJUL9Tcyibr8CVtA/r781yFvCCHJlypNOTCqc ekdw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:user-agent:x-gm-message-state:from:to:cc; bh=J/yu8t65NuL3it7Gj/+Z69nS+R4m955LsB8cz2h6Gw4=; b=FLCwJ19JBIrq9LsvspeoGGk/fJzur0DSgYBCQHEMqe+T1Oty2ualNqdCSTbfA561xP uyc1AapMFLUhXYy9ef/Xse02ji5+x3byun4zYwlbtfDbaTHuyokE6zhMpeNSjtdDNRv7 XjhkLyXAu3aNLUfgNBCZxNsCSr9CY2aF30k3ByFe/isPD2HvfE71E/yGV19iITfOk1Z4 55hSIvuCVD5pt2aH1WwxGdbjYhPDg07w/zu9kif9808h9EGNTGnBQlAq3e6kF4kKG/Dr gv2vDdSPGtehr71jjc/N80m/RO4ziHTQgKgyy9ZR7Y+VdEd8J33IAweypQtiKdeHr0jE zt0w== X-Gm-Message-State: ACgBeo06JrrzGbacgOMXhE+ZfGAuDxvTWJYdiV4oOBTROCOoJiK8kQmF qsm3EfLcEGOI5T4Ei3qWwJ0uOA== X-Google-Smtp-Source: AA6agR6xm6L3RdD90D6p1ndDyTQrBJ4h5Xod2Dfx3p1+nfrumnW48VwuVwTOAXwxTx2I39DWzRvt/Q== X-Received: by 2002:a5d:47c4:0:b0:21f:e92:7ba1 with SMTP id o4-20020a5d47c4000000b0021f0e927ba1mr3662631wrc.408.1660371386633; Fri, 12 Aug 2022 23:16:26 -0700 (PDT) Received: from zen.linaroharston ([185.81.254.11]) by smtp.gmail.com with ESMTPSA id bg6-20020a05600c3c8600b003a308e9a192sm1787096wmb.30.2022.08.12.23.16.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 12 Aug 2022 23:16:26 -0700 (PDT) Received: from zen (localhost [127.0.0.1]) by zen.linaroharston (Postfix) with ESMTP id 318C11FFB7; Sat, 13 Aug 2022 07:16:25 +0100 (BST) User-agent: mu4e 1.8.8; emacs 28.1.91 From: Alex =?utf-8?Q?Benn=C3=A9e?= To: qemu-devel@nongnu.org Cc: Cleber Rosa , Philippe =?utf-8?Q?Mathieu-Daud=C3=A9?= , Wainer dos Santos Moschetta , Beraldo Leal , Peter Maydell , Richard Henderson , Mark Cave-Ayland Subject: Info about the slowest avocado tcg tests (arm, s390x, ppc, replay) Date: Sat, 13 Aug 2022 07:10:32 +0100 Message-ID: <87r11kpt1j.fsf@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Received-SPF: pass client-ip=2a00:1450:4864:20::429; envelope-from=alex.bennee@linaro.org; helo=mail-wr1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" [RESEND because the ML ate my email again] Hi All, In my investigations into our current Avocado tests I identified all that ran for more than 30s for an --enable-debug build. They are: tests/avocado/boot_linux.py:BootLinuxAarch64.test_virt_tcg_gicv2: PASS( 7= 63.81 s) tests/avocado/boot_linux.py:BootLinuxAarch64.test_virt_tcg_gicv3: PASS( 7= 42.75 s) The main slowness here is due to inefficient TB invalidation of code generated during the firmware phase when the rest of the system boots. tests/avocado/boot_linux.py:BootLinuxS390X.test_s390_ccw_virtio_tcg: PAS= S( 433.48 s) This looks like a very fat distro. A lot of slow delays waiting for systemd units to come up. That said the positively svelte Alpine ISO still takes about 4min30s to churn through it's userspace boot. Looking at the callchains though it might be helped by the same fixes as the ARM tcg GIC tests: - 28.15% 0.16% qemu-system-s39 qemu-system-s390x [.] st= ore_helper - 27.99% store_helper - 23.08% notdirty_write - 17.51% page_collection_lock 2.76% page_trylock_add 1.22% g_tree_lookup 0.62% g_tree_find_node - 4.53% tb_invalidate_phys_page_fast - 4.37% tb_invalidate_phys_page_range__locked - 3.44% tb_phys_invalidate__locked - 3.44% do_tb_phys_invalidate 3.40% tb_page_remove 0.63% page_trylock_add - 4.25% tlb_fill - 4.21% s390_cpu_tlb_fill - 2.35% mmu_translate - 1.77% mmu_translate_asce - 1.58% read_table_entry - 1.39% address_space_read_full - 1.11% flatview_read - 0.62% flatview_translate 0.55% flatview_do_translate - 1.70% tlb_set_page - 1.64% tlb_set_page_with_attrs 0.52% cpu_physical_memory_is_clean - 25.10% 0.02% qemu-system-s39 qemu-system-s390x [.] no= tdirty_write - 25.08% notdirty_write - 18.14% page_collection_lock 2.84% page_trylock_add 1.25% g_tree_lookup 0.64% g_tree_find_node - 5.86% tb_invalidate_phys_page_fast - 5.69% tb_invalidate_phys_page_range__locked - 4.62% tb_phys_invalidate__locked - 4.62% do_tb_phys_invalidate 4.57% tb_page_remove 0.65% page_trylock_add There might be some millage looking at the tb_jmp_cache utilisation though because it looks like pretty much all tb_lookup falls through to the full QHT lookup: - 19.82% 1.65% qemu-system-s39 qemu-system-s390x [.] tb= _lookup - 18.17% tb_lookup - 17.84% tb_htable_lookup - 9.94% get_page_addr_code - 9.66% get_page_addr_code_hostp - 5.55% tlb_fill - 5.48% s390_cpu_tlb_fill - 3.55% mmu_translate - 2.73% mmu_translate_asce - 2.44% read_table_entry - 2.17% address_space_read_full - 1.72% flatview_read - 0.93% flatview_translate - 0.83% flatview_do_translate 0.58% address_space_translate_i= nternal - 0.54% mmu_absolute_addr_valid 0.51% address_space_access_valid - 1.73% tlb_set_page - 1.64% tlb_set_page_with_attrs 0.54% tlb_flush_vtlb_page_locked - 1.56% qemu_ram_addr_from_host_nofail - 1.40% qemu_ram_addr_from_host 1.17% qemu_ram_block_from_host 0.73% tlb_entry - 4.30% qht_lookup_custom - 1.90% qht_do_lookup 1.21% tb_lookup_cmp 1.59% seqlock_read_begin - 2.82% tb_hash_func - 2.66% qemu_xxhash6 - 2.35% qemu_xxhash7 rol32 tests/avocado/boot_linux.py:BootLinuxPPC64.test_pseries_tcg: PASS( 341.90= s) tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg: PASS( 240.= 76 s) tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_q35_tcg: PASS( 221.70 = s) Nothing especially jumps out of perf (in fact it looks very similar to s390x): 6.40% qemu-system-ppc qemu-system-ppc64 [.] tb_lookup 5.63% qemu-system-ppc qemu-system-ppc64 [.] check_regs 4.87% qemu-system-ppc qemu-system-ppc64 [.] hreg_compute_hflags= _value 4.21% qemu-system-ppc qemu-system-ppc64 [.] pmu_increment_insns 2.42% qemu-system-ppc qemu-system-ppc64 [.] helper_insns_inc 2.26% qemu-system-ppc qemu-system-ppc64 [.] cpu_get_tb_cpu_state 2.20% qemu-system-ppc qemu-system-ppc64 [.] type_get_parent 2.07% qemu-system-ppc qemu-system-ppc64 [.] helper_lookup_tb_ptr 1.96% qemu-system-ppc [kernel.kallsyms] [k] list_lru_count_one 1.63% qemu-system-ppc qemu-system-ppc64 [.] type_is_ancestor 1.62% qemu-system-ppc qemu-system-ppc64 [.] helper_vsldoi 1.49% qemu-system-ppc qemu-system-ppc64 [.] la_reset_pref 1.14% qemu-system-ppc qemu-system-ppc64 [.] seqlock_read_begin 1.02% qemu-system-ppc qemu-system-ppc64 [.] liveness_pass_1 0.97% qemu-system-ppc [kernel.kallsyms] [k] _raw_spin_lock 0.81% qemu-system-ppc qemu-system-ppc64 [.] curr_cflags 0.76% qemu-system-ppc libglib-2.0.so.0.5600.4 [.] g_hash_table_lookup 0.75% qemu-system-ppc qemu-system-ppc64 [.] check_for_breakpoin= ts 0.75% qemu-system-ppc qemu-system-ppc64 [.] tb_jmp_cache_hash_f= unc=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20 which all looks fairly typical for --debug code with heavy computed jumps (triggering a lot of tb_lookups). tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu: WARN( 91.80 s) tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu_strict: WARN( 91= .47 s) tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu_pt: WARN( 91.36 = s) tests/avocado/intel_iommu.py:IntelIOMMU.test_intel_iommu_strict_cm: WARN(= 88.56 s) These all end with WARN so could be broken anyway and just timing out. Didn't dig in too deeply. tests/avocado/replay_kernel.py:ReplayKernelNormal.test_arm_cubieboard_ini= trd: PASS( 63.24 s) tests/avocado/replay_kernel.py:ReplayKernelNormal.test_sparc_ss20: PASS( = 59.36 s) This could just be replay being inherently slow given it runs several times under icount. tests/avocado/boot_xen.py:BootXen.test_arm64_xen_411_and_dom0: PASS( 57.1= 5 s) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_sd= : PASS( 55.00 s) tests/avocado/machine_aspeed.py:AST2x00Machine.test_arm_ast2500_romulus_o= penbmc_v2_9_0: PASS( 52.73 s) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_x86_64_pc: PASS= ( 51.08 s) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_aarch64_xlnx_ve= rsal_virt: PASS( 47.98 s) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_arm_ast2600_deb= ian: PASS( 47.48 s) tests/avocado/linux_initrd.py:LinuxInitrd.test_with_2gib_file_should_work= _with_linux_v4_16: PASS( 45.29 s) tests/avocado/machine_aspeed.py:AST2x00Machine.test_arm_ast2400_palmetto_= openbmc_v2_9_0: PASS( 44.54 s) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_arm_orangepi_in= itrd: PASS( 40.96 s) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_arm_exynos4210_= initrd: PASS( 37.70 s) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_arm_cubieboard_= initrd: PASS( 34.29 s) tests/avocado/boot_linux_console.py:BootLinuxConsole.test_arm_cubieboard_= sata: PASS( 33.19 s) --=20 Alex Benn=C3=A9e