Subject: Re: [RFC PATCH 0/5] Experimenting with tb-lookup tweaks
To: Alex Bennée
Cc: cota@braap.org, qemu-devel@nongnu.org
From: Richard Henderson
Date: Wed, 24 Feb 2021 16:28:54 -0800
Message-ID: <7d23665f-fa20-028f-d48a-2ea79ab35b2f@linaro.org>
In-Reply-To: <20210224165811.11567-1-alex.bennee@linaro.org>
References: <20210224165811.11567-1-alex.bennee@linaro.org>

On 2/24/21 8:58 AM, Alex Bennée wrote:
> Hi Richard,
>
> Well I spun up some of the ideas we talked about to see if there was
> anything to be squeezed out of the function.
> In the end the results seem to be a washout with my pigz benchmark:
>
>   qemu-system-aarch64 -cpu cortex-a57 \
>     -machine type=virt,virtualization=on,gic-version=3 \
>     -serial mon:stdio \
>     -netdev user,id=unet,hostfwd=tcp::2222-:22 \
>     -device virtio-net-pci,netdev=unet,id=virt-net,disable-legacy=on \
>     -device virtio-scsi-pci,id=virt-scsi,disable-legacy=on \
>     -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-buster-arm64 \
>     -device scsi-hd,drive=hd,id=virt-scsi-hd \
>     -smp 4 -m 4096 \
>     -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image \
>     -append "root=/dev/sda2 systemd.unit=benchmark-pigz.service" \
>     -display none -snapshot
>
> | Command | Mean [s]       | Min [s] | Max [s] | Relative |
> |---------+----------------+---------+---------+----------|
> | Before  | 46.597 ± 2.482 | 45.208  | 53.618  | 1.00     |
> | After   | 46.867 ± 2.242 | 45.871  | 53.180  | 1.00     |

Well that's disappointing.

> Maybe the code cleanup itself makes it worthwhile. WDYT?

I think there's little doubt that the first 3 patches are a good code
cleanup.

Patch 4 I think is still beneficial, simply so that we can add that
"Above fields" comment.

Patch 5 would only be worthwhile if we could measure any positive
difference, which it seems we cannot.

I have a follow-up patch to remove the parallel_cpus global variable,
which I will post in a moment. While it removes a handful of insns from
this fast-path, I doubt it helps. But getting rid of a global is
probably always positive, no?

I was glancing through the lookup function for alpha instead of aarch64
and saw:

     21e:   33 43 18                xor    0x18(%rbx),%eax
     221:   4c 31 e1                xor    %r12,%rcx
     224:   44 31 ea                xor    %r13d,%edx
     227:   09 c2                   or     %eax,%edx
     229:   48 0b 4b 08             or     0x8(%rbx),%rcx

and thought -- hang on, how come we're just ORing, not XORing, here? Of
course it's the cs_base field, which alpha has set to zero. The
compiler has simplified bits |= 0 ^ tb->cs_base.
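
To make that simplification concrete, here is a minimal standalone sketch. It assumes the comparison is written in an "XOR each pair, OR the differences together" style; DemoTB, demo_tb_match and demo_tb_match_zero_cs_base are hypothetical names for illustration, not the actual QEMU structures or functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields compared during a TB lookup. */
typedef struct {
    uint64_t pc;
    uint64_t cs_base;
    uint32_t flags;
} DemoTB;

/*
 * Branch-free comparison: XOR each wanted value against the stored one
 * and OR all the differences together. The result is zero only if
 * every field matched.
 */
static inline bool demo_tb_match(const DemoTB *tb, uint64_t pc,
                                 uint64_t cs_base, uint32_t flags)
{
    uint64_t diff = 0;

    diff |= pc ^ tb->pc;
    diff |= cs_base ^ tb->cs_base;
    diff |= flags ^ tb->flags;
    return diff == 0;
}

/*
 * A target that always looks up with cs_base == 0. Since 0 ^ x == x,
 * the compiler is free to drop the XOR for that field once this is
 * inlined and OR the stored field in directly.
 */
static inline bool demo_tb_match_zero_cs_base(const DemoTB *tb,
                                              uint64_t pc, uint32_t flags)
{
    return demo_tb_match(tb, pc, 0, flags);
}
```

With the constant zero propagated, the cs_base term contributes only an OR of the stored field, which would be consistent with the final `or 0x8(%rbx),%rcx` in the listing above.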
Which got me thinking: what if we had a per-cpu

    typedef struct {
        target_ulong pc;
        ...
    } TranslationBlockID;

    static inline bool arch_tbid_cmp(TranslationBlockID x,
                                     TranslationBlockID y)
    {
        return x.pc == y.pc && ...;
    }

We could potentially reduce this to memcmp(&x, &y).

First, this would allow cs_base to be eliminated where it is not used.

Second, this would allow cs_base to be renamed for the non-x86 targets
for which it is being abused.

Third, it would allow tb->flags to be either (a) elided or (b) extended
by the target as needed.

This final point is directed at ARM, of course, where we've overflowed
the uint32_t that is tb->flags. We could now extend that to 64 bits.

Obviously, some tweaks to tb_hash_func would be required as well, but
that's manageable.

What do you think about this last idea?


r~
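
The per-target TranslationBlockID idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: DemoTBID and both helpers are hypothetical names, uint64_t stands in for target_ulong, and the layout (no cs_base, 64-bit flags) is just one shape a target might choose. Note that memcmp is only a valid equality test when the struct has no padding bytes, which the static assertion checks.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical per-target ID for a target that does not use cs_base
 * and wants flags wider than 32 bits.
 */
typedef struct {
    uint64_t pc;     /* stand-in for target_ulong */
    uint64_t flags;  /* widened from uint32_t for illustration */
} DemoTBID;

/* memcmp-based equality is only safe if there are no padding bytes. */
_Static_assert(sizeof(DemoTBID) == 2 * sizeof(uint64_t),
               "DemoTBID must not contain padding");

/* Field-by-field comparison, in the style of arch_tbid_cmp. */
static inline bool demo_tbid_eq_fields(DemoTBID x, DemoTBID y)
{
    return x.pc == y.pc && x.flags == y.flags;
}

/* Equivalent whole-object comparison, valid because of the assert. */
static inline bool demo_tbid_eq_memcmp(const DemoTBID *x, const DemoTBID *y)
{
    return memcmp(x, y, sizeof(*x)) == 0;
}
```

A target with differently sized fields would need either explicit padding members that are always zeroed or the field-by-field form, since padding bytes have unspecified contents.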