Message-ID: <64eb52ee-b3ac-3d94-cfce-ceb1c88dddb6@linaro.org>
Date: Wed, 22 Jun 2022 15:52:49 +0200
Subject: Re: [PATCH v4 4/4] arm64: dts: qcom: sdm845: Add CPU BWMON
To: Rajendra Nayak, Andy Gross, Bjorn Andersson, Georgi Djakov,
 Rob Herring, Catalin Marinas, Will Deacon,
 linux-arm-msm@vger.kernel.org, linux-pm@vger.kernel.org,
 devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org
Cc: Thara Gopinath
References: <20220601101140.170504-1-krzysztof.kozlowski@linaro.org>
 <20220601101140.170504-5-krzysztof.kozlowski@linaro.org>
From: Krzysztof Kozlowski
X-Mailing-List: devicetree@vger.kernel.org

On 22/06/2022 13:46, Rajendra Nayak wrote:
>
> On 6/1/2022 3:41 PM, Krzysztof Kozlowski wrote:
>> Add device node for CPU-memory BWMON device (bandwidth
>> monitoring) on
>> SDM845 measuring bandwidth between CPU (gladiator_noc) and Last Level
>> Cache (memnoc). Usage of this BWMON allows to remove fixed bandwidth
>> votes from cpufreq (CPU nodes) thus achieve high memory throughput even
>> with lower CPU frequencies.
>>
>> Performance impact (SDM845-MTP RB3 board, linux next-20220422):
>> 1. No noticeable impact when running with schedutil or performance
>>    governors.
>>
>> 2. When comparing to customized kernel with synced interconnects and
>>    without bandwidth votes from CPU freq, the sysbench memory tests
>>    show significant improvement with bwmon for blocksizes past the L3
>>    cache. The results for such superficial comparison:
>>
>> sysbench memory test, results in MB/s (higher is better)
>>  bs kB |  type |     V | V+no bw votes | bwmon | benefit %
>>      1 | W/seq | 14795 |          4816 |  4985 |      3.5%
>>     64 | W/seq | 41987 |         10334 | 10433 |      1.0%
>>   4096 | W/seq | 29768 |          8728 | 32007 |    266.7%
>>  65536 | W/seq | 17711 |          4846 | 18399 |    279.6%
>> 262144 | W/seq | 16112 |          4538 | 17429 |    284.1%
>>     64 | R/seq | 61202 |         67092 | 66804 |     -0.4%
>>   4096 | R/seq | 23871 |          5458 | 24307 |    345.4%
>>  65536 | R/seq | 18554 |          4240 | 18685 |    340.7%
>> 262144 | R/seq | 17524 |          4207 | 17774 |    322.4%
>>     64 | W/rnd |  2663 |          1098 |  1119 |      1.9%
>>  65536 | W/rnd |   600 |           316 |   610 |     92.7%
>>     64 | R/rnd |  4915 |          4784 |  4594 |     -4.0%
>>  65536 | R/rnd |   664 |           281 |   678 |    140.7%
>>
>> Legend:
>> bs kB: block size in KB (small block size means only L1-3 caches are
>>       used
>> type: R - read, W - write, seq - sequential, rnd - random
>> V: vanilla (next-20220422)
>> V + no bw votes: vanilla without bandwidth votes from CPU freq
>> bwmon: bwmon without bandwidth votes from CPU freq
>> benefit %: difference between vanilla without bandwidth votes and bwmon
>>            (higher is better)
>>
>> Co-developed-by: Thara Gopinath
>> Signed-off-by: Thara Gopinath
>> Signed-off-by: Krzysztof Kozlowski
>> ---
>>  arch/arm64/boot/dts/qcom/sdm845.dtsi | 54 ++++++++++++++++++++++++++++
>>  1 file changed, 54 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
>> index 83e8b63f0910..adffb9c70566 100644
>> --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
>> +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
>> @@ -2026,6 +2026,60 @@ llcc: system-cache-controller@1100000 {
>>  			interrupts = ;
>>  		};
>>
>> +		pmu@1436400 {
>> +			compatible = "qcom,sdm845-cpu-bwmon";
>> +			reg = <0 0x01436400 0 0x600>;
>> +
>> +			interrupts = ;
>> +
>> +			interconnects = <&gladiator_noc MASTER_APPSS_PROC 3 &mem_noc SLAVE_EBI1 3>,
>> +					<&osm_l3 MASTER_OSM_L3_APPS &osm_l3 SLAVE_OSM_L3>;
>> +			interconnect-names = "ddr", "l3c";
>
> Is this the pmu/bwmon instance between the cpu and caches or the one between the caches and DDR?

To my understanding this is the one between CPU and caches.

> Depending on which one it is, shouldn't we just be scaling either one and not both the interconnect paths?

The interconnects are the same as the ones used for the CPU nodes.
Therefore, if we want to scale both when scaling the CPU, then we also
want to scale both when seeing traffic between CPU and cache.

Maybe that assumption is not correct, in which case the two
interconnects in the CPU nodes are not proper either?

Best regards,
Krzysztof
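
Note on the quoted benchmark table: the legend defines "benefit %" as the difference between "V+no bw votes" and "bwmon". A minimal sketch of that arithmetic follows, assuming the column is the relative gain over the "V+no bw votes" baseline; this formula is one reading of the legend and is not stated in the patch. The function name benefit_percent is hypothetical. Recomputing from the rounded MB/s figures in the table lands within a fraction of a percentage point of the quoted values.

```python
def benefit_percent(no_votes_mbps: float, bwmon_mbps: float) -> float:
    """Relative throughput gain of bwmon over the 'V+no bw votes' baseline, in percent.

    Assumed formula: (bwmon - baseline) / baseline * 100.
    """
    return (bwmon_mbps - no_votes_mbps) / no_votes_mbps * 100.0


# (block size kB, type, V+no bw votes MB/s, bwmon MB/s, benefit % quoted in the table)
rows = [
    (1, "W/seq", 4816, 4985, 3.5),
    (4096, "W/seq", 8728, 32007, 266.7),
    (64, "R/rnd", 4784, 4594, -4.0),
]

for bs, kind, base, bwmon, quoted in rows:
    recomputed = benefit_percent(base, bwmon)
    print(f"{bs:>6} {kind}: recomputed {recomputed:6.1f}%, quoted {quoted}%")
```

Under this reading, the rows with block sizes that fit in the caches (1 kB, 64 kB) show changes of only a few percent in either direction, while the 4096 kB row shows the large gain, which matches the commit message's claim about block sizes past the L3 cache.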