Subject: Re: [PATCH] perf: arm_spe: Use Inner Shareable DSB when draining the buffer
To: Marc Zyngier
References: <20201006150520.161985-1-alexandru.elisei@arm.com> <87ft6r4bgd.wl-maz@kernel.org>
From: Alexandru Elisei
Message-ID: <8fa8af94-ab08-b43a-95e4-55a13de09efe@arm.com>
Date: Tue, 6 Oct 2020 17:13:31 +0100
In-Reply-To: <87ft6r4bgd.wl-maz@kernel.org>
Cc: mark.rutland@arm.com, suzuki.poulose@arm.com, catalin.marinas@arm.com,
 linux-kernel@vger.kernel.org, james.morse@arm.com,
 linux-arm-kernel@lists.infradead.org, will@kernel.org,
 kvmarm@lists.cs.columbia.edu, julien.thierry.kdev@gmail.com
MIME-Version: 1.0
Content-Type: text/plain;
 charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi Marc,

Thank you for having a look at the patch!

On 10/6/20 4:32 PM, Marc Zyngier wrote:
> Hi Alex,
>
> On Tue, 06 Oct 2020 16:05:20 +0100,
> Alexandru Elisei wrote:
>> From ARM DDI 0487F.b, page D9-2807:
>>
>> "Although the Statistical Profiling Extension acts as another observer in
>> the system, for determining the Shareability domain of the DSB
>> instructions, the writes of sample records are treated as coming from the
>> PE that is being profiled."
>>
>> Similarly, on page D9-2801:
>>
>> "The memory type and attributes that are used for a write by the
>> Statistical Profiling Extension to the Profiling Buffer is taken from the
>> translation table entries for the virtual address being written to. That
>> is:
>> - The writes are treated as coming from an observer that is coherent with
>> all observers in the Shareability domain that is defined by the
>> translation tables."
>>
>> All the PEs are in the Inner Shareable domain, use a DSB ISH to make sure
>> writes to the profiling buffer have completed.
> I'm a bit sceptical of this change. The SPE writes are per-CPU, and
> all we are trying to ensure is that the CPU we are running on has
> drained its own queue of accesses.
>
> The accesses being made within the IS domain doesn't invalidate the
> fact that they are still per-CPU, because "the writes of sample
> records are treated as coming from the PE that is being profiled.".
>
> So why should we have an IS-wide synchronisation for accesses that are
> purely local?

I think I might have misunderstood how perf spe works. Below is my original train
of thought.

In the buffer management event interrupt we drain the buffer, and if the buffer is
full, we call arm_spe_perf_aux_output_end() -> perf_aux_output_end().
The comment for perf_aux_output_end() says:

"Commit the data written by hardware into the ring buffer by adjusting aux_head
and posting a PERF_RECORD_AUX into the perf buffer. It is the pmu driver's
responsibility to observe ordering rules of the hardware, so that all the data is
externally visible before this is called."

My conclusion was that after we drain the buffer, the data must be visible to all
CPUs. From the definition of non-shareable memory (ARM DDI 0487F.b, page B2-155):

"For Normal memory locations, the Non-shareable attribute identifies Normal memory
that is likely to be accessed only by a single PE. A location in Normal memory
with the Non-shareable attribute does not require the hardware to make data
accesses by different observers coherent, unless the memory is Non-cacheable."

Linux configures all memory to be Inner Shareable (SH[1:0] = 0b11), *not*
Non-shareable (SH[1:0] = 0b00). I think that the DSB NSH doesn't really do
anything, because the PE will not do any accesses to Non-shareable memory, and we
end up breaking the assumption of perf_aux_output_end().

Did I make a mistake in my reasoning?

Thanks,
Alex
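
For reference, the SH[1:0] encodings mentioned in this message can be checked with a short C sketch. It assumes the field sits in bits [9:8] of a stage 1 block or page descriptor (the VMSAv8-64 long-descriptor layout). The names PTE_SH_SHIFT, enum sh_domain, pte_sh() and sh_selfcheck() are made up for illustration and are not kernel identifiers. It only decodes the attribute; it says nothing about which DSB variant is sufficient.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: SH[1:0] lives in bits [9:8] of a stage 1 block/page descriptor. */
#define PTE_SH_SHIFT 8
#define PTE_SH_MASK  UINT64_C(0x3)

/* Shareability encodings for Normal memory (illustrative enum, not from the kernel). */
enum sh_domain {
	SH_NONE     = 0x0,	/* 0b00: Non-shareable */
	SH_RESERVED = 0x1,	/* 0b01: reserved encoding */
	SH_OUTER    = 0x2,	/* 0b10: Outer Shareable */
	SH_INNER    = 0x3,	/* 0b11: Inner Shareable */
};

/* Extract the Shareability field from a raw descriptor value. */
enum sh_domain pte_sh(uint64_t desc)
{
	return (enum sh_domain)((desc >> PTE_SH_SHIFT) & PTE_SH_MASK);
}

/* Check the two encodings quoted in the message above. */
void sh_selfcheck(void)
{
	assert(pte_sh(UINT64_C(0x3) << PTE_SH_SHIFT) == SH_INNER);
	assert(pte_sh(UINT64_C(0x0) << PTE_SH_SHIFT) == SH_NONE);
}
```

Calling sh_selfcheck() from a test harness should pass silently. Passing a real descriptor value to pte_sh() shows which Shareability domain that mapping uses.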