From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.5 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF963C433DF for ; Tue, 23 Jun 2020 17:21:00 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9D209204EA for ; Tue, 23 Jun 2020 17:21:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="WYI0VFQY"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="p7ePoQ6w" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9D209204EA Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References:Message-ID: Subject:To:From:Date:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=YcuTg2I5MnG7xA82iwk7/OzxdYRyUZXU6bt14thD+3U=; b=WYI0VFQY0OeDnbWsJgjjOwC95 inAs7lV9fj3n67pRq2MSkBdNRENZodMxUdWh503pWqDjmD2+A/I/sxvyg3NFuYf5gQeOeuBg0oQN6 qhGzLIKhddhFNjOLAxQNcn2u6w2jQKedWxxzRsYJBAb0W9U8tEXimBh+7N5B+KXtifLRltbKgxsJC eNKhr/1cdzft7mL9S9y3UclR51qChpsHEEjEiAB9NXfpvWEhBfobXh2lIhMPGwhFa1lBN6r43HaS/ BIbcbpeUMrWKtu8LMoURO8S7OGFcfkUhFgQlbyWXl6esGtIqaNTqMAlWUZvfUx+bY5VC3wUMDNTqU lBtgugmtw==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1jnmZr-0007SH-Fc; Tue, 23 Jun 2020 17:19:19 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1jnmZo-0007RE-2M for linux-arm-kernel@lists.infradead.org; Tue, 23 Jun 2020 17:19:17 +0000 Received: from willie-the-truck (236.31.169.217.in-addr.arpa [217.169.31.236]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 990D020781; Tue, 23 Jun 2020 17:19:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1592932755; bh=88RROu/Udh8k8VGMaeC38rwcu+eyfh+EM/C+2a0MOZU=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=p7ePoQ6wpoYX/e81nIXy8P33qv/UlfcE2H0CmdRhh09yI0kfjVqUmDI93TUbSkLy6 O47eePucoTvcDzNRE9aOu/zmNgo5cjZHiDgfqyowsFjrnEe35EPNvhR1VKoPosnmS0 xgSRzrJac2oVGstnyhIbIjWn9gRwfcs3rV0JfFlo= Date: Tue, 23 Jun 2020 18:19:10 +0100 From: Will Deacon To: Mark Rutland Subject: Re: [PATCH][V3] arm64: perf: Get the wrong PC value in REGS_ABI_32 mode Message-ID: <20200623171909.GC4819@willie-the-truck> References: <1589165527-188401-1-git-send-email-jiping.ma2@windriver.com> <20200526102611.GA1363@C02TD0UTHF1T.local> <1e57ec27-1d54-c7cd-5e5b-6c0cc47f9891@windriver.com> <20200527151928.GC59947@C02TD0UTHF1T.local> <20200528075418.GB22156@willie-the-truck> <20200618130332.GA53391@C02TD0UTHF1T.local> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20200618130332.GA53391@C02TD0UTHF1T.local> User-Agent: Mutt/1.10.1 (2018-07-13) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jiping Ma , zhe.he@windriver.com, bruce.ashfield@gmail.com, yue.tao@windriver.com, will.deacon@arm.com, linux-kernel@vger.kernel.org, paul.gortmaker@windriver.com, catalin.marinas@arm.com, linux-arm-kernel@lists.infradead.org Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Thu, Jun 18, 2020 at 02:03:32PM +0100, Mark Rutland wrote: > On Thu, May 28, 2020 at 08:54:19AM +0100, Will Deacon wrote: > > On Thu, May 28, 2020 at 09:06:07AM +0800, Jiping Ma wrote: > > > On 05/27/2020 11:19 PM, Mark Rutland wrote: > > > > On Wed, May 27, 2020 at 09:33:00AM +0800, Jiping Ma wrote: > > > > > On 05/26/2020 06:26 PM, Mark Rutland wrote: > > > > > > On Mon, May 11, 2020 at 10:52:07AM +0800, Jiping Ma wrote: > > > > > This modification can not fix our issue,=EF=BF=BD we need > > > > > perf_reg_abi(current) =3D=3D PERF_SAMPLE_REGS_ABI_32 to judge if = it is 32-bit > > > > > task or not, > > > > > then return the correct PC value. > > > > I must be missing something here. > > > > = > > > > The core code perf_reg_abi(task) is called with the task being samp= led, > > > > and the regs are from the task being sampled. For a userspace sampl= e for > > > > a compat task, compat_user_mode(regs) should be equivalent to the > > > > is_compat_thread(task_thread_info(task)) check. > > > > = > > > > What am I missing? > > > This issue caused by PC value is not correct. regs are sampled in fun= ction > > > perf_output_sample_regs, that call perf_reg_value(regs, bit) to get PC > > > value. > > > PC value is regs[15] in perf_reg_value() function. it should be regs[= 32]. > > > = > > > perf_output_sample_regs(struct perf_output_handle *handle, > > > =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF= =BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD= =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD struct pt_r= egs *regs, u64 mask) > > > { > > > =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD int b= it; > > > =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD DECLA= RE_BITMAP(_mask, 64); > > > = > > > =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD bitma= p_from_u64(_mask, mask); > > > =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD for_e= ach_set_bit(bit, _mask, sizeof(mask) * BITS_PER_BYTE) { > > > =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF= =BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD u64 val; > > > = > > > =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF= =BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD val =3D = perf_reg_value(regs, bit); > > > =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF= =BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD perf_out= put_put(handle, val); > > > =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD } wtf happened here?! > > Yes, but Mark's point is that checking 'compat_user_mode(regs)' should = be > > exactly the same as checking 'perf_reg_abi(current) =3D=3D PERF_SAMPLE_= REGS_ABI_32'. > > Are you saying that's not the case? If so, please can you provide an ex= ample > > of when they are different? > > = > > Leaving that aside for a second, I also think it's reasonable to questi= on > > whether this whole interface is busted or not. I looked at it last nigh= t but > > struggled to work out what it's supposed to do. Consider these three > > scenarios, all under an arm64 kernel: > > = > > 1. 64-bit perf + 64-bit application being profiled > > 2. 64-bit perf + 32-bit application being profiled > > 3. 32-bit perf + 32-bit application being profiled > > = > > It looks like the current code is a bodge to try to handle both (2) and > > (3) at the same time: > > = > > - In case (3), userspace only asks about registers 0-15 > > - In case (2), we fudge the higher registers so that 64-bit SP and LR > > hold the 32-bit values as a bodge to allow a 64-bit dwarf unwinder > > to unwind the stack > = > I think the fudging is nonsensical to begin with, as I would have > expected that PERF_REGS_ABI_32 should be the same layout regardless of > consumer (and therefore should be identical to the 32-bit arm native > format). I realise that doesn't change that we might be stuck with it... Thankfully, I don't think the fudging is affected by the patch here, so I'm inclined to take it as a fix. In fact, with this change, doesn't the prefix of the regs returned in (2) look the same as the one you would expect in 32-bit arm? > > So the idea behind the patch looks fine because case (3) is expecting t= he PC > > in register 15 and instead gets 0, but the temptation is to clean this = up so > > that cases (2) and (3) report the same data to userspace (along the lin= es of > > Mark's patch), namely only the first 16 registers with the PC moved dow= n. We > > can only do that if the unwinder is happy, which it might be if it only= ever > > looks up dwarf register numbers based on the unwind tables in the binar= y. > > Somebody would need to dig into that. > = > Agreed; I will try to figure out what the perf tool does in the three > cases above. I would be grateful if others could take a look too. > = > Another slightly scary thought: what happens for a 32-bit perf with a > 64-bit application being profiled? I don't see how that'd be forbidden, > but I also don't see how it'd work. Although you can obviously feed a 32-bit perf with a 64-bit perf.data, a 32-bit task cannot spawn a 64-bit child so that rules out a lot of the runtime weirdness, no? > > Otherwise, if it generates unconditional references to things like > > register 30 to grab the link register, then we're stuck with the bodge > > and need to special-case the PC. > = > I agree that in that case we'd have to keep the existing bodge, and we'd > have to special-case the PC, but I'd prefer to split the logic for case > 1 into a separate function for cases 2 and 3 so that we can more easily > avoid getting this more confused. > = > Let's figure out what userspace does first... afaict, perf just slurps all the registers in one go and hands them off to the unwinder, which will index into them according to the dwarf unwind tables. I have doubts as to how well that works (for example, unwinder address corrections due to Thumb would not be applied). So, I think we should take this patch (which puts the PC where you'd expect to find it for compat tasks) and then we could consider removing the current lr/sp fudging as a separate patch, which we could revert if it causes a problem. However, I'm not sure I want to open that up. What do you think? Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel