From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.5 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 81AE7C11D05 for ; Thu, 20 Feb 2020 14:22:28 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4D929207FD for ; Thu, 20 Feb 2020 14:22:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="mZCcuqKT"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="aukdObax" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4D929207FD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+infradead-linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=pTR0y2H/r1T5VxpWAsjwB3BAnzcXXCCEff65eQBHKFY=; b=mZCcuqKTKTf0X5 WHaAfZMf30bggtt9c0GMyNr7jvNhRaSaNwu34CUyMkXX7h9khr1KRdECQb7LniNww8noONqmKBsRS WwSl8eU+++6mkDNkuRRImP83HJd55MvyJ0Xwb2I160IUjBZx8oMzFdq5dHyAFmmT43EM2lRJr/X2u ED0J0fEnACBwaxccjZKEr4+n8fi/VKj4Ci45RsWBy4Vx8n6F8ROdy9Xx0Y/6TyvHQo+gTFOcOnBNh Ge7ECCGZrFFrkKLHyV4tZd6vpeVVe/VW140cvqC24e3deG3tP4flp9AjADWBH+A/qrqOn8dAC2Vge nxA1CZNfJdra6Lf1oseQ==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1j4mic-0005W8-Vd; Thu, 20 Feb 2020 14:22:22 +0000 Received: from mail.kernel.org ([198.145.29.99]) by bombadil.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1j4miZ-0005VH-5V for linux-arm-kernel@lists.infradead.org; Thu, 20 Feb 2020 14:22:20 +0000 Received: from willie-the-truck (236.31.169.217.in-addr.arpa [217.169.31.236]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id DD9D3207FD; Thu, 20 Feb 2020 14:22:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1582208538; bh=W1XxzghoGomPsmEhrSpUB0orTne/xvHkP5/NBHjVR6I=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=aukdObaxSVYsNKMVYAMJUK+bIjeopCk3WB0kVw86qSBO96Nc4OpardnPbvs248AlF NiOPdcTB6tZmQIVJmTflhGN5trI+GwEIQST54rJZWCWav+lRPvUyxyWtttoYqASXzr bt25A9vCnTQHAB6/MklKtQthxtcU641JIWLBpc5Y= Date: Thu, 20 Feb 2020 14:22:14 +0000 From: Will Deacon To: minyard@acm.org Subject: Re: [PATCH v2] arm64:kgdb: Fix kernel single-stepping Message-ID: <20200220142214.GC14459@willie-the-truck> References: <20200219152403.3495-1-minyard@acm.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20200219152403.3495-1-minyard@acm.org> User-Agent: Mutt/1.10.1 (2018-07-13) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20200220_062219_266300_426E6740 X-CRM114-Status: GOOD ( 33.82 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Catalin Marinas , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Corey Minyard Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+infradead-linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Wed, Feb 19, 2020 at 09:24:03AM -0600, minyard@acm.org wrote: > From: Corey Minyard > > I was working on a single-step bug on kgdb on an ARM64 system, and I saw > this scenario: > > * A single step is setup to return to el1 > * The ERET return to el1 > * An interrupt is pending and runs before the instruction > * As soon as PSTATE.D (the debug disable bit) is cleared, the single > step happens in that location, not where it should have. > > This appears to be due to PSTATE.SS not being cleared when the exception > happens. Per section D.2.12.5 of the ARMv8 reference manual, that > appears to be incorrect, it says "As part of exception entry, the PE > does all of the following: ... Sets PSTATE.SS to 0." Sorry, but I don't follow you here. If PSTATE.SS is not cleared, why would you take the step exception? > However, I appear to not be the first person who has noticed this. In > the el0-only portion of the kernel_entry macro in entry.S, I found the > following comment: "Ensure MDSCR_EL1.SS is clear, since we can unmask > debug exceptions when scheduling." Exactly the same scenario, except > coming from a userland single step, not a kernel one. No, I think you might be conflating PSTATE.SS and MDSCR_EL1.SS. > As I was studying this, though, I realized that the following scenario > had an issue: > > * Kernel enables MDSCR.SS, MDSCR.KDE, MDSCR.MDE (unnecessary), and > PSTATE.SS to enable a single step in el1, for kgdb or kprobes, > on the current CPU's MDSCR register and the process' PSTATE.SS > register. > * Kernel returns from the exception with ERET. > * An interrupt or page fault happens on the instruction, causing the > instruction to not be run, but the exception handler runs. > * The exception causes the task to migrate to a new core. > * The return from the exception runs on a different processor now, > where the MDSCR values are not set up for a single step. > * The single step fails to happen. > > This is bad for kgdb, of course, but it seems really bad for kprobes if > this happens. I don't see how this can happen for kprobes. Have you managed to reproduce the failure? > To fix both these problems, rework the handling of single steps to clear > things out upon entry to the kernel from el1, and then to set up single > step when returning to el1, and not do the setup in debug-monitors.c. > This means that single stepping does not use > enable/disable_debug_monitors(); it is no longer necessary to track > those flags for single stepping. This is much like single stepping is > handled for el0. A new flag is added in pt_regs to enable single > stepping from el1. Unfortunately, the old value of PSTATE.SS cannot be > used for this because of the hardware bug mentioned earlier. I don't think there's a hardware bug. It sound like you're trying to make kernel debugging per-task instead of per-cpu, but I don't think that's the right thing to do. What if I /want/ to debug an interrupt handler? For example, I might have a watchpoint on something accessed by timer interrupt. > As part of this, there is an interaction between single stepping and the > other users of debug monitors with the MDSCR.KDE bit. That bit has to > be set for both hardware breakpoints at el1 and single stepping at el1. > A new variable was created to store the cpu-wide value of MDSCR.KDE; the > single stepping code makes sure not to clear that bit on kernel entry if > it's set in the per-cpu variable. > > After fixing this and doing some more testing, I ran into another issue: > > * Kernel enables the pt_regs single step > * Kernel returns from the exception with ERET. > * An interrupt or page fault happens on the instruction, causing the > instruction to not be run, but the exception handler runs. This sounds like you've broken debug; we should take the step exception in the exception handler. That's the way this is supposed to work. > There's no easy way to find the pt_regs that has the single step flag > set. So a thread info flag was added so that the single step could be > disabled in this case. Both that flag and the flag in pt_regs must be > set to enable a single step. Honestly, I get the feeling that you don't really understand the code you're changing here and it's a tonne of effort to try to untangle what you're doing. That's not necessarily your fault because the debug architecture is a nightmare to comprehend, but I'm not keen to change it unless we have a really good justification. I'm sure kgdb is riddled with bugs but, as I said before, the fixes should be in kgdb, not by tearing up the low-level debug code (which has the potential to break other users). Maybe it would be easier if you tried to fix one problem per patch, preferably with a way to reproduce the issue you're seeing each time? Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel