From: Thomas Gleixner
To: Feng Tang
Cc: "David Hildenbrand (Arm)", Andrew Morton, Petr Mladek, Steven Rostedt,
 paulmck@kernel.org, Douglas Anderson, Peter Zijlstra, Vlastimil Babka,
 linux-kernel@vger.kernel.org, Ard Biesheuvel
Subject: Re: [PATCH v1] kernel: add a simple timer based software watchpoint
References: <20260622081430.37557-1-feng.tang@linux.alibaba.com>
 <0c39c459-306f-49f5-b08e-e7b9b27b6352@kernel.org>
 <87a4skl36t.ffs@fw13>
Date: Thu, 25 Jun 2026 23:30:55 +0200
Message-ID: <87pl1ejoj4.ffs@fw13>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain

On Wed, Jun 24 2026 at 19:12, Feng Tang wrote:
> On Wed, Jun 24, 2026 at 11:04:26AM +0200, Thomas Gleixner wrote:
>> On Tue, Jun 23 2026 at 16:26, Feng Tang wrote:
>> > On Mon, Jun 22, 2026 at 04:13:37PM +0200, David Hildenbrand (Arm) wrote:
>> > As discussed in RFC patch review, this debug feature is similar to
>> > soft/hard lockup detector and task-hung detector, should I make the control
>>
>> How is this very specialized ad hoc debug magic in any way similar to
>> generally useful and just working debug mechanisms like the lockup or
>> hung detector? Those are just turned on, do not need a boatload of
>> command line parameters and are generally useful.
>
> That's right. They are very useful and easy to use, as a big part
> of my time is dealing with all kinds of lockup/task-hung bugs :)
>
>
>> Your debug magic is a workaround for a dysfunctional hardware debugger,
>> which means it's going to be used by three people twice a year if at
>> all. Seriously?
>
> The HW debugger is a Lauterbach TRACE32 one, and may not be dysfunctional,
> as it works well for watchpoint virtual address and other daily job.
> My own guess is that it can only see virtual address and doesn't have the

Guessing is the worst engineering principle as I told you before.

> ability to do the virtual to physical address translation instantly to
> watch a _physical_ address. So I guess, not able to watchpoint a physical
> address may be common for HW debuggers (I could be very wrong).

If the hardware debugger and the underlying CPU facility (ETM on ARM64
IIRC) do not support triggers on physical addresses and you already
concluded from other information that the problem is in the BIOS, then
tracing the kernel with its virt/phys translation is not going to
work. You obviously have to use the BIOS translation which might be very
different, no?

> As in https://lore.kernel.org/lkml/ajkuf08Cj0Se4P_0@U-2FWC9VHC-2323.local/,
> we also used this method to solve one issue that BIOS runtime service
> corrupting ACPI_ENABLE register issue.

Again, if the BIOS runtime service changes virt/phys translation then
you have to trace the BIOS not the kernel. It's pretty obvious, no?

> Then I tried to recall some old memory corruption issues I've met before,
> and think about if there is some that could be captured by this method,
> one example was a static global array overflow issue, which corrupted
> some other global variables which was next to it in kernel bss segment.

No. This is just all catching the problem after the fact with no trace
or conclusive information about the root cause.

The tools are there, you just have to use them correctly. But sure
creating magic hacks which by chance give you the same information is
way better...

> But yes, as you pointed out, the frequency is low (all of the 3 happened
> in the past 6 months) for myself. And my wild guess is there could be
> other developers that meet similar issues :)

Can you for once have an informed opinion instead of wild guesses?

Thanks,

        tglx