From: Jinchao Wang
To: Andrew Morton, Masami Hiramatsu, Peter Zijlstra, Mike Rapoport,
    Alexander Potapenko, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin",
    Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
    Ben Segall, Mel Gorman, Valentin Schneider,
    Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
    Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
    "Liang, Kan", David Hildenbrand, Lorenzo Stoakes,
    "Liam R. Howlett", Vlastimil Babka, Suren Baghdasaryan,
    Michal Hocko, Nathan Chancellor, Nick Desaulniers, Bill Wendling,
    Justin Stitt, Kees Cook, Alice Ryhl, Sami Tolvanen, Miguel Ojeda,
    Masahiro Yamada, Rong Xu, Naveen N Rao, David Kaplan,
    Andrii Nakryiko, Jinjie Ruan, Nam Cao, workflows@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-perf-users@vger.kernel.org, linux-mm@kvack.org,
    llvm@lists.linux.dev, Andrey Ryabinin, Andrey Konovalov,
    Dmitry Vyukov, Vincenzo Frascino, kasan-dev@googlegroups.com,
    "David S. Miller", Mathieu Desnoyers,
    linux-trace-kernel@vger.kernel.org
Cc: Jinchao Wang
Subject: [PATCH v4 00/21] mm/ksw: Introduce real-time KStackWatch debugging tool
Date: Fri, 12 Sep 2025 18:11:10 +0800
Message-ID: <20250912101145.465708-1-wangjinchao600@gmail.com>
X-Mailing-List: linux-trace-kernel@vger.kernel.org

This patch series introduces KStackWatch, a lightweight kernel debugging
tool for detecting kernel stack corruption in real time.

The motivation comes from scenarios where corruption occurs silently in
one function but manifests later as a crash in another, with no direct
call trace linking the two. Such bugs are often extremely hard to debug
with existing tools, and heavier tools may fail to reproduce the issue
at all because of the overhead they add. I demonstrate this scenario in
test2 (silent corruption test).

KStackWatch works by combining a hardware breakpoint with kprobe and
fprobe. It can watch a stack canary or a selected local variable and
detect the moment the corruption actually occurs. This allows developers
to pinpoint the real source rather than only observing the final crash.

Key features include:

  - Lightweight overhead with minimal impact on bug reproducibility
  - Real-time detection of stack corruption
  - Simple configuration through `/proc/kstackwatch`
  - Support for a recursion depth filter

To validate the approach, the series includes a test module and a test
script.

---
Changelog

V4:
  * Solve the lockdep issues with:
    * per-task KStackWatch context to track depth
    * atomic flag to protect watched_addr
  * Use refactored version of arch_reinstall_hw_breakpoint

  Patches 1–3 of this series are also used in the wprobe work proposed
  by Masami Hiramatsu, so there may be some overlap between our patches.
  Patch 3 comes directly from Masami Hiramatsu (thanks).

V3:
  Main changes:
  * Use modify_wide_hw_breakpoint_local() (from Masami)
  * Add atomic flag to restrict /proc/kstackwatch to a single opener
  * Protect stack probe with an atomic PID flag
  * Handle CPU hotplug for watchpoints
  * Add preempt_disable/enable in ksw_watch_on_local_cpu()
  * Introduce const struct ksw_config *ksw_get_config(void) and use it
  * Switch to global watch_attr, remove struct watch_info
  * Validate local_var_len in parser()
  * Handle case when canary is not found
  * Use dump_stack() instead of show_regs() to allow module build

  Cleanups:
  * Reduce logging and comments
  * Format logs with KBUILD_MODNAME
  * Remove unused headers

  Documentation:
  * Add new document

V2:
  https://lore.kernel.org/all/20250904002126.1514566-1-wangjinchao600@gmail.com/
  * Make hardware breakpoint and stack operations
    architecture-independent.

V1:
  https://lore.kernel.org/all/20250828073311.1116593-1-wangjinchao600@gmail.com/

  Core Implementation
  * Replaced kretprobe with fprobe for function exit hooking, as
    suggested by Masami Hiramatsu
  * Introduced per-task depth logic to track recursion across scheduling
  * Removed the use of workqueue for a more efficient corruption check
  * Reordered patches for better logical flow
  * Simplified and improved commit messages throughout the series
  * Removed initial archcheck which should be improved later

  Testing and Architecture
  * Replaced the multiple-thread test with silent corruption test
  * Split self-tests into a separate patch to improve clarity.

  Maintenance
  * Added a new entry for KStackWatch to the MAINTAINERS file.

RFC:
  https://lore.kernel.org/lkml/20250818122720.434981-1-wangjinchao600@gmail.com/

---

The series is structured as follows:

Jinchao Wang (20):
  x86/hw_breakpoint: Unify breakpoint install/uninstall
  x86/hw_breakpoint: Add arch_reinstall_hw_breakpoint
  mm/ksw: add build system support
  mm/ksw: add ksw_config struct and parser
  mm/ksw: add singleton /proc/kstackwatch interface
  mm/ksw: add HWBP pre-allocation
  mm/ksw: Add atomic ksw_watch_on() and ksw_watch_off()
  mm/ksw: support CPU hotplug
  sched: add per-task KStackWatch context
  mm/ksw: add probe management helpers
  mm/ksw: resolve stack watch addr and len
  mm/ksw: manage probe and HWBP lifecycle via procfs
  mm/ksw: add self-debug helpers
  mm/ksw: add test module
  mm/ksw: add stack overflow test
  mm/ksw: add silent corruption test case
  mm/ksw: add recursive stack corruption test
  tools/ksw: add test script
  docs: add KStackWatch document
  MAINTAINERS: add entry for KStackWatch

Masami Hiramatsu (Google) (1):
  HWBP: Add modify_wide_hw_breakpoint_local() API

 Documentation/dev-tools/kstackwatch.rst |  94 +++++++++
 MAINTAINERS                             |   8 +
 arch/Kconfig                            |  10 +
 arch/x86/Kconfig                        |   1 +
 arch/x86/include/asm/hw_breakpoint.h    |   8 +
 arch/x86/kernel/hw_breakpoint.c         | 148 +++++++------
 include/linux/hw_breakpoint.h           |   6 +
 include/linux/kstackwatch_types.h       |  13 ++
 include/linux/sched.h                   |   5 +
 kernel/events/hw_breakpoint.c           |  36 ++++
 mm/Kconfig.debug                        |  21 ++
 mm/Makefile                             |   1 +
 mm/kstackwatch/Makefile                 |   8 +
 mm/kstackwatch/kernel.c                 | 239 +++++++++++++++++++++
 mm/kstackwatch/kstackwatch.h            |  53 +++++
 mm/kstackwatch/stack.c                  | 194 ++++++++++++++++++
 mm/kstackwatch/test.c                   | 262 ++++++++++++++++++++++++
 mm/kstackwatch/watch.c                  | 181 ++++++++++++++++
 tools/kstackwatch/kstackwatch_test.sh   |  40 ++++
 19 files changed, 1266 insertions(+), 62 deletions(-)
 create mode 100644 Documentation/dev-tools/kstackwatch.rst
 create mode 100644 include/linux/kstackwatch_types.h
 create mode 100644 mm/kstackwatch/Makefile
 create mode 100644 mm/kstackwatch/kernel.c
 create mode 100644 mm/kstackwatch/kstackwatch.h
 create mode 100644 mm/kstackwatch/stack.c
 create mode 100644 mm/kstackwatch/test.c
 create mode 100644 mm/kstackwatch/watch.c
 create mode 100755 tools/kstackwatch/kstackwatch_test.sh

-- 
2.43.0
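
For readers unfamiliar with the recursion depth filter and the per-task
depth tracking mentioned above, the idea can be illustrated with a small
userspace simulation. This is only a sketch under stated assumptions, not
code from the series: `struct ksw_ctx_sketch`, `ksw_sketch_entry()`,
`ksw_sketch_exit()` and `recurse()` are hypothetical names, and "arming"
here merely sets a flag where the real tool would install a hardware
breakpoint.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task context; names are illustrative only. */
struct ksw_ctx_sketch {
	int depth;      /* nested activations of the watched function */
	bool armed;     /* true while the simulated watchpoint is active */
	int arm_count;  /* number of times the watchpoint was armed */
};

/* Entry hook (assumed to play the role of the entry probe):
 * arm only when the activation about to start is at the target depth. */
void ksw_sketch_entry(struct ksw_ctx_sketch *ctx, int target_depth)
{
	if (ctx->depth == target_depth && !ctx->armed) {
		ctx->armed = true;
		ctx->arm_count++;
	}
	ctx->depth++;
}

/* Exit hook (assumed to play the role of the exit probe):
 * disarm when the activation that armed the watchpoint returns. */
void ksw_sketch_exit(struct ksw_ctx_sketch *ctx, int target_depth)
{
	assert(ctx->depth > 0);
	ctx->depth--;
	if (ctx->depth == target_depth && ctx->armed)
		ctx->armed = false;
}

/* Watched function: recurses `levels` more times and returns 1 if the
 * watchpoint was armed at any point during this activation. */
int recurse(struct ksw_ctx_sketch *ctx, int target_depth, int levels)
{
	int armed_seen;

	ksw_sketch_entry(ctx, target_depth);
	armed_seen = ctx->armed;
	if (levels > 0)
		armed_seen |= recurse(ctx, target_depth, levels - 1);
	ksw_sketch_exit(ctx, target_depth);
	return armed_seen;
}
```

Calling `recurse(&ctx, 2, 3)` on a zero-initialized context arms the
simulated watchpoint exactly once, for the activation at depth 2, and
leaves the context disarmed with depth 0 afterwards. Keeping the counter
in a per-task structure rather than a global is what lets the depth
survive a context switch in the middle of the recursion.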