From: Qiliang Yuan
To: dianders@chromium.org
Cc: akpm@linux-foundation.org, lihuafei1@huawei.com,
    linux-kernel@vger.kernel.org, mingo@kernel.org,
    mm-commits@vger.kernel.org, realwujing@gmail.com, song@kernel.org,
    stable@vger.kernel.org, sunshx@chinatelecom.cn,
    thorsten.blum@linux.dev, wangjinchao600@gmail.com,
    yangyicong@hisilicon.com, yuanql9@chinatelecom.cn,
    zhangjn11@chinatelecom.cn
Subject: Re: [PATCH v4] watchdog/hardlockup: Fix UAF in perf event cleanup due to migration race
Date: Mon, 26 Jan 2026 21:16:54 -0500
Message-ID: <20260127021711.1180952-1-realwujing@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Doug,

Thanks for your insightful follow-up! It's great to have the openEuler
vs. mainline timing differences clarified; it definitely explains why
we hit this so reliably in our downstream environment.

On Mon, Jan 26, 2026 at 5:14 PM Doug Anderson wrote:
> OK, so I think the answer is: you haven't actually seen the problem
> (or the WARN_ON) on a mainline kernel, only on the openEuler 4.19
> kernel...
>
> ...actually, I looked and now think the problem doesn't exist on a
> mainline kernel. Specifically, when we run lockup_detector_retry_init()
> we call schedule_work() to do the work. That schedules work on the
> "system_percpu_wq".
> While the work ends up being queued with
> "WORK_CPU_UNBOUND", I believe that we still end up running on a thread
> that's bound to just one CPU in the end. This is presumably why
> nobody has reported that "WARN_ON(!is_percpu_thread())" actually
> hitting on mainline.

You are right that in the latest mainline, schedule_work() has been
updated to use 'system_percpu_wq'. However, in many LTS kernels
(including 4.19), schedule_work() still submits to 'system_wq', which
lacks the per-cpu guarantee.

More importantly, even on 'system_percpu_wq', the worker threads do not
carry the PF_PERCPU_THREAD flag. is_percpu_thread() specifically checks
(current->flags & PF_PERCPU_THREAD), which is reserved for kthreads
specifically pinned via kthread_create_on_cpu(). Therefore, the
WARN_ON(!is_percpu_thread()) in hardlockup_detector_event_create() is
still violated in the retry path even on mainline.

The UAF risk stems from the fact that preemption is enabled during the
probe. If the worker thread (even one on a per-cpu wq) is preempted and
migrated while the probe is running, the code's assumption that the
task stays on one CPU (which is_percpu_thread() usually guarantees) no
longer holds, and we have a logical gap. By making the probe path
stateless and using cpu_hotplug_disable(), we eliminate this dependency
entirely.

> If that's the case, we'd definitely want to at least change the
> description and presumably _remove_ the Fixes tag? I actually still
> think the code looks nicer after your CL and (maybe?) we could even
> remove the whole schedule_work() for running this code? Maybe it was
> only added to deal with this exact problem? ...but the CL description
> would definitely need to be updated.

The schedule_work() in lockup_detector_retry_init() (added by
930d8f8dbab9) is necessary for platforms where the PMU or other
dependencies aren't ready during early init.
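
To make the migration gap concrete, here is a rough userspace model. It
is only a sketch, not kernel code: it assumes the current probe parks
the event in a per-CPU slot and reads that slot back for cleanup, and
every name in it (struct event, slot[], current_cpu, probe_stateful(),
probe_stateless(), stale_slots(), reset_slots()) is made up for
illustration. Migration is simulated by changing current_cpu between
the create and cleanup steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 2

/* Hypothetical stand-in for a perf event; not the kernel's struct. */
struct event {
	int cpu;
};

static struct event *slot[NR_CPUS];	/* models a per-CPU pointer */
static int current_cpu;			/* CPU the "task" is running on */

static struct event *event_create(int cpu)
{
	struct event *ev;

	assert(cpu >= 0 && cpu < NR_CPUS);
	ev = malloc(sizeof(*ev));
	if (ev)
		ev->cpu = cpu;
	return ev;
}

/*
 * Stateful probe: park the event in the current CPU's slot, then look
 * the slot up again for cleanup. migrate_to >= 0 simulates the task
 * moving to another CPU between the two steps.
 */
static bool probe_stateful(int migrate_to)
{
	slot[current_cpu] = event_create(current_cpu);
	if (!slot[current_cpu])
		return false;
	if (migrate_to >= 0)
		current_cpu = migrate_to;
	free(slot[current_cpu]);	/* wrong slot after a migration */
	slot[current_cpu] = NULL;
	return true;
}

/* Stateless probe: the event only ever lives in a local variable. */
static bool probe_stateless(int migrate_to)
{
	struct event *ev = event_create(current_cpu);

	if (!ev)
		return false;
	if (migrate_to >= 0)
		current_cpu = migrate_to;
	free(ev);			/* same object, whatever CPU we are on */
	return true;
}

/* Count slots that still hold a pointer after a probe has finished. */
static int stale_slots(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (slot[cpu])
			n++;
	return n;
}

/* Free and clear every slot so the model can be reused. */
static void reset_slots(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		free(slot[cpu]);
		slot[cpu] = NULL;
	}
}
```

Starting on CPU 0, probe_stateful(1) cleans up CPU 1's (empty) slot and
leaves CPU 0's slot pointing at the event, so stale_slots() returns 1;
probe_stateless(1) leaves nothing behind. The model says nothing about
cpu_hotplug_disable(), which covers CPUs going away rather than the
task moving.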
I agree that the commit description should be updated to clarify that
while the issue was caught in a downstream kernel with shifted init
timings, it identifies a latent race condition in the mainline retry
path.

Regarding the 'Fixes' tag, since 930d8f8dbab9 introduced the
asynchronous retry path which calls the probe logic from a
non-percpu-thread context, it still seems like the appropriate target
for the "root cause" of the vulnerability.

I'll refactor the commit message in V5 to better reflect this context
and remove the emphasis on ToT being "broken" out-of-the-box (since
early init is indeed safe there).

How does that sound to you?

Best regards,
Qiliang