From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <30c62dee-2219-4b39-94c7-b9cc81130c9e@linux.intel.com>
Date: Mon, 13 Oct 2025 16:38:15 +0800
Precedence: bulk
X-Mailing-List: linux-perf-users@vger.kernel.org
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [REGRESSION] bisected: perf: hang when using async-profiler caused by perf: Fix the POLL_HUP delivery breakage
To: Peter Zijlstra
Cc: Octavia Togami, stable@vger.kernel.org, regressions@lists.linux.dev, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org
References: <8aed5e69-57b1-4a01-b90c-56402eb27b37@linux.intel.com> <20251013080531.GJ3245006@noisy.programming.kicks-ass.net>
Content-Language: en-US
From: "Mi, Dapeng"
In-Reply-To: <20251013080531.GJ3245006@noisy.programming.kicks-ass.net>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 10/13/2025 4:05 PM, Peter Zijlstra wrote:
> On Mon, Oct 13, 2025 at 10:34:27AM +0800, Mi, Dapeng wrote:
>
>> It looks the issue described in the link
>> (https://lore.kernel.org/all/20250606192546.915765-1-kan.liang@linux.intel.com/T/#u)
>> happens again but in a different way. :(
>>
>> As the commit message above link described, cpu-clock (and task-clock) is
>> a specific SW event which rely on hrtimer. The hrtimer handler calls
>> __perf_event_overflow() and then event_stop (cpu_clock_event_stop()) and
>> eventually call hrtimer_cancel() which traps into a dead loop which waits
>> for the calling hrtimer handler finishes.
>>
>> As the
>> change (https://lore.kernel.org/all/20250606192546.915765-1-kan.liang@linux.intel.com/T/#u),
>> it should be enough to just disable the event and don't need an extra event
>> stop.
>>
>> @Octavia, could you please check if the change below can fix this issue?
>> Thanks.
>>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 7541f6f85fcb..883b0e1fa5d3 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -10343,7 +10343,20 @@ static int __perf_event_overflow(struct perf_event
>> *event,
>>                 ret = 1;
>>                 event->pending_kill = POLL_HUP;
>>                 perf_event_disable_inatomic(event);
>> -               event->pmu->stop(event, 0);
>> +
>> +               /*
>> +                * The cpu-clock and task-clock are two special SW events,
>> +                * which rely on the hrtimer. The __perf_event_overflow()
>> +                * is invoked from the hrtimer handler for these 2 events.
>> +                * Avoid to call event_stop()->hrtimer_cancel() for these
>> +                * 2 events since hrtimer_cancel() waits for the hrtimer
>> +                * handler to finish, which would trigger a deadlock.
>> +                * Only disabling the events is enough to stop the hrtimer.
>> +                * See perf_swevent_cancel_hrtimer().
>> +                */
>> +               if (event->attr.config != PERF_COUNT_SW_CPU_CLOCK &&
>> +                   event->attr.config != PERF_COUNT_SW_TASK_CLOCK)
>> +                       event->pmu->stop(event, 0);
> This is broken though; you cannot test config without first knowing
> which PMU you're dealing with.

Ah, yes. Just ignore this.

>
> Also, that timer really should get stopped, we can't know for certain
> this overflow is of the timer itself or not, it could be a related
> event.
>
> Something like the below might do -- but please carefully consider the
> cases where hrtimer_try_to_cancel() might fail; in those cases we'll
> have set HES_STOPPED and the hrtimer callback *SHOULD* observe this and
> NORESTART.
>
> But I didn't check all the details.

The only reason hrtimer_try_to_cancel() can fail is that the hrtimer
callback is currently executing, so the current change should be fine.

>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 820127536e62..a91481d57841 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -11756,7 +11756,8 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer)
>
>  	event = container_of(hrtimer, struct perf_event, hw.hrtimer);
>
> -	if (event->state != PERF_EVENT_STATE_ACTIVE)
> +	if (event->state != PERF_EVENT_STATE_ACTIVE ||
> +	    event->hw.state & PERF_HES_STOPPED)
>  		return HRTIMER_NORESTART;
>
>  	event->pmu->read(event);
> @@ -11810,7 +11811,7 @@ static void perf_swevent_cancel_hrtimer(struct perf_event *event)
>  		ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer);
>  		local64_set(&hwc->period_left, ktime_to_ns(remaining));
>
> -		hrtimer_cancel(&hwc->hrtimer);
> +		hrtimer_try_to_cancel(&hwc->hrtimer);
>  	}
>  }
>
> @@ -11854,12 +11855,14 @@ static void cpu_clock_event_update(struct perf_event *event)
>
>  static void cpu_clock_event_start(struct perf_event *event, int flags)
>  {
> +	event->hw.state = 0;
>  	local64_set(&event->hw.prev_count, local_clock());
>  	perf_swevent_start_hrtimer(event);
>  }
>
>  static void cpu_clock_event_stop(struct perf_event *event, int flags)
>  {
> +	event->hw.state = PERF_HES_STOPPED;
>  	perf_swevent_cancel_hrtimer(event);
>  	if (flags & PERF_EF_UPDATE)
>  		cpu_clock_event_update(event);

Besides cpu-clock, task-clock needs a similar change as well. I will post
a complete change later.
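
To make the failed-cancel case concrete, here is a minimal userspace sketch
of the idea in the diff above. It is not kernel code and has not been run
against the kernel: every model_* name is hypothetical, the timer is reduced
to two flags, and the "stop arrives while the callback runs" race is
simulated by a parameter rather than by a second CPU. It only illustrates
that when the try-to-cancel fails, the stopped flag is what keeps the timer
from being re-armed indefinitely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; these only stand in for the kernel ones. */
enum model_restart { MODEL_NORESTART, MODEL_RESTART };

#define MODEL_HES_STOPPED 0x01	/* stands in for PERF_HES_STOPPED */

struct model_event {
	bool active;		/* stands in for PERF_EVENT_STATE_ACTIVE */
	unsigned int hw_state;
	bool queued;		/* timer is armed */
	bool callback_running;	/* timer handler is executing */
};

/*
 * Same return convention as hrtimer_try_to_cancel():
 * 1 = was queued and is now cancelled, 0 = was not queued,
 * -1 = the callback is running, so the timer cannot be cancelled.
 */
static int model_try_to_cancel(struct model_event *e)
{
	if (e->callback_running)
		return -1;
	if (!e->queued)
		return 0;
	e->queued = false;
	return 1;
}

static void model_event_start(struct model_event *e)
{
	e->hw_state = 0;
	e->queued = true;
}

/* Set the stopped flag first, then try to cancel without waiting. */
static int model_event_stop(struct model_event *e)
{
	e->hw_state = MODEL_HES_STOPPED;
	return model_try_to_cancel(e);
}

/*
 * One timer expiry. If stop_during_callback is true, a stop request
 * arrives after the handler has passed its entry check, which is the
 * window where the cancel cannot succeed.
 */
static enum model_restart model_fire(struct model_event *e,
				     bool stop_during_callback, int *stop_ret)
{
	enum model_restart ret;

	e->queued = false;
	e->callback_running = true;

	if (!e->active || (e->hw_state & MODEL_HES_STOPPED)) {
		ret = MODEL_NORESTART;	/* entry check sees the flag */
	} else {
		if (stop_during_callback)
			*stop_ret = model_event_stop(e);
		ret = MODEL_RESTART;	/* this expiry still re-arms */
	}

	e->callback_running = false;
	e->queued = (ret == MODEL_RESTART);
	return ret;
}

/* Walk through the scenario: the stop fails once, the next expiry quiesces. */
static void model_demo(void)
{
	struct model_event e = { .active = true };
	int stop_ret = 0;

	model_event_start(&e);
	assert(model_fire(&e, true, &stop_ret) == MODEL_RESTART);
	assert(stop_ret == -1);		/* cancel failed, callback was running */
	assert(model_fire(&e, false, &stop_ret) == MODEL_NORESTART);
	assert(!e.queued);		/* timer is not re-armed again */
}
```

In this model the cost of a failed cancel is at most one extra expiry, which
returns immediately because of the flag check. Whether the real handler has
any other path that could re-arm the timer after the flag is set still needs
to be checked against the actual code.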