From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-183.mta1.migadu.com (out-183.mta1.migadu.com [95.215.58.183]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6812E3921FA for ; Wed, 13 May 2026 05:32:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.183 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778650381; cv=none; b=jsLr2oteNNPzft07pzmJIBX+VWG+F+NkiAbEAxqD9Af15yLOfhk4vVvvfgkPQIy+BVWYU5umltk+NLhOGiqkMhwibyFf4CuQlJEEEGMBJsodPldoNY7YqDiuGNxRqtLc0gxPcl7rfAzS7xp1sNOTb3NuPYxlLoc6T9PjFyy8mac= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778650381; c=relaxed/simple; bh=kSbEm7k58Eo6lWtNEuFinBdUBfjgQ7eoeLviWwnfWkA=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=i4a9d/L6yrV7myGelErRGe8RQnfPAGsCMD+iLpJk+w5XfySs4RJCiFMVUuS2WZDcH3b5bI+WfnBZZjnkNgh3HzcZKNLlaEJD0VxbnLxfvQH1aoMHw8cIt1tkZFUwCRGlzaCn89sXBqw9ADIYkhOGm3i/rjsSraWYMGgGN7Ozl2s= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=tCmKNPgd; arc=none smtp.client-ip=95.215.58.183 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="tCmKNPgd" Message-ID: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1778650367; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BHAIdS14F1xCMUzmy/sos/gkrsZKrIkMdmnfzXNcQAw=; b=tCmKNPgdqOvnIfkFzHi7b9nRGz1cCaPjWwgT4b3a6uoFDJ0Xs5HNoSfPEd79RsoMjmjmEV 1ZhhPZ43l0qf4bhLHVq130uGSg+hUSc4ojnYiQvEF1YbRk9KjzB1iL2DJU9wmePRjJnHeX m0KQJLIx01hodVBKWmdMy2s29OpmtwI= Date: Wed, 13 May 2026 13:32:17 +0800 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Subject: Re: [RFC PATCH v2 02/10] rv/da: fix per-task da_monitor_destroy() ordering and sync To: Gabriele Monaco Cc: linux-trace-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Steven Rostedt References: <8e80cbcf739304de95356f1fac677261628977fa.camel@redhat.com> Content-Language: en-US X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Wen Yang In-Reply-To: <8e80cbcf739304de95356f1fac677261628977fa.camel@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Migadu-Flow: FLOW_OUT On 5/12/26 17:09, Gabriele Monaco wrote: > On Tue, 2026-05-12 at 10:27 +0200, Gabriele Monaco wrote: >> On Tue, 2026-05-12 at 02:24 +0800, wen.yang@linux.dev wrote: >>> From: Wen Yang >>> >>> The following two paths race: >>> >>>   CPU 0 (disable_stall/__rv_disable_monitor)  CPU 1 (wwnr probe handler) >> ^ did you mean stall? > > Ok I got it now, so essentially you'd reproduce it like: > > * start a DA per-task monitor (no timer) > * stop it, a handler is still running after reset, it sets monitoring back to 1 > * start an HA per-task monitor > > that would use the same slot that is now looking like: > > { monitoring = 1, timer.function = NULL } > > because it was not initialised as HA but monitoring was reset in the race. > > Thinking about this again, it isn't just an issue with per-task monitors, all > monitors reusing slots would suffer from it. > Besides, relying on monitoring can be fragile when using LTL monitors on the > same task (those don't even have monitoring). > > Perhaps the solution isn't that trivial, I'm going to give one more thought on > it, but thanks again for bringing this up! > > Gabriele > >>>   ------------------------------------------  ----------------------------- >>>   disable_stall() >>>     da_monitor_destroy() >>>       da_monitor_reset_all()          <------ [task T: monitoring=0] >>>                                               da_monitor_start(&T->rv[n]) >>>                                               /* no timer_setup */ >>>                                                monitoring=1  <---- >>>   tracepoint_synchronize_unregister() >>>   // CPU 1 probe has already returned; sync returns >>> >>> Later, enable_stall() acquires the same slot and calls da_monitor_init(): >>> >>>   da_monitor_reset_all() >>>     da_monitor_reset(&T->rv[slot])    // monitoring=1, timer.function==0 >>>       ha_monitor_reset_env() >>>         ha_cancel_timer() >>>           timer_delete(&ha_mon->timer)  // ODEBUG: timer never initialised >>> >>>   ODEBUG: assert_init not available (active state 0) >>>   object type: timer_list >>>   Call trace: timer_delete <- da_monitor_reset_all <- enable_stall >>> >>> Call tracepoint_synchronize_unregister() inside da_monitor_destroy() >>> before da_monitor_reset_all().  The unregister_trace_xxx() calls in the >>> monitor's disable() have already disconnected the tracepoints; the sync >>> here drains any handler still in flight, so no new monitoring=1 can >>> appear after da_monitor_reset_all() clears the slot. >>> >>> Also fix the slot release ordering: release the slot only after >>> reset_all() to avoid accessing rv[] with an out-of-bounds index. >>> >>> Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type") >>> Signed-off-by: Wen Yang >>> --- >> >> Thanks for the fix, I have a similar one waiting for submission. >> >> These are technically 2 separate fixes though: the ordering with unset >> task_mon_slot (independent on HA) and the synchronisation with pending >> tracepoints. They probably deserve separate patches and visibility, the first >> has always been around and we're technically overwriting who knows what. >> >> >> The explanation above is a bit hard to follow though, are you talking about a >> handler for the same (stall) monitor running after the reset, effectively >> undoing it by setting the monitoring flag? >> >> Then this is indeed an issue with ha_monitor_reset_env() which expects a clean >> environment. >> >> So that's basically what you'd see now much more often because in fact we >> don't >> reset the right slot (though, again, that's a different issue). >> >> >> Calling tracepoint_synchronize_unregister() there too would surely fix, but it >> used to be kinda slow. But it's probably gotten faster since now tracepoints >> use >> SRCU, so we can wait for a dedicated grace period. >> >> I liked the idea to wait cumulatively in the end, but that's just making >> things >> harder.. Let's do like this: >> >> Prepare 2 separate patches as fixes, put the task slot one first (would ease >> backporting), mention this issue with the race condition only in the second. >> You can send them independently and I'll add them to the tree as urgent. >> >> >> I'm soon going to send my set of fixes that will also include the task slot >> patch (not removing to ease my life with conflicts). >> Hi Gabriele, Thanks for both messages. Two patches are ready; let me address your follow-up concerns before sending. 1. "all monitors reusing slots would suffer from it" Only RV_MON_PER_TASK uses the rv_get/put_task_monitor_slot() pool. RV_MON_GLOBAL and RV_MON_PER_CPU each have dedicated storage (a single static variable and a per-cpu variable) and never share slots across monitor types. The race is exclusive to PER_TASK, so fixing that variant's da_monitor_destroy() is the correct scope. 2. "LTL monitors don't even have monitoring" tracepoint_synchronize_unregister() does not rely on the monitoring flag at all. It is a system-wide barrier — it calls synchronize_rcu_tasks_trace() followed by synchronize_srcu(&tracepoint_srcu) — draining every in-flight tracepoint handler on every CPU regardless of which monitor dispatched it. LTL handlers are covered without any special treatment. The slot-ordering issue (patch 1) affects all per-task DA monitors, not only HA ones — "independent on HA" — because RV_PER_TASK_MONITOR_INIT equals CONFIG_RV_PER_TASK_MONITORS (one past the end of rv[]), so da_monitor_reset_all() overwrites whatever follows rv[] in task_struct whenever any per-task monitor is disabled. Also corrected "wwnr probe handler" to "stall probe handler" in patch 2 per your annotation. Please let me know if the above reasoning addresses your concerns. -- Best wishes, Wen >> >>>  include/rv/da_monitor.h | 18 ++++++++++++++++-- >>>  1 file changed, 16 insertions(+), 2 deletions(-) >>> >>> diff --git a/include/rv/da_monitor.h b/include/rv/da_monitor.h >>> index 00ded3d5ab3f..d04bb3229c75 100644 >>> --- a/include/rv/da_monitor.h >>> +++ b/include/rv/da_monitor.h >>> @@ -304,6 +304,20 @@ static int da_monitor_init(void) >>> >>>  /* >>>   * da_monitor_destroy - return the allocated slot >>> + * >>> + * Call tracepoint_synchronize_unregister() before reset_all() to close >>> + * the race where an in-flight non-HA probe handler sets monitoring=1 >>> + * (without calling timer_setup()) after da_monitor_reset_all() has >>> + * already cleared the slot but before the caller's own sync completes. >>> + * Without this barrier, an HA_TIMER_WHEEL monitor that later acquires >>> + * the same slot would call timer_delete() on a never-initialised >>> + * timer_list, triggering ODEBUG warnings. >>> + * >>> + * Note: tracepoint_synchronize_unregister() is a system-wide barrier >>> + * that waits for all CPUs to finish any in-flight tracepoint handlers. >>> + * The caller's own __rv_disable_monitor() issues a second sync after >>> + * returning from disable(); that redundant call is harmless on the >>> + * infrequent admin (enable/disable) path. >>>   */ >>>  static inline void da_monitor_destroy(void) >>>  { >>> @@ -311,10 +325,10 @@ static inline void da_monitor_destroy(void) >>>   WARN_ONCE(1, "Disabling a disabled monitor: " >>> __stringify(MONITOR_NAME)); >>>   return; >>>   } >>> + tracepoint_synchronize_unregister(); >>> + da_monitor_reset_all(); >>>   rv_put_task_monitor_slot(task_mon_slot); >>>   task_mon_slot = RV_PER_TASK_MONITOR_INIT; >>> - >>> - da_monitor_reset_all(); >>>  } >>> >>>  #elif RV_MON_TYPE == RV_MON_PER_OBJ >