Linux Trace Kernel
 help / color / mirror / Atom feed
From: Wen Yang <wen.yang@linux.dev>
To: Gabriele Monaco <gmonaco@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	Nam Cao <namcao@linutronix.de>,
	linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH 6/9] rv: Ensure synchronous cleanup for HA monitors
Date: Wed, 27 May 2026 01:27:33 +0800	[thread overview]
Message-ID: <43073c3d-11d3-4ac6-bd8a-03f1644dd05b@linux.dev> (raw)
In-Reply-To: <1d7268da9c74d99c84a83d7039e67a9f0da14d98.camel@redhat.com>



On 5/20/26 19:22, Gabriele Monaco wrote:
> On Wed, 2026-05-20 at 00:48 +0800, Wen Yang wrote:
>> The goal is right.  One thing worth double-checking is the load order
>> in the callback against the "SMP BARRIER PAIRING" section of
>> Documentation/memory-barriers.txt, which states:
>>
> 
> Yeah I realised that after I sent my answer.. You might have noticed but I
> proposed a version using acquire/release semantics in [1].
> 
> I'm waiting to send it all in a V2 for the fixes series.
> 
> Are you going to send your patch with tracepoint_synchronize_unregister() in the
> per-task destruction (can be a patch alone)?
> 
> If not I'll do it myself and append that too, I prefer to have everything
> together to avoid conflict resolution issues.
> 

Thanks for the update.

I did notice your acquire/release version — much appreciated.

For the tracepoint_synchronize_unregister() patch, I would prefer that 
you send it yourself, whether as a standalone patch or within your fix 
series. That way we can concentrate our efforts on the tlob monitor 
without worrying about cross-patch conflicts.

So please go ahead and submit it.
Thanks a lot for handling this!


--
Best wishes,
Wen

> 
> [1] -
> https://lore.kernel.org/lkml/02c522f2a09183c9e1a6ff5b0110d0d5cc5e35bd.camel@redhat.com/
> 
>>     [!] Note that the stores before the write barrier would normally be
>>     expected to match the loads after the read barrier or the
>>     address-dependency barrier, and vice versa ...
>>
>> So, we should to swap the read order in the callback so that it matches
>> the standard pattern:
>>
>>     void __ha_monitor_timer_callback() {
>>           guard(rcu)();
>>           curr_state = READ_ONCE(ha_mon->da_mon.curr_state);  /* B:
>> before rmb */
>>           smp_rmb();
>>           if (unlikely(!da_monitoring(&ha_mon->da_mon)))       /* A:
>> after rmb */
>>                   return;
>>           /*
>>            * Reached here: monitoring = 1 (old_A).
>>            * Standard wmb/rmb guarantee: curr_state (read before rmb) is also
>>            * old, i.e. not initial_state.
>>            */
>>           ha_react(curr_state, EVENT_NONE, env_string.buffer);
>>           ...
>>     }
>>
>>     void da_monitor_reset() {
>>           da_monitor_reset_hook(da_mon);
>>           WRITE_ONCE(da_mon->monitoring, 0);   /* A: before wmb */
>>           smp_wmb();
>>           WRITE_ONCE(da_mon->curr_state, model_get_initial_state());  /*
>> B: after wmb */
>>     }
>>
>>
>>
>> --
>> Best wishes,
>> Wen
>>
>>>
>>> [1] -
>>> https://lore.kernel.org/lkml/8af5ba4bd93d2acb8a546e8e47ced974a87c1eb8.1778522945.git.wen.yang@linux.dev
>>>
>>>>
>>>>
>>>> --
>>>> Best wishes,
>>>> Wen
>>>>
>>>>
>>>> On 5/12/26 22:02, Gabriele Monaco wrote:
>>>>> HA monitors may start timers, all cleanup functions currently stop the
>>>>> timers asynchronously to avoid sleeping in the wrong context.
>>>>> Nothing makes sure running callbacks terminate on cleanup.
>>>>>
>>>>> Run the entire HA timer callback in an RCU read-side critical section,
>>>>> this way we can simply synchronize_rcu() with any pending timer and are
>>>>> sure any cleanup using kfree_rcu() runs after callbacks terminated.
>>>>> Additionally make sure any unlikely callback running late won't run any
>>>>> code if the monitor is marked as disabled.
>>>>>
>>>>> Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type")
>>>>> Fixes: 4a24127bd6cb ("rv: Add support for per-object monitors in DA/HA")
>>>>> Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
>>>>> ---
>>>>>     include/rv/da_monitor.h | 23 +++++++++++++++++++----
>>>>>     include/rv/ha_monitor.h | 18 ++++++++++++++++--
>>>>>     2 files changed, 35 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/include/rv/da_monitor.h b/include/rv/da_monitor.h
>>>>> index a4a13b62d1a4..402d3b935c08 100644
>>>>> --- a/include/rv/da_monitor.h
>>>>> +++ b/include/rv/da_monitor.h
>>>>> @@ -57,6 +57,15 @@ static struct rv_monitor rv_this;
>>>>>     #define da_monitor_reset_hook(da_mon)
>>>>>     #endif
>>>>>     
>>>>> +/*
>>>>> + * Hook to allow the implementation of hybrid automata: define it with
>>>>> a
>>>>> + * function that waits for the termination of all monitors background
>>>>> + * activities (e.g. all timers). This hook can sleep.
>>>>> + */
>>>>> +#ifndef da_monitor_sync_hook
>>>>> +#define da_monitor_sync_hook()
>>>>> +#endif
>>>>> +
>>>>>     /*
>>>>>      * Type for the target id, default to int but can be overridden.
>>>>>      * A long type can work as hash table key (PER_OBJ) but will be
>>>>> downgraded
>>>>> to
>>>>> @@ -179,6 +188,7 @@ static inline int da_monitor_init(void)
>>>>>     static inline void da_monitor_destroy(void)
>>>>>     {
>>>>>     	da_monitor_reset_all();
>>>>> +	da_monitor_sync_hook();
>>>>>     }
>>>>>     
>>>>>     #ifndef da_implicit_guard
>>>>> @@ -232,6 +242,7 @@ static inline int da_monitor_init(void)
>>>>>     static inline void da_monitor_destroy(void)
>>>>>     {
>>>>>     	da_monitor_reset_all();
>>>>> +	da_monitor_sync_hook();
>>>>>     }
>>>>>     
>>>>>     #ifndef da_implicit_guard
>>>>> @@ -319,6 +330,7 @@ static inline void da_monitor_destroy(void)
>>>>>     	}
>>>>>     
>>>>>     	da_monitor_reset_all();
>>>>> +	da_monitor_sync_hook();
>>>>>     
>>>>>     	rv_put_task_monitor_slot(task_mon_slot);
>>>>>     	task_mon_slot = RV_PER_TASK_MONITOR_INIT;
>>>>> @@ -497,10 +509,9 @@ static void da_monitor_reset_all(void)
>>>>>     	struct da_monitor_storage *mon_storage;
>>>>>     	int bkt;
>>>>>     
>>>>> -	rcu_read_lock();
>>>>> +	guard(rcu)();
>>>>>     	hash_for_each_rcu(da_monitor_ht, bkt, mon_storage, node)
>>>>>     		da_monitor_reset(&mon_storage->rv.da_mon);
>>>>> -	rcu_read_unlock();
>>>>>     }
>>>>>     
>>>>>     static inline int da_monitor_init(void)
>>>>> @@ -516,13 +527,17 @@ static inline void da_monitor_destroy(void)
>>>>>     	int bkt;
>>>>>     
>>>>>     	tracepoint_synchronize_unregister();
>>>>> +	scoped_guard(rcu) {
>>>>> +		hash_for_each_rcu(da_monitor_ht, bkt, mon_storage,
>>>>> node) {
>>>>> +			da_monitor_reset_hook(&mon_storage->rv.da_mon);
>>>>> +		}
>>>>> +	}
>>>>> +	da_monitor_sync_hook();
>>>>>     	/*
>>>>>     	 * This function is called after all probes are disabled and no
>>>>> longer
>>>>>     	 * pending, we can safely assume no concurrent user.
>>>>>     	 */
>>>>> -	synchronize_rcu();
>>>>>     	hash_for_each_safe(da_monitor_ht, bkt, tmp, mon_storage, node)
>>>>> {
>>>>> -		da_monitor_reset_hook(&mon_storage->rv.da_mon);
>>>>>     		hash_del_rcu(&mon_storage->node);
>>>>>     		kfree(mon_storage);
>>>>>     	}
>>>>> diff --git a/include/rv/ha_monitor.h b/include/rv/ha_monitor.h
>>>>> index d59507e8cb30..47ff1a41febe 100644
>>>>> --- a/include/rv/ha_monitor.h
>>>>> +++ b/include/rv/ha_monitor.h
>>>>> @@ -36,6 +36,7 @@ static bool ha_monitor_handle_constraint(struct
>>>>> da_monitor
>>>>> *da_mon,
>>>>>     #define da_monitor_event_hook ha_monitor_handle_constraint
>>>>>     #define da_monitor_init_hook ha_monitor_init_env
>>>>>     #define da_monitor_reset_hook ha_monitor_reset_env
>>>>> +#define da_monitor_sync_hook() synchronize_rcu()
>>>>>     
>>>>>     #include <rv/da_monitor.h>
>>>>>     #include <linux/seq_buf.h>
>>>>> @@ -237,12 +238,25 @@ static bool ha_monitor_handle_constraint(struct
>>>>> da_monitor *da_mon,
>>>>>     	return false;
>>>>>     }
>>>>>     
>>>>> +/*
>>>>> + * __ha_monitor_timer_callback - generic callback representation
>>>>> + *
>>>>> + * This callback runs in an RCU read-side critical section to allow the
>>>>> + * destruction sequence to easily synchronize_rcu() with all pending
>>>>> timer
>>>>> + * after asynchronously disabling them.
>>>>> + */
>>>>>     static inline void __ha_monitor_timer_callback(struct ha_monitor
>>>>> *ha_mon)
>>>>>     {
>>>>> -	enum states curr_state = READ_ONCE(ha_mon->da_mon.curr_state);
>>>>>     	DECLARE_SEQ_BUF(env_string, ENV_BUFFER_SIZE);
>>>>> -	u64 time_ns = ha_get_ns();
>>>>> +	enum states curr_state;
>>>>> +	u64 time_ns;
>>>>> +
>>>>> +	if (unlikely(!da_monitor_handling_event(&ha_mon->da_mon)))
>>>>> +		return;
>>>>>     
>>>>> +	guard(rcu)();
>>>>> +	curr_state = READ_ONCE(ha_mon->da_mon.curr_state);
>>>>> +	time_ns = ha_get_ns();
>>>>>     	ha_get_env_string(&env_string, ha_mon, time_ns);
>>>>>     	ha_react(curr_state, EVENT_NONE, env_string.buffer);
>>>>>     	ha_trace_error_env(ha_mon, model_get_state_name(curr_state),
>>>
> 

  reply	other threads:[~2026-05-26 17:28 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-12 14:02 [PATCH 0/9] rv: Fixes on Deterministic and Hybrid Automata Gabriele Monaco
2026-05-12 14:02 ` [PATCH 1/9] rv: Fix __user specifier usage in extract_params() Gabriele Monaco
2026-05-17  8:48   ` Wen Yang
2026-05-12 14:02 ` [PATCH 2/9] rv: Fix read_lock scope in per-task DA cleanup Gabriele Monaco
2026-05-17  8:51   ` Wen Yang
2026-05-12 14:02 ` [PATCH 3/9] rv: Reset per-task DA monitors before releasing the slot Gabriele Monaco
2026-05-17  8:55   ` Wen Yang
2026-05-12 14:02 ` [PATCH 4/9] rv: Prevent task migration while handling per-CPU events Gabriele Monaco
2026-05-17  8:57   ` Wen Yang
2026-05-12 14:02 ` [PATCH 5/9] rv: Ensure all pending probes terminate on per-obj monitor destroy Gabriele Monaco
2026-05-17  9:01   ` Wen Yang
2026-05-12 14:02 ` [PATCH 6/9] rv: Ensure synchronous cleanup for HA monitors Gabriele Monaco
2026-05-17  9:12   ` Wen Yang
2026-05-18 11:54     ` Gabriele Monaco
2026-05-19  9:31       ` Gabriele Monaco
2026-05-19 16:48       ` Wen Yang
2026-05-20 11:22         ` Gabriele Monaco
2026-05-26 17:27           ` Wen Yang [this message]
2026-05-12 14:02 ` [PATCH 7/9] rv: Do not rely on clean monitor when initialising HA Gabriele Monaco
2026-05-17  9:15   ` Wen Yang
2026-05-12 14:02 ` [PATCH 8/9] rv: Add automatic cleanup handlers for per-task HA monitors Gabriele Monaco
2026-05-17  9:40   ` Wen Yang
2026-05-18 12:18     ` Gabriele Monaco
2026-05-12 14:02 ` [PATCH 9/9] rv: Mandate deallocation for per-obj monitors Gabriele Monaco
2026-05-17  9:52   ` Wen Yang
2026-05-18  6:36     ` Gabriele Monaco
2026-05-18 15:40       ` Wen Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=43073c3d-11d3-4ac6-bd8a-03f1644dd05b@linux.dev \
    --to=wen.yang@linux.dev \
    --cc=gmonaco@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=namcao@linutronix.de \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox