From: Wen Yang <wen.yang@linux.dev>
To: Gabriele Monaco <gmonaco@redhat.com>
Cc: linux-kernel@vger.kernel.org,
Steven Rostedt <rostedt@goodmis.org>,
Nam Cao <namcao@linutronix.de>,
linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH 6/9] rv: Ensure synchronous cleanup for HA monitors
Date: Wed, 27 May 2026 01:27:33 +0800 [thread overview]
Message-ID: <43073c3d-11d3-4ac6-bd8a-03f1644dd05b@linux.dev> (raw)
In-Reply-To: <1d7268da9c74d99c84a83d7039e67a9f0da14d98.camel@redhat.com>
On 5/20/26 19:22, Gabriele Monaco wrote:
> On Wed, 2026-05-20 at 00:48 +0800, Wen Yang wrote:
>> The goal is right. One thing worth double-checking is the load order
>> in the callback against the "SMP BARRIER PAIRING" section of
>> Documentation/memory-barriers.txt, which states:
>>
>
> Yeah I realised that after I sent my answer.. You might have noticed but I
> proposed a version using acquire/release semantics in [1].
>
> I'm waiting to send it all in a V2 for the fixes series.
>
> Are you going to send your patch with tracepoint_synchronize_unregister() in the
> per-task destruction (can be a patch alone)?
>
> If not I'll do it myself and append that too, I prefer to have everything
> together to avoid conflict resolution issues.
>
Thanks for the update.
I did notice your acquire/release version — much appreciated.
For the tracepoint_synchronize_unregister() patch, I would prefer that
you send it yourself, whether as a standalone patch or within your fix
series. That way we can concentrate our efforts on the tlob monitor
without worrying about cross-patch conflicts.
So please go ahead and submit it.
Thanks a lot for handling this!
--
Best wishes,
Wen
>
> [1] -
> https://lore.kernel.org/lkml/02c522f2a09183c9e1a6ff5b0110d0d5cc5e35bd.camel@redhat.com/
>
>> [!] Note that the stores before the write barrier would normally be
>> expected to match the loads after the read barrier or the
>> address-dependency barrier, and vice versa ...
>>
>> So, we should to swap the read order in the callback so that it matches
>> the standard pattern:
>>
>> void __ha_monitor_timer_callback() {
>> guard(rcu)();
>> curr_state = READ_ONCE(ha_mon->da_mon.curr_state); /* B:
>> before rmb */
>> smp_rmb();
>> if (unlikely(!da_monitoring(&ha_mon->da_mon))) /* A:
>> after rmb */
>> return;
>> /*
>> * Reached here: monitoring = 1 (old_A).
>> * Standard wmb/rmb guarantee: curr_state (read before rmb) is also
>> * old, i.e. not initial_state.
>> */
>> ha_react(curr_state, EVENT_NONE, env_string.buffer);
>> ...
>> }
>>
>> void da_monitor_reset() {
>> da_monitor_reset_hook(da_mon);
>> WRITE_ONCE(da_mon->monitoring, 0); /* A: before wmb */
>> smp_wmb();
>> WRITE_ONCE(da_mon->curr_state, model_get_initial_state()); /*
>> B: after wmb */
>> }
>>
>>
>>
>> --
>> Best wishes,
>> Wen
>>
>>>
>>> [1] -
>>> https://lore.kernel.org/lkml/8af5ba4bd93d2acb8a546e8e47ced974a87c1eb8.1778522945.git.wen.yang@linux.dev
>>>
>>>>
>>>>
>>>> --
>>>> Best wishes,
>>>> Wen
>>>>
>>>>
>>>> On 5/12/26 22:02, Gabriele Monaco wrote:
>>>>> HA monitors may start timers, all cleanup functions currently stop the
>>>>> timers asynchronously to avoid sleeping in the wrong context.
>>>>> Nothing makes sure running callbacks terminate on cleanup.
>>>>>
>>>>> Run the entire HA timer callback in an RCU read-side critical section,
>>>>> this way we can simply synchronize_rcu() with any pending timer and are
>>>>> sure any cleanup using kfree_rcu() runs after callbacks terminated.
>>>>> Additionally make sure any unlikely callback running late won't run any
>>>>> code if the monitor is marked as disabled.
>>>>>
>>>>> Fixes: f5587d1b6ec9 ("rv: Add Hybrid Automata monitor type")
>>>>> Fixes: 4a24127bd6cb ("rv: Add support for per-object monitors in DA/HA")
>>>>> Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
>>>>> ---
>>>>> include/rv/da_monitor.h | 23 +++++++++++++++++++----
>>>>> include/rv/ha_monitor.h | 18 ++++++++++++++++--
>>>>> 2 files changed, 35 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/include/rv/da_monitor.h b/include/rv/da_monitor.h
>>>>> index a4a13b62d1a4..402d3b935c08 100644
>>>>> --- a/include/rv/da_monitor.h
>>>>> +++ b/include/rv/da_monitor.h
>>>>> @@ -57,6 +57,15 @@ static struct rv_monitor rv_this;
>>>>> #define da_monitor_reset_hook(da_mon)
>>>>> #endif
>>>>>
>>>>> +/*
>>>>> + * Hook to allow the implementation of hybrid automata: define it with
>>>>> a
>>>>> + * function that waits for the termination of all monitors background
>>>>> + * activities (e.g. all timers). This hook can sleep.
>>>>> + */
>>>>> +#ifndef da_monitor_sync_hook
>>>>> +#define da_monitor_sync_hook()
>>>>> +#endif
>>>>> +
>>>>> /*
>>>>> * Type for the target id, default to int but can be overridden.
>>>>> * A long type can work as hash table key (PER_OBJ) but will be
>>>>> downgraded
>>>>> to
>>>>> @@ -179,6 +188,7 @@ static inline int da_monitor_init(void)
>>>>> static inline void da_monitor_destroy(void)
>>>>> {
>>>>> da_monitor_reset_all();
>>>>> + da_monitor_sync_hook();
>>>>> }
>>>>>
>>>>> #ifndef da_implicit_guard
>>>>> @@ -232,6 +242,7 @@ static inline int da_monitor_init(void)
>>>>> static inline void da_monitor_destroy(void)
>>>>> {
>>>>> da_monitor_reset_all();
>>>>> + da_monitor_sync_hook();
>>>>> }
>>>>>
>>>>> #ifndef da_implicit_guard
>>>>> @@ -319,6 +330,7 @@ static inline void da_monitor_destroy(void)
>>>>> }
>>>>>
>>>>> da_monitor_reset_all();
>>>>> + da_monitor_sync_hook();
>>>>>
>>>>> rv_put_task_monitor_slot(task_mon_slot);
>>>>> task_mon_slot = RV_PER_TASK_MONITOR_INIT;
>>>>> @@ -497,10 +509,9 @@ static void da_monitor_reset_all(void)
>>>>> struct da_monitor_storage *mon_storage;
>>>>> int bkt;
>>>>>
>>>>> - rcu_read_lock();
>>>>> + guard(rcu)();
>>>>> hash_for_each_rcu(da_monitor_ht, bkt, mon_storage, node)
>>>>> da_monitor_reset(&mon_storage->rv.da_mon);
>>>>> - rcu_read_unlock();
>>>>> }
>>>>>
>>>>> static inline int da_monitor_init(void)
>>>>> @@ -516,13 +527,17 @@ static inline void da_monitor_destroy(void)
>>>>> int bkt;
>>>>>
>>>>> tracepoint_synchronize_unregister();
>>>>> + scoped_guard(rcu) {
>>>>> + hash_for_each_rcu(da_monitor_ht, bkt, mon_storage,
>>>>> node) {
>>>>> + da_monitor_reset_hook(&mon_storage->rv.da_mon);
>>>>> + }
>>>>> + }
>>>>> + da_monitor_sync_hook();
>>>>> /*
>>>>> * This function is called after all probes are disabled and no
>>>>> longer
>>>>> * pending, we can safely assume no concurrent user.
>>>>> */
>>>>> - synchronize_rcu();
>>>>> hash_for_each_safe(da_monitor_ht, bkt, tmp, mon_storage, node)
>>>>> {
>>>>> - da_monitor_reset_hook(&mon_storage->rv.da_mon);
>>>>> hash_del_rcu(&mon_storage->node);
>>>>> kfree(mon_storage);
>>>>> }
>>>>> diff --git a/include/rv/ha_monitor.h b/include/rv/ha_monitor.h
>>>>> index d59507e8cb30..47ff1a41febe 100644
>>>>> --- a/include/rv/ha_monitor.h
>>>>> +++ b/include/rv/ha_monitor.h
>>>>> @@ -36,6 +36,7 @@ static bool ha_monitor_handle_constraint(struct
>>>>> da_monitor
>>>>> *da_mon,
>>>>> #define da_monitor_event_hook ha_monitor_handle_constraint
>>>>> #define da_monitor_init_hook ha_monitor_init_env
>>>>> #define da_monitor_reset_hook ha_monitor_reset_env
>>>>> +#define da_monitor_sync_hook() synchronize_rcu()
>>>>>
>>>>> #include <rv/da_monitor.h>
>>>>> #include <linux/seq_buf.h>
>>>>> @@ -237,12 +238,25 @@ static bool ha_monitor_handle_constraint(struct
>>>>> da_monitor *da_mon,
>>>>> return false;
>>>>> }
>>>>>
>>>>> +/*
>>>>> + * __ha_monitor_timer_callback - generic callback representation
>>>>> + *
>>>>> + * This callback runs in an RCU read-side critical section to allow the
>>>>> + * destruction sequence to easily synchronize_rcu() with all pending
>>>>> timer
>>>>> + * after asynchronously disabling them.
>>>>> + */
>>>>> static inline void __ha_monitor_timer_callback(struct ha_monitor
>>>>> *ha_mon)
>>>>> {
>>>>> - enum states curr_state = READ_ONCE(ha_mon->da_mon.curr_state);
>>>>> DECLARE_SEQ_BUF(env_string, ENV_BUFFER_SIZE);
>>>>> - u64 time_ns = ha_get_ns();
>>>>> + enum states curr_state;
>>>>> + u64 time_ns;
>>>>> +
>>>>> + if (unlikely(!da_monitor_handling_event(&ha_mon->da_mon)))
>>>>> + return;
>>>>>
>>>>> + guard(rcu)();
>>>>> + curr_state = READ_ONCE(ha_mon->da_mon.curr_state);
>>>>> + time_ns = ha_get_ns();
>>>>> ha_get_env_string(&env_string, ha_mon, time_ns);
>>>>> ha_react(curr_state, EVENT_NONE, env_string.buffer);
>>>>> ha_trace_error_env(ha_mon, model_get_state_name(curr_state),
>>>
>
next prev parent reply other threads:[~2026-05-26 17:28 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-12 14:02 [PATCH 0/9] rv: Fixes on Deterministic and Hybrid Automata Gabriele Monaco
2026-05-12 14:02 ` [PATCH 1/9] rv: Fix __user specifier usage in extract_params() Gabriele Monaco
2026-05-17 8:48 ` Wen Yang
2026-05-12 14:02 ` [PATCH 2/9] rv: Fix read_lock scope in per-task DA cleanup Gabriele Monaco
2026-05-17 8:51 ` Wen Yang
2026-05-12 14:02 ` [PATCH 3/9] rv: Reset per-task DA monitors before releasing the slot Gabriele Monaco
2026-05-17 8:55 ` Wen Yang
2026-05-12 14:02 ` [PATCH 4/9] rv: Prevent task migration while handling per-CPU events Gabriele Monaco
2026-05-17 8:57 ` Wen Yang
2026-05-12 14:02 ` [PATCH 5/9] rv: Ensure all pending probes terminate on per-obj monitor destroy Gabriele Monaco
2026-05-17 9:01 ` Wen Yang
2026-05-12 14:02 ` [PATCH 6/9] rv: Ensure synchronous cleanup for HA monitors Gabriele Monaco
2026-05-17 9:12 ` Wen Yang
2026-05-18 11:54 ` Gabriele Monaco
2026-05-19 9:31 ` Gabriele Monaco
2026-05-19 16:48 ` Wen Yang
2026-05-20 11:22 ` Gabriele Monaco
2026-05-26 17:27 ` Wen Yang [this message]
2026-05-12 14:02 ` [PATCH 7/9] rv: Do not rely on clean monitor when initialising HA Gabriele Monaco
2026-05-17 9:15 ` Wen Yang
2026-05-12 14:02 ` [PATCH 8/9] rv: Add automatic cleanup handlers for per-task HA monitors Gabriele Monaco
2026-05-17 9:40 ` Wen Yang
2026-05-18 12:18 ` Gabriele Monaco
2026-05-12 14:02 ` [PATCH 9/9] rv: Mandate deallocation for per-obj monitors Gabriele Monaco
2026-05-17 9:52 ` Wen Yang
2026-05-18 6:36 ` Gabriele Monaco
2026-05-18 15:40 ` Wen Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=43073c3d-11d3-4ac6-bd8a-03f1644dd05b@linux.dev \
--to=wen.yang@linux.dev \
--cc=gmonaco@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=namcao@linutronix.de \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox