* Re: [PATCH v2] smp: Use release stores for csd_lock_record() state
2026-06-22 16:38 [PATCH v2] smp: Use release stores for csd_lock_record() state Usama Arif
@ 2026-06-24 14:15 ` Kunwu Chan
2026-06-24 15:27 ` Usama Arif
2026-06-25 14:42 ` Dmitry Ilvokhin
2026-06-26 19:25 ` Thomas Gleixner
2 siblings, 1 reply; 8+ messages in thread
From: Kunwu Chan @ 2026-06-24 14:15 UTC (permalink / raw)
To: Usama Arif, lkmm, joelagnelf, linux-kernel, marco.crivellari,
paulmck, rafael.j.wysocki, rdunlap, riel, sshegde, tglx, ulfh,
yury.norov, rcu
Cc: shakeel.butt, hannes, kernel-team
On 6/23/26 00:38, Usama Arif wrote:
> __csd_lock_record() publishes per-CPU CSD debug state that is read by
> csd_lock_wait_toolong() on another CPU. The remote side first reads
> cur_csd with smp_load_acquire() and, when non-NULL, may then read the
> matching cur_csd_func and cur_csd_info fields.
>
> Use smp_store_release() when publishing cur_csd so that the preceding
> cur_csd_func and cur_csd_info stores are ordered before the pointer
> that csd_lock_wait_toolong() acquires. This replaces the open-coded
> smp_wmb() plus plain cur_csd store with the release operation that
> matches the smp_load_acquire() in csd_lock_wait_toolong().
>
> For the clear path, use smp_store_release(&cur_csd, NULL) so that
> clearing the diagnostic state remains ordered after the preceding
> callback/unlock work, without requiring a full barrier before the
> store. On x86 this removes the locked full barrier from the clear
> path; on weaker memory models it uses the release operation needed by
> the smp_load_acquire() in csd_lock_wait_toolong().
>
> The old code also had smp_mb() calls around cur_csd updates. Those would
> only be needed if cur_csd were treated as an exact live-state marker whose
> publication had to be observed before callback execution or CSD unlock.
The original comments around those updates seem to suggest a stronger
relationship:
/* Update cur_csd before function call. */
/* NULL cur_csd after unlock. */
The changelog characterizes cur_csd as best-effort diagnostic context.
Is there prior consensus that cur_csd carries no ordering relationship
to callback execution / unlock state,
or is this patch effectively relaxing that historical assumption?
Thanks, Kunwu
> CSD stall warnings do not currently have RCU-style stall-ended checks, so
> they already allow the stall to end while diagnostics are being assembled.
> The cur_csd record is therefore best-effort diagnostic context, not a
> precise completion/stall boundary.
>
> Signed-off-by: Usama Arif <usama.arif@linux.dev>
> ---
> v1 -> v2: https://lore.kernel.org/all/01437928-ff79-4d8e-823b-7f20146946f6@linux.dev/
> - Document where the smp_store_release() synchronizes with (Alan Stern,
> Randy Dunlap and Paul McKenney).
> ---
> kernel/smp.c | 18 ++++++++++++------
> 1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index a0bb56bd8dda..685829875a3e 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -182,16 +182,22 @@ static atomic_t csd_bug_count = ATOMIC_INIT(0);
> static void __csd_lock_record(call_single_data_t *csd)
> {
> if (!csd) {
> - smp_mb(); /* NULL cur_csd after unlock. */
> - __this_cpu_write(cur_csd, NULL);
> + /*
> + * Pairs with smp_load_acquire() of cur_csd in
> + * csd_lock_wait_toolong(): orders any preceding CSD
> + * callback/unlock before a remote reader observes NULL.
> + */
> + smp_store_release(this_cpu_ptr(&cur_csd), NULL);
> return;
> }
> __this_cpu_write(cur_csd_func, csd->func);
> __this_cpu_write(cur_csd_info, csd->info);
> - smp_wmb(); /* func and info before csd. */
> - __this_cpu_write(cur_csd, csd);
> - smp_mb(); /* Update cur_csd before function call. */
> - /* Or before unlock, as the case may be. */
> + /*
> + * Pairs with smp_load_acquire() of cur_csd in
> + * csd_lock_wait_toolong(): publishes cur_csd_func and
> + * cur_csd_info before the non-NULL pointer becomes visible.
> + */
> + smp_store_release(this_cpu_ptr(&cur_csd), csd);
> }
>
> static __always_inline void csd_lock_record(call_single_data_t *csd)
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH v2] smp: Use release stores for csd_lock_record() state
2026-06-24 14:15 ` Kunwu Chan
@ 2026-06-24 15:27 ` Usama Arif
0 siblings, 0 replies; 8+ messages in thread
From: Usama Arif @ 2026-06-24 15:27 UTC (permalink / raw)
To: Kunwu Chan, lkmm, joelagnelf, linux-kernel, marco.crivellari,
paulmck, rafael.j.wysocki, rdunlap, riel, sshegde, tglx, ulfh,
yury.norov, rcu
Cc: shakeel.butt, hannes, kernel-team
On 24/06/2026 15:15, Kunwu Chan wrote:
> On 6/23/26 00:38, Usama Arif wrote:
>> __csd_lock_record() publishes per-CPU CSD debug state that is read by
>> csd_lock_wait_toolong() on another CPU. The remote side first reads
>> cur_csd with smp_load_acquire() and, when non-NULL, may then read the
>> matching cur_csd_func and cur_csd_info fields.
>>
>> Use smp_store_release() when publishing cur_csd so that the preceding
>> cur_csd_func and cur_csd_info stores are ordered before the pointer
>> that csd_lock_wait_toolong() acquires. This replaces the open-coded
>> smp_wmb() plus plain cur_csd store with the release operation that
>> matches the smp_load_acquire() in csd_lock_wait_toolong().
>>
>> For the clear path, use smp_store_release(&cur_csd, NULL) so that
>> clearing the diagnostic state remains ordered after the preceding
>> callback/unlock work, without requiring a full barrier before the
>> store. On x86 this removes the locked full barrier from the clear
>> path; on weaker memory models it uses the release operation needed by
>> the smp_load_acquire() in csd_lock_wait_toolong().
>>
>> The old code also had smp_mb() calls around cur_csd updates. Those would
>> only be needed if cur_csd were treated as an exact live-state marker whose
>> publication had to be observed before callback execution or CSD unlock.
> The original comments around those updates seem to suggest a stronger
> relationship:
> /* Update cur_csd before function call. */
> /* NULL cur_csd after unlock. */
>
> The changelog characterizes cur_csd as best-effort diagnostic context.
>
> Is there prior consensus that cur_csd carries no ordering relationship
> to callback execution / unlock state,
>
> or is this patch effectively relaxing that historical assumption?
>
>
> Thanks, Kunwu
Thanks for the review Kunwu!
The patch preserves the ordering that reader actually needs:
csd_lock_wait_toolong() observes non-NULL cur_csd, the matching func/info have
been published. If it observes NULL, the prior callback/unlock work is ordered
before the clear.
So yes, this relaxes the historical implementation ordering around the
diagnostic marker, but not the CSD execution/reuse ordering.
>
>> CSD stall warnings do not currently have RCU-style stall-ended checks, so
>> they already allow the stall to end while diagnostics are being assembled.
>> The cur_csd record is therefore best-effort diagnostic context, not a
>> precise completion/stall boundary.
>>
>> Signed-off-by: Usama Arif <usama.arif@linux.dev>
>> ---
>> v1 -> v2: https://lore.kernel.org/all/01437928-ff79-4d8e-823b-7f20146946f6@linux.dev/
>> - Document where the smp_store_release() synchronizes with (Alan Stern,
>> Randy Dunlap and Paul McKenney).
>> ---
>> kernel/smp.c | 18 ++++++++++++------
>> 1 file changed, 12 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/smp.c b/kernel/smp.c
>> index a0bb56bd8dda..685829875a3e 100644
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -182,16 +182,22 @@ static atomic_t csd_bug_count = ATOMIC_INIT(0);
>> static void __csd_lock_record(call_single_data_t *csd)
>> {
>> if (!csd) {
>> - smp_mb(); /* NULL cur_csd after unlock. */
>> - __this_cpu_write(cur_csd, NULL);
>> + /*
>> + * Pairs with smp_load_acquire() of cur_csd in
>> + * csd_lock_wait_toolong(): orders any preceding CSD
>> + * callback/unlock before a remote reader observes NULL.
>> + */
>> + smp_store_release(this_cpu_ptr(&cur_csd), NULL);
>> return;
>> }
>> __this_cpu_write(cur_csd_func, csd->func);
>> __this_cpu_write(cur_csd_info, csd->info);
>> - smp_wmb(); /* func and info before csd. */
>> - __this_cpu_write(cur_csd, csd);
>> - smp_mb(); /* Update cur_csd before function call. */
>> - /* Or before unlock, as the case may be. */
>> + /*
>> + * Pairs with smp_load_acquire() of cur_csd in
>> + * csd_lock_wait_toolong(): publishes cur_csd_func and
>> + * cur_csd_info before the non-NULL pointer becomes visible.
>> + */
>> + smp_store_release(this_cpu_ptr(&cur_csd), csd);
>> }
>>
>> static __always_inline void csd_lock_record(call_single_data_t *csd)
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] smp: Use release stores for csd_lock_record() state
2026-06-22 16:38 [PATCH v2] smp: Use release stores for csd_lock_record() state Usama Arif
2026-06-24 14:15 ` Kunwu Chan
@ 2026-06-25 14:42 ` Dmitry Ilvokhin
2026-06-26 16:30 ` Usama Arif
2026-06-26 19:25 ` Thomas Gleixner
2 siblings, 1 reply; 8+ messages in thread
From: Dmitry Ilvokhin @ 2026-06-25 14:42 UTC (permalink / raw)
To: Usama Arif
Cc: lkmm, joelagnelf, linux-kernel, marco.crivellari, paulmck,
rafael.j.wysocki, rdunlap, riel, sshegde, tglx, ulfh, yury.norov,
rcu, shakeel.butt, hannes, kernel-team
On Mon, Jun 22, 2026 at 09:38:07AM -0700, Usama Arif wrote:
> __csd_lock_record() publishes per-CPU CSD debug state that is read by
> csd_lock_wait_toolong() on another CPU. The remote side first reads
> cur_csd with smp_load_acquire() and, when non-NULL, may then read the
> matching cur_csd_func and cur_csd_info fields.
>
> Use smp_store_release() when publishing cur_csd so that the preceding
> cur_csd_func and cur_csd_info stores are ordered before the pointer
> that csd_lock_wait_toolong() acquires. This replaces the open-coded
> smp_wmb() plus plain cur_csd store with the release operation that
> matches the smp_load_acquire() in csd_lock_wait_toolong().
>
> For the clear path, use smp_store_release(&cur_csd, NULL) so that
> clearing the diagnostic state remains ordered after the preceding
> callback/unlock work, without requiring a full barrier before the
> store. On x86 this removes the locked full barrier from the clear
> path; on weaker memory models it uses the release operation needed by
> the smp_load_acquire() in csd_lock_wait_toolong().
The changelog only calls out the clear path here, but the publish path
also drops its trailing smp_mb() (plus the smp_wmb()), so on x86 both
paths lose a locked full barrier. Worth describing symmetrically.
>
> The old code also had smp_mb() calls around cur_csd updates. Those would
> only be needed if cur_csd were treated as an exact live-state marker whose
> publication had to be observed before callback execution or CSD unlock.
> CSD stall warnings do not currently have RCU-style stall-ended checks, so
> they already allow the stall to end while diagnostics are being assembled.
> The cur_csd record is therefore best-effort diagnostic context, not a
> precise completion/stall boundary.
>
> Signed-off-by: Usama Arif <usama.arif@linux.dev>
> ---
> v1 -> v2: https://lore.kernel.org/all/01437928-ff79-4d8e-823b-7f20146946f6@linux.dev/
> - Document where the smp_store_release() synchronizes with (Alan Stern,
> Randy Dunlap and Paul McKenney).
> ---
> kernel/smp.c | 18 ++++++++++++------
> 1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index a0bb56bd8dda..685829875a3e 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -182,16 +182,22 @@ static atomic_t csd_bug_count = ATOMIC_INIT(0);
> static void __csd_lock_record(call_single_data_t *csd)
> {
> if (!csd) {
> - smp_mb(); /* NULL cur_csd after unlock. */
> - __this_cpu_write(cur_csd, NULL);
> + /*
> + * Pairs with smp_load_acquire() of cur_csd in
> + * csd_lock_wait_toolong(): orders any preceding CSD
> + * callback/unlock before a remote reader observes NULL.
> + */
> + smp_store_release(this_cpu_ptr(&cur_csd), NULL);
> return;
> }
> __this_cpu_write(cur_csd_func, csd->func);
> __this_cpu_write(cur_csd_info, csd->info);
> - smp_wmb(); /* func and info before csd. */
> - __this_cpu_write(cur_csd, csd);
> - smp_mb(); /* Update cur_csd before function call. */
> - /* Or before unlock, as the case may be. */
> + /*
> + * Pairs with smp_load_acquire() of cur_csd in
> + * csd_lock_wait_toolong(): publishes cur_csd_func and
> + * cur_csd_info before the non-NULL pointer becomes visible.
> + */
> + smp_store_release(this_cpu_ptr(&cur_csd), csd);
> }
Since v2 is specifically about documenting the pairing, it would be good
to make it symmetric and add the comment on the acquire side in
csd_lock_wait_toolong().
>
> static __always_inline void csd_lock_record(call_single_data_t *csd)
> --
> 2.53.0-Meta
>
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH v2] smp: Use release stores for csd_lock_record() state
2026-06-25 14:42 ` Dmitry Ilvokhin
@ 2026-06-26 16:30 ` Usama Arif
2026-06-26 16:45 ` Dmitry Ilvokhin
0 siblings, 1 reply; 8+ messages in thread
From: Usama Arif @ 2026-06-26 16:30 UTC (permalink / raw)
To: Dmitry Ilvokhin
Cc: lkmm, joelagnelf, linux-kernel, marco.crivellari, paulmck,
rafael.j.wysocki, rdunlap, riel, sshegde, tglx, ulfh, yury.norov,
rcu, shakeel.butt, hannes, kernel-team
On 25/06/2026 15:42, Dmitry Ilvokhin wrote:
> On Mon, Jun 22, 2026 at 09:38:07AM -0700, Usama Arif wrote:
>> __csd_lock_record() publishes per-CPU CSD debug state that is read by
>> csd_lock_wait_toolong() on another CPU. The remote side first reads
>> cur_csd with smp_load_acquire() and, when non-NULL, may then read the
>> matching cur_csd_func and cur_csd_info fields.
>>
>> Use smp_store_release() when publishing cur_csd so that the preceding
>> cur_csd_func and cur_csd_info stores are ordered before the pointer
>> that csd_lock_wait_toolong() acquires. This replaces the open-coded
>> smp_wmb() plus plain cur_csd store with the release operation that
>> matches the smp_load_acquire() in csd_lock_wait_toolong().
>>
>> For the clear path, use smp_store_release(&cur_csd, NULL) so that
>> clearing the diagnostic state remains ordered after the preceding
>> callback/unlock work, without requiring a full barrier before the
>> store. On x86 this removes the locked full barrier from the clear
>> path; on weaker memory models it uses the release operation needed by
>> the smp_load_acquire() in csd_lock_wait_toolong().
>
> The changelog only calls out the clear path here, but the publish path
> also drops its trailing smp_mb() (plus the smp_wmb()), so on x86 both
> paths lose a locked full barrier. Worth describing symmetrically.
Do you mean whats written in the next paragraph? The paragraph above
is for clear, the paragraph just below is for publish. The reason for removing
smp_wmb() is in 2nd paragraph. I think all the information regarding the changes
is in the commit message.>
>>
>> The old code also had smp_mb() calls around cur_csd updates. Those would
>> only be needed if cur_csd were treated as an exact live-state marker whose
>> publication had to be observed before callback execution or CSD unlock.
>> CSD stall warnings do not currently have RCU-style stall-ended checks, so
>> they already allow the stall to end while diagnostics are being assembled.
>> The cur_csd record is therefore best-effort diagnostic context, not a
>> precise completion/stall boundary.
>>
>> Signed-off-by: Usama Arif <usama.arif@linux.dev>
>> ---
>> v1 -> v2: https://lore.kernel.org/all/01437928-ff79-4d8e-823b-7f20146946f6@linux.dev/
>> - Document where the smp_store_release() synchronizes with (Alan Stern,
>> Randy Dunlap and Paul McKenney).
>> ---
>> kernel/smp.c | 18 ++++++++++++------
>> 1 file changed, 12 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/smp.c b/kernel/smp.c
>> index a0bb56bd8dda..685829875a3e 100644
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -182,16 +182,22 @@ static atomic_t csd_bug_count = ATOMIC_INIT(0);
>> static void __csd_lock_record(call_single_data_t *csd)
>> {
>> if (!csd) {
>> - smp_mb(); /* NULL cur_csd after unlock. */
>> - __this_cpu_write(cur_csd, NULL);
>> + /*
>> + * Pairs with smp_load_acquire() of cur_csd in
>> + * csd_lock_wait_toolong(): orders any preceding CSD
>> + * callback/unlock before a remote reader observes NULL.
>> + */
>> + smp_store_release(this_cpu_ptr(&cur_csd), NULL);
>> return;
>> }
>> __this_cpu_write(cur_csd_func, csd->func);
>> __this_cpu_write(cur_csd_info, csd->info);
>> - smp_wmb(); /* func and info before csd. */
>> - __this_cpu_write(cur_csd, csd);
>> - smp_mb(); /* Update cur_csd before function call. */
>> - /* Or before unlock, as the case may be. */
>> + /*
>> + * Pairs with smp_load_acquire() of cur_csd in
>> + * csd_lock_wait_toolong(): publishes cur_csd_func and
>> + * cur_csd_info before the non-NULL pointer becomes visible.
>> + */
>> + smp_store_release(this_cpu_ptr(&cur_csd), csd);
>> }
>
> Since v2 is specifically about documenting the pairing, it would be good
> to make it symmetric and add the comment on the acquire side in
> csd_lock_wait_toolong().
>
Its already documented [1]
[1] https://elixir.bootlin.com/linux/v7.1.1/source/kernel/smp.c#L275
>>
>> static __always_inline void csd_lock_record(call_single_data_t *csd)
>> --
>> 2.53.0-Meta
>>
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH v2] smp: Use release stores for csd_lock_record() state
2026-06-26 16:30 ` Usama Arif
@ 2026-06-26 16:45 ` Dmitry Ilvokhin
2026-06-26 19:20 ` Thomas Gleixner
0 siblings, 1 reply; 8+ messages in thread
From: Dmitry Ilvokhin @ 2026-06-26 16:45 UTC (permalink / raw)
To: Usama Arif
Cc: lkmm, joelagnelf, linux-kernel, marco.crivellari, paulmck,
rafael.j.wysocki, rdunlap, riel, sshegde, tglx, ulfh, yury.norov,
rcu, shakeel.butt, hannes, kernel-team
On Fri, Jun 26, 2026 at 05:30:21PM +0100, Usama Arif wrote:
>
>
> On 25/06/2026 15:42, Dmitry Ilvokhin wrote:
> > On Mon, Jun 22, 2026 at 09:38:07AM -0700, Usama Arif wrote:
> >> __csd_lock_record() publishes per-CPU CSD debug state that is read by
> >> csd_lock_wait_toolong() on another CPU. The remote side first reads
> >> cur_csd with smp_load_acquire() and, when non-NULL, may then read the
> >> matching cur_csd_func and cur_csd_info fields.
> >>
> >> Use smp_store_release() when publishing cur_csd so that the preceding
> >> cur_csd_func and cur_csd_info stores are ordered before the pointer
> >> that csd_lock_wait_toolong() acquires. This replaces the open-coded
> >> smp_wmb() plus plain cur_csd store with the release operation that
> >> matches the smp_load_acquire() in csd_lock_wait_toolong().
> >>
> >> For the clear path, use smp_store_release(&cur_csd, NULL) so that
> >> clearing the diagnostic state remains ordered after the preceding
> >> callback/unlock work, without requiring a full barrier before the
> >> store. On x86 this removes the locked full barrier from the clear
> >> path; on weaker memory models it uses the release operation needed by
> >> the smp_load_acquire() in csd_lock_wait_toolong().
> >
> > The changelog only calls out the clear path here, but the publish path
> > also drops its trailing smp_mb() (plus the smp_wmb()), so on x86 both
> > paths lose a locked full barrier. Worth describing symmetrically.
>
> Do you mean whats written in the next paragraph? The paragraph above
> is for clear, the paragraph just below is for publish. The reason for removing
> smp_wmb() is in 2nd paragraph. I think all the information regarding the changes
> is in the commit message.>
Indeed. I clearly can not read properly.
> >>
> >> The old code also had smp_mb() calls around cur_csd updates. Those would
> >> only be needed if cur_csd were treated as an exact live-state marker whose
> >> publication had to be observed before callback execution or CSD unlock.
> >> CSD stall warnings do not currently have RCU-style stall-ended checks, so
> >> they already allow the stall to end while diagnostics are being assembled.
> >> The cur_csd record is therefore best-effort diagnostic context, not a
> >> precise completion/stall boundary.
> >>
> >> Signed-off-by: Usama Arif <usama.arif@linux.dev>
> >> ---
> >> v1 -> v2: https://lore.kernel.org/all/01437928-ff79-4d8e-823b-7f20146946f6@linux.dev/
> >> - Document where the smp_store_release() synchronizes with (Alan Stern,
> >> Randy Dunlap and Paul McKenney).
> >> ---
> >> kernel/smp.c | 18 ++++++++++++------
> >> 1 file changed, 12 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/kernel/smp.c b/kernel/smp.c
> >> index a0bb56bd8dda..685829875a3e 100644
> >> --- a/kernel/smp.c
> >> +++ b/kernel/smp.c
> >> @@ -182,16 +182,22 @@ static atomic_t csd_bug_count = ATOMIC_INIT(0);
> >> static void __csd_lock_record(call_single_data_t *csd)
> >> {
> >> if (!csd) {
> >> - smp_mb(); /* NULL cur_csd after unlock. */
> >> - __this_cpu_write(cur_csd, NULL);
> >> + /*
> >> + * Pairs with smp_load_acquire() of cur_csd in
> >> + * csd_lock_wait_toolong(): orders any preceding CSD
> >> + * callback/unlock before a remote reader observes NULL.
> >> + */
> >> + smp_store_release(this_cpu_ptr(&cur_csd), NULL);
> >> return;
> >> }
> >> __this_cpu_write(cur_csd_func, csd->func);
> >> __this_cpu_write(cur_csd_info, csd->info);
> >> - smp_wmb(); /* func and info before csd. */
> >> - __this_cpu_write(cur_csd, csd);
> >> - smp_mb(); /* Update cur_csd before function call. */
> >> - /* Or before unlock, as the case may be. */
> >> + /*
> >> + * Pairs with smp_load_acquire() of cur_csd in
> >> + * csd_lock_wait_toolong(): publishes cur_csd_func and
> >> + * cur_csd_info before the non-NULL pointer becomes visible.
> >> + */
> >> + smp_store_release(this_cpu_ptr(&cur_csd), csd);
> >> }
> >
> > Since v2 is specifically about documenting the pairing, it would be good
> > to make it symmetric and add the comment on the acquire side in
> > csd_lock_wait_toolong().
> >
>
> Its already documented [1]
>
> [1] https://elixir.bootlin.com/linux/v7.1.1/source/kernel/smp.c#L275
It is documented to some extent, but it doesn't explicitly state which
smp_store_release() the smp_load_acquire() pairs with. I think that's
the main benefit of these comments: making the synchronization pair
explicit so readers don't have to infer it.
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH v2] smp: Use release stores for csd_lock_record() state
2026-06-26 16:45 ` Dmitry Ilvokhin
@ 2026-06-26 19:20 ` Thomas Gleixner
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2026-06-26 19:20 UTC (permalink / raw)
To: Dmitry Ilvokhin, Usama Arif
Cc: lkmm, joelagnelf, linux-kernel, marco.crivellari, paulmck,
rafael.j.wysocki, rdunlap, riel, sshegde, ulfh, yury.norov, rcu,
shakeel.butt, hannes, kernel-team
On Fri, Jun 26 2026 at 16:45, Dmitry Ilvokhin wrote:
> On Fri, Jun 26, 2026 at 05:30:21PM +0100, Usama Arif wrote:
>> > Since v2 is specifically about documenting the pairing, it would be good
>> > to make it symmetric and add the comment on the acquire side in
>> > csd_lock_wait_toolong().
>> >
>>
>> Its already documented [1]
>>
>> [1] https://elixir.bootlin.com/linux/v7.1.1/source/kernel/smp.c#L275
Can you please avoid these silly links to a random source tree and just
tell people kernel/smp.c line 275?
> It is documented to some extent, but it doesn't explicitly state which
> smp_store_release() the smp_load_acquire() pairs with. I think that's
> the main benefit of these comments: making the synchronization pair
> explicit so readers don't have to infer it.
It's mandatory according to Documentation:
"Certain things should always be commented. Uses of memory barriers should
be accompanied by a line explaining why the barrier is necessary"
which means it needs to be at the place where the barrier is used and
that implies an explanation for the pairing. It's a pain having to do
detective work to figure it out.
Thanks,
tglx
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] smp: Use release stores for csd_lock_record() state
2026-06-22 16:38 [PATCH v2] smp: Use release stores for csd_lock_record() state Usama Arif
2026-06-24 14:15 ` Kunwu Chan
2026-06-25 14:42 ` Dmitry Ilvokhin
@ 2026-06-26 19:25 ` Thomas Gleixner
2 siblings, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2026-06-26 19:25 UTC (permalink / raw)
To: Usama Arif, lkmm, joelagnelf, linux-kernel, marco.crivellari,
paulmck, rafael.j.wysocki, rdunlap, riel, sshegde, ulfh,
usama.arif, yury.norov, rcu
Cc: shakeel.butt, hannes, kernel-team
On Mon, Jun 22 2026 at 09:38, Usama Arif wrote:
> __csd_lock_record() publishes per-CPU CSD debug state that is read by
> csd_lock_wait_toolong() on another CPU. The remote side first reads
> cur_csd with smp_load_acquire() and, when non-NULL, may then read the
> matching cur_csd_func and cur_csd_info fields.
That's not how a change log should look like. See
https://docs.kernel.org/process/maintainer-tip.html#changelog
When I use that template of context, problem, solution, then I clearly
have to assume that the above paragraph describes the current state,
which is not the case.
So in the first paragraph you want to explain that the code uses
smp_mb() to protect against foo.
The second paragraph explains the problem, i.e. that smp_mb() is a big
hammer.
The third paragraph explains how this can be replaced and why the
replacement is correct.
^ permalink raw reply [flat|nested] 8+ messages in thread