* [PATCH] staging: most: dim2: fix a race condition in complete_all_mbos()
@ 2026-02-05 16:02 Minu Jin
2026-02-06 7:20 ` Dan Carpenter
2026-02-07 11:55 ` Greg KH
0 siblings, 2 replies; 4+ messages in thread
From: Minu Jin @ 2026-02-05 16:02 UTC (permalink / raw)
To: parthiban.veerasooran, christian.gromm, gregkh, dan.carpenter
Cc: linux-staging, linux-kernel, Minu Jin
The current implementation of complete_all_mbos() repeatedly acquires
and releases the spinlock in loop. This causes lock contention.
This patch refactors the function to use list_replace_init(), moving all
entries to a local list. This removes the loop-based locking approach
and significantly reduces lock contention.
Signed-off-by: Minu Jin <s9430939@naver.com>
---
drivers/staging/most/dim2/dim2.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/drivers/staging/most/dim2/dim2.c b/drivers/staging/most/dim2/dim2.c
index 80af965356d0..fb54bb1a803f 100644
--- a/drivers/staging/most/dim2/dim2.c
+++ b/drivers/staging/most/dim2/dim2.c
@@ -415,20 +415,16 @@ static irqreturn_t dim2_ahb_isr(int irq, void *_dev)
*/
static void complete_all_mbos(struct list_head *head)
{
+ struct mbo *mbo, *mbo_tmp;
unsigned long flags;
- struct mbo *mbo;
-
- for (;;) {
- spin_lock_irqsave(&dim_lock, flags);
- if (list_empty(head)) {
- spin_unlock_irqrestore(&dim_lock, flags);
- break;
- }
+ LIST_HEAD(del_list);
- mbo = list_first_entry(head, struct mbo, list);
- list_del(head->next);
- spin_unlock_irqrestore(&dim_lock, flags);
+ spin_lock_irqsave(&dim_lock, flags);
+ list_replace_init(head, &del_list);
+ spin_unlock_irqrestore(&dim_lock, flags);
+ list_for_each_entry_safe(mbo, mbo_tmp, &del_list, list) {
+ list_del(&mbo->list);
mbo->processed_length = 0;
mbo->status = MBO_E_CLOSE;
mbo->complete(mbo);
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] staging: most: dim2: fix a race condition in complete_all_mbos()
2026-02-05 16:02 [PATCH] staging: most: dim2: fix a race condition in complete_all_mbos() Minu Jin
@ 2026-02-06 7:20 ` Dan Carpenter
2026-02-06 18:04 ` Minu Jin
2026-02-07 11:55 ` Greg KH
1 sibling, 1 reply; 4+ messages in thread
From: Dan Carpenter @ 2026-02-06 7:20 UTC (permalink / raw)
To: Minu Jin
Cc: parthiban.veerasooran, christian.gromm, gregkh, linux-staging,
linux-kernel
On Fri, Feb 06, 2026 at 01:02:31AM +0900, Minu Jin wrote:
> The current implementation of complete_all_mbos() repeatedly acquires
> and releases the spinlock in loop. This causes lock contention.
>
> This patch refactors the function to use list_replace_init(), moving all
> entries to a local list. This removes the loop-based locking approach
> and significantly reduces lock contention.
>
> Signed-off-by: Minu Jin <s9430939@naver.com>
The subject talks about race conditions but the commit message talks
about reducing lock contention. It does obviously reduce lock
contention (althought I don't think anyone has benchmarked it to see
if it matters) but does it prevent a race condition? Let's review:
This complete_all_mbos() function is called when we do a
most_stop_channel() and we ->poison_channel().
The list heads are &hdm_ch->started_list and &hdm_ch->pending_list. I
feel like if we add something to the list while we are also freeing
items from the list then we are toasted. In service_done_flag(), we
delete items from the list but deleting items is fine in this context.
We add things to the ->pending_list in enqueue() and
service_done_flag(). We move things from the ->pending_list to the
->started_list in try_start_dim_transfer(). So if any of those three
functions can be run at the same time as complete_all_mbos() we are in
trouble.
The hdm_enqueue_thread() function calls enqueue() until
kthread_should_stop(). The most_stop_channel() function calls
kthread_stop(c->hdm_enqueue_task) before doing the ->poison_channel()
so that's fine.
The service_done_flag() and try_start_dim_transfer() functions are
called from dim2_task_irq(). When do we stop taking interrupts? To be
honest, I don't know. I thought we had to call disable_irq()?
So that's the question, when do we disable IRQs in this driver? I
would have assumed it was in most_stop_channel() but I can't see it,
but I'm also not very familiar with this code.
Let's answer this question and then either add a Fixes tag or say that
there doesn't appear to be a race condition.
regards,
dan carpenter
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] staging: most: dim2: fix a race condition in complete_all_mbos()
2026-02-06 7:20 ` Dan Carpenter
@ 2026-02-06 18:04 ` Minu Jin
0 siblings, 0 replies; 4+ messages in thread
From: Minu Jin @ 2026-02-06 18:04 UTC (permalink / raw)
To: Dan Carpenter
Cc: parthiban.veerasooran, christian.gromm, gregkh, linux-staging,
linux-kernel
On Fri, Feb 06, 2026 at 10:20:51AM +0300, Dan Carpenter wrote:
> On Fri, Feb 06, 2026 at 01:02:31AM +0900, Minu Jin wrote:
> > The current implementation of complete_all_mbos() repeatedly acquires
> > and releases the spinlock in loop. This causes lock contention.
> >
> > This patch refactors the function to use list_replace_init(), moving all
> > entries to a local list. This removes the loop-based locking approach
> > and significantly reduces lock contention.
> >
> > Signed-off-by: Minu Jin <s9430939@naver.com>
>
> This complete_all_mbos() function is called when we do a
> most_stop_channel() and we ->poison_channel().
>
> The list heads are &hdm_ch->started_list and &hdm_ch->pending_list. I
> feel like if we add something to the list while we are also freeing
> items from the list then we are toasted. In service_done_flag(), we
> delete items from the list but deleting items is fine in this context.
>
> We add things to the ->pending_list in enqueue() and
> service_done_flag(). We move things from the ->pending_list to the
> ->started_list in try_start_dim_transfer(). So if any of those three
> functions can be run at the same time as complete_all_mbos() we are in
> trouble.
>
> The hdm_enqueue_thread() function calls enqueue() until
> kthread_should_stop(). The most_stop_channel() function calls
> kthread_stop(c->hdm_enqueue_task) before doing the ->poison_channel()
> so that's fine.
>
> The service_done_flag() and try_start_dim_transfer() functions are
> called from dim2_task_irq(). When do we stop taking interrupts? To be
> honest, I don't know. I thought we had to call disable_irq()?
>
> So that's the question, when do we disable IRQs in this driver? I
> would have assumed it was in most_stop_channel() but I can't see it,
> but I'm also not very familiar with this code.
>
> Let's answer this question and then either add a Fixes tag or say that
> there doesn't appear to be a race condition.
>
> regards,
> dan carpenter
>
>
Hi Dan,
Thank you for spending your time for detailed review and analysis.
To be honest,
my original intention was to reduce lock contention by optimizing
the repeated lock/unlock in the loop from O(n) to O(1).
I wanted to minimize the overhead of acquiring the spinlock multiple times.
However, after reviewing your feedback, I traced the code again
that you pointed out. I confirmed that IRQs are not disabled during
the call path. `most_stop_channel() -> poison_channel() -> complete_all_mbos()`
In the original code,
the brief time where the lock is released inside the loop create a time
where an interrupt (eg, dim2_task_irq()) could trigger and modify the list,
leading to a race condition.
Although it wasn't my original intent,
I think this patch could also solve this race condition.
By moving the list items to a local list under a single lock,
it provides the necessary isolation from interrupts.
Does this reasoning make sense to you, or is there something I am missing?
I would appreciate your opinion before
I update the commit message and send a v2.
Minu Jin
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] staging: most: dim2: fix a race condition in complete_all_mbos()
2026-02-05 16:02 [PATCH] staging: most: dim2: fix a race condition in complete_all_mbos() Minu Jin
2026-02-06 7:20 ` Dan Carpenter
@ 2026-02-07 11:55 ` Greg KH
1 sibling, 0 replies; 4+ messages in thread
From: Greg KH @ 2026-02-07 11:55 UTC (permalink / raw)
To: Minu Jin
Cc: parthiban.veerasooran, christian.gromm, dan.carpenter,
linux-staging, linux-kernel
On Fri, Feb 06, 2026 at 01:02:31AM +0900, Minu Jin wrote:
> The current implementation of complete_all_mbos() repeatedly acquires
> and releases the spinlock in loop. This causes lock contention.
Is that actually measurable? If not, don't worry about it. Only if it
is should you worry about it :)
So please, benchmark the code and see if this is the case or not.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-02-07 11:55 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-05 16:02 [PATCH] staging: most: dim2: fix a race condition in complete_all_mbos() Minu Jin
2026-02-06 7:20 ` Dan Carpenter
2026-02-06 18:04 ` Minu Jin
2026-02-07 11:55 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox