linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Add lock to avoid race when ringing channel DB
@ 2023-09-13  8:47 Qiang Yu
  2023-09-13  8:47 ` [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs Qiang Yu
  2023-09-13  8:47 ` [PATCH v2 2/2] bus: mhi: host: Take irqsave lock after TRE is generated Qiang Yu
  0 siblings, 2 replies; 16+ messages in thread
From: Qiang Yu @ 2023-09-13  8:47 UTC (permalink / raw)
  To: mani, quic_jhugo
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana, Qiang Yu

1. We need a write lock in mhi_gen_tre otherwise there is race of the WP
used for ringing channel DB between mhi_queue and M0 transition.
2. We can not invoke local_bh_enable() when irqs are disabled, so move
read_lock_irqsave() under the mhi_gen_tre() since we add write_lock_bh() in
mhi_gen_tre().

v1 -> v2:
Added write_unlock_bh(&mhi_chan->lock) in mhi_gen_tre() before return
because of error process.

Bhaumik Bhatt (1):
  bus: mhi: host: Add spinlock to protect WP access when queueing TREs

Hemant Kumar (1):
  bus: mhi: host: Take irqsave lock after TRE is generated

 drivers/bus/mhi/host/main.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-09-13  8:47 [PATCH v2 0/2] Add lock to avoid race when ringing channel DB Qiang Yu
@ 2023-09-13  8:47 ` Qiang Yu
  2023-09-22 14:44   ` Jeffrey Hugo
  2023-11-06  4:41   ` Manivannan Sadhasivam
  2023-09-13  8:47 ` [PATCH v2 2/2] bus: mhi: host: Take irqsave lock after TRE is generated Qiang Yu
  1 sibling, 2 replies; 16+ messages in thread
From: Qiang Yu @ 2023-09-13  8:47 UTC (permalink / raw)
  To: mani, quic_jhugo
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana,
	Bhaumik Bhatt, Qiang Yu

From: Bhaumik Bhatt <bbhatt@codeaurora.org>

Protect WP accesses such that multiple threads queueing buffers for
incoming data do not race and access the same WP twice. Ensure read and
write locks for the channel are not taken in succession by dropping the
read lock from parse_xfer_event() such that a callback given to client
can potentially queue buffers and acquire the write lock in that process.
Any queueing of buffers should be done without channel read lock acquired
as it can result in multiple locks and a soft lockup.

Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
---
 drivers/bus/mhi/host/main.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
index dcf627b..13c4b89 100644
--- a/drivers/bus/mhi/host/main.c
+++ b/drivers/bus/mhi/host/main.c
@@ -642,6 +642,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
 			mhi_del_ring_element(mhi_cntrl, tre_ring);
 			local_rp = tre_ring->rp;
 
+			read_unlock_bh(&mhi_chan->lock);
 			/* notify client */
 			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
 
@@ -667,6 +668,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
 					kfree(buf_info->cb_buf);
 				}
 			}
+			read_lock_bh(&mhi_chan->lock);
 		}
 		break;
 	} /* CC_EOT */
@@ -1204,6 +1206,9 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
 	int eot, eob, chain, bei;
 	int ret;
 
+	/* Protect accesses for reading and incrementing WP */
+	write_lock_bh(&mhi_chan->lock);
+
 	buf_ring = &mhi_chan->buf_ring;
 	tre_ring = &mhi_chan->tre_ring;
 
@@ -1221,8 +1226,10 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
 
 	if (!info->pre_mapped) {
 		ret = mhi_cntrl->map_single(mhi_cntrl, buf_info);
-		if (ret)
+		if (ret) {
+			write_unlock_bh(&mhi_chan->lock);
 			return ret;
+		}
 	}
 
 	eob = !!(flags & MHI_EOB);
@@ -1239,6 +1246,8 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
 	mhi_add_ring_element(mhi_cntrl, tre_ring);
 	mhi_add_ring_element(mhi_cntrl, buf_ring);
 
+	write_unlock_bh(&mhi_chan->lock);
+
 	return 0;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/2] bus: mhi: host: Take irqsave lock after TRE is generated
  2023-09-13  8:47 [PATCH v2 0/2] Add lock to avoid race when ringing channel DB Qiang Yu
  2023-09-13  8:47 ` [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs Qiang Yu
@ 2023-09-13  8:47 ` Qiang Yu
  2023-09-22 14:50   ` Jeffrey Hugo
  1 sibling, 1 reply; 16+ messages in thread
From: Qiang Yu @ 2023-09-13  8:47 UTC (permalink / raw)
  To: mani, quic_jhugo
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana,
	Hemant Kumar, Lazarus Motha, Qiang Yu

From: Hemant Kumar <quic_hemantk@quicinc.com>

Take irqsave lock after TRE is generated to avoid deadlock due to core
getting interrupts enabled as local_bh_enable must not be called with
irqs disabled based on upstream patch.

Signed-off-by: Hemant Kumar <quic_hemantk@quicinc.com>
Signed-off-by: Lazarus Motha <quic_lmotha@quicinc.com>
Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
---
 drivers/bus/mhi/host/main.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
index 13c4b89..8accdfd 100644
--- a/drivers/bus/mhi/host/main.c
+++ b/drivers/bus/mhi/host/main.c
@@ -1124,17 +1124,15 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
 	if (unlikely(MHI_PM_IN_ERROR_STATE(mhi_cntrl->pm_state)))
 		return -EIO;
 
-	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
-
 	ret = mhi_is_ring_full(mhi_cntrl, tre_ring);
-	if (unlikely(ret)) {
-		ret = -EAGAIN;
-		goto exit_unlock;
-	}
+	if (unlikely(ret))
+		return -EAGAIN;
 
 	ret = mhi_gen_tre(mhi_cntrl, mhi_chan, buf_info, mflags);
 	if (unlikely(ret))
-		goto exit_unlock;
+		return ret;
+
+	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
 
 	/* Packet is queued, take a usage ref to exit M3 if necessary
 	 * for host->device buffer, balanced put is done on buffer completion
@@ -1154,7 +1152,6 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
 	if (dir == DMA_FROM_DEVICE)
 		mhi_cntrl->runtime_put(mhi_cntrl);
 
-exit_unlock:
 	read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags);
 
 	return ret;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-09-13  8:47 ` [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs Qiang Yu
@ 2023-09-22 14:44   ` Jeffrey Hugo
  2023-09-25  3:10     ` Qiang Yu
  2023-11-06  4:41   ` Manivannan Sadhasivam
  1 sibling, 1 reply; 16+ messages in thread
From: Jeffrey Hugo @ 2023-09-22 14:44 UTC (permalink / raw)
  To: Qiang Yu, mani; +Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana

On 9/13/2023 2:47 AM, Qiang Yu wrote:
> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
> 
> Protect WP accesses such that multiple threads queueing buffers for
> incoming data do not race and access the same WP twice. Ensure read and
> write locks for the channel are not taken in succession by dropping the
> read lock from parse_xfer_event() such that a callback given to client
> can potentially queue buffers and acquire the write lock in that process.
> Any queueing of buffers should be done without channel read lock acquired
> as it can result in multiple locks and a soft lockup.
> 
> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
> ---
>   drivers/bus/mhi/host/main.c | 11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> index dcf627b..13c4b89 100644
> --- a/drivers/bus/mhi/host/main.c
> +++ b/drivers/bus/mhi/host/main.c
> @@ -642,6 +642,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>   			mhi_del_ring_element(mhi_cntrl, tre_ring);
>   			local_rp = tre_ring->rp;
>   
> +			read_unlock_bh(&mhi_chan->lock);

This doesn't work due to the write_lock_irqsave(&mhi_chan->lock, flags); 
on line 591.

I really don't like that we are unlocking the mhi_chan while still using 
it.  It opens up a window where the mhi_chan state can be updated 
between here and the client using the callback to queue a buf.

Perhaps we need a new lock that just protects the wp, and needs to be 
only grabbed while mhi_chan->lock is held?

>   			/* notify client */
>   			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>   
> @@ -667,6 +668,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>   					kfree(buf_info->cb_buf);
>   				}
>   			}
> +			read_lock_bh(&mhi_chan->lock);
>   		}
>   		break;
>   	} /* CC_EOT */
> @@ -1204,6 +1206,9 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
>   	int eot, eob, chain, bei;
>   	int ret;
>   
> +	/* Protect accesses for reading and incrementing WP */
> +	write_lock_bh(&mhi_chan->lock);
> +
>   	buf_ring = &mhi_chan->buf_ring;
>   	tre_ring = &mhi_chan->tre_ring;
>   
> @@ -1221,8 +1226,10 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
>   
>   	if (!info->pre_mapped) {
>   		ret = mhi_cntrl->map_single(mhi_cntrl, buf_info);
> -		if (ret)
> +		if (ret) {
> +			write_unlock_bh(&mhi_chan->lock);
>   			return ret;
> +		}
>   	}
>   
>   	eob = !!(flags & MHI_EOB);
> @@ -1239,6 +1246,8 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
>   	mhi_add_ring_element(mhi_cntrl, tre_ring);
>   	mhi_add_ring_element(mhi_cntrl, buf_ring);
>   
> +	write_unlock_bh(&mhi_chan->lock);
> +
>   	return 0;
>   }
>   


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/2] bus: mhi: host: Take irqsave lock after TRE is generated
  2023-09-13  8:47 ` [PATCH v2 2/2] bus: mhi: host: Take irqsave lock after TRE is generated Qiang Yu
@ 2023-09-22 14:50   ` Jeffrey Hugo
  2023-09-25  4:08     ` Qiang Yu
  0 siblings, 1 reply; 16+ messages in thread
From: Jeffrey Hugo @ 2023-09-22 14:50 UTC (permalink / raw)
  To: Qiang Yu, mani
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana,
	Hemant Kumar, Lazarus Motha

On 9/13/2023 2:47 AM, Qiang Yu wrote:
> From: Hemant Kumar <quic_hemantk@quicinc.com>
> 
> Take irqsave lock after TRE is generated to avoid deadlock due to core
> getting interrupts enabled as local_bh_enable must not be called with
> irqs disabled based on upstream patch.

Where is local_bh_enable() being called?  What patch?  What is upstream 
of the codebase you submitted this to?  Why is it safe to call 
mhi_gen_tre() without the lock?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-09-22 14:44   ` Jeffrey Hugo
@ 2023-09-25  3:10     ` Qiang Yu
  2023-09-29 15:22       ` Jeffrey Hugo
  0 siblings, 1 reply; 16+ messages in thread
From: Qiang Yu @ 2023-09-25  3:10 UTC (permalink / raw)
  To: Jeffrey Hugo, mani
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana


On 9/22/2023 10:44 PM, Jeffrey Hugo wrote:
> On 9/13/2023 2:47 AM, Qiang Yu wrote:
>> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>
>> Protect WP accesses such that multiple threads queueing buffers for
>> incoming data do not race and access the same WP twice. Ensure read and
>> write locks for the channel are not taken in succession by dropping the
>> read lock from parse_xfer_event() such that a callback given to client
>> can potentially queue buffers and acquire the write lock in that 
>> process.
>> Any queueing of buffers should be done without channel read lock 
>> acquired
>> as it can result in multiple locks and a soft lockup.
>>
>> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>> ---
>>   drivers/bus/mhi/host/main.c | 11 ++++++++++-
>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>> index dcf627b..13c4b89 100644
>> --- a/drivers/bus/mhi/host/main.c
>> +++ b/drivers/bus/mhi/host/main.c
>> @@ -642,6 +642,7 @@ static int parse_xfer_event(struct mhi_controller 
>> *mhi_cntrl,
>>               mhi_del_ring_element(mhi_cntrl, tre_ring);
>>               local_rp = tre_ring->rp;
>>   +            read_unlock_bh(&mhi_chan->lock);
>
> This doesn't work due to the write_lock_irqsave(&mhi_chan->lock, 
> flags); on line 591.
Write_lock_irqsave(&mhi_chan->lock, flags) is used in case of ev_code >= 
MHI_EV_CC_OOB. We only read_lock/read_unlock the mhi_chan while ev_code 
< MHI_EV_CC_OOB.
>
> I really don't like that we are unlocking the mhi_chan while still 
> using it.  It opens up a window where the mhi_chan state can be 
> updated between here and the client using the callback to queue a buf.
>
> Perhaps we need a new lock that just protects the wp, and needs to be 
> only grabbed while mhi_chan->lock is held?

Since we have employed mhi_chan lock to protect the channel and what we 
are concerned here is that client may queue buf to a disabled or stopped 
channel, can we check channel state after getting mhi_chan->lock like 
line 595.

We can add the check after getting write lock in mhi_gen_tre() and after 
getting read lock again here.

>
>>               /* notify client */
>>               mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>>   @@ -667,6 +668,7 @@ static int parse_xfer_event(struct 
>> mhi_controller *mhi_cntrl,
>>                       kfree(buf_info->cb_buf);
>>                   }
>>               }
>> +            read_lock_bh(&mhi_chan->lock);
>>           }
>>           break;
>>       } /* CC_EOT */
>> @@ -1204,6 +1206,9 @@ int mhi_gen_tre(struct mhi_controller 
>> *mhi_cntrl, struct mhi_chan *mhi_chan,
>>       int eot, eob, chain, bei;
>>       int ret;
>>   +    /* Protect accesses for reading and incrementing WP */
>> +    write_lock_bh(&mhi_chan->lock);
>> +
>>       buf_ring = &mhi_chan->buf_ring;
>>       tre_ring = &mhi_chan->tre_ring;
>>   @@ -1221,8 +1226,10 @@ int mhi_gen_tre(struct mhi_controller 
>> *mhi_cntrl, struct mhi_chan *mhi_chan,
>>         if (!info->pre_mapped) {
>>           ret = mhi_cntrl->map_single(mhi_cntrl, buf_info);
>> -        if (ret)
>> +        if (ret) {
>> +            write_unlock_bh(&mhi_chan->lock);
>>               return ret;
>> +        }
>>       }
>>         eob = !!(flags & MHI_EOB);
>> @@ -1239,6 +1246,8 @@ int mhi_gen_tre(struct mhi_controller 
>> *mhi_cntrl, struct mhi_chan *mhi_chan,
>>       mhi_add_ring_element(mhi_cntrl, tre_ring);
>>       mhi_add_ring_element(mhi_cntrl, buf_ring);
>>   +    write_unlock_bh(&mhi_chan->lock);
>> +
>>       return 0;
>>   }
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/2] bus: mhi: host: Take irqsave lock after TRE is generated
  2023-09-22 14:50   ` Jeffrey Hugo
@ 2023-09-25  4:08     ` Qiang Yu
  2023-09-29 15:25       ` Jeffrey Hugo
  0 siblings, 1 reply; 16+ messages in thread
From: Qiang Yu @ 2023-09-25  4:08 UTC (permalink / raw)
  To: Jeffrey Hugo, mani
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana,
	Hemant Kumar, Lazarus Motha


On 9/22/2023 10:50 PM, Jeffrey Hugo wrote:
> On 9/13/2023 2:47 AM, Qiang Yu wrote:
>> From: Hemant Kumar <quic_hemantk@quicinc.com>
>>
>> Take irqsave lock after TRE is generated to avoid deadlock due to core
>> getting interrupts enabled as local_bh_enable must not be called with
>> irqs disabled based on upstream patch.
>
> Where is local_bh_enable() being called?  What patch?  What is 
> upstream of the codebase you submitted this to?  Why is it safe to 
> call mhi_gen_tre() without the lock?

This patch is to fix the issue included by  "[PATCH v2 1/2] bus: mhi: 
host: Add spinlock to protect WP access when queueing TREs". In that 
patch, we add write_lock_bh/write_unlock_bh in mhi_gen_tre().

However, before mhi_gen_tre() is invoked, mhi_cntrl->pm_lock is getted, 
line 1125, and it is a spin lock. So it becomes we want to get and 
release bh lock after spin lock.  __local_bh_enable_ip is called as part 
of write_unlock_bh

and local_bh_enable. When CONFIG_TRACE_IRQFLAGS is enabled, irq will be 
enabled once __local_bh_enable_ip is called. The commit message is not 
clear and confusing, will change it in [patch v3].


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-09-25  3:10     ` Qiang Yu
@ 2023-09-29 15:22       ` Jeffrey Hugo
  2023-10-16  8:46         ` Qiang Yu
  0 siblings, 1 reply; 16+ messages in thread
From: Jeffrey Hugo @ 2023-09-29 15:22 UTC (permalink / raw)
  To: Qiang Yu, mani; +Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana

On 9/24/2023 9:10 PM, Qiang Yu wrote:
> 
> On 9/22/2023 10:44 PM, Jeffrey Hugo wrote:
>> On 9/13/2023 2:47 AM, Qiang Yu wrote:
>>> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>>
>>> Protect WP accesses such that multiple threads queueing buffers for
>>> incoming data do not race and access the same WP twice. Ensure read and
>>> write locks for the channel are not taken in succession by dropping the
>>> read lock from parse_xfer_event() such that a callback given to client
>>> can potentially queue buffers and acquire the write lock in that 
>>> process.
>>> Any queueing of buffers should be done without channel read lock 
>>> acquired
>>> as it can result in multiple locks and a soft lockup.
>>>
>>> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>>> ---
>>>   drivers/bus/mhi/host/main.c | 11 ++++++++++-
>>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>> index dcf627b..13c4b89 100644
>>> --- a/drivers/bus/mhi/host/main.c
>>> +++ b/drivers/bus/mhi/host/main.c
>>> @@ -642,6 +642,7 @@ static int parse_xfer_event(struct mhi_controller 
>>> *mhi_cntrl,
>>>               mhi_del_ring_element(mhi_cntrl, tre_ring);
>>>               local_rp = tre_ring->rp;
>>>   +            read_unlock_bh(&mhi_chan->lock);
>>
>> This doesn't work due to the write_lock_irqsave(&mhi_chan->lock, 
>> flags); on line 591.
> Write_lock_irqsave(&mhi_chan->lock, flags) is used in case of ev_code >= 
> MHI_EV_CC_OOB. We only read_lock/read_unlock the mhi_chan while ev_code 
> < MHI_EV_CC_OOB.

Sorry.  OOB != EOB

>>
>> I really don't like that we are unlocking the mhi_chan while still 
>> using it.  It opens up a window where the mhi_chan state can be 
>> updated between here and the client using the callback to queue a buf.
>>
>> Perhaps we need a new lock that just protects the wp, and needs to be 
>> only grabbed while mhi_chan->lock is held?
> 
> Since we have employed mhi_chan lock to protect the channel and what we 
> are concerned here is that client may queue buf to a disabled or stopped 
> channel, can we check channel state after getting mhi_chan->lock like 
> line 595.
> 
> We can add the check after getting write lock in mhi_gen_tre() and after 
> getting read lock again here.

I'm not sure that is sufficient.  After you unlock to notify the client, 
MHI is going to manipulate the packet count and runtime_pm without the 
lock (648-652).  It seems like that adds additional races which won't be 
covered by the additional check you propose.

> 
>>
>>>               /* notify client */
>>>               mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>>>   @@ -667,6 +668,7 @@ static int parse_xfer_event(struct 
>>> mhi_controller *mhi_cntrl,
>>>                       kfree(buf_info->cb_buf);
>>>                   }
>>>               }
>>> +            read_lock_bh(&mhi_chan->lock);
>>>           }
>>>           break;
>>>       } /* CC_EOT */
>>> @@ -1204,6 +1206,9 @@ int mhi_gen_tre(struct mhi_controller 
>>> *mhi_cntrl, struct mhi_chan *mhi_chan,
>>>       int eot, eob, chain, bei;
>>>       int ret;
>>>   +    /* Protect accesses for reading and incrementing WP */
>>> +    write_lock_bh(&mhi_chan->lock);
>>> +
>>>       buf_ring = &mhi_chan->buf_ring;
>>>       tre_ring = &mhi_chan->tre_ring;
>>>   @@ -1221,8 +1226,10 @@ int mhi_gen_tre(struct mhi_controller 
>>> *mhi_cntrl, struct mhi_chan *mhi_chan,
>>>         if (!info->pre_mapped) {
>>>           ret = mhi_cntrl->map_single(mhi_cntrl, buf_info);
>>> -        if (ret)
>>> +        if (ret) {
>>> +            write_unlock_bh(&mhi_chan->lock);
>>>               return ret;
>>> +        }
>>>       }
>>>         eob = !!(flags & MHI_EOB);
>>> @@ -1239,6 +1246,8 @@ int mhi_gen_tre(struct mhi_controller 
>>> *mhi_cntrl, struct mhi_chan *mhi_chan,
>>>       mhi_add_ring_element(mhi_cntrl, tre_ring);
>>>       mhi_add_ring_element(mhi_cntrl, buf_ring);
>>>   +    write_unlock_bh(&mhi_chan->lock);
>>> +
>>>       return 0;
>>>   }
>>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/2] bus: mhi: host: Take irqsave lock after TRE is generated
  2023-09-25  4:08     ` Qiang Yu
@ 2023-09-29 15:25       ` Jeffrey Hugo
  2023-10-16  8:49         ` Qiang Yu
  0 siblings, 1 reply; 16+ messages in thread
From: Jeffrey Hugo @ 2023-09-29 15:25 UTC (permalink / raw)
  To: Qiang Yu, mani
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana,
	Hemant Kumar, Lazarus Motha

On 9/24/2023 10:08 PM, Qiang Yu wrote:
> 
> On 9/22/2023 10:50 PM, Jeffrey Hugo wrote:
>> On 9/13/2023 2:47 AM, Qiang Yu wrote:
>>> From: Hemant Kumar <quic_hemantk@quicinc.com>
>>>
>>> Take irqsave lock after TRE is generated to avoid deadlock due to core
>>> getting interrupts enabled as local_bh_enable must not be called with
>>> irqs disabled based on upstream patch.
>>
>> Where is local_bh_enable() being called?  What patch?  What is 
>> upstream of the codebase you submitted this to?  Why is it safe to 
>> call mhi_gen_tre() without the lock?
> 
> This patch is to fix the issue included by  "[PATCH v2 1/2] bus: mhi: 
> host: Add spinlock to protect WP access when queueing TREs". In that 
> patch, we add write_lock_bh/write_unlock_bh in mhi_gen_tre().
> 
> However, before mhi_gen_tre() is invoked, mhi_cntrl->pm_lock is getted, 
> line 1125, and it is a spin lock. So it becomes we want to get and 
> release bh lock after spin lock.  __local_bh_enable_ip is called as part 
> of write_unlock_bh
> 
> and local_bh_enable. When CONFIG_TRACE_IRQFLAGS is enabled, irq will be 
> enabled once __local_bh_enable_ip is called. The commit message is not 
> clear and confusing, will change it in [patch v3].
> 

In addition to clarifying the commit message, I recommend looking at 
adding this to the other patch.  It seems very odd to review a series 
where one patch introduces a known issue, and a following patch corrects 
that issue.  It would be better to not introduce the issue in the first 
place.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-09-29 15:22       ` Jeffrey Hugo
@ 2023-10-16  8:46         ` Qiang Yu
  2023-10-20 15:07           ` Jeffrey Hugo
  0 siblings, 1 reply; 16+ messages in thread
From: Qiang Yu @ 2023-10-16  8:46 UTC (permalink / raw)
  To: Jeffrey Hugo, mani
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana


On 9/29/2023 11:22 PM, Jeffrey Hugo wrote:
> On 9/24/2023 9:10 PM, Qiang Yu wrote:
>>
>> On 9/22/2023 10:44 PM, Jeffrey Hugo wrote:
>>> On 9/13/2023 2:47 AM, Qiang Yu wrote:
>>>> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>>>
>>>> Protect WP accesses such that multiple threads queueing buffers for
>>>> incoming data do not race and access the same WP twice. Ensure read 
>>>> and
>>>> write locks for the channel are not taken in succession by dropping 
>>>> the
>>>> read lock from parse_xfer_event() such that a callback given to client
>>>> can potentially queue buffers and acquire the write lock in that 
>>>> process.
>>>> Any queueing of buffers should be done without channel read lock 
>>>> acquired
>>>> as it can result in multiple locks and a soft lockup.
>>>>
>>>> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>>>> ---
>>>>   drivers/bus/mhi/host/main.c | 11 ++++++++++-
>>>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>>> index dcf627b..13c4b89 100644
>>>> --- a/drivers/bus/mhi/host/main.c
>>>> +++ b/drivers/bus/mhi/host/main.c
>>>> @@ -642,6 +642,7 @@ static int parse_xfer_event(struct 
>>>> mhi_controller *mhi_cntrl,
>>>>               mhi_del_ring_element(mhi_cntrl, tre_ring);
>>>>               local_rp = tre_ring->rp;
>>>>   +            read_unlock_bh(&mhi_chan->lock);
>>>
>>> This doesn't work due to the write_lock_irqsave(&mhi_chan->lock, 
>>> flags); on line 591.
>> Write_lock_irqsave(&mhi_chan->lock, flags) is used in case of ev_code 
>> >= MHI_EV_CC_OOB. We only read_lock/read_unlock the mhi_chan while 
>> ev_code < MHI_EV_CC_OOB.
>
> Sorry.  OOB != EOB
>
>>>
>>> I really don't like that we are unlocking the mhi_chan while still 
>>> using it.  It opens up a window where the mhi_chan state can be 
>>> updated between here and the client using the callback to queue a buf.
>>>
>>> Perhaps we need a new lock that just protects the wp, and needs to 
>>> be only grabbed while mhi_chan->lock is held?
>>
>> Since we have employed mhi_chan lock to protect the channel and what 
>> we are concerned here is that client may queue buf to a disabled or 
>> stopped channel, can we check channel state after getting 
>> mhi_chan->lock like line 595.
>>
>> We can add the check after getting write lock in mhi_gen_tre() and 
>> after getting read lock again here.
>
> I'm not sure that is sufficient.  After you unlock to notify the 
> client, MHI is going to manipulate the packet count and runtime_pm 
> without the lock (648-652).  It seems like that adds additional races 
> which won't be covered by the additional check you propose.

I don't think read_lock_bh(&mhi_chan->lock) can protect runtime_pm and 
the packet count here. Even if we do not unlock, mhi state and packet 
count can still be changed because we did not get pm_lock here, which is 
used in all mhi state transition function.

I also checked all places that mhi_chan->lock is grabbed, did not see 
packet count and runtime_pm be protected by write_lock(&mhi_chan->lock).


If you really don't like the unlock operation, we can also take a new 
lock. But I think we only need to add the new lock in two places, 
mhi_gen_tre and mhi_pm_m0_transition while mhi_chan->lock is held.

>
>>
>>>
>>>>               /* notify client */
>>>>               mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>>>>   @@ -667,6 +668,7 @@ static int parse_xfer_event(struct 
>>>> mhi_controller *mhi_cntrl,
>>>>                       kfree(buf_info->cb_buf);
>>>>                   }
>>>>               }
>>>> +            read_lock_bh(&mhi_chan->lock);
>>>>           }
>>>>           break;
>>>>       } /* CC_EOT */
>>>> @@ -1204,6 +1206,9 @@ int mhi_gen_tre(struct mhi_controller 
>>>> *mhi_cntrl, struct mhi_chan *mhi_chan,
>>>>       int eot, eob, chain, bei;
>>>>       int ret;
>>>>   +    /* Protect accesses for reading and incrementing WP */
>>>> +    write_lock_bh(&mhi_chan->lock);
>>>> +
>>>>       buf_ring = &mhi_chan->buf_ring;
>>>>       tre_ring = &mhi_chan->tre_ring;
>>>>   @@ -1221,8 +1226,10 @@ int mhi_gen_tre(struct mhi_controller 
>>>> *mhi_cntrl, struct mhi_chan *mhi_chan,
>>>>         if (!info->pre_mapped) {
>>>>           ret = mhi_cntrl->map_single(mhi_cntrl, buf_info);
>>>> -        if (ret)
>>>> +        if (ret) {
>>>> +            write_unlock_bh(&mhi_chan->lock);
>>>>               return ret;
>>>> +        }
>>>>       }
>>>>         eob = !!(flags & MHI_EOB);
>>>> @@ -1239,6 +1246,8 @@ int mhi_gen_tre(struct mhi_controller 
>>>> *mhi_cntrl, struct mhi_chan *mhi_chan,
>>>>       mhi_add_ring_element(mhi_cntrl, tre_ring);
>>>>       mhi_add_ring_element(mhi_cntrl, buf_ring);
>>>>   +    write_unlock_bh(&mhi_chan->lock);
>>>> +
>>>>       return 0;
>>>>   }
>>>
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/2] bus: mhi: host: Take irqsave lock after TRE is generated
  2023-09-29 15:25       ` Jeffrey Hugo
@ 2023-10-16  8:49         ` Qiang Yu
  0 siblings, 0 replies; 16+ messages in thread
From: Qiang Yu @ 2023-10-16  8:49 UTC (permalink / raw)
  To: Jeffrey Hugo, mani
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana,
	Hemant Kumar, Lazarus Motha


On 9/29/2023 11:25 PM, Jeffrey Hugo wrote:
> On 9/24/2023 10:08 PM, Qiang Yu wrote:
>>
>> On 9/22/2023 10:50 PM, Jeffrey Hugo wrote:
>>> On 9/13/2023 2:47 AM, Qiang Yu wrote:
>>>> From: Hemant Kumar <quic_hemantk@quicinc.com>
>>>>
>>>> Take irqsave lock after TRE is generated to avoid deadlock due to core
>>>> getting interrupts enabled as local_bh_enable must not be called with
>>>> irqs disabled based on upstream patch.
>>>
>>> Where is local_bh_enable() being called?  What patch?  What is 
>>> upstream of the codebase you submitted this to?  Why is it safe to 
>>> call mhi_gen_tre() without the lock?
>>
>> This patch is to fix the issue included by  "[PATCH v2 1/2] bus: mhi: 
>> host: Add spinlock to protect WP access when queueing TREs". In that 
>> patch, we add write_lock_bh/write_unlock_bh in mhi_gen_tre().
>>
>> However, before mhi_gen_tre() is invoked, mhi_cntrl->pm_lock is 
>> getted, line 1125, and it is a spin lock. So it becomes we want to 
>> get and release bh lock after spin lock. __local_bh_enable_ip is 
>> called as part of write_unlock_bh
>>
>> and local_bh_enable. When CONFIG_TRACE_IRQFLAGS is enabled, irq will 
>> be enabled once __local_bh_enable_ip is called. The commit message is 
>> not clear and confusing, will change it in [patch v3].
>>
>
> In addition to clarifying the commit message, I recommend looking at 
> adding this to the other patch.  It seems very odd to review a series 
> where one patch introduces a known issue, and a following patch 
> corrects that issue.  It would be better to not introduce the issue in 
> the first place.
OK, will squash two patches into one patch after we achieve an agreement.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-10-16  8:46         ` Qiang Yu
@ 2023-10-20 15:07           ` Jeffrey Hugo
  2023-11-06  4:51             ` Manivannan Sadhasivam
  0 siblings, 1 reply; 16+ messages in thread
From: Jeffrey Hugo @ 2023-10-20 15:07 UTC (permalink / raw)
  To: Qiang Yu, mani; +Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana

On 10/16/2023 2:46 AM, Qiang Yu wrote:
> 
> On 9/29/2023 11:22 PM, Jeffrey Hugo wrote:
>> On 9/24/2023 9:10 PM, Qiang Yu wrote:
>>>
>>> On 9/22/2023 10:44 PM, Jeffrey Hugo wrote:
>>>> On 9/13/2023 2:47 AM, Qiang Yu wrote:
>>>>> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>>>>
>>>>> Protect WP accesses such that multiple threads queueing buffers for
>>>>> incoming data do not race and access the same WP twice. Ensure read 
>>>>> and
>>>>> write locks for the channel are not taken in succession by dropping 
>>>>> the
>>>>> read lock from parse_xfer_event() such that a callback given to client
>>>>> can potentially queue buffers and acquire the write lock in that 
>>>>> process.
>>>>> Any queueing of buffers should be done without channel read lock 
>>>>> acquired
>>>>> as it can result in multiple locks and a soft lockup.
>>>>>
>>>>> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>>>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>>>>> ---
>>>>>   drivers/bus/mhi/host/main.c | 11 ++++++++++-
>>>>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>>>> index dcf627b..13c4b89 100644
>>>>> --- a/drivers/bus/mhi/host/main.c
>>>>> +++ b/drivers/bus/mhi/host/main.c
>>>>> @@ -642,6 +642,7 @@ static int parse_xfer_event(struct 
>>>>> mhi_controller *mhi_cntrl,
>>>>>               mhi_del_ring_element(mhi_cntrl, tre_ring);
>>>>>               local_rp = tre_ring->rp;
>>>>>   +            read_unlock_bh(&mhi_chan->lock);
>>>>
>>>> This doesn't work due to the write_lock_irqsave(&mhi_chan->lock, 
>>>> flags); on line 591.
>>> Write_lock_irqsave(&mhi_chan->lock, flags) is used in case of ev_code 
>>> >= MHI_EV_CC_OOB. We only read_lock/read_unlock the mhi_chan while 
>>> ev_code < MHI_EV_CC_OOB.
>>
>> Sorry.  OOB != EOB
>>
>>>>
>>>> I really don't like that we are unlocking the mhi_chan while still 
>>>> using it.  It opens up a window where the mhi_chan state can be 
>>>> updated between here and the client using the callback to queue a buf.
>>>>
>>>> Perhaps we need a new lock that just protects the wp, and needs to 
>>>> be only grabbed while mhi_chan->lock is held?
>>>
>>> Since we have employed mhi_chan lock to protect the channel and what 
>>> we are concerned here is that client may queue buf to a disabled or 
>>> stopped channel, can we check channel state after getting 
>>> mhi_chan->lock like line 595.
>>>
>>> We can add the check after getting write lock in mhi_gen_tre() and 
>>> after getting read lock again here.
>>
>> I'm not sure that is sufficient.  After you unlock to notify the 
>> client, MHI is going to manipulate the packet count and runtime_pm 
>> without the lock (648-652).  It seems like that adds additional races 
>> which won't be covered by the additional check you propose.
> 
> I don't think read_lock_bh(&mhi_chan->lock) can protect runtime_pm and 
> the packet count here. Even if we do not unlock, mhi state and packet 
> count can still be changed because we did not get pm_lock here, which is 
> used in all mhi state transition function.
> 
> I also checked all places that mhi_chan->lock is grabbed, did not see 
> packet count and runtime_pm be protected by write_lock(&mhi_chan->lock).
> 
> 
> If you really don't like the unlock operation, we can also take a new 
> lock. But I think we only need to add the new lock in two places, 
> mhi_gen_tre and mhi_pm_m0_transition while mhi_chan->lock is held.

Mani, if I recall correctly, you were the architect of the locking.  Do 
you have an opinion?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-09-13  8:47 ` [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs Qiang Yu
  2023-09-22 14:44   ` Jeffrey Hugo
@ 2023-11-06  4:41   ` Manivannan Sadhasivam
  2023-11-07  7:19     ` Qiang Yu
  1 sibling, 1 reply; 16+ messages in thread
From: Manivannan Sadhasivam @ 2023-11-06  4:41 UTC (permalink / raw)
  To: Qiang Yu
  Cc: mani, quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana, Bhaumik Bhatt

On Wed, Sep 13, 2023 at 04:47:40PM +0800, Qiang Yu wrote:
> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
> 
> Protect WP accesses such that multiple threads queueing buffers for
> incoming data do not race and access the same WP twice. Ensure read and
> write locks for the channel are not taken in succession by dropping the
> read lock from parse_xfer_event() such that a callback given to client
> can potentially queue buffers and acquire the write lock in that process.
> Any queueing of buffers should be done without channel read lock acquired
> as it can result in multiple locks and a soft lockup.
> 

This change is doing two things:

1. Unlocking xfer_cb to prevent potential lockup
2. Protecting mhi_gen_tre() against concurrent access

So you should split this into two patches and also add Fixes tag if appropriate.

- Mani

> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
> ---
>  drivers/bus/mhi/host/main.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> index dcf627b..13c4b89 100644
> --- a/drivers/bus/mhi/host/main.c
> +++ b/drivers/bus/mhi/host/main.c
> @@ -642,6 +642,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>  			mhi_del_ring_element(mhi_cntrl, tre_ring);
>  			local_rp = tre_ring->rp;
>  
> +			read_unlock_bh(&mhi_chan->lock);
>  			/* notify client */
>  			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>  
> @@ -667,6 +668,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>  					kfree(buf_info->cb_buf);
>  				}
>  			}
> +			read_lock_bh(&mhi_chan->lock);
>  		}
>  		break;
>  	} /* CC_EOT */
> @@ -1204,6 +1206,9 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
>  	int eot, eob, chain, bei;
>  	int ret;
>  
> +	/* Protect accesses for reading and incrementing WP */
> +	write_lock_bh(&mhi_chan->lock);
> +
>  	buf_ring = &mhi_chan->buf_ring;
>  	tre_ring = &mhi_chan->tre_ring;
>  
> @@ -1221,8 +1226,10 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
>  
>  	if (!info->pre_mapped) {
>  		ret = mhi_cntrl->map_single(mhi_cntrl, buf_info);
> -		if (ret)
> +		if (ret) {
> +			write_unlock_bh(&mhi_chan->lock);
>  			return ret;
> +		}
>  	}
>  
>  	eob = !!(flags & MHI_EOB);
> @@ -1239,6 +1246,8 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
>  	mhi_add_ring_element(mhi_cntrl, tre_ring);
>  	mhi_add_ring_element(mhi_cntrl, buf_ring);
>  
> +	write_unlock_bh(&mhi_chan->lock);
> +
>  	return 0;
>  }
>  
> -- 
> 2.7.4
> 
> 

-- 
மணிவண்ணன் சதாசிவம்

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-10-20 15:07           ` Jeffrey Hugo
@ 2023-11-06  4:51             ` Manivannan Sadhasivam
  2023-11-07  7:59               ` Qiang Yu
  0 siblings, 1 reply; 16+ messages in thread
From: Manivannan Sadhasivam @ 2023-11-06  4:51 UTC (permalink / raw)
  To: Jeffrey Hugo
  Cc: Qiang Yu, mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana

On Fri, Oct 20, 2023 at 09:07:35AM -0600, Jeffrey Hugo wrote:
> On 10/16/2023 2:46 AM, Qiang Yu wrote:
> > 
> > On 9/29/2023 11:22 PM, Jeffrey Hugo wrote:
> > > On 9/24/2023 9:10 PM, Qiang Yu wrote:
> > > > 
> > > > On 9/22/2023 10:44 PM, Jeffrey Hugo wrote:
> > > > > On 9/13/2023 2:47 AM, Qiang Yu wrote:
> > > > > > From: Bhaumik Bhatt <bbhatt@codeaurora.org>
> > > > > > 
> > > > > > Protect WP accesses such that multiple threads queueing buffers for
> > > > > > incoming data do not race and access the same WP twice.
> > > > > > Ensure read and
> > > > > > write locks for the channel are not taken in succession
> > > > > > by dropping the
> > > > > > read lock from parse_xfer_event() such that a callback given to client
> > > > > > can potentially queue buffers and acquire the write lock
> > > > > > in that process.
> > > > > > Any queueing of buffers should be done without channel
> > > > > > read lock acquired
> > > > > > as it can result in multiple locks and a soft lockup.
> > > > > > 
> > > > > > Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
> > > > > > Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
> > > > > > ---
> > > > > >   drivers/bus/mhi/host/main.c | 11 ++++++++++-
> > > > > >   1 file changed, 10 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> > > > > > index dcf627b..13c4b89 100644
> > > > > > --- a/drivers/bus/mhi/host/main.c
> > > > > > +++ b/drivers/bus/mhi/host/main.c
> > > > > > @@ -642,6 +642,7 @@ static int parse_xfer_event(struct
> > > > > > mhi_controller *mhi_cntrl,
> > > > > >               mhi_del_ring_element(mhi_cntrl, tre_ring);
> > > > > >               local_rp = tre_ring->rp;
> > > > > >   +            read_unlock_bh(&mhi_chan->lock);
> > > > > 
> > > > > This doesn't work due to the
> > > > > write_lock_irqsave(&mhi_chan->lock, flags); on line 591.
> > > > Write_lock_irqsave(&mhi_chan->lock, flags) is used in case of
> > > > ev_code >= MHI_EV_CC_OOB. We only read_lock/read_unlock the
> > > > mhi_chan while ev_code < MHI_EV_CC_OOB.
> > > 
> > > Sorry.  OOB != EOB
> > > 
> > > > > 
> > > > > I really don't like that we are unlocking the mhi_chan while
> > > > > still using it.  It opens up a window where the mhi_chan
> > > > > state can be updated between here and the client using the
> > > > > callback to queue a buf.
> > > > > 
> > > > > Perhaps we need a new lock that just protects the wp, and
> > > > > needs to be only grabbed while mhi_chan->lock is held?
> > > > 
> > > > Since we have employed mhi_chan lock to protect the channel and
> > > > what we are concerned here is that client may queue buf to a
> > > > disabled or stopped channel, can we check channel state after
> > > > getting mhi_chan->lock like line 595.
> > > > 
> > > > We can add the check after getting write lock in mhi_gen_tre()
> > > > and after getting read lock again here.
> > > 
> > > I'm not sure that is sufficient.  After you unlock to notify the
> > > client, MHI is going to manipulate the packet count and runtime_pm
> > > without the lock (648-652).  It seems like that adds additional
> > > races which won't be covered by the additional check you propose.
> > 
> > I don't think read_lock_bh(&mhi_chan->lock) can protect runtime_pm and
> > the packet count here. Even if we do not unlock, mhi state and packet
> > count can still be changed because we did not get pm_lock here, which is
> > used in all mhi state transition function.
> > 
> > I also checked all places that mhi_chan->lock is grabbed, did not see
> > packet count and runtime_pm be protected by write_lock(&mhi_chan->lock).
> > 
> > 
> > If you really don't like the unlock operation, we can also take a new
> > lock. But I think we only need to add the new lock in two places,
> > mhi_gen_tre and mhi_pm_m0_transition while mhi_chan->lock is held.
> 
> Mani, if I recall correctly, you were the architect of the locking.  Do you
> have an opinion?
> 

TBH, the locking situation is a mess with MHI. Initially, we happen to have
separate locks for protecting various operations, but then during review, it was
advised to reuse existing locks and avoid having too many separate locks.

This worked well but then we kind of abused the locks over time. I asked Hemant
and Bhaumik to audit the locks and fix them, but both of them left Qcom.

So in this situation, the intent of the pm_lock was to protect concurrent access
against updating the pm_state. And it also happen to protect _other_things_ such
as runtime_put, pending_pkts etc... But not properly, because most of the time
read lock is taken in places where pm_state is being read. So there is still a
possibility of race while accessing these _other_things_.

For this patch, I'm happy with dropping chan->lock before calling xfer_cb() and
I want someone (maybe Qiang) to do the audit of locking in general and come up
with fixes where needed.

- Mani
-- 
மணிவண்ணன் சதாசிவம்

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-11-06  4:41   ` Manivannan Sadhasivam
@ 2023-11-07  7:19     ` Qiang Yu
  0 siblings, 0 replies; 16+ messages in thread
From: Qiang Yu @ 2023-11-07  7:19 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana, Bhaumik Bhatt


On 11/6/2023 12:41 PM, Manivannan Sadhasivam wrote:
> On Wed, Sep 13, 2023 at 04:47:40PM +0800, Qiang Yu wrote:
>> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>
>> Protect WP accesses such that multiple threads queueing buffers for
>> incoming data do not race and access the same WP twice. Ensure read and
>> write locks for the channel are not taken in succession by dropping the
>> read lock from parse_xfer_event() such that a callback given to client
>> can potentially queue buffers and acquire the write lock in that process.
>> Any queueing of buffers should be done without channel read lock acquired
>> as it can result in multiple locks and a soft lockup.
>>
> This change is doing two things:
>
> 1. Unlocking xfer_cb to prevent potential lockup
> 2. Protecting mhi_gen_tre() against concurrent access
>
> So you should split this into two patches and also add Fixes tag if appropriate.
>
> - Mani
Hi Mani, thanks for review. I split this into two patch and add Fixes tag.
>
>> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>> ---
>>   drivers/bus/mhi/host/main.c | 11 ++++++++++-
>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>> index dcf627b..13c4b89 100644
>> --- a/drivers/bus/mhi/host/main.c
>> +++ b/drivers/bus/mhi/host/main.c
>> @@ -642,6 +642,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>   			mhi_del_ring_element(mhi_cntrl, tre_ring);
>>   			local_rp = tre_ring->rp;
>>   
>> +			read_unlock_bh(&mhi_chan->lock);
>>   			/* notify client */
>>   			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>>   
>> @@ -667,6 +668,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>   					kfree(buf_info->cb_buf);
>>   				}
>>   			}
>> +			read_lock_bh(&mhi_chan->lock);
>>   		}
>>   		break;
>>   	} /* CC_EOT */
>> @@ -1204,6 +1206,9 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
>>   	int eot, eob, chain, bei;
>>   	int ret;
>>   
>> +	/* Protect accesses for reading and incrementing WP */
>> +	write_lock_bh(&mhi_chan->lock);
>> +
>>   	buf_ring = &mhi_chan->buf_ring;
>>   	tre_ring = &mhi_chan->tre_ring;
>>   
>> @@ -1221,8 +1226,10 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
>>   
>>   	if (!info->pre_mapped) {
>>   		ret = mhi_cntrl->map_single(mhi_cntrl, buf_info);
>> -		if (ret)
>> +		if (ret) {
>> +			write_unlock_bh(&mhi_chan->lock);
>>   			return ret;
>> +		}
>>   	}
>>   
>>   	eob = !!(flags & MHI_EOB);
>> @@ -1239,6 +1246,8 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
>>   	mhi_add_ring_element(mhi_cntrl, tre_ring);
>>   	mhi_add_ring_element(mhi_cntrl, buf_ring);
>>   
>> +	write_unlock_bh(&mhi_chan->lock);
>> +
>>   	return 0;
>>   }
>>   
>> -- 
>> 2.7.4
>>
>>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-11-06  4:51             ` Manivannan Sadhasivam
@ 2023-11-07  7:59               ` Qiang Yu
  0 siblings, 0 replies; 16+ messages in thread
From: Qiang Yu @ 2023-11-07  7:59 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Jeffrey Hugo
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana


On 11/6/2023 12:51 PM, Manivannan Sadhasivam wrote:
> On Fri, Oct 20, 2023 at 09:07:35AM -0600, Jeffrey Hugo wrote:
>> On 10/16/2023 2:46 AM, Qiang Yu wrote:
>>> On 9/29/2023 11:22 PM, Jeffrey Hugo wrote:
>>>> On 9/24/2023 9:10 PM, Qiang Yu wrote:
>>>>> On 9/22/2023 10:44 PM, Jeffrey Hugo wrote:
>>>>>> On 9/13/2023 2:47 AM, Qiang Yu wrote:
>>>>>>> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>>>>>>
>>>>>>> Protect WP accesses such that multiple threads queueing buffers for
>>>>>>> incoming data do not race and access the same WP twice.
>>>>>>> Ensure read and
>>>>>>> write locks for the channel are not taken in succession
>>>>>>> by dropping the
>>>>>>> read lock from parse_xfer_event() such that a callback given to client
>>>>>>> can potentially queue buffers and acquire the write lock
>>>>>>> in that process.
>>>>>>> Any queueing of buffers should be done without channel
>>>>>>> read lock acquired
>>>>>>> as it can result in multiple locks and a soft lockup.
>>>>>>>
>>>>>>> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>>>>>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>>>>>>> ---
>>>>>>>    drivers/bus/mhi/host/main.c | 11 ++++++++++-
>>>>>>>    1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>>>>>> index dcf627b..13c4b89 100644
>>>>>>> --- a/drivers/bus/mhi/host/main.c
>>>>>>> +++ b/drivers/bus/mhi/host/main.c
>>>>>>> @@ -642,6 +642,7 @@ static int parse_xfer_event(struct
>>>>>>> mhi_controller *mhi_cntrl,
>>>>>>>                mhi_del_ring_element(mhi_cntrl, tre_ring);
>>>>>>>                local_rp = tre_ring->rp;
>>>>>>>    +            read_unlock_bh(&mhi_chan->lock);
>>>>>> This doesn't work due to the
>>>>>> write_lock_irqsave(&mhi_chan->lock, flags); on line 591.
>>>>> Write_lock_irqsave(&mhi_chan->lock, flags) is used in case of
>>>>> ev_code >= MHI_EV_CC_OOB. We only read_lock/read_unlock the
>>>>> mhi_chan while ev_code < MHI_EV_CC_OOB.
>>>> Sorry.  OOB != EOB
>>>>
>>>>>> I really don't like that we are unlocking the mhi_chan while
>>>>>> still using it.  It opens up a window where the mhi_chan
>>>>>> state can be updated between here and the client using the
>>>>>> callback to queue a buf.
>>>>>>
>>>>>> Perhaps we need a new lock that just protects the wp, and
>>>>>> needs to be only grabbed while mhi_chan->lock is held?
>>>>> Since we have employed mhi_chan lock to protect the channel and
>>>>> what we are concerned here is that client may queue buf to a
>>>>> disabled or stopped channel, can we check channel state after
>>>>> getting mhi_chan->lock like line 595.
>>>>>
>>>>> We can add the check after getting write lock in mhi_gen_tre()
>>>>> and after getting read lock again here.
>>>> I'm not sure that is sufficient.  After you unlock to notify the
>>>> client, MHI is going to manipulate the packet count and runtime_pm
>>>> without the lock (648-652).  It seems like that adds additional
>>>> races which won't be covered by the additional check you propose.
>>> I don't think read_lock_bh(&mhi_chan->lock) can protect runtime_pm and
>>> the packet count here. Even if we do not unlock, mhi state and packet
>>> count can still be changed because we did not get pm_lock here, which is
>>> used in all mhi state transition function.
>>>
>>> I also checked all places that mhi_chan->lock is grabbed, did not see
>>> packet count and runtime_pm be protected by write_lock(&mhi_chan->lock).
>>>
>>>
>>> If you really don't like the unlock operation, we can also take a new
>>> lock. But I think we only need to add the new lock in two places,
>>> mhi_gen_tre and mhi_pm_m0_transition while mhi_chan->lock is held.
>> Mani, if I recall correctly, you were the architect of the locking.  Do you
>> have an opinion?
>>
> TBH, the locking situation is a mess with MHI. Initially, we happen to have
> separate locks for protecting various operations, but then during review, it was
> advised to reuse existing locks and avoid having too many separate locks.
>
> This worked well but then we kind of abused the locks over time. I asked Hemant
> and Bhaumik to audit the locks and fix them, but both of them left Qcom.
>
> So in this situation, the intent of the pm_lock was to protect concurrent access
> against updating the pm_state. And it also happen to protect _other_things_ such
> as runtime_put, pending_pkts etc... But not properly, because most of the time
> read lock is taken in places where pm_state is being read. So there is still a
> possibility of race while accessing these _other_things_.
>
> For this patch, I'm happy with dropping chan->lock before calling xfer_cb() and
> I want someone (maybe Qiang) to do the audit of locking in general and come up
> with fixes where needed.
>
> - Mani

As discussed with Jeff before, we also need to check channel state 
before queue buffer and after re-lock

in parse_xfer_event, so I also add the channel state check in next 
version patch.

Probably I can do the audit of locking. It's a good chance for me to 
understand various locks in MHI host

driver completely.


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-11-07  8:00 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-13  8:47 [PATCH v2 0/2] Add lock to avoid race when ringing channel DB Qiang Yu
2023-09-13  8:47 ` [PATCH v2 1/2] bus: mhi: host: Add spinlock to protect WP access when queueing TREs Qiang Yu
2023-09-22 14:44   ` Jeffrey Hugo
2023-09-25  3:10     ` Qiang Yu
2023-09-29 15:22       ` Jeffrey Hugo
2023-10-16  8:46         ` Qiang Yu
2023-10-20 15:07           ` Jeffrey Hugo
2023-11-06  4:51             ` Manivannan Sadhasivam
2023-11-07  7:59               ` Qiang Yu
2023-11-06  4:41   ` Manivannan Sadhasivam
2023-11-07  7:19     ` Qiang Yu
2023-09-13  8:47 ` [PATCH v2 2/2] bus: mhi: host: Take irqsave lock after TRE is generated Qiang Yu
2023-09-22 14:50   ` Jeffrey Hugo
2023-09-25  4:08     ` Qiang Yu
2023-09-29 15:25       ` Jeffrey Hugo
2023-10-16  8:49         ` Qiang Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).