public inbox for linux-kernel@vger.kernel.org
* [PATCH v4 0/4] bus: mhi: host: Add lock to avoid race when ringing channel DB
@ 2023-11-14  5:27 Qiang Yu
  2023-11-14  5:27 ` [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when queueing TREs Qiang Yu
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Qiang Yu @ 2023-11-14  5:27 UTC (permalink / raw)
  To: mani, quic_jhugo
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana, Qiang Yu

1. We need a write lock in mhi_gen_tre(), otherwise there is a race on the
WP used for ringing the channel DB between mhi_queue() and the M0
transition.
2. We cannot invoke local_bh_enable() while IRQs are disabled, so move
read_lock_irqsave() after mhi_gen_tre(), since we now take write_lock_bh()
in mhi_gen_tre().
3. Drop the channel lock around the client's xfer_cb to prevent a
potential lockup.
4. After re-acquiring the lock, check the MHI channel state again to stop
processing buffers and events of a disabled or stopped channel.

v1 -> v2:
Added write_unlock_bh(&mhi_chan->lock) in mhi_gen_tre() before returning
on the error path.

v2 -> v3:
1. Split protecting the WP and unlocking around xfer_cb into two patches
2. Added a new patch to stop processing buffers and events of a disabled
or stopped channel.

v3 -> v4:
1. Modified the commit messages
2. Added an unlock before returning an error

Bhaumik Bhatt (1):
  bus: mhi: host: Add spinlock to protect WP access when queueing TREs

Hemant Kumar (1):
  bus: mhi: host: Take irqsave lock after TRE is generated

Qiang Yu (2):
  bus: mhi: host: Drop chan lock before queuing buffers
  bus: mhi: host: Avoid processing buffer and event of a disabled channel

 drivers/bus/mhi/host/main.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when queueing TREs
  2023-11-14  5:27 [PATCH v4 0/4] bus: mhi: host: Add lock to avoid race when ringing channel DB Qiang Yu
@ 2023-11-14  5:27 ` Qiang Yu
  2023-11-14  5:27 ` [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers Qiang Yu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 21+ messages in thread
From: Qiang Yu @ 2023-11-14  5:27 UTC (permalink / raw)
  To: mani, quic_jhugo
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana,
	Bhaumik Bhatt, stable, Qiang Yu

From: Bhaumik Bhatt <bbhatt@codeaurora.org>

Protect WP accesses such that multiple threads queueing buffers for
incoming data do not race.

Cc: <stable@vger.kernel.org>
Fixes: 189ff97cca53 ("bus: mhi: core: Add support for data transfer")
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
 drivers/bus/mhi/host/main.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
index dcf627b..6c6d253 100644
--- a/drivers/bus/mhi/host/main.c
+++ b/drivers/bus/mhi/host/main.c
@@ -1204,6 +1204,9 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
 	int eot, eob, chain, bei;
 	int ret;
 
+	/* Protect accesses for reading and incrementing WP */
+	write_lock_bh(&mhi_chan->lock);
+
 	buf_ring = &mhi_chan->buf_ring;
 	tre_ring = &mhi_chan->tre_ring;
 
@@ -1221,8 +1224,10 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
 
 	if (!info->pre_mapped) {
 		ret = mhi_cntrl->map_single(mhi_cntrl, buf_info);
-		if (ret)
+		if (ret) {
+			write_unlock_bh(&mhi_chan->lock);
 			return ret;
+		}
 	}
 
 	eob = !!(flags & MHI_EOB);
@@ -1239,6 +1244,8 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
 	mhi_add_ring_element(mhi_cntrl, tre_ring);
 	mhi_add_ring_element(mhi_cntrl, buf_ring);
 
+	write_unlock_bh(&mhi_chan->lock);
+
 	return 0;
 }
 
-- 
2.7.4



* [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-11-14  5:27 [PATCH v4 0/4] bus: mhi: host: Add lock to avoid race when ringing channel DB Qiang Yu
  2023-11-14  5:27 ` [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when queueing TREs Qiang Yu
@ 2023-11-14  5:27 ` Qiang Yu
  2023-11-24 10:04   ` Manivannan Sadhasivam
  2023-11-14  5:27 ` [PATCH v4 3/4] bus: mhi: host: Avoid processing buffer and event of a disabled channel Qiang Yu
  2023-11-14  5:27 ` [PATCH v4 4/4] bus: mhi: host: Take irqsave lock after TRE is generated Qiang Yu
  3 siblings, 1 reply; 21+ messages in thread
From: Qiang Yu @ 2023-11-14  5:27 UTC (permalink / raw)
  To: mani, quic_jhugo
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana, Qiang Yu

Ensure read and write locks for the channel are not taken in succession by
dropping the read lock from parse_xfer_event() such that a callback given
to client can potentially queue buffers and acquire the write lock in that
process. Any queueing of buffers should be done without channel read lock
acquired as it can result in multiple locks and a soft lockup.

Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
---
 drivers/bus/mhi/host/main.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
index 6c6d253..c4215b0 100644
--- a/drivers/bus/mhi/host/main.c
+++ b/drivers/bus/mhi/host/main.c
@@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
 			mhi_del_ring_element(mhi_cntrl, tre_ring);
 			local_rp = tre_ring->rp;
 
+			read_unlock_bh(&mhi_chan->lock);
+
 			/* notify client */
 			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
 
@@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
 					kfree(buf_info->cb_buf);
 				}
 			}
+
+			read_lock_bh(&mhi_chan->lock);
 		}
 		break;
 	} /* CC_EOT */
-- 
2.7.4



* [PATCH v4 3/4] bus: mhi: host: Avoid processing buffer and event of a disabled channel
  2023-11-14  5:27 [PATCH v4 0/4] bus: mhi: host: Add lock to avoid race when ringing channel DB Qiang Yu
  2023-11-14  5:27 ` [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when queueing TREs Qiang Yu
  2023-11-14  5:27 ` [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers Qiang Yu
@ 2023-11-14  5:27 ` Qiang Yu
  2023-11-14  5:27 ` [PATCH v4 4/4] bus: mhi: host: Take irqsave lock after TRE is generated Qiang Yu
  3 siblings, 0 replies; 21+ messages in thread
From: Qiang Yu @ 2023-11-14  5:27 UTC (permalink / raw)
  To: mani, quic_jhugo
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana, Qiang Yu

MHI channel state is protected by mhi_chan->lock. Hence, after the core
drops mhi_chan->lock while processing an xfer event, it cannot prevent the
channel state from being changed if a client closes the channel or the
driver is removed at that time. So let's check the MHI channel state again
after re-acquiring chan->lock, to avoid queueing buffers to a disabled
channel from the xfer callback and to stop processing events of the
disabled channel.

Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
---
 drivers/bus/mhi/host/main.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
index c4215b0..33f27e2 100644
--- a/drivers/bus/mhi/host/main.c
+++ b/drivers/bus/mhi/host/main.c
@@ -671,6 +671,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
 			}
 
 			read_lock_bh(&mhi_chan->lock);
+			if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED)
+				goto end_process_tx_event;
 		}
 		break;
 	} /* CC_EOT */
@@ -1210,6 +1212,10 @@ int mhi_gen_tre(struct mhi_controller *mhi_cntrl, struct mhi_chan *mhi_chan,
 
 	/* Protect accesses for reading and incrementing WP */
 	write_lock_bh(&mhi_chan->lock);
+	if (mhi_chan->ch_state != MHI_CH_STATE_ENABLED) {
+		write_unlock_bh(&mhi_chan->lock);
+		return -EINVAL;
+	}
 
 	buf_ring = &mhi_chan->buf_ring;
 	tre_ring = &mhi_chan->tre_ring;
-- 
2.7.4



* [PATCH v4 4/4] bus: mhi: host: Take irqsave lock after TRE is generated
  2023-11-14  5:27 [PATCH v4 0/4] bus: mhi: host: Add lock to avoid race when ringing channel DB Qiang Yu
                   ` (2 preceding siblings ...)
  2023-11-14  5:27 ` [PATCH v4 3/4] bus: mhi: host: Avoid processing buffer and event of a disabled channel Qiang Yu
@ 2023-11-14  5:27 ` Qiang Yu
  2023-11-24 10:09   ` Manivannan Sadhasivam
  3 siblings, 1 reply; 21+ messages in thread
From: Qiang Yu @ 2023-11-14  5:27 UTC (permalink / raw)
  To: mani, quic_jhugo
  Cc: mhi, linux-arm-msm, linux-kernel, quic_cang, quic_mrana,
	Hemant Kumar, Lazarus Motha, Qiang Yu

From: Hemant Kumar <quic_hemantk@quicinc.com>

If CONFIG_TRACE_IRQFLAGS is enabled, irq will be enabled once __local_bh_
enable_ip is called as part of write_unlock_bh. Hence, let's take irqsave
lock after TRE is generated to avoid running write_unlock_bh when irqsave
lock is held.

Signed-off-by: Hemant Kumar <quic_hemantk@quicinc.com>
Signed-off-by: Lazarus Motha <quic_lmotha@quicinc.com>
Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
---
 drivers/bus/mhi/host/main.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
index 33f27e2..d7abd0b 100644
--- a/drivers/bus/mhi/host/main.c
+++ b/drivers/bus/mhi/host/main.c
@@ -1128,17 +1128,15 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
 	if (unlikely(MHI_PM_IN_ERROR_STATE(mhi_cntrl->pm_state)))
 		return -EIO;
 
-	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
-
 	ret = mhi_is_ring_full(mhi_cntrl, tre_ring);
-	if (unlikely(ret)) {
-		ret = -EAGAIN;
-		goto exit_unlock;
-	}
+	if (unlikely(ret))
+		return -EAGAIN;
 
 	ret = mhi_gen_tre(mhi_cntrl, mhi_chan, buf_info, mflags);
 	if (unlikely(ret))
-		goto exit_unlock;
+		return ret;
+
+	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
 
 	/* Packet is queued, take a usage ref to exit M3 if necessary
 	 * for host->device buffer, balanced put is done on buffer completion
@@ -1158,7 +1156,6 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
 	if (dir == DMA_FROM_DEVICE)
 		mhi_cntrl->runtime_put(mhi_cntrl);
 
-exit_unlock:
 	read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags);
 
 	return ret;
-- 
2.7.4



* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-11-14  5:27 ` [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers Qiang Yu
@ 2023-11-24 10:04   ` Manivannan Sadhasivam
  2023-11-27  7:12     ` Qiang Yu
  2023-11-27  7:13     ` Qiang Yu
  0 siblings, 2 replies; 21+ messages in thread
From: Manivannan Sadhasivam @ 2023-11-24 10:04 UTC (permalink / raw)
  To: Qiang Yu
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana

On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
> Ensure read and write locks for the channel are not taken in succession by
> dropping the read lock from parse_xfer_event() such that a callback given
> to client can potentially queue buffers and acquire the write lock in that
> process. Any queueing of buffers should be done without channel read lock
> acquired as it can result in multiple locks and a soft lockup.
> 

Is this patch trying to fix an existing issue in client drivers or a potential
issue in the future drivers?

Even if you take care of disabled channels, "mhi_event->lock" acquired during
mhi_mark_stale_events() can cause deadlock, since event lock is already held by
mhi_ev_task().

I'd prefer not to open the window unless this patch is fixing a real issue.

- Mani

> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
> ---
>  drivers/bus/mhi/host/main.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> index 6c6d253..c4215b0 100644
> --- a/drivers/bus/mhi/host/main.c
> +++ b/drivers/bus/mhi/host/main.c
> @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>  			mhi_del_ring_element(mhi_cntrl, tre_ring);
>  			local_rp = tre_ring->rp;
>  
> +			read_unlock_bh(&mhi_chan->lock);
> +
>  			/* notify client */
>  			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>  
> @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>  					kfree(buf_info->cb_buf);
>  				}
>  			}
> +
> +			read_lock_bh(&mhi_chan->lock);
>  		}
>  		break;
>  	} /* CC_EOT */
> -- 
> 2.7.4
> 
> 

-- 
மணிவண்ணன் சதாசிவம்


* Re: [PATCH v4 4/4] bus: mhi: host: Take irqsave lock after TRE is generated
  2023-11-14  5:27 ` [PATCH v4 4/4] bus: mhi: host: Take irqsave lock after TRE is generated Qiang Yu
@ 2023-11-24 10:09   ` Manivannan Sadhasivam
  2023-11-27  7:19     ` Qiang Yu
  0 siblings, 1 reply; 21+ messages in thread
From: Manivannan Sadhasivam @ 2023-11-24 10:09 UTC (permalink / raw)
  To: Qiang Yu
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana, Hemant Kumar, Lazarus Motha

On Tue, Nov 14, 2023 at 01:27:41PM +0800, Qiang Yu wrote:
> From: Hemant Kumar <quic_hemantk@quicinc.com>
> 
> If CONFIG_TRACE_IRQFLAGS is enabled, irq will be enabled once __local_bh_
> enable_ip is called as part of write_unlock_bh. Hence, let's take irqsave

"__local_bh_enable_ip" is a function name, so you should not break it.

> lock after TRE is generated to avoid running write_unlock_bh when irqsave
> lock is held.
> 

I still don't understand this commit message. Where is the write_unlock_bh()
being called?

- Mani

> Signed-off-by: Hemant Kumar <quic_hemantk@quicinc.com>
> Signed-off-by: Lazarus Motha <quic_lmotha@quicinc.com>
> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
> ---
>  drivers/bus/mhi/host/main.c | 13 +++++--------
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> index 33f27e2..d7abd0b 100644
> --- a/drivers/bus/mhi/host/main.c
> +++ b/drivers/bus/mhi/host/main.c
> @@ -1128,17 +1128,15 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
>  	if (unlikely(MHI_PM_IN_ERROR_STATE(mhi_cntrl->pm_state)))
>  		return -EIO;
>  
> -	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
> -
>  	ret = mhi_is_ring_full(mhi_cntrl, tre_ring);
> -	if (unlikely(ret)) {
> -		ret = -EAGAIN;
> -		goto exit_unlock;
> -	}
> +	if (unlikely(ret))
> +		return -EAGAIN;
>  
>  	ret = mhi_gen_tre(mhi_cntrl, mhi_chan, buf_info, mflags);
>  	if (unlikely(ret))
> -		goto exit_unlock;
> +		return ret;
> +
> +	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
>  
>  	/* Packet is queued, take a usage ref to exit M3 if necessary
>  	 * for host->device buffer, balanced put is done on buffer completion
> @@ -1158,7 +1156,6 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
>  	if (dir == DMA_FROM_DEVICE)
>  		mhi_cntrl->runtime_put(mhi_cntrl);
>  
> -exit_unlock:
>  	read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags);
>  
>  	return ret;
> -- 
> 2.7.4
> 
> 

-- 
மணிவண்ணன் சதாசிவம்


* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-11-24 10:04   ` Manivannan Sadhasivam
@ 2023-11-27  7:12     ` Qiang Yu
  2023-11-27  7:13     ` Qiang Yu
  1 sibling, 0 replies; 21+ messages in thread
From: Qiang Yu @ 2023-11-27  7:12 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana


On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
> On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
>> Ensure read and write locks for the channel are not taken in succession by
>> dropping the read lock from parse_xfer_event() such that a callback given
>> to client can potentially queue buffers and acquire the write lock in that
>> process. Any queueing of buffers should be done without channel read lock
>> acquired as it can result in multiple locks and a soft lockup.
>>
> Is this patch trying to fix an existing issue in client drivers or a potential
> issue in the future drivers?
>
> Even if you take care of disabled channels, "mhi_event->lock" acquired during
> mhi_mark_stale_events() can cause deadlock, since event lock is already held by
> mhi_ev_task().
>
> I'd prefer not to open the window unless this patch is fixing a real issue.
>
> - Mani
In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
queueing TREs, we add write_lock_bh(&mhi_chan->lock)/
write_unlock_bh(&mhi_chan->lock) in mhi_gen_tre(), which may be invoked
as part of mhi_queue() in the client's xfer callback, so we have to use
read_unlock_bh(&mhi_chan->lock) here to avoid acquiring mhi_chan->lock
twice.

Sorry for confusing you. Do you think we need to squash these two patches
into one?
>
>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>> ---
>>   drivers/bus/mhi/host/main.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>> index 6c6d253..c4215b0 100644
>> --- a/drivers/bus/mhi/host/main.c
>> +++ b/drivers/bus/mhi/host/main.c
>> @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>   			mhi_del_ring_element(mhi_cntrl, tre_ring);
>>   			local_rp = tre_ring->rp;
>>   
>> +			read_unlock_bh(&mhi_chan->lock);
>> +
>>   			/* notify client */
>>   			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>>   
>> @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>   					kfree(buf_info->cb_buf);
>>   				}
>>   			}
>> +
>> +			read_lock_bh(&mhi_chan->lock);
>>   		}
>>   		break;
>>   	} /* CC_EOT */
>> -- 
>> 2.7.4
>>
>>


* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-11-24 10:04   ` Manivannan Sadhasivam
  2023-11-27  7:12     ` Qiang Yu
@ 2023-11-27  7:13     ` Qiang Yu
  2023-11-28 13:32       ` Manivannan Sadhasivam
  1 sibling, 1 reply; 21+ messages in thread
From: Qiang Yu @ 2023-11-27  7:13 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana


On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
> On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
>> Ensure read and write locks for the channel are not taken in succession by
>> dropping the read lock from parse_xfer_event() such that a callback given
>> to client can potentially queue buffers and acquire the write lock in that
>> process. Any queueing of buffers should be done without channel read lock
>> acquired as it can result in multiple locks and a soft lockup.
>>
> Is this patch trying to fix an existing issue in client drivers or a potential
> issue in the future drivers?
>
> Even if you take care of disabled channels, "mhi_event->lock" acquired during
> mhi_mark_stale_events() can cause deadlock, since event lock is already held by
> mhi_ev_task().
>
> I'd prefer not to open the window unless this patch is fixing a real issue.
>
> - Mani
In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
queueing TREs, we add write_lock_bh(&mhi_chan->lock)/
write_unlock_bh(&mhi_chan->lock) in mhi_gen_tre(), which may be invoked
as part of mhi_queue() in the client's xfer callback, so we have to use
read_unlock_bh(&mhi_chan->lock) here to avoid acquiring mhi_chan->lock
twice.

Sorry for confusing you. Do you think we need to squash these two patches
into one?
>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>> ---
>>   drivers/bus/mhi/host/main.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>> index 6c6d253..c4215b0 100644
>> --- a/drivers/bus/mhi/host/main.c
>> +++ b/drivers/bus/mhi/host/main.c
>> @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>   			mhi_del_ring_element(mhi_cntrl, tre_ring);
>>   			local_rp = tre_ring->rp;
>>   
>> +			read_unlock_bh(&mhi_chan->lock);
>> +
>>   			/* notify client */
>>   			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>>   
>> @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>   					kfree(buf_info->cb_buf);
>>   				}
>>   			}
>> +
>> +			read_lock_bh(&mhi_chan->lock);
>>   		}
>>   		break;
>>   	} /* CC_EOT */
>> -- 
>> 2.7.4
>>
>>


* Re: [PATCH v4 4/4] bus: mhi: host: Take irqsave lock after TRE is generated
  2023-11-24 10:09   ` Manivannan Sadhasivam
@ 2023-11-27  7:19     ` Qiang Yu
  2023-12-07  6:38       ` Manivannan Sadhasivam
  0 siblings, 1 reply; 21+ messages in thread
From: Qiang Yu @ 2023-11-27  7:19 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana, Hemant Kumar, Lazarus Motha


On 11/24/2023 6:09 PM, Manivannan Sadhasivam wrote:
> On Tue, Nov 14, 2023 at 01:27:41PM +0800, Qiang Yu wrote:
>> From: Hemant Kumar <quic_hemantk@quicinc.com>
>>
>> If CONFIG_TRACE_IRQFLAGS is enabled, irq will be enabled once __local_bh_
>> enable_ip is called as part of write_unlock_bh. Hence, let's take irqsave
> "__local_bh_enable_ip" is a function name, so you should not break it.
Thanks for letting me know, I will note this in the following patches.
>
>> lock after TRE is generated to avoid running write_unlock_bh when irqsave
>> lock is held.
>>
> I still don't understand this commit message. Where is the write_unlock_bh()
> being called?
>
> - Mani
write_unlock_bh() is invoked in mhi_gen_tre().
The calling flow is like

mhi_queue
     read_lock_irqsave(&mhi_cntrl->pm_lock, flags)
     mhi_gen_tre
         write_lock_bh(&mhi_chan->lock)
         write_unlock_bh(&mhi_chan->lock)   // will enable IRQs if CONFIG_TRACE_IRQFLAGS is enabled
     read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags)

after adding this patch, the calling flow becomes

mhi_queue
     mhi_gen_tre
         write_lock_bh(&mhi_chan->lock)
         write_unlock_bh(&mhi_chan->lock)
     read_lock_irqsave(&mhi_cntrl->pm_lock, flags)
     read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags)
>
>> Signed-off-by: Hemant Kumar <quic_hemantk@quicinc.com>
>> Signed-off-by: Lazarus Motha <quic_lmotha@quicinc.com>
>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>> ---
>>   drivers/bus/mhi/host/main.c | 13 +++++--------
>>   1 file changed, 5 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>> index 33f27e2..d7abd0b 100644
>> --- a/drivers/bus/mhi/host/main.c
>> +++ b/drivers/bus/mhi/host/main.c
>> @@ -1128,17 +1128,15 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
>>   	if (unlikely(MHI_PM_IN_ERROR_STATE(mhi_cntrl->pm_state)))
>>   		return -EIO;
>>   
>> -	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
>> -
>>   	ret = mhi_is_ring_full(mhi_cntrl, tre_ring);
>> -	if (unlikely(ret)) {
>> -		ret = -EAGAIN;
>> -		goto exit_unlock;
>> -	}
>> +	if (unlikely(ret))
>> +		return -EAGAIN;
>>   
>>   	ret = mhi_gen_tre(mhi_cntrl, mhi_chan, buf_info, mflags);
>>   	if (unlikely(ret))
>> -		goto exit_unlock;
>> +		return ret;
>> +
>> +	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
>>   
>>   	/* Packet is queued, take a usage ref to exit M3 if necessary
>>   	 * for host->device buffer, balanced put is done on buffer completion
>> @@ -1158,7 +1156,6 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
>>   	if (dir == DMA_FROM_DEVICE)
>>   		mhi_cntrl->runtime_put(mhi_cntrl);
>>   
>> -exit_unlock:
>>   	read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags);
>>   
>>   	return ret;
>> -- 
>> 2.7.4
>>
>>


* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-11-27  7:13     ` Qiang Yu
@ 2023-11-28 13:32       ` Manivannan Sadhasivam
  2023-11-29  3:29         ` Qiang Yu
  0 siblings, 1 reply; 21+ messages in thread
From: Manivannan Sadhasivam @ 2023-11-28 13:32 UTC (permalink / raw)
  To: Qiang Yu
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana

On Mon, Nov 27, 2023 at 03:13:55PM +0800, Qiang Yu wrote:
> 
> On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
> > On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
> > > Ensure read and write locks for the channel are not taken in succession by
> > > dropping the read lock from parse_xfer_event() such that a callback given
> > > to client can potentially queue buffers and acquire the write lock in that
> > > process. Any queueing of buffers should be done without channel read lock
> > > acquired as it can result in multiple locks and a soft lockup.
> > > 
> > Is this patch trying to fix an existing issue in client drivers or a potential
> > issue in the future drivers?
> > 
> > Even if you take care of disabled channels, "mhi_event->lock" acquired during
> > mhi_mark_stale_events() can cause deadlock, since event lock is already held by
> > mhi_ev_task().
> > 
> > I'd prefer not to open the window unless this patch is fixing a real issue.
> > 
> > - Mani
> In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
> queueing
> TREs,  we add
> write_lock_bh(&mhi_chan->lock)/write_unlock_bh(&mhi_chan->lock)
> in mhi_gen_tre, which may be invoked as part of mhi_queue in client xfer
> callback,
> so we have to use read_unlock_bh(&mhi_chan->lock) here to avoid acquiring
> mhi_chan->lock
> twice.
> 
> Sorry for confusing you. Do you think we need to squash these two patches
> into one?

Well, if patch 1 is introducing a potential deadlock, then we should fix patch
1 itself and not introduce a follow up patch.

But there is one more issue that I pointed out in my previous reply.

Also, I'm planning to cleanup the locking mess within MHI in the coming days.
Perhaps we can revisit this series at that point of time. Will that be OK for
you?

- Mani

> > > Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
> > > ---
> > >   drivers/bus/mhi/host/main.c | 4 ++++
> > >   1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> > > index 6c6d253..c4215b0 100644
> > > --- a/drivers/bus/mhi/host/main.c
> > > +++ b/drivers/bus/mhi/host/main.c
> > > @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
> > >   			mhi_del_ring_element(mhi_cntrl, tre_ring);
> > >   			local_rp = tre_ring->rp;
> > > +			read_unlock_bh(&mhi_chan->lock);
> > > +
> > >   			/* notify client */
> > >   			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
> > > @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
> > >   					kfree(buf_info->cb_buf);
> > >   				}
> > >   			}
> > > +
> > > +			read_lock_bh(&mhi_chan->lock);
> > >   		}
> > >   		break;
> > >   	} /* CC_EOT */
> > > -- 
> > > 2.7.4
> > > 
> > > 
> 

-- 
மணிவண்ணன் சதாசிவம்


* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-11-28 13:32       ` Manivannan Sadhasivam
@ 2023-11-29  3:29         ` Qiang Yu
  2023-11-30  5:31           ` Manivannan Sadhasivam
  0 siblings, 1 reply; 21+ messages in thread
From: Qiang Yu @ 2023-11-29  3:29 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana


On 11/28/2023 9:32 PM, Manivannan Sadhasivam wrote:
> On Mon, Nov 27, 2023 at 03:13:55PM +0800, Qiang Yu wrote:
>> On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
>>> On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
>>>> Ensure read and write locks for the channel are not taken in succession by
>>>> dropping the read lock from parse_xfer_event() such that a callback given
>>>> to client can potentially queue buffers and acquire the write lock in that
>>>> process. Any queueing of buffers should be done without channel read lock
>>>> acquired as it can result in multiple locks and a soft lockup.
>>>>
>>> Is this patch trying to fix an existing issue in client drivers or a potential
>>> issue in the future drivers?
>>>
>>> Even if you take care of disabled channels, "mhi_event->lock" acquired during
>>> mhi_mark_stale_events() can cause deadlock, since event lock is already held by
>>> mhi_ev_task().
>>>
>>> I'd prefer not to open the window unless this patch is fixing a real issue.
>>>
>>> - Mani
>> In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
>> queueing
>> TREs,  we add
>> write_lock_bh(&mhi_chan->lock)/write_unlock_bh(&mhi_chan->lock)
>> in mhi_gen_tre, which may be invoked as part of mhi_queue in client xfer
>> callback,
>> so we have to use read_unlock_bh(&mhi_chan->lock) here to avoid acquiring
>> mhi_chan->lock
>> twice.
>>
>> Sorry for confusing you. Do you think we need to squash these two patches
>> into one?
> Well, if patch 1 is introducing a potential deadlock, then we should fix patch
> 1 itself and not introduce a follow up patch.
>
> But there is one more issue that I pointed out in my previous reply.
Sorry, I cannot understand why "mhi_event->lock" acquired during
mhi_mark_stale_events() can cause deadlock. In mhi_ev_task(), we will
not invoke mhi_mark_stale_events(). Could you explain in more detail?
>
> Also, I'm planning to cleanup the locking mess within MHI in the coming days.
> Perhaps we can revisit this series at that point of time. Will that be OK for
> you?
Sure, that will be great.
>
> - Mani
>
>>>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>>>> ---
>>>>    drivers/bus/mhi/host/main.c | 4 ++++
>>>>    1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>>> index 6c6d253..c4215b0 100644
>>>> --- a/drivers/bus/mhi/host/main.c
>>>> +++ b/drivers/bus/mhi/host/main.c
>>>> @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>>>    			mhi_del_ring_element(mhi_cntrl, tre_ring);
>>>>    			local_rp = tre_ring->rp;
>>>> +			read_unlock_bh(&mhi_chan->lock);
>>>> +
>>>>    			/* notify client */
>>>>    			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>>>> @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>>>    					kfree(buf_info->cb_buf);
>>>>    				}
>>>>    			}
>>>> +
>>>> +			read_lock_bh(&mhi_chan->lock);
>>>>    		}
>>>>    		break;
>>>>    	} /* CC_EOT */
>>>> -- 
>>>> 2.7.4
>>>>
>>>>


* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-11-29  3:29         ` Qiang Yu
@ 2023-11-30  5:31           ` Manivannan Sadhasivam
  2023-12-06  2:25             ` Qiang Yu
  0 siblings, 1 reply; 21+ messages in thread
From: Manivannan Sadhasivam @ 2023-11-30  5:31 UTC (permalink / raw)
  To: Qiang Yu
  Cc: Manivannan Sadhasivam, quic_jhugo, mhi, linux-arm-msm,
	linux-kernel, quic_cang, quic_mrana

On Wed, Nov 29, 2023 at 11:29:07AM +0800, Qiang Yu wrote:
> 
> On 11/28/2023 9:32 PM, Manivannan Sadhasivam wrote:
> > On Mon, Nov 27, 2023 at 03:13:55PM +0800, Qiang Yu wrote:
> > > On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
> > > > On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
> > > > > Ensure read and write locks for the channel are not taken in succession by
> > > > > dropping the read lock from parse_xfer_event() such that a callback given
> > > > > to client can potentially queue buffers and acquire the write lock in that
> > > > > process. Any queueing of buffers should be done without channel read lock
> > > > > acquired as it can result in multiple locks and a soft lockup.
> > > > > 
> > > > Is this patch trying to fix an existing issue in client drivers or a potential
> > > > issue in the future drivers?
> > > > 
> > > > Even if you take care of disabled channels, "mhi_event->lock" acquired during
> > > > mhi_mark_stale_events() can cause deadlock, since event lock is already held by
> > > > mhi_ev_task().
> > > > 
> > > > I'd prefer not to open the window unless this patch is fixing a real issue.
> > > > 
> > > > - Mani
> > > In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
> > > queueing
> > > TREs,  we add
> > > write_lock_bh(&mhi_chan->lock)/write_unlock_bh(&mhi_chan->lock)
> > > in mhi_gen_tre, which may be invoked as part of mhi_queue in client xfer
> > > callback,
> > > so we have to use read_unlock_bh(&mhi_chan->lock) here to avoid acquiring
> > > mhi_chan->lock
> > > twice.
> > > 
> > > Sorry for confusing you. Do you think we need to squash these two patches
> > > into one?
> > Well, if patch 1 is introducing a potential deadlock, then we should fix patch
> > 1 itself and not introduce a follow up patch.
> > 
> > But there is one more issue that I pointed out in my previous reply.
> Sorry, I cannot understand why "mhi_event->lock" acquired during
> mhi_mark_stale_events() can cause a deadlock. In mhi_ev_task(), we will
> not invoke mhi_mark_stale_events(). Can you provide some explanation?

Going by your theory, if a channel gets disabled while the event is being
processed, the process trying to disable the channel will try to acquire
"mhi_event->lock", which is already held by the process processing the event.

- Mani

> > 
> > Also, I'm planning to cleanup the locking mess within MHI in the coming days.
> > Perhaps we can revisit this series at that point of time. Will that be OK for
> > you?
> Sure, that will be great.
> > 
> > - Mani
> > 
> > > > > Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
> > > > > ---
> > > > >    drivers/bus/mhi/host/main.c | 4 ++++
> > > > >    1 file changed, 4 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> > > > > index 6c6d253..c4215b0 100644
> > > > > --- a/drivers/bus/mhi/host/main.c
> > > > > +++ b/drivers/bus/mhi/host/main.c
> > > > > @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
> > > > >    			mhi_del_ring_element(mhi_cntrl, tre_ring);
> > > > >    			local_rp = tre_ring->rp;
> > > > > +			read_unlock_bh(&mhi_chan->lock);
> > > > > +
> > > > >    			/* notify client */
> > > > >    			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
> > > > > @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
> > > > >    					kfree(buf_info->cb_buf);
> > > > >    				}
> > > > >    			}
> > > > > +
> > > > > +			read_lock_bh(&mhi_chan->lock);
> > > > >    		}
> > > > >    		break;
> > > > >    	} /* CC_EOT */
> > > > > -- 
> > > > > 2.7.4
> > > > > 
> > > > > 
> 

-- 
மணிவண்ணன் சதாசிவம்

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-11-30  5:31           ` Manivannan Sadhasivam
@ 2023-12-06  2:25             ` Qiang Yu
  2023-12-06 13:48               ` Manivannan Sadhasivam
  0 siblings, 1 reply; 21+ messages in thread
From: Qiang Yu @ 2023-12-06  2:25 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana


On 11/30/2023 1:31 PM, Manivannan Sadhasivam wrote:
> On Wed, Nov 29, 2023 at 11:29:07AM +0800, Qiang Yu wrote:
>> On 11/28/2023 9:32 PM, Manivannan Sadhasivam wrote:
>>> On Mon, Nov 27, 2023 at 03:13:55PM +0800, Qiang Yu wrote:
>>>> On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
>>>>> On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
>>>>>> Ensure read and write locks for the channel are not taken in succession by
>>>>>> dropping the read lock from parse_xfer_event() such that a callback given
>>>>>> to client can potentially queue buffers and acquire the write lock in that
>>>>>> process. Any queueing of buffers should be done without channel read lock
>>>>>> acquired as it can result in multiple locks and a soft lockup.
>>>>>>
>>>>> Is this patch trying to fix an existing issue in client drivers or a potential
>>>>> issue in the future drivers?
>>>>>
>>>>> Even if you take care of disabled channels, "mhi_event->lock" acquired during
>>>>> mhi_mark_stale_events() can cause deadlock, since event lock is already held by
>>>>> mhi_ev_task().
>>>>>
>>>>> I'd prefer not to open the window unless this patch is fixing a real issue.
>>>>>
>>>>> - Mani
>>>> In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
>>>> queueing
>>>> TREs,  we add
>>>> write_lock_bh(&mhi_chan->lock)/write_unlock_bh(&mhi_chan->lock)
>>>> in mhi_gen_tre, which may be invoked as part of mhi_queue in client xfer
>>>> callback,
>>>> so we have to use read_unlock_bh(&mhi_chan->lock) here to avoid acquiring
>>>> mhi_chan->lock
>>>> twice.
>>>>
>>>> Sorry for confusing you. Do you think we need to squash these two patches
>>>> into one?
>>> Well, if patch 1 is introducing a potential deadlock, then we should fix patch
>>> 1 itself and not introduce a follow up patch.
>>>
>>> But there is one more issue that I pointed out in my previous reply.
>> Sorry, I can not understand why "mhi_event->lock" acquired during
>> mhi_mark_stale_events() can cause deadlock. In mhi_ev_task(), we will
>> not invoke mhi_mark_stale_events(). Can you provide some interpretation?
> Going by your theory that if a channel gets disabled while processing the event,
> the process trying to disable the channel will try to acquire "mhi_event->lock"
> which is already held by the process processing the event.
>
> - Mani
OK, I get it. Thank you for the kind explanation. Hopefully I didn't
intrude too much.
>
>>> Also, I'm planning to cleanup the locking mess within MHI in the coming days.
>>> Perhaps we can revisit this series at that point of time. Will that be OK for
>>> you?
>> Sure, that will be great.
>>> - Mani
>>>
>>>>>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>>>>>> ---
>>>>>>     drivers/bus/mhi/host/main.c | 4 ++++
>>>>>>     1 file changed, 4 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>>>>> index 6c6d253..c4215b0 100644
>>>>>> --- a/drivers/bus/mhi/host/main.c
>>>>>> +++ b/drivers/bus/mhi/host/main.c
>>>>>> @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>>>>>     			mhi_del_ring_element(mhi_cntrl, tre_ring);
>>>>>>     			local_rp = tre_ring->rp;
>>>>>> +			read_unlock_bh(&mhi_chan->lock);
>>>>>> +
>>>>>>     			/* notify client */
>>>>>>     			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>>>>>> @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>>>>>     					kfree(buf_info->cb_buf);
>>>>>>     				}
>>>>>>     			}
>>>>>> +
>>>>>> +			read_lock_bh(&mhi_chan->lock);
>>>>>>     		}
>>>>>>     		break;
>>>>>>     	} /* CC_EOT */
>>>>>> -- 
>>>>>> 2.7.4
>>>>>>
>>>>>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-12-06  2:25             ` Qiang Yu
@ 2023-12-06 13:48               ` Manivannan Sadhasivam
  2023-12-07  5:25                 ` Qiang Yu
  2023-12-07  5:27                 ` Qiang Yu
  0 siblings, 2 replies; 21+ messages in thread
From: Manivannan Sadhasivam @ 2023-12-06 13:48 UTC (permalink / raw)
  To: Qiang Yu
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana

On Wed, Dec 06, 2023 at 10:25:12AM +0800, Qiang Yu wrote:
> 
> On 11/30/2023 1:31 PM, Manivannan Sadhasivam wrote:
> > On Wed, Nov 29, 2023 at 11:29:07AM +0800, Qiang Yu wrote:
> > > On 11/28/2023 9:32 PM, Manivannan Sadhasivam wrote:
> > > > On Mon, Nov 27, 2023 at 03:13:55PM +0800, Qiang Yu wrote:
> > > > > On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
> > > > > > On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
> > > > > > > Ensure read and write locks for the channel are not taken in succession by
> > > > > > > dropping the read lock from parse_xfer_event() such that a callback given
> > > > > > > to client can potentially queue buffers and acquire the write lock in that
> > > > > > > process. Any queueing of buffers should be done without channel read lock
> > > > > > > acquired as it can result in multiple locks and a soft lockup.
> > > > > > > 
> > > > > > Is this patch trying to fix an existing issue in client drivers or a potential
> > > > > > issue in the future drivers?
> > > > > > 
> > > > > > Even if you take care of disabled channels, "mhi_event->lock" acquired during
> > > > > > mhi_mark_stale_events() can cause deadlock, since event lock is already held by
> > > > > > mhi_ev_task().
> > > > > > 
> > > > > > I'd prefer not to open the window unless this patch is fixing a real issue.
> > > > > > 
> > > > > > - Mani
> > > > > In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
> > > > > queueing
> > > > > TREs,  we add
> > > > > write_lock_bh(&mhi_chan->lock)/write_unlock_bh(&mhi_chan->lock)
> > > > > in mhi_gen_tre, which may be invoked as part of mhi_queue in client xfer
> > > > > callback,
> > > > > so we have to use read_unlock_bh(&mhi_chan->lock) here to avoid acquiring
> > > > > mhi_chan->lock
> > > > > twice.
> > > > > 
> > > > > Sorry for confusing you. Do you think we need to squash these two
> > > > > patches into one?
> > > > Well, if patch 1 is introducing a potential deadlock, then we should fix patch
> > > > 1 itself and not introduce a follow up patch.
> > > > 
> > > > But there is one more issue that I pointed out in my previous reply.
> > > Sorry, I can not understand why "mhi_event->lock" acquired during
> > > mhi_mark_stale_events() can cause deadlock. In mhi_ev_task(), we will
> > > not invoke mhi_mark_stale_events(). Can you provide some interpretation?
> > Going by your theory that if a channel gets disabled while processing the event,
> > the process trying to disable the channel will try to acquire "mhi_event->lock"
> > which is already held by the process processing the event.
> > 
> > - Mani
> OK, I get you. Thank you for kind explanation. Hopefully I didn't intrude
> too much.

Not at all. Btw, did you actually encounter an issue that this patch is trying
to fix? Or is it just a fix based on code inspection?

- Mani

> > 
> > > > Also, I'm planning to cleanup the locking mess within MHI in the coming days.
> > > > Perhaps we can revisit this series at that point of time. Will that be OK for
> > > > you?
> > > Sure, that will be great.
> > > > - Mani
> > > > 
> > > > > > > Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
> > > > > > > ---
> > > > > > >     drivers/bus/mhi/host/main.c | 4 ++++
> > > > > > >     1 file changed, 4 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> > > > > > > index 6c6d253..c4215b0 100644
> > > > > > > --- a/drivers/bus/mhi/host/main.c
> > > > > > > +++ b/drivers/bus/mhi/host/main.c
> > > > > > > @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
> > > > > > >     			mhi_del_ring_element(mhi_cntrl, tre_ring);
> > > > > > >     			local_rp = tre_ring->rp;
> > > > > > > +			read_unlock_bh(&mhi_chan->lock);
> > > > > > > +
> > > > > > >     			/* notify client */
> > > > > > >     			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
> > > > > > > @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
> > > > > > >     					kfree(buf_info->cb_buf);
> > > > > > >     				}
> > > > > > >     			}
> > > > > > > +
> > > > > > > +			read_lock_bh(&mhi_chan->lock);
> > > > > > >     		}
> > > > > > >     		break;
> > > > > > >     	} /* CC_EOT */
> > > > > > > -- 
> > > > > > > 2.7.4
> > > > > > > 
> > > > > > > 

-- 
மணிவண்ணன் சதாசிவம்

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-12-06 13:48               ` Manivannan Sadhasivam
@ 2023-12-07  5:25                 ` Qiang Yu
  2023-12-07  5:27                 ` Qiang Yu
  1 sibling, 0 replies; 21+ messages in thread
From: Qiang Yu @ 2023-12-07  5:25 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana


On 12/6/2023 9:48 PM, Manivannan Sadhasivam wrote:
> On Wed, Dec 06, 2023 at 10:25:12AM +0800, Qiang Yu wrote:
>> On 11/30/2023 1:31 PM, Manivannan Sadhasivam wrote:
>>> On Wed, Nov 29, 2023 at 11:29:07AM +0800, Qiang Yu wrote:
>>>> On 11/28/2023 9:32 PM, Manivannan Sadhasivam wrote:
>>>>> On Mon, Nov 27, 2023 at 03:13:55PM +0800, Qiang Yu wrote:
>>>>>> On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
>>>>>>> On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
>>>>>>>> Ensure read and write locks for the channel are not taken in succession by
>>>>>>>> dropping the read lock from parse_xfer_event() such that a callback given
>>>>>>>> to client can potentially queue buffers and acquire the write lock in that
>>>>>>>> process. Any queueing of buffers should be done without channel read lock
>>>>>>>> acquired as it can result in multiple locks and a soft lockup.
>>>>>>>>
>>>>>>> Is this patch trying to fix an existing issue in client drivers or a potential
>>>>>>> issue in the future drivers?
>>>>>>>
>>>>>>> Even if you take care of disabled channels, "mhi_event->lock" acquired during
>>>>>>> mhi_mark_stale_events() can cause deadlock, since event lock is already held by
>>>>>>> mhi_ev_task().
>>>>>>>
>>>>>>> I'd prefer not to open the window unless this patch is fixing a real issue.
>>>>>>>
>>>>>>> - Mani
>>>>>> In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
>>>>>> queueing
>>>>>> TREs,  we add
>>>>>> write_lock_bh(&mhi_chan->lock)/write_unlock_bh(&mhi_chan->lock)
>>>>>> in mhi_gen_tre, which may be invoked as part of mhi_queue in client xfer
>>>>>> callback,
>>>>>> so we have to use read_unlock_bh(&mhi_chan->lock) here to avoid acquiring
>>>>>> mhi_chan->lock
>>>>>> twice.
>>>>>>
>>>>>> Sorry for confusing you. Do you think we need to squash these two
>>>>>> patches into one?
>>>>> Well, if patch 1 is introducing a potential deadlock, then we should fix patch
>>>>> 1 itself and not introduce a follow up patch.
>>>>>
>>>>> But there is one more issue that I pointed out in my previous reply.
>>>> Sorry, I can not understand why "mhi_event->lock" acquired during
>>>> mhi_mark_stale_events() can cause deadlock. In mhi_ev_task(), we will
>>>> not invoke mhi_mark_stale_events(). Can you provide some interpretation?
>>> Going by your theory that if a channel gets disabled while processing the event,
>>> the process trying to disable the channel will try to acquire "mhi_event->lock"
>>> which is already held by the process processing the event.
>>>
>>> - Mani
>> OK, I get you. Thank you for kind explanation. Hopefully I didn't intrude
>> too much.
> Not at all. Btw, did you actually encounter any issue that this patch is trying
> to fix? Or just fixing based on code inspection.
>
> - Mani
Yes, we actually met the race issue in the downstream driver. But I cannot
find more details about the issue.
>>>>> Also, I'm planning to cleanup the locking mess within MHI in the coming days.
>>>>> Perhaps we can revisit this series at that point of time. Will that be OK for
>>>>> you?
>>>> Sure, that will be great.
>>>>> - Mani
>>>>>
>>>>>>>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>>>>>>>> ---
>>>>>>>>      drivers/bus/mhi/host/main.c | 4 ++++
>>>>>>>>      1 file changed, 4 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>>>>>>> index 6c6d253..c4215b0 100644
>>>>>>>> --- a/drivers/bus/mhi/host/main.c
>>>>>>>> +++ b/drivers/bus/mhi/host/main.c
>>>>>>>> @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>>>>>>>      			mhi_del_ring_element(mhi_cntrl, tre_ring);
>>>>>>>>      			local_rp = tre_ring->rp;
>>>>>>>> +			read_unlock_bh(&mhi_chan->lock);
>>>>>>>> +
>>>>>>>>      			/* notify client */
>>>>>>>>      			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>>>>>>>> @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>>>>>>>      					kfree(buf_info->cb_buf);
>>>>>>>>      				}
>>>>>>>>      			}
>>>>>>>> +
>>>>>>>> +			read_lock_bh(&mhi_chan->lock);
>>>>>>>>      		}
>>>>>>>>      		break;
>>>>>>>>      	} /* CC_EOT */
>>>>>>>> -- 
>>>>>>>> 2.7.4
>>>>>>>>
>>>>>>>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-12-06 13:48               ` Manivannan Sadhasivam
  2023-12-07  5:25                 ` Qiang Yu
@ 2023-12-07  5:27                 ` Qiang Yu
  2023-12-07  6:43                   ` Manivannan Sadhasivam
  1 sibling, 1 reply; 21+ messages in thread
From: Qiang Yu @ 2023-12-07  5:27 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana


On 12/6/2023 9:48 PM, Manivannan Sadhasivam wrote:
> On Wed, Dec 06, 2023 at 10:25:12AM +0800, Qiang Yu wrote:
>> On 11/30/2023 1:31 PM, Manivannan Sadhasivam wrote:
>>> On Wed, Nov 29, 2023 at 11:29:07AM +0800, Qiang Yu wrote:
>>>> On 11/28/2023 9:32 PM, Manivannan Sadhasivam wrote:
>>>>> On Mon, Nov 27, 2023 at 03:13:55PM +0800, Qiang Yu wrote:
>>>>>> On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
>>>>>>> On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
>>>>>>>> Ensure read and write locks for the channel are not taken in succession by
>>>>>>>> dropping the read lock from parse_xfer_event() such that a callback given
>>>>>>>> to client can potentially queue buffers and acquire the write lock in that
>>>>>>>> process. Any queueing of buffers should be done without channel read lock
>>>>>>>> acquired as it can result in multiple locks and a soft lockup.
>>>>>>>>
>>>>>>> Is this patch trying to fix an existing issue in client drivers or a potential
>>>>>>> issue in the future drivers?
>>>>>>>
>>>>>>> Even if you take care of disabled channels, "mhi_event->lock" acquired during
>>>>>>> mhi_mark_stale_events() can cause deadlock, since event lock is already held by
>>>>>>> mhi_ev_task().
>>>>>>>
>>>>>>> I'd prefer not to open the window unless this patch is fixing a real issue.
>>>>>>>
>>>>>>> - Mani
>>>>>> In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
>>>>>> queueing
>>>>>> TREs,  we add
>>>>>> write_lock_bh(&mhi_chan->lock)/write_unlock_bh(&mhi_chan->lock)
>>>>>> in mhi_gen_tre, which may be invoked as part of mhi_queue in client xfer
>>>>>> callback,
>>>>>> so we have to use read_unlock_bh(&mhi_chan->lock) here to avoid acquiring
>>>>>> mhi_chan->lock
>>>>>> twice.
>>>>>>
>>>>>> Sorry for confusing you. Do you think we need to squash these two
>>>>>> patches into one?
>>>>> Well, if patch 1 is introducing a potential deadlock, then we should fix patch
>>>>> 1 itself and not introduce a follow up patch.
>>>>>
>>>>> But there is one more issue that I pointed out in my previous reply.
>>>> Sorry, I can not understand why "mhi_event->lock" acquired during
>>>> mhi_mark_stale_events() can cause deadlock. In mhi_ev_task(), we will
>>>> not invoke mhi_mark_stale_events(). Can you provide some interpretation?
>>> Going by your theory that if a channel gets disabled while processing the event,
>>> the process trying to disable the channel will try to acquire "mhi_event->lock"
>>> which is already held by the process processing the event.
>>>
>>> - Mani
>> OK, I get you. Thank you for kind explanation. Hopefully I didn't intrude
>> too much.
> Not at all. Btw, did you actually encounter any issue that this patch is trying
> to fix? Or just fixing based on code inspection.
>
> - Mani
Yes, we actually met the race issue in the downstream driver. But I cannot
find more details about the issue.
>>>>> Also, I'm planning to cleanup the locking mess within MHI in the coming days.
>>>>> Perhaps we can revisit this series at that point of time. Will that be OK for
>>>>> you?
>>>> Sure, that will be great.
>>>>> - Mani
>>>>>
>>>>>>>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>>>>>>>> ---
>>>>>>>>      drivers/bus/mhi/host/main.c | 4 ++++
>>>>>>>>      1 file changed, 4 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>>>>>>> index 6c6d253..c4215b0 100644
>>>>>>>> --- a/drivers/bus/mhi/host/main.c
>>>>>>>> +++ b/drivers/bus/mhi/host/main.c
>>>>>>>> @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>>>>>>>      			mhi_del_ring_element(mhi_cntrl, tre_ring);
>>>>>>>>      			local_rp = tre_ring->rp;
>>>>>>>> +			read_unlock_bh(&mhi_chan->lock);
>>>>>>>> +
>>>>>>>>      			/* notify client */
>>>>>>>>      			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
>>>>>>>> @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
>>>>>>>>      					kfree(buf_info->cb_buf);
>>>>>>>>      				}
>>>>>>>>      			}
>>>>>>>> +
>>>>>>>> +			read_lock_bh(&mhi_chan->lock);
>>>>>>>>      		}
>>>>>>>>      		break;
>>>>>>>>      	} /* CC_EOT */
>>>>>>>> -- 
>>>>>>>> 2.7.4
>>>>>>>>
>>>>>>>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 4/4] bus: mhi: host: Take irqsave lock after TRE is generated
  2023-11-27  7:19     ` Qiang Yu
@ 2023-12-07  6:38       ` Manivannan Sadhasivam
  2023-12-07  9:20         ` Qiang Yu
  0 siblings, 1 reply; 21+ messages in thread
From: Manivannan Sadhasivam @ 2023-12-07  6:38 UTC (permalink / raw)
  To: Qiang Yu
  Cc: Manivannan Sadhasivam, quic_jhugo, mhi, linux-arm-msm,
	linux-kernel, quic_cang, quic_mrana, Hemant Kumar, Lazarus Motha

On Mon, Nov 27, 2023 at 03:19:49PM +0800, Qiang Yu wrote:
> 
> On 11/24/2023 6:09 PM, Manivannan Sadhasivam wrote:
> > On Tue, Nov 14, 2023 at 01:27:41PM +0800, Qiang Yu wrote:
> > > From: Hemant Kumar <quic_hemantk@quicinc.com>
> > > 
> > > If CONFIG_TRACE_IRQFLAGS is enabled, irq will be enabled once __local_bh_
> > > enable_ip is called as part of write_unlock_bh. Hence, let's take irqsave
> > "__local_bh_enable_ip" is a function name, so you should not break it.
> Thanks for letting me know; will note this in the following patch.
> > 
> > > lock after TRE is generated to avoid running write_unlock_bh when irqsave
> > > lock is held.
> > > 
> > I still don't understand this commit message. Where is the write_unlock_bh()
> > being called?
> > 
> > - Mani
> write_unlock_bh() is invoked in mhi_gen_tre().
> The calling flow is like
> mhi_queue
>     read_lock_irqsave(&mhi_cntrl->pm_lock, flags)
>     mhi_gen_tre
>         write_lock_bh(&mhi_chan->lock)
>         write_unlock_bh(&mhi_chan->lock)   // Will enable irq if
> CONFIG_TRACE_IRQFLAGS is enabled
>     read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags)
> 
> after adding this patch, the calling flow becomes
> 
> mhi_queue
>     mhi_gen_tre
>         write_lock_bh(&mhi_chan->lock)
>         write_unlock_bh(&mhi_chan->lock)
>     read_lock_irqsave(&mhi_cntrl->pm_lock, flags)
>     read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags)

So this patch essentially fixes the issue caused by patch 1? If so, this should
be squashed into patch 1.

- Mani

> > 
> > > Signed-off-by: Hemant Kumar <quic_hemantk@quicinc.com>
> > > Signed-off-by: Lazarus Motha <quic_lmotha@quicinc.com>
> > > Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
> > > ---
> > >   drivers/bus/mhi/host/main.c | 13 +++++--------
> > >   1 file changed, 5 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> > > index 33f27e2..d7abd0b 100644
> > > --- a/drivers/bus/mhi/host/main.c
> > > +++ b/drivers/bus/mhi/host/main.c
> > > @@ -1128,17 +1128,15 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
> > >   	if (unlikely(MHI_PM_IN_ERROR_STATE(mhi_cntrl->pm_state)))
> > >   		return -EIO;
> > > -	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
> > > -
> > >   	ret = mhi_is_ring_full(mhi_cntrl, tre_ring);
> > > -	if (unlikely(ret)) {
> > > -		ret = -EAGAIN;
> > > -		goto exit_unlock;
> > > -	}
> > > +	if (unlikely(ret))
> > > +		return -EAGAIN;
> > >   	ret = mhi_gen_tre(mhi_cntrl, mhi_chan, buf_info, mflags);
> > >   	if (unlikely(ret))
> > > -		goto exit_unlock;
> > > +		return ret;
> > > +
> > > +	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
> > >   	/* Packet is queued, take a usage ref to exit M3 if necessary
> > >   	 * for host->device buffer, balanced put is done on buffer completion
> > > @@ -1158,7 +1156,6 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
> > >   	if (dir == DMA_FROM_DEVICE)
> > >   		mhi_cntrl->runtime_put(mhi_cntrl);
> > > -exit_unlock:
> > >   	read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags);
> > >   	return ret;
> > > -- 
> > > 2.7.4
> > > 
> > > 
> 

-- 
மணிவண்ணன் சதாசிவம்

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-12-07  5:27                 ` Qiang Yu
@ 2023-12-07  6:43                   ` Manivannan Sadhasivam
  2023-12-07  9:50                     ` Qiang Yu
  0 siblings, 1 reply; 21+ messages in thread
From: Manivannan Sadhasivam @ 2023-12-07  6:43 UTC (permalink / raw)
  To: Qiang Yu
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana

On Thu, Dec 07, 2023 at 01:27:19PM +0800, Qiang Yu wrote:
> 
> On 12/6/2023 9:48 PM, Manivannan Sadhasivam wrote:
> > On Wed, Dec 06, 2023 at 10:25:12AM +0800, Qiang Yu wrote:
> > > On 11/30/2023 1:31 PM, Manivannan Sadhasivam wrote:
> > > > On Wed, Nov 29, 2023 at 11:29:07AM +0800, Qiang Yu wrote:
> > > > > On 11/28/2023 9:32 PM, Manivannan Sadhasivam wrote:
> > > > > > On Mon, Nov 27, 2023 at 03:13:55PM +0800, Qiang Yu wrote:
> > > > > > > On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
> > > > > > > > On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
> > > > > > > > > Ensure read and write locks for the channel are not taken in succession by
> > > > > > > > > dropping the read lock from parse_xfer_event() such that a callback given
> > > > > > > > > to client can potentially queue buffers and acquire the write lock in that
> > > > > > > > > process. Any queueing of buffers should be done without channel read lock
> > > > > > > > > acquired as it can result in multiple locks and a soft lockup.
> > > > > > > > > 
> > > > > > > > Is this patch trying to fix an existing issue in client drivers or a potential
> > > > > > > > issue in the future drivers?
> > > > > > > > 
> > > > > > > > Even if you take care of disabled channels, "mhi_event->lock" acquired during
> > > > > > > > mhi_mark_stale_events() can cause deadlock, since event lock is already held by
> > > > > > > > mhi_ev_task().
> > > > > > > > 
> > > > > > > > I'd prefer not to open the window unless this patch is fixing a real issue.
> > > > > > > > 
> > > > > > > > - Mani
> > > > > > > In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
> > > > > > > queueing
> > > > > > > TREs,  we add
> > > > > > > write_lock_bh(&mhi_chan->lock)/write_unlock_bh(&mhi_chan->lock)
> > > > > > > in mhi_gen_tre, which may be invoked as part of mhi_queue in client xfer
> > > > > > > callback,
> > > > > > > so we have to use read_unlock_bh(&mhi_chan->lock) here to avoid acquiring
> > > > > > > mhi_chan->lock
> > > > > > > twice.
> > > > > > > 
> > > > > > > Sorry for confusing you. Do you think we need to squash these two
> > > > > > > patches into one?
> > > > > > Well, if patch 1 is introducing a potential deadlock, then we should fix patch
> > > > > > 1 itself and not introduce a follow up patch.
> > > > > > 
> > > > > > But there is one more issue that I pointed out in my previous reply.
> > > > > Sorry, I can not understand why "mhi_event->lock" acquired during
> > > > > mhi_mark_stale_events() can cause deadlock. In mhi_ev_task(), we will
> > > > > not invoke mhi_mark_stale_events(). Can you provide some interpretation?
> > > > Going by your theory that if a channel gets disabled while processing the event,
> > > > the process trying to disable the channel will try to acquire "mhi_event->lock"
> > > > which is already held by the process processing the event.
> > > > 
> > > > - Mani
> > > OK, I get you. Thank you for kind explanation. Hopefully I didn't intrude
> > > too much.
> > Not at all. Btw, did you actually encounter any issue that this patch is trying
> > to fix? Or just fixing based on code inspection.
> > 
> > - Mani
> Yes, we actually meet the race issue in downstream driver. But I can not
> find more details about the issue.

Hmm. I think it is OK to accept this patch and ignore the channel disabling
concern since the event lock is in place to prevent that. There would be no
deadlock as I mentioned above, since the process that is parsing the xfer event
is not the one that is going to disable the channel in parallel.

Could you please respin this series dropping patch 3/4 and also addressing the
issue I mentioned in patch 4/4?

- Mani

-- 
மணிவண்ணன் சதாசிவம்

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 4/4] bus: mhi: host: Take irqsave lock after TRE is generated
  2023-12-07  6:38       ` Manivannan Sadhasivam
@ 2023-12-07  9:20         ` Qiang Yu
  0 siblings, 0 replies; 21+ messages in thread
From: Qiang Yu @ 2023-12-07  9:20 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana, Hemant Kumar, Lazarus Motha


On 12/7/2023 2:38 PM, Manivannan Sadhasivam wrote:
> On Mon, Nov 27, 2023 at 03:19:49PM +0800, Qiang Yu wrote:
>> On 11/24/2023 6:09 PM, Manivannan Sadhasivam wrote:
>>> On Tue, Nov 14, 2023 at 01:27:41PM +0800, Qiang Yu wrote:
>>>> From: Hemant Kumar <quic_hemantk@quicinc.com>
>>>>
>>>> If CONFIG_TRACE_IRQFLAGS is enabled, irq will be enabled once __local_bh_
>>>> enable_ip is called as part of write_unlock_bh. Hence, let's take irqsave
>>> "__local_bh_enable_ip" is a function name, so you should not break it.
>> Thanks for letting me know; will note this in the following patch.
>>>> lock after TRE is generated to avoid running write_unlock_bh when irqsave
>>>> lock is held.
>>>>
>>> I still don't understand this commit message. Where is the write_unlock_bh()
>>> being called?
>>>
>>> - Mani
>> write_unlock_bh() is invoked in mhi_gen_tre().
>> The calling flow is like
>> mhi_queue
>>      read_lock_irqsave(&mhi_cntrl->pm_lock, flags)
>>      mhi_gen_tre
>>          write_lock_bh(&mhi_chan->lock)
>>          write_unlock_bh(&mhi_chan->lock)   // Will enable irq if
>> CONFIG_TRACE_IRQFLAGS is enabled
>>      read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags)
>>
>> after adding this patch, the calling flow becomes
>>
>> mhi_queue
>>      mhi_gen_tre
>>          write_lock_bh(&mhi_chan->lock)
>>          write_unlock_bh(&mhi_chan->lock)
>>      read_lock_irqsave(&mhi_cntrl->pm_lock, flags)
>>      read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags)
> So this patch essentially fixes the issue caused by patch 1? If so, this should
> be squashed into patch 1.
>
> - Mani

Yes, this patch is to fix the issue caused by patch 1. Will squash patch 1
and this patch into one in the next version.

>
>>>> Signed-off-by: Hemant Kumar <quic_hemantk@quicinc.com>
>>>> Signed-off-by: Lazarus Motha <quic_lmotha@quicinc.com>
>>>> Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
>>>> ---
>>>>    drivers/bus/mhi/host/main.c | 13 +++++--------
>>>>    1 file changed, 5 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
>>>> index 33f27e2..d7abd0b 100644
>>>> --- a/drivers/bus/mhi/host/main.c
>>>> +++ b/drivers/bus/mhi/host/main.c
>>>> @@ -1128,17 +1128,15 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
>>>>    	if (unlikely(MHI_PM_IN_ERROR_STATE(mhi_cntrl->pm_state)))
>>>>    		return -EIO;
>>>> -	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
>>>> -
>>>>    	ret = mhi_is_ring_full(mhi_cntrl, tre_ring);
>>>> -	if (unlikely(ret)) {
>>>> -		ret = -EAGAIN;
>>>> -		goto exit_unlock;
>>>> -	}
>>>> +	if (unlikely(ret))
>>>> +		return -EAGAIN;
>>>>    	ret = mhi_gen_tre(mhi_cntrl, mhi_chan, buf_info, mflags);
>>>>    	if (unlikely(ret))
>>>> -		goto exit_unlock;
>>>> +		return ret;
>>>> +
>>>> +	read_lock_irqsave(&mhi_cntrl->pm_lock, flags);
>>>>    	/* Packet is queued, take a usage ref to exit M3 if necessary
>>>>    	 * for host->device buffer, balanced put is done on buffer completion
>>>> @@ -1158,7 +1156,6 @@ static int mhi_queue(struct mhi_device *mhi_dev, struct mhi_buf_info *buf_info,
>>>>    	if (dir == DMA_FROM_DEVICE)
>>>>    		mhi_cntrl->runtime_put(mhi_cntrl);
>>>> -exit_unlock:
>>>>    	read_unlock_irqrestore(&mhi_cntrl->pm_lock, flags);
>>>>    	return ret;
>>>> -- 
>>>> 2.7.4
>>>>
>>>>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers
  2023-12-07  6:43                   ` Manivannan Sadhasivam
@ 2023-12-07  9:50                     ` Qiang Yu
  0 siblings, 0 replies; 21+ messages in thread
From: Qiang Yu @ 2023-12-07  9:50 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: quic_jhugo, mhi, linux-arm-msm, linux-kernel, quic_cang,
	quic_mrana


On 12/7/2023 2:43 PM, Manivannan Sadhasivam wrote:
> On Thu, Dec 07, 2023 at 01:27:19PM +0800, Qiang Yu wrote:
>> On 12/6/2023 9:48 PM, Manivannan Sadhasivam wrote:
>>> On Wed, Dec 06, 2023 at 10:25:12AM +0800, Qiang Yu wrote:
>>>> On 11/30/2023 1:31 PM, Manivannan Sadhasivam wrote:
>>>>> On Wed, Nov 29, 2023 at 11:29:07AM +0800, Qiang Yu wrote:
>>>>>> On 11/28/2023 9:32 PM, Manivannan Sadhasivam wrote:
>>>>>>> On Mon, Nov 27, 2023 at 03:13:55PM +0800, Qiang Yu wrote:
>>>>>>>> On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
>>>>>>>>> On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
>>>>>>>>>> Ensure read and write locks for the channel are not taken in succession by
>>>>>>>>>> dropping the read lock from parse_xfer_event() such that a callback given
>>>>>>>>>> to client can potentially queue buffers and acquire the write lock in that
>>>>>>>>>> process. Any queueing of buffers should be done without channel read lock
>>>>>>>>>> acquired as it can result in multiple locks and a soft lockup.
>>>>>>>>>>
>>>>>>>>> Is this patch trying to fix an existing issue in client drivers or a potential
>>>>>>>>> issue in the future drivers?
>>>>>>>>>
>>>>>>>>> Even if you take care of disabled channels, "mhi_event->lock" acquired during
>>>>>>>>> mhi_mark_stale_events() can cause deadlock, since event lock is already held by
>>>>>>>>> mhi_ev_task().
>>>>>>>>>
>>>>>>>>> I'd prefer not to open the window unless this patch is fixing a real issue.
>>>>>>>>>
>>>>>>>>> - Mani
>>>>>>>> In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access
>>>>>>>> when queueing TREs, we add
>>>>>>>> write_lock_bh(&mhi_chan->lock)/write_unlock_bh(&mhi_chan->lock)
>>>>>>>> in mhi_gen_tre(), which may be invoked as part of mhi_queue() in the
>>>>>>>> client xfer callback, so we have to use read_unlock_bh(&mhi_chan->lock)
>>>>>>>> here to avoid acquiring mhi_chan->lock twice.
>>>>>>>>
>>>>>>>> Sorry for confusing you. Do you think we need to squash these two
>>>>>>>> patches into one?
>>>>>>> Well, if patch 1 is introducing a potential deadlock, then we should fix patch
>>>>>>> 1 itself and not introduce a follow up patch.
>>>>>>>
>>>>>>> But there is one more issue that I pointed out in my previous reply.
>>>>>> Sorry, I cannot understand why the "mhi_event->lock" acquired during
>>>>>> mhi_mark_stale_events() can cause deadlock. In mhi_ev_task(), we will
>>>>>> not invoke mhi_mark_stale_events(). Can you provide some explanation?
>>>>> Going by your theory that if a channel gets disabled while processing the event,
>>>>> the process trying to disable the channel will try to acquire "mhi_event->lock"
>>>>> which is already held by the process processing the event.
>>>>>
>>>>> - Mani
>>>> OK, I get you. Thank you for the kind explanation. Hopefully I didn't
>>>> intrude too much.
>>> Not at all. Btw, did you actually encounter any issue that this patch is trying
>>> to fix? Or just fixing based on code inspection.
>>>
>>> - Mani
>> Yes, we actually met the race issue in the downstream driver, but I
>> cannot find more details about the issue.
> Hmm. I think it is OK to accept this patch and ignore the channel disabling
> concern since the event lock is in place to prevent that. There would be no
> deadlock as I mentioned above, since the process that is parsing the xfer event
> is not the one that is going to disable the channel in parallel.
>
> Could you please respin this series dropping patch 3/4 and also addressing the
> issue I mentioned in patch 4/4?
>
> - Mani
Thank you for tirelessly reviewing these patches. Will do this in the next version.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-12-07  9:50 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-14  5:27 [PATCH v4 0/4] bus: mhi: host: Add lock to avoid race when ringing channel DB Qiang Yu
2023-11-14  5:27 ` [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when queueing TREs Qiang Yu
2023-11-14  5:27 ` [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers Qiang Yu
2023-11-24 10:04   ` Manivannan Sadhasivam
2023-11-27  7:12     ` Qiang Yu
2023-11-27  7:13     ` Qiang Yu
2023-11-28 13:32       ` Manivannan Sadhasivam
2023-11-29  3:29         ` Qiang Yu
2023-11-30  5:31           ` Manivannan Sadhasivam
2023-12-06  2:25             ` Qiang Yu
2023-12-06 13:48               ` Manivannan Sadhasivam
2023-12-07  5:25                 ` Qiang Yu
2023-12-07  5:27                 ` Qiang Yu
2023-12-07  6:43                   ` Manivannan Sadhasivam
2023-12-07  9:50                     ` Qiang Yu
2023-11-14  5:27 ` [PATCH v4 3/4] bus: mhi: host: Avoid processing buffer and event of a disable channel Qiang Yu
2023-11-14  5:27 ` [PATCH v4 4/4] bus: mhi: host: Take irqsave lock after TRE is generated Qiang Yu
2023-11-24 10:09   ` Manivannan Sadhasivam
2023-11-27  7:19     ` Qiang Yu
2023-12-07  6:38       ` Manivannan Sadhasivam
2023-12-07  9:20         ` Qiang Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox