[PATCH v6 0/2] Add Block event interrupt support for I2C protocol

linux-arm-msm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v6 0/2] Add Block event interrupt support for I2C protocol
@ 2025-05-06 11:18 Jyothi Kumar Seerapu
  2025-05-06 11:18 ` [PATCH v6 1/2] dmaengine: qcom: gpi: Add GPI Block event interrupt support Jyothi Kumar Seerapu
  2025-05-06 11:18 ` [PATCH v6 2/2] i2c: i2c-qcom-geni: Add " Jyothi Kumar Seerapu
  0 siblings, 2 replies; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-05-06 11:18 UTC (permalink / raw)
  To: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König
  Cc: linux-arm-msm, dmaengine, linux-kernel, linux-i2c, linux-media,
	dri-devel, linaro-mm-sig, quic_vtanuku

The I2C driver gets an interrupt upon transfer completion.
When handling multiple messages in a single transfer, this
results in N interrupts for N messages, leading to significant
software interrupt latency.

To mitigate this latency, utilize Block Event Interrupt (BEI)
mechanism. Enabling BEI instructs the hardware to prevent interrupt
generation and BEI is disabled when an interrupt is necessary.

Large I2C transfer can be divided into chunks of 8 messages internally.
Interrupts are not expected for the first 7 message completions, only
the last message triggers an interrupt, indicating the completion of
8 messages. This BEI mechanism enhances overall transfer efficiency.

This optimization reduces transfer time from 168 ms to 48 ms for a
series of 200 I2C write messages in a single transfer, with a
clock frequency support of 100 kHz.

BEI optimizations are currently implemented for I2C write transfers only,
as there is no use case for multiple I2C read messages in a single transfer
at this time.

v5 -> v6:
   - Instead of using bei_flag, moved the logic to use with DMA
     supported flags like DMA_PREP_INTERRUPT.
   - Additional parameter comments removed from geni_i2c_gpi_multi_desc_unmap
     function documentation.

v4 -> v5:
   -  BEI flag naming changed from flags to bei_flag.
   -  QCOM_GPI_BLOCK_EVENT_IRQ macro is removed from qcom-gpi-dma.h
      file, and Block event support is checked with bei_flag.
   -  Documentation added for "struct geni_i2c_dev".

v3 -> v4:
  - API's added for Block event interrupt with multi descriptor support is
    moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
  - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
    I2C driver.
  - geni_i2c_gpi_multi_desc_xfer structure is added as a member of
    struct geni_i2c_dev.
  - Removed the changes of making I2C driver is dependent on GPI driver.

v2 -> v3:
  - Updated commit description
  - In I2C GENI driver, for i2c_gpi_cb_result moved the logic of
    "!is_tx_multi_xfer" to else part.
  - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
  - Changes of I2C GENI driver to depend on the GPI driver moved
    to patch3.
  - Renamed gpi_multi_desc_process to gpi_multi_xfer_timeout_handler
  - Added description for newly added changes in "qcom-gpi-dma.h" file.

v1 -> v2:
  - DT changes are reverted for adding dma channel size as a new arg of
    dma-cells property.
  - DT binding change reveted for dma channel size as a new arg of
    dma-cells property.
  - In GPI driver, reverted the changes to parse the channel TRE size
    from device tree.
  - Made the changes in QCOM I2C geni driver to support the BEI
    functionality with the existing TRE size of 64.
  - Made changes in QCOM I2C geni driver as per the review comments.
  - Fixed Kernel test robot reported compiltion issues.

Jyothi Kumar Seerapu (2):
  dmaengine: qcom: gpi: Add GPI Block event interrupt support
  i2c: i2c-qcom-geni: Add Block event interrupt support

 drivers/dma/qcom/gpi.c             |   3 +
 drivers/i2c/busses/i2c-qcom-geni.c | 307 ++++++++++++++++++++++++++---
 include/linux/dma/qcom-gpi-dma.h   |   2 +
 3 files changed, 285 insertions(+), 27 deletions(-)

-- 
base-commit: 55bcd2e0d04c1171d382badef1def1fd04ef66c5
2.17.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v6 1/2] dmaengine: qcom: gpi: Add GPI Block event interrupt support
  2025-05-06 11:18 [PATCH v6 0/2] Add Block event interrupt support for I2C protocol Jyothi Kumar Seerapu
@ 2025-05-06 11:18 ` Jyothi Kumar Seerapu
  2025-05-06 11:32   ` Dmitry Baryshkov
  2025-05-06 11:18 ` [PATCH v6 2/2] i2c: i2c-qcom-geni: Add " Jyothi Kumar Seerapu
  1 sibling, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-05-06 11:18 UTC (permalink / raw)
  To: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König
  Cc: linux-arm-msm, dmaengine, linux-kernel, linux-i2c, linux-media,
	dri-devel, linaro-mm-sig, quic_vtanuku

GSI hardware generates an interrupt for each transfer completion.
For multiple messages within a single transfer, this results in
N interrupts for N messages, leading to significant software
interrupt latency.

To mitigate this latency, utilize Block Event Interrupt (BEI) mechanism.
Enabling BEI instructs the GSI hardware to prevent interrupt generation
and BEI is disabled when an interrupt is necessary.

When using BEI, consider splitting a single multi-message transfer into
chunks of 8 messages internally and so interrupts are not expected for
the first 7 message completions, only the last message triggers
an interrupt, indicating the completion of 8 messages.

This BEI mechanism enhances overall transfer efficiency.

Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
---
v5 ->v6:
  - For updating the block event interrupt bit, instead of relying on
    bei_flag, decision check is moved with DMA_PREP_INTERRUPT flag.
 
v4 -> v5:
  - BEI flag naming changed from flags to bei_flag.
  - QCOM_GPI_BLOCK_EVENT_IRQ macro is removed from qcom-gpi-dma.h
    file, and Block event interrupt support is checked with bei_flag.

v3 -> v4:
  - API's added for Block event interrupt with multi descriptor support for
    I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
  - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
    I2C driver.

v2-> v3:
   - Renamed gpi_multi_desc_process to gpi_multi_xfer_timeout_handler
   - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
   - Added documentation for newly added changes in "qcom-gpi-dma.h" file
   - Updated commit description.

v1 -> v2:
   - Changed dma_addr type from array of pointers to array.
   - To support BEI functionality with the TRE size of 64 defined in GPI driver,
     updated QCOM_GPI_MAX_NUM_MSGS to 16 and NUM_MSGS_PER_IRQ to 4.

 drivers/dma/qcom/gpi.c           | 3 +++
 include/linux/dma/qcom-gpi-dma.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/drivers/dma/qcom/gpi.c b/drivers/dma/qcom/gpi.c
index b1f0001cc99c..7e511f54166a 100644
--- a/drivers/dma/qcom/gpi.c
+++ b/drivers/dma/qcom/gpi.c
@@ -1695,6 +1695,9 @@ static int gpi_create_i2c_tre(struct gchan *chan, struct gpi_desc *desc,
 
 		tre->dword[3] = u32_encode_bits(TRE_TYPE_DMA, TRE_FLAGS_TYPE);
 		tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_IEOT);
+
+		if (!(i2c->dma_flags & DMA_PREP_INTERRUPT))
+			tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_BEI);
 	}
 
 	for (i = 0; i < tre_idx; i++)
diff --git a/include/linux/dma/qcom-gpi-dma.h b/include/linux/dma/qcom-gpi-dma.h
index 6680dd1a43c6..ebac0d3edff2 100644
--- a/include/linux/dma/qcom-gpi-dma.h
+++ b/include/linux/dma/qcom-gpi-dma.h
@@ -65,6 +65,7 @@ enum i2c_op {
  * @rx_len: receive length for buffer
  * @op: i2c cmd
  * @muli-msg: is part of multi i2c r-w msgs
+ * @dma_flags: Flags indicating DMA capabilities
  */
 struct gpi_i2c_config {
 	u8 set_config;
@@ -78,6 +79,7 @@ struct gpi_i2c_config {
 	u32 rx_len;
 	enum i2c_op op;
 	bool multi_msg;
+	unsigned int dma_flags;
 };
 
 #endif /* QCOM_GPI_DMA_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 1/2] dmaengine: qcom: gpi: Add GPI Block event interrupt support
  2025-05-06 11:18 ` [PATCH v6 1/2] dmaengine: qcom: gpi: Add GPI Block event interrupt support Jyothi Kumar Seerapu
@ 2025-05-06 11:32   ` Dmitry Baryshkov
  2025-05-09  6:18     ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2025-05-06 11:32 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

On Tue, May 06, 2025 at 04:48:43PM +0530, Jyothi Kumar Seerapu wrote:
> GSI hardware generates an interrupt for each transfer completion.
> For multiple messages within a single transfer, this results in
> N interrupts for N messages, leading to significant software
> interrupt latency.
> 
> To mitigate this latency, utilize Block Event Interrupt (BEI) mechanism.
> Enabling BEI instructs the GSI hardware to prevent interrupt generation
> and BEI is disabled when an interrupt is necessary.
> 
> When using BEI, consider splitting a single multi-message transfer into
> chunks of 8 messages internally and so interrupts are not expected for
> the first 7 message completions, only the last message triggers
> an interrupt, indicating the completion of 8 messages.
> 
> This BEI mechanism enhances overall transfer efficiency.
> 
> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
> ---
> v5 ->v6:
>   - For updating the block event interrupt bit, instead of relying on
>     bei_flag, decision check is moved with DMA_PREP_INTERRUPT flag.
>  
> v4 -> v5:
>   - BEI flag naming changed from flags to bei_flag.
>   - QCOM_GPI_BLOCK_EVENT_IRQ macro is removed from qcom-gpi-dma.h
>     file, and Block event interrupt support is checked with bei_flag.
> 
> v3 -> v4:
>   - API's added for Block event interrupt with multi descriptor support for
>     I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
>   - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
>     I2C driver.
> 
> v2-> v3:
>    - Renamed gpi_multi_desc_process to gpi_multi_xfer_timeout_handler
>    - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
>    - Added documentation for newly added changes in "qcom-gpi-dma.h" file
>    - Updated commit description.
> 
> v1 -> v2:
>    - Changed dma_addr type from array of pointers to array.
>    - To support BEI functionality with the TRE size of 64 defined in GPI driver,
>      updated QCOM_GPI_MAX_NUM_MSGS to 16 and NUM_MSGS_PER_IRQ to 4.
> 
>  drivers/dma/qcom/gpi.c           | 3 +++
>  include/linux/dma/qcom-gpi-dma.h | 2 ++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/drivers/dma/qcom/gpi.c b/drivers/dma/qcom/gpi.c
> index b1f0001cc99c..7e511f54166a 100644
> --- a/drivers/dma/qcom/gpi.c
> +++ b/drivers/dma/qcom/gpi.c
> @@ -1695,6 +1695,9 @@ static int gpi_create_i2c_tre(struct gchan *chan, struct gpi_desc *desc,
>  
>  		tre->dword[3] = u32_encode_bits(TRE_TYPE_DMA, TRE_FLAGS_TYPE);
>  		tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_IEOT);
> +
> +		if (!(i2c->dma_flags & DMA_PREP_INTERRUPT))
> +			tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_BEI);
>  	}
>  
>  	for (i = 0; i < tre_idx; i++)
> diff --git a/include/linux/dma/qcom-gpi-dma.h b/include/linux/dma/qcom-gpi-dma.h
> index 6680dd1a43c6..ebac0d3edff2 100644
> --- a/include/linux/dma/qcom-gpi-dma.h
> +++ b/include/linux/dma/qcom-gpi-dma.h
> @@ -65,6 +65,7 @@ enum i2c_op {
>   * @rx_len: receive length for buffer
>   * @op: i2c cmd
>   * @muli-msg: is part of multi i2c r-w msgs
> + * @dma_flags: Flags indicating DMA capabilities
>   */
>  struct gpi_i2c_config {
>  	u8 set_config;
> @@ -78,6 +79,7 @@ struct gpi_i2c_config {
>  	u32 rx_len;
>  	enum i2c_op op;
>  	bool multi_msg;
> +	unsigned int dma_flags;

Why do you need extra field instead of using
dma_async_tx_descriptor.flags?

>  };
>  
>  #endif /* QCOM_GPI_DMA_H */
> -- 
> 2.17.1
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 1/2] dmaengine: qcom: gpi: Add GPI Block event interrupt support
  2025-05-06 11:32   ` Dmitry Baryshkov
@ 2025-05-09  6:18     ` Jyothi Kumar Seerapu
  2025-05-30 14:05       ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-05-09  6:18 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku



On 5/6/2025 5:02 PM, Dmitry Baryshkov wrote:
> On Tue, May 06, 2025 at 04:48:43PM +0530, Jyothi Kumar Seerapu wrote:
>> GSI hardware generates an interrupt for each transfer completion.
>> For multiple messages within a single transfer, this results in
>> N interrupts for N messages, leading to significant software
>> interrupt latency.
>>
>> To mitigate this latency, utilize Block Event Interrupt (BEI) mechanism.
>> Enabling BEI instructs the GSI hardware to prevent interrupt generation
>> and BEI is disabled when an interrupt is necessary.
>>
>> When using BEI, consider splitting a single multi-message transfer into
>> chunks of 8 messages internally and so interrupts are not expected for
>> the first 7 message completions, only the last message triggers
>> an interrupt, indicating the completion of 8 messages.
>>
>> This BEI mechanism enhances overall transfer efficiency.
>>
>> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
>> ---
>> v5 ->v6:
>>    - For updating the block event interrupt bit, instead of relying on
>>      bei_flag, decision check is moved with DMA_PREP_INTERRUPT flag.
>>   
>> v4 -> v5:
>>    - BEI flag naming changed from flags to bei_flag.
>>    - QCOM_GPI_BLOCK_EVENT_IRQ macro is removed from qcom-gpi-dma.h
>>      file, and Block event interrupt support is checked with bei_flag.
>>
>> v3 -> v4:
>>    - API's added for Block event interrupt with multi descriptor support for
>>      I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
>>    - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
>>      I2C driver.
>>
>> v2-> v3:
>>     - Renamed gpi_multi_desc_process to gpi_multi_xfer_timeout_handler
>>     - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
>>     - Added documentation for newly added changes in "qcom-gpi-dma.h" file
>>     - Updated commit description.
>>
>> v1 -> v2:
>>     - Changed dma_addr type from array of pointers to array.
>>     - To support BEI functionality with the TRE size of 64 defined in GPI driver,
>>       updated QCOM_GPI_MAX_NUM_MSGS to 16 and NUM_MSGS_PER_IRQ to 4.
>>
>>   drivers/dma/qcom/gpi.c           | 3 +++
>>   include/linux/dma/qcom-gpi-dma.h | 2 ++
>>   2 files changed, 5 insertions(+)
>>
>> diff --git a/drivers/dma/qcom/gpi.c b/drivers/dma/qcom/gpi.c
>> index b1f0001cc99c..7e511f54166a 100644
>> --- a/drivers/dma/qcom/gpi.c
>> +++ b/drivers/dma/qcom/gpi.c
>> @@ -1695,6 +1695,9 @@ static int gpi_create_i2c_tre(struct gchan *chan, struct gpi_desc *desc,
>>   
>>   		tre->dword[3] = u32_encode_bits(TRE_TYPE_DMA, TRE_FLAGS_TYPE);
>>   		tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_IEOT);
>> +
>> +		if (!(i2c->dma_flags & DMA_PREP_INTERRUPT))
>> +			tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_BEI);
>>   	}
>>   
>>   	for (i = 0; i < tre_idx; i++)
>> diff --git a/include/linux/dma/qcom-gpi-dma.h b/include/linux/dma/qcom-gpi-dma.h
>> index 6680dd1a43c6..ebac0d3edff2 100644
>> --- a/include/linux/dma/qcom-gpi-dma.h
>> +++ b/include/linux/dma/qcom-gpi-dma.h
>> @@ -65,6 +65,7 @@ enum i2c_op {
>>    * @rx_len: receive length for buffer
>>    * @op: i2c cmd
>>    * @muli-msg: is part of multi i2c r-w msgs
>> + * @dma_flags: Flags indicating DMA capabilities
>>    */
>>   struct gpi_i2c_config {
>>   	u8 set_config;
>> @@ -78,6 +79,7 @@ struct gpi_i2c_config {
>>   	u32 rx_len;
>>   	enum i2c_op op;
>>   	bool multi_msg;
>> +	unsigned int dma_flags;
> 
> Why do you need extra field instead of using
> dma_async_tx_descriptor.flags?

In the original I2C QCOM GENI driver, using the local variable (unsigned 
in flags) and updating the "DMA_PREP_INTERRUPT" flag.

Sure, i will review if "dma_async_tx_descriptor.flags" can be retrieved 
in GPI driver for DMA_PREP_INTERRUPT flag status.
> 
>>   };
>>   
>>   #endif /* QCOM_GPI_DMA_H */
>> -- 
>> 2.17.1
>>
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 1/2] dmaengine: qcom: gpi: Add GPI Block event interrupt support
  2025-05-09  6:18     ` Jyothi Kumar Seerapu
@ 2025-05-30 14:05       ` Jyothi Kumar Seerapu
  2025-05-30 15:53         ` Dmitry Baryshkov
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-05-30 14:05 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku



On 5/9/2025 11:48 AM, Jyothi Kumar Seerapu wrote:
> 
> 
> On 5/6/2025 5:02 PM, Dmitry Baryshkov wrote:
>> On Tue, May 06, 2025 at 04:48:43PM +0530, Jyothi Kumar Seerapu wrote:
>>> GSI hardware generates an interrupt for each transfer completion.
>>> For multiple messages within a single transfer, this results in
>>> N interrupts for N messages, leading to significant software
>>> interrupt latency.
>>>
>>> To mitigate this latency, utilize Block Event Interrupt (BEI) mechanism.
>>> Enabling BEI instructs the GSI hardware to prevent interrupt generation
>>> and BEI is disabled when an interrupt is necessary.
>>>
>>> When using BEI, consider splitting a single multi-message transfer into
>>> chunks of 8 messages internally and so interrupts are not expected for
>>> the first 7 message completions, only the last message triggers
>>> an interrupt, indicating the completion of 8 messages.
>>>
>>> This BEI mechanism enhances overall transfer efficiency.
>>>
>>> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
>>> ---
>>> v5 ->v6:
>>>    - For updating the block event interrupt bit, instead of relying on
>>>      bei_flag, decision check is moved with DMA_PREP_INTERRUPT flag.
>>> v4 -> v5:
>>>    - BEI flag naming changed from flags to bei_flag.
>>>    - QCOM_GPI_BLOCK_EVENT_IRQ macro is removed from qcom-gpi-dma.h
>>>      file, and Block event interrupt support is checked with bei_flag.
>>>
>>> v3 -> v4:
>>>    - API's added for Block event interrupt with multi descriptor 
>>> support for
>>>      I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
>>>    - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
>>>      I2C driver.
>>>
>>> v2-> v3:
>>>     - Renamed gpi_multi_desc_process to gpi_multi_xfer_timeout_handler
>>>     - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
>>>     - Added documentation for newly added changes in "qcom-gpi-dma.h" 
>>> file
>>>     - Updated commit description.
>>>
>>> v1 -> v2:
>>>     - Changed dma_addr type from array of pointers to array.
>>>     - To support BEI functionality with the TRE size of 64 defined in 
>>> GPI driver,
>>>       updated QCOM_GPI_MAX_NUM_MSGS to 16 and NUM_MSGS_PER_IRQ to 4.
>>>
>>>   drivers/dma/qcom/gpi.c           | 3 +++
>>>   include/linux/dma/qcom-gpi-dma.h | 2 ++
>>>   2 files changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/dma/qcom/gpi.c b/drivers/dma/qcom/gpi.c
>>> index b1f0001cc99c..7e511f54166a 100644
>>> --- a/drivers/dma/qcom/gpi.c
>>> +++ b/drivers/dma/qcom/gpi.c
>>> @@ -1695,6 +1695,9 @@ static int gpi_create_i2c_tre(struct gchan 
>>> *chan, struct gpi_desc *desc,
>>>           tre->dword[3] = u32_encode_bits(TRE_TYPE_DMA, TRE_FLAGS_TYPE);
>>>           tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_IEOT);
>>> +
>>> +        if (!(i2c->dma_flags & DMA_PREP_INTERRUPT))
>>> +            tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_BEI);
>>>       }
>>>       for (i = 0; i < tre_idx; i++)
>>> diff --git a/include/linux/dma/qcom-gpi-dma.h b/include/linux/dma/ 
>>> qcom-gpi-dma.h
>>> index 6680dd1a43c6..ebac0d3edff2 100644
>>> --- a/include/linux/dma/qcom-gpi-dma.h
>>> +++ b/include/linux/dma/qcom-gpi-dma.h
>>> @@ -65,6 +65,7 @@ enum i2c_op {
>>>    * @rx_len: receive length for buffer
>>>    * @op: i2c cmd
>>>    * @muli-msg: is part of multi i2c r-w msgs
>>> + * @dma_flags: Flags indicating DMA capabilities
>>>    */
>>>   struct gpi_i2c_config {
>>>       u8 set_config;
>>> @@ -78,6 +79,7 @@ struct gpi_i2c_config {
>>>       u32 rx_len;
>>>       enum i2c_op op;
>>>       bool multi_msg;
>>> +    unsigned int dma_flags;
>>
>> Why do you need extra field instead of using
>> dma_async_tx_descriptor.flags?
> 
> In the original I2C QCOM GENI driver, using the local variable (unsigned 
> in flags) and updating the "DMA_PREP_INTERRUPT" flag.
> 
> Sure, i will review if "dma_async_tx_descriptor.flags" can be retrieved 
> in GPI driver for DMA_PREP_INTERRUPT flag status.

Hi Dmitry,

In the I2C Geni driver, the dma flags are primarily used in the 
dmaengine_prep_slave_single() function, which expects the argument type 
to be unsigned int. Therefore, the flags should be defined either as
enum dma_ctrl_flags, or unsigned int.

In the GPI driver, specifically within the gpi_prep_slave_sg() function, 
the flags are correctly received from the I2C driver. However, these 
flags are not currently passed to the gpi_create_i2c_tre() function.

If we pass the existing flags variable to the gpi_create_i2c_tre() 
function, we can retrieve the DMA flags information without introducing 
any additional or external variables.

Please confirm if this approach—reusing the existing flags argument in 
the GPI driver—is acceptable and good to proceed with.

>>
>>>   };
>>>   #endif /* QCOM_GPI_DMA_H */
>>> -- 
>>> 2.17.1
>>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 1/2] dmaengine: qcom: gpi: Add GPI Block event interrupt support
  2025-05-30 14:05       ` Jyothi Kumar Seerapu
@ 2025-05-30 15:53         ` Dmitry Baryshkov
  2025-06-17 14:16           ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2025-05-30 15:53 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

On Fri, 30 May 2025 at 17:05, Jyothi Kumar Seerapu
<quic_jseerapu@quicinc.com> wrote:
>
>
>
> On 5/9/2025 11:48 AM, Jyothi Kumar Seerapu wrote:
> >
> >
> > On 5/6/2025 5:02 PM, Dmitry Baryshkov wrote:
> >> On Tue, May 06, 2025 at 04:48:43PM +0530, Jyothi Kumar Seerapu wrote:
> >>> GSI hardware generates an interrupt for each transfer completion.
> >>> For multiple messages within a single transfer, this results in
> >>> N interrupts for N messages, leading to significant software
> >>> interrupt latency.
> >>>
> >>> To mitigate this latency, utilize Block Event Interrupt (BEI) mechanism.
> >>> Enabling BEI instructs the GSI hardware to prevent interrupt generation
> >>> and BEI is disabled when an interrupt is necessary.
> >>>
> >>> When using BEI, consider splitting a single multi-message transfer into
> >>> chunks of 8 messages internally and so interrupts are not expected for
> >>> the first 7 message completions, only the last message triggers
> >>> an interrupt, indicating the completion of 8 messages.
> >>>
> >>> This BEI mechanism enhances overall transfer efficiency.
> >>>
> >>> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
> >>> ---
> >>> v5 ->v6:
> >>>    - For updating the block event interrupt bit, instead of relying on
> >>>      bei_flag, decision check is moved with DMA_PREP_INTERRUPT flag.
> >>> v4 -> v5:
> >>>    - BEI flag naming changed from flags to bei_flag.
> >>>    - QCOM_GPI_BLOCK_EVENT_IRQ macro is removed from qcom-gpi-dma.h
> >>>      file, and Block event interrupt support is checked with bei_flag.
> >>>
> >>> v3 -> v4:
> >>>    - API's added for Block event interrupt with multi descriptor
> >>> support for
> >>>      I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
> >>>    - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
> >>>      I2C driver.
> >>>
> >>> v2-> v3:
> >>>     - Renamed gpi_multi_desc_process to gpi_multi_xfer_timeout_handler
> >>>     - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
> >>>     - Added documentation for newly added changes in "qcom-gpi-dma.h"
> >>> file
> >>>     - Updated commit description.
> >>>
> >>> v1 -> v2:
> >>>     - Changed dma_addr type from array of pointers to array.
> >>>     - To support BEI functionality with the TRE size of 64 defined in
> >>> GPI driver,
> >>>       updated QCOM_GPI_MAX_NUM_MSGS to 16 and NUM_MSGS_PER_IRQ to 4.
> >>>
> >>>   drivers/dma/qcom/gpi.c           | 3 +++
> >>>   include/linux/dma/qcom-gpi-dma.h | 2 ++
> >>>   2 files changed, 5 insertions(+)
> >>>
> >>> diff --git a/drivers/dma/qcom/gpi.c b/drivers/dma/qcom/gpi.c
> >>> index b1f0001cc99c..7e511f54166a 100644
> >>> --- a/drivers/dma/qcom/gpi.c
> >>> +++ b/drivers/dma/qcom/gpi.c
> >>> @@ -1695,6 +1695,9 @@ static int gpi_create_i2c_tre(struct gchan
> >>> *chan, struct gpi_desc *desc,
> >>>           tre->dword[3] = u32_encode_bits(TRE_TYPE_DMA, TRE_FLAGS_TYPE);
> >>>           tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_IEOT);
> >>> +
> >>> +        if (!(i2c->dma_flags & DMA_PREP_INTERRUPT))
> >>> +            tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_BEI);
> >>>       }
> >>>       for (i = 0; i < tre_idx; i++)
> >>> diff --git a/include/linux/dma/qcom-gpi-dma.h b/include/linux/dma/
> >>> qcom-gpi-dma.h
> >>> index 6680dd1a43c6..ebac0d3edff2 100644
> >>> --- a/include/linux/dma/qcom-gpi-dma.h
> >>> +++ b/include/linux/dma/qcom-gpi-dma.h
> >>> @@ -65,6 +65,7 @@ enum i2c_op {
> >>>    * @rx_len: receive length for buffer
> >>>    * @op: i2c cmd
> >>>    * @muli-msg: is part of multi i2c r-w msgs
> >>> + * @dma_flags: Flags indicating DMA capabilities
> >>>    */
> >>>   struct gpi_i2c_config {
> >>>       u8 set_config;
> >>> @@ -78,6 +79,7 @@ struct gpi_i2c_config {
> >>>       u32 rx_len;
> >>>       enum i2c_op op;
> >>>       bool multi_msg;
> >>> +    unsigned int dma_flags;
> >>
> >> Why do you need extra field instead of using
> >> dma_async_tx_descriptor.flags?
> >
> > In the original I2C QCOM GENI driver, using the local variable (unsigned
> > in flags) and updating the "DMA_PREP_INTERRUPT" flag.
> >
> > Sure, i will review if "dma_async_tx_descriptor.flags" can be retrieved
> > in GPI driver for DMA_PREP_INTERRUPT flag status.
>
> Hi Dmitry,
>
> In the I2C Geni driver, the dma flags are primarily used in the
> dmaengine_prep_slave_single() function, which expects the argument type
> to be unsigned int. Therefore, the flags should be defined either as
> enum dma_ctrl_flags, or unsigned int.
>
> In the GPI driver, specifically within the gpi_prep_slave_sg() function,
> the flags are correctly received from the I2C driver. However, these
> flags are not currently passed to the gpi_create_i2c_tre() function.
>
> If we pass the existing flags variable to the gpi_create_i2c_tre()
> function, we can retrieve the DMA flags information without introducing
> any additional or external variables.
>
> Please confirm if this approach—reusing the existing flags argument in
> the GPI driver—is acceptable and good to proceed with.

Could you please check how other drivers use the DMA_PREP_INTERRUPT
flag? That will answer your question.

>
> >>
> >>>   };
> >>>   #endif /* QCOM_GPI_DMA_H */
> >>> --
> >>> 2.17.1
> >>>
> >>
> >
> >
>


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 1/2] dmaengine: qcom: gpi: Add GPI Block event interrupt support
  2025-05-30 15:53         ` Dmitry Baryshkov
@ 2025-06-17 14:16           ` Jyothi Kumar Seerapu
  0 siblings, 0 replies; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-06-17 14:16 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku



On 5/30/2025 9:23 PM, Dmitry Baryshkov wrote:
> On Fri, 30 May 2025 at 17:05, Jyothi Kumar Seerapu
> <quic_jseerapu@quicinc.com> wrote:
>>
>>
>>
>> On 5/9/2025 11:48 AM, Jyothi Kumar Seerapu wrote:
>>>
>>>
>>> On 5/6/2025 5:02 PM, Dmitry Baryshkov wrote:
>>>> On Tue, May 06, 2025 at 04:48:43PM +0530, Jyothi Kumar Seerapu wrote:
>>>>> GSI hardware generates an interrupt for each transfer completion.
>>>>> For multiple messages within a single transfer, this results in
>>>>> N interrupts for N messages, leading to significant software
>>>>> interrupt latency.
>>>>>
>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI) mechanism.
>>>>> Enabling BEI instructs the GSI hardware to prevent interrupt generation
>>>>> and BEI is disabled when an interrupt is necessary.
>>>>>
>>>>> When using BEI, consider splitting a single multi-message transfer into
>>>>> chunks of 8 messages internally and so interrupts are not expected for
>>>>> the first 7 message completions, only the last message triggers
>>>>> an interrupt, indicating the completion of 8 messages.
>>>>>
>>>>> This BEI mechanism enhances overall transfer efficiency.
>>>>>
>>>>> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
>>>>> ---
>>>>> v5 ->v6:
>>>>>     - For updating the block event interrupt bit, instead of relying on
>>>>>       bei_flag, decision check is moved with DMA_PREP_INTERRUPT flag.
>>>>> v4 -> v5:
>>>>>     - BEI flag naming changed from flags to bei_flag.
>>>>>     - QCOM_GPI_BLOCK_EVENT_IRQ macro is removed from qcom-gpi-dma.h
>>>>>       file, and Block event interrupt support is checked with bei_flag.
>>>>>
>>>>> v3 -> v4:
>>>>>     - API's added for Block event interrupt with multi descriptor
>>>>> support for
>>>>>       I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
>>>>>     - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
>>>>>       I2C driver.
>>>>>
>>>>> v2-> v3:
>>>>>      - Renamed gpi_multi_desc_process to gpi_multi_xfer_timeout_handler
>>>>>      - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
>>>>>      - Added documentation for newly added changes in "qcom-gpi-dma.h"
>>>>> file
>>>>>      - Updated commit description.
>>>>>
>>>>> v1 -> v2:
>>>>>      - Changed dma_addr type from array of pointers to array.
>>>>>      - To support BEI functionality with the TRE size of 64 defined in
>>>>> GPI driver,
>>>>>        updated QCOM_GPI_MAX_NUM_MSGS to 16 and NUM_MSGS_PER_IRQ to 4.
>>>>>
>>>>>    drivers/dma/qcom/gpi.c           | 3 +++
>>>>>    include/linux/dma/qcom-gpi-dma.h | 2 ++
>>>>>    2 files changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/drivers/dma/qcom/gpi.c b/drivers/dma/qcom/gpi.c
>>>>> index b1f0001cc99c..7e511f54166a 100644
>>>>> --- a/drivers/dma/qcom/gpi.c
>>>>> +++ b/drivers/dma/qcom/gpi.c
>>>>> @@ -1695,6 +1695,9 @@ static int gpi_create_i2c_tre(struct gchan
>>>>> *chan, struct gpi_desc *desc,
>>>>>            tre->dword[3] = u32_encode_bits(TRE_TYPE_DMA, TRE_FLAGS_TYPE);
>>>>>            tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_IEOT);
>>>>> +
>>>>> +        if (!(i2c->dma_flags & DMA_PREP_INTERRUPT))
>>>>> +            tre->dword[3] |= u32_encode_bits(1, TRE_FLAGS_BEI);
>>>>>        }
>>>>>        for (i = 0; i < tre_idx; i++)
>>>>> diff --git a/include/linux/dma/qcom-gpi-dma.h b/include/linux/dma/
>>>>> qcom-gpi-dma.h
>>>>> index 6680dd1a43c6..ebac0d3edff2 100644
>>>>> --- a/include/linux/dma/qcom-gpi-dma.h
>>>>> +++ b/include/linux/dma/qcom-gpi-dma.h
>>>>> @@ -65,6 +65,7 @@ enum i2c_op {
>>>>>     * @rx_len: receive length for buffer
>>>>>     * @op: i2c cmd
>>>>>     * @muli-msg: is part of multi i2c r-w msgs
>>>>> + * @dma_flags: Flags indicating DMA capabilities
>>>>>     */
>>>>>    struct gpi_i2c_config {
>>>>>        u8 set_config;
>>>>> @@ -78,6 +79,7 @@ struct gpi_i2c_config {
>>>>>        u32 rx_len;
>>>>>        enum i2c_op op;
>>>>>        bool multi_msg;
>>>>> +    unsigned int dma_flags;
>>>>
>>>> Why do you need extra field instead of using
>>>> dma_async_tx_descriptor.flags?
>>>
>>> In the original I2C QCOM GENI driver, using the local variable (unsigned
>>> in flags) and updating the "DMA_PREP_INTERRUPT" flag.
>>>
>>> Sure, i will review if "dma_async_tx_descriptor.flags" can be retrieved
>>> in GPI driver for DMA_PREP_INTERRUPT flag status.
>>
>> Hi Dmitry,
>>
>> In the I2C Geni driver, the dma flags are primarily used in the
>> dmaengine_prep_slave_single() function, which expects the argument type
>> to be unsigned int. Therefore, the flags should be defined either as
>> enum dma_ctrl_flags, or unsigned int.
>>
>> In the GPI driver, specifically within the gpi_prep_slave_sg() function,
>> the flags are correctly received from the I2C driver. However, these
>> flags are not currently passed to the gpi_create_i2c_tre() function.
>>
>> If we pass the existing flags variable to the gpi_create_i2c_tre()
>> function, we can retrieve the DMA flags information without introducing
>> any additional or external variables.
>>
>> Please confirm if this approach—reusing the existing flags argument in
>> the GPI driver—is acceptable and good to proceed with.
> 
> Could you please check how other drivers use the DMA_PREP_INTERRUPT
> flag? That will answer your question.
Sure, thanks.


> 
>>
>>>>
>>>>>    };
>>>>>    #endif /* QCOM_GPI_DMA_H */
>>>>> --
>>>>> 2.17.1
>>>>>
>>>>
>>>
>>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-06 11:18 [PATCH v6 0/2] Add Block event interrupt support for I2C protocol Jyothi Kumar Seerapu
  2025-05-06 11:18 ` [PATCH v6 1/2] dmaengine: qcom: gpi: Add GPI Block event interrupt support Jyothi Kumar Seerapu
@ 2025-05-06 11:18 ` Jyothi Kumar Seerapu
  2025-05-06 11:46   ` Dmitry Baryshkov
  2025-05-07  2:21   ` ALOK TIWARI
  1 sibling, 2 replies; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-05-06 11:18 UTC (permalink / raw)
  To: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König
  Cc: linux-arm-msm, dmaengine, linux-kernel, linux-i2c, linux-media,
	dri-devel, linaro-mm-sig, quic_vtanuku

The I2C driver gets an interrupt upon transfer completion.
When handling multiple messages in a single transfer, this
results in N interrupts for N messages, leading to significant
software interrupt latency.

To mitigate this latency, utilize Block Event Interrupt (BEI)
mechanism. Enabling BEI instructs the hardware to prevent interrupt
generation and BEI is disabled when an interrupt is necessary.

Large I2C transfer can be divided into chunks of 8 messages internally.
Interrupts are not expected for the first 7 message completions, only
the last message triggers an interrupt, indicating the completion of
8 messages. This BEI mechanism enhances overall transfer efficiency.

This optimization reduces transfer time from 168 ms to 48 ms for a
series of 200 I2C write messages in a single transfer, with a
clock frequency support of 100 kHz.

BEI optimizations are currently implemented for I2C write transfers only,
as there is no use case for multiple I2C read messages in a single transfer
at this time.

Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
---
v5 -> v6:
  - Instead of using bei_flag, moved the logic to use with DMA
    supported flags like DMA_PREP_INTERRUPT.
  - Additional parameter comments removed from geni_i2c_gpi_multi_desc_unmap
    function documentation.
   
v4 -> v5:
  - Block event interrupt flag naming changed from flags to bei_flag.
  - Documentation added for "struct geni_i2c_dev".

v3 -> v4:
  - API's added for Block event interrupt with multi descriptor support for
    I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
  - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
    I2C driver.
  - geni_i2c_gpi_multi_desc_xfer structure is added as a member of
    struct geni_i2c_dev.

v2 -> v3:
   - In i2c_gpi_cb_result function, moved the logic of
    "!is_tx_multi_xfer" to else.
   - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
   - Updated commit description

v1 -> v2:
   - Moved gi2c_gpi_xfer->msg_idx_cnt to separate local variable.
   - Updated goto labels for error scenarios in geni_i2c_gpi function
   - memset tx_multi_xfer to 0.
   - Removed passing current msg index to geni_i2c_gpi
   - Fixed kernel test robot reported compilation issues.


 drivers/i2c/busses/i2c-qcom-geni.c | 307 ++++++++++++++++++++++++++---
 1 file changed, 280 insertions(+), 27 deletions(-)

diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
index 515a784c951c..e390cf5b4ddc 100644
--- a/drivers/i2c/busses/i2c-qcom-geni.c
+++ b/drivers/i2c/busses/i2c-qcom-geni.c
@@ -78,6 +78,62 @@ enum geni_i2c_err_code {
 #define XFER_TIMEOUT		HZ
 #define RST_TIMEOUT		HZ
 
+#define QCOM_I2C_GPI_MAX_NUM_MSGS		16
+#define QCOM_I2C_GPI_NUM_MSGS_PER_IRQ		8
+#define QCOM_I2C_MIN_NUM_OF_MSGS_MULTI_DESC	2
+
+/**
+ * struct geni_i2c_gpi_multi_desc_xfer - Used for multi transfer support
+ *
+ * @msg_idx_cnt: message index for the transfer
+ * @buf_idx: dma buffer index
+ * @unmap_msg_cnt: unmapped transfer index
+ * @freed_msg_cnt: freed transfer index
+ * @irq_cnt: received interrupt count
+ * @irq_msg_cnt: transfer message count for the received irqs
+ * @dma_buf: virtual addresses of the buffers
+ * @dma_addr: dma addresses of the buffers
+ */
+struct geni_i2c_gpi_multi_desc_xfer {
+	u32 msg_idx_cnt;
+	u32 buf_idx;
+	u32 unmap_msg_cnt;
+	u32 freed_msg_cnt;
+	u32 irq_cnt;
+	u32 irq_msg_cnt;
+	void *dma_buf[QCOM_I2C_GPI_MAX_NUM_MSGS];
+	dma_addr_t dma_addr[QCOM_I2C_GPI_MAX_NUM_MSGS];
+};
+
+/**
+ * struct geni_i2c_dev - I2C Geni device specific structure
+ *
+ * @se: geni serial engine
+ * @tx_wm: Tx watermark level
+ * @irq: i2c serial engine interrupt
+ * @err: specifies error codes in i2c transfer failures
+ * @adap: i2c geni adapter
+ * @done: completion variable
+ * @cur: pointer to the i2c_msg mentioning current i2c message in use
+ * @cur_wr: variable used for i2c write opertions
+ * @cur_rd: variable used for i2c read operations
+ * @lock: spinlock variable used for synchronization
+ * @core_clk: pointer to clk
+ * @clk_freq_out: contains the i2c clock frequency
+ * @clk_fld: pointer to geni_i2c_clk_fld
+ * @suspended: flag used for system supend status
+ * @dma_buf: virtual address of the buffer
+ * @xfer_len: holds length for the dma operation
+ * @dma_addr: dma address of the buffer
+ * @tx_c: Tx dma channel
+ * @rx_c: Rx dma channel
+ * @gpi_mode: GPI DMA mode of operation
+ * @abort_done: true for marking i2c abort transfer
+ * @is_tx_multi_desc_xfer: true for i2c multi transfer support
+ * @num_msgs: number of i2c messages in a transfer
+ * @tx_irq_cnt: flag used for tx irq count in i2c multi transfer
+ * @i2c_multi_desc_config: used for multi transfer support
+ */
 struct geni_i2c_dev {
 	struct geni_se se;
 	u32 tx_wm;
@@ -100,6 +156,10 @@ struct geni_i2c_dev {
 	struct dma_chan *rx_c;
 	bool gpi_mode;
 	bool abort_done;
+	bool is_tx_multi_desc_xfer;
+	u32 num_msgs;
+	u32 tx_irq_cnt;
+	struct geni_i2c_gpi_multi_desc_xfer i2c_multi_desc_config;
 };
 
 struct geni_i2c_desc {
@@ -500,6 +560,7 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
 static void i2c_gpi_cb_result(void *cb, const struct dmaengine_result *result)
 {
 	struct geni_i2c_dev *gi2c = cb;
+	struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer;
 
 	if (result->result != DMA_TRANS_NOERROR) {
 		dev_err(gi2c->se.dev, "DMA txn failed:%d\n", result->result);
@@ -508,7 +569,22 @@ static void i2c_gpi_cb_result(void *cb, const struct dmaengine_result *result)
 		dev_dbg(gi2c->se.dev, "DMA xfer has pending: %d\n", result->residue);
 	}
 
-	complete(&gi2c->done);
+	if (!gi2c->is_tx_multi_desc_xfer) {
+		complete(&gi2c->done);
+	} else {
+		tx_multi_xfer = &gi2c->i2c_multi_desc_config;
+
+		/*
+		 * Send Completion for last message or multiple of
+		 * QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
+		 */
+		if ((tx_multi_xfer->irq_msg_cnt == gi2c->num_msgs - 1) ||
+		    (!((tx_multi_xfer->irq_msg_cnt + 1) % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ))) {
+			tx_multi_xfer->irq_cnt++;
+			complete(&gi2c->done);
+		}
+		tx_multi_xfer->irq_msg_cnt++;
+	}
 }
 
 static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
@@ -526,38 +602,140 @@ static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
 	}
 }
 
-static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
+/**
+ * geni_i2c_gpi_multi_desc_unmap() - unmaps the buffers post multi message TX transfers
+ * @gi2c: i2c dev handle
+ * @msgs: i2c messages array
+ * @peripheral: pointer to gpi_i2c_config
+ */
+static void geni_i2c_gpi_multi_desc_unmap(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[],
+					  struct gpi_i2c_config *peripheral)
+{
+	u32 msg_xfer_cnt, wr_idx = 0;
+	struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer = &gi2c->i2c_multi_desc_config;
+
+	/*
+	 * In error case, need to unmap all messages based on the msg_idx_cnt.
+	 * Non-error case unmap all the processed messages.
+	 */
+	if (gi2c->err)
+		msg_xfer_cnt = tx_multi_xfer->msg_idx_cnt;
+	else
+		msg_xfer_cnt = tx_multi_xfer->irq_cnt * QCOM_I2C_GPI_NUM_MSGS_PER_IRQ;
+
+	/* Unmap the processed DMA buffers based on the received interrupt count */
+	for (; tx_multi_xfer->unmap_msg_cnt < msg_xfer_cnt; tx_multi_xfer->unmap_msg_cnt++) {
+		if (tx_multi_xfer->unmap_msg_cnt == gi2c->num_msgs)
+			break;
+		wr_idx = tx_multi_xfer->unmap_msg_cnt % QCOM_I2C_GPI_MAX_NUM_MSGS;
+		geni_i2c_gpi_unmap(gi2c, &msgs[tx_multi_xfer->unmap_msg_cnt],
+				   tx_multi_xfer->dma_buf[wr_idx],
+				   tx_multi_xfer->dma_addr[wr_idx],
+				   NULL, (dma_addr_t)NULL);
+		tx_multi_xfer->freed_msg_cnt++;
+	}
+}
+
+/**
+ * geni_i2c_gpi_multi_xfer_timeout_handler() - Handle multi message transfer timeout
+ * @dev: pointer to the corresponding dev node
+ * @multi_xfer: pointer to the geni_i2c_gpi_multi_desc_xfer
+ * @num_xfers: total number of transfers
+ * @transfer_timeout_msecs: transfer timeout value
+ * @transfer_comp: completion object of the transfer
+ *
+ * This function is used to wait for the processed transfers based on
+ * the interrupts generated upon transfer completion.
+ * Return: On success returns 0, otherwise return error code (-ETIMEDOUT)
+ */
+static int geni_i2c_gpi_multi_xfer_timeout_handler(struct device *dev,
+						   struct geni_i2c_gpi_multi_desc_xfer *multi_xfer,
+						   u32 num_xfers, u32 transfer_timeout_msecs,
+						   struct completion *transfer_comp)
+{
+	int i;
+	u32 max_irq_cnt, time_left;
+
+	max_irq_cnt = num_xfers / QCOM_I2C_GPI_NUM_MSGS_PER_IRQ;
+	if (num_xfers % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ)
+		max_irq_cnt++;
+
+	/*
+	 * Wait for the interrupts of the processed transfers in multiple
+	 * of 8 and for the last transfer. If the hardware is fast and
+	 * already processed all the transfers then no need to wait.
+	 */
+	for (i = 0; i < max_irq_cnt; i++) {
+		reinit_completion(transfer_comp);
+		if (max_irq_cnt != multi_xfer->irq_cnt) {
+			time_left = wait_for_completion_timeout(transfer_comp,
+								transfer_timeout_msecs);
+			if (!time_left) {
+				dev_err(dev, "%s: Transfer timeout\n", __func__);
+				return -ETIMEDOUT;
+			}
+		}
+		if (num_xfers > multi_xfer->msg_idx_cnt)
+			return 0;
+	}
+	return 0;
+}
+
+static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[],
 			struct dma_slave_config *config, dma_addr_t *dma_addr_p,
 			void **buf, unsigned int op, struct dma_chan *dma_chan)
 {
 	struct gpi_i2c_config *peripheral;
-	unsigned int flags;
 	void *dma_buf;
 	dma_addr_t addr;
 	enum dma_data_direction map_dirn;
 	enum dma_transfer_direction dma_dirn;
 	struct dma_async_tx_descriptor *desc;
 	int ret;
+	struct geni_i2c_gpi_multi_desc_xfer *gi2c_gpi_xfer;
+	dma_cookie_t cookie;
+	u32 msg_idx;
 
 	peripheral = config->peripheral_config;
-
-	dma_buf = i2c_get_dma_safe_msg_buf(msg, 1);
-	if (!dma_buf)
-		return -ENOMEM;
+	gi2c_gpi_xfer = &gi2c->i2c_multi_desc_config;
+	dma_buf = gi2c_gpi_xfer->dma_buf[gi2c_gpi_xfer->buf_idx];
+	addr = gi2c_gpi_xfer->dma_addr[gi2c_gpi_xfer->buf_idx];
+	msg_idx = gi2c_gpi_xfer->msg_idx_cnt;
+
+	dma_buf = i2c_get_dma_safe_msg_buf(&msgs[msg_idx], 1);
+	if (!dma_buf) {
+		ret = -ENOMEM;
+		goto out;
+	}
 
 	if (op == I2C_WRITE)
 		map_dirn = DMA_TO_DEVICE;
 	else
 		map_dirn = DMA_FROM_DEVICE;
 
-	addr = dma_map_single(gi2c->se.dev->parent, dma_buf, msg->len, map_dirn);
+	addr = dma_map_single(gi2c->se.dev->parent, dma_buf,
+			      msgs[msg_idx].len, map_dirn);
 	if (dma_mapping_error(gi2c->se.dev->parent, addr)) {
-		i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
-		return -ENOMEM;
+		i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	if (gi2c->is_tx_multi_desc_xfer) {
+		if (((msg_idx + 1) % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ))
+			peripheral->dma_flags = DMA_CTRL_ACK;
+		else
+			peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
+
+		/* BEI bit to be cleared for last TRE */
+		if (msg_idx == gi2c->num_msgs - 1)
+			peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
+	} else {
+		peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
 	}
 
 	/* set the length as message for rx txn */
-	peripheral->rx_len = msg->len;
+	peripheral->rx_len = msgs[msg_idx].len;
 	peripheral->op = op;
 
 	ret = dmaengine_slave_config(dma_chan, config);
@@ -568,14 +746,14 @@ static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
 
 	peripheral->set_config = 0;
 	peripheral->multi_msg = true;
-	flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
 
 	if (op == I2C_WRITE)
 		dma_dirn = DMA_MEM_TO_DEV;
 	else
 		dma_dirn = DMA_DEV_TO_MEM;
 
-	desc = dmaengine_prep_slave_single(dma_chan, addr, msg->len, dma_dirn, flags);
+	desc = dmaengine_prep_slave_single(dma_chan, addr, msgs[msg_idx].len,
+					   dma_dirn, peripheral->dma_flags);
 	if (!desc) {
 		dev_err(gi2c->se.dev, "prep_slave_sg failed\n");
 		ret = -EIO;
@@ -585,15 +763,48 @@ static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
 	desc->callback_result = i2c_gpi_cb_result;
 	desc->callback_param = gi2c;
 
-	dmaengine_submit(desc);
-	*buf = dma_buf;
-	*dma_addr_p = addr;
+	if (!((msgs[msg_idx].flags & I2C_M_RD) && op == I2C_WRITE)) {
+		gi2c_gpi_xfer->msg_idx_cnt++;
+		gi2c_gpi_xfer->buf_idx = (msg_idx + 1) % QCOM_I2C_GPI_MAX_NUM_MSGS;
+	}
+	cookie = dmaengine_submit(desc);
+	if (dma_submit_error(cookie)) {
+		dev_err(gi2c->se.dev,
+			"%s: dmaengine_submit failed (%d)\n", __func__, cookie);
+		ret = -EINVAL;
+		goto err_config;
+	}
 
+	if (gi2c->is_tx_multi_desc_xfer) {
+		dma_async_issue_pending(gi2c->tx_c);
+		if ((msg_idx == (gi2c->num_msgs - 1)) ||
+		    (gi2c_gpi_xfer->msg_idx_cnt >=
+		     QCOM_I2C_GPI_MAX_NUM_MSGS + gi2c_gpi_xfer->freed_msg_cnt)) {
+			ret = geni_i2c_gpi_multi_xfer_timeout_handler(gi2c->se.dev, gi2c_gpi_xfer,
+								      gi2c->num_msgs, XFER_TIMEOUT,
+								      &gi2c->done);
+			if (ret) {
+				dev_err(gi2c->se.dev,
+					"I2C multi write msg transfer timeout: %d\n",
+					ret);
+				gi2c->err = ret;
+				goto err_config;
+			}
+		}
+	} else {
+		/* Non multi descriptor message transfer */
+		*buf = dma_buf;
+		*dma_addr_p = addr;
+	}
 	return 0;
 
 err_config:
-	dma_unmap_single(gi2c->se.dev->parent, addr, msg->len, map_dirn);
-	i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
+	dma_unmap_single(gi2c->se.dev->parent, addr,
+			 msgs[msg_idx].len, map_dirn);
+	i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
+
+out:
+	gi2c->err = ret;
 	return ret;
 }
 
@@ -605,6 +816,7 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
 	unsigned long time_left;
 	dma_addr_t tx_addr, rx_addr;
 	void *tx_buf = NULL, *rx_buf = NULL;
+	struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer;
 	const struct geni_i2c_clk_fld *itr = gi2c->clk_fld;
 
 	config.peripheral_config = &peripheral;
@@ -618,6 +830,33 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
 	peripheral.set_config = 1;
 	peripheral.multi_msg = false;
 
+	gi2c->num_msgs = num;
+	gi2c->is_tx_multi_desc_xfer = false;
+	gi2c->tx_irq_cnt = 0;
+
+	tx_multi_xfer = &gi2c->i2c_multi_desc_config;
+	memset(tx_multi_xfer, 0, sizeof(struct geni_i2c_gpi_multi_desc_xfer));
+
+	/*
+	 * If number of write messages are two and higher then
+	 * configure hardware for multi descriptor transfers with BEI.
+	 */
+	if (num >= QCOM_I2C_MIN_NUM_OF_MSGS_MULTI_DESC) {
+		gi2c->is_tx_multi_desc_xfer = true;
+		for (i = 0; i < num; i++) {
+			if (msgs[i].flags & I2C_M_RD) {
+				/*
+				 * Multi descriptor transfer with BEI
+				 * support is enabled for write transfers.
+				 * TODO: Add BEI optimization support for
+				 * read transfers later.
+				 */
+				gi2c->is_tx_multi_desc_xfer = false;
+				break;
+			}
+		}
+	}
+
 	for (i = 0; i < num; i++) {
 		gi2c->cur = &msgs[i];
 		gi2c->err = 0;
@@ -628,14 +867,16 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
 			peripheral.stretch = 1;
 
 		peripheral.addr = msgs[i].addr;
+		if (i > 0 && (!(msgs[i].flags & I2C_M_RD)))
+			peripheral.multi_msg = false;
 
-		ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
+		ret =  geni_i2c_gpi(gi2c, msgs, &config,
 				    &tx_addr, &tx_buf, I2C_WRITE, gi2c->tx_c);
 		if (ret)
 			goto err;
 
 		if (msgs[i].flags & I2C_M_RD) {
-			ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
+			ret =  geni_i2c_gpi(gi2c, msgs, &config,
 					    &rx_addr, &rx_buf, I2C_READ, gi2c->rx_c);
 			if (ret)
 				goto err;
@@ -643,18 +884,26 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
 			dma_async_issue_pending(gi2c->rx_c);
 		}
 
-		dma_async_issue_pending(gi2c->tx_c);
-
-		time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
-		if (!time_left)
-			gi2c->err = -ETIMEDOUT;
+		if (!gi2c->is_tx_multi_desc_xfer) {
+			dma_async_issue_pending(gi2c->tx_c);
+			time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
+			if (!time_left) {
+				dev_err(gi2c->se.dev, "%s:I2C timeout\n", __func__);
+				gi2c->err = -ETIMEDOUT;
+			}
+		}
 
 		if (gi2c->err) {
 			ret = gi2c->err;
 			goto err;
 		}
 
-		geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
+		if (!gi2c->is_tx_multi_desc_xfer) {
+			geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
+		} else if (gi2c->tx_irq_cnt != tx_multi_xfer->irq_cnt) {
+			gi2c->tx_irq_cnt = tx_multi_xfer->irq_cnt;
+			geni_i2c_gpi_multi_desc_unmap(gi2c, msgs, &peripheral);
+		}
 	}
 
 	return num;
@@ -663,7 +912,11 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
 	dev_err(gi2c->se.dev, "GPI transfer failed: %d\n", ret);
 	dmaengine_terminate_sync(gi2c->rx_c);
 	dmaengine_terminate_sync(gi2c->tx_c);
-	geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
+	if (gi2c->is_tx_multi_desc_xfer)
+		geni_i2c_gpi_multi_desc_unmap(gi2c, msgs, &peripheral);
+	else
+		geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
+
 	return ret;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-06 11:18 ` [PATCH v6 2/2] i2c: i2c-qcom-geni: Add " Jyothi Kumar Seerapu
@ 2025-05-06 11:46   ` Dmitry Baryshkov
  2025-05-09  6:18     ` Jyothi Kumar Seerapu
  2025-05-07  2:21   ` ALOK TIWARI
  1 sibling, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2025-05-06 11:46 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
> The I2C driver gets an interrupt upon transfer completion.
> When handling multiple messages in a single transfer, this
> results in N interrupts for N messages, leading to significant
> software interrupt latency.
> 
> To mitigate this latency, utilize Block Event Interrupt (BEI)
> mechanism. Enabling BEI instructs the hardware to prevent interrupt
> generation and BEI is disabled when an interrupt is necessary.
> 
> Large I2C transfer can be divided into chunks of 8 messages internally.
> Interrupts are not expected for the first 7 message completions, only
> the last message triggers an interrupt, indicating the completion of
> 8 messages. This BEI mechanism enhances overall transfer efficiency.

Why do you need this complexity? Is it possible to set the
DMA_PREP_INTERRUPT flag on the last message in the transfer?

> 
> This optimization reduces transfer time from 168 ms to 48 ms for a
> series of 200 I2C write messages in a single transfer, with a
> clock frequency support of 100 kHz.
> 
> BEI optimizations are currently implemented for I2C write transfers only,
> as there is no use case for multiple I2C read messages in a single transfer
> at this time.
> 
> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
> ---
> v5 -> v6:
>   - Instead of using bei_flag, moved the logic to use with DMA
>     supported flags like DMA_PREP_INTERRUPT.
>   - Additional parameter comments removed from geni_i2c_gpi_multi_desc_unmap
>     function documentation.
>    
> v4 -> v5:
>   - Block event interrupt flag naming changed from flags to bei_flag.
>   - Documentation added for "struct geni_i2c_dev".
> 
> v3 -> v4:
>   - API's added for Block event interrupt with multi descriptor support for
>     I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
>   - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
>     I2C driver.
>   - geni_i2c_gpi_multi_desc_xfer structure is added as a member of
>     struct geni_i2c_dev.
> 
> v2 -> v3:
>    - In i2c_gpi_cb_result function, moved the logic of
>     "!is_tx_multi_xfer" to else.
>    - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
>    - Updated commit description
> 
> v1 -> v2:
>    - Moved gi2c_gpi_xfer->msg_idx_cnt to separate local variable.
>    - Updated goto labels for error scenarios in geni_i2c_gpi function
>    - memset tx_multi_xfer to 0.
>    - Removed passing current msg index to geni_i2c_gpi
>    - Fixed kernel test robot reported compilation issues.
> 
> 
>  drivers/i2c/busses/i2c-qcom-geni.c | 307 ++++++++++++++++++++++++++---
>  1 file changed, 280 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
> index 515a784c951c..e390cf5b4ddc 100644
> --- a/drivers/i2c/busses/i2c-qcom-geni.c
> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> @@ -78,6 +78,62 @@ enum geni_i2c_err_code {
>  #define XFER_TIMEOUT		HZ
>  #define RST_TIMEOUT		HZ
>  
> +#define QCOM_I2C_GPI_MAX_NUM_MSGS		16
> +#define QCOM_I2C_GPI_NUM_MSGS_PER_IRQ		8
> +#define QCOM_I2C_MIN_NUM_OF_MSGS_MULTI_DESC	2
> +
> +/**
> + * struct geni_i2c_gpi_multi_desc_xfer - Used for multi transfer support
> + *
> + * @msg_idx_cnt: message index for the transfer
> + * @buf_idx: dma buffer index
> + * @unmap_msg_cnt: unmapped transfer index
> + * @freed_msg_cnt: freed transfer index
> + * @irq_cnt: received interrupt count
> + * @irq_msg_cnt: transfer message count for the received irqs
> + * @dma_buf: virtual addresses of the buffers
> + * @dma_addr: dma addresses of the buffers
> + */
> +struct geni_i2c_gpi_multi_desc_xfer {
> +	u32 msg_idx_cnt;
> +	u32 buf_idx;
> +	u32 unmap_msg_cnt;
> +	u32 freed_msg_cnt;
> +	u32 irq_cnt;
> +	u32 irq_msg_cnt;
> +	void *dma_buf[QCOM_I2C_GPI_MAX_NUM_MSGS];
> +	dma_addr_t dma_addr[QCOM_I2C_GPI_MAX_NUM_MSGS];
> +};
> +
> +/**
> + * struct geni_i2c_dev - I2C Geni device specific structure
> + *
> + * @se: geni serial engine
> + * @tx_wm: Tx watermark level
> + * @irq: i2c serial engine interrupt
> + * @err: specifies error codes in i2c transfer failures
> + * @adap: i2c geni adapter
> + * @done: completion variable
> + * @cur: pointer to the i2c_msg mentioning current i2c message in use
> + * @cur_wr: variable used for i2c write opertions
> + * @cur_rd: variable used for i2c read operations
> + * @lock: spinlock variable used for synchronization
> + * @core_clk: pointer to clk
> + * @clk_freq_out: contains the i2c clock frequency
> + * @clk_fld: pointer to geni_i2c_clk_fld
> + * @suspended: flag used for system supend status
> + * @dma_buf: virtual address of the buffer
> + * @xfer_len: holds length for the dma operation
> + * @dma_addr: dma address of the buffer
> + * @tx_c: Tx dma channel
> + * @rx_c: Rx dma channel
> + * @gpi_mode: GPI DMA mode of operation
> + * @abort_done: true for marking i2c abort transfer
> + * @is_tx_multi_desc_xfer: true for i2c multi transfer support
> + * @num_msgs: number of i2c messages in a transfer
> + * @tx_irq_cnt: flag used for tx irq count in i2c multi transfer
> + * @i2c_multi_desc_config: used for multi transfer support
> + */

Unrelated. Documentation should go to a separate patch.

>  struct geni_i2c_dev {
>  	struct geni_se se;
>  	u32 tx_wm;
> @@ -100,6 +156,10 @@ struct geni_i2c_dev {
>  	struct dma_chan *rx_c;
>  	bool gpi_mode;
>  	bool abort_done;
> +	bool is_tx_multi_desc_xfer;
> +	u32 num_msgs;
> +	u32 tx_irq_cnt;
> +	struct geni_i2c_gpi_multi_desc_xfer i2c_multi_desc_config;
>  };
>  
>  struct geni_i2c_desc {
> @@ -500,6 +560,7 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>  static void i2c_gpi_cb_result(void *cb, const struct dmaengine_result *result)
>  {
>  	struct geni_i2c_dev *gi2c = cb;
> +	struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer;

Define it in the corresponding chunk.

>  
>  	if (result->result != DMA_TRANS_NOERROR) {
>  		dev_err(gi2c->se.dev, "DMA txn failed:%d\n", result->result);
> @@ -508,7 +569,22 @@ static void i2c_gpi_cb_result(void *cb, const struct dmaengine_result *result)
>  		dev_dbg(gi2c->se.dev, "DMA xfer has pending: %d\n", result->residue);
>  	}
>  
> -	complete(&gi2c->done);
> +	if (!gi2c->is_tx_multi_desc_xfer) {
> +		complete(&gi2c->done);
> +	} else {
> +		tx_multi_xfer = &gi2c->i2c_multi_desc_config;
> +
> +		/*
> +		 * Send Completion for last message or multiple of
> +		 * QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> +		 */
> +		if ((tx_multi_xfer->irq_msg_cnt == gi2c->num_msgs - 1) ||
> +		    (!((tx_multi_xfer->irq_msg_cnt + 1) % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ))) {
> +			tx_multi_xfer->irq_cnt++;
> +			complete(&gi2c->done);
> +		}
> +		tx_multi_xfer->irq_msg_cnt++;
> +	}
>  }
>  
>  static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> @@ -526,38 +602,140 @@ static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>  	}
>  }
>  
> -static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
> +/**
> + * geni_i2c_gpi_multi_desc_unmap() - unmaps the buffers post multi message TX transfers
> + * @gi2c: i2c dev handle
> + * @msgs: i2c messages array
> + * @peripheral: pointer to gpi_i2c_config
> + */
> +static void geni_i2c_gpi_multi_desc_unmap(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[],
> +					  struct gpi_i2c_config *peripheral)
> +{
> +	u32 msg_xfer_cnt, wr_idx = 0;
> +	struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer = &gi2c->i2c_multi_desc_config;
> +
> +	/*
> +	 * In error case, need to unmap all messages based on the msg_idx_cnt.
> +	 * Non-error case unmap all the processed messages.
> +	 */
> +	if (gi2c->err)
> +		msg_xfer_cnt = tx_multi_xfer->msg_idx_cnt;
> +	else
> +		msg_xfer_cnt = tx_multi_xfer->irq_cnt * QCOM_I2C_GPI_NUM_MSGS_PER_IRQ;
> +
> +	/* Unmap the processed DMA buffers based on the received interrupt count */
> +	for (; tx_multi_xfer->unmap_msg_cnt < msg_xfer_cnt; tx_multi_xfer->unmap_msg_cnt++) {
> +		if (tx_multi_xfer->unmap_msg_cnt == gi2c->num_msgs)
> +			break;
> +		wr_idx = tx_multi_xfer->unmap_msg_cnt % QCOM_I2C_GPI_MAX_NUM_MSGS;
> +		geni_i2c_gpi_unmap(gi2c, &msgs[tx_multi_xfer->unmap_msg_cnt],
> +				   tx_multi_xfer->dma_buf[wr_idx],
> +				   tx_multi_xfer->dma_addr[wr_idx],
> +				   NULL, (dma_addr_t)NULL);
> +		tx_multi_xfer->freed_msg_cnt++;
> +	}
> +}
> +
> +/**
> + * geni_i2c_gpi_multi_xfer_timeout_handler() - Handle multi message transfer timeout
> + * @dev: pointer to the corresponding dev node
> + * @multi_xfer: pointer to the geni_i2c_gpi_multi_desc_xfer
> + * @num_xfers: total number of transfers
> + * @transfer_timeout_msecs: transfer timeout value
> + * @transfer_comp: completion object of the transfer
> + *
> + * This function is used to wait for the processed transfers based on
> + * the interrupts generated upon transfer completion.
> + * Return: On success returns 0, otherwise return error code (-ETIMEDOUT)
> + */
> +static int geni_i2c_gpi_multi_xfer_timeout_handler(struct device *dev,
> +						   struct geni_i2c_gpi_multi_desc_xfer *multi_xfer,
> +						   u32 num_xfers, u32 transfer_timeout_msecs,
> +						   struct completion *transfer_comp)
> +{
> +	int i;
> +	u32 max_irq_cnt, time_left;
> +
> +	max_irq_cnt = num_xfers / QCOM_I2C_GPI_NUM_MSGS_PER_IRQ;
> +	if (num_xfers % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ)
> +		max_irq_cnt++;
> +
> +	/*
> +	 * Wait for the interrupts of the processed transfers in multiple
> +	 * of 8 and for the last transfer. If the hardware is fast and
> +	 * already processed all the transfers then no need to wait.
> +	 */
> +	for (i = 0; i < max_irq_cnt; i++) {
> +		reinit_completion(transfer_comp);
> +		if (max_irq_cnt != multi_xfer->irq_cnt) {
> +			time_left = wait_for_completion_timeout(transfer_comp,
> +								transfer_timeout_msecs);
> +			if (!time_left) {
> +				dev_err(dev, "%s: Transfer timeout\n", __func__);
> +				return -ETIMEDOUT;
> +			}
> +		}
> +		if (num_xfers > multi_xfer->msg_idx_cnt)
> +			return 0;
> +	}
> +	return 0;
> +}
> +
> +static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[],
>  			struct dma_slave_config *config, dma_addr_t *dma_addr_p,
>  			void **buf, unsigned int op, struct dma_chan *dma_chan)
>  {
>  	struct gpi_i2c_config *peripheral;
> -	unsigned int flags;
>  	void *dma_buf;
>  	dma_addr_t addr;
>  	enum dma_data_direction map_dirn;
>  	enum dma_transfer_direction dma_dirn;
>  	struct dma_async_tx_descriptor *desc;
>  	int ret;
> +	struct geni_i2c_gpi_multi_desc_xfer *gi2c_gpi_xfer;
> +	dma_cookie_t cookie;
> +	u32 msg_idx;
>  
>  	peripheral = config->peripheral_config;
> -
> -	dma_buf = i2c_get_dma_safe_msg_buf(msg, 1);
> -	if (!dma_buf)
> -		return -ENOMEM;
> +	gi2c_gpi_xfer = &gi2c->i2c_multi_desc_config;
> +	dma_buf = gi2c_gpi_xfer->dma_buf[gi2c_gpi_xfer->buf_idx];
> +	addr = gi2c_gpi_xfer->dma_addr[gi2c_gpi_xfer->buf_idx];
> +	msg_idx = gi2c_gpi_xfer->msg_idx_cnt;
> +
> +	dma_buf = i2c_get_dma_safe_msg_buf(&msgs[msg_idx], 1);
> +	if (!dma_buf) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
>  
>  	if (op == I2C_WRITE)
>  		map_dirn = DMA_TO_DEVICE;
>  	else
>  		map_dirn = DMA_FROM_DEVICE;
>  
> -	addr = dma_map_single(gi2c->se.dev->parent, dma_buf, msg->len, map_dirn);
> +	addr = dma_map_single(gi2c->se.dev->parent, dma_buf,
> +			      msgs[msg_idx].len, map_dirn);
>  	if (dma_mapping_error(gi2c->se.dev->parent, addr)) {
> -		i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
> -		return -ENOMEM;
> +		i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	if (gi2c->is_tx_multi_desc_xfer) {
> +		if (((msg_idx + 1) % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ))
> +			peripheral->dma_flags = DMA_CTRL_ACK;
> +		else
> +			peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
> +
> +		/* BEI bit to be cleared for last TRE */
> +		if (msg_idx == gi2c->num_msgs - 1)
> +			peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
> +	} else {
> +		peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;

No-no-no. There is no need to set peripheral->dma_flags.

>  	}
>  
>  	/* set the length as message for rx txn */
> -	peripheral->rx_len = msg->len;
> +	peripheral->rx_len = msgs[msg_idx].len;
>  	peripheral->op = op;
>  
>  	ret = dmaengine_slave_config(dma_chan, config);
> @@ -568,14 +746,14 @@ static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>  
>  	peripheral->set_config = 0;
>  	peripheral->multi_msg = true;
> -	flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>  
>  	if (op == I2C_WRITE)
>  		dma_dirn = DMA_MEM_TO_DEV;
>  	else
>  		dma_dirn = DMA_DEV_TO_MEM;
>  
> -	desc = dmaengine_prep_slave_single(dma_chan, addr, msg->len, dma_dirn, flags);
> +	desc = dmaengine_prep_slave_single(dma_chan, addr, msgs[msg_idx].len,
> +					   dma_dirn, peripheral->dma_flags);
>  	if (!desc) {
>  		dev_err(gi2c->se.dev, "prep_slave_sg failed\n");
>  		ret = -EIO;
> @@ -585,15 +763,48 @@ static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>  	desc->callback_result = i2c_gpi_cb_result;
>  	desc->callback_param = gi2c;
>  
> -	dmaengine_submit(desc);
> -	*buf = dma_buf;
> -	*dma_addr_p = addr;
> +	if (!((msgs[msg_idx].flags & I2C_M_RD) && op == I2C_WRITE)) {
> +		gi2c_gpi_xfer->msg_idx_cnt++;
> +		gi2c_gpi_xfer->buf_idx = (msg_idx + 1) % QCOM_I2C_GPI_MAX_NUM_MSGS;
> +	}
> +	cookie = dmaengine_submit(desc);
> +	if (dma_submit_error(cookie)) {
> +		dev_err(gi2c->se.dev,
> +			"%s: dmaengine_submit failed (%d)\n", __func__, cookie);
> +		ret = -EINVAL;
> +		goto err_config;
> +	}
>  
> +	if (gi2c->is_tx_multi_desc_xfer) {
> +		dma_async_issue_pending(gi2c->tx_c);
> +		if ((msg_idx == (gi2c->num_msgs - 1)) ||
> +		    (gi2c_gpi_xfer->msg_idx_cnt >=
> +		     QCOM_I2C_GPI_MAX_NUM_MSGS + gi2c_gpi_xfer->freed_msg_cnt)) {
> +			ret = geni_i2c_gpi_multi_xfer_timeout_handler(gi2c->se.dev, gi2c_gpi_xfer,
> +								      gi2c->num_msgs, XFER_TIMEOUT,
> +								      &gi2c->done);
> +			if (ret) {
> +				dev_err(gi2c->se.dev,
> +					"I2C multi write msg transfer timeout: %d\n",
> +					ret);
> +				gi2c->err = ret;
> +				goto err_config;
> +			}
> +		}
> +	} else {
> +		/* Non multi descriptor message transfer */
> +		*buf = dma_buf;
> +		*dma_addr_p = addr;
> +	}
>  	return 0;
>  
>  err_config:
> -	dma_unmap_single(gi2c->se.dev->parent, addr, msg->len, map_dirn);
> -	i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
> +	dma_unmap_single(gi2c->se.dev->parent, addr,
> +			 msgs[msg_idx].len, map_dirn);
> +	i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
> +
> +out:
> +	gi2c->err = ret;
>  	return ret;
>  }
>  
> @@ -605,6 +816,7 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>  	unsigned long time_left;
>  	dma_addr_t tx_addr, rx_addr;
>  	void *tx_buf = NULL, *rx_buf = NULL;
> +	struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer;
>  	const struct geni_i2c_clk_fld *itr = gi2c->clk_fld;
>  
>  	config.peripheral_config = &peripheral;
> @@ -618,6 +830,33 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>  	peripheral.set_config = 1;
>  	peripheral.multi_msg = false;
>  
> +	gi2c->num_msgs = num;
> +	gi2c->is_tx_multi_desc_xfer = false;
> +	gi2c->tx_irq_cnt = 0;
> +
> +	tx_multi_xfer = &gi2c->i2c_multi_desc_config;
> +	memset(tx_multi_xfer, 0, sizeof(struct geni_i2c_gpi_multi_desc_xfer));
> +
> +	/*
> +	 * If number of write messages are two and higher then
> +	 * configure hardware for multi descriptor transfers with BEI.
> +	 */
> +	if (num >= QCOM_I2C_MIN_NUM_OF_MSGS_MULTI_DESC) {
> +		gi2c->is_tx_multi_desc_xfer = true;
> +		for (i = 0; i < num; i++) {
> +			if (msgs[i].flags & I2C_M_RD) {
> +				/*
> +				 * Multi descriptor transfer with BEI
> +				 * support is enabled for write transfers.
> +				 * TODO: Add BEI optimization support for
> +				 * read transfers later.
> +				 */
> +				gi2c->is_tx_multi_desc_xfer = false;
> +				break;
> +			}
> +		}
> +	}
> +
>  	for (i = 0; i < num; i++) {
>  		gi2c->cur = &msgs[i];
>  		gi2c->err = 0;
> @@ -628,14 +867,16 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>  			peripheral.stretch = 1;
>  
>  		peripheral.addr = msgs[i].addr;
> +		if (i > 0 && (!(msgs[i].flags & I2C_M_RD)))
> +			peripheral.multi_msg = false;
>  
> -		ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
> +		ret =  geni_i2c_gpi(gi2c, msgs, &config,
>  				    &tx_addr, &tx_buf, I2C_WRITE, gi2c->tx_c);

Can't you just pass the flag to set DMA_PREP_INTERRUPT for the last
message here and on the next geni_i2c_gpi() call?

>  		if (ret)
>  			goto err;
>  
>  		if (msgs[i].flags & I2C_M_RD) {
> -			ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
> +			ret =  geni_i2c_gpi(gi2c, msgs, &config,
>  					    &rx_addr, &rx_buf, I2C_READ, gi2c->rx_c);
>  			if (ret)
>  				goto err;
> @@ -643,18 +884,26 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>  			dma_async_issue_pending(gi2c->rx_c);
>  		}
>  
> -		dma_async_issue_pending(gi2c->tx_c);
> -
> -		time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
> -		if (!time_left)
> -			gi2c->err = -ETIMEDOUT;
> +		if (!gi2c->is_tx_multi_desc_xfer) {
> +			dma_async_issue_pending(gi2c->tx_c);
> +			time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
> +			if (!time_left) {
> +				dev_err(gi2c->se.dev, "%s:I2C timeout\n", __func__);
> +				gi2c->err = -ETIMEDOUT;
> +			}
> +		}
>  
>  		if (gi2c->err) {
>  			ret = gi2c->err;
>  			goto err;
>  		}
>  
> -		geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
> +		if (!gi2c->is_tx_multi_desc_xfer) {
> +			geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
> +		} else if (gi2c->tx_irq_cnt != tx_multi_xfer->irq_cnt) {
> +			gi2c->tx_irq_cnt = tx_multi_xfer->irq_cnt;
> +			geni_i2c_gpi_multi_desc_unmap(gi2c, msgs, &peripheral);
> +		}
>  	}
>  
>  	return num;
> @@ -663,7 +912,11 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>  	dev_err(gi2c->se.dev, "GPI transfer failed: %d\n", ret);
>  	dmaengine_terminate_sync(gi2c->rx_c);
>  	dmaengine_terminate_sync(gi2c->tx_c);
> -	geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
> +	if (gi2c->is_tx_multi_desc_xfer)
> +		geni_i2c_gpi_multi_desc_unmap(gi2c, msgs, &peripheral);
> +	else
> +		geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
> +
>  	return ret;
>  }
>  
> -- 
> 2.17.1
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-06 11:46   ` Dmitry Baryshkov
@ 2025-05-09  6:18     ` Jyothi Kumar Seerapu
  2025-05-09 16:01       ` Dmitry Baryshkov
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-05-09  6:18 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

Hi Dimitry, Thanks for providing the review comments.

On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
>> The I2C driver gets an interrupt upon transfer completion.
>> When handling multiple messages in a single transfer, this
>> results in N interrupts for N messages, leading to significant
>> software interrupt latency.
>>
>> To mitigate this latency, utilize Block Event Interrupt (BEI)
>> mechanism. Enabling BEI instructs the hardware to prevent interrupt
>> generation and BEI is disabled when an interrupt is necessary.
>>
>> Large I2C transfer can be divided into chunks of 8 messages internally.
>> Interrupts are not expected for the first 7 message completions, only
>> the last message triggers an interrupt, indicating the completion of
>> 8 messages. This BEI mechanism enhances overall transfer efficiency.
> 
> Why do you need this complexity? Is it possible to set the
> DMA_PREP_INTERRUPT flag on the last message in the transfer?

If i undertsand correctly, the suggestion is to get the single 
intetrrupt for last i2c message only.

But With this approach, we can't handle large number of i2c messages in 
the transfer.

In GPI driver, number of max TREs support is harcoded to 64 (#define 
CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and DMA 
TREs. So, the avilable TREs are not sufficient to handle all the N messages.

Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS or 
'num' incase for less messsages), process and unmap/free upon the 
interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.


> 
>>
>> This optimization reduces transfer time from 168 ms to 48 ms for a
>> series of 200 I2C write messages in a single transfer, with a
>> clock frequency support of 100 kHz.
>>
>> BEI optimizations are currently implemented for I2C write transfers only,
>> as there is no use case for multiple I2C read messages in a single transfer
>> at this time.
>>
>> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
>> ---
>> v5 -> v6:
>>    - Instead of using bei_flag, moved the logic to use with DMA
>>      supported flags like DMA_PREP_INTERRUPT.
>>    - Additional parameter comments removed from geni_i2c_gpi_multi_desc_unmap
>>      function documentation.
>>     
>> v4 -> v5:
>>    - Block event interrupt flag naming changed from flags to bei_flag.
>>    - Documentation added for "struct geni_i2c_dev".
>>
>> v3 -> v4:
>>    - API's added for Block event interrupt with multi descriptor support for
>>      I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
>>    - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
>>      I2C driver.
>>    - geni_i2c_gpi_multi_desc_xfer structure is added as a member of
>>      struct geni_i2c_dev.
>>
>> v2 -> v3:
>>     - In i2c_gpi_cb_result function, moved the logic of
>>      "!is_tx_multi_xfer" to else.
>>     - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
>>     - Updated commit description
>>
>> v1 -> v2:
>>     - Moved gi2c_gpi_xfer->msg_idx_cnt to separate local variable.
>>     - Updated goto labels for error scenarios in geni_i2c_gpi function
>>     - memset tx_multi_xfer to 0.
>>     - Removed passing current msg index to geni_i2c_gpi
>>     - Fixed kernel test robot reported compilation issues.
>>
>>
>>   drivers/i2c/busses/i2c-qcom-geni.c | 307 ++++++++++++++++++++++++++---
>>   1 file changed, 280 insertions(+), 27 deletions(-)
>>
>> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
>> index 515a784c951c..e390cf5b4ddc 100644
>> --- a/drivers/i2c/busses/i2c-qcom-geni.c
>> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
>> @@ -78,6 +78,62 @@ enum geni_i2c_err_code {
>>   #define XFER_TIMEOUT		HZ
>>   #define RST_TIMEOUT		HZ
>>   
>> +#define QCOM_I2C_GPI_MAX_NUM_MSGS		16
>> +#define QCOM_I2C_GPI_NUM_MSGS_PER_IRQ		8
>> +#define QCOM_I2C_MIN_NUM_OF_MSGS_MULTI_DESC	2
>> +
>> +/**
>> + * struct geni_i2c_gpi_multi_desc_xfer - Used for multi transfer support
>> + *
>> + * @msg_idx_cnt: message index for the transfer
>> + * @buf_idx: dma buffer index
>> + * @unmap_msg_cnt: unmapped transfer index
>> + * @freed_msg_cnt: freed transfer index
>> + * @irq_cnt: received interrupt count
>> + * @irq_msg_cnt: transfer message count for the received irqs
>> + * @dma_buf: virtual addresses of the buffers
>> + * @dma_addr: dma addresses of the buffers
>> + */
>> +struct geni_i2c_gpi_multi_desc_xfer {
>> +	u32 msg_idx_cnt;
>> +	u32 buf_idx;
>> +	u32 unmap_msg_cnt;
>> +	u32 freed_msg_cnt;
>> +	u32 irq_cnt;
>> +	u32 irq_msg_cnt;
>> +	void *dma_buf[QCOM_I2C_GPI_MAX_NUM_MSGS];
>> +	dma_addr_t dma_addr[QCOM_I2C_GPI_MAX_NUM_MSGS];
>> +};
>> +
>> +/**
>> + * struct geni_i2c_dev - I2C Geni device specific structure
>> + *
>> + * @se: geni serial engine
>> + * @tx_wm: Tx watermark level
>> + * @irq: i2c serial engine interrupt
>> + * @err: specifies error codes in i2c transfer failures
>> + * @adap: i2c geni adapter
>> + * @done: completion variable
>> + * @cur: pointer to the i2c_msg mentioning current i2c message in use
>> + * @cur_wr: variable used for i2c write opertions
>> + * @cur_rd: variable used for i2c read operations
>> + * @lock: spinlock variable used for synchronization
>> + * @core_clk: pointer to clk
>> + * @clk_freq_out: contains the i2c clock frequency
>> + * @clk_fld: pointer to geni_i2c_clk_fld
>> + * @suspended: flag used for system supend status
>> + * @dma_buf: virtual address of the buffer
>> + * @xfer_len: holds length for the dma operation
>> + * @dma_addr: dma address of the buffer
>> + * @tx_c: Tx dma channel
>> + * @rx_c: Rx dma channel
>> + * @gpi_mode: GPI DMA mode of operation
>> + * @abort_done: true for marking i2c abort transfer
>> + * @is_tx_multi_desc_xfer: true for i2c multi transfer support
>> + * @num_msgs: number of i2c messages in a transfer
>> + * @tx_irq_cnt: flag used for tx irq count in i2c multi transfer
>> + * @i2c_multi_desc_config: used for multi transfer support
>> + */
> 
> Unrelated. Documentation should go to a separate patch.
Sure, I will follow it in next patch.
> 
>>   struct geni_i2c_dev {
>>   	struct geni_se se;
>>   	u32 tx_wm;
>> @@ -100,6 +156,10 @@ struct geni_i2c_dev {
>>   	struct dma_chan *rx_c;
>>   	bool gpi_mode;
>>   	bool abort_done;
>> +	bool is_tx_multi_desc_xfer;
>> +	u32 num_msgs;
>> +	u32 tx_irq_cnt;
>> +	struct geni_i2c_gpi_multi_desc_xfer i2c_multi_desc_config;
>>   };
>>   
>>   struct geni_i2c_desc {
>> @@ -500,6 +560,7 @@ static int geni_i2c_tx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>   static void i2c_gpi_cb_result(void *cb, const struct dmaengine_result *result)
>>   {
>>   	struct geni_i2c_dev *gi2c = cb;
>> +	struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer;
> 
> Define it in the corresponding chunk.
If I understand correctly, do i need to add this in else block ?
> 
>>   
>>   	if (result->result != DMA_TRANS_NOERROR) {
>>   		dev_err(gi2c->se.dev, "DMA txn failed:%d\n", result->result);
>> @@ -508,7 +569,22 @@ static void i2c_gpi_cb_result(void *cb, const struct dmaengine_result *result)
>>   		dev_dbg(gi2c->se.dev, "DMA xfer has pending: %d\n", result->residue);
>>   	}
>>   
>> -	complete(&gi2c->done);
>> +	if (!gi2c->is_tx_multi_desc_xfer) {
>> +		complete(&gi2c->done);
>> +	} else {
>> +		tx_multi_xfer = &gi2c->i2c_multi_desc_config;
>> +
>> +		/*
>> +		 * Send Completion for last message or multiple of
>> +		 * QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>> +		 */
>> +		if ((tx_multi_xfer->irq_msg_cnt == gi2c->num_msgs - 1) ||
>> +		    (!((tx_multi_xfer->irq_msg_cnt + 1) % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ))) {
>> +			tx_multi_xfer->irq_cnt++;
>> +			complete(&gi2c->done);
>> +		}
>> +		tx_multi_xfer->irq_msg_cnt++;
>> +	}
>>   }
>>   
>>   static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>> @@ -526,38 +602,140 @@ static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>   	}
>>   }
>>   
>> -static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>> +/**
>> + * geni_i2c_gpi_multi_desc_unmap() - unmaps the buffers post multi message TX transfers
>> + * @gi2c: i2c dev handle
>> + * @msgs: i2c messages array
>> + * @peripheral: pointer to gpi_i2c_config
>> + */
>> +static void geni_i2c_gpi_multi_desc_unmap(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[],
>> +					  struct gpi_i2c_config *peripheral)
>> +{
>> +	u32 msg_xfer_cnt, wr_idx = 0;
>> +	struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer = &gi2c->i2c_multi_desc_config;
>> +
>> +	/*
>> +	 * In error case, need to unmap all messages based on the msg_idx_cnt.
>> +	 * Non-error case unmap all the processed messages.
>> +	 */
>> +	if (gi2c->err)
>> +		msg_xfer_cnt = tx_multi_xfer->msg_idx_cnt;
>> +	else
>> +		msg_xfer_cnt = tx_multi_xfer->irq_cnt * QCOM_I2C_GPI_NUM_MSGS_PER_IRQ;
>> +
>> +	/* Unmap the processed DMA buffers based on the received interrupt count */
>> +	for (; tx_multi_xfer->unmap_msg_cnt < msg_xfer_cnt; tx_multi_xfer->unmap_msg_cnt++) {
>> +		if (tx_multi_xfer->unmap_msg_cnt == gi2c->num_msgs)
>> +			break;
>> +		wr_idx = tx_multi_xfer->unmap_msg_cnt % QCOM_I2C_GPI_MAX_NUM_MSGS;
>> +		geni_i2c_gpi_unmap(gi2c, &msgs[tx_multi_xfer->unmap_msg_cnt],
>> +				   tx_multi_xfer->dma_buf[wr_idx],
>> +				   tx_multi_xfer->dma_addr[wr_idx],
>> +				   NULL, (dma_addr_t)NULL);
>> +		tx_multi_xfer->freed_msg_cnt++;
>> +	}
>> +}
>> +
>> +/**
>> + * geni_i2c_gpi_multi_xfer_timeout_handler() - Handle multi message transfer timeout
>> + * @dev: pointer to the corresponding dev node
>> + * @multi_xfer: pointer to the geni_i2c_gpi_multi_desc_xfer
>> + * @num_xfers: total number of transfers
>> + * @transfer_timeout_msecs: transfer timeout value
>> + * @transfer_comp: completion object of the transfer
>> + *
>> + * This function is used to wait for the processed transfers based on
>> + * the interrupts generated upon transfer completion.
>> + * Return: On success returns 0, otherwise return error code (-ETIMEDOUT)
>> + */
>> +static int geni_i2c_gpi_multi_xfer_timeout_handler(struct device *dev,
>> +						   struct geni_i2c_gpi_multi_desc_xfer *multi_xfer,
>> +						   u32 num_xfers, u32 transfer_timeout_msecs,
>> +						   struct completion *transfer_comp)
>> +{
>> +	int i;
>> +	u32 max_irq_cnt, time_left;
>> +
>> +	max_irq_cnt = num_xfers / QCOM_I2C_GPI_NUM_MSGS_PER_IRQ;
>> +	if (num_xfers % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ)
>> +		max_irq_cnt++;
>> +
>> +	/*
>> +	 * Wait for the interrupts of the processed transfers in multiple
>> +	 * of 8 and for the last transfer. If the hardware is fast and
>> +	 * already processed all the transfers then no need to wait.
>> +	 */
>> +	for (i = 0; i < max_irq_cnt; i++) {
>> +		reinit_completion(transfer_comp);
>> +		if (max_irq_cnt != multi_xfer->irq_cnt) {
>> +			time_left = wait_for_completion_timeout(transfer_comp,
>> +								transfer_timeout_msecs);
>> +			if (!time_left) {
>> +				dev_err(dev, "%s: Transfer timeout\n", __func__);
>> +				return -ETIMEDOUT;
>> +			}
>> +		}
>> +		if (num_xfers > multi_xfer->msg_idx_cnt)
>> +			return 0;
>> +	}
>> +	return 0;
>> +}
>> +
>> +static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[],
>>   			struct dma_slave_config *config, dma_addr_t *dma_addr_p,
>>   			void **buf, unsigned int op, struct dma_chan *dma_chan)
>>   {
>>   	struct gpi_i2c_config *peripheral;
>> -	unsigned int flags;
>>   	void *dma_buf;
>>   	dma_addr_t addr;
>>   	enum dma_data_direction map_dirn;
>>   	enum dma_transfer_direction dma_dirn;
>>   	struct dma_async_tx_descriptor *desc;
>>   	int ret;
>> +	struct geni_i2c_gpi_multi_desc_xfer *gi2c_gpi_xfer;
>> +	dma_cookie_t cookie;
>> +	u32 msg_idx;
>>   
>>   	peripheral = config->peripheral_config;
>> -
>> -	dma_buf = i2c_get_dma_safe_msg_buf(msg, 1);
>> -	if (!dma_buf)
>> -		return -ENOMEM;
>> +	gi2c_gpi_xfer = &gi2c->i2c_multi_desc_config;
>> +	dma_buf = gi2c_gpi_xfer->dma_buf[gi2c_gpi_xfer->buf_idx];
>> +	addr = gi2c_gpi_xfer->dma_addr[gi2c_gpi_xfer->buf_idx];
>> +	msg_idx = gi2c_gpi_xfer->msg_idx_cnt;
>> +
>> +	dma_buf = i2c_get_dma_safe_msg_buf(&msgs[msg_idx], 1);
>> +	if (!dma_buf) {
>> +		ret = -ENOMEM;
>> +		goto out;
>> +	}
>>   
>>   	if (op == I2C_WRITE)
>>   		map_dirn = DMA_TO_DEVICE;
>>   	else
>>   		map_dirn = DMA_FROM_DEVICE;
>>   
>> -	addr = dma_map_single(gi2c->se.dev->parent, dma_buf, msg->len, map_dirn);
>> +	addr = dma_map_single(gi2c->se.dev->parent, dma_buf,
>> +			      msgs[msg_idx].len, map_dirn);
>>   	if (dma_mapping_error(gi2c->se.dev->parent, addr)) {
>> -		i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
>> -		return -ENOMEM;
>> +		i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
>> +		ret = -ENOMEM;
>> +		goto out;
>> +	}
>> +
>> +	if (gi2c->is_tx_multi_desc_xfer) {
>> +		if (((msg_idx + 1) % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ))
>> +			peripheral->dma_flags = DMA_CTRL_ACK;
>> +		else
>> +			peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>> +
>> +		/* BEI bit to be cleared for last TRE */
>> +		if (msg_idx == gi2c->num_msgs - 1)
>> +			peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>> +	} else {
>> +		peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
> 
> No-no-no. There is no need to set peripheral->dma_flags.

Initially it was "flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;".

I thought of using flags in GPI driver as well, so moved flags gpi 
include file to access b/w I2C and GPI.

But as per your suggestion i will review if 
"dma_async_tx_descriptor.flags" can be retrieved in GPI driver for 
getting DMA_PREP_INTERRUPT flag status.
> 
>>   	}
>>   
>>   	/* set the length as message for rx txn */
>> -	peripheral->rx_len = msg->len;
>> +	peripheral->rx_len = msgs[msg_idx].len;
>>   	peripheral->op = op;
>>   
>>   	ret = dmaengine_slave_config(dma_chan, config);
>> @@ -568,14 +746,14 @@ static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>   
>>   	peripheral->set_config = 0;
>>   	peripheral->multi_msg = true;
>> -	flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>>   
>>   	if (op == I2C_WRITE)
>>   		dma_dirn = DMA_MEM_TO_DEV;
>>   	else
>>   		dma_dirn = DMA_DEV_TO_MEM;
>>   
>> -	desc = dmaengine_prep_slave_single(dma_chan, addr, msg->len, dma_dirn, flags);
>> +	desc = dmaengine_prep_slave_single(dma_chan, addr, msgs[msg_idx].len,
>> +					   dma_dirn, peripheral->dma_flags);
>>   	if (!desc) {
>>   		dev_err(gi2c->se.dev, "prep_slave_sg failed\n");
>>   		ret = -EIO;
>> @@ -585,15 +763,48 @@ static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>   	desc->callback_result = i2c_gpi_cb_result;
>>   	desc->callback_param = gi2c;
>>   
>> -	dmaengine_submit(desc);
>> -	*buf = dma_buf;
>> -	*dma_addr_p = addr;
>> +	if (!((msgs[msg_idx].flags & I2C_M_RD) && op == I2C_WRITE)) {
>> +		gi2c_gpi_xfer->msg_idx_cnt++;
>> +		gi2c_gpi_xfer->buf_idx = (msg_idx + 1) % QCOM_I2C_GPI_MAX_NUM_MSGS;
>> +	}
>> +	cookie = dmaengine_submit(desc);
>> +	if (dma_submit_error(cookie)) {
>> +		dev_err(gi2c->se.dev,
>> +			"%s: dmaengine_submit failed (%d)\n", __func__, cookie);
>> +		ret = -EINVAL;
>> +		goto err_config;
>> +	}
>>   
>> +	if (gi2c->is_tx_multi_desc_xfer) {
>> +		dma_async_issue_pending(gi2c->tx_c);
>> +		if ((msg_idx == (gi2c->num_msgs - 1)) ||
>> +		    (gi2c_gpi_xfer->msg_idx_cnt >=
>> +		     QCOM_I2C_GPI_MAX_NUM_MSGS + gi2c_gpi_xfer->freed_msg_cnt)) {
>> +			ret = geni_i2c_gpi_multi_xfer_timeout_handler(gi2c->se.dev, gi2c_gpi_xfer,
>> +								      gi2c->num_msgs, XFER_TIMEOUT,
>> +								      &gi2c->done);
>> +			if (ret) {
>> +				dev_err(gi2c->se.dev,
>> +					"I2C multi write msg transfer timeout: %d\n",
>> +					ret);
>> +				gi2c->err = ret;
>> +				goto err_config;
>> +			}
>> +		}
>> +	} else {
>> +		/* Non multi descriptor message transfer */
>> +		*buf = dma_buf;
>> +		*dma_addr_p = addr;
>> +	}
>>   	return 0;
>>   
>>   err_config:
>> -	dma_unmap_single(gi2c->se.dev->parent, addr, msg->len, map_dirn);
>> -	i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
>> +	dma_unmap_single(gi2c->se.dev->parent, addr,
>> +			 msgs[msg_idx].len, map_dirn);
>> +	i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
>> +
>> +out:
>> +	gi2c->err = ret;
>>   	return ret;
>>   }
>>   
>> @@ -605,6 +816,7 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>   	unsigned long time_left;
>>   	dma_addr_t tx_addr, rx_addr;
>>   	void *tx_buf = NULL, *rx_buf = NULL;
>> +	struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer;
>>   	const struct geni_i2c_clk_fld *itr = gi2c->clk_fld;
>>   
>>   	config.peripheral_config = &peripheral;
>> @@ -618,6 +830,33 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>   	peripheral.set_config = 1;
>>   	peripheral.multi_msg = false;
>>   
>> +	gi2c->num_msgs = num;
>> +	gi2c->is_tx_multi_desc_xfer = false;
>> +	gi2c->tx_irq_cnt = 0;
>> +
>> +	tx_multi_xfer = &gi2c->i2c_multi_desc_config;
>> +	memset(tx_multi_xfer, 0, sizeof(struct geni_i2c_gpi_multi_desc_xfer));
>> +
>> +	/*
>> +	 * If number of write messages are two and higher then
>> +	 * configure hardware for multi descriptor transfers with BEI.
>> +	 */
>> +	if (num >= QCOM_I2C_MIN_NUM_OF_MSGS_MULTI_DESC) {
>> +		gi2c->is_tx_multi_desc_xfer = true;
>> +		for (i = 0; i < num; i++) {
>> +			if (msgs[i].flags & I2C_M_RD) {
>> +				/*
>> +				 * Multi descriptor transfer with BEI
>> +				 * support is enabled for write transfers.
>> +				 * TODO: Add BEI optimization support for
>> +				 * read transfers later.
>> +				 */
>> +				gi2c->is_tx_multi_desc_xfer = false;
>> +				break;
>> +			}
>> +		}
>> +	}
>> +
>>   	for (i = 0; i < num; i++) {
>>   		gi2c->cur = &msgs[i];
>>   		gi2c->err = 0;
>> @@ -628,14 +867,16 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>   			peripheral.stretch = 1;
>>   
>>   		peripheral.addr = msgs[i].addr;
>> +		if (i > 0 && (!(msgs[i].flags & I2C_M_RD)))
>> +			peripheral.multi_msg = false;
>>   
>> -		ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
>> +		ret =  geni_i2c_gpi(gi2c, msgs, &config,
>>   				    &tx_addr, &tx_buf, I2C_WRITE, gi2c->tx_c);
> 
> Can't you just pass the flag to set DMA_PREP_INTERRUPT for the last
> message here and on the next geni_i2c_gpi() call?

Sure, I will review it.
We need to set/mask DMA_PREP_INTERRUPT through dma_async_tx_descriptor 
structure and need to pass this structure to GPI and so GPI can retrives 
flags for GPI HW BEI set functionality.
> 
>>   		if (ret)
>>   			goto err;
>>   
>>   		if (msgs[i].flags & I2C_M_RD) {
>> -			ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
>> +			ret =  geni_i2c_gpi(gi2c, msgs, &config,
>>   					    &rx_addr, &rx_buf, I2C_READ, gi2c->rx_c);
>>   			if (ret)
>>   				goto err;
>> @@ -643,18 +884,26 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>   			dma_async_issue_pending(gi2c->rx_c);
>>   		}
>>   
>> -		dma_async_issue_pending(gi2c->tx_c);
>> -
>> -		time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
>> -		if (!time_left)
>> -			gi2c->err = -ETIMEDOUT;
>> +		if (!gi2c->is_tx_multi_desc_xfer) {
>> +			dma_async_issue_pending(gi2c->tx_c);
>> +			time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
>> +			if (!time_left) {
>> +				dev_err(gi2c->se.dev, "%s:I2C timeout\n", __func__);
>> +				gi2c->err = -ETIMEDOUT;
>> +			}
>> +		}
>>   
>>   		if (gi2c->err) {
>>   			ret = gi2c->err;
>>   			goto err;
>>   		}
>>   
>> -		geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
>> +		if (!gi2c->is_tx_multi_desc_xfer) {
>> +			geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
>> +		} else if (gi2c->tx_irq_cnt != tx_multi_xfer->irq_cnt) {
>> +			gi2c->tx_irq_cnt = tx_multi_xfer->irq_cnt;
>> +			geni_i2c_gpi_multi_desc_unmap(gi2c, msgs, &peripheral);
>> +		}
>>   	}
>>   
>>   	return num;
>> @@ -663,7 +912,11 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>   	dev_err(gi2c->se.dev, "GPI transfer failed: %d\n", ret);
>>   	dmaengine_terminate_sync(gi2c->rx_c);
>>   	dmaengine_terminate_sync(gi2c->tx_c);
>> -	geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
>> +	if (gi2c->is_tx_multi_desc_xfer)
>> +		geni_i2c_gpi_multi_desc_unmap(gi2c, msgs, &peripheral);
>> +	else
>> +		geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, rx_addr);
>> +
>>   	return ret;
>>   }
>>   
>> -- 
>> 2.17.1
>>
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-09  6:18     ` Jyothi Kumar Seerapu
@ 2025-05-09 16:01       ` Dmitry Baryshkov
  2025-05-21 10:28         ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2025-05-09 16:01 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
> Hi Dimitry, Thanks for providing the review comments.
> 
> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
>>> The I2C driver gets an interrupt upon transfer completion.
>>> When handling multiple messages in a single transfer, this
>>> results in N interrupts for N messages, leading to significant
>>> software interrupt latency.
>>>
>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
>>> mechanism. Enabling BEI instructs the hardware to prevent interrupt
>>> generation and BEI is disabled when an interrupt is necessary.
>>>
>>> Large I2C transfer can be divided into chunks of 8 messages internally.
>>> Interrupts are not expected for the first 7 message completions, only
>>> the last message triggers an interrupt, indicating the completion of
>>> 8 messages. This BEI mechanism enhances overall transfer efficiency.
>>
>> Why do you need this complexity? Is it possible to set the
>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
> 
> If i undertsand correctly, the suggestion is to get the single 
> intetrrupt for last i2c message only.
> 
> But With this approach, we can't handle large number of i2c messages in 
> the transfer.
> 
> In GPI driver, number of max TREs support is harcoded to 64 (#define 
> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and DMA 
> TREs. So, the avilable TREs are not sufficient to handle all the N 
> messages.

It sounds like a DMA driver issue. In other words, the DMA driver can 
know that it must issue an interrupt before exausting 64 TREs in order to

> 
> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS or 
> 'num' incase for less messsages), process and unmap/free upon the 
> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.

Why? This is some random value which has no connection with CHAN_TREs. 
Also, what if one of the platforms get a 'liter' GPI which supports less 
TREs in a single run? Or a super-premium platform which can use 256 
TREs? Please don't workaround issues from one driver in another one.

> 
> 
>>
>>>
>>> This optimization reduces transfer time from 168 ms to 48 ms for a
>>> series of 200 I2C write messages in a single transfer, with a
>>> clock frequency support of 100 kHz.
>>>
>>> BEI optimizations are currently implemented for I2C write transfers 
>>> only,
>>> as there is no use case for multiple I2C read messages in a single 
>>> transfer
>>> at this time.
>>>
>>> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
>>> ---
>>> v5 -> v6:
>>>    - Instead of using bei_flag, moved the logic to use with DMA
>>>      supported flags like DMA_PREP_INTERRUPT.
>>>    - Additional parameter comments removed from 
>>> geni_i2c_gpi_multi_desc_unmap
>>>      function documentation.
>>> v4 -> v5:
>>>    - Block event interrupt flag naming changed from flags to bei_flag.
>>>    - Documentation added for "struct geni_i2c_dev".
>>>
>>> v3 -> v4:
>>>    - API's added for Block event interrupt with multi descriptor 
>>> support for
>>>      I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver file.
>>>    - gpi_multi_xfer_timeout_handler function is moved from GPI driver to
>>>      I2C driver.
>>>    - geni_i2c_gpi_multi_desc_xfer structure is added as a member of
>>>      struct geni_i2c_dev.
>>>
>>> v2 -> v3:
>>>     - In i2c_gpi_cb_result function, moved the logic of
>>>      "!is_tx_multi_xfer" to else.
>>>     - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
>>>     - Updated commit description
>>>
>>> v1 -> v2:
>>>     - Moved gi2c_gpi_xfer->msg_idx_cnt to separate local variable.
>>>     - Updated goto labels for error scenarios in geni_i2c_gpi function
>>>     - memset tx_multi_xfer to 0.
>>>     - Removed passing current msg index to geni_i2c_gpi
>>>     - Fixed kernel test robot reported compilation issues.
>>>
>>>
>>>   drivers/i2c/busses/i2c-qcom-geni.c | 307 ++++++++++++++++++++++++++---
>>>   1 file changed, 280 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/ 
>>> i2c-qcom-geni.c
>>> index 515a784c951c..e390cf5b4ddc 100644
>>> --- a/drivers/i2c/busses/i2c-qcom-geni.c
>>> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
>>> @@ -78,6 +78,62 @@ enum geni_i2c_err_code {
>>>   #define XFER_TIMEOUT        HZ
>>>   #define RST_TIMEOUT        HZ
>>> +#define QCOM_I2C_GPI_MAX_NUM_MSGS        16
>>> +#define QCOM_I2C_GPI_NUM_MSGS_PER_IRQ        8
>>> +#define QCOM_I2C_MIN_NUM_OF_MSGS_MULTI_DESC    2
>>> +
>>> +/**
>>> + * struct geni_i2c_gpi_multi_desc_xfer - Used for multi transfer 
>>> support
>>> + *
>>> + * @msg_idx_cnt: message index for the transfer
>>> + * @buf_idx: dma buffer index
>>> + * @unmap_msg_cnt: unmapped transfer index
>>> + * @freed_msg_cnt: freed transfer index
>>> + * @irq_cnt: received interrupt count
>>> + * @irq_msg_cnt: transfer message count for the received irqs
>>> + * @dma_buf: virtual addresses of the buffers
>>> + * @dma_addr: dma addresses of the buffers
>>> + */
>>> +struct geni_i2c_gpi_multi_desc_xfer {
>>> +    u32 msg_idx_cnt;
>>> +    u32 buf_idx;
>>> +    u32 unmap_msg_cnt;
>>> +    u32 freed_msg_cnt;
>>> +    u32 irq_cnt;
>>> +    u32 irq_msg_cnt;
>>> +    void *dma_buf[QCOM_I2C_GPI_MAX_NUM_MSGS];
>>> +    dma_addr_t dma_addr[QCOM_I2C_GPI_MAX_NUM_MSGS];
>>> +};
>>> +
>>> +/**
>>> + * struct geni_i2c_dev - I2C Geni device specific structure
>>> + *
>>> + * @se: geni serial engine
>>> + * @tx_wm: Tx watermark level
>>> + * @irq: i2c serial engine interrupt
>>> + * @err: specifies error codes in i2c transfer failures
>>> + * @adap: i2c geni adapter
>>> + * @done: completion variable
>>> + * @cur: pointer to the i2c_msg mentioning current i2c message in use
>>> + * @cur_wr: variable used for i2c write opertions
>>> + * @cur_rd: variable used for i2c read operations
>>> + * @lock: spinlock variable used for synchronization
>>> + * @core_clk: pointer to clk
>>> + * @clk_freq_out: contains the i2c clock frequency
>>> + * @clk_fld: pointer to geni_i2c_clk_fld
>>> + * @suspended: flag used for system supend status
>>> + * @dma_buf: virtual address of the buffer
>>> + * @xfer_len: holds length for the dma operation
>>> + * @dma_addr: dma address of the buffer
>>> + * @tx_c: Tx dma channel
>>> + * @rx_c: Rx dma channel
>>> + * @gpi_mode: GPI DMA mode of operation
>>> + * @abort_done: true for marking i2c abort transfer
>>> + * @is_tx_multi_desc_xfer: true for i2c multi transfer support
>>> + * @num_msgs: number of i2c messages in a transfer
>>> + * @tx_irq_cnt: flag used for tx irq count in i2c multi transfer
>>> + * @i2c_multi_desc_config: used for multi transfer support
>>> + */
>>
>> Unrelated. Documentation should go to a separate patch.
> Sure, I will follow it in next patch.
>>
>>>   struct geni_i2c_dev {
>>>       struct geni_se se;
>>>       u32 tx_wm;
>>> @@ -100,6 +156,10 @@ struct geni_i2c_dev {
>>>       struct dma_chan *rx_c;
>>>       bool gpi_mode;
>>>       bool abort_done;
>>> +    bool is_tx_multi_desc_xfer;
>>> +    u32 num_msgs;
>>> +    u32 tx_irq_cnt;
>>> +    struct geni_i2c_gpi_multi_desc_xfer i2c_multi_desc_config;
>>>   };
>>>   struct geni_i2c_desc {
>>> @@ -500,6 +560,7 @@ static int geni_i2c_tx_one_msg(struct 
>>> geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>>   static void i2c_gpi_cb_result(void *cb, const struct 
>>> dmaengine_result *result)
>>>   {
>>>       struct geni_i2c_dev *gi2c = cb;
>>> +    struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer;
>>
>> Define it in the corresponding chunk.
> If I understand correctly, do i need to add this in else block ?

Yes.

>>
>>>       if (result->result != DMA_TRANS_NOERROR) {
>>>           dev_err(gi2c->se.dev, "DMA txn failed:%d\n", result->result);
>>> @@ -508,7 +569,22 @@ static void i2c_gpi_cb_result(void *cb, const 
>>> struct dmaengine_result *result)
>>>           dev_dbg(gi2c->se.dev, "DMA xfer has pending: %d\n", result- 
>>> >residue);
>>>       }
>>> -    complete(&gi2c->done);
>>> +    if (!gi2c->is_tx_multi_desc_xfer) {
>>> +        complete(&gi2c->done);
>>> +    } else {
>>> +        tx_multi_xfer = &gi2c->i2c_multi_desc_config;
>>> +
>>> +        /*
>>> +         * Send Completion for last message or multiple of
>>> +         * QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>> +         */
>>> +        if ((tx_multi_xfer->irq_msg_cnt == gi2c->num_msgs - 1) ||
>>> +            (!((tx_multi_xfer->irq_msg_cnt + 1) % 
>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ))) {
>>> +            tx_multi_xfer->irq_cnt++;
>>> +            complete(&gi2c->done);
>>> +        }
>>> +        tx_multi_xfer->irq_msg_cnt++;
>>> +    }
>>>   }
>>>   static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, struct 
>>> i2c_msg *msg,
>>> @@ -526,38 +602,140 @@ static void geni_i2c_gpi_unmap(struct 
>>> geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>>       }
>>>   }
>>> -static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>> +/**
>>> + * geni_i2c_gpi_multi_desc_unmap() - unmaps the buffers post multi 
>>> message TX transfers
>>> + * @gi2c: i2c dev handle
>>> + * @msgs: i2c messages array
>>> + * @peripheral: pointer to gpi_i2c_config
>>> + */
>>> +static void geni_i2c_gpi_multi_desc_unmap(struct geni_i2c_dev *gi2c, 
>>> struct i2c_msg msgs[],
>>> +                      struct gpi_i2c_config *peripheral)
>>> +{
>>> +    u32 msg_xfer_cnt, wr_idx = 0;
>>> +    struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer = &gi2c- 
>>> >i2c_multi_desc_config;
>>> +
>>> +    /*
>>> +     * In error case, need to unmap all messages based on the 
>>> msg_idx_cnt.
>>> +     * Non-error case unmap all the processed messages.
>>> +     */
>>> +    if (gi2c->err)
>>> +        msg_xfer_cnt = tx_multi_xfer->msg_idx_cnt;
>>> +    else
>>> +        msg_xfer_cnt = tx_multi_xfer->irq_cnt * 
>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ;
>>> +
>>> +    /* Unmap the processed DMA buffers based on the received 
>>> interrupt count */
>>> +    for (; tx_multi_xfer->unmap_msg_cnt < msg_xfer_cnt; 
>>> tx_multi_xfer->unmap_msg_cnt++) {
>>> +        if (tx_multi_xfer->unmap_msg_cnt == gi2c->num_msgs)
>>> +            break;
>>> +        wr_idx = tx_multi_xfer->unmap_msg_cnt % 
>>> QCOM_I2C_GPI_MAX_NUM_MSGS;
>>> +        geni_i2c_gpi_unmap(gi2c, &msgs[tx_multi_xfer->unmap_msg_cnt],
>>> +                   tx_multi_xfer->dma_buf[wr_idx],
>>> +                   tx_multi_xfer->dma_addr[wr_idx],
>>> +                   NULL, (dma_addr_t)NULL);
>>> +        tx_multi_xfer->freed_msg_cnt++;
>>> +    }
>>> +}
>>> +
>>> +/**
>>> + * geni_i2c_gpi_multi_xfer_timeout_handler() - Handle multi message 
>>> transfer timeout
>>> + * @dev: pointer to the corresponding dev node
>>> + * @multi_xfer: pointer to the geni_i2c_gpi_multi_desc_xfer
>>> + * @num_xfers: total number of transfers
>>> + * @transfer_timeout_msecs: transfer timeout value
>>> + * @transfer_comp: completion object of the transfer
>>> + *
>>> + * This function is used to wait for the processed transfers based on
>>> + * the interrupts generated upon transfer completion.
>>> + * Return: On success returns 0, otherwise return error code (- 
>>> ETIMEDOUT)
>>> + */
>>> +static int geni_i2c_gpi_multi_xfer_timeout_handler(struct device *dev,
>>> +                           struct geni_i2c_gpi_multi_desc_xfer 
>>> *multi_xfer,
>>> +                           u32 num_xfers, u32 transfer_timeout_msecs,
>>> +                           struct completion *transfer_comp)
>>> +{
>>> +    int i;
>>> +    u32 max_irq_cnt, time_left;
>>> +
>>> +    max_irq_cnt = num_xfers / QCOM_I2C_GPI_NUM_MSGS_PER_IRQ;
>>> +    if (num_xfers % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ)
>>> +        max_irq_cnt++;
>>> +
>>> +    /*
>>> +     * Wait for the interrupts of the processed transfers in multiple
>>> +     * of 8 and for the last transfer. If the hardware is fast and
>>> +     * already processed all the transfers then no need to wait.
>>> +     */
>>> +    for (i = 0; i < max_irq_cnt; i++) {
>>> +        reinit_completion(transfer_comp);
>>> +        if (max_irq_cnt != multi_xfer->irq_cnt) {
>>> +            time_left = wait_for_completion_timeout(transfer_comp,
>>> +                                transfer_timeout_msecs);
>>> +            if (!time_left) {
>>> +                dev_err(dev, "%s: Transfer timeout\n", __func__);
>>> +                return -ETIMEDOUT;
>>> +            }
>>> +        }
>>> +        if (num_xfers > multi_xfer->msg_idx_cnt)
>>> +            return 0;
>>> +    }
>>> +    return 0;
>>> +}
>>> +
>>> +static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg 
>>> msgs[],
>>>               struct dma_slave_config *config, dma_addr_t *dma_addr_p,
>>>               void **buf, unsigned int op, struct dma_chan *dma_chan)
>>>   {
>>>       struct gpi_i2c_config *peripheral;
>>> -    unsigned int flags;
>>>       void *dma_buf;
>>>       dma_addr_t addr;
>>>       enum dma_data_direction map_dirn;
>>>       enum dma_transfer_direction dma_dirn;
>>>       struct dma_async_tx_descriptor *desc;
>>>       int ret;
>>> +    struct geni_i2c_gpi_multi_desc_xfer *gi2c_gpi_xfer;
>>> +    dma_cookie_t cookie;
>>> +    u32 msg_idx;
>>>       peripheral = config->peripheral_config;
>>> -
>>> -    dma_buf = i2c_get_dma_safe_msg_buf(msg, 1);
>>> -    if (!dma_buf)
>>> -        return -ENOMEM;
>>> +    gi2c_gpi_xfer = &gi2c->i2c_multi_desc_config;
>>> +    dma_buf = gi2c_gpi_xfer->dma_buf[gi2c_gpi_xfer->buf_idx];
>>> +    addr = gi2c_gpi_xfer->dma_addr[gi2c_gpi_xfer->buf_idx];
>>> +    msg_idx = gi2c_gpi_xfer->msg_idx_cnt;
>>> +
>>> +    dma_buf = i2c_get_dma_safe_msg_buf(&msgs[msg_idx], 1);
>>> +    if (!dma_buf) {
>>> +        ret = -ENOMEM;
>>> +        goto out;
>>> +    }
>>>       if (op == I2C_WRITE)
>>>           map_dirn = DMA_TO_DEVICE;
>>>       else
>>>           map_dirn = DMA_FROM_DEVICE;
>>> -    addr = dma_map_single(gi2c->se.dev->parent, dma_buf, msg->len, 
>>> map_dirn);
>>> +    addr = dma_map_single(gi2c->se.dev->parent, dma_buf,
>>> +                  msgs[msg_idx].len, map_dirn);
>>>       if (dma_mapping_error(gi2c->se.dev->parent, addr)) {
>>> -        i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
>>> -        return -ENOMEM;
>>> +        i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
>>> +        ret = -ENOMEM;
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (gi2c->is_tx_multi_desc_xfer) {
>>> +        if (((msg_idx + 1) % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ))
>>> +            peripheral->dma_flags = DMA_CTRL_ACK;
>>> +        else
>>> +            peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>>> +
>>> +        /* BEI bit to be cleared for last TRE */
>>> +        if (msg_idx == gi2c->num_msgs - 1)
>>> +            peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>>> +    } else {
>>> +        peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>>
>> No-no-no. There is no need to set peripheral->dma_flags.
> 
> Initially it was "flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;".
> 
> I thought of using flags in GPI driver as well, so moved flags gpi 
> include file to access b/w I2C and GPI.
> 
> But as per your suggestion i will review if 
> "dma_async_tx_descriptor.flags" can be retrieved in GPI driver for 
> getting DMA_PREP_INTERRUPT flag status.

Well. Nobody thought that you'd continue inventing drvier-specific API 
when we asked you to use a generic pre-existing const.

>>
>>>       }
>>>       /* set the length as message for rx txn */
>>> -    peripheral->rx_len = msg->len;
>>> +    peripheral->rx_len = msgs[msg_idx].len;
>>>       peripheral->op = op;
>>>       ret = dmaengine_slave_config(dma_chan, config);
>>> @@ -568,14 +746,14 @@ static int geni_i2c_gpi(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg *msg,
>>>       peripheral->set_config = 0;
>>>       peripheral->multi_msg = true;
>>> -    flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>>>       if (op == I2C_WRITE)
>>>           dma_dirn = DMA_MEM_TO_DEV;
>>>       else
>>>           dma_dirn = DMA_DEV_TO_MEM;
>>> -    desc = dmaengine_prep_slave_single(dma_chan, addr, msg->len, 
>>> dma_dirn, flags);
>>> +    desc = dmaengine_prep_slave_single(dma_chan, addr, 
>>> msgs[msg_idx].len,
>>> +                       dma_dirn, peripheral->dma_flags);
>>>       if (!desc) {
>>>           dev_err(gi2c->se.dev, "prep_slave_sg failed\n");
>>>           ret = -EIO;
>>> @@ -585,15 +763,48 @@ static int geni_i2c_gpi(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg *msg,
>>>       desc->callback_result = i2c_gpi_cb_result;
>>>       desc->callback_param = gi2c;
>>> -    dmaengine_submit(desc);
>>> -    *buf = dma_buf;
>>> -    *dma_addr_p = addr;
>>> +    if (!((msgs[msg_idx].flags & I2C_M_RD) && op == I2C_WRITE)) {
>>> +        gi2c_gpi_xfer->msg_idx_cnt++;
>>> +        gi2c_gpi_xfer->buf_idx = (msg_idx + 1) % 
>>> QCOM_I2C_GPI_MAX_NUM_MSGS;
>>> +    }
>>> +    cookie = dmaengine_submit(desc);
>>> +    if (dma_submit_error(cookie)) {
>>> +        dev_err(gi2c->se.dev,
>>> +            "%s: dmaengine_submit failed (%d)\n", __func__, cookie);
>>> +        ret = -EINVAL;
>>> +        goto err_config;
>>> +    }
>>> +    if (gi2c->is_tx_multi_desc_xfer) {
>>> +        dma_async_issue_pending(gi2c->tx_c);
>>> +        if ((msg_idx == (gi2c->num_msgs - 1)) ||
>>> +            (gi2c_gpi_xfer->msg_idx_cnt >=
>>> +             QCOM_I2C_GPI_MAX_NUM_MSGS + gi2c_gpi_xfer- 
>>> >freed_msg_cnt)) {
>>> +            ret = geni_i2c_gpi_multi_xfer_timeout_handler(gi2c- 
>>> >se.dev, gi2c_gpi_xfer,
>>> +                                      gi2c->num_msgs, XFER_TIMEOUT,
>>> +                                      &gi2c->done);
>>> +            if (ret) {
>>> +                dev_err(gi2c->se.dev,
>>> +                    "I2C multi write msg transfer timeout: %d\n",
>>> +                    ret);
>>> +                gi2c->err = ret;
>>> +                goto err_config;
>>> +            }
>>> +        }
>>> +    } else {
>>> +        /* Non multi descriptor message transfer */
>>> +        *buf = dma_buf;
>>> +        *dma_addr_p = addr;
>>> +    }
>>>       return 0;
>>>   err_config:
>>> -    dma_unmap_single(gi2c->se.dev->parent, addr, msg->len, map_dirn);
>>> -    i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
>>> +    dma_unmap_single(gi2c->se.dev->parent, addr,
>>> +             msgs[msg_idx].len, map_dirn);
>>> +    i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
>>> +
>>> +out:
>>> +    gi2c->err = ret;
>>>       return ret;
>>>   }
>>> @@ -605,6 +816,7 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg msgs[], i
>>>       unsigned long time_left;
>>>       dma_addr_t tx_addr, rx_addr;
>>>       void *tx_buf = NULL, *rx_buf = NULL;
>>> +    struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer;
>>>       const struct geni_i2c_clk_fld *itr = gi2c->clk_fld;
>>>       config.peripheral_config = &peripheral;
>>> @@ -618,6 +830,33 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg msgs[], i
>>>       peripheral.set_config = 1;
>>>       peripheral.multi_msg = false;
>>> +    gi2c->num_msgs = num;
>>> +    gi2c->is_tx_multi_desc_xfer = false;
>>> +    gi2c->tx_irq_cnt = 0;
>>> +
>>> +    tx_multi_xfer = &gi2c->i2c_multi_desc_config;
>>> +    memset(tx_multi_xfer, 0, sizeof(struct 
>>> geni_i2c_gpi_multi_desc_xfer));
>>> +
>>> +    /*
>>> +     * If number of write messages are two and higher then
>>> +     * configure hardware for multi descriptor transfers with BEI.
>>> +     */
>>> +    if (num >= QCOM_I2C_MIN_NUM_OF_MSGS_MULTI_DESC) {
>>> +        gi2c->is_tx_multi_desc_xfer = true;
>>> +        for (i = 0; i < num; i++) {
>>> +            if (msgs[i].flags & I2C_M_RD) {
>>> +                /*
>>> +                 * Multi descriptor transfer with BEI
>>> +                 * support is enabled for write transfers.
>>> +                 * TODO: Add BEI optimization support for
>>> +                 * read transfers later.
>>> +                 */
>>> +                gi2c->is_tx_multi_desc_xfer = false;
>>> +                break;
>>> +            }
>>> +        }
>>> +    }
>>> +
>>>       for (i = 0; i < num; i++) {
>>>           gi2c->cur = &msgs[i];
>>>           gi2c->err = 0;
>>> @@ -628,14 +867,16 @@ static int geni_i2c_gpi_xfer(struct 
>>> geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>>               peripheral.stretch = 1;
>>>           peripheral.addr = msgs[i].addr;
>>> +        if (i > 0 && (!(msgs[i].flags & I2C_M_RD)))
>>> +            peripheral.multi_msg = false;
>>> -        ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
>>> +        ret =  geni_i2c_gpi(gi2c, msgs, &config,
>>>                       &tx_addr, &tx_buf, I2C_WRITE, gi2c->tx_c);
>>
>> Can't you just pass the flag to set DMA_PREP_INTERRUPT for the last
>> message here and on the next geni_i2c_gpi() call?
> 
> Sure, I will review it.
> We need to set/mask DMA_PREP_INTERRUPT through dma_async_tx_descriptor 
> structure and need to pass this structure to GPI and so GPI can retrives 
> flags for GPI HW BEI set functionality.

Hmm. Do you? Check how other drivers handle the flag, it's not that you 
are the first driver author implementing it.

>>
>>>           if (ret)
>>>               goto err;
>>>           if (msgs[i].flags & I2C_M_RD) {
>>> -            ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
>>> +            ret =  geni_i2c_gpi(gi2c, msgs, &config,
>>>                           &rx_addr, &rx_buf, I2C_READ, gi2c->rx_c);
>>>               if (ret)
>>>                   goto err;
>>> @@ -643,18 +884,26 @@ static int geni_i2c_gpi_xfer(struct 
>>> geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>>               dma_async_issue_pending(gi2c->rx_c);
>>>           }
>>> -        dma_async_issue_pending(gi2c->tx_c);
>>> -
>>> -        time_left = wait_for_completion_timeout(&gi2c->done, 
>>> XFER_TIMEOUT);
>>> -        if (!time_left)
>>> -            gi2c->err = -ETIMEDOUT;
>>> +        if (!gi2c->is_tx_multi_desc_xfer) {
>>> +            dma_async_issue_pending(gi2c->tx_c);
>>> +            time_left = wait_for_completion_timeout(&gi2c->done, 
>>> XFER_TIMEOUT);
>>> +            if (!time_left) {
>>> +                dev_err(gi2c->se.dev, "%s:I2C timeout\n", __func__);
>>> +                gi2c->err = -ETIMEDOUT;
>>> +            }
>>> +        }
>>>           if (gi2c->err) {
>>>               ret = gi2c->err;
>>>               goto err;
>>>           }
>>> -        geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, 
>>> rx_addr);
>>> +        if (!gi2c->is_tx_multi_desc_xfer) {
>>> +            geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, 
>>> rx_buf, rx_addr);
>>> +        } else if (gi2c->tx_irq_cnt != tx_multi_xfer->irq_cnt) {
>>> +            gi2c->tx_irq_cnt = tx_multi_xfer->irq_cnt;
>>> +            geni_i2c_gpi_multi_desc_unmap(gi2c, msgs, &peripheral);
>>> +        }
>>>       }
>>>       return num;
>>> @@ -663,7 +912,11 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg msgs[], i
>>>       dev_err(gi2c->se.dev, "GPI transfer failed: %d\n", ret);
>>>       dmaengine_terminate_sync(gi2c->rx_c);
>>>       dmaengine_terminate_sync(gi2c->tx_c);
>>> -    geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, 
>>> rx_addr);
>>> +    if (gi2c->is_tx_multi_desc_xfer)
>>> +        geni_i2c_gpi_multi_desc_unmap(gi2c, msgs, &peripheral);
>>> +    else
>>> +        geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, 
>>> rx_addr);
>>> +
>>>       return ret;
>>>   }
>>> -- 
>>> 2.17.1
>>>
>>
> 


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-09 16:01       ` Dmitry Baryshkov
@ 2025-05-21 10:28         ` Jyothi Kumar Seerapu
  2025-05-21 12:45           ` Dmitry Baryshkov
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-05-21 10:28 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku



On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
>> Hi Dimitry, Thanks for providing the review comments.
>>
>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
>>>> The I2C driver gets an interrupt upon transfer completion.
>>>> When handling multiple messages in a single transfer, this
>>>> results in N interrupts for N messages, leading to significant
>>>> software interrupt latency.
>>>>
>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
>>>> mechanism. Enabling BEI instructs the hardware to prevent interrupt
>>>> generation and BEI is disabled when an interrupt is necessary.
>>>>
>>>> Large I2C transfer can be divided into chunks of 8 messages internally.
>>>> Interrupts are not expected for the first 7 message completions, only
>>>> the last message triggers an interrupt, indicating the completion of
>>>> 8 messages. This BEI mechanism enhances overall transfer efficiency.
>>>
>>> Why do you need this complexity? Is it possible to set the
>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
>>
>> If i undertsand correctly, the suggestion is to get the single 
>> intetrrupt for last i2c message only.
>>
>> But With this approach, we can't handle large number of i2c messages 
>> in the transfer.
>>
>> In GPI driver, number of max TREs support is harcoded to 64 (#define 
>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and 
>> DMA TREs. So, the avilable TREs are not sufficient to handle all the N 
>> messages.
> 
> It sounds like a DMA driver issue. In other words, the DMA driver can 
> know that it must issue an interrupt before exausting 64 TREs in order to
> 
>>
>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS or 
>> 'num' incase for less messsages), process and unmap/free upon the 
>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> 
> Why? This is some random value which has no connection with CHAN_TREs. 
> Also, what if one of the platforms get a 'liter' GPI which supports less 
> TREs in a single run? Or a super-premium platform which can use 256 
> TREs? Please don't workaround issues from one driver in another one.

We are trying to utilize the existing CHAN_TRES mentioned in the GPI 
driver.
With the following approach, the GPI hardware can process N number of 
I2C messages, thereby improving throughput and transfer efficiency.

The main design consideration for using the block event interrupt is as 
follows:

Allow the hardware to process the TREs (I2C messages), while the 
software concurrently prepares the next set of TREs to be submitted to 
the hardware. Once the TREs are processed, they can be freed, enabling 
the software to queue new TREs. This approach enhances overall optimization.

Please let me know if you have any questions, concerns, or suggestions.
> 
>>
>>
>>>
>>>>
>>>> This optimization reduces transfer time from 168 ms to 48 ms for a
>>>> series of 200 I2C write messages in a single transfer, with a
>>>> clock frequency support of 100 kHz.
>>>>
>>>> BEI optimizations are currently implemented for I2C write transfers 
>>>> only,
>>>> as there is no use case for multiple I2C read messages in a single 
>>>> transfer
>>>> at this time.
>>>>
>>>> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@quicinc.com>
>>>> ---
>>>> v5 -> v6:
>>>>    - Instead of using bei_flag, moved the logic to use with DMA
>>>>      supported flags like DMA_PREP_INTERRUPT.
>>>>    - Additional parameter comments removed from 
>>>> geni_i2c_gpi_multi_desc_unmap
>>>>      function documentation.
>>>> v4 -> v5:
>>>>    - Block event interrupt flag naming changed from flags to bei_flag.
>>>>    - Documentation added for "struct geni_i2c_dev".
>>>>
>>>> v3 -> v4:
>>>>    - API's added for Block event interrupt with multi descriptor 
>>>> support for
>>>>      I2C is moved from qcom-gpi-dma.h file to I2C geni qcom driver 
>>>> file.
>>>>    - gpi_multi_xfer_timeout_handler function is moved from GPI 
>>>> driver to
>>>>      I2C driver.
>>>>    - geni_i2c_gpi_multi_desc_xfer structure is added as a member of
>>>>      struct geni_i2c_dev.
>>>>
>>>> v2 -> v3:
>>>>     - In i2c_gpi_cb_result function, moved the logic of
>>>>      "!is_tx_multi_xfer" to else.
>>>>     - MIN_NUM_OF_MSGS_MULTI_DESC changed from 4 to 2
>>>>     - Updated commit description
>>>>
>>>> v1 -> v2:
>>>>     - Moved gi2c_gpi_xfer->msg_idx_cnt to separate local variable.
>>>>     - Updated goto labels for error scenarios in geni_i2c_gpi function
>>>>     - memset tx_multi_xfer to 0.
>>>>     - Removed passing current msg index to geni_i2c_gpi
>>>>     - Fixed kernel test robot reported compilation issues.
>>>>
>>>>
>>>>   drivers/i2c/busses/i2c-qcom-geni.c | 307 +++++++++++++++++++++++++ 
>>>> +---
>>>>   1 file changed, 280 insertions(+), 27 deletions(-)
>>>>
>>>> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/ 
>>>> busses/ i2c-qcom-geni.c
>>>> index 515a784c951c..e390cf5b4ddc 100644
>>>> --- a/drivers/i2c/busses/i2c-qcom-geni.c
>>>> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
>>>> @@ -78,6 +78,62 @@ enum geni_i2c_err_code {
>>>>   #define XFER_TIMEOUT        HZ
>>>>   #define RST_TIMEOUT        HZ
>>>> +#define QCOM_I2C_GPI_MAX_NUM_MSGS        16
>>>> +#define QCOM_I2C_GPI_NUM_MSGS_PER_IRQ        8
>>>> +#define QCOM_I2C_MIN_NUM_OF_MSGS_MULTI_DESC    2
>>>> +
>>>> +/**
>>>> + * struct geni_i2c_gpi_multi_desc_xfer - Used for multi transfer 
>>>> support
>>>> + *
>>>> + * @msg_idx_cnt: message index for the transfer
>>>> + * @buf_idx: dma buffer index
>>>> + * @unmap_msg_cnt: unmapped transfer index
>>>> + * @freed_msg_cnt: freed transfer index
>>>> + * @irq_cnt: received interrupt count
>>>> + * @irq_msg_cnt: transfer message count for the received irqs
>>>> + * @dma_buf: virtual addresses of the buffers
>>>> + * @dma_addr: dma addresses of the buffers
>>>> + */
>>>> +struct geni_i2c_gpi_multi_desc_xfer {
>>>> +    u32 msg_idx_cnt;
>>>> +    u32 buf_idx;
>>>> +    u32 unmap_msg_cnt;
>>>> +    u32 freed_msg_cnt;
>>>> +    u32 irq_cnt;
>>>> +    u32 irq_msg_cnt;
>>>> +    void *dma_buf[QCOM_I2C_GPI_MAX_NUM_MSGS];
>>>> +    dma_addr_t dma_addr[QCOM_I2C_GPI_MAX_NUM_MSGS];
>>>> +};
>>>> +
>>>> +/**
>>>> + * struct geni_i2c_dev - I2C Geni device specific structure
>>>> + *
>>>> + * @se: geni serial engine
>>>> + * @tx_wm: Tx watermark level
>>>> + * @irq: i2c serial engine interrupt
>>>> + * @err: specifies error codes in i2c transfer failures
>>>> + * @adap: i2c geni adapter
>>>> + * @done: completion variable
>>>> + * @cur: pointer to the i2c_msg mentioning current i2c message in use
>>>> + * @cur_wr: variable used for i2c write opertions
>>>> + * @cur_rd: variable used for i2c read operations
>>>> + * @lock: spinlock variable used for synchronization
>>>> + * @core_clk: pointer to clk
>>>> + * @clk_freq_out: contains the i2c clock frequency
>>>> + * @clk_fld: pointer to geni_i2c_clk_fld
>>>> + * @suspended: flag used for system supend status
>>>> + * @dma_buf: virtual address of the buffer
>>>> + * @xfer_len: holds length for the dma operation
>>>> + * @dma_addr: dma address of the buffer
>>>> + * @tx_c: Tx dma channel
>>>> + * @rx_c: Rx dma channel
>>>> + * @gpi_mode: GPI DMA mode of operation
>>>> + * @abort_done: true for marking i2c abort transfer
>>>> + * @is_tx_multi_desc_xfer: true for i2c multi transfer support
>>>> + * @num_msgs: number of i2c messages in a transfer
>>>> + * @tx_irq_cnt: flag used for tx irq count in i2c multi transfer
>>>> + * @i2c_multi_desc_config: used for multi transfer support
>>>> + */
>>>
>>> Unrelated. Documentation should go to a separate patch.
>> Sure, I will follow it in next patch.
>>>
>>>>   struct geni_i2c_dev {
>>>>       struct geni_se se;
>>>>       u32 tx_wm;
>>>> @@ -100,6 +156,10 @@ struct geni_i2c_dev {
>>>>       struct dma_chan *rx_c;
>>>>       bool gpi_mode;
>>>>       bool abort_done;
>>>> +    bool is_tx_multi_desc_xfer;
>>>> +    u32 num_msgs;
>>>> +    u32 tx_irq_cnt;
>>>> +    struct geni_i2c_gpi_multi_desc_xfer i2c_multi_desc_config;
>>>>   };
>>>>   struct geni_i2c_desc {
>>>> @@ -500,6 +560,7 @@ static int geni_i2c_tx_one_msg(struct 
>>>> geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>>>   static void i2c_gpi_cb_result(void *cb, const struct 
>>>> dmaengine_result *result)
>>>>   {
>>>>       struct geni_i2c_dev *gi2c = cb;
>>>> +    struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer;
>>>
>>> Define it in the corresponding chunk.
>> If I understand correctly, do i need to add this in else block ?
> 
> Yes.
> 
>>>
>>>>       if (result->result != DMA_TRANS_NOERROR) {
>>>>           dev_err(gi2c->se.dev, "DMA txn failed:%d\n", result->result);
>>>> @@ -508,7 +569,22 @@ static void i2c_gpi_cb_result(void *cb, const 
>>>> struct dmaengine_result *result)
>>>>           dev_dbg(gi2c->se.dev, "DMA xfer has pending: %d\n", 
>>>> result- >residue);
>>>>       }
>>>> -    complete(&gi2c->done);
>>>> +    if (!gi2c->is_tx_multi_desc_xfer) {
>>>> +        complete(&gi2c->done);
>>>> +    } else {
>>>> +        tx_multi_xfer = &gi2c->i2c_multi_desc_config;
>>>> +
>>>> +        /*
>>>> +         * Send Completion for last message or multiple of
>>>> +         * QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>> +         */
>>>> +        if ((tx_multi_xfer->irq_msg_cnt == gi2c->num_msgs - 1) ||
>>>> +            (!((tx_multi_xfer->irq_msg_cnt + 1) % 
>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ))) {
>>>> +            tx_multi_xfer->irq_cnt++;
>>>> +            complete(&gi2c->done);
>>>> +        }
>>>> +        tx_multi_xfer->irq_msg_cnt++;
>>>> +    }
>>>>   }
>>>>   static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, struct 
>>>> i2c_msg *msg,
>>>> @@ -526,38 +602,140 @@ static void geni_i2c_gpi_unmap(struct 
>>>> geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>>>       }
>>>>   }
>>>> -static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg 
>>>> *msg,
>>>> +/**
>>>> + * geni_i2c_gpi_multi_desc_unmap() - unmaps the buffers post multi 
>>>> message TX transfers
>>>> + * @gi2c: i2c dev handle
>>>> + * @msgs: i2c messages array
>>>> + * @peripheral: pointer to gpi_i2c_config
>>>> + */
>>>> +static void geni_i2c_gpi_multi_desc_unmap(struct geni_i2c_dev 
>>>> *gi2c, struct i2c_msg msgs[],
>>>> +                      struct gpi_i2c_config *peripheral)
>>>> +{
>>>> +    u32 msg_xfer_cnt, wr_idx = 0;
>>>> +    struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer = &gi2c- 
>>>> >i2c_multi_desc_config;
>>>> +
>>>> +    /*
>>>> +     * In error case, need to unmap all messages based on the 
>>>> msg_idx_cnt.
>>>> +     * Non-error case unmap all the processed messages.
>>>> +     */
>>>> +    if (gi2c->err)
>>>> +        msg_xfer_cnt = tx_multi_xfer->msg_idx_cnt;
>>>> +    else
>>>> +        msg_xfer_cnt = tx_multi_xfer->irq_cnt * 
>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ;
>>>> +
>>>> +    /* Unmap the processed DMA buffers based on the received 
>>>> interrupt count */
>>>> +    for (; tx_multi_xfer->unmap_msg_cnt < msg_xfer_cnt; 
>>>> tx_multi_xfer->unmap_msg_cnt++) {
>>>> +        if (tx_multi_xfer->unmap_msg_cnt == gi2c->num_msgs)
>>>> +            break;
>>>> +        wr_idx = tx_multi_xfer->unmap_msg_cnt % 
>>>> QCOM_I2C_GPI_MAX_NUM_MSGS;
>>>> +        geni_i2c_gpi_unmap(gi2c, &msgs[tx_multi_xfer->unmap_msg_cnt],
>>>> +                   tx_multi_xfer->dma_buf[wr_idx],
>>>> +                   tx_multi_xfer->dma_addr[wr_idx],
>>>> +                   NULL, (dma_addr_t)NULL);
>>>> +        tx_multi_xfer->freed_msg_cnt++;
>>>> +    }
>>>> +}
>>>> +
>>>> +/**
>>>> + * geni_i2c_gpi_multi_xfer_timeout_handler() - Handle multi message 
>>>> transfer timeout
>>>> + * @dev: pointer to the corresponding dev node
>>>> + * @multi_xfer: pointer to the geni_i2c_gpi_multi_desc_xfer
>>>> + * @num_xfers: total number of transfers
>>>> + * @transfer_timeout_msecs: transfer timeout value
>>>> + * @transfer_comp: completion object of the transfer
>>>> + *
>>>> + * This function is used to wait for the processed transfers based on
>>>> + * the interrupts generated upon transfer completion.
>>>> + * Return: On success returns 0, otherwise return error code (- 
>>>> ETIMEDOUT)
>>>> + */
>>>> +static int geni_i2c_gpi_multi_xfer_timeout_handler(struct device *dev,
>>>> +                           struct geni_i2c_gpi_multi_desc_xfer 
>>>> *multi_xfer,
>>>> +                           u32 num_xfers, u32 transfer_timeout_msecs,
>>>> +                           struct completion *transfer_comp)
>>>> +{
>>>> +    int i;
>>>> +    u32 max_irq_cnt, time_left;
>>>> +
>>>> +    max_irq_cnt = num_xfers / QCOM_I2C_GPI_NUM_MSGS_PER_IRQ;
>>>> +    if (num_xfers % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ)
>>>> +        max_irq_cnt++;
>>>> +
>>>> +    /*
>>>> +     * Wait for the interrupts of the processed transfers in multiple
>>>> +     * of 8 and for the last transfer. If the hardware is fast and
>>>> +     * already processed all the transfers then no need to wait.
>>>> +     */
>>>> +    for (i = 0; i < max_irq_cnt; i++) {
>>>> +        reinit_completion(transfer_comp);
>>>> +        if (max_irq_cnt != multi_xfer->irq_cnt) {
>>>> +            time_left = wait_for_completion_timeout(transfer_comp,
>>>> +                                transfer_timeout_msecs);
>>>> +            if (!time_left) {
>>>> +                dev_err(dev, "%s: Transfer timeout\n", __func__);
>>>> +                return -ETIMEDOUT;
>>>> +            }
>>>> +        }
>>>> +        if (num_xfers > multi_xfer->msg_idx_cnt)
>>>> +            return 0;
>>>> +    }
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg 
>>>> msgs[],
>>>>               struct dma_slave_config *config, dma_addr_t *dma_addr_p,
>>>>               void **buf, unsigned int op, struct dma_chan *dma_chan)
>>>>   {
>>>>       struct gpi_i2c_config *peripheral;
>>>> -    unsigned int flags;
>>>>       void *dma_buf;
>>>>       dma_addr_t addr;
>>>>       enum dma_data_direction map_dirn;
>>>>       enum dma_transfer_direction dma_dirn;
>>>>       struct dma_async_tx_descriptor *desc;
>>>>       int ret;
>>>> +    struct geni_i2c_gpi_multi_desc_xfer *gi2c_gpi_xfer;
>>>> +    dma_cookie_t cookie;
>>>> +    u32 msg_idx;
>>>>       peripheral = config->peripheral_config;
>>>> -
>>>> -    dma_buf = i2c_get_dma_safe_msg_buf(msg, 1);
>>>> -    if (!dma_buf)
>>>> -        return -ENOMEM;
>>>> +    gi2c_gpi_xfer = &gi2c->i2c_multi_desc_config;
>>>> +    dma_buf = gi2c_gpi_xfer->dma_buf[gi2c_gpi_xfer->buf_idx];
>>>> +    addr = gi2c_gpi_xfer->dma_addr[gi2c_gpi_xfer->buf_idx];
>>>> +    msg_idx = gi2c_gpi_xfer->msg_idx_cnt;
>>>> +
>>>> +    dma_buf = i2c_get_dma_safe_msg_buf(&msgs[msg_idx], 1);
>>>> +    if (!dma_buf) {
>>>> +        ret = -ENOMEM;
>>>> +        goto out;
>>>> +    }
>>>>       if (op == I2C_WRITE)
>>>>           map_dirn = DMA_TO_DEVICE;
>>>>       else
>>>>           map_dirn = DMA_FROM_DEVICE;
>>>> -    addr = dma_map_single(gi2c->se.dev->parent, dma_buf, msg->len, 
>>>> map_dirn);
>>>> +    addr = dma_map_single(gi2c->se.dev->parent, dma_buf,
>>>> +                  msgs[msg_idx].len, map_dirn);
>>>>       if (dma_mapping_error(gi2c->se.dev->parent, addr)) {
>>>> -        i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
>>>> -        return -ENOMEM;
>>>> +        i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
>>>> +        ret = -ENOMEM;
>>>> +        goto out;
>>>> +    }
>>>> +
>>>> +    if (gi2c->is_tx_multi_desc_xfer) {
>>>> +        if (((msg_idx + 1) % QCOM_I2C_GPI_NUM_MSGS_PER_IRQ))
>>>> +            peripheral->dma_flags = DMA_CTRL_ACK;
>>>> +        else
>>>> +            peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>>>> +
>>>> +        /* BEI bit to be cleared for last TRE */
>>>> +        if (msg_idx == gi2c->num_msgs - 1)
>>>> +            peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>>>> +    } else {
>>>> +        peripheral->dma_flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>>>
>>> No-no-no. There is no need to set peripheral->dma_flags.
>>
>> Initially it was "flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;".
>>
>> I thought of using flags in GPI driver as well, so moved flags gpi 
>> include file to access b/w I2C and GPI.
>>
>> But as per your suggestion i will review if 
>> "dma_async_tx_descriptor.flags" can be retrieved in GPI driver for 
>> getting DMA_PREP_INTERRUPT flag status.
> 
> Well. Nobody thought that you'd continue inventing drvier-specific API 
> when we asked you to use a generic pre-existing const.
> 
>>>
>>>>       }
>>>>       /* set the length as message for rx txn */
>>>> -    peripheral->rx_len = msg->len;
>>>> +    peripheral->rx_len = msgs[msg_idx].len;
>>>>       peripheral->op = op;
>>>>       ret = dmaengine_slave_config(dma_chan, config);
>>>> @@ -568,14 +746,14 @@ static int geni_i2c_gpi(struct geni_i2c_dev 
>>>> *gi2c, struct i2c_msg *msg,
>>>>       peripheral->set_config = 0;
>>>>       peripheral->multi_msg = true;
>>>> -    flags = DMA_PREP_INTERRUPT | DMA_CTRL_ACK;
>>>>       if (op == I2C_WRITE)
>>>>           dma_dirn = DMA_MEM_TO_DEV;
>>>>       else
>>>>           dma_dirn = DMA_DEV_TO_MEM;
>>>> -    desc = dmaengine_prep_slave_single(dma_chan, addr, msg->len, 
>>>> dma_dirn, flags);
>>>> +    desc = dmaengine_prep_slave_single(dma_chan, addr, 
>>>> msgs[msg_idx].len,
>>>> +                       dma_dirn, peripheral->dma_flags);
>>>>       if (!desc) {
>>>>           dev_err(gi2c->se.dev, "prep_slave_sg failed\n");
>>>>           ret = -EIO;
>>>> @@ -585,15 +763,48 @@ static int geni_i2c_gpi(struct geni_i2c_dev 
>>>> *gi2c, struct i2c_msg *msg,
>>>>       desc->callback_result = i2c_gpi_cb_result;
>>>>       desc->callback_param = gi2c;
>>>> -    dmaengine_submit(desc);
>>>> -    *buf = dma_buf;
>>>> -    *dma_addr_p = addr;
>>>> +    if (!((msgs[msg_idx].flags & I2C_M_RD) && op == I2C_WRITE)) {
>>>> +        gi2c_gpi_xfer->msg_idx_cnt++;
>>>> +        gi2c_gpi_xfer->buf_idx = (msg_idx + 1) % 
>>>> QCOM_I2C_GPI_MAX_NUM_MSGS;
>>>> +    }
>>>> +    cookie = dmaengine_submit(desc);
>>>> +    if (dma_submit_error(cookie)) {
>>>> +        dev_err(gi2c->se.dev,
>>>> +            "%s: dmaengine_submit failed (%d)\n", __func__, cookie);
>>>> +        ret = -EINVAL;
>>>> +        goto err_config;
>>>> +    }
>>>> +    if (gi2c->is_tx_multi_desc_xfer) {
>>>> +        dma_async_issue_pending(gi2c->tx_c);
>>>> +        if ((msg_idx == (gi2c->num_msgs - 1)) ||
>>>> +            (gi2c_gpi_xfer->msg_idx_cnt >=
>>>> +             QCOM_I2C_GPI_MAX_NUM_MSGS + gi2c_gpi_xfer- 
>>>> >freed_msg_cnt)) {
>>>> +            ret = geni_i2c_gpi_multi_xfer_timeout_handler(gi2c- 
>>>> >se.dev, gi2c_gpi_xfer,
>>>> +                                      gi2c->num_msgs, XFER_TIMEOUT,
>>>> +                                      &gi2c->done);
>>>> +            if (ret) {
>>>> +                dev_err(gi2c->se.dev,
>>>> +                    "I2C multi write msg transfer timeout: %d\n",
>>>> +                    ret);
>>>> +                gi2c->err = ret;
>>>> +                goto err_config;
>>>> +            }
>>>> +        }
>>>> +    } else {
>>>> +        /* Non multi descriptor message transfer */
>>>> +        *buf = dma_buf;
>>>> +        *dma_addr_p = addr;
>>>> +    }
>>>>       return 0;
>>>>   err_config:
>>>> -    dma_unmap_single(gi2c->se.dev->parent, addr, msg->len, map_dirn);
>>>> -    i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
>>>> +    dma_unmap_single(gi2c->se.dev->parent, addr,
>>>> +             msgs[msg_idx].len, map_dirn);
>>>> +    i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
>>>> +
>>>> +out:
>>>> +    gi2c->err = ret;
>>>>       return ret;
>>>>   }
>>>> @@ -605,6 +816,7 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev 
>>>> *gi2c, struct i2c_msg msgs[], i
>>>>       unsigned long time_left;
>>>>       dma_addr_t tx_addr, rx_addr;
>>>>       void *tx_buf = NULL, *rx_buf = NULL;
>>>> +    struct geni_i2c_gpi_multi_desc_xfer *tx_multi_xfer;
>>>>       const struct geni_i2c_clk_fld *itr = gi2c->clk_fld;
>>>>       config.peripheral_config = &peripheral;
>>>> @@ -618,6 +830,33 @@ static int geni_i2c_gpi_xfer(struct 
>>>> geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>>>       peripheral.set_config = 1;
>>>>       peripheral.multi_msg = false;
>>>> +    gi2c->num_msgs = num;
>>>> +    gi2c->is_tx_multi_desc_xfer = false;
>>>> +    gi2c->tx_irq_cnt = 0;
>>>> +
>>>> +    tx_multi_xfer = &gi2c->i2c_multi_desc_config;
>>>> +    memset(tx_multi_xfer, 0, sizeof(struct 
>>>> geni_i2c_gpi_multi_desc_xfer));
>>>> +
>>>> +    /*
>>>> +     * If number of write messages are two and higher then
>>>> +     * configure hardware for multi descriptor transfers with BEI.
>>>> +     */
>>>> +    if (num >= QCOM_I2C_MIN_NUM_OF_MSGS_MULTI_DESC) {
>>>> +        gi2c->is_tx_multi_desc_xfer = true;
>>>> +        for (i = 0; i < num; i++) {
>>>> +            if (msgs[i].flags & I2C_M_RD) {
>>>> +                /*
>>>> +                 * Multi descriptor transfer with BEI
>>>> +                 * support is enabled for write transfers.
>>>> +                 * TODO: Add BEI optimization support for
>>>> +                 * read transfers later.
>>>> +                 */
>>>> +                gi2c->is_tx_multi_desc_xfer = false;
>>>> +                break;
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>>       for (i = 0; i < num; i++) {
>>>>           gi2c->cur = &msgs[i];
>>>>           gi2c->err = 0;
>>>> @@ -628,14 +867,16 @@ static int geni_i2c_gpi_xfer(struct 
>>>> geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>>>               peripheral.stretch = 1;
>>>>           peripheral.addr = msgs[i].addr;
>>>> +        if (i > 0 && (!(msgs[i].flags & I2C_M_RD)))
>>>> +            peripheral.multi_msg = false;
>>>> -        ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
>>>> +        ret =  geni_i2c_gpi(gi2c, msgs, &config,
>>>>                       &tx_addr, &tx_buf, I2C_WRITE, gi2c->tx_c);
>>>
>>> Can't you just pass the flag to set DMA_PREP_INTERRUPT for the last
>>> message here and on the next geni_i2c_gpi() call?
>>
>> Sure, I will review it.
>> We need to set/mask DMA_PREP_INTERRUPT through dma_async_tx_descriptor 
>> structure and need to pass this structure to GPI and so GPI can 
>> retrives flags for GPI HW BEI set functionality.
> 
> Hmm. Do you? Check how other drivers handle the flag, it's not that you 
> are the first driver author implementing it.
> 
>>>
>>>>           if (ret)
>>>>               goto err;
>>>>           if (msgs[i].flags & I2C_M_RD) {
>>>> -            ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
>>>> +            ret =  geni_i2c_gpi(gi2c, msgs, &config,
>>>>                           &rx_addr, &rx_buf, I2C_READ, gi2c->rx_c);
>>>>               if (ret)
>>>>                   goto err;
>>>> @@ -643,18 +884,26 @@ static int geni_i2c_gpi_xfer(struct 
>>>> geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>>>               dma_async_issue_pending(gi2c->rx_c);
>>>>           }
>>>> -        dma_async_issue_pending(gi2c->tx_c);
>>>> -
>>>> -        time_left = wait_for_completion_timeout(&gi2c->done, 
>>>> XFER_TIMEOUT);
>>>> -        if (!time_left)
>>>> -            gi2c->err = -ETIMEDOUT;
>>>> +        if (!gi2c->is_tx_multi_desc_xfer) {
>>>> +            dma_async_issue_pending(gi2c->tx_c);
>>>> +            time_left = wait_for_completion_timeout(&gi2c->done, 
>>>> XFER_TIMEOUT);
>>>> +            if (!time_left) {
>>>> +                dev_err(gi2c->se.dev, "%s:I2C timeout\n", __func__);
>>>> +                gi2c->err = -ETIMEDOUT;
>>>> +            }
>>>> +        }
>>>>           if (gi2c->err) {
>>>>               ret = gi2c->err;
>>>>               goto err;
>>>>           }
>>>> -        geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, 
>>>> rx_addr);
>>>> +        if (!gi2c->is_tx_multi_desc_xfer) {
>>>> +            geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, 
>>>> rx_buf, rx_addr);
>>>> +        } else if (gi2c->tx_irq_cnt != tx_multi_xfer->irq_cnt) {
>>>> +            gi2c->tx_irq_cnt = tx_multi_xfer->irq_cnt;
>>>> +            geni_i2c_gpi_multi_desc_unmap(gi2c, msgs, &peripheral);
>>>> +        }
>>>>       }
>>>>       return num;
>>>> @@ -663,7 +912,11 @@ static int geni_i2c_gpi_xfer(struct 
>>>> geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>>>       dev_err(gi2c->se.dev, "GPI transfer failed: %d\n", ret);
>>>>       dmaengine_terminate_sync(gi2c->rx_c);
>>>>       dmaengine_terminate_sync(gi2c->tx_c);
>>>> -    geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, 
>>>> rx_addr);
>>>> +    if (gi2c->is_tx_multi_desc_xfer)
>>>> +        geni_i2c_gpi_multi_desc_unmap(gi2c, msgs, &peripheral);
>>>> +    else
>>>> +        geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, 
>>>> rx_addr);
>>>> +
>>>>       return ret;
>>>>   }
>>>> -- 
>>>> 2.17.1
>>>>
>>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-21 10:28         ` Jyothi Kumar Seerapu
@ 2025-05-21 12:45           ` Dmitry Baryshkov
  2025-05-30 14:06             ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2025-05-21 12:45 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
> 
> 
> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
> > On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
> > > Hi Dimitry, Thanks for providing the review comments.
> > > 
> > > On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
> > > > On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
> > > > > The I2C driver gets an interrupt upon transfer completion.
> > > > > When handling multiple messages in a single transfer, this
> > > > > results in N interrupts for N messages, leading to significant
> > > > > software interrupt latency.
> > > > > 
> > > > > To mitigate this latency, utilize Block Event Interrupt (BEI)
> > > > > mechanism. Enabling BEI instructs the hardware to prevent interrupt
> > > > > generation and BEI is disabled when an interrupt is necessary.
> > > > > 
> > > > > Large I2C transfer can be divided into chunks of 8 messages internally.
> > > > > Interrupts are not expected for the first 7 message completions, only
> > > > > the last message triggers an interrupt, indicating the completion of
> > > > > 8 messages. This BEI mechanism enhances overall transfer efficiency.
> > > > 
> > > > Why do you need this complexity? Is it possible to set the
> > > > DMA_PREP_INTERRUPT flag on the last message in the transfer?
> > > 
> > > If i undertsand correctly, the suggestion is to get the single
> > > intetrrupt for last i2c message only.
> > > 
> > > But With this approach, we can't handle large number of i2c messages
> > > in the transfer.
> > > 
> > > In GPI driver, number of max TREs support is harcoded to 64 (#define
> > > CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and
> > > DMA TREs. So, the avilable TREs are not sufficient to handle all the
> > > N messages.
> > 
> > It sounds like a DMA driver issue. In other words, the DMA driver can
> > know that it must issue an interrupt before exausting 64 TREs in order
> > to
> > 
> > > 
> > > Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
> > > or 'num' incase for less messsages), process and unmap/free upon the
> > > interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> > 
> > Why? This is some random value which has no connection with CHAN_TREs.
> > Also, what if one of the platforms get a 'liter' GPI which supports less
> > TREs in a single run? Or a super-premium platform which can use 256
> > TREs? Please don't workaround issues from one driver in another one.
> 
> We are trying to utilize the existing CHAN_TRES mentioned in the GPI driver.
> With the following approach, the GPI hardware can process N number of I2C
> messages, thereby improving throughput and transfer efficiency.
> 
> The main design consideration for using the block event interrupt is as
> follows:
> 
> Allow the hardware to process the TREs (I2C messages), while the software
> concurrently prepares the next set of TREs to be submitted to the hardware.
> Once the TREs are processed, they can be freed, enabling the software to
> queue new TREs. This approach enhances overall optimization.
> 
> Please let me know if you have any questions, concerns, or suggestions.

The question was why do you limit that to QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
What is the reason for that limit, etc. If you think about it, The GENI
/ I2C doesn't impose any limit on the number of messages processed in
one go (if I understand it correctly). Instead the limit comes from the
GPI DMA driver. As such, please don't add extra 'handling' to the I2C
driver. Make GPI DMA driver responsible for saying 'no more for now',
then I2C driver can setup add an interrupt flag and proceed with
submitting next messages, etc.

I really don't see a reason for additional complicated handling in the
geni driver that you've implemented. Maybe I misunderstand something. In
such a case it usually means that you have to explain the design in the
commit message / in-code comments.

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-21 12:45           ` Dmitry Baryshkov
@ 2025-05-30 14:06             ` Jyothi Kumar Seerapu
  2025-05-30 16:42               ` Dmitry Baryshkov
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-05-30 14:06 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku



On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
> On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
>>
>>
>> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
>>> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
>>>> Hi Dimitry, Thanks for providing the review comments.
>>>>
>>>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
>>>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>> The I2C driver gets an interrupt upon transfer completion.
>>>>>> When handling multiple messages in a single transfer, this
>>>>>> results in N interrupts for N messages, leading to significant
>>>>>> software interrupt latency.
>>>>>>
>>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
>>>>>> mechanism. Enabling BEI instructs the hardware to prevent interrupt
>>>>>> generation and BEI is disabled when an interrupt is necessary.
>>>>>>
>>>>>> Large I2C transfer can be divided into chunks of 8 messages internally.
>>>>>> Interrupts are not expected for the first 7 message completions, only
>>>>>> the last message triggers an interrupt, indicating the completion of
>>>>>> 8 messages. This BEI mechanism enhances overall transfer efficiency.
>>>>>
>>>>> Why do you need this complexity? Is it possible to set the
>>>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
>>>>
>>>> If i undertsand correctly, the suggestion is to get the single
>>>> intetrrupt for last i2c message only.
>>>>
>>>> But With this approach, we can't handle large number of i2c messages
>>>> in the transfer.
>>>>
>>>> In GPI driver, number of max TREs support is harcoded to 64 (#define
>>>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and
>>>> DMA TREs. So, the avilable TREs are not sufficient to handle all the
>>>> N messages.
>>>
>>> It sounds like a DMA driver issue. In other words, the DMA driver can
>>> know that it must issue an interrupt before exausting 64 TREs in order
>>> to
>>>
>>>>
>>>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
>>>> or 'num' incase for less messsages), process and unmap/free upon the
>>>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>
>>> Why? This is some random value which has no connection with CHAN_TREs.
>>> Also, what if one of the platforms get a 'liter' GPI which supports less
>>> TREs in a single run? Or a super-premium platform which can use 256
>>> TREs? Please don't workaround issues from one driver in another one.
>>
>> We are trying to utilize the existing CHAN_TRES mentioned in the GPI driver.
>> With the following approach, the GPI hardware can process N number of I2C
>> messages, thereby improving throughput and transfer efficiency.
>>
>> The main design consideration for using the block event interrupt is as
>> follows:
>>
>> Allow the hardware to process the TREs (I2C messages), while the software
>> concurrently prepares the next set of TREs to be submitted to the hardware.
>> Once the TREs are processed, they can be freed, enabling the software to
>> queue new TREs. This approach enhances overall optimization.
>>
>> Please let me know if you have any questions, concerns, or suggestions.
> 
> The question was why do you limit that to QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> What is the reason for that limit, etc. If you think about it, The GENI
> / I2C doesn't impose any limit on the number of messages processed in
> one go (if I understand it correctly). Instead the limit comes from the
> GPI DMA driver. As such, please don't add extra 'handling' to the I2C
> driver. Make GPI DMA driver responsible for saying 'no more for now',
> then I2C driver can setup add an interrupt flag and proceed with
> submitting next messages, etc.
> 

For I2C messages, we need to prepare TREs for Config, Go and DMAs. 
However, if a large number of I2C messages are submitted then may may 
run out of memory for serving the TREs. The GPI channel supports a 
maximum of 64 TREs, which is insufficient to serve 32 or even 16 I2C 
messages concurrently, given the multiple TREs required per message.

To address this limitation, a strategy has been implemented to manage 
how many messages can be queued and how memory is recycled. The constant 
QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
messages that can be queued at once. Additionally, 
QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
half of the queued messages are expected to be freed or deallocated per 
interrupt.
This approach ensures that the driver can efficiently manage TRE 
resources and continue queuing new I2C messages without exhausting memory.
> I really don't see a reason for additional complicated handling in the
> geni driver that you've implemented. Maybe I misunderstand something. In
> such a case it usually means that you have to explain the design in the
> commit message / in-code comments.
> 


The I2C Geni driver is designed to prepare and submit descriptors to the 
GPI driver one message at a time.
As a result, the GPI driver does not have visibility into the current 
message index or the total number of I2C messages in a transfer. This 
lack of context makes it challenging to determine when to set the block 
event interrupt, which is typically used to signal the completion of a 
batch of messages.

So, the responsibility for deciding when to set the BEI should lie with 
the I2C driver.

If this approach is acceptable, I will proceed with updating the 
relevant details in the commit message.

Please let me know if you have any concerns or suggestions.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-30 14:06             ` Jyothi Kumar Seerapu
@ 2025-05-30 16:42               ` Dmitry Baryshkov
  2025-06-17 14:11                 ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2025-05-30 16:42 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
> 
> 
> On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
> > On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
> > > 
> > > 
> > > On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
> > > > On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
> > > > > Hi Dimitry, Thanks for providing the review comments.
> > > > > 
> > > > > On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
> > > > > > On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
> > > > > > > The I2C driver gets an interrupt upon transfer completion.
> > > > > > > When handling multiple messages in a single transfer, this
> > > > > > > results in N interrupts for N messages, leading to significant
> > > > > > > software interrupt latency.
> > > > > > > 
> > > > > > > To mitigate this latency, utilize Block Event Interrupt (BEI)
> > > > > > > mechanism. Enabling BEI instructs the hardware to prevent interrupt
> > > > > > > generation and BEI is disabled when an interrupt is necessary.
> > > > > > > 
> > > > > > > Large I2C transfer can be divided into chunks of 8 messages internally.
> > > > > > > Interrupts are not expected for the first 7 message completions, only
> > > > > > > the last message triggers an interrupt, indicating the completion of
> > > > > > > 8 messages. This BEI mechanism enhances overall transfer efficiency.
> > > > > > 
> > > > > > Why do you need this complexity? Is it possible to set the
> > > > > > DMA_PREP_INTERRUPT flag on the last message in the transfer?
> > > > > 
> > > > > If i undertsand correctly, the suggestion is to get the single
> > > > > intetrrupt for last i2c message only.
> > > > > 
> > > > > But With this approach, we can't handle large number of i2c messages
> > > > > in the transfer.
> > > > > 
> > > > > In GPI driver, number of max TREs support is harcoded to 64 (#define
> > > > > CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and
> > > > > DMA TREs. So, the avilable TREs are not sufficient to handle all the
> > > > > N messages.
> > > > 
> > > > It sounds like a DMA driver issue. In other words, the DMA driver can
> > > > know that it must issue an interrupt before exausting 64 TREs in order
> > > > to
> > > > 
> > > > > 
> > > > > Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
> > > > > or 'num' incase for less messsages), process and unmap/free upon the
> > > > > interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> > > > 
> > > > Why? This is some random value which has no connection with CHAN_TREs.
> > > > Also, what if one of the platforms get a 'liter' GPI which supports less
> > > > TREs in a single run? Or a super-premium platform which can use 256
> > > > TREs? Please don't workaround issues from one driver in another one.
> > > 
> > > We are trying to utilize the existing CHAN_TRES mentioned in the GPI driver.
> > > With the following approach, the GPI hardware can process N number of I2C
> > > messages, thereby improving throughput and transfer efficiency.
> > > 
> > > The main design consideration for using the block event interrupt is as
> > > follows:
> > > 
> > > Allow the hardware to process the TREs (I2C messages), while the software
> > > concurrently prepares the next set of TREs to be submitted to the hardware.
> > > Once the TREs are processed, they can be freed, enabling the software to
> > > queue new TREs. This approach enhances overall optimization.
> > > 
> > > Please let me know if you have any questions, concerns, or suggestions.
> > 
> > The question was why do you limit that to QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> > What is the reason for that limit, etc. If you think about it, The GENI
> > / I2C doesn't impose any limit on the number of messages processed in
> > one go (if I understand it correctly). Instead the limit comes from the
> > GPI DMA driver. As such, please don't add extra 'handling' to the I2C
> > driver. Make GPI DMA driver responsible for saying 'no more for now',
> > then I2C driver can setup add an interrupt flag and proceed with
> > submitting next messages, etc.
> > 
> 
> For I2C messages, we need to prepare TREs for Config, Go and DMAs. However,
> if a large number of I2C messages are submitted then may may run out of
> memory for serving the TREs. The GPI channel supports a maximum of 64 TREs,
> which is insufficient to serve 32 or even 16 I2C messages concurrently,
> given the multiple TREs required per message.
> 
> To address this limitation, a strategy has been implemented to manage how
> many messages can be queued and how memory is recycled. The constant
> QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
> messages that can be queued at once. Additionally,
> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
> half of the queued messages are expected to be freed or deallocated per
> interrupt.
> This approach ensures that the driver can efficiently manage TRE resources
> and continue queuing new I2C messages without exhausting memory.
> > I really don't see a reason for additional complicated handling in the
> > geni driver that you've implemented. Maybe I misunderstand something. In
> > such a case it usually means that you have to explain the design in the
> > commit message / in-code comments.
> > 
> 
> 
> The I2C Geni driver is designed to prepare and submit descriptors to the GPI
> driver one message at a time.
> As a result, the GPI driver does not have visibility into the current
> message index or the total number of I2C messages in a transfer. This lack
> of context makes it challenging to determine when to set the block event
> interrupt, which is typically used to signal the completion of a batch of
> messages.
> 
> So, the responsibility for deciding when to set the BEI should lie with the
> I2C driver.
> 
> If this approach is acceptable, I will proceed with updating the relevant
> details in the commit message.
> 
> Please let me know if you have any concerns or suggestions.

- Make gpi_prep_slave_sg() return NULL if flags don't have
  DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
  interrupt-enabled transfer.

- If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
  again, adding DMA_PREP_INTERRUPT. Make sure that the last one always
  gets DMA_PREP_INTERRUPT.

- In geni_i2c_gpi_xfer() split the loop to submit messages until you
  can, then call wait_for_completion_timeout() and then
  geni_i2c_gpi_unmap() for submitted messages, then continue with a new
  portion of messages.

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-30 16:42               ` Dmitry Baryshkov
@ 2025-06-17 14:11                 ` Jyothi Kumar Seerapu
  2025-06-17 19:32                   ` Dmitry Baryshkov
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-06-17 14:11 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku



On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
> On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
>>
>>
>> On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
>>> On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
>>>>
>>>>
>>>> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
>>>>> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
>>>>>> Hi Dimitry, Thanks for providing the review comments.
>>>>>>
>>>>>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
>>>>>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>>> The I2C driver gets an interrupt upon transfer completion.
>>>>>>>> When handling multiple messages in a single transfer, this
>>>>>>>> results in N interrupts for N messages, leading to significant
>>>>>>>> software interrupt latency.
>>>>>>>>
>>>>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
>>>>>>>> mechanism. Enabling BEI instructs the hardware to prevent interrupt
>>>>>>>> generation and BEI is disabled when an interrupt is necessary.
>>>>>>>>
>>>>>>>> Large I2C transfer can be divided into chunks of 8 messages internally.
>>>>>>>> Interrupts are not expected for the first 7 message completions, only
>>>>>>>> the last message triggers an interrupt, indicating the completion of
>>>>>>>> 8 messages. This BEI mechanism enhances overall transfer efficiency.
>>>>>>>
>>>>>>> Why do you need this complexity? Is it possible to set the
>>>>>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
>>>>>>
>>>>>> If i undertsand correctly, the suggestion is to get the single
>>>>>> intetrrupt for last i2c message only.
>>>>>>
>>>>>> But With this approach, we can't handle large number of i2c messages
>>>>>> in the transfer.
>>>>>>
>>>>>> In GPI driver, number of max TREs support is harcoded to 64 (#define
>>>>>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and
>>>>>> DMA TREs. So, the avilable TREs are not sufficient to handle all the
>>>>>> N messages.
>>>>>
>>>>> It sounds like a DMA driver issue. In other words, the DMA driver can
>>>>> know that it must issue an interrupt before exausting 64 TREs in order
>>>>> to
>>>>>
>>>>>>
>>>>>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
>>>>>> or 'num' incase for less messsages), process and unmap/free upon the
>>>>>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>>
>>>>> Why? This is some random value which has no connection with CHAN_TREs.
>>>>> Also, what if one of the platforms get a 'liter' GPI which supports less
>>>>> TREs in a single run? Or a super-premium platform which can use 256
>>>>> TREs? Please don't workaround issues from one driver in another one.
>>>>
>>>> We are trying to utilize the existing CHAN_TRES mentioned in the GPI driver.
>>>> With the following approach, the GPI hardware can process N number of I2C
>>>> messages, thereby improving throughput and transfer efficiency.
>>>>
>>>> The main design consideration for using the block event interrupt is as
>>>> follows:
>>>>
>>>> Allow the hardware to process the TREs (I2C messages), while the software
>>>> concurrently prepares the next set of TREs to be submitted to the hardware.
>>>> Once the TREs are processed, they can be freed, enabling the software to
>>>> queue new TREs. This approach enhances overall optimization.
>>>>
>>>> Please let me know if you have any questions, concerns, or suggestions.
>>>
>>> The question was why do you limit that to QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>> What is the reason for that limit, etc. If you think about it, The GENI
>>> / I2C doesn't impose any limit on the number of messages processed in
>>> one go (if I understand it correctly). Instead the limit comes from the
>>> GPI DMA driver. As such, please don't add extra 'handling' to the I2C
>>> driver. Make GPI DMA driver responsible for saying 'no more for now',
>>> then I2C driver can setup add an interrupt flag and proceed with
>>> submitting next messages, etc.
>>>
>>
>> For I2C messages, we need to prepare TREs for Config, Go and DMAs. However,
>> if a large number of I2C messages are submitted then may may run out of
>> memory for serving the TREs. The GPI channel supports a maximum of 64 TREs,
>> which is insufficient to serve 32 or even 16 I2C messages concurrently,
>> given the multiple TREs required per message.
>>
>> To address this limitation, a strategy has been implemented to manage how
>> many messages can be queued and how memory is recycled. The constant
>> QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
>> messages that can be queued at once. Additionally,
>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
>> half of the queued messages are expected to be freed or deallocated per
>> interrupt.
>> This approach ensures that the driver can efficiently manage TRE resources
>> and continue queuing new I2C messages without exhausting memory.
>>> I really don't see a reason for additional complicated handling in the
>>> geni driver that you've implemented. Maybe I misunderstand something. In
>>> such a case it usually means that you have to explain the design in the
>>> commit message / in-code comments.
>>>
>>
>>
>> The I2C Geni driver is designed to prepare and submit descriptors to the GPI
>> driver one message at a time.
>> As a result, the GPI driver does not have visibility into the current
>> message index or the total number of I2C messages in a transfer. This lack
>> of context makes it challenging to determine when to set the block event
>> interrupt, which is typically used to signal the completion of a batch of
>> messages.
>>
>> So, the responsibility for deciding when to set the BEI should lie with the
>> I2C driver.
>>
>> If this approach is acceptable, I will proceed with updating the relevant
>> details in the commit message.
>>
>> Please let me know if you have any concerns or suggestions.
>
Hi Dmitry, Sorry for the delayed response, and thank you for the 
suggestions.

> - Make gpi_prep_slave_sg() return NULL if flags don't have
>    DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
>    interrupt-enabled transfer.
"there are no 3 empty TREs for the interrupt-enabled transfer."
Could you please help me understand this a bit better?
> 
> - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
>    again, adding DMA_PREP_INTERRUPT. Make sure that the last one always
>    gets DMA_PREP_INTERRUPT.
Does this mean we need to proceed to the next I2C message and ensure 
that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each 
chunk? And then, should we submit the chunk of messages to the GSI 
hardware for processing?

> 
> - In geni_i2c_gpi_xfer() split the loop to submit messages until you
>    can, then call wait_for_completion_timeout() and then
>    geni_i2c_gpi_unmap() for submitted messages, then continue with a new
>    portion of messages.
Since the GPI channel supports a maximum of 64 TREs, should we consider 
submitting a smaller number of predefined messages — perhaps fewer than 
32, such as 16?
This is because handling 32 messages would require one TRE for config 
and 64 TREs for the Go and DMA preparation steps, which exceeds the 
channel's TRE capacity of 64.

We designed the approach to submit a portion of the messages — for 
example, 16 at a time. Once 8 messages are processed and freed, the 
hardware can continue processing the TREs, while the software 
simultaneously prepares the next set of TREs. This parallelism helps in 
efficiently utilizing the hardware and enhances overall system 
optimization.


> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-06-17 14:11                 ` Jyothi Kumar Seerapu
@ 2025-06-17 19:32                   ` Dmitry Baryshkov
  2025-06-19 16:16                     ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2025-06-17 19:32 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
<quic_jseerapu@quicinc.com> wrote:
>
>
>
> On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
> > On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
> >>
> >>
> >> On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
> >>> On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
> >>>>
> >>>>
> >>>> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
> >>>>> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
> >>>>>> Hi Dimitry, Thanks for providing the review comments.
> >>>>>>
> >>>>>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
> >>>>>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
> >>>>>>>> The I2C driver gets an interrupt upon transfer completion.
> >>>>>>>> When handling multiple messages in a single transfer, this
> >>>>>>>> results in N interrupts for N messages, leading to significant
> >>>>>>>> software interrupt latency.
> >>>>>>>>
> >>>>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
> >>>>>>>> mechanism. Enabling BEI instructs the hardware to prevent interrupt
> >>>>>>>> generation and BEI is disabled when an interrupt is necessary.
> >>>>>>>>
> >>>>>>>> Large I2C transfer can be divided into chunks of 8 messages internally.
> >>>>>>>> Interrupts are not expected for the first 7 message completions, only
> >>>>>>>> the last message triggers an interrupt, indicating the completion of
> >>>>>>>> 8 messages. This BEI mechanism enhances overall transfer efficiency.
> >>>>>>>
> >>>>>>> Why do you need this complexity? Is it possible to set the
> >>>>>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
> >>>>>>
> >>>>>> If i undertsand correctly, the suggestion is to get the single
> >>>>>> intetrrupt for last i2c message only.
> >>>>>>
> >>>>>> But With this approach, we can't handle large number of i2c messages
> >>>>>> in the transfer.
> >>>>>>
> >>>>>> In GPI driver, number of max TREs support is harcoded to 64 (#define
> >>>>>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and
> >>>>>> DMA TREs. So, the avilable TREs are not sufficient to handle all the
> >>>>>> N messages.
> >>>>>
> >>>>> It sounds like a DMA driver issue. In other words, the DMA driver can
> >>>>> know that it must issue an interrupt before exausting 64 TREs in order
> >>>>> to
> >>>>>
> >>>>>>
> >>>>>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
> >>>>>> or 'num' incase for less messsages), process and unmap/free upon the
> >>>>>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> >>>>>
> >>>>> Why? This is some random value which has no connection with CHAN_TREs.
> >>>>> Also, what if one of the platforms get a 'liter' GPI which supports less
> >>>>> TREs in a single run? Or a super-premium platform which can use 256
> >>>>> TREs? Please don't workaround issues from one driver in another one.
> >>>>
> >>>> We are trying to utilize the existing CHAN_TRES mentioned in the GPI driver.
> >>>> With the following approach, the GPI hardware can process N number of I2C
> >>>> messages, thereby improving throughput and transfer efficiency.
> >>>>
> >>>> The main design consideration for using the block event interrupt is as
> >>>> follows:
> >>>>
> >>>> Allow the hardware to process the TREs (I2C messages), while the software
> >>>> concurrently prepares the next set of TREs to be submitted to the hardware.
> >>>> Once the TREs are processed, they can be freed, enabling the software to
> >>>> queue new TREs. This approach enhances overall optimization.
> >>>>
> >>>> Please let me know if you have any questions, concerns, or suggestions.
> >>>
> >>> The question was why do you limit that to QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> >>> What is the reason for that limit, etc. If you think about it, The GENI
> >>> / I2C doesn't impose any limit on the number of messages processed in
> >>> one go (if I understand it correctly). Instead the limit comes from the
> >>> GPI DMA driver. As such, please don't add extra 'handling' to the I2C
> >>> driver. Make GPI DMA driver responsible for saying 'no more for now',
> >>> then I2C driver can setup add an interrupt flag and proceed with
> >>> submitting next messages, etc.
> >>>
> >>
> >> For I2C messages, we need to prepare TREs for Config, Go and DMAs. However,
> >> if a large number of I2C messages are submitted then may may run out of
> >> memory for serving the TREs. The GPI channel supports a maximum of 64 TREs,
> >> which is insufficient to serve 32 or even 16 I2C messages concurrently,
> >> given the multiple TREs required per message.
> >>
> >> To address this limitation, a strategy has been implemented to manage how
> >> many messages can be queued and how memory is recycled. The constant
> >> QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
> >> messages that can be queued at once. Additionally,
> >> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
> >> half of the queued messages are expected to be freed or deallocated per
> >> interrupt.
> >> This approach ensures that the driver can efficiently manage TRE resources
> >> and continue queuing new I2C messages without exhausting memory.
> >>> I really don't see a reason for additional complicated handling in the
> >>> geni driver that you've implemented. Maybe I misunderstand something. In
> >>> such a case it usually means that you have to explain the design in the
> >>> commit message / in-code comments.
> >>>
> >>
> >>
> >> The I2C Geni driver is designed to prepare and submit descriptors to the GPI
> >> driver one message at a time.
> >> As a result, the GPI driver does not have visibility into the current
> >> message index or the total number of I2C messages in a transfer. This lack
> >> of context makes it challenging to determine when to set the block event
> >> interrupt, which is typically used to signal the completion of a batch of
> >> messages.
> >>
> >> So, the responsibility for deciding when to set the BEI should lie with the
> >> I2C driver.
> >>
> >> If this approach is acceptable, I will proceed with updating the relevant
> >> details in the commit message.
> >>
> >> Please let me know if you have any concerns or suggestions.
> >
> Hi Dmitry, Sorry for the delayed response, and thank you for the
> suggestions.
>
> > - Make gpi_prep_slave_sg() return NULL if flags don't have
> >    DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
> >    interrupt-enabled transfer.
> "there are no 3 empty TREs for the interrupt-enabled transfer."
> Could you please help me understand this a bit better?

In the GPI driver you know how many TREs are available. In
gpi_prep_slave_sg() you can check that and return an error if there
are not enough TREs available.

> >
> > - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
> >    again, adding DMA_PREP_INTERRUPT. Make sure that the last one always
> >    gets DMA_PREP_INTERRUPT.
> Does this mean we need to proceed to the next I2C message and ensure
> that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
> chunk? And then, should we submit the chunk of messages to the GSI
> hardware for processing?

No. You don't have to peek at the next I2C message. This all concerns
the current I2C message. The only point where you have to worry is to
explicitly set the flag for the last message.

>
> >
> > - In geni_i2c_gpi_xfer() split the loop to submit messages until you
> >    can, then call wait_for_completion_timeout() and then
> >    geni_i2c_gpi_unmap() for submitted messages, then continue with a new
> >    portion of messages.
> Since the GPI channel supports a maximum of 64 TREs, should we consider
> submitting a smaller number of predefined messages — perhaps fewer than
> 32, such as 16?

Why? Just submit messages until they fit, then flush the DMA async channel.

> This is because handling 32 messages would require one TRE for config
> and 64 TREs for the Go and DMA preparation steps, which exceeds the
> channel's TRE capacity of 64.
>
> We designed the approach to submit a portion of the messages — for
> example, 16 at a time. Once 8 messages are processed and freed, the
> hardware can continue processing the TREs, while the software
> simultaneously prepares the next set of TREs. This parallelism helps in
> efficiently utilizing the hardware and enhances overall system
> optimization.


And this overcomplicates the driver and introduces artificial
limitations which need explanation. Please fix it in a simple way
first. Then you can e.g. implement the watermark at the half of the
GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
middle of the full sequence, allowing it to be used asynchronously in
the background.

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-06-17 19:32                   ` Dmitry Baryshkov
@ 2025-06-19 16:16                     ` Jyothi Kumar Seerapu
  2025-07-03 12:50                       ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-06-19 16:16 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku



On 6/18/2025 1:02 AM, Dmitry Baryshkov wrote:
> On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
> <quic_jseerapu@quicinc.com> wrote:
>>
>>
>>
>> On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
>>> On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
>>>>
>>>>
>>>> On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
>>>>> On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>
>>>>>>
>>>>>> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
>>>>>>> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
>>>>>>>> Hi Dimitry, Thanks for providing the review comments.
>>>>>>>>
>>>>>>>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
>>>>>>>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>>>>> The I2C driver gets an interrupt upon transfer completion.
>>>>>>>>>> When handling multiple messages in a single transfer, this
>>>>>>>>>> results in N interrupts for N messages, leading to significant
>>>>>>>>>> software interrupt latency.
>>>>>>>>>>
>>>>>>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
>>>>>>>>>> mechanism. Enabling BEI instructs the hardware to prevent interrupt
>>>>>>>>>> generation and BEI is disabled when an interrupt is necessary.
>>>>>>>>>>
>>>>>>>>>> Large I2C transfer can be divided into chunks of 8 messages internally.
>>>>>>>>>> Interrupts are not expected for the first 7 message completions, only
>>>>>>>>>> the last message triggers an interrupt, indicating the completion of
>>>>>>>>>> 8 messages. This BEI mechanism enhances overall transfer efficiency.
>>>>>>>>>
>>>>>>>>> Why do you need this complexity? Is it possible to set the
>>>>>>>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
>>>>>>>>
>>>>>>>> If i undertsand correctly, the suggestion is to get the single
>>>>>>>> intetrrupt for last i2c message only.
>>>>>>>>
>>>>>>>> But With this approach, we can't handle large number of i2c messages
>>>>>>>> in the transfer.
>>>>>>>>
>>>>>>>> In GPI driver, number of max TREs support is harcoded to 64 (#define
>>>>>>>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and
>>>>>>>> DMA TREs. So, the avilable TREs are not sufficient to handle all the
>>>>>>>> N messages.
>>>>>>>
>>>>>>> It sounds like a DMA driver issue. In other words, the DMA driver can
>>>>>>> know that it must issue an interrupt before exausting 64 TREs in order
>>>>>>> to
>>>>>>>
>>>>>>>>
>>>>>>>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
>>>>>>>> or 'num' incase for less messsages), process and unmap/free upon the
>>>>>>>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>>>>
>>>>>>> Why? This is some random value which has no connection with CHAN_TREs.
>>>>>>> Also, what if one of the platforms get a 'liter' GPI which supports less
>>>>>>> TREs in a single run? Or a super-premium platform which can use 256
>>>>>>> TREs? Please don't workaround issues from one driver in another one.
>>>>>>
>>>>>> We are trying to utilize the existing CHAN_TRES mentioned in the GPI driver.
>>>>>> With the following approach, the GPI hardware can process N number of I2C
>>>>>> messages, thereby improving throughput and transfer efficiency.
>>>>>>
>>>>>> The main design consideration for using the block event interrupt is as
>>>>>> follows:
>>>>>>
>>>>>> Allow the hardware to process the TREs (I2C messages), while the software
>>>>>> concurrently prepares the next set of TREs to be submitted to the hardware.
>>>>>> Once the TREs are processed, they can be freed, enabling the software to
>>>>>> queue new TREs. This approach enhances overall optimization.
>>>>>>
>>>>>> Please let me know if you have any questions, concerns, or suggestions.
>>>>>
>>>>> The question was why do you limit that to QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>> What is the reason for that limit, etc. If you think about it, The GENI
>>>>> / I2C doesn't impose any limit on the number of messages processed in
>>>>> one go (if I understand it correctly). Instead the limit comes from the
>>>>> GPI DMA driver. As such, please don't add extra 'handling' to the I2C
>>>>> driver. Make GPI DMA driver responsible for saying 'no more for now',
>>>>> then I2C driver can setup add an interrupt flag and proceed with
>>>>> submitting next messages, etc.
>>>>>
>>>>
>>>> For I2C messages, we need to prepare TREs for Config, Go and DMAs. However,
>>>> if a large number of I2C messages are submitted then may may run out of
>>>> memory for serving the TREs. The GPI channel supports a maximum of 64 TREs,
>>>> which is insufficient to serve 32 or even 16 I2C messages concurrently,
>>>> given the multiple TREs required per message.
>>>>
>>>> To address this limitation, a strategy has been implemented to manage how
>>>> many messages can be queued and how memory is recycled. The constant
>>>> QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
>>>> messages that can be queued at once. Additionally,
>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
>>>> half of the queued messages are expected to be freed or deallocated per
>>>> interrupt.
>>>> This approach ensures that the driver can efficiently manage TRE resources
>>>> and continue queuing new I2C messages without exhausting memory.
>>>>> I really don't see a reason for additional complicated handling in the
>>>>> geni driver that you've implemented. Maybe I misunderstand something. In
>>>>> such a case it usually means that you have to explain the design in the
>>>>> commit message / in-code comments.
>>>>>
>>>>
>>>>
>>>> The I2C Geni driver is designed to prepare and submit descriptors to the GPI
>>>> driver one message at a time.
>>>> As a result, the GPI driver does not have visibility into the current
>>>> message index or the total number of I2C messages in a transfer. This lack
>>>> of context makes it challenging to determine when to set the block event
>>>> interrupt, which is typically used to signal the completion of a batch of
>>>> messages.
>>>>
>>>> So, the responsibility for deciding when to set the BEI should lie with the
>>>> I2C driver.
>>>>
>>>> If this approach is acceptable, I will proceed with updating the relevant
>>>> details in the commit message.
>>>>
>>>> Please let me know if you have any concerns or suggestions.
>>>
>> Hi Dmitry, Sorry for the delayed response, and thank you for the
>> suggestions.
>>
>>> - Make gpi_prep_slave_sg() return NULL if flags don't have
>>>     DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
>>>     interrupt-enabled transfer.
>> "there are no 3 empty TREs for the interrupt-enabled transfer."
>> Could you please help me understand this a bit better?
> 
> In the GPI driver you know how many TREs are available. In
> gpi_prep_slave_sg() you can check that and return an error if there
> are not enough TREs available.
> 
>>>
>>> - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
>>>     again, adding DMA_PREP_INTERRUPT. Make sure that the last one always
>>>     gets DMA_PREP_INTERRUPT.
>> Does this mean we need to proceed to the next I2C message and ensure
>> that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
>> chunk? And then, should we submit the chunk of messages to the GSI
>> hardware for processing?
> 
> No. You don't have to peek at the next I2C message. This all concerns
> the current I2C message. The only point where you have to worry is to
> explicitly set the flag for the last message.
> 
>>
>>>
>>> - In geni_i2c_gpi_xfer() split the loop to submit messages until you
>>>     can, then call wait_for_completion_timeout() and then
>>>     geni_i2c_gpi_unmap() for submitted messages, then continue with a new
>>>     portion of messages.
>> Since the GPI channel supports a maximum of 64 TREs, should we consider
>> submitting a smaller number of predefined messages — perhaps fewer than
>> 32, such as 16?
> 
> Why? Just submit messages until they fit, then flush the DMA async channel.
> 
>> This is because handling 32 messages would require one TRE for config
>> and 64 TREs for the Go and DMA preparation steps, which exceeds the
>> channel's TRE capacity of 64.
>>
>> We designed the approach to submit a portion of the messages — for
>> example, 16 at a time. Once 8 messages are processed and freed, the
>> hardware can continue processing the TREs, while the software
>> simultaneously prepares the next set of TREs. This parallelism helps in
>> efficiently utilizing the hardware and enhances overall system
>> optimization.
> 
> 
> And this overcomplicates the driver and introduces artificial
> limitations which need explanation. Please fix it in a simple way
> first. Then you can e.g. implement the watermark at the half of the
> GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
> middle of the full sequence, allowing it to be used asynchronously in
> the background.
> 

Okay, will review it. Thanks.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-06-19 16:16                     ` Jyothi Kumar Seerapu
@ 2025-07-03 12:50                       ` Jyothi Kumar Seerapu
  2025-07-03 19:41                         ` Dmitry Baryshkov
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-07-03 12:50 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku



On 6/19/2025 9:46 PM, Jyothi Kumar Seerapu wrote:
> 
> 
> On 6/18/2025 1:02 AM, Dmitry Baryshkov wrote:
>> On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
>> <quic_jseerapu@quicinc.com> wrote:
>>>
>>>
>>>
>>> On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
>>>> On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>
>>>>>
>>>>> On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
>>>>>> On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
>>>>>>>> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
>>>>>>>>> Hi Dimitry, Thanks for providing the review comments.
>>>>>>>>>
>>>>>>>>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
>>>>>>>>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu 
>>>>>>>>>> wrote:
>>>>>>>>>>> The I2C driver gets an interrupt upon transfer completion.
>>>>>>>>>>> When handling multiple messages in a single transfer, this
>>>>>>>>>>> results in N interrupts for N messages, leading to significant
>>>>>>>>>>> software interrupt latency.
>>>>>>>>>>>
>>>>>>>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
>>>>>>>>>>> mechanism. Enabling BEI instructs the hardware to prevent 
>>>>>>>>>>> interrupt
>>>>>>>>>>> generation and BEI is disabled when an interrupt is necessary.
>>>>>>>>>>>
>>>>>>>>>>> Large I2C transfer can be divided into chunks of 8 messages 
>>>>>>>>>>> internally.
>>>>>>>>>>> Interrupts are not expected for the first 7 message 
>>>>>>>>>>> completions, only
>>>>>>>>>>> the last message triggers an interrupt, indicating the 
>>>>>>>>>>> completion of
>>>>>>>>>>> 8 messages. This BEI mechanism enhances overall transfer 
>>>>>>>>>>> efficiency.
>>>>>>>>>>
>>>>>>>>>> Why do you need this complexity? Is it possible to set the
>>>>>>>>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
>>>>>>>>>
>>>>>>>>> If i undertsand correctly, the suggestion is to get the single
>>>>>>>>> intetrrupt for last i2c message only.
>>>>>>>>>
>>>>>>>>> But With this approach, we can't handle large number of i2c 
>>>>>>>>> messages
>>>>>>>>> in the transfer.
>>>>>>>>>
>>>>>>>>> In GPI driver, number of max TREs support is harcoded to 64 
>>>>>>>>> (#define
>>>>>>>>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE 
>>>>>>>>> and
>>>>>>>>> DMA TREs. So, the avilable TREs are not sufficient to handle 
>>>>>>>>> all the
>>>>>>>>> N messages.
>>>>>>>>
>>>>>>>> It sounds like a DMA driver issue. In other words, the DMA 
>>>>>>>> driver can
>>>>>>>> know that it must issue an interrupt before exausting 64 TREs in 
>>>>>>>> order
>>>>>>>> to
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
>>>>>>>>> or 'num' incase for less messsages), process and unmap/free 
>>>>>>>>> upon the
>>>>>>>>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>>>>>
>>>>>>>> Why? This is some random value which has no connection with 
>>>>>>>> CHAN_TREs.
>>>>>>>> Also, what if one of the platforms get a 'liter' GPI which 
>>>>>>>> supports less
>>>>>>>> TREs in a single run? Or a super-premium platform which can use 256
>>>>>>>> TREs? Please don't workaround issues from one driver in another 
>>>>>>>> one.
>>>>>>>
>>>>>>> We are trying to utilize the existing CHAN_TRES mentioned in the 
>>>>>>> GPI driver.
>>>>>>> With the following approach, the GPI hardware can process N 
>>>>>>> number of I2C
>>>>>>> messages, thereby improving throughput and transfer efficiency.
>>>>>>>
>>>>>>> The main design consideration for using the block event interrupt 
>>>>>>> is as
>>>>>>> follows:
>>>>>>>
>>>>>>> Allow the hardware to process the TREs (I2C messages), while the 
>>>>>>> software
>>>>>>> concurrently prepares the next set of TREs to be submitted to the 
>>>>>>> hardware.
>>>>>>> Once the TREs are processed, they can be freed, enabling the 
>>>>>>> software to
>>>>>>> queue new TREs. This approach enhances overall optimization.
>>>>>>>
>>>>>>> Please let me know if you have any questions, concerns, or 
>>>>>>> suggestions.
>>>>>>
>>>>>> The question was why do you limit that to 
>>>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>>> What is the reason for that limit, etc. If you think about it, The 
>>>>>> GENI
>>>>>> / I2C doesn't impose any limit on the number of messages processed in
>>>>>> one go (if I understand it correctly). Instead the limit comes 
>>>>>> from the
>>>>>> GPI DMA driver. As such, please don't add extra 'handling' to the I2C
>>>>>> driver. Make GPI DMA driver responsible for saying 'no more for now',
>>>>>> then I2C driver can setup add an interrupt flag and proceed with
>>>>>> submitting next messages, etc.
>>>>>>
>>>>>
>>>>> For I2C messages, we need to prepare TREs for Config, Go and DMAs. 
>>>>> However,
>>>>> if a large number of I2C messages are submitted then may may run 
>>>>> out of
>>>>> memory for serving the TREs. The GPI channel supports a maximum of 
>>>>> 64 TREs,
>>>>> which is insufficient to serve 32 or even 16 I2C messages 
>>>>> concurrently,
>>>>> given the multiple TREs required per message.
>>>>>
>>>>> To address this limitation, a strategy has been implemented to 
>>>>> manage how
>>>>> many messages can be queued and how memory is recycled. The constant
>>>>> QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
>>>>> messages that can be queued at once. Additionally,
>>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
>>>>> half of the queued messages are expected to be freed or deallocated 
>>>>> per
>>>>> interrupt.
>>>>> This approach ensures that the driver can efficiently manage TRE 
>>>>> resources
>>>>> and continue queuing new I2C messages without exhausting memory.
>>>>>> I really don't see a reason for additional complicated handling in 
>>>>>> the
>>>>>> geni driver that you've implemented. Maybe I misunderstand 
>>>>>> something. In
>>>>>> such a case it usually means that you have to explain the design 
>>>>>> in the
>>>>>> commit message / in-code comments.
>>>>>>
>>>>>
>>>>>
>>>>> The I2C Geni driver is designed to prepare and submit descriptors 
>>>>> to the GPI
>>>>> driver one message at a time.
>>>>> As a result, the GPI driver does not have visibility into the current
>>>>> message index or the total number of I2C messages in a transfer. 
>>>>> This lack
>>>>> of context makes it challenging to determine when to set the block 
>>>>> event
>>>>> interrupt, which is typically used to signal the completion of a 
>>>>> batch of
>>>>> messages.
>>>>>
>>>>> So, the responsibility for deciding when to set the BEI should lie 
>>>>> with the
>>>>> I2C driver.
>>>>>
>>>>> If this approach is acceptable, I will proceed with updating the 
>>>>> relevant
>>>>> details in the commit message.
>>>>>
>>>>> Please let me know if you have any concerns or suggestions.
>>>>
>>> Hi Dmitry, Sorry for the delayed response, and thank you for the
>>> suggestions.
>>>
>>>> - Make gpi_prep_slave_sg() return NULL if flags don't have
>>>>     DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
>>>>     interrupt-enabled transfer.
>>> "there are no 3 empty TREs for the interrupt-enabled transfer."
>>> Could you please help me understand this a bit better?
>>
>> In the GPI driver you know how many TREs are available. In
>> gpi_prep_slave_sg() you can check that and return an error if there
>> are not enough TREs available.
>>
>>>>
>>>> - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
>>>>     again, adding DMA_PREP_INTERRUPT. Make sure that the last one 
>>>> always
>>>>     gets DMA_PREP_INTERRUPT.
>>> Does this mean we need to proceed to the next I2C message and ensure
>>> that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
>>> chunk? And then, should we submit the chunk of messages to the GSI
>>> hardware for processing?
>>
>> No. You don't have to peek at the next I2C message. This all concerns
>> the current I2C message. The only point where you have to worry is to
>> explicitly set the flag for the last message.
>>
>>>
>>>>
>>>> - In geni_i2c_gpi_xfer() split the loop to submit messages until you
>>>>     can, then call wait_for_completion_timeout() and then
>>>>     geni_i2c_gpi_unmap() for submitted messages, then continue with 
>>>> a new
>>>>     portion of messages.
>>> Since the GPI channel supports a maximum of 64 TREs, should we consider
>>> submitting a smaller number of predefined messages — perhaps fewer than
>>> 32, such as 16?
>>
>> Why? Just submit messages until they fit, then flush the DMA async 
>> channel.
>>
>>> This is because handling 32 messages would require one TRE for config
>>> and 64 TREs for the Go and DMA preparation steps, which exceeds the
>>> channel's TRE capacity of 64.
>>>
>>> We designed the approach to submit a portion of the messages — for
>>> example, 16 at a time. Once 8 messages are processed and freed, the
>>> hardware can continue processing the TREs, while the software
>>> simultaneously prepares the next set of TREs. This parallelism helps in
>>> efficiently utilizing the hardware and enhances overall system
>>> optimization.
>>
>>
>> And this overcomplicates the driver and introduces artificial
>> limitations which need explanation. Please fix it in a simple way
>> first. Then you can e.g. implement the watermark at the half of the
>> GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
>> middle of the full sequence, allowing it to be used asynchronously in
>> the background.
>>
> 
> Okay, will review it. Thanks.
> 
> 

Hi Dmitry,

Can you please check and confirm the approach to follow is something 
like the pseudo code mentioned below:

GPI driver:
In gpi_prep_slave_sg() function,

if (!(flags & DMA_PREP_INTERRUPT) && !gpi_available_tres(chan))
	return NULL;


I2C GENI driver:

for (i = 0; i < num; i++)
{
    /* Always set interrupt for the last message */
    if (i == num_msgs - 1)
	flags |= DMA_PREP_INTERRUPT;


    desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir, flags);
    if (!desc && !(flags & DMA_PREP_INTERRUPT)) {
	  /* Retry with interrupt if not enough TREs */
	  flags |= DMA_PREP_INTERRUPT;
	  desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir,   flags);
    }


    if (!desc)
	break;
	

     dmaengine_submit(desc);
     msg_idx++;
}	

dma_async_issue_pending(chan));

time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
if (!time_left)
	return -ETIMEDOUT;

Now Invoke "geni_i2c_gpi_unmap" for unmapping all submitted I2C messages.


Thanks,
JyothiKumar




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-07-03 12:50                       ` Jyothi Kumar Seerapu
@ 2025-07-03 19:41                         ` Dmitry Baryshkov
  2025-07-07 16:28                           ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2025-07-03 19:41 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
<quic_jseerapu@quicinc.com> wrote:
>
>
>
> On 6/19/2025 9:46 PM, Jyothi Kumar Seerapu wrote:
> >
> >
> > On 6/18/2025 1:02 AM, Dmitry Baryshkov wrote:
> >> On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
> >> <quic_jseerapu@quicinc.com> wrote:
> >>>
> >>>
> >>>
> >>> On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
> >>>> On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
> >>>>>
> >>>>>
> >>>>> On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
> >>>>>> On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
> >>>>>>>> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
> >>>>>>>>> Hi Dimitry, Thanks for providing the review comments.
> >>>>>>>>>
> >>>>>>>>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
> >>>>>>>>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu
> >>>>>>>>>> wrote:
> >>>>>>>>>>> The I2C driver gets an interrupt upon transfer completion.
> >>>>>>>>>>> When handling multiple messages in a single transfer, this
> >>>>>>>>>>> results in N interrupts for N messages, leading to significant
> >>>>>>>>>>> software interrupt latency.
> >>>>>>>>>>>
> >>>>>>>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
> >>>>>>>>>>> mechanism. Enabling BEI instructs the hardware to prevent
> >>>>>>>>>>> interrupt
> >>>>>>>>>>> generation and BEI is disabled when an interrupt is necessary.
> >>>>>>>>>>>
> >>>>>>>>>>> Large I2C transfer can be divided into chunks of 8 messages
> >>>>>>>>>>> internally.
> >>>>>>>>>>> Interrupts are not expected for the first 7 message
> >>>>>>>>>>> completions, only
> >>>>>>>>>>> the last message triggers an interrupt, indicating the
> >>>>>>>>>>> completion of
> >>>>>>>>>>> 8 messages. This BEI mechanism enhances overall transfer
> >>>>>>>>>>> efficiency.
> >>>>>>>>>>
> >>>>>>>>>> Why do you need this complexity? Is it possible to set the
> >>>>>>>>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
> >>>>>>>>>
> >>>>>>>>> If i undertsand correctly, the suggestion is to get the single
> >>>>>>>>> intetrrupt for last i2c message only.
> >>>>>>>>>
> >>>>>>>>> But With this approach, we can't handle large number of i2c
> >>>>>>>>> messages
> >>>>>>>>> in the transfer.
> >>>>>>>>>
> >>>>>>>>> In GPI driver, number of max TREs support is harcoded to 64
> >>>>>>>>> (#define
> >>>>>>>>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE
> >>>>>>>>> and
> >>>>>>>>> DMA TREs. So, the avilable TREs are not sufficient to handle
> >>>>>>>>> all the
> >>>>>>>>> N messages.
> >>>>>>>>
> >>>>>>>> It sounds like a DMA driver issue. In other words, the DMA
> >>>>>>>> driver can
> >>>>>>>> know that it must issue an interrupt before exausting 64 TREs in
> >>>>>>>> order
> >>>>>>>> to
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
> >>>>>>>>> or 'num' incase for less messsages), process and unmap/free
> >>>>>>>>> upon the
> >>>>>>>>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> >>>>>>>>
> >>>>>>>> Why? This is some random value which has no connection with
> >>>>>>>> CHAN_TREs.
> >>>>>>>> Also, what if one of the platforms get a 'liter' GPI which
> >>>>>>>> supports less
> >>>>>>>> TREs in a single run? Or a super-premium platform which can use 256
> >>>>>>>> TREs? Please don't workaround issues from one driver in another
> >>>>>>>> one.
> >>>>>>>
> >>>>>>> We are trying to utilize the existing CHAN_TRES mentioned in the
> >>>>>>> GPI driver.
> >>>>>>> With the following approach, the GPI hardware can process N
> >>>>>>> number of I2C
> >>>>>>> messages, thereby improving throughput and transfer efficiency.
> >>>>>>>
> >>>>>>> The main design consideration for using the block event interrupt
> >>>>>>> is as
> >>>>>>> follows:
> >>>>>>>
> >>>>>>> Allow the hardware to process the TREs (I2C messages), while the
> >>>>>>> software
> >>>>>>> concurrently prepares the next set of TREs to be submitted to the
> >>>>>>> hardware.
> >>>>>>> Once the TREs are processed, they can be freed, enabling the
> >>>>>>> software to
> >>>>>>> queue new TREs. This approach enhances overall optimization.
> >>>>>>>
> >>>>>>> Please let me know if you have any questions, concerns, or
> >>>>>>> suggestions.
> >>>>>>
> >>>>>> The question was why do you limit that to
> >>>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> >>>>>> What is the reason for that limit, etc. If you think about it, The
> >>>>>> GENI
> >>>>>> / I2C doesn't impose any limit on the number of messages processed in
> >>>>>> one go (if I understand it correctly). Instead the limit comes
> >>>>>> from the
> >>>>>> GPI DMA driver. As such, please don't add extra 'handling' to the I2C
> >>>>>> driver. Make GPI DMA driver responsible for saying 'no more for now',
> >>>>>> then I2C driver can setup add an interrupt flag and proceed with
> >>>>>> submitting next messages, etc.
> >>>>>>
> >>>>>
> >>>>> For I2C messages, we need to prepare TREs for Config, Go and DMAs.
> >>>>> However,
> >>>>> if a large number of I2C messages are submitted then may may run
> >>>>> out of
> >>>>> memory for serving the TREs. The GPI channel supports a maximum of
> >>>>> 64 TREs,
> >>>>> which is insufficient to serve 32 or even 16 I2C messages
> >>>>> concurrently,
> >>>>> given the multiple TREs required per message.
> >>>>>
> >>>>> To address this limitation, a strategy has been implemented to
> >>>>> manage how
> >>>>> many messages can be queued and how memory is recycled. The constant
> >>>>> QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
> >>>>> messages that can be queued at once. Additionally,
> >>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
> >>>>> half of the queued messages are expected to be freed or deallocated
> >>>>> per
> >>>>> interrupt.
> >>>>> This approach ensures that the driver can efficiently manage TRE
> >>>>> resources
> >>>>> and continue queuing new I2C messages without exhausting memory.
> >>>>>> I really don't see a reason for additional complicated handling in
> >>>>>> the
> >>>>>> geni driver that you've implemented. Maybe I misunderstand
> >>>>>> something. In
> >>>>>> such a case it usually means that you have to explain the design
> >>>>>> in the
> >>>>>> commit message / in-code comments.
> >>>>>>
> >>>>>
> >>>>>
> >>>>> The I2C Geni driver is designed to prepare and submit descriptors
> >>>>> to the GPI
> >>>>> driver one message at a time.
> >>>>> As a result, the GPI driver does not have visibility into the current
> >>>>> message index or the total number of I2C messages in a transfer.
> >>>>> This lack
> >>>>> of context makes it challenging to determine when to set the block
> >>>>> event
> >>>>> interrupt, which is typically used to signal the completion of a
> >>>>> batch of
> >>>>> messages.
> >>>>>
> >>>>> So, the responsibility for deciding when to set the BEI should lie
> >>>>> with the
> >>>>> I2C driver.
> >>>>>
> >>>>> If this approach is acceptable, I will proceed with updating the
> >>>>> relevant
> >>>>> details in the commit message.
> >>>>>
> >>>>> Please let me know if you have any concerns or suggestions.
> >>>>
> >>> Hi Dmitry, Sorry for the delayed response, and thank you for the
> >>> suggestions.
> >>>
> >>>> - Make gpi_prep_slave_sg() return NULL if flags don't have
> >>>>     DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
> >>>>     interrupt-enabled transfer.
> >>> "there are no 3 empty TREs for the interrupt-enabled transfer."
> >>> Could you please help me understand this a bit better?
> >>
> >> In the GPI driver you know how many TREs are available. In
> >> gpi_prep_slave_sg() you can check that and return an error if there
> >> are not enough TREs available.
> >>
> >>>>
> >>>> - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
> >>>>     again, adding DMA_PREP_INTERRUPT. Make sure that the last one
> >>>> always
> >>>>     gets DMA_PREP_INTERRUPT.
> >>> Does this mean we need to proceed to the next I2C message and ensure
> >>> that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
> >>> chunk? And then, should we submit the chunk of messages to the GSI
> >>> hardware for processing?
> >>
> >> No. You don't have to peek at the next I2C message. This all concerns
> >> the current I2C message. The only point where you have to worry is to
> >> explicitly set the flag for the last message.
> >>
> >>>
> >>>>
> >>>> - In geni_i2c_gpi_xfer() split the loop to submit messages until you
> >>>>     can, then call wait_for_completion_timeout() and then
> >>>>     geni_i2c_gpi_unmap() for submitted messages, then continue with
> >>>> a new
> >>>>     portion of messages.
> >>> Since the GPI channel supports a maximum of 64 TREs, should we consider
> >>> submitting a smaller number of predefined messages — perhaps fewer than
> >>> 32, such as 16?
> >>
> >> Why? Just submit messages until they fit, then flush the DMA async
> >> channel.
> >>
> >>> This is because handling 32 messages would require one TRE for config
> >>> and 64 TREs for the Go and DMA preparation steps, which exceeds the
> >>> channel's TRE capacity of 64.
> >>>
> >>> We designed the approach to submit a portion of the messages — for
> >>> example, 16 at a time. Once 8 messages are processed and freed, the
> >>> hardware can continue processing the TREs, while the software
> >>> simultaneously prepares the next set of TREs. This parallelism helps in
> >>> efficiently utilizing the hardware and enhances overall system
> >>> optimization.
> >>
> >>
> >> And this overcomplicates the driver and introduces artificial
> >> limitations which need explanation. Please fix it in a simple way
> >> first. Then you can e.g. implement the watermark at the half of the
> >> GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
> >> middle of the full sequence, allowing it to be used asynchronously in
> >> the background.
> >>
> >
> > Okay, will review it. Thanks.
> >
> >
>
> Hi Dmitry,
>
> Can you please check and confirm the approach to follow is something
> like the pseudo code mentioned below:

Yes, this is what I've had in mind.

>
> GPI driver:
> In gpi_prep_slave_sg() function,
>
> if (!(flags & DMA_PREP_INTERRUPT) && !gpi_available_tres(chan))
>         return NULL;
>
>
> I2C GENI driver:
>
> for (i = 0; i < num; i++)
> {
>     /* Always set interrupt for the last message */
>     if (i == num_msgs - 1)
>         flags |= DMA_PREP_INTERRUPT;
>
>
>     desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir, flags);
>     if (!desc && !(flags & DMA_PREP_INTERRUPT)) {
>           /* Retry with interrupt if not enough TREs */
>           flags |= DMA_PREP_INTERRUPT;
>           desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir,   flags);
>     }
>
>
>     if (!desc)
>         break;
>
>
>      dmaengine_submit(desc);
>      msg_idx++;
> }
>
> dma_async_issue_pending(chan));
>
> time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
> if (!time_left)
>         return -ETIMEDOUT;
>
> Now Invoke "geni_i2c_gpi_unmap" for unmapping all submitted I2C messages.
>
>
> Thanks,
> JyothiKumar
>
>
>


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-07-03 19:41                         ` Dmitry Baryshkov
@ 2025-07-07 16:28                           ` Jyothi Kumar Seerapu
  2025-07-19  9:57                             ` Dmitry Baryshkov
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-07-07 16:28 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku



On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
> On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
> <quic_jseerapu@quicinc.com> wrote:
>>
>>
>>
>> On 6/19/2025 9:46 PM, Jyothi Kumar Seerapu wrote:
>>>
>>>
>>> On 6/18/2025 1:02 AM, Dmitry Baryshkov wrote:
>>>> On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
>>>> <quic_jseerapu@quicinc.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
>>>>>> On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
>>>>>>>> On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
>>>>>>>>>> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
>>>>>>>>>>> Hi Dimitry, Thanks for providing the review comments.
>>>>>>>>>>>
>>>>>>>>>>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
>>>>>>>>>>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> The I2C driver gets an interrupt upon transfer completion.
>>>>>>>>>>>>> When handling multiple messages in a single transfer, this
>>>>>>>>>>>>> results in N interrupts for N messages, leading to significant
>>>>>>>>>>>>> software interrupt latency.
>>>>>>>>>>>>>
>>>>>>>>>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
>>>>>>>>>>>>> mechanism. Enabling BEI instructs the hardware to prevent
>>>>>>>>>>>>> interrupt
>>>>>>>>>>>>> generation and BEI is disabled when an interrupt is necessary.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Large I2C transfer can be divided into chunks of 8 messages
>>>>>>>>>>>>> internally.
>>>>>>>>>>>>> Interrupts are not expected for the first 7 message
>>>>>>>>>>>>> completions, only
>>>>>>>>>>>>> the last message triggers an interrupt, indicating the
>>>>>>>>>>>>> completion of
>>>>>>>>>>>>> 8 messages. This BEI mechanism enhances overall transfer
>>>>>>>>>>>>> efficiency.
>>>>>>>>>>>>
>>>>>>>>>>>> Why do you need this complexity? Is it possible to set the
>>>>>>>>>>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
>>>>>>>>>>>
>>>>>>>>>>> If i undertsand correctly, the suggestion is to get the single
>>>>>>>>>>> intetrrupt for last i2c message only.
>>>>>>>>>>>
>>>>>>>>>>> But With this approach, we can't handle large number of i2c
>>>>>>>>>>> messages
>>>>>>>>>>> in the transfer.
>>>>>>>>>>>
>>>>>>>>>>> In GPI driver, number of max TREs support is harcoded to 64
>>>>>>>>>>> (#define
>>>>>>>>>>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE
>>>>>>>>>>> and
>>>>>>>>>>> DMA TREs. So, the avilable TREs are not sufficient to handle
>>>>>>>>>>> all the
>>>>>>>>>>> N messages.
>>>>>>>>>>
>>>>>>>>>> It sounds like a DMA driver issue. In other words, the DMA
>>>>>>>>>> driver can
>>>>>>>>>> know that it must issue an interrupt before exausting 64 TREs in
>>>>>>>>>> order
>>>>>>>>>> to
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
>>>>>>>>>>> or 'num' incase for less messsages), process and unmap/free
>>>>>>>>>>> upon the
>>>>>>>>>>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>>>>>>>
>>>>>>>>>> Why? This is some random value which has no connection with
>>>>>>>>>> CHAN_TREs.
>>>>>>>>>> Also, what if one of the platforms get a 'liter' GPI which
>>>>>>>>>> supports less
>>>>>>>>>> TREs in a single run? Or a super-premium platform which can use 256
>>>>>>>>>> TREs? Please don't workaround issues from one driver in another
>>>>>>>>>> one.
>>>>>>>>>
>>>>>>>>> We are trying to utilize the existing CHAN_TRES mentioned in the
>>>>>>>>> GPI driver.
>>>>>>>>> With the following approach, the GPI hardware can process N
>>>>>>>>> number of I2C
>>>>>>>>> messages, thereby improving throughput and transfer efficiency.
>>>>>>>>>
>>>>>>>>> The main design consideration for using the block event interrupt
>>>>>>>>> is as
>>>>>>>>> follows:
>>>>>>>>>
>>>>>>>>> Allow the hardware to process the TREs (I2C messages), while the
>>>>>>>>> software
>>>>>>>>> concurrently prepares the next set of TREs to be submitted to the
>>>>>>>>> hardware.
>>>>>>>>> Once the TREs are processed, they can be freed, enabling the
>>>>>>>>> software to
>>>>>>>>> queue new TREs. This approach enhances overall optimization.
>>>>>>>>>
>>>>>>>>> Please let me know if you have any questions, concerns, or
>>>>>>>>> suggestions.
>>>>>>>>
>>>>>>>> The question was why do you limit that to
>>>>>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>>>>> What is the reason for that limit, etc. If you think about it, The
>>>>>>>> GENI
>>>>>>>> / I2C doesn't impose any limit on the number of messages processed in
>>>>>>>> one go (if I understand it correctly). Instead the limit comes
>>>>>>>> from the
>>>>>>>> GPI DMA driver. As such, please don't add extra 'handling' to the I2C
>>>>>>>> driver. Make GPI DMA driver responsible for saying 'no more for now',
>>>>>>>> then I2C driver can setup add an interrupt flag and proceed with
>>>>>>>> submitting next messages, etc.
>>>>>>>>
>>>>>>>
>>>>>>> For I2C messages, we need to prepare TREs for Config, Go and DMAs.
>>>>>>> However,
>>>>>>> if a large number of I2C messages are submitted then may may run
>>>>>>> out of
>>>>>>> memory for serving the TREs. The GPI channel supports a maximum of
>>>>>>> 64 TREs,
>>>>>>> which is insufficient to serve 32 or even 16 I2C messages
>>>>>>> concurrently,
>>>>>>> given the multiple TREs required per message.
>>>>>>>
>>>>>>> To address this limitation, a strategy has been implemented to
>>>>>>> manage how
>>>>>>> many messages can be queued and how memory is recycled. The constant
>>>>>>> QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
>>>>>>> messages that can be queued at once. Additionally,
>>>>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
>>>>>>> half of the queued messages are expected to be freed or deallocated
>>>>>>> per
>>>>>>> interrupt.
>>>>>>> This approach ensures that the driver can efficiently manage TRE
>>>>>>> resources
>>>>>>> and continue queuing new I2C messages without exhausting memory.
>>>>>>>> I really don't see a reason for additional complicated handling in
>>>>>>>> the
>>>>>>>> geni driver that you've implemented. Maybe I misunderstand
>>>>>>>> something. In
>>>>>>>> such a case it usually means that you have to explain the design
>>>>>>>> in the
>>>>>>>> commit message / in-code comments.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The I2C Geni driver is designed to prepare and submit descriptors
>>>>>>> to the GPI
>>>>>>> driver one message at a time.
>>>>>>> As a result, the GPI driver does not have visibility into the current
>>>>>>> message index or the total number of I2C messages in a transfer.
>>>>>>> This lack
>>>>>>> of context makes it challenging to determine when to set the block
>>>>>>> event
>>>>>>> interrupt, which is typically used to signal the completion of a
>>>>>>> batch of
>>>>>>> messages.
>>>>>>>
>>>>>>> So, the responsibility for deciding when to set the BEI should lie
>>>>>>> with the
>>>>>>> I2C driver.
>>>>>>>
>>>>>>> If this approach is acceptable, I will proceed with updating the
>>>>>>> relevant
>>>>>>> details in the commit message.
>>>>>>>
>>>>>>> Please let me know if you have any concerns or suggestions.
>>>>>>
>>>>> Hi Dmitry, Sorry for the delayed response, and thank you for the
>>>>> suggestions.
>>>>>
>>>>>> - Make gpi_prep_slave_sg() return NULL if flags don't have
>>>>>>      DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
>>>>>>      interrupt-enabled transfer.
>>>>> "there are no 3 empty TREs for the interrupt-enabled transfer."
>>>>> Could you please help me understand this a bit better?
>>>>
>>>> In the GPI driver you know how many TREs are available. In
>>>> gpi_prep_slave_sg() you can check that and return an error if there
>>>> are not enough TREs available.
>>>>
>>>>>>
>>>>>> - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
>>>>>>      again, adding DMA_PREP_INTERRUPT. Make sure that the last one
>>>>>> always
>>>>>>      gets DMA_PREP_INTERRUPT.
>>>>> Does this mean we need to proceed to the next I2C message and ensure
>>>>> that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
>>>>> chunk? And then, should we submit the chunk of messages to the GSI
>>>>> hardware for processing?
>>>>
>>>> No. You don't have to peek at the next I2C message. This all concerns
>>>> the current I2C message. The only point where you have to worry is to
>>>> explicitly set the flag for the last message.
>>>>
>>>>>
>>>>>>
>>>>>> - In geni_i2c_gpi_xfer() split the loop to submit messages until you
>>>>>>      can, then call wait_for_completion_timeout() and then
>>>>>>      geni_i2c_gpi_unmap() for submitted messages, then continue with
>>>>>> a new
>>>>>>      portion of messages.
>>>>> Since the GPI channel supports a maximum of 64 TREs, should we consider
>>>>> submitting a smaller number of predefined messages — perhaps fewer than
>>>>> 32, such as 16?
>>>>
>>>> Why? Just submit messages until they fit, then flush the DMA async
>>>> channel.
>>>>
>>>>> This is because handling 32 messages would require one TRE for config
>>>>> and 64 TREs for the Go and DMA preparation steps, which exceeds the
>>>>> channel's TRE capacity of 64.
>>>>>
>>>>> We designed the approach to submit a portion of the messages — for
>>>>> example, 16 at a time. Once 8 messages are processed and freed, the
>>>>> hardware can continue processing the TREs, while the software
>>>>> simultaneously prepares the next set of TREs. This parallelism helps in
>>>>> efficiently utilizing the hardware and enhances overall system
>>>>> optimization.
>>>>
>>>>
>>>> And this overcomplicates the driver and introduces artificial
>>>> limitations which need explanation. Please fix it in a simple way
>>>> first. Then you can e.g. implement the watermark at the half of the
>>>> GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
>>>> middle of the full sequence, allowing it to be used asynchronously in
>>>> the background.
>>>>
>>>
>>> Okay, will review it. Thanks.
>>>
>>>
>>
>> Hi Dmitry,
>>
>> Can you please check and confirm the approach to follow is something
>> like the pseudo code mentioned below:
> 
> Yes, this is what I've had in mind.

So, Apart from the changes related to "submitting I2C messages until 
they fit" and "unmapping all processed I2C messages together", the rest 
of the code looks remains the same as in the v6 patch ?
Also, in the GPI driver, we need to add logic to retrieve the number of 
available TREs.

I have a concern regarding throughput and achieving parallelism between 
software and hardware processing with this new approach. Since we need 
to unmap all processed messages together, the software cannot queue the 
next set of TREs while the hardware is still processing the current ones.

As I mentioned earlier, the previous approach allowed partial unmapping 
where half of the messages processed by the hardware could be 
freed/unmapped. This enabled the hardware to continue processing the 
remaining TREs while the software simultaneously prepared the next 
batch. This parallelism helped in better hardware utilization and 
improved overall system performance.

Could you please confirm if can go with the similar approach of unmap 
the processed TREs based on a fixed threshold or constant value, instead 
of unmapping them all at once?
> 
>>
>> GPI driver:
>> In gpi_prep_slave_sg() function,
>>
>> if (!(flags & DMA_PREP_INTERRUPT) && !gpi_available_tres(chan))
>>          return NULL;
>>
>>
>> I2C GENI driver:
>>
>> for (i = 0; i < num; i++)
>> {
>>      /* Always set interrupt for the last message */
>>      if (i == num_msgs - 1)
>>          flags |= DMA_PREP_INTERRUPT;
>>
>>
>>      desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir, flags);
>>      if (!desc && !(flags & DMA_PREP_INTERRUPT)) {
>>            /* Retry with interrupt if not enough TREs */
>>            flags |= DMA_PREP_INTERRUPT;
>>            desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir,   flags);
>>      }
>>
>>
>>      if (!desc)
>>          break;
>>
>>
>>       dmaengine_submit(desc);
>>       msg_idx++;
>> }
>>
>> dma_async_issue_pending(chan));
>>
>> time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
>> if (!time_left)
>>          return -ETIMEDOUT;
>>
>> Now Invoke "geni_i2c_gpi_unmap" for unmapping all submitted I2C messages.
>>
>>
>> Thanks,
>> JyothiKumar
>>
>>
>>
> 
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-07-07 16:28                           ` Jyothi Kumar Seerapu
@ 2025-07-19  9:57                             ` Dmitry Baryshkov
  2025-07-22 12:20                               ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2025-07-19  9:57 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

On Mon, Jul 07, 2025 at 09:58:30PM +0530, Jyothi Kumar Seerapu wrote:
> 
> 
> On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
> > On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
> > <quic_jseerapu@quicinc.com> wrote:
> > > 
> > > 
> > > 
> > > On 6/19/2025 9:46 PM, Jyothi Kumar Seerapu wrote:
> > > > 
> > > > 
> > > > On 6/18/2025 1:02 AM, Dmitry Baryshkov wrote:
> > > > > On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
> > > > > <quic_jseerapu@quicinc.com> wrote:
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
> > > > > > > On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
> > > > > > > > > On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
> > > > > > > > > > > On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
> > > > > > > > > > > > Hi Dimitry, Thanks for providing the review comments.
> > > > > > > > > > > > 
> > > > > > > > > > > > On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
> > > > > > > > > > > > > On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > The I2C driver gets an interrupt upon transfer completion.
> > > > > > > > > > > > > > When handling multiple messages in a single transfer, this
> > > > > > > > > > > > > > results in N interrupts for N messages, leading to significant
> > > > > > > > > > > > > > software interrupt latency.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > To mitigate this latency, utilize Block Event Interrupt (BEI)
> > > > > > > > > > > > > > mechanism. Enabling BEI instructs the hardware to prevent
> > > > > > > > > > > > > > interrupt
> > > > > > > > > > > > > > generation and BEI is disabled when an interrupt is necessary.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Large I2C transfer can be divided into chunks of 8 messages
> > > > > > > > > > > > > > internally.
> > > > > > > > > > > > > > Interrupts are not expected for the first 7 message
> > > > > > > > > > > > > > completions, only
> > > > > > > > > > > > > > the last message triggers an interrupt, indicating the
> > > > > > > > > > > > > > completion of
> > > > > > > > > > > > > > 8 messages. This BEI mechanism enhances overall transfer
> > > > > > > > > > > > > > efficiency.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Why do you need this complexity? Is it possible to set the
> > > > > > > > > > > > > DMA_PREP_INTERRUPT flag on the last message in the transfer?
> > > > > > > > > > > > 
> > > > > > > > > > > > If i undertsand correctly, the suggestion is to get the single
> > > > > > > > > > > > intetrrupt for last i2c message only.
> > > > > > > > > > > > 
> > > > > > > > > > > > But With this approach, we can't handle large number of i2c
> > > > > > > > > > > > messages
> > > > > > > > > > > > in the transfer.
> > > > > > > > > > > > 
> > > > > > > > > > > > In GPI driver, number of max TREs support is harcoded to 64
> > > > > > > > > > > > (#define
> > > > > > > > > > > > CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE
> > > > > > > > > > > > and
> > > > > > > > > > > > DMA TREs. So, the avilable TREs are not sufficient to handle
> > > > > > > > > > > > all the
> > > > > > > > > > > > N messages.
> > > > > > > > > > > 
> > > > > > > > > > > It sounds like a DMA driver issue. In other words, the DMA
> > > > > > > > > > > driver can
> > > > > > > > > > > know that it must issue an interrupt before exausting 64 TREs in
> > > > > > > > > > > order
> > > > > > > > > > > to
> > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
> > > > > > > > > > > > or 'num' incase for less messsages), process and unmap/free
> > > > > > > > > > > > upon the
> > > > > > > > > > > > interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> > > > > > > > > > > 
> > > > > > > > > > > Why? This is some random value which has no connection with
> > > > > > > > > > > CHAN_TREs.
> > > > > > > > > > > Also, what if one of the platforms get a 'liter' GPI which
> > > > > > > > > > > supports less
> > > > > > > > > > > TREs in a single run? Or a super-premium platform which can use 256
> > > > > > > > > > > TREs? Please don't workaround issues from one driver in another
> > > > > > > > > > > one.
> > > > > > > > > > 
> > > > > > > > > > We are trying to utilize the existing CHAN_TRES mentioned in the
> > > > > > > > > > GPI driver.
> > > > > > > > > > With the following approach, the GPI hardware can process N
> > > > > > > > > > number of I2C
> > > > > > > > > > messages, thereby improving throughput and transfer efficiency.
> > > > > > > > > > 
> > > > > > > > > > The main design consideration for using the block event interrupt
> > > > > > > > > > is as
> > > > > > > > > > follows:
> > > > > > > > > > 
> > > > > > > > > > Allow the hardware to process the TREs (I2C messages), while the
> > > > > > > > > > software
> > > > > > > > > > concurrently prepares the next set of TREs to be submitted to the
> > > > > > > > > > hardware.
> > > > > > > > > > Once the TREs are processed, they can be freed, enabling the
> > > > > > > > > > software to
> > > > > > > > > > queue new TREs. This approach enhances overall optimization.
> > > > > > > > > > 
> > > > > > > > > > Please let me know if you have any questions, concerns, or
> > > > > > > > > > suggestions.
> > > > > > > > > 
> > > > > > > > > The question was why do you limit that to
> > > > > > > > > QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> > > > > > > > > What is the reason for that limit, etc. If you think about it, The
> > > > > > > > > GENI
> > > > > > > > > / I2C doesn't impose any limit on the number of messages processed in
> > > > > > > > > one go (if I understand it correctly). Instead the limit comes
> > > > > > > > > from the
> > > > > > > > > GPI DMA driver. As such, please don't add extra 'handling' to the I2C
> > > > > > > > > driver. Make GPI DMA driver responsible for saying 'no more for now',
> > > > > > > > > then I2C driver can setup add an interrupt flag and proceed with
> > > > > > > > > submitting next messages, etc.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > For I2C messages, we need to prepare TREs for Config, Go and DMAs.
> > > > > > > > However,
> > > > > > > > if a large number of I2C messages are submitted then may may run
> > > > > > > > out of
> > > > > > > > memory for serving the TREs. The GPI channel supports a maximum of
> > > > > > > > 64 TREs,
> > > > > > > > which is insufficient to serve 32 or even 16 I2C messages
> > > > > > > > concurrently,
> > > > > > > > given the multiple TREs required per message.
> > > > > > > > 
> > > > > > > > To address this limitation, a strategy has been implemented to
> > > > > > > > manage how
> > > > > > > > many messages can be queued and how memory is recycled. The constant
> > > > > > > > QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
> > > > > > > > messages that can be queued at once. Additionally,
> > > > > > > > QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
> > > > > > > > half of the queued messages are expected to be freed or deallocated
> > > > > > > > per
> > > > > > > > interrupt.
> > > > > > > > This approach ensures that the driver can efficiently manage TRE
> > > > > > > > resources
> > > > > > > > and continue queuing new I2C messages without exhausting memory.
> > > > > > > > > I really don't see a reason for additional complicated handling in
> > > > > > > > > the
> > > > > > > > > geni driver that you've implemented. Maybe I misunderstand
> > > > > > > > > something. In
> > > > > > > > > such a case it usually means that you have to explain the design
> > > > > > > > > in the
> > > > > > > > > commit message / in-code comments.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > The I2C Geni driver is designed to prepare and submit descriptors
> > > > > > > > to the GPI
> > > > > > > > driver one message at a time.
> > > > > > > > As a result, the GPI driver does not have visibility into the current
> > > > > > > > message index or the total number of I2C messages in a transfer.
> > > > > > > > This lack
> > > > > > > > of context makes it challenging to determine when to set the block
> > > > > > > > event
> > > > > > > > interrupt, which is typically used to signal the completion of a
> > > > > > > > batch of
> > > > > > > > messages.
> > > > > > > > 
> > > > > > > > So, the responsibility for deciding when to set the BEI should lie
> > > > > > > > with the
> > > > > > > > I2C driver.
> > > > > > > > 
> > > > > > > > If this approach is acceptable, I will proceed with updating the
> > > > > > > > relevant
> > > > > > > > details in the commit message.
> > > > > > > > 
> > > > > > > > Please let me know if you have any concerns or suggestions.
> > > > > > > 
> > > > > > Hi Dmitry, Sorry for the delayed response, and thank you for the
> > > > > > suggestions.
> > > > > > 
> > > > > > > - Make gpi_prep_slave_sg() return NULL if flags don't have
> > > > > > >      DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
> > > > > > >      interrupt-enabled transfer.
> > > > > > "there are no 3 empty TREs for the interrupt-enabled transfer."
> > > > > > Could you please help me understand this a bit better?
> > > > > 
> > > > > In the GPI driver you know how many TREs are available. In
> > > > > gpi_prep_slave_sg() you can check that and return an error if there
> > > > > are not enough TREs available.
> > > > > 
> > > > > > > 
> > > > > > > - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
> > > > > > >      again, adding DMA_PREP_INTERRUPT. Make sure that the last one
> > > > > > > always
> > > > > > >      gets DMA_PREP_INTERRUPT.
> > > > > > Does this mean we need to proceed to the next I2C message and ensure
> > > > > > that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
> > > > > > chunk? And then, should we submit the chunk of messages to the GSI
> > > > > > hardware for processing?
> > > > > 
> > > > > No. You don't have to peek at the next I2C message. This all concerns
> > > > > the current I2C message. The only point where you have to worry is to
> > > > > explicitly set the flag for the last message.
> > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > - In geni_i2c_gpi_xfer() split the loop to submit messages until you
> > > > > > >      can, then call wait_for_completion_timeout() and then
> > > > > > >      geni_i2c_gpi_unmap() for submitted messages, then continue with
> > > > > > > a new
> > > > > > >      portion of messages.
> > > > > > Since the GPI channel supports a maximum of 64 TREs, should we consider
> > > > > > submitting a smaller number of predefined messages — perhaps fewer than
> > > > > > 32, such as 16?
> > > > > 
> > > > > Why? Just submit messages until they fit, then flush the DMA async
> > > > > channel.
> > > > > 
> > > > > > This is because handling 32 messages would require one TRE for config
> > > > > > and 64 TREs for the Go and DMA preparation steps, which exceeds the
> > > > > > channel's TRE capacity of 64.
> > > > > > 
> > > > > > We designed the approach to submit a portion of the messages — for
> > > > > > example, 16 at a time. Once 8 messages are processed and freed, the
> > > > > > hardware can continue processing the TREs, while the software
> > > > > > simultaneously prepares the next set of TREs. This parallelism helps in
> > > > > > efficiently utilizing the hardware and enhances overall system
> > > > > > optimization.
> > > > > 
> > > > > 
> > > > > And this overcomplicates the driver and introduces artificial
> > > > > limitations which need explanation. Please fix it in a simple way
> > > > > first. Then you can e.g. implement the watermark at the half of the
> > > > > GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
> > > > > middle of the full sequence, allowing it to be used asynchronously in
> > > > > the background.
> > > > > 
> > > > 
> > > > Okay, will review it. Thanks.
> > > > 
> > > > 
> > > 
> > > Hi Dmitry,
> > > 
> > > Can you please check and confirm the approach to follow is something
> > > like the pseudo code mentioned below:
> > 
> > Yes, this is what I've had in mind.
> 
> So, Apart from the changes related to "submitting I2C messages until they
> fit" and "unmapping all processed I2C messages together", the rest of the
> code looks remains the same as in the v6 patch ?
> Also, in the GPI driver, we need to add logic to retrieve the number of
> available TREs.
> 
> I have a concern regarding throughput and achieving parallelism between
> software and hardware processing with this new approach. Since we need to
> unmap all processed messages together, the software cannot queue the next
> set of TREs while the hardware is still processing the current ones.

Does that warrant the over-complexity of the driver or close-coupling of
I2C and GPE drivers?

The I2C is a slow bus and it is not expected to be used for
high-throughput data.

> 
> As I mentioned earlier, the previous approach allowed partial unmapping
> where half of the messages processed by the hardware could be
> freed/unmapped. This enabled the hardware to continue processing the
> remaining TREs while the software simultaneously prepared the next batch.
> This parallelism helped in better hardware utilization and improved overall
> system performance.

Measurements / values / impact?

> 
> Could you please confirm if can go with the similar approach of unmap the
> processed TREs based on a fixed threshold or constant value, instead of
> unmapping them all at once?

I'd still say, that's a bad idea. Please stay within the boundaries of
the DMA API.

> > 
> > > 
> > > GPI driver:
> > > In gpi_prep_slave_sg() function,
> > > 
> > > if (!(flags & DMA_PREP_INTERRUPT) && !gpi_available_tres(chan))
> > >          return NULL;
> > > 
> > > 
> > > I2C GENI driver:
> > > 
> > > for (i = 0; i < num; i++)
> > > {
> > >      /* Always set interrupt for the last message */
> > >      if (i == num_msgs - 1)
> > >          flags |= DMA_PREP_INTERRUPT;
> > > 
> > > 
> > >      desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir, flags);
> > >      if (!desc && !(flags & DMA_PREP_INTERRUPT)) {
> > >            /* Retry with interrupt if not enough TREs */
> > >            flags |= DMA_PREP_INTERRUPT;
> > >            desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir,   flags);
> > >      }
> > > 
> > > 
> > >      if (!desc)
> > >          break;
> > > 
> > > 
> > >       dmaengine_submit(desc);
> > >       msg_idx++;
> > > }
> > > 
> > > dma_async_issue_pending(chan));
> > > 
> > > time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
> > > if (!time_left)
> > >          return -ETIMEDOUT;
> > > 
> > > Now Invoke "geni_i2c_gpi_unmap" for unmapping all submitted I2C messages.
> > > 
> > > 
> > > Thanks,
> > > JyothiKumar
> > > 
> > > 
> > > 
> > 
> > 
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-07-19  9:57                             ` Dmitry Baryshkov
@ 2025-07-22 12:20                               ` Jyothi Kumar Seerapu
  2025-07-22 12:46                                 ` Dmitry Baryshkov
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-07-22 12:20 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku



On 7/19/2025 3:27 PM, Dmitry Baryshkov wrote:
> On Mon, Jul 07, 2025 at 09:58:30PM +0530, Jyothi Kumar Seerapu wrote:
>>
>>
>> On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
>>> On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
>>> <quic_jseerapu@quicinc.com> wrote:
>>>>
>>>>
>>>>
>>>> On 6/19/2025 9:46 PM, Jyothi Kumar Seerapu wrote:
>>>>>
>>>>>
>>>>> On 6/18/2025 1:02 AM, Dmitry Baryshkov wrote:
>>>>>> On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
>>>>>> <quic_jseerapu@quicinc.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
>>>>>>>> On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
>>>>>>>>>> On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
>>>>>>>>>>>> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
>>>>>>>>>>>>> Hi Dimitry, Thanks for providing the review comments.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
>>>>>>>>>>>>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> The I2C driver gets an interrupt upon transfer completion.
>>>>>>>>>>>>>>> When handling multiple messages in a single transfer, this
>>>>>>>>>>>>>>> results in N interrupts for N messages, leading to significant
>>>>>>>>>>>>>>> software interrupt latency.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
>>>>>>>>>>>>>>> mechanism. Enabling BEI instructs the hardware to prevent
>>>>>>>>>>>>>>> interrupt
>>>>>>>>>>>>>>> generation and BEI is disabled when an interrupt is necessary.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Large I2C transfer can be divided into chunks of 8 messages
>>>>>>>>>>>>>>> internally.
>>>>>>>>>>>>>>> Interrupts are not expected for the first 7 message
>>>>>>>>>>>>>>> completions, only
>>>>>>>>>>>>>>> the last message triggers an interrupt, indicating the
>>>>>>>>>>>>>>> completion of
>>>>>>>>>>>>>>> 8 messages. This BEI mechanism enhances overall transfer
>>>>>>>>>>>>>>> efficiency.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Why do you need this complexity? Is it possible to set the
>>>>>>>>>>>>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
>>>>>>>>>>>>>
>>>>>>>>>>>>> If i undertsand correctly, the suggestion is to get the single
>>>>>>>>>>>>> intetrrupt for last i2c message only.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But With this approach, we can't handle large number of i2c
>>>>>>>>>>>>> messages
>>>>>>>>>>>>> in the transfer.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In GPI driver, number of max TREs support is harcoded to 64
>>>>>>>>>>>>> (#define
>>>>>>>>>>>>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE
>>>>>>>>>>>>> and
>>>>>>>>>>>>> DMA TREs. So, the avilable TREs are not sufficient to handle
>>>>>>>>>>>>> all the
>>>>>>>>>>>>> N messages.
>>>>>>>>>>>>
>>>>>>>>>>>> It sounds like a DMA driver issue. In other words, the DMA
>>>>>>>>>>>> driver can
>>>>>>>>>>>> know that it must issue an interrupt before exausting 64 TREs in
>>>>>>>>>>>> order
>>>>>>>>>>>> to
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
>>>>>>>>>>>>> or 'num' incase for less messsages), process and unmap/free
>>>>>>>>>>>>> upon the
>>>>>>>>>>>>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>>>>>>>>>
>>>>>>>>>>>> Why? This is some random value which has no connection with
>>>>>>>>>>>> CHAN_TREs.
>>>>>>>>>>>> Also, what if one of the platforms get a 'liter' GPI which
>>>>>>>>>>>> supports less
>>>>>>>>>>>> TREs in a single run? Or a super-premium platform which can use 256
>>>>>>>>>>>> TREs? Please don't workaround issues from one driver in another
>>>>>>>>>>>> one.
>>>>>>>>>>>
>>>>>>>>>>> We are trying to utilize the existing CHAN_TRES mentioned in the
>>>>>>>>>>> GPI driver.
>>>>>>>>>>> With the following approach, the GPI hardware can process N
>>>>>>>>>>> number of I2C
>>>>>>>>>>> messages, thereby improving throughput and transfer efficiency.
>>>>>>>>>>>
>>>>>>>>>>> The main design consideration for using the block event interrupt
>>>>>>>>>>> is as
>>>>>>>>>>> follows:
>>>>>>>>>>>
>>>>>>>>>>> Allow the hardware to process the TREs (I2C messages), while the
>>>>>>>>>>> software
>>>>>>>>>>> concurrently prepares the next set of TREs to be submitted to the
>>>>>>>>>>> hardware.
>>>>>>>>>>> Once the TREs are processed, they can be freed, enabling the
>>>>>>>>>>> software to
>>>>>>>>>>> queue new TREs. This approach enhances overall optimization.
>>>>>>>>>>>
>>>>>>>>>>> Please let me know if you have any questions, concerns, or
>>>>>>>>>>> suggestions.
>>>>>>>>>>
>>>>>>>>>> The question was why do you limit that to
>>>>>>>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>>>>>>> What is the reason for that limit, etc. If you think about it, The
>>>>>>>>>> GENI
>>>>>>>>>> / I2C doesn't impose any limit on the number of messages processed in
>>>>>>>>>> one go (if I understand it correctly). Instead the limit comes
>>>>>>>>>> from the
>>>>>>>>>> GPI DMA driver. As such, please don't add extra 'handling' to the I2C
>>>>>>>>>> driver. Make GPI DMA driver responsible for saying 'no more for now',
>>>>>>>>>> then I2C driver can setup add an interrupt flag and proceed with
>>>>>>>>>> submitting next messages, etc.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> For I2C messages, we need to prepare TREs for Config, Go and DMAs.
>>>>>>>>> However,
>>>>>>>>> if a large number of I2C messages are submitted then may may run
>>>>>>>>> out of
>>>>>>>>> memory for serving the TREs. The GPI channel supports a maximum of
>>>>>>>>> 64 TREs,
>>>>>>>>> which is insufficient to serve 32 or even 16 I2C messages
>>>>>>>>> concurrently,
>>>>>>>>> given the multiple TREs required per message.
>>>>>>>>>
>>>>>>>>> To address this limitation, a strategy has been implemented to
>>>>>>>>> manage how
>>>>>>>>> many messages can be queued and how memory is recycled. The constant
>>>>>>>>> QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
>>>>>>>>> messages that can be queued at once. Additionally,
>>>>>>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
>>>>>>>>> half of the queued messages are expected to be freed or deallocated
>>>>>>>>> per
>>>>>>>>> interrupt.
>>>>>>>>> This approach ensures that the driver can efficiently manage TRE
>>>>>>>>> resources
>>>>>>>>> and continue queuing new I2C messages without exhausting memory.
>>>>>>>>>> I really don't see a reason for additional complicated handling in
>>>>>>>>>> the
>>>>>>>>>> geni driver that you've implemented. Maybe I misunderstand
>>>>>>>>>> something. In
>>>>>>>>>> such a case it usually means that you have to explain the design
>>>>>>>>>> in the
>>>>>>>>>> commit message / in-code comments.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The I2C Geni driver is designed to prepare and submit descriptors
>>>>>>>>> to the GPI
>>>>>>>>> driver one message at a time.
>>>>>>>>> As a result, the GPI driver does not have visibility into the current
>>>>>>>>> message index or the total number of I2C messages in a transfer.
>>>>>>>>> This lack
>>>>>>>>> of context makes it challenging to determine when to set the block
>>>>>>>>> event
>>>>>>>>> interrupt, which is typically used to signal the completion of a
>>>>>>>>> batch of
>>>>>>>>> messages.
>>>>>>>>>
>>>>>>>>> So, the responsibility for deciding when to set the BEI should lie
>>>>>>>>> with the
>>>>>>>>> I2C driver.
>>>>>>>>>
>>>>>>>>> If this approach is acceptable, I will proceed with updating the
>>>>>>>>> relevant
>>>>>>>>> details in the commit message.
>>>>>>>>>
>>>>>>>>> Please let me know if you have any concerns or suggestions.
>>>>>>>>
>>>>>>> Hi Dmitry, Sorry for the delayed response, and thank you for the
>>>>>>> suggestions.
>>>>>>>
>>>>>>>> - Make gpi_prep_slave_sg() return NULL if flags don't have
>>>>>>>>       DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
>>>>>>>>       interrupt-enabled transfer.
>>>>>>> "there are no 3 empty TREs for the interrupt-enabled transfer."
>>>>>>> Could you please help me understand this a bit better?
>>>>>>
>>>>>> In the GPI driver you know how many TREs are available. In
>>>>>> gpi_prep_slave_sg() you can check that and return an error if there
>>>>>> are not enough TREs available.
>>>>>>
>>>>>>>>
>>>>>>>> - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
>>>>>>>>       again, adding DMA_PREP_INTERRUPT. Make sure that the last one
>>>>>>>> always
>>>>>>>>       gets DMA_PREP_INTERRUPT.
>>>>>>> Does this mean we need to proceed to the next I2C message and ensure
>>>>>>> that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
>>>>>>> chunk? And then, should we submit the chunk of messages to the GSI
>>>>>>> hardware for processing?
>>>>>>
>>>>>> No. You don't have to peek at the next I2C message. This all concerns
>>>>>> the current I2C message. The only point where you have to worry is to
>>>>>> explicitly set the flag for the last message.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> - In geni_i2c_gpi_xfer() split the loop to submit messages until you
>>>>>>>>       can, then call wait_for_completion_timeout() and then
>>>>>>>>       geni_i2c_gpi_unmap() for submitted messages, then continue with
>>>>>>>> a new
>>>>>>>>       portion of messages.
>>>>>>> Since the GPI channel supports a maximum of 64 TREs, should we consider
>>>>>>> submitting a smaller number of predefined messages — perhaps fewer than
>>>>>>> 32, such as 16?
>>>>>>
>>>>>> Why? Just submit messages until they fit, then flush the DMA async
>>>>>> channel.
>>>>>>
>>>>>>> This is because handling 32 messages would require one TRE for config
>>>>>>> and 64 TREs for the Go and DMA preparation steps, which exceeds the
>>>>>>> channel's TRE capacity of 64.
>>>>>>>
>>>>>>> We designed the approach to submit a portion of the messages — for
>>>>>>> example, 16 at a time. Once 8 messages are processed and freed, the
>>>>>>> hardware can continue processing the TREs, while the software
>>>>>>> simultaneously prepares the next set of TREs. This parallelism helps in
>>>>>>> efficiently utilizing the hardware and enhances overall system
>>>>>>> optimization.
>>>>>>
>>>>>>
>>>>>> And this overcomplicates the driver and introduces artificial
>>>>>> limitations which need explanation. Please fix it in a simple way
>>>>>> first. Then you can e.g. implement the watermark at the half of the
>>>>>> GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
>>>>>> middle of the full sequence, allowing it to be used asynchronously in
>>>>>> the background.
>>>>>>
>>>>>
>>>>> Okay, will review it. Thanks.
>>>>>
>>>>>
>>>>
>>>> Hi Dmitry,
>>>>
>>>> Can you please check and confirm the approach to follow is something
>>>> like the pseudo code mentioned below:
>>>
>>> Yes, this is what I've had in mind.
>>
>> So, Apart from the changes related to "submitting I2C messages until they
>> fit" and "unmapping all processed I2C messages together", the rest of the
>> code looks remains the same as in the v6 patch ?
>> Also, in the GPI driver, we need to add logic to retrieve the number of
>> available TREs.
>>
>> I have a concern regarding throughput and achieving parallelism between
>> software and hardware processing with this new approach. Since we need to
>> unmap all processed messages together, the software cannot queue the next
>> set of TREs while the hardware is still processing the current ones.
> 
> Does that warrant the over-complexity of the driver or close-coupling of
> I2C and GPE drivers?
> 
> The I2C is a slow bus and it is not expected to be used for
> high-throughput data.
> 
The block event interrupt and multi-descriptor handling are primarily 
added to support a camera use case, where multiple registers are need to 
be configured in a single I2C transfer with 200 or more i2c messages. 
This enhancement is expected to improve throughput and meet performance 
KPIs.

>>
>> As I mentioned earlier, the previous approach allowed partial unmapping
>> where half of the messages processed by the hardware could be
>> freed/unmapped. This enabled the hardware to continue processing the
>> remaining TREs while the software simultaneously prepared the next batch.
>> This parallelism helped in better hardware utilization and improved overall
>> system performance.
> 
> Measurements / values / impact?
> 
>>
>> Could you please confirm if can go with the similar approach of unmap the
>> processed TREs based on a fixed threshold or constant value, instead of
>> unmapping them all at once?
> 
> I'd still say, that's a bad idea. Please stay within the boundaries of
> the DMA API.
 >
I agree with the approach you suggested—it's the GPI's responsibility to 
manage the available TREs.

However, I'm curious whether can we set a dynamic watermark value 
perhaps half the available TREs) to trigger unmapping of processed TREs 
? This would allow the software to prepare the next set of TREs while 
the hardware continues processing the remaining ones, enabling better 
parallelism and throughput.

>>>
>>>>
>>>> GPI driver:
>>>> In gpi_prep_slave_sg() function,
>>>>
>>>> if (!(flags & DMA_PREP_INTERRUPT) && !gpi_available_tres(chan))
>>>>           return NULL;
>>>>
>>>>
>>>> I2C GENI driver:
>>>>
>>>> for (i = 0; i < num; i++)
>>>> {
>>>>       /* Always set interrupt for the last message */
>>>>       if (i == num_msgs - 1)
>>>>           flags |= DMA_PREP_INTERRUPT;
>>>>
>>>>
>>>>       desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir, flags);
>>>>       if (!desc && !(flags & DMA_PREP_INTERRUPT)) {
>>>>             /* Retry with interrupt if not enough TREs */
>>>>             flags |= DMA_PREP_INTERRUPT;
>>>>             desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir,   flags);
>>>>       }
>>>>
>>>>
>>>>       if (!desc)
>>>>           break;
>>>>
>>>>
>>>>        dmaengine_submit(desc);
>>>>        msg_idx++;
>>>> }
>>>>
>>>> dma_async_issue_pending(chan));
>>>>
>>>> time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
>>>> if (!time_left)
>>>>           return -ETIMEDOUT;
>>>>
>>>> Now Invoke "geni_i2c_gpi_unmap" for unmapping all submitted I2C messages.
>>>>
>>>>
>>>> Thanks,
>>>> JyothiKumar
>>>>
>>>>
>>>>
>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-07-22 12:20                               ` Jyothi Kumar Seerapu
@ 2025-07-22 12:46                                 ` Dmitry Baryshkov
  2025-07-23  7:15                                   ` Vinod Koul
  0 siblings, 1 reply; 30+ messages in thread
From: Dmitry Baryshkov @ 2025-07-22 12:46 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti,
	Sumit Semwal, Christian König, linux-arm-msm, dmaengine,
	linux-kernel, linux-i2c, linux-media, dri-devel, linaro-mm-sig,
	quic_vtanuku

On Tue, Jul 22, 2025 at 05:50:08PM +0530, Jyothi Kumar Seerapu wrote:
> 
> 
> On 7/19/2025 3:27 PM, Dmitry Baryshkov wrote:
> > On Mon, Jul 07, 2025 at 09:58:30PM +0530, Jyothi Kumar Seerapu wrote:
> > > 
> > > 
> > > On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
> > > > On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
> > > > <quic_jseerapu@quicinc.com> wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > On 6/19/2025 9:46 PM, Jyothi Kumar Seerapu wrote:
> > > > > > 
> > > > > > 
> > > > > > On 6/18/2025 1:02 AM, Dmitry Baryshkov wrote:
> > > > > > > On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
> > > > > > > <quic_jseerapu@quicinc.com> wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
> > > > > > > > > On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
> > > > > > > > > > > On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
> > > > > > > > > > > > > On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
> > > > > > > > > > > > > > Hi Dimitry, Thanks for providing the review comments.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
> > > > > > > > > > > > > > > On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > The I2C driver gets an interrupt upon transfer completion.
> > > > > > > > > > > > > > > > When handling multiple messages in a single transfer, this
> > > > > > > > > > > > > > > > results in N interrupts for N messages, leading to significant
> > > > > > > > > > > > > > > > software interrupt latency.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > To mitigate this latency, utilize Block Event Interrupt (BEI)
> > > > > > > > > > > > > > > > mechanism. Enabling BEI instructs the hardware to prevent
> > > > > > > > > > > > > > > > interrupt
> > > > > > > > > > > > > > > > generation and BEI is disabled when an interrupt is necessary.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Large I2C transfer can be divided into chunks of 8 messages
> > > > > > > > > > > > > > > > internally.
> > > > > > > > > > > > > > > > Interrupts are not expected for the first 7 message
> > > > > > > > > > > > > > > > completions, only
> > > > > > > > > > > > > > > > the last message triggers an interrupt, indicating the
> > > > > > > > > > > > > > > > completion of
> > > > > > > > > > > > > > > > 8 messages. This BEI mechanism enhances overall transfer
> > > > > > > > > > > > > > > > efficiency.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Why do you need this complexity? Is it possible to set the
> > > > > > > > > > > > > > > DMA_PREP_INTERRUPT flag on the last message in the transfer?
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > If i undertsand correctly, the suggestion is to get the single
> > > > > > > > > > > > > > intetrrupt for last i2c message only.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > But With this approach, we can't handle large number of i2c
> > > > > > > > > > > > > > messages
> > > > > > > > > > > > > > in the transfer.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > In GPI driver, number of max TREs support is harcoded to 64
> > > > > > > > > > > > > > (#define
> > > > > > > > > > > > > > CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > DMA TREs. So, the avilable TREs are not sufficient to handle
> > > > > > > > > > > > > > all the
> > > > > > > > > > > > > > N messages.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > It sounds like a DMA driver issue. In other words, the DMA
> > > > > > > > > > > > > driver can
> > > > > > > > > > > > > know that it must issue an interrupt before exausting 64 TREs in
> > > > > > > > > > > > > order
> > > > > > > > > > > > > to
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
> > > > > > > > > > > > > > or 'num' incase for less messsages), process and unmap/free
> > > > > > > > > > > > > > upon the
> > > > > > > > > > > > > > interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Why? This is some random value which has no connection with
> > > > > > > > > > > > > CHAN_TREs.
> > > > > > > > > > > > > Also, what if one of the platforms get a 'liter' GPI which
> > > > > > > > > > > > > supports less
> > > > > > > > > > > > > TREs in a single run? Or a super-premium platform which can use 256
> > > > > > > > > > > > > TREs? Please don't workaround issues from one driver in another
> > > > > > > > > > > > > one.
> > > > > > > > > > > > 
> > > > > > > > > > > > We are trying to utilize the existing CHAN_TRES mentioned in the
> > > > > > > > > > > > GPI driver.
> > > > > > > > > > > > With the following approach, the GPI hardware can process N
> > > > > > > > > > > > number of I2C
> > > > > > > > > > > > messages, thereby improving throughput and transfer efficiency.
> > > > > > > > > > > > 
> > > > > > > > > > > > The main design consideration for using the block event interrupt
> > > > > > > > > > > > is as
> > > > > > > > > > > > follows:
> > > > > > > > > > > > 
> > > > > > > > > > > > Allow the hardware to process the TREs (I2C messages), while the
> > > > > > > > > > > > software
> > > > > > > > > > > > concurrently prepares the next set of TREs to be submitted to the
> > > > > > > > > > > > hardware.
> > > > > > > > > > > > Once the TREs are processed, they can be freed, enabling the
> > > > > > > > > > > > software to
> > > > > > > > > > > > queue new TREs. This approach enhances overall optimization.
> > > > > > > > > > > > 
> > > > > > > > > > > > Please let me know if you have any questions, concerns, or
> > > > > > > > > > > > suggestions.
> > > > > > > > > > > 
> > > > > > > > > > > The question was why do you limit that to
> > > > > > > > > > > QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
> > > > > > > > > > > What is the reason for that limit, etc. If you think about it, The
> > > > > > > > > > > GENI
> > > > > > > > > > > / I2C doesn't impose any limit on the number of messages processed in
> > > > > > > > > > > one go (if I understand it correctly). Instead the limit comes
> > > > > > > > > > > from the
> > > > > > > > > > > GPI DMA driver. As such, please don't add extra 'handling' to the I2C
> > > > > > > > > > > driver. Make GPI DMA driver responsible for saying 'no more for now',
> > > > > > > > > > > then I2C driver can setup add an interrupt flag and proceed with
> > > > > > > > > > > submitting next messages, etc.
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > For I2C messages, we need to prepare TREs for Config, Go and DMAs.
> > > > > > > > > > However,
> > > > > > > > > > if a large number of I2C messages are submitted then may may run
> > > > > > > > > > out of
> > > > > > > > > > memory for serving the TREs. The GPI channel supports a maximum of
> > > > > > > > > > 64 TREs,
> > > > > > > > > > which is insufficient to serve 32 or even 16 I2C messages
> > > > > > > > > > concurrently,
> > > > > > > > > > given the multiple TREs required per message.
> > > > > > > > > > 
> > > > > > > > > > To address this limitation, a strategy has been implemented to
> > > > > > > > > > manage how
> > > > > > > > > > many messages can be queued and how memory is recycled. The constant
> > > > > > > > > > QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
> > > > > > > > > > messages that can be queued at once. Additionally,
> > > > > > > > > > QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
> > > > > > > > > > half of the queued messages are expected to be freed or deallocated
> > > > > > > > > > per
> > > > > > > > > > interrupt.
> > > > > > > > > > This approach ensures that the driver can efficiently manage TRE
> > > > > > > > > > resources
> > > > > > > > > > and continue queuing new I2C messages without exhausting memory.
> > > > > > > > > > > I really don't see a reason for additional complicated handling in
> > > > > > > > > > > the
> > > > > > > > > > > geni driver that you've implemented. Maybe I misunderstand
> > > > > > > > > > > something. In
> > > > > > > > > > > such a case it usually means that you have to explain the design
> > > > > > > > > > > in the
> > > > > > > > > > > commit message / in-code comments.
> > > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > The I2C Geni driver is designed to prepare and submit descriptors
> > > > > > > > > > to the GPI
> > > > > > > > > > driver one message at a time.
> > > > > > > > > > As a result, the GPI driver does not have visibility into the current
> > > > > > > > > > message index or the total number of I2C messages in a transfer.
> > > > > > > > > > This lack
> > > > > > > > > > of context makes it challenging to determine when to set the block
> > > > > > > > > > event
> > > > > > > > > > interrupt, which is typically used to signal the completion of a
> > > > > > > > > > batch of
> > > > > > > > > > messages.
> > > > > > > > > > 
> > > > > > > > > > So, the responsibility for deciding when to set the BEI should lie
> > > > > > > > > > with the
> > > > > > > > > > I2C driver.
> > > > > > > > > > 
> > > > > > > > > > If this approach is acceptable, I will proceed with updating the
> > > > > > > > > > relevant
> > > > > > > > > > details in the commit message.
> > > > > > > > > > 
> > > > > > > > > > Please let me know if you have any concerns or suggestions.
> > > > > > > > > 
> > > > > > > > Hi Dmitry, Sorry for the delayed response, and thank you for the
> > > > > > > > suggestions.
> > > > > > > > 
> > > > > > > > > - Make gpi_prep_slave_sg() return NULL if flags don't have
> > > > > > > > >       DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
> > > > > > > > >       interrupt-enabled transfer.
> > > > > > > > "there are no 3 empty TREs for the interrupt-enabled transfer."
> > > > > > > > Could you please help me understand this a bit better?
> > > > > > > 
> > > > > > > In the GPI driver you know how many TREs are available. In
> > > > > > > gpi_prep_slave_sg() you can check that and return an error if there
> > > > > > > are not enough TREs available.
> > > > > > > 
> > > > > > > > > 
> > > > > > > > > - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
> > > > > > > > >       again, adding DMA_PREP_INTERRUPT. Make sure that the last one
> > > > > > > > > always
> > > > > > > > >       gets DMA_PREP_INTERRUPT.
> > > > > > > > Does this mean we need to proceed to the next I2C message and ensure
> > > > > > > > that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
> > > > > > > > chunk? And then, should we submit the chunk of messages to the GSI
> > > > > > > > hardware for processing?
> > > > > > > 
> > > > > > > No. You don't have to peek at the next I2C message. This all concerns
> > > > > > > the current I2C message. The only point where you have to worry is to
> > > > > > > explicitly set the flag for the last message.
> > > > > > > 
> > > > > > > > 
> > > > > > > > > 
> > > > > > > > > - In geni_i2c_gpi_xfer() split the loop to submit messages until you
> > > > > > > > >       can, then call wait_for_completion_timeout() and then
> > > > > > > > >       geni_i2c_gpi_unmap() for submitted messages, then continue with
> > > > > > > > > a new
> > > > > > > > >       portion of messages.
> > > > > > > > Since the GPI channel supports a maximum of 64 TREs, should we consider
> > > > > > > > submitting a smaller number of predefined messages — perhaps fewer than
> > > > > > > > 32, such as 16?
> > > > > > > 
> > > > > > > Why? Just submit messages until they fit, then flush the DMA async
> > > > > > > channel.
> > > > > > > 
> > > > > > > > This is because handling 32 messages would require one TRE for config
> > > > > > > > and 64 TREs for the Go and DMA preparation steps, which exceeds the
> > > > > > > > channel's TRE capacity of 64.
> > > > > > > > 
> > > > > > > > We designed the approach to submit a portion of the messages — for
> > > > > > > > example, 16 at a time. Once 8 messages are processed and freed, the
> > > > > > > > hardware can continue processing the TREs, while the software
> > > > > > > > simultaneously prepares the next set of TREs. This parallelism helps in
> > > > > > > > efficiently utilizing the hardware and enhances overall system
> > > > > > > > optimization.
> > > > > > > 
> > > > > > > 
> > > > > > > And this overcomplicates the driver and introduces artificial
> > > > > > > limitations which need explanation. Please fix it in a simple way
> > > > > > > first. Then you can e.g. implement the watermark at the half of the
> > > > > > > GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
> > > > > > > middle of the full sequence, allowing it to be used asynchronously in
> > > > > > > the background.
> > > > > > > 
> > > > > > 
> > > > > > Okay, will review it. Thanks.
> > > > > > 
> > > > > > 
> > > > > 
> > > > > Hi Dmitry,
> > > > > 
> > > > > Can you please check and confirm the approach to follow is something
> > > > > like the pseudo code mentioned below:
> > > > 
> > > > Yes, this is what I've had in mind.
> > > 
> > > So, Apart from the changes related to "submitting I2C messages until they
> > > fit" and "unmapping all processed I2C messages together", the rest of the
> > > code looks remains the same as in the v6 patch ?
> > > Also, in the GPI driver, we need to add logic to retrieve the number of
> > > available TREs.
> > > 
> > > I have a concern regarding throughput and achieving parallelism between
> > > software and hardware processing with this new approach. Since we need to
> > > unmap all processed messages together, the software cannot queue the next
> > > set of TREs while the hardware is still processing the current ones.
> > 
> > Does that warrant the over-complexity of the driver or close-coupling of
> > I2C and GPE drivers?
> > 
> > The I2C is a slow bus and it is not expected to be used for
> > high-throughput data.
> > 
> The block event interrupt and multi-descriptor handling are primarily added
> to support a camera use case, where multiple registers are need to be
> configured in a single I2C transfer with 200 or more i2c messages. This
> enhancement is expected to improve throughput and meet performance KPIs.
> 
> > > 
> > > As I mentioned earlier, the previous approach allowed partial unmapping
> > > where half of the messages processed by the hardware could be
> > > freed/unmapped. This enabled the hardware to continue processing the
> > > remaining TREs while the software simultaneously prepared the next batch.
> > > This parallelism helped in better hardware utilization and improved overall
> > > system performance.
> > 
> > Measurements / values / impact?
> > 
> > > 
> > > Could you please confirm if can go with the similar approach of unmap the
> > > processed TREs based on a fixed threshold or constant value, instead of
> > > unmapping them all at once?
> > 
> > I'd still say, that's a bad idea. Please stay within the boundaries of
> > the DMA API.
> >
> I agree with the approach you suggested—it's the GPI's responsibility to
> manage the available TREs.
> 
> However, I'm curious whether can we set a dynamic watermark value perhaps
> half the available TREs) to trigger unmapping of processed TREs ? This would
> allow the software to prepare the next set of TREs while the hardware
> continues processing the remaining ones, enabling better parallelism and
> throughput.

Let's land the simple implementation first, which can then be improved.
However I don't see any way to return 'above the watermark' from the DMA
controller. You might need to enhance the API.

> 
> > > > 
> > > > > 
> > > > > GPI driver:
> > > > > In gpi_prep_slave_sg() function,
> > > > > 
> > > > > if (!(flags & DMA_PREP_INTERRUPT) && !gpi_available_tres(chan))
> > > > >           return NULL;
> > > > > 
> > > > > 
> > > > > I2C GENI driver:
> > > > > 
> > > > > for (i = 0; i < num; i++)
> > > > > {
> > > > >       /* Always set interrupt for the last message */
> > > > >       if (i == num_msgs - 1)
> > > > >           flags |= DMA_PREP_INTERRUPT;
> > > > > 
> > > > > 
> > > > >       desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir, flags);
> > > > >       if (!desc && !(flags & DMA_PREP_INTERRUPT)) {
> > > > >             /* Retry with interrupt if not enough TREs */
> > > > >             flags |= DMA_PREP_INTERRUPT;
> > > > >             desc = dmaengine_prep_slave_single(chan, dma_addr, len, dir,   flags);
> > > > >       }
> > > > > 
> > > > > 
> > > > >       if (!desc)
> > > > >           break;
> > > > > 
> > > > > 
> > > > >        dmaengine_submit(desc);
> > > > >        msg_idx++;
> > > > > }
> > > > > 
> > > > > dma_async_issue_pending(chan));
> > > > > 
> > > > > time_left = wait_for_completion_timeout(&gi2c->done, XFER_TIMEOUT);
> > > > > if (!time_left)
> > > > >           return -ETIMEDOUT;
> > > > > 
> > > > > Now Invoke "geni_i2c_gpi_unmap" for unmapping all submitted I2C messages.
> > > > > 
> > > > > 
> > > > > Thanks,
> > > > > JyothiKumar
> > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > 
> > > 
> > 
> 

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-07-22 12:46                                 ` Dmitry Baryshkov
@ 2025-07-23  7:15                                   ` Vinod Koul
  2025-07-25 10:50                                     ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Vinod Koul @ 2025-07-23  7:15 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Jyothi Kumar Seerapu, Mukesh Kumar Savaliya, Viken Dadhaniya,
	Andi Shyti, Sumit Semwal, Christian König, linux-arm-msm,
	dmaengine, linux-kernel, linux-i2c, linux-media, dri-devel,
	linaro-mm-sig, quic_vtanuku

On 22-07-25, 15:46, Dmitry Baryshkov wrote:
> On Tue, Jul 22, 2025 at 05:50:08PM +0530, Jyothi Kumar Seerapu wrote:
> > On 7/19/2025 3:27 PM, Dmitry Baryshkov wrote:
> > > On Mon, Jul 07, 2025 at 09:58:30PM +0530, Jyothi Kumar Seerapu wrote:
> > > > On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
> > > > > On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu

[Folks, would be nice to trim replies]

> > > > Could you please confirm if can go with the similar approach of unmap the
> > > > processed TREs based on a fixed threshold or constant value, instead of
> > > > unmapping them all at once?
> > > 
> > > I'd still say, that's a bad idea. Please stay within the boundaries of
> > > the DMA API.
> > >
> > I agree with the approach you suggested—it's the GPI's responsibility to
> > manage the available TREs.
> > 
> > However, I'm curious whether can we set a dynamic watermark value perhaps
> > half the available TREs) to trigger unmapping of processed TREs ? This would
> > allow the software to prepare the next set of TREs while the hardware
> > continues processing the remaining ones, enabling better parallelism and
> > throughput.
> 
> Let's land the simple implementation first, which can then be improved.
> However I don't see any way to return 'above the watermark' from the DMA
> controller. You might need to enhance the API.

Traditionally, we set the dma transfers for watermark level and we get a
interrupt. So you might want to set the callback for watermark level
and then do mapping/unmapping etc in the callback. This is typical model
for dmaengines, we should follow that well

BR
-- 
~Vinod

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-07-23  7:15                                   ` Vinod Koul
@ 2025-07-25 10:50                                     ` Jyothi Kumar Seerapu
  2025-08-11 17:40                                       ` Vinod Koul
  0 siblings, 1 reply; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-07-25 10:50 UTC (permalink / raw)
  To: Vinod Koul, Dmitry Baryshkov
  Cc: Mukesh Kumar Savaliya, Viken Dadhaniya, Andi Shyti, Sumit Semwal,
	Christian König, linux-arm-msm, dmaengine, linux-kernel,
	linux-i2c, linux-media, dri-devel, linaro-mm-sig, quic_vtanuku



On 7/23/2025 12:45 PM, Vinod Koul wrote:
> On 22-07-25, 15:46, Dmitry Baryshkov wrote:
>> On Tue, Jul 22, 2025 at 05:50:08PM +0530, Jyothi Kumar Seerapu wrote:
>>> On 7/19/2025 3:27 PM, Dmitry Baryshkov wrote:
>>>> On Mon, Jul 07, 2025 at 09:58:30PM +0530, Jyothi Kumar Seerapu wrote:
>>>>> On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
>>>>>> On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
> 
> [Folks, would be nice to trim replies]
> 
>>>>> Could you please confirm if can go with the similar approach of unmap the
>>>>> processed TREs based on a fixed threshold or constant value, instead of
>>>>> unmapping them all at once?
>>>>
>>>> I'd still say, that's a bad idea. Please stay within the boundaries of
>>>> the DMA API.
>>>>
>>> I agree with the approach you suggested—it's the GPI's responsibility to
>>> manage the available TREs.
>>>
>>> However, I'm curious whether can we set a dynamic watermark value perhaps
>>> half the available TREs) to trigger unmapping of processed TREs ? This would
>>> allow the software to prepare the next set of TREs while the hardware
>>> continues processing the remaining ones, enabling better parallelism and
>>> throughput.
>>
>> Let's land the simple implementation first, which can then be improved.
>> However I don't see any way to return 'above the watermark' from the DMA
>> controller. You might need to enhance the API.
> 
> Traditionally, we set the dma transfers for watermark level and we get a
> interrupt. So you might want to set the callback for watermark level
> and then do mapping/unmapping etc in the callback. This is typical model
> for dmaengines, we should follow that well
> 
> BR

Thanks Dmitry and Vinod, I will work on V7 patch for submitting the I2C 
messages until they fit and and unmap all processed messages together 
for now.

Regarding the watermark mechanism, looks GENI SE DMA supports watermark 
interrupts but it appears that GPI DMA doesn't have such provision of 
watermark.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-07-25 10:50                                     ` Jyothi Kumar Seerapu
@ 2025-08-11 17:40                                       ` Vinod Koul
  2025-08-14 12:44                                         ` Jyothi Kumar Seerapu
  0 siblings, 1 reply; 30+ messages in thread
From: Vinod Koul @ 2025-08-11 17:40 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu
  Cc: Dmitry Baryshkov, Mukesh Kumar Savaliya, Viken Dadhaniya,
	Andi Shyti, Sumit Semwal, Christian König, linux-arm-msm,
	dmaengine, linux-kernel, linux-i2c, linux-media, dri-devel,
	linaro-mm-sig, quic_vtanuku

On 25-07-25, 16:20, Jyothi Kumar Seerapu wrote:
> 
> 
> On 7/23/2025 12:45 PM, Vinod Koul wrote:
> > On 22-07-25, 15:46, Dmitry Baryshkov wrote:
> > > On Tue, Jul 22, 2025 at 05:50:08PM +0530, Jyothi Kumar Seerapu wrote:
> > > > On 7/19/2025 3:27 PM, Dmitry Baryshkov wrote:
> > > > > On Mon, Jul 07, 2025 at 09:58:30PM +0530, Jyothi Kumar Seerapu wrote:
> > > > > > On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
> > > > > > > On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
> > 
> > [Folks, would be nice to trim replies]
> > 
> > > > > > Could you please confirm if can go with the similar approach of unmap the
> > > > > > processed TREs based on a fixed threshold or constant value, instead of
> > > > > > unmapping them all at once?
> > > > > 
> > > > > I'd still say, that's a bad idea. Please stay within the boundaries of
> > > > > the DMA API.
> > > > > 
> > > > I agree with the approach you suggested—it's the GPI's responsibility to
> > > > manage the available TREs.
> > > > 
> > > > However, I'm curious whether can we set a dynamic watermark value perhaps
> > > > half the available TREs) to trigger unmapping of processed TREs ? This would
> > > > allow the software to prepare the next set of TREs while the hardware
> > > > continues processing the remaining ones, enabling better parallelism and
> > > > throughput.
> > > 
> > > Let's land the simple implementation first, which can then be improved.
> > > However I don't see any way to return 'above the watermark' from the DMA
> > > controller. You might need to enhance the API.
> > 
> > Traditionally, we set the dma transfers for watermark level and we get a
> > interrupt. So you might want to set the callback for watermark level
> > and then do mapping/unmapping etc in the callback. This is typical model
> > for dmaengines, we should follow that well
> > 
> > BR
> 
> Thanks Dmitry and Vinod, I will work on V7 patch for submitting the I2C
> messages until they fit and and unmap all processed messages together for
> now.
> 
> Regarding the watermark mechanism, looks GENI SE DMA supports watermark
> interrupts but it appears that GPI DMA doesn't have such provision of
> watermark.

What is the mechanism to get interrupts from the GPI? If you submit 10
txn, can you ask it to interrupt when half of them are done?

-- 
~Vinod

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-08-11 17:40                                       ` Vinod Koul
@ 2025-08-14 12:44                                         ` Jyothi Kumar Seerapu
  0 siblings, 0 replies; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-08-14 12:44 UTC (permalink / raw)
  To: Vinod Koul
  Cc: Dmitry Baryshkov, Mukesh Kumar Savaliya, Viken Dadhaniya,
	Andi Shyti, Sumit Semwal, Christian König, linux-arm-msm,
	dmaengine, linux-kernel, linux-i2c, linux-media, dri-devel,
	linaro-mm-sig, quic_vtanuku



On 8/11/2025 11:10 PM, Vinod Koul wrote:
> On 25-07-25, 16:20, Jyothi Kumar Seerapu wrote:
>>
>>
>> On 7/23/2025 12:45 PM, Vinod Koul wrote:
>>> On 22-07-25, 15:46, Dmitry Baryshkov wrote:
>>>> On Tue, Jul 22, 2025 at 05:50:08PM +0530, Jyothi Kumar Seerapu wrote:
>>>>> On 7/19/2025 3:27 PM, Dmitry Baryshkov wrote:
>>>>>> On Mon, Jul 07, 2025 at 09:58:30PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>> On 7/4/2025 1:11 AM, Dmitry Baryshkov wrote:
>>>>>>>> On Thu, 3 Jul 2025 at 15:51, Jyothi Kumar Seerapu
>>>
>>> [Folks, would be nice to trim replies]
>>>
>>>>>>> Could you please confirm if can go with the similar approach of unmap the
>>>>>>> processed TREs based on a fixed threshold or constant value, instead of
>>>>>>> unmapping them all at once?
>>>>>>
>>>>>> I'd still say, that's a bad idea. Please stay within the boundaries of
>>>>>> the DMA API.
>>>>>>
>>>>> I agree with the approach you suggested—it's the GPI's responsibility to
>>>>> manage the available TREs.
>>>>>
>>>>> However, I'm curious whether can we set a dynamic watermark value perhaps
>>>>> half the available TREs) to trigger unmapping of processed TREs ? This would
>>>>> allow the software to prepare the next set of TREs while the hardware
>>>>> continues processing the remaining ones, enabling better parallelism and
>>>>> throughput.
>>>>
>>>> Let's land the simple implementation first, which can then be improved.
>>>> However I don't see any way to return 'above the watermark' from the DMA
>>>> controller. You might need to enhance the API.
>>>
>>> Traditionally, we set the dma transfers for watermark level and we get a
>>> interrupt. So you might want to set the callback for watermark level
>>> and then do mapping/unmapping etc in the callback. This is typical model
>>> for dmaengines, we should follow that well
>>>
>>> BR
>>
>> Thanks Dmitry and Vinod, I will work on V7 patch for submitting the I2C
>> messages until they fit and and unmap all processed messages together for
>> now.
>>
>> Regarding the watermark mechanism, looks GENI SE DMA supports watermark
>> interrupts but it appears that GPI DMA doesn't have such provision of
>> watermark.
> 
> What is the mechanism to get interrupts from the GPI? If you submit 10
> txn, can you ask it to interrupt when half of them are done?

GPI interrupts are triggered upon each transaction completion or when 
error conditions occur.

Using the Block Event Interrupt (BEI) mechanism, we can control when 
interrupts are generated.
For example, if there are 10 transactions (i.e., 10 I2C messages in a 
transfer), it's possible to configure the GSI to generate an interrupt 
after the completion of the 5th transaction.

However, with the new design discussed above, I2C messages are submitted 
dynamically based on available TREs.
BEI is configured to generate interrupts either:
   - After the last I2C message, if sufficient TREs are available, or
   - After a specific I2C message, when no further TREs are available.
	
In this dynamic setup, there is no static way to configure BEI to 
trigger interrupts at a fixed transaction count.
The interrupt behavior is instead determined by runtime resource 
availability.



> 


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-06 11:18 ` [PATCH v6 2/2] i2c: i2c-qcom-geni: Add " Jyothi Kumar Seerapu
  2025-05-06 11:46   ` Dmitry Baryshkov
@ 2025-05-07  2:21   ` ALOK TIWARI
  2025-05-09  6:16     ` Jyothi Kumar Seerapu
  1 sibling, 1 reply; 30+ messages in thread
From: ALOK TIWARI @ 2025-05-07  2:21 UTC (permalink / raw)
  To: Jyothi Kumar Seerapu, Vinod Koul, Mukesh Kumar Savaliya,
	Viken Dadhaniya, Andi Shyti, Sumit Semwal, Christian König
  Cc: linux-arm-msm, dmaengine, linux-kernel, linux-i2c, linux-media,
	dri-devel, linaro-mm-sig, quic_vtanuku



On 06-05-2025 16:48, Jyothi Kumar Seerapu wrote:
> +/**
> + * struct geni_i2c_dev - I2C Geni device specific structure
> + *
> + * @se: geni serial engine
> + * @tx_wm: Tx watermark level
> + * @irq: i2c serial engine interrupt
> + * @err: specifies error codes in i2c transfer failures
> + * @adap: i2c geni adapter
> + * @done: completion variable
> + * @cur: pointer to the i2c_msg mentioning current i2c message in use
> + * @cur_wr: variable used for i2c write opertions

typo opertions -> operations

> + * @cur_rd: variable used for i2c read operations
> + * @lock: spinlock variable used for synchronization
> + * @core_clk: pointer to clk
> + * @clk_freq_out: contains the i2c clock frequency
> + * @clk_fld: pointer to geni_i2c_clk_fld
> + * @suspended: flag used for system supend status

typo supend -> suspend

> + * @dma_buf: virtual address of the buffer
> + * @xfer_len: holds length for the dma operation
> + * @dma_addr: dma address of the buffer
> + * @tx_c: Tx dma channel
> + * @rx_c: Rx dma channel
> + * @gpi_mode: GPI DMA mode of operation
> + * @abort_done: true for marking i2c abort transfer
> + * @is_tx_multi_desc_xfer: true for i2c multi transfer support
> + * @num_msgs: number of i2c messages in a transfer
> + * @tx_irq_cnt: flag used for tx irq count in i2c multi transfer
> + * @i2c_multi_desc_config: used for multi transfer support
> + */
>   struct geni_i2c_dev {
>   	struct geni_se se;
>   	u32 tx_wm;
> @@ -100,6 +156,10 @@ struct geni_i2c_dev {
>   	struct dma_chan *rx_c;
>   	bool gpi_mode;
>   	bool abort_done;
> +	bool is_tx_multi_desc_xfer;
> +	u32 num_msgs;
> +	u32 tx_irq_cnt;
> +	struct geni_i2c_gpi_multi_desc_xfer i2c_multi_desc_config;
>   };
>   


Thanks,
Alok

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support
  2025-05-07  2:21   ` ALOK TIWARI
@ 2025-05-09  6:16     ` Jyothi Kumar Seerapu
  0 siblings, 0 replies; 30+ messages in thread
From: Jyothi Kumar Seerapu @ 2025-05-09  6:16 UTC (permalink / raw)
  To: ALOK TIWARI, Vinod Koul, Mukesh Kumar Savaliya, Viken Dadhaniya,
	Andi Shyti, Sumit Semwal, Christian König
  Cc: linux-arm-msm, dmaengine, linux-kernel, linux-i2c, linux-media,
	dri-devel, linaro-mm-sig, quic_vtanuku



On 5/7/2025 7:51 AM, ALOK TIWARI wrote:
> 
> 
> On 06-05-2025 16:48, Jyothi Kumar Seerapu wrote:
>> +/**
>> + * struct geni_i2c_dev - I2C Geni device specific structure
>> + *
>> + * @se: geni serial engine
>> + * @tx_wm: Tx watermark level
>> + * @irq: i2c serial engine interrupt
>> + * @err: specifies error codes in i2c transfer failures
>> + * @adap: i2c geni adapter
>> + * @done: completion variable
>> + * @cur: pointer to the i2c_msg mentioning current i2c message in use
>> + * @cur_wr: variable used for i2c write opertions
> 
> typo opertions -> operations
Sure, thanks will correct it.
> 
>> + * @cur_rd: variable used for i2c read operations
>> + * @lock: spinlock variable used for synchronization
>> + * @core_clk: pointer to clk
>> + * @clk_freq_out: contains the i2c clock frequency
>> + * @clk_fld: pointer to geni_i2c_clk_fld
>> + * @suspended: flag used for system supend status
> 
> typo supend -> suspend
sure, will correct it.
> 
>> + * @dma_buf: virtual address of the buffer
>> + * @xfer_len: holds length for the dma operation
>> + * @dma_addr: dma address of the buffer
>> + * @tx_c: Tx dma channel
>> + * @rx_c: Rx dma channel
>> + * @gpi_mode: GPI DMA mode of operation
>> + * @abort_done: true for marking i2c abort transfer
>> + * @is_tx_multi_desc_xfer: true for i2c multi transfer support
>> + * @num_msgs: number of i2c messages in a transfer
>> + * @tx_irq_cnt: flag used for tx irq count in i2c multi transfer
>> + * @i2c_multi_desc_config: used for multi transfer support
>> + */
>>   struct geni_i2c_dev {
>>       struct geni_se se;
>>       u32 tx_wm;
>> @@ -100,6 +156,10 @@ struct geni_i2c_dev {
>>       struct dma_chan *rx_c;
>>       bool gpi_mode;
>>       bool abort_done;
>> +    bool is_tx_multi_desc_xfer;
>> +    u32 num_msgs;
>> +    u32 tx_irq_cnt;
>> +    struct geni_i2c_gpi_multi_desc_xfer i2c_multi_desc_config;
>>   };
> 
> 
> Thanks,
> Alok


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2025-08-14 12:44 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-06 11:18 [PATCH v6 0/2] Add Block event interrupt support for I2C protocol Jyothi Kumar Seerapu
2025-05-06 11:18 ` [PATCH v6 1/2] dmaengine: qcom: gpi: Add GPI Block event interrupt support Jyothi Kumar Seerapu
2025-05-06 11:32   ` Dmitry Baryshkov
2025-05-09  6:18     ` Jyothi Kumar Seerapu
2025-05-30 14:05       ` Jyothi Kumar Seerapu
2025-05-30 15:53         ` Dmitry Baryshkov
2025-06-17 14:16           ` Jyothi Kumar Seerapu
2025-05-06 11:18 ` [PATCH v6 2/2] i2c: i2c-qcom-geni: Add " Jyothi Kumar Seerapu
2025-05-06 11:46   ` Dmitry Baryshkov
2025-05-09  6:18     ` Jyothi Kumar Seerapu
2025-05-09 16:01       ` Dmitry Baryshkov
2025-05-21 10:28         ` Jyothi Kumar Seerapu
2025-05-21 12:45           ` Dmitry Baryshkov
2025-05-30 14:06             ` Jyothi Kumar Seerapu
2025-05-30 16:42               ` Dmitry Baryshkov
2025-06-17 14:11                 ` Jyothi Kumar Seerapu
2025-06-17 19:32                   ` Dmitry Baryshkov
2025-06-19 16:16                     ` Jyothi Kumar Seerapu
2025-07-03 12:50                       ` Jyothi Kumar Seerapu
2025-07-03 19:41                         ` Dmitry Baryshkov
2025-07-07 16:28                           ` Jyothi Kumar Seerapu
2025-07-19  9:57                             ` Dmitry Baryshkov
2025-07-22 12:20                               ` Jyothi Kumar Seerapu
2025-07-22 12:46                                 ` Dmitry Baryshkov
2025-07-23  7:15                                   ` Vinod Koul
2025-07-25 10:50                                     ` Jyothi Kumar Seerapu
2025-08-11 17:40                                       ` Vinod Koul
2025-08-14 12:44                                         ` Jyothi Kumar Seerapu
2025-05-07  2:21   ` ALOK TIWARI
2025-05-09  6:16     ` Jyothi Kumar Seerapu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).