public inbox for linux-i2c@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] i2c: aspeed: Nullify bus messages after timeout
@ 2025-01-31 22:29 Eddie James
  2025-02-03  3:31 ` Andrew Jeffery
  0 siblings, 1 reply; 4+ messages in thread
From: Eddie James @ 2025-01-31 22:29 UTC (permalink / raw)
  To: linux-i2c; +Cc: openbmc, ryan_chen, benh, joel, andi.shyti, andrew, Eddie James

For multimaster case, it's conceivable that an interrupt comes
in after the transfer times out and attempts to use bus messages
that have already been freed by the i2c core.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
---
 drivers/i2c/busses/i2c-aspeed.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c
index 1550d3d552aed..e344dcc2233fe 100644
--- a/drivers/i2c/busses/i2c-aspeed.c
+++ b/drivers/i2c/busses/i2c-aspeed.c
@@ -731,6 +731,7 @@ static int aspeed_i2c_master_xfer(struct i2c_adapter *adap,
 		 * master command.
 		 */
 		spin_lock_irqsave(&bus->lock, flags);
+		bus->msgs = NULL;
 		if (bus->master_state == ASPEED_I2C_MASTER_PENDING)
 			bus->master_state = ASPEED_I2C_MASTER_INACTIVE;
 		spin_unlock_irqrestore(&bus->lock, flags);
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] i2c: aspeed: Nullify bus messages after timeout
  2025-01-31 22:29 [PATCH] i2c: aspeed: Nullify bus messages after timeout Eddie James
@ 2025-02-03  3:31 ` Andrew Jeffery
  2025-02-03 20:29   ` Eddie James
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Jeffery @ 2025-02-03  3:31 UTC (permalink / raw)
  To: Eddie James, linux-i2c; +Cc: openbmc, ryan_chen, benh, joel, andi.shyti

On Fri, 2025-01-31 at 16:29 -0600, Eddie James wrote:
> For multimaster case, it's conceivable that an interrupt comes
> in after the transfer times out and attempts to use bus messages
> that have already been freed by the i2c core.

This description seems a little vague. Did you hit this case in
practice?

> 
> Signed-off-by: Eddie James <eajames@linux.ibm.com>
> ---
>  drivers/i2c/busses/i2c-aspeed.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/i2c/busses/i2c-aspeed.c
> b/drivers/i2c/busses/i2c-aspeed.c
> index 1550d3d552aed..e344dcc2233fe 100644
> --- a/drivers/i2c/busses/i2c-aspeed.c
> +++ b/drivers/i2c/busses/i2c-aspeed.c
> @@ -731,6 +731,7 @@ static int aspeed_i2c_master_xfer(struct
> i2c_adapter *adap,
>                  * master command.
>                  */
>                 spin_lock_irqsave(&bus->lock, flags);
> +               bus->msgs = NULL;

It feels like there's buggy code elsewhere in the driver that this is
protecting (broken state machine)? I've had a look at the
aspeed_i2c_master_irq() implementation and can't see what though, as we
take an early-exit (before indexing into bus->msgs) if bus-
>master_state is INACTIVE or PENDING.

Can you be a bit more specific about where the issue lies?

Andrew


>                 if (bus->master_state == ASPEED_I2C_MASTER_PENDING)
>                         bus->master_state =
> ASPEED_I2C_MASTER_INACTIVE;
>                 spin_unlock_irqrestore(&bus->lock, flags);


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] i2c: aspeed: Nullify bus messages after timeout
  2025-02-03  3:31 ` Andrew Jeffery
@ 2025-02-03 20:29   ` Eddie James
  2025-02-04  4:13     ` Andrew Jeffery
  0 siblings, 1 reply; 4+ messages in thread
From: Eddie James @ 2025-02-03 20:29 UTC (permalink / raw)
  To: Andrew Jeffery, linux-i2c; +Cc: openbmc, ryan_chen, benh, joel, andi.shyti


On 2/2/25 21:31, Andrew Jeffery wrote:
> On Fri, 2025-01-31 at 16:29 -0600, Eddie James wrote:
>> For multimaster case, it's conceivable that an interrupt comes
>> in after the transfer times out and attempts to use bus messages
>> that have already been freed by the i2c core.
> This description seems a little vague. Did you hit this case in
> practice?


Yes. I no longer have the back trace but it's a null pointer access in 
the interrupt handler. We had a certain system that would hit this under 
certain conditions and this patch fixed it.


I can update the commit message with some more detail.


>
>> Signed-off-by: Eddie James <eajames@linux.ibm.com>
>> ---
>>   drivers/i2c/busses/i2c-aspeed.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/i2c/busses/i2c-aspeed.c
>> b/drivers/i2c/busses/i2c-aspeed.c
>> index 1550d3d552aed..e344dcc2233fe 100644
>> --- a/drivers/i2c/busses/i2c-aspeed.c
>> +++ b/drivers/i2c/busses/i2c-aspeed.c
>> @@ -731,6 +731,7 @@ static int aspeed_i2c_master_xfer(struct
>> i2c_adapter *adap,
>>                   * master command.
>>                   */
>>                  spin_lock_irqsave(&bus->lock, flags);
>> +               bus->msgs = NULL;
> It feels like there's buggy code elsewhere in the driver that this is
> protecting (broken state machine)? I've had a look at the
> aspeed_i2c_master_irq() implementation and can't see what though, as we
> take an early-exit (before indexing into bus->msgs) if bus-
>> master_state is INACTIVE or PENDING.
> Can you be a bit more specific about where the issue lies?


I'm sure the state machine isn't perfect, yea. The bad access can happen 
like this: if a transfer times out while waiting for an interrupt, the 
aspeed_i2c_master_xfer function will either reset the engine or recover 
the bus, and exit ETIMEDOUT. Resetting the engine will turn off 
interrupts, so we're safe. But recovering the bus doesn't turn off 
interrupts, so after the function exits ETIMEDOUT, I assume what happens 
is we get the interrupt for the previous transfer and try and access the 
messages pointer, which the i2c core has already freed.


Thanks for looking!

Eddie



>
> Andrew
>
>
>>                  if (bus->master_state == ASPEED_I2C_MASTER_PENDING)
>>                          bus->master_state =
>> ASPEED_I2C_MASTER_INACTIVE;
>>                  spin_unlock_irqrestore(&bus->lock, flags);

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] i2c: aspeed: Nullify bus messages after timeout
  2025-02-03 20:29   ` Eddie James
@ 2025-02-04  4:13     ` Andrew Jeffery
  0 siblings, 0 replies; 4+ messages in thread
From: Andrew Jeffery @ 2025-02-04  4:13 UTC (permalink / raw)
  To: Eddie James, linux-i2c; +Cc: openbmc, ryan_chen, benh, joel, andi.shyti

On Mon, 2025-02-03 at 14:29 -0600, Eddie James wrote:
> 
> On 2/2/25 21:31, Andrew Jeffery wrote:
> > On Fri, 2025-01-31 at 16:29 -0600, Eddie James wrote:
> > > For multimaster case, it's conceivable that an interrupt comes
> > > in after the transfer times out and attempts to use bus messages
> > > that have already been freed by the i2c core.
> > This description seems a little vague. Did you hit this case in
> > practice?
> 
> 
> Yes. I no longer have the back trace but it's a null pointer access in 
> the interrupt handler. We had a certain system that would hit this under 
> certain conditions and this patch fixed it.
> 
> 
> I can update the commit message with some more detail.

Thanks.

> 
> 
> > 
> > > Signed-off-by: Eddie James <eajames@linux.ibm.com>
> > > ---
> > >   drivers/i2c/busses/i2c-aspeed.c | 1 +
> > >   1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/i2c/busses/i2c-aspeed.c
> > > b/drivers/i2c/busses/i2c-aspeed.c
> > > index 1550d3d552aed..e344dcc2233fe 100644
> > > --- a/drivers/i2c/busses/i2c-aspeed.c
> > > +++ b/drivers/i2c/busses/i2c-aspeed.c
> > > @@ -731,6 +731,7 @@ static int aspeed_i2c_master_xfer(struct
> > > i2c_adapter *adap,
> > >                   * master command.
> > >                   */
> > >                  spin_lock_irqsave(&bus->lock, flags);
> > > +               bus->msgs = NULL;
> > It feels like there's buggy code elsewhere in the driver that this is
> > protecting (broken state machine)? I've had a look at the
> > aspeed_i2c_master_irq() implementation and can't see what though, as we
> > take an early-exit (before indexing into bus->msgs) if bus-
> > > master_state is INACTIVE or PENDING.
> > Can you be a bit more specific about where the issue lies?
> 
> 
> I'm sure the state machine isn't perfect, yea.
> 

Right, so I think that's what should be fixed; the explicit states
define possible invariants in the implementation. We shouldn't need to
test `msgs` to know its value (whether its value is correct should be
defined by the current state).

> The bad access can happen 
> like this: if a transfer times out while waiting for an interrupt, the 
> aspeed_i2c_master_xfer function will either reset the engine or recover 
> the bus, and exit ETIMEDOUT. Resetting the engine will turn off 
> interrupts, so we're safe. But recovering the bus doesn't turn off 
> interrupts, so after the function exits ETIMEDOUT, I assume what happens 
> is we get the interrupt for the previous transfer and try and access the 
> messages pointer, which the i2c core has already freed.

So what immediately concerns me is there's no RECOVER state in `enum
aspeed_i2c_master_state` or the rest of the implementation. We do have
the PENDING state, which we don't have in hardware, so there's no
reason for RECOVER to be missing, especially since we have the RECOVER
state in hardware (I2CD14[22:19] = 0b0011).

What do you think of adding that, and testing for it in the interrupt
handler?

Andrew

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-02-04  4:13 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-31 22:29 [PATCH] i2c: aspeed: Nullify bus messages after timeout Eddie James
2025-02-03  3:31 ` Andrew Jeffery
2025-02-03 20:29   ` Eddie James
2025-02-04  4:13     ` Andrew Jeffery

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox