* [PATCH] i2c: aspeed: Nullify bus messages after timeout @ 2025-01-31 22:29 Eddie James 2025-02-03 3:31 ` Andrew Jeffery 0 siblings, 1 reply; 4+ messages in thread From: Eddie James @ 2025-01-31 22:29 UTC (permalink / raw) To: linux-i2c; +Cc: openbmc, ryan_chen, benh, joel, andi.shyti, andrew, Eddie James For multimaster case, it's conceivable that an interrupt comes in after the transfer times out and attempts to use bus messages that have already been freed by the i2c core. Signed-off-by: Eddie James <eajames@linux.ibm.com> --- drivers/i2c/busses/i2c-aspeed.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/i2c/busses/i2c-aspeed.c b/drivers/i2c/busses/i2c-aspeed.c index 1550d3d552aed..e344dcc2233fe 100644 --- a/drivers/i2c/busses/i2c-aspeed.c +++ b/drivers/i2c/busses/i2c-aspeed.c @@ -731,6 +731,7 @@ static int aspeed_i2c_master_xfer(struct i2c_adapter *adap, * master command. */ spin_lock_irqsave(&bus->lock, flags); + bus->msgs = NULL; if (bus->master_state == ASPEED_I2C_MASTER_PENDING) bus->master_state = ASPEED_I2C_MASTER_INACTIVE; spin_unlock_irqrestore(&bus->lock, flags); -- 2.43.5 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] i2c: aspeed: Nullify bus messages after timeout 2025-01-31 22:29 [PATCH] i2c: aspeed: Nullify bus messages after timeout Eddie James @ 2025-02-03 3:31 ` Andrew Jeffery 2025-02-03 20:29 ` Eddie James 0 siblings, 1 reply; 4+ messages in thread From: Andrew Jeffery @ 2025-02-03 3:31 UTC (permalink / raw) To: Eddie James, linux-i2c; +Cc: openbmc, ryan_chen, benh, joel, andi.shyti On Fri, 2025-01-31 at 16:29 -0600, Eddie James wrote: > For multimaster case, it's conceivable that an interrupt comes > in after the transfer times out and attempts to use bus messages > that have already been freed by the i2c core. This description seems a little vague. Did you hit this case in practice? > > Signed-off-by: Eddie James <eajames@linux.ibm.com> > --- > drivers/i2c/busses/i2c-aspeed.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/i2c/busses/i2c-aspeed.c > b/drivers/i2c/busses/i2c-aspeed.c > index 1550d3d552aed..e344dcc2233fe 100644 > --- a/drivers/i2c/busses/i2c-aspeed.c > +++ b/drivers/i2c/busses/i2c-aspeed.c > @@ -731,6 +731,7 @@ static int aspeed_i2c_master_xfer(struct > i2c_adapter *adap, > * master command. > */ > spin_lock_irqsave(&bus->lock, flags); > + bus->msgs = NULL; It feels like there's buggy code elsewhere in the driver that this is protecting (broken state machine)? I've had a look at the aspeed_i2c_master_irq() implementation and can't see what though, as we take an early-exit (before indexing into bus->msgs) if bus- >master_state is INACTIVE or PENDING. Can you be a bit more specific about where the issue lies? Andrew > if (bus->master_state == ASPEED_I2C_MASTER_PENDING) > bus->master_state = > ASPEED_I2C_MASTER_INACTIVE; > spin_unlock_irqrestore(&bus->lock, flags); ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] i2c: aspeed: Nullify bus messages after timeout 2025-02-03 3:31 ` Andrew Jeffery @ 2025-02-03 20:29 ` Eddie James 2025-02-04 4:13 ` Andrew Jeffery 0 siblings, 1 reply; 4+ messages in thread From: Eddie James @ 2025-02-03 20:29 UTC (permalink / raw) To: Andrew Jeffery, linux-i2c; +Cc: openbmc, ryan_chen, benh, joel, andi.shyti On 2/2/25 21:31, Andrew Jeffery wrote: > On Fri, 2025-01-31 at 16:29 -0600, Eddie James wrote: >> For multimaster case, it's conceivable that an interrupt comes >> in after the transfer times out and attempts to use bus messages >> that have already been freed by the i2c core. > This description seems a little vague. Did you hit this case in > practice? Yes. I no longer have the back trace but it's a null pointer access in the interrupt handler. We had a certain system that would hit this under certain conditions and this patch fixed it. I can update the commit message with some more detail. > >> Signed-off-by: Eddie James <eajames@linux.ibm.com> >> --- >> drivers/i2c/busses/i2c-aspeed.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/drivers/i2c/busses/i2c-aspeed.c >> b/drivers/i2c/busses/i2c-aspeed.c >> index 1550d3d552aed..e344dcc2233fe 100644 >> --- a/drivers/i2c/busses/i2c-aspeed.c >> +++ b/drivers/i2c/busses/i2c-aspeed.c >> @@ -731,6 +731,7 @@ static int aspeed_i2c_master_xfer(struct >> i2c_adapter *adap, >> * master command. >> */ >> spin_lock_irqsave(&bus->lock, flags); >> + bus->msgs = NULL; > It feels like there's buggy code elsewhere in the driver that this is > protecting (broken state machine)? I've had a look at the > aspeed_i2c_master_irq() implementation and can't see what though, as we > take an early-exit (before indexing into bus->msgs) if bus- >> master_state is INACTIVE or PENDING. > Can you be a bit more specific about where the issue lies? I'm sure the state machine isn't perfect, yea. The bad access can happen like this: if a transfer times out while waiting for an interrupt, the aspeed_i2c_master_xfer function will either reset the engine or recover the bus, and exit ETIMEDOUT. Resetting the engine will turn off interrupts, so we're safe. But recovering the bus doesn't turn off interrupts, so after the function exits ETIMEDOUT, I assume what happens is we get the interrupt for the previous transfer and try and access the messages pointer, which the i2c core has already freed. Thanks for looking! Eddie > > Andrew > > >> if (bus->master_state == ASPEED_I2C_MASTER_PENDING) >> bus->master_state = >> ASPEED_I2C_MASTER_INACTIVE; >> spin_unlock_irqrestore(&bus->lock, flags); ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] i2c: aspeed: Nullify bus messages after timeout 2025-02-03 20:29 ` Eddie James @ 2025-02-04 4:13 ` Andrew Jeffery 0 siblings, 0 replies; 4+ messages in thread From: Andrew Jeffery @ 2025-02-04 4:13 UTC (permalink / raw) To: Eddie James, linux-i2c; +Cc: openbmc, ryan_chen, benh, joel, andi.shyti On Mon, 2025-02-03 at 14:29 -0600, Eddie James wrote: > > On 2/2/25 21:31, Andrew Jeffery wrote: > > On Fri, 2025-01-31 at 16:29 -0600, Eddie James wrote: > > > For multimaster case, it's conceivable that an interrupt comes > > > in after the transfer times out and attempts to use bus messages > > > that have already been freed by the i2c core. > > This description seems a little vague. Did you hit this case in > > practice? > > > Yes. I no longer have the back trace but it's a null pointer access in > the interrupt handler. We had a certain system that would hit this under > certain conditions and this patch fixed it. > > > I can update the commit message with some more detail. Thanks. > > > > > > > Signed-off-by: Eddie James <eajames@linux.ibm.com> > > > --- > > > drivers/i2c/busses/i2c-aspeed.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/drivers/i2c/busses/i2c-aspeed.c > > > b/drivers/i2c/busses/i2c-aspeed.c > > > index 1550d3d552aed..e344dcc2233fe 100644 > > > --- a/drivers/i2c/busses/i2c-aspeed.c > > > +++ b/drivers/i2c/busses/i2c-aspeed.c > > > @@ -731,6 +731,7 @@ static int aspeed_i2c_master_xfer(struct > > > i2c_adapter *adap, > > > * master command. > > > */ > > > spin_lock_irqsave(&bus->lock, flags); > > > + bus->msgs = NULL; > > It feels like there's buggy code elsewhere in the driver that this is > > protecting (broken state machine)? I've had a look at the > > aspeed_i2c_master_irq() implementation and can't see what though, as we > > take an early-exit (before indexing into bus->msgs) if bus- > > > master_state is INACTIVE or PENDING. > > Can you be a bit more specific about where the issue lies? > > > I'm sure the state machine isn't perfect, yea. > Right, so I think that's what should be fixed; the explicit states define possible invariants in the implementation. We shouldn't need to test `msgs` to know its value (whether its value is correct should be defined by the current state). > The bad access can happen > like this: if a transfer times out while waiting for an interrupt, the > aspeed_i2c_master_xfer function will either reset the engine or recover > the bus, and exit ETIMEDOUT. Resetting the engine will turn off > interrupts, so we're safe. But recovering the bus doesn't turn off > interrupts, so after the function exits ETIMEDOUT, I assume what happens > is we get the interrupt for the previous transfer and try and access the > messages pointer, which the i2c core has already freed. So what immediately concerns me is there's no RECOVER state in `enum aspeed_i2c_master_state` or the rest of the implementation. We do have the PENDING state, which we don't have in hardware, so there's no reason for RECOVER to be missing, especially since we have the RECOVER state in hardware (I2CD14[22:19] = 0b0011). What do you think of adding that, and testing for it in the interrupt handler? Andrew ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-02-04 4:13 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-01-31 22:29 [PATCH] i2c: aspeed: Nullify bus messages after timeout Eddie James 2025-02-03 3:31 ` Andrew Jeffery 2025-02-03 20:29 ` Eddie James 2025-02-04 4:13 ` Andrew Jeffery
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox