* [PATCH] can: c_can: Fix bit clearing and message object read
From: Alexander Stein @ 2014-04-08 9:46 UTC
To: Wolfgang Grandegger, Marc Kleine-Budde, Thomas Gleixner
Cc: Alexander Stein, linux-can

It seems that clearing some bits in a message object using COMMSK_REG and
reading this object afterwards using COMREQ_REG is not race free.
It might occur that the read message object still contains the
IF_MCONT_NEWDAT bit set while it was already cleared upon write to
COMMSK_REG. So insert a dummy read to the hardware.

Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
---
 drivers/net/can/c_can/c_can.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
index 0e9f974..625f8e2 100644
--- a/drivers/net/can/c_can/c_can.c
+++ b/drivers/net/can/c_can/c_can.c
@@ -290,6 +290,13 @@ static inline void c_can_object_get(struct net_device *dev,
 	 */
 	priv->write_reg(priv, C_CAN_IFACE(COMMSK_REG, iface),
 			IFX_WRITE_LOW_16BIT(mask));
+
+	/*
+	 * Insert additional dummy read to let
+	 * the write to COMMSK_REG take effect
+	 */
+	priv->read_reg(priv, C_CAN_IFACE(COMREQ_REG, iface));
+
 	priv->write_reg(priv, C_CAN_IFACE(COMREQ_REG, iface),
 			IFX_WRITE_LOW_16BIT(objno));
 
@@ -832,12 +839,11 @@ static int c_can_read_objects(struct net_device *dev, struct c_can_priv *priv,
 		}
 
 		/*
-		 * This really should not happen, but this covers some
-		 * odd HW behaviour. Do not remove that unless you
-		 * want to brick your machine.
+		 * This really should not happen, but this warns about some
+		 * odd HW behaviour.
 		 */
-		if (!(ctrl & IF_MCONT_NEWDAT))
-			continue;
+		if ((ctrl & IF_MCONT_NEWDAT))
+			netdev_warn(dev, "IF_MCONT_NEWDAT still set on obj: %d", obj);
 
 		/* read the data from the message object */
 		c_can_read_msg_object(dev, IF_RX, ctrl);
-- 
1.8.3.2

* Re: [PATCH] can: c_can: Fix bit clearing and message object read
From: Thomas Gleixner @ 2014-04-08 13:07 UTC
To: Alexander Stein
Cc: Wolfgang Grandegger, Marc Kleine-Budde, linux-can

On Tue, 8 Apr 2014, Alexander Stein wrote:

> It seems that clearing some bits in a message object using COMMSK_REG and
> reading this object afterwards using COMREQ_REG is not race free.
> It might occur that the read message object still contains the
> IF_MCONT_NEWDAT bit set while it was already cleared upon write to
> COMMSK_REG. So insert a dummy read to the hardware.

No. It does not get cleared with the write to COMMSK_REG. That's just
a write. That write does nothing.

The execution of the transfer starts with writing the message object
number to COMREQ_REG. How should the IF know which message object you
want to access?

>  	 */
>  	priv->write_reg(priv, C_CAN_IFACE(COMMSK_REG, iface),
>  			IFX_WRITE_LOW_16BIT(mask));
> +
> +	/*
> +	 * Insert additional dummy read to let
> +	 * the write to COMMSK_REG take effect
> +	 */
> +	priv->read_reg(priv, C_CAN_IFACE(COMREQ_REG, iface));
> +

So that read confuses the hell out of the hardware.

>  	priv->write_reg(priv, C_CAN_IFACE(COMREQ_REG, iface),
>  			IFX_WRITE_LOW_16BIT(objno));
>  
> @@ -832,12 +839,11 @@ static int c_can_read_objects(struct net_device *dev, struct c_can_priv *priv,
>  		}
>  
>  		/*
> -		 * This really should not happen, but this covers some
> -		 * odd HW behaviour. Do not remove that unless you
> -		 * want to brick your machine.
> +		 * This really should not happen, but this warns about some
> +		 * odd HW behaviour.
>  		 */
> -		if (!(ctrl & IF_MCONT_NEWDAT))
> -			continue;
> +		if ((ctrl & IF_MCONT_NEWDAT))
> +			netdev_warn(dev, "IF_MCONT_NEWDAT still set on obj: %d", obj);

No, that's wrong. That does not work with any hardware I have access
to.

From the datasheet:

 "Note: A read access to a message object can be combined with the reset
  of the INTPND and NEWDAT control bits. The values of these two bits,
  which will be transferred to the IFm message control register, always
  reflect the status before resetting them."

So it is expected that NEWDAT is set in the ctrl register readout if
the message object in the message RAM had the bit set.

The point is that when the IF transfers the message from message RAM
to the interface it clears the INTPND and NEWDAT bits in the message
RAM. That avoids the extra transfers after reading the message.

Thanks,

	tglx

* Re: [PATCH] can: c_can: Fix bit clearing and message object read
From: Alexander Stein @ 2014-04-08 14:02 UTC
To: Thomas Gleixner
Cc: Wolfgang Grandegger, Marc Kleine-Budde, linux-can

On Tuesday 08 April 2014 15:07:10, Thomas Gleixner wrote:
> On Tue, 8 Apr 2014, Alexander Stein wrote:
> > It seems that clearing some bits in a message object using COMMSK_REG and
> > reading this object afterwards using COMREQ_REG is not race free.
> > It might occur that the read message object still contains the
> > IF_MCONT_NEWDAT bit set while it was already cleared upon write to
> > COMMSK_REG. So insert a dummy read to the hardware.
>
> No. It does not get cleared with the write to COMMSK_REG. That's just
> a write. That write does nothing.
>
> The execution of the transfer starts with writing the message object
> number to COMREQ_REG. How should the IF know which message object you
> want to access?

Ok, I agree with you here.
Strangely, with the following patch IF_MCONT_NEWDAT is not set in even a
single message buffer. How can that delay influence a write that will
happen later?

diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
index 0e9f974..ea0aacc 100644
--- a/drivers/net/can/c_can/c_can.c
+++ b/drivers/net/can/c_can/c_can.c
@@ -290,6 +290,9 @@ static inline void c_can_object_get(struct net_device *dev,
 	 */
 	priv->write_reg(priv, C_CAN_IFACE(COMMSK_REG, iface),
 			IFX_WRITE_LOW_16BIT(mask));
+
+	udelay(10);
+
 	priv->write_reg(priv, C_CAN_IFACE(COMREQ_REG, iface),
 			IFX_WRITE_LOW_16BIT(objno));
 
@@ -832,12 +835,11 @@ static int c_can_read_objects(struct net_device *dev, struct c_can_priv *priv,
 		}
 
 		/*
-		 * This really should not happen, but this covers some
-		 * odd HW behaviour. Do not remove that unless you
-		 * want to brick your machine.
+		 * This really should not happen, but this warns about some
+		 * odd HW behaviour.
 		 */
-		if (!(ctrl & IF_MCONT_NEWDAT))
-			continue;
+		if ((ctrl & IF_MCONT_NEWDAT))
+			netdev_warn(dev, "IF_MCONT_NEWDAT still set on obj: %d", obj);
 
 		/* read the data from the message object */
 		c_can_read_msg_object(dev, IF_RX, ctrl);

> >  	priv->write_reg(priv, C_CAN_IFACE(COMREQ_REG, iface),
> >  			IFX_WRITE_LOW_16BIT(objno));
> >  
> > @@ -832,12 +839,11 @@ static int c_can_read_objects(struct net_device *dev, struct c_can_priv *priv,
> >  		}
> >  
> >  		/*
> > -		 * This really should not happen, but this covers some
> > -		 * odd HW behaviour. Do not remove that unless you
> > -		 * want to brick your machine.
> > +		 * This really should not happen, but this warns about some
> > +		 * odd HW behaviour.
> >  		 */
> > -		if (!(ctrl & IF_MCONT_NEWDAT))
> > -			continue;
> > +		if ((ctrl & IF_MCONT_NEWDAT))
> > +			netdev_warn(dev, "IF_MCONT_NEWDAT still set on obj: %d", obj);
>
> No, that's wrong. That does not work with any hardware I have access
> to.
>
> From the datasheet:
>
>  "Note: A read access to a message object can be combined with the reset
>   of the INTPND and NEWDAT control bits. The values of these two bits,
>   which will be transferred to the IFm message control register, always
>   reflect the status before resetting them."
>
> So it is expected, that NEWDAT is set in the ctrl register readout if
> the message object in the message ram had the bit set.
>
> The point is that when the IF transfers the message from message ram
> to the interface it clears the INTPND and NEWDAT bits in the message
> ram. That avoids the extra transfers after reading the message.

The problem is that NEWDAT in the ctrl register readout is set most of
the time, but sometimes not.
With the delay added above, it is unset all the time :(

Regards,
Alexander
-- 
Dipl.-Inf. Alexander Stein

SYS TEC electronic GmbH
Am Windrad 2
08468 Heinsdorfergrund
Tel.: 03765 38600-1156
Fax: 03765 38600-4100
Email: alexander.stein@systec-electronic.com
Website: www.systec-electronic.com

Managing Director: Dipl.-Phys. Siegmar Schmidt
Commercial registry: Amtsgericht Chemnitz, HRB 28082

* Re: [PATCH] can: c_can: Fix bit clearing and message object read
From: Thomas Gleixner @ 2014-04-08 14:15 UTC
To: Alexander Stein
Cc: Wolfgang Grandegger, Marc Kleine-Budde, linux-can

On Tue, 8 Apr 2014, Alexander Stein wrote:
> On Tuesday 08 April 2014 15:07:10, Thomas Gleixner wrote:
> The problem is that NEWDAT in the ctrl register readout is most of the time set, but sometimes not.
> Adding the delay above it is unset all the time :(

Now what happens if you disable interrupts across the writes:

     local_irq_disable()
     write(COMMSK)
     write(COMREQ)
     local_irq_enable()

Thanks,

	tglx

* Re: [PATCH] can: c_can: Fix bit clearing and message object read
From: Alexander Stein @ 2014-04-08 14:24 UTC
To: Thomas Gleixner
Cc: Wolfgang Grandegger, Marc Kleine-Budde, linux-can

On Tuesday 08 April 2014 16:15:37, Thomas Gleixner wrote:
> On Tue, 8 Apr 2014, Alexander Stein wrote:
> > On Tuesday 08 April 2014 15:07:10, Thomas Gleixner wrote:
> > The problem is that NEWDAT in the ctrl register readout is most of the time set, but sometimes not.
> > Adding the delay above it is unset all the time :(
>
> Now what happens if you disable interrupts across the writes:
>
>      local_irq_disable()
>      write(COMMSK)
>      write(COMREQ)
>      local_irq_enable()

There are still message objects where NEWDAT is unset, even if the
c_can_msg_obj_is_busy check is included.

Regards,
Alexander

* Re: [PATCH] can: c_can: Fix bit clearing and message object read
From: Thomas Gleixner @ 2014-04-09 0:42 UTC
To: Alexander Stein
Cc: Wolfgang Grandegger, Marc Kleine-Budde, linux-can

On Tue, 8 Apr 2014, Alexander Stein wrote:
> On Tuesday 08 April 2014 16:15:37, Thomas Gleixner wrote:
> > On Tue, 8 Apr 2014, Alexander Stein wrote:
> > > On Tuesday 08 April 2014 15:07:10, Thomas Gleixner wrote:
> > > The problem is that NEWDAT in the ctrl register readout is most of the time set, but sometimes not.
> > > Adding the delay above it is unset all the time :(
> >
> > Now what happens if you disable interrupts across the writes:
> >
> >      local_irq_disable()
> >      write(COMMSK)
> >      write(COMREQ)
> >      local_irq_enable()
>
> There are still message objects where NEWDAT is unset. Even if the c_can_msg_obj_is_busy check is included.

I found the issue after I got hold of a PCH afflicted system. It
might not apply cleanly as I changed a few other things in my patch
queue, but you get the idea. Too tired now to write changelogs, run
tests ....

W/o that patch I observe the same problems as you. With that applied I
did not observe any dropout with a 10 * 1e6 packet run.

Thanks,

	tglx

-------------->
Subject: can: c_can: Work around C_CAN RX wreckage
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 08 Apr 2014 21:41:51 +0200

Alexander reported that the new optimized handling of the RX fifo
causes random packet loss on Intel PCH C_CAN hardware.

After a few fruitless debugging sessions I got hold of a PCH (eg20t)
afflicted system. That machine does not have the CAN interface wired
up, but it was possible to reproduce the issue with the HW loopback
mode.

As Alexander observed correctly, clearing the NewDat flag along with
reading out the message buffer causes that issue on C_CAN, while D_CAN
handles that correctly.

Instead of restoring the original message buffer handling horror the
following workaround solves the issue:

    transfer buffer to IF without clearing the NewDat
    handle the message
    clear NewDat bit

That's similar to the original code but conditional for C_CAN.

I really wonder why all user manuals (C_CAN, Intel PCH and some more)
recommend to clear the NewDat bit right away. The knows it all Oracle
operated by Gurgle does not unearth any useful information either. I
simply cannot believe that we are the first to uncover that HW issue.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/net/can/c_can/c_can.c | 13 ++++++++++---
 drivers/net/can/c_can/c_can.h |  1 +
 2 files changed, 11 insertions(+), 3 deletions(-)

Index: linux-2.6/drivers/net/can/c_can/c_can.c
===================================================================
--- linux-2.6.orig/drivers/net/can/c_can/c_can.c
+++ linux-2.6/drivers/net/can/c_can/c_can.c
@@ -644,6 +644,10 @@ static int c_can_start(struct net_device
 	if (err)
 		return err;
 
+	/* Setup the command for new messages */
+	priv->comm_rcv_high = priv->type != BOSCH_D_CAN ?
+		IF_COMM_RCV_LOW : IF_COMM_RCV_HIGH;
+
 	priv->can.state = CAN_STATE_ERROR_ACTIVE;
 
 	/* reset tx helper pointers and the rx mask */
@@ -788,14 +792,15 @@ static u32 c_can_adjust_pending(u32 pend
 	return pend & ~((1 << lasts) - 1);
 }
 
-static inline void c_can_rx_object_get(struct net_device *dev, u32 obj)
+static inline void c_can_rx_object_get(struct net_device *dev,
+				       struct c_can_priv *priv, u32 obj)
 {
 #ifdef CONFIG_CAN_C_CAN_STRICT_FRAME_ORDERING
 	if (obj < C_CAN_MSG_RX_LOW_LAST)
 		c_can_object_get(dev, IF_RX, obj, IF_COMM_RCV_LOW);
 	else
 #endif
-		c_can_object_get(dev, IF_RX, obj, IF_COMM_RCV_HIGH);
+		c_can_object_get(dev, IF_RX, obj, priv->comm_rcv_high);
 }
 
 static inline void c_can_rx_finalize(struct net_device *dev,
@@ -810,6 +815,8 @@ static inline void c_can_rx_finalize(str
 		c_can_activate_all_lower_rx_msg_obj(dev, IF_RX);
 	}
 #endif
+	if (priv->type != BOSCH_D_CAN)
+		c_can_object_get(dev, IF_RX, obj, IF_COMM_CLR_NEWDAT);
 }
 
 static int c_can_read_objects(struct net_device *dev, struct c_can_priv *priv,
@@ -820,7 +827,7 @@ static int c_can_read_objects(struct net
 
 	while ((obj = ffs(pend)) && quota > 0) {
 		pend &= ~BIT(obj - 1);
-		c_can_rx_object_get(dev, obj);
+		c_can_rx_object_get(dev, priv, obj);
 		ctrl = priv->read_reg(priv, C_CAN_IFACE(MSGCTRL_REG, IF_RX));
 
 		if (ctrl & IF_MCONT_MSGLST) {
Index: linux-2.6/drivers/net/can/c_can/c_can.h
===================================================================
--- linux-2.6.orig/drivers/net/can/c_can/c_can.h
+++ linux-2.6/drivers/net/can/c_can/c_can.h
@@ -198,6 +198,7 @@ struct c_can_priv {
 	u32 __iomem *raminit_ctrlreg;
 	unsigned int instance;
 	void (*raminit) (const struct c_can_priv *priv, bool enable);
+	u32 comm_rcv_high;
 	u32 rxmasked;
 	u32 dlc[C_CAN_MSG_OBJ_TX_NUM];
 };

* Re: [PATCH] can: c_can: Fix bit clearing and message object read
From: Alexander Stein @ 2014-04-09 7:40 UTC
To: Thomas Gleixner
Cc: Wolfgang Grandegger, Marc Kleine-Budde, linux-can

On Wednesday 09 April 2014 02:42:57, Thomas Gleixner wrote:
> I found the issue after I got hold of a PCH afflicted system. It
> might not apply cleanly as I changed a few other things in my patch
> queue, but you get the idea. Too tired now to write changelogs, run
> tests ....
>
> W/o that patch I observe the same problems as you. With that applied I
> did not observe any dropout with a 10 * 1e6 packet run.

With your patch applied I ran my test with 5.000.000 CAN frames; none
got lost and only a single one got switched.

Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>

Just for reference, I have added the patch as I actually applied it.

Thanks and best regards,
Alexander

diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
index 0e9f974..b27b147 100644
--- a/drivers/net/can/c_can/c_can.c
+++ b/drivers/net/can/c_can/c_can.c
@@ -645,6 +645,10 @@ static int c_can_start(struct net_device *dev)
 	if (err)
 		return err;
 
+	/* Setup the command for new messages */
+	priv->comm_rcv_high = priv->type != BOSCH_D_CAN ?
+		IF_COMM_RCV_LOW : IF_COMM_RCV_HIGH;
+
 	priv->can.state = CAN_STATE_ERROR_ACTIVE;
 
 	/* reset tx helper pointers and the rx mask */
@@ -789,14 +793,15 @@ static u32 c_can_adjust_pending(u32 pend)
 	return pend & ~((1 << lasts) - 1);
 }
 
-static inline void c_can_rx_object_get(struct net_device *dev, u32 obj)
+static inline void c_can_rx_object_get(struct net_device *dev,
+				       struct c_can_priv *priv, u32 obj)
 {
 #ifdef CONFIG_CAN_C_CAN_STRICT_FRAME_ORDERING
 	if (obj < C_CAN_MSG_RX_LOW_LAST)
 		c_can_object_get(dev, IF_RX, obj, IF_COMM_RCV_LOW);
 	else
 #endif
-		c_can_object_get(dev, IF_RX, obj, IF_COMM_RCV_HIGH);
+		c_can_object_get(dev, IF_RX, obj, priv->comm_rcv_high);
 }
 
 static inline void c_can_rx_finalize(struct net_device *dev, struct c_can_priv *priv, u32 obj)
@@ -810,6 +815,8 @@ static inline void c_can_rx_finalize(struct net_device *dev, struct c_can_priv *
 		c_can_activate_all_lower_rx_msg_obj(dev, IF_RX);
 	}
 #endif
+	if (priv->type != BOSCH_D_CAN)
+		c_can_object_get(dev, IF_RX, obj, IF_COMM_CLR_NEWDAT);
 }
 
 static int c_can_read_objects(struct net_device *dev, struct c_can_priv *priv,
@@ -820,7 +827,7 @@ static int c_can_read_objects(struct net_device *dev, struct c_can_priv *priv,
 
 	while ((obj = ffs(pend)) && quota > 0) {
 		pend &= ~BIT(obj - 1);
-		c_can_rx_object_get(dev, obj);
+		c_can_rx_object_get(dev, priv, obj);
 		ctrl = priv->read_reg(priv, C_CAN_IFACE(MSGCTRL_REG, IF_RX));
 
 		if (ctrl & IF_MCONT_MSGLST) {
diff --git a/drivers/net/can/c_can/c_can.h b/drivers/net/can/c_can/c_can.h
index cd91960..792944c 100644
--- a/drivers/net/can/c_can/c_can.h
+++ b/drivers/net/can/c_can/c_can.h
@@ -198,6 +198,7 @@ struct c_can_priv {
 	u32 __iomem *raminit_ctrlreg;
 	unsigned int instance;
 	void (*raminit) (const struct c_can_priv *priv, bool enable);
+	u32 comm_rcv_high;
 	u32 rxmasked;
 	u32 dlc[C_CAN_MSG_OBJ_TX_NUM];
 };

* Re: [PATCH] can: c_can: Fix bit clearing and message object read
From: Alexander Stein @ 2014-04-09 10:28 UTC
To: Thomas Gleixner
Cc: Wolfgang Grandegger, Marc Kleine-Budde, linux-can

On Wednesday 09 April 2014 09:40:11, Alexander Stein wrote:
> On Wednesday 09 April 2014 02:42:57, Thomas Gleixner wrote:
> > I found the issue after I got hold of a PCH afflicted system. It
> > might not apply cleanly as I changed a few other things in my patch
> > queue, but you get the idea. Too tired now to write changelogs, run
> > tests ....
> >
> > W/o that patch I observe the same problems as you. With that applied I
> > did not observe any dropout with a 10 * 1e6 packet run.
>
> Applying your patch I run my test with 5.000.000 CAN frames where none got lost and only a single one got switched.

Just for completeness: if this test is run while iperf and I2C traffic
are ongoing, there are 24 switched messages.

Regards,
Alexander
