From mboxrd@z Thu Jan 1 00:00:00 1970 From: Cyrille Pitchen Subject: Re: [PATCH] i2c: at91: fix write transfers by clearing pending interrupt first Date: Tue, 13 Oct 2015 18:47:25 +0200 Message-ID: <561D359D.8000500@atmel.com> References: <1444746104-20615-1-git-send-email-ludovic.desroches@atmel.com> <561D20E7.5070405@lysator.liu.se> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: QUOTED-PRINTABLE Return-path: In-Reply-To: <561D20E7.5070405@lysator.liu.se> Sender: linux-kernel-owner@vger.kernel.org To: Peter Rosin , Ludovic Desroches , wsa@the-dreams.de Cc: linux-i2c@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, nicolas.ferre@atmel.com List-Id: linux-i2c@vger.kernel.org Le 13/10/2015 17:19, Peter Rosin a =E9crit : > On 2015-10-13 16:21, Ludovic Desroches wrote: >> From: Cyrille Pitchen >> >> In some cases a NACK interrupt may be pending in the Status Register= (SR) >> as a result of a previous transfer. However at91_do_twi_transfer() d= id not >> read the SR to clear pending interruptions before starting a new tra= nsfer. >> Hence a NACK interrupt rose as soon as it was enabled again at the I= 2C >> controller level, resulting in a wrong sequence of operations and st= range >> patterns of behaviour on the I2C bus, such as a clock stretch follow= ed by >> a restart of the transfer. >> >> This first issue occurred with both DMA and PIO write transfers. >> >> Also when a NACK error was detected during a PIO write transfer, the >> interrupt handler used to wrongly start a new transfer by writing in= to the >> Transmit Holding Register (THR). Then the I2C slave was likely to re= ply >> with a second NACK. >> >> This second issue is fixed in atmel_twi_interrupt() by handling the = TXRDY >> status bit only if both the TXCOMP and NACK status bits are cleared. >> >> Tested with a at24 eeprom on sama5d36ek board running a linux-4.1-at= 91 >> kernel image. Adapted to linux-next. >> >> Signed-off-by: Cyrille Pitchen >> Fixes: 93563a6a71bb ("i2c: at91: fix a race condition when using the= DMA controller") >> Reported-by: Peter Rosin >> Signed-off-by: Ludovic Desroches >> Cc: stable@vger.kernel.org #4.1 >=20 > The regression is gone with this patch (vanilla 4.2), thank you! >=20 > However, looking at the bus, there are two NACKs after each > successful chunk of memory written by the eeprom driver. >=20 > Looking at the eeprom driver, I expect this on the bus: >=20 > S 0x50 0x00 "hello there guys" P > S 0x50 NACK P > delay for at least 1 ms (since the eeprom driver has a msleep(1) call= ). > S 0x50 NACK P > delay for at least 1 ms > ... > ... > S 0x50 NACK P > delay for at least 1 ms > S 0x50 0x10 "and girls\n" P >=20 > This is not what I observe on the bus, most of the time there is a > second NACK immediately following the first. I suspect that it is > the i2c bus driver that somehow confuses itself and reissues the > command for some reason? >=20 > But this behavior has been there since the beginning, so it's probabl= y > orthogonal, and killing the data corrupting regression is much more > important to me than fussing over a surplus failed transfer. Hence >=20 > Tested-by: Peter Rosin >=20 > Cheers, > Peter >=20 Hi Peter, sama5d3x and sama5d4x don't support the so called "Alternative Command = mode" whereas sama5d2x do. The Alternative Command mode comes with a new hard= ware mechanism inside the I2C controller which locks the transmission of dat= a on the I2C bus when a NACK is detected. It means that even if a data is wr= itten into the THR, the I2C controller doesn't push this data on the I2C bus = but retains the data in the THR (and its associated FIFO for sama5d2x and f= uture SoCs) until the driver unlocks the transmitter by writing the LOCKCLR (= Lock Clear) bit in the Control Register. Then and only then, the transmitter= outputs pending data again. This new mechanism was introduced to cope with an unwanted DMA write in= to the THR after a NACK. Indeed, as I've tried to explain in my patch, when a = first NACK is detected, the I2C controller sets the TXCOMP, NACK and TXRDY bi= ts alltogether in the Status Register. However setting the TXRDY bit also = triggers the DMA controller to write the next data into the THR. Pitifully, WITH= OUT the new lock mechanism, writing into the THR starts a new I2C frame. Since = the eeprom is likely not to be ready yet, it replies by a second NACK. So y= ou see on the scope two consecutive NACKs. On sama5d3x and sama5d4x, which do not support this lock mechanism, you= are likely to see a successful write transfer followed by two NACKs then a = delay and finally a new successful write transfer. This is the case 2b: 1 - A successfull write transfer is completed. 2 - The at24 driver immediately tries to perform the next write transfe= r... 3 - ... but the eeprom is not ready yet and replies with a first NACK. 4 - The I2C controller detects this NACK and sets the NACK, TXCOMP and = TXRDY bits in its Status Register (and interrupts the CPU). 5 - The DMA controller is triggered by the TXRDY bit to write the next = data into the THR. 6 - WITHTOUT the lock mechanism, the I2C controller starts a new frame = because of the write into the THR by the DMA controller... 7 - ... but the eeprom is still not ready and replies with a second NAC= K. 8 - starting from 4, the CPU handles the interrupt generated by the fir= st NACK. Depending on the timing, it has already masked the interrupted befo= re the second NACK. In such a case, the NACK bit is pending in the Status Register. Then it reports the first NACK error to the at24 driver. 9 - The at24 driver handles the NACK error: it sleeps for a while befor= e retrying to perform the write operation. 10 - With the new patch, the I2C controller reads the Status Register t= o clear pending (NACK) interrupts before actually starting this new (retri= ed) transfer. So if you are using DMA transfers on a sama5d3x board, you're are likel= y to see, as I did see, a valid write transfer followed by 2 NACKs, then a d= elay and finally a new write transfer. The 2nd NACK is not wanted but expected. Without the lock mechanism, th= ere is nothing we can do at the I2C controller level. However a workaround may= exist: inserting a small delay after a successful write transfer should give t= he eeprom enough time to be ready to process the next write so even the fi= rst NACK would disappear. This is what I saw while debugging with dev_info() traces. Those traces= had the side effect of inserting delays hence changing the timing. As you see, we have to deal with some known hardware design quirks, whi= ch are fixed by the lock mechanism. Without the lock mechanism, the interrupt = handler and the CPU are likely to come too late to cancel the DMA transfer befo= re the DMA controller writes a new byte into the THR, hence generating an unwa= nted frame. Best Regards, Cyrille