linux-can.vger.kernel.org archive mirror
* [mcp251x - spi] Blocked to "wait_for_completion"
@ 2012-12-06 11:22 Mylene Josserand
  2012-12-06 11:49 ` Marc Kleine-Budde
  0 siblings, 1 reply; 5+ messages in thread
From: Mylene Josserand @ 2012-12-06 11:22 UTC (permalink / raw)
  To: linux-can

Hi everyone,


I am new to this mailing list, so I hope I am writing to the right place.
This is the first time I have used the CAN protocol, so I am a newbie!
I am currently working on CAN with the MCP251x driver for my company.

For various reasons, my company uses kernel version 2.6.32.59. The MCP251x driver has been backported from Linux 2.6.34.
I am testing it for reception only: I have a set-up that sends CAN frames (from another computer), and I receive them with candump (this is what I want).


When I send CAN frames every 20 ms, the CAN interface seems to block after 1-2 minutes (sometimes 5 minutes) of reading: candump stops showing CAN messages and I can no longer interact with it.
If I increase the interval between frames to 100 ms, it takes around 30-40 minutes before it blocks. And with one frame every second, it blocks after 13 hours!
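
To put the three observations on a common scale, the sketch below converts them into approximate frame counts received before the hang. It uses only the intervals and durations reported above; the helper name `frames_before_hang` is made up for this illustration.

```python
# (interval between frames in seconds, (shortest, longest) observed time until hang in seconds)
observations = {
    "20 ms": (0.020, (1 * 60, 5 * 60)),
    "100 ms": (0.100, (30 * 60, 40 * 60)),
    "1 s": (1.0, (13 * 3600, 13 * 3600)),
}


def frames_before_hang(interval_s, seconds):
    """Approximate number of frames sent during `seconds` at one frame per `interval_s`."""
    return round(seconds / interval_s)


for label, (interval, (shortest, longest)) in observations.items():
    print(label, frames_before_hang(interval, shortest), frames_before_hang(interval, longest))
# 20 ms 3000 15000
# 100 ms 18000 24000
# 1 s 46800 46800
```

The hang shows up neither after a fixed time nor after a fixed number of frames, so a simple counter overflow looks unlikely. That reading is only a guess from three data points.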

I added some debugging traces to the driver and saw that the problem is in the call to the "spi_sync" function (in the "mcp251x_spi_trans" function).
I did the same in "spi.c" and saw that the SPI code is blocked in the "wait_for_completion" function. What is it waiting for?
From what I have read, this function is not interruptible, so it is normal that it blocks. But I do not understand why it only blocks in this function after some time.

Can you explain this problem? Have you seen it before?
How can I solve it? Should I update the kernel to a newer version? (If you have a newer version, could you test it please?)
I have seen that some newer kernel versions (3.0.53 and later) have updates to this spi_sync function. Will that fix this problem or not?


Thank you in advance for any help !


PS: I received an "Undelivered" response. I did not know whether my email had been sent, so I am sending it a second time. Sorry for the potential spam!


Best regards,


Mylene JOSSERAND


* Re: [mcp251x - spi] Blocked to "wait_for_completion"
  2012-12-06 11:22 [mcp251x - spi] Blocked to "wait_for_completion" Mylene Josserand
@ 2012-12-06 11:49 ` Marc Kleine-Budde
  2012-12-06 13:43   ` Mylene Josserand
  0 siblings, 1 reply; 5+ messages in thread
From: Marc Kleine-Budde @ 2012-12-06 11:49 UTC (permalink / raw)
  To: Mylene Josserand; +Cc: linux-can


Hello Mylene,

what SPI controller are you using?
Which SoC are you on?
Do you have a MCP2510 or a MCP2515?

On 12/06/2012 12:22 PM, Mylene Josserand wrote:
> I am new to this mailing list so I hope I am writing to the good
> section. It is the first time I use CAN protocol so I am a newbie ! I
> am currently working on CAN with the driver MCP251x for my company.
> 
> For some reasons, my company have the kernel 2.6.32.59 version. The
> MCP251x driver have been backported from the Linux version 2.6.34. I

A lot of fixes have been applied to the driver since v2.6.34, you should
backport them, too.

> am testing it but only for reading purpose. So I have a set-up to
> send CAN frame (in another computer) and received them with candump
> (this is what I want).
> 
> 
> When I send CAN frames every 20 ms, the CAN interface seems to be
> blocked after 1-2 minutes (sometimes 5 minutes) of reading. In fact,
> the candump stop showing CAN messages and I can not act anymore on
> it. If I increase the interval time between frames (100 msec), the
> time before being blocked is around 30-40 minutes. And if it is every
> 1 sec, it is after 13 hours !
> 
> I have made some debugging traces on the driver and I saw that the
> problem is in the call of "spi_sync" function (in the
> "mcp251x_spi_trans" function). I have done the same on the "spi.c"
> and I saw that the spi is blocked by the "wait_for_completion"
> function. What is it waiting for ? With some readings, I saw that

The CAN driver calls into the SPI layer to send a message synchronously.
The SPI layer kicks the SPI driver to send the message, but this is
done asynchronously. In the completion handler of the SPI driver the SPI
layer is told that the SPI transfer is complete.

In the above described mechanism the SPI layer is waiting with
wait_for_completion(), the SPI driver will call (either directly, or via
a callback, which has been set by the SPI layer) a complete() which will
wake up the SPI layer.

If the SPI layer is still in wait_for_completion(), the SPI transfer
never finished...there might be a bug in the SPI driver.
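
The hand-off described above can be modelled in userspace. The following is only a rough Python analogy, not kernel code: `Completion`, `good_driver` and `buggy_driver` are hypothetical names, `threading.Event` stands in for the kernel's completion object, and the message bytes are placeholders. A timeout is added so that the demonstration terminates; the kernel's wait_for_completion() has no timeout, which is why the real caller blocks forever.

```python
import threading


class Completion:
    """Userspace stand-in for a kernel completion (an assumption, not the kernel API)."""

    def __init__(self):
        self._done = threading.Event()

    def complete(self):
        # Called by the "driver" when the transfer has finished.
        self._done.set()

    def wait(self, timeout=None):
        # Returns True once complete() has been called, False if the timeout expires.
        return self._done.wait(timeout)


def spi_sync(driver_transfer, message, timeout=None):
    """Submit the message asynchronously, then block until the driver signals completion."""
    done = Completion()
    driver_transfer(message, done.complete)  # hypothetical driver entry point
    return done.wait(timeout)


def good_driver(message, on_complete):
    # Finishes the transfer a little later on another thread, then calls the callback.
    threading.Timer(0.01, on_complete).start()


def buggy_driver(message, on_complete):
    # Models a lost completion (e.g. a missed interrupt): the callback is never called.
    pass


print(spi_sync(good_driver, b"\x00"))        # True
print(spi_sync(buggy_driver, b"\x00", 0.1))  # False: without the timeout this would hang
```

With `buggy_driver` and no timeout, `spi_sync` never returns, which matches the symptom reported in this thread.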

> this function is not stoppable so it is normal that the function is
> blocked. But I did not understand why it is blocking in this function
> only after some times.
> 
> Can you have some explanation of this problem ? Have you already seen

See above.

> that before ? How solve it ? Update the kernel to new version ? (If

Never seen it.

Debug your SPI driver. Especially the completion of a transfer. Hook up
a scope to the SPI lines.

Updating the Kernel is always a good idea. Use latest v3.7-rc or newer
if updating.

> you have new version, could you test it please ?) I have seen that
> some new kernel version (3.0.53 and upper) have some update on this
> spi_sync function. Is this going to fix this problem or not ?

There are probably a lot of changes between 2.6.32 and 3.0.53 in the SPI
layer and probably in your SPI driver as well.

Marc

> PS : I had an Undelivered response. I did not know if my email has
> been sent so I send it a 2nd time. Sorry for potential spam !

Maybe you tried to send a HTML mail, which is not supported here :)

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |




* RE: [mcp251x - spi] Blocked to "wait_for_completion"
  2012-12-06 11:49 ` Marc Kleine-Budde
@ 2012-12-06 13:43   ` Mylene Josserand
  2012-12-06 14:21     ` Marc Kleine-Budde
  0 siblings, 1 reply; 5+ messages in thread
From: Mylene Josserand @ 2012-12-06 13:43 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

Hi Marc,

Thank you for your quick answer ! :)


>what SPI controller are you using?
>Which SoC are you on?

We are using an i.MX27, and I think the SPI controller is integrated in the SoC.

>Do you have a MCP2510 or a MCP2515?

We have a MCP2515.


>A lot of fixes have been applied to the driver since v2.6.34, you should
>backport them, too.

Yes, I see. I tried updating the MCP driver with a patch that you wrote to "fix and optimize" this driver. Unfortunately, it did not seem to solve the problem. But I will try to backport the new driver version.


>The CAN driver calls into the SPI layer to send a message synchronously.
>The SPI layer kicks the SPI driver to send the message, but this is
>done asynchronously. In the completion handler of the SPI driver the SPI
>layer is told that the SPI transfer is complete.

>In the above described mechanism the SPI layer is waiting with
>wait_for_completion(), the SPI driver will call (either directly, or via
>a callback, which has been set by the SPI layer) a complete() which will
>wake up the SPI layer.

>If the SPI layer is still in wait_for_completion(), the SPI transfer
>never finished...there might be a bug in the SPI driver.

Okay. That is what I was thinking. It looked like it was waiting for something...


>Never seen it.

Arf! I thought that maybe it was a known bug that you or someone else had fixed in newer versions.

>Debug your SPI driver. Especially the completion of a transfer. Hook up
>a scope to the SPI lines.

Okay. So spi.c, with the wait_for_completion, is the SPI layer? My problem would then not be in this file but in the SPI driver (I do not know where that is, but I will look for it!)?

>Updating the Kernel is always a good idea. Use latest v3.7-rc or newer
>if updating.

Okay! This is something that I proposed to my company. I may try to update the kernel directly (and so the driver too)!
I will let you know if I find out why this is happening.


Thank you again for your answer and your work on the CAN ! :)


Mylène JOSSERAND



* Re: [mcp251x - spi] Blocked to "wait_for_completion"
  2012-12-06 13:43   ` Mylene Josserand
@ 2012-12-06 14:21     ` Marc Kleine-Budde
  2012-12-07  7:28       ` Mylene Josserand
  0 siblings, 1 reply; 5+ messages in thread
From: Marc Kleine-Budde @ 2012-12-06 14:21 UTC (permalink / raw)
  To: Mylene Josserand; +Cc: linux-can


On 12/06/2012 02:43 PM, Mylene Josserand wrote:
>> what SPI controller are you using? Which SoC are you on?
> 
> We are using a IMX27 and the spi controller is inside I think.

uuuhhh....2.6.32 is pretty old for an imx27. There are probably a lot of
spi fixes, too.

> 
>> Do you have a MCP2510 or a MCP2515?
> 
> We have a MCP2515.
> 
> 
>> A lot of fixes have been applied to the driver since v2.6.34, you
>> should backport them, too.
> 
> Yes, I see it. I have tried to update the MCP driver with a patch
> that you have wrote to "fix and optimize" this driver. Unfortunately,
> it did not seem to solve the problem. But I will try to backport the
> new driver version.

I suggest cherry-picking the following patches from the current kernel
into yours:

3c8ac0f can: remove __dev* attributes
cab32f3 can: mcp251x: avoid repeated frame bug
194b9a4 can: mark bittiming_const pointer in struct can_priv as const
c2fd03a drivers: net: Remove casts to same type
aabdfd6 can: replace the dev_dbg/info/err/... with the new netdev_xxx macros
34206f2 can: mcp251x: Allow pass IRQ flags through platform data.
58a69cb workqueue, freezer: unify spelling of 'freeze' + 'able' to 'freezable'
b9958a9 can: mcp251x: fix reception of standard RTR frames
612eef4 can: mcp251x: fix generation of error frames
5601b2d can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set
9c473fc can: mcp251x: optimize 2515, rx int gets cleared automatically
beab675 can: mcp251x: define helper functions mcp251x_is_2510, mcp251x_is_2515
f1f8c6c can: mcp251x: Don't use pdata->model for chip selection anymore
d3cd156 can: mcp251x: write intf only when needed
7e15de3 can: mcp251x: read-modify-write eflag only when needed
f3a3ed3 can: mcp251x: allow to read two registers in one spi transfer
711e4d6 can: mcp251x: increase rx_errors on overflow, not only rx_over_errors
57d3c7b can: mcp251x: fix NOHZ local_softirq_pending 08 warning
1ae5dc3 net: trans_start cleanups
829e001 Fix some #includes in CAN drivers (rebased for net-next-2.6)
e446630 Add hotplug support to mcp251x driver
5a0e3ad include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
bf66f37 can: mcp251x: Move to threaded interrupts instead of workqueues.
ad72c34 can: Proper ctrlmode handling for CAN devices
3ccd4c6 can: Unify droping of invalid tx skbs and netdev stats
ce739b4 drivers/net/can: Correct NULL test
c7cd606 can: Fix data length code handling in rx path
615534b can: fix setting mcp251x bit timing on open
e000016 can: Driver for the Microchip MCP251x SPI CAN controllers

But you have to omit some that are not compatible with your kernel.
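
The list above is in newest-first order (its last entry is the commit that added the driver), while a series is normally applied oldest first. As a small sketch of turning such a list into commands, the helper below reverses it and drops any hashes you decide to skip. The function name `cherry_pick_commands` and the choice of which commit to skip are assumptions for illustration only; which patches really have to be omitted depends on your tree.

```python
def cherry_pick_commands(log_text, skip=()):
    """Turn newest-first '<hash> <subject>' lines into oldest-first git commands."""
    entries = [line.split(maxsplit=1) for line in log_text.strip().splitlines()]
    entries.reverse()  # apply the oldest commit first
    return [
        f"git cherry-pick -x {commit}  # {subject}"
        for commit, subject in entries
        if commit not in skip
    ]


# Three entries copied from the list above, newest first.
log_text = """
cab32f3 can: mcp251x: avoid repeated frame bug
b9958a9 can: mcp251x: fix reception of standard RTR frames
e000016 can: Driver for the Microchip MCP251x SPI CAN controllers
"""

# Skipping e000016 is purely illustrative (the driver itself is already in the tree).
for command in cherry_pick_commands(log_text, skip={"e000016"}):
    print(command)
```

The `-x` option makes git record the original commit hash in each backported commit message, which helps when checking later what has already been picked.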

>> The CAN driver calls into the SPI layer to send a message
>> synchronously. The SPI layer kicks the SPI driver to send the
>> message, but this is done asynchronously. In the completion
>> handler of the SPI driver the SPI layer is told that the SPI
>> transfer is complete.
> 
>> In the above described mechanism the SPI layer is waiting with 
>> wait_for_completion(), the SPI driver will call (either directly,
>> or via a callback, which has been set by the SPI layer) a
>> complete() which will wake up the SPI layer.
> 
>> If the SPI layer is still in wait_for_completion(), the SPI
>> transfer never finished...there might be a bug in the SPI
>> driver.
> 
> Okay. It was what I am thinking of. It was like waiting something...
> 
> 
>> Never seen it.
> 
> Arf ! I thought that maybe, it was a bug known that you/someone had
> fixed with new versions.
> 
>> Debug your SPI driver. Especially the completion of a transfer.
>> Hook up a scope to the SPI lines.
> 
> Okay. So the spi.c with the wait_for_completion is the spi layer ? My
> problem would not be in this file but in the spi driver (which I do
> not know where it is but I will search it ! ) ?

Maybe there is the same mechanism in the spi driver, too.

>> Updating the Kernel is always a good idea. Use latest v3.7-rc or
>> newer if updating.
> 
> Okay ! This was something that I propose to my company. I will maybe
> try directly to update the kernel (so the driver too) ! I will let
> you know if I find why it is happening.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |




* RE: [mcp251x - spi] Blocked to "wait_for_completion"
  2012-12-06 14:21     ` Marc Kleine-Budde
@ 2012-12-07  7:28       ` Mylene Josserand
  0 siblings, 0 replies; 5+ messages in thread
From: Mylene Josserand @ 2012-12-07  7:28 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

Hi Marc,


>uuuhhh....2.6.32 is pretty old for an imx27. There are probably a lot of
>spi fixes, too.


Okay. I am a new employee at my company and I proposed doing this, so thank you for this remark. It is good to have the point of view of experienced people! :)


>I suggest to cherry-pick the patches from your kernel to the current
>one:
>But you have to omit some that are not compatible with your kernel.


Thank you for the patches and the help !


Mylène JOSSERAND
