From: Wolfgang Grandegger <wg@grandegger.com>
To: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>,
	netdev@vger.kernel.org, mkl@pengutronix.de,
	linux-can@vger.kernel.org
Cc: linux-sh@vger.kernel.org, vksavl@gmail.com
Subject: Re: [PATCH] can: add Renesas R-Car CAN driver
Date: Sat, 09 Nov 2013 15:47:04 +0100
Message-ID: <527E4AE8.6020905@grandegger.com>
In-Reply-To: <527E142D.6070805@grandegger.com>

On 11/09/2013 11:53 AM, Wolfgang Grandegger wrote:
> Hi Sergei,
> 
> On 11/09/2013 02:02 AM, Sergei Shtylyov wrote:
>> Hello.
>>
>> On 10/21/2013 11:12 PM, Wolfgang Grandegger wrote:
>>
>>>>     Sorry for the belated reply -- was on vacation.
>>
>>    And again sorry, couldn't get to this due to other things.
>>
>>>>> thanks for your contribution. The patch already looks quite good.
>>>>> Before I find time for a detailed review, could you please check
>>>>> error handling and bus-off recovery by reporting the output of
>>>>> "$ candump -td -e any,0:0,#FFFFFFFF" while sending messages to the
>>>>> device ...
>>
>> [...]
>>
>>>> root@10.0.0.101:/opt/can-utils# ip -details link show can0
>>>> 2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>>> qlen 10 link/can
>>>> can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0
>>>> bitrate 297619 sample-point 0.714
>>
>>> Strange, what bitrate did you configure?
>>
>>    300000.
> 
> Ah, OK. It's just a very unusual CAN bitrate. Common are 125k, 250k,
> 500k, 800k and 1 Mbit/s. Is it your choice?
> 
>>>> tq 480 prop-seg 2 phase-seg1 2 phase-seg2 2 sjw 1
>>>> rcar_can: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..1024 brp-inc 1
>>>> clock 49999999
>>
>>> Could you please check whether the algorithm works better with 50000000?
>>
>>    It doesn't. Look at the logs below:
> 
> OK, I was mainly confused by the bitrate. Anyway, the bitrate algorithm
> sometimes does not like exotic clock frequencies or bitrates. Then
> manual setting of the bit-timing parameters might be necessary. But that
> does not seem to be the case here.
> 
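For reference, the reported bitrate of 297619 follows directly from the
bit-timing values in the quoted "ip -details" output. A minimal sketch of
the arithmetic, assuming brp = 24 (not shown in the log, but implied by
"tq 480" at a clock of roughly 50 MHz); the function name is only for
illustration:

```python
def can_bit_timing(clock_hz, brp, prop_seg, phase_seg1, phase_seg2):
    """Return (tq in ns, bitrate in bit/s, sample point) for one setting."""
    # One bit = sync segment (always 1 tq) + prop_seg + phase_seg1 + phase_seg2.
    tq_per_bit = 1 + prop_seg + phase_seg1 + phase_seg2
    tq_ns = brp * 1_000_000_000 // clock_hz
    bitrate = clock_hz // (brp * tq_per_bit)
    # The sample point sits at the end of phase_seg1.
    sample_point = (1 + prop_seg + phase_seg1) / tq_per_bit
    return tq_ns, bitrate, sample_point


# brp = 24 is an assumption derived from "tq 480"; the segments are from the log.
for clock in (49_999_999, 50_000_000):
    tq_ns, bitrate, sp = can_bit_timing(clock, 24, 2, 2, 2)
    print(clock, tq_ns, bitrate, round(sp, 3))
```

Neither clock value divides evenly by 300000, so the requested bitrate
cannot be hit exactly; with 7 time quanta per bit the result is about
0.8% low for both clocks.
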
>>>>> 2. ... with short-circuited CAN high and low and doing some time later
>>>>>          a manual recovery with "ip link set can0 type can restart"
>>
>>>>     Now we have auto recovery only. Manual recovery was tested with the
>>>> first driver version and worked.
>>
>>> What do you mean by "auto recovery"? Auto recovery by the hardware or
>>> via "restart-ms <ms>"? How do you choose between "manual" and "auto"
>>> recovery?
>>
>>    This exact test was done with hardware auto-recovery only. No
>> "restart-ms" was programmed.
> 
> OK, you already explained that in another mail and your driver does not
> use/support hardware auto-recovery any longer.
> 
>>
>>>> Terminal 1:
>>
>>>> root@10.0.0.104:/opt/can-utils# ./cangen -n 1 -g 1 can0
>>>> root@10.0.0.104:/opt/can-utils# ./cangen -n 1 -g 1 can0
>>>> root@10.0.0.104:/opt/can-utils# ./cangen -n 1 -g 1 can0
>>>> root@10.0.0.104:/opt/can-utils#
>>
>>>> Terminal 2:
>>
>>>> root@10.0.0.104:/opt/can-utils# ./candump -td -e any,0:0,#FFFFFFFF
>>>> (000.000000) can0 2000008C [8] 00 00 08 00 00 00 00 00 ERRORFRAME
>>>> controller-problem{}
>>>> protocol-violation{{tx-dominant-bit-error}{}}
>>>> bus-error
>>>> (000.021147) can0 20000144 [8] 00 00 00 00 00 00 00 00 ERRORFRAME
>>>> controller-problem{}
>>>> bus-off
>>>> restarted-after-bus-off
>>
>>> Why does it get "restarted" directly after the bus-off?
>>
>>    Because we have hardware auto-recovery enabled.
>>
>>>> (011.738522) can0 2000008C [8] 00 00 08 00 00 00 00 00 ERRORFRAME
>>>> controller-problem{}
>>
>>> What controller problem? data[1] is not set for some reason.
>>
>>    No comments. Looking into it.
>>
>>>> protocol-violation{{tx-dominant-bit-error}{}}
>>>> bus-error
>>>> (000.021163) can0 20000144 [8] 00 00 00 00 00 00 00 00 ERRORFRAME
>>>> controller-problem{}
>>>> bus-off
>>>> restarted-after-bus-off
>>>> (001.666625) can0 2000008C [8] 00 00 08 00 00 00 00 00 ERRORFRAME
>>>> controller-problem{}
>>>> protocol-violation{{tx-dominant-bit-error}{}}
>>>> bus-error
>>>> (000.021157) can0 20000144 [8] 00 00 00 00 00 00 00 00 ERRORFRAME
>>>> controller-problem{}
>>>> bus-off
>>>> restarted-after-bus-off
>>
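As a reading aid for the candump output above: the CAN ID of an error
frame is a bit mask of error classes, while data[1] and data[2] carry the
controller and protocol details. A small sketch, assuming the constants
from linux/can/error.h (copied here by hand); the helper name
decode_error_frame is made up for this example:

```python
# Error class bits in the CAN ID (values as in linux/can/error.h).
CAN_ERR_FLAG = 0x20000000
ERROR_CLASSES = [
    (0x001, "CAN_ERR_TX_TIMEOUT"),
    (0x002, "CAN_ERR_LOSTARB"),
    (0x004, "CAN_ERR_CRTL"),
    (0x008, "CAN_ERR_PROT"),
    (0x010, "CAN_ERR_TRX"),
    (0x020, "CAN_ERR_ACK"),
    (0x040, "CAN_ERR_BUSOFF"),
    (0x080, "CAN_ERR_BUSERROR"),
    (0x100, "CAN_ERR_RESTARTED"),
]
CAN_ERR_PROT_BIT0 = 0x08  # data[2]: unable to send a dominant bit


def decode_error_frame(can_id, data):
    """Split an error frame into its class names and detail bytes."""
    if not can_id & CAN_ERR_FLAG:
        raise ValueError("not an error frame")
    classes = [name for bit, name in ERROR_CLASSES if can_id & bit]
    return {
        "classes": classes,
        "crtl_detail": data[1],  # 0 means "unspecified"
        "prot_bit0": bool(data[2] & CAN_ERR_PROT_BIT0),
    }


# The two frame types from the log above.
first = decode_error_frame(0x2000008C, bytes.fromhex("0000080000000000"))
second = decode_error_frame(0x20000144, bytes(8))
print(first)
print(second)
```

For 2000008C this yields CAN_ERR_CRTL, CAN_ERR_PROT and CAN_ERR_BUSERROR
with data[1] == 0, which is why candump prints an empty
"controller-problem{}".
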
>>>> dmesg:
>>>> rcar_can rcar_can.0 can0: Error warning interrupt
>>>> rcar_can rcar_can.0 can0: Error passive interrupt
>>>> rcar_can rcar_can.0 can0: Bus error interrupt:
>>>> rcar_can rcar_can.0 can0: Bit Error (dominant)
>>>> rcar_can rcar_can.0 can0: Error warning interrupt
>>>> rcar_can rcar_can.0 can0: Error passive interrupt
>>
>>> Why are they reported again? You are already in error passive.
>>
>>    Don't know. :-/
> 
> The hardware might not be that smart. Then the software should care.
> 
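One way the software could "care": remember the last reported state and
only emit a notification when the state actually changes. A sketch with
hypothetical names, assuming the usual CAN thresholds (warning at 96,
passive at 128, bus-off above 255); this is not the rcar_can code:

```python
def state_from_counters(tec, rec):
    """Map TX/RX error counters to a CAN error state name."""
    if tec >= 256:
        return "bus-off"
    if tec >= 128 or rec >= 128:
        return "error-passive"
    if tec >= 96 or rec >= 96:
        return "error-warning"
    return "error-active"


class StateTracker:
    """Report a state only when it differs from the last reported one."""

    def __init__(self):
        self.state = "error-active"

    def update(self, tec, rec):
        new_state = state_from_counters(tec, rec)
        if new_state == self.state:
            return None  # repeated interrupt, nothing new to report
        self.state = new_state
        return new_state


tracker = StateTracker()
reports = [tracker.update(tec, 0) for tec in (96, 128, 128, 136)]
print(reports)
```

With such a filter, the second pair of "Error warning" / "Error passive"
interrupts in the dmesg output would not produce new error frames.
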
>>>>> I also wonder if the messages are always sent in order. You could use
>>>>> the program "canfdtest" [1] from the can-utils for validation.
>>
>>>>     This program is a PITA. With the driver workaround it works:
>>
>>> What workaround?
>>
>>    Doesn't matter anymore, got rid of it.
> 
> OK. BTW: I suggest running "canfdtest" at *1* Mbit/s with additional
> system and I/O load and for much longer than a minute to increase the
> probability of out-of-order transmissions occurring.

That's probably a wrong assumption because "canfdtest" sleeps for 1 ms
after each generated message :(. Therefore I would try at 125 kbit/s.
Sorry for not providing a reliable tool for out-of-order validation.
There is also the "cansequence" program from the Pengutronix canutils
[1], which might be better suited, also to reveal races.
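
A rough estimate of why the 1 ms sleep matters. The sketch assumes one
frame queued per millisecond, 8-byte frames with standard 11-bit IDs and
no stuff bits (111 bit times per frame including the 3-bit interframe
space). Whether "canfdtest" uses exactly this frame layout is an
assumption, and echoed frames from the other node are ignored:

```python
def bus_load(bitrate, period_s=0.001, frame_bits=111):
    """Fraction of bus time used when one frame is sent every period_s."""
    frame_time_s = frame_bits / bitrate
    return min(1.0, frame_time_s / period_s)


for rate in (1_000_000, 125_000):
    print(rate, f"{bus_load(rate):.1%}")
```

Under these assumptions the bus is only about 11% busy at 1 Mbit/s, but
close to 89% busy at 125 kbit/s, so the TX path is far more likely to
have several frames pending at the lower bitrate.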

Wolfgang.

[1] http://git.pengutronix.de/?p=tools/canutils.git;a=summary
    http://www.pengutronix.de/software/socket-can/download
