From: Marc Kleine-Budde <mkl@pengutronix.de>
To: Wolfgang Grandegger <wg@grandegger.com>
Cc: Alexander Stein <alexander.stein@systec-electronic.com>,
linux-can@vger.kernel.org, bhupesh.sharma@st.com,
tomoya.rohm@gmail.com
Subject: Re: [RFC v2 0/7] pch_can/c_can: fix races and add PCH support to c_can
Date: Wed, 05 Dec 2012 22:52:38 +0100 [thread overview]
Message-ID: <50BFC226.5030609@pengutronix.de> (raw)
In-Reply-To: <50BF85DD.6090809@grandegger.com>
[-- Attachment #1: Type: text/plain, Size: 4189 bytes --]
On 12/05/2012 06:35 PM, Wolfgang Grandegger wrote:
> On 12/05/2012 03:46 PM, Alexander Stein wrote:
>> Hello Wolfgang,
>>
>> On Wednesday 05 December 2012 13:50:46, Wolfgang Grandegger wrote:
>>> Hi Alexander,
>>>
>>> thanks for testing!. Maybe we deal with more than one problem.
>>>
> ...
>>> A few general questions to understand your hardware and setup:
>>>
>>> - Is this a multi-processor system (SMP)? If not, you may not run into
>>> tx-not-working-any-more problem. Have you ever realized it?
>>
>> This is a Intel E660 single core CPU with HT, so it is a SMP system. I'm
>> currently not aware that tx is not working anymore.
>
> OK, your send rate is very low and therefore it's unlikely that you hit
> that problem.
>
>>> - Did you see the problems below with the old PCH_CAN driver as well.
>>>
>>> - Do the problems show up with the still existing PCH_CAN driver
>>> (including the "pch_can: add spinlocks to protect tx objects" patch)?
>>
>> With the current version of pch_can from Linuxs' tree and the named patch I
>> get at least some messaged twice.
>
> OK, sounds better but also not good.
>
>>>> but if I run my heavy CAN load testcase I get errors sometimes.
>>>> This test works as follows: I send a CAN message to 2 other CAN nodes
>>>> configuring some timings (like burst length or time between each can
>> frame)
>>>> and they send 250000 messages each containing a counter. This way I can
>> detect
>>>> any missing or switched message with a high bus load.
>>>> If I use the described software state alone it works, but if I run 'watch
>>>> sensors' in a different ssh session, CAN start to misbehave like missing
>> CAN
>>>> frames or switched order. It seems that I2C usage on the PCH influences
>> the
>>>> CAN part also:
>>>
>>> - When your app sends/writes messages, does it check for errno==ENOBUFS?
>>
>> My test application sends only 1 message each test run to start the other
>> nodes. It checks ENOBUFS and returns an error in that case. Though I've never
>> seen that.
>
> OK, your TX rate it low.
>
>>
>>> - The messages look still ok (not currupted, I mean)?
>>
>> The received frames all look good (despite wrong counter sometimes due to
>> wrong order or lost frames).
>>
>>>> Even worse, if I use the following patch to check if PCI writes were
>>>> successfully, I notices that some writes (or the consecutive read) don't
>>>> succeed. And I also get lots of I2C timeouts waiting for a xfer complete.
>>>
>>> Be careful, there might be some registers changing their values after
>>> writing. Can you show the value read after writing and the register
>>> offset? The influence on the I2C bus looks more like an overload or
>>> hardware problem. What is your CAN interrupt rate?
>>
>> I get about 33 interrupts per second on i2c. On a successful run I get 366886
>> interrupts for 500000 messages with the c_can driver.
>
> In what time? Is the CAN bus highly loaded.
>
>> Here are some failed writes to the CAN controller.
>> [ 50.445695] c_can_pci 0000:02:0c.3: can0: write 0x0 to offset 0x4 failed.
>> got: 0x10
>> [ 51.043027] c_can_pci 0000:02:0c.3: can0: write 0xe to offset 0x0 failed.
>> got: 0x0
>> [... repeats several times]
>> [ 64.046031] c_can_pci 0000:02:0c.3: can0: write 0xe to offset 0x0 failed.
>> got: 0x0
>> [ 64.458286] c_can_pci 0000:02:0c.3: can0: write 0x73 to offset 0x24 failed.
>> got: 0xb8
>> [ 64.811025] c_can_pci 0000:02:0c.3: can0: write 0xe to offset 0x0 failed.
>> got: 0x0
>> and the last one is repeated all the time.
>
> That's wired! Writing 0xe to offset 0x0 does re-enable the interrupts at
> the end of poll-rx. Disabling the interrupts in the isr does not show
> that symptoms. Strange.
The write+read check is racy. The interrupt handler might disable the
interrupts again.
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 261 bytes --]
next prev parent reply other threads:[~2012-12-05 21:52 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-29 14:39 [RFC v2 0/7] pch_can/c_can: fix races and add PCH support to c_can Wolfgang Grandegger
2012-11-29 14:39 ` [RFC v2 1/7] pch_can: add spinlocks to protect tx objects Wolfgang Grandegger
2012-11-29 14:39 ` [RFC v2 2/7] c_can: rename callback "initram" to "init" to more general usage Wolfgang Grandegger
2012-12-03 14:20 ` Alexander Stein
2012-12-03 14:32 ` Wolfgang Grandegger
2012-11-29 14:39 ` [RFC v2 3/7] c_can: use different sets of interface registers for rx and tx Wolfgang Grandegger
2012-11-30 8:39 ` Marc Kleine-Budde
2012-11-30 9:15 ` Wolfgang Grandegger
2012-11-29 14:39 ` [RFC v2 4/7] c_can_pci: introduce board specific PCI bar Wolfgang Grandegger
2012-11-30 8:45 ` Marc Kleine-Budde
2012-11-30 9:11 ` Wolfgang Grandegger
2012-11-30 9:19 ` Marc Kleine-Budde
2012-11-29 14:39 ` [RFC v2 5/7] c_can_pci: enable PCI bus master only for MSI Wolfgang Grandegger
2012-11-30 8:54 ` Marc Kleine-Budde
2012-11-29 14:39 ` [RFC v2 6/7] c_can_pci: add support for PCH CAN on Intel EG20T PCH Wolfgang Grandegger
2012-11-29 14:39 ` [RFC v2 7/7] c_can: add spinlock to protect tx and rx objects Wolfgang Grandegger
2012-12-05 12:09 ` [RFC v2 0/7] pch_can/c_can: fix races and add PCH support to c_can Alexander Stein
2012-12-05 12:50 ` Wolfgang Grandegger
2012-12-05 14:46 ` Alexander Stein
2012-12-05 17:35 ` Wolfgang Grandegger
2012-12-05 21:52 ` Marc Kleine-Budde [this message]
2012-12-06 7:09 ` Wolfgang Grandegger
2012-12-06 8:35 ` Marc Kleine-Budde
2012-12-06 8:17 ` Wolfgang Grandegger
2012-12-06 13:38 ` Alexander Stein
2012-12-06 14:02 ` Marc Kleine-Budde
2012-12-06 14:31 ` Wolfgang Grandegger
2012-12-06 14:37 ` Marc Kleine-Budde
2012-12-06 14:56 ` Alexander Stein
2012-12-06 15:15 ` Wolfgang Grandegger
2012-12-06 15:27 ` Wolfgang Grandegger
2012-12-06 15:55 ` Alexander Stein
2012-12-06 17:14 ` Alexander Stein
2012-12-06 23:34 ` Marc Kleine-Budde
2012-12-07 9:26 ` Wolfgang Grandegger
2012-12-07 9:55 ` Marc Kleine-Budde
2012-12-07 10:00 ` Bhupesh SHARMA
2012-12-07 10:09 ` Marc Kleine-Budde
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=50BFC226.5030609@pengutronix.de \
--to=mkl@pengutronix.de \
--cc=alexander.stein@systec-electronic.com \
--cc=bhupesh.sharma@st.com \
--cc=linux-can@vger.kernel.org \
--cc=tomoya.rohm@gmail.com \
--cc=wg@grandegger.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).