From: Wolfgang Grandegger <wg@grandegger.com>
To: Henrik Bork Steffensen <hbs@rosetechnology.dk>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>, linux-can@vger.kernel.org
Subject: Re: at91_can.c: Data transmission stops
Date: Wed, 28 Nov 2012 17:23:42 +0100
Message-ID: <50B63A8E.5010802@grandegger.com>
In-Reply-To: <50B63164.5090601@rosetechnology.dk>
On 11/28/2012 04:44 PM, Henrik Bork Steffensen wrote:
> On 11/28/2012 04:12 PM, Marc Kleine-Budde wrote:
>> On 11/28/2012 04:09 PM, Henrik Bork Steffensen wrote:
>>> On 11/28/2012 03:29 PM, Marc Kleine-Budde wrote:
>>>> On 11/28/2012 03:22 PM, Henrik Bork Steffensen wrote:
>>>>> On 11/27/2012 05:31 PM, Wolfgang Grandegger wrote:
>>>>>> On 11/27/2012 03:11 PM, Henrik Bork Steffensen wrote:
>>>>>> Hm, could you show your diffs.
>>>>> Do You mean a diff on these 7 lines, or a diff to the original file?
>>>>>
>>>>>>> In this case "at91_poll" is basically the same as "c_can_poll"; in both
>>>>>>> cases they call the function with the spinlock in the rx chain.
>>>>>> You don't need to protect against RX. Sorry, forgot that. On the c_can
>>>>>> this is necessary due to concurrent accesses to the same message RAM.
>>>>> Ok, I think that at91_can.c might have an issue in register access.
>>>>> I am not sure, but I will look into it.
>>>>>
>>>>>>> Looking at the patch Wolfgang suggested, I became uncertain of what
>>>>>>> this patch actually wants to protect.
>>>>>>> Is it the registers in the CPU's CAN interface? (mailboxes and control
>>>>>>> regs, I don't know the hw)
>>>>>> As mentioned above, on the c_can there is definitely a race with the
>>>>>> message ram due to the busy wait after accessing it. See:
>>>>>>
>>>>>>
>>>>>> http://lxr.linux.no/#linux+v3.6.8/drivers/net/can/c_can/c_can.c#L237
>>>>>>
>>>>>>> Or is it the potential race between "c_can_start_xmit" and
>>>>>>> "c_can_do_tx" ?
>>>>>>> Or even the access to the net api?
>>>>>>>
>>>>>>> Would someone care to explain?
>>>>>> I will try. In at91_start_xmit, if we get interrupted
>>>>>>
>>>>>>     if (!(at91_read(priv, AT91_MSR(get_tx_next_mb(priv))) &
>>>>>>           AT91_MSR_MRDY) ||
>>>>>>         (priv->tx_next & get_next_mask(priv)) == 0)
>>>>>>
>>>>>>             /* HERE */
>>>>>>
>>>>>>             netif_stop_queue(dev);
>>>>>>
>>>>>> and then at91_irq_tx() is called, executing netif_wake_queue(), we may
>>>>>> end up with a stopped tx queue. But I'm not yet 100% sure.
>>>>> Ok, thanks a lot.
>>>>>
>>>>> In my case I changed the driver to use only one mailbox for
>>>>> transmission, which means that "netif_stop_queue" will be called every
>>>>> time a packet is tx'ed, and "netif_wake_queue" will be called after the
>>>>> packet is actually transmitted.
>>>> In your first mail you've written that using only one mailbox increases
>>>> the probability for a lockup.
>>>>
>>>>> As far as I can see, this is 100% safe, provided that no further
>>>>> "ndo_start_xmit" calls come before the wake_queue call.
>>>>>
>>>>>
>>>>> Anyway, after removing the spin_lock from rx, it loads fine and seems
>>>>> to work.
>>>>> I will do a test with the suggested changes to the tx chain and get
>>>>> back to the list if anything interesting appears.
>>>>>
>>>>> Thank You very much for Your help so far :-)
>>>> Can you send me a diff of your current changes?
>>> Hi,
>>>
>>> Please note that I have not yet tested it for lockup.
>>> So far I have just done a simple rx/tx test.
>> How many TX mailboxes are you using? According to your patch, the number
>> is unchanged.
> This patch only contains this tx spin_lock - the rest of the driver
> contains changes too.
>
> e.g: "at91_write(priv, AT91_IER, 1 << AT91_MB_TX_SINGLE_MB_NUM);"
> Only using one mailbox for TX was part of a divide-and-conquer process,
> but also because the data sheet errata suggested it for low-bandwidth
> applications.
Don't change two (or more) things at a time. Otherwise you don't know
what really helped.
Just my 0.01 EUR
Wolfgang.
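
For readers following the thread: the window marked /* HERE */ in the
quoted at91_start_xmit snippet can be illustrated with a small userspace
model. This is only a sketch of the suspected interleaving, not driver
code. The struct and all model_* / race_* names are hypothetical; the two
flags merely stand in for AT91_MSR_MRDY and the netdev queue state, and
the "locked" variant simply assumes the check and the stop are made
atomic with respect to the TX interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; nothing here exists in at91_can.c. */
struct model {
	bool mb_ready;      /* stands in for AT91_MSR_MRDY of the next TX mailbox */
	bool queue_stopped; /* stands in for the netdev TX queue state */
};

/* Models at91_irq_tx(): the mailbox is free again, so wake the queue. */
static void model_irq_tx(struct model *m)
{
	assert(m != NULL);
	m->mb_ready = true;
	m->queue_stopped = false; /* netif_wake_queue() */
}

/* First half of the xmit tail: decide whether the queue must be stopped. */
static bool model_xmit_check(const struct model *m)
{
	return !m->mb_ready;
}

/* Second half of the xmit tail: netif_stop_queue(). */
static void model_xmit_stop(struct model *m)
{
	m->queue_stopped = true;
}

/*
 * Unprotected ordering: the TX interrupt lands between the check and the
 * stop. Returns true if the queue ends up stopped although the mailbox is
 * free, i.e. no further interrupt will arrive to wake it.
 */
bool race_without_lock(void)
{
	struct model m = { .mb_ready = false, .queue_stopped = false };
	bool must_stop = model_xmit_check(&m);

	model_irq_tx(&m); /* the HERE window */
	if (must_stop)
		model_xmit_stop(&m);

	return m.queue_stopped && m.mb_ready;
}

/*
 * Assumed fix: check and stop cannot be split by the interrupt, so the
 * interrupt can only run afterwards and its wake-up is not lost.
 */
bool race_with_lock(void)
{
	struct model m = { .mb_ready = false, .queue_stopped = false };

	if (model_xmit_check(&m))
		model_xmit_stop(&m);
	model_irq_tx(&m);

	return m.queue_stopped && m.mb_ready;
}
```

In this model race_without_lock() reports a stuck queue and
race_with_lock() does not. Whether the real driver can actually hit this
ordering is exactly what is still uncertain above.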