public inbox for linux-kernel@vger.kernel.org
From: Wolfgang Grandegger <wg@grandegger.com>
To: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 3/7] [PATCH 3/8] can: CAN Network device driver and Netlink interface
Date: Fri, 15 May 2009 09:46:18 +0200
Message-ID: <4A0D1DCA.8090306@grandegger.com>
In-Reply-To: <4A0D1677.4020209@volkswagen.de>

Oliver Hartkopp wrote:
> Wolfgang Grandegger wrote:
>> Oliver Hartkopp wrote:
>>> Wolfgang Grandegger wrote:
>>>> Oliver Hartkopp wrote:
>>>>> Andrew Morton wrote:
>>>>>> On Tue, 12 May 2009 11:28:00 +0200 Wolfgang Grandegger
>>>>>> <wg@grandegger.com> wrote:
>>>>>>
>>>>>>> +int can_restart_now(struct net_device *dev)
>>>>>>> +{
>>>>>>> +    struct can_priv *priv = netdev_priv(dev);
>>>>>>> +    struct net_device_stats *stats = &dev->stats;
>>>>>>> +    struct sk_buff *skb;
>>>>>>> +    struct can_frame *cf;
>>>>>>> +    int err;
>>>>>>> +
>>>>>>> +    /* Synchronize with dev->hard_start_xmit() */
>>>>>>> +    netif_tx_lock(dev);
>>>>>>> +
>>>>>>> +    /* Ensure that no more messages can go out */
>>>>>>> +    if (netif_carrier_ok(dev))
>>>>>>> +        netif_carrier_off(dev);
>>>>>>> +
>>>>>>> +    /* Ensure that no more messages can come in */
>>>>>>> +    err = priv->do_set_mode(dev, CAN_MODE_STOP);
>>>>>>> +    if (err)
>>>>>>> +        return err;
>>>>>>> +
>>>>>>> +    /* Now it's safe to clean up */
>>>>>>> +    del_timer_sync(&priv->restart_timer);
>>>>>>
>>>>>> This is deadlockable.
>>>>>>
>>>>>> It calls del_timer_sync() while holding netif_tx_lock().  But the timer
>>>>>> handler (can_restart_now()) also takes netif_tx_lock().  So if the
>>>>>> timer handler is presently running, it's sitting there spinning in
>>>>>> netif_tx_lock().  And del_timer_sync() is sitting there waiting for the
>>>>>> timer handler to complete.
>>>>>>
>>>>> Hi Wolfgang,
>>>>>
>>>>> would it be an appropriate solution, just to invoke
>>>>>
>>>>> netif_stop_queue() in can_bus_off()
>>>>>
>>>>> and invoke
>>>>>
>>>>> netif_wake_queue() in can_restart_now()
>>>>>
>>>>> ???
>>>>>
>>>>> In a BUSOFF condition we're not able to send CAN frames anyway, so we
>>>>> can disable the device queue and then we won't need any
>>>>> netif_tx_lock(), right?
>>>>>
>>>>> AFAIK this was the original implementation before some of the latest
>>>>> improvements with the netif_carrier stuff.
>>>>>
>>>>> What do you think?
>>>> The problem is the "manual" restart triggered via the netlink interface,
>>>> which can occur in the middle of ndo_start_xmit().
>>>>
>>> Ah, I see.
>>>
>>> What if the manual restart via netlink would also stop the queue and
>>> start the timer?
>>>
>> It will not help if the restart is triggered in the middle of
>> ndo_start_xmit().
>>
> 
> Hi Wolfgang,
> 
> I think I found a solution that removes the locking problem completely:
> 
> When a bus-off occurs in the controller, the communication on the CAN
> bus can be treated as unusable for this controller (let's say it is dead).
> E.g. the SJA1000 sets its reset bit for that reason and waits to be
> initialized by the CPU again.
> 
> So IMO restarting the CAN controller while in operational state is not a
> valid use case.
>
> When a bus-off (interrupt) occurs, we should
> 
> - invoke netif_carrier_off(dev)
> - invoke netif_stop_queue(dev)
> - set the state to CAN_STATE_BUS_OFF
> 
> and of course create the error message, clear the interrupts(!) and then
> leave the irq service function.
> 
> That's it.

It's already like that.
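
For reference, the three bus-off steps listed above can be sketched
roughly as follows. This is only a userspace model, not code from the
patch: struct model_dev, enum model_state and model_bus_off() are
hypothetical names, and the boolean fields merely stand in for what
netif_carrier_off() and netif_stop_queue() do to a real net_device.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
enum model_state { MODEL_STATE_ACTIVE, MODEL_STATE_BUS_OFF };

struct model_dev {
	bool carrier_ok;        /* models the netif carrier flag */
	bool queue_stopped;     /* models the stopped TX queue */
	enum model_state state; /* models the CAN controller state */
};

/* Models what the interrupt path does once the controller reports bus-off. */
void model_bus_off(struct model_dev *dev)
{
	dev->carrier_ok = false;          /* stands in for netif_carrier_off(dev) */
	dev->queue_stopped = true;        /* stands in for netif_stop_queue(dev) */
	dev->state = MODEL_STATE_BUS_OFF; /* the state is now CAN_STATE_BUS_OFF */
}
```

After model_bus_off() returns, the transmit path has nothing left to
send through, which is the precondition the restart discussion below
relies on.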

> When the automatic CAN controller restart is enabled: start the timer.
> 
> For the manual netlink function: Test for CAN_STATE_BUS_OFF (!) and
> invoke the current can_restart_now(dev) or start the timer with e.g. 1ms
> ...
> 
> This approach should work and matches the bus-off intention of the
> CAN controllers ("disabled and wait for re-initialisation").
>
> And there's no locking of the tx_queue needed anymore, as the tx_queue is
> already stopped when the restart is performed.
> 
> What do you think about this approach?

That's a solution and would comply with the CAN spec, I believe, but it
should be discussed on the Socket-CAN mailing list. We should not change
policy just to avoid locking or simplify the code.
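
To make the proposed policy concrete, here is a minimal userspace
sketch of a restart that is only accepted in the bus-off state. It is
an illustration under assumptions, not the driver code: struct
model_dev, enum model_state and model_restart() are hypothetical, and
returning -EBUSY for a restart outside bus-off is an assumed choice,
not something decided in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
enum model_state { MODEL_STATE_ACTIVE, MODEL_STATE_BUS_OFF };

struct model_dev {
	bool carrier_ok;        /* models the netif carrier flag */
	bool queue_stopped;     /* models the stopped TX queue */
	enum model_state state; /* models the CAN controller state */
};

/*
 * Restart is refused unless the device is in bus-off. In that state the
 * TX queue has already been stopped, so the transmit path cannot run
 * concurrently and this function takes no TX lock.
 */
int model_restart(struct model_dev *dev)
{
	if (dev->state != MODEL_STATE_BUS_OFF)
		return -EBUSY; /* assumed error code for "not in bus-off" */

	/* A real driver would re-initialise the controller here. */
	dev->state = MODEL_STATE_ACTIVE;
	dev->carrier_ok = true;     /* stands in for netif_carrier_on(dev) */
	dev->queue_stopped = false; /* stands in for netif_wake_queue(dev) */
	return 0;
}
```

Both the timer handler and the netlink request would go through the
same check in this model, so a manual restart of an operational
controller is simply rejected instead of racing with transmission.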

Wolfgang.
