What are you doing if the TX buffer overflows?

* What are you doing if the TX buffer overflows?
@ 2012-09-17 13:58 Heinz-Jürgen Oertel
  2012-09-17 19:19 ` Oliver Hartkopp
  2012-09-18 12:37 ` Wolfgang Grandegger
  0 siblings, 2 replies; 40+ messages in thread
From: Heinz-Jürgen Oertel @ 2012-09-17 13:58 UTC (permalink / raw)
  To: linux-can@vger.kernel.org


Hello,
Is there a way to empty the TX buffer?
Or to read out how much of it is occupied,
to get the number of CAN frames waiting for transmission?

Best regards
 Heinz-Jürgen Oertel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-09-17 13:58 What are you doing if the TX buffer overflows? Heinz-Jürgen Oertel
@ 2012-09-17 19:19 ` Oliver Hartkopp
  2012-09-17 19:26   ` Andrew Bell
                     ` (2 more replies)
  2012-09-18 12:37 ` Wolfgang Grandegger
  1 sibling, 3 replies; 40+ messages in thread
From: Oliver Hartkopp @ 2012-09-17 19:19 UTC (permalink / raw)
  To: Heinz-Jürgen Oertel; +Cc: linux-can@vger.kernel.org

Hello Heinz,

On 17.09.2012 15:58, Heinz-Jürgen Oertel wrote:

> 
> is there a way to empty the tx buffer ?

Usually the buffer does not drain because of a problem with the CAN controller
and/or the CAN network.

So even if you could flush the queue, the CAN controller is probably still
stuck with the frame it is currently processing.

Setting the interface to DOWN and UP again re-initializes the CAN controller
and flushes all the queues.

This would work, but all open sockets would get a notification that the
interface went down.
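
A minimal sketch of the down/up cycle described above, driving the 'ip' tool
from Python. The function name flush_by_cycling and the injectable 'run'
parameter are illustrative assumptions, not an existing API. It also assumes
the bit timing was configured beforehand and that the caller has the
privileges needed to change link state.

```python
import subprocess

def flush_by_cycling(ifname="can0", run=subprocess.run):
    """Bring a CAN interface down and up again to re-initialize it."""
    commands = [
        ["ip", "link", "set", ifname, "down"],
        ["ip", "link", "set", ifname, "up"],
    ]
    for argv in commands:
        # check=True makes subprocess.run raise if 'ip' reports a failure
        run(argv, check=True)
    return commands
```

Every socket bound to the interface will still see the interface go down, so
this only fits applications that can tolerate that.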

Btw., IIRC there's an IFLA_CAN_RESTART functionality to kick the CAN controller
if it gets stuck. It can be triggered by a netlink message. This is the same
configuration interface that's used to set the bit timing:

http://lxr.linux.no/#linux+v3.5.4/include/linux/can/netlink.h#L103

But I currently have no source code example, as I always use the 'ip' tool from
the iproute2 package to configure my CAN interfaces:

http://lxr.linux.no/#linux+v3.5.4/Documentation/networking/can.txt#L646

Following that documentation

	ip link set can0 type can restart

should do it for 'can0'.

> Or read out the occupied size of it
> to get the mumber of CAN frames waiting for transmission?

I'll check whether there's a programming interface to get this value.
Or is this request obsolete now after my answer above?

Regards,
Oliver


* Re: What are you doing if the TX buffer overflows?
  2012-09-17 19:19 ` Oliver Hartkopp
@ 2012-09-17 19:26   ` Andrew Bell
  2012-09-17 19:33     ` Oliver Hartkopp
       [not found]   ` <4283CE44E963D741A50240F32D185B9F109AA1@SBSPORT3.portgmbh.local>
  2012-09-18 12:14   ` Wolfgang Grandegger
  2 siblings, 1 reply; 40+ messages in thread
From: Andrew Bell @ 2012-09-17 19:26 UTC (permalink / raw)
  To: Oliver Hartkopp; +Cc: Heinz-Jürgen Oertel, linux-can@vger.kernel.org

On Mon, Sep 17, 2012 at 2:19 PM, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
> Hello Heinz,
>
> On 17.09.2012 15:58, Heinz-Jürgen Oertel wrote:
>
>>
>> is there a way to empty the tx buffer ?
>
>
> Usually the buffer does not get empty due to a problem of the CAN controller
> and/or the CAN network.
>
> So even if you could flush the queue the CAN controller is probably still
> stuck with his processed frame.

On my interface, the TX frames time out after one second.  But I could
find nothing in the spec to say that this is how things should
operate.  Is this driver-specific or is there something in a document
that says how retransmission and timeouts work?

Thanks,

-- 
Andrew Bell
andrew.bell.ia@gmail.com


* Re: What are you doing if the TX buffer overflows?
  2012-09-17 19:26   ` Andrew Bell
@ 2012-09-17 19:33     ` Oliver Hartkopp
  2012-09-18 13:36       ` Andrew Bell
  0 siblings, 1 reply; 40+ messages in thread
From: Oliver Hartkopp @ 2012-09-17 19:33 UTC (permalink / raw)
  To: Andrew Bell; +Cc: Heinz-Jürgen Oertel, linux-can@vger.kernel.org

On 17.09.2012 21:26, Andrew Bell wrote:

> On Mon, Sep 17, 2012 at 2:19 PM, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
>> Hello Heinz,
>>
>> On 17.09.2012 15:58, Heinz-Jürgen Oertel wrote:
>>
>>>
>>> is there a way to empty the tx buffer ?
>>
>>
>> Usually the buffer does not get empty due to a problem of the CAN controller
>> and/or the CAN network.
>>
>> So even if you could flush the queue the CAN controller is probably still
>> stuck with his processed frame.
> 
> On my interface, the TX frames time out after one second.  But I could
> find nothing in the spec to say that this is how things should
> operate.  Is this driver-specific or is there something in a document
> that says how retransmission and timeouts works?
> 


The TX timeout functionality was removed a long time ago. There was a
discussion about that topic back then.

So what kind of driver do you have there? A PEAK PCAN driver?

Regards,
Oliver


* Re: What are you doing if the TX buffer overflows?
  2012-09-17 19:33     ` Oliver Hartkopp
@ 2012-09-18 13:36       ` Andrew Bell
  2012-09-18 13:46         ` Wolfgang Grandegger
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Bell @ 2012-09-18 13:36 UTC (permalink / raw)
  To: Oliver Hartkopp; +Cc: Heinz-Jürgen Oertel, linux-can@vger.kernel.org

On Mon, Sep 17, 2012 at 2:33 PM, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
> On 17.09.2012 21:26, Andrew Bell wrote:
>
> The TX timeout functionality has been removed a long time ago. There was a
> discussion about that topic that days.
>
> So what kind of driver do you have there? A PEAK PCAN driver?

We're using an old mscan driver.

So, what's supposed to happen?  Does the hardware just attempt to send
once and then drop the packet if no acknowledgement is received?  Is
an error frame generated?  Is the issue being discussed just one where
the application writes faster than the frames can be put onto the bus,
rather than the condition of the bus being bad?

-- 
Andrew Bell
andrew.bell.ia@gmail.com


* Re: What are you doing if the TX buffer overflows?
  2012-09-18 13:36       ` Andrew Bell
@ 2012-09-18 13:46         ` Wolfgang Grandegger
  0 siblings, 0 replies; 40+ messages in thread
From: Wolfgang Grandegger @ 2012-09-18 13:46 UTC (permalink / raw)
  To: Andrew Bell
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/18/2012 03:36 PM, Andrew Bell wrote:
> On Mon, Sep 17, 2012 at 2:33 PM, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
>> On 17.09.2012 21:26, Andrew Bell wrote:
>>
>> The TX timeout functionality has been removed a long time ago. There was a
>> discussion about that topic that days.
>>
>> So what kind of driver do you have there? A PEAK PCAN driver?
> 
> We're using an old mscan driver.
> 
> So, what's supposed to happen?  Does the hardware just attempt to send
> once and then drop the packet if no acknowledgement is received?  Is
> an error frame generated?  Is the issue being discussed just one where
> the application writes faster than the frames can be put onto the bus,
> rather than the condition of the bus being bad?

The CAN hardware usually *retries* sending the message until you abort it
or stop the device (going bus-off). If the message does not go out,
first the hardware and then the software queues will fill up and finally
the send() fails with errno=ENOBUFS. The send can also end up returning
ENOBUFS if the CPU is sending faster than the packets go out. Hope this
answers your question.

Wolfgang.


[parent not found: <4283CE44E963D741A50240F32D185B9F109AA1@SBSPORT3.portgmbh.local>]

* Re: What are you doing if the TX buffer overflows?
       [not found]   ` <4283CE44E963D741A50240F32D185B9F109AA1@SBSPORT3.portgmbh.local>
@ 2012-09-17 19:40     ` Heinz-Jürgen Oertel
  2012-09-18 11:44       ` Kurt Van Dijck
  0 siblings, 1 reply; 40+ messages in thread
From: Heinz-Jürgen Oertel @ 2012-09-17 19:40 UTC (permalink / raw)
  To: Oliver Hartkopp; +Cc: linux-can


Hello Oliver,
Thanks for your answer.

> 
> Hello Heinz,
> 
> On 17.09.2012 15:58, Heinz-Jürgen Oertel wrote:
> 
> >
> > is there a way to empty the tx buffer ?
> 
> 
> Usually the buffer does not get empty due to a problem of the CAN controller
> and/or the CAN network.

I know :-)

> 
> So even if you could flush the queue the CAN controller is probably still
> stuck with his processed frame.

At least the SJA1000 (which I use most often)
allows aborting a pending transmission.

> 
> Setting the interface to DOWN and UP again re-initializes the CAN controller
> and flushes all the queues.
> 
> This would work - but all open sockets would get a notification of the
> interface went down.

Maybe in my current case this is acceptable.

> 
> Btw. IIRC there's a IFLA_CAN_RESTART functionality to kick the CAN controller
> if it got stuck. It can by triggered by a netlink message. This is the same
> configuration interface that's used to set the bittiming:
> 
> http://lxr.linux.no/#linux+v3.5.4/include/linux/can/netlink.h#L103
> 
> But i have currently no source code example as i always use the 'ip' tool from
> the iproute2 package to configure my CAN interfaces:
> 
> http://lxr.linux.no/#linux+v3.5.4/Documentation/networking/can.txt#L646
> 
> Following that documentation
> 
>         ip link set can0 type can restart
> 
> should do it for 'can0'.
> 
> > Or read out the occupied size of it
> > to get the mumber of CAN frames waiting for transmission?
> 
> 
> I'll take a look, if there's a programming interface to get this value.
> Or is this request obsolete now after my answer above?

No. This is still of interest.
Knowing whether the transmit queue is mostly empty is a good thing.
If it is not, the application is sending too fast,
or a lot of higher-priority messages are on the network.
> 
> Regards,
> Oliver
> 

Thanks for your effort.
Regards
    Heinz





* Re: What are you doing if the TX buffer overflows?
  2012-09-17 19:40     ` Heinz-Jürgen Oertel
@ 2012-09-18 11:44       ` Kurt Van Dijck
  0 siblings, 0 replies; 40+ messages in thread
From: Kurt Van Dijck @ 2012-09-18 11:44 UTC (permalink / raw)
  To: Heinz-Jürgen Oertel; +Cc: Oliver Hartkopp, linux-can

On Mon, Sep 17, 2012 at 09:40:41PM +0200, Heinz-Jürgen Oertel wrote:
> 
[...]
> > 
> > > Or read out the occupied size of it
> > > to get the mumber of CAN frames waiting for transmission?
> > 
> > 
> > I'll take a look, if there's a programming interface to get this value.
> > Or is this request obsolete now after my answer above?
> 
> No. This is still of interest.
> Knowing if the transmit queue is mostly empty is a good thing.
> Otherwise the application is sending to fast,

I distinguish 2 use cases there:
1. repeating a limited set of identical frames endlessly (e.g. grabbing a bootloader)
2. bursting a lot of data onto the bus.

Case 1 can be solved by setting CAN_RAW_RECV_OWN_MSGS. If you don't
see your transmitted frames, it makes no sense to repeat them, since
they're still queued.
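
A minimal sketch of case 1, assuming a CAN_RAW socket opened from Python. The
helper name enable_own_msgs is hypothetical. The numeric fallbacks mirror the
values in <linux/can.h> and <linux/can/raw.h> and are only used if the Python
build does not export the constants.

```python
import socket

# SOL_CAN_RAW = SOL_CAN_BASE (100) + CAN_RAW (1); CAN_RAW_RECV_OWN_MSGS = 4
SOL_CAN_RAW = getattr(socket, "SOL_CAN_RAW", 101)
CAN_RAW_RECV_OWN_MSGS = getattr(socket, "CAN_RAW_RECV_OWN_MSGS", 4)

def enable_own_msgs(sock):
    """Ask the kernel to deliver this socket's own sent frames back to it."""
    sock.setsockopt(SOL_CAN_RAW, CAN_RAW_RECV_OWN_MSGS, 1)
    return sock
```

Typical use would be on a socket created with
socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) and bound with
bind(("can0",)). The sender then reads its own frame back before repeating it.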

Case 2 is solved by waiting a little while when you get ENOBUFS. This
way, you will not need to track the buffer's status.
Throttling messages based on buffer fill status is a bit difficult
in a multi-user environment. You may know the buffer max. size,
but not the number of applications using it.
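
Case 2 could look roughly like the sketch below. The function name
send_with_backoff, the retry count and the delay are illustrative assumptions
to tune per application. 'sock' is assumed to be a bound CAN_RAW socket.

```python
import errno
import time

def send_with_backoff(sock, frame, retries=50, delay=0.001, sleep=time.sleep):
    """Send one raw CAN frame, pausing briefly whenever the TX queue is full."""
    for _ in range(retries):
        try:
            return sock.send(frame)
        except OSError as exc:
            if exc.errno != errno.ENOBUFS:
                raise      # any other error is not a full TX queue
            sleep(delay)   # give the controller time to drain the queue
    raise TimeoutError("TX queue still full after %d attempts" % retries)
```

A classic frame for this call can be built with
struct.pack("=IB3x8s", can_id, dlc, data). Because the wait is triggered by
ENOBUFS itself, it keeps working when several applications share the queue.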

Can your issue be categorized under one of these 2 cases?
If not, I'd like to learn the use case :-)

Kurt


* Re: What are you doing if the TX buffer overflows?
  2012-09-17 19:19 ` Oliver Hartkopp
  2012-09-17 19:26   ` Andrew Bell
       [not found]   ` <4283CE44E963D741A50240F32D185B9F109AA1@SBSPORT3.portgmbh.local>
@ 2012-09-18 12:14   ` Wolfgang Grandegger
  2012-09-18 12:34     ` Marc Kleine-Budde
  2 siblings, 1 reply; 40+ messages in thread
From: Wolfgang Grandegger @ 2012-09-18 12:14 UTC (permalink / raw)
  To: Oliver Hartkopp; +Cc: Heinz-Jürgen Oertel, linux-can@vger.kernel.org

On 09/17/2012 09:19 PM, Oliver Hartkopp wrote:
> Hello Heinz,
> 
> On 17.09.2012 15:58, Heinz-Jürgen Oertel wrote:
> 
>>
>> is there a way to empty the tx buffer ?
> 
> 
> Usually the buffer does not get empty due to a problem of the CAN controller
> and/or the CAN network.
> 
> So even if you could flush the queue the CAN controller is probably still
> stuck with his processed frame.
> 
> Setting the interface to DOWN and UP again re-initializes the CAN controller
> and flushes all the queues.
> 
> This would work - but all open sockets would get a notification of the
> interface went down.
> 
> Btw. IIRC there's a IFLA_CAN_RESTART functionality to kick the CAN controller
> if it got stuck. It can by triggered by a netlink message. This is the same
> configuration interface that's used to set the bittiming:
> 
> http://lxr.linux.no/#linux+v3.5.4/include/linux/can/netlink.h#L103
> 
> But i have currently no source code example as i always use the 'ip' tool from
> the iproute2 package to configure my CAN interfaces:
> 
> http://lxr.linux.no/#linux+v3.5.4/Documentation/networking/can.txt#L646
> 
> Following that documentation
> 
> 	ip link set can0 type can restart
> 
> should do it for 'can0'.

No, this only works if a bus-off is pending. See:

http://lxr.linux.no/#linux+v3.5.4/drivers/net/can/dev.c#L413

Well, the name IFLA_CAN_RESTART is not really well chosen.
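
Since the restart only applies in bus-off, an application might check the
controller state first. The sketch below is an assumption-laden illustration:
it presumes a recent iproute2 that supports '-json' and reports the CAN state
under linkinfo/info_data/state, and the helper name is_bus_off is hypothetical.

```python
import json

def is_bus_off(ip_json_text):
    """Return True if 'ip -details -json link show <if>' reports BUS-OFF."""
    links = json.loads(ip_json_text)
    # Assumed layout: a list with one object per interface
    info = links[0].get("linkinfo", {}).get("info_data", {})
    return info.get("state") == "BUS-OFF"
```

Only when this returns True would "ip link set can0 type can restart" be
expected to have any effect.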

Wolfgang.



* Re: What are you doing if the TX buffer overflows?
  2012-09-18 12:14   ` Wolfgang Grandegger
@ 2012-09-18 12:34     ` Marc Kleine-Budde
  2012-09-18 12:49       ` Wolfgang Grandegger
  2012-11-14 20:48       ` Jason White
  0 siblings, 2 replies; 40+ messages in thread
From: Marc Kleine-Budde @ 2012-09-18 12:34 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/18/2012 02:14 PM, Wolfgang Grandegger wrote:
> On 09/17/2012 09:19 PM, Oliver Hartkopp wrote:
>> Hello Heinz,
>>
>> On 17.09.2012 15:58, Heinz-Jürgen Oertel wrote:
>>
>>>
>>> is there a way to empty the tx buffer ?
>>
>>
>> Usually the buffer does not get empty due to a problem of the CAN controller
>> and/or the CAN network.
>>
>> So even if you could flush the queue the CAN controller is probably still
>> stuck with his processed frame.
>>
>> Setting the interface to DOWN and UP again re-initializes the CAN controller
>> and flushes all the queues.
>>
>> This would work - but all open sockets would get a notification of the
>> interface went down.
>>
>> Btw. IIRC there's a IFLA_CAN_RESTART functionality to kick the CAN controller
>> if it got stuck. It can by triggered by a netlink message. This is the same
>> configuration interface that's used to set the bittiming:
>>
>> http://lxr.linux.no/#linux+v3.5.4/include/linux/can/netlink.h#L103
>>
>> But i have currently no source code example as i always use the 'ip' tool from
>> the iproute2 package to configure my CAN interfaces:
>>
>> http://lxr.linux.no/#linux+v3.5.4/Documentation/networking/can.txt#L646
>>
>> Following that documentation
>>
>> 	ip link set can0 type can restart
>>
>> should do it for 'can0'.
> 
> No, this does only works if a bus-off is pending. See:
> 
> http://lxr.linux.no/#linux+v3.5.4/drivers/net/can/dev.c#L413
> 
> Well, the name IFLA_CAN_RESTART is not really well chosen.

Hmmm, it corresponds somewhat to the restart-ms property.

We have several customers who asked how to abort pending TX messages,
too. Which involves:
a) clear the TX-queue in Linux
b) clear queue in hardware
c) abort currently transmitting CAN frame

I think c) would be a use case of its own, too.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



* Re: What are you doing if the TX buffer overflows?
  2012-09-18 12:34     ` Marc Kleine-Budde
@ 2012-09-18 12:49       ` Wolfgang Grandegger
  2012-09-18 13:00         ` Marc Kleine-Budde
  2012-11-14 20:48       ` Jason White
  1 sibling, 1 reply; 40+ messages in thread
From: Wolfgang Grandegger @ 2012-09-18 12:49 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/18/2012 02:34 PM, Marc Kleine-Budde wrote:
> On 09/18/2012 02:14 PM, Wolfgang Grandegger wrote:
>> On 09/17/2012 09:19 PM, Oliver Hartkopp wrote:
>>> Hello Heinz,
>>>
>>> On 17.09.2012 15:58, Heinz-Jürgen Oertel wrote:
>>>
>>>>
>>>> is there a way to empty the tx buffer ?
>>>
>>>
>>> Usually the buffer does not get empty due to a problem of the CAN controller
>>> and/or the CAN network.
>>>
>>> So even if you could flush the queue the CAN controller is probably still
>>> stuck with his processed frame.
>>>
>>> Setting the interface to DOWN and UP again re-initializes the CAN controller
>>> and flushes all the queues.
>>>
>>> This would work - but all open sockets would get a notification of the
>>> interface went down.
>>>
>>> Btw. IIRC there's a IFLA_CAN_RESTART functionality to kick the CAN controller
>>> if it got stuck. It can by triggered by a netlink message. This is the same
>>> configuration interface that's used to set the bittiming:
>>>
>>> http://lxr.linux.no/#linux+v3.5.4/include/linux/can/netlink.h#L103
>>>
>>> But i have currently no source code example as i always use the 'ip' tool from
>>> the iproute2 package to configure my CAN interfaces:
>>>
>>> http://lxr.linux.no/#linux+v3.5.4/Documentation/networking/can.txt#L646
>>>
>>> Following that documentation
>>>
>>> 	ip link set can0 type can restart
>>>
>>> should do it for 'can0'.
>>
>> No, this does only works if a bus-off is pending. See:
>>
>> http://lxr.linux.no/#linux+v3.5.4/drivers/net/can/dev.c#L413
>>
>> Well, the name IFLA_CAN_RESTART is not really well chosen.
> 
> Hmmm, it corresponds somewhat to the restart-ms property.

Yes, for automatic restart after bus-off.

> We have several customers who asked how to abort pending TX messages,
> too. Which involves:
> a) clear the TX-queue in Linux
> b) clear queue in hardware
> c) abort currently transmitting CAN frame
> 
> I think c) would be a usecase of its own, too.

I think you need c) for b), at least for some controllers. These
features, especially b) and c), are not yet available because they are
not easy to implement (hardware-dependent) and there was no request so
far. Anyway, clearing the TX queue should be rather common and might
already be available for other protocols.

Wolfgang.




* Re: What are you doing if the TX buffer overflows?
  2012-09-18 12:49       ` Wolfgang Grandegger
@ 2012-09-18 13:00         ` Marc Kleine-Budde
  2012-09-18 13:39           ` Wolfgang Grandegger
                             ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Marc Kleine-Budde @ 2012-09-18 13:00 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
[...]

>> We have several customers who asked how to abort pending TX messages,
>> too. Which involves:
>> a) clear the TX-queue in Linux
>> b) clear queue in hardware
>> c) abort currently transmitting CAN frame
>>
>> I think c) would be a usecase of its own, too.
> 
> I think you need c) for b), at least for some controllers. These

Yes, if it's a hardware limitation, so be it. But if we design an
interface, it should support "clear everything" (a+b+c), but also
just c) on its own.

> features, especially b) and c), are not yet available because they are
> not easy to implement (hardware-dependent) and there was no request so
> far.

Sure, this is why I'm coming up with this topic.

> Anyway, clearing the TX queue should be rather common and might
> already be available for other protocols.

Does anyone know?

Marc
-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



* Re: What are you doing if the TX buffer overflows?
  2012-09-18 13:00         ` Marc Kleine-Budde
@ 2012-09-18 13:39           ` Wolfgang Grandegger
  2012-09-18 13:42             ` Marc Kleine-Budde
  2012-09-19 10:18           ` Steffen Rose
       [not found]           ` <34567791.oZ5dyCnTQA@lisa>
  2 siblings, 1 reply; 40+ messages in thread
From: Wolfgang Grandegger @ 2012-09-18 13:39 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
> [...]
> 
>>> We have several customers who asked how to abort pending TX messages,
>>> too. Which involves:
>>> a) clear the TX-queue in Linux
>>> b) clear queue in hardware
>>> c) abort currently transmitting CAN frame
>>>
>>> I think c) would be a usecase of its own, too.
>>
>> I think you need c) for b), at least for some controllers. These
> 
> Yes, if it's a hardware limitation so be it. But if we design an
> interface it should support "clear everything" (a+b+c), but also just
> only c.

Yes, that would be nice. The only portable "clear everything" (a+b+c) I
see is "ifconfig down -> up". This also answers your other related mail.

What do people really want/need and why? This is still not clear to me.
More input would be nice.

Wolfgang.


* Re: What are you doing if the TX buffer overflows?
  2012-09-18 13:39           ` Wolfgang Grandegger
@ 2012-09-18 13:42             ` Marc Kleine-Budde
  2012-09-18 18:50               ` Wolfgang Grandegger
  0 siblings, 1 reply; 40+ messages in thread
From: Marc Kleine-Budde @ 2012-09-18 13:42 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
>> [...]
>>
>>>> We have several customers who asked how to abort pending TX messages,
>>>> too. Which involves:
>>>> a) clear the TX-queue in Linux
>>>> b) clear queue in hardware
>>>> c) abort currently transmitting CAN frame
>>>>
>>>> I think c) would be a usecase of its own, too.
>>>
>>> I think you need c) for b), at least for some controllers. These
>>
>> Yes, if it's a hardware limitation so be it. But if we design an
>> interface it should support "clear everything" (a+b+c), but also just
>> only c.
> 
> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
> see is "ifconfig down -> up". This also answers you other related mail.
> 
> What do people really want/need and why? This is still not clear to me.
> More input would be nice.

Heinz-Jürgen, you use the abort of the current TX message on the SJA1000; can
you give us more insight? I've talked to customers; e.g. they want to abort the
current frame if it takes "too long" to send it, because the frame's CAN
ID priority is too low.

Marc
-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



* Re: What are you doing if the TX buffer overflows?
  2012-09-18 13:42             ` Marc Kleine-Budde
@ 2012-09-18 18:50               ` Wolfgang Grandegger
  2012-09-18 19:01                 ` Marc Kleine-Budde
  0 siblings, 1 reply; 40+ messages in thread
From: Wolfgang Grandegger @ 2012-09-18 18:50 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/18/2012 03:42 PM, Marc Kleine-Budde wrote:
> On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
>> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
>>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
>>> [...]
>>>
>>>>> We have several customers who asked how to abort pending TX messages,
>>>>> too. Which involves:
>>>>> a) clear the TX-queue in Linux
>>>>> b) clear queue in hardware
>>>>> c) abort currently transmitting CAN frame
>>>>>
>>>>> I think c) would be a usecase of its own, too.
>>>>
>>>> I think you need c) for b), at least for some controllers. These
>>>
>>> Yes, if it's a hardware limitation so be it. But if we design an
>>> interface it should support "clear everything" (a+b+c), but also just
>>> only c.
>>
>> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
>> see is "ifconfig down -> up". This also answers you other related mail.
>>
>> What do people really want/need and why? This is still not clear to me.
>> More input would be nice.
> 
> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
> more insight? I've talked to customers, e.g. they want to abort the
> current frame if it takes "too long" to send it, because the frames CAN
> id priority is too low.

What we could implement rather easily is a "tx-abort-last" or
"tx-abort-all" netlink command. As this command does not make sense when
more than one message is pending I'm in favor of "tx-abort-last".

Wolfgang.



* Re: What are you doing if the TX buffer overflows?
  2012-09-18 18:50               ` Wolfgang Grandegger
@ 2012-09-18 19:01                 ` Marc Kleine-Budde
  2012-09-18 19:13                   ` Wolfgang Grandegger
  0 siblings, 1 reply; 40+ messages in thread
From: Marc Kleine-Budde @ 2012-09-18 19:01 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/18/2012 08:50 PM, Wolfgang Grandegger wrote:
> On 09/18/2012 03:42 PM, Marc Kleine-Budde wrote:
>> On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
>>> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
>>>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
>>>> [...]
>>>>
>>>>>> We have several customers who asked how to abort pending TX messages,
>>>>>> too. Which involves:
>>>>>> a) clear the TX-queue in Linux
>>>>>> b) clear queue in hardware
>>>>>> c) abort currently transmitting CAN frame
>>>>>>
>>>>>> I think c) would be a usecase of its own, too.
>>>>>
>>>>> I think you need c) for b), at least for some controllers. These
>>>>
>>>> Yes, if it's a hardware limitation so be it. But if we design an
>>>> interface it should support "clear everything" (a+b+c), but also just
>>>> only c.
>>>
>>> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
>>> see is "ifconfig down -> up". This also answers you other related mail.
>>>
>>> What do people really want/need and why? This is still not clear to me.
>>> More input would be nice.
>>
>> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
>> more insight? I've talked to customers, e.g. they want to abort the
>> current frame if it takes "too long" to send it, because the frames CAN
>> id priority is too low.
> 
> What we could implement rather easily is a "tx-abort-last" or
> "tx-abort-all" netlink command. As this command does not make sense when
> more than one message is pending I'm in favor of "tx-abort-last".

I'd like to have both commands.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



* Re: What are you doing if the TX buffer overflows?
  2012-09-18 19:01                 ` Marc Kleine-Budde
@ 2012-09-18 19:13                   ` Wolfgang Grandegger
  2012-09-18 20:20                     ` Kurt Van Dijck
  0 siblings, 1 reply; 40+ messages in thread
From: Wolfgang Grandegger @ 2012-09-18 19:13 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/18/2012 09:01 PM, Marc Kleine-Budde wrote:
> On 09/18/2012 08:50 PM, Wolfgang Grandegger wrote:
>> On 09/18/2012 03:42 PM, Marc Kleine-Budde wrote:
>>> On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
>>>> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
>>>>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
>>>>> [...]
>>>>>
>>>>>>> We have several customers who asked how to abort pending TX messages,
>>>>>>> too. Which involves:
>>>>>>> a) clear the TX-queue in Linux
>>>>>>> b) clear queue in hardware
>>>>>>> c) abort currently transmitting CAN frame
>>>>>>>
>>>>>>> I think c) would be a usecase of its own, too.
>>>>>>
>>>>>> I think you need c) for b), at least for some controllers. These
>>>>>
>>>>> Yes, if it's a hardware limitation so be it. But if we design an
>>>>> interface it should support "clear everything" (a+b+c), but also just
>>>>> only c.
>>>>
>>>> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
>>>> see is "ifconfig down -> up". This also answers you other related mail.
>>>>
>>>> What do people really want/need and why? This is still not clear to me.
>>>> More input would be nice.
>>>
>>> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
>>> more insight? I've talked to customers, e.g. they want to abort the
>>> current frame if it takes "too long" to send it, because the frames CAN
>>> id priority is too low.
>>
>> What we could implement rather easily is a "tx-abort-last" or
>> "tx-abort-all" netlink command. As this command does not make sense when
>> more than one message is pending I'm in favor of "tx-abort-last".
> 
> I like to have both commands.

But then "tx-abort-all" should also clear the tx socket queue. Also be
aware that more than one socket may send messages. Let's wait for more
use-cases.

Wolfgang.




* Re: What are you doing if the TX buffer overflows?
  2012-09-18 19:13                   ` Wolfgang Grandegger
@ 2012-09-18 20:20                     ` Kurt Van Dijck
  2012-09-19  5:42                       ` Oliver Hartkopp
                                         ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Kurt Van Dijck @ 2012-09-18 20:20 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: Marc Kleine-Budde, Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On Tue, Sep 18, 2012 at 09:13:48PM +0200, Wolfgang Grandegger wrote:
> On 09/18/2012 09:01 PM, Marc Kleine-Budde wrote:
> > On 09/18/2012 08:50 PM, Wolfgang Grandegger wrote:
> >> On 09/18/2012 03:42 PM, Marc Kleine-Budde wrote:
> >>> On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
> >>>> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
> >>>>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
> >>>>> [...]
> >>>>>
> >>>>>>> We have several customers who asked how to abort pending TX messages,
> >>>>>>> too. Which involves:
> >>>>>>> a) clear the TX-queue in Linux
> >>>>>>> b) clear queue in hardware
> >>>>>>> c) abort currently transmitting CAN frame
> >>>>>>>
> >>>>>>> I think c) would be a usecase of its own, too.
> >>>>>>
> >>>>>> I think you need c) for b), at least for some controllers. These
> >>>>>
> >>>>> Yes, if it's a hardware limitation so be it. But if we design an
> >>>>> interface it should support "clear everything" (a+b+c), but also just
> >>>>> only c.
> >>>>
> >>>> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
> >>>> see is "ifconfig down -> up". This also answers you other related mail.
> >>>>
> >>>> What do people really want/need and why? This is still not clear to me.
> >>>> More input would be nice.
> >>>
> >>> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
> >>> more insight? I've talked to customers, e.g. they want to abort the
> >>> current frame if it takes "too long" to send it, because the frames CAN
> >>> id priority is too low.

If "too long" means a period long enough that it makes sense to react in
software (say, more than 10 msec), and your message still has not gone out
although you wanted it to, then IMO the bus load is too high for the
expected timing requirements.

Or am I missing something?
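
To put the 10 msec figure into perspective, here is a rough
back-of-the-envelope sketch in Python. The 500 kbit/s bitrate and all
function names are assumptions for illustration only; nobody in this thread
named a bitrate. It counts the bits of a classical CAN frame with an 11-bit
identifier, including worst-case bit stuffing, and estimates how many such
frames fit into a time window.

```python
def can_frame_bits(data_len: int, worst_case_stuffing: bool = True) -> int:
    """Bits on the wire for a classical CAN data frame with an 11-bit ID."""
    if not 0 <= data_len <= 8:
        raise ValueError("classical CAN carries 0..8 data bytes")
    # 44 bits of fixed fields (SOF through EOF), 3 bits of intermission,
    # plus the payload itself.
    bits = 47 + 8 * data_len
    if worst_case_stuffing:
        # Only the 34 + 8n bits from SOF to the end of the CRC are subject
        # to bit stuffing; the worst case adds one stuff bit per 4 bits.
        bits += (34 + 8 * data_len - 1) // 4
    return bits


def frame_time_us(data_len: int, bitrate: int) -> float:
    """Worst-case time on the bus for one frame, in microseconds."""
    return can_frame_bits(data_len) * 1_000_000 / bitrate


def frames_within(window_ms: float, data_len: int, bitrate: int) -> int:
    """How many worst-case frames fit back to back into window_ms."""
    return int(window_ms * 1000 // frame_time_us(data_len, bitrate))


if __name__ == "__main__":
    bitrate = 500_000  # assumed bitrate, not taken from the thread
    print(frame_time_us(8, bitrate), "us per 8-byte frame")
    print(frames_within(10, 8, bitrate), "frames fit into 10 msec")
```

Under these assumptions a full 8-byte frame occupies the bus for about
270 us, so a frame that is still waiting after 10 msec has lost arbitration
against a few dozen higher-priority frames in a row.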

> >>
> >> What we could implement rather easily is a "tx-abort-last" or
> >> "tx-abort-all" netlink command. As this command does not make sense when
> >> more than one message is pending I'm in favor of "tx-abort-last".
> > 
> > I like to have both commands.
> 
> But then "tx-abort-all" should also clear the tx socket queue. Also be
> aware that more than one socket may send messages. Let's wait for more
> use-cases.

I'm afraid tx-abort-all & tx-abort-last would do more harm than good,
especially, but not only, in multi-user environments.

Kurt


* Re: What are you doing if the TX buffer overflows?
  2012-09-18 20:20                     ` Kurt Van Dijck
@ 2012-09-19  5:42                       ` Oliver Hartkopp
  2012-09-19  7:47                         ` Marc Kleine-Budde
  2012-09-19  9:04                         ` Kurt Van Dijck
  2012-09-19  6:50                       ` Wolfgang Grandegger
  2012-09-19  7:31                       ` Marc Kleine-Budde
  2 siblings, 2 replies; 40+ messages in thread
From: Oliver Hartkopp @ 2012-09-19  5:42 UTC (permalink / raw)
  To: Wolfgang Grandegger, Marc Kleine-Budde, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org, Kurt Van Dijck

On 18.09.2012 22:20, Kurt Van Dijck wrote:

> On Tue, Sep 18, 2012 at 09:13:48PM +0200, Wolfgang Grandegger wrote:
>> On 09/18/2012 09:01 PM, Marc Kleine-Budde wrote:
>>> On 09/18/2012 08:50 PM, Wolfgang Grandegger wrote:
>>>> On 09/18/2012 03:42 PM, Marc Kleine-Budde wrote:
>>>>> On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
>>>>>> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
>>>>>>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
>>>>>>> [...]
>>>>>>>
>>>>>>>>> We have several customers who asked how to abort pending TX messages,
>>>>>>>>> too. Which involves:
>>>>>>>>> a) clear the TX-queue in Linux
>>>>>>>>> b) clear queue in hardware
>>>>>>>>> c) abort currently transmitting CAN frame
>>>>>>>>>
>>>>>>>>> I think c) would be a usecase of its own, too.
>>>>>>>>
>>>>>>>> I think you need c) for b), at least for some controllers. These
>>>>>>>
>>>>>>> Yes, if it's a hardware limitation so be it. But if we design an
>>>>>>> interface it should support "clear everything" (a+b+c), but also just
>>>>>>> only c.
>>>>>>
>>>>>> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
>>>>>> see is "ifconfig down -> up". This also answers you other related mail.
>>>>>>
>>>>>> What do people really want/need and why? This is still not clear to me.
>>>>>> More input would be nice.
>>>>>
>>>>> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
>>>>> more insight? I've talked to customers, e.g. they want to abort the
>>>>> current frame if it takes "too long" to send it, because the frames CAN
>>>>> id priority is too low.
> 
> If "too long" is defined as a period of time where it makes sense to
> take such actions in software (like > 10msec), and your message still
> did not get out, but you wanted it to be,
> then IMO the bus load is too high with regard to the expected time restrictions.
> 
> Or am I missing something?


The question is whether we should add some kind of expiry time to the tx-skb.

I have not checked whether the new ematch queuing discipline

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=f057bbb6f9ed0fb61ea11105c9ef0ed5ac1a354d

http://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf

would already allow us to sort out expired skbs.

This would probably be preferable to checking for expired skbs
inside the driver.
 

> 
>>>>
>>>> What we could implement rather easily is a "tx-abort-last" or
>>>> "tx-abort-all" netlink command. As this command does not make sense when
>>>> more than one message is pending I'm in favor of "tx-abort-last".
>>>
>>> I like to have both commands.
>>
>> But then "tx-abort-all" should also clear the tx socket queue. Also be
>> aware that more than one socket may send messages. Let's wait for more
>> use-cases.
> 
> I'm afraid tx-abort-all & tx-abort-last cause more damage than good,
> especially, but not only, in multi-user environments.


These netlink commands can be restricted to root.
They may make sense for some special use-cases.

When the tx queues are drained from the driver side, all skbs are purged.
I think this is then not tied to a particular socket instance.

To summarize, I would suggest checking whether skbs can be given an expiry
(which would kill outdated CAN frames) and implementing only a tx-abort
command that kills just the latest frame.
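
The expiry idea can be pictured with a small toy model in Python. This is
only a sketch of the concept in userspace terms, not kernel or qdisc code;
the class and field names (ExpiringTxQueue, PendingFrame, deadline) are
made up for illustration. Each queued frame carries a deadline, and frames
whose deadline has passed are dropped when the next frame is pulled for
transmission.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class PendingFrame:
    can_id: int
    data: bytes
    deadline: float  # absolute time (seconds) after which the frame is stale


class ExpiringTxQueue:
    """Toy FIFO that discards outdated frames instead of sending them."""

    def __init__(self) -> None:
        self._queue: Deque[PendingFrame] = deque()
        self.dropped: List[PendingFrame] = []

    def enqueue(self, can_id: int, data: bytes, now: float, ttl: float) -> None:
        """Queue a frame that stays valid for ttl seconds from now."""
        self._queue.append(PendingFrame(can_id, data, now + ttl))

    def dequeue(self, now: float) -> Optional[PendingFrame]:
        """Return the next frame that is still valid, dropping expired ones."""
        while self._queue:
            frame = self._queue.popleft()
            if frame.deadline >= now:
                return frame
            self.dropped.append(frame)
        return None


if __name__ == "__main__":
    q = ExpiringTxQueue()
    q.enqueue(0x123, b"\x01", now=0.0, ttl=0.010)  # stale after 10 msec
    q.enqueue(0x124, b"\x02", now=0.0, ttl=1.0)
    print(q.dequeue(now=0.5))  # the 0x123 frame is silently dropped
```

Note that this only covers frames that are still waiting in software; a
frame already handed to the controller would still need an abort mechanism.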

Regards,
Oliver


* Re: What are you doing if the TX buffer overflows?
  2012-09-19  5:42                       ` Oliver Hartkopp
@ 2012-09-19  7:47                         ` Marc Kleine-Budde
  2012-09-19  9:04                         ` Kurt Van Dijck
  1 sibling, 0 replies; 40+ messages in thread
From: Marc Kleine-Budde @ 2012-09-19  7:47 UTC (permalink / raw)
  To: Oliver Hartkopp
  Cc: Wolfgang Grandegger, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org, Kurt Van Dijck


On 09/19/2012 07:42 AM, Oliver Hartkopp wrote:
[...]

>>>>>>> What do people really want/need and why? This is still not clear to me.
>>>>>>> More input would be nice.
>>>>>>
>>>>>> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
>>>>>> more insight? I've talked to customers, e.g. they want to abort the
>>>>>> current frame if it takes "too long" to send it, because the frames CAN
>>>>>> id priority is too low.
>>
>> If "too long" is defined as a period of time where it makes sense to
>> take such actions in software (like > 10msec), and your message still
>> did not get out, but you wanted it to be,
>> then IMO the bus load is too high with regard to the expected time restrictions.
>>
>> Or am I missing something?

I was just describing the use case for an "abort-current-tx" command. The
"it takes too long, so abort" logic would be implemented in userspace, but
that is not possible without an "abort-current-tx" command. It was not my
intention to propose an automatic in-kernel abort of messages that are too
old as a framework option.

> The question is, if we should add some time of expiry to the tx-skb?!?

No, see above.

> I have not checked whether the new ematch queuing discipline
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=f057bbb6f9ed0fb61ea11105c9ef0ed5ac1a354d
> 
> http://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf
> 
> would allow already to sort out expired skbs.
> 
> This is probably the preferred solution instead of checking expired skbs
> inside the driver.

I don't want to check for expired frames in the driver.

>>>>> What we could implement rather easily is a "tx-abort-last" or
>>>>> "tx-abort-all" netlink command. As this command does not make sense when
>>>>> more than one message is pending I'm in favor of "tx-abort-last".
>>>>
>>>> I like to have both commands.
>>>
>>> But then "tx-abort-all" should also clear the tx socket queue. Also be
>>> aware that more than one socket may send messages. Let's wait for more
>>> use-cases.
>>
>> I'm afraid tx-abort-all & tx-abort-last cause more damage than good,
>> especially, but not only, in multi-user environments.
> 
> 
> These netlink commands are/can be restricted to be used by root only.
> Maybe for some special use-cases it makes sense to have them.
> 
> When draining the tx queues from the driver side, all skbs are purged.
> I think this is not bound to a special socket instance then.
> 
> Summarizing i would suggest to check the expiring possibility of skbs (which
> will kill out-dated CAN frames) - and only implement a tx-abort command that
> kills the latest frame only.

You mean an "in-band" skb that kills outdated CAN frames? No, no in-band
magic, please.

Marc
-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |




* Re: What are you doing if the TX buffer overflows?
  2012-09-19  5:42                       ` Oliver Hartkopp
  2012-09-19  7:47                         ` Marc Kleine-Budde
@ 2012-09-19  9:04                         ` Kurt Van Dijck
  1 sibling, 0 replies; 40+ messages in thread
From: Kurt Van Dijck @ 2012-09-19  9:04 UTC (permalink / raw)
  To: Oliver Hartkopp
  Cc: Wolfgang Grandegger, Marc Kleine-Budde, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On Wed, Sep 19, 2012 at 07:42:34AM +0200, Oliver Hartkopp wrote:
> On 18.09.2012 22:20, Kurt Van Dijck wrote:
> 
> > On Tue, Sep 18, 2012 at 09:13:48PM +0200, Wolfgang Grandegger wrote:
> >> On 09/18/2012 09:01 PM, Marc Kleine-Budde wrote:
> >>> On 09/18/2012 08:50 PM, Wolfgang Grandegger wrote:
> >>>> On 09/18/2012 03:42 PM, Marc Kleine-Budde wrote:
> >>>>> On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
> >>>>>> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
> >>>>>>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
> >>>>>>> [...]
> >>>>>>>
> >>>>>>>>> We have several customers who asked how to abort pending TX messages,
> >>>>>>>>> too. Which involves:
> >>>>>>>>> a) clear the TX-queue in Linux
> >>>>>>>>> b) clear queue in hardware
> >>>>>>>>> c) abort currently transmitting CAN frame
> >>>>>>>>>
> >>>>>>>>> I think c) would be a usecase of its own, too.
> >>>>>>>>
> >>>>>>>> I think you need c) for b), at least for some controllers. These
> >>>>>>>
> >>>>>>> Yes, if it's a hardware limitation so be it. But if we design an
> >>>>>>> interface it should support "clear everything" (a+b+c), but also just
> >>>>>>> only c.
> >>>>>>
> >>>>>> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
> >>>>>> see is "ifconfig down -> up". This also answers you other related mail.
> >>>>>>
> >>>>>> What do people really want/need and why? This is still not clear to me.
> >>>>>> More input would be nice.
> >>>>>
> >>>>> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
> >>>>> more insight? I've talked to customers, e.g. they want to abort the
> >>>>> current frame if it takes "too long" to send it, because the frames CAN
> >>>>> id priority is too low.
> > 
> > If "too long" is defined as a period of time where it makes sense to
> > take such actions in software (like > 10msec), and your message still
> > did not get out, but you wanted it to be,
> > then IMO the bus load is too high with regard to the expected time restrictions.
> > 
> > Or am I missing something?
> 
> 
> The question is, if we should add some time of expiry to the tx-skb?!?
> 
> I have not checked whether the new ematch queuing discipline
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=f057bbb6f9ed0fb61ea11105c9ef0ed5ac1a354d
> 
> http://rtime.felk.cvut.cz/can/socketcan-qdisc-final.pdf
> 
> would allow already to sort out expired skbs.
> 
> This is probably the preferred solution instead of checking expired skbs
> inside the driver.
>  

I did not check the qdisc code.

If this expiry were a per-skb option (or maybe per-socket), then I think
it would solve the unclear use case mentioned above without being too
invasive for current and future designs.

Kurt



* Re: What are you doing if the TX buffer overflows?
  2012-09-18 20:20                     ` Kurt Van Dijck
  2012-09-19  5:42                       ` Oliver Hartkopp
@ 2012-09-19  6:50                       ` Wolfgang Grandegger
  2012-09-19  7:39                         ` Marc Kleine-Budde
  2012-09-19  7:31                       ` Marc Kleine-Budde
  2 siblings, 1 reply; 40+ messages in thread
From: Wolfgang Grandegger @ 2012-09-19  6:50 UTC (permalink / raw)
  To: Marc Kleine-Budde, Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/18/2012 10:20 PM, Kurt Van Dijck wrote:
> On Tue, Sep 18, 2012 at 09:13:48PM +0200, Wolfgang Grandegger wrote:
>> On 09/18/2012 09:01 PM, Marc Kleine-Budde wrote:
>>> On 09/18/2012 08:50 PM, Wolfgang Grandegger wrote:
>>>> On 09/18/2012 03:42 PM, Marc Kleine-Budde wrote:
>>>>> On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
>>>>>> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
>>>>>>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
>>>>>>> [...]
>>>>>>>
>>>>>>>>> We have several customers who asked how to abort pending TX messages,
>>>>>>>>> too. Which involves:
>>>>>>>>> a) clear the TX-queue in Linux
>>>>>>>>> b) clear queue in hardware
>>>>>>>>> c) abort currently transmitting CAN frame
>>>>>>>>>
>>>>>>>>> I think c) would be a usecase of its own, too.
>>>>>>>>
>>>>>>>> I think you need c) for b), at least for some controllers. These
>>>>>>>
>>>>>>> Yes, if it's a hardware limitation so be it. But if we design an
>>>>>>> interface it should support "clear everything" (a+b+c), but also just
>>>>>>> only c.
>>>>>>
>>>>>> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
>>>>>> see is "ifconfig down -> up". This also answers you other related mail.
>>>>>>
>>>>>> What do people really want/need and why? This is still not clear to me.
>>>>>> More input would be nice.
>>>>>
>>>>> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
>>>>> more insight? I've talked to customers, e.g. they want to abort the
>>>>> current frame if it takes "too long" to send it, because the frames CAN
>>>>> id priority is too low.
> 
> If "too long" is defined as a period of time where it makes sense to
> take such actions in software (like > 10msec), and your message still
> did not get out, but you wanted it to be,
> then IMO the bus load is too high with regard to the expected time restrictions.
> 
> Or am I missing something?
> 
>>>>
>>>> What we could implement rather easily is a "tx-abort-last" or
>>>> "tx-abort-all" netlink command. As this command does not make sense when
>>>> more than one message is pending I'm in favor of "tx-abort-last".
>>>
>>> I like to have both commands.
>>
>> But then "tx-abort-all" should also clear the tx socket queue. Also be
>> aware that more than one socket may send messages. Let's wait for more
>> use-cases.
> 
> I'm afraid tx-abort-all & tx-abort-last cause more damage than good,
> especially, but not only, in multi-user environments.

At least it's incompatible with multi-user (-sender) support and it does
not fit into the Linux networking principles! And so far I have not
heard strong arguments for implementing message flush and abort. I still
believe that a device stop->restart is enough to cure such problems. But
let's wait and see.

Wolfgang.


* Re: What are you doing if the TX buffer overflows?
  2012-09-19  6:50                       ` Wolfgang Grandegger
@ 2012-09-19  7:39                         ` Marc Kleine-Budde
  2012-09-19  8:10                           ` Wolfgang Grandegger
  0 siblings, 1 reply; 40+ messages in thread
From: Marc Kleine-Budde @ 2012-09-19  7:39 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org


On 09/19/2012 08:50 AM, Wolfgang Grandegger wrote:
> On 09/18/2012 10:20 PM, Kurt Van Dijck wrote:
>> On Tue, Sep 18, 2012 at 09:13:48PM +0200, Wolfgang Grandegger wrote:
>>> On 09/18/2012 09:01 PM, Marc Kleine-Budde wrote:
>>>> On 09/18/2012 08:50 PM, Wolfgang Grandegger wrote:
>>>>> On 09/18/2012 03:42 PM, Marc Kleine-Budde wrote:
>>>>>> On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
>>>>>>> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
>>>>>>>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
>>>>>>>> [...]
>>>>>>>>
>>>>>>>>>> We have several customers who asked how to abort pending TX messages,
>>>>>>>>>> too. Which involves:
>>>>>>>>>> a) clear the TX-queue in Linux
>>>>>>>>>> b) clear queue in hardware
>>>>>>>>>> c) abort currently transmitting CAN frame
>>>>>>>>>>
>>>>>>>>>> I think c) would be a usecase of its own, too.
>>>>>>>>>
>>>>>>>>> I think you need c) for b), at least for some controllers. These
>>>>>>>>
>>>>>>>> Yes, if it's a hardware limitation so be it. But if we design an
>>>>>>>> interface it should support "clear everything" (a+b+c), but also just
>>>>>>>> only c.
>>>>>>>
>>>>>>> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
>>>>>>> see is "ifconfig down -> up". This also answers you other related mail.
>>>>>>>
>>>>>>> What do people really want/need and why? This is still not clear to me.
>>>>>>> More input would be nice.
>>>>>>
>>>>>> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
>>>>>> more insight? I've talked to customers, e.g. they want to abort the
>>>>>> current frame if it takes "too long" to send it, because the frames CAN
>>>>>> id priority is too low.
>>
>> If "too long" is defined as a period of time where it makes sense to
>> take such actions in software (like > 10msec), and your message still
>> did not get out, but you wanted it to be,
>> then IMO the bus load is too high with regard to the expected time restrictions.
>>
>> Or am I missing something?
>>
>>>>>
>>>>> What we could implement rather easily is a "tx-abort-last" or
>>>>> "tx-abort-all" netlink command. As this command does not make sense when
>>>>> more than one message is pending I'm in favor of "tx-abort-last".
>>>>
>>>> I like to have both commands.
>>>
>>> But then "tx-abort-all" should also clear the tx socket queue. Also be
>>> aware that more than one socket may send messages. Let's wait for more
>>> use-cases.
>>
>> I'm afraid tx-abort-all & tx-abort-last cause more damage than good,
>> especially, but not only, in multi-user environments.
> 
> At least it's incompatible with multi-user (-sender) support and it does
> not fit into the Linux networking principles! And so far I have not
> heard strong arguments for implementing message flush and abort. I still
> believe that a device stop->restart is enough to cure such problems. But
> let's wait and see.

A network stop->start basically drains all buffers (software +
hardware + the frame currently being sent). Just removing the frame that
is currently being transmitted is a valid use case, and "ifconfig down;
ifconfig up" has some drawbacks:
- a transceiver (if controlled by the driver) will be switched
  off and on
- the clocks will be turned off and on
- there might be some "mdelay" or "msleep" loops during hw/sw reset

If the hardware can clear its hw FIFO and abort the current tx frame,
there might be a much faster and cleaner way.
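
For reference, the portable "clear everything" path discussed here can be
scripted as in the sketch below. It uses the 'ip' tool rather than
ifconfig; the interface name can0 and the helper names are assumptions for
illustration. Changing the link state needs root (CAP_NET_ADMIN), and every
socket bound to the interface will notice that it went down.

```python
import subprocess
from typing import List


def restart_commands(ifname: str) -> List[List[str]]:
    """Commands that take a network interface down and bring it up again."""
    return [
        ["ip", "link", "set", ifname, "down"],
        ["ip", "link", "set", ifname, "up"],
    ]


def restart_interface(ifname: str) -> None:
    """Run the down/up cycle; raises CalledProcessError if a step fails."""
    for cmd in restart_commands(ifname):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, since they need root.
    for cmd in restart_commands("can0"):
        print(" ".join(cmd))
```

This pays the transceiver, clock, and reset costs listed above on every
call, which is the argument for a lighter abort command.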

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |




* Re: What are you doing if the TX buffer overflows?
  2012-09-19  7:39                         ` Marc Kleine-Budde
@ 2012-09-19  8:10                           ` Wolfgang Grandegger
  0 siblings, 0 replies; 40+ messages in thread
From: Wolfgang Grandegger @ 2012-09-19  8:10 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org

On 09/19/2012 09:39 AM, Marc Kleine-Budde wrote:
> On 09/19/2012 08:50 AM, Wolfgang Grandegger wrote:
>> On 09/18/2012 10:20 PM, Kurt Van Dijck wrote:
>>> On Tue, Sep 18, 2012 at 09:13:48PM +0200, Wolfgang Grandegger wrote:
>>>> On 09/18/2012 09:01 PM, Marc Kleine-Budde wrote:
>>>>> On 09/18/2012 08:50 PM, Wolfgang Grandegger wrote:
>>>>>> On 09/18/2012 03:42 PM, Marc Kleine-Budde wrote:
>>>>>>> On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
>>>>>>>> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
>>>>>>>>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>>>> We have several customers who asked how to abort pending TX messages,
>>>>>>>>>>> too. Which involves:
>>>>>>>>>>> a) clear the TX-queue in Linux
>>>>>>>>>>> b) clear queue in hardware
>>>>>>>>>>> c) abort currently transmitting CAN frame
>>>>>>>>>>>
>>>>>>>>>>> I think c) would be a usecase of its own, too.
>>>>>>>>>>
>>>>>>>>>> I think you need c) for b), at least for some controllers. These
>>>>>>>>>
>>>>>>>>> Yes, if it's a hardware limitation so be it. But if we design an
>>>>>>>>> interface it should support "clear everything" (a+b+c), but also just
>>>>>>>>> only c.
>>>>>>>>
>>>>>>>> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
>>>>>>>> see is "ifconfig down -> up". This also answers you other related mail.
>>>>>>>>
>>>>>>>> What do people really want/need and why? This is still not clear to me.
>>>>>>>> More input would be nice.
>>>>>>>
>>>>>>> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
>>>>>>> more insight? I've talked to customers, e.g. they want to abort the
>>>>>>> current frame if it takes "too long" to send it, because the frames CAN
>>>>>>> id priority is too low.
>>>
>>> If "too long" is defined as a period of time where it makes sense to
>>> take such actions in software (like > 10msec), and your message still
>>> did not get out, but you wanted it to be,
>>> then IMO the bus load is too high with regard to the expected time restrictions.
>>>
>>> Or am I missing something?
>>>
>>>>>>
>>>>>> What we could implement rather easily is a "tx-abort-last" or
>>>>>> "tx-abort-all" netlink command. As this command does not make sense when
>>>>>> more than one message is pending I'm in favor of "tx-abort-last".
>>>>>
>>>>> I like to have both commands.
>>>>
>>>> But then "tx-abort-all" should also clear the tx socket queue. Also be
>>>> aware that more than one socket may send messages. Let's wait for more
>>>> use-cases.
>>>
>>> I'm afraid tx-abort-all & tx-abort-last cause more damage than good,
>>> especially, but not only, in multi-user environments.
>>
>> At least it's incompatible with multi-user (-sender) support and it does
>> not fit into the Linux networking principles! And so far I have not
>> heard strong arguments for implementing message flush and abort. I still
>> believe that a device stop->restart is enough to cure such problems. But
>> let's wait and see.
> 
> Network stop->start does basically drain all buffers (software +
> hardware + frame currently sending). Just remove the currently tx-ing
> frame is a valid use case, and ifconfig down; ifconfig  up has some
> drawbacks:
> - a transceiver (if controlled by the driver) will be switched
>   off and on.
> - the clocks will be turned off and on
> - there might be some "mdelay" or "msleep" loops during hw/sw reset

Before doing this, the controller goes bus-off. Therefore it should do no
harm.

> In case the hardware a clear hw fifo and abort current tx frame there

What would clearing the hw FIFO be good for? For portability reasons I
would then vote for tx-abort-all, which does both. As the app does not know
which message is currently being transmitted, that makes sense to me.

> might be a much faster and cleaner way.

Well, it does not need to be fast because the bus is blocked anyway, and
the support would be heavily hardware dependent. Anyway, so far I have
not seen a good use-case (apart from the vague one you mentioned). I'm
also afraid of such a feature being misused.

Wolfgang.



* Re: What are you doing if the TX buffer overflows?
  2012-09-18 20:20                     ` Kurt Van Dijck
  2012-09-19  5:42                       ` Oliver Hartkopp
  2012-09-19  6:50                       ` Wolfgang Grandegger
@ 2012-09-19  7:31                       ` Marc Kleine-Budde
  2 siblings, 0 replies; 40+ messages in thread
From: Marc Kleine-Budde @ 2012-09-19  7:31 UTC (permalink / raw)
  To: Wolfgang Grandegger, Oliver Hartkopp, Heinz-Jürgen Oertel,
	linux-can@vger.kernel.org


On 09/18/2012 10:20 PM, Kurt Van Dijck wrote:
> On Tue, Sep 18, 2012 at 09:13:48PM +0200, Wolfgang Grandegger wrote:
>> On 09/18/2012 09:01 PM, Marc Kleine-Budde wrote:
>>> On 09/18/2012 08:50 PM, Wolfgang Grandegger wrote:
>>>> On 09/18/2012 03:42 PM, Marc Kleine-Budde wrote:
>>>>> On 09/18/2012 03:39 PM, Wolfgang Grandegger wrote:
>>>>>> On 09/18/2012 03:00 PM, Marc Kleine-Budde wrote:
>>>>>>> On 09/18/2012 02:49 PM, Wolfgang Grandegger wrote:
>>>>>>> [...]
>>>>>>>
>>>>>>>>> We have several customers who asked how to abort pending TX messages,
>>>>>>>>> too. Which involves:
>>>>>>>>> a) clear the TX-queue in Linux
>>>>>>>>> b) clear queue in hardware
>>>>>>>>> c) abort currently transmitting CAN frame
>>>>>>>>>
>>>>>>>>> I think c) would be a usecase of its own, too.
>>>>>>>>
>>>>>>>> I think you need c) for b), at least for some controllers. These
>>>>>>>
>>>>>>> Yes, if it's a hardware limitation so be it. But if we design an
>>>>>>> interface it should support "clear everything" (a+b+c), but also just
>>>>>>> only c.
>>>>>>
>>>>>> Yes, that you be nice. The only portable "clear everything" (a+b+c) I
>>>>>> see is "ifconfig down -> up". This also answers you other related mail.
>>>>>>
>>>>>> What do people really want/need and why? This is still not clear to me.
>>>>>> More input would be nice.
>>>>>
>>>>> Heinz-Jürgen uses abort current TX Message on SJA1000, can you give us
>>>>> more insight? I've talked to customers, e.g. they want to abort the
>>>>> current frame if it takes "too long" to send it, because the frames CAN
>>>>> id priority is too low.
> 
> If "too long" is defined as a period of time where it makes sense to
> take such actions in software (like > 10msec), and your message still
> did not get out, but you wanted it to be,
> then IMO the bus load is too high with regard to the expected time restrictions.
> 
> Or am I missing something?
> 
>>>>
>>>> What we could implement rather easily is a "tx-abort-last" or
>>>> "tx-abort-all" netlink command. As this command does not make sense when
>>>> more than one message is pending I'm in favor of "tx-abort-last".
>>>
>>> I like to have both commands.
>>
>> But then "tx-abort-all" should also clear the tx socket queue. Also be
>> aware that more than one socket may send messages. Let's wait for more
>> use-cases.
> 
> I'm afraid tx-abort-all & tx-abort-last cause more damage than good,
> especially, but not only, in multi-user environments.

Yes, sure, they might, but so does an "ifconfig down; ifconfig up". But
"tx-abort-last" is a valid use case. IIRC even the FPGA CAN controller
I saw at the CiA meeting implemented something like this.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |




* Re: What are you doing if the TX buffer overflows?
  2012-09-18 13:00         ` Marc Kleine-Budde
  2012-09-18 13:39           ` Wolfgang Grandegger
@ 2012-09-19 10:18           ` Steffen Rose
       [not found]           ` <34567791.oZ5dyCnTQA@lisa>
  2 siblings, 0 replies; 40+ messages in thread
From: Steffen Rose @ 2012-09-19 10:18 UTC (permalink / raw)
  To: linux-can

Hello,

more a question to this discussion then a meaning:

> >> a) clear the TX-queue in Linux
[...]
> Yes, if it's a hardware limitation so be it. But if we design an
> interface it should support "clear everything" (a+b+c), but also just
> only c.
> 

Example:
I use more than one program (say, 3) to communicate over one interface,
e.g. can0.

Question:
The TX queue is related to the interface, not to the program, right?

This would mean that if my program (one of the 3) clears the TX queue as
part of an internal communication reset, it would also destroy the messages
of the other programs.

Is this right?

-- 
Regards

Steffen Rose

--------------------------------------------------------------

emtas - your embedded solution partner

Meet our specialists on the SPS IPC drives fair 
from 27 to 29.November in Nuremberg

--------------------------------------------------------------
emtas GmbH
Fischweg 17
06217 Merseburg
Tel.:  +49(0)3461-79416-0
Fax.:  +49(0)3461-79416-10
www.emtas.de - service@emtas.de
Geschäftsführer: Andreas Boebel, Torsten Gedenk, Steffen Rose
Registergericht: Amtsgericht Stendal
Registernummer:  HRB 17616 
UstID: DE281092198
--------------------------------------------------------------


[parent not found: <34567791.oZ5dyCnTQA@lisa>]

* Re: [Socketcan-users] What are you doing if the TX buffer overflows?
       [not found]           ` <34567791.oZ5dyCnTQA@lisa>
@ 2012-09-19 10:26             ` Kurt Van Dijck
  2012-09-19 11:32               ` Steffen Rose
  0 siblings, 1 reply; 40+ messages in thread
From: Kurt Van Dijck @ 2012-09-19 10:26 UTC (permalink / raw)
  To: Steffen Rose; +Cc: linux-can, socketcan-users

On Wed, Sep 19, 2012 at 12:02:37PM +0200, Steffen Rose wrote:
> Hallo,

Please use linux-can@vger.kernel.org mailing list :-)

> 
> more a question to this discussion then a meaning:
(small english vocabulary fix: s/meaning/opinion/)
> 
> > >> a) clear the TX-queue in Linux
> [...]
> > Yes, if it's a hardware limitation so be it. But if we design an
> > interface it should support "clear everything" (a+b+c), but also just
> > only c.
> > 
> 
> Example:
>  I use more than one program, e.g.3 to communicate over one interface, 
> e.g.can0.
> 
> Question:
> The TX Queue is interface related, not program related, right?
> 
> This would mean that if my program (one of the 3) clears the TX queue as part of
> an internal communication reset, it would also destroy the messages of the other
> programs.
> 
> Is this idea right?

Yes, you're right.
That's an aspect of 'multi-user' environment,
with a flexible 'user' interpretation.

Kind regards,
Kurt

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Socketcan-users] What are you doing if the TX buffer overflows?
  2012-09-19 10:26             ` [Socketcan-users] " Kurt Van Dijck
@ 2012-09-19 11:32               ` Steffen Rose
  0 siblings, 0 replies; 40+ messages in thread
From: Steffen Rose @ 2012-09-19 11:32 UTC (permalink / raw)
  To: linux-can

Hello,

> > This would mean that if my program (one of the 3) clears the TX queue as
> > part of an internal communication reset, it would also destroy the messages
> > of the other programs.
> > 
> > Is this idea right?
> 
> Yes, you're right.
> That's an aspect of 'multi-user' environment,
> with a flexible 'user' interpretation.

This raises a follow-up case:

Another program (the master) sends a communication reset request to my program.
I would then want to clear all of my outstanding messages, because they contain
old data that I don't want to send. I think the problem is worst when the
messages have been sitting in the buffer for a long time, e.g. because the
'root' user (not my program) has created a large buffer.

I want to clear these messages with user rights. I think this last requirement
(user rights) should be OK, because I only want to clear my own messages.

I'm unsure whether the buffered messages carry any association with the
program that sent them.

Additionally: of course I respond to the reset request, and I hope that this
response is sent after all the old buffered messages. That depends on the queue
configuration, I think.

-- 
Mit freundlichen Grüßen / Regards
Steffen Rose
www.emtas.de

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-09-18 12:34     ` Marc Kleine-Budde
  2012-09-18 12:49       ` Wolfgang Grandegger
@ 2012-11-14 20:48       ` Jason White
  2012-11-15 12:54         ` Marc Kleine-Budde
  1 sibling, 1 reply; 40+ messages in thread
From: Jason White @ 2012-11-14 20:48 UTC (permalink / raw)
  To: linux-can

Marc Kleine-Budde <mkl <at> pengutronix.de> writes:

> 
> 
> We have several customers who asked how to abort pending TX messages,
> too. Which involves:
> a) clear the TX-queue in Linux
> b) clear queue in hardware
> c) abort currently transmitting CAN frame
> 
> I think c) would be a usecase of its own, too.
> 
> Marc
> 

Has there been any more investigation into this tx buffer overflow scenario?

I've been working on using SocketCAN interface for the past several months.  
I have quite a bit of experience with CAN from an embedded perspective 
(outside of Linux) and even multiple clients accessing a single CAN port.  
We have to have a method of flushing the transmit queue because if an ECU is 
initially alone on the network and goes error passive (no ack), then we want
to be able to flush the tx queue periodically to prevent stale data on the bus.
Other scenarios have also been listed.

Right now we are taking the interface down and then back up and while this 
works for a single process using CAN, I don't think it will work well with 
multi-process support.  Would the other processes know that the interface 
was taken down?  I haven't decided the route we will go, but maybe an 
intermediate layer between SocketCAN and user processes to monitor the 
flushing.  Does the socket itself have any buffering or are messages 
forwarded directly to the driver?

Ideally a packet timeout would be implemented in the driver since it knows 
when the last message was transmitted (tx interrupt) and could perform the 
flush properly.  Has there been any more thought to this scenario?
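
As an illustration of the "intermediate layer" idea only (not something the
kernel or SocketCAN provides), a user-space staging queue could timestamp each
frame and drop it once it is older than a chosen limit. The class name
StagedTxQueue, its methods and the max_age_s parameter are hypothetical. Note
the limitation: this only covers frames that have not yet been written to the
socket; anything already handed to the kernel sits in the per-interface queue
discussed above.

```python
import time
from collections import deque


class StagedTxQueue:
    """Hypothetical user-space staging queue that drops stale frames."""

    def __init__(self, max_age_s, clock=time.monotonic):
        self.max_age_s = max_age_s
        self.clock = clock          # injectable so the logic can be tested
        self._pending = deque()     # entries are (enqueue_time, frame_bytes)

    def put(self, frame):
        """Stage a frame together with the time it was queued."""
        self._pending.append((self.clock(), frame))

    def pop_fresh(self):
        """Return the oldest frame that is still fresh, or None.

        Stale frames found on the way are silently discarded.
        """
        now = self.clock()
        while self._pending:
            queued_at, frame = self._pending.popleft()
            if now - queued_at <= self.max_age_s:
                return frame
        return None

    def flush(self):
        """Drop everything this process has staged; return how many."""
        dropped = len(self._pending)
        self._pending.clear()
        return dropped
```

A process would call pop_fresh() right before each write to the CAN socket and
flush() on a communication reset, which only ever touches its own frames.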

Thanks,
Jason

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-11-14 20:48       ` Jason White
@ 2012-11-15 12:54         ` Marc Kleine-Budde
  2012-11-15 17:12           ` Oliver Hartkopp
  2012-11-15 19:07           ` Jason White
  0 siblings, 2 replies; 40+ messages in thread
From: Marc Kleine-Budde @ 2012-11-15 12:54 UTC (permalink / raw)
  To: Jason White; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 2498 bytes --]

On 11/14/2012 09:48 PM, Jason White wrote:
> Marc Kleine-Budde <mkl <at> pengutronix.de> writes:
> 
>>
>>
>> We have several customers who asked how to abort pending TX messages,
>> too. Which involves:
>> a) clear the TX-queue in Linux
>> b) clear queue in hardware
>> c) abort currently transmitting CAN frame
>>
>> I think c) would be a usecase of its own, too.
>>
>> Marc
>>
> 
> 
> Has there been any more investigation into this tx buffer overflow scenario?
> 
> I've been working on using SocketCAN interface for the past several months.  
> I have quite a bit of experience with CAN from an embedded perspective 
> (outside of Linux) and even multiple clients accessing a single CAN port.  
> We have to have a method of flushing the transmit queue because if an ECU is 
> initially alone on the network and goes error passive (no ack), then we want
> to be able to flush the tx queue periodically to prevent stale data on the bus.
> Other scenarios have also been listed.
> 
> Right now we are taking the interface down and then back up and while this 
> works for a single process using CAN, I don't think it will work well with 
> multi-process support.  Would the other processes know that the interface 
> was taken down?  I haven't decided the route we will go, but maybe an 

If the other process is doing a read, the read will return. I'm not sure
what happens to a write after ifdown/ifup, I think it will fail.

> intermediate layer between SocketCAN and user processes to monitor the 
> flushing.  Does the socket itself have any buffering or are messages 
> forwarded directly to the driver?
> 
> Ideally a packet timeout would be implemented in the driver since it knows 
> when the last message was transmitted (tx interrupt) and could perform the 
> flush properly.  Has there been any more thought to this scenario?

This is a scenario, I think the first step will be to implement an
"abort currently transmitted frame" callback. Note that not all hardware
supports a tx-complete interrupt. A timeout mechanism can then be
implemented in the framework in a hardware independent way, e.g. if you
extend the currently used can_echo mechanism.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-11-15 12:54         ` Marc Kleine-Budde
@ 2012-11-15 17:12           ` Oliver Hartkopp
  2012-11-15 19:11             ` Jason White
  2012-11-16 15:13             ` Kurt Van Dijck
  2012-11-15 19:07           ` Jason White
  1 sibling, 2 replies; 40+ messages in thread
From: Oliver Hartkopp @ 2012-11-15 17:12 UTC (permalink / raw)
  To: Marc Kleine-Budde, Jason White; +Cc: linux-can

On 15.11.2012 13:54, Marc Kleine-Budde wrote:

> On 11/14/2012 09:48 PM, Jason White wrote:

>> Right now we are taking the interface down and then back up and while this 
>> works for a single process using CAN, I don't think it will work well with 
>> multi-process support.  Would the other processes know that the interface 
>> was taken down?  I haven't decided the route we will go, but maybe an 
> 
> If the other process is doing a read, the read will return. I'm not sure
> what happens to a write after ifdown/ifup, I think it will fail.

Yes. It fails with -ENETDOWN ...
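
A minimal sketch of how an application might tell these failures apart, using
Python's standard socket API. The helper names pack_frame and try_send are
hypothetical. The ENETDOWN case is the one described above; the ENOBUFS case
(interface TX queue full) is my own addition based on how CAN_RAW sockets
commonly behave, so treat it as an assumption.

```python
import errno
import struct

# Layout of a classic struct can_frame: can_id, length, 3 pad bytes, 8 data bytes.
CAN_FRAME_FMT = "=IB3x8s"


def pack_frame(can_id, data):
    """Pack a classic CAN frame (at most 8 data bytes) into 16 bytes."""
    if len(data) > 8:
        raise ValueError("classic CAN carries at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def try_send(sock, frame):
    """Send one frame; return 'sent', 'down' or 'full', re-raise other errors."""
    try:
        sock.send(frame)
        return "sent"
    except OSError as exc:
        if exc.errno == errno.ENETDOWN:
            return "down"   # interface was set down (e.g. by another process)
        if exc.errno == errno.ENOBUFS:
            return "full"   # assumption: TX queue of the interface is full
        raise


# Typical use on Linux (needs an existing CAN interface, so not run here):
#   import socket
#   sock = socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW)
#   sock.bind(("can0",))
#   status = try_send(sock, pack_frame(0x123, b"\x01\x02"))
```

On 'down' the application can wait for the interface to come back instead of
treating the error as fatal.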

Btw. we had a similar use-case:

Only allow the applications to work on the CAN bus once the CAN network
management has established a stable state.

To do so, I implemented the network management e.g. on 'can0'. But the other
applications have been working on a virtual CAN interface 'can0v'.

Once the network management was ok, I cross-linked 'can0' and 'can0v' with two
can-gw rules:

	cangw -A -s can0 -d can0v -e
	cangw -A -s can0v -d can0 -e

I did not use the 'cangw' tool but created the netlink messages myself.

The advantage of this solution is that the can0v interface is always up and
stable - while the can0 interface can be set to up/down.

The traffic is routed on demand only.

The can-gw frame router is part of mainline Linux 3.2+

>> intermediate layer between SocketCAN and user processes to monitor the 
>> flushing.  Does the socket itself have any buffering or are messages 
>> forwarded directly to the driver?
>>
>> Ideally a packet timeout would be implemented in the driver since it knows 
>> when the last message was transmitted (tx interrupt) and could perform the 
>> flush properly.  Has there been any more thought to this scenario?
> 
> This is a scenario, I think the first step will be to implement an
> "abort currently transmitted frame" callback. Note that not all hardware
> supports a tx-complete interrupt. A timeout mechanism can then be
> implemented in the framework in a hardware independent way, e.g. if you
> extend the currently used can_echo mechanism.

Officially the TX-timeout has been removed as the controller just sends out
the CAN frames, when it comes back to life ...

The question is, if the controller gets into the BUS_OFF state and if the
restart-ms option (see ip tool) would help here.

Regards,
Oliver

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-11-15 17:12           ` Oliver Hartkopp
@ 2012-11-15 19:11             ` Jason White
  2012-11-15 21:04               ` Oliver Hartkopp
  2012-11-16 15:13             ` Kurt Van Dijck
  1 sibling, 1 reply; 40+ messages in thread
From: Jason White @ 2012-11-15 19:11 UTC (permalink / raw)
  To: linux-can

Oliver Hartkopp <socketcan@hartkopp.net> wrote on 11/15/2012 11:12:06 AM:

> 
> The can-gw frame router is part of mainline Linux 3.2+
> 
> 

I'm not sure I entirely follow what you are doing.  Here is what I think you
mean.  Please correct me if I'm wrong.  You have implemented a network
management layer that interfaces directly with can0.  All other applications
interact with can0v, which is always up.  When you know the communication
is stable you route between can0 and can0v.  Did I get that right?

So do you take can0 down/up when timeouts occur with messaging?  Is there
any kind of startup delay associated with this?  Is there any kind of
delays going to can0v?

> Officially the TX-timeout has been removed as the controller just sends out
> the CAN frames, when it comes back to life ...
> 
> The question is, if the controller gets into the BUS_OFF state and if the
> restart-ms option (see ip tool) would help here.
> 
> Regards,
> Oliver

What do you mean by the TX-timeout or restart-ms option?

Jason

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-11-15 19:11             ` Jason White
@ 2012-11-15 21:04               ` Oliver Hartkopp
  0 siblings, 0 replies; 40+ messages in thread
From: Oliver Hartkopp @ 2012-11-15 21:04 UTC (permalink / raw)
  To: Jason White; +Cc: linux-can

On 15.11.2012 20:11, Jason White wrote:

> Oliver Hartkopp <socketcan@hartkopp.net> wrote on 11/15/2012 11:12:06 AM:
> 
>>
>> The can-gw frame router is part of mainline Linux 3.2+
>>
>>
> 
> I'm not sure I entirely follow what you are doing.  Here is what I think you
> mean.  Please correct me if I'm wrong.  You have implemented a network
> management layer that interfaces directly with can0.  All other applications
> interact with can0v, which is always up.  When you know the communication
> is stable you route between can0 and can0v.  Did I get that right?


Yep!

> 
> So do you take can0 down/up when timeouts occur with messaging?


No. I did not need to do so. But it is generally possible to down/up the
interfaces without destroying the cangw's routing table.

>  Is there
> any kind of startup delay associated with this?


Of what? The gateway route establishment? 

>  Is there any kind of
> delays going to can0v?


When a CAN frame is received on can0, the NET_RX softirq is invoked to handle
the CAN filters and passes the frame to the cangw ... which passes the frame
directly into the can0v interface.

Usually this is done within the same softirq and you don't have any
noticeable delay.

See measurements in the text here:

http://permalink.gmane.org/gmane.linux.network/205565

>  
>> Officially the TX-timeout has been removed as the controller just sends out
>> the CAN frames, when it comes back to life ...
>>
>> The question is, if the controller gets into the BUS_OFF state and if the
>> restart-ms option (see ip tool) would help here.
>>
>> Regards,
>> Oliver
> 
> What do you mean by the TX-timeout or restart-ms option?


See:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob;f=Documentation/networking/can.txt;h=820f55344edc0a035358fa643bbe79c48d4d887f;hb=HEAD#l817
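
For readers without the document at hand, here is a small sketch of the two
'ip' invocations it describes, wrapped as argument lists so they can be passed
to subprocess. The function names are hypothetical; the command forms
("restart-ms <ms>" and "restart") are the ones given in can.txt. As far as I
know, restart-ms has to be set while the interface is down, and the manual
restart only works when the controller is in BUS_OFF with restart-ms set to 0.

```python
import subprocess


def restart_ms_argv(ifname, millis):
    """argv enabling automatic bus-off recovery after `millis` milliseconds."""
    if millis < 0:
        raise ValueError("restart-ms must not be negative")
    return ["ip", "link", "set", ifname, "type", "can",
            "restart-ms", str(millis)]


def manual_restart_argv(ifname):
    """argv requesting a one-off manual restart of a bus-off controller."""
    return ["ip", "link", "set", ifname, "type", "can", "restart"]


def run(argv):
    """Execute the command; needs root rights and a real CAN interface."""
    subprocess.run(argv, check=True)
```

For example, run(restart_ms_argv("can0", 100)) corresponds to
"ip link set can0 type can restart-ms 100". Both options only address BUS_OFF;
they do not flush frames that are merely waiting for an ack.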

Regards,
Oliver

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-11-15 17:12           ` Oliver Hartkopp
  2012-11-15 19:11             ` Jason White
@ 2012-11-16 15:13             ` Kurt Van Dijck
  2012-11-16 17:09               ` Jason White
  1 sibling, 1 reply; 40+ messages in thread
From: Kurt Van Dijck @ 2012-11-16 15:13 UTC (permalink / raw)
  To: Oliver Hartkopp; +Cc: Marc Kleine-Budde, Jason White, linux-can

On Thu, Nov 15, 2012 at 06:12:06PM +0100, Oliver Hartkopp wrote:
> On 15.11.2012 13:54, Marc Kleine-Budde wrote:
> 
> > On 11/14/2012 09:48 PM, Jason White wrote:
> 
> 
> Officially the TX-timeout has been removed as the controller just sends out
> the CAN frames, when it comes back to life ...
> 
> The question is, if the controller gets into the BUS_OFF state and if the
> restart-ms option (see ip tool) would help here.
FYI:
A CAN chip that sits alone on a proper bus, trying to transmit a frame,
will never go into BUS_OFF. It can only go into BUS_OFF when a bad network
is encountered, i.e. the chip does not see its TX activity on its RX.

I think this scenario (chip alone, going in BUS_OFF) is no different
than regular BUS_OFF, and should be treated likewise.
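
A simplified model of the transmit error counter (TEC) shows why. It assumes
the usual CAN fault confinement rules as I understand them: +8 per failed
transmission, error passive from 128, bus off above 255, and the exception
that an error-passive transmitter seeing only a missing ACK does not increment
further. Function names are hypothetical and receive-side effects are ignored.

```python
ERROR_PASSIVE_FROM = 128   # TEC > 127 means error passive
BUS_OFF_FROM = 256         # TEC > 255 means bus off


def tec_after_unacked(attempts, tec=0):
    """TEC of a lone node after `attempts` transmissions nobody acks."""
    for _ in range(attempts):
        if tec < ERROR_PASSIVE_FROM:
            tec += 8
        # Once error passive, a pure ACK error leaves the TEC unchanged.
    return tec


def tec_after_bit_errors(attempts, tec=0):
    """TEC when every attempt fails with a bit error (bad network)."""
    for _ in range(attempts):
        if tec < BUS_OFF_FROM:
            tec += 8
    return tec


def state(tec):
    """Map a TEC value to the controller state name."""
    if tec >= BUS_OFF_FROM:
        return "bus-off"
    if tec >= ERROR_PASSIVE_FROM:
        return "error-passive"
    return "error-active"
```

In this model the lone node settles at a TEC of 128 and stays error passive
however often it retries, while the bad-network case reaches bus off after 32
failed attempts.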
> 
> Regards,
> Oliver

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-11-16 15:13             ` Kurt Van Dijck
@ 2012-11-16 17:09               ` Jason White
  0 siblings, 0 replies; 40+ messages in thread
From: Jason White @ 2012-11-16 17:09 UTC (permalink / raw)
  To: linux-can

Kurt Van Dijck <kurt.van.dijck <at> eia.be> writes:

> > 
> > Officially the TX-timeout has been removed as the controller just sends out
> > the CAN frames, when it comes back to life ...
> > 
> > The question is, if the controller gets into the BUS_OFF state and if the
> > restart-ms option (see ip tool) would help here.
> FYI:
> A CAN chip that sits alone on a proper bus, trying to transmit a frame,
> will never go into BUS_OFF. It can only go into BUS_OFF when a bad network
> is encountered, i.e. the chip does not see its TX activity on its RX.
> 
> I think this scenario (chip alone, going in BUS_OFF) is no different
> than regular BUS_OFF, and should be treated likewise.
> > 

This sounds like something I would be interested in.  Just a couple of
questions.  What do you mean by TX-timeout has been removed?  Also, does
it work in the error passive state (the scenario where the ECU is alone on
the bus, with no ack from another ECU)?  Does it just reset the CAN
hardware or do the queues get flushed as well?

Jason




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-11-15 12:54         ` Marc Kleine-Budde
  2012-11-15 17:12           ` Oliver Hartkopp
@ 2012-11-15 19:07           ` Jason White
  1 sibling, 0 replies; 40+ messages in thread
From: Jason White @ 2012-11-15 19:07 UTC (permalink / raw)
  To: linux-can

Marc Kleine-Budde <mkl@pengutronix.de> wrote on 11/15/2012 06:54:20 AM:

> > Right now we are taking the interface down and then back up and while this 
> > works for a single process using CAN, I don't think it will work well with 
> > multi-process support.  Would the other processes know that the interface 
> > was taken down?  I haven't decided the route we will go, but maybe an 
> 
> If the other process is doing a read, the read will return. I'm not sure
> what happens to a write after ifdown/ifup, I think it will fail.
> 

But you wouldn't want all processes to implement this timeout, otherwise
they would all time out at the same time and bring the interface down.
Hence the idea of having something lower down in the stack implement it.

> > Ideally a packet timeout would be implemented in the driver since it knows 
> > when the last message was transmitted (tx interrupt) and could perform the 
> > flush properly.  Has there been any more thought to this scenario?
> 
> This is a scenario, I think the first step will be to implement an
> "abort currently transmitted frame" callback. Note that not all hardware
> supports a tx-complete interrupt. A timeout mechanism can then be
> implemented in the framework in a hardware independent way, e.g. if you
> extend the currently used can_echo mechanism.
> 
> Marc

Every piece of CAN hardware I've seen (10+ devices) has a tx interrupt 
that indicates when the message is transmitted.  If the current message 
being transmitted has been in the buffer XX amount of time, then it 
and probably the entire queue should be flushed.  Obviously not all 
hardware supports aborting the current frame, but there are workarounds.
Usually if a CAN message can't transmit within this time its data is way
too old to be sent.  The actual time might be up for discussion and
depends on what is being sent, but for controls it's not that long.

Jason

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-09-17 13:58 What are you doing if the TX buffer overflows? Heinz-Jürgen Oertel
  2012-09-17 19:19 ` Oliver Hartkopp
@ 2012-09-18 12:37 ` Wolfgang Grandegger
  2012-09-18 13:22   ` Marc Kleine-Budde
                     ` (2 more replies)
  1 sibling, 3 replies; 40+ messages in thread
From: Wolfgang Grandegger @ 2012-09-18 12:37 UTC (permalink / raw)
  To: Heinz-Jürgen Oertel; +Cc: linux-can@vger.kernel.org

On 09/17/2012 03:58 PM, Heinz-Jürgen Oertel wrote:
> 
> Hello,
> is there a way to empty the tx buffer ?
> Or read out the occupied size of it
> to get the number of CAN frames waiting for transmission?

As we usually get a TX done interrupt we in principle know how many
messages are still pending in the software and hardware queue but we
will only be able to flush the messages in the software queue because
aborting of messages in the device TX queue is not supported (even if
most controllers can do it). Therefore the only clean way to get rid of
all pending messages is to stop and restart the device (ifconfig down ->
up).

Other protocols seem to provide the ioctl request SIOCINQ to query the
number of untransmitted messages, e.g.:

 http://lxr.linux.no/#linux+v3.5.4/net/packet/af_packet.c#L3385

Should not be a big deal to add it to the CAN ioctl requests.

Wolfgang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-09-18 12:37 ` Wolfgang Grandegger
@ 2012-09-18 13:22   ` Marc Kleine-Budde
  2012-09-18 13:24   ` Marc Kleine-Budde
  2012-09-18 13:25   ` Wolfgang Grandegger
  2 siblings, 0 replies; 40+ messages in thread
From: Marc Kleine-Budde @ 2012-09-18 13:22 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: Heinz-Jürgen Oertel, linux-can@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 1270 bytes --]

On 09/18/2012 02:37 PM, Wolfgang Grandegger wrote:
> On 09/17/2012 03:58 PM, Heinz-Jürgen Oertel wrote:
>>
>> Hello,
>> is there a way to empty the tx buffer ?
>> Or read out the occupied size of it
>> to get the number of CAN frames waiting for transmission?
> 
> As we usually get a TX done interrupt we in principle know how many
> messages are still pending in the software and hardware queue but we
> will only be able to flush the messages in the software queue because
> aborting of messages in the device TX queue is not supported (even if
> most controllers can do it). Therefore the only clean way to get rid of
> all pending messages is to stop and restart the device (ifconfig down ->
> up).
> 
> Other protocols seem to provide the ioctl request SIOCINQ to query the
> number of untransmitted messages, e.g.:
> 
>  http://lxr.linux.no/#linux+v3.5.4/net/packet/af_packet.c#L3385
> 
> Should not be a big deal to add it to the CAN ioctl requests.

+1

Marc

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-09-18 12:37 ` Wolfgang Grandegger
  2012-09-18 13:22   ` Marc Kleine-Budde
@ 2012-09-18 13:24   ` Marc Kleine-Budde
  2012-09-18 13:25   ` Wolfgang Grandegger
  2 siblings, 0 replies; 40+ messages in thread
From: Marc Kleine-Budde @ 2012-09-18 13:24 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: Heinz-Jürgen Oertel, linux-can@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 1131 bytes --]

On 09/18/2012 02:37 PM, Wolfgang Grandegger wrote:
> On 09/17/2012 03:58 PM, Heinz-Jürgen Oertel wrote:
>>
>> Hello,
>> is there a way to empty the tx buffer ?
>> Or read out the occupied size of it
>> to get the number of CAN frames waiting for transmission?
> 
> As we usually get a TX done interrupt we in principle know how many
> messages are still pending in the software and hardware queue but we
> will only be able to flush the messages in the software queue because
> aborting of messages in the device TX queue is not supported (even if
> most controllers can do it). Therefore the only clean way to get rid of
> all pending messages is to stop and restart the device (ifconfig down ->
> up).

This is more of a sledgehammer approach; I outlined a more fine-grained
solution (if the hardware supports it) in my other mail.

Marc

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 259 bytes --]

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: What are you doing if the TX buffer overflows?
  2012-09-18 12:37 ` Wolfgang Grandegger
  2012-09-18 13:22   ` Marc Kleine-Budde
  2012-09-18 13:24   ` Marc Kleine-Budde
@ 2012-09-18 13:25   ` Wolfgang Grandegger
  2 siblings, 0 replies; 40+ messages in thread
From: Wolfgang Grandegger @ 2012-09-18 13:25 UTC (permalink / raw)
  To: Heinz-Jürgen Oertel; +Cc: linux-can@vger.kernel.org

On 09/18/2012 02:37 PM, Wolfgang Grandegger wrote:
> On 09/17/2012 03:58 PM, Heinz-Jürgen Oertel wrote:
>>
>> Hello,
>> is there a way to empty the tx buffer ?
>> Or read out the occupied size of it
>> to get the number of CAN frames waiting for transmission?
> 
> As we usually get a TX done interrupt we in principle know how many
> messages are still pending in the software and hardware queue but we
> will only be able to flush the messages in the software queue because
> aborting of messages in the device TX queue is not supported (even if
> most controllers can do it). Therefore the only clean way to get rid of
> all pending messages is to stop and restart the device (ifconfig down ->
> up).
> 
> Other protocols seem to provide the ioctl request SIOCINQ to query the
> number of untransmitted messages, e.g.:
> 
>  http://lxr.linux.no/#linux+v3.5.4/net/packet/af_packet.c#L3385
> 
> Should not be a big deal to add it to the CAN ioctl requests.

SIOCINQ is for RX. SIOCOUTQ for TX.

Wolfgang.


^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2012-11-16 17:10 UTC | newest]

Thread overview: 40+ messages
2012-09-17 13:58 What are you doing if the TX buffer overflows? Heinz-Jürgen Oertel
2012-09-17 19:19 ` Oliver Hartkopp
2012-09-17 19:26   ` Andrew Bell
2012-09-17 19:33     ` Oliver Hartkopp
2012-09-18 13:36       ` Andrew Bell
2012-09-18 13:46         ` Wolfgang Grandegger
     [not found]   ` <4283CE44E963D741A50240F32D185B9F109AA1@SBSPORT3.portgmbh.local>
2012-09-17 19:40     ` Heinz-Jürgen Oertel
2012-09-18 11:44       ` Kurt Van Dijck
2012-09-18 12:14   ` Wolfgang Grandegger
2012-09-18 12:34     ` Marc Kleine-Budde
2012-09-18 12:49       ` Wolfgang Grandegger
2012-09-18 13:00         ` Marc Kleine-Budde
2012-09-18 13:39           ` Wolfgang Grandegger
2012-09-18 13:42             ` Marc Kleine-Budde
2012-09-18 18:50               ` Wolfgang Grandegger
2012-09-18 19:01                 ` Marc Kleine-Budde
2012-09-18 19:13                   ` Wolfgang Grandegger
2012-09-18 20:20                     ` Kurt Van Dijck
2012-09-19  5:42                       ` Oliver Hartkopp
2012-09-19  7:47                         ` Marc Kleine-Budde
2012-09-19  9:04                         ` Kurt Van Dijck
2012-09-19  6:50                       ` Wolfgang Grandegger
2012-09-19  7:39                         ` Marc Kleine-Budde
2012-09-19  8:10                           ` Wolfgang Grandegger
2012-09-19  7:31                       ` Marc Kleine-Budde
2012-09-19 10:18           ` Steffen Rose
     [not found]           ` <34567791.oZ5dyCnTQA@lisa>
2012-09-19 10:26             ` [Socketcan-users] " Kurt Van Dijck
2012-09-19 11:32               ` Steffen Rose
2012-11-14 20:48       ` Jason White
2012-11-15 12:54         ` Marc Kleine-Budde
2012-11-15 17:12           ` Oliver Hartkopp
2012-11-15 19:11             ` Jason White
2012-11-15 21:04               ` Oliver Hartkopp
2012-11-16 15:13             ` Kurt Van Dijck
2012-11-16 17:09               ` Jason White
2012-11-15 19:07           ` Jason White
2012-09-18 12:37 ` Wolfgang Grandegger
2012-09-18 13:22   ` Marc Kleine-Budde
2012-09-18 13:24   ` Marc Kleine-Budde
2012-09-18 13:25   ` Wolfgang Grandegger
