data received but not detected

netdev.vger.kernel.org archive mirror

* data received but not detected
@ 2008-06-17 22:08 Travis Stratman
  2008-06-17 22:27 ` Stephen Hemminger
  2008-06-17 22:31 ` Ben Greear
  0 siblings, 2 replies; 25+ messages in thread
From: Travis Stratman @ 2008-06-17 22:08 UTC (permalink / raw)
  To: netdev

Hello,

(I sent this earlier today but it doesn't look like it made it, I
apologize if it gets through multiple times)

I am working on an application that uses a fairly simple UDP protocol to
send data between two embedded devices. I'm noticing an issue with an
initial test that was written where datagrams are received but not seen
by the recvfrom() call until more data arrives after it. As of right now
the test case does not implement any type of lost packet protection or
other flow control, which is what makes the issue so noticeable.

The target for this code is a board using the Atmel AT91SAM9260 ARM
processor. I have tested with 2.6.20 and 2.6.25 on this board.

The test consists of two applications with the following pseudo code
(msg_size = 127, 9003/9005 are the UDP ports used):

"client app"
while(1) {
    sendto(9003, &msg_size, 4bytes);
    sendto(9003, buffer, msg_size);
    recvfrom(9005, &msg_size, 4bytes);
    recvfrom(9005, buffer, msg_size);
}

"server app"
while(1) {
    recvfrom(9003, &msg_size, 4bytes);
    recvfrom(9003, buffer, msg_size);
    sendto(9005, &msg_size, 4bytes);
    sendto(9005, buffer, msg_size);
}
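
The pseudo code above can be made concrete along the following lines. This
is only a minimal sketch under stated assumptions: the helper names
(udp_bind_loopback, send_msg, recv_msg) are hypothetical, the sockets are
bound to the loopback address purely for illustration, and the length is
sent in network byte order, which the pseudo code does not specify.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket bound to 127.0.0.1:port
 * (port 0 lets the kernel pick one). Returns the descriptor or -1. */
int udp_bind_loopback(uint16_t port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send one message as two datagrams: a 4-byte length, then the payload.
 * Returns 0 on success, -1 on error. */
int send_msg(int fd, const struct sockaddr_in *to, const void *buf, uint32_t len)
{
    uint32_t wire_len = htonl(len); /* byte order is an assumption */

    assert(buf != NULL);
    if (sendto(fd, &wire_len, sizeof(wire_len), 0,
               (const struct sockaddr *)to, sizeof(*to)) != (ssize_t)sizeof(wire_len))
        return -1;
    if (sendto(fd, buf, len, 0,
               (const struct sockaddr *)to, sizeof(*to)) != (ssize_t)len)
        return -1;
    return 0;
}

/* Receive one message: block for the length datagram, then for the
 * payload datagram. Returns the payload length or -1 on error. */
ssize_t recv_msg(int fd, void *buf, size_t cap)
{
    uint32_t wire_len;
    uint32_t len;
    ssize_t n = recvfrom(fd, &wire_len, sizeof(wire_len), 0, NULL, NULL);

    if (n != (ssize_t)sizeof(wire_len))
        return -1;
    len = ntohl(wire_len);
    if (len > cap)
        return -1;
    n = recvfrom(fd, buf, len, 0, NULL, NULL);
    return n == (ssize_t)len ? n : -1;
}
```

In this sketch the client loop would call send_msg() towards port 9003 and
then recv_msg() on a socket bound to 9005; the server would do the reverse.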

As long as the server is started first and no packets are lost or out of
order, the client and server should continue indefinitely. When run
between two boards on a local gigabit switch, the application will run
smoothly most of the time, but I periodically see delays of 30 seconds
or more where one of the applications is waiting for the second datagram
to arrive before sending the next packet. Wireshark shows that the data
was sent very shortly after the first datagram, and no packets are ever
lost, ifconfig reports no collisions, overruns, or errors.

When I run the application between two identical devices on a cross-over
cable, data is transferred for a few seconds after which everything
freezes until I send a ping between the two boards in the background.
This forces the communication to start up again for a few seconds before
they hang up again. If I insert a delay between the sendto() calls with
usleep(1) (CONFIG_HZ is 100 so this could be up to 10ms) everything
seems to work. Using a busy loop I was able to determine that
approximately 500 us delay is required to "fix" the issue but even then
I saw one hang up in several hours of testing.
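
For reference, the two delays mentioned above can be sketched as follows.
This is an illustration only: the function names are hypothetical, and
tick_us() simply assumes that a tick-based sleep is rounded to one timer
tick (1/CONFIG_HZ), which is where the "up to 10ms" figure comes from.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in microseconds. */
uint64_t now_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/* Spin until at least `us` microseconds have elapsed.
 * Returns the time actually spent, in microseconds. */
uint64_t busy_wait_us(uint64_t us)
{
    uint64_t start = now_us();
    uint64_t elapsed;

    do {
        elapsed = now_us() - start;
    } while (elapsed < us);
    return elapsed;
}

/* Length of one timer tick in microseconds for a given CONFIG_HZ
 * (assumption: a tick-based sleep cannot be shorter than this). */
uint64_t tick_us(unsigned int hz)
{
    assert(hz > 0);
    return 1000000u / hz;
}
```

With CONFIG_HZ at 100, tick_us(100) gives 10000 us, while busy_wait_us(500)
corresponds to the roughly 500 us busy loop between the two sendto() calls.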

At first I thought that this was the "rotting packet" case that the NAPI
references where an IRQ is missed on Rx, so I rewrote the poll function
in the macb driver to try to fix this but I didn't see any noticeable
differences. If I enable debugging in the MACB driver it slows things
down enough to make everything work.

Next, I tested on a Cirrus ep93xx based board (with 2.6.20) and a 133
MHz x86 board (with 2.6.14.7) and noticed the same issue when run
between the target and my PC. When run between my 2.6.23 2GHz PC and
another similar PC, the issue does not show up (these both use Intel
NICs). I also tested on the local loopback and things worked as
expected.

I would very much appreciate any suggestions that anyone could give to
point me in the right direction.

Thanks in advance,

Travis


* Re: data received but not detected
  2008-06-17 22:08 data received but not detected Travis Stratman
@ 2008-06-17 22:27 ` Stephen Hemminger
  2008-06-17 22:40   ` Travis Stratman
  2008-06-17 22:31 ` Ben Greear
  1 sibling, 1 reply; 25+ messages in thread
From: Stephen Hemminger @ 2008-06-17 22:27 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

On Tue, 17 Jun 2008 17:08:58 -0500
Travis Stratman <tstratman@emacinc.com> wrote:

> Hello,
> 
> (I sent this earlier today but it doesn't look like it made it, I
> apologize if it gets through multiple times)
> 
> I am working on an application that uses a fairly simple UDP protocol to
> send data between two embedded devices. I'm noticing an issue with an
> initial test that was written where datagrams are received but not seen
> by the recvfrom() call until more data arrives after it. As of right now
> the test case does not implement any type of lost packet protection or
> other flow control, which is what makes the issue so noticeable.
> 
> The target for this code is a board using the Atmel AT91SAM9260 ARM
> processor. I have tested with 2.6.20 and 2.6.25 on this board.
> 
> The test consists of two applications with the following pseudo code
> (msg_size = 127, 9003/9005 are the UDP ports used):
> 
> "client app"
> while(1) {
>     sendto(9003, &msg_size, 4bytes);
>     sendto(9003, buffer, msg_size);
>     recvfrom(9005, &msg_size, 4bytes);
>     recvfrom(9005, buffer, msg_size);
> }
> 
> "server app"
> while(1) {
>     recvfrom(9003, &msg_size, 4bytes);
>     recvfrom(9003, buffer, msg_size);
>     sendto(9005, &msg_size, 4bytes);
>     sendto(9005, buffer, msg_size);
> }
> 
> As long as the server is started first and no packets are lost or out of
> order, the client and server should continue indefinitely. When run
> between two boards on a local gigabit switch, the application will run
> smoothly most of the time, but I periodically see delays of 30 seconds
> or more where one of the applications is waiting for the second datagram
> to arrive before sending the next packet. Wireshark shows that the data
> was sent very shortly after the first datagram, and no packets are ever
> lost, ifconfig reports no collisions, overruns, or errors.
> 
> When I run the application between two identical devices on a cross-over
> cable, data is transferred for a few seconds after which everything
> freezes until I send a ping between the two boards in the background.
> This forces the communication to start up again for a few seconds before
> they hang up again. If I insert a delay between the sendto() calls with
> usleep(1) (CONFIG_HZ is 100 so this could be up to 10ms) everything
> seems to work. Using a busy loop I was able to determine that
> approximately 500 us delay is required to "fix" the issue but even then
> I saw one hang up in several hours of testing.
> 
> At first I thought that this was the "rotting packet" case that the NAPI
> references where an IRQ is missed on Rx, so I rewrote the poll function
> in the macb driver to try to fix this but I didn't see any noticeable
> differences. If I enable debugging in the MACB driver it slows things
> down enough to make everything work.
> 
> Next, I tested on a Cirrus ep93xx based board (with 2.6.20) and a 133
> MHz x86 board (with 2.6.14.7) and noticed the same issue when run
> between the target and my PC. When run between my 2.6.23 2GHz PC and
> another similar PC, the issue does not show up (these both use Intel
> NICs). I also tested on the local loopback and things worked as
> expected.
> 
> I would very much appreciate any suggestions that anyone could give to
> point me in the right direction.
> 
> Thanks in advance,
> 
> Travis

I am unfamiliar with interrupts on the ARM. Are IRQs level or edge triggered?
NAPI won't work if interrupts are edge-triggered.


* Re: data received but not detected
  2008-06-17 22:27 ` Stephen Hemminger
@ 2008-06-17 22:40   ` Travis Stratman
  0 siblings, 0 replies; 25+ messages in thread
From: Travis Stratman @ 2008-06-17 22:40 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

On Tue, 2008-06-17 at 15:27 -0700, Stephen Hemminger wrote:
> On Tue, 17 Jun 2008 17:08:58 -0500
> Travis Stratman <tstratman@emacinc.com> wrote:
> > 
> > I am working on an application that uses a fairly simple UDP protocol to
> > send data between two embedded devices. I'm noticing an issue with an
> > initial test that was written where datagrams are received but not seen
> > by the recvfrom() call until more data arrives after it.
> > 
> > The target for this code is a board using the Atmel AT91SAM9260 ARM
> > processor. I have tested with 2.6.20 and 2.6.25 on this board.
> > 
> > 
> > When I run the application between two identical devices on a cross-over
> > cable, data is transferred for a few seconds after which everything
> > freezes until I send a ping between the two boards in the background.
> > This forces the communication to start up again for a few seconds before
> > they hang up again.
> > 
> > At first I thought that this was the "rotting packet" case that the NAPI
> > references where an IRQ is missed on Rx, so I rewrote the poll function
> > in the macb driver to try to fix this but I didn't see any noticeable
> > differences.
> > 
> > I would very much appreciate any suggestions that anyone could give to
> > point me in the right direction.
> > 
> > Thanks in advance,
> > 
> > Travis
> 
> I am unfamiliar with interrupts on the ARM. Are IRQ's level or edge triggered?
> NAPI won't work if interrupts are edge-triggered.

Interrupts in this case are set to be level triggered. It has an
interrupt controller that allows them to be configured several ways. The
EMAC driver for the at91sam9260 is in drivers/net/macb.[ch]. Also note
that the 133 MHz x86 that I tested on was an STPC Elite (it also
displayed the same behavior).

Thanks,

Travis




* Re: data received but not detected
  2008-06-17 22:08 data received but not detected Travis Stratman
  2008-06-17 22:27 ` Stephen Hemminger
@ 2008-06-17 22:31 ` Ben Greear
  2008-06-17 22:58   ` Travis Stratman
  1 sibling, 1 reply; 25+ messages in thread
From: Ben Greear @ 2008-06-17 22:31 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

Travis Stratman wrote:
> Hello,
> 
> (I sent this earlier today but it doesn't look like it made it, I
> apologize if it gets through multiple times)
> 
> I am working on an application that uses a fairly simple UDP protocol to
> send data between two embedded devices. I'm noticing an issue with an
> initial test that was written where datagrams are received but not seen
> by the recvfrom() call until more data arrives after it. As of right now
> the test case does not implement any type of lost packet protection or
> other flow control, which is what makes the issue so noticeable.

UDP packets can be lost anywhere..including in the receive buffer
after it has been received by the NIC.

You probably just need to write your code smarter to use non-blocking
IO and deal with packet loss.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: data received but not detected
  2008-06-17 22:31 ` Ben Greear
@ 2008-06-17 22:58   ` Travis Stratman
  2008-06-17 23:45     ` Ben Greear
  2008-06-18  6:28     ` Evgeniy Polyakov
  0 siblings, 2 replies; 25+ messages in thread
From: Travis Stratman @ 2008-06-17 22:58 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev

On Tue, 2008-06-17 at 15:31 -0700, Ben Greear wrote:
> Travis Stratman wrote:
> > I am working on an application that uses a fairly simple UDP protocol to
> > send data between two embedded devices. I'm noticing an issue with an
> > initial test that was written where datagrams are received but not seen
> > by the recvfrom() call until more data arrives after it. As of right now
> > the test case does not implement any type of lost packet protection or
> > other flow control, which is what makes the issue so noticeable.
> 
> UDP packets can be lost anywhere..including in the receive buffer
> after it has been received by the NIC.
> 
> You probably just need to write your code smarter to use non-blocking
> IO and deal with packet loss.

Thanks Ben.

I understand that there is no guarantee of anything with UDP, but it
seems to me that if there is a packet in the buffer (it shows up after
another packet comes in behind it) the system should know about it,
right?

The code will eventually deal with packet loss / retransmission (it is
actually a customer's application, not my own). Development was only
stopped at this point because this behavior was discovered. However, if
the final application behaves in the same way that things are going now,
the application would need to timeout on read, request retransmission,
receive the original packet (that was just stuck in the buffer
somewhere) and the retransmitted packet and decide which to toss every
couple of seconds. This is a whole lot more retransmissions than I would
expect to see on a cross-over cable, especially from receiving and
processing only two small packets at one pass.

If this is what's required I will relay that to the customer or
implement some type of workaround to force a poll or flush. However, if
there is possibly a bug or race condition that is not getting handled
properly it would be better to try and find it.

Thanks,
Travis

> 
> Thanks,
> Ben
> 


* Re: data received but not detected
  2008-06-17 22:58   ` Travis Stratman
@ 2008-06-17 23:45     ` Ben Greear
  2008-06-19 22:53       ` Travis Stratman
  2008-06-18  6:28     ` Evgeniy Polyakov
  1 sibling, 1 reply; 25+ messages in thread
From: Ben Greear @ 2008-06-17 23:45 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

Travis Stratman wrote:
> On Tue, 2008-06-17 at 15:31 -0700, Ben Greear wrote:
>> Travis Stratman wrote:
>>> I am working on an application that uses a fairly simple UDP protocol to
>>> send data between two embedded devices. I'm noticing an issue with an
>>> initial test that was written where datagrams are received but not seen
>>> by the recvfrom() call until more data arrives after it. As of right now
>>> the test case does not implement any type of lost packet protection or
>>> other flow control, which is what makes the issue so noticeable.
>> UDP packets can be lost anywhere..including in the receive buffer
>> after it has been received by the NIC.
>>
>> You probably just need to write your code smarter to use non-blocking
>> IO and deal with packet loss.
> 
> Thanks Ben.
> 
> I understand that there is no guarantee of anything with UDP, but it
> seems to me that if there is a packet in the buffer (it shows up after
> another packet comes in behind it) the system should know about it,
> right?

Ahh, I see what you mean.

I'm afraid I don't know anything about your NIC driver, and it would
seem to be implicated.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: data received but not detected
  2008-06-17 23:45     ` Ben Greear
@ 2008-06-19 22:53       ` Travis Stratman
  2008-06-19 23:08         ` Ben Greear
  2008-06-22  9:16         ` James Chapman
  0 siblings, 2 replies; 25+ messages in thread
From: Travis Stratman @ 2008-06-19 22:53 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev

On Tue, 2008-06-17 at 16:45 -0700, Ben Greear wrote:
> Travis Stratman wrote:
> > On Tue, 2008-06-17 at 15:31 -0700, Ben Greear wrote:
> >> Travis Stratman wrote:
> >>> I am working on an application that uses a fairly simple UDP protocol to
> >>> send data between two embedded devices. I'm noticing an issue with an
> >>> initial test that was written where datagrams are received but not seen
> >>> by the recvfrom() call until more data arrives after it. As of right now
> >>> the test case does not implement any type of lost packet protection or
> >>> other flow control, which is what makes the issue so noticeable.
> >> UDP packets can be lost anywhere..including in the receive buffer
> >> after it has been received by the NIC.
> >>
> >> You probably just need to write your code smarter to use non-blocking
> >> IO and deal with packet loss.
> > 
> > Thanks Ben.
> > 
> > I understand that there is no guarantee of anything with UDP, but it
> > seems to me that if there is a packet in the buffer (it shows up after
> > another packet comes in behind it) the system should know about it,
> > right?
> 
> Ahh, I see what you mean.
> 
> I'm afraid I don't know anything about your NIC driver, and it would
> seem to be implicated.

I agree, but it also troubles me that the x86 board that I noticed the
same issue on uses the realtek (8139too) driver, so I'm not completely
convinced that the issue is at the NIC level.

I was able to do some more extensive testing today with the macb (Atmel
Ethernet MAC controller) driver and noticed that the
netif_rx_schedule_prep function is returning false at times in the
interrupt handler. In the code below, the printk shows up during heavy
traffic, though it only happens a handful of times. (The else block is
code that I have added to the driver while debugging).

if (status & MACB_RX_INT_FLAGS) {
    if (netif_rx_schedule_prep(dev)) {
        /*
         * There's no point taking any more interrupts
         * until we have processed the buffers
         */
        macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
        dev_dbg(&bp->pdev->dev, "scheduling RX softirq\n");
        __netif_rx_schedule(dev);
    } else {
        printk(KERN_ERR "%s: Driver bug: interrupt while in polling mode\n", dev->name);
        /* disable interrupts */
        macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
    }
}

From what I can tell of this function, it should only return false if
polling is already enabled for the interface (though I haven't looked
much deeper than the inline for netif_rx_schedule_prep()).

I went through the poll function, and actually rewrote the whole thing
according to the guidelines in the NAPI documentation, and I can't see
any way for it to get out of poll with interrupts enabled without first
removing itself from the polling list.

Can someone who knows more about this give me some more insight into
what might be happening here? I can post the poll function or a patch to
macb.c if it would be helpful.

Thanks,

Travis


> 
> Thanks,
> Ben
> 



* Re: data received but not detected
  2008-06-19 22:53       ` Travis Stratman
@ 2008-06-19 23:08         ` Ben Greear
  2008-06-22  9:16         ` James Chapman
  1 sibling, 0 replies; 25+ messages in thread
From: Ben Greear @ 2008-06-19 23:08 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

Travis Stratman wrote:
> On Tue, 2008-06-17 at 16:45 -0700, Ben Greear wrote:
>> Travis Stratman wrote:
>>> On Tue, 2008-06-17 at 15:31 -0700, Ben Greear wrote:
>>>> Travis Stratman wrote:
>>>>> I am working on an application that uses a fairly simple UDP protocol to
>>>>> send data between two embedded devices. I'm noticing an issue with an
>>>>> initial test that was written where datagrams are received but not seen
>>>>> by the recvfrom() call until more data arrives after it. As of right now
>>>>> the test case does not implement any type of lost packet protection or
>>>>> other flow control, which is what makes the issue so noticeable.
>>>> UDP packets can be lost anywhere..including in the receive buffer
>>>> after it has been received by the NIC.
>>>>
>>>> You probably just need to write your code smarter to use non-blocking
>>>> IO and deal with packet loss.
>>> Thanks Ben.
>>>
>>> I understand that there is no guarantee of anything with UDP, but it
>>> seems to me that if there is a packet in the buffer (it shows up after
>>> another packet comes in behind it) the system should know about it,
>>> right?
>> Ahh, I see what you mean.
>>
>> I'm afraid I don't know anything about your NIC driver, and it would
>> seem to be implicated.
> 
> I agree, but it also troubles me that the x86 board that I noticed the
> same issue on uses the realtek (8139too) driver, so I'm not completely
> convinced that the issue is at the NIC level.

If you run a sniffer on the machine that is dropping/delaying receiving
the pkt, you can probably determine whether it is a driver issue or some
other stack issue:

If you see the pkt in the sniffer, but not in the application, then
it's probably a udp stack issue or at least not the driver.
Otherwise, the driver must be holding onto the packet.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: data received but not detected
  2008-06-19 22:53       ` Travis Stratman
  2008-06-19 23:08         ` Ben Greear
@ 2008-06-22  9:16         ` James Chapman
  2008-07-07 21:56           ` Travis Stratman
  1 sibling, 1 reply; 25+ messages in thread
From: James Chapman @ 2008-06-22  9:16 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

Travis Stratman wrote:
> I was able to do some more extensive testing today with the macb (Atmel
> Ethernet MAC controller) driver and noticed that the
> netif_rx_schedule_prep function is returning false at times in the
> interrupt handler. In the code below, the printk shows up during heavy
> traffic, though it only happens a handful of times. (The else block is
> code that I have added to the driver while debugging).
> 
> if (status & MACB_RX_INT_FLAGS) {
>     if (netif_rx_schedule_prep(dev)) {
>     /*
>      * There's no point taking any more interrupts
>      * until we have processed the buffers
>      */
>         macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
>         dev_dbg(&bp->pdev->dev, "scheduling RX softirq\n");
>         __netif_rx_schedule(dev);
>     } else {
>         printk(KERN_ERR "%s: Driver bug: interrupt while in polling mode\n", dev->name);
>         /* disable interrupts */
>         macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
>     }
> }
> 
> From what I can tell of this function, it should only return false if
> polling is already enabled for the interface (though I haven't looked
> much deeper than the inline for netif_rx_schedule_prep()).
> 
> I went through the poll function, and actually rewrote the whole thing
> according to the guidelines in the NAPI documentation, and I can't see
> any way for it to get out of poll with interrupts enabled without first
> removing itself from the polling list.
> 
> Can someone who knows more about this give me some more insight into
> what might be happening here? I can post the poll function or a patch to
> macb.c if it would be helpful.

I looked at macb.c and can see that it uses napi only for rx work, 
leaving tx interrupts enabled at all times. The interrupt handler reads 
the device interrupt status when a tx interrupt happens and may find rx 
bits also set. As a result, your netif_rx_schedule_prep() will sometimes 
return false because napi might be already scheduled. The code you have 
above (i.e. the "driver bug" case) is wrong.

The napi code in the in-tree version looks suspect because it seems to 
enable rx interrupts unconditionally regardless of whether napi rx 
processing is complete.
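
For comparison, the usual shape of a poll routine is to re-enable rx
interrupts only once the ring has been drained within the budget. The
following is a userspace model under that assumption, not macb code;
fake_nic and poll_rx are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of the state a poll routine has to keep consistent. */
struct fake_nic {
    int pending;         /* frames waiting in the rx ring */
    bool rx_irq_enabled; /* device rx interrupt mask state */
    bool in_poll_list;   /* still scheduled for polling */
};

/* Process up to `budget` frames. Interrupts are re-enabled only when the
 * ring was drained before the budget ran out; otherwise the device stays
 * in polled mode and is polled again. Returns the number of frames handled. */
int poll_rx(struct fake_nic *nic, int budget)
{
    int work_done = 0;

    assert(budget > 0);
    while (work_done < budget && nic->pending > 0) {
        nic->pending--; /* stands in for passing one frame up the stack */
        work_done++;
    }
    if (work_done < budget) {
        nic->in_poll_list = false;  /* leave polled mode first */
        nic->rx_irq_enabled = true; /* then unmask rx interrupts */
    }
    return work_done;
}
```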

It might help to post a patch here showing all of your changes.


-- 
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development



* Re: data received but not detected
  2008-06-22  9:16         ` James Chapman
@ 2008-07-07 21:56           ` Travis Stratman
  2008-07-08  9:37             ` James Chapman
  0 siblings, 1 reply; 25+ messages in thread
From: Travis Stratman @ 2008-07-07 21:56 UTC (permalink / raw)
  To: James Chapman; +Cc: netdev

On Sun, 2008-06-22 at 10:16 +0100, James Chapman wrote:
> 
> I looked at macb.c and can see that it uses napi only for rx work, 
> leaving tx interrupts enabled at all times. The interrupt handler reads 
> the device interrupt status when a tx interrupt happens and may find rx 
> bits also set. As a result, your netif_rx_schedule_prep() will sometimes 
> return false because napi might be already scheduled. The code you have 
> above (i.e. the "driver bug" case) is wrong.

Thanks for the reply James.

That is somewhat confusing to me because once an rx interrupt is
detected and the rx interrupts are disabled the rx bits should not be
set in the interrupt status register until they are re-enabled again
after polling has finished. Can you explain your point a little more?

From what I can tell, an interrupt would need to come in between when
the ISR is read and when the rx bits are tested and rx ints are disabled
for it to be there the next time around in the while(status) loop.
Looking at it that way, it is completely possible.

> The napi code in the in-tree version looks suspect because it seems to 
> enable rx interrupts unconditionally regardless of whether napi rx 
> processing is complete.

Correct, this is one of the reasons that I rewrote the driver poll
function. There are a couple of other issues that I noticed as well.

> It might help to post a patch here showing all of your changes.

I did this earlier today. I should get a patch against 2.6.25 up tomorrow,
which will be a little more useful.

Thanks!

Travis



* Re: data received but not detected
  2008-07-07 21:56           ` Travis Stratman
@ 2008-07-08  9:37             ` James Chapman
  2008-07-15 20:46               ` Travis Stratman
  0 siblings, 1 reply; 25+ messages in thread
From: James Chapman @ 2008-07-08  9:37 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

Travis Stratman wrote:
> On Sun, 2008-06-22 at 10:16 +0100, James Chapman wrote:
>> I looked at macb.c and can see that it uses napi only for rx work, 
>> leaving tx interrupts enabled at all times. The interrupt handler reads 
>> the device interrupt status when a tx interrupt happens and may find rx 
>> bits also set. As a result, your netif_rx_schedule_prep() will sometimes 
>> return false because napi might be already scheduled. The code you have 
>> above (i.e. the "driver bug" case) is wrong.
> 
> Thanks for the reply James.
> 
> That is somewhat confusing to me because once an rx interrupt is
> detected and the rx interrupts are disabled the rx bits should not be
> set in the interrupt status register until they are re-enabled again
> after polling has finished. Can you explain your point a little more?

The rx and tx status are flagged in the same status register. The bits 
are set regardless of whether rx or tx interrupts are enabled in the 
device. So when you handle a tx interrupt, the interrupt routine will 
read the status register and may see rx bits also set.

You could mask the status register value that you read to ignore rx bits 
if rx interrupts are disabled (NAPI polled mode). But to be honest, I 
think it is simpler to handle rx _and_ tx work in the NAPI poll handler 
so you only get interrupts when not in NAPI polled mode. See tg3.c or 
e100.c for example.
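
The first option (masking the status value) might be sketched like this.
The bit values below are placeholders rather than the real MACB register
layout, and effective_status is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, NOT the real MACB interrupt status layout. */
#define RX_BITS 0x0001u
#define TX_BITS 0x0002u

/* Drop rx bits from the raw status while rx interrupts are disabled
 * (polled mode), so the interrupt handler only acts on sources that
 * are actually enabled. */
uint32_t effective_status(uint32_t raw_status, bool rx_irq_enabled)
{
    assert((RX_BITS & TX_BITS) == 0);
    return rx_irq_enabled ? raw_status : (raw_status & ~RX_BITS);
}
```

With this in place, a tx interrupt that arrives during polled mode would
see only the tx bits and never reach the rx scheduling branch.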

-- 
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development


* Re: data received but not detected
  2008-07-08  9:37             ` James Chapman
@ 2008-07-15 20:46               ` Travis Stratman
  0 siblings, 0 replies; 25+ messages in thread
From: Travis Stratman @ 2008-07-15 20:46 UTC (permalink / raw)
  To: James Chapman; +Cc: netdev

On Tue, 2008-07-08 at 10:37 +0100, James Chapman wrote:
> 
> The rx and tx status are flagged in the same status register. The bits 
> are set regardless of whether rx or tx interrupts are enabled in the 
> device. So when you handle a tx interrupt, the interrupt routine will 
> read the status register and may see rx bits also set.

That makes sense, I was making an incorrect assumption.
> 
> You could mask the status register value that you read to ignore rx bits 
> if rx interrupts are disabled (NAPI polled mode). But to be honest, I 
> think it is simpler to handle rx _and_ tx work in the NAPI poll handler 
> so you only get interrupts when not in NAPI polled mode. See tg3.c or 
> e100.c for example.

I will take a look at modifying the driver to use NAPI for tx.

Thanks,

Travis



* Re: data received but not detected
  2008-06-17 22:58   ` Travis Stratman
  2008-06-17 23:45     ` Ben Greear
@ 2008-06-18  6:28     ` Evgeniy Polyakov
  2008-06-19 23:10       ` Travis Stratman
  1 sibling, 1 reply; 25+ messages in thread
From: Evgeniy Polyakov @ 2008-06-18  6:28 UTC (permalink / raw)
  To: Travis Stratman; +Cc: Ben Greear, netdev

Hi.

On Tue, Jun 17, 2008 at 05:58:26PM -0500, Travis Stratman (tstratman@emacinc.com) wrote:
> I understand that there is no guarantee of anything with UDP, but it
> seems to me that if there is a packet in the buffer (it shows up after
> another packet comes in behind it) the system should know about it,
> right?

Did you run wireshark on the receiver or the sender?
Check the MIB stats to see whether the packet was dropped because of low
memory, an incorrect checksum, or some other problematic field in the UDP
header. The sending side can see the packet as perfectly correct even when
that is not the case on the receiver. If the packet was delivered to the
receiving host, the UDP input path is rather simple, so there are no places
that can race with something and thus lose the packet.
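
One way to check the UDP MIB counters from a program is to read the "Udp:"
header and value lines from /proc/net/snmp. The sketch below assumes that
two-line layout; the helper names are hypothetical, and which counters
exist (for example InErrors or RcvbufErrors) depends on the kernel version.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Look up `field` in a header/value line pair such as the two "Udp:"
 * lines of /proc/net/snmp. Returns 0 and stores the value in *out,
 * or -1 if the field is not present. */
int snmp_lookup(const char *header, const char *values,
                const char *field, unsigned long long *out)
{
    char h[512], v[512];
    char *hs, *vs, *hn, *vn;

    assert(out != NULL);
    if (strlen(header) >= sizeof(h) || strlen(values) >= sizeof(v))
        return -1;
    strcpy(h, header);
    strcpy(v, values);
    hn = strtok_r(h, " \n", &hs);
    vn = strtok_r(v, " \n", &vs);
    while (hn && vn) { /* walk both lines column by column */
        if (strcmp(hn, field) == 0) {
            *out = strtoull(vn, NULL, 10);
            return 0;
        }
        hn = strtok_r(NULL, " \n", &hs);
        vn = strtok_r(NULL, " \n", &vs);
    }
    return -1;
}

/* Read one UDP counter from /proc/net/snmp. Returns 0 on success. */
int read_udp_counter(const char *field, unsigned long long *out)
{
    char header[512], values[512];
    FILE *f = fopen("/proc/net/snmp", "r");
    int rc = -1;

    if (!f)
        return -1;
    while (fgets(header, sizeof(header), f)) {
        if (strncmp(header, "Udp:", 4) != 0)
            continue;
        if (fgets(values, sizeof(values), f))
            rc = snmp_lookup(header, values, field, out);
        break;
    }
    fclose(f);
    return rc;
}
```

Sampling a counter before and after a hang and comparing the two values
would show whether the receiver is counting drops or errors during it.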

-- 
	Evgeniy Polyakov


* Re: data received but not detected
  2008-06-18  6:28     ` Evgeniy Polyakov
@ 2008-06-19 23:10       ` Travis Stratman
       [not found]         ` <20080620060219.GA22784@2ka.mipt.ru>
  0 siblings, 1 reply; 25+ messages in thread
From: Travis Stratman @ 2008-06-19 23:10 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: Ben Greear, netdev

On Wed, 2008-06-18 at 10:28 +0400, Evgeniy Polyakov wrote:
> Hi.
> 
> On Tue, Jun 17, 2008 at 05:58:26PM -0500, Travis Stratman (tstratman@emacinc.com) wrote:
> > I understand that there is no guarantee of anything with UDP, but it
> > seems to me that if there is a packet in the buffer (it shows up after
> > another packet comes in behind it) the system should know about it,
> > right?
> 
> Did you run wireshark on the receiver or the sender?
> Check the MIB stats to see whether the packet was dropped because of low
> memory, an incorrect checksum, or some other problematic field in the UDP
> header. The sending side can see the packet as perfectly correct even when
> that is not the case on the receiver. If the packet was delivered to the
> receiving host, the UDP input path is rather simple, so there are no places
> that can race with something and thus lose the packet.
> 

Initially, I had run wireshark on my PC and connected it to one of the
embedded boards (the issue still shows up in this case). I did some more
testing today where I ran tcpdump on both of the boards connected with a
cross-over cable until the application froze. What I was able to find
was that the first 1 or 2 hangups are corrected after 4 or 5 seconds
because the boards send an ARP request when data communication stops.
This causes communication to start up again. No packets are ever lost or
corrupted, they just don't appear to the application until something
else happens on the network.

Here is a snippet of the packet trace surrounding the hangup (these are
from the same session, but the clocks on the two boards were not set to
the same time):
(On the "server" -- sbc41):
22:53:57.763656 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
22:53:57.764000 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
22:53:57.764229 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 4
22:53:57.764387 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
22:54:01.034522 arp who-has sbc041.emacinc.com tell sbc042.emacinc.com
22:54:01.034642 arp reply sbc041.emacinc.com is-at 00:50:c2:0d:6e:00 (oui Unknown)
22:54:01.035585 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
22:54:01.035736 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
22:54:01.036095 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 4
22:54:01.036263 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
22:54:01.036793 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
--
22:54:01.803384 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
22:54:01.803773 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
22:54:01.803916 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
22:54:01.804274 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 4
22:54:01.804440 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
22:54:06.034670 arp who-has sbc042.emacinc.com tell sbc041.emacinc.com
22:54:06.034995 arp reply sbc042.emacinc.com is-at 00:50:c2:0e:0b:ac (oui Unknown)
22:54:06.035597 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
22:54:06.035750 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
22:54:06.036088 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 4
22:54:06.036249 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
22:54:06.036790 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4

(On the "client" -- sbc42):
17:18:03.141864 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
17:18:03.142195 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
17:18:03.142627 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 4
17:18:06.412741 arp who-has sbc041.emacinc.com tell sbc042.emacinc.com
17:18:06.413035 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
17:18:06.413122 arp reply sbc041.emacinc.com is-at 00:50:c2:0d:6e:00 (oui Unknown)
17:18:06.413793 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
17:18:06.413955 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
17:18:06.414500 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 4
17:18:06.414656 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
17:18:06.415001 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
--
17:18:07.181787 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
17:18:07.181995 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
17:18:07.182136 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
17:18:07.182676 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 4
17:18:11.413067 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
17:18:11.413160 arp who-has sbc042.emacinc.com tell sbc041.emacinc.com
17:18:11.413221 arp reply sbc042.emacinc.com is-at 00:50:c2:0e:0b:ac (oui Unknown)
17:18:11.413813 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4
17:18:11.413973 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 127
17:18:11.414493 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 4
17:18:11.414642 IP sbc041.emacinc.com.3072 > sbc042.emacinc.com.9005: UDP, length 127
17:18:11.414998 IP sbc042.emacinc.com.3072 > sbc041.emacinc.com.9003: UDP, length 4

Thanks,

Travis



[parent not found: <20080620060219.GA22784@2ka.mipt.ru>]

* Re: data received but not detected
       [not found]         ` <20080620060219.GA22784@2ka.mipt.ru>
@ 2008-06-20 17:10           ` Travis Stratman
  2008-06-20 17:25             ` Evgeniy Polyakov
  0 siblings, 1 reply; 25+ messages in thread
From: Travis Stratman @ 2008-06-20 17:10 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev

On Fri, 2008-06-20 at 10:02 +0400, Evgeniy Polyakov wrote:
> On Thu, Jun 19, 2008 at 06:10:29PM -0500, Travis Stratman (tstratman@emacinc.com) wrote:
> > Initially, I had run wireshark on my PC and connected it to one of the
> > embedded boards (the issue still shows up in this case). I did some more
> > testing today where I ran tcpdump on both of the boards connected with a
> > cross-over cable until the application froze. What I was able to find
> > was that the first 1 or 2 hangups are corrected after 4 or 5 seconds
> > because the boards send an ARP request when data communication stops.
> > This causes communication to start up again. No packets are ever lost or
> > corrupted, they just don't appear to the application until something
> > else happens on the network.

Also note that it just needs to be data that the board receives, it does
not have to come in on the same port (ARPs, ICMP pings, broadcast
packets all force the data to be recognized).

> This looks like a missing or unaccounted wakeup (probably by the
> application). Does your application use poll()? If not, can you add it to
> the receiving loop and check its output? Or even add a signal handler for
> a harmless signal like USR1 and put a poll() call with a 0 timeout there,
> so that when the system freezes you can check what poll() returns by
> sending that signal.
Thanks. 

Initially the application was just using a blocking recvfrom() call. I
changed to poll() and non-blocking recvfrom (MSG_DONTWAIT) today, and
poll() always times out when the lockup occurs. I also tried using an
FIONREAD ioctl on the socket and actually saw fairly decent results
using code like this:
--function--
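#include <sys/ioctl.h>

/* Returns the number of bytes ready to read on the socket, or 0 if none. */
static inline int is_data(int socket)
{
	int num_bytes = 0;

	if (ioctl(socket, FIONREAD, &num_bytes) < 0)
		return 0;	/* treat an ioctl failure as "no data" */
	return num_bytes;
}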
-- then before each recvfrom --
while (!is_data(recv_socket)) {usleep(1);}

It ran for about an hour before it froze. I'm not sure why this would
give me better results than poll(), that is something that I'm looking
into.

I also noticed that when I run iperf between the two ARM boards on a
xover cable, I _always_ see at least one dropped packet reported by
iperf. For bandwidths that are below 50m it is almost always 1, but
never 0 dropped. When I run it between my PC and an ARM board, I don't
(generally) see any dropped packets until I exceed 50m for the bandwidth
setting. This probably doesn't mean anything, but it is interesting.


Thanks,

Travis



* Re: data received but not detected
  2008-06-20 17:10           ` Travis Stratman
@ 2008-06-20 17:25             ` Evgeniy Polyakov
  2008-06-20 17:41               ` Travis Stratman
  0 siblings, 1 reply; 25+ messages in thread
From: Evgeniy Polyakov @ 2008-06-20 17:25 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

On Fri, Jun 20, 2008 at 12:10:59PM -0500, Travis Stratman (tstratman@emacinc.com) wrote:
> Initially the application was just using a blocking recvfrom() call. I
> changed to poll() and non-blocking recvfrom (MSG_DONTWAIT) today, and
> poll() always times out when the lockup occurs. I also tried using an

Can you confirm that IPSTATS_MIB_INRECEIVES MIB does not increase when
packets are sent to the frozen machine, but userspace does not receive
it?

Please also clarify this bit again: you see packets in tcpdump running
on the receiving (frozen) host, but do not see them in userspace? After
this freeze happens, the recvmsg/poll system calls do not respond until
some activity on the NIC happens? I.e. recv/poll 'unfreeze' only when
something is received (arp, icmp reply), or after sending too?

-- 
	Evgeniy Polyakov

* Re: data received but not detected
  2008-06-20 17:25             ` Evgeniy Polyakov
@ 2008-06-20 17:41               ` Travis Stratman
  2008-06-20 17:54                 ` Evgeniy Polyakov
  0 siblings, 1 reply; 25+ messages in thread
From: Travis Stratman @ 2008-06-20 17:41 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev

On Fri, 2008-06-20 at 21:25 +0400, Evgeniy Polyakov wrote:
> On Fri, Jun 20, 2008 at 12:10:59PM -0500, Travis Stratman (tstratman@emacinc.com) wrote:
> > Initially the application was just using a blocking recvfrom() call. I
> > changed to poll() and non-blocking recvfrom (MSG_DONTWAIT) today, and
> > poll() always times out when the lockup occurs. I also tried using an
> 
> Can you confirm that IPSTATS_MIB_INRECEIVES MIB does not increase when
> packets are sent to the frozen machine, but userspace does not receive
> it?

I will come up with a way to test for this in the application.

> Please also clarify this bit again: you see packets in tcpdump running
> on the receiving (frozen) host, but do not see them in userspace? After
> this freeze happens, the recvmsg/poll system calls do not respond until
> some activity on the NIC happens? I.e. recv/poll 'unfreeze' only when
> something is received (arp, icmp reply), or after sending too?

I see the packets being sent in tcpdump from the sending application,
but they don't show up in the receiving side until something else comes
in behind them. For example if you look at the client and server trace
that I sent yesterday side by side you will see how the timings line up.
The same packets are shown in each trace. Data needs to be received by
the board to unlock it, sending does not seem to have any effect.

I appreciate the help.

-Travis


* Re: data received but not detected
  2008-06-20 17:41               ` Travis Stratman
@ 2008-06-20 17:54                 ` Evgeniy Polyakov
  2008-06-20 18:17                   ` Travis Stratman
  0 siblings, 1 reply; 25+ messages in thread
From: Evgeniy Polyakov @ 2008-06-20 17:54 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

On Fri, Jun 20, 2008 at 12:41:04PM -0500, Travis Stratman (tstratman@emacinc.com) wrote:
> > Please also clarify this bit again: you see packets in tcpdump running
> > on the receiving (frozen) host, but do not see them in userspace? After
> > this freeze happens, the recvmsg/poll system calls do not respond until
> > some activity on the NIC happens? I.e. recv/poll 'unfreeze' only when
> > something is received (arp, icmp reply), or after sending too?
> 
> I see the packets being sent in tcpdump from the sending application,
> but they don't show up in the receiving side until something else comes
> in behind them. For example if you look at the client and server trace
> that I sent yesterday side by side you will see how the timings line up.
> The same packets are shown in each trace. Data needs to be received by
> the board to unlock it, sending does not seem to have any effect.

So, packets are actually received by the host, since you see them in the
receiving host's tcpdump, but they do not reach the socket queue. Please
check the UDP_MIB_INERRORS MIB. You can do that via netstat -s.
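
If the check has to run inside the test application rather than by hand, the same counters can be read from /proc/net/snmp, where UDP_MIB_INERRORS appears as "InErrors" on the "Udp:" line and IPSTATS_MIB_INRECEIVES as "InReceives" on the "Ip:" line. The following is only a sketch; the helper names snmp_counter and next_token are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the next space-separated token in *cursor, or NULL at end of line. */
static char *next_token(char **cursor)
{
	char *s = *cursor + strspn(*cursor, " \n");
	char *end;

	if (*s == '\0')
		return NULL;
	end = s + strcspn(s, " \n");
	if (*end != '\0')
		*end++ = '\0';
	*cursor = end;
	return s;
}

/*
 * Look up one counter in /proc/net/snmp-style text. Each protocol has a
 * header line ("Udp: InDatagrams NoPorts InErrors ...") followed by a
 * value line ("Udp: 10 0 0 ..."). Returns -1 if the counter is not found.
 */
long snmp_counter(FILE *f, const char *proto, const char *field)
{
	char names[1024], values[1024];
	size_t plen = strlen(proto);

	while (fgets(names, sizeof(names), f)) {
		char *n, *v, *ncur = names, *vcur = values;

		if (strncmp(names, proto, plen) != 0 || names[plen] != ':')
			continue;
		if (!fgets(values, sizeof(values), f))
			break;
		/* Walk the header and value lines in lockstep. */
		while ((n = next_token(&ncur)) && (v = next_token(&vcur))) {
			if (strcmp(n, field) == 0)
				return strtol(v, NULL, 10);
		}
		break;
	}
	return -1;
}
```

A caller would open /proc/net/snmp with fopen(), call snmp_counter(f, "Udp", "InErrors") before and after a hangup, and compare the two values.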

-- 
	Evgeniy Polyakov

* Re: data received but not detected
  2008-06-20 17:54                 ` Evgeniy Polyakov
@ 2008-06-20 18:17                   ` Travis Stratman
  2008-06-20 18:23                     ` Evgeniy Polyakov
  0 siblings, 1 reply; 25+ messages in thread
From: Travis Stratman @ 2008-06-20 18:17 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev

On Fri, 2008-06-20 at 21:54 +0400, Evgeniy Polyakov wrote:
> > 
> > I see the packets being sent in tcpdump from the sending application,
> > but they don't show up in the receiving side until something else comes
> > in behind them. For example if you look at the client and server trace
> > that I sent yesterday side by side you will see how the timings line up.
> > The same packets are shown in each trace. Data needs to be received by
> > the board to unlock it, sending does not seem to have any effect.
> 
> So, packets are actually received by the host, since you see them in the
> receiving host's tcpdump, but they do not reach the socket queue. Please
> check the UDP_MIB_INERRORS MIB. You can do that via netstat -s.

Let me clarify this again... I see the packet being sent at the expected
time from the sender on the tcpdump. The packet does not show up in
tcpdump or in the application on the receive side. When some other data
is received by the receiver (i.e. ARP), the missing packet shows up in
the tcpdump and in the application at the same time. So the delay shows
up in the tcpdump as well. It seems to me that everything is pointing to
the packet being in the DMA buffer but the controller driver not knowing
anything about it.

netstat -s shows 0 UDP errors on both systems.

Thanks,

Travis

* Re: data received but not detected
  2008-06-20 18:17                   ` Travis Stratman
@ 2008-06-20 18:23                     ` Evgeniy Polyakov
  2008-06-20 21:06                       ` Travis Stratman
  0 siblings, 1 reply; 25+ messages in thread
From: Evgeniy Polyakov @ 2008-06-20 18:23 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

On Fri, Jun 20, 2008 at 01:17:06PM -0500, Travis Stratman (tstratman@emacinc.com) wrote:
> Let me clarify this again... I see the packet being sent at the expected
> time from the sender on the tcpdump. The packet does not show up in
> tcpdump or in the application on the receive side. When some other data
> is received by the receiver (i.e. ARP), the missing packet shows up in
> the tcpdump and in the application at the same time. So the delay shows
> up in the tcpdump as well. It seems to me that everything is pointing to
> the packet being in the DMA buffer but the controller driver not knowing
> anything about it.

Argh. Ok, then please check that NAPI polling is called and that RX
interrupts happen for the driver.

-- 
	Evgeniy Polyakov

* Re: data received but not detected
  2008-06-20 18:23                     ` Evgeniy Polyakov
@ 2008-06-20 21:06                       ` Travis Stratman
  2008-06-21  7:12                         ` Evgeniy Polyakov
  0 siblings, 1 reply; 25+ messages in thread
From: Travis Stratman @ 2008-06-20 21:06 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev

On Fri, 2008-06-20 at 22:23 +0400, Evgeniy Polyakov wrote:
> On Fri, Jun 20, 2008 at 01:17:06PM -0500, Travis Stratman (tstratman@emacinc.com) wrote:
> > Let me clarify this again... I see the packet being sent at the expected
> > time from the sender on the tcpdump. The packet does not show up in
> > tcpdump or in the application on the receive side. When some other data
> > is received by the receiver (i.e. ARP), the missing packet shows up in
> > the tcpdump and in the application at the same time. So the delay shows
> > up in the tcpdump as well. It seems to me that everything is pointing to
> > the packet being in the DMA buffer but the controller driver not knowing
> > anything about it.
> 
> Argh. Ok, then please check that NAPI polling is called and that RX
> interrupts happen for the driver.

This is what I have been focusing on. I'm still trying to figure out a
good way to see if the interrupt is triggered for a specific packet
because I have no way of determining which packet it will freeze on and
if I put any prints in the interrupt handler or poll function it slows
things down enough that the problem disappears.

In the meantime I was testing why the FIONREAD ioctl made such a big
difference and I found that if I insert a usleep(1) between the two
receive calls, the problem does not occur. During my testing before I
had put a usleep() between the send calls, which fixed the issue for me
and led me to assume that an IRQ was being missed if the packets come in
too close to each other.

The fact that inserting a sleep between the two receive calls fixes the
issue makes this seem less like a driver issue. The only hypothesis that
I have been able to come up with so far is that calling recv() somehow
masks the interrupts momentarily so that if the packet comes in at
exactly the same time as the recv or poll() is called, the system does
not know anything about it, to the point that it does not even show on
the packet trace. I have no idea how this could happen at this point.

Thanks,

Travis

* Re: data received but not detected
  2008-06-20 21:06                       ` Travis Stratman
@ 2008-06-21  7:12                         ` Evgeniy Polyakov
  2008-07-07 21:10                           ` Travis Stratman
  0 siblings, 1 reply; 25+ messages in thread
From: Evgeniy Polyakov @ 2008-06-21  7:12 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

On Fri, Jun 20, 2008 at 04:06:12PM -0500, Travis Stratman (tstratman@emacinc.com) wrote:
> This is what I have been focusing on. I'm still trying to figure out a
> good way to see if the interrupt is triggered for a specific packet
> because I have no way of determining which packet it will freeze on and
> if I put any prints in the interrupt handler or poll function it slows
> things down enough that the problem disappears.

It may or may not be a driver issue; it could be the way the driver works
with NAPI. Or the driver just loses an interrupt (or has a weird interrupt
coalescing/mitigation feature) under load. What about adding a counter to
the interrupt handler and to the NAPI polling callback, with the ability to
clear/read it via a driver ioctl (or just clear it when the first small
packet is received and dump it when the module is unloaded), so you can
determine via tcpdump how many packets were actually received and what the
counter is. It can be a trivial issue with comparing work_done against the
budget using < versus <=, which was a frequent error in drivers for a
while, and with your protocol it can be fatal until the next received
packet.

-- 
	Evgeniy Polyakov

* Re: data received but not detected
  2008-06-21  7:12                         ` Evgeniy Polyakov
@ 2008-07-07 21:10                           ` Travis Stratman
  2008-07-07 21:25                             ` Evgeniy Polyakov
  0 siblings, 1 reply; 25+ messages in thread
From: Travis Stratman @ 2008-07-07 21:10 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev

On Sat, 2008-06-21 at 11:12 +0400, Evgeniy Polyakov wrote:
> 
> It may or may not be a driver issue; it could be the way the driver works
> with NAPI. Or the driver just loses an interrupt (or has a weird interrupt
> coalescing/mitigation feature) under load. What about adding a counter to
> the interrupt handler and to the NAPI polling callback, with the ability to
> clear/read it via a driver ioctl (or just clear it when the first small
> packet is received and dump it when the module is unloaded), so you can
> determine via tcpdump how many packets were actually received and what the
> counter is. It can be a trivial issue with comparing work_done against the
> budget using < versus <=, which was a frequent error in drivers for a
> while, and with your protocol it can be fatal until the next received
> packet.

I have not been able to work on this recently but I have some time to
look at it again now. Before I stopped working on it, I implemented a
workaround using a private ioctl() that was able to correct the issue
after a hangup and (I believe) illustrates that a missed interrupt is
causing the problem.

I added a private ioctl call to the macb driver that does the following:
1. read rx status register
2. if rx status is true, schedule poll (same block of code as in
interrupt handler)
3. return

Then, in my userspace code, I called poll() with a timeout and called
the ioctl() if poll timed out with no data. What I found was that the rx
status register always reported a packet present but not acknowledged
when poll timed out (on the board that missed the packet). Scheduling a
poll in the driver forced it to read this new packet and the userspace
code was able to continue from there.

If the macb poll is executed, the receive status register will be
cleared, so somewhere along the way an interrupt is being missed (or
like you suggested some type of coalescing is happening).

Below is a patch of the changes that I have made to the driver including
my rewrite of the poll() function and additional private ioctl()
workaround. This patch is against 2.6.20 with some of the patches from
http://maxim.org.za/sam9.html , most of which have been added to the
vanilla kernel in the more current versions that I have tested (i.e.
2.6.25). This shows the changes that I have made more easily, but I can
provide the full patch from vanilla if it would be more helpful (i.e.
this one will not apply cleanly to a vanilla kernel). I wasn't sure
which would be the best to post.

Thanks,

Travis

Index: linux-2.6.20.AT91/drivers/net/macb.c
===================================================================
--- linux-2.6.20.AT91/drivers/net/macb.c        (revision 646)
+++ linux-2.6.20.AT91/drivers/net/macb.c        (working copy)
@@ -8,6 +8,9 @@
  * published by the Free Software Foundation.
  */

+//#define DEBUG 1
+#undef DEBUG
+
 #include <linux/clk.h>
 #include <linux/module.h>
 #include <linux/moduleparam.h>
@@ -429,10 +432,9 @@
        int received = 0;
        unsigned int tail = bp->rx_tail;
        int first_frag = -1;
+       u32 addr, ctrl;

        for (; budget > 0; tail = NEXT_RX(tail)) {
-               u32 addr, ctrl;
-
                rmb();
                addr = bp->rx_ring[tail].addr;
                ctrl = bp->rx_ring[tail].ctrl;
@@ -470,55 +472,55 @@
 static int macb_poll(struct net_device *dev, int *budget)
 {
        struct macb *bp = netdev_priv(dev);
-       int orig_budget, work_done, retval = 0;
+       int orig_budget, work_done;
        u32 status;

        status = macb_readl(bp, RSR);
-       macb_writel(bp, RSR, status);
-
        if (!status) {
                /*
                 * This may happen if an interrupt was pending before
                 * this function was called last time, and no packets
                 * have been received since.
                 */
-               netif_rx_complete(dev);
-               goto out;
+               goto done; /* close polling, reset interrupts, return 0 */
        }
+       do {
+               macb_writel(bp, RSR, status);
+               dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
+                       (unsigned long)status, *budget);
+               if (!(status & MACB_BIT(REC))) {
+                       dev_warn(&bp->pdev->dev,
+                                "No RX buffers complete, status = %02lx\n",
+                                (unsigned long)status);
+                       goto done; /* re-enable ints and return 0 */
+               }

-       dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
-               (unsigned long)status, *budget);
+               orig_budget = *budget;
+               if (orig_budget > dev->quota)
+                       orig_budget = dev->quota;
+               work_done = macb_rx(bp, orig_budget);
+
+               *budget -= work_done;
+               dev->quota -= work_done;
+
+               if (work_done >= orig_budget) {
+                       goto hitquota; /* DONT touch interrupt enable register */
+               }
+       } while ((status = macb_readl(bp, RSR)));

-       if (!(status & MACB_BIT(REC))) {
-               dev_warn(&bp->pdev->dev,
-                        "No RX buffers complete, status = %02lx\n",
-                        (unsigned long)status);
-               netif_rx_complete(dev);
-               goto out;
-       }
+done:
+       /* close polling */
+       netif_rx_complete(dev);
+       /* enable interrupts */
+       macb_writel(bp, IER, MACB_RX_INT_FLAGS);

-       orig_budget = *budget;
-       if (orig_budget > dev->quota)
-               orig_budget = dev->quota;
+       return 0;

-       work_done = macb_rx(bp, orig_budget);
-       if (work_done < orig_budget) {
-               netif_rx_complete(dev);
-               retval = 0;
-       } else {
-               retval = 1;
-       }
-
-       /*
-        * We've done what we can to clean the buffers. Make sure we
-        * get notified when new packets arrive.
-        */
-out:
-       macb_writel(bp, IER, MACB_RX_INT_FLAGS);
-
        /* TODO: Handle errors */

-       return retval;
+hitquota:
+       printk(KERN_ERR "hit quota!!\n");
+       return 1;
 }

 static irqreturn_t macb_interrupt(int irq, void *dev_id)
@@ -545,7 +547,7 @@
                }

                if (status & MACB_RX_INT_FLAGS) {
-                       if (netif_rx_schedule_prep(dev)) {
+                       if (likely(netif_rx_schedule_prep(dev))) {
                                /*
                                 * There's no point taking any more interrupts
                                 * until we have processed the buffers
@@ -553,7 +555,12 @@
                                macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
                                dev_dbg(&bp->pdev->dev, "scheduling RX softirq\n");
                                __netif_rx_schedule(dev);
-                       }
+                       }
+                       //else {
+                       //      printk(KERN_ERR "%s: Driver bug: interrupt while in polling mode\n", dev->name);
+                               /* disable interrupts */
+                               //macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
+                       //}
                }

                if (status & (MACB_BIT(TCOMP) | MACB_BIT(ISR_TUND)))
@@ -564,7 +571,7 @@
                 * add that if/when we get our hands on a full-blown MII PHY.
                 */

-               if (status & MACB_BIT(HRESP)) {
+               if (unlikely(status & MACB_BIT(HRESP))) {
                        /*
                         * TODO: Reset the hardware, and maybe move the printk
                         * to a lower-priority context as well (work queue?)
@@ -572,7 +579,6 @@
                        printk(KERN_ERR "%s: DMA bus error: HRESP not OK\n",
                               dev->name);
                }
-
                status = macb_readl(bp, ISR);
        }

@@ -987,11 +993,34 @@
 static int macb_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 {
        struct macb *bp = netdev_priv(dev);
+       int rc;

        if (!netif_running(dev))
                return -EINVAL;
-
-       return generic_mii_ioctl(&bp->mii, if_mii(rq), cmd, NULL);
+
+       rc = generic_mii_ioctl(&bp->mii, if_mii(rq), cmd, NULL);
+       /* custom private commands */
+       switch(cmd)
+       {
+       /**
+        * SIOCDEVPRIVATE is used to force the driver to examine the RSR and
+        * check for missed data. If an IRQ is missed, calling this ioctl will
+        * force polling to be re-enabled.
+        */
+       case SIOCDEVPRIVATE:
+               if (macb_readl(bp, RSR)) {
+                       if (likely(netif_rx_schedule_prep(dev))) {
+                               /* disable RX interrupts */
+                               macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
+                               dev_dbg(&bp->pdev->dev, "scheduling RX softirq\n");
+                               __netif_rx_schedule(dev);
+                       }
+               }
+               return 0;
+       default:
+               break; /* not a private command: return the MII result */
+       }
+       return rc;
 }

 static ssize_t macb_mii_show(const struct class_device *cd, char *buf,
@@ -1283,3 +1312,4 @@
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("Atmel MACB Ethernet driver");
 MODULE_AUTHOR("Haavard Skinnemoen <hskinnemoen@atmel.com>");
+



* Re: data received but not detected
  2008-07-07 21:10                           ` Travis Stratman
@ 2008-07-07 21:25                             ` Evgeniy Polyakov
  2008-07-15 20:43                               ` Travis Stratman
  0 siblings, 1 reply; 25+ messages in thread
From: Evgeniy Polyakov @ 2008-07-07 21:25 UTC (permalink / raw)
  To: Travis Stratman; +Cc: netdev

On Mon, Jul 07, 2008 at 04:10:30PM -0500, Travis Stratman (tstratman@emacinc.com) wrote:
> I have not been able to work on this recently but I have some time to
> look at it again now. Before I stopped working on it, I implemented a
> workaround using a private ioctl() that was able to correct the issue
> after a hangup and (I believe) illustrates that a missed interrupt is
> causing the problem.
> 
> I added a private ioctl call to the macb driver that does the following:
> 1. read rx status register
> 2. if rx status is true, schedule poll (same block of code as in
> interrupt handler)
> 3. return
> 
> Then, in my userspace code, I called poll() with a timeout and called
> the ioctl() if poll timed out with no data. What I found was that the rx
> status register always reported a packet present but not acknowledged
> when poll timed out (on the board that missed the packet). Scheduling a
> poll in the driver forced it to read this new packet and the userspace
> code was able to continue from there.

Can it be a missed acknowledge when ->poll() hits the quota limit?

> If the macb poll is executed, the receive status register will be
> cleared, so somewhere along the way an interrupt is being missed (or
> like you suggested some type of coalescing is happening).
> 
> Below is a patch of the changes that I have made to the driver including
> my rewrite of the poll() function and additional private ioctl()
> workaround. This patch is against 2.6.20 with some of the patches from
> http://maxim.org.za/sam9.html , most of which have been added to the
> vanilla kernel in the more current versions that I have tested (i.e.
> 2.6.25). This shows the changes that I have made more easily, but I can
> provide the full patch from vanilla if it would be more helpful (i.e.
> this one will not apply cleanly to a vanilla kernel). I wasn't sure
> which would be the best to post.

It is quite a clear patch, but please provide the needed part on top of
the .25 tree.
-- 
	Evgeniy Polyakov

* Re: data received but not detected
  2008-07-07 21:25                             ` Evgeniy Polyakov
@ 2008-07-15 20:43                               ` Travis Stratman
  0 siblings, 0 replies; 25+ messages in thread
From: Travis Stratman @ 2008-07-15 20:43 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev

On Tue, 2008-07-08 at 01:25 +0400, Evgeniy Polyakov wrote:
> 
> Can it be a missed acknowledge when ->poll() hits the quota limit?

I don't think so. The poll() function explicitly checks if (work_done >=
budget) and if so returns work_done without calling netif_rx_complete().
Also, I added a printk to detect if the budget was ever reached, and it
is not (which makes sense in this case because only two packets are sent
at a time, 4 B followed by 127 B).

> It is quite a clear patch, but please provide the needed part on top of
> the .25 tree.

I have pasted my patch against vanilla 2.6.25 below. It displays the
same behavior as 2.6.20. The largest change is in the poll() function
and the added private ioctl().

I was unsure about one thing: in the poll() function, I used a loop to
keep receiving packets until either the budget was exhausted or the
receive status register claimed that no packets were available. This
requires the possibility for multiple calls to macb_rx() which takes the
budget as a parameter. I assumed that the budget passed to macb_rx()
should be decremented by the amount received in the previous call rather
than the original budget, so I used a few extra variables
(pass_work_done - the amount received in this pass of the loop,
orig_budget - the original budget that poll() was called with) to keep
track of this (I couldn't see a better way at the time). Was this a
correct assumption?
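
To make the accounting in question concrete, here is a small userspace model of the loop. It is not driver code: struct fake_nic, fake_rx() and model_poll() are stand-ins invented for illustration. It assumes the NAPI convention used since 2.6.24, where poll() returns the total work done, never more than the budget, and only completes polling when that total is below the budget.

```c
/* Userspace stand-in for the RX ring; not the macb data structures. */
struct fake_nic {
	int pending;	/* frames waiting in the ring */
	int per_pass;	/* frames a single rx pass can see */
};

/* Stand-in for macb_rx(): take at most `budget` frames, return how many. */
static int fake_rx(struct fake_nic *nic, int budget)
{
	int n = nic->pending < nic->per_pass ? nic->pending : nic->per_pass;

	if (n > budget)
		n = budget;
	nic->pending -= n;
	return n;
}

/*
 * Model of the looping poll: every pass gets only the budget that is
 * still left, so the total can never exceed the original budget.
 * *complete is set to 1 when polling would be closed (total < budget).
 */
int model_poll(struct fake_nic *nic, int budget, int *complete)
{
	int work_done = 0;

	while (work_done < budget) {
		int pass_work_done = fake_rx(nic, budget - work_done);

		if (pass_work_done == 0)
			break;		/* nothing left to receive */
		work_done += pass_work_done;
	}
	*complete = work_done < budget;
	return work_done;
}
```

Passing the remaining budget (budget - work_done) to each pass is what keeps the total within the limit. Passing the original budget every time would let a second pass push the total past it.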

Thank you,

Travis

--- linux-2.6.25/drivers/net/macb.c     2008-04-16 21:49:44.000000000 -0500
+++ linux-2.6.25.AT91/drivers/net/macb.c        2008-07-15 14:49:56.000000000 -0500
@@ -455,10 +455,9 @@
        int received = 0;
        unsigned int tail = bp->rx_tail;
        int first_frag = -1;
+       u32 addr, ctrl;

        for (; budget > 0; tail = NEXT_RX(tail)) {
-               u32 addr, ctrl;
-
                rmb();
                addr = bp->rx_ring[tail].addr;
                ctrl = bp->rx_ring[tail].ctrl;
@@ -497,47 +496,48 @@
 {
        struct macb *bp = container_of(napi, struct macb, napi);
        struct net_device *dev = bp->dev;
-       int work_done;
+       int work_done, orig_budget, pass_work_done;
        u32 status;

        status = macb_readl(bp, RSR);
-       macb_writel(bp, RSR, status);

        work_done = 0;
+       pass_work_done = 0;
+       orig_budget = budget;
        if (!status) {
                /*
                 * This may happen if an interrupt was pending before
                 * this function was called last time, and no packets
                 * have been received since.
                 */
-               netif_rx_complete(dev, napi);
-               goto out;
-       }
-
-       dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
-               (unsigned long)status, budget);
-
-       if (!(status & MACB_BIT(REC))) {
-               dev_warn(&bp->pdev->dev,
-                        "No RX buffers complete, status = %02lx\n",
-                        (unsigned long)status);
-               netif_rx_complete(dev, napi);
                goto out;
        }
+       do {
+               macb_writel(bp, RSR, status);
+               dev_dbg(&bp->pdev->dev, "poll: status = %08lx, budget = %d\n",
+                       (unsigned long)status, budget);
+
+               if (unlikely(!(status & MACB_BIT(REC)))) {
+                       dev_warn(&bp->pdev->dev,
+                               "No RX buffers complete, status = %02lx\n",
+                               (unsigned long)status);
+                       goto out;
+               }

-       work_done = macb_rx(bp, budget);
-       if (work_done < budget)
-               netif_rx_complete(dev, napi);
-
-       /*
-        * We've done what we can to clean the buffers. Make sure we
-        * get notified when new packets arrive.
-        */
+               pass_work_done = macb_rx(bp, budget);
+               work_done += pass_work_done;
+               budget -= pass_work_done;
+               if (unlikely(work_done >= orig_budget)) {
+                       printk("macb hit quota\n");
+                       goto hitquota;
+               }
+       } while ((status = macb_readl(bp, RSR)));
 out:
+       netif_rx_complete(dev, napi);
        macb_writel(bp, IER, MACB_RX_INT_FLAGS);

        /* TODO: Handle errors */
-
+hitquota:
        return work_done;
 }

@@ -562,7 +562,7 @@
                }

                if (status & MACB_RX_INT_FLAGS) {
-                       if (netif_rx_schedule_prep(dev, &bp->napi)) {
+                       if (likely(netif_rx_schedule_prep(dev, &bp->napi))) {
                                /*
                                 * There's no point taking any more interrupts
                                 * until we have processed the buffers
@@ -582,7 +582,7 @@
                 * add that if/when we get our hands on a full-blown MII PHY.
                 */

-               if (status & MACB_BIT(HRESP)) {
+               if (unlikely(status & MACB_BIT(HRESP))) {
                        /*
                         * TODO: Reset the hardware, and maybe move the printk
                         * to a lower-priority context as well (work queue?)
@@ -1074,6 +1074,7 @@
 {
        struct macb *bp = netdev_priv(dev);
        struct phy_device *phydev = bp->phy_dev;
+       int rc;

        if (!netif_running(dev))
                return -EINVAL;
@@ -1081,7 +1082,32 @@
        if (!phydev)
                return -ENODEV;

-       return phy_mii_ioctl(phydev, if_mii(rq), cmd);
+       rc = phy_mii_ioctl(phydev, if_mii(rq), cmd);
+
+       /* custom private commands */
+       switch(cmd)
+       {
+       /**
+       * SIOCDEVPRIVATE is used to force the driver to examine the RSR and
+       * check for missed data. If an IRQ is missed, calling this ioctl will
+       * force polling to be re-enabled.
+       */
+       case SIOCDEVPRIVATE:
+               if (likely(macb_readl(bp, RSR))) {
+                       spin_lock(&bp->lock);
+                       if (likely(netif_rx_schedule_prep(dev, &bp->napi))) {
+                               /* disable RX interrupts */
+                               macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
+                               dev_dbg(&bp->pdev->dev, "scheduling RX softirq via ioctl\n");
+                               __netif_rx_schedule(dev, &bp->napi);
+                       }
+                       spin_unlock(&bp->lock);
+               }
+               return 0;
+       default:
+               break;  /* not a private command: return the MII result */
+       }
+       return rc;
 }

 static int __init macb_probe(struct platform_device *pdev)




Thread overview: 25+ messages
2008-06-17 22:08 data received but not detected Travis Stratman
2008-06-17 22:27 ` Stephen Hemminger
2008-06-17 22:40   ` Travis Stratman
2008-06-17 22:31 ` Ben Greear
2008-06-17 22:58   ` Travis Stratman
2008-06-17 23:45     ` Ben Greear
2008-06-19 22:53       ` Travis Stratman
2008-06-19 23:08         ` Ben Greear
2008-06-22  9:16         ` James Chapman
2008-07-07 21:56           ` Travis Stratman
2008-07-08  9:37             ` James Chapman
2008-07-15 20:46               ` Travis Stratman
2008-06-18  6:28     ` Evgeniy Polyakov
2008-06-19 23:10       ` Travis Stratman
     [not found]         ` <20080620060219.GA22784@2ka.mipt.ru>
2008-06-20 17:10           ` Travis Stratman
2008-06-20 17:25             ` Evgeniy Polyakov
2008-06-20 17:41               ` Travis Stratman
2008-06-20 17:54                 ` Evgeniy Polyakov
2008-06-20 18:17                   ` Travis Stratman
2008-06-20 18:23                     ` Evgeniy Polyakov
2008-06-20 21:06                       ` Travis Stratman
2008-06-21  7:12                         ` Evgeniy Polyakov
2008-07-07 21:10                           ` Travis Stratman
2008-07-07 21:25                             ` Evgeniy Polyakov
2008-07-15 20:43                               ` Travis Stratman
