* Ethernet Bridging on 8260
From: jtm @ 2001-04-10 16:41 UTC
  To: linuxppc-embedded


We're trying to enable ethernet bridging between FCC2 and FCC3 on an
8260 (EST's SBC8260 eval board), and running into problems.

Our test is to send the board continuous 1500-byte ethernet packets
from a SmartBits traffic generator. Transmitting on one port,
everything is fine until we send more than about 7500 packets back to
back.

Below that value, all of the values reported by ifconfig are correct,
and memory use (reported by free) is constant. Above that value,
the 'TX packets' value reported from ifconfig does not match what
the SmartBits says it received, and according to free, we start using
more memory that never gets released. If we send 3-5 bursts of 8000
packets, we start getting output like:
	Out of Memory: Killed process 16 (sh)
	Out of Memory: Killed process 18 (inetd)
	__alloc_pages: 0-order allocation failed.
	eth2: Memory squeeze, dropping packet

And so on.

Another test that we have tried is sending traffic on both ports. As
long as we send on both ports, we can generate traffic all the way
up to 100 Mbps without killing the kernel. (It can't keep up, but it
doesn't die). Above 95 Mbps, if we stop transmitting on one of the
ports, the system dies with error messages like those above.

We are using kernel 2.4.3, with a modified driver from BitKeeper's 2.5
kernel. The differences are:
	mii_discover_phy_poll() is commented out, and
		cep->phy_speed is set to 100,
		cep->phy_duplex is set to 1
	fcc->fcc_fgmr has the TCI bit cleared

We are running the 8260 core at 133 MHz and the CPM at 133 MHz.


--
Jay Monkman	    The truth knocks on the door and you say "Go away, I'm
monkman@jump.net    looking for the truth," and so it goes away. Puzzling.
		     - from _Zen_and_the_Art_of_Motorcycle_Maintenance_



* Re: Ethernet Bridging on 8260
From: Jim Chapman @ 2001-04-11 23:36 UTC
  To: jtm; +Cc: linuxppc-embedded


Ah, there's someone else out there trying to break Linux with a
SmartBits tester!! :)

I ran into similar problems, and they get even worse when you blast 64
byte packets rather than 1500 byte packets at the box! The problem is
that the kernel is being starved of the CPU because it spends too much
time in interrupt handlers.

On a typical PC, the CPU interrupt rate from a NIC can be throttled by
the PCI bus / PCI bridge chip. Also, while data is being copied across
PCI to/from the NIC from/to the host memory, the host CPU is often
able to do other work between PCI burst cycles. For the 8260,
interrupts will occur as fast as new packets arrive in the FCC's bd
ring, assuming FCC interrupts are enabled. For 100M ethernet, that can
be real fast...

With the 8260, when packets are received, the FCC interrupt handler
allocates skbs and queues them on the kernel's network input queue via
netif_rx(). The interrupt handler keeps taking buffers from the rx bd
ring until no more rx bds are available. Assuming the ISR eventually
gives up the CPU, when a packet reaches the bridge code, the bridge
determines the output device(s) and queues the packet for transmission
via dev_queue_xmit(). (I assume your SmartBits test frames are "real"
unicast frames - broadcast / multicast frames incur much more
processing overhead.) Since you're getting so many interrupts, the
kernel isn't getting enough of the CPU to process its input / output
queues so the skbs just build up on backlog queues until a limit is
reached or you run out of memory.
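
For reference, the conventional in-interrupt rx loop in the 8xx/8260
drivers looks roughly like this (a simplified sketch, not the exact
fcc_enet code; structure and field names follow the 8xx-style
drivers):

	/* Simplified, illustrative sketch of the usual in-ISR rx loop. */
	while (!(bdp->cbd_sc & BD_ENET_RX_EMPTY)) {
		int pkt_len = bdp->cbd_datlen;
		struct sk_buff *skb = dev_alloc_skb(pkt_len);

		if (skb == NULL) {
			printk("%s: Memory squeeze, dropping packet.\n",
			       dev->name);
		} else {
			skb->dev = dev;
			memcpy(skb_put(skb, pkt_len),
			       (void *)bdp->cbd_bufaddr, pkt_len);
			skb->protocol = eth_type_trans(skb, dev);
			netif_rx(skb);	/* queue on the kernel's input queue */
		}
		bdp->cbd_sc |= BD_ENET_RX_EMPTY;  /* hand the bd back */
		/* ... advance bdp, wrapping at the end of the ring ... */
	}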

Unlike BSD, Linux does not have network buffer pools; the network
drivers and network kernel code allocate buffers (skbs) from system
memory. However, there is a configurable limit on the size of the
receive backlog queue. Try making the
/proc/sys/net/core/netdev_max_backlog value much smaller. Perhaps its
default (300) is too big for your available memory (300 x 1500-byte
packets is ~450 KB). Once this queue reaches the configured limit,
netif_rx() just drops packets.
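
For example, to try a much smaller limit (the value here is just
illustrative):

	echo 50 > /proc/sys/net/core/netdev_max_backlog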

Packets can build up in the transmit queue(s) too, and there is no
limit on how large those can grow. You say that if you blast packets
in on both ports, the kernel doesn't die. This might be because the
CPU is stuck permanently in receive processing, where netif_rx() just
keeps discarding skbs because the receive backlog limit has been
reached. If the bridge code (or other protocol stacks) never gets to
run and so never generates transmit data, skbs won't build up on the
transmit queues.

I've made extensive modifications to the FCC driver and related code
which make the system much better behaved under load. I'm still
working on fixing the last few bugs. Here's a brief summary, fyi:

 * Add a cpm tasklet for doing all cpm "interrupt" processing. The ISR
   now simply disables further interrupts from the device, acks the
   interrupt and flags that the cpm tasklet has work to do. The kernel
   schedules the tasklet as soon as it can, typically at the next call
   to schedule(). The interrupt is used only to kick off task-level
   kernel processing (a rough sketch follows this list).

 * Change the rx/tx bd ring processing routines (which used to be in
   the ISR) to loop only for a fixed quota before returning control
   back to the kernel (tasklet). The tasklet calls these routines
   again and again (perhaps between servicing other devices) until the
   device driver says that it has no more work to do. This allows all
   CPM device drivers to get a fair slice of the CPU cake. The
   device's interrupt is enabled again only when the tx/rx bd rings
   have no more events to process. In this way, FCC interrupts are
   effectively disabled when the system is under load while the
   tasklet invokes active drivers.

 * Improve the efficiency of bd ring and register access.

 * Modify net/core/skbuff.c to allow skbs to be preallocated with
   specific memory buffers and assigned to pools. Modify the FCC
   driver to use the preallocated skbs so that no data copy is needed
   in the driver's receive path (the bd ring's data buffer ptr points
   into the pre-prepared skb data buffers). Also, dev_alloc_skb()
   effectively becomes a skb_dequeue() from the skb pool list which is
   much more efficient. I use the ability to assign specific data
   buffers to skb->data because my board has local memory (on the
   CPU's Local Bus) which I use specifically for data buffers.
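
Roughly, the ISR/tasklet split looks like this (a much-simplified
sketch, not my actual code; the fcc_* helper names are made up for
illustration):

	/* Much-simplified sketch of the ISR/tasklet split described
	 * above; the fcc_* helpers are made up for illustration. */
	static struct tasklet_struct cpm_tasklet;

	static void fcc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
	{
		struct net_device *dev = dev_id;

		fcc_disable_irqs(dev);		/* mask further FCC events */
		fcc_ack_irqs(dev);		/* ack the pending events */
		fcc_mark_work_pending(dev);	/* tell the tasklet */
		tasklet_schedule(&cpm_tasklet);
	}

	static void cpm_tasklet_func(unsigned long data)
	{
		int work_left;

		do {
			/* each poll routine handles at most QUOTA bds,
			 * then returns how much work is still pending */
			work_left  = fcc_rx_ring_poll(QUOTA);
			work_left += fcc_tx_ring_poll(QUOTA);
		} while (work_left);

		/* re-arm the interrupt only when the rings are empty */
		fcc_enable_irqs();
	}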

Changing the driver interrupt strategy as described above
significantly improves performance and behavior under load. It also
allows the kernel to decide which events should be processed when, not
the CPU's interrupt prioritization logic. With these changes, I can
still type commands at the console shell prompt when running my
SmartBits tests...

Hope I've helped!

Jim

jtm@smoothsmoothie.com wrote:

> We're trying to enable ethernet bridging between FCC2 and FCC3 on an
> 8260 (EST's SBC8260 eval board), and running into problems.
> [...]


* Re: Ethernet Bridging on 8260
From: Scott Rogerson @ 2001-04-12 13:48 UTC
  To: hea23587; +Cc: linuxppc-embedded, jtm


I encountered the same problem with the FCC driver for the 8260.  For
the most part I agree with your analysis of the problem.  I think,
however, that your solution may have an unintended consequence.

As you stated, a packet is queued by the driver via netif_rx(), which
before your modification happened in the interrupt handler.  netif_rx()
then flags the network bottom half with mark_bh(NET_BH).  The network
queue (for that matter, all bottom-half queues) is normally drained
before leaving interrupt context, once no further interrupts are
pending.  In other words, the queue is normally drained immediately.

By scheduling the netif_rx() call you may have introduced a lag of up
to 1 jiffy (1/100th of a second) between receiving a packet and
processing that packet.  It's true that this delay is only associated
with the first packet, but it will be there every time you leave the
driver's interrupt service routine.

I solved it in a way similar to your suggestion of putting a quota on
the rx bd ring processing.  If the quota is exceeded while in the
interrupt handler, the rx interrupt gets turned off for 1 jiffy (I
schedule a task to turn it back on after 1 jiffy), thereby allowing
user code to run.  I experimented with the quota to ensure that normal
traffic that I want to process does not cause me to disable the
interrupt.  It really is only abnormal stuff (i.e. SmartBits or
perhaps a broadcast storm) which will cause me to gate the rx process.
The end result is the same, in that the console responds normally
under any SmartBits load, but there is no lag in processing normal
network traffic.
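
Sketched roughly (the names are made up, and the timer must have been
set up with init_timer() during driver init):

	/* Rough sketch of the quota + deferred re-enable idea. */
	static struct timer_list rx_reenable_timer;

	static void rx_reenable(unsigned long data)
	{
		fcc_enable_rx_irq((struct net_device *)data);
	}

	/* ... in the rx interrupt handler ... */
	if (++pkts_this_irq > RX_QUOTA) {
		fcc_disable_rx_irq(dev);
		rx_reenable_timer.function = rx_reenable;
		rx_reenable_timer.data = (unsigned long)dev;
		mod_timer(&rx_reenable_timer, jiffies + 1);
		return;
	}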

Good luck

Scott Rogerson




* Re: Ethernet Bridging on 8260
From: Dan Malek @ 2001-04-12 17:31 UTC
  To: Scott Rogerson; +Cc: hea23587, linuxppc-embedded, jtm


Scott Rogerson wrote:
>
> I encountered the same problem with the FCC driver for the 8260.  For
> the most part I agree with your analysis of the problem.  I think
> however that you may have an unintended consequence with your solution.

I think "problem" and "solution" are a little out of context here.
I certainly don't have any emotional attachment to this code and
actually hope for improvements, but I also have experience with
developing software for bridges and routers.  Based on that
experience I would have used a completely different design than
Linux provides if you are trying to build such equipment (or better,
use Linux on something like MMC network processors).

As stated in a previous message, Linux does have the ability to
relieve some of the memory pressure when the CPU core can't keep
up with the incoming packet rate.  If the CPU can't keep up,
what else is there to do?  The current driver will process Ethernet
frames as fast as they come in.  I would like to make a few changes
to handle smaller BD fragments, which would be a little more memory
efficient, but wouldn't do anything to solve the situation where
the CPU core is too slow to handle the IP processing.  Making the
Ethernet driver do resource scheduling for the IP stack just doesn't
seem like the right thing to do.

Adding a tasklet for the actual processing of incoming frames,
so it can be scheduled against (or assist with) user tasks would
be OK.  Just remember that what this does is cause the FCC to
discard incoming frames as the resource management.  This may not
be acceptable because it will increase the IP processing load
by requesting retransmissions, and this really hurts for large
UDP datagrams (because any missing fragment causes the entire
datagram to be resent).


	-- Dan


