Netdev List


* Re: [net-next v2 02/71] 3c*/acenic/typhoon: Move 3Com Ethernet drivers
From: Jeff Kirsher @ 2011-08-01 20:15 UTC
  To: Alan Cox
  Cc: davem@davemloft.net, netdev@vger.kernel.org, gospo@redhat.com,
	sassmann@redhat.com, Philip Blundell, Steffen Klassert,
	David Dillow, Jes Sorensen, Donald Becker, Craig Southeren,
	David Hinds
In-Reply-To: <20110801100303.48ed75f7@bob.linux.org.uk>

On Mon, 2011-08-01 at 02:03 -0700, Alan Cox wrote:
> On Sat, 30 Jul 2011 20:26:21 -0700
> Jeff Kirsher <jeffrey.t.kirsher@intel.com> wrote:
> 
> > Moves the 3Com drivers into drivers/net/ethernet/3com/ and the
> > necessary Kconfig and Makefile changes.
> 
> This still seems crazy
> 
> The 3c503 is not being moved (as its 8390 based)
> 
> But the 3c505/3c523/3c527/3c507 by that logic also shouldn't be moved
> as really they all belong with the rest of the Intel devices they are
> basically variants of (the 3c527 is weirder, in fact you can probably
> run CP/M 86 on it if you were mad enough)

I did as you asked, just not in this patch.  I should have cleaned up
this patch to reflect the changes I made in patch #4 and #10.

> 
> >  drivers/net/{pcmcia => ethernet/3com}/3c574_cs.c |    0
> >  drivers/net/{pcmcia => ethernet/3com}/3c589_cs.c |    0
> 
> These are currently sensibly where they belong - with the pcmcia
> adapters.
> 
> >  drivers/net/{ => ethernet/3com}/3c59x.c          |    0
> >  drivers/net/ethernet/3com/Kconfig                |  200
> > ++++++++++++++++++++++
> > drivers/net/ethernet/3com/Makefile               |   16 ++
> > drivers/net/{ => ethernet/3com}/acenic.c         |    0
> > drivers/net/{ => ethernet/3com}/acenic.h         |    0
> 
> And most Acenic devices are probably branded Netgear not 3COM and may
> also claim to be from Farallon, SGI, Alteon or DEC. Again not a 3Com
> originated part.
> 
> So I still think this patch is utter nonsense and just noise.
> 
> There isn't any sense in trying to line the network drivers up by
> whatever is written on the box that was thrown away years before. The
> reality is that most cards do not bear anything relevant to the chipset
> vendors name, even by the early 1990s. 
> 
> Architectually it makes more sense to keep tidy by bus type and by
> chipset, not by vendor name
> 
> NAK
> 
> And even if you wanted to make Kconfig simpler - you don't need to move
> files around.
> 
> Alan




* Re: [PATCH] net: add Documentation/networking/scaling.txt
From: Rick Jones @ 2011-08-01 18:49 UTC
  To: Tom Herbert; +Cc: rdunlap, linux-doc, davem, netdev, willemb
In-Reply-To: <alpine.DEB.2.00.1107312346350.28722@pokey.mtv.corp.google.com>

On 07/31/2011 11:56 PM, Tom Herbert wrote:
> Describes RSS, RPS, RFS, accelerated RFS, and XPS.
>
> Signed-off-by: Tom Herbert<therbert@google.com>
> ---
>   Documentation/networking/scaling.txt |  346 ++++++++++++++++++++++++++++++++++
>   1 files changed, 346 insertions(+), 0 deletions(-)
>   create mode 100644 Documentation/networking/scaling.txt
>
> diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
> new file mode 100644
> index 0000000..aa51f0f
> --- /dev/null
> +++ b/Documentation/networking/scaling.txt
> @@ -0,0 +1,346 @@
> +Scaling in the Linux Networking Stack
> +
> +
> +Introduction
> +============
> +
> +This document describes a set of complementary techniques in the Linux
> +networking stack to increase parallelism and improve performance (in
> +throughput, latency, CPU utilization, etc.) for multi-processor systems.

Why not just leave out the parenthetical, lest some picky pedant find a 
specific example where any of those three is not improved?

> +
> +The following technologies are described:
> +
> +  RSS: Receive Side Scaling
> +  RPS: Receive Packet Steering
> +  RFS: Receive Flow Steering
> +  Accelerated Receive Flow Steering
> +  XPS: Transmit Packet Steering
> +
> +
> +RSS: Receive Side Scaling
> +=========================
> +
> +Contemporary NICs support multiple receive queues (multi-queue), which
> +can be used to distribute packets amongst CPUs for processing. The NIC
> +distributes packets by applying a filter to each packet to assign it to
> +one of a small number of logical flows.  Packets for each flow are
> +steered to a separate receive queue, which in turn can be processed by
> +separate CPUs.  This mechanism is generally known as “Receive-side
> +Scaling” (RSS).
> +
> +The filter used in RSS is typically a hash function over the network or
> +transport layer headers-- for example, a 4-tuple hash over IP addresses

Network *and* transport layer headers?  And/or?


> +== RSS IRQ Configuration
> +
> +Each receive queue has a separate IRQ associated with it. The NIC
> +triggers this to notify a CPU when new packets arrive on the given
> +queue. The signaling path for PCIe devices uses message signaled
> +interrupts (MSI-X), that can route each interrupt to a particular CPU.
> +The active mapping of queues to IRQs can be determined from
> +/proc/interrupts. By default, all IRQs are routed to CPU0.  Because a

Really?

> +non-negligible part of packet processing takes place in receive
> +interrupt handling, it is advantageous to spread receive interrupts
> +between CPUs. To manually adjust the IRQ affinity of each interrupt see
> +Documentation/IRQ-affinity. On some systems, the irqbalance daemon is
> +running and will try to dynamically optimize this setting.

I would probably make it explicit that the irqbalance daemon will undo 
one's manual changes:

"Some systems will be running an irqbalance daemon which will be trying 
to dynamically optimize IRQ assignments and will undo manual adjustments."

Whether one needs to go so far as to explicitly suggest that the 
irqbalance daemon should be disabled in such cases I'm not sure.


> +RPS: Receive Packet Steering
> +============================
> +
> +Receive Packet Steering (RPS) is logically a software implementation of
> ...
> +
> +Each receive hardware qeueue has associated list of CPUs which can

"queue has an associated" (spelling and grammar nits)

> +process packets received on the queue for RPS.  For each received
> +packet, an index into the list is computed from the flow hash modulo the
> +size of the list.  The indexed CPU is the target for processing the
> +packet, and the packet is queued to the tail of that CPU’s backlog
> +queue. At the end of the bottom half routine, inter-processor interrupts
> +(IPIs) are sent to any CPUs for which packets have been queued to their
> +backlog queue. The IPI wakes backlog processing on the remote CPU, and
> +any queued packets are then processed up the networking stack. Note that
> +the list of CPUs can be configured separately for each hardware receive
> +queue.
> +
> +== RPS Configuration
> +
> +RPS requires a kernel compiled with the CONFIG_RPS flag (on by default
> +for smp). Even when compiled in, it is disabled without any
> +configuration. The list of CPUs to which RPS may forward traffic can be
> +configured for each receive queue using the sysfs file entry:
> +
> + /sys/class/net/<dev>/queues/rx-<n>/rps_cpus
> +
> +This file implements a bitmap of CPUs. RPS is disabled when it is zero
> +(the default), in which case packets are processed on the interrupting
> +CPU.  IRQ-affinity.txt explains how CPUs are assigned to the bitmap.

Earlier in the writeup (snipped) it is presented as 
"Documentation/IRQ-affinity" and here as IRQ-affinity.txt, should that 
be "Documentation/IRQ-affinity.txt" in both cases?

> +For a single queue device, a typical RPS configuration would be to set
> +the rps_cpus to the CPUs in the same cache domain of the interrupting
> +CPU for a queue. If NUMA locality is not an issue, this could also be
> +all CPUs in the system. At high interrupt rate, it might wise to exclude
> +the interrupting CPU from the map since that already performs much work.
> +
> +For a multi-queue system, if RSS is configured so that a receive queue

Multiple hardware queue to help keep the "queues" separate in the mind of 
the reader?

> +is mapped to each CPU, then RPS is probably redundant and unnecessary.
> +If there are fewer queues than CPUs, then RPS might be beneficial if the

same.

> +rps_cpus for each queue are the ones that share the same cache domain as
> +the interrupting CPU for the queue.
> +
> +RFS: Receive Flow Steering
> +==========================
> +
> +While RPS steers packet solely based on hash, and thus generally
> +provides good load distribution, it does not take into account
> +application locality. This is accomplished by Receive Flow Steering

Should it also mention how an application thread of execution might be 
processing requests on multiple connections, which themselves might not 
normally hash to the same place?


> +== RFS Configuration
> +
> +RFS is only available if the kernel flag CONFIG_RFS is enabled (on by
> +default for smp). The functionality is disabled without any
> +configuration.

Perhaps just wordsmithing, but "This functionality remains disabled 
until explicitly configured." seems clearer.

> +== Accelerated RFS Configuration
> +
> +Accelerated RFS is only available if the kernel is compiled with
> +CONFIG_RFS_ACCEL and support is provided by the NIC device and driver.
> +It also requires that ntuple filtering is enabled via ethtool.

Requires that ntuple filtering be enabled?

> +XPS: Transmit Packet Steering
> +=============================
> +
> +Transmit Packet Steering is a mechanism for intelligently selecting
> +which transmit queue to use when transmitting a packet on a multi-queue
> +device.

Minor nit.  Up to this point a multi-queue device was only described as 
one with multiple receive queues.


> +Further Information
> +===================
> +RPS and RFS were introduced in kernel 2.6.35. XPS was incorporated into
> +2.6.38. Original patches were submitted by Tom Herbert
> +(therbert@google.com)
> +
> +
> +Accelerated RFS was introduced in 2.6.35. Original patches were
> +submitted by Ben Hutchings (bhutchings@solarflare.com)
> +
> +Authors:
> +Tom Herbert (therbert@google.com)
> +Willem de Bruijn (willemb@google.com)
> +

While there are tidbits and indications in the descriptions of each 
mechanism, a section with explicit description of when one would use the 
different mechanisms would be goodness.

rick jones


* Re: [PATCH] net: add Documentation/networking/scaling.txt
From: Randy Dunlap @ 2011-08-01 18:41 UTC
  To: Tom Herbert; +Cc: linux-doc, davem, netdev, willemb
In-Reply-To: <alpine.DEB.2.00.1107312346350.28722@pokey.mtv.corp.google.com>

On Sun, 31 Jul 2011 23:56:26 -0700 (PDT) Tom Herbert wrote:

> Describes RSS, RPS, RFS, accelerated RFS, and XPS.
> 
> Signed-off-by: Tom Herbert <therbert@google.com>
> ---
>  Documentation/networking/scaling.txt |  346 ++++++++++++++++++++++++++++++++++
>  1 files changed, 346 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/networking/scaling.txt
> 
> diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
> new file mode 100644
> index 0000000..aa51f0f
> --- /dev/null
> +++ b/Documentation/networking/scaling.txt
> @@ -0,0 +1,346 @@
> +Scaling in the Linux Networking Stack
> +
> +
> +Introduction
> +============ 
> +
> +This document describes a set of complementary techniques in the Linux
> +networking stack to increase parallelism and improve performance (in
> +throughput, latency, CPU utilization, etc.) for multi-processor systems.
> +
> +The following technologies are described:
> +
> +  RSS: Receive Side Scaling
> +  RPS: Receive Packet Steering
> +  RFS: Receive Flow Steering
> +  Accelerated Receive Flow Steering
> +  XPS: Transmit Packet Steering
> +
> +
> +RSS: Receive Side Scaling
> +=========================
> +
> +Contemporary NICs support multiple receive queues (multi-queue), which
> +can be used to distribute packets amongst CPUs for processing. The NIC
> +distributes packets by applying a filter to each packet to assign it to
> +one of a small number of logical flows.  Packets for each flow are
> +steered to a separate receive queue, which in turn can be processed by
> +separate CPUs.  This mechanism is generally known as “Receive-side
> +Scaling” (RSS).
> +
> +The filter used in RSS is typically a hash function over the network or
> +transport layer headers-- for example, a 4-tuple hash over IP addresses
> +and TCP ports of a packet. The most common hardware implementation of
> +RSS uses a 128 entry indirection table where each entry stores a queue

              128-entry

> +number. The receive queue for a packet is determined by masking out the
> +low order seven bits of the computed hash for the packet (usually a
> +Toeplitz hash), taking this number as a key into the indirection table
> +and reading the corresponding value.
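
A minimal sketch of the lookup described above, assuming a 128-entry
table filled round-robin. The names rss_fill_even() and rss_pick_queue()
are made up for illustration and are not taken from any driver; real
hardware does this lookup itself.

```c
#include <assert.h>
#include <stdint.h>

#define RSS_INDIR_SIZE 128u	/* 128-entry indirection table */

/* Spread the queues evenly over the table (the "default mapping"). */
static void rss_fill_even(uint8_t table[RSS_INDIR_SIZE], unsigned int num_queues)
{
	assert(num_queues > 0 && num_queues <= 256);
	for (unsigned int i = 0; i < RSS_INDIR_SIZE; i++)
		table[i] = (uint8_t)(i % num_queues);
}

/* Keep the low-order seven bits of the hash and use them as the index. */
static unsigned int rss_pick_queue(const uint8_t table[RSS_INDIR_SIZE], uint32_t hash)
{
	return table[hash & (RSS_INDIR_SIZE - 1)];
}
```

Giving a queue more slots in the table is what gives it a larger
relative weight.
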
> +
> +Some advanced NICs allow steering packets to queues based on
> +programmable filters. For example, webserver bound TCP port 80 packets
> +can be directed to their own receive queue. Such “n-tuple” filters can
> +be configured from ethtool (--config-ntuple).
> +
> +== RSS Configuration
> +
> +The driver for a multi-queue capable NIC typically provides a module
> +parameter specifying the number of hardware queues to configure. In the
> +bnx2x driver, for instance, this parameter is called num_queues. A
> +typical RSS configuration would be to have one receive queue for each
> +CPU if the device supports enough queues, or otherwise at least one for
> +each cache domain at a particular cache level (L1, L2, etc.).
> +
> +The indirection table of an RSS device, which resolves a queue by masked
> +hash, is usually programmed by the driver at initialization.  The
> +default mapping is to distribute the queues evenly in the table, but the
> +indirection table can be retrieved and modified at runtime using ethtool
> +commands (--show-rxfh-indir and --set-rxfh-indir).  Modifying the
> +indirection table could be done to to give different queues different

                                   ^^drop one "to"

> +relative weights. 

Drop trailing whitespace above and anywhere else that it's found. (5 places)

I thought (long ago :) that multiple RX queues were for prioritizing traffic,
but there is nothing here about using multi-queues for priorities.
Is that (no longer) done?


> +
> +== RSS IRQ Configuration
> +
> +Each receive queue has a separate IRQ associated with it. The NIC
> +triggers this to notify a CPU when new packets arrive on the given
> +queue. The signaling path for PCIe devices uses message signaled
> +interrupts (MSI-X), that can route each interrupt to a particular CPU.
> +The active mapping of queues to IRQs can be determined from
> +/proc/interrupts. By default, all IRQs are routed to CPU0.  Because a
> +non-negligible part of packet processing takes place in receive
> +interrupt handling, it is advantageous to spread receive interrupts
> +between CPUs. To manually adjust the IRQ affinity of each interrupt see
> +Documentation/IRQ-affinity. On some systems, the irqbalance daemon is
> +running and will try to dynamically optimize this setting.

or (avoid a split infinitive):  will try to optimize this setting dynamically.

> +
> +
> +RPS: Receive Packet Steering
> +============================
> +
> +Receive Packet Steering (RPS) is logically a software implementation of
> +RSS.  Being in software, it is necessarily called later in the datapath.
> +Whereas RSS selects the queue and hence CPU that will run the hardware
> +interrupt handler, RPS selects the CPU to perform protocol processing
> +above the interrupt handler.  This is accomplished by placing the packet
> +on the desired CPU’s backlog queue and waking up the CPU for processing.
> +RPS has some advantages over RSS: 1) it can be used with any NIC, 2)
> +software filters can easily be added to handle new protocols, 3) it does
> +not increase hardware device interrupt rate (but does use IPIs).
> +
> +RPS is called during bottom half of the receive interrupt handler, when
> +a driver sends a packet up the network stack with netif_rx() or
> +netif_receive_skb(). These call the get_rps_cpu() function, which
> +selects the queue that should process a packet.
> +
> +The first step in determining the target CPU for RPS is to calculate a
> +flow hash over the packet’s addresses or ports (2-tuple or 4-tuple hash
> +depending on the protocol). This serves as a consistent hash of the
> +associated flow of the packet. The hash is either provided by hardware
> +or will be computed in the stack. Capable hardware can pass the hash in
> +the receive descriptor for the packet, this would usually be the same

                                  packet;

> +hash used for RSS (e.g. computed Toeplitz hash). The hash is saved in
> +skb->rx_hash and can be used elsewhere in the stack as a hash of the
> +packet’s flow.
> +
> +Each receive hardware qeueue has associated list of CPUs which can

                                has an associated list (?)

> +process packets received on the queue for RPS.  For each received
> +packet, an index into the list is computed from the flow hash modulo the
> +size of the list.  The indexed CPU is the target for processing the
> +packet, and the packet is queued to the tail of that CPU’s backlog
> +queue. At the end of the bottom half routine, inter-processor interrupts
> +(IPIs) are sent to any CPUs for which packets have been queued to their
> +backlog queue. The IPI wakes backlog processing on the remote CPU, and
> +any queued packets are then processed up the networking stack. Note that
> +the list of CPUs can be configured separately for each hardware receive
> +queue.
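
The selection step above can be sketched as follows. This follows the
wording of the text (hash modulo list size); the kernel's actual
arithmetic may differ. struct rps_cpu_list and rps_pick_cpu() are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-queue CPU list set via rps_cpus. */
struct rps_cpu_list {
	size_t len;
	const unsigned int *cpus;
};

/* Index the list by the flow hash modulo its size. */
static unsigned int rps_pick_cpu(const struct rps_cpu_list *list, uint32_t flow_hash)
{
	assert(list->len > 0);
	return list->cpus[flow_hash % list->len];
}
```

Because the hash is constant for a flow, every packet of that flow
lands on the same CPU as long as the list is unchanged.
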
> +
> +== RPS Configuration
> +
> +RPS requires a kernel compiled with the CONFIG_RPS flag (on by default

s/flag/kconfig symbol/

> +for smp). Even when compiled in, it is disabled without any

   for SMP).

> +configuration. The list of CPUs to which RPS may forward traffic can be
> +configured for each receive queue using the sysfs file entry:
> +
> + /sys/class/net/<dev>/queues/rx-<n>/rps_cpus
> +
> +This file implements a bitmap of CPUs. RPS is disabled when it is zero
> +(the default), in which case packets are processed on the interrupting
> +CPU.  IRQ-affinity.txt explains how CPUs are assigned to the bitmap.
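
For readers unfamiliar with the bitmap format: bit n set means CPU n is
included, and the value is written as hex. A rough sketch, limited to
CPUs 0..31 for simplicity; cpu_mask() and cpu_mask_to_hex() are
illustrative helpers, not kernel or libc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Build a CPU bitmap: bit n set means CPU n is in the set (CPUs 0..31 only). */
static uint32_t cpu_mask(const unsigned int *cpus, size_t n)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		assert(cpus[i] < 32);
		mask |= UINT32_C(1) << cpus[i];
	}
	return mask;
}

/* Format the mask as the hex string one would write to the sysfs file. */
static int cpu_mask_to_hex(uint32_t mask, char *buf, size_t buflen)
{
	return snprintf(buf, buflen, "%x", (unsigned int)mask);
}
```

For CPUs 0-3 this yields "f"; a mask of zero leaves RPS disabled, as
the quoted text says.
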
> +
> +For a single queue device, a typical RPS configuration would be to set
> +the rps_cpus to the CPUs in the same cache domain of the interrupting
> +CPU for a queue. If NUMA locality is not an issue, this could also be
> +all CPUs in the system. At high interrupt rate, it might wise to exclude

                                                   it might be wise

> +the interrupting CPU from the map since that already performs much work.
> +
> +For a multi-queue system, if RSS is configured so that a receive queue
> +is mapped to each CPU, then RPS is probably redundant and unnecessary.
> +If there are fewer queues than CPUs, then RPS might be beneficial if the
> +rps_cpus for each queue are the ones that share the same cache domain as
> +the interrupting CPU for the queue.
> +
> +RFS: Receive Flow Steering
> +==========================
> +
> +While RPS steers packet solely based on hash, and thus generally

             steers packets

> +provides good load distribution, it does not take into account
> +application locality. This is accomplished by Receive Flow Steering
> +(RFS). The goal of RFS is to increase datacache hitrate by steering
> +kernel processing of packets to the CPU where the application thread
> +consuming the packet is running. RFS relies on the same RPS mechanisms
> +to enqueue packets onto the backlog of another CPU and to wake that CPU.
> +
> +In RFS, packets are not forwarded directly by the value of their hash,
> +but the hash is used as index into a flow lookup table. This table maps
> +flows to the CPUs where those flows are being processed. The flow hash
> +(see RPS section above) is used to calculate the index into this table.
> +The CPU recorded in each entry is the one which last processed the flow,
> +and if there is not a valid CPU for an entry, then packets mapped to
> +that entry are steered using plain RPS.
> +
> +To avoid out of order packets (ie. when scheduler moves a thread with

                                 (i.e., when the scheduler moves a thread that

> +outstanding receive packets on) there are two levels of flow tables used

has outstanding receive packets),

> +by RFS: rps_sock_flow_table and rps_dev_flow_table.
> +
> +rps_sock_table is a global flow table. Each table value is a CPU index
> +and is populated by recvmsg and sendmsg (specifically, inet_recvmsg(),
> +inet_sendmsg(), inet_sendpage() and tcp_splice_read()). This table
> +contains the *desired* CPUs for flows.
> +
> +rps_dev_flow_table is specific to each hardware receive queue of each
> +device.  Each table value stores a CPU index and a counter. The CPU
> +index represents the *current* CPU that is assigned to processing the
> +matching flows.
> +
> +The counter records the length of this CPU's backlog when a packet in
> +this flow was last enqueued.  Each backlog queue has a head counter that
> +is incremented on dequeue. A tail counter is computed as head counter +
> +queue length. In other words, the counter in rps_dev_flow_table[i]
> +records the last element in flow i that has been enqueued onto the
> +currently designated CPU for flow i (of course, entry i is actually
> +selected by hash and multiple flows may hash to the same entry i). 
> +
> +And now the trick for avoiding out of order packets: when selecting the
> +CPU for packet processing (from get_rps_cpu()) the rps_sock_flow table
> +and the rps_dev_flow table of the queue that the packet was received on
> +are compared.  If the desired CPU for the flow (found in the
> +rps_sock_flow table) matches the current CPU (found in the rps_dev_flow
> +table), the packet is enqueud onto that CPU’s backlog. If they differ,

                         enqueued

> +the current cpu is updated to match the desired CPU if one of the

s/cpu/CPU/ (globally as needed)

> +following is true:
> +
> +- The current CPU's queue head counter >= the recorded tail counter
> +  value in rps_dev_flow[i]
> +- The current CPU is unset (equal to NR_CPUS)
> +- The current CPU is offline
> +
> +After this check, the packet is sent to the (possibly updated) current
> +CPU.  These rules aim to ensure that a flow only moves to a new CPU when
> +there are no packets outstanding on the old CPU, as the outstanding
> +packets could arrive later than those about to be processed on the new
> +CPU.
> +
> +== RFS Configuration
> +
> +RFS is only available if the kernel flag CONFIG_RFS is enabled (on by

s/flag/kconfig symbol/

> +default for smp). The functionality is disabled without any

s/smp/SMP/

> +configuration. The number of entries in the global flow table is set
> +through:
> +
> + /proc/sys/net/core/rps_sock_flow_entries
> +
> +The number of entries in the per queue flow table are set through:

                                per-queue

> +
> + /sys/class/net/<dev>/queues/tx-<n>/rps_flow_cnt
> +
> +Both of these need to be set before RFS is enabled for a receive queue.
> +Values for both of these are rounded up to the nearest power of two. The
> +suggested flow count depends on the expected number active connections

                                                number of

> +at any given time, which may be significantly less than the number of
> +open connections. We have found that a value of 32768 for
> +rps_sock_flow_entries works fairly well on a moderately loaded server.
> +
> +For a single queue device, the rps_flow_cnt value for the single queue
> +would normally be configured to the same value as rps_sock_flow_entries.
> +For a multi-queue device, the rps_flow_cnt for each queue might be
> +configured as rps_sock_flow_entries / N, where N is the number of
> +queues. So for instance, if rps_flow_entries is set to 32768 and there
> +are 16 configured receive queues, rps_flow_cnt for each queue might be
> +configured as 2048.
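
The sizing arithmetic from the paragraph above, as a sketch. The helper
names are hypothetical; the rounding mirrors the statement that values
are rounded up to the nearest power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round up to the next power of two (v must be in 1..2^31). */
static uint32_t roundup_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v > 0 && v <= (UINT32_C(1) << 31));
	while (p < v)
		p <<= 1;
	return p;
}

/* Suggested per-queue flow count: global entries divided by queue count. */
static uint32_t per_queue_flow_cnt(uint32_t sock_flow_entries, uint32_t num_rx_queues)
{
	uint32_t share;

	assert(num_rx_queues > 0);
	share = sock_flow_entries / num_rx_queues;
	return roundup_pow2(share ? share : 1);
}
```

With 32768 global entries and 16 receive queues this gives 2048, which
matches the example in the text.
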
> +
> +
> +Accelerated RFS
> +===============
> +
> +Accelerated RFS is to RFS what RSS is to RPS: a hardware-accelerated
> +load balancing mechanism that uses soft state to steer flows based on
> +where the thread consuming the packets of each flow is running.
> +Accelerated RFS should perform better than RFS since packets are sent
> +directly to a CPU local to the thread consuming the data. The target CPU
> +will either be the same CPU where the application runs, or at least a
> +CPU which is local to the application thread’s CPU in the cache
> +hierarchy. 
> +
> +To enable accelerated RFS, the networking stack calls the
> +ndo_rx_flow_steer driver function to communicate the desired hardware
> +queue for packets matching a particular flow. The network stack
> +automatically calls this function every time a flow entry in
> +rps_dev_flow_table is updated. The driver in turn uses a device specific

                                                            device-specific

> +method to program the NIC to steer the packets.
> +
> +The hardware queue for a flow is derived from the CPU recorded in
> +rps_dev_flow_table. The stack consults a CPU to hardware queue map which

                                            CPU-to-hardware-queue map

> +is maintained by the NIC driver. This is an autogenerated reverse map of
> +the IRQ affinity table shown by /proc/interrupts. Drivers can use
> +functions in the cpu_rmap (“cpu affinitiy reverse map”) kernel library
> +to populate the map. For each CPU, the corresponding queue in the map is
> +set to be one whose processing CPU is closest in cache locality.
> +
> +== Accelerated RFS Configuration
> +
> +Accelerated RFS is only available if the kernel is compiled with
> +CONFIG_RFS_ACCEL and support is provided by the NIC device and driver.
> +It also requires that ntuple filtering is enabled via ethtool. The map
> +of CPU to queues is automatically deduced from the IRQ affinities
> +configured for each receive queue by the driver, so no additional
> +configuration should be necessary.
> +
> +XPS: Transmit Packet Steering
> +=============================
> +
> +Transmit Packet Steering is a mechanism for intelligently selecting
> +which transmit queue to use when transmitting a packet on a multi-queue
> +device. To accomplish this, a mapping from CPU to hardware queue(s) is
> +recorded. The goal of this mapping is usually to assign queues
> +exclusively to a subset of CPUs, where the transmit completions for
> +these queues are processed on a CPU within this set. This choice
> +provides two benefits. First, contention on the device queue lock is
> +significantly reduced since fewer CPUs contend for the same queue
> +(contention can be eliminated completely if each CPU has its own
> +transmit queue).  Secondly, cache miss rate on transmit completion is
> +reduced, in particular for data cache lines that hold the sk_buff
> +structures.
> +
> +XPS is configured per transmit queue by setting a bitmap of CPUs that
> +may use that queue to transmit. The reverse mapping, from CPUs to
> +transmit queues, is computed and maintained for each network device.
> +When transmitting the first packet in a flow, the function
> +get_xps_queue() is called to select a queue.  This function uses the ID
> +of the running CPU as a key into the CPU to queue lookup table. If the

                                        CPU-to-queue

> +ID matches a single queue, that is used for transmission.  If multiple
> +queues match, one is selected by using the flow hash to compute an index
> +into the set.
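
A sketch of the queue choice just described, assuming a per-CPU list of
permitted queues. struct xps_cpu_map and xps_pick_queue() are invented
for this example and do not reflect the real get_xps_queue() signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry of the CPU-to-queue map for the running CPU. */
struct xps_cpu_map {
	size_t len;		/* number of queues this CPU may use */
	const uint16_t *queues;
};

/*
 * Return the transmit queue for a flow, or -1 when the CPU has no
 * queues configured and the caller must fall back to another method.
 */
static int xps_pick_queue(const struct xps_cpu_map *map, uint32_t flow_hash)
{
	assert(map != NULL);

	if (map->len == 0)
		return -1;
	if (map->len == 1)
		return map->queues[0];
	return map->queues[flow_hash % map->len];
}
```
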
> +
> +The queue chosen for transmitting a particular flow is saved in the
> +corresponding socket structure for the flow (e.g. a TCP connection).
> +This transmit queue is used for subsequent packets sent on the flow to
> +prevent out of order (ooo) packets. The choice also amortizes the cost
> +of calling get_xps_queues() over all packets in the connection. To avoid
> +ooo packets, the queue for a flow can subsequently only be changed if
> +skb->ooo_okay is set for a packet in the flow. This flag indicates that
> +there are no outstanding packets in the flow, so the transmit queue can
> +change without the risk of generating out of order packets. The
> +transport layer is responsible for setting ooo_okay appropriately. TCP,
> +for instance, sets the flag when all data for a connection has been
> +acknowledged.
> +
> +
> +== XPS Configuration
> +
> +XPS is only available if the kernel flag CONFIG_XPS is enabled (on by

s/flag/kconfig symbol/

> +default for smp). The functionality is disabled without any

s/smp/SMP/

> +configuration, in which case the the transmit queue for a packet is
> +selected by using a flow hash as an index into the set of all transmit
> +queues for the device. To enable XPS, the bitmap of CPUs that may use a
> +transmit queue is configured using the sysfs file entry:
> +
> +/sys/class/net/<dev>/queues/tx-<n>/xps_cpus
> +
> +XPS is disabled when it is zero (the default). IRQ-affinity.txt explains
> +how CPUs are assigned to the bitmap. 
> +
> +For a network device with a single transmission queue, XPS configuration
> +has no effect, since there is no choice in this case. In a multi-queue
> +system, XPS is usually configured so that each CPU maps onto one queue.
> +If there are as many queues as there are CPUs in the system, then each
> +queue can also map onto one CPU, resulting in exclusive pairings that
> +experience no contention. If there are fewer queues than CPUs, then the
> +best CPUs to share a given queue are probably those that share the cache
> +with the CPU that processes transmit completions for that queue
> +(transmit interrupts).
> +
> +
> +Further Information
> +===================
> +RPS and RFS were introduced in kernel 2.6.35. XPS was incorporated into
> +2.6.38. Original patches were submitted by Tom Herbert
> +(therbert@google.com)
> +
> +
> +Accelerated RFS was introduced in 2.6.35. Original patches were
> +submitted by Ben Hutchings (bhutchings@solarflare.com)
> +
> +Authors:
> +Tom Herbert (therbert@google.com)
> +Willem de Bruijn (willemb@google.com)
> +
> -- 


Very nice writeup.  Thanks.

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


* Re: [PATCH] net: filter: Convert the BPF VM to threaded code
From: Eric Dumazet @ 2011-08-01 18:37 UTC
  To: Hagen Paul Pfeifer; +Cc: Rui Ueyama, netdev
In-Reply-To: <20110801181652.GB2732@nuttenaction>

Le lundi 01 août 2011 à 20:16 +0200, Hagen Paul Pfeifer a écrit :
> * Rui Ueyama | 2011-07-29 01:10:26 [-0700]:
> 
> >Convert the BPF VM to threaded code to improve performance.
> >
> >The BPF VM is basically a big for loop containing a switch statement.  That is
> >slow because for each instruction it checks the for loop condition and does the
> >conditional branch of the switch statement.
> >
> >This patch eliminates the conditional branch, by replacing it with jump table
> >using GCC's labels-as-values feature. The for loop condition check can also be
> >removed, because the filter code always end with a RET instruction.
> 
> With commit 01f2f3f6ef4d076c I reworked the BPF code so that gcc is in the
> ability to generate a jump table, I double checked this. Not sure what happened
> in the meantime.
> 

A switch() always generates one conditional branch, catching values not
enumerated in the "case ..." clauses.
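
To make the difference concrete, here is a toy interpreter written both
ways. It is only a sketch: the opcodes are invented and are not BPF's,
and run_threaded() relies on the GNU C labels-as-values extension
(GCC and Clang), assuming the program has been validated to contain
only known opcodes and to end with OP_RET.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_ADD, OP_SUB, OP_RET };	/* toy opcodes, not BPF's */

struct insn {
	int op;
	int k;
};

/* Loop plus switch: each step checks the loop bound and the opcode range. */
static int run_switch(const struct insn *prog, size_t len)
{
	int acc = 0;

	for (size_t pc = 0; pc < len; pc++) {
		switch (prog[pc].op) {
		case OP_ADD: acc += prog[pc].k; break;
		case OP_SUB: acc -= prog[pc].k; break;
		case OP_RET: return acc;
		default: return -1;	/* values not listed in a case end up here */
		}
	}
	return -1;
}

/* Threaded dispatch: each handler jumps straight to the next handler. */
static int run_threaded(const struct insn *prog)
{
	static void *const jump[] = { &&do_add, &&do_sub, &&do_ret };
	int acc = 0;

	assert(prog != NULL);
#define NEXT() goto *jump[(++prog)->op]
	goto *jump[prog->op];
do_add:
	acc += prog->k;
	NEXT();
do_sub:
	acc -= prog->k;
	NEXT();
do_ret:
	return acc;
#undef NEXT
}
```

The threaded version has no default case and no loop condition, so it
is only safe on programs that were checked beforehand.
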





* [PATCH 2/2] Create a new connector proc_event for successful calls to accept.
From: Joe Damato @ 2011-08-01 18:04 UTC
  To: zbr; +Cc: netdev, Joe Damato
In-Reply-To: <1312221865-3012-1-git-send-email-joe@boundary.com>


Signed-off-by: Joe Damato <joe@boundary.com>
---
 drivers/connector/cn_proc.c |   36 ++++++++++++++++++++++++++++++++++++
 include/linux/cn_proc.h     |   19 ++++++++++++++++++-
 net/socket.c                |    3 +++
 3 files changed, 57 insertions(+), 1 deletions(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index 3e88d07..1106014 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -85,6 +85,42 @@ void proc_connect_connector(struct task_struct *task, struct socket *sock,
 	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
 }
 
+void proc_accept_connector(struct task_struct *task, struct socket *sock,
+			   struct sockaddr *addr, int addrlen)
+{
+	struct cn_msg *msg;
+	struct proc_event *ev;
+	__u8 buffer[CN_PROC_MSG_SIZE];
+	struct timespec ts;
+
+	if (atomic_read(&proc_event_num_listeners) < 1)
+		return;
+
+	msg = (struct cn_msg*)buffer;
+	ev = (struct proc_event*)msg->data;
+	get_seq(&msg->seq, &ev->cpu);
+	ktime_get_ts(&ts); /* get high res monotonic timestamp */
+	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
+	ev->what = PROC_EVENT_ACCEPT;
+	ev->event_data.accept.process_pid = task->pid;
+	ev->event_data.accept.process_tgid = task->tgid;
+	ev->event_data.accept.protocol = sock->sk->sk_protocol;
+	ev->event_data.accept.address_len = addrlen;
+	memcpy(&ev->event_data.accept.address, addr, addrlen);
+
+	ev->event_data.accept.local_address_len = sizeof(struct __kernel_sockaddr_storage);
+	kernel_getsockname(sock, (struct sockaddr *) &ev->event_data.accept.local_address,
+			  &ev->event_data.accept.local_address_len);
+
+	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
+	msg->ack = 0; /* not used */
+	msg->len = sizeof(*ev);
+	/*  If cn_netlink_send() failed, the data is not sent */
+	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
+
+	return;
+}
+
 void proc_fork_connector(struct task_struct *task)
 {
 	struct cn_msg *msg;
diff --git a/include/linux/cn_proc.h b/include/linux/cn_proc.h
index a49ed22..d68a3de 100644
--- a/include/linux/cn_proc.h
+++ b/include/linux/cn_proc.h
@@ -56,7 +56,8 @@ struct proc_event {
 		PROC_EVENT_SID  = 0x00000080,
 		PROC_EVENT_PTRACE = 0x00000100,
 		PROC_EVENT_CONNECT = 0x00000400,  
-		/* "next" should be 0x00000800 */
+		PROC_EVENT_ACCEPT = 0x00000800,  
+		/* "next" should be 0x00001000 */
 		/* "last" is the last process event: exit */
 		PROC_EVENT_EXIT = 0x80000000
 	} what;
@@ -90,6 +91,16 @@ struct proc_event {
 			int protocol;
 		} connect;
 
+		struct accept_proc_event {
+			__kernel_pid_t process_pid;
+			__kernel_pid_t process_tgid;
+			struct __kernel_sockaddr_storage address;
+			int address_len;
+			struct __kernel_sockaddr_storage local_address;
+			int local_address_len;
+			int protocol;
+		} accept;
+
 		struct id_proc_event {
 			__kernel_pid_t process_pid;
 			__kernel_pid_t process_tgid;
@@ -133,6 +144,8 @@ void proc_ptrace_connector(struct task_struct *task, int which_id);
 void proc_exit_connector(struct task_struct *task);
 void proc_connect_connector(struct task_struct *task, struct socket *sock,
 			    struct sockaddr *addr, int addrlen);
+void proc_accept_connector(struct task_struct *task, struct socket *sock,
+			   struct sockaddr *addr, int addrlen);
 #else
 static inline void proc_fork_connector(struct task_struct *task)
 {}
@@ -159,6 +172,10 @@ static inline void proc_connect_connector(struct task_struct *task,
 					  struct sockaddr *addr, int addrlen)
 {}
 
+static inline void proc_accept_connector(struct task_struct *task,
+					 struct socket *sock,
+					 struct sockaddr *addr, int addrlen)
+{}
 #endif	/* CONFIG_PROC_EVENTS */
 #endif	/* __KERNEL__ */
 #endif	/* CN_PROC_H */
diff --git a/net/socket.c b/net/socket.c
index b4f9a6c..d21a266 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1544,6 +1544,9 @@ SYSCALL_DEFINE4(accept4, int, fd, struct sockaddr __user *, upeer_sockaddr,
 			goto out_fd;
 	}
 
+	proc_accept_connector(current, newsock,
+			      (struct sockaddr *)&address, len);
+
 	/* File flags are not inherited via accept() unlike another OSes. */
 
 	fd_install(newfd, newfile);
-- 
1.7.4.1



* [PATCH 1/2] Create a new connector proc_event for successful calls to connect.
From: Joe Damato @ 2011-08-01 18:04 UTC (permalink / raw)
  To: zbr; +Cc: netdev, Joe Damato
In-Reply-To: <1312221865-3012-1-git-send-email-joe@boundary.com>


Signed-off-by: Joe Damato <joe@boundary.com>
---
 drivers/connector/cn_proc.c |   34 ++++++++++++++++++++++++++++++++++
 include/linux/cn_proc.h     |   22 +++++++++++++++++++++-
 net/socket.c                |    6 ++++++
 3 files changed, 61 insertions(+), 1 deletions(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index 3ee1fdb..3e88d07 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -51,6 +51,40 @@ static inline void get_seq(__u32 *ts, int *cpu)
 	preempt_enable();
 }
 
+void proc_connect_connector(struct task_struct *task, struct socket *sock,
+			    struct sockaddr *addr, int addrlen)
+{
+	struct cn_msg *msg;
+	struct proc_event *ev;
+	__u8 buffer[CN_PROC_MSG_SIZE];
+	struct timespec ts;
+
+	if (atomic_read(&proc_event_num_listeners) < 1)
+		return;
+
+	msg = (struct cn_msg*)buffer;
+	ev = (struct proc_event*)msg->data;
+	get_seq(&msg->seq, &ev->cpu);
+	ktime_get_ts(&ts); /* get high res monotonic timestamp */
+	put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
+	ev->what = PROC_EVENT_CONNECT;
+	ev->event_data.connect.process_pid = task->pid;
+	ev->event_data.connect.process_tgid = task->tgid;
+	ev->event_data.connect.protocol = sock->sk->sk_protocol;
+	ev->event_data.connect.address_len = addrlen;
+	memcpy(&ev->event_data.connect.address, addr, addrlen);
+
+	ev->event_data.connect.local_address_len = sizeof(struct __kernel_sockaddr_storage);
+	kernel_getsockname(sock, (struct sockaddr *) &ev->event_data.connect.local_address,
+			  &ev->event_data.connect.local_address_len);
+
+	memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
+	msg->ack = 0; /* not used */
+	msg->len = sizeof(*ev);
+	/*  If cn_netlink_send() failed, the data is not sent */
+	cn_netlink_send(msg, CN_IDX_PROC, GFP_KERNEL);
+}
+
 void proc_fork_connector(struct task_struct *task)
 {
 	struct cn_msg *msg;
diff --git a/include/linux/cn_proc.h b/include/linux/cn_proc.h
index 12c517b..a49ed22 100644
--- a/include/linux/cn_proc.h
+++ b/include/linux/cn_proc.h
@@ -19,6 +19,7 @@
 #define CN_PROC_H
 
 #include <linux/types.h>
+#include <linux/socket.h>
 
 /*
  * Userspace sends this enum to register with the kernel that it is listening
@@ -54,7 +55,8 @@ struct proc_event {
 		PROC_EVENT_GID  = 0x00000040,
 		PROC_EVENT_SID  = 0x00000080,
 		PROC_EVENT_PTRACE = 0x00000100,
-		/* "next" should be 0x00000400 */
+		PROC_EVENT_CONNECT = 0x00000400,  
+		/* "next" should be 0x00000800 */
 		/* "last" is the last process event: exit */
 		PROC_EVENT_EXIT = 0x80000000
 	} what;
@@ -78,6 +80,16 @@ struct proc_event {
 			__kernel_pid_t process_tgid;
 		} exec;
 
+		struct connect_proc_event {
+			__kernel_pid_t process_pid;
+			__kernel_pid_t process_tgid;
+			struct __kernel_sockaddr_storage address;
+			int address_len;
+			struct __kernel_sockaddr_storage local_address;
+			int local_address_len;
+			int protocol;
+		} connect;
+
 		struct id_proc_event {
 			__kernel_pid_t process_pid;
 			__kernel_pid_t process_tgid;
@@ -119,6 +131,8 @@ void proc_id_connector(struct task_struct *task, int which_id);
 void proc_sid_connector(struct task_struct *task);
 void proc_ptrace_connector(struct task_struct *task, int which_id);
 void proc_exit_connector(struct task_struct *task);
+void proc_connect_connector(struct task_struct *task, struct socket *sock,
+			    struct sockaddr *addr, int addrlen);
 #else
 static inline void proc_fork_connector(struct task_struct *task)
 {}
@@ -139,6 +153,12 @@ static inline void proc_ptrace_connector(struct task_struct *task,
 
 static inline void proc_exit_connector(struct task_struct *task)
 {}
+
+static inline void proc_connect_connector(struct task_struct *task,
+					  struct socket *sock,
+					  struct sockaddr *addr, int addrlen)
+{}
+
 #endif	/* CONFIG_PROC_EVENTS */
 #endif	/* __KERNEL__ */
 #endif	/* CN_PROC_H */
diff --git a/net/socket.c b/net/socket.c
index b1cbbcd..b4f9a6c 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -88,6 +88,8 @@
 #include <linux/nsproxy.h>
 #include <linux/magic.h>
 #include <linux/slab.h>
+#include <linux/connector.h>
+#include <linux/cn_proc.h>
 
 #include <asm/uaccess.h>
 #include <asm/unistd.h>
@@ -1596,6 +1598,10 @@ SYSCALL_DEFINE3(connect, int, fd, struct sockaddr __user *, uservaddr,
 
 	err = sock->ops->connect(sock, (struct sockaddr *)&address, addrlen,
 				 sock->file->f_flags);
+
+	if (err == 0)
+	  proc_connect_connector(current, sock, (struct sockaddr *)&address, addrlen);
+
 out_put:
 	fput_light(sock->file, fput_needed);
 out:
-- 
1.7.4.1



* [PATCH 0/2] connector: Add proc_events for connect/accept
From: Joe Damato @ 2011-08-01 18:04 UTC (permalink / raw)
  To: zbr; +Cc: netdev, Joe Damato

Hi -

It would be extremely useful to have a simple way of mapping pids to network
connections without having to create piles of inotify watches in /proc/ and
/proc/<pid>/fd/ and then search for corresponding inode numbers in
/proc/net/{tcp, udp, ... }.

I've added two simple connector events so that monitoring processes using
connector can get a notification of successful calls to connect/accept. This
allows a monitoring process to be aware of network connections without having
to jump through the inotify+proc parsing hoops.
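
As a rough illustration of the consumer side, the sketch below shows how a
monitoring process might label an event and render the address it carries.
It is only a sketch under assumptions: the SKETCH_ constants mirror the
enum values added by the two patches, the function names are made up for
this example, and the netlink receive path (a NETLINK_CONNECTOR socket
subscribed to CN_IDX_PROC) is left out entirely.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Values mirrored from the patches; the SKETCH_ names exist only here. */
#define SKETCH_EVENT_CONNECT 0x00000400u
#define SKETCH_EVENT_ACCEPT  0x00000800u

/* Return a short label for the value found in proc_event.what. */
static const char *sketch_event_name(uint32_t what)
{
	switch (what) {
	case SKETCH_EVENT_CONNECT:
		return "connect";
	case SKETCH_EVENT_ACCEPT:
		return "accept";
	default:
		return "other";
	}
}

/*
 * Render an address from an event as "host:port".
 * Returns 0 on success, -1 for address families this sketch ignores.
 */
static int sketch_format_addr(const struct sockaddr_storage *ss,
			      char *buf, size_t len)
{
	char host[INET6_ADDRSTRLEN];

	if (ss->ss_family == AF_INET) {
		struct sockaddr_in in;

		/* copy out rather than cast, to stay clear of aliasing issues */
		memcpy(&in, ss, sizeof(in));
		if (!inet_ntop(AF_INET, &in.sin_addr, host, sizeof(host)))
			return -1;
		snprintf(buf, len, "%s:%u", host, (unsigned)ntohs(in.sin_port));
		return 0;
	}
	if (ss->ss_family == AF_INET6) {
		struct sockaddr_in6 in6;

		memcpy(&in6, ss, sizeof(in6));
		if (!inet_ntop(AF_INET6, &in6.sin6_addr, host, sizeof(host)))
			return -1;
		snprintf(buf, len, "[%s]:%u", host, (unsigned)ntohs(in6.sin6_port));
		return 0;
	}
	return -1;
}
```

With the pid and tgid taken from the same event, a listener could then log
lines such as "pid 1234 connect 127.0.0.1:80" without touching /proc.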

Thanks,
Joe

Joe Damato (2):
  Create a new connector proc_event for successful calls to connect.
  Create a new connector proc_event for successful calls to accept.

 drivers/connector/cn_proc.c |   70 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/cn_proc.h     |   39 +++++++++++++++++++++++-
 net/socket.c                |    9 +++++
 3 files changed, 117 insertions(+), 1 deletions(-)

-- 
1.7.4.1


* Re: [PATCH] net: filter: Convert the BPF VM to threaded code
From: Hagen Paul Pfeifer @ 2011-08-01 18:16 UTC (permalink / raw)
  To: Rui Ueyama; +Cc: netdev
In-Reply-To: <CACKH++ZfNTB7Y8YhvQnZPEXpwmpWXzxQgnWniamDrjRWUwxaNw@mail.gmail.com>

* Rui Ueyama | 2011-07-29 01:10:26 [-0700]:

>Convert the BPF VM to threaded code to improve performance.
>
>The BPF VM is basically a big for loop containing a switch statement.  That is
>slow because for each instruction it checks the for loop condition and does the
>conditional branch of the switch statement.
>
>This patch eliminates the conditional branch by replacing it with a jump table
>using GCC's labels-as-values feature. The for loop condition check can also be
>removed, because the filter code always ends with a RET instruction.

With commit 01f2f3f6ef4d076c I reworked the BPF code so that gcc is able to
generate a jump table; I double-checked this. Not sure what happened in the
meantime.

Hagen




* Re: [net-next v2 70/71] tile: Move the Tilera driver
From: Chris Metcalf @ 2011-08-01 17:21 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: davem, netdev, gospo, sassmann
In-Reply-To: <1312082850-24914-71-git-send-email-jeffrey.t.kirsher@intel.com>

On 7/30/2011 11:27 PM, Jeff Kirsher wrote:
> Move the Tilera driver into drivers/net/ethernet/tile and
> make the necessary Kconfig and Makefile changes.
>
> CC: Chris Metcalf <cmetcalf@tilera.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> [...]
> +++ b/drivers/net/ethernet/tile/Kconfig
> @@ -0,0 +1,28 @@
> +#
> +# Tilera network device configuration
> +#
> +
> +config NET_VENDOR_TILERA
> +	bool "Tilera devices"
> +	depends on TILE
> +	---help---
> +	  If you have a network (Ethernet) card belonging to this class, say Y
> +	  and read the Ethernet-HOWTO, available from
> +	  <http://www.tldp.org/docs.html#howto>.
> +
> +	  Note that the answer to this question doesn't directly affect the
> +	  kernel: saying N will just cause the configurator to skip all
> +	  the questions about Tilera cards. If you say Y, you will be asked for
> +	  your specific card in the following questions.
> +
> +config TILE_NET
> +	tristate "Tilera GBE/XGBE network driver support"
> +	depends on NET_VENDOR_TILERA && TILE
> +	default y
> +	select CRC32
> +	---help---
> +	  This is a standard Linux network device driver for the
> +	  on-chip Tilera Gigabit Ethernet and XAUI interfaces.
> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called tile_net.

Overall, this seems fine, since the Tilera drivers get grouped more
appropriately as a result.  However, the drivers in question are not
Ethernet cards (and Tilera is not an Ethernet card vendor and has no plans
to become one).  Instead, this is the driver support for the built-in
networking hardware on the Tilera multicore CPU chip.  I'm happy to group
this support under drivers/net/ethernet/tile/, but I think it's appropriate
to default it to "Y" if you are building a TILE kernel (since you are
guaranteed to have the networking hardware available).

I suspect for now the cleanest thing to do is to fold the two config
options together, using NET_VENDOR_TILERA for consistency with other
NET_VENDOR_xxx symbols, and defaulting it to "Y" via "depends on TILE".  I
don't think the Ethernet-HOWTO reference is particularly helpful since it
mostly tackles all the various card issues, kernel boot param issues, etc.,
none of which are relevant to this driver.  Something like:

+config NET_VENDOR_TILERA
+	bool "Tilera devices"
+	depends on TILE
+	default y
+	select CRC32
+	---help---
+	  This is a standard Linux network device driver for the arch/tile
+	  on-chip Gigabit Ethernet and XAUI interfaces.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called tile_net.

Obviously you'd also need to tweak the TILE_NET symbol in the Makefile to
be NET_VENDOR_TILERA.  If this makes sense to you, go ahead and make the
change, and feel free to use my

Acked-by: Chris Metcalf <cmetcalf@tilera.com>

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com


* Re: [net-next v2 17/71] myri*: Move the Myricom drivers
From: Jon Mason @ 2011-08-01 17:09 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: davem, netdev, gospo, sassmann, Andrew Gallatin, Brice Goglin
In-Reply-To: <1312082850-24914-18-git-send-email-jeffrey.t.kirsher@intel.com>

On Sat, Jul 30, 2011 at 10:26 PM, Jeff Kirsher
<jeffrey.t.kirsher@intel.com> wrote:
> Move the Myricom drivers into drivers/net/ethernet/myricom/ and make
> the necessary Kconfig and Makefile changes.

Acked-by: Jon Mason <mason@myri.com>

> CC: Andrew Gallatin <gallatin@myri.com>
> CC: Brice Goglin <brice@myri.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>  MAINTAINERS                                        |    2 +-
>  drivers/net/Kconfig                                |   26 ------------
>  drivers/net/Makefile                               |    1 -
>  drivers/net/ethernet/Kconfig                       |    1 +
>  drivers/net/ethernet/Makefile                      |    1 +
>  drivers/net/ethernet/myricom/Kconfig               |   42 ++++++++++++++++++++
>  drivers/net/ethernet/myricom/Makefile              |    5 ++
>  .../net/{ => ethernet/myricom}/myri10ge/Makefile   |    0
>  .../net/{ => ethernet/myricom}/myri10ge/myri10ge.c |    0
>  .../{ => ethernet/myricom}/myri10ge/myri10ge_mcp.h |    0
>  .../myricom}/myri10ge/myri10ge_mcp_gen_header.h    |    0
>  11 files changed, 50 insertions(+), 28 deletions(-)
>  create mode 100644 drivers/net/ethernet/myricom/Kconfig
>  create mode 100644 drivers/net/ethernet/myricom/Makefile
>  rename drivers/net/{ => ethernet/myricom}/myri10ge/Makefile (100%)
>  rename drivers/net/{ => ethernet/myricom}/myri10ge/myri10ge.c (100%)
>  rename drivers/net/{ => ethernet/myricom}/myri10ge/myri10ge_mcp.h (100%)
>  rename drivers/net/{ => ethernet/myricom}/myri10ge/myri10ge_mcp_gen_header.h (100%)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ddec2eb..3423692 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4308,7 +4308,7 @@ M:        Andrew Gallatin <gallatin@myri.com>
>  L:     netdev@vger.kernel.org
>  W:     http://www.myri.com/scs/download-Myri10GE.html
>  S:     Supported
> -F:     drivers/net/myri10ge/
> +F:     drivers/net/ethernet/myricom/myri10ge/
>
>  NATSEMI ETHERNET DRIVER (DP8381x)
>  M:     Tim Hockin <thockin@hockin.org>
> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> index 56c033a..38fcaea 100644
> --- a/drivers/net/Kconfig
> +++ b/drivers/net/Kconfig
> @@ -1519,32 +1519,6 @@ config VXGE_DEBUG_TRACE_ALL
>          the vxge driver. By default only few debug trace statements are
>          enabled.
>
> -config MYRI10GE
> -       tristate "Myricom Myri-10G Ethernet support"
> -       depends on PCI && INET
> -       select FW_LOADER
> -       select CRC32
> -       select INET_LRO
> -       ---help---
> -         This driver supports Myricom Myri-10G Dual Protocol interface in
> -         Ethernet mode. If the eeprom on your board is not recent enough,
> -         you will need a newer firmware image.
> -         You may get this image or more information, at:
> -
> -         <http://www.myri.com/scs/download-Myri10GE.html>
> -
> -         To compile this driver as a module, choose M here. The module
> -         will be called myri10ge.
> -
> -config MYRI10GE_DCA
> -       bool "Direct Cache Access (DCA) Support"
> -       default y
> -       depends on MYRI10GE && DCA && !(MYRI10GE=y && DCA=m)
> -       ---help---
> -         Say Y here if you want to use Direct Cache Access (DCA) in the
> -         driver.  DCA is a method for warming the CPU cache before data
> -         is used, with the intent of lessening the impact of cache misses.
> -
>  config PASEMI_MAC
>        tristate "PA Semi 1/10Gbit MAC"
>        depends on PPC_PASEMI && PCI && INET
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index 73e357e..b9e1f5a 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -151,7 +151,6 @@ obj-$(CONFIG_R8169) += r8169.o
>  obj-$(CONFIG_IBMVETH) += ibmveth.o
>  obj-$(CONFIG_S2IO) += s2io.o
>  obj-$(CONFIG_VXGE) += vxge/
> -obj-$(CONFIG_MYRI10GE) += myri10ge/
>  obj-$(CONFIG_PXA168_ETH) += pxa168_eth.o
>  obj-$(CONFIG_BFIN_MAC) += bfin_mac.o
>  obj-$(CONFIG_DM9000) += dm9000.o
> diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
> index 8bbddc9..ce3040d 100644
> --- a/drivers/net/ethernet/Kconfig
> +++ b/drivers/net/ethernet/Kconfig
> @@ -21,6 +21,7 @@ source "drivers/net/ethernet/emulex/Kconfig"
>  source "drivers/net/ethernet/intel/Kconfig"
>  source "drivers/net/ethernet/i825xx/Kconfig"
>  source "drivers/net/ethernet/mellanox/Kconfig"
> +source "drivers/net/ethernet/myricom/Kconfig"
>  source "drivers/net/ethernet/qlogic/Kconfig"
>  source "drivers/net/ethernet/racal/Kconfig"
>  source "drivers/net/ethernet/sfc/Kconfig"
> diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
> index e5f2954..b4dcb93 100644
> --- a/drivers/net/ethernet/Makefile
> +++ b/drivers/net/ethernet/Makefile
> @@ -12,6 +12,7 @@ obj-$(CONFIG_NET_VENDOR_EMULEX) += emulex/
>  obj-$(CONFIG_NET_VENDOR_INTEL) += intel/
>  obj-$(CONFIG_NET_VENDOR_I825XX) += i825xx/
>  obj-$(CONFIG_NET_VENDOR_MELLANOX) += mellanox/
> +obj-$(CONFIG_NET_VENDOR_MYRI) += myricom/
>  obj-$(CONFIG_NET_VENDOR_QLOGIC) += qlogic/
>  obj-$(CONFIG_NET_VENDOR_RACAL) += racal/
>  obj-$(CONFIG_SFC) += sfc/
> diff --git a/drivers/net/ethernet/myricom/Kconfig b/drivers/net/ethernet/myricom/Kconfig
> new file mode 100644
> index 0000000..8dc4241
> --- /dev/null
> +++ b/drivers/net/ethernet/myricom/Kconfig
> @@ -0,0 +1,42 @@
> +#
> +# Myricom device configuration
> +#
> +
> +config NET_VENDOR_MYRI
> +       bool "Myricom devices"
> +       depends on PCI || INET
> +       ---help---
> +         If you have a network (Ethernet) card belonging to this class, say
> +         Y and read the Ethernet-HOWTO, available from
> +         <http://www.tldp.org/docs.html#howto>.
> +
> +         Note that the answer to this question doesn't directly affect the
> +         kernel: saying N will just cause the configurator to skip all
> +         the questions about Myricom cards. If you say Y, you will be asked for
> +         your specific card in the following questions.
> +
> +config MYRI10GE
> +       tristate "Myricom Myri-10G Ethernet support"
> +       depends on NET_VENDOR_MYRI && PCI && INET
> +       select FW_LOADER
> +       select CRC32
> +       select INET_LRO
> +       ---help---
> +         This driver supports Myricom Myri-10G Dual Protocol interface in
> +         Ethernet mode. If the eeprom on your board is not recent enough,
> +         you will need a newer firmware image.
> +         You may get this image or more information, at:
> +
> +         <http://www.myri.com/scs/download-Myri10GE.html>
> +
> +         To compile this driver as a module, choose M here. The module
> +         will be called myri10ge.
> +
> +config MYRI10GE_DCA
> +       bool "Direct Cache Access (DCA) Support"
> +       default y
> +       depends on MYRI10GE && DCA && !(MYRI10GE=y && DCA=m)
> +       ---help---
> +         Say Y here if you want to use Direct Cache Access (DCA) in the
> +         driver.  DCA is a method for warming the CPU cache before data
> +         is used, with the intent of lessening the impact of cache misses.
> diff --git a/drivers/net/ethernet/myricom/Makefile b/drivers/net/ethernet/myricom/Makefile
> new file mode 100644
> index 0000000..296c0a1
> --- /dev/null
> +++ b/drivers/net/ethernet/myricom/Makefile
> @@ -0,0 +1,5 @@
> +#
> +# Makefile for the Myricom network device drivers.
> +#
> +
> +obj-$(CONFIG_MYRI10GE) += myri10ge/
> diff --git a/drivers/net/myri10ge/Makefile b/drivers/net/ethernet/myricom/myri10ge/Makefile
> similarity index 100%
> rename from drivers/net/myri10ge/Makefile
> rename to drivers/net/ethernet/myricom/myri10ge/Makefile
> diff --git a/drivers/net/myri10ge/myri10ge.c b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
> similarity index 100%
> rename from drivers/net/myri10ge/myri10ge.c
> rename to drivers/net/ethernet/myricom/myri10ge/myri10ge.c
> diff --git a/drivers/net/myri10ge/myri10ge_mcp.h b/drivers/net/ethernet/myricom/myri10ge/myri10ge_mcp.h
> similarity index 100%
> rename from drivers/net/myri10ge/myri10ge_mcp.h
> rename to drivers/net/ethernet/myricom/myri10ge/myri10ge_mcp.h
> diff --git a/drivers/net/myri10ge/myri10ge_mcp_gen_header.h b/drivers/net/ethernet/myricom/myri10ge/myri10ge_mcp_gen_header.h
> similarity index 100%
> rename from drivers/net/myri10ge/myri10ge_mcp_gen_header.h
> rename to drivers/net/ethernet/myricom/myri10ge/myri10ge_mcp_gen_header.h
> --
> 1.7.6
>


* Re: [net-next v2 16/71] mlx4: Move the Mellanox driver
From: Eli Cohen @ 2011-08-01 16:15 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Jeff Kirsher, davem, netdev, gospo, sassmann
In-Reply-To: <CAG4TOxPP-+DnPmveCryXpiiuQT=kvS65yfMDnph2SumuNM+r_A@mail.gmail.com>

On Mon, Aug 01, 2011 at 06:10:21AM -0700, Roland Dreier wrote:
> 
> Hi,
> 
> no objection to this, but if we're going to move this code around,
> maybe it makes sense to split the mlx4_core and mlx4_en code
> into separate directories at the same time?
> 

Hi Roland,
it makes sense to split the original mlx4 driver into mlx4_en and
mlx4_core. We will submit patches in a few days.


* Re: [net-next v2 54/71] lantiq: Move the Lantiq SoC driver
From: John Crispin @ 2011-08-01 16:14 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: davem, netdev, gospo, sassmann, John Crispin
In-Reply-To: <1312082850-24914-55-git-send-email-jeffrey.t.kirsher@intel.com>

On 7/31/11 5:27 AM, Jeff Kirsher wrote:
> Move the Lantiq driver into drivers/net/ethernet/ and the
> necessary Kconfig and Makefile changes.
>
> CC: John Crispin<blogic@openwrt.org>
> Signed-off-by: Jeff Kirsher<jeffrey.t.kirsher@intel.com>

Acked-by: John Crispin<blogic@openwrt.org>




* Re: [Bug?] Machine hangs, rtl8192se possible cause
From: Larry Finger @ 2011-08-01 15:30 UTC (permalink / raw)
  To: Jaroslaw Fedewicz
  Cc: linux-kernel@vger.kernel.org,
	linux-wireless@vger.kernel.org,
	netdev@vger.kernel.org
In-Reply-To: <CAFdoEiT54o+NQKO=dAcOXgyf6GJ1E51w_H_p7PJyRmynz8Sbzg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 08/01/2011 08:26 AM, Jaroslaw Fedewicz wrote:
> Hello,
>
> I own a Thinkpad Edge 13 (AMD, machine type 0197) laptop, which is
> shipped with a Realtek 8192 SE WLAN card.
>
> The WLAN support with this particular card was never brilliant under
> Linux, first with (very) flakey drivers from Realtek which would stop
> transmitting packets every so often or panic after a few hours of
> usage. The in-tree drivers are better in this respect, but I'm
> experiencing mysterious hangups every once in a while. The machine is
> effectively dead and has to be power-cycled — no oops, no kernel
> panic, no nothing, it just hangs and that's it.
>
> I'm sure this is not a regression because the hangups were right there
> from the start.
>
> The last meaningful message which might be helpful was: "wait for
> BIT(6) return value X" (I don't remember what X was, it was a while
> ago and only once).
>
> I don't know if there are other means to debug (netconsole over eth0?)
> those hangs. The only other thing I know for sure is that I can get a
> week-long uptime if I blacklist rtl8192se.ko from loading.
>
> If I can provide any additional information to track the bug (or a
> faulty piece of hardware?) down, please tell me. Google tells me
> nobody reported this before, or it was just me feeding incorrect
> keywords.
>
> Thanks for your kind attention.
>
> P. S. Tried netconsole before, got nothing to pinpoint the error. The
> only recurring pattern I could see in it was that almost every time
> the machine hung was after ip6tables initialized, at least it was
> the last message in the log.
>
> P. P. S. I don't track netdev@ and linux-wireless@ lists, so please Cc: me.

What kernel are you using? The only problems I've had were some kernel panics 
due to improper handling of memory allocation failures with the receive skb's, 
but they have been fixed.

It can be difficult to use netconsole to debug problems with wireless devices.

As you prevent rtl8192se from loading automatically, the logging console may 
provide some clues. Use the following command to load the driver:

sleep 10 ; modprobe rtl8192se

During the 10 second sleep, use CTRL-ALT-F10 to switch consoles and see if any 
messages appear.

Please use 'lspci -nn' to determine which version of the card you have.

Thanks,

Larry


* Re: PROBLEM: BUG (NULL ptr dereference in ipv4_dst_check)
From: synapse @ 2011-08-01 15:25 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1312190145.2719.2.camel@edumazet-laptop>

On 08/01/11 11:15, Eric Dumazet wrote:
> On Monday, 01 August 2011 at 10:57 +0200, synapse wrote:
>> Hello
>>
>> Sorry, I wasn't home on the weekend. Exactly to which tree should I
>> apply this?
>> It doesn't apply cleanly to 3.0.0. Am I missing something?
>>
> Could you try latest linux tree ?
>
> We first validate patches on current tree, then backport them if needed
> to previous kernels.
>
> Thanks
>
deployed, we'll see if it works out :)

Gergely Kalman


* Re: [PATCH net-next 0/5] qlcnic: Fixes and debug support
From: Anirban Chakraborty @ 2011-08-01 15:24 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Dept_NX_Linux_NIC_Driver
In-Reply-To: <20110801.015706.719035744873995476.davem@davemloft.net>


On Aug 1, 2011, at 1:57 AM, David Miller wrote:

> From: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
> Date: Fri, 29 Jul 2011 16:30:30 -0700
> 
>> Please apply the series to net-next. Thanks.
> 
> Queued up for net-next

Would it be too much trouble to push these to net-2.6, as these are minor bug fixes anyway?

Thanks a lot.

-Anirban




* Re: [GIT] Networking
From: Ingo Molnar @ 2011-08-01 15:13 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, akpm, netdev, linux-kernel
In-Reply-To: <20110722.073339.1236244143490935644.davem@davemloft.net>

* David Miller <davem@davemloft.net> wrote:

>       forcedeth: do vlan cleanup

Trying to bring latest -git into latest -tip today, I managed to 
bisect a pretty bad networking breakage on one of my testboxes back 
to this commit, where I discovered that it had been freshly fixed.

This bug cost me multiple days of debugging so here's a bit of a post 
mortem.

The bit that IMO wasn't very optimal was the timing of the merge path:

 -                  AuthorDate: Wed Jul 20 04:54:38 2011 +0000
 -                  CommitDate: Thu Jul 21 13:47:57 2011 -0700
 - tree Linus merge CommitDate: Fri Jul 22 14:43:13 2011 -0700
 -        first lkml bugreport: Sun Jul 24 16:10:59 2011 -0700
 -              fix CommitDate: Wed Jul 27 22:39:30 2011 -0700
 - fix  Linus merge CommitDate: Thu Jul 28 05:58:19 2011 -0700

So you can see that the commit was committed to net-next within 24 
hours of it being submitted, and the (bad) breakage was not discovered 
until 4 days down the road.

I submit that *no one* with real forcedeth hardware actually tested 
this commit before it hit upstream. It has not touched linux-next 
before going to Linus and it took 8 days for the fix to get upstream.

If the latency of common driver bugfixes is on the order of 1 week 
then the golden rule is that commits must be tested for at least 1 
week as well. One day of testing was *way* too short.

Furthermore, the changelog of the fix:

 0891b0e08937: forcedeth: fix vlans

doesn't contain any reference to the bisection work done by
walt <w41ter@gmail.com> nor by any of the other bugreporters.

So this really sucked all around - could we please improve on it?

Thanks,

	Ingo


* [PATCH 1/1] atm: br2864: sent packets truncated in VC routed mode
From: chas williams - CONTRACTOR @ 2011-08-01 15:03 UTC (permalink / raw)
  To: netdev; +Cc: pascal, davem, linux-atm-general

Hopefully this can be included in the next kernel release.

From: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>

commit 8b5e1c9db2bcd2d13c2db08e6c6dbe66882fa186
Author: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Date:   Mon Aug 1 07:55:07 2011 -0400

    atm: br2864: sent packets truncated in VC routed mode
    
    Reported-by: Pascal Hambourg <pascal@plouf.fr.eu.org>
    Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>

diff --git a/net/atm/br2684.c b/net/atm/br2684.c
index 2252c20..52cfd0c 100644
--- a/net/atm/br2684.c
+++ b/net/atm/br2684.c
@@ -242,8 +242,6 @@ static int br2684_xmit_vcc(struct sk_buff *skb, struct net_device *dev,
 		if (brdev->payload == p_bridged) {
 			skb_push(skb, 2);
 			memset(skb->data, 0, 2);
-		} else { /* p_routed */
-			skb_pull(skb, ETH_HLEN);
 		}
 	}
 	skb_debug(skb);
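
A toy model of why the removed skb_pull() would truncate routed traffic.
This is a sketch under an assumption the commit message does not spell
out: that in VC routed mode the buffer handed to br2684_xmit_vcc() already
starts at the network header, with no Ethernet header in front of it.
struct toy_skb and toy_pull() are stand-ins invented for this example and
are not the kernel's sk_buff API.

```c
#include <stddef.h>

#define SKETCH_ETH_HLEN 14	/* same value as the kernel's ETH_HLEN */

/* Toy stand-in for the data/len pair of an sk_buff. */
struct toy_skb {
	const unsigned char *data;
	size_t len;
};

/*
 * Mimics skb_pull(): drop n bytes from the front of the buffer.
 * Returns the new data pointer, or NULL if n exceeds the length.
 */
static const unsigned char *toy_pull(struct toy_skb *skb, size_t n)
{
	if (n > skb->len)
		return NULL;
	skb->data += n;
	skb->len -= n;
	return skb->data;
}
```

Under that assumption, a packet whose first 20 bytes are an IPv4 header
would lose 14 of them to the pull, leaving only the last 6 header bytes in
front of the payload.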


* Re: net-next-2.6 [PATCH 0/7] dccp: add support for dynamic parameter updates
From: Gerrit Renker @ 2011-08-01 14:43 UTC (permalink / raw)
  To: David Miller; +Cc: dccp, netdev
In-Reply-To: <20110801.001042.6263177526030880.davem@davemloft.net>

Quoting David S. Miller:
| From: Gerrit Renker <gerrit@erg.abdn.ac.uk>
| Date: Mon, 25 Jul 2011 07:36:29 -0600
| 
| > I have also placed this into a fresh (today's) copy of net-next-2.6, on
| > 
| >     git://eden-feed.erg.abdn.ac.uk/net-next-2.6        [subtree 'dccp']
| 
| I did a test pull and this URL doesn't work:
| 
| 
Sorry, I must have clobbered something during a subsequent update last week. 

I have just regenerated the whole tree from scratch, including the 'dccp' sub-branch,
using today's net-next-2.6, double-checking it by running
  git ls-remote git://eden-feed.erg.abdn.ac.uk/net-next-2.6  dccp
and doing a test-pull of the dccp sub-tree.

Can you please consider pulling again, from

    git://eden-feed.erg.abdn.ac.uk/net-next-2.6		[sub-tree 'dccp']

The patch listing of this set is at
  http://eden-feed.erg.abdn.ac.uk/cgi-bin/gitweb.cgi?p=net-next-2.6.git;a=log;h=dccp

Thank you for the update
Gerrit

* Re: [net-next v2 56/71] macb: Move the Atmel driver
From: Nicolas Ferre @ 2011-08-01 14:49 UTC (permalink / raw)
  To: Jeff Kirsher, netdev; +Cc: davem, gospo, sassmann, Jamie Iles
In-Reply-To: <1312082850-24914-57-git-send-email-jeffrey.t.kirsher@intel.com>

On 07/31/2011 04:27 AM, Jeff Kirsher wrote:
> Move the Atmel driver into drivers/net/ethernet/cadence/ and
> make the necessary Kconfig and Makefile changes.
>
> CC: Nicolas Ferre <nicolas.ferre@atmel.com>

You can add my:
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>

> CC: Jamie Iles <jamie@jamieiles.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> ---
>   MAINTAINERS                                        |    2 +-
>   drivers/net/Kconfig                                |   16 --------
>   drivers/net/Makefile                               |    3 -
>   drivers/net/arm/Kconfig                            |   12 ------
>   drivers/net/arm/Makefile                           |    6 ---
>   drivers/net/ethernet/Kconfig                       |    1 +
>   drivers/net/ethernet/Makefile                      |    1 +
>   drivers/net/ethernet/cadence/Kconfig               |   40 ++++++++++++++++++++
>   drivers/net/ethernet/cadence/Makefile              |    6 +++
>   drivers/net/{arm =>  ethernet/cadence}/at91_ether.c |    0
>   drivers/net/{arm =>  ethernet/cadence}/at91_ether.h |    0
>   drivers/net/{ =>  ethernet/cadence}/macb.c          |    0
>   drivers/net/{ =>  ethernet/cadence}/macb.h          |    0
>   13 files changed, 49 insertions(+), 38 deletions(-)
>   delete mode 100644 drivers/net/arm/Kconfig
>   delete mode 100644 drivers/net/arm/Makefile
>   create mode 100644 drivers/net/ethernet/cadence/Kconfig
>   create mode 100644 drivers/net/ethernet/cadence/Makefile
>   rename drivers/net/{arm =>  ethernet/cadence}/at91_ether.c (100%)
>   rename drivers/net/{arm =>  ethernet/cadence}/at91_ether.h (100%)
>   rename drivers/net/{ =>  ethernet/cadence}/macb.c (100%)
>   rename drivers/net/{ =>  ethernet/cadence}/macb.h (100%)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 83a51ad..995f504 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1317,7 +1317,7 @@ F:	include/video/atmel_lcdc.h
>   ATMEL MACB ETHERNET DRIVER
>   M:	Nicolas Ferre <nicolas.ferre@atmel.com>
>   S:	Supported
> -F:	drivers/net/macb.*
> +F:	drivers/net/ethernet/cadence/
>
>   ATMEL SPI DRIVER
>   M:	Nicolas Ferre <nicolas.ferre@atmel.com>
> diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
> index bb4bc4b..6db322c 100644
> --- a/drivers/net/Kconfig
> +++ b/drivers/net/Kconfig
> @@ -2,9 +2,6 @@
>   # Network device configuration
>   #
>
> -config HAVE_NET_MACB
> -	bool
> -
>   menuconfig NETDEVICES
>   	default y if UML
>   	depends on NET
> @@ -224,19 +221,6 @@ menuconfig NET_ETHERNET
>
>   if NET_ETHERNET
>
> -config MACB
> -	tristate "Atmel MACB support"
> -	depends on HAVE_NET_MACB
> -	select PHYLIB
> -	help
> -	  The Atmel MACB ethernet interface is found on many AT32 and AT91
> -	  parts. Say Y to include support for the MACB chip.
> -
> -	  To compile this driver as a module, choose M here: the module
> -	  will be called macb.
> -
> -source "drivers/net/arm/Kconfig"
> -
>   config SH_ETH
>   	tristate "Renesas SuperH Ethernet support"
>   	depends on SUPERH && \
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index d249d76..d7873ba 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -63,9 +63,6 @@ obj-$(CONFIG_ETHOC) += ethoc.o
>   obj-$(CONFIG_GRETH) += greth.o
>
>   obj-$(CONFIG_DNET) += dnet.o
> -obj-$(CONFIG_MACB) += macb.o
> -
> -obj-$(CONFIG_ARM) += arm/
>   obj-$(CONFIG_DEV_APPLETALK) += appletalk/
>   obj-$(CONFIG_ETHERNET) += ethernet/
>   obj-$(CONFIG_TR) += tokenring/
> diff --git a/drivers/net/arm/Kconfig b/drivers/net/arm/Kconfig
> deleted file mode 100644
> index 57d16b9..0000000
> --- a/drivers/net/arm/Kconfig
> +++ /dev/null
> @@ -1,12 +0,0 @@
> -#
> -# Acorn Network device configuration
> -#  These are for Acorn's Expansion card network interfaces
> -#
> -
> -config ARM_AT91_ETHER
> -	tristate "AT91RM9200 Ethernet support"
> -	depends on ARM && ARCH_AT91RM9200
> -	select MII
> -	help
> -	  If you wish to compile a kernel for the AT91RM9200 and enable
> -	  ethernet support, then you should always answer Y to this.
> diff --git a/drivers/net/arm/Makefile b/drivers/net/arm/Makefile
> deleted file mode 100644
> index fc0f85c..0000000
> --- a/drivers/net/arm/Makefile
> +++ /dev/null
> @@ -1,6 +0,0 @@
> -# File: drivers/net/arm/Makefile
> -#
> -# Makefile for the ARM network device drivers
> -#
> -
> -obj-$(CONFIG_ARM_AT91_ETHER)	+= at91_ether.o
> diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
> index e087337..68a31b9 100644
> --- a/drivers/net/ethernet/Kconfig
> +++ b/drivers/net/ethernet/Kconfig
> @@ -15,6 +15,7 @@ source "drivers/net/ethernet/3com/Kconfig"
>   source "drivers/net/ethernet/amd/Kconfig"
>   source "drivers/net/ethernet/apple/Kconfig"
>   source "drivers/net/ethernet/atheros/Kconfig"
> +source "drivers/net/ethernet/cadence/Kconfig"
>   source "drivers/net/ethernet/adi/Kconfig"
>   source "drivers/net/ethernet/broadcom/Kconfig"
>   source "drivers/net/ethernet/brocade/Kconfig"
> diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
> index 826db27..0e91c4d 100644
> --- a/drivers/net/ethernet/Makefile
> +++ b/drivers/net/ethernet/Makefile
> @@ -7,6 +7,7 @@ obj-$(CONFIG_NET_VENDOR_8390) += 8390/
>   obj-$(CONFIG_NET_VENDOR_AMD) += amd/
>   obj-$(CONFIG_NET_VENDOR_APPLE) += apple/
>   obj-$(CONFIG_NET_VENDOR_ATHEROS) += atheros/
> +obj-$(CONFIG_NET_ATMEL) += cadence/
>   obj-$(CONFIG_NET_BFIN) += adi/
>   obj-$(CONFIG_NET_VENDOR_BROADCOM) += broadcom/
>   obj-$(CONFIG_NET_VENDOR_BROCADE) += brocade/
> diff --git a/drivers/net/ethernet/cadence/Kconfig b/drivers/net/ethernet/cadence/Kconfig
> new file mode 100644
> index 0000000..4c443da
> --- /dev/null
> +++ b/drivers/net/ethernet/cadence/Kconfig
> @@ -0,0 +1,40 @@
> +#
> +# Atmel device configuration
> +#
> +
> +config HAVE_NET_MACB
> +	bool
> +
> +config NET_ATMEL
> +	bool "Atmel devices"
> +	depends on HAVE_NET_MACB || (ARM && ARCH_AT91RM9200)
> +	---help---
> +	  If you have a network (Ethernet) card belonging to this class, say Y.
> +	  Make sure you know the name of your card. Read the Ethernet-HOWTO,
> +	  available from <http://www.tldp.org/docs.html#howto>.
> +
> +	  If unsure, say Y.
> +
> +	  Note that the answer to this question doesn't directly affect the
> +	  kernel: saying N will just cause the configurator to skip all
> +	  the remaining Atmel network card questions. If you say Y, you will be
> +	  asked for your specific card in the following questions.
> +
> +config ARM_AT91_ETHER
> +	tristate "AT91RM9200 Ethernet support"
> +	depends on NET_ATMEL && ARM && ARCH_AT91RM9200
> +	select MII
> +	---help---
> +	  If you wish to compile a kernel for the AT91RM9200 and enable
> +	  ethernet support, then you should always answer Y to this.
> +
> +config MACB
> +	tristate "Atmel MACB support"
> +	depends on NET_ATMEL && HAVE_NET_MACB
> +	select PHYLIB
> +	---help---
> +	  The Atmel MACB ethernet interface is found on many AT32 and AT91
> +	  parts. Say Y to include support for the MACB chip.
> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called macb.
> diff --git a/drivers/net/ethernet/cadence/Makefile b/drivers/net/ethernet/cadence/Makefile
> new file mode 100644
> index 0000000..9068b83
> --- /dev/null
> +++ b/drivers/net/ethernet/cadence/Makefile
> @@ -0,0 +1,6 @@
> +#
> +# Makefile for the Atmel network device drivers.
> +#
> +
> +obj-$(CONFIG_ARM_AT91_ETHER) += at91_ether.o
> +obj-$(CONFIG_MACB) += macb.o
> diff --git a/drivers/net/arm/at91_ether.c b/drivers/net/ethernet/cadence/at91_ether.c
> similarity index 100%
> rename from drivers/net/arm/at91_ether.c
> rename to drivers/net/ethernet/cadence/at91_ether.c
> diff --git a/drivers/net/arm/at91_ether.h b/drivers/net/ethernet/cadence/at91_ether.h
> similarity index 100%
> rename from drivers/net/arm/at91_ether.h
> rename to drivers/net/ethernet/cadence/at91_ether.h
> diff --git a/drivers/net/macb.c b/drivers/net/ethernet/cadence/macb.c
> similarity index 100%
> rename from drivers/net/macb.c
> rename to drivers/net/ethernet/cadence/macb.c
> diff --git a/drivers/net/macb.h b/drivers/net/ethernet/cadence/macb.h
> similarity index 100%
> rename from drivers/net/macb.h
> rename to drivers/net/ethernet/cadence/macb.h


* Re: [PATCH] cxgb3i: ref count cdev access to prevent modification while in use
From: Steve Wise @ 2011-08-01 14:42 UTC (permalink / raw)
  To: Neil Horman; +Cc: netdev, Divy Le Ray, Steve Wise, David S. Miller, Karen Xie
In-Reply-To: <20110801111824.GB23131@hmsreliant.think-freely.org>

>> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
>> CC: Divy Le Ray <divy@chelsio.com>
>> CC: Steve Wise <swise@chelsio.com>
>> CC: "David S. Miller" <davem@davemloft.net>
> Divy, Steve, I think Dave is waiting for an ACK from one of you too, since you're
> the listed maintainers.
> Neil

Karen is the cxgb3i maintainer, but I just now reviewed the patch and it looks good.  Divy is on vacation for a few
weeks.  I think it's OK to pull it in since Karen already acked it.

Steve.

* Re: [net-next v2 56/71] macb: Move the Atmel driver
From: Jamie Iles @ 2011-08-01 14:41 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: davem, netdev, gospo, sassmann, Nicolas Ferre, Jamie Iles
In-Reply-To: <1312082850-24914-57-git-send-email-jeffrey.t.kirsher@intel.com>

On Sat, Jul 30, 2011 at 08:27:15PM -0700, Jeff Kirsher wrote:
> Move the Atmel driver into drivers/net/ethernet/cadence/ and
> make the necessary Kconfig and Makefile changes.
> 
> CC: Nicolas Ferre <nicolas.ferre@atmel.com>
> CC: Jamie Iles <jamie@jamieiles.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Looks good to me.

Acked-by: Jamie Iles <jamie@jamieiles.com>

* [Bug?] Machine hangs, rtl8192se possible cause
From: Jaroslaw Fedewicz @ 2011-08-01 13:26 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	Larry.Finger-tQ5ms3gMjBLk1uMJSBkQmQ

Hello,

I own a Thinkpad Edge 13 (AMD, machine type 0197) laptop, which is
shipped with a Realtek 8192 SE WLAN card.

WLAN support for this particular card has never been brilliant under
Linux: first there were the (very) flaky drivers from Realtek, which would
stop transmitting packets every so often or panic after a few hours of
usage. The in-tree drivers are better in this respect, but I'm
experiencing mysterious hangs every once in a while. The machine is
effectively dead and has to be power-cycled: no oops, no kernel
panic, nothing at all; it just hangs and that's it.

I'm sure this is not a regression because the hangups were right there
from the start.

The last meaningful message which might be helpful was: "wait for
BIT(6) return value X" (I don't remember what X was, it was a while
ago and only once).

I don't know if there are other means to debug (netconsole over eth0?)
those hangs. The only other thing I know for sure is that I can get a
week-long uptime if I blacklist rtl8192se.ko from loading.

If I can provide any additional information to track the bug (or a
faulty piece of hardware?) down, please tell me. Google tells me
nobody reported this before, or it was just me feeding incorrect
keywords.

Thanks for your kind attention.

P. S. I tried netconsole before, but got nothing to pinpoint the error. The
only recurring pattern I could see was that almost every time
the machine hung, it was just after ip6tables initialized; at least that
was the last message in the log.

P. P. S. I don't track netdev@ and linux-wireless@ lists, so please Cc: me.

* Re: [net-next v2 16/71] mlx4: Move the Mellanox driver
From: Roland Dreier @ 2011-08-01 13:10 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: davem, netdev, gospo, sassmann
In-Reply-To: <1312082850-24914-17-git-send-email-jeffrey.t.kirsher@intel.com>

On Sat, Jul 30, 2011 at 8:26 PM, Jeff Kirsher
<jeffrey.t.kirsher@intel.com> wrote:
> Moves the Mellanox driver into drivers/net/ethernet/mellanox/ and
> make the necessary Kconfig and Makefile changes.

Hi,

no objection to this, but if we're going to move this code around,
maybe it makes sense to split the mlx4_core and mlx4_en code
into separate directories at the same time?

 - R.

* Re: [Bugme-new] [Bug 39742] New: 2.6.39.3 crash and hangs in 1-2 minutes. Still have traps.
From: Rustam Afanasyev @ 2011-08-01 11:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: netdev, bugme-daemon, Patrick McHardy
In-Reply-To: <20110722144214.577718f0.akpm@linux-foundation.org>

With the patch from xeb applied
(http://www.spinics.net/lists/netdev/msg170766.html)
there are no hangs any more. Panics still occur, only much
less frequently, and they seem to be in a different place.
Here they are.
The first looked like this:
[ 1582.319303] ------------[ cut here ]------------
[ 1582.323205] kernel BUG at include/linux/skbuff.h:1189!
[ 1582.323205] invalid opcode: 0000 [#1] SMP
[ 1582.323205] last sysfs file: /sys/devices/virtual/net/ppp168/uevent
[ 1582.323205] CPU 0
[ 1582.323205] Modules linked in: cls_fw sch_sfq arc4 ecb ppp_mppe act_mirred act_skbedit cls_u32 sch_ingress l2tp_ppp l2d
[ 1582.323205]
[ 1582.598100] Pid: 0, comm: swapper Not tainted 2.6.39-std-def-alt3 #1 ASUS RS100-E4/PI2/P5M2-M/RS100-E4
[ 1582.616208] RIP: 0010:[<ffffffff8134a94b>]  [<ffffffff8134a94b>] skb_pull+0x2b/0x30
[ 1582.616208] RSP: 0018:ffff88011fc03b20  EFLAGS: 00010293
[ 1582.616208] RAX: 0000000000000247 RBX: ffff880117ac2880 RCX: 0000000000000027
[ 1582.616208] RDX: 0000000000000027 RSI: 0000000000000002 RDI: ffff880117ac2880
[ 1582.698595] RBP: ffff88011fc03b20 R08: 0000000000000228 R09: 000000000000000c
[ 1582.698595] R10: 0000000000000240 R11: 0000000000000001 R12: ffff880118f00800
[ 1582.698595] R13: ffff8801176a086e R14: 000000000000002f R15: ffff880115346000
[ 1582.698595] FS:  0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[ 1582.698595] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1582.808009] CR2: 0000000000645f4c CR3: 000000011760b000 CR4: 00000000000006f0
[ 1582.808009] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1582.808009] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1582.808009] Process swapper (pid: 0, threadinfo ffffffff81800000, task ffffffff8180b020)
[ 1582.808009] Stack:
[ 1582.808009]  ffff88011fc03b50 ffffffffa034663a ffff88011fc03b80 ffff880118f00800
[ 1582.808009]  ffff880117ac2880 0000000000000000 ffff88011fc03b90 ffffffff81346ecc
[ 1582.808009]  ffff88011fc03bb0 ffffffff8134b4e9 0000000000000001 ffff880117ac2880
[ 1582.808009] Call Trace:
[ 1582.808009]  <IRQ>
[ 1582.808009]  [<ffffffffa034663a>] pptp_rcv_core+0x21a/0x220 [pptp]
[ 1582.808009]  [<ffffffff81346ecc>] sk_receive_skb+0x10c/0x140
[ 1582.808009]  [<ffffffff8134b4e9>] ? __pskb_pull_tail+0x59/0x3e0
[ 1582.808009]  [<ffffffffa0346783>] pptp_rcv+0x143/0x190 [pptp]
[ 1582.808009]  [<ffffffffa03250ad>] gre_rcv+0x5d/0x80 [gre]
[ 1582.808009]  [<ffffffff8138946d>] ip_local_deliver_finish+0xdd/0x2a0
[ 1582.808009]  [<ffffffff813897f0>] ip_local_deliver+0x80/0x90
[ 1582.808009]  [<ffffffff81389135>] ip_rcv_finish+0x135/0x390
[ 1582.808009]  [<ffffffff81389a1c>] ip_rcv+0x21c/0x2e0
[ 1582.808009]  [<ffffffff81356f5a>] __netif_receive_skb+0x52a/0x690
[ 1582.808009]  [<ffffffff8100cf01>] ? do_IRQ+0x61/0xe0
[ 1582.808009]  [<ffffffff8140a593>] ? common_interrupt+0x13/0x13
[ 1582.808009]  [<ffffffff813572d0>] netif_receive_skb+0x60/0x90
[ 1582.808009]  [<ffffffff8124de9c>] ? is_swiotlb_buffer+0x3c/0x50
[ 1582.808009]  [<ffffffff81357440>] napi_skb_finish+0x50/0x70
[ 1582.808009]  [<ffffffff813579bd>] napi_gro_receive+0xbd/0xd0
[ 1582.808009]  [<ffffffffa01b229b>] igb_poll+0x6fb/0xae0 [igb]
[ 1582.808009]  [<ffffffff81356f5a>] ? __netif_receive_skb+0x52a/0x690
[ 1582.808009]  [<ffffffff81357be5>] net_rx_action+0x135/0x270
[ 1582.808009]  [<ffffffff81062705>] __do_softirq+0xa5/0x1d0
[ 1582.808009]  [<ffffffff8141301c>] call_softirq+0x1c/0x30
[ 1582.808009]  [<ffffffff8100d355>] do_softirq+0x65/0xa0
[ 1582.808009]  [<ffffffff81062a96>] irq_exit+0x86/0xa0
[ 1582.808009]  [<ffffffff8100cf01>] do_IRQ+0x61/0xe0
[ 1582.808009]  [<ffffffff8140a593>] common_interrupt+0x13/0x13
[ 1582.808009]  <EOI>
[ 1582.808009]  [<ffffffff81012deb>] ? mwait_idle+0x9b/0x1d0
[ 1582.808009]  [<ffffffff8140de35>] ? atomic_notifier_call_chain+0x15/0x20
[ 1582.808009]  [<ffffffff8100a1e6>] cpu_idle+0x56/0xa0
[ 1582.808009]  [<ffffffff813f396d>] rest_init+0x6d/0x80
[ 1582.808009]  [<ffffffff8187fbe6>] start_kernel+0x392/0x39d
[ 1582.808009]  [<ffffffff8187f347>] x86_64_start_reservations+0x132/0x136
[ 1582.808009]  [<ffffffff8187f44c>] x86_64_start_kernel+0x101/0x110
[ 1582.808009] Code: 8b 47 68 55 48 89 e5 39 c6 77 1c 29 f0 3b 47 6c 89 47 68 72 16 89 f0 48 03 87 e0 00 00 00 48 89 87 e
[ 1582.808009] RIP  [<ffffffff8134a94b>] skb_pull+0x2b/0x30
[ 1582.808009]  RSP <ffff88011fc03b20>
[ 1583.628015] ---[ end trace 6c39f3d0e04ed229 ]---
[ 1583.641960] Kernel panic - not syncing: Fatal exception in interrupt
[ 1583.661088] Pid: 0, comm: swapper Tainted: G      D 2.6.39-std-def-alt3 #1
[ 1583.682846] Call Trace:
[ 1583.690254]  <IRQ>  [<ffffffff81407284>] panic+0x8c/0x197
[ 1583.706576]  [<ffffffff8140b4d2>] oops_end+0xe2/0xf0
[ 1583.721540]  [<ffffffff8100e876>] die+0x56/0x90
[ 1583.735215]  [<ffffffff8140abf4>] do_trap+0xc4/0x170
[ 1583.750191]  [<ffffffff8100bf00>] do_invalid_op+0x90/0xb0
[ 1583.766466]  [<ffffffff8134a94b>] ? skb_pull+0x2b/0x30
[ 1583.781965]  [<ffffffffa01f31d6>] ? ipt_do_table+0x256/0x670 [ip_tables]

The other was:
[  245.268856] ------------[ cut here ]------------
[  245.272797] kernel BUG at include/linux/skbuff.h:1189!
[  245.272797] invalid opcode: 0000 [#1] SMP
[  245.272797] last sysfs file: /sys/devices/virtual/net/ppp100/uevent
[  245.272797] CPU 3
[  245.272797] Modules linked in: cls_fw sch_sfq arc4 ecb ppp_mppe act_mirred act_skbedit cls_u32 sch_ingress l2tp_ppp l2d
[  245.272797]
[  245.272797] Pid: 0, comm: kworker/0:1 Not tainted 2.6.39-std-def-alt3 #1 ASUS RS100-E4/PI2/P5M2-M/RS100-E4
[  245.272797] RIP: 0010:[<ffffffff8134a94b>]  [<ffffffff8134a94b>] skb_pull+0x2b/0x30
[  245.272797] RSP: 0018:ffff88011fd83b20  EFLAGS: 00010297
[  245.272797] RAX: 0000000000000518 RBX: ffff880116a0e0c0 RCX: 0000000000000444
[  245.272797] RDX: 0000000000000444 RSI: 0000000000000010 RDI: ffff880116a0e0c0
[  245.272797] RBP: ffff88011fd83b20 R08: 0000000000000500 R09: 000000000000000c
[  245.272797] R10: 0000000000000240 R11: 0000000000000001 R12: ffff8801171fb400
[  245.272797] R13: ffff880116a0cc72 R14: 000000000000002f R15: ffff8801168de000
[  245.272797] FS:  0000000000000000(0000) GS:ffff88011fd80000(0000) knlGS:0000000000000000
[  245.272797] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  245.272797] CR2: 00000000008faa2c CR3: 00000001151c2000 CR4: 00000000000006e0
[  245.272797] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  245.272797] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  245.272797] Process kworker/0:1 (pid: 0, threadinfo ffff880118f46000, task ffff880118f444c0)
[  245.272797] Stack:
[  245.272797]  ffff88011fd83b50 ffffffffa03374e2 ffff88011fd83b80 ffff8801171fb400
[  245.272797]  ffff880116a0e0c0 0000000000000000 ffff88011fd83b90 ffffffff81346ecc
[  245.892667]  ffff88011fd83bb0 ffffffff8134b4e9 0000000000000001 ffff880116a0e0c0
[  245.892667] Call Trace:
[  245.892667]  <IRQ>
[  245.892667]  [<ffffffffa03374e2>] pptp_rcv_core+0xc2/0x220 [pptp]
[  245.892667]  [<ffffffff81346ecc>] sk_receive_skb+0x10c/0x140
[  245.892667]  [<ffffffff8134b4e9>] ? __pskb_pull_tail+0x59/0x3e0
[  245.892667]  [<ffffffffa0337783>] pptp_rcv+0x143/0x190 [pptp]
[  245.892667]  [<ffffffffa03160ad>] gre_rcv+0x5d/0x80 [gre]
[  245.892667]  [<ffffffff8138946d>] ip_local_deliver_finish+0xdd/0x2a0
[  245.892667]  [<ffffffff813897f0>] ip_local_deliver+0x80/0x90
[  245.892667]  [<ffffffff81389135>] ip_rcv_finish+0x135/0x390
[  245.892667]  [<ffffffff81389a1c>] ip_rcv+0x21c/0x2e0
[  245.892667]  [<ffffffff81356f5a>] __netif_receive_skb+0x52a/0x690
[  245.892667]  [<ffffffff81137985>] ? __kmalloc_node_track_caller+0x55/0x60
[  245.892667]  [<ffffffff813572d0>] netif_receive_skb+0x60/0x90
[  245.892667]  [<ffffffff8124de9c>] ? is_swiotlb_buffer+0x3c/0x50
[  245.892667]  [<ffffffff81357440>] napi_skb_finish+0x50/0x70
[  245.892667]  [<ffffffff813579bd>] napi_gro_receive+0xbd/0xd0
[  245.892667]  [<ffffffffa01ea29b>] igb_poll+0x6fb/0xae0 [igb]
[  245.892667]  [<ffffffff81356f5a>] ? __netif_receive_skb+0x52a/0x690
[  245.892667]  [<ffffffff81357be5>] net_rx_action+0x135/0x270
[  245.892667]  [<ffffffff81062705>] __do_softirq+0xa5/0x1d0
[  245.892667]  [<ffffffff8141301c>] call_softirq+0x1c/0x30
[  245.892667]  [<ffffffff8100d355>] do_softirq+0x65/0xa0
[  245.892667]  [<ffffffff81062a96>] irq_exit+0x86/0xa0
[  245.892667]  [<ffffffff8100cf01>] do_IRQ+0x61/0xe0
[  245.892667]  [<ffffffff8140a593>] common_interrupt+0x13/0x13
[  245.892667]  <EOI>
[  245.892667]  [<ffffffff81012deb>] ? mwait_idle+0x9b/0x1d0
[  245.892667]  [<ffffffff8140de35>] ? atomic_notifier_call_chain+0x15/0x20
[  245.892667]  [<ffffffff8100a1e6>] cpu_idle+0x56/0xa0
[  245.892667]  [<ffffffff81403729>] start_secondary+0x197/0x19c
[  245.892667] Code: 8b 47 68 55 48 89 e5 39 c6 77 1c 29 f0 3b 47 6c 89 47 68 72 16 89 f0 48 03 87 e0 00 00 00 48 89 87 e
[  245.892667] RIP  [<ffffffff8134a94b>] skb_pull+0x2b/0x30
[  245.892667]  RSP <ffff88011fd83b20>
[  246.520989] ---[ end trace ad4c118e9d8ac857 ]---
[  246.534912] Kernel panic - not syncing: Fatal exception in interrupt
[  246.534916] Pid: 0, comm: kworker/0:1 Tainted: G      D 2.6.39-std-def-alt3 #1
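
Both traces die in skb_pull(), called from pptp_rcv_core(). The trace does not show the source line, so the following is only a sketch of the likely check, assuming that include/linux/skbuff.h:1189 in this 2.6.39 build is the BUG_ON(skb->len < skb->data_len) in __skb_pull(). The names toy_skb, toy_headlen and pull_would_bug are hypothetical; only the len and data_len field names mirror struct sk_buff.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the two sk_buff length fields involved in the check. */
struct toy_skb {
	unsigned int len;      /* total bytes: linear area plus paged fragments */
	unsigned int data_len; /* bytes that live only in paged fragments */
};

/* Bytes available in the linear area, i.e. what a plain pull can consume. */
static unsigned int toy_headlen(const struct toy_skb *skb)
{
	assert(skb->len >= skb->data_len);
	return skb->len - skb->data_len;
}

/*
 * True when pulling n bytes would leave len < data_len, which is the
 * condition the assumed BUG_ON() tests. A pull larger than the whole
 * packet is a different case: skb_pull() returns NULL for it instead.
 */
static bool pull_would_bug(const struct toy_skb *skb, unsigned int n)
{
	if (n > skb->len)
		return false;
	return skb->len - n < skb->data_len;
}
```

If that assumption holds, the BUG would mean the receive path pulled more header bytes than the skb holds in its linear area, which callers usually guard against with a pskb_may_pull() check before pulling.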

* Re: [PATCH] cxgb3i: ref count cdev access to prevent modification while in use
From: Neil Horman @ 2011-08-01 11:18 UTC (permalink / raw)
  To: netdev; +Cc: Divy Le Ray, Steve Wise, David S. Miller
In-Reply-To: <1311623817-6417-1-git-send-email-nhorman@tuxdriver.com>

On Mon, Jul 25, 2011 at 03:56:57PM -0400, Neil Horman wrote:
> This oops was reported recently:
> d:mon> e
> cpu 0xd: Vector: 300 (Data Access) at [c0000000fd4c7120]
>     pc: d00000000076f194: .t3_l2t_get+0x44/0x524 [cxgb3]
>     lr: d000000000b02108: .init_act_open+0x150/0x3d4 [cxgb3i]
>     sp: c0000000fd4c73a0
>    msr: 8000000000009032
>    dar: 0
>  dsisr: 40000000
>   current = 0xc0000000fd640d40
>   paca    = 0xc00000000054ff80
>     pid   = 5085, comm = iscsid
> d:mon> t
> [c0000000fd4c7450] d000000000b02108 .init_act_open+0x150/0x3d4 [cxgb3i]
> [c0000000fd4c7500] d000000000e45378 .cxgbi_ep_connect+0x784/0x8e8 [libcxgbi]
> [c0000000fd4c7650] d000000000db33f0 .iscsi_if_rx+0x71c/0xb18 [scsi_transport_iscsi2]
> [c0000000fd4c7740] c000000000370c9c .netlink_data_ready+0x40/0xa4
> [c0000000fd4c77c0] c00000000036f010 .netlink_sendskb+0x4c/0x9c
> [c0000000fd4c7850] c000000000370c18 .netlink_sendmsg+0x358/0x39c
> [c0000000fd4c7950] c00000000033be24 .sock_sendmsg+0x114/0x1b8
> [c0000000fd4c7b50] c00000000033d208 .sys_sendmsg+0x218/0x2ac
> [c0000000fd4c7d70] c00000000033f55c .sys_socketcall+0x228/0x27c
> [c0000000fd4c7e30] c0000000000086a4 syscall_exit+0x0/0x40
> --- Exception: c01 (System Call) at 00000080da560cfc
> 
> The root cause was an EEH error, which sent us down the offload_close path in
> the cxgb3 driver, which in turn sets cdev->lldev to NULL, without regard for
> upper layer drivers (like the cxgbi drivers) which might have execution contexts
> in the middle of using it. The result is the oops above, when t3_l2t_get attempts
> to dereference cdev->lldev right after the EEH error handler sets it to NULL.
> 
> The fix is to reference count the cdev structure.  When an EEH error occurs, the
> shutdown path:
> t3_adapter_error->offload_close->cxgb3i_remove_clients->cxgb3i_dev_close
> will now block until such time as the cdev pointer has a use count of zero.
> This coupled with the fact that lookups will now skip finding any registered
> cdev's in cxgbi_device_find_by_[lldev|netdev] with the CXGBI_FLAG_ADAPTER_RESET
> bit set ensures that on an EEH, the setting of lldev to NULL in offload_close
> will only happen after there are no longer any active users of the data
> structure.
> 
> This has been tested by the reporter and shown to fix the reported oops.
> 
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: Divy Le Ray <divy@chelsio.com>
> CC: Steve Wise <swise@chelsio.com>
> CC: "David S. Miller" <davem@davemloft.net>
Divy, Steve, I think Dave is waiting for an ACK from one of you too, since you're
the listed maintainers.
Neil
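
For readers who have not seen the patch itself, the scheme in the quoted changelog (lookups skip a device that is being reset, and shutdown blocks until the use count reaches zero before clearing lldev) can be sketched in userspace C. This is a hedged illustration with pthreads, not the cxgb3i code: toy_cdev and its functions are hypothetical names, and the resetting flag merely stands in for CXGBI_FLAG_ADAPTER_RESET.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device object shared with upper-layer users. */
struct toy_cdev {
	pthread_mutex_t lock;
	pthread_cond_t idle;  /* signalled when users drops to zero */
	unsigned int users;   /* active references */
	bool resetting;       /* stands in for CXGBI_FLAG_ADAPTER_RESET */
	void *lldev;          /* pointer that shutdown sets to NULL */
};

static void toy_cdev_init(struct toy_cdev *c, void *lldev)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->idle, NULL);
	c->users = 0;
	c->resetting = false;
	c->lldev = lldev;
}

/* Lookup path: skip a device under reset, otherwise take a reference. */
static bool toy_cdev_get(struct toy_cdev *c)
{
	bool ok;

	pthread_mutex_lock(&c->lock);
	ok = !c->resetting;
	if (ok)
		c->users++;
	pthread_mutex_unlock(&c->lock);
	return ok;
}

/* Drop a reference and wake a waiting shutdown when the last user leaves. */
static void toy_cdev_put(struct toy_cdev *c)
{
	pthread_mutex_lock(&c->lock);
	assert(c->users > 0);
	if (--c->users == 0)
		pthread_cond_broadcast(&c->idle);
	pthread_mutex_unlock(&c->lock);
}

/* Shutdown path: stop new lookups, wait for users, then clear lldev. */
static void toy_cdev_close(struct toy_cdev *c)
{
	pthread_mutex_lock(&c->lock);
	c->resetting = true;
	while (c->users > 0)
		pthread_cond_wait(&c->idle, &c->lock);
	c->lldev = NULL;
	pthread_mutex_unlock(&c->lock);
}
```

The point of the ordering is that lldev can only become NULL once no caller holds a reference, and no new caller can obtain one after the reset flag is set.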
