Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
From: Eric W. Biederman @ 2012-12-06 22:19 UTC (permalink / raw)
  To: Ani Sinha; +Cc: Michael Richardson, tcpdump-workers, netdev, Francesco Ruggeri
In-Reply-To: <CAOxq_8OC6zUm93mCHD5daxDmuYmmbUdgy0DnjfQdBKky_2t4nQ@mail.gmail.com>

Ani Sinha <ani@aristanetworks.com> writes:

> On Sat, Nov 17, 2012 at 3:33 PM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>> the vlan header in packets as we receive them.
>>
>> The code is correct except for the case of packets in vlan 0.  Currently
>> the packet reconstruction is ambiguous.  The most recent kernels have
>> a TP_STATUS_VLAN_VALID flag that can be checked to see if the packet was
>> in vlan 0 or if there was no vlan at all.  libpcap probably should be
>> taught how to handle TP_STATUS_VLAN_VALID so that it can get the vlan 0
>> handling correct.
>>
>
> May be this?

Two things.

- TP_STATUS_VLAN_VALID lives in the tp_status field not the tp_vlan_tci field.
- To work on older kernels with binaries compiled with newer headers you
  first want to test for tp_vlan_tci == 0 and then look at the status field for
  TP_STATUS_VALID.

Which means the tests need to look something like:

- if (aux->tp_vlan_tci == 0)
+#if defined(TP_STATUS_VLAN_VALID)
+ if ((aux->tp_vlan_tci == 0) && !(aux->tp_status & TP_STATUS_VLAN_VALID))
+#else
+ if (aux->tp_vlan_tci == 0) /* this is ambigious but without the
+
TP_STATUS_VLAN_VALID flag, there is
+                                                      nothing that we can do */
+#endif

 #ifdef HAVE_TPACKET2
- if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci &&
+ if (handle->md.tp_version == TPACKET_V2 &&
+#if defined(TP_STATUS_VLAN_VALID)
+     (h.h2->tp_vlan_tci || (h.h2->tp_status & TP_STATUS_VALID)) &&
+#else
+     h.h2->tp_vlan_tci &&
+#endif

Eric

^ permalink raw reply

* Re: [PATCHv5] virtio-spec: virtio network device RFS support
From: Ben Hutchings @ 2012-12-06 22:03 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Jason Wang, rusty, virtualization, netdev, kvm
In-Reply-To: <20121206210132.GB6576@redhat.com>

On Thu, 2012-12-06 at 23:01 +0200, Michael S. Tsirkin wrote:
> On Thu, Dec 06, 2012 at 08:53:59PM +0000, Ben Hutchings wrote:
> > On Thu, 2012-12-06 at 22:29 +0200, Michael S. Tsirkin wrote:
> > > On Thu, Dec 06, 2012 at 08:03:14PM +0000, Ben Hutchings wrote:
> > [...]
> > > > Since this doesn't seem to be intended to have *any* connection with the
> > > > existing core networking feature called RFS, perhaps you could find a
> > > > different name for it.
> > > > 
> > > > Ben.
> > > 
> > > 
> > > Ah I see what you mean. We started out calling this feature "multiqueue"
> > > Rusty suggested "RFS" since it gives similar functionality to RFS but in
> > > device: it has receive steering logic per flow as part of the device.
> > 
> > The name is quite generic, but in the context of Linux it has so far
> > been used for a specific software feature and not as a generic name for
> > flow steering by hardware (or drivers).  The existing documentation
> > (Documentation/networking/scaling.txt) states quite clearly that 'RFS'
> > means that specific software implementation (with optional driver
> > integration) and configuration interface.
> >
> > > Maybe simply adding a statement similar to the one above would be
> > > sufficient to avoid confusion?
> > 
> > No, I don't think it's sufficient.  We have documentation that says how
> > to configure 'RFS', and you're proposing to add a very similar feature
> > called 'RFS' that is configured differently.  No matter how clearly you
> > distinguish them in new documentation, this will make the old
> > documentation confusing.
> > 
> > Ben.
> 
> I don't mind, renaming is just s/RFS/whatever/ away -
> how should hardware call this in your opinion?

If by 'this' you mean the use of perfect filters or a large hash table
to select the RX queue per flow, then 'flow steering'.

But that is usually combined with the fall-back of a simple mapping from
hash to queue ('RSS' or 'flow hashing') in case there is no specific
queue selection yet, which I can see tun has.  And you're specifying
multiple transmit queues too.  If you want a name for the whole set of
features involved, I don't see any better name than 'multiqueue'/'MQ'.

If you want a name for this specific flow steering mechanism, add some
distinguishing adjective(s) like 'virtual' or 'automatic'.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH net-next 0/7] Allow to monitor multicast cache event via rtnetlink
From: Nicolas Dichtel @ 2012-12-06 21:49 UTC (permalink / raw)
  To: Thomas Graf; +Cc: David Miller, David.Laight, netdev
In-Reply-To: <20121206174905.GC16122@casper.infradead.org>

Le 06/12/2012 18:49, Thomas Graf a écrit :
> On 12/06/12 at 09:43am, Nicolas Dichtel wrote:
>> Le 05/12/2012 18:54, David Miller a écrit :
>>> From: "David Laight" <David.Laight@ACULAB.COM>
>>> Date: Wed, 5 Dec 2012 11:41:33 -0000
>>>
>>>> Probably worth commenting that the 64bit items might only be 32bit aligned.
>>>> Just to stop anyone trying to read/write them with pointer casts.
>>>
>>> Rather, let's not create this situation at all.
>>>
>>> It's totally inappropriate to have special code to handle every single
>>> time we want to put 64-bit values into netlink messages.
>>>
>>> We need a real solution to this issue.
>>>
>> The easiest way is to update *_ALIGNTO values (maybe we can keep
>> NLMSG_ALIGNTO to 4). But I think that many userland apps have these
>> values hardcoded and, the most important thing, this may increase
>> size of many netlink messages. Hence we need probably to find
>> something better.
>
> We can't do this, as you say, ALIGNTO is compiled into all the
> binaries.
>
> A simple backwards compatible workaround would be to include an
> unknown, empty padding attribute if needed. That would be 4 bytes
> in size and could be used to include padding as needed.
>
> We could use nla_type = 0 as it is a reserved value that should
> be available in all protocols. All readers (kernel and user space)
> must ignore such an attribute just like any other unknown
> attribute they encounter.
>
> We could easily extend nla_put_u64() and variants to automatically
> include such a padding attribute as needed.
>
> The only situation that I can think of where this would not work
> is if we have code like this:
>
>     foo = nla_nest_start();
>     for ([..])
>        nla_put_u64([...])
>     nla_nest_end([...])
>
> and a reader would stupidly do a nla_for_each_attr() in user space
> and assume all attributes found must be NLA_U64 without even
> checking the length of the attribute.
>
> I would say we take that risk and let such code die horribly.
>
Ok, I can work to a patch next week if you want (I will be off until Tuesday).

^ permalink raw reply

* Re: [PATCH wireless-next] iwlwifi: iwlagn_request_scan: Fix check for NULL priv->scan_request
From: Johannes Berg @ 2012-12-06 21:41 UTC (permalink / raw)
  To: Tim Gardner
  Cc: linux-kernel, Wey-Yi Guy, Intel Linux Wireless, John W. Linville,
	Emmanuel Grumbach, Don Fry, linux-wireless, netdev
In-Reply-To: <1354825445-79551-1-git-send-email-tim.gardner@canonical.com>


> --- a/drivers/net/wireless/iwlwifi/dvm/scan.c
> +++ b/drivers/net/wireless/iwlwifi/dvm/scan.c
> @@ -673,8 +673,9 @@ static int iwlagn_request_scan(struct iwl_priv *priv, struct ieee80211_vif *vif)
>  	const u8 *ssid = NULL;
>  	u8 ssid_len = 0;
>  
> -	if (WARN_ON_ONCE(priv->scan_request &&
> -			 priv->scan_request->n_channels > MAX_SCAN_CHANNEL))
> +	if (WARN_ON(priv->scan_type == IWL_SCAN_NORMAL &&
> +		(!priv->scan_request ||
> +		priv->scan_request->n_channels > MAX_SCAN_CHANNEL)))
>  		return -EINVAL;

I'll pick it up if you fix the indentation :P

johannes

^ permalink raw reply

* Re: [net-next PATCH V3-evictor] net: frag evictor, avoid killing warm frag queues
From: David Miller @ 2012-12-06 21:38 UTC (permalink / raw)
  To: David.Laight
  Cc: fw, jbrouer, eric.dumazet, netdev, tgraf, paulmck, amwang,
	herbert
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B70E2@saturn3.aculab.com>

From: "David Laight" <David.Laight@ACULAB.COM>
Date: Thu, 6 Dec 2012 13:29:13 -0000

> NFS/UDP is about the only thing that generates very large
> IP datagrams - and no one in their right mind runs that
> over non-local links.

There are people with real applications that use UDP with
large IP datagrams.  As unfortunate as it is, this is the
reality we have to deal with.

^ permalink raw reply

* Re: [PATCH] tulip: Fix compiler warning when CONFIG_DEBUG_SECTION_MISMATCH=y
From: David Miller @ 2012-12-06 21:35 UTC (permalink / raw)
  To: christoph.paasch; +Cc: grundler, netdev
In-Reply-To: <1354799067-25680-1-git-send-email-christoph.paasch@uclouvain.be>

From: Christoph Paasch <christoph.paasch@uclouvain.be>
Date: Thu,  6 Dec 2012 14:04:27 +0100

> WARNING: drivers/net/ethernet/dec/tulip/tulip.o(.text+0x4057): Section
> mismatch in reference from the function tulip_init_one() to the variable
> .devinit.rodata:early_486_chipsets
> The function tulip_init_one() references
> the variable __devinitconst early_486_chipsets.
> This is often because tulip_init_one lacks a __devinitconst
> annotation or the annotation of early_486_chipsets is wrong.
> 
> Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>

Can you at least do the bare minimum amount of research when making
"fixes" like this?

These __devinit removals were done on purpose.

^ permalink raw reply

* Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
From: Ani Sinha @ 2012-12-06 21:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Michael Richardson, tcpdump-workers, netdev, Francesco Ruggeri
In-Reply-To: <87obivu7n7.fsf@xmission.com>

On Sat, Nov 17, 2012 at 3:33 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> the vlan header in packets as we receive them.
>
> The code is correct except for the case of packets in vlan 0.  Currently
> the packet reconstruction is ambiguous.  The most recent kernels have
> a TP_STATUS_VLAN_VALID flag that can be checked to see if the packet was
> in vlan 0 or if there was no vlan at all.  libpcap probably should be
> taught how to handle TP_STATUS_VLAN_VALID so that it can get the vlan 0
> handling correct.
>

May be this?

Index: libpcap-1.1.1/pcap-linux.c
===================================================================
--- libpcap-1.1.1.orig/pcap-linux.c
+++ libpcap-1.1.1/pcap-linux.c
@@ -132,6 +132,7 @@ static const char rcsid[] _U_ =
 #include <sys/utsname.h>
 #include <sys/mman.h>
 #include <linux/if.h>
+#include <linux/if_packet.h>
 #include <netinet/in.h>
 #include <linux/if_ether.h>
 #include <net/if_arp.h>
@@ -1486,7 +1487,13 @@ pcap_read_packet(pcap_t *handle, pcap_ha
  continue;

  aux = (struct tpacket_auxdata *)CMSG_DATA(cmsg);
- if (aux->tp_vlan_tci == 0)
+#if defined(TP_STATUS_VLAN_VALID)
+                        if (!(aux->tp_vlan_tci & TP_STATUS_VLAN_VALID))
+#else
+ if (aux->tp_vlan_tci == 0) /* this is ambigious but without the
+
TP_STATUS_VLAN_VALID flag, there is
+                                                      nothing that we can do */
+#endif
  continue;

  len = packet_len > iov.iov_len ? iov.iov_len : packet_len;
@@ -3565,7 +3555,11 @@ pcap_read_linux_mmap(pcap_t *handle, int
  }

 #ifdef HAVE_TPACKET2
- if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci &&
+#if defined(TP_STATUS_VLAN_VALID)
+                if (handle->md.tp_version == TPACKET_V2 &&
(h.h2->tp_vlan_tci & TP_STATUS_VLAN_VALID) &&
+#else
+                if (handle->md.tp_version == TPACKET_V2 && h.h2->tp_vlan_tci &&
+#endif
     handle->md.vlan_offset != -1 &&
     tp_snaplen >= (unsigned int) handle->md.vlan_offset) {
  struct vlan_tag *tag;

^ permalink raw reply

* Re: [tcpdump-workers] vlan tagged packets and libpcap breakage
From: Ani Sinha @ 2012-12-06 21:20 UTC (permalink / raw)
  To: Guy Harris; +Cc: netdev, Francesco Ruggeri, tcpdump-workers
In-Reply-To: <B23C28F0-4005-4970-984D-3BAD063EA513@alum.mit.edu>

On Wed, Oct 31, 2012 at 5:50 PM, Guy Harris <guy@alum.mit.edu> wrote:
>
> On Oct 31, 2012, at 3:35 PM, Ani Sinha <ani@aristanetworks.com> wrote:
>
>> yes but if the packet is passed to the filter within libpcap (when we
>> are not using the kernel filter) before the reinsertion,
>
> ...that would be a bug.
>
> Currently, that bug doesn't exist in the recvfrom() code path, but *does* appear to exist in the tpacket code path - and that code path also runs the filter before the SLL header is constructed.  That should be fixed.

Something like this?

Index: libpcap-1.1.1/pcap-linux.c
===================================================================
--- libpcap-1.1.1.orig/pcap-linux.c
+++ libpcap-1.1.1/pcap-linux.c
@@ -132,6 +132,7 @@ static const char rcsid[] _U_ =
 #include <sys/utsname.h>
 #include <sys/mman.h>
 #include <linux/if.h>
+#include <linux/if_packet.h>
 #include <netinet/in.h>
 #include <linux/if_ether.h>
 #include <net/if_arp.h>
@@ -3469,23 +3476,6 @@ pcap_read_linux_mmap(pcap_t *handle, int
  return -1;
  }

- /* run filter on received packet
- * If the kernel filtering is enabled we need to run the
- * filter until all the frames present into the ring
- * at filter creation time are processed.
- * In such case md.use_bpf is used as a counter for the
- * packet we need to filter.
- * Note: alternatively it could be possible to stop applying
- * the filter when the ring became empty, but it can possibly
- * happen a lot later... */
- bp = (unsigned char*)h.raw + tp_mac;
- run_bpf = (!handle->md.use_bpf) ||
- ((handle->md.use_bpf>1) && handle->md.use_bpf--);
- if (run_bpf && handle->fcode.bf_insns &&
- (bpf_filter(handle->fcode.bf_insns, bp,
- tp_len, tp_snaplen) == 0))
- goto skip;
-
  /*
  * Do checks based on packet direction.
  */
@@ -3582,6 +3576,23 @@ pcap_read_linux_mmap(pcap_t *handle, int
  }
 #endif

+ /* run filter on received packet
+ * If the kernel filtering is enabled we need to run the
+ * filter until all the frames present into the ring
+ * at filter creation time are processed.
+ * In such case md.use_bpf is used as a counter for the
+ * packet we need to filter.
+ * Note: alternatively it could be possible to stop applying
+ * the filter when the ring became empty, but it can possibly
+ * happen a lot later... */
+ bp = (unsigned char*)h.raw + tp_mac;
+ run_bpf = (!handle->md.use_bpf) ||
+ ((handle->md.use_bpf>1) && handle->md.use_bpf--);
+ if (run_bpf && handle->fcode.bf_insns &&
+ (bpf_filter(handle->fcode.bf_insns, bp,
+ tp_len, tp_snaplen) == 0))
+ goto skip;
+
  /*
  * The only way to tell the kernel to cut off the
  * packet at a snapshot length is with a filter program;

^ permalink raw reply

* Re: [PATCH 2/2] sctp: Add RCU protection to assoc->transport_addr_list
From: Neil Horman @ 2012-12-06 21:14 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Thomas Graf, linux-sctp, netdev
In-Reply-To: <50C0FB17.9000602@gmail.com>

On Thu, Dec 06, 2012 at 03:07:51PM -0500, Vlad Yasevich wrote:
> On 12/06/2012 02:25 PM, Thomas Graf wrote:
> >peer.transport_addr_list is currently only protected by sk_sock
> >which is inpractical to acquire for procfs dumping purposes.
> >
> >This patch adds RCU protection allowing for the procfs readers to
> >enter RCU read-side critical sections.
> >
> >Modification of the list continues to be serialized via sk_lock.
> >
> >V2: Use list_del_rcu() in sctp_association_free() to be safe
> >     Skip transports marked dead when dumping for procfs
> >
> >Cc: Vlad Yasevich <vyasevich@gmail.com>
> >Cc: Neil Horman <nhorman@tuxdriver.com>
> >Signed-off-by: Thomas Graf <tgraf@suug.ch>
> 
> Acked-by: Vlad Yasevich <vyasevich@gmail.com>
> 
> -vlad
> 
Acked-by: Neil Horman <nhorman@tuxdriver.com>

> >---
> >  include/net/sctp/structs.h |  2 ++
> >  net/sctp/associola.c       |  6 +++---
> >  net/sctp/proc.c            | 14 ++++++++++++--
> >  net/sctp/transport.c       | 18 +++++++++++++-----
> >  4 files changed, 30 insertions(+), 10 deletions(-)
> >
> >diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> >index 64158aa..5d6987b 100644
> >--- a/include/net/sctp/structs.h
> >+++ b/include/net/sctp/structs.h
> >@@ -948,6 +948,8 @@ struct sctp_transport {
> >
> >  	/* 64-bit random number sent with heartbeat. */
> >  	__u64 hb_nonce;
> >+
> >+	struct rcu_head rcu;
> >  };
> >
> >  struct sctp_transport *sctp_transport_new(struct net *, const union sctp_addr *,
> >diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> >index b1ef3bc..81b1236 100644
> >--- a/net/sctp/associola.c
> >+++ b/net/sctp/associola.c
> >@@ -445,7 +445,7 @@ void sctp_association_free(struct sctp_association *asoc)
> >  	/* Release the transport structures. */
> >  	list_for_each_safe(pos, temp, &asoc->peer.transport_addr_list) {
> >  		transport = list_entry(pos, struct sctp_transport, transports);
> >-		list_del(pos);
> >+		list_del_rcu(pos);
> >  		sctp_transport_free(transport);
> >  	}
> >
> >@@ -565,7 +565,7 @@ void sctp_assoc_rm_peer(struct sctp_association *asoc,
> >  		sctp_assoc_update_retran_path(asoc);
> >
> >  	/* Remove this peer from the list. */
> >-	list_del(&peer->transports);
> >+	list_del_rcu(&peer->transports);
> >
> >  	/* Get the first transport of asoc. */
> >  	pos = asoc->peer.transport_addr_list.next;
> >@@ -765,7 +765,7 @@ struct sctp_transport *sctp_assoc_add_peer(struct sctp_association *asoc,
> >  	peer->state = peer_state;
> >
> >  	/* Attach the remote transport to our asoc.  */
> >-	list_add_tail(&peer->transports, &asoc->peer.transport_addr_list);
> >+	list_add_tail_rcu(&peer->transports, &asoc->peer.transport_addr_list);
> >  	asoc->peer.transport_count++;
> >
> >  	/* If we do not yet have a primary path, set one.  */
> >diff --git a/net/sctp/proc.c b/net/sctp/proc.c
> >index 06b05ee..8c19e97 100644
> >--- a/net/sctp/proc.c
> >+++ b/net/sctp/proc.c
> >@@ -162,15 +162,20 @@ static void sctp_seq_dump_remote_addrs(struct seq_file *seq, struct sctp_associa
> >  	struct sctp_af *af;
> >
> >  	primary = &assoc->peer.primary_addr;
> >-	list_for_each_entry(transport, &assoc->peer.transport_addr_list,
> >+	rcu_read_lock();
> >+	list_for_each_entry_rcu(transport, &assoc->peer.transport_addr_list,
> >  			transports) {
> >  		addr = &transport->ipaddr;
> >+		if (transport->dead)
> >+			continue;
> >+
> >  		af = sctp_get_af_specific(addr->sa.sa_family);
> >  		if (af->cmp_addr(addr, primary)) {
> >  			seq_printf(seq, "*");
> >  		}
> >  		af->seq_dump_addr(seq, addr);
> >  	}
> >+	rcu_read_unlock();
> >  }
> >
> >  static void * sctp_eps_seq_start(struct seq_file *seq, loff_t *pos)
> >@@ -441,12 +446,16 @@ static int sctp_remaddr_seq_show(struct seq_file *seq, void *v)
> >  	head = &sctp_assoc_hashtable[hash];
> >  	sctp_local_bh_disable();
> >  	read_lock(&head->lock);
> >+	rcu_read_lock();
> >  	sctp_for_each_hentry(epb, node, &head->chain) {
> >  		if (!net_eq(sock_net(epb->sk), seq_file_net(seq)))
> >  			continue;
> >  		assoc = sctp_assoc(epb);
> >-		list_for_each_entry(tsp, &assoc->peer.transport_addr_list,
> >+		list_for_each_entry_rcu(tsp, &assoc->peer.transport_addr_list,
> >  					transports) {
> >+			if (tsp->dead)
> >+				continue;
> >+
> >  			/*
> >  			 * The remote address (ADDR)
> >  			 */
> >@@ -492,6 +501,7 @@ static int sctp_remaddr_seq_show(struct seq_file *seq, void *v)
> >  		}
> >  	}
> >
> >+	rcu_read_unlock();
> >  	read_unlock(&head->lock);
> >  	sctp_local_bh_enable();
> >
> >diff --git a/net/sctp/transport.c b/net/sctp/transport.c
> >index 206cf52..1295aec 100644
> >--- a/net/sctp/transport.c
> >+++ b/net/sctp/transport.c
> >@@ -163,13 +163,11 @@ void sctp_transport_free(struct sctp_transport *transport)
> >  	sctp_transport_put(transport);
> >  }
> >
> >-/* Destroy the transport data structure.
> >- * Assumes there are no more users of this structure.
> >- */
> >-static void sctp_transport_destroy(struct sctp_transport *transport)
> >+static void sctp_transport_destroy_rcu(struct rcu_head *head)
> >  {
> >-	SCTP_ASSERT(transport->dead, "Transport is not dead", return);
> >+	struct sctp_transport *transport;
> >
> >+	transport = container_of(head, struct sctp_transport, rcu);
> >  	if (transport->asoc)
> >  		sctp_association_put(transport->asoc);
> >
> >@@ -180,6 +178,16 @@ static void sctp_transport_destroy(struct sctp_transport *transport)
> >  	SCTP_DBG_OBJCNT_DEC(transport);
> >  }
> >
> >+/* Destroy the transport data structure.
> >+ * Assumes there are no more users of this structure.
> >+ */
> >+static void sctp_transport_destroy(struct sctp_transport *transport)
> >+{
> >+	SCTP_ASSERT(transport->dead, "Transport is not dead", return);
> >+
> >+	call_rcu(&transport->rcu, sctp_transport_destroy_rcu);
> >+}
> >+
> >  /* Start T3_rtx timer if it is not already running and update the heartbeat
> >   * timer.  This routine is called every time a DATA chunk is sent.
> >   */
> >
> 
> 

^ permalink raw reply

* Re: [PATCH 1/2] sctp: proc: protect bind_addr->address_list accesses with rcu_read_lock()
From: Neil Horman @ 2012-12-06 21:12 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Thomas Graf, linux-sctp, netdev, Thomas Graf
In-Reply-To: <50C0FAF8.6060706@gmail.com>

On Thu, Dec 06, 2012 at 03:07:20PM -0500, Vlad Yasevich wrote:
> On 12/06/2012 02:25 PM, Thomas Graf wrote:
> >From: Thomas Graf <tgraf@redhat.com>
> >
> >address_list is protected via the socket lock or RCU. Since we don't want
> >to take the socket lock for each assoc we dump in procfs a RCU read-side
> >critical section must be entered.
> >
> >V2: Skip local addresses marked as dead
> >
> >Cc: Vlad Yasevich <vyasevich@gmail.com>
> >Cc: Neil Horman <nhorman@tuxdriver.com>
> >Signed-off-by: Thomas Graf <tgraf@suug.ch>
> 
> Acked-by: Vlad Yasevich <vyasevic@gmail.com>
> 
> -vlad
> 
Acked-by: Neil Horman <nhorman@tuxdriver.com>

> >---
> >  net/sctp/proc.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> >diff --git a/net/sctp/proc.c b/net/sctp/proc.c
> >index 9966e7b..06b05ee 100644
> >--- a/net/sctp/proc.c
> >+++ b/net/sctp/proc.c
> >@@ -139,7 +139,11 @@ static void sctp_seq_dump_local_addrs(struct seq_file *seq, struct sctp_ep_commo
> >  	    primary = &peer->saddr;
> >  	}
> >
> >-	list_for_each_entry(laddr, &epb->bind_addr.address_list, list) {
> >+	rcu_read_lock();
> >+	list_for_each_entry_rcu(laddr, &epb->bind_addr.address_list, list) {
> >+		if (!laddr->valid)
> >+			continue;
> >+
> >  		addr = &laddr->a;
> >  		af = sctp_get_af_specific(addr->sa.sa_family);
> >  		if (primary && af->cmp_addr(addr, primary)) {
> >@@ -147,6 +151,7 @@ static void sctp_seq_dump_local_addrs(struct seq_file *seq, struct sctp_ep_commo
> >  		}
> >  		af->seq_dump_addr(seq, addr);
> >  	}
> >+	rcu_read_unlock();
> >  }
> >
> >  /* Dump remote addresses of an association. */
> >
> 
> 

^ permalink raw reply

* Re: [PATCH rfc] netfilter: two xtables matches
From: Willem de Bruijn @ 2012-12-06 21:12 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Jan Engelhardt, netfilter-devel, netdev, Eric Dumazet,
	David Miller, Patrick McHardy
In-Reply-To: <20121206052246.GA2905@1984>

On Thu, Dec 6, 2012 at 12:22 AM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Wed, Dec 05, 2012 at 09:00:36PM +0100, Jan Engelhardt wrote:
>> On Wednesday 2012-12-05 20:28, Willem de Bruijn wrote:
>>
>> >Somehow, the first part of this email went missing. Not critical,
>> >but for completeness:
>> >
>> >These two patches each add an xtables match.
>> >
>> >The xt_priority match is a straighforward addition in the style of
>> >xt_mark, adding the option to filter on one more sk_buff field. I
>> >have an immediate application for this. The amount of code (in
>> >kernel + userspace) to add a single check proved quite large.
>>
>> Hm so yeah, can't we just place this in xt_mark.c?
>
> I don't feel this belongs to xt_mark at all.

Do you have other concerns, or can I resubmit as is for merging in a
few days if no one raises additional issues?

For this and netfilter changes in general: should these patches be
against git://1984.lsi.us.es/nf-next instead of net-next? This patch
likely applies cleanly there, but I haven't tried yet. Thanks.

^ permalink raw reply

* Re: [RFC PATCH v2 3/3] tun: fix LSM/SELinux labeling of tun/tap devices
From: Paul Moore @ 2012-12-06 21:09 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-security-module, selinux, jasowang
In-Reply-To: <20121206205716.GA6576@redhat.com>

On Thursday, December 06, 2012 10:57:16 PM Michael S. Tsirkin wrote:
> On Thu, Dec 06, 2012 at 11:56:45AM -0500, Paul Moore wrote:
> > The SETQUEUE/tun_socket:create_queue permissions do not yet exist in any
> > released SELinux policy as we are just now adding them with this patchset.
> > With current policies loaded into a kernel with this patchset applied the
> > SETQUEUE/tun_socket:create_queue permission would be treated according to
> > the policy's unknown permission setting.
> 
> OK I think we need to rethink what we are doing here: what you sent
> addresses the problem as stated but I think we mis-stated it.  Let me
> try to restate the problem: it is not just selinux problem. Let's assume
> qemu wants to use tun, I (libvirt) don't want to run it as root.
> 
> 1. TUNSETIFF: I can open tun, attach an fd and pass it to qemu.
> Now, qemu does not invoke TUNSETIFF so it can run without
> kernel priveledges.

Correct me if I'm wrong, but I believe libvirt does this while running as 
root.  Assuming that is the case, why not simply setuid()/setgid() to the same 
credentials as the QEMU instance before creating the TUN device?  You can 
always (re)configure the device afterwards while running as 
root/CAP_NET_ADMIN.
 
> 2. TUNSETQUEUE - I can open tun and attach a queue but this
> is not what is needed since this automatically switches
> to multiqueue mode - we want to change number of queues
> on the fly.
> So qemu needs to be allowed to run TUNSETQUEUE.
> Since this checks tun_not_capable(tun) we would need
> to give qemu these priveledges, and we want to avoid this
> (I can go into why if it's not obvious).

If libvirt creates the TUN device while its effective credentials match those 
of the QEMU instance then the QEMU instance should be able to perform a 
TUNSETQUEUE, yes?

> How can we slove this?
> I don't see a way without extending the interface.
> Here's a simple way to extend it: pass a flag to TUNSETQUEUE
> that enables/disables TX on this queue.
> If TX is disabled, ignore this queue for flow steering decisions.
> Allow TUNSETQUEUE for a non priveledged user if it
> it already bound to the currect tun and only changes this flag.
> 
> Now I open tun and SETQUEUE with TX disabled flag. Pass it to qemu.
> qemu calls SETQUEUE with TX enabled flag.
> 
> Jason? Want to try implementing and see what people think?

-- 
paul moore
security and virtualization @ redhat

^ permalink raw reply

* Re: [PATCHv5] virtio-spec: virtio network device RFS support
From: Michael S. Tsirkin @ 2012-12-06 21:01 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, kvm, virtualization
In-Reply-To: <1354827239.2828.36.camel@bwh-desktop.uk.solarflarecom.com>

On Thu, Dec 06, 2012 at 08:53:59PM +0000, Ben Hutchings wrote:
> On Thu, 2012-12-06 at 22:29 +0200, Michael S. Tsirkin wrote:
> > On Thu, Dec 06, 2012 at 08:03:14PM +0000, Ben Hutchings wrote:
> [...]
> > > Since this doesn't seem to be intended to have *any* connection with the
> > > existing core networking feature called RFS, perhaps you could find a
> > > different name for it.
> > > 
> > > Ben.
> > 
> > 
> > Ah I see what you mean. We started out calling this feature "multiqueue"
> > Rusty suggested "RFS" since it gives similar functionality to RFS but in
> > device: it has receive steering logic per flow as part of the device.
> 
> The name is quite generic, but in the context of Linux it has so far
> been used for a specific software feature and not as a generic name for
> flow steering by hardware (or drivers).  The existing documentation
> (Documentation/networking/scaling.txt) states quite clearly that 'RFS'
> means that specific software implementation (with optional driver
> integration) and configuration interface.
>
> > Maybe simply adding a statement similar to the one above would be
> > sufficient to avoid confusion?
> 
> No, I don't think it's sufficient.  We have documentation that says how
> to configure 'RFS', and you're proposing to add a very similar feature
> called 'RFS' that is configured differently.  No matter how clearly you
> distinguish them in new documentation, this will make the old
> documentation confusing.
> 
> Ben.

I don't mind, renaming is just s/RFS/whatever/ away -
how should hardware call this in your opinion?

> -- 
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [RFC PATCH v2 3/3] tun: fix LSM/SELinux labeling of tun/tap devices
From: Michael S. Tsirkin @ 2012-12-06 20:57 UTC (permalink / raw)
  To: Paul Moore; +Cc: netdev, linux-security-module, selinux, jasowang
In-Reply-To: <6001427.qD54i2BbtH@sifl>

On Thu, Dec 06, 2012 at 11:56:45AM -0500, Paul Moore wrote:
> The SETQUEUE/tun_socket:create_queue permissions do not yet exist in any 
> released SELinux policy as we are just now adding them with this patchset.  
> With current policies loaded into a kernel with this patchset applied the 
> SETQUEUE/tun_socket:create_queue permission would be treated according to the 
> policy's unknown permission setting.

OK I think we need to rethink what we are doing here: what you sent
addresses the problem as stated but I think we mis-stated it.  Let me
try to restate the problem: it is not just selinux problem.  Let's
assume qemu wants to use tun, I (libvirt) don't want to run it as root.

1. TUNSETIFF: I can open tun, attach an fd and pass it to qemu.
Now, qemu does not invoke TUNSETIFF so it can run without
kernel priveledges.

2. TUNSETQUEUE - I can open tun and attach a queue but this
is not what is needed since this automatically switches
to multiqueue mode - we want to change number of queues
on the fly.
So qemu needs to be allowed to run TUNSETQUEUE.
Since this checks tun_not_capable(tun) we would need
to give qemu these priveledges, and we want to avoid this
(I can go into why if it's not obvious).

How can we slove this?
I don't see a way without extending the interface.
Here's a simple way to extend it: pass a flag to TUNSETQUEUE
that enables/disables TX on this queue.
If TX is disabled, ignore this queue for flow steering decisions.
Allow TUNSETQUEUE for a non priveledged user if it
it already bound to the currect tun and only changes this flag.

Now I open tun and SETQUEUE with TX disabled flag. Pass it to qemu.
qemu calls SETQUEUE with TX enabled flag.

Jason? Want to try implementing and see what people think?

> -- 
> paul moore
> security and virtualization @ redhat

^ permalink raw reply

* [PATCH net-next 1/1] vlan: restore offload use on vlan transmit
From: Andrew Gallatin @ 2012-12-06 20:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, Andrew Gallatin
In-Reply-To: <1354827296-12009-1-git-send-email-gallatin@myri.com>

This patch copies the vlan device's dev->features to its
dev->vlan_features, which allows packets to arrive at
vlan_dev_hard_start_xmit() with their offloads intact.

When a packet destined for a vlan interface is transmitted, it passes
through dev_hard_start_xmit() (and netif_skb_features()) twice.  First
on its way to vlan_dev_hard_start_xmit(), and then again on its way to
the backing device's xmit handler. If the vlan device does not setup
its dev->vlan_features, then netif_skb_features() will strip the
features on the first trip through it (on the way to
vlan_dev_hard_start_xmit()).

Signed-off-by: Andrew Gallatin <gallatin@myri.com>
---
 net/8021q/vlan_dev.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 4a6d31a..9d4b3c9 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -559,6 +559,7 @@ static int vlan_dev_init(struct net_device *dev)

 	dev->features |= real_dev->vlan_features | NETIF_F_LLTX;
 	dev->gso_max_size = real_dev->gso_max_size;
+	dev->vlan_features = dev->features;

 	/* ipv6 shared card related stuff */
 	dev->dev_id = real_dev->dev_id;
-- 
1.7.9.5

^ permalink raw reply related

* [PATCH net-next 0/1] fix vlan transmit performance
From: Andrew Gallatin @ 2012-12-06 20:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, Andrew Gallatin

Hi,

When doing some 10GbE perf measurments on very old athlon64
machines with myri10ge, I noticed that I was seeing CPU saturation
when transmitting vlan tagged traffic.  I think I traced the
problem to this line in netif_skb_features():

	features &= (skb->dev->vlan_features | NETIF_F_HW_VLAN_TX);

The problem seems to be that packets travel through this function
twice, first on their way to the vlan xmit handler, and then on
their way to the backing device's xmit handler.  On the first
pass, "skb->dev" is the vlan device, and skb->dev->vlan_features
is blank.  This causes netif_skb_features() to strip the offloads
away.

The following patch (just copy dev->features to dev->vlan_features in
vlan_dev_init()) seems to be the simplest way to fix it. Perhaps this
is wrong, and there is a better way?  Given that this has apparently
been broken for nearly 2 years (since f01a5236), I'm worried that
either I'm doing something wrong in myri10ge, or that I'm missing
something in general.

At any rate, performance jumps from 5.6Gb/s with one CPU entirely
saturated on the sender, to 9Gb/s with idle time:

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

before:
 87380  65536  65536    10.00      5660.66   25.54    51.41    1.478   1.488  

after:
 87380  65536  65536    10.00      9081.39   15.66    76.42    0.565   1.379  

Andrew Gallatin (1):
  vlan: restore offload use on vlan transmit

 net/8021q/vlan_dev.c |    1 +
 1 file changed, 1 insertion(+)

-- 
1.7.9.5

^ permalink raw reply

* RE: [net-next 6/8] bna: Add Stats Clear Counter
From: Ben Hutchings @ 2012-12-06 20:54 UTC (permalink / raw)
  To: Rasesh Mody
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	Adapter Linux Open SRC Team
In-Reply-To: <E5313AF6F2BFD14293E5FD0F94750F86B5B0B9CAF5@HQ1-EXCH01.corp.brocade.com>

On Thu, 2012-12-06 at 12:34 -0800, Rasesh Mody wrote:
> >From: Ben Hutchings [mailto:bhutchings@solarflare.com]
> >Sent: Wednesday, December 05, 2012 6:20 PM
> >Subject: Re: [net-next 6/8] bna: Add Stats Clear Counter
> >
> >On Wed, 2012-12-05 at 15:01 -0800, Rasesh Mody wrote:
> >> Added Stats clear counter to the bfi_enet_stats_mac structure and
> >> ethtool stats
> >>
> >> Signed-off-by: Rasesh Mody <rmody@brocade.com>
> >
> >Since this structure appears to be part of the firmware interface, you
> >should combine this (and any other interface changes) with the change to
> >the requested firmware version (7/8).
> 
> Ben,
> 
> Thanks for the feedback. Yes, bfi_enet_stats_mac is a part of firmware
> interface update.
> 
> In future submission, I will plan to take care of combining such
> firmware interface updates/changes and the firmware request version
> change into single patch. Please let me know if you think it is
> required for this patch set.

That's not my decision.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCHv5] virtio-spec: virtio network device RFS support
From: Ben Hutchings @ 2012-12-06 20:53 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, kvm, virtualization
In-Reply-To: <20121206202921.GB4340@redhat.com>

On Thu, 2012-12-06 at 22:29 +0200, Michael S. Tsirkin wrote:
> On Thu, Dec 06, 2012 at 08:03:14PM +0000, Ben Hutchings wrote:
[...]
> > Since this doesn't seem to be intended to have *any* connection with the
> > existing core networking feature called RFS, perhaps you could find a
> > different name for it.
> > 
> > Ben.
> 
> 
> Ah I see what you mean. We started out calling this feature "multiqueue"
> Rusty suggested "RFS" since it gives similar functionality to RFS but in
> device: it has receive steering logic per flow as part of the device.

The name is quite generic, but in the context of Linux it has so far
been used for a specific software feature and not as a generic name for
flow steering by hardware (or drivers).  The existing documentation
(Documentation/networking/scaling.txt) states quite clearly that 'RFS'
means that specific software implementation (with optional driver
integration) and configuration interface.

> Maybe simply adding a statement similar to the one above would be
> sufficient to avoid confusion?

No, I don't think it's sufficient.  We have documentation that says how
to configure 'RFS', and you're proposing to add a very similar feature
called 'RFS' that is configured differently.  No matter how clearly you
distinguish them in new documentation, this will make the old
documentation confusing.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* [PATCH net-next] rps: overflow prevention for saturated cpus
From: Willem de Bruijn @ 2012-12-06 20:36 UTC (permalink / raw)
  To: netdev, davem, edumazet, therbert; +Cc: Willem de Bruijn

RPS and RFS balance load across cpus with flow affinity. This can
cause local bottlenecks, where a small number or single large flow
(DoS) can saturate one CPU while others are idle.

This patch maintains flow affinity in normal conditions, but
trades it for throughput when a cpu becomes saturated. Then, packets
destined to that cpu (only) are redirected to the lightest loaded cpu
in the rxqueue's rps_map. This breaks flow affinity under high load
for some flows, in favor of processing packets up to the capacity
of the complete rps_map cpuset in all circumstances.

Overload on an rps cpu is detected when the cpu's input queue
exceeds a high watermark. This threshold, netdev_max_rps_backlog,
is configurable through sysctl. By default, it is the same as
netdev_max_backlog, the threshold at which point packets are
dropped. Therefore, the behavior is disabled by default. It is
enabled by setting the rps threshold lower than the drop threshold.

This mechanism is orthogonal to filtering approaches to handle
unwanted large flows (DoS). The goal here is to avoid dropping any
traffic until the entire system is saturated.

Tested:
Sent a steady stream of 1.4M UDP packets to a machine with four
RPS cpus 0--3, set in /sys/class/net/eth0/queues/rx-N/rps_cpus.
RFS is disabled to illustrate load balancing more clearly. Showing
the output from /proc/net/softnet_stat, columns 0 (processed), 1
(dropped), 2 (time squeeuze) and 9 (rps), for the relevant CPUs:

- without patch, 40 source IP addresses (i.e., balanced):
0: ok=00409483 drop=00000000 time=00001224 rps=00000089
1: ok=00496336 drop=00051365 time=00001551 rps=00000000
2: ok=00374380 drop=00000000 time=00001105 rps=00000129
3: ok=00411348 drop=00000000 time=00001175 rps=00000165

- without patch, 1 source IP address:
0: ok=00856313 drop=00863842 time=00002676 rps=00000000
1: ok=00000003 drop=00000000 time=00000000 rps=00000001
2: ok=00000001 drop=00000000 time=00000000 rps=00000001
3: ok=00000014 drop=00000000 time=00000000 rps=00000000

- with patch, 1 source IP address:
0: ok=00278675 drop=00000000 time=00000475 rps=00001201
1: ok=00276154 drop=00000000 time=00000459 rps=00001213
2: ok=00647050 drop=00000000 time=00002022 rps=00000000
3: ok=00276228 drop=00000000 time=00000464 rps=00001218

(let me know if commit messages like this are too wordy)
---
 Documentation/networking/scaling.txt |   12 +++++++++++
 include/linux/netdevice.h            |    3 ++
 net/core/dev.c                       |   37 ++++++++++++++++++++++++++++++++-
 net/core/sysctl_net_core.c           |    9 ++++++++
 4 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
index 579994a..f454564 100644
--- a/Documentation/networking/scaling.txt
+++ b/Documentation/networking/scaling.txt
@@ -135,6 +135,18 @@ packets have been queued to their backlog queue. The IPI wakes backlog
 processing on the remote CPU, and any queued packets are then processed
 up the networking stack.
 
+==== RPS Overflow Protection
+
+By selecting the same cpu from the cpuset for each packet in the same
+flow, RPS will cause load imbalance when input flows are not uniformly
+random. In the extreme case, a single flow, all packets are handled on a
+single CPU, which limits the throughput of the machine to the throughput
+of that CPU. RPS has optional overflow protection, which disables flow
+affinity when an RPS CPU becomes saturated: during overload, its packets
+will be sent to the least loaded other CPU in the RPS cpuset. To enable
+this option, set sysctl net.core.netdev_max_rps_backlog to be smaller than
+net.core.netdev_max_backlog. Setting it to half is a reasonable heuristic.
+
 ==== RPS Configuration
 
 RPS requires a kernel compiled with the CONFIG_RPS kconfig symbol (on
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 18c5dc9..84624fa 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2609,6 +2609,9 @@ extern void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64,
 				    const struct net_device_stats *netdev_stats);
 
 extern int		netdev_max_backlog;
+#ifdef CONFIG_RPS
+extern int		netdev_max_rps_backlog;
+#endif
 extern int		netdev_tstamp_prequeue;
 extern int		weight_p;
 extern int		bpf_jit_enable;
diff --git a/net/core/dev.c b/net/core/dev.c
index 2f94df2..08c99ad 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2734,6 +2734,9 @@ EXPORT_SYMBOL(dev_queue_xmit);
 int netdev_max_backlog __read_mostly = 1000;
 EXPORT_SYMBOL(netdev_max_backlog);
 
+#ifdef CONFIG_RPS
+int netdev_max_rps_backlog __read_mostly = 1000;
+#endif
 int netdev_tstamp_prequeue __read_mostly = 1;
 int netdev_budget __read_mostly = 300;
 int weight_p __read_mostly = 64;            /* old backlog weight */
@@ -2834,6 +2837,36 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
 	return rflow;
 }
 
+/* @return cpu under normal conditions, another rps_cpu if backlogged. */
+static int get_rps_overflow_cpu(int cpu, const struct rps_map* map)
+{
+       struct softnet_data *sd;
+       unsigned int cur, tcpu, min;
+       int i;
+
+       if (skb_queue_len(&per_cpu(softnet_data, cpu).input_pkt_queue) <
+           netdev_max_rps_backlog || !map)
+               return cpu;
+
+       /* leave room to prioritize the flows sent to the cpu by rxhash. */
+       min = netdev_max_rps_backlog;
+       min -= min >> 3;
+
+       for (i = 0; i < map->len; i++) {
+               tcpu = map->cpus[i];
+               if (cpu_online(tcpu)) {
+                       sd = &per_cpu(softnet_data, tcpu);
+                       cur = skb_queue_len(&sd->input_pkt_queue);
+                       if (cur < min) {
+                               min = cur;
+                               cpu = tcpu;
+                       }
+               }
+       }
+
+       return cpu;
+}
+
 /*
  * get_rps_cpu is called from netif_receive_skb and returns the target
  * CPU from the RPS map of the receiving queue for a given skb.
@@ -2912,7 +2945,7 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
 
 		if (tcpu != RPS_NO_CPU && cpu_online(tcpu)) {
 			*rflowp = rflow;
-			cpu = tcpu;
+			cpu = get_rps_overflow_cpu(tcpu, map);
 			goto done;
 		}
 	}
@@ -2921,7 +2954,7 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
 		tcpu = map->cpus[((u64) skb->rxhash * map->len) >> 32];
 
 		if (cpu_online(tcpu)) {
-			cpu = tcpu;
+			cpu = get_rps_overflow_cpu(tcpu, map);
 			goto done;
 		}
 	}
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index d1b0804..c1b7829 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -129,6 +129,15 @@ static struct ctl_table net_core_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec
 	},
+#ifdef CONFIG_RPS
+	{
+		.procname	= "netdev_max_rps_backlog",
+		.data		= &netdev_max_rps_backlog,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec
+	},
+#endif
 #ifdef CONFIG_BPF_JIT
 	{
 		.procname	= "bpf_jit_enable",
-- 
1.7.7.3

^ permalink raw reply related

* RE: [net-next 6/8] bna: Add Stats Clear Counter
From: Rasesh Mody @ 2012-12-06 20:34 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: davem@davemloft.net, netdev@vger.kernel.org,
	Adapter Linux Open SRC Team
In-Reply-To: <1354760382.17107.140.camel@deadeye.wl.decadent.org.uk>

>From: Ben Hutchings [mailto:bhutchings@solarflare.com]
>Sent: Wednesday, December 05, 2012 6:20 PM
>Subject: Re: [net-next 6/8] bna: Add Stats Clear Counter
>
>On Wed, 2012-12-05 at 15:01 -0800, Rasesh Mody wrote:
>> Added Stats clear counter to the bfi_enet_stats_mac structure and
>> ethtool stats
>>
>> Signed-off-by: Rasesh Mody <rmody@brocade.com>
>
>Since this structure appears to be part of the firmware interface, you
>should combine this (and any other interface changes) with the change to
>the requested firmware version (7/8).

Ben,

Thanks for the feedback. Yes, bfi_enet_stats_mac is a part of firmware interface update.

In future submission, I will plan to take care of combining such firmware interface updates/changes and the firmware request version change into single patch. Please let me know if you think it is required for this patch set.

Regards
Rasesh 

^ permalink raw reply

* Re: [PATCHv5] virtio-spec: virtio network device RFS support
From: Michael S. Tsirkin @ 2012-12-06 20:29 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, kvm, virtualization
In-Reply-To: <1354824194.2828.6.camel@bwh-desktop.uk.solarflarecom.com>

On Thu, Dec 06, 2012 at 08:03:14PM +0000, Ben Hutchings wrote:
> On Thu, 2012-12-06 at 10:13 +0200, Michael S. Tsirkin wrote:
> > On Wed, Dec 05, 2012 at 08:39:26PM +0000, Ben Hutchings wrote:
> > > On Mon, 2012-12-03 at 12:58 +0200, Michael S. Tsirkin wrote:
> > > > Add RFS support to virtio network device.
> > > > Add a new feature flag VIRTIO_NET_F_RFS for this feature, a new
> > > > configuration field max_virtqueue_pairs to detect supported number of
> > > > virtqueues as well as a new command VIRTIO_NET_CTRL_RFS to program
> > > > packet steering for unidirectional protocols.
> > > [...]
> > > > +Programming of the receive flow classificator is implicit.
> > > > + Transmitting a packet of a specific flow on transmitqX will cause incoming
> > > > + packets for this flow to be steered to receiveqX.
> > > > + For uni-directional protocols, or where no packets have been transmitted
> > > > + yet, device will steer a packet to a random queue out of the specified
> > > > + receiveq0..receiveqn.
> > > [...]
> > > 
> > > It doesn't seem like this is usable to implement accelerated RFS in the
> > > guest, though perhaps that doesn't matter.
> > 
> > What is the issue? Could you be more explicit please?
> > 
> > It seems to work pretty well: if we have
> > # of queues >= # of cpus, incoming TCP_STREAM into
> > guest scales very nicely without manual tweaks in guest.
> > 
> > The way it works is, when guest sends a packet driver
> > select the rx queue that we want to use for incoming
> > packets for this slow, and transmit on the matching tx queue.
> > This is exactly what text above suggests no?
> 
> Yes, I get that.
> 
> > >  On the host side, presumably
> > > you'll want vhost_net to do the equivalent of sock_rps_record_flow() -
> > > only without a socket?  But in any case, that requires an rxhash, so I
> > > don't see how this is supposed to work.
> > > 
> > > Ben.
> > 
> > Host should just do what guest tells it to.
> > On the host side we build up the steering table as we get packets
> > to transmit. See the code in drivers/net/tun.c in recent
> > kernels.
> > 
> > Again this actually works fine - what are the problems that you see?
> > Could you give an example please?
> 
> I'm not saying it doesn't work in its own way, I just don't see how you
> would make it work with the existing RFS!
> 
> Since this doesn't seem to be intended to have *any* connection with the
> existing core networking feature called RFS, perhaps you could find a
> different name for it.
> 
> Ben.


Ah I see what you mean. We started out calling this feature "multiqueue"
Rusty suggested "RFS" since it gives similar functionality to RFS but in
device: it has receive steering logic per flow as part of the device.

Maybe simply adding a statement similar to the one above would be
sufficient to avoid confusion?


> -- 
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* [PATCH wireless-next] iwlwifi: iwlagn_request_scan: Fix check for NULL priv->scan_request
From: Tim Gardner @ 2012-12-06 20:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Tim Gardner, Johannes Berg, Wey-Yi Guy, Intel Linux Wireless,
	John W. Linville, Emmanuel Grumbach, Don Fry, linux-wireless,
	netdev

The WARN_ON_ONCE() check for scan_request will not correctly detect
a NULL pointer for scan_type == IWL_SCAN_NORMAL. Make it explicit
that the check only applies to normal scans.

Convert WARN_ON_ONCE to WARN_ON since priv->scan_request really _can't_
be NULL for normal scans. If it is then we should emit frequent warnings.

This smatch warning led to scrutiny of iwlagn_request_scan():

drivers/net/wireless/iwlwifi/dvm/scan.c:894 iwlagn_request_scan() error: we previously assumed 'priv->scan_request' could be null (see line 792)

Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Cc: Intel Linux Wireless <ilw@linux.intel.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Cc: Don Fry <donald.h.fry@intel.com>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---

This patch does apply to 3.6.y, but it doesn't fix an existing
bug so I don't think it qualifies. This patch simply makes
the driver more robust for future development.

 drivers/net/wireless/iwlwifi/dvm/scan.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/dvm/scan.c b/drivers/net/wireless/iwlwifi/dvm/scan.c
index bb9f625..e5cbcca 100644
--- a/drivers/net/wireless/iwlwifi/dvm/scan.c
+++ b/drivers/net/wireless/iwlwifi/dvm/scan.c
@@ -673,8 +673,9 @@ static int iwlagn_request_scan(struct iwl_priv *priv, struct ieee80211_vif *vif)
 	const u8 *ssid = NULL;
 	u8 ssid_len = 0;
 
-	if (WARN_ON_ONCE(priv->scan_request &&
-			 priv->scan_request->n_channels > MAX_SCAN_CHANNEL))
+	if (WARN_ON(priv->scan_type == IWL_SCAN_NORMAL &&
+		(!priv->scan_request ||
+		priv->scan_request->n_channels > MAX_SCAN_CHANNEL)))
 		return -EINVAL;
 
 	lockdep_assert_held(&priv->mutex);
-- 
1.7.9.5

^ permalink raw reply related

* Re: [GIT PULL] Remove __dev* markings from the networking drivers
From: Greg KH @ 2012-12-06 20:23 UTC (permalink / raw)
  To: John W. Linville; +Cc: linux-wireless, netdev, Bill Pemberton
In-Reply-To: <20121206201135.GA2246@tuxdriver.com>

On Thu, Dec 06, 2012 at 03:11:35PM -0500, John W. Linville wrote:
> I went ahead and applied these as patches, since they were based on
> something other than wireless-next.  I hope that's OK!

That's fine with me, thanks for doing so.

greg k-h

^ permalink raw reply

* Re: [PATCH net-next 2/2] net: doc: add default value for neighbour parameters
From: Ben Hutchings @ 2012-12-06 20:20 UTC (permalink / raw)
  To: Shan Wei; +Cc: David Miller, Eric Dumazet, NetDev
In-Reply-To: <50C009E9.6050101@gmail.com>

On Thu, 2012-12-06 at 10:58 +0800, Shan Wei wrote:
> Ben Hutchings said, at 2012/12/6 9:43:
> >>  neigh/default/unres_qlen - INTEGER
> >>  	The maximum number of packets which may be queued for each
> >>  	unresolved address by other network layers.
> >>  	(deprecated in linux 3.3) : use unres_qlen_bytes instead.
> >> +	Prior to linux 3.3, the default value is 3 which may cause
> >> +	secluded packet loss. The current default value is calculated
> >           ^^^^^^^^
> > I think the proper word here is 'silent'?
>  
> The number of lost packets is recorded in unresolved_discards
> of /proc/net/stat/arp_cache. Although, arp_cache is not easy
> to understand(I still don't know why we need so many rows),
> We can confirm dropping event from unresolved_discards in last column.
> The dropping event is not marked by absence of sound.
> 
> But for general user who using TCP/UDP or ping to sending packets
> out, can't simply find the dropping reason that destination ip is unresolved.
> They just doubt about the TCP/UDP or ping. So I use 'secluded' which is hidden
> from general view.
> 
> My English is not good enough, if missing something, please point to me.
> Thanks.

'Secluded' is a fine word but your dictionary should have said that it
is used to describe places, not events.  You're quite right that this is
not silent, though.  So a word like 'hidden', 'quiet', 'non-obvious' or
'unexpected' would be more appropriate.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* [PATCH 3/5 net-next] bnx2: Add BNX2 prefix to CHIP ID and name macros
From: Michael Chan @ 2012-12-06 20:33 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1354825992-8547-2-git-send-email-mchan@broadcom.com>

for namespace consistency.

Signed-off-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2.c |  178 +++++++++++++++++-----------------
 drivers/net/ethernet/broadcom/bnx2.h |   48 +++++-----
 drivers/net/ethernet/broadcom/cnic.c |   14 ++--
 3 files changed, 120 insertions(+), 120 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index 98cb76b..a9b2fea 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -306,7 +306,7 @@ bnx2_ctx_wr(struct bnx2 *bp, u32 cid_addr, u32 offset, u32 val)
 {
 	offset += cid_addr;
 	spin_lock_bh(&bp->indirect_lock);
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		int i;
 
 		BNX2_WR(bp, BNX2_CTX_CTX_DATA, val);
@@ -887,7 +887,7 @@ bnx2_alloc_mem(struct bnx2 *bp)
 
 	bp->stats_blk_mapping = bp->status_blk_mapping + status_blk_size;
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		bp->ctx_pages = 0x2000 / BNX2_PAGE_SIZE;
 		if (bp->ctx_pages == 0)
 			bp->ctx_pages = 1;
@@ -1034,7 +1034,7 @@ bnx2_resolve_flow_ctrl(struct bnx2 *bp)
 	}
 
 	if ((bp->phy_flags & BNX2_PHY_FLAG_SERDES) &&
-	    (CHIP_NUM(bp) == CHIP_NUM_5708)) {
+	    (BNX2_CHIP(bp) == BNX2_CHIP_5708)) {
 		u32 val;
 
 		bnx2_read_phy(bp, BCM5708S_1000X_STAT1, &val);
@@ -1310,7 +1310,7 @@ bnx2_set_mac_link(struct bnx2 *bp)
 	if (bp->link_up) {
 		switch (bp->line_speed) {
 			case SPEED_10:
-				if (CHIP_NUM(bp) != CHIP_NUM_5706) {
+				if (BNX2_CHIP(bp) != BNX2_CHIP_5706) {
 					val |= BNX2_EMAC_MODE_PORT_MII_10M;
 					break;
 				}
@@ -1360,7 +1360,7 @@ static void
 bnx2_enable_bmsr1(struct bnx2 *bp)
 {
 	if ((bp->phy_flags & BNX2_PHY_FLAG_SERDES) &&
-	    (CHIP_NUM(bp) == CHIP_NUM_5709))
+	    (BNX2_CHIP(bp) == BNX2_CHIP_5709))
 		bnx2_write_phy(bp, MII_BNX2_BLK_ADDR,
 			       MII_BNX2_BLK_ADDR_GP_STATUS);
 }
@@ -1369,7 +1369,7 @@ static void
 bnx2_disable_bmsr1(struct bnx2 *bp)
 {
 	if ((bp->phy_flags & BNX2_PHY_FLAG_SERDES) &&
-	    (CHIP_NUM(bp) == CHIP_NUM_5709))
+	    (BNX2_CHIP(bp) == BNX2_CHIP_5709))
 		bnx2_write_phy(bp, MII_BNX2_BLK_ADDR,
 			       MII_BNX2_BLK_ADDR_COMBO_IEEEB0);
 }
@@ -1386,7 +1386,7 @@ bnx2_test_and_enable_2g5(struct bnx2 *bp)
 	if (bp->autoneg & AUTONEG_SPEED)
 		bp->advertising |= ADVERTISED_2500baseX_Full;
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709)
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 		bnx2_write_phy(bp, MII_BNX2_BLK_ADDR, MII_BNX2_BLK_ADDR_OVER1G);
 
 	bnx2_read_phy(bp, bp->mii_up1, &up1);
@@ -1396,7 +1396,7 @@ bnx2_test_and_enable_2g5(struct bnx2 *bp)
 		ret = 0;
 	}
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709)
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 		bnx2_write_phy(bp, MII_BNX2_BLK_ADDR,
 			       MII_BNX2_BLK_ADDR_COMBO_IEEEB0);
 
@@ -1412,7 +1412,7 @@ bnx2_test_and_disable_2g5(struct bnx2 *bp)
 	if (!(bp->phy_flags & BNX2_PHY_FLAG_2_5G_CAPABLE))
 		return 0;
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709)
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 		bnx2_write_phy(bp, MII_BNX2_BLK_ADDR, MII_BNX2_BLK_ADDR_OVER1G);
 
 	bnx2_read_phy(bp, bp->mii_up1, &up1);
@@ -1422,7 +1422,7 @@ bnx2_test_and_disable_2g5(struct bnx2 *bp)
 		ret = 1;
 	}
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709)
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 		bnx2_write_phy(bp, MII_BNX2_BLK_ADDR,
 			       MII_BNX2_BLK_ADDR_COMBO_IEEEB0);
 
@@ -1438,7 +1438,7 @@ bnx2_enable_forced_2g5(struct bnx2 *bp)
 	if (!(bp->phy_flags & BNX2_PHY_FLAG_2_5G_CAPABLE))
 		return;
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		u32 val;
 
 		bnx2_write_phy(bp, MII_BNX2_BLK_ADDR,
@@ -1454,7 +1454,7 @@ bnx2_enable_forced_2g5(struct bnx2 *bp)
 			       MII_BNX2_BLK_ADDR_COMBO_IEEEB0);
 		err = bnx2_read_phy(bp, bp->mii_bmcr, &bmcr);
 
-	} else if (CHIP_NUM(bp) == CHIP_NUM_5708) {
+	} else if (BNX2_CHIP(bp) == BNX2_CHIP_5708) {
 		err = bnx2_read_phy(bp, bp->mii_bmcr, &bmcr);
 		if (!err)
 			bmcr |= BCM5708S_BMCR_FORCE_2500;
@@ -1482,7 +1482,7 @@ bnx2_disable_forced_2g5(struct bnx2 *bp)
 	if (!(bp->phy_flags & BNX2_PHY_FLAG_2_5G_CAPABLE))
 		return;
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		u32 val;
 
 		bnx2_write_phy(bp, MII_BNX2_BLK_ADDR,
@@ -1496,7 +1496,7 @@ bnx2_disable_forced_2g5(struct bnx2 *bp)
 			       MII_BNX2_BLK_ADDR_COMBO_IEEEB0);
 		err = bnx2_read_phy(bp, bp->mii_bmcr, &bmcr);
 
-	} else if (CHIP_NUM(bp) == CHIP_NUM_5708) {
+	} else if (BNX2_CHIP(bp) == BNX2_CHIP_5708) {
 		err = bnx2_read_phy(bp, bp->mii_bmcr, &bmcr);
 		if (!err)
 			bmcr &= ~BCM5708S_BMCR_FORCE_2500;
@@ -1547,7 +1547,7 @@ bnx2_set_link(struct bnx2 *bp)
 	bnx2_disable_bmsr1(bp);
 
 	if ((bp->phy_flags & BNX2_PHY_FLAG_SERDES) &&
-	    (CHIP_NUM(bp) == CHIP_NUM_5706)) {
+	    (BNX2_CHIP(bp) == BNX2_CHIP_5706)) {
 		u32 val, an_dbg;
 
 		if (bp->phy_flags & BNX2_PHY_FLAG_FORCED_DOWN) {
@@ -1571,11 +1571,11 @@ bnx2_set_link(struct bnx2 *bp)
 		bp->link_up = 1;
 
 		if (bp->phy_flags & BNX2_PHY_FLAG_SERDES) {
-			if (CHIP_NUM(bp) == CHIP_NUM_5706)
+			if (BNX2_CHIP(bp) == BNX2_CHIP_5706)
 				bnx2_5706s_linkup(bp);
-			else if (CHIP_NUM(bp) == CHIP_NUM_5708)
+			else if (BNX2_CHIP(bp) == BNX2_CHIP_5708)
 				bnx2_5708s_linkup(bp);
-			else if (CHIP_NUM(bp) == CHIP_NUM_5709)
+			else if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 				bnx2_5709s_linkup(bp);
 		}
 		else {
@@ -1757,7 +1757,7 @@ __acquires(&bp->phy_lock)
 		new_bmcr = bmcr & ~BMCR_ANENABLE;
 		new_bmcr |= BMCR_SPEED1000;
 
-		if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+		if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 			if (bp->req_line_speed == SPEED_2500)
 				bnx2_enable_forced_2g5(bp);
 			else if (bp->req_line_speed == SPEED_1000) {
@@ -1765,7 +1765,7 @@ __acquires(&bp->phy_lock)
 				new_bmcr &= ~0x2000;
 			}
 
-		} else if (CHIP_NUM(bp) == CHIP_NUM_5708) {
+		} else if (BNX2_CHIP(bp) == BNX2_CHIP_5708) {
 			if (bp->req_line_speed == SPEED_2500)
 				new_bmcr |= BCM5708S_BMCR_FORCE_2500;
 			else
@@ -2230,9 +2230,9 @@ bnx2_init_5708s_phy(struct bnx2 *bp, int reset_phy)
 		bnx2_write_phy(bp, BCM5708S_UP1, val);
 	}
 
-	if ((CHIP_ID(bp) == CHIP_ID_5708_A0) ||
-	    (CHIP_ID(bp) == CHIP_ID_5708_B0) ||
-	    (CHIP_ID(bp) == CHIP_ID_5708_B1)) {
+	if ((BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5708_A0) ||
+	    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5708_B0) ||
+	    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5708_B1)) {
 		/* increase tx signal amplitude */
 		bnx2_write_phy(bp, BCM5708S_BLK_ADDR,
 			       BCM5708S_BLK_ADDR_TX_MISC);
@@ -2268,7 +2268,7 @@ bnx2_init_5706s_phy(struct bnx2 *bp, int reset_phy)
 
 	bp->phy_flags &= ~BNX2_PHY_FLAG_PARALLEL_DETECT;
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5706)
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5706)
 		BNX2_WR(bp, BNX2_MISC_GP_HW_CTL0, 0x300);
 
 	if (bp->dev->mtu > 1500) {
@@ -2379,11 +2379,11 @@ __acquires(&bp->phy_lock)
 	bp->phy_id |= val & 0xffff;
 
 	if (bp->phy_flags & BNX2_PHY_FLAG_SERDES) {
-		if (CHIP_NUM(bp) == CHIP_NUM_5706)
+		if (BNX2_CHIP(bp) == BNX2_CHIP_5706)
 			rc = bnx2_init_5706s_phy(bp, reset_phy);
-		else if (CHIP_NUM(bp) == CHIP_NUM_5708)
+		else if (BNX2_CHIP(bp) == BNX2_CHIP_5708)
 			rc = bnx2_init_5708s_phy(bp, reset_phy);
-		else if (CHIP_NUM(bp) == CHIP_NUM_5709)
+		else if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 			rc = bnx2_init_5709s_phy(bp, reset_phy);
 	}
 	else {
@@ -2449,7 +2449,7 @@ bnx2_dump_mcp_state(struct bnx2 *bp)
 	u32 mcp_p0, mcp_p1;
 
 	netdev_err(dev, "<--- start MCP states dump --->\n");
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		mcp_p0 = BNX2_MCP_STATE_P0;
 		mcp_p1 = BNX2_MCP_STATE_P1;
 	} else {
@@ -2591,7 +2591,7 @@ bnx2_init_context(struct bnx2 *bp)
 
 		vcid--;
 
-		if (CHIP_ID(bp) == CHIP_ID_5706_A0) {
+		if (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A0) {
 			u32 new_vcid;
 
 			vcid_addr = GET_PCID_ADDR(vcid);
@@ -3668,10 +3668,10 @@ static int bnx2_request_uncached_firmware(struct bnx2 *bp)
 	const struct bnx2_rv2p_fw_file *rv2p_fw;
 	int rc;
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		mips_fw_file = FW_MIPS_FILE_09;
-		if ((CHIP_ID(bp) == CHIP_ID_5709_A0) ||
-		    (CHIP_ID(bp) == CHIP_ID_5709_A1))
+		if ((BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5709_A0) ||
+		    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5709_A1))
 			rv2p_fw_file = FW_RV2P_FILE_09_Ax;
 		else
 			rv2p_fw_file = FW_RV2P_FILE_09;
@@ -4021,8 +4021,8 @@ bnx2_set_power_state(struct bnx2 *bp, pci_power_t state)
 				     1, 0);
 
 		pmcsr &= ~PCI_PM_CTRL_STATE_MASK;
-		if ((CHIP_ID(bp) == CHIP_ID_5706_A0) ||
-		    (CHIP_ID(bp) == CHIP_ID_5706_A1)) {
+		if ((BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A0) ||
+		    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A1)) {
 
 			if (bp->wol)
 				pmcsr |= 3;
@@ -4292,7 +4292,7 @@ bnx2_init_nvram(struct bnx2 *bp)
 	int j, entry_count, rc = 0;
 	const struct flash_spec *flash;
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		bp->flash_info = &flash_5709;
 		goto get_flash_size;
 	}
@@ -4716,8 +4716,8 @@ bnx2_reset_chip(struct bnx2 *bp, u32 reset_code)
 
 	/* Wait for the current PCI transaction to complete before
 	 * issuing a reset. */
-	if ((CHIP_NUM(bp) == CHIP_NUM_5706) ||
-	    (CHIP_NUM(bp) == CHIP_NUM_5708)) {
+	if ((BNX2_CHIP(bp) == BNX2_CHIP_5706) ||
+	    (BNX2_CHIP(bp) == BNX2_CHIP_5708)) {
 		BNX2_WR(bp, BNX2_MISC_ENABLE_CLR_BITS,
 			BNX2_MISC_ENABLE_CLR_BITS_TX_DMA_ENABLE |
 			BNX2_MISC_ENABLE_CLR_BITS_DMA_ENGINE_ENABLE |
@@ -4751,7 +4751,7 @@ bnx2_reset_chip(struct bnx2 *bp, u32 reset_code)
 	 * before we issue a reset. */
 	val = BNX2_RD(bp, BNX2_MISC_ID);
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		BNX2_WR(bp, BNX2_MISC_COMMAND, BNX2_MISC_COMMAND_SW_RESET);
 		BNX2_RD(bp, BNX2_MISC_COMMAND);
 		udelay(5);
@@ -4773,8 +4773,8 @@ bnx2_reset_chip(struct bnx2 *bp, u32 reset_code)
 		 * bus on 5706 A0 and A1.  The msleep below provides plenty
 		 * of margin for write posting.
 		 */
-		if ((CHIP_ID(bp) == CHIP_ID_5706_A0) ||
-		    (CHIP_ID(bp) == CHIP_ID_5706_A1))
+		if ((BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A0) ||
+		    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A1))
 			msleep(20);
 
 		/* Reset takes approximate 30 usec */
@@ -4813,7 +4813,7 @@ bnx2_reset_chip(struct bnx2 *bp, u32 reset_code)
 		bnx2_set_default_remote_link(bp);
 	spin_unlock_bh(&bp->phy_lock);
 
-	if (CHIP_ID(bp) == CHIP_ID_5706_A0) {
+	if (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A0) {
 		/* Adjust the voltage regular to two steps lower.  The default
 		 * of this register is 0x0000000e. */
 		BNX2_WR(bp, BNX2_MISC_VREG_CONTROL, 0x000000fa);
@@ -4855,13 +4855,14 @@ bnx2_init_chip(struct bnx2 *bp)
 	if ((bp->flags & BNX2_FLAG_PCIX) && (bp->bus_speed_mhz == 133))
 		val |= (1 << 23);
 
-	if ((CHIP_NUM(bp) == CHIP_NUM_5706) &&
-	    (CHIP_ID(bp) != CHIP_ID_5706_A0) && !(bp->flags & BNX2_FLAG_PCIX))
+	if ((BNX2_CHIP(bp) == BNX2_CHIP_5706) &&
+	    (BNX2_CHIP_ID(bp) != BNX2_CHIP_ID_5706_A0) &&
+	    !(bp->flags & BNX2_FLAG_PCIX))
 		val |= BNX2_DMA_CONFIG_CNTL_PING_PONG_DMA;
 
 	BNX2_WR(bp, BNX2_DMA_CONFIG, val);
 
-	if (CHIP_ID(bp) == CHIP_ID_5706_A0) {
+	if (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A0) {
 		val = BNX2_RD(bp, BNX2_TDMA_CONFIG);
 		val |= BNX2_TDMA_CONFIG_ONE_DMA;
 		BNX2_WR(bp, BNX2_TDMA_CONFIG, val);
@@ -4883,7 +4884,7 @@ bnx2_init_chip(struct bnx2 *bp)
 
 	/* Initialize context mapping and zero out the quick contexts.  The
 	 * context block must have already been enabled. */
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		rc = bnx2_init_5709_context(bp);
 		if (rc)
 			return rc;
@@ -4900,9 +4901,9 @@ bnx2_init_chip(struct bnx2 *bp)
 	val = BNX2_RD(bp, BNX2_MQ_CONFIG);
 	val &= ~BNX2_MQ_CONFIG_KNL_BYP_BLK_SIZE;
 	val |= BNX2_MQ_CONFIG_KNL_BYP_BLK_SIZE_256;
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		val |= BNX2_MQ_CONFIG_BIN_MQ_MODE;
-		if (CHIP_REV(bp) == CHIP_REV_Ax)
+		if (BNX2_CHIP_REV(bp) == BNX2_CHIP_REV_Ax)
 			val |= BNX2_MQ_CONFIG_HALT_DIS;
 	}
 
@@ -4988,7 +4989,7 @@ bnx2_init_chip(struct bnx2 *bp)
 		BNX2_WR(bp, BNX2_HC_STATS_TICKS, bp->stats_ticks);
 	BNX2_WR(bp, BNX2_HC_STAT_COLLECT_TICKS, 0xbb8);  /* 3ms */
 
-	if (CHIP_ID(bp) == CHIP_ID_5706_A1)
+	if (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A1)
 		val = BNX2_HC_CONFIG_COLLECT_STATS;
 	else {
 		val = BNX2_HC_CONFIG_RX_TMR_MODE | BNX2_HC_CONFIG_TX_TMR_MODE |
@@ -5044,7 +5045,7 @@ bnx2_init_chip(struct bnx2 *bp)
 	/* Initialize the receive filter. */
 	bnx2_set_rx_mode(bp->dev);
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		val = BNX2_RD(bp, BNX2_MISC_NEW_CORE_CTL);
 		val |= BNX2_MISC_NEW_CORE_CTL_DMA_ENABLE;
 		BNX2_WR(bp, BNX2_MISC_NEW_CORE_CTL, val);
@@ -5091,7 +5092,7 @@ bnx2_init_tx_context(struct bnx2 *bp, u32 cid, struct bnx2_tx_ring_info *txr)
 	u32 val, offset0, offset1, offset2, offset3;
 	u32 cid_addr = GET_CID_ADDR(cid);
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		offset0 = BNX2_L2CTX_TYPE_XI;
 		offset1 = BNX2_L2CTX_CMD_TYPE_XI;
 		offset2 = BNX2_L2CTX_TBDR_BHADDR_HI_XI;
@@ -5192,7 +5193,7 @@ bnx2_init_rx_ring(struct bnx2 *bp, int ring_num)
 
 	bnx2_init_rx_context(bp, cid);
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		val = BNX2_RD(bp, BNX2_MQ_MAP_L2_5);
 		BNX2_WR(bp, BNX2_MQ_MAP_L2_5, val | BNX2_MQ_MAP_L2_5_ARM);
 	}
@@ -5213,7 +5214,7 @@ bnx2_init_rx_ring(struct bnx2 *bp, int ring_num)
 		val = (u64) rxr->rx_pg_desc_mapping[0] & 0xffffffff;
 		bnx2_ctx_wr(bp, rx_cid_addr, BNX2_L2CTX_NX_PG_BDHADDR_LO, val);
 
-		if (CHIP_NUM(bp) == CHIP_NUM_5709)
+		if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 			BNX2_WR(bp, BNX2_MQ_MAP_L2_3, BNX2_MQ_MAP_L2_3_DEFAULT);
 	}
 
@@ -5621,7 +5622,7 @@ bnx2_test_registers(struct bnx2 *bp)
 
 	ret = 0;
 	is_5709 = 0;
-	if (CHIP_NUM(bp) == CHIP_NUM_5709)
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 		is_5709 = 1;
 
 	for (i = 0; reg_tbl[i].offset != 0xffff; i++) {
@@ -5720,7 +5721,7 @@ bnx2_test_memory(struct bnx2 *bp)
 	};
 	struct mem_entry *mem_tbl;
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709)
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 		mem_tbl = mem_tbl_5709;
 	else
 		mem_tbl = mem_tbl_5706;
@@ -6142,7 +6143,7 @@ bnx2_timer(unsigned long data)
 			BNX2_HC_COMMAND_STATS_NOW);
 
 	if (bp->phy_flags & BNX2_PHY_FLAG_SERDES) {
-		if (CHIP_NUM(bp) == CHIP_NUM_5706)
+		if (BNX2_CHIP(bp) == BNX2_CHIP_5706)
 			bnx2_5706_serdes_timer(bp);
 		else
 			bnx2_5708_serdes_timer(bp);
@@ -6280,7 +6281,7 @@ bnx2_setup_int_mode(struct bnx2 *bp, int dis_msi)
 	    !(bp->flags & BNX2_FLAG_USING_MSIX)) {
 		if (pci_enable_msi(bp->pdev) == 0) {
 			bp->flags |= BNX2_FLAG_USING_MSI;
-			if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+			if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 				bp->flags |= BNX2_FLAG_ONE_SHOT_MSI;
 				bp->irq_tbl[0].handler = bnx2_msi_1shot;
 			} else
@@ -6816,8 +6817,8 @@ bnx2_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *net_stats)
 		GET_32BIT_NET_STATS(stat_Dot3StatsExcessiveCollisions) +
 		GET_32BIT_NET_STATS(stat_Dot3StatsLateCollisions);
 
-	if ((CHIP_NUM(bp) == CHIP_NUM_5706) ||
-	    (CHIP_ID(bp) == CHIP_ID_5708_A0))
+	if ((BNX2_CHIP(bp) == BNX2_CHIP_5706) ||
+	    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5708_A0))
 		net_stats->tx_carrier_errors = 0;
 	else {
 		net_stats->tx_carrier_errors =
@@ -7620,10 +7621,10 @@ bnx2_get_ethtool_stats(struct net_device *dev,
 		return;
 	}
 
-	if ((CHIP_ID(bp) == CHIP_ID_5706_A0) ||
-	    (CHIP_ID(bp) == CHIP_ID_5706_A1) ||
-	    (CHIP_ID(bp) == CHIP_ID_5706_A2) ||
-	    (CHIP_ID(bp) == CHIP_ID_5708_A0))
+	if ((BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A0) ||
+	    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A1) ||
+	    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A2) ||
+	    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5708_A0))
 		stats_len_arr = bnx2_5706_stats_len_arr;
 	else
 		stats_len_arr = bnx2_5708_stats_len_arr;
@@ -8143,14 +8144,14 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 
 	bp->chip_id = BNX2_RD(bp, BNX2_MISC_ID);
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709) {
 		if (!pci_is_pcie(pdev)) {
 			dev_err(&pdev->dev, "Not PCIE, aborting\n");
 			rc = -EIO;
 			goto err_out_unmap;
 		}
 		bp->flags |= BNX2_FLAG_PCIE;
-		if (CHIP_REV(bp) == CHIP_REV_Ax)
+		if (BNX2_CHIP_REV(bp) == BNX2_CHIP_REV_Ax)
 			bp->flags |= BNX2_FLAG_JUMBO_BROKEN;
 
 		/* AER (Advanced Error Reporting) hooks */
@@ -8169,18 +8170,20 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 		bp->flags |= BNX2_FLAG_BROKEN_STATS;
 	}
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709 && CHIP_REV(bp) != CHIP_REV_Ax) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709 &&
+	    BNX2_CHIP_REV(bp) != BNX2_CHIP_REV_Ax) {
 		if (pci_find_capability(pdev, PCI_CAP_ID_MSIX))
 			bp->flags |= BNX2_FLAG_MSIX_CAP;
 	}
 
-	if (CHIP_ID(bp) != CHIP_ID_5706_A0 && CHIP_ID(bp) != CHIP_ID_5706_A1) {
+	if (BNX2_CHIP_ID(bp) != BNX2_CHIP_ID_5706_A0 &&
+	    BNX2_CHIP_ID(bp) != BNX2_CHIP_ID_5706_A1) {
 		if (pci_find_capability(pdev, PCI_CAP_ID_MSI))
 			bp->flags |= BNX2_FLAG_MSI_CAP;
 	}
 
 	/* 5708 cannot support DMA addresses > 40-bit.  */
-	if (CHIP_NUM(bp) == CHIP_NUM_5708)
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5708)
 		persist_dma_mask = dma_mask = DMA_BIT_MASK(40);
 	else
 		persist_dma_mask = dma_mask = DMA_BIT_MASK(64);
@@ -8203,12 +8206,11 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 		bnx2_get_pci_speed(bp);
 
 	/* 5706A0 may falsely detect SERR and PERR. */
-	if (CHIP_ID(bp) == CHIP_ID_5706_A0) {
+	if (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A0) {
 		reg = BNX2_RD(bp, PCI_COMMAND);
 		reg &= ~(PCI_COMMAND_SERR | PCI_COMMAND_PARITY);
 		BNX2_WR(bp, PCI_COMMAND, reg);
-	}
-	else if ((CHIP_ID(bp) == CHIP_ID_5706_A1) &&
+	} else if ((BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A1) &&
 		!(bp->flags & BNX2_FLAG_PCIX)) {
 
 		dev_err(&pdev->dev,
@@ -8325,9 +8327,9 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 	bp->phy_addr = 1;
 
 	/* Disable WOL support if we are running on a SERDES chip. */
-	if (CHIP_NUM(bp) == CHIP_NUM_5709)
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 		bnx2_get_5709_media(bp);
-	else if (CHIP_BOND_ID(bp) & CHIP_BOND_ID_SERDES_BIT)
+	else if (BNX2_CHIP_BOND(bp) & BNX2_CHIP_BOND_SERDES_BIT)
 		bp->phy_flags |= BNX2_PHY_FLAG_SERDES;
 
 	bp->phy_port = PORT_TP;
@@ -8338,7 +8340,7 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 			bp->flags |= BNX2_FLAG_NO_WOL;
 			bp->wol = 0;
 		}
-		if (CHIP_NUM(bp) == CHIP_NUM_5706) {
+		if (BNX2_CHIP(bp) == BNX2_CHIP_5706) {
 			/* Don't do parallel detect on this board because of
 			 * some board problems.  The link will not go down
 			 * if we do parallel detect.
@@ -8351,25 +8353,25 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 			if (reg & BNX2_SHARED_HW_CFG_PHY_2_5G)
 				bp->phy_flags |= BNX2_PHY_FLAG_2_5G_CAPABLE;
 		}
-	} else if (CHIP_NUM(bp) == CHIP_NUM_5706 ||
-		   CHIP_NUM(bp) == CHIP_NUM_5708)
+	} else if (BNX2_CHIP(bp) == BNX2_CHIP_5706 ||
+		   BNX2_CHIP(bp) == BNX2_CHIP_5708)
 		bp->phy_flags |= BNX2_PHY_FLAG_CRC_FIX;
-	else if (CHIP_NUM(bp) == CHIP_NUM_5709 &&
-		 (CHIP_REV(bp) == CHIP_REV_Ax ||
-		  CHIP_REV(bp) == CHIP_REV_Bx))
+	else if (BNX2_CHIP(bp) == BNX2_CHIP_5709 &&
+		 (BNX2_CHIP_REV(bp) == BNX2_CHIP_REV_Ax ||
+		  BNX2_CHIP_REV(bp) == BNX2_CHIP_REV_Bx))
 		bp->phy_flags |= BNX2_PHY_FLAG_DIS_EARLY_DAC;
 
 	bnx2_init_fw_cap(bp);
 
-	if ((CHIP_ID(bp) == CHIP_ID_5708_A0) ||
-	    (CHIP_ID(bp) == CHIP_ID_5708_B0) ||
-	    (CHIP_ID(bp) == CHIP_ID_5708_B1) ||
+	if ((BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5708_A0) ||
+	    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5708_B0) ||
+	    (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5708_B1) ||
 	    !(BNX2_RD(bp, BNX2_PCI_CONFIG_3) & BNX2_PCI_CONFIG_3_VAUX_PRESET)) {
 		bp->flags |= BNX2_FLAG_NO_WOL;
 		bp->wol = 0;
 	}
 
-	if (CHIP_ID(bp) == CHIP_ID_5706_A0) {
+	if (BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A0) {
 		bp->tx_quick_cons_trip_int =
 			bp->tx_quick_cons_trip;
 		bp->tx_ticks_int = bp->tx_ticks;
@@ -8391,7 +8393,7 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 	 * AMD believes this incompatibility is unique to the 5706, and
 	 * prefers to locally disable MSI rather than globally disabling it.
 	 */
-	if (CHIP_NUM(bp) == CHIP_NUM_5706 && disable_msi == 0) {
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5706 && disable_msi == 0) {
 		struct pci_dev *amd_8132 = NULL;
 
 		while ((amd_8132 = pci_get_device(PCI_VENDOR_ID_AMD,
@@ -8547,7 +8549,7 @@ bnx2_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		NETIF_F_TSO | NETIF_F_TSO_ECN |
 		NETIF_F_RXHASH | NETIF_F_RXCSUM;
 
-	if (CHIP_NUM(bp) == CHIP_NUM_5709)
+	if (BNX2_CHIP(bp) == BNX2_CHIP_5709)
 		dev->hw_features |= NETIF_F_IPV6_CSUM | NETIF_F_TSO6;
 
 	dev->vlan_features = dev->hw_features;
@@ -8562,15 +8564,15 @@ bnx2_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	netdev_info(dev, "%s (%c%d) %s found at mem %lx, IRQ %d, "
 		    "node addr %pM\n", board_info[ent->driver_data].name,
-		    ((CHIP_ID(bp) & 0xf000) >> 12) + 'A',
-		    ((CHIP_ID(bp) & 0x0ff0) >> 4),
+		    ((BNX2_CHIP_ID(bp) & 0xf000) >> 12) + 'A',
+		    ((BNX2_CHIP_ID(bp) & 0x0ff0) >> 4),
 		    bnx2_bus_string(bp, str), (long)pci_resource_start(pdev, 0),
 		    pdev->irq, dev->dev_addr);
 
 	return 0;
 
 error:
-	pci_iounmap(pdev, bp->regview);
+	iounmap(bp->regview);
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
 	pci_set_drvdata(pdev, NULL);
@@ -8748,7 +8750,7 @@ static void bnx2_io_resume(struct pci_dev *pdev)
 	rtnl_unlock();
 }
 
-static const struct pci_error_handlers bnx2_err_handler = {
+static struct pci_error_handlers bnx2_err_handler = {
 	.error_detected	= bnx2_io_error_detected,
 	.slot_reset	= bnx2_io_slot_reset,
 	.resume		= bnx2_io_resume,
diff --git a/drivers/net/ethernet/broadcom/bnx2.h b/drivers/net/ethernet/broadcom/bnx2.h
index 58caa22..38a640d 100644
--- a/drivers/net/ethernet/broadcom/bnx2.h
+++ b/drivers/net/ethernet/broadcom/bnx2.h
@@ -6854,33 +6854,31 @@ struct bnx2 {
 
 	u32			chip_id;
 	/* chip num:16-31, rev:12-15, metal:4-11, bond_id:0-3 */
-#define CHIP_NUM(bp)			(((bp)->chip_id) & 0xffff0000)
-#define CHIP_NUM_5706			0x57060000
-#define CHIP_NUM_5708			0x57080000
-#define CHIP_NUM_5709			0x57090000
-
-#define CHIP_REV(bp)			(((bp)->chip_id) & 0x0000f000)
-#define CHIP_REV_Ax			0x00000000
-#define CHIP_REV_Bx			0x00001000
-#define CHIP_REV_Cx			0x00002000
-
-#define CHIP_METAL(bp)			(((bp)->chip_id) & 0x00000ff0)
-#define CHIP_BONDING(bp)		(((bp)->chip_id) & 0x0000000f)
-
-#define CHIP_ID(bp)			(((bp)->chip_id) & 0xfffffff0)
-#define CHIP_ID_5706_A0			0x57060000
-#define CHIP_ID_5706_A1			0x57060010
-#define CHIP_ID_5706_A2			0x57060020
-#define CHIP_ID_5708_A0			0x57080000
-#define CHIP_ID_5708_B0			0x57081000
-#define CHIP_ID_5708_B1			0x57081010
-#define CHIP_ID_5709_A0			0x57090000
-#define CHIP_ID_5709_A1			0x57090010
-
-#define CHIP_BOND_ID(bp)		(((bp)->chip_id) & 0xf)
+#define BNX2_CHIP(bp)			(((bp)->chip_id) & 0xffff0000)
+#define BNX2_CHIP_5706			0x57060000
+#define BNX2_CHIP_5708			0x57080000
+#define BNX2_CHIP_5709			0x57090000
+
+#define BNX2_CHIP_REV(bp)		(((bp)->chip_id) & 0x0000f000)
+#define BNX2_CHIP_REV_Ax		0x00000000
+#define BNX2_CHIP_REV_Bx		0x00001000
+#define BNX2_CHIP_REV_Cx		0x00002000
+
+#define BNX2_CHIP_METAL(bp)		(((bp)->chip_id) & 0x00000ff0)
+#define BNX2_CHIP_BOND(bp)		(((bp)->chip_id) & 0x0000000f)
+
+#define BNX2_CHIP_ID(bp)		(((bp)->chip_id) & 0xfffffff0)
+#define BNX2_CHIP_ID_5706_A0		0x57060000
+#define BNX2_CHIP_ID_5706_A1			0x57060010
+#define BNX2_CHIP_ID_5706_A2			0x57060020
+#define BNX2_CHIP_ID_5708_A0			0x57080000
+#define BNX2_CHIP_ID_5708_B0			0x57081000
+#define BNX2_CHIP_ID_5708_B1			0x57081010
+#define BNX2_CHIP_ID_5709_A0			0x57090000
+#define BNX2_CHIP_ID_5709_A1			0x57090010
 
 /* A serdes chip will have the first bit of the bond id set. */
-#define CHIP_BOND_ID_SERDES_BIT		0x01
+#define BNX2_CHIP_BOND_SERDES_BIT		0x01
 
 	u32			phy_addr;
 	u32			phy_id;
diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index 756a2a7..5778098 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -895,7 +895,7 @@ static int cnic_alloc_context(struct cnic_dev *dev)
 {
 	struct cnic_local *cp = dev->cnic_priv;
 
-	if (CHIP_NUM(cp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(cp) == BNX2_CHIP_5709) {
 		int i, k, arr_size;
 
 		cp->ctx_blk_size = BNX2_PAGE_SIZE;
@@ -4358,7 +4358,7 @@ static int cnic_setup_5709_context(struct cnic_dev *dev, int valid)
 	int ret = 0, i;
 	u32 valid_bit = valid ? BNX2_CTX_HOST_PAGE_TBL_DATA0_VALID : 0;
 
-	if (CHIP_NUM(cp) != CHIP_NUM_5709)
+	if (BNX2_CHIP(cp) != BNX2_CHIP_5709)
 		return 0;
 
 	for (i = 0; i < cp->ctx_blks; i++) {
@@ -4526,7 +4526,7 @@ static void cnic_init_bnx2_tx_ring(struct cnic_dev *dev)
 	cp->tx_cons = *cp->tx_cons_ptr;
 
 	cid_addr = GET_CID_ADDR(tx_cid);
-	if (CHIP_NUM(cp) == CHIP_NUM_5709) {
+	if (BNX2_CHIP(cp) == BNX2_CHIP_5709) {
 		u32 cid_addr2 = GET_CID_ADDR(tx_cid + 4) + 0x40;
 
 		for (i = 0; i < PHY_CTX_SIZE; i += 4)
@@ -4671,7 +4671,7 @@ static void cnic_set_bnx2_mac(struct cnic_dev *dev)
 	CNIC_WR(dev, BNX2_EMAC_MAC_MATCH5, val);
 
 	val = 4 | BNX2_RPM_SORT_USER2_BC_EN;
-	if (CHIP_NUM(cp) != CHIP_NUM_5709)
+	if (BNX2_CHIP(cp) != BNX2_CHIP_5709)
 		val |= BNX2_RPM_SORT_USER2_PROM_VLAN;
 
 	CNIC_WR(dev, BNX2_RPM_SORT_USER2, 0x0);
@@ -4717,7 +4717,7 @@ static int cnic_start_bnx2_hw(struct cnic_dev *dev)
 	cp->kwq_con_idx = 0;
 	set_bit(CNIC_LCL_FL_KWQ_INIT, &cp->cnic_local_flags);
 
-	if (CHIP_NUM(cp) == CHIP_NUM_5706 || CHIP_NUM(cp) == CHIP_NUM_5708)
+	if (BNX2_CHIP(cp) == BNX2_CHIP_5706 || BNX2_CHIP(cp) == BNX2_CHIP_5708)
 		cp->kwq_con_idx_ptr = &sblk->status_rx_quick_consumer_index15;
 	else
 		cp->kwq_con_idx_ptr = &sblk->status_cmd_consumer_index;
@@ -4917,9 +4917,9 @@ static void cnic_init_bnx2x_tx_ring(struct cnic_dev *dev,
 
 		if (BNX2X_CHIP_IS_E2_PLUS(cp->chip_id))
 			pbd_e2->parsing_data = (UNICAST_ADDRESS <<
-				 ETH_TX_PARSE_BD_E2_ETH_ADDR_TYPE_SHIFT);
+				ETH_TX_PARSE_BD_E2_ETH_ADDR_TYPE_SHIFT);
 		else
-			 pbd_e1x->global_data = (UNICAST_ADDRESS <<
+			pbd_e1x->global_data = (UNICAST_ADDRESS <<
 				ETH_TX_PARSE_BD_E1X_ETH_ADDR_TYPE_SHIFT);
 	}
 
-- 
1.7.1

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox