Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
From: Eric Dumazet @ 2011-01-07  2:43 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Jesse Gross, Michael Leun, Michael Chan, David Miller, Ben Greear,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <1294368071.2704.49.camel@edumazet-laptop>

Le vendredi 07 janvier 2011 à 03:41 +0100, Eric Dumazet a écrit :
> Le jeudi 06 janvier 2011 à 18:29 -0800, Matt Carlson a écrit :
> 
> > Hi Eric.  Sorry for the delay.  I was under the impression that your
> > problems were software related and that you just needed a revised
> > version of these VLAN patches I was sending to Michael.  Is this not
> > true?
> > 
> > Having a hardware stat increment suggests this is a new problem.
> > Maybe I missed it, but I didn't see what hardware you are working
> > with and whether or not management firmware was enabled.  Could you tell
> > me that info?
> > 
> 
> Hi Matt
> 
> I started a bisection, because I couldnt sleep tonight anyway :(
> 
> 14:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S
> Gigabit Ethernet (rev a3)
> 	Subsystem: Hewlett-Packard Company NC326m PCIe Dual Port Adapter
> 	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 43
> 	Memory at fdff0000 (64-bit, non-prefetchable) [size=64K]
> 	Memory at fdfe0000 (64-bit, non-prefetchable) [size=64K]
> 	[virtual] Expansion ROM at fdbe0000 [disabled] [size=128K]
> 	Capabilities: [40] PCI-X non-bridge device
> 	Capabilities: [48] Power Management version 2
> 	Capabilities: [50] Vital Product Data
> 	Capabilities: [58] MSI: Enable+ Count=1/8 Maskable- 64bit+
> 	Kernel driver in use: tg3
> 	Kernel modules: tg3
> 
> 

$ ethtool -i eth2
driver: tg3
version: 3.115
firmware-version: 5715s-v3.28
bus-info: 0000:14:04.0
$ dmesg | grep ASF
[    6.220577] tg3 0000:14:04.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0]
ASF[0] TSOcap[1]
[    6.228586] tg3 0000:14:04.1: eth3: RXcsums[1] LinkChgREG[0] MIirq[0]
ASF[0] TSOcap[1]

^ permalink raw reply

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
From: Eric Dumazet @ 2011-01-07  2:41 UTC (permalink / raw)
  To: Matt Carlson
  Cc: Jesse Gross, Michael Leun, Michael Chan, David Miller, Ben Greear,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <20110107022912.GA17757@mcarlson.broadcom.com>

Le jeudi 06 janvier 2011 à 18:29 -0800, Matt Carlson a écrit :

> Hi Eric.  Sorry for the delay.  I was under the impression that your
> problems were software related and that you just needed a revised
> version of these VLAN patches I was sending to Michael.  Is this not
> true?
> 
> Having a hardware stat increment suggests this is a new problem.
> Maybe I missed it, but I didn't see what hardware you are working
> with and whether or not management firmware was enabled.  Could you tell
> me that info?
> 

Hi Matt

I started a bisection, because I couldnt sleep tonight anyway :(

14:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5715S
Gigabit Ethernet (rev a3)
	Subsystem: Hewlett-Packard Company NC326m PCIe Dual Port Adapter
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 43
	Memory at fdff0000 (64-bit, non-prefetchable) [size=64K]
	Memory at fdfe0000 (64-bit, non-prefetchable) [size=64K]
	[virtual] Expansion ROM at fdbe0000 [disabled] [size=128K]
	Capabilities: [40] PCI-X non-bridge device
	Capabilities: [48] Power Management version 2
	Capabilities: [50] Vital Product Data
	Capabilities: [58] MSI: Enable+ Count=1/8 Maskable- 64bit+
	Kernel driver in use: tg3
	Kernel modules: tg3

^ permalink raw reply

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
From: Matt Carlson @ 2011-01-07  2:29 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesse Gross, Matthew Carlson, Michael Leun, Michael Chan,
	David Miller, Ben Greear, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org
In-Reply-To: <1294363201.2704.19.camel@edumazet-laptop>

On Thu, Jan 06, 2011 at 05:20:01PM -0800, Eric Dumazet wrote:
> Le vendredi 07 janvier 2011 ?? 00:34 +0100, Eric Dumazet a ??crit :
> > Le jeudi 06 janvier 2011 ?? 16:01 -0500, Jesse Gross a ??crit :
> > 
> > > Hmm, I thought that it might be some interaction with a corner case in
> > > the networking core but now it seems less likely.  There weren't too
> > > many vlan changes between the working and non-working states.  Plus,
> > > since the rx counter isn't increasing, the packets probably aren't
> > > making it anywhere.
> > > 
> > > I see that tg3 increases the drop counter in one place, which also
> > > happens to be checking for vlan errors (at tg3.c:4753).  That seems
> > > suspicious - maybe the NIC is only partially configured for vlan
> > > offloading.  If we can confirm that is where the drop counter is being
> > > incremented and what the error code is maybe it would shed some light.
> > > 
> > 
> > Hmm... I am pretty sure the drop counter is the dev rx_dropped (core
> > network handled, not tg3 one) incremented at the end of
> > __netif_receive_skb() : We found no suitable handler for packets.
> > 
> > atomic_long_inc(&skb->dev->rx_dropped);
> > 
> > But thats a guess, I'll have to check
> > 
> 
> wrong guess. Its really the tg3 which drops frames
> 
> increasing rx_missed_errors  (get_stat64(&hw_stats->rx_discards)
> 
> ip -s -s link show dev eth2
> 5: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq
> master bond0 state UP qlen 1000
>     link/ether 00:1e:0b:92:78:50 brd ff:ff:ff:ff:ff:ff
>     RX: bytes  packets  errors  dropped overrun mcast   
>     11627      167      0       0       0       2      
>     RX errors: length  crc     frame   fifo    missed
>                0        0       0       0       2713   
>     TX: bytes  packets  errors  dropped carrier collsns 
>     2274       31       0       0       0       0      
>     TX errors: aborted fifo    window  heartbeat
>                0        0       0       0      
> 
> 
> 
> It would be nice Broadcom guys could help a bit ?

Hi Eric.  Sorry for the delay.  I was under the impression that your
problems were software related and that you just needed a revised
version of these VLAN patches I was sending to Michael.  Is this not
true?

Having a hardware stat increment suggests this is a new problem.
Maybe I missed it, but I didn't see what hardware you are working
with and whether or not management firmware was enabled.  Could you tell
me that info?

^ permalink raw reply

* Re: Bad TCP timestamps on non-PC platforms
From: Eric Dumazet @ 2011-01-07  2:11 UTC (permalink / raw)
  To: Alex Dubov; +Cc: netdev, David Miller
In-Reply-To: <117536.54377.qm@web37607.mail.mud.yahoo.com>

Le jeudi 06 janvier 2011 à 17:55 -0800, Alex Dubov a écrit :
> Sorry for the awful synopsis of my problem. I never cease to amaze myself
> at how bad those usually turn up. :-)
> 
> What I really meant to write is:
> 
> I have a dev board running 2.6.37-rc7. Normal kernel config, nothing fancy.
> Remote machines are just usual linux boxes in constant operation (I tried
> several of those).
> 
> UDP/DHCP works correctly all the time, so ethernet side is probably ok.
> 
> When tcp_timestamps are enabled, SYN packets from dev board just get
> ignored by the remote side. I see them arrive in wireshark, but nothing
> else happens.
> 
> When I disable tcp_timestamps on the dev board everything works.
> 
> The problem is reproducible every single time.
> 
> The only difference is the "Options" block of the SYN packets.
> If timestamps are not really to blame, then it probably window scale
> parameters. That's what I see on a usual dropped packet:
> 
> Options: (20 bytes)
>         Maximum segment size: 1460 bytes
>         SACK permitted
>         Timestamps: TSval 4294893842, TSecr 0
>         NOP
>         Window scale: 5 (multiply by 32)
> 
> 
> 
>       


You dont give new informations ;)

I asked if you could give information on the other side : The bug is to
drop this legal packet.

uname -a
sysctl -a | grep tcp



^ permalink raw reply

* Re: Bad TCP timestamps on non-PC platforms
From: Alex Dubov @ 2011-01-07  1:55 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, David Miller
In-Reply-To: <1294301356.2723.73.camel@edumazet-laptop>

Sorry for the awful synopsis of my problem. I never cease to amaze myself
at how bad those usually turn up. :-)

What I really meant to write is:

I have a dev board running 2.6.37-rc7. Normal kernel config, nothing fancy.
Remote machines are just usual linux boxes in constant operation (I tried
several of those).

UDP/DHCP works correctly all the time, so ethernet side is probably ok.

When tcp_timestamps are enabled, SYN packets from dev board just get
ignored by the remote side. I see them arrive in wireshark, but nothing
else happens.

When I disable tcp_timestamps on the dev board everything works.

The problem is reproducible every single time.

The only difference is the "Options" block of the SYN packets.
If timestamps are not really to blame, then it probably window scale
parameters. That's what I see on a usual dropped packet:

Options: (20 bytes)
        Maximum segment size: 1460 bytes
        SACK permitted
        Timestamps: TSval 4294893842, TSecr 0
        NOP
        Window scale: 5 (multiply by 32)



      

^ permalink raw reply

* Re: About disabling congestion control
From: Stephen Hemminger @ 2011-01-07  1:37 UTC (permalink / raw)
  To: Syed Obaid Amin; +Cc: netdev
In-Reply-To: <AANLkTin53drsv18TZ0CPZM7==6G=rK6vScUb=SCQ4-xr@mail.gmail.com>

On Thu, 6 Jan 2011 20:25:18 -0500
Syed Obaid Amin <obaidasyed@gmail.com> wrote:

> Hey all,
> 
> I am currently working on a socket option to disable the tcp
> congestion control. I think the simplest approach to do this is to
> ignore cwnd before sending out a packet.
> 
> After going through tcp output engine it seems that tcp_cwnd_test is
> the method that decides that how many segments can be sent out on a
> wire. For testing it out, I changed this method so that if no-cc
> option is ON, just return a big constant value. But, it didn't work
> and I am unable to see a burst of pkts. It looks like that I am
> missing something here.
> 
> Any suggestions that what is the right place to look for disabling the
> congestion control ?
> 
> Thanks much!
> 
> Obaid

I assume this is just a local hack experiment; not something you
want to actually submit for other users to use...

The easiest/safest way to do this would be to build/define a new TCP congestion
control type that does nothing.


^ permalink raw reply

* Re: genetlink misinterprets NEW as GET
From: Pablo Neira Ayuso @ 2011-01-07  1:31 UTC (permalink / raw)
  To: Ben Pfaff
  Cc: Jan Engelhardt, Netfilter Developer Mailing List,
	Linux Networking Developer Mailing List
In-Reply-To: <878vyyvtci.fsf@benpfaff.org>

On 06/01/11 18:23, Ben Pfaff wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> writes:
> 
>> On 04/01/11 03:14, Jan Engelhardt wrote:
>>> 	/* Modifiers to GET request */
>>> 	#define NLM_F_ROOT      0x100
>>> 	#define NLM_F_MATCH     0x200
>>> 	#define NLM_F_ATOMIC    0x400
>>> 	#define NLM_F_DUMP      (NLM_F_ROOT|NLM_F_MATCH)
> [...]
>>> [N.B.: I am also wondering whether
>>> 	(nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP
>>> may have been desired, because NLM_F_DUMP is composed of two bits.]
>>
>> Someone may include NLM_F_ATOMIC to a dump operation, in that case the
>> checking that you propose is not valid.
> 
> Are you saying that NLM_F_MATCH and NLM_F_ATOMIC are mutually
> exclusive, and that NLM_F_ROOT|NLM_F_ATOMIC would also signal a
> dump operation?  Otherwise the test that Jan proposes looks valid
> to me.

Indeed, Jan's test is fine to fix this. Please, send a patch to Davem asap.

^ permalink raw reply

* Re: [PATCH v2] net: Allow ethtool to set interface in loopback mode.
From: Ben Hutchings @ 2011-01-07  1:30 UTC (permalink / raw)
  To: Mahesh Bandewar
  Cc: Jeff Garzik, Stephen Hemminger, David Miller, Laurent Chavey,
	Tom Herbert, netdev
In-Reply-To: <AANLkTi=3ewzgz=z-WmqT=vBvci3-H6HE3CCVk4ZGuFED@mail.gmail.com>

On Thu, 2011-01-06 at 16:47 -0800, Mahesh Bandewar wrote:
> On Thu, Jan 6, 2011 at 2:13 PM, Ben Hutchings <bhutchings@solarflare.com> wrote:
> > On Wed, 2011-01-05 at 11:22 -0500, Jeff Garzik wrote:
> >> On 01/04/2011 08:21 PM, Ben Hutchings wrote:
> >> > On Tue, 2011-01-04 at 16:36 -0800, Stephen Hemminger wrote:
> >> >> On Tue,  4 Jan 2011 16:30:01 -0800
> >> >> Mahesh Bandewar<maheshb@google.com>  wrote:
> >> >>
> >> >>> This patch enables ethtool to set the loopback mode on a given interface.
> >> >>> By configuring the interface in loopback mode in conjunction with a policy
> >> >>> route / rule, a userland application can stress the egress / ingress path
> >> >>> exposing the flows of the change in progress and potentially help developer(s)
> >> >>> understand the impact of those changes without even sending a packet out
> >> >>> on the network.
> >> >>>
> >> >>> Following set of commands illustrates one such example -
> >> >>>   a) ip -4 addr add 192.168.1.1/24 dev eth1
> >> >>>   b) ip -4 rule add from all iif eth1 lookup 250
> >> >>>   c) ip -4 route add local 0/0 dev lo proto kernel scope host table 250
> >> >>>   d) arp -Ds 192.168.1.100 eth1
> >> >>>   e) arp -Ds 192.168.1.200 eth1
> >> >>>   f) sysctl -w net.ipv4.ip_nonlocal_bind=1
> >> >>>   g) sysctl -w net.ipv4.conf.all.accept_local=1
> >> >>>   # Assuming that the machine has 8 cores
> >> >>>   h) taskset 000f netserver -L 192.168.1.200
> >> >>>   i) taskset 00f0 netperf -t TCP_CRR -L 192.168.1.100 -H 192.168.1.200 -l 30
> >> >>>
> >> >>> Signed-off-by: Mahesh Bandewar<maheshb@google.com>
> >> >>> Reviewed-by: Ben Hutchings<bhutchings@solarflare.com>
> >> >>
> >> >> Since this is a boolean it SHOULD go into ethtool_flags rather than
> >> >> being a high level operation.
> >> >
> >> > It could do, but I though ETHTOOL_{G,S}FLAGS were intended for
> >> > controlling offload features.
> >>
> >> It doesn't have to be.  As Stephen guessed, [GS]FLAGS are basically
> >> common flags -- as differentiated from private,
> >> driver-specific/hardware-specific flags.
> >
> > Well, that would allow the patch to be simplified quite a bit. :-)
> 
> Ben, Are you suggesting to use ETH_FLAG_LOOPBACK instead of
> ETHTOOL_{G|S}LOOPBACK flags?
[...]

Exactly.

An example implementation (untested):

--- a/drivers/net/sfc/ethtool.c
+++ b/drivers/net/sfc/ethtool.c
@@ -548,11 +548,24 @@ static u32 efx_ethtool_get_rx_csum(struct net_device *net_dev)
 	return efx->rx_checksum_enabled;
 }
 
+static u32 efx_ethtool_get_flags(struct net_device *net_dev)
+{
+	struct efx_nic *efx = netdev_priv(net_dev);
+	u32 flags;
+
+	flags = ethtool_op_get_flags(net_dev);
+	if (efx->loopback_mode != LOOPBACK_NONE)
+		flags |= ETH_FLAG_LOOPBACK;
+	return flags;
+}
+
 static int efx_ethtool_set_flags(struct net_device *net_dev, u32 data)
 {
 	struct efx_nic *efx = netdev_priv(net_dev);
-	u32 supported = (efx->type->offload_features &
-			 (ETH_FLAG_RXHASH | ETH_FLAG_NTUPLE));
+	u32 supported = (ETH_FLAG_LOOPBACK |
+			 (efx->type->offload_features &
+			  (ETH_FLAG_RXHASH | ETH_FLAG_NTUPLE)));
+	enum efx_loopback_mode loopback;
 	int rc;
 
 	rc = ethtool_op_set_flags(net_dev, data, supported);
@@ -562,7 +575,15 @@ static int efx_ethtool_set_flags(struct net_device *net_dev, u32 data)
 	if (!(data & ETH_FLAG_NTUPLE))
 		efx_filter_clear_rx(efx, EFX_FILTER_PRI_MANUAL);
 
-	return 0;
+	loopback = (data & ETH_FLAG_LOOPBACK) ? LOOPBACK_DATA : LOOPBACK_NONE;
+	mutex_lock(&efx->mac_lock);
+	if (efx->loopback_mode != loopback) {
+		efx->loopback_mode = loopback;
+		rc = __efx_reconfigure_port(efx);
+	}
+	mutex_unlock(&efx->mac_lock);
+
+	return rc;
 }
 
 static void efx_ethtool_self_test(struct net_device *net_dev,
@@ -1057,7 +1078,7 @@ const struct ethtool_ops efx_ethtool_ops = {
 	.get_tso		= ethtool_op_get_tso,
 	/* Need to enable/disable TSO-IPv6 too */
 	.set_tso		= efx_ethtool_set_tso,
-	.get_flags		= ethtool_op_get_flags,
+	.get_flags		= efx_ethtool_get_flags,
 	.set_flags		= efx_ethtool_set_flags,
 	.get_sset_count		= efx_ethtool_get_sset_count,
 	.self_test		= efx_ethtool_self_test,
---

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* About disabling congestion control
From: Syed Obaid Amin @ 2011-01-07  1:25 UTC (permalink / raw)
  To: netdev

Hey all,

I am currently working on a socket option to disable the tcp
congestion control. I think the simplest approach to do this is to
ignore cwnd before sending out a packet.

After going through tcp output engine it seems that tcp_cwnd_test is
the method that decides that how many segments can be sent out on a
wire. For testing it out, I changed this method so that if no-cc
option is ON, just return a big constant value. But, it didn't work
and I am unable to see a burst of pkts. It looks like that I am
missing something here.

Any suggestions that what is the right place to look for disabling the
congestion control ?

Thanks much!

Obaid

^ permalink raw reply

* Re: Flow Control and Port Mirroring Revisited
From: Simon Horman @ 2011-01-07  1:23 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Eric Dumazet, Rusty Russell, virtualization, dev, virtualization,
	netdev, kvm, Michael S. Tsirkin
In-Reply-To: <AANLkTinJK-nbkP5_ee2cuS8RA7jTB4-bcWmAf4bjSouP@mail.gmail.com>

On Thu, Jan 06, 2011 at 05:38:01PM -0500, Jesse Gross wrote:

[ snip ]
> 
> I know that everyone likes a nice netperf result but I agree with
> Michael that this probably isn't the right question to be asking.  I
> don't think that socket buffers are a real solution to the flow
> control problem: they happen to provide that functionality but it's
> more of a side effect than anything.  It's just that the amount of
> memory consumed by packets in the queue(s) doesn't really have any
> implicit meaning for flow control (think multiple physical adapters,
> all with the same speed instead of a virtual device and a physical
> device with wildly different speeds).  The analog in the physical
> world that you're looking for would be Ethernet flow control.
> Obviously, if the question is limiting CPU or memory consumption then
> that's a different story.

Point taken. I will see if I can control CPU (and thus memory) consumption
using cgroups and/or tc.

> This patch also double counts memory, since the full size of the
> packet will be accounted for by each clone, even though they share the
> actual packet data.  Probably not too significant here but it might be
> when flooding/mirroring to many interfaces.  This is at least fixable
> (the Xen-style accounting through page tracking deals with it, though
> it has its own problems).

Agreed on all counts.



^ permalink raw reply

* Re: [PATCH 2.6.36] vlan: Avoid hwaccel vlan packets when vid not used
From: Eric Dumazet @ 2011-01-07  1:20 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Matt Carlson, Michael Leun, Michael Chan, David Miller,
	Ben Greear, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <1294356887.2704.13.camel@edumazet-laptop>

Le vendredi 07 janvier 2011 à 00:34 +0100, Eric Dumazet a écrit :
> Le jeudi 06 janvier 2011 à 16:01 -0500, Jesse Gross a écrit :
> 
> > Hmm, I thought that it might be some interaction with a corner case in
> > the networking core but now it seems less likely.  There weren't too
> > many vlan changes between the working and non-working states.  Plus,
> > since the rx counter isn't increasing, the packets probably aren't
> > making it anywhere.
> > 
> > I see that tg3 increases the drop counter in one place, which also
> > happens to be checking for vlan errors (at tg3.c:4753).  That seems
> > suspicious - maybe the NIC is only partially configured for vlan
> > offloading.  If we can confirm that is where the drop counter is being
> > incremented and what the error code is maybe it would shed some light.
> > 
> 
> Hmm... I am pretty sure the drop counter is the dev rx_dropped (core
> network handled, not tg3 one) incremented at the end of
> __netif_receive_skb() : We found no suitable handler for packets.
> 
> atomic_long_inc(&skb->dev->rx_dropped);
> 
> But thats a guess, I'll have to check
> 

wrong guess. Its really the tg3 which drops frames

increasing rx_missed_errors  (get_stat64(&hw_stats->rx_discards)

ip -s -s link show dev eth2
5: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq
master bond0 state UP qlen 1000
    link/ether 00:1e:0b:92:78:50 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    11627      167      0       0       0       2      
    RX errors: length  crc     frame   fifo    missed
               0        0       0       0       2713   
    TX: bytes  packets  errors  dropped carrier collsns 
    2274       31       0       0       0       0      
    TX errors: aborted fifo    window  heartbeat
               0        0       0       0      



It would be nice Broadcom guys could help a bit ?

^ permalink raw reply

* Re: linux-next: manual merge of the security-testing tree with the net tree
From: Casey Schaufler @ 2011-01-07  1:05 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: James Morris, linux-next, linux-kernel, David Miller, netdev,
	Casey Schaufler
In-Reply-To: <20110107114400.e4d1d33b.sfr@canb.auug.org.au>

On 1/6/2011 4:44 PM, Stephen Rothwell wrote:
> Hi James,
>
> Today's linux-next merge of the security-testing tree got a conflict in
> security/smack/smack_lsm.c between commit
> 3610cda53f247e176bcbb7a7cca64bc53b12acdb ("af_unix: Avoid socket->sk NULL
> OOPS in stream connect security hooks") from the net tree and commit
> b4e0d5f0791bd6dd12a1c1edea0340969c7c1f90 ("Smack: UDS revision") from the
> security-testing tree.
>
> I fixed it up (I think - see below) and can carry the fix as necessary.

The change looks like it addresses the change in interface. Thank you.

^ permalink raw reply

* Re: [net-next 12/12] ixgbe: update ntuple filter configuration
From: Ben Hutchings @ 2011-01-07  1:02 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: davem, Alexander Duyck, netdev, gosp, bphilips
In-Reply-To: <1294360199-9860-13-git-send-email-jeffrey.t.kirsher@intel.com>

On Thu, 2011-01-06 at 16:29 -0800, jeffrey.t.kirsher@intel.com wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> This change fixes several issues found in ntuple filtering while I was
> doing the ATR refactor.
> 
> Specifically I updated the masks to work correctly with the latest version
> of ethtool,
[...]

Did the previous code not correctly handle a zero value with a non-zero
mask for some fields?  If so, I can revert that change to ethtool.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: 2.6.37 vlans on bnx2 not functional, panic with tcpdump
From: Michael Chan @ 2011-01-07  0:46 UTC (permalink / raw)
  To: Iain Paton; +Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <1294357941.21580.2.camel@HP1>


On Thu, 2011-01-06 at 15:52 -0800, Michael Chan wrote:
> On Thu, 2011-01-06 at 13:32 -0800, Iain Paton wrote:
> > Hi,
> > 
> > vlans don't appear to be functional on my HP DL380G6 with onboard bnx2
> > adapter using vanilla 2.6.37 kernel. No tagged vlan traffic 
> > is arriving at the vlan interface.
> 
> VLANs on net-next-2.6 kernel works for me on bnx2 devices.  I'll try
> 2.6.37 next.

May be you have management firmware running on your devices that can
change the behavior.  Can you provide ethtool -i eth0 on both bnx2
devices on your system?



^ permalink raw reply

* Re: [net-next 03/12] e1000e: properly bounds-check string functions
From: Ben Hutchings @ 2011-01-07  0:48 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: davem, Bruce Allan, netdev, gosp, bphilips
In-Reply-To: <1294360199-9860-4-git-send-email-jeffrey.t.kirsher@intel.com>

On Thu, 2011-01-06 at 16:29 -0800, jeffrey.t.kirsher@intel.com wrote:
> From: Bruce Allan <bruce.w.allan@intel.com>
> 
> Use string functions with bounds checking rather than their non-bounds
> checking counterparts, and do not hard code these boundaries.
[...]
> --- a/drivers/net/e1000e/netdev.c
> +++ b/drivers/net/e1000e/netdev.c
[...]
> @@ -5968,7 +5968,7 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
>  	if (!(adapter->flags & FLAG_HAS_AMT))
>  		e1000_get_hw_control(adapter);
>  
> -	strcpy(netdev->name, "eth%d");
> +	strncpy(netdev->name, "eth%d", sizeof(netdev->name) - 1);
>  	err = register_netdev(netdev);
>  	if (err)
>  		goto err_register;
[...]

This statement is actually redundant - alloc_etherdev() sets the name
for you.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH v2] net: Allow ethtool to set interface in loopback mode.
From: Mahesh Bandewar @ 2011-01-07  0:47 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Jeff Garzik, Stephen Hemminger, David Miller, Laurent Chavey,
	Tom Herbert, netdev
In-Reply-To: <1294352011.11825.50.camel@bwh-desktop>

On Thu, Jan 6, 2011 at 2:13 PM, Ben Hutchings <bhutchings@solarflare.com> wrote:
> On Wed, 2011-01-05 at 11:22 -0500, Jeff Garzik wrote:
>> On 01/04/2011 08:21 PM, Ben Hutchings wrote:
>> > On Tue, 2011-01-04 at 16:36 -0800, Stephen Hemminger wrote:
>> >> On Tue,  4 Jan 2011 16:30:01 -0800
>> >> Mahesh Bandewar<maheshb@google.com>  wrote:
>> >>
>> >>> This patch enables ethtool to set the loopback mode on a given interface.
>> >>> By configuring the interface in loopback mode in conjunction with a policy
>> >>> route / rule, a userland application can stress the egress / ingress path
>> >>> exposing the flows of the change in progress and potentially help developer(s)
>> >>> understand the impact of those changes without even sending a packet out
>> >>> on the network.
>> >>>
>> >>> Following set of commands illustrates one such example -
>> >>>   a) ip -4 addr add 192.168.1.1/24 dev eth1
>> >>>   b) ip -4 rule add from all iif eth1 lookup 250
>> >>>   c) ip -4 route add local 0/0 dev lo proto kernel scope host table 250
>> >>>   d) arp -Ds 192.168.1.100 eth1
>> >>>   e) arp -Ds 192.168.1.200 eth1
>> >>>   f) sysctl -w net.ipv4.ip_nonlocal_bind=1
>> >>>   g) sysctl -w net.ipv4.conf.all.accept_local=1
>> >>>   # Assuming that the machine has 8 cores
>> >>>   h) taskset 000f netserver -L 192.168.1.200
>> >>>   i) taskset 00f0 netperf -t TCP_CRR -L 192.168.1.100 -H 192.168.1.200 -l 30
>> >>>
>> >>> Signed-off-by: Mahesh Bandewar<maheshb@google.com>
>> >>> Reviewed-by: Ben Hutchings<bhutchings@solarflare.com>
>> >>
>> >> Since this is a boolean it SHOULD go into ethtool_flags rather than
>> >> being a high level operation.
>> >
>> > It could do, but I though ETHTOOL_{G,S}FLAGS were intended for
>> > controlling offload features.
>>
>> It doesn't have to be.  As Stephen guessed, [GS]FLAGS are basically
>> common flags -- as differentiated from private,
>> driver-specific/hardware-specific flags.
>
> Well, that would allow the patch to be simplified quite a bit. :-)

Ben, Are you suggesting to use ETH_FLAG_LOOPBACK instead of
ETHTOOL_{G|S}LOOPBACK flags?

Thanks,
--mahesh..

>
> Ben.
>
> From: Ben Hutchings <bhutchings@solarflare.com>
> Subject: [PATCH net-2.6] ethtool: Define ETH_FLAG_LOOPBACK
> Date: Thu, 6 Jan 2011 22:10:55 +0000
>
> Mahesh Bandewar <maheshb@google.com> requested this, writing:
>
> By configuring the interface in loopback mode in conjunction with a policy
> route / rule, a userland application can stress the egress / ingress path
> exposing the flows of the change in progress and potentially help developer(s)
> understand the impact of those changes without even sending a packet out
> on the network.
>
> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
>
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -309,6 +309,7 @@ struct ethtool_perm_addr {
>  * flag differs from the read-only value.
>  */
>  enum ethtool_flags {
> +       ETH_FLAG_LOOPBACK       = (1 << 2),     /* Host-side loopback enabled */
>        ETH_FLAG_TXVLAN         = (1 << 7),     /* TX VLAN offload enabled */
>        ETH_FLAG_RXVLAN         = (1 << 8),     /* RX VLAN offload enabled */
>        ETH_FLAG_LRO            = (1 << 15),    /* LRO is enabled */
> ---
>
> --
> Ben Hutchings, Senior Software Engineer, Solarflare Communications
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
>
>

^ permalink raw reply

* linux-next: manual merge of the security-testing tree with the net tree
From: Stephen Rothwell @ 2011-01-07  0:44 UTC (permalink / raw)
  To: James Morris
  Cc: linux-next, linux-kernel, David Miller, netdev, Casey Schaufler

Hi James,

Today's linux-next merge of the security-testing tree got a conflict in
security/smack/smack_lsm.c between commit
3610cda53f247e176bcbb7a7cca64bc53b12acdb ("af_unix: Avoid socket->sk NULL
OOPS in stream connect security hooks") from the net tree and commit
b4e0d5f0791bd6dd12a1c1edea0340969c7c1f90 ("Smack: UDS revision") from the
security-testing tree.

I fixed it up (I think - see below) and can carry the fix as necessary.
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

diff --cc security/smack/smack_lsm.c
index ccb71a0,05dc4da..0000000
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@@ -2415,17 -2534,21 +2534,21 @@@ static int smack_setprocattr(struct tas
   * Return 0 if a subject with the smack of sock could access
   * an object with the smack of other, otherwise an error code
   */
 -static int smack_unix_stream_connect(struct socket *sock,
 -				     struct socket *other, struct sock *newsk)
 +static int smack_unix_stream_connect(struct sock *sock,
 +				     struct sock *other, struct sock *newsk)
  {
- 	struct inode *sp = SOCK_INODE(sock->sk_socket);
- 	struct inode *op = SOCK_INODE(other->sk_socket);
 -	struct socket_smack *ssp = sock->sk->sk_security;
 -	struct socket_smack *osp = other->sk->sk_security;
++	struct socket_smack *ssp = sock->sk_security;
++	struct socket_smack *osp = other->sk_security;
  	struct smk_audit_info ad;
+ 	int rc = 0;
  
  	smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_NET);
 -	smk_ad_setfield_u_net_sk(&ad, other->sk);
 +	smk_ad_setfield_u_net_sk(&ad, other);
- 	return smk_access(smk_of_inode(sp), smk_of_inode(op),
- 				 MAY_READWRITE, &ad);
+ 
+ 	if (!capable(CAP_MAC_OVERRIDE))
+ 		rc = smk_access(ssp->smk_out, osp->smk_in, MAY_WRITE, &ad);
+ 
+ 	return rc;
  }
  
  /**

^ permalink raw reply

* Re: [net-next 00/12][pull-request] Intel Wired LAN Driver Updates
From: Jeff Kirsher @ 2011-01-07  0:37 UTC (permalink / raw)
  To: davem@davemloft.net
  Cc: netdev@vger.kernel.org, gospo@redhat.com, bphilips@novell.com
In-Reply-To: <1294360199-9860-1-git-send-email-jeffrey.t.kirsher@intel.com>

[-- Attachment #1: Type: text/plain, Size: 2689 bytes --]

On Thu, 2011-01-06 at 16:29 -0800, Kirsher, Jeffrey T wrote:
> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> 
> The following series contains ixgbe/e1000e cleanups and fixes.  The
> addition of CE4100 support in e1000, and ixgb VLAN conversion to the
> new model.
> 
> The following changes since commit dbbe68bb12b34f3e450da7a73c20e6fa1f85d63a:
> 
>   Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
> 
> are available in the git repository at:
> 
>   master.kernel.org:/pub/scm/linux/kernel/git/jkirsher/net-next-2.6.git master
> 
> Alexander Duyck (3):
>   ixgbe: cleanup flow director hash computation to improve performance
>   ixgbe: further flow director performance optimizations
>   ixgbe: update ntuple filter configuration
> 
> Bruce Allan (6):
>   e1000e: cleanup variables set but not used
>   e1000e: convert calls of ops.[read|write]_reg to e1e_[r|w]phy
>   e1000e: properly bounds-check string functions
>   e1000e: use either_crc_le() rather than re-write it
>   e1000e: power off PHY after reset when interface is down
>   e1000e: add custom set_d[0|3]_lplu_state function pointer for 82574
> 
> Dirk Brandewie (1):
>   e1000: Add support for the CE4100 reference platform
> 
> Emil Tantilov (1):
>   ixgb: convert to new VLAN model
> 
> Yi Zou (1):
>   ixgbe: make sure per Rx queue is disabled before unmapping the
>     receive buffer
> 
>  drivers/net/e1000/e1000_hw.c      |  328 +++++++++++++----
>  drivers/net/e1000/e1000_hw.h      |   59 +++-
>  drivers/net/e1000/e1000_main.c    |   35 ++
>  drivers/net/e1000/e1000_osdep.h   |   19 +-
>  drivers/net/e1000e/82571.c        |   77 ++++-
>  drivers/net/e1000e/e1000.h        |    3 +
>  drivers/net/e1000e/es2lan.c       |    4 +-
>  drivers/net/e1000e/ethtool.c      |   54 ++-
>  drivers/net/e1000e/hw.h           |    1 +
>  drivers/net/e1000e/ich8lan.c      |   77 ++---
>  drivers/net/e1000e/lib.c          |    3 +-
>  drivers/net/e1000e/netdev.c       |   53 ++--
>  drivers/net/e1000e/phy.c          |   40 +--
>  drivers/net/ixgb/ixgb.h           |    2 +-
>  drivers/net/ixgb/ixgb_ethtool.c   |   35 ++
>  drivers/net/ixgb/ixgb_main.c      |   54 +--
>  drivers/net/ixgbe/ixgbe.h         |   21 +-
>  drivers/net/ixgbe/ixgbe_82599.c   |  749 +++++++++++++++----------------------
>  drivers/net/ixgbe/ixgbe_ethtool.c |  142 +++++---
>  drivers/net/ixgbe/ixgbe_main.c    |  169 ++++++---
>  drivers/net/ixgbe/ixgbe_type.h    |   91 +++--
>  21 files changed, 1182 insertions(+), 834 deletions(-)
> 

I apologize, I fat fingered Andy Gospodarek's email address.  I have
corrected it in this response.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* [net-next 12/12] ixgbe: update ntuple filter configuration
From: jeffrey.t.kirsher @ 2011-01-07  0:29 UTC (permalink / raw)
  To: davem, davem; +Cc: Alexander Duyck, netdev, gosp, bphilips, Jeff Kirsher
In-Reply-To: <1294360199-9860-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change fixes several issues found in ntuple filtering while I was
doing the ATR refactor.

Specifically I updated the masks to work correctly with the latest version
of ethtool, I cleaned up the exception handling and added detailed error
output when a filter is rejected, and corrected several bits that were set
incorrectly in ixgbe_type.h.

The previous version of this patch included a printk that was left over from
me fixing the filter setup.  This patch does not include that printk.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe.h         |   14 --
 drivers/net/ixgbe/ixgbe_82599.c   |  436 ++++++++++++-------------------------
 drivers/net/ixgbe/ixgbe_ethtool.c |  134 ++++++++----
 drivers/net/ixgbe/ixgbe_main.c    |   21 +-
 drivers/net/ixgbe/ixgbe_type.h    |   16 +-
 5 files changed, 250 insertions(+), 371 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index 341b3db..3b8c924 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -533,20 +533,6 @@ extern s32 ixgbe_fdir_add_perfect_filter_82599(struct ixgbe_hw *hw,
                                       union ixgbe_atr_input *input,
                                       struct ixgbe_atr_input_masks *input_masks,
                                       u16 soft_id, u8 queue);
-extern s32 ixgbe_atr_set_vlan_id_82599(union ixgbe_atr_input *input,
-                                       u16 vlan_id);
-extern s32 ixgbe_atr_set_src_ipv4_82599(union ixgbe_atr_input *input,
-                                        u32 src_addr);
-extern s32 ixgbe_atr_set_dst_ipv4_82599(union ixgbe_atr_input *input,
-                                        u32 dst_addr);
-extern s32 ixgbe_atr_set_src_port_82599(union ixgbe_atr_input *input,
-                                        u16 src_port);
-extern s32 ixgbe_atr_set_dst_port_82599(union ixgbe_atr_input *input,
-                                        u16 dst_port);
-extern s32 ixgbe_atr_set_flex_byte_82599(union ixgbe_atr_input *input,
-                                         u16 flex_byte);
-extern s32 ixgbe_atr_set_l4type_82599(union ixgbe_atr_input *input,
-                                      u8 l4type);
 extern void ixgbe_configure_rscctl(struct ixgbe_adapter *adapter,
                                    struct ixgbe_ring *ring);
 extern void ixgbe_clear_rscctl(struct ixgbe_adapter *adapter,
diff --git a/drivers/net/ixgbe/ixgbe_82599.c b/drivers/net/ixgbe/ixgbe_82599.c
index d41931f..8d316d9 100644
--- a/drivers/net/ixgbe/ixgbe_82599.c
+++ b/drivers/net/ixgbe/ixgbe_82599.c
@@ -1422,211 +1422,6 @@ static u32 ixgbe_atr_compute_sig_hash_82599(union ixgbe_atr_hash_dword input,
 }
 
 /**
- *  ixgbe_atr_set_vlan_id_82599 - Sets the VLAN id in the ATR input stream
- *  @input: input stream to modify
- *  @vlan: the VLAN id to load
- **/
-s32 ixgbe_atr_set_vlan_id_82599(union ixgbe_atr_input *input, __be16 vlan)
-{
-	input->formatted.vlan_id = vlan;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_set_src_ipv4_82599 - Sets the source IPv4 address
- *  @input: input stream to modify
- *  @src_addr: the IP address to load
- **/
-s32 ixgbe_atr_set_src_ipv4_82599(union ixgbe_atr_input *input, __be32 src_addr)
-{
-	input->formatted.src_ip[0] = src_addr;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_set_dst_ipv4_82599 - Sets the destination IPv4 address
- *  @input: input stream to modify
- *  @dst_addr: the IP address to load
- **/
-s32 ixgbe_atr_set_dst_ipv4_82599(union ixgbe_atr_input *input, __be32 dst_addr)
-{
-	input->formatted.dst_ip[0] = dst_addr;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_set_src_port_82599 - Sets the source port
- *  @input: input stream to modify
- *  @src_port: the source port to load
- **/
-s32 ixgbe_atr_set_src_port_82599(union ixgbe_atr_input *input, __be16 src_port)
-{
-	input->formatted.src_port = src_port;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_set_dst_port_82599 - Sets the destination port
- *  @input: input stream to modify
- *  @dst_port: the destination port to load
- **/
-s32 ixgbe_atr_set_dst_port_82599(union ixgbe_atr_input *input, __be16 dst_port)
-{
-	input->formatted.dst_port = dst_port;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_set_flex_byte_82599 - Sets the flexible bytes
- *  @input: input stream to modify
- *  @flex_bytes: the flexible bytes to load
- **/
-s32 ixgbe_atr_set_flex_byte_82599(union ixgbe_atr_input *input,
-				  __be16 flex_bytes)
-{
-	input->formatted.flex_bytes = flex_bytes;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_set_l4type_82599 - Sets the layer 4 packet type
- *  @input: input stream to modify
- *  @l4type: the layer 4 type value to load
- **/
-s32 ixgbe_atr_set_l4type_82599(union ixgbe_atr_input *input, u8 l4type)
-{
-	input->formatted.flow_type = l4type;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_get_vlan_id_82599 - Gets the VLAN id from the ATR input stream
- *  @input: input stream to search
- *  @vlan: the VLAN id to load
- **/
-static s32 ixgbe_atr_get_vlan_id_82599(union ixgbe_atr_input *input, __be16 *vlan)
-{
-	*vlan = input->formatted.vlan_id;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_get_src_ipv4_82599 - Gets the source IPv4 address
- *  @input: input stream to search
- *  @src_addr: the IP address to load
- **/
-static s32 ixgbe_atr_get_src_ipv4_82599(union ixgbe_atr_input *input,
-                                        __be32 *src_addr)
-{
-	*src_addr = input->formatted.src_ip[0];
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_get_dst_ipv4_82599 - Gets the destination IPv4 address
- *  @input: input stream to search
- *  @dst_addr: the IP address to load
- **/
-static s32 ixgbe_atr_get_dst_ipv4_82599(union ixgbe_atr_input *input,
-                                        __be32 *dst_addr)
-{
-	*dst_addr = input->formatted.dst_ip[0];
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_get_src_ipv6_82599 - Gets the source IPv6 address
- *  @input: input stream to search
- *  @src_addr_1: the first 4 bytes of the IP address to load
- *  @src_addr_2: the second 4 bytes of the IP address to load
- *  @src_addr_3: the third 4 bytes of the IP address to load
- *  @src_addr_4: the fourth 4 bytes of the IP address to load
- **/
-static s32 ixgbe_atr_get_src_ipv6_82599(union ixgbe_atr_input *input,
-                                        __be32 *src_addr_0, __be32 *src_addr_1,
-                                        __be32 *src_addr_2, __be32 *src_addr_3)
-{
-	*src_addr_0 = input->formatted.src_ip[0];
-	*src_addr_1 = input->formatted.src_ip[1];
-	*src_addr_2 = input->formatted.src_ip[2];
-	*src_addr_3 = input->formatted.src_ip[3];
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_get_src_port_82599 - Gets the source port
- *  @input: input stream to modify
- *  @src_port: the source port to load
- *
- *  Even though the input is given in big-endian, the FDIRPORT registers
- *  expect the ports to be programmed in little-endian.  Hence the need to swap
- *  endianness when retrieving the data.  This can be confusing since the
- *  internal hash engine expects it to be big-endian.
- **/
-static s32 ixgbe_atr_get_src_port_82599(union ixgbe_atr_input *input,
-                                        __be16 *src_port)
-{
-	*src_port = input->formatted.src_port;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_get_dst_port_82599 - Gets the destination port
- *  @input: input stream to modify
- *  @dst_port: the destination port to load
- *
- *  Even though the input is given in big-endian, the FDIRPORT registers
- *  expect the ports to be programmed in little-endian.  Hence the need to swap
- *  endianness when retrieving the data.  This can be confusing since the
- *  internal hash engine expects it to be big-endian.
- **/
-static s32 ixgbe_atr_get_dst_port_82599(union ixgbe_atr_input *input,
-                                        __be16 *dst_port)
-{
-	*dst_port = input->formatted.dst_port;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_get_flex_byte_82599 - Gets the flexible bytes
- *  @input: input stream to modify
- *  @flex_bytes: the flexible bytes to load
- **/
-static s32 ixgbe_atr_get_flex_byte_82599(union ixgbe_atr_input *input,
-                                         __be16 *flex_bytes)
-{
-	*flex_bytes = input->formatted.flex_bytes;
-
-	return 0;
-}
-
-/**
- *  ixgbe_atr_get_l4type_82599 - Gets the layer 4 packet type
- *  @input: input stream to modify
- *  @l4type: the layer 4 type value to load
- **/
-static s32 ixgbe_atr_get_l4type_82599(union ixgbe_atr_input *input,
-                                      u8 *l4type)
-{
-	*l4type = input->formatted.flow_type;
-
-	return 0;
-}
-
-/**
  *  ixgbe_atr_add_signature_filter_82599 - Adds a signature hash filter
  *  @hw: pointer to hardware structure
  *  @input: unique input dword
@@ -1679,6 +1474,43 @@ s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
 }
 
 /**
+ *  ixgbe_get_fdirtcpm_82599 - generate a tcp port from atr_input_masks
+ *  @input_mask: mask to be bit swapped
+ *
+ *  The source and destination port masks for flow director are bit swapped
+ *  in that bit 15 effects bit 0, 14 effects 1, 13, 2 etc.  In order to
+ *  generate a correctly swapped value we need to bit swap the mask and that
+ *  is what is accomplished by this function.
+ **/
+static u32 ixgbe_get_fdirtcpm_82599(struct ixgbe_atr_input_masks *input_masks)
+{
+	u32 mask = ntohs(input_masks->dst_port_mask);
+	mask <<= IXGBE_FDIRTCPM_DPORTM_SHIFT;
+	mask |= ntohs(input_masks->src_port_mask);
+	mask = ((mask & 0x55555555) << 1) | ((mask & 0xAAAAAAAA) >> 1);
+	mask = ((mask & 0x33333333) << 2) | ((mask & 0xCCCCCCCC) >> 2);
+	mask = ((mask & 0x0F0F0F0F) << 4) | ((mask & 0xF0F0F0F0) >> 4);
+	return ((mask & 0x00FF00FF) << 8) | ((mask & 0xFF00FF00) >> 8);
+}
+
+/*
+ * These two macros are meant to address the fact that we have registers
+ * that are either all or in part big-endian.  As a result on big-endian
+ * systems we will end up byte swapping the value to little-endian before
+ * it is byte swapped again and written to the hardware in the original
+ * big-endian format.
+ */
+#define IXGBE_STORE_AS_BE32(_value) \
+	(((u32)(_value) >> 24) | (((u32)(_value) & 0x00FF0000) >> 8) | \
+	 (((u32)(_value) & 0x0000FF00) << 8) | ((u32)(_value) << 24))
+
+#define IXGBE_WRITE_REG_BE32(a, reg, value) \
+	IXGBE_WRITE_REG((a), (reg), IXGBE_STORE_AS_BE32(ntohl(value)))
+
+#define IXGBE_STORE_AS_BE16(_value) \
+	(((u16)(_value) >> 8) | ((u16)(_value) << 8))
+
+/**
  *  ixgbe_fdir_add_perfect_filter_82599 - Adds a perfect filter
  *  @hw: pointer to hardware structure
  *  @input: input bitstream
@@ -1694,131 +1526,135 @@ s32 ixgbe_fdir_add_perfect_filter_82599(struct ixgbe_hw *hw,
                                       struct ixgbe_atr_input_masks *input_masks,
                                       u16 soft_id, u8 queue)
 {
-	u32 fdircmd = 0;
 	u32 fdirhash;
-	u32 src_ipv4 = 0, dst_ipv4 = 0;
-	u32 src_ipv6_1, src_ipv6_2, src_ipv6_3, src_ipv6_4;
-	u16 src_port, dst_port, vlan_id, flex_bytes;
-	u16 bucket_hash;
-	u8  l4type;
-	u8  fdirm = 0;
-
-	/* Get our input values */
-	ixgbe_atr_get_l4type_82599(input, &l4type);
+	u32 fdircmd;
+	u32 fdirport, fdirtcpm;
+	u32 fdirvlan;
+	/* start with VLAN, flex bytes, VM pool, and IPv6 destination masked */
+	u32 fdirm = IXGBE_FDIRM_VLANID | IXGBE_FDIRM_VLANP | IXGBE_FDIRM_FLEX |
+		    IXGBE_FDIRM_POOL | IXGBE_FDIRM_DIPv6;
 
 	/*
-	 * Check l4type formatting, and bail out before we touch the hardware
+	 * Check flow_type formatting, and bail out before we touch the hardware
 	 * if there's a configuration issue
 	 */
-	switch (l4type & IXGBE_ATR_L4TYPE_MASK) {
-	case IXGBE_ATR_L4TYPE_TCP:
-		fdircmd |= IXGBE_FDIRCMD_L4TYPE_TCP;
-		break;
-	case IXGBE_ATR_L4TYPE_UDP:
-		fdircmd |= IXGBE_FDIRCMD_L4TYPE_UDP;
-		break;
-	case IXGBE_ATR_L4TYPE_SCTP:
-		fdircmd |= IXGBE_FDIRCMD_L4TYPE_SCTP;
+	switch (input->formatted.flow_type) {
+	case IXGBE_ATR_FLOW_TYPE_IPV4:
+		/* use the L4 protocol mask for raw IPv4/IPv6 traffic */
+		fdirm |= IXGBE_FDIRM_L4P;
+	case IXGBE_ATR_FLOW_TYPE_SCTPV4:
+		if (input_masks->dst_port_mask || input_masks->src_port_mask) {
+			hw_dbg(hw, " Error on src/dst port mask\n");
+			return IXGBE_ERR_CONFIG;
+		}
+	case IXGBE_ATR_FLOW_TYPE_TCPV4:
+	case IXGBE_ATR_FLOW_TYPE_UDPV4:
 		break;
 	default:
-		hw_dbg(hw, "Error on l4type input\n");
+		hw_dbg(hw, " Error on flow type input\n");
 		return IXGBE_ERR_CONFIG;
 	}
 
-	bucket_hash = ixgbe_atr_compute_hash_82599(input,
-	                                           IXGBE_ATR_BUCKET_HASH_KEY);
-
-	/* bucket_hash is only 15 bits */
-	bucket_hash &= IXGBE_ATR_HASH_MASK;
-
-	ixgbe_atr_get_vlan_id_82599(input, &vlan_id);
-	ixgbe_atr_get_src_port_82599(input, &src_port);
-	ixgbe_atr_get_dst_port_82599(input, &dst_port);
-	ixgbe_atr_get_flex_byte_82599(input, &flex_bytes);
-
-	fdirhash = soft_id << IXGBE_FDIRHASH_SIG_SW_INDEX_SHIFT | bucket_hash;
-
-	/* Now figure out if we're IPv4 or IPv6 */
-	if (l4type & IXGBE_ATR_L4TYPE_IPV6_MASK) {
-		/* IPv6 */
-		ixgbe_atr_get_src_ipv6_82599(input, &src_ipv6_1, &src_ipv6_2,
-	                                     &src_ipv6_3, &src_ipv6_4);
-
-		IXGBE_WRITE_REG(hw, IXGBE_FDIRSIPv6(0), src_ipv6_1);
-		IXGBE_WRITE_REG(hw, IXGBE_FDIRSIPv6(1), src_ipv6_2);
-		IXGBE_WRITE_REG(hw, IXGBE_FDIRSIPv6(2), src_ipv6_3);
-		/* The last 4 bytes is the same register as IPv4 */
-		IXGBE_WRITE_REG(hw, IXGBE_FDIRIPSA, src_ipv6_4);
-
-		fdircmd |= IXGBE_FDIRCMD_IPV6;
-		fdircmd |= IXGBE_FDIRCMD_IPv6DMATCH;
-	} else {
-		/* IPv4 */
-		ixgbe_atr_get_src_ipv4_82599(input, &src_ipv4);
-		IXGBE_WRITE_REG(hw, IXGBE_FDIRIPSA, src_ipv4);
-	}
-
-	ixgbe_atr_get_dst_ipv4_82599(input, &dst_ipv4);
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRIPDA, dst_ipv4);
-
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRVLAN, (vlan_id |
-	                            (flex_bytes << IXGBE_FDIRVLAN_FLEX_SHIFT)));
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRPORT, (src_port |
-	              (dst_port << IXGBE_FDIRPORT_DESTINATION_SHIFT)));
-
 	/*
-	 * Program the relevant mask registers.  L4type cannot be
-	 * masked out in this implementation.
+	 * Program the relevant mask registers.  If src/dst_port or src/dst_addr
+	 * are zero, then assume a full mask for that field.  Also assume that
+	 * a VLAN of 0 is unspecified, so mask that out as well.  L4type
+	 * cannot be masked out in this implementation.
 	 *
 	 * This also assumes IPv4 only.  IPv6 masking isn't supported at this
 	 * point in time.
 	 */
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRSIP4M, input_masks->src_ip_mask);
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRDIP4M, input_masks->dst_ip_mask);
-
-	switch (l4type & IXGBE_ATR_L4TYPE_MASK) {
-	case IXGBE_ATR_L4TYPE_TCP:
-		IXGBE_WRITE_REG(hw, IXGBE_FDIRTCPM, input_masks->src_port_mask);
-		IXGBE_WRITE_REG(hw, IXGBE_FDIRTCPM,
-				(IXGBE_READ_REG(hw, IXGBE_FDIRTCPM) |
-				 (input_masks->dst_port_mask << 16)));
+
+	/* Program FDIRM */
+	switch (ntohs(input_masks->vlan_id_mask) & 0xEFFF) {
+	case 0xEFFF:
+		/* Unmask VLAN ID - bit 0 and fall through to unmask prio */
+		fdirm &= ~IXGBE_FDIRM_VLANID;
+	case 0xE000:
+		/* Unmask VLAN prio - bit 1 */
+		fdirm &= ~IXGBE_FDIRM_VLANP;
 		break;
-	case IXGBE_ATR_L4TYPE_UDP:
-		IXGBE_WRITE_REG(hw, IXGBE_FDIRUDPM, input_masks->src_port_mask);
-		IXGBE_WRITE_REG(hw, IXGBE_FDIRUDPM,
-				(IXGBE_READ_REG(hw, IXGBE_FDIRUDPM) |
-				 (input_masks->src_port_mask << 16)));
+	case 0x0FFF:
+		/* Unmask VLAN ID - bit 0 */
+		fdirm &= ~IXGBE_FDIRM_VLANID;
 		break;
-	default:
-		/* this already would have failed above */
+	case 0x0000:
+		/* do nothing, vlans already masked */
 		break;
+	default:
+		hw_dbg(hw, " Error on VLAN mask\n");
+		return IXGBE_ERR_CONFIG;
 	}
 
-	/* Program the last mask register, FDIRM */
-	if (input_masks->vlan_id_mask)
-		/* Mask both VLAN and VLANP - bits 0 and 1 */
-		fdirm |= 0x3;
-
-	if (input_masks->data_mask)
-		/* Flex bytes need masking, so mask the whole thing - bit 4 */
-		fdirm |= 0x10;
+	if (input_masks->flex_mask & 0xFFFF) {
+		if ((input_masks->flex_mask & 0xFFFF) != 0xFFFF) {
+			hw_dbg(hw, " Error on flexible byte mask\n");
+			return IXGBE_ERR_CONFIG;
+		}
+		/* Unmask Flex Bytes - bit 4 */
+		fdirm &= ~IXGBE_FDIRM_FLEX;
+	}
 
 	/* Now mask VM pool and destination IPv6 - bits 5 and 2 */
-	fdirm |= 0x24;
-
 	IXGBE_WRITE_REG(hw, IXGBE_FDIRM, fdirm);
 
-	fdircmd |= IXGBE_FDIRCMD_CMD_ADD_FLOW;
-	fdircmd |= IXGBE_FDIRCMD_FILTER_UPDATE;
-	fdircmd |= IXGBE_FDIRCMD_LAST;
-	fdircmd |= IXGBE_FDIRCMD_QUEUE_EN;
-	fdircmd |= queue << IXGBE_FDIRCMD_RX_QUEUE_SHIFT;
+	/* store the TCP/UDP port masks, bit reversed from port layout */
+	fdirtcpm = ixgbe_get_fdirtcpm_82599(input_masks);
+
+	/* write both the same so that UDP and TCP use the same mask */
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRTCPM, ~fdirtcpm);
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRUDPM, ~fdirtcpm);
+
+	/* store source and destination IP masks (big-enian) */
+	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRSIP4M,
+			     ~input_masks->src_ip_mask[0]);
+	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRDIP4M,
+			     ~input_masks->dst_ip_mask[0]);
+
+	/* Apply masks to input data */
+	input->formatted.vlan_id &= input_masks->vlan_id_mask;
+	input->formatted.flex_bytes &= input_masks->flex_mask;
+	input->formatted.src_port &= input_masks->src_port_mask;
+	input->formatted.dst_port &= input_masks->dst_port_mask;
+	input->formatted.src_ip[0] &= input_masks->src_ip_mask[0];
+	input->formatted.dst_ip[0] &= input_masks->dst_ip_mask[0];
+
+	/* record vlan (little-endian) and flex_bytes(big-endian) */
+	fdirvlan =
+		IXGBE_STORE_AS_BE16(ntohs(input->formatted.flex_bytes));
+	fdirvlan <<= IXGBE_FDIRVLAN_FLEX_SHIFT;
+	fdirvlan |= ntohs(input->formatted.vlan_id);
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRVLAN, fdirvlan);
+
+	/* record source and destination port (little-endian)*/
+	fdirport = ntohs(input->formatted.dst_port);
+	fdirport <<= IXGBE_FDIRPORT_DESTINATION_SHIFT;
+	fdirport |= ntohs(input->formatted.src_port);
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRPORT, fdirport);
+
+	/* record the first 32 bits of the destination address (big-endian) */
+	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRIPDA, input->formatted.dst_ip[0]);
+
+	/* record the source address (big-endian) */
+	IXGBE_WRITE_REG_BE32(hw, IXGBE_FDIRIPSA, input->formatted.src_ip[0]);
+
+	/* configure FDIRCMD register */
+	fdircmd = IXGBE_FDIRCMD_CMD_ADD_FLOW | IXGBE_FDIRCMD_FILTER_UPDATE |
+		  IXGBE_FDIRCMD_LAST | IXGBE_FDIRCMD_QUEUE_EN;
+	fdircmd |= input->formatted.flow_type << IXGBE_FDIRCMD_FLOW_TYPE_SHIFT;
+	fdircmd |= (u32)queue << IXGBE_FDIRCMD_RX_QUEUE_SHIFT;
+
+	/* we only want the bucket hash so drop the upper 16 bits */
+	fdirhash = ixgbe_atr_compute_hash_82599(input,
+						IXGBE_ATR_BUCKET_HASH_KEY);
+	fdirhash |= soft_id << IXGBE_FDIRHASH_SIG_SW_INDEX_SHIFT;
 
 	IXGBE_WRITE_REG(hw, IXGBE_FDIRHASH, fdirhash);
 	IXGBE_WRITE_REG(hw, IXGBE_FDIRCMD, fdircmd);
 
 	return 0;
 }
+
 /**
  *  ixgbe_read_analog_reg8_82599 - Reads 8 bit Omer analog register
  *  @hw: pointer to hardware structure
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 76e40e2..2002ea8 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -2277,10 +2277,11 @@ static int ixgbe_set_rx_ntuple(struct net_device *dev,
                                struct ethtool_rx_ntuple *cmd)
 {
 	struct ixgbe_adapter *adapter = netdev_priv(dev);
-	struct ethtool_rx_ntuple_flow_spec fs = cmd->fs;
+	struct ethtool_rx_ntuple_flow_spec *fs = &cmd->fs;
 	union ixgbe_atr_input input_struct;
 	struct ixgbe_atr_input_masks input_masks;
 	int target_queue;
+	int err;
 
 	if (adapter->hw.mac.type == ixgbe_mac_82598EB)
 		return -EOPNOTSUPP;
@@ -2289,67 +2290,122 @@ static int ixgbe_set_rx_ntuple(struct net_device *dev,
 	 * Don't allow programming if the action is a queue greater than
 	 * the number of online Tx queues.
 	 */
-	if ((fs.action >= adapter->num_tx_queues) ||
-	    (fs.action < ETHTOOL_RXNTUPLE_ACTION_DROP))
+	if ((fs->action >= adapter->num_tx_queues) ||
+	    (fs->action < ETHTOOL_RXNTUPLE_ACTION_DROP))
 		return -EINVAL;
 
 	memset(&input_struct, 0, sizeof(union ixgbe_atr_input));
 	memset(&input_masks, 0, sizeof(struct ixgbe_atr_input_masks));
 
-	input_masks.src_ip_mask = fs.m_u.tcp_ip4_spec.ip4src;
-	input_masks.dst_ip_mask = fs.m_u.tcp_ip4_spec.ip4dst;
-	input_masks.src_port_mask = fs.m_u.tcp_ip4_spec.psrc;
-	input_masks.dst_port_mask = fs.m_u.tcp_ip4_spec.pdst;
-	input_masks.vlan_id_mask = fs.vlan_tag_mask;
-	/* only use the lowest 2 bytes for flex bytes */
-	input_masks.data_mask = (fs.data_mask & 0xffff);
-
-	switch (fs.flow_type) {
+	/* record flow type */
+	switch (fs->flow_type) {
+	case IPV4_FLOW:
+		input_struct.formatted.flow_type = IXGBE_ATR_FLOW_TYPE_IPV4;
+		break;
 	case TCP_V4_FLOW:
-		ixgbe_atr_set_l4type_82599(&input_struct, IXGBE_ATR_L4TYPE_TCP);
+		input_struct.formatted.flow_type = IXGBE_ATR_FLOW_TYPE_TCPV4;
 		break;
 	case UDP_V4_FLOW:
-		ixgbe_atr_set_l4type_82599(&input_struct, IXGBE_ATR_L4TYPE_UDP);
+		input_struct.formatted.flow_type = IXGBE_ATR_FLOW_TYPE_UDPV4;
 		break;
 	case SCTP_V4_FLOW:
-		ixgbe_atr_set_l4type_82599(&input_struct, IXGBE_ATR_L4TYPE_SCTP);
+		input_struct.formatted.flow_type = IXGBE_ATR_FLOW_TYPE_SCTPV4;
 		break;
 	default:
 		return -1;
 	}
 
-	/* Mask bits from the inputs based on user-supplied mask */
-	ixgbe_atr_set_src_ipv4_82599(&input_struct,
-	            (fs.h_u.tcp_ip4_spec.ip4src & ~fs.m_u.tcp_ip4_spec.ip4src));
-	ixgbe_atr_set_dst_ipv4_82599(&input_struct,
-	            (fs.h_u.tcp_ip4_spec.ip4dst & ~fs.m_u.tcp_ip4_spec.ip4dst));
-	/* 82599 expects these to be byte-swapped for perfect filtering */
-	ixgbe_atr_set_src_port_82599(&input_struct,
-	       ((ntohs(fs.h_u.tcp_ip4_spec.psrc)) & ~fs.m_u.tcp_ip4_spec.psrc));
-	ixgbe_atr_set_dst_port_82599(&input_struct,
-	       ((ntohs(fs.h_u.tcp_ip4_spec.pdst)) & ~fs.m_u.tcp_ip4_spec.pdst));
-
-	/* VLAN and Flex bytes are either completely masked or not */
-	if (!fs.vlan_tag_mask)
-		ixgbe_atr_set_vlan_id_82599(&input_struct, fs.vlan_tag);
-
-	if (!input_masks.data_mask)
-		/* make sure we only use the first 2 bytes of user data */
-		ixgbe_atr_set_flex_byte_82599(&input_struct,
-		                              (fs.data & 0xffff));
+	/* copy vlan tag minus the CFI bit */
+	if ((fs->vlan_tag & 0xEFFF) || (~fs->vlan_tag_mask & 0xEFFF)) {
+		input_struct.formatted.vlan_id = htons(fs->vlan_tag & 0xEFFF);
+		if (!fs->vlan_tag_mask) {
+			input_masks.vlan_id_mask = htons(0xEFFF);
+		} else {
+			switch (~fs->vlan_tag_mask & 0xEFFF) {
+			/* all of these are valid vlan-mask values */
+			case 0xEFFF:
+			case 0xE000:
+			case 0x0FFF:
+			case 0x0000:
+				input_masks.vlan_id_mask =
+					htons(~fs->vlan_tag_mask);
+				break;
+			/* exit with error if vlan-mask is invalid */
+			default:
+				e_err(drv, "Partial VLAN ID or "
+				      "priority mask in vlan-mask is not "
+				      "supported by hardware\n");
+				return -1;
+			}
+		}
+	}
+
+	/* make sure we only use the first 2 bytes of user data */
+	if ((fs->data & 0xFFFF) || (~fs->data_mask & 0xFFFF)) {
+		input_struct.formatted.flex_bytes = htons(fs->data & 0xFFFF);
+		if (!(fs->data_mask & 0xFFFF)) {
+			input_masks.flex_mask = 0xFFFF;
+		} else if (~fs->data_mask & 0xFFFF) {
+			e_err(drv, "Partial user-def-mask is not "
+			      "supported by hardware\n");
+			return -1;
+		}
+	}
+
+	/*
+	 * Copy input into formatted structures
+	 *
+	 * These assignments are based on the following logic
+	 * If neither input or mask are set assume value is masked out.
+	 * If input is set, but mask is not mask should default to accept all.
+	 * If input is not set, but mask is set then mask likely results in 0.
+	 * If input is set and mask is set then assign both.
+	 */
+	if (fs->h_u.tcp_ip4_spec.ip4src || ~fs->m_u.tcp_ip4_spec.ip4src) {
+		input_struct.formatted.src_ip[0] = fs->h_u.tcp_ip4_spec.ip4src;
+		if (!fs->m_u.tcp_ip4_spec.ip4src)
+			input_masks.src_ip_mask[0] = 0xFFFFFFFF;
+		else
+			input_masks.src_ip_mask[0] =
+				~fs->m_u.tcp_ip4_spec.ip4src;
+	}
+	if (fs->h_u.tcp_ip4_spec.ip4dst || ~fs->m_u.tcp_ip4_spec.ip4dst) {
+		input_struct.formatted.dst_ip[0] = fs->h_u.tcp_ip4_spec.ip4dst;
+		if (!fs->m_u.tcp_ip4_spec.ip4dst)
+			input_masks.dst_ip_mask[0] = 0xFFFFFFFF;
+		else
+			input_masks.dst_ip_mask[0] =
+				~fs->m_u.tcp_ip4_spec.ip4dst;
+	}
+	if (fs->h_u.tcp_ip4_spec.psrc || ~fs->m_u.tcp_ip4_spec.psrc) {
+		input_struct.formatted.src_port = fs->h_u.tcp_ip4_spec.psrc;
+		if (!fs->m_u.tcp_ip4_spec.psrc)
+			input_masks.src_port_mask = 0xFFFF;
+		else
+			input_masks.src_port_mask = ~fs->m_u.tcp_ip4_spec.psrc;
+	}
+	if (fs->h_u.tcp_ip4_spec.pdst || ~fs->m_u.tcp_ip4_spec.pdst) {
+		input_struct.formatted.dst_port = fs->h_u.tcp_ip4_spec.pdst;
+		if (!fs->m_u.tcp_ip4_spec.pdst)
+			input_masks.dst_port_mask = 0xFFFF;
+		else
+			input_masks.dst_port_mask = ~fs->m_u.tcp_ip4_spec.pdst;
+	}
 
 	/* determine if we need to drop or route the packet */
-	if (fs.action == ETHTOOL_RXNTUPLE_ACTION_DROP)
+	if (fs->action == ETHTOOL_RXNTUPLE_ACTION_DROP)
 		target_queue = MAX_RX_QUEUES - 1;
 	else
-		target_queue = fs.action;
+		target_queue = fs->action;
 
 	spin_lock(&adapter->fdir_perfect_lock);
-	ixgbe_fdir_add_perfect_filter_82599(&adapter->hw, &input_struct,
-	                                    &input_masks, 0, target_queue);
+	err = ixgbe_fdir_add_perfect_filter_82599(&adapter->hw,
+						  &input_struct,
+						  &input_masks, 0,
+						  target_queue);
 	spin_unlock(&adapter->fdir_perfect_lock);
 
-	return 0;
+	return err ? -1 : 0;
 }
 
 static const struct ethtool_ops ixgbe_ethtool_ops = {
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 490818c..a060610 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -4821,6 +4821,12 @@ static int ixgbe_set_interrupt_capability(struct ixgbe_adapter *adapter)
 
 	adapter->flags &= ~IXGBE_FLAG_DCB_ENABLED;
 	adapter->flags &= ~IXGBE_FLAG_RSS_ENABLED;
+	if (adapter->flags & (IXGBE_FLAG_FDIR_HASH_CAPABLE |
+			      IXGBE_FLAG_FDIR_PERFECT_CAPABLE)) {
+		e_err(probe,
+		      "Flow Director is not supported while multiple "
+		      "queues are disabled.  Disabling Flow Director\n");
+	}
 	adapter->flags &= ~IXGBE_FLAG_FDIR_HASH_CAPABLE;
 	adapter->flags &= ~IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
 	adapter->atr_sample_rate = 0;
@@ -5126,16 +5132,11 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
 		adapter->flags2 |= IXGBE_FLAG2_RSC_ENABLED;
 		if (hw->device_id == IXGBE_DEV_ID_82599_T3_LOM)
 			adapter->flags2 |= IXGBE_FLAG2_TEMP_SENSOR_CAPABLE;
-		if (dev->features & NETIF_F_NTUPLE) {
-			/* Flow Director perfect filter enabled */
-			adapter->flags |= IXGBE_FLAG_FDIR_PERFECT_CAPABLE;
-			adapter->atr_sample_rate = 0;
-			spin_lock_init(&adapter->fdir_perfect_lock);
-		} else {
-			/* Flow Director hash filters enabled */
-			adapter->flags |= IXGBE_FLAG_FDIR_HASH_CAPABLE;
-			adapter->atr_sample_rate = 20;
-		}
+		/* n-tuple support exists, always init our spinlock */
+		spin_lock_init(&adapter->fdir_perfect_lock);
+		/* Flow Director hash filters enabled */
+		adapter->flags |= IXGBE_FLAG_FDIR_HASH_CAPABLE;
+		adapter->atr_sample_rate = 20;
 		adapter->ring_feature[RING_F_FDIR].indices =
 							 IXGBE_MAX_FDIR_INDICES;
 		adapter->fdir_pballoc = 0;
diff --git a/drivers/net/ixgbe/ixgbe_type.h b/drivers/net/ixgbe/ixgbe_type.h
index 0d9392d..fd3358f 100644
--- a/drivers/net/ixgbe/ixgbe_type.h
+++ b/drivers/net/ixgbe/ixgbe_type.h
@@ -1947,10 +1947,9 @@ enum ixgbe_fdir_pballoc_type {
 #define IXGBE_FDIRM_VLANID                      0x00000001
 #define IXGBE_FDIRM_VLANP                       0x00000002
 #define IXGBE_FDIRM_POOL                        0x00000004
-#define IXGBE_FDIRM_L3P                         0x00000008
-#define IXGBE_FDIRM_L4P                         0x00000010
-#define IXGBE_FDIRM_FLEX                        0x00000020
-#define IXGBE_FDIRM_DIPv6                       0x00000040
+#define IXGBE_FDIRM_L4P                         0x00000008
+#define IXGBE_FDIRM_FLEX                        0x00000010
+#define IXGBE_FDIRM_DIPv6                       0x00000020
 
 #define IXGBE_FDIRFREE_FREE_MASK                0xFFFF
 #define IXGBE_FDIRFREE_FREE_SHIFT               0
@@ -2215,12 +2214,13 @@ union ixgbe_atr_hash_dword {
 };
 
 struct ixgbe_atr_input_masks {
-	__be32 src_ip_mask;
-	__be32 dst_ip_mask;
+	__be16 rsvd0;
+	__be16 vlan_id_mask;
+	__be32 dst_ip_mask[4];
+	__be32 src_ip_mask[4];
 	__be16 src_port_mask;
 	__be16 dst_port_mask;
-	__be16 vlan_id_mask;
-	__be16 data_mask;
+	__be16 flex_mask;
 };
 
 enum ixgbe_eeprom_type {
-- 
1.7.3.4


^ permalink raw reply related

* [net-next 11/12] ixgbe: further flow director performance optimizations
From: jeffrey.t.kirsher @ 2011-01-07  0:29 UTC (permalink / raw)
  To: davem, davem; +Cc: Alexander Duyck, netdev, gosp, bphilips, Jeff Kirsher
In-Reply-To: <1294360199-9860-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change adds a compressed input type for atr signature hash
computation.  It also drops the use of the set functions when setting up
the ATR input since we can then directly setup the hash input as two dwords
that can be stored and passed as registers.

With these changes the cost of computing the has is low enough that we can
perform a hash computation on each TCP SYN flagged packet allowing us to
drop the number of flow director misses considerably in tests such as
netperf TCP_CRR.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe.h       |    3 +-
 drivers/net/ixgbe/ixgbe_82599.c |  112 ++++++++++++++++++++++++++++++++++-----
 drivers/net/ixgbe/ixgbe_main.c  |  107 +++++++++++++++++++++++++++----------
 drivers/net/ixgbe/ixgbe_type.h  |   16 ++++++
 4 files changed, 194 insertions(+), 44 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index 2666e69..341b3db 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -526,7 +526,8 @@ extern s32 ixgbe_reinit_fdir_tables_82599(struct ixgbe_hw *hw);
 extern s32 ixgbe_init_fdir_signature_82599(struct ixgbe_hw *hw, u32 pballoc);
 extern s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 pballoc);
 extern s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
-                                                 union ixgbe_atr_input *input,
+						 union ixgbe_atr_hash_dword input,
+						 union ixgbe_atr_hash_dword common,
                                                  u8 queue);
 extern s32 ixgbe_fdir_add_perfect_filter_82599(struct ixgbe_hw *hw,
                                       union ixgbe_atr_input *input,
diff --git a/drivers/net/ixgbe/ixgbe_82599.c b/drivers/net/ixgbe/ixgbe_82599.c
index 40aa3c2..d41931f 100644
--- a/drivers/net/ixgbe/ixgbe_82599.c
+++ b/drivers/net/ixgbe/ixgbe_82599.c
@@ -1331,6 +1331,96 @@ static u32 ixgbe_atr_compute_hash_82599(union ixgbe_atr_input *atr_input,
 	return hash_result & IXGBE_ATR_HASH_MASK;
 }
 
+/*
+ * These defines allow us to quickly generate all of the necessary instructions
+ * in the function below by simply calling out IXGBE_COMPUTE_SIG_HASH_ITERATION
+ * for values 0 through 15
+ */
+#define IXGBE_ATR_COMMON_HASH_KEY \
+		(IXGBE_ATR_BUCKET_HASH_KEY & IXGBE_ATR_SIGNATURE_HASH_KEY)
+#define IXGBE_COMPUTE_SIG_HASH_ITERATION(_n) \
+do { \
+	u32 n = (_n); \
+	if (IXGBE_ATR_COMMON_HASH_KEY & (0x01 << n)) \
+		common_hash ^= lo_hash_dword >> n; \
+	else if (IXGBE_ATR_BUCKET_HASH_KEY & (0x01 << n)) \
+		bucket_hash ^= lo_hash_dword >> n; \
+	else if (IXGBE_ATR_SIGNATURE_HASH_KEY & (0x01 << n)) \
+		sig_hash ^= lo_hash_dword << (16 - n); \
+	if (IXGBE_ATR_COMMON_HASH_KEY & (0x01 << (n + 16))) \
+		common_hash ^= hi_hash_dword >> n; \
+	else if (IXGBE_ATR_BUCKET_HASH_KEY & (0x01 << (n + 16))) \
+		bucket_hash ^= hi_hash_dword >> n; \
+	else if (IXGBE_ATR_SIGNATURE_HASH_KEY & (0x01 << (n + 16))) \
+		sig_hash ^= hi_hash_dword << (16 - n); \
+} while (0);
+
+/**
+ *  ixgbe_atr_compute_sig_hash_82599 - Compute the signature hash
+ *  @stream: input bitstream to compute the hash on
+ *
+ *  This function is almost identical to the function above but contains
+ *  several optomizations such as unwinding all of the loops, letting the
+ *  compiler work out all of the conditional ifs since the keys are static
+ *  defines, and computing two keys at once since the hashed dword stream
+ *  will be the same for both keys.
+ **/
+static u32 ixgbe_atr_compute_sig_hash_82599(union ixgbe_atr_hash_dword input,
+					    union ixgbe_atr_hash_dword common)
+{
+	u32 hi_hash_dword, lo_hash_dword, flow_vm_vlan;
+	u32 sig_hash = 0, bucket_hash = 0, common_hash = 0;
+
+	/* record the flow_vm_vlan bits as they are a key part to the hash */
+	flow_vm_vlan = ntohl(input.dword);
+
+	/* generate common hash dword */
+	hi_hash_dword = ntohl(common.dword);
+
+	/* low dword is word swapped version of common */
+	lo_hash_dword = (hi_hash_dword >> 16) | (hi_hash_dword << 16);
+
+	/* apply flow ID/VM pool/VLAN ID bits to hash words */
+	hi_hash_dword ^= flow_vm_vlan ^ (flow_vm_vlan >> 16);
+
+	/* Process bits 0 and 16 */
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(0);
+
+	/*
+	 * apply flow ID/VM pool/VLAN ID bits to lo hash dword, we had to
+	 * delay this because bit 0 of the stream should not be processed
+	 * so we do not add the vlan until after bit 0 was processed
+	 */
+	lo_hash_dword ^= flow_vm_vlan ^ (flow_vm_vlan << 16);
+
+	/* Process remaining 30 bit of the key */
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(1);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(2);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(3);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(4);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(5);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(6);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(7);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(8);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(9);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(10);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(11);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(12);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(13);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(14);
+	IXGBE_COMPUTE_SIG_HASH_ITERATION(15);
+
+	/* combine common_hash result with signature and bucket hashes */
+	bucket_hash ^= common_hash;
+	bucket_hash &= IXGBE_ATR_HASH_MASK;
+
+	sig_hash ^= common_hash << 16;
+	sig_hash &= IXGBE_ATR_HASH_MASK << 16;
+
+	/* return completed signature hash */
+	return sig_hash ^ bucket_hash;
+}
+
 /**
  *  ixgbe_atr_set_vlan_id_82599 - Sets the VLAN id in the ATR input stream
  *  @input: input stream to modify
@@ -1539,22 +1629,23 @@ static s32 ixgbe_atr_get_l4type_82599(union ixgbe_atr_input *input,
 /**
  *  ixgbe_atr_add_signature_filter_82599 - Adds a signature hash filter
  *  @hw: pointer to hardware structure
- *  @stream: input bitstream
+ *  @input: unique input dword
+ *  @common: compressed common input dword
  *  @queue: queue index to direct traffic to
  **/
 s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
-                                          union ixgbe_atr_input *input,
+                                          union ixgbe_atr_hash_dword input,
+                                          union ixgbe_atr_hash_dword common,
                                           u8 queue)
 {
 	u64  fdirhashcmd;
 	u32  fdircmd;
-	u32  bucket_hash, sig_hash;
 
 	/*
 	 * Get the flow_type in order to program FDIRCMD properly
 	 * lowest 2 bits are FDIRCMD.L4TYPE, third lowest bit is FDIRCMD.IPV6
 	 */
-	switch (input->formatted.flow_type) {
+	switch (input.formatted.flow_type) {
 	case IXGBE_ATR_FLOW_TYPE_TCPV4:
 	case IXGBE_ATR_FLOW_TYPE_UDPV4:
 	case IXGBE_ATR_FLOW_TYPE_SCTPV4:
@@ -1570,7 +1661,7 @@ s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
 	/* configure FDIRCMD register */
 	fdircmd = IXGBE_FDIRCMD_CMD_ADD_FLOW | IXGBE_FDIRCMD_FILTER_UPDATE |
 	          IXGBE_FDIRCMD_LAST | IXGBE_FDIRCMD_QUEUE_EN;
-	fdircmd |= input->formatted.flow_type << IXGBE_FDIRCMD_FLOW_TYPE_SHIFT;
+	fdircmd |= input.formatted.flow_type << IXGBE_FDIRCMD_FLOW_TYPE_SHIFT;
 	fdircmd |= (u32)queue << IXGBE_FDIRCMD_RX_QUEUE_SHIFT;
 
 	/*
@@ -1578,17 +1669,12 @@ s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
 	 * is for FDIRCMD.  Then do a 64-bit register write from FDIRHASH.
 	 */
 	fdirhashcmd = (u64)fdircmd << 32;
-
-	sig_hash = ixgbe_atr_compute_hash_82599(input,
-	                                        IXGBE_ATR_SIGNATURE_HASH_KEY);
-	fdirhashcmd |= sig_hash << IXGBE_FDIRHASH_SIG_SW_INDEX_SHIFT;
-
-	bucket_hash = ixgbe_atr_compute_hash_82599(input,
-	                                           IXGBE_ATR_BUCKET_HASH_KEY);
-	fdirhashcmd |= bucket_hash;
+	fdirhashcmd |= ixgbe_atr_compute_sig_hash_82599(input, common);
 
 	IXGBE_WRITE_REG64(hw, IXGBE_FDIRHASH, fdirhashcmd);
 
+	hw_dbg(hw, "Tx Queue=%x hash=%x\n", queue, (u32)fdirhashcmd);
+
 	return 0;
 }
 
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 26718ab..490818c 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -6506,37 +6506,92 @@ static void ixgbe_tx_queue(struct ixgbe_ring *tx_ring,
 	writel(i, tx_ring->tail);
 }
 
-static void ixgbe_atr(struct ixgbe_adapter *adapter, struct sk_buff *skb,
-		      u8 queue, u32 tx_flags, __be16 protocol)
-{
-	union ixgbe_atr_input atr_input;
-	struct iphdr *iph = ip_hdr(skb);
-	struct ethhdr *eth = (struct ethhdr *)skb->data;
+static void ixgbe_atr(struct ixgbe_ring *ring, struct sk_buff *skb,
+		      u32 tx_flags, __be16 protocol)
+{
+	struct ixgbe_q_vector *q_vector = ring->q_vector;
+	union ixgbe_atr_hash_dword input = { .dword = 0 };
+	union ixgbe_atr_hash_dword common = { .dword = 0 };
+	union {
+		unsigned char *network;
+		struct iphdr *ipv4;
+		struct ipv6hdr *ipv6;
+	} hdr;
 	struct tcphdr *th;
 	__be16 vlan_id;
 
-	/* Right now, we support IPv4 w/ TCP only */
-	if (protocol != htons(ETH_P_IP) ||
-	    iph->protocol != IPPROTO_TCP)
+	/* if ring doesn't have a interrupt vector, cannot perform ATR */
+	if (!q_vector)
+		return;
+
+	/* do nothing if sampling is disabled */
+	if (!ring->atr_sample_rate)
 		return;
 
-	memset(&atr_input, 0, sizeof(union ixgbe_atr_input));
+	ring->atr_count++;
 
-	vlan_id = htons(tx_flags >> IXGBE_TX_FLAGS_VLAN_SHIFT);
+	/* snag network header to get L4 type and address */
+	hdr.network = skb_network_header(skb);
+
+	/* Currently only IPv4/IPv6 with TCP is supported */
+	if ((protocol != __constant_htons(ETH_P_IPV6) ||
+	     hdr.ipv6->nexthdr != IPPROTO_TCP) &&
+	    (protocol != __constant_htons(ETH_P_IP) ||
+	     hdr.ipv4->protocol != IPPROTO_TCP))
+		return;
 
 	th = tcp_hdr(skb);
 
-	ixgbe_atr_set_vlan_id_82599(&atr_input, vlan_id);
-	ixgbe_atr_set_src_port_82599(&atr_input, th->dest);
-	ixgbe_atr_set_dst_port_82599(&atr_input, th->source);
-	ixgbe_atr_set_flex_byte_82599(&atr_input, eth->h_proto);
-	ixgbe_atr_set_l4type_82599(&atr_input, IXGBE_ATR_FLOW_TYPE_TCPV4);
-	/* src and dst are inverted, think how the receiver sees them */
-	ixgbe_atr_set_src_ipv4_82599(&atr_input, iph->daddr);
-	ixgbe_atr_set_dst_ipv4_82599(&atr_input, iph->saddr);
+	/* skip this packet since the socket is closing */
+	if (th->fin)
+		return;
+
+	/* sample on all syn packets or once every atr sample count */
+	if (!th->syn && (ring->atr_count < ring->atr_sample_rate))
+		return;
+
+	/* reset sample count */
+	ring->atr_count = 0;
+
+	vlan_id = htons(tx_flags >> IXGBE_TX_FLAGS_VLAN_SHIFT);
+
+	/*
+	 * src and dst are inverted, think how the receiver sees them
+	 *
+	 * The input is broken into two sections, a non-compressed section
+	 * containing vm_pool, vlan_id, and flow_type.  The rest of the data
+	 * is XORed together and stored in the compressed dword.
+	 */
+	input.formatted.vlan_id = vlan_id;
+
+	/*
+	 * since src port and flex bytes occupy the same word XOR them together
+	 * and write the value to source port portion of compressed dword
+	 */
+	if (vlan_id)
+		common.port.src ^= th->dest ^ __constant_htons(ETH_P_8021Q);
+	else
+		common.port.src ^= th->dest ^ protocol;
+	common.port.dst ^= th->source;
+
+	if (protocol == __constant_htons(ETH_P_IP)) {
+		input.formatted.flow_type = IXGBE_ATR_FLOW_TYPE_TCPV4;
+		common.ip ^= hdr.ipv4->saddr ^ hdr.ipv4->daddr;
+	} else {
+		input.formatted.flow_type = IXGBE_ATR_FLOW_TYPE_TCPV6;
+		common.ip ^= hdr.ipv6->saddr.s6_addr32[0] ^
+			     hdr.ipv6->saddr.s6_addr32[1] ^
+			     hdr.ipv6->saddr.s6_addr32[2] ^
+			     hdr.ipv6->saddr.s6_addr32[3] ^
+			     hdr.ipv6->daddr.s6_addr32[0] ^
+			     hdr.ipv6->daddr.s6_addr32[1] ^
+			     hdr.ipv6->daddr.s6_addr32[2] ^
+			     hdr.ipv6->daddr.s6_addr32[3];
+	}
 
 	/* This assumes the Rx queue and Tx queue are bound to the same CPU */
-	ixgbe_fdir_add_signature_filter_82599(&adapter->hw, &atr_input, queue);
+	ixgbe_fdir_add_signature_filter_82599(&q_vector->adapter->hw,
+					      input, common, ring->queue_index);
 }
 
 static int __ixgbe_maybe_stop_tx(struct ixgbe_ring *tx_ring, int size)
@@ -6707,16 +6762,8 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 	count = ixgbe_tx_map(adapter, tx_ring, skb, tx_flags, first, hdr_len);
 	if (count) {
 		/* add the ATR filter if ATR is on */
-		if (tx_ring->atr_sample_rate) {
-			++tx_ring->atr_count;
-			if ((tx_ring->atr_count >= tx_ring->atr_sample_rate) &&
-			     test_bit(__IXGBE_TX_FDIR_INIT_DONE,
-				      &tx_ring->state)) {
-				ixgbe_atr(adapter, skb, tx_ring->queue_index,
-					  tx_flags, protocol);
-				tx_ring->atr_count = 0;
-			}
-		}
+		if (test_bit(__IXGBE_TX_FDIR_INIT_DONE, &tx_ring->state))
+			ixgbe_atr(tx_ring, skb, tx_flags, protocol);
 		txq = netdev_get_tx_queue(netdev, tx_ring->queue_index);
 		txq->tx_bytes += skb->len;
 		txq->tx_packets++;
diff --git a/drivers/net/ixgbe/ixgbe_type.h b/drivers/net/ixgbe/ixgbe_type.h
index c56a712..0d9392d 100644
--- a/drivers/net/ixgbe/ixgbe_type.h
+++ b/drivers/net/ixgbe/ixgbe_type.h
@@ -2198,6 +2198,22 @@ union ixgbe_atr_input {
 	__be32 dword_stream[11];
 };
 
+/* Flow Director compressed ATR hash input struct */
+union ixgbe_atr_hash_dword {
+	struct {
+		u8 vm_pool;
+		u8 flow_type;
+		__be16 vlan_id;
+	} formatted;
+	__be32 ip;
+	struct {
+		__be16 src;
+		__be16 dst;
+	} port;
+	__be16 flex_bytes;
+	__be32 dword;
+};
+
 struct ixgbe_atr_input_masks {
 	__be32 src_ip_mask;
 	__be32 dst_ip_mask;
-- 
1.7.3.4


^ permalink raw reply related

* [net-next 10/12] ixgbe: cleanup flow director hash computation to improve performance
From: jeffrey.t.kirsher @ 2011-01-07  0:29 UTC (permalink / raw)
  To: davem, davem; +Cc: Alexander Duyck, netdev, gosp, bphilips, Jeff Kirsher
In-Reply-To: <1294360199-9860-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change cleans up the layout of the flow director data, and the
algorithm used to calculate the hash resulting in a 35x / 3500% performance
increase versus the old flow director hash computation.  The overall effect
is only a 1% increase in transactions per second though due to the fact
that only 1 packet in 20 are actually hashed upon.

TCP_RR before:
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       60.00    23059.27
16384  87380

TCP_RR after:
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       60.00    23239.98
16384  87380

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe.h         |   18 +-
 drivers/net/ixgbe/ixgbe_82599.c   |  335 ++++++++++++++-----------------------
 drivers/net/ixgbe/ixgbe_ethtool.c |    4 +-
 drivers/net/ixgbe/ixgbe_main.c    |   11 +-
 drivers/net/ixgbe/ixgbe_type.h    |   67 +++++---
 5 files changed, 182 insertions(+), 253 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index bdeaa9e..2666e69 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -526,25 +526,25 @@ extern s32 ixgbe_reinit_fdir_tables_82599(struct ixgbe_hw *hw);
 extern s32 ixgbe_init_fdir_signature_82599(struct ixgbe_hw *hw, u32 pballoc);
 extern s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 pballoc);
 extern s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
-                                                 struct ixgbe_atr_input *input,
+                                                 union ixgbe_atr_input *input,
                                                  u8 queue);
 extern s32 ixgbe_fdir_add_perfect_filter_82599(struct ixgbe_hw *hw,
-                                      struct ixgbe_atr_input *input,
+                                      union ixgbe_atr_input *input,
                                       struct ixgbe_atr_input_masks *input_masks,
                                       u16 soft_id, u8 queue);
-extern s32 ixgbe_atr_set_vlan_id_82599(struct ixgbe_atr_input *input,
+extern s32 ixgbe_atr_set_vlan_id_82599(union ixgbe_atr_input *input,
                                        u16 vlan_id);
-extern s32 ixgbe_atr_set_src_ipv4_82599(struct ixgbe_atr_input *input,
+extern s32 ixgbe_atr_set_src_ipv4_82599(union ixgbe_atr_input *input,
                                         u32 src_addr);
-extern s32 ixgbe_atr_set_dst_ipv4_82599(struct ixgbe_atr_input *input,
+extern s32 ixgbe_atr_set_dst_ipv4_82599(union ixgbe_atr_input *input,
                                         u32 dst_addr);
-extern s32 ixgbe_atr_set_src_port_82599(struct ixgbe_atr_input *input,
+extern s32 ixgbe_atr_set_src_port_82599(union ixgbe_atr_input *input,
                                         u16 src_port);
-extern s32 ixgbe_atr_set_dst_port_82599(struct ixgbe_atr_input *input,
+extern s32 ixgbe_atr_set_dst_port_82599(union ixgbe_atr_input *input,
                                         u16 dst_port);
-extern s32 ixgbe_atr_set_flex_byte_82599(struct ixgbe_atr_input *input,
+extern s32 ixgbe_atr_set_flex_byte_82599(union ixgbe_atr_input *input,
                                          u16 flex_byte);
-extern s32 ixgbe_atr_set_l4type_82599(struct ixgbe_atr_input *input,
+extern s32 ixgbe_atr_set_l4type_82599(union ixgbe_atr_input *input,
                                       u8 l4type);
 extern void ixgbe_configure_rscctl(struct ixgbe_adapter *adapter,
                                    struct ixgbe_ring *ring);
diff --git a/drivers/net/ixgbe/ixgbe_82599.c b/drivers/net/ixgbe/ixgbe_82599.c
index bfd3c22..40aa3c2 100644
--- a/drivers/net/ixgbe/ixgbe_82599.c
+++ b/drivers/net/ixgbe/ixgbe_82599.c
@@ -1003,7 +1003,7 @@ s32 ixgbe_reinit_fdir_tables_82599(struct ixgbe_hw *hw)
 		udelay(10);
 	}
 	if (i >= IXGBE_FDIRCMD_CMD_POLL) {
-		hw_dbg(hw ,"Flow Director previous command isn't complete, "
+		hw_dbg(hw, "Flow Director previous command isn't complete, "
 		       "aborting table re-initialization.\n");
 		return IXGBE_ERR_FDIR_REINIT_FAILED;
 	}
@@ -1113,13 +1113,10 @@ s32 ixgbe_init_fdir_signature_82599(struct ixgbe_hw *hw, u32 pballoc)
 	/* Move the flexible bytes to use the ethertype - shift 6 words */
 	fdirctrl |= (0x6 << IXGBE_FDIRCTRL_FLEX_SHIFT);
 
-	fdirctrl |= IXGBE_FDIRCTRL_REPORT_STATUS;
 
 	/* Prime the keys for hashing */
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRHKEY,
-	                htonl(IXGBE_ATR_BUCKET_HASH_KEY));
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRSKEY,
-	                htonl(IXGBE_ATR_SIGNATURE_HASH_KEY));
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRHKEY, IXGBE_ATR_BUCKET_HASH_KEY);
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRSKEY, IXGBE_ATR_SIGNATURE_HASH_KEY);
 
 	/*
 	 * Poll init-done after we write the register.  Estimated times:
@@ -1209,10 +1206,8 @@ s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 pballoc)
 	fdirctrl |= (0x6 << IXGBE_FDIRCTRL_FLEX_SHIFT);
 
 	/* Prime the keys for hashing */
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRHKEY,
-	                htonl(IXGBE_ATR_BUCKET_HASH_KEY));
-	IXGBE_WRITE_REG(hw, IXGBE_FDIRSKEY,
-	                htonl(IXGBE_ATR_SIGNATURE_HASH_KEY));
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRHKEY, IXGBE_ATR_BUCKET_HASH_KEY);
+	IXGBE_WRITE_REG(hw, IXGBE_FDIRSKEY, IXGBE_ATR_SIGNATURE_HASH_KEY);
 
 	/*
 	 * Poll init-done after we write the register.  Estimated times:
@@ -1251,8 +1246,8 @@ s32 ixgbe_init_fdir_perfect_82599(struct ixgbe_hw *hw, u32 pballoc)
  *  @stream: input bitstream to compute the hash on
  *  @key: 32-bit hash key
  **/
-static u16 ixgbe_atr_compute_hash_82599(struct ixgbe_atr_input *atr_input,
-                                        u32 key)
+static u32 ixgbe_atr_compute_hash_82599(union ixgbe_atr_input *atr_input,
+					u32 key)
 {
 	/*
 	 * The algorithm is as follows:
@@ -1272,100 +1267,68 @@ static u16 ixgbe_atr_compute_hash_82599(struct ixgbe_atr_input *atr_input,
 	 *    To simplify for programming, the algorithm is implemented
 	 *    in software this way:
 	 *
-	 *    Key[31:0], Stream[335:0]
+	 *    key[31:0], hi_hash_dword[31:0], lo_hash_dword[31:0], hash[15:0]
+	 *
+	 *    for (i = 0; i < 352; i+=32)
+	 *        hi_hash_dword[31:0] ^= Stream[(i+31):i];
+	 *
+	 *    lo_hash_dword[15:0]  ^= Stream[15:0];
+	 *    lo_hash_dword[15:0]  ^= hi_hash_dword[31:16];
+	 *    lo_hash_dword[31:16] ^= hi_hash_dword[15:0];
+	 *
+	 *    hi_hash_dword[31:0]  ^= Stream[351:320];
 	 *
-	 *    tmp_key[11 * 32 - 1:0] = 11{Key[31:0] = key concatenated 11 times
-	 *    int_key[350:0] = tmp_key[351:1]
-	 *    int_stream[365:0] = Stream[14:0] | Stream[335:0] | Stream[335:321]
+	 *    if(key[0])
+	 *        hash[15:0] ^= Stream[15:0];
 	 *
-	 *    hash[15:0] = 0;
-	 *    for (i = 0; i < 351; i++) {
-	 *        if (int_key[i])
-	 *            hash ^= int_stream[(i + 15):i];
+	 *    for (i = 0; i < 16; i++) {
+	 *        if (key[i])
+	 *            hash[15:0] ^= lo_hash_dword[(i+15):i];
+	 *        if (key[i + 16])
+	 *            hash[15:0] ^= hi_hash_dword[(i+15):i];
 	 *    }
+	 *
 	 */
+	__be32 common_hash_dword = 0;
+	u32 hi_hash_dword, lo_hash_dword, flow_vm_vlan;
+	u32 hash_result = 0;
+	u8 i;
 
-	union {
-		u64    fill[6];
-		u32    key[11];
-		u8     key_stream[44];
-	} tmp_key;
+	/* record the flow_vm_vlan bits as they are a key part to the hash */
+	flow_vm_vlan = ntohl(atr_input->dword_stream[0]);
 
-	u8   *stream = (u8 *)atr_input;
-	u8   int_key[44];      /* upper-most bit unused */
-	u8   hash_str[46];     /* upper-most 2 bits unused */
-	u16  hash_result = 0;
-	int  i, j, k, h;
+	/* generate common hash dword */
+	for (i = 10; i; i -= 2)
+		common_hash_dword ^= atr_input->dword_stream[i] ^
+				     atr_input->dword_stream[i - 1];
 
-	/*
-	 * Initialize the fill member to prevent warnings
-	 * on some compilers
-	 */
-	 tmp_key.fill[0] = 0;
+	hi_hash_dword = ntohl(common_hash_dword);
 
-	/* First load the temporary key stream */
-	for (i = 0; i < 6; i++) {
-		u64 fillkey = ((u64)key << 32) | key;
-		tmp_key.fill[i] = fillkey;
-	}
+	/* low dword is word swapped version of common */
+	lo_hash_dword = (hi_hash_dword >> 16) | (hi_hash_dword << 16);
 
-	/*
-	 * Set the interim key for the hashing.  Bit 352 is unused, so we must
-	 * shift and compensate when building the key.
-	 */
+	/* apply flow ID/VM pool/VLAN ID bits to hash words */
+	hi_hash_dword ^= flow_vm_vlan ^ (flow_vm_vlan >> 16);
 
-	int_key[0] = tmp_key.key_stream[0] >> 1;
-	for (i = 1, j = 0; i < 44; i++) {
-		unsigned int this_key = tmp_key.key_stream[j] << 7;
-		j++;
-		int_key[i] = (u8)(this_key | (tmp_key.key_stream[j] >> 1));
-	}
+	/* Process bits 0 and 16 */
+	if (key & 0x0001) hash_result ^= lo_hash_dword;
+	if (key & 0x00010000) hash_result ^= hi_hash_dword;
 
 	/*
-	 * Set the interim bit string for the hashing.  Bits 368 and 367 are
-	 * unused, so shift and compensate when building the string.
+	 * apply flow ID/VM pool/VLAN ID bits to lo hash dword, we had to
+	 * delay this because bit 0 of the stream should not be processed
+	 * so we do not add the vlan until after bit 0 was processed
 	 */
-	hash_str[0] = (stream[40] & 0x7f) >> 1;
-	for (i = 1, j = 40; i < 46; i++) {
-		unsigned int this_str = stream[j] << 7;
-		j++;
-		if (j > 41)
-			j = 0;
-		hash_str[i] = (u8)(this_str | (stream[j] >> 1));
-	}
+	lo_hash_dword ^= flow_vm_vlan ^ (flow_vm_vlan << 16);
 
-	/*
-	 * Now compute the hash.  i is the index into hash_str, j is into our
-	 * key stream, k is counting the number of bits, and h interates within
-	 * each byte.
-	 */
-	for (i = 45, j = 43, k = 0; k < 351 && i >= 2 && j >= 0; i--, j--) {
-		for (h = 0; h < 8 && k < 351; h++, k++) {
-			if (int_key[j] & (1 << h)) {
-				/*
-				 * Key bit is set, XOR in the current 16-bit
-				 * string.  Example of processing:
-				 *    h = 0,
-				 *      tmp = (hash_str[i - 2] & 0 << 16) |
-				 *            (hash_str[i - 1] & 0xff << 8) |
-				 *            (hash_str[i] & 0xff >> 0)
-				 *      So tmp = hash_str[15 + k:k], since the
-				 *      i + 2 clause rolls off the 16-bit value
-				 *    h = 7,
-				 *      tmp = (hash_str[i - 2] & 0x7f << 9) |
-				 *            (hash_str[i - 1] & 0xff << 1) |
-				 *            (hash_str[i] & 0x80 >> 7)
-				 */
-				int tmp = (hash_str[i] >> h);
-				tmp |= (hash_str[i - 1] << (8 - h));
-				tmp |= (int)(hash_str[i - 2] & ((1 << h) - 1))
-				             << (16 - h);
-				hash_result ^= (u16)tmp;
-			}
-		}
+
+	/* process the remaining 30 bits in the key 2 bits at a time */
+	for (i = 15; i; i-- ) {
+		if (key & (0x0001 << i)) hash_result ^= lo_hash_dword >> i;
+		if (key & (0x00010000 << i)) hash_result ^= hi_hash_dword >> i;
 	}
 
-	return hash_result;
+	return hash_result & IXGBE_ATR_HASH_MASK;
 }
 
 /**
@@ -1373,10 +1336,9 @@ static u16 ixgbe_atr_compute_hash_82599(struct ixgbe_atr_input *atr_input,
  *  @input: input stream to modify
  *  @vlan: the VLAN id to load
  **/
-s32 ixgbe_atr_set_vlan_id_82599(struct ixgbe_atr_input *input, u16 vlan)
+s32 ixgbe_atr_set_vlan_id_82599(union ixgbe_atr_input *input, __be16 vlan)
 {
-	input->byte_stream[IXGBE_ATR_VLAN_OFFSET + 1] = vlan >> 8;
-	input->byte_stream[IXGBE_ATR_VLAN_OFFSET] = vlan & 0xff;
+	input->formatted.vlan_id = vlan;
 
 	return 0;
 }
@@ -1386,14 +1348,9 @@ s32 ixgbe_atr_set_vlan_id_82599(struct ixgbe_atr_input *input, u16 vlan)
  *  @input: input stream to modify
  *  @src_addr: the IP address to load
  **/
-s32 ixgbe_atr_set_src_ipv4_82599(struct ixgbe_atr_input *input, u32 src_addr)
+s32 ixgbe_atr_set_src_ipv4_82599(union ixgbe_atr_input *input, __be32 src_addr)
 {
-	input->byte_stream[IXGBE_ATR_SRC_IPV4_OFFSET + 3] = src_addr >> 24;
-	input->byte_stream[IXGBE_ATR_SRC_IPV4_OFFSET + 2] =
-	                                               (src_addr >> 16) & 0xff;
-	input->byte_stream[IXGBE_ATR_SRC_IPV4_OFFSET + 1] =
-	                                                (src_addr >> 8) & 0xff;
-	input->byte_stream[IXGBE_ATR_SRC_IPV4_OFFSET] = src_addr & 0xff;
+	input->formatted.src_ip[0] = src_addr;
 
 	return 0;
 }
@@ -1403,14 +1360,9 @@ s32 ixgbe_atr_set_src_ipv4_82599(struct ixgbe_atr_input *input, u32 src_addr)
  *  @input: input stream to modify
  *  @dst_addr: the IP address to load
  **/
-s32 ixgbe_atr_set_dst_ipv4_82599(struct ixgbe_atr_input *input, u32 dst_addr)
+s32 ixgbe_atr_set_dst_ipv4_82599(union ixgbe_atr_input *input, __be32 dst_addr)
 {
-	input->byte_stream[IXGBE_ATR_DST_IPV4_OFFSET + 3] = dst_addr >> 24;
-	input->byte_stream[IXGBE_ATR_DST_IPV4_OFFSET + 2] =
-	                                               (dst_addr >> 16) & 0xff;
-	input->byte_stream[IXGBE_ATR_DST_IPV4_OFFSET + 1] =
-	                                                (dst_addr >> 8) & 0xff;
-	input->byte_stream[IXGBE_ATR_DST_IPV4_OFFSET] = dst_addr & 0xff;
+	input->formatted.dst_ip[0] = dst_addr;
 
 	return 0;
 }
@@ -1420,10 +1372,9 @@ s32 ixgbe_atr_set_dst_ipv4_82599(struct ixgbe_atr_input *input, u32 dst_addr)
  *  @input: input stream to modify
  *  @src_port: the source port to load
  **/
-s32 ixgbe_atr_set_src_port_82599(struct ixgbe_atr_input *input, u16 src_port)
+s32 ixgbe_atr_set_src_port_82599(union ixgbe_atr_input *input, __be16 src_port)
 {
-	input->byte_stream[IXGBE_ATR_SRC_PORT_OFFSET + 1] = src_port >> 8;
-	input->byte_stream[IXGBE_ATR_SRC_PORT_OFFSET] = src_port & 0xff;
+	input->formatted.src_port = src_port;
 
 	return 0;
 }
@@ -1433,10 +1384,9 @@ s32 ixgbe_atr_set_src_port_82599(struct ixgbe_atr_input *input, u16 src_port)
  *  @input: input stream to modify
  *  @dst_port: the destination port to load
  **/
-s32 ixgbe_atr_set_dst_port_82599(struct ixgbe_atr_input *input, u16 dst_port)
+s32 ixgbe_atr_set_dst_port_82599(union ixgbe_atr_input *input, __be16 dst_port)
 {
-	input->byte_stream[IXGBE_ATR_DST_PORT_OFFSET + 1] = dst_port >> 8;
-	input->byte_stream[IXGBE_ATR_DST_PORT_OFFSET] = dst_port & 0xff;
+	input->formatted.dst_port = dst_port;
 
 	return 0;
 }
@@ -1446,10 +1396,10 @@ s32 ixgbe_atr_set_dst_port_82599(struct ixgbe_atr_input *input, u16 dst_port)
  *  @input: input stream to modify
  *  @flex_bytes: the flexible bytes to load
  **/
-s32 ixgbe_atr_set_flex_byte_82599(struct ixgbe_atr_input *input, u16 flex_byte)
+s32 ixgbe_atr_set_flex_byte_82599(union ixgbe_atr_input *input,
+				  __be16 flex_bytes)
 {
-	input->byte_stream[IXGBE_ATR_FLEX_BYTE_OFFSET + 1] = flex_byte >> 8;
-	input->byte_stream[IXGBE_ATR_FLEX_BYTE_OFFSET] = flex_byte & 0xff;
+	input->formatted.flex_bytes = flex_bytes;
 
 	return 0;
 }
@@ -1459,9 +1409,9 @@ s32 ixgbe_atr_set_flex_byte_82599(struct ixgbe_atr_input *input, u16 flex_byte)
  *  @input: input stream to modify
  *  @l4type: the layer 4 type value to load
  **/
-s32 ixgbe_atr_set_l4type_82599(struct ixgbe_atr_input *input, u8 l4type)
+s32 ixgbe_atr_set_l4type_82599(union ixgbe_atr_input *input, u8 l4type)
 {
-	input->byte_stream[IXGBE_ATR_L4TYPE_OFFSET] = l4type;
+	input->formatted.flow_type = l4type;
 
 	return 0;
 }
@@ -1471,10 +1421,9 @@ s32 ixgbe_atr_set_l4type_82599(struct ixgbe_atr_input *input, u8 l4type)
  *  @input: input stream to search
  *  @vlan: the VLAN id to load
  **/
-static s32 ixgbe_atr_get_vlan_id_82599(struct ixgbe_atr_input *input, u16 *vlan)
+static s32 ixgbe_atr_get_vlan_id_82599(union ixgbe_atr_input *input, __be16 *vlan)
 {
-	*vlan = input->byte_stream[IXGBE_ATR_VLAN_OFFSET];
-	*vlan |= input->byte_stream[IXGBE_ATR_VLAN_OFFSET + 1] << 8;
+	*vlan = input->formatted.vlan_id;
 
 	return 0;
 }
@@ -1484,13 +1433,10 @@ static s32 ixgbe_atr_get_vlan_id_82599(struct ixgbe_atr_input *input, u16 *vlan)
  *  @input: input stream to search
  *  @src_addr: the IP address to load
  **/
-static s32 ixgbe_atr_get_src_ipv4_82599(struct ixgbe_atr_input *input,
-                                        u32 *src_addr)
+static s32 ixgbe_atr_get_src_ipv4_82599(union ixgbe_atr_input *input,
+                                        __be32 *src_addr)
 {
-	*src_addr = input->byte_stream[IXGBE_ATR_SRC_IPV4_OFFSET];
-	*src_addr |= input->byte_stream[IXGBE_ATR_SRC_IPV4_OFFSET + 1] << 8;
-	*src_addr |= input->byte_stream[IXGBE_ATR_SRC_IPV4_OFFSET + 2] << 16;
-	*src_addr |= input->byte_stream[IXGBE_ATR_SRC_IPV4_OFFSET + 3] << 24;
+	*src_addr = input->formatted.src_ip[0];
 
 	return 0;
 }
@@ -1500,13 +1446,10 @@ static s32 ixgbe_atr_get_src_ipv4_82599(struct ixgbe_atr_input *input,
  *  @input: input stream to search
  *  @dst_addr: the IP address to load
  **/
-static s32 ixgbe_atr_get_dst_ipv4_82599(struct ixgbe_atr_input *input,
-                                        u32 *dst_addr)
+static s32 ixgbe_atr_get_dst_ipv4_82599(union ixgbe_atr_input *input,
+                                        __be32 *dst_addr)
 {
-	*dst_addr = input->byte_stream[IXGBE_ATR_DST_IPV4_OFFSET];
-	*dst_addr |= input->byte_stream[IXGBE_ATR_DST_IPV4_OFFSET + 1] << 8;
-	*dst_addr |= input->byte_stream[IXGBE_ATR_DST_IPV4_OFFSET + 2] << 16;
-	*dst_addr |= input->byte_stream[IXGBE_ATR_DST_IPV4_OFFSET + 3] << 24;
+	*dst_addr = input->formatted.dst_ip[0];
 
 	return 0;
 }
@@ -1519,29 +1462,14 @@ static s32 ixgbe_atr_get_dst_ipv4_82599(struct ixgbe_atr_input *input,
  *  @src_addr_3: the third 4 bytes of the IP address to load
  *  @src_addr_4: the fourth 4 bytes of the IP address to load
  **/
-static s32 ixgbe_atr_get_src_ipv6_82599(struct ixgbe_atr_input *input,
-                                        u32 *src_addr_1, u32 *src_addr_2,
-                                        u32 *src_addr_3, u32 *src_addr_4)
+static s32 ixgbe_atr_get_src_ipv6_82599(union ixgbe_atr_input *input,
+                                        __be32 *src_addr_0, __be32 *src_addr_1,
+                                        __be32 *src_addr_2, __be32 *src_addr_3)
 {
-	*src_addr_1 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 12];
-	*src_addr_1 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 13] << 8;
-	*src_addr_1 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 14] << 16;
-	*src_addr_1 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 15] << 24;
-
-	*src_addr_2 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 8];
-	*src_addr_2 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 9] << 8;
-	*src_addr_2 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 10] << 16;
-	*src_addr_2 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 11] << 24;
-
-	*src_addr_3 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 4];
-	*src_addr_3 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 5] << 8;
-	*src_addr_3 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 6] << 16;
-	*src_addr_3 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 7] << 24;
-
-	*src_addr_4 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET];
-	*src_addr_4 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 1] << 8;
-	*src_addr_4 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 2] << 16;
-	*src_addr_4 = input->byte_stream[IXGBE_ATR_SRC_IPV6_OFFSET + 3] << 24;
+	*src_addr_0 = input->formatted.src_ip[0];
+	*src_addr_1 = input->formatted.src_ip[1];
+	*src_addr_2 = input->formatted.src_ip[2];
+	*src_addr_3 = input->formatted.src_ip[3];
 
 	return 0;
 }
@@ -1556,11 +1484,10 @@ static s32 ixgbe_atr_get_src_ipv6_82599(struct ixgbe_atr_input *input,
  *  endianness when retrieving the data.  This can be confusing since the
  *  internal hash engine expects it to be big-endian.
  **/
-static s32 ixgbe_atr_get_src_port_82599(struct ixgbe_atr_input *input,
-                                        u16 *src_port)
+static s32 ixgbe_atr_get_src_port_82599(union ixgbe_atr_input *input,
+                                        __be16 *src_port)
 {
-	*src_port = input->byte_stream[IXGBE_ATR_SRC_PORT_OFFSET] << 8;
-	*src_port |= input->byte_stream[IXGBE_ATR_SRC_PORT_OFFSET + 1];
+	*src_port = input->formatted.src_port;
 
 	return 0;
 }
@@ -1575,11 +1502,10 @@ static s32 ixgbe_atr_get_src_port_82599(struct ixgbe_atr_input *input,
  *  endianness when retrieving the data.  This can be confusing since the
  *  internal hash engine expects it to be big-endian.
  **/
-static s32 ixgbe_atr_get_dst_port_82599(struct ixgbe_atr_input *input,
-                                        u16 *dst_port)
+static s32 ixgbe_atr_get_dst_port_82599(union ixgbe_atr_input *input,
+                                        __be16 *dst_port)
 {
-	*dst_port = input->byte_stream[IXGBE_ATR_DST_PORT_OFFSET] << 8;
-	*dst_port |= input->byte_stream[IXGBE_ATR_DST_PORT_OFFSET + 1];
+	*dst_port = input->formatted.dst_port;
 
 	return 0;
 }
@@ -1589,11 +1515,10 @@ static s32 ixgbe_atr_get_dst_port_82599(struct ixgbe_atr_input *input,
  *  @input: input stream to modify
  *  @flex_bytes: the flexible bytes to load
  **/
-static s32 ixgbe_atr_get_flex_byte_82599(struct ixgbe_atr_input *input,
-                                         u16 *flex_byte)
+static s32 ixgbe_atr_get_flex_byte_82599(union ixgbe_atr_input *input,
+                                         __be16 *flex_bytes)
 {
-	*flex_byte = input->byte_stream[IXGBE_ATR_FLEX_BYTE_OFFSET];
-	*flex_byte |= input->byte_stream[IXGBE_ATR_FLEX_BYTE_OFFSET + 1] << 8;
+	*flex_bytes = input->formatted.flex_bytes;
 
 	return 0;
 }
@@ -1603,10 +1528,10 @@ static s32 ixgbe_atr_get_flex_byte_82599(struct ixgbe_atr_input *input,
  *  @input: input stream to modify
  *  @l4type: the layer 4 type value to load
  **/
-static s32 ixgbe_atr_get_l4type_82599(struct ixgbe_atr_input *input,
+static s32 ixgbe_atr_get_l4type_82599(union ixgbe_atr_input *input,
                                       u8 *l4type)
 {
-	*l4type = input->byte_stream[IXGBE_ATR_L4TYPE_OFFSET];
+	*l4type = input->formatted.flow_type;
 
 	return 0;
 }
@@ -1618,57 +1543,49 @@ static s32 ixgbe_atr_get_l4type_82599(struct ixgbe_atr_input *input,
  *  @queue: queue index to direct traffic to
  **/
 s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
-                                          struct ixgbe_atr_input *input,
+                                          union ixgbe_atr_input *input,
                                           u8 queue)
 {
 	u64  fdirhashcmd;
-	u64  fdircmd;
-	u32  fdirhash;
-	u16  bucket_hash, sig_hash;
-	u8   l4type;
-
-	bucket_hash = ixgbe_atr_compute_hash_82599(input,
-	                                           IXGBE_ATR_BUCKET_HASH_KEY);
-
-	/* bucket_hash is only 15 bits */
-	bucket_hash &= IXGBE_ATR_HASH_MASK;
-
-	sig_hash = ixgbe_atr_compute_hash_82599(input,
-	                                        IXGBE_ATR_SIGNATURE_HASH_KEY);
-
-	/* Get the l4type in order to program FDIRCMD properly */
-	/* lowest 2 bits are FDIRCMD.L4TYPE, third lowest bit is FDIRCMD.IPV6 */
-	ixgbe_atr_get_l4type_82599(input, &l4type);
+	u32  fdircmd;
+	u32  bucket_hash, sig_hash;
 
 	/*
-	 * The lower 32-bits of fdirhashcmd is for FDIRHASH, the upper 32-bits
-	 * is for FDIRCMD.  Then do a 64-bit register write from FDIRHASH.
+	 * Get the flow_type in order to program FDIRCMD properly
+	 * lowest 2 bits are FDIRCMD.L4TYPE, third lowest bit is FDIRCMD.IPV6
 	 */
-	fdirhash = sig_hash << IXGBE_FDIRHASH_SIG_SW_INDEX_SHIFT | bucket_hash;
-
-	fdircmd = (IXGBE_FDIRCMD_CMD_ADD_FLOW | IXGBE_FDIRCMD_FILTER_UPDATE |
-	           IXGBE_FDIRCMD_LAST | IXGBE_FDIRCMD_QUEUE_EN);
-
-	switch (l4type & IXGBE_ATR_L4TYPE_MASK) {
-	case IXGBE_ATR_L4TYPE_TCP:
-		fdircmd |= IXGBE_FDIRCMD_L4TYPE_TCP;
-		break;
-	case IXGBE_ATR_L4TYPE_UDP:
-		fdircmd |= IXGBE_FDIRCMD_L4TYPE_UDP;
-		break;
-	case IXGBE_ATR_L4TYPE_SCTP:
-		fdircmd |= IXGBE_FDIRCMD_L4TYPE_SCTP;
+	switch (input->formatted.flow_type) {
+	case IXGBE_ATR_FLOW_TYPE_TCPV4:
+	case IXGBE_ATR_FLOW_TYPE_UDPV4:
+	case IXGBE_ATR_FLOW_TYPE_SCTPV4:
+	case IXGBE_ATR_FLOW_TYPE_TCPV6:
+	case IXGBE_ATR_FLOW_TYPE_UDPV6:
+	case IXGBE_ATR_FLOW_TYPE_SCTPV6:
 		break;
 	default:
-		hw_dbg(hw, "Error on l4type input\n");
+		hw_dbg(hw, " Error on flow type input\n");
 		return IXGBE_ERR_CONFIG;
 	}
 
-	if (l4type & IXGBE_ATR_L4TYPE_IPV6_MASK)
-		fdircmd |= IXGBE_FDIRCMD_IPV6;
+	/* configure FDIRCMD register */
+	fdircmd = IXGBE_FDIRCMD_CMD_ADD_FLOW | IXGBE_FDIRCMD_FILTER_UPDATE |
+	          IXGBE_FDIRCMD_LAST | IXGBE_FDIRCMD_QUEUE_EN;
+	fdircmd |= input->formatted.flow_type << IXGBE_FDIRCMD_FLOW_TYPE_SHIFT;
+	fdircmd |= (u32)queue << IXGBE_FDIRCMD_RX_QUEUE_SHIFT;
 
-	fdircmd |= ((u64)queue << IXGBE_FDIRCMD_RX_QUEUE_SHIFT);
-	fdirhashcmd = ((fdircmd << 32) | fdirhash);
+	/*
+	 * The lower 32-bits of fdirhashcmd is for FDIRHASH, the upper 32-bits
+	 * is for FDIRCMD.  Then do a 64-bit register write from FDIRHASH.
+	 */
+	fdirhashcmd = (u64)fdircmd << 32;
+
+	sig_hash = ixgbe_atr_compute_hash_82599(input,
+	                                        IXGBE_ATR_SIGNATURE_HASH_KEY);
+	fdirhashcmd |= sig_hash << IXGBE_FDIRHASH_SIG_SW_INDEX_SHIFT;
+
+	bucket_hash = ixgbe_atr_compute_hash_82599(input,
+	                                           IXGBE_ATR_BUCKET_HASH_KEY);
+	fdirhashcmd |= bucket_hash;
 
 	IXGBE_WRITE_REG64(hw, IXGBE_FDIRHASH, fdirhashcmd);
 
@@ -1687,7 +1604,7 @@ s32 ixgbe_fdir_add_signature_filter_82599(struct ixgbe_hw *hw,
  *  hardware writes must be protected from one another.
  **/
 s32 ixgbe_fdir_add_perfect_filter_82599(struct ixgbe_hw *hw,
-                                      struct ixgbe_atr_input *input,
+                                      union ixgbe_atr_input *input,
                                       struct ixgbe_atr_input_masks *input_masks,
                                       u16 soft_id, u8 queue)
 {
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index a8bab15..76e40e2 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -2278,7 +2278,7 @@ static int ixgbe_set_rx_ntuple(struct net_device *dev,
 {
 	struct ixgbe_adapter *adapter = netdev_priv(dev);
 	struct ethtool_rx_ntuple_flow_spec fs = cmd->fs;
-	struct ixgbe_atr_input input_struct;
+	union ixgbe_atr_input input_struct;
 	struct ixgbe_atr_input_masks input_masks;
 	int target_queue;
 
@@ -2293,7 +2293,7 @@ static int ixgbe_set_rx_ntuple(struct net_device *dev,
 	    (fs.action < ETHTOOL_RXNTUPLE_ACTION_DROP))
 		return -EINVAL;
 
-	memset(&input_struct, 0, sizeof(struct ixgbe_atr_input));
+	memset(&input_struct, 0, sizeof(union ixgbe_atr_input));
 	memset(&input_masks, 0, sizeof(struct ixgbe_atr_input_masks));
 
 	input_masks.src_ip_mask = fs.m_u.tcp_ip4_spec.ip4src;
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index e8ae311..26718ab 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -6509,21 +6509,20 @@ static void ixgbe_tx_queue(struct ixgbe_ring *tx_ring,
 static void ixgbe_atr(struct ixgbe_adapter *adapter, struct sk_buff *skb,
 		      u8 queue, u32 tx_flags, __be16 protocol)
 {
-	struct ixgbe_atr_input atr_input;
+	union ixgbe_atr_input atr_input;
 	struct iphdr *iph = ip_hdr(skb);
 	struct ethhdr *eth = (struct ethhdr *)skb->data;
 	struct tcphdr *th;
-	u16 vlan_id;
+	__be16 vlan_id;
 
 	/* Right now, we support IPv4 w/ TCP only */
 	if (protocol != htons(ETH_P_IP) ||
 	    iph->protocol != IPPROTO_TCP)
 		return;
 
-	memset(&atr_input, 0, sizeof(struct ixgbe_atr_input));
+	memset(&atr_input, 0, sizeof(union ixgbe_atr_input));
 
-	vlan_id = (tx_flags & IXGBE_TX_FLAGS_VLAN_MASK) >>
-		   IXGBE_TX_FLAGS_VLAN_SHIFT;
+	vlan_id = htons(tx_flags >> IXGBE_TX_FLAGS_VLAN_SHIFT);
 
 	th = tcp_hdr(skb);
 
@@ -6531,7 +6530,7 @@ static void ixgbe_atr(struct ixgbe_adapter *adapter, struct sk_buff *skb,
 	ixgbe_atr_set_src_port_82599(&atr_input, th->dest);
 	ixgbe_atr_set_dst_port_82599(&atr_input, th->source);
 	ixgbe_atr_set_flex_byte_82599(&atr_input, eth->h_proto);
-	ixgbe_atr_set_l4type_82599(&atr_input, IXGBE_ATR_L4TYPE_TCP);
+	ixgbe_atr_set_l4type_82599(&atr_input, IXGBE_ATR_FLOW_TYPE_TCPV4);
 	/* src and dst are inverted, think how the receiver sees them */
 	ixgbe_atr_set_src_ipv4_82599(&atr_input, iph->daddr);
 	ixgbe_atr_set_dst_ipv4_82599(&atr_input, iph->saddr);
diff --git a/drivers/net/ixgbe/ixgbe_type.h b/drivers/net/ixgbe/ixgbe_type.h
index 446f3467..c56a712 100644
--- a/drivers/net/ixgbe/ixgbe_type.h
+++ b/drivers/net/ixgbe/ixgbe_type.h
@@ -1990,6 +1990,7 @@ enum ixgbe_fdir_pballoc_type {
 #define IXGBE_FDIRCMD_LAST                      0x00000800
 #define IXGBE_FDIRCMD_COLLISION                 0x00001000
 #define IXGBE_FDIRCMD_QUEUE_EN                  0x00008000
+#define IXGBE_FDIRCMD_FLOW_TYPE_SHIFT           5
 #define IXGBE_FDIRCMD_RX_QUEUE_SHIFT            16
 #define IXGBE_FDIRCMD_VT_POOL_SHIFT             24
 #define IXGBE_FDIR_INIT_DONE_POLL               10
@@ -2147,51 +2148,63 @@ typedef u32 ixgbe_physical_layer;
 #define FC_LOW_WATER(MTU)  (2 * (2 * PAUSE_MTU(MTU) + PAUSE_RTT))
 
 /* Software ATR hash keys */
-#define IXGBE_ATR_BUCKET_HASH_KEY    0xE214AD3D
-#define IXGBE_ATR_SIGNATURE_HASH_KEY 0x14364D17
-
-/* Software ATR input stream offsets and masks */
-#define IXGBE_ATR_VLAN_OFFSET       0
-#define IXGBE_ATR_SRC_IPV6_OFFSET   2
-#define IXGBE_ATR_SRC_IPV4_OFFSET  14
-#define IXGBE_ATR_DST_IPV6_OFFSET  18
-#define IXGBE_ATR_DST_IPV4_OFFSET  30
-#define IXGBE_ATR_SRC_PORT_OFFSET  34
-#define IXGBE_ATR_DST_PORT_OFFSET  36
-#define IXGBE_ATR_FLEX_BYTE_OFFSET 38
-#define IXGBE_ATR_VM_POOL_OFFSET   40
-#define IXGBE_ATR_L4TYPE_OFFSET    41
+#define IXGBE_ATR_BUCKET_HASH_KEY    0x3DAD14E2
+#define IXGBE_ATR_SIGNATURE_HASH_KEY 0x174D3614
 
+/* Software ATR input stream values and masks */
+#define IXGBE_ATR_HASH_MASK     0x7fff
 #define IXGBE_ATR_L4TYPE_MASK      0x3
-#define IXGBE_ATR_L4TYPE_IPV6_MASK 0x4
 #define IXGBE_ATR_L4TYPE_UDP       0x1
 #define IXGBE_ATR_L4TYPE_TCP       0x2
 #define IXGBE_ATR_L4TYPE_SCTP      0x3
-#define IXGBE_ATR_HASH_MASK     0x7fff
+#define IXGBE_ATR_L4TYPE_IPV6_MASK 0x4
+enum ixgbe_atr_flow_type {
+	IXGBE_ATR_FLOW_TYPE_IPV4   = 0x0,
+	IXGBE_ATR_FLOW_TYPE_UDPV4  = 0x1,
+	IXGBE_ATR_FLOW_TYPE_TCPV4  = 0x2,
+	IXGBE_ATR_FLOW_TYPE_SCTPV4 = 0x3,
+	IXGBE_ATR_FLOW_TYPE_IPV6   = 0x4,
+	IXGBE_ATR_FLOW_TYPE_UDPV6  = 0x5,
+	IXGBE_ATR_FLOW_TYPE_TCPV6  = 0x6,
+	IXGBE_ATR_FLOW_TYPE_SCTPV6 = 0x7,
+};
 
 /* Flow Director ATR input struct. */
-struct ixgbe_atr_input {
-	/* Byte layout in order, all values with MSB first:
+union ixgbe_atr_input {
+	/*
+	 * Byte layout in order, all values with MSB first:
 	 *
+	 * vm_pool    - 1 byte
+	 * flow_type  - 1 byte
 	 * vlan_id    - 2 bytes
 	 * src_ip     - 16 bytes
 	 * dst_ip     - 16 bytes
 	 * src_port   - 2 bytes
 	 * dst_port   - 2 bytes
 	 * flex_bytes - 2 bytes
-	 * vm_pool    - 1 byte
-	 * l4type     - 1 byte
+	 * rsvd0      - 2 bytes - space reserved must be 0.
 	 */
-	u8 byte_stream[42];
+	struct {
+		u8     vm_pool;
+		u8     flow_type;
+		__be16 vlan_id;
+		__be32 dst_ip[4];
+		__be32 src_ip[4];
+		__be16 src_port;
+		__be16 dst_port;
+		__be16 flex_bytes;
+		__be16 rsvd0;
+	} formatted;
+	__be32 dword_stream[11];
 };
 
 struct ixgbe_atr_input_masks {
-	u32 src_ip_mask;
-	u32 dst_ip_mask;
-	u16 src_port_mask;
-	u16 dst_port_mask;
-	u16 vlan_id_mask;
-	u16 data_mask;
+	__be32 src_ip_mask;
+	__be32 dst_ip_mask;
+	__be16 src_port_mask;
+	__be16 dst_port_mask;
+	__be16 vlan_id_mask;
+	__be16 data_mask;
 };
 
 enum ixgbe_eeprom_type {
-- 
1.7.3.4


^ permalink raw reply related

* [net-next 09/12] ixgbe: make sure per Rx queue is disabled before unmapping the receive buffer
From: jeffrey.t.kirsher @ 2011-01-07  0:29 UTC (permalink / raw)
  To: davem, davem; +Cc: Yi Zou, netdev, gosp, bphilips, Jeff Kirsher
In-Reply-To: <1294360199-9860-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Yi Zou <yi.zou@intel.com>

When disable the Rx logic globally, we would also want to disable the per Rx
queue receive logic by per queue Rx control register RXDCTL so no more DMA is
happening from the packet buffer to the receive buffer associated with the Rx
ring, before we start unmapping Rx ring receive buffer. The hardware may take
max of 100us before the corresponding Rx queue is really disabled. Added
ixgbe_disable_rx_queue() for this purpose.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgbe/ixgbe.h         |    2 +
 drivers/net/ixgbe/ixgbe_ethtool.c |    4 +--
 drivers/net/ixgbe/ixgbe_main.c    |   40 +++++++++++++++++++++++++++++++++---
 3 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index 3ae30b8..bdeaa9e 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -508,6 +508,8 @@ extern void ixgbe_free_rx_resources(struct ixgbe_ring *);
 extern void ixgbe_free_tx_resources(struct ixgbe_ring *);
 extern void ixgbe_configure_rx_ring(struct ixgbe_adapter *,struct ixgbe_ring *);
 extern void ixgbe_configure_tx_ring(struct ixgbe_adapter *,struct ixgbe_ring *);
+extern void ixgbe_disable_rx_queue(struct ixgbe_adapter *adapter,
+				   struct ixgbe_ring *);
 extern void ixgbe_update_stats(struct ixgbe_adapter *adapter);
 extern int ixgbe_init_interrupt_scheme(struct ixgbe_adapter *adapter);
 extern void ixgbe_clear_interrupt_scheme(struct ixgbe_adapter *adapter);
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 23ff23e..a8bab15 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -1477,9 +1477,7 @@ static void ixgbe_free_desc_rings(struct ixgbe_adapter *adapter)
 	reg_ctl = IXGBE_READ_REG(hw, IXGBE_RXCTRL);
 	reg_ctl &= ~IXGBE_RXCTRL_RXEN;
 	IXGBE_WRITE_REG(hw, IXGBE_RXCTRL, reg_ctl);
-	reg_ctl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(rx_ring->reg_idx));
-	reg_ctl &= ~IXGBE_RXDCTL_ENABLE;
-	IXGBE_WRITE_REG(hw, IXGBE_RXDCTL(rx_ring->reg_idx), reg_ctl);
+	ixgbe_disable_rx_queue(adapter, rx_ring);
 
 	/* now Tx */
 	reg_ctl = IXGBE_READ_REG(hw, IXGBE_TXDCTL(tx_ring->reg_idx));
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 38ab4f3..e8ae311 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -3024,6 +3024,36 @@ static void ixgbe_rx_desc_queue_enable(struct ixgbe_adapter *adapter,
 	}
 }
 
+void ixgbe_disable_rx_queue(struct ixgbe_adapter *adapter,
+			    struct ixgbe_ring *ring)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	int wait_loop = IXGBE_MAX_RX_DESC_POLL;
+	u32 rxdctl;
+	u8 reg_idx = ring->reg_idx;
+
+	rxdctl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(reg_idx));
+	rxdctl &= ~IXGBE_RXDCTL_ENABLE;
+
+	/* write value back with RXDCTL.ENABLE bit cleared */
+	IXGBE_WRITE_REG(hw, IXGBE_RXDCTL(reg_idx), rxdctl);
+
+	if (hw->mac.type == ixgbe_mac_82598EB &&
+	    !(IXGBE_READ_REG(hw, IXGBE_LINKS) & IXGBE_LINKS_UP))
+		return;
+
+	/* the hardware may take up to 100us to really disable the rx queue */
+	do {
+		udelay(10);
+		rxdctl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(reg_idx));
+	} while (--wait_loop && (rxdctl & IXGBE_RXDCTL_ENABLE));
+
+	if (!wait_loop) {
+		e_err(drv, "RXDCTL.ENABLE on Rx queue %d not cleared within "
+		      "the polling period\n", reg_idx);
+	}
+}
+
 void ixgbe_configure_rx_ring(struct ixgbe_adapter *adapter,
 			     struct ixgbe_ring *ring)
 {
@@ -3034,9 +3064,7 @@ void ixgbe_configure_rx_ring(struct ixgbe_adapter *adapter,
 
 	/* disable queue to avoid issues while updating state */
 	rxdctl = IXGBE_READ_REG(hw, IXGBE_RXDCTL(reg_idx));
-	IXGBE_WRITE_REG(hw, IXGBE_RXDCTL(reg_idx),
-			rxdctl & ~IXGBE_RXDCTL_ENABLE);
-	IXGBE_WRITE_FLUSH(hw);
+	ixgbe_disable_rx_queue(adapter, ring);
 
 	IXGBE_WRITE_REG(hw, IXGBE_RDBAL(reg_idx), (rdba & DMA_BIT_MASK(32)));
 	IXGBE_WRITE_REG(hw, IXGBE_RDBAH(reg_idx), (rdba >> 32));
@@ -4064,7 +4092,11 @@ void ixgbe_down(struct ixgbe_adapter *adapter)
 	rxctrl = IXGBE_READ_REG(hw, IXGBE_RXCTRL);
 	IXGBE_WRITE_REG(hw, IXGBE_RXCTRL, rxctrl & ~IXGBE_RXCTRL_RXEN);
 
-	IXGBE_WRITE_FLUSH(hw);
+	/* disable all enabled rx queues */
+	for (i = 0; i < adapter->num_rx_queues; i++)
+		/* this call also flushes the previous write */
+		ixgbe_disable_rx_queue(adapter, adapter->rx_ring[i]);
+
 	msleep(10);
 
 	netif_tx_stop_all_queues(netdev);
-- 
1.7.3.4


^ permalink raw reply related

* [net-next 08/12] ixgb: convert to new VLAN model
From: jeffrey.t.kirsher @ 2011-01-07  0:29 UTC (permalink / raw)
  To: davem, davem
  Cc: Emil Tantilov, netdev, gosp, bphilips, Jesse Gross, Jeff Kirsher
In-Reply-To: <1294360199-9860-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Emil Tantilov <emil.s.tantilov@intel.com>

Based on a patch from Jesse Gross <jesse@nicira.com>

This switches the ixgb driver to use the new VLAN interfaces.
In doing this, it completes the work begun in
ae54496f9e8d40c89e5668205c181dccfa9ecda1 allowing the use of
hardware VLAN insertion without having a VLAN group configured.

CC: Jesse Gross <jesse@nicira.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Jeff Pieper jeffrey.e.pieper@intel.com
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ixgb/ixgb.h         |    2 +-
 drivers/net/ixgb/ixgb_ethtool.c |   35 +++++++++++++++++++++++++
 drivers/net/ixgb/ixgb_main.c    |   54 ++++++++------------------------------
 3 files changed, 48 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ixgb/ixgb.h b/drivers/net/ixgb/ixgb.h
index 521c0c7..8f3df04 100644
--- a/drivers/net/ixgb/ixgb.h
+++ b/drivers/net/ixgb/ixgb.h
@@ -149,7 +149,7 @@ struct ixgb_desc_ring {
 
 struct ixgb_adapter {
 	struct timer_list watchdog_timer;
-	struct vlan_group *vlgrp;
+	unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
 	u32 bd_number;
 	u32 rx_buffer_len;
 	u32 part_num;
diff --git a/drivers/net/ixgb/ixgb_ethtool.c b/drivers/net/ixgb/ixgb_ethtool.c
index 43994c1..1294161 100644
--- a/drivers/net/ixgb/ixgb_ethtool.c
+++ b/drivers/net/ixgb/ixgb_ethtool.c
@@ -706,6 +706,39 @@ ixgb_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
 	}
 }
 
+static int ixgb_set_flags(struct net_device *netdev, u32 data)
+{
+	struct ixgb_adapter *adapter = netdev_priv(netdev);
+	bool need_reset;
+	int rc;
+
+	/* 
+	 * TX vlan insertion does not work per HW design when Rx stripping is
+	 * disabled.  Disable txvlan when rxvlan is off.
+	 */
+	if ((data & ETH_FLAG_RXVLAN) != (netdev->features & NETIF_F_HW_VLAN_RX))
+		data ^= ETH_FLAG_TXVLAN;
+
+	need_reset = (data & ETH_FLAG_RXVLAN) !=
+		     (netdev->features & NETIF_F_HW_VLAN_RX);
+
+	rc = ethtool_op_set_flags(netdev, data, ETH_FLAG_RXVLAN |
+						ETH_FLAG_TXVLAN);
+	if (rc)
+		return rc;
+
+	if (need_reset) {
+		if (netif_running(netdev)) {
+			ixgb_down(adapter, true);
+			ixgb_up(adapter);
+			ixgb_set_speed_duplex(netdev);
+		} else
+			ixgb_reset(adapter);
+	}
+
+	return 0;
+}
+
 static const struct ethtool_ops ixgb_ethtool_ops = {
 	.get_settings = ixgb_get_settings,
 	.set_settings = ixgb_set_settings,
@@ -732,6 +765,8 @@ static const struct ethtool_ops ixgb_ethtool_ops = {
 	.phys_id = ixgb_phys_id,
 	.get_sset_count = ixgb_get_sset_count,
 	.get_ethtool_stats = ixgb_get_ethtool_stats,
+	.get_flags = ethtool_op_get_flags,
+	.set_flags = ixgb_set_flags,
 };
 
 void ixgb_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
index 5639ccc..0f681ac 100644
--- a/drivers/net/ixgb/ixgb_main.c
+++ b/drivers/net/ixgb/ixgb_main.c
@@ -100,8 +100,6 @@ static void ixgb_tx_timeout_task(struct work_struct *work);
 
 static void ixgb_vlan_strip_enable(struct ixgb_adapter *adapter);
 static void ixgb_vlan_strip_disable(struct ixgb_adapter *adapter);
-static void ixgb_vlan_rx_register(struct net_device *netdev,
-                                  struct vlan_group *grp);
 static void ixgb_vlan_rx_add_vid(struct net_device *netdev, u16 vid);
 static void ixgb_vlan_rx_kill_vid(struct net_device *netdev, u16 vid);
 static void ixgb_restore_vlan(struct ixgb_adapter *adapter);
@@ -336,7 +334,6 @@ static const struct net_device_ops ixgb_netdev_ops = {
 	.ndo_set_mac_address	= ixgb_set_mac,
 	.ndo_change_mtu		= ixgb_change_mtu,
 	.ndo_tx_timeout		= ixgb_tx_timeout,
-	.ndo_vlan_rx_register	= ixgb_vlan_rx_register,
 	.ndo_vlan_rx_add_vid	= ixgb_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= ixgb_vlan_rx_kill_vid,
 #ifdef CONFIG_NET_POLL_CONTROLLER
@@ -1508,7 +1505,7 @@ ixgb_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
                      DESC_NEEDED)))
 		return NETDEV_TX_BUSY;
 
-	if (adapter->vlgrp && vlan_tx_tag_present(skb)) {
+	if (vlan_tx_tag_present(skb)) {
 		tx_flags |= IXGB_TX_FLAGS_VLAN;
 		vlan_id = vlan_tx_tag_get(skb);
 	}
@@ -2049,12 +2046,11 @@ ixgb_clean_rx_irq(struct ixgb_adapter *adapter, int *work_done, int work_to_do)
 		ixgb_rx_checksum(adapter, rx_desc, skb);
 
 		skb->protocol = eth_type_trans(skb, netdev);
-		if (adapter->vlgrp && (status & IXGB_RX_DESC_STATUS_VP)) {
-			vlan_hwaccel_receive_skb(skb, adapter->vlgrp,
-			                        le16_to_cpu(rx_desc->special));
-		} else {
-			netif_receive_skb(skb);
-		}
+		if (status & IXGB_RX_DESC_STATUS_VP)
+			__vlan_hwaccel_put_tag(skb,
+					       le16_to_cpu(rx_desc->special));
+
+		netif_receive_skb(skb);
 
 rxdesc_done:
 		/* clean up descriptor, might be written over by hw */
@@ -2152,20 +2148,6 @@ map_skb:
 	}
 }
 
-/**
- * ixgb_vlan_rx_register - enables or disables vlan tagging/stripping.
- *
- * @param netdev network interface device structure
- * @param grp indicates to enable or disable tagging/stripping
- **/
-static void
-ixgb_vlan_rx_register(struct net_device *netdev, struct vlan_group *grp)
-{
-	struct ixgb_adapter *adapter = netdev_priv(netdev);
-
-	adapter->vlgrp = grp;
-}
-
 static void
 ixgb_vlan_strip_enable(struct ixgb_adapter *adapter)
 {
@@ -2200,6 +2182,7 @@ ixgb_vlan_rx_add_vid(struct net_device *netdev, u16 vid)
 	vfta = IXGB_READ_REG_ARRAY(&adapter->hw, VFTA, index);
 	vfta |= (1 << (vid & 0x1F));
 	ixgb_write_vfta(&adapter->hw, index, vfta);
+	set_bit(vid, adapter->active_vlans);
 }
 
 static void
@@ -2208,35 +2191,22 @@ ixgb_vlan_rx_kill_vid(struct net_device *netdev, u16 vid)
 	struct ixgb_adapter *adapter = netdev_priv(netdev);
 	u32 vfta, index;
 
-	ixgb_irq_disable(adapter);
-
-	vlan_group_set_device(adapter->vlgrp, vid, NULL);
-
-	/* don't enable interrupts unless we are UP */
-	if (adapter->netdev->flags & IFF_UP)
-		ixgb_irq_enable(adapter);
-
 	/* remove VID from filter table */
 
 	index = (vid >> 5) & 0x7F;
 	vfta = IXGB_READ_REG_ARRAY(&adapter->hw, VFTA, index);
 	vfta &= ~(1 << (vid & 0x1F));
 	ixgb_write_vfta(&adapter->hw, index, vfta);
+	clear_bit(vid, adapter->active_vlans);
 }
 
 static void
 ixgb_restore_vlan(struct ixgb_adapter *adapter)
 {
-	ixgb_vlan_rx_register(adapter->netdev, adapter->vlgrp);
-
-	if (adapter->vlgrp) {
-		u16 vid;
-		for (vid = 0; vid < VLAN_N_VID; vid++) {
-			if (!vlan_group_get_device(adapter->vlgrp, vid))
-				continue;
-			ixgb_vlan_rx_add_vid(adapter->netdev, vid);
-		}
-	}
+	u16 vid;
+
+	for_each_set_bit(vid, adapter->active_vlans, VLAN_N_VID)
+		ixgb_vlan_rx_add_vid(adapter->netdev, vid);
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-- 
1.7.3.4


^ permalink raw reply related

* [net-next 07/12] e1000: Add support for the CE4100 reference platform
From: jeffrey.t.kirsher @ 2011-01-07  0:29 UTC (permalink / raw)
  To: davem, davem; +Cc: Dirk Brandewie, netdev, gosp, bphilips, Jeff Kirsher
In-Reply-To: <1294360199-9860-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Dirk Brandewie <dirk.j.brandewie@intel.com>

This patch adds support for the gigabit phys present on the CE4100 reference
platforms.

Signed-off-by:  Dirk Brandewie <dirk.j.brandewie@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/e1000/e1000_hw.c    |  328 +++++++++++++++++++++++++++++++--------
 drivers/net/e1000/e1000_hw.h    |   59 +++++++-
 drivers/net/e1000/e1000_main.c  |   35 ++++
 drivers/net/e1000/e1000_osdep.h |   19 ++-
 4 files changed, 365 insertions(+), 76 deletions(-)

diff --git a/drivers/net/e1000/e1000_hw.c b/drivers/net/e1000/e1000_hw.c
index 77d08e6..aed223b 100644
--- a/drivers/net/e1000/e1000_hw.c
+++ b/drivers/net/e1000/e1000_hw.c
@@ -130,10 +130,15 @@ static s32 e1000_set_phy_type(struct e1000_hw *hw)
 		if (hw->mac_type == e1000_82541 ||
 		    hw->mac_type == e1000_82541_rev_2 ||
 		    hw->mac_type == e1000_82547 ||
-		    hw->mac_type == e1000_82547_rev_2) {
+		    hw->mac_type == e1000_82547_rev_2)
 			hw->phy_type = e1000_phy_igp;
-			break;
-		}
+		break;
+	case RTL8211B_PHY_ID:
+		hw->phy_type = e1000_phy_8211;
+		break;
+	case RTL8201N_PHY_ID:
+		hw->phy_type = e1000_phy_8201;
+		break;
 	default:
 		/* Should never have loaded on this device */
 		hw->phy_type = e1000_phy_undefined;
@@ -318,6 +323,9 @@ s32 e1000_set_mac_type(struct e1000_hw *hw)
 	case E1000_DEV_ID_82547GI:
 		hw->mac_type = e1000_82547_rev_2;
 		break;
+	case E1000_DEV_ID_INTEL_CE4100_GBE:
+		hw->mac_type = e1000_ce4100;
+		break;
 	default:
 		/* Should never have loaded on this device */
 		return -E1000_ERR_MAC_TYPE;
@@ -372,6 +380,9 @@ void e1000_set_media_type(struct e1000_hw *hw)
 		case e1000_82542_rev2_1:
 			hw->media_type = e1000_media_type_fiber;
 			break;
+		case e1000_ce4100:
+			hw->media_type = e1000_media_type_copper;
+			break;
 		default:
 			status = er32(STATUS);
 			if (status & E1000_STATUS_TBIMODE) {
@@ -460,6 +471,7 @@ s32 e1000_reset_hw(struct e1000_hw *hw)
 		/* Reset is performed on a shadow of the control register */
 		ew32(CTRL_DUP, (ctrl | E1000_CTRL_RST));
 		break;
+	case e1000_ce4100:
 	default:
 		ew32(CTRL, (ctrl | E1000_CTRL_RST));
 		break;
@@ -952,6 +964,67 @@ static s32 e1000_setup_fiber_serdes_link(struct e1000_hw *hw)
 }
 
 /**
+ * e1000_copper_link_rtl_setup - Copper link setup for e1000_phy_rtl series.
+ * @hw: Struct containing variables accessed by shared code
+ *
+ * Commits changes to PHY configuration by calling e1000_phy_reset().
+ */
+static s32 e1000_copper_link_rtl_setup(struct e1000_hw *hw)
+{
+	s32 ret_val;
+
+	/* SW reset the PHY so all changes take effect */
+	ret_val = e1000_phy_reset(hw);
+	if (ret_val) {
+		e_dbg("Error Resetting the PHY\n");
+		return ret_val;
+	}
+
+	return E1000_SUCCESS;
+}
+
+static s32 gbe_dhg_phy_setup(struct e1000_hw *hw)
+{
+	s32 ret_val;
+	u32 ctrl_aux;
+
+	switch (hw->phy_type) {
+	case e1000_phy_8211:
+		ret_val = e1000_copper_link_rtl_setup(hw);
+		if (ret_val) {
+			e_dbg("e1000_copper_link_rtl_setup failed!\n");
+			return ret_val;
+		}
+		break;
+	case e1000_phy_8201:
+		/* Set RMII mode */
+		ctrl_aux = er32(CTL_AUX);
+		ctrl_aux |= E1000_CTL_AUX_RMII;
+		ew32(CTL_AUX, ctrl_aux);
+		E1000_WRITE_FLUSH();
+
+		/* Disable the J/K bits required for receive */
+		ctrl_aux = er32(CTL_AUX);
+		ctrl_aux |= 0x4;
+		ctrl_aux &= ~0x2;
+		ew32(CTL_AUX, ctrl_aux);
+		E1000_WRITE_FLUSH();
+		ret_val = e1000_copper_link_rtl_setup(hw);
+
+		if (ret_val) {
+			e_dbg("e1000_copper_link_rtl_setup failed!\n");
+			return ret_val;
+		}
+		break;
+	default:
+		e_dbg("Error Resetting the PHY\n");
+		return E1000_ERR_PHY_TYPE;
+	}
+
+	return E1000_SUCCESS;
+}
+
+/**
  * e1000_copper_link_preconfig - early configuration for copper
  * @hw: Struct containing variables accessed by shared code
  *
@@ -1286,6 +1359,10 @@ static s32 e1000_copper_link_autoneg(struct e1000_hw *hw)
 	if (hw->autoneg_advertised == 0)
 		hw->autoneg_advertised = AUTONEG_ADVERTISE_SPEED_DEFAULT;
 
+	/* IFE/RTL8201N PHY only supports 10/100 */
+	if (hw->phy_type == e1000_phy_8201)
+		hw->autoneg_advertised &= AUTONEG_ADVERTISE_10_100_ALL;
+
 	e_dbg("Reconfiguring auto-neg advertisement params\n");
 	ret_val = e1000_phy_setup_autoneg(hw);
 	if (ret_val) {
@@ -1341,7 +1418,7 @@ static s32 e1000_copper_link_postconfig(struct e1000_hw *hw)
 	s32 ret_val;
 	e_dbg("e1000_copper_link_postconfig");
 
-	if (hw->mac_type >= e1000_82544) {
+	if ((hw->mac_type >= e1000_82544) && (hw->mac_type != e1000_ce4100)) {
 		e1000_config_collision_dist(hw);
 	} else {
 		ret_val = e1000_config_mac_to_phy(hw);
@@ -1395,6 +1472,12 @@ static s32 e1000_setup_copper_link(struct e1000_hw *hw)
 		ret_val = e1000_copper_link_mgp_setup(hw);
 		if (ret_val)
 			return ret_val;
+	} else {
+		ret_val = gbe_dhg_phy_setup(hw);
+		if (ret_val) {
+			e_dbg("gbe_dhg_phy_setup failed!\n");
+			return ret_val;
+		}
 	}
 
 	if (hw->autoneg) {
@@ -1461,10 +1544,11 @@ s32 e1000_phy_setup_autoneg(struct e1000_hw *hw)
 		return ret_val;
 
 	/* Read the MII 1000Base-T Control Register (Address 9). */
-	ret_val =
-	    e1000_read_phy_reg(hw, PHY_1000T_CTRL, &mii_1000t_ctrl_reg);
+	ret_val = e1000_read_phy_reg(hw, PHY_1000T_CTRL, &mii_1000t_ctrl_reg);
 	if (ret_val)
 		return ret_val;
+	else if (hw->phy_type == e1000_phy_8201)
+		mii_1000t_ctrl_reg &= ~REG9_SPEED_MASK;
 
 	/* Need to parse both autoneg_advertised and fc and set up
 	 * the appropriate PHY registers.  First we will parse for
@@ -1577,9 +1661,14 @@ s32 e1000_phy_setup_autoneg(struct e1000_hw *hw)
 
 	e_dbg("Auto-Neg Advertising %x\n", mii_autoneg_adv_reg);
 
-	ret_val = e1000_write_phy_reg(hw, PHY_1000T_CTRL, mii_1000t_ctrl_reg);
-	if (ret_val)
-		return ret_val;
+	if (hw->phy_type == e1000_phy_8201) {
+		mii_1000t_ctrl_reg = 0;
+	} else {
+		ret_val = e1000_write_phy_reg(hw, PHY_1000T_CTRL,
+		                              mii_1000t_ctrl_reg);
+		if (ret_val)
+			return ret_val;
+	}
 
 	return E1000_SUCCESS;
 }
@@ -1860,7 +1949,7 @@ static s32 e1000_config_mac_to_phy(struct e1000_hw *hw)
 
 	/* 82544 or newer MAC, Auto Speed Detection takes care of
 	 * MAC speed/duplex configuration.*/
-	if (hw->mac_type >= e1000_82544)
+	if ((hw->mac_type >= e1000_82544) && (hw->mac_type != e1000_ce4100))
 		return E1000_SUCCESS;
 
 	/* Read the Device Control Register and set the bits to Force Speed
@@ -1870,27 +1959,49 @@ static s32 e1000_config_mac_to_phy(struct e1000_hw *hw)
 	ctrl |= (E1000_CTRL_FRCSPD | E1000_CTRL_FRCDPX);
 	ctrl &= ~(E1000_CTRL_SPD_SEL | E1000_CTRL_ILOS);
 
-	/* Set up duplex in the Device Control and Transmit Control
-	 * registers depending on negotiated values.
-	 */
-	ret_val = e1000_read_phy_reg(hw, M88E1000_PHY_SPEC_STATUS, &phy_data);
-	if (ret_val)
-		return ret_val;
+	switch (hw->phy_type) {
+	case e1000_phy_8201:
+		ret_val = e1000_read_phy_reg(hw, PHY_CTRL, &phy_data);
+		if (ret_val)
+			return ret_val;
 
-	if (phy_data & M88E1000_PSSR_DPLX)
-		ctrl |= E1000_CTRL_FD;
-	else
-		ctrl &= ~E1000_CTRL_FD;
+		if (phy_data & RTL_PHY_CTRL_FD)
+			ctrl |= E1000_CTRL_FD;
+		else
+			ctrl &= ~E1000_CTRL_FD;
 
-	e1000_config_collision_dist(hw);
+		if (phy_data & RTL_PHY_CTRL_SPD_100)
+			ctrl |= E1000_CTRL_SPD_100;
+		else
+			ctrl |= E1000_CTRL_SPD_10;
 
-	/* Set up speed in the Device Control register depending on
-	 * negotiated values.
-	 */
-	if ((phy_data & M88E1000_PSSR_SPEED) == M88E1000_PSSR_1000MBS)
-		ctrl |= E1000_CTRL_SPD_1000;
-	else if ((phy_data & M88E1000_PSSR_SPEED) == M88E1000_PSSR_100MBS)
-		ctrl |= E1000_CTRL_SPD_100;
+		e1000_config_collision_dist(hw);
+		break;
+	default:
+		/* Set up duplex in the Device Control and Transmit Control
+		 * registers depending on negotiated values.
+		 */
+		ret_val = e1000_read_phy_reg(hw, M88E1000_PHY_SPEC_STATUS,
+		                             &phy_data);
+		if (ret_val)
+			return ret_val;
+
+		if (phy_data & M88E1000_PSSR_DPLX)
+			ctrl |= E1000_CTRL_FD;
+		else
+			ctrl &= ~E1000_CTRL_FD;
+
+		e1000_config_collision_dist(hw);
+
+		/* Set up speed in the Device Control register depending on
+		 * negotiated values.
+		 */
+		if ((phy_data & M88E1000_PSSR_SPEED) == M88E1000_PSSR_1000MBS)
+			ctrl |= E1000_CTRL_SPD_1000;
+		else if ((phy_data & M88E1000_PSSR_SPEED) ==
+		         M88E1000_PSSR_100MBS)
+			ctrl |= E1000_CTRL_SPD_100;
+	}
 
 	/* Write the configured values back to the Device Control Reg. */
 	ew32(CTRL, ctrl);
@@ -2401,7 +2512,8 @@ s32 e1000_check_for_link(struct e1000_hw *hw)
 		 * speed/duplex on the MAC to the current PHY speed/duplex
 		 * settings.
 		 */
-		if (hw->mac_type >= e1000_82544)
+		if ((hw->mac_type >= e1000_82544) &&
+		    (hw->mac_type != e1000_ce4100))
 			e1000_config_collision_dist(hw);
 		else {
 			ret_val = e1000_config_mac_to_phy(hw);
@@ -2738,7 +2850,7 @@ static s32 e1000_read_phy_reg_ex(struct e1000_hw *hw, u32 reg_addr,
 {
 	u32 i;
 	u32 mdic = 0;
-	const u32 phy_addr = 1;
+	const u32 phy_addr = (hw->mac_type == e1000_ce4100) ? hw->phy_addr : 1;
 
 	e_dbg("e1000_read_phy_reg_ex");
 
@@ -2752,28 +2864,61 @@ static s32 e1000_read_phy_reg_ex(struct e1000_hw *hw, u32 reg_addr,
 		 * Control register.  The MAC will take care of interfacing with the
 		 * PHY to retrieve the desired data.
 		 */
-		mdic = ((reg_addr << E1000_MDIC_REG_SHIFT) |
-			(phy_addr << E1000_MDIC_PHY_SHIFT) |
-			(E1000_MDIC_OP_READ));
+		if (hw->mac_type == e1000_ce4100) {
+			mdic = ((reg_addr << E1000_MDIC_REG_SHIFT) |
+				(phy_addr << E1000_MDIC_PHY_SHIFT) |
+				(INTEL_CE_GBE_MDIC_OP_READ) |
+				(INTEL_CE_GBE_MDIC_GO));
 
-		ew32(MDIC, mdic);
+			writel(mdic, E1000_MDIO_CMD);
 
-		/* Poll the ready bit to see if the MDI read completed */
-		for (i = 0; i < 64; i++) {
-			udelay(50);
-			mdic = er32(MDIC);
-			if (mdic & E1000_MDIC_READY)
-				break;
-		}
-		if (!(mdic & E1000_MDIC_READY)) {
-			e_dbg("MDI Read did not complete\n");
-			return -E1000_ERR_PHY;
-		}
-		if (mdic & E1000_MDIC_ERROR) {
-			e_dbg("MDI Error\n");
-			return -E1000_ERR_PHY;
+			/* Poll the ready bit to see if the MDI read
+			 * completed
+			 */
+			for (i = 0; i < 64; i++) {
+				udelay(50);
+				mdic = readl(E1000_MDIO_CMD);
+				if (!(mdic & INTEL_CE_GBE_MDIC_GO))
+					break;
+			}
+
+			if (mdic & INTEL_CE_GBE_MDIC_GO) {
+				e_dbg("MDI Read did not complete\n");
+				return -E1000_ERR_PHY;
+			}
+
+			mdic = readl(E1000_MDIO_STS);
+			if (mdic & INTEL_CE_GBE_MDIC_READ_ERROR) {
+				e_dbg("MDI Read Error\n");
+				return -E1000_ERR_PHY;
+			}
+			*phy_data = (u16) mdic;
+		} else {
+			mdic = ((reg_addr << E1000_MDIC_REG_SHIFT) |
+				(phy_addr << E1000_MDIC_PHY_SHIFT) |
+				(E1000_MDIC_OP_READ));
+
+			ew32(MDIC, mdic);
+
+			/* Poll the ready bit to see if the MDI read
+			 * completed
+			 */
+			for (i = 0; i < 64; i++) {
+				udelay(50);
+				mdic = er32(MDIC);
+				if (mdic & E1000_MDIC_READY)
+					break;
+			}
+			if (!(mdic & E1000_MDIC_READY)) {
+				e_dbg("MDI Read did not complete\n");
+				return -E1000_ERR_PHY;
+			}
+			if (mdic & E1000_MDIC_ERROR) {
+				e_dbg("MDI Error\n");
+				return -E1000_ERR_PHY;
+			}
+			*phy_data = (u16) mdic;
 		}
-		*phy_data = (u16) mdic;
 	} else {
 		/* We must first send a preamble through the MDIO pin to signal the
 		 * beginning of an MII instruction.  This is done by sending 32
@@ -2840,7 +2985,7 @@ static s32 e1000_write_phy_reg_ex(struct e1000_hw *hw, u32 reg_addr,
 {
 	u32 i;
 	u32 mdic = 0;
-	const u32 phy_addr = 1;
+	const u32 phy_addr = (hw->mac_type == e1000_ce4100) ? hw->phy_addr : 1;
 
 	e_dbg("e1000_write_phy_reg_ex");
 
@@ -2850,27 +2995,54 @@ static s32 e1000_write_phy_reg_ex(struct e1000_hw *hw, u32 reg_addr,
 	}
 
 	if (hw->mac_type > e1000_82543) {
-		/* Set up Op-code, Phy Address, register address, and data intended
-		 * for the PHY register in the MDI Control register.  The MAC will take
-		 * care of interfacing with the PHY to send the desired data.
+		/* Set up Op-code, Phy Address, register address, and data
+		 * intended for the PHY register in the MDI Control register.
+		 * The MAC will take care of interfacing with the PHY to send
+		 * the desired data.
 		 */
-		mdic = (((u32) phy_data) |
-			(reg_addr << E1000_MDIC_REG_SHIFT) |
-			(phy_addr << E1000_MDIC_PHY_SHIFT) |
-			(E1000_MDIC_OP_WRITE));
+		if (hw->mac_type == e1000_ce4100) {
+			mdic = (((u32) phy_data) |
+				(reg_addr << E1000_MDIC_REG_SHIFT) |
+				(phy_addr << E1000_MDIC_PHY_SHIFT) |
+				(INTEL_CE_GBE_MDIC_OP_WRITE) |
+				(INTEL_CE_GBE_MDIC_GO));
 
-		ew32(MDIC, mdic);
+			writel(mdic, E1000_MDIO_CMD);
 
-		/* Poll the ready bit to see if the MDI read completed */
-		for (i = 0; i < 641; i++) {
-			udelay(5);
-			mdic = er32(MDIC);
-			if (mdic & E1000_MDIC_READY)
-				break;
-		}
-		if (!(mdic & E1000_MDIC_READY)) {
-			e_dbg("MDI Write did not complete\n");
-			return -E1000_ERR_PHY;
+			/* Poll the ready bit to see if the MDI read
+			 * completed
+			 */
+			for (i = 0; i < 640; i++) {
+				udelay(5);
+				mdic = readl(E1000_MDIO_CMD);
+				if (!(mdic & INTEL_CE_GBE_MDIC_GO))
+					break;
+			}
+			if (mdic & INTEL_CE_GBE_MDIC_GO) {
+				e_dbg("MDI Write did not complete\n");
+				return -E1000_ERR_PHY;
+			}
+		} else {
+			mdic = (((u32) phy_data) |
+				(reg_addr << E1000_MDIC_REG_SHIFT) |
+				(phy_addr << E1000_MDIC_PHY_SHIFT) |
+				(E1000_MDIC_OP_WRITE));
+
+			ew32(MDIC, mdic);
+
+			/* Poll the ready bit to see if the MDI read
+			 * completed
+			 */
+			for (i = 0; i < 641; i++) {
+				udelay(5);
+				mdic = er32(MDIC);
+				if (mdic & E1000_MDIC_READY)
+					break;
+			}
+			if (!(mdic & E1000_MDIC_READY)) {
+				e_dbg("MDI Write did not complete\n");
+				return -E1000_ERR_PHY;
+			}
 		}
 	} else {
 		/* We'll need to use the SW defined pins to shift the write command
@@ -3048,6 +3220,11 @@ static s32 e1000_detect_gig_phy(struct e1000_hw *hw)
 		if (hw->phy_id == M88E1011_I_PHY_ID)
 			match = true;
 		break;
+	case e1000_ce4100:
+		if ((hw->phy_id == RTL8211B_PHY_ID) ||
+		    (hw->phy_id == RTL8201N_PHY_ID))
+			match = true;
+		break;
 	case e1000_82541:
 	case e1000_82541_rev_2:
 	case e1000_82547:
@@ -3291,6 +3468,9 @@ s32 e1000_phy_get_info(struct e1000_hw *hw, struct e1000_phy_info *phy_info)
 
 	if (hw->phy_type == e1000_phy_igp)
 		return e1000_phy_igp_get_info(hw, phy_info);
+	else if ((hw->phy_type == e1000_phy_8211) ||
+	         (hw->phy_type == e1000_phy_8201))
+		return E1000_SUCCESS;
 	else
 		return e1000_phy_m88_get_info(hw, phy_info);
 }
@@ -3742,6 +3922,12 @@ static s32 e1000_do_read_eeprom(struct e1000_hw *hw, u16 offset, u16 words,
 
 	e_dbg("e1000_read_eeprom");
 
+	if (hw->mac_type == e1000_ce4100) {
+		GBE_CONFIG_FLASH_READ(GBE_CONFIG_BASE_VIRT, offset, words,
+		                      data);
+		return E1000_SUCCESS;
+	}
+
 	/* If eeprom is not yet detected, do so now */
 	if (eeprom->word_size == 0)
 		e1000_init_eeprom_params(hw);
@@ -3904,6 +4090,12 @@ static s32 e1000_do_write_eeprom(struct e1000_hw *hw, u16 offset, u16 words,
 
 	e_dbg("e1000_write_eeprom");
 
+	if (hw->mac_type == e1000_ce4100) {
+		GBE_CONFIG_FLASH_WRITE(GBE_CONFIG_BASE_VIRT, offset, words,
+		                       data);
+		return E1000_SUCCESS;
+	}
+
 	/* If eeprom is not yet detected, do so now */
 	if (eeprom->word_size == 0)
 		e1000_init_eeprom_params(hw);
diff --git a/drivers/net/e1000/e1000_hw.h b/drivers/net/e1000/e1000_hw.h
index ecd9f6c..f5514a0 100644
--- a/drivers/net/e1000/e1000_hw.h
+++ b/drivers/net/e1000/e1000_hw.h
@@ -52,6 +52,7 @@ typedef enum {
 	e1000_82545,
 	e1000_82545_rev_3,
 	e1000_82546,
+	e1000_ce4100,
 	e1000_82546_rev_3,
 	e1000_82541,
 	e1000_82541_rev_2,
@@ -209,9 +210,11 @@ typedef enum {
 } e1000_1000t_rx_status;
 
 typedef enum {
-    e1000_phy_m88 = 0,
-    e1000_phy_igp,
-    e1000_phy_undefined = 0xFF
+	e1000_phy_m88 = 0,
+	e1000_phy_igp,
+	e1000_phy_8211,
+	e1000_phy_8201,
+	e1000_phy_undefined = 0xFF
 } e1000_phy_type;
 
 typedef enum {
@@ -442,6 +445,7 @@ void e1000_io_write(struct e1000_hw *hw, unsigned long port, u32 value);
 #define E1000_DEV_ID_82547EI             0x1019
 #define E1000_DEV_ID_82547EI_MOBILE      0x101A
 #define E1000_DEV_ID_82546GB_QUAD_COPPER_KSP3 0x10B5
+#define E1000_DEV_ID_INTEL_CE4100_GBE    0x2E6E
 
 #define NODE_ADDRESS_SIZE 6
 #define ETH_LENGTH_OF_ADDRESS 6
@@ -808,6 +812,16 @@ struct e1000_ffvt_entry {
 #define E1000_CTRL_EXT 0x00018	/* Extended Device Control - RW */
 #define E1000_FLA      0x0001C	/* Flash Access - RW */
 #define E1000_MDIC     0x00020	/* MDI Control - RW */
+
+extern void __iomem *ce4100_gbe_mdio_base_virt;
+#define INTEL_CE_GBE_MDIO_RCOMP_BASE    (ce4100_gbe_mdio_base_virt)
+#define E1000_MDIO_STS  (INTEL_CE_GBE_MDIO_RCOMP_BASE + 0)
+#define E1000_MDIO_CMD  (INTEL_CE_GBE_MDIO_RCOMP_BASE + 4)
+#define E1000_MDIO_DRV  (INTEL_CE_GBE_MDIO_RCOMP_BASE + 8)
+#define E1000_MDC_CMD   (INTEL_CE_GBE_MDIO_RCOMP_BASE + 0xC)
+#define E1000_RCOMP_CTL (INTEL_CE_GBE_MDIO_RCOMP_BASE + 0x20)
+#define E1000_RCOMP_STS (INTEL_CE_GBE_MDIO_RCOMP_BASE + 0x24)
+
 #define E1000_SCTL     0x00024	/* SerDes Control - RW */
 #define E1000_FEXTNVM  0x00028	/* Future Extended NVM register */
 #define E1000_FCAL     0x00028	/* Flow Control Address Low - RW */
@@ -820,6 +834,34 @@ struct e1000_ffvt_entry {
 #define E1000_IMS      0x000D0	/* Interrupt Mask Set - RW */
 #define E1000_IMC      0x000D8	/* Interrupt Mask Clear - WO */
 #define E1000_IAM      0x000E0	/* Interrupt Acknowledge Auto Mask */
+
+/* Auxiliary Control Register. This register is CE4100 specific,
+ * RMII/RGMII function is switched by this register - RW
+ * Following are bits definitions of the Auxiliary Control Register
+ */
+#define E1000_CTL_AUX  0x000E0
+#define E1000_CTL_AUX_END_SEL_SHIFT     10
+#define E1000_CTL_AUX_ENDIANESS_SHIFT   8
+#define E1000_CTL_AUX_RGMII_RMII_SHIFT  0
+
+/* descriptor and packet transfer use CTL_AUX.ENDIANESS */
+#define E1000_CTL_AUX_DES_PKT   (0x0 << E1000_CTL_AUX_END_SEL_SHIFT)
+/* descriptor use CTL_AUX.ENDIANESS, packet use default */
+#define E1000_CTL_AUX_DES       (0x1 << E1000_CTL_AUX_END_SEL_SHIFT)
+/* descriptor use default, packet use CTL_AUX.ENDIANESS */
+#define E1000_CTL_AUX_PKT       (0x2 << E1000_CTL_AUX_END_SEL_SHIFT)
+/* all use CTL_AUX.ENDIANESS */
+#define E1000_CTL_AUX_ALL       (0x3 << E1000_CTL_AUX_END_SEL_SHIFT)
+
+#define E1000_CTL_AUX_RGMII     (0x0 << E1000_CTL_AUX_RGMII_RMII_SHIFT)
+#define E1000_CTL_AUX_RMII      (0x1 << E1000_CTL_AUX_RGMII_RMII_SHIFT)
+
+/* LW little endian, Byte big endian */
+#define E1000_CTL_AUX_LWLE_BBE  (0x0 << E1000_CTL_AUX_ENDIANESS_SHIFT)
+#define E1000_CTL_AUX_LWLE_BLE  (0x1 << E1000_CTL_AUX_ENDIANESS_SHIFT)
+#define E1000_CTL_AUX_LWBE_BBE  (0x2 << E1000_CTL_AUX_ENDIANESS_SHIFT)
+#define E1000_CTL_AUX_LWBE_BLE  (0x3 << E1000_CTL_AUX_ENDIANESS_SHIFT)
+
 #define E1000_RCTL     0x00100	/* RX Control - RW */
 #define E1000_RDTR1    0x02820	/* RX Delay Timer (1) - RW */
 #define E1000_RDBAL1   0x02900	/* RX Descriptor Base Address Low (1) - RW */
@@ -1011,6 +1053,7 @@ struct e1000_ffvt_entry {
  * in more current versions of the 8254x. Despite the difference in location,
  * the registers function in the same manner.
  */
+#define E1000_82542_CTL_AUX  E1000_CTL_AUX
 #define E1000_82542_CTRL     E1000_CTRL
 #define E1000_82542_CTRL_DUP E1000_CTRL_DUP
 #define E1000_82542_STATUS   E1000_STATUS
@@ -1571,6 +1614,11 @@ struct e1000_hw {
 #define E1000_MDIC_INT_EN    0x20000000
 #define E1000_MDIC_ERROR     0x40000000
 
+#define INTEL_CE_GBE_MDIC_OP_WRITE      0x04000000
+#define INTEL_CE_GBE_MDIC_OP_READ       0x00000000
+#define INTEL_CE_GBE_MDIC_GO            0x80000000
+#define INTEL_CE_GBE_MDIC_READ_ERROR    0x80000000
+
 #define E1000_KUMCTRLSTA_MASK           0x0000FFFF
 #define E1000_KUMCTRLSTA_OFFSET         0x001F0000
 #define E1000_KUMCTRLSTA_OFFSET_SHIFT   16
@@ -2871,6 +2919,11 @@ struct e1000_host_command_info {
 #define M88E1111_I_PHY_ID  0x01410CC0
 #define L1LXT971A_PHY_ID   0x001378E0
 
+#define RTL8211B_PHY_ID    0x001CC910
+#define RTL8201N_PHY_ID    0x8200
+#define RTL_PHY_CTRL_FD    0x0100 /* Full duplex.0=half; 1=full */
+#define RTL_PHY_CTRL_SPD_100    0x200000 /* Force 100Mb */
+
 /* Bits...
  * 15-5: page
  * 4-0: register offset
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 340e12d..4ff88a6 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -28,6 +28,12 @@
 
 #include "e1000.h"
 #include <net/ip6_checksum.h>
+#include <linux/io.h>
+
+/* Intel Media SOC GbE MDIO physical base address */
+static unsigned long ce4100_gbe_mdio_base_phy;
+/* Intel Media SOC GbE MDIO virtual base address */
+void __iomem *ce4100_gbe_mdio_base_virt;
 
 char e1000_driver_name[] = "e1000";
 static char e1000_driver_string[] = "Intel(R) PRO/1000 Network Driver";
@@ -79,6 +85,7 @@ static DEFINE_PCI_DEVICE_TABLE(e1000_pci_tbl) = {
 	INTEL_E1000_ETHERNET_DEVICE(0x108A),
 	INTEL_E1000_ETHERNET_DEVICE(0x1099),
 	INTEL_E1000_ETHERNET_DEVICE(0x10B5),
+	INTEL_E1000_ETHERNET_DEVICE(0x2E6E),
 	/* required last entry */
 	{0,}
 };
@@ -459,6 +466,7 @@ static void e1000_power_down_phy(struct e1000_adapter *adapter)
 		case e1000_82545:
 		case e1000_82545_rev_3:
 		case e1000_82546:
+		case e1000_ce4100:
 		case e1000_82546_rev_3:
 		case e1000_82541:
 		case e1000_82541_rev_2:
@@ -573,6 +581,7 @@ void e1000_reset(struct e1000_adapter *adapter)
 	case e1000_82545:
 	case e1000_82545_rev_3:
 	case e1000_82546:
+	case e1000_ce4100:
 	case e1000_82546_rev_3:
 		pba = E1000_PBA_48K;
 		break;
@@ -894,6 +903,7 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 	static int global_quad_port_a = 0; /* global ksp3 port a indication */
 	int i, err, pci_using_dac;
 	u16 eeprom_data = 0;
+	u16 tmp = 0;
 	u16 eeprom_apme_mask = E1000_EEPROM_APME;
 	int bars, need_ioport;
 
@@ -996,6 +1006,14 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 		goto err_sw_init;
 
 	err = -EIO;
+	if (hw->mac_type == e1000_ce4100) {
+		ce4100_gbe_mdio_base_phy = pci_resource_start(pdev, BAR_1);
+		ce4100_gbe_mdio_base_virt = ioremap(ce4100_gbe_mdio_base_phy,
+		                                pci_resource_len(pdev, BAR_1));
+
+		if (!ce4100_gbe_mdio_base_virt)
+			goto err_mdio_ioremap;
+	}
 
 	if (hw->mac_type >= e1000_82543) {
 		netdev->features = NETIF_F_SG |
@@ -1135,6 +1153,20 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 	adapter->wol = adapter->eeprom_wol;
 	device_set_wakeup_enable(&adapter->pdev->dev, adapter->wol);
 
+	/* Auto detect PHY address */
+	if (hw->mac_type == e1000_ce4100) {
+		for (i = 0; i < 32; i++) {
+			hw->phy_addr = i;
+			e1000_read_phy_reg(hw, PHY_ID2, &tmp);
+			if (tmp == 0 || tmp == 0xFF) {
+				if (i == 31)
+					goto err_eeprom;
+				continue;
+			} else
+				break;
+		}
+	}
+
 	/* reset the hardware with the new settings */
 	e1000_reset(adapter);
 
@@ -1171,6 +1203,8 @@ err_eeprom:
 	kfree(adapter->rx_ring);
 err_dma:
 err_sw_init:
+err_mdio_ioremap:
+	iounmap(ce4100_gbe_mdio_base_virt);
 	iounmap(hw->hw_addr);
 err_ioremap:
 	free_netdev(netdev);
@@ -1409,6 +1443,7 @@ static bool e1000_check_64k_bound(struct e1000_adapter *adapter, void *start,
 	/* First rev 82545 and 82546 need to not allow any memory
 	 * write location to cross 64k boundary due to errata 23 */
 	if (hw->mac_type == e1000_82545 ||
+	    hw->mac_type == e1000_ce4100 ||
 	    hw->mac_type == e1000_82546) {
 		return ((begin ^ (end - 1)) >> 16) != 0 ? false : true;
 	}
diff --git a/drivers/net/e1000/e1000_osdep.h b/drivers/net/e1000/e1000_osdep.h
index edd1c75..55c1711 100644
--- a/drivers/net/e1000/e1000_osdep.h
+++ b/drivers/net/e1000/e1000_osdep.h
@@ -34,12 +34,21 @@
 #ifndef _E1000_OSDEP_H_
 #define _E1000_OSDEP_H_
 
-#include <linux/types.h>
-#include <linux/pci.h>
-#include <linux/delay.h>
 #include <asm/io.h>
-#include <linux/interrupt.h>
-#include <linux/sched.h>
+
+#define CONFIG_RAM_BASE         0x60000
+#define GBE_CONFIG_OFFSET       0x0
+
+#define GBE_CONFIG_RAM_BASE \
+	((unsigned int)(CONFIG_RAM_BASE + GBE_CONFIG_OFFSET))
+
+#define GBE_CONFIG_BASE_VIRT    phys_to_virt(GBE_CONFIG_RAM_BASE)
+
+#define GBE_CONFIG_FLASH_WRITE(base, offset, count, data) \
+	(iowrite16_rep(base + offset, data, count))
+
+#define GBE_CONFIG_FLASH_READ(base, offset, count, data) \
+	(ioread16_rep(base + (offset << 1), data, count))
 
 #define er32(reg)							\
 	(readl(hw->hw_addr + ((hw->mac_type >= e1000_82543)		\
-- 
1.7.3.4


^ permalink raw reply related

* [net-next 06/12] e1000e: add custom set_d[0|3]_lplu_state function pointer for 82574
From: jeffrey.t.kirsher @ 2011-01-07  0:29 UTC (permalink / raw)
  To: davem, davem; +Cc: Bruce Allan, netdev, gosp, bphilips, Jeff Kirsher
In-Reply-To: <1294360199-9860-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Bruce Allan <bruce.w.allan@intel.com>

82574 needs to configure Low Power Link Up (or LPLU) differently than
the other parts in the 8257x family supported by the driver.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/e1000e/82571.c |   56 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/net/e1000e/hw.h    |    1 +
 2 files changed, 57 insertions(+), 0 deletions(-)

diff --git a/drivers/net/e1000e/82571.c b/drivers/net/e1000e/82571.c
index 11a273e..cb6c7b1 100644
--- a/drivers/net/e1000e/82571.c
+++ b/drivers/net/e1000e/82571.c
@@ -78,6 +78,8 @@ static void e1000_power_down_phy_copper_82571(struct e1000_hw *hw);
 static void e1000_put_hw_semaphore_82573(struct e1000_hw *hw);
 static s32 e1000_get_hw_semaphore_82574(struct e1000_hw *hw);
 static void e1000_put_hw_semaphore_82574(struct e1000_hw *hw);
+static s32 e1000_set_d0_lplu_state_82574(struct e1000_hw *hw, bool active);
+static s32 e1000_set_d3_lplu_state_82574(struct e1000_hw *hw, bool active);
 
 /**
  *  e1000_init_phy_params_82571 - Init PHY func ptrs.
@@ -113,6 +115,8 @@ static s32 e1000_init_phy_params_82571(struct e1000_hw *hw)
 		phy->type		 = e1000_phy_bm;
 		phy->ops.acquire = e1000_get_hw_semaphore_82574;
 		phy->ops.release = e1000_put_hw_semaphore_82574;
+		phy->ops.set_d0_lplu_state = e1000_set_d0_lplu_state_82574;
+		phy->ops.set_d3_lplu_state = e1000_set_d3_lplu_state_82574;
 		break;
 	default:
 		return -E1000_ERR_PHY;
@@ -656,6 +660,58 @@ static void e1000_put_hw_semaphore_82574(struct e1000_hw *hw)
 }
 
 /**
+ *  e1000_set_d0_lplu_state_82574 - Set Low Power Linkup D0 state
+ *  @hw: pointer to the HW structure
+ *  @active: true to enable LPLU, false to disable
+ *
+ *  Sets the LPLU D0 state according to the active flag.
+ *  LPLU will not be activated unless the
+ *  device autonegotiation advertisement meets standards of
+ *  either 10 or 10/100 or 10/100/1000 at all duplexes.
+ *  This is a function pointer entry point only called by
+ *  PHY setup routines.
+ **/
+static s32 e1000_set_d0_lplu_state_82574(struct e1000_hw *hw, bool active)
+{
+	u16 data = er32(POEMB);
+
+	if (active)
+		data |= E1000_PHY_CTRL_D0A_LPLU;
+	else
+		data &= ~E1000_PHY_CTRL_D0A_LPLU;
+
+	ew32(POEMB, data);
+	return 0;
+}
+
+/**
+ *  e1000_set_d3_lplu_state_82574 - Sets low power link up state for D3
+ *  @hw: pointer to the HW structure
+ *  @active: boolean used to enable/disable lplu
+ *
+ *  The low power link up (lplu) state is set to the power management level D3
+ *  when active is true, else clear lplu for D3. LPLU
+ *  is used during Dx states where the power conservation is most important.
+ *  During driver activity, SmartSpeed should be enabled so performance is
+ *  maintained.
+ **/
+static s32 e1000_set_d3_lplu_state_82574(struct e1000_hw *hw, bool active)
+{
+	u16 data = er32(POEMB);
+
+	if (!active) {
+		data &= ~E1000_PHY_CTRL_NOND0A_LPLU;
+	} else if ((hw->phy.autoneg_advertised == E1000_ALL_SPEED_DUPLEX) ||
+		   (hw->phy.autoneg_advertised == E1000_ALL_NOT_GIG) ||
+		   (hw->phy.autoneg_advertised == E1000_ALL_10_SPEED)) {
+		data |= E1000_PHY_CTRL_NOND0A_LPLU;
+	}
+
+	ew32(POEMB, data);
+	return 0;
+}
+
+/**
  *  e1000_acquire_nvm_82571 - Request for access to the EEPROM
  *  @hw: pointer to the HW structure
  *
diff --git a/drivers/net/e1000e/hw.h b/drivers/net/e1000e/hw.h
index ba302a5..e774380 100644
--- a/drivers/net/e1000e/hw.h
+++ b/drivers/net/e1000e/hw.h
@@ -83,6 +83,7 @@ enum e1e_registers {
 	E1000_EXTCNF_CTRL  = 0x00F00, /* Extended Configuration Control */
 	E1000_EXTCNF_SIZE  = 0x00F08, /* Extended Configuration Size */
 	E1000_PHY_CTRL     = 0x00F10, /* PHY Control Register in CSR */
+#define E1000_POEMB	E1000_PHY_CTRL	/* PHY OEM Bits */
 	E1000_PBA      = 0x01000, /* Packet Buffer Allocation - RW */
 	E1000_PBS      = 0x01008, /* Packet Buffer Size */
 	E1000_EEMNGCTL = 0x01010, /* MNG EEprom Control */
-- 
1.7.3.4


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox