Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: jme driver loses connection after resuming from suspend
From: Leonardo  L. P. da Mata @ 2011-02-10 21:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Guo-Fu Tseng, netdev
In-Reply-To: <20110128163315.1e4e8346.akpm@linux-foundation.org>

Updated information on the bug, Guo-Fu Tseng says that might not be a
bug on the driver, but i've tested other network cards and they don't
share the same issue.

Also, on interesting update that people may consider is that after
running tcpdump on the device, the network starts working again.
Information are updated in the bug.

On Fri, Jan 28, 2011 at 10:33 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> (cc's added)
>
> On Fri, 28 Jan 2011 16:03:11 -0200
> "Leonardo  L. P. da Mata" <barroca@gmail.com> wrote:
>
>> Hello, i'm testing the kernel 2.6.37 on my hardware, Once connect on
>> wired network, i call the suspend with:
>> echo "mem" >/sys/power/state
>>
>> The system goes on suspend. After resuming from suspend, the network card
>> cannot be used anymore.
>>
>>
>> The bug is reported here:
>>  https://bugzilla.kernel.org/show_bug.cgi?id=27692
>>
>> Can you please point me to similar problems on other network cards so
>> i can get possible solutions on this.
>
>

-- 
Leonardo Luiz Padovani da Mata
barroca@gmail.com

"May the force be with you, always"
"Nerd Pride... eu tenho. Voce tem?"

^ permalink raw reply

* Re: [PATCH 1/3] tg3: Avoid setting power.can_wakeup for devices that cannot wake up
From: Rafael J. Wysocki @ 2011-02-10 21:08 UTC (permalink / raw)
  To: Matt Carlson
  Cc: netdev@vger.kernel.org, David Miller, Michael Chan,
	Linux PM mailing list, Thomas Fjellstrom, Jay Cliburn,
	Chris Snook, Jie Yang
In-Reply-To: <20110210204844.GA4158@mcarlson.broadcom.com>

On Thursday, February 10, 2011, Matt Carlson wrote:
> On Thu, Feb 10, 2011 at 08:53:09AM -0800, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > The tg3 driver uses device_init_wakeup() in such a way that the
> > device's power.can_wakeup flag may be set even though the PCI
> > subsystem cleared it before, in which case the device cannot wake
> > up the system from sleep states.  Modify the driver to only change
> > the power.can_wakeup flag if the device is not capable of generating
> > wakeup signals.
> > 
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > ---
> >  drivers/net/tg3.c |    6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > Index: linux-2.6/drivers/net/tg3.c
> > ===================================================================
> > --- linux-2.6.orig/drivers/net/tg3.c
> > +++ linux-2.6/drivers/net/tg3.c
> > @@ -12403,9 +12403,11 @@ static void __devinit tg3_get_eeprom_hw_
> >  			tp->tg3_flags3 |= TG3_FLG3_RGMII_EXT_IBND_TX_EN;
> >  	}
> >  done:
> > -	device_init_wakeup(&tp->pdev->dev, tp->tg3_flags & TG3_FLAG_WOL_CAP);
> > -	device_set_wakeup_enable(&tp->pdev->dev,
> > +	if (tp->tg3_flags & TG3_FLAG_WOL_CAP)
> > +		device_set_wakeup_enable(&tp->pdev->dev,
> >  				 tp->tg3_flags & TG3_FLAG_WOL_ENABLE);
> > +	else
> > +		device_set_wakeup_capable(&tp->pdev->dev, false);
> 
> I did this because I couldn't see where 'can_wakeup' gets set.  I don't
> see a call to device_init_wakeup() that would be relevant to tg3
> devices.  I do see a couple calls to device_set_wakeup_capable() in
> acpi/glue.c and acpi/scan.c.  Is that the place?

No, it's pci_pm_init() or platform_pci_wakeup_init() and they both use
device_set_wakeup_capable() rather tha device_init_wakeup(), which is just a
combination of device_set_wakeup_capable() and device_set_wakeup_enable()
anyway.

And there's no why reason PCI drivers should use device_pm_init() at all.

> >  }
> >  
> >  static int __devinit tg3_issue_otp_command(struct tg3 *tp, u32 cmd)
> 
> This is something I was always curious about too.  The TG3_FLAG_WOL_CAP
> tracks whether or not the device can handle WOL.  Is it safe to do away
> with this flag and lean on the 'can_wakeup' flag instead?

I don't think so.  power.can_wakeup only tracks the capability to generate
wakeup signals from the PCI perspective and it is set by PCI if the device
appears to be able to generate wakeup signals.

So, I think the driver should work as in the $subject patch - reset the
power.can_wakeup flag if TG3_FLAG_WOL_CAP is unset and don't touch it
otherwise.

Thanks,
Rafael

^ permalink raw reply

* Re: [PATCH] tcp: undo_retrans counter fixes
From: Ilpo Järvinen @ 2011-02-10 21:31 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: David Miller, Netdev
In-Reply-To: <AANLkTikN9C6XV3JHUoypmVQ0H0cChmmvUU3zSWa8Fva8@mail.gmail.com>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3170 bytes --]

On Thu, 10 Feb 2011, Yuchung Cheng wrote:

> On Tue, Feb 8, 2011 at 1:54 AM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
> >
> > On Mon, 7 Feb 2011, Yuchung Cheng wrote:
> >
> > > On Mon, Feb 7, 2011 at 3:36 PM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
> > > >
> > > > On Mon, 7 Feb 2011, David Miller wrote:
> > > >
> > > > > From: Yuchung Cheng <ycheng@google.com>
> > > > > Date: Mon,  7 Feb 2011 14:57:04 -0800
> > > > >
> > > > > > Fix a bug that undo_retrans is incorrectly decremented when undo_marker is
> > > > > > not set or undo_retrans is already 0. This happens when sender receives
> > > > > > more DSACK ACKs than packets retransmitted during the current
> > > > > > undo phase. This may also happen when sender receives DSACK after
> > > > > > the undo operation is completed or cancelled.
> > > > > >
> > > > > > Fix another bug that undo_retrans is incorrectly incremented when
> > > > > > sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
> > > > > > is rare but not impossible.
> > > > > >
> > > > > > Signed-off-by: Yuchung Cheng <ycheng@google.com>
> > > >
> > > > Neither is harmful to "fix" but I think they're partially also checking
> > > > for things which shouldn't cause problems... E.g., undo_retrans is only
> > > > used after checking undo_marker's validity first so I don't think
> > > > undo_marker check is exactly necessary there (but on the other hand it
> > > > does no harm)...
> > >
> > > logically we should check the validity of undo_marker/undo_retrans
> > > before we use them? The current code has no problem if
> > > tcp_fastretrans_alert() always call tcp_try_undo_*  functions whenever
> > > undo_marker != 0 and undo_retrans == 0. I don't think that's always
> > > true.
> >
> > We certainly should be letting the undo_retrans to underflow that in this
> > your patch has merit (the added tp->undo_retrans check).
> >
> > However, the only users are:
> >
> > static inline int tcp_may_undo(struct tcp_sock *tp)
> > {
> >         return tp->undo_marker && (!tp->undo_retrans ...)
> >
> > and:
> >
> > static void tcp_try_undo_dsack(struct sock *sk)
> > {
> >        struct tcp_sock *tp = tcp_sk(sk);
> >
> >        if (tp->undo_marker && !tp->undo_retrans) {
> >
> >
> > ...which check that undo_retrans is valid.
> But that does not make this bug go away.
> The sender does not always call these functions in tcp_fastretrans_alert().
> 
> A common example is that the sender receives a DUPACK with DSACK option
> during CA_Recovery and undo_retrans goes to 0. Since it's not a partial ACK,
> no undo function is called (another bug?) while processing the DUPACK. If the
> sender receives another DSACK, undo_retrans underflows and the undo
> chance is missed forever.

But the underflow can be fixed without checking for tp->undo_marker with 
the tp->undo_retrans > 0 condition only before decrementing it, right 
(in sacktag code like you do)? ...I don't understand what that has to do 
with these functions being called from tcp_fastretrans_alert or not.

> I think that's another potential bug in handling this situation but I
> want to fix in
> this boundary checks first.



-- 
 i.

^ permalink raw reply

* [PATCH] Don't potentially dereference NULL in net/dcb/dcbnl.c:dcbnl_getapp()
From: Jesper Juhl @ 2011-02-10 21:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: netdev, Alexey Dobriyan, Dan Carpenter, Shmulik Ravid,
	John Fastabend, David S. Miller, Lucy Liu

nla_nest_start() may return NULL. If it does then we'll blow up in 
nla_nest_end() when we dereference the pointer.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
---
 dcbnl.c |    3 +++
 1 file changed, 3 insertions(+)

  only compile tested.

diff --git a/net/dcb/dcbnl.c b/net/dcb/dcbnl.c
index 6b03f56..13cdc30 100644
--- a/net/dcb/dcbnl.c
+++ b/net/dcb/dcbnl.c
@@ -626,6 +626,9 @@ static int dcbnl_getapp(struct net_device *netdev, struct nlattr **tb,
 	dcb->cmd = DCB_CMD_GAPP;
 
 	app_nest = nla_nest_start(dcbnl_skb, DCB_ATTR_APP);
+	if (!app_nest)
+		goto out_cancel;
+
 	ret = nla_put_u8(dcbnl_skb, DCB_APP_ATTR_IDTYPE, idtype);
 	if (ret)
 		goto out_cancel;
-- 
Jesper Juhl <jj@chaosbits.net>            http://www.chaosbits.net/
Plain text mails only, please.
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html

^ permalink raw reply related

* [PATCH 1/2] igb: Allow extra 4 bytes on RX for vlan tags.
From: greearb @ 2011-02-10 21:59 UTC (permalink / raw)
  To: netdev; +Cc: Ben Greear

From: Ben Greear <greearb@candelatech.com>

This allows the NIC to receive 1518 byte (not counting
FCS) packets when MTU is 1500, thus allowing 1500 MTU
VLAN frames to be received.  Please note that no VLANs
were actually configured on the NIC...it was just acting
as pass-through device.

Signed-off-by: Ben Greear <greearb@candelatech.com>
---
:100644 100644 58c665b... 30c9cc6... M	drivers/net/igb/igb_main.c
 drivers/net/igb/igb_main.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 58c665b..30c9cc6 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -2281,7 +2281,8 @@ static int __devinit igb_sw_init(struct igb_adapter *adapter)
 	adapter->rx_itr_setting = IGB_DEFAULT_ITR;
 	adapter->tx_itr_setting = IGB_DEFAULT_ITR;
 
-	adapter->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
+	adapter->max_frame_size = (netdev->mtu + ETH_HLEN + ETH_FCS_LEN
+				   + VLAN_HLEN);
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
 	spin_lock_init(&adapter->stats64_lock);
@@ -4303,7 +4304,7 @@ static int igb_change_mtu(struct net_device *netdev, int new_mtu)
 {
 	struct igb_adapter *adapter = netdev_priv(netdev);
 	struct pci_dev *pdev = adapter->pdev;
-	int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN;
+	int max_frame = new_mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN;
 	u32 rx_buffer_len, i;
 
 	if ((new_mtu < 68) || (max_frame > MAX_JUMBO_FRAME_SIZE)) {
-- 
1.7.2.3


^ permalink raw reply related

* [PATCH 2/2] network: Allow af_packet to transmit +4 bytes for VLAN packets.
From: greearb @ 2011-02-10 21:59 UTC (permalink / raw)
  To: netdev; +Cc: Ben Greear
In-Reply-To: <1297375149-18458-1-git-send-email-greearb@candelatech.com>

From: Ben Greear <greearb@candelatech.com>

This allows user-space to send a '1500' MTU VLAN packet on a
1500 MTU ethernet frame.  The extra 4 bytes of a VLAN header is
not usually charged against the MTU when other parts of the
network stack is transmitting vlans...

Signed-off-by: Ben Greear <greearb@candelatech.com>
---
:100644 100644 91cb1d7... ef7f378... M	net/packet/af_packet.c
 net/packet/af_packet.c |   31 +++++++++++++++++++++++++++++--
 1 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 91cb1d7..ef7f378 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -466,7 +466,7 @@ retry:
 	 */
 
 	err = -EMSGSIZE;
-	if (len > dev->mtu + dev->hard_header_len)
+	if (len > dev->mtu + dev->hard_header_len + VLAN_HLEN)
 		goto out_unlock;
 
 	if (!skb) {
@@ -497,6 +497,19 @@ retry:
 		goto retry;
 	}
 
+	if (len > (dev->mtu + dev->hard_header_len)) {
+		/* Earlier code assumed this would be a VLAN pkt,
+		 * double-check this now that we have the actual
+		 * packet in hand.
+		 */
+		struct ethhdr *ehdr;
+		skb_reset_mac_header(skb);
+		ehdr = eth_hdr(skb);
+		if (ehdr->h_proto != htons(ETH_P_8021Q)) {
+			err = -EMSGSIZE;
+			goto out_unlock;
+		}
+	}
 
 	skb->protocol = proto;
 	skb->dev = dev;
@@ -1200,7 +1213,7 @@ static int packet_snd(struct socket *sock,
 	}
 
 	err = -EMSGSIZE;
-	if (!gso_type && (len > dev->mtu+reserve))
+	if (!gso_type && (len > dev->mtu + reserve + ETH_HLEN))
 		goto out_unlock;
 
 	err = -ENOBUFS;
@@ -1225,6 +1238,20 @@ static int packet_snd(struct socket *sock,
 	if (err < 0)
 		goto out_free;
 
+	if (!gso_type && (len > dev->mtu + reserve)) {
+		/* Earlier code assumed this would be a VLAN pkt,
+		 * double-check this now that we have the actual
+		 * packet in hand.
+		 */
+		struct ethhdr *ehdr;
+		skb_reset_mac_header(skb);
+		ehdr = eth_hdr(skb);
+		if (ehdr->h_proto != htons(ETH_P_8021Q)) {
+			err = -EMSGSIZE;
+			goto out_free;
+		}
+	}
+
 	skb->protocol = proto;
 	skb->dev = dev;
 	skb->priority = sk->sk_priority;
-- 
1.7.2.3


^ permalink raw reply related

* netfilter is not a filesystem
From: Andrew Morton @ 2011-02-10 22:11 UTC (permalink / raw)
  To: netdev, linux-kernel
In-Reply-To: <bug-28862-10286@https.bugzilla.kernel.org/>

On Thu, 10 Feb 2011 21:55:26 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=28862
> 
>            Summary: /proc/net/ip_conntrack: no space left on device
>                     systematically

This is why I'm forever nagging people to not just grab some errno
because its name happens to sound similar to the error you just detected.

Yes, it superficially seems nice and logical for netfilter to use
ENOSPC when it runs out of space.  But when that error code propagates
up to the user, they see "no space left on device" and will then run
"df" and wonder what the hell happened to their computer.

The kernel makes this mistake a *lot*.  EFBIG in the rtc drivers?  Really?

^ permalink raw reply

* Re: netfilter is not a filesystem
From: David Miller @ 2011-02-10 22:22 UTC (permalink / raw)
  To: akpm; +Cc: netdev, linux-kernel
In-Reply-To: <20110210141119.56d789fc.akpm@linux-foundation.org>

From: Andrew Morton <akpm@linux-foundation.org>
Date: Thu, 10 Feb 2011 14:11:19 -0800

> On Thu, 10 Feb 2011 21:55:26 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
>> https://bugzilla.kernel.org/show_bug.cgi?id=28862
>> 
>>            Summary: /proc/net/ip_conntrack: no space left on device
>>                     systematically
> 
> This is why I'm forever nagging people to not just grab some errno
> because its name happens to sound similar to the error you just detected.
> 
> Yes, it superficially seems nice and logical for netfilter to use
> ENOSPC when it runs out of space.  But when that error code propagates
> up to the user, they see "no space left on device" and will then run
> "df" and wonder what the hell happened to their computer.
> 
> The kernel makes this mistake a *lot*.  EFBIG in the rtc drivers?  Really?

We are in this conundrum because the granularity of errors which can
be indicated by errno signalling is very low.

And one way people handle this is to use all sorts of different types
of errno values to indicate the different cases.

Also, one can argue that it is erroneous for userspace to assume that
error codes are not context dependent.  They most certainly are.

^ permalink raw reply

* GRO/GSO hiding PMTU?
From: David Miller @ 2011-02-10 22:55 UTC (permalink / raw)
  To: herbert; +Cc: netdev, netfilter-devel

I was trying to setup something simple to trigger PMTU to test my
PMTU patches, and the simplest (I thought) would be to simply down
the mtu of the internet facing side of my simple NAT box.

All I did was "ip link set eth0 mtu 1400", and try to send large
TCP sequences from inside.

To my surprise I saw no ICMP messages, on input to the NAT machine the
TCP packets had length 1448 and on output they had length 1348.

This NAT box has TG3 on both input and output, so supports GRO and
TSO.  The kernel is 2.6.34-rc7 vintage :-)

I suspect that the packet arrives on eth1, accumulates into GRO, and
thus marked as GSO as well, then GSO/TSO on output to eth0 is
re-segmenting things transparently, and we're not getting the ICMP
frag-needed message and the packet drop because of the skb_is_gso()
check in ip_forward().

	if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
		     (ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) {
		IP_INC_STATS(dev_net(rt->dst.dev), IPSTATS_MIB_FRAGFAILS);
		icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
			  htonl(dst_mtu(&rt->dst)));
		goto drop;
	}

So if that's what is happening, that's cute, but I think we need to
fix this :-)

Perhaps the check in ip_forward() should instead validate the gso_size
in the skb_is_gso() case?

That'd be a little tricky since gso_size is an MSS value whereas what
we're checking against (skb->len) is the full packet size, headers and
all.

^ permalink raw reply

* Re: GRO/GSO hiding PMTU?
From: David Miller @ 2011-02-10 23:07 UTC (permalink / raw)
  To: herbert; +Cc: netdev, netfilter-devel
In-Reply-To: <20110210.145555.39165146.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Thu, 10 Feb 2011 14:55:55 -0800 (PST)

> I suspect that the packet arrives on eth1, accumulates into GRO, and
> thus marked as GSO as well, then GSO/TSO on output to eth0 is
> re-segmenting things transparently, and we're not getting the ICMP
> frag-needed message and the packet drop because of the skb_is_gso()
> check in ip_forward().
> 
> 	if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
> 		     (ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) {
> 		IP_INC_STATS(dev_net(rt->dst.dev), IPSTATS_MIB_FRAGFAILS);
> 		icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
> 			  htonl(dst_mtu(&rt->dst)));
> 		goto drop;
> 	}
> 
> So if that's what is happening, that's cute, but I think we need to
> fix this :-)
> 
> Perhaps the check in ip_forward() should instead validate the gso_size
> in the skb_is_gso() case?
> 
> That'd be a little tricky since gso_size is an MSS value whereas what
> we're checking against (skb->len) is the full packet size, headers and
> all.

Nevermind, I turned off gso/tso on eth0 (outgoing interface) and it still
happens.

I guess netfilter or something else is causing this.


^ permalink raw reply

* Re: [RFC PATCH 0/5] Cache PMTU/redirects in inetpeer
From: David Miller @ 2011-02-10 23:09 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20110209.221249.112596220.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Wed, 09 Feb 2011 22:12:49 -0800 (PST)

> This is what I've been working on for the past several days.

Just FYI I've pushed just the infrastructure bits (patches 1, 2,
and 3) into net-next-2.6 since those have no side effects.

I won't push the other two until I can do some good testing on
them.

^ permalink raw reply

* Re:...
From: Young Chang @ 2011-02-10 23:13 UTC (permalink / raw)


***********************
This message has been scanned by the InterScan for CSC SSM and found to be free of known security risks.
***********************



May I ask if you would be eligible to pursue a Business Proposal of $19.7m with me if you dont mind? Let me know if you are interested.

The information contained in this email and any attachments is confidential and may be subject to copyright or other intellectual property protection. If you are not the intended recipient, you are not authorized to use or disclose this information, and we request that you notify us by reply mail or telephone and delete the original message from your mail system.

E-mail transmission cannot be guaranteed to be secure or error-free as information could be
intercepted, corrupted, lost, destroyed, arrive late or incomplete or contain viruses.
The sender therefore does not accept liability for any errors or omissions in the contents of this message which arise as a result of e-mail transmission.

Wijeya Newspapers Ltd : 2010


^ permalink raw reply

* Re: jme driver loses connection after resuming from suspend
From: Guo-Fu Tseng @ 2011-02-10 23:26 UTC (permalink / raw)
  To: Leonardo  L. P. da Mata; +Cc: linux-kernel, netdev
In-Reply-To: <AANLkTikV9Ncy3Jki1G20SrzA_HMiAk_hF3qY8U+vpZ1D@mail.gmail.com>

Hi Leonardo  L. P. da Mata:
The value of register which holds receive Unicast MAC Address somehow
get messed-up after resume!

Would you see if this source fix the issue:
http://bbs.cooldavid.org/git/?p=jme.git;a=snapshot;h=refs/heads/resumefix;sf=tbz2

The shortlog is here:
http://bbs.cooldavid.org/git/?p=jme.git;a=shortlog;h=refs/heads/resumefix


On Thu, 10 Feb 2011 19:07:31 -0200, Leonardo  L. P. da Mata wrote
> Updated information on the bug, Guo-Fu Tseng says that might not be a
> bug on the driver, but i've tested other network cards and they don't
> share the same issue.
> 
> Also, on interesting update that people may consider is that after
> running tcpdump on the device, the network starts working again.
> Information are updated in the bug.
> 
> On Fri, Jan 28, 2011 at 10:33 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > (cc's added)
> >
> > On Fri, 28 Jan 2011 16:03:11 -0200
> > "Leonardo &#160;L. P. da Mata" <barroca@gmail.com> wrote:
> >
> >> Hello, i'm testing the kernel 2.6.37 on my hardware, Once connect on
> >> wired network, i call the suspend with:
> >> echo "mem" >/sys/power/state
> >>
> >> The system goes on suspend. After resuming from suspend, the network card
> >> cannot be used anymore.
> >>
> >>
> >> The bug is reported here:
> >> &#160;https://bugzilla.kernel.org/show_bug.cgi?id=27692
> >>
> >> Can you please point me to similar problems on other network cards so
> >> i can get possible solutions on this.
> >
> >
> 
> -- 
> Leonardo Luiz Padovani da Mata
> barroca@gmail.com
> 
> "May the force be with you, always"
> "Nerd Pride... eu tenho. Voce tem?"
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Guo-Fu Tseng

^ permalink raw reply

* Re: GRO/GSO hiding PMTU?
From: Herbert Xu @ 2011-02-10 23:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, netfilter-devel
In-Reply-To: <20110210.145555.39165146.davem@davemloft.net>

On Thu, Feb 10, 2011 at 02:55:55PM -0800, David Miller wrote:
>
> I suspect that the packet arrives on eth1, accumulates into GRO, and
> thus marked as GSO as well, then GSO/TSO on output to eth0 is
> re-segmenting things transparently, and we're not getting the ICMP
> frag-needed message and the packet drop because of the skb_is_gso()
> check in ip_forward().
> 
> 	if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
> 		     (ip_hdr(skb)->frag_off & htons(IP_DF))) && !skb->local_df) {
> 		IP_INC_STATS(dev_net(rt->dst.dev), IPSTATS_MIB_FRAGFAILS);
> 		icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
> 			  htonl(dst_mtu(&rt->dst)));
> 		goto drop;
> 	}
> 
> So if that's what is happening, that's cute, but I think we need to
> fix this :-)

Yes this is a known problem and we do need to fix this, even if
it doesn't appear to be the cause of your immediate issue :)

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 1/3] tg3: Avoid setting power.can_wakeup for devices that cannot wake up
From: Matt Carlson @ 2011-02-11  0:00 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matthew Carlson, netdev@vger.kernel.org, David Miller,
	Michael Chan, Linux PM mailing list, Thomas Fjellstrom,
	Jay Cliburn, Chris Snook, Jie Yang
In-Reply-To: <201102102208.42410.rjw@sisk.pl>

On Thu, Feb 10, 2011 at 01:08:42PM -0800, Rafael J. Wysocki wrote:
> On Thursday, February 10, 2011, Matt Carlson wrote:
> > On Thu, Feb 10, 2011 at 08:53:09AM -0800, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > The tg3 driver uses device_init_wakeup() in such a way that the
> > > device's power.can_wakeup flag may be set even though the PCI
> > > subsystem cleared it before, in which case the device cannot wake
> > > up the system from sleep states.  Modify the driver to only change
> > > the power.can_wakeup flag if the device is not capable of generating
> > > wakeup signals.
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > > ---
> > >  drivers/net/tg3.c |    6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > Index: linux-2.6/drivers/net/tg3.c
> > > ===================================================================
> > > --- linux-2.6.orig/drivers/net/tg3.c
> > > +++ linux-2.6/drivers/net/tg3.c
> > > @@ -12403,9 +12403,11 @@ static void __devinit tg3_get_eeprom_hw_
> > >  			tp->tg3_flags3 |= TG3_FLG3_RGMII_EXT_IBND_TX_EN;
> > >  	}
> > >  done:
> > > -	device_init_wakeup(&tp->pdev->dev, tp->tg3_flags & TG3_FLAG_WOL_CAP);
> > > -	device_set_wakeup_enable(&tp->pdev->dev,
> > > +	if (tp->tg3_flags & TG3_FLAG_WOL_CAP)
> > > +		device_set_wakeup_enable(&tp->pdev->dev,
> > >  				 tp->tg3_flags & TG3_FLAG_WOL_ENABLE);
> > > +	else
> > > +		device_set_wakeup_capable(&tp->pdev->dev, false);
> > 
> > I did this because I couldn't see where 'can_wakeup' gets set.  I don't
> > see a call to device_init_wakeup() that would be relevant to tg3
> > devices.  I do see a couple calls to device_set_wakeup_capable() in
> > acpi/glue.c and acpi/scan.c.  Is that the place?
> 
> No, it's pci_pm_init() or platform_pci_wakeup_init() and they both use
> device_set_wakeup_capable() rather tha device_init_wakeup(), which is just a
> combination of device_set_wakeup_capable() and device_set_wakeup_enable()
> anyway.
> 
> And there's no why reason PCI drivers should use device_pm_init() at all.

O.K.

Acked-by: Matt Carlson <mcarlson@broadcom.com>

> > >  }
> > >  
> > >  static int __devinit tg3_issue_otp_command(struct tg3 *tp, u32 cmd)
> > 
> > This is something I was always curious about too.  The TG3_FLAG_WOL_CAP
> > tracks whether or not the device can handle WOL.  Is it safe to do away
> > with this flag and lean on the 'can_wakeup' flag instead?
> 
> I don't think so.  power.can_wakeup only tracks the capability to generate
> wakeup signals from the PCI perspective and it is set by PCI if the device
> appears to be able to generate wakeup signals.
> 
> So, I think the driver should work as in the $subject patch - reset the
> power.can_wakeup flag if TG3_FLAG_WOL_CAP is unset and don't touch it
> otherwise.

That makes sense.  Thanks.


^ permalink raw reply

* Re: GRO/GSO hiding PMTU?
From: Herbert Xu @ 2011-02-11  0:07 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, netfilter-devel
In-Reply-To: <20110210.150759.59671055.davem@davemloft.net>

On Thu, Feb 10, 2011 at 03:07:59PM -0800, David Miller wrote:
>
> Nevermind, I turned off gso/tso on eth0 (outgoing interface) and it still
> happens.

Turning off GSO/TSO has no effect if it's GRO generating the packet
since it still has to be segmented regardless.

So I think this is the problem that you found and we need to fix it
by checking the gso_size parameter.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* (unknown), 
From: COCA COLA NOTIFICATION @ 2011-02-11  0:20 UTC (permalink / raw)


DEPT COCA-COLA AVENUE
STAMFORD BRIDGE LONDON.
SW1V 3DW UNITED KINGDOM

Attention Winner

This email is to notify you that your email address was
randomly selected and entered into our free Third Category
draws.You have subsequently emerged a winner and therefore
entitled to a substantial amount of 1,000,000.00 Great British
Pounds.kindly confirm receipt of this email, by forwarding
Your Details to the claims department.

Name: Tommy Roger
Email:drawsupdate105@hotmail.co.uk

IMPORTANT FILL OUT THIS WINNERS VERIFICATION FORM BELOW:

FULL NAMES----------
DATE OF BIRTH---------
SEX.----------------
CONTACT ADDRESS----------
COUNTRY--------------------
MOBILE NUMBER--------------
OCCUPATION----------
E-MAIL ID--------------

Congratulations once again.
Online Co-coordinator

The Coca-Cola Company. Copy Right 2010 All Right Reserve

^ permalink raw reply

* [PATCH] net: fix ifenslave build flags
From: Randy Dunlap @ 2011-02-11  0:27 UTC (permalink / raw)
  To: Netdev, David Miller; +Cc: Alexey Salmin

From: Randy Dunlap <randy.dunlap@oracle.com>

-I (include path) should be specified for host builds.
This one was overlooked somehow.  Fixes
https://bugzilla.kernel.org/show_bug.cgi?id=25902

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Alexey Salmin <alexey.salmin@gmail.com>
---
 Documentation/networking/Makefile |    2 ++
 1 file changed, 2 insertions(+)

--- lnx-2638-rc4.orig/Documentation/networking/Makefile
+++ lnx-2638-rc4/Documentation/networking/Makefile
@@ -4,6 +4,8 @@ obj- := dummy.o
 # List of programs to build
 hostprogs-y := ifenslave
 
+HOSTCFLAGS_ifenslave.o += -I$(objtree)/usr/include
+
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
 

^ permalink raw reply

* [ethtool PATCH 0/2] Add support for RX network flow classifier rules
From: Alexander Duyck @ 2011-02-11  1:18 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev

These two patches add support for the RX network flow classifier set and get
rules calls already supported by the kernel.  The first patch is actually just
a cleanup patch in the manpage to make it a bit easier to do the value/mask
pairing follow ons in the second patch.

The second patch is the meat of the changes and implements the packet
classifier interface.  In updating and testing it I found a number of issues
that had to be addressed.  In resolving them though I found several features
that I felt were better off removed such as prioritization of filters which
was causing multiple issues including memory leaks and non-constructive
blocking of filter insertions due to the state of expanded priorities not
being saved.

I'm still currently exploring my options in resolving the needs for ixgbe to
have the ability to retain and display its filters and plan to submit RFC
patches in the next few days to get some feedback on that.

Thanks,

Alex

---

Alexander Duyck (1):
      Add macro for displaying [value N] formatting to manpage

Santwona Behera (1):
      Add RX packet classification interface

 Makefile.am      |    3 
 ethtool-bitops.h |   25 ++
 ethtool-util.h   |   13 +
 ethtool.8.in     |  203 +++++++++-----
 ethtool.c        |  144 +++++++++-
 rxclass.c        |  809 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 1115 insertions(+), 82 deletions(-)
 create mode 100644 ethtool-bitops.h
 create mode 100644 rxclass.c

-- 

^ permalink raw reply

* [ethtool PATCH 1/2] Add macro for displaying [value N] formatting to manpage
From: Alexander Duyck @ 2011-02-11  1:18 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev
In-Reply-To: <20110211010806.23554.98333.stgit@gitlad.jf.intel.com>

This change adds a macro for displaying optional values that take a numeric
value to the manpage.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 ethtool.8.in |  102 ++++++++++++++++++++++------------------------------------
 1 files changed, 38 insertions(+), 64 deletions(-)

diff --git a/ethtool.8.in b/ethtool.8.in
index 0ee91a0..133825b 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -34,6 +34,12 @@
 [\\fB\\$1\\fP\ \\fB\\$2\\fP|\\fB\\$3\\fP|\\fB\\$4\\fP|\\fB\\$5\\fP]
 ..
 .\"
+.\"	.BN - value with a numeric input as in "[value N]"
+.\"
+.de BN
+[\\fB\\$1\\fP\ \\fIN\\fP]
+..
+.\"
 .\"	\(*MA - mac address
 .\"
 .ds MA \fIxx\fP\fB:\fP\fIyy\fP\fB:\fP\fIzz\fP\fB:\fP\fIaa\fP\fB:\fP\fIbb\fP\fB:\fP\fIcc\fP
@@ -110,60 +116,36 @@ ethtool \- query or control network driver and hardware settings
 .I ethX
 .B2 adaptive-rx on off
 .B2 adaptive-tx on off
-.RB [ rx-usecs
-.IR N ]
-.RB [ rx-frames
-.IR N ]
-.RB [ rx-usecs-irq
-.IR N ]
-.RB [ rx-frames-irq
-.IR N ]
-.RB [ tx-usecs
-.IR N ]
-.RB [ tx-frames
-.IR N ]
-.RB [ tx-usecs-irq
-.IR N ]
-.RB [ tx-frames-irq
-.IR N ]
-.RB [ stats-block-usecs
-.IR N ]
-.RB [ pkt-rate-low
-.IR N ]
-.RB [ rx-usecs-low
-.IR N ]
-.RB [ rx-frames-low
-.IR N ]
-.RB [ tx-usecs-low
-.IR N ]
-.RB [ tx-frames-low
-.IR N ]
-.RB [ pkt-rate-high
-.IR N ]
-.RB [ rx-usecs-high
-.IR N ]
-.RB [ rx-frames-high
-.IR N ]
-.RB [ tx-usecs-high
-.IR N ]
-.RB [ tx-frames-high
-.IR N ]
-.RB [ sample-interval
-.IR N ]
+.BN rx-usecs
+.BN rx-frames
+.BN rx-usecs-irq
+.BN rx-frames-irq
+.BN tx-usecs
+.BN tx-frames
+.BN tx-usecs-irq
+.BN tx-frames-irq
+.BN stats-block-usecs
+.BN pkt-rate-low
+.BN rx-usecs-low
+.BN rx-frames-low
+.BN tx-usecs-low
+.BN tx-frames-low
+.BN pkt-rate-high
+.BN rx-usecs-high
+.BN rx-frames-high
+.BN tx-usecs-high
+.BN tx-frames-high
+.BN sample-interval
 
 .B ethtool \-g|\-\-show\-ring
 .I ethX
 
 .B ethtool \-G|\-\-set\-ring
 .I ethX
-.RB [ rx
-.IR N ]
-.RB [ rx-mini
-.IR N ]
-.RB [ rx-jumbo
-.IR N ]
-.RB [ tx
-.IR N ]
+.BN rx
+.BN rx-mini
+.BN rx-jumbo
+.BN tx
 
 .B ethtool \-i|\-\-driver
 .I ethX
@@ -178,21 +160,15 @@ ethtool \- query or control network driver and hardware settings
 .B ethtool \-e|\-\-eeprom\-dump
 .I ethX
 .B2 raw on off
-.RB [ offset
-.IR N ]
-.RB [ length
-.IR N ]
+.BN offset
+.BN length
 
 .B ethtool \-E|\-\-change\-eeprom
 .I ethX
-.RB [ magic
-.IR N ]
-.RB [ offset
-.IR N ]
-.RB [ length
-.IR N ]
-.RB [ value
-.IR N ]
+.BN magic
+.BN offset
+.BN length
+.BN value
 
 .B ethtool \-k|\-\-show\-offload
 .I ethX
@@ -234,10 +210,8 @@ ethtool \- query or control network driver and hardware settings
 .B2 duplex half full
 .B4 port tp aui bnc mii fibre
 .B2 autoneg on off
-.RB [ advertise
-.IR N ]
-.RB [ phyad
-.IR N ]
+.BN advertise
+.BN phyad
 .B2 xcvr internal external
 .RB [ wol \ \*(WO]
 .RB [ sopass \ \*(MA]


^ permalink raw reply related

* [ethtool PATCH 2/2] Add RX packet classification interface
From: Alexander Duyck @ 2011-02-11  1:18 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev
In-Reply-To: <20110211010806.23554.98333.stgit@gitlad.jf.intel.com>

From: Santwona Behera <santwona.behera@sun.com>

This patch was originally introduced as:
  [PATCH 1/3] [ethtool] Add rx pkt classification interface
  Signed-off-by: Santwona Behera <santwona.behera@sun.com>
  http://patchwork.ozlabs.org/patch/23223/

I have updated it to address a number of issues.  As a result I removed the
local caching of rules due to the fact that there were memory leaks in this
code and the rule manager would consume over 1Mb of space for an 8K table
when all that was needed was 1K in order to store which rules were active
and which were not.

In addition I dropped the use of regions as there were multiple issue found
including the fact that the regions were not properly expanding beyond 2
and the fact that the regions required reading all of the rules in order to
correctly expand beyond 2.  By dropping the regions from the rule manager
it is possible to write a much cleaner interface leaving region management
to be done by either the driver or by external management scripts.

I also added an ethtool bitops interface to allow for simple bit set and
test activities since the rule manager can most efficiently store the list
of active rules via a bitmap.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

 Makefile.am      |    3 
 ethtool-bitops.h |   25 ++
 ethtool-util.h   |   13 +
 ethtool.8.in     |  101 ++++++-
 ethtool.c        |  144 +++++++++-
 rxclass.c        |  809 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 1077 insertions(+), 18 deletions(-)
 create mode 100644 ethtool-bitops.h
 create mode 100644 rxclass.c

diff --git a/Makefile.am b/Makefile.am
index a0d2116..0262c31 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -8,7 +8,8 @@ ethtool_SOURCES = ethtool.c ethtool-copy.h ethtool-util.h	\
 		  amd8111e.c de2104x.c e100.c e1000.c igb.c	\
 		  fec_8xx.c ibm_emac.c ixgb.c ixgbe.c natsemi.c	\
 		  pcnet32.c realtek.c tg3.c marvell.c vioc.c	\
-		  smsc911x.c at76c50x-usb.c sfc.c stmmac.c
+		  smsc911x.c at76c50x-usb.c sfc.c stmmac.c	\
+		  rxclass.c
 
 dist-hook:
 	cp $(top_srcdir)/ethtool.spec $(distdir)
diff --git a/ethtool-bitops.h b/ethtool-bitops.h
new file mode 100644
index 0000000..7101056
--- /dev/null
+++ b/ethtool-bitops.h
@@ -0,0 +1,25 @@
+#ifndef ETHTOOL_BITOPS_H__
+#define ETHTOOL_BITOPS_H__
+
+#define BITS_PER_LONG		__WORDSIZE
+#define BITS_PER_BYTE		8
+#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
+#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
+
+static inline void set_bit(int nr, unsigned long *addr)
+{
+	addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
+}
+
+static inline void clear_bit(int nr, unsigned long *addr)
+{
+	addr[nr / BITS_PER_LONG] &= ~(1UL << (nr % BITS_PER_LONG));
+}
+
+static __always_inline int test_bit(unsigned int nr, const unsigned long *addr)
+{
+	return ((1UL << (nr % BITS_PER_LONG)) &
+		(((unsigned long *)addr)[nr / BITS_PER_LONG])) != 0UL;
+}
+
+#endif
diff --git a/ethtool-util.h b/ethtool-util.h
index f053028..e9300e2 100644
--- a/ethtool-util.h
+++ b/ethtool-util.h
@@ -103,4 +103,17 @@ int sfc_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
 int st_mac100_dump_regs(struct ethtool_drvinfo *info,
 			struct ethtool_regs *regs);
 int st_gmac_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
+
+/* Rx flow classification */
+#include <sys/ioctl.h>
+#include <net/if.h>
+
+int rxclass_parse_ruleopts(char **optstr, int opt_cnt,
+			   struct ethtool_rx_flow_spec *fsp, __u8 *loc_valid);
+int rxclass_rule_getall(int fd, struct ifreq *ifr);
+int rxclass_rule_get(int fd, struct ifreq *ifr, __u32 loc);
+int rxclass_rule_ins(int fd, struct ifreq *ifr,
+		     struct ethtool_rx_flow_spec *fsp, __u8 loc_valid);
+int rxclass_rule_del(int fd, struct ifreq *ifr, __u32 loc);
+
 #endif
diff --git a/ethtool.8.in b/ethtool.8.in
index 133825b..c183a3d 100644
--- a/ethtool.8.in
+++ b/ethtool.8.in
@@ -40,21 +40,36 @@
 [\\fB\\$1\\fP\ \\fIN\\fP]
 ..
 .\"
+.\"	.BM - same as above but has a mask field for format "[value N [m N]]"
+.\"
+.de BM
+[\\fB\\$1\\fP\ \\fIN\\fP\ [\\fBm\\fP\ \\fIN\\fP]]
+..
+.\"
 .\"	\(*MA - mac address
 .\"
 .ds MA \fIxx\fP\fB:\fP\fIyy\fP\fB:\fP\fIzz\fP\fB:\fP\fIaa\fP\fB:\fP\fIbb\fP\fB:\fP\fIcc\fP
 .\"
+.\"	\(*PA - IP address
+.\"
+.ds PA \fIx\fP\fB.\fP\fIx\fP\fB.\fP\fIx\fP\fB.\fP\fIx\fP
+.\"
 .\"	\(*WO - wol flags
 .\"
 .ds WO \fBp\fP|\fBu\fP|\fBm\fP|\fBb\fP|\fBa\fP|\fBg\fP|\fBs\fP|\fBd\fP...
 .\"
 .\"	\(*FL - flow type values
 .\"
-.ds FL \fBtcp4\fP|\fBudp4\fP|\fBah4\fP|\fBsctp4\fP|\fBtcp6\fP|\fBudp6\fP|\fBah6\fP|\fBsctp6\fP
+.ds FL \fBtcp4\fP|\fBudp4\fP|\fBah4\fP||\fBesp4\fP|\fBsctp4\fP|\fBtcp6\fP|\fBudp6\fP|\fBah6\fP|\fBesp6\fP|\fBsctp6\fP
 .\"
 .\"	\(*HO - hash options
 .\"
 .ds HO \fBm\fP|\fBv\fP|\fBt\fP|\fBs\fP|\fBd\fP|\fBf\fP|\fBn\fP|\fBr\fP...
+.\"
+.\"	\(*L4 - L4 proto options
+.\"
+.ds L4 \fBtcp\fP|\fBudp\fP|\fBsctp\fP|\fBah\fP|\fBesp\fP|\fIN\fP
+.\"
 .\" Start URL.
 .de UR
 .  ds m1 \\$1\"
@@ -224,11 +239,27 @@ ethtool \- query or control network driver and hardware settings
 .B ethtool \-n
 .I ethX
 .RB [ rx-flow-hash \ \*(FL]
+.RB [ rx-rings ]
+.RB [ rx-class-rule-all ]
+.RB [ rx-class-rule
+.IR N ]
 
 .B ethtool \-N
 .I ethX
-.RB [ rx-flow-hash \ \*(FL
-.RB \ \*(HO]
+.RB [ rx-flow-hash \ \*(FL \  \*(HO]
+.BN rx-class-rule-del
+.RB [ rx-class-rule-add \  ip4 | ip6 \ \*(L4
+.RB [ sip \ \*(PA\ [ m \ \*(PA]]
+.RB [ dip \ \*(PA\ [ m \ \*(PA]]
+.BM tos
+.BM sport
+.BM dport
+.BM spi
+.RB [ ring
+.IR N |
+.BR drop ]
+.RB [ loc
+.IR N ]]
 
 .B ethtool \-x|\-\-show\-rxfh\-indir
 .I ethX
@@ -624,6 +655,15 @@ Retrieves the hash options for the specified network traffic type.
 .PD
 .RE
 .TP
+.B rx-rings
+Retrieves the number of RX rings available for this interface.
+.TP
+.B rx-class-rule-all
+Retrieves all the RX classification rules programmed for this interface.
+.TP
+.BI rx-class-rule \ N
+Retrieves the RX classification rule with the given ID.
+.TP
 .B \-N \-\-config-nfc
 Configures the receive network flow classification.
 .TP
@@ -654,10 +694,63 @@ Hash on bytes 0 and 1 of the Layer 4 header of the rx packet.
 Hash on bytes 2 and 3 of the Layer 4 header of the rx packet.
 .TP 3
 .B r
-Discard all packets of this flow type. When this option is set, all other options are ignored.
+Discard all packets of this flow type. When this option is set, all
+other options are ignored.
 .PD
 .RE
 .TP
+.BI rx-class-rule-del \ N
+Deletes the RX classification rule with the given ID.
+.TP
+.BR rx-class-rule-add
+Adds an RX packet classification rule.
+.TP
+.A1 ip4 ip6
+Select IP version for the rule - IPv4 or IPv6
+.TP
+.RB \*(L4
+Select the L4 protocol for the rule. An integer value for a user defined
+protocol can be specified.
+.TP
+.BR sip \ \*(PA\ [ m \ \*(PA]
+Specify the source IP address of the incoming packet to
+match along with an optional mask.
+.TP
+.BR dip \ \*(PA\ [ m \ \*(PA]
+Specify the destination IP address of the incoming packet to
+match along with an optional mask.
+.TP
+.BI tos \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Specify the value of the Type of Service field in the incoming packet to
+match along with an optional mask.
+.TP
+.BI sport \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Specify the value of the source port field (applicable to
+TCP/UDP packets)in the incoming packet to match along with an
+optional mask.
+.TP
+.BI dport \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Specify the value of the destination port field (applicable to
+TCP/UDP packets)in the incoming packet to match along with an
+optional mask.
+.TP
+.BI spi \ N \\fR\ [\\fPm \ N \\fR]\\fP
+Specify the value of the SPI field (applicable to
+SCTP packets)in the incoming packet to match along with an
+optional mask.
+.TP
+.BI ring \ N
+.B |
+.BR drop
+Specify the RX ring index to which a packet matching this
+rule should be steered, or specify if the matching packet
+should be dropped.
+.TP
+.BI loc \ \ \ N
+Specify the location/ID to insert the rule. This will overwrite
+any rule present in that location and will not go through any
+of the rule ordering process.
+.TP
 .B \-x \-\-show\-rxfh\-indir
 Retrieves the receive flow hash indirection table.
 .TP
diff --git a/ethtool.c b/ethtool.c
index 1afdfe4..b624980 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -6,6 +6,7 @@
  * Kernel 2.4 update Copyright 2001 Jeff Garzik <jgarzik@mandrakesoft.com>
  * Wake-on-LAN,natsemi,misc support by Tim Hockin <thockin@sun.com>
  * Portions Copyright 2002 Intel
+ * Portions Copyright (C) Sun Microsystems 2008
  * do_test support by Eli Kupermann <eli.kupermann@intel.com>
  * ETHTOOL_PHYS_ID support by Chris Leech <christopher.leech@intel.com>
  * e1000 support by Scott Feldman <scott.feldman@intel.com>
@@ -14,6 +15,7 @@
  * amd8111e support by Reeja John <reeja.john@amd.com>
  * long arguments by Andi Kleen.
  * SMSC LAN911x support by Steve Glendinning <steve.glendinning@smsc.com>
+ * Rx Network Flow Control configuration support <santwona.behera@sun.com>
  * Various features by Ben Hutchings <bhutchings@solarflare.com>;
  *	Copyright 2009, 2010 Solarflare Communications
  *
@@ -32,7 +34,7 @@
 #include <sys/ioctl.h>
 #include <sys/stat.h>
 #include <stdio.h>
-#include <string.h>
+#include <strings.h>
 #include <errno.h>
 #include <net/if.h>
 #include <sys/utsname.h>
@@ -44,6 +46,8 @@
 #include <arpa/inet.h>
 
 #include <linux/sockios.h>
+#include <sys/socket.h>
+#include <arpa/inet.h>
 #include "ethtool-util.h"
 
 
@@ -232,15 +236,28 @@ static struct option {
     { "-S", "--statistics", MODE_GSTATS, "Show adapter statistics" },
     { "-n", "--show-nfc", MODE_GNFC, "Show Rx network flow classification "
 		"options",
-		"		[ rx-flow-hash tcp4|udp4|ah4|sctp4|"
-		"tcp6|udp6|ah6|sctp6 ]\n" },
+		"		[ rx-flow-hash tcp4|udp4|ah4|esp4|sctp4|"
+		"tcp6|udp6|ah6|esp6|sctp6 ]\n"
+		"		[ rx-rings ]\n"
+		"		[ rx-class-rule-all ]\n"
+		"		[ rx-class-rule %d ]\n"},
     { "-f", "--flash", MODE_FLASHDEV, "FILENAME " "Flash firmware image "
     		"from the specified file to a region on the device",
 		"               [ REGION-NUMBER-TO-FLASH ]\n" },
     { "-N", "--config-nfc", MODE_SNFC, "Configure Rx network flow "
 		"classification options",
-		"		[ rx-flow-hash tcp4|udp4|ah4|sctp4|"
-		"tcp6|udp6|ah6|sctp6 m|v|t|s|d|f|n|r... ]\n" },
+		"		[ rx-flow-hash tcp4|udp4|ah4|esp4|sctp4|"
+		"tcp6|udp6|ah6|esp6|sctp6 m|v|t|s|d|f|n|r... ]\n"
+		"		[ rx-class-rule-del %d ]\n"
+		"		[ rx-class-rule-add ip4|ip6 tcp|udp|sctp|ah|esp|%d\n"
+		"			[sip %d.%d.%d.%d [m %d.%d.%d.%d]]\n"
+		"			[dip %d.%d.%d.%d [m %d.%d.%d.%d]]\n"
+		"			[tos %d [m %x]]\n"
+		"			[sport %d [m %x]]\n"
+		"			[dport %d [m %x]]\n"
+		"			[spi %d [m %x]]\n"
+		"			[ring %d | drop]\n"
+		"			[loc %d]]\n"},
     { "-x", "--show-rxfh-indir", MODE_GRXFHINDIR, "Show Rx flow hash "
 		"indirection" },
     { "-X", "--set-rxfh-indir", MODE_SRXFHINDIR, "Set Rx flow hash indirection",
@@ -408,6 +425,14 @@ static int msglvl_changed;
 static u32 msglvl_wanted = 0;
 static u32 msglvl_mask = 0;
 
+static int rx_rings_get = 0;
+static int rx_class_rule_get = -1;
+static int rx_class_rule_getall = 0;
+static int rx_class_rule_del = -1;
+static int rx_class_rule_added = 0;
+static struct ethtool_rx_flow_spec rx_rule_fs;
+static u8 rxclass_loc_valid = 0;
+
 static enum {
 	ONLINE=0,
 	OFFLINE,
@@ -777,7 +802,9 @@ static int rxflow_str_to_type(const char *str)
 	else if (!strcmp(str, "udp4"))
 		flow_type = UDP_V4_FLOW;
 	else if (!strcmp(str, "ah4"))
-		flow_type = AH_ESP_V4_FLOW;
+		flow_type = AH_V4_FLOW;
+	else if (!strcmp(str, "esp4"))
+		flow_type = ESP_V4_FLOW;
 	else if (!strcmp(str, "sctp4"))
 		flow_type = SCTP_V4_FLOW;
 	else if (!strcmp(str, "tcp6"))
@@ -785,7 +812,9 @@ static int rxflow_str_to_type(const char *str)
 	else if (!strcmp(str, "udp6"))
 		flow_type = UDP_V6_FLOW;
 	else if (!strcmp(str, "ah6"))
-		flow_type = AH_ESP_V6_FLOW;
+		flow_type = AH_V6_FLOW;
+	else if (!strcmp(str, "esp6"))
+		flow_type = ESP_V6_FLOW;
 	else if (!strcmp(str, "sctp6"))
 		flow_type = SCTP_V6_FLOW;
 	else if (!strcmp(str, "ether"))
@@ -945,6 +974,23 @@ static void parse_cmdline(int argc, char **argp)
 						rxflow_str_to_type(argp[i]);
 					if (!rx_fhash_get)
 						show_usage(1);
+				} else if (!strcmp(argp[i], "rx-rings")) {
+					i += 1;
+					rx_rings_get = 1;
+				} else if (!strcmp(argp[i],
+						   "rx-class-rule-all")) {
+					i += 1;
+					rx_class_rule_getall = 1;
+				} else if (!strcmp(argp[i], "rx-class-rule")) {
+					i += 1;
+					if (i >= argc) {
+						show_usage(1);
+						break;
+					}
+					rx_class_rule_get =
+						strtol(argp[i], NULL, 0);
+					if (rx_class_rule_get < 0)
+						show_usage(1);
 				} else
 					show_usage(1);
 				break;
@@ -978,8 +1024,37 @@ static void parse_cmdline(int argc, char **argp)
 						show_usage(1);
 					else
 						rx_fhash_changed = 1;
-				} else
+				} else if (!strcmp(argp[i],
+						   "rx-class-rule-del")) {
+					i += 1;
+					if (i >= argc) {
+						show_usage(1);
+						break;
+					}
+					rx_class_rule_del =
+						strtol(argp[i], NULL, 0);
+					if (rx_class_rule_del < 0)
+						show_usage(1);
+				} else if (!strcmp(argp[i],
+						   "rx-class-rule-add")) {
+					i += 1;
+					if (i >= argc) {
+						show_usage(1);
+						break;
+					}
+					if (rxclass_parse_ruleopts(&argp[i],
+								   argc - i,
+								   &rx_rule_fs,
+								   &rxclass_loc_valid) < 0) {
+						show_usage(1);
+					} else {
+						i = argc;
+						rx_class_rule_added = 1;
+					}
+				} else {
 					show_usage(1);
+				}
+
 				break;
 			}
 			if (mode == MODE_SRXFHINDIR) {
@@ -1917,9 +1992,12 @@ static int dump_rxfhash(int fhash, u64 val)
 	case SCTP_V4_FLOW:
 		fprintf(stdout, "SCTP over IPV4 flows");
 		break;
-	case AH_ESP_V4_FLOW:
+	case AH_V4_FLOW:
 		fprintf(stdout, "IPSEC AH over IPV4 flows");
 		break;
+	case ESP_V4_FLOW:
+		fprintf(stdout, "IPSEC ESP over IPV4 flows");
+		break;
 	case TCP_V6_FLOW:
 		fprintf(stdout, "TCP over IPV6 flows");
 		break;
@@ -1929,9 +2007,12 @@ static int dump_rxfhash(int fhash, u64 val)
 	case SCTP_V6_FLOW:
 		fprintf(stdout, "SCTP over IPV6 flows");
 		break;
-	case AH_ESP_V6_FLOW:
+	case AH_V6_FLOW:
 		fprintf(stdout, "IPSEC AH over IPV6 flows");
 		break;
+	case ESP_V6_FLOW:
+		fprintf(stdout, "IPSEC ESP over IPV6 flows");
+		break;
 	default:
 		break;
 	}
@@ -2911,14 +2992,12 @@ static int do_gstats(int fd, struct ifreq *ifr)
 	return 0;
 }
 
-
 static int do_srxclass(int fd, struct ifreq *ifr)
 {
 	int err;
+	struct ethtool_rxnfc nfccmd;
 
 	if (rx_fhash_changed) {
-		struct ethtool_rxnfc nfccmd;
-
 		nfccmd.cmd = ETHTOOL_SRXFH;
 		nfccmd.flow_type = rx_fhash_set;
 		nfccmd.data = rx_fhash_val;
@@ -2930,6 +3009,20 @@ static int do_srxclass(int fd, struct ifreq *ifr)
 
 	}
 
+	if (rx_class_rule_added) {
+		err = rxclass_rule_ins(fd, ifr, &rx_rule_fs,
+				       rxclass_loc_valid);
+		if (err < 0)
+			fprintf(stderr, "Cannot insert RX classification rule\n");
+	}
+
+	if (rx_class_rule_del >= 0) {
+		err = rxclass_rule_del(fd, ifr, rx_class_rule_del);
+
+		if (err < 0)
+			fprintf(stderr, "Cannot delete RX classification rule\n");
+	}
+
 	return 0;
 }
 
@@ -2950,6 +3043,31 @@ static int do_grxclass(int fd, struct ifreq *ifr)
 			dump_rxfhash(rx_fhash_get, nfccmd.data);
 	}
 
+	if (rx_rings_get) {
+		struct ethtool_rxnfc nfccmd;
+
+		nfccmd.cmd = ETHTOOL_GRXRINGS;
+		ifr->ifr_data = (caddr_t)&nfccmd;
+		err = ioctl(fd, SIOCETHTOOL, ifr);
+		if (err < 0)
+			perror("Cannot get RX rings");
+		else
+			fprintf(stdout, "%d RX rings available\n",
+				(int)nfccmd.data);
+	}
+
+	if (rx_class_rule_get >= 0) {
+		err = rxclass_rule_get(fd, ifr, rx_class_rule_get);
+		if (err < 0)
+			fprintf(stderr, "Cannot get RX classification rule\n");
+	}
+
+	if (rx_class_rule_getall) {
+		err = rxclass_rule_getall(fd, ifr);
+		if (err < 0)
+			fprintf(stderr, "RX classification rule retrieval failed\n");
+	}
+
 	return 0;
 }
 
diff --git a/rxclass.c b/rxclass.c
new file mode 100644
index 0000000..fd01a32
--- /dev/null
+++ b/rxclass.c
@@ -0,0 +1,809 @@
+/*
+ * Copyright (C) 2008 Sun Microsystems, Inc. All rights reserved.
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stddef.h>
+#include <stdlib.h>
+#include <string.h>
+#include <strings.h>
+
+#include <linux/sockios.h>
+#include <arpa/inet.h>
+#include "ethtool-util.h"
+#include "ethtool-bitops.h"
+
+/*
+ * This is a rule manager implementation for ordering rx flow
+ * classification rules in a longest prefix first match order.
+ * The assumption is that this rule manager is the only one adding rules to
+ * the device's hardware classifier.
+ */
+
+struct rmgr_ctrl {
+	/* slot contains a bitmap indicating which filters are valid */
+	unsigned long		*slot;
+	__u32			n_rules;
+	__u32			size;
+};
+
+static struct rmgr_ctrl rmgr;
+static int rmgr_init_done = 0;
+
+#ifndef SIOCETHTOOL
+#define SIOCETHTOOL     0x8946
+#endif
+
+static void rmgr_print_ipv4_rule(struct ethtool_rx_flow_spec *fsp)
+{
+	char		chan[16];
+	char		l4_proto[16];
+	__u32		sip, dip, sipm, dipm;
+
+	sip = ntohl(fsp->h_u.tcp_ip4_spec.ip4src);
+	dip = ntohl(fsp->h_u.tcp_ip4_spec.ip4dst);
+	sipm = ntohl(fsp->m_u.tcp_ip4_spec.ip4src);
+	dipm = ntohl(fsp->m_u.tcp_ip4_spec.ip4dst);
+
+	if (fsp->ring_cookie != RX_CLS_FLOW_DISC)
+		sprintf(chan, "Rx Ring [%d]", (int)fsp->ring_cookie);
+	else
+		sprintf(chan, "Discard");
+
+	switch (fsp->flow_type) {
+	case TCP_V4_FLOW:
+	case UDP_V4_FLOW:
+	case SCTP_V4_FLOW:
+	case AH_V4_FLOW:
+	case ESP_V4_FLOW:
+	case IP_USER_FLOW:
+		fprintf(stdout,
+			"      IPv4 Rule:  ID[%d] Target[%s]\n"
+			"      IP src addr[%d.%d.%d.%d] mask[%d.%d.%d.%d]\n"
+			"      IP dst addr[%d.%d.%d.%d] mask[%d.%d.%d.%d]\n"
+			"      IP TOS[0x%x] mask[0x%x]\n",
+			fsp->location, chan,
+			(sip & 0xff000000) >> 24,
+			(sip & 0xff0000) >> 16,
+			(sip & 0xff00) >> 8,
+			sip & 0xff,
+			(sipm & 0xff000000) >> 24,
+			(sipm & 0xff0000) >> 16,
+			(sipm & 0xff00) >> 8,
+			sipm & 0xff,
+			(dip & 0xff000000) >> 24,
+			(dip & 0xff0000) >> 16,
+			(dip & 0xff00) >> 8,
+			dip & 0xff,
+			(dipm & 0xff000000) >> 24,
+			(dipm & 0xff0000) >> 16,
+			(dipm & 0xff00) >> 8,
+			dipm & 0xff,
+			fsp->h_u.tcp_ip4_spec.tos,
+			fsp->m_u.tcp_ip4_spec.tos);
+
+		switch (fsp->flow_type) {
+		case TCP_V4_FLOW:
+			sprintf(l4_proto, "TCP");
+			break;
+		case UDP_V4_FLOW:
+			sprintf(l4_proto, "UDP");
+			break;
+		case SCTP_V4_FLOW:
+			sprintf(l4_proto, "SCTP");
+			break;
+		case AH_V4_FLOW:
+			sprintf(l4_proto, "AH");
+			break;
+		case ESP_V4_FLOW:
+			sprintf(l4_proto, "ESP");
+			break;
+		default:
+			break;
+		}
+		switch (fsp->flow_type) {
+		case TCP_V4_FLOW:
+		case UDP_V4_FLOW:
+		case SCTP_V4_FLOW:
+			fprintf(stdout,
+				"      L4 proto[%s]\n"
+				"      L4 src port[%d] mask[0x%x]\n"
+				"      L4 dst port[%d] mask[0x%x]\n",
+				l4_proto,
+				ntohs(fsp->h_u.tcp_ip4_spec.psrc),
+				ntohs(fsp->m_u.tcp_ip4_spec.psrc),
+				ntohs(fsp->h_u.tcp_ip4_spec.pdst),
+				ntohs(fsp->m_u.tcp_ip4_spec.pdst));
+			break;
+		case AH_V4_FLOW:
+		case ESP_V4_FLOW:
+			fprintf(stdout,
+				"      L4 proto[%s]\n"
+				"      L4 SPI[%d] mask[0x%x]\n",
+				l4_proto,
+				ntohl(fsp->h_u.esp_ip4_spec.spi),
+				ntohl(fsp->m_u.esp_ip4_spec.spi));
+			break;
+		case IP_USER_FLOW:
+			fprintf(stdout,
+				"      L4 user proto[%d]\n"
+				"      L4 first 4 bytes[0x%x] mask[0x%x]\n",
+				fsp->h_u.usr_ip4_spec.proto,
+				ntohl(fsp->h_u.usr_ip4_spec.l4_4_bytes),
+				ntohl(fsp->m_u.usr_ip4_spec.l4_4_bytes));
+			break;
+		default:
+			break;
+		}
+		break;
+	default:
+		fprintf(stdout,
+			"      Unknown L4 proto, type[%d]\n", fsp->flow_type);
+		break;
+	}
+
+	fprintf(stdout, "\n\n");
+}
+
+static void rmgr_print_rule(struct ethtool_rx_flow_spec *fsp)
+{
+	/* print the rule in this location */
+	switch (fsp->flow_type) {
+	case TCP_V4_FLOW:
+	case UDP_V4_FLOW:
+	case SCTP_V4_FLOW:
+	case AH_V4_FLOW:
+	case ESP_V4_FLOW:
+		rmgr_print_ipv4_rule(fsp);
+		break;
+	case IP_USER_FLOW:
+		if (fsp->h_u.usr_ip4_spec.ip_ver == ETH_RX_NFC_IP4) {
+			rmgr_print_ipv4_rule(fsp);
+			break;
+		}
+		/* IPv6 User Flow falls through to the case below */
+	case TCP_V6_FLOW:
+	case UDP_V6_FLOW:
+	case SCTP_V6_FLOW:
+	case AH_V6_FLOW:
+	case ESP_V6_FLOW:
+		fprintf(stderr, "IPv6 flows not implemented\n");
+		break;
+	default:
+		fprintf(stderr, "rmgr: Unknown flow type\n");
+		break;
+	}
+}
+
+static int rmgr_ins(__u32 loc)
+{
+	/* verify location is in rule manager range */
+	if ((loc < 0) || (loc >= rmgr.size)) {
+		fprintf(stderr, "rmgr: Location out of range\n");
+		return -1;
+	}
+
+	/* set bit for the rule */
+	set_bit(loc, rmgr.slot);
+
+	return 0;
+}
+
+static int rmgr_find(__u32 loc)
+{
+	/* verify location is in rule manager range */
+	if ((loc < 0) || (loc >= rmgr.size)) {
+		fprintf(stderr, "rmgr: Location out of range\n");
+		return -1;
+	}
+
+	/* if slot is found return 0 indicating success */
+	if (test_bit(loc, rmgr.slot))
+		return 0;
+
+	/* rule not found */
+	fprintf(stderr, "rmgr: No such rule\n");
+	return -1;
+}
+
+static int rmgr_del(__u32 loc)
+{
+	/* verify rule exists before attempting to delete */
+	int err = rmgr_find(loc);
+	if (err)
+		return err;
+
+	/* clear bit for the rule */
+	clear_bit(loc, rmgr.slot);
+
+	return 0;
+}
+
+static int rmgr_add(struct ethtool_rx_flow_spec *fsp, __u8 loc_valid)
+{
+	__u32 loc = fsp->location;
+
+	/* location provided, insert rule and update regions to match rule */
+	if (loc_valid)
+		return rmgr_ins(loc);
+
+	/* find an open slot */
+	for (loc = 0; loc < rmgr.size; loc += BITS_PER_LONG) {
+		if ((rmgr.slot[loc / BITS_PER_LONG]) != ~0UL)
+			break;
+	}
+
+	/* find and use available location in slot */
+	for (; loc < rmgr.size; loc++) {
+		if (!test_bit(loc, rmgr.slot)) {
+			fsp->location = loc;
+			return rmgr_ins(loc);
+		}
+	}
+
+	/* No space to add this rule */
+	fprintf(stderr, "rmgr: Cannot find appropriate slot to insert rule\n");
+
+	return -1;
+}
+
+static int rmgr_init(int fd, struct ifreq *ifr)
+{
+	struct ethtool_rxnfc *nfccmd;
+	int err, i;
+	__u32 *rule_locs;
+
+	if (rmgr_init_done)
+		return 0;
+
+	/* clear rule manager settings */
+	bzero((void *)&rmgr, sizeof(struct rmgr_ctrl));
+
+	/* allocate memory for count request */
+	nfccmd = calloc(1, sizeof(*nfccmd));
+	if (!nfccmd) {
+		perror("rmgr: Cannot allocate memory for RX class rule data");
+		return -1;
+	}
+
+	/* request count and store in rmgr.n_rules */
+	nfccmd->cmd = ETHTOOL_GRXCLSRLCNT;
+	ifr->ifr_data = (caddr_t)nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	rmgr.n_rules = nfccmd->rule_cnt;
+	free(nfccmd);
+	if (err < 0) {
+		perror("rmgr: Cannot get RX class rule count");
+		return -1;
+	}
+
+	/* alloc memory for request of location list */
+	nfccmd = calloc(1, sizeof(*nfccmd) + (rmgr.n_rules * sizeof(__u32)));
+	if (!nfccmd) {
+		perror("rmgr: Cannot allocate memory for RX class rule locations");
+		return -1;
+	}
+
+	/* request location list */
+	nfccmd->cmd = ETHTOOL_GRXCLSRLALL;
+	nfccmd->rule_cnt = rmgr.n_rules;
+	ifr->ifr_data = (caddr_t)nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0) {
+		perror("rmgr: Cannot get RX class rules");
+		free(nfccmd);
+		return -1;
+	}
+
+	/* intitialize bitmap for storage of valid locations */
+	rmgr.size = nfccmd->data;
+	rmgr.slot = calloc(1, BITS_TO_LONGS(rmgr.size) * sizeof(long));
+	if (!rmgr.slot) {
+		perror("rmgr: Cannot allocate memory for RX class rules");
+		return -1;
+	}
+
+	/* write locations to bitmap */
+	rule_locs = nfccmd->rule_locs;
+	for (i = 0; i < rmgr.n_rules; i++) {
+		err = rmgr_ins(rule_locs[i]);
+		if (err < 0)
+			break;
+	}
+
+	/* free memory and set flag to avoid reinit */
+	free(nfccmd);
+	rmgr_init_done = 1;
+
+	return err;
+}
+
+static void rmgr_cleanup(void)
+{
+	if (!rmgr_init_done)
+		return;
+
+	rmgr_init_done = 0;
+
+	free(rmgr.slot);
+	rmgr.slot = NULL;
+	rmgr.size = 0;
+}
+
+int rxclass_rule_getall(int fd, struct ifreq *ifr)
+{
+	struct ethtool_rxnfc nfccmd;
+	int err, i, j;
+
+	/* init table of available rules */
+	err = rmgr_init(fd, ifr);
+	if (err < 0)
+		return err;
+
+	fprintf(stdout, "Total %d rules\n\n", rmgr.n_rules);
+
+	/* fetch and display all available rules */
+	for (i = 0; i < rmgr.size; i += BITS_PER_LONG) {
+		if (rmgr.slot[i / BITS_PER_LONG] == 0UL)
+			continue;
+		for (j = 0; j < BITS_PER_LONG; j++) {
+			if (!test_bit(i + j, rmgr.slot))
+				continue;
+			nfccmd.cmd = ETHTOOL_GRXCLSRULE;
+			bzero(&nfccmd.fs, sizeof(struct ethtool_rx_flow_spec));
+			nfccmd.fs.location = i + j;
+			ifr->ifr_data = (caddr_t)&nfccmd;
+			err = ioctl(fd, SIOCETHTOOL, ifr);
+			if (err < 0) {
+				perror("rmgr: Cannot get RX class rule");
+				return -1;
+			}
+			rmgr_print_rule(&nfccmd.fs);
+		}
+	}
+
+	rmgr_cleanup();
+
+	return 0;
+}
+
+int rxclass_rule_get(int fd, struct ifreq *ifr, __u32 loc)
+{
+	struct ethtool_rxnfc nfccmd;
+	int err;
+
+	/* init table of available rules */
+	err = rmgr_init(fd, ifr);
+	if (err < 0)
+		return err;
+
+	/* verify rule exists before attempting to display */
+	err = rmgr_find(loc);
+	if (err < 0)
+		return err;
+
+	/* fetch rule from netdev and display */
+	nfccmd.cmd = ETHTOOL_GRXCLSRULE;
+	bzero(&nfccmd.fs, sizeof(struct ethtool_rx_flow_spec));
+	nfccmd.fs.location = loc;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0) {
+		perror("rmgr: Cannot get RX class rule");
+		return -1;
+	}
+	rmgr_print_rule(&nfccmd.fs);
+
+	rmgr_cleanup();
+
+	return 0;
+}
+
+int rxclass_rule_ins(int fd, struct ifreq *ifr,
+		     struct ethtool_rx_flow_spec *fsp, __u8 loc_valid)
+{
+	struct ethtool_rxnfc nfccmd;
+	int err;
+
+	/* init table of available rules */
+	err = rmgr_init(fd, ifr);
+	if (err < 0)
+		return err;
+
+	/* verify rule location */
+	err = rmgr_add(fsp, loc_valid);
+	if (err < 0)
+		return err;
+
+	/* notify netdev of new rule */
+	nfccmd.cmd = ETHTOOL_SRXCLSRLINS;
+	nfccmd.fs = *fsp;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0) {
+		perror("rmgr: Cannot insert RX class rule");
+		return -1;
+	}
+	rmgr.n_rules++;
+
+	printf("Added rule with ID %d\n", fsp->location);
+
+	rmgr_cleanup();
+
+	return 0;
+}
+
+int rxclass_rule_del(int fd, struct ifreq *ifr, __u32 loc)
+{
+	struct ethtool_rxnfc nfccmd;
+	int err;
+
+	/* init table of available rules */
+	err = rmgr_init(fd, ifr);
+	if (err < 0)
+		return err;
+
+	/* verify rule exists */
+	err = rmgr_del(loc);
+	if (err < 0)
+		return err;
+
+	/* notify netdev of rule removal */
+	nfccmd.cmd = ETHTOOL_SRXCLSRLDEL;
+	nfccmd.fs.location = loc;
+	ifr->ifr_data = (caddr_t)&nfccmd;
+	err = ioctl(fd, SIOCETHTOOL, ifr);
+	if (err < 0) {
+		perror("rmgr: Cannot delete RX class rule");
+		return -1;
+	}
+	rmgr.n_rules--;
+
+	rmgr_cleanup();
+
+	return 0;
+}
+
+int rxclass_parse_ruleopts(char **optstr, int opt_cnt,
+			   struct ethtool_rx_flow_spec *fsp,
+			   u_int8_t *loc_valid)
+{
+	int i = 0;
+
+	u_int8_t discard, ring_set;
+	u_int32_t ipsa, ipsm, ipda, ipdm, spi, spim;
+	u_int16_t sp, spm, dp, dpm;
+	u_int8_t ip_ver, proto, tos, tm;
+	struct in_addr in_addr;
+
+	if (*optstr == NULL || **optstr == '\0' || opt_cnt < 2) {
+		fprintf(stdout, "Add rule, invalid syntax\n");
+		return -1;
+	}
+
+	bzero(fsp, sizeof(struct ethtool_rx_flow_spec));
+	ipsa = ipda = ipsm = ipdm = spi = spim = 0x0;
+	sp = dp = spm = dpm = 0x0;
+	ip_ver = proto = tos = tm = 0x0;
+	discard = ring_set = 0;
+
+	if (!strcmp(optstr[i], "ip4")) {
+		ip_ver = ETH_RX_NFC_IP4;
+	} else if (!strcmp(optstr[i], "ip6")) {
+		fprintf(stdout, "IPv6 not yet implemented\n");
+		return -1;
+	} else {
+		fprintf(stdout, "Add rule, invalid syntax for IP version\n");
+		return -1;
+	}
+
+	i++;
+
+	switch (ip_ver) {
+	case ETH_RX_NFC_IP4:
+		if (!strcmp(optstr[i], "tcp"))
+			fsp->flow_type = TCP_V4_FLOW;
+		else if (!strcmp(optstr[i], "udp"))
+			fsp->flow_type = UDP_V4_FLOW;
+		else if (!strcmp(optstr[i], "sctp"))
+			fsp->flow_type = SCTP_V4_FLOW;
+		else if (!strcmp(optstr[i], "ah"))
+			fsp->flow_type = AH_V4_FLOW;
+		else if (!strcmp(optstr[i], "esp"))
+			fsp->flow_type = ESP_V4_FLOW;
+		break;
+	default:
+		fprintf(stdout, "Add rule, Invalid IP version %d\n", ip_ver);
+			return -1;
+	}
+
+	if (fsp->flow_type == 0) {
+		proto = (u_int8_t)strtoul(optstr[i], (char **)NULL, 0);
+		if (proto != 0) {
+			fprintf(stdout, "Add rule, user defined proto %d\n",
+				proto);
+			fsp->flow_type = IP_USER_FLOW;
+			fsp->h_u.usr_ip4_spec.proto = proto;
+			fsp->h_u.usr_ip4_spec.ip_ver = ip_ver;
+		} else {
+			fprintf(stdout, "Add rule, invalid IP proto %s\n",
+				optstr[i]);
+			return -1;
+		}
+	}
+
+	for (i = 2; i < opt_cnt;) {
+		if (!strcmp(optstr[i], "tos")) {
+			tos = (u_int8_t)strtoul(optstr[i+1], (char **)NULL,
+						 0);
+			tm = 0xff;
+			fsp->h_u.tcp_ip4_spec.tos = tos;
+
+			i += 2;
+			if (opt_cnt > (i+1)) {
+				if (!strcmp(optstr[i], "m")) {
+					tm = (u_int8_t)strtoul(optstr[i+1],
+							       (char **)NULL,
+							       16);
+					i += 2;
+				}
+			}
+			fsp->m_u.tcp_ip4_spec.tos = tm;
+		} else if (!strcmp(optstr[i], "sip")) {
+			if (strchr(optstr[i+1], '.') == NULL) {
+				ipsa = strtoul(optstr[i+1], (char **)NULL, 16);
+			} else {
+				if (!inet_pton(AF_INET, optstr[i+1], &in_addr)) {
+					fprintf(stdout,
+						"Invalid src address [%s]\n" ,
+						optstr[i+1]);
+					return -1;
+				}
+				ipsa = ntohl(in_addr.s_addr);
+			}
+			ipsm = 0xffffffff;
+			fsp->h_u.tcp_ip4_spec.ip4src = ipsa;
+
+			i += 2;
+			if (opt_cnt > (i+1)) {
+				if (!strcmp(optstr[i], "m")) {
+					if (strchr(optstr[i+1], '.') == NULL) {
+						ipsm = strtoul(optstr[i+1],
+							       (char **)NULL,
+							       16);
+					} else {
+						if (!inet_pton(AF_INET,
+							       optstr[i+1],
+							       &in_addr)) {
+							fprintf(stdout,
+								"Invalid smask"
+								"[%s]\n",
+								optstr[i+1]);
+								return -1;
+						}
+						ipsm = ntohl(in_addr.s_addr);
+					}
+					i += 2;
+				}
+			}
+			fsp->m_u.tcp_ip4_spec.ip4src = ipsm;
+		} else if (!strcmp(optstr[i], "dip")) {
+			if (strchr(optstr[i+1], '.') == NULL) {
+				ipda = strtoul(optstr[i+1], (char **)NULL, 16);
+			} else {
+				if (!inet_pton(AF_INET, optstr[i+1], &in_addr)) {
+					fprintf(stdout,
+						"Invalid dst address [%s]\n",
+						optstr[i+1]);
+					return -1;
+				}
+				ipda = ntohl(in_addr.s_addr);
+			}
+			ipdm = 0xffffffff;
+			fsp->h_u.tcp_ip4_spec.ip4dst = ipda;
+
+			i += 2;
+			if (opt_cnt > (i+1)) {
+				if (!strcmp(optstr[i], "m")) {
+					if (strchr(optstr[i+1], '.') == NULL) {
+						ipdm = strtoul(optstr[i+1],
+							       (char **)NULL,
+							       16);
+					} else {
+						if (!inet_pton(AF_INET,
+							       optstr[i+1],
+							       &in_addr)) {
+							fprintf(stdout,
+								"Invalid dmask"
+								"[%s]\n",
+								optstr[i+1]);
+								return -1;
+						}
+						ipdm = ntohl(in_addr.s_addr);
+					}
+					i += 2;
+				}
+			}
+			fsp->m_u.tcp_ip4_spec.ip4dst = ipdm;
+		} else if (!strcmp(optstr[i], "sport")) {
+			switch (fsp->flow_type) {
+			case TCP_V4_FLOW:
+			case UDP_V4_FLOW:
+			case SCTP_V4_FLOW:
+			case TCP_V6_FLOW:
+			case UDP_V6_FLOW:
+			case SCTP_V6_FLOW:
+			case IP_USER_FLOW:
+				break;
+			default:
+				fprintf(stdout, "Invalid option <sport> "
+					"for this flow\n");
+				return -1;
+			}
+			sp = (u_int16_t)strtoul(optstr[i+1], (char **)NULL,
+						 0);
+			spm = 0xffff;
+			if (fsp->flow_type == IP_USER_FLOW) {
+				fsp->h_u.usr_ip4_spec.l4_4_bytes |=
+					((u32)sp << 16);
+			} else {
+				fsp->h_u.tcp_ip4_spec.psrc = sp;
+			}
+			i += 2;
+			if (opt_cnt > (i+1)) {
+				if (!strcmp(optstr[i], "m")) {
+					spm = (u_int16_t)strtoul(optstr[i+1],
+								 (char **)NULL,
+								 16);
+					i += 2;
+				}
+			}
+			if (fsp->flow_type == IP_USER_FLOW) {
+				fsp->m_u.usr_ip4_spec.l4_4_bytes |=
+					((u32)spm << 16);
+			} else {
+				fsp->m_u.tcp_ip4_spec.psrc = spm;
+			}
+		} else if (!strcmp(optstr[i], "dport")) {
+			switch (fsp->flow_type) {
+			case TCP_V4_FLOW:
+			case UDP_V4_FLOW:
+			case SCTP_V4_FLOW:
+			case TCP_V6_FLOW:
+			case UDP_V6_FLOW:
+			case SCTP_V6_FLOW:
+			case IP_USER_FLOW:
+				break;
+			default:
+				fprintf(stdout, "Invalid option <dport> "
+					"for this flow\n");
+				return -1;
+			}
+			dp = (u_int16_t)strtoul(optstr[i+1], (char **)NULL,
+						 0);
+			dpm = 0xffff;
+			if (fsp->flow_type == IP_USER_FLOW)
+				fsp->h_u.usr_ip4_spec.l4_4_bytes |= dp;
+			else
+				fsp->h_u.tcp_ip4_spec.pdst = dp;
+
+			i += 2;
+			if (opt_cnt > (i+1)) {
+				if (!strcmp(optstr[i], "m")) {
+					dpm = (u_int16_t)strtoul(optstr[i+1],
+								 (char **)NULL,
+								 16);
+					i += 2;
+				}
+			}
+			if (fsp->flow_type == IP_USER_FLOW)
+				fsp->m_u.usr_ip4_spec.l4_4_bytes |= dpm;
+			else
+				fsp->m_u.tcp_ip4_spec.pdst = dpm;
+		} else if (!strcmp(optstr[i], "spi")) {
+			switch (fsp->flow_type) {
+			case AH_V4_FLOW:
+			case ESP_V4_FLOW:
+			case AH_V6_FLOW:
+			case ESP_V6_FLOW:
+			case IP_USER_FLOW:
+				break;
+			case TCP_V4_FLOW:
+			case UDP_V4_FLOW:
+			case SCTP_V4_FLOW:
+			case TCP_V6_FLOW:
+			case UDP_V6_FLOW:
+			case SCTP_V6_FLOW:
+			default:
+				fprintf(stdout, "Invalid option <spi> "
+					"for this flow\n");
+				return -1;
+			}
+			spi = (u_int32_t)strtoul(optstr[i+1], (char **)NULL,
+						 0);
+			spim = 0xffffffff;
+			if (fsp->flow_type == IP_USER_FLOW)
+				fsp->h_u.usr_ip4_spec.l4_4_bytes = spi;
+			else
+				fsp->h_u.esp_ip4_spec.spi = spi;
+
+			i += 2;
+			if (opt_cnt > (i+1)) {
+				if (!strcmp(optstr[i], "m")) {
+					spim = (u_int32_t)strtoul(optstr[i+1],
+								 (char **)NULL,
+								 16);
+					i += 2;
+				}
+			}
+			if (fsp->flow_type == IP_USER_FLOW)
+				fsp->m_u.usr_ip4_spec.l4_4_bytes = spim;
+			else
+				fsp->m_u.esp_ip4_spec.spi = spim;
+		} else if (!strcmp(optstr[i], "ring")) {
+			if (discard == 1) {
+				fprintf(stdout, "Invalid syntax - <drop> "
+					"option already specified\n");
+				return -1;
+			}
+			fsp->ring_cookie = strtol(optstr[i+1], (char **)NULL,
+						  0);
+			i += 2;
+			ring_set = 1;
+		} else if (!strcmp(optstr[i], "drop")) {
+			if (ring_set == 1) {
+				fprintf(stdout, "Invalid syntax - <ring> "
+					"option already specified\n");
+				return -1;
+			}
+			fsp->ring_cookie = RX_CLS_FLOW_DISC;
+			i++;
+			discard = 1;
+		} else if (!strcmp(optstr[i], "loc")) {
+			fsp->location = strtol(optstr[i+1], (char **)NULL,
+						  0);
+			i += 2;
+			*loc_valid = 1;
+		} else {
+			fprintf(stdout, "Add rule, invalid syntax\n");
+			return -1;
+		}
+	}
+
+	/* Convert all multibyte network fields to network byte order */
+	fsp->h_u.tcp_ip4_spec.ip4src = htonl(fsp->h_u.tcp_ip4_spec.ip4src);
+	fsp->h_u.tcp_ip4_spec.ip4dst = htonl(fsp->h_u.tcp_ip4_spec.ip4dst);
+	fsp->m_u.tcp_ip4_spec.ip4src = htonl(fsp->m_u.tcp_ip4_spec.ip4src);
+	fsp->m_u.tcp_ip4_spec.ip4dst = htonl(fsp->m_u.tcp_ip4_spec.ip4dst);
+
+	switch (fsp->flow_type) {
+	case TCP_V4_FLOW:
+	case UDP_V4_FLOW:
+	case SCTP_V4_FLOW:
+	case TCP_V6_FLOW:
+	case UDP_V6_FLOW:
+	case SCTP_V6_FLOW:
+		fsp->h_u.tcp_ip4_spec.psrc = htons(fsp->h_u.tcp_ip4_spec.psrc);
+		fsp->h_u.tcp_ip4_spec.pdst = htons(fsp->h_u.tcp_ip4_spec.pdst);
+		fsp->m_u.tcp_ip4_spec.psrc = htons(fsp->m_u.tcp_ip4_spec.psrc);
+		fsp->m_u.tcp_ip4_spec.pdst = htons(fsp->m_u.tcp_ip4_spec.pdst);
+		break;
+	case AH_V4_FLOW:
+	case ESP_V4_FLOW:
+	case AH_V6_FLOW:
+	case ESP_V6_FLOW:
+		fsp->h_u.esp_ip4_spec.spi = htonl(fsp->h_u.esp_ip4_spec.spi);
+		fsp->m_u.esp_ip4_spec.spi = htonl(fsp->m_u.esp_ip4_spec.spi);
+		break;
+	case IP_USER_FLOW:
+		fsp->h_u.usr_ip4_spec.l4_4_bytes =
+			htonl(fsp->h_u.usr_ip4_spec.l4_4_bytes);
+		fsp->m_u.usr_ip4_spec.l4_4_bytes =
+			htonl(fsp->m_u.usr_ip4_spec.l4_4_bytes);
+		break;
+	default:
+		break;
+	}
+
+	return 0;
+}


^ permalink raw reply related

* [PATCH] tg3: Expand 5719 workaround
From: Matt Carlson @ 2011-02-11  1:57 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson

As a precautionary measure, expand the fix submitted in commit
4d163b75e979833979cc401ae433cb1d7743d57e entitled "tg3: Fix 5719 A0 tx
completion bug" to apply to all 5719 revisions.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
---
 drivers/net/tg3.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index cc06952..ecb3eb0 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -13318,7 +13318,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
 	}
 
 	/* Determine TSO capabilities */
-	if (tp->pci_chip_rev_id == CHIPREV_ID_5719_A0)
+	if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719)
 		; /* Do nothing. HW bug. */
 	else if (tp->tg3_flags3 & TG3_FLG3_5717_PLUS)
 		tp->tg3_flags2 |= TG3_FLG2_HW_TSO_3;
@@ -13372,7 +13372,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
 	}
 
 	if ((tp->tg3_flags3 & TG3_FLG3_5717_PLUS) &&
-	    tp->pci_chip_rev_id != CHIPREV_ID_5719_A0)
+	    GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5719)
 		tp->tg3_flags3 |= TG3_FLG3_USE_JUMBO_BDFLAG;
 
 	if (!(tp->tg3_flags2 & TG3_FLG2_5705_PLUS) ||
-- 
1.7.3.4



^ permalink raw reply related

* Re: [PATCH] Add JMEMCMP to Berkeley Packet Filters
From: Ian Molton @ 2011-02-11  2:02 UTC (permalink / raw)
  To: Octavian Purdila
  Cc: Eric Dumazet, netdev, rdunlap, isdn, paulus, arnd, davem, herbert,
	ebiederm, alban.crequy, astanciu
In-Reply-To: <AANLkTimOzvMx0FUaqxaa2QVV2FzsuhP8gQVX_CFmhiNR@mail.gmail.com>

On 10/02/11 15:27, Octavian Purdila wrote:
> On Thu, Feb 10, 2011 at 3:35 PM, Ian Molton<ian.molton@collabora.co.uk>  wrote:
>> On 10/02/11 13:24, Eric Dumazet wrote:
>>
>> Hi!
>>
>> Thanks for reviewing! :)
>>
>>>> * Can sk_run_filter() be called in a context where kmalloc(GFP_KERNEL) is
>>>>    not allowed (I think not)
>>>
>>> You cannot use GFP_KERNEL in sk_run_filter() : We run in {soft}irq mode,
>>> in input path.
>>
>> Ok, that can be sorted.
>>
>>>> * Data section allocated with second call to sock_kmalloc().
>>>> * Should the patch be broken into two - one to add the data uploading,
>>>>    one to add the JMEMCMP insn. ?
>>>
>>> May I ask why it is needed at all ?
>>
>> So we can match strings in packet filters... I don't think I understand the
>> question...
>>
>
> Adding a data section (some sort of persistent memory storage that the
> filter can access) is also useful for creating capture triggers, e.g.
> starting capturing after a marker packet has arrived.

Nice to see that people are thinking of more use-cases.

Eric, I think I understand what you meant now...

Our use case is experimental for now, so I wanted to see if other people 
would find this useful, as adding an experimental feature that is never 
used seems pointless.

In our case, we need to match strings in d-bus packets. if the packet is 
not intended for the recipient, it gets dropped, thus avoiding a 
pointless context switch.

-Ian

^ permalink raw reply

* Re: [PATCH] tg3: Expand 5719 workaround
From: David Miller @ 2011-02-11  4:04 UTC (permalink / raw)
  To: mcarlson; +Cc: netdev
In-Reply-To: <1297389447-5324-1-git-send-email-mcarlson@broadcom.com>

From: "Matt Carlson" <mcarlson@broadcom.com>
Date: Thu, 10 Feb 2011 17:57:27 -0800

> As a precautionary measure, expand the fix submitted in commit
> 4d163b75e979833979cc401ae433cb1d7743d57e entitled "tg3: Fix 5719 A0 tx
> completion bug" to apply to all 5719 revisions.
> 
> Signed-off-by: Matt Carlson <mcarlson@broadcom.com>

Applied to net-next-2.6, since that is where this applied cleanly.

If you want this in net-2.6 too, just submit a patch which applies
cleanly there.

Thanks.

^ permalink raw reply

* Re: [PATCH] net: fix ifenslave build flags
From: David Miller @ 2011-02-11  4:04 UTC (permalink / raw)
  To: randy.dunlap; +Cc: netdev, alexey.salmin
In-Reply-To: <4D54826D.3060406@oracle.com>

From: Randy Dunlap <randy.dunlap@oracle.com>
Date: Thu, 10 Feb 2011 16:27:25 -0800

> From: Randy Dunlap <randy.dunlap@oracle.com>
> 
> -I (include path) should be specified for host builds.
> This one was overlooked somehow.  Fixes
> https://bugzilla.kernel.org/show_bug.cgi?id=25902
> 
> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
> Reported-by: Alexey Salmin <alexey.salmin@gmail.com>

I'll apply this, thanks Randy.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox