Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH v2 1/3] gianfar: Implement workaround for eTSEC74 erratum
From: David Miller @ 2010-06-30 18:37 UTC (permalink / raw)
  To: avorontsov
  Cc: manfred.rudigier, Sandeep.Kumar, afleming, netdev, linuxppc-dev
In-Reply-To: <20100630163912.GA23337@oksana.dev.rtsoft.ru>

From: Anton Vorontsov <avorontsov@mvista.com>
Date: Wed, 30 Jun 2010 20:39:12 +0400

> MPC8313ECE says:
> 
> "If MACCFG2[Huge Frame]=0 and the Ethernet controller receives frames
>  which are larger than MAXFRM, the controller truncates the frames to
>  length MAXFRM and marks RxBD[TR]=1 to indicate the error. The controller
>  also erroneously marks RxBD[TR]=1 if the received frame length is MAXFRM
>  or MAXFRM-1, even though those frames are not truncated.
>  No truncation or truncation error occurs if MACCFG2[Huge Frame]=1."
> 
> There are two options to workaround the issue:
> 
> "1. Set MACCFG2[Huge Frame]=1, so no truncation occurs for invalid large
>  frames. Software can determine if a frame is larger than MAXFRM by
>  reading RxBD[LG] or RxBD[Data Length].
> 
>  2. Set MAXFRM to 1538 (0x602) instead of the default 1536 (0x600), so
>  normal-length frames are not marked as truncated. Software can examine
>  RxBD[Data Length] to determine if the frame was larger than MAXFRM-2."
> 
> This patch implements the first workaround option by setting HUGEFRAME
> bit, and gfar_clean_rx_ring() already checks the RxBD[Data Length].
> 
> Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>

Applied.

^ permalink raw reply

* Re: [PATCH 1/3] gianfar: Implement workaround for eTSEC74 erratum
From: David Miller @ 2010-06-30 18:37 UTC (permalink / raw)
  To: avorontsov
  Cc: manfred.rudigier, Sandeep.Kumar, afleming, netdev, linuxppc-dev
In-Reply-To: <20100630163804.GA636@oksana.dev.rtsoft.ru>

From: Anton Vorontsov <avorontsov@mvista.com>
Date: Wed, 30 Jun 2010 20:38:04 +0400

> On Tue, Jun 29, 2010 at 03:16:26PM -0700, David Miller wrote:
>> 
>> I really don't see any value at all to this config option,
>> the errata fixup code should be there all the time.
> 
> Well, at least for eTSEC76 erratum (patch 2/3) we have to touch
> fast path (i.e. start_xmit), so I just wanted to make zero
> overhead for controllers that don't need any fixups.
> 
> Not that there's much of the overhead in a single additional
> 'if' condition, no. ;-)

The register accesses will dominate the costs with this chip.

The only case where a if() test is going to potentially create
some practical performance impact is if the TX is performed
purely using changes to a shared memory data structure and
absolutely no MMIO register reads or writes.

^ permalink raw reply

* TCP not triggering a fast retransmit?
From: Ivan Novick @ 2010-06-30 18:04 UTC (permalink / raw)
  To: netdev; +Cc: jmatthews, Tim Heath

[-- Attachment #1: Type: text/plain, Size: 1555 bytes --]

Hello all,

Attached is a packet capture from my application that is running on
RedHat Enterprise Linux 5.4

I am seeing a Retransmission timeout but I was hoping this case would
go into fast retransmit and not RTO.

I am wondering why did the sender not send more data?  If the sender
was to send more data and extend the window then it would seem the
duplicate acks or SACKS should trigger fast retransmit.

The application does not constantly send data, but data should arrive
to be sent before 200 milliseconds that it takes for the RTO.

Is it possible that there was no data to send and the window is not
advanced and the sender is waiting for an ACK.  Then data arrives half
way into the RTO 200 milliseconds while waiting for an outstanding ACK
but the window is not advanced?

You can see right after the RTO and retransmission additional data is
sent, so there is additional data.  In theory that additional data
could be arriving right at the 200 milliseconds point, but we see this
pattern in the dump regularly and I believe data is there before the
end of the 200 milliseconds RTO.

As a related point the advertised window from the receiver seems to be
a constant value of 22060, so either the receiver is handling its data
fast enough to never have to reduce its window... or this number is
really not used to indicate space available currently in the receive
window?

Any feedback to help understand why we are not doing fast retransmit
and or why the sender is not extending the window would be greatly
appreciated.

Cheers,
Ivan Novick

[-- Attachment #2: sender_backoff.cap --]
[-- Type: application/octet-stream, Size: 3076 bytes --]

^ permalink raw reply

* Re: [PATCH net-next-2.6] netfilter: remove deprecated config option NF_CT_ACCT
From: David Miller @ 2010-06-30 17:48 UTC (permalink / raw)
  To: kaber; +Cc: jpirko, netdev, ole
In-Reply-To: <4C2B49B9.5050302@trash.net>

From: Patrick McHardy <kaber@trash.net>
Date: Wed, 30 Jun 2010 15:42:17 +0200

> I usually leave those for the architecture maintainers.

Right, that's how this is supposed to work.  If we update defconfigs
every time we make some Kconfig change it's going to be nothing but
pain for arch folks.

^ permalink raw reply

* Re: [PATCH] xfrm: fix XFRMA_MARK extraction in xfrm_mark_get
From: David Miller @ 2010-06-30 17:41 UTC (permalink / raw)
  To: andreas.steffen; +Cc: netdev, hadi
In-Reply-To: <4C2AF8EE.8030508@strongswan.org>

From: Andreas Steffen <andreas.steffen@strongswan.org>
Date: Wed, 30 Jun 2010 09:57:34 +0200

> Determine the size of the xfrm_mark struct, not of its pointer.
> 
> Signed-off-by: Andreas Steffen <andreas.steffen@strongswan.org>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH v2] net/core: use ntohs for skb->protocol
From: David Miller @ 2010-06-30 17:39 UTC (permalink / raw)
  To: sebastian; +Cc: netdev
In-Reply-To: <20100630090229.GB25541@Chamillionaire.breakpoint.cc>

From: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Date: Wed, 30 Jun 2010 11:02:29 +0200

> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> This is only noticed by people that are not doing everything correct in
> the first place.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Applied to net-next-2.6, thanks.

^ permalink raw reply

* Re: [PATCH 2/2] ipv6: Use interface max_desync_factor instead of static default
From: David Miller @ 2010-06-30 17:29 UTC (permalink / raw)
  To: ben; +Cc: yoshfuji, kaber, 514646, piotr.lewandowski, netdev
In-Reply-To: <1277588575.26161.311.camel@localhost>

From: Ben Hutchings <ben@decadent.org.uk>
Date: Sat, 26 Jun 2010 22:42:55 +0100

> max_desync_factor can be configured per-interface, but nothing is
> using the value.
> 
> Reported-by: Piotr Lewandowski <piotr.lewandowski@gmail.com>
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

Applied to net-next-2.6

^ permalink raw reply

* Re: [PATCH 1/2] ipv6: Clamp reported valid_lft to a minimum of 0
From: David Miller @ 2010-06-30 17:28 UTC (permalink / raw)
  To: ben; +Cc: yoshfuji, kaber, 514644, piotr.lewandowski, netdev
In-Reply-To: <1277588267.26161.300.camel@localhost>

From: Ben Hutchings <ben@decadent.org.uk>
Date: Sat, 26 Jun 2010 22:37:47 +0100

> Since addresses are only revalidated every 2 minutes, the reported
> valid_lft can underflow shortly before the address is deleted.
> Clamp it to a minimum of 0, as for prefered_lft.
> 
> Reported-by: Piotr Lewandowski <piotr.lewandowski@gmail.com>
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

Applied to net-next-2.6

^ permalink raw reply

* Re: [PATCH] usb: pegasus: fixed coding style issues
From: David Miller @ 2010-06-30 17:26 UTC (permalink / raw)
  To: nikai-bVCNqDZ4lKNeoWH0uzbU5w
  Cc: petkan-Rn4VEauK+AKRv+LV9MX5uipxlwaOVQ5f,
	linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20100626185854.4a3eff43-m0oTu1HmxXpuhsjpcywf5g@public.gmane.org>

From: Nicolas Kaiser <nikai-bVCNqDZ4lKNeoWH0uzbU5w@public.gmane.org>
Date: Sat, 26 Jun 2010 18:58:54 +0200

> Fixed brace, static initialization, comment, whitespace and spacing
> coding style issues.
> 
> Signed-off-by: Nicolas Kaiser <nikai-bVCNqDZ4lKNeoWH0uzbU5w@public.gmane.org>

Applied to net-next-2.6, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] igbvf: avoid name clash between PF and VF
From: Casey Leedom @ 2010-06-30 16:59 UTC (permalink / raw)
  To: Stefan Assmann
  Cc: netdev, e1000-devel, Duyck, Alexander H, gregory.v.rose,
	jeffrey.t.kirsher, Andy Gospodarek
In-Reply-To: <4C2B0614.9040004@redhat.com>

| From: Stefan Assmann <sassmann@redhat.com>
| Date: Wednesday, June 30, 2010 01:53 am
| 
| This is not a udev bug since udev doesn't create persistent rules for
| VFs as their MAC address changes every reboot.
| 
| To avoid this problem we could change the kernel name for the VFs and
| thus avoid confusion between VFs and PFs.
| 
| I've already discussed this with Alexander Duyck and Greg Rose, so far
| they have no objection. However this problem appears for all drivers that
| support PFs and VFs and thus the changes should be applied consistently
| to all of these drivers.

  I'm not sure that this problem affects "all drivers which support PFs and VFs."  
I think that you might mean "all drivers which support PFs and VFs with non-
persistent MAC addresses for the VFs."  For instance, the MAC addresses 
associated with the new cxgb4vf VFs are persistent so, from what I understand of 
the scenario you outlined, I don't think that they would trigger the problem you 
describe.  Please correct me if I've missed something.  Thanks.

Casey

^ permalink raw reply

* RE: [Pv-drivers] [PATCH net-next-2.6 3/3] vmxnet3: Remove incorrect implementation of ethtool_ops::get_flags()
From: Bhavesh Davda @ 2010-06-30 16:46 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: David Miller, VMware, Inc., netdev@vger.kernel.org
In-Reply-To: <1277913534.2082.28.camel@achroite.uk.solarflarecom.com>

> From: Ben Hutchings [mailto:bhutchings@solarflare.com]
> Sent: Wednesday, June 30, 2010 8:59 AM
> 
> On Wed, 2010-06-30 at 08:44 -0700, Bhavesh Davda wrote:
> > Thanks for fixing this, Ben! Had to look at ethtool-2.6.34 src to
> > convince myself of the correctness. Looks good to me.
> >
> > Signed-off-by: Bhavesh Davda <bhavesh@vmware.com>
> >
> > ps: I do wonder, however, why not always use ethtool_op_get_flags for
> > all drivers, and mask whatever is returned by the driver specific
> > dev->ethtool_ops->get_flags with flags_dup_features instead of this
> > approach?
> 
> I think you're right that ethtool_op_get_flags could be the implicit
> default (i.e. used if ethtool_ops::get_flags is NULL) and drivers
> should
> not have to set it.  Following this change, no drivers in net-next-2.6
> will be using any other implementation.  However, I don't think
> ethtool_ops::get_flags should be removed - in future there are likely
> to
> be additional ethtool flags that do not correspond to net device
> feature
> flags, and some drivers will need a different implementation.
> 
> Ben.
> 

Hi Ben,

I'm not suggesting nuking ethtool_ops::get_flags. What I was suggesting is *always* calling ethtool_ops_get_flags from dev_ethtool for case ETHTOOL_GFLAGS, but doing something like this simple patch:

diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index a0f4964..f5da6ed 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -139,8 +139,9 @@ u32 ethtool_op_get_flags(struct net_device *dev)
         * handling for flags which are not so easily handled
         * by a simple masking operation
         */
-
-       return dev->features & flags_dup_features;
+       return (dev->ethtool_ops->get_flags ?
+               dev->ethtool_ops->get_flags(dev) :
+               dev->features) & flags_dup_features;
 }
 EXPORT_SYMBOL(ethtool_op_get_flags);

@@ -1495,9 +1496,7 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
                break;
        case ETHTOOL_GFLAGS:
                rc = ethtool_get_value(dev, useraddr, ethcmd,
-                                      (dev->ethtool_ops->get_flags ?
-                                       dev->ethtool_ops->get_flags :
-                                       ethtool_op_get_flags));
+                                       ethtool_op_get_flags);
                break;
        case ETHTOOL_SFLAGS:
                rc = ethtool_set_value(dev, useraddr,

Plus nuking all such code in the netdev drivers:
        .get_flags              = ethtool_op_get_flags,

This is still pretty fragile and could lead to infinite recursion if some drivers are missed. But you get the idea.

Thanks

- Bhavesh

^ permalink raw reply related

* Re: [PATCH 3/3] pm_qos: get rid of the allocation in pm_qos_add_request()
From: Daniel Walker @ 2010-06-30 16:45 UTC (permalink / raw)
  To: James Bottomley; +Cc: Takashi Iwai, linux-pm, markgross, netdev
In-Reply-To: <1277763049.10879.204.camel@mulgrave.site>

On Mon, 2010-06-28 at 17:10 -0500, James Bottomley wrote:
> On Mon, 2010-06-28 at 23:59 +0200, Rafael J. Wysocki wrote:
> > On Monday, June 28, 2010, James Bottomley wrote:
> > > Since every caller has to squirrel away the returned pointer anyway,
> > > they might as well supply the memory area.  This fixes a bug in a few of
> > > the call sites where the returned pointer was dereferenced without
> > > checking it for NULL (which gets returned if the kzalloc failed).
> > > 
> > > I'd like to hear how sound and netdev feels about this: it will add
> > > about two more pointers worth of data to struct netdev and struct
> > > snd_pcm_substream .. but I think it's worth it.  If you're OK, I'll add
> > > your acks and send through the pm tree.
> > > 
> > > This also looks to me like an android independent clean up (even though
> > > it renders the request_add atomically callable).  I also added include
> > > guards to include/linux/pm_qos_params.h
> > > 
> > > cc: netdev@vger.kernel.org
> > > cc: Takashi Iwai <tiwai@suse.de>
> > > Signed-off-by: James Bottomley <James.Bottomley@suse.de>
> > 
> > I like all of the patches in this series, thanks a lot for doing this!
> > 
> > I guess it might be worth sending a CC to the LKML next round so that people
> > can see [1/3] (I don't expect any objections, but anyway it would be nice).
> 
> I cc'd the latest owners of plist.h ... although Daniel Walker has
> apparently since left MontaVista, Thomas Gleixner is still current ...
> and he can speak for the RT people, who are the primary plist users.
> 
> I can do another round and cc lkml, I was just hoping this would be the
> last revision.

I'm still paying attention tho .. I didn't see anything objection worthy
in the plist changes.. If you do send another round you might want to
add Oleg Nesterov , most of the code was redone by him ..

Daniel

^ permalink raw reply

* [PATCH v2 3/3] gianfar: Implement workaround for eTSEC-A002 erratum
From: Anton Vorontsov @ 2010-06-30 16:39 UTC (permalink / raw)
  To: David Miller
  Cc: Manfred Rudigier, Sandeep Gopalpet, Andy Fleming, netdev,
	linuxppc-dev
In-Reply-To: <20100630163804.GA636@oksana.dev.rtsoft.ru>

MPC8313ECE says:

"If the controller receives a 1- or 2-byte frame (such as an illegal
 runt packet or a packet with RX_ER asserted) before GRS is asserted
 and does not receive any other frames, the controller may fail to set
 GRSC even when the receive logic is completely idle. Any subsequent
 receive frame that is larger than two bytes will reset the state so
 the graceful stop can complete. A MAC receiver (Rx) reset will also
 reset the state."

This patch implements the proposed workaround:

"If IEVENT[GRSC] is still not set after the timeout, read the eTSEC
 register at offset 0xD1C. If bits 7-14 are the same as bits 23-30,
 the eTSEC Rx is assumed to be idle and the Rx can be safely reset.
 If the register fields are not equal, wait for another timeout
 period and check again."

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
---
 drivers/net/gianfar.c |   40 +++++++++++++++++++++++++++++++++++++---
 drivers/net/gianfar.h |    1 +
 2 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 8ba2973..35626eb 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -947,6 +947,11 @@ static void gfar_detect_errata(struct gfar_private *priv)
 			(pvr == 0x80861010 && (mod & 0xfff9) == 0x80c0))
 		priv->errata |= GFAR_ERRATA_76;
 
+	/* MPC8313 and MPC837x all rev */
+	if ((pvr == 0x80850010 && mod == 0x80b0) ||
+			(pvr == 0x80861010 && (mod & 0xfff9) == 0x80c0))
+		priv->errata |= GFAR_ERRATA_A002;
+
 	if (priv->errata)
 		dev_info(dev, "enabled errata workarounds, flags: 0x%x\n",
 			 priv->errata);
@@ -1570,6 +1575,29 @@ static void init_registers(struct net_device *dev)
 	gfar_write(&regs->minflr, MINFLR_INIT_SETTINGS);
 }
 
+static int __gfar_is_rx_idle(struct gfar_private *priv)
+{
+	u32 res;
+
+	/*
+	 * Normaly TSEC should not hang on GRS commands, so we should
+	 * actually wait for IEVENT_GRSC flag.
+	 */
+	if (likely(!gfar_has_errata(priv, GFAR_ERRATA_A002)))
+		return 0;
+
+	/*
+	 * Read the eTSEC register at offset 0xD1C. If bits 7-14 are
+	 * the same as bits 23-30, the eTSEC Rx is assumed to be idle
+	 * and the Rx can be safely reset.
+	 */
+	res = gfar_read((void __iomem *)priv->gfargrp[0].regs + 0xd1c);
+	res &= 0x7f807f80;
+	if ((res & 0xffff) == (res >> 16))
+		return 1;
+
+	return 0;
+}
 
 /* Halt the receive and transmit queues */
 static void gfar_halt_nodisable(struct net_device *dev)
@@ -1593,12 +1621,18 @@ static void gfar_halt_nodisable(struct net_device *dev)
 	tempval = gfar_read(&regs->dmactrl);
 	if ((tempval & (DMACTRL_GRS | DMACTRL_GTS))
 	    != (DMACTRL_GRS | DMACTRL_GTS)) {
+		int ret;
+
 		tempval |= (DMACTRL_GRS | DMACTRL_GTS);
 		gfar_write(&regs->dmactrl, tempval);
 
-		spin_event_timeout(((gfar_read(&regs->ievent) &
-			 (IEVENT_GRSC | IEVENT_GTSC)) ==
-			 (IEVENT_GRSC | IEVENT_GTSC)), -1, 0);
+		do {
+			ret = spin_event_timeout(((gfar_read(&regs->ievent) &
+				 (IEVENT_GRSC | IEVENT_GTSC)) ==
+				 (IEVENT_GRSC | IEVENT_GTSC)), 1000000, 0);
+			if (!ret && !(gfar_read(&regs->ievent) & IEVENT_GRSC))
+				ret = __gfar_is_rx_idle(priv);
+		} while (!ret);
 	}
 }
 
diff --git a/drivers/net/gianfar.h b/drivers/net/gianfar.h
index c414374..710810e 100644
--- a/drivers/net/gianfar.h
+++ b/drivers/net/gianfar.h
@@ -1028,6 +1028,7 @@ struct gfar_priv_grp {
 enum gfar_errata {
 	GFAR_ERRATA_74		= 0x01,
 	GFAR_ERRATA_76		= 0x02,
+	GFAR_ERRATA_A002	= 0x04,
 };
 
 /* Struct stolen almost completely (and shamelessly) from the FCC enet source
-- 
1.7.0.5

^ permalink raw reply related

* [PATCH v2 2/3] gianfar: Implement workaround for eTSEC76 erratum
From: Anton Vorontsov @ 2010-06-30 16:39 UTC (permalink / raw)
  To: David Miller
  Cc: Manfred Rudigier, Sandeep Gopalpet, Andy Fleming, netdev,
	linuxppc-dev
In-Reply-To: <20100630163804.GA636@oksana.dev.rtsoft.ru>

MPC8313ECE says:

"For TOE=1 huge or jumbo frames, the data required to generate the
 checksum may exceed the 2500-byte threshold beyond which the controller
 constrains itself to one memory fetch every 256 eTSEC system clocks.

 This throttling threshold is supposed to trigger only when the
 controller has sufficient data to keep transmit active for the duration
 of the memory fetches. The state machine handling this threshold,
 however, fails to take large TOE frames into account. As a result,
 TOE=1 frames larger than 2500 bytes often see excess delays before start
 of transmission."

This patch implements the workaround as suggested by the errata
document, i.e.:

"Limit TOE=1 frames to less than 2500 bytes to avoid excess delays due to
 memory throttling.
 When using packets larger than 2700 bytes, it is recommended to turn TOE
 off."

To be sure, we limit the TOE frames to 2500 bytes, and do software
checksumming instead.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
---
 drivers/net/gianfar.c |   19 +++++++++++++++++++
 drivers/net/gianfar.h |    1 +
 2 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 9abcb39..8ba2973 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -942,6 +942,11 @@ static void gfar_detect_errata(struct gfar_private *priv)
 			(pvr == 0x80861010 && (mod & 0xfff9) == 0x80c0))
 		priv->errata |= GFAR_ERRATA_74;

+	/* MPC8313 and MPC837x all rev */
+	if ((pvr == 0x80850010 && mod == 0x80b0) ||
+			(pvr == 0x80861010 && (mod & 0xfff9) == 0x80c0))
+		priv->errata |= GFAR_ERRATA_76;
+
 	if (priv->errata)
 		dev_info(dev, "enabled errata workarounds, flags: 0x%x\n",
 			 priv->errata);
@@ -2011,6 +2016,20 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	unsigned int nr_frags, nr_txbds, length;
 	union skb_shared_tx *shtx;

+	/*
+	 * TOE=1 frames larger than 2500 bytes may see excess delays
+	 * before start of transmission.
+	 */
+	if (unlikely(gfar_has_errata(priv, GFAR_ERRATA_76) &&
+			skb->ip_summed == CHECKSUM_PARTIAL &&
+			skb->len > 2500)) {
+		int ret;
+
+		ret = skb_checksum_help(skb);
+		if (ret)
+			return ret;
+	}
+
 	rq = skb->queue_mapping;
 	tx_queue = priv->tx_queue[rq];
 	txq = netdev_get_tx_queue(dev, rq);
diff --git a/drivers/net/gianfar.h b/drivers/net/gianfar.h
index 0a0483c..c414374 100644
--- a/drivers/net/gianfar.h
+++ b/drivers/net/gianfar.h
@@ -1027,6 +1027,7 @@ struct gfar_priv_grp {

 enum gfar_errata {
 	GFAR_ERRATA_74		= 0x01,
+	GFAR_ERRATA_76		= 0x02,
 };

 /* Struct stolen almost completely (and shamelessly) from the FCC enet source
-- 
1.7.0.5

^ permalink raw reply related

* [PATCH v2 1/3] gianfar: Implement workaround for eTSEC74 erratum
From: Anton Vorontsov @ 2010-06-30 16:39 UTC (permalink / raw)
  To: David Miller
  Cc: Manfred Rudigier, Sandeep Gopalpet, Andy Fleming, netdev,
	linuxppc-dev
In-Reply-To: <20100630163804.GA636@oksana.dev.rtsoft.ru>

MPC8313ECE says:

"If MACCFG2[Huge Frame]=0 and the Ethernet controller receives frames
 which are larger than MAXFRM, the controller truncates the frames to
 length MAXFRM and marks RxBD[TR]=1 to indicate the error. The controller
 also erroneously marks RxBD[TR]=1 if the received frame length is MAXFRM
 or MAXFRM-1, even though those frames are not truncated.
 No truncation or truncation error occurs if MACCFG2[Huge Frame]=1."

There are two options to workaround the issue:

"1. Set MACCFG2[Huge Frame]=1, so no truncation occurs for invalid large
 frames. Software can determine if a frame is larger than MAXFRM by
 reading RxBD[LG] or RxBD[Data Length].

 2. Set MAXFRM to 1538 (0x602) instead of the default 1536 (0x600), so
 normal-length frames are not marked as truncated. Software can examine
 RxBD[Data Length] to determine if the frame was larger than MAXFRM-2."

This patch implements the first workaround option by setting HUGEFRAME
bit, and gfar_clean_rx_ring() already checks the RxBD[Data Length].

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
---
 drivers/net/gianfar.c |   29 +++++++++++++++++++++++++++--
 drivers/net/gianfar.h |   11 +++++++++++
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 28b53d1..9abcb39 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -85,6 +85,7 @@
 #include <linux/net_tstamp.h>
 
 #include <asm/io.h>
+#include <asm/reg.h>
 #include <asm/irq.h>
 #include <asm/uaccess.h>
 #include <linux/module.h>
@@ -928,6 +929,24 @@ static void gfar_init_filer_table(struct gfar_private *priv)
 	}
 }
 
+static void gfar_detect_errata(struct gfar_private *priv)
+{
+	struct device *dev = &priv->ofdev->dev;
+	unsigned int pvr = mfspr(SPRN_PVR);
+	unsigned int svr = mfspr(SPRN_SVR);
+	unsigned int mod = (svr >> 16) & 0xfff6; /* w/o E suffix */
+	unsigned int rev = svr & 0xffff;
+
+	/* MPC8313 Rev 2.0 and higher; All MPC837x */
+	if ((pvr == 0x80850010 && mod == 0x80b0 && rev >= 0x0020) ||
+			(pvr == 0x80861010 && (mod & 0xfff9) == 0x80c0))
+		priv->errata |= GFAR_ERRATA_74;
+
+	if (priv->errata)
+		dev_info(dev, "enabled errata workarounds, flags: 0x%x\n",
+			 priv->errata);
+}
+
 /* Set up the ethernet device structure, private data,
  * and anything else we need before we start */
 static int gfar_probe(struct of_device *ofdev,
@@ -960,6 +979,8 @@ static int gfar_probe(struct of_device *ofdev,
 	dev_set_drvdata(&ofdev->dev, priv);
 	regs = priv->gfargrp[0].regs;
 
+	gfar_detect_errata(priv);
+
 	/* Stop the DMA engine now, in case it was running before */
 	/* (The firmware could have used it, and left it running). */
 	gfar_halt(dev);
@@ -974,7 +995,10 @@ static int gfar_probe(struct of_device *ofdev,
 	gfar_write(&regs->maccfg1, tempval);
 
 	/* Initialize MACCFG2. */
-	gfar_write(&regs->maccfg2, MACCFG2_INIT_SETTINGS);
+	tempval = MACCFG2_INIT_SETTINGS;
+	if (gfar_has_errata(priv, GFAR_ERRATA_74))
+		tempval |= MACCFG2_HUGEFRAME | MACCFG2_LENGTHCHECK;
+	gfar_write(&regs->maccfg2, tempval);
 
 	/* Initialize ECNTRL */
 	gfar_write(&regs->ecntrl, ECNTRL_INIT_SETTINGS);
@@ -2300,7 +2324,8 @@ static int gfar_change_mtu(struct net_device *dev, int new_mtu)
 	 * to allow huge frames, and to check the length */
 	tempval = gfar_read(&regs->maccfg2);
 
-	if (priv->rx_buffer_size > DEFAULT_RX_BUFFER_SIZE)
+	if (priv->rx_buffer_size > DEFAULT_RX_BUFFER_SIZE ||
+			gfar_has_errata(priv, GFAR_ERRATA_74))
 		tempval |= (MACCFG2_HUGEFRAME | MACCFG2_LENGTHCHECK);
 	else
 		tempval &= ~(MACCFG2_HUGEFRAME | MACCFG2_LENGTHCHECK);
diff --git a/drivers/net/gianfar.h b/drivers/net/gianfar.h
index ac4a92e..0a0483c 100644
--- a/drivers/net/gianfar.h
+++ b/drivers/net/gianfar.h
@@ -1025,6 +1025,10 @@ struct gfar_priv_grp {
 	char int_name_er[GFAR_INT_NAME_MAX];
 };
 
+enum gfar_errata {
+	GFAR_ERRATA_74		= 0x01,
+};
+
 /* Struct stolen almost completely (and shamelessly) from the FCC enet source
  * (Ok, that's not so true anymore, but there is a family resemblence)
  * The GFAR buffer descriptors track the ring buffers.  The rx_bd_base
@@ -1049,6 +1053,7 @@ struct gfar_private {
 	struct device_node *node;
 	struct net_device *ndev;
 	struct of_device *ofdev;
+	enum gfar_errata errata;
 
 	struct gfar_priv_grp gfargrp[MAXGROUPS];
 	struct gfar_priv_tx_q *tx_queue[MAX_TX_QS];
@@ -1111,6 +1116,12 @@ struct gfar_private {
 extern unsigned int ftp_rqfpr[MAX_FILER_IDX + 1];
 extern unsigned int ftp_rqfcr[MAX_FILER_IDX + 1];
 
+static inline int gfar_has_errata(struct gfar_private *priv,
+				  enum gfar_errata err)
+{
+	return priv->errata & err;
+}
+
 static inline u32 gfar_read(volatile unsigned __iomem *addr)
 {
 	u32 val;
-- 
1.7.0.5


^ permalink raw reply related

* Re: [PATCH 1/3] gianfar: Implement workaround for eTSEC74 erratum
From: Anton Vorontsov @ 2010-06-30 16:38 UTC (permalink / raw)
  To: David Miller
  Cc: manfred.rudigier, Sandeep.Kumar, afleming, netdev, linuxppc-dev
In-Reply-To: <20100629.151626.90794001.davem@davemloft.net>

On Tue, Jun 29, 2010 at 03:16:26PM -0700, David Miller wrote:
> 
> I really don't see any value at all to this config option,
> the errata fixup code should be there all the time.

Well, at least for eTSEC76 erratum (patch 2/3) we have to touch
fast path (i.e. start_xmit), so I just wanted to make zero
overhead for controllers that don't need any fixups.

Not that there's much of the overhead in a single additional
'if' condition, no. ;-)

> It really does no harm to be there in the cases where it isn't
> used, and forcing users to turn this on is just another
> obscure way to end up with an incorrect configuration.

OK, resending the new patches, without Kconfig stuff...

If we'll have too many or too big errata so that it would cause
major performance or code size penalty for non-affected SOCs, we
can always do:

enum gfar_errata {
#ifdef CONFIG_PPC_FOO
	GFAR_ERRATA_FOO = 0x01,
#else
	GFAR_ERRATA_FOO = 0,
#endif
}

And then, priv->errata & GFAR_ERRATA_FOO will be optimized
away by the compiler.

Thanks,

-- 
Anton Vorontsov
email: cbouatmailru@gmail.com
irc://irc.freenode.net/bd2

^ permalink raw reply

* [PATCH] net/neighbour.h: fix typo
From: Kulikov Vasiliy @ 2010-06-30 16:08 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: David S. Miller, Tejun Heo, Eric W. Biederman, Eric Dumazet,
	Stephen Hemminger, netdev, linux-kernel

'Shoul' must be 'should'.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
---
 include/net/neighbour.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index eb21340..242879b 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -151,7 +151,7 @@ struct neigh_table {
 	void			(*proxy_redo)(struct sk_buff *skb);
 	char			*id;
 	struct neigh_parms	parms;
-	/* HACK. gc_* shoul follow parms without a gap! */
+	/* HACK. gc_* should follow parms without a gap! */
 	int			gc_interval;
 	int			gc_thresh1;
 	int			gc_thresh2;
-- 
1.7.0.4


^ permalink raw reply related

* Re: RFC:  Allow 'ip' to run in daemon mode?
From: Ben Greear @ 2010-06-30 16:01 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: NetDev, Stephen Hemminger
In-Reply-To: <jqfv3n22gfvc4e8j9qgxednu.1277881246865@email.android.com>

On 06/30/2010 12:00 AM, Stephen Hemminger wrote:
> write a new service rather than bloating the existing code or just use netlink or libnl

I find netlink code a pain to deal with, so I'd like to leverage
existing and ongoing support in iproute.

Would you at least consider the changes in 1) and 2)
so that the new project can continue to use iproute as upstream
w/out significant merge issues?  Those changes shouldn't
significantly add to the code, I believe.

Thanks,
Ben

>
> Ben Greear<greearb@candelatech.com>  wrote:
>
>> I'm considering modifying 'ip' to be able to run in daemon
>> mode so that I can do lots of IP commands without having to
>> pay the startup cost of iproute.
>>
>> The -batch option almost works, but it's hard to programatically
>> figure out failure codes.
>>
>> I'm thinking about making these changes:
>>
>> 1)  Move all of the error printing code into common methods (basically,
>>     wrap printf).  In daemon mode this text can be sent back to the
>>     calling process, and in normal mode, it will be printed to stdout/stderr
>>     as it is currently.
>>
>> 2)  Remove all or most calls to 'exit' and instead return error codes
>>     to the calling logic.
>>
>> 3)  Add ability to listen on a unix socket for commands, basically treat
>>     them just like batch commands, one command per packet.
>>
>> 4)  Return well formatted error code and text response to calling process
>>     over the unix socket, maybe something like:
>>
>> RV: [errno or equiv, zero for success]\n
>> CMD: [ command string this relates to ]\n
>> [ Optional free form text ]
>>
>>
>> Does something like this have any chance of upstream inclusion?
>>
>> Thanks,
>> Ben
>>
>> --
>> Ben Greear<greearb@candelatech.com>
>> Candela Technologies Inc  http://www.candelatech.com
> N�����r��y���b�X��ǧv�^�)޺{.n�+���z�^�)���w*\x1fjg���\x1e�����ݢj/���z�ޖ��2�ޙ���&�)ߡ�a��\x7f��\x1e�G���h�\x0f�j:+v���w�٥


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* RE: [Pv-drivers] [PATCH net-next-2.6 3/3] vmxnet3: Remove incorrect implementation of ethtool_ops::get_flags()
From: Ben Hutchings @ 2010-06-30 15:58 UTC (permalink / raw)
  To: Bhavesh Davda; +Cc: David Miller, VMware, Inc., netdev@vger.kernel.org
In-Reply-To: <8B1F619C9F5F454E81D90D3C161698D703362510AE@EXCH-MBX-3.vmware.com>

On Wed, 2010-06-30 at 08:44 -0700, Bhavesh Davda wrote:
> Thanks for fixing this, Ben! Had to look at ethtool-2.6.34 src to
> convince myself of the correctness. Looks good to me. 
> 
> Signed-off-by: Bhavesh Davda <bhavesh@vmware.com>
> 
> ps: I do wonder, however, why not always use ethtool_op_get_flags for
> all drivers, and mask whatever is returned by the driver specific
> dev->ethtool_ops->get_flags with flags_dup_features instead of this
> approach?

I think you're right that ethtool_op_get_flags could be the implicit
default (i.e. used if ethtool_ops::get_flags is NULL) and drivers should
not have to set it.  Following this change, no drivers in net-next-2.6
will be using any other implementation.  However, I don't think
ethtool_ops::get_flags should be removed - in future there are likely to
be additional ethtool flags that do not correspond to net device feature
flags, and some drivers will need a different implementation.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* RE: [Pv-drivers] [PATCH net-next-2.6 3/3] vmxnet3: Remove incorrect implementation of ethtool_ops::get_flags()
From: Bhavesh Davda @ 2010-06-30 15:44 UTC (permalink / raw)
  To: Ben Hutchings, David Miller; +Cc: VMware, Inc., netdev@vger.kernel.org
In-Reply-To: <1277902060.2082.13.camel@achroite.uk.solarflarecom.com>

Thanks for fixing this, Ben! Had to look at ethtool-2.6.34 src to convince myself of the correctness. Looks good to me. 

Signed-off-by: Bhavesh Davda <bhavesh@vmware.com>

ps: I do wonder, however, why not always use ethtool_op_get_flags for all drivers, and mask whatever is returned by the driver specific dev->ethtool_ops->get_flags with flags_dup_features instead of this approach?


- Bhavesh
 
Bhavesh P. Davda

> -----Original Message-----
> From: pv-drivers-bounces@vmware.com [mailto:pv-drivers-
> bounces@vmware.com] On Behalf Of Ben Hutchings
> Sent: Wednesday, June 30, 2010 5:48 AM
> To: David Miller
> Cc: VMware, Inc.; netdev@vger.kernel.org
> Subject: [Pv-drivers] [PATCH net-next-2.6 3/3] vmxnet3: Remove
> incorrect implementation of ethtool_ops::get_flags()
> 
> Only some netdev feature flags correspond directly to ethtool feature
> flags.  ethtool_op_get_flags() does the right thing.
> 
> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
> ---
>  drivers/net/vmxnet3/vmxnet3_ethtool.c |    8 +-------
>  1 files changed, 1 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/vmxnet3/vmxnet3_ethtool.c
> b/drivers/net/vmxnet3/vmxnet3_ethtool.c
> index 8a71a21..de1ba14 100644
> --- a/drivers/net/vmxnet3/vmxnet3_ethtool.c
> +++ b/drivers/net/vmxnet3/vmxnet3_ethtool.c
> @@ -275,12 +275,6 @@ vmxnet3_get_strings(struct net_device *netdev, u32
> stringset, u8 *buf)
>  	}
>  }
> 
> -static u32
> -vmxnet3_get_flags(struct net_device *netdev)
> -{
> -	return netdev->features;
> -}
> -
>  static int
>  vmxnet3_set_flags(struct net_device *netdev, u32 data)
>  {
> @@ -559,7 +553,7 @@ static struct ethtool_ops vmxnet3_ethtool_ops = {
>  	.get_tso           = ethtool_op_get_tso,
>  	.set_tso           = ethtool_op_set_tso,
>  	.get_strings       = vmxnet3_get_strings,
> -	.get_flags	   = vmxnet3_get_flags,
> +	.get_flags	   = ethtool_op_get_flags,
>  	.set_flags	   = vmxnet3_set_flags,
>  	.get_sset_count	   = vmxnet3_get_sset_count,
>  	.get_ethtool_stats = vmxnet3_get_ethtool_stats,
> --
> 1.6.2.5
> 
> --
> Ben Hutchings, Senior Software Engineer, Solarflare Communications
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
> 
> _______________________________________________
> Pv-drivers mailing list
> Pv-drivers@vmware.com
> http://mailman2.vmware.com/mailman/listinfo/pv-drivers

^ permalink raw reply

* Re: [PATCH 1/2] Export firmware assigned labels of network devices to sysfs
From: Greg KH @ 2010-06-30 15:42 UTC (permalink / raw)
  To: Narendra K
  Cc: matt_domsch, netdev, linux-hotplug, linux-pci, jordan_hargrave,
	charles_rose, vijay_nijhawan
In-Reply-To: <20100629162818.GA17099@auslistsprd01.us.dell.com>

On Tue, Jun 29, 2010 at 11:28:18AM -0500, Narendra K wrote:
> --- a/drivers/pci/Makefile
> +++ b/drivers/pci/Makefile
> @@ -4,7 +4,7 @@
>  
>  obj-y		+= access.o bus.o probe.o remove.o pci.o \
>  			pci-driver.o search.o pci-sysfs.o rom.o setup-res.o \
> -			irq.o vpd.o
> +			irq.o vpd.o pci-label.o

No, only build this if CONFIG_DMI is set.

> diff --git a/drivers/pci/pci-label.c b/drivers/pci/pci-label.c
> new file mode 100644
> index 0000000..0f824d6
> --- /dev/null
> +++ b/drivers/pci/pci-label.c
> @@ -0,0 +1,140 @@
> +/*
> + * Purpose: Export the firmware instance/index and label associated with 
> + * a pci device to sysfs
> + * Copyright (C) 2010 Dell Inc.
> + * by Narendra K <Narendra_K@dell.com>, Jordan Hargrave <Jordan_Hargrave@dell.com>
> + *
> + * SMBIOS defines type 41 for onboard pci devices. This code retrieves
> + * the instance number and string from the type 41 record and exports 
> + * it to sysfs.
> + *
> + * Please see http://linux.dell.com/wiki/index.php/Oss/libnetdevname for more
> + * information.
> + */
> +
> +#include <linux/dmi.h>
> +#include <linux/sysfs.h>
> +#include <linux/pci.h>
> +#include <linux/pci_ids.h>
> +#include <linux/module.h>
> +#include "pci.h"
> +
> +#ifndef CONFIG_DMI
> +
> +static inline int 
> +pci_create_smbiosname_file(struct pci_dev *pdev)
> +{
> +	return -1;
> +}
> +
> +static inline int 
> +pci_remove_smbiosname_file(struct pci_dev *pdev)
> +{
> +	return -1;
> +}
> +
> +#else

The above Makefile change will allow you to remove these, right?  You
don't want to create the files if there is nothing that can be in them,
right?

> +pci_create_smbiosname_file(struct pci_dev *pdev)
> +{
> +	if (smbios_attr_label.test && smbios_attr_label.test(&pdev->dev, NULL, NULL)) {
> +		if (sysfs_create_file(&pdev->dev.kobj, &smbios_attr_label.attr))
> +			return -1;

What's wrong with the 'device_create_file' calls?

thanks,

greg k-h

^ permalink raw reply

* [PATCH ethtool 2/2] ethtool: Add support for control of RX flow hash indirection
From: Ben Hutchings @ 2010-06-30 15:13 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: netdev, linux-net-drivers
In-Reply-To: <1277910740.2082.17.camel@achroite.uk.solarflarecom.com>

Many NICs use an indirection table to map an RX flow hash value to one
of an arbitrary number of queues (not necessarily a power of 2).  It
can be useful to remove some queues from this indirection table so
that they are only used for flows that are specifically filtered
there.  It may also be useful to weight the mapping to account for
user processes with the same CPU-affinity as the RX interrupts.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool.8 |   26 ++++++++++
 ethtool.c |  162 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 188 insertions(+), 0 deletions(-)

diff --git a/ethtool.8 b/ethtool.8
index 5983d0e..6003743 100644
--- a/ethtool.8
+++ b/ethtool.8
@@ -214,6 +214,17 @@ ethtool \- Display or change ethernet card settings
 .RB [ rx-flow-hash \ \*(FL
 .RB \ \*(HO]
 
+.B ethtool \-x|\-\-show\-rxfh\-indir
+.I ethX
+
+.B ethtool \-X|\-\-set\-rxfh\-indir
+.I ethX
+.RB [\  equal
+.IR N \ |
+.BI weight\  W0
+.IR W1
+.RB ...\ ]
+
 .B ethtool \-f|\-\-flash
 .I ethX
 .RI FILE
@@ -606,6 +617,21 @@ Discard all packets of this flow type. When this option is set, all other option
 .PD
 .RE
 .TP
+.B \-x \-\-show\-rxfh\-indir
+Retrieves the receive flow hash indirection table.
+.TP
+.B \-X \-\-set\-rxfh\-indir
+Configures the receive flow hash indirection table.
+.TP
+.BI equal\  N
+Sets the receive flow hash indirection table to spread flows evenly
+between the first \fIN\fR receive queues.
+.TP
+\fBweight\fR \fIW0 W1\fR ...
+Sets the receive flow hash indirection table to spread flows between
+receive queues according to the given weights.  The sum of the weights
+must be non-zero and must not exceed the size of the indirection table.
+.TP
 .B \-f \-\-flash \ FILE
 Flash firmware image from the specified file to a region on the adapter.
 By default this will flash all the regions on the adapter.
diff --git a/ethtool.c b/ethtool.c
index 4073745..c7c269d 100644
--- a/ethtool.c
+++ b/ethtool.c
@@ -98,6 +98,8 @@ static char *unparse_rxfhashopts(u64 opts);
 static int dump_rxfhash(int fhash, u64 val);
 static int do_srxclass(int fd, struct ifreq *ifr);
 static int do_grxclass(int fd, struct ifreq *ifr);
+static int do_grxfhindir(int fd, struct ifreq *ifr);
+static int do_srxfhindir(int fd, struct ifreq *ifr);
 static int do_srxntuple(int fd, struct ifreq *ifr);
 static int do_grxntuple(int fd, struct ifreq *ifr);
 static int do_flash(int fd, struct ifreq *ifr);
@@ -125,6 +127,8 @@ static enum {
 	MODE_GSTATS,
 	MODE_GNFC,
 	MODE_SNFC,
+	MODE_GRXFHINDIR,
+	MODE_SRXFHINDIR,
 	MODE_SNTUPLE,
 	MODE_GNTUPLE,
 	MODE_FLASHDEV,
@@ -225,6 +229,10 @@ static struct option {
 		"classification options",
 		"		[ rx-flow-hash tcp4|udp4|ah4|sctp4|"
 		"tcp6|udp6|ah6|sctp6 m|v|t|s|d|f|n|r... ]\n" },
+    { "-x", "--show-rxfh-indir", MODE_GRXFHINDIR, "Show Rx flow hash "
+		"indirection" },
+    { "-X", "--set-rxfh-indir", MODE_SRXFHINDIR, "Set Rx flow hash indirection",
+		"		equal N | weight W0 W1 ...\n" },
     { "-U", "--config-ntuple", MODE_SNTUPLE, "Configure Rx ntuple filters "
 		"and actions",
 		"               [ flow-type tcp4|udp4|sctp4 src-ip <addr> "
@@ -350,6 +358,8 @@ static int rx_fhash_get = 0;
 static int rx_fhash_set = 0;
 static u32 rx_fhash_val = 0;
 static int rx_fhash_changed = 0;
+static int rxfhindir_equal = 0;
+static char **rxfhindir_weight = NULL;
 static int sntuple_changed = 0;
 static struct ethtool_rx_ntuple_flow_spec ntuple_fs;
 static char *flash_file = NULL;
@@ -549,6 +559,11 @@ static int get_int(char *str, int base)
 	return get_int_range(str, base, INT_MIN, INT_MAX);
 }
 
+static u32 get_u32(char *str, int base)
+{
+	return get_uint_range(str, base, 0xffffffff);
+}
+
 static void parse_generic_cmdline(int argc, char **argp,
 				  int first_arg, int *changed,
 				  struct cmdline_info *info,
@@ -725,6 +740,8 @@ static void parse_cmdline(int argc, char **argp)
 			    (mode == MODE_GSTATS) ||
 			    (mode == MODE_GNFC) ||
 			    (mode == MODE_SNFC) ||
+			    (mode == MODE_GRXFHINDIR) ||
+			    (mode == MODE_SRXFHINDIR) ||
 			    (mode == MODE_SNTUPLE) ||
 			    (mode == MODE_GNTUPLE) ||
 			    (mode == MODE_PHYS_ID) ||
@@ -878,6 +895,30 @@ static void parse_cmdline(int argc, char **argp)
 					show_usage(1);
 				break;
 			}
+			if (mode == MODE_SRXFHINDIR) {
+				if (!strcmp(argp[i], "equal")) {
+					if (argc != i + 2) {
+						show_usage(1);
+						break;
+					}
+					i += 1;
+					rxfhindir_equal =
+						get_int_range(argp[i], 0, 1,
+							      INT_MAX);
+					i += 1;
+				} else if (!strcmp(argp[i], "weight")) {
+					i += 1;
+					if (i >= argc) {
+						show_usage(1);
+						break;
+					}
+					rxfhindir_weight = argp + i;
+					i = argc;
+				} else {
+					show_usage(1);
+				}
+				break;
+			}
 			if (mode != MODE_SSET)
 				show_usage(1);
 			if (!strcmp(argp[i], "speed")) {
@@ -1812,6 +1853,10 @@ static int doit(void)
 		return do_grxclass(fd, &ifr);
 	} else if (mode == MODE_SNFC) {
 		return do_srxclass(fd, &ifr);
+	} else if (mode == MODE_GRXFHINDIR) {
+		return do_grxfhindir(fd, &ifr);
+	} else if (mode == MODE_SRXFHINDIR) {
+		return do_srxfhindir(fd, &ifr);
 	} else if (mode == MODE_SNTUPLE) {
 		return do_srxntuple(fd, &ifr);
 	} else if (mode == MODE_GNTUPLE) {
@@ -2751,6 +2796,123 @@ static int do_grxclass(int fd, struct ifreq *ifr)
 	return 0;
 }
 
+static int do_grxfhindir(int fd, struct ifreq *ifr)
+{
+	struct ethtool_rxnfc ring_count;
+	struct ethtool_rxfh_indir indir_head;
+	struct ethtool_rxfh_indir *indir;
+	u32 i;
+	int err;
+
+	ring_count.cmd = ETHTOOL_GRXRINGS;
+	ifr->ifr_data = (caddr_t) &ring_count;
+	err = send_ioctl(fd, ifr);
+	if (err < 0) {
+		perror("Cannot get RX ring count");
+		return 102;
+	}
+
+	indir_head.cmd = ETHTOOL_GRXFHINDIR;
+	indir_head.size = 0;
+	ifr->ifr_data = (caddr_t) &indir_head;
+	err = send_ioctl(fd, ifr);
+	if (err < 0) {
+		perror("Cannot get RX flow hash indirection table size");
+		return 103;
+	}
+
+	indir = malloc(sizeof(*indir) +
+		       indir_head.size * sizeof(*indir->ring_index));
+	indir->cmd = ETHTOOL_GRXFHINDIR;
+	indir->size = indir_head.size;
+	ifr->ifr_data = (caddr_t) indir;
+	err = send_ioctl(fd, ifr);
+	if (err < 0) {
+		perror("Cannot get RX flow hash indirection table");
+		return 103;
+	}
+
+	printf("RX flow hash indirection table for %s with %llu RX ring(s):\n",
+	       devname, ring_count.data);
+	for (i = 0; i < indir->size; i++) {
+		if (i % 8 == 0)
+			printf("%5zu: ", i);
+		printf(" %5u", indir->ring_index[i]);
+		if (i % 8 == 7)
+			fputc('\n', stdout);
+	}
+	return 0;
+}
+
+static int do_srxfhindir(int fd, struct ifreq *ifr)
+{
+	struct ethtool_rxfh_indir indir_head;
+	struct ethtool_rxfh_indir *indir;
+	u32 i;
+	int err;
+
+	if (!rxfhindir_equal && !rxfhindir_weight)
+		show_usage(1);
+
+	indir_head.cmd = ETHTOOL_GRXFHINDIR;
+	indir_head.size = 0;
+	ifr->ifr_data = (caddr_t) &indir_head;
+	err = send_ioctl(fd, ifr);
+	if (err < 0) {
+		perror("Cannot get RX flow hash indirection table size");
+		return 104;
+	}
+
+	indir = malloc(sizeof(*indir) +
+		       indir_head.size * sizeof(*indir->ring_index));
+	indir->cmd = ETHTOOL_SRXFHINDIR;
+	indir->size = indir_head.size;
+
+	if (rxfhindir_equal) {
+		for (i = 0; i < indir->size; i++)
+			indir->ring_index[i] = i % rxfhindir_equal;
+	} else {
+		u32 j, weight, sum = 0, partial = 0;
+
+		for (j = 0; rxfhindir_weight[j]; j++) {
+			weight = get_u32(rxfhindir_weight[j], 0);
+			sum += weight;
+		}
+
+		if (sum == 0) {
+			fprintf(stderr,
+				"At least one weight must be non-zero\n");
+			exit(1);
+		}
+
+		if (sum > indir->size) {
+			fprintf(stderr,
+				"Total weight exceeds the size of the "
+				"indirection table\n");
+			exit(1);
+		}
+
+		j = -1;
+		for (i = 0; i < indir->size; i++) {
+			if (i >= indir->size * partial / sum) {
+				j += 1;
+				weight = get_u32(rxfhindir_weight[j], 0);
+				partial += weight;
+			}
+			indir->ring_index[i] = j;
+		}
+	}
+
+	ifr->ifr_data = (caddr_t) indir;
+	err = send_ioctl(fd, ifr);
+	if (err < 0) {
+		perror("Cannot set RX flow hash indirection table");
+		return 105;
+	}
+
+	return 0;
+}
+
 static int do_flash(int fd, struct ifreq *ifr)
 {
 	struct ethtool_flash efl;
-- 
1.6.2.5

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH ethtool 1/2] ethtool-copy.h: sync with net-next
From: Ben Hutchings @ 2010-06-30 15:12 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: netdev, linux-net-drivers

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
This "sync" assumes that "ethtool: Add support for control of RX flow
hash indirection" has been merged into net-next-2.6.

Ben.

 ethtool-copy.h |   46 +++++++++++++++++++++++++++++-----------------
 1 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/ethtool-copy.h b/ethtool-copy.h
index c5eb680..0b3aea2 100644
--- a/ethtool-copy.h
+++ b/ethtool-copy.h
@@ -36,7 +36,7 @@ struct ethtool_cmd {
 	__u32	reserved[2];
 };
 
-static inline void ethtool_cmd_speed_set(struct ethtool_cmd *ep,
+static __inline__ void ethtool_cmd_speed_set(struct ethtool_cmd *ep,
 						__u32 speed)
 {
 
@@ -44,7 +44,7 @@ static inline void ethtool_cmd_speed_set(struct ethtool_cmd *ep,
 	ep->speed_hi = (__u16)(speed >> 16);
 }
 
-static inline __u32 ethtool_cmd_speed(struct ethtool_cmd *ep)
+static __inline__ __u32 ethtool_cmd_speed(struct ethtool_cmd *ep)
 {
 	return (ep->speed_hi << 16) | ep->speed;
 }
@@ -384,6 +384,15 @@ struct ethtool_rxnfc {
 	__u32				rule_locs[0];
 };
 
+struct ethtool_rxfh_indir {
+	__u32	cmd;
+	/* On entry, this is the array size of the user buffer.  On
+	 * return from ETHTOOL_GRXFHINDIR, this is the array size of
+	 * the hardware indirection table. */
+	__u32	size;
+	__u32	ring_index[0];	/* ring/queue index for each hash value */
+};
+
 struct ethtool_rx_ntuple_flow_spec {
 	__u32		 flow_type;
 	union {
@@ -425,6 +434,7 @@ struct ethtool_flash {
 	char	data[ETHTOOL_FLASH_MAX_FILENAME];
 };
 
+
 /* CMDs currently supported */
 #define ETHTOOL_GSET		0x00000001 /* Get settings. */
 #define ETHTOOL_SSET		0x00000002 /* Set settings. */
@@ -468,21 +478,23 @@ struct ethtool_flash {
 #define ETHTOOL_GPFLAGS		0x00000027 /* Get driver-private flags bitmap */
 #define ETHTOOL_SPFLAGS		0x00000028 /* Set driver-private flags bitmap */
 
-#define	ETHTOOL_GRXFH		0x00000029 /* Get RX flow hash configuration */
-#define	ETHTOOL_SRXFH		0x0000002a /* Set RX flow hash configuration */
+#define ETHTOOL_GRXFH		0x00000029 /* Get RX flow hash configuration */
+#define ETHTOOL_SRXFH		0x0000002a /* Set RX flow hash configuration */
 #define ETHTOOL_GGRO		0x0000002b /* Get GRO enable (ethtool_value) */
 #define ETHTOOL_SGRO		0x0000002c /* Set GRO enable (ethtool_value) */
-#define	ETHTOOL_GRXRINGS	0x0000002d /* Get RX rings available for LB */
-#define	ETHTOOL_GRXCLSRLCNT	0x0000002e /* Get RX class rule count */
-#define	ETHTOOL_GRXCLSRULE	0x0000002f /* Get RX classification rule */
-#define	ETHTOOL_GRXCLSRLALL	0x00000030 /* Get all RX classification rule */
-#define	ETHTOOL_SRXCLSRLDEL	0x00000031 /* Delete RX classification rule */
-#define	ETHTOOL_SRXCLSRLINS	0x00000032 /* Insert RX classification rule */
-#define	ETHTOOL_FLASHDEV	0x00000033 /* Flash firmware to device */
-#define	ETHTOOL_RESET		0x00000034 /* Reset hardware */
-#define	ETHTOOL_SRXNTUPLE	0x00000035 /* Add an n-tuple filter to device */
-#define	ETHTOOL_GRXNTUPLE	0x00000036 /* Get n-tuple filters from device */
-#define	ETHTOOL_GSSET_INFO	0x00000037 /* Get string set info */
+#define ETHTOOL_GRXRINGS	0x0000002d /* Get RX rings available for LB */
+#define ETHTOOL_GRXCLSRLCNT	0x0000002e /* Get RX class rule count */
+#define ETHTOOL_GRXCLSRULE	0x0000002f /* Get RX classification rule */
+#define ETHTOOL_GRXCLSRLALL	0x00000030 /* Get all RX classification rule */
+#define ETHTOOL_SRXCLSRLDEL	0x00000031 /* Delete RX classification rule */
+#define ETHTOOL_SRXCLSRLINS	0x00000032 /* Insert RX classification rule */
+#define ETHTOOL_FLASHDEV	0x00000033 /* Flash firmware to device */
+#define ETHTOOL_RESET		0x00000034 /* Reset hardware */
+#define ETHTOOL_SRXNTUPLE	0x00000035 /* Add an n-tuple filter to device */
+#define ETHTOOL_GRXNTUPLE	0x00000036 /* Get n-tuple filters from device */
+#define ETHTOOL_GSSET_INFO	0x00000037 /* Get string set info */
+#define ETHTOOL_GRXFHINDIR	0x00000038 /* Get RX flow hash indir'n table */
+#define ETHTOOL_SRXFHINDIR	0x00000039 /* Set RX flow hash indir'n table */
 
 /* compatibility with older code */
 #define SPARC_ETH_GSET		ETHTOOL_GSET
@@ -602,8 +614,8 @@ struct ethtool_flash {
 #define	AH_V6_FLOW	0x0b
 #define	ESP_V6_FLOW	0x0c
 #define	IP_USER_FLOW	0x0d
-#define IPV4_FLOW       0x10
-#define IPV6_FLOW       0x11
+#define	IPV4_FLOW	0x10
+#define	IPV6_FLOW	0x11
 
 /* L3-L4 network traffic flow hash options */
 #define	RXH_L2DA	(1 << 1)
-- 
1.6.2.5


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH net-next-2.6 2/2] sfc: Add support for RX flow hash control
From: Ben Hutchings @ 2010-06-30 15:06 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1277910323.2082.14.camel@achroite.uk.solarflarecom.com>

Allow ethtool to query the number of RX rings, the fields used in RX
flow hashing and the hash indirection table.

Allow ethtool to update the RX flow hash indirection table.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/sfc/efx.c        |    3 +
 drivers/net/sfc/ethtool.c    |   90 ++++++++++++++++++++++++++++++++++++++++++
 drivers/net/sfc/net_driver.h |    2 +
 drivers/net/sfc/nic.c        |   19 ++++-----
 drivers/net/sfc/nic.h        |    1 +
 5 files changed, 105 insertions(+), 10 deletions(-)

diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index 2a90bf9..35b3f29 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -1121,6 +1121,7 @@ static void efx_set_channels(struct efx_nic *efx)
 
 static int efx_probe_nic(struct efx_nic *efx)
 {
+	size_t i;
 	int rc;
 
 	netif_dbg(efx, probe, efx->net_dev, "creating NIC\n");
@@ -1136,6 +1137,8 @@ static int efx_probe_nic(struct efx_nic *efx)
 
 	if (efx->n_channels > 1)
 		get_random_bytes(&efx->rx_hash_key, sizeof(efx->rx_hash_key));
+	for (i = 0; i < ARRAY_SIZE(efx->rx_indir_table); i++)
+		efx->rx_indir_table[i] = i % efx->n_rx_channels;
 
 	efx_set_channels(efx);
 	efx->net_dev->real_num_tx_queues = efx->n_tx_channels;
diff --git a/drivers/net/sfc/ethtool.c b/drivers/net/sfc/ethtool.c
index 7693cfb..ab2510d 100644
--- a/drivers/net/sfc/ethtool.c
+++ b/drivers/net/sfc/ethtool.c
@@ -871,6 +871,93 @@ extern int efx_ethtool_reset(struct net_device *net_dev, u32 *flags)
 	return efx_reset(efx, method);
 }
 
+static int
+efx_ethtool_get_rxnfc(struct net_device *net_dev,
+		      struct ethtool_rxnfc *info, void *rules __always_unused)
+{
+	struct efx_nic *efx = netdev_priv(net_dev);
+	
+	switch (info->cmd) {
+	case ETHTOOL_GRXRINGS:
+		info->data = efx->n_rx_channels;
+		return 0;
+
+	case ETHTOOL_GRXFH: {
+		unsigned min_revision = 0;
+
+		info->data = 0;
+		switch (info->flow_type) {
+		case TCP_V4_FLOW:
+			info->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+			/* fall through */
+		case UDP_V4_FLOW:
+		case SCTP_V4_FLOW:
+		case AH_ESP_V4_FLOW:
+		case IPV4_FLOW:
+			info->data |= RXH_IP_SRC | RXH_IP_DST;
+			min_revision = EFX_REV_FALCON_B0;
+			break;
+		case TCP_V6_FLOW:
+			info->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3;
+			/* fall through */
+		case UDP_V6_FLOW:
+		case SCTP_V6_FLOW:
+		case AH_ESP_V6_FLOW:
+		case IPV6_FLOW:
+			info->data |= RXH_IP_SRC | RXH_IP_DST;
+			min_revision = EFX_REV_SIENA_A0;
+			break;
+		default:
+			break;
+		}
+		if (efx_nic_rev(efx) < min_revision)
+			info->data = 0;
+		return 0;
+	}
+
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int efx_ethtool_get_rxfh_indir(struct net_device *net_dev,
+				      struct ethtool_rxfh_indir *indir)
+{
+	struct efx_nic *efx = netdev_priv(net_dev);
+	size_t copy_size =
+		min_t(size_t, indir->size, ARRAY_SIZE(efx->rx_indir_table));
+
+	if (efx_nic_rev(efx) < EFX_REV_FALCON_B0)
+		return -EOPNOTSUPP;
+
+	indir->size = ARRAY_SIZE(efx->rx_indir_table);
+	memcpy(indir->ring_index, efx->rx_indir_table,
+	       copy_size * sizeof(indir->ring_index[0]));
+	return 0;
+}
+
+static int efx_ethtool_set_rxfh_indir(struct net_device *net_dev,
+				      const struct ethtool_rxfh_indir *indir)
+{
+	struct efx_nic *efx = netdev_priv(net_dev);
+	size_t i;
+
+	if (efx_nic_rev(efx) < EFX_REV_FALCON_B0)
+		return -EOPNOTSUPP;
+
+	/* Validate size and indices */
+	if (indir->size != ARRAY_SIZE(efx->rx_indir_table))
+		return -EINVAL;
+	for (i = 0; i < ARRAY_SIZE(efx->rx_indir_table); i++)
+		if (indir->ring_index[i] >= efx->n_rx_channels)
+			return -EINVAL;
+
+	memcpy(efx->rx_indir_table, indir->ring_index,
+	       sizeof(efx->rx_indir_table));
+	efx_nic_push_rx_indir_table(efx);
+	return 0;
+}
+
 const struct ethtool_ops efx_ethtool_ops = {
 	.get_settings		= efx_ethtool_get_settings,
 	.set_settings		= efx_ethtool_set_settings,
@@ -908,4 +995,7 @@ const struct ethtool_ops efx_ethtool_ops = {
 	.get_wol                = efx_ethtool_get_wol,
 	.set_wol                = efx_ethtool_set_wol,
 	.reset			= efx_ethtool_reset,
+	.get_rxnfc		= efx_ethtool_get_rxnfc,
+	.get_rxfh_indir		= efx_ethtool_get_rxfh_indir,
+	.set_rxfh_indir		= efx_ethtool_set_rxfh_indir,
 };
diff --git a/drivers/net/sfc/net_driver.h b/drivers/net/sfc/net_driver.h
index ba272a4..64e7caa 100644
--- a/drivers/net/sfc/net_driver.h
+++ b/drivers/net/sfc/net_driver.h
@@ -647,6 +647,7 @@ union efx_multicast_hash {
  * @n_tx_channels: Number of channels used for TX
  * @rx_buffer_len: RX buffer length
  * @rx_buffer_order: Order (log2) of number of pages for each RX buffer
+ * @rx_indir_table: Indirection table for RSS
  * @int_error_count: Number of internal errors seen recently
  * @int_error_expire: Time at which error count will be expired
  * @irq_status: Interrupt status buffer
@@ -734,6 +735,7 @@ struct efx_nic {
 	unsigned int rx_buffer_len;
 	unsigned int rx_buffer_order;
 	u8 rx_hash_key[40];
+	u32 rx_indir_table[128];
 
 	unsigned int_error_count;
 	unsigned long int_error_expire;
diff --git a/drivers/net/sfc/nic.c b/drivers/net/sfc/nic.c
index 3083657..f595d92 100644
--- a/drivers/net/sfc/nic.c
+++ b/drivers/net/sfc/nic.c
@@ -1484,22 +1484,21 @@ static irqreturn_t efx_msi_interrupt(int irq, void *dev_id)
 /* Setup RSS indirection table.
  * This maps from the hash value of the packet to RXQ
  */
-static void efx_setup_rss_indir_table(struct efx_nic *efx)
+void efx_nic_push_rx_indir_table(struct efx_nic *efx)
 {
-	int i = 0;
-	unsigned long offset;
+	size_t i = 0;
 	efx_dword_t dword;
 
 	if (efx_nic_rev(efx) < EFX_REV_FALCON_B0)
 		return;
 
-	for (offset = FR_BZ_RX_INDIRECTION_TBL;
-	     offset < FR_BZ_RX_INDIRECTION_TBL + 0x800;
-	     offset += 0x10) {
+	BUILD_BUG_ON(ARRAY_SIZE(efx->rx_indir_table) !=
+		     FR_BZ_RX_INDIRECTION_TBL_ROWS);
+
+	for (i = 0; i < FR_BZ_RX_INDIRECTION_TBL_ROWS; i++) {
 		EFX_POPULATE_DWORD_1(dword, FRF_BZ_IT_QUEUE,
-				     i % efx->n_rx_channels);
-		efx_writed(efx, &dword, offset);
-		i++;
+				     efx->rx_indir_table[i]);
+		efx_writed_table(efx, &dword, FR_BZ_RX_INDIRECTION_TBL, i);
 	}
 }
 
@@ -1634,7 +1633,7 @@ void efx_nic_init_common(struct efx_nic *efx)
 	EFX_INVERT_OWORD(temp);
 	efx_writeo(efx, &temp, FR_AZ_FATAL_INTR_KER);
 
-	efx_setup_rss_indir_table(efx);
+	efx_nic_push_rx_indir_table(efx);
 
 	/* Disable the ugly timer-based TX DMA backoff and allow TX DMA to be
 	 * controlled by the RX FIFO fill level. Set arbitration to one pkt/Q.
diff --git a/drivers/net/sfc/nic.h b/drivers/net/sfc/nic.h
index a39822d..0438dc9 100644
--- a/drivers/net/sfc/nic.h
+++ b/drivers/net/sfc/nic.h
@@ -207,6 +207,7 @@ extern void falcon_stop_nic_stats(struct efx_nic *efx);
 extern void falcon_setup_xaui(struct efx_nic *efx);
 extern int falcon_reset_xaui(struct efx_nic *efx);
 extern void efx_nic_init_common(struct efx_nic *efx);
+extern void efx_nic_push_rx_indir_table(struct efx_nic *efx);
 
 int efx_nic_alloc_buffer(struct efx_nic *efx, struct efx_buffer *buffer,
 			 unsigned int len);
-- 
1.6.2.5

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCH net-next-2.6 1/2] ethtool: Add support for control of RX flow hash indirection
From: Ben Hutchings @ 2010-06-30 15:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers

Many NICs use an indirection table to map an RX flow hash value to one
of an arbitrary number of queues (not necessarily a power of 2).  It
can be useful to remove some queues from this indirection table so
that they are only used for flows that are specifically filtered
there.  It may also be useful to weight the mapping to account for
user processes with the same CPU-affinity as the RX interrupts.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 include/linux/ethtool.h |   15 +++++++++
 net/core/ethtool.c      |   80 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 95 insertions(+), 0 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 2c8af09..3b089d8 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -384,6 +384,15 @@ struct ethtool_rxnfc {
 	__u32				rule_locs[0];
 };
 
+struct ethtool_rxfh_indir {
+	__u32	cmd;
+	/* On entry, this is the array size of the user buffer.  On
+	 * return from ETHTOOL_GRXFHINDIR, this is the array size of
+	 * the hardware indirection table. */
+	__u32	size;
+	__u32	ring_index[0];	/* ring/queue index for each hash value */
+};
+
 struct ethtool_rx_ntuple_flow_spec {
 	__u32		 flow_type;
 	union {
@@ -576,6 +585,10 @@ struct ethtool_ops {
 	int	(*set_rx_ntuple)(struct net_device *,
 				 struct ethtool_rx_ntuple *);
 	int	(*get_rx_ntuple)(struct net_device *, u32 stringset, void *);
+	int	(*get_rxfh_indir)(struct net_device *,
+				  struct ethtool_rxfh_indir *);
+	int	(*set_rxfh_indir)(struct net_device *,
+				  const struct ethtool_rxfh_indir *);
 };
 #endif /* __KERNEL__ */
 
@@ -637,6 +650,8 @@ struct ethtool_ops {
 #define ETHTOOL_SRXNTUPLE	0x00000035 /* Add an n-tuple filter to device */
 #define ETHTOOL_GRXNTUPLE	0x00000036 /* Get n-tuple filters from device */
 #define ETHTOOL_GSSET_INFO	0x00000037 /* Get string set info */
+#define ETHTOOL_GRXFHINDIR	0x00000038 /* Get RX flow hash indir'n table */
+#define ETHTOOL_SRXFHINDIR	0x00000039 /* Set RX flow hash indir'n table */
 
 /* compatibility with older code */
 #define SPARC_ETH_GSET		ETHTOOL_GSET
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index a0f4964..d978096 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -376,6 +376,80 @@ err_out:
 	return ret;
 }
 
+static noinline_for_stack int ethtool_get_rxfh_indir(struct net_device *dev,
+						     void __user *useraddr)
+{
+	struct ethtool_rxfh_indir *indir;
+	u32 table_size;
+	size_t full_size;
+	int ret;
+
+	if (!dev->ethtool_ops->get_rxfh_indir)
+		return -EOPNOTSUPP;
+
+	if (copy_from_user(&table_size,
+			   useraddr + offsetof(struct ethtool_rxfh_indir, size),
+			   sizeof(table_size)))
+		return -EFAULT;
+
+	if (table_size >
+	    (KMALLOC_MAX_SIZE - sizeof(*indir)) / sizeof(*indir->ring_index))
+		return -ENOMEM;
+	full_size = sizeof(*indir) + sizeof(*indir->ring_index) * table_size;
+	indir = kmalloc(full_size, GFP_USER);
+	if (!indir)
+		return -ENOMEM;
+
+	indir->cmd = ETHTOOL_GRXFHINDIR;
+	indir->size = table_size;
+	ret = dev->ethtool_ops->get_rxfh_indir(dev, indir);
+	if (ret)
+		goto out;
+
+	if (copy_to_user(useraddr, indir, full_size))
+		ret = -EFAULT;
+
+out:
+	kfree(indir);
+	return ret;
+}
+
+static noinline_for_stack int ethtool_set_rxfh_indir(struct net_device *dev,
+						     void __user *useraddr)
+{
+	struct ethtool_rxfh_indir *indir;
+	u32 table_size;
+	size_t full_size;
+	int ret;
+
+	if (!dev->ethtool_ops->set_rxfh_indir)
+		return -EOPNOTSUPP;
+
+	if (copy_from_user(&table_size,
+			   useraddr + offsetof(struct ethtool_rxfh_indir, size),
+			   sizeof(table_size)))
+		return -EFAULT;
+
+	if (table_size >
+	    (KMALLOC_MAX_SIZE - sizeof(*indir)) / sizeof(*indir->ring_index))
+		return -ENOMEM;
+	full_size = sizeof(*indir) + sizeof(*indir->ring_index) * table_size;
+	indir = kmalloc(full_size, GFP_USER);
+	if (!indir)
+		return -ENOMEM;
+
+	if (copy_from_user(indir, useraddr, full_size)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	ret = dev->ethtool_ops->set_rxfh_indir(dev, indir);
+
+out:
+	kfree(indir);
+	return ret;
+}
+
 static void __rx_ntuple_filter_add(struct ethtool_rx_ntuple_list *list,
 			struct ethtool_rx_ntuple_flow_spec *spec,
 			struct ethtool_rx_ntuple_flow_spec_container *fsc)
@@ -1544,6 +1618,12 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 	case ETHTOOL_GSSET_INFO:
 		rc = ethtool_get_sset_info(dev, useraddr);
 		break;
+	case ETHTOOL_GRXFHINDIR:
+		rc = ethtool_get_rxfh_indir(dev, useraddr);
+		break;
+	case ETHTOOL_SRXFHINDIR:
+		rc = ethtool_set_rxfh_indir(dev, useraddr);
+		break;
 	default:
 		rc = -EOPNOTSUPP;
 	}
-- 
1.6.2.5


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox