Netdev List


* Re: [PATCH net-next-2.6 12/12] qlcnic: convert to set_phys_id
From: David Miller @ 2011-04-06 22:06 UTC (permalink / raw)
  To: shemminger; +Cc: bhutchings, amit.salecha, anirban.chakraborty, netdev
In-Reply-To: <20110406144723.26467b77@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 6 Apr 2011 14:47:23 -0700

> Convert driver to use new ethtool set_phys_id.
> Not completely sure that this is correct for all cases of device
> up/down and doing operation. Compile tested only.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

* Re: [PATCH 8/8] ewrk3: convert to set_phys_id
From: David Miller @ 2011-04-06 22:06 UTC (permalink / raw)
  To: shemminger; +Cc: bhutchings, netdev
In-Reply-To: <20110406145836.35af537a@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 6 Apr 2011 14:58:36 -0700

> Use ethtool infrastructure for blinking, which now does
> locking at a higher level.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

* Re: [PATCH 07/19] timberdale: mfd_cell is now implicitly available to drivers
From: Greg KH @ 2011-04-06 22:09 UTC (permalink / raw)
  To: Felipe Balbi
  Cc: Samuel Ortiz, Grant Likely, Andres Salomon,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Mark Brown,
	khali-PUYAD+kWke1g9hUCZPvPmw, ben-linux-elnMNo+KYs3YtjvyW6yDsg,
	Peter Korsgaard, Mauro Carvalho Chehab, David Brownell,
	linux-i2c-u79uwXL29TY76Z2rM5mHXA,
	linux-media-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	spi-devel-general-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	Mocean Laboratories
In-Reply-To: <20110406185902.GN25654-UiBtZHVXSwEVvW8u9ZQWYwjfymiNCTlR@public.gmane.org>

On Wed, Apr 06, 2011 at 09:59:02PM +0300, Felipe Balbi wrote:
> Hi,
> 
> On Wed, Apr 06, 2011 at 08:47:34PM +0200, Samuel Ortiz wrote:
> > > > > What is a "MFD cell pointer" and why is it needed in struct device?
> > > > An MFD cell is an MFD instantiated device.
> > > > MFD (Multi Function Device) drivers instantiate platform devices. Those
> > > > devices drivers sometimes need a platform data pointer, sometimes an MFD
> > > > specific pointer, and sometimes both. Also, some of those drivers have been
> > > > implemented as MFD sub drivers, while others know nothing about MFD and just
> > > > expect a plain platform_data pointer.
> > > 
> > > That sounds like a bug in those drivers, why not fix them to properly
> > > pass in the correct pointer?
> > Because they're drivers for generic IPs, not MFD ones. By forcing them to use
> > MFD specific structure and APIs, we make it more difficult for platform code
> > to instantiate them.
> 
> I agree. What I do on those cases is to have a simple platform_device
> for the core IP driver and use platform_device_id tables to do runtime
> checks of the small differences. If one platform X doesn't use a
> platform_bus, it uses e.g. PCI, then you make a PCI "bridge" which
> allocates a platform_device with the correct name and adds that to the
> driver model.
> 
> See [1] (for the core driver) and [2] (for a PCI bridge driver) for an
> example of what I'm talking about.

Yes, thanks for providing a real example, this is the best way to handle
this.
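
For readers without the referenced drivers at hand, the id-table part of
that approach can be modelled in plain user-space C. This is only a rough
sketch: `struct ip_device_id`, `IP_QUIRK_NO_DMA`, the device names and both
functions are hypothetical and merely imitate the name/driver_data pairing
of a platform_device_id table. None of it is taken from the drivers in [1]
or [2].

```c
#include <stddef.h>
#include <string.h>

#define IP_QUIRK_NO_DMA	(1UL << 0)	/* hypothetical per-variant flag */

/* Loosely modelled on a platform_device_id entry: a name plus driver data. */
struct ip_device_id {
	const char *name;
	unsigned long driver_data;
};

static const struct ip_device_id ip_core_ids[] = {
	{ "ip-core",     0 },			/* plain platform variant */
	{ "ip-core-pci", IP_QUIRK_NO_DMA },	/* variant created by a PCI bridge */
	{ NULL, 0 }				/* terminator */
};

/* The core driver looks up which variant it was bound to. */
const struct ip_device_id *ip_core_match(const char *name)
{
	const struct ip_device_id *id;

	for (id = ip_core_ids; id->name; id++)
		if (strcmp(id->name, name) == 0)
			return id;
	return NULL;
}

/* A bus "bridge" only has to pick the name that selects the variant. */
const char *bridge_device_name(int on_pci)
{
	return on_pci ? "ip-core-pci" : "ip-core";
}
```

The core driver then branches on `driver_data` at run time, so it needs no
MFD-specific structure at all.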

thanks,

greg k-h

* Re: [PATCH] net: ethtool support to configure number of channels
From: Ben Hutchings @ 2011-04-06 22:18 UTC (permalink / raw)
  To: Anirban Chakraborty
  Cc: Amit Salecha, davem@davemloft.net, netdev@vger.kernel.org,
	Ameen Rahman, Sucheta Chakraborty
In-Reply-To: <BEEA5B09-0726-463D-B542-234C4FB49FC5@qlogic.com>

On Fri, 2011-04-01 at 22:47 -0700, Anirban Chakraborty wrote:
> On Apr 1, 2011, at 7:55 PM, Ben Hutchings wrote:
> 
> > On Fri, 2011-04-01 at 21:36 -0500, Amit Salecha wrote:
> >>> I'm not sure why you reduced this to a single count.  If the driver
> >>> or hardware doesn't allow certain combinations of counts, it might be
> >>> necessary to configure several types at the same time.
> >>>
> >>>> +/* Channel ID is made up of a type */
> >>>> +enum ethtool_channel_id {
> >>>> +   ETH_CHAN_TYPE_RX = 0x1,
> >>>> +   ETH_CHAN_TYPE_TX = 0x2
> >>>> +};
> >>> [...]
> >>>
> >>> enum ethtool_channel_id was meant to be an identifier of a specific
> >>> channel.  An enumeration of channel types should be named differently.
> >>>
> >>
> >> I will name it as ethtool_channel_type. Any other suggestion ?
> >>
> >>> This also omits the 'combined' and 'other' types.  Most multiqueue
> >>> drivers pair up RX and TX queues so that most channels combine RX and
> >>> TX
> >>> work.
> >>
> >> 'combined' is ok, what is use of 'other' ?
> >
> > Could be link interrupts, SR-IOV coordination, or something else.  Not
> > something you'd likely be able to change, but it could be useful to know
> > that some interrupts are allocated to them.  Actually, that does mean it
> > might be helpful for the 'get' operation to return a minimum value along
> > with the maximum value.
> 
> Are you thinking of using the 'other' field as a way to represent a 'virtual port'
> that a VF could have? A virtual port could have a set of rx/tx rings, interrupts,
> QoS parameters, MAC filters, VLAN ids, etc. A VF could have one or many such
> channels. If that's the case, I would think that configuring these channels should
> be done via a PF rather than on a VF. It is possible I have misunderstood you here;
> however, it would be good to hear your thoughts.

The net device for a VF could have all sorts of channels, and their
numbers may or may not be configurable depending on limitations of the
hardware, firmware, driver or hypervisor.

The channel counts reported by a net device should include all those
IRQs allocated by the net device driver for its parent device (e.g. a
PCI device).

To be more concrete, here is how I would count channels:

1. RX queue with IRQ, exposed to the network stack => RX channel
2. TX queue with IRQ, exposed to the network stack => TX channel
3. RX and TX queue sharing IRQ, exposed to the network stack => combined channel
4. Link change IRQ => other channel
5. Queue(s) and IRQ for iSCSI traffic, not exposed to the network stack => other channel
6. Queue(s) and IRQ for coordination between PCI functions => other channel
7. Queue(s) and IRQ allocated to other PCI function => not counted
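
The counting rules above can be sketched in C roughly as follows. This is
an illustration only: `struct queue_group`, its fields, `struct
channel_counts` and `count_channels()` are hypothetical names made up for
this example and are not part of any proposed ethtool interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one IRQ and the queues it services. */
struct queue_group {
	bool has_rx;		/* services at least one RX queue */
	bool has_tx;		/* services at least one TX queue */
	bool exposed_to_stack;	/* queues are visible to the network stack */
	bool owned_by_this_fn;	/* IRQ belongs to this PCI function */
};

struct channel_counts {
	unsigned int rx, tx, combined, other;
};

struct channel_counts count_channels(const struct queue_group *g, size_t n)
{
	struct channel_counts c = { 0, 0, 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		if (!g[i].owned_by_this_fn)
			continue;		/* rule 7: not counted */
		if (!g[i].exposed_to_stack)
			c.other++;		/* rules 4-6 */
		else if (g[i].has_rx && g[i].has_tx)
			c.combined++;		/* rule 3 */
		else if (g[i].has_rx)
			c.rx++;			/* rule 1 */
		else if (g[i].has_tx)
			c.tx++;			/* rule 2 */
		else
			c.other++;		/* IRQ without any queue */
	}
	return c;
}
```

Feeding one entry per rule into `count_channels()` should give one RX, one
TX, one combined and three other channels, with the entry for the foreign
PCI function ignored.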

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* Fw: [Bug 32832] New: shutdown(2) does not fully shut down socket any more
From: Stephen Hemminger @ 2011-04-06 23:07 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Wed, 6 Apr 2011 22:42:39 GMT
From: bugzilla-daemon@bugzilla.kernel.org
To: shemminger@linux-foundation.org
Subject: [Bug 32832] New: shutdown(2) does not fully shut down socket any more


https://bugzilla.kernel.org/show_bug.cgi?id=32832

           Summary: shutdown(2) does not fully shut down socket any more
           Product: Networking
           Version: 2.5
    Kernel Version: 2.6.38
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: IPV4
        AssignedTo: shemminger@linux-foundation.org
        ReportedBy: kees@outflux.net
        Regression: Yes


In 2.6.35 and earlier, shutdown(2) will fully remove a socket. This does not
appear to be true any more and is causing software to misbehave.

2.6.35:
$ ./testcase
parent: 5957
before:
tcp        0      0 0.0.0.0:12345           0.0.0.0:*               LISTEN     
after:
child: 5961
$ ./testcase
parent: 6001
before:
tcp        0      0 0.0.0.0:12345           0.0.0.0:*               LISTEN     
after:
child: 6002

2.6.38:
$ ./testcase
parent: 1138
before:
tcp        0      0 0.0.0.0:12345           0.0.0.0:*               LISTEN     
after:
child: 1142
$ ./testcase
bind: Address already in use

The listener doesn't show up in netstat any more, but as long as the child
process is running, the socket is unavailable. It is as if the shutdown(2)
behavior has partially reverted to close(2) behavior (but in the case of using
close(2), the child's socket would remain visible in netstat).

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.


-- 

* [RFC net-next] qlge: use ethtool set_phys_id
From: Stephen Hemminger @ 2011-04-06 23:47 UTC (permalink / raw)
  To: Ron Mercer, David Miller; +Cc: linux-driver, netdev

This is a stab at replacing old ethtool phys_id with set_phys_id
on the Qlogic 10Gb driver. Compile tested only.

Not sure if set_led_cfg will flash continuously, or needs
to be replaced by ETHTOOL_ID_ON/ETHTOOL_ID_OFF

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/drivers/net/qlge/qlge_ethtool.c	2011-04-06 16:28:33.897200810 -0700
+++ b/drivers/net/qlge/qlge_ethtool.c	2011-04-06 16:39:55.140139828 -0700
@@ -412,31 +412,31 @@ static int ql_set_wol(struct net_device
 	return 0;
 }
 
-static int ql_phys_id(struct net_device *ndev, u32 data)
+static int ql_set_phys_id(struct net_device *ndev,
+			  enum ethtool_phys_id_state state)
+
 {
 	struct ql_adapter *qdev = netdev_priv(ndev);
-	u32 led_reg, i;
-	int status;
-
-	/* Save the current LED settings */
-	status = ql_mb_get_led_cfg(qdev);
-	if (status)
-		return status;
-	led_reg = qdev->led_config;
 
-	/* Start blinking the led */
-	if (!data || data > 300)
-		data = 300;
+	switch (state) {
+	case ETHTOOL_ID_ACTIVE:
+		/* Save the current LED settings */
+		if (ql_mb_get_led_cfg(qdev))
+			return -EIO;
 
-	for (i = 0; i < (data * 10); i++)
+		/* Start blinking */
 		ql_mb_set_led_cfg(qdev, QL_LED_BLINK);
+		return 0;
 
-	/* Restore LED settings */
-	status = ql_mb_set_led_cfg(qdev, led_reg);
-	if (status)
-		return status;
+	case ETHTOOL_ID_INACTIVE:
+		/* Restore LED settings */
+		if (ql_mb_set_led_cfg(qdev, qdev->led_config))
+			return -EIO;
+		return 0;
 
-	return 0;
+	default:
+		return -EINVAL;
+	}
 }
 
 static int ql_start_loopback(struct ql_adapter *qdev)
@@ -703,7 +703,7 @@ const struct ethtool_ops qlge_ethtool_op
 	.get_msglevel = ql_get_msglevel,
 	.set_msglevel = ql_set_msglevel,
 	.get_link = ethtool_op_get_link,
-	.phys_id		 = ql_phys_id,
+	.set_phys_id		 = ql_set_phys_id,
 	.self_test		 = ql_self_test,
 	.get_pauseparam		 = ql_get_pauseparam,
 	.set_pauseparam		 = ql_set_pauseparam,
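
On the open question in the description: as far as I understand the
set_phys_id contract, the ETHTOOL_ID_ACTIVE call may either return 0 (the
device blinks by itself until ETHTOOL_ID_INACTIVE) or return a positive
number of on/off cycles per second, in which case the caller comes back with
ETHTOOL_ID_ON and ETHTOOL_ID_OFF. The user-space mock below sketches the
second variant. `struct fake_led`, `fake_set_phys_id()` and
`simulate_identify()` are hypothetical, and the enum is redeclared locally
so that the example compiles without kernel headers. Whether QL_LED_BLINK
keeps flashing by itself is a hardware question this sketch cannot answer.

```c
#include <errno.h>
#include <stdbool.h>

/* Local copy of the state names, so no kernel header is needed. */
enum ethtool_phys_id_state {
	ETHTOOL_ID_INACTIVE,
	ETHTOOL_ID_ACTIVE,
	ETHTOOL_ID_ON,
	ETHTOOL_ID_OFF
};

struct fake_led {
	bool lit;		/* current LED state */
	bool saved;		/* state remembered at ACTIVE time */
	unsigned int toggles;	/* number of ON/OFF calls seen */
};

int fake_set_phys_id(struct fake_led *led, enum ethtool_phys_id_state state)
{
	switch (state) {
	case ETHTOOL_ID_ACTIVE:
		led->saved = led->lit;	/* save the current LED setting */
		return 1;		/* request one on/off cycle per second */
	case ETHTOOL_ID_ON:
		led->lit = true;
		led->toggles++;
		return 0;
	case ETHTOOL_ID_OFF:
		led->lit = false;
		led->toggles++;
		return 0;
	case ETHTOOL_ID_INACTIVE:
		led->lit = led->saved;	/* restore the saved setting */
		return 0;
	default:
		return -EINVAL;
	}
}

/* Stand-in for the calling loop, without any sleeping or locking. */
int simulate_identify(struct fake_led *led, unsigned int seconds)
{
	int rc = fake_set_phys_id(led, ETHTOOL_ID_ACTIVE);
	unsigned int i, cycles;

	if (rc < 0)
		return rc;
	cycles = (unsigned int)rc * seconds;
	for (i = 0; i < cycles; i++) {
		fake_set_phys_id(led, ETHTOOL_ID_ON);
		fake_set_phys_id(led, ETHTOOL_ID_OFF);
	}
	return fake_set_phys_id(led, ETHTOOL_ID_INACTIVE);
}
```

If the hardware does blink without help, returning 0 from the
ETHTOOL_ID_ACTIVE case, as the patch does, would be the simpler choice.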


* Re: [Bug 32832] New: shutdown(2) does not fully shut down socket any more
From: David Miller @ 2011-04-06 23:48 UTC (permalink / raw)
  To: shemminger; +Cc: netdev
In-Reply-To: <20110406160713.7ff48ef1@nehalam>

From: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Wed, 6 Apr 2011 16:07:13 -0700

> Begin forwarded message:
> 
> Date: Wed, 6 Apr 2011 22:42:39 GMT
> From: bugzilla-daemon@bugzilla.kernel.org
> To: shemminger@linux-foundation.org
> Subject: [Bug 32832] New: shutdown(2) does not fully shut down socket any more
> 
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=32832
> 
>            Summary: shutdown(2) does not fully shut down socket any more
 ...
> 
> In 2.6.35 and earlier, shutdown(2) will fully remove a socket. This does not
> appear to be true any more and is causing software to misbehave.
> 

This is already being discussed:

http://marc.info/?l=linux-netdev&m=130176733401613&w=2

* [PATCH net-next] cxgb4: don't hold RTNL during ethtool phys_id
From: Stephen Hemminger @ 2011-04-07  0:09 UTC (permalink / raw)
  To: Dimitris Michailidis, Casey Leedom, Ben Hutchings, David Miller; +Cc: netdev

The Chelsio cxgb4 drivers implement blinking in a unique way by
waiting on the mailbox. This patch cleans it up slightly by no longer
holding the system wide network configuration lock during the process.

The patch also uses correct semantics for the time argument
which is supposed to be in seconds; and zero is supposed
to signify infinite blinking.

This is still a bad firmware interface design for this
since it means the board is basically hung while doing the blink.
But fixing it correctly would require hardware and firmware
documentation. With that information the device could be converted
to the new set_phys_id.

Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 drivers/net/cxgb4/cxgb4_main.c     |   17 ++++++++++++++---
 drivers/net/cxgb4vf/cxgb4vf_main.c |   21 +++++++++++++++++++--
 2 files changed, 33 insertions(+), 5 deletions(-)

--- a/drivers/net/cxgb4/cxgb4_main.c	2011-04-06 16:49:02.045648800 -0700
+++ b/drivers/net/cxgb4/cxgb4_main.c	2011-04-06 17:00:59.508851692 -0700
@@ -1339,12 +1339,23 @@ static int restart_autoneg(struct net_de
 static int identify_port(struct net_device *dev, u32 data)
 {
 	struct adapter *adap = netdev2adap(dev);
+	int rc;
+	unsigned long blinks;
 
 	if (data == 0)
-		data = 2;     /* default to 2 seconds */
+		blinks = UINT_MAX;
+	else
+		blinks = 2*data + data/2;
 
-	return t4_identify_port(adap, adap->fn, netdev2pinfo(dev)->viid,
-				data * 5);
+	/* Don't block networking updates while blink is in progress */
+	dev_hold(dev);
+	rtnl_unlock();
+
+	rc = t4_identify_port(adap, adap->fn, netdev2pinfo(dev)->viid,
+			      blinks);
+	rtnl_lock();
+	dev_put(dev);
+	return rc;
 }
 
 static unsigned int from_fw_linkcaps(unsigned int type, unsigned int caps)
--- a/drivers/net/cxgb4vf/cxgb4vf_main.c	2011-04-06 16:49:09.989728600 -0700
+++ b/drivers/net/cxgb4vf/cxgb4vf_main.c	2011-04-06 17:02:38.609846223 -0700
@@ -43,6 +43,7 @@
 #include <linux/etherdevice.h>
 #include <linux/debugfs.h>
 #include <linux/ethtool.h>
+#include <linux/rtnetlink.h>
 
 #include "t4vf_common.h"
 #include "t4vf_defs.h"
@@ -1352,11 +1353,27 @@ static int cxgb4vf_set_rx_csum(struct ne
 /*
  * Identify the port by blinking the port's LED.
  */
-static int cxgb4vf_phys_id(struct net_device *dev, u32 id)
+static int cxgb4vf_phys_id(struct net_device *dev, u32 data)
 {
 	struct port_info *pi = netdev_priv(dev);
+	int rc;
+	unsigned int blinks;
 
-	return t4vf_identify_port(pi->adapter, pi->viid, 5);
+	if (data == 0)
+		blinks = UINT_MAX;
+	else
+		blinks = 2*data + data/2;
+
+	/* Don't block networking updates while blink is in progress */
+	dev_hold(dev);
+	rtnl_unlock();
+
+	rc =  t4vf_identify_port(pi->adapter, pi->viid, blinks);
+
+	rtnl_lock();
+	dev_put(dev);
+
+	return rc;
 }
 
 /*
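
As a worked example of the seconds-to-blinks arithmetic in the patch: the
expression `2*data + data/2` amounts to 2.5 blinks per second with integer
truncation, so 1, 2, 3 and 4 seconds give 2, 5, 7 and 10 blinks. Because
`data` is a u32, the multiplication is done in 32 bits and wraps for very
large durations. The sketch below assumes user-space C with <stdint.h>
types; `seconds_to_blinks_saturating()` is a hypothetical variant and not
part of the patch.

```c
#include <stdint.h>

/* Mirrors the patch: 2.5 blinks per second, 0 means "blink forever". */
uint32_t seconds_to_blinks(uint32_t seconds)
{
	if (seconds == 0)
		return UINT32_MAX;
	return 2 * seconds + seconds / 2;	/* wraps modulo 2^32 */
}

/* Hypothetical variant that saturates instead of wrapping around. */
uint32_t seconds_to_blinks_saturating(uint32_t seconds)
{
	uint64_t blinks;

	if (seconds == 0)
		return UINT32_MAX;
	blinks = 2 * (uint64_t)seconds + seconds / 2;
	return blinks > UINT32_MAX ? UINT32_MAX : (uint32_t)blinks;
}
```

The wrap-around only matters for durations that nobody would realistically
request, so this is a robustness note rather than a practical bug.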

* Re: [PATCH net-next] cxgb4: don't hold RTNL during ethtool phys_id
From: Casey Leedom @ 2011-04-07  0:20 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Dimitris Michailidis, Ben Hutchings, David Miller, netdev
In-Reply-To: <20110406170929.6e427b36@nehalam>

| From: Stephen Hemminger <shemminger@linux-foundation.org>
| Date: Wednesday, April 06, 2011 05:09 pm
| 
| The Chelsio cxgb4 drivers implement blinking in a unique way by
| waiting on the mailbox. This patch cleans it up slightly by no longer
| holding the system wide network configuration lock during the process.
| 
| The patch also uses correct semantics for the time argument
| which is supposed to be in seconds; and zero is supposed
| to signify infinite blinking.
| 
| This is still a bad firmware interface design for this
| since it means the board is basically hung while doing the blink.
| But fixing it correctly would require hardware and firmware
| documentation. With that information the device could be converted
| to the new set_phys_id.
| 
| Compile tested only.
| 
| Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

  Are you assuming that the firmware won't respond with a command completion 
until the LED blinking is complete?  If so, that's a bad assumption.  The 
firmware runs as an asynchronous real-time OS.  The LED blinking simply becomes 
a thread of activity within the OS and the command completes immediately.

Casey

| ---
|  drivers/net/cxgb4/cxgb4_main.c     |   17 ++++++++++++++---
|  drivers/net/cxgb4vf/cxgb4vf_main.c |   21 +++++++++++++++++++--
|  2 files changed, 33 insertions(+), 5 deletions(-)
| 
| --- a/drivers/net/cxgb4/cxgb4_main.c	2011-04-06 16:49:02.045648800 -0700
| +++ b/drivers/net/cxgb4/cxgb4_main.c	2011-04-06 17:00:59.508851692 -0700
| @@ -1339,12 +1339,23 @@ static int restart_autoneg(struct net_de
|  static int identify_port(struct net_device *dev, u32 data)
|  {
|  	struct adapter *adap = netdev2adap(dev);
| +	int rc;
| +	unsigned long blinks;
| 
|  	if (data == 0)
| -		data = 2;     /* default to 2 seconds */
| +		blinks = UINT_MAX;
| +	else
| +		blinks = 2*data + data/2;
| 
| -	return t4_identify_port(adap, adap->fn, netdev2pinfo(dev)->viid,
| -				data * 5);
| +	/* Don't block networking updates while blink is in progress */
| +	dev_hold(dev);
| +	rtnl_unlock();
| +
| +	rc = t4_identify_port(adap, adap->fn, netdev2pinfo(dev)->viid,
| +			      blinks);
| +	rtnl_lock();
| +	dev_put(dev);
| +	return rc;
|  }
| 
|  static unsigned int from_fw_linkcaps(unsigned int type, unsigned int caps)
| --- a/drivers/net/cxgb4vf/cxgb4vf_main.c	2011-04-06 16:49:09.989728600 -0700
| +++ b/drivers/net/cxgb4vf/cxgb4vf_main.c	2011-04-06 17:02:38.609846223 -0700
| @@ -43,6 +43,7 @@
|  #include <linux/etherdevice.h>
|  #include <linux/debugfs.h>
|  #include <linux/ethtool.h>
| +#include <linux/rtnetlink.h>
| 
|  #include "t4vf_common.h"
|  #include "t4vf_defs.h"
| @@ -1352,11 +1353,27 @@ static int cxgb4vf_set_rx_csum(struct ne
|  /*
|   * Identify the port by blinking the port's LED.
|   */
| -static int cxgb4vf_phys_id(struct net_device *dev, u32 id)
| +static int cxgb4vf_phys_id(struct net_device *dev, u32 data)
|  {
|  	struct port_info *pi = netdev_priv(dev);
| +	int rc;
| +	unsigned int blinks;
| 
| -	return t4vf_identify_port(pi->adapter, pi->viid, 5);
| +	if (data == 0)
| +		blinks = UINT_MAX;
| +	else
| +		blinks = 2*data + data/2;
| +
| +	/* Don't block networking updates while blink is in progress */
| +	dev_hold(dev);
| +	rtnl_unlock();
| +
| +	rc =  t4vf_identify_port(pi->adapter, pi->viid, blinks);
| +
| +	rtnl_lock();
| +	dev_put(dev);
| +
| +	return rc;
|  }
| 
|  /*

* Re: [PATCH net-next] cxgb4: don't hold RTNL during ethtool phys_id
From: Stephen Hemminger @ 2011-04-07  0:33 UTC (permalink / raw)
  To: Casey Leedom; +Cc: Dimitris Michailidis, Ben Hutchings, David Miller, netdev
In-Reply-To: <201104061720.30219.leedom@chelsio.com>

On Wed, 6 Apr 2011 17:20:29 -0700
Casey Leedom <leedom@chelsio.com> wrote:

> | From: Stephen Hemminger <shemminger@linux-foundation.org>
> | Date: Wednesday, April 06, 2011 05:09 pm
> | 
> | The Chelsio cxgb4 drivers implement blinking in a unique way by
> | waiting on the mailbox. This patch cleans it up slightly by no longer
> | holding the system wide network configuration lock during the process.
> | 
> | The patch also uses correct semantics for the time argument
> | which is supposed to be in seconds; and zero is supposed
> | to signify infinite blinking.
> | 
> | This is still a bad firmware interface design for this
> | since it means the board is basically hung while doing the blink.
> | But fixing it correctly would require hardware and firmware
> | documentation. With that information the device could be converted
> | to the new set_phys_id.
> | 
> | Compile tested only.
> | 
> | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
>   Are you assuming that the firmware won't respond with a command completion 
> until the LED blinking is complete?  If so, that's a bad assumption.  The 
> firmware runs as an asynchronous real-time OS.  The LED blinking simply becomes 
> a thread of activity within the OS and the command completes immediately.
> 
> Casey

Then how is LED blinking stopped?

* Re: [PATCH net-next] cxgb4: don't hold RTNL during ethtool phys_id
From: Ben Hutchings @ 2011-04-07  0:35 UTC (permalink / raw)
  To: Casey Leedom
  Cc: Stephen Hemminger, Dimitris Michailidis, David Miller, netdev
In-Reply-To: <201104061720.30219.leedom@chelsio.com>

On Wed, 2011-04-06 at 17:20 -0700, Casey Leedom wrote:
> | From: Stephen Hemminger <shemminger@linux-foundation.org>
> | Date: Wednesday, April 06, 2011 05:09 pm
> | 
> | The Chelsio cxgb4 drivers implement blinking in a unique way by
> | waiting on the mailbox. This patch cleans it up slightly by no longer
> | holding the system wide network configuration lock during the process.
> | 
> | The patch also uses correct semantics for the time argument
> | which is supposed to be in seconds; and zero is supposed
> | to signify infinite blinking.
> | 
> | This is still a bad firmware interface design for this
> | since it means the board is basically hung while doing the blink.
> | But fixing it correctly would require hardware and firmware
> | documentation. With that information the device could be converted
> | to the new set_phys_id.
> | 
> | Compile tested only.
> | 
> | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
>   Are you assuming that the firmware won't respond with a command completion 
> until the LED blinking is complete?  If so, that's a bad assumption.  The 
> firmware runs as an asynchronous real-time OS.  The LED blinking simply becomes 
> a thread of activity within the OS and the command completes immediately.
[...]

Stephen was assuming (as I did) that you actually implemented this
operation correctly.  You're supposed to blink the LED for the specified
time but let the user interrupt early.  If you just start the LED
blinking and then return, then it appears the user has no way to
interrupt it.

Is there a defined firmware command to stop blinking the LED?

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* Re: ipv6: Add support for RTA_PREFSRC
From: David Miller @ 2011-04-07  1:37 UTC (permalink / raw)
  To: dwalter; +Cc: netdev
In-Reply-To: <1301903804.31789.234.camel@localhost>

From: Daniel Walter <dwalter@barracuda.com>
Date: Mon, 4 Apr 2011 09:56:44 +0200

> On Fri, 2011-04-01 at 20:46 -0700, David Miller wrote:
>> You can't change the layout of "struct in6_rtmsg", as that structure
>> is explicitly exported to user space and changing it will break every
>> application out there.
> 
> Hi,
> 
> I've kicked support for setting the preferred source via ioctl,
> to keep "struct in6_rtmsg" untouched.
> This reduces the RTA_PREFSRC support to netlink only, unless
> we break the struct.
> 
> Do you see any other way around this problem?

This is fine, adding new feature support to deprecated things like
the ioctl routing calls is undesirable anyways.

Since you do the prefsrc extraction in at least two places, make a
helper function that does the whole "if prefsrc.plen use prefsrc, else
use ipv6_dev_get_saddr()"

This would be akin to ipv4's FIB_RES_PREFSRC
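
The shape of such a helper can be shown with a small user-space model. This
is only a sketch under stated assumptions: `struct addr6`, `struct prefsrc`,
`struct route6`, `pick_source_from_device()` and `route6_get_saddr()` are
hypothetical stand-ins, and the fallback function merely takes the place of
ipv6_dev_get_saddr() without reproducing its real signature or behaviour.

```c
#include <string.h>

struct addr6 {
	unsigned char b[16];
};

struct prefsrc {
	struct addr6 addr;
	int plen;		/* 0 means no preferred source is set */
};

struct route6 {
	struct prefsrc prefsrc;
	struct addr6 fallback;	/* what normal source selection would pick */
};

/* Stand-in for the regular source address selection. */
static int pick_source_from_device(const struct route6 *rt, struct addr6 *saddr)
{
	memcpy(saddr, &rt->fallback, sizeof(*saddr));
	return 0;
}

/* One helper for every caller: use prefsrc if set, else fall back. */
int route6_get_saddr(const struct route6 *rt, struct addr6 *saddr)
{
	if (rt->prefsrc.plen) {
		memcpy(saddr, &rt->prefsrc.addr, sizeof(*saddr));
		return 0;
	}
	return pick_source_from_device(rt, saddr);
}
```

Keeping the decision in one place means the callers cannot drift apart in
how they treat an unset prefsrc.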

* linux-next: manual merge of the net tree with the net-current tree
From: Stephen Rothwell @ 2011-04-07  1:39 UTC (permalink / raw)
  To: David Miller, netdev
  Cc: linux-next, linux-kernel, Sathya Perla, Ajit Khaparde

Hi all,

Today's linux-next merge of the net tree got a conflict in
drivers/net/benet/be_main.c between commit 2d5d41546504 ("be2net: Fix a
potential crash during shutdown.") from the net-current tree and commit
0f4a68288217 ("be2net: cancel be_worker in be_shutdown() even when i/f is
down") from the net tree.

I fixed it up (see below) and can carry the fix as necessary.
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

diff --cc drivers/net/benet/be_main.c
index 88d4c80,a24fb45..0000000
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@@ -3142,14 -3152,11 +3153,13 @@@ static int be_resume(struct pci_dev *pd
  static void be_shutdown(struct pci_dev *pdev)
  {
  	struct be_adapter *adapter = pci_get_drvdata(pdev);
 -	struct net_device *netdev =  adapter->netdev;
 +
 +	if (!adapter)
 +		return;
  
- 	if (netif_running(adapter->netdev))
- 		cancel_delayed_work_sync(&adapter->work);
+ 	cancel_delayed_work_sync(&adapter->work);
  
 -	netif_device_detach(netdev);
 +	netif_device_detach(adapter->netdev);
  
  	be_cmd_reset_function(adapter);
  

* [PATCH net-next 0/5] be2net: Patch series
From: Ajit Khaparde @ 2011-04-07  4:07 UTC (permalink / raw)
  To: netdev

Series of 5 patches. Please apply.

Thanks
-Ajit

[1/5] add rxhash support
[2/5] use common method to check for sriov function type
[3/5] fix to get max VFs supported from adapter
[4/5] dynamically allocate adapter->vf_cfg
[5/5] call FLR after setup wol in be_shutdown

* [PATCH net-next 2/5] be2net: use common method to check for sriov function type
From: Ajit Khaparde @ 2011-04-07  4:08 UTC (permalink / raw)
  To: netdev

Lancer and BE can both use SLI_INTF_REG to check whether a function is a VF or a PF.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
---
 drivers/net/benet/be.h |   12 ++----------
 1 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/net/benet/be.h b/drivers/net/benet/be.h
index 8941b98..eba405b 100644
--- a/drivers/net/benet/be.h
+++ b/drivers/net/benet/be.h
@@ -458,18 +458,10 @@ static inline u8 is_udp_pkt(struct sk_buff *skb)
 
 static inline void be_check_sriov_fn_type(struct be_adapter *adapter)
 {
-	u8 data;
 	u32 sli_intf;
 
-	if (lancer_chip(adapter)) {
-		pci_read_config_dword(adapter->pdev, SLI_INTF_REG_OFFSET,
-								&sli_intf);
-		adapter->is_virtfn = (sli_intf & SLI_INTF_FT_MASK) ? 1 : 0;
-	} else {
-		pci_write_config_byte(adapter->pdev, 0xFE, 0xAA);
-		pci_read_config_byte(adapter->pdev, 0xFE, &data);
-		adapter->is_virtfn = (data != 0xAA);
-	}
+	pci_read_config_dword(adapter->pdev, SLI_INTF_REG_OFFSET, &sli_intf);
+	adapter->is_virtfn = (sli_intf & SLI_INTF_FT_MASK) ? 1 : 0;
 }
 
 static inline void be_vf_eth_addr_generate(struct be_adapter *adapter, u8 *mac)
-- 
1.7.1


* [PATCH net-next 1/5] be2net: add rxhash support
From: Ajit Khaparde @ 2011-04-07  4:07 UTC (permalink / raw)
  To: netdev; +Cc: Eric Dumazet

Add rxhash support, based on initial work by Eric Dumazet.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
---
 drivers/net/benet/be.h         |    5 +++++
 drivers/net/benet/be_ethtool.c |   13 +++++++++++++
 drivers/net/benet/be_main.c    |   17 ++++++++++++-----
 3 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/drivers/net/benet/be.h b/drivers/net/benet/be.h
index 0899d91..8941b98 100644
--- a/drivers/net/benet/be.h
+++ b/drivers/net/benet/be.h
@@ -485,6 +485,11 @@ static inline void be_vf_eth_addr_generate(struct be_adapter *adapter, u8 *mac)
 	memcpy(mac, adapter->netdev->dev_addr, 3);
 }
 
+static inline bool be_multi_rxq(struct be_adapter *adapter)
+{
+	return (adapter->num_rx_qs > 1);
+}
+
 extern void be_cq_notify(struct be_adapter *adapter, u16 qid, bool arm,
 		u16 num_popped);
 extern void be_link_status_update(struct be_adapter *adapter, bool link_up);
diff --git a/drivers/net/benet/be_ethtool.c b/drivers/net/benet/be_ethtool.c
index a665697..1565c81 100644
--- a/drivers/net/benet/be_ethtool.c
+++ b/drivers/net/benet/be_ethtool.c
@@ -735,6 +735,18 @@ be_read_eeprom(struct net_device *netdev, struct ethtool_eeprom *eeprom,
 	return status;
 }
 
+static int be_set_flags(struct net_device *netdev, u32 data)
+{
+	struct be_adapter *adapter = netdev_priv(netdev);
+	int rc = -1;
+
+	if (be_multi_rxq(adapter))
+		rc = ethtool_op_set_flags(netdev, data, ETH_FLAG_RXHASH |
+				ETH_FLAG_TXVLAN | ETH_FLAG_RXVLAN);
+
+	return rc;
+}
+
 const struct ethtool_ops be_ethtool_ops = {
 	.get_settings = be_get_settings,
 	.get_drvinfo = be_get_drvinfo,
@@ -764,4 +776,5 @@ const struct ethtool_ops be_ethtool_ops = {
 	.get_regs = be_get_regs,
 	.flash_device = be_do_flash,
 	.self_test = be_self_test,
+	.set_flags = be_set_flags,
 };
diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index a24fb45..4910055 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -116,11 +116,6 @@ static char *ue_status_hi_desc[] = {
 	"Unknown"
 };
 
-static inline bool be_multi_rxq(struct be_adapter *adapter)
-{
-	return (adapter->num_rx_qs > 1);
-}
-
 static void be_queue_free(struct be_adapter *adapter, struct be_queue_info *q)
 {
 	struct be_dma_mem *mem = &q->dma_mem;
@@ -1012,6 +1007,9 @@ static void be_rx_compl_process(struct be_adapter *adapter,
 
 	skb->truesize = skb->len + sizeof(struct sk_buff);
 	skb->protocol = eth_type_trans(skb, adapter->netdev);
+	if (adapter->netdev->features & NETIF_F_RXHASH)
+		skb->rxhash = rxcp->rss_hash;
+
 
 	if (unlikely(rxcp->vlanf)) {
 		if (!adapter->vlan_grp || adapter->vlans_added == 0) {
@@ -1072,6 +1070,8 @@ static void be_rx_compl_process_gro(struct be_adapter *adapter,
 	skb->data_len = rxcp->pkt_size;
 	skb->truesize += rxcp->pkt_size;
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
+	if (adapter->netdev->features & NETIF_F_RXHASH)
+		skb->rxhash = rxcp->rss_hash;
 
 	if (likely(!rxcp->vlanf))
 		napi_gro_frags(&eq_obj->napi);
@@ -1101,6 +1101,8 @@ static void be_parse_rx_compl_v1(struct be_adapter *adapter,
 		AMAP_GET_BITS(struct amap_eth_rx_compl_v1, numfrags, compl);
 	rxcp->pkt_type =
 		AMAP_GET_BITS(struct amap_eth_rx_compl_v1, cast_enc, compl);
+	rxcp->rss_hash =
+		AMAP_GET_BITS(struct amap_eth_rx_compl_v1, rsshash, compl);
 	if (rxcp->vlanf) {
 		rxcp->vtm = AMAP_GET_BITS(struct amap_eth_rx_compl_v1, vtm,
 				compl);
@@ -1131,6 +1133,8 @@ static void be_parse_rx_compl_v0(struct be_adapter *adapter,
 		AMAP_GET_BITS(struct amap_eth_rx_compl_v0, numfrags, compl);
 	rxcp->pkt_type =
 		AMAP_GET_BITS(struct amap_eth_rx_compl_v0, cast_enc, compl);
+	rxcp->rss_hash =
+		AMAP_GET_BITS(struct amap_eth_rx_compl_v0, rsshash, compl);
 	if (rxcp->vlanf) {
 		rxcp->vtm = AMAP_GET_BITS(struct amap_eth_rx_compl_v0, vtm,
 				compl);
@@ -2614,6 +2618,9 @@ static void be_netdev_init(struct net_device *netdev)
 		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
 		NETIF_F_GRO | NETIF_F_TSO6;
 
+	if (be_multi_rxq(adapter))
+		netdev->features |= NETIF_F_RXHASH;
+
 	netdev->vlan_features |= NETIF_F_SG | NETIF_F_TSO |
 		NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
 
-- 
1.7.1


* [PATCH net-next 3/5] be2net: fix to get max VFs supported from adapter
From: Ajit Khaparde @ 2011-04-07  4:08 UTC (permalink / raw)
  To: netdev

The user supplied num_vfs value need not be compared
against a static BE_MAX_VF, but can be checked against
the actual VFs that the device can support.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
---
 drivers/net/benet/be_main.c |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index 4910055..9d6982a 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -1947,7 +1947,20 @@ static void be_sriov_enable(struct be_adapter *adapter)
 	be_check_sriov_fn_type(adapter);
 #ifdef CONFIG_PCI_IOV
 	if (be_physfn(adapter) && num_vfs) {
-		int status;
+		int status, pos;
+		u16 nvfs;
+
+		pos = pci_find_ext_capability(adapter->pdev,
+						PCI_EXT_CAP_ID_SRIOV);
+		pci_read_config_word(adapter->pdev,
+					pos + PCI_SRIOV_TOTAL_VF, &nvfs);
+
+		if (num_vfs > nvfs) {
+			dev_info(&adapter->pdev->dev,
+					"Device supports %d VFs and not %d\n",
+					nvfs, num_vfs);
+			num_vfs = nvfs;
+		}
 
 		status = pci_enable_sriov(adapter->pdev, num_vfs);
 		adapter->sriov_enabled = status ? false : true;
@@ -3281,13 +3294,6 @@ static int __init be_init_module(void)
 		rx_frag_size = 2048;
 	}
 
-	if (num_vfs > 32) {
-		printk(KERN_WARNING DRV_NAME
-			" : Module param num_vfs must not be greater than 32."
-			"Using 32\n");
-		num_vfs = 32;
-	}
-
 	return pci_register_driver(&be_driver);
 }
 module_init(be_init_module);
-- 
1.7.1
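
A minimal userspace sketch of the clamping step in this patch, for
illustration only. The helper name clamp_num_vfs() and its parameters are
hypothetical. In the driver, the limit comes from pci_find_ext_capability()
and pci_read_config_word() at PCI_SRIOV_TOTAL_VF. Treating a capability
offset of 0 (capability not found) as "no VFs" is an assumption of the
sketch; the hunk above does not check for that case.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the driver: clamp the requested VF
 * count to the TotalVFs value reported by the device.
 *
 * cap_pos models the return value of pci_find_ext_capability(), which is
 * 0 when the SR-IOV capability is absent. Returning 0 VFs in that case is
 * an assumption made for this sketch.
 */
static unsigned int clamp_num_vfs(unsigned int requested, int cap_pos,
				  uint16_t total_vfs)
{
	unsigned int granted;

	if (cap_pos == 0)
		return 0;

	granted = requested > total_vfs ? total_vfs : requested;
	assert(granted <= total_vfs);	/* never exceed the device limit */
	return granted;
}
```

For example, clamp_num_vfs(64, 0x160, 32) yields 32, while a request of 8
against the same limit is passed through unchanged.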



* [PATCH net-next 4/5] be2net: dynamically allocate adapter->vf_cfg
From: Ajit Khaparde @ 2011-04-07  4:08 UTC (permalink / raw)
  To: netdev

Instead of using a fixed-size array for vf_cfg, allocate it dynamically
based on the number of VFs the device supports.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
---
 drivers/net/benet/be.h      |    4 +---
 drivers/net/benet/be_main.c |   14 ++++++++++++--
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/net/benet/be.h b/drivers/net/benet/be.h
index eba405b..1d976b8 100644
--- a/drivers/net/benet/be.h
+++ b/drivers/net/benet/be.h
@@ -92,8 +92,6 @@ static inline char *nic_name(struct pci_dev *pdev)
 
 #define FW_VER_LEN		32
 
-#define BE_MAX_VF		32
-
 struct be_dma_mem {
 	void *va;
 	dma_addr_t dma;
@@ -336,7 +334,7 @@ struct be_adapter {
 
 	bool be3_native;
 	bool sriov_enabled;
-	struct be_vf_cfg vf_cfg[BE_MAX_VF];
+	struct be_vf_cfg *vf_cfg;
 	u8 is_virtfn;
 	u32 sli_family;
 	u8 hba_port_num;
diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index 9d6982a..6a43b26 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -2837,6 +2837,7 @@ static void __devexit be_remove(struct pci_dev *pdev)
 
 	be_ctrl_cleanup(adapter);
 
+	kfree(adapter->vf_cfg);
 	be_sriov_disable(adapter);
 
 	be_msix_disable(adapter);
@@ -3021,16 +3022,23 @@ static int __devinit be_probe(struct pci_dev *pdev,
 	}
 
 	be_sriov_enable(adapter);
+	if (adapter->sriov_enabled) {
+		adapter->vf_cfg = kcalloc(num_vfs,
+			sizeof(struct be_vf_cfg), GFP_KERNEL);
+
+		if (!adapter->vf_cfg)
+			goto free_netdev;
+	}
 
 	status = be_ctrl_init(adapter);
 	if (status)
-		goto free_netdev;
+		goto free_vf_cfg;
 
 	if (lancer_chip(adapter)) {
 		status = lancer_test_and_set_rdy_state(adapter);
 		if (status) {
 			dev_err(&pdev->dev, "Adapter in non recoverable error\n");
-			goto free_netdev;
+			goto ctrl_clean;
 		}
 	}
 
@@ -3092,6 +3100,8 @@ stats_clean:
 	be_stats_cleanup(adapter);
 ctrl_clean:
 	be_ctrl_cleanup(adapter);
+free_vf_cfg:
+	kfree(adapter->vf_cfg);
 free_netdev:
 	be_sriov_disable(adapter);
 	free_netdev(netdev);
-- 
1.7.1
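
The allocate-then-unwind pattern used in be_probe() can be sketched in
userspace as follows. This is an illustration under stated assumptions,
not the driver code: struct adapter, struct vf_cfg, adapter_setup(),
adapter_teardown() and the fail_ctrl switch are all hypothetical, and
calloc()/free() stand in for kcalloc()/kfree(). The sketch sets an error
code before jumping out on allocation failure, which the hunk above does
not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical per-VF state; the real struct be_vf_cfg is not shown here. */
struct vf_cfg {
	unsigned char mac[6];
	int if_handle;
};

struct adapter {
	struct vf_cfg *vf_cfg;	/* one entry per enabled VF, or NULL */
	unsigned int num_vfs;
};

/* File-scope instance so the example can be exercised without setup code. */
static struct adapter demo_adapter;

/*
 * Allocate per-VF state, then run a later init step. fail_ctrl simulates
 * that later step failing, so the unwind path can be observed.
 */
static int adapter_setup(struct adapter *a, unsigned int num_vfs, int fail_ctrl)
{
	int status;

	assert(a != NULL);
	a->vf_cfg = NULL;
	a->num_vfs = 0;

	if (num_vfs) {
		a->vf_cfg = calloc(num_vfs, sizeof(*a->vf_cfg));
		if (!a->vf_cfg) {
			status = -ENOMEM;	/* report the failure */
			goto out;
		}
		a->num_vfs = num_vfs;
	}

	status = fail_ctrl ? -EIO : 0;	/* stand-in for a later init step */
	if (status)
		goto free_vf_cfg;

	return 0;

	/* Unwind in reverse order of acquisition. */
free_vf_cfg:
	free(a->vf_cfg);
	a->vf_cfg = NULL;
	a->num_vfs = 0;
out:
	return status;
}

/* Counterpart of the remove path: release what setup acquired. */
static void adapter_teardown(struct adapter *a)
{
	free(a->vf_cfg);
	a->vf_cfg = NULL;
	a->num_vfs = 0;
}
```

Each label releases exactly what was acquired before the failing step, so
a failure at any point leaves no allocation behind.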



* [PATCH net-next 5/5] be2net: call FLR after setup wol in be_shutdown
From: Ajit Khaparde @ 2011-04-07  4:08 UTC (permalink / raw)
  To: netdev

Calling setup_wol after a reset is inconsequential.
The WOL setting should be programmed before FLR.
And yes, FLR does not erase wol information.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
---
 drivers/net/benet/be_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index 6a43b26..a7a2dec 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -3188,11 +3188,11 @@ static void be_shutdown(struct pci_dev *pdev)
 
 	netif_device_detach(netdev);
 
-	be_cmd_reset_function(adapter);
-
 	if (adapter->wol)
 		be_setup_wol(adapter, true);
 
+	be_cmd_reset_function(adapter);
+
 	pci_disable_device(pdev);
 }
 
-- 
1.7.1



* Re: problem of "ipv4: revert Set rt->rt_iif more sanely on output routes."
From: OGAWA Hirofumi @ 2011-04-07  4:31 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20110406.132829.116375350.davem@davemloft.net>

David Miller <davem@davemloft.net> writes:

> From: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
> Date: Tue, 05 Apr 2011 22:05:10 +0900
>
>> ipv4: Set rt->rt_iif more sanely on output routes.
>> (1018b5c01636c7c6bda31a719bda34fc631db29a)
>> 
>> The above patch seems to be the cause of the avahi breakage.
>> 
>> I have not fully debugged it, but avahi uses IP_PKTINFO and checks that
>> in_pktinfo->ipi_ifindex > 0.
>> 
>> If I revert the above patch, avahi's IP_PKTINFO problem seems to be fixed.
>
> in_pktinfo is given to the application only during recvmsg() calls, the
> call chain is (for example):
>
> udp_recvmsg()
> 	--> ip_cmsg_recv()
> 		--> ip_cmsg_recv_pktinfo()
>
> Therefore we will only be working with receive packets, whose routes are
> computed using ip_route_input*() which will fill in the rt_iif field
> appropriately.
>
> The only exception to this would be packets which are looped back, in
> which case the cached output route attached to the packet will be used.

I see.

> Your RFC patch should work, but we're trying to make "struct rtable"
> smaller rather than larger.

I gathered as much from the git history.

> In what way does routing break if you simply restore the original
> rt_iif assignment in output route creation?  That's the most preferred
> fix for this.

I'm not entirely sure, but the output message is

	ip_finish_output2: No header cache and no neighbour!

I haven't debugged this, but from review I guess that

static inline bool rt_is_output_route(struct rtable *rt)
{
	return rt->rt_iif == 0;
}

is one of the causes.

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
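
For readers unfamiliar with IP_PKTINFO, here is a minimal userspace sketch
of the receive-side check discussed in this thread. It is a simplified
stand-in, not avahi's code: the function name loopback_pktinfo_ifindex()
is hypothetical, and it sends a unicast UDP datagram to 127.0.0.1 rather
than using multicast. It returns the ipi_ifindex reported for a looped-back
packet, or -1 if any step fails. A result of 0 is the value that a check
like "ipi_ifindex > 0" would reject.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send one UDP datagram to ourselves over loopback and return the
 * ipi_ifindex delivered via the IP_PKTINFO control message.
 * Returns -1 on any socket error or if no IP_PKTINFO cmsg arrives.
 */
static int loopback_pktinfo_ifindex(void)
{
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	char payload[1] = { 'x' };
	char data[16];
	union {
		char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
		struct cmsghdr align;	/* forces suitable alignment */
	} ctrl;
	struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
	struct timeval tv = { .tv_sec = 1, .tv_usec = 0 };
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int on = 1, ifindex = -1;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a free port */

	/* Ask for pktinfo, bound the wait, bind, learn the port, send. */
	if (setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on)) < 0 ||
	    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0 ||
	    bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &alen) < 0 ||
	    sendto(fd, payload, sizeof(payload), 0,
		   (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	assert(alen == sizeof(addr));

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(fd, &msg, 0) < 0)
		goto out;

	/* Walk the control messages looking for IP_PKTINFO. */
	for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
		if (cmsg->cmsg_level == IPPROTO_IP &&
		    cmsg->cmsg_type == IP_PKTINFO) {
			struct in_pktinfo pi;

			memcpy(&pi, CMSG_DATA(cmsg), sizeof(pi));
			ifindex = pi.ipi_ifindex;
			break;
		}
	}
out:
	close(fd);
	return ifindex;
}
```

Because the datagram never leaves the host, it exercises the looped-back
case described above, where the route attached to the packet is the cached
output route rather than one computed by ip_route_input*().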


* [PATCH net-next] net: fix skb_add_data_nocache() to calc csum correctly
From: Wei Yongjun @ 2011-04-07  4:40 UTC (permalink / raw)
  To: David Miller, Tom Herbert, netdev@vger.kernel.org

Commit c6e1a0d12ca7b4f22c58e55a16beacfb7d3d8462
 (net: Allow no-cache copy from user on transmit)
broke the checksum calculation, which may cause some TCP packets to be
dropped because of an incorrect checksum. ssh does not work under today's
net-next-2.6 tree.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
---
 include/net/sock.h |   15 ++++++++-------
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 43bd515..9cbf23c 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1392,14 +1392,14 @@ static inline void sk_nocaps_add(struct sock *sk, int flags)
 
 static inline int skb_do_copy_data_nocache(struct sock *sk, struct sk_buff *skb,
 					   char __user *from, char *to,
-					   int copy)
+					   int copy, int offset)
 {
 	if (skb->ip_summed == CHECKSUM_NONE) {
 		int err = 0;
 		__wsum csum = csum_and_copy_from_user(from, to, copy, 0, &err);
 		if (err)
 			return err;
-		skb->csum = csum_block_add(skb->csum, csum, skb->len);
+		skb->csum = csum_block_add(skb->csum, csum, offset);
 	} else if (sk->sk_route_caps & NETIF_F_NOCACHE_COPY) {
 		if (!access_ok(VERIFY_READ, from, copy) ||
 		    __copy_from_user_nocache(to, from, copy))
@@ -1413,11 +1413,12 @@ static inline int skb_do_copy_data_nocache(struct sock *sk, struct sk_buff *skb,
 static inline int skb_add_data_nocache(struct sock *sk, struct sk_buff *skb,
 				       char __user *from, int copy)
 {
-	int err;
+	int err, offset = skb->len;
 
-	err = skb_do_copy_data_nocache(sk, skb, from, skb_put(skb, copy), copy);
+	err = skb_do_copy_data_nocache(sk, skb, from, skb_put(skb, copy),
+				       copy, offset);
 	if (err)
-		__skb_trim(skb, skb->len);
+		__skb_trim(skb, offset);
 
 	return err;
 }
@@ -1429,8 +1430,8 @@ static inline int skb_copy_to_page_nocache(struct sock *sk, char __user *from,
 {
 	int err;
 
-	err = skb_do_copy_data_nocache(sk, skb, from,
-				       page_address(page) + off, copy);
+	err = skb_do_copy_data_nocache(sk, skb, from, page_address(page) + off,
+				       copy, skb->len);
 	if (err)
 		return err;
 
-- 
1.6.5.2
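
Why the offset matters: in skb_add_data_nocache(), skb_put() runs before
skb_do_copy_data_nocache() reads skb->len, so skb->len already includes
the newly added bytes. When the copy length is odd, the parity of the
offset passed to csum_block_add() is therefore wrong. The userspace sketch
below illustrates this with a 16-bit one's-complement sum. It is a
simplified model, not the kernel implementation: csum_fold(), csum_buf(),
csum_block_add16() and split_matches_whole() are hypothetical names, and
the sums are kept folded to 16 bits for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a wide one's-complement accumulator down to 16 bits. */
static uint16_t csum_fold(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* One's-complement sum of a buffer; even-indexed bytes are the high byte. */
static uint16_t csum_buf(const uint8_t *p, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum += (i & 1) ? p[i] : (uint64_t)p[i] << 8;
	return csum_fold(sum);
}

/*
 * Combine the sum of a block that starts at byte `offset` of the packet
 * with the sum of everything before it. A block starting at an odd offset
 * has its bytes in the opposite lanes, so its sum is byte-swapped first.
 */
static uint16_t csum_block_add16(uint16_t csum, uint16_t csum2, size_t offset)
{
	if (offset & 1)
		csum2 = (uint16_t)((csum2 << 8) | (csum2 >> 8));
	return csum_fold((uint64_t)csum + csum2);
}

/*
 * Split a 7-byte sample at old_len, checksum both parts separately, and
 * combine them using offset_used. Returns 1 if the result matches the
 * checksum computed over the whole buffer in one pass, 0 otherwise.
 */
static int split_matches_whole(size_t old_len, size_t offset_used)
{
	static const uint8_t data[7] = { 1, 2, 3, 4, 5, 6, 7 };

	assert(old_len <= sizeof(data));

	uint16_t head = csum_buf(data, old_len);
	uint16_t tail = csum_buf(data + old_len, sizeof(data) - old_len);

	return csum_block_add16(head, tail, offset_used) ==
	       csum_buf(data, sizeof(data));
}
```

With 4 bytes already in the buffer and 3 bytes appended,
split_matches_whole(4, 4) uses the length before the append and matches
the one-pass checksum, while split_matches_whole(4, 7) uses the length
after the append, as the code did before this patch, and does not.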




* Re: [PATCH net-next] net: fix skb_add_data_nocache() to calc csum correctly
From: Tom Herbert @ 2011-04-07  4:50 UTC (permalink / raw)
  To: Wei Yongjun; +Cc: David Miller, netdev@vger.kernel.org
In-Reply-To: <4D9D402C.60107@cn.fujitsu.com>

Nice catch.

Acked-by: Tom Herbert <therbert@google.com>

On Wed, Apr 6, 2011 at 9:40 PM, Wei Yongjun <yjwei@cn.fujitsu.com> wrote:
> Commit c6e1a0d12ca7b4f22c58e55a16beacfb7d3d8462
>  (net: Allow no-cache copy from user on transmit)
> broke the checksum calculation, which may cause some TCP packets to be
> dropped because of an incorrect checksum. ssh does not work under today's
> net-next-2.6 tree.
>
> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
> ---
>  include/net/sock.h |   15 ++++++++-------
>  1 files changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 43bd515..9cbf23c 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1392,14 +1392,14 @@ static inline void sk_nocaps_add(struct sock *sk, int flags)
>
>  static inline int skb_do_copy_data_nocache(struct sock *sk, struct sk_buff *skb,
>                                           char __user *from, char *to,
> -                                          int copy)
> +                                          int copy, int offset)
>  {
>        if (skb->ip_summed == CHECKSUM_NONE) {
>                int err = 0;
>                __wsum csum = csum_and_copy_from_user(from, to, copy, 0, &err);
>                if (err)
>                        return err;
> -               skb->csum = csum_block_add(skb->csum, csum, skb->len);
> +               skb->csum = csum_block_add(skb->csum, csum, offset);
>        } else if (sk->sk_route_caps & NETIF_F_NOCACHE_COPY) {
>                if (!access_ok(VERIFY_READ, from, copy) ||
>                    __copy_from_user_nocache(to, from, copy))
> @@ -1413,11 +1413,12 @@ static inline int skb_do_copy_data_nocache(struct sock *sk, struct sk_buff *skb,
>  static inline int skb_add_data_nocache(struct sock *sk, struct sk_buff *skb,
>                                       char __user *from, int copy)
>  {
> -       int err;
> +       int err, offset = skb->len;
>
> -       err = skb_do_copy_data_nocache(sk, skb, from, skb_put(skb, copy), copy);
> +       err = skb_do_copy_data_nocache(sk, skb, from, skb_put(skb, copy),
> +                                      copy, offset);
>        if (err)
> -               __skb_trim(skb, skb->len);
> +               __skb_trim(skb, offset);
>
>        return err;
>  }
> @@ -1429,8 +1430,8 @@ static inline int skb_copy_to_page_nocache(struct sock *sk, char __user *from,
>  {
>        int err;
>
> -       err = skb_do_copy_data_nocache(sk, skb, from,
> -                                      page_address(page) + off, copy);
> +       err = skb_do_copy_data_nocache(sk, skb, from, page_address(page) + off,
> +                                      copy, skb->len);
>        if (err)
>                return err;
>
> --
> 1.6.5.2
>
>
>


* RE: Question on "net: allocate skbs on local node"
From: Eric Dumazet @ 2011-04-07  4:58 UTC (permalink / raw)
  To: Wei Gu; +Cc: netdev, Alexander Duyck, Jeff Kirsher
In-Reply-To: <D12839161ADD3A4B8DA63D1A134D084026E48B9BEB@ESGSCCMS0001.eapac.ericsson.se>

On Thursday, 07 April 2011 at 10:16 +0800, Wei Gu wrote:
> Hi Eric,
> Testing with the ixgbe Linux 2.6.38 driver:
> We get slightly better throughput with this driver, but it does not
> seem to scale at all; one CPU core (24) is always stressed.
> In the perf report for ksoftirqd/24, the most expensive function is
> still "_raw_spin_unlock_irqrestore", and the IRQ rate is huge, which
> somehow conflicts with the design of NAPI. On Linux 2.6.32, when the CPU
> was stressed, the IRQ rate decreased because NAPI ran mostly in polling
> mode. I don't know why the IRQ rate keeps increasing on 2.6.38.


CC netdev and Intel guys, since they said it should not happen (TM)

If you don't use DCA (make sure the ioatdma module is not loaded), how come
alloc_iova() is called at all?

If you use DCA, how come it's called, since the same CPU serves a given
interrupt?



>  
> CONFIG_TICK_ONESHOT=y
> CONFIG_NO_HZ=y
>  
> PerfTop:  512417 irqs/sec  kernel:91.3%  exact:  0.0% [1000Hz cpu-clock-msecs],  (all, 64 CPUs)
> ------------------------------------------------------------------------------
> -      0.82%     ksoftirqd/24  [kernel.kallsyms]          [k] _raw_spin_unlock_irqrestore
>     - _raw_spin_unlock_irqrestore
>        - 44.27% alloc_iova
>             intel_alloc_iova
>             __intel_map_single
>             intel_map_page
>           - ixgbe_init_interrupt_scheme
>              - 59.97% ixgbe_alloc_rx_buffers
>                   ixgbe_clean_rx_irq
>                   0xffffffffa033a5
>                   net_rx_action
>                   __do_softirq
>                 + call_softirq
>              - 40.03% ixgbe_change_mtu
>                   ixgbe_change_mtu
>                   dev_hard_start_xmit
>                   sch_direct_xmit
>                   dev_queue_xmit
>                   vlan_dev_hard_start_xmit
>                   hook_func
>                   nf_iterate
>                   nf_hook_slow
>                   NF_HOOK.clone.1
>                   ip_rcv
>                   __netif_receive_skb
>                   __netif_receive_skb
>                   netif_receive_skb
>                   napi_skb_finish
>                   napi_gro_receive
>                   ixgbe_clean_rx_irq
>                   0xffffffffa033a5
>                   net_rx_action
>                   __do_softirq
>                 + call_softirq
>        + 35.85% find_iova
>        + 19.44% add_unmap
>  
>  
> Thanks
> WeiGu
>  



