Netdev List


* Re: [PATCH] bonding: Don't allow mode change via sysfs with slaves present
From: Andy Gospodarek @ 2011-11-15 20:47 UTC (permalink / raw)
  To: Nicolas de Pesloüan
  Cc: Andy Gospodarek, Veaceslav Falico, netdev, Jay Vosburgh
In-Reply-To: <4EC2C550.6050805@gmail.com>

On Tue, Nov 15, 2011 at 09:02:24PM +0100, Nicolas de Pesloüan wrote:
> On 15/11/2011 20:35, Andy Gospodarek wrote:
>> On Tue, Nov 15, 2011 at 08:24:29PM +0100, Nicolas de Pesloüan wrote:
>>> On 15/11/2011 18:00, Andy Gospodarek wrote:
>>>> On Tue, Nov 15, 2011 at 05:44:42PM +0100, Veaceslav Falico wrote:
>>>>> When changing mode via bonding's sysfs, the slaves are not initialized
>>>>> correctly. Forbid mode changes with slaves present to ensure that every
>>>>> slave is initialized correctly via bond_enslave().
>>>>>
>>>>> Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
>>>>
>>>> Looks good.  This behavior forces someone who wants to change the mode to
>>>> go through steps that are almost as destructive as when module options
>>>> are used to configure the mode.  I do not see a problem with this.
>>>
>>> Except for the fact that it enforces one more constraint on the exact order
>>> in which one should write into sysfs to set up a bonding interface. We
>>> already have many such constraints and probably don't need more.
>>>
>>> Currently, it is possible to enslave slaves before selecting the mode.
>>> The ifenslave-2.6 package from Debian currently enslaves slaves before
>>> setting the mode and would break with this change.
>>>
>>
>> Our testing indicates that 802.3ad mode bonding will not work unless the
>> devices are enslaved after the mode is set.  Does this mean that no one
>> using Debian is using 802.3ad mode or are they just not reporting it?
>
> I don't know. Possibly, they set up the bonding by hand, instead of 
> relying on the bonding extensions of /etc/network/interfaces provided by 
> the ifenslave-2.6 package.
>
> Having a look at popularity for the package 
> (http://qa.debian.org/popcon.php?package=ifenslave-2.6), it is obviously 
> not the most popular one, but...
>

Nicolas,

I took a closer look at the ifenslave package for Debian and it
actually looks like devices are enslaved last, after the mode is set.
Can you please take a look at this package and confirm what I'm seeing
in the 'pre-up' script?

It appears to me that setup_master sets the mode and enslave_slaves is
called after and enslaves the devices:

# Option slaves deprecated, replaced by bond-slaves, but still supported
# for backward compatibility.
IF_BOND_SLAVES=${IF_BOND_SLAVES:-$IF_SLAVES}

if [ "$IF_BOND_MASTER" ] ; then
        BOND_MASTER="$IF_BOND_MASTER"
        BOND_SLAVES="$IFACE"
else
        if [ "$IF_BOND_SLAVES" ] ; then
                BOND_MASTER="$IFACE"
                BOND_SLAVES="$IF_BOND_SLAVES"
        fi
fi

# Exit if nothing to do...
[ -z "$BOND_MASTER$BOND_SLAVES" ] && exit

add_master
early_setup_master
setup_master
enslave_slaves
exit 0

-andy

^ permalink raw reply

* Re: [PATCH] bonding: Don't allow mode change via sysfs with slaves present
From: Veaceslav Falico @ 2011-11-15 21:04 UTC (permalink / raw)
  To: Nicolas de Pesloüan; +Cc: Andy Gospodarek, netdev, Jay Vosburgh
In-Reply-To: <4EC2BC6D.9000304@gmail.com>

On Tue, Nov 15, 2011 at 08:24:29PM +0100, Nicolas de Pesloüan wrote:
> On 15/11/2011 18:00, Andy Gospodarek wrote:
> >On Tue, Nov 15, 2011 at 05:44:42PM +0100, Veaceslav Falico wrote:
> >>When changing mode via bonding's sysfs, the slaves are not initialized
> >>correctly. Forbid mode changes with slaves present to ensure that every
> >>slave is initialized correctly via bond_enslave().
> >>
> >>Signed-off-by: Veaceslav Falico<vfalico@redhat.com>
> >
> >Looks good.  This behavior forces someone who wants to change the mode to
> >go through steps that are almost as destructive as when module options
> >are used to configure the mode.  I do not see a problem with this.
> 
> Except for the fact that it enforces one more constraint on the exact
> order in which one should write into sysfs to set up a bonding interface.
> We already have many such constraints and probably don't need more.
> 
> Currently, it is possible to enslave slaves before selecting the
> mode. The ifenslave-2.6 package from Debian currently enslaves slaves
> before setting the mode and would break with this change.

Yes, it's possible. However, the enslaved interfaces are initialized with
the current mode parameters, and when the mode is changed they aren't
reinitialized at all. A lot of mode-specific initialization is present
only in bond_enslave(); here are just some of the (most obvious)
snippets:

ALB/TLB balancing:
<snip>
        if (bond_is_lb(bond)) {
                /* bond_alb_init_slave() must be called before all other
                 * stages since it might fail and we do not want to have
                 * to undo everything
                 */
                res = bond_alb_init_slave(bond, new_slave);
                if (res)
                        goto err_close;
        }    
</snip>

bond_alb_init_slave() is called only in this case. This means that the MAC
address won't be changed at all, and some other state won't be properly
updated either.

802.3ad:
<snip>
        if (bond->params.mode == BOND_MODE_8023AD) {
                /* add lacpdu mc addr to mc list */
                u8 lacpdu_multicast[ETH_ALEN] = MULTICAST_LACPDU_ADDR;

                dev_mc_add(slave_dev, lacpdu_multicast);
        }
</snip>

This means that the slave device will just drop all the LACPDUs.

So this means that at least two modes won't work if you first load the
bonding module with the default mode and then change it with slaves
attached. And I'm *really* sceptical about the other modes.

So even if the kernel doesn't show any error, it still doesn't work as
expected. To *really* fix this bug without adding any constraints would
require quite a few lines of code, and before it is fixed this patch is the
best way to avoid it.
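
For illustration only, here is a minimal userspace sketch of the kind of
check described above. The patch body is not quoted in this thread, so
struct model_bond, model_store_mode() and the -EPERM return value are all
assumptions made for this example, not the actual bonding sysfs code.

```c
#include <errno.h>

/* Hypothetical stand-in for the bond state; the real struct bonding is
 * far larger. Only the two fields needed for the check are modelled.
 */
struct model_bond {
	int mode;	/* currently selected bonding mode */
	int slave_cnt;	/* number of enslaved devices */
};

/* Model of a sysfs "mode" store handler that refuses to change the mode
 * while slaves are attached, so that every slave is always initialized
 * for the mode it will run in. The error code is an assumption.
 */
static int model_store_mode(struct model_bond *bond, int new_mode)
{
	if (bond->slave_cnt > 0)
		return -EPERM;	/* release the slaves first */

	bond->mode = new_mode;
	return 0;
}
```

With such a check, the only accepted sequence is: set the mode, then
enslave, which is the order in which bond_enslave() can apply the
mode-specific setup shown in the snippets.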

^ permalink raw reply

* Re: Unable to flush ICMP redirect routes in kernel 3.0+
From: Eric Dumazet @ 2011-11-15 21:09 UTC (permalink / raw)
  To: Ivan Zahariev; +Cc: netdev
In-Reply-To: <4EC2CA52.6020104@icdsoft.com>

On Tuesday 15 November 2011 at 22:23 +0200, Ivan Zahariev wrote:
> Hello,
> 
> We have changed nothing in our network infrastructure but only upgraded 
> from Linux kernel 2.6.36.2 to 3.0.3. Here is the problem we are 
> experiencing:
> 
> ICMP redirected routes are cached forever, and they can be cleared only 
> by a reboot.
> 
> Here is an example:
> 
> root@machine5:~# ip route get 1.1.1.1
> 1.1.1.1 via 9.0.0.1 dev eth0  src 5.5.5.5
>      cache <redirected>  ipid 0xfb5d rtt 1475ms rttvar 450ms cwnd 10
> 
> root@machine5:~# ip route list cache match 1.1.1.1
> 1.1.1.1 tos lowdelay via 9.0.0.1 dev eth0  src 5.5.5.5
>      cache <redirected>  ipid 0xfb5d rtt 1475ms rttvar 450ms cwnd 10
> 1.1.1.1 via 9.0.0.1 dev eth0  src 5.5.5.5
>      cache <redirected>  ipid 0xfb5d rtt 1475ms rttvar 450ms cwnd 10
> ...(two more entries, all go via 9.0.0.1)...
> 
> 1.1.1.1 is the test destination address
> 5.5.5.5 is the source IP address of "machine5" via dev eth0, the only 
> interface besides "lo"
> 9.0.0.1 is the incorrect gateway which we were redirected to; we want to 
> change the route to 9.0.0.8
> 
> I found no way to clear this route. What I tried:
> 
> root@machine5:~# ip route flush cache ### CACHE FLUSH ###
> root@machine5:~# ip route list cache match 1.1.1.1 # empty
> 
> root@machine5:~# ip route flush cache ### CACHE FLUSH ###
> root@machine5:~# echo 1 > /proc/sys/net/ipv4/route/flush
> root@machine5:~# ip route list cache match 1.1.1.1 # empty
> 
> root@machine5:~# ip route get 1.1.1.1 # magically re-inserts the 
> <redirected> route, tcpdump sees NO ICMP traffic
> 1.1.1.1 via 9.0.0.1 dev eth0  src 5.5.5.5
>      cache <redirected>  ipid 0xfb5d rtt 1475ms rttvar 450ms cwnd 10
> 
> I also tried to force a scheduled route flush:
> 
> root@machine5:~# echo 1 > /proc/sys/net/ipv4/route/gc_timeout
> root@machine5:~# echo 1 > /proc/sys/net/ipv4/route/gc_interval
> 
> A reboot fixed it all.
> 
> This may be related to the "Several major changes to our routing 
> infrastructure" (https://lkml.org/lkml/2011/3/16/384).
> Other users are reporting the same problem:
> * https://plus.google.com/u/0/117161704068825702652/posts/1UK1Rp4KA4J
> * http://lists.debian.org/debian-kernel/2011/10/msg00633.html
> Other similar issues:
> * http://www.spinics.net/lists/netdev/msg176966.html
> * http://forums.gentoo.org/viewtopic-t-901024-start-0.html
> 
> This has been occurring on a few KVM guest machines and also on a 
> regular Linux machine, so it's not KVM related.
> 
> Is this a bug, or it's me who's missing something?
> 

It is a bug, so could you provide the information we need to
reproduce it?

What is your network setup?

^ permalink raw reply

* Re: [PATCH 5/5] net-next:asix: update VERSION and white space changes
From: David Miller @ 2011-11-15 21:41 UTC (permalink / raw)
  To: kernel; +Cc: grundler, netdev, linux-kernel, allan, freddy
In-Reply-To: <4EC2831F.9070907@teksavvy.com>

From: Mark Lord <kernel@teksavvy.com>
Date: Tue, 15 Nov 2011 10:19:59 -0500

> On 11-11-14 09:45 PM, David Miller wrote:
>> From: David Miller <davem@davemloft.net>
>> Date: Mon, 14 Nov 2011 21:41:51 -0500 (EST)
>> 
>>> Come on man... are you kidding me?
>> 
>> Want to know what really pisses me off about this?
>> 
>> All of Mark Lord's hard work to bring the entire vendor driver over
>> was thrown out.
> 
> 
> Well, ASIX and I appear to be back on track again.

That's great news.

^ permalink raw reply

* Re: [PATCH 1/5] net-next:asix:PHY_MODE_RTL8211CL should be 0xC
From: David Miller @ 2011-11-15 21:41 UTC (permalink / raw)
  To: grundler; +Cc: netdev, linux-kernel, allan, freddy, grundler
In-Reply-To: <1321377163-26308-1-git-send-email-grundler@chromium.org>

From: Grant Grundler <grundler@chromium.org>
Date: Tue, 15 Nov 2011 09:12:39 -0800

> From: Grant Grundler <grundler@google.com>
> 
> Use correct value for rtl phy support.
> (rtl phy are in AX88178 devices like NWU220G and USB2-ET1000).
> 
> Signed-off-by: Allan Chou <allan@asix.com.tw>
> Tested-by: Grant Grundler <grundler@chromium.org>

Applied.

^ permalink raw reply

* Re: [PATCH 2/5] net-next:asix:poll in asix_get_phyid in case phy not ready
From: David Miller @ 2011-11-15 21:41 UTC (permalink / raw)
  To: grundler; +Cc: netdev, linux-kernel, allan, freddy, grundler
In-Reply-To: <1321377163-26308-2-git-send-email-grundler@chromium.org>

From: Grant Grundler <grundler@chromium.org>
Date: Tue, 15 Nov 2011 09:12:40 -0800

> From: Grant Grundler <grundler@google.com>
> 
> Sometimes the phy isn't ready after reset...poll and pray it will be soon.
> 
> Signed-off-by: Freddy Xin <freddy@asix.com.tw>
> Signed-off-by: Grant Grundler <grundler@chromium.org>

Applied.

^ permalink raw reply

* Re: [PATCH 3/5] net-next:asix: reduce AX88772 init time by about 2 seconds
From: David Miller @ 2011-11-15 21:41 UTC (permalink / raw)
  To: grundler; +Cc: netdev, linux-kernel, allan, freddy, grundler
In-Reply-To: <1321377163-26308-3-git-send-email-grundler@chromium.org>

From: Grant Grundler <grundler@chromium.org>
Date: Tue, 15 Nov 2011 09:12:41 -0800

> From: Grant Grundler <grundler@google.com>
> 
> ax88772_reset takes about 2 seconds and is called twice.
> Once from ax88772_bind() directly and again indirectly from usbnet_open().
> Reset the USB FW/Phy enough to blink the LEDs when inserted.
> 
> Signed-off-by: Allan Chou <allan@asix.com.tw>
> Signed-off-by: Grant Grundler <grundler@chromium.org>

Applied.

^ permalink raw reply

* Re: [PATCH 4/5] net-next:asix: V2 more fixes for ax88178 phy init sequence
From: David Miller @ 2011-11-15 21:42 UTC (permalink / raw)
  To: grundler; +Cc: netdev, linux-kernel, allan, freddy, grundler
In-Reply-To: <1321377163-26308-4-git-send-email-grundler@chromium.org>

From: Grant Grundler <grundler@chromium.org>
Date: Tue, 15 Nov 2011 09:12:42 -0800

> From: Grant Grundler <grundler@google.com>
> 
> Now works on Samsung Series 5 (chromebook)
> 
> Two fixes here:
> o use 0x7F mask for phymode
> o read phyid *AFTER* phy is powered up (via GPIOs)
> 
> Signed-off-by: Allan Chou <allan@asix.com.tw>
> Signed-off-by: Grant Grundler <grundler@chromium.org>

Applied.

^ permalink raw reply

* Re: [PATCH 5/5] net-next:asix: V2 Update VERSION
From: David Miller @ 2011-11-15 21:42 UTC (permalink / raw)
  To: grundler; +Cc: netdev, linux-kernel, allan, freddy, grundler
In-Reply-To: <1321377163-26308-5-git-send-email-grundler@chromium.org>

From: Grant Grundler <grundler@chromium.org>
Date: Tue, 15 Nov 2011 09:12:43 -0800

> From: Grant Grundler <grundler@google.com>
> 
> Only update VERSION to reflect previous changes.
> 
> Signed-off-by: Grant Grundler <grundler@chromium.org>

Applied.

^ permalink raw reply

* Re: [PATCH] ipv4: return NET_RX_DROP when arp_rcv drops the received packet.
From: David Miller @ 2011-11-15 21:47 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1321322987-16042-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Tue, 15 Nov 2011 10:09:47 +0800

> From: RongQing.Li <roy.qing.li@gmail.com>
> 
> return NET_RX_DROP when arp_rcv drops the received packet.
> 
> Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>

This is not appropriate.

NET_RX_DROP means that the packet was dropped because something about
the packet's contents was not acceptable, or the packet violated our
policies so was dropped.

In this arp_rcv() case, we would have accepted the packet, but we had
a memory allocation error.  This memory allocation error has nothing
to do with the packet's contents, and is a transient issue.

^ permalink raw reply

* Re: [PATCH 1/3] net/mlx4_en: allow setting number of rx rings for RSS/TCP
From: David Miller @ 2011-11-15 21:49 UTC (permalink / raw)
  To: amirv; +Cc: netdev, oren, yevgenyp, ogerlitz, amirv
In-Reply-To: <1321348446-29605-2-git-send-email-amirv@mellanox.com>

From: Amir Vadai <amirv@dev.mellanox.co.il>
Date: Tue, 15 Nov 2011 11:14:04 +0200

> +	if (priv->mdev->profile.tcp_rss == -1 ||
> +			priv->mdev->profile.tcp_rss > priv->rx_ring_num)

Please format your code properly:

	if (priv->mdev->profile.tcp_rss == -1 ||
	    priv->mdev->profile.tcp_rss > priv->rx_ring_num)

^ permalink raw reply

* Re: [PATCH] mlx4_en: using non collapsed CQ on TX
From: David Miller @ 2011-11-15 21:52 UTC (permalink / raw)
  To: yevgenyp; +Cc: netdev
In-Reply-To: <4EC23E94.2030607@mellanox.co.il>

From: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Date: Tue, 15 Nov 2011 12:27:32 +0200

> Moving to regular Completion Queue implementation (not collapsed)
> Completion for each transmitted packet is written to new entry.
> 
> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>

Can the people maintaining the Mellanox driver please coordinate
your efforts?

I should not be seeing multiple engineers submit separate patches
on the same exact day to the mlx4 driver.

One person should be in charge for submitting all pending patches,
adding "From: " lines to the body of the commit message (as needed) to
indicate authorship properly.

^ permalink raw reply

* Re: [PATCH net-next] IPv6: Removing unnecessary NULL checks introduced in 4a287eba2de395713d8b2b2aeaa69fa086832d34.
From: David Miller @ 2011-11-15 21:55 UTC (permalink / raw)
  To: matti.vaittinen; +Cc: netdev
In-Reply-To: <1321354739.1858.72.camel@hakki>

From: Matti Vaittinen <matti.vaittinen@nsn.com>
Date: Tue, 15 Nov 2011 12:58:59 +0200

> This patch removes unnecessary NULL checks noticed by Dan Carpenter.
> Checks were introduced in commit
> 4a287eba2de395713d8b2b2aeaa69fa086832d34 to net-next.
> 
> Signed-off-by: Matti Vaittinen <Mazziesaccount@gmail.com>

Applied.

I took the commit ID reference out of the subject line; putting one
there is generally considered bad form.

^ permalink raw reply

* Re: [PATCH Kernel-3.1.0] mdio-gpio: Add reset functionality to mdio-gpio driver.
From: David Miller @ 2011-11-15 21:56 UTC (permalink / raw)
  To: srinivas.kandagatla; +Cc: netdev, stuart.menefy
In-Reply-To: <4EC25608.8070308@st.com>

From: Srinivas KANDAGATLA <srinivas.kandagatla@st.com>
Date: Tue, 15 Nov 2011 12:07:36 +0000

> Subject: [PATCH Kernel-3.1.0] mdio-gpio: Add reset functionality to mdio-gpio driver(v2).
> 
> This patch adds phy reset functionality to the mdio-gpio driver.
> mdio_gpio_platform_data now has a new function-pointer member which can
> be filled in at the bsp level for a callback from the phy infrastructure.
> The mdio-bitbang driver also fills in the reset function of the mii_bus
> structure.
> 
> Without this patch the bsp level code has to take care of resetting the
> PHYs on the bus, which becomes a bit hacky for every bsp, and the
> phy infrastructure is ignored as well.
> 
> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>

Applied.

^ permalink raw reply

* [PATCH] Add ethtool to mii advertisement conversion helpers
From: Matt Carlson @ 2011-11-15 22:00 UTC (permalink / raw)
  To: davem; +Cc: netdev, mcarlson, Michael Chan

Translating between ethtool advertisement settings and MII
advertisements is a common operation for ethernet drivers.  This patch
adds a set of helper functions that implement the conversion.  The
patch then modifies a couple of the drivers to use the new functions.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2.c |   15 +---
 drivers/net/ethernet/broadcom/tg3.c  |   53 +++--------
 drivers/net/ethernet/sun/niu.c       |   15 +---
 drivers/net/mii.c                    |   48 ++--------
 drivers/net/phy/phy_device.c         |   20 +----
 include/linux/mii.h                  |  165 ++++++++++++++++++++++++++++++++++
 6 files changed, 196 insertions(+), 120 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index 32d1f92..e82b981 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -2064,21 +2064,12 @@ __acquires(&bp->phy_lock)
 		bnx2_read_phy(bp, MII_CTRL1000, &adv1000_reg);
 		adv1000_reg &= PHY_ALL_1000_SPEED;
 
-		if (bp->advertising & ADVERTISED_10baseT_Half)
-			new_adv_reg |= ADVERTISE_10HALF;
-		if (bp->advertising & ADVERTISED_10baseT_Full)
-			new_adv_reg |= ADVERTISE_10FULL;
-		if (bp->advertising & ADVERTISED_100baseT_Half)
-			new_adv_reg |= ADVERTISE_100HALF;
-		if (bp->advertising & ADVERTISED_100baseT_Full)
-			new_adv_reg |= ADVERTISE_100FULL;
-		if (bp->advertising & ADVERTISED_1000baseT_Full)
-			new_adv1000_reg |= ADVERTISE_1000FULL;
-
+		new_adv_reg = ethtool_adv_to_mii_100bt(bp->advertising);
 		new_adv_reg |= ADVERTISE_CSMA;
-
 		new_adv_reg |= bnx2_phy_get_pause_adv(bp);
 
+		new_adv1000_reg |= ethtool_adv_to_mii_1000T(bp->advertising);
+
 		if ((adv1000_reg != new_adv1000_reg) ||
 			(adv_reg != new_adv_reg) ||
 			((bmcr & BMCR_ANENABLE) == 0)) {
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index cd36234..b329459 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -3594,15 +3594,7 @@ static int tg3_phy_autoneg_cfg(struct tg3 *tp, u32 advertise, u32 flowctrl)
 	u32 val, new_adv;
 
 	new_adv = ADVERTISE_CSMA;
-	if (advertise & ADVERTISED_10baseT_Half)
-		new_adv |= ADVERTISE_10HALF;
-	if (advertise & ADVERTISED_10baseT_Full)
-		new_adv |= ADVERTISE_10FULL;
-	if (advertise & ADVERTISED_100baseT_Half)
-		new_adv |= ADVERTISE_100HALF;
-	if (advertise & ADVERTISED_100baseT_Full)
-		new_adv |= ADVERTISE_100FULL;
-
+	new_adv |= ethtool_adv_to_mii_100bt(advertise);
 	new_adv |= tg3_advert_flowctrl_1000T(flowctrl);
 
 	err = tg3_writephy(tp, MII_ADVERTISE, new_adv);
@@ -3612,11 +3604,7 @@ static int tg3_phy_autoneg_cfg(struct tg3 *tp, u32 advertise, u32 flowctrl)
 	if (tp->phy_flags & TG3_PHYFLG_10_100_ONLY)
 		goto done;
 
-	new_adv = 0;
-	if (advertise & ADVERTISED_1000baseT_Half)
-		new_adv |= ADVERTISE_1000HALF;
-	if (advertise & ADVERTISED_1000baseT_Full)
-		new_adv |= ADVERTISE_1000FULL;
+	new_adv = ethtool_adv_to_mii_1000T(advertise);
 
 	if (tp->pci_chip_rev_id == CHIPREV_ID_5701_A0 ||
 	    tp->pci_chip_rev_id == CHIPREV_ID_5701_B0)
@@ -3790,14 +3778,7 @@ static int tg3_copper_is_advertising_all(struct tg3 *tp, u32 mask)
 {
 	u32 adv_reg, all_mask = 0;
 
-	if (mask & ADVERTISED_10baseT_Half)
-		all_mask |= ADVERTISE_10HALF;
-	if (mask & ADVERTISED_10baseT_Full)
-		all_mask |= ADVERTISE_10FULL;
-	if (mask & ADVERTISED_100baseT_Half)
-		all_mask |= ADVERTISE_100HALF;
-	if (mask & ADVERTISED_100baseT_Full)
-		all_mask |= ADVERTISE_100FULL;
+	all_mask = ethtool_adv_to_mii_100bt(mask);
 
 	if (tg3_readphy(tp, MII_ADVERTISE, &adv_reg))
 		return 0;
@@ -3808,11 +3789,7 @@ static int tg3_copper_is_advertising_all(struct tg3 *tp, u32 mask)
 	if (!(tp->phy_flags & TG3_PHYFLG_10_100_ONLY)) {
 		u32 tg3_ctrl;
 
-		all_mask = 0;
-		if (mask & ADVERTISED_1000baseT_Half)
-			all_mask |= ADVERTISE_1000HALF;
-		if (mask & ADVERTISED_1000baseT_Full)
-			all_mask |= ADVERTISE_1000FULL;
+		all_mask = ethtool_adv_to_mii_1000T(mask);
 
 		if (tg3_readphy(tp, MII_CTRL1000, &tg3_ctrl))
 			return 0;
@@ -4903,23 +4880,19 @@ static int tg3_setup_fiber_mii_phy(struct tg3 *tp, int force_reset)
 	    (tp->phy_flags & TG3_PHYFLG_PARALLEL_DETECT)) {
 		/* do nothing, just check for link up at the end */
 	} else if (tp->link_config.autoneg == AUTONEG_ENABLE) {
-		u32 adv, new_adv;
+		u32 adv, newadv;
 
 		err |= tg3_readphy(tp, MII_ADVERTISE, &adv);
-		new_adv = adv & ~(ADVERTISE_1000XFULL | ADVERTISE_1000XHALF |
-				  ADVERTISE_1000XPAUSE |
-				  ADVERTISE_1000XPSE_ASYM |
-				  ADVERTISE_SLCT);
-
-		new_adv |= tg3_advert_flowctrl_1000X(tp->link_config.flowctrl);
+		newadv = adv & ~(ADVERTISE_1000XFULL | ADVERTISE_1000XHALF |
+				 ADVERTISE_1000XPAUSE |
+				 ADVERTISE_1000XPSE_ASYM |
+				 ADVERTISE_SLCT);
 
-		if (tp->link_config.advertising & ADVERTISED_1000baseT_Half)
-			new_adv |= ADVERTISE_1000XHALF;
-		if (tp->link_config.advertising & ADVERTISED_1000baseT_Full)
-			new_adv |= ADVERTISE_1000XFULL;
+		newadv |= tg3_advert_flowctrl_1000X(tp->link_config.flowctrl);
+		newadv |= ethtool_adv_to_mii_1000X(tp->link_config.advertising);
 
-		if ((new_adv != adv) || !(bmcr & BMCR_ANENABLE)) {
-			tg3_writephy(tp, MII_ADVERTISE, new_adv);
+		if ((newadv != adv) || !(bmcr & BMCR_ANENABLE)) {
+			tg3_writephy(tp, MII_ADVERTISE, newadv);
 			bmcr |= BMCR_ANENABLE | BMCR_ANRESTART;
 			tg3_writephy(tp, MII_BMCR, bmcr);
 
diff --git a/drivers/net/ethernet/sun/niu.c b/drivers/net/ethernet/sun/niu.c
index 3ebeb9d..9997be5 100644
--- a/drivers/net/ethernet/sun/niu.c
+++ b/drivers/net/ethernet/sun/niu.c
@@ -1151,19 +1151,8 @@ static int link_status_mii(struct niu *np, int *link_up_p)
 		supported |= SUPPORTED_1000baseT_Full;
 	lp->supported = supported;
 
-	advertising = 0;
-	if (advert & ADVERTISE_10HALF)
-		advertising |= ADVERTISED_10baseT_Half;
-	if (advert & ADVERTISE_10FULL)
-		advertising |= ADVERTISED_10baseT_Full;
-	if (advert & ADVERTISE_100HALF)
-		advertising |= ADVERTISED_100baseT_Half;
-	if (advert & ADVERTISE_100FULL)
-		advertising |= ADVERTISED_100baseT_Full;
-	if (ctrl1000 & ADVERTISE_1000HALF)
-		advertising |= ADVERTISED_1000baseT_Half;
-	if (ctrl1000 & ADVERTISE_1000FULL)
-		advertising |= ADVERTISED_1000baseT_Full;
+	advertising = mii_adv_to_ethtool_100bt(advert);
+	advertising |= mii_adv_to_ethtool_1000T(ctrl1000);
 
 	if (bmcr & BMCR_ANENABLE) {
 		int neg, neg1000;
diff --git a/drivers/net/mii.c b/drivers/net/mii.c
index c62e781..d0a2962 100644
--- a/drivers/net/mii.c
+++ b/drivers/net/mii.c
@@ -41,20 +41,8 @@ static u32 mii_get_an(struct mii_if_info *mii, u16 addr)
 	advert = mii->mdio_read(mii->dev, mii->phy_id, addr);
 	if (advert & LPA_LPACK)
 		result |= ADVERTISED_Autoneg;
-	if (advert & ADVERTISE_10HALF)
-		result |= ADVERTISED_10baseT_Half;
-	if (advert & ADVERTISE_10FULL)
-		result |= ADVERTISED_10baseT_Full;
-	if (advert & ADVERTISE_100HALF)
-		result |= ADVERTISED_100baseT_Half;
-	if (advert & ADVERTISE_100FULL)
-		result |= ADVERTISED_100baseT_Full;
-	if (advert & ADVERTISE_PAUSE_CAP)
-		result |= ADVERTISED_Pause;
-	if (advert & ADVERTISE_PAUSE_ASYM)
-		result |= ADVERTISED_Asym_Pause;
-
-	return result;
+
+	return result | mii_adv_to_ethtool_100bt(advert);
 }
 
 /**
@@ -104,19 +92,13 @@ int mii_ethtool_gset(struct mii_if_info *mii, struct ethtool_cmd *ecmd)
 		ecmd->autoneg = AUTONEG_ENABLE;
 
 		ecmd->advertising |= mii_get_an(mii, MII_ADVERTISE);
-		if (ctrl1000 & ADVERTISE_1000HALF)
-			ecmd->advertising |= ADVERTISED_1000baseT_Half;
-		if (ctrl1000 & ADVERTISE_1000FULL)
-			ecmd->advertising |= ADVERTISED_1000baseT_Full;
+		if (mii->supports_gmii)
+			ecmd->advertising |= mii_adv_to_ethtool_1000T(ctrl1000);
 
 		if (bmsr & BMSR_ANEGCOMPLETE) {
 			ecmd->lp_advertising = mii_get_an(mii, MII_LPA);
-			if (stat1000 & LPA_1000HALF)
-				ecmd->lp_advertising |=
-					ADVERTISED_1000baseT_Half;
-			if (stat1000 & LPA_1000FULL)
-				ecmd->lp_advertising |=
-					ADVERTISED_1000baseT_Full;
+			ecmd->lp_advertising |=
+					     mii_lpa_to_ethtool_1000T(stat1000);
 		} else {
 			ecmd->lp_advertising = 0;
 		}
@@ -204,20 +186,10 @@ int mii_ethtool_sset(struct mii_if_info *mii, struct ethtool_cmd *ecmd)
 			advert2 = mii->mdio_read(dev, mii->phy_id, MII_CTRL1000);
 			tmp2 = advert2 & ~(ADVERTISE_1000HALF | ADVERTISE_1000FULL);
 		}
-		if (ecmd->advertising & ADVERTISED_10baseT_Half)
-			tmp |= ADVERTISE_10HALF;
-		if (ecmd->advertising & ADVERTISED_10baseT_Full)
-			tmp |= ADVERTISE_10FULL;
-		if (ecmd->advertising & ADVERTISED_100baseT_Half)
-			tmp |= ADVERTISE_100HALF;
-		if (ecmd->advertising & ADVERTISED_100baseT_Full)
-			tmp |= ADVERTISE_100FULL;
-		if (mii->supports_gmii) {
-			if (ecmd->advertising & ADVERTISED_1000baseT_Half)
-				tmp2 |= ADVERTISE_1000HALF;
-			if (ecmd->advertising & ADVERTISED_1000baseT_Full)
-				tmp2 |= ADVERTISE_1000FULL;
-		}
+		tmp |= ethtool_adv_to_mii_100bt(ecmd->advertising);
+
+		if (mii->supports_gmii)
+			tmp2 |= ethtool_adv_to_mii_1000T(ecmd->advertising);
 		if (advert != tmp) {
 			mii->mdio_write(dev, mii->phy_id, MII_ADVERTISE, tmp);
 			mii->advertising = tmp;
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 83a5a5a..edb905f 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -563,20 +563,9 @@ static int genphy_config_advert(struct phy_device *phydev)
 	if (adv < 0)
 		return adv;
 
-	adv &= ~(ADVERTISE_ALL | ADVERTISE_100BASE4 | ADVERTISE_PAUSE_CAP | 
+	adv &= ~(ADVERTISE_ALL | ADVERTISE_100BASE4 | ADVERTISE_PAUSE_CAP |
 		 ADVERTISE_PAUSE_ASYM);
-	if (advertise & ADVERTISED_10baseT_Half)
-		adv |= ADVERTISE_10HALF;
-	if (advertise & ADVERTISED_10baseT_Full)
-		adv |= ADVERTISE_10FULL;
-	if (advertise & ADVERTISED_100baseT_Half)
-		adv |= ADVERTISE_100HALF;
-	if (advertise & ADVERTISED_100baseT_Full)
-		adv |= ADVERTISE_100FULL;
-	if (advertise & ADVERTISED_Pause)
-		adv |= ADVERTISE_PAUSE_CAP;
-	if (advertise & ADVERTISED_Asym_Pause)
-		adv |= ADVERTISE_PAUSE_ASYM;
+	adv |= ethtool_adv_to_mii_100bt(advertise);
 
 	if (adv != oldadv) {
 		err = phy_write(phydev, MII_ADVERTISE, adv);
@@ -595,10 +584,7 @@ static int genphy_config_advert(struct phy_device *phydev)
 			return adv;
 
 		adv &= ~(ADVERTISE_1000FULL | ADVERTISE_1000HALF);
-		if (advertise & SUPPORTED_1000baseT_Half)
-			adv |= ADVERTISE_1000HALF;
-		if (advertise & SUPPORTED_1000baseT_Full)
-			adv |= ADVERTISE_1000FULL;
+		adv |= ethtool_adv_to_mii_1000T(advertise);
 
 		if (adv != oldadv) {
 			err = phy_write(phydev, MII_CTRL1000, adv);
diff --git a/include/linux/mii.h b/include/linux/mii.h
index 2774823..ac49406 100644
--- a/include/linux/mii.h
+++ b/include/linux/mii.h
@@ -240,6 +240,171 @@ static inline unsigned int mii_duplex (unsigned int duplex_lock,
 }
 
 /**
+ * ethtool_adv_to_mii_100bt
+ * @ethadv: the ethtool advertisement settings
+ *
+ * A small helper function that translates ethtool advertisement
+ * settings to phy autonegotiation advertisements for the
+ * MII_ADVERTISE register.
+ */
+static inline u32 ethtool_adv_to_mii_100bt(u32 ethadv)
+{
+	u32 result = 0;
+
+	if (ethadv & ADVERTISED_10baseT_Half)
+		result |= ADVERTISE_10HALF;
+	if (ethadv & ADVERTISED_10baseT_Full)
+		result |= ADVERTISE_10FULL;
+	if (ethadv & ADVERTISED_100baseT_Half)
+		result |= ADVERTISE_100HALF;
+	if (ethadv & ADVERTISED_100baseT_Full)
+		result |= ADVERTISE_100FULL;
+	if (ethadv & ADVERTISED_Pause)
+		result |= ADVERTISE_PAUSE_CAP;
+	if (ethadv & ADVERTISED_Asym_Pause)
+		result |= ADVERTISE_PAUSE_ASYM;
+
+	return result;
+}
+
+/**
+ * mii_adv_to_ethtool_100bt
+ * @adv: value of the MII_ADVERTISE register
+ *
+ * A small helper function that translates MII_ADVERTISE bits
+ * to ethtool advertisement settings.
+ */
+static inline u32 mii_adv_to_ethtool_100bt(u32 adv)
+{
+	u32 result = 0;
+
+	if (adv & ADVERTISE_10HALF)
+		result |= ADVERTISED_10baseT_Half;
+	if (adv & ADVERTISE_10FULL)
+		result |= ADVERTISED_10baseT_Full;
+	if (adv & ADVERTISE_100HALF)
+		result |= ADVERTISED_100baseT_Half;
+	if (adv & ADVERTISE_100FULL)
+		result |= ADVERTISED_100baseT_Full;
+	if (adv & ADVERTISE_PAUSE_CAP)
+		result |= ADVERTISED_Pause;
+	if (adv & ADVERTISE_PAUSE_ASYM)
+		result |= ADVERTISED_Asym_Pause;
+
+	return result;
+}
+
+/**
+ * ethtool_adv_to_mii_1000T
+ * @ethadv: the ethtool advertisement settings
+ *
+ * A small helper function that translates ethtool advertisement
+ * settings to phy autonegotiation advertisements for the
+ * MII_CTRL1000 register when in 1000T mode.
+ */
+static inline u32 ethtool_adv_to_mii_1000T(u32 ethadv)
+{
+	u32 result = 0;
+
+	if (ethadv & ADVERTISED_1000baseT_Half)
+		result |= ADVERTISE_1000HALF;
+	if (ethadv & ADVERTISED_1000baseT_Full)
+		result |= ADVERTISE_1000FULL;
+
+	return result;
+}
+
+/**
+ * mii_adv_to_ethtool_1000T
+ * @adv: value of the MII_CTRL1000 register
+ *
+ * A small helper function that translates MII_CTRL1000
+ * bits, when in 1000Base-T mode, to ethtool
+ * advertisement settings.
+ */
+static inline u32 mii_adv_to_ethtool_1000T(u32 adv)
+{
+	u32 result = 0;
+
+	if (adv & ADVERTISE_1000HALF)
+		result |= ADVERTISED_1000baseT_Half;
+	if (adv & ADVERTISE_1000FULL)
+		result |= ADVERTISED_1000baseT_Full;
+
+	return result;
+}
+
+#define mii_lpa_to_ethtool_100bt(lpa)	mii_adv_to_ethtool_100bt(lpa)
+
+/**
+ * mii_lpa_to_ethtool_1000T
+ * @lpa: value of the MII_STAT1000 register
+ *
+ * A small helper function that translates MII_STAT1000
+ * bits, when in 1000Base-T mode, to ethtool
+ * advertisement settings.
+ */
+static inline u32 mii_lpa_to_ethtool_1000T(u32 lpa)
+{
+	u32 result = 0;
+
+	if (lpa & LPA_1000HALF)
+		result |= ADVERTISED_1000baseT_Half;
+	if (lpa & LPA_1000FULL)
+		result |= ADVERTISED_1000baseT_Full;
+
+	return result;
+}
+
+/**
+ * ethtool_adv_to_mii_1000X
+ * @ethadv: the ethtool advertisement settings
+ *
+ * A small helper function that translates ethtool advertisement
+ * settings to phy autonegotiation advertisements for the
+ * MII_CTRL1000 register when in 1000Base-X mode.
+ */
+static inline u32 ethtool_adv_to_mii_1000X(u32 ethadv)
+{
+	u32 result = 0;
+
+	if (ethadv & ADVERTISED_1000baseT_Half)
+		result |= ADVERTISE_1000XHALF;
+	if (ethadv & ADVERTISED_1000baseT_Full)
+		result |= ADVERTISE_1000XFULL;
+	if (ethadv & ADVERTISED_Pause)
+		result |= ADVERTISE_1000XPAUSE;
+	if (ethadv & ADVERTISED_Asym_Pause)
+		result |= ADVERTISE_1000XPSE_ASYM;
+
+	return result;
+}
+
+/**
+ * mii_adv_to_ethtool_1000X
+ * @adv: value of the MII_CTRL1000 register
+ *
+ * A small helper function that translates MII_CTRL1000
+ * bits, when in 1000Base-X mode, to ethtool
+ * advertisement settings.
+ */
+static inline u32 mii_adv_to_ethtool_1000X(u32 adv)
+{
+	u32 result = 0;
+
+	if (adv & ADVERTISE_1000XHALF)
+		result |= ADVERTISED_1000baseT_Half;
+	if (adv & ADVERTISE_1000XFULL)
+		result |= ADVERTISED_1000baseT_Full;
+	if (adv & ADVERTISE_1000XPAUSE)
+		result |= ADVERTISED_Pause;
+	if (adv & ADVERTISE_1000XPSE_ASYM)
+		result |= ADVERTISED_Asym_Pause;
+
+	return result;
+}
+
+/**
  * mii_advertise_flowctrl - get flow control advertisement flags
  * @cap: Flow control capabilities (FLOW_CTRL_RX, FLOW_CTRL_TX or both)
  */
-- 
1.7.3.4
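
As a rough illustration of how the link-partner helper above might be
used, here is a minimal userspace sketch. The SKETCH_* constants are
local copies assumed to match linux/mii.h and linux/ethtool.h (check
them against your tree), and sketch_partner_has_1000_full() is a
hypothetical caller that is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies so the sketch builds outside the kernel tree.
 * Assumed to match linux/mii.h and linux/ethtool.h. */
#define SKETCH_LPA_1000HALF			0x0400
#define SKETCH_LPA_1000FULL			0x0800
#define SKETCH_ADVERTISED_1000baseT_Half	(1u << 4)
#define SKETCH_ADVERTISED_1000baseT_Full	(1u << 5)

/* Same shape as mii_lpa_to_ethtool_1000T() in the patch:
 * translate MII_STAT1000 bits into ethtool advertisement bits. */
static uint32_t sketch_lpa_to_ethtool_1000T(uint32_t lpa)
{
	uint32_t result = 0;

	if (lpa & SKETCH_LPA_1000HALF)
		result |= SKETCH_ADVERTISED_1000baseT_Half;
	if (lpa & SKETCH_LPA_1000FULL)
		result |= SKETCH_ADVERTISED_1000baseT_Full;

	return result;
}

/* Hypothetical caller: does the link partner advertise 1000baseT
 * full duplex, given a raw MII_STAT1000 register value? */
static int sketch_partner_has_1000_full(uint32_t stat1000)
{
	uint32_t lp = sketch_lpa_to_ethtool_1000T(stat1000);

	return (lp & SKETCH_ADVERTISED_1000baseT_Full) != 0;
}
```

A driver would typically feed the result into the lp_advertising field
it reports through ethtool; bits outside the two 1000Base-T flags are
ignored by the translation.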


* Re: [PATCH net-next v4 5/8] forcedeth: implement ndo_get_stats64() API
From: David Decotigny @ 2011-11-15 22:01 UTC (permalink / raw)
  To: netdev, linux-kernel, Stephen Hemminger
  Cc: David S. Miller, Ian Campbell, Eric Dumazet, Jeff Kirsher,
	Ben Hutchings, Jiri Pirko, Joe Perches, Szymon Janc,
	Richard Jones, Ayaz Abdulla, David Decotigny
In-Reply-To: <6c785722f068deef5ff546f53b8011ecff43a4c1.1321384662.git.david.decotigny@google.com>

Hi all,


I'm afraid this version (http://patchwork.ozlabs.org/patch/125861/) is wrong.

Each software stat field is updated by a single writer. But these
different stats are guarded by a single seqcount, so effectively
different writers are fiddling with the same seqcount. The question
is: is it OK for the seqcount to be updated concurrently without
protection? Is the seqcount guaranteed to be correctly updated from the
readers' perspective? Or should I serialize the sections that update
the seqcount?

If I should protect it, then I need to revisit that patch: I'd
prefer not to lock in the fast paths just because of the stats. I
could, for example, revert to v3 (using atomic_t stats). Would you have
any recommendation or suggestion?
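
A minimal sketch of the concern, assuming a simplified userspace
stand-in for the seqcount (the sketch_* names are hypothetical and this
is not the kernel's u64_stats_sync implementation). A reader trusts a
snapshot only when the sequence is even, so two overlapping writers can
bring the sequence back to an even value while both are still
mid-update. The real reader also re-checks the sequence after copying;
that step is left out here to keep the interleaving deterministic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a seqcount; not the kernel API. */
struct sketch_seqcount {
	unsigned int sequence;
};

/* Writer side: an odd sequence means "update in progress". */
static void sketch_write_begin(struct sketch_seqcount *s)
{
	s->sequence++;
}

static void sketch_write_end(struct sketch_seqcount *s)
{
	s->sequence++;
}

/* Reader side: a snapshot is only trusted when the sequence is even. */
static bool sketch_reader_would_trust(const struct sketch_seqcount *s)
{
	return (s->sequence & 1) == 0;
}

/* Two unserialized writers: returns true if a reader would trust the
 * data while both writers are still in the middle of their update. */
static bool sketch_two_writers_fool_reader(void)
{
	struct sketch_seqcount s = { 0 };
	bool fooled;

	sketch_write_begin(&s);	/* writer A (e.g. RX path): sequence == 1 */
	sketch_write_begin(&s);	/* writer B (e.g. TX path): sequence == 2 */
	fooled = sketch_reader_would_trust(&s);	/* even, yet both mid-update */
	sketch_write_end(&s);
	sketch_write_end(&s);

	return fooled;
}

/* One writer at a time: the sequence stays odd for the whole update. */
static bool sketch_single_writer_is_safe(void)
{
	struct sketch_seqcount s = { 0 };
	bool trusted_mid_update;

	sketch_write_begin(&s);
	trusted_mid_update = sketch_reader_would_trust(&s);
	sketch_write_end(&s);

	return !trusted_mid_update && sketch_reader_would_trust(&s);
}
```

With a single writer the sequence is odd for the whole update, which is
the property the retry loop in nv_get_stats64() relies on.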

Thanks! Regards,

--
David Decotigny



On Tue, Nov 15, 2011 at 11:25 AM, David Decotigny
<david.decotigny@google.com> wrote:
> This commit implements the ndo_get_stats64() API for forcedeth. Since
> hardware stats are being updated from different contexts (process and
> timer), this commit adds protection (locking + atomic variables). For
> software stats, it relies on the u64_stats_sync.h API.
>
> Tested:
>  - 16-way SMP x86_64 ->
>    RX bytes:7244556582 (7.2 GB)  TX bytes:181904254 (181.9 MB)
>  - pktgen + loopback: identical rx_bytes/tx_bytes and rx_packets/tx_packets
>
>
>
> Signed-off-by: David Decotigny <david.decotigny@google.com>
> ---
>  drivers/net/ethernet/nvidia/forcedeth.c |  195 +++++++++++++++++++++++--------
>  1 files changed, 144 insertions(+), 51 deletions(-)
>
> diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
> index ee8cce5..ff01d5e 100644
> --- a/drivers/net/ethernet/nvidia/forcedeth.c
> +++ b/drivers/net/ethernet/nvidia/forcedeth.c
> @@ -65,7 +65,8 @@
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
>  #include <linux/prefetch.h>
> -#include  <linux/io.h>
> +#include <linux/u64_stats_sync.h>
> +#include <linux/io.h>
>
>  #include <asm/irq.h>
>  #include <asm/system.h>
> @@ -736,6 +737,18 @@ struct nv_skb_map {
>  * - tx setup is lockless: it relies on netif_tx_lock. Actual submission
>  *     needs netdev_priv(dev)->lock :-(
>  * - set_multicast_list: preparation lockless, relies on netif_tx_lock.
> + *
> + * Hardware stats updates are protected by hwstats_lock:
> + * - updated by nv_do_stats_poll (timer). This is meant to avoid
> + *   integer wraparound in the NIC stats registers, at low frequency
> + *   (0.1 Hz)
> + * - updated by nv_get_ethtool_stats + nv_get_stats64
> + *
> + * Software stats are accessed only through a 64b synchronization
> + * point and are not subject to other synchronization techniques (one
> + * unique updating thread for each stat [single queue RX/TX fast
> + * paths], or callers already synchronized [for tx_dropped, except from
> + * nv_open/nv_close]).
>  */
>
>  /* in dev: base, irq */
> @@ -745,9 +758,13 @@ struct fe_priv {
>        struct net_device *dev;
>        struct napi_struct napi;
>
> -       /* General data:
> -        * Locking: spin_lock(&np->lock); */
> +       /* hardware stats are updated in syscall and timer */
> +       spinlock_t hwstats_lock;
>        struct nv_ethtool_stats estats;
> +
> +       /* software stats are accessed through a 64b synchronization point */
> +       struct u64_stats_sync swstats_syncp;
> +
>        int in_shutdown;
>        u32 linkspeed;
>        int duplex;
> @@ -798,6 +815,11 @@ struct fe_priv {
>        u32 nic_poll_irq;
>        int rx_ring_size;
>
> +       /* RX software stats */
> +       u64 stat_rx_packets;
> +       u64 stat_rx_bytes; /* not always available in HW */
> +       u64 stat_rx_missed_errors;
> +
>        /* media detection workaround.
>         * Locking: Within irq hander or disable_irq+spin_lock(&np->lock);
>         */
> @@ -820,6 +842,11 @@ struct fe_priv {
>        struct nv_skb_map *tx_end_flip;
>        int tx_stop;
>
> +       /* TX software stats */
> +       u64 stat_tx_packets; /* not always available in HW */
> +       u64 stat_tx_bytes;
> +       u64 stat_tx_dropped;
> +
>        /* msi/msi-x fields */
>        u32 msi_flags;
>        struct msix_entry msi_x_entry[NV_MSI_X_MAX_VECTORS];
> @@ -1635,11 +1662,19 @@ static void nv_mac_reset(struct net_device *dev)
>        pci_push(base);
>  }
>
> -static void nv_get_hw_stats(struct net_device *dev)
> +/* Caller must appropriately lock netdev_priv(dev)->hwstats_lock */
> +static void nv_update_stats(struct net_device *dev)
>  {
>        struct fe_priv *np = netdev_priv(dev);
>        u8 __iomem *base = get_hwbase(dev);
>
> +       /* If it happens that this is run in top-half context, then
> +        * replace the spin_lock of hwstats_lock with
> +        * spin_lock_irqsave() in calling functions. */
> +       WARN_ONCE(in_irq(), "forcedeth: estats spin_lock(_bh) from top-half");
> +       assert_spin_locked(&np->hwstats_lock);
> +
> +       /* query hardware */
>        np->estats.tx_bytes += readl(base + NvRegTxCnt);
>        np->estats.tx_zero_rexmt += readl(base + NvRegTxZeroReXmt);
>        np->estats.tx_one_rexmt += readl(base + NvRegTxOneReXmt);
> @@ -1698,40 +1733,67 @@ static void nv_get_hw_stats(struct net_device *dev)
>  }
>
>  /*
> - * nv_get_stats: dev->get_stats function
> + * nv_get_stats64: dev->ndo_get_stats64 function
>  * Get latest stats value from the nic.
>  * Called with read_lock(&dev_base_lock) held for read -
>  * only synchronized against unregister_netdevice.
>  */
> -static struct net_device_stats *nv_get_stats(struct net_device *dev)
> +static struct rtnl_link_stats64*
> +nv_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *storage)
> +       __acquires(&netdev_priv(dev)->hwstats_lock)
> +       __releases(&netdev_priv(dev)->hwstats_lock)
>  {
>        struct fe_priv *np = netdev_priv(dev);
> +       unsigned int syncp_start;
> +
> +       /*
> +        * Note: because HW stats are not always available and for
> +        * consistency reasons, the following ifconfig stats are
> +        * managed by software: rx_bytes, tx_bytes, rx_packets and
> +        * tx_packets. The related hardware stats reported by ethtool
> +        * should be equivalent to these ifconfig stats, with 4
> +        * additional bytes per packet (Ethernet FCS CRC).
> +        */
> +
> +       /* software stats */
> +       do {
> +               syncp_start = u64_stats_fetch_begin(&np->swstats_syncp);
> +               storage->rx_packets       = np->stat_rx_packets;
> +               storage->tx_packets       = np->stat_tx_packets;
> +               storage->rx_bytes         = np->stat_rx_bytes;
> +               storage->tx_bytes         = np->stat_tx_bytes;
> +               storage->tx_dropped       = np->stat_tx_dropped;
> +               storage->rx_missed_errors = np->stat_rx_missed_errors;
> +       } while (u64_stats_fetch_retry(&np->swstats_syncp, syncp_start));
>
>        /* If the nic supports hw counters then retrieve latest values */
> -       if (np->driver_data & (DEV_HAS_STATISTICS_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_STATISTICS_V3)) {
> -               nv_get_hw_stats(dev);
> +       if (np->driver_data & DEV_HAS_STATISTICS_V123) {
> +               spin_lock_bh(&np->hwstats_lock);
>
> -               /*
> -                * Note: because HW stats are not always available and
> -                * for consistency reasons, the following ifconfig
> -                * stats are managed by software: rx_bytes, tx_bytes,
> -                * rx_packets and tx_packets. The related hardware
> -                * stats reported by ethtool should be equivalent to
> -                * these ifconfig stats, with 4 additional bytes per
> -                * packet (Ethernet FCS CRC).
> -                */
> +               nv_update_stats(dev);
> +
> +               /* generic stats */
> +               storage->rx_errors = np->estats.rx_errors_total;
> +               storage->tx_errors = np->estats.tx_errors_total;
> +
> +               /* meaningful only when NIC supports stats v3 */
> +               storage->multicast = np->estats.rx_multicast;
> +
> +               /* detailed rx_errors */
> +               storage->rx_length_errors = np->estats.rx_length_error;
> +               storage->rx_over_errors   = np->estats.rx_over_errors;
> +               storage->rx_crc_errors    = np->estats.rx_crc_errors;
> +               storage->rx_frame_errors  = np->estats.rx_frame_align_error;
> +               storage->rx_fifo_errors   = np->estats.rx_drop_frame;
>
> -               /* copy to net_device stats */
> -               dev->stats.tx_fifo_errors = np->estats.tx_fifo_errors;
> -               dev->stats.tx_carrier_errors = np->estats.tx_carrier_errors;
> -               dev->stats.rx_crc_errors = np->estats.rx_crc_errors;
> -               dev->stats.rx_over_errors = np->estats.rx_over_errors;
> -               dev->stats.rx_fifo_errors = np->estats.rx_drop_frame;
> -               dev->stats.rx_errors = np->estats.rx_errors_total;
> -               dev->stats.tx_errors = np->estats.tx_errors_total;
> +               /* detailed tx_errors */
> +               storage->tx_carrier_errors = np->estats.tx_carrier_errors;
> +               storage->tx_fifo_errors    = np->estats.tx_fifo_errors;
> +
> +               spin_unlock_bh(&np->hwstats_lock);
>        }
>
> -       return &dev->stats;
> +       return storage;
>  }
>
>  /*
> @@ -1932,8 +1994,11 @@ static void nv_drain_tx(struct net_device *dev)
>                        np->tx_ring.ex[i].bufhigh = 0;
>                        np->tx_ring.ex[i].buflow = 0;
>                }
> -               if (nv_release_txskb(np, &np->tx_skb[i]))
> -                       dev->stats.tx_dropped++;
> +               if (nv_release_txskb(np, &np->tx_skb[i])) {
> +                       u64_stats_update_begin(&np->swstats_syncp);
> +                       np->stat_tx_dropped++;
> +                       u64_stats_update_end(&np->swstats_syncp);
> +               }
>                np->tx_skb[i].dma = 0;
>                np->tx_skb[i].dma_len = 0;
>                np->tx_skb[i].dma_single = 0;
> @@ -2390,11 +2455,14 @@ static int nv_tx_done(struct net_device *dev, int limit)
>                if (np->desc_ver == DESC_VER_1) {
>                        if (flags & NV_TX_LASTPACKET) {
>                                if (flags & NV_TX_ERROR) {
> -                                       if ((flags & NV_TX_RETRYERROR) && !(flags & NV_TX_RETRYCOUNT_MASK))
> +                                       if ((flags & NV_TX_RETRYERROR)
> +                                           && !(flags & NV_TX_RETRYCOUNT_MASK))
>                                                nv_legacybackoff_reseed(dev);
>                                } else {
> -                                       dev->stats.tx_packets++;
> -                                       dev->stats.tx_bytes += np->get_tx_ctx->skb->len;
> +                                       u64_stats_update_begin(&np->swstats_syncp);
> +                                       np->stat_tx_packets++;
> +                                       np->stat_tx_bytes += np->get_tx_ctx->skb->len;
> +                                       u64_stats_update_end(&np->swstats_syncp);
>                                }
>                                dev_kfree_skb_any(np->get_tx_ctx->skb);
>                                np->get_tx_ctx->skb = NULL;
> @@ -2403,11 +2471,14 @@ static int nv_tx_done(struct net_device *dev, int limit)
>                } else {
>                        if (flags & NV_TX2_LASTPACKET) {
>                                if (flags & NV_TX2_ERROR) {
> -                                       if ((flags & NV_TX2_RETRYERROR) && !(flags & NV_TX2_RETRYCOUNT_MASK))
> +                                       if ((flags & NV_TX2_RETRYERROR)
> +                                           && !(flags & NV_TX2_RETRYCOUNT_MASK))
>                                                nv_legacybackoff_reseed(dev);
>                                } else {
> -                                       dev->stats.tx_packets++;
> -                                       dev->stats.tx_bytes += np->get_tx_ctx->skb->len;
> +                                       u64_stats_update_begin(&np->swstats_syncp);
> +                                       np->stat_tx_packets++;
> +                                       np->stat_tx_bytes += np->get_tx_ctx->skb->len;
> +                                       u64_stats_update_end(&np->swstats_syncp);
>                                }
>                                dev_kfree_skb_any(np->get_tx_ctx->skb);
>                                np->get_tx_ctx->skb = NULL;
> @@ -2441,15 +2512,18 @@ static int nv_tx_done_optimized(struct net_device *dev, int limit)
>
>                if (flags & NV_TX2_LASTPACKET) {
>                        if (flags & NV_TX2_ERROR) {
> -                               if ((flags & NV_TX2_RETRYERROR) && !(flags & NV_TX2_RETRYCOUNT_MASK)) {
> +                               if ((flags & NV_TX2_RETRYERROR)
> +                                   && !(flags & NV_TX2_RETRYCOUNT_MASK)) {
>                                        if (np->driver_data & DEV_HAS_GEAR_MODE)
>                                                nv_gear_backoff_reseed(dev);
>                                        else
>                                                nv_legacybackoff_reseed(dev);
>                                }
>                        } else {
> -                               dev->stats.tx_packets++;
> -                               dev->stats.tx_bytes += np->get_tx_ctx->skb->len;
> +                                       u64_stats_update_begin(&np->swstats_syncp);
> +                                       np->stat_tx_packets++;
> +                                       np->stat_tx_bytes += np->get_tx_ctx->skb->len;
> +                                       u64_stats_update_end(&np->swstats_syncp);
>                        }
>
>                        dev_kfree_skb_any(np->get_tx_ctx->skb);
> @@ -2662,8 +2736,11 @@ static int nv_rx_process(struct net_device *dev, int limit)
>                                        }
>                                        /* the rest are hard errors */
>                                        else {
> -                                               if (flags & NV_RX_MISSEDFRAME)
> -                                                       dev->stats.rx_missed_errors++;
> +                                               if (flags & NV_RX_MISSEDFRAME) {
> +                                                       u64_stats_update_begin(&np->swstats_syncp);
> +                                                       np->stat_rx_missed_errors++;
> +                                                       u64_stats_update_end(&np->swstats_syncp);
> +                                               }
>                                                dev_kfree_skb(skb);
>                                                goto next_pkt;
>                                        }
> @@ -2706,8 +2783,10 @@ static int nv_rx_process(struct net_device *dev, int limit)
>                skb_put(skb, len);
>                skb->protocol = eth_type_trans(skb, dev);
>                napi_gro_receive(&np->napi, skb);
> -               dev->stats.rx_packets++;
> -               dev->stats.rx_bytes += len;
> +               u64_stats_update_begin(&np->swstats_syncp);
> +               np->stat_rx_packets++;
> +               np->stat_rx_bytes += len;
> +               u64_stats_update_end(&np->swstats_syncp);
>  next_pkt:
>                if (unlikely(np->get_rx.orig++ == np->last_rx.orig))
>                        np->get_rx.orig = np->first_rx.orig;
> @@ -2790,8 +2869,10 @@ static int nv_rx_process_optimized(struct net_device *dev, int limit)
>                                __vlan_hwaccel_put_tag(skb, vid);
>                        }
>                        napi_gro_receive(&np->napi, skb);
> -                       dev->stats.rx_packets++;
> -                       dev->stats.rx_bytes += len;
> +                       u64_stats_update_begin(&np->swstats_syncp);
> +                       np->stat_rx_packets++;
> +                       np->stat_rx_bytes += len;
> +                       u64_stats_update_end(&np->swstats_syncp);
>                } else {
>                        dev_kfree_skb(skb);
>                }
> @@ -4000,11 +4081,18 @@ static void nv_poll_controller(struct net_device *dev)
>  #endif
>
>  static void nv_do_stats_poll(unsigned long data)
> +       __acquires(&netdev_priv(dev)->hwstats_lock)
> +       __releases(&netdev_priv(dev)->hwstats_lock)
>  {
>        struct net_device *dev = (struct net_device *) data;
>        struct fe_priv *np = netdev_priv(dev);
>
> -       nv_get_hw_stats(dev);
> +       /* If lock is currently taken, the stats are being refreshed
> +        * and hence fresh enough */
> +       if (spin_trylock(&np->hwstats_lock)) {
> +               nv_update_stats(dev);
> +               spin_unlock(&np->hwstats_lock);
> +       }
>
>        if (!np->in_shutdown)
>                mod_timer(&np->stats_poll,
> @@ -4711,14 +4799,18 @@ static int nv_get_sset_count(struct net_device *dev, int sset)
>        }
>  }
>
> -static void nv_get_ethtool_stats(struct net_device *dev, struct ethtool_stats *estats, u64 *buffer)
> +static void nv_get_ethtool_stats(struct net_device *dev,
> +                                struct ethtool_stats *estats, u64 *buffer)
> +       __acquires(&netdev_priv(dev)->hwstats_lock)
> +       __releases(&netdev_priv(dev)->hwstats_lock)
>  {
>        struct fe_priv *np = netdev_priv(dev);
>
> -       /* update stats */
> -       nv_get_hw_stats(dev);
> -
> -       memcpy(buffer, &np->estats, nv_get_sset_count(dev, ETH_SS_STATS)*sizeof(u64));
> +       spin_lock_bh(&np->hwstats_lock);
> +       nv_update_stats(dev);
> +       memcpy(buffer, &np->estats,
> +              nv_get_sset_count(dev, ETH_SS_STATS)*sizeof(u64));
> +       spin_unlock_bh(&np->hwstats_lock);
>  }
>
>  static int nv_link_test(struct net_device *dev)
> @@ -5362,7 +5454,7 @@ static int nv_close(struct net_device *dev)
>  static const struct net_device_ops nv_netdev_ops = {
>        .ndo_open               = nv_open,
>        .ndo_stop               = nv_close,
> -       .ndo_get_stats          = nv_get_stats,
> +       .ndo_get_stats64        = nv_get_stats64,
>        .ndo_start_xmit         = nv_start_xmit,
>        .ndo_tx_timeout         = nv_tx_timeout,
>        .ndo_change_mtu         = nv_change_mtu,
> @@ -5379,7 +5471,7 @@ static const struct net_device_ops nv_netdev_ops = {
>  static const struct net_device_ops nv_netdev_ops_optimized = {
>        .ndo_open               = nv_open,
>        .ndo_stop               = nv_close,
> -       .ndo_get_stats          = nv_get_stats,
> +       .ndo_get_stats64        = nv_get_stats64,
>        .ndo_start_xmit         = nv_start_xmit_optimized,
>        .ndo_tx_timeout         = nv_tx_timeout,
>        .ndo_change_mtu         = nv_change_mtu,
> @@ -5418,6 +5510,7 @@ static int __devinit nv_probe(struct pci_dev *pci_dev, const struct pci_device_i
>        np->dev = dev;
>        np->pci_dev = pci_dev;
>        spin_lock_init(&np->lock);
> +       spin_lock_init(&np->hwstats_lock);
>        SET_NETDEV_DEV(dev, &pci_dev->dev);
>
>        init_timer(&np->oom_kick);
> --
> 1.7.3.1
>
>
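
The spin_trylock() use in nv_do_stats_poll() quoted above can be
sketched in userspace as follows. This is only an illustration under
stated assumptions: a C11 atomic_flag stands in for hwstats_lock, and
every sketch_* name is hypothetical rather than part of forcedeth.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_stats {
	atomic_flag lock;	/* stand-in for hwstats_lock */
	unsigned long tx_bytes;	/* accumulated hardware counter */
};

/* Returns true if the lock was acquired, false if it was already held. */
static bool sketch_trylock(struct sketch_stats *s)
{
	return !atomic_flag_test_and_set_explicit(&s->lock,
						  memory_order_acquire);
}

static void sketch_unlock(struct sketch_stats *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Periodic poll: if somebody else holds the lock, the stats are being
 * refreshed right now, so skip this round instead of waiting.
 * Returns true if this call refreshed the counters. */
static bool sketch_stats_poll(struct sketch_stats *s, unsigned long hw_delta)
{
	if (!sketch_trylock(s))
		return false;

	s->tx_bytes += hw_delta;
	sketch_unlock(s);
	return true;
}

/* Deterministic walk-through: one poll with the lock free, one poll
 * while another context is pretending to hold the lock. */
static bool sketch_poll_skips_when_busy(void)
{
	struct sketch_stats s = { .lock = ATOMIC_FLAG_INIT, .tx_bytes = 0 };
	bool refreshed_idle, held, refreshed_busy;

	refreshed_idle = sketch_stats_poll(&s, 100);	/* lock free */
	held = sketch_trylock(&s);			/* simulate a holder */
	refreshed_busy = sketch_stats_poll(&s, 50);	/* must be skipped */
	sketch_unlock(&s);

	return refreshed_idle && held && !refreshed_busy &&
	       s.tx_bytes == 100;
}
```

The skipped round loses nothing in this model: the holder of the lock
is the one reading the hardware counters, so the delta is accounted for
by that context.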


* [PATCH] bnx2x: cache-in compressed fw image
From: Dmitry Kravkov @ 2011-11-15 22:07 UTC (permalink / raw)
  To: davem, netdev; +Cc: Dmitry Kravkov, Eilon Greenstein

Re-requesting the firmware from the filesystem may fail for various
reasons, so once the firmware has been loaded we do not release it
until the driver is removed.

This also resolves a boot problem: when the initial firmware is located
on the initrd but the rootfs is still unavailable, a device reset would
fail because the firmware files are absent.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |   50 +++++++++++++---------
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c   |   15 +++----
 2 files changed, 35 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 9090afc..f6d21fa 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -10574,33 +10574,38 @@ do {									\
 
 int bnx2x_init_firmware(struct bnx2x *bp)
 {
-	const char *fw_file_name;
 	struct bnx2x_fw_file_hdr *fw_hdr;
 	int rc;
 
-	if (CHIP_IS_E1(bp))
-		fw_file_name = FW_FILE_NAME_E1;
-	else if (CHIP_IS_E1H(bp))
-		fw_file_name = FW_FILE_NAME_E1H;
-	else if (!CHIP_IS_E1x(bp))
-		fw_file_name = FW_FILE_NAME_E2;
-	else {
-		BNX2X_ERR("Unsupported chip revision\n");
-		return -EINVAL;
-	}
 
-	BNX2X_DEV_INFO("Loading %s\n", fw_file_name);
+	if (!bp->firmware) {
+		const char *fw_file_name;
 
-	rc = request_firmware(&bp->firmware, fw_file_name, &bp->pdev->dev);
-	if (rc) {
-		BNX2X_ERR("Can't load firmware file %s\n", fw_file_name);
-		goto request_firmware_exit;
-	}
+		if (CHIP_IS_E1(bp))
+			fw_file_name = FW_FILE_NAME_E1;
+		else if (CHIP_IS_E1H(bp))
+			fw_file_name = FW_FILE_NAME_E1H;
+		else if (!CHIP_IS_E1x(bp))
+			fw_file_name = FW_FILE_NAME_E2;
+		else {
+			BNX2X_ERR("Unsupported chip revision\n");
+			return -EINVAL;
+		}
+		BNX2X_DEV_INFO("Loading %s\n", fw_file_name);
 
-	rc = bnx2x_check_firmware(bp);
-	if (rc) {
-		BNX2X_ERR("Corrupt firmware file %s\n", fw_file_name);
-		goto request_firmware_exit;
+		rc = request_firmware(&bp->firmware, fw_file_name,
+				      &bp->pdev->dev);
+		if (rc) {
+			BNX2X_ERR("Can't load firmware file %s\n",
+				  fw_file_name);
+			goto request_firmware_exit;
+		}
+
+		rc = bnx2x_check_firmware(bp);
+		if (rc) {
+			BNX2X_ERR("Corrupt firmware file %s\n", fw_file_name);
+			goto request_firmware_exit;
+		}
 	}
 
 	fw_hdr = (struct bnx2x_fw_file_hdr *)bp->firmware->data;
@@ -10656,6 +10661,7 @@ static void bnx2x_release_firmware(struct bnx2x *bp)
 	kfree(bp->init_ops);
 	kfree(bp->init_data);
 	release_firmware(bp->firmware);
+	bp->firmware = NULL;
 }
 
 
@@ -10951,6 +10957,8 @@ static void __devexit bnx2x_remove_one(struct pci_dev *pdev)
 	if (bp->doorbells)
 		iounmap(bp->doorbells);
 
+	bnx2x_release_firmware(bp);
+
 	bnx2x_free_mem_bp(bp);
 
 	free_netdev(dev);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
index 0440425..1451769 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
@@ -5380,7 +5380,7 @@ static int bnx2x_func_hw_init(struct bnx2x *bp,
 	rc = drv->init_fw(bp);
 	if (rc) {
 		BNX2X_ERR("Error loading firmware\n");
-		goto fw_init_err;
+		goto init_err;
 	}
 
 	/* Handle the beginning of COMMON_XXX pases separatelly... */
@@ -5388,25 +5388,25 @@ static int bnx2x_func_hw_init(struct bnx2x *bp,
 	case FW_MSG_CODE_DRV_LOAD_COMMON_CHIP:
 		rc = bnx2x_func_init_cmn_chip(bp, drv);
 		if (rc)
-			goto init_hw_err;
+			goto init_err;
 
 		break;
 	case FW_MSG_CODE_DRV_LOAD_COMMON:
 		rc = bnx2x_func_init_cmn(bp, drv);
 		if (rc)
-			goto init_hw_err;
+			goto init_err;
 
 		break;
 	case FW_MSG_CODE_DRV_LOAD_PORT:
 		rc = bnx2x_func_init_port(bp, drv);
 		if (rc)
-			goto init_hw_err;
+			goto init_err;
 
 		break;
 	case FW_MSG_CODE_DRV_LOAD_FUNCTION:
 		rc = bnx2x_func_init_func(bp, drv);
 		if (rc)
-			goto init_hw_err;
+			goto init_err;
 
 		break;
 	default:
@@ -5414,10 +5414,7 @@ static int bnx2x_func_hw_init(struct bnx2x *bp,
 		rc = -EINVAL;
 	}
 
-init_hw_err:
-	drv->release_fw(bp);
-
-fw_init_err:
+init_err:
 	drv->gunzip_end(bp);
 
 	/* In case of success, complete the comand immediatelly: no ramrods
-- 
1.7.7.2
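
The caching behaviour described in the commit message can be sketched
in plain C as follows. This is a simplified userspace model under
stated assumptions, not the driver code: the sketch_* types and
functions are hypothetical, and fs_available merely simulates whether
a firmware request to the filesystem would succeed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_fw {
	size_t size;
};

struct sketch_dev {
	struct sketch_fw *firmware;	/* NULL until first successful load */
	int fs_requests;		/* how often the filesystem was asked */
	int fs_available;		/* simulated: can the image be found? */
};

/* Simulated stand-in for asking the filesystem for the image. */
static int sketch_request_fw(struct sketch_dev *d)
{
	d->fs_requests++;
	if (!d->fs_available)
		return -ENOENT;

	d->firmware = malloc(sizeof(*d->firmware));
	if (!d->firmware)
		return -ENOMEM;
	d->firmware->size = 4096;
	return 0;
}

/* Load path: only go to the filesystem when nothing is cached. */
static int sketch_init_firmware(struct sketch_dev *d)
{
	if (!d->firmware) {
		int rc = sketch_request_fw(d);

		if (rc)
			return rc;
	}
	/* ...parse the (possibly cached) image here... */
	return 0;
}

/* Remove path: drop the cache and clear the pointer so that a later
 * init would request the image again. */
static void sketch_release_firmware(struct sketch_dev *d)
{
	free(d->firmware);
	d->firmware = NULL;
}

/* Walk-through: load once, lose the filesystem, reset, then remove.
 * Returns 1 if the reset succeeded from the cached image. */
static int sketch_reset_survives_missing_fs(void)
{
	struct sketch_dev d = { NULL, 0, 1 };
	int rc, ok;

	rc = sketch_init_firmware(&d);	/* initial load */
	if (rc)
		return 0;

	d.fs_available = 0;		/* image no longer reachable */
	rc = sketch_init_firmware(&d);	/* reset path reuses the cache */
	ok = (rc == 0 && d.fs_requests == 1);

	sketch_release_firmware(&d);	/* driver removal */
	return ok && d.firmware == NULL;
}
```

Clearing the pointer on release mirrors the bp->firmware = NULL line
in the patch: without it, the "is anything cached?" check would see a
stale pointer after removal.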


* Re: [PATCH net-next v4 6/8] forcedeth: account for dropped RX frames
From: Stephen Hemminger @ 2011-11-15 22:21 UTC (permalink / raw)
  To: David Decotigny
  Cc: netdev, linux-kernel, David S. Miller, Ian Campbell, Eric Dumazet,
	Jeff Kirsher, Ben Hutchings, Jiri Pirko, Joe Perches, Szymon Janc,
	Richard Jones, Ayaz Abdulla
In-Reply-To: <005cae310e19433c9f68c178805f16c774e8dedd.1321384662.git.david.decotigny@google.com>

On Tue, 15 Nov 2011 11:25:39 -0800
David Decotigny <david.decotigny@google.com> wrote:

> This adds the stats counter for dropped RX frames.
> 

There is already an rx_dropped statistic in the netdevice structure;
why do you need your own in ethtool?


* Re: [PATCH 1/5] net-next:asix:PHY_MODE_RTL8211CL should be 0xC
From: Grant Grundler @ 2011-11-15 22:26 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel, allan, freddy
In-Reply-To: <20111115.164134.1813312136000862195.davem@davemloft.net>

On Tue, Nov 15, 2011 at 1:41 PM, David Miller <davem@davemloft.net> wrote:
> From: Grant Grundler <grundler@chromium.org>
> Date: Tue, 15 Nov 2011 09:12:39 -0800
>
>> From: Grant Grundler <grundler@google.com>
>>
>> Use correct value for rtl phy support.
>> (rtl phy are in AX88178 devices like NWU220G and USB2-ET1000).
>>
>> Signed-off-by: Allan Chou <allan@asix.com.tw>
>> Tested-by: Grant Grundler <grundler@chromium.org>
>
> Applied.

Dave,
Thanks for applying the series and offering to push immediately to
Linus. I'll submit the whitespace/coding style cleanups after the
frustration of my blooper has receded from your memory a bit more. :)

cheers,
grant


* Re: [PATCH net-next v4 3/8] forcedeth: allow to silence "TX timeout" debug messages
From: Stephen Hemminger @ 2011-11-15 22:27 UTC (permalink / raw)
  To: David Decotigny
  Cc: netdev, linux-kernel, David S. Miller, Ian Campbell, Eric Dumazet,
	Jeff Kirsher, Ben Hutchings, Jiri Pirko, Joe Perches, Szymon Janc,
	Richard Jones, Ayaz Abdulla, Sameer Nanda
In-Reply-To: <47650719c85908eb4dff05f5d243cc0e9e181748.1321384662.git.david.decotigny@google.com>

On Tue, 15 Nov 2011 11:25:36 -0800
David Decotigny <david.decotigny@google.com> wrote:

> From: Sameer Nanda <snanda@google.com>
> 
> This adds a new module parameter "debug_tx_timeout" to silence most
> debug messages in case of TX timeout. These messages don't provide a
> signal/noise ratio high enough for production systems and, with ~30kB
> logged each time, they tend to add to a cascade effect if the system
> is already under stress (memory pressure, disk, etc.).
> 
> By default, the parameter is clear, meaning that only a single warning
> will be reported.
> 
> 
> 
> Signed-off-by: David Decotigny <david.decotigny@google.com>

This (and the counter) should really be generic. I know it is more annoying
to have to solve a generic problem, but putting my distributor hat on,
any solution that is specific to only one driver is not a solution that
is useful.

The control of tx_timeout should be a property of the device, and the statistic
should be available for all devices. There is a problem, though: the existing
network device statistics structure is part of the ABI and can't grow. You can
add new statistics to netlink and sysfs as attributes, but not to the older
static APIs.


* Re: [RFC PATCH 1/2] powerpc: Remove duplicate cacheable_memcpy/memzero functions
From: Benjamin Herrenschmidt @ 2011-11-15 22:31 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Mike Frysinger, Ian Campbell, Eric Dumazet, Jiri Pirko, netdev,
	B04825, linux-kernel, Milton Miller, paul.gortmaker,
	Paul Mackerras, Anton Blanchard, Oleg Nesterov, scottwood,
	Andrew Morton, linuxppc-dev, David S. Miller, Jeff Kirsher
In-Reply-To: <1321324332-22964-2-git-send-email-Kyle.D.Moffett@boeing.com>

On Mon, 2011-11-14 at 21:32 -0500, Kyle Moffett wrote:
> These functions are only used from one place each.  If the cacheable_*
> versions really are more efficient, then those changes should be
> migrated into the common code instead.
> 
> NOTE: The old routines are just flat buggy on kernels that support
>       hardware with different cacheline sizes.
> 
> Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
> ---

Right, considering where those are used, I think we can safely remove
them. Thanks.

Ben.

>  arch/powerpc/include/asm/system.h    |    2 -
>  arch/powerpc/kernel/ppc_ksyms.c      |    2 -
>  arch/powerpc/lib/copy_32.S           |  127 ----------------------------------
>  arch/powerpc/mm/ppc_mmu_32.c         |    2 +-
>  drivers/net/ethernet/ibm/emac/core.c |   12 +---
>  5 files changed, 3 insertions(+), 142 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/system.h b/arch/powerpc/include/asm/system.h
> index e30a13d..25389d1 100644
> --- a/arch/powerpc/include/asm/system.h
> +++ b/arch/powerpc/include/asm/system.h
> @@ -189,8 +189,6 @@ static inline void flush_spe_to_thread(struct task_struct *t)
>  #endif
>  
>  extern int call_rtas(const char *, int, int, unsigned long *, ...);
> -extern void cacheable_memzero(void *p, unsigned int nb);
> -extern void *cacheable_memcpy(void *, const void *, unsigned int);
>  extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
>  extern void bad_page_fault(struct pt_regs *, unsigned long, int);
>  extern int die(const char *, struct pt_regs *, long);
> diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
> index d3114a7..acba8ce 100644
> --- a/arch/powerpc/kernel/ppc_ksyms.c
> +++ b/arch/powerpc/kernel/ppc_ksyms.c
> @@ -159,8 +159,6 @@ EXPORT_SYMBOL(screen_info);
>  #ifdef CONFIG_PPC32
>  EXPORT_SYMBOL(timer_interrupt);
>  EXPORT_SYMBOL(tb_ticks_per_jiffy);
> -EXPORT_SYMBOL(cacheable_memcpy);
> -EXPORT_SYMBOL(cacheable_memzero);
>  #endif
>  
>  #ifdef CONFIG_PPC32
> diff --git a/arch/powerpc/lib/copy_32.S b/arch/powerpc/lib/copy_32.S
> index 55f19f9..6813f80 100644
> --- a/arch/powerpc/lib/copy_32.S
> +++ b/arch/powerpc/lib/copy_32.S
> @@ -69,54 +69,6 @@ CACHELINE_BYTES = L1_CACHE_BYTES
>  LG_CACHELINE_BYTES = L1_CACHE_SHIFT
>  CACHELINE_MASK = (L1_CACHE_BYTES-1)
>  
> -/*
> - * Use dcbz on the complete cache lines in the destination
> - * to set them to zero.  This requires that the destination
> - * area is cacheable.  -- paulus
> - */
> -_GLOBAL(cacheable_memzero)
> -	mr	r5,r4
> -	li	r4,0
> -	addi	r6,r3,-4
> -	cmplwi	0,r5,4
> -	blt	7f
> -	stwu	r4,4(r6)
> -	beqlr
> -	andi.	r0,r6,3
> -	add	r5,r0,r5
> -	subf	r6,r0,r6
> -	clrlwi	r7,r6,32-LG_CACHELINE_BYTES
> -	add	r8,r7,r5
> -	srwi	r9,r8,LG_CACHELINE_BYTES
> -	addic.	r9,r9,-1	/* total number of complete cachelines */
> -	ble	2f
> -	xori	r0,r7,CACHELINE_MASK & ~3
> -	srwi.	r0,r0,2
> -	beq	3f
> -	mtctr	r0
> -4:	stwu	r4,4(r6)
> -	bdnz	4b
> -3:	mtctr	r9
> -	li	r7,4
> -10:	dcbz	r7,r6
> -	addi	r6,r6,CACHELINE_BYTES
> -	bdnz	10b
> -	clrlwi	r5,r8,32-LG_CACHELINE_BYTES
> -	addi	r5,r5,4
> -2:	srwi	r0,r5,2
> -	mtctr	r0
> -	bdz	6f
> -1:	stwu	r4,4(r6)
> -	bdnz	1b
> -6:	andi.	r5,r5,3
> -7:	cmpwi	0,r5,0
> -	beqlr
> -	mtctr	r5
> -	addi	r6,r6,3
> -8:	stbu	r4,1(r6)
> -	bdnz	8b
> -	blr
> -
>  _GLOBAL(memset)
>  	rlwimi	r4,r4,8,16,23
>  	rlwimi	r4,r4,16,0,15
> @@ -142,85 +94,6 @@ _GLOBAL(memset)
>  	bdnz	8b
>  	blr
>  
> -/*
> - * This version uses dcbz on the complete cache lines in the
> - * destination area to reduce memory traffic.  This requires that
> - * the destination area is cacheable.
> - * We only use this version if the source and dest don't overlap.
> - * -- paulus.
> - */
> -_GLOBAL(cacheable_memcpy)
> -	add	r7,r3,r5		/* test if the src & dst overlap */
> -	add	r8,r4,r5
> -	cmplw	0,r4,r7
> -	cmplw	1,r3,r8
> -	crand	0,0,4			/* cr0.lt &= cr1.lt */
> -	blt	memcpy			/* if regions overlap */
> -
> -	addi	r4,r4,-4
> -	addi	r6,r3,-4
> -	neg	r0,r3
> -	andi.	r0,r0,CACHELINE_MASK	/* # bytes to start of cache line */
> -	beq	58f
> -
> -	cmplw	0,r5,r0			/* is this more than total to do? */
> -	blt	63f			/* if not much to do */
> -	andi.	r8,r0,3			/* get it word-aligned first */
> -	subf	r5,r0,r5
> -	mtctr	r8
> -	beq+	61f
> -70:	lbz	r9,4(r4)		/* do some bytes */
> -	stb	r9,4(r6)
> -	addi	r4,r4,1
> -	addi	r6,r6,1
> -	bdnz	70b
> -61:	srwi.	r0,r0,2
> -	mtctr	r0
> -	beq	58f
> -72:	lwzu	r9,4(r4)		/* do some words */
> -	stwu	r9,4(r6)
> -	bdnz	72b
> -
> -58:	srwi.	r0,r5,LG_CACHELINE_BYTES /* # complete cachelines */
> -	clrlwi	r5,r5,32-LG_CACHELINE_BYTES
> -	li	r11,4
> -	mtctr	r0
> -	beq	63f
> -53:
> -	dcbz	r11,r6
> -	COPY_16_BYTES
> -#if L1_CACHE_BYTES >= 32
> -	COPY_16_BYTES
> -#if L1_CACHE_BYTES >= 64
> -	COPY_16_BYTES
> -	COPY_16_BYTES
> -#if L1_CACHE_BYTES >= 128
> -	COPY_16_BYTES
> -	COPY_16_BYTES
> -	COPY_16_BYTES
> -	COPY_16_BYTES
> -#endif
> -#endif
> -#endif
> -	bdnz	53b
> -
> -63:	srwi.	r0,r5,2
> -	mtctr	r0
> -	beq	64f
> -30:	lwzu	r0,4(r4)
> -	stwu	r0,4(r6)
> -	bdnz	30b
> -
> -64:	andi.	r0,r5,3
> -	mtctr	r0
> -	beq+	65f
> -40:	lbz	r0,4(r4)
> -	stb	r0,4(r6)
> -	addi	r4,r4,1
> -	addi	r6,r6,1
> -	bdnz	40b
> -65:	blr
> -
>  _GLOBAL(memmove)
>  	cmplw	0,r3,r4
>  	bgt	backwards_memcpy
> diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c
> index 11571e1..9f16b9f 100644
> --- a/arch/powerpc/mm/ppc_mmu_32.c
> +++ b/arch/powerpc/mm/ppc_mmu_32.c
> @@ -224,7 +224,7 @@ void __init MMU_init_hw(void)
>  	 */
>  	if ( ppc_md.progress ) ppc_md.progress("hash:find piece", 0x322);
>  	Hash = __va(memblock_alloc(Hash_size, Hash_size));
> -	cacheable_memzero(Hash, Hash_size);
> +	memset(Hash, 0, Hash_size);
>  	_SDR1 = __pa(Hash) | SDR1_LOW_BITS;
>  
>  	Hash_end = (struct hash_pte *) ((unsigned long)Hash + Hash_size);
> diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
> index ed79b2d..be214ad 100644
> --- a/drivers/net/ethernet/ibm/emac/core.c
> +++ b/drivers/net/ethernet/ibm/emac/core.c
> @@ -77,13 +77,6 @@ MODULE_AUTHOR
>      ("Eugene Surovegin <eugene.surovegin@zultys.com> or <ebs@ebshome.net>");
>  MODULE_LICENSE("GPL");
>  
> -/*
> - * PPC64 doesn't (yet) have a cacheable_memcpy
> - */
> -#ifdef CONFIG_PPC64
> -#define cacheable_memcpy(d,s,n) memcpy((d),(s),(n))
> -#endif
> -
>  /* minimum number of free TX descriptors required to wake up TX process */
>  #define EMAC_TX_WAKEUP_THRESH		(NUM_TX_BUFF / 4)
>  
> @@ -1637,7 +1630,7 @@ static inline int emac_rx_sg_append(struct emac_instance *dev, int slot)
>  			dev_kfree_skb(dev->rx_sg_skb);
>  			dev->rx_sg_skb = NULL;
>  		} else {
> -			cacheable_memcpy(skb_tail_pointer(dev->rx_sg_skb),
> +			memcpy(skb_tail_pointer(dev->rx_sg_skb),
>  					 dev->rx_skb[slot]->data, len);
>  			skb_put(dev->rx_sg_skb, len);
>  			emac_recycle_rx_skb(dev, slot, len);
> @@ -1694,8 +1687,7 @@ static int emac_poll_rx(void *param, int budget)
>  				goto oom;
>  
>  			skb_reserve(copy_skb, EMAC_RX_SKB_HEADROOM + 2);
> -			cacheable_memcpy(copy_skb->data - 2, skb->data - 2,
> -					 len + 2);
> +			memcpy(copy_skb->data - 2, skb->data - 2, len + 2);
>  			emac_recycle_rx_skb(dev, slot, len);
>  			skb = copy_skb;
>  		} else if (unlikely(emac_alloc_rx_skb(dev, slot, GFP_ATOMIC)))


* Re: [PATCH net-next v4 4/8] forcedeth: expose module parameters in /sys/module
From: Stephen Hemminger @ 2011-11-15 22:32 UTC (permalink / raw)
  To: David Decotigny
  Cc: netdev, linux-kernel, David S. Miller, Ian Campbell, Eric Dumazet,
	Jeff Kirsher, Ben Hutchings, Jiri Pirko, Joe Perches, Szymon Janc,
	Richard Jones, Ayaz Abdulla
In-Reply-To: <558f3ff3d373b1cdcbebebe842816b3c91438728.1321384662.git.david.decotigny@google.com>

On Tue, 15 Nov 2011 11:25:37 -0800
David Decotigny <david.decotigny@google.com> wrote:

> +module_param(optimization_mode, int, S_IRUGO);
>  MODULE_PARM_DESC(optimization_mode, "In throughput mode (0), every tx & rx packet will generate an interrupt. In CPU mode (1), interrupts are controlled by a timer. In dynamic mode (2), the mode toggles between throughput and CPU mode based on network load.");

The original developer (or the marketing data sheet) probably thought this was some
unique feature of the hardware, but most devices have this already.

This driver should just implement proper irq coalescing control via ethtool
and get rid of the silly module parameter.


* Re: [PATCH net-next v4 4/8] forcedeth: expose module parameters in /sys/module
From: Stephen Hemminger @ 2011-11-15 22:33 UTC (permalink / raw)
  To: David Decotigny
  Cc: netdev, linux-kernel, David S. Miller, Ian Campbell, Eric Dumazet,
	Jeff Kirsher, Ben Hutchings, Jiri Pirko, Joe Perches, Szymon Janc,
	Richard Jones, Ayaz Abdulla
In-Reply-To: <558f3ff3d373b1cdcbebebe842816b3c91438728.1321384662.git.david.decotigny@google.com>

On Tue, 15 Nov 2011 11:25:37 -0800
David Decotigny <david.decotigny@google.com> wrote:

> +module_param(msi, int, S_IRUGO);
>  MODULE_PARM_DESC(msi, "MSI interrupts are enabled by setting to 1 and disabled by setting to 0.");
> -module_param(msix, int, 0);
> +module_param(msix, int, S_IRUGO);
>  MODULE_PARM_DESC(msix, "MSIX interrupts are enabled by setting to 1 and disabled by setting to 0.");
> -module_param(dma_64bit, int, 0);
> +module_param(dma_64bit, int, S_IRUGO);

Once again, these attributes are visible through other means: /proc/interrupts for MSI,
and the NETIF_F_HIGHDMA feature flag for 64-bit DMA. They shouldn't be module parameters.
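
As an illustration of the /proc/interrupts route, a small sketch that counts
the MSI or MSI-X vectors listed for a device. The function names are
hypothetical, and the match on the substring "MSI" is an assumption: the
interrupt chip name printed in /proc/interrupts (for example "PCI-MSI-edge")
varies between kernel versions.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if one line of /proc/interrupts describes an MSI or MSI-X
 * vector and mentions the device name `dev`, else 0.
 */
int line_is_msi_for(const char *line, const char *dev)
{
	return strstr(line, "MSI") != NULL && strstr(line, dev) != NULL;
}

/*
 * Count matching lines in the file at `path` (normally
 * "/proc/interrupts").  Returns -1 if the file cannot be opened.
 * Lines longer than the buffer are read in pieces, so this is only
 * a rough check on machines with very many CPUs.
 */
int count_msi_vectors(const char *path, const char *dev)
{
	char line[1024];
	int n = 0;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f))
		n += line_is_msi_for(line, dev);
	fclose(f);
	return n;
}
```

A result of 0 from count_msi_vectors("/proc/interrupts", "eth0") would suggest
the device is on a legacy interrupt line ("eth0" is only an example name).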


* Re: [PATCH net-next v4 6/8] forcedeth: account for dropped RX frames
From: David Decotigny @ 2011-11-15 22:35 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: netdev, linux-kernel, David S. Miller, Ian Campbell, Eric Dumazet,
	Jeff Kirsher, Ben Hutchings, Jiri Pirko, Joe Perches, Szymon Janc,
	Richard Jones, Ayaz Abdulla
In-Reply-To: <20111115142130.796d52c3@s6510.linuxnetplumber.net>

Hello,

On Tue, Nov 15, 2011 at 2:21 PM, Stephen Hemminger
<shemminger@vyatta.com> wrote:
> There is already an rx_dropped statistic in netdevice structure,
> why do you need your own in ethtool?

Sorry, I think my commit message was (once again) inaccurate. This
commit doesn't really "add" the stat; instead, it adds code to update
the standard stats (struct rtnl_link_stats64).

Regards,


* 3.1.0 rtl8169_open oops.
From: Dave Jones @ 2011-11-15 22:37 UTC (permalink / raw)
  To: netdev; +Cc: kernel-team, romieu, hayeswang

Just had this reported by a user who hit it while installing Fedora 16.
(https://bugzilla.redhat.com/show_bug.cgi?id=753078)


BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffffa002ac0f>] rtl8169_open+0x34e/0x684 [r8169]
PGD 1f9f91067 PUD 210bad067 PMD 0 
Oops: 0000 [#1] SMP 
CPU 1 
Modules linked in: iscsi_ibft iscsi_boot_sysfs pcspkr edd iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi cramfs nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core r8169 mii video mxm_wmi wmi squashfs

Pid: 850, comm: NetworkManager Not tainted 3.1.0-7.fc16.x86_64 #1 Gigabyte Technology Co., Ltd. Z68A-D3H-B3/Z68A-D3H-B3
RIP: 0010:[<ffffffffa002ac0f>]  [<ffffffffa002ac0f>] rtl8169_open+0x34e/0x684 [r8169]
RSP: 0018:ffff8801fb7cb7a8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff88020fc6a000 RCX: 000000000000262b
RDX: 0000000000000af4 RSI: 0000000000000000 RDI: ffff88020fc6a000
RBP: ffff8801fb7cb818 R08: ffffea000820f1c0 R09: 0000000000543ad0
R10: 0000000000000000 R11: 0000000000012f80 R12: ffff88020fc6a740
R13: ffff880210a08280 R14: ffff88020676d9c0 R15: ffffffffa002c0e5
FS:  00007f629e776800(0000) GS:ffff88021f440000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000020fe0e000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process NetworkManager (pid: 850, threadinfo ffff8801fb7ca000, task ffff880207671730)
Stack:
 ffff8801fb7cb7e8 00000000dc61a000 ffff880211bef000 ffff88007f85e000
 ffffc90005cd4000 0000000000000ff0 ffff880211bef090 ffff880211bef090
 ffff8801fb7cb7f8 ffff88020fc6a000 ffffffffa002ca40 0000000000001000
Call Trace:
 [<ffffffff813d995c>] __dev_open+0x91/0xbf
 [<ffffffff813d9b90>] __dev_change_flags+0xab/0x12f
 [<ffffffff813d9c8d>] dev_change_flags+0x1e/0x54
 [<ffffffff813e42ac>] do_setlink+0x2b0/0x779
 [<ffffffff811ddb72>] ? avc_has_perm_flags+0x61/0x7a
 [<ffffffff813e49db>] rtnl_setlink+0xcd/0xed
 [<ffffffff811e3300>] ? selinux_netlink_recv.part.22+0x3a/0x9e
 [<ffffffff813e4d37>] rtnetlink_rcv_msg+0x23b/0x251
 [<ffffffff813e4afc>] ? __rtnl_unlock+0x17/0x17
 [<ffffffff813f8ca7>] netlink_rcv_skb+0x42/0x8d
 [<ffffffff813e3e88>] rtnetlink_rcv+0x26/0x2d
 [<ffffffff813f87b1>] netlink_unicast+0xec/0x156
 [<ffffffff813f8a9b>] netlink_sendmsg+0x280/0x2b8
 [<ffffffff813c62f7>] sock_sendmsg+0xe6/0x109
 [<ffffffff81450265>] ? scm_destroy+0x2b/0x4c
 [<ffffffff81044023>] ? should_resched+0xe/0x2d
 [<ffffffff814b4685>] ? _cond_resched+0xe/0x22
 [<ffffffff81044023>] ? should_resched+0xe/0x2d
 [<ffffffff813d0287>] ? copy_from_user+0x2f/0x31
 [<ffffffff813d0672>] ? verify_iovec+0x52/0xa4
 [<ffffffff813c65dc>] __sys_sendmsg+0x213/0x2ba
 [<ffffffff813c636f>] ? fput_light+0x12/0x14
 [<ffffffff813c81dd>] sys_sendmsg+0x42/0x60
 [<ffffffff814bc482>] system_call_fastpath+0x16/0x1b
Code: c6 d2 2b e1 85 c0 41 89 c6 0f 88 e4 01 00 00 4d 8b 75 00 48 8b bb 50 07 00 00 49 8b 16 49 8b 76 08 48 83 fa 03 0f 86 a8 00 00 00 
RIP  [<ffffffffa002ac0f>] rtl8169_open+0x34e/0x684 [r8169]
 RSP <ffff8801fb7cb7a8>
CR2: 0000000000000000
---[ end trace a9088da3782901d6 ]---


(gdb) list *(rtl8169_open)+0x34e
0x4c0f is in rtl8169_open (drivers/net/r8169.c:1881).
1876		bool rc = false;
1877	
1878		if (fw->size < FW_OPCODE_SIZE)
1879			goto out;
1880	
1881		if (!fw_info->magic) {
1882			size_t i, size, start;
1883			u8 checksum = 0;
1884	
1885			if (fw->size < sizeof(*fw_info))


	Dave
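
For readers decoding the trace above, a minimal sketch of how the
"symbol+offset/size" notation in the RIP line breaks down. The struct and
function names are hypothetical. Subtracting the offset from the faulting
address gives the start of the function, which is the same arithmetic gdb
applies to the "*(rtl8169_open)+0x34e" expression.

```c
#include <stdio.h>

/* Pieces of an oops location such as "rtl8169_open+0x34e/0x684". */
struct oops_sym {
	char name[64];		/* symbol name */
	unsigned long offset;	/* bytes from the start of the symbol */
	unsigned long size;	/* total size of the symbol in bytes */
};

/* Parse `s` into *out.  Returns 0 on success, -1 on malformed input. */
int parse_oops_sym(const char *s, struct oops_sym *out)
{
	/* %lx accepts the optional "0x" prefix used in oops output. */
	if (sscanf(s, "%63[^+]+%lx/%lx",
		   out->name, &out->offset, &out->size) != 3)
		return -1;
	return 0;
}

/* Start address of the function, given the faulting instruction pointer. */
unsigned long long func_start(unsigned long long ip,
			      const struct oops_sym *sym)
{
	return ip - sym->offset;
}
```

With the values from this report, the offset is 0x34e into a function of 0x684
bytes, and 0xffffffffa002ac0f - 0x34e puts the start of rtl8169_open at
0xffffffffa002a8c1 in the loaded module.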

