Netdev List


* Re: [PATCH 2/2] ipv4: Fix IPsec slowpath fragmentation problem
From: David Miller @ 2011-06-28  3:39 UTC (permalink / raw)
  To: steffen.klassert; +Cc: eric.dumazet, herbert, netdev
In-Reply-To: <20110622110537.GH6489@secunet.com>

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Wed, 22 Jun 2011 13:05:37 +0200

> ip_append_data() builds packets based on the mtu from dst_mtu(rt->dst.path).
> On IPsec the effective mtu is lower because we need to add the protocol
> headers and trailers later when we do the IPsec transformations. So after
> the IPsec transformations the packet might be too big, which leads to a
> slowpath fragmentation then. This patch fixes this by building the packets
> based on the lower IPsec mtu from dst_mtu(&rt->dst) and adapts the exthdr
> handling to this.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

Applied.

^ permalink raw reply

* Re: [PATCH 2/3] ipv4: Fix packet size calculation for IPsec packets in __ip_append_data
From: David Miller @ 2011-06-28  3:39 UTC (permalink / raw)
  To: steffen.klassert; +Cc: eric.dumazet, herbert, netdev
In-Reply-To: <20110622110219.GF6489@secunet.com>

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Wed, 22 Jun 2011 13:02:19 +0200

> In between I found the problem. In ip_setup_cork() we take the mtu on the
> base of dst_mtu(rt->dst.path) and assign it to cork->fragsize which is
> used as the mtu in __ip_append_data(). The path dst_entry is a routing
> dst_entry that does not take the IPsec header and trailer overhead into
> account. So if we build an IPsec packet based on this mtu it might be too
> big after the IPsec transformations are applied. If we take the actual
> IPsec mtu from dst_mtu(&rt->dst) and adapt the exthdr handling to this,
> it works as expected. So I'll send two patches, one that reverts Eric's
> patch and one that fixes the slowpath issue.

Thanks for doing this work Steffen.
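
The difference between the two MTU sources can be sketched in plain userspace C. Everything below is a hypothetical mock and not kernel code: `struct mock_dst`, `mock_dst_mtu()` and `fits_after_xfrm()` are invented names, and the 1500-byte path MTU used in the note afterwards is an assumption. Only the 1438 figure comes from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dst_entry: an mtu plus a pointer to the
 * underlying routing entry (the "path"). Not the kernel's struct. */
struct mock_dst {
	unsigned int mtu;
	const struct mock_dst *path;
};

/* Mock of dst_mtu(): report the mtu stored in the entry. */
static unsigned int mock_dst_mtu(const struct mock_dst *dst)
{
	assert(dst != NULL);
	return dst->mtu;
}

/* Would a packet built to fill build_mtu still fit on the wire after
 * the IPsec headers and trailers (overhead bytes) have been added? */
static bool fits_after_xfrm(unsigned int build_mtu, unsigned int overhead,
			    unsigned int wire_mtu)
{
	return build_mtu + overhead <= wire_mtu;
}
```

With an assumed path entry of 1500 and an IPsec entry of 1438, the overhead is 62 bytes. A packet built to `mock_dst_mtu(xfrm.path)` no longer fits once the overhead is added, while one built to `mock_dst_mtu(&xfrm)` fits exactly.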

> While reading through the code of __ip_append_data() I noticed that we
> might use ip_ufo_append_data() for packets that will be IPsec transformed
> later, is this ok? I don't know how ufo handling works, but I would guess
> that it expects an udp header and not an IPsec header as the packets
> transport header.

Indeed, it could be a real problem.

> The IPsec mtu is 1438 here, so the first packet is too big.
> xfrm4_tunnel_check_size() notices this and sends an ICMP_FRAG_NEEDED
> packet that announces a mtu of 1438 to the original sender of the ping
> packet. Unfortunately the sender is a local address, it's the IPsec
> tunnel entry point. So we update the mtu for this connection to 1438.
> Now, with the next packet xfrm_bundle_ok() notices that the path mtu has
> changed, so it subtracts the IPsec overhead from the mtu a second time
> and we end up with a mtu of 1374. This game goes until we reach a minimal
> mtu of 494.
> 
> Unfortunately I don't know how to fix this. Any ideas?

If the generic PMTU handling in net/ipv4/route.c is adjusting the MTU
for the IPSEC path's route, that would be the problem.

In the case of IPSEC encapsulation, tunnels, and similar, only the
tunnel or the XFRM layer should be processing the PMTU notification.
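
The shrinking effect Steffen describes can be modelled as a loop that subtracts the IPsec overhead once per round. This is only a sketch: `rounds_until_floor()` is a hypothetical helper, and treating the overhead as a constant is an assumption, since the real value depends on the transform and its padding.

```c
#include <assert.h>

/* Hypothetical model of the feedback loop: every time the locally
 * generated ICMP_FRAG_NEEDED lowers the route's PMTU, the IPsec
 * overhead is subtracted again. Returns the number of rounds needed to
 * reach the floor. */
static unsigned int rounds_until_floor(unsigned int mtu,
				       unsigned int overhead,
				       unsigned int floor_mtu)
{
	unsigned int rounds = 0;

	/* A zero overhead would never shrink the mtu. */
	assert(overhead > 0);

	while (mtu > floor_mtu) {
		/* Subtract the overhead again, clamping at the floor. */
		mtu = (mtu - floor_mtu > overhead) ? mtu - overhead : floor_mtu;
		rounds++;
	}
	return rounds;
}
```

Taking 64 as the per-round overhead (the difference between the quoted 1438 and 1374) and 494 as the floor, `rounds_until_floor(1438, 64, 494)` reaches the floor after 15 rounds in this model.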

^ permalink raw reply

* Re: [PATCH v2 1/2] ipv4: Fix packet size calculation in __ip_append_data
From: David Miller @ 2011-06-28  3:39 UTC (permalink / raw)
  To: steffen.klassert; +Cc: eric.dumazet, herbert, netdev
In-Reply-To: <20110622110437.GG6489@secunet.com>

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Wed, 22 Jun 2011 13:04:37 +0200

> Git commit 59104f06 (ip: take care of last fragment in ip_append_data)
> added a check to see if we exceed the mtu when we add trailer_len.
> However, the mtu is already subtracted by the trailer length when the
> xfrm transformation bundles are set up. So IPsec packets with mtu
> size get fragmented, or if the DF bit is set the packets will not
> be sent even though they match the mtu perfectly fine. This patch
> actually reverts commit 59104f06.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

Applied.

^ permalink raw reply

* Re: [PATCH 3/9 v2] myri10ge: rework parity error check and cleanup
From: Jon Mason @ 2011-06-28  3:31 UTC (permalink / raw)
  To: Joe Perches; +Cc: davem, netdev, Andrew Gallatin
In-Reply-To: <1309227435.3344.20.camel@Joe-Laptop>

On Mon, Jun 27, 2011 at 9:17 PM, Joe Perches <joe@perches.com> wrote:
> On Mon, 2011-06-27 at 15:54 -0500, Jon Mason wrote:
>> Clean up watchdog reset code:
>>  - move code that checks for stuck slice to a common routine
>>  - unless there is a confirmed h/w fault, verify that a stuck
>>    slice is still stuck in the watchdog worker; if the slice is no
>>    longer stuck, abort the reset.
>>  - this removes an egregious 2000ms pause in the watchdog worker that
>>    was a diagnostic aid (to look for spurious resets) that snuck into
>>    production code.
>> v2 includes corrections from Ben Hutchings and Joe Perches
>
> Here's some more trivia:
>
>> diff --git a/drivers/net/myri10ge/myri10ge.c b/drivers/net/myri10ge/myri10ge.c
> []
>> @@ -3442,6 +3443,42 @@ static u32 myri10ge_read_reboot(struct myri10ge_priv *mgp)
>>       return reboot;
>>  }
>>
>> +static void
>> +myri10ge_check_slice(struct myri10ge_slice_state *ss, int *reset_needed,
>> +                  int *busy_slice_cnt, u32 rx_pause_cnt)
>> +{
> []
>> +             /* nic seems like it might be stuck.. */
>> +             if (rx_pause_cnt != mgp->watchdog_pause) {
>> +                     if (net_ratelimit())
>> +                             netdev_warn(mgp->dev, "slice %d: TX paused, "
>> +                                         "check link partner\n", slice);
>
> I think this would be better if the format weren't split.
>
>                                netdev_warn(mgp->dev, "slice %d: TX paused, check link partner\n",
>                                            slice);
> or
>                                netdev_warn(mgp->dev,
>                                            "slice %d: TX paused, check link partner\n",
>                                            slice);
> or if you really must split it because exceeding 80 columns
> makes you itchy:
>                                netdev_warn(mgp->dev, "slice %d: "
>                                            "TX paused, check link partner\n",
>                                            slice);

Naa, I prefer it this way.

>> @@ -3465,8 +3504,7 @@ static void myri10ge_watchdog(struct work_struct *work)
>>                * For now, just report it */
>>               reboot = myri10ge_read_reboot(mgp);
>>               netdev_err(mgp->dev, "NIC rebooted (0x%x),%s resetting\n",
>> -                        reboot,
>> -                        myri10ge_reset_recover ? "" : " not");
>> +                        reboot, myri10ge_reset_recover ? " " : " not");
>
> I think this was correct before you changed it.
>
> Maybe:
>                           reboot, myri10ge_reset_recover ? "" : " not");

Yes, I believe this was the intent.
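
The effect of the two candidate strings is easy to check in userspace. The sketch below reuses the format string from the quoted hunk; `format_reboot_msg()` and its `pad` parameter are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Format the message from the quoted hunk into buf. pad is the string
 * used when recovery is enabled: "" in the original code, " " in the
 * patch under review. Returns the length snprintf() reports. */
static int format_reboot_msg(char *buf, size_t len, unsigned int reboot,
			     int reset_recover, const char *pad)
{
	assert(buf != NULL && pad != NULL);
	return snprintf(buf, len, "NIC rebooted (0x%x),%s resetting\n",
			reboot, reset_recover ? pad : " not");
}
```

Because the format already has a space before "resetting", passing " " yields two consecutive spaces after the comma, while "" yields one.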

>
>
>

^ permalink raw reply

* Re: [PATCH 3/9 v2] myri10ge: rework parity error check and cleanup
From: Joe Perches @ 2011-06-28  2:17 UTC (permalink / raw)
  To: Jon Mason; +Cc: davem, netdev, Andrew Gallatin
In-Reply-To: <20110627205432.GA18978@myri.com>

On Mon, 2011-06-27 at 15:54 -0500, Jon Mason wrote:
> Clean up watchdog reset code:
>  - move code that checks for stuck slice to a common routine
>  - unless there is a confirmed h/w fault, verify that a stuck
>    slice is still stuck in the watchdog worker; if the slice is no
>    longer stuck, abort the reset.
>  - this removes an egregious 2000ms pause in the watchdog worker that
>    was a diagnostic aid (to look for spurious resets) that snuck into
>    production code.
> v2 includes corrections from Ben Hutchings and Joe Perches

Here's some more trivia:

> diff --git a/drivers/net/myri10ge/myri10ge.c b/drivers/net/myri10ge/myri10ge.c
[]
> @@ -3442,6 +3443,42 @@ static u32 myri10ge_read_reboot(struct myri10ge_priv *mgp)
>  	return reboot;
>  }
>  
> +static void
> +myri10ge_check_slice(struct myri10ge_slice_state *ss, int *reset_needed,
> +		     int *busy_slice_cnt, u32 rx_pause_cnt)
> +{
[]
> +		/* nic seems like it might be stuck.. */
> +		if (rx_pause_cnt != mgp->watchdog_pause) {
> +			if (net_ratelimit())
> +				netdev_warn(mgp->dev, "slice %d: TX paused, "
> +					    "check link partner\n", slice);

I think this would be better if the format weren't split.

				netdev_warn(mgp->dev, "slice %d: TX paused, check link partner\n",
					    slice);
or
				netdev_warn(mgp->dev,
					    "slice %d: TX paused, check link partner\n",
					    slice);
or if you really must split it because exceeding 80 columns
makes you itchy:
				netdev_warn(mgp->dev, "slice %d: "
					    "TX paused, check link partner\n",
					    slice);
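
For reference, the compiler concatenates adjacent string literals, so all of the spellings above produce the same bytes; the choice only affects how easy the message is to find with grep. A minimal userspace check, with hypothetical variable and function names:

```c
#include <assert.h>
#include <string.h>

/* The format string as written in the patch: two adjacent literals. */
static const char split_fmt[] = "slice %d: TX paused, "
				"check link partner\n";

/* The same format string on one line, as suggested in the review. */
static const char joined_fmt[] = "slice %d: TX paused, check link partner\n";

/* Returns 1 when both spellings produce the same bytes. */
static int formats_match(void)
{
	return sizeof(split_fmt) == sizeof(joined_fmt) &&
	       memcmp(split_fmt, joined_fmt, sizeof(joined_fmt)) == 0;
}

/* Aborts if the two spellings ever differ. */
static void check_formats(void)
{
	assert(formats_match());
}
```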

> @@ -3465,8 +3504,7 @@ static void myri10ge_watchdog(struct work_struct *work)
>  		 * For now, just report it */
>  		reboot = myri10ge_read_reboot(mgp);
>  		netdev_err(mgp->dev, "NIC rebooted (0x%x),%s resetting\n",
> -			   reboot,
> -			   myri10ge_reset_recover ? "" : " not");
> +			   reboot, myri10ge_reset_recover ? " " : " not");

I think this was correct before you changed it.

Maybe:
			   reboot, myri10ge_reset_recover ? "" : " not");



^ permalink raw reply

* Re: [PATCH v2] NET: AX88796: Tighten up Kconfig dependencies
From: David Miller @ 2011-06-28  1:38 UTC (permalink / raw)
  To: magnus.damm
  Cc: ralf, eric.y.miao, linux, ben-linux, lethal, jeff,
	linux-arm-kernel, linux-kernel, linux-sh, netdev, linux-mips
In-Reply-To: <BANLkTikDxsOJKpiJs0NpMXbjVOFMHL7RZw@mail.gmail.com>

From: Magnus Damm <magnus.damm@gmail.com>
Date: Tue, 28 Jun 2011 09:40:56 +0900

> As for SH and SH-Mobile ARM, unless explicitly requested we usually
> don't restrict our platform drivers. Allowing them to build on any
> system helps to catch compile errors.

I totally agree with Magnus, drivers should build on as many systems
as possible.  Even on those for which the hardware never appears.

Ralf, unless these drivers have unfixable build errors on MIPS I
do not want to add the new restrictions.

^ permalink raw reply

* RE: [PATCH] [net][bna] Fix call trace when interrupts are disabled while sleeping function kzalloc is called
From: Rasesh Mody @ 2011-06-28  0:51 UTC (permalink / raw)
  To: Shyam Iyer, netdev@vger.kernel.org
  Cc: ddutt@brocadel.com, Shyam Iyer, Jing Huang
In-Reply-To: <1309206092-23064-1-git-send-email-shyam_iyer@dell.com>


>From: Shyam Iyer [mailto:shyam.iyer.t@gmail.com]
>Sent: Monday, June 27, 2011 1:22 PM
>Subject: [PATCH] [net][bna] Fix call trace when interrupts are disabled
>while sleeping function kzalloc is called
>
>The kzalloc sleeps and disabling interrupts(spin_lock_irqsave) causes
>oops like the one.

Hi Shyam,

We are not calling any sleeping function while holding the lock in
bnad_mbox_irq_alloc(). How would your patch fix the call trace? Also,
can you tell us which conditions led to this trace?

Thanks,
--Rasesh

^ permalink raw reply

* Re: [PATCH v2] NET: AX88796: Tighten up Kconfig dependencies
From: Magnus Damm @ 2011-06-28  0:40 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: David S. Miller, Eric Miao, Russell King, Ben Dooks, Paul Mundt,
	Jeff Garzik, linux-arm-kernel, linux-kernel, linux-sh, netdev,
	linux-mips
In-Reply-To: <20110627111259.GA13620@linux-mips.org>

Hi Ralf,

On Mon, Jun 27, 2011 at 8:13 PM, Ralf Baechle <ralf@linux-mips.org> wrote:
> In def47c5095d53814512bb0c62ec02dfdec769db1 [[netdrvr] Fix dependencies for
> ax88796 ne2k clone driver] the AX88796 driver got restricted to just be
> built for ARM and MIPS on the sole merit that it was written for some ARM
> systems and the driver had the misfortune to just build on MIPS, so MIPS was
> thrown into the dependency for good measure.  Later
> 8687991a734a67f1638782c968f46fff0f94bb1f [ax88796: add superh to kconfig
> dependencies] added SH but only one in-tree SH system actually has an
> AX88796.
>
> Tighten up dependencies by using an auxiliary config symbol
> HAS_NET_AX88796 which is selected only by the platforms that actually
> have or may have an AX88796.  This also means the driver won't be built
> anymore for any MIPS platform.
>
> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
> ---
> v2: fixed Sergei's complaints about the log message

I'm the one who added the SuperH bits a few years ago. Judging by the
text above it seems like you prefer not to build this driver for MIPS.
Which is totally fine with me.

As for SH and SH-Mobile ARM, unless explicitly requested we usually
don't restrict our platform drivers. Allowing them to build on any
system helps to catch compile errors. It also makes it possible to add
board support by simply adding platform data to the board file and
then updating the kconfig. Keeping the amount of code at the bare
minimum makes back porting rather easy too.

I'm not sure if the ax88796 driver does something non-standard to
require special symbols, but usually platform drivers are rather clean
and can be compiled for any architecture or platform. At least in
theory. =)

Cheers,

/ magnus

^ permalink raw reply

* Re: [PATCH 15/19 v2] tg3: remove unnecessary read of PCI_CAP_ID_EXP
From: Matt Carlson, Jon Mason @ 2011-06-27 23:33 UTC (permalink / raw)
  Cc: Matthew Carlson, Michael Chan, netdev@vger.kernel.org
In-Reply-To: <20110627225649.GA20786@kudzu.us>

On Mon, Jun 27, 2011 at 03:56:50PM -0700, Jon Mason wrote:
> The PCIE capability offset is saved during PCI bus walking.  Use the
> value from pci_dev instead of checking in the driver and saving it off
> to the driver specific structure.  It will remove an unnecessary search
> in the PCI configuration space if this value is referenced instead of
> reacquiring it.
> 
> v2 of the patch re-adds the PCI_EXPRESS flag and adds comments
> describing why it is necessary.
> 
> Signed-off-by: Jon Mason <jdmason@kudzu.us>
> ---
>  drivers/net/tg3.c |   25 ++++++++++++++-----------
>  drivers/net/tg3.h |    5 +----
>  2 files changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
> index 97cd02d..a555efd 100644
> --- a/drivers/net/tg3.c
> +++ b/drivers/net/tg3.c
> @@ -2679,11 +2679,11 @@ static int tg3_power_down_prepare(struct tg3 *tp)
>  		u16 lnkctl;
>  
>  		pci_read_config_word(tp->pdev,
> -				     tp->pcie_cap + PCI_EXP_LNKCTL,
> +				     tp->pdev->pcie_cap + PCI_EXP_LNKCTL,

Sorry to be a stickler, but can we convert all occurrences of
'tp->pdev->pcie_cap' to pci_pcie_cap(tp->pdev)?  If the PCI layer is
taking control of that variable, the driver shouldn't be accessing it
directly if it can help it.
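
A userspace sketch of the accessor pattern being asked for. The `mock_` types and functions are hypothetical stand-ins modelled on pci_pcie_cap() and pci_is_pcie(); they are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct pci_dev. Only the field
 * discussed here is modelled. */
struct mock_pci_dev {
	int pcie_cap;	/* offset of the PCIe capability, 0 if absent */
};

/* Mock of the pci_pcie_cap() accessor: hides the field behind a call so
 * that drivers do not depend on how the PCI layer stores it. */
static int mock_pci_pcie_cap(const struct mock_pci_dev *dev)
{
	assert(dev != NULL);
	return dev->pcie_cap;
}

/* Mock of pci_is_pcie(): a device is PCIe if it has the capability. */
static int mock_pci_is_pcie(const struct mock_pci_dev *dev)
{
	return mock_pci_pcie_cap(dev) != 0;
}

/* Compute a config-space register address the way the driver would,
 * going through the accessor instead of reading dev->pcie_cap. */
static int mock_pcie_reg(const struct mock_pci_dev *dev, int reg_offset)
{
	return mock_pci_pcie_cap(dev) + reg_offset;
}
```

If the PCI layer later changes how the offset is stored, only the accessor has to change; callers such as `mock_pcie_reg()` stay untouched.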

>  				     &lnkctl);
>  		lnkctl |= PCI_EXP_LNKCTL_CLKREQ_EN;
>  		pci_write_config_word(tp->pdev,
> -				      tp->pcie_cap + PCI_EXP_LNKCTL,
> +				      tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
>  				      lnkctl);
>  	}
>  
> @@ -3485,7 +3485,7 @@ relink:
>  		u16 oldlnkctl, newlnkctl;
>  
>  		pci_read_config_word(tp->pdev,
> -				     tp->pcie_cap + PCI_EXP_LNKCTL,
> +				     tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
>  				     &oldlnkctl);
>  		if (tp->link_config.active_speed == SPEED_100 ||
>  		    tp->link_config.active_speed == SPEED_10)
> @@ -3494,7 +3494,7 @@ relink:
>  			newlnkctl = oldlnkctl | PCI_EXP_LNKCTL_CLKREQ_EN;
>  		if (newlnkctl != oldlnkctl)
>  			pci_write_config_word(tp->pdev,
> -					      tp->pcie_cap + PCI_EXP_LNKCTL,
> +					      tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
>  					      newlnkctl);
>  	}
>  
> @@ -7226,7 +7226,7 @@ static int tg3_chip_reset(struct tg3 *tp)
>  
>  	udelay(120);
>  
> -	if (tg3_flag(tp, PCI_EXPRESS) && tp->pcie_cap) {
> +	if (tg3_flag(tp, PCI_EXPRESS) && pci_pcie_cap(tp->pdev)) {
>  		u16 val16;
>  
>  		if (tp->pci_chip_rev_id == CHIPREV_ID_5750_A0) {
> @@ -7244,7 +7244,7 @@ static int tg3_chip_reset(struct tg3 *tp)
>  
>  		/* Clear the "no snoop" and "relaxed ordering" bits. */
>  		pci_read_config_word(tp->pdev,
> -				     tp->pcie_cap + PCI_EXP_DEVCTL,
> +				     tp->pdev->pcie_cap + PCI_EXP_DEVCTL,
>  				     &val16);
>  		val16 &= ~(PCI_EXP_DEVCTL_RELAX_EN |
>  			   PCI_EXP_DEVCTL_NOSNOOP_EN);
> @@ -7255,14 +7255,14 @@ static int tg3_chip_reset(struct tg3 *tp)
>  		if (!tg3_flag(tp, CPMU_PRESENT))
>  			val16 &= ~PCI_EXP_DEVCTL_PAYLOAD;
>  		pci_write_config_word(tp->pdev,
> -				      tp->pcie_cap + PCI_EXP_DEVCTL,
> +				      tp->pdev->pcie_cap + PCI_EXP_DEVCTL,
>  				      val16);
>  
>  		pcie_set_readrq(tp->pdev, tp->pcie_readrq);
>  
>  		/* Clear error status */
>  		pci_write_config_word(tp->pdev,
> -				      tp->pcie_cap + PCI_EXP_DEVSTA,
> +				      tp->pdev->pcie_cap + PCI_EXP_DEVSTA,
>  				      PCI_EXP_DEVSTA_CED |
>  				      PCI_EXP_DEVSTA_NFED |
>  				      PCI_EXP_DEVSTA_FED |
> @@ -13777,8 +13777,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
>  	pci_read_config_dword(tp->pdev, TG3PCI_PCISTATE,
>  			      &pci_state_reg);
>  
> -	tp->pcie_cap = pci_find_capability(tp->pdev, PCI_CAP_ID_EXP);
> -	if (tp->pcie_cap != 0) {
> +	if (pci_is_pcie(tp->pdev)) {
>  		u16 lnkctl;
>  
>  		tg3_flag_set(tp, PCI_EXPRESS);
> @@ -13791,7 +13790,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
>  		pcie_set_readrq(tp->pdev, tp->pcie_readrq);
>  
>  		pci_read_config_word(tp->pdev,
> -				     tp->pcie_cap + PCI_EXP_LNKCTL,
> +				     tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
>  				     &lnkctl);
>  		if (lnkctl & PCI_EXP_LNKCTL_CLKREQ_EN) {
>  			if (GET_ASIC_REV(tp->pci_chip_rev_id) ==
> @@ -13808,6 +13807,10 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
>  			tg3_flag_set(tp, L1PLLPD_EN);
>  		}
>  	} else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5785) {
> +		/* BCM5785 devices are effectively PCIe devices, and should
> +		 * follow PCIe codepaths, but do not have a PCIe capabilities
> +		 * section.
> +		*/
>  		tg3_flag_set(tp, PCI_EXPRESS);
>  	} else if (!tg3_flag(tp, 5705_PLUS) ||
>  		   tg3_flag(tp, 5780_CLASS)) {
> diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
> index bedc3b4..5f250ae 100644
> --- a/drivers/net/tg3.h
> +++ b/drivers/net/tg3.h
> @@ -2857,7 +2857,7 @@ enum TG3_FLAGS {
>  	TG3_FLAG_IS_5788,
>  	TG3_FLAG_MAX_RXPEND_64,
>  	TG3_FLAG_TSO_CAPABLE,
> -	TG3_FLAG_PCI_EXPRESS,
> +	TG3_FLAG_PCI_EXPRESS, /* BCM5785 + pci_is_pcie() */
>  	TG3_FLAG_ASF_NEW_HANDSHAKE,
>  	TG3_FLAG_HW_AUTONEG,
>  	TG3_FLAG_IS_NIC,
> @@ -3022,10 +3022,7 @@ struct tg3 {
>  
>  	int				pm_cap;
>  	int				msi_cap;
> -	union {
>  	int				pcix_cap;
> -	int				pcie_cap;
> -	};
>  	int				pcie_readrq;
>  
>  	struct mii_bus			*mdio_bus;
> -- 
> 1.7.5.4
> 
> 


^ permalink raw reply

* Re: [PATCH] [net][bna] Fix call trace when interrupts are disabled while sleeping function kzalloc is called
From: David Miller @ 2011-06-27 23:07 UTC (permalink / raw)
  To: shyam.iyer.t; +Cc: netdev, rmody, ddutt, shyam_iyer
In-Reply-To: <1309206092-23064-1-git-send-email-shyam_iyer@dell.com>

From: Shyam Iyer <shyam.iyer.t@gmail.com>
Date: Mon, 27 Jun 2011 16:21:32 -0400

> The kzalloc sleeps and disabling interrupts(spin_lock_irqsave)
> causes oops like the one.

What if ->cfg_flags changes while you have dropped the lock?

If the lock doesn't protect those flags, what was it being
taken for in the first place?
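
One common way to address this concern is to do the sleeping allocation with the lock dropped and then re-check the protected state after taking the lock again. The sketch below is generic userspace C, not the bna code: the patch body is not quoted in this thread, and `struct mock_dev`, `MOCK_FLAG_ENABLED` and the `mock_` helpers are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device state. "locked" stands in for a spinlock so the
 * sketch stays single-threaded and self-contained. */
struct mock_dev {
	int locked;
	unsigned int cfg_flags;
	void *mbox_buf;
};

#define MOCK_FLAG_ENABLED 0x1u

static void mock_lock(struct mock_dev *d)
{
	assert(!d->locked);
	d->locked = 1;
}

static void mock_unlock(struct mock_dev *d)
{
	assert(d->locked);
	d->locked = 0;
}

/* Allocate with the lock dropped (the allocation may sleep), then take
 * the lock and read cfg_flags before acting on it, because the flags
 * may have changed while the lock was not held. Returns 0 on success,
 * -1 if the allocation failed or the flag is no longer set. */
static int mock_setup_mbox(struct mock_dev *d, size_t len)
{
	void *buf = calloc(1, len);	/* stands in for a sleeping kzalloc */

	if (!buf)
		return -1;

	mock_lock(d);
	if (!(d->cfg_flags & MOCK_FLAG_ENABLED)) {
		mock_unlock(d);
		free(buf);
		return -1;
	}
	d->mbox_buf = buf;
	mock_unlock(d);
	return 0;
}
```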

^ permalink raw reply

* r8169 :  always copying the rx buffer to new skb
From: John Lumby @ 2011-06-27 22:54 UTC (permalink / raw)
  To: netdev

Summary of some results since previous posts in April:

Previously I suggested re-introducing the rx_copybreak parameter to provide the option of un-hooking the receive buffer rather than copying it, in order to save the overhead of the memcpy, which shows as the highest tick-count in oprofile. All buffer memcpy'ing is done on CPU0 on my system.

I then found that, without the memcpy, the driver and net stack consume other overhead elsewhere, particularly in too-frequent polling/interrupting.

Eric D pointed out that:
            Doing the copy of data and building an exact size skb has benefit of
            providing 'right' skb->truesize (might reduce RCVBUF contention and
            avoid backlog drops) and already cached data (hot in cpu caches).
            Next 'copy' is almost free (L1 cache access)

There was also some discussion off-line about using larger MTU size.

Since then,  I have explored some ideas for dealing with the too-frequent polling/interrupting and the cache aspect,  with some success on the first and no success on the second.   In summary of results:
   .  With MTU of 1500 and "normal" workload,   I see an improvement of between 4% - 6% in throughput,  depending on kernel release and kernel .config.    Specifically,  with the heaviest workload and most tuned kernel .config:
       no changes  -  ~  1440 Megabits/sec bi-directional
     with changes  -  ~  1530 Megabits/sec bi-directional
   (same .config for each of course)
      All 4 logical CPUs of my Atom 330 (2 physical cores x 2 SMT threads per core) were at 100% both without and with the changes for this workload, but with very different profiles.
      These throughput numbers are higher than I reported before,   and % improvement lower,  because of the tuning to the base system and workload.

   .  With MTU of 6144, I see a more dramatic effect: the same workload runs at 1725 Megabits/sec on both kernels (which may be a practical hardware limit on one of the adapters, since it hits exactly this rate almost every time no matter what else I change), but overall CPU utilization drops from ~80% without changes to ~60% with changes. I feel this is significant, but of course its use is limited to networks that can support this segment size everywhere.

Notes on the changes:

 Too-frequent polling/interrupting:
 These two are highly interrelated by NAPI.
     Too-frequent polling:
         The NAPI weight is a double-duty parameter,  controlling both the dynamic choice between continuing a NAPI polling session versus leaving and resuming interrupts,  and also the maximum number of receive buffers to be passed up per poll.   It's also not configurable (set to 64).    I split it into two numbers,  one for each purpose,  and made them configurable,  and tried tuning them.    A good value for the poll/int choice was 16,   while the max-size number was best left at 64.    This helps a bit,  but polling is still too frequent.
         I then made an interface up into the softirqd to let the driver tell the softirqd :
              "keep my napi session alive but sched_yield to other runnable processes before running another poll"
         I added a check to __do_softirq so that if the *only* pending softirq is NET_RX_SOFTIRQ and the rx_action routine requested this, it exits and tells the daemon to yield.
         I borrowed a bit in local_softirq_pending for this.  This helped a lot for certain workloads.    I saw considerable drop in system CPU% on CPU0 and higher user CPU% there.
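
One reading of the weight split described above, as a userspace sketch. The struct and function names are hypothetical and this is not the patch itself; the values 16 and 64 in the note afterwards are the ones reported in this message.

```c
#include <assert.h>

/* Hypothetical tunables, mirroring the two roles of the NAPI weight. */
struct mock_napi_tuning {
	int stay_threshold;	/* keep polling if a poll handled >= this */
	int max_per_poll;	/* upper bound on packets passed up per poll */
};

/* How many packets one poll hands to the stack, given the backlog. */
static int mock_poll_count(const struct mock_napi_tuning *t, int backlog)
{
	assert(t->max_per_poll > 0);
	return backlog < t->max_per_poll ? backlog : t->max_per_poll;
}

/* After a poll that handled "done" packets: 1 = stay in polling mode,
 * 0 = leave polling and re-enable interrupts. */
static int mock_keep_polling(const struct mock_napi_tuning *t, int done)
{
	return done >= t->stay_threshold;
}
```

With `stay_threshold = 16` and `max_per_poll = 64`, a poll that handles 20 packets stays in polling mode, whereas a single combined weight of 64 would have left polling because fewer than 64 packets were handled.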

     Too-frequent interrupting:
         I made use of the r8169's Interrupt Mitigation feature, setting it to the maximum multiplied by a factor between 0 and 1 that varies inversely with tx queue size (large queue size, short delay, and vice versa). This also helped a lot. The current driver sets these registers only once per "up" session, during rtl8169_open of the NIC, but Hayes explained that the registers must be set each time interrupts are enabled. This is the one case where (I think) I corrected a bug present in the current driver: harmless, but not doing what was intended.
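
The scaling rule can be sketched as follows. The linear shape is an assumption: the description only says the factor lies between 0 and 1 and falls as the tx queue grows. `mock_mitigation_delay()` is a hypothetical helper and does not touch any r8169 register.

```c
#include <assert.h>

/* Hypothetical scaling: delay = max_delay * (1 - qlen / qcap), done in
 * integer arithmetic. An empty queue gets the full delay, a full queue
 * gets none. */
static unsigned int mock_mitigation_delay(unsigned int max_delay,
					  unsigned int qlen,
					  unsigned int qcap)
{
	assert(qcap > 0);
	if (qlen > qcap)
		qlen = qcap;	/* clamp so the factor never goes negative */
	return (unsigned int)((unsigned long long)max_delay *
			      (qcap - qlen) / qcap);
}
```

Following Hayes' explanation above, a driver would recompute and rewrite this value each time it re-enables interrupts rather than once at open time.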

      The effect of these two changes was to reduce the rate of hardware interrupts down to less than 1/20 of before,  and also hold the polling rate down (around 4-5 packets per poll on average on a typical run,  sometimes much higher).

  memory and caching:
  Here I failed to achieve anything.    Based on Eric's point about memcpy giving a "free" next copy, I thought possibly memory prefetching might provide something equivalent.   Specifically,  prefetch the skb and its databuff immediately after un-dma'ing.
  For example,   with my changes and no memcpy,  I see eth_type_trans() high in oprofile tick score on CPU0.
  This small function does very little work but is the first (I think) to access a field in the skb->data buffer - the ethernet header.  Prefetching ought to do better than memcpy'ing since only one copy of the data will enter L1,  not two.    But my attempts at this achieved nothing or negative.
  Note  -  the current driver does issue a prefetch of the original buffer prior to the memcpy.  But,  on my system (atom CPUs),  gdb of the object file r8169.o indicates no prefetch instructions are generated,   only lea of the address to be prefetched.     I tried changing the prefetch call to an asm generated prefetcht0/prefetchnta instruction with disappointing results.   I noticed some discussion of memory prefetch in this list earlier and maybe it is not useful.

  I tried to explore Eric's other point about skb->truesize but ran out of time researching.     I guess my current results are negatively impacted by these memory and skb issues that Eric mentions,  but I could not find any answer.

  There was a question of how this changed driver handles memory pressure :
  Along with the rx_copybreak change,  I made the number of rx and tx ring buffers configurable and dynamically replenishable.    The changed driver can tolerate occasional or even bursty alloc failures without exposing any effects outside itself,    whereas the current driver drops packets.   However,  under extreme consecutive failures,  the changed driver will eventually run too low and stop completely,  whereas the current driver will (I assume) stay up.     I was unable to cause either of these in my tests.    Measurements with concurrent memory hogs confirmed this but did show heavy drop in throughput for the changed driver.

  I've tried these changes out on all kernel release levels from 2.6.36 to 3.0-rc3 and see roughly comparable deltas on all, but with slightly different tuning required to hit the optimum, and some variability on all after 2.6.37. 2.6.37 seemed to be slightly the "best". I'm not sure why, although I see some relevant changes to the scheduler between 2.6.37 and 2.6.38. There is also a strange effect with the old RTC in 3.0: I had to remove it from the kernel to get good results, whereas it was a module in the 2.6 levels (which I did not load for the tests). I don't need it on my system except for one ancient utility. I also found a major impact from iptables and cleared all tables for the tests. That is the one item that would normally be needed in a production setup that I turned off. The overhead of iptables presumably depends heavily on how many rules are in the filter chains. (I have rather a lot in INPUT.)

I don't plan to do any more on this but can provide my patch (currently one monolithic one based on DaveM 3.0.0-rc1netnext-110615) and detailed results if anyone wants.

Cheers,   John Lumby

^ permalink raw reply

* [PATCH 15/19 v2] tg3: remove unnecessary read of PCI_CAP_ID_EXP
From: Jon Mason @ 2011-06-27 22:56 UTC (permalink / raw)
  To: Matt Carlson; +Cc: Michael Chan, netdev
In-Reply-To: <1309196854-16232-1-git-send-email-jdmason@kudzu.us>

The PCIE capability offset is saved during PCI bus walking.  Use the
value from pci_dev instead of checking in the driver and saving it off
to the driver specific structure.  It will remove an unnecessary search
in the PCI configuration space if this value is referenced instead of
reacquiring it.

v2 of the patch re-adds the PCI_EXPRESS flag and adds comments
describing why it is necessary.

Signed-off-by: Jon Mason <jdmason@kudzu.us>
---
 drivers/net/tg3.c |   25 ++++++++++++++-----------
 drivers/net/tg3.h |    5 +----
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 97cd02d..a555efd 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -2679,11 +2679,11 @@ static int tg3_power_down_prepare(struct tg3 *tp)
 		u16 lnkctl;
 
 		pci_read_config_word(tp->pdev,
-				     tp->pcie_cap + PCI_EXP_LNKCTL,
+				     tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
 				     &lnkctl);
 		lnkctl |= PCI_EXP_LNKCTL_CLKREQ_EN;
 		pci_write_config_word(tp->pdev,
-				      tp->pcie_cap + PCI_EXP_LNKCTL,
+				      tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
 				      lnkctl);
 	}
 
@@ -3485,7 +3485,7 @@ relink:
 		u16 oldlnkctl, newlnkctl;
 
 		pci_read_config_word(tp->pdev,
-				     tp->pcie_cap + PCI_EXP_LNKCTL,
+				     tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
 				     &oldlnkctl);
 		if (tp->link_config.active_speed == SPEED_100 ||
 		    tp->link_config.active_speed == SPEED_10)
@@ -3494,7 +3494,7 @@ relink:
 			newlnkctl = oldlnkctl | PCI_EXP_LNKCTL_CLKREQ_EN;
 		if (newlnkctl != oldlnkctl)
 			pci_write_config_word(tp->pdev,
-					      tp->pcie_cap + PCI_EXP_LNKCTL,
+					      tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
 					      newlnkctl);
 	}
 
@@ -7226,7 +7226,7 @@ static int tg3_chip_reset(struct tg3 *tp)
 
 	udelay(120);
 
-	if (tg3_flag(tp, PCI_EXPRESS) && tp->pcie_cap) {
+	if (tg3_flag(tp, PCI_EXPRESS) && pci_pcie_cap(tp->pdev)) {
 		u16 val16;
 
 		if (tp->pci_chip_rev_id == CHIPREV_ID_5750_A0) {
@@ -7244,7 +7244,7 @@ static int tg3_chip_reset(struct tg3 *tp)
 
 		/* Clear the "no snoop" and "relaxed ordering" bits. */
 		pci_read_config_word(tp->pdev,
-				     tp->pcie_cap + PCI_EXP_DEVCTL,
+				     tp->pdev->pcie_cap + PCI_EXP_DEVCTL,
 				     &val16);
 		val16 &= ~(PCI_EXP_DEVCTL_RELAX_EN |
 			   PCI_EXP_DEVCTL_NOSNOOP_EN);
@@ -7255,14 +7255,14 @@ static int tg3_chip_reset(struct tg3 *tp)
 		if (!tg3_flag(tp, CPMU_PRESENT))
 			val16 &= ~PCI_EXP_DEVCTL_PAYLOAD;
 		pci_write_config_word(tp->pdev,
-				      tp->pcie_cap + PCI_EXP_DEVCTL,
+				      tp->pdev->pcie_cap + PCI_EXP_DEVCTL,
 				      val16);
 
 		pcie_set_readrq(tp->pdev, tp->pcie_readrq);
 
 		/* Clear error status */
 		pci_write_config_word(tp->pdev,
-				      tp->pcie_cap + PCI_EXP_DEVSTA,
+				      tp->pdev->pcie_cap + PCI_EXP_DEVSTA,
 				      PCI_EXP_DEVSTA_CED |
 				      PCI_EXP_DEVSTA_NFED |
 				      PCI_EXP_DEVSTA_FED |
@@ -13777,8 +13777,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
 	pci_read_config_dword(tp->pdev, TG3PCI_PCISTATE,
 			      &pci_state_reg);
 
-	tp->pcie_cap = pci_find_capability(tp->pdev, PCI_CAP_ID_EXP);
-	if (tp->pcie_cap != 0) {
+	if (pci_is_pcie(tp->pdev)) {
 		u16 lnkctl;
 
 		tg3_flag_set(tp, PCI_EXPRESS);
@@ -13791,7 +13790,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
 		pcie_set_readrq(tp->pdev, tp->pcie_readrq);
 
 		pci_read_config_word(tp->pdev,
-				     tp->pcie_cap + PCI_EXP_LNKCTL,
+				     tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
 				     &lnkctl);
 		if (lnkctl & PCI_EXP_LNKCTL_CLKREQ_EN) {
 			if (GET_ASIC_REV(tp->pci_chip_rev_id) ==
@@ -13808,6 +13807,10 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
 			tg3_flag_set(tp, L1PLLPD_EN);
 		}
 	} else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5785) {
+		/* BCM5785 devices are effectively PCIe devices, and should
+		 * follow PCIe codepaths, but do not have a PCIe capabilities
+		 * section.
+		*/
 		tg3_flag_set(tp, PCI_EXPRESS);
 	} else if (!tg3_flag(tp, 5705_PLUS) ||
 		   tg3_flag(tp, 5780_CLASS)) {
diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
index bedc3b4..5f250ae 100644
--- a/drivers/net/tg3.h
+++ b/drivers/net/tg3.h
@@ -2857,7 +2857,7 @@ enum TG3_FLAGS {
 	TG3_FLAG_IS_5788,
 	TG3_FLAG_MAX_RXPEND_64,
 	TG3_FLAG_TSO_CAPABLE,
-	TG3_FLAG_PCI_EXPRESS,
+	TG3_FLAG_PCI_EXPRESS, /* BCM5785 + pci_is_pcie() */
 	TG3_FLAG_ASF_NEW_HANDSHAKE,
 	TG3_FLAG_HW_AUTONEG,
 	TG3_FLAG_IS_NIC,
@@ -3022,10 +3022,7 @@ struct tg3 {
 
 	int				pm_cap;
 	int				msi_cap;
-	union {
 	int				pcix_cap;
-	int				pcie_cap;
-	};
 	int				pcie_readrq;
 
 	struct mii_bus			*mdio_bus;
-- 
1.7.5.4


^ permalink raw reply related

* Re: [PATCH V7 2/4 net-next] skbuff: Add userspace zero-copy buffers in skb
From: David Miller @ 2011-06-27 22:54 UTC (permalink / raw)
  To: mashirle; +Cc: mst, eric.dumazet, avi, arnd, netdev, kvm, linux-kernel
In-Reply-To: <1309189510.21764.1.camel@localhost.localdomain>

From: Shirley Ma <mashirle@us.ibm.com>
Date: Mon, 27 Jun 2011 08:45:10 -0700

> To support skb zero-copy, a pointer is needed to add to skb share info.
> Do you agree with this approach? If not, do you have any other
> suggestions?

I really can't form an opinion unless I am shown the complete
implementation, what this gives us in return, what the impact is, etc.

^ permalink raw reply

* Re: SKB paged fragment lifecycle on receive
From: David Miller @ 2011-06-27 22:49 UTC (permalink / raw)
  To: Ian.Campbell; +Cc: netdev, jeremy, xen-devel, eric.dumazet, rusty
In-Reply-To: <1309185724.32717.241.camel@zakaz.uk.xensource.com>

From: Ian Campbell <Ian.Campbell@eu.citrix.com>
Date: Mon, 27 Jun 2011 15:42:04 +0100

> However it seems like this might still have a problem if your SKBs are
> ever cloned. What happens in this case, e.g if a user of AF_PACKET sends
> a broadcast via a device associated with a bridge[1] (where it would be
> flooded)?

You don't need a bridge to get a clone on transmit, the packet
scheduler can do clones.  Just grep for skb_clone in the packet
action handlers net/sched/act_*.c

^ permalink raw reply

* [PATCH 7/9 v2] myri10ge: misc style cleanups
From: Jon Mason @ 2011-06-27 20:56 UTC (permalink / raw)
  To: davem; +Cc: netdev, Andrew Gallatin
In-Reply-To: <1309187108-12715-7-git-send-email-mason@myri.com>

Miscellaneous white space, style, and other cleanups

v2 includes corrections from Joe Perches

Signed-off-by: Jon Mason <mason@myri.com>
---
 drivers/net/myri10ge/myri10ge.c |   32 ++++++++++++++------------------
 1 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/drivers/net/myri10ge/myri10ge.c b/drivers/net/myri10ge/myri10ge.c
index b9b80c0..90c8330 100644
--- a/drivers/net/myri10ge/myri10ge.c
+++ b/drivers/net/myri10ge/myri10ge.c
@@ -1086,6 +1086,9 @@ static int myri10ge_toggle_relaxed(struct pci_dev *pdev, int on)
 		return 0;
 
 	err = pci_read_config_word(pdev, cap + PCI_EXP_DEVCTL, &ctl);
+	if (err)
+		return 0;
+
 	ret = (ctl & PCI_EXP_DEVCTL_RELAX_EN) >> 4;
 	if (ret != on) {
 		ctl &= ~PCI_EXP_DEVCTL_RELAX_EN;
@@ -1140,20 +1143,19 @@ static void myri10ge_setup_dca(struct myri10ge_priv *mgp)
 		mgp->ss[i].cpu = -1;
 		mgp->ss[i].cached_dca_tag = -1;
 		myri10ge_update_dca(&mgp->ss[i]);
-	 }
+	}
 }
 
 static void myri10ge_teardown_dca(struct myri10ge_priv *mgp)
 {
 	struct pci_dev *pdev = mgp->pdev;
-	int err;
 
 	if (!mgp->dca_enabled)
 		return;
 	mgp->dca_enabled = 0;
 	if (mgp->relaxed_order)
 		myri10ge_toggle_relaxed(pdev, 1);
-	err = dca_remove_requester(&pdev->dev);
+	dca_remove_requester(&pdev->dev);
 }
 
 static int myri10ge_notify_dca_device(struct device *dev, void *data)
@@ -1314,7 +1316,7 @@ myri10ge_unmap_rx_page(struct pci_dev *pdev,
 
 static inline int
 myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum,
-		 int lro_enabled)
+		 bool lro_enabled)
 {
 	struct myri10ge_priv *mgp = ss->mgp;
 	struct sk_buff *skb;
@@ -1474,11 +1476,9 @@ myri10ge_clean_rx_done(struct myri10ge_slice_state *ss, int budget)
 {
 	struct myri10ge_rx_done *rx_done = &ss->rx_done;
 	struct myri10ge_priv *mgp = ss->mgp;
-
 	unsigned long rx_bytes = 0;
 	unsigned long rx_packets = 0;
 	unsigned long rx_ok;
-
 	int idx = rx_done->idx;
 	int cnt = rx_done->cnt;
 	int work_done = 0;
@@ -1531,16 +1531,14 @@ static inline void myri10ge_check_statblock(struct myri10ge_priv *mgp)
 			mgp->link_state = link_up;
 
 			if (mgp->link_state == MXGEFW_LINK_UP) {
-				if (netif_msg_link(mgp))
-					netdev_info(mgp->dev, "link up\n");
+				netif_info(mgp, link, mgp->dev, "link up\n");
 				netif_carrier_on(mgp->dev);
 				mgp->link_changes++;
 			} else {
-				if (netif_msg_link(mgp))
-					netdev_info(mgp->dev, "link %s\n",
-					    link_up == MXGEFW_LINK_MYRINET ?
+				netif_info(mgp, link, mgp->dev, "link %s\n",
+					   (link_up == MXGEFW_LINK_MYRINET ?
 					    "mismatch (Myrinet detected)" :
-					    "down");
+					    "down"));
 				netif_carrier_off(mgp->dev);
 				mgp->link_changes++;
 			}
@@ -1621,7 +1619,7 @@ static irqreturn_t myri10ge_intr(int irq, void *arg)
 		if (send_done_count != tx->pkt_done)
 			myri10ge_tx_done(ss, (int)send_done_count);
 		if (unlikely(i > myri10ge_max_irq_loops)) {
-			netdev_err(mgp->dev, "irq stuck?\n");
+			netdev_warn(mgp->dev, "irq stuck?\n");
 			stats->valid = 0;
 			schedule_work(&mgp->watchdog_work);
 		}
@@ -1785,9 +1783,8 @@ static const char myri10ge_gstrings_slice_stats[][ETH_GSTRING_LEN] = {
 	"----------- slice ---------",
 	"tx_pkt_start", "tx_pkt_done", "tx_req", "tx_done",
 	"rx_small_cnt", "rx_big_cnt",
-	"wake_queue", "stop_queue", "tx_linearized", "LRO aggregated",
-	    "LRO flushed",
-	"LRO avg aggr", "LRO no_desc"
+	"wake_queue", "stop_queue", "tx_linearized",
+	"LRO aggregated", "LRO flushed", "LRO avg aggr", "LRO no_desc",
 };
 
 #define MYRI10GE_NET_STATS_LEN      21
@@ -3329,7 +3326,6 @@ abort:
 	/* fall back to using the unaligned firmware */
 	mgp->tx_boundary = 2048;
 	set_fw_name(mgp, myri10ge_fw_unaligned, false);
-
 }
 
 static void myri10ge_select_firmware(struct myri10ge_priv *mgp)
@@ -3715,8 +3711,8 @@ static void myri10ge_free_slices(struct myri10ge_priv *mgp)
 			dma_free_coherent(&pdev->dev, bytes,
 					  ss->fw_stats, ss->fw_stats_bus);
 			ss->fw_stats = NULL;
-			netif_napi_del(&ss->napi);
 		}
+		netif_napi_del(&ss->napi);
 	}
 	kfree(mgp->ss);
 	mgp->ss = NULL;
-- 
1.7.5.4


^ permalink raw reply related

* [PATCH 3/9 v2] myri10ge: rework parity error check and cleanup
From: Jon Mason @ 2011-06-27 20:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, Andrew Gallatin
In-Reply-To: <1309187108-12715-3-git-send-email-mason@myri.com>

Clean up watchdog reset code:
 - move code that checks for stuck slice to a common routine
 - unless there is a confirmed h/w fault, verify that a stuck
   slice is still stuck in the watchdog worker; if the slice is no
   longer stuck, abort the reset.
 - this removes an egregious 2000ms pause in the watchdog worker that
   was a diagnostic aid (to look for spurious resets) that snuck into
   production code.

v2 includes corrections from Ben Hutchings and Joe Perches

Signed-off-by: Jon Mason <mason@myri.com>
---
 drivers/net/myri10ge/myri10ge.c |  100 +++++++++++++++++++++++---------------
 1 files changed, 60 insertions(+), 40 deletions(-)

diff --git a/drivers/net/myri10ge/myri10ge.c b/drivers/net/myri10ge/myri10ge.c
index 0f0f83d..c2574c5 100644
--- a/drivers/net/myri10ge/myri10ge.c
+++ b/drivers/net/myri10ge/myri10ge.c
@@ -193,6 +193,7 @@ struct myri10ge_slice_state {
 	int watchdog_tx_done;
 	int watchdog_tx_req;
 	int watchdog_rx_done;
+	int stuck;
 #ifdef CONFIG_MYRI10GE_DCA
 	int cached_dca_tag;
 	int cpu;
@@ -3442,6 +3443,42 @@ static u32 myri10ge_read_reboot(struct myri10ge_priv *mgp)
 	return reboot;
 }
 
+static void
+myri10ge_check_slice(struct myri10ge_slice_state *ss, int *reset_needed,
+		     int *busy_slice_cnt, u32 rx_pause_cnt)
+{
+	struct myri10ge_priv *mgp = ss->mgp;
+	int slice = ss - mgp->ss;
+
+	if (ss->tx.req != ss->tx.done &&
+	    ss->tx.done == ss->watchdog_tx_done &&
+	    ss->watchdog_tx_req != ss->watchdog_tx_done) {
+		/* nic seems like it might be stuck.. */
+		if (rx_pause_cnt != mgp->watchdog_pause) {
+			if (net_ratelimit())
+				netdev_warn(mgp->dev, "slice %d: TX paused, "
+					    "check link partner\n", slice);
+		} else {
+			netdev_warn(mgp->dev,
+				    "slice %d: TX stuck %d %d %d %d %d %d\n",
+				    slice, ss->tx.queue_active, ss->tx.req,
+				    ss->tx.done, ss->tx.pkt_start,
+				    ss->tx.pkt_done,
+				    (int)ntohl(mgp->ss[slice].fw_stats->
+					       send_done_count));
+			*reset_needed = 1;
+			ss->stuck = 1;
+		}
+	}
+	if (ss->watchdog_tx_done != ss->tx.done ||
+	    ss->watchdog_rx_done != ss->rx_done.cnt) {
+		*busy_slice_cnt += 1;
+	}
+	ss->watchdog_tx_done = ss->tx.done;
+	ss->watchdog_tx_req = ss->tx.req;
+	ss->watchdog_rx_done = ss->rx_done.cnt;
+}
+
 /*
  * This watchdog is used to check whether the board has suffered
  * from a parity error and needs to be recovered.
@@ -3450,10 +3487,12 @@ static void myri10ge_watchdog(struct work_struct *work)
 {
 	struct myri10ge_priv *mgp =
 	    container_of(work, struct myri10ge_priv, watchdog_work);
-	struct myri10ge_tx_buf *tx;
-	u32 reboot;
+	struct myri10ge_slice_state *ss;
+	u32 reboot, rx_pause_cnt;
 	int status, rebooted;
 	int i;
+	int reset_needed = 0;
+	int busy_slice_cnt = 0;
 	u16 cmd, vendor;
 
 	mgp->watchdog_resets++;
@@ -3465,8 +3504,7 @@ static void myri10ge_watchdog(struct work_struct *work)
 		 * For now, just report it */
 		reboot = myri10ge_read_reboot(mgp);
 		netdev_err(mgp->dev, "NIC rebooted (0x%x),%s resetting\n",
-			   reboot,
-			   myri10ge_reset_recover ? "" : " not");
+			   reboot, myri10ge_reset_recover ? " " : " not");
 		if (myri10ge_reset_recover == 0)
 			return;
 		rtnl_lock();
@@ -3498,23 +3536,24 @@ static void myri10ge_watchdog(struct work_struct *work)
 				return;
 			}
 		}
-		/* Perhaps it is a software error.  Try to reset */
-
-		netdev_err(mgp->dev, "device timeout, resetting\n");
+		/* Perhaps it is a software error. See if stuck slice
+		 * has recovered, reset if not */
+		rx_pause_cnt = ntohl(mgp->ss[0].fw_stats->dropped_pause);
 		for (i = 0; i < mgp->num_slices; i++) {
-			tx = &mgp->ss[i].tx;
-			netdev_err(mgp->dev, "(%d): %d %d %d %d %d %d\n",
-				   i, tx->queue_active, tx->req,
-				   tx->done, tx->pkt_start, tx->pkt_done,
-				   (int)ntohl(mgp->ss[i].fw_stats->
-					      send_done_count));
-			msleep(2000);
-			netdev_info(mgp->dev, "(%d): %d %d %d %d %d %d\n",
-				    i, tx->queue_active, tx->req,
-				    tx->done, tx->pkt_start, tx->pkt_done,
-				    (int)ntohl(mgp->ss[i].fw_stats->
-					       send_done_count));
+			ss = &mgp->ss[i];
+			if (ss->stuck) {
+				myri10ge_check_slice(ss, &reset_needed,
+						     &busy_slice_cnt,
+						     rx_pause_cnt);
+				ss->stuck = 0;
+			}
 		}
+		if (!reset_needed) {
+			netdev_dbg(mgp->dev, "not resetting\n");
+			return;
+		}
+
+		netdev_err(mgp->dev, "device timeout, resetting\n");
 	}
 
 	if (!rebooted) {
@@ -3567,27 +3606,8 @@ static void myri10ge_watchdog_timer(unsigned long arg)
 			    myri10ge_fill_thresh)
 				ss->rx_big.watchdog_needed = 0;
 		}
-
-		if (ss->tx.req != ss->tx.done &&
-		    ss->tx.done == ss->watchdog_tx_done &&
-		    ss->watchdog_tx_req != ss->watchdog_tx_done) {
-			/* nic seems like it might be stuck.. */
-			if (rx_pause_cnt != mgp->watchdog_pause) {
-				if (net_ratelimit())
-					netdev_err(mgp->dev, "slice %d: TX paused, check link partner\n",
-						   i);
-			} else {
-				netdev_warn(mgp->dev, "slice %d stuck:", i);
-				reset_needed = 1;
-			}
-		}
-		if (ss->watchdog_tx_done != ss->tx.done ||
-		    ss->watchdog_rx_done != ss->rx_done.cnt) {
-			busy_slice_cnt++;
-		}
-		ss->watchdog_tx_done = ss->tx.done;
-		ss->watchdog_tx_req = ss->tx.req;
-		ss->watchdog_rx_done = ss->rx_done.cnt;
+		myri10ge_check_slice(ss, &reset_needed, &busy_slice_cnt,
+				     rx_pause_cnt);
 	}
 	/* if we've sent or received no traffic, poll the NIC to
 	 * ensure it is still there.  Otherwise, we risk not noticing
-- 
1.7.5.4


^ permalink raw reply related
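
The stuck-slice test that the patch above moves into myri10ge_check_slice() can be modelled in user space. The following is only a sketch under stated assumptions: struct slice_model, check_slice() and reset_still_needed() are hypothetical names, the pause counter comparison is reduced to a boolean, and the logging and busy-slice counting are left out. It is not the driver code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of one slice's counters. The field names
 * follow the patch, but this is not the driver's real structure. */
struct slice_model {
	int tx_req;		/* transmit requests queued so far */
	int tx_done;		/* transmit requests completed so far */
	int watchdog_tx_req;	/* tx_req snapshot from the previous check */
	int watchdog_tx_done;	/* tx_done snapshot from the previous check */
	bool stuck;		/* set when a check found the slice stuck */
};

/* Returns true when work is pending and nothing has completed since the
 * previous check. link_paused stands in for the pause counter comparison
 * (rx_pause_cnt != mgp->watchdog_pause) and suppresses the verdict. */
static bool check_slice(struct slice_model *ss, bool link_paused)
{
	bool stuck = false;

	assert(ss != NULL);
	if (ss->tx_req != ss->tx_done &&
	    ss->tx_done == ss->watchdog_tx_done &&
	    ss->watchdog_tx_req != ss->watchdog_tx_done) {
		if (!link_paused) {
			stuck = true;
			ss->stuck = true;
		}
	}
	/* Remember the current counters for the next check. */
	ss->watchdog_tx_done = ss->tx_done;
	ss->watchdog_tx_req = ss->tx_req;
	return stuck;
}

/* Models the worker: re-check only slices flagged earlier, and report
 * whether any of them is still stuck. */
static bool reset_still_needed(struct slice_model *slices, int n,
			       bool link_paused)
{
	bool reset_needed = false;

	assert(slices != NULL || n == 0);
	for (int i = 0; i < n; i++) {
		struct slice_model *ss = &slices[i];

		if (ss->stuck) {
			if (check_slice(ss, link_paused))
				reset_needed = true;
			ss->stuck = false;
		}
	}
	return reset_needed;
}
```

In this model a slice that completes its outstanding work between the timer check and the worker's re-check is no longer reported, which corresponds to the "abort the reset" case described in the changelog.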

* Re: SKB paged fragment lifecycle on receive
From: Jeremy Fitzhardinge @ 2011-06-27 20:51 UTC (permalink / raw)
  To: Ian Campbell; +Cc: netdev, rusty, xen-devel, David Miller, eric.dumazet
In-Reply-To: <1309003121.5807.20.camel@dagon.hellion.org.uk>

On 06/25/2011 12:58 PM, Ian Campbell wrote:
> On Fri, 2011-06-24 at 13:11 -0700, Jeremy Fitzhardinge wrote:
>> On 06/24/2011 12:46 PM, David Miller wrote:
>>> Pages get transferred between different SKBs all the time.
>>>
>>> For example, GRO makes extensive use of this technique.
>>> See net/core/skbuff.c:skb_gro_receive().
>>>
>>> It is just one example.
>> I see, and the new skb doesn't get a destructor copied from the
>> original, so there'd be no second callback.
> What about if we were to have a per-shinfo destructor (called once for
> each page as its refcount goes 1->0, from whichever skb ends up with the
> last ref) as well as the skb-destructors.

We never want the refcount for granted pages to go from 1 -> 0.  The
safest thing is to make sure we always elevate the refcount to make sure
that nothing else can ever drop the last ref.

If we can trust the network stack to always do the last release (and not
hand it off to something else to do it), then we could have a destructor
which gets called before the last ref drop (or leaves the ref drop to
the destructor to do), and do everything required that way.  But it
seems pretty fragile.  At the very least it would need a thorough code
audit to make sure that everything handles page lifetimes in the
expected way - but then I'd still worry about out-of-tree patches
breaking something in subtle ways.

>  This already handles the
> cloning case but when pages are moved between shinfo then would it make
> sense for that to be propagated between skb's under these circumstances
> and/or require them to be the same? Since in the case of something like
> skb_gro_receive the skbs (and hence the frag array pages) are all from
> the same 'owner' (even if the skb is actually created by the stack on
> their behalf) I suspect this could work?
>
> But I bet this assumption isn't valid in all cases.

Hm.

> In which case I end up wondering about a destructor per page in the frag
> array. At which point we might as well consider it as a part of the core
> mm stuff rather than something net specific?

Doing it generically still needs some kind of marker that the page has a
special-case destructor (and the destructor pointer itself).

    J

^ permalink raw reply
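
The "always elevate the refcount" idea in the message above can be illustrated with a toy model. This is a sketch only: struct page_model, pin_granted_page(), page_get(), page_put() and owner_may_reclaim() are hypothetical names, not kernel APIs, and the message itself states only the principle, not an implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a granted page: a bare counter plus a flag. */
struct page_model {
	int refcount;
	bool granted;
};

/* A user (an skb, a clone, ...) takes a reference. */
static void page_get(struct page_model *p)
{
	assert(p != NULL);
	p->refcount++;
}

/* A user drops its reference. Returns true if that was the last one,
 * i.e. the page would be freed at this point. */
static bool page_put(struct page_model *p)
{
	assert(p != NULL && p->refcount > 0);
	return --p->refcount == 0;
}

/* The owner pins the page with its own reference before handing it out,
 * so no other user's put can be the one that takes the count to zero. */
static void pin_granted_page(struct page_model *p)
{
	page_get(p);
	p->granted = true;
}

/* With the pin in place, a count of exactly one means every other user
 * has let go and only the owner's reference remains. */
static bool owner_may_reclaim(const struct page_model *p)
{
	assert(p != NULL);
	return p->granted && p->refcount == 1;
}
```

The model leaves open how the owner learns that the count has dropped back to one; that notification is the role the destructor proposals in this thread are trying to fill.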

* Fw: Remove over-broad module alias from zaurus.
From: Greg KH @ 2011-06-27 20:42 UTC (permalink / raw)
  To: Dave Jones, netdev

This should have gone to netdev@

----- Forwarded message from Dave Jones <davej@redhat.com> -----

Date: Fri, 17 Jun 2011 20:02:10 -0400
From: Dave Jones <davej@redhat.com>
To: Linux Kernel <linux-kernel@vger.kernel.org>
Cc: linux-usb@vger.kernel.org
Subject: Remove over-broad module alias from zaurus.

This module and a bunch of dependencies are getting loaded on several
of the laptops I have (probably picking up the mobile broadband device),
that have nothing to do with zaurus. Matching by class without
any vendor/device pair isn't the right thing to do here, as it
will prevent any other driver from correctly binding to it.
(Or in the absence of a driver, will just waste time & memory by
 unnecessarily loading modules)

Signed-off-by: Dave Jones <davej@redhat.com>

diff --git a/drivers/net/usb/zaurus.c b/drivers/net/usb/zaurus.c
index 241756e..1a2234c 100644
--- a/drivers/net/usb/zaurus.c
+++ b/drivers/net/usb/zaurus.c
@@ -331,17 +331,7 @@ static const struct usb_device_id	products [] = {
 	ZAURUS_MASTER_INTERFACE,
 	.driver_info = ZAURUS_PXA_INFO,
 },
-
-
-/* At least some of the newest PXA units have very different lies about
- * their standards support:  they claim to be cell phones offering
- * direct access to their radios!  (No, they don't conform to CDC MDLM.)
- */
 {
-	USB_INTERFACE_INFO(USB_CLASS_COMM, USB_CDC_SUBCLASS_MDLM,
-			USB_CDC_PROTO_NONE),
-	.driver_info = (unsigned long) &bogus_mdlm_info,
-}, {
 	/* Motorola MOTOMAGX phones */
 	USB_DEVICE_AND_INTERFACE_INFO(0x22b8, 0x6425, USB_CLASS_COMM,
 			USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

----- End forwarded message -----

^ permalink raw reply related
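
Why a class-only table entry is over-broad can be shown with a small user-space model of ID matching. This is a sketch under assumptions: struct id_entry, struct usb_iface and entry_matches() are hypothetical and much simpler than the kernel's real usb_device_id matching, and the 0x1234/0x5678 modem used in the usage note is a made-up device. The class, subclass and protocol values are those of USB_CLASS_COMM, USB_CDC_SUBCLASS_MDLM and USB_CDC_PROTO_NONE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CLASS_COMM	0x02	/* USB_CLASS_COMM */
#define MODEL_SUBCLASS_MDLM	0x0a	/* USB_CDC_SUBCLASS_MDLM */
#define MODEL_PROTO_NONE	0x00	/* USB_CDC_PROTO_NONE */

/* Hypothetical, simplified ID table entry. */
struct id_entry {
	bool match_device;	/* also compare vendor/product? */
	uint16_t vendor, product;
	uint8_t cls, subcls, proto;
};

/* Hypothetical description of an interface found on the bus. */
struct usb_iface {
	uint16_t vendor, product;
	uint8_t cls, subcls, proto;
};

/* An entry without a vendor/product pair matches every interface that
 * advertises the same class triple, whoever made the device. */
static bool entry_matches(const struct id_entry *e, const struct usb_iface *d)
{
	assert(e != NULL && d != NULL);
	if (e->match_device &&
	    (e->vendor != d->vendor || e->product != d->product))
		return false;
	return e->cls == d->cls && e->subcls == d->subcls &&
	       e->proto == d->proto;
}
```

In this model an entry carrying only the COMM/MDLM/NONE triple matches the made-up 0x1234:0x5678 modem, while an entry that also names the Motorola 0x22b8:0x6425 pair (as the remaining MOTOMAGX entry does) matches only that phone.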

* Re: [PATCH 15/19] tg3: remove unnecessary read of PCI_CAP_ID_EXP
From: Matt Carlson @ 2011-06-27 20:38 UTC (permalink / raw)
  To: Jon Mason, davem@davemloft.net, netdev@vger.kernel.org; +Cc: Matthew Carlson
In-Reply-To: <BANLkTi=-aoj=zg7BZbkzVikF7V9Si8c1xA@mail.gmail.com>

On Mon, Jun 27, 2011 at 01:23:04PM -0700, Jon Mason wrote:
> On Mon, Jun 27, 2011 at 1:42 PM, Matt Carlson <mcarlson@broadcom.com> wrote:
> > On Mon, Jun 27, 2011 at 10:47:34AM -0700, Jon Mason wrote:
> >> The PCIE capability offset is saved during PCI bus walking.  Use the
> >> value from pci_dev instead of checking in the driver and saving it off
> >> the driver specific structure.  It will remove an unnecessary search
> >> in the PCI configuration space if this value is referenced instead of
> >> reacquiring it.  Also, there is no need to set a driver specific flag to
> >> show whether the device is PCIE.  pci_is_pcie can be used for this
> >> purpose (thus removing the need for this flag).
> >>
> >> Signed-off-by: Jon Mason <jdmason@kudzu.us>
> >
> > I like the direction this is taking the code, but there are hidden dangers
> > here you might not be aware of.  BCM5785 devices are effectively PCIe
> > devices, and should follow PCIe codepaths, but do not have a PCIe
> > capabilities section.
> 
> Ah, that could cause a problem.  Good catch!
> 
> > Instead, can we keep the PCI_EXPRESS flag, but change all occurrences of
> > 'tp->pcie_cap' to either pci_pcie_cap(tp->pdev) or
> > pci_is_pcie(tp->pdev)?  We should probably add the following next to the
> > TG3_FLAG_PCI_EXPRESS enumeration too:
> >
> > /* BCM5785 + pci_is_pcie() */
> 
> Yes, I can revert that 1/2 of the patch, make the change above, and resubmit.
> 
> FYI, this e-mail didn't CC netdev.  You might wanna nack it there :)

Yes.  You are right.  Added netdev to the recipient list.

Consider it NAK'd. :)

> Thanks,
> Jon
> 
> >
> >> ---
> >>  drivers/net/tg3.c |   53 ++++++++++++++++++++++++-----------------------------
> >>  drivers/net/tg3.h |    4 ----
> >>  2 files changed, 24 insertions(+), 33 deletions(-)
> >>
> >> diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
> >> index 97cd02d..5ecbc5c 100644
> >> --- a/drivers/net/tg3.c
> >> +++ b/drivers/net/tg3.c
> >> @@ -2679,11 +2679,11 @@ static int tg3_power_down_prepare(struct tg3 *tp)
> >>               u16 lnkctl;
> >>
> >>               pci_read_config_word(tp->pdev,
> >> -                                  tp->pcie_cap + PCI_EXP_LNKCTL,
> >> +                                  tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
> >>                                    &lnkctl);
> >>               lnkctl |= PCI_EXP_LNKCTL_CLKREQ_EN;
> >>               pci_write_config_word(tp->pdev,
> >> -                                   tp->pcie_cap + PCI_EXP_LNKCTL,
> >> +                                   tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
> >>                                     lnkctl);
> >>       }
> >>
> >> @@ -3485,7 +3485,7 @@ relink:
> >>               u16 oldlnkctl, newlnkctl;
> >>
> >>               pci_read_config_word(tp->pdev,
> >> -                                  tp->pcie_cap + PCI_EXP_LNKCTL,
> >> +                                  tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
> >>                                    &oldlnkctl);
> >>               if (tp->link_config.active_speed == SPEED_100 ||
> >>                   tp->link_config.active_speed == SPEED_10)
> >> @@ -3494,7 +3494,7 @@ relink:
> >>                       newlnkctl = oldlnkctl | PCI_EXP_LNKCTL_CLKREQ_EN;
> >>               if (newlnkctl != oldlnkctl)
> >>                       pci_write_config_word(tp->pdev,
> >> -                                           tp->pcie_cap + PCI_EXP_LNKCTL,
> >> +                                           tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
> >>                                             newlnkctl);
> >>       }
> >>
> >> @@ -4604,7 +4604,7 @@ static void tg3_dump_state(struct tg3 *tp)
> >>               return;
> >>       }
> >>
> >> -     if (tg3_flag(tp, PCI_EXPRESS)) {
> >> +     if (pci_is_pcie(tp->pdev)) {
> >>               /* Read up to but not including private PCI registers */
> >>               for (i = 0; i < TG3_PCIE_TLDLPL_PORT; i += sizeof(u32))
> >>                       regs[i / sizeof(u32)] = tr32(i);
> >> @@ -7064,7 +7064,7 @@ static void tg3_restore_pci_state(struct tg3 *tp)
> >>       pci_write_config_word(tp->pdev, PCI_COMMAND, tp->pci_cmd);
> >>
> >>       if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5785) {
> >> -             if (tg3_flag(tp, PCI_EXPRESS))
> >> +             if (pci_is_pcie(tp->pdev))
> >>                       pcie_set_readrq(tp->pdev, tp->pcie_readrq);
> >>               else {
> >>                       pci_write_config_byte(tp->pdev, PCI_CACHE_LINE_SIZE,
> >> @@ -7172,7 +7172,7 @@ static int tg3_chip_reset(struct tg3 *tp)
> >>       /* do the reset */
> >>       val = GRC_MISC_CFG_CORECLK_RESET;
> >>
> >> -     if (tg3_flag(tp, PCI_EXPRESS)) {
> >> +     if (pci_is_pcie(tp->pdev)) {
> >>               /* Force PCIe 1.0a mode */
> >>               if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5785 &&
> >>                   !tg3_flag(tp, 57765_PLUS) &&
> >> @@ -7226,7 +7226,7 @@ static int tg3_chip_reset(struct tg3 *tp)
> >>
> >>       udelay(120);
> >>
> >> -     if (tg3_flag(tp, PCI_EXPRESS) && tp->pcie_cap) {
> >> +     if (pci_is_pcie(tp->pdev)) {
> >>               u16 val16;
> >>
> >>               if (tp->pci_chip_rev_id == CHIPREV_ID_5750_A0) {
> >> @@ -7244,7 +7244,7 @@ static int tg3_chip_reset(struct tg3 *tp)
> >>
> >>               /* Clear the "no snoop" and "relaxed ordering" bits. */
> >>               pci_read_config_word(tp->pdev,
> >> -                                  tp->pcie_cap + PCI_EXP_DEVCTL,
> >> +                                  tp->pdev->pcie_cap + PCI_EXP_DEVCTL,
> >>                                    &val16);
> >>               val16 &= ~(PCI_EXP_DEVCTL_RELAX_EN |
> >>                          PCI_EXP_DEVCTL_NOSNOOP_EN);
> >> @@ -7255,14 +7255,14 @@ static int tg3_chip_reset(struct tg3 *tp)
> >>               if (!tg3_flag(tp, CPMU_PRESENT))
> >>                       val16 &= ~PCI_EXP_DEVCTL_PAYLOAD;
> >>               pci_write_config_word(tp->pdev,
> >> -                                   tp->pcie_cap + PCI_EXP_DEVCTL,
> >> +                                   tp->pdev->pcie_cap + PCI_EXP_DEVCTL,
> >>                                     val16);
> >>
> >>               pcie_set_readrq(tp->pdev, tp->pcie_readrq);
> >>
> >>               /* Clear error status */
> >>               pci_write_config_word(tp->pdev,
> >> -                                   tp->pcie_cap + PCI_EXP_DEVSTA,
> >> +                                   tp->pdev->pcie_cap + PCI_EXP_DEVSTA,
> >>                                     PCI_EXP_DEVSTA_CED |
> >>                                     PCI_EXP_DEVSTA_NFED |
> >>                                     PCI_EXP_DEVSTA_FED |
> >> @@ -7325,7 +7325,7 @@ static int tg3_chip_reset(struct tg3 *tp)
> >>
> >>       tg3_mdio_start(tp);
> >>
> >> -     if (tg3_flag(tp, PCI_EXPRESS) &&
> >> +     if (pci_is_pcie(tp->pdev) &&
> >>           tp->pci_chip_rev_id != CHIPREV_ID_5750_A0 &&
> >>           GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5785 &&
> >>           !tg3_flag(tp, 57765_PLUS)) {
> >> @@ -8062,7 +8062,7 @@ static int tg3_reset_hw(struct tg3 *tp, int reset_phy)
> >>        * chips and don't even touch the clocks if the CPMU is present.
> >>        */
> >>       if (!tg3_flag(tp, CPMU_PRESENT)) {
> >> -             if (!tg3_flag(tp, PCI_EXPRESS))
> >> +             if (!pci_is_pcie(tp->pdev))
> >>                       tp->pci_clock_ctrl |= CLOCK_CTRL_DELAY_PCI_GRANT;
> >>               tw32_f(TG3PCI_CLOCK_CTRL, tp->pci_clock_ctrl);
> >>       }
> >> @@ -8339,7 +8339,7 @@ static int tg3_reset_hw(struct tg3 *tp, int reset_phy)
> >>               }
> >>       }
> >>
> >> -     if (tg3_flag(tp, PCI_EXPRESS))
> >> +     if (pci_is_pcie(tp->pdev))
> >>               rdmac_mode |= RDMAC_MODE_FIFO_LONG_BURST;
> >>
> >>       if (tg3_flag(tp, HW_TSO_1) ||
> >> @@ -12873,7 +12873,7 @@ static void __devinit tg3_get_eeprom_hw_cfg(struct tg3 *tp)
> >>                   (cfg2 & NIC_SRAM_DATA_CFG_2_APD_EN))
> >>                       tp->phy_flags |= TG3_PHYFLG_ENABLE_APD;
> >>
> >> -             if (tg3_flag(tp, PCI_EXPRESS) &&
> >> +             if (pci_is_pcie(tp->pdev) &&
> >>                   GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5785 &&
> >>                   !tg3_flag(tp, 57765_PLUS)) {
> >>                       u32 cfg3;
> >> @@ -13777,12 +13777,9 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
> >>       pci_read_config_dword(tp->pdev, TG3PCI_PCISTATE,
> >>                             &pci_state_reg);
> >>
> >> -     tp->pcie_cap = pci_find_capability(tp->pdev, PCI_CAP_ID_EXP);
> >> -     if (tp->pcie_cap != 0) {
> >> +     if (pci_is_pcie(tp->pdev)) {
> >>               u16 lnkctl;
> >>
> >> -             tg3_flag_set(tp, PCI_EXPRESS);
> >> -
> >>               tp->pcie_readrq = 4096;
> >>               if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719 ||
> >>                   GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5720)
> >> @@ -13791,7 +13788,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
> >>               pcie_set_readrq(tp->pdev, tp->pcie_readrq);
> >>
> >>               pci_read_config_word(tp->pdev,
> >> -                                  tp->pcie_cap + PCI_EXP_LNKCTL,
> >> +                                  tp->pdev->pcie_cap + PCI_EXP_LNKCTL,
> >>                                    &lnkctl);
> >>               if (lnkctl & PCI_EXP_LNKCTL_CLKREQ_EN) {
> >>                       if (GET_ASIC_REV(tp->pci_chip_rev_id) ==
> >> @@ -13807,8 +13804,6 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
> >>               } else if (tp->pci_chip_rev_id == CHIPREV_ID_5717_A0) {
> >>                       tg3_flag_set(tp, L1PLLPD_EN);
> >>               }
> >> -     } else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5785) {
> >> -             tg3_flag_set(tp, PCI_EXPRESS);
> >>       } else if (!tg3_flag(tp, 5705_PLUS) ||
> >>                  tg3_flag(tp, 5780_CLASS)) {
> >>               tp->pcix_cap = pci_find_capability(tp->pdev, PCI_CAP_ID_PCIX);
> >> @@ -13829,7 +13824,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
> >>        * posted to the chip in order.
> >>        */
> >>       if (pci_dev_present(tg3_write_reorder_chipsets) &&
> >> -         !tg3_flag(tp, PCI_EXPRESS))
> >> +         !pci_is_pcie(tp->pdev))
> >>               tg3_flag_set(tp, MBOX_WRITE_REORDER);
> >>
> >>       pci_read_config_byte(tp->pdev, PCI_CACHE_LINE_SIZE,
> >> @@ -13903,7 +13898,7 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
> >>       if (tg3_flag(tp, PCIX_TARGET_HWBUG))
> >>               tp->write32 = tg3_write_indirect_reg32;
> >>       else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5701 ||
> >> -              (tg3_flag(tp, PCI_EXPRESS) &&
> >> +              (pci_is_pcie(tp->pdev) &&
> >>                 tp->pci_chip_rev_id == CHIPREV_ID_5750_A0)) {
> >>               /*
> >>                * Back to back register writes can cause problems on these
> >> @@ -14390,7 +14385,7 @@ static u32 __devinit tg3_calc_dma_bndry(struct tg3 *tp, u32 val)
> >>        */
> >>       if (GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5700 &&
> >>           GET_ASIC_REV(tp->pci_chip_rev_id) != ASIC_REV_5701 &&
> >> -         !tg3_flag(tp, PCI_EXPRESS))
> >> +         !pci_is_pcie(tp->pdev))
> >>               goto out;
> >>
> >>  #if defined(CONFIG_PPC64) || defined(CONFIG_IA64) || defined(CONFIG_PARISC)
> >> @@ -14422,7 +14417,7 @@ static u32 __devinit tg3_calc_dma_bndry(struct tg3 *tp, u32 val)
> >>        * other than 5700 and 5701 which do not implement the
> >>        * boundary bits.
> >>        */
> >> -     if (tg3_flag(tp, PCIX_MODE) && !tg3_flag(tp, PCI_EXPRESS)) {
> >> +     if (tg3_flag(tp, PCIX_MODE) && !pci_is_pcie(tp->pdev)) {
> >>               switch (cacheline_size) {
> >>               case 16:
> >>               case 32:
> >> @@ -14447,7 +14442,7 @@ static u32 __devinit tg3_calc_dma_bndry(struct tg3 *tp, u32 val)
> >>                               DMA_RWCTRL_WRITE_BNDRY_384_PCIX);
> >>                       break;
> >>               }
> >> -     } else if (tg3_flag(tp, PCI_EXPRESS)) {
> >> +     } else if (pci_is_pcie(tp->pdev)) {
> >>               switch (cacheline_size) {
> >>               case 16:
> >>               case 32:
> >> @@ -14622,7 +14617,7 @@ static int __devinit tg3_test_dma(struct tg3 *tp)
> >>       if (tg3_flag(tp, 57765_PLUS))
> >>               goto out;
> >>
> >> -     if (tg3_flag(tp, PCI_EXPRESS)) {
> >> +     if (pci_is_pcie(tp->pdev)) {
> >>               /* DMA read watermark not used on PCIE */
> >>               tp->dma_rwctrl |= 0x00180000;
> >>       } else if (!tg3_flag(tp, PCIX_MODE)) {
> >> @@ -14880,7 +14875,7 @@ static char * __devinit tg3_phy_string(struct tg3 *tp)
> >>
> >>  static char * __devinit tg3_bus_string(struct tg3 *tp, char *str)
> >>  {
> >> -     if (tg3_flag(tp, PCI_EXPRESS)) {
> >> +     if (pci_is_pcie(tp->pdev)) {
> >>               strcpy(str, "PCI Express");
> >>               return str;
> >>       } else if (tg3_flag(tp, PCIX_MODE)) {
> >> diff --git a/drivers/net/tg3.h b/drivers/net/tg3.h
> >> index bedc3b4..ede661f 100644
> >> --- a/drivers/net/tg3.h
> >> +++ b/drivers/net/tg3.h
> >> @@ -2857,7 +2857,6 @@ enum TG3_FLAGS {
> >>       TG3_FLAG_IS_5788,
> >>       TG3_FLAG_MAX_RXPEND_64,
> >>       TG3_FLAG_TSO_CAPABLE,
> >> -     TG3_FLAG_PCI_EXPRESS,
> >>       TG3_FLAG_ASF_NEW_HANDSHAKE,
> >>       TG3_FLAG_HW_AUTONEG,
> >>       TG3_FLAG_IS_NIC,
> >> @@ -3022,10 +3021,7 @@ struct tg3 {
> >>
> >>       int                             pm_cap;
> >>       int                             msi_cap;
> >> -     union {
> >>       int                             pcix_cap;
> >> -     int                             pcie_cap;
> >> -     };
> >>       int                             pcie_readrq;
> >>
> >>       struct mii_bus                  *mdio_bus;
> >> --
> >> 1.7.5.4
> >>
> >>
> >
> >
> 


^ permalink raw reply
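
The distinction raised in the review above (a device that must follow the PCIe code paths without having a PCIe capability section) can be sketched as a small user-space model. Everything here is an assumption made for illustration: struct dev_model, MODEL_ASIC_REV_5785 and the model_* helpers are hypothetical, and pci_is_pcie() is modelled simply as "the capability offset is nonzero".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_ASIC_REV_5785	1	/* placeholder id, not the real value */
#define MODEL_ASIC_REV_OTHER	2	/* placeholder for any other chip */

/* Hypothetical stand-in for the bits of device state that matter here. */
struct dev_model {
	int pcie_cap;	/* offset of the PCIe capability, 0 if absent */
	int asic_rev;
};

/* Modelled after pci_is_pcie(): true only if a capability was found. */
static bool model_pci_is_pcie(const struct dev_model *d)
{
	assert(d != NULL);
	return d->pcie_cap != 0;
}

/* The driver flag as annotated in the review: "BCM5785 + pci_is_pcie()". */
static bool model_flag_pci_express(const struct dev_model *d)
{
	return model_pci_is_pcie(d) || d->asic_rev == MODEL_ASIC_REV_5785;
}

/* Capability registers can only be addressed when the offset exists, so
 * this check must not be replaced by the broader flag. */
static bool model_can_access_pcie_regs(const struct dev_model *d)
{
	return model_pci_is_pcie(d);
}
```

Under this model, replacing every flag test with pci_is_pcie() would take a BCM5785 off the PCIe code paths, which is the problem the review points out; keeping the flag for path selection and using the capability check only for register access avoids it.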

* Re: [PATCH 3/9] myri10ge: rework parity error check and cleanup
From: Jon Mason @ 2011-06-27 20:26 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: davem, netdev, Andrew Gallatin
In-Reply-To: <1309188438.2744.5.camel@bwh-desktop>

On Mon, Jun 27, 2011 at 10:27 AM, Ben Hutchings
<bhutchings@solarflare.com> wrote:
> On Mon, 2011-06-27 at 10:05 -0500, Jon Mason wrote:
>> Clean up watchdog reset code:
>>  - move code that checks for stuck slice to a common routine
>>  - unless there is a confirmed h/w fault, verify that a stuck
>>    slice is still stuck in the watchdog worker; if the slice is no
>>    longer stuck, abort the reset.
>>  - this removes an egregarious 2000ms pause in the watchdog worker that
> [...]
>
> Egregious & gregarious?  Or maybe just egregious?

It's my new errno, -EGREGARIOUS.  :)

Thanks, I'll roll this and Joe's comments in version #2.

>
> Ben.
>
> --
> Ben Hutchings, Senior Software Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
>
>

^ permalink raw reply

* [PATCH] [net][bna] Fix call trace when interrupts are disabled while sleeping function kzalloc is called
From: Shyam Iyer @ 2011-06-27 20:21 UTC (permalink / raw)
  To: netdev; +Cc: rmody, ddutt, Shyam Iyer

kzalloc() may sleep, so calling it with interrupts disabled (under spin_lock_irqsave) causes an oops like the one below.

Jun 27 08:15:24 home-t710 kernel: [11735.634550] Brocade 10G Ethernet driver
Jun 27 08:15:24 home-t710 kernel: [11735.634590] bnad_pci_probe : (0xffff880427f3d000, 0xffffffffa020f3e0) PCI Func : (2)
Jun 27 08:15:24 home-t710 kernel: [11735.637677] bna 0000:82:00.2: PCI INT A -> GSI 66 (level, low) -> IRQ 66
Jun 27 08:15:24 home-t710 kernel: [11735.638290] bar0 mapped to ffffc90014980000, len 262144
Jun 27 08:15:24 home-t710 kernel: [11735.638732] BUG: sleeping function called from invalid context at mm/slub.c:847
Jun 27 08:15:24 home-t710 kernel: [11735.638736] in_atomic(): 0, irqs_disabled(): 1, pid: 11243, name: insmod
Jun 27 08:15:24 home-t710 kernel: [11735.638740] Pid: 11243, comm: insmod Not tainted 3.0.0-rc4+ #6
Jun 27 08:15:24 home-t710 kernel: [11735.638743] Call Trace:
Jun 27 08:15:24 home-t710 kernel: [11735.638755]  [<ffffffff81046427>] __might_sleep+0xeb/0xf0
Jun 27 08:15:24 home-t710 kernel: [11735.638766]  [<ffffffffa01fe469>] ? netif_wake_queue+0x3d/0x3d [bna]
Jun 27 08:15:24 home-t710 kernel: [11735.638773]  [<ffffffff8111201c>] kmem_cache_alloc_trace+0x43/0xd8
Jun 27 08:15:24 home-t710 kernel: [11735.638782]  [<ffffffffa01fe469>] ? netif_wake_queue+0x3d/0x3d [bna]
Jun 27 08:15:24 home-t710 kernel: [11735.638787]  [<ffffffff810ab791>] request_threaded_irq+0xa1/0x113
Jun 27 08:15:24 home-t710 kernel: [11735.638798]  [<ffffffffa020f0c0>] bnad_pci_probe+0x612/0x8e5 [bna]
Jun 27 08:15:24 home-t710 kernel: [11735.638807]  [<ffffffffa01fe469>] ? netif_wake_queue+0x3d/0x3d [bna]
Jun 27 08:15:24 home-t710 kernel: [11735.638816]  [<ffffffff81482ef4>] ? _raw_spin_unlock_irqrestore+0x17/0x19
Jun 27 08:15:24 home-t710 kernel: [11735.638822]  [<ffffffff8124d17a>] local_pci_probe+0x44/0x75
Jun 27 08:15:24 home-t710 kernel: [11735.638826]  [<ffffffff8124dc06>] pci_device_probe+0xd0/0xff
Jun 27 08:15:24 home-t710 kernel: [11735.638832]  [<ffffffff812ef8ab>] driver_probe_device+0x131/0x213
Jun 27 08:15:24 home-t710 kernel: [11735.638836]  [<ffffffff812ef9e7>] __driver_attach+0x5a/0x7e
Jun 27 08:15:24 home-t710 kernel: [11735.638840]  [<ffffffff812ef98d>] ? driver_probe_device+0x213/0x213
Jun 27 08:15:24 home-t710 kernel: [11735.638844]  [<ffffffff812ee933>] bus_for_each_dev+0x53/0x89
Jun 27 08:15:24 home-t710 kernel: [11735.638848]  [<ffffffff812ef48a>] driver_attach+0x1e/0x20
Jun 27 08:15:24 home-t710 kernel: [11735.638852]  [<ffffffff812ef0ae>] bus_add_driver+0xd1/0x224
Jun 27 08:15:24 home-t710 kernel: [11735.638858]  [<ffffffffa01b8000>] ? 0xffffffffa01b7fff
Jun 27 08:15:24 home-t710 kernel: [11735.638862]  [<ffffffff812efe57>] driver_register+0x98/0x105
Jun 27 08:15:24 home-t710 kernel: [11735.638866]  [<ffffffffa01b8000>] ? 0xffffffffa01b7fff
Jun 27 08:15:24 home-t710 kernel: [11735.638871]  [<ffffffff8124e4c9>] __pci_register_driver+0x56/0xc1
Jun 27 08:15:24 home-t710 kernel: [11735.638875]  [<ffffffffa01b8000>] ? 0xffffffffa01b7fff
Jun 27 08:15:24 home-t710 kernel: [11735.638884]  [<ffffffffa01b8040>] bnad_module_init+0x40/0x60 [bna]
Jun 27 08:15:24 home-t710 kernel: [11735.638892]  [<ffffffff81002099>] do_one_initcall+0x7f/0x136
Jun 27 08:15:24 home-t710 kernel: [11735.638899]  [<ffffffff8108608b>] sys_init_module+0x88/0x1d0
Jun 27 08:15:24 home-t710 kernel: [11735.638906]  [<ffffffff81489682>] system_call_fastpath+0x16/0x1b
Jun 27 08:15:24 home-t710 kernel: [11735.639642] bnad_pci_probe : (0xffff880427f3e000, 0xffffffffa020f3e0) PCI Func : (3)
Jun 27 08:15:24 home-t710 kernel: [11735.639665] bna 0000:82:00.3: PCI INT A -> GSI 66 (level, low) -> IRQ 66
Jun 27 08:15:24 home-t710 kernel: [11735.639735] bar0 mapped to ffffc90014400000, len 262144


Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
---
 drivers/net/bna/bnad.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bna/bnad.c b/drivers/net/bna/bnad.c
index 7d25a97..04da3c8 100644
--- a/drivers/net/bna/bnad.c
+++ b/drivers/net/bna/bnad.c
@@ -1114,6 +1114,11 @@ bnad_mbox_irq_alloc(struct bnad *bnad,
 	unsigned long 	flags;
 	u32	irq;
 	irq_handler_t 	irq_handler;
+	u32	cfg_flags;
+
+	spin_lock_irqsave(&bnad->bna_lock, flags);
+	cfg_flags = bnad->cfg_flags;
+	spin_unlock_irqrestore(&bnad->bna_lock, flags);
 
 	/* Mbox should use only 1 vector */
 
@@ -1121,8 +1126,7 @@ bnad_mbox_irq_alloc(struct bnad *bnad,
 	if (!intr_info->idl)
 		return -ENOMEM;
 
-	spin_lock_irqsave(&bnad->bna_lock, flags);
-	if (bnad->cfg_flags & BNAD_CF_MSIX) {
+	if (cfg_flags & BNAD_CF_MSIX) {
 		irq_handler = (irq_handler_t)bnad_msix_mbox_handler;
 		irq = bnad->msix_table[bnad->msix_num - 1].vector;
 		flags = 0;
@@ -1135,7 +1139,6 @@ bnad_mbox_irq_alloc(struct bnad *bnad,
 		intr_info->intr_type = BNA_INTR_T_INTX;
 		/* intr_info->idl.vector = 0 ? */
 	}
-	spin_unlock_irqrestore(&bnad->bna_lock, flags);
 
 	sprintf(bnad->mbox_irq_name, "%s", BNAD_NAME);
 
-- 
1.7.5.4
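
The pattern in the patch above, reduced to a userspace sketch: copy the
field you need while holding the lock, drop the lock, and only then call
anything that may block. A C11 atomic_flag spinlock stands in for bna_lock
and calloc() for the sleeping allocation; struct fake_dev, CF_MSIX and
pick_irq_mode() are hypothetical names for illustration, not driver code.
Reading the diff, the old code also assigned `flags = 0` between
spin_lock_irqsave() and spin_unlock_irqrestore(), which may be why
interrupts stayed disabled; the commit message does not say so, so treat
that as an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define CF_MSIX 0x1u	/* hypothetical flag, stands in for BNAD_CF_MSIX */

struct fake_dev {
	atomic_flag lock;	/* stands in for bnad->bna_lock */
	uint32_t cfg_flags;
};

static void fake_lock(atomic_flag *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
}

static void fake_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Returns 1 for MSI-X, 0 for INTx, -1 if the allocation fails. */
static int pick_irq_mode(struct fake_dev *dev, void **scratch)
{
	uint32_t cfg_flags;

	assert(dev != NULL && scratch != NULL);

	/* Hold the lock only long enough to take a snapshot. */
	fake_lock(&dev->lock);
	cfg_flags = dev->cfg_flags;
	fake_unlock(&dev->lock);

	/* The potentially blocking call runs with the lock dropped. */
	*scratch = calloc(1, 64);
	if (!*scratch)
		return -1;

	return (cfg_flags & CF_MSIX) ? 1 : 0;
}
```

The snapshot can go stale once the lock is dropped, so this only fits
fields that do not change during the call, as cfg_flags presumably does
not during probe.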


^ permalink raw reply related

* Re: [PATCH] e1000: Allow the driver to be used on PA RISC C8000 workstation
From: Matt Turner @ 2011-06-27 19:33 UTC (permalink / raw)
  To: Rolf Eike Beer
  Cc: kyle@mcmartin.ca, linux-parisc@vger.kernel.org,
	e1000-devel@lists.sourceforge.net, netdev, Guy Martin,
	mikulas@artax.karlin.mff.cuni.cz
In-Reply-To: <1793556.1X3EUabhs6@donald.sf-tec.de>

On Mon, Jun 27, 2011 at 2:42 PM, Rolf Eike Beer <eike-kernel@sf-tec.de> wrote:
> Rolf Eike Beer wrote:
>> On Friday, 18 March 2011, 17:39:57, Rolf Eike Beer wrote:
>> > On Wednesday, 2 March 2011, 21:19:24, Jesse Brandeburg wrote:
>> > > On Mon, Feb 28, 2011 at 5:40 AM, Guy Martin <gmsoft@tuxicoman.be> wrote:
>> > > > Hi Jeff,
>> > > >
>> > > > Any luck getting this into mainline ?
>> > >
>> > > Hi Guy, sorry for the delay,
>> > > We haven't been able to get our contacts in HP to give us a decent
>> > > response so far, we are following up with them to see what's up.  We
>> > > have not lost the patch and are still tracking it internally.
>> > >
>> > > Give us a couple more weeks if that is okay and we should be able to
>> > > settle this by then.
>> >
>> > I wonder what exactly you are waiting for? This is a sanity check that
>> > we disable, so no working systems could get broken by this. And every
>> > single C8000 seems to be affected by this and is working fine with that
>> > patch. So maybe people at HP might have a clue _why_ this is screwed,
>> > but until then I don't see any point in waiting.
>> >
>> > So please just add my tested-by and push this upstream soon. Since this
>> > is basically a hardware quirk I would like to get this into stable also
>> > so we may run vanilla 2.6.38.1 or something like that on C8000.
>> >
>> > Tested-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
>>
>> For the netdev folks: it's this patch we are talking about
>>
>> http://www.spinics.net/lists/linux-parisc/msg03091.html
>>
>> I would love to see that someone finally picks this up and pushes this
>> upstream, CC stable. This is absolutely annoying as it breaks every time
>> anyone touches the kernel on one of these machines.
>>
>> Jeff, David, James: can you please make a decision of who takes this and
>> then just do it?
>
> Ping?

Ping indeed. Waiting as if HP is going to say "yes, we fucked that up.
Go ahead with the patch." is silly.


^ permalink raw reply

* Re: [PATCH 14/19] sky2: remove unnecessary reads of PCI_CAP_ID_EXP
From: Stephen Hemminger @ 2011-06-27 18:55 UTC (permalink / raw)
  To: Jon Mason; +Cc: netdev
In-Reply-To: <1309196816-16198-1-git-send-email-jdmason@kudzu.us>

On Mon, 27 Jun 2011 12:46:56 -0500
Jon Mason <jdmason@kudzu.us> wrote:

> The PCIE capability offset is saved during PCI bus walking.  It will
> remove an unnecessary search in the PCI configuration space if this
> value is referenced instead of reacquiring it.  Also, pci_is_pcie is a
> better way of determining if the device is PCIE or not (as it uses the
> same saved PCIE capability offset).
> 
> Signed-off-by: Jon Mason <jdmason@kudzu.us>

Acked-by: Stephen Hemminger <shemminger@vyatta.com>

^ permalink raw reply

* Re: [PATCH net-next-2.6 1/3] be2net: fix netdev_stats_update
From: Stephen Hemminger @ 2011-06-27 18:43 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Sathya Perla, netdev
In-Reply-To: <1309195241.2532.79.camel@edumazet-laptop>

On Mon, 27 Jun 2011 19:20:41 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Monday, 27 June 2011 at 09:43 -0700, Stephen Hemminger wrote:
> > On Mon, 27 Jun 2011 12:10:48 +0530
> > Sathya Perla <sathya.perla@emulex.com> wrote:
> > 
> > > Problem initially reported and fixed by Eric Dumazet <eric.dumazet@gmail.com>
> > > 
> > > netdev_stats_update() resets netdev->stats and then accumulates stats from
> > > various rings. This is wrong as stats readers can sometimes catch zero values.
> > > Use temporary variables instead for accumulating per-ring values.
> > > 
> > > Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
> > 
> > Should also use u64_stats_sync to ensure correct rollover on 32 bit SMP
> > platform.
> 
> These are "unsigned long" fields, you don't need u64_stats_sync.

The source fields (in be.h) are u64, but since destination is unsigned long
that works.
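
For reference, the fix under discussion (sum the per-ring values in locals,
then store each total once) looks roughly like this outside the driver.
This is a minimal sketch; struct ring_stats, struct dev_totals and
totals_update() are hypothetical names, not be2net code.

```c
#include <assert.h>
#include <stddef.h>

struct ring_stats {
	unsigned long pkts;
	unsigned long bytes;
};

struct dev_totals {
	unsigned long rx_packets;
	unsigned long rx_bytes;
};

static void totals_update(struct dev_totals *out,
			  const struct ring_stats *rings, size_t n)
{
	unsigned long pkts = 0, bytes = 0;
	size_t i;

	assert(out != NULL && (rings != NULL || n == 0));

	/* Accumulate in locals; *out is never reset to zero on the way. */
	for (i = 0; i < n; i++) {
		pkts += rings[i].pkts;
		bytes += rings[i].bytes;
	}

	/* One store per field: a reader sees the old or the new total. */
	out->rx_packets = pkts;
	out->rx_bytes = bytes;
}
```

A concurrent reader can still observe a new rx_packets next to an old
rx_bytes, but it no longer catches the transient zero that the
reset-then-accumulate version exposed.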

^ permalink raw reply

* [PATCH net-next]  benet: convert to 64 bit stats
From: Stephen Hemminger @ 2011-06-27 18:43 UTC (permalink / raw)
  To: David Miller; +Cc: Sathya Perla, netdev
In-Reply-To: <20110627094337.5108b5f6@nehalam.ftrdhcpuser.net>

This changes how the benet driver does statistics:
  * use 64 bit statistics interface (old api was only 32 bit on 32 bit platform)
  * use u64_stats_sync to ensure atomic 64 bit on 32 bit SMP
  * only update statistics when needed

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---

 drivers/net/benet/be.h      |    4 -
 drivers/net/benet/be_cmds.c |    1 
 drivers/net/benet/be_main.c |  161 ++++++++++++++++++++++++--------------------
 3 files changed, 93 insertions(+), 73 deletions(-)

--- a/drivers/net/benet/be.h	2011-06-27 10:37:01.519999268 -0700
+++ b/drivers/net/benet/be.h	2011-06-27 11:00:02.691999144 -0700
@@ -29,6 +29,7 @@
 #include <linux/interrupt.h>
 #include <linux/firmware.h>
 #include <linux/slab.h>
+#include <linux/u64_stats_sync.h>
 
 #include "be_hw.h"
 
@@ -176,6 +177,7 @@ struct be_tx_stats {
 	u64 be_tx_bytes_prev;
 	u64 be_tx_pkts;
 	u32 be_tx_rate;
+	struct u64_stats_sync syncp;
 };
 
 struct be_tx_obj {
@@ -210,6 +212,7 @@ struct be_rx_stats {
 	u32 rx_frags;
 	u32 prev_rx_frags;
 	u32 rx_fps;		/* Rx frags per second */
+	struct u64_stats_sync syncp;
 };
 
 struct be_rx_compl_info {
@@ -526,7 +529,6 @@ static inline bool be_multi_rxq(const st
 extern void be_cq_notify(struct be_adapter *adapter, u16 qid, bool arm,
 		u16 num_popped);
 extern void be_link_status_update(struct be_adapter *adapter, bool link_up);
-extern void netdev_stats_update(struct be_adapter *adapter);
 extern void be_parse_stats(struct be_adapter *adapter);
 extern int be_load_fw(struct be_adapter *adapter, u8 *func);
 #endif				/* BE_H */
--- a/drivers/net/benet/be_main.c	2011-06-27 10:26:59.635999322 -0700
+++ b/drivers/net/benet/be_main.c	2011-06-27 10:59:37.923999145 -0700
@@ -418,77 +418,6 @@ void be_parse_stats(struct be_adapter *a
 	}
 }
 
-void netdev_stats_update(struct be_adapter *adapter)
-{
-	struct be_drv_stats *drvs = &adapter->drv_stats;
-	struct net_device_stats *dev_stats = &adapter->netdev->stats;
-	struct be_rx_obj *rxo;
-	struct be_tx_obj *txo;
-	unsigned long pkts = 0, bytes = 0, mcast = 0, drops = 0;
-	int i;
-
-	for_all_rx_queues(adapter, rxo, i) {
-		pkts += rx_stats(rxo)->rx_pkts;
-		bytes += rx_stats(rxo)->rx_bytes;
-		mcast += rx_stats(rxo)->rx_mcast_pkts;
-		/*  no space in linux buffers: best possible approximation */
-		if (adapter->generation == BE_GEN3) {
-			if (!(lancer_chip(adapter))) {
-				struct be_erx_stats_v1 *erx =
-					be_erx_stats_from_cmd(adapter);
-				drops += erx->rx_drops_no_fragments[rxo->q.id];
-			}
-		} else {
-			struct be_erx_stats_v0 *erx =
-					be_erx_stats_from_cmd(adapter);
-			drops += erx->rx_drops_no_fragments[rxo->q.id];
-		}
-	}
-	dev_stats->rx_packets = pkts;
-	dev_stats->rx_bytes = bytes;
-	dev_stats->multicast = mcast;
-	dev_stats->rx_dropped = drops;
-
-	pkts = bytes = 0;
-	for_all_tx_queues(adapter, txo, i) {
-		pkts += tx_stats(txo)->be_tx_pkts;
-		bytes += tx_stats(txo)->be_tx_bytes;
-	}
-	dev_stats->tx_packets = pkts;
-	dev_stats->tx_bytes = bytes;
-
-	/* bad pkts received */
-	dev_stats->rx_errors = drvs->rx_crc_errors +
-		drvs->rx_alignment_symbol_errors +
-		drvs->rx_in_range_errors +
-		drvs->rx_out_range_errors +
-		drvs->rx_frame_too_long +
-		drvs->rx_dropped_too_small +
-		drvs->rx_dropped_too_short +
-		drvs->rx_dropped_header_too_small +
-		drvs->rx_dropped_tcp_length +
-		drvs->rx_dropped_runt +
-		drvs->rx_tcp_checksum_errs +
-		drvs->rx_ip_checksum_errs +
-		drvs->rx_udp_checksum_errs;
-
-	/* detailed rx errors */
-	dev_stats->rx_length_errors = drvs->rx_in_range_errors +
-		drvs->rx_out_range_errors +
-		drvs->rx_frame_too_long;
-
-	dev_stats->rx_crc_errors = drvs->rx_crc_errors;
-
-	/* frame alignment errors */
-	dev_stats->rx_frame_errors = drvs->rx_alignment_symbol_errors;
-
-	/* receiver fifo overrun */
-	/* drops_no_pbuf is no per i/f, it's per BE card */
-	dev_stats->rx_fifo_errors = drvs->rxpp_fifo_overflow_drop +
-				drvs->rx_input_fifo_overflow_drop +
-				drvs->rx_drops_no_pbuf;
-}
-
 void be_link_status_update(struct be_adapter *adapter, bool link_up)
 {
 	struct net_device *netdev = adapter->netdev;
@@ -586,8 +515,10 @@ static void be_tx_stats_update(struct be
 
 	stats->be_tx_reqs++;
 	stats->be_tx_wrbs += wrb_cnt;
+	u64_stats_update_begin(&stats->syncp);
 	stats->be_tx_bytes += copied;
 	stats->be_tx_pkts += (gso_segs ? gso_segs : 1);
+	u64_stats_update_end(&stats->syncp);
 	if (stopped)
 		stats->be_tx_stops++;
 }
@@ -793,6 +724,89 @@ static netdev_tx_t be_xmit(struct sk_buf
 	return NETDEV_TX_OK;
 }
 
+static struct rtnl_link_stats64 *be_get_stats(struct net_device *netdev,
+					      struct rtnl_link_stats64 *stats)
+{
+	struct be_adapter *adapter = netdev_priv(netdev);
+	struct be_drv_stats *drvs = &adapter->drv_stats;
+	const struct be_rx_obj *rxo;
+	const struct be_tx_obj *txo;
+	u64 pkts, bytes;
+	unsigned int start;
+	int i;
+
+	for_all_rx_queues(adapter, rxo, i) {
+		const struct be_rx_stats *rx_stats = rx_stats(rxo);
+		do {
+			start = u64_stats_fetch_begin(&rx_stats->syncp);
+			pkts = rx_stats->rx_pkts;
+			bytes = rx_stats->rx_bytes;
+		} while (u64_stats_fetch_retry(&rx_stats->syncp, start));
+
+		stats->rx_packets += pkts;
+		stats->rx_bytes += bytes;
+		stats->multicast += rx_stats->rx_mcast_pkts;
+
+		/*  no space in linux buffers: best possible approximation */
+		if (adapter->generation == BE_GEN3) {
+			if (!(lancer_chip(adapter))) {
+				const struct be_erx_stats_v1 *erx =
+					be_erx_stats_from_cmd(adapter);
+				stats->rx_dropped += erx->rx_drops_no_fragments[rxo->q.id];
+			}
+		} else {
+			const struct be_erx_stats_v0 *erx =
+				be_erx_stats_from_cmd(adapter);
+			stats->rx_dropped += erx->rx_drops_no_fragments[rxo->q.id];
+		}
+	}
+
+	for_all_tx_queues(adapter, txo, i) {
+		const struct be_tx_stats *tx_stats = tx_stats(txo);
+		do {
+			start = u64_stats_fetch_begin(&tx_stats->syncp);
+			pkts = tx_stats->be_tx_pkts;
+			bytes = tx_stats->be_tx_bytes;
+		} while (u64_stats_fetch_retry(&tx_stats->syncp, start));
+
+		stats->tx_packets += pkts;
+		stats->tx_bytes += bytes;
+	}
+
+	/* bad pkts received */
+	stats->rx_errors = drvs->rx_crc_errors +
+		drvs->rx_alignment_symbol_errors +
+		drvs->rx_in_range_errors +
+		drvs->rx_out_range_errors +
+		drvs->rx_frame_too_long +
+		drvs->rx_dropped_too_small +
+		drvs->rx_dropped_too_short +
+		drvs->rx_dropped_header_too_small +
+		drvs->rx_dropped_tcp_length +
+		drvs->rx_dropped_runt +
+		drvs->rx_tcp_checksum_errs +
+		drvs->rx_ip_checksum_errs +
+		drvs->rx_udp_checksum_errs;
+
+	/* detailed rx errors */
+	stats->rx_length_errors = drvs->rx_in_range_errors +
+		drvs->rx_out_range_errors +
+		drvs->rx_frame_too_long;
+
+	stats->rx_crc_errors = drvs->rx_crc_errors;
+
+	/* frame alignment errors */
+	stats->rx_frame_errors = drvs->rx_alignment_symbol_errors;
+
+	/* receiver fifo overrun */
+	/* drops_no_pbuf is not per i/f, it's per BE card */
+	stats->rx_fifo_errors = drvs->rxpp_fifo_overflow_drop +
+		drvs->rx_input_fifo_overflow_drop +
+		drvs->rx_drops_no_pbuf;
+
+	return stats;
+}
+
 static int be_change_mtu(struct net_device *netdev, int new_mtu)
 {
 	struct be_adapter *adapter = netdev_priv(netdev);
@@ -1040,8 +1054,12 @@ static void be_rx_stats_update(struct be
 
 	stats->rx_compl++;
 	stats->rx_frags += rxcp->num_rcvd;
+
+	u64_stats_update_begin(&stats->syncp);
 	stats->rx_bytes += rxcp->pkt_size;
 	stats->rx_pkts++;
+	u64_stats_update_end(&stats->syncp);
+
 	if (rxcp->pkt_type == BE_MULTICAST_PACKET)
 		stats->rx_mcast_pkts++;
 	if (rxcp->err)
@@ -2918,6 +2936,7 @@ static struct net_device_ops be_netdev_o
 	.ndo_set_rx_mode	= be_set_multicast_list,
 	.ndo_set_mac_address	= be_mac_addr_set,
 	.ndo_change_mtu		= be_change_mtu,
+	.ndo_get_stats64	= be_get_stats,
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_vlan_rx_register	= be_vlan_register,
 	.ndo_vlan_rx_add_vid	= be_vlan_add_vid,
--- a/drivers/net/benet/be_cmds.c	2011-06-27 10:59:50.935999145 -0700
+++ b/drivers/net/benet/be_cmds.c	2011-06-27 10:59:54.643999144 -0700
@@ -103,7 +103,6 @@ static int be_mcc_compl_process(struct b
 							sizeof(resp->hw_stats));
 			}
 			be_parse_stats(adapter);
-			netdev_stats_update(adapter);
 			adapter->stats_cmd_sent = false;
 		}
 	} else if ((compl_status != MCC_STATUS_NOT_SUPPORTED) &&

^ permalink raw reply
