Netdev List
 help / color / mirror / Atom feed
* Re: Regarding to your linux kernel CL
From: Chung-Yih Wang (王崇懿) @ 2010-10-06  8:04 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Timo Teräs, davem, netdev
In-Reply-To: <20101006075946.GA5311@gondor.apana.org.au>

I have submitted a patch([PATCH] net: Fix sk_dst_check() to reset the
obsolete dst_entry of a socket) for this, please reply to that thread
then.

Thanks,
Chung-yih

On Wed, Oct 6, 2010 at 12:59 AM, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Wed, Oct 06, 2010 at 10:02:56AM +0300, Timo Teräs wrote:
>>
>> What's the problem here? sk_dst_check not honoring if dst->obsolete>0 ?
>> Sounds like the sk_dst_check was buggy in the first place.
>
> Well the problem is that before we changed ip4_dst_check, everything
> worked properly.  With IPv6, whenever a route is released, the serial
> number is always updated accordingly.  This means that ip6_dst_check
> will always return NULL when obsolete > 1.
>
> The old ip4_dst_check also satisfied this requirement since it always
> returns NULL.
>
>> Looks like there's still some code around that does not do what the
>> obsolete field has been used for a long time.
>>   obsolete =  0, dst entry is ok
>>   obsolete = -1, you need to call ops->check for this entry
>>   obsolete >  0, this entry is invalid
>>
>> So net/core/sock.c needs fixing. Just if we should change dst_check()
>> too, I'm not sure.
>>
>> Should we fix sk_dst_check to use dst_check(), and dst_check() to check
>> for dst->obsolete>0 ?
>
> Yes this should work too.  However, I was never totally happy with
> this new dst->obsolete logic which means that we're doing an
> indirect call for every single packet which almost always does
> nothing.
>
> Perhaps we should move the genid/cookie logic into the dst so that
> we can eliminate the dst->check call or at least make it a lot less
> frequent.
>
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
>

^ permalink raw reply

* Re: Regarding to your linux kernel CL
From: Herbert Xu @ 2010-10-06  7:59 UTC (permalink / raw)
  To: Timo Teräs
  Cc: "Chung-Yih Wang (王崇懿)", davem,
	netdev
In-Reply-To: <4CAC1F20.6070009@iki.fi>

On Wed, Oct 06, 2010 at 10:02:56AM +0300, Timo Teräs wrote:
>
> What's the problem here? sk_dst_check not honoring if dst->obsolete>0 ?
> Sounds like the sk_dst_check was buggy in the first place.

Well the problem is that before we changed ip4_dst_check, everything
worked properly.  With IPv6, whenever a route is released, the serial
number is always updated accordingly.  This means that ip6_dst_check
will always return NULL when obsolete > 1.

The old ip4_dst_check also satisfied this requirement since it always
returns NULL.

> Looks like there's still some code around that does not do what the
> obsolete field has been used for a long time.
>   obsolete =  0, dst entry is ok
>   obsolete = -1, you need to call ops->check for this entry
>   obsolete >  0, this entry is invalid
> 
> So net/core/sock.c needs fixing. Just if we should change dst_check()
> too, I'm not sure.
> 
> Should we fix sk_dst_check to use dst_check(), and dst_check() to check
> for dst->obsolete>0 ?

Yes this should work too.  However, I was never totally happy with
this new dst->obsolete logic which means that we're doing an
indirect call for every single packet which almost always does
nothing.

Perhaps we should move the genid/cookie logic into the dst so that
we can eliminate the dst->check call or at least make it a lot less
frequent.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] net: Fix sk_dst_check() to reset the obsolete dst_entry of a socket.
From: Chung-Yih Wang (王崇懿) @ 2010-10-06  7:47 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel, timo.teras
In-Reply-To: <20101006.003541.71128466.davem@davemloft.net>

In fact, that is what I intent to change originally. However, consider
Timo's issue, I intent to submit this patch instead.

On Wed, Oct 6, 2010 at 12:35 AM, David Miller <davem@davemloft.net> wrote:
>
> This should have been fixed by:
>
> --------------------
> commit ae2688d59b5f861dc70a091d003773975d2ae7fb
> Author: Jianzhao Wang <jianzhao.wang@6wind.com>
> Date:   Wed Sep 8 14:35:43 2010 -0700
>
>    net: blackhole route should always be recalculated
>
>    Blackhole routes are used when xfrm_lookup() returns -EREMOTE (error
>    triggered by IKE for example), hence this kind of route is always
>    temporary and so we should check if a better route exists for next
>    packets.
>    Bug has been introduced by commit d11a4dc18bf41719c9f0d7ed494d295dd2973b92.
>
>    Signed-off-by: Jianzhao Wang <jianzhao.wang@6wind.com>
>    Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>    Signed-off-by: David S. Miller <davem@davemloft.net>
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 3f56b6e..6298f75 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -2738,6 +2738,11 @@ slow_output:
>  }
>  EXPORT_SYMBOL_GPL(__ip_route_output_key);
>
> +static struct dst_entry *ipv4_blackhole_dst_check(struct dst_entry *dst, u32 cookie)
> +{
> +       return NULL;
> +}
> +
>  static void ipv4_rt_blackhole_update_pmtu(struct dst_entry *dst, u32 mtu)
>  {
>  }
> @@ -2746,7 +2751,7 @@ static struct dst_ops ipv4_dst_blackhole_ops = {
>        .family                 =       AF_INET,
>        .protocol               =       cpu_to_be16(ETH_P_IP),
>        .destroy                =       ipv4_dst_destroy,
> -       .check                  =       ipv4_dst_check,
> +       .check                  =       ipv4_blackhole_dst_check,
>        .update_pmtu            =       ipv4_rt_blackhole_update_pmtu,
>        .entries                =       ATOMIC_INIT(0),
>  };
>

^ permalink raw reply

* Re: [patch] eicon: make buffer larger
From: Dan Carpenter @ 2010-10-06  7:47 UTC (permalink / raw)
  To: Armin Schindler; +Cc: Karsten Keil, netdev, linux-kernel, kernel-janitors
In-Reply-To: <alpine.DEB.2.00.1010060923120.29107@justus.melware.de>

On Wed, Oct 06, 2010 at 09:25:44AM +0200, Armin Schindler wrote:
> On Mon, 4 Oct 2010, Dan Carpenter wrote:
>> In diva_mnt_add_xdi_adapter() we do this:
>>  strcpy (clients[id].drvName,     tmp);
>>  strcpy (clients[id].Dbg.drvName, tmp);
>>
>> The "clients[id].drvName" is a 128 character buffer and
>> "clients[id].Dbg.drvName" was originally a 16 character buffer but I've
>> changed it to 128 as well.  We don't actually use 128 characters but we
>> do use more than 16.
>
> I don't see any reason for that change. The driver names here do not use  
> more than 16 characters and when filled, the length is checked anyway.
> Please avoid changing the size of that structure.
>

drivers/isdn/hardware/eicon/debug.c diva_mnt_add_xdi_adapter()
   874      sprintf (tmp, "ADAPTER:%d SN:%u-%d",
                           12345678 90123 45 67

	That's a minimum 17 characters.

   875               (int)logical,
   876               serial & 0x00ffffff,
   877               (byte)(((serial & 0xff000000) >> 24) + 1));
   878    } else {
   879      sprintf (tmp, "ADAPTER:%d SN:%u", (int)logical, serial);
   880    }

regards,
dan carpenter

> Armin


^ permalink raw reply

* Re: [RFC] bonding: fix workqueue re-arming races
From: Narendra_K @ 2010-10-06  7:36 UTC (permalink / raw)
  To: jbohac; +Cc: fubar, bonding-devel, markine, jarkao2, chavey, netdev
In-Reply-To: <20101005150317.GA15555@libnet-test.oslab.blr.amer.dell.com>

On Tue, Oct 05, 2010 at 08:33:29PM +0530, K, Narendra wrote:
>    On Fri, Oct 01, 2010 at 11:52:32PM +0530, Jiri Bohac wrote:
>    >    On Fri, Sep 24, 2010 at 06:23:53AM -0500, Narendra K wrote:
>    >    > On Fri, Sep 17, 2010 at 04:14:33AM +0530, Jay Vosburgh wrote:
>    >    > > Jay Vosburgh <fubar@us.ibm.com> wrote:
>    >    > The follwing call trace was seen -
>    >
>    >    This should be the BUG_ON(cwq->nr_active) in
>    >    destroy_workqueue()
>    >
>    >    This is really strange. bondng_store_bonds() can do two things:
>    >    create or delete a bonding device.
>    >
>    >    I checked the delete path, where I would normally expect such a
>    >    problem, but I can't find a way it could fail in this way.
>    >    bondng_store_bonds() calls unregister_netdevice(), which
>    >    - calls rollback_registered() -> bond_close()
>    >    - puts the device on the net_todo_list.
>    >    On rtnl_unlock() netdev_run_todo() gets called and that calls
>    >    bond_destructor().
>    >
>    >    bond_close() now makes sure the rearming work items are not
>    >    pending, thus, the only work items that may still be pending on
>    >    the workqueue are the non-rearming "commit" work items.
>    >    flush_workqueue(), called at the beginning of destroy_workqueue()
>    >    should have waited for these to finish.
>    >    If all of the above is correct, this BUG_ON should never trigger.
>    >
>    >    Maybe I am overlooking something, or it may be some kind of
>    >    failure/race condition in the create path, resulting in
>    >    bond_destructor() being called as well.
>    >
>    >    Narendra, any chance to capture the dmesg lines preceeding the
>    >    BUG message? This should show which of the above cases it is.
>
>    Jiri, I will try to reproduce the issue with ignore_loglevel to capture
>    more data on the serial console and share it shortly.

Here is a more verbose sequence of log messages just before the issue is
hit. I have attached logs beginning two iterations before the failure.
Please let me know if you need logs further back in the sequence.

[ 6139.115628] bonding: bond0 is being created...
[ 6139.122159] bonding: bond0: setting mode to balance-alb (6).
[ 6139.128537] bonding: bond0: Setting MII monitoring interval to 100.
[ 6139.137702] ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 6139.145516] bonding: bond0: Adding slave eth0.
[ 6139.173964] bnx2 0000:01:00.0: irq 79 for MSI/MSI-X
[ 6139.179111] bnx2 0000:01:00.0: irq 80 for MSI/MSI-X
[ 6139.184262] bnx2 0000:01:00.0: irq 81 for MSI/MSI-X
[ 6139.189424] bnx2 0000:01:00.0: irq 82 for MSI/MSI-X
[ 6139.194559] bnx2 0000:01:00.0: irq 83 for MSI/MSI-X
[ 6139.199701] bnx2 0000:01:00.0: irq 84 for MSI/MSI-X
[ 6139.204873] bnx2 0000:01:00.0: irq 85 for MSI/MSI-X
[ 6139.210016] bnx2 0000:01:00.0: irq 86 for MSI/MSI-X
[ 6139.215158] bnx2 0000:01:00.0: irq 87 for MSI/MSI-X
[ 6139.270975] bnx2 0000:01:00.0: eth0: using MSIX
[ 6139.278893] bonding: bond0: enslaving eth0 as an active interface with a down link.
[ 6139.291007] bonding: bond0: Adding slave eth1.
[ 6139.321113] bnx2 0000:01:00.1: irq 88 for MSI/MSI-X
[ 6139.325991] bnx2 0000:01:00.1: irq 89 for MSI/MSI-X
[ 6139.330866] bnx2 0000:01:00.1: irq 90 for MSI/MSI-X
[ 6139.335752] bnx2 0000:01:00.1: irq 91 for MSI/MSI-X
[ 6139.340626] bnx2 0000:01:00.1: irq 92 for MSI/MSI-X
[ 6139.345500] bnx2 0000:01:00.1: irq 93 for MSI/MSI-X
[ 6139.350373] bnx2 0000:01:00.1: irq 94 for MSI/MSI-X
[ 6139.355246] bnx2 0000:01:00.1: irq 95 for MSI/MSI-X
[ 6139.360119] bnx2 0000:01:00.1: irq 96 for MSI/MSI-X
[ 6139.418765] bnx2 0000:01:00.1: eth1: using MSIX
[ 6139.426671] bonding: bond0: enslaving eth1 as an active interface with a down link.
[ 6139.438664] bonding: bond0: Adding slave eth2.
[ 6139.469101] bnx2 0000:02:00.0: irq 97 for MSI/MSI-X
[ 6139.473980] bnx2 0000:02:00.0: irq 98 for MSI/MSI-X
[ 6139.478856] bnx2 0000:02:00.0: irq 99 for MSI/MSI-X
[ 6139.483743] bnx2 0000:02:00.0: irq 100 for MSI/MSI-X
[ 6139.488706] bnx2 0000:02:00.0: irq 101 for MSI/MSI-X
[ 6139.493670] bnx2 0000:02:00.0: irq 102 for MSI/MSI-X
[ 6139.498641] bnx2 0000:02:00.0: irq 103 for MSI/MSI-X
[ 6139.503604] bnx2 0000:02:00.0: irq 104 for MSI/MSI-X
[ 6139.508566] bnx2 0000:02:00.0: irq 105 for MSI/MSI-X
[ 6139.566908] bnx2 0000:02:00.0: eth2: using MSIX
[ 6139.574815] bonding: bond0: enslaving eth2 as an active interface with a down link.
[ 6139.586686] bonding: bond0: Adding slave eth3.
[ 6139.617042] bnx2 0000:02:00.1: irq 106 for MSI/MSI-X
[ 6139.622011] bnx2 0000:02:00.1: irq 107 for MSI/MSI-X
[ 6139.626974] bnx2 0000:02:00.1: irq 108 for MSI/MSI-X
[ 6139.631942] bnx2 0000:02:00.1: irq 109 for MSI/MSI-X
[ 6139.636904] bnx2 0000:02:00.1: irq 110 for MSI/MSI-X
[ 6139.641867] bnx2 0000:02:00.1: irq 111 for MSI/MSI-X
[ 6139.646835] bnx2 0000:02:00.1: irq 112 for MSI/MSI-X
[ 6139.651798] bnx2 0000:02:00.1: irq 113 for MSI/MSI-X
[ 6139.656760] bnx2 0000:02:00.1: irq 114 for MSI/MSI-X
[ 6139.714833] bnx2 0000:02:00.1: eth3: using MSIX
[ 6139.722929] bonding: bond0: enslaving eth3 as an active interface with a down link.
[ 6141.684544] bnx2 0000:01:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex
[ 6141.732924] bonding: bond0: link status definitely up for interface eth0.
[ 6141.739714] bonding: bond0: making interface eth0 the new active one.
[ 6141.749618] bonding: bond0: first active interface up!
[ 6141.756427] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[ 6141.832597] bnx2 0000:01:00.1: eth1: NIC Copper Link is Up, 1000 Mbps full duplex
[ 6141.840185] bonding: bond0: link status definitely up for interface eth1.
[ 6142.013511] bnx2 0000:02:00.0: eth2: NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
[ 6142.037019] bonding: bond0: link status definitely up for interface eth2.
[ 6146.252919] bonding: bond0: link status definitely down for interface eth0, disabling it
[ 6146.261009] bonding: bond0: making interface eth1 the new active one.
[ 6146.305466] bnx2 0000:01:00.0: irq 79 for MSI/MSI-X
[ 6146.310347] bnx2 0000:01:00.0: irq 80 for MSI/MSI-X
[ 6146.315223] bnx2 0000:01:00.0: irq 81 for MSI/MSI-X
[ 6146.320100] bnx2 0000:01:00.0: irq 82 for MSI/MSI-X
[ 6146.324983] bnx2 0000:01:00.0: irq 83 for MSI/MSI-X
[ 6146.329858] bnx2 0000:01:00.0: irq 84 for MSI/MSI-X
[ 6146.334735] bnx2 0000:01:00.0: irq 85 for MSI/MSI-X
[ 6146.339611] bnx2 0000:01:00.0: irq 86 for MSI/MSI-X
[ 6146.344535] bnx2 0000:01:00.0: irq 87 for MSI/MSI-X
[ 6146.402416] bnx2 0000:01:00.0: eth0: using MSIX
[ 6147.100849] bonding: bond0: link status definitely down for interface eth1, disabling it
[ 6147.108943] bonding: bond0: making interface eth2 the new active one.
[ 6147.153439] bnx2 0000:01:00.1: irq 88 for MSI/MSI-X
[ 6147.158322] bnx2 0000:01:00.1: irq 89 for MSI/MSI-X
[ 6147.163199] bnx2 0000:01:00.1: irq 90 for MSI/MSI-X
[ 6147.168081] bnx2 0000:01:00.1: irq 91 for MSI/MSI-X
[ 6147.172957] bnx2 0000:01:00.1: irq 92 for MSI/MSI-X
[ 6147.177833] bnx2 0000:01:00.1: irq 93 for MSI/MSI-X
[ 6147.182717] bnx2 0000:01:00.1: irq 94 for MSI/MSI-X
[ 6147.187597] bnx2 0000:01:00.1: irq 95 for MSI/MSI-X
[ 6147.192584] bnx2 0000:01:00.1: irq 96 for MSI/MSI-X
[ 6147.250432] bnx2 0000:01:00.1: eth1: using MSIX
[ 6147.956727] bonding: bond0: link status definitely down for interface eth2, disabling it
[ 6147.964821] bonding: bond0: now running without any active interface !
[ 6148.005316] bnx2 0000:02:00.0: irq 97 for MSI/MSI-X
[ 6148.010198] bnx2 0000:02:00.0: irq 98 for MSI/MSI-X
[ 6148.015072] bnx2 0000:02:00.0: irq 99 for MSI/MSI-X
[ 6148.019949] bnx2 0000:02:00.0: irq 100 for MSI/MSI-X
[ 6148.024911] bnx2 0000:02:00.0: irq 101 for MSI/MSI-X
[ 6148.029872] bnx2 0000:02:00.0: irq 102 for MSI/MSI-X
[ 6148.034833] bnx2 0000:02:00.0: irq 103 for MSI/MSI-X
[ 6148.039793] bnx2 0000:02:00.0: irq 104 for MSI/MSI-X
[ 6148.044756] bnx2 0000:02:00.0: irq 105 for MSI/MSI-X
[ 6148.102286] bnx2 0000:02:00.0: eth2: using MSIX
[ 6148.873352] bnx2 0000:02:00.1: irq 106 for MSI/MSI-X
[ 6148.878319] bnx2 0000:02:00.1: irq 107 for MSI/MSI-X
[ 6148.883293] bnx2 0000:02:00.1: irq 108 for MSI/MSI-X
[ 6148.888253] bnx2 0000:02:00.1: irq 109 for MSI/MSI-X
[ 6148.893213] bnx2 0000:02:00.1: irq 110 for MSI/MSI-X
[ 6148.898174] bnx2 0000:02:00.1: irq 111 for MSI/MSI-X
[ 6148.903134] bnx2 0000:02:00.1: irq 112 for MSI/MSI-X
[ 6148.908094] bnx2 0000:02:00.1: irq 113 for MSI/MSI-X
[ 6148.913060] bnx2 0000:02:00.1: irq 114 for MSI/MSI-X
[ 6148.970345] bnx2 0000:02:00.1: eth3: using MSIX
[ 6149.445381] bnx2 0000:01:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex
[ 6149.540563] bonding: bond0: link status definitely up for interface eth0.
[ 6149.547352] bonding: bond0: making interface eth0 the new active one.
[ 6149.557274] bonding: bond0: first active interface up!
[ 6149.941355] bonding: bond0: Removing slave eth0.
[ 6149.945983] bonding: bond0: Warning: the permanent HWaddr of eth0 - 00:22:19:cc:9a:25 - is still in use by bond0. Set the HWaddr of eth0 to a different address to avoid conflicts.
[ 6149.962010] bonding: bond0: releasing active interface eth0
[ 6150.074401] bonding: bond0: Removing slave eth1.
[ 6150.079027] bonding: bond0: releasing active interface eth1
[ 6150.202403] bonding: bond0: Removing slave eth2.
[ 6150.207028] bonding: bond0: releasing active interface eth2
[ 6150.330275] bonding: bond0: Removing slave eth3.
[ 6150.334900] bonding: bond0: releasing active interface eth3
[ 6150.552776] bonding: bond0 is being deleted...
[ 6156.145939] bonding: bond0 is being created...
[ 6156.152253] bonding: bond0: setting mode to balance-alb (6).
[ 6156.158899] bonding: bond0: Setting MII monitoring interval to 100.
[ 6156.168142] ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 6156.176120] bonding: bond0: Adding slave eth0.
[ 6156.205188] bnx2 0000:01:00.0: irq 79 for MSI/MSI-X
[ 6156.210066] bnx2 0000:01:00.0: irq 80 for MSI/MSI-X
[ 6156.214942] bnx2 0000:01:00.0: irq 81 for MSI/MSI-X
[ 6156.219817] bnx2 0000:01:00.0: irq 82 for MSI/MSI-X
[ 6156.224724] bnx2 0000:01:00.0: irq 83 for MSI/MSI-X
[ 6156.229599] bnx2 0000:01:00.0: irq 84 for MSI/MSI-X
[ 6156.234473] bnx2 0000:01:00.0: irq 85 for MSI/MSI-X
[ 6156.239347] bnx2 0000:01:00.0: irq 86 for MSI/MSI-X
[ 6156.244222] bnx2 0000:01:00.0: irq 87 for MSI/MSI-X
[ 6156.301985] bnx2 0000:01:00.0: eth0: using MSIX
[ 6156.309899] bonding: bond0: enslaving eth0 as an active interface with a down link.
[ 6156.321860] bonding: bond0: Adding slave eth1.
[ 6156.348184] bnx2 0000:01:00.1: irq 88 for MSI/MSI-X
[ 6156.353066] bnx2 0000:01:00.1: irq 89 for MSI/MSI-X
[ 6156.357942] bnx2 0000:01:00.1: irq 90 for MSI/MSI-X
[ 6156.362818] bnx2 0000:01:00.1: irq 91 for MSI/MSI-X
[ 6156.367694] bnx2 0000:01:00.1: irq 92 for MSI/MSI-X
[ 6156.372570] bnx2 0000:01:00.1: irq 93 for MSI/MSI-X
[ 6156.377445] bnx2 0000:01:00.1: irq 94 for MSI/MSI-X
[ 6156.382325] bnx2 0000:01:00.1: irq 95 for MSI/MSI-X
[ 6156.387199] bnx2 0000:01:00.1: irq 96 for MSI/MSI-X
[ 6156.445966] bnx2 0000:01:00.1: eth1: using MSIX
[ 6156.453878] bonding: bond0: enslaving eth1 as an active interface with a down link.
[ 6156.465826] bonding: bond0: Adding slave eth2.
[ 6156.496230] bnx2 0000:02:00.0: irq 97 for MSI/MSI-X
[ 6156.501110] bnx2 0000:02:00.0: irq 98 for MSI/MSI-X
[ 6156.505986] bnx2 0000:02:00.0: irq 99 for MSI/MSI-X
[ 6156.510868] bnx2 0000:02:00.0: irq 100 for MSI/MSI-X
[ 6156.515832] bnx2 0000:02:00.0: irq 101 for MSI/MSI-X
[ 6156.520794] bnx2 0000:02:00.0: irq 102 for MSI/MSI-X
[ 6156.525760] bnx2 0000:02:00.0: irq 103 for MSI/MSI-X
[ 6156.530723] bnx2 0000:02:00.0: irq 104 for MSI/MSI-X
[ 6156.535685] bnx2 0000:02:00.0: irq 105 for MSI/MSI-X
[ 6156.594017] bnx2 0000:02:00.0: eth2: using MSIX
[ 6156.601932] bonding: bond0: enslaving eth2 as an active interface with a down link.
[ 6156.613845] bonding: bond0: Adding slave eth3.
[ 6156.644229] bnx2 0000:02:00.1: irq 106 for MSI/MSI-X
[ 6156.649197] bnx2 0000:02:00.1: irq 107 for MSI/MSI-X
[ 6156.654161] bnx2 0000:02:00.1: irq 108 for MSI/MSI-X
[ 6156.659130] bnx2 0000:02:00.1: irq 109 for MSI/MSI-X
[ 6156.664093] bnx2 0000:02:00.1: irq 110 for MSI/MSI-X
[ 6156.669056] bnx2 0000:02:00.1: irq 111 for MSI/MSI-X
[ 6156.674023] bnx2 0000:02:00.1: irq 112 for MSI/MSI-X
[ 6156.678985] bnx2 0000:02:00.1: irq 113 for MSI/MSI-X
[ 6156.683947] bnx2 0000:02:00.1: irq 114 for MSI/MSI-X
[ 6156.742024] bnx2 0000:02:00.1: eth3: using MSIX
[ 6156.749940] bonding: bond0: enslaving eth3 as an active interface with a down link.
[ 6158.808483] bnx2 0000:01:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex
[ 6158.868054] bonding: bond0: link status definitely up for interface eth0.
[ 6158.874846] bonding: bond0: making interface eth0 the new active one.
[ 6158.884753] bonding: bond0: first active interface up!
[ 6158.891563] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[ 6158.922727] bnx2 0000:01:00.1: eth1: NIC Copper Link is Up, 1000 Mbps full duplex
[ 6158.968157] bonding: bond0: link status definitely up for interface eth1.
[ 6159.021517] bnx2 0000:02:00.0: eth2: NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
[ 6159.068235] bonding: bond0: link status definitely up for interface eth2.
[ 6160.040150] bnx2 0000:01:00.1: eth1: NIC Copper Link is Down
[ 6160.069773] bonding: bond0: link status definitely down for interface eth1, disabling it
[ 6162.460199] bnx2 0000:01:00.1: eth1: NIC Copper Link is Up, 1000 Mbps full duplex
[ 6162.467865] bonding: bond0: link status definitely up for interface eth1.
[ 6163.168061] bonding: bond0: link status definitely down for interface eth0, disabling it
[ 6163.176149] bonding: bond0: making interface eth1 the new active one.
[ 6163.220460] bnx2 0000:01:00.0: irq 79 for MSI/MSI-X
[ 6163.225481] bnx2 0000:01:00.0: irq 80 for MSI/MSI-X
[ 6163.230413] bnx2 0000:01:00.0: irq 81 for MSI/MSI-X
[ 6163.235342] bnx2 0000:01:00.0: irq 82 for MSI/MSI-X
[ 6163.240282] bnx2 0000:01:00.0: irq 83 for MSI/MSI-X
[ 6163.245212] bnx2 0000:01:00.0: irq 84 for MSI/MSI-X
[ 6163.250142] bnx2 0000:01:00.0: irq 85 for MSI/MSI-X
[ 6163.255080] bnx2 0000:01:00.0: irq 86 for MSI/MSI-X
[ 6163.260009] bnx2 0000:01:00.0: irq 87 for MSI/MSI-X
[ 6163.317682] bnx2 0000:01:00.0: eth0: using MSIX
[ 6164.032023] bonding: bond0: link status definitely down for interface eth1, disabling it
[ 6164.040115] bonding: bond0: making interface eth2 the new active one.
[ 6164.084607] bnx2 0000:01:00.1: irq 88 for MSI/MSI-X
[ 6164.089488] bnx2 0000:01:00.1: irq 89 for MSI/MSI-X
[ 6164.094361] bnx2 0000:01:00.1: irq 90 for MSI/MSI-X
[ 6164.099236] bnx2 0000:01:00.1: irq 91 for MSI/MSI-X
[ 6164.104111] bnx2 0000:01:00.1: irq 92 for MSI/MSI-X
[ 6164.108986] bnx2 0000:01:00.1: irq 93 for MSI/MSI-X
[ 6164.113860] bnx2 0000:01:00.1: irq 94 for MSI/MSI-X
[ 6164.118733] bnx2 0000:01:00.1: irq 95 for MSI/MSI-X
[ 6164.123608] bnx2 0000:01:00.1: irq 96 for MSI/MSI-X
[ 6164.181533] bnx2 0000:01:00.1: eth1: using MSIX
[ 6164.880097] bonding: bond0: link status definitely down for interface eth2, disabling it
[ 6164.888191] bonding: bond0: now running without any active interface !
[ 6164.932440] bnx2 0000:02:00.0: irq 97 for MSI/MSI-X
[ 6164.937322] bnx2 0000:02:00.0: irq 98 for MSI/MSI-X
[ 6164.942253] bnx2 0000:02:00.0: irq 99 for MSI/MSI-X
[ 6164.947192] bnx2 0000:02:00.0: irq 100 for MSI/MSI-X
[ 6164.952208] bnx2 0000:02:00.0: irq 101 for MSI/MSI-X
[ 6164.957234] bnx2 0000:02:00.0: irq 102 for MSI/MSI-X
[ 6164.962247] bnx2 0000:02:00.0: irq 103 for MSI/MSI-X
[ 6164.967274] bnx2 0000:02:00.0: irq 104 for MSI/MSI-X
[ 6164.972290] bnx2 0000:02:00.0: irq 105 for MSI/MSI-X
[ 6165.029585] bnx2 0000:02:00.0: eth2: using MSIX
[ 6165.776479] bnx2 0000:02:00.1: irq 106 for MSI/MSI-X
[ 6165.781449] bnx2 0000:02:00.1: irq 107 for MSI/MSI-X
[ 6165.786422] bnx2 0000:02:00.1: irq 108 for MSI/MSI-X
[ 6165.791385] bnx2 0000:02:00.1: irq 109 for MSI/MSI-X
[ 6165.796345] bnx2 0000:02:00.1: irq 110 for MSI/MSI-X
[ 6165.801312] bnx2 0000:02:00.1: irq 111 for MSI/MSI-X
[ 6165.806273] bnx2 0000:02:00.1: irq 112 for MSI/MSI-X
[ 6165.811233] bnx2 0000:02:00.1: irq 113 for MSI/MSI-X
[ 6165.816199] bnx2 0000:02:00.1: irq 114 for MSI/MSI-X
[ 6165.873445] bnx2 0000:02:00.1: eth3: using MSIX
[ 6165.951287] bnx2 0000:01:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex
[ 6165.967840] bonding: bond0: link status definitely up for interface eth0.
[ 6165.974631] bonding: bond0: making interface eth0 the new active one.
[ 6165.984534] bonding: bond0: first active interface up!
[ 6166.836175] bnx2 0000:01:00.1: eth1: NIC Copper Link is Up, 1000 Mbps full duplex
[ 6166.839757] bonding: bond0: Removing slave eth0.
[ 6166.839768] bonding: bond0: Warning: the permanent HWaddr of eth0 - 00:22:19:cc:9a:25 - is still in use by bond0. Set the HWaddr of eth0 to a different address to avoid conflicts.
[ 6166.839772] bonding: bond0: releasing active interface eth0
[ 6166.870004]
[ 6166.999414] bonding: bond0: Removing slave eth1.
[ 6167.004041] bonding: bond0: releasing active interface eth1
[ 6167.125571] bonding: bond0: Removing slave eth2.
[ 6167.130196] bonding: bond0: releasing active interface eth2
[ 6167.253539] bonding: bond0: Removing slave eth3.
[ 6167.258162] bonding: bond0: releasing active interface eth3
[ 6167.443911] bonding: bond0 is being deleted...
[ 6167.508557] ------------[ cut here ]------------
[ 6167.513167] kernel BUG at kernel/workqueue.c:2844!
[ 6167.517948] invalid opcode: 0000 [#1] SMP
[ 6167.522058] last sysfs file: /sys/class/net/bonding_masters
[ 6167.527619] CPU 0
[ 6167.529452] Modules linked in: af_packet bonding ipv6 mperf microcode fuse loop dm_mod joydev usbhid hid tpm_tis tpm usb_storage iTCO_wdt tpm_bios iTCO_vendor_support rtc_cmos rtc_core rtc_lib sg mptctl sr_mod cdrom dcdbas power_meter bnx2 serio_raw button pcspkr uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif edd ext3 mbcache jbd fan processor ide_pci_generic ide_core ata_generic ata_piix libata mptsas mptscsih mptbase scsi_transport_sas scsi_mod thermal thermal_sys hwmon
[ 6167.571721]
[ 6167.573209] Pid: 13848, comm: ifdown-bonding Not tainted 2.6.35.with.upstream.patch-next-20100811-0.7-default+ #1 0M233H/PowerEdge R710
[ 6167.585362] RIP: 0010:[<ffffffff81067b50>]  [<ffffffff81067b50>] destroy_workqueue+0x1d0/0x1e0
[ 6167.593977] RSP: 0018:ffff880229981d88  EFLAGS: 00010286
[ 6167.599278] RAX: 000000000000003c RBX: ffff880127646800 RCX: ffff880128324700
[ 6167.606401] RDX: 0000000000001000 RSI: 0000000000000002 RDI: 000000000000001a
[ 6167.613523] RBP: ffff880229981da8 R08: ffff880229981cf8 R09: 0000000000000000
[ 6167.620646] R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000002
[ 6167.627769] R13: ffffffff817b8560 R14: ffff88012a028480 R15: ffff88012a028488
[ 6167.634892] FS:  00007fb251db3700(0000) GS:ffff880133a00000(0000) knlGS:0000000000000000
[ 6167.642969] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6167.648705] CR2: 00007fb251de6000 CR3: 00000001283ca000 CR4: 00000000000006f0
[ 6167.655827] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6167.662950] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 6167.670073] Process ifdown-bonding (pid: 13848, threadinfo ffff880229980000, task ffff88021db9a600)
[ 6167.679103] Stack:
[ 6167.681110]  ffff880229981da8 ffff88012a028000 ffff88012a028000 000000010016619c
[ 6167.688354] <0> ffff880229981dc8 ffffffffa03bf91d ffff88012a028000 000000010016619c
[ 6167.696053] <0> ffff880229981e28 ffffffff812e0a08 ffff880229981e38 ffff880229981de8
[ 6167.703936] Call Trace:
[ 6167.706382]  [<ffffffffa03bf91d>] bond_destructor+0x1d/0x30 [bonding]
[ 6167.712816]  [<ffffffff812e0a08>] netdev_run_todo+0x1a8/0x270
[ 6167.718555]  [<ffffffff812ee859>] rtnl_unlock+0x9/0x10
[ 6167.723752]  [<ffffffffa03cc824>] bonding_store_bonds+0x1c4/0x1f0 [bonding]
[ 6167.730705]  [<ffffffff810f26be>] ? alloc_pages_current+0x9e/0x110
[ 6167.736876]  [<ffffffff81285c9e>] class_attr_store+0x1e/0x20
[ 6167.742528]  [<ffffffff8116e365>] sysfs_write_file+0xc5/0x140
[ 6167.748267]  [<ffffffff8110a68f>] vfs_write+0xcf/0x190
[ 6167.753397]  [<ffffffff8110a840>] sys_write+0x50/0x90
[ 6167.758444]  [<ffffffff81002ec2>] system_call_fastpath+0x16/0x1b
[ 6167.764439] Code: 00 7f 14 8b 3b eb 91 3d 00 10 00 00 89 c2 77 10 8b 3b e9 07 ff ff ff 3d 00 10 00 00 89 c2 76 f0 8b 3b e9 a9 fe ff ff 0f 0b eb fe <0f> 0b eb fe 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 8b 3d 00
[ 6167.783930] RIP  [<ffffffff81067b50>] destroy_workqueue+0x1d0/0x1e0
[ 6167.790200]  RSP <ffff880229981d88>
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu


--
With regards,
Narendra K

^ permalink raw reply

* Re: [PATCH] net: Fix sk_dst_check() to reset the obsolete dst_entry of a socket.
From: David Miller @ 2010-10-06  7:35 UTC (permalink / raw)
  To: cywang; +Cc: netdev, linux-kernel, timo.teras
In-Reply-To: <AANLkTikkv4htP1f=KFwSaa8=K6LicDN0eusXctow-Efb@mail.gmail.com>


This should have been fixed by:

--------------------
commit ae2688d59b5f861dc70a091d003773975d2ae7fb
Author: Jianzhao Wang <jianzhao.wang@6wind.com>
Date:   Wed Sep 8 14:35:43 2010 -0700

    net: blackhole route should always be recalculated
    
    Blackhole routes are used when xfrm_lookup() returns -EREMOTE (error
    triggered by IKE for example), hence this kind of route is always
    temporary and so we should check if a better route exists for next
    packets.
    Bug has been introduced by commit d11a4dc18bf41719c9f0d7ed494d295dd2973b92.
    
    Signed-off-by: Jianzhao Wang <jianzhao.wang@6wind.com>
    Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 3f56b6e..6298f75 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2738,6 +2738,11 @@ slow_output:
 }
 EXPORT_SYMBOL_GPL(__ip_route_output_key);
 
+static struct dst_entry *ipv4_blackhole_dst_check(struct dst_entry *dst, u32 cookie)
+{
+	return NULL;
+}
+
 static void ipv4_rt_blackhole_update_pmtu(struct dst_entry *dst, u32 mtu)
 {
 }
@@ -2746,7 +2751,7 @@ static struct dst_ops ipv4_dst_blackhole_ops = {
 	.family			=	AF_INET,
 	.protocol		=	cpu_to_be16(ETH_P_IP),
 	.destroy		=	ipv4_dst_destroy,
-	.check			=	ipv4_dst_check,
+	.check			=	ipv4_blackhole_dst_check,
 	.update_pmtu		=	ipv4_rt_blackhole_update_pmtu,
 	.entries		=	ATOMIC_INIT(0),
 };

^ permalink raw reply related

* [PATCH] net: Fix sk_dst_check() to reset the obsolete dst_entry of a socket.
From: Chung-Yih Wang (王崇懿) @ 2010-10-06  7:27 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel

    The issue is caused by the CL d11a4dc18bf41719c9f0d7ed494d295dd2973b92
    which will never reset the dst_entry of a socket if its current entry
    is freed(obsolete) for ipv4. This will block the socket's traffic
    instead of looking up a new dst_entry.

    Signed-off-by: Chung-yih Wang <cywang@google.com>
---
diff --git a/net/core/sock.c b/net/core/sock.c
index ef30e9d..b508819 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -382,7 +382,8 @@ struct dst_entry *__sk_dst_check(struct sock *sk,
u32 cookie)
 {
        struct dst_entry *dst = __sk_dst_get(sk);

-       if (dst && dst->obsolete && dst->ops->check(dst, cookie) == NULL) {
+       if (dst && dst->obsolete && ((dst->obsolete > 0) ||
+           (dst->ops->check(dst, cookie) == NULL))) {
                sk_tx_queue_clear(sk);
                rcu_assign_pointer(sk->sk_dst_cache, NULL);
                dst_release(dst);
@@ -397,7 +398,8 @@ struct dst_entry *sk_dst_check(struct sock *sk, u32 cookie)
 {
        struct dst_entry *dst = sk_dst_get(sk);

-       if (dst && dst->obsolete && dst->ops->check(dst, cookie) == NULL) {
+       if (dst && dst->obsolete && ((dst->obsolete > 0) ||
+           (dst->ops->check(dst, cookie) == NULL))) {
                sk_dst_reset(sk);
                dst_release(dst);
                return NULL;

^ permalink raw reply related

* Re: [patch] eicon: make buffer larger
From: Armin Schindler @ 2010-10-06  7:25 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Karsten Keil, netdev, linux-kernel, kernel-janitors
In-Reply-To: <20101004192459.GE5692@bicker>

On Mon, 4 Oct 2010, Dan Carpenter wrote:
> In diva_mnt_add_xdi_adapter() we do this:
>  strcpy (clients[id].drvName,     tmp);
>  strcpy (clients[id].Dbg.drvName, tmp);
>
> The "clients[id].drvName" is a 128 character buffer and
> "clients[id].Dbg.drvName" was originally a 16 character buffer but I've
> changed it to 128 as well.  We don't actually use 128 characters but we
> do use more than 16.

I don't see any reason for that change. The driver names here do not use 
more than 16 characters and when filled, the length is checked anyway.
Please avoid changing the size of that structure.

Armin

> I've also changed the size of "tmp" to 128 characters instead of 256.
>
> Signed-off-by: Dan Carpenter <error27@gmail.com>
>
> diff --git a/drivers/isdn/hardware/eicon/debuglib.h b/drivers/isdn/hardware/eicon/debuglib.h
> index 8ea5877..02eed6b 100644
> --- a/drivers/isdn/hardware/eicon/debuglib.h
> +++ b/drivers/isdn/hardware/eicon/debuglib.h
> @@ -249,7 +249,7 @@ typedef struct _DbgHandle_
>  }     regTime ;  /* timestamp for registration       */
>  void               *pIrp ;   /* ptr to pending i/o request       */
>  unsigned long       dbgMask ;  /* current debug mask               */
> - char                drvName[16] ; /* ASCII name of registered driver  */
> + char                drvName[128] ; /* ASCII name of registered driver  */
>  char                drvTag[64] ; /* revision string     */
>  DbgEnd              dbg_end ;  /* function for debug closing       */
>  DbgLog              dbg_prt ;  /* function for debug appending     */
> diff --git a/drivers/isdn/hardware/eicon/debug.c b/drivers/isdn/hardware/eicon/debug.c
> index 33ce89e..3626401 100644
> --- a/drivers/isdn/hardware/eicon/debug.c
> +++ b/drivers/isdn/hardware/eicon/debug.c
> @@ -862,7 +862,7 @@ void diva_mnt_add_xdi_adapter (const DESCRIPTOR* d) {
>   diva_os_spin_lock_magic_t old_irql, old_irql1;
>   dword sec, usec, logical, serial, org_mask;
>   int id, best_id = 0, free_id = -1;
> -  char tmp[256];
> +  char tmp[128];
>   diva_dbg_entry_head_t* pmsg = NULL;
>   int len;
>   word size;
>

^ permalink raw reply

* Re: Regarding to your linux kernel CL
From: Timo Teräs @ 2010-10-06  7:02 UTC (permalink / raw)
  To: "Chung-Yih Wang (王崇懿)"
  Cc: herbert, davem, netdev
In-Reply-To: <AANLkTikdcL0JQgkR6u0qtmDu-phMZ6-Juq91B1N5GfiY@mail.gmail.com>

On 10/05/2010 04:23 AM, Chung-Yih Wang (王崇懿) wrote:
>     I encountered an issue with your CL
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d11a4dc18bf41719c9f0d7ed494d295dd2973b92.
> The cause is that we use a connected UDP socket for building the
> l2tp/ipsec vpn connection. However, when the ipsec tunnel is built,
> your CL made the sk_dst_check useless(since it always return the
> 'freed' dst_entry and can not reset the dst entry for the socket).
> What is your comment to conquer this issue?
> 
> Solution 1. We could add a CL to change it to (dst && dst->obsolete &&
> (dst->obsolete>0  || dst->ops->check(...)==NULL) in sk_dst_check()) ?
> 
> Solution 2. Revert the change?
> 
> Any comment?

What's the problem here? sk_dst_check not honoring if dst->obsolete>0 ?
Sounds like the sk_dst_check was buggy in the first place.

Looks like there's still some code around that does not do what the
obsolete field has been used for a long time.
  obsolete =  0, dst entry is ok
  obsolete = -1, you need to call ops->check for this entry
  obsolete >  0, this entry is invalid

So net/core/sock.c needs fixing. Just if we should change dst_check()
too, I'm not sure.

Should we fix sk_dst_check to use dst_check(), and dst_check() to check
for dst->obsolete>0 ?

^ permalink raw reply

* Re: [PATCH net-next] igb: fix stats handling
From: Jeff Kirsher @ 2010-10-06  5:49 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesper Dangaard Brouer, Alexander Duyck, Jesper Dangaard Brouer,
	David S. Miller, netdev, Carolyn Wyborny
In-Reply-To: <1286339791.4861.26.camel@edumazet-laptop>

On Tue, Oct 5, 2010 at 21:36, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mercredi 06 octobre 2010 à 05:28 +0200, Eric Dumazet a écrit :
>
>> I'll let Intel guys doing the backporting work, but for old kernels,
>> you'll probably need to use "unsigned long" instead of "u64"
>>
>> My plan is :
>>
>> - Provide 64bit counters even on 32bit arch
>> - with proper synchro (include/linux/u64_stats_sync.h)
>> - Add a spinlock so we can apply Jesper patch.
>
> Here is the net-next-2.6 patch, I am currently enable to test it, the
> dev machine with IGB NIC cannot be restarted until tomorrow, my son
> Nicolas is currently using it ;)
>
> Could you and/or Jesper test it, possibly on 32 and 64 bit kernels ?
>
> Thanks !
>
> [PATCH net-next] igb: fix stats handling
>
> There are currently some problems with igb.
>
> - On 32bit arches, maintaining 64bit counters without proper
> synchronization between writers and readers.
>
> - Stats updated every two seconds, as reported by Jesper.
>   (Jesper provided a patch for this)
>
> - Potential problem between worker thread and ethtool -S
>
> This patch uses u64_stats_sync, and convert everything to be 64bit safe,
> SMP safe, even on 32bit arches.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
>  drivers/net/igb/igb.h         |    7 +-
>  drivers/net/igb/igb_ethtool.c |   10 +-
>  drivers/net/igb/igb_main.c    |  111 +++++++++++++++++++++++---------
>  3 files changed, 94 insertions(+), 34 deletions(-)
>

Thanks again Eric!  I will add your patch to my queue.

-- 
Cheers,
Jeff

^ permalink raw reply

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
From: Jeff Kirsher @ 2010-10-06  5:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jesper Dangaard Brouer, Alexander Duyck, Jesper Dangaard Brouer,
	David S. Miller, netdev, Carolyn Wyborny
In-Reply-To: <1286335729.4861.13.camel@edumazet-laptop>

On Tue, Oct 5, 2010 at 20:28, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 05 octobre 2010 à 15:34 -0700, Jeff Kirsher a écrit :
>> On Tue, Oct 5, 2010 at 14:16, Jesper Dangaard Brouer <hawk@diku.dk> wrote:
>> > On Tue, 5 Oct 2010, Eric Dumazet wrote:
>> >
>> >> Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
>> >>>
>> >>> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
>> >>>
>> >>>> Its already racy, because "ethtool -S" reads out the stats immediately,
>> >>>> and thus races with the timer.
>> >>>>
>> >>>> See: igb_ethtool.c
>> >>>>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
>> >>>>
>> >>>
>> >>> You would be surprised how many bugs are waiting to be found and
>> >>> fixed ;)
>> >>
>> >> I took a look at igb stats, and it appears they also are racy on 32bit
>> >> arches. igb uses u64 counters, with no synchronization between
>> >> producers(writers) and consumers(readers).
>> >
>> > Are you saying, that we need more than a simple mutex in igb_update_stats()?
>> >
>> >
>> >> Some fixes are needed ;)
>> >
>> > The question is then if Intel wants to fix it, or let it be up to you or me?
>> >
>>
>> I will make sure that Carolyn and Alex know about the issue.  But,
>> feel free to submit a patch if you guys have the time.
>>
>
> I woke up early this morning, I'll provide patches to fix issues for
> net-next-2.6
>
> I'll let Intel guys doing the backporting work, but for old kernels,
> you'll probably need to use "unsigned long" instead of "u64"
>
> My plan is :
>
> - Provide 64bit counters even on 32bit arch
> - with proper synchro (include/linux/u64_stats_sync.h)
> - Add a spinlock so we can apply Jesper patch.
>
>

Thanks Eric, I really appreciate it.

-- 
Cheers,
Jeff

^ permalink raw reply

* Re: [PATCH] ethtool: add the stmmac support
From: Peppe CAVALLARO @ 2010-10-06  5:36 UTC (permalink / raw)
  To: netdev@vger.kernel.org
In-Reply-To: <1285667519-4621-1-git-send-email-peppe.cavallaro@st.com>

Hello,

On 09/28/2010 11:51 AM, Giuseppe CAVALLARO wrote:
> Add the stmmac support into the ethtool to
> dump both the Mac Core and Dma registers.

Any news for this patch?

The stmmac is now working on several platforms (not only on STM ST40
based boxes). I think it's worth having the ethtool support for the driver.

Welcome review and advice as usual.

Regards
Peppe

> 
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
> ---
> Makefile.am | 2 +-
> ethtool-util.h | 4 +++
> ethtool.c | 2 +
> stmmac.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 73 insertions(+), 1 deletions(-)
> create mode 100644 stmmac.c
> 
> diff --git a/Makefile.am b/Makefile.am
> index 632f054..a0d2116 100644
> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -8,7 +8,7 @@ ethtool_SOURCES = ethtool.c ethtool-copy.h ethtool-util.h \
> amd8111e.c de2104x.c e100.c e1000.c igb.c \
> fec_8xx.c ibm_emac.c ixgb.c ixgbe.c natsemi.c \
> pcnet32.c realtek.c tg3.c marvell.c vioc.c \
> - smsc911x.c at76c50x-usb.c sfc.c
> + smsc911x.c at76c50x-usb.c sfc.c stmmac.c
> 
> dist-hook:
> cp $(top_srcdir)/ethtool.spec $(distdir)
> diff --git a/ethtool-util.h b/ethtool-util.h
> index 01b1d03..4ef3a9f 100644
> --- a/ethtool-util.h
> +++ b/ethtool-util.h
> @@ -93,4 +93,8 @@ int at76c50x_usb_dump_regs(struct ethtool_drvinfo *info, 
> struct ethtool_regs *re
> /* Solarflare Solarstorm controllers */
> int sfc_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
> 
> +/* STMMAC embedded ethernet controller */
> +int st_mac100_dump_regs(struct ethtool_drvinfo *info,
> + struct ethtool_regs *regs);
> +int st_gmac_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs);
> #endif
> diff --git a/ethtool.c b/ethtool.c
> index 6b2b7c8..ab69b95 100644
> --- a/ethtool.c
> +++ b/ethtool.c
> @@ -1539,6 +1539,8 @@ static struct {
> { "smsc911x", smsc911x_dump_regs },
> { "at76c50x-usb", at76c50x_usb_dump_regs },
> { "sfc", sfc_dump_regs },
> + { "st_mac100", st_mac100_dump_regs },
> + { "st_gmac", st_gmac_dump_regs },
> };
> 
> static int dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs)
> diff --git a/stmmac.c b/stmmac.c
> new file mode 100644
> index 0000000..ad4311c
> --- /dev/null
> +++ b/stmmac.c
> @@ -0,0 +1,66 @@
> +/****************************************************************************
> + * Support for the Synopsys MAC 10/100/1000 on-chip Ethernet controllers
> + *
> + * Copyright (C) 2010 STMicroelectronics Ltd
> + *
> + * Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation, incorporated herein by reference.
> + */
> +
> +#include <stdio.h>
> +#include <string.h>
> +#include "ethtool-util.h"
> +
> +int st_mac100_dump_regs(struct ethtool_drvinfo *info,
> + struct ethtool_regs *regs)
> +{
> + int i;
> + unsigned int *stmmac_reg = (unsigned int *)regs->data;
> +
> + fprintf(stdout, "ST MAC 10/100 Registers\n");
> + fprintf(stdout, "control reg 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "addr HI 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "addr LO 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "multicast hash HI 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "multicast hash LO 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "MII addr 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "MII data %08X\n", *stmmac_reg++);
> + fprintf(stdout, "flow control 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "VLAN1 tag 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "VLAN2 tag 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "mac wakeup frame 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "mac wakeup crtl 0x%08X\n", *stmmac_reg++);
> +
> + fprintf(stdout, "\n");
> + fprintf(stdout, "DMA Registers\n");
> + for (i = 0; i < 9; i++)
> + fprintf(stdout, "CSR%d 0x%08X\n", i, *stmmac_reg++);
> +
> + fprintf(stdout, "DMA cur tx buf addr 0x%08X\n", *stmmac_reg++);
> + fprintf(stdout, "DMA cur rx buf addr 0x%08X\n", *stmmac_reg++);
> +
> + fprintf(stdout, "\n");
> +
> + return 0;
> +}
> +
> +int st_gmac_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs)
> +{
> + int i;
> + unsigned int *stmmac_reg = (unsigned int *)regs->data;
> +
> + fprintf(stdout, "ST GMAC Registers\n");
> + fprintf(stdout, "GMAC Registers\n");
> + for (i = 0; i < 55; i++)
> + fprintf(stdout, "Reg%d 0x%08X\n", i, *stmmac_reg++);
> +
> + fprintf(stdout, "\n");
> + fprintf(stdout, "DMA Registers\n");
> + for (i = 0; i < 22; i++)
> + fprintf(stdout, "Reg%d 0x%08X\n", i, *stmmac_reg++);
> +
> + return 0;
> +}
> --
> 1.5.5.6
> 

^ permalink raw reply

* [patch] isdn: strcpy() => strlcpy()
From: Dan Carpenter @ 2010-10-06  5:17 UTC (permalink / raw)
  To: Al Viro; +Cc: Karsten Keil, netdev, kernel-janitors
In-Reply-To: <20101005164306.GP19804@ZenIV.linux.org.uk>

setup.phone and setup.eazmsn are 32 character buffers.
rcvmsg.msg_data.byte_array is a 48 character buffer.
sc_adapter[card]->channel[rcvmsg.phy_link_no - 1].dn is 50 chars.

The rcvmsg struct comes from the memcpy_fromio() in receivemessage().
I guess that means it's data off the wire.  I'm not very familiar with
this code but I don't see any reason to assume these strings are NULL
terminated.

Also it's weird that "dn" in a 50 character buffer but we only seem to
use 32 characters.  In drivers/isdn/sc/scioc.h, "dn" is only a 49
character buffer.  So potentially there is still an issue there.

The important thing for now is to prevent the memory corruption.

Signed-off-by: Dan Carpenter <error27@gmail.com>
---
I don't have this hardware, but this patch should not introduce any bugs
that weren't there in the original.

v2:  If the strlcpy() strings aren't NULL terminated then bail out
     earlier.  Add a better commit message.  The first commit message
     sucked.  Sorry for that.

diff --git a/drivers/isdn/sc/interrupt.c b/drivers/isdn/sc/interrupt.c
index 485be8b..f0225bc 100644
--- a/drivers/isdn/sc/interrupt.c
+++ b/drivers/isdn/sc/interrupt.c
@@ -112,11 +112,19 @@ irqreturn_t interrupt_handler(int dummy, void *card_inst)
 			}
 			else if(callid>=0x0000 && callid<=0x7FFF)
 			{
+				int len;
+
 				pr_debug("%s: Got Incoming Call\n",
 						sc_adapter[card]->devicename);
-				strcpy(setup.phone,&(rcvmsg.msg_data.byte_array[4]));
-				strcpy(setup.eazmsn,
-					sc_adapter[card]->channel[rcvmsg.phy_link_no-1].dn);
+				len = strlcpy(setup.phone, &(rcvmsg.msg_data.byte_array[4]),
+						sizeof(setup.phone));
+				if (len >= sizeof(setup.phone))
+					continue;
+				len = strlcpy(setup.eazmsn,
+						sc_adapter[card]->channel[rcvmsg.phy_link_no - 1].dn,
+						sizeof(setup.eazmsn));
+				if (len >= sizeof(setup.eazmsn))
+					continue;
 				setup.si1 = 7;
 				setup.si2 = 0;
 				setup.plan = 0;
@@ -176,7 +184,9 @@ irqreturn_t interrupt_handler(int dummy, void *card_inst)
 		 * Handle a GetMyNumber Rsp
 		 */
 		if (IS_CE_MESSAGE(rcvmsg,Call,0,GetMyNumber)){
-			strcpy(sc_adapter[card]->channel[rcvmsg.phy_link_no-1].dn,rcvmsg.msg_data.byte_array);
+			strlcpy(sc_adapter[card]->channel[rcvmsg.phy_link_no - 1].dn,
+				rcvmsg.msg_data.byte_array,
+				sizeof(rcvmsg.msg_data.byte_array));
 			continue;
 		}
 			

^ permalink raw reply related

* [PATCH net-next] igb: fix stats handling
From: Eric Dumazet @ 2010-10-06  4:36 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Jesper Dangaard Brouer, Alexander Duyck, Jesper Dangaard Brouer,
	David S. Miller, netdev, Carolyn Wyborny
In-Reply-To: <1286335729.4861.13.camel@edumazet-laptop>

Le mercredi 06 octobre 2010 à 05:28 +0200, Eric Dumazet a écrit :

> I'll let Intel guys doing the backporting work, but for old kernels,
> you'll probably need to use "unsigned long" instead of "u64"
> 
> My plan is :
> 
> - Provide 64bit counters even on 32bit arch
> - with proper synchro (include/linux/u64_stats_sync.h)
> - Add a spinlock so we can apply Jesper patch.

Here is the net-next-2.6 patch, I am currently enable to test it, the
dev machine with IGB NIC cannot be restarted until tomorrow, my son
Nicolas is currently using it ;)

Could you and/or Jesper test it, possibly on 32 and 64 bit kernels ?

Thanks !

[PATCH net-next] igb: fix stats handling

There are currently some problems with igb.

- On 32bit arches, maintaining 64bit counters without proper
synchronization between writers and readers.

- Stats updated every two seconds, as reported by Jesper.
   (Jesper provided a patch for this)

- Potential problem between worker thread and ethtool -S

This patch uses u64_stats_sync, and convert everything to be 64bit safe,
SMP safe, even on 32bit arches.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 drivers/net/igb/igb.h         |    7 +-
 drivers/net/igb/igb_ethtool.c |   10 +-
 drivers/net/igb/igb_main.c    |  111 +++++++++++++++++++++++---------
 3 files changed, 94 insertions(+), 34 deletions(-)

diff --git a/drivers/net/igb/igb.h b/drivers/net/igb/igb.h
index 44e0ff1..a1b9584 100644
--- a/drivers/net/igb/igb.h
+++ b/drivers/net/igb/igb.h
@@ -159,6 +159,7 @@ struct igb_tx_queue_stats {
 	u64 packets;
 	u64 bytes;
 	u64 restart_queue;
+	struct u64_stats_sync syncp;
 };
 
 struct igb_rx_queue_stats {
@@ -167,6 +168,7 @@ struct igb_rx_queue_stats {
 	u64 drops;
 	u64 csum_err;
 	u64 alloc_failed;
+	struct u64_stats_sync syncp;
 };
 
 struct igb_q_vector {
@@ -288,6 +290,9 @@ struct igb_adapter {
 	struct timecompare compare;
 	struct hwtstamp_config hwtstamp_config;
 
+	spinlock_t stats64_lock;
+	struct rtnl_link_stats64 stats64;
+
 	/* structs defined in e1000_hw.h */
 	struct e1000_hw hw;
 	struct e1000_hw_stats stats;
@@ -357,7 +362,7 @@ extern netdev_tx_t igb_xmit_frame_ring_adv(struct sk_buff *, struct igb_ring *);
 extern void igb_unmap_and_free_tx_resource(struct igb_ring *,
 					   struct igb_buffer *);
 extern void igb_alloc_rx_buffers_adv(struct igb_ring *, int);
-extern void igb_update_stats(struct igb_adapter *);
+extern void igb_update_stats(struct igb_adapter *, struct rtnl_link_stats64 *);
 extern bool igb_has_link(struct igb_adapter *adapter);
 extern void igb_set_ethtool_ops(struct net_device *);
 extern void igb_power_up_link(struct igb_adapter *);
diff --git a/drivers/net/igb/igb_ethtool.c b/drivers/net/igb/igb_ethtool.c
index 26bf6a1..e51c233 100644
--- a/drivers/net/igb/igb_ethtool.c
+++ b/drivers/net/igb/igb_ethtool.c
@@ -90,8 +90,8 @@ static const struct igb_stats igb_gstrings_stats[] = {
 
 #define IGB_NETDEV_STAT(_net_stat) { \
 	.stat_string = __stringify(_net_stat), \
-	.sizeof_stat = FIELD_SIZEOF(struct net_device_stats, _net_stat), \
-	.stat_offset = offsetof(struct net_device_stats, _net_stat) \
+	.sizeof_stat = FIELD_SIZEOF(struct rtnl_link_stats64, _net_stat), \
+	.stat_offset = offsetof(struct rtnl_link_stats64, _net_stat) \
 }
 static const struct igb_stats igb_gstrings_net_stats[] = {
 	IGB_NETDEV_STAT(rx_errors),
@@ -2070,12 +2070,13 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
 				  struct ethtool_stats *stats, u64 *data)
 {
 	struct igb_adapter *adapter = netdev_priv(netdev);
-	struct net_device_stats *net_stats = &netdev->stats;
+	struct rtnl_link_stats64 *net_stats = &adapter->stats64;
 	u64 *queue_stat;
 	int i, j, k;
 	char *p;
 
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, net_stats);
 
 	for (i = 0; i < IGB_GLOBAL_STATS_LEN; i++) {
 		p = (char *)adapter + igb_gstrings_stats[i].stat_offset;
@@ -2097,6 +2098,7 @@ static void igb_get_ethtool_stats(struct net_device *netdev,
 		for (k = 0; k < IGB_RX_QUEUE_STATS_LEN; k++, i++)
 			data[i] = queue_stat[k];
 	}
+	spin_unlock(&adapter->stats64_lock);
 }
 
 static void igb_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 55edcb7..8a009ff 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -96,7 +96,6 @@ static int igb_setup_all_rx_resources(struct igb_adapter *);
 static void igb_free_all_tx_resources(struct igb_adapter *);
 static void igb_free_all_rx_resources(struct igb_adapter *);
 static void igb_setup_mrqc(struct igb_adapter *);
-void igb_update_stats(struct igb_adapter *);
 static int igb_probe(struct pci_dev *, const struct pci_device_id *);
 static void __devexit igb_remove(struct pci_dev *pdev);
 static int igb_sw_init(struct igb_adapter *);
@@ -113,7 +112,8 @@ static void igb_update_phy_info(unsigned long);
 static void igb_watchdog(unsigned long);
 static void igb_watchdog_task(struct work_struct *);
 static netdev_tx_t igb_xmit_frame_adv(struct sk_buff *skb, struct net_device *);
-static struct net_device_stats *igb_get_stats(struct net_device *);
+static struct rtnl_link_stats64 *igb_get_stats64(struct net_device *dev,
+						 struct rtnl_link_stats64 *stats);
 static int igb_change_mtu(struct net_device *, int);
 static int igb_set_mac(struct net_device *, void *);
 static void igb_set_uta(struct igb_adapter *adapter);
@@ -1536,7 +1536,9 @@ void igb_down(struct igb_adapter *adapter)
 	netif_carrier_off(netdev);
 
 	/* record the stats before reset*/
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	spin_unlock(&adapter->stats64_lock);
 
 	adapter->link_speed = 0;
 	adapter->link_duplex = 0;
@@ -1689,7 +1691,7 @@ static const struct net_device_ops igb_netdev_ops = {
 	.ndo_open		= igb_open,
 	.ndo_stop		= igb_close,
 	.ndo_start_xmit		= igb_xmit_frame_adv,
-	.ndo_get_stats		= igb_get_stats,
+	.ndo_get_stats64	= igb_get_stats64,
 	.ndo_set_rx_mode	= igb_set_rx_mode,
 	.ndo_set_multicast_list	= igb_set_rx_mode,
 	.ndo_set_mac_address	= igb_set_mac,
@@ -2276,6 +2278,7 @@ static int __devinit igb_sw_init(struct igb_adapter *adapter)
 	adapter->max_frame_size = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 
+	spin_lock_init(&adapter->stats64_lock);
 #ifdef CONFIG_PCI_IOV
 	if (hw->mac.type == e1000_82576)
 		adapter->vfs_allocated_count = (max_vfs > 7) ? 7 : max_vfs;
@@ -3483,7 +3486,9 @@ static void igb_watchdog_task(struct work_struct *work)
 		}
 	}
 
-	igb_update_stats(adapter);
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	spin_unlock(&adapter->stats64_lock);
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igb_ring *tx_ring = adapter->tx_ring[i];
@@ -3550,6 +3555,8 @@ static void igb_update_ring_itr(struct igb_q_vector *q_vector)
 	int new_val = q_vector->itr_val;
 	int avg_wire_size = 0;
 	struct igb_adapter *adapter = q_vector->adapter;
+	struct igb_ring *ring;
+	unsigned int packets;
 
 	/* For non-gigabit speeds, just fix the interrupt rate at 4000
 	 * ints/sec - ITR timer value of 120 ticks.
@@ -3559,16 +3566,21 @@ static void igb_update_ring_itr(struct igb_q_vector *q_vector)
 		goto set_itr_val;
 	}
 
-	if (q_vector->rx_ring && q_vector->rx_ring->total_packets) {
-		struct igb_ring *ring = q_vector->rx_ring;
-		avg_wire_size = ring->total_bytes / ring->total_packets;
+	ring = q_vector->rx_ring;
+	if (ring) {
+		packets = ACCESS_ONCE(ring->total_packets);
+
+		if (packets) 
+			avg_wire_size = ring->total_bytes / packets;
 	}
 
-	if (q_vector->tx_ring && q_vector->tx_ring->total_packets) {
-		struct igb_ring *ring = q_vector->tx_ring;
-		avg_wire_size = max_t(u32, avg_wire_size,
-		                      (ring->total_bytes /
-		                       ring->total_packets));
+	ring = q_vector->tx_ring;
+	if (ring) {
+		packets = ACCESS_ONCE(ring->total_packets);
+
+		if (packets)
+			avg_wire_size = max_t(u32, avg_wire_size,
+			                      ring->total_bytes / packets);
 	}
 
 	/* if avg_wire_size isn't set no work was done */
@@ -4077,7 +4089,11 @@ static int __igb_maybe_stop_tx(struct igb_ring *tx_ring, int size)
 
 	/* A reprieve! */
 	netif_wake_subqueue(netdev, tx_ring->queue_index);
+
+	u64_stats_update_begin(&tx_ring->tx_stats.syncp);
 	tx_ring->tx_stats.restart_queue++;
+	u64_stats_update_end(&tx_ring->tx_stats.syncp);
+
 	return 0;
 }
 
@@ -4214,16 +4230,22 @@ static void igb_reset_task(struct work_struct *work)
 }
 
 /**
- * igb_get_stats - Get System Network Statistics
+ * igb_get_stats64 - Get System Network Statistics
  * @netdev: network interface device structure
+ * @stats: rtnl_link_stats64 pointer
  *
- * Returns the address of the device statistics structure.
- * The statistics are actually updated from the timer callback.
  **/
-static struct net_device_stats *igb_get_stats(struct net_device *netdev)
+static struct rtnl_link_stats64 *igb_get_stats64(struct net_device *netdev,
+						 struct rtnl_link_stats64 *stats)
 {
-	/* only return the current stats */
-	return &netdev->stats;
+	struct igb_adapter *adapter = netdev_priv(netdev);
+
+	spin_lock(&adapter->stats64_lock);
+	igb_update_stats(adapter, &adapter->stats64);
+	memcpy(stats, &adapter->stats64, sizeof(*stats));
+	spin_unlock(&adapter->stats64_lock);
+
+	return stats;
 }
 
 /**
@@ -4305,15 +4327,17 @@ static int igb_change_mtu(struct net_device *netdev, int new_mtu)
  * @adapter: board private structure
  **/
 
-void igb_update_stats(struct igb_adapter *adapter)
+void igb_update_stats(struct igb_adapter *adapter,
+		      struct rtnl_link_stats64 *net_stats)
 {
-	struct net_device_stats *net_stats = igb_get_stats(adapter->netdev);
 	struct e1000_hw *hw = &adapter->hw;
 	struct pci_dev *pdev = adapter->pdev;
 	u32 reg, mpc;
 	u16 phy_tmp;
 	int i;
 	u64 bytes, packets;
+	unsigned int start;
+	u64 _bytes, _packets;
 
 #define PHY_IDLE_ERROR_COUNT_MASK 0x00FF
 
@@ -4331,10 +4355,17 @@ void igb_update_stats(struct igb_adapter *adapter)
 	for (i = 0; i < adapter->num_rx_queues; i++) {
 		u32 rqdpc_tmp = rd32(E1000_RQDPC(i)) & 0x0FFF;
 		struct igb_ring *ring = adapter->rx_ring[i];
+
 		ring->rx_stats.drops += rqdpc_tmp;
 		net_stats->rx_fifo_errors += rqdpc_tmp;
-		bytes += ring->rx_stats.bytes;
-		packets += ring->rx_stats.packets;
+		
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->rx_stats.syncp);
+			_bytes = ring->rx_stats.bytes;
+			_packets = ring->rx_stats.packets;
+		} while (u64_stats_fetch_retry_bh(&ring->rx_stats.syncp, start));
+		bytes += _bytes;
+		packets += _packets;
 	}
 
 	net_stats->rx_bytes = bytes;
@@ -4344,8 +4375,13 @@ void igb_update_stats(struct igb_adapter *adapter)
 	packets = 0;
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igb_ring *ring = adapter->tx_ring[i];
-		bytes += ring->tx_stats.bytes;
-		packets += ring->tx_stats.packets;
+		do {
+			start = u64_stats_fetch_begin_bh(&ring->tx_stats.syncp);
+			_bytes = ring->tx_stats.bytes;
+			_packets = ring->tx_stats.packets;
+		} while (u64_stats_fetch_retry_bh(&ring->tx_stats.syncp, start));
+		bytes += _bytes;
+		packets += _packets;
 	}
 	net_stats->tx_bytes = bytes;
 	net_stats->tx_packets = packets;
@@ -5397,7 +5433,10 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
 		if (__netif_subqueue_stopped(netdev, tx_ring->queue_index) &&
 		    !(test_bit(__IGB_DOWN, &adapter->state))) {
 			netif_wake_subqueue(netdev, tx_ring->queue_index);
+
+			u64_stats_update_begin(&tx_ring->tx_stats.syncp);
 			tx_ring->tx_stats.restart_queue++;
+			u64_stats_update_end(&tx_ring->tx_stats.syncp);
 		}
 	}
 
@@ -5437,8 +5476,10 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector)
 	}
 	tx_ring->total_bytes += total_bytes;
 	tx_ring->total_packets += total_packets;
+	u64_stats_update_begin(&tx_ring->tx_stats.syncp);
 	tx_ring->tx_stats.bytes += total_bytes;
 	tx_ring->tx_stats.packets += total_packets;
+	u64_stats_update_end(&tx_ring->tx_stats.syncp);
 	return count < tx_ring->count;
 }
 
@@ -5480,9 +5521,11 @@ static inline void igb_rx_checksum_adv(struct igb_ring *ring,
 		 * packets, (aka let the stack check the crc32c)
 		 */
 		if ((skb->len == 60) &&
-		    (ring->flags & IGB_RING_FLAG_RX_SCTP_CSUM))
+		    (ring->flags & IGB_RING_FLAG_RX_SCTP_CSUM)) {
+			u64_stats_update_begin(&ring->rx_stats.syncp);
 			ring->rx_stats.csum_err++;
-
+			u64_stats_update_end(&ring->rx_stats.syncp);
+		}
 		/* let the stack verify checksum errors */
 		return;
 	}
@@ -5669,8 +5712,10 @@ next_desc:
 
 	rx_ring->total_packets += total_packets;
 	rx_ring->total_bytes += total_bytes;
+	u64_stats_update_begin(&rx_ring->rx_stats.syncp);
 	rx_ring->rx_stats.packets += total_packets;
 	rx_ring->rx_stats.bytes += total_bytes;
+	u64_stats_update_end(&rx_ring->rx_stats.syncp);
 	return cleaned;
 }
 
@@ -5698,8 +5743,10 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 		if ((bufsz < IGB_RXBUFFER_1024) && !buffer_info->page_dma) {
 			if (!buffer_info->page) {
 				buffer_info->page = netdev_alloc_page(netdev);
-				if (!buffer_info->page) {
+				if (unlikely(!buffer_info->page)) {
+					u64_stats_update_begin(&rx_ring->rx_stats.syncp);
 					rx_ring->rx_stats.alloc_failed++;
+					u64_stats_update_end(&rx_ring->rx_stats.syncp);
 					goto no_buffers;
 				}
 				buffer_info->page_offset = 0;
@@ -5714,7 +5761,9 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 			if (dma_mapping_error(rx_ring->dev,
 					      buffer_info->page_dma)) {
 				buffer_info->page_dma = 0;
+				u64_stats_update_begin(&rx_ring->rx_stats.syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_stats.syncp);
 				goto no_buffers;
 			}
 		}
@@ -5722,8 +5771,10 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 		skb = buffer_info->skb;
 		if (!skb) {
 			skb = netdev_alloc_skb_ip_align(netdev, bufsz);
-			if (!skb) {
+			if (unlikely(!skb)) {
+				u64_stats_update_begin(&rx_ring->rx_stats.syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_stats.syncp);
 				goto no_buffers;
 			}
 
@@ -5737,7 +5788,9 @@ void igb_alloc_rx_buffers_adv(struct igb_ring *rx_ring, int cleaned_count)
 			if (dma_mapping_error(rx_ring->dev,
 					      buffer_info->dma)) {
 				buffer_info->dma = 0;
+				u64_stats_update_begin(&rx_ring->rx_stats.syncp);
 				rx_ring->rx_stats.alloc_failed++;
+				u64_stats_update_end(&rx_ring->rx_stats.syncp);
 				goto no_buffers;
 			}
 		}



^ permalink raw reply related

* Re: [PATCH 2/2] bna: scope and dead code cleanup
From: David Miller @ 2010-10-06  3:41 UTC (permalink / raw)
  To: rmody; +Cc: netdev, shemminger, ddutt
In-Reply-To: <1286329565-20234-2-git-send-email-rmody@brocade.com>

From: Rasesh Mody <rmody@brocade.com>
Date: Tue, 5 Oct 2010 18:46:05 -0700

> As suggested by Stephen Hemminger:
> 1) Made functions and data structures static wherever possible.
> 2) Removed unused code.
> 
> Signed-off-by: Debashis Dutt <ddutt@brocade.com>
> Signed-off-by: Rasesh Mody <rmody@brocade.com>

Applied.

^ permalink raw reply

* Re: [PATCH 1/2] bna: fix interrupt handling
From: David Miller @ 2010-10-06  3:40 UTC (permalink / raw)
  To: rmody; +Cc: netdev, shemminger, ddutt
In-Reply-To: <1286329565-20234-1-git-send-email-rmody@brocade.com>

From: Rasesh Mody <rmody@brocade.com>
Date: Tue, 5 Oct 2010 18:46:04 -0700

> This fix handles the case when IRQ handler is called (for shared IRQs)
> even before the driver is ready to handle interrupts.
> 
> Signed-off-by: Debashis Dutt <ddutt@brocade.com>
> Signed-off-by: Rasesh Mody <rmody@brocade.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] fib: RCU conversion of fib_lookup()
From: David Miller @ 2010-10-06  3:40 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1286311296.2593.32.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 05 Oct 2010 22:41:36 +0200

> fib_lookup() converted to be called in RCU protected context, no
> reference taken and released on a contended cache line (fib_clntref)
> 
> fib_table_lookup() and fib_semantic_match() get an additional parameter.
> 
> struct fib_info gets an rcu_head field, and is freed after an rcu grace
> period.
 ...
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

I love it!

Applied, thanks!

^ permalink raw reply

* Re: [PATCH] caif: fix two caif_connect() bugs
From: David Miller @ 2010-10-06  3:38 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, sjur.brandeland
In-Reply-To: <1286268128.2796.27.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 05 Oct 2010 10:42:08 +0200

> caif_connect() might dereference a netdevice after dev_put() it.
> 
> It also doesnt check dev_get_by_index() return value and could
> dereference a NULL pointer.
> 
> Fix it, using RCU to avoid taking a reference.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [net-next PATCH] igb: update adapter stats when reading /proc/net/dev.
From: Eric Dumazet @ 2010-10-06  3:28 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Jesper Dangaard Brouer, Alexander Duyck, Jesper Dangaard Brouer,
	David S. Miller, netdev, Carolyn Wyborny
In-Reply-To: <AANLkTi=+rkietX9jAMy5BRVwAENf-mK15py7y+LYQ-XJ@mail.gmail.com>

Le mardi 05 octobre 2010 à 15:34 -0700, Jeff Kirsher a écrit :
> On Tue, Oct 5, 2010 at 14:16, Jesper Dangaard Brouer <hawk@diku.dk> wrote:
> > On Tue, 5 Oct 2010, Eric Dumazet wrote:
> >
> >> Le mardi 05 octobre 2010 à 17:19 +0200, Eric Dumazet a écrit :
> >>>
> >>> Le mardi 05 octobre 2010 à 16:53 +0200, Jesper Dangaard Brouer a écrit :
> >>>
> >>>> Its already racy, because "ethtool -S" reads out the stats immediately,
> >>>> and thus races with the timer.
> >>>>
> >>>> See: igb_ethtool.c
> >>>>  igb_get_ethtool_stats() invoke igb_update_stats(adapter);
> >>>>
> >>>
> >>> You would be surprised how many bugs are waiting to be found and
> >>> fixed ;)
> >>
> >> I took a look at igb stats, and it appears they also are racy on 32bit
> >> arches. igb uses u64 counters, with no synchronization between
> >> producers(writers) and consumers(readers).
> >
> > Are you saying, that we need more than a simple mutex in igb_update_stats()?
> >
> >
> >> Some fixes are needed ;)
> >
> > The question is then if Intel wants to fix it, or let it be up to you or me?
> >
> 
> I will make sure that Carolyn and Alex know about the issue.  But,
> feel free to submit a patch if you guys have the time.
> 

I woke up early this morning, I'll provide patches to fix issues for
net-next-2.6

I'll let Intel guys doing the backporting work, but for old kernels,
you'll probably need to use "unsigned long" instead of "u64"

My plan is :

- Provide 64bit counters even on 32bit arch
- with proper synchro (include/linux/u64_stats_sync.h)
- Add a spinlock so we can apply Jesper patch.




^ permalink raw reply

* Re: [PATCH 3/3] bonding: add retransmit membership reports tunable
From: David Miller @ 2010-10-06  3:29 UTC (permalink / raw)
  To: fleitner; +Cc: netdev, bonding-devel, fubar, andy
In-Reply-To: <1286324639-22314-3-git-send-email-fleitner@redhat.com>

From: Flavio Leitner <fleitner@redhat.com>
Date: Tue,  5 Oct 2010 21:23:59 -0300

> Allow sysadmins to configure the number of multicast
> membership report sent on a link failure event.
> 
> Signed-off-by: Flavio Leitner <fleitner@redhat.com>

Also applied to net-next-2.6, thanks.

^ permalink raw reply

* Re: [PATCH 2/3] bonding: fix to rejoin multicast groups immediately
From: David Miller @ 2010-10-06  3:28 UTC (permalink / raw)
  To: fleitner; +Cc: netdev, bonding-devel, fubar, andy
In-Reply-To: <1286324639-22314-2-git-send-email-fleitner@redhat.com>

From: Flavio Leitner <fleitner@redhat.com>
Date: Tue,  5 Oct 2010 21:23:58 -0300

> The IGMP specs states that if the system receives a
> membership report, it shouldn't send another for the
> next minute. However, if a link failure happens right
> after that, the backup slave and the switch connected
> to this slave will not know about the multicast and
> the traffic will hang for about a minute.
> 
> This patch fixes it to rejoin multicast groups immediately
> after a failover restoring the multicast traffic.
> 
> Signed-off-by: Flavio Leitner <fleitner@redhat.com>

Much better commit message :-)

Applied to net-next-2.6

^ permalink raw reply

* Re: [PATCH 1/3] bonding: rejoin multicast groups on VLANs
From: David Miller @ 2010-10-06  3:28 UTC (permalink / raw)
  To: fleitner; +Cc: netdev, bonding-devel, fubar, andy
In-Reply-To: <1286324639-22314-1-git-send-email-fleitner@redhat.com>

From: Flavio Leitner <fleitner@redhat.com>
Date: Tue,  5 Oct 2010 21:23:57 -0300

> During a failover, the IGMP membership is sent to update
> the switch restoring the traffic, but it misses groups added
> to VLAN devices running on top of bonding devices.
> 
> This patch changes it to iterate over all VLAN devices
> on top of it sending IGMP memberships too.
> 
> Signed-off-by: Flavio Leitner <fleitner@redhat.com>

This seems to address Andy's feedback properly, and otherwise
looks good to me, so applied to net-next-2.6

^ permalink raw reply

* Re: [PATCH 2/2] ehea: converting msleeps to waitqueue on check_sqs() function
From: David Miller @ 2010-10-06  3:15 UTC (permalink / raw)
  To: leitao; +Cc: netdev
In-Reply-To: <1286320583-5594-2-git-send-email-leitao@linux.vnet.ibm.com>

From: leitao@linux.vnet.ibm.com
Date: Tue,  5 Oct 2010 19:16:23 -0400

> Removing the msleep() call in check_sqs() function, and replacing by a wait queue.
> 
> Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>

Applied.

^ permalink raw reply

* Re: [PATCH 1/2] ehea: using wait queues instead of msleep on ehea_flush_sq
From: David Miller @ 2010-10-06  3:14 UTC (permalink / raw)
  To: leitao; +Cc: netdev
In-Reply-To: <1286320583-5594-1-git-send-email-leitao@linux.vnet.ibm.com>

From: leitao@linux.vnet.ibm.com
Date: Tue,  5 Oct 2010 19:16:22 -0400

> This patch just remove a msleep loop and change to wait queue,
> making the code cleaner.
> 
> Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
> Acked-by: David Howells <dhowells@redhat.com>

Applied.

^ permalink raw reply

* Re: [net-next-2.6 PATCH] ixgbevf: declare functions as static
From: David Miller @ 2010-10-06  3:14 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: netdev, gospo, bphilips, shemminger, emil.s.tantilov, greg.v.rose
In-Reply-To: <20101005231030.24410.47680.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 05 Oct 2010 16:11:30 -0700

> From: Emil Tantilov <emil.s.tantilov@intel.com>
> 
> Following patch fixes warnings reported by `make namespacecheck`
> 
> Reported by Stephen Hemminger
> 
> CC: Stephen Hemminger <shemminger@vyatta.com>
> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
> Acked-by: Greg Rose <greg.v.rose@intel.com>
> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox