Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [patch v2] bluetooth: handle l2cap_create_connless_pdu() errors
From: Gustavo F. Padovan @ 2010-04-26 18:27 UTC (permalink / raw)
  To: David Miller
  Cc: error27-Re5JQEeQqe8AvxtiuMwx3w, marcel-kz+m5ild9QBg9hUCZPvPmw,
	andrei.emeltchenko-xNZwKgViW5gAvxtiuMwx3w,
	linux-bluetooth-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	kernel-janitors-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20100426.111259.112594696.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

Hi David,

* David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> [2010-04-26 11:12:59 -0700]:

> From: "Gustavo F. Padovan" <gustavo-THi1TnShQwVAfugRpC6u6w@public.gmane.org>
> Date: Mon, 26 Apr 2010 12:09:19 -0300
> 
> > Hi Dan,
> > 
> > * Dan Carpenter <error27-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> [2010-04-26 13:36:27 +0200]:
> > 
> >> l2cap_create_connless_pdu() can sometimes return ERR_PTR(-ENOMEM) or
> >> ERR_PTR(-EFAULT).
> >> 
> >> Signed-off-by: Dan Carpenter <error27-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >> ---
> >> In v2 I wrote the patch on top of Gustavo Padovon's devel tree
> 
> This is the kind of bug that could cause a crash if the path actually
> executes.
> 
> Therefore it tires me that that submitter was told to regenerate this
> patch against some devel tree that is -next bound, when in fact this
> is the kind of fix that warrants inclusion right now into net-2.6

My bad here. So I think we should pick the first version of the Dan's patch. 
It applies against bluetooth-testing right now and then against net-2.6 too.

Marcel, is that ok to you?

> 
> Marcel, please do whatever magic you need to so I can get this into
> Linus's tree as I did the rest of the ERR_PTR() fixes from Dan already.
> No reason to treat Bluetooth special and defer these fixes to -next.
> 
> Thanks.

-- 
Gustavo F. Padovan
http://padovan.org

^ permalink raw reply

* Re: [PATCH net-next-2.6] bridge br_multicast: Ensure to initialize BR_INPUT_SKB_CB(skb)->mrouters_only.
From: David Miller @ 2010-04-26 18:26 UTC (permalink / raw)
  To: yoshfuji; +Cc: yoshfuji, netdev
In-Reply-To: <201004251806.o3PI6e2p008200@94.43.138.210.xn.2iij.net>

From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Date: Mon, 26 Apr 2010 03:06:40 +0900

> Even with commit 32dec5dd0233ebffa9cae25ce7ba6daeb7df4467 ("bridge
> br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only
> without IGMP snooping."), BR_INPUT_SKB_CB(skb)->mrouters_only is
> not appropriately initialized if IGMP/MLD snooping support is
> compiled and disabled, so we can see garbage.
> 
> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Applied.

^ permalink raw reply

* Re: [PATCH net-2.6] bridge br_multicast: Ensure to initialize BR_INPUT_SKB_CB(skb)->mrouters_only.
From: David Miller @ 2010-04-26 18:26 UTC (permalink / raw)
  To: yoshfuji; +Cc: netdev
In-Reply-To: <201004251859.o3PIx7Vo012485@94.43.138.210.xn.2iij.net>

From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Date: Mon, 26 Apr 2010 03:59:07 +0900

> Even with commit 32dec5dd0233ebffa9cae25ce7ba6daeb7df4467 ("bridge
> br_multicast: Don't refer to BR_INPUT_SKB_CB(skb)->mrouters_only
> without IGMP snooping."), BR_INPUT_SKB_CB(skb)->mrouters_only is
> not appropriately initialized if IGMP snooping support is
> compiled and disabled, so we can see garbage.
> 
> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Applied.

^ permalink raw reply

* Re: [PATCH v3] TCP: avoid to send keepalive probes if receiving data
From: David Miller @ 2010-04-26 18:24 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: fleitner, netdev, eric.dumazet
In-Reply-To: <alpine.DEB.2.00.1004261241510.7041@wel-95.cs.helsinki.fi>

From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Mon, 26 Apr 2010 12:47:13 +0300 (EEST)

>>  			    !((1 << sk->sk_state) &
>>  			      (TCPF_CLOSE | TCPF_LISTEN))) {
>> -				__u32 elapsed = tcp_time_stamp - tp->rcv_tstamp;
>> +				u32 elapsed = min_t(u32,
>> +				      tcp_time_stamp - icsk->icsk_ack.lrcvtime,
>> +				      tcp_time_stamp - tp->rcv_tstamp);
>> +
> 
> I'd put this into a helper in include/net/tcp.h
> 
 ...
>> +	if (elapsed < keepalive_time_when(tp)) {
>> +		elapsed = keepalive_time_when(tp) - elapsed;
>> +		goto resched;
>> +	}
>> +
>>  	elapsed = tcp_time_stamp - tp->rcv_tstamp;
> 
> ...And then use it here too for setting the elapsed and doing the test 
> and what follows only once?
> 

Agreed.

^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: David Miller @ 2010-04-26 18:22 UTC (permalink / raw)
  To: therbert; +Cc: eric.dumazet, netdev
In-Reply-To: <u2q65634d661004261119j74042496z4f1ba570251e0c44@mail.gmail.com>

From: Tom Herbert <therbert@google.com>
Date: Mon, 26 Apr 2010 11:19:05 -0700

> This also hits RSS/multiqueue. In a netperf RR test, 500 streams
> between my two 16 core AMDs:  TCP 970K tps, UDP 370K tps.  I'm
> surprised they didn't catch that in some benchmarks...

Meanwhile, these NIC vendors seem to have all the time in the world to
add iSCSI, RDMA and all the other stateful offload junk into their
firmware and silicon.

Yet they can't hash ports if the protocol is not TCP?  Beyond
baffling...

^ permalink raw reply

* Re: [PATCH] ieee802154: Fix oops during ieee802154_sock_ioctl
From: David Miller @ 2010-04-26 18:20 UTC (permalink / raw)
  To: dbaryshkov; +Cc: netdev, linux-kernel, stefan
In-Reply-To: <1272293202-11712-1-git-send-email-dbaryshkov@gmail.com>

From: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Date: Mon, 26 Apr 2010 18:46:42 +0400

> From: Stefan Schmidt <stefan@datenfreihafen.org>
> 
> Trying to run izlisten (from lowpan-tools tests) on a device that does not
> exists I got the oops below. The problem is that we are using get_dev_by_name
> without checking if we really get a device back. We don't in this case and
> writing to dev->type generates this oops.
> 
> [Oops code removed by Dmitry Eremin-Solenikov]
> 
> If possible this patch should be applied to the current -rc fixes branch.
> 
> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>

Applied, and queued up for -stable too, thanks guys.

^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: Tom Herbert @ 2010-04-26 18:19 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev
In-Reply-To: <20100426.110432.104061817.davem@davemloft.net>

> I'm pretty sure there isn't at this point.
>
> We'll need to elide setting ->rxhash for non-TCP packets.  I bet that
> the ETH_FAST_PATH_RX_CQE_RSS_HASH_TYPE field might be usable to making
> this decision, but if not in the worst case we'll need to parse the
> VLAN/ETH and IP4/IP6 headers to figure out the protocol.
>
> Damn, I'm so pissed off about this.  This ruins everything!
>
> How damn hard is it to add two 16-bit ports to the hash regardless of
> protocol?
>
Fair question.

This also hits RSS/multiqueue. In a netperf RR test, 500 streams
between my two 16 core AMDs:  TCP 970K tps, UDP 370K tps.  I'm
surprised they didn't catch that in some benchmarks...

^ permalink raw reply

* Re: [PATCH net-next-2.6] pppoe: use phonet_pernet instead of directly net_generic
From: David Miller @ 2010-04-26 18:17 UTC (permalink / raw)
  To: jpirko; +Cc: netdev
In-Reply-To: <20100426134100.GF2941@psychotron.lab.eng.brq.redhat.com>

From: Jiri Pirko <jpirko@redhat.com>
Date: Mon, 26 Apr 2010 15:41:00 +0200

> As in for example pppoe introduce phonet_pernet and use it instead of calling
> net_generic directly.
> 
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>

Applied, and I modified your commit header line slightly.  This isn't
a change to "pppoe: " but rather "phonet: " :-)

Thanks.

^ permalink raw reply

* Re: [PATCH net-next-2.6] pppoe: use pppoe_pernet instead of directly net_generic
From: David Miller @ 2010-04-26 18:17 UTC (permalink / raw)
  To: jpirko; +Cc: netdev, mostrows
In-Reply-To: <20100426114611.GC2941@psychotron.lab.eng.brq.redhat.com>

From: Jiri Pirko <jpirko@redhat.com>
Date: Mon, 26 Apr 2010 13:46:12 +0200

> 
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next-2.6] net: sk_sleep() helper
From: David Miller @ 2010-04-26 18:17 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1272270006.2346.13.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 26 Apr 2010 10:20:06 +0200

> Here is a followup to this patch, I missed three files in this
> conversion.
> 
> Thanks !
> 
> [PATCH net-next-2.6] net: use sk_sleep()
> 
> Commit aa395145 (net: sk_sleep() helper) missed three files in the
> conversion.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH v2] tg3: Fix INTx fallback when MSI fails
From: David Miller @ 2010-04-26 18:15 UTC (permalink / raw)
  To: mchan; +Cc: adetsch, netdev, mcarlson
In-Reply-To: <1272304881.9950.19.camel@HP1>

From: "Michael Chan" <mchan@broadcom.com>
Date: Mon, 26 Apr 2010 11:01:21 -0700

> Sorry, Matt just got back from vacation.  I discussed with Andre earlier
> today to come up with the new v2 patch.  So this patch has my:
> 
> Acked-by: Michael Chan <mchan@broadcom.com>
> 

Thanks a lot, applied.

^ permalink raw reply

* Re: [PATCH v2] tg3: Fix INTx fallback when MSI fails
From: Michael Chan @ 2010-04-26 18:01 UTC (permalink / raw)
  To: David Miller; +Cc: adetsch@br.ibm.com, netdev@vger.kernel.org, Matthew Carlson
In-Reply-To: <20100426.111017.189698604.davem@davemloft.net>


On Mon, 2010-04-26 at 11:10 -0700, David Miller wrote:
> From: Andre Detsch <adetsch@br.ibm.com>
> Date: Mon, 26 Apr 2010 14:27:07 -0300
> 
> > tg3: Fix INTx fallback when MSI fails
> > 
> > MSI setup changes the value of irq_vec in struct tg3 *tp.
> > This attribute must be taken into account and restored before
> > we try to do a new request_irq for INTx fallback.
> > 
> > In powerpc, the original code was leading to an EINVAL return within
> > request_irq, because the driver was trying to use the disabled MSI
> > virtual irq number instead of tp->pdev->irq.
> > 
> > Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
> 
> Broadcom folks, I've already asked you to review Andre's original patch.
> 
> If you can't be bothered to look at and ACK/NACK a simple fix like
> this for 10 days, then I'm just going to apply Andre's patch as-is
> since it looks right to me.
> 
> If the idea was to bundle this up into a set of 15 patches you guys
> already have ready to bomb at me, I absolutely do not want you guys to
> operate that way.  It doesn't work.  You should be able to ACK a
> simple patch like this as soon as possible so it can be integrated
> upstream in the most expediant manner possible.
> 
> Thanks.
> 

Sorry, Matt just got back from vacation.  I discussed with Andre earlier
today to come up with the new v2 patch.  So this patch has my:

Acked-by: Michael Chan <mchan@broadcom.com>




^ permalink raw reply

* Re: [patch v2] bluetooth: handle l2cap_create_connless_pdu() errors
From: David Miller @ 2010-04-26 18:12 UTC (permalink / raw)
  To: gustavo
  Cc: error27, marcel, andrei.emeltchenko, linux-bluetooth, netdev,
	kernel-janitors
In-Reply-To: <20100426150919.GA12813@vigoh>

From: "Gustavo F. Padovan" <gustavo@padovan.org>
Date: Mon, 26 Apr 2010 12:09:19 -0300

> Hi Dan,
> 
> * Dan Carpenter <error27@gmail.com> [2010-04-26 13:36:27 +0200]:
> 
>> l2cap_create_connless_pdu() can sometimes return ERR_PTR(-ENOMEM) or
>> ERR_PTR(-EFAULT).
>> 
>> Signed-off-by: Dan Carpenter <error27@gmail.com>
>> ---
>> In v2 I wrote the patch on top of Gustavo Padovon's devel tree

This is the kind of bug that could cause a crash if the path actually
executes.

Therefore it tires me that that submitter was told to regenerate this
patch against some devel tree that is -next bound, when in fact this
is the kind of fix that warrants inclusion right now into net-2.6

Marcel, please do whatever magic you need to so I can get this into
Linus's tree as I did the rest of the ERR_PTR() fixes from Dan already.
No reason to treat Bluetooth special and defer these fixes to -next.

Thanks.

^ permalink raw reply

* Re: [PATCH v2] tg3: Fix INTx fallback when MSI fails
From: David Miller @ 2010-04-26 18:10 UTC (permalink / raw)
  To: adetsch; +Cc: netdev, mcarlson, mchan
In-Reply-To: <201004261427.07099.adetsch@br.ibm.com>

From: Andre Detsch <adetsch@br.ibm.com>
Date: Mon, 26 Apr 2010 14:27:07 -0300

> tg3: Fix INTx fallback when MSI fails
> 
> MSI setup changes the value of irq_vec in struct tg3 *tp.
> This attribute must be taken into account and restored before
> we try to do a new request_irq for INTx fallback.
> 
> In powerpc, the original code was leading to an EINVAL return within
> request_irq, because the driver was trying to use the disabled MSI
> virtual irq number instead of tp->pdev->irq.
> 
> Signed-off-by: Andre Detsch <adetsch@br.ibm.com>

Broadcom folks, I've already asked you to review Andre's original patch.

If you can't be bothered to look at and ACK/NACK a simple fix like
this for 10 days, then I'm just going to apply Andre's patch as-is
since it looks right to me.

If the idea was to bundle this up into a set of 15 patches you guys
already have ready to bomb at me, I absolutely do not want you guys to
operate that way.  It doesn't work.  You should be able to ACK a
simple patch like this as soon as possible so it can be integrated
upstream in the most expediant manner possible.

Thanks.

^ permalink raw reply

* Re: [PATCH net-2.6] netns: assign NULL after pernet memory is freed
From: Eric W. Biederman @ 2010-04-26 18:05 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem
In-Reply-To: <20100426133020.GE2941@psychotron.lab.eng.brq.redhat.com>

Jiri Pirko <jpirko@redhat.com> writes:

> Mon, Apr 26, 2010 at 02:07:17PM CEST, jpirko@redhat.com wrote:
>>Mon, Apr 26, 2010 at 01:18:07PM CEST, jpirko@redhat.com wrote:
>>>This is needed to let know appropriate code (driver) know that pernet memory
>>>chunk  was freed already. For example when driver has also registered netdev
>>>notifier, it can be called after memory is freed which could eventually lead
>>>to panic. Also a null check should be present in these notifiers.
>>>
>>>For example, previously, when drivers were responsible for pernet memory
>>>allocation/freeing, in pppoe this assign was done in pppoe_exit_net() right
>>>after pernet memory was freed. The check in notifier stayed (in pppoe_flush_dev
>>>called from pppoe_device_event) but makes no sense now without this patch.
>>
>>Hmm, thinking about this, since the life of pernet memories is directly connected
>>with the life of net, as described by Eric, this should not be needed. So please
>>scrach this one too. Sorry.
>>
>>Anyway, I'm still wondering how to prevent netdev_notifiers from using this when
>>net is gone. Going to do more research, I'm probably missing something.
>
> Right, it becomes part of init_ns then. So this looks good then. One thing I can
> still imagine what may cause problem, since netdev_notifier is registered before
> register_pernet_device, the notifier can be called in between. So I think this
> should be reordered, am I right?

No.  We guarantee that all network devices are either destroyed or moved
to init_net with the appropriate notifications sent.  Before we tear down
the network namespace.  This guarantees we won't have packets in flight when
destroying the network subsystems.

The moving and destroying of network devices should simply look like normal
events, that don't need special handling.

Eric

^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: David Miller @ 2010-04-26 18:04 UTC (permalink / raw)
  To: therbert; +Cc: eric.dumazet, netdev
In-Reply-To: <q2o65634d661004261038t8f0c32f5ye8169ef0c38172b7@mail.gmail.com>

From: Tom Herbert <therbert@google.com>
Date: Mon, 26 Apr 2010 10:38:27 -0700

> It would appear that way :-(.  I was going to ping Broadcom folks to
> see if there's support for UDP.

I'm pretty sure there isn't at this point.

We'll need to elide setting ->rxhash for non-TCP packets.  I bet that
the ETH_FAST_PATH_RX_CQE_RSS_HASH_TYPE field might be usable to making
this decision, but if not in the worst case we'll need to parse the
VLAN/ETH and IP4/IP6 headers to figure out the protocol.

Damn, I'm so pissed off about this.  This ruins everything!

How damn hard is it to add two 16-bit ports to the hash regardless of
protocol?

^ permalink raw reply

* Re: [PATCH net-2.6] netns: assign NULL after pernet memory is freed
From: Eric W. Biederman @ 2010-04-26 18:01 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem
In-Reply-To: <20100426120716.GD2941@psychotron.lab.eng.brq.redhat.com>

Jiri Pirko <jpirko@redhat.com> writes:

> Mon, Apr 26, 2010 at 01:18:07PM CEST, jpirko@redhat.com wrote:
>>This is needed to let know appropriate code (driver) know that pernet memory
>>chunk  was freed already. For example when driver has also registered netdev
>>notifier, it can be called after memory is freed which could eventually lead
>>to panic. Also a null check should be present in these notifiers.
>>
>>For example, previously, when drivers were responsible for pernet memory
>>allocation/freeing, in pppoe this assign was done in pppoe_exit_net() right
>>after pernet memory was freed. The check in notifier stayed (in pppoe_flush_dev
>>called from pppoe_device_event) but makes no sense now without this patch.
>
> Hmm, thinking about this, since the life of pernet memories is directly connected
> with the life of net, as described by Eric, this should not be needed. So please
> scrach this one too. Sorry.
>
> Anyway, I'm still wondering how to prevent netdev_notifiers from using this when
> net is gone. Going to do more research, I'm probably missing something.

Also semenatically net_assign_generic is only guaranteed to work once, so assigning
NULL is at least semantically wrong.

Eric


>>I already know about several notifiers where the check should be present,
>>patches will follow up.
>>
>>Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>>
>>diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
>>index bd8c471..29f622c 100644
>>--- a/net/core/net_namespace.c
>>+++ b/net/core/net_namespace.c
>>@@ -51,6 +51,7 @@ static void ops_free(const struct pernet_operations *ops, struct net *net)
>> 	if (ops->id && ops->size) {
>> 		int id = *ops->id;
>> 		kfree(net_generic(net, id));
>>+		net_assign_generic(net, id, NULL);
>> 	}
>> }
>> 

^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: David Miller @ 2010-04-26 17:56 UTC (permalink / raw)
  To: bhutchings; +Cc: therbert, eric.dumazet, netdev
In-Reply-To: <1272304072.2376.10.camel@achroite.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 26 Apr 2010 18:47:52 +0100

> Unfortunately the widely-implemented Toeplitz hash functions are defined
> only for TCP/IPv4, TCP/IPv6, IPv4 and IPv6.

What a complete and utter waste all of this is then.

Defining the hash as TCP only is about as stupid as making hw checksum
offload be TCP only.

Really, I'm completely stunned.

^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: Ben Hutchings @ 2010-04-26 17:47 UTC (permalink / raw)
  To: Tom Herbert; +Cc: Eric Dumazet, davem, netdev
In-Reply-To: <q2o65634d661004261038t8f0c32f5ye8169ef0c38172b7@mail.gmail.com>

On Mon, 2010-04-26 at 10:38 -0700, Tom Herbert wrote:
> > Hi Tom
> >
> > I tested this rxhash feature on my bnx2x card, using latest net-next-2.6
> > and appropriate ethtool
> >
> > ethtool -k eth1 rxhash on
> >
> > Then I used my pktgen script, to flood machine with flows with udp dst
> > port varying between 4000 and 4015.
> >
> > Software generated rxhash is OK (16 different values).
> >
> > But with bnx2x provided hash, all skb were hashed to same rxhash
> > value :(
> >
> > What are the specs of this hardware hash ?
> > It seems to not care of udp source/destination ports.
> >
> 
> It would appear that way :-(.  I was going to ping Broadcom folks to
> see if there's support for UDP.
[...]

Unfortunately the widely-implemented Toeplitz hash functions are defined
only for TCP/IPv4, TCP/IPv6, IPv4 and IPv6.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: Tom Herbert @ 2010-04-26 17:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev
In-Reply-To: <1272302434.19143.76.camel@edumazet-laptop>

> Hi Tom
>
> I tested this rxhash feature on my bnx2x card, using latest net-next-2.6
> and appropriate ethtool
>
> ethtool -k eth1 rxhash on
>
> Then I used my pktgen script, to flood machine with flows with udp dst
> port varying between 4000 and 4015.
>
> Software generated rxhash is OK (16 different values).
>
> But with bnx2x provided hash, all skb were hashed to same rxhash
> value :(
>
> What are the specs of this hardware hash ?
> It seems to not care of udp source/destination ports.
>

It would appear that way :-(.  I was going to ping Broadcom folks to
see if there's support for UDP.

> Also, should'nt we feed same values for the seeds on different nics ?
>
> for (i = SRC_REG_KEYRSS0_0; i <= SRC_REG_KEYRSS1_9; i += 4)
>        REG_WR(bp, i, random32());
>

Yes, that is reasonable.

> Thanks
>
>
>

^ permalink raw reply

* [PATCH v2] tg3: Fix INTx fallback when MSI fails
From: Andre Detsch @ 2010-04-26 17:27 UTC (permalink / raw)
  To: netdev, mcarlson, Michael Chan
In-Reply-To: <20100421.161751.94080129.davem@davemloft.net>

tg3: Fix INTx fallback when MSI fails

MSI setup changes the value of irq_vec in struct tg3 *tp.
This attribute must be taken into account and restored before
we try to do a new request_irq for INTx fallback.

In powerpc, the original code was leading to an EINVAL return within
request_irq, because the driver was trying to use the disabled MSI
virtual irq number instead of tp->pdev->irq.

Signed-off-by: Andre Detsch <adetsch@br.ibm.com>

---
Tested on powerpc, but should be safe for other architectures as well.
v2: removed unnecessary changes

Index: linux-2.6.34-rc5/drivers/net/tg3.c
===================================================================
--- linux-2.6.34-rc5.orig/drivers/net/tg3.c	2010-04-19 20:29:56.000000000 -0300
+++ linux-2.6.34-rc5/drivers/net/tg3.c	2010-04-26 14:10:42.000000000 -0300
@@ -8633,6 +8633,7 @@ static int tg3_test_msi(struct tg3 *tp)
 	pci_disable_msi(tp->pdev);
 
 	tp->tg3_flags2 &= ~TG3_FLG2_USING_MSI;
+	tp->napi[0].irq_vec = tp->pdev->irq;
 
 	err = tg3_request_irq(tp, 0);
 	if (err)

^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: Eric Dumazet @ 2010-04-26 17:20 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <alpine.DEB.1.00.1004222249400.27016@pokey.mtv.corp.google.com>

Le jeudi 22 avril 2010 à 22:54 -0700, Tom Herbert a écrit :
> Add support to bnx2x to extract Toeplitz hash out of the receive descriptor
> for use in skb->rxhash.
> 
> Signed-off-by: Tom Herbert <therbert@google.com>
> ---
> diff --git a/drivers/net/bnx2x.h b/drivers/net/bnx2x.h
> index 0819530..8bd2368 100644
> --- a/drivers/net/bnx2x.h
> +++ b/drivers/net/bnx2x.h
> @@ -1330,7 +1330,7 @@ static inline u32 reg_poll(struct bnx2x *bp, u32 reg, u32 expected, int ms,
>  		AEU_INPUTS_ATTN_BITS_MCP_LATCHED_UMP_TX_PARITY | \
>  		AEU_INPUTS_ATTN_BITS_MCP_LATCHED_SCPAD_PARITY)
>  
> -#define MULTI_FLAGS(bp) \
> +#define RSS_FLAGS(bp) \
>  		(TSTORM_ETH_FUNCTION_COMMON_CONFIG_RSS_IPV4_CAPABILITY | \
>  		 TSTORM_ETH_FUNCTION_COMMON_CONFIG_RSS_IPV4_TCP_CAPABILITY | \
>  		 TSTORM_ETH_FUNCTION_COMMON_CONFIG_RSS_IPV6_CAPABILITY | \
> diff --git a/drivers/net/bnx2x_main.c b/drivers/net/bnx2x_main.c
> index 0c6dba2..613f727 100644
> --- a/drivers/net/bnx2x_main.c
> +++ b/drivers/net/bnx2x_main.c
> @@ -1582,7 +1582,7 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
>  		struct sw_rx_bd *rx_buf = NULL;
>  		struct sk_buff *skb;
>  		union eth_rx_cqe *cqe;
> -		u8 cqe_fp_flags;
> +		u8 cqe_fp_flags, cqe_fp_status_flags;
>  		u16 len, pad;
>  
>  		comp_ring_cons = RCQ_BD(sw_comp_cons);
> @@ -1598,6 +1598,7 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
>  
>  		cqe = &fp->rx_comp_ring[comp_ring_cons];
>  		cqe_fp_flags = cqe->fast_path_cqe.type_error_flags;
> +		cqe_fp_status_flags = cqe->fast_path_cqe.status_flags;
>  
>  		DP(NETIF_MSG_RX_STATUS, "CQE type %x  err %x  status %x"
>  		   "  queue %x  vlan %x  len %u\n", CQE_TYPE(cqe_fp_flags),
> @@ -1727,6 +1728,12 @@ reuse_rx:
>  
>  			skb->protocol = eth_type_trans(skb, bp->dev);
>  
> +			if ((bp->dev->features & ETH_FLAG_RXHASH) &&
> +			    (cqe_fp_status_flags &
> +			     ETH_FAST_PATH_RX_CQE_RSS_HASH_FLG))
> +				skb->rxhash = le32_to_cpu(
> +				    cqe->fast_path_cqe.rss_hash_result);
> +
>  			skb->ip_summed = CHECKSUM_NONE;
>  			if (bp->rx_csum) {
>  				if (likely(BNX2X_RX_CSUM_OK(cqe)))
> @@ -5750,10 +5757,10 @@ static void bnx2x_init_internal_func(struct bnx2x *bp)
>  	u32 offset;
>  	u16 max_agg_size;
>  
> -	if (is_multi(bp)) {
> -		tstorm_config.config_flags = MULTI_FLAGS(bp);
> +	tstorm_config.config_flags = RSS_FLAGS(bp);
> +
> +	if (is_multi(bp))
>  		tstorm_config.rss_result_mask = MULTI_MASK;
> -	}
>  
>  	/* Enable TPA if needed */
>  	if (bp->flags & TPA_ENABLE_FLAG)
> @@ -6629,10 +6636,8 @@ static int bnx2x_init_common(struct bnx2x *bp)
>  	bnx2x_init_block(bp, PBF_BLOCK, COMMON_STAGE);
>  
>  	REG_WR(bp, SRC_REG_SOFT_RST, 1);
> -	for (i = SRC_REG_KEYRSS0_0; i <= SRC_REG_KEYRSS1_9; i += 4) {
> -		REG_WR(bp, i, 0xc0cac01a);
> -		/* TODO: replace with something meaningful */
> -	}
> +	for (i = SRC_REG_KEYRSS0_0; i <= SRC_REG_KEYRSS1_9; i += 4)
> +		REG_WR(bp, i, random32());
>  	bnx2x_init_block(bp, SRCH_BLOCK, COMMON_STAGE);
>  #ifdef BCM_CNIC
>  	REG_WR(bp, SRC_REG_KEYSEARCH_0, 0x63285672);
> @@ -11001,6 +11006,11 @@ static int bnx2x_set_flags(struct net_device *dev, u32 data)
>  		changed = 1;
>  	}
>  
> +	if (data & ETH_FLAG_RXHASH)
> +		dev->features |= NETIF_F_RXHASH;
> +	else
> +		dev->features &= ~NETIF_F_RXHASH;
> +
>  	if (changed && netif_running(dev)) {
>  		bnx2x_nic_unload(bp, UNLOAD_NORMAL);
>  		rc = bnx2x_nic_load(bp, LOAD_NORMAL);
> --

Hi Tom

I tested this rxhash feature on my bnx2x card, using latest net-next-2.6
and appropriate ethtool

ethtool -k eth1 rxhash on

Then I used my pktgen script, to flood machine with flows with udp dst
port varying between 4000 and 4015.

Software generated rxhash is OK (16 different values).

But with bnx2x provided hash, all skb were hashed to same rxhash
value :(

What are the specs of this hardware hash ?
It seems to not care of udp source/destination ports.

Also, should'nt we feed same values for the seeds on different nics ?

for (i = SRC_REG_KEYRSS0_0; i <= SRC_REG_KEYRSS1_9; i += 4)
	REG_WR(bp, i, random32());

Thanks



^ permalink raw reply

* Re: 2.6.34-rc5-git6 (plus all patches) -- new INFO: suspicious rcu_dereference_check() usage.
From: Paul E. McKenney @ 2010-04-26 16:10 UTC (permalink / raw)
  To: Miles Lane
  Cc: Vivek Goyal, Eric Paris, Lai Jiangshan, Ingo Molnar,
	Peter Zijlstra, LKML, nauman, eric.dumazet, netdev, Jens Axboe,
	Gui Jianfeng, Li Zefan, Johannes Berg
In-Reply-To: <s2ka44ae5cd1004251419hb8f4c137md0e399c464d9290a@mail.gmail.com>

On Sun, Apr 25, 2010 at 05:19:38PM -0400, Miles Lane wrote:
> [  139.730133] [ INFO: suspicious rcu_dereference_check() usage. ]
> [  139.730136] ---------------------------------------------------
> [  139.730139] include/net/inet_timewait_sock.h:227 invoked
> rcu_dereference_check() without protection!
> [  139.730142]
> [  139.730143] other info that might help us debug this:
> [  139.730144]
> [  139.730147]
> [  139.730148] rcu_scheduler_active = 1, debug_locks = 1
> [  139.730151] 1 lock held by swapper/0:
> [  139.730158]  #0:  (net/ipv4/tcp_minisocks.c:41){+.-...}, at:
> [<ffffffff81046ad2>] run_timer_softirq+0x19f/0x370
> [  139.730169]
> [  139.730170] stack backtrace:
> [  139.730173] Pid: 0, comm: swapper Not tainted 2.6.34-rc5-git6 #28
> [  139.730176] Call Trace:
> [  139.730178]  <IRQ>  [<ffffffff81065f9a>] lockdep_rcu_dereference+0x9d/0xa5
> [  139.730189]  [<ffffffff813d1abf>] twsk_net+0x4f/0x57
> [  139.730193]  [<ffffffff813d2122>] __inet_twsk_kill+0x79/0xd0
> [  139.730198]  [<ffffffff813d24f9>] inet_twdr_do_twkill_work+0x59/0xba
> [  139.730203]  [<ffffffff813d263d>] inet_twdr_hangman+0x30/0x97
> [  139.730207]  [<ffffffff81046b9e>] run_timer_softirq+0x26b/0x370
> [  139.730211]  [<ffffffff81046ad2>] ? run_timer_softirq+0x19f/0x370
> [  139.730216]  [<ffffffff813d260d>] ? inet_twdr_hangman+0x0/0x97
> [  139.730221]  [<ffffffff81040230>] ? __do_softirq+0x7f/0x252
> [  139.730225]  [<ffffffff810402f5>] __do_softirq+0x144/0x252
> [  139.730230]  [<ffffffff8100320c>] call_softirq+0x1c/0x34
> [  139.730235]  [<ffffffff81004864>] do_softirq+0x38/0x80
> [  139.730239]  [<ffffffff8103fdc2>] irq_exit+0x45/0x94
> [  139.730243]  [<ffffffff81003fa9>] do_IRQ+0xad/0xc4
> [  139.730247]  [<ffffffff81432953>] ret_from_intr+0x0/0xf
> [  139.730250]  <EOI>  [<ffffffff81248383>] ? acpi_idle_enter_bm+0x262/0x28d
> [  139.730260]  [<ffffffff81248379>] ? acpi_idle_enter_bm+0x258/0x28d
> [  139.730266]  [<ffffffff8137b9cf>] cpuidle_idle_call+0x9c/0x13d
> [  139.730273]  [<ffffffff810012c2>] cpu_idle+0xa5/0xfc
> [  139.730277]  [<ffffffff8142abdd>] start_secondary+0x1f3/0x1fc

This appears to be another symptom of the bug you located in your earlier
testing, so either Eric Biederman sends a patch or I improvise.

And thank you again for all your testing, Miles!!!

							Thanx, Paul

^ permalink raw reply

* Re: [PATCH] RCU: don't turn off lockdep when find suspicious rcu_dereference_check() usage
From: Paul E. McKenney @ 2010-04-26 16:09 UTC (permalink / raw)
  To: Miles Lane
  Cc: Vivek Goyal, Eric Paris, Lai Jiangshan, Ingo Molnar,
	Peter Zijlstra, LKML, nauman, eric.dumazet, netdev, Jens Axboe,
	Gui Jianfeng, Li Zefan, Johannes Berg, Eric W. Biederman
In-Reply-To: <p2ga44ae5cd1004251320j55d23ba7ua0fa14497f623d76@mail.gmail.com>

On Sun, Apr 25, 2010 at 04:20:13PM -0400, Miles Lane wrote:
> > I am down to seeing three suspicious rcu_dereference_check traces when
> > I apply this patch and all the previous patches to 2.6.34-rc5-git6.
> >
> > 1. The "__sched_setscheduler+0x19d/0x300" trace.
> > 2. The two "is_swiotlb_buffer+0x2e/0x3b" traces (waiting to see
> > Johannes' patch show up in a Linux snapshot)
> >
> > Did I miss a patch for the setscheduler issue?
> 
> Hmm.  I am still seeing these two messages as well.
> 
> [   83.363146] [ INFO: suspicious rcu_dereference_check() usage. ]
> [   83.363148] ---------------------------------------------------
> [   83.363151] include/net/inet_timewait_sock.h:227 invoked
> rcu_dereference_check() without protection!
> [   83.363154]
> [   83.363155] other info that might help us debug this:
> [   83.363156]
> [   83.363158]
> [   83.363159] rcu_scheduler_active = 1, debug_locks = 1
> [   83.363162] 2 locks held by gwibber-service/5076:
> [   83.363164]  #0:  (&p->lock){+.+.+.}, at: [<ffffffff8110534a>]
> seq_read+0x37/0x381
> [   83.363176]  #1:  (&(&hashinfo->ehash_locks[i])->rlock){+.-...},
> at: [<ffffffff813ddcd5>] established_get_next+0xc4/0x132
> [   83.363186]
> [   83.363187] stack backtrace:
> [   83.363191] Pid: 5076, comm: gwibber-service Not tainted 2.6.34-rc5-git6 #27
> [   83.363194] Call Trace:
> [   83.363202]  [<ffffffff81068086>] lockdep_rcu_dereference+0x9d/0xa5
> [   83.363207]  [<ffffffff813dc998>] twsk_net+0x4f/0x57
> [   83.363212]  [<ffffffff813ddc65>] established_get_next+0x54/0x132
> [   83.363216]  [<ffffffff813dde47>] tcp_seq_next+0x5d/0x6a
> [   83.363221]  [<ffffffff81105599>] seq_read+0x286/0x381
> [   83.363226]  [<ffffffff81105313>] ? seq_read+0x0/0x381
> [   83.363231]  [<ffffffff8113503c>] proc_reg_read+0x8d/0xac
> [   83.363236]  [<ffffffff810ebf14>] vfs_read+0xa6/0x103
> [   83.363241]  [<ffffffff810ec027>] sys_read+0x45/0x69
> [   83.363246]  [<ffffffff81002b6b>] system_call_fastpath+0x16/0x1b
> 
> [   84.660302] [ INFO: suspicious rcu_dereference_check() usage. ]
> [   84.660304] ---------------------------------------------------
> [   84.660308] include/net/inet_timewait_sock.h:227 invoked
> rcu_dereference_check() without protection!
> [   84.660311]
> [   84.660312] other info that might help us debug this:
> [   84.660313]
> [   84.660315]
> [   84.660316] rcu_scheduler_active = 1, debug_locks = 1
> [   84.660319] no locks held by gwibber-service/5081.
> [   84.660321]
> [   84.660322] stack backtrace:
> [   84.660325] Pid: 5081, comm: gwibber-service Not tainted 2.6.34-rc5-git6 #27
> [   84.660328] Call Trace:
> [   84.660339]  [<ffffffff81068086>] lockdep_rcu_dereference+0x9d/0xa5
> [   84.660345]  [<ffffffff813cad6f>] twsk_net+0x4f/0x57
> [   84.660350]  [<ffffffff813cb18f>] __inet_twsk_hashdance+0x50/0x158
> [   84.660355]  [<ffffffff813e0bb9>] tcp_time_wait+0x1c1/0x24b
> [   84.660360]  [<ffffffff813d3d97>] tcp_fin+0x83/0x162
> [   84.660364]  [<ffffffff813d4727>] tcp_data_queue+0x1ff/0xa1e
> [   84.660370]  [<ffffffff810496aa>] ? mod_timer+0x1e/0x20
> [   84.660375]  [<ffffffff813d8363>] tcp_rcv_state_process+0x89d/0x8f2
> [   84.660381]  [<ffffffff813943bb>] ? release_sock+0x30/0x10b
> [   84.660386]  [<ffffffff813de772>] tcp_v4_do_rcv+0x2de/0x33f
> [   84.660391]  [<ffffffff8139440d>] release_sock+0x82/0x10b
> [   84.660395]  [<ffffffff813ce875>] tcp_close+0x1b5/0x37e
> [   84.660401]  [<ffffffff813ecdb7>] inet_release+0x50/0x57
> [   84.660405]  [<ffffffff81391ae4>] sock_release+0x1a/0x66
> [   84.660410]  [<ffffffff81391b52>] sock_close+0x22/0x26
> [   84.660415]  [<ffffffff810ece07>] __fput+0x120/0x1cd
> [   84.660420]  [<ffffffff810ecec9>] fput+0x15/0x17
> [   84.660424]  [<ffffffff810e9d41>] filp_close+0x63/0x6d
> [   84.660428]  [<ffffffff810e9e22>] sys_close+0xd7/0x111
> [   84.660434]  [<ffffffff81002b6b>] system_call_fastpath+0x16/0x1b

Eric Dumazet traced these down to a commit from Eric Biederman.

If I don't hear from Eric Biederman in a few days, I will attempt a
patch, but it would be more likely to be correct coming from someone
with a better understanding of the code.  ;-)

							Thanx, Paul

^ permalink raw reply

* Re: [PATCH 1/4] e1000e: save skb counts in TX to avoid cache misses
From: Alexander Duyck @ 2010-04-26 15:55 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <alpine.DEB.1.00.1004231719050.16806@pokey.mtv.corp.google.com>

Tom Herbert wrote:
> In e1000_tx_map, precompute number of segements and bytecounts which
> are derived from fields in skb; these are stored in buffer_info.  When
> cleaning tx in e1000_clean_tx_irq use the values in the associated
> buffer_info for statistics counting, this eliminates cache misses
> on skb fields.
> 
> Signed-off-by: Tom Herbert <therbert@google.com>
> ---
> diff --git a/drivers/net/e1000e/e1000.h b/drivers/net/e1000e/e1000.h
> index 12648a1..d6da75b 100644
> --- a/drivers/net/e1000e/e1000.h
> +++ b/drivers/net/e1000e/e1000.h
> @@ -188,6 +188,8 @@ struct e1000_buffer {
>  			unsigned long time_stamp;
>  			u16 length;
>  			u16 next_to_watch;
> +			unsigned int segs;
> +			unsigned int bytecount;
>  			u16 mapped_as_page;
>  		};
>  		/* Rx */

Does segs need to be a full int here?  I think you can probably get away 
with either an unsigned short or just define it as a u16, then combine 
it with mapped_as_page to cut down on the overall size.

Thanks,

Alex

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox