Netdev List


* Re: BUG: unable to handle kernel paging request at 00000000d8be176d
From: Eric Dumazet @ 2012-07-06  6:13 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: David Miller, netdev, Steffen Klassert
In-Reply-To: <20120706055859.GA27693@localhost>

On Fri, 2012-07-06 at 13:58 +0800, Fengguang Wu wrote:
> On Thu, Jul 05, 2012 at 02:22:00PM -0700, David Miller wrote:
> > 
> > Steffen Klassert posted a patch which fixes this.
> 
> Steffen's patch converts one oops message into another.
> 
> tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
> head:   700db99d0140e9da2a31e08ebd3e1b121691aa26
> commit: a2de86f63cfc92f7aaf11e7b9d9f2150946a1622 [1/2] ipv6: Initialize the neighbour pointer of rt6_info on allocation
> 
> x86_64-allyesdebian         BBB

Please try :

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6d9c0ab..c6af596 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -975,7 +975,7 @@ static int ip6_dst_lookup_tail(struct sock *sk,
 	 * dst entry of the nexthop router
 	 */
 	rcu_read_lock();
-	rt = (struct rt6_info *) dst;
+	rt = (struct rt6_info *) *dst;
 	n = rt->n;
 	if (n && !(n->nud_state & NUD_VALID)) {
 		struct inet6_ifaddr *ifp;

^ permalink raw reply related

* Re: Intel e1000 ethernet device, change in kernel v3.4, and jumbo frames setting DENIED.
From: Jeff Kirsher @ 2012-07-06  6:17 UTC (permalink / raw)
  To: D. Stussy; +Cc: D. Stussy, e1000-devel, netdev
In-Reply-To: <jt59dm$fiq$1@snarked.org>

[-- Attachment #1: Type: text/plain, Size: 2219 bytes --]

On 07/05/2012 04:53 PM, D. Stussy wrote:
> "Henrique de Moraes Holschuh"  wrote in message
> news:jjGoO-7l0-19@gated-at.bofh.it...
> On Wed, 04 Jul 2012, D. Stussy wrote:
>> There have been some changes to the driver such that things like
>> checksum verification are offloaded from the CPU.  However, this
>> blocks the ability to set a jumbo frame.  The kernel does record an
>> error indicating that checksum offloading needs to be disabled for
>> jumbo frame MTU sizes to be used.  My e1000 interfaces use the 82574L
>> chipset.
>>
>> 1)  Was this an intentional change?
>
> Yes, but I understand Intel is trying to come up with a way not to
> need it.
> ==============
> OK, found the modification.  NETIF_F_RXHASH turned on by default in a
> modification ("Receive Packet Steering support") from January 26, 2012.
>
> Features which have no OFF setting should DEFAULT OFF and wait to be
> explicitly enabled.  The change which was inserted is stupid.
> ==============
>
>> 2)  How do I disable that function so I can set jumbo frames with
>> "ifconfig" or "ip"?  I simply don't know what setting I need to
>> pass.
>
> ethtool can do it.  Check its manpage, it can manipulate the various NIC
> offload engines, as well as other parameters such as dma ring buffer
> size,
> etc.
> ===============
> Got it:  ethtool -K eth0 rx off
>
> However, I'd prefer to turn off hashing instead of the checksum and I
> didn't recognize a single setting which would do that.  Under "ethtool
> -N", there seem to be 8 settings, but none of the options seemed to
> disable hashing (without simultaneously dropping packets).  I still
> need help:  What is the correct setting I need to disable this?  I
> don't see a simple OFF setting.
>
> Else, this part of the patch should be reversed:
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -6136,6 +6213,7 @@ static int __devinit e1000_probe(struct pci_dev
> *pdev,
> NETIF_F_HW_VLAN_TX |
> NETIF_F_TSO |
> NETIF_F_TSO6 |
> +                NETIF_F_RXHASH |
> NETIF_F_RXCSUM |
> NETIF_F_HW_CSUM);
> -- 

Adding the more appropriate mailing lists of netdev and e1000-devel


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 900 bytes --]

^ permalink raw reply

* [PATCH net-next] ipv6: remove redundant declarations
From: Eric Dumazet @ 2012-07-06  6:34 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

remove redundant declarations, they belong in include/net/tcp.h

Signed-off-by: Eric Dumazet <edumazet@google.com>
--
 net/ipv6/syncookies.c |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index 8e951d8..7bf3cc4 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -21,9 +21,6 @@
 #include <net/ipv6.h>
 #include <net/tcp.h>
 
-extern int sysctl_tcp_syncookies;
-extern __u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS];
-
 #define COOKIEBITS 24	/* Upper bits store count */
 #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
 

^ permalink raw reply related

* Re: [net-next RFC V5 5/5] virtio_net: support negotiating the number of queues through ctrl vq
From: Stephen Hemminger @ 2012-07-06  6:38 UTC (permalink / raw)
  To: Jason Wang
  Cc: krkumar2, habanero, kvm, mst, netdev, mashirle, linux-kernel,
	virtualization, edumazet, Sasha Levin, jwhan, sri, davem, tahm
In-Reply-To: <4FF65966.9040600@redhat.com>

On Fri, 06 Jul 2012 11:20:06 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On 07/05/2012 08:51 PM, Sasha Levin wrote:
> > On Thu, 2012-07-05 at 18:29 +0800, Jason Wang wrote:
> >> @@ -1387,6 +1404,10 @@ static int virtnet_probe(struct virtio_device *vdev)
> >>          if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
> >>                  vi->has_cvq = true;
> >>
> >> +       /* Use single tx/rx queue pair as default */
> >> +       vi->num_queue_pairs = 1;
> >> +       vi->total_queue_pairs = num_queue_pairs;
> > The code is using this "default" even if the amount of queue pairs it
> > wants was specified during initialization. This basically limits any
> > device to use 1 pair when starting up.
> >
> 
> Yes, currently the virtio-net driver uses 1 txq/rxq pair by default,
> since multiqueue may not outperform single queue for every kind of
> workload. So it's better to keep that as the default and let the user
> enable multiqueue with ethtool -L.
> 

I would prefer that the driver size the number of queues based on the
number of online CPUs. That is what real hardware does. What kind of
workload are you running? If it is some DBMS benchmark then maybe the
issue is that some CPUs need to be reserved.
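
A minimal userspace sketch of that sizing policy, assuming the count is
simply the number of online CPUs capped at what the device offers.
pick_queue_pairs() is a hypothetical name, and a driver would use
num_online_cpus() rather than sysconf():

```c
#include <unistd.h>

/* Hypothetical helper: choose a queue-pair count from the number of
 * online CPUs, capped at max_pairs (what the device can provide).
 * Userspace analogue only; not taken from the virtio_net patch. */
static unsigned int pick_queue_pairs(unsigned int max_pairs)
{
	long cpus = sysconf(_SC_NPROCESSORS_ONLN);

	if (cpus < 1)
		cpus = 1;	/* sysconf() failed or reported nothing usable */
	if (max_pairs < 1)
		max_pairs = 1;	/* always keep one rx/tx pair */

	return (unsigned long)cpus < max_pairs ? (unsigned int)cpus : max_pairs;
}
```

With max_pairs = 1 this degenerates to the single-pair default being
discussed in this thread.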

^ permalink raw reply

* Re: [PATCH] ipv4: Avoid overhead when no custom FIB rules are installed.
From: David Miller @ 2012-07-06  6:39 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1341553282.3265.49.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 06 Jul 2012 07:41:22 +0200

> Hmm, this seems quite big to be inlined ?

It's about the same size as the non-rule case.

^ permalink raw reply

* Re: [PATCH net-next] ipv6: remove redundant declarations
From: David Miller @ 2012-07-06  6:40 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1341556475.3265.80.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 06 Jul 2012 08:34:35 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> remove redundant declarations, they belong in include/net/tcp.h
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: BUG: unable to handle kernel paging request at 00000000d8be176d
From: Eric Dumazet @ 2012-07-06  6:41 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: David Miller, netdev, Steffen Klassert
In-Reply-To: <1341555227.3265.54.camel@edumazet-glaptop>

On Fri, 2012-07-06 at 08:13 +0200, Eric Dumazet wrote:
> On Fri, 2012-07-06 at 13:58 +0800, Fengguang Wu wrote:
> > On Thu, Jul 05, 2012 at 02:22:00PM -0700, David Miller wrote:
> > > 
> > > Steffen Klassert posted a patch which fixes this.
> > 
> > Steffen's patch converts one oops message into another.
> > 
> > tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
> > head:   700db99d0140e9da2a31e08ebd3e1b121691aa26
> > commit: a2de86f63cfc92f7aaf11e7b9d9f2150946a1622 [1/2] ipv6: Initialize the neighbour pointer of rt6_info on allocation
> > 
> > x86_64-allyesdebian         BBB
> 
> Please try :
> 
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 6d9c0ab..c6af596 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -975,7 +975,7 @@ static int ip6_dst_lookup_tail(struct sock *sk,
>  	 * dst entry of the nexthop router
>  	 */
>  	rcu_read_lock();
> -	rt = (struct rt6_info *) dst;
> +	rt = (struct rt6_info *) *dst;
>  	n = rt->n;
>  	if (n && !(n->nud_state & NUD_VALID)) {
>  		struct inet6_ifaddr *ifp;
> 

David, what do you think if I submit a patch using following accessor ?

/* get a rt6_info given a dst_entry pointer */
static inline struct rt6_info *dst_rt6_info(struct dst_entry *dst)
{
	return (struct rt6_info *)dst;
}

^ permalink raw reply

* Re: BUG: unable to handle kernel paging request at 00000000d8be176d
From: David Miller @ 2012-07-06  6:42 UTC (permalink / raw)
  To: eric.dumazet; +Cc: wfg, netdev, steffen.klassert
In-Reply-To: <1341555227.3265.54.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 06 Jul 2012 08:13:47 +0200

> @@ -975,7 +975,7 @@ static int ip6_dst_lookup_tail(struct sock *sk,
>  	 * dst entry of the nexthop router
>  	 */
>  	rcu_read_lock();
> -	rt = (struct rt6_info *) dst;
> +	rt = (struct rt6_info *) *dst;
>  	n = rt->n;
>  	if (n && !(n->nud_state & NUD_VALID)) {
>  		struct inet6_ifaddr *ifp;

This is obviously correct, please submit this formally.

Fengguang Wu, can I ask you politely not to quote the guilty patch in
its entirety when reporting bugs?  That screws up my workflow because
the patch then gets installed as a new patch in patchwork and I
therefore have to tick it off every time you report a bug.

^ permalink raw reply

* Re: BUG: unable to handle kernel paging request at 00000000d8be176d
From: David Miller @ 2012-07-06  6:44 UTC (permalink / raw)
  To: eric.dumazet; +Cc: wfg, netdev, steffen.klassert
In-Reply-To: <1341556907.3265.93.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 06 Jul 2012 08:41:47 +0200

> David, what do you think if I submit a patch using following accessor ?
> 
> /* get a rt6_info given a dst_entry pointer */
> static inline struct rt6_info *dst_rt6_info(struct dst_entry *dst)
> {
> 	return (struct rt6_info *)dst;
> }

I'd rather we simply not use address-of pointers in our interfaces
like we do now in some spots.

99 times out of 100 it's a case where PTR_ERR() would do.

I spent a lot of time moving both ipv4 and ipv6 in this direction,
we're almost there, and should simply finish off the remaining
cases.

^ permalink raw reply

* Re: BUG: unable to handle kernel paging request at 00000000d8be176d
From: Eric Dumazet @ 2012-07-06  6:53 UTC (permalink / raw)
  To: David Miller; +Cc: wfg, netdev, steffen.klassert
In-Reply-To: <20120705.234413.823095842284954048.davem@davemloft.net>

On Thu, 2012-07-05 at 23:44 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 06 Jul 2012 08:41:47 +0200
> 
> > David, what do you think if I submit a patch using following accessor ?
> > 
> > /* get a rt6_info given a dst_entry pointer */
> > static inline struct rt6_info *dst_rt6_info(struct dst_entry *dst)
> > {
> > 	return (struct rt6_info *)dst;
> > }
> 
> I'd rather we simply not use address-of pointers in our interfaces
> like we do now in some spots.
> 
> 99 times out of 100 it's a case where PTR_ERR() would do.
> 
> I spent a lot of time moving both ipv4 and ipv6 in this direction,
> we're almost there, and should simply finish off the remaining
> cases.

Not sure what you mean. I don't use an address-of pointer.

I suggested a type-safe thing, i.e. change all

struct rt6_info *rt = (struct rt6_info *)dst;

to

struct rt6_info *rt = dst_rt6_info(dst);


Same generated code, but we get compiler checks instead of a raw cast.
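
As a rough userspace illustration of the compiler-check argument (the
structures below are cut-down stand-ins for the kernel ones, and
lookup_neigh() is a hypothetical caller, not kernel code):

```c
#include <stddef.h>

/* Simplified stand-ins; only the layout relationship matters here:
 * the dst_entry is embedded as the first member of rt6_info. */
struct dst_entry { int flags; };
struct neighbour { int nud_state; };
struct rt6_info {
	struct dst_entry dst;	/* must stay the first member */
	struct neighbour *n;
};

/* get a rt6_info given a dst_entry pointer */
static inline struct rt6_info *dst_rt6_info(struct dst_entry *dst)
{
	return (struct rt6_info *)dst;
}

/* Hypothetical caller with the same argument shape as
 * ip6_dst_lookup_tail(): it receives "struct dst_entry **". */
static struct neighbour *lookup_neigh(struct dst_entry **dst)
{
	/* Writing dst_rt6_info(dst) here would draw an
	 * incompatible-pointer-type diagnostic, whereas the raw cast
	 * "(struct rt6_info *) dst" compiles without complaint. */
	struct rt6_info *rt = dst_rt6_info(*dst);

	return rt->n;
}
```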

^ permalink raw reply

* Re: [PATCH net-next] ipv6: Initialize the neighbour pointer of rt6_info on allocation
From: Steffen Klassert @ 2012-07-06  6:54 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1341501404.2583.4267.camel@edumazet-glaptop>

On Thu, Jul 05, 2012 at 05:16:44PM +0200, Eric Dumazet wrote:
> On Thu, 2012-07-05 at 15:18 +0200, Steffen Klassert wrote:
> >  
> >  	if (rt) {
> > -		memset(&rt->rt6i_table, 0,
> > +		memset(&rt->n, 0,
> >  		       sizeof(*rt) - sizeof(struct dst_entry));
> >  		rt6_init_peer(rt, table ? &table->tb6_peers : net->ipv6.peers);
> >  	}
> 
> Hmm, could we find a way to avoid this for future changes ?
> 
> We know dst_entry is the first field, so maybe :
> 
> if (rt) {
> 	struct dst_entry *dst = (struct dst_entry *)rt;
> 
> 	memset(dst + 1, 0, sizeof(*rt) - sizeof(*dst));
> 

Yes, I think we need to do something like this.

I've just noticed that ip6_blackhole_route, xfrm_alloc_dst,
dn_route_output_slow and dn_route_input_slow have the same issue.
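
A small userspace sketch of the "dst + 1" idiom quoted above, assuming
only that the dst_entry is the first member. The field list is a
stand-in, not the real rt6_info layout, and rt6_clear_tail() is a
hypothetical name:

```c
#include <string.h>

/* Stand-in structures: the real ones have many more fields. */
struct dst_entry { void *dev; unsigned long expires; };
struct rt6_info {
	struct dst_entry dst;	/* first member, left untouched below */
	void *n;
	void *rt6i_table;
	unsigned int rt6i_flags;
};

/* Zero everything that follows the embedded dst_entry, without naming
 * whichever field currently happens to come right after it. */
static void rt6_clear_tail(struct rt6_info *rt)
{
	struct dst_entry *dst = (struct dst_entry *)rt;

	memset(dst + 1, 0, sizeof(*rt) - sizeof(*dst));
}
```

Because the start address is derived from the dst_entry itself, adding
or reordering fields after it cannot leave one of them outside the
cleared range.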

^ permalink raw reply

* Re: BUG: unable to handle kernel paging request at 00000000d8be176d
From: David Miller @ 2012-07-06  7:03 UTC (permalink / raw)
  To: eric.dumazet; +Cc: wfg, netdev, steffen.klassert
In-Reply-To: <1341557587.3265.107.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 06 Jul 2012 08:53:07 +0200

> Not sure what you mean. I don't use an address-of pointer.

I'm talking about the cases where this caused a bug here.

Passing "struct dst_entry **dst" as an argument.

It's just stupid, and a relic of code that wants to return
a route and a localized error at the same time.

I'm saying that in the end we should simply return "struct dst_entry
*" from such functions.

Anyways, please submit your original patch in this thread so I
can get this simple case fixed in net-next.
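
A minimal userspace sketch of the direction described above: return
the pointer, and encode the error in it. The three helpers are
simplified stand-ins for the kernel's ERR_PTR(), PTR_ERR() and
IS_ERR(); route_lookup() and its arguments are hypothetical:

```c
#include <errno.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel helpers of the same names:
 * the top MAX_ERRNO values of the address space encode -errno. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct dst_entry { int flags; };

/* Hypothetical lookup: a single return value carries either the route
 * or the error, so no "struct dst_entry **" out-parameter is needed. */
static struct dst_entry *route_lookup(struct dst_entry *candidate, int reachable)
{
	if (!reachable)
		return ERR_PTR(-ENETUNREACH);
	return candidate;
}
```

A caller then checks IS_ERR() on the result and only uses PTR_ERR()
on the failure path.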

^ permalink raw reply

* [PATCH net-next] ipv6: fix a bad cast in ip6_dst_lookup_tail()
From: Eric Dumazet @ 2012-07-06  7:19 UTC (permalink / raw)
  To: David Miller; +Cc: wfg, netdev, steffen.klassert
In-Reply-To: <20120706.000341.755169655502291970.davem@davemloft.net>

From: Eric Dumazet <edumazet@google.com>

Fix a bug in ip6_dst_lookup_tail(), where typeof(dst) is 
"struct dst_entry **", not "struct dst_entry *"

Reported-by: Fengguang Wu <wfg@linux.intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6d9c0ab..c6af596 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -975,7 +975,7 @@ static int ip6_dst_lookup_tail(struct sock *sk,
 	 * dst entry of the nexthop router
 	 */
 	rcu_read_lock();
-	rt = (struct rt6_info *) dst;
+	rt = (struct rt6_info *) *dst;
 	n = rt->n;
 	if (n && !(n->nud_state & NUD_VALID)) {
 		struct inet6_ifaddr *ifp;
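
For reference, a userspace sketch of what the two casts point at when
the argument is "struct dst_entry **". The structures are cut-down
stand-ins and both function names are hypothetical; neither result is
dereferenced here:

```c
/* Stand-in structures, not the kernel definitions. */
struct dst_entry { int flags; };
struct neighbour { int nud_state; };
struct rt6_info {
	struct dst_entry dst;	/* first member */
	struct neighbour *n;
};

/* Old cast: yields the address of the caller's pointer variable, so a
 * later rt->n would read whatever lies next to that variable. */
static struct rt6_info *cast_before_patch(struct dst_entry **dst)
{
	return (struct rt6_info *) dst;
}

/* New cast: yields the address of the route itself. */
static struct rt6_info *cast_after_patch(struct dst_entry **dst)
{
	return (struct rt6_info *) *dst;
}
```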

^ permalink raw reply related

* Re: [PATCH net-next] ipv6: fix a bad cast in ip6_dst_lookup_tail()
From: David Miller @ 2012-07-06  7:24 UTC (permalink / raw)
  To: eric.dumazet; +Cc: wfg, netdev, steffen.klassert
In-Reply-To: <1341559145.3265.141.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 06 Jul 2012 09:19:05 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> Fix a bug in ip6_dst_lookup_tail(), where typeof(dst) is 
> "struct dst_entry **", not "struct dst_entry *"
> 
> Reported-by: Fengguang Wu <wfg@linux.intel.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: BUG: unable to handle kernel paging request at 00000000d8be176d
From: Fengguang Wu @ 2012-07-06  7:37 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, steffen.klassert
In-Reply-To: <20120705.234229.526162609862035833.davem@davemloft.net>

Hi David,

> Fengguang Wu, can I ask you politely not to quote the guilty patch in
> its entirety when reporting bugs?  That screws up my workflow because
> the patch then gets installed as a new patch in patchwork and I
> therefore have to tick it off every time you report a bug.

Sorry for that!  Is it fine to _attach_ the referenced patch, or just
a raw diff?  Or, the commit SHA and subject are all you want to see?

Thanks,
Fengguang

^ permalink raw reply

* Re: [net-next RFC V5 0/5] Multiqueue virtio-net
From: Jason Wang @ 2012-07-06  7:42 UTC (permalink / raw)
  To: Rick Jones
  Cc: krkumar2, habanero, mashirle, kvm, mst, netdev, linux-kernel,
	virtualization, edumazet, tahm, jwhan, davem, sri
In-Reply-To: <4FF5D2B7.6080602@hp.com>

On 07/06/2012 01:45 AM, Rick Jones wrote:
> On 07/05/2012 03:29 AM, Jason Wang wrote:
>
>>
>> Test result:
>>
>> 1) 1 vm 2 vcpu 1q vs 2q, 1 - 1q, 2 - 2q, no pinning
>>
>> - Guest to External Host TCP STREAM
>> sessions size throughput1 throughput2   norm1 norm2
>> 1 64 650.55 655.61 100% 24.88 24.86 99%
>> 2 64 1446.81 1309.44 90% 30.49 27.16 89%
>> 4 64 1430.52 1305.59 91% 30.78 26.80 87%
>> 8 64 1450.89 1270.82 87% 30.83 25.95 84%
>
> Was the -D test-specific option used to set TCP_NODELAY?  I'm guessing 
> from your description of how packet sizes were smaller with multiqueue 
> and your need to hack tcp_write_xmit() it wasn't but since we don't 
> have the specific netperf command lines (hint hint :) I wanted to make 
> certain.
Hi Rick:

I didn't specify -D to disable Nagle. I also collected the tx packet
counts and the average packet size:

Guest to External Host ( 2vcpu 1q vs 2q )
sessions  size  tput-sq  tput-mq     %  norm-sq  norm-mq     %  #tx-pkts-sq  #tx-pkts-mq     %  avg-sz-sq  avg-sz-mq     %
       1    64   668.85   671.13  100%    25.80    26.86  104%       629038       627126   99%       1395       1403  100%
       2    64  1421.29  1345.40   94%    32.06    27.57   85%      1318498      1246721   94%       1413       1414  100%
       4    64  1469.96  1365.42   92%    32.44    27.04   83%      1362542      1277848   93%       1414       1401   99%
       8    64  1131.00  1361.58  120%    24.81    26.76  107%      1223700      1280970  104%       1395       1394   99%
       1   256  1883.98  1649.87   87%    60.67    58.48   96%      1542775      1465836   95%       1592       1472   92%
       2   256  4847.09  3539.74   73%    98.35    64.05   65%      2683346      3074046  114%       2323       1505   64%
       4   256  5197.33  3283.48   63%   109.14    62.39   57%      1819814      2929486  160%       3636       1467   40%
       8   256  5953.53  3359.22   56%   122.75    64.21   52%       906071      2924148  322%       8282       1502   18%
       1   512  3019.70  2646.07   87%    93.89    86.78   92%      2003780      2256077  112%       1949       1532   78%
       2   512  7455.83  5861.03   78%   173.79   104.43   60%      1200322      3577142  298%       7831       2114   26%
       4   512  8962.28  7062.20   78%   213.08   127.82   59%       468142      2594812  554%      24030       3468   14%
       8   512  7849.82  8523.85  108%   175.41   154.19   87%       304923      1662023  545%      38640       6479   16%
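
The "%" columns are not defined in the mail. Judging from the numbers,
they appear to be the multiqueue value divided by the single-queue
value, truncated to a whole percent. A sketch under that assumption
(mq_vs_sq_percent() is a hypothetical name):

```c
/* Multiqueue value relative to the single-queue value, as a whole
 * percentage truncated toward zero. The truncation is inferred from
 * the table rows, not stated by the author. */
static int mq_vs_sq_percent(double sq, double mq)
{
	return (int)(mq * 100.0 / sq);
}
```

For the 8-session, 256-byte row this gives 56 for throughput
(3359.22 vs 5953.53) and 322 for tx packets (2924148 vs 906071),
matching the table.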

When multiqueue is enabled, it does achieve a higher packet rate, but
with a much smaller packet size. It looks to me that multiqueue is
faster, so the guest TCP has less opportunity to build larger skbs to
send; lots of small packets have to be sent instead, which leads to
many more #exit and more vhost work. One interesting thing is that if I
run tcpdump in the host where the guest runs, I get an obvious
throughput increase. To verify the assumption, I hacked
tcp_write_xmit() with the following patch and set
tcp_tso_win_divisor=1; multiqueue can then outperform, or at least
match, the throughput of singlequeue, though it could introduce
latency, which I haven't measured.

I'm no TCP expert, but the changes look reasonable:
- we can do the full-sized TSO check in tcp_tso_should_defer() only for
westwood, according to tcp westwood
- run tcp_tso_should_defer() for tso_segs = 1 when TSO is enabled.

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index c465d3e..166a888 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1567,7 +1567,7 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)

         in_flight = tcp_packets_in_flight(tp);

-       BUG_ON(tcp_skb_pcount(skb) <= 1 || (tp->snd_cwnd <= in_flight));
+       BUG_ON(tp->snd_cwnd <= in_flight);

         send_win = tcp_wnd_end(tp) - TCP_SKB_CB(skb)->seq;

@@ -1576,9 +1576,11 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)

         limit = min(send_win, cong_win);

+#if 0
         /* If a full-sized TSO skb can be sent, do it. */
         if (limit >= sk->sk_gso_max_size)
                 goto send_now;
+#endif

         /* Middle in queue won't get any more data, full sendable already? */
         if ((skb != tcp_write_queue_tail(sk)) && (limit >= skb->len))
@@ -1795,10 +1797,9 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
                                                      (tcp_skb_is_last(sk, skb) ?
                                                       nonagle : TCP_NAGLE_PUSH))))
                                 break;
                                 break;
-               } else {
-                       if (!push_one && tcp_tso_should_defer(sk, skb))
-                               break;
                 }
+               if (!push_one && tcp_tso_should_defer(sk, skb))
+                       break;

                 limit = mss_now;
                 if (tso_segs > 1 && !tcp_urg_mode(tp))




>
> Instead of calling them throughput1 and throughput2, it might be more 
> clear in future to identify them as singlequeue and multiqueue.
>

Sure.
> Also, how are you combining the concurrent netperf results?  Are you 
> taking sums of what netperf reports, or are you gathering statistics 
> outside of netperf?
>

The throughput was simply summed from the netperf results, as the
netperf manual suggests. The CPU utilization was measured with mpstat.
>> - TCP RR
>> sessions size throughput1 throughput2   norm1 norm2
>> 50 1 54695.41 84164.98 153% 1957.33 1901.31 97%
>
> A single instance TCP_RR test would help confirm/refute any 
> non-trivial change in (effective) path length between the two cases.
>

Yes, I will test this, thanks.
> happy benchmarking,
>
> rick jones

^ permalink raw reply related

* RTL-8139C Link NOT detected in Linux 3.4.4 kernel
From: Anand Raj Manickam @ 2012-07-06  7:42 UTC (permalink / raw)
  To: netdev

I'm running a 3.4.4 kernel and I face a strange issue where the link
on the 8139too driver for the RTL-8139C is not detected.
The same setup works fine if I revert the kernel back to 2.6.36.
Has anybody faced the same issue?

^ permalink raw reply

* Re: [net-next RFC V5 4/5] virtio_net: multiqueue support
From: Jason Wang @ 2012-07-06  7:45 UTC (permalink / raw)
  To: Amos Kong
  Cc: krkumar2, habanero, mashirle, kvm, mst, netdev, linux-kernel,
	virtualization, edumazet, tahm, jwhan, davem, sri
In-Reply-To: <4FF5F2F2.9050307@redhat.com>

On 07/06/2012 04:02 AM, Amos Kong wrote:
> On 07/05/2012 06:29 PM, Jason Wang wrote:
>> This patch converts virtio_net to a multi queue device. After negotiating
>> the VIRTIO_NET_F_MULTIQUEUE feature, the virtio device has many tx/rx queue
>> pairs, and the driver can read the number from config space.
>>
>> The driver expects the number of rx/tx queue pairs to be equal to the number
>> of vcpus. To maximize the performance with these per-cpu rx/tx queue pairs,
>> some optimizations were introduced:
>>
>> - Txq selection is based on the processor id in order to avoid contending a lock
>>    whose owner may exit to the host.
>> - Since the txq/rxq pairs are per-cpu, the affinity hint is set to the cpu that
>>    owns the queue pair.
>>
>> Signed-off-by: Krishna Kumar<krkumar2@in.ibm.com>
>> Signed-off-by: Jason Wang<jasowang@redhat.com>
>> ---
> ...
>
>>
>>   static int virtnet_probe(struct virtio_device *vdev)
>>   {
>> -	int err;
>> +	int i, err;
>>   	struct net_device *dev;
>>   	struct virtnet_info *vi;
>> +	u16 num_queues, num_queue_pairs;
>> +
>> +	/* Find if host supports multiqueue virtio_net device */
>> +	err = virtio_config_val(vdev, VIRTIO_NET_F_MULTIQUEUE,
>> +				offsetof(struct virtio_net_config,
>> +				num_queues),&num_queues);
>> +
>> +	/* We need atleast 2 queue's */
>
> s/atleast/at least/
>
>
>> +	if (err || num_queues<  2)
>> +		num_queues = 2;
>> +	if (num_queues>  MAX_QUEUES * 2)
>> +		num_queues = MAX_QUEUES;
>                  num_queues = MAX_QUEUES * 2;
>
> MAX_QUEUES is the limitation of RX or TX.

Right, it's a typo, thanks.
>> +
>> +	num_queue_pairs = num_queues / 2;
> ...
>
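
For clarity, a standalone sketch of the clamping logic with Amos's
correction applied. MAX_QUEUES is given an assumed value because the
thread does not quote its definition, and clamp_queue_pairs() is a
hypothetical name:

```c
#define MAX_QUEUES 256	/* assumed value, for illustration only */

/* Returns the number of rx/tx queue pairs to use, given the result of
 * reading num_queues from config space (err != 0 means the read failed). */
static unsigned short clamp_queue_pairs(int err, unsigned short num_queues)
{
	/* We need at least 2 queues: one rx and one tx. */
	if (err || num_queues < 2)
		num_queues = 2;

	/* MAX_QUEUES limits rx or tx individually, so the combined
	 * ceiling is twice that. */
	if (num_queues > MAX_QUEUES * 2)
		num_queues = MAX_QUEUES * 2;

	return num_queues / 2;
}
```

With the original "num_queues = MAX_QUEUES;" an oversized value would
have been cut to MAX_QUEUES / 2 pairs instead of MAX_QUEUES.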

^ permalink raw reply

* Re: [net-next RFC V5 5/5] virtio_net: support negotiating the number of queues through ctrl vq
From: Jason Wang @ 2012-07-06  7:46 UTC (permalink / raw)
  To: Amos Kong
  Cc: krkumar2, habanero, mashirle, kvm, mst, netdev, linux-kernel,
	virtualization, edumazet, Sasha Levin, jwhan, sri, davem, tahm
In-Reply-To: <4FF5F3F7.8090307@redhat.com>

On 07/06/2012 04:07 AM, Amos Kong wrote:
> On 07/05/2012 08:51 PM, Sasha Levin wrote:
>> On Thu, 2012-07-05 at 18:29 +0800, Jason Wang wrote:
>>> @@ -1387,6 +1404,10 @@ static int virtnet_probe(struct virtio_device *vdev)
>>>          if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
>>>                  vi->has_cvq = true;
>>>
>
>>> +       /* Use single tx/rx queue pair as default */
>>> +       vi->num_queue_pairs = 1;
>>> +       vi->total_queue_pairs = num_queue_pairs;
> vi->total_queue_pairs also should be set to 1
>
>             vi->total_queue_pairs = 1;

Hi Amos:

total_queue_pairs is the max number of queue pairs that the device could 
provide, so it's ok here.
>> The code is using this "default" even if the amount of queue pairs it
>> wants was specified during initialization. This basically limits any
>> device to use 1 pair when starting up.
>>
>

^ permalink raw reply

* Re: BUG: unable to handle kernel paging request at 00000000d8be176d
From: Fengguang Wu @ 2012-07-06  7:52 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, steffen.klassert
In-Reply-To: <20120706073745.GA28197@localhost>

On Fri, Jul 06, 2012 at 03:37:45PM +0800, Fengguang Wu wrote:
> Hi David,
> 
> > Fengguang Wu, can I ask you politely not to quote the guilty patch in
> > its entirety when reporting bugs?  That screws up my workflow because
> > the patch then gets installed as a new patch in patchwork and I
> > therefore have to tick it off every time you report a bug.
> 
> Sorry for that!  Is it fine to _attach_ the referenced patch, or just
> a raw diff?  Or, the commit SHA and subject are all you want to see?

I used git-format-patch which makes a formal patch. How about git-show?
The output will be less like a formal patch, for example:

:       commit c5fb75aafab2fe31353b96cf556c1a689f8ac7e9
:       Author: Fengguang Wu <fengguang.wu@intel.com>
:       Date:   Thu Jun 14 22:36:29 2012 +0800

:           pms: fix build error in pms_probe()
:           
:           drivers/media/video/pms.c: In function ‘pms_probe’:
:           drivers/media/video/pms.c:1047:2: error: implicit declaration of function ‘kzalloc’ [-Werror=implicit-function-declaration]
:           drivers/media/video/pms.c:1047:6: warning: assignment makes pointer from integer without a cast [enabled by default]
:           drivers/media/video/pms.c:1116:2: error: implicit declaration of function ‘kfree’ [-Werror=implicit-function-declaration]
:           
:           Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>

:       diff --git a/drivers/media/video/pms.c b/drivers/media/video/pms.c
:       index af2d908..77f9c92 100644
:       --- a/drivers/media/video/pms.c
:       +++ b/drivers/media/video/pms.c
:       @@ -26,6 +26,7 @@
:        #include <linux/fs.h>
:        #include <linux/kernel.h>
:        #include <linux/mm.h>
:       +#include <linux/slab.h>
:        #include <linux/ioport.h>
:        #include <linux/init.h>
:        #include <linux/mutex.h>

Thanks,
Fengguang

^ permalink raw reply

* RE: [PATCH v2 net-next 2/2] r8169: support RTL8168G
From: hayeswang @ 2012-07-06  7:57 UTC (permalink / raw)
  To: 'Francois Romieu'; +Cc: netdev, linux-kernel
In-Reply-To: <20120704220458.GA1621@electric-eye.fr.zoreil.com>

[-- Attachment #1: Type: text/plain, Size: 642 bytes --]

Francois Romieu [romieu@fr.zoreil.com]
[...]
> 
> Any objection against merging it with the patch below ?
> 
> - more BUG() avoidance
> - save Joe P. some work
> - remove useless parenthesis
> - fix r8168g_mdio_write (if (reg_addr == 0x1f) { if (reg_addr == 0) snafu)
>   -> Please check this one.

That is fine.

> - long declarations before short ones
> - avoid unbounded loops
> - use a descriptive name for the 0xe8de value in rtl_hw_init_8168g.
>   -> Please suggest something better than "PLOP"

I have no idea how to name 0xe8de. Our hardware engineers haven't released any
datasheet about it. It seems to be related to OOB settings.

[-- Attachment #2: fix.patch --]
[-- Type: application/octet-stream, Size: 1656 bytes --]


Fix some code.
 
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/ethernet/realtek/r8169.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index c37aed9..1d7d6f0 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -3392,7 +3392,7 @@ static void rtl8168g_1_hw_phy_config(struct rtl8169_private *tp)
 	};
 
 	/* patch code for GPHY reset */
-	for (i = 0; ARRAY_SIZE(mac_ocp_patch); i++)
+	for (i = 0; i < ARRAY_SIZE(mac_ocp_patch); i++)
 		r8168_mac_ocp_write(tp, 0xf800 + 2*i, mac_ocp_patch[i]);
 	r8168_mac_ocp_write(tp, 0xfc26, 0x8000);
 	r8168_mac_ocp_write(tp, 0xfc28, 0x0075);
@@ -6764,6 +6764,7 @@ static void __devinit rtl_hw_init_8168g(struct rtl8169_private *tp)
 	u32 data;
 	int i;
 
+	tp->ocp_base = OCP_STD_PHY_BASE;
 	RTL_W32(MISC, RTL_R32(MISC) | RXDV_GATED_EN);
 
 	for (i = 0; i < RTL_LOOP_MAX; i++) {
@@ -6782,15 +6783,15 @@ static void __devinit rtl_hw_init_8168g(struct rtl8169_private *tp)
 	msleep(1);
 	RTL_W8(MCU, RTL_R8(MCU) & ~NOW_IS_OOB);
 
-	data = r8168_mac_ocp_read(ioaddr, PLOP);
+	data = r8168_mac_ocp_read(tp, PLOP);
 	data &= ~(1 << 14);
-	r8168_mac_ocp_write(ioaddr, PLOP, data);
+	r8168_mac_ocp_write(tp, PLOP, data);
 
 	rtl_mcu_wait_list_ready(ioaddr);
 
-	data = r8168_mac_ocp_read(ioaddr, PLOP);
+	data = r8168_mac_ocp_read(tp, PLOP);
 	data |= (1 << 15);
-	r8168_mac_ocp_write(ioaddr, PLOP, data);
+	r8168_mac_ocp_write(tp, PLOP, data);
 
 	rtl_mcu_wait_list_ready(ioaddr);
 }
-- 
1.7.10.2
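
A standalone sketch of the loop fix in the first hunk. The array
contents and fake_ocp_write() are hypothetical stand-ins for the
driver's patch table and r8168_mac_ocp_write():

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical patch words; the real table is not quoted in this mail. */
static const unsigned short mac_ocp_patch[] = { 0x0001, 0x0002, 0x0003 };

static unsigned int writes;
static unsigned short last_reg;

/* Stand-in for r8168_mac_ocp_write(): only records the access. */
static void fake_ocp_write(unsigned short reg, unsigned short val)
{
	(void)val;
	last_reg = reg;
	writes++;
}

static void load_patch(void)
{
	size_t i;

	/* The loop test must compare i against the array size. A bare
	 * ARRAY_SIZE(mac_ocp_patch) is a non-zero constant, so the
	 * original loop never terminated and indexed past the array. */
	for (i = 0; i < ARRAY_SIZE(mac_ocp_patch); i++)
		fake_ocp_write((unsigned short)(0xf800 + 2 * i), mac_ocp_patch[i]);
}
```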


^ permalink raw reply related

* Re: [net-next:master 294/295] drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:287:3: error: implicit declaration of function 'phy_init_eee'
From: Giuseppe CAVALLARO @ 2012-07-06  8:08 UTC (permalink / raw)
  To: wfg; +Cc: David S. Miller, netdev
In-Reply-To: <20120701113521.GA11294@localhost>

On 7/1/2012 1:35 PM, wfg@linux.intel.com wrote:
> Hi Giuseppe,
> 
> Here is a build dependency: the below commit used some functions
> provided by the immediate next commit "phy: add the EEE support and
> the way to access to the MMD registers."
> 

I had built everything without issues (on arm/sh/x86), but I'm looking
at the build again.
I hope to fix it if I am able to reproduce the problem; I'm on a
business trip and have some trouble running tests :-(

I'll try

Peppe

> ---
> Kernel build failed on
> 
> tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
> head:   a59a4d1921664da63d801ba477950114c71c88c9
> commit: d765955d2ae0b88781a0db3a5bacfe4241925e09 [294/295] stmmac: add the Energy Efficient Ethernet support
> config: x86_64-allmodconfig
> 
> All related error/warning messages:
> 
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_eee_init':
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:287:3: error: implicit declaration of function 'phy_init_eee' [-Werror=implicit-function-declaration]
> drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c: In function 'stmmac_get_ethtool_stats':
> drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:380:4: error: implicit declaration of function 'phy_get_eee_err' [-Werror=implicit-function-declaration]
> drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c: In function 'stmmac_ethtool_op_get_eee':
> drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:494:2: error: implicit declaration of function 'phy_ethtool_get_eee' [-Werror=implicit-function-declaration]
> drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c: In function 'stmmac_ethtool_op_set_eee':
> drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:519:2: error: implicit declaration of function 'phy_ethtool_set_eee' [-Werror=implicit-function-declaration]
> 
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:287:
>    284		/* MAC core supports the EEE feature. */
>    285		if (priv->dma_cap.eee) {
>    286			/* Check if the PHY supports EEE */
>  > 287			if (phy_init_eee(priv->phydev, 1))
>    288				goto out;
>    289	
>    290			priv->eee_active = 1;
> 
> ---
> 0-DAY kernel build testing backend         Open Source Technology Centre
> Fengguang Wu <wfg@linux.intel.com>                     Intel Corporation
> 


* Re: [net-next RFC V5 5/5] virtio_net: support negotiating the number of queues through ctrl vq
From: Sasha Levin @ 2012-07-06  8:10 UTC (permalink / raw)
  To: Jason Wang
  Cc: krkumar2, habanero, mashirle, kvm, mst, netdev, linux-kernel,
	virtualization, edumazet, tahm, jwhan, davem, sri
In-Reply-To: <4FF65966.9040600@redhat.com>

On Fri, 2012-07-06 at 11:20 +0800, Jason Wang wrote:
> On 07/05/2012 08:51 PM, Sasha Levin wrote:
> > On Thu, 2012-07-05 at 18:29 +0800, Jason Wang wrote:
> >> @@ -1387,6 +1404,10 @@ static int virtnet_probe(struct virtio_device *vdev)
> >>          if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
> >>                  vi->has_cvq = true;
> >>
> >> +       /* Use single tx/rx queue pair as default */
> >> +       vi->num_queue_pairs = 1;
> >> +       vi->total_queue_pairs = num_queue_pairs;
> > The code is using this "default" even if the amount of queue pairs it
> > wants was specified during initialization. This basically limits any
> > device to use 1 pair when starting up.
> >
> 
> Yes, currently the virtio-net driver uses 1 txq/rxq pair by default,
> since multiqueue may not outperform a single queue for all kinds of
> workload. So it's better to keep that as the default and let the user
> enable multiqueue with ethtool -L.

I think it makes sense to set it to 1 if the number of initial queue
pairs wasn't specified.

On the other hand, if a virtio-net driver was probed to provide
VIRTIO_NET_F_MULTIQUEUE and has set something reasonable in
virtio_net_config.num_queues, then that setting shouldn't be quietly
ignored and reset back to 1.

What I'm basically saying is that I agree that the *default* should be 1
- but if the user has explicitly asked for something else during
initialization, then the default should be overridden.


* Re: [net-next:master 294/295] drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:287:3: error: implicit declaration of function 'phy_init_eee'
From: Fengguang Wu @ 2012-07-06  8:15 UTC (permalink / raw)
  To: Giuseppe CAVALLARO; +Cc: David S. Miller, netdev
In-Reply-To: <4FF69D1A.2030903@st.com>

On Fri, Jul 06, 2012 at 10:08:58AM +0200, Giuseppe CAVALLARO wrote:
> On 7/1/2012 1:35 PM, wfg@linux.intel.com wrote:
> > Hi Giuseppe,
> > 
> > Here is a build dependency: the below commit used some functions
> > provided by the immediate next commit "phy: add the EEE support and
> > the way to access to the MMD registers."
> > 
> 
> I built everything without issues (on arm/sh/x86), but I'm looking at the
> build again.
> I hope to fix it if I can reproduce the problem; I'm on a business trip
> and have some trouble running tests :-(
> 
> I'll try.
> 
> Peppe

Hi Peppe, please note that the next commit actually builds fine.
So it should be a simple matter of swapping the order of the two commits.

I'm not sure if the net-next tree is rebase-able. If not, I'll stop doing
commit-by-commit build tests on it in the future.

Thanks,
Fengguang

> > ---
> > Kernel build failed on
> > 
> > tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
> > head:   a59a4d1921664da63d801ba477950114c71c88c9
> > commit: d765955d2ae0b88781a0db3a5bacfe4241925e09 [294/295] stmmac: add the Energy Efficient Ethernet support
> > config: x86_64-allmodconfig
> > 
> > All related error/warning messages:
> > 
> > drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_eee_init':
> > drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:287:3: error: implicit declaration of function 'phy_init_eee' [-Werror=implicit-function-declaration]
> > drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c: In function 'stmmac_get_ethtool_stats':
> > drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:380:4: error: implicit declaration of function 'phy_get_eee_err' [-Werror=implicit-function-declaration]
> > drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c: In function 'stmmac_ethtool_op_get_eee':
> > drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:494:2: error: implicit declaration of function 'phy_ethtool_get_eee' [-Werror=implicit-function-declaration]
> > drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c: In function 'stmmac_ethtool_op_set_eee':
> > drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:519:2: error: implicit declaration of function 'phy_ethtool_set_eee' [-Werror=implicit-function-declaration]
> > 
> > drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:287:
> >    284		/* MAC core supports the EEE feature. */
> >    285		if (priv->dma_cap.eee) {
> >    286			/* Check if the PHY supports EEE */
> >  > 287			if (phy_init_eee(priv->phydev, 1))
> >    288				goto out;
> >    289	
> >    290			priv->eee_active = 1;
> > 
> > ---
> > 0-DAY kernel build testing backend         Open Source Technology Centre
> > Fengguang Wu <wfg@linux.intel.com>                     Intel Corporation
> > 
> 


* Re: [PATCH] force dentry revalidation after namespace change
From: Glauber Costa @ 2012-07-06  9:00 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, netdev, Andrew Morton, Tejun Heo,
	Greg Kroah-Hartman
In-Reply-To: <8762a1vl76.fsf@xmission.com>

On 07/06/2012 03:31 AM, Eric W. Biederman wrote:
> The important difference there is that the type comes from the directory
> that the dirent is in, not from the dirent itself.
> 
>> >  	/* The sysfs dirent has been deleted */
>> >  	if (sd->s_flags & SYSFS_FLAG_REMOVED)
>> >  		goto out_bad;
> Glauber, do you think you can fix your patch and resubmit?
> 
> Eric

Yes. In a quick test it seems to work. I'll resubmit shortly.

