Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] net: NETCONSOLE_DYNAMIC need CONFIGFS_FS=y
From: Yong Zhang @ 2011-01-25  8:44 UTC (permalink / raw)
  To: Américo Wang
  Cc: linux-kernel, netdev, nab, David S. Miller, Kevin Hilman
In-Reply-To: <AANLkTi=4HkAnwGVxypjGZHkwTbBrBvKk4ZOPnHG+SVRS@mail.gmail.com>

On Tue, Jan 25, 2011 at 4:42 PM, Yong Zhang <yong.zhang0@gmail.com> wrote:
> On Tue, Jan 25, 2011 at 4:39 PM, Américo Wang <xiyou.wangcong@gmail.com> wrote:
>> On Tue, Jan 25, 2011 at 04:22:28PM +0800, Yong Zhang wrote:
>>>With CONFIGFS_FS=m && NETCONSOLE_DYNAMIC=y, build error:
>>>
>>>drivers/built-in.o: In function `netconsole_target_put':
>>>build/drivers/net/netconsole.c:133: undefined reference to `config_item_put'
>>>drivers/built-in.o: In function `netconsole_target_get':
>>>build/drivers/net/netconsole.c:127: undefined reference to `config_item_get'
>>>build/drivers/net/netconsole.c:127: undefined reference to `config_item_get'
>>>drivers/built-in.o: In function `netconsole_target_put':
>>>build/drivers/net/netconsole.c:133: undefined reference to `config_item_put'
>>>drivers/built-in.o: In function `drop_netconsole_target':
>>>build/drivers/net/netconsole.c:634: undefined reference to `config_item_put'
>>>drivers/built-in.o: In function `make_netconsole_target':
>>>build/drivers/net/netconsole.c:607: undefined reference to `config_item_init_type_name'
>>>drivers/built-in.o: In function `dynamic_netconsole_init':
>>>build/drivers/net/netconsole.c:109: undefined reference to `config_group_init'
>>>build/drivers/net/netconsole.c:111: undefined reference to `configfs_register_subsystem'
>>>drivers/built-in.o: In function `dynamic_netconsole_exit':
>>>build/drivers/net/netconsole.c:116: undefined reference to `configfs_unregister_subsystem'
>>>
>>
>> Yeah, this was reported, another way to fix this is "select CONFIGFS_FS".
>> Using "select" makes you configure NETCONSOLE_DYNAMIC without caring
>> about if you have configured CONFIGFS_FS=y, which seems better.
>
> You mean commit d9f9ab51e55e36379773752ffeaac677b080d469?

That commit show "select CONFIGFS_FS" is the wrong way :(

>
> Thanks,
> Yong
>
> --
> Only stand for myself
>



-- 
Only stand for myself

^ permalink raw reply

* Re: [PATCH v2 2/2] tlan: add suspend/resume support
From: Sakari Ailus @ 2011-01-25  8:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, chessman, joe
In-Reply-To: <20110124.150728.246519223.davem@davemloft.net>

David Miller wrote:
> From: Sakari Ailus<sakari.ailus@iki.fi>
> Date: Fri, 21 Jan 2011 22:59:31 +0200
>
>> Add suspend/resume support to tlan driver. This allows not unloading the
>> driver over suspend/resume.
>>
>> Also, start (or now, wake) the queue after resetting the adapter --- not the
>> other way around.
>>
>> Signed-off-by: Sakari Ailus<sakari.ailus@iki.fi>
>
> Applied.

Thanks, David!

-- 
Sakari Ailus
sakari.ailus@iki.fi

^ permalink raw reply

* Re: [PATCH] ipv6: Revert 'administrative down' address handling changes.
From: Eric W. Biederman @ 2011-01-25  8:51 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, shemminger, brian.haley, lorenzo, herbert
In-Reply-To: <20110124.233813.246546891.davem@davemloft.net>

David Miller <davem@davemloft.net> writes:

> From: David Miller <davem@davemloft.net>
> Date: Sun, 23 Jan 2011 23:41:01 -0800 (PST)
>
>> Eric B. and co., please do some testing to make sure all of your
>> disable_ipv6 cases are functioning properly with this applied.
>
> Ping?

In progress.  I had to make a small change to your patch to get it
to apply against 2.6.37.  neigh_ifdown has not been removed from the
beginning of addrconf_ifdown there.  The piece that was failing for
me is not failing now so, so far so good.

It was reported that in 2.6.37 there was a new regression that
1connecting to ::1 when ipv6 was disabled would not fail immediately but
would have to wait a while.  With your patch applied I am not seeing
that behavior either.

Tomorrow I should know if I see any weird side effects with your patch,
after my regression tests for everything else have finished running.

My apologies for not testing Stephens patch more thoroughly in the lead
up to 2.6.37.  I goofed and bear some of the responsibility.  Hopefully
we can get the deeper problems fixed, and make Stephens use case work as
well.

Eric

^ permalink raw reply

* Re: [PATCH] net: NETCONSOLE_DYNAMIC need CONFIGFS_FS=y
From: Yong Zhang @ 2011-01-25  9:01 UTC (permalink / raw)
  To: Américo Wang
  Cc: linux-kernel, netdev, nab, David S. Miller, Kevin Hilman
In-Reply-To: <20110125083915.GE13618@cr0.nay.redhat.com>

On Tue, Jan 25, 2011 at 4:39 PM, Américo Wang <xiyou.wangcong@gmail.com> wrote:
> On Tue, Jan 25, 2011 at 04:22:28PM +0800, Yong Zhang wrote:
>>With CONFIGFS_FS=m && NETCONSOLE_DYNAMIC=y, build error:
>>
>>drivers/built-in.o: In function `netconsole_target_put':
>>build/drivers/net/netconsole.c:133: undefined reference to `config_item_put'
>>drivers/built-in.o: In function `netconsole_target_get':
>>build/drivers/net/netconsole.c:127: undefined reference to `config_item_get'
>>build/drivers/net/netconsole.c:127: undefined reference to `config_item_get'
>>drivers/built-in.o: In function `netconsole_target_put':
>>build/drivers/net/netconsole.c:133: undefined reference to `config_item_put'
>>drivers/built-in.o: In function `drop_netconsole_target':
>>build/drivers/net/netconsole.c:634: undefined reference to `config_item_put'
>>drivers/built-in.o: In function `make_netconsole_target':
>>build/drivers/net/netconsole.c:607: undefined reference to `config_item_init_type_name'
>>drivers/built-in.o: In function `dynamic_netconsole_init':
>>build/drivers/net/netconsole.c:109: undefined reference to `config_group_init'
>>build/drivers/net/netconsole.c:111: undefined reference to `configfs_register_subsystem'
>>drivers/built-in.o: In function `dynamic_netconsole_exit':
>>build/drivers/net/netconsole.c:116: undefined reference to `configfs_unregister_subsystem'
>>
>
> Yeah, this was reported, another way to fix this is "select CONFIGFS_FS".
> Using "select" makes you configure NETCONSOLE_DYNAMIC without caring
> about if you have configured CONFIGFS_FS=y, which seems better.

OK. Find the original reporting.

Sorry for the noise.

-- 
Only stand for myself

^ permalink raw reply

* Re: does intel X520-SR(ixgbe) support RSS on single VLAN?
From: Rui @ 2011-01-25  9:03 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev@vger.kernel.org, e1000-devel@lists.sourceforge.net
In-Reply-To: <1295924734.2896.6.camel@localhost>

On Tue, Jan 25, 2011 at 11:05 AM, Ben Hutchings
<bhutchings@solarflare.com> wrote:
> On Tue, 2011-01-25 at 10:10 +0800, Rui wrote:
>> On Tue, Jan 25, 2011 at 1:09 AM, Alexander Duyck
>> <alexander.h.duyck@intel.com> wrote:
>> > On 1/24/2011 6:18 AM, Rui wrote:
>> >>
>> >> hi
>> >> does intel X520-SR support RSS on single VLAN?
>> >>
>> >> tested with 3 different vlan id and priority packets
>> >> What I saw is that all packets were always delivered to the same RxQ.
>> >> looks can not get a different RSS index for these packet?
>> >> any setting needed?
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> >> the body of a message to majordomo@vger.kernel.org
>> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >
>> > The X520 should have no problems hashing on a single VLAN tagged frame.
>> >  However the VLAN will not be a part of the RSS hash.  The  only components
>> > of the hash are the IPv4/IPv6 source and destination addresses, and if the
>> > flow is TCP then the port numbers.
>> >
>> hi alexander
>> I got these information from the intel community:
>>
>> 'I asked our software engineers about your question, and this is what I learned.
>> You cannot filter by just VLAN or VLAN priority.  The L4 type will
>> also play a role in the filter and as such you would only be able to
>> filter TCP, UDP, and SCTP packets that are bound for a VLAN.
>> The command itself to setup a filter is “ethtool –U ethX flow-type
>> tcp4 vlan 0x2000 vlan-mask 0xE000 action Y” where X is the correct
>> index for the interface and Y is the queue you want to route the
>> traffic to.  This would have to be repeated for udp4 and sctp4.
>> I hope this will help.
>> Mark H"
>
> The mask specifies bits to be ignored, so if you want to filter on the
> basis of only the priority bits you should use vlan-mask 0xfff.  Unless
> this is another inconsistency I failed to notice...
>
>> so my question is that the VLAN is PART of the RSS or not?
>
> It's not part of any specified Toeplitz hash.  However, some hardware
> supports adding the hash (after indirection) to the queue number
> specified by a filter.  Currently the ethtool API doesn't have a way to
> request that.
>
>> looks the
>> perfect filter support vlan id ?can the perfect filter support
>> wildchar,such as: flow-type ANY?
>
> It is possible to specify this using flow-type ether, but the ixgbe
> driver does not yet support that (and I have no idea whether the
> hardware does).
>
> Ben.
>
> --
> Ben Hutchings, Senior Software Engineer, Solarflare Communications
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
>
>

I got this msg:
Cannot add new RX n-tuple filter: Operation not supported
This command is only supported after 2.6.34?

------------------------------------------------------------------------------
Special Offer-- Download ArcSight Logger for FREE (a $49 USD value)!
Finally, a world-class log management solution at an even better price-free!
Download using promo code Free_Logger_4_Dev2Dev. Offer expires 
February 28th, so secure your free ArcSight Logger TODAY! 
http://p.sf.net/sfu/arcsight-sfd2d
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired

^ permalink raw reply

* Re: [PATCH] netfilter: ipvs: fix compiler warnings
From: Patrick McHardy @ 2011-01-25 10:25 UTC (permalink / raw)
  To: Simon Horman
  Cc: Changli Gao, Wensong Zhang, Julian Anastasov, David S. Miller,
	netdev, lvs-devel, netfilter-devel
In-Reply-To: <20110125050535.GB7183@verge.net.au>

On 25.01.2011 06:05, Simon Horman wrote:
> On Tue, Jan 25, 2011 at 12:40:18PM +0800, Changli Gao wrote:
>> Fix compiler warnings when IP_VS_DBG() isn't defined.
>>
>> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
> 
> Thanks, I will queue this up for when nf-next-2.6 is open again.

It's already open.

^ permalink raw reply

* Re: [PATCH v2 02/16] net: change netdev->features to u32
From: Michał Mirosław @ 2011-01-25 11:07 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, bhutchings
In-Reply-To: <20110124.153004.232756003.davem@davemloft.net>

On Mon, Jan 24, 2011 at 03:30:04PM -0800, David Miller wrote:
> From: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> Date: Sat, 22 Jan 2011 23:14:12 +0100 (CET)
> > Quoting Ben Hutchings: we presumably won't be defining features that
> > can only be enabled on 64-bit architectures.
> > 
> > Occurences found by `grep -r` on net/, drivers/net, include/
> > 
> > Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> I'll apply this, with one change, adding the suggestion that
> features and vlan_features be moved next to each other in
> struct netdev, from Eric Dumazet.

You can use the one I sent in reply to Eric's suggestion.
It's this one: http://patchwork.ozlabs.org/patch/80056/

Best Regards,
Michał Mirosław

^ permalink raw reply

* [PATCH v3] smc91x: add devicetree support
From: Thomas Chou @ 2011-01-25 12:11 UTC (permalink / raw)
  To: Grant Likely, Nicolas Pitre
  Cc: linux-kernel, nios2-dev, devicetree-discuss, netdev, Thomas Chou
In-Reply-To: <1295939399-4197-1-git-send-email-thomas@wytron.com.tw>

Signed-off-by: Thomas Chou <thomas@wytron.com.tw>
---
v2 specify part numbers in compat as Grant suggested.
v3 more specific part name.

 drivers/net/smc91x.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/net/smc91x.c b/drivers/net/smc91x.c
index 726df61..65b2a70 100644
--- a/drivers/net/smc91x.c
+++ b/drivers/net/smc91x.c
@@ -81,6 +81,7 @@ static const char version[] =
 #include <linux/ethtool.h>
 #include <linux/mii.h>
 #include <linux/workqueue.h>
+#include <linux/of.h>
 
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
@@ -2394,6 +2395,13 @@ static int smc_drv_resume(struct device *dev)
 	return 0;
 }
 
+static const struct of_device_id smc91x_match[] = {
+	{ .compatible = "smsc,lan91c94", },
+	{ .compatible = "smsc,lan91c111", },
+	{},
+}
+MODULE_DEVICE_TABLE(of, smc91x_match);
+
 static struct dev_pm_ops smc_drv_pm_ops = {
 	.suspend	= smc_drv_suspend,
 	.resume		= smc_drv_resume,
@@ -2406,6 +2414,9 @@ static struct platform_driver smc_driver = {
 		.name	= CARDNAME,
 		.owner	= THIS_MODULE,
 		.pm	= &smc_drv_pm_ops,
+#ifdef CONFIG_OF
+		.of_match_table = smc91x_match,
+#endif
 	},
 };
 
-- 
1.7.3.5

^ permalink raw reply related

* Re: [PATCH] netfilter: ipvs: fix compiler warnings
From: Simon Horman @ 2011-01-25 13:09 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Changli Gao, Wensong Zhang, Julian Anastasov, David S. Miller,
	netdev, lvs-devel, netfilter-devel
In-Reply-To: <4D3EA526.5030704@trash.net>

On Tue, Jan 25, 2011 at 11:25:42AM +0100, Patrick McHardy wrote:
> On 25.01.2011 06:05, Simon Horman wrote:
> > On Tue, Jan 25, 2011 at 12:40:18PM +0800, Changli Gao wrote:
> >> Fix compiler warnings when IP_VS_DBG() isn't defined.
> >>
> >> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
> > 
> > Thanks, I will queue this up for when nf-next-2.6 is open again.
> 
> It's already open.
> 

Hi Patrick,

I'm a little unsure how you would like me to differentiate
between fixes for 2.6.38 - e.g. my previous two pull requests.
And patches intended for 2.6.39.

In any case I will send a pull request that includes
Changli's change against nf-next-2.6.

^ permalink raw reply

* Re: [PATCH] netfilter: ipvs: fix compiler warnings
From: Simon Horman @ 2011-01-25 13:15 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Changli Gao, Wensong Zhang, Julian Anastasov, David S. Miller,
	netdev, lvs-devel, netfilter-devel
In-Reply-To: <20110125130933.GA2841@verge.net.au>

On Tue, Jan 25, 2011 at 11:09:35PM +1000, Simon Horman wrote:
> On Tue, Jan 25, 2011 at 11:25:42AM +0100, Patrick McHardy wrote:
> > On 25.01.2011 06:05, Simon Horman wrote:
> > > On Tue, Jan 25, 2011 at 12:40:18PM +0800, Changli Gao wrote:
> > >> Fix compiler warnings when IP_VS_DBG() isn't defined.
> > >>
> > >> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
> > > 
> > > Thanks, I will queue this up for when nf-next-2.6 is open again.
> > 
> > It's already open.
> > 
> 
> Hi Patrick,
> 
> I'm a little unsure how you would like me to differentiate
> between fixes for 2.6.38 - e.g. my previous two pull requests.
> And patches intended for 2.6.39.
> 
> In any case I will send a pull request that includes
> Changli's change against nf-next-2.6.

Actually, I realise that Changli's change is actually
2.6.38 material. I'll send a pull request. Please ignore
my noise about 2.6.39.

^ permalink raw reply

* [PATCH 2/2] netfilter: ipvs: fix compiler warnings
From: Simon Horman @ 2011-01-25 13:17 UTC (permalink / raw)
  To: lvs-devel, netfilter-devel, netdev
  Cc: Changli Gao, Julian Anastasov, Patrick McHardy, Simon Horman
In-Reply-To: <1295961426-7402-1-git-send-email-horms@verge.net.au>

From: Changli Gao <xiaosuo@gmail.com>

Fix compiler warnings when IP_VS_DBG() isn't defined.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
 net/netfilter/ipvs/ip_vs_core.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index f36a84f..d889f4f 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1894,9 +1894,7 @@ static int __net_init __ip_vs_init(struct net *net)
 
 static void __net_exit __ip_vs_cleanup(struct net *net)
 {
-	struct netns_ipvs *ipvs = net_ipvs(net);

^ permalink raw reply related

* [GIT PULL nf-next-2.6] IPVS changes for 2.6.38-rc3
From: Simon Horman @ 2011-01-25 13:17 UTC (permalink / raw)
  To: lvs-devel, netfilter-devel, netdev
  Cc: Changli Gao, Julian Anastasov, Patrick McHardy

Hi Patrick,

please consider pulling
  git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6.git for-patrick

Which includes the following changes, both of which I believe 2.6.38 material.

Changli Gao (1):
      netfilter: ipvs: fix compiler warnings

Hans Schillstrom (1):
      IPVS netns BUG, register sysctl for root ns

 net/netfilter/ipvs/ip_vs_core.c |    4 +---
 net/netfilter/ipvs/ip_vs_ctl.c  |    2 +-
 2 files changed, 2 insertions(+), 4 deletions(-)


^ permalink raw reply

* [PATCH 1/2] IPVS netns BUG, register sysctl for root ns
From: Simon Horman @ 2011-01-25 13:17 UTC (permalink / raw)
  To: lvs-devel, netfilter-devel, netdev
  Cc: Changli Gao, Julian Anastasov, Patrick McHardy, Hans Schillstrom,
	Simon Horman
In-Reply-To: <1295961426-7402-1-git-send-email-horms@verge.net.au>

From: Hans Schillstrom <hans.schillstrom@ericsson.com>

The newly created table was not used when register sysctl for a new namespace.
I.e. sysctl doesn't work for other than root namespace (init_net)

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
 net/netfilter/ipvs/ip_vs_ctl.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 68b8033..98df59a 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -3556,7 +3556,7 @@ int __net_init __ip_vs_control_init(struct net *net)
 
 
 	ipvs->sysctl_hdr = register_net_sysctl_table(net, net_vs_ctl_path,
-						  vs_vars);
+						     tbl);
 	if (ipvs->sysctl_hdr == NULL)
 		goto err_reg;
 	ip_vs_new_estimator(net, ipvs->tot_stats);
-- 
1.7.2.3


^ permalink raw reply related

* Re: [PATCH] netfilter: ipvs: fix compiler warnings
From: Patrick McHardy @ 2011-01-25 13:51 UTC (permalink / raw)
  To: Simon Horman
  Cc: Changli Gao, Wensong Zhang, Julian Anastasov, David S. Miller,
	netdev, lvs-devel, netfilter-devel
In-Reply-To: <20110125130933.GA2841@verge.net.au>

On 25.01.2011 14:09, Simon Horman wrote:
> On Tue, Jan 25, 2011 at 11:25:42AM +0100, Patrick McHardy wrote:
>> On 25.01.2011 06:05, Simon Horman wrote:
>>> On Tue, Jan 25, 2011 at 12:40:18PM +0800, Changli Gao wrote:
>>>> Fix compiler warnings when IP_VS_DBG() isn't defined.
>>>>
>>>> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
>>>
>>> Thanks, I will queue this up for when nf-next-2.6 is open again.
>>
>> It's already open.
>>
> 
> Hi Patrick,
> 
> I'm a little unsure how you would like me to differentiate
> between fixes for 2.6.38 - e.g. my previous two pull requests.
> And patches intended for 2.6.39.

Oh, I thought those were intended for -next because it included
a netns fix. Best is to just state the intended tree in your
pull request.

> In any case I will send a pull request that includes
> Changli's change against nf-next-2.6.

Thanks.

^ permalink raw reply

* Re: [GIT PULL nf-next-2.6] IPVS changes for 2.6.38-rc3
From: Patrick McHardy @ 2011-01-25 14:01 UTC (permalink / raw)
  To: Simon Horman
  Cc: lvs-devel, netfilter-devel, netdev, Changli Gao, Julian Anastasov
In-Reply-To: <1295961426-7402-1-git-send-email-horms@verge.net.au>

On 25.01.2011 14:17, Simon Horman wrote:
> Hi Patrick,
> 
> please consider pulling
>   git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-test-2.6.git for-patrick
> 
> Which includes the following changes, both of which I believe 2.6.38 material.
> 
> Changli Gao (1):
>       netfilter: ipvs: fix compiler warnings
> 
> Hans Schillstrom (1):
>       IPVS netns BUG, register sysctl for root ns

That tree is based on nf-next-2.6, so I can't pull this into my nf-2.6
tree.

^ permalink raw reply

* Re: [PATCH] xen: netfront: Drop GSO SKBs which do not have csum_blank.
From: Ian Campbell @ 2011-01-25 17:09 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: netdev@vger.kernel.org, xen-devel@lists.xensource.com
In-Reply-To: <4D3DBD1C.6020503@goop.org>

On Mon, 2011-01-24 at 17:55 +0000, Jeremy Fitzhardinge wrote:
> No, I didn't mean to drop the skb_is_gso() test.  But still, the if()s
> can be folded to share the same body.

My original opinion was that sharing that bit of body at the expense of
an increasingly hard to decipher if conditional was not a good trade
off.

However thinking about it again I think the test logic would be better
of factored out into a validate_incoming_skb() type function anyway.
Short (2 patch) replacement series follows.

Ian.

^ permalink raw reply

* [PATCH 1/2] xen: netfront: refactor code for checking validity of incoming skbs
From: Ian Campbell @ 2011-01-25 17:09 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Ian Campbell, Jeremy Fitzhardinge, xen-devel, netdev
In-Reply-To: <1295975376.14780.6595.camel@zakaz.uk.xensource.com>

Makes future additional validation clearer.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: xen-devel@lists.xensource.com
Cc: netdev@vger.kernel.org
---
 drivers/net/xen-netfront.c |   24 +++++++++++++++++-------
 1 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 546de57..4dc347b 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -809,6 +809,18 @@ out:
 	return err;
 }
 
+static int validate_incoming_skb(struct sk_buff *skb)
+{
+	/*
+	 * If the SKB is partial then we must be able to setup the
+	 * checksum fields in the skb.
+	 */
+	if (skb->ip_summed == CHECKSUM_PARTIAL && skb_checksum_setup(skb))
+		return 0;
+
+	return 1;
+}
+
 static int handle_incoming_queue(struct net_device *dev,
 				 struct sk_buff_head *rxq)
 {
@@ -829,13 +841,11 @@ static int handle_incoming_queue(struct net_device *dev,
 		/* Ethernet work: Delayed to here as it peeks the header. */
 		skb->protocol = eth_type_trans(skb, dev);
 
-		if (skb->ip_summed == CHECKSUM_PARTIAL) {
-			if (skb_checksum_setup(skb)) {
-				kfree_skb(skb);
-				packets_dropped++;
-				dev->stats.rx_errors++;
-				continue;
-			}
+		if (!validate_incoming_skb(skb)) {
+			kfree_skb(skb);
+			packets_dropped++;
+			dev->stats.rx_errors++;
+			continue;
 		}
 
 		dev->stats.rx_packets++;
-- 
1.5.6.5


^ permalink raw reply related

* [PATCH 2/2] xen: netfront: Drop GSO SKBs which do not have csum_blank.
From: Ian Campbell @ 2011-01-25 17:10 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Ian Campbell, Jeremy Fitzhardinge, xen-devel, netdev
In-Reply-To: <1295975376.14780.6595.camel@zakaz.uk.xensource.com>

The Linux network stack expects all GSO SKBs to have ip_summed ==
CHECKSUM_PARTIAL (which implies that the frame contains a partial
checksum) and the Xen network ring protocol similarly expects an SKB
which has GSO set to also have NETRX_csum_blank (which also implies a
partial checksum). Therefore drop such frames on receive otherwise
they will trigger the warning in skb_gso_segment.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: xen-devel@lists.xensource.com
Cc: netdev@vger.kernel.org
---
 drivers/net/xen-netfront.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 4dc347b..0ea47da 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -818,6 +818,10 @@ static int validate_incoming_skb(struct sk_buff *skb)
 	if (skb->ip_summed == CHECKSUM_PARTIAL && skb_checksum_setup(skb))
 		return 0;

+	/* A GSO SKB must be partial. */
+	if (skb->ip_summed != CHECKSUM_PARTIAL && skb_is_gso(skb))
+		return 0;
+
 	return 1;
 }

-- 
1.5.6.5

^ permalink raw reply related

* [PATCH net-next-2.6] pktgen: speedup fragmented skbs
From: Eric Dumazet @ 2011-01-25 17:13 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

We spend lot of time clearing pages in pktgen.
(Or not clearing them on ipv6 and leaking kernel memory)

Since we dont modify them, we can use one zeroed page, and get
references on it. This page can use NUMA affinity as well.

Define pktgen_finalize_skb() helper, used both in ipv4 and ipv6

Results using skbs with one frag :

Before patch :

Result: OK: 608980458(c608978520+d1938) nsec, 1000000000
(100byte,1frags)
  1642088pps 1313Mb/sec (1313670400bps) errors: 0

After patch :

Result: OK: 345285014(c345283891+d1123) nsec, 1000000000
(100byte,1frags)
  2896158pps 2316Mb/sec (2316926400bps) errors: 0

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/core/pktgen.c |  234 +++++++++++++++++---------------------------
 1 files changed, 93 insertions(+), 141 deletions(-)

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index a9e7fc4..17c4e16 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -251,6 +251,7 @@ struct pktgen_dev {
 	int max_pkt_size;	/* = ETH_ZLEN; */
 	int pkt_overhead;	/* overhead for MPLS, VLANs, IPSEC etc */
 	int nfrags;
+	struct page *page;
 	u64 delay;		/* nano-seconds */
 
 	__u64 count;		/* Default No packets to send */
@@ -1134,6 +1135,10 @@ static ssize_t pktgen_if_write(struct file *file,
 		if (node_possible(value)) {
 			pkt_dev->node = value;
 			sprintf(pg_result, "OK: node=%d", pkt_dev->node);
+			if (pkt_dev->page) {
+				put_page(pkt_dev->page);
+				pkt_dev->page = 0;
+			}
 		}
 		else
 			sprintf(pg_result, "ERROR: node not possible");
@@ -2605,6 +2610,90 @@ static inline __be16 build_tci(unsigned int id, unsigned int cfi,
 	return htons(id | (cfi << 12) | (prio << 13));
 }
 
+static void pktgen_finalize_skb(struct pktgen_dev *pkt_dev, struct sk_buff *skb,
+				int datalen)
+{
+	struct timeval timestamp;
+	struct pktgen_hdr *pgh;
+
+	pgh = (struct pktgen_hdr *)skb_put(skb, sizeof(*pgh));
+	datalen -= sizeof(*pgh);
+
+	if (pkt_dev->nfrags <= 0) {
+		pgh = (struct pktgen_hdr *)skb_put(skb, datalen);
+		memset(pgh + 1, 0, datalen);
+	} else {
+		int frags = pkt_dev->nfrags;
+		int i, len;
+
+
+		if (frags > MAX_SKB_FRAGS)
+			frags = MAX_SKB_FRAGS;
+		len = datalen - frags * PAGE_SIZE;
+		if (len > 0) {
+			memset(skb_put(skb, len), 0, len);
+			datalen = frags * PAGE_SIZE;
+		}
+
+		i = 0;
+		while (datalen > 0) {
+			if (unlikely(!pkt_dev->page)) {
+				int node = numa_node_id();
+
+				if (pkt_dev->node >= 0 && (pkt_dev->flags & F_NODE))
+					node = pkt_dev->node;
+				pkt_dev->page = alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, 0);
+				if (!pkt_dev->page)
+					break;
+			}
+			skb_shinfo(skb)->frags[i].page = pkt_dev->page;
+			get_page(pkt_dev->page);
+			skb_shinfo(skb)->frags[i].page_offset = 0;
+			skb_shinfo(skb)->frags[i].size =
+			    (datalen < PAGE_SIZE ? datalen : PAGE_SIZE);
+			datalen -= skb_shinfo(skb)->frags[i].size;
+			skb->len += skb_shinfo(skb)->frags[i].size;
+			skb->data_len += skb_shinfo(skb)->frags[i].size;
+			i++;
+			skb_shinfo(skb)->nr_frags = i;
+		}
+
+		while (i < frags) {
+			int rem;
+
+			if (i == 0)
+				break;
+
+			rem = skb_shinfo(skb)->frags[i - 1].size / 2;
+			if (rem == 0)
+				break;
+
+			skb_shinfo(skb)->frags[i - 1].size -= rem;
+
+			skb_shinfo(skb)->frags[i] =
+			    skb_shinfo(skb)->frags[i - 1];
+			get_page(skb_shinfo(skb)->frags[i].page);
+			skb_shinfo(skb)->frags[i].page =
+			    skb_shinfo(skb)->frags[i - 1].page;
+			skb_shinfo(skb)->frags[i].page_offset +=
+			    skb_shinfo(skb)->frags[i - 1].size;
+			skb_shinfo(skb)->frags[i].size = rem;
+			i++;
+			skb_shinfo(skb)->nr_frags = i;
+		}
+	}
+
+	/* Stamp the time, and sequence number,
+	 * convert them to network byte order
+	 */
+	pgh->pgh_magic = htonl(PKTGEN_MAGIC);
+	pgh->seq_num = htonl(pkt_dev->seq_num);
+
+	do_gettimeofday(&timestamp);
+	pgh->tv_sec = htonl(timestamp.tv_sec);
+	pgh->tv_usec = htonl(timestamp.tv_usec);
+}
+
 static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
 					struct pktgen_dev *pkt_dev)
 {
@@ -2613,7 +2702,6 @@ static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
 	struct udphdr *udph;
 	int datalen, iplen;
 	struct iphdr *iph;
-	struct pktgen_hdr *pgh = NULL;
 	__be16 protocol = htons(ETH_P_IP);
 	__be32 *mpls;
 	__be16 *vlan_tci = NULL;                 /* Encapsulates priority and VLAN ID */
@@ -2729,76 +2817,7 @@ static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
 			   pkt_dev->pkt_overhead);
 	skb->dev = odev;
 	skb->pkt_type = PACKET_HOST;
-
-	if (pkt_dev->nfrags <= 0) {
-		pgh = (struct pktgen_hdr *)skb_put(skb, datalen);
-		memset(pgh + 1, 0, datalen - sizeof(struct pktgen_hdr));
-	} else {
-		int frags = pkt_dev->nfrags;
-		int i, len;
-
-		pgh = (struct pktgen_hdr *)(((char *)(udph)) + 8);
-
-		if (frags > MAX_SKB_FRAGS)
-			frags = MAX_SKB_FRAGS;
-		if (datalen > frags * PAGE_SIZE) {
-			len = datalen - frags * PAGE_SIZE;
-			memset(skb_put(skb, len), 0, len);
-			datalen = frags * PAGE_SIZE;
-		}
-
-		i = 0;
-		while (datalen > 0) {
-			struct page *page = alloc_pages(GFP_KERNEL | __GFP_ZERO, 0);
-			skb_shinfo(skb)->frags[i].page = page;
-			skb_shinfo(skb)->frags[i].page_offset = 0;
-			skb_shinfo(skb)->frags[i].size =
-			    (datalen < PAGE_SIZE ? datalen : PAGE_SIZE);
-			datalen -= skb_shinfo(skb)->frags[i].size;
-			skb->len += skb_shinfo(skb)->frags[i].size;
-			skb->data_len += skb_shinfo(skb)->frags[i].size;
-			i++;
-			skb_shinfo(skb)->nr_frags = i;
-		}
-
-		while (i < frags) {
-			int rem;
-
-			if (i == 0)
-				break;
-
-			rem = skb_shinfo(skb)->frags[i - 1].size / 2;
-			if (rem == 0)
-				break;
-
-			skb_shinfo(skb)->frags[i - 1].size -= rem;
-
-			skb_shinfo(skb)->frags[i] =
-			    skb_shinfo(skb)->frags[i - 1];
-			get_page(skb_shinfo(skb)->frags[i].page);
-			skb_shinfo(skb)->frags[i].page =
-			    skb_shinfo(skb)->frags[i - 1].page;
-			skb_shinfo(skb)->frags[i].page_offset +=
-			    skb_shinfo(skb)->frags[i - 1].size;
-			skb_shinfo(skb)->frags[i].size = rem;
-			i++;
-			skb_shinfo(skb)->nr_frags = i;
-		}
-	}
-
-	/* Stamp the time, and sequence number,
-	 * convert them to network byte order
-	 */
-	if (pgh) {
-		struct timeval timestamp;
-
-		pgh->pgh_magic = htonl(PKTGEN_MAGIC);
-		pgh->seq_num = htonl(pkt_dev->seq_num);
-
-		do_gettimeofday(&timestamp);
-		pgh->tv_sec = htonl(timestamp.tv_sec);
-		pgh->tv_usec = htonl(timestamp.tv_usec);
-	}
+	pktgen_finalize_skb(pkt_dev, skb, datalen);
 
 #ifdef CONFIG_XFRM
 	if (!process_ipsec(pkt_dev, skb, protocol))
@@ -2980,7 +2999,6 @@ static struct sk_buff *fill_packet_ipv6(struct net_device *odev,
 	struct udphdr *udph;
 	int datalen;
 	struct ipv6hdr *iph;
-	struct pktgen_hdr *pgh = NULL;
 	__be16 protocol = htons(ETH_P_IPV6);
 	__be32 *mpls;
 	__be16 *vlan_tci = NULL;                 /* Encapsulates priority and VLAN ID */
@@ -3083,75 +3101,7 @@ static struct sk_buff *fill_packet_ipv6(struct net_device *odev,
 	skb->dev = odev;
 	skb->pkt_type = PACKET_HOST;
 
-	if (pkt_dev->nfrags <= 0)
-		pgh = (struct pktgen_hdr *)skb_put(skb, datalen);
-	else {
-		int frags = pkt_dev->nfrags;
-		int i;
-
-		pgh = (struct pktgen_hdr *)(((char *)(udph)) + 8);
-
-		if (frags > MAX_SKB_FRAGS)
-			frags = MAX_SKB_FRAGS;
-		if (datalen > frags * PAGE_SIZE) {
-			skb_put(skb, datalen - frags * PAGE_SIZE);
-			datalen = frags * PAGE_SIZE;
-		}
-
-		i = 0;
-		while (datalen > 0) {
-			struct page *page = alloc_pages(GFP_KERNEL, 0);
-			skb_shinfo(skb)->frags[i].page = page;
-			skb_shinfo(skb)->frags[i].page_offset = 0;
-			skb_shinfo(skb)->frags[i].size =
-			    (datalen < PAGE_SIZE ? datalen : PAGE_SIZE);
-			datalen -= skb_shinfo(skb)->frags[i].size;
-			skb->len += skb_shinfo(skb)->frags[i].size;
-			skb->data_len += skb_shinfo(skb)->frags[i].size;
-			i++;
-			skb_shinfo(skb)->nr_frags = i;
-		}
-
-		while (i < frags) {
-			int rem;
-
-			if (i == 0)
-				break;
-
-			rem = skb_shinfo(skb)->frags[i - 1].size / 2;
-			if (rem == 0)
-				break;
-
-			skb_shinfo(skb)->frags[i - 1].size -= rem;
-
-			skb_shinfo(skb)->frags[i] =
-			    skb_shinfo(skb)->frags[i - 1];
-			get_page(skb_shinfo(skb)->frags[i].page);
-			skb_shinfo(skb)->frags[i].page =
-			    skb_shinfo(skb)->frags[i - 1].page;
-			skb_shinfo(skb)->frags[i].page_offset +=
-			    skb_shinfo(skb)->frags[i - 1].size;
-			skb_shinfo(skb)->frags[i].size = rem;
-			i++;
-			skb_shinfo(skb)->nr_frags = i;
-		}
-	}
-
-	/* Stamp the time, and sequence number,
-	 * convert them to network byte order
-	 * should we update cloned packets too ?
-	 */
-	if (pgh) {
-		struct timeval timestamp;
-
-		pgh->pgh_magic = htonl(PKTGEN_MAGIC);
-		pgh->seq_num = htonl(pkt_dev->seq_num);
-
-		do_gettimeofday(&timestamp);
-		pgh->tv_sec = htonl(timestamp.tv_sec);
-		pgh->tv_usec = htonl(timestamp.tv_usec);
-	}
-	/* pkt_dev->seq_num++; FF: you really mean this? */
+	pktgen_finalize_skb(pkt_dev, skb, datalen);
 
 	return skb;
 }
@@ -3884,6 +3834,8 @@ static int pktgen_remove_device(struct pktgen_thread *t,
 	free_SAs(pkt_dev);
 #endif
 	vfree(pkt_dev->flows);
+	if (pkt_dev->page)
+		put_page(pkt_dev->page);
 	kfree(pkt_dev);
 	return 0;
 }



^ permalink raw reply related

* Re: [PATCH net-next-2.6] pktgen: speedup fragmented skbs
From: Ben Greear @ 2011-01-25 17:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1295975635.3588.337.camel@edumazet-laptop>

On 01/25/2011 09:13 AM, Eric Dumazet wrote:
> We spend lot of time clearing pages in pktgen.
> (Or not clearing them on ipv6 and leaking kernel memory)
>
> Since we dont modify them, we can use one zeroed page, and get
> references on it. This page can use NUMA affinity as well.
>
> Define pktgen_finalize_skb() helper, used both in ipv4 and ipv6

Some devices, like vlans, can change the skb, but perhaps they
will not actually mess with the paged data?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply

* Re: [net-next 08/12] ixgb: convert to new VLAN model
From: Jesse Gross @ 2011-01-25 17:22 UTC (permalink / raw)
  To: Tantilov, Emil S
  Cc: Kirsher, Jeffrey T, davem@davemloft.net, netdev@vger.kernel.org,
	bphilips@novell.com, Pieper, Jeffrey E
In-Reply-To: <EA929A9653AAE14F841771FB1DE5A136602DEE63CB@rrsmsx501.amr.corp.intel.com>

On Sun, Jan 23, 2011 at 4:25 PM, Tantilov, Emil S
<emil.s.tantilov@intel.com> wrote:
> Jesse Gross wrote:
>> On Thu, Jan 6, 2011 at 7:29 PM,  <jeffrey.t.kirsher@intel.com> wrote:
>>> +static int ixgb_set_flags(struct net_device *netdev, u32 data) +{
>>> +       struct ixgb_adapter *adapter = netdev_priv(netdev); +
>>> bool need_reset; +       int rc;
>>> +
>>> +       /*
>>> +        * TX vlan insertion does not work per HW design when Rx
>>> stripping is +        * disabled.  Disable txvlan when rxvlan is
>>> off. +        */ +       if ((data & ETH_FLAG_RXVLAN) !=
>>> (netdev->features & NETIF_F_HW_VLAN_RX)) +               data ^=
>>> ETH_FLAG_TXVLAN;
>>
>> Does this really do the right thing?  If the RX vlan setting is
>> changed, it will do the opposite of what the user requested for TX
>> vlan?
>>
>> So if I start with both on (the default) and turn them both off in one
>> command (a valid setting), I will get RX off and TX on (an invalid
>> setting).
>>
>> Why not:
>>
>> if (!(data & ETH_FLAG_RXVLAN))
>>         data &= ~ETH_FLAG_TXVLAN;
>
> Yeah that works for disabling rxvlan, but what if rxvlan is disabled, and the user attempts to enable txvlan? At least our validation argued that we should make it work both ways. Perhaps something like the following?
>
>        if (!(data & ETH_FLAG_RXVLAN) &&
>           (netdev->features & NETIF_F_HW_VLAN_TX))
>                data &= ~ETH_FLAG_TXVLAN;
>        else if (data & ETH_FLAG_TXVLAN)
>                data |= ETH_FLAG_RXVLAN;

I think the logic above does what you describe and will always result
in a consistent state.  Turning dependent features on when needed is a
little bit inconsistent with the rest of Ethtool (for example, turning
on TSO when checksum offloading is off will not enable checksum
offloading, it will produce an error).  However, I know that drivers
aren't completely consistent here and the most important part is that
it enforces valid states, so I don't have a strong opinion.  Ben's
previous suggestion of Ethtool querying again after the operation and
reporting any flags that were automatically changed would help a lot
here.

^ permalink raw reply

* Re: [PATCH net-next-2.6] pktgen: speedup fragmented skbs
From: Eric Dumazet @ 2011-01-25 17:47 UTC (permalink / raw)
  To: Ben Greear; +Cc: David Miller, netdev
In-Reply-To: <4D3F06BF.7010903@candelatech.com>

Le mardi 25 janvier 2011 à 09:22 -0800, Ben Greear a écrit :
> On 01/25/2011 09:13 AM, Eric Dumazet wrote:
> > We spend lot of time clearing pages in pktgen.
> > (Or not clearing them on ipv6 and leaking kernel memory)
> >
> > Since we dont modify them, we can use one zeroed page, and get
> > references on it. This page can use NUMA affinity as well.
> >
> > Define pktgen_finalize_skb() helper, used both in ipv4 and ipv6
> 
> Some devices, like vlans, can change the skb, but perhaps they
> will not actually mess with the paged data?

Yes, its absolutely forbidden to write on paged data.

If necessary, a COW must be done.




^ permalink raw reply

* Re: STMMAC driver: NFS Problem on 2.6.37
From: Chuck Lever @ 2011-01-25 18:04 UTC (permalink / raw)
  To: deepaksi
  Cc: Armando VISCONTI, Trond Myklebust, netdev-u79uwXL29TY76Z2rM5mHXA,
	Linux NFS Mailing List, Shiraz HASHIM, Viresh KUMAR,
	Peppe CAVALLARO, amitgoel
In-Reply-To: <4D3EBA54.4020308-qxv4g6HH51o@public.gmane.org>

See analysis in line.

On Jan 25, 2011, at 6:56 AM, deepaksi wrote:

> Hi All,
> 
> Please find more debug information on the issue. The log below is related to the kernel debug information.
> 
> 1. Kernel Debug Information log : ( The relevant portions are marked in bold )
> 
> [    2.980000] stmmac: Rx Checksum Offload Engine supported
> [    3.980000]  ** phydev->state 4 
> [    4.000000] IP-Config: Complete:                                                                                        /* Ethernet device open print */
> [    4.000000]      device=eth0, addr=192.168.1.10, mask=255.255.255.0, gw=192.168.1.1,
> [    4.010000]      host=192.168.1.10, domain=, nis-domain=(none),
> [    4.010000]      bootserver=192.168.1.1, rootserver=192.168.1.1, rootpath=
> [    4.020000] Root-NFS: nfsroot=/opt/STM/STLinux-2.4/devkit/armv7/target
> [    4.030000] NFS: nfs mount opts='nolock,addr=192.168.1.1'
> [    4.030000] NFS:   parsing nfs mount option 'nolock'
> [    4.040000] NFS:   parsing nfs mount option 'addr=192.168.1.1'
> [    4.040000] NFS: MNTPATH: '/opt/STM/STLinux-2.4/devkit/armv7/target'
> [    4.050000] NFS: sending MNT request for 192.168.1.1:/opt/STM/STLinux-2.4/devkit/armv7/target
> [    4.060000] Calling rpc_create
> [    4.060000] RPC:       set up xprt to 192.168.1.1 (autobind) via tcp
> [    4.070000] RPC:       created transport cf894000 with 16 slots
> [    4.080000] xprt_create_transport: RPC:       created transport cf894000 with 16 slots
> [    4.080000] RPC:       creating mount client for 192.168.1.1 (xprt cf894000)
> [    4.090000] RPC:       creating UNIX authenticator for client cfa90800
> [    4.100000] Calling rpc_ping
> [    4.100000] RPC:       new task initialized, procpid 1
> [    4.100000] RPC:       allocated task cfa11980
> [    4.110000] RPC:     1 __rpc_execute flags=0x680
> [    4.110000] RPC:     1 call_start mount3 proc NULL (sync)

The first RPC request is a MNT protocol request to determine the NFS file handle for the export to be mounted.

> [    4.120000] RPC:     1 call_reserve (status 0)
> [    4.120000] RPC:     1 reserved req cfb80000 xid f68b1f23
> [    4.130000] RPC:     1 call_reserveresult (status 0)
> [    4.130000] RPC:     1 call_refresh (status 0)
> [    4.140000] RPC:     1 holding NULL cred c0492798
> [    4.140000] RPC:     1 refreshing NULL cred c0492798
> [    4.150000] RPC:     1 call_refreshresult (status 0)
> [    4.150000] RPC:     1 call_allocate (status 0)
> [    4.160000] RPC:     1 allocated buffer of size 92 at cf894800
> [    4.160000] RPC:     1 call_bind (status 0)
> [    4.170000] RPC:     1 rpcb_getport_async(192.168.1.1, 100005, 3, 6)
> [    4.170000] RPC:     1 sleep_on(queue "xprt_binding" time 4294937713)
> [    4.180000] RPC:     1 added to queue cf8940a4 "xprt_binding"
> [    4.190000] RPC:     1 setting alarm for 60000 ms
> [    4.190000] RPC:     1 rpcb_getport_async: trying rpcbind version 2
> [    4.200000] Calling rpc_create
> [    4.200000] RPC:       set up xprt to 192.168.1.1 (port 111) via tcp
> [    4.210000] RPC:       created transport cf895000 with 16 slots
> [    4.210000] xprt_create_transport: RPC:       created transport cf895000 with 16 slots
> [    4.220000] RPC:       creating rpcbind client for 192.168.1.1 (xprt cf895000)
> [    4.230000] RPC:       creating UNIX authenticator for client cfa90a00
> [    4.230000] rpc_create returns 0xcfa90a00
> [    4.240000] RPC:       new task initialized, procpid 1
> [    4.240000] RPC:       allocated task cfa11a00
> [    4.250000] RPC:       rpc_release_client(cfa90a00)
> [    4.250000] RPC:     1 sync task going to sleep
> [    4.260000] RPC:     2 __rpc_execute flags=0x681
> [    4.260000] RPC:     2 call_start rpcbind2 proc GETPORT (async)

This is a child RPC request, which is an rpcbind query to discover the port where the NFS server's MNT service resides.  This is necessary before a transport socket for the MNT request can be set up.  The MNT request is put to sleep while the rpcbind query proceeds.

> [    4.270000] RPC:     2 call_reserve (status 0)
> [    4.270000] RPC:     2 reserved req cfb81000 xid 59bc347d
> [    4.280000] RPC:     2 call_reserveresult (status 0)
> [    4.280000] RPC:     2 call_refresh (status 0)
> [    4.290000] RPC:     2 looking up UNIX cred
> [    4.290000] RPC:       looking up UNIX cred
> [    4.300000] RPC:       allocating UNIX cred for uid 0 gid 0
> [    4.300000] RPC:     2 refreshing UNIX cred cfa11a80
> [    4.310000] RPC:     2 call_refreshresult (status 0)
> [    4.310000] RPC:     2 call_allocate (status 0)
> [    4.320000] RPC:     2 allocated buffer of size 412 at cf895800
> [    4.320000] RPC:     2 call_bind (status 0)
> [    4.330000] RPC:     2 call_connect xprt cf895000 is not connected
> [    4.330000] RPC:     2 xprt_connect xprt cf895000 is not connected
> [    4.340000] RPC:     2 sleep_on(queue "xprt_pending" time 4294937730)
> [    4.340000] RPC:     2 added to queue cf8951dc "xprt_pending"
> [    4.350000] RPC:     2 setting alarm for 60000 ms
> [    4.360000] RPC:       xs_connect scheduled xprt cf895000
> [    4.360000] RPC:       xs_bind 0.0.0.0:0: ok (0)
> [    4.370000] RPC:       worker connecting xprt cf895000 via tcp to 192.168.1.1 (port 111)
> [    4.370000] RPC:       cf895000 connect status 115 connected 0 sock state 2

A transport socket is set up for the rpcbind query, which connects to port 111 on the server.  The RPC client puts the rpcbind query request to sleep waiting for the transport socket connection to complete.  This is done in a worker thread.

> [    6.980000] PHY: 0:05 - Link is Up - 1000/Full            /* As soon as network link is up, post auto negotiation, we get the error below ( Unable to reach icmp)*/
> [    7.380000] RPC:       xs_error_report client cf895000...    /* The ethernet phy on our board typically takes 3 seconds for auto negotiation process */
> [    7.380000] RPC:       error 113                                 /* In case the frequency of the phy state machine (in phy framework) polling is increased from HZ to HZ/10, we do not face this issue*/

After three seconds, the initial socket connection attempt is getting EHOSTUNREACH.  The transport socket's error report callback wakes up RPC task 2 with status -EAGAIN.

"we do not face this issue" -- in that case, does the connection attempt succeed?  Do you have debugging output for that case?

> [    7.390000] RPC:     2 __rpc_wake_up_task (now 4294938035)
> [    7.390000] RPC:     2 disabling timer
> [    7.390000] RPC:     2 removed from queue cf8951dc "xprt_pending"
> [    7.400000] RPC:       __rpc_wake_up_task done
> [    7.410000] RPC:       xs_tcp_state_change client cf895000...
> [    7.410000] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [    7.420000] RPC:       disconnected transport cf895000

The transport is marked disconnected here by the transport socket's state change callback.  The socket state is TCP_CLOSE.  xprt_disconnect_done() wakes up any pending tasks with -EAGAIN (a second wake up).

> [    7.420000] RPC:     2 __rpc_execute flags=0x681
> [    7.420000] ************* skb->protocol = 608**************
> [    7.420000] ************* skb->protocol = 8**************
> [    7.440000] RPC:     2 xprt_connect_status: retrying

Here is RPC task 2 waking up in xprt_connect_status with status -EAGAIN.

> [    7.440000] RPC:     2 call_connect_status (status -11)
> [    7.450000] RPC:     2 call_transmit (status 0)
> [    7.450000] RPC:     2 xprt_prepare_transmit
> [    7.460000] RPC:     2 rpc_xdr_encode (status 0)
> [    7.460000] RPC:     2 marshaling UNIX cred cfa11a80
> [    7.470000] RPC:     2 using AUTH_UNIX cred cfa11a80 to wrap rpc data
> [    7.470000] RPC:     2 encoding PMAP_GETPORT call (100005, 3, 6, 0)
> [    7.480000] RPC:     2 xprt_transmit(92)
> [    7.480000] RPC:       xs_tcp_send_request(92) = -113

> [    7.490000] RPC:       sendmsg returned unrecognized error 113

I don't see a fresh connection attempt here.  The RPC client appears to be attempting to send on an unconnected transport socket.  The next send request is getting EHOSTUNREACH again, and thus the rpcbind query fails.

I thought there used to be code in xprt_prepare_transmit() to reconnect if needed, but I no longer find it here.  call_connect_status() handles -EAGAIN by advancing the state machine to call_transmit(), so the FSM does expect call_transmit() and its children to deal with an unconnected transport.

Possibly relevant commits: 2a491991, c8485e4d

In any event, the outcome is probably correct: if the RPC client gets EHOSTUNREACH while trying to connect or send, then it will fail this RPC request.

> [    7.490000] RPC:       xs_tcp_state_change client cf895000...
> [    7.500000] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [    7.510000] RPC:       disconnected transport cf895000
> [    7.510000] RPC:       wake_up_next(cf895174 "xprt_resend")
> [    7.520000] RPC:       wake_up_next(cf89510c "xprt_sending")
> [    7.520000] RPC:       setting port for xprt cf894000 to 0
> [    7.530000] RPC:     2 rpcb_getport_done(status -113, port 0)
> [    7.530000] RPC:     2 return 0, status -113
> [    7.540000] RPC:     2 release task
> [    7.540000] RPC:       freeing buffer of size 412 at cf895800
> [    7.550000] RPC:     2 release request cfb81000
> [    7.550000] RPC:       wake_up_next(cf895244 "xprt_backlog")
> [    7.560000] RPC:       rpc_release_client(cfa90a00)
> [    7.560000] RPC:       destroying rpcbind client for 192.168.1.1
> [    7.570000] RPC:       destroying transport cf895000
> [    7.570000] RPC:       xs_destroy xprt cf895000
> [    7.580000] RPC:       xs_close xprt cf895000
> [    7.580000] RPC:       disconnected transport cf895000
> [    7.590000] RPC:     2 freeing task
> [    7.590000] RPC:     1 __rpc_wake_up_task (now 4294938055)
> [    7.600000] RPC:     1 disabling timer
> [    7.600000] RPC:     1 removed from queue cf8940a4 "xprt_binding"
> [    7.610000] RPC:       __rpc_wake_up_task done
> [    7.610000] RPC:     1 sync task resuming
> [    7.620000] RPC:     1 remote rpcbind unreachable: -113
> [    7.620000] RPC:     1 return 0, status -113
> [    7.630000] RPC:     1 release task
> [    7.630000] RPC:       freeing buffer of size 92 at cf894800
> [    7.630000] RPC:     1 release request cfb80000
> [    7.640000] RPC:       wake_up_next(cf894244 "xprt_backlog")
> [    7.640000] RPC:       rpc_release_client(cfa90800)
> [    7.650000] RPC:     1 freeing task
> [    7.650000] rpc_ping returned -113
> [    7.660000] Calling rpc_ping failed

Note that the RPC ping is for RPC task 1, the MNT request, not for the rpcbind query (RPC task 2).

An RPC ping for the MNT request can't proceed until the rpcbind query has discovered the correct server port to connect to.  The rpcbind query is done first, then, and is what is failing to connect.  So RPC ping is a red herring here -- the RPC client doesn't get far enough to send a MNT NULL request.

  [ ... remaining dump contents snipped ... ]

> We have made following observations
> 1. It seems that the time taken by phy auto negotiation process is long and as soon as the link gets up rpc ping request is getting timed out and we receive "Unable to reach ICMP" error. The time out error is same even if you do not connect a network cable and do a nfs boot.
> 2. We tried to modify the rate at which the work queue is scheduled in the phy framework. instead of scheduling every HZ ( 1 sec), we modified it to HZ/10. We did not received the error. This probably reduced the margin of the phy framework informing the kernel that the link is up.
> 3. We tried to use another network card and did a nfs boot. The only relevant difference we could find was the time of auto negotiation.

Can you post a similar debugging dump of a root mount that succeeds using a different network card?

> Are there some changes in the kernel framework w.r.t rpc ping time out ? This problem was not there in previous kernels.

There have been changes in the RPC socket code around how it manages recovery from failed attempts to connect.  We also have new logic now in the RPC client that causes RPC ping to fail immediately if a host can't be reached.

Thanks for your efforts so far.  It would be helpful if you could bisect to determine which commit(s) introduced this RPC client behavior (or any related changes to your network driver behavior).

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* RE: [net-next 08/12] ixgb: convert to new VLAN model
From: Tantilov, Emil S @ 2011-01-25 18:20 UTC (permalink / raw)
  To: Jesse Gross
  Cc: Kirsher, Jeffrey T, davem@davemloft.net, netdev@vger.kernel.org,
	bphilips@novell.com, Pieper, Jeffrey E, Ben Hutchings
In-Reply-To: <AANLkTikS7zw5OfPFTJBUeUyy39Tx26Wkg9FX3n77-v1G@mail.gmail.com>

>-----Original Message-----
>From: Jesse Gross [mailto:jesse@nicira.com]
>Sent: Tuesday, January 25, 2011 9:23 AM
>To: Tantilov, Emil S
>Cc: Kirsher, Jeffrey T; davem@davemloft.net; netdev@vger.kernel.org;
>bphilips@novell.com; Pieper, Jeffrey E
>Subject: Re: [net-next 08/12] ixgb: convert to new VLAN model
>
>On Sun, Jan 23, 2011 at 4:25 PM, Tantilov, Emil S
><emil.s.tantilov@intel.com> wrote:
>> Jesse Gross wrote:
>>> On Thu, Jan 6, 2011 at 7:29 PM,  <jeffrey.t.kirsher@intel.com> wrote:
>>>> +static int ixgb_set_flags(struct net_device *netdev, u32 data) +{
>>>> +       struct ixgb_adapter *adapter = netdev_priv(netdev); +
>>>> bool need_reset; +       int rc;
>>>> +
>>>> +       /*
>>>> +        * TX vlan insertion does not work per HW design when Rx
>>>> stripping is +        * disabled.  Disable txvlan when rxvlan is
>>>> off. +        */ +       if ((data & ETH_FLAG_RXVLAN) !=
>>>> (netdev->features & NETIF_F_HW_VLAN_RX)) +               data ^=
>>>> ETH_FLAG_TXVLAN;
>>>
>>> Does this really do the right thing?  If the RX vlan setting is
>>> changed, it will do the opposite of what the user requested for TX
>>> vlan?
>>>
>>> So if I start with both on (the default) and turn them both off in one
>>> command (a valid setting), I will get RX off and TX on (an invalid
>>> setting).
>>>
>>> Why not:
>>>
>>> if (!(data & ETH_FLAG_RXVLAN))
>>>         data &= ~ETH_FLAG_TXVLAN;
>>
>> Yeah that works for disabling rxvlan, but what if rxvlan is disabled, and
>the user attempts to enable txvlan? At least our validation argued that we
>should make it work both ways. Perhaps something like the following?
>>
>>        if (!(data & ETH_FLAG_RXVLAN) &&
>>           (netdev->features & NETIF_F_HW_VLAN_TX))
>>                data &= ~ETH_FLAG_TXVLAN;
>>        else if (data & ETH_FLAG_TXVLAN)
>>                data |= ETH_FLAG_RXVLAN;
>
>I think the logic above does what you describe and will always result
>in a consistent state.  Turning dependent features on when needed is a
>little bit inconsistent with the rest of Ethtool (for example, turning
>on TSO when checksum offloading is off will not enable checksum
>offloading, it will produce an error).  However, I know that drivers

That is the reason I asked, as I don't want to keep bouncing the patch back and forth. Personally I like the idea of helping the user and adjusting the flags to something that works rather than a generic error message.

>aren't completely consistent here and the most important part is that
>it enforces valid states, so I don't have a strong opinion.  Ben's
>previous suggestion of Ethtool querying again after the operation and
>reporting any flags that were automatically changed would help a lot
>here.

Sure, but I think a savvy user would always check the result of an ethtool command (ie. `ethtool -K` followed with `ethtool -k`, -A/-a, etc).
`
Added Ben in case he has comments.

Thanks,
Emil

^ permalink raw reply

* [PATCH net-next-2.6] bonding: fix return value of bonding_store_slaves_active()
From: Jiri Pirko @ 2011-01-25 18:53 UTC (permalink / raw)
  To: netdev; +Cc: netdev, fubar, davem

count is incorrectly returned even in case of fail. Return ret instead.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 8fd0174..f33155f 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1595,7 +1595,7 @@ static ssize_t bonding_store_slaves_active(struct device *d,
 		}
 	}
 out:
-	return count;
+	return ret;
 }
 static DEVICE_ATTR(all_slaves_active, S_IRUGO | S_IWUSR,
 		   bonding_show_slaves_active, bonding_store_slaves_active);

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox