Netdev List

* Re: [PATCH v3 7/8] cgroup: Assign subsystem IDs during compile time
From: Tejun Heo @ 2012-09-11 21:36 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: netdev, cgroups,
	Daniel Wagner, David S. Miller, Andrew Morton, Eric Dumazet,
	Gao feng, Glauber Costa, Jamal Hadi Salim, John Fastabend,
	Kamezawa Hiroyuki, Li Zefan, Neil Horman
In-Reply-To: <504FADC1.4060503-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>

Hello, Daniel.

On Tue, Sep 11, 2012 at 11:31:45PM +0200, Daniel Wagner wrote:
> >Oops, that was wrong.  net_prio_subsys_id itself becomes constant.
> >Let's please better explain why the RCU trick removal is safe then.
> 
> In the last paragraph in the commit message I tried to document why
> it is safe to remove the RCU trick. Not good enough?

It isn't clear to me why it was necessary before and why it now
becomes unnecessary.  It states what the code does and that it's no
longer necessary but I'd really like more elaboration.

Thanks.

-- 
tejun

* Re: [PATCH v3 7/8] cgroup: Assign subsystem IDs during compile time
From: Daniel Wagner @ 2012-09-11 21:31 UTC (permalink / raw)
  To: Tejun Heo
  Cc: netdev, cgroups, Daniel Wagner, David S. Miller, Andrew Morton,
	Eric Dumazet, Gao feng, Glauber Costa, Jamal Hadi Salim,
	John Fastabend, Kamezawa Hiroyuki, Li Zefan, Neil Horman
In-Reply-To: <20120911210821.GB7677@google.com>

On 09/11/2012 11:08 PM, Tejun Heo wrote:
> On Tue, Sep 11, 2012 at 02:01:09PM -0700, Tejun Heo wrote:
>> For example, it's not evident the above is correct and it's buried
>> with all other changes.  Can't we just assign the fixed subsys ID to
>> net_prio_subsys_id in this patch?  This patch would be correct without
>> any netprio changes, no?
>
> Oops, that was wrong.  net_prio_subsys_id itself becomes constant.
> Let's please better explain why the RCU trick removal is safe then.

In the last paragraph in the commit message I tried to document why it 
is safe to remove the RCU trick. Not good enough?

* Re: [PATCH v3 7/8] cgroup: Assign subsystem IDs during compile time
From: Tejun Heo @ 2012-09-11 21:27 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: netdev, cgroups, Daniel Wagner, David S. Miller, Andrew Morton,
	Eric Dumazet, Gao feng, Glauber Costa, Jamal Hadi Salim,
	John Fastabend, Kamezawa Hiroyuki, Li Zefan, Neil Horman
In-Reply-To: <504FA9D6.4050201@monom.org>

Hello, Daniel

On Tue, Sep 11, 2012 at 11:15:02PM +0200, Daniel Wagner wrote:
> If net_prio_subsys_id is changed to be an enum, then the compiler
> will report an error:
> 
> error: lvalue required as left operand of assignment
> 
> that was the reason why I kept this change here. I think I just
> don't get what you are trying to tell me.
> 
> >Please separate these changes and explain them.
> 
> I will do that as soon as I figure out what you are telling me.

Sorry about that.  I was thinking that was a separate variable.  Well,
we can introduce a variable, change the id allocation and then swap it
back to the constant, but that would be too much.  Let's just try to
explain it better.

Thanks.

-- 
tejun

* Re: [PATCH v3 0/8] cgroup: Assign subsystem IDs during compile time
From: Daniel Wagner @ 2012-09-11 21:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: netdev, cgroups,
	Daniel Wagner, David S. Miller, Andrew Morton, Eric Dumazet,
	Gao feng, Glauber Costa, Jamal Hadi Salim, John Fastabend,
	Kamezawa Hiroyuki, Li Zefan, Neil Horman
In-Reply-To: <20120911211151.GC7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

Hi Tejun,

On 09/11/2012 11:11 PM, Tejun Heo wrote:
> Hello,
>
> On Tue, Sep 11, 2012 at 06:26:06PM +0200, Daniel Wagner wrote:
>> From: Daniel Wagner <daniel.wagner@bmw-carit.de>
>>
>> Hi,
>>
>> In this version I tried to concentrate on the main topic of this
>> series, so I removed some of the things which were not really needed
>> and I have to admit the result looks much better. So I hope that will
>> simplify the review for you.
>
> Other than the few nits I like the whole series and, given that all
> the changes are entangled with cgroup core, I would like to route it
> through cgroup/for-3.7 if nobody objects.  Daniel, can you please
> rebase on top of the following branch when sending out updated
> version?
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-3.7

Sure, I'll rebase and update the series tomorrow. Thanks for your 
patience with me.

cheers,
daniel

* Re: [PATCH v3 7/8] cgroup: Assign subsystem IDs during compile time
From: Daniel Wagner @ 2012-09-11 21:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: netdev, cgroups,
	Daniel Wagner, David S. Miller, Andrew Morton, Eric Dumazet,
	Gao feng, Glauber Costa, Jamal Hadi Salim, John Fastabend,
	Kamezawa Hiroyuki, Li Zefan, Neil Horman
In-Reply-To: <20120911210435.GA7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

Hi Tejun,

On 09/11/2012 11:04 PM, Tejun Heo wrote:
> Hello, Daniel.
>
> One more thing.
>
> On Tue, Sep 11, 2012 at 06:26:13PM +0200, Daniel Wagner wrote:
>> From: Daniel Wagner <daniel.wagner@bmw-carit.de>
>>
>> WARNING: With this change it is not possible to load externally built
>> controllers anymore.
>>
>> In case where CONFIG_NETPRIO_CGROUP=m and CONFIG_NET_CLS_CGROUP=m is
>> set, the type of the corresponding subsys_id should also be of type
>> enum. Up to now, net_prio_subsys_id and net_cls_subsys_id would be an
>> int in this configuration.
>>
>> With switching the macro definition IS_SUBSYS_ENABLED from IS_BUILTIN
>> to IS_ENABLED, the subsys_id will always be enum for all
>> subsystems. That means we need to remove all the code which assumes
>> that net_prio_subsys_id and net_cls_subsys_id is of type int.
>
> I don't think int or enum is the matter here.  enum is an int.  It's
> whether the ID is allocated statically or dynamically.  Can you please
> update the description using those terms instead?

Sure, no problem.

cheers,
daniel

* Re: [PATCH v3 7/8] cgroup: Assign subsystem IDs during compile time
From: Daniel Wagner @ 2012-09-11 21:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: netdev, cgroups,
	Daniel Wagner, David S. Miller, Andrew Morton, Eric Dumazet,
	Gao feng, Glauber Costa, Jamal Hadi Salim, John Fastabend,
	Kamezawa Hiroyuki, Li Zefan, Neil Horman
In-Reply-To: <20120911210109.GZ7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

Hi Tejun,

On 09/11/2012 11:01 PM, Tejun Heo wrote:
> Hello, Daniel.
>
> I generally like this but I still think it's too big a patch.

Yes, I agree it is a bit too big.

>> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
>> index c75e3f9..6bc460c 100644
>> --- a/net/core/netprio_cgroup.c
>> +++ b/net/core/netprio_cgroup.c
>> @@ -326,9 +326,7 @@ struct cgroup_subsys net_prio_subsys = {
>>   	.create		= cgrp_create,
>>   	.destroy	= cgrp_destroy,
>>   	.attach		= net_prio_attach,
>> -#ifdef CONFIG_NETPRIO_CGROUP
>>   	.subsys_id	= net_prio_subsys_id,
>> -#endif
>>   	.base_cftypes	= ss_files,
>>   	.module		= THIS_MODULE
>>   };
>> @@ -366,10 +364,6 @@ static int __init init_cgroup_netprio(void)
>>   	ret = cgroup_load_subsys(&net_prio_subsys);
>>   	if (ret)
>>   		goto out;
>> -#ifndef CONFIG_NETPRIO_CGROUP
>> -	smp_wmb();
>> -	net_prio_subsys_id = net_prio_subsys.subsys_id;
>> -#endif
>>
>>   	register_netdevice_notifier(&netprio_device_notifier);
>>
>> @@ -386,11 +380,6 @@ static void __exit exit_cgroup_netprio(void)
>>
>>   	cgroup_unload_subsys(&net_prio_subsys);
>>
>> -#ifndef CONFIG_NETPRIO_CGROUP
>> -	net_prio_subsys_id = -1;
>> -	synchronize_rcu();
>
> For example, it's not evident the above is correct and it's buried
> with all other changes.  Can't we just assign the fixed subsys ID to
> net_prio_subsys_id in this patch?  This patch would be correct without
> any netprio changes, no?

If net_prio_subsys_id is changed to be an enum, then the compiler will 
report an error:

error: lvalue required as left operand of assignment

that was the reason why I kept this change here. I think I just don't 
get what you are trying to tell me.

> Please separate these changes and explain them.

I will do that as soon as I figure out what you are telling me.

> BTW, people who use barriers of any kind without explicitly explaining
> what's going on need to be kicked hard in the ass.  :(

I looked up the commit message from when the synchronize_rcu() was added.
It did not really explain it. I really spent a few hours staring at
this code.

thanks for the review,
daniel

* Re: [PATCH v3 0/8] cgroup: Assign subsystem IDs during compile time
From: Tejun Heo @ 2012-09-11 21:11 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: netdev, cgroups, Daniel Wagner, David S. Miller, Andrew Morton,
	Eric Dumazet, Gao feng, Glauber Costa, Jamal Hadi Salim,
	John Fastabend, Kamezawa Hiroyuki, Li Zefan, Neil Horman
In-Reply-To: <1347380774-9546-1-git-send-email-wagi@monom.org>

Hello,

On Tue, Sep 11, 2012 at 06:26:06PM +0200, Daniel Wagner wrote:
> From: Daniel Wagner <daniel.wagner@bmw-carit.de>
> 
> Hi,
> 
> In this version I tried to concentrate on the main topic of this
> series, so I removed some of the things which were not really needed
> and I have to admit the result looks much better. So I hope that will
> simplify the review for you.

Other than the few nits I like the whole series and, given that all
the changes are entangled with cgroup core, I would like to route it
through cgroup/for-3.7 if nobody objects.  Daniel, can you please
rebase on top of the following branch when sending out updated
version?

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-3.7

Li, can you please review these?

Thanks.

-- 
tejun

* Re: [PATCH v3 7/8] cgroup: Assign subsystem IDs during compile time
From: Tejun Heo @ 2012-09-11 21:08 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: netdev, cgroups,
	Daniel Wagner, David S. Miller, Andrew Morton, Eric Dumazet,
	Gao feng, Glauber Costa, Jamal Hadi Salim, John Fastabend,
	Kamezawa Hiroyuki, Li Zefan, Neil Horman
In-Reply-To: <20120911210109.GZ7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

On Tue, Sep 11, 2012 at 02:01:09PM -0700, Tejun Heo wrote:
> For example, it's not evident the above is correct and it's buried
> with all other changes.  Can't we just assign the fixed subsys ID to
> net_prio_subsys_id in this patch?  This patch would be correct without
> any netprio changes, no?

Oops, that was wrong.  net_prio_subsys_id itself becomes constant.
Let's please better explain why the RCU trick removal is safe then.

Thanks.

-- 
tejun

* [PATCH] rtlwifi: Remove EXPERIMENTAL as pre-requisite for the drivers
From: Larry Finger @ 2012-09-11 21:04 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, Larry Finger, netdev

All of the rtlwifi-family of drivers have been in the kernel since 3.1
or earlier. The dependence on EXPERIMENTAL can be removed.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
---

John,

There is no rush on this patch. Inclusion in 3.7 will be fine.

Larry
---

 drivers/net/wireless/rtlwifi/Kconfig |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/rtlwifi/Kconfig b/drivers/net/wireless/rtlwifi/Kconfig
index cefac6a..6b28e92 100644
--- a/drivers/net/wireless/rtlwifi/Kconfig
+++ b/drivers/net/wireless/rtlwifi/Kconfig
@@ -1,6 +1,6 @@
 config RTL8192CE
 	tristate "Realtek RTL8192CE/RTL8188CE Wireless Network Adapter"
-	depends on MAC80211 && PCI && EXPERIMENTAL
+	depends on MAC80211 && PCI
 	select FW_LOADER
 	select RTLWIFI
 	select RTL8192C_COMMON
@@ -12,7 +12,7 @@ config RTL8192CE
 
 config RTL8192SE
 	tristate "Realtek RTL8192SE/RTL8191SE PCIe Wireless Network Adapter"
-	depends on MAC80211 && EXPERIMENTAL && PCI
+	depends on MAC80211 && PCI
 	select FW_LOADER
 	select RTLWIFI
 	---help---
@@ -23,7 +23,7 @@ config RTL8192SE
 
 config RTL8192DE
 	tristate "Realtek RTL8192DE/RTL8188DE PCIe Wireless Network Adapter"
-	depends on MAC80211 && EXPERIMENTAL && PCI
+	depends on MAC80211 && PCI
 	select FW_LOADER
 	select RTLWIFI
 	---help---
@@ -34,7 +34,7 @@ config RTL8192DE
 
 config RTL8192CU
 	tristate "Realtek RTL8192CU/RTL8188CU USB Wireless Network Adapter"
-	depends on MAC80211 && USB && EXPERIMENTAL
+	depends on MAC80211 && USB
 	select FW_LOADER
 	select RTLWIFI
 	select RTL8192C_COMMON
-- 
1.7.10.4

* Re: [PATCH v3 7/8] cgroup: Assign subsystem IDs during compile time
From: Tejun Heo @ 2012-09-11 21:04 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: netdev, cgroups,
	Daniel Wagner, David S. Miller, Andrew Morton, Eric Dumazet,
	Gao feng, Glauber Costa, Jamal Hadi Salim, John Fastabend,
	Kamezawa Hiroyuki, Li Zefan, Neil Horman
In-Reply-To: <1347380774-9546-8-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>

Hello, Daniel.

One more thing.

On Tue, Sep 11, 2012 at 06:26:13PM +0200, Daniel Wagner wrote:
> From: Daniel Wagner <daniel.wagner@bmw-carit.de>
> 
> WARNING: With this change it is not possible to load externally built
> controllers anymore.
> 
> In case where CONFIG_NETPRIO_CGROUP=m and CONFIG_NET_CLS_CGROUP=m is
> set, the type of the corresponding subsys_id should also be of type
> enum. Up to now, net_prio_subsys_id and net_cls_subsys_id would be an
> int in this configuration.
> 
> With switching the macro definition IS_SUBSYS_ENABLED from IS_BUILTIN
> to IS_ENABLED, the subsys_id will always be enum for all
> subsystems. That means we need to remove all the code which assumes
> that net_prio_subsys_id and net_cls_subsys_id is of type int.

I don't think int or enum is the matter here.  enum is an int.  It's
whether the ID is allocated statically or dynamically.  Can you please
update the description using those terms instead?
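
To illustrate the distinction, here is a minimal userspace sketch of a
statically allocated ID next to a dynamically allocated one. All names
(demo_subsys_id, demo_load_subsys() and so on) are hypothetical and are
not the kernel's; the allocation scheme is an assumption made only for
illustration.

```c
#include <assert.h>

/* Static allocation: every ID is fixed at compile time (hypothetical names). */
enum demo_subsys_id {
	DEMO_CPU_SUBSYS_ID,
	DEMO_NET_PRIO_SUBSYS_ID,
	DEMO_SUBSYS_COUNT
};

/* Dynamic allocation: the ID lives in a variable and is only valid while loaded. */
static int demo_dynamic_id = -1;
static int demo_next_free = DEMO_SUBSYS_COUNT;

/* Hand out the next free slot at load time. */
static int demo_load_subsys(void)
{
	demo_dynamic_id = demo_next_free++;
	return demo_dynamic_id;
}

/* Invalidate the ID again at unload time. */
static void demo_unload_subsys(void)
{
	demo_dynamic_id = -1;
}
```

Both kinds of ID are plain integers when read. The difference is that
the enumerator is a constant and cannot appear on the left side of an
assignment, while the variable has to be written at load and unload
time, so readers may observe it changing.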

Thanks.

-- 
tejun

* Re: [PATCH v2 2/2] iproute2: use libgenl in ipl2tp
From: Stephen Hemminger @ 2012-09-11 21:02 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: netdev
In-Reply-To: <alpine.LFD.2.00.1209112304410.2023@ja.ssi.bg>

On Tue, 11 Sep 2012 23:22:41 +0300 (EEST)
Julian Anastasov <ja@ssi.bg> wrote:

> 
> 	Hello,
> 
> On Tue, 11 Sep 2012, Stephen Hemminger wrote:
> 
> > On Tue, 11 Sep 2012 12:04:34 +0300
> > Julian Anastasov <ja@ssi.bg> wrote:
> > 
> > > 	Use the common code from libgenl.c to parse family.
> > > 
> > > Signed-off-by: Julian Anastasov <ja@ssi.bg>
> > 
> > I applied these two but made some modifications:
> >   1. change to GENL_INITIALIZER
> >   2. use GENL_INITIALIZER
> 
> 	I used 2 defines, so that in tcp_metrics.c I can
> use different initialization depending on the command.
> But as the structure is unnamed, maybe it is better to have
> a single define (both defines merged as you proposed) and
> later just update cmd and flags if needed.
> 
> Regards
> 
> --
> Julian Anastasov <ja@ssi.bg>

Changes accepted any time.

* Re: [PATCH v3 7/8] cgroup: Assign subsystem IDs during compile time
From: Tejun Heo @ 2012-09-11 21:01 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: netdev, cgroups, Daniel Wagner, David S. Miller, Andrew Morton,
	Eric Dumazet, Gao feng, Glauber Costa, Jamal Hadi Salim,
	John Fastabend, Kamezawa Hiroyuki, Li Zefan, Neil Horman
In-Reply-To: <1347380774-9546-8-git-send-email-wagi@monom.org>

Hello, Daniel.

I generally like this but I still think it's too big a patch.

> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
> index c75e3f9..6bc460c 100644
> --- a/net/core/netprio_cgroup.c
> +++ b/net/core/netprio_cgroup.c
> @@ -326,9 +326,7 @@ struct cgroup_subsys net_prio_subsys = {
>  	.create		= cgrp_create,
>  	.destroy	= cgrp_destroy,
>  	.attach		= net_prio_attach,
> -#ifdef CONFIG_NETPRIO_CGROUP
>  	.subsys_id	= net_prio_subsys_id,
> -#endif
>  	.base_cftypes	= ss_files,
>  	.module		= THIS_MODULE
>  };
> @@ -366,10 +364,6 @@ static int __init init_cgroup_netprio(void)
>  	ret = cgroup_load_subsys(&net_prio_subsys);
>  	if (ret)
>  		goto out;
> -#ifndef CONFIG_NETPRIO_CGROUP
> -	smp_wmb();
> -	net_prio_subsys_id = net_prio_subsys.subsys_id;
> -#endif
>  
>  	register_netdevice_notifier(&netprio_device_notifier);
>  
> @@ -386,11 +380,6 @@ static void __exit exit_cgroup_netprio(void)
>  
>  	cgroup_unload_subsys(&net_prio_subsys);
>  
> -#ifndef CONFIG_NETPRIO_CGROUP
> -	net_prio_subsys_id = -1;
> -	synchronize_rcu();

For example, it's not evident the above is correct and it's buried
with all other changes.  Can't we just assign the fixed subsys ID to
net_prio_subsys_id in this patch?  This patch would be correct without
any netprio changes, no?  Please separate these changes and explain
them.

BTW, people who use barriers of any kind without explicitly explaining
what's going on need to be kicked hard in the ass.  :(

Thanks.

-- 
tejun

* Re: [PATCH v3 4/8] cgroup: Remove CGROUP_BUILTIN_SUBSYS_COUNT
From: Tejun Heo @ 2012-09-11 20:41 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: netdev, cgroups,
	Daniel Wagner, Gao feng, Jamal Hadi Salim, John Fastabend,
	Li Zefan, Neil Horman
In-Reply-To: <1347380774-9546-5-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>

Hello, Daniel.

Just one nit.

On Tue, Sep 11, 2012 at 06:26:10PM +0200, Daniel Wagner wrote:
> @@ -4502,10 +4507,13 @@ int __init cgroup_init_early(void)
>  	for (i = 0; i < CSS_SET_TABLE_SIZE; i++)
>  		INIT_HLIST_HEAD(&css_set_table[i]);
>  
> -	/* at bootup time, we don't worry about modular subsystems */
> -	for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
> +	for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
>  		struct cgroup_subsys *ss = subsys[i];
>  
> +		/* at bootup time, we don't worry about modular subsystems */
> +		if (!ss || (ss && ss->module))
> +			continue;
> +

The middle "ss" test is unnecessary.  Control never gets there if
NULL.
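
A minimal sketch of the equivalence, using a hypothetical stand-in for
struct cgroup_subsys that keeps only the module pointer (demo_subsys,
skip_verbose() and skip_simple() are illustrative names, not kernel
code):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cgroup_subsys; only the field used here. */
struct demo_subsys {
	void *module;	/* non-NULL when the subsystem is modular */
};

/* Condition as written in the patch. */
static int skip_verbose(const struct demo_subsys *ss)
{
	return !ss || (ss && ss->module);
}

/* Simplified: || short-circuits, so ss->module is only read when ss != NULL. */
static int skip_simple(const struct demo_subsys *ss)
{
	return !ss || ss->module;
}
```

For a NULL pointer, a builtin entry (module == NULL) and a modular
entry (module != NULL), both helpers return the same result, which is
why the inner "ss &&" can be dropped.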

> @@ -4538,9 +4546,12 @@ int __init cgroup_init(void)
>  	if (err)
>  		return err;
>  
> -	/* at bootup time, we don't worry about modular subsystems */
> -	for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
> +	for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
>  		struct cgroup_subsys *ss = subsys[i];
> +
> +		/* at bootup time, we don't worry about modular subsystems */
> +		if (!ss || (ss && ss->module))
> +			continue;

Ditto.

> @@ -4735,13 +4746,16 @@ void cgroup_fork_callbacks(struct task_struct *child)
>  {
>  	if (need_forkexit_callback) {
>  		int i;
> -		/*
> -		 * forkexit callbacks are only supported for builtin
> -		 * subsystems, and the builtin section of the subsys array is
> -		 * immutable, so we don't need to lock the subsys array here.
> -		 */
> -		for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
> +		for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
>  			struct cgroup_subsys *ss = subsys[i];
> +
> +			/*
> +			 * forkexit callbacks are only supported for
> +			 * builtin subsystems.
> +			 */
> +			if (!ss || (ss && ss->module))
> +				continue;
> +

Ditto.

> @@ -4846,12 +4860,13 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
>  	tsk->cgroups = &init_css_set;
>  
>  	if (run_callbacks && need_forkexit_callback) {
> -		/*
> -		 * modular subsystems can't use callbacks, so no need to lock
> -		 * the subsys array
> -		 */
> -		for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
> +		for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
>  			struct cgroup_subsys *ss = subsys[i];
> +
> +			/* modular subsystems can't use callbacks */
> +			if (!ss || (ss && ss->module))
> +				continue;
> +

Ditto.

> @@ -5037,13 +5052,17 @@ static int __init cgroup_disable(char *str)
>  	while ((token = strsep(&str, ",")) != NULL) {
>  		if (!*token)
>  			continue;
> -		/*
> -		 * cgroup_disable, being at boot time, can't know about module
> -		 * subsystems, so we don't worry about them.
> -		 */
> -		for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
> +		for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
>  			struct cgroup_subsys *ss = subsys[i];
>  
> +			/*
> +			 * cgroup_disable, being at boot time, can't
> +			 * know about module subsystems, so we don't
> +			 * worry about them.
> +			 */
> +			if (!ss || (ss && ss->module))
> +				continue;
> +

Ditto.

Other than that,

 Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

Thanks.

-- 
tejun

* Re: [PATCH v3 3/8] cgroup: net_prio: Do not define task_netpioidx() when not selected
From: Tejun Heo @ 2012-09-11 20:37 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: netdev, cgroups,
	Daniel Wagner, Gao feng, Jamal Hadi Salim, John Fastabend,
	Li Zefan, Neil Horman
In-Reply-To: <1347380774-9546-4-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>

On Tue, Sep 11, 2012 at 06:26:09PM +0200, Daniel Wagner wrote:
> From: Daniel Wagner <daniel.wagner@bmw-carit.de>
> 
> task_netprioidx() should not be defined in case the configuration is
> CONFIG_NETPRIO_CGROUP=n. The reason is that in a following patch the
> net_prio_subsys_id will only be defined if CONFIG_NETPRIO_CGROUP!=n.
> When net_prio is not built at all any callee should only get an empty
> task_netprioidx() without any references to net_prio_subsys_id.
> 
> Signed-off-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> Cc: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> Cc: Jamal Hadi Salim <jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org>
> Cc: John Fastabend <john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

For 1-3.

  Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

Thanks.

-- 
tejun

* Re: [PATCH v2 2/2] iproute2: use libgenl in ipl2tp
From: Julian Anastasov @ 2012-09-11 20:22 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20120911090908.605d1fbe@nehalam.linuxnetplumber.net>


	Hello,

On Tue, 11 Sep 2012, Stephen Hemminger wrote:

> On Tue, 11 Sep 2012 12:04:34 +0300
> Julian Anastasov <ja@ssi.bg> wrote:
> 
> > 	Use the common code from libgenl.c to parse family.
> > 
> > Signed-off-by: Julian Anastasov <ja@ssi.bg>
> 
> I applied these two but made some modifications:
>   1. change to GENL_INITIALIZER
>   2. use GENL_INITIALIZER

	I used 2 defines, so that in tcp_metrics.c I can
use different initialization depending on the command.
But as the structure is unnamed, maybe it is better to have
a single define (both defines merged as you proposed) and
later just update cmd and flags if needed.

Regards

--
Julian Anastasov <ja@ssi.bg>

* [PATCH NEXT] rtlwifi: rtl8192c: rtl8192ce: Add support for B-CUT version of RTL8188CE
From: Larry Finger @ 2012-09-11 20:06 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, Larry Finger, netdev, Anisse Astier, Li Chaoming

Realtek devices with designation RTL8188CE-VL have the so-called B-cut
of the wireless chip. This patch adds the special programming needed by
these devices.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Anisse Astier <anisse@astier.eu>
Cc: Li Chaoming <chaoming_li@realsil.com.cn>
---
 drivers/net/wireless/rtlwifi/rtl8192c/phy_common.c |   21 +++++++
 drivers/net/wireless/rtlwifi/rtl8192ce/def.h       |    3 +
 drivers/net/wireless/rtlwifi/rtl8192ce/hw.c        |   61 ++++++++++++++++++--
 drivers/net/wireless/rtlwifi/rtl8192ce/phy.c       |    4 +-
 drivers/net/wireless/rtlwifi/rtl8192ce/sw.c        |    6 +-
 drivers/net/wireless/rtlwifi/rtl8192ce/trx.c       |    4 +-
 6 files changed, 87 insertions(+), 12 deletions(-)
---

John,

This patch has the patch entitled "rtlwifi: rtl8192ce: Log message that
B_CUT device may not work" as a pre-requisite. Unlike the previous patch,
this one is too invasive to backport to the stable kernels, thus it should
be applied to 3.7.

Thanks,

Larry
---

diff --git a/drivers/net/wireless/rtlwifi/rtl8192c/phy_common.c b/drivers/net/wireless/rtlwifi/rtl8192c/phy_common.c
index cdcad7d..6ae2268 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192c/phy_common.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192c/phy_common.c
@@ -724,6 +724,26 @@ u8 rtl92c_phy_sw_chnl(struct ieee80211_hw *hw)
 }
 EXPORT_SYMBOL(rtl92c_phy_sw_chnl);
 
+static void _rtl92c_phy_sw_rf_setting(struct ieee80211_hw *hw, u8 channel)
+{
+	struct rtl_priv *rtlpriv = rtl_priv(hw);
+	struct rtl_phy *rtlphy = &(rtlpriv->phy);
+	struct rtl_hal *rtlhal = rtl_hal(rtl_priv(hw));
+
+	if (IS_81xxC_VENDOR_UMC_B_CUT(rtlhal->version)) {
+		if (channel == 6 && rtlphy->current_chan_bw ==
+		    HT_CHANNEL_WIDTH_20)
+			rtl_set_rfreg(hw, RF90_PATH_A, RF_RX_G1, MASKDWORD,
+				      0x00255);
+		else{
+			u32 backupRF0x1A = (u32)rtl_get_rfreg(hw, RF90_PATH_A,
+					    RF_RX_G1, RFREG_OFFSET_MASK);
+			rtl_set_rfreg(hw, RF90_PATH_A, RF_RX_G1, MASKDWORD,
+				      backupRF0x1A);
+		}
+	}
+}
+
 static bool _rtl92c_phy_set_sw_chnl_cmdarray(struct swchnlcmd *cmdtable,
 					     u32 cmdtableidx, u32 cmdtablesz,
 					     enum swchnlcmd_id cmdid,
@@ -837,6 +857,7 @@ bool _rtl92c_phy_sw_chnl_step_by_step(struct ieee80211_hw *hw,
 					      currentcmd->para1,
 					      RFREG_OFFSET_MASK,
 					      rtlphy->rfreg_chnlval[rfpath]);
+			_rtl92c_phy_sw_rf_setting(hw, channel);
 			}
 			break;
 		default:
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ce/def.h b/drivers/net/wireless/rtlwifi/rtl8192ce/def.h
index 2925094..3cfa1bb 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ce/def.h
+++ b/drivers/net/wireless/rtlwifi/rtl8192ce/def.h
@@ -116,6 +116,9 @@
 	LE_BITS_TO_4BYTE(((__pcmdfbhdr) + 4), 20, 12)
 
 #define CHIP_VER_B			BIT(4)
+#define CHIP_BONDING_IDENTIFIER(_value) (((_value) >> 22) & 0x3)
+#define CHIP_BONDING_92C_1T2R		0x1
+#define RF_TYPE_1T2R			BIT(1)
 #define CHIP_92C_BITMASK		BIT(0)
 #define CHIP_UNKNOWN			BIT(7)
 #define CHIP_92C_1T2R			0x03
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ce/hw.c b/drivers/net/wireless/rtlwifi/rtl8192ce/hw.c
index 86d73b3..bae5269 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ce/hw.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ce/hw.c
@@ -896,7 +896,6 @@ int rtl92ce_hw_init(struct ieee80211_hw *hw)
 	struct rtl_phy *rtlphy = &(rtlpriv->phy);
 	struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw));
 	struct rtl_ps_ctl *ppsc = rtl_psc(rtl_priv(hw));
-	static bool iqk_initialized; /* initialized to false */
 	bool rtstatus = true;
 	bool is92c;
 	int err;
@@ -921,9 +920,28 @@ int rtl92ce_hw_init(struct ieee80211_hw *hw)
 
 	rtlhal->last_hmeboxnum = 0;
 	rtl92c_phy_mac_config(hw);
+	/* because last function modify RCR, so we update
+	 * rcr var here, or TP will unstable for receive_config
+	 * is wrong, RX RCR_ACRC32 will cause TP unstabel & Rx
+	 * RCR_APP_ICV will cause mac80211 unassoc for cisco 1252*/
+	rtlpci->receive_config = rtl_read_dword(rtlpriv, REG_RCR);
+	rtlpci->receive_config &= ~(RCR_ACRC32 | RCR_AICV);
+	rtl_write_dword(rtlpriv, REG_RCR, rtlpci->receive_config);
 	rtl92c_phy_bb_config(hw);
 	rtlphy->rf_mode = RF_OP_BY_SW_3WIRE;
 	rtl92c_phy_rf_config(hw);
+	if (IS_VENDOR_UMC_A_CUT(rtlhal->version) &&
+	    !IS_92C_SERIAL(rtlhal->version)) {
+		rtl_set_rfreg(hw, RF90_PATH_A, RF_RX_G1, MASKDWORD, 0x30255);
+		rtl_set_rfreg(hw, RF90_PATH_A, RF_RX_G2, MASKDWORD, 0x50a00);
+	} else if (IS_81xxC_VENDOR_UMC_B_CUT(rtlhal->version)) {
+		rtl_set_rfreg(hw, RF90_PATH_A, 0x0C, MASKDWORD, 0x894AE);
+		rtl_set_rfreg(hw, RF90_PATH_A, 0x0A, MASKDWORD, 0x1AF31);
+		rtl_set_rfreg(hw, RF90_PATH_A, RF_IPA, MASKDWORD, 0x8F425);
+		rtl_set_rfreg(hw, RF90_PATH_A, RF_SYN_G2, MASKDWORD, 0x4F200);
+		rtl_set_rfreg(hw, RF90_PATH_A, RF_RCK1, MASKDWORD, 0x44053);
+		rtl_set_rfreg(hw, RF90_PATH_A, RF_RCK2, MASKDWORD, 0x80201);
+	}
 	rtlphy->rfreg_chnlval[0] = rtl_get_rfreg(hw, (enum radio_path)0,
 						 RF_CHNLBW, RFREG_OFFSET_MASK);
 	rtlphy->rfreg_chnlval[1] = rtl_get_rfreg(hw, (enum radio_path)1,
@@ -945,11 +963,11 @@ int rtl92ce_hw_init(struct ieee80211_hw *hw)
 
 	if (ppsc->rfpwr_state == ERFON) {
 		rtl92c_phy_set_rfpath_switch(hw, 1);
-		if (iqk_initialized) {
+		if (rtlphy->iqk_initialized) {
 			rtl92c_phy_iq_calibrate(hw, true);
 		} else {
 			rtl92c_phy_iq_calibrate(hw, false);
-			iqk_initialized = true;
+			rtlphy->iqk_initialized = true;
 		}
 
 		rtl92c_dm_check_txpower_tracking(hw);
@@ -1004,6 +1022,13 @@ static enum version_8192c _rtl92ce_read_chip_version(struct ieee80211_hw *hw)
 				   ? CHIP_VENDOR_UMC_B_CUT : CHIP_UNKNOWN) |
 				   CHIP_VENDOR_UMC));
 		}
+		if (IS_92C_SERIAL(version)) {
+			value32 = rtl_read_dword(rtlpriv, REG_HPON_FSM);
+			version = (enum version_8192c)(version |
+				   ((CHIP_BONDING_IDENTIFIER(value32)
+				   == CHIP_BONDING_92C_1T2R) ?
+				   RF_TYPE_1T2R : 0));
+		}
 	}
 
 	switch (version) {
@@ -1019,12 +1044,30 @@ static enum version_8192c _rtl92ce_read_chip_version(struct ieee80211_hw *hw)
 	case VERSION_A_CHIP_88C:
 		versionid = "A_CHIP_88C";
 		break;
+	case VERSION_NORMAL_UMC_CHIP_92C_1T2R_A_CUT:
+		versionid = "A_CUT_92C_1T2R";
+		break;
+	case VERSION_NORMAL_UMC_CHIP_92C_A_CUT:
+		versionid = "A_CUT_92C";
+		break;
+	case VERSION_NORMAL_UMC_CHIP_88C_A_CUT:
+		versionid = "A_CUT_88C";
+		break;
+	case VERSION_NORMAL_UMC_CHIP_92C_1T2R_B_CUT:
+		versionid = "B_CUT_92C_1T2R";
+		break;
+	case VERSION_NORMAL_UMC_CHIP_92C_B_CUT:
+		versionid = "B_CUT_92C";
+		break;
+	case VERSION_NORMAL_UMC_CHIP_88C_B_CUT:
+		versionid = "B_CUT_88C";
+		break;
 	default:
 		versionid = "Unknown. Bug?";
 		break;
 	}
 
-	RT_TRACE(rtlpriv, COMP_INIT, DBG_TRACE,
+	RT_TRACE(rtlpriv, COMP_INIT, DBG_EMERG,
 		 "Chip Version ID: %s\n", versionid);
 
 	switch (version & 0x3) {
@@ -1197,6 +1240,7 @@ static void _rtl92ce_poweroff_adapter(struct ieee80211_hw *hw)
 {
 	struct rtl_priv *rtlpriv = rtl_priv(hw);
 	struct rtl_pci_priv *rtlpcipriv = rtl_pcipriv(hw);
+	struct rtl_hal *rtlhal = rtl_hal(rtlpriv);
 	u8 u1b_tmp;
 	u32 u4b_tmp;
 
@@ -1225,7 +1269,8 @@ static void _rtl92ce_poweroff_adapter(struct ieee80211_hw *hw)
 	rtl_write_word(rtlpriv, REG_GPIO_IO_SEL, 0x0790);
 	rtl_write_word(rtlpriv, REG_LEDCFG0, 0x8080);
 	rtl_write_byte(rtlpriv, REG_AFE_PLL_CTRL, 0x80);
-	rtl_write_byte(rtlpriv, REG_SPS0_CTRL, 0x23);
+	if (!IS_81xxC_VENDOR_UMC_B_CUT(rtlhal->version))
+		rtl_write_byte(rtlpriv, REG_SPS0_CTRL, 0x23);
 	if (rtlpcipriv->bt_coexist.bt_coexistence) {
 		u4b_tmp = rtl_read_dword(rtlpriv, REG_AFE_XTAL_CTRL);
 		u4b_tmp |= 0x03824800;
@@ -1254,6 +1299,9 @@ void rtl92ce_card_disable(struct ieee80211_hw *hw)
 		rtlpriv->cfg->ops->led_control(hw, LED_CTL_POWER_OFF);
 	RT_SET_PS_LEVEL(ppsc, RT_RF_OFF_LEVL_HALT_NIC);
 	_rtl92ce_poweroff_adapter(hw);
+
+	/* after power off we should do iqk again */
+	rtlpriv->phy.iqk_initialized = false;
 }
 
 void rtl92ce_interrupt_recognized(struct ieee80211_hw *hw,
@@ -1912,6 +1960,8 @@ static void rtl92ce_update_hal_rate_mask(struct ieee80211_hw *hw,
 			ratr_bitmap &= 0x0f0ff0ff;
 		break;
 	}
+	sta_entry->ratr_index = ratr_index;
+
 	RT_TRACE(rtlpriv, COMP_RATR, DBG_DMESG,
 		 "ratr_bitmap :%x\n", ratr_bitmap);
 	*(u32 *)&rate_mask = (ratr_bitmap & 0x0fffffff) |
@@ -2291,3 +2341,4 @@ void rtl92ce_suspend(struct ieee80211_hw *hw)
 void rtl92ce_resume(struct ieee80211_hw *hw)
 {
 }
+
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ce/phy.c b/drivers/net/wireless/rtlwifi/rtl8192ce/phy.c
index 88deae6..a66ba97 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ce/phy.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ce/phy.c
@@ -82,7 +82,9 @@ bool rtl92c_phy_mac_config(struct ieee80211_hw *hw)
 
 	if (is92c)
 		rtl_write_byte(rtlpriv, 0x14, 0x71);
-	return rtstatus;
+	else
+		rtl_write_byte(rtlpriv, 0x04CA, 0x0A);
+	return rtstatus;
 }
 
 bool rtl92c_phy_bb_config(struct ieee80211_hw *hw)
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ce/sw.c b/drivers/net/wireless/rtlwifi/rtl8192ce/sw.c
index ea2e1bd..60451ee 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ce/sw.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ce/sw.c
@@ -162,12 +162,10 @@ int rtl92c_init_sw_vars(struct ieee80211_hw *hw)
 
 	/* request fw */
 	if (IS_VENDOR_UMC_A_CUT(rtlhal->version) &&
-	    !IS_92C_SERIAL(rtlhal->version)) {
+	    !IS_92C_SERIAL(rtlhal->version))
 		rtlpriv->cfg->fw_name = "rtlwifi/rtl8192cfwU.bin";
-	} else if (IS_81xxC_VENDOR_UMC_B_CUT(rtlhal->version)) {
+	else if (IS_81xxC_VENDOR_UMC_B_CUT(rtlhal->version))
 		rtlpriv->cfg->fw_name = "rtlwifi/rtl8192cfwU_B.bin";
-		pr_info("****** This B_CUT device may not work with kernels 3.6 and earlier\n");
-	}
 
 	rtlpriv->max_fw_size = 0x4000;
 	pr_info("Using firmware %s\n", rtlpriv->cfg->fw_name);
diff --git a/drivers/net/wireless/rtlwifi/rtl8192ce/trx.c b/drivers/net/wireless/rtlwifi/rtl8192ce/trx.c
index 390d6d4..b8a3c03 100644
--- a/drivers/net/wireless/rtlwifi/rtl8192ce/trx.c
+++ b/drivers/net/wireless/rtlwifi/rtl8192ce/trx.c
@@ -127,11 +127,11 @@ static void _rtl92ce_query_rxphystatus(struct ieee80211_hw *hw,
 {
 	struct rtl_priv *rtlpriv = rtl_priv(hw);
 	struct phy_sts_cck_8192s_t *cck_buf;
+	struct rtl_ps_ctl *ppsc = rtl_psc(rtlpriv);
 	s8 rx_pwr_all = 0, rx_pwr[4];
 	u8 evm, pwdb_all, rf_rx_num = 0;
 	u8 i, max_spatial_stream;
 	u32 rssi, total_rssi = 0;
-	bool in_powersavemode = false;
 	bool is_cck_rate;
 
 	is_cck_rate = RX_HAL_IS_CCK_RATE(pdesc);
@@ -147,7 +147,7 @@ static void _rtl92ce_query_rxphystatus(struct ieee80211_hw *hw,
 		u8 report, cck_highpwr;
 		cck_buf = (struct phy_sts_cck_8192s_t *)p_drvinfo;
 
-		if (!in_powersavemode)
+		if (ppsc->rfpwr_state == ERFON)
 			cck_highpwr = (u8) rtl_get_bbreg(hw,
 						 RFPGA0_XA_HSSIPARAMETER2,
 						 BIT(9));
-- 
1.7.10.4


* RE: GRO aggregation
From: Eric Dumazet @ 2012-09-11 19:35 UTC (permalink / raw)
  To: Shlomo Pongratz; +Cc: netdev@vger.kernel.org
In-Reply-To: <36F7E4A28C18BE4DB7C86058E7B607241E622083@MTRDAG01.mtl.com>

On Tue, 2012-09-11 at 19:24 +0000, Shlomo Pongratz wrote:

> 
> I see that in ixgbe the weight for NAPI is 64 (netif_napi_add). So
> if packets arrive at a high rate and the CPU is fast enough to
> collect them as they arrive, then, assuming packets continue to
> arrive while NAPI runs, it should aggregate more. So we
> will have fewer passes through the stack.
> 

As I said, _if_ your cpu was loaded by other stuff, then you would see
bigger GRO packets.

GRO is not: "We want to kill latency and have big packets just because
it's better"

It's more like: If load is big enough, try to aggregate TCP frames into
fewer skbs.
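
The point made in this thread can be illustrated with a small model: GRO state is flushed at the end of each NAPI run, so the size of one aggregated skb is bounded by how many frames were pending when the poll ran, by the NAPI weight, and by the roughly 64 KB GRO limit. The sketch below is not kernel code. The function name `gro_aggregate_bytes`, its parameters, and the assumption that every frame carries the same number of bytes are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical model, not kernel code: bytes merged into one GRO skb
 * during a single NAPI run for one TCP flow.
 *
 * frames_pending: frames waiting in the RX ring when the poll starts
 * napi_weight:    max frames handled per poll (64 for ixgbe per the thread)
 * frame_bytes:    assumed identical byte count contributed by each frame
 * gro_limit:      upper bound on one aggregated skb (~64 KB per the thread)
 */
size_t gro_aggregate_bytes(size_t frames_pending, size_t napi_weight,
                           size_t frame_bytes, size_t gro_limit)
{
    /* A poll never handles more frames than its weight allows. */
    size_t frames = frames_pending < napi_weight ? frames_pending : napi_weight;
    size_t total = 0;

    for (size_t i = 0; i < frames; i++) {
        /* Stop merging once another frame would exceed the limit. */
        if (total + frame_bytes > gro_limit)
            break;
        total += frame_bytes;
    }

    /* GRO state is flushed when the NAPI run ends, so this is the size. */
    return total;
}
```

In this model, 21 pending frames of 1500 bytes yield 31500 bytes, well under the limit. The result only approaches 64 KB when more frames are already waiting each time the poll runs, which is the "loaded CPU" case described above.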


* Re: GRO aggregation
From: David Miller @ 2012-09-11 19:35 UTC (permalink / raw)
  To: shlomop; +Cc: eric.dumazet, netdev
In-Reply-To: <36F7E4A28C18BE4DB7C86058E7B607241E622083@MTRDAG01.mtl.com>

From: Shlomo Pongratz <shlomop@mellanox.com>
Date: Tue, 11 Sep 2012 19:24:26 +0000

> I see that in ixgbe the weight for NAPI is 64
> (netif_napi_add). So if packets arrive at a high rate and
> the CPU is fast enough to collect them as they arrive,
> then, assuming packets continue to arrive while NAPI runs, it
> should aggregate more. So we will have fewer passes through the
> stack.

Eric is trying to say that that cpu is fast enough that it completely
depletes the pending RX packets, the RX queue is empty, and there is
only 32K worth of GRO to accumulate.

BTW, your email quoting is non-standard and very confusing.  There
is absolutely no delineation between the text that you are writing
and the text of the people you are responding to.  Please learn how
to write email replies properly.

Thank you.


* RE: GRO aggregation
From: Shlomo Pongratz @ 2012-09-11 19:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev@vger.kernel.org
In-Reply-To: <1347390113.13103.660.camel@edumazet-glaptop>

From: Eric Dumazet [eric.dumazet@gmail.com]
Sent: Tuesday, September 11, 2012 10:02 PM
To: Shlomo Pongratz
Cc: netdev@vger.kernel.org
Subject: RE: GRO aggregation

On Tue, 2012-09-11 at 18:49 +0000, Shlomo Pongratz wrote:

> I disabled LRO. I actually tried all 4 options and found that LRO, GRO, and LRO+GRO give the same results for ixgbe w.r.t. aggregation size (I didn't check throughput or latency).
> Is there a timeout that flushes the aggregated SKBs before 64K has been aggregated?

At the end of NAPI run, we flush the gro state.

It basically means that an interrupt came, and we fetched 21 frames from
the NIC.

To get more packets per interrupt, you might try to slow down your
cpu ;)

But I don't get the point.

I see that in ixgbe the weight for NAPI is 64 (netif_napi_add). So if packets arrive at a high rate and the CPU is fast enough to collect them as they arrive, then, assuming packets continue to arrive while NAPI runs, it should aggregate more. So we will have fewer passes through the stack.

Shlomo


* Re: NIC emulation with built-in rate limiting?
From: Gregory Carter @ 2012-09-11 19:05 UTC (permalink / raw)
  To: Rick Jones; +Cc: netdev, kvm, Lee Schermerhorn, Brian Haley
In-Reply-To: <504F7DCB.3050401@hp.com>

You cannot model TCP/IP accurately in a KVM VM environment.

Too many background machinations are going on to make that plausible.

I would use a small network with actual hardware for the testing model. 
You will have to use the actual gear in place and test and tweak there 
if you are doing bandwidth sharing, multi-channel with qdiscs stochastic 
queuing.

However, you can model various protocols using single channel qdiscs 
fairly well, certainly well enough to use the data to direct your build 
outs.

Modeling application behavior works pretty well if you are simply
limiting bandwidth with single-channel qdiscs, for example to discover
the lowest acceptable transmission rates for VoIP traffic.  I have had
really good success testing various codecs with single-channel
rate-limited qdiscs to answer questions about latency and
bandwidth/quality issues in the transmission of audio/video, yielding
numbers that reveal useful behavior in the design planning phases of
network services.

May I suggest allocating one channel to one qdisc.

Also, you have to strip the machine down if you want accurate results.
Do not have X running or anything other than the virtual machines
required as part of your testing process.  Strip the process queue on
the testing gear down to only the VMs and virtual network in question.

The lower you go in the network VM's connections, the more chaotic and 
useless your numbers are going to be.  In certain situations, if you 
strip your test bed down far enough, you can predict how certain kernel 
processes will affect your monitoring and screen those out of the data sets.

By the way, I use stripped-down, source-built kernels for many of these
questions because I turn off a lot of junk in the kernel, such as I/O
queueing and scheduling, and build kernels specifically for running complex
VM point-to-point virtualized networks with as little background noise
as I can get.

After a while, if you standardize your network setup, you can screen out
a lot of background noise and get some useful answers about how
applications and limited-bandwidth connected endpoints will fare.

-gc

On 09/11/2012 01:07 PM, Rick Jones wrote:
> Are there NIC emulations in the kernel with built-in rate limiting?  
> Or is that supposed to be strictly the province of qdiscs/filters?
>
> I've been messing about with netperf in a VM using virtio_net, to 
> which rate limiting has been applied to its corresponding vnetN 
> interface - rate policing on vnetN ingress (the VM's outbound) and htb 
> on the vnetN egress (the VM's inbound).
>
> Looking at the "demo mode" output of netperf and a VM throttled to 800 
> Mbit/s in each direction I see that inbound to the VM is quite steady 
> over time - right at about 800 Mbit/s.  However, looking at that same 
> sort of data for outbound from the VM shows considerable variability 
> ranging anywhere from 700 to 900 Mbit/s (though the bulk of the 
> variability is clustered more like 750 to 850).
>
> I was thinking that part of the reason may stem from the lack of 
> direct feedback to the VM since the policing is on the vnetN interface 
> and wondered if it might be "better" if the VM's outbound network rate 
> were constrained not by an ingress policing filter on the vnetN 
> interface but by the host/hypervisor/emulator portion of the NIC and 
> how quickly it pulls packets from the tx queue.  That would allow the 
> queue which built-up to be in the VM itself and would more accurately 
> represent what a "real NIC" of that bandwidth would do.
>
> happy benchmarking,
>
> rick jones


* RE: GRO aggregation
From: Eric Dumazet @ 2012-09-11 19:01 UTC (permalink / raw)
  To: Shlomo Pongratz; +Cc: netdev@vger.kernel.org
In-Reply-To: <36F7E4A28C18BE4DB7C86058E7B607241E622022@MTRDAG01.mtl.com>

On Tue, 2012-09-11 at 18:49 +0000, Shlomo Pongratz wrote:

> I disabled LRO. I actually tried all 4 options and found that LRO, GRO, and LRO+GRO give the same results for ixgbe w.r.t. aggregation size (I didn't check throughput or latency).
> Is there a timeout that flushes the aggregated SKBs before 64K has been aggregated?

At the end of NAPI run, we flush the gro state.

It basically means that an interrupt came, and we fetched 21 frames from
the NIC.

To get more packets per interrupt, you might try to slow down your
cpu ;)

But I don't get the point.


* e1000e: Disabling IRQ #57 .. machine dies
From: Cristian Rodríguez @ 2012-09-11 18:50 UTC (permalink / raw)
  To: Netdev

Hi:

In the current Linus tree, on a machine using the e1000e driver:

hwinfo --netcard
22: PCI 400.0: 0200 Ethernet controller
  [Created at pci.319]
  Unique ID: rBUF.cuT2KaC3ZS1
  Parent ID: HnsE.Z52SDLRIaH2
  SysFS ID: /devices/pci0000:00/0000:00:1c.5/0000:04:00.0
  SysFS BusID: 0000:04:00.0
  Hardware Class: network
  Model: "Intel 82574L Gigabit Network Connection"
  Vendor: pci 0x8086 "Intel Corporation"
  Device: pci 0x10d3 "82574L Gigabit Network Connection"
  SubVendor: pci 0x1043 "ASUSTeK Computer Inc."
  SubDevice: pci 0x8369
  Driver: "e1000e"
  Driver Modules: "e1000e"
  Device File: eth0
  Memory Range: 0xf7d00000-0xf7d1ffff (rw,non-prefetchable)
  I/O Ports: 0xd000-0xdfff (rw)
  Memory Range: 0xf7d20000-0xf7d23fff (rw,non-prefetchable)
  IRQ: 17 (no events)
  Link detected: yes
  Module Alias: "pci:v00008086d000010D3sv00001043sd00008369bc02sc00i00"
  Driver Info #0:
    Driver Status: e1000e is active
    Driver Activation Cmd: "modprobe e1000e"
  Config Status: cfg=no, avail=yes, need=no, active=unknown
  Attached to: #15 (PCI bridge)



cat /proc/interrupts

57 IR-PCI-MSI-edge      eth0

the driver works for a while, but then a single message

"Disabling IRQ #57"

is sent to the console and the machine dies. Nothing else is written to
the logs and the remote iKVM becomes unresponsive, so there is no way to
get more debug info back. Kernel 3.5.3 works OK.

Anyone else seeing this ?


* RE: GRO aggregation
From: Shlomo Pongratz @ 2012-09-11 18:51 UTC (permalink / raw)
  To: mleitner@redhat.com; +Cc: netdev@vger.kernel.org
In-Reply-To: <504F876D.9020102@redhat.com>

From: Marcelo Ricardo Leitner [mleitner@redhat.com]
Sent: Tuesday, September 11, 2012 9:48 PM
To: Shlomo Pongratz
Cc: netdev@vger.kernel.org
Subject: Re: GRO aggregation

On 09/11/2012 03:41 PM, Shlomo Pongratz wrote:
> From: Marcelo Ricardo Leitner [mleitner@redhat.com]
> Sent: Tuesday, September 11, 2012 9:20 PM
> To: Shlomo Pongratz
> Cc: netdev@vger.kernel.org
> Subject: Re: GRO aggregation
>
> On 09/11/2012 10:45 AM, Shlomo Pongartz wrote:
>> Hi,
>>
>> I’m checking GRO aggregation with kernel 3.6.0-rc1+ using Intel ixgbe
>> driver.
>> The mtu is 1500 and GRO is on and so are SG and RX checksum.
>> I ran iperf with default setting and monitor the receiver with tcpdump.
>> The tcpdump shows that the maximal aggregation is 32120 which is 21 * 1500.
>> In the transmitter side tcpdump shows that TSO works better (~64K).
>> I did a capture without GRO enabled to see if there was a difference
>> between any flag
>> of any two consecutive packets that forced flushing but didn't find
>> anything.
>> Is the GRO aggregation can be tuned.
>
> Hi Shlomo,
>
> Have you tried tuning coalescing parameters?
>
> Marcelo
>
>
> Hi Marcelo
>
> I didn't play with interrupts coalescing.
> Do you suggest to increase the value?

Actually it was an idea off the top of my head, I don't know how it applies
to ixgbe, sorry. But making the NIC hold the packets a bit longer should
make it send larger ones to the kernel. It's a latency/throughput trade-off.

I was thinking about ethtool -c options, like rx-usecs*

Marcelo

I'll try to play with it a little.
Thanks.

Shlomo


* RE: GRO aggregation
From: Shlomo Pongratz @ 2012-09-11 18:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev@vger.kernel.org
In-Reply-To: <1347388396.13103.658.camel@edumazet-glaptop>

From: Eric Dumazet [eric.dumazet@gmail.com]
Sent: Tuesday, September 11, 2012 9:33 PM
To: Shlomo Pongratz
Cc: netdev@vger.kernel.org
Subject: Re: GRO aggregation

On Tue, 2012-09-11 at 16:45 +0300, Shlomo Pongartz wrote:
> Hi,
>
> I’m checking GRO aggregation with kernel 3.6.0-rc1+ using Intel ixgbe
> driver.
> The mtu is 1500 and GRO is on and so are SG and RX checksum.
> I ran iperf with default setting and monitor the receiver with tcpdump.
> The tcpdump shows that the maximal aggregation is 32120 which is 21 * 1500.
> In the transmitter side tcpdump shows that TSO works better (~64K).
> I did a capture without GRO enabled to see if there was a difference
> between any flag
> of any two consecutive packets that forced flushing but didn't find
> anything.
> Is the GRO aggregation can be tuned.

It might mean NAPI runs while about 21 frames can be fetched at once
from NIC.

If receiver cpu is fast enough, it has no need to aggregate more segment
per skb.

Is LRO off or on ?

GRO itself has a 64Kbytes limit.

Hi Eric.

I disabled LRO. I actually tried all 4 options and found that LRO, GRO, and LRO+GRO give the same results for ixgbe w.r.t. aggregation size (I didn't check throughput or latency).
Is there a timeout that flushes the aggregated SKBs before 64K has been aggregated?

Shlomo


* Re: GRO aggregation
From: Marcelo Ricardo Leitner @ 2012-09-11 18:48 UTC (permalink / raw)
  To: Shlomo Pongratz; +Cc: netdev@vger.kernel.org
In-Reply-To: <36F7E4A28C18BE4DB7C86058E7B607241E622015@MTRDAG01.mtl.com>

On 09/11/2012 03:41 PM, Shlomo Pongratz wrote:
> From: Marcelo Ricardo Leitner [mleitner@redhat.com]
> Sent: Tuesday, September 11, 2012 9:20 PM
> To: Shlomo Pongratz
> Cc: netdev@vger.kernel.org
> Subject: Re: GRO aggregation
>
> On 09/11/2012 10:45 AM, Shlomo Pongartz wrote:
>> Hi,
>>
>> I’m checking GRO aggregation with kernel 3.6.0-rc1+ using Intel ixgbe
>> driver.
>> The mtu is 1500 and GRO is on and so are SG and RX checksum.
>> I ran iperf with default setting and monitor the receiver with tcpdump.
>> The tcpdump shows that the maximal aggregation is 32120 which is 21 * 1500.
>> In the transmitter side tcpdump shows that TSO works better (~64K).
>> I did a capture without GRO enabled to see if there was a difference
>> between any flag
>> of any two consecutive packets that forced flushing but didn't find
>> anything.
>> Is the GRO aggregation can be tuned.
>
> Hi Shlomo,
>
> Have you tried tuning coalescing parameters?
>
> Marcelo
>
>
> Hi Marcelo
>
> I didn't play with interrupts coalescing.
> Do you suggest to increase the value?

Actually it was an idea off the top of my head, I don't know how it applies
to ixgbe, sorry. But making the NIC hold the packets a bit longer should
make it send larger ones to the kernel. It's a latency/throughput trade-off.

I was thinking about ethtool -c options, like rx-usecs*
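
As a rough illustration of that trade-off, the sketch below estimates how many full-size frames arrive on the wire during one interrupt-coalescing delay at a given line rate. It is a back-of-the-envelope model and does not describe actual ixgbe behavior. The function name `frames_per_window`, the per-frame wire size, and the idea that the delay corresponds directly to an rx-usecs setting are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical estimate: number of whole frames that arrive on the wire
 * during one coalescing window.
 *
 * link_mbps:   line rate in Mbit/s (e.g. 10000 for 10 GbE)
 * window_usec: assumed delay before the NIC raises an interrupt, in usec
 * wire_bytes:  assumed bytes per frame on the wire, including overhead
 *              (1538 = 1500 MTU + 14 header + 4 FCS + 8 preamble + 12 gap)
 */
uint64_t frames_per_window(uint64_t link_mbps, uint64_t window_usec,
                           uint64_t wire_bytes)
{
    if (wire_bytes == 0)
        return 0;

    /* Mbit/s times microseconds gives bits; divide by 8 for bytes. */
    uint64_t bytes_in_window = link_mbps * window_usec / 8;

    return bytes_in_window / wire_bytes;
}
```

Under these assumptions a longer window lets more frames queue up before the poll runs, which gives GRO more to merge per NAPI run at the cost of added latency for the first frame in each batch.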

Marcelo

