Netdev List
 help / color / mirror / Atom feed
* [PATCH net-next] ethernet: use IS_ERR_OR_NULL as much as possible
From: roy.qing.li @ 2014-12-19  5:20 UTC (permalink / raw)
  To: netdev

From: Li RongQing <roy.qing.li@gmail.com>

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
 drivers/net/ethernet/broadcom/tg3.c  | 2 +-
 drivers/net/ethernet/realtek/r8169.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index bb48a61..2e6e504 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -7858,7 +7858,7 @@ static int tg3_tso_bug(struct tg3 *tp, struct tg3_napi *tnapi,
 
 	segs = skb_gso_segment(skb, tp->dev->features &
 				    ~(NETIF_F_TSO | NETIF_F_TSO6));
-	if (IS_ERR(segs) || !segs)
+	if (IS_ERR_OR_NULL(segs))
 		goto tg3_tso_bug_end;
 
 	do {
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 14a1c5c..254dbe4 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -6876,7 +6876,7 @@ static void r8169_csum_workaround(struct rtl8169_private *tp,
 
 		features &= ~(NETIF_F_SG | NETIF_F_IPV6_CSUM | NETIF_F_TSO6);
 		segs = skb_gso_segment(skb, features);
-		if (IS_ERR(segs) || !segs)
+		if (IS_ERR_OR_NULL(segs))
 			goto drop;
 
 		do {
-- 
2.1.0

^ permalink raw reply related

* [PATCH net-next] netdevice: optimise netdev_intersect_features slightly
From: roy.qing.li @ 2014-12-19  5:21 UTC (permalink / raw)
  To: netdev; +Cc: mkubecek

From: Li RongQing <roy.qing.li@gmail.com>

Since f1 and f2 always have NETIF_F_GEN_CSUM when doing Bitwise OR assignment,
it is unnecessory to clear NETIF_F_GEN_CSUM from the added data

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Cc: Michal Kubecek <mkubecek@suse.cz>
---
 include/linux/netdevice.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c31f74d..b8facf9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3542,9 +3542,9 @@ static inline netdev_features_t netdev_intersect_features(netdev_features_t f1,
 							  netdev_features_t f2)
 {
 	if (f1 & NETIF_F_GEN_CSUM)
-		f1 |= (NETIF_F_ALL_CSUM & ~NETIF_F_GEN_CSUM);
+		f1 |= NETIF_F_ALL_CSUM;
 	if (f2 & NETIF_F_GEN_CSUM)
-		f2 |= (NETIF_F_ALL_CSUM & ~NETIF_F_GEN_CSUM);
+		f2 |= NETIF_F_ALL_CSUM;
 	f1 &= f2;
 	if (f1 & NETIF_F_GEN_CSUM)
 		f1 &= ~(NETIF_F_ALL_CSUM & ~NETIF_F_GEN_CSUM);
-- 
2.1.0

^ permalink raw reply related

* Re: [PATCH net-next] ethernet: use IS_ERR_OR_NULL as much as possible
From: David Miller @ 2014-12-19  5:58 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1418966457-23294-1-git-send-email-roy.qing.li@gmail.com>


net-next is not open for submissions, please resubmit these changes
when I open that tree back up.

^ permalink raw reply

* Re: [iwlwifi] BUG: unable to handle kernel
From: Grumbach, Emmanuel @ 2014-12-19  5:59 UTC (permalink / raw)
  To: Wu, Fengguang
  Cc: linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, lkp@01.org,
	linux-leds@vger.kernel.org
In-Reply-To: <20141218210705.GA26694@wfg-t540p.sh.intel.com>

On Thu, 2014-12-18 at 13:07 -0800, Fengguang Wu wrote:
> On Fri, Dec 19, 2014 at 03:42:17AM +0800, Grumbach, Emmanuel wrote:
> > On Thu, 2014-12-18 at 09:13 -0800, Fengguang Wu wrote:
> > > Hi All,
> > > 
> > > I don't see any relationship between the BUG and this bisected commit.
> > > Anyway, it's better to report it to the lists than to ignore.
> > 
> > Right - but I have to say that I have no clue how this comment can cause
> > the bug you are seeing...
> 
> s?comment?commit?
> 

Yes :)

> > Do you even have an Intel Wireless device the VM could access?
> 
> Nope. It's simple QEMU virtual machine boot test.
> 

In that case - this just can't be right... Don't know what to say...

> Thanks,
> Fengguang
> 
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes.git master
> > > 
> > > commit 03d6c3b0fa4f5f0379cede079ec828a6c999fe43
> > > Author:     Emmanuel Grumbach <emmanuel.grumbach@intel.com>
> > > AuthorDate: Wed Dec 3 10:39:07 2014 +0200
> > > Commit:     Emmanuel Grumbach <emmanuel.grumbach@intel.com>
> > > CommitDate: Sun Dec 14 10:20:29 2014 +0200
> > > 
> > >     iwlwifi: pcie: re-ACK all interrupts after device reset
> > >     
> > >     When we reset the device, the CSR_INT gets cleared as well
> > >     as CSR_INT_MASK. Meaning that we shouldn't get any interrupt
> > >     but, due to a hardware bug, recent devices will keep sending
> > >     interrupts. This leads to an interrupt storm while stopping
> > >     the device.
> > >     The way to fix this is to ACK all the interrupts after the
> > >     device is reset so that the value of CSR_INT will stay
> > >     0xffffffff.
> > >     
> > >     Fixes: 522713c81e4e ("iwlwifi: pcie: properly reset the device")
> > >     Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
> > > 
> > > +------------------------------------------+------------+------------+------------+
> > > |                                          | 0a79a0c011 | 03d6c3b0fa | iwlwifi-fi |
> > > +------------------------------------------+------------+------------+------------+
> > > | boot_successes                           | 60         | 19         | 3          |
> > > | boot_failures                            | 0          | 1          | 9          |
> > > | BUG:unable_to_handle_kernel              | 0          | 1          | 9          |
> > > | Oops                                     | 0          | 1          | 9          |
> > > | RIP:strcmp                               | 0          | 1          | 9          |
> > > | Kernel_panic-not_syncing:Fatal_exception | 0          | 1          | 9          |
> > > | backtrace:led_trigger_register_simple    | 0          | 1          | 9          |
> > > | backtrace:ledtrig_usb_init               | 0          | 1          | 9          |
> > > | backtrace:kernel_init_freeable           | 0          | 1          | 9          |
> > > +------------------------------------------+------------+------------+------------+
> > > 
> > > [    5.345018] g_serial gadget: Gadget Serial v2.4
> > > [    5.345927] g_serial gadget: g_serial ready
> > > [    5.345927] g_serial gadget: g_serial ready
> > > [    5.346777] BUG: unable to handle kernel 
> > > [    5.346777] BUG: unable to handle kernel paging requestpaging request at ffff88000004e5f0
> > >  at ffff88000004e5f0
> > > [    5.348183] IP:
> > > [    5.348183] IP: [<ffffffff81446a68>] strcmp+0x6/0x20
> > >  [<ffffffff81446a68>] strcmp+0x6/0x20
> > > [    5.349183] PGD 37f1067 
> > > [    5.349183] PGD 37f1067 PUD 37f2067 PUD 37f2067 PMD 37f3067 PMD 37f3067 PTE 800000000004e060PTE 800000000004e060
> > > 
> > > [    5.350498] Oops: 0000 [#1] 
> > > [    5.350498] Oops: 0000 [#1] DEBUG_PAGEALLOCDEBUG_PAGEALLOC
> > > 
> > > [    5.351360] CPU: 0 PID: 1 Comm: swapper Not tainted 3.18.0-g03d6c3b #1
> > > [    5.351360] CPU: 0 PID: 1 Comm: swapper Not tainted 3.18.0-g03d6c3b #1
> > > [    5.352660] task: ffff880012060000 ti: ffff88001204c000 task.ti: ffff88001204c000
> > > [    5.352660] task: ffff880012060000 ti: ffff88001204c000 task.ti: ffff88001204c000
> > > [    5.354143] RIP: 0010:[<ffffffff81446a68>] 
> > > [    5.354143] RIP: 0010:[<ffffffff81446a68>]  [<ffffffff81446a68>] strcmp+0x6/0x20
> > >  [<ffffffff81446a68>] strcmp+0x6/0x20
> > > [    5.354451] RSP: 0000:ffff88001204fe28  EFLAGS: 00010246
> > > [    5.354451] RSP: 0000:ffff88001204fe28  EFLAGS: 00010246
> > > [    5.354451] RAX: 0000000000000000 RBX: ffff88000c08fe00 RCX: ffffffff81d35310
> > > [    5.354451] RAX: 0000000000000000 RBX: ffff88000c08fe00 RCX: ffffffff81d35310
> > > [    5.354451] RDX: ffff88000c08fe68 RSI: ffffffff826d05be RDI: ffff88000004e5f0
> > > [    5.354451] RDX: ffff88000c08fe68 RSI: ffffffff826d05be RDI: ffff88000004e5f0
> > > [    5.354451] RBP: ffff88001204fe28 R08: 0000000000000001 R09: 000000000000033a
> > > [    5.354451] RBP: ffff88001204fe28 R08: 0000000000000001 R09: 000000000000033a
> > > [    5.354451] R10: 0000000000000000 R11: ffffffff82531cd1 R12: ffff88000c19fa00
> > > [    5.354451] R10: 0000000000000000 R11: ffffffff82531cd1 R12: ffff88000c19fa00
> > > [    5.354451] R13: 0000000000000000 R14: ffffffff837958b8 R15: 0000000000000000
> > > [    5.354451] R13: 0000000000000000 R14: ffffffff837958b8 R15: 0000000000000000
> > > [    5.354451] FS:  0000000000000000(0000) GS:ffffffff82789000(0000) knlGS:0000000000000000
> > > [    5.354451] FS:  0000000000000000(0000) GS:ffffffff82789000(0000) knlGS:0000000000000000
> > > [    5.354451] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > [    5.354451] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > [    5.354451] CR2: ffff88000004e5f0 CR3: 0000000002776000 CR4: 00000000000006b0
> > > [    5.354451] CR2: ffff88000004e5f0 CR3: 0000000002776000 CR4: 00000000000006b0
> > > [    5.354451] Stack:
> > > [    5.354451] Stack:
> > > [    5.354451]  ffff88001204fe58
> > > [    5.354451]  ffff88001204fe58 ffffffff81d35334 ffffffff81d35334 0000000000000000 0000000000000000 ffff88000c19fa00 ffff88000c19fa00
> > > 
> > > [    5.354451]  ffffffff826d05be
> > > [    5.354451]  ffffffff826d05be 0000000000000000 0000000000000000 ffff88001204fe88 ffff88001204fe88 ffffffff81d35648 ffffffff81d35648
> > > 
> > > [    5.354451]  ffff88000e3bbcc0
> > > [    5.354451]  ffff88000e3bbcc0 ffffffff82b3fe61 ffffffff82b3fe61 0000000000000000 0000000000000000 ffffffff82b98910 ffffffff82b98910
> > > 
> > > [    5.354451] Call Trace:
> > > [    5.354451] Call Trace:
> > > [    5.354451]  [<ffffffff81d35334>] led_trigger_register+0x63/0x129
> > > [    5.354451]  [<ffffffff81d35334>] led_trigger_register+0x63/0x129
> > > [    5.354451]  [<ffffffff81d35648>] led_trigger_register_simple+0x35/0x79
> > > [    5.354451]  [<ffffffff81d35648>] led_trigger_register_simple+0x35/0x79
> > > [    5.354451]  [<ffffffff82b3fe61>] ? gs_bind+0xea/0xea
> > > [    5.354451]  [<ffffffff82b3fe61>] ? gs_bind+0xea/0xea
> > > [    5.354451]  [<ffffffff82b3fe78>] ledtrig_usb_init+0x17/0x2e
> > > [    5.354451]  [<ffffffff82b3fe78>] ledtrig_usb_init+0x17/0x2e
> > > [    5.354451]  [<ffffffff82b00044>] do_one_initcall+0xe6/0x171
> > > [    5.354451]  [<ffffffff82b00044>] do_one_initcall+0xe6/0x171
> > > [    5.354451]  [<ffffffff82b001c7>] kernel_init_freeable+0xf8/0x180
> > > [    5.354451]  [<ffffffff82b001c7>] kernel_init_freeable+0xf8/0x180
> > > [    5.354451]  [<ffffffff82060791>] ? rest_init+0xbd/0xbd
> > > [    5.354451]  [<ffffffff82060791>] ? rest_init+0xbd/0xbd
> > > [    5.354451]  [<ffffffff8206079a>] kernel_init+0x9/0xd0
> > > [    5.354451]  [<ffffffff8206079a>] kernel_init+0x9/0xd0
> > > [    5.354451]  [<ffffffff8207d2ba>] ret_from_fork+0x7a/0xb0
> > > [    5.354451]  [<ffffffff8207d2ba>] ret_from_fork+0x7a/0xb0
> > > [    5.354451]  [<ffffffff82060791>] ? rest_init+0xbd/0xbd
> > > [    5.354451]  [<ffffffff82060791>] ? rest_init+0xbd/0xbd
> > > [    5.354451] Code: 
> > > [    5.354451] Code: c0 c0 eb eb f5 f5 31 31 c9 c9 40 40 8a 8a 3c 3c 0e 0e 4d 4d 8d 8d 0c 0c 08 08 40 40 84 84 ff ff 41 41 88 88 3c 3c 08 08 74 74 0d 0d 48 48 ff ff c1 c1 48 48 39 39 ca ca 75 75 e7 e7 41 41 c6 c6 41 41 01 01 00 00 5d 5d c3 c3 55 55 31 31 c0 c0 48 48 89 89 e5 e5 <8a> <8a> 14 14 07 07 3a 3a 14 14 06 06 74 74 07 07 19 19 c0 c0 83 83 c8 c8 01 01 eb eb 09 09 48 48 ff ff c0 c0 84 84 d2 d2 75 75 
> > > 
> > > [    5.354451] RIP 
> > > [    5.354451] RIP  [<ffffffff81446a68>] strcmp+0x6/0x20
> > >  [<ffffffff81446a68>] strcmp+0x6/0x20
> > > [    5.354451]  RSP <ffff88001204fe28>
> > > [    5.354451]  RSP <ffff88001204fe28>
> > > [    5.354451] CR2: ffff88000004e5f0
> > > [    5.354451] CR2: ffff88000004e5f0
> > > [    5.354451] ---[ end trace 8f9377b34c860a0c ]---
> > > [    5.354451] ---[ end trace 8f9377b34c860a0c ]---
> > > 
> > > git bisect start baa21e834941ee5fbe4bd421c871f7c0c5f9a086 70e71ca0af244f48a5dcf56dc435243792e3a495 --
> > > git bisect  bad 03d6c3b0fa4f5f0379cede079ec828a6c999fe43  # 16:23      0-      1  iwlwifi: pcie: re-ACK all interrupts after device reset
> > > git bisect good 0a79a0c011cb291675e3b80760a452fcba5c59d9  # 16:28     20+      0  iwlwifi: mvm: clear IN_HW_RESTART flag on stop()
> > > # first bad commit: [03d6c3b0fa4f5f0379cede079ec828a6c999fe43] iwlwifi: pcie: re-ACK all interrupts after device reset
> > > git bisect good 0a79a0c011cb291675e3b80760a452fcba5c59d9  # 16:30     60+      0  iwlwifi: mvm: clear IN_HW_RESTART flag on stop()
> > > # extra tests on HEAD of iwlwifi-fixes/master
> > > git bisect  bad baa21e834941ee5fbe4bd421c871f7c0c5f9a086  # 16:30      0-      9  iwlwifi: pcie: limit fw chunk sizes given to fh
> > > # extra tests on tree/branch iwlwifi-fixes/master
> > > git bisect  bad baa21e834941ee5fbe4bd421c871f7c0c5f9a086  # 16:30      0-      9  iwlwifi: pcie: limit fw chunk sizes given to fh
> > > # extra tests on tree/branch linus/master
> > > git bisect good 44e8967d591686463e84a88b46b03beba3ab49fb  # 16:32     60+      0  Ceph: remove left-over reject file
> > > # extra tests on tree/branch next/master
> > > git bisect good cfaa3a95dd2b402599b1d8122dc3107478db8717  # 16:35     60+      1  Add linux-next specific files for 20141218
> > > 
> > > 
> > > This script may reproduce the error.
> > > 
> > > ----------------------------------------------------------------------------
> > > #!/bin/bash
> > > 
> > > kernel=$1
> > > initrd=quantal-core-x86_64.cgz
> > > 
> > > wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
> > > 
> > > kvm=(
> > > 	qemu-system-x86_64
> > > 	-cpu kvm64
> > > 	-enable-kvm
> > > 	-kernel $kernel
> > > 	-initrd $initrd
> > > 	-m 320
> > > 	-smp 2
> > > 	-net nic,vlan=1,model=e1000
> > > 	-net user,vlan=1
> > > 	-boot order=nc
> > > 	-no-reboot
> > > 	-watchdog i6300esb
> > > 	-rtc base=localtime
> > > 	-serial stdio
> > > 	-display none
> > > 	-monitor null 
> > > )
> > > 
> > > append=(
> > > 	hung_task_panic=1
> > > 	earlyprintk=ttyS0,115200
> > > 	debug
> > > 	apic=debug
> > > 	sysrq_always_enabled
> > > 	rcupdate.rcu_cpu_stall_timeout=100
> > > 	panic=-1
> > > 	softlockup_panic=1
> > > 	nmi_watchdog=panic
> > > 	oops=panic
> > > 	load_ramdisk=2
> > > 	prompt_ramdisk=0
> > > 	console=ttyS0,115200
> > > 	console=tty0
> > > 	vga=normal
> > > 	root=/dev/ram0
> > > 	rw
> > > 	drbd.minor_count=8
> > > )
> > > 
> > > "${kvm[@]}" --append "${append[*]}"
> > > ----------------------------------------------------------------------------
> > > 
> > > Thanks,
> > > Fengguang
> > > _______________________________________________
> > > LKP mailing list
> > > LKP@linux.intel.com
> > 


^ permalink raw reply

* [PATCH Iproute2 next] ip netns: 'ip netns id' cmd without argument should not give error
From: Mahesh Bandewar @ 2014-12-19  6:05 UTC (permalink / raw)
  To: netdev, Stephen Hemminger; +Cc: Vadim Kochan, Saied Kazemi, Mahesh Bandewar

The command 'ip netns identify' without PID parameter is supposed to
use the self PID to identify its ns but a trivial error prevented it
from doing so. This patch fixes that error.

So before the patch -
	# ip netns id
	No pid specified
	# echo $?
	1
	#

After the patch -
	# ip netns id
	test-ns
	# echo $?
	0
	#

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Vadim Kochan <vadim4j@gmail.com> 
Cc: Saied Kazemi <saied@google.com>
---
 ip/ipnetns.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ip/ipnetns.c b/ip/ipnetns.c
index 1c8aa029073e..ec9afa2177a5 100644
--- a/ip/ipnetns.c
+++ b/ip/ipnetns.c
@@ -298,7 +298,7 @@ static int netns_identify(int argc, char **argv)
 	DIR *dir;
 	struct dirent *entry;
 
-	if (argc < 1) {
+	if (!argc) {
 		pidstr = "self";
 	} else if (argc > 1) {
 		fprintf(stderr, "extra arguments specified\n");
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related

* Re: [PATCH net-next] netdevice: optimise netdev_intersect_features slightly
From: Michal Kubecek @ 2014-12-19  6:09 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1418966487-23347-1-git-send-email-roy.qing.li@gmail.com>

On Fri, Dec 19, 2014 at 01:21:27PM +0800, roy.qing.li@gmail.com wrote:
> From: Li RongQing <roy.qing.li@gmail.com>
> 
> Since f1 and f2 always have NETIF_F_GEN_CSUM when doing Bitwise OR assignment,
> it is unnecessory to clear NETIF_F_GEN_CSUM from the added data
> 
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
> Cc: Michal Kubecek <mkubecek@suse.cz>
> ---
>  include/linux/netdevice.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index c31f74d..b8facf9 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3542,9 +3542,9 @@ static inline netdev_features_t netdev_intersect_features(netdev_features_t f1,
>  							  netdev_features_t f2)
>  {
>  	if (f1 & NETIF_F_GEN_CSUM)
> -		f1 |= (NETIF_F_ALL_CSUM & ~NETIF_F_GEN_CSUM);
> +		f1 |= NETIF_F_ALL_CSUM;
>  	if (f2 & NETIF_F_GEN_CSUM)
> -		f2 |= (NETIF_F_ALL_CSUM & ~NETIF_F_GEN_CSUM);
> +		f2 |= NETIF_F_ALL_CSUM;
>  	f1 &= f2;
>  	if (f1 & NETIF_F_GEN_CSUM)
>  		f1 &= ~(NETIF_F_ALL_CSUM & ~NETIF_F_GEN_CSUM);

This is not an optimization in the sense of resulting code as the
expression right of "|=" is a constant so the patch just replaces or
with one constant by an or with a different one. But the source looks
nicer so it may make sense anyway.

However, I will have to take a look at both versions of the resulting
code first as we are using the same expression (or its bitwise inverse)
later so that current version might be actually slightly more efficient.

                                                         Michal Kubecek

^ permalink raw reply

* [patches] a bunch of old bluetooth fixes
From: Al Viro @ 2014-12-19  6:18 UTC (permalink / raw)
  To: David Miller
  Cc: Marcel Holtmann, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-bluetooth-u79uwXL29TY76Z2rM5mHXA

	This stuff has been sitting in my queue since March; basically,
several places in net/bluetooth assume that they are dealing with
l2cap sockets, while it is possible to get an arbitrary socket to those.
Results are not pretty.
	* HIDPCONNADD gets an arbitrary user-supplied socket; the code
it calls (hidp_connection_add()) verifies that the socket is l2cap one,
but before doing so it finds l2cap_pi(ctrl_sock->sk)->chan.  It's not
that big a deal (it's only 5 words past the end of struct sock), but
it's trivial to avoid and, in theory, we might end up oopsing here if
we are very unlucky and it happens to hit an unmapped page just past
the actual object ctrl_sock->sk sits in.
	* CMTP counterpart of that doesn't validate the socket at all.
It proceeds to
        s = __cmtp_get_session(&l2cap_pi(sock->sk)->chan->dst);
which can very easily oops - here ->chan is already garbage and we
proceed to dereference that.  As with HIDP, one needs CAP_NET_ADMIN to
trigger that, but it's really a clear bug.  The only sanity check we
do is verifying that nsock->sk->sk_state is equal to BT_CONNECTED,
which is not unique to bluetooth, to put it mildly.  It's just 1,
so a TCP_ESTABLISHED tcp socket will pass that check just fune.
The fix is trivial...
	* BNEP situation is identical to CMTP one.

I've sent these patches back then (March 10), but they seem to have fallen
through the cracks.  The bugs are still there and the fixes still apply.
If you would prefer me to resend them after -rc1, just tell...

Anyway, patches follow

^ permalink raw reply

* [PATCH 1/3] bluetooth: hidp_connection_add() unsafe use of l2cap_pi()
From: Al Viro @ 2014-12-19  6:20 UTC (permalink / raw)
  To: David Miller; +Cc: Marcel Holtmann, netdev, linux-bluetooth
In-Reply-To: <20141219061801.GU22149@ZenIV.linux.org.uk>

From: Al Viro <viro@zeniv.linux.org.uk>

it's OK after we'd verified the sockets, but not before that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/bluetooth/hidp/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
index cc25d0b..07348e1 100644
--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -1314,13 +1314,14 @@ int hidp_connection_add(struct hidp_connadd_req *req,
 {
 	struct hidp_session *session;
 	struct l2cap_conn *conn;
-	struct l2cap_chan *chan = l2cap_pi(ctrl_sock->sk)->chan;
+	struct l2cap_chan *chan;
 	int ret;
 
 	ret = hidp_verify_sockets(ctrl_sock, intr_sock);
 	if (ret)
 		return ret;
 
+	chan = l2cap_pi(ctrl_sock->sk)->chan;
 	conn = NULL;
 	l2cap_chan_lock(chan);
 	if (chan->conn)
-- 
2.1.3

^ permalink raw reply related

* [PATCH 2/3] cmtp: cmtp_add_connection() should verify that it's dealing with l2cap socket
From: Al Viro @ 2014-12-19  6:20 UTC (permalink / raw)
  To: David Miller; +Cc: Marcel Holtmann, netdev, linux-bluetooth
In-Reply-To: <20141219061801.GU22149@ZenIV.linux.org.uk>

From: Al Viro <viro@zeniv.linux.org.uk>

... rather than relying on ciptool(8) never passing it anything else.  Give
it e.g. an AF_UNIX connected socket (from socketpair(2)) and it'll oops,
trying to evaluate &l2cap_pi(sock->sk)->chan->dst...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/bluetooth/cmtp/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/bluetooth/cmtp/core.c b/net/bluetooth/cmtp/core.c
index 67fe5e8..fd57db8 100644
--- a/net/bluetooth/cmtp/core.c
+++ b/net/bluetooth/cmtp/core.c
@@ -333,6 +333,8 @@ int cmtp_add_connection(struct cmtp_connadd_req *req, struct socket *sock)
 	int i, err;
 
 	BT_DBG("");
+	if (!l2cap_is_socket(sock))
+		return -EBADFD;
 
 	session = kzalloc(sizeof(struct cmtp_session), GFP_KERNEL);
 	if (!session)
-- 
2.1.3

^ permalink raw reply related

* [PATCH 3/3] bnep: bnep_add_connection() should verify that it's dealing with l2cap socket
From: Al Viro @ 2014-12-19  6:20 UTC (permalink / raw)
  To: David Miller; +Cc: Marcel Holtmann, netdev, linux-bluetooth
In-Reply-To: <20141219061801.GU22149@ZenIV.linux.org.uk>

From: Al Viro <viro@zeniv.linux.org.uk>

same story as cmtp

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/bluetooth/bnep/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/bluetooth/bnep/core.c b/net/bluetooth/bnep/core.c
index 85bcc21..ce82722d0 100644
--- a/net/bluetooth/bnep/core.c
+++ b/net/bluetooth/bnep/core.c
@@ -533,6 +533,9 @@ int bnep_add_connection(struct bnep_connadd_req *req, struct socket *sock)
 
 	BT_DBG("");
 
+	if (!l2cap_is_socket(sock))
+		return -EBADFD;
+
 	baswap((void *) dst, &l2cap_pi(sock->sk)->chan->dst);
 	baswap((void *) src, &l2cap_pi(sock->sk)->chan->src);
 
-- 
2.1.3

^ permalink raw reply related

* RE: [PATCH net] r8152: drop the tx packet with invalid length
From: Hayes Wang @ 2014-12-19  6:42 UTC (permalink / raw)
  To: David Miller, eric.dumazet@gmail.com
  Cc: netdev@vger.kernel.org, nic_swsd, linux-kernel@vger.kernel.org,
	linux-usb@vger.kernel.org
In-Reply-To: <20141126.153356.461494573591656082.davem@davemloft.net>

> From: David Miller [mailto:davem@davemloft.net] 
> Sent: Thursday, November 27, 2014 4:34 AM
[...]
> >> > Looks like a candidate for ndo_gso_check(), so that we do not drop, but
> >> > instead segment from netif_needs_gso()/validate_xmit_skb()
> >> 
> >> You mean have the bridge implement the ndo_gso_check() method right?
> > 
> > No, I meant this particular driver.
> > 
> > Note that netif_skb_features() does only this check :
> > 
> > if (gso_segs > dev->gso_max_segs || gso_segs < dev->gso_min_segs)
> >       features &= ~NETIF_F_GSO_MASK;
> > 
> > Ie not testing gso_max_size
> > 
> > It looks like all these particular tests should be moved on
> > ndo_gso_check(), to remove code from netif_skb_features()
> 
> A check against gso_max_size is generic enough that it ought to be put
> right into netif_needs_gso() rather then duplicating it into every
> driver's ndo_gso_check() method don't you think?

Excuse me. I try to implement ndo_gso_check() with kernel 3.18.
However, I still get packets with gso and their skb lengths are more
than the acceptable one. Do I miss something?

+static bool rtl8152_gso_check(struct sk_buff *skb, struct net_device *dev)
+{
+	if ((skb->len + sizeof(struct tx_desc)) > agg_buf_sz)
+		return false;
+
+	return true;
+}
 
@@ -5861,6 +5876,9 @@ static const struct net_device_ops rtl8152_netdev_ops = {
 	.ndo_set_mac_address	= rtl8152_set_mac_address,
 	.ndo_change_mtu		= rtl8152_change_mtu,
 	.ndo_validate_addr	= eth_validate_addr,
+	.ndo_gso_check		= rtl8152_gso_check,
 };

Best Regards,
Hayes

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  6:59 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +static struct {
> +	const char str[ETH_GSTRING_LEN];
> +} ethtool_stats_keys[] = {
> +	{
> +		"jiffies"
> +	}, {

Jiffies does not belong in ethtool stats. Looks like some developer debug option.

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:01 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +
> +static u32 lio_get_link(struct net_device *dev)
> +{
> +	u32 ret;
> +
> +	ret = netif_carrier_ok(dev) ? 1 : 0;
> +	return ret;
> +}
> +

This is unnecessary, this is what the default ethtool handler already does
if you give a NULL handler.

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:03 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +	if (linfo->link.s.status) {
> +		ecmd->speed = linfo->link.s.speed;
> +		ecmd->duplex = linfo->link.s.duplex;
> +	} else {
> +		ecmd->speed = -1;
> +		ecmd->duplex = -1;
> +	}
> +

You should SPEED_UNKNOWN and DUPLEX_UNKNOWN (not -1).
and don't set ecmd->speed directly, use
     ethtool_cmd_speed_sed(ecmd, speed)

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:05 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +static void
> +lio_get_ethtool_stats(struct net_device *netdev,
> +		      struct ethtool_stats *stats, u64 *data)
> +{
> +	struct lio *lio = GET_LIO(netdev);
> +	struct octeon_device *oct_dev = lio->oct_dev;
> +	int i = 0, j;
> +
> +	data[i++] = jiffies;
> +	data[i++] = lio->stats.tx_packets;
> +	data[i++] = lio->stats.tx_bytes;
> +	data[i++] = lio->stats.rx_packets;
> +	data[i++] = lio->stats.rx_bytes;
> +	data[i++] = lio->stats.tx_errors;
> +	data[i++] = lio->stats.tx_dropped;
> +	data[i++] = lio->stats.rx_dropped;
> +

Do not mirror existing netdevice statistics in ethtool.
The purpose of ethtool stats is to provide device specific information.

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:07 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +/** \brief Network device get stats
> + * @param netdev    pointer to network device
> + * @returns pointer stats structure
> + */
> +static struct net_device_stats *liquidio_stats(struct net_device *netdev)
> +{
> +	return &(GET_LIO(netdev)->stats);
> +}
> +

Unnecessary function, this is what network core does already
if .ndo_get_stats is NULL;

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:11 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +		if (packet_was_received) {
> +			lio->stats.rx_bytes += len;
> +			atomic64_inc((atomic64_t *)&lio->stats.rx_packets);
> +			netdev->last_rx = jiffies;
> +		} else {
> +			atomic64_inc((atomic64_t *)&lio->stats.rx_dropped);
> +			lio_info(lio, rx_err,
> +				 "netif_rxXX  error rx_dropped:%ld\n",
> +				 lio->stats.rx_dropped);
> +		}

What is point of keeping own statistics in atomic (which are slower)?
Just use dev->stats directly.

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:14 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +static int liquidio_xmit(struct sk_buff *skb, struct net_device *netdev)
> +{
> +	struct lio *lio;
> +	struct octnet_buf_free_info *finfo;
> +	union octnic_cmd_setup cmdsetup;
> +	struct octnic_data_pkt ndata;
> +	struct octeon_device *oct;
> +	int cpu = 0, status = 0;
> +	int q_idx = 0;
> +	int xmit_more;
> +
> +	lio = GET_LIO(netdev);
> +	oct = lio->oct_dev;
> +	atomic64_inc((atomic64_t *)&lio->stats.tx_packets);
> +	lio->stats.tx_bytes += skb->len;
> +
> +	if (!ifstate_check(lio, LIO_IFSTATE_TXENABLED)) {
> +		lio_info(lio, tx_err, "Transmit Busy, not enabled\n");
> +		return NETDEV_TX_BUSY;
> +	}

This kind of busy handling in transmit, will cause the send in higher
level to retry and spin the CPU. Better to do proper flow control
like other drivers.

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:21 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +static int liquidio_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd)
> +{
> +	switch (cmd) {
> +	case SIOCSHWTSTAMP:
> +		return hwtstamp_ioctl(netdev, ifr, cmd);
> +	default:
> +		return -EINVAL

Standard error code for unsupported ioctl is -EOPNOTSUPP

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:23 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +static int __init liquidio_init(void)
> +{
> +	int i;
> +	struct handshake *hs;
> +
> +	pr_info("LiquidIO: Starting Network module version %s\n",
> +		LIQUIDIO_VERSION);
> +

Do you really need to log this. Systems are already too chatty.

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:26 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +/**
> + * \brief Alloc net device
> + * @param size Size to allocate
> + * @param nq how many queues
> + * @returns pointer to net device structure
> + */
> +static inline struct net_device *liquidio_alloc_netdev(int size, int nq)
> +{
> +	if (nq > 1)
> +		return alloc_netdev_mq(size, "lio%d", NET_NAME_UNKNOWN,
> +				       ether_setup, nq);
> +	else
> +		return alloc_netdev(size, "lio%d", NET_NAME_UNKNOWN,
> +				    ether_setup);
> +}

There is no need for the (nq > 1) test, just do:
> +		return alloc_netdev_mq(size, "lio%d", NET_NAME_UNKNOWN,
> +				       ether_setup, nq);

Also, why does this device not have standard ethernet naming?

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:28 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +void process_noresponse_list(struct octeon_device *oct,
> +			     struct octeon_instr_queue *iq)
> +{
> +	uint32_t get_idx;
> +	struct octeon_soft_command *sc;
> +	struct octeon_instr_ih *ih;

All global function names must start with same prefix, driver may not
always be built as a module. 

In this case oceteon_process_noresponse_list.

^ permalink raw reply

* Re: [PATCH RFC v4 net-next 1/5] virtio_net: enable tx interrupt
From: Qin Chuanyu @ 2014-12-19  7:32 UTC (permalink / raw)
  To: Jason Wang, mst, virtualization, netdev, linux-kernel, davem; +Cc: pagupta
In-Reply-To: <1417429028-11971-2-git-send-email-jasowang@redhat.com>

On 2014/12/1 18:17, Jason Wang wrote:
> On newer hosts that support delayed tx interrupts,
> we probably don't have much to gain from orphaning
> packets early.
>
> Note: this might degrade performance for
> hosts without event idx support.
> Should be addressed by the next patch.
>
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>   drivers/net/virtio_net.c | 132 +++++++++++++++++++++++++++++++----------------
>   1 file changed, 88 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ec2a8b4..f68114e 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
>   static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
>   {
>   	struct skb_vnet_hdr *hdr;
> @@ -912,7 +951,9 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
>   		sg_set_buf(sq->sg, hdr, hdr_len);
>   		num_sg = skb_to_sgvec(skb, sq->sg + 1, 0, skb->len) + 1;
>   	}
> -	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC);
> +
> +	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb,
> +				    GFP_ATOMIC);
>   }
>
>   static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> @@ -924,8 +965,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>   	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
>   	bool kick = !skb->xmit_more;
>
> -	/* Free up any pending old buffers before queueing new ones. */
> -	free_old_xmit_skbs(sq);

I think there is no need to remove free_old_xmit_skbs here.
you could add free_old_xmit_skbs in tx_irq's napi func.
also could do this in start_xmit if you handle the race well.

I have done the same thing in ixgbe driver(free skb in ndo_start_xmit 
and both in tx_irq's poll func), and it seems work well:)

I think there would be no so much interrupts in this way, also tx 
interrupt coalesce is not needed.

> +	virtqueue_disable_cb(sq->vq);
>
>   	/* Try to transmit */
>   	err = xmit_skb(sq, skb);
> @@ -941,27 +981,19 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>   		return NETDEV_TX_OK;
>   	}
>
> -	/* Don't wait up for transmitted skbs to be freed. */
> -	skb_orphan(skb);
> -	nf_reset(skb);
> -
>   	/* Apparently nice girls don't return TX_BUSY; stop the queue
>   	 * before it gets out of hand.  Naturally, this wastes entries. */
> -	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> +	if (sq->vq->num_free < 2+MAX_SKB_FRAGS)
>   		netif_stop_subqueue(dev, qnum);
> -		if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> -			/* More just got used, free them then recheck. */
> -			free_old_xmit_skbs(sq);
> -			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> -				netif_start_subqueue(dev, qnum);
> -				virtqueue_disable_cb(sq->vq);
> -			}
> -		}
> -	}
>
>   	if (kick || netif_xmit_stopped(txq))
>   		virtqueue_kick(sq->vq);
>
> +	if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> +		virtqueue_disable_cb(sq->vq);
> +		napi_schedule(&sq->napi);
> +	}
> +
>   	return NETDEV_TX_OK;
>   }
>
> @@ -1138,8 +1170,10 @@ static int virtnet_close(struct net_device *dev)
>   	/* Make sure refill_work doesn't re-enable napi! */
>   	cancel_delayed_work_sync(&vi->refill);
>
> -	for (i = 0; i < vi->max_queue_pairs; i++)
> +	for (i = 0; i < vi->max_queue_pairs; i++) {
>   		napi_disable(&vi->rq[i].napi);
> +		napi_disable(&vi->sq[i].napi);
> +	}
>
>   	return 0;
>   }
> @@ -1452,8 +1486,10 @@ static void virtnet_free_queues(struct virtnet_info *vi)
>   {
>   	int i;
>
> -	for (i = 0; i < vi->max_queue_pairs; i++)
> +	for (i = 0; i < vi->max_queue_pairs; i++) {
>   		netif_napi_del(&vi->rq[i].napi);
> +		netif_napi_del(&vi->sq[i].napi);
> +	}
>
>   	kfree(vi->rq);
>   	kfree(vi->sq);

^ permalink raw reply

* Re: [RFC PATCH net-next v3 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-19  8:12 UTC (permalink / raw)
  To: Thomas Graf
  Cc: John Fastabend, Varlese, Marco, netdev@vger.kernel.org,
	Fastabend, John R, roopa@cumulusnetworks.com, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <20141219003510.GC16239@casper.infradead.org>

Fri, Dec 19, 2014 at 01:35:10AM CET, tgraf@suug.ch wrote:
>On 12/18/14 at 08:03am, John Fastabend wrote:
>> On 12/18/2014 07:30 AM, Varlese, Marco wrote:
>> Could you also document the attributes. I think they are mostly
>> clear but what is IFLA_SW_LOOPBACK. It will help later when we
>> try to read the code in 6months and implement drivers.
>> 
>> I am thinking something like
>> 
>> /* Switch Port Attributes section
>>  * IFLA_SW_LEARNING - turns learning on in the bridge
>>  * IFLA_SW_LOOPBACK - does something interesting
>> 
>>  [...]
>>  */
>
>+1. I would even ask for more than that. While clear in the bridge
>context, "learning" for this API targetting multi layer switches
>is ambigious. The expectation towards the driver must be crystical
>clear.
>
>> >+
>> >+enum {
>> >+	IFLA_SW_UNSPEC,
>> >+	IFLA_SW_LEARNING,
>> 
>> Can you address Roopa's feedback. I'm also a bit confused by the
>> duplication.
>
>Agreed. Can we decide on the ndo first and then build the APIs on
>top of that? While I agree that we should have a non-bridge based
>Netlink API, the underlying ndo should be the same.

Maybe we can use this kind of ndos (proposed by this patch) and call it
for switchdevs instead of bridge_get/setlink. Would make sense to me.
single set of ndos, many possible users (userspace/in-kernel).
bridge_get/setlink would be just a wrapper for this.

>
>> >+static const struct nla_policy ifla_sw_attr_policy[IFLA_SW_ATTR_MAX+1] = {
>> >+	[IFLA_SW_LEARNING]	= { .type = NLA_U64 },
>> >+	[IFLA_SW_LOOPBACK]	= { .type = NLA_U64 },
>> >+	[IFLA_SW_BCAST_FLOODING] = { .type = NLA_U64 },
>> >+	[IFLA_SW_UCAST_FLOODING] = { .type = NLA_U64 },
>> >+	[IFLA_SW_MCAST_FLOODING] = { .type = NLA_U64 },
>> >+};
>> 
>> Why U64 values? What would we pass in these? Are these just boolean
>> bits? Maybe the annotation above will help me understand this.
>
>I think the intent is to keep the ndo API as simple as possible
>but I agree that this is wasteful. I gave this feedback on v2 already.

^ permalink raw reply

* Re: [RFC PATCH net-next v2 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-19  8:14 UTC (permalink / raw)
  To: Arad, Ronen
  Cc: Fastabend, John R, Roopa Prabhu, Varlese, Marco,
	netdev@vger.kernel.org, Thomas Graf, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <E4CD12F19ABA0C4D8729E087A761DC3505DC9956@ORSMSX101.amr.corp.intel.com>

Thu, Dec 18, 2014 at 11:43:06PM CET, ronen.arad@intel.com wrote:
>
>
>>-----Original Message-----
>>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>>Behalf Of John Fastabend
>>Sent: Thursday, December 18, 2014 9:21 PM
>>To: Roopa Prabhu; Varlese, Marco
>>Cc: netdev@vger.kernel.org; Thomas Graf; Jiri Pirko; sfeldma@gmail.com; linux-
>>kernel@vger.kernel.org
>>Subject: Re: [RFC PATCH net-next v2 1/1] net: Support for switch port
>>configuration
>>
>>On 12/18/2014 10:14 AM, Roopa Prabhu wrote:
>>> On 12/18/14, 10:02 AM, Varlese, Marco wrote:
>>>> Removed unnecessary content for ease of reading...
>>>>
>>>>>>>>>> +/* Switch Port Attributes section */
>>>>>>>>>> +
>>>>>>>>>> +enum {
>>>>>>>>>> +    IFLA_ATTR_UNSPEC,
>>>>>>>>>> +    IFLA_ATTR_LEARNING,
>>>>>>>>> Any reason you want learning here ?. This is covered as part  of
>>>>>>>>> the bridge setlink attributes.
>>>>>>>>>
>>>>>>>> Yes, because the user may _not_ want to go through a bridge
>>>>>>>> interface
>>>>>>> necessarily.
>>>>>>> But, the bridge setlink/getlink interface was changed to accommodate
>>>>> 'self'
>>>>>>> for exactly such cases.
>>>>>>> I kind of understand your case for the other attributes (these are
>>>>>>> per port settings that switch asics provide).
>>>>>>>
>>>>>>> However, i don't understand the reason to pull in bridge attributes
>>here.
>>>>>>>
>>>>>> Maybe, I am missing something so you might help. The learning attribute -
>>>>> in my case - it is like all other attributes: a port attribute (as you
>>said, port
>>>>> settings that the switch provides per port).
>>>>>> So, what I was saying is "why the user shall go through a bridge to
>>configure
>>>>> the learning attribute"? From my perspective, it is as any other attribute
>>and
>>>>> as such configurable on the port.
>>>>>
>>>>> Thinking about this some more, i don't see why any of these attributes
>>>>> (except loopback. I dont understand the loopback attribute) cant be part
>>of
>>>>> the birdge port attributes.
>>>>>
>>>>> With this we will end up adding l2 attributes in two places: the general
>>link
>>>>> attributes and bridge attributes.
>>>>>
>>>>> And since we have gone down the path of using ndo_bridge_setlink/getlink
>>>>> with 'self'....we should stick to that for all l2 attributes.
>>>>>
>>>>> The idea of overloading ndo_bridge_set/getlink, was to have the same set
>>of
>>>>> attributes but support both cases where the user wants to go through the
>>>>> bridge driver or directly to the switch port driver. So, you are not
>>really going
>>>>> through the bridge driver if you use 'self' and
>>ndo_bridge_setlink/getlink.
>>>>>
>>
>>>> Roopa, one of the comments I got from Thomas Graf on my v1 patch
>>>> was that your patch and mine were supplementary ("I think Roopa's
>>>> patches are supplementary. Not all switchdev users will be backed
>>>> with a Linux Bridge. I therefore welcome your patches very
>>>> much")... I also understood by others that the patch made sense for
>>>> the same reason. I simply do not understand why these attributes
>>>> (and maybe others in the future) could not be configured directly
>>>> on a standard port but have to go through a bridge.
>>>>
>>> ok, i am very confused in that case. The whole moving of bridge
>>> attributes from the bridge driver to rtnetlink.c was to make the
>>> bridge attributes accessible to any driver who wants to set l2/bridge
>>> attributes on their switch ports. So, its unclear to me why we are
>>> doing this parallel thing again. This move to rtnetlink.c was done
>>> during the recent rocker support. so, maybe scott/jiri can elaborate
>>> more.
>>
>>
>>Not sure if this will add to the confusion or help. But you do not
>>need to have the bridge.ko loaded or netdev's attached to a bridge
>>to use the setlink/getlink ndo ops and netlink messages.
>
>No you don't need bridge.ko to implement ndo_bridge_setlink/getlink. Rtnetlink invokes those ndos from code which does not depend on CONFIG_BRIDGE or the presence of bridge.ko.
>Calling some bridge exported functions such as br_fdb_external_learn_add/del requires the presence of bridge.ko and it only makes sense when the switch port device is enslaved to a bridge.

Note I plan to change br_fdb_external_learn_add/del to (semi-)generic
notifier very soon.


>
>>
>>This was intentionally done. Its already used with NIC devices to
>>configure embedded bridge settings such as VEB/VEPA.
>>
>>I think I'm just repeating Roopa though.
>>
>>--
>>To unsubscribe from this list: send the line "unsubscribe netdev" in
>>the body of a message to majordomo@vger.kernel.org
>>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox