Netdev List
* [PATCH rdma-next 12/12] RDMA/uverbs: Fix slab-out-of-bounds in ib_uverbs_ex_create_flow
From: Leon Romanovsky @ 2018-06-24  8:23 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Hadar Hen Zion, Matan Barak,
	Michael J Ruhl, Noa Osherovich, Raed Salem, Yishai Hadas,
	Saeed Mahameed, linux-netdev
In-Reply-To: <20180624082353.16138-1-leon@kernel.org>

From: Leon Romanovsky <leonro@mellanox.com>

The check of cmd.flow_attr.size should take into account the size of
the reserved field (2 bytes); otherwise the user can provide a size
which will cause the slab-out-of-bounds warning below.
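
As an illustration only (not the uverbs code), here is a minimal userspace
sketch of the same idea: before reading the size field of a variable-length
record, make sure the whole header, reserved field included, fits in the
bytes that remain. The struct layout and the walk_specs() name are
assumptions made for this example, and the comparison used here (>= header
size) is the sketch's own choice, not necessarily what the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header shared by every variable-length spec record. */
struct spec_hdr {
	uint32_t type;
	uint16_t size;		/* total record size in bytes, header included */
	uint16_t reserved;
};

/*
 * Walk num_specs records stored back to back in buf[0..len).
 * Returns 0 if exactly num_specs records consume exactly len bytes,
 * -1 otherwise.
 */
static int walk_specs(const unsigned char *buf, size_t len,
		      unsigned int num_specs)
{
	size_t remaining = len;
	unsigned int i;

	assert(buf != NULL);

	for (i = 0; i < num_specs; i++) {
		struct spec_hdr hdr;

		/* The whole header, reserved field included, must fit. */
		if (remaining < sizeof(hdr))
			break;
		memcpy(&hdr, buf + (len - remaining), sizeof(hdr));

		/* Reject records shorter than a header or overrunning the buffer. */
		if (hdr.size < sizeof(hdr) || hdr.size > remaining)
			break;
		remaining -= hdr.size;
	}
	return (i == num_specs && remaining == 0) ? 0 : -1;
}
```

A buffer that ends in the middle of a header is rejected before any field of
that header is read, which is the property the KASAN report shows was missing.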

==================================================================
BUG: KASAN: slab-out-of-bounds in ib_uverbs_ex_create_flow+0x1740/0x1d00
Read of size 2 at addr ffff880068dff1a6 by task syz-executor775/269

CPU: 0 PID: 269 Comm: syz-executor775 Not tainted 4.18.0-rc1+ #245
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
Call Trace:
 dump_stack+0xef/0x17e
 print_address_description+0x83/0x3b0
 kasan_report+0x18d/0x4d0
 ib_uverbs_ex_create_flow+0x1740/0x1d00
 ib_uverbs_write+0x923/0x1010
 __vfs_write+0x10d/0x720
 vfs_write+0x1b0/0x550
 ksys_write+0xc6/0x1a0
 do_syscall_64+0xa7/0x590
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x433899
Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 91 fd ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc2724db58 EFLAGS: 00000217 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000020006880 RCX: 0000000000433899
RDX: 00000000000000e0 RSI: 0000000020002480 RDI: 0000000000000003
RBP: 00000000006d7018 R08: 00000000004002f8 R09: 00000000004002f8
R10: 00000000004002f8 R11: 0000000000000217 R12: 0000000000000000
R13: 000000000040cd20 R14: 000000000040cdb0 R15: 0000000000000006

Allocated by task 269:
 kasan_kmalloc+0xa0/0xd0
 __kmalloc+0x1a9/0x510
 ib_uverbs_ex_create_flow+0x26c/0x1d00
 ib_uverbs_write+0x923/0x1010
 __vfs_write+0x10d/0x720
 vfs_write+0x1b0/0x550
 ksys_write+0xc6/0x1a0
 do_syscall_64+0xa7/0x590
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 0:
 __kasan_slab_free+0x12e/0x180
 kfree+0x159/0x630
 detach_buf+0x559/0x7a0
 virtqueue_get_buf_ctx+0x3cc/0xab0
 virtblk_done+0x1eb/0x3d0
 vring_interrupt+0x16d/0x2b0
 __handle_irq_event_percpu+0x10a/0x980
 handle_irq_event_percpu+0x77/0x190
 handle_irq_event+0xc6/0x1a0
 handle_edge_irq+0x211/0xd80
 handle_irq+0x3d/0x60
 do_IRQ+0x9b/0x220

The buggy address belongs to the object at ffff880068dff180
 which belongs to the cache kmalloc-64 of size 64
The buggy address is located 38 bytes inside of
 64-byte region [ffff880068dff180, ffff880068dff1c0)
The buggy address belongs to the page:
page:ffffea0001a37fc0 count:1 mapcount:0 mapping:ffff88006c401780 index:0x0
flags: 0x4000000000000100(slab)
raw: 4000000000000100 ffffea0001a31100 0000001100000011 ffff88006c401780
raw: 0000000000000000 00000000802a002a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff880068dff080: fb fb fb fb fc fc fc fc fb fb fb fb fb fb fb fb
 ffff880068dff100: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
>ffff880068dff180: 00 00 00 00 07 fc fc fc fc fc fc fc fb fb fb fb
                               ^
 ffff880068dff200: fb fb fb fb fc fc fc fc 00 00 00 00 00 00 fc fc
 ffff880068dff280: fc fc fc fc 00 00 00 00 00 00 00 00 fc fc fc fc
==================================================================

Cc: <stable@vger.kernel.org> # 3.12
Fixes: f88482743872 ("IB/core: clarify overflow/underflow checks on ib_create/destroy_flow")
Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/infiniband/core/uverbs_cmd.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 3a0bc4c1b17b..b6bca79fd48b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -3488,8 +3488,8 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 	struct ib_flow_attr		  *flow_attr;
 	struct ib_qp			  *qp;
 	struct ib_uflow_resources	  *uflow_res;
+	struct ib_uverbs_flow_spec_hdr	  *kern_spec;
 	int err = 0;
-	void *kern_spec;
 	void *ib_spec;
 	int i;
 
@@ -3538,8 +3538,8 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 		if (!kern_flow_attr)
 			return -ENOMEM;
 
-		memcpy(kern_flow_attr, &cmd.flow_attr, sizeof(*kern_flow_attr));
-		err = ib_copy_from_udata(kern_flow_attr + 1, ucore,
+		*kern_flow_attr = cmd.flow_attr;
+		err = ib_copy_from_udata(&kern_flow_attr->flow_specs, ucore,
 					 cmd.flow_attr.size);
 		if (err)
 			goto err_free_attr;
@@ -3589,21 +3589,22 @@ int ib_uverbs_ex_create_flow(struct ib_uverbs_file *file,
 	flow_attr->flags = kern_flow_attr->flags;
 	flow_attr->size = sizeof(*flow_attr);
 
-	kern_spec = kern_flow_attr + 1;
+	kern_spec = kern_flow_attr->flow_specs;
 	ib_spec = flow_attr + 1;
 	for (i = 0; i < flow_attr->num_of_specs &&
-	     cmd.flow_attr.size > offsetof(struct ib_uverbs_flow_spec, reserved) &&
-	     cmd.flow_attr.size >=
-	     ((struct ib_uverbs_flow_spec *)kern_spec)->size; i++) {
-		err = kern_spec_to_ib_spec(file->ucontext, kern_spec, ib_spec,
-					   uflow_res);
+			cmd.flow_attr.size > sizeof(*kern_spec) &&
+			cmd.flow_attr.size >= kern_spec->size;
+	     i++) {
+		err = kern_spec_to_ib_spec(
+				file->ucontext, (struct ib_uverbs_flow_spec *)kern_spec,
+				ib_spec, uflow_res);
 		if (err)
 			goto err_free;
 
 		flow_attr->size +=
 			((union ib_flow_spec *) ib_spec)->size;
-		cmd.flow_attr.size -= ((struct ib_uverbs_flow_spec *)kern_spec)->size;
-		kern_spec += ((struct ib_uverbs_flow_spec *) kern_spec)->size;
+		cmd.flow_attr.size -= kern_spec->size;
+		kern_spec = ((void *)kern_spec) + kern_spec->size;
 		ib_spec += ((union ib_flow_spec *) ib_spec)->size;
 	}
 	if (cmd.flow_attr.size || (i != flow_attr->num_of_specs)) {
-- 
2.14.4

^ permalink raw reply related

* [patch net-next 0/3] net: sched: couple of ndo_setup_tc fixes and adjustments
From: Jiri Pirko @ 2018-06-24  8:38 UTC (permalink / raw)
  To: netdev
  Cc: davem, jakub.kicinski, simon.horman, john.hurley,
	pieter.jansenvanvuuren, oss-drivers, michael.chan,
	intel-wired-lan, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

This patchset includes a couple of patches that fix or adjust default
cases and return values in ndo_setup_tc implementations in drivers.

Jiri Pirko (3):
  bnxt: simplify cls_flower command switch and handle default case
  nfp: handle cls_flower command default case
  cls_flower: fix error values for commands not supported by drivers

 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c        | 16 +++++-----------
 drivers/net/ethernet/intel/i40e/i40e_main.c         |  2 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c     |  2 +-
 drivers/net/ethernet/intel/igb/igb_main.c           |  2 +-
 drivers/net/ethernet/netronome/nfp/flower/offload.c |  4 ++--
 5 files changed, 10 insertions(+), 16 deletions(-)

-- 
2.14.4

^ permalink raw reply

* [patch net-next 1/3] bnxt: simplify cls_flower command switch and handle default case
From: Jiri Pirko @ 2018-06-24  8:38 UTC (permalink / raw)
  To: netdev
  Cc: davem, jakub.kicinski, simon.horman, john.hurley,
	pieter.jansenvanvuuren, oss-drivers, michael.chan,
	intel-wired-lan, mlxsw
In-Reply-To: <20180624083839.1692-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Currently the default case is not handled, which with future command
introductions would introduce a warning. So handle it and make the
switch a bit simpler by removing the unneeded "rc" variable.
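
For illustration, a minimal standalone sketch of the pattern (not the driver
code): every case returns directly, and the default case reports
-EOPNOTSUPP so that a command added to the enum later is rejected instead of
triggering a compiler warning. The enum values and helper names below are
made up for this example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command set; CMD_FUTURE stands in for a later addition. */
enum flower_cmd { CMD_REPLACE, CMD_DESTROY, CMD_STATS, CMD_FUTURE };

/* Stand-ins for the real per-command handlers. */
static int add_flow(void)  { return 0; }
static int del_flow(void)  { return 0; }
static int get_stats(void) { return 0; }

static int setup_flower(enum flower_cmd cmd)
{
	switch (cmd) {
	case CMD_REPLACE:
		return add_flow();
	case CMD_DESTROY:
		return del_flow();
	case CMD_STATS:
		return get_stats();
	default:
		/* Unknown or not-yet-supported command. */
		return -EOPNOTSUPP;
	}
}

/* Small self-check showing the expected behaviour. */
static void setup_flower_selfcheck(void)
{
	assert(setup_flower(CMD_REPLACE) == 0);
	assert(setup_flower(CMD_FUTURE) == -EOPNOTSUPP);
}
```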

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 795f45024c20..d0699f39ba34 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -1544,22 +1544,16 @@ void bnxt_tc_flow_stats_work(struct bnxt *bp)
 int bnxt_tc_setup_flower(struct bnxt *bp, u16 src_fid,
 			 struct tc_cls_flower_offload *cls_flower)
 {
-	int rc = 0;
-
 	switch (cls_flower->command) {
 	case TC_CLSFLOWER_REPLACE:
-		rc = bnxt_tc_add_flow(bp, src_fid, cls_flower);
-		break;
-
+		return bnxt_tc_add_flow(bp, src_fid, cls_flower);
 	case TC_CLSFLOWER_DESTROY:
-		rc = bnxt_tc_del_flow(bp, cls_flower);
-		break;
-
+		return bnxt_tc_del_flow(bp, cls_flower);
 	case TC_CLSFLOWER_STATS:
-		rc = bnxt_tc_get_flow_stats(bp, cls_flower);
-		break;
+		return bnxt_tc_get_flow_stats(bp, cls_flower);
+	default:
+		return -EOPNOTSUPP;
 	}
-	return rc;
 }
 
 static const struct rhashtable_params bnxt_tc_flow_ht_params = {
-- 
2.14.4

^ permalink raw reply related

* [patch net-next 2/3] nfp: handle cls_flower command default case
From: Jiri Pirko @ 2018-06-24  8:38 UTC (permalink / raw)
  To: netdev
  Cc: davem, jakub.kicinski, simon.horman, john.hurley,
	pieter.jansenvanvuuren, oss-drivers, michael.chan,
	intel-wired-lan, mlxsw
In-Reply-To: <20180624083839.1692-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

Currently the default case is not handled, which with future command
introductions would introduce a warning. So handle it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/netronome/nfp/flower/offload.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index c42e64f32333..c0e74aa4cb5e 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -576,9 +576,9 @@ nfp_flower_repr_offload(struct nfp_app *app, struct net_device *netdev,
 		return nfp_flower_del_offload(app, netdev, flower, egress);
 	case TC_CLSFLOWER_STATS:
 		return nfp_flower_get_stats(app, netdev, flower, egress);
+	default:
+		return -EOPNOTSUPP;
 	}
-
-	return -EOPNOTSUPP;
 }
 
 int nfp_flower_setup_tc_egress_cb(enum tc_setup_type type, void *type_data,
-- 
2.14.4

^ permalink raw reply related

* [patch net-next 3/3] cls_flower: fix error values for commands not supported by drivers
From: Jiri Pirko @ 2018-06-24  8:38 UTC (permalink / raw)
  To: netdev
  Cc: davem, jakub.kicinski, simon.horman, john.hurley,
	pieter.jansenvanvuuren, oss-drivers, michael.chan,
	intel-wired-lan, mlxsw
In-Reply-To: <20180624083839.1692-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@mellanox.com>

-EOPNOTSUPP is the error value that should be reported if a flower
command is not supported by a driver. Fix it in a couple of Intel drivers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c     | 2 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +-
 drivers/net/ethernet/intel/igb/igb_main.c       | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 95e9dfbe9839..7ad2b1b0b125 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -7522,7 +7522,7 @@ static int i40e_setup_tc_cls_flower(struct i40e_netdev_priv *np,
 	case TC_CLSFLOWER_STATS:
 		return -EOPNOTSUPP;
 	default:
-		return -EINVAL;
+		return -EOPNOTSUPP;
 	}
 }
 
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index a7b87f935411..dc56a8667495 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -2884,7 +2884,7 @@ static int i40evf_setup_tc_cls_flower(struct i40evf_adapter *adapter,
 	case TC_CLSFLOWER_STATS:
 		return -EOPNOTSUPP;
 	default:
-		return -EINVAL;
+		return -EOPNOTSUPP;
 	}
 }
 
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index f707709969ac..6a78d8272eb2 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2698,7 +2698,7 @@ static int igb_setup_tc_cls_flower(struct igb_adapter *adapter,
 	case TC_CLSFLOWER_STATS:
 		return -EOPNOTSUPP;
 	default:
-		return -EINVAL;
+		return -EOPNOTSUPP;
 	}
 }
 
-- 
2.14.4

^ permalink raw reply related

* Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
From: Peter Robinson @ 2018-06-24  9:24 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-arm-kernel, labbott
In-Reply-To: <7ff516fd-1d01-4d7a-1d5d-b58932c0c69d@gmail.com>

>> I'm seeing this netlink/sk_filter_trim_cap crash on ARMv7 across quite
>> a few ARMv7 platforms on Fedora with 4.18rc1. I've tested RPi2/RPi3
>> (doesn't happen on aarch64), AllWinner H3, BeagleBone and a few
>> others, both LPAE/normal kernels.
>>
>> I'm a bit out of my depth in this part of the kernel but I'm wondering
>> if it's known, I couldn't find anything that looked obvious on a few
>> mailing lists.
>>
>> Peter
>
> Hi Peter
>
> Could you provide symbolic information ?

I passed it through scripts/decode_stacktrace.sh; is that what you were after?

[    8.673880] Internal error: Oops: a06 [#10] SMP ARM
[    8.673949] ---[ end trace 049df4786ea3140a ]---
[    8.678754] Modules linked in:
[    8.678766] CPU: 1 PID: 206 Comm: systemd-udevd Tainted: G      D         4.18.0-0.rc1.git0.1.fc29.armv7hl+lpae #1
[    8.678769] Hardware name: Allwinner sun8i Family
[    8.678781] PC is at sk_filter_trim_cap ()
[    8.678790] LR is at   (null)
[    8.709463] pc : lr : psr: 60000013 ()
[    8.715722] sp : c996bd60  ip : 00000000  fp : 00000000
[    8.720939] r10: ee79dc00  r9 : c12c9f80  r8 : 00000000
[    8.726157] r7 : 00000000  r6 : 00000001  r5 : f1648000  r4 : 00000000
[    8.732674] r3 : 00000007  r2 : 00000000  r1 : 00000000  r0 : 00000000
[    8.739193] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[    8.746318] Control: 30c5387d  Table: 6e7bc880  DAC: ffe75ece
[    8.752055] Process systemd-udevd (pid: 206, stack limit = 0x(ptrval))
[    8.758574] Stack: (0xc996bd60 to 0xc996c000)
[    8.762929] bd60: 00000000 ee7ad0c0 006000c0 00000000 00000000 c0a64ab8 ee7ad240 ee7ad240
[    8.771098] bd80: ee7ad0c0 00000000 00000000 00000000 c12c9f80 c0abbb8c ef001a00 00000001
[    8.779267] bda0: ee722400 00000000 00000002 00000000 00000001 ee79dc64 c996bf70 00000002
[    8.787435] bdc0: ee7ad0c0 00000000 c996bf68 0000008b ee722400 00000008 00000000 c0abbc88
[    8.795604] bde0: 006000c0 00000000 00000000 00000002 00000002 c0abdfb0 006000c0 00000000
[    8.803772] be00: c98ce580 00000000 000000ce 00000000 00000000 00000000 c124ebf4 c996bf68
[    8.811941] be20: eead4c40 c996be58 00000040 00000000 eead4c40 00000000 00000000 c0a5d198
[    8.820110] be40: c996bf68 00000000 c996be58 c0a5d958 00000000 00000000 ee78c2c0 7fff0000
[    8.828278] be60: c996be90 c996beec ffff0000 000000a0 00000000 c05103ac bef897e4 00000028
[    8.836447] be80: 004ee0a8 00000063 00000000 004f3820 00000128 40000028 b6c9a548 00000000
[    8.844615] bea0: 0000000d 00000000 bef897b8 00000000 00000000 00000000 00000010 00000000
[    8.852784] bec0: 00000002 00000000 004f3820 00000000 c996bfb0 00000128 bef897b8 00000000
[    8.860953] bee0: 00000000 c0510450 00000000 00000000 c120eaa4 b6deca00 c996bfb0 30c5387d
[    8.869122] bf00: 004f38d8 bef89720 bef89728 c0434e94 00000000 c05e0290 ee4e6010 00000ff0
[    8.877291] bf20: ee4e6010 00000ff0 ee4e6000 00000000 00000000 c0506354 eead4c40 bef897b8
[    8.885460] bf40: 00000000 00000128 c0401324 c996a000 00000128 c0a5e6d4 00000000 00000000
[    8.893628] bf60: 00000000 fffffff7 c996beb8 0000000c 00000001 00000000 00000000 c996be88
[    8.901796] bf80: 00000000 c0429ac0 00000000 00000000 00000040 00000000 00000000 004f3820
[    8.909965] bfa0: bef897b8 c04012e8 00000000 004f3820 0000000d bef897b8 00000000 00000000
[    8.918134] bfc0: 00000000 004f3820 bef897b8 00000128 00000063 004eae70 004f4078 00000000
[    8.926302] bfe0: b6f60ad4 bef89780 b6da5780 b6c9a548 60000010 0000000d 00000000 00000000
[    8.934488] (sk_filter_trim_cap) from netlink_broadcast_filtered ()
[    8.943963] (netlink_broadcast_filtered) from netlink_broadcast ()
[    8.953174] (netlink_broadcast) from netlink_sendmsg ()
[    8.961608] (netlink_sendmsg) from sock_sendmsg ()
[    8.969432] (sock_sendmsg) from ___sys_sendmsg ()
[    8.977343] (___sys_sendmsg) from __sys_sendmsg ()
[    8.985170] (__sys_sendmsg) from __sys_trace_return ()
[    8.993247] Exception stack(0xc996bfa8 to 0xc996bff0)
[    8.998294] bfa0:                   00000000 004f3820 0000000d bef897b8 00000000 00000000
[    9.006463] bfc0: 00000000 004f3820 bef897b8 00000128 00000063 004eae70 004f4078 00000000
[    9.014629] bfe0: b6f60ad4 bef89780 b6da5780 b6c9a548
[    9.019680] Code: 1afffff7 e59c0000 e5830000 e3520000 (e584800c)
All code
========
   0:   1afffff7        .word   0x1afffff7
   4:   e59c0000        .word   0xe59c0000
   8:   e5830000        .word   0xe5830000
   c:   e3520000        .word   0xe3520000
  10:*  e584800c        .word   0xe584800c              <-- trapping instruction

Code starting with the faulting instruction
===========================================
   0:   e584800c        .word   0xe584800c
[    9.025823] ---[ end trace 049df4786ea3140b ]---

^ permalink raw reply

* Re: BUG: unable to handle kernel paging request in bpf_int_jit_compile
From: Ingo Molnar @ 2018-06-24 10:02 UTC (permalink / raw)
  To: David Miller
  Cc: tglx, syzbot+a4eb8c7766952a1ca872, ast, daniel, hpa, kuznet,
	linux-kernel, mingo, netdev, syzkaller-bugs, x86, yoshfuji,
	peterz
In-Reply-To: <20180624.161411.1560796210597132716.davem@davemloft.net>


* David Miller <davem@davemloft.net> wrote:

> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Sun, 24 Jun 2018 09:09:09 +0200 (CEST)
> 
> > I'm really tempted to make the BPF config switch depend on BROKEN. 
> 
> This really isn't necessary Thomas.
> 
> Whoever wrote the code didn't understand that set ro can legitimately
> fail.

No, that's *NOT* the only thing that happened, according to the Git history.

The first use of set_memory_ro() in include/linux/filter.h was added by
this commit almost four years ago:

  # 2014/09
  60a3b2253c41 ("net: bpf: make eBPF interpreter images read-only")

... and yes, that commit didn't anticipate the (in hindsight) obvious property of 
a function that changes global kernel mappings that if it is used after bootup 
without locking it 'may fail'. So that commit slipping through is 'shit happens' 
and I don't think we ever complained about such things slipping through.

But what happened after that is not so good:

A bit over two years later a crash was found:

    Eric and Willem reported that they recently saw random crashes when
    JIT was in use and bisected this to 74451e66d516 ("bpf: make jited
    programs visible in traces"). Issue was that the consolidation part
    added bpf_jit_binary_unlock_ro() that would unlock previously made
    read-only memory back to read-write.

... but instead of fixing it for real, it was only tinkered with:

  # 2017/02
  9d876e79df6a ("bpf: fix unlocking of jited image when module ronx not set")

... but the problems persisted:

    Improve bpf_{prog,jit_binary}_{un,}lock_ro() by throwing a
    one-time warning in case of an error when the image couldn't
    be set read-only, and also mark struct bpf_prog as locked when
    bpf_prog_lock_ro() was called.

... so the warnings Thomas complained about here were then added a month later:

  # 2017/03
  65869a47f348 ("bpf: improve read-only handling")

It 'improved' nothing of the sort, and the warnings and 'debug code' shows that
the author was aware that these functions could actually fail. To quote the fine
code, introduced a year ago:

                WARN_ON_ONCE(set_memory_rw((unsigned long)fp, fp->pages));
                /* In case set_memory_rw() fails, we want to be the first
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                 * to crash here instead of some random place later on.
                 */
                fp->locked = 0;

... and then, this month, it was tweaked *YET ANOTHER TIME*:

    bpf: reject any prog that failed read-only lock

    We currently lock any JITed image as read-only via bpf_jit_binary_lock_ro()
    as well as the BPF image as read-only through bpf_prog_lock_ro(). In
    the case any of these would fail we throw a WARN_ON_ONCE() in order to
    yell loudly to the log. Perhaps, to some extend, this may be comparable
    to an allocation where __GFP_NOWARN is explicitly not set.

  # 2018/06
  9facc336876f ("bpf: reject any prog that failed read-only lock")

The tone of uncertainty of the changelog, combined with the unfixed typo in it, 
suggests that this commit too was just waved through to upstream without any real 
review and without much design thinking behind it.

And yes, this was still not the right fix, as the fuzzer crash reported in this 
thread outlines - we'll probably need a 5th commit?

> So let's correct that instead of flaming a feature.

So accusing Thomas of 'flaming a feature' is a really unfair attack in light of 
all the details above.

Thanks,

	Ingo

^ permalink raw reply

* [PATCH] sfc: make function efx_rps_hash_bucket static
From: Colin King @ 2018-06-24 10:57 UTC (permalink / raw)
  To: Solarflare linux maintainers, Edward Cree, Bert Kenward,
	David S . Miller, netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

The function efx_rps_hash_bucket is local to the source and
does not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol 'efx_rps_hash_bucket' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/ethernet/sfc/efx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index ad4a354ce570..570ec72266f3 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -3180,6 +3180,7 @@ bool efx_rps_check_rule(struct efx_arfs_rule *rule, unsigned int filter_idx,
 	return true;
 }
 
+static
 struct hlist_head *efx_rps_hash_bucket(struct efx_nic *efx,
 				       const struct efx_filter_spec *spec)
 {
-- 
2.17.0

^ permalink raw reply related

* [PATCH net-next] strparser: Corrected typo in documentation.
From: Vakul Garg @ 2018-06-24 12:37 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, linux-doc, corbet, Vakul Garg

Replaced strp_pause() with strp_unpause() to correct what seems to be a
copy-paste mistake in the documentation.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
---
 Documentation/networking/strparser.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/networking/strparser.txt b/Documentation/networking/strparser.txt
index 13081b3decef..a7d354ddda7b 100644
--- a/Documentation/networking/strparser.txt
+++ b/Documentation/networking/strparser.txt
@@ -48,7 +48,7 @@ void strp_pause(struct strparser *strp)
      Temporarily pause a stream parser. Message parsing is suspended
      and no new messages are delivered to the upper layer.
 
-void strp_pause(struct strparser *strp)
+void strp_unpause(struct strparser *strp)
 
      Unpause a paused stream parser.
 
-- 
2.13.6

^ permalink raw reply related

* [PATCH net] strparser: Corrected typo in documentation.
From: Vakul Garg @ 2018-06-24 12:44 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-kernel, linux-doc, corbet, Vakul Garg

Replaced strp_pause() with strp_unpause() to correct what seems to be a
copy-paste mistake in the documentation.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
---
Resending for 'net' as advised.

 Documentation/networking/strparser.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/networking/strparser.txt b/Documentation/networking/strparser.txt
index 13081b3decef..a7d354ddda7b 100644
--- a/Documentation/networking/strparser.txt
+++ b/Documentation/networking/strparser.txt
@@ -48,7 +48,7 @@ void strp_pause(struct strparser *strp)
      Temporarily pause a stream parser. Message parsing is suspended
      and no new messages are delivered to the upper layer.
 
-void strp_pause(struct strparser *strp)
+void strp_unpause(struct strparser *strp)
 
      Unpause a paused stream parser.
 
-- 
2.13.6

^ permalink raw reply related

* Re: Route fallback issue
From: Erik Auerswald @ 2018-06-24 13:45 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: Grant Taylor, Akshat Kakkar, netdev, cronolog+lartc, lartc
In-Reply-To: <alpine.LFD.2.20.1806212218070.2159@ja.home.ssi.bg>

Hello Julian,

On Thu, Jun 21, 2018 at 10:57:14PM +0300, Julian Anastasov wrote:
> On Wed, 20 Jun 2018, Grant Taylor wrote:
> > On 06/20/2018 01:00 PM, Julian Anastasov wrote:
> > > You can also try alternative routes.
> > 
> > "Alternative routes"?  I can't say as I've heard that description as a
> > specific technique / feature / capability before.
> > 
> > Is that it's official name?
> 
> 	I think so
> 
> > Where can I find out more about it?
> 
> 	You can search on net. I have some old docs on
> these issues, they should be actual:
> 
> http://ja.ssi.bg/dgd-usage.txt

Thanks for that info!

Can you tell us which parts of the above text are actually implemented
in the upstream Linux kernel, and starting with which version(s)
(approximately)? The text describes ideas and patches from nearly two
decades ago; is more recent documentation available somewhere?

Thanks,
Erik
-- 
In the beginning, there was static routing.
                        -- RFC 1118

^ permalink raw reply

* [PATCH net-next] tcp: add SNMP counter for zero-window drops
From: Yafang Shao @ 2018-06-24 14:02 UTC (permalink / raw)
  To: davem; +Cc: netdev, Yafang Shao

It would be helpful to expose the number of drops caused by a zero window
or by insufficient window space.
So add a new SNMP MIB entry to track this behavior.
The entry is named LINUX_MIB_TCPZEROWINDOWDROP and is published in the
TcpExt line of /proc/net/netstat as TCPZeroWindowDrop.
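
For anyone wanting to read the counter from userspace, a minimal sketch
follows. It assumes the usual /proc/net/netstat layout (a "TcpExt:" line of
counter names followed by a "TcpExt:" line of values in the same order); the
netstat_lookup() helper is invented for this example, and the
TCPZeroWindowDrop column only exists on kernels carrying this patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find counter `name` in a names line and return the value found at the
 * same column of the matching values line. Both lines may keep their
 * leading "TcpExt:" token. Returns 0 on success, -1 if the name is absent.
 */
static int netstat_lookup(const char *names, const char *values,
			  const char *name, unsigned long long *out)
{
	size_t want;

	assert(names && values && name && out);
	want = strlen(name);

	for (;;) {
		size_t nlen, vlen;

		/* Skip separators, then measure the next token on each line. */
		names += strspn(names, " \n");
		values += strspn(values, " \n");
		nlen = strcspn(names, " \n");
		vlen = strcspn(values, " \n");
		if (nlen == 0 || vlen == 0)
			return -1;	/* ran out of columns */

		if (nlen == want && memcmp(names, name, want) == 0) {
			*out = strtoull(values, NULL, 10);
			return 0;
		}
		names += nlen;
		values += vlen;
	}
}
```

A caller would read the two consecutive TcpExt lines with fgets() and pass
them in unchanged.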

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/uapi/linux/snmp.h | 1 +
 net/ipv4/proc.c           | 1 +
 net/ipv4/tcp_input.c      | 8 ++++++--
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index 750d891..97517f3 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -279,6 +279,7 @@ enum
 	LINUX_MIB_TCPDELIVERED,			/* TCPDelivered */
 	LINUX_MIB_TCPDELIVEREDCE,		/* TCPDeliveredCE */
 	LINUX_MIB_TCPACKCOMPRESSED,		/* TCPAckCompressed */
+	LINUX_MIB_TCPZEROWINDOWDROP,		/* TCPZeroWindowDrop */
 	__LINUX_MIB_MAX
 };
 
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 77350c1..225ef34 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -287,6 +287,7 @@ static int sockstat_seq_show(struct seq_file *seq, void *v)
 	SNMP_MIB_ITEM("TCPDelivered", LINUX_MIB_TCPDELIVERED),
 	SNMP_MIB_ITEM("TCPDeliveredCE", LINUX_MIB_TCPDELIVEREDCE),
 	SNMP_MIB_ITEM("TCPAckCompressed", LINUX_MIB_TCPACKCOMPRESSED),
+	SNMP_MIB_ITEM("TCPZeroWindowDrop", LINUX_MIB_TCPZEROWINDOWDROP),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 76ca88f..9c5b341 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4668,8 +4668,10 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 	 *  Out of sequence packets to the out_of_order_queue.
 	 */
 	if (TCP_SKB_CB(skb)->seq == tp->rcv_nxt) {
-		if (tcp_receive_window(tp) == 0)
+		if (tcp_receive_window(tp) == 0) {
+			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPZEROWINDOWDROP);
 			goto out_of_window;
+		}
 
 		/* Ok. In sequence. In window. */
 queue_and_out:
@@ -4735,8 +4737,10 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 		/* If window is closed, drop tail of packet. But after
 		 * remembering D-SACK for its head made in previous line.
 		 */
-		if (!tcp_receive_window(tp))
+		if (!tcp_receive_window(tp)) {
+			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPZEROWINDOWDROP);
 			goto out_of_window;
+		}
 		goto queue_and_out;
 	}
 
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH] sfc: make function efx_rps_hash_bucket static
From: David Miller @ 2018-06-24 14:08 UTC (permalink / raw)
  To: colin.king
  Cc: linux-net-drivers, ecree, bkenward, netdev, kernel-janitors,
	linux-kernel
In-Reply-To: <20180624105731.5167-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Sun, 24 Jun 2018 11:57:31 +0100

> From: Colin Ian King <colin.king@canonical.com>
> 
> The function efx_rps_hash_bucket is local to the source and
> does not need to be in global scope, so make it static.
> 
> Cleans up sparse warning:
> symbol 'efx_rps_hash_bucket' was not declared. Should it be static?
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] tls: Removed unused variable
From: David Miller @ 2018-06-24 14:54 UTC (permalink / raw)
  To: vakul.garg; +Cc: netdev, linux-kernel, borisp, aviadye, davejwatson
In-Reply-To: <20180624200750.26192-1-vakul.garg@nxp.com>

From: Vakul Garg <vakul.garg@nxp.com>
Date: Mon, 25 Jun 2018 01:37:50 +0530

> Removed unused variable 'rxm' from tls_queue().
> 
> Signed-off-by: Vakul Garg <vakul.garg@nxp.com>

Applied.

^ permalink raw reply

* Charity donation in the amount of € 2.000.000,00 EUR
From: jcsilva @ 2018-06-24  9:41 UTC (permalink / raw)
  To: Recipients

Dear Friend,

I am Mr. Richard Wahl, the mega winner of $ 533M in the Mega Millions Jackpot. I am donating to 5 random people; if you receive this e-mail, your e-mail address was selected after a spin ball. I have distributed most of my fortune among a number of charities and organizations. I have voluntarily decided to donate the amount of € 2.000.000,00 EUR to you as one of the selected 5. To verify my winnings, see my YouTube page below.

WATCH ME HERE: https://www.youtube.com/watch?v=NejIUDafu3U

This is your donation code: [DF00430342018]

Reply with the donation code to this e-mail: richardwahldonations@housemail.com

I hope to make you and your family happy.

Regards
Mr. Richard Wahl

^ permalink raw reply

* [PATCH RFT] net: dsa: Allow configuring CPU port VLANs
From: Florian Fainelli @ 2018-06-24 15:33 UTC (permalink / raw)
  To: netdev
  Cc: petrm, jiri, ilias.apalodimas, Florian Fainelli, Andrew Lunn,
	Vivien Didelot, David S. Miller, open list

Up until now there was no way to specifically target the VLAN attributes and
membership of the CPU port of a DSA switch. This forced drivers to either
always have the CPU port be "VLAN tagged" (b53) in every VLAN that gets added
to the front-panel/user facing ports, or when the switch supports it, use an
"unmodified" semantic (mv88e6xxx).

This is less than ideal because there are cases where we might not even want to
have the CPU port be part of the same VLAN as its user facing ports, e.g. to
isolate a group of noisy stations. There are also cases where we want to
control exactly how the CPU port receives VLAN traffic such that proper
separation/identification can occur.

Make this possible by flagging events targeting an orig_dev which is a bridge
master and using that as a hint to mean that we want to configure the
CPU/management port. This is compatible with multiple bridges over the same
switch in that a user still has the responsibility to create separate
broadcast domains with separate VLAN databases/IDs per bridge.
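
As a rough standalone illustration of the resulting membership rule (not the
DSA code itself): the VLAN member set contains the targeted port plus any
inter-switch (DSA) ports, and the CPU port is included only when it is the
target. The enum, the 32-port limit and the vlan_members() helper are
assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical port roles for this sketch. */
enum port_type { PORT_USER, PORT_CPU, PORT_DSA };

/*
 * Build a bitmask of VLAN members: the targeted port plus every DSA
 * (inter-switch) port. The CPU port is no longer added implicitly.
 */
static uint32_t vlan_members(const enum port_type *ports,
			     unsigned int num_ports, unsigned int target)
{
	uint32_t members = 0;
	unsigned int port;

	assert(ports != 0 && num_ports <= 32 && target < num_ports);

	members |= UINT32_C(1) << target;
	for (port = 0; port < num_ports; port++)
		if (ports[port] == PORT_DSA)
			members |= UINT32_C(1) << port;

	return members;
}
```

With ports {user, user, cpu, dsa}, targeting port 0 yields bits 0 and 3 only;
the CPU port (bit 2) appears only when the event targets it.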

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
Andrew, Vivien,

Could you test this on mv88e6xxx to make sure there is no regression? Thanks

 net/dsa/port.c   | 4 ++--
 net/dsa/switch.c | 5 ++++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/dsa/port.c b/net/dsa/port.c
index ed0595459df1..37385e491117 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -253,7 +253,7 @@ int dsa_port_vlan_add(struct dsa_port *dp,
 	};
 
 	if (netif_is_bridge_master(vlan->obj.orig_dev))
-		return -EOPNOTSUPP;
+		info.port = dp->cpu_dp->index;
 
 	if (br_vlan_enabled(dp->bridge_dev))
 		return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_ADD, &info);
@@ -271,7 +271,7 @@ int dsa_port_vlan_del(struct dsa_port *dp,
 	};
 
 	if (netif_is_bridge_master(vlan->obj.orig_dev))
-		return -EOPNOTSUPP;
+		info.port = dp->cpu_dp->index;
 
 	if (br_vlan_enabled(dp->bridge_dev))
 		return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info);
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index b93511726069..d69bcc8f9ba2 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -211,8 +211,11 @@ static int dsa_switch_vlan_add(struct dsa_switch *ds,
 	bitmap_zero(members, ds->num_ports);
 	if (ds->index == info->sw_index)
 		set_bit(info->port, members);
+	/* CPU port is configured via dsa_port_vlan_add() with events
+	 * targeting the bridge device
+	 */
 	for (port = 0; port < ds->num_ports; port++)
-		if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
+		if (dsa_is_dsa_port(ds, port))
 			set_bit(port, members);
 
 	if (switchdev_trans_ph_prepare(trans))
-- 
2.14.1
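
The membership change in dsa_switch_vlan_add() can be pictured with a small
userspace model. This is only a sketch under stated assumptions: the enum,
the function name vlan_members() and the old_behaviour flag are hypothetical
and do not exist in the kernel. It shows that before the patch the CPU port
was added to every VLAN implicitly, while after the patch only inter-switch
DSA links are added implicitly and the CPU port becomes a member only when an
event targets it (which the port.c hunks arrange for bridge-master events).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical port roles for a toy switch model (not the kernel's types). */
enum port_role { PORT_USER, PORT_CPU, PORT_DSA };

/*
 * Build the VLAN member mask for an event that targets `target_port`.
 * Bit N set means port N is a member. With `old_behaviour` non-zero the CPU
 * port is always added (pre-patch model); otherwise only DSA links are added
 * implicitly and the CPU port must be targeted explicitly (post-patch model).
 */
uint32_t vlan_members(const enum port_role *roles, unsigned int num_ports,
		      unsigned int target_port, int old_behaviour)
{
	uint32_t members = 0;
	unsigned int port;

	assert(num_ports <= 32 && target_port < num_ports);

	members |= UINT32_C(1) << target_port;
	for (port = 0; port < num_ports; port++) {
		if (roles[port] == PORT_DSA ||
		    (old_behaviour && roles[port] == PORT_CPU))
			members |= UINT32_C(1) << port;
	}
	return members;
}
```

With roles { USER, USER, DSA, CPU }, an event on port 0 yields ports 0, 2 and
3 in the old model but only ports 0 and 2 in the new one; an event that
targets port 3 is what brings the CPU port into the VLAN.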

^ permalink raw reply related

* Re: [PATCH v2 net-next] net/sched: add skbprio scheduler
From: Jamal Hadi Salim @ 2018-06-24 15:43 UTC (permalink / raw)
  To: Nishanth Devarajan, xiyou.wangcong, jiri, davem
  Cc: netdev, doucette, michel, alexander.duyck
In-Reply-To: <20180623204745.GA4337@gmail.com>

On 23/06/18 04:47 PM, Nishanth Devarajan wrote:
[..]

> +	/* Drop the packet at the tail of the lowest priority qdisc. */
> +	lp_qdisc = &q->qdiscs[lp];
> +	to_drop = __skb_dequeue_tail(lp_qdisc);
> +	BUG_ON(!to_drop);
> +	qdisc_qstats_backlog_dec(sch, to_drop);
> +	qdisc_drop(to_drop, sch, to_free);
> +

Maybe also increase overlimit stat here? It will keep track
of low prio things dropped because you were congested.
Such a stat helps when debugging or collecting analytics.

Per Alex's comment, how about:

-----------
Skbprio (SKB Priority Queue) is a queueing discipline that
prioritizes packets according to their skb->priority field.
Under congestion, already-enqueued lower priority packets
will be dropped to make space available for higher priority
packets. Skbprio was conceived as a solution for
denial-of-service defenses that need to route packets with
different priorities as a means to overcome DoS attacks
as described in paper xxxx...


cheers,
jamal
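
The drop policy described above, together with the suggested overlimit
counter, can be sketched as a toy userspace model. Everything here is an
assumption made for illustration: struct prio_queue, prio_enqueue() and
NUM_PRIOS are hypothetical names, only packet counts are tracked, and the
real qdisc may account its statistics differently.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PRIOS 4	/* toy value; not taken from the patch */

/* Toy model: only per-priority packet counts are tracked, not real packets. */
struct prio_queue {
	size_t len[NUM_PRIOS];	/* packets queued at each priority */
	size_t total;		/* packets queued overall */
	size_t limit;		/* maximum packets held at once */
	size_t overlimits;	/* queued packets evicted under congestion */
};

/*
 * Enqueue one packet of priority `prio` (higher value = more important).
 * Returns 1 if the packet was queued, 0 if it was dropped on arrival.
 */
int prio_enqueue(struct prio_queue *q, unsigned int prio)
{
	unsigned int lp;

	assert(prio < NUM_PRIOS);

	if (q->total < q->limit) {
		q->len[prio]++;
		q->total++;
		return 1;
	}

	/* Queue is full: find the lowest priority that still holds packets. */
	for (lp = 0; lp < NUM_PRIOS && q->len[lp] == 0; lp++)
		;
	if (lp == NUM_PRIOS || prio <= lp)
		return 0;	/* nothing less important to evict */

	/* Evict one packet from the lowest band and count the event. */
	q->len[lp]--;
	q->overlimits++;
	q->len[prio]++;
	return 1;
}
```

The total stays unchanged on eviction because one packet leaves and one
enters; the overlimits counter is the piece that would make such evictions
visible when debugging.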

^ permalink raw reply

* Re: [RFC] net: Add new LoRaWAN subsystem
From: Andreas Färber @ 2018-06-24 15:49 UTC (permalink / raw)
  To: Jian-Hong Pan
  Cc: Marcel Holtmann, David S. Miller, Alexander Aring, Stefan Schmidt,
	linux-wpan - ML, netdev, linux-kernel
In-Reply-To: <CAC=mGzj-GNQ0Xz3PodDHyypNtFsOEfBdNBnNkysW2ZZ8Wwuijg@mail.gmail.com>

Hi Jian-Hong Pan,

On 13.05.2018 at 04:42, Jian-Hong Pan wrote:
> Hi Jiri and Marcel,
> 
> 2018-05-11 23:39 GMT+08:00 Marcel Holtmann <marcel@holtmann.org>:
>> Hi Jian-Hong,
>>
>>> A Low-Power Wide-Area Network (LPWAN) is a type of wireless
>>> telecommunication wide area network designed to allow long range
>>> communications at a low bit rate among things (connected objects), such
>>> as sensors operated on a battery.  It can be used widely in IoT area.
>>> LoRaWAN, which is one kind of implementation of LPWAN, is a medium
>>> access control (MAC) layer protocol for managing communication between
>>> LPWAN gateways and end-node devices, maintained by the LoRa Alliance.
>>> LoRaWAN™ Specification could be downloaded at:
>>> https://lora-alliance.org/lorawan-for-developers
>>>
>>> However, LoRaWAN is not implemented in Linux kernel right now, so I am
>>> trying to develop it.  Here is my repository:
>>> https://github.com/starnight/LoRa/tree/lorawan-ndo/LoRaWAN
>>>
>>> Because it is a kind of network, the ideal usage in an user space
>>> program should be like "socket(PF_LORAWAN, SOCK_DGRAM, 0)" and with
>>> other socket APIs.  Therefore, the definitions like AF_LORAWAN,
>>> PF_LORAWAN ..., must be listed in the header files of glibc.
>>> For the driver in kernel space, the definitions also must be listed in
>>> the corresponding Linux socket header files.
>>> Especially, both are for the testing programs.
>>>
>>> Back to the mentioned "LoRaWAN is not implemented in Linux kernel now".
>>> Could or should we add the definitions into corresponding kernel header
>>> files now, if LoRaWAN will be accepted as a subsystem in Linux?
>>
>> when you submit your LoRaWAN subsystem to netdev for review, include a patch that adds these new address family definitions. Just pick the next one available. There will be no pre-allocation of numbers until your work has been accepted upstream. Meaning, that the number might change if other address families get merged before yours. So you have to keep updating. glibc will eventually follow the number assigned by the kernel.
> 
> Thanks for your guidance.  I will follow the steps.

I have been working on a similar thing on and off since proposing it at
FOSDEM 2017:

At https://github.com/afaerber/lora-modules you will find my proof of
concept of PF_LORA with SOCK_DGRAM and stub drivers for various modules.
My idea was to layer LoRaWAN on top of LoRa later.

The way I have developed this was to simply reuse numbers unused in our
distro kernel and build my modules against the distro kernel, to avoid
frequent reboots and full kernel builds.

Not having looked at your code yet, do you think our implementations are
fairly independent at this point, or do you see conflicts apart from
number allocation? Like, I am currently using lora0 as name - are you
planning to use lorawan0 or rather something more generic like lpwan0?
We might place your code in net/lora/lorawan/ and mine in net/lora/?

More problematic would be the actual device drivers, where some devices
would support both modes - some with soft MAC, others with full MAC. Do
you have any ideas how to handle that in a sane way?

Please keep me CC'ed on any follow-ups.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply

* [PATCH net-next 0/3] r8169: improve PHY initialization and WoL handling
From: Heiner Kallweit @ 2018-06-24 16:35 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org

Series with smaller improvements regarding PHY initialization and
WoL handling.

Heiner Kallweit (3):
  r8169: improve phy initialization when resuming
  r8169: improve saved_wolopts handling
  r8169: don't check WoL when powering down PHY and interface is down

 drivers/net/ethernet/realtek/r8169.c | 40 +++++++++++-----------------
 1 file changed, 15 insertions(+), 25 deletions(-)

-- 
2.18.0

^ permalink raw reply

* [PATCH net-next 1/3] r8169: improve phy initialization when resuming
From: Heiner Kallweit @ 2018-06-24 16:37 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <f8a65bb4-aed5-7264-1e1c-b8c76ee79717@gmail.com>

Let's move calling rtl8169_init_phy() to __rtl8169_resume().
It simplifies the code and avoids rtl8169_init_phy() being called
when resuming whilst interface is down. rtl_open() will initialize
the PHY when the interface is brought up.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 44715958..480fb141 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7334,6 +7334,7 @@ static void __rtl8169_resume(struct net_device *dev)
 	netif_device_attach(dev);
 
 	rtl_pll_power_up(tp);
+	rtl8169_init_phy(dev, tp);
 
 	rtl_lock_work(tp);
 	napi_enable(&tp->napi);
@@ -7347,9 +7348,6 @@ static int rtl8169_resume(struct device *device)
 {
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
-	struct rtl8169_private *tp = netdev_priv(dev);
-
-	rtl8169_init_phy(dev, tp);
 
 	if (netif_running(dev))
 		__rtl8169_resume(dev);
@@ -7397,8 +7395,6 @@ static int rtl8169_runtime_resume(struct device *device)
 	tp->saved_wolopts = 0;
 	rtl_unlock_work(tp);
 
-	rtl8169_init_phy(dev, tp);
-
 	__rtl8169_resume(dev);
 
 	return 0;
-- 
2.18.0

^ permalink raw reply related

* [PATCH net-next 2/3] r8169: improve saved_wolopts handling
From: Heiner Kallweit @ 2018-06-24 16:39 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <f8a65bb4-aed5-7264-1e1c-b8c76ee79717@gmail.com>

Let's make saved_wolopts a shadow copy of the WoL options. This allows
us to simplify the code and get rid of calls to the now unneeded function
__rtl8169_get_wol(). However, don't remove __rtl8169_get_wol()
completely, to be prepared for the case that we can respect BIOS WOL
settings again.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 34 ++++++++++++----------------
 1 file changed, 14 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 480fb141..f8a1309a 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1587,6 +1587,12 @@ static void rtl8169_check_link_status(struct net_device *dev,
 
 #define WAKE_ANY (WAKE_PHY | WAKE_MAGIC | WAKE_UCAST | WAKE_BCAST | WAKE_MCAST)
 
+/* Currently we only enable WoL if explicitly told by userspace to circumvent
+ * issues on certain platforms, see commit bde135a672bf ("r8169: only enable
+ * PCI wakeups when WOL is active"). Let's keep __rtl8169_get_wol() for the
+ * case that we want to respect BIOS settings again.
+ */
+#if 0
 static u32 __rtl8169_get_wol(struct rtl8169_private *tp)
 {
 	u8 options;
@@ -1621,25 +1627,16 @@ static u32 __rtl8169_get_wol(struct rtl8169_private *tp)
 
 	return wolopts;
 }
+#endif
 
 static void rtl8169_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
-	struct device *d = tp_to_dev(tp);
-
-	pm_runtime_get_noresume(d);
 
 	rtl_lock_work(tp);
-
 	wol->supported = WAKE_ANY;
-	if (pm_runtime_active(d))
-		wol->wolopts = __rtl8169_get_wol(tp);
-	else
-		wol->wolopts = tp->saved_wolopts;
-
+	wol->wolopts = tp->saved_wolopts;
 	rtl_unlock_work(tp);
-
-	pm_runtime_put_noidle(d);
 }
 
 static void __rtl8169_set_wol(struct rtl8169_private *tp, u32 wolopts)
@@ -1719,14 +1716,14 @@ static int rtl8169_set_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
 
 	rtl_lock_work(tp);
 
+	tp->saved_wolopts = wol->wolopts & WAKE_ANY;
+
 	if (pm_runtime_active(d))
-		__rtl8169_set_wol(tp, wol->wolopts);
-	else
-		tp->saved_wolopts = wol->wolopts;
+		__rtl8169_set_wol(tp, tp->saved_wolopts);
 
 	rtl_unlock_work(tp);
 
-	device_set_wakeup_enable(d, wol->wolopts);
+	device_set_wakeup_enable(d, tp->saved_wolopts);
 
 	pm_runtime_put_noidle(d);
 
@@ -4638,7 +4635,7 @@ static void rtl_wol_suspend_quirk(struct rtl8169_private *tp)
 
 static bool rtl_wol_pll_power_down(struct rtl8169_private *tp)
 {
-	if (!(__rtl8169_get_wol(tp) & WAKE_ANY))
+	if (!tp->saved_wolopts)
 		return false;
 
 	rtl_speed_down(tp);
@@ -7219,7 +7216,6 @@ static int rtl_open(struct net_device *dev)
 
 	rtl_unlock_work(tp);
 
-	tp->saved_wolopts = 0;
 	pm_runtime_put_sync(&pdev->dev);
 
 	rtl8169_check_link_status(dev, tp);
@@ -7367,7 +7363,6 @@ static int rtl8169_runtime_suspend(struct device *device)
 	}
 
 	rtl_lock_work(tp);
-	tp->saved_wolopts = __rtl8169_get_wol(tp);
 	__rtl8169_set_wol(tp, WAKE_ANY);
 	rtl_unlock_work(tp);
 
@@ -7392,7 +7387,6 @@ static int rtl8169_runtime_resume(struct device *device)
 
 	rtl_lock_work(tp);
 	__rtl8169_set_wol(tp, tp->saved_wolopts);
-	tp->saved_wolopts = 0;
 	rtl_unlock_work(tp);
 
 	__rtl8169_resume(dev);
@@ -7462,7 +7456,7 @@ static void rtl_shutdown(struct pci_dev *pdev)
 	rtl8169_hw_reset(tp);
 
 	if (system_state == SYSTEM_POWER_OFF) {
-		if (__rtl8169_get_wol(tp) & WAKE_ANY) {
+		if (tp->saved_wolopts) {
 			rtl_wol_suspend_quirk(tp);
 			rtl_wol_shutdown_quirk(tp);
 		}
-- 
2.18.0
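
The shadow-copy idea can be modelled in userspace as follows. This is a
sketch, not driver code: struct wol_dev, the wol_*() functions and the SK_*
macros are hypothetical stand-ins, hw_wolopts merely represents whatever the
chip registers hold, and the bit values are assumed to mirror the ethtool
WAKE_* flags. It shows that the getter never needs to touch the hardware and
that a setting made while runtime-suspended is applied on resume.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the ethtool WAKE_* bits, so the sketch builds anywhere. */
#define SK_WAKE_PHY	(1u << 0)
#define SK_WAKE_UCAST	(1u << 1)
#define SK_WAKE_MCAST	(1u << 2)
#define SK_WAKE_BCAST	(1u << 3)
#define SK_WAKE_MAGIC	(1u << 5)
#define SK_WAKE_ANY	(SK_WAKE_PHY | SK_WAKE_MAGIC | SK_WAKE_UCAST | \
			 SK_WAKE_BCAST | SK_WAKE_MCAST)

/* Toy device state: `hw_wolopts` stands in for the chip's registers. */
struct wol_dev {
	uint32_t saved_wolopts;	/* shadow copy, always valid */
	uint32_t hw_wolopts;	/* what the hardware is programmed with */
	int runtime_active;	/* 0 while runtime-suspended */
};

/* Record the request first, then program the hardware only if it is awake. */
void wol_set(struct wol_dev *d, uint32_t wolopts)
{
	d->saved_wolopts = wolopts & SK_WAKE_ANY;
	if (d->runtime_active)
		d->hw_wolopts = d->saved_wolopts;
}

/* The getter reads the shadow copy only. */
uint32_t wol_get(const struct wol_dev *d)
{
	return d->saved_wolopts;
}

void wol_runtime_suspend(struct wol_dev *d)
{
	assert(d->runtime_active);
	d->hw_wolopts = SK_WAKE_ANY;	/* wake on anything while suspended */
	d->runtime_active = 0;
}

void wol_runtime_resume(struct wol_dev *d)
{
	d->hw_wolopts = d->saved_wolopts;	/* restore the user's choice */
	d->runtime_active = 1;
}
```

Masking with SK_WAKE_ANY in wol_set() corresponds to the
"wol->wolopts & WAKE_ANY" line in the patch: unsupported bits never reach the
shadow copy, so a plain non-zero test on it is enough elsewhere.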

^ permalink raw reply related

* [PATCH net-next 3/3] r8169: don't check WoL when powering down PHY and interface is down
From: Heiner Kallweit @ 2018-06-24 16:40 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <f8a65bb4-aed5-7264-1e1c-b8c76ee79717@gmail.com>

We can power down the PHY regardless of WOL settings if the interface
is down. So far we would have left the PHY enabled if WOL options
are set and the interface is brought down.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index f8a1309a..1d33672c 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -4635,7 +4635,7 @@ static void rtl_wol_suspend_quirk(struct rtl8169_private *tp)
 
 static bool rtl_wol_pll_power_down(struct rtl8169_private *tp)
 {
-	if (!tp->saved_wolopts)
+	if (!netif_running(tp->dev) || !tp->saved_wolopts)
 		return false;
 
 	rtl_speed_down(tp);
-- 
2.18.0

^ permalink raw reply related

* Re: [V2] brcmfmac: stop watchdog before detach and free everything
From: Kalle Valo @ 2018-06-24 16:58 UTC (permalink / raw)
  To: Michael Trimarchi
  Cc: Arend van Spriel, Franky Lin, Hante Meuleman, Chi-Hsien Lin,
	Wright Feng, David S. Miller, Pieter-Paul Giesberts, Ian Molton,
	linux-wireless, brcm80211-dev-list.pdl, brcm80211-dev-list,
	netdev, linux-kernel
In-Reply-To: <20180530090633.GA15390@panicking>

Michael Trimarchi <michael@amarulasolutions.com> wrote:

> Using the driver built into the kernel image without firmware in the filesystem
> or in the kernel image can lead to a kernel NULL pointer dereference.
> The watchdog needs to be stopped in brcmf_sdio_remove.
> 
> The system is going down NOW!
> [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual address 000002f8
> Sent SIGTERM to all processes
> [ 1348.121412] Mem abort info:
> [ 1348.126962]   ESR = 0x96000004
> [ 1348.130023]   Exception class = DABT (current EL), IL = 32 bits
> [ 1348.135948]   SET = 0, FnV = 0
> [ 1348.138997]   EA = 0, S1PTW = 0
> [ 1348.142154] Data abort info:
> [ 1348.145045]   ISV = 0, ISS = 0x00000004
> [ 1348.148884]   CM = 0, WnR = 0
> [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____)
> [ 1348.158475] [00000000000002f8] pgd=0000000000000000
> [ 1348.163364] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [ 1348.168927] Modules linked in: ipv6
> [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted 4.17.0-rc5-next-20180517 #18
> [ 1348.180757] Hardware name: Amarula A64-Relic (DT)
> [ 1348.185455] pstate: 60000005 (nZCv daif -PAN -UAO)
> [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20
> [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290
> [ 1348.200253] sp : ffff00000b85be30
> [ 1348.203561] x29: ffff00000b85be30 x28: 0000000000000000
> [ 1348.208868] x27: ffff00000b6cb918 x26: ffff80003b990638
> [ 1348.214176] x25: ffff0000087b1a20 x24: ffff80003b94f800
> [ 1348.219483] x23: ffff000008e620c8 x22: ffff000008f0b660
> [ 1348.224790] x21: ffff000008c6a858 x20: 00000000fffffe00
> [ 1348.230097] x19: ffff80003b94f800 x18: 0000000000000001
> [ 1348.235404] x17: 0000ffffab2e8a74 x16: ffff0000080d7de8
> [ 1348.240711] x15: 0000000000000000 x14: 0000000000000400
> [ 1348.246018] x13: 0000000000000400 x12: 0000000000000001
> [ 1348.251324] x11: 00000000000002c4 x10: 0000000000000a10
> [ 1348.256631] x9 : ffff00000b85bc40 x8 : ffff80003be11870
> [ 1348.261937] x7 : ffff80003dfc7308 x6 : 000000078ff08b55
> [ 1348.267243] x5 : 00000139e1058400 x4 : 0000000000000000
> [ 1348.272550] x3 : dead000000000100 x2 : 958f2788d6618100
> [ 1348.277856] x1 : 00000000fffffe00 x0 : 0000000000000000
> 
> Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
> Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>

Patch applied to wireless-drivers.git, thanks.

373c83a801f1 brcmfmac: stop watchdog before detach and free everything

-- 
https://patchwork.kernel.org/patch/10437931/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH rdma-next 09/12] RDMA/mlx5: Fix shift overflow in mlx5_ib_create_wq
From: Jason Gunthorpe @ 2018-06-24 19:56 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Hadar Hen Zion,
	Matan Barak, Michael J Ruhl, Noa Osherovich, Raed Salem,
	Yishai Hadas, Saeed Mahameed, linux-netdev
In-Reply-To: <20180624082353.16138-10-leon@kernel.org>

On Sun, Jun 24, 2018 at 11:23:50AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> [   61.182439] UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/qp.c:5366:34
> [   61.183673] shift exponent 4294967288 is too large for 32-bit type 'unsigned int'
> [   61.185530] CPU: 0 PID: 639 Comm: qp Not tainted 4.18.0-rc1-00037-g4aa1d69a9c60-dirty #96
> [   61.186981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
> [   61.188315] Call Trace:
> [   61.188661]  dump_stack+0xc7/0x13b
> [   61.190427]  ubsan_epilogue+0x9/0x49
> [   61.190899]  __ubsan_handle_shift_out_of_bounds+0x1ea/0x22f
> [   61.197040]  mlx5_ib_create_wq+0x1c99/0x1d50
> [   61.206632]  ib_uverbs_ex_create_wq+0x499/0x820
> [   61.213892]  ib_uverbs_write+0x77e/0xae0
> [   61.248018]  vfs_write+0x121/0x3b0
> [   61.249831]  ksys_write+0xa1/0x120
> [   61.254024]  do_syscall_64+0x7c/0x2a0
> [   61.256178]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   61.259211] RIP: 0033:0x7f54bab70e99
> [   61.262125] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89
> [   61.268678] RSP: 002b:00007ffe1541c318 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [   61.271076] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f54bab70e99
> [   61.273795] RDX: 0000000000000070 RSI: 0000000020000240 RDI: 0000000000000003
> [   61.276982] RBP: 00007ffe1541c330 R08: 00000000200078e0 R09: 0000000000000002
> [   61.280035] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004005c0
> [   61.283279] R13: 00007ffe1541c420 R14: 0000000000000000 R15: 0000000000000000
> 
> Cc: <stable@vger.kernel.org> # 4.7
> Fixes: 79b20a6c3014 ("IB/mlx5: Add receive Work Queue verbs")
> Cc: syzkaller <syzkaller@googlegroups.com>
> Reported-by: Noa Osherovich <noaos@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
>  drivers/infiniband/hw/mlx5/qp.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
> index 6034a670859f..8e40263fd40e 100644
> +++ b/drivers/infiniband/hw/mlx5/qp.c
> @@ -5377,7 +5377,11 @@ static int set_user_rq_size(struct mlx5_ib_dev *dev,
>  
>  	rwq->wqe_count = ucmd->rq_wqe_count;
>  	rwq->wqe_shift = ucmd->rq_wqe_shift;
> -	rwq->buf_size = (rwq->wqe_count << rwq->wqe_shift);
> +	rwq->buf_size =
> +		shift_overflow((size_t)rwq->wqe_count, (size_t)rwq->wqe_shift);

The casts are redundant, the function argument is already size_t so
implicit promotion is guaranteed.

Jason
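
For illustration, an overflow-checked shift of the kind discussed here could
look like the sketch below. The helper name checked_shl() and its signature
are hypothetical and are not the function used in the patch. It rejects
shift counts that are at least as wide as size_t (undefined behaviour in C,
which is what UBSAN reported) and shifts that would lose set bits. It also
shows the point about the casts: 32-bit arguments passed to size_t parameters
are converted implicitly.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute value << shift into *out.
 * Returns false, leaving *out untouched, if the shift count is not smaller
 * than the width of size_t or if set bits would be shifted out.
 */
bool checked_shl(size_t value, size_t shift, size_t *out)
{
	const size_t width = sizeof(size_t) * CHAR_BIT;

	assert(out != NULL);

	if (shift >= width)
		return false;	/* shifting by >= width is undefined in C */
	if (value > (SIZE_MAX >> shift))
		return false;	/* high bits would be lost */

	*out = value << shift;
	return true;
}
```

A caller holding uint32_t fields can write checked_shl(count, shift, &size)
directly; no explicit (size_t) casts are needed at the call site.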

^ permalink raw reply

* Re: [PATCH rdma-next 06/12] RDMA/uverbs: Don't overwrite NULL pointer with ZERO_SIZE_PTR
From: Jason Gunthorpe @ 2018-06-24 19:57 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Hadar Hen Zion,
	Matan Barak, Michael J Ruhl, Noa Osherovich, Raed Salem,
	Yishai Hadas, Saeed Mahameed, linux-netdev
In-Reply-To: <20180624082353.16138-7-leon@kernel.org>

On Sun, Jun 24, 2018 at 11:23:47AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> The number of specs is provided by the user and in a valid case can be equal
> to zero. Such an argument causes a call to kcalloc() with a zero-length
> request, which returns ZERO_SIZE_PTR. This pointer is different from NULL
> and makes various if (..) checks succeed.

This one seems really weird. There is nothing wrong with ZERO_SIZE_PTR,
but this description and fix suggest that something did

ptr = kalloc(0);
ptr[0] = ...;

Which is not allowed of course. Doesn't this mean there is also a
missing range check someplace?

Jason
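
The concern can be made concrete with a userspace model. This is a sketch
under assumptions: model_kcalloc(), model_kfree(), set_first() and
MODEL_ZERO_SIZE_PTR are hypothetical stand-ins for the kernel allocator's
behaviour, chosen so the example builds outside the kernel. It shows that a
zero-length allocation yields a non-NULL sentinel, so a NULL check alone
passes, and that the element count is what has to be range-checked before
indexing.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel's ZERO_SIZE_PTR: non-NULL, never dereferenceable. */
#define MODEL_ZERO_SIZE_PTR ((void *)16)

/* Hypothetical model of kcalloc(): a zero-length request yields the sentinel. */
void *model_kcalloc(size_t n, size_t size)
{
	if (n == 0 || size == 0)
		return MODEL_ZERO_SIZE_PTR;
	return calloc(n, size);
}

/* Freeing the sentinel is a no-op, as it was never really allocated. */
void model_kfree(void *p)
{
	if (p != MODEL_ZERO_SIZE_PTR)
		free(p);
}

/*
 * Store `value` in the first slot. The NULL test alone would pass for the
 * sentinel, so the element count must be checked before indexing.
 * Returns 0 on success, -1 if there is no slot to write.
 */
int set_first(int *arr, size_t n, int value)
{
	if (arr == NULL || n == 0)
		return -1;
	arr[0] = value;
	return 0;
}
```

In this model the fix belongs in the n == 0 check of set_first(), not in
replacing the sentinel with NULL, which is the direction the question above
points to.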

^ permalink raw reply

