Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next] net: mscc: ocelot: remove set but not used variable 'phy_mode'
From: David Miller @ 2018-10-08 18:04 UTC (permalink / raw)
  To: yuehaibing; +Cc: alexandre.belloni, netdev, kernel-janitors
In-Reply-To: <1539007670-82488-1-git-send-email-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Mon, 8 Oct 2018 14:07:50 +0000

> Fixes gcc '-Wunused-but-set-variable' warning:
> 
> drivers/net/ethernet/mscc/ocelot_board.c: In function 'mscc_ocelot_probe':
> drivers/net/ethernet/mscc/ocelot_board.c:262:17: warning:
>  variable 'phy_mode' set but not used [-Wunused-but-set-variable]
>    enum phy_mode phy_mode;
> 
> It never used since introduction in 
> commit 71e32a20cfbf ("net: mscc: ocelot: make use of SerDes PHYs for handling their configuration")
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH v2 net-next 11/23] rtnetlink: Update rtnl_stats_dump for strict data checking
From: David Miller @ 2018-10-08 18:02 UTC (permalink / raw)
  To: dsahern; +Cc: christian, dsahern, netdev, jbenc, stephen
In-Reply-To: <ddf7efb8-e7d6-fb6a-7b85-fe3c94819488@gmail.com>

From: David Ahern <dsahern@gmail.com>
Date: Mon, 8 Oct 2018 07:25:34 -0600

> On 10/8/18 4:17 AM, Christian Brauner wrote:
>>> @@ -4696,13 +4697,32 @@ static int rtnl_stats_dump(struct sk_buff *skb, struct netlink_callback *cb)
>>>  
>>>  	cb->seq = net->dev_base_seq;
>>>  
>>> -	if (nlmsg_len(cb->nlh) < sizeof(*ifsm))
>>> +	if (nlmsg_len(cb->nlh) < sizeof(*ifsm)) {
>>> +		NL_SET_ERR_MSG(extack, "Invalid header for stats dump");
>>>  		return -EINVAL;
>>> +	}
>>>  
>>>  	ifsm = nlmsg_data(cb->nlh);
>>> +
>>> +	/* only requests using NLM_F_DUMP_PROPER_HDR can pass data to
>> 
>> That looks like an accidental leftover before we changed this to a
>> socket option. :)
>> 
> 
> ugh. thanks for noticing.

David, I applied this series, please send me relative fixups at this point
if necessary.

Thanks.

^ permalink raw reply

* skb length without fragments
From: pradeep kumar nalla @ 2018-10-08 18:02 UTC (permalink / raw)
  To: netdev

Hi

While testing my network driver with pktgen I could see an skb greater
than 16K without fragments in xmit function. This lead to a fix in my
driver that assumes when an SKB whose length is greater than 16K will
come with fragments. Apart from pktgen what are the chances or
possibilities of getting an SKB greater than 16K without fragments? .
When I tried with tools like iperf/iper3/netperf, didn’t see a single
incidence where the SKB length is greater than 16K and without frags.
Even socket layer I see function alloc_skb_with_frags, does this mean
all the larger packets come with frags.

Please pardon if it is not a correct mailing list to ask above question.

Thanks
Pradeep.

^ permalink raw reply

* Re: [PATCH V1 net 0/5] minor bug fixes for ENA Ethernet driver
From: David Miller @ 2018-10-08 18:01 UTC (permalink / raw)
  To: nafea
  Cc: akiyano, netdev, dwmw, zorik, matua, saeedb, msw, aliguori,
	gtzalik, netanel, alisaidi
In-Reply-To: <7BF68A50-AD04-4FA7-91ED-F3F6412E2E2E@amazon.com>

From: "Bshara, Nafea" <nafea@amazon.com>
Date: Mon, 8 Oct 2018 12:47:50 +0000

> Ship it

Anything but... see my feedback.

^ permalink raw reply

* Re: [PATCH net-next 0/3] selftests: add more PMTU tests
From: David Miller @ 2018-10-08 18:00 UTC (permalink / raw)
  To: sd; +Cc: netdev, sbrivio
In-Reply-To: <cover.1539001627.git.sd@queasysnail.net>

From: Sabrina Dubroca <sd@queasysnail.net>
Date: Mon,  8 Oct 2018 14:37:02 +0200

> The current selftests for PMTU cover VTI tunnels, but there's nothing
> about the generation and handling of PMTU exceptions by intermediate
> routers. This series adds and improves existing helpers, then adds
> IPv4 and IPv6 selftests with a setup involving an intermediate router.
> 
> Joint work with Stefano Brivio.

This is really great stuff to see.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net 1/2] net: ipv4: update fnhe_pmtu when first hop's MTU changes
From: David Miller @ 2018-10-08 17:58 UTC (permalink / raw)
  To: sd; +Cc: netdev, sbrivio
In-Reply-To: <bdb0235df165b1a9684670be3839962c80c9b45a.1539000663.git.sd@queasysnail.net>

From: Sabrina Dubroca <sd@queasysnail.net>
Date: Mon,  8 Oct 2018 14:36:38 +0200

> Since commit 5aad1de5ea2c ("ipv4: use separate genid for next hop
> exceptions"), exceptions get deprecated separately from cached
> routes. In particular, administrative changes don't clear PMTU anymore.
> 
> As Stefano described in commit e9fa1495d738 ("ipv6: Reflect MTU changes
> on PMTU of exceptions for MTU-less routes"), the PMTU discovered before
> the local MTU change can become stale:
>  - if the local MTU is now lower than the PMTU, that PMTU is now
>    incorrect
>  - if the local MTU was the lowest value in the path, and is increased,
>    we might discover a higher PMTU
> 
> Similarly to what commit e9fa1495d738 did for IPv6, update PMTU in those
> cases.
> 
> If the exception was locked, discovered the PMTU was smaller than the
> minimal accepted PMTU. In that case, if the new local MTU is smaller
> than the current PMTU, let PMTU discovery figure out if locking of the
> exception is still needed.
> 
> To do this, we need to know the old link MTU in the NETDEV_CHANGEMTU
> notifier. By the time the notifier is called, dev->mtu has been
> changed. This patch adds the old MTU as additional information in the
> notifier structure, and a new call_netdevice_notifiers_u32() function.
> 
> Fixes: 5aad1de5ea2c ("ipv4: use separate genid for next hop exceptions")
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> Reviewed-by: Stefano Brivio <sbrivio@redhat.com>

Please, when you respin this to address David Ahern's feedback, provide
a proper "0/N" header posting.

Thank you.

^ permalink raw reply

* Re: [PATCH V1 net 2/5] net: ena: fix warning in rmmod caused by double iounmap
From: David Miller @ 2018-10-08 17:56 UTC (permalink / raw)
  To: akiyano
  Cc: netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea, gtzalik,
	netanel, alisaidi
In-Reply-To: <1539001724-25980-3-git-send-email-akiyano@amazon.com>

From: <akiyano@amazon.com>
Date: Mon, 8 Oct 2018 15:28:41 +0300

> From: Arthur Kiyanovski <akiyano@amazon.com>
> 
> Memory mapped with devm_ioremap is automatically freed when the driver
> is disconnected from the device. Therefore there is no need to
> explicitly call devm_iounmap.
> 
> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
> ---
>  drivers/net/ethernet/amazon/ena/ena_netdev.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> index 25621a2..23f2dda 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> @@ -3099,15 +3099,7 @@ static int ena_rss_init_default(struct ena_adapter *adapter)
>  
>  static void ena_release_bars(struct ena_com_dev *ena_dev, struct pci_dev *pdev)
>  {
> -	int release_bars;
> -
> -	if (ena_dev->mem_bar)
> -		devm_iounmap(&pdev->dev, ena_dev->mem_bar);
> -
> -	if (ena_dev->reg_bar)
> -		devm_iounmap(&pdev->dev, ena_dev->reg_bar);
> -
> -	release_bars = pci_select_bars(pdev, IORESOURCE_MEM) & ENA_BAR_MASK;
> +	int release_bars = pci_select_bars(pdev, IORESOURCE_MEM) & ENA_BAR_MASK;
>  	pci_release_selected_regions(pdev, release_bars);

Always have an empty line between local variable declarations and actual
function code.

^ permalink raw reply

* Re: [PATCH V1 net 0/5] minor bug fixes for ENA Ethernet driver
From: David Miller @ 2018-10-08 17:56 UTC (permalink / raw)
  To: akiyano
  Cc: netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea, gtzalik,
	netanel, alisaidi
In-Reply-To: <1539001724-25980-1-git-send-email-akiyano@amazon.com>

From: <akiyano@amazon.com>
Date: Mon, 8 Oct 2018 15:28:39 +0300

> From: Arthur Kiyanovski <akiyano@amazon.com>
> 
> Arthur Kiyanovski (5):
>   net: ena: fix indentations in ena_defs for better readability
>   net: ena: fix warning in rmmod caused by double iounmap
>   net: ena: fix rare bug when failed restart/resume is followed by
>     driver removal
>   net: ena: fix NULL dereference due to untimely napi initialization
>   net: ena: fix auto casting to boolean

Indentation fixes are not appropriate for a series of bug fixes.  Separate
those out from the real legitimate bug fixes into a series for net-next.

^ permalink raw reply

* Re: [PATCH] bpf: btf: Fix a missing check bug
From: valdis.kletnieks @ 2018-10-09  1:07 UTC (permalink / raw)
  To: Song Liu
  Cc: wang6495, kjlu, Alexei Starovoitov, Daniel Borkmann, Networking,
	open list
In-Reply-To: <CAPhsuW5nPGMOEMKa1UFwgquoUeYiUV8i8nVgWfM6McH2Ri1esg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1406 bytes --]

On Mon, 08 Oct 2018 17:44:46 -0700, Song Liu said:

> I think I get the security concept here. However, hdr_len here is only used to
> copy the whole header into kernel space, and it is not used in other
> logic at all.
> I cannot image any security flaw with either hdr_len > btf->hdr->hdr_len case or
> hdr_len < btf->hdr->hdr_len. Could you please provide more insights on what
> would break by malicious user space?

Say the biggest allowed value for hdr_len is 128.  We check the value, the user has 98.
They then stuff 16,383 into there.

Now here's the problem - hdr_len is a local variable, and evaporates when the function
returns.  From here on out, anybody who cares about the header length will use the
value in btf->hdr_len....

(And yes, somebody *does* care about the length, otherwise we wouldn't need a field
saying what the length was....)

Now think how many ways that can go pear-shaped.  You copied in 98 bytes, but outside
the function, they think that header is almost 4 pages long.  Does that ever get used as
a length for kmemcpy()?  Or a limit for a 'for (i=start; i< (start+hdr->hdr_len); i++)' that
walks across a variable length header?

Can you cook up a way to have a good chance to oops the kernel when it walks off the
page you allocated the 98 bytes on?  Can you use it to export chunks of memory out to
userspace?  Lots and lots of ways for this to kersplat a kernel...;

[-- Attachment #2: Type: application/pgp-signature, Size: 486 bytes --]

^ permalink raw reply

* Re: [PATCH] dt-bindings: Add bindings for aliases node
From: Brian Norris @ 2018-10-09  1:07 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: Rob Herring, Mark Rutland, devicetree, linux-kernel,
	linux-wireless, linux-spi, netdev, Stephen Boyd, Florian Fainelli
In-Reply-To: <20180925210255.172734-1-mka@chromium.org>

+ linux-spi, linux-wireless, netdev
+ others from previous conversations

Hi,

On Tue, Sep 25, 2018 at 02:02:55PM -0700, Matthias Kaehlcke wrote:
> Add a global binding for the 'aliases' node. This includes an initial list
> of standardized alias names for some hardware components that are commonly
> found in 'aliases'.
> 
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> ---
>  Documentation/devicetree/bindings/aliases.txt | 47 +++++++++++++++++++
>  1 file changed, 47 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/aliases.txt
> 
> diff --git a/Documentation/devicetree/bindings/aliases.txt b/Documentation/devicetree/bindings/aliases.txt
> new file mode 100644
> index 000000000000..d64ed1c7eb34
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/aliases.txt
> @@ -0,0 +1,47 @@
> +The aliases node
> +----------------

I like the idea in general, and it might be good to note (e.g., commit
message) that this was inspired by this thread:

https://lore.kernel.org/lkml/20180815221601.GB24830@rob-hp-laptop/

where we were interested in firmware-to-device-tree path stability --
and the answer was basically: don't memorize paths, just use aliases
instead. But then, it was clear that aliases were not documented very
formally at all.

So here we are!

> +
> +The aliases node contains properties that represent aliases to device tree
> +nodes. The name of the property is the alias name, the value is the path of
> +a the device tree node that corresponds to the alias. The path may be
> +specified as a string or a phandle.
> +
> +Alias names are often suffixed with a numeric ID, especially when there may
> +be multiple instances of the same type. The ID typically corresponds to the
> +hardware layout, it may also be used by drivers for a stable mapping of
> +device names and hardware entities.
> +
> +Alias names
> +-----------
> +
> +The devicetree specification doesn't require the use of specific alias
> +names to refer to hardware entities of a given type, however the Linux
> +kernel aims for a certain level of consistency.
> +
> +The following standardized alias names shall be used for their
> +corresponding hardware components:
> +
> +  bluetoothN		Bluetooth controller
> +  ethernetN		Ethernet interface
> +  gpioN			GPIO controller
> +  i2cN			i2c bus
> +  mmcN			MMC bus
> +  rtcN			Real time clock
> +  serialN		UART port
> +  spiN			SPI bus
> +  wifiN			Wireless network interface

For the network-device-related names (bluetooth, ethernet, and wifi), I
think there's a clear documented reason for this (supporting MAC address
plumbing from a DT-aware bootloader). I'm not quite as sure about all
the others, and unfortunately, I'm aware of at least one subsystem owner
that explicitly does NOT like the aliases usage that is currently
supported (spi), and shot down a patch where I tried to use it in a DTS
file (despite its regular usage in many other DTS files).

So I guess I'm saying: perhaps we should get buy-in from various
subsystems before we include them? So maybe it's wiser to start
small(er) and only add once we're sure they are useful? Or perhaps Rob
has other thoughts.

> +
> +The above list is not exhaustive and will be extended over time. Please
> +send patches to devicetree@vger.kernel.org if you think a hardware
> +component and its alias name should be on the list.
> +
> +Example
> +-------
> +
> +aliases {
> +	bluetooth0 = "/soc/serial@fdf01000/bluetooth";
> +	rtc0 = &rtc0;
> +	wifi0 = &wlcore;
> +};
> +
> +(based on arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dts)

What is the relevance of this line? This doesn't look anything like that
hikey DTS. Maybe the "based on" line should just be removed? The example
seems fine though.

Anyway, perhaps with a trimmed list of supported alias names:

Reviewed-by: Brian Norris <briannorris@chromium.org>

^ permalink raw reply

* Re: BUG: corrupted list in p9_read_work
From: syzbot @ 2018-10-09  1:07 UTC (permalink / raw)
  To: asmadeus, davem, ericvh, linux-kernel, lucho, netdev, rminnich,
	syzkaller-bugs, v9fs-developer
In-Reply-To: <000000000000ca61cd0571178677@google.com>

syzbot has found a reproducer for the following crash on:

HEAD commit:    0854ba5ff5c9 Merge git://git.kernel.org/pub/scm/linux/kern..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1514ec06400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=88e9a8a39dc0be2d
dashboard link: https://syzkaller.appspot.com/bug?extid=2222c34dc40b515f30dc
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=10b91685400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+2222c34dc40b515f30dc@syzkaller.appspotmail.com

FS-Cache: N-cookie d=000000000a092700 n=00000000d8ee0022
FS-Cache: N-key=[10] '34323935303034313132'
list_del corruption, ffff88019ae36ee8->next is LIST_POISON1  
(dead000000000100)
------------[ cut here ]------------
kobject: '9p-11043': free name
kernel BUG at lib/list_debug.c:47!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 2686 Comm: kworker/1:2 Not tainted 4.19.0-rc7+ #274
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
kobject: '9p-11049' (0000000096206f7a): kobject_add_internal:  
parent: 'bdi', set: 'devices'
Workqueue: events p9_read_work
RIP: 0010:__list_del_entry_valid.cold.1+0x26/0x58 lib/list_debug.c:45
Code: d7 fd 0f 0b 4c 89 e2 48 89 de 48 c7 c7 40 92 40 88 e8 7a a2 d7 fd 0f  
0b 4c 89 ea 48 89 de 48 c7 c7 e0 91 40 88 e8 66 a2 d7 fd <0f> 0b 48 89 de  
48 c7 c7 00 93 40 88 e8 55 a2 d7 fd 0f 0b 48 89 de
RSP: 0018:ffff8801cc5975b8 EFLAGS: 00010282
kobject: '9p-11049' (0000000096206f7a): kobject_uevent_env
RAX: 000000000000004e RBX: ffff88019ae36ee8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff81650405 RDI: 0000000000000005
RBP: ffff8801cc5975d0 R08: ffff8801cc58a4c0 R09: ffffed003b5e4fe8
R10: ffffed003b5e4fe8 R11: ffff8801daf27f47 R12: dead000000000200
R13: dead000000000100 R14: ffff8801c8931050 R15: ffff8801c8931010
FS:  0000000000000000(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fef196c1000 CR3: 00000001ccff7000 CR4: 00000000001406e0
Call Trace:
kobject: '9p-11049' (0000000096206f7a): fill_kobj_path: path  
= '/devices/virtual/bdi/9p-11049'
  __list_del_entry include/linux/list.h:117 [inline]
  list_del include/linux/list.h:125 [inline]
  p9_read_work+0xab6/0x10e0 net/9p/trans_fd.c:379
kobject: 'loop4' (00000000513f3e2f): kobject_uevent_env
FS-Cache: Duplicate cookie detected
  process_one_work+0xc90/0x1b90 kernel/workqueue.c:2153
FS-Cache: O-cookie c=00000000911358e4 [p=000000006545c95d fl=222 nc=0 na=1]
FS-Cache: O-cookie d=000000000a092700 n=000000007635356b
FS-Cache: O-key=[10] '
34
32
39
35
30
30
34
31
32
  worker_thread+0x17f/0x1390 kernel/workqueue.c:2296
36
'
FS-Cache: N-cookie c=00000000abaeee81 [p=000000006545c95d fl=2 nc=0 na=1]
FS-Cache: N-cookie d=000000000a092700 n=00000000ee16a363
FS-Cache: N-key=[10] '
34
32
39
35
30
30
34
  kthread+0x35a/0x420 kernel/kthread.c:246
31
32
36
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413
'
Modules linked in:
---[ end trace 41e06641f5c3c814 ]---
kobject: '9p-11050' (000000002a096aa2): kobject_add_internal:  
parent: 'bdi', set: 'devices'
RIP: 0010:__list_del_entry_valid.cold.1+0x26/0x58 lib/list_debug.c:45
Code: d7 fd 0f 0b 4c 89 e2 48 89 de 48 c7 c7 40 92 40 88 e8 7a a2 d7 fd 0f  
0b 4c 89 ea 48 89 de 48 c7 c7 e0 91 40 88 e8 66 a2 d7 fd <0f> 0b 48 89 de  
48 c7 c7 00 93 40 88 e8 55 a2 d7 fd 0f 0b 48 89 de
RSP: 0018:ffff8801cc5975b8 EFLAGS: 00010282
RAX: 000000000000004e RBX: ffff88019ae36ee8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff81650405 RDI: 0000000000000005
RBP: ffff8801cc5975d0 R08: ffff8801cc58a4c0 R09: ffffed003b5e4fe8
R10: ffffed003b5e4fe8 R11: ffff8801daf27f47 R12: dead000000000200
R13: dead000000000100 R14: ffff8801c8931050 R15: ffff8801c8931010
FS:  0000000000000000(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fef196c1000 CR3: 00000001ccff7000 CR4: 00000000001406e0

^ permalink raw reply

* [PATCH bpf] xsk: do not call synchronize_net() under RCU read lock
From: Björn Töpel @ 2018-10-08 17:40 UTC (permalink / raw)
  To: ast, daniel, netdev, eric.dumazet
  Cc: Björn Töpel, magnus.karlsson, magnus.karlsson
In-Reply-To: <CAJ+HfNhCR-OY3hvLeqyt--1m9vTOPGh8PGGjmrVnh6yPPgKtsA@mail.gmail.com>

From: Björn Töpel <bjorn.topel@intel.com>

The XSKMAP update and delete functions called synchronize_net(), which
can sleep. It is not allowed to sleep during an RCU read section.

Instead we need to make sure that the sock sk_destruct (xsk_destruct)
function is asynchronously called after an RCU grace period. Setting
the SOCK_RCU_FREE flag for XDP sockets takes care of this.

Fixes: fbfc504a24f5 ("bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 kernel/bpf/xskmap.c | 10 ++--------
 net/xdp/xsk.c       |  2 ++
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/kernel/bpf/xskmap.c b/kernel/bpf/xskmap.c
index 9f8463afda9c..47147c9e184d 100644
--- a/kernel/bpf/xskmap.c
+++ b/kernel/bpf/xskmap.c
@@ -192,11 +192,8 @@ static int xsk_map_update_elem(struct bpf_map *map, void *key, void *value,
 	sock_hold(sock->sk);
 
 	old_xs = xchg(&m->xsk_map[i], xs);
-	if (old_xs) {
-		/* Make sure we've flushed everything. */
-		synchronize_net();
+	if (old_xs)
 		sock_put((struct sock *)old_xs);
-	}
 
 	sockfd_put(sock);
 	return 0;
@@ -212,11 +209,8 @@ static int xsk_map_delete_elem(struct bpf_map *map, void *key)
 		return -EINVAL;
 
 	old_xs = xchg(&m->xsk_map[k], NULL);
-	if (old_xs) {
-		/* Make sure we've flushed everything. */
-		synchronize_net();
+	if (old_xs)
 		sock_put((struct sock *)old_xs);
-	}
 
 	return 0;
 }
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 0577cd49aa72..07156f43d295 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -754,6 +754,8 @@ static int xsk_create(struct net *net, struct socket *sock, int protocol,
 	sk->sk_destruct = xsk_destruct;
 	sk_refcnt_debug_inc(sk);
 
+	sock_set_flag(sk, SOCK_RCU_FREE);
+
 	xs = xdp_sk(sk);
 	mutex_init(&xs->mutex);
 	spin_lock_init(&xs->tx_completion_lock);
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH v2 net-next 00/23] rtnetlink: Add support for rigid checking of data in dump request
From: David Miller @ 2018-10-08 17:40 UTC (permalink / raw)
  To: christian; +Cc: dsahern, netdev, jbenc, stephen, dsahern
In-Reply-To: <20181008110412.43o5qgaaqvsf2znw@brauner.io>

From: Christian Brauner <christian@brauner.io>
Date: Mon, 8 Oct 2018 13:04:13 +0200

> On Sun, Oct 07, 2018 at 08:16:21PM -0700, David Ahern wrote:
>> From: David Ahern <dsahern@gmail.com>
>> 
>> There are many use cases where a user wants to influence what is
>> returned in a dump for some rtnetlink command: one is wanting data
>> for a different namespace than the one the request is received and
>> another is limiting the amount of data returned in the dump to a
>> specific set of interest to userspace, reducing the cpu overhead of
>> both kernel and userspace. Unfortunately, the kernel has historically
>> not been strict with checking for the proper header or checking the
>> values passed in the header. This lenient implementation has allowed
>> iproute2 and other packages to pass any struct or data in the dump
>> request as long as the family is the first byte. For example, ifinfomsg
>> struct is used by iproute2 for all generic dump requests - links,
>> addresses, routes and rules when it is really only valid for link
>> requests.
>> 
>> There is 1 is example where the kernel deals with the wrong struct: link
>> dumps after VF support was added. Older iproute2 was sending rtgenmsg as
>> the header instead of ifinfomsg so a patch was added to try and detect
>> old userspace vs new:
>> e5eca6d41f53 ("rtnetlink: fix userspace API breakage for iproute2 < v3.9.0")
>> 
>> The latest example is Christian's patch set wanting to return addresses for
>> a target namespace. It guesses the header struct is an ifaddrmsg and if it
>> guesses wrong a netlink warning is generated in the kernel log on every
>> address dump which is unacceptable.
>> 
>> Another example where the kernel is a bit lenient is route dumps: iproute2
>> can send either a request with either ifinfomsg or a rtmsg as the header
>> struct, yet the kernel always treats the header as an rtmsg (see
>> inet_dump_fib and rtm_flags check). The header inconsistency impacts the
>> ability to add kernel side filters for route dumps - a necessary feature
>> for scale setups with 100k+ routes.
>> 
>> How to resolve the problem of not breaking old userspace yet be able to
>> move forward with new features such as kernel side filtering which are
>> crucial for efficient operation at high scale?
>> 
>> This patch set addresses the problem by adding a new socket flag,
>> NETLINK_DUMP_STRICT_CHK, that userspace can use with setsockopt to
>> request strict checking of headers and attributes on dump requests and
>> hence unlock the ability to use kernel side filters as they are added.
 ...
> At this point it's all nits so it's got my ACK but keener eyes than mine
> might see other issues.
> 
> Acked-by: Christian Brauner <christian@brauner.io>

Series applied, thanks everyone.

Please be on the lookout for userspace regressions from this patch set.

Thanks.

^ permalink raw reply

* Re: [RFC PATCH 05/11] net: ethernet: ti: cpsw: add support for port interface mode selection phy
From: Andrew Lunn @ 2018-10-09  0:50 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: David S. Miller, netdev, Tony Lindgren, Rob Herring,
	Kishon Vijay Abraham I, Sekhar Nori, linux-kernel, linux-omap,
	devicetree
In-Reply-To: <20181008234949.15416-6-grygorii.strashko@ti.com>

>  	/* Configure GMII_SEL register */
> -	cpsw_phy_sel(cpsw->dev, slave->phy->interface, slave->slave_num);
> +	if (!IS_ERR(slave->data->ifphy))
> +		phy_set_netif_mode(slave->data->ifphy, slave->data->phy_if);

Is slave->data->phy_if also passed to phy_connect()? So you are going
to end up with both the MAC and the PHY inserting RGMII delays, and it
not working.

You need to somehow decide if the MAC is going to do the delay, or the
PHY. But not both.

	Andrew

^ permalink raw reply

* Re: [PATCH net-next 3/3] selftests: pmtu: add basic IPv4 and IPv6 PMTU tests
From: David Ahern @ 2018-10-08 17:36 UTC (permalink / raw)
  To: Sabrina Dubroca, netdev; +Cc: Stefano Brivio
In-Reply-To: <634b29aaf3b4228e307e84f41c5e0e66dc6c2915.1539001627.git.sd@queasysnail.net>

On 10/8/18 6:37 AM, Sabrina Dubroca wrote:
> Commit d1f1b9cbf34c ("selftests: net: Introduce first PMTU test") and
> follow-ups introduced some PMTU tests, but they all rely on tunneling,
> and, particularly, on VTI.
> 
> These new tests use simple routing to exercise the generation and
> update of PMTU exceptions in IPv4 and IPv6.
> 
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  tools/testing/selftests/net/pmtu.sh | 207 +++++++++++++++++++++++++++-
>  1 file changed, 203 insertions(+), 4 deletions(-)
> 

Thanks for adding more pmtu tests.

Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next 2/3] selftests: pmtu: extend MTU parsing helper to locked MTU
From: David Ahern @ 2018-10-08 17:35 UTC (permalink / raw)
  To: Sabrina Dubroca, netdev; +Cc: Stefano Brivio
In-Reply-To: <99a5ccbf8aa2b8dae9955d14a8562f4de6ad2152.1539001627.git.sd@queasysnail.net>

On 10/8/18 6:37 AM, Sabrina Dubroca wrote:
> The mtu_parse helper introduced in commit f2c929feeccd ("selftests:
> pmtu: Factor out MTU parsing helper") can only handle "mtu 1234", but
> not "mtu lock 1234". Extend it, so that we can do IPv4 tests with PMTU
> smaller than net.ipv4.route.min_pmtu
> 
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> ---
>  tools/testing/selftests/net/pmtu.sh | 2 ++
>  1 file changed, 2 insertions(+)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next 1/3] selftests: pmtu: Introduce check_pmtu_value()
From: David Ahern @ 2018-10-08 17:35 UTC (permalink / raw)
  To: Sabrina Dubroca, netdev; +Cc: Stefano Brivio
In-Reply-To: <01bd5e1c237ba61f38f816b2361276fc4b565ccc.1539001627.git.sd@queasysnail.net>

On 10/8/18 6:37 AM, Sabrina Dubroca wrote:
> From: Stefano Brivio <sbrivio@redhat.com>
> 
> Introduce and use a function that checks PMTU values against
> expected values and logs error messages, to remove some clutter.
> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
> ---
>  tools/testing/selftests/net/pmtu.sh | 49 +++++++++++++----------------
>  1 file changed, 22 insertions(+), 27 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v2 00/12] net: sched: cls_u32 Various improvements
From: David Miller @ 2018-10-08 17:33 UTC (permalink / raw)
  To: jhs; +Cc: xiyou.wangcong, jiri, netdev, viro, hadi
In-Reply-To: <20181008102244.22212-1-jhs@emojatatu.com>

From: Jamal Hadi Salim <jhs@mojatatu.com>
Date: Mon,  8 Oct 2018 06:22:32 -0400

> Various improvements from Al.
> 
> Changes from version 1: Add missing commit 

Series applied.

^ permalink raw reply

* Re: [PATCH] bpf: btf: Fix a missing check bug
From: Song Liu @ 2018-10-09  0:44 UTC (permalink / raw)
  To: valdis.kletnieks
  Cc: wang6495, kjlu, Alexei Starovoitov, Daniel Borkmann, Networking,
	open list
In-Reply-To: <43898.1539033428@turing-police.cc.vt.edu>

On Mon, Oct 8, 2018 at 2:17 PM <valdis.kletnieks@vt.edu> wrote:
>
> On Mon, 08 Oct 2018 13:51:09 -0700, Song Liu said:
> > On Sun, Oct 7, 2018 at 1:26 PM Wenwen Wang <wang6495@umn.edu> wrote:
>
> > > same value. Given that the btf data is in the user space, a malicious user
> > > can race to change the data between the two copies. By doing so, the user
> > > can provide malicious data to the kernel and cause undefined behavior.
>
> > These two numbers are copied from same memory location, right? So I
> > think this check is not necessary?
>
> Security researchers call this a TOCTOU bug - Time of Check - Time of Use.
>
> What can happen:
>
> 1) We fetch the value  (say we get 90) from userspace and stash it in hdr_len.
>
> 2) We do some other stuff like check the hdr_len isn't too big, etc..
>
>         meanwhile, on another CPU running another thread of the process...
>                 3) malicious code stuffs a 117 into that field
>
> 4) We fetch the entire header, incliding a now-changed hdr_len  (now 117) and
> stick it in btf->hdr->hdr_len.
>
> 5) Any code that assumes that hdr_len and btf->hdr->hdr_len are the same value
> explodes in interesting ways.

I think I get the security concept here. However, hdr_len here is only used to
copy the whole header into kernel space, and it is not used in other
logic at all.
I cannot image any security flaw with either hdr_len > btf->hdr->hdr_len case or
hdr_len < btf->hdr->hdr_len. Could you please provide more insights on what
would break by malicious user space?

Thanks,
Song

^ permalink raw reply

* Re: [PATCH net-next] netdev: remove useless codes of tun_automq_select_queue
From: David Miller @ 2018-10-08 17:27 UTC (permalink / raw)
  To: wangli39; +Cc: netdev
In-Reply-To: <20181008085112.56290-1-wangli39@baidu.com>

From: Wang Li <wangli39@baidu.com>
Date: Mon,  8 Oct 2018 16:51:12 +0800

> @@ -1045,15 +1039,12 @@ static void tun_automq_xmit(struct tun_struct *tun, struct sk_buff *skb)
>  		 * RPS hash and save it into the flow_table here.
>  		 */
>  		__u32 rxhash;
> +		struct tun_flow_entry *e;

Please always order local variable declarations from longest to
shortest line.

Thank you.

^ permalink raw reply

* Re: [RFC PATCH 03/11] phy: ti: introduce phy-gmii-sel driver
From: Andrew Lunn @ 2018-10-09  0:39 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: David S. Miller, netdev, Tony Lindgren, Rob Herring,
	Kishon Vijay Abraham I, Sekhar Nori, linux-kernel, linux-omap,
	devicetree
In-Reply-To: <20181008234949.15416-4-grygorii.strashko@ti.com>

On Mon, Oct 08, 2018 at 06:49:41PM -0500, Grygorii Strashko wrote:
> +static int phy_gmii_sel_mode(struct phy *phy, phy_interface_t intf_mode)
> +{
> +	struct phy_gmii_sel_phy_priv *if_phy = phy_get_drvdata(phy);
> +	const struct phy_gmii_sel_soc_data *soc_data = if_phy->priv->soc_data;
> +	struct device *dev = if_phy->priv->dev;
> +	struct regmap_field *regfield;
> +	int ret, rgmii_id = 0;
> +	u32 mode = 0;
> +
> +	if_phy->phy_if_mode = intf_mode;
> +
> +	switch (if_phy->phy_if_mode) {
> +	case PHY_INTERFACE_MODE_RMII:
> +		mode = AM33XX_GMII_SEL_MODE_RMII;
> +		break;
> +
> +	case PHY_INTERFACE_MODE_RGMII:
> +		mode = AM33XX_GMII_SEL_MODE_RGMII;
> +		break;
> +
> +	case PHY_INTERFACE_MODE_RGMII_ID:
> +	case PHY_INTERFACE_MODE_RGMII_RXID:
> +	case PHY_INTERFACE_MODE_RGMII_TXID:
> +		mode = AM33XX_GMII_SEL_MODE_RGMII;
> +		rgmii_id = 1;
> +		break;

Hi Grygorii

It looks like the MAC can do AM33XX_GMII_SEL_MODE_RGMII and
AM33XX_GMII_SEL_MODE_RGMII_ID. I don't think it can do
AM33XX_GMII_SEL_MODE_RGMII_RXID or AM33XX_GMII_SEL_MODE_RGMII_TXID?  I
would prefer it return -EINVAL when asked to do something it cannot
do.

> +
> +	default:
> +		dev_warn(dev,
> +			 "port%u: unsupported mode: \"%s\". Defaulting to MII.\n",
> +			 if_phy->id, phy_modes(rgmii_id));
> +		/* fall through */

Returning -EINVAL would be better. Otherwise the DT might never get
fixed.

> +	case PHY_INTERFACE_MODE_MII:
> +		mode = AM33XX_GMII_SEL_MODE_MII;
> +		break;
> +	};
> +
> +	dev_dbg(dev, "%s id:%u mode:%u rgmii_id:%d rmii_clk_ext:%d\n",
> +		__func__, if_phy->id, mode, rgmii_id,
> +		if_phy->rmii_clock_external);
> +
> +	regfield = if_phy->fields[PHY_GMII_SEL_PORT_MODE];
> +	ret = regmap_field_write(regfield, mode);
> +
> +	if (soc_data->features & BIT(PHY_GMII_SEL_RGMII_ID_MODE) &&
> +	    if_phy->fields[PHY_GMII_SEL_RGMII_ID_MODE]) {
> +		regfield = if_phy->fields[PHY_GMII_SEL_RGMII_ID_MODE];
> +		ret |= regmap_field_write(regfield, rgmii_id);
> +	}
> +
> +	if (soc_data->features & BIT(PHY_GMII_SEL_RMII_IO_CLK_EN) &&
> +	    if_phy->fields[PHY_GMII_SEL_RMII_IO_CLK_EN]) {
> +		regfield = if_phy->fields[PHY_GMII_SEL_RMII_IO_CLK_EN];
> +		ret |= regmap_field_write(regfield,
> +					  if_phy->rmii_clock_external);
> +	}
> +
> +	if (ret) {
> +		dev_err(dev, "port%u: set mode fail %d", if_phy->id, ret);
> +		return -EIO;
> +	}

I would prefer each write had its own error check. The fact you don't
return ret means you know ret could be -EINVAL|-EOIO, making
-EMORECOFFEE.

	Andrew

^ permalink raw reply

* Re: [v3, 3/6] net: dpaa2: fix dependency of config FSL_DPAA2_ETH
From: David Miller @ 2018-10-08 17:23 UTC (permalink / raw)
  To: yangbo.lu
  Cc: linux-kernel, devel, netdev, richardcochran, ruxandra.radulescu,
	gregkh, andrew
In-Reply-To: <20181008074430.34379-3-yangbo.lu@nxp.com>

From: Yangbo Lu <yangbo.lu@nxp.com>
Date: Mon,  8 Oct 2018 15:44:27 +0800

> The NETDEVICES dependency and ETHERNET dependency hadn't
> been required since dpaa2-eth was moved out of staging.
> Also allowed COMPILE_TEST for dpaa2-eth.
> 
> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>

Applied.

^ permalink raw reply

* Re: [PATCH net 1/2] net: ipv4: update fnhe_pmtu when first hop's MTU changes
From: David Ahern @ 2018-10-08 17:18 UTC (permalink / raw)
  To: Sabrina Dubroca, netdev; +Cc: Stefano Brivio, Ido Schimmel
In-Reply-To: <bdb0235df165b1a9684670be3839962c80c9b45a.1539000663.git.sd@queasysnail.net>

On 10/8/18 6:36 AM, Sabrina Dubroca wrote:
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index c7861e4b402c..dc9d2668d9bb 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2458,6 +2458,13 @@ struct netdev_notifier_info {
>  	struct netlink_ext_ack	*extack;
>  };
>  
> +struct netdev_notifier_info_ext {
> +	struct netdev_notifier_info info; /* must be first */
> +	union {
> +		u32 u32;

I realize you want this to be generic, but that is a really odd
definition. can you make that mtu instead? the union allows other use
cases to add new names.

> +	} ext;
> +};
> +
>  struct netdev_notifier_change_info {
>  	struct netdev_notifier_info info; /* must be first */
>  	unsigned int flags_changed;

^ permalink raw reply

* Re: [PATCH bpf-next v3 07/15] bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP
From: Björn Töpel @ 2018-10-08 17:04 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Karlsson, Magnus, Duyck, Alexander H, Alexander Duyck,
	John Fastabend, Alexei Starovoitov, Jesper Dangaard Brouer,
	Willem de Bruijn, Daniel Borkmann, Michael S. Tsirkin, Netdev,
	Björn Töpel, michael.lundkvist, Brandeburg, Jesse,
	Singhai, Anjali, Zhang, Qi Z
In-Reply-To: <044ed6af-b3df-d916-9550-43226179e1e3@gmail.com>

Den mån 8 okt. 2018 kl 18:55 skrev Eric Dumazet <eric.dumazet@gmail.com>:
>
[...]
>
> You might take a look at SOCK_RCU_FREE flag for sockets.
>

Ah, thanks! I'll use this instead.

^ permalink raw reply

* Re: [PATCH net] rds: RDS (tcp) hangs on sendto() to unresponding address
From: Santosh Shilimkar @ 2018-10-08 16:59 UTC (permalink / raw)
  To: Ka-Cheong Poon, netdev; +Cc: davem, rds-devel
In-Reply-To: <1539015431-26974-1-git-send-email-ka-cheong.poon@oracle.com>

10/8/2018 9:17 AM, Ka-Cheong Poon wrote:
> In rds_send_mprds_hash(), if the calculated hash value is non-zero and
> the MPRDS connections are not yet up, it will wait.  But it should not
> wait if the send is non-blocking.  In this case, it should just use the
> base c_path for sending the message.
> 
> Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
> ---
Thanks for getting this out on the list.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox