Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next] r8169: add support for Byte Queue Limits
From: David Miller @ 2018-10-20 19:36 UTC (permalink / raw)
  To: hkallweit1; +Cc: nic_swsd, netdev, fw, holger
In-Reply-To: <778bd978-e851-6604-9822-34bde3eb07b9@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Sat, 20 Oct 2018 12:25:27 +0200

> From: Florian Westphal <fw@strlen.de>
> This patch is basically a resubmit of 1e918876853a ("r8169: add support
> for Byte Queue Limits") which was reverted later. The problems causing
> the revert seem to have been fixed in the meantime.
> Only change to the original patch is that the call to
> netdev_reset_queue was moved to rtl8169_tx_clear.
> 
> The Tested-by refers to a system using the RTL8168evl chip version.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>

Applied.

Heiner, I just want to say how happy I am with the work you've done on
the r8169 driver.  So many long term issues have been resolved, the
driver uses more and more generic facilities instead of reimplementing
things, and it's much cleaner and easier to maintain now.

Thank you!

^ permalink raw reply

* Re: [PATCH net-next] r8169: handle all interrupt events in the hard irq handler
From: David Miller @ 2018-10-20 19:35 UTC (permalink / raw)
  To: hkallweit1; +Cc: nic_swsd, netdev
In-Reply-To: <b37a0812-1c23-3743-eaf6-ca8bdcb1b1ab@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Thu, 18 Oct 2018 22:19:28 +0200

> Having a separate "slow event" handler isn't needed because all
> interrupt events trigger asynchronous activity. And in case of SYSErr
> we have bigger problems than performance anyway.
> This patch also allows to get rid of acking interrupt events in the
> NAPI poll callback.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> ---
> This patch will apply to net-next only after 6b839b6cf9ea ("r8169: fix
> NAPI handling under high load") was merged from net to net-next.

Applied.

^ permalink raw reply

* Re: pull request: bluetooth-next 2018-10-20
From: David Miller @ 2018-10-20 19:34 UTC (permalink / raw)
  To: johan.hedberg; +Cc: linux-bluetooth, netdev
In-Reply-To: <20181020090342.GA14926@x1c.home>

From: Johan Hedberg <johan.hedberg@gmail.com>
Date: Sat, 20 Oct 2018 12:03:42 +0300

> Here's one more bluetooth-next pull request for the 4.20 kernel.
> 
>  - Added new USB ID for QCA_ROME controller
>  - Added debug trace support from QCA wcn3990 controllers
>  - Updated L2CAP to conform to latest Errata Service Release
>  - Fix binding to non-removable BCM43430 devices
> 
> Please let me know if there are any issues pulling. Thanks.

Pulled, thanks Johan.

^ permalink raw reply

* Re: [PATCH 00/10] Netfilter updates for net-next
From: David Miller @ 2018-10-20 19:33 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <20181020094317.16011-1-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Sat, 20 Oct 2018 11:43:07 +0200

> The following patchset contains Netfilter updates for your net-next tree:
> 
> 1) Use lockdep_is_held() in ipset_dereference_protected(), from Lance Roy.
> 
> 2) Remove unused variable in cttimeout, from YueHaibing.
> 
> 3) Add ttl option for nft_osf, from Fernando Fernandez Mancera.
> 
> 4) Use xfrm family to deal with IPv6-in-IPv4 packets from nft_xfrm,
>    from Florian Westphal.
> 
> 5) Simplify xt_osf_match_packet().
> 
> 6) Missing ct helper alias definition in snmp_trap helper, from Taehee Yoo.
> 
> 7) Remove unnecessary parameter in nf_flow_table_cleanup(), from Taehee Yoo.
> 
> 8) Remove unused variable definitions in nft_{dup,fwd}, from Weongyo Jeong.
> 
> 9) Remove empty net/netfilter/nfnetlink_log.h file, from Taehee Yoo.
> 
> 10) Revert xt_quota updates remain option due to problems in the listing
>     path for 32-bit arches, from Maze.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Pulled, thanks.

^ permalink raw reply

* Re: [PATCH iproute2 1/1] DEBUG: Fix make check when need build generate_nlmsg
From: Petr Vorel @ 2018-10-20 19:15 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20180925124956.10565-1-pvorel@suse.cz>

Hi Stephen,

> make check from top level Makefile defines several flags which break
> building generate_nlmsg:

> $ make check
> make -C tools
> gcc  -Wall -Wstrict-prototypes  -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wformat=2 -O2 -I../include -I../include/uapi -DRESOLVE_HOSTNAMES -DLIBDIR=\"/usr/lib\" -DCONFDIR=\"/etc/iproute2\" -DNETNS_RUN_DIR=\"/var/run/netns\" -DNETNS_ETC_DIR=\"/etc/netns\" -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE  -DHAVE_SETNS -DHAVE_SELINUX -DHAVE_ELF -DHAVE_LIBMNL -I/usr/include/libmnl -DNEED_STRLCPY -DHAVE_LIBCAP ../lib/libutil.a ../lib/libnetlink.a -lselinux -lelf -lmnl -lcap  -I../../include -include../../include/uapi/linux/netlink.h -o generate_nlmsg generate_nlmsg.c ../../lib/libnetlink.c -lmnl
> gcc: error: ../lib/libutil.a: No such file or directory
> gcc: error: ../lib/libnetlink.a: No such file or directory
> make[2]: *** [Makefile:5: generate_nlmsg] Error 1
> make[1]: *** [Makefile:40: generate_nlmsg] Error 2

> To fix it reset CFLAGS in sub Makefile and remove LDLIBS entirely (as
> required -lmnl flag was specified in 5dc2204c ("testsuite: add libmnl").

> Fixes: 8804a8c0 ("Makefile: Add check target")

> Signed-off-by: Petr Vorel <pvorel@suse.cz>
> ---
> Hi Stephen,

> I'm sorry for this regression.

> Kind regards,
> Petr
> ---
>  testsuite/tools/Makefile | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

> diff --git a/testsuite/tools/Makefile b/testsuite/tools/Makefile
> index 7d53d226..e1d9bfef 100644
> --- a/testsuite/tools/Makefile
> +++ b/testsuite/tools/Makefile
> @@ -1,8 +1,9 @@
>  # SPDX-License-Identifier: GPL-2.0
> +CFLAGS=
>  include ../../config.mk

>  generate_nlmsg: generate_nlmsg.c ../../lib/libnetlink.c
> -	$(CC) $(CPPFLAGS) $(CFLAGS) $(LDLIBS) $(EXTRA_CFLAGS) -I../../include -include../../include/uapi/linux/netlink.h -o $@ $^ -lmnl
> +	$(CC) $(CPPFLAGS) $(CFLAGS) $(EXTRA_CFLAGS) -I../../include -include../../include/uapi/linux/netlink.h -o $@ $^ -lmnl

>  clean:
>  	rm -f generate_nlmsg

ping, please. Patch is in state "accepted in patchwork [1], but not merged.
Subject should be "testsuite: Fix make check when need build generate_nlmsg".


Kind regards,
Petr

[1] https://patchwork.ozlabs.org/patch/974391/

^ permalink raw reply

* [PATCH 2/2] net: bpfilter: Set user mode helper's command line
From: Olivier Brunel @ 2018-10-20 17:39 UTC (permalink / raw)
  To: Network Development
  Cc: Alexei Starovoitov, Daniel Borkmann, Luis R . Rodriguez,
	linux-kernel, David S . Miller, Olivier Brunel
In-Reply-To: <20181020173957.31239-1-jjk@jjacky.com>

Signed-off-by: Olivier Brunel <jjk@jjacky.com>
---
 net/bpfilter/bpfilter_kern.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/bpfilter/bpfilter_kern.c b/net/bpfilter/bpfilter_kern.c
index 94e88f510..7acfc8308 100644
--- a/net/bpfilter/bpfilter_kern.c
+++ b/net/bpfilter/bpfilter_kern.c
@@ -92,6 +92,7 @@ static int __init load_umh(void)
 	int err;
 
 	/* fork usermode process */
+	info.cmdline = "bpfilter_umh";
 	err = fork_usermode_blob(&bpfilter_umh_start,
 				 &bpfilter_umh_end - &bpfilter_umh_start,
 				 &info);
-- 
2.19.0

^ permalink raw reply related

* [PATCH 1/2] umh: Add command line to user mode helpers
From: Olivier Brunel @ 2018-10-20 17:39 UTC (permalink / raw)
  To: Network Development
  Cc: Alexei Starovoitov, Daniel Borkmann, Luis R . Rodriguez,
	linux-kernel, David S . Miller, Olivier Brunel
In-Reply-To: <20181020173957.31239-1-jjk@jjacky.com>

User mode helpers were spawned without a command line, and because
an empty command line is used by many tools to identify processes as
kernel threads, this could cause some issues.

Notably during killing spree on shutdown, since such helper would then
be skipped (i.e. not killed) which would result in the process remaining
alive, and thus preventing unmouting of the rootfs (as experienced with
the bpfilter umh).

Fixes: 449325b52b7a ("umh: introduce fork_usermode_blob() helper")
Signed-off-by: Olivier Brunel <jjk@jjacky.com>
---
 include/linux/umh.h |  1 +
 kernel/umh.c        | 16 ++++++++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/umh.h b/include/linux/umh.h
index 5c812acbb..235f51b62 100644
--- a/include/linux/umh.h
+++ b/include/linux/umh.h
@@ -44,6 +44,7 @@ struct subprocess_info *call_usermodehelper_setup_file(struct file *file,
 			  int (*init)(struct subprocess_info *info, struct cred *new),
 			  void (*cleanup)(struct subprocess_info *), void *data);
 struct umh_info {
+	const char *cmdline;
 	struct file *pipe_to_umh;
 	struct file *pipe_from_umh;
 	pid_t pid;
diff --git a/kernel/umh.c b/kernel/umh.c
index c44985894..0baa672e0 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -405,11 +405,19 @@ struct subprocess_info *call_usermodehelper_setup_file(struct file *file,
 		void (*cleanup)(struct subprocess_info *info), void *data)
 {
 	struct subprocess_info *sub_info;
+	struct umh_info *info = data;
+	const char *cmdline = (info->cmdline) ? info->cmdline : "usermodehelper";
 
 	sub_info = kzalloc(sizeof(struct subprocess_info), GFP_KERNEL);
 	if (!sub_info)
 		return NULL;
 
+	sub_info->argv = argv_split(GFP_KERNEL, cmdline, NULL);
+	if (!sub_info->argv) {
+		kfree(sub_info);
+		return NULL;
+	}
+
 	INIT_WORK(&sub_info->work, call_usermodehelper_exec_work);
 	sub_info->path = "none";
 	sub_info->file = file;
@@ -458,10 +466,11 @@ static int umh_pipe_setup(struct subprocess_info *info, struct cred *new)
 	return 0;
 }
 
-static void umh_save_pid(struct subprocess_info *info)
+static void umh_clean_and_save_pid(struct subprocess_info *info)
 {
 	struct umh_info *umh_info = info->data;
 
+	argv_free(info->argv);
 	umh_info->pid = info->pid;
 }
 
@@ -471,6 +480,9 @@ static void umh_save_pid(struct subprocess_info *info)
  * @len: length of the blob
  * @info: information about usermode process (shouldn't be NULL)
  *
+ * If info->cmdline is set it will be used as command line for the
+ * user process, else "usermodehelper" is used.
+ *
  * Returns either negative error or zero which indicates success
  * in executing a blob of bytes as a usermode process. In such
  * case 'struct umh_info *info' is populated with two pipes
@@ -500,7 +512,7 @@ int fork_usermode_blob(void *data, size_t len, struct umh_info *info)
 
 	err = -ENOMEM;
 	sub_info = call_usermodehelper_setup_file(file, umh_pipe_setup,
-						  umh_save_pid, info);
+						  umh_clean_and_save_pid, info);
 	if (!sub_info)
 		goto out;
 
-- 
2.19.0

^ permalink raw reply related

* [PATCH 0/2] Re: bpfilter causes a leftover kernel process
From: Olivier Brunel @ 2018-10-20 17:39 UTC (permalink / raw)
  To: Network Development
  Cc: Alexei Starovoitov, Daniel Borkmann, Luis R . Rodriguez,
	linux-kernel, David S . Miller, Olivier Brunel
In-Reply-To: <CAADnVQ+xFfwV67BRXD0epoTwKz54W3mcYHXVWBd1iWb94p+zpg@mail.gmail.com>

On Tue, 16 Oct 2018 16:38:56 +0000
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Wed, Sep 5, 2018 at 5:05 PM Olivier Brunel <jjk@jjacky.com> wrote:
> >
> > You'll see in the end that systemd complains that it can't
> > unmount /oldroot (EBUSY), aka the root fs; and that's because of the
> > bpfilter helper, which wasn't killed because it's seen as a kernel
> > thread due to its empty command line and therefore not signaled.  
> 
> thanks for tracking it down.
> can somebody send a patch to give bpfilter non-empty cmdline?
> I think that would be a better fix than tweaking all pid1s.

So I'm not a kernel dev and this would be my first atttempt at a kernel patch,
but I did have a look and came up with the following patch(es) to fix this.
Hopefully I did things right.

It adds a default command line ("usermodehelper") to such processes, so any &
all such helpers will be seen as user process and not kernel threads, but
there's also the possibility to specify a command line to use, here
"bpfilter_umh"

Cheers,


Olivier Brunel (2):
  umh: Add command line to user mode helpers
  net: bpfilter: Set user mode helper's command line

 include/linux/umh.h          |  1 +
 kernel/umh.c                 | 16 ++++++++++++++--
 net/bpfilter/bpfilter_kern.c |  1 +
 3 files changed, 16 insertions(+), 2 deletions(-)

-- 
2.19.0

^ permalink raw reply

* Re: [PATCH] net: ethernet: fec: Add missing SPEED_
From: Andrew Lunn @ 2018-10-20 15:39 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Florian Fainelli, LABBE Corentin, davem, fugang.duan,
	linux-kernel, netdev
In-Reply-To: <f1933c0a-d124-a7ac-afab-2fa8f88907ab@gmail.com>

> >> I have patched by adding:
> >> phy_remove_link_mode(phy_dev, ETHTOOL_LINK_MODE_Asym_Pause_BIT);
> 
> Instead of programmatically removing the feature bit it should be
> possible to do this in the PHY driver configuration. See also
> this part of phy_probe().
> 
> 	if (phydrv->features & (SUPPORTED_Pause | SUPPORTED_Asym_Pause)) {
> 		phydev->supported &= ~(SUPPORTED_Pause | SUPPORTED_Asym_Pause);
> 		phydev->supported |= phydrv->features &
> 				     (SUPPORTED_Pause | SUPPORTED_Asym_Pause);
> 	} else {
> 		phydev->supported |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;
> 	}

Sorry for the late reply. Been on vacation.

I need to check the datasheet, but it seems like the KSZ9021 does not
support asym pause. Using the above code is the correct way to solve
this problem. Look at the bcm63xx.c:bcm63xx_config_init() which does
this.

I will cook up a patch.

  Andrew

^ permalink raw reply

* [PATCH] net: socket: fix a missing-check bug
From: Wenwen Wang @ 2018-10-20 15:58 UTC (permalink / raw)
  To: Wenwen Wang
  Cc: Kangjie Lu, David S. Miller, open list:NETWORKING [GENERAL],
	open list

In ethtool_ioctl(), the ioctl command is firstly obtained from the
user-space buffer 'compat_rxnfc' through get_user() and saved to 'ethcmd'.
Then, 'ethcmd' is checked to see whether it is necessary to pre-process the
ethool structure, because the structure ethtool_rxnfc is defined with
padding, as mentioned in the comment. If yes, a user-space buffer 'rxnfc'
is allocated through compat_alloc_user_space() and then the data in the
original buffer 'compat_rxnfc' is copied to 'rxnfc' through copy_in_user(),
including the ioctl command. It is worth noting that after this copy, there
is no check enforced on the copied ioctl command. That means it is possible
that 'rxnfc->cmd' is different from 'ethcmd', because a malicious user can
race to modify the ioctl command in 'compat_rxnfc' between these two
copies. Eventually, the ioctl command in 'rxnfc' will be used in
dev_ethtool(). This can cause undefined behavior of the kernel and
introduce potential security risk.

This patch avoids the above issue by rewriting 'rxnfc->cmd' using 'ethcmd'
after copy_in_user().

Signed-off-by: Wenwen Wang <wang6495@umn.edu>
---
 net/socket.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/socket.c b/net/socket.c
index 01f3f8f..c5f969c 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2879,6 +2879,8 @@ static int ethtool_ioctl(struct net *net, struct compat_ifreq __user *ifr32)
 		    copy_in_user(&rxnfc->rule_cnt, &compat_rxnfc->rule_cnt,
 				 sizeof(rxnfc->rule_cnt)))
 			return -EFAULT;
+
+		rxnfc->cmd = ethcmd;
 	}
 
 	ret = dev_ioctl(net, SIOCETHTOOL, &ifr, NULL);
-- 
2.7.4

^ permalink raw reply related

* [GIT] Networking
From: David Miller @ 2018-10-20 22:47 UTC (permalink / raw)
  To: gregkh; +Cc: akpm, netdev, linux-kernel


A few straggler bug fixes:

1) Fix indexing of multi-pass dumps of ipv6 addresses, from David
   Ahern.

2) Revert RCU locking change for bonding netpoll, causes worse problems
   than it solves.

3) pskb_trim_rcsum_slow() doesn't handle odd trim offsets, resulting in
   erroneous bad hw checksum triggers with CHECKSUM_COMPLETE devices.
   From Dimitris Michailidis.

Please pull, thanks a lot!

The following changes since commit 91b15613ce7fb3e724ca0d433eef8e6bf15322af:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2018-10-19 09:16:20 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 

for you to fetch changes up to 4ba4c566ba8448a05e6257e0b98a21f1a0d55315:

  net/ipv6: Fix index counter for unicast addresses in in6_dump_addrs (2018-10-20 15:43:14 -0700)

----------------------------------------------------------------
David Ahern (1):
      net/ipv6: Fix index counter for unicast addresses in in6_dump_addrs

David S. Miller (1):
      Revert "bond: take rcu lock in netpoll_send_skb_on_dev"

Dimitris Michailidis (1):
      net: fix pskb_trim_rcsum_slow() with odd trim offset

 net/core/netpoll.c  | 2 --
 net/core/skbuff.c   | 5 +++--
 net/ipv6/addrconf.c | 6 ++++--
 3 files changed, 7 insertions(+), 6 deletions(-)

^ permalink raw reply

* Re: [PATCH net-next v8 28/28] net: WireGuard secure network tunnel
From: Andrew Lunn @ 2018-10-20 22:47 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: linux-kernel, netdev, linux-crypto, davem, gregkh
In-Reply-To: <20181018145712.7538-29-Jason@zx2c4.com>

> +#define choose_node(parent, key)                                               \
> +	parent->bit[(key[parent->bit_at_a] >> parent->bit_at_b) & 1]

Hi Jason

This should be a function, not a macro.

> +
> +static void node_free_rcu(struct rcu_head *rcu)
> +{
> +	kfree(container_of(rcu, struct allowedips_node, rcu));
> +}
> +
> +#define push_rcu(stack, p, len) ({                                             \
> +		if (rcu_access_pointer(p)) {                                   \
> +			WARN_ON(IS_ENABLED(DEBUG) && (len) >= 128);            \
> +			stack[(len)++] = rcu_dereference_raw(p);               \
> +		}                                                              \
> +		true;                                                          \
> +	})

This also looks like it could be a function.

> +static void root_free_rcu(struct rcu_head *rcu)
> +{
> +	struct allowedips_node *node, *stack[128] = {
> +		container_of(rcu, struct allowedips_node, rcu) };
> +	unsigned int len = 1;
> +
> +	while (len > 0 && (node = stack[--len]) &&
> +	       push_rcu(stack, node->bit[0], len) &&
> +	       push_rcu(stack, node->bit[1], len))
> +		kfree(node);
> +}
> +
> +#define ref(p) rcu_access_pointer(p)
> +#define deref(p) rcu_dereference_protected(*(p), lockdep_is_held(lock))

Macros should be uppercase, or better still, functions.

> +#define push(p) ({                                                             \
> +		WARN_ON(IS_ENABLED(DEBUG) && len >= 128);                      \
> +		stack[len++] = p;                                              \
> +	})

This one definitely should be upper case, to warn readers it has
unexpected side effects.

> +
> +static void walk_remove_by_peer(struct allowedips_node __rcu **top,
> +				struct wg_peer *peer, struct mutex *lock)
> +{
> +	struct allowedips_node __rcu **stack[128], **nptr;
> +	struct allowedips_node *node, *prev;
> +	unsigned int len;
> +
> +	if (unlikely(!peer || !ref(*top)))
> +		return;
> +
> +	for (prev = NULL, len = 0, push(top); len > 0; prev = node) {
> +		nptr = stack[len - 1];
> +		node = deref(nptr);
> +		if (!node) {
> +			--len;
> +			continue;
> +		}
> +		if (!prev || ref(prev->bit[0]) == node ||
> +		    ref(prev->bit[1]) == node) {
> +			if (ref(node->bit[0]))
> +				push(&node->bit[0]);
> +			else if (ref(node->bit[1]))
> +				push(&node->bit[1]);
> +		} else if (ref(node->bit[0]) == prev) {
> +			if (ref(node->bit[1]))
> +				push(&node->bit[1]);
> +		} else {
> +			if (rcu_dereference_protected(node->peer,
> +				lockdep_is_held(lock)) == peer) {
> +				RCU_INIT_POINTER(node->peer, NULL);
> +				if (!node->bit[0] || !node->bit[1]) {
> +					rcu_assign_pointer(*nptr,
> +					deref(&node->bit[!ref(node->bit[0])]));
> +					call_rcu_bh(&node->rcu, node_free_rcu);
> +					node = deref(nptr);
> +				}
> +			}
> +			--len;
> +		}
> +	}
> +}
> +
> +#undef ref
> +#undef deref
> +#undef push
> +
> +static __always_inline unsigned int fls128(u64 a, u64 b)
> +{
> +	return a ? fls64(a) + 64U : fls64(b);
> +}

Does the compiler actually get this wrong? Not inline it?  There was
an interesting LWN post about this recently:

https://lwn.net/Articles/767884/

But in general, inline of any form should be avoided in .c files.

> +
> +static __always_inline u8 common_bits(const struct allowedips_node *node,
> +				      const u8 *key, u8 bits)
> +{
> +	if (bits == 32)
> +		return 32U - fls(*(const u32 *)node->bits ^ *(const u32 *)key);
> +	else if (bits == 128)
> +		return 128U - fls128(
> +			*(const u64 *)&node->bits[0] ^ *(const u64 *)&key[0],
> +			*(const u64 *)&node->bits[8] ^ *(const u64 *)&key[8]);
> +	return 0;
> +}
> +
> +/* This could be much faster if it actually just compared the common bits
> + * properly, by precomputing a mask bswap(~0 << (32 - cidr)), and the rest, but
> + * it turns out that common_bits is already super fast on modern processors,
> + * even taking into account the unfortunate bswap. So, we just inline it like
> + * this instead.
> + */
> +#define prefix_matches(node, key, bits)                                        \
> +	(common_bits(node, key, bits) >= (node)->cidr)

Could be a function.

> +
> +static __always_inline struct allowedips_node *
> +find_node(struct allowedips_node *trie, u8 bits, const u8 *key)
> +{
> +	struct allowedips_node *node = trie, *found = NULL;
> +
> +	while (node && prefix_matches(node, key, bits)) {
> +		if (rcu_access_pointer(node->peer))
> +			found = node;
> +		if (node->cidr == bits)
> +			break;
> +		node = rcu_dereference_bh(choose_node(node, key));
> +	}
> +	return found;
> +}
> +
> +/* Returns a strong reference to a peer */
> +static __always_inline struct wg_peer *
> +lookup(struct allowedips_node __rcu *root, u8 bits, const void *be_ip)
> +{
> +	u8 ip[16] __aligned(__alignof(u64));

You virtually never see aligned stack variables. This needs some sort
of comment.

> +	struct allowedips_node *node;
> +	struct wg_peer *peer = NULL;
> +
> +	swap_endian(ip, be_ip, bits);
> +
> +	rcu_read_lock_bh();
> +retry:
> +	node = find_node(rcu_dereference_bh(root), bits, ip);
> +	if (node) {
> +		peer = wg_peer_get_maybe_zero(rcu_dereference_bh(node->peer));
> +		if (!peer)
> +			goto retry;
> +	}
> +	rcu_read_unlock_bh();
> +	return peer;
> +}
> +
> +__attribute__((nonnull(1))) static bool
> +node_placement(struct allowedips_node __rcu *trie, const u8 *key, u8 cidr,
> +	       u8 bits, struct allowedips_node **rnode, struct mutex *lock)
> +{
> +	struct allowedips_node *node = rcu_dereference_protected(trie,
> +						lockdep_is_held(lock));
> +	struct allowedips_node *parent = NULL;
> +	bool exact = false;

Should there be a WARN_ON(!key) here, since the attribute will only
detect problems at compile time, and maybe some runtime cases will get
passed it?

> +
> +	while (node && node->cidr <= cidr && prefix_matches(node, key, bits)) {
> +		parent = node;
> +		if (parent->cidr == cidr) {
> +			exact = true;
> +			break;
> +		}
> +		node = rcu_dereference_protected(choose_node(parent, key),
> +						 lockdep_is_held(lock));
> +	}
> +	*rnode = parent;
> +	return exact;
> +}
> +void wg_cookie_message_consume(struct message_handshake_cookie *src,
> +			       struct wg_device *wg)
> +{
> +	struct wg_peer *peer = NULL;
> +	u8 cookie[COOKIE_LEN];
> +	bool ret;
> +
> +	if (unlikely(!wg_index_hashtable_lookup(&wg->index_hashtable,
> +					     INDEX_HASHTABLE_HANDSHAKE |
> +					     INDEX_HASHTABLE_KEYPAIR,
> +					     src->receiver_index, &peer)))
> +		return;
> +
> +	down_read(&peer->latest_cookie.lock);
> +	if (unlikely(!peer->latest_cookie.have_sent_mac1)) {
> +		up_read(&peer->latest_cookie.lock);
> +		goto out;
> +	}
> +	ret = xchacha20poly1305_decrypt(
> +		cookie, src->encrypted_cookie, sizeof(src->encrypted_cookie),
> +		peer->latest_cookie.last_mac1_sent, COOKIE_LEN, src->nonce,
> +		peer->latest_cookie.cookie_decryption_key);
> +	up_read(&peer->latest_cookie.lock);
> +
> +	if (ret) {
> +		down_write(&peer->latest_cookie.lock);
> +		memcpy(peer->latest_cookie.cookie, cookie, COOKIE_LEN);
> +		peer->latest_cookie.birthdate = ktime_get_boot_fast_ns();
> +		peer->latest_cookie.is_valid = true;
> +		peer->latest_cookie.have_sent_mac1 = false;
> +		up_write(&peer->latest_cookie.lock);
> +	} else {
> +		net_dbg_ratelimited("%s: Could not decrypt invalid cookie response\n",
> +				    wg->dev->name);

It might be worth adding a netdev_dbg_ratelimited(), which takes a
netdev as its first parameter, just line netdev_dbg().

> +static int wg_open(struct net_device *dev)
> +{
> +	struct in_device *dev_v4 = __in_dev_get_rtnl(dev);
> +	struct inet6_dev *dev_v6 = __in6_dev_get(dev);
> +	struct wg_device *wg = netdev_priv(dev);
> +	struct wg_peer *peer;
> +	int ret;
> +
> +	if (dev_v4) {
> +		/* At some point we might put this check near the ip_rt_send_
> +		 * redirect call of ip_forward in net/ipv4/ip_forward.c, similar
> +		 * to the current secpath check.
> +		 */
> +		IN_DEV_CONF_SET(dev_v4, SEND_REDIRECTS, false);
> +		IPV4_DEVCONF_ALL(dev_net(dev), SEND_REDIRECTS) = false;
> +	}
> +	if (dev_v6)
> +		dev_v6->cnf.addr_gen_mode = IN6_ADDR_GEN_MODE_NONE;
> +
> +	ret = wg_socket_init(wg, wg->incoming_port);
> +	if (ret < 0)
> +		return ret;
> +	mutex_lock(&wg->device_update_lock);
> +	list_for_each_entry(peer, &wg->peer_list, peer_list) {
> +		wg_packet_send_staged_packets(peer);
> +		if (peer->persistent_keepalive_interval)
> +			wg_packet_send_keepalive(peer);
> +	}
> +	mutex_unlock(&wg->device_update_lock);
> +	return 0;
> +}
> +
> +#if defined(CONFIG_PM_SLEEP) && !defined(CONFIG_ANDROID)

I don't see any other code which uses this combination. Why is this
needed?

> +static netdev_tx_t wg_xmit(struct sk_buff *skb, struct net_device *dev)
> +{
> +	struct wg_device *wg = netdev_priv(dev);
> +	struct sk_buff_head packets;
> +	struct wg_peer *peer;
> +	struct sk_buff *next;
> +	sa_family_t family;
> +	u32 mtu;
> +	int ret;
> +
> +	if (unlikely(wg_skb_examine_untrusted_ip_hdr(skb) != skb->protocol)) {
> +		ret = -EPROTONOSUPPORT;
> +		net_dbg_ratelimited("%s: Invalid IP packet\n", dev->name);
> +		goto err;
> +	}
> +
> +	peer = wg_allowedips_lookup_dst(&wg->peer_allowedips, skb);
> +	if (unlikely(!peer)) {
> +		ret = -ENOKEY;
> +		if (skb->protocol == htons(ETH_P_IP))
> +			net_dbg_ratelimited("%s: No peer has allowed IPs matching %pI4\n",
> +					    dev->name, &ip_hdr(skb)->daddr);
> +		else if (skb->protocol == htons(ETH_P_IPV6))
> +			net_dbg_ratelimited("%s: No peer has allowed IPs matching %pI6\n",
> +					    dev->name, &ipv6_hdr(skb)->daddr);
> +		goto err;
> +	}
> +
> +	family = READ_ONCE(peer->endpoint.addr.sa_family);
> +	if (unlikely(family != AF_INET && family != AF_INET6)) {
> +		ret = -EDESTADDRREQ;
> +		net_dbg_ratelimited("%s: No valid endpoint has been configured or discovered for peer %llu\n",
> +				    dev->name, peer->internal_id);
> +		goto err_peer;
> +	}
> +
> +	mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
> +
> +	__skb_queue_head_init(&packets);
> +	if (!skb_is_gso(skb)) {
> +		skb->next = NULL;
> +	} else {
> +		struct sk_buff *segs = skb_gso_segment(skb, 0);
> +
> +		if (unlikely(IS_ERR(segs))) {
> +			ret = PTR_ERR(segs);
> +			goto err_peer;
> +		}
> +		dev_kfree_skb(skb);
> +		skb = segs;
> +	}
> +	do {
> +		next = skb->next;
> +		skb->next = skb->prev = NULL;
> +
> +		skb = skb_share_check(skb, GFP_ATOMIC);
> +		if (unlikely(!skb))
> +			continue;
> +
> +		/* We only need to keep the original dst around for icmp,
> +		 * so at this point we're in a position to drop it.
> +		 */
> +		skb_dst_drop(skb);
> +
> +		PACKET_CB(skb)->mtu = mtu;
> +
> +		__skb_queue_tail(&packets, skb);
> +	} while ((skb = next) != NULL);
> +
> +	spin_lock_bh(&peer->staged_packet_queue.lock);
> +	/* If the queue is getting too big, we start removing the oldest packets
> +	 * until it's small again. We do this before adding the new packet, so
> +	 * we don't remove GSO segments that are in excess.
> +	 */
> +	while (skb_queue_len(&peer->staged_packet_queue) > MAX_STAGED_PACKETS)
> +		dev_kfree_skb(__skb_dequeue(&peer->staged_packet_queue));

It would be good to have some stats counters in here. Maybe the
standard interface statistics will cover it, otherwise ethtool -S.

You should also get this code looked at by some of the queueing
people. Rather than discarding, you might want to be applying back
pressure to slow down the sender application?

> +static void kdf(u8 *first_dst, u8 *second_dst, u8 *third_dst, const u8 *data,
> +		size_t first_len, size_t second_len, size_t third_len,
> +		size_t data_len, const u8 chaining_key[NOISE_HASH_LEN])
> +{
> +	u8 output[BLAKE2S_HASH_SIZE + 1];
> +	u8 secret[BLAKE2S_HASH_SIZE];
> +
> +	WARN_ON(IS_ENABLED(DEBUG) &&
> +		(first_len > BLAKE2S_HASH_SIZE ||
> +		 second_len > BLAKE2S_HASH_SIZE ||
> +		 third_len > BLAKE2S_HASH_SIZE ||
> +		 ((second_len || second_dst || third_len || third_dst) &&
> +		  (!first_len || !first_dst)) ||
> +		 ((third_len || third_dst) && (!second_len || !second_dst))));

Maybe split this up into a number of WARN_ON()s. At the moment, if it
fires, you have little idea why, there are so many comparisons. Also,
is this on the hot path? I guess not, since this is keys, not
packets. Do you need to care about DEBUG here?

	 Andrew

^ permalink raw reply

* [iproute2 PATCH] bridge: fix vlan show stats formatting
From: Tobias Jungel @ 2018-10-20 13:42 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

The output of -statistics vlan show was broken previous change for json
output. This aligns the format to vlan show.

Signed-off-by: Tobias Jungel <tobias.jungel@gmail.com>
---
 bridge/vlan.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/bridge/vlan.c b/bridge/vlan.c
index bdce55ae..85f4a539 100644
--- a/bridge/vlan.c
+++ b/bridge/vlan.c
@@ -487,8 +487,7 @@ static void print_vlan_stats_attr(struct rtattr *attr, int ifindex)
 	list = brtb[LINK_XSTATS_TYPE_BRIDGE];
 	rem = RTA_PAYLOAD(list);

-	ifname = ll_index_to_name(ifindex);
-	open_json_object(ifname);
+	open_vlan_port(ifindex);

 	print_color_string(PRINT_FP, COLOR_IFNAME,
 			   NULL, "%-16s", ifname);
@@ -509,8 +508,7 @@ static void print_vlan_stats_attr(struct rtattr *attr, int ifindex)

 		print_one_vlan_stats(vstats);
 	}
-	close_json_object();
-
+	close_vlan_port();
 }

 static int print_vlan_stats(const struct sockaddr_nl *who,

^ permalink raw reply related

* Solutions
From: Linda @ 2018-10-20 12:58 UTC (permalink / raw)
  To: netdev

We are one image studio who is able to process 300+ photos a day.

If you need any image editing, please let us know. We can do it for you
such as:
Image cut out for photos  and clipping path, masking for your photos,
They are mostly used for ecommerce photos, jewelry photos retouching,
beauty and skin images
and wedding photos.

We do also different kind of beauty retouching, portraits retouching.

We can send editing for your photos if you send us one or two photos.

Thanks,
Linda

^ permalink raw reply

* Re: [PATCH] net: ethernet: fec: Add missing SPEED_
From: Andrew Lunn @ 2018-10-20 20:32 UTC (permalink / raw)
  To: LABBE Corentin; +Cc: Florian Fainelli, davem, fugang.duan, linux-kernel, netdev
In-Reply-To: <20181018195909.GA11317@Red>

On Thu, Oct 18, 2018 at 09:59:09PM +0200, LABBE Corentin wrote:
> On Thu, Oct 18, 2018 at 12:38:32PM -0700, Florian Fainelli wrote:
> > On 10/18/2018 12:16 PM, LABBE Corentin wrote:
> > > On Thu, Oct 18, 2018 at 11:55:49AM -0700, Florian Fainelli wrote:
> > >> On 10/18/2018 11:47 AM, LABBE Corentin wrote:
> > >>> On Thu, Oct 18, 2018 at 11:39:24AM -0700, Florian Fainelli wrote:
> > >>>> On 10/18/2018 08:05 AM, Corentin Labbe wrote:
> > >>>>> Since commit 58056c1e1b0e ("net: ethernet: Use phy_set_max_speed() to limit advertised speed"), the fec driver is unable to get any link.
> > >>>>> This is due to missing SPEED_.
> > >>>>
> > >>>> But SPEED_1000 is defined in include/uapi/linux/ethtool.h as 1000, so
> > >>>> surely this would amount to the same code paths being taken or am I
> > >>>> missing something here?
> > >>>
> > >>> The bisect session pointed your patch, reverting it fix the issue.
> > >>> BUT since the fix seemed trivial I sent the patch without more test then compile it.
> > >>> Sorry, I have just found some minutes ago that it didnt fix the issue.
> > >>>
> > >>> But your patch is still the cause for sure.
> > >>>
> > >>
> > >> What you are writing is really lowering the confidence level, first
> > >> Andrew is the author of that patch, and second "just compiling" and
> > >> pretending this fixes a problem when it does not is not quite what I
> > >> would expect.
> > >>
> > >> I don't have a problem helping you find the solution or the right fix
> > >> though, even if it is not my patch, but please get the author and actual
> > >> problem right so we can move forward in confidence, thanks!
> > > 
> > > Sorry again, I wanted to acknoledge my error but I did it too fast and late.
> > > And sorry to have confound you with Andrew.
> > 
> > No worries, here to help, let us know what your bisection points to. THanks
> 
> I have added printing of phydev->supported
> My working kernel (on top of 58056c1e1b0e + revert patch) got:
> [    5.550838] fec_enet_mii_probe 2ff (gbit features)
> [    5.555848] fec_enet_mii_probe 2ef (without 1000baseT_Half)
> [    5.561620] fec_enet_mii_probe 22ef final (after pause)
> [    5.566914] Micrel KSZ9021 Gigabit PHY 2188000.ethernet-1:06: attached PHY driver [Micrel KSZ9021 Gigabit PHY] (mii_bus:phy_addr=2188000.ethernet-1:06, irq=POLL)

I just looked at the datasheet for the KSZ9021. It supports Pause and
ASym Pause. So i would expect these bits to be set. However, the FEC
MAC is unable to support Asym pause, it only supports Pause. So it is
the MAC drivers responsibility to clear Asym Pause.

        /* mask with MAC supported features */
	if (fep->quirks & FEC_QUIRK_HAS_GBIT) {
                phy_set_max_speed(phy_dev, 1000);
                phy_remove_link_mode(phy_dev,
                                     ETHTOOL_LINK_MODE_1000baseT_Half_BIT);
#if !defined(CONFIG_M5272)
	phy_support_sym_pause(phy_dev);
#endif
        }
        else
                phy_set_max_speed(phy_dev, 100);

I think we just need to take this #if !defined out, so always
indicating that sym_pause is supported. And we want
phy_support_sym_pause() to clear the asym_pause bit, if set.

	Andrew

^ permalink raw reply

* Re: [PATCH] net: ethernet: fec: Add missing SPEED_
From: Andrew Lunn @ 2018-10-20 20:12 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Florian Fainelli, LABBE Corentin, davem, fugang.duan,
	linux-kernel, netdev
In-Reply-To: <1153d0a5-1c23-1e14-7ebd-459886af957a@gmail.com>

> > Yes, you are correct.
> > 
> > But it is no longer possible to just do PHY_BASIC_FEATURES |
> > SUPPORTED_Pause because .features is no loner a u32 but a linux
> > bitmap.
> > 
> Ahh, missed that.

Yes, easy to miss, since it is still work in progress. The next
patchset completes the migration, making phydev->advertise and
phydev->supported bitmaps.

> > We need to keep the same idea, allow the PHY driver to indicate it
> > supports a subset of Pause, and if not, enable pause by default.
> > 
> Could the PHY driver define its own bitmap instead of pointing to
> one of the predefined ones?

I'm actually just adding a few more pre-defined ones. I suspect they
will get used by more than one driver. These changed have uncovered a
few bugs with pause, and i would not be surprised to find there are
more still to be found.

     Andrew

^ permalink raw reply

* Re: [PATCH net-next v2] netpoll: allow cleanup to be synchronous
From: Neil Horman @ 2018-10-20 11:33 UTC (permalink / raw)
  To: Banerjee, Debabrata; +Cc: David S . Miller, netdev@vger.kernel.org
In-Reply-To: <9074672b62cd43278884db522bfe877e@usma1ex-dag1mb2.msg.corp.akamai.com>

On Fri, Oct 19, 2018 at 08:46:45PM +0000, Banerjee, Debabrata wrote:
> > From: Neil Horman <nhorman@tuxdriver.com>
> 
> > I presume you've tested this with some of the stacked devices?  I think I'm
> > ok with this change, but I'd like confirmation that its worked.
> > 
> > Neil
> 
> Yes I've tested this on a bond device with vlan stacked on top.
> 
> -Deb
> 

Cool, thanks
Acked-by: Neil Horman <nhorman@tuxdriver.com>

> > 
> > > CC: Neil Horman <nhorman@tuxdriver.com>
> > > CC: "David S. Miller" <davem@davemloft.net>
> > > Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
> > > ---
> > >  drivers/net/bonding/bond_main.c |  3 ++-
> > >  drivers/net/macvlan.c           |  2 +-
> > >  drivers/net/team/team.c         |  5 +----
> > >  include/linux/netpoll.h         |  4 +---
> > >  net/8021q/vlan_dev.c            |  3 +--
> > >  net/bridge/br_device.c          |  2 +-
> > >  net/core/netpoll.c              | 20 +++++---------------
> > >  net/dsa/slave.c                 |  2 +-
> > >  8 files changed, 13 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/drivers/net/bonding/bond_main.c
> > > b/drivers/net/bonding/bond_main.c index ee28ec9e0aba..ffa37adb7681
> > > 100644
> > > --- a/drivers/net/bonding/bond_main.c
> > > +++ b/drivers/net/bonding/bond_main.c
> > > @@ -963,7 +963,8 @@ static inline void slave_disable_netpoll(struct slave
> > *slave)
> > >  		return;
> > >
> > >  	slave->np = NULL;
> > > -	__netpoll_free_async(np);
> > > +
> > > +	__netpoll_free(np);
> > >  }
> > >
> > >  static void bond_poll_controller(struct net_device *bond_dev) diff
> > > --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index
> > > cfda146f3b3b..fc8d5f1ee1ad 100644
> > > --- a/drivers/net/macvlan.c
> > > +++ b/drivers/net/macvlan.c
> > > @@ -1077,7 +1077,7 @@ static void macvlan_dev_netpoll_cleanup(struct
> > > net_device *dev)
> > >
> > >  	vlan->netpoll = NULL;
> > >
> > > -	__netpoll_free_async(netpoll);
> > > +	__netpoll_free(netpoll);
> > >  }
> > >  #endif	/* CONFIG_NET_POLL_CONTROLLER */
> > >
> > > diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index
> > > d887016e54b6..db633ae9f784 100644
> > > --- a/drivers/net/team/team.c
> > > +++ b/drivers/net/team/team.c
> > > @@ -1104,10 +1104,7 @@ static void team_port_disable_netpoll(struct
> > team_port *port)
> > >  		return;
> > >  	port->np = NULL;
> > >
> > > -	/* Wait for transmitting packets to finish before freeing. */
> > > -	synchronize_rcu_bh();
> > > -	__netpoll_cleanup(np);
> > > -	kfree(np);
> > > +	__netpoll_free(np);
> > >  }
> > >  #else
> > >  static int team_port_enable_netpoll(struct team_port *port) diff
> > > --git a/include/linux/netpoll.h b/include/linux/netpoll.h index
> > > 3ef82d3a78db..676f1ff161a9 100644
> > > --- a/include/linux/netpoll.h
> > > +++ b/include/linux/netpoll.h
> > > @@ -31,8 +31,6 @@ struct netpoll {
> > >  	bool ipv6;
> > >  	u16 local_port, remote_port;
> > >  	u8 remote_mac[ETH_ALEN];
> > > -
> > > -	struct work_struct cleanup_work;
> > >  };
> > >
> > >  struct netpoll_info {
> > > @@ -63,7 +61,7 @@ int netpoll_parse_options(struct netpoll *np, char
> > > *opt);  int __netpoll_setup(struct netpoll *np, struct net_device
> > > *ndev);  int netpoll_setup(struct netpoll *np);  void
> > > __netpoll_cleanup(struct netpoll *np); -void
> > > __netpoll_free_async(struct netpoll *np);
> > > +void __netpoll_free(struct netpoll *np);
> > >  void netpoll_cleanup(struct netpoll *np);  void
> > > netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
> > >  			     struct net_device *dev);
> > > diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c index
> > > 546af0e73ac3..ff720f1ebf73 100644
> > > --- a/net/8021q/vlan_dev.c
> > > +++ b/net/8021q/vlan_dev.c
> > > @@ -756,8 +756,7 @@ static void vlan_dev_netpoll_cleanup(struct
> > net_device *dev)
> > >  		return;
> > >
> > >  	vlan->netpoll = NULL;
> > > -
> > > -	__netpoll_free_async(netpoll);
> > > +	__netpoll_free(netpoll);
> > >  }
> > >  #endif /* CONFIG_NET_POLL_CONTROLLER */
> > >
> > > diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c index
> > > e053a4e43758..c6abf927f0c9 100644
> > > --- a/net/bridge/br_device.c
> > > +++ b/net/bridge/br_device.c
> > > @@ -344,7 +344,7 @@ void br_netpoll_disable(struct net_bridge_port *p)
> > >
> > >  	p->np = NULL;
> > >
> > > -	__netpoll_free_async(np);
> > > +	__netpoll_free(np);
> > >  }
> > >
> > >  #endif
> > > diff --git a/net/core/netpoll.c b/net/core/netpoll.c index
> > > de1d1ba92f2d..6ac71624ead4 100644
> > > --- a/net/core/netpoll.c
> > > +++ b/net/core/netpoll.c
> > > @@ -591,7 +591,6 @@ int __netpoll_setup(struct netpoll *np, struct
> > > net_device *ndev)
> > >
> > >  	np->dev = ndev;
> > >  	strlcpy(np->dev_name, ndev->name, IFNAMSIZ);
> > > -	INIT_WORK(&np->cleanup_work, netpoll_async_cleanup);
> > >
> > >  	if (ndev->priv_flags & IFF_DISABLE_NETPOLL) {
> > >  		np_err(np, "%s doesn't support polling, aborting\n", @@ -
> > 790,10
> > > +789,6 @@ void __netpoll_cleanup(struct netpoll *np)  {
> > >  	struct netpoll_info *npinfo;
> > >
> > > -	/* rtnl_dereference would be preferable here but
> > > -	 * rcu_cleanup_netpoll path can put us in here safely without
> > > -	 * holding the rtnl, so plain rcu_dereference it is
> > > -	 */
> > >  	npinfo = rtnl_dereference(np->dev->npinfo);
> > >  	if (!npinfo)
> > >  		return;
> > > @@ -814,21 +809,16 @@ void __netpoll_cleanup(struct netpoll *np)  }
> > > EXPORT_SYMBOL_GPL(__netpoll_cleanup);
> > >
> > > -static void netpoll_async_cleanup(struct work_struct *work)
> > > +void __netpoll_free(struct netpoll *np)
> > >  {
> > > -	struct netpoll *np = container_of(work, struct netpoll,
> > cleanup_work);
> > > +	ASSERT_RTNL();
> > >
> > > -	rtnl_lock();
> > > +	/* Wait for transmitting packets to finish before freeing. */
> > > +	synchronize_rcu_bh();
> > >  	__netpoll_cleanup(np);
> > > -	rtnl_unlock();
> > >  	kfree(np);
> > >  }
> > > -
> > > -void __netpoll_free_async(struct netpoll *np) -{
> > > -	schedule_work(&np->cleanup_work);
> > > -}
> > > -EXPORT_SYMBOL_GPL(__netpoll_free_async);
> > > +EXPORT_SYMBOL_GPL(__netpoll_free);
> > >
> > >  void netpoll_cleanup(struct netpoll *np)  { diff --git
> > > a/net/dsa/slave.c b/net/dsa/slave.c index 3f840b6eea69..3679e13b2ead
> > > 100644
> > > --- a/net/dsa/slave.c
> > > +++ b/net/dsa/slave.c
> > > @@ -722,7 +722,7 @@ static void dsa_slave_netpoll_cleanup(struct
> > > net_device *dev)
> > >
> > >  	p->netpoll = NULL;
> > >
> > > -	__netpoll_free_async(netpoll);
> > > +	__netpoll_free(netpoll);
> > >  }
> > >
> > >  static void dsa_slave_poll_controller(struct net_device *dev)
> > > --
> > > 2.19.1
> > >
> > >
> 

^ permalink raw reply

* Re: [PATCH] net: ethernet: fec: Add missing SPEED_
From: Heiner Kallweit @ 2018-10-20 19:26 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Florian Fainelli, LABBE Corentin, davem, fugang.duan,
	linux-kernel, netdev
In-Reply-To: <20181020185947.GA6615@lunn.ch>

On 20.10.2018 20:59, Andrew Lunn wrote:
>> I dare to dispute here ;) Above code snippet from phy_probe() will
>> (try to) set also SUPPORTED_Asym_Pause, because phydrv->features
>> doesn't include any of the two pause flags.
>> The statement in bcm63xx_config_init you refer to seems to be a
>> no-op to me therefore.
>>
>> I'd say the correct way is to change the PHY config like this:
>> .features	= PHY_BASIC_FEATURES | SUPPORTED_Pause;
>> It's exactly the use case the code snippet above covers.
> 
> Yes, you are correct.
> 
> But it is no longer possible to just do PHY_BASIC_FEATURES |
> SUPPORTED_Pause because .features is no loner a u32 but a linux
> bitmap.
> 
Ahh, missed that.

> We need to keep the same idea, allow the PHY driver to indicate it
> supports a subset of Pause, and if not, enable pause by default.
> 
Could the PHY driver define its own bitmap instead of pointing to
one of the predefined ones?

> Maybe the easiest way is to move this chunk of code to after the probe
> function is called.
> 
> 	 Andrew
> 

^ permalink raw reply

* Re: [PATCH] net: ethernet: fec: Add missing SPEED_
From: Andrew Lunn @ 2018-10-20 18:59 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Florian Fainelli, LABBE Corentin, davem, fugang.duan,
	linux-kernel, netdev
In-Reply-To: <c8f4f203-b2b6-81ba-2c79-85501a640948@gmail.com>

> I dare to dispute here ;) Above code snippet from phy_probe() will
> (try to) set also SUPPORTED_Asym_Pause, because phydrv->features
> doesn't include any of the two pause flags.
> The statement in bcm63xx_config_init you refer to seems to be a
> no-op to me therefore.
> 
> I'd say the correct way is to change the PHY config like this:
> .features	= PHY_BASIC_FEATURES | SUPPORTED_Pause;
> It's exactly the use case the code snippet above covers.

Yes, you are correct.

But it is no longer possible to just do PHY_BASIC_FEATURES |
SUPPORTED_Pause because .features is no loner a u32 but a linux
bitmap.

We need to keep the same idea, allow the PHY driver to indicate it
supports a subset of Pause, and if not, enable pause by default.

Maybe the easiest way is to move this chunk of code to after the probe
function is called.

	 Andrew

^ permalink raw reply

* [PATCH net-next] r8169: add support for Byte Queue Limits
From: Heiner Kallweit @ 2018-10-20 10:25 UTC (permalink / raw)
  To: David Miller, Realtek linux nic maintainers
  Cc: netdev@vger.kernel.org, Florian Westphal, Holger Hoffstätte

From: Florian Westphal <fw@strlen.de>
This patch is basically a resubmit of 1e918876853a ("r8169: add support
for Byte Queue Limits") which was reverted later. The problems causing
the revert seem to have been fixed in the meantime.
Only change to the original patch is that the call to
netdev_reset_queue was moved to rtl8169_tx_clear.

The Tested-by refers to a system using the RTL8168evl chip version.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
---
 drivers/net/ethernet/realtek/r8169.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 114bd9e54..006b0aa8c 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -5851,6 +5851,7 @@ static void rtl8169_tx_clear(struct rtl8169_private *tp)
 {
 	rtl8169_tx_clear_range(tp, tp->dirty_tx, NUM_TX_DESC);
 	tp->cur_tx = tp->dirty_tx = 0;
+	netdev_reset_queue(tp->dev);
 }
 
 static void rtl_reset_work(struct rtl8169_private *tp)
@@ -6153,6 +6154,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
 
 	txd->opts2 = cpu_to_le32(opts[1]);
 
+	netdev_sent_queue(dev, skb->len);
+
 	skb_tx_timestamp(skb);
 
 	/* Force memory writes to complete before releasing descriptor */
@@ -6251,7 +6254,7 @@ static void rtl8169_pcierr_interrupt(struct net_device *dev)
 
 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
 {
-	unsigned int dirty_tx, tx_left;
+	unsigned int dirty_tx, tx_left, bytes_compl = 0, pkts_compl = 0;
 
 	dirty_tx = tp->dirty_tx;
 	smp_rmb();
@@ -6275,10 +6278,8 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
 		rtl8169_unmap_tx_skb(tp_to_dev(tp), tx_skb,
 				     tp->TxDescArray + entry);
 		if (status & LastFrag) {
-			u64_stats_update_begin(&tp->tx_stats.syncp);
-			tp->tx_stats.packets++;
-			tp->tx_stats.bytes += tx_skb->skb->len;
-			u64_stats_update_end(&tp->tx_stats.syncp);
+			pkts_compl++;
+			bytes_compl += tx_skb->skb->len;
 			dev_consume_skb_any(tx_skb->skb);
 			tx_skb->skb = NULL;
 		}
@@ -6287,6 +6288,13 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
 	}
 
 	if (tp->dirty_tx != dirty_tx) {
+		netdev_completed_queue(dev, pkts_compl, bytes_compl);
+
+		u64_stats_update_begin(&tp->tx_stats.syncp);
+		tp->tx_stats.packets += pkts_compl;
+		tp->tx_stats.bytes += bytes_compl;
+		u64_stats_update_end(&tp->tx_stats.syncp);
+
 		tp->dirty_tx = dirty_tx;
 		/* Sync with rtl8169_start_xmit:
 		 * - publish dirty_tx ring index (write barrier)
-- 
2.19.1

^ permalink raw reply related

* Re: [PATCH net] r8169: fix NAPI handling under high load
From: Holger Hoffstätte @ 2018-10-20  9:55 UTC (permalink / raw)
  To: Heiner Kallweit, David Miller, Realtek linux nic maintainers
  Cc: netdev@vger.kernel.org
In-Reply-To: <addbdec0-b497-44ef-d3e6-d0fc4b7740c5@applied-asynchrony.com>

On 10/17/18 22:07, Holger Hoffstätte wrote:
> On 10/17/18 21:27, Heiner Kallweit wrote:
> (snip)
>> Good to know. What's your kernel version and RTL8168 chip version?
>> Regarding the chip version the dmesg line with the XID would be relevant.
> 
> 4.18.15 + PDS (custom CPU scheduler) + cherry pickings from mainline.
> Applied both the original patch in this thread & bql, built fine.

Good news everyone! Been running with the new BQL patch for three days
now on 2 machines and not a single hang/reset, regardless of load.
Coupled with the original patch in this thread (already in mainline)
this looks pretty good!

So while I can only speak for myself & my hardware, here's a

Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>

Thanks Heiner!

-h

^ permalink raw reply

* [PATCH 10/10] Revert "netfilter: xt_quota: fix the behavior of xt_quota module"
From: Pablo Neira Ayuso @ 2018-10-20  9:43 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20181020094317.16011-1-pablo@netfilter.org>

This reverts commit e9837e55b0200da544a095a1fca36efd7fd3ba30.

When talking to Maze and Chenbo, we agreed to keep this back by now
due to problems in the ruleset listing path with 32-bit arches.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/uapi/linux/netfilter/xt_quota.h |  8 ++---
 net/netfilter/xt_quota.c                | 55 ++++++++++++++++++++-------------
 2 files changed, 36 insertions(+), 27 deletions(-)

diff --git a/include/uapi/linux/netfilter/xt_quota.h b/include/uapi/linux/netfilter/xt_quota.h
index d72fd52adbba..f3ba5d9e58b6 100644
--- a/include/uapi/linux/netfilter/xt_quota.h
+++ b/include/uapi/linux/netfilter/xt_quota.h
@@ -15,11 +15,9 @@ struct xt_quota_info {
 	__u32 flags;
 	__u32 pad;
 	__aligned_u64 quota;
-#ifdef __KERNEL__
-	atomic64_t counter;
-#else
-	__aligned_u64 remain;
-#endif
+
+	/* Used internally by the kernel */
+	struct xt_quota_priv	*master;
 };
 
 #endif /* _XT_QUOTA_H */
diff --git a/net/netfilter/xt_quota.c b/net/netfilter/xt_quota.c
index fceae245eb03..10d61a6eed71 100644
--- a/net/netfilter/xt_quota.c
+++ b/net/netfilter/xt_quota.c
@@ -11,6 +11,11 @@
 #include <linux/netfilter/xt_quota.h>
 #include <linux/module.h>
 
+struct xt_quota_priv {
+	spinlock_t	lock;
+	uint64_t	quota;
+};
+
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Sam Johnston <samj@samj.net>");
 MODULE_DESCRIPTION("Xtables: countdown quota match");
@@ -21,48 +26,54 @@ static bool
 quota_mt(const struct sk_buff *skb, struct xt_action_param *par)
 {
 	struct xt_quota_info *q = (void *)par->matchinfo;
-	u64 current_count = atomic64_read(&q->counter);
+	struct xt_quota_priv *priv = q->master;
 	bool ret = q->flags & XT_QUOTA_INVERT;
-	u64 old_count, new_count;
-
-	do {
-		if (current_count == 1)
-			return ret;
-		if (current_count <= skb->len) {
-			atomic64_set(&q->counter, 1);
-			return ret;
-		}
-		old_count = current_count;
-		new_count = current_count - skb->len;
-		current_count = atomic64_cmpxchg(&q->counter, old_count,
-						 new_count);
-	} while (current_count != old_count);
-	return !ret;
+
+	spin_lock_bh(&priv->lock);
+	if (priv->quota >= skb->len) {
+		priv->quota -= skb->len;
+		ret = !ret;
+	} else {
+		/* we do not allow even small packets from now on */
+		priv->quota = 0;
+	}
+	spin_unlock_bh(&priv->lock);
+
+	return ret;
 }
 
 static int quota_mt_check(const struct xt_mtchk_param *par)
 {
 	struct xt_quota_info *q = par->matchinfo;
 
-	BUILD_BUG_ON(sizeof(atomic64_t) != sizeof(__u64));
-
 	if (q->flags & ~XT_QUOTA_MASK)
 		return -EINVAL;
-	if (atomic64_read(&q->counter) > q->quota + 1)
-		return -ERANGE;
 
-	if (atomic64_read(&q->counter) == 0)
-		atomic64_set(&q->counter, q->quota + 1);
+	q->master = kmalloc(sizeof(*q->master), GFP_KERNEL);
+	if (q->master == NULL)
+		return -ENOMEM;
+
+	spin_lock_init(&q->master->lock);
+	q->master->quota = q->quota;
 	return 0;
 }
 
+static void quota_mt_destroy(const struct xt_mtdtor_param *par)
+{
+	const struct xt_quota_info *q = par->matchinfo;
+
+	kfree(q->master);
+}
+
 static struct xt_match quota_mt_reg __read_mostly = {
 	.name       = "quota",
 	.revision   = 0,
 	.family     = NFPROTO_UNSPEC,
 	.match      = quota_mt,
 	.checkentry = quota_mt_check,
+	.destroy    = quota_mt_destroy,
 	.matchsize  = sizeof(struct xt_quota_info),
+	.usersize   = offsetof(struct xt_quota_info, master),
 	.me         = THIS_MODULE,
 };
 
-- 
2.11.0

^ permalink raw reply related

* [PATCH 06/10] netfilter: nf_nat_snmp_basic: add missing helper alias name
From: Pablo Neira Ayuso @ 2018-10-20  9:43 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20181020094317.16011-1-pablo@netfilter.org>

From: Taehee Yoo <ap420073@gmail.com>

In order to upload helper module automatically, helper alias name
is needed. so that MODULE_ALIAS_NFCT_HELPER() should be added.
And unlike other nat helper modules, the nf_nat_snmp_basic can be
used independently.
helper name is "snmp_trap" so that alias name will be
"nfct-helper-snmp_trap" by MODULE_ALIAS_NFCT_HELPER(snmp_trap)

test command:
   %iptables -t raw -I PREROUTING -p udp -j CT --helper snmp_trap
   %lsmod | grep nf_nat_snmp_basic

We can see nf_nat_snmp_basic module is uploaded automatically.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/nf_nat_snmp_basic_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/netfilter/nf_nat_snmp_basic_main.c b/net/ipv4/netfilter/nf_nat_snmp_basic_main.c
index ac110c1d55b5..a0aa13bcabda 100644
--- a/net/ipv4/netfilter/nf_nat_snmp_basic_main.c
+++ b/net/ipv4/netfilter/nf_nat_snmp_basic_main.c
@@ -60,6 +60,7 @@ MODULE_LICENSE("GPL");
 MODULE_AUTHOR("James Morris <jmorris@intercode.com.au>");
 MODULE_DESCRIPTION("Basic SNMP Application Layer Gateway");
 MODULE_ALIAS("ip_nat_snmp_basic");
+MODULE_ALIAS_NFCT_HELPER("snmp_trap");
 
 #define SNMP_PORT 161
 #define SNMP_TRAP_PORT 162
-- 
2.11.0

^ permalink raw reply related

* [PATCH 09/10] netfilter: nfnetlink_log: remove empty nfnetlink_log.h header file
From: Pablo Neira Ayuso @ 2018-10-20  9:43 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20181020094317.16011-1-pablo@netfilter.org>

From: Taehee Yoo <ap420073@gmail.com>

/include/net/netfilter/nfnetlink_log.h file is empty.
so that it can be removed.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nfnetlink_log.h | 1 -
 1 file changed, 1 deletion(-)
 delete mode 100644 include/net/netfilter/nfnetlink_log.h

diff --git a/include/net/netfilter/nfnetlink_log.h b/include/net/netfilter/nfnetlink_log.h
deleted file mode 100644
index ea32a7d3cf1b..000000000000
--- a/include/net/netfilter/nfnetlink_log.h
+++ /dev/null
@@ -1 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-- 
2.11.0

^ permalink raw reply related

* [PATCH 08/10] netfilter: remove two unused variables.
From: Pablo Neira Ayuso @ 2018-10-20  9:43 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20181020094317.16011-1-pablo@netfilter.org>

From: Weongyo Jeong <weongyo.linux@gmail.com>

nft_dup_netdev_ingress_ops and nft_fwd_netdev_ingress_ops variables are
no longer used at the code.

Signed-off-by: Weongyo Jeong <weongyo.linux@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_dup_netdev.c | 2 --
 net/netfilter/nft_fwd_netdev.c | 4 ----
 2 files changed, 6 deletions(-)

diff --git a/net/netfilter/nft_dup_netdev.c b/net/netfilter/nft_dup_netdev.c
index 2cc1e0ef56e8..15cc62b293d6 100644
--- a/net/netfilter/nft_dup_netdev.c
+++ b/net/netfilter/nft_dup_netdev.c
@@ -46,8 +46,6 @@ static int nft_dup_netdev_init(const struct nft_ctx *ctx,
 	return nft_validate_register_load(priv->sreg_dev, sizeof(int));
 }
 
-static const struct nft_expr_ops nft_dup_netdev_ingress_ops;
-
 static int nft_dup_netdev_dump(struct sk_buff *skb, const struct nft_expr *expr)
 {
 	struct nft_dup_netdev *priv = nft_expr_priv(expr);
diff --git a/net/netfilter/nft_fwd_netdev.c b/net/netfilter/nft_fwd_netdev.c
index 8abb9891cdf2..d7694e7255a0 100644
--- a/net/netfilter/nft_fwd_netdev.c
+++ b/net/netfilter/nft_fwd_netdev.c
@@ -53,8 +53,6 @@ static int nft_fwd_netdev_init(const struct nft_ctx *ctx,
 	return nft_validate_register_load(priv->sreg_dev, sizeof(int));
 }
 
-static const struct nft_expr_ops nft_fwd_netdev_ingress_ops;
-
 static int nft_fwd_netdev_dump(struct sk_buff *skb, const struct nft_expr *expr)
 {
 	struct nft_fwd_netdev *priv = nft_expr_priv(expr);
@@ -169,8 +167,6 @@ static int nft_fwd_neigh_init(const struct nft_ctx *ctx,
 	return nft_validate_register_load(priv->sreg_addr, addr_len);
 }
 
-static const struct nft_expr_ops nft_fwd_netdev_ingress_ops;
-
 static int nft_fwd_neigh_dump(struct sk_buff *skb, const struct nft_expr *expr)
 {
 	struct nft_fwd_neigh *priv = nft_expr_priv(expr);
-- 
2.11.0

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox