* Re: [PATCH] cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan
From: David Miller @ 2017-10-09 23:03 UTC (permalink / raw)
To: aleksander
Cc: oliver, marco.demarco, stefano.godeas, linux-usb, netdev,
linux-kernel
In-Reply-To: <20171009120512.16681-1-aleksander@aleksander.es>
From: Aleksander Morgado <aleksander@aleksander.es>
Date: Mon, 9 Oct 2017 14:05:12 +0200
> The u-blox TOBY-L2 is a LTE Cat 4 module with HSPA+ and 2G fallback.
> This module allows switching to different USB profiles with the
> 'AT+UUSBCONF' command, and provides a ECM network interface when the
> 'AT+UUSBCONF=2' profile is selected.
>
> The u-blox SARA-U2 is a HSPA module with 2G fallback. The default USB
> configuration includes a ECM network interface.
>
> Both these modules are controlled via AT commands through one of the
> TTYs exposed. Connecting these modules may be done just by activating
> the desired PDP context with 'AT+CGACT=1,<cid>' and then running DHCP
> on the ECM interface.
>
> Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Applied, thank you.
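
The connection step described in the quoted commit message can be sketched in C. This is only an illustration: format_cgact() is a hypothetical helper, not part of the patch, and terminating the command with a carriage return is an assumption about how the modem's TTY expects AT commands.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): format the PDP context
 * activation command 'AT+CGACT=1,<cid>' for a given context id.
 * Assumption: the command is terminated with a carriage return.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int format_cgact(char *buf, size_t len, unsigned int cid)
{
	int n = snprintf(buf, len, "AT+CGACT=1,%u\r", cid);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

The resulting string would be written to one of the TTYs the module exposes, after which DHCP is run on the ECM interface as the commit message describes.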
^ permalink raw reply
* Re: linux-next: manual merge of the cgroup tree with the net-next tree
From: Alexei Starovoitov @ 2017-10-09 23:04 UTC (permalink / raw)
To: Mark Brown
Cc: Tejun Heo, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
David S. Miller, netdev, Linux-Next Mailing List,
Linux Kernel Mailing List
In-Reply-To: <20171009183836.cceczuqytzmgqubr@sirena.co.uk>
On Mon, Oct 09, 2017 at 07:38:36PM +0100, Mark Brown wrote:
> Hi Tejun,
>
> Today's linux-next merge of the cgroup tree got a conflict in:
>
> kernel/cgroup/cgroup.c
>
> between commit:
>
> 324bda9e6c5ad ("bpf: multi program support for cgroup+bpf")
>
> from the net-next tree and commit:
>
> 041cd640b2f3c ("cgroup: Implement cgroup2 basic CPU usage accounting")
>
> from the cgroup tree.
>
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging. You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
>
> diff --cc kernel/cgroup/cgroup.c
> index 00f5b358aeac,c3421ee0d230..000000000000
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@@ -4765,8 -4785,9 +4788,11 @@@ static struct cgroup *cgroup_create(str
>
> return cgrp;
>
> +out_idr_free:
> + cgroup_idr_remove(&root->cgroup_idr, cgrp->id);
> + out_stat_exit:
> + if (cgroup_on_dfl(parent))
> + cgroup_stat_exit(cgrp);
thanks. I did the same merge conflict resolution for our combined tree.
^ permalink raw reply
* Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: Alexei Starovoitov @ 2017-10-09 23:06 UTC (permalink / raw)
To: Levin, Alexander (Sasha Levin)
Cc: Tim Hansen, davem@davemloft.net, willemb@google.com,
edumazet@google.com, soheil@google.com, pabeni@redhat.com,
elena.reshetova@intel.com, tom@quantonium.net, Jason@zx2c4.com,
fw@strlen.de, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20171009202633.ep5pbi2tlg7dqidz@sasha-lappy>
On Mon, Oct 09, 2017 at 08:26:34PM +0000, Levin, Alexander (Sasha Levin) wrote:
> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
> >> Fix BUG() calls to use BUG_ON(conditional) macros.
> >>
> >> This was found using make coccicheck M=net/core on linux next
> >> tag next-2017092
> >>
> >> Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
> >> ---
> >> net/core/skbuff.c | 15 ++++++---------
> >> 1 file changed, 6 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
> >> --- a/net/core/skbuff.c
> >> +++ b/net/core/skbuff.c
> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
> >> /* Set the tail pointer and length */
> >> skb_put(n, skb->len);
> >>
> >> - if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
> >> - BUG();
> >> + BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
> >
> >I'm concerned with this change.
> >1. Calling non-trivial bit of code inside the macro is a poor coding style (imo)
> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and implementation
> >of BUG and BUG_ON look quite different.
>
> For these archs, wouldn't it then be more efficient to use BUG_ON rather than BUG()?
why more efficient? any data to prove that?
I'm pointing out that the change is not equivalent and
this code has been around forever (pre-git days), so I see
no reason to risk changing it.
^ permalink raw reply
* Re: [PATCH net-next v2 1/5] bpf: Add file mode configuration into bpf maps
From: Alexei Starovoitov @ 2017-10-09 23:07 UTC (permalink / raw)
To: Chenbo Feng
Cc: linux-security-module, netdev, SELinux, Jeffrey Vander Stoep,
lorenzo, Daniel Borkmann, Stephen Smalley, Chenbo Feng
In-Reply-To: <20171009222028.13096-2-chenbofeng.kernel@gmail.com>
On Mon, Oct 09, 2017 at 03:20:24PM -0700, Chenbo Feng wrote:
> From: Chenbo Feng <fengc@google.com>
>
> Introduce map read/write flags to the eBPF syscalls that return a
> map fd. The flags are used to set up the file mode when constructing a new
> file descriptor for bpf maps. To avoid breaking backward compatibility, the
> f_flags is set to O_RDWR if the flag passed by the syscall is 0. Otherwise
> it should be O_RDONLY or O_WRONLY. When userspace wants to modify or
> read the map content, the kernel checks the file mode to see if the
> operation is allowed.
>
> Signed-off-by: Chenbo Feng <fengc@google.com>
> Acked-by: Alexei Starovoitov <ast@kernel.org>
> ---
> include/linux/bpf.h | 6 ++--
> include/uapi/linux/bpf.h | 6 ++++
> kernel/bpf/arraymap.c | 7 +++--
> kernel/bpf/devmap.c | 5 ++-
> kernel/bpf/hashtab.c | 5 +--
> kernel/bpf/inode.c | 15 ++++++---
> kernel/bpf/lpm_trie.c | 3 +-
> kernel/bpf/sockmap.c | 5 ++-
> kernel/bpf/stackmap.c | 5 ++-
> kernel/bpf/syscall.c | 80 +++++++++++++++++++++++++++++++++++++++++++-----
> 10 files changed, 114 insertions(+), 23 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index bc7da2ddfcaf..0e9ca2555d7f 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -308,11 +308,11 @@ void bpf_map_area_free(void *base);
>
> extern int sysctl_unprivileged_bpf_disabled;
>
> -int bpf_map_new_fd(struct bpf_map *map);
> +int bpf_map_new_fd(struct bpf_map *map, int flags);
> int bpf_prog_new_fd(struct bpf_prog *prog);
>
> int bpf_obj_pin_user(u32 ufd, const char __user *pathname);
> -int bpf_obj_get_user(const char __user *pathname);
> +int bpf_obj_get_user(const char __user *pathname, int flags);
>
> int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value);
> int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value);
> @@ -331,6 +331,8 @@ int bpf_fd_htab_map_update_elem(struct bpf_map *map, struct file *map_file,
> void *key, void *value, u64 map_flags);
> int bpf_fd_htab_map_lookup_elem(struct bpf_map *map, void *key, u32 *value);
>
> +int bpf_get_file_flag(int flags);
> +
> /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and
> * forced to use 'long' read/writes to try to atomically copy long counters.
> * Best-effort only. No barriers here, since it _will_ race with concurrent
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 6db9e1d679cd..9cb50a228c39 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -217,6 +217,10 @@ enum bpf_attach_type {
>
> #define BPF_OBJ_NAME_LEN 16U
>
> +/* Flags for accessing BPF object */
> +#define BPF_F_RDONLY (1U << 3)
> +#define BPF_F_WRONLY (1U << 4)
> +
> union bpf_attr {
> struct { /* anonymous struct used by BPF_MAP_CREATE command */
> __u32 map_type; /* one of enum bpf_map_type */
> @@ -259,6 +263,7 @@ union bpf_attr {
> struct { /* anonymous struct used by BPF_OBJ_* commands */
> __aligned_u64 pathname;
> __u32 bpf_fd;
> + __u32 file_flags;
> };
>
> struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
> @@ -286,6 +291,7 @@ union bpf_attr {
> __u32 map_id;
> };
> __u32 next_id;
> + __u32 open_flags;
> };
>
> struct { /* anonymous struct used by BPF_OBJ_GET_INFO_BY_FD */
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 68d866628be0..f869e48ef2f6 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -19,6 +19,9 @@
>
> #include "map_in_map.h"
>
> +#define ARRAY_CREATE_FLAG_MASK \
> + (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY)
> +
> static void bpf_array_free_percpu(struct bpf_array *array)
> {
> int i;
> @@ -56,8 +59,8 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
>
> /* check sanity of attributes */
> if (attr->max_entries == 0 || attr->key_size != 4 ||
> - attr->value_size == 0 || attr->map_flags & ~BPF_F_NUMA_NODE ||
> - (percpu && numa_node != NUMA_NO_NODE))
> + attr->value_size == 0 || attr->map_flags &
> + ~ARRAY_CREATE_FLAG_MASK || (percpu && numa_node != NUMA_NO_NODE))
that's a very non-standard way of breaking lines.
Did you run checkpatch? Did it complain?
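
For illustration, the same condition can be broken at the logical operators, so each sub-expression stays on one line. The sketch below is a userspace stand-in, not the code from the patch: array_attr_invalid() and its parameters are hypothetical, and the value of BPF_F_NUMA_NODE is assumed (only BPF_F_RDONLY and BPF_F_WRONLY are defined in the quoted diff).

```c
#include <stdbool.h>
#include <stdint.h>

#define BPF_F_NUMA_NODE	(1U << 2)	/* assumed value, not in the quoted diff */
#define BPF_F_RDONLY	(1U << 3)	/* from the quoted diff */
#define BPF_F_WRONLY	(1U << 4)	/* from the quoted diff */

#define ARRAY_CREATE_FLAG_MASK \
	(BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY)

/* Hypothetical stand-in for the attribute check in array_map_alloc(). */
static bool array_attr_invalid(uint32_t max_entries, uint32_t key_size,
			       uint32_t value_size, uint32_t map_flags,
			       bool percpu, bool has_numa_node)
{
	/* Each line holds one complete sub-expression of the check. */
	return max_entries == 0 || key_size != 4 ||
	       value_size == 0 ||
	       (map_flags & ~ARRAY_CREATE_FLAG_MASK) ||
	       (percpu && has_numa_node);
}
```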
^ permalink raw reply
* [GIT] Networking
From: David Miller @ 2017-10-09 23:10 UTC (permalink / raw)
To: torvalds; +Cc: akpm, linux-kernel, netdev
1) Fix object leak on IPSEC offload failure, from Steffen Klassert.
2) Fix range checks in ipset address range addition operations,
from Jozsef Kadlecsik.
3) Fix pernet ops unregistration order in ipset, from Florian
Westphal.
4) Add missing netlink attribute policy for nl80211 packet pattern
attrs, from Peng Xu.
5) Fix PPP device destruction race, from Guillaume Nault.
6) Write marks get lost when BPF verifier processes R1=R2 register
assignments, causing incorrect liveness information and less
state pruning. Fix from Alexei Starovoitov.
7) Fix blackhole routes so that they are marked dead and therefore
not cached in sockets, otherwise IPSEC stops working. From
Steffen Klassert.
8) Fix broadcast handling of UDP socket early demux, from Paolo
Abeni.
Please pull, thanks a lot!
The following changes since commit 7a92616c0bac849e790283723b36c399668a1d9f:
Merge tag 'pm-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm (2017-10-05 15:51:37 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
for you to fetch changes up to fdfbad3256918fc5736d68384331d2dbf45ccbd6:
cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan (2017-10-09 16:03:32 -0700)
----------------------------------------------------------------
Aleksander Morgado (1):
cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan
Alexei Starovoitov (1):
bpf: fix liveness marking
Alexey Kodanev (2):
vti: fix NULL dereference in xfrm_input()
gso: fix payload length when gso_size is zero
Artem Savkov (2):
xfrm: don't call xfrm_policy_cache_flush under xfrm_state_lock
netfilter: ebtables: fix race condition in frame_filter_net_init()
Arvind Yadav (1):
netfilter: nf_tables: Release memory obtained by kasprintf
Axel Beckert (1):
doc: Fix typo "8023.ad" in bonding documentation
Dan Carpenter (1):
selftests/net: rxtimestamp: Fix an off by one
David S. Miller (4):
Merge branch 'master' of git://git.kernel.org/.../klassert/ipsec
Merge tag 'mac80211-for-davem-2017-10-09' of git://git.kernel.org/.../jberg/mac80211
Merge branch '10GbE' of git://git.kernel.org/.../jkirsher/net-queue
Merge git://git.kernel.org/.../pablo/nf
Ding Tianhong (2):
Revert commit 1a8b6d76dc5b ("net:add one common config...")
net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
Eric Dumazet (1):
netfilter: x_tables: avoid stack-out-of-bounds read in xt_copy_counters_from_user
Florian Westphal (1):
netfilter: ipset: pernet ops must be unregistered last
Guillaume Nault (1):
ppp: fix race in ppp device destruction
Gustavo A. R. Silva (1):
net: thunderx: mark expected switch fall-throughs in nicvf_main()
Ido Schimmel (1):
mlxsw: spectrum_router: Avoid expensive lookup during route removal
Jason A. Donenfeld (1):
netlink: do not set cb_running if dump's start() errs
JingPiao Chen (1):
netfilter: nf_tables: fix update chain error
John Fastabend (1):
ixgbe: incorrect XDP ring accounting in ethtool tx_frame param
Jon Maloy (2):
tipc: correct initialization of skb list
tipc: Unclone message at secondary destination lookup
Jozsef Kadlecsik (1):
netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses
Lin Zhang (1):
netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook
Mark D Rustad (1):
ixgbe: Return error when getting PHY address if PHY access is not supported
Matteo Croce (1):
ipv6: fix net.ipv6.conf.all.accept_dad behaviour for real
Pablo Neira Ayuso (1):
netfilter: nf_tables: do not dump chain counters if not enabled
Paolo Abeni (1):
udp: fix bcast packet reception
Peng Xu (1):
nl80211: Define policy for packet pattern attributes
Ross Lagerwall (1):
netfilter: ipset: Fix race between dump and swap
Sabrina Dubroca (1):
ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register
Shmulik Ladkani (1):
netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
Steffen Klassert (4):
xfrm: Fix deletion of offloaded SAs on failure.
xfrm: Fix negative device refcount on offload failure.
ipv6: Fix traffic triggered IPsec connections.
ipv4: Fix traffic triggered IPsec connections.
Subash Abhinov Kasiviswanathan (1):
netfilter: xt_socket: Restore mark from full sockets only
Vadim Fedorenko (1):
netfilter: ipvs: full-functionality option for ECN encapsulation in tunnel
Documentation/networking/bonding.txt | 2 +-
arch/Kconfig | 3 ---
arch/sparc/Kconfig | 1 -
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 2 ++
drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c | 22 ----------------------
drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 19 -------------------
drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 16 ++++++++--------
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 +++++-
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 14 --------------
drivers/net/ppp/ppp_generic.c | 20 ++++++++++++++++++++
drivers/net/usb/cdc_ether.c | 13 +++++++++++++
include/linux/bpf.h | 5 +++++
include/linux/netfilter_bridge/ebtables.h | 7 ++++---
include/uapi/linux/netfilter/xt_bpf.h | 1 +
kernel/bpf/inode.c | 1 +
kernel/bpf/verifier.c | 5 +++++
net/bridge/netfilter/ebtable_broute.c | 4 ++--
net/bridge/netfilter/ebtable_filter.c | 4 ++--
net/bridge/netfilter/ebtable_nat.c | 4 ++--
net/bridge/netfilter/ebtables.c | 17 +++++++++--------
net/ipv4/gre_offload.c | 2 +-
net/ipv4/netfilter/ipt_SYNPROXY.c | 3 ++-
net/ipv4/route.c | 2 +-
net/ipv4/udp.c | 14 +++++---------
net/ipv4/udp_offload.c | 2 +-
net/ipv6/addrconf.c | 4 ++--
net/ipv6/ip6_offload.c | 2 +-
net/ipv6/netfilter/ip6t_SYNPROXY.c | 2 +-
net/ipv6/route.c | 2 +-
net/netfilter/ipset/ip_set_core.c | 29 ++++++++++++++++++-----------
net/netfilter/ipset/ip_set_hash_ip.c | 22 ++++++++++++----------
net/netfilter/ipset/ip_set_hash_ipmark.c | 2 +-
net/netfilter/ipset/ip_set_hash_ipport.c | 2 +-
net/netfilter/ipset/ip_set_hash_ipportip.c | 2 +-
net/netfilter/ipset/ip_set_hash_ipportnet.c | 4 ++--
net/netfilter/ipset/ip_set_hash_net.c | 2 +-
net/netfilter/ipset/ip_set_hash_netiface.c | 2 +-
net/netfilter/ipset/ip_set_hash_netnet.c | 4 ++--
net/netfilter/ipset/ip_set_hash_netport.c | 2 +-
net/netfilter/ipset/ip_set_hash_netportnet.c | 4 ++--
net/netfilter/ipvs/ip_vs_xmit.c | 8 ++++++--
net/netfilter/nf_tables_api.c | 10 ++++++----
net/netfilter/x_tables.c | 4 ++--
net/netfilter/xt_bpf.c | 22 ++++++++++++++++++++--
net/netfilter/xt_socket.c | 4 ++--
net/netlink/af_netlink.c | 13 +++++++------
net/tipc/bcast.c | 4 ++--
net/tipc/msg.c | 8 ++++++++
net/wireless/nl80211.c | 14 ++++++++++++--
net/xfrm/xfrm_device.c | 1 +
net/xfrm/xfrm_input.c | 6 ++++--
net/xfrm/xfrm_state.c | 4 ++--
net/xfrm/xfrm_user.c | 1 +
tools/testing/selftests/networking/timestamping/rxtimestamp.c | 2 +-
54 files changed, 211 insertions(+), 164 deletions(-)
^ permalink raw reply
* Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: Levin, Alexander (Sasha Levin) @ 2017-10-09 23:15 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Tim Hansen, davem@davemloft.net, willemb@google.com,
edumazet@google.com, soheil@google.com, pabeni@redhat.com,
elena.reshetova@intel.com, tom@quantonium.net, Jason@zx2c4.com,
fw@strlen.de, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20171009230618.e5gla2iuqwmndkig@ast-mbp>
On Mon, Oct 09, 2017 at 04:06:20PM -0700, Alexei Starovoitov wrote:
>On Mon, Oct 09, 2017 at 08:26:34PM +0000, Levin, Alexander (Sasha Levin) wrote:
>> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
>> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
>> >> Fix BUG() calls to use BUG_ON(conditional) macros.
>> >>
>> >> This was found using make coccicheck M=net/core on linux next
>> >> tag next-2017092
>> >>
>> >> Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
>> >> ---
>> >> net/core/skbuff.c | 15 ++++++---------
>> >> 1 file changed, 6 insertions(+), 9 deletions(-)
>> >>
>> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
>> >> --- a/net/core/skbuff.c
>> >> +++ b/net/core/skbuff.c
>> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
>> >> /* Set the tail pointer and length */
>> >> skb_put(n, skb->len);
>> >>
>> >> - if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
>> >> - BUG();
>> >> + BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
>> >
>> >I'm concerned with this change.
>> >1. Calling non-trivial bit of code inside the macro is a poor coding style (imo)
>> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and implementation
>> >of BUG and BUG_ON look quite different.
>>
>> For these archs, wouldn't it then be more efficient to use BUG_ON rather than BUG()?
>
>why more efficient? any data to prove that?
Just guessing.
Either way, is there a particular reason for not using BUG_ON() here
besides that its implementation is "quite different"?
>I'm pointing out that the change is not equivalent and
>this code has been around forever (pre-git days), so I see
>no reason to risk changing it.
Do you know that BUG_ON() is broken on any archs?
If not, "this code has been around forever" is really not an excuse to
not touch code.
If BUG_ON() behavior is broken somewhere, then it needs to get fixed.
--
Thanks,
Sasha
^ permalink raw reply
* Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: David Miller @ 2017-10-09 23:17 UTC (permalink / raw)
To: alexei.starovoitov
Cc: alexander.levin, devtimhansen, willemb, edumazet, soheil, pabeni,
elena.reshetova, tom, Jason, fw, netdev, linux-kernel
In-Reply-To: <20171009230618.e5gla2iuqwmndkig@ast-mbp>
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date: Mon, 9 Oct 2017 16:06:20 -0700
>> For these archs, wouldn't it then be more efficient to use BUG_ON
>> rather than BUG()?
>
> why more efficient? any data to prove that?
It can completely eliminate a branch.
For example on powerpc if you use BUG() then the code generated is:
test condition
branch_not_true 1f
unconditional_trap
1:
Whereas with BUG_ON() it's just:
test condition
trap_if_true
Which is a lot better even when the branches in the first case are
well predicted.
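
The two source forms being compared can be sketched in userspace C. The macros below are stand-ins, not the kernel's: MY_BUG() bumps a counter instead of trapping, and MY_BUG_ON() is written in the shape of a generic "if (cond) BUG()" fallback. Architectures that provide their own BUG_ON() replace that fallback with something like the single conditional trap shown above, which this sketch cannot reproduce.

```c
/* Userspace stand-ins; these are not the kernel macros. */
static int trap_count;	/* counts "traps" instead of crashing */

#define MY_BUG()	(trap_count++)
#define MY_BUG_ON(cond)	do { if (cond) MY_BUG(); } while (0)

/* Open-coded form: explicit branch around an unconditional trap. */
static void check_open_coded(int err)
{
	if (err)
		MY_BUG();
}

/* Macro form: the condition is handed to the macro as a whole. */
static void check_with_bug_on(int err)
{
	MY_BUG_ON(err);
}
```

For a condition without side effects the two functions behave the same; the difference discussed in this thread is in the code an architecture can generate for the macro form.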
^ permalink raw reply
* Re: linux-next: manual merge of the drivers-x86 tree with the net-next tree
From: Darren Hart @ 2017-10-09 23:18 UTC (permalink / raw)
To: Mark Brown
Cc: Mika Westerberg, Mario Limonciello, Yehezkel Bernat,
Andy Shevchenko, Amir Levy, Michael Jamet, David S. Miller,
netdev, Linux-Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20171009195634.gzxwcqn7xklumqyr@sirena.co.uk>
On Mon, Oct 09, 2017 at 08:56:34PM +0100, Mark Brown wrote:
> On Mon, Oct 09, 2017 at 10:43:01PM +0300, Mika Westerberg wrote:
>
> > If possible, I would rather move this chapter to be before "Networking
> > over Thunderbolt cable". Reason is that it then follows NVM flashing
> > chapter which is typically where you need to force power in the first
> > place.
>
Agreed.
> I guess that's something best sorted out either in the relevant trees or
> during the merge window?
I'm not sure how we would deal with it in the trees. Best to note this during
the merge window - whichever goes in second. Test merge will identify the merge
conflict, and we can include a note to Linus on the preference.
--
Darren Hart
VMware Open Source Technology Center
^ permalink raw reply
* Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: Alexei Starovoitov @ 2017-10-09 23:23 UTC (permalink / raw)
To: Levin, Alexander (Sasha Levin)
Cc: Tim Hansen, davem@davemloft.net, willemb@google.com,
edumazet@google.com, soheil@google.com, pabeni@redhat.com,
elena.reshetova@intel.com, tom@quantonium.net, Jason@zx2c4.com,
fw@strlen.de, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20171009231538.doypjzvxzkoxyoeo@sasha-lappy>
On Mon, Oct 09, 2017 at 11:15:40PM +0000, Levin, Alexander (Sasha Levin) wrote:
> On Mon, Oct 09, 2017 at 04:06:20PM -0700, Alexei Starovoitov wrote:
> >On Mon, Oct 09, 2017 at 08:26:34PM +0000, Levin, Alexander (Sasha Levin) wrote:
> >> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
> >> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
> >> >> Fix BUG() calls to use BUG_ON(conditional) macros.
> >> >>
> >> >> This was found using make coccicheck M=net/core on linux next
> >> >> tag next-2017092
> >> >>
> >> >> Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
> >> >> ---
> >> >> net/core/skbuff.c | 15 ++++++---------
> >> >> 1 file changed, 6 insertions(+), 9 deletions(-)
> >> >>
> >> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
> >> >> --- a/net/core/skbuff.c
> >> >> +++ b/net/core/skbuff.c
> >> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
> >> >> /* Set the tail pointer and length */
> >> >> skb_put(n, skb->len);
> >> >>
> >> >> - if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
> >> >> - BUG();
> >> >> + BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
> >> >
> >> >I'm concerned with this change.
> >> >1. Calling non-trivial bit of code inside the macro is a poor coding style (imo)
> >> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and implementation
> >> >of BUG and BUG_ON look quite different.
> >>
> >> For these archs, wouldn't it then be more efficient to use BUG_ON rather than BUG()?
> >
> >why more efficient? any data to prove that?
>
> Just guessing.
>
> Either way, is there a particular reason for not using BUG_ON() here
> besides that its implementation is "quite different"?
>
> >I'm pointing out that the change is not equivalent and
> >this code has been around forever (pre-git days), so I see
> >no reason to risk changing it.
>
> Do you know that BUG_ON() is broken on any archs?
>
> If not, "this code has been around forever" is really not an excuse to
> not touch code.
>
> If BUG_ON() behavior is broken somewhere, then it needs to get fixed.
no idea whether it's broken. My main objection is #1.
imo it's a very poor coding style to put functions with
side-effects into macros. Especially debug/bug/warn-like.
For example llvm has a DEBUG() macro, and everything inside it
disappears depending on compilation flags.
I wouldn't be surprised if somebody, in the name of
security (to avoid a crash on BUG_ON), replaces
BUG/BUG_ON with some other implementation or a nop
and ends up with real bugs, since skb_copy_bits() is somehow
not called or is called in a different context.
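
The concern about side effects can be shown with a small userspace sketch. Everything here is hypothetical: CHECK_ENABLED() and CHECK_DISABLED() are not the kernel's BUG_ON(), and copy_bits() merely stands in for a function with a side effect such as skb_copy_bits(). The disabled variant relies on the fact that sizeof does not evaluate its operand, much like assert() under NDEBUG never evaluates its argument.

```c
#include <stdlib.h>

/* Hypothetical check macros; neither is the kernel's BUG_ON(). */
#define CHECK_ENABLED(cond)	do { if (cond) abort(); } while (0)
/* sizeof does not evaluate its operand, so a call inside never runs. */
#define CHECK_DISABLED(cond)	((void)sizeof(cond))

static int copies_done;

/* Stand-in for a function with a side effect, such as skb_copy_bits(). */
static int copy_bits(void)
{
	copies_done++;
	return 0;	/* 0 means success */
}

/* Side effect inside the macro: lost when the check is compiled out. */
static void copy_inside_check(void)
{
	CHECK_DISABLED(copy_bits());
}

/* Side effect outside the macro: survives either variant. */
static void copy_outside_check(void)
{
	int err = copy_bits();

	CHECK_DISABLED(err);
}
```

With the disabled variant, copy_inside_check() never performs the copy, while copy_outside_check() still does.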
^ permalink raw reply
* Re: [PATCH net-next v2 1/5] bpf: Add file mode configuration into bpf maps
From: Chenbo Feng @ 2017-10-09 23:31 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Chenbo Feng, linux-security-module, netdev, SELinux,
Jeffrey Vander Stoep, Lorenzo Colitti, Daniel Borkmann,
Stephen Smalley
In-Reply-To: <20171009230718.q6y57izbnyqtfw4y@ast-mbp>
On Mon, Oct 9, 2017 at 4:07 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Mon, Oct 09, 2017 at 03:20:24PM -0700, Chenbo Feng wrote:
>> From: Chenbo Feng <fengc@google.com>
>>
>> Introduce map read/write flags to the eBPF syscalls that return a
>> map fd. The flags are used to set up the file mode when constructing a new
>> file descriptor for bpf maps. To avoid breaking backward compatibility, the
>> f_flags is set to O_RDWR if the flag passed by the syscall is 0. Otherwise
>> it should be O_RDONLY or O_WRONLY. When userspace wants to modify or
>> read the map content, the kernel checks the file mode to see if the
>> operation is allowed.
>>
>> Signed-off-by: Chenbo Feng <fengc@google.com>
>> Acked-by: Alexei Starovoitov <ast@kernel.org>
>> ---
>> include/linux/bpf.h | 6 ++--
>> include/uapi/linux/bpf.h | 6 ++++
>> kernel/bpf/arraymap.c | 7 +++--
>> kernel/bpf/devmap.c | 5 ++-
>> kernel/bpf/hashtab.c | 5 +--
>> kernel/bpf/inode.c | 15 ++++++---
>> kernel/bpf/lpm_trie.c | 3 +-
>> kernel/bpf/sockmap.c | 5 ++-
>> kernel/bpf/stackmap.c | 5 ++-
>> kernel/bpf/syscall.c | 80 +++++++++++++++++++++++++++++++++++++++++++-----
>> 10 files changed, 114 insertions(+), 23 deletions(-)
>>
>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>> index bc7da2ddfcaf..0e9ca2555d7f 100644
>> --- a/include/linux/bpf.h
>> +++ b/include/linux/bpf.h
>> @@ -308,11 +308,11 @@ void bpf_map_area_free(void *base);
>>
>> extern int sysctl_unprivileged_bpf_disabled;
>>
>> -int bpf_map_new_fd(struct bpf_map *map);
>> +int bpf_map_new_fd(struct bpf_map *map, int flags);
>> int bpf_prog_new_fd(struct bpf_prog *prog);
>>
>> int bpf_obj_pin_user(u32 ufd, const char __user *pathname);
>> -int bpf_obj_get_user(const char __user *pathname);
>> +int bpf_obj_get_user(const char __user *pathname, int flags);
>>
>> int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value);
>> int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value);
>> @@ -331,6 +331,8 @@ int bpf_fd_htab_map_update_elem(struct bpf_map *map, struct file *map_file,
>> void *key, void *value, u64 map_flags);
>> int bpf_fd_htab_map_lookup_elem(struct bpf_map *map, void *key, u32 *value);
>>
>> +int bpf_get_file_flag(int flags);
>> +
>> /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and
>> * forced to use 'long' read/writes to try to atomically copy long counters.
>> * Best-effort only. No barriers here, since it _will_ race with concurrent
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 6db9e1d679cd..9cb50a228c39 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -217,6 +217,10 @@ enum bpf_attach_type {
>>
>> #define BPF_OBJ_NAME_LEN 16U
>>
>> +/* Flags for accessing BPF object */
>> +#define BPF_F_RDONLY (1U << 3)
>> +#define BPF_F_WRONLY (1U << 4)
>> +
>> union bpf_attr {
>> struct { /* anonymous struct used by BPF_MAP_CREATE command */
>> __u32 map_type; /* one of enum bpf_map_type */
>> @@ -259,6 +263,7 @@ union bpf_attr {
>> struct { /* anonymous struct used by BPF_OBJ_* commands */
>> __aligned_u64 pathname;
>> __u32 bpf_fd;
>> + __u32 file_flags;
>> };
>>
>> struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
>> @@ -286,6 +291,7 @@ union bpf_attr {
>> __u32 map_id;
>> };
>> __u32 next_id;
>> + __u32 open_flags;
>> };
>>
>> struct { /* anonymous struct used by BPF_OBJ_GET_INFO_BY_FD */
>> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
>> index 68d866628be0..f869e48ef2f6 100644
>> --- a/kernel/bpf/arraymap.c
>> +++ b/kernel/bpf/arraymap.c
>> @@ -19,6 +19,9 @@
>>
>> #include "map_in_map.h"
>>
>> +#define ARRAY_CREATE_FLAG_MASK \
>> + (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY)
>> +
>> static void bpf_array_free_percpu(struct bpf_array *array)
>> {
>> int i;
>> @@ -56,8 +59,8 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
>>
>> /* check sanity of attributes */
>> if (attr->max_entries == 0 || attr->key_size != 4 ||
>> - attr->value_size == 0 || attr->map_flags & ~BPF_F_NUMA_NODE ||
>> - (percpu && numa_node != NUMA_NO_NODE))
>> + attr->value_size == 0 || attr->map_flags &
>> + ~ARRAY_CREATE_FLAG_MASK || (percpu && numa_node != NUMA_NO_NODE))
>
> that's a very non-standard way of breaking lines.
> Did you run checkpatch? Did it complain?
>
Will fix in the next revision. checkpatch didn't say anything about
this: 0 errors and 0 warnings for this patch series.
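
The flag-to-file-mode mapping described in the commit message can be sketched in userspace C. This is not the code from the patch: file_flag_from_bpf_flags() is a hypothetical name, and rejecting a request that sets both flags with -EINVAL is an assumption, since the quoted text only says the result should be O_RDONLY or O_WRONLY when a flag is given.

```c
#include <errno.h>
#include <fcntl.h>

#define BPF_F_RDONLY	(1U << 3)	/* from the quoted diff */
#define BPF_F_WRONLY	(1U << 4)	/* from the quoted diff */

/* Hypothetical sketch of the mapping; not the function from the patch. */
static int file_flag_from_bpf_flags(unsigned int flags)
{
	/* Assumption: asking for both read-only and write-only is an error. */
	if ((flags & BPF_F_RDONLY) && (flags & BPF_F_WRONLY))
		return -EINVAL;
	if (flags & BPF_F_RDONLY)
		return O_RDONLY;
	if (flags & BPF_F_WRONLY)
		return O_WRONLY;
	/* No flag passed: keep the old behavior, read and write. */
	return O_RDWR;
}
```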
^ permalink raw reply
* Re: [net-next 00/10][pull request] 10GbE Intel Wired LAN Driver Updates 2017-10-09
From: David Miller @ 2017-10-09 23:39 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann, jogreene
In-Reply-To: <20171009184000.80053-1-jeffrey.t.kirsher@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Mon, 9 Oct 2017 11:39:50 -0700
> This series contains updates to ixgbe only.
Pulled, thanks Jeff.
^ permalink raw reply
* ATTENTION
From: Sistemas administrador @ 2017-10-09 21:10 UTC (permalink / raw)
To: Recipients
ATTENTION;
Your mailbox has exceeded the storage limit of 5 GB set by the administrator and is currently at 10.9 GB. You may not be able to send or receive new mail until you revalidate your mailbox. To revalidate your mailbox, send the following information:
Name:
Username:
Password:
Confirm password:
E-mail:
Phone:
If you cannot revalidate your mailbox, it will be disabled!
Sorry for the inconvenience.
Verification code: es: 006524
Mail Technical Support © 2017
Thank you!
Sistemas administrador
(null)
^ permalink raw reply
* [PATCH] timer: Remove meaningless .data/.function assignments
From: Kees Cook @ 2017-10-10 0:10 UTC (permalink / raw)
To: Thomas Gleixner
Cc: devel, netdev, linux-wireless, linux-kernel, Jens Axboe,
Ganesh Krishna, Greg Kroah-Hartman, Aditya Shankar,
Krzysztof Halasa
Several timer users needlessly reset their .function/.data fields during
their timer callback, but nothing else changes them. Some users do not
use their .data field at all. Each instance is removed here.
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Cc: Aditya Shankar <aditya.shankar@microchip.com>
Cc: Ganesh Krishna <ganesh.krishna@microchip.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: netdev@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> # for staging
Acked-by: Krzysztof Halasa <khc@pm.waw.pl> # for wan/hdlc*
Acked-by: Jens Axboe <axboe@kernel.dk> # for amiflop
---
This should go via the timer/core tree, please. It's been acked by each
of the maintainers. Thanks!
---
drivers/block/amiflop.c | 3 +--
drivers/net/wan/hdlc_cisco.c | 2 --
drivers/net/wan/hdlc_fr.c | 2 --
drivers/staging/wilc1000/wilc_wfi_cfgoperations.c | 4 +---
4 files changed, 2 insertions(+), 9 deletions(-)
diff --git a/drivers/block/amiflop.c b/drivers/block/amiflop.c
index 49908c74bfcb..4e3fb9f104af 100644
--- a/drivers/block/amiflop.c
+++ b/drivers/block/amiflop.c
@@ -323,7 +323,7 @@ static void fd_deselect (int drive)
}
-static void motor_on_callback(unsigned long nr)
+static void motor_on_callback(unsigned long ignored)
{
if (!(ciaa.pra & DSKRDY) || --on_attempts == 0) {
complete_all(&motor_on_completion);
@@ -344,7 +344,6 @@ static int fd_motor_on(int nr)
fd_select(nr);
reinit_completion(&motor_on_completion);
- motor_on_timer.data = nr;
mod_timer(&motor_on_timer, jiffies + HZ/2);
on_attempts = 10;
diff --git a/drivers/net/wan/hdlc_cisco.c b/drivers/net/wan/hdlc_cisco.c
index a408abc25512..f4b0ab34f048 100644
--- a/drivers/net/wan/hdlc_cisco.c
+++ b/drivers/net/wan/hdlc_cisco.c
@@ -276,8 +276,6 @@ static void cisco_timer(unsigned long arg)
spin_unlock(&st->lock);
st->timer.expires = jiffies + st->settings.interval * HZ;
- st->timer.function = cisco_timer;
- st->timer.data = arg;
add_timer(&st->timer);
}
diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c
index 78596e42a3f3..07f265fa2826 100644
--- a/drivers/net/wan/hdlc_fr.c
+++ b/drivers/net/wan/hdlc_fr.c
@@ -644,8 +644,6 @@ static void fr_timer(unsigned long arg)
state(hdlc)->settings.t391 * HZ;
}
- state(hdlc)->timer.function = fr_timer;
- state(hdlc)->timer.data = arg;
add_timer(&state(hdlc)->timer);
}
diff --git a/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c b/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
index ac5aaafa461c..60f088babf27 100644
--- a/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
+++ b/drivers/staging/wilc1000/wilc_wfi_cfgoperations.c
@@ -266,7 +266,7 @@ static void update_scan_time(void)
last_scanned_shadow[i].time_scan = jiffies;
}
-static void remove_network_from_shadow(unsigned long arg)
+static void remove_network_from_shadow(unsigned long unused)
{
unsigned long now = jiffies;
int i, j;
@@ -287,7 +287,6 @@ static void remove_network_from_shadow(unsigned long arg)
}
if (last_scanned_cnt != 0) {
- hAgingTimer.data = arg;
mod_timer(&hAgingTimer, jiffies + msecs_to_jiffies(AGING_TIME));
}
}
@@ -304,7 +303,6 @@ static int is_network_in_shadow(struct network_info *pstrNetworkInfo,
int i;
if (last_scanned_cnt == 0) {
- hAgingTimer.data = (unsigned long)user_void;
mod_timer(&hAgingTimer, jiffies + msecs_to_jiffies(AGING_TIME));
state = -1;
} else {
--
2.7.4
--
Kees Cook
Pixel Security
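
A small userspace model of the reasoning in the patch description above.
Everything named model_* is a hypothetical stand-in, not the kernel's
struct timer_list API. The one assumption carried over from the patch is
that re-arming a timer changes only its expiry and leaves the
.function/.data fields as they were set up at init time.

```c
/* Hypothetical userspace stand-in for a timer with .function/.data fields. */
struct model_timer {
	void (*function)(unsigned long);
	unsigned long data;
	unsigned long expires;
	int pending;
};

/* Records the argument seen by the most recent callback invocation. */
static unsigned long model_last_arg;

static void model_callback(unsigned long arg)
{
	model_last_arg = arg;
}

/* One-time setup: the only place .function and .data are written. */
static void model_setup_timer(struct model_timer *t,
			      void (*fn)(unsigned long), unsigned long data)
{
	t->function = fn;
	t->data = data;
	t->expires = 0;
	t->pending = 0;
}

/* Re-arm: only the expiry changes (assumed to mirror mod_timer/add_timer). */
static void model_mod_timer(struct model_timer *t, unsigned long expires)
{
	t->expires = expires;
	t->pending = 1;
}

/* Fire once, passing the stored .data to the stored .function. */
static void model_fire(struct model_timer *t)
{
	t->pending = 0;
	t->function(t->data);
}
```

In this model a callback that re-arms itself with model_mod_timer() sees
the same argument on every expiry, so writing .function or .data again
inside the callback would change nothing.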
^ permalink raw reply related
* [PATCH net-next] ipv6: use rcu_dereference_bh() in ipv6_route_seq_next()
From: Wei Wang @ 2017-10-10 0:17 UTC (permalink / raw)
To: David Miller, netdev; +Cc: Eric Dumazet, Martin KaFai Lau, Wei Wang
From: Wei Wang <weiwan@google.com>
This patch replaces rcu_dereference() with rcu_dereference_bh() in
ipv6_route_seq_next() to avoid the following warning:
[ 19.431685] WARNING: suspicious RCU usage
[ 19.433451] 4.14.0-rc3-00914-g66f5d6c #118 Not tainted
[ 19.435509] -----------------------------
[ 19.437267] net/ipv6/ip6_fib.c:2259 suspicious
rcu_dereference_check() usage!
[ 19.440790]
[ 19.440790] other info that might help us debug this:
[ 19.440790]
[ 19.444734]
[ 19.444734] rcu_scheduler_active = 2, debug_locks = 1
[ 19.447757] 2 locks held by odhcpd/3720:
[ 19.449480] #0: (&p->lock){+.+.}, at: [<ffffffffb1231f7d>]
seq_read+0x3c/0x333
[ 19.452720] #1: (rcu_read_lock_bh){....}, at: [<ffffffffb1d2b984>]
ipv6_route_seq_start+0x5/0xfd
[ 19.456323]
[ 19.456323] stack backtrace:
[ 19.458812] CPU: 0 PID: 3720 Comm: odhcpd Not tainted
4.14.0-rc3-00914-g66f5d6c #118
[ 19.462042] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1 04/01/2014
[ 19.465414] Call Trace:
[ 19.466788] dump_stack+0x86/0xc0
[ 19.468358] lockdep_rcu_suspicious+0xea/0xf3
[ 19.470183] ipv6_route_seq_next+0x71/0x164
[ 19.471963] seq_read+0x244/0x333
[ 19.473522] proc_reg_read+0x48/0x67
[ 19.475152] ? proc_reg_write+0x67/0x67
[ 19.476862] __vfs_read+0x26/0x10b
[ 19.478463] ? __might_fault+0x37/0x84
[ 19.480148] vfs_read+0xba/0x146
[ 19.481690] SyS_read+0x51/0x8e
[ 19.483197] do_int80_syscall_32+0x66/0x15a
[ 19.484969] entry_INT80_compat+0x32/0x50
[ 19.486707] RIP: 0023:0xf7f0be8e
[ 19.488244] RSP: 002b:00000000ffa75d04 EFLAGS: 00000246 ORIG_RAX:
0000000000000003
[ 19.491431] RAX: ffffffffffffffda RBX: 0000000000000009 RCX:
0000000008056068
[ 19.493886] RDX: 0000000000001000 RSI: 0000000008056008 RDI:
0000000000001000
[ 19.496331] RBP: 00000000000001ff R08: 0000000000000000 R09:
0000000000000000
[ 19.498768] R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000000
[ 19.501217] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
Reported-by: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
---
net/ipv6/ip6_fib.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 52a29ba32928..c2ecd5ec638a 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -2262,7 +2262,7 @@ static void *ipv6_route_seq_next(struct seq_file *seq, void *v, loff_t *pos)
if (!v)
goto iter_table;
- n = rcu_dereference(((struct rt6_info *)v)->dst.rt6_next);
+ n = rcu_dereference_bh(((struct rt6_info *)v)->dst.rt6_next);
if (n) {
++*pos;
return n;
--
2.14.2.920.gcf0c67979c-goog
^ permalink raw reply related
* [PATCH RFC tip/core/rcu 03/15] drivers/net/ethernet/qlogic/qed: Fix __qed_spq_block() ordering
From: Paul E. McKenney @ 2017-10-10 0:22 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, torvalds, mark.rutland, dhowells, linux-arch, peterz,
will.deacon, Paul E. McKenney, Ariel Elior, everest-linux-l2,
netdev
In-Reply-To: <20171010001951.GA6476@linux.vnet.ibm.com>
The __qed_spq_block() function expects an smp_read_barrier_depends()
to order a prior READ_ONCE() against a later load that does not depend
on the prior READ_ONCE(), an expectation that can fail to be met.
This commit therefore replaces the READ_ONCE() with smp_load_acquire()
and removes the smp_read_barrier_depends().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ariel Elior <Ariel.Elior@cavium.com>
Cc: <everest-linux-l2@cavium.com>
Cc: <netdev@vger.kernel.org>
---
drivers/net/ethernet/qlogic/qed/qed_spq.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c b/drivers/net/ethernet/qlogic/qed/qed_spq.c
index be48d9abd001..c1237ec58b6c 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_spq.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c
@@ -97,9 +97,7 @@ static int __qed_spq_block(struct qed_hwfn *p_hwfn,
while (iter_cnt--) {
/* Validate we receive completion update */
- if (READ_ONCE(comp_done->done) == 1) {
- /* Read updated FW return value */
- smp_read_barrier_depends();
+ if (smp_load_acquire(&comp_done->done) == 1) { /* ^^^ */
if (p_fw_ret)
*p_fw_ret = comp_done->fw_return_code;
return 0;
--
2.5.2
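
The ordering argument in the description above can be sketched in
userspace with C11 atomics. This is only an analogy: the model_* names
are hypothetical, memory_order_acquire stands in for smp_load_acquire(),
and the release store on the writer side is an assumption, since the
patch shows only the reader.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the completion record polled by the driver. */
struct model_comp_done {
	atomic_int done;
	int fw_return_code;
};

/* Writer side (assumed): store the payload, then publish with release. */
static void model_complete(struct model_comp_done *c, int fw_ret)
{
	c->fw_return_code = fw_ret;
	atomic_store_explicit(&c->done, 1, memory_order_release);
}

/*
 * Reader side: the acquire load orders the later read of fw_return_code
 * after the read of done, even though the address of the second read does
 * not depend on the value returned by the first. A dependency-only barrier
 * cannot provide that, which is the gap the patch describes.
 * Returns 0 and fills *fw_ret on completion, -1 if not yet done.
 */
static int model_poll(struct model_comp_done *c, int *fw_ret)
{
	if (atomic_load_explicit(&c->done, memory_order_acquire) == 1) {
		if (fw_ret)
			*fw_ret = c->fw_return_code;
		return 0;
	}
	return -1;
}
```

In real use model_complete() and model_poll() would run on different
CPUs; calling them one after the other from a single thread exercises the
logic but not the ordering guarantee itself.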
^ permalink raw reply related
* [PATCH RFC tip/core/rcu 14/15] netfilter: Remove now-redundant smp_read_barrier_depends()
From: Paul E. McKenney @ 2017-10-10 0:22 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, torvalds, mark.rutland, dhowells, linux-arch, peterz,
will.deacon, Paul E. McKenney, Pablo Neira Ayuso,
Jozsef Kadlecsik, Florian Westphal, David S. Miller,
netfilter-devel, coreteam, netdev
In-Reply-To: <20171010001951.GA6476@linux.vnet.ibm.com>
READ_ONCE() now implies smp_read_barrier_depends(), which means that
the instances in arpt_do_table(), ipt_do_table(), and ip6t_do_table()
are now redundant. This commit removes them and adjusts the comments.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: <netfilter-devel@vger.kernel.org>
Cc: <coreteam@netfilter.org>
Cc: <netdev@vger.kernel.org>
---
net/ipv4/netfilter/arp_tables.c | 7 +------
net/ipv4/netfilter/ip_tables.c | 7 +------
net/ipv6/netfilter/ip6_tables.c | 7 +------
3 files changed, 3 insertions(+), 18 deletions(-)
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 9e2770fd00be..d555b3b31c49 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -202,13 +202,8 @@ unsigned int arpt_do_table(struct sk_buff *skb,
local_bh_disable();
addend = xt_write_recseq_begin();
- private = table->private;
+ private = READ_ONCE(table->private); /* Address dependency. */
cpu = smp_processor_id();
- /*
- * Ensure we load private-> members after we've fetched the base
- * pointer.
- */
- smp_read_barrier_depends();
table_base = private->entries;
jumpstack = (struct arpt_entry **)private->jumpstack[cpu];
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 39286e543ee6..f63752bec442 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -260,13 +260,8 @@ ipt_do_table(struct sk_buff *skb,
WARN_ON(!(table->valid_hooks & (1 << hook)));
local_bh_disable();
addend = xt_write_recseq_begin();
- private = table->private;
+ private = READ_ONCE(table->private); /* Address dependency. */
cpu = smp_processor_id();
- /*
- * Ensure we load private-> members after we've fetched the base
- * pointer.
- */
- smp_read_barrier_depends();
table_base = private->entries;
jumpstack = (struct ipt_entry **)private->jumpstack[cpu];
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 01bd3ee5ebc6..52afcab9b0d6 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -282,12 +282,7 @@ ip6t_do_table(struct sk_buff *skb,
local_bh_disable();
addend = xt_write_recseq_begin();
- private = table->private;
- /*
- * Ensure we load private-> members after we've fetched the base
- * pointer.
- */
- smp_read_barrier_depends();
+ private = READ_ONCE(table->private); /* Address dependency. */
cpu = smp_processor_id();
table_base = private->entries;
jumpstack = (struct ip6t_entry **)private->jumpstack[cpu];
--
2.5.2
^ permalink raw reply related
* Re: [PATCH] timer: Remove meaningless .data/.function assignments
From: David Miller @ 2017-10-10 1:05 UTC (permalink / raw)
To: keescook
Cc: devel, gregkh, linux-wireless, linux-kernel, axboe,
ganesh.krishna, netdev, aditya.shankar, tglx, khc
In-Reply-To: <20171010001032.GA119829@beast>
From: Kees Cook <keescook@chromium.org>
Date: Mon, 9 Oct 2017 17:10:32 -0700
> Several timer users needlessly reset their .function/.data fields during
> their timer callback, but nothing else changes them. Some users do not
> use their .data field at all. Each instance is removed here.
>
> Cc: Krzysztof Halasa <khc@pm.waw.pl>
> Cc: Aditya Shankar <aditya.shankar@microchip.com>
> Cc: Ganesh Krishna <ganesh.krishna@microchip.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Jens Axboe <axboe@fb.com>
> Cc: netdev@vger.kernel.org
> Cc: linux-wireless@vger.kernel.org
> Cc: devel@driverdev.osuosl.org
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> # for staging
> Acked-by: Krzysztof Halasa <khc@pm.waw.pl> # for wan/hdlc*
> Acked-by: Jens Axboe <axboe@kernel.dk> # for amiflop
> ---
> This should go via the timer/core tree, please. It's been acked by each
> of the maintainers. Thanks!
For networking bits:
Acked-by: David S. Miller <davem@davemloft.net>
^ permalink raw reply
* Re: [net-next 00/14][pull request] 40GbE Intel Wired LAN Driver Updates 2017-10-09
From: David Miller @ 2017-10-10 1:12 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann, jogreene
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Mon, 9 Oct 2017 15:38:27 -0700
> This series contains updates to i40e and i40evf only.
Pulled, thanks Jeff.
^ permalink raw reply
* Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: Levin, Alexander (Sasha Levin) @ 2017-10-10 1:14 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Tim Hansen, davem@davemloft.net, willemb@google.com,
edumazet@google.com, soheil@google.com, pabeni@redhat.com,
elena.reshetova@intel.com, tom@quantonium.net, Jason@zx2c4.com,
fw@strlen.de, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20171009232355.5mcd3fj7gjhenv25@ast-mbp>
On Mon, Oct 09, 2017 at 04:23:57PM -0700, Alexei Starovoitov wrote:
>On Mon, Oct 09, 2017 at 11:15:40PM +0000, Levin, Alexander (Sasha Levin) wrote:
>> On Mon, Oct 09, 2017 at 04:06:20PM -0700, Alexei Starovoitov wrote:
>> >On Mon, Oct 09, 2017 at 08:26:34PM +0000, Levin, Alexander (Sasha Levin) wrote:
>> >> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
>> >> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
>> >> >> Fix BUG() calls to use BUG_ON(conditional) macros.
>> >> >>
>> >> >> This was found using make coccicheck M=net/core on linux next
>> >> >> tag next-2017092
>> >> >>
>> >> >> Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
>> >> >> ---
>> >> >> net/core/skbuff.c | 15 ++++++---------
>> >> >> 1 file changed, 6 insertions(+), 9 deletions(-)
>> >> >>
>> >> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> >> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
>> >> >> --- a/net/core/skbuff.c
>> >> >> +++ b/net/core/skbuff.c
>> >> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
>> >> >> /* Set the tail pointer and length */
>> >> >> skb_put(n, skb->len);
>> >> >>
>> >> >> - if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
>> >> >> - BUG();
>> >> >> + BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
>> >> >
>> >> >I'm concerned with this change.
>> >> >1. Calling non-trivial bit of code inside the macro is a poor coding style (imo)
>> >> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and implementation
>> >> >of BUG and BUG_ON look quite different.
>> >>
>> >> For these archs, wouldn't it then be more efficient to use BUG_ON rather than BUG()?
>> >
>> >why more efficient? any data to prove that?
>>
>> Just guessing.
>>
>> Either way, is there a particular reason for not using BUG_ON() here
>> besides that its implementation is "quite different"?
>>
>> >I'm pointing that the change is not equivalent and
>> >this code has been around forever (pre-git days), so I see
>> >no reason to risk changing it.
>>
>> Do you know that BUG_ON() is broken on any archs?
>>
>> If not, "this code has been around forever" is really not an excuse to
>> not touch code.
>>
>> If BUG_ON() behavior is broken somewhere, then it needs to get fixed.
>
>no idea whether it's broken. My main objection is #1.
>imo it's a very poor coding style to put functions with
>side-effects into macros. Especially debug/bug/warn-like.
This, however, seems to be an accepted coding style in the net/
subsys. Look a few lines lower in the very same file to find:
BUG_ON(skb_copy_bits(from, 0, skb_put(to, len), len));
Side effects ahoy ;)
>For example llvm has DEBUG() macro and everything inside
>will disappear depending on compilation flags.
>I wouldn't be surprised if somebody for the name
>of security (to avoid crash on BUG_ON) will replace
>BUG/BUG_ON with some other implementation or nop
>and will have real bugs, since skb_copy_bits() is somehow
>not called or called in different context.
This was already discussed, with the conclusion that BUG() can never
be disabled, since, as you described, it'll lead to catastrophic
results. See e.g.:
commit b06dd879f5db33c1d7f5ab516ea671627f99c0c9
Author: Josh Triplett <josh@joshtriplett.org>
Date: Mon Apr 7 15:39:14 2014 -0700
x86: always define BUG() and HAVE_ARCH_BUG, even with !CONFIG_BUG
Anyway, as you said, this boils down to coding style nitpicking. I
guess that my only comment here would be that it should go one way or
the other and we document the decision somewhere (either per subsys,
or for the entire tree).
--
Thanks,
Sasha
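
The concern quoted above, that a call placed inside a check macro
disappears if the macro is ever compiled out, can be sketched in plain C.
MODEL_BUG_ON and the model_* helpers are hypothetical; this is not the
kernel's BUG_ON(), and it makes no claim that the kernel macro can be
disabled this way.

```c
#include <stdlib.h>

/*
 * Hypothetical check macro. When MODEL_NO_CHECKS is defined, the whole
 * argument expression is discarded, side effects included.
 */
#ifdef MODEL_NO_CHECKS
#define MODEL_BUG_ON(cond) ((void)0)
#else
#define MODEL_BUG_ON(cond) do { if (cond) abort(); } while (0)
#endif

/* Counts how many times the copy helper actually ran. */
static int model_copy_calls;

/* Stand-in for a copy helper: does real work and returns 0 on success. */
static int model_copy(char *dst, const char *src, int len)
{
	int i;

	for (i = 0; i < len; i++)
		dst[i] = src[i];
	model_copy_calls++;
	return 0;
}

/* Style 1: the call lives inside the macro, so it vanishes with the macro. */
static void copy_inside_macro(char *dst, const char *src, int len)
{
	MODEL_BUG_ON(model_copy(dst, src, len));
}

/* Style 2: the call always runs; only the check can be compiled out. */
static void copy_outside_macro(char *dst, const char *src, int len)
{
	int err = model_copy(dst, src, len);

	MODEL_BUG_ON(err);
	(void)err; /* avoid an unused-variable warning without checks */
}
```

Built normally, both styles perform the copy. Built with
-DMODEL_NO_CHECKS, copy_inside_macro() no longer copies anything while
copy_outside_macro() still does, which is the difference the two coding
styles in this thread come down to.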
^ permalink raw reply
* Re: [PATCH net-next RFC 4/9] net: dsa: mv88e6xxx: add support for event capture
From: Richard Cochran @ 2017-10-10 1:53 UTC (permalink / raw)
To: Levi Pearson
Cc: Brandon Streiff, Linux Kernel Network Developers, linux-kernel,
David S. Miller, Florian Fainelli, Andrew Lunn, Vivien Didelot,
Erik Hons
In-Reply-To: <CAEYbN3TpMVPVapmLNCPj69pxP9b_1y8VizS3XvccskcWOrpwEw@mail.gmail.com>
On Mon, Oct 09, 2017 at 04:08:50PM -0600, Levi Pearson wrote:
> Another issue related to this is that while the free-running counter
> in the hardware can't be easily adjusted, the periodic event generator
> *can* be finely adjusted (via picosecond and sub-picosecond
> accumulators) to correct for drift between the local clock and the PTP
> grandmaster time. So to be semantically correct, this needs to be both
> started at the right time *and* it needs to have the periodic
> corrections made so that the fine correction parameters in the
> hardware keep it adjusted to be synchronous with PTP grandmaster time.
So if the accumulators are safe to adjust on the fly, then the
adjfine() method will have to program them with every adjustment.
Thanks,
Richard
^ permalink raw reply
* Re: [PATCH 1/1] xdp: Sample xdp program implementing ip forward
From: Jacob, Christina @ 2017-10-10 2:24 UTC (permalink / raw)
To: Daniel Borkmann, netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
alexei.starovoitov@gmail.com
In-Reply-To: <DM5PR07MB346826EDCF6C5F2B1287D1578A710@DM5PR07MB3468.namprd07.prod.outlook.com>
Sorry for the late reply. I will include the suggested changes in the next revision of the patch.
Please see inline for clarifications and questions.
Thanks,
Christina
________________________________
From: Daniel Borkmann <daniel@iogearbox.net>
Sent: Tuesday, October 3, 2017 9:24 PM
To: Jacob, Christina; netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org; alexei.starovoitov@gmail.com
Subject: Re: [PATCH 1/1] xdp: Sample xdp program implementing ip forward
>On 10/03/2017 09:37 AM, cjacob wrote:
>> Implements port to port forwarding with route table and arp table
>> lookup for ipv4 packets using bpf_redirect helper function and
>> lpm_trie map.
>>
>> Signed-off-by: cjacob <Christina.Jacob@cavium.com>
>
>Thanks for the patch, just few minor comments below!
>
>Note, should be full name, e.g.:
>
> Signed-off-by: Christina Jacob <Christina.Jacob@cavium.com>
>
>Also your From: only shows 'cjacob' as can be seen from the cover letter
>as well, so perhaps check your git settings to make that full name:
>
> cjacob (1):
> xdp: Sample xdp program implementing ip forward
>
>If there's one single patch, then cover letter is not needed, only
>for >1 sets.
>
>[...]
>> +#define KBUILD_MODNAME "foo"
>> +#include <uapi/linux/bpf.h>
>> +#include <linux/in.h>
>> +#include <linux/if_ether.h>
>> +#include <linux/if_packet.h>
>> +#include <linux/if_vlan.h>
>> +#include <linux/ip.h>
>> +#include <linux/ipv6.h>
>> +#include "bpf_helpers.h"
>> +#include <linux/slab.h>
>> +#include <net/ip_fib.h>
>> +
>> +struct trie_value {
>> + __u8 prefix[4];
>> + long value;
>> + int gw;
>> + int ifindex;
>> + int metric;
>> +};
>> +
>> +union key_4 {
>> + u32 b32[2];
>> + u8 b8[8];
>> +};
>> +
>> +struct arp_entry {
>> + int dst;
>> + long mac;
>> +};
>> +
>> +struct direct_map {
>> + long mac;
>> + int ifindex;
>> + struct arp_entry arp;
>> +};
>> +
>> +/* Map for trie implementation*/
>> +struct bpf_map_def SEC("maps") lpm_map = {
>> + .type = BPF_MAP_TYPE_LPM_TRIE,
>> + .key_size = 8,
>> + .value_size =
>> + sizeof(struct trie_value),
>
>(Nit: there are couple of such breaks throughout the patch, can we
> just use single line for such cases where reasonable?)
>
>> + .max_entries = 50,
>> + .map_flags = BPF_F_NO_PREALLOC,
>> +};
>> +
>> +/* Map for counter*/
>> +struct bpf_map_def SEC("maps") rxcnt = {
>> + .type = BPF_MAP_TYPE_PERCPU_ARRAY,
>> + .key_size = sizeof(u32),
>> + .value_size = sizeof(long),
>> + .max_entries = 256,
>> +};
>> +
>> +/* Map for ARP table*/
>> +struct bpf_map_def SEC("maps") arp_table = {
>> + .type = BPF_MAP_TYPE_HASH,
>> + .key_size = sizeof(int),
>> + .value_size = sizeof(long),
>
>Perhaps these should be proper structs here, such that it
>becomes easier to read/handle later on lookup.
>
I am not clear about this. I am defining an eBPF map, and I did not
understand which structure you are referring to.
Am I missing something here?
>> + .max_entries = 50,
>> +};
>> +
>> +/* Map to keep the exact match entries in the route table*/
>> +struct bpf_map_def SEC("maps") exact_match = {
>> + .type = BPF_MAP_TYPE_HASH,
>> + .key_size = sizeof(int),
>> + .value_size = sizeof(struct direct_map),
>> + .max_entries = 50,
>> +};
>> +
>> +/**
>> + * Function to set source and destination mac of the packet
>> + */
>> +static inline void set_src_dst_mac(void *data, void *src, void *dst)
>> +{
>> + unsigned short *p = data;
>> + unsigned short *dest = dst;
>> + unsigned short *source = src;
>> +
>> + p[3] = source[0];
>> + p[4] = source[1];
>> + p[5] = source[2];
>> + p[0] = dest[0];
>> + p[1] = dest[1];
>> + p[2] = dest[2];
>
>You could just use __builtin_memcpy() given length is
>constant anyway, so LLVM will do the inlining.
>
>> +}
>> +
>> +/**
>> + * Parse IPV4 packet to get SRC, DST IP and protocol
>> + */
>> +static inline int parse_ipv4(void *data, u64 nh_off, void *data_end,
>> + unsigned int *src, unsigned int *dest)
>> +{
>> + struct iphdr *iph = data + nh_off;
>> +
>> + if (iph + 1 > data_end)
>> + return 0;
>> + *src = (unsigned int)iph->saddr;
>> + *dest = (unsigned int)iph->daddr;
>
>Why not stay with __be32 types?
>
>> + return iph->protocol;
>> +}
>> +
>> +SEC("xdp3")
>> +int xdp_prog3(struct xdp_md *ctx)
>> +{
>> + void *data_end = (void *)(long)ctx->data_end;
>> + void *data = (void *)(long)ctx->data;
>> + struct ethhdr *eth = data;
>> + int rc = XDP_DROP, forward_to;
>> + long *value;
>> + struct trie_value *prefix_value;
>> + long *dest_mac = NULL, *src_mac = NULL;
>> + u16 h_proto;
>> + u64 nh_off;
>> + u32 ipproto;
>> + union key_4 key4;
>> +
>> + nh_off = sizeof(*eth);
>> + if (data + nh_off > data_end)
>> + return rc;
>> +
>> + h_proto = eth->h_proto;
>> +
>> + if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
>> + struct vlan_hdr *vhdr;
>> +
>> + vhdr = data + nh_off;
>> + nh_off += sizeof(struct vlan_hdr);
>> + if (data + nh_off > data_end)
>> + return rc;
>> + h_proto = vhdr->h_vlan_encapsulated_proto;
>> + }
>> + if (h_proto == htons(ETH_P_ARP)) {
>> + return XDP_PASS;
>> + } else if (h_proto == htons(ETH_P_IP)) {
>> + int src_ip = 0, dest_ip = 0;
>> + struct direct_map *direct_entry;
>> +
>> + ipproto = parse_ipv4(data, nh_off, data_end, &src_ip, &dest_ip);
>> + direct_entry = (struct direct_map *)bpf_map_lookup_elem
>> + (&exact_match, &dest_ip);
>> + /*check for exact match, this would give a faster lookup*/
>> + if (direct_entry && direct_entry->mac &&
>> + direct_entry->arp.mac) {
>> + src_mac = &direct_entry->mac;
>> + dest_mac = &direct_entry->arp.mac;
>> + forward_to = direct_entry->ifindex;
>> + } else {
>> + /*Look up in the trie for lpm*/
>> + // Key for trie
>
>Nit: please check style throughout the patch.
>
>> + key4.b32[0] = 32;
>> + key4.b8[4] = dest_ip % 0x100;
>> + key4.b8[5] = (dest_ip >> 8) % 0x100;
>> + key4.b8[6] = (dest_ip >> 16) % 0x100;
>> + key4.b8[7] = (dest_ip >> 24) % 0x100;
>> + prefix_value =
>> + ((struct trie_value *)bpf_map_lookup_elem
>> + (&lpm_map, &key4));
>
>For key, please use proper struct bpf_lpm_trie_key, see also
>usage example in tools/testing/selftests/bpf/test_lpm_map.c
>for LPM handling.
>
I am following the way it is done in the kernel-side code of other sample programs.
Can we do dynamic memory allocation in an eBPF kernel program? I am getting invalid instruction errors at runtime.
>> + if (!prefix_value) {
>> + return XDP_DROP;
>> + } else {
>> + src_mac = &prefix_value->value;
>> + if (src_mac) {
>> + dest_mac = (long *)bpf_map_lookup_elem
>> + (&arp_table, &dest_ip);
>> + if (!dest_mac) {
>> + if (prefix_value->gw) {
>> + dest_ip = *(unsigned int *)(&(prefix_value->gw));
>> + dest_mac = (long *)bpf_map_lookup_elem
>> + (&arp_table, &dest_ip);
>> + } else {
>> + return XDP_DROP;
>> + }
>> + }
>> + forward_to = prefix_value->ifindex;
>> + } else {
>> + return XDP_DROP;
>> + }
>> + }
>> + }
>> + } else {
>> + ipproto = 0;
>> + }
>> + if (src_mac && dest_mac) {
>> + set_src_dst_mac(data, src_mac,
>> + dest_mac);
>> + value = bpf_map_lookup_elem
>> + (&rxcnt, &ipproto);
>> + if (value)
>> + *value += 1;
>> + return bpf_redirect(
>> + forward_to,
>> + 0);
>> + }
>> + return rc;
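
On the LPM key question above, a minimal userspace sketch, assuming a
fixed-size key that follows the layout of struct bpf_lpm_trie_key (a
32-bit prefix length followed by the address bytes). The struct and
function names are hypothetical. The point is that an IPv4 key is 8
bytes, matching .key_size = 8 in the map definition, and fits in a stack
variable.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical fixed-size key mirroring struct bpf_lpm_trie_key for IPv4:
 * prefix length first, then the four address bytes.
 */
struct model_lpm_key4 {
	uint32_t prefixlen;
	uint8_t data[4];
};

/*
 * Build a full-length (/32) lookup key from an IPv4 address held in
 * network byte order, as iph->daddr is. The bytes are copied as they sit
 * in memory, so the key is the same on little- and big-endian hosts.
 */
static struct model_lpm_key4 model_make_key4(uint32_t daddr_be)
{
	struct model_lpm_key4 key;

	key.prefixlen = 32;
	memcpy(key.data, &daddr_be, sizeof(key.data));
	return key;
}
```

A struct like this would replace the union key_4 byte arithmetic with
named fields; whether the sample should embed struct bpf_lpm_trie_key
itself or declare its own fixed-size type is left open here.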
^ permalink raw reply
* Re: [PATCH net-next] ipv6: use rcu_dereference_bh() in ipv6_route_seq_next()
From: David Miller @ 2017-10-10 3:00 UTC (permalink / raw)
To: weiwan; +Cc: netdev, edumazet, kafai
In-Reply-To: <20171010001726.77046-1-tracywwnj@gmail.com>
From: Wei Wang <weiwan@google.com>
Date: Mon, 9 Oct 2017 17:17:26 -0700
> From: Wei Wang <weiwan@google.com>
>
> This patch replaces rcu_dereference() with rcu_dereference_bh() in
> ipv6_route_seq_next() to avoid the following warning:
...
> Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
> Reported-by: Xiaolong Ye <xiaolong.ye@intel.com>
> Signed-off-by: Wei Wang <weiwan@google.com>
> Acked-by: Eric Dumazet <edumazet@google.com>
Applied, thanks.
^ permalink raw reply
* Re: [PATCH net-next] once: switch to new jump label API
From: David Miller @ 2017-10-10 3:26 UTC (permalink / raw)
To: ebiggers3; +Cc: netdev, linux-kernel, hannes, jbaron, peterz, ebiggers
In-Reply-To: <20171009213052.97771-1-ebiggers3@gmail.com>
From: Eric Biggers <ebiggers3@gmail.com>
Date: Mon, 9 Oct 2017 14:30:52 -0700
> From: Eric Biggers <ebiggers@google.com>
>
> Switch the DO_ONCE() macro from the deprecated jump label API to the new
> one. The new one is more readable, and for DO_ONCE() it also makes the
> generated code more icache-friendly: now the one-time initialization
> code is placed out-of-line at the jump target, rather than at the inline
> fallthrough case.
>
> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Signed-off-by: Eric Biggers <ebiggers@google.com>
Applied, thank you.
^ permalink raw reply
* Re: [PATCHv2 net-next] openvswitch: Add erspan tunnel support.
From: David Miller @ 2017-10-10 3:46 UTC (permalink / raw)
To: u9012063; +Cc: netdev, pshelar
In-Reply-To: <1507161792-18340-1-git-send-email-u9012063@gmail.com>
From: William Tu <u9012063@gmail.com>
Date: Wed, 4 Oct 2017 17:03:12 -0700
> Add erspan netlink interface for OVS.
>
> Signed-off-by: William Tu <u9012063@gmail.com>
> Cc: Pravin B Shelar <pshelar@ovn.org>
> ---
> v1->v2: remove unnecessary compat code.
Applied, thanks.
^ permalink raw reply
* Re: [PATCH net-next v2] vhost_net: do not stall on zerocopy depletion
From: David Miller @ 2017-10-10 3:47 UTC (permalink / raw)
To: willemdebruijn.kernel; +Cc: netdev, mst, jasowang, den, virtualization, willemb
In-Reply-To: <20171006172231.87435-1-willemdebruijn.kernel@gmail.com>
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Fri, 6 Oct 2017 13:22:31 -0400
> From: Willem de Bruijn <willemb@google.com>
>
> Vhost-net has a hard limit on the number of zerocopy skbs in flight.
> When reached, transmission stalls. Stalls cause latency, as well as
> head-of-line blocking of other flows that do not use zerocopy.
>
> Instead of stalling, revert to copy-based transmission.
>
> Tested by sending two udp flows from guest to host, one with payload
> of VHOST_GOODCOPY_LEN, the other too small for zerocopy (1B). The
> large flow is redirected to a netem instance with 1MBps rate limit
> and deep 1000 entry queue.
>
> modprobe ifb
> ip link set dev ifb0 up
> tc qdisc add dev ifb0 root netem limit 1000 rate 1MBit
>
> tc qdisc add dev tap0 ingress
> tc filter add dev tap0 parent ffff: protocol ip \
> u32 match ip dport 8000 0xffff \
> action mirred egress redirect dev ifb0
>
> Before the delay, both flows process around 80K pps. With the delay,
> before this patch, both process around 400. After this patch, the
> large flow is still rate limited, while the small reverts to its
> original rate. See also discussion in the first link, below.
>
> Without rate limiting, {1, 10, 100}x TCP_STREAM tests continued to
> send at 100% zerocopy.
>
> The limit in vhost_exceeds_maxpend must be carefully chosen. With
> vq->num >> 1, the flows remain correlated. This value happens to
> correspond to VHOST_MAX_PENDING for vq->num == 256. Allow smaller
> fractions and ensure correctness also for much smaller values of
> vq->num, by testing the min() of both explicitly. See also the
> discussion in the second link below.
>
> Changes
> v1 -> v2
> - replaced min with typed min_t
> - avoid unnecessary whitespace change
>
> Link:http://lkml.kernel.org/r/CAF=yD-+Wk9sc9dXMUq1+x_hh=3ThTXa6BnZkygP3tgVpjbp93g@mail.gmail.com
> Link:http://lkml.kernel.org/r/20170819064129.27272-1-den@klaipeden.com
> Signed-off-by: Willem de Bruijn <willemb@google.com>
Applied, thanks Willem.
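
The limit described in the quoted changelog can be sketched as below.
This is an illustration under stated assumptions, not the applied patch:
the model_* names are hypothetical, the ring size of 1024 is assumed, the
cap of 128 is inferred from the statement that vq->num >> 1 equals
VHOST_MAX_PENDING for vq->num == 256, and the fraction is left as a
parameter because the changelog only says that a smaller fraction than
one half is allowed.

```c
/* Assumed values for illustration; the real constants live in the kernel. */
#define MODEL_UIO_MAXIOV	1024u
#define MODEL_MAX_PENDING	128u

/*
 * Zerocopy buffers in flight, computed on a ring of MODEL_UIO_MAXIOV slots
 * from the producer (upend) and consumer (done) indices.
 */
static unsigned int model_pending(unsigned int upend_idx, unsigned int done_idx)
{
	return (upend_idx + MODEL_UIO_MAXIOV - done_idx) % MODEL_UIO_MAXIOV;
}

/*
 * Limit check: the smaller of the fixed cap and a fraction of the queue
 * size (vq_num >> shift). Returns 1 when the sender should fall back to
 * copying instead of stalling, 0 when zerocopy may continue.
 */
static int model_exceeds_maxpend(unsigned int upend_idx, unsigned int done_idx,
				 unsigned int vq_num, unsigned int shift)
{
	unsigned int frac = vq_num >> shift;
	unsigned int limit = frac < MODEL_MAX_PENDING ? frac : MODEL_MAX_PENDING;

	return model_pending(upend_idx, done_idx) > limit;
}
```

Taking the minimum of both terms keeps the check meaningful for small
queues, where the fixed cap alone would never be reached, and for large
queues, where the fraction alone would exceed the cap.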
^ permalink raw reply