* Re: [net-next 00/10][pull request] 10GbE Intel Wired LAN Driver Updates 2017-10-09
From: David Miller @ 2017-10-09 23:39 UTC
To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann, jogreene
In-Reply-To: <20171009184000.80053-1-jeffrey.t.kirsher@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Mon, 9 Oct 2017 11:39:50 -0700
> This series contains updates to ixgbe only.
Pulled, thanks Jeff.
* Re: [PATCH net-next v2 1/5] bpf: Add file mode configuration into bpf maps
From: Chenbo Feng @ 2017-10-09 23:31 UTC
To: Alexei Starovoitov
Cc: Chenbo Feng, linux-security-module, netdev, SELinux,
Jeffrey Vander Stoep, Lorenzo Colitti, Daniel Borkmann,
Stephen Smalley
In-Reply-To: <20171009230718.q6y57izbnyqtfw4y@ast-mbp>
On Mon, Oct 9, 2017 at 4:07 PM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Mon, Oct 09, 2017 at 03:20:24PM -0700, Chenbo Feng wrote:
>> From: Chenbo Feng <fengc@google.com>
>>
>> Introduce the map read/write flags to the eBPF syscalls that return the
>> map fd. The flags are used to set up the file mode when constructing a new
>> file descriptor for bpf maps. To not break backward compatibility, the
>> f_flags is set to O_RDWR if the flag passed by syscall is 0. Otherwise
>> it should be O_RDONLY or O_WRONLY. When userspace wants to modify or
>> read the map content, the file mode is checked to see whether the
>> access is allowed.
>>
>> Signed-off-by: Chenbo Feng <fengc@google.com>
>> Acked-by: Alexei Starovoitov <ast@kernel.org>
>> ---
>> include/linux/bpf.h | 6 ++--
>> include/uapi/linux/bpf.h | 6 ++++
>> kernel/bpf/arraymap.c | 7 +++--
>> kernel/bpf/devmap.c | 5 ++-
>> kernel/bpf/hashtab.c | 5 +--
>> kernel/bpf/inode.c | 15 ++++++---
>> kernel/bpf/lpm_trie.c | 3 +-
>> kernel/bpf/sockmap.c | 5 ++-
>> kernel/bpf/stackmap.c | 5 ++-
>> kernel/bpf/syscall.c | 80 +++++++++++++++++++++++++++++++++++++++++++-----
>> 10 files changed, 114 insertions(+), 23 deletions(-)
>>
>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>> index bc7da2ddfcaf..0e9ca2555d7f 100644
>> --- a/include/linux/bpf.h
>> +++ b/include/linux/bpf.h
>> @@ -308,11 +308,11 @@ void bpf_map_area_free(void *base);
>>
>> extern int sysctl_unprivileged_bpf_disabled;
>>
>> -int bpf_map_new_fd(struct bpf_map *map);
>> +int bpf_map_new_fd(struct bpf_map *map, int flags);
>> int bpf_prog_new_fd(struct bpf_prog *prog);
>>
>> int bpf_obj_pin_user(u32 ufd, const char __user *pathname);
>> -int bpf_obj_get_user(const char __user *pathname);
>> +int bpf_obj_get_user(const char __user *pathname, int flags);
>>
>> int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value);
>> int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value);
>> @@ -331,6 +331,8 @@ int bpf_fd_htab_map_update_elem(struct bpf_map *map, struct file *map_file,
>> void *key, void *value, u64 map_flags);
>> int bpf_fd_htab_map_lookup_elem(struct bpf_map *map, void *key, u32 *value);
>>
>> +int bpf_get_file_flag(int flags);
>> +
>> /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and
>> * forced to use 'long' read/writes to try to atomically copy long counters.
>> * Best-effort only. No barriers here, since it _will_ race with concurrent
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 6db9e1d679cd..9cb50a228c39 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -217,6 +217,10 @@ enum bpf_attach_type {
>>
>> #define BPF_OBJ_NAME_LEN 16U
>>
>> +/* Flags for accessing BPF object */
>> +#define BPF_F_RDONLY (1U << 3)
>> +#define BPF_F_WRONLY (1U << 4)
>> +
>> union bpf_attr {
>> struct { /* anonymous struct used by BPF_MAP_CREATE command */
>> __u32 map_type; /* one of enum bpf_map_type */
>> @@ -259,6 +263,7 @@ union bpf_attr {
>> struct { /* anonymous struct used by BPF_OBJ_* commands */
>> __aligned_u64 pathname;
>> __u32 bpf_fd;
>> + __u32 file_flags;
>> };
>>
>> struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
>> @@ -286,6 +291,7 @@ union bpf_attr {
>> __u32 map_id;
>> };
>> __u32 next_id;
>> + __u32 open_flags;
>> };
>>
>> struct { /* anonymous struct used by BPF_OBJ_GET_INFO_BY_FD */
>> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
>> index 68d866628be0..f869e48ef2f6 100644
>> --- a/kernel/bpf/arraymap.c
>> +++ b/kernel/bpf/arraymap.c
>> @@ -19,6 +19,9 @@
>>
>> #include "map_in_map.h"
>>
>> +#define ARRAY_CREATE_FLAG_MASK \
>> + (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY)
>> +
>> static void bpf_array_free_percpu(struct bpf_array *array)
>> {
>> int i;
>> @@ -56,8 +59,8 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
>>
>> /* check sanity of attributes */
>> if (attr->max_entries == 0 || attr->key_size != 4 ||
>> - attr->value_size == 0 || attr->map_flags & ~BPF_F_NUMA_NODE ||
>> - (percpu && numa_node != NUMA_NO_NODE))
>> + attr->value_size == 0 || attr->map_flags &
>> + ~ARRAY_CREATE_FLAG_MASK || (percpu && numa_node != NUMA_NO_NODE))
>
> that's a very non-standard way of breaking lines.
> Did you run checkpatch ? did it complain?
>
Will fix in the next revision. checkpatch didn't say anything about
this: 0 errors and 0 warnings for this patch series.
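For readers following the thread, the behaviour described in the quoted commit message (a flag of 0 keeps O_RDWR, otherwise the fd is read-only or write-only) can be sketched in userspace C. This is only an illustration, not the code from the patch: sketch_get_file_flag is a hypothetical name, the two flag values are copied from the uapi hunk quoted above, and rejecting both flags together with -EINVAL is an assumption of the sketch.

```c
#include <errno.h>
#include <fcntl.h>

/* Values copied from the include/uapi/linux/bpf.h hunk quoted above. */
#define BPF_F_RDONLY (1U << 3)
#define BPF_F_WRONLY (1U << 4)

/*
 * Hypothetical helper: translate the map access flags into an open(2)
 * style file mode. Returning -EINVAL when both flags are set is an
 * assumption of this sketch, not something the quoted text states.
 */
int sketch_get_file_flag(unsigned int flags)
{
	if ((flags & BPF_F_RDONLY) && (flags & BPF_F_WRONLY))
		return -EINVAL;
	if (flags & BPF_F_RDONLY)
		return O_RDONLY;
	if (flags & BPF_F_WRONLY)
		return O_WRONLY;
	/* No access flag given: keep the old read-write behaviour. */
	return O_RDWR;
}
```

Keeping 0 mapped to O_RDWR is what lets existing callers, which never set these bits, continue to work unchanged.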
* Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: Alexei Starovoitov @ 2017-10-09 23:23 UTC
To: Levin, Alexander (Sasha Levin)
Cc: Tim Hansen, davem@davemloft.net, willemb@google.com,
edumazet@google.com, soheil@google.com, pabeni@redhat.com,
elena.reshetova@intel.com, tom@quantonium.net, Jason@zx2c4.com,
fw@strlen.de, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20171009231538.doypjzvxzkoxyoeo@sasha-lappy>
On Mon, Oct 09, 2017 at 11:15:40PM +0000, Levin, Alexander (Sasha Levin) wrote:
> On Mon, Oct 09, 2017 at 04:06:20PM -0700, Alexei Starovoitov wrote:
> >On Mon, Oct 09, 2017 at 08:26:34PM +0000, Levin, Alexander (Sasha Levin) wrote:
> >> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
> >> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
> >> >> Fix BUG() calls to use BUG_ON(conditional) macros.
> >> >>
> >> >> This was found using make coccicheck M=net/core on linux next
> >> >> tag next-2017092
> >> >>
> >> >> Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
> >> >> ---
> >> >> net/core/skbuff.c | 15 ++++++---------
> >> >> 1 file changed, 6 insertions(+), 9 deletions(-)
> >> >>
> >> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
> >> >> --- a/net/core/skbuff.c
> >> >> +++ b/net/core/skbuff.c
> >> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
> >> >> /* Set the tail pointer and length */
> >> >> skb_put(n, skb->len);
> >> >>
> >> >> - if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
> >> >> - BUG();
> >> >> + BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
> >> >
> >> >I'm concerned with this change.
> >> >1. Calling a non-trivial bit of code inside the macro is poor coding style (imo)
> >> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON, and the implementations
> >> >of BUG and BUG_ON look quite different.
> >>
> >> For these archs, wouldn't it then be more efficient to use BUG_ON rather than BUG()?
> >
> >why more efficient? any data to prove that?
>
> Just guessing.
>
> Either way, is there a particular reason for not using BUG_ON() here
> besides that its implementation is "quite different"?
>
> >I'm pointing out that the change is not equivalent and
> >this code has been around forever (pre-git days), so I see
> >no reason to risk changing it.
>
> Do you know that BUG_ON() is broken on any archs?
>
> If not, "this code has been around forever" is really not an excuse to
> not touch code.
>
> If BUG_ON() behavior is broken somewhere, then it needs to get fixed.
No idea whether it's broken. My main objection is #1.
imo it's very poor coding style to put functions with
side effects into macros, especially debug/bug/warn-like ones.
For example, llvm has a DEBUG() macro, and everything inside it
will disappear depending on compilation flags.
I wouldn't be surprised if somebody, in the name
of security (to avoid a crash on BUG_ON), replaces
BUG/BUG_ON with some other implementation or a nop
and ends up with real bugs, because skb_copy_bits() is then
not called or is called in a different context.
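The concern can be made concrete with a small userspace sketch. Everything in it is hypothetical: fake_copy stands in for a call with side effects such as skb_copy_bits(), and the two ASSERT_* macros are invented for illustration. It makes no claim about how the kernel's BUG_ON is defined in any configuration.

```c
#include <stdlib.h>

static int copy_calls; /* counts how often the "copy" really ran */

/* Stand-in for a call with side effects, such as skb_copy_bits(). */
int fake_copy(void)
{
	copy_calls++;
	return 0; /* 0 means success */
}

/* Hypothetical checking macro that evaluates its argument. */
#define ASSERT_ACTIVE(cond) do { if (cond) abort(); } while (0)

/* Hypothetical variant that is compiled out: the argument is dropped. */
#define ASSERT_COMPILED_OUT(cond) do { } while (0)

/* Explicit if: the call is ordinary code and always runs. */
int calls_with_explicit_if(void)
{
	copy_calls = 0;
	if (fake_copy())
		abort();
	return copy_calls;
}

/* Call placed inside a macro that evaluates its argument. */
int calls_with_active_macro(void)
{
	copy_calls = 0;
	ASSERT_ACTIVE(fake_copy());
	return copy_calls;
}

/* Same call inside the compiled-out macro: it silently disappears. */
int calls_with_compiled_out_macro(void)
{
	copy_calls = 0;
	ASSERT_COMPILED_OUT(fake_copy());
	return copy_calls;
}
```

With the explicit if, the copy runs regardless of how the checking macro is configured, which is the property the objection is about.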
* Re: linux-next: manual merge of the drivers-x86 tree with the net-next tree
From: Darren Hart @ 2017-10-09 23:18 UTC
To: Mark Brown
Cc: Mika Westerberg, Mario Limonciello, Yehezkel Bernat,
Andy Shevchenko, Amir Levy, Michael Jamet, David S. Miller,
netdev, Linux-Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20171009195634.gzxwcqn7xklumqyr@sirena.co.uk>
On Mon, Oct 09, 2017 at 08:56:34PM +0100, Mark Brown wrote:
> On Mon, Oct 09, 2017 at 10:43:01PM +0300, Mika Westerberg wrote:
>
> > If possible, I would rather move this chapter to be before "Networking
> > over Thunderbolt cable". Reason is that it then follows NVM flashing
> > chapter which is typically where you need to force power in the first
> > place.
>
Agreed.
> I guess that's something best sorted out either in the relevant trees or
> during the merge window?
I'm not sure how we would deal with it in the trees. Best to note this during
the merge window, for whichever tree goes in second. A test merge will identify
the merge conflict, and we can include a note to Linus on the preference.
--
Darren Hart
VMware Open Source Technology Center
* Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: David Miller @ 2017-10-09 23:17 UTC
To: alexei.starovoitov
Cc: alexander.levin, devtimhansen, willemb, edumazet, soheil, pabeni,
elena.reshetova, tom, Jason, fw, netdev, linux-kernel
In-Reply-To: <20171009230618.e5gla2iuqwmndkig@ast-mbp>
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date: Mon, 9 Oct 2017 16:06:20 -0700
>> For these archs, wouldn't it then be more efficient to use BUG_ON
>> rather than BUG()?
>
> why more efficient? any data to prove that?
It can completely eliminate a branch.
For example on powerpc if you use BUG() then the code generated is:
	test condition
	branch_not_true 1f
	unconditional_trap
1:
Whereas with BUG_ON() it's just:
	test condition
	trap_if_true
Which is a lot better even when the branches in the first case are
well predicted.
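At the source level, the two forms being compared look like the sketch below. The SKETCH_* macros are userspace stand-ins invented for illustration; the generic if-then-trap shape of SKETCH_BUG_ON is an assumption, and the single conditional trap described above only applies where an architecture supplies its own BUG_ON.

```c
#include <stdlib.h>

/* Userspace stand-ins, hypothetical names. */
#define SKETCH_BUG() abort()
#define SKETCH_BUG_ON(cond) do { if (cond) SKETCH_BUG(); } while (0)

/* Open-coded form: a test, a branch around the trap, then the trap. */
int checked_len_open_coded(int len)
{
	if (len < 0)
		SKETCH_BUG();
	return len;
}

/*
 * Macro form: the whole condition is handed to the macro, so an
 * architecture-specific definition is free to lower it differently.
 */
int checked_len_bug_on(int len)
{
	SKETCH_BUG_ON(len < 0);
	return len;
}
```

In this userspace sketch both functions behave identically; any difference in generated code depends on the architecture's own macro definitions.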
* Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: Levin, Alexander (Sasha Levin) @ 2017-10-09 23:15 UTC
To: Alexei Starovoitov
Cc: Tim Hansen, davem@davemloft.net, willemb@google.com,
edumazet@google.com, soheil@google.com, pabeni@redhat.com,
elena.reshetova@intel.com, tom@quantonium.net, Jason@zx2c4.com,
fw@strlen.de, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20171009230618.e5gla2iuqwmndkig@ast-mbp>
On Mon, Oct 09, 2017 at 04:06:20PM -0700, Alexei Starovoitov wrote:
>On Mon, Oct 09, 2017 at 08:26:34PM +0000, Levin, Alexander (Sasha Levin) wrote:
>> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
>> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
>> >> Fix BUG() calls to use BUG_ON(conditional) macros.
>> >>
>> >> This was found using make coccicheck M=net/core on linux next
>> >> tag next-2017092
>> >>
>> >> Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
>> >> ---
>> >> net/core/skbuff.c | 15 ++++++---------
>> >> 1 file changed, 6 insertions(+), 9 deletions(-)
>> >>
>> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
>> >> --- a/net/core/skbuff.c
>> >> +++ b/net/core/skbuff.c
>> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
>> >> /* Set the tail pointer and length */
>> >> skb_put(n, skb->len);
>> >>
>> >> - if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
>> >> - BUG();
>> >> + BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
>> >
>> >I'm concerned with this change.
>> >1. Calling a non-trivial bit of code inside the macro is poor coding style (imo)
>> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON, and the implementations
>> >of BUG and BUG_ON look quite different.
>>
>> For these archs, wouldn't it then be more efficient to use BUG_ON rather than BUG()?
>
>why more efficient? any data to prove that?
Just guessing.
Either way, is there a particular reason for not using BUG_ON() here
besides that its implementation is "quite different"?
>I'm pointing out that the change is not equivalent and
>this code has been around forever (pre-git days), so I see
>no reason to risk changing it.
Do you know that BUG_ON() is broken on any archs?
If not, "this code has been around forever" is really not an excuse to
not touch code.
If BUG_ON() behavior is broken somewhere, then it needs to get fixed.
--
Thanks,
Sasha
* [GIT] Networking
From: David Miller @ 2017-10-09 23:10 UTC
To: torvalds; +Cc: akpm, linux-kernel, netdev
1) Fix object leak on IPSEC offload failure, from Steffen Klassert.
2) Fix range checks in ipset address range addition operations,
from Jozsef Kadlecsik.
3) Fix pernet ops unregistration order in ipset, from Florian
Westphal.
4) Add missing netlink attribute policy for nl80211 packet pattern
attrs, from Peng Xu.
5) Fix PPP device destruction race, from Guillaume Nault.
6) Write marks get lost when BPF verifier processes R1=R2 register
assignments, causing incorrect liveness information and less
state pruning. Fix from Alexei Starovoitov.
7) Fix blackhole routes so that they are marked dead and therefore
not cached in sockets, otherwise IPSEC stops working. From
Steffen Klassert.
8) Fix broadcast handling of UDP socket early demux, from Paolo
Abeni.
Please pull, thanks a lot!
The following changes since commit 7a92616c0bac849e790283723b36c399668a1d9f:
Merge tag 'pm-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm (2017-10-05 15:51:37 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
for you to fetch changes up to fdfbad3256918fc5736d68384331d2dbf45ccbd6:
cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan (2017-10-09 16:03:32 -0700)
----------------------------------------------------------------
Aleksander Morgado (1):
cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan
Alexei Starovoitov (1):
bpf: fix liveness marking
Alexey Kodanev (2):
vti: fix NULL dereference in xfrm_input()
gso: fix payload length when gso_size is zero
Artem Savkov (2):
xfrm: don't call xfrm_policy_cache_flush under xfrm_state_lock
netfilter: ebtables: fix race condition in frame_filter_net_init()
Arvind Yadav (1):
netfilter: nf_tables: Release memory obtained by kasprintf
Axel Beckert (1):
doc: Fix typo "8023.ad" in bonding documentation
Dan Carpenter (1):
selftests/net: rxtimestamp: Fix an off by one
David S. Miller (4):
Merge branch 'master' of git://git.kernel.org/.../klassert/ipsec
Merge tag 'mac80211-for-davem-2017-10-09' of git://git.kernel.org/.../jberg/mac80211
Merge branch '10GbE' of git://git.kernel.org/.../jkirsher/net-queue
Merge git://git.kernel.org/.../pablo/nf
Ding Tianhong (2):
Revert commit 1a8b6d76dc5b ("net:add one common config...")
net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
Eric Dumazet (1):
netfilter: x_tables: avoid stack-out-of-bounds read in xt_copy_counters_from_user
Florian Westphal (1):
netfilter: ipset: pernet ops must be unregistered last
Guillaume Nault (1):
ppp: fix race in ppp device destruction
Gustavo A. R. Silva (1):
net: thunderx: mark expected switch fall-throughs in nicvf_main()
Ido Schimmel (1):
mlxsw: spectrum_router: Avoid expensive lookup during route removal
Jason A. Donenfeld (1):
netlink: do not set cb_running if dump's start() errs
JingPiao Chen (1):
netfilter: nf_tables: fix update chain error
John Fastabend (1):
ixgbe: incorrect XDP ring accounting in ethtool tx_frame param
Jon Maloy (2):
tipc: correct initialization of skb list
tipc: Unclone message at secondary destination lookup
Jozsef Kadlecsik (1):
netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses
Lin Zhang (1):
netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook
Mark D Rustad (1):
ixgbe: Return error when getting PHY address if PHY access is not supported
Matteo Croce (1):
ipv6: fix net.ipv6.conf.all.accept_dad behaviour for real
Pablo Neira Ayuso (1):
netfilter: nf_tables: do not dump chain counters if not enabled
Paolo Abeni (1):
udp: fix bcast packet reception
Peng Xu (1):
nl80211: Define policy for packet pattern attributes
Ross Lagerwall (1):
netfilter: ipset: Fix race between dump and swap
Sabrina Dubroca (1):
ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register
Shmulik Ladkani (1):
netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
Steffen Klassert (4):
xfrm: Fix deletion of offloaded SAs on failure.
xfrm: Fix negative device refcount on offload failure.
ipv6: Fix traffic triggered IPsec connections.
ipv4: Fix traffic triggered IPsec connections.
Subash Abhinov Kasiviswanathan (1):
netfilter: xt_socket: Restore mark from full sockets only
Vadim Fedorenko (1):
netfilter: ipvs: full-functionality option for ECN encapsulation in tunnel
Documentation/networking/bonding.txt | 2 +-
arch/Kconfig | 3 ---
arch/sparc/Kconfig | 1 -
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 2 ++
drivers/net/ethernet/intel/ixgbe/ixgbe_82598.c | 22 ----------------------
drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 19 -------------------
drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 16 ++++++++--------
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 +++++-
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 14 --------------
drivers/net/ppp/ppp_generic.c | 20 ++++++++++++++++++++
drivers/net/usb/cdc_ether.c | 13 +++++++++++++
include/linux/bpf.h | 5 +++++
include/linux/netfilter_bridge/ebtables.h | 7 ++++---
include/uapi/linux/netfilter/xt_bpf.h | 1 +
kernel/bpf/inode.c | 1 +
kernel/bpf/verifier.c | 5 +++++
net/bridge/netfilter/ebtable_broute.c | 4 ++--
net/bridge/netfilter/ebtable_filter.c | 4 ++--
net/bridge/netfilter/ebtable_nat.c | 4 ++--
net/bridge/netfilter/ebtables.c | 17 +++++++++--------
net/ipv4/gre_offload.c | 2 +-
net/ipv4/netfilter/ipt_SYNPROXY.c | 3 ++-
net/ipv4/route.c | 2 +-
net/ipv4/udp.c | 14 +++++---------
net/ipv4/udp_offload.c | 2 +-
net/ipv6/addrconf.c | 4 ++--
net/ipv6/ip6_offload.c | 2 +-
net/ipv6/netfilter/ip6t_SYNPROXY.c | 2 +-
net/ipv6/route.c | 2 +-
net/netfilter/ipset/ip_set_core.c | 29 ++++++++++++++++++-----------
net/netfilter/ipset/ip_set_hash_ip.c | 22 ++++++++++++----------
net/netfilter/ipset/ip_set_hash_ipmark.c | 2 +-
net/netfilter/ipset/ip_set_hash_ipport.c | 2 +-
net/netfilter/ipset/ip_set_hash_ipportip.c | 2 +-
net/netfilter/ipset/ip_set_hash_ipportnet.c | 4 ++--
net/netfilter/ipset/ip_set_hash_net.c | 2 +-
net/netfilter/ipset/ip_set_hash_netiface.c | 2 +-
net/netfilter/ipset/ip_set_hash_netnet.c | 4 ++--
net/netfilter/ipset/ip_set_hash_netport.c | 2 +-
net/netfilter/ipset/ip_set_hash_netportnet.c | 4 ++--
net/netfilter/ipvs/ip_vs_xmit.c | 8 ++++++--
net/netfilter/nf_tables_api.c | 10 ++++++----
net/netfilter/x_tables.c | 4 ++--
net/netfilter/xt_bpf.c | 22 ++++++++++++++++++++--
net/netfilter/xt_socket.c | 4 ++--
net/netlink/af_netlink.c | 13 +++++++------
net/tipc/bcast.c | 4 ++--
net/tipc/msg.c | 8 ++++++++
net/wireless/nl80211.c | 14 ++++++++++++--
net/xfrm/xfrm_device.c | 1 +
net/xfrm/xfrm_input.c | 6 ++++--
net/xfrm/xfrm_state.c | 4 ++--
net/xfrm/xfrm_user.c | 1 +
tools/testing/selftests/networking/timestamping/rxtimestamp.c | 2 +-
54 files changed, 211 insertions(+), 164 deletions(-)
* Re: [PATCH net-next v2 1/5] bpf: Add file mode configuration into bpf maps
From: Alexei Starovoitov @ 2017-10-09 23:07 UTC
To: Chenbo Feng
Cc: linux-security-module, netdev, SELinux, Jeffrey Vander Stoep,
lorenzo, Daniel Borkmann, Stephen Smalley, Chenbo Feng
In-Reply-To: <20171009222028.13096-2-chenbofeng.kernel@gmail.com>
On Mon, Oct 09, 2017 at 03:20:24PM -0700, Chenbo Feng wrote:
> From: Chenbo Feng <fengc@google.com>
>
> Introduce the map read/write flags to the eBPF syscalls that return the
> map fd. The flags are used to set up the file mode when constructing a new
> file descriptor for bpf maps. To not break backward compatibility, the
> f_flags is set to O_RDWR if the flag passed by syscall is 0. Otherwise
> it should be O_RDONLY or O_WRONLY. When userspace wants to modify or
> read the map content, the file mode is checked to see whether the
> access is allowed.
>
> Signed-off-by: Chenbo Feng <fengc@google.com>
> Acked-by: Alexei Starovoitov <ast@kernel.org>
> ---
> include/linux/bpf.h | 6 ++--
> include/uapi/linux/bpf.h | 6 ++++
> kernel/bpf/arraymap.c | 7 +++--
> kernel/bpf/devmap.c | 5 ++-
> kernel/bpf/hashtab.c | 5 +--
> kernel/bpf/inode.c | 15 ++++++---
> kernel/bpf/lpm_trie.c | 3 +-
> kernel/bpf/sockmap.c | 5 ++-
> kernel/bpf/stackmap.c | 5 ++-
> kernel/bpf/syscall.c | 80 +++++++++++++++++++++++++++++++++++++++++++-----
> 10 files changed, 114 insertions(+), 23 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index bc7da2ddfcaf..0e9ca2555d7f 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -308,11 +308,11 @@ void bpf_map_area_free(void *base);
>
> extern int sysctl_unprivileged_bpf_disabled;
>
> -int bpf_map_new_fd(struct bpf_map *map);
> +int bpf_map_new_fd(struct bpf_map *map, int flags);
> int bpf_prog_new_fd(struct bpf_prog *prog);
>
> int bpf_obj_pin_user(u32 ufd, const char __user *pathname);
> -int bpf_obj_get_user(const char __user *pathname);
> +int bpf_obj_get_user(const char __user *pathname, int flags);
>
> int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value);
> int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value);
> @@ -331,6 +331,8 @@ int bpf_fd_htab_map_update_elem(struct bpf_map *map, struct file *map_file,
> void *key, void *value, u64 map_flags);
> int bpf_fd_htab_map_lookup_elem(struct bpf_map *map, void *key, u32 *value);
>
> +int bpf_get_file_flag(int flags);
> +
> /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and
> * forced to use 'long' read/writes to try to atomically copy long counters.
> * Best-effort only. No barriers here, since it _will_ race with concurrent
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 6db9e1d679cd..9cb50a228c39 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -217,6 +217,10 @@ enum bpf_attach_type {
>
> #define BPF_OBJ_NAME_LEN 16U
>
> +/* Flags for accessing BPF object */
> +#define BPF_F_RDONLY (1U << 3)
> +#define BPF_F_WRONLY (1U << 4)
> +
> union bpf_attr {
> struct { /* anonymous struct used by BPF_MAP_CREATE command */
> __u32 map_type; /* one of enum bpf_map_type */
> @@ -259,6 +263,7 @@ union bpf_attr {
> struct { /* anonymous struct used by BPF_OBJ_* commands */
> __aligned_u64 pathname;
> __u32 bpf_fd;
> + __u32 file_flags;
> };
>
> struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
> @@ -286,6 +291,7 @@ union bpf_attr {
> __u32 map_id;
> };
> __u32 next_id;
> + __u32 open_flags;
> };
>
> struct { /* anonymous struct used by BPF_OBJ_GET_INFO_BY_FD */
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 68d866628be0..f869e48ef2f6 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -19,6 +19,9 @@
>
> #include "map_in_map.h"
>
> +#define ARRAY_CREATE_FLAG_MASK \
> + (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY)
> +
> static void bpf_array_free_percpu(struct bpf_array *array)
> {
> int i;
> @@ -56,8 +59,8 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
>
> /* check sanity of attributes */
> if (attr->max_entries == 0 || attr->key_size != 4 ||
> - attr->value_size == 0 || attr->map_flags & ~BPF_F_NUMA_NODE ||
> - (percpu && numa_node != NUMA_NO_NODE))
> + attr->value_size == 0 || attr->map_flags &
> + ~ARRAY_CREATE_FLAG_MASK || (percpu && numa_node != NUMA_NO_NODE))
that's a very non-standard way of breaking lines.
Did you run checkpatch ? did it complain?
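For comparison, one conventional layout breaks after each || so that every sub-condition stays on a single line. The sketch below is a userspace illustration only: struct sketch_attr and sketch_attr_invalid are hypothetical stand-ins for the kernel's union bpf_attr and array_map_alloc(), the BPF_F_NUMA_NODE and NUMA_NO_NODE values are assumptions (they are not shown in this thread), and the extra parentheses around the mask test are a readability choice, not something the review asked for.

```c
#include <stdbool.h>
#include <stdint.h>

#define BPF_F_NUMA_NODE (1U << 2) /* assumed value, not shown in the thread */
#define BPF_F_RDONLY    (1U << 3) /* from the quoted uapi hunk */
#define BPF_F_WRONLY    (1U << 4) /* from the quoted uapi hunk */
#define NUMA_NO_NODE    (-1)      /* assumed stand-in */

#define ARRAY_CREATE_FLAG_MASK \
	(BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY)

/* Hypothetical stand-in for the bpf_attr fields the check reads. */
struct sketch_attr {
	uint32_t max_entries;
	uint32_t key_size;
	uint32_t value_size;
	uint32_t map_flags;
};

/* Returns true when the attributes should be rejected. */
bool sketch_attr_invalid(const struct sketch_attr *attr, bool percpu,
			 int numa_node)
{
	/* Break after each ||, never in the middle of a sub-condition. */
	if (attr->max_entries == 0 || attr->key_size != 4 ||
	    attr->value_size == 0 ||
	    (attr->map_flags & ~ARRAY_CREATE_FLAG_MASK) ||
	    (percpu && numa_node != NUMA_NO_NODE))
		return true;
	return false;
}
```

The logic is the same as in the quoted hunk; only the placement of the line breaks differs.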
* Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
From: Alexei Starovoitov @ 2017-10-09 23:06 UTC
To: Levin, Alexander (Sasha Levin)
Cc: Tim Hansen, davem@davemloft.net, willemb@google.com,
edumazet@google.com, soheil@google.com, pabeni@redhat.com,
elena.reshetova@intel.com, tom@quantonium.net, Jason@zx2c4.com,
fw@strlen.de, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20171009202633.ep5pbi2tlg7dqidz@sasha-lappy>
On Mon, Oct 09, 2017 at 08:26:34PM +0000, Levin, Alexander (Sasha Levin) wrote:
> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote:
> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote:
> >> Fix BUG() calls to use BUG_ON(conditional) macros.
> >>
> >> This was found using make coccicheck M=net/core on linux next
> >> tag next-2017092
> >>
> >> Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
> >> ---
> >> net/core/skbuff.c | 15 ++++++---------
> >> 1 file changed, 6 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644
> >> --- a/net/core/skbuff.c
> >> +++ b/net/core/skbuff.c
> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t gfp_mask)
> >> /* Set the tail pointer and length */
> >> skb_put(n, skb->len);
> >>
> >> - if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
> >> - BUG();
> >> + BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
> >
> >I'm concerned with this change.
> >1. Calling a non-trivial bit of code inside the macro is poor coding style (imo)
> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON, and the implementations
> >of BUG and BUG_ON look quite different.
>
> For these archs, wouldn't it then be more efficient to use BUG_ON rather than BUG()?
why more efficient? any data to prove that?
I'm pointing out that the change is not equivalent and
this code has been around forever (pre-git days), so I see
no reason to risk changing it.
* Re: linux-next: manual merge of the cgroup tree with the net-next tree
From: Alexei Starovoitov @ 2017-10-09 23:04 UTC
To: Mark Brown
Cc: Tejun Heo, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
David S. Miller, netdev, Linux-Next Mailing List,
Linux Kernel Mailing List
In-Reply-To: <20171009183836.cceczuqytzmgqubr@sirena.co.uk>
On Mon, Oct 09, 2017 at 07:38:36PM +0100, Mark Brown wrote:
> Hi Tejun,
>
> Today's linux-next merge of the cgroup tree got a conflict in:
>
> kernel/cgroup/cgroup.c
>
> between commit:
>
> 324bda9e6c5ad ("bpf: multi program support for cgroup+bpf")
>
> from the net-next tree and commit:
>
> 041cd640b2f3c ("cgroup: Implement cgroup2 basic CPU usage accounting")
>
> from the cgroup tree.
>
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging. You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
>
> diff --cc kernel/cgroup/cgroup.c
> index 00f5b358aeac,c3421ee0d230..000000000000
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@@ -4765,8 -4785,9 +4788,11 @@@ static struct cgroup *cgroup_create(str
>
> return cgrp;
>
> +out_idr_free:
> + cgroup_idr_remove(&root->cgroup_idr, cgrp->id);
> + out_stat_exit:
> + if (cgroup_on_dfl(parent))
> + cgroup_stat_exit(cgrp);
thanks. I did the same merge conflict resolution for our combined tree.
* Re: [PATCH] cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan
From: David Miller @ 2017-10-09 23:03 UTC
To: aleksander
Cc: oliver, marco.demarco, stefano.godeas, linux-usb, netdev,
linux-kernel
In-Reply-To: <20171009120512.16681-1-aleksander@aleksander.es>
From: Aleksander Morgado <aleksander@aleksander.es>
Date: Mon, 9 Oct 2017 14:05:12 +0200
> The u-blox TOBY-L2 is an LTE Cat 4 module with HSPA+ and 2G fallback.
> This module allows switching to different USB profiles with the
> 'AT+UUSBCONF' command, and provides an ECM network interface when the
> 'AT+UUSBCONF=2' profile is selected.
>
> The u-blox SARA-U2 is an HSPA module with 2G fallback. The default USB
> configuration includes an ECM network interface.
>
> Both these modules are controlled via AT commands through one of the
> TTYs exposed. Connecting these modules may be done just by activating
> the desired PDP context with 'AT+CGACT=1,<cid>' and then running DHCP
> on the ECM interface.
>
> Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Applied, thank you.
* [net-next 14/14] i40e: Avoid some useless variables and initializers in NVM functions
From: Jeff Kirsher @ 2017-10-09 22:38 UTC
To: davem; +Cc: Stefano Brivio, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Stefano Brivio <sbrivio@redhat.com>
Fixes: 09f79fd49d94 ("i40e: avoid NVM acquire deadlock during NVM update")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_nvm.c | 20 +++++++-------------
1 file changed, 7 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
index 57505b1df98d..151d9cfb6ea4 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
@@ -311,13 +311,10 @@ static i40e_status i40e_read_nvm_word_aq(struct i40e_hw *hw, u16 offset,
static i40e_status __i40e_read_nvm_word(struct i40e_hw *hw,
u16 offset, u16 *data)
{
- i40e_status ret_code = 0;
-
if (hw->flags & I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE)
- ret_code = i40e_read_nvm_word_aq(hw, offset, data);
- else
- ret_code = i40e_read_nvm_word_srctl(hw, offset, data);
- return ret_code;
+ return i40e_read_nvm_word_aq(hw, offset, data);
+
+ return i40e_read_nvm_word_srctl(hw, offset, data);
}
/**
@@ -331,7 +328,7 @@ static i40e_status __i40e_read_nvm_word(struct i40e_hw *hw,
i40e_status i40e_read_nvm_word(struct i40e_hw *hw, u16 offset,
u16 *data)
{
- i40e_status ret_code = 0;
+ i40e_status ret_code;
ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ);
if (ret_code)
@@ -446,13 +443,10 @@ static i40e_status __i40e_read_nvm_buffer(struct i40e_hw *hw,
u16 offset, u16 *words,
u16 *data)
{
- i40e_status ret_code = 0;
-
if (hw->flags & I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE)
- ret_code = i40e_read_nvm_buffer_aq(hw, offset, words, data);
- else
- ret_code = i40e_read_nvm_buffer_srctl(hw, offset, words, data);
- return ret_code;
+ return i40e_read_nvm_buffer_aq(hw, offset, words, data);
+
+ return i40e_read_nvm_buffer_srctl(hw, offset, words, data);
}
/**
--
2.14.2
* [net-next 11/14] i40e: Retry AQC GetPhyAbilities to overcome I2CRead hangs
From: Jeff Kirsher @ 2017-10-09 22:38 UTC
To: davem
Cc: Jayaprakash Shanmugam, netdev, nhorman, sassmann, jogreene,
Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Jayaprakash Shanmugam <jayaprakash.shanmugam@intel.com>
- When the I2C is busy, the PHY reads are delayed. The firmware will
  return EAGAIN in these cases with an expectation that the SW will
  trigger the reads again.
- This patch retries the operation for a maximum period of 500ms.
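The retry idea in the two points above can be sketched generically as a bounded loop. This is a userspace illustration under stated assumptions, not the driver code: sketch_retry_on_eagain, sketch_cmd_fn and sketch_busy_then_ok are hypothetical names, the budget of 500 tries assumes roughly one millisecond per attempt, and the actual delay is left out.

```c
#include <errno.h>

#define SKETCH_MAX_TRIES 500 /* hypothetical budget: about 500 x 1 ms */

/* Hypothetical callback type standing in for one firmware command. */
typedef int (*sketch_cmd_fn)(void *ctx);

/*
 * Re-issue cmd while it reports -EAGAIN and give up once the budget is
 * spent. Returns the command's final result, or -ETIMEDOUT if it never
 * stopped reporting -EAGAIN.
 */
int sketch_retry_on_eagain(sketch_cmd_fn cmd, void *ctx)
{
	int tries;

	for (tries = 0; tries < SKETCH_MAX_TRIES; tries++) {
		int ret = cmd(ctx);

		if (ret != -EAGAIN)
			return ret;
		/* A real driver would sleep briefly here before retrying. */
	}
	return -ETIMEDOUT;
}

/* Example command: busy for the first *remaining calls, then succeeds. */
int sketch_busy_then_ok(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}
```

Bounding the loop matters because a bus that never becomes free would otherwise stall the caller indefinitely.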
Signed-off-by: Jayaprakash Shanmugam <jayaprakash.shanmugam@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_common.c | 42 ++++++++++++++++++---------
drivers/net/ethernet/intel/i40e/i40e_type.h | 3 ++
drivers/net/ethernet/intel/i40evf/i40e_type.h | 3 ++
3 files changed, 35 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 60542beda7ad..53aad378d49c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -1567,30 +1567,46 @@ i40e_status i40e_aq_get_phy_capabilities(struct i40e_hw *hw,
struct i40e_aq_desc desc;
i40e_status status;
u16 abilities_size = sizeof(struct i40e_aq_get_phy_abilities_resp);
+ u16 max_delay = I40E_MAX_PHY_TIMEOUT, total_delay = 0;
if (!abilities)
return I40E_ERR_PARAM;
- i40e_fill_default_direct_cmd_desc(&desc,
- i40e_aqc_opc_get_phy_abilities);
+ do {
+ i40e_fill_default_direct_cmd_desc(&desc,
+ i40e_aqc_opc_get_phy_abilities);
- desc.flags |= cpu_to_le16((u16)I40E_AQ_FLAG_BUF);
- if (abilities_size > I40E_AQ_LARGE_BUF)
- desc.flags |= cpu_to_le16((u16)I40E_AQ_FLAG_LB);
+ desc.flags |= cpu_to_le16((u16)I40E_AQ_FLAG_BUF);
+ if (abilities_size > I40E_AQ_LARGE_BUF)
+ desc.flags |= cpu_to_le16((u16)I40E_AQ_FLAG_LB);
- if (qualified_modules)
- desc.params.external.param0 |=
+ if (qualified_modules)
+ desc.params.external.param0 |=
cpu_to_le32(I40E_AQ_PHY_REPORT_QUALIFIED_MODULES);
- if (report_init)
- desc.params.external.param0 |=
+ if (report_init)
+ desc.params.external.param0 |=
cpu_to_le32(I40E_AQ_PHY_REPORT_INITIAL_VALUES);
- status = i40e_asq_send_command(hw, &desc, abilities, abilities_size,
- cmd_details);
+ status = i40e_asq_send_command(hw, &desc, abilities,
+ abilities_size, cmd_details);
- if (hw->aq.asq_last_status == I40E_AQ_RC_EIO)
- status = I40E_ERR_UNKNOWN_PHY;
+ if (status)
+ break;
+
+ if (hw->aq.asq_last_status == I40E_AQ_RC_EIO) {
+ status = I40E_ERR_UNKNOWN_PHY;
+ break;
+ } else if (hw->aq.asq_last_status == I40E_AQ_RC_EAGAIN) {
+ usleep_range(1000, 2000);
+ total_delay++;
+ status = I40E_ERR_TIMEOUT;
+ }
+ } while ((hw->aq.asq_last_status != I40E_AQ_RC_OK) &&
+ (total_delay < max_delay));
+
+ if (status)
+ return status;
if (report_init) {
if (hw->mac.type == I40E_MAC_XL710 &&
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
index 4b32b1d38a66..0410fcbdbb94 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -46,6 +46,9 @@
/* Max default timeout in ms, */
#define I40E_MAX_NVM_TIMEOUT 18000
+/* Max timeout in ms for the phy to respond */
+#define I40E_MAX_PHY_TIMEOUT 500
+
/* Switch from ms to the 1usec global time (this is the GTIME resolution) */
#define I40E_MS_TO_GTIME(time) ((time) * 1000)
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_type.h b/drivers/net/ethernet/intel/i40evf/i40e_type.h
index 9364b67fff9c..213b773dfad6 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_type.h
@@ -46,6 +46,9 @@
/* Max default timeout in ms, */
#define I40E_MAX_NVM_TIMEOUT 18000
+/* Max timeout in ms for the phy to respond */
+#define I40E_MAX_PHY_TIMEOUT 500
+
/* Switch from ms to the 1usec global time (this is the GTIME resolution) */
#define I40E_MS_TO_GTIME(time) ((time) * 1000)
--
2.14.2
* [net-next 13/14] i40e: fix a typo
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Rami Rosen, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Rami Rosen <rami.rosen@intel.com>
This patch fixes a typo in i40e_vsi_alloc_arrays() documentation.
The first parameter name should be "vsi" instead of "type".
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index b26f615bed5a..4de52001a2b9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -7688,7 +7688,7 @@ static int i40e_set_num_rings_in_vsi(struct i40e_vsi *vsi)
/**
* i40e_vsi_alloc_arrays - Allocate queue and vector pointer arrays for the vsi
- * @type: VSI pointer
+ * @vsi: VSI pointer
* @alloc_qvectors: a bool to specify if q_vectors need to be allocated.
*
* On error: returns error code (negative)
--
2.14.2
* [net-next 12/14] i40e: use a local variable instead of calculating multiple times
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Lihong Yang, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Lihong Yang <lihong.yang@intel.com>
The computed result of I40E_MAX_VSI_QP * I40E_VIRTCHNL_SUPPORTED_QTYPES
is used more than three times in function i40e_config_irq_link_list.
Simply declare a local variable to store it to improve readability.
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 20 +++++++-------------
1 file changed, 7 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 125dcd1d2233..0c4fa225c7be 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -273,7 +273,7 @@ static void i40e_config_irq_link_list(struct i40e_vf *vf, u16 vsi_id,
struct i40e_hw *hw = &pf->hw;
u16 vsi_queue_id, pf_queue_id;
enum i40e_queue_type qtype;
- u16 next_q, vector_id;
+ u16 next_q, vector_id, size;
u32 reg, reg_idx;
u16 itr_idx = 0;
@@ -303,11 +303,9 @@ static void i40e_config_irq_link_list(struct i40e_vf *vf, u16 vsi_id,
vsi_queue_id + 1));
}
- next_q = find_first_bit(&linklistmap,
- (I40E_MAX_VSI_QP *
- I40E_VIRTCHNL_SUPPORTED_QTYPES));
- if (unlikely(next_q == (I40E_MAX_VSI_QP *
- I40E_VIRTCHNL_SUPPORTED_QTYPES)))
+ size = I40E_MAX_VSI_QP * I40E_VIRTCHNL_SUPPORTED_QTYPES;
+ next_q = find_first_bit(&linklistmap, size);
+ if (unlikely(next_q == size))
goto irq_list_done;
vsi_queue_id = next_q / I40E_VIRTCHNL_SUPPORTED_QTYPES;
@@ -317,7 +315,7 @@ static void i40e_config_irq_link_list(struct i40e_vf *vf, u16 vsi_id,
wr32(hw, reg_idx, reg);
- while (next_q < (I40E_MAX_VSI_QP * I40E_VIRTCHNL_SUPPORTED_QTYPES)) {
+ while (next_q < size) {
switch (qtype) {
case I40E_QUEUE_TYPE_RX:
reg_idx = I40E_QINT_RQCTL(pf_queue_id);
@@ -331,12 +329,8 @@ static void i40e_config_irq_link_list(struct i40e_vf *vf, u16 vsi_id,
break;
}
- next_q = find_next_bit(&linklistmap,
- (I40E_MAX_VSI_QP *
- I40E_VIRTCHNL_SUPPORTED_QTYPES),
- next_q + 1);
- if (next_q <
- (I40E_MAX_VSI_QP * I40E_VIRTCHNL_SUPPORTED_QTYPES)) {
+ next_q = find_next_bit(&linklistmap, size, next_q + 1);
+ if (next_q < size) {
vsi_queue_id = next_q / I40E_VIRTCHNL_SUPPORTED_QTYPES;
qtype = next_q % I40E_VIRTCHNL_SUPPORTED_QTYPES;
pf_queue_id = i40e_vc_get_pf_queue_id(vf, vsi_id,
--
2.14.2
* [net-next 08/14] i40e/i40evf: bundle more descriptors when allocating buffers
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
Double the number of descriptors we'll bundle into one tail bump when
receiving. Empirical testing has shown that we reduce CPU utilization
and don't appear to reduce throughput or packet rate. 32 seems to be the
sweet spot, as it's half the default polling budget, so we'd essentially
reduce from 4 tail writes when polling down to 2. Increasing this up to
64 appears to have negative impacts as it may become possible that we
don't bump the tail each time we get polled, which could cause a long
delay between returning descriptors to the hardware.
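The tail-write arithmetic above can be checked with a simplified model. This is a sketch, not the driver's receive path: it assumes a polling budget of 64 (implied by "32 is half the default polling budget"), and the function name and the carried-over pending counter are hypothetical.

```c
#include <assert.h>

/*
 * Count the tail writes issued during one poll that cleans `cleaned`
 * descriptors. `pending` carries descriptors not yet returned to the
 * hardware from one poll to the next. Simplified model: one tail write
 * each time `batch` descriptors have accumulated.
 */
static unsigned int tail_writes_in_poll(unsigned int cleaned,
					unsigned int batch,
					unsigned int *pending)
{
	unsigned int writes = 0;
	unsigned int i;

	assert(batch > 0);
	assert(pending != 0);

	for (i = 0; i < cleaned; i++) {
		if (++(*pending) >= batch) {
			writes++;
			*pending = 0;
		}
	}
	return writes;
}
```

With a full budget of 64, a batch of 16 yields four writes and a batch of 32 yields two. With a batch of 64, a poll that cleans only 40 descriptors issues no tail write at all, which is the delay the commit message warns about.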
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_txrx.h | 2 +-
drivers/net/ethernet/intel/i40evf/i40e_txrx.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index c3156aa3f709..ff57ae451524 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -208,7 +208,7 @@ static inline bool i40e_test_staterr(union i40e_rx_desc *rx_desc,
}
/* How many Rx Buffers do we bundle into one write to the hardware ? */
-#define I40E_RX_BUFFER_WRITE 16 /* Must be power of 2 */
+#define I40E_RX_BUFFER_WRITE 32 /* Must be power of 2 */
#define I40E_RX_INCREMENT(r, i) \
do { \
(i)++; \
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
index 8f9830d7649a..8d26c85d12e1 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
@@ -191,7 +191,7 @@ static inline bool i40e_test_staterr(union i40e_rx_desc *rx_desc,
}
/* How many Rx Buffers do we bundle into one write to the hardware ? */
-#define I40E_RX_BUFFER_WRITE 16 /* Must be power of 2 */
+#define I40E_RX_BUFFER_WRITE 32 /* Must be power of 2 */
#define I40E_RX_INCREMENT(r, i) \
do { \
(i)++; \
--
2.14.2
* [net-next 09/14] i40e: allow XPS with QoS enabled
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
Recently, the kernel gained support for enabling XPS and QoS at the
same time. Thus, we no longer need to worry about the number of
traffic classes when enabling XPS.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 17 ++++++-----------
1 file changed, 6 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 74875ddaeb33..b26f615bed5a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -2879,23 +2879,18 @@ static void i40e_vsi_free_rx_resources(struct i40e_vsi *vsi)
**/
static void i40e_config_xps_tx_ring(struct i40e_ring *ring)
{
- struct i40e_vsi *vsi = ring->vsi;
int cpu;
if (!ring->q_vector || !ring->netdev)
return;
- if ((vsi->tc_config.numtc <= 1) &&
- !test_and_set_bit(__I40E_TX_XPS_INIT_DONE, ring->state)) {
- cpu = cpumask_local_spread(ring->q_vector->v_idx, -1);
- netif_set_xps_queue(ring->netdev, get_cpu_mask(cpu),
- ring->queue_index);
- }
+ /* We only initialize XPS once, so as not to overwrite user settings */
+ if (test_and_set_bit(__I40E_TX_XPS_INIT_DONE, ring->state))
+ return;
- /* schedule our worker thread which will take care of
- * applying the new filter changes
- */
- i40e_service_event_schedule(vsi->back);
+ cpu = cpumask_local_spread(ring->q_vector->v_idx, -1);
+ netif_set_xps_queue(ring->netdev, get_cpu_mask(cpu),
+ ring->queue_index);
}
/**
--
2.14.2
* [net-next 10/14] i40e: add check for return from find_first_bit call
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Lihong Yang, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Lihong Yang <lihong.yang@intel.com>
The find_first_bit function returns the size passed to the search if
no set bit is found. This patch adds a check for that case, because the
return value is used as an array index and would otherwise cause an
out-of-bounds access.
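The contract described above can be illustrated in userspace C. This is a sketch only: first_set_bit is a hypothetical stand-in that mimics the documented behaviour of the kernel helper (return the index of the first set bit, or the size if none is set), and first_queue_entry and its table are invented for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Userspace stand-in with the same contract as the kernel helper:
 * return the index of the first set bit, or `size` if none is set.
 */
static size_t first_set_bit(const unsigned long *map, size_t size)
{
	size_t i;

	assert(map != NULL);

	for (i = 0; i < size; i++)
		if (map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
			return i;
	return size;
}

/*
 * Look up the table entry for the first set bit. The `idx == size`
 * check is the guard the patch adds; without it, table[size] would be
 * read, one element past the end. Returns -1 when no bit is set.
 */
static int first_queue_entry(const unsigned long *map, const int *table,
			     size_t size)
{
	size_t idx = first_set_bit(map, size);

	if (idx == size)
		return -1;
	return table[idx];
}
```

An empty bitmap makes first_set_bit return the size (8 in an eight-entry search), so the guarded lookup reports -1 instead of indexing the table.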
Detected by CoverityScan, CID 1295969 Out-of-bounds access
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 83727906a386..125dcd1d2233 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -306,6 +306,10 @@ static void i40e_config_irq_link_list(struct i40e_vf *vf, u16 vsi_id,
next_q = find_first_bit(&linklistmap,
(I40E_MAX_VSI_QP *
I40E_VIRTCHNL_SUPPORTED_QTYPES));
+ if (unlikely(next_q == (I40E_MAX_VSI_QP *
+ I40E_VIRTCHNL_SUPPORTED_QTYPES)))
+ goto irq_list_done;
+
vsi_queue_id = next_q / I40E_VIRTCHNL_SUPPORTED_QTYPES;
qtype = next_q % I40E_VIRTCHNL_SUPPORTED_QTYPES;
pf_queue_id = i40e_vc_get_pf_queue_id(vf, vsi_id, vsi_queue_id);
--
2.14.2
* [net-next 07/14] i40e/i40evf: bump tail only in multiples of 8
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
Hardware only fetches descriptors on cachelines of 8, essentially
ignoring the lower 3 bits of the tail register. Thus, it is pointless to
bump tail by an unaligned access as the hardware will ignore some of the
new descriptors we allocated. Thus, it's ideal if we can ensure tail
writes are always aligned to 8.
At first, it seems like we'd already do this, since we allocate
descriptors in batches which are a multiple of 8. Since we'd always
increment by a multiple of 8, it seems like the value should always be
aligned.
However, this ignores allocation failures. If we fail to allocate
a buffer, our tail register will become unaligned. Once it has become
unaligned it will essentially be stuck unaligned until a buffer
allocation happens to fail at the exact amount necessary to re-align it.
We can do better by simply rounding down the number of buffers we're
about to allocate (cleaned_count) such that "next_to_use + cleaned_count"
is rounded down to a multiple of 8.
We do this by calculating how far off that value is and subtracting it
from the cleaned_count. This essentially defers allocation of buffers if
they're going to be ignored by hardware anyways, and re-aligns our
next_to_use and tail values after a failure to allocate a descriptor.
This calculation ensures that we always align the tail writes in a way
the hardware expects and don't unnecessarily allocate buffers which
won't be fetched immediately.
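The calculation can be worked through with a small standalone function. This is a sketch of the arithmetic only, assuming ntu holds the ring's next_to_use index and that the caller passes at least 7 descriptors (the remainder is at most 7, so smaller counts could underflow); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Trim cleaned_count so that ntu + cleaned_count lands on a multiple
 * of 8. The low three bits of the sum are the number of descriptors
 * the hardware would ignore, so they are deferred to a later call.
 */
static uint16_t align_cleaned_count(uint16_t ntu, uint16_t cleaned_count)
{
	/* Assumed precondition: enough descriptors to absorb the trim. */
	assert(cleaned_count >= 7);

	return (uint16_t)(cleaned_count - ((ntu + cleaned_count) & 0x7));
}
```

For example, with ntu = 13 after an earlier allocation failure and cleaned_count = 32, the sum is 45, the low bits are 5, and only 27 buffers are requested, so the next tail value is 40. An already aligned ntu leaves a count of 32 unchanged.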
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 9 +++++++++
drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 9 +++++++++
2 files changed, 18 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 616abf79253e..a23306f04e00 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1372,6 +1372,15 @@ bool i40e_alloc_rx_buffers(struct i40e_ring *rx_ring, u16 cleaned_count)
union i40e_rx_desc *rx_desc;
struct i40e_rx_buffer *bi;
+ /* Hardware only fetches new descriptors in cache lines of 8,
+ * essentially ignoring the lower 3 bits of the tail register. We want
+ * to ensure our tail writes are aligned to avoid unnecessary work. We
+ * can't simply round down the cleaned count, since we might fail to
+ * allocate some buffers. What we really want is to ensure that
+ * next_to_used + cleaned_count produces an aligned value.
+ */
+ cleaned_count -= (ntu + cleaned_count) & 0x7;
+
/* do nothing if no valid netdev defined */
if (!rx_ring->netdev || !cleaned_count)
return false;
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index fe817e2b6fef..6806ada11490 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -711,6 +711,15 @@ bool i40evf_alloc_rx_buffers(struct i40e_ring *rx_ring, u16 cleaned_count)
union i40e_rx_desc *rx_desc;
struct i40e_rx_buffer *bi;
+ /* Hardware only fetches new descriptors in cache lines of 8,
+ * essentially ignoring the lower 3 bits of the tail register. We want
+ * to ensure our tail writes are aligned to avoid unnecessary work. We
+ * can't simply round down the cleaned count, since we might fail to
+ * allocate some buffers. What we really want is to ensure that
+ * next_to_used + cleaned_count produces an aligned value.
+ */
+ cleaned_count -= (ntu + cleaned_count) & 0x7;
+
/* do nothing if no valid netdev defined */
if (!rx_ring->netdev || !cleaned_count)
return false;
--
2.14.2
* [net-next 06/14] i40e: reduce lrxqthresh from 2 to 1
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
The lrxq thresh value tells hardware to immediately interrupt when there
are fewer than N*64 packets left in the ring.
Counterintuitively, empirical testing has shown that decreasing this
value from 2 to 1, and thus changing from an immediate interrupt at
fewer than 128 descriptors down to 64 descriptors causes a small
increase in the maximum total packets per second we can receive. This
increase occurs even when we're polling with interrupts masked, as the
hardware must still handle interrupts internally even if we've disabled
them in software.
Also reduce the value for any VFs we allocate.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 00a83afb02e9..74875ddaeb33 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -3030,7 +3030,7 @@ static int i40e_configure_rx_ring(struct i40e_ring *ring)
if (hw->revision_id == 0)
rx_ctx.lrxqthresh = 0;
else
- rx_ctx.lrxqthresh = 2;
+ rx_ctx.lrxqthresh = 1;
rx_ctx.crcstrip = 1;
rx_ctx.l2tsel = 1;
/* this controls whether VLAN is stripped from inner headers */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 10298956a81b..83727906a386 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -639,7 +639,7 @@ static int i40e_config_vsi_rx_queue(struct i40e_vf *vf, u16 vsi_id,
rx_ctx.dsize = 1;
/* default values */
- rx_ctx.lrxqthresh = 2;
+ rx_ctx.lrxqthresh = 1;
rx_ctx.crcstrip = 1;
rx_ctx.prefena = 1;
rx_ctx.l2tsel = 1;
--
2.14.2
* [net-next 05/14] i40e/i40evf: always set the CLEARPBA flag when re-enabling interrupts
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
In the past we changed driver behavior to not clear the PBA when
re-enabling interrupts. This change was motivated by the flawed belief
that clearing the PBA would cause a lost interrupt if a receive
interrupt occurred while interrupts were disabled.
According to empirical testing this isn't the case. Additionally, the
data sheet specifically says that we should set the CLEARPBA bit when
re-enabling interrupts in a polling setup.
This reverts commit 40d72a509862 ("i40e/i40evf: don't lose interrupts")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e.h | 5 +----
drivers/net/ethernet/intel/i40e/i40e_main.c | 11 +++++------
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 6 ++----
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 2 +-
drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 4 +---
5 files changed, 10 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 7baf6d8a84dd..8139b4ee1dc3 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -949,9 +949,6 @@ static inline void i40e_irq_dynamic_enable(struct i40e_vsi *vsi, int vector)
struct i40e_hw *hw = &pf->hw;
u32 val;
- /* definitely clear the PBA here, as this function is meant to
- * clean out all previous interrupts AND enable the interrupt
- */
val = I40E_PFINT_DYN_CTLN_INTENA_MASK |
I40E_PFINT_DYN_CTLN_CLEARPBA_MASK |
(I40E_ITR_NONE << I40E_PFINT_DYN_CTLN_ITR_INDX_SHIFT);
@@ -960,7 +957,7 @@ static inline void i40e_irq_dynamic_enable(struct i40e_vsi *vsi, int vector)
}
void i40e_irq_dynamic_disable_icr0(struct i40e_pf *pf);
-void i40e_irq_dynamic_enable_icr0(struct i40e_pf *pf, bool clearpba);
+void i40e_irq_dynamic_enable_icr0(struct i40e_pf *pf);
int i40e_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd);
int i40e_open(struct net_device *netdev);
int i40e_close(struct net_device *netdev);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index d4b0cc36afb1..00a83afb02e9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -3403,15 +3403,14 @@ void i40e_irq_dynamic_disable_icr0(struct i40e_pf *pf)
/**
* i40e_irq_dynamic_enable_icr0 - Enable default interrupt generation for icr0
* @pf: board private structure
- * @clearpba: true when all pending interrupt events should be cleared
**/
-void i40e_irq_dynamic_enable_icr0(struct i40e_pf *pf, bool clearpba)
+void i40e_irq_dynamic_enable_icr0(struct i40e_pf *pf)
{
struct i40e_hw *hw = &pf->hw;
u32 val;
val = I40E_PFINT_DYN_CTL0_INTENA_MASK |
- (clearpba ? I40E_PFINT_DYN_CTL0_CLEARPBA_MASK : 0) |
+ I40E_PFINT_DYN_CTL0_CLEARPBA_MASK |
(I40E_ITR_NONE << I40E_PFINT_DYN_CTL0_ITR_INDX_SHIFT);
wr32(hw, I40E_PFINT_DYN_CTL0, val);
@@ -3597,7 +3596,7 @@ static int i40e_vsi_enable_irq(struct i40e_vsi *vsi)
for (i = 0; i < vsi->num_q_vectors; i++)
i40e_irq_dynamic_enable(vsi, i);
} else {
- i40e_irq_dynamic_enable_icr0(pf, true);
+ i40e_irq_dynamic_enable_icr0(pf);
}
i40e_flush(&pf->hw);
@@ -3746,7 +3745,7 @@ static irqreturn_t i40e_intr(int irq, void *data)
wr32(hw, I40E_PFINT_ICR0_ENA, ena_mask);
if (!test_bit(__I40E_DOWN, pf->state)) {
i40e_service_event_schedule(pf);
- i40e_irq_dynamic_enable_icr0(pf, false);
+ i40e_irq_dynamic_enable_icr0(pf);
}
return ret;
@@ -8455,7 +8454,7 @@ static int i40e_setup_misc_vector(struct i40e_pf *pf)
i40e_flush(hw);
- i40e_irq_dynamic_enable_icr0(pf, true);
+ i40e_irq_dynamic_enable_icr0(pf);
return err;
}
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 3bd176606c09..616abf79253e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2202,9 +2202,7 @@ static u32 i40e_buildreg_itr(const int type, const u16 itr)
u32 val;
val = I40E_PFINT_DYN_CTLN_INTENA_MASK |
- /* Don't clear PBA because that can cause lost interrupts that
- * came in while we were cleaning/polling
- */
+ I40E_PFINT_DYN_CTLN_CLEARPBA_MASK |
(type << I40E_PFINT_DYN_CTLN_ITR_INDX_SHIFT) |
(itr << I40E_PFINT_DYN_CTLN_INTERVAL_SHIFT);
@@ -2241,7 +2239,7 @@ static inline void i40e_update_enable_itr(struct i40e_vsi *vsi,
/* If we don't have MSIX, then we only need to re-enable icr0 */
if (!(vsi->back->flags & I40E_FLAG_MSIX_ENABLED)) {
- i40e_irq_dynamic_enable_icr0(vsi->back, false);
+ i40e_irq_dynamic_enable_icr0(vsi->back);
return;
}
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index c062d74d21f3..10298956a81b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1358,7 +1358,7 @@ int i40e_alloc_vfs(struct i40e_pf *pf, u16 num_alloc_vfs)
i40e_free_vfs(pf);
err_iov:
/* Re-enable interrupt 0. */
- i40e_irq_dynamic_enable_icr0(pf, false);
+ i40e_irq_dynamic_enable_icr0(pf);
return ret;
}
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index 37e1de886d48..fe817e2b6fef 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -1409,9 +1409,7 @@ static u32 i40e_buildreg_itr(const int type, const u16 itr)
u32 val;
val = I40E_VFINT_DYN_CTLN1_INTENA_MASK |
- /* Don't clear PBA because that can cause lost interrupts that
- * came in while we were cleaning/polling
- */
+ I40E_VFINT_DYN_CTLN1_CLEARPBA_MASK |
(type << I40E_VFINT_DYN_CTLN1_ITR_INDX_SHIFT) |
(itr << I40E_VFINT_DYN_CTLN1_INTERVAL_SHIFT);
--
2.14.2
* [net-next 04/14] i40e/i40evf: fix incorrect default ITR values on driver load
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
The ITR register expects to be programmed in units of 2 microseconds.
Because of this, all of the driver's I40E_ITR_* constants are in terms of
this 2 microsecond register.
Unfortunately, the rx_itr_default value is expected to be programmed in
microseconds.
Effectively the driver defaults to an ITR value of half the expected
value (in terms of minimum microseconds between interrupts).
Fix this by changing the default values to be calculated using
ITR_REG_TO_USEC macro which indicates that we're converting from the
register units into microseconds.
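The unit mismatch can be shown with a little arithmetic. This is a sketch under assumptions: the helper names are hypothetical, and the conversions are assumed to be a shift by one bit, which follows from the 2 microsecond register resolution stated above. The constant 0x003E is the I40E_ITR_8K value visible in the header.

```c
#include <assert.h>
#include <stdint.h>

#define ITR_8K_REG 0x003E /* I40E_ITR_8K, expressed in register units */

/* Assumed conversions for a register with 2 usec resolution. */
static uint32_t itr_reg_to_usec(uint32_t reg)
{
	return reg << 1;
}

static uint32_t itr_usec_to_reg(uint32_t usec)
{
	return usec >> 1;
}

/* Upper bound on the interrupt rate for a given minimum gap in usec. */
static uint32_t interrupts_per_sec(uint32_t usec)
{
	assert(usec > 0);
	return 1000000u / usec;
}
```

Converted correctly, 0x003E (62 register units) is 124 microseconds, or about 8064 interrupts per second. If the raw register value 62 is stored where microseconds are expected, it is written back as 31 register units, which is 62 microseconds: half the intended interval and roughly twice the interrupt rate.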
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 4 ++--
drivers/net/ethernet/intel/i40e/i40e_txrx.h | 6 ++++--
drivers/net/ethernet/intel/i40evf/i40e_txrx.h | 6 ++++--
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 4 ++--
4 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 60b11fdeca2d..d4b0cc36afb1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -8983,8 +8983,8 @@ static int i40e_sw_init(struct i40e_pf *pf)
I40E_FLAG_MSIX_ENABLED;
/* Set default ITR */
- pf->rx_itr_default = I40E_ITR_DYNAMIC | I40E_ITR_RX_DEF;
- pf->tx_itr_default = I40E_ITR_DYNAMIC | I40E_ITR_TX_DEF;
+ pf->rx_itr_default = I40E_ITR_RX_DEF;
+ pf->tx_itr_default = I40E_ITR_TX_DEF;
/* Depending on PF configurations, it is possible that the RSS
* maximum might end up larger than the available queues
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index a4e3e665a1a1..c3156aa3f709 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -38,8 +38,10 @@
#define I40E_ITR_8K 0x003E
#define I40E_ITR_4K 0x007A
#define I40E_MAX_INTRL 0x3B /* reg uses 4 usec resolution */
-#define I40E_ITR_RX_DEF I40E_ITR_20K
-#define I40E_ITR_TX_DEF I40E_ITR_20K
+#define I40E_ITR_RX_DEF (ITR_REG_TO_USEC(I40E_ITR_20K) | \
+ I40E_ITR_DYNAMIC)
+#define I40E_ITR_TX_DEF (ITR_REG_TO_USEC(I40E_ITR_20K) | \
+ I40E_ITR_DYNAMIC)
#define I40E_ITR_DYNAMIC 0x8000 /* use top bit as a flag */
#define I40E_MIN_INT_RATE 250 /* ~= 1000000 / (I40E_MAX_ITR * 2) */
#define I40E_MAX_INT_RATE 500000 /* == 1000000 / (I40E_MIN_ITR * 2) */
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
index d8ca802a71a9..8f9830d7649a 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
@@ -38,8 +38,10 @@
#define I40E_ITR_8K 0x003E
#define I40E_ITR_4K 0x007A
#define I40E_MAX_INTRL 0x3B /* reg uses 4 usec resolution */
-#define I40E_ITR_RX_DEF I40E_ITR_20K
-#define I40E_ITR_TX_DEF I40E_ITR_20K
+#define I40E_ITR_RX_DEF (ITR_REG_TO_USEC(I40E_ITR_20K) | \
+ I40E_ITR_DYNAMIC)
+#define I40E_ITR_TX_DEF (ITR_REG_TO_USEC(I40E_ITR_20K) | \
+ I40E_ITR_DYNAMIC)
#define I40E_ITR_DYNAMIC 0x8000 /* use top bit as a flag */
#define I40E_MIN_INT_RATE 250 /* ~= 1000000 / (I40E_MAX_ITR * 2) */
#define I40E_MAX_INT_RATE 500000 /* == 1000000 / (I40E_MIN_ITR * 2) */
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index f62d9565c7b5..5bcbd46e2f6c 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1223,7 +1223,7 @@ static int i40evf_alloc_queues(struct i40evf_adapter *adapter)
tx_ring->netdev = adapter->netdev;
tx_ring->dev = &adapter->pdev->dev;
tx_ring->count = adapter->tx_desc_count;
- tx_ring->tx_itr_setting = (I40E_ITR_DYNAMIC | I40E_ITR_TX_DEF);
+ tx_ring->tx_itr_setting = I40E_ITR_TX_DEF;
if (adapter->flags & I40EVF_FLAG_WB_ON_ITR_CAPABLE)
tx_ring->flags |= I40E_TXR_FLAGS_WB_ON_ITR;
@@ -1232,7 +1232,7 @@ static int i40evf_alloc_queues(struct i40evf_adapter *adapter)
rx_ring->netdev = adapter->netdev;
rx_ring->dev = &adapter->pdev->dev;
rx_ring->count = adapter->rx_desc_count;
- rx_ring->rx_itr_setting = (I40E_ITR_DYNAMIC | I40E_ITR_RX_DEF);
+ rx_ring->rx_itr_setting = I40E_ITR_RX_DEF;
}
adapter->num_active_queues = num_active_queues;
--
2.14.2
* [net-next 02/14] i40e: use the safe hash table iterator when deleting mac filters
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Lihong Yang, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Lihong Yang <lihong.yang@intel.com>
This patch replaces hash_for_each function with hash_for_each_safe
when calling __i40e_del_filter. The hash_for_each_safe function is
the right one to use when iterating over a hash table to safely remove
a hash entry. Otherwise, incorrect values may be read from freed memory.
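The reason the safe iterator matters can be sketched with a plain singly linked list in userspace C. This is an illustration only: struct node, push, and delete_all are hypothetical and do not use the kernel hashtable API. The saved successor pointer plays the role of the extra hlist_node cursor that hash_for_each_safe takes.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
	int key;
	struct node *next;
};

/* Prepend a node; returns the new head. */
static struct node *push(struct node *head, int key)
{
	struct node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->key = key;
	n->next = head;
	return n;
}

/*
 * Free every node and return how many were freed. The successor is
 * saved before the current node is freed. An unsafe loop would read
 * cur->next after free(cur), which is a read from freed memory.
 */
static size_t delete_all(struct node **head)
{
	struct node *cur = *head;
	struct node *next;
	size_t freed = 0;

	while (cur) {
		next = cur->next;	/* save before freeing */
		free(cur);
		cur = next;
		freed++;
	}
	*head = NULL;
	return freed;
}
```

Deleting a three-element list frees three nodes and leaves an empty head; calling it again on the empty list is a no-op.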
Detected by CoverityScan, CID 1402048 Read from pointer after free
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 04568137e029..c062d74d21f3 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2883,6 +2883,7 @@ int i40e_ndo_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
struct i40e_mac_filter *f;
struct i40e_vf *vf;
int ret = 0;
+ struct hlist_node *h;
int bkt;
/* validate the request */
@@ -2921,7 +2922,7 @@ int i40e_ndo_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
/* Delete all the filters for this VSI - we're going to kill it
* anyway.
*/
- hash_for_each(vsi->mac_filter_hash, bkt, f, hlist)
+ hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist)
__i40e_del_filter(vsi, f);
spin_unlock_bh(&vsi->mac_filter_hash_lock);
--
2.14.2
* [net-next 01/14] i40e: fix flags declaration
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Jacob Keller <jacob.e.keller@intel.com>
Since we don't yet have more than 32 flags, we'll use a u32 for both the
hw_features and flag field. Should we gain more flags in the future, we
may need to convert to a u64 or separate flags out into two fields.
This was overlooked in the previous commit 2781de2134c4 ("i40e/i40evf:
organize and re-number feature flags"), where the feature flag was not
converted from u64 to u32.
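As a rough illustration of the 32-flag limit mentioned above, the sketch
below models a 32-bit flags word in userspace. DEMO_BIT(), demo_flag_fits()
and demo_set_flag() are hypothetical helpers invented for this example and
are not part of the driver; DEMO_BIT() merely stands in for the kernel's
BIT() macro.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() macro (assumption). */
#define DEMO_BIT(n) (UINT32_C(1) << (n))

/* True if flag number 'bit' can be stored in a 32-bit flags word. */
static bool demo_flag_fits(unsigned int bit)
{
	return bit < sizeof(uint32_t) * CHAR_BIT;
}

/* Set one flag; refuses bit positions a u32 cannot represent. */
static uint32_t demo_set_flag(uint32_t flags, unsigned int bit)
{
	assert(demo_flag_fits(bit));
	return flags | DEMO_BIT(bit);
}
```

Bit positions 0 through 31 fit; a flag numbered 32 or higher would need a
wider field or a second one, as the commit message notes.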
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 18c453a3e728..7baf6d8a84dd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -424,7 +424,7 @@ struct i40e_pf {
#define I40E_HW_PORT_ID_VALID BIT(17)
#define I40E_HW_RESTART_AUTONEG BIT(18)
- u64 flags;
+ u32 flags;
#define I40E_FLAG_RX_CSUM_ENABLED BIT(0)
#define I40E_FLAG_MSI_ENABLED BIT(1)
#define I40E_FLAG_MSIX_ENABLED BIT(2)
--
2.14.2
* [net-next 03/14] i40evf: fix mac filter removal timing issue
From: Jeff Kirsher @ 2017-10-09 22:38 UTC (permalink / raw)
To: davem; +Cc: Alan Brady, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>
From: Alan Brady <alan.brady@intel.com>
Because mac filters are added and deleted asynchronously, there is a bug
in which a filter that is removed and then quickly added again ends up
being erroneously deleted.
The sequence of events is:
- filter marked for removal
- same filter is re-added before the watchdog that cleans up filters runs
- we skip re-adding the filter because we have it already in the
list
- watchdog filter cleanup kicks off and filter is removed
So when we were re-adding the same filter, it didn't actually get added
because it already existed in the list, but was marked for removal and
had yet to actually be removed.
This patch fixes the issue by making sure that, when a filter being
added is found to already exist in our list, it is no longer marked to
be removed.
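The race and the fix can be modeled in a small userspace sketch. This is
not the driver code: struct mac_filter, add_filter(), del_filter() and
watchdog_cleanup() are simplified, hypothetical stand-ins, and locking
and the admin queue are left out entirely. Only the add/remove marking
logic is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical, simplified model of a filter list entry. */
struct mac_filter {
	unsigned char macaddr[MAC_LEN];
	bool add;	/* needs to be programmed */
	bool remove;	/* marked for deferred removal */
	struct mac_filter *next;
};

static struct mac_filter *find_filter(struct mac_filter *head,
				      const unsigned char *mac)
{
	assert(mac != NULL);
	for (; head; head = head->next)
		if (memcmp(head->macaddr, mac, MAC_LEN) == 0)
			return head;
	return NULL;
}

/* Add a filter; if it is already listed, cancel any pending removal. */
static struct mac_filter *add_filter(struct mac_filter **head,
				     const unsigned char *mac)
{
	struct mac_filter *f = find_filter(*head, mac);

	if (!f) {
		f = calloc(1, sizeof(*f));
		if (!f)
			return NULL;
		memcpy(f->macaddr, mac, MAC_LEN);
		f->add = true;
		f->next = *head;
		*head = f;
	} else {
		f->remove = false;	/* the step this patch adds */
	}
	return f;
}

/* Mark a filter for removal; the actual unlink happens later. */
static void del_filter(struct mac_filter *head, const unsigned char *mac)
{
	struct mac_filter *f = find_filter(head, mac);

	if (f)
		f->remove = true;
}

/* Deferred cleanup: free every filter still marked for removal. */
static int watchdog_cleanup(struct mac_filter **head)
{
	struct mac_filter **pp = head;
	int removed = 0;

	while (*pp) {
		struct mac_filter *f = *pp;

		if (f->remove) {
			*pp = f->next;
			free(f);
			removed++;
		} else {
			pp = &f->next;
		}
	}
	return removed;
}
```

Without the else branch in add_filter(), a delete followed by a quick
re-add would leave remove set, and the next cleanup pass would drop a
filter the caller still expects to be present.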
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 1d2fc898b664..f62d9565c7b5 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -880,6 +880,8 @@ i40evf_mac_filter *i40evf_add_filter(struct i40evf_adapter *adapter,
list_add_tail(&f->list, &adapter->mac_filter_list);
f->add = true;
adapter->aq_required |= I40EVF_FLAG_AQ_ADD_MAC_FILTER;
+ } else {
+ f->remove = false;
}
clear_bit(__I40EVF_IN_CRITICAL_TASK, &adapter->crit_section);
--
2.14.2