* Re: [PATCH v2 binutils] Add BPF support to binutils...
From: Aaron Conole @ 2017-04-28 17:09 UTC (permalink / raw)
To: David Miller; +Cc: ast, daniel, netdev, xdp-newbies
In-Reply-To: <20170428.120457.408834081300892396.davem@davemloft.net>
David Miller <davem@davemloft.net> writes:
> From: Aaron Conole <aconole@bytheb.org>
> Date: Fri, 28 Apr 2017 11:57:36 -0400
>
>> I'll get an arm board up and running to do some testing there. As a
>> teaser:
>
> Great.
>
> I started working on some more relocation stuff, so more of the
> generic gas tests pass.
>
> For example, stuff like this now works properly:
>
> [davem@dhcp-10-15-49-210 build-bpf]$ cat gas/y.s
> .data
> .globl foo
> foo: .xword bar
> [davem@dhcp-10-15-49-210 build-bpf]$ gas/as-new -o gas/y.o gas/y.s
> [davem@dhcp-10-15-49-210 build-bpf]$ binutils/objdump -r gas/y.o
>
> gas/y.o: file format elf64-bpfle
>
> RELOCATION RECORDS FOR [.data]:
> OFFSET TYPE VALUE
> 0000000000000000 R_BPF_DATA_64 bar
>
>
> [davem@dhcp-10-15-49-210 build-bpf]$
>
> It turned out that I needed to separate the R_BPF_* relocations into
> data vs. insn ones.
>
> Another idea I am thinking about pursuing is adding BPF simulator
> support under sim/ so that people can use gdb to step through BPF
> programs.
>
> I hope we can make it work in a way that we can even step through
> XDP programs and feed them simple test packets, stuff like that.
>
> Anyways, quick relative live patch against v2 from my tree for the
> reloc stuff:
>
> diff --git a/bfd/elf64-bpf.c b/bfd/elf64-bpf.c
> index 9944bb4..1be285d 100644
> --- a/bfd/elf64-bpf.c
> +++ b/bfd/elf64-bpf.c
> @@ -1,8 +1,89 @@
> #include "sysdep.h"
> #include "bfd.h"
> +#include "bfdlink.h"
> #include "libbfd.h"
> +#include "libiberty.h"
> #include "elf-bfd.h"
> +#include "elf/bpf.h"
> #include "opcode/bpf.h"
> +#include "objalloc.h"
> +#include "elf64-bpf.h"
I get a compile error here. I guess this file wasn't included.
^ permalink raw reply
* Re: llvm-objdump...
From: Alexei Starovoitov @ 2017-04-28 17:18 UTC (permalink / raw)
To: David Miller; +Cc: daniel, netdev
In-Reply-To: <20170428.123818.2002639536695380745.davem@davemloft.net>
On 4/28/17 9:38 AM, David Miller wrote:
> From: Alexei Starovoitov <ast@fb.com>
> Date: Fri, 28 Apr 2017 09:22:32 -0700
>
>> On 4/28/17 9:17 AM, David Miller wrote:
>>> Even if I give it -triple=bpfeb it emits immediates incorrectly.
>>>
>>> The bug is certainly in the insn field fetcher of the disassembler.
>>
>> got it. so the binary looks correct, but disasm output is wrong?
>
> F.e.
>
> 12: 07 70 00 00 00 00 00 06 r0 += 100663296
>
> Should be "r0 += 6"
Pushed the fix into the llvm trunk:
https://reviews.llvm.org/rL301653
if buildbots don't yell at me in the next few hours then it's
probably good.
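For anyone following along, the symptom above is what you get when a big-endian immediate is fetched as if it were little-endian. A minimal sketch, assuming the immediate occupies the last four bytes of the 8-byte insn shown in the dump (the helper names are made up for illustration and are not taken from the llvm fix):

```c
#include <stdint.h>

/* Assemble the 32-bit immediate assuming little-endian byte order. */
int32_t read_imm_le(const uint8_t *p)
{
	return (int32_t)((uint32_t)p[0] | (uint32_t)p[1] << 8 |
			 (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/* Assemble the 32-bit immediate assuming big-endian byte order. */
int32_t read_imm_be(const uint8_t *p)
{
	return (int32_t)((uint32_t)p[3] | (uint32_t)p[2] << 8 |
			 (uint32_t)p[1] << 16 | (uint32_t)p[0] << 24);
}

/* Hypothetical field fetcher: the immediate is bytes 4..7 of the insn. */
int32_t bpf_insn_imm(const uint8_t insn[8], int big_endian)
{
	const uint8_t *imm = insn + 4;

	return big_endian ? read_imm_be(imm) : read_imm_le(imm);
}
```

Feeding it the bytes 07 70 00 00 00 00 00 06 gives 6 with the big-endian fetch and 100663296 (0x06000000) with the little-endian one, which matches the bad output quoted above.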
^ permalink raw reply
* Re: [PATCH v2 binutils] Add BPF support to binutils...
From: David Miller @ 2017-04-28 17:33 UTC (permalink / raw)
To: aconole; +Cc: ast, daniel, netdev, xdp-newbies
In-Reply-To: <f7tfugssblu.fsf@redhat.com>
From: Aaron Conole <aconole@bytheb.org>
Date: Fri, 28 Apr 2017 13:09:17 -0400
> David Miller <davem@davemloft.net> writes:
>
>> diff --git a/bfd/elf64-bpf.c b/bfd/elf64-bpf.c
>> index 9944bb4..1be285d 100644
>> --- a/bfd/elf64-bpf.c
>> +++ b/bfd/elf64-bpf.c
>> @@ -1,8 +1,89 @@
>> #include "sysdep.h"
>> #include "bfd.h"
>> +#include "bfdlink.h"
>> #include "libbfd.h"
>> +#include "libiberty.h"
>> #include "elf-bfd.h"
>> +#include "elf/bpf.h"
>> #include "opcode/bpf.h"
>> +#include "objalloc.h"
>> +#include "elf64-bpf.h"
>
> I get a compile error here. I guess this file wasn't included.
Sorry about that, here is bfd/elf64-bpf.h:
/* BPF ELF specific backend routines.
Copyright (C) 2017 Free Software Foundation, Inc.
This file is part of BFD, the Binary File Descriptor library.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston,
MA 02110-1301, USA. */
extern reloc_howto_type *_bfd_bpf_elf_reloc_type_lookup
(bfd *, bfd_reloc_code_real_type);
extern reloc_howto_type *_bfd_bpf_elf_reloc_name_lookup
(bfd *, const char *);
^ permalink raw reply
* Re: [PATCH] net: hso: register netdev later to avoid a race condition
From: Andreas Kemnade @ 2017-04-28 17:36 UTC (permalink / raw)
To: Johan Hovold
Cc: davem-fT/PcQaiUtIeIZ0/mPfg9Q, joe-6d6DIl74uiNBDgjK7y7TUQ,
gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
peter-WaGBZJeGNqdsbIuE7sb01tBPR1lH4CV8,
hns-xXXSsgcRVICgSpxsJD1C4w, linux-usb-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20170427084401.GP2823@localhost>
On Thu, 27 Apr 2017 10:44:01 +0200
Johan Hovold <johan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Wed, Apr 26, 2017 at 07:26:40PM +0200, Andreas Kemnade wrote:
> > If the netdev is accessed before the urbs are initialized,
> > there will be NULL pointer dereferences. That is avoided by
> > registering it when it is fully initialized.
>
> > Reported-by: H. Nikolaus Schaller <hns-xXXSsgcRVICgSpxsJD1C4w@public.gmane.org>
> > Signed-off-by: Andreas Kemnade <andreas-cLv4Z9ELZ06ZuzBka8ofvg@public.gmane.org>
> > ---
> > drivers/net/usb/hso.c | 14 +++++++-------
> > 1 file changed, 7 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c
> > index 93411a3..00067a0 100644
> > --- a/drivers/net/usb/hso.c
> > +++ b/drivers/net/usb/hso.c
> > @@ -2534,13 +2534,6 @@ static struct hso_device *hso_create_net_device(struct usb_interface *interface,
> > SET_NETDEV_DEV(net, &interface->dev);
> > SET_NETDEV_DEVTYPE(net, &hso_type);
> >
> > - /* registering our net device */
> > - result = register_netdev(net);
> > - if (result) {
> > - dev_err(&interface->dev, "Failed to register device\n");
> > - goto exit;
> > - }
> > -
> > /* start allocating */
> > for (i = 0; i < MUX_BULK_RX_BUF_COUNT; i++) {
> > hso_net->mux_bulk_rx_urb_pool[i] = usb_alloc_urb(0, GFP_KERNEL);
> > @@ -2560,6 +2553,13 @@ static struct hso_device *hso_create_net_device(struct usb_interface *interface,
> >
> > add_net_device(hso_dev);
> >
> > + /* registering our net device */
> > + result = register_netdev(net);
> > + if (result) {
> > + dev_err(&interface->dev, "Failed to register device\n");
> > + goto exit;
>
> This all looks good, but you should consider cleaning up the error
> handling of this function as a follow-up as we should not be
> deregistering netdevs that have never been registered (e.g. if a
> required endpoint is missing or if registration fails for some reason).
>
> But just to be clear, this problem existed also before this change.
>
Just to check whether I am understanding this correctly: in your opinion
this patch is good for now, and later, once it is applied, there should
be an additional error handling cleanup patch?
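To make sure we mean the same thing by "register last" plus a cleaned-up error path, here is a minimal userspace sketch. Everything in it (fake_dev, fake_register() and friends) is hypothetical and only models the ordering; it is not the hso code:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device with two buffers and a registration state. */
struct fake_dev {
	void *rx_buf;
	void *tx_buf;
	bool registered;
};

/* Hypothetical registration hook; 'fail' lets a caller force an error. */
int fake_register(struct fake_dev *d, bool fail)
{
	if (fail)
		return -1;
	d->registered = true;
	return 0;
}

void fake_unregister(struct fake_dev *d)
{
	d->registered = false;
}

int fake_dev_setup(struct fake_dev *d, bool fail_register)
{
	d->rx_buf = NULL;
	d->tx_buf = NULL;
	d->registered = false;

	d->rx_buf = malloc(64);
	if (!d->rx_buf)
		goto err;
	d->tx_buf = malloc(64);
	if (!d->tx_buf)
		goto err_free_rx;

	/* Publish the device only once everything it needs exists. */
	if (fake_register(d, fail_register))
		goto err_free_tx;
	return 0;

err_free_tx:
	free(d->tx_buf);
	d->tx_buf = NULL;
err_free_rx:
	free(d->rx_buf);
	d->rx_buf = NULL;
err:
	/* Nothing was registered on any of these paths, so nothing is unregistered. */
	return -1;
}

void fake_dev_teardown(struct fake_dev *d)
{
	/* Only undo the registration if it actually happened. */
	if (d->registered)
		fake_unregister(d);
	free(d->tx_buf);
	free(d->rx_buf);
	d->tx_buf = NULL;
	d->rx_buf = NULL;
}
```

The point of the unwinding labels is that a failure before or during registration never reaches an unregister call.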
Regards,
Andreas
^ permalink raw reply
* Re: [PATCH net-next v2] net: bridge: Fix improper taking over HW learned FDB
From: Ido Schimmel @ 2017-04-28 17:36 UTC (permalink / raw)
To: Arkadi Sharshevsky
Cc: Ido Schimmel, Nikolay Aleksandrov, netdev, bridge, davem
In-Reply-To: <1493397098-38756-1-git-send-email-arkadis@mellanox.com>
On Fri, Apr 28, 2017 at 07:31:38PM +0300, Arkadi Sharshevsky wrote:
> Commit 7e26bf45e4cb ("net: bridge: allow SW learn to take over HW fdb
> entries") added the ability to "take over an entry which was previously
> learned via HW when it shows up from a SW port".
>
> However, if an entry was learned via HW and then a control packet
> (e.g., ARP request) was trapped to the CPU, the bridge driver will
> update the entry and remove the externally learned flag, although the
> entry is still present in HW. Instead, only clear the externally learned
> flag in case of roaming.
>
> Fixes: 7e26bf45e4cb ("net: bridge: allow SW learn to take over HW fdb entries")
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> Signed-off-by: Arkadi Sharashevsky <arkadis@mellanox.com>
> Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> ---
> v1->v2
> - net-next rebase.
> ---
> net/bridge/br_fdb.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index de7988b..5905eb7 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -589,16 +589,16 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
> if (unlikely(source != fdb->dst)) {
> fdb->dst = source;
> fdb_modified = true;
> + /* Take over HW learned entry */
> + if (unlikely(fdb->added_by_external_learn)) {
> + fdb->added_by_external_learn = 0;
> + fdb_modified = true;
This line is redundant. 'fdb_modified' is already set.
> + }
> }
> if (now != fdb->updated)
> fdb->updated = now;
> if (unlikely(added_by_user))
> fdb->added_by_user = 1;
> - /* Take over HW learned entry */
> - if (unlikely(fdb->added_by_external_learn)) {
> - fdb->added_by_external_learn = 0;
> - fdb_modified = true;
> - }
> if (unlikely(fdb_modified))
> fdb_notify(br, fdb, RTM_NEWNEIGH);
> }
> --
> 2.4.11
>
^ permalink raw reply
* Re: [patch net-next 00/10] net: sched: introduce multichain support for filters
From: Cong Wang @ 2017-04-28 17:40 UTC (permalink / raw)
To: Jiri Pirko
Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
Alexander Duyck, mlxsw, Simon Horman
In-Reply-To: <20170428065349.GB1886@nanopsycho.orion>
On Thu, Apr 27, 2017 at 11:53 PM, Jiri Pirko <jiri@resnulli.us> wrote:
> Thu, Apr 27, 2017 at 07:46:03PM CEST, xiyou.wangcong@gmail.com wrote:
>>On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>> Simple example:
>>> $ tc qdisc add dev eth0 ingress
>>> $ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
>>> $ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
>>> $ tc filter show dev eth0 root
>>
>>Interesting.
>>
>>I don't look into the code yet. If I understand the concepts correctly,
>>so with your patchset we can mark either filter with a chain No. to
>>choose which chain it belongs to _logically_ even though
>>_physically_ it is still in the old-fashion chain (prio, proto)?
>
> You have to see the code :)
I don't understand why I have to: these are high-level concepts and
should be in your cover letter (aka design doc). A lot of information
about the ordering is missing here.
The terms you use are confusing too. Without your patchset we already
have chains: struct tcf_proto is a chain, each kind of filter defines
its own way to store its filters in this chain (tp->root), and of
course tp itself is chained in a singly-linked list, which already
gives multiple chains.
^ permalink raw reply
* Re: [PATCH net] liquidio: silence a locking static checker warning
From: Felix Manlunas @ 2017-04-28 17:42 UTC (permalink / raw)
To: Dan Carpenter
Cc: Derek Chickles, Satanand Burla, Raghu Vatsavayi, netdev,
kernel-janitors
In-Reply-To: <20170428125715.h6d5ttnfq7rdnpni@mwanda>
From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Fri, 28 Apr 2017 15:57:15 +0300
> Presumably we never hit this return, but static checkers complain that
> we need to unlock so we may as well fix that.
>
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
>
> diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c b/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
> index 201b9875f9bb..5cca73b8880b 100644
> --- a/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
> +++ b/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
> @@ -313,6 +313,7 @@ int octeon_mbox_process_message(struct octeon_mbox *mbox)
> return 0;
> }
>
> + spin_unlock_irqrestore(&mbox->lock, flags);
> WARN_ON(1);
>
> return 0;
Thanks.
Acked-by: Felix Manlunas <felix.manlunas@cavium.com>
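As an aside, another way to keep static checkers quiet about this class of warning is to funnel every path through a single unlock site. A minimal userspace sketch with a made-up counting lock (fake_lock and process_message() are hypothetical and are not the liquidio code):

```c
/* Hypothetical lock that only counts how often it is held. */
struct fake_lock {
	int depth;
};

void fake_lock_acquire(struct fake_lock *l)
{
	l->depth++;
}

void fake_lock_release(struct fake_lock *l)
{
	l->depth--;
}

/* Every case, including the "cannot happen" one, leaves through one unlock. */
int process_message(struct fake_lock *l, int state)
{
	int ret;

	fake_lock_acquire(l);
	switch (state) {
	case 0:
		ret = 1;	/* message handled */
		break;
	case 1:
		ret = 0;	/* nothing to do */
		break;
	default:
		ret = 0;	/* presumably never reached */
		break;
	}
	fake_lock_release(l);
	return ret;
}
```

With this shape the lock depth is back to zero after every call, whichever branch was taken.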
^ permalink raw reply
* Re: [patch net-next 02/10] net: sched: introduce tcf block infractructure
From: Cong Wang @ 2017-04-28 17:48 UTC (permalink / raw)
To: Jiri Pirko
Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
Alexander Duyck, mlxsw, Simon Horman
In-Reply-To: <1493291540-2119-3-git-send-email-jiri@resnulli.us>
On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> From: Jiri Pirko <jiri@mellanox.com>
>
> Currently, the filter chains are direcly put into the private structures
> of qdiscs. In order to be able to have multiple chains per qdisc and to
> allow filter chains sharing among qdiscs, there is a need for common
> object that would hold the chains. This introduces such object and calls
> it "tcf_block".
>
What is filter chain sharing here? It sounds like a new feature you are
trying to hide. How could chains possibly be shared among qdiscs when they
are still stored in a per-qdisc pointer?
Look at tc actions: they can be shared because they are physically stored
in a per-netns hashtable, so filters can simply refer to them by index.
^ permalink raw reply
* Re: [PATCH v2 07/21] crypto: shash, caam: Make use of the new sg_map helper function
From: Herbert Xu @ 2017-04-28 17:51 UTC (permalink / raw)
To: Logan Gunthorpe
Cc: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA,
target-devel-u79uwXL29TY76Z2rM5mHXA, Christoph Hellwig,
devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b, James E.J. Bottomley,
linux-scsi-u79uwXL29TY76Z2rM5mHXA,
linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, Sumit Semwal,
open-iscsi-/JYPxA39Uh5TLH3MbocFFw,
linux-media-u79uwXL29TY76Z2rM5mHXA,
intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
sparmaintainer-GLv8BlqOqDDQT0dZR+AlfA,
linux-raid-u79uwXL29TY76Z2rM5mHXA,
megaraidlinux.pdl-dY08KVG/lbpWk0Htik3J/w, Jens Axboe,
Martin K. Petersen, netdev-u79uwXL29TY76Z2rM5mHXA, Matthew Wilcox,
linux-mmc-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
linux-crypto-u79uwXL29TY76Z2rM5mHXA, Greg Kroah-Hartman,
David S. Miller
In-Reply-To: <5a08708b-c3b8-41fe-96de-607a109eacbd-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
On Fri, Apr 28, 2017 at 10:53:45AM -0600, Logan Gunthorpe wrote:
>
>
> On 28/04/17 12:30 AM, Herbert Xu wrote:
> > You are right. Indeed the existing code looks buggy as they
> > don't take sg->offset into account when doing the kmap. Could
> > you send me some patches that fix these problems first so that
> > they can be easily backported?
>
> Ok, I think the only buggy one in crypto is hifn_795x. Shash and caam
> both do have the sg->offset accounted for. I'll send a patch for the
> buggy one shortly.
I think they're all buggy when sg->offset is greater than PAGE_SIZE.
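To spell out the case I mean: an offset at or beyond PAGE_SIZE has to be split into a page index and an in-page offset before mapping anything. A minimal sketch, assuming a 4096-byte page and made-up names (page_pos, normalize_offset()); it is not the code from any of the drivers discussed:

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/* Which page the data starts in, and where inside that page. */
struct page_pos {
	size_t page_index;
	size_t offset;
};

/* Split a raw offset that may be >= the page size into (page, in-page offset). */
struct page_pos normalize_offset(size_t sg_offset)
{
	struct page_pos pos;

	pos.page_index = sg_offset / SKETCH_PAGE_SIZE;
	pos.offset = sg_offset % SKETCH_PAGE_SIZE;
	return pos;
}
```

Code that maps only the first page and then adds the raw offset ends up past the mapping as soon as page_index is non-zero.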
Thanks,
--
Email: Herbert Xu <herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q@public.gmane.org>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* pull-request: mac80211-next 2017-04-28
From: Johannes Berg @ 2017-04-28 17:56 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-wireless
Hi Dave,
Since I had many API changes pending I decided to go ahead and
take the opportunity to get them before the merge window, which
is a natural tree synchronization point :)
I meant to send this earlier today but forgot due to some debug
(unrelated to this code, in our internal tree), sorry about that.
Please pull and let me know if there's any problem.
Thanks,
johannes
The following changes since commit 5e1fc7c5ba00599ccd7096eef3e9fd3362c1230f:
drivers: net: xgene-v2: Fix error return code in xge_mdio_config() (2017-04-25 13:48:06 -0400)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next.git tags/mac80211-next-for-davem-2017-04-28
for you to fetch changes up to b34939b9836950d261610132853311054b507247:
cfg80211: add request id to cfg80211_sched_scan_*() api (2017-04-28 14:51:43 +0200)
----------------------------------------------------------------
Another set of patches for -next:
* API support for concurrent scheduled scan requests
* API changes for roaming reporting
* BSS max idle support in mac80211
* API changes for TX status reporting in mac80211
* API changes for RX rate reporting in mac80211
* rewrite monitor logic to prepare for BPF filters
* bugfix for rare devices without 2.4 GHz support
* a bugfix for recent DFS changes
* some further cleanups
The API changes are actually at a nice time, since it's
typically quiet just before the merge window, and trees
can be synchronized easily during it.
----------------------------------------------------------------
Arend Van Spriel (4):
nl80211: allow multiple active scheduled scan requests
nl80211: add support for BSSIDs in scheduled scan matchsets
cfg80211: add request id parameter to .sched_scan_stop() signature
cfg80211: add request id to cfg80211_sched_scan_*() api
Avraham Stern (2):
cfg80211: unify cfg80211_roamed() and cfg80211_roamed_bss()
mac80211: Add support for BSS max idle period element
Emmanuel Grumbach (1):
mac80211: don't parse encrypted management frames in ieee80211_frame_acked
Felix Fietkau (3):
mac80211: make rate control tx status API more extensible
mac80211: move ieee80211_tx_status_noskb below ieee80211_tx_status
mac80211: add ieee80211_tx_status_ext
Johannes Berg (8):
mac80211: rewrite monitor mode delivery logic
cfg80211: simplify netlink socket owner interface deletion
ieee80211: fix kernel-doc parsing errors
mac80211: disentangle iflist_mtx and chanctx_mtx
mac80211: clean up rate encoding bits in RX status
mac80211: separate encoding/bandwidth from flags
mac80211: rename ieee80211_rx_status::vht_nss to just nss
mac80211: use bitfield macros for encoded rate
Luca Coelho (3):
ieee80211: add SUITE_B AKM selectors
ieee80211: add FT-802.1X AKM suite selector
mac80211: make multicast variable a bool in ieee80211_accept_frame()
Mohammed Shafi Shajakhan (1):
mac80211: Fix possible sband related NULL pointer de-reference
Vasanthakumar Thiagarajan (1):
cfg80211: Fix dfs state propagation for non-DFS center channel
drivers/net/wireless/ath/ath10k/htt_rx.c | 48 ++---
drivers/net/wireless/ath/ath5k/base.c | 6 +-
drivers/net/wireless/ath/ath6kl/cfg80211.c | 16 +-
drivers/net/wireless/ath/ath6kl/wmi.c | 2 +-
drivers/net/wireless/ath/ath9k/ar9003_mac.c | 7 +-
drivers/net/wireless/ath/ath9k/common.c | 11 +-
drivers/net/wireless/ath/ath9k/debug_sta.c | 6 +-
drivers/net/wireless/ath/ath9k/htc_drv_txrx.c | 7 +-
drivers/net/wireless/ath/ath9k/mac.c | 15 +-
drivers/net/wireless/ath/ath9k/mac.h | 4 +-
drivers/net/wireless/ath/ath9k/recv.c | 8 +-
drivers/net/wireless/ath/carl9170/rx.c | 8 +-
drivers/net/wireless/ath/wcn36xx/txrx.c | 2 +-
drivers/net/wireless/broadcom/b43/xmit.c | 2 +-
.../broadcom/brcm80211/brcmfmac/cfg80211.c | 25 ++-
.../wireless/broadcom/brcm80211/brcmsmac/main.c | 10 +-
drivers/net/wireless/intel/iwlegacy/3945.c | 2 +-
drivers/net/wireless/intel/iwlegacy/4965-mac.c | 8 +-
drivers/net/wireless/intel/iwlwifi/dvm/rx.c | 10 +-
drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c | 2 +-
drivers/net/wireless/intel/iwlwifi/mvm/rx.c | 26 +--
drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c | 26 +--
drivers/net/wireless/intersil/p54/txrx.c | 2 +-
drivers/net/wireless/mac80211_hwsim.c | 10 +-
drivers/net/wireless/marvell/mwifiex/cfg80211.c | 10 +-
drivers/net/wireless/marvell/mwifiex/main.c | 2 +-
drivers/net/wireless/marvell/mwifiex/sta_cmdresp.c | 2 +-
drivers/net/wireless/marvell/mwifiex/sta_event.c | 2 +-
drivers/net/wireless/marvell/mwifiex/sta_ioctl.c | 2 +-
drivers/net/wireless/marvell/mwl8k.c | 16 +-
drivers/net/wireless/mediatek/mt7601u/mac.c | 12 +-
drivers/net/wireless/ralink/rt2x00/rt2800lib.c | 4 +-
drivers/net/wireless/ralink/rt2x00/rt2x00dev.c | 5 +-
drivers/net/wireless/ralink/rt2x00/rt2x00queue.h | 3 +
drivers/net/wireless/realtek/rtl818x/rtl8180/dev.c | 2 +-
drivers/net/wireless/realtek/rtl818x/rtl8187/dev.c | 2 +-
.../net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 10 +-
.../net/wireless/realtek/rtlwifi/rtl8188ee/trx.c | 4 +-
.../net/wireless/realtek/rtlwifi/rtl8192ce/trx.c | 4 +-
.../net/wireless/realtek/rtlwifi/rtl8192cu/trx.c | 8 +-
.../net/wireless/realtek/rtlwifi/rtl8192de/trx.c | 4 +-
.../net/wireless/realtek/rtlwifi/rtl8192ee/trx.c | 4 +-
.../net/wireless/realtek/rtlwifi/rtl8192se/trx.c | 4 +-
.../net/wireless/realtek/rtlwifi/rtl8723ae/trx.c | 4 +-
.../net/wireless/realtek/rtlwifi/rtl8723be/trx.c | 4 +-
.../net/wireless/realtek/rtlwifi/rtl8821ae/trx.c | 12 +-
drivers/net/wireless/rndis_wlan.c | 19 +-
drivers/net/wireless/st/cw1200/txrx.c | 2 +-
drivers/net/wireless/ti/wl1251/rx.c | 2 +-
drivers/net/wireless/ti/wlcore/main.c | 2 +-
drivers/net/wireless/ti/wlcore/rx.c | 2 +-
drivers/staging/wlan-ng/cfg80211.c | 7 +-
include/linux/ieee80211.h | 67 ++++--
include/net/cfg80211.h | 105 +++++----
include/net/mac80211.h | 172 +++++++++------
include/uapi/linux/nl80211.h | 16 +-
net/mac80211/cfg.c | 39 ++--
net/mac80211/ibss.c | 10 +-
net/mac80211/ieee80211_i.h | 41 ++--
net/mac80211/main.c | 2 +
net/mac80211/mesh.c | 29 ++-
net/mac80211/mesh_plink.c | 37 +++-
net/mac80211/mlme.c | 28 ++-
net/mac80211/pm.c | 2 +-
net/mac80211/rate.c | 26 ++-
net/mac80211/rate.h | 44 +---
net/mac80211/rc80211_minstrel.c | 6 +-
net/mac80211/rc80211_minstrel_ht.c | 10 +-
net/mac80211/rx.c | 234 ++++++++++++---------
net/mac80211/scan.c | 12 +-
net/mac80211/sta_info.c | 39 ++--
net/mac80211/sta_info.h | 83 +++++---
net/mac80211/status.c | 168 ++++++++-------
net/mac80211/tdls.c | 29 ++-
net/mac80211/tx.c | 5 +-
net/mac80211/util.c | 79 ++++---
net/wireless/core.c | 57 ++---
net/wireless/core.h | 35 ++-
net/wireless/nl80211.c | 143 ++++++++-----
net/wireless/nl80211.h | 5 +-
net/wireless/rdev-ops.h | 8 +-
net/wireless/reg.c | 3 -
net/wireless/scan.c | 160 ++++++++++----
net/wireless/sme.c | 90 ++++----
net/wireless/trace.h | 54 +++--
net/wireless/util.c | 4 +-
86 files changed, 1323 insertions(+), 936 deletions(-)
^ permalink raw reply
* [PATCH net-next] rtnetlink: Remove NETDEV_CHANGEINFODATA
From: David Ahern @ 2017-04-28 18:06 UTC (permalink / raw)
To: netdev; +Cc: jiri, David Ahern
NETDEV_CHANGEINFODATA was added by d4261e5650004 ("bonding: create
netlink event when bonding option is changed"). RTM_NEWLINK
messages are already created on changelink events, so this event
is just a duplicate. Remove it.
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
drivers/net/bonding/bond_options.c | 2 --
include/linux/netdevice.h | 11 +++++------
net/core/rtnetlink.c | 1 -
3 files changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 1bcbb8913e17..533518a64496 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -673,8 +673,6 @@ int __bond_opt_set(struct bonding *bond,
out:
if (ret)
bond_opt_error_interpret(bond, opt, ret, val);
- else if (bond->dev->reg_state == NETREG_REGISTERED)
- call_netdevice_notifiers(NETDEV_CHANGEINFODATA, bond->dev);
return ret;
}
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cc07c3be2705..c49a7a901710 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2279,12 +2279,11 @@ struct netdev_lag_lower_state_info {
#define NETDEV_CHANGEUPPER 0x0015
#define NETDEV_RESEND_IGMP 0x0016
#define NETDEV_PRECHANGEMTU 0x0017 /* notify before mtu change happened */
-#define NETDEV_CHANGEINFODATA 0x0018
-#define NETDEV_BONDING_INFO 0x0019
-#define NETDEV_PRECHANGEUPPER 0x001A
-#define NETDEV_CHANGELOWERSTATE 0x001B
-#define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001C
-#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001E
+#define NETDEV_BONDING_INFO 0x0018
+#define NETDEV_PRECHANGEUPPER 0x0019
+#define NETDEV_CHANGELOWERSTATE 0x001A
+#define NETDEV_UDP_TUNNEL_PUSH_INFO 0x001B
+#define NETDEV_CHANGE_TX_QUEUE_LEN 0x001C
int register_netdevice_notifier(struct notifier_block *nb);
int unregister_netdevice_notifier(struct notifier_block *nb);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 58419da7961b..1072b88e5845 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -4127,7 +4127,6 @@ static int rtnetlink_event(struct notifier_block *this, unsigned long event, voi
case NETDEV_CHANGEUPPER:
case NETDEV_RESEND_IGMP:
case NETDEV_PRECHANGEMTU:
- case NETDEV_CHANGEINFODATA:
case NETDEV_PRECHANGEUPPER:
case NETDEV_CHANGELOWERSTATE:
case NETDEV_UDP_TUNNEL_PUSH_INFO:
--
2.1.4
^ permalink raw reply related
* Re: prog ID and next steps. Was: [RFC net-next 0/2] Introduce bpf_prog ID and iteration
From: Daniel Borkmann @ 2017-04-28 18:24 UTC (permalink / raw)
To: Hannes Frederic Sowa, Alexei Starovoitov, Martin KaFai Lau,
netdev
Cc: kernel-team, David S. Miller, Jesper Dangaard Brouer,
John Fastabend, Thomas Graf
In-Reply-To: <44cdb2d2-9f5c-5d28-2966-3e43e6d2a2ef@stressinduktion.org>
On 04/28/2017 01:50 PM, Hannes Frederic Sowa wrote:
> On 28.04.2017 03:11, Alexei Starovoitov wrote:
[...]
>> i disagree re: kallsyms. The goal of prog_tag is to let program writers
>> understand which program is running in a stable way.
>
> But exactly it doesn't let program writers do that, it just confuses them:
>
> ---
>
> jit on:
>
> perf record -e bpf_redirect -agR
>
> The unwinder walks the stack, extracts address of upper function and
> sends it to user space (perf) or handles it inside the kernel/kallsyms
> (ftrace).
>
> User takes tag of bpf program and wants to inspect related maps to the
> program. Unfortunately the tag is not unique and thus we need to expand
> the tag back to all possible programs with the same tag and expand that
> to the union of all possible maps that those programs reference again.
>
> That is what we present to the application developer. I would seriously
> be very confused.
>
> If application developer doesn't trust perf and uses instruction pointer
> value from the stack directly he can't find out which program there is,
> because fdinfo e.g. doesn't show the actual address of where the program
> is allocated. I would use /dev/kmem now.
I don't think it would be reasonable to let fdinfo unconditionally
dump the address of the program including unprivileged progs. We
probably could add a run-time check into bpf_prog_show_fdinfo() and
show it dynamically when user has cap_sys_admin.
> ---
>
> jit off:
>
> perf probe -a '__bpf_prog_run ctx insn'
> perf probe -a 'bpf_redirect flags ifindex'
> perf record -e bpf_redirect -agR
>
> Situation doesn't change. We do get the insn pointer thus have a unique
> id for the program. That's it, no further introspection. I can read
> /dev/kmem now.
>
> ---
>
> Personally I wouldn't rely on such infrastructure.
>
> My proposal would be to maybe hash a map id into the program, so instead
> of replacing the user space file descriptor with zero, take a map id
> (like discussed below) or an inode number of the map into the register
> and hash with that, so that those program have unique identifiers.
I don't think that proposal would work. Placing dev + inode number
(the inode alone wouldn't be sufficient either; the map would also have
to be pinned, as an anonymous inode from an fd wouldn't work) or a map id
into the insn won't all of a sudden give you a unique prog id, since maps
can be shared among multiple progs, and the same prog can also be attached
to, say, multiple attachment points.
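To illustrate the non-uniqueness being discussed: when the map reference is zeroed before hashing, two progs that differ only in which map they use end up with the same tag. A toy sketch follows; it uses FNV-1a purely for illustration, and toy_insn and toy_tag() are made up and are not the kernel's tag computation:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy instruction: opcode, immediate, and whether the imm is a map reference. */
struct toy_insn {
	uint8_t opcode;
	int32_t imm;
	bool is_map_ref;
};

/* One FNV-1a step over a single byte. */
static uint64_t fnv1a_byte(uint64_t h, uint8_t b)
{
	return (h ^ b) * 0x100000001b3ULL;
}

/* Toy tag: hash all insns, but with map references replaced by zero. */
uint64_t toy_tag(const struct toy_insn *insns, size_t n)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* FNV-1a 64-bit offset basis */

	for (size_t i = 0; i < n; i++) {
		uint32_t imm = insns[i].is_map_ref ? 0 : (uint32_t)insns[i].imm;

		h = fnv1a_byte(h, insns[i].opcode);
		for (int shift = 0; shift < 32; shift += 8)
			h = fnv1a_byte(h, (uint8_t)(imm >> shift));
	}
	return h;
}
```

Two toy progs that load different maps but are otherwise identical hash to the same value, while a change to an ordinary immediate changes the tag.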
> Otherwise construct kallsym entries with prog id instead of tag.
>
> I think that the hash should try to reassemble some kind of identity
> function and mapping two programs to the same tag, that do something
> completely differently is not good (based on we don't include the map).
>
> Also I do think in future the difference between non-jit and jit
> operation in regards to tracing should also be lifted. We could add a
> manual tracing point into the interpreter for reporting the same event
> as if the program was jitted.
>
> Debugging should not be that different based on the sysctl flags.
With regards to tracing it's quite useful to see whether a program was
JITed or not JITed (aka __bpf_prog_run()), so I don't think it makes
sense to e.g. have everything named __bpf_prog_run(), at least the other
way around wouldn't work for interpreter as far as I see.
But let's assume JIT is off for a moment, and you only see __bpf_prog_run().
Then, in the stack trace you'll also see related functions that call this
in the first place, for example, mlx4_en_poll_rx_cq() / mlx4_en_process_rx_cq()
in case of XDP, meaning, you get the call path context as well, for which
you later on (with the proposed infrastructure for getting fds from
attachment points + dumping them) can return the attached prog fd and
with that also dump the code or map data.
^ permalink raw reply
* Re: [PATCH net-next 1/4] ixgbe: sparc: rename the ARCH_WANT_RELAX_ORDER to IXGBE_ALLOW_RELAXED_ORDER
From: Casey Leedom @ 2017-04-28 18:42 UTC (permalink / raw)
To: Lucas Stach, Bjorn Helgaas
Cc: Alexander Duyck, Ding Tianhong, Mark Rutland, Amir Ancel,
Gabriele Paoloni, linux-pci@vger.kernel.org, Catalin Marinas,
Will Deacon, LinuxArm, David Laight, jeffrey.t.kirsher@intel.com,
netdev@vger.kernel.org, Robin Murphy, davem@davemloft.net,
linux-arm-kernel@lists.infradead.org
In-Reply-To: <1493369471.13947.11.camel@pengutronix.de>
| From: Lucas Stach <l.stach@pengutronix.de>
| Sent: Friday, April 28, 2017 1:51 AM
|
| Am Donnerstag, den 27.04.2017, 12:19 -0500 schrieb Bjorn Helgaas:
| >
| >
| > I thought Relaxed Ordering was an optimization. Are there cases where
| > it is actually required for correct behavior?
|
| Yes, at least the Tegra 2 TRM claims that RO needs to be enabled on the
| device side for correct operation with the following language:
|
| "Tegra 2 requires relaxed ordering for responses to downstream requests
| (responses can pass writes). It is possible in some circumstances for PCIe
| transfers from an external bus masters (i.e. upstream transfers) to become
| blocked by a downstream read or non-posted write. The responses to these
| downstream requests are blocked by upstream posted writes only when PCIe
| strict ordering is imposed. It is therefore necessary to never impose strict
| ordering that would block a response to a downstream NPW/read request and
| always set the relaxed ordering bit to 1. Only devices that are capable of
| relaxed ordering may be used with Tegra 2 devices."
(woof) Reading through the above paragraph is difficult because the author
seems to shift language and terminology mid-sentence and isn't following
standard PCI terminology conventions. The Root Complex is "Upstream", a
non-Root Complex Node in the PCIe Fabric is "Downstream", Requests that a
Downstream Device (End Point) sends to the Root Complex are called "Upstream
Requests", and Responses that the Root Complex sends to a Device are called
"Downstream Responses" (or, even more pedantically, "Responses sent
Downstream for an earlier Upstream Request").
Because the Root Complex is Upstream but sends its Requests Downstream,
while Downstream Devices send their Requests Upstream, it's very important
that we use exceedingly precise language.
So, it ~sounds like~ the nVidia Tegra 2 document is talking about the need
for Downstream Devices to echo the Relaxed Ordering Attribute in their
Responses directed Upstream to Requests sent Downstream from the Root
Complex. Moreover, there's code in drivers/pci/host/pci-tegra.c:
tegra_pcie_relax_enable() which appears to set the PCIe Capability Device
Control[Enable Relaxed Ordering] bit on all PCIe Fabric Nodes.
If my reading of the intent of the nVidia document is correct -- and
that's a Big If because of the extremely imprecise language used -- that
means that tegra_pcie_relax_enable() is completely bogus. The PCIe 3.0
Specification states that Responses MUST reflect the Relaxed Ordering and No
Snoop Attributes of the Requests for which they are responding. Section
2.2.9 of PCI Express(r) Base Specification Revision 3.0 November 10, 2010:
"Completion headers must supply the same values for the Attribute as were
supplied in the header of the corresponding Request, except as explicitly
allowed when IDO is used."
And, specifically, the PCIe Capability Device Control[Enable Relaxed
Ordering] bit _only_ affects the ability of that Device to originate
Transaction Layer Packet Requests with the Relaxed Ordering Attribute set.
Thus, tegra_pcie_relax_enable() setting those bits on all the Downstream
Devices (and intervening Bridges) does not _cause_ those Devices to generate
Requests with Relaxed Ordering set. And, if the Devices are PCIe 3.0
compliant, it also doesn't affect the Responses that they send back Upstream
to the Root Complex.
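To make the distinction concrete, here is a minimal user-space sketch (not
taken from pci-tegra.c). The PCI_EXP_DEVCTL_RELAX_EN value matches the Linux
pci_regs.h definition; the TLP attribute model and the function names are
assumptions made up for illustration, and the IDO exception is ignored:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as in Linux include/uapi/linux/pci_regs.h. */
#define PCI_EXP_DEVCTL_RELAX_EN 0x0010  /* Device Control: Enable Relaxed Ordering */

/* Hypothetical one-bit model of the TLP attribute field. */
#define DEMO_TLP_ATTR_RO 0x1

/* The enable bit only permits the Device to set RO on Requests it originates. */
static bool demo_may_originate_ro(uint16_t devctl)
{
	return (devctl & PCI_EXP_DEVCTL_RELAX_EN) != 0;
}

/* A Request carries RO only if the Device wants it AND is permitted to. */
static uint8_t demo_request_attr(uint16_t devctl, bool wants_ro)
{
	return (wants_ro && demo_may_originate_ro(devctl)) ? DEMO_TLP_ATTR_RO : 0;
}

/* A Completion copies the attributes of the Request; devctl is not consulted. */
static uint8_t demo_completion_attr(uint8_t request_attr, uint16_t devctl)
{
	(void)devctl;
	return request_attr;
}
```

In this model, flipping the enable bit changes what demo_request_attr() may
return but never changes the result of demo_completion_attr().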
I apologize for the incredibly detailed nature of these responses, but
it's very easy for people new to PCIe to get these things wrong and/or
misinterpret the PCIe Specifications.
Casey
^ permalink raw reply
* Re: pull-request: mac80211-next 2017-04-28
From: David Miller @ 2017-04-28 18:47 UTC (permalink / raw)
To: johannes; +Cc: netdev, linux-wireless
In-Reply-To: <20170428175658.1961-1-johannes@sipsolutions.net>
From: Johannes Berg <johannes@sipsolutions.net>
Date: Fri, 28 Apr 2017 19:56:57 +0200
> Since I had many API changes pending I decided to go ahead and
> take the opportunity to get them before the merge window, which
> is a natural tree synchronization point :)
Awesome :)
> I meant to send this earlier today but forgot due to some debug
> (unrelated to this code, in our internal tree), sorry about that.
>
> Please pull and let me know if there's any problem.
Pulled, thanks Johannes.
^ permalink raw reply
* Re: prog ID and next steps. Was: [RFC net-next 0/2] Introduce bpf_prog ID and iteration
From: Hannes Frederic Sowa @ 2017-04-28 18:51 UTC (permalink / raw)
To: Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, netdev
Cc: kernel-team, David S. Miller, Jesper Dangaard Brouer,
John Fastabend, Thomas Graf
In-Reply-To: <590388C2.7080000@iogearbox.net>
Hello,
On 28.04.2017 20:24, Daniel Borkmann wrote:
> On 04/28/2017 01:50 PM, Hannes Frederic Sowa wrote:
>> On 28.04.2017 03:11, Alexei Starovoitov wrote:
> [...]
>>> i disagree re: kallsyms. The goal of prog_tag is to let program writers
>>> understand which program is running in a stable way.
>>
>> But exactly it doesn't let program writers do that, it just confuses
>> them:
>>
>> ---
>>
>> jit on:
>>
>> perf record -e bpf_redirect -agR
>>
>> The unwinder walks the stack, extracts address of upper function and
>> sends it to user space (perf) or handles it inside the kernel/kallsyms
>> (ftrace).
>>
>> User takes tag of bpf program and wants to inspect related maps to the
>> program. Unfortunately the tag is not unique and thus we need to expand
>> the tag back to all possible programs with the same tag and expand that
>> to the union of all possible maps that those programs reference again.
>>
>> That is what we present to the application developer. I would seriously
>> be very confused.
>>
>> If application developer doesn't trust perf and uses instruction pointer
>> value from the stack directly he can't find out which program there is,
>> because fdinfo e.g. doesn't show the actual address of where the program
>> is allocated. I would use /dev/kmem now.
>
> I don't think it would be reasonable to let fdinfo unconditionally
> dump the address of the program including unprivileged progs. We
> probably could add a run-time check into bpf_prog_show_fdinfo() and
> show it dynamically when user has cap_sys_admin.
Okay, it doesn't seem as clean as using an id, but this would work to
correlate traces.
>> ---
>>
>> jit off:
>>
>> perf probe -a '__bpf_prog_run ctx insn'
>> perf probe -a 'bpf_redirect flags ifindex'
>> perf record -e bpf_redirect -agR
>>
>> Situation doesn't change. We do get the insn pointer thus have a unique
>> id for the program. That's it, no further introspection. I can read
>> /dev/kmem now.
>>
>> ---
>>
>> Personally I wouldn't rely on such infrastructure.
>>
>> My proposal would be to maybe hash a map id into the program, so instead
>> of replacing the user space file descriptor with zero, take a map id
>> (like discussed below) or an inode number of the map into the register
>> and hash with that, so that those program have unique identifiers.
>
> I don't think that proposal would work, f.e. placing dev + inode number
> (inode itself wouldn't be sufficient either; map would also have to be
> pinned as anonymous inode from fd wouldn't work) or map id into insn
> won't give you out of a sudden a unique prog id, since maps can be shared
> among multiple progs, but also the same prog can be attached to, say,
> multiple attachment points.
Yep, about the dev + inode number I know about the problems (and wasn't
sure if bpffs was a singleton fs or not - but it is not, as I just
tested). I wanted to outline the idea conceptually. The idea behind
mentioning inode number was to save one additional map_id. I don't know
if it works to register an object with the filesystem without making it
visible.
Anyway the unique map_id (a la Martin's prog_id) would work as well.
I just wonder why we can't use Martin's prog_id for registering the
programs in kallsyms. Problem seemed to be solved and identity of
programs is preserved. Easy to use it for dumping and walking of maps.
>> Otherwise construct kallsym entries with prog id instead of tag.
>>
>> I think that the hash should try to resemble some kind of identity
>> function, and mapping two programs that do something completely
>> different to the same tag is not good (because we don't include the map).
>>
>> Also I do think in future the difference between non-jit and jit
>> operation in regards to tracing should also be lifted. We could add a
>> manual tracing point into the interpreter for reporting the same event
>> as if the program was jitted.
>>
>> Debugging should not be that different based on the sysctl flags.
>
> With regards to tracing it's quite useful to see whether a program was
> JITed or not JITed (aka __bpf_prog_run()), so I don't think it makes
> sense to e.g. have everything named __bpf_prog_run(), at least the other
> way around wouldn't work for interpreter as far as I see.
I don't want to have everything named __bpf_prog_run. Tracepoints have
lots of additional attributes/arguments. The tracepoint should pass a
jit=0/1 argument to user space while using the same name for the hook.
(I didn't check if the dynamic hook registration works - I just assume so).
> But lets assume JIT is off for a moment, and you only see __bpf_prog_run().
> Then, in the stack trace you'll also see related functions that call this
> in the first place, for example, mlx4_en_poll_rx_cq() /
> mlx4_en_process_rx_cq()
> in case of XDP, meaning, you get the call path context as well, for which
> you later on (with the proposed infrastructure for getting fds from
> attachment points + dumping them) can return the attached prog fd and
> with that also dump the code or map data.
Doesn't this break if I have 2 mlx4 cards in the system with different
XDP programs attached? I would have to add an additional parameter to
one of the mlx4 functions to extract the net_device pointer to make the
correlation then. Probably it will be much more difficult for other hooks.
Thanks and bye,
Hannes
^ permalink raw reply
* Re: prog ID and next steps. Was: [RFC net-next 0/2] Introduce bpf_prog ID and iteration
From: Hannes Frederic Sowa @ 2017-04-28 18:57 UTC (permalink / raw)
To: Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, netdev
Cc: kernel-team, David S. Miller, Jesper Dangaard Brouer,
John Fastabend, Thomas Graf
In-Reply-To: <e4931eaa-c126-bd7f-4865-84a251624583@stressinduktion.org>
On 28.04.2017 20:51, Hannes Frederic Sowa wrote:
> Doesn't this break if I have 2 mlx4 cards in the system with different
> XDP programs attached? I would have to add an additional parameter to
> one of the mlx4 functions to extract the net_device pointer to make the
> correlation then. Probably it will be much more difficult for other hooks.
Addendum:
I am referring to the same XDP program attached to two mlx4 NICs which use
different maps. It will be possible to correlate that with some more
advanced perf magic, if the driver passes suitable pointers around so
you can dereference the device name. Also, I am not sure where the
support for multi-rx-queue XDP is heading. It might be even more
difficult to understand it then.
Bye,
Hannes
^ permalink raw reply
* Re: [PATCH net-next] geneve: fix incorrect setting of UDP checksum flag
From: Lance Richardson @ 2017-04-28 18:59 UTC (permalink / raw)
To: Girish Moodalbail; +Cc: davem, netdev, pshelar
In-Reply-To: <1493327513-23247-1-git-send-email-girish.moodalbail@oracle.com>
> From: "Girish Moodalbail" <girish.moodalbail@oracle.com>
> To: davem@davemloft.net
> Cc: netdev@vger.kernel.org, pshelar@ovn.org
> Sent: Thursday, 27 April, 2017 5:11:53 PM
> Subject: [PATCH net-next] geneve: fix incorrect setting of UDP checksum flag
>
> Creating a geneve link with 'udpcsum' set results in a creation of link
> for which UDP checksum will NOT be computed on outbound packets, as can
> be seen below.
>
> 11: gen0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
> link/ether c2:85:27:b6:b4:15 brd ff:ff:ff:ff:ff:ff promiscuity 0
> geneve id 200 remote 192.168.13.1 dstport 6081 noudpcsum
>
> Similarly, creating a link with 'noudpcsum' set results in a creation
> of link for which UDP checksum will be computed on outbound packets.
>
> Fixes: 9b4437a5b870 ("geneve: Unify LWT and netdev handling.")
> Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
> ---
> drivers/net/geneve.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> index 7074b40..dec5d56 100644
> --- a/drivers/net/geneve.c
> +++ b/drivers/net/geneve.c
> @@ -1244,7 +1244,7 @@ static int geneve_newlink(struct net *net, struct
> net_device *dev,
> metadata = true;
>
> if (data[IFLA_GENEVE_UDP_CSUM] &&
> - !nla_get_u8(data[IFLA_GENEVE_UDP_CSUM]))
> + nla_get_u8(data[IFLA_GENEVE_UDP_CSUM]))
> info.key.tun_flags |= TUNNEL_CSUM;
>
> if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX] &&
> --
> 1.8.3.1
>
>
Verified issue on 4.10.10 kernel. Note that this doesn't impact
lightweight geneve tunnels (e.g. as used by openvswitch).
Acked-by: Lance Richardson <lrichard@redhat.com>
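For readers skimming the diff, the effect of the one-character change can be
sketched in user space as below. This is an illustration only:
DEMO_TUNNEL_CSUM and both function names are hypothetical, and a NULL pointer
stands in for an absent IFLA_GENEVE_UDP_CSUM attribute.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_TUNNEL_CSUM 0x01  /* stand-in for the kernel's TUNNEL_CSUM flag */

/* Before the fix: the flag is set when the attribute is present and ZERO. */
static uint16_t demo_flags_buggy(const uint8_t *udp_csum_attr)
{
	uint16_t flags = 0;

	if (udp_csum_attr && !*udp_csum_attr)
		flags |= DEMO_TUNNEL_CSUM;
	return flags;
}

/* After the fix: the flag is set when the attribute is present and NONZERO. */
static uint16_t demo_flags_fixed(const uint8_t *udp_csum_attr)
{
	uint16_t flags = 0;

	if (udp_csum_attr && *udp_csum_attr)
		flags |= DEMO_TUNNEL_CSUM;
	return flags;
}
```

With the attribute set to 1 ('udpcsum'), only the fixed version returns the
checksum flag; with 0 ('noudpcsum'), only the buggy version does.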
^ permalink raw reply
* Re: [PATCH net-next] geneve: fix incorrect setting of UDP checksum flag
From: Pravin Shelar @ 2017-04-28 18:58 UTC (permalink / raw)
To: Girish Moodalbail; +Cc: David S. Miller, Linux Kernel Network Developers
In-Reply-To: <1493327513-23247-1-git-send-email-girish.moodalbail@oracle.com>
On Thu, Apr 27, 2017 at 2:11 PM, Girish Moodalbail
<girish.moodalbail@oracle.com> wrote:
> Creating a geneve link with 'udpcsum' set results in a creation of link
> for which UDP checksum will NOT be computed on outbound packets, as can
> be seen below.
>
> 11: gen0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
> link/ether c2:85:27:b6:b4:15 brd ff:ff:ff:ff:ff:ff promiscuity 0
> geneve id 200 remote 192.168.13.1 dstport 6081 noudpcsum
>
> Similarly, creating a link with 'noudpcsum' set results in a creation
> of link for which UDP checksum will be computed on outbound packets.
>
> Fixes: 9b4437a5b870 ("geneve: Unify LWT and netdev handling.")
> Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
LGTM.
Acked-by: Pravin B Shelar <pshelar@ovn.org>
^ permalink raw reply
* Re: [PATCH] net: hso: register netdev later to avoid a race condition
From: Johan Hovold @ 2017-04-28 19:00 UTC (permalink / raw)
To: Andreas Kemnade
Cc: Johan Hovold, davem, joe, gregkh, peter, hns, linux-usb, netdev,
linux-kernel
In-Reply-To: <20170428193629.4f72caed@aktux>
On Fri, Apr 28, 2017 at 07:36:29PM +0200, Andreas Kemnade wrote:
> On Thu, 27 Apr 2017 10:44:01 +0200
> Johan Hovold <johan@kernel.org> wrote:
>
> > On Wed, Apr 26, 2017 at 07:26:40PM +0200, Andreas Kemnade wrote:
> > > If the netdev is accessed before the urbs are initialized,
> > > there will be NULL pointer dereferences. That is avoided by
> > > registering it when it is fully initialized.
> >
> > > Reported-by: H. Nikolaus Schaller <hns@goldelico.com>
> > > Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
> > This all looks good, but you should consider cleaning up the error
> > handling of this function as a follow-up as we should not be
> > deregistering netdevs that have never been registered (e.g. if a
> > required endpoint is missing or if registration fails for some reason).
> >
> > But just to be clear, this problem existed also before this change.
> >
> Just to check whether I am understanding this correctly. In your opinion
> this patch is good for now. And later when it is applied, there should
> be an additional error handling cleanup patch.
Exactly; your patch is fine as is and the error-handling issue can be
fixed separately.
Thanks,
Johan
^ permalink raw reply
* Re: [PATCH v2 07/21] crypto: shash, caam: Make use of the new sg_map helper function
From: Logan Gunthorpe @ 2017-04-28 19:01 UTC (permalink / raw)
To: Herbert Xu
Cc: dri-devel, Stephen Bates, dm-devel, target-devel,
Christoph Hellwig, devel, James E.J. Bottomley, linux-scsi,
linux-nvdimm, linux-rdma, Sumit Semwal, Ross Zwisler, open-iscsi,
linux-media, intel-gfx, sparmaintainer, linux-raid, Dan Williams,
megaraidlinux.pdl, Jens Axboe, Martin K. Petersen, netdev,
Matthew Wilcox, linux-mmc, linux-kernel, linux-crypto,
Greg Kroah-Hartman
In-Reply-To: <20170428175147.GA9596@gondor.apana.org.au>
On 28/04/17 11:51 AM, Herbert Xu wrote:
> On Fri, Apr 28, 2017 at 10:53:45AM -0600, Logan Gunthorpe wrote:
>>
>>
>> On 28/04/17 12:30 AM, Herbert Xu wrote:
>>> You are right. Indeed the existing code looks buggy as they
>>> don't take sg->offset into account when doing the kmap. Could
>>> you send me some patches that fix these problems first so that
>>> they can be easily backported?
>>
>> Ok, I think the only buggy one in crypto is hifn_795x. Shash and caam
>> both do have the sg->offset accounted for. I'll send a patch for the
>> buggy one shortly.
>
> I think they're all buggy when sg->offset is greater than PAGE_SIZE.
Yes, technically. But that's a _very_ common mistake. Pretty nearly
every case I looked at did not take that into account. I don't think
sg's that point to more than one continuous page are all that common.
Fixing all those cases without making a common function is a waste of
time IMO.
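For reference, the arithmetic such a common helper would have to do when
sg->offset reaches past the first page looks roughly like this. It is a
user-space sketch assuming 4 KiB pages; the DEMO_* names and the struct are
hypothetical and this is not the sg_map implementation from the series:

```c
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12                          /* assumes 4 KiB pages */
#define DEMO_PAGE_SIZE  ((size_t)1 << DEMO_PAGE_SHIFT)

struct demo_sg_pos {
	size_t page_index;   /* which page after the sg's first page to map */
	size_t page_offset;  /* offset inside that page */
};

/* Split an sg offset into (page index, offset within page). */
static struct demo_sg_pos demo_sg_locate(size_t sg_offset)
{
	struct demo_sg_pos pos;

	pos.page_index = sg_offset >> DEMO_PAGE_SHIFT;
	pos.page_offset = sg_offset & (DEMO_PAGE_SIZE - 1);
	return pos;
}
```

Code that maps only the first page and then adds the raw offset is correct
only while page_index is 0, which is the case being discussed.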
Logan
^ permalink raw reply
* Re: prog ID and next steps. Was: [RFC net-next 0/2] Introduce bpf_prog ID and iteration
From: Alexei Starovoitov @ 2017-04-28 19:31 UTC (permalink / raw)
To: Hannes Frederic Sowa, Martin KaFai Lau, netdev
Cc: Daniel Borkmann, kernel-team, David S. Miller,
Jesper Dangaard Brouer, John Fastabend, Thomas Graf
In-Reply-To: <44cdb2d2-9f5c-5d28-2966-3e43e6d2a2ef@stressinduktion.org>
On 4/28/17 4:50 AM, Hannes Frederic Sowa wrote:
> Hello Alexei,
>
> On 28.04.2017 03:11, Alexei Starovoitov wrote:
>> On 4/27/17 6:36 AM, Hannes Frederic Sowa wrote:
>>> On 27.04.2017 08:24, Martin KaFai Lau wrote:
>>>> This patchset introduces the bpf_prog ID and a new bpf cmd to
>>>> iterate all bpf_prog in the system.
>>>>
>>>> It is still incomplete. The idea can be extended to bpf_map.
>>>>
>>>> Martin KaFai Lau (2):
>>>> bpf: Introduce bpf_prog ID
>>>> bpf: Test for bpf_prog ID and BPF_PROG_GET_NEXT_ID
>>>
>>> Thanks Martin, I like the approach.
>>>
>>> I think the progid is also much more suitable to be used in kallsyms
>>> because it handles collisions correctly and lets us correctly walk the
>>> chain (for example, imagine loading two identical programs but
>>> installing them at different hooks; kallsyms doesn't allow finding out
>>> which program is installed where).
>>
>> i disagree re: kallsyms. The goal of prog_tag is to let program writers
>> understand which program is running in a stable way.
>
> But exactly it doesn't let program writers do that, it just confuses them:
>
> ---
>
> jit on:
>
> perf record -e bpf_redirect -agR
>
> The unwinder walks the stack, extracts address of upper function and
> sends it to user space (perf) or handles it inside the kernel/kallsyms
> (ftrace).
>
> User takes tag of bpf program and wants to inspect related maps to the
> program. Unfortunately the tag is not unique and thus we need to expand
> the tag back to all possible programs with the same tag and expand that
> to the union of all possible maps that those programs reference again.
'all possible programs with the same tag' == all exactly the same
programs == the same single program which was either compiled
multiple times or loaded multiple times.
When debugging you want to see which program is running.
You don't care that it was loaded 10 times with different maps.
Same prog_tag == same program code. We don't add maps into tag
of the program, because it will only confuse users and makes such
tag useless, since the user won't be able to correlate such reported tag
with what they have on disk.
Programs get unloaded too, and this 'perf record' and stack
traces come from the past, hence the need for stable prog_tag.
We can take a 'perf record' from yesterday and today find the program
(if we have elf file for it) which was part of that trace.
That's the key value of the prog_tag.
The program ID is only valid at one point in time and adding it
to kallsyms doesn't help much at all.
Say, we added an id to kallsym, now in the stack trace you'll see
bpf_prog_da4fc6a3f41761a2_12
and
bpf_prog_da4fc6a3f41761a2_25
The only thing it tells you is that the same program was loaded twice.
The IDs 12 and 25 won't help to debug at all unless you have
full crashdump of the system at the same exact time and can go and
examine the memory.
But if you have the crashdump, you don't need these IDs.
All kernel data structures can be reconstructed without any IDs.
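As a rough user-space illustration of the naming above: the sketch below
splits such a symbol into its 16-hex-digit tag and an optional numeric
suffix. The "_<id>" suffix is only the hypothetical format from this
discussion, and demo_parse_bpf_sym() is a made-up helper, not an existing
API.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse "bpf_prog_<16 hex digits>" with an optional "_<id>" suffix.
 * Returns 1 on success, 0 otherwise. On success tag holds the hex string
 * and *id holds the suffix, or -1 when no suffix is present.
 */
static int demo_parse_bpf_sym(const char *sym, char tag[17], int *id)
{
	static const char prefix[] = "bpf_prog_";
	size_t plen = sizeof(prefix) - 1;
	size_t i;

	if (strncmp(sym, prefix, plen) != 0)
		return 0;
	sym += plen;

	/* Stops at the terminating NUL if the string is too short. */
	for (i = 0; i < 16; i++) {
		if (!isxdigit((unsigned char)sym[i]))
			return 0;
		tag[i] = sym[i];
	}
	tag[16] = '\0';

	*id = -1;
	if (sym[16] == '\0')
		return 1;
	if (sym[16] == '_' && sscanf(sym + 17, "%d", id) == 1)
		return 1;
	return 0;
}
```

Two symbols that differ only in the suffix parse to the same tag, which is
the part that can be matched against an elf file on disk.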
> That is what we present to the application developer. I would seriously
> be very confused.
documentation needs to be improved. That's for sure.
> ---
>
> jit off:
>
> perf probe -a '__bpf_prog_run ctx insn'
> perf probe -a 'bpf_redirect flags ifindex'
> perf record -e bpf_redirect -agR
>
> Situation doesn't change. We do get the insn pointer thus have a unique
> id for the program.
without JIT+kallsyms the situation is indeed not great, since
__bpf_prog_run is the same for all programs and 'perf record' from
yesterday is useless for debugging today.
That's the reason why I am very much in favor of enabling
net.core.bpf_jit_kallsyms by default.
> My proposal would be to maybe hash a map id into the program, so instead
> of replacing the user space file descriptor with zero, take a map id
> (like discussed below) or an inode number of the map into the register
> and hash with that, so that those program have unique identifiers.
>
> Otherwise construct kallsym entries with prog id instead of tag.
That doesn't make sense as explained above.
> Also I do think in future the difference between non-jit and jit
> operation in regards to tracing should also be lifted. We could add a
> manual tracing point into the interpreter for reporting the same event
> as if the program was jitted.
When JIT is off, I'd like to be able to have different __bpf_prog_run
appearing in stack traces for different programs, but don't see how
that's possible yet.
> Debugging should not be that different based on the sysctl flags.
debugging is already different depending on which sysctls are on.
All the sysctl net.* knobs affect debugging.
> Sure, what about tag -> id? Tag is being reported from tracing and thus
> should be one of the starting points to explore which programs are running.
based on prog_tag and list of elf files the user space can tell
precisely which program was or is running.
The elf file may have full debug info as well, so the user will
see source code of the program too.
Which is the ultimate goal of anyone doing debugging.
^ permalink raw reply
* [PATCH net-next v3] net: bridge: Fix improper taking over HW learned FDB
From: Arkadi Sharshevsky @ 2017-04-28 19:39 UTC (permalink / raw)
To: netdev; +Cc: Ido Schimmel, Nikolay Aleksandrov, bridge, Arkadi Sharshevsky,
davem
Commit 7e26bf45e4cb ("net: bridge: allow SW learn to take over HW fdb
entries") added the ability to "take over an entry which was previously
learned via HW when it shows up from a SW port".
However, if an entry was learned via HW and then a control packet
(e.g., ARP request) was trapped to the CPU, the bridge driver will
update the entry and remove the externally learned flag, although the
entry is still present in HW. Instead, only clear the externally learned
flag in case of roaming.
Fixes: 7e26bf45e4cb ("net: bridge: allow SW learn to take over HW fdb entries")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Arkadi Sharashevsky <arkadis@mellanox.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
v1->v2
- net-next rebase.
v2->v3
- remove redundant line.
---
net/bridge/br_fdb.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index de7988b..ab0c7cc 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -589,16 +589,14 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
if (unlikely(source != fdb->dst)) {
fdb->dst = source;
fdb_modified = true;
+ /* Take over HW learned entry */
+ if (unlikely(fdb->added_by_external_learn))
+ fdb->added_by_external_learn = 0;
}
if (now != fdb->updated)
fdb->updated = now;
if (unlikely(added_by_user))
fdb->added_by_user = 1;
- /* Take over HW learned entry */
- if (unlikely(fdb->added_by_external_learn)) {
- fdb->added_by_external_learn = 0;
- fdb_modified = true;
- }
if (unlikely(fdb_modified))
fdb_notify(br, fdb, RTM_NEWNEIGH);
}
--
2.4.11
^ permalink raw reply related
* Re: pull request (net): ipsec 2017-04-28
From: David Miller @ 2017-04-28 19:42 UTC (permalink / raw)
To: steffen.klassert; +Cc: herbert, netdev
In-Reply-To: <1493370873-30836-1-git-send-email-steffen.klassert@secunet.com>
From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Fri, 28 Apr 2017 11:14:31 +0200
> 1) Do garbage collecting after a policy flush to remove old
> bundles immediately. From Xin Long.
>
> 2) Fix GRO if netfilter is not defined.
> From Sabrina Dubroca.
>
> Please pull or let me know if there are problems.
Pulled, thanks!
^ permalink raw reply
* Re: xdp_redirect ifindex vs port. Was: best API for returning/setting egress port?
From: Hannes Frederic Sowa @ 2017-04-28 19:43 UTC (permalink / raw)
To: Alexei Starovoitov, John Fastabend, Jesper Dangaard Brouer,
Andy Gospodarek
Cc: Alexei Starovoitov, Daniel Borkmann, Daniel Borkmann,
netdev@vger.kernel.org, xdp-newbies@vger.kernel.org
In-Reply-To: <b15f490d-5490-e309-9626-d35b8e932483@fb.com>
On 28.04.2017 07:30, Alexei Starovoitov wrote:
> On 4/27/17 10:06 PM, John Fastabend wrote:
>> That is more or less what I was thinking as well. The other question
>> I have though is should we have a bpf_redirect() call for the simple
>> case where I use the ifindex directly. This will be helpful for taking
>> existing programs from tc_cls into xdp. I think it makes sense to have
>> both bpf_tx_allports(), bpf_tx_port(), and bpf_redirect().
>
> I think so too.
> Once netdevice is stored into netdev_array map the netdevice is pinned
> and we need to figure out what to do if somebody tries to delete it.
> Should we add a new netlink notifier that this netdev's refcnt is
> almost zero and it's only in netdev_array(s) ?
We basically do that automatically in netdev_wait_allrefs:
	pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
		 dev->name, refcnt);
It is a very unpleasant warning and users probably think about a bug in
the kernel at first.
I don't think we should wait for user space to clean that up but have to
do it automatically from the kernel. Maybe we can introduce a special
value that basically NOPs the transmission. The hash table itself would
install a netdevice notifier and would clean all tables. Could
definitely cause some storm in the kernel, if a lot of keys are mapped
to the same interface.
> or should it be deleted from the array(s) automatically and
> then user space will be notified post-deletion?
> Both approaches have their pros and cons.
I am leaning more towards deleting it automatically. But walking all
tables and in there all keys might cause some unwanted load spikes.
> Whereas raw ifindex approach (via bpf_redirect) doesn't have these
> caveats. It's clear to both bpf prog and user space that ifindex
> can be stale and user space needs to monitor netdevs and update
> programs/maps.
A separate type for ifindex as key or value might be nice to expose this
information directly via the kernel (fdinfo etc.) but at the same time,
debugging infrastructure from user space can also easily deal with that.
Another approach would be:
ifindexes are allocated cyclically and are signed int, not u32,
during allocation. Maybe we can negate the ifindex during transmission
in the table and thus mark it as stale (or set it to -1)? This update
would be done by bpf_tx_*ports() that take a reference to a table and a
key and submit the packets on the appropriate ports and can flag the
relevant ifindexes as stale.
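A minimal user-space sketch of that negation idea, under the assumption that
a table slot holds a plain signed ifindex; all demo_* names are hypothetical
and nothing here is an existing kernel interface:

```c
#include <stdbool.h>

/* A valid ifindex is positive; a negative value marks the slot as stale. */
static int demo_mark_stale(int ifindex)
{
	return ifindex > 0 ? -ifindex : ifindex;
}

static bool demo_is_stale(int slot)
{
	return slot < 0;
}

/*
 * Model of the transmit path: returns 0 if the packet would be sent,
 * -1 if the slot is (or has just become) stale and the send is a NOP.
 * dev_present stands in for the netdevice lookup succeeding.
 */
static int demo_tx_slot(int *slot, bool dev_present)
{
	if (demo_is_stale(*slot))
		return -1;
	if (!dev_present) {
		*slot = demo_mark_stale(*slot);
		return -1;
	}
	return 0;
}
```

User space could then scan the table for negative entries and still recover
which ifindex went away, since only the sign was changed.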
Just wanted to draft this idea; I am not particularly happy with it.
Maybe someone comes up with a better one.
Thanks,
Hannes
^ permalink raw reply
* Re: pull request (net-next): ipsec-next 2017-04-28
From: David Miller @ 2017-04-28 19:44 UTC (permalink / raw)
To: steffen.klassert; +Cc: herbert, netdev
In-Reply-To: <1493368958-29609-1-git-send-email-steffen.klassert@secunet.com>
From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Fri, 28 Apr 2017 10:42:37 +0200
> Just one patch to fix a misplaced spin_unlock_bh in an error path.
>
> Please pull or let me know if there are problems.
Pulled, thank you.
^ permalink raw reply