Netdev List
* Re: [patch net-next 00/10] net: sched: introduce multichain support for filters
From: Cong Wang @ 2017-04-27 17:46 UTC
  To: Jiri Pirko
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	David Ahern, Eric Dumazet, Stephen Hemminger, Daniel Borkmann,
	Alexander Duyck, mlxsw, Simon Horman
In-Reply-To: <1493291540-2119-1-git-send-email-jiri@resnulli.us>

On Thu, Apr 27, 2017 at 4:12 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Simple example:
> $ tc qdisc add dev eth0 ingress
> $ tc filter add dev eth0 parent ffff: protocol ip pref 33 flower dst_mac 52:54:00:3d:c7:6d action goto chain 11
> $ tc filter add dev eth0 parent ffff: protocol ip pref 22 chain 11 flower dst_ip 192.168.40.1 action drop
> $ tc filter show dev eth0 root

Interesting.

I haven't looked into the code yet. If I understand the concept correctly,
with your patchset we can mark each filter with a chain number to
choose which chain it belongs to _logically_, even though
_physically_ it is still in the old-fashioned chain (prio, proto)?

If so, you have to ensure the proto is the same, since the protocol of
the packet does not change dynamically? And the original
priority becomes pointless with chains, since we can just go to
any other chain in any order?

By default, without any chain number, you use 0 for all the chains,
so the old-fashioned chain still works.

Thanks.


* Re: [PATCH net v4 3/3] net: hns: fixed bug that skb used after kfree
From: Florian Fainelli @ 2017-04-27 17:38 UTC
  To: Yankejian, davem, salil.mehta, yisen.zhuang, matthias.bgg,
	lipeng321, zhouhuiru, huangdaode
  Cc: netdev, linuxarm
In-Reply-To: <1493261053-68197-4-git-send-email-yankejian@huawei.com>

On 04/26/2017 07:44 PM, Yankejian wrote:
>  	struct hns_nic_priv *priv = netdev_priv(ndev);
>  	struct hnae_ring *ring = ring_data->ring;
> @@ -361,6 +361,10 @@ int hns_nic_net_xmit_hw(struct net_device *ndev,
>  	dev_queue = netdev_get_tx_queue(ndev, skb->queue_mapping);
>  	netdev_tx_sent_queue(dev_queue, skb->len);
>  
> +	netif_trans_update(ndev);
> +	ndev->stats.tx_bytes += skb->len;
> +	ndev->stats.tx_packets++;

This is still wrong though, you should not update your TX statistics
until you get a TX completion interrupt that confirms these packets were
actually transmitted. This has the advantage of not causing use after
free in your ndo_start_xmit() function (current bug), and also allows
feeding information into BQL where it is appropriate, and in a central
location: the TX completion handler.
-- 
Florian
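
The ordering described above can be sketched in userspace C. This is only
a model under stated assumptions: struct tx_ring, model_xmit() and
model_tx_complete() are hypothetical names for illustration, not hns
driver code. The transmit path records the packet length per descriptor,
and the counters are only touched once the completion path reaps that
descriptor, so the transmit path never needs the skb after handing it off.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical ring depth */

struct tx_stats {
	uint64_t packets;
	uint64_t bytes;
};

struct tx_ring {
	uint32_t len[RING_SIZE];	/* length saved per in-flight slot */
	unsigned int head;		/* next slot the xmit path fills */
	unsigned int tail;		/* next slot the completion path reaps */
};

/* Transmit path: remember the length, leave the counters alone. */
static int model_xmit(struct tx_ring *ring, uint32_t skb_len)
{
	if (ring->head - ring->tail == RING_SIZE)
		return -1;	/* ring full; a driver would stop the queue */
	ring->len[ring->head % RING_SIZE] = skb_len;
	ring->head++;
	return 0;
}

/* Completion path: 'done' descriptors were reported as transmitted. */
static void model_tx_complete(struct tx_ring *ring, struct tx_stats *stats,
			      unsigned int done)
{
	assert(done <= ring->head - ring->tail);
	while (done--) {
		stats->bytes += ring->len[ring->tail % RING_SIZE];
		stats->packets++;
		ring->tail++;
	}
}
```

In a real driver the completion handler is also the natural place to
report the same packet and byte counts to BQL.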


* [PATCH net] bonding: avoid defaulting hard_header_len to ETH_HLEN on slave removal
From: Paolo Abeni @ 2017-04-27 17:29 UTC
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Jay Vosburgh, David S. Miller, Honggang LI,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA

On slave list updates, the bonding driver computes its hard_header_len
as the maximum of all enslaved devices' hard_header_len.
If the slave list is empty, e.g. on last enslaved device removal,
ETH_HLEN is used.

Since the bonding header_ops are set only when the first enslaved
device is attached, the above can lead to header_ops->create()
being called with the wrong skb headroom in place.

If bond0 is configured on top of ipoib devices, with the
following commands:

ifup bond0
for slave in $BOND_SLAVES_LIST; do
	ip link set dev $slave nomaster
done
ping -c 1 <ip on bond0 subnet>

we get an skb_under_panic() with a call trace similar to:
	skb_push+0x3d/0x40
	push_pseudo_header+0x17/0x30 [ib_ipoib]
	ipoib_hard_header+0x4e/0x80 [ib_ipoib]
	arp_create+0x12f/0x220
	arp_send_dst.part.19+0x28/0x50
	arp_solicit+0x115/0x290
	neigh_probe+0x4d/0x70
	__neigh_event_send+0xa7/0x230
	neigh_resolve_output+0x12e/0x1c0
	ip_finish_output2+0x14b/0x390
	ip_finish_output+0x136/0x1e0
	ip_output+0x76/0xe0
	ip_local_out+0x35/0x40
	ip_send_skb+0x19/0x40
	ip_push_pending_frames+0x33/0x40
	raw_sendmsg+0x7d3/0xb50
	inet_sendmsg+0x31/0xb0
	sock_sendmsg+0x38/0x50
	SYSC_sendto+0x102/0x190
	SyS_sendto+0xe/0x10
	do_syscall_64+0x67/0x180
	entry_SYSCALL64_slow_path+0x25/0x25

This change addresses the issue by not updating the bonding device's
hard_header_len when the slave list becomes empty, which prevents it from
shrinking below the value used by header_ops->create().
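
The rule can be modelled in a few lines of userspace C. This is a sketch
under assumptions, not the bonding code itself:
model_bond_hard_header_len() is a hypothetical helper, and the slave
header lengths are passed in as a plain array.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN 14	/* Ethernet header length, as in the kernel */

/*
 * Return the hard_header_len the bond should use. 'current_len' is the
 * value already set on the bond device.
 */
static unsigned short
model_bond_hard_header_len(unsigned short current_len,
			   const unsigned short *slave_len, size_t n_slaves)
{
	unsigned short max_len = ETH_HLEN;
	size_t i;

	assert(slave_len != NULL || n_slaves == 0);

	/* Empty slave list: keep the old value, never shrink it. */
	if (n_slaves == 0)
		return current_len;

	for (i = 0; i < n_slaves; i++)
		if (slave_len[i] > max_len)
			max_len = slave_len[i];
	return max_len;
}
```

With this rule, a bond that once had a slave with a large header keeps
that headroom after the last slave is removed, so a later
header_ops->create() call still finds the space it expects.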

The bug has been present since commit 54ef31371407 ("[PATCH] bonding: Handle large
hard_header_len") but the panic can be triggered only since
commit fc791b633515 ("IB/ipoib: move back IB LL address into the hard
header").

Reported-by: Norbert P <noe-PRwTpj6vllL463JZfw7VRw@public.gmane.org>
Fixes: 54ef31371407 ("[PATCH] bonding: Handle large hard_header_len")
Fixes: fc791b633515 ("IB/ipoib: move back IB LL address into the hard header")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Paolo Abeni <pabeni-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 drivers/net/bonding/bond_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 8a4ba8b..34481c9 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1104,11 +1104,11 @@ static void bond_compute_features(struct bonding *bond)
 		gso_max_size = min(gso_max_size, slave->dev->gso_max_size);
 		gso_max_segs = min(gso_max_segs, slave->dev->gso_max_segs);
 	}
+	bond_dev->hard_header_len = max_hard_header_len;
 
 done:
 	bond_dev->vlan_features = vlan_features;
 	bond_dev->hw_enc_features = enc_features | NETIF_F_GSO_ENCAP_ALL;
-	bond_dev->hard_header_len = max_hard_header_len;
 	bond_dev->gso_max_segs = gso_max_segs;
 	netif_set_gso_max_size(bond_dev, gso_max_size);
 
-- 
2.9.3



* Re: [PATCH net-next 1/4] ixgbe: sparc: rename the ARCH_WANT_RELAX_ORDER to IXGBE_ALLOW_RELAXED_ORDER
From: Bjorn Helgaas @ 2017-04-27 17:19 UTC
  To: Alexander Duyck
  Cc: Ding Tianhong, Mark Rutland, Amir Ancel, Gabriele Paoloni,
	linux-pci@vger.kernel.org, Catalin Marinas, Will Deacon, LinuxArm,
	David Laight, jeffrey.t.kirsher@intel.com, netdev@vger.kernel.org,
	Robin Murphy, davem@davemloft.net,
	linux-arm-kernel@lists.infradead.org, Casey Leedom
In-Reply-To: <CAKgT0Uc4L=GgYbpO-Fm9OfN+_fLypbDP1c+X4T_ta90ecQiyGQ@mail.gmail.com>

[+cc Casey]

On Wed, Apr 26, 2017 at 09:18:33AM -0700, Alexander Duyck wrote:
> On Wed, Apr 26, 2017 at 2:26 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
> > Hi Amir:
> >
> > I am really glad to hear that mlx5 will support RO mode this year. If so, do you agree that it should be enabled dynamically via ethtool -s xxx?
> > We tried that several months ago, but only one driver would have used it at that time, so the maintainer was against it. If mlx5 is going to support RO,
> > we could try to restart this approach. What do you think? :)
> >
> > Thanks
> > Ding
> 
> Hi Ding,
> 
> Enabling relaxed ordering really doesn't have any place in ethtool. It
> is a PCIe attribute that you are essentially wanting to enable.
> 
> It might be worthwhile to take a look at updating the PCIe code path
> to handle this. Really what we should probably do is guarantee that
> the architectures that need relaxed ordering are setting it in the
> PCIe Device Control register and that the ones that don't are clearing
> the bit. It's possible that this is already occurring, but I don't
> know what the state of handling those bits is in the kernel. Once we can
> guarantee that we could use that to have the drivers determine their
> behavior with regard to relaxed ordering. For example in the case of
> igb/ixgbe we could probably change the behavior so that it will by
> default try to use relaxed ordering but if it is not enabled in PCIe
> Device Control register the hardware should not request to use it. It
> would simplify things in the drivers and allow for each architecture
> to control things as needed in their PCIe code.

I thought Relaxed Ordering was an optimization.  Are there cases where
it is actually required for correct behavior?

The PCI core doesn't currently do anything with Relaxed Ordering.
Some drivers enable/disable it directly.  I think it would probably be
better if the core provided an interface for this.  One reason is
because I think Casey has identified some systems where Relaxed
Ordering doesn't work correctly, and I'd rather deal with them once in
the core than in every driver.
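
As a rough illustration of what a single core-side check could look like,
here is a userspace sketch. Only the bit position is taken from the PCIe
spec (bit 4 of Device Control, which the kernel headers name
PCI_EXP_DEVCTL_RELAX_EN); struct model_pci_dev, its ro_broken flag and
both helpers are hypothetical and do not exist in the PCI core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Device Control register, bit 4: Enable Relaxed Ordering. */
#define DEVCTL_RELAX_EN 0x0010u

struct model_pci_dev {
	uint16_t devctl;	/* cached Device Control value */
	bool ro_broken;		/* hypothetical quirk set once by the core */
};

/* A driver would ask this instead of poking the register itself. */
static bool model_relaxed_ordering_allowed(const struct model_pci_dev *dev)
{
	assert(dev != NULL);
	return !dev->ro_broken && (dev->devctl & DEVCTL_RELAX_EN) != 0;
}

/* The core would clear the enable bit on platforms known to misbehave. */
static void model_disable_relaxed_ordering(struct model_pci_dev *dev)
{
	assert(dev != NULL);
	dev->devctl &= (uint16_t)~DEVCTL_RELAX_EN;
}
```

The point of the sketch is only that the quirk decision lives in one
place, while drivers consult the result.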

Are you hinting that the PCI core or arch code could actually *enable*
Relaxed Ordering without the driver doing anything?  Is it safe to do
that?  Is there such a thing as a device that is capable of using RO,
but where the driver must be aware of it being enabled, so it programs
the device appropriately?

Bjorn


* Re: [PATCH 5/7] IB/hfi1: use pcie_flr instead of duplicating it
From: Bjorn Helgaas @ 2017-04-27 16:49 UTC
  To: Christoph Hellwig
  Cc: Byczkowski, Jakub, Bjorn Helgaas, Cabiddu, Giovanni,
	Benedetto, Salvatore, Marciniszyn, Mike, Dalessandro, Dennis,
	Derek Chickles, Satanand Burla, Felix Manlunas, Raghu Vatsavayi,
	Kirsher, Jeffrey T, linux-pci@vger.kernel.org, qat-linux,
	linux-crypto@vger.kernel.org, linux-rdma@vger.kernel.org,
	"net
In-Reply-To: <20170427064758.GA20614@lst.de>

On Thu, Apr 27, 2017 at 08:47:58AM +0200, Christoph Hellwig wrote:
> On Tue, Apr 25, 2017 at 02:39:55PM -0500, Bjorn Helgaas wrote:
> > This still leaves these:
> > 
> >   [PATCH 4/7] ixgbe: use pcie_flr instead of duplicating it
> >   [PATCH 6/7] crypto: qat: use pcie_flr instead of duplicating it
> >   [PATCH 7/7] liquidio: use pcie_flr instead of duplicating it
> > 
> > I haven't seen any response to 4 and 6.  Felix reported an unused
> > variable in 7.  Let me know if you'd like me to do anything with
> > these.
> 
> Now that Jeff ACKed 4 it might be worth adding it to the pci tree last
> minute.  I'll resend liquidio and qat to the respective maintainers for
> the next merge window.

I applied 4 with Jeff's ack to pci/virtualization for v4.12, thanks!


* Re: [PATCH net-next] samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel
From: David Miller @ 2017-04-27 16:49 UTC
  To: ast; +Cc: dsa, netdev, daniel
In-Reply-To: <085b8b0f-7751-0be4-d3b7-3c06f2cc602b@fb.com>

From: Alexei Starovoitov <ast@fb.com>
Date: Thu, 27 Apr 2017 09:38:37 -0700

> On 4/27/17 9:11 AM, David Ahern wrote:
>> Add option to xdp1 and xdp_tx_iptunnel to insert xdp program in
>> SKB_MODE:
>>  - update set_link_xdp_fd to take a flags argument that is added to the
>>    RTM_SETLINK message
>>
>>  - Add -S option to xdp1 and xdp_tx_iptunnel user code. When passed in
>>    XDP_FLAGS_SKB_MODE is set in the flags arg passed to set_link_xdp_fd
>>
>> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
> 
> awesome. thanks!
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Indeed, very awesome.

Applied!


* Re: [PATCH v1 net-next 5/6] net: allow simultaneous SW and HW transmit timestamping
From: Willem de Bruijn @ 2017-04-27 16:48 UTC
  To: Miroslav Lichvar
  Cc: Network Development, Richard Cochran, Willem de Bruijn,
	Soheil Hassas Yeganeh, Keller, Jacob E, Denny Page, Jiri Benc
In-Reply-To: <20170427163911.GC3401@localhost>

On Thu, Apr 27, 2017 at 12:39 PM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Thu, Apr 27, 2017 at 12:21:00PM -0400, Willem de Bruijn wrote:
>> >> > @@ -720,6 +720,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
>> >> >                 empty = 0;
>> >> >         if (shhwtstamps &&
>> >> >             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
>> >> > +           (empty || !skb_is_err_queue(skb)) &&
>> >> >             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
>> >>
>> >> I find skb->tstamp == 0 easier to understand than the condition on empty.
>> >>
>> >> Indeed, this is so non-obvious that I would suggest another helper function
>> >> skb_is_hwtx_tstamp with a concise comment about the race condition
>> >> between tx software and hardware timestamps (as in the last sentence of
>> >> the commit message).
>> >
>> > Should it include also the skb_is_err_queue() check? If it returned
>> > true for both TX and RX HW timestamps, maybe it could be called
>> > skb_has_hw_tstamp?
>>
>> For the purpose of documenting why this complex condition exists,
>> I would call the skb_is_err_queue in that helper function and make
>> it tx + hw specific.
>
> Hm, like this?
>
>         if (shhwtstamps &&
>             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
> +           (skb_is_hwtx_tstamp(skb) || !skb_is_err_queue(skb)) &&
>             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
>
> where skb_is_hwtx_tstamp() has
>         return skb->tstamp == 0 && skb_is_err_queue(skb);
>
> I was just not sure about the unnecessary skb_is_err_queue() call.

Oh, good point. If the condition is

  (skb_is_err_queue(skb) && !skb->tstamp) || !skb_is_err_queue(skb)

then it makes more sense to just use the simpler expression

  (!skb_is_err_queue(skb)) || (!skb->tstamp)

This cannot be called skb_is_hwtx_tstamp, as this does not imply
skb_hwtstamps(skb). Perhaps instead define

  /* On transmit, software and hardware timestamps are returned independently.
   * Do not return hardware timestamps, even if skb_hwtstamps(skb) is true, if
   * skb->tstamp is set.
   */
  static bool skb_is_swtx_tstamp(const struct sk_buff *skb)
  {
    return skb_is_err_queue(skb) && skb->tstamp;
  }

and use !skb_is_swtx_tstamp(skb) in this condition. The comment is
just a quick first try, can probably be improved.
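
The simplification above can be checked exhaustively with a small
userspace model. The two booleans stand in for skb_is_err_queue(skb) and
a non-zero skb->tstamp; the function names here are hypothetical and
exist only for this check.

```c
#include <assert.h>
#include <stdbool.h>

/* Long form: (skb_is_err_queue(skb) && !skb->tstamp) || !skb_is_err_queue(skb) */
static bool cond_long(bool is_err_queue, bool has_sw_tstamp)
{
	return (is_err_queue && !has_sw_tstamp) || !is_err_queue;
}

/* Short form: !skb_is_err_queue(skb) || !skb->tstamp */
static bool cond_short(bool is_err_queue, bool has_sw_tstamp)
{
	return !is_err_queue || !has_sw_tstamp;
}

/* Model of the proposed skb_is_swtx_tstamp() helper. */
static bool model_is_swtx_tstamp(bool is_err_queue, bool has_sw_tstamp)
{
	return is_err_queue && has_sw_tstamp;
}

/* Walk all four input combinations and assert the three forms agree. */
static void check_forms_agree(void)
{
	for (int e = 0; e <= 1; e++) {
		for (int t = 0; t <= 1; t++) {
			bool want = cond_long(e, t);

			assert(cond_short(e, t) == want);
			assert(!model_is_swtx_tstamp(e, t) == want);
		}
	}
}
```

The only input for which the condition is false is an error-queue skb
with skb->tstamp set, which is the software TX timestamp case.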


* Re: [PATCH net-next 6/6] bpf: show bpf programs
From: David Miller @ 2017-04-27 16:40 UTC
  To: hannes; +Cc: daniel, netdev, ast, daniel, jbenc, aconole
In-Reply-To: <d99407db-895c-01c0-8e26-c6bd2b79d4ff@stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Thu, 27 Apr 2017 18:28:17 +0200

> I merely tried to establish the procfs interface as a quick-look
> interface

Show me that "quick look" nftables dumping facility via procfs
and I'll start to listen to you.

What you are proposing has no real value once we have bpf() system
call based traversal and has no strict precedence across the
networking subsystem.

Thank you.


* status of bpf binutils
From: David Miller @ 2017-04-27 16:39 UTC
  To: ast; +Cc: daniel, netdev, xdp-newbies


As I hinted the other day I'm hacking on BPF support in binutils.

Here is what works right now:

1) disassembly of object files made by existing tools, I can build things
   with clang/llvm on my sparc and analyze the resulting object files
   using objdump and gdb

[davem@dhcp-10-15-49-210 build-bpf]$ ./binutils/objdump -d ./x.o

./x.o:     file format elf64-bpf


Disassembly of section test1:

0000000000000000 <process>:
   0:	b7 00 00 00 00 00 00 02 	mov	r0, 2
   8:	61 21 00 50 00 00 00 00 	ldw	r2, [r1+80]
 ...

[davem@dhcp-10-15-49-210 build-bpf]$ ./gdb/gdb ./x.o
GNU gdb (GDB) 8.0.50.20170427-git
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=bpf-elf".
 ...
(gdb) x/10i process
   0x0 <process>:	mov	r0, 2
   0x8 <process+8>:	ldw	r2, [r1+80]
   0x10 <process+16>:	ldw	r1, [r1+76]
   0x18 <process+24>:	mov	r4, r1
   0x20 <process+32>:	add	r4, 14
   0x28 <process+40>:	jgt	r4, r2, 0x148 <LBB0_11>
   0x30 <process+48>:	ldb	r5, [r1+13]
   0x38 <process+56>:	ldb	r3, [r1+12]
   0x40 <process+64>:	lsh	r3, 8
   0x48 <process+72>:	or	r3, r5
(gdb)

2) Simple assembler programs compile.

[davem@dhcp-10-15-49-210 build-bpf]$ cat gas/x.s
	.text
	.align	8
	.globl	foo
foo:	mov	r1, 2
	ja	1f
	mov	r1, 3
1:	exit
[davem@dhcp-10-15-49-210 build-bpf]$ gas/as-new -o gas/x.o gas/x.s
[davem@dhcp-10-15-49-210 build-bpf]$ ./binutils/objdump -d gas/x.o

gas/x.o:     file format elf64-bpf


Disassembly of section .text:

0000000000000000 <foo>:
   0:	b7 10 00 00 00 00 00 02 	mov	r1, 2
   8:	05 00 00 03 00 00 00 00 	ja	18 <foo+0x18>
  10:	b7 10 00 00 00 00 00 03 	mov	r1, 3
  18:	95 00 00 00 00 00 00 00 	exit	
[davem@dhcp-10-15-49-210 build-bpf]$

I've created a few ELF relocations for BPF, there are only really 3
main things to consider:

1) Immediate field, 32-bit
2) Offset field, 16-bit absolute
3) Offset field, 16-bit PC-relative displacement

and thus:

/* Relocation types.  */
START_RELOC_NUMBERS (elf_bpf_reloc_type)
  RELOC_NUMBER (R_BPF_NONE, 0)
  RELOC_NUMBER (R_BPF_16, 1)
  RELOC_NUMBER (R_BPF_32, 2)
  RELOC_NUMBER (R_BPF_WDISP16, 3)
END_RELOC_NUMBERS (R_BPF_max)

is what goes into include/elf/bpf.h

I just realized while writing this that I'll need to add an R_BPF_64
to handle ldimm64 instructions, but that's not a big deal.
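
For context on why a 64-bit relocation is needed: ldimm64 occupies two
consecutive 8-byte instruction slots, with the low 32 bits of the
constant in the first slot's immediate field and the high 32 bits in the
second. A minimal sketch, assuming an unpacked in-memory representation;
struct model_bpf_insn and both helpers are hypothetical names, not part
of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_LD_IMM64 0x18	/* BPF_LD | BPF_DW | BPF_IMM */

/* Unpacked view of one instruction slot (not the on-disk layout). */
struct model_bpf_insn {
	uint8_t opcode;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

/* Split a 64-bit constant across the two slots of an ldimm64. */
static void model_ld_imm64(struct model_bpf_insn pair[2], uint8_t dst,
			   uint64_t value)
{
	assert(dst <= 10);	/* r0..r10 */
	pair[0] = (struct model_bpf_insn){
		.opcode = MODEL_LD_IMM64,
		.dst_reg = dst,
		.imm = (int32_t)(uint32_t)value,		/* low half */
	};
	pair[1] = (struct model_bpf_insn){
		.imm = (int32_t)(uint32_t)(value >> 32),	/* high half */
	};
}

/* Reassemble the constant, as a 64-bit relocation would have to. */
static uint64_t model_read_imm64(const struct model_bpf_insn pair[2])
{
	return (uint64_t)(uint32_t)pair[0].imm |
	       ((uint64_t)(uint32_t)pair[1].imm << 32);
}
```

A relocation against such an instruction therefore has to patch two
separate 32-bit fields that are 8 bytes apart, which a plain 32-bit
relocation cannot express.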

I'm going to concentrate on the assembler for now, and start writing
test cases.

Another area I have not resolved completely is endianness.  Right now
just for my hacking and testing, I'm forcing everything to be big
endian which of course will not be the final default :-)
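
To make the big-endian layout concrete, here is a small sketch that
encodes one instruction into 8 bytes. The field order (opcode, register
byte with the destination in the high nibble, 16-bit offset, 32-bit
immediate) is inferred from the objdump output earlier in this mail;
model_encode_be() is a hypothetical helper, not a function from the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Encode one BPF instruction as 8 big-endian bytes. */
static void model_encode_be(uint8_t out[8], uint8_t opcode, uint8_t dst,
			    uint8_t src, int16_t off, int32_t imm)
{
	uint16_t o = (uint16_t)off;
	uint32_t i = (uint32_t)imm;

	assert(dst < 16 && src < 16);

	out[0] = opcode;
	out[1] = (uint8_t)((dst << 4) | src);	/* dst in the high nibble */
	out[2] = (uint8_t)(o >> 8);		/* offset, most significant first */
	out[3] = (uint8_t)o;
	out[4] = (uint8_t)(i >> 24);		/* immediate, most significant first */
	out[5] = (uint8_t)(i >> 16);
	out[6] = (uint8_t)(i >> 8);
	out[7] = (uint8_t)i;
}
```

Encoding opcode 0xb7 with dst r1 and immediate 2 this way yields
b7 10 00 00 00 00 00 02, matching the "mov r1, 2" line in the gas/x.o
dump. A little-endian variant would swap the byte order of the offset and
immediate fields and, following the kernel's struct bpf_insn bitfields,
put the destination register in the low nibble.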

The current patch against:

	git://sourceware.org/git/binutils-gdb.git

is below.

If you want to play with this, configure with "--target=bpf-elf".


From 2e193eecf3eee1c0632f5c1932f76ff387c49ae2 Mon Sep 17 00:00:00 2001
From: "David S. Miller" <davem@davemloft.net>
Date: Wed, 26 Apr 2017 14:27:53 -0400
Subject: [PATCH] Start adding BPF support...

---
 bfd/Makefile.am            |   2 +
 bfd/Makefile.in            |   3 +
 bfd/archures.c             |   3 +
 bfd/bfd-in2.h              |   5 +
 bfd/config.bfd             |   5 +
 bfd/configure              |   1 +
 bfd/configure.ac           |   1 +
 bfd/cpu-bpf.c              |  41 +++++
 bfd/elf64-bpf.c            |  47 +++++
 bfd/libbfd.h               |   1 +
 bfd/reloc.c                |   5 +
 bfd/targets.c              |   3 +
 config.sub                 |   5 +-
 gas/Makefile.am            |   2 +
 gas/Makefile.in            |  17 ++
 gas/config/tc-bpf.c        | 447 +++++++++++++++++++++++++++++++++++++++++++++
 gas/config/tc-bpf.h        |  38 ++++
 gas/configure.tgt          |   3 +
 gdb/bpf-tdep.c             | 229 +++++++++++++++++++++++
 gdb/bpf-tdep.h             |  40 ++++
 gdb/configure.tgt          |   4 +
 include/dis-asm.h          |   1 +
 include/elf/bpf.h          |  34 ++++
 include/opcode/bpf.h       |  16 ++
 ld/Makefile.am             |   4 +
 ld/Makefile.in             |   5 +
 ld/configure.tgt           |   2 +
 ld/emulparams/elf64_bpf.sh |   8 +
 opcodes/Makefile.am        |   2 +
 opcodes/bpf-dis.c          | 152 +++++++++++++++
 opcodes/bpf-opc.c          | 147 +++++++++++++++
 opcodes/configure          |   1 +
 opcodes/configure.ac       |   1 +
 opcodes/disassemble.c      |   6 +
 34 files changed, 1279 insertions(+), 2 deletions(-)
 create mode 100644 bfd/cpu-bpf.c
 create mode 100644 bfd/elf64-bpf.c
 create mode 100644 gas/config/tc-bpf.c
 create mode 100644 gas/config/tc-bpf.h
 create mode 100644 gdb/bpf-tdep.c
 create mode 100644 gdb/bpf-tdep.h
 create mode 100644 include/elf/bpf.h
 create mode 100644 include/opcode/bpf.h
 create mode 100644 ld/emulparams/elf64_bpf.sh
 create mode 100644 opcodes/bpf-dis.c
 create mode 100644 opcodes/bpf-opc.c

diff --git a/bfd/Makefile.am b/bfd/Makefile.am
index 97b608c..911655a 100644
--- a/bfd/Makefile.am
+++ b/bfd/Makefile.am
@@ -95,6 +95,7 @@ ALL_MACHINES = \
 	cpu-arm.lo \
 	cpu-avr.lo \
 	cpu-bfin.lo \
+	cpu-bpf.lo \
 	cpu-cr16.lo \
 	cpu-cr16c.lo \
 	cpu-cris.lo \
@@ -185,6 +186,7 @@ ALL_MACHINES_CFILES = \
 	cpu-arm.c \
 	cpu-avr.c \
 	cpu-bfin.c \
+	cpu-bpf.c \
 	cpu-cr16.c \
 	cpu-cr16c.c \
 	cpu-cris.c \
diff --git a/bfd/Makefile.in b/bfd/Makefile.in
index e48abaf..930aa09 100644
--- a/bfd/Makefile.in
+++ b/bfd/Makefile.in
@@ -428,6 +428,7 @@ ALL_MACHINES = \
 	cpu-arm.lo \
 	cpu-avr.lo \
 	cpu-bfin.lo \
+	cpu-bpf.lo \
 	cpu-cr16.lo \
 	cpu-cr16c.lo \
 	cpu-cris.lo \
@@ -518,6 +519,7 @@ ALL_MACHINES_CFILES = \
 	cpu-arm.c \
 	cpu-avr.c \
 	cpu-bfin.c \
+	cpu-bpf.c \
 	cpu-cr16.c \
 	cpu-cr16c.c \
 	cpu-cris.c \
@@ -1380,6 +1382,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/cpu-arm.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/cpu-avr.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/cpu-bfin.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/cpu-bpf.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/cpu-cr16.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/cpu-cr16c.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/cpu-cris.Plo@am__quote@
diff --git a/bfd/archures.c b/bfd/archures.c
index c6e7152..f096d73 100644
--- a/bfd/archures.c
+++ b/bfd/archures.c
@@ -447,6 +447,8 @@ DESCRIPTION
 .#define bfd_mach_avrxmega7 107
 .  bfd_arch_bfin,        {* ADI Blackfin *}
 .#define bfd_mach_bfin          1
+.  bfd_arch_bpf,        {* eBPF *}
+.#define bfd_mach_bpf           1
 .  bfd_arch_cr16,       {* National Semiconductor CompactRISC (ie CR16). *}
 .#define bfd_mach_cr16		1
 .  bfd_arch_cr16c,       {* National Semiconductor CompactRISC. *}
@@ -582,6 +584,7 @@ extern const bfd_arch_info_type bfd_arc_arch;
 extern const bfd_arch_info_type bfd_arm_arch;
 extern const bfd_arch_info_type bfd_avr_arch;
 extern const bfd_arch_info_type bfd_bfin_arch;
+extern const bfd_arch_info_type bfd_bpf_arch;
 extern const bfd_arch_info_type bfd_cr16_arch;
 extern const bfd_arch_info_type bfd_cr16c_arch;
 extern const bfd_arch_info_type bfd_cris_arch;
diff --git a/bfd/bfd-in2.h b/bfd/bfd-in2.h
index 17a35c0..b4db6b2 100644
--- a/bfd/bfd-in2.h
+++ b/bfd/bfd-in2.h
@@ -2304,6 +2304,8 @@ enum bfd_architecture
 #define bfd_mach_avrxmega7 107
   bfd_arch_bfin,        /* ADI Blackfin */
 #define bfd_mach_bfin          1
+  bfd_arch_bpf,        /* eBPF */
+#define bfd_mach_bpf           1
   bfd_arch_cr16,       /* National Semiconductor CompactRISC (ie CR16). */
 #define bfd_mach_cr16          1
   bfd_arch_cr16c,       /* National Semiconductor CompactRISC. */
@@ -3910,6 +3912,9 @@ pc-relative or some form of GOT-indirect relocation.  */
 /* ADI Blackfin arithmetic relocation.  */
   BFD_ARELOC_BFIN_ADDR,
 
+/* BPF relocations  */
+  BFD_RELOC_BPF_WDISP16,
+
 /* Mitsubishi D10V relocs.
 This is a 10-bit reloc with the right 2 bits
 assumed to be 0.  */
diff --git a/bfd/config.bfd b/bfd/config.bfd
index 151de95..0cbccae 100644
--- a/bfd/config.bfd
+++ b/bfd/config.bfd
@@ -161,6 +161,7 @@ am33_2.0*)	 targ_archs=bfd_mn10300_arch ;;
 arc*)		 targ_archs=bfd_arc_arch ;;
 arm*)		 targ_archs=bfd_arm_arch ;;
 bfin*)		 targ_archs=bfd_bfin_arch ;;
+bpf*)		 targ_archs=bfd_bpf_arch ;;
 c30*)		 targ_archs=bfd_tic30_arch ;;
 c4x*)		 targ_archs=bfd_tic4x_arch ;;
 c54x*)		 targ_archs=bfd_tic54x_arch ;;
@@ -471,6 +472,10 @@ case "${targ}" in
     targ_underscore=yes
     ;;
 
+  bpf-*-*)
+    targ_defvec=bpf_elf64_vec
+    ;;
+
   c30-*-*aout* | tic30-*-*aout*)
     targ_defvec=tic30_aout_vec
     ;;
diff --git a/bfd/configure b/bfd/configure
index 24e3e2f..d1a67bb 100755
--- a/bfd/configure
+++ b/bfd/configure
@@ -14298,6 +14298,7 @@ do
     avr_elf32_vec)		 tb="$tb elf32-avr.lo elf32.lo $elf" ;;
     bfin_elf32_vec)		 tb="$tb elf32-bfin.lo elf32.lo $elf" ;;
     bfin_elf32_fdpic_vec)	 tb="$tb elf32-bfin.lo elf32.lo $elf" ;;
+    bpf_elf64_vec)		 tb="$tb elf64-bpf.lo elf64.lo $elf" ;;
     bout_be_vec)		 tb="$tb bout.lo aout32.lo" ;;
     bout_le_vec)		 tb="$tb bout.lo aout32.lo" ;;
     cr16_elf32_vec)		 tb="$tb elf32-cr16.lo elf32.lo $elf" ;;
diff --git a/bfd/configure.ac b/bfd/configure.ac
index e568847..00c6690 100644
--- a/bfd/configure.ac
+++ b/bfd/configure.ac
@@ -429,6 +429,7 @@ do
     avr_elf32_vec)		 tb="$tb elf32-avr.lo elf32.lo $elf" ;;
     bfin_elf32_vec)		 tb="$tb elf32-bfin.lo elf32.lo $elf" ;;
     bfin_elf32_fdpic_vec)	 tb="$tb elf32-bfin.lo elf32.lo $elf" ;;
+    bpf_elf64_vec)		 tb="$tb elf64-bpf.lo elf64.lo $elf" ;;
     bout_be_vec)		 tb="$tb bout.lo aout32.lo" ;;
     bout_le_vec)		 tb="$tb bout.lo aout32.lo" ;;
     cr16_elf32_vec)		 tb="$tb elf32-cr16.lo elf32.lo $elf" ;;
diff --git a/bfd/cpu-bpf.c b/bfd/cpu-bpf.c
new file mode 100644
index 0000000..551e42e
--- /dev/null
+++ b/bfd/cpu-bpf.c
@@ -0,0 +1,41 @@
+/* BFD Support for the eBPF.
+
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   This file is part of BFD, the Binary File Descriptor library.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
+
+#include "sysdep.h"
+#include "bfd.h"
+#include "libbfd.h"
+
+const bfd_arch_info_type bfd_bpf_arch =
+  {
+    64,     		/* Bits in a word.  */
+    64,  		/* Bits in an address.  */
+    8,     		/* Bits in a byte.  */
+    bfd_arch_bpf,
+    0,                	/* Only one machine.  */
+    "bpf",        	/* Arch name.  */
+    "bpf",        	/* Arch printable name.  */
+    3,                	/* Section align power.  */
+    TRUE,             	/* The one and only.  */
+    bfd_default_compatible,
+    bfd_default_scan,
+    bfd_arch_default_fill,
+    0,
+  };
diff --git a/bfd/elf64-bpf.c b/bfd/elf64-bpf.c
new file mode 100644
index 0000000..76f14e7
--- /dev/null
+++ b/bfd/elf64-bpf.c
@@ -0,0 +1,47 @@
+#include "sysdep.h"
+#include "bfd.h"
+#include "libbfd.h"
+#include "elf-bfd.h"
+#include "opcode/bpf.h"
+
+static void
+check_for_relocs (bfd * abfd, asection * o, void * failed)
+{
+  if ((o->flags & SEC_RELOC) != 0)
+    {
+      Elf_Internal_Ehdr *ehdrp;
+
+      ehdrp = elf_elfheader (abfd);
+      /* xgettext:c-format */
+      _bfd_error_handler (_("%B: Relocations in generic ELF (EM: %d)"),
+			  abfd, ehdrp->e_machine);
+
+      bfd_set_error (bfd_error_wrong_format);
+      * (bfd_boolean *) failed = TRUE;
+    }
+}
+
+static bfd_boolean
+elf64_generic_link_add_symbols (bfd *abfd, struct bfd_link_info *info)
+{
+  bfd_boolean failed = FALSE;
+
+  /* Check if there are any relocations.  */
+  bfd_map_over_sections (abfd, check_for_relocs, & failed);
+
+  if (failed)
+    return FALSE;
+  return bfd_elf_link_add_symbols (abfd, info);
+}
+
+#define TARGET_BIG_SYM		bpf_elf64_vec
+#define TARGET_BIG_NAME		"elf64-bpf"
+#define ELF_ARCH		bfd_arch_bpf
+#define ELF_MAXPAGESIZE		0x100000
+#define ELF_MACHINE_CODE	EM_BPF
+
+#define bfd_elf64_bfd_reloc_type_lookup bfd_default_reloc_type_lookup
+#define bfd_elf64_bfd_reloc_name_lookup _bfd_norelocs_bfd_reloc_name_lookup
+#define bfd_elf64_bfd_link_add_symbols	elf64_generic_link_add_symbols
+
+#include "elf64-target.h"
diff --git a/bfd/libbfd.h b/bfd/libbfd.h
index 8bac650..01c6d84 100644
--- a/bfd/libbfd.h
+++ b/bfd/libbfd.h
@@ -1794,6 +1794,7 @@ static const char *const bfd_reloc_code_real_names[] = { "@@uninitialized@@",
   "BFD_ARELOC_BFIN_PAGE",
   "BFD_ARELOC_BFIN_HWPAGE",
   "BFD_ARELOC_BFIN_ADDR",
+  "BFD_RELOC_BPF_WDISP16",
   "BFD_RELOC_D10V_10_PCREL_R",
   "BFD_RELOC_D10V_10_PCREL_L",
   "BFD_RELOC_D10V_18",
diff --git a/bfd/reloc.c b/bfd/reloc.c
index 9a04022..39dc3b2 100644
--- a/bfd/reloc.c
+++ b/bfd/reloc.c
@@ -3854,6 +3854,11 @@ ENUMDOC
   ADI Blackfin arithmetic relocation.
 
 ENUM
+  BFD_RELOC_BPF_WDISP16
+ENUMDOC
+  BPF relocations
+
+ENUM
   BFD_RELOC_D10V_10_PCREL_R
 ENUMDOC
   Mitsubishi D10V relocs.
diff --git a/bfd/targets.c b/bfd/targets.c
index 5841e8d..799e2bb 100644
--- a/bfd/targets.c
+++ b/bfd/targets.c
@@ -619,6 +619,7 @@ extern const bfd_target arm_pei_wince_le_vec;
 extern const bfd_target avr_elf32_vec;
 extern const bfd_target bfin_elf32_vec;
 extern const bfd_target bfin_elf32_fdpic_vec;
+extern const bfd_target bpf_elf64_vec;
 extern const bfd_target bout_be_vec;
 extern const bfd_target bout_le_vec;
 extern const bfd_target cr16_elf32_vec;
@@ -1029,6 +1030,8 @@ static const bfd_target * const _bfd_target_vector[] =
 	&bfin_elf32_vec,
 	&bfin_elf32_fdpic_vec,
 
+	&bpf_elf64_vec,
+
 	&bout_be_vec,
 	&bout_le_vec,
 
diff --git a/config.sub b/config.sub
index 40ea5df..942989e 100755
--- a/config.sub
+++ b/config.sub
@@ -2,7 +2,7 @@
 # Configuration validation subroutine script.
 #   Copyright 1992-2017 Free Software Foundation, Inc.
 
-timestamp='2017-04-02'
+timestamp='2017-04-25'
 
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -257,6 +257,7 @@ case $basic_machine in
 	| ba \
 	| be32 | be64 \
 	| bfin \
+	| bpf \
 	| c4x | c8051 | clipper \
 	| d10v | d30v | dlx | dsp16xx \
 	| e2k | epiphany \
@@ -380,7 +381,7 @@ case $basic_machine in
 	| avr-* | avr32-* \
 	| ba-* \
 	| be32-* | be64-* \
-	| bfin-* | bs2000-* \
+	| bfin-* | bpf-* | bs2000-* \
 	| c[123]* | c30-* | [cjt]90-* | c4x-* \
 	| c8051-* | clipper-* | craynv-* | cydra-* \
 	| d10v-* | d30v-* | dlx-* \
diff --git a/gas/Makefile.am b/gas/Makefile.am
index c9f9de0..bfd6ed9 100644
--- a/gas/Makefile.am
+++ b/gas/Makefile.am
@@ -135,6 +135,7 @@ TARGET_CPU_CFILES = \
 	config/tc-arm.c \
 	config/tc-avr.c \
 	config/tc-bfin.c \
+	config/tc-bpf.c \
 	config/tc-cr16.c \
 	config/tc-cris.c \
 	config/tc-crx.c \
@@ -212,6 +213,7 @@ TARGET_CPU_HFILES = \
 	config/tc-arm.h \
 	config/tc-avr.h \
 	config/tc-bfin.h \
+	config/tc-bpf.h \
 	config/tc-cr16.h \
 	config/tc-cris.h \
 	config/tc-crx.h \
diff --git a/gas/Makefile.in b/gas/Makefile.in
index 1927de5..ee62f1a 100644
--- a/gas/Makefile.in
+++ b/gas/Makefile.in
@@ -431,6 +431,7 @@ TARGET_CPU_CFILES = \
 	config/tc-arm.c \
 	config/tc-avr.c \
 	config/tc-bfin.c \
+	config/tc-bpf.c \
 	config/tc-cr16.c \
 	config/tc-cris.c \
 	config/tc-crx.c \
@@ -508,6 +509,7 @@ TARGET_CPU_HFILES = \
 	config/tc-arm.h \
 	config/tc-avr.h \
 	config/tc-bfin.h \
+	config/tc-bpf.h \
 	config/tc-cr16.h \
 	config/tc-cris.h \
 	config/tc-crx.h \
@@ -868,6 +870,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tc-arm.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tc-avr.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tc-bfin.Po@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tc-bpf.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tc-cr16.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tc-cris.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/tc-crx.Po@am__quote@
@@ -1045,6 +1048,20 @@ tc-bfin.obj: config/tc-bfin.c
 @AMDEP_TRUE@@am__fastdepCC_FALSE@	DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@
 @am__fastdepCC_FALSE@	$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o tc-bfin.obj `if test -f 'config/tc-bfin.c'; then $(CYGPATH_W) 'config/tc-bfin.c'; else $(CYGPATH_W) '$(srcdir)/config/tc-bfin.c'; fi`
 
+tc-bpf.o: config/tc-bpf.c
+@am__fastdepCC_TRUE@	$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -MT tc-bpf.o -MD -MP -MF $(DEPDIR)/tc-bpf.Tpo -c -o tc-bpf.o `test -f 'config/tc-bpf.c' || echo '$(srcdir)/'`config/tc-bpf.c
+@am__fastdepCC_TRUE@	$(am__mv) $(DEPDIR)/tc-bpf.Tpo $(DEPDIR)/tc-bpf.Po
+@AMDEP_TRUE@@am__fastdepCC_FALSE@	source='config/tc-bpf.c' object='tc-bpf.o' libtool=no @AMDEPBACKSLASH@
+@AMDEP_TRUE@@am__fastdepCC_FALSE@	DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@
+@am__fastdepCC_FALSE@	$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o tc-bpf.o `test -f 'config/tc-bpf.c' || echo '$(srcdir)/'`config/tc-bpf.c
+
+tc-bpf.obj: config/tc-bpf.c
+@am__fastdepCC_TRUE@	$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -MT tc-bpf.obj -MD -MP -MF $(DEPDIR)/tc-bpf.Tpo -c -o tc-bpf.obj `if test -f 'config/tc-bpf.c'; then $(CYGPATH_W) 'config/tc-bpf.c'; else $(CYGPATH_W) '$(srcdir)/config/tc-bpf.c'; fi`
+@am__fastdepCC_TRUE@	$(am__mv) $(DEPDIR)/tc-bpf.Tpo $(DEPDIR)/tc-bpf.Po
+@AMDEP_TRUE@@am__fastdepCC_FALSE@	source='config/tc-bpf.c' object='tc-bpf.obj' libtool=no @AMDEPBACKSLASH@
+@AMDEP_TRUE@@am__fastdepCC_FALSE@	DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@
+@am__fastdepCC_FALSE@	$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o tc-bpf.obj `if test -f 'config/tc-bpf.c'; then $(CYGPATH_W) 'config/tc-bpf.c'; else $(CYGPATH_W) '$(srcdir)/config/tc-bpf.c'; fi`
+
 tc-cr16.o: config/tc-cr16.c
 @am__fastdepCC_TRUE@	$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -MT tc-cr16.o -MD -MP -MF $(DEPDIR)/tc-cr16.Tpo -c -o tc-cr16.o `test -f 'config/tc-cr16.c' || echo '$(srcdir)/'`config/tc-cr16.c
 @am__fastdepCC_TRUE@	$(am__mv) $(DEPDIR)/tc-cr16.Tpo $(DEPDIR)/tc-cr16.Po
diff --git a/gas/config/tc-bpf.c b/gas/config/tc-bpf.c
new file mode 100644
index 0000000..334e228
--- /dev/null
+++ b/gas/config/tc-bpf.c
@@ -0,0 +1,447 @@
+/* tc-bpf.c -- Assemble for BPF
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of GAS, the GNU Assembler.
+
+   GAS is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GAS is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public
+   License along with GAS; see the file COPYING.  If not, write
+   to the Free Software Foundation, 51 Franklin Street - Fifth Floor,
+   Boston, MA 02110-1301, USA.  */
+
+#include "as.h"
+#include "safe-ctype.h"
+#include "subsegs.h"
+#include "opcode/bpf.h"
+#ifdef OBJ_ELF
+#include "elf/bpf.h"
+#include "dwarf2dbg.h"
+#endif
+
+const pseudo_typeS md_pseudo_table[] =
+{
+  {"align", s_align_bytes, 0},	/* Defaulting is invalid (0).  */
+  {"global", s_globl, 0},
+  {"half", cons, 2},
+  {"skip", s_space, 0},
+  {"word", cons, 4},
+  {"xword", cons, 8},
+  {NULL, 0, 0},
+};
+
+const char comment_chars[] = "!";
+const char line_comment_chars[] = "#";
+const char line_separator_chars[] = ";";
+const char EXP_CHARS[] = "eE";
+const char FLT_CHARS[] = "rRsSfFdDxXpP";
+
+const char *md_shortopts = "";
+struct option md_longopts[] =
+{
+  { NULL,		no_argument,		NULL, 0                 },
+};
+size_t md_longopts_size = sizeof (md_longopts);
+
+int
+md_parse_option (int c ATTRIBUTE_UNUSED, const char *arg ATTRIBUTE_UNUSED)
+{
+  return 0;
+}
+
+void
+md_show_usage (FILE *stream)
+{
+  fprintf (stream, _("BPF options:\n"));
+}
+
+/* Handle of the OPCODE hash table.  */
+static struct hash_control *op_hash;
+
+void
+md_begin (void)
+{
+  const char *retval = NULL;
+  unsigned int i = 0;
+  int lose = 0;
+
+  op_hash = hash_new ();
+  while (i < (unsigned int) bpf_num_opcodes)
+    {
+      const char *name = bpf_opcodes[i].name;
+      retval = hash_insert (op_hash, name, (void *) &bpf_opcodes[i]);
+      if (retval != NULL)
+	{
+	  as_bad (_("Internal error: can't hash `%s': %s\n"),
+		  bpf_opcodes[i].name, retval);
+	  lose = 1;
+	}
+      do
+	{
+	  ++i;
+	}
+      while (i < (unsigned int) bpf_num_opcodes
+	     && !strcmp (bpf_opcodes[i].name, name));
+    }
+  if (lose)
+    as_fatal (_("Broken assembler.  No assembly attempted."));
+
+
+}
+
+struct bpf_it
+  {
+    const char *error;
+    valueT opcode;
+    expressionS exp;
+    int pcrel;
+    bfd_reloc_code_real_type reloc;
+  };
+
+/* Subroutine of md_assemble to output one insn.  */
+
+static void
+output_insn (struct bpf_it *theinsn)
+{
+  char *toP = frag_more (8);
+
+  /* Put out the opcode.  */
+  if (target_big_endian)
+    {
+      number_to_chars_bigendian (toP, theinsn->opcode, 8);
+    }
+  else
+    {
+      number_to_chars_littleendian (toP, theinsn->opcode, 8);
+    }
+
+  /* Put out the symbol-dependent stuff.  */
+  if (theinsn->reloc != BFD_RELOC_NONE)
+    {
+      fixS *fixP =  fix_new_exp (frag_now,	/* Which frag.  */
+				 (toP - frag_now->fr_literal),	/* Where.  */
+				 4,		/* Size.  */
+				 &theinsn->exp,
+				 theinsn->pcrel,
+				 theinsn->reloc);
+      /* Turn off overflow checking in fixup_segment.  We'll do our
+	 own overflow checking in md_apply_fix.  This is necessary because
+	 the insn size is 4 and fixup_segment will signal an overflow for
+	 large 8 byte quantities.  */
+      fixP->fx_no_overflow = 1;
+    }
+
+#ifdef OBJ_ELF
+  dwarf2_emit_insn (8);
+#endif
+}
+
+static struct bpf_it the_insn;
+static char *expr_end;
+
+static int
+get_expression (char *str, expressionS *exp)
+{
+  char *save_in;
+  segT seg;
+
+  save_in = input_line_pointer;
+  input_line_pointer = str;
+  seg = expression (exp);
+  if (seg != absolute_section
+      && seg != text_section
+      && seg != data_section
+      && seg != bss_section
+      && seg != undefined_section)
+    {
+      the_insn.error = _("bad segment");
+      expr_end = input_line_pointer;
+      input_line_pointer = save_in;
+      return 1;
+    }
+  expr_end = input_line_pointer;
+  input_line_pointer = save_in;
+  return 0;
+}
+
+void
+md_assemble (char *str)
+{
+  const struct bpf_opcode *insn;
+  const char *args;
+  char *argsStart;
+  int match = 0;
+  valueT mask;
+  char *s, c;
+
+  s = str;
+  if (ISLOWER (*s))
+    {
+      do
+	++s;
+      while (ISLOWER (*s) || ISDIGIT (*s) || *s == '_');
+    }
+
+  switch (*s)
+    {
+    case '\0':
+      break;
+
+    case ' ':
+      *s++ = '\0';
+      break;
+
+    default:
+      as_bad (_("Unknown opcode: `%s'"), str);
+      return;
+    }
+  insn = (struct bpf_opcode *) hash_find (op_hash, str);
+
+  if (insn == NULL)
+    {
+      as_bad (_("Unknown opcode: `%s'"), str);
+      return;
+    }
+
+  argsStart = s;
+  for (;;)
+    {
+      memset (&the_insn, '\0', sizeof (the_insn));
+      the_insn.reloc = BFD_RELOC_NONE;
+      the_insn.opcode = ((valueT) insn->code << 56);
+
+      for (args = insn->args;; args++)
+	{
+	  switch (*args)
+	    {
+	    case '+':
+	    case ',':
+	    case '[':
+	    case ']':
+	      if (*s++ == *args)
+		continue;
+	      break;
+	    case '1':
+	      if (*s++ == 'r')
+		{
+		  if (!ISDIGIT ((c = *s++)))
+		    {
+		      goto error;
+		    }
+		  c -= '0';
+		  mask = c;
+		  the_insn.opcode |= (mask << 52);
+		  continue;
+		}
+	      break;
+	    case '2':
+	      if (*s++ == 'r')
+		{
+		  if (!ISDIGIT ((c = *s++)))
+		    {
+		      goto error;
+		    }
+		  c -= '0';
+		  mask = c;
+		  the_insn.opcode |= (mask << 48);
+		  continue;
+		}
+	      break;
+	    case 'i':
+	      the_insn.reloc = BFD_RELOC_32;
+	      if (*s == ' ')
+		s++;
+	      get_expression (s, &the_insn.exp);
+	      s = expr_end;
+	      if (the_insn.exp.X_op == O_constant
+		  && the_insn.exp.X_add_symbol == 0
+		  && the_insn.exp.X_op_symbol == 0)
+		{
+		  valueT val = the_insn.exp.X_add_number;
+
+		  the_insn.reloc = BFD_RELOC_NONE;
+		  val &= 0xffffffff;
+		  the_insn.opcode |= val;
+		}
+	      continue;
+	    case 'O':
+	      the_insn.reloc = BFD_RELOC_16;
+	      if (*s == ' ')
+		s++;
+	      get_expression (s, &the_insn.exp);
+	      s = expr_end;
+	      if (the_insn.exp.X_op == O_constant
+		  && the_insn.exp.X_add_symbol == 0
+		  && the_insn.exp.X_op_symbol == 0)
+		{
+		  valueT val = the_insn.exp.X_add_number;
+
+		  the_insn.reloc = BFD_RELOC_NONE;
+		  val &= 0xffff;
+		  the_insn.opcode |= val << 32;
+		}
+	      continue;
+	    case 'L':
+	      the_insn.reloc = BFD_RELOC_BPF_WDISP16;
+	      the_insn.pcrel = 1;
+	      if (*s == ' ')
+		s++;
+	      get_expression (s, &the_insn.exp);
+	      s = expr_end;
+	      if (the_insn.exp.X_op == O_constant
+		  && the_insn.exp.X_add_symbol == 0
+		  && the_insn.exp.X_op_symbol == 0)
+		{
+		  valueT val = the_insn.exp.X_add_number;
+
+		  the_insn.reloc = BFD_RELOC_NONE;
+		  val &= 0xffff;
+		  the_insn.opcode |= val << 32;
+		}
+	      continue;
+	    case 'C':
+	      break;
+	    case 'D':
+	      break;
+	    case '\0':		/* End of args.  */
+	      match = 1;
+	      break;
+	    default:
+	      as_fatal (_("failed sanity check."));
+	    }
+
+	  /* Break out of for() loop.  */
+	  break;
+	}
+    error:
+      if (match == 0)
+	{
+	  /* Args don't match.  */
+	  if (&insn[1] - bpf_opcodes < bpf_num_opcodes
+	      && (insn->name == insn[1].name
+		  || !strcmp (insn->name, insn[1].name)))
+	    {
+	      ++insn;
+	      s = argsStart;
+	      continue;
+	    }
+	  else
+	    {
+	      as_bad (_("Illegal operands%s"), "");
+	      return;
+	    }
+	}
+      break;
+    }
+
+  output_insn (&the_insn);
+}
+
+void
+md_number_to_chars (char *buf ATTRIBUTE_UNUSED, valueT val ATTRIBUTE_UNUSED, int n ATTRIBUTE_UNUSED)
+{
+}
+
+void
+md_apply_fix (fixS *fixP, valueT *valP, segT segment ATTRIBUTE_UNUSED)
+{
+  char *buf = fixP->fx_where + fixP->fx_frag->fr_literal;
+  offsetT val = * (offsetT *) valP;
+
+  gas_assert (fixP->fx_r_type < BFD_RELOC_UNUSED);
+  /* If this is a data relocation, just output VAL.  */
+
+  if (fixP->fx_r_type == BFD_RELOC_8)
+    {
+      md_number_to_chars (buf, val, 1);
+    }
+  else if (fixP->fx_r_type == BFD_RELOC_16)
+    {
+      md_number_to_chars (buf, val, 2);
+    }
+  else if (fixP->fx_r_type == BFD_RELOC_32)
+    {
+      md_number_to_chars (buf, val, 4);
+    }
+  else if (fixP->fx_r_type == BFD_RELOC_64)
+    {
+      md_number_to_chars (buf, val, 8);
+    }
+  else if (fixP->fx_r_type == BFD_RELOC_VTABLE_INHERIT
+           || fixP->fx_r_type == BFD_RELOC_VTABLE_ENTRY)
+    {
+      fixP->fx_done = 0;
+      return;
+    }
+  else
+    {
+      long insn;
+
+      if (target_big_endian)
+	insn = bfd_getb32 ((unsigned char *) buf);
+      else
+	insn = bfd_getl32 ((unsigned char *) buf);
+
+      /* It's a relocation against an instruction.  */
+
+      switch (fixP->fx_r_type)
+	{
+	case BFD_RELOC_BPF_WDISP16:
+	  val = val >> 3;
+	  insn |= (val + 1) & 0xffff;
+	  break;
+	case BFD_RELOC_NONE:
+	default:
+	  as_bad_where (fixP->fx_file, fixP->fx_line,
+			_("bad or unhandled relocation type: 0x%02x"),
+			fixP->fx_r_type);
+	  break;
+	}
+
+      if (target_big_endian)
+	bfd_putb32 (insn, (unsigned char *) buf);
+      else
+	bfd_putl32 (insn, (unsigned char *) buf);
+    }
+}
+
+arelent *
+tc_gen_reloc (asection *section ATTRIBUTE_UNUSED, fixS *fixp ATTRIBUTE_UNUSED)
+{
+  return NULL;
+}
+
+symbolS *
+md_undefined_symbol (char *name ATTRIBUTE_UNUSED)
+{
+  return 0;
+}
+
+valueT
+md_section_align (segT segment ATTRIBUTE_UNUSED, valueT size)
+{
+  return size;
+}
+
+long
+md_pcrel_from (fixS *fixP)
+{
+  long ret;
+
+  ret = fixP->fx_where + fixP->fx_frag->fr_address;
+  /* XXX */
+  return ret;
+}
+
+const char *
+md_atof (int type, char *litP, int *sizeP)
+{
+  return ieee_md_atof (type, litP, sizeP, target_big_endian);
+}
diff --git a/gas/config/tc-bpf.h b/gas/config/tc-bpf.h
new file mode 100644
index 0000000..013e5ed
--- /dev/null
+++ b/gas/config/tc-bpf.h
@@ -0,0 +1,38 @@
+/* tc-bpf.h - Macros and type defines for the bpf.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   This file is part of GAS, the GNU Assembler.
+
+   GAS is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as
+   published by the Free Software Foundation; either version 3,
+   or (at your option) any later version.
+
+   GAS is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
+   the GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public
+   License along with GAS; see the file COPYING.  If not, write
+   to the Free Software Foundation, 51 Franklin Street - Fifth Floor,
+   Boston, MA 02110-1301, USA.  */
+
+#ifndef TC_BPF
+#define TC_BPF 1
+
+#define TARGET_ARCH			bfd_arch_bpf
+#define TARGET_FORMAT			"elf64-bpf"
+#define TARGET_BYTES_BIG_ENDIAN		1
+
+#define md_convert_frag(b,s,f) \
+  as_fatal (_("bpf convert_frag\n"))
+#define md_estimate_size_before_relax(f,s) \
+  (as_fatal (_("estimate_size_before_relax called")), 1)
+#define md_operand(x)
+
+#define LISTING_HEADER "BPF GAS "
+
+#define WORKING_DOT_WORD
+
+#endif
diff --git a/gas/configure.tgt b/gas/configure.tgt
index ca58b69..fa959c3 100644
--- a/gas/configure.tgt
+++ b/gas/configure.tgt
@@ -54,6 +54,7 @@ case ${cpu} in
   arm*be|arm*b)		cpu_type=arm endian=big ;;
   arm*)			cpu_type=arm endian=little ;;
   bfin*)		cpu_type=bfin endian=little ;;
+  bpf*)			cpu_type=bpf ;;
   c4x*)			cpu_type=tic4x ;;
   cr16*)		cpu_type=cr16 endian=little ;;
   crisv32)		cpu_type=cris arch=crisv32 ;;
@@ -171,6 +172,8 @@ case ${generic_target} in
   bfin-*-uclinux*)			fmt=elf em=linux ;;
   bfin-*elf)				fmt=elf ;;
 
+  bpf-*elf)				fmt=elf ;;
+
   cr16-*-elf*)				fmt=elf ;;
 
   cris-*-linux-* | crisv32-*-linux-*)
diff --git a/gdb/bpf-tdep.c b/gdb/bpf-tdep.c
new file mode 100644
index 0000000..6629f73
--- /dev/null
+++ b/gdb/bpf-tdep.c
@@ -0,0 +1,229 @@
+/* Target-dependent code for eBPF, for GDB.
+
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "inferior.h"
+#include "gdbcore.h"
+#include "arch-utils.h"
+#include "regcache.h"
+#include "frame.h"
+#include "frame-unwind.h"
+#include "frame-base.h"
+#include "trad-frame.h"
+#include "dis-asm.h"
+#include "dwarf2-frame.h"
+#include "symtab.h"
+#include "elf-bfd.h"
+#include "osabi.h"
+#include "infcall.h"
+#include "bpf-tdep.h"
+
+static const char * const bpf_register_name_strings[] =
+{
+  "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+  "r8", "r9", "r10", "pc",
+};
+
+#define NUM_BPF_REGNAMES ARRAY_SIZE (bpf_register_name_strings)
+
+/* Return the BPF register name corresponding to register I.  */
+
+static const char *
+bpf_register_name (struct gdbarch *gdbarch, int i)
+{
+  return bpf_register_name_strings[i];
+}
+
+/* Return the GDB type object for the "standard" data type of data in
+   register REGNUM.  */
+
+static struct type *
+bpf_register_type (struct gdbarch *gdbarch, int regnum)
+{
+  if (regnum == BPF_R10_REGNUM)
+    return builtin_type (gdbarch)->builtin_data_ptr;
+
+  if (regnum == BPF_PC_REGNUM)
+    return builtin_type (gdbarch)->builtin_func_ptr;
+
+  return builtin_type (gdbarch)->builtin_int64;
+}
+
+/* Convert DWARF2 register number REG to the appropriate register number
+   used by GDB.  */
+
+static int
+bpf_reg_to_regnum (struct gdbarch *gdbarch, int reg)
+{
+  if (reg < 0 || reg >= BPF_NUM_REGS)
+    return -1;
+
+  return reg;
+}
+
+static struct frame_id
+bpf_dummy_id (struct gdbarch *gdbarch, struct frame_info *this_frame)
+{
+  CORE_ADDR sp;
+
+  sp = get_frame_register_unsigned (this_frame, BPF_R10_REGNUM);
+
+  return frame_id_build (sp, get_frame_pc (this_frame));
+}
+
+static CORE_ADDR
+bpf_push_dummy_call (struct gdbarch *gdbarch,
+		      struct value *function,
+		      struct regcache *regcache,
+		      CORE_ADDR bp_addr,
+		      int nargs,
+		      struct value **args,
+		      CORE_ADDR sp,
+		      int struct_return,
+		      CORE_ADDR struct_addr)
+{
+  return sp; /* XXX */
+}
+
+/* Extract a function return value of TYPE from REGCACHE, and copy
+   that into VALBUF.  */
+
+static void
+bpf_extract_return_value (struct type *type, struct regcache *regcache,
+			  gdb_byte *valbuf)
+{
+  int len = TYPE_LENGTH (type);
+  gdb_byte buf[8];
+
+  regcache_cooked_read (regcache, BPF_R0_REGNUM, buf);
+  memcpy (valbuf, buf + 8 - len, len);
+}
+
+/* Store the function return value of type TYPE from VALBUF into
+   REGCACHE.  */
+
+static void
+bpf_store_return_value (struct type *type, struct regcache *regcache,
+			const gdb_byte *valbuf)
+{
+  int len = TYPE_LENGTH (type);
+  gdb_byte buf[8];
+
+  memcpy (buf + 8 - len, valbuf, len);
+  regcache_cooked_write (regcache, BPF_R0_REGNUM, buf);
+}
+
+/* Determine, for architecture GDBARCH, how a return value of TYPE
+   should be returned.  If it is supposed to be returned in registers,
+   and READBUF is nonzero, read the appropriate value from REGCACHE,
+   and copy it into READBUF.  If WRITEBUF is nonzero, write the value
+   from WRITEBUF into REGCACHE.  */
+
+static enum return_value_convention
+bpf_return_value (struct gdbarch *gdbarch,
+		   struct value *function,
+		   struct type *type,
+		   struct regcache *regcache,
+		   gdb_byte *readbuf,
+		   const gdb_byte *writebuf)
+{
+  if (TYPE_LENGTH (type) > 8)
+    return RETURN_VALUE_STRUCT_CONVENTION;
+
+  if (readbuf)
+    bpf_extract_return_value (type, regcache, readbuf);
+
+  if (writebuf)
+    bpf_store_return_value (type, regcache, writebuf);
+
+  return RETURN_VALUE_REGISTER_CONVENTION;
+}
+
+static CORE_ADDR
+bpf_unwind_pc (struct gdbarch *gdbarch, struct frame_info *next_frame)
+{
+  return frame_unwind_register_unsigned (next_frame, BPF_PC_REGNUM);
+}
+
+/* Skip all the insns that appear in generated function prologues.  */
+
+static CORE_ADDR
+bpf_skip_prologue (struct gdbarch *gdbarch, CORE_ADDR pc)
+{
+  return pc;
+}
+
+/* Implement the breakpoint_kind_from_pc gdbarch method.  */
+
+static int
+bpf_breakpoint_kind_from_pc (struct gdbarch *gdbarch, CORE_ADDR *pcptr)
+{
+  return 8;
+}
+
+/* Initialize the current architecture based on INFO.  If possible,
+   re-use an architecture from ARCHES, which is a list of
+   architectures already created during this debugging session.
+
+   Called e.g. at program startup, when reading a core file, and when
+   reading a binary file.  */
+
+static struct gdbarch *
+bpf_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
+{
+  struct gdbarch_tdep *tdep;
+  struct gdbarch *gdbarch;
+
+  tdep = XNEW (struct gdbarch_tdep);
+  gdbarch = gdbarch_alloc (&info, tdep);
+
+  tdep->xxx = 0;
+
+  set_gdbarch_num_regs (gdbarch, BPF_NUM_REGS);
+  set_gdbarch_sp_regnum (gdbarch, BPF_R10_REGNUM);
+  set_gdbarch_pc_regnum (gdbarch, BPF_PC_REGNUM);
+  set_gdbarch_dwarf2_reg_to_regnum (gdbarch, bpf_reg_to_regnum);
+  set_gdbarch_register_name (gdbarch, bpf_register_name);
+  set_gdbarch_register_type (gdbarch, bpf_register_type);
+  set_gdbarch_dummy_id (gdbarch, bpf_dummy_id);
+  set_gdbarch_push_dummy_call (gdbarch, bpf_push_dummy_call);
+  set_gdbarch_return_value (gdbarch, bpf_return_value);
+  set_gdbarch_inner_than (gdbarch, core_addr_lessthan);
+  set_gdbarch_frame_args_skip (gdbarch, 8);
+  set_gdbarch_unwind_pc (gdbarch, bpf_unwind_pc);
+  set_gdbarch_print_insn (gdbarch, print_insn_bpf);
+
+  set_gdbarch_skip_prologue (gdbarch, bpf_skip_prologue);
+  set_gdbarch_breakpoint_kind_from_pc (gdbarch, bpf_breakpoint_kind_from_pc);
+
+  /* Hook in ABI-specific overrides, if they have been registered.  */
+  gdbarch_init_osabi (info, gdbarch);
+
+  dwarf2_append_unwinders (gdbarch);
+  return gdbarch;
+}
+
+/* Provide a prototype to silence -Wmissing-prototypes.  */
+extern initialize_file_ftype _initialize_bpf_tdep;
+
+void
+_initialize_bpf_tdep (void)
+{
+  register_gdbarch_init (bfd_arch_bpf, bpf_gdbarch_init);
+}
diff --git a/gdb/bpf-tdep.h b/gdb/bpf-tdep.h
new file mode 100644
index 0000000..52cae6d
--- /dev/null
+++ b/gdb/bpf-tdep.h
@@ -0,0 +1,40 @@
+/* Target-dependent code for eBPF, for GDB.
+
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+enum gdb_regnum {
+  BPF_R0_REGNUM = 0,
+  BPF_R1_REGNUM,
+  BPF_R2_REGNUM,
+  BPF_R3_REGNUM,
+  BPF_R4_REGNUM,
+  BPF_R5_REGNUM,
+  BPF_R6_REGNUM,
+  BPF_R7_REGNUM,
+  BPF_R8_REGNUM,
+  BPF_R9_REGNUM,
+  BPF_R10_REGNUM,
+  BPF_PC_REGNUM,
+};
+
+#define BPF_NUM_REGS	(BPF_PC_REGNUM + 1)
+
+struct gdbarch_tdep
+{
+  int xxx;
+};
diff --git a/gdb/configure.tgt b/gdb/configure.tgt
index fdcb7b1..e8d5fb4 100644
--- a/gdb/configure.tgt
+++ b/gdb/configure.tgt
@@ -142,6 +142,10 @@ bfin-*-*)
 	gdb_sim=../sim/bfin/libsim.a
 	;;
 
+bpf*)
+	# Target: eBPF
+	gdb_target_obs="bpf-tdep.o"
+	;;
 cris*)
 	# Target: CRIS
 	gdb_target_obs="cris-tdep.o cris-linux-tdep.o linux-tdep.o solib-svr4.o"
diff --git a/include/dis-asm.h b/include/dis-asm.h
index 6f1801d..cbfebc8 100644
--- a/include/dis-asm.h
+++ b/include/dis-asm.h
@@ -241,6 +241,7 @@ extern int print_insn_aarch64		(bfd_vma, disassemble_info *);
 extern int print_insn_alpha		(bfd_vma, disassemble_info *);
 extern int print_insn_avr		(bfd_vma, disassemble_info *);
 extern int print_insn_bfin		(bfd_vma, disassemble_info *);
+extern int print_insn_bpf		(bfd_vma, disassemble_info *);
 extern int print_insn_big_arm		(bfd_vma, disassemble_info *);
 extern int print_insn_big_mips		(bfd_vma, disassemble_info *);
 extern int print_insn_big_nios2		(bfd_vma, disassemble_info *);
diff --git a/include/elf/bpf.h b/include/elf/bpf.h
new file mode 100644
index 0000000..6360db8
--- /dev/null
+++ b/include/elf/bpf.h
@@ -0,0 +1,34 @@
+/* BPF ELF support for BFD.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   This file is part of BFD, the Binary File Descriptor library.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston,
+   MA 02110-1301, USA.  */
+
+#ifndef _ELF_BPF_H
+#define _ELF_BPF_H
+
+#include "elf/reloc-macros.h"
+
+/* Relocation types.  */
+START_RELOC_NUMBERS (elf_bpf_reloc_type)
+  RELOC_NUMBER (R_BPF_NONE, 0)
+  RELOC_NUMBER (R_BPF_16, 1)
+  RELOC_NUMBER (R_BPF_32, 2)
+  RELOC_NUMBER (R_BPF_WDISP16, 3)
+END_RELOC_NUMBERS (R_BPF_max)
+
+#endif /* _ELF_BPF_H */
diff --git a/include/opcode/bpf.h b/include/opcode/bpf.h
new file mode 100644
index 0000000..298ed1b
--- /dev/null
+++ b/include/opcode/bpf.h
@@ -0,0 +1,16 @@
+#ifndef OPCODE_BPF_H
+#define OPCODE_BPF_H
+
+/* Structure of an opcode table entry.  */
+
+typedef struct bpf_opcode
+{
+  const char *name;
+  unsigned char code;
+  const char *args;
+} bpf_opcode;
+
+extern const struct bpf_opcode bpf_opcodes[];
+extern const int bpf_num_opcodes;
+
+#endif /* OPCODE_BPF_H */
diff --git a/ld/Makefile.am b/ld/Makefile.am
index 3aa7e80..d840bed 100644
--- a/ld/Makefile.am
+++ b/ld/Makefile.am
@@ -477,6 +477,7 @@ ALL_64_EMULATION_SOURCES = \
 	eelf32ltsmipn32_fbsd.c \
 	eelf32mipswindiss.c \
 	eelf64_aix.c \
+	eelf64_bpf.c \
 	eelf64_ia64.c \
 	eelf64_ia64_fbsd.c \
 	eelf64_ia64_vms.c \
@@ -1920,6 +1921,9 @@ eelf32_x86_64_nacl.c: $(srcdir)/emulparams/elf32_x86_64_nacl.sh \
 eelf64_aix.c: $(srcdir)/emulparams/elf64_aix.sh \
   $(ELF_DEPS) $(srcdir)/scripttempl/elf.sc ${GEN_DEPENDS}
 
+eelf64_bpf.c: $(srcdir)/emulparams/elf64_bpf.sh \
+  $(ELF_DEPS) $(srcdir)/scripttempl/elf.sc ${GEN_DEPENDS}
+
 eelf64_ia64.c: $(srcdir)/emulparams/elf64_ia64.sh \
   $(ELF_DEPS) $(srcdir)/emultempl/ia64elf.em \
   $(srcdir)/emultempl/needrelax.em \
diff --git a/ld/Makefile.in b/ld/Makefile.in
index f485f4f..706a889 100644
--- a/ld/Makefile.in
+++ b/ld/Makefile.in
@@ -845,6 +845,7 @@ ALL_64_EMULATION_SOURCES = \
 	eelf32ltsmipn32_fbsd.c \
 	eelf32mipswindiss.c \
 	eelf64_aix.c \
+	eelf64_bpf.c \
 	eelf64_ia64.c \
 	eelf64_ia64_fbsd.c \
 	eelf64_ia64_vms.c \
@@ -1292,6 +1293,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/eelf32xstormy16.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/eelf32xtensa.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/eelf64_aix.Po@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/eelf64_bpf.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/eelf64_ia64.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/eelf64_ia64_fbsd.Po@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/eelf64_ia64_vms.Po@am__quote@
@@ -3484,6 +3486,9 @@ eelf32_x86_64_nacl.c: $(srcdir)/emulparams/elf32_x86_64_nacl.sh \
 eelf64_aix.c: $(srcdir)/emulparams/elf64_aix.sh \
   $(ELF_DEPS) $(srcdir)/scripttempl/elf.sc ${GEN_DEPENDS}
 
+eelf64_bpf.c: $(srcdir)/emulparams/elf64_bpf.sh \
+  $(ELF_DEPS) $(srcdir)/scripttempl/elf.sc ${GEN_DEPENDS}
+
 eelf64_ia64.c: $(srcdir)/emulparams/elf64_ia64.sh \
   $(ELF_DEPS) $(srcdir)/emultempl/ia64elf.em \
   $(srcdir)/emultempl/needrelax.em \
diff --git a/ld/configure.tgt b/ld/configure.tgt
index 895f0fb..13645f5 100644
--- a/ld/configure.tgt
+++ b/ld/configure.tgt
@@ -177,6 +177,8 @@ bfin-*-linux-uclibc*)	targ_emul=elf32bfinfd;
 			targ_extra_emuls="elf32bfin"
 			targ_extra_libpath=$targ_extra_emuls
 			;;
+bpf-*-elf)		targ_emul=elf64_bpf
+			;;
 cr16-*-elf*)            targ_emul=elf32cr16 ;;
 cr16c-*-elf*)           targ_emul=elf32cr16c
 			;;
diff --git a/ld/emulparams/elf64_bpf.sh b/ld/emulparams/elf64_bpf.sh
new file mode 100644
index 0000000..0e1e549
--- /dev/null
+++ b/ld/emulparams/elf64_bpf.sh
@@ -0,0 +1,8 @@
+# See genscripts.sh and ../scripttempl/elf.sc for the meaning of these.
+SCRIPT_NAME=elf
+ELFSIZE=64
+TEMPLATE_NAME=elf32
+OUTPUT_FORMAT="elf64-bpf"
+TARGET_PAGE_SIZE=0x1000
+ARCH=bpf
+MACHINE=
diff --git a/opcodes/Makefile.am b/opcodes/Makefile.am
index 1ac6bb1..ccc9453 100644
--- a/opcodes/Makefile.am
+++ b/opcodes/Makefile.am
@@ -105,6 +105,8 @@ TARGET_LIBOPCODES_CFILES = \
 	arm-dis.c \
 	avr-dis.c \
 	bfin-dis.c \
+	bpf-dis.c \
+	bpf-opc.c \
 	cgen-asm.c \
 	cgen-bitset.c \
 	cgen-dis.c \
diff --git a/opcodes/bpf-dis.c b/opcodes/bpf-dis.c
new file mode 100644
index 0000000..2a0b7da
--- /dev/null
+++ b/opcodes/bpf-dis.c
@@ -0,0 +1,152 @@
+#include "sysdep.h"
+#include <stdio.h>
+#include "opcode/bpf.h"
+#include "dis-asm.h"
+#include "libiberty.h"
+
+#define HASH_SIZE 256
+#define HASH_INSN(CODE)	(CODE)
+
+typedef struct bpf_opcode_hash
+{
+  struct bpf_opcode_hash *next;
+  const bpf_opcode *opcode;
+} bpf_opcode_hash;
+
+static bpf_opcode_hash *opcode_hash_table[HASH_SIZE];
+
+static void
+build_hash_table (const bpf_opcode *opcode_table,
+		  bpf_opcode_hash **hash_table,
+		  int num_opcodes)
+{
+  static bpf_opcode_hash *hash_buf = NULL;
+  int i;
+
+  memset (hash_table, 0, HASH_SIZE * sizeof (hash_table[0]));
+  if (hash_buf != NULL)
+    free (hash_buf);
+  hash_buf = xmalloc (sizeof (* hash_buf) * num_opcodes);
+  for (i = num_opcodes - 1; i >= 0; --i)
+    {
+      int hash = HASH_INSN (opcode_table[i].code);
+      bpf_opcode_hash *h = &hash_buf[i];
+
+      h->next = hash_table[hash];
+      h->opcode = &opcode_table[i];
+      hash_table[hash] = h;
+    }
+}
+
+int
+print_insn_bpf (bfd_vma memaddr, disassemble_info *info)
+{
+  static unsigned long current_mach = 0;
+  static int opcodes_initialized = 0;
+  bfd_vma (*getword) (const void *);
+  bfd_vma (*gethalf) (const void *);
+  FILE *stream = info->stream;
+  bpf_opcode_hash *op;
+  int code, dest, src;
+  bfd_byte buffer[8];
+  signed short off;
+  int status, ret;
+  signed int imm;
+
+  if (!opcodes_initialized
+      || info->mach != current_mach)
+    {
+      build_hash_table (bpf_opcodes, opcode_hash_table, bpf_num_opcodes);
+      current_mach = info->mach;
+      opcodes_initialized = 1;
+    }
+
+  info->bytes_per_line = 8;
+
+  status = (*info->read_memory_func) (memaddr, buffer, sizeof (buffer), info);
+  if (status != 0)
+    {
+      (*info->memory_error_func) (status, memaddr, info);
+      return -1;
+    }
+
+  if (info->endian == BFD_ENDIAN_BIG)
+    {
+      getword = bfd_getb32;
+      gethalf = bfd_getb16;
+    }
+  else
+    {
+      getword = bfd_getl32;
+      gethalf = bfd_getl16;
+    }
+
+  code = buffer[0];
+  dest = (buffer[1] & 0xf0) >> 4;
+  src = buffer[1] & 0x0f;
+  off = gethalf (&buffer[2]);
+  imm = getword (&buffer[4]);
+
+  ret = sizeof (buffer);
+  for (op = opcode_hash_table[HASH_INSN (code)]; op; op = op->next)
+    {
+      const bpf_opcode *opcode = op->opcode;
+      BFD_HOST_U_64_BIT value;
+      signed int imm2;
+      const char *s;
+
+      if (opcode->code != code)
+	continue;
+
+      (*info->fprintf_func) (stream, "%s\t", opcode->name);
+      for (s = opcode->args; *s != '\0'; s++)
+	{
+	  switch (*s)
+	    {
+	    case '+':
+	    default:
+	      (*info->fprintf_func) (stream, "%c", *s);
+	      break;
+	    case ',':
+	      (*info->fprintf_func) (stream, ", ");
+	      break;
+	    case '1':
+	      (*info->fprintf_func) (stream, "r%d", dest);
+	      break;
+	    case '2':
+	      (*info->fprintf_func) (stream, "r%d", src);
+	      break;
+	    case 'i':
+	      (*info->fprintf_func) (stream, "%d", imm);
+	      break;
+	    case 'O':
+	      (*info->fprintf_func) (stream, "%d", off);
+	      break;
+	    case 'L':
+	      info->target = memaddr + ((off - 1) * 8);
+	      (*info->print_address_func) (info->target, info);
+	      break;
+	    case 'C':
+	      info->target = imm;
+	      (*info->print_address_func) (info->target, info);
+	      break;
+	    case 'D':
+	      status = (*info->read_memory_func) (memaddr + 8, buffer,
+						  sizeof (buffer), info);
+	      if (status != 0)
+		{
+		  (*info->memory_error_func) (status, memaddr + 8, info);
+		  return -1;
+		}
+	      ret += sizeof (buffer);
+	      imm2 = getword (&buffer[4]);
+	      value = ((BFD_HOST_U_64_BIT) (unsigned) imm2) << 32;
+	      value |= (BFD_HOST_U_64_BIT) (unsigned) imm;
+	      (*info->fprintf_func) (stream, "%lld", (long long) value);
+	      break;
+	    }
+	}
+    }
+
+  return ret;
+}
diff --git a/opcodes/bpf-opc.c b/opcodes/bpf-opc.c
new file mode 100644
index 0000000..bca8e47
--- /dev/null
+++ b/opcodes/bpf-opc.c
@@ -0,0 +1,147 @@
+#include "sysdep.h"
+#include <stdio.h>
+#include "opcode/bpf.h"
+
+#define BPF_OPC_ALU64	0x07
+#define BPF_OPC_DW	0x18
+#define BPF_OPC_XADD	0xc0
+#define BPF_OPC_MOV	0xb0
+#define BPF_OPC_ARSH	0xc0
+#define BPF_OPC_END	0xd0
+#define BPF_OPC_TO_LE	0x00
+#define BPF_OPC_TO_BE	0x08
+#define BPF_OPC_JNE	0x50
+#define BPF_OPC_JSGT	0x60
+#define BPF_OPC_JSGE	0x70
+#define BPF_OPC_CALL	0x80
+#define BPF_OPC_EXIT	0x90
+
+#define BPF_OPC_LD	0x00
+#define BPF_OPC_LDX	0x01
+#define BPF_OPC_ST	0x02
+#define BPF_OPC_STX	0x03
+#define BPF_OPC_ALU	0x04
+#define BPF_OPC_JMP	0x05
+#define BPF_OPC_RET	0x06
+#define BPF_OPC_MISC	0x07
+
+#define BPF_OPC_W	0x00
+#define BPF_OPC_H	0x08
+#define BPF_OPC_B	0x10
+
+#define BPF_OPC_IMM	0x00
+#define BPF_OPC_ABS	0x20
+#define BPF_OPC_IND	0x40
+#define BPF_OPC_MEM	0x60
+#define BPF_OPC_LEN	0x80
+#define BPF_OPC_MSH	0xa0
+
+#define BPF_OPC_ADD	0x00
+#define BPF_OPC_SUB	0x10
+#define BPF_OPC_MUL	0x20
+#define BPF_OPC_DIV	0x30
+#define BPF_OPC_OR	0x40
+#define BPF_OPC_AND	0x50
+#define BPF_OPC_LSH	0x60
+#define BPF_OPC_RSH	0x70
+#define BPF_OPC_NEG	0x80
+#define BPF_OPC_MOD	0x90
+#define BPF_OPC_XOR	0xa0
+
+#define BPF_OPC_JA	0x00
+#define BPF_OPC_JEQ	0x10
+#define BPF_OPC_JGT	0x20
+#define BPF_OPC_JGE	0x30
+#define BPF_OPC_JSET	0x40
+
+#define BPF_OPC_K	0x00
+#define BPF_OPC_X	0x08
+
+const struct bpf_opcode bpf_opcodes[] = {
+  { "mov32",   BPF_OPC_ALU   | BPF_OPC_MOV  | BPF_OPC_X,     "1,2" },
+  { "mov32",   BPF_OPC_ALU   | BPF_OPC_MOV  | BPF_OPC_K,     "1,i" },
+  { "mov",     BPF_OPC_ALU64 | BPF_OPC_MOV  | BPF_OPC_X,     "1,2" },
+  { "mov",     BPF_OPC_ALU64 | BPF_OPC_MOV  | BPF_OPC_K,     "1,i" },
+  { "add32",   BPF_OPC_ALU   | BPF_OPC_ADD  | BPF_OPC_X,     "1,2" },
+  { "add32",   BPF_OPC_ALU   | BPF_OPC_ADD  | BPF_OPC_K,     "1,i" },
+  { "add",     BPF_OPC_ALU64 | BPF_OPC_ADD  | BPF_OPC_X,     "1,2" },
+  { "add",     BPF_OPC_ALU64 | BPF_OPC_ADD  | BPF_OPC_K,     "1,i" },
+  { "sub32",   BPF_OPC_ALU   | BPF_OPC_SUB  | BPF_OPC_X,     "1,2" },
+  { "sub32",   BPF_OPC_ALU   | BPF_OPC_SUB  | BPF_OPC_K,     "1,i" },
+  { "sub",     BPF_OPC_ALU64 | BPF_OPC_SUB  | BPF_OPC_X,     "1,2" },
+  { "sub",     BPF_OPC_ALU64 | BPF_OPC_SUB  | BPF_OPC_K,     "1,i" },
+  { "and32",   BPF_OPC_ALU   | BPF_OPC_AND  | BPF_OPC_X,     "1,2" },
+  { "and32",   BPF_OPC_ALU   | BPF_OPC_AND  | BPF_OPC_K,     "1,i" },
+  { "and",     BPF_OPC_ALU64 | BPF_OPC_AND  | BPF_OPC_X,     "1,2" },
+  { "and",     BPF_OPC_ALU64 | BPF_OPC_AND  | BPF_OPC_K,     "1,i" },
+  { "or32",    BPF_OPC_ALU   | BPF_OPC_OR   | BPF_OPC_X,     "1,2" },
+  { "or32",    BPF_OPC_ALU   | BPF_OPC_OR   | BPF_OPC_K,     "1,i" },
+  { "or",      BPF_OPC_ALU64 | BPF_OPC_OR   | BPF_OPC_X,     "1,2" },
+  { "or",      BPF_OPC_ALU64 | BPF_OPC_OR   | BPF_OPC_K,     "1,i" },
+  { "xor32",   BPF_OPC_ALU   | BPF_OPC_XOR  | BPF_OPC_X,     "1,2" },
+  { "xor32",   BPF_OPC_ALU   | BPF_OPC_XOR  | BPF_OPC_K,     "1,i" },
+  { "xor",     BPF_OPC_ALU64 | BPF_OPC_XOR  | BPF_OPC_X,     "1,2" },
+  { "xor",     BPF_OPC_ALU64 | BPF_OPC_XOR  | BPF_OPC_K,     "1,i" },
+  { "mul32",   BPF_OPC_ALU   | BPF_OPC_MUL  | BPF_OPC_X,     "1,2" },
+  { "mul32",   BPF_OPC_ALU   | BPF_OPC_MUL  | BPF_OPC_K,     "1,i" },
+  { "mul",     BPF_OPC_ALU64 | BPF_OPC_MUL  | BPF_OPC_X,     "1,2" },
+  { "mul",     BPF_OPC_ALU64 | BPF_OPC_MUL  | BPF_OPC_K,     "1,i" },
+  { "div32",   BPF_OPC_ALU   | BPF_OPC_DIV  | BPF_OPC_X,     "1,2" },
+  { "div32",   BPF_OPC_ALU   | BPF_OPC_DIV  | BPF_OPC_K,     "1,i" },
+  { "div",     BPF_OPC_ALU64 | BPF_OPC_DIV  | BPF_OPC_X,     "1,2" },
+  { "div",     BPF_OPC_ALU64 | BPF_OPC_DIV  | BPF_OPC_K,     "1,i" },
+  { "mod32",   BPF_OPC_ALU   | BPF_OPC_MOD  | BPF_OPC_X,     "1,2" },
+  { "mod32",   BPF_OPC_ALU   | BPF_OPC_MOD  | BPF_OPC_K,     "1,i" },
+  { "mod",     BPF_OPC_ALU64 | BPF_OPC_MOD  | BPF_OPC_X,     "1,2" },
+  { "mod",     BPF_OPC_ALU64 | BPF_OPC_MOD  | BPF_OPC_K,     "1,i" },
+  { "lsh32",   BPF_OPC_ALU   | BPF_OPC_LSH  | BPF_OPC_X,     "1,2" },
+  { "lsh32",   BPF_OPC_ALU   | BPF_OPC_LSH  | BPF_OPC_K,     "1,i" },
+  { "lsh",     BPF_OPC_ALU64 | BPF_OPC_LSH  | BPF_OPC_X,     "1,2" },
+  { "lsh",     BPF_OPC_ALU64 | BPF_OPC_LSH  | BPF_OPC_K,     "1,i" },
+  { "rsh32",   BPF_OPC_ALU   | BPF_OPC_RSH  | BPF_OPC_X,     "1,2" },
+  { "rsh32",   BPF_OPC_ALU   | BPF_OPC_RSH  | BPF_OPC_K,     "1,i" },
+  { "rsh",     BPF_OPC_ALU64 | BPF_OPC_RSH  | BPF_OPC_X,     "1,2" },
+  { "rsh",     BPF_OPC_ALU64 | BPF_OPC_RSH  | BPF_OPC_K,     "1,i" },
+  { "arsh32",  BPF_OPC_ALU   | BPF_OPC_ARSH | BPF_OPC_X,     "1,2" },
+  { "arsh32",  BPF_OPC_ALU   | BPF_OPC_ARSH | BPF_OPC_K,     "1,i" },
+  { "arsh",    BPF_OPC_ALU64 | BPF_OPC_ARSH | BPF_OPC_X,     "1,2" },
+  { "arsh",    BPF_OPC_ALU64 | BPF_OPC_ARSH | BPF_OPC_K,     "1,i" },
+  { "neg32",   BPF_OPC_ALU   | BPF_OPC_NEG  | BPF_OPC_X,     "1" },
+  { "neg",     BPF_OPC_ALU64 | BPF_OPC_NEG  | BPF_OPC_X,     "1" },
+  { "endbe",   BPF_OPC_ALU   | BPF_OPC_END  | BPF_OPC_TO_BE, "1,i" },
+  { "endle",   BPF_OPC_ALU   | BPF_OPC_END  | BPF_OPC_TO_LE, "1,i" },
+  { "ja",      BPF_OPC_JMP   | BPF_OPC_JA,                   "L" },
+  { "jeq",     BPF_OPC_JMP   | BPF_OPC_JEQ  | BPF_OPC_X,     "1,2,L" },
+  { "jeq",     BPF_OPC_JMP   | BPF_OPC_JEQ  | BPF_OPC_K,     "1,i,L" },
+  { "jgt",     BPF_OPC_JMP   | BPF_OPC_JGT  | BPF_OPC_X,     "1,2,L" },
+  { "jgt",     BPF_OPC_JMP   | BPF_OPC_JGT  | BPF_OPC_K,     "1,i,L" },
+  { "jge",     BPF_OPC_JMP   | BPF_OPC_JGE  | BPF_OPC_X,     "1,2,L" },
+  { "jge",     BPF_OPC_JMP   | BPF_OPC_JGE  | BPF_OPC_K,     "1,i,L" },
+  { "jne",     BPF_OPC_JMP   | BPF_OPC_JNE  | BPF_OPC_X,     "1,2,L" },
+  { "jne",     BPF_OPC_JMP   | BPF_OPC_JNE  | BPF_OPC_K,     "1,i,L" },
+  { "jsgt",    BPF_OPC_JMP   | BPF_OPC_JSGT | BPF_OPC_X,     "1,2,L" },
+  { "jsgt",    BPF_OPC_JMP   | BPF_OPC_JSGT | BPF_OPC_K,     "1,i,L" },
+  { "jsge",    BPF_OPC_JMP   | BPF_OPC_JSGE | BPF_OPC_X,     "1,2,L" },
+  { "jsge",    BPF_OPC_JMP   | BPF_OPC_JSGE | BPF_OPC_K,     "1,i,L" },
+  { "jset",    BPF_OPC_JMP   | BPF_OPC_JSET | BPF_OPC_X,     "1,2,L" },
+  { "jset",    BPF_OPC_JMP   | BPF_OPC_JSET | BPF_OPC_K,     "1,i,L" },
+  { "call",    BPF_OPC_JMP   | BPF_OPC_CALL,                 "C" },
+  { "tailcall",BPF_OPC_JMP   | BPF_OPC_CALL | BPF_OPC_X,     "C" },
+  { "exit",    BPF_OPC_JMP   | BPF_OPC_EXIT,                 "" },
+  { "ldimm64", BPF_OPC_LD    | BPF_OPC_IMM  | BPF_OPC_DW,    "1,D" },
+  { "ldw",     BPF_OPC_LDX   | BPF_OPC_MEM  | BPF_OPC_W,     "1,[2+O]" },
+  { "ldh",     BPF_OPC_LDX   | BPF_OPC_MEM  | BPF_OPC_H,     "1,[2+O]" },
+  { "ldb",     BPF_OPC_LDX   | BPF_OPC_MEM  | BPF_OPC_B,     "1,[2+O]" },
+  { "lddw",    BPF_OPC_LDX   | BPF_OPC_MEM  | BPF_OPC_DW,    "1,[2+O]" },
+  { "stw",     BPF_OPC_STX   | BPF_OPC_MEM  | BPF_OPC_W,     "[1+O],2" },
+  { "stw",     BPF_OPC_ST    | BPF_OPC_MEM  | BPF_OPC_W,     "[1+O],i" },
+  { "sth",     BPF_OPC_STX   | BPF_OPC_MEM  | BPF_OPC_H,     "[1+O],2" },
+  { "sth",     BPF_OPC_ST    | BPF_OPC_MEM  | BPF_OPC_H,     "[1+O],i" },
+  { "stb",     BPF_OPC_STX   | BPF_OPC_MEM  | BPF_OPC_B,     "[1+O],2" },
+  { "stb",     BPF_OPC_ST    | BPF_OPC_MEM  | BPF_OPC_B,     "[1+O],i" },
+  { "stdw",    BPF_OPC_STX   | BPF_OPC_MEM  | BPF_OPC_DW,    "[1+O],2" },
+  { "stdw",    BPF_OPC_ST    | BPF_OPC_MEM  | BPF_OPC_DW,    "[1+O],i" },
+  { "xaddw",   BPF_OPC_STX   | BPF_OPC_XADD | BPF_OPC_W,     "[1+O],2" },
+  { "xadddw",  BPF_OPC_STX   | BPF_OPC_XADD | BPF_OPC_DW,    "[1+O],2" },
+};
+const int bpf_num_opcodes = ((sizeof bpf_opcodes)/(sizeof bpf_opcodes[0]));
diff --git a/opcodes/configure b/opcodes/configure
index 27d1472..7583220 100755
--- a/opcodes/configure
+++ b/opcodes/configure
@@ -12634,6 +12634,7 @@ if test x${all_targets} = xfalse ; then
 	bfd_arm_arch)		ta="$ta arm-dis.lo" ;;
 	bfd_avr_arch)		ta="$ta avr-dis.lo" ;;
 	bfd_bfin_arch)		ta="$ta bfin-dis.lo" ;;
+	bfd_bpf_arch)		ta="$ta bpf-dis.lo bpf-opc.lo" ;;
 	bfd_cr16_arch)		ta="$ta cr16-dis.lo cr16-opc.lo" ;;
 	bfd_cris_arch)		ta="$ta cris-dis.lo cris-opc.lo cgen-bitset.lo" ;;
 	bfd_crx_arch)		ta="$ta crx-dis.lo crx-opc.lo" ;;
diff --git a/opcodes/configure.ac b/opcodes/configure.ac
index a9fbfd6..7dc6a92 100644
--- a/opcodes/configure.ac
+++ b/opcodes/configure.ac
@@ -258,6 +258,7 @@ if test x${all_targets} = xfalse ; then
 	bfd_arm_arch)		ta="$ta arm-dis.lo" ;;
 	bfd_avr_arch)		ta="$ta avr-dis.lo" ;;
 	bfd_bfin_arch)		ta="$ta bfin-dis.lo" ;;
+	bfd_bpf_arch)		ta="$ta bpf-dis.lo bpf-opc.lo" ;;
 	bfd_cr16_arch)		ta="$ta cr16-dis.lo cr16-opc.lo" ;;
 	bfd_cris_arch)		ta="$ta cris-dis.lo cris-opc.lo cgen-bitset.lo" ;;
 	bfd_crx_arch)		ta="$ta crx-dis.lo crx-opc.lo" ;;
diff --git a/opcodes/disassemble.c b/opcodes/disassemble.c
index dd7d3a3..e594f86 100644
--- a/opcodes/disassemble.c
+++ b/opcodes/disassemble.c
@@ -29,6 +29,7 @@
 #define ARCH_arm
 #define ARCH_avr
 #define ARCH_bfin
+#define ARCH_bpf
 #define ARCH_cr16
 #define ARCH_cris
 #define ARCH_crx
@@ -151,6 +152,11 @@ disassembler (bfd *abfd)
       disassemble = print_insn_bfin;
       break;
 #endif
+#ifdef ARCH_bpf
+    case bfd_arch_bpf:
+      disassemble = print_insn_bpf;
+      break;
+#endif
 #ifdef ARCH_cr16
     case bfd_arch_cr16:
       disassemble = print_insn_cr16;
-- 
2.7.4
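
The field extraction done by print_insn_bpf() above can be sketched as a standalone decoder. This is only a rough illustration, assuming the layout the patch reads (opcode in byte 0, destination register in the high nibble of byte 1, source register in the low nibble, 16-bit offset in bytes 2-3, 32-bit immediate in bytes 4-7). The names bpf_fields, get16, get32, bpf_decode and bpf_imm64 are made up for this sketch and are not binutils API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded form of one 8-byte instruction (hypothetical struct, not from the patch). */
struct bpf_fields {
    uint8_t code;   /* opcode byte */
    uint8_t dest;   /* destination register: high nibble of byte 1 */
    uint8_t src;    /* source register: low nibble of byte 1 */
    int16_t off;    /* signed 16-bit offset, bytes 2-3 */
    int32_t imm;    /* signed 32-bit immediate, bytes 4-7 */
};

/* 16-bit reader: the halfword must be read with a 16-bit accessor. */
static uint16_t get16(const uint8_t *p, int big_endian)
{
    return big_endian ? (uint16_t)((p[0] << 8) | p[1])
                      : (uint16_t)((p[1] << 8) | p[0]);
}

/* 32-bit reader for the immediate field. */
static uint32_t get32(const uint8_t *p, int big_endian)
{
    if (big_endian)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}

/* Split one raw instruction into its fields. */
static struct bpf_fields bpf_decode(const uint8_t raw[8], int big_endian)
{
    struct bpf_fields f;

    assert(raw != NULL);
    f.code = raw[0];
    f.dest = (uint8_t)((raw[1] & 0xf0) >> 4);
    f.src = (uint8_t)(raw[1] & 0x0f);
    f.off = (int16_t)get16(&raw[2], big_endian);
    f.imm = (int32_t)get32(&raw[4], big_endian);
    return f;
}

/* Combine the immediates of the two slots of a 64-bit load
 * (the 'D' operand): first slot is the low half, second the high half. */
static uint64_t bpf_imm64(int32_t lo, int32_t hi)
{
    return ((uint64_t)(uint32_t)hi << 32) | (uint64_t)(uint32_t)lo;
}
```

Casting each half through uint32_t before widening keeps a negative low immediate from sign-extending into the high half, which is the same reason the patch casts through unsigned.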

* Re: [PATCH v1 net-next 5/6] net: allow simultaneous SW and HW transmit timestamping
From: Miroslav Lichvar @ 2017-04-27 16:39 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, Richard Cochran, Willem de Bruijn,
	Soheil Hassas Yeganeh, Keller, Jacob E, Denny Page, Jiri Benc
In-Reply-To: <CAF=yD-+HK-dCG_XjqBKfkSF1bjJavTr7EFgeFNH2yRc2CXgOxA@mail.gmail.com>

On Thu, Apr 27, 2017 at 12:21:00PM -0400, Willem de Bruijn wrote:
> >> > @@ -720,6 +720,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
> >> >                 empty = 0;
> >> >         if (shhwtstamps &&
> >> >             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
> >> > +           (empty || !skb_is_err_queue(skb)) &&
> >> >             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
> >>
> >> I find skb->tstamp == 0 easier to understand than the condition on empty.
> >>
> >> Indeed, this is so non-obvious that I would suggest another helper function
> >> skb_is_hwtx_tstamp with a concise comment about the race condition
> >> between tx software and hardware timestamps (as in the last sentence of
> >> the commit message).
> >
> > Should it include also the skb_is_err_queue() check? If it returned
> > true for both TX and RX HW timestamps, maybe it could be called
> > skb_has_hw_tstamp?
> 
> For the purpose of documenting why this complex condition exists,
> I would call the skb_is_err_queue in that helper function and make
> it tx + hw specific.

Hm, like this?

        if (shhwtstamps &&
            (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
+           (skb_is_hwtx_tstamp(skb) || !skb_is_err_queue(skb)) &&
            ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {

where skb_is_hwtx_tstamp() has
	return skb->tstamp == 0 && skb_is_err_queue(skb);

I was just not sure about the unnecessary skb_is_err_queue() call.
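
To make the question concrete, here is a minimal user-space sketch of both forms of the condition. It assumes a mock struct in place of struct sk_buff; mock_skb and the mock_* / report_* functions are hypothetical names for illustration, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two sk_buff properties the condition reads. */
struct mock_skb {
    int64_t tstamp;     /* software timestamp; 0 means none was taken */
    bool on_err_queue;  /* what skb_is_err_queue() would report */
};

static bool mock_is_err_queue(const struct mock_skb *skb)
{
    assert(skb != NULL);
    return skb->on_err_queue;
}

/* Helper as proposed: an error-queue (TX) skb without a software timestamp. */
static bool mock_is_hwtx_tstamp(const struct mock_skb *skb)
{
    return skb->tstamp == 0 && mock_is_err_queue(skb);
}

/* The condition written with the helper. */
static bool report_hw_with_helper(const struct mock_skb *skb)
{
    return mock_is_hwtx_tstamp(skb) || !mock_is_err_queue(skb);
}

/* The same condition with only the tstamp test on the left-hand side. */
static bool report_hw_plain(const struct mock_skb *skb)
{
    return skb->tstamp == 0 || !mock_is_err_queue(skb);
}
```

Both forms agree for all four combinations of the two inputs, so the extra call inside the helper does not change the result here; it only documents that the helper is about TX timestamps.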

-- 
Miroslav Lichvar

* Re: [PATCH net-next] samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel
From: Alexei Starovoitov @ 2017-04-27 16:38 UTC (permalink / raw)
  To: David Ahern, netdev; +Cc: daniel
In-Reply-To: <1493309473-27384-1-git-send-email-dsa@cumulusnetworks.com>

On 4/27/17 9:11 AM, David Ahern wrote:
> Add option to xdp1 and xdp_tx_iptunnel to insert xdp program in
> SKB_MODE:
>  - update set_link_xdp_fd to take a flags argument that is added to the
>    RTM_SETLINK message
>
>  - Add -S option to xdp1 and xdp_tx_iptunnel user code. When passed in
>    XDP_FLAGS_SKB_MODE is set in the flags arg passed to set_link_xdp_fd
>
> Signed-off-by: David Ahern <dsa@cumulusnetworks.com>

awesome. thanks!
Acked-by: Alexei Starovoitov <ast@kernel.org>

* [PATCH] net: ath: tx99: fixed a spelling issue
From: ammly @ 2017-04-27 16:31 UTC (permalink / raw)
  To: ath9k-devel; +Cc: kvalo, linux-wireless, netdev, linux-kernel, Ammly Fredrick

Fixed a spelling issue.

Signed-off-by: Ammly Fredrick <ammlyf@gmail.com>
---
 drivers/net/wireless/ath/ath9k/tx99.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath9k/tx99.c b/drivers/net/wireless/ath/ath9k/tx99.c
index 16aca9e28b77..a866cbda0799 100644
--- a/drivers/net/wireless/ath/ath9k/tx99.c
+++ b/drivers/net/wireless/ath/ath9k/tx99.c
@@ -153,7 +153,7 @@ static int ath9k_tx99_init(struct ath_softc *sc)
 		sc->tx99_power,
 		sc->tx99_power / 2);
 
-	/* We leave the harware awake as it will be chugging on */
+	/* We leave the hardware awake as it will be chugging on */
 
 	return 0;
 }
-- 
2.11.0

* Re: [PATCH net-next 6/6] bpf: show bpf programs
From: Hannes Frederic Sowa @ 2017-04-27 16:28 UTC (permalink / raw)
  To: David Miller; +Cc: daniel, netdev, ast, daniel, jbenc, aconole
In-Reply-To: <20170427.120019.1559603500876505216.davem@davemloft.net>

On 27.04.2017 18:00, David Miller wrote:
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Date: Thu, 27 Apr 2017 15:22:49 +0200
> 
>> Sure, that sounds super. But so far Linux and most (maybe I should write
>> all) subsystems always provided some easy way to get the insights of the
>> kernel without having to code or rely on special tools so far.
> 
> Not true.

Yes, I should not have written it that generally. ;)

> You cannot fully dump socket TCP internal state without netlink based
> tools.  It is just one of many examples.
>
> Can you dump all nftables rules without a special tool?

You got me here, I agree that not all state is discoverable via procfs.
But to some degree even netfilter and tcp do expose some considerable
amount of data via procfs. In the case of netfilter it might be less
valuable, though, I have to agree.

> I don't think this is a legitimate line of argument, and I want
> this to be done via the bpf() system call which is what people
> are working on.

I hope you saw that I was absolutely not against dumping or enumeration
with the bpf syscall. It will be the primary interface to debug ebpf and
I completely agree.

I merely tried to establish the procfs interface as a quick-look check
for whether some type of bpf program is loaded, which could be the
starting point for any further diagnosis. This interface should not have
any dependencies and should
even work on embedded devices, where sometimes it might be difficult to
get a binary for the correct architecture installed ad-hoc (I am
thinking about openwrt). But this is definitely also solvable.

I do think if a common utility in util-linux, like lsbpf, is available I
will be fine.

Anyway, I will take this argument back.

Thanks,
Hannes

* Re: [PATCH net-next 9/9] ipvlan: introduce individual MAC addresses
From: Dan Williams @ 2017-04-27 16:23 UTC (permalink / raw)
  To: Marco Chiappero, netdev
  Cc: David S . Miller, Jeff Kirsher, Alexander Duyck, Sainath Grandhi,
	Mahesh Bandewar
In-Reply-To: <1493310033.27948.3.camel@redhat.com>

On Thu, 2017-04-27 at 11:20 -0500, Dan Williams wrote:
> On Thu, 2017-04-27 at 15:51 +0100, Marco Chiappero wrote:
> > Currently all the slave devices belonging to the same port inherit
> > their
> > MAC address from its master device. This patch removes this
> > limitation
> > and allows every slave device to obtain a unique MAC address, by
> > default
> > randomly generated at creation time.
> > 
> > Moreover it is now possible to correctly modify the MAC address at
> > any
> > time, fixing an existing bug as MAC address changes on the master
> > were
> > not reflected on the slaves. It also avoids multiple interfaces
> > sharing
> > the same IPv6 link-local address.
> 
> How is this different than macvlan now?  Why would you use unique
> addressed ipvlan instances instead of macvlan?  Wouldn't the same
> problems around external switches not expecting multiple MACs from
> the
> same switch port apply now to ipvlan?
> 
> The whole point of ipvlan AIUI was to get around macvlan problems
> related to multiple MACs on the same port.

Another issue is the unicast MAC limits on cards.  ipvlan is now much
more likely to hit the unicast MAC limit of the NIC and thus trigger
promiscuous mode and the resulting performance drop, where before it
would not.

Dan

> Also, I think the IPv6 thing you mention is incorrect and has long
> since been solved.  Originally, ipvlan did not include a "dev_id"
> property that differed between child interfaces, and thus the IID of
> each interface was the same.  That has now been fixed, and each
> ipvlan slave should now have a different IID and thus a different
> link-local address.
> 
> Dan
> 
> > Since ipvlan is designed to expose a single MAC address for
> > external communications, the driver now behaves as follows:
> > - L2 mode:
> >    * Any reference to the internal MAC address of the originating
> > slave
> >      is replaced with the MAC address of the master for outbound
> > frames.
> >    * Likewise, the destination MAC address is overwritten with the
> >      internal one (once the correct slave is determined) for any
> >      incoming external frame.
> >    * For any internal slave-to-slave communication, the original
> > MAC
> >      addresses are preserved (although not used for
> > routing/switching).
> > - L3/L3s mode:
> >    * The destination MAC address for incoming external packets is
> >      replaced with the one belonging to the destination slave
> > device
> >      (as for L2 mode)
> >    * Every other path behaves as before.
> > 
> > Being a significant behavioral change, version number has been
> > increased.
> > 
> > Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
> > Tested-by: Marco Chiappero <marco.chiappero@intel.com>
> > ---
> >  drivers/net/ipvlan/ipvlan.h      |   2 +-
> >  drivers/net/ipvlan/ipvlan_core.c | 113
> > ++++++++++++++++++++++++++++++++++-----
> >  drivers/net/ipvlan/ipvlan_main.c |  18 +++----
> >  3 files changed, 111 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/net/ipvlan/ipvlan.h
> > b/drivers/net/ipvlan/ipvlan.h
> > index 800a46c..efe4fd1 100644
> > --- a/drivers/net/ipvlan/ipvlan.h
> > +++ b/drivers/net/ipvlan/ipvlan.h
> > @@ -32,7 +32,7 @@
> >  #include <net/l3mdev.h>
> >  
> >  #define IPVLAN_DRV	"ipvlan"
> > -#define IPV_DRV_VER	"0.1"
> > +#define IPV_DRV_VER	"0.2"
> >  
> >  #define IPVLAN_HASH_SIZE	(1 << BITS_PER_BYTE)
> >  #define IPVLAN_HASH_MASK	(IPVLAN_HASH_SIZE - 1)
> > diff --git a/drivers/net/ipvlan/ipvlan_core.c
> > b/drivers/net/ipvlan/ipvlan_core.c
> > index 67e342d..a30bc11 100644
> > --- a/drivers/net/ipvlan/ipvlan_core.c
> > +++ b/drivers/net/ipvlan/ipvlan_core.c
> > @@ -215,6 +215,89 @@ static void ipvlan_skb_crossing_ns(struct
> > sk_buff *skb, struct net_device *dev)
> >  		skb->dev = dev;
> >  }
> >  
> > +static inline struct nd_opt_hdr *ipvlan_icmp6_nd_opts(struct
> > icmp6hdr *icmph)
> > +{
> > +	return (struct nd_opt_hdr *)((struct nd_msg *)icmph)->opt;
> > +}
> > +
> > +static inline struct nd_opt_hdr *ipvlan_icmp6_rs_opts(struct
> > icmp6hdr *icmph)
> > +{
> > +	return (struct nd_opt_hdr *)((struct rs_msg *)icmph)->opt;
> > +}
> > +
> > +static void ipvlan_proxy_l2_update_icmp6(const struct net_device
> > *master,
> > +					 struct sk_buff *skb,
> > +					 struct nd_opt_hdr
> > *nd_opt,
> > +					 u8 opt_type)
> > +{
> > +	u32 opts_len = skb_tail_pointer(skb) - (u8 *)nd_opt;
> > +
> > +	while (opts_len) {
> > +		u32 opt_len = nd_opt->nd_opt_len << 3;
> > +
> > +		if (nd_opt->nd_opt_type == opt_type) {
> > +			struct ipv6hdr *ip6h = ipv6_hdr(skb);
> > +			struct icmp6hdr *icmph = icmp6_hdr(skb);
> > +			u32 len = ntohs(ip6h->payload_len);
> > +
> > +			memcpy(nd_opt + 1, master->dev_addr,
> > master-
> > > addr_len);
> > 
> > +			icmph->icmp6_cksum = 0;
> > +			icmph->icmp6_cksum =
> > +				csum_ipv6_magic(&ip6h->saddr,
> > +						&ip6h->daddr, len,
> > +						IPPROTO_ICMPV6,
> > +						csum_partial(icmph
> > ,
> > len, 0));
> > +			return;
> > +		}
> > +
> > +		opts_len -= opt_len;
> > +		nd_opt = ((void *)nd_opt) + opt_len;
> > +	}
> > +}
> > +
> > +static void ipvlan_proxy_l2_outbound(struct sk_buff *skb,
> > +				     const struct net_device
> > *master)
> > +{
> > +	/* masquerade the source MAC address for every outgoing
> > frame */
> > +	memcpy(eth_hdr(skb)->h_source, master->dev_addr, master-
> > > addr_len);
> > 
> > +
> > +	/* ARP and some NDISC packets need additional treatment */
> > +	if (skb->protocol == htons(ETH_P_IPV6)) {
> > +		struct ipv6hdr *ip6h = ipv6_hdr(skb);
> > +		struct icmp6hdr *icmph = icmp6_hdr(skb);
> > +		struct nd_opt_hdr *nd_opt;
> > +		u8 opt_type;
> > +
> > +		if (likely(ip6h->nexthdr != NEXTHDR_ICMP))
> > +			return;
> > +
> > +		switch (icmph->icmp6_type) {
> > +		case NDISC_NEIGHBOUR_SOLICITATION: {
> > +			nd_opt = ipvlan_icmp6_nd_opts(icmph);
> > +			opt_type = ND_OPT_SOURCE_LL_ADDR;
> > +			break;
> > +		}
> > +		case NDISC_NEIGHBOUR_ADVERTISEMENT: {
> > +			nd_opt = ipvlan_icmp6_nd_opts(icmph);
> > +			opt_type = ND_OPT_TARGET_LL_ADDR;
> > +			break;
> > +		}
> > +		case NDISC_ROUTER_SOLICITATION: {
> > +			nd_opt = ipvlan_icmp6_rs_opts(icmph);
> > +			opt_type = ND_OPT_SOURCE_LL_ADDR;
> > +			break;
> > +		}
> > +		default:
> > +			return;
> > +		}
> > +
> > +		ipvlan_proxy_l2_update_icmp6(master, skb, nd_opt,
> > opt_type);
> > +
> > +	} else if (unlikely(skb->protocol == htons(ETH_P_ARP))) {
> > +		memcpy(arp_hdr(skb) + 1, master->dev_addr, master-
> > > addr_len);
> > 
> > +	}
> > +}
> > +
> >  static void ipvlan_dispatch_multicast(struct ipvl_port *port,
> >  				      struct sk_buff *skb, u8
> > pkt_type,
> >  				      unsigned int mac_hash)
> > @@ -258,6 +341,7 @@ static void ipvlan_dispatch_multicast(struct
> > ipvl_port *port,
> >  		/* If the packet originated here, send it out. */
> >  		skb->dev = port->dev;
> >  		skb->pkt_type = pkt_type;
> > +		ipvlan_proxy_l2_outbound(skb, port->dev);
> >  		dev_queue_xmit(skb);
> >  	} else {
> >  		if (consumed)
> > @@ -489,6 +573,7 @@ static int ipvlan_xmit_mode_l3(struct sk_buff
> > *skb, struct net_device *dev)
> >  static inline int ipvlan_process_l2_outbound(struct sk_buff *skb,
> >  					     struct net_device
> > *dev)
> >  {
> > +	ipvlan_proxy_l2_outbound(skb, dev);
> >  	ipvlan_skb_crossing_ns(skb, dev);
> >  	return dev_queue_xmit(skb);
> >  }
> > @@ -499,27 +584,27 @@ static int ipvlan_xmit_mode_l2(struct sk_buff
> > *skb, struct net_device *dev)
> >  	struct ethhdr *ethh = eth_hdr(skb);
> >  	struct ipvl_addr *addr;
> >  
> > -	if (ether_addr_equal(ethh->h_dest, ethh->h_source)) {
> > -		addr = ipvlan_get_slave_addr_dst(skb, ipvlan-
> > >port);
> > -		if (addr)
> > -			return ipvlan_rcv_int_frame(addr, &skb);
> > +	if (is_multicast_ether_addr(ethh->h_dest)) {
> > +		ipvlan_multicast_enqueue(ipvlan->port, skb, true);
> > +		return NET_XMIT_SUCCESS;
> > +	}
> >  
> > +	if (ether_addr_equal(ethh->h_dest, ipvlan->phy_dev-
> > > dev_addr)) {
> > 
> >  		skb = skb_share_check(skb, GFP_ATOMIC);
> >  		if (unlikely(!skb))
> >  			return NET_XMIT_DROP;
> >  
> > -		/* Packet definitely does not belong to any of the
> > -		 * virtual devices, but the dest is local. So
> > forward
> > -		 * the skb for the main-dev. At the RX side we
> > just
> > return
> > -		 * RX_PASS for it to be processed further on the
> > stack.
> > +		/* Forward the skb for the master device. At the
> > RX
> > side we
> > +		 * just return RX_HANDLER_PASS for it to be
> > processed further
> > +		 * on the stack.
> >  		 */
> >  		return dev_forward_skb(ipvlan->phy_dev, skb);
> > -
> > -	} else if (is_multicast_ether_addr(ethh->h_dest)) {
> > -		ipvlan_multicast_enqueue(ipvlan->port, skb, true);
> > -		return NET_XMIT_SUCCESS;
> >  	}
> >  
> > +	addr = ipvlan_get_slave_addr_dst(skb, ipvlan->port);
> > +	if (addr)
> > +		return ipvlan_rcv_int_frame(addr, &skb);
> > +
> >  	return ipvlan_process_l2_outbound(skb, ipvlan->phy_dev);
> >  }
> >  
> > @@ -562,6 +647,10 @@ static int ipvlan_rcv_ext_frame(struct
> > ipvl_addr
> > *addr, struct sk_buff *skb)
> >  	struct net_device *dev = ipvlan->dev;
> >  	unsigned int len = skb->len + ETH_HLEN;
> >  
> > +	/* NOTE: although not necessary restore the actual
> > destination
> > +	 * address; this is also what traffic sniffers will
> > display.
> > +	 */
> > +	memcpy(eth_hdr(skb)->h_dest, dev->dev_addr, dev-
> > >addr_len);
> >  	ipvlan_skb_crossing_ns(skb, dev);
> >  	ipvlan_count_rx(ipvlan, len, true, false);
> >  
> > diff --git a/drivers/net/ipvlan/ipvlan_main.c
> > b/drivers/net/ipvlan/ipvlan_main.c
> > index b837807..709f27d 100644
> > --- a/drivers/net/ipvlan/ipvlan_main.c
> > +++ b/drivers/net/ipvlan/ipvlan_main.c
> > @@ -378,6 +378,7 @@ static const struct net_device_ops
> > ipvlan_netdev_ops = {
> >  	.ndo_start_xmit		= ipvlan_start_xmit,
> >  	.ndo_fix_features	= ipvlan_fix_features,
> >  	.ndo_change_rx_flags	= ipvlan_change_rx_flags,
> > +	.ndo_set_mac_address	= eth_mac_addr,
> >  	.ndo_set_rx_mode	= ipvlan_set_multicast_mac_filter,
> >  	.ndo_get_stats64	= ipvlan_get_stats64,
> >  	.ndo_vlan_rx_add_vid	= ipvlan_vlan_rx_add_vid,
> > @@ -392,9 +393,10 @@ static int ipvlan_hard_header(struct sk_buff
> > *skb, struct net_device *dev,
> >  	const struct ipvl_dev *ipvlan = netdev_priv(dev);
> >  	struct net_device *phy_dev = ipvlan->phy_dev;
> >  
> > -	/* TODO Probably use a different field than dev_addr so
> > that
> > the
> > -	 * mac-address on the virtual device is portable and can
> > be
> > carried
> > -	 * while the packets use the mac-addr on the physical
> > device.
> > +	/* This driver uses (almost exclusively) L3 addresses for
> > +	 * routing/switching. Use the actual slave's MAC address,
> > +	 * but overwrite it later during the packet processing for
> > +	 * frames leaving from master
> >  	 */
> >  	return dev_hard_header(skb, phy_dev, type, daddr,
> >  			       saddr ? : dev->dev_addr, len);
> > @@ -559,11 +561,8 @@ int ipvlan_link_new(struct net *src_net,
> > struct
> > net_device *dev,
> >  	/* Increment id-base to the next slot for the future
> > assignment */
> >  	port->dev_id_start = err + 1;
> >  
> > -	/* TODO Probably put random address here to be presented
> > to
> > the
> > -	 * world but keep using the physical-dev address for the
> > outgoing
> > -	 * packets.
> > -	 */
> > -	memcpy(dev->dev_addr, phy_dev->dev_addr, ETH_ALEN);
> > +	/* TODO: consider storing the original MAC address in dev-
> > > perm_addr */
> > 
> > +	eth_hw_addr_random(dev);
> >  
> >  	dev->priv_flags |= IFF_IPVLAN_SLAVE;
> >  
> > @@ -619,7 +618,8 @@ void ipvlan_link_setup(struct net_device *dev)
> >  	ether_setup(dev);
> >  
> >  	dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE |
> > IFF_TX_SKB_SHARING);
> > -	dev->priv_flags |= IFF_UNICAST_FLT | IFF_NO_QUEUE;
> > +	dev->priv_flags |= IFF_UNICAST_FLT | IFF_NO_QUEUE
> > +			   | IFF_LIVE_ADDR_CHANGE;
> >  	dev->netdev_ops = &ipvlan_netdev_ops;
> >  	dev->destructor = free_netdev;
> >  	dev->header_ops = &ipvlan_header_ops;
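
The option walk in the quoted ipvlan_proxy_l2_update_icmp6() can be sketched in user space as below. This is a rough sketch under stated assumptions, not the driver code: it works on a plain byte buffer instead of an skb, assumes a 6-byte Ethernet address, and adds a bounds check and a zero-length check that the quoted loop does not have. nd_opt_set_lladdr and its return codes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Walk a buffer of neighbour discovery options. Each option is laid out as
 * type (1 byte), length (1 byte, in units of 8 octets), then data.
 * On the first option of type opt_type, overwrite its link-layer address
 * with mac.
 *
 * Returns 0 if an option was rewritten, 1 if none matched,
 * -1 if the options are malformed (zero or overlong length).
 */
static int nd_opt_set_lladdr(uint8_t *opts, size_t opts_len,
                             uint8_t opt_type, const uint8_t mac[6])
{
    size_t pos = 0;

    assert(opts != NULL && mac != NULL);
    while (opts_len - pos >= 2) {
        size_t opt_len = (size_t)opts[pos + 1] << 3;

        /* A zero length would never advance pos; an overlong one
         * would run past the end of the buffer. */
        if (opt_len == 0 || opt_len > opts_len - pos)
            return -1;

        if (opts[pos] == opt_type) {
            /* opt_len is at least 8, so 2 header bytes + 6 address bytes fit. */
            memcpy(&opts[pos + 2], mac, 6);
            return 0;
        }
        pos += opt_len;
    }
    return 1;
}
```

In the driver the ICMPv6 checksum still has to be recomputed after the rewrite, as the quoted patch does with csum_ipv6_magic(); the sketch leaves that out.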

* Re: [PATCH v1 net-next 5/6] net: allow simultaneous SW and HW transmit timestamping
From: Willem de Bruijn @ 2017-04-27 16:21 UTC (permalink / raw)
  To: Miroslav Lichvar
  Cc: Network Development, Richard Cochran, Willem de Bruijn,
	Soheil Hassas Yeganeh, Keller, Jacob E, Denny Page, Jiri Benc
In-Reply-To: <20170427161700.GB3401@localhost>

>> > @@ -720,6 +720,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
>> >                 empty = 0;
>> >         if (shhwtstamps &&
>> >             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
>> > +           (empty || !skb_is_err_queue(skb)) &&
>> >             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
>>
>> I find skb->tstamp == 0 easier to understand than the condition on empty.
>>
>> Indeed, this is so non-obvious that I would suggest another helper function
>> skb_is_hwtx_tstamp with a concise comment about the race condition
>> between tx software and hardware timestamps (as in the last sentence of
>> the commit message).
>
> Should it include also the skb_is_err_queue() check? If it returned
> true for both TX and RX HW timestamps, maybe it could be called
> skb_has_hw_tstamp?

For the purpose of documenting why this complex condition exists,
I would call the skb_is_err_queue in that helper function and make
it tx + hw specific.

* Re: [PATCH net-next 9/9] ipvlan: introduce individual MAC addresses
From: Dan Williams @ 2017-04-27 16:20 UTC (permalink / raw)
  To: Marco Chiappero, netdev
  Cc: David S . Miller, Jeff Kirsher, Alexander Duyck, Sainath Grandhi,
	Mahesh Bandewar
In-Reply-To: <20170427145142.15830-10-marco.chiappero@intel.com>

On Thu, 2017-04-27 at 15:51 +0100, Marco Chiappero wrote:
> Currently all the slave devices belonging to the same port inherit
> their
> MAC address from its master device. This patch removes this
> limitation
> and allows every slave device to obtain a unique MAC address, by
> default
> randomly generated at creation time.
> 
> Moreover it is now possible to correctly modify the MAC address at
> any
> time, fixing an existing bug as MAC address changes on the master
> were
> not reflected on the slaves. It also avoids multiple interfaces
> sharing
> the same IPv6 link-local address.

How is this different than macvlan now?  Why would you use unique
addressed ipvlan instances instead of macvlan?  Wouldn't the same
problems around external switches not expecting multiple MACs from the
same switch port apply now to ipvlan?

The whole point of ipvlan AIUI was to get around macvlan problems
related to multiple MACs on the same port.

Also, I think the IPv6 thing you mention is incorrect and has long
since been solved.  Originally, ipvlan did not include a "dev_id"
property that differed between child interfaces, and thus the IID of
each interface was the same.  That has now been fixed, and each
ipvlan slave should now have a different IID and thus a different
link-local address.

Dan

> Since ipvlan is designed to expose a single MAC address for external
> communications, the driver now behaves as follows:
> - L2 mode:
>    * Any reference to the internal MAC address of the originating
> slave
>      is replaced with the MAC address of the master for outbound
> frames.
>    * Likewise, the destination MAC address is overwritten with the
>      internal one (once the correct slave is determined) for any
>      incoming external frame.
>    * For any internal slave-to-slave communication, the original MAC
>      addresses are preserved (although not used for
> routing/switching).
> - L3/L3s mode:
>    * The destination MAC address for incoming external packets is
>      replaced with the one belonging to the destination slave device
>      (as for L2 mode)
>    * Every other path behaves as before.
> 
> Being a significant behavioral change, version number has been
> increased.
> 
> Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
> Tested-by: Marco Chiappero <marco.chiappero@intel.com>
> ---
>  drivers/net/ipvlan/ipvlan.h      |   2 +-
>  drivers/net/ipvlan/ipvlan_core.c | 113
> ++++++++++++++++++++++++++++++++++-----
>  drivers/net/ipvlan/ipvlan_main.c |  18 +++----
>  3 files changed, 111 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/net/ipvlan/ipvlan.h
> b/drivers/net/ipvlan/ipvlan.h
> index 800a46c..efe4fd1 100644
> --- a/drivers/net/ipvlan/ipvlan.h
> +++ b/drivers/net/ipvlan/ipvlan.h
> @@ -32,7 +32,7 @@
>  #include <net/l3mdev.h>
>  
>  #define IPVLAN_DRV	"ipvlan"
> -#define IPV_DRV_VER	"0.1"
> +#define IPV_DRV_VER	"0.2"
>  
>  #define IPVLAN_HASH_SIZE	(1 << BITS_PER_BYTE)
>  #define IPVLAN_HASH_MASK	(IPVLAN_HASH_SIZE - 1)
> diff --git a/drivers/net/ipvlan/ipvlan_core.c
> b/drivers/net/ipvlan/ipvlan_core.c
> index 67e342d..a30bc11 100644
> --- a/drivers/net/ipvlan/ipvlan_core.c
> +++ b/drivers/net/ipvlan/ipvlan_core.c
> @@ -215,6 +215,89 @@ static void ipvlan_skb_crossing_ns(struct
> sk_buff *skb, struct net_device *dev)
>  		skb->dev = dev;
>  }
>  
> +static inline struct nd_opt_hdr *ipvlan_icmp6_nd_opts(struct
> icmp6hdr *icmph)
> +{
> +	return (struct nd_opt_hdr *)((struct nd_msg *)icmph)->opt;
> +}
> +
> +static inline struct nd_opt_hdr *ipvlan_icmp6_rs_opts(struct
> icmp6hdr *icmph)
> +{
> +	return (struct nd_opt_hdr *)((struct rs_msg *)icmph)->opt;
> +}
> +
> +static void ipvlan_proxy_l2_update_icmp6(const struct net_device
> *master,
> +					 struct sk_buff *skb,
> +					 struct nd_opt_hdr *nd_opt,
> +					 u8 opt_type)
> +{
> +	u32 opts_len = skb_tail_pointer(skb) - (u8 *)nd_opt;
> +
> +	while (opts_len) {
> +		u32 opt_len = nd_opt->nd_opt_len << 3;
> +
> +		if (nd_opt->nd_opt_type == opt_type) {
> +			struct ipv6hdr *ip6h = ipv6_hdr(skb);
> +			struct icmp6hdr *icmph = icmp6_hdr(skb);
> +			u32 len = ntohs(ip6h->payload_len);
> +
> +			memcpy(nd_opt + 1, master->dev_addr, master-
> >addr_len);
> +			icmph->icmp6_cksum = 0;
> +			icmph->icmp6_cksum =
> +				csum_ipv6_magic(&ip6h->saddr,
> +						&ip6h->daddr, len,
> +						IPPROTO_ICMPV6,
> +						csum_partial(icmph,
> len, 0));
> +			return;
> +		}
> +
> +		opts_len -= opt_len;
> +		nd_opt = ((void *)nd_opt) + opt_len;
> +	}
> +}
> +
> +static void ipvlan_proxy_l2_outbound(struct sk_buff *skb,
> +				     const struct net_device
> *master)
> +{
> +	/* masquerade the source MAC address for every outgoing
> frame */
> +	memcpy(eth_hdr(skb)->h_source, master->dev_addr, master-
> >addr_len);
> +
> +	/* ARP and some NDISC packets need additional treatment */
> +	if (skb->protocol == htons(ETH_P_IPV6)) {
> +		struct ipv6hdr *ip6h = ipv6_hdr(skb);
> +		struct icmp6hdr *icmph = icmp6_hdr(skb);
> +		struct nd_opt_hdr *nd_opt;
> +		u8 opt_type;
> +
> +		if (likely(ip6h->nexthdr != NEXTHDR_ICMP))
> +			return;
> +
> +		switch (icmph->icmp6_type) {
> +		case NDISC_NEIGHBOUR_SOLICITATION: {
> +			nd_opt = ipvlan_icmp6_nd_opts(icmph);
> +			opt_type = ND_OPT_SOURCE_LL_ADDR;
> +			break;
> +		}
> +		case NDISC_NEIGHBOUR_ADVERTISEMENT: {
> +			nd_opt = ipvlan_icmp6_nd_opts(icmph);
> +			opt_type = ND_OPT_TARGET_LL_ADDR;
> +			break;
> +		}
> +		case NDISC_ROUTER_SOLICITATION: {
> +			nd_opt = ipvlan_icmp6_rs_opts(icmph);
> +			opt_type = ND_OPT_SOURCE_LL_ADDR;
> +			break;
> +		}
> +		default:
> +			return;
> +		}
> +
> +		ipvlan_proxy_l2_update_icmp6(master, skb, nd_opt,
> opt_type);
> +
> +	} else if (unlikely(skb->protocol == htons(ETH_P_ARP))) {
> +		memcpy(arp_hdr(skb) + 1, master->dev_addr, master-
> >addr_len);
> +	}
> +}
> +
>  static void ipvlan_dispatch_multicast(struct ipvl_port *port,
>  				      struct sk_buff *skb, u8
> pkt_type,
>  				      unsigned int mac_hash)
> @@ -258,6 +341,7 @@ static void ipvlan_dispatch_multicast(struct
> ipvl_port *port,
>  		/* If the packet originated here, send it out. */
>  		skb->dev = port->dev;
>  		skb->pkt_type = pkt_type;
> +		ipvlan_proxy_l2_outbound(skb, port->dev);
>  		dev_queue_xmit(skb);
>  	} else {
>  		if (consumed)
> @@ -489,6 +573,7 @@ static int ipvlan_xmit_mode_l3(struct sk_buff
> *skb, struct net_device *dev)
>  static inline int ipvlan_process_l2_outbound(struct sk_buff *skb,
>  					     struct net_device *dev)
>  {
> +	ipvlan_proxy_l2_outbound(skb, dev);
>  	ipvlan_skb_crossing_ns(skb, dev);
>  	return dev_queue_xmit(skb);
>  }
> @@ -499,27 +584,27 @@ static int ipvlan_xmit_mode_l2(struct sk_buff
> *skb, struct net_device *dev)
>  	struct ethhdr *ethh = eth_hdr(skb);
>  	struct ipvl_addr *addr;
>  
> -	if (ether_addr_equal(ethh->h_dest, ethh->h_source)) {
> -		addr = ipvlan_get_slave_addr_dst(skb, ipvlan->port);
> -		if (addr)
> -			return ipvlan_rcv_int_frame(addr, &skb);
> +	if (is_multicast_ether_addr(ethh->h_dest)) {
> +		ipvlan_multicast_enqueue(ipvlan->port, skb, true);
> +		return NET_XMIT_SUCCESS;
> +	}
>  
> +	if (ether_addr_equal(ethh->h_dest, ipvlan->phy_dev-
> >dev_addr)) {
>  		skb = skb_share_check(skb, GFP_ATOMIC);
>  		if (unlikely(!skb))
>  			return NET_XMIT_DROP;
>  
> -		/* Packet definitely does not belong to any of the
> -		 * virtual devices, but the dest is local. So
> forward
> -		 * the skb for the main-dev. At the RX side we just
> return
> -		 * RX_PASS for it to be processed further on the
> stack.
> +		/* Forward the skb for the master device. At the RX
> side we
> +		 * just return RX_HANDLER_PASS for it to be
> processed further
> +		 * on the stack.
>  		 */
>  		return dev_forward_skb(ipvlan->phy_dev, skb);
> -
> -	} else if (is_multicast_ether_addr(ethh->h_dest)) {
> -		ipvlan_multicast_enqueue(ipvlan->port, skb, true);
> -		return NET_XMIT_SUCCESS;
>  	}
>  
> +	addr = ipvlan_get_slave_addr_dst(skb, ipvlan->port);
> +	if (addr)
> +		return ipvlan_rcv_int_frame(addr, &skb);
> +
>  	return ipvlan_process_l2_outbound(skb, ipvlan->phy_dev);
>  }
>  
> @@ -562,6 +647,10 @@ static int ipvlan_rcv_ext_frame(struct ipvl_addr
> *addr, struct sk_buff *skb)
>  	struct net_device *dev = ipvlan->dev;
>  	unsigned int len = skb->len + ETH_HLEN;
>  
> +	/* NOTE: although not necessary restore the actual
> destination
> +	 * address; this is also what traffic sniffers will display.
> +	 */
> +	memcpy(eth_hdr(skb)->h_dest, dev->dev_addr, dev->addr_len);
>  	ipvlan_skb_crossing_ns(skb, dev);
>  	ipvlan_count_rx(ipvlan, len, true, false);
>  
> diff --git a/drivers/net/ipvlan/ipvlan_main.c
> b/drivers/net/ipvlan/ipvlan_main.c
> index b837807..709f27d 100644
> --- a/drivers/net/ipvlan/ipvlan_main.c
> +++ b/drivers/net/ipvlan/ipvlan_main.c
> @@ -378,6 +378,7 @@ static const struct net_device_ops
> ipvlan_netdev_ops = {
>  	.ndo_start_xmit		= ipvlan_start_xmit,
>  	.ndo_fix_features	= ipvlan_fix_features,
>  	.ndo_change_rx_flags	= ipvlan_change_rx_flags,
> +	.ndo_set_mac_address	= eth_mac_addr,
>  	.ndo_set_rx_mode	= ipvlan_set_multicast_mac_filter,
>  	.ndo_get_stats64	= ipvlan_get_stats64,
>  	.ndo_vlan_rx_add_vid	= ipvlan_vlan_rx_add_vid,
> @@ -392,9 +393,10 @@ static int ipvlan_hard_header(struct sk_buff
> *skb, struct net_device *dev,
>  	const struct ipvl_dev *ipvlan = netdev_priv(dev);
>  	struct net_device *phy_dev = ipvlan->phy_dev;
>  
> -	/* TODO Probably use a different field than dev_addr so that
> the
> -	 * mac-address on the virtual device is portable and can be
> carried
> -	 * while the packets use the mac-addr on the physical
> device.
> +	/* This driver uses (almost exclusively) L3 addresses for
> +	 * routing/switching. Use the actual slave's MAC address,
> +	 * but overwrite it later during the packet processing for
> +	 * frames leaving from master
>  	 */
>  	return dev_hard_header(skb, phy_dev, type, daddr,
>  			       saddr ? : dev->dev_addr, len);
> @@ -559,11 +561,8 @@ int ipvlan_link_new(struct net *src_net, struct
> net_device *dev,
>  	/* Increment id-base to the next slot for the future
> assignment */
>  	port->dev_id_start = err + 1;
>  
> -	/* TODO Probably put random address here to be presented to
> the
> -	 * world but keep using the physical-dev address for the
> outgoing
> -	 * packets.
> -	 */
> -	memcpy(dev->dev_addr, phy_dev->dev_addr, ETH_ALEN);
> +	/* TODO: consider storing the original MAC address in dev-
> >perm_addr */
> +	eth_hw_addr_random(dev);
>  
>  	dev->priv_flags |= IFF_IPVLAN_SLAVE;
>  
> @@ -619,7 +618,8 @@ void ipvlan_link_setup(struct net_device *dev)
>  	ether_setup(dev);
>  
>  	dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE |
> IFF_TX_SKB_SHARING);
> -	dev->priv_flags |= IFF_UNICAST_FLT | IFF_NO_QUEUE;
> +	dev->priv_flags |= IFF_UNICAST_FLT | IFF_NO_QUEUE
> +			   | IFF_LIVE_ADDR_CHANGE;
>  	dev->netdev_ops = &ipvlan_netdev_ops;
>  	dev->destructor = free_netdev;
>  	dev->header_ops = &ipvlan_header_ops;

* Re: [PATCH v1 net-next 5/6] net: allow simultaneous SW and HW transmit timestamping
From: Miroslav Lichvar @ 2017-04-27 16:17 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, Richard Cochran, Willem de Bruijn,
	Soheil Hassas Yeganeh, Keller, Jacob E, Denny Page, Jiri Benc
In-Reply-To: <CAF=yD-+GSK491AWQx8=6yd3=-HHwxdWq677ubwdjbV5AXzRbog@mail.gmail.com>

On Wed, Apr 26, 2017 at 08:00:02PM -0400, Willem de Bruijn wrote:
> > +       if (!hwtstamps && !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW) &&
> > +           skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)
> > +               return;
> > +
> 
> This check should only happen for software transmit timestamps, so simpler to
> revise the check in sw_tx_timestamp above to
> 
>   if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
> -        !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
> +      (!(skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)) ||
> +      (skb->sk && skb->sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW)

Good point. This will avoid unnecessary calls to skb_tstamp_tx() in
the common case where SOF_TIMESTAMPING_OPT_TX_SWHW is not enabled.
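
To spell out the grouping I have in mind for that condition, here is a
small stand-alone sketch.  The SKETCH_* bits are local stand-ins for
SKBTX_SW_TSTAMP, SKBTX_IN_PROGRESS and SOF_TIMESTAMPING_OPT_TX_SWHW
(their values here are only illustrative), and the helper name is
hypothetical.  The parenthesization is my reading of the intent, not
the final patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the kernel flag bits (assumed values). */
#define SKETCH_TX_SW_TSTAMP	(1u << 1)
#define SKETCH_TX_IN_PROGRESS	(1u << 2)
#define SKETCH_OPT_TX_SWHW	(1u << 14)

/* Hypothetical predicate: should a software TX timestamp be generated? */
static bool want_sw_tx_tstamp(uint32_t tx_flags, uint32_t sk_tsflags)
{
	/* Software timestamping was not requested at all. */
	if (!(tx_flags & SKETCH_TX_SW_TSTAMP))
		return false;

	/* No hardware timestamp pending: software timestamp as before. */
	if (!(tx_flags & SKETCH_TX_IN_PROGRESS))
		return true;

	/* Hardware timestamp pending: only if the socket opted into both. */
	return (sk_tsflags & SKETCH_OPT_TX_SWHW) != 0;
}

/* A few rows of the truth table, usable as a quick self-check. */
static void sketch_truth_table(void)
{
	assert(!want_sw_tx_tstamp(0, SKETCH_OPT_TX_SWHW));
	assert(want_sw_tx_tstamp(SKETCH_TX_SW_TSTAMP, 0));
	assert(!want_sw_tx_tstamp(SKETCH_TX_SW_TSTAMP | SKETCH_TX_IN_PROGRESS, 0));
	assert(want_sw_tx_tstamp(SKETCH_TX_SW_TSTAMP | SKETCH_TX_IN_PROGRESS,
				 SKETCH_OPT_TX_SWHW));
}
```

Written this way, the option flag is consulted only when a hardware
timestamp is actually in progress.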

> > @@ -720,6 +720,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
> >                 empty = 0;
> >         if (shhwtstamps &&
> >             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
> > +           (empty || !skb_is_err_queue(skb)) &&
> >             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
> 
> I find skb->tstamp == 0 easier to understand than the condition on empty.
> 
> Indeed, this is so non-obvious that I would suggest another helper function
> skb_is_hwtx_tstamp with a concise comment about the race condition
> between tx software and hardware timestamps (as in the last sentence of
> the commit message).

Should it also include the skb_is_err_queue() check? If it returned
true for both TX and RX HW timestamps, maybe it could be called
skb_has_hw_tstamp?

-- 
Miroslav Lichvar

* [PATCH net-next] samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel
From: David Ahern @ 2017-04-27 16:11 UTC (permalink / raw)
  To: netdev; +Cc: ast, daniel, David Ahern

Add an option to xdp1 and xdp_tx_iptunnel to insert the xdp program in
SKB_MODE:
 - Update set_link_xdp_fd to take a flags argument that is added to the
   RTM_SETLINK message

 - Add -S option to xdp1 and xdp_tx_iptunnel user code. When it is passed,
   XDP_FLAGS_SKB_MODE is set in the flags arg passed to set_link_xdp_fd
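
As a rough illustration of the attribute layout described above, the
following stand-alone sketch builds a nested attribute holding an fd
and, only when non-zero, a flags word.  struct attr_hdr, the ATTR_*
macros and both helper names are local stand-ins invented for this
example; the numeric type values repeat the ones hard-coded in the
patch, and everything else is an assumption rather than the sample's
actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins mirroring the netlink attribute header (assumed layout). */
struct attr_hdr {
	uint16_t len;	/* header plus payload, unpadded */
	uint16_t type;
};

#define ATTR_HDRLEN	4u
#define ATTR_ALIGN(n)	(((n) + 3u) & ~3u)
#define ATTR_F_NESTED	0x8000u
#define ATTR_XDP	43u	/* value used for IFLA_XDP in the patch */
#define ATTR_XDP_FD	1u	/* value used for IFLA_XDP_FD */
#define ATTR_XDP_FLAGS	3u	/* value used for IFLA_XDP_FLAGS */

/* Write one 32-bit attribute at buf; return the aligned space it used. */
static size_t put_s32(uint8_t *buf, uint16_t type, int32_t value)
{
	struct attr_hdr h = {
		.len = (uint16_t)(ATTR_HDRLEN + sizeof(value)),
		.type = type,
	};

	memcpy(buf, &h, sizeof(h));
	memcpy(buf + ATTR_HDRLEN, &value, sizeof(value));
	return ATTR_ALIGN(h.len);
}

/* Build the nested XDP attribute; flags are appended only when set. */
static size_t build_xdp_attr(uint8_t *buf, size_t cap, int32_t fd,
			     uint32_t flags)
{
	struct attr_hdr nest = {
		.len = ATTR_HDRLEN,
		.type = (uint16_t)(ATTR_F_NESTED | ATTR_XDP),
	};

	/* Worst case: nest header plus two 32-bit attributes. */
	assert(cap >= ATTR_HDRLEN + 2 * (ATTR_HDRLEN + sizeof(int32_t)));

	nest.len = (uint16_t)(nest.len +
			      put_s32(buf + nest.len, ATTR_XDP_FD, fd));
	if (flags)
		nest.len = (uint16_t)(nest.len +
				      put_s32(buf + nest.len, ATTR_XDP_FLAGS,
					      (int32_t)flags));

	memcpy(buf, &nest, sizeof(nest));	/* header written last */
	return ATTR_ALIGN(nest.len);
}
```

The nest header is filled in last because its length is only known
once the optional flags attribute has or has not been appended.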

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 samples/bpf/bpf_load.c             | 19 +++++++++++++++---
 samples/bpf/bpf_load.h             |  2 +-
 samples/bpf/xdp1_user.c            | 40 ++++++++++++++++++++++++++++++--------
 samples/bpf/xdp_tx_iptunnel_user.c | 13 +++++++++----
 4 files changed, 58 insertions(+), 16 deletions(-)

diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
index 0d449d8032d1..d4433a47e6c3 100644
--- a/samples/bpf/bpf_load.c
+++ b/samples/bpf/bpf_load.c
@@ -563,7 +563,7 @@ struct ksym *ksym_search(long key)
 	return &syms[0];
 }
 
-int set_link_xdp_fd(int ifindex, int fd)
+int set_link_xdp_fd(int ifindex, int fd, int flags)
 {
 	struct sockaddr_nl sa;
 	int sock, seq = 0, len, ret = -1;
@@ -599,15 +599,28 @@ int set_link_xdp_fd(int ifindex, int fd)
 	req.nh.nlmsg_seq = ++seq;
 	req.ifinfo.ifi_family = AF_UNSPEC;
 	req.ifinfo.ifi_index = ifindex;
+
+	/* started nested attribute for XDP */
 	nla = (struct nlattr *)(((char *)&req)
 				+ NLMSG_ALIGN(req.nh.nlmsg_len));
 	nla->nla_type = NLA_F_NESTED | 43/*IFLA_XDP*/;
+	nla->nla_len = NLA_HDRLEN;
 
-	nla_xdp = (struct nlattr *)((char *)nla + NLA_HDRLEN);
+	/* add XDP fd */
+	nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
 	nla_xdp->nla_type = 1/*IFLA_XDP_FD*/;
 	nla_xdp->nla_len = NLA_HDRLEN + sizeof(int);
 	memcpy((char *)nla_xdp + NLA_HDRLEN, &fd, sizeof(fd));
-	nla->nla_len = NLA_HDRLEN + nla_xdp->nla_len;
+	nla->nla_len += nla_xdp->nla_len;
+
+	/* if user passed in any flags, add those too */
+	if (flags) {
+		nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
+		nla_xdp->nla_type = 3/*IFLA_XDP_FLAGS*/;
+		nla_xdp->nla_len = NLA_HDRLEN + sizeof(flags);
+		memcpy((char *)nla_xdp + NLA_HDRLEN, &flags, sizeof(flags));
+		nla->nla_len += nla_xdp->nla_len;
+	}
 
 	req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len);
 
diff --git a/samples/bpf/bpf_load.h b/samples/bpf/bpf_load.h
index 68f6b2d22507..6bfd75ec6a16 100644
--- a/samples/bpf/bpf_load.h
+++ b/samples/bpf/bpf_load.h
@@ -47,5 +47,5 @@ struct ksym {
 
 int load_kallsyms(void);
 struct ksym *ksym_search(long key);
-int set_link_xdp_fd(int ifindex, int fd);
+int set_link_xdp_fd(int ifindex, int fd, int flags);
 #endif
diff --git a/samples/bpf/xdp1_user.c b/samples/bpf/xdp1_user.c
index d2be65d1fd86..deb05e630d84 100644
--- a/samples/bpf/xdp1_user.c
+++ b/samples/bpf/xdp1_user.c
@@ -5,6 +5,7 @@
  * License as published by the Free Software Foundation.
  */
 #include <linux/bpf.h>
+#include <linux/if_link.h>
 #include <assert.h>
 #include <errno.h>
 #include <signal.h>
@@ -12,16 +13,18 @@
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
+#include <libgen.h>
 
 #include "bpf_load.h"
 #include "bpf_util.h"
 #include "libbpf.h"
 
 static int ifindex;
+static int flags;
 
 static void int_exit(int sig)
 {
-	set_link_xdp_fd(ifindex, -1);
+	set_link_xdp_fd(ifindex, -1, flags);
 	exit(0);
 }
 
@@ -54,18 +57,39 @@ static void poll_stats(int interval)
 	}
 }
 
-int main(int ac, char **argv)
+static void usage(const char *prog)
 {
-	char filename[256];
+	fprintf(stderr,
+		"usage: %s [OPTS] IFINDEX\n\n"
+		"OPTS:\n"
+		"    -S    use skb-mode\n",
+		prog);
+}
 
-	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+int main(int argc, char **argv)
+{
+	const char *optstr = "S";
+	char filename[256];
+	int opt;
+
+	while ((opt = getopt(argc, argv, optstr)) != -1) {
+		switch (opt) {
+		case 'S':
+			flags |= XDP_FLAGS_SKB_MODE;
+			break;
+		default:
+			usage(basename(argv[0]));
+			return 1;
+		}
+	}
 
-	if (ac != 2) {
-		printf("usage: %s IFINDEX\n", argv[0]);
+	if (optind == argc) {
+		usage(basename(argv[0]));
 		return 1;
 	}
+	ifindex = strtoul(argv[optind], NULL, 0);
 
-	ifindex = strtoul(argv[1], NULL, 0);
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
 
 	if (load_bpf_file(filename)) {
 		printf("%s", bpf_log_buf);
@@ -79,7 +103,7 @@ int main(int ac, char **argv)
 
 	signal(SIGINT, int_exit);
 
-	if (set_link_xdp_fd(ifindex, prog_fd[0]) < 0) {
+	if (set_link_xdp_fd(ifindex, prog_fd[0], flags) < 0) {
 		printf("link set xdp fd failed\n");
 		return 1;
 	}
diff --git a/samples/bpf/xdp_tx_iptunnel_user.c b/samples/bpf/xdp_tx_iptunnel_user.c
index 70e192fc61aa..cb2bda7b5346 100644
--- a/samples/bpf/xdp_tx_iptunnel_user.c
+++ b/samples/bpf/xdp_tx_iptunnel_user.c
@@ -5,6 +5,7 @@
  * License as published by the Free Software Foundation.
  */
 #include <linux/bpf.h>
+#include <linux/if_link.h>
 #include <assert.h>
 #include <errno.h>
 #include <signal.h>
@@ -28,7 +29,7 @@ static int ifindex = -1;
 static void int_exit(int sig)
 {
 	if (ifindex > -1)
-		set_link_xdp_fd(ifindex, -1);
+		set_link_xdp_fd(ifindex, -1, 0);
 	exit(0);
 }
 
@@ -136,12 +137,13 @@ int main(int argc, char **argv)
 {
 	unsigned char opt_flags[256] = {};
 	unsigned int kill_after_s = 0;
-	const char *optstr = "i:a:p:s:d:m:T:P:h";
+	const char *optstr = "i:a:p:s:d:m:T:P:Sh";
 	int min_port = 0, max_port = 0;
 	struct iptnl_info tnl = {};
 	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct vip vip = {};
 	char filename[256];
+	int flags = 0;
 	int opt;
 	int i;
 
@@ -201,6 +203,9 @@ int main(int argc, char **argv)
 		case 'T':
 			kill_after_s = atoi(optarg);
 			break;
+		case 'S':
+			flags |= XDP_FLAGS_SKB_MODE;
+			break;
 		default:
 			usage(argv[0]);
 			return 1;
@@ -243,14 +248,14 @@ int main(int argc, char **argv)
 		}
 	}
 
-	if (set_link_xdp_fd(ifindex, prog_fd[0]) < 0) {
+	if (set_link_xdp_fd(ifindex, prog_fd[0], flags) < 0) {
 		printf("link set xdp fd failed\n");
 		return 1;
 	}
 
 	poll_stats(kill_after_s);
 
-	set_link_xdp_fd(ifindex, -1);
+	set_link_xdp_fd(ifindex, -1, flags);
 
 	return 0;
 }
-- 
2.1.4

* Re: [PATCH net-next] can: fix build error without CONFIG_PROC_FS
From: Oliver Hartkopp @ 2017-04-27 16:11 UTC (permalink / raw)
  To: Marc Kleine-Budde, Arnd Bergmann, David S. Miller
  Cc: Thomas Gleixner, Cong Wang, Mario Kicherer, Eric Dumazet,
	linux-can, netdev, linux-kernel
In-Reply-To: <937e9144-c06c-d265-29eb-a1c6f96b6f89@pengutronix.de>

Hello Arnd,

many thanks for your patch.

Btw

 >  static void canbcm_pernet_exit(struct net *net)
 >  {
 > +#ifdef CONFIG_PROC_FS
 >  	/* remove /proc/net/can-bcm directory */
 >  	if (IS_ENABLED(CONFIG_PROC_FS)) {
 >  		if (net->can.bcmproc_dir)
 >  			remove_proc_entry("can-bcm", net->proc_net);
 >  	}
 > +#endif
 >  }

"if (IS_ENABLED(CONFIG_PROC_FS))"

becomes obsolete too then ...

So I would suggest taking my patch to fix my mistake ;-)

Best regards,
Oliver

On 04/27/2017 04:29 PM, Marc Kleine-Budde wrote:
> Hello Arnd,
>
> On 04/27/2017 04:21 PM, Arnd Bergmann wrote:
>> The procfs dir entry was added inside of an #ifdef, causing a build error
>> when we try to access it without CONFIG_PROC_FS set:
>>
>> net/can/bcm.c:1541:14: error: 'struct netns_can' has no member named 'bcmproc_dir'
>> net/can/bcm.c: In function 'bcm_connect':
>> net/can/bcm.c:1601:14: error: 'struct netns_can' has no member named 'bcmproc_dir'
>> net/can/bcm.c: In function 'canbcm_pernet_init':
>> net/can/bcm.c:1696:11: error: 'struct netns_can' has no member named 'bcmproc_dir'
>> net/can/bcm.c: In function 'canbcm_pernet_exit':
>> net/can/bcm.c:1707:15: error: 'struct netns_can' has no member named 'bcmproc_dir'
>>
>> This adds the same #ifdef around all users of the pointer. Alternatively
>> we could move the pointer outside of the #ifdef.
>>
>> Fixes: 384317ef4187 ("can: network namespace support for CAN_BCM protocol")
>> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>
> A fix for this problem is part of the pull request I send to David
> earlier today:
>
>     https://www.mail-archive.com/netdev@vger.kernel.org/msg165764.html
>
> regards,
> Marc
>

* Re: [PATCH net-next 8/9] ipvlan: improve compiler hints
From: David Miller @ 2017-04-27 16:05 UTC (permalink / raw)
  To: alexander.h.duyck
  Cc: marco.chiappero, netdev, jeffrey.t.kirsher, sainath.grandhi,
	maheshb
In-Reply-To: <B1C1DF2ACD01FD4881736AA51731BAB2B2EFB2@ORSMSX107.amr.corp.intel.com>

From: "Duyck, Alexander H" <alexander.h.duyck@intel.com>
Date: Thu, 27 Apr 2017 15:21:16 +0000

>> -----Original Message-----
>> From: Chiappero, Marco
>> Sent: Thursday, April 27, 2017 7:52 AM
>> To: netdev@vger.kernel.org
>> Cc: David S . Miller <davem@davemloft.net>; Kirsher, Jeffrey T
>> <jeffrey.t.kirsher@intel.com>; Duyck, Alexander H
>> <alexander.h.duyck@intel.com>; Grandhi, Sainath
>> <sainath.grandhi@intel.com>; Mahesh Bandewar <maheshb@google.com>;
>> Chiappero, Marco <marco.chiappero@intel.com>
>> Subject: [PATCH net-next 8/9] ipvlan: improve compiler hints
>> 
>> Extend inlining and branch prediction hints.
>> 
>> Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
>> ---
>>  drivers/net/ipvlan/ipvlan_core.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
>> index a9fc1b5..67e342d 100644
>> --- a/drivers/net/ipvlan/ipvlan_core.c
>> +++ b/drivers/net/ipvlan/ipvlan_core.c
>> @@ -88,7 +88,7 @@ void ipvlan_ht_addr_del(struct ipvl_addr *addr)
>>  	hlist_del_init_rcu(&addr->hlnode);
>>  }
>> 
>> -unsigned int ipvlan_mac_hash(const unsigned char *addr)
>> +inline unsigned int ipvlan_mac_hash(const unsigned char *addr)
>>  {
>>  	u32 hash = jhash_1word(__get_unaligned_cpu32(addr + 2),
>>  			       ipvlan_jhash_secret);
> 
> I'm kind of surprised this isn't causing a problem with differing
> declarations between the declaration here and the declaration in
> ipvlan.h. Normally for inlining something like this you would change
> it to a "static inline" and move the entire declaration into the
> header file.

No inlines in foo.c files please.  Seriously, let the compiler decide;
it knows better than you.

* Re: [PATCH net-next 6/6] bpf: show bpf programs
From: David Miller @ 2017-04-27 16:00 UTC (permalink / raw)
  To: hannes; +Cc: daniel, netdev, ast, daniel, jbenc, aconole
In-Reply-To: <5b1f23e3-86a7-69aa-91e2-1dc72125f22b@stressinduktion.org>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Thu, 27 Apr 2017 15:22:49 +0200

> Sure, that sounds super. But so far Linux and most (maybe I should write
> all) subsystems always provided some easy way to get the insights of the
> kernel without having to code or rely on special tools so far.

Not true.

You cannot fully dump socket TCP internal state without netlink based
tools.  It is just one of many examples.

Can you dump all nftables rules without a special tool?

I don't think this is a legitimate line of argument, and I want
this to be done via the bpf() system call which is what people
are working on.

Thanks.

* Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions
From: Logan Gunthorpe @ 2017-04-27 15:57 UTC (permalink / raw)
  To: Jason Gunthorpe, Christoph Hellwig
  Cc: linux-nvdimm-y27Ovi1pjclAfugRpC6u6w,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA,
	target-devel-u79uwXL29TY76Z2rM5mHXA, Sumit Semwal,
	devel-gWbeCf7V1WCQmaza687I9mD2FQJk+8+b, James E.J. Bottomley,
	linux-scsi-u79uwXL29TY76Z2rM5mHXA, Matthew Wilcox,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	open-iscsi-/JYPxA39Uh5TLH3MbocFFw,
	linux-media-u79uwXL29TY76Z2rM5mHXA,
	intel-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	sparmaintainer-GLv8BlqOqDDQT0dZR+AlfA,
	linux-raid-u79uwXL29TY76Z2rM5mHXA,
	megaraidlinux.pdl-dY08KVG/lbpWk0Htik3J/w, Jens Axboe,
	Martin K. Petersen, Greg Kroah-Hartman,
	linux-mmc-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-crypto-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20170427152720.GA7662-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>



On 27/04/17 09:27 AM, Jason Gunthorpe wrote:
> On Thu, Apr 27, 2017 at 08:53:38AM +0200, Christoph Hellwig wrote:
> How about first switching as many call sites as possible to use
> sg_copy_X_buffer instead of kmap?

Yeah, I could look at doing that first.

One problem is that we might get more NAKs like Herbert Xu's, from
people concerned about the performance implications.

These are definitely a bit more invasive changes than thin wrappers
around kmap calls.

> A random audit of Logan's series suggests this is actually a fairly
> common thing.

It's not _that_ common, but it is a significant fraction. One of my
patches actually did this to two places that seemed to be reimplementing
the sg_copy_X_buffer logic.

Thanks,

Logan

* Re: [PATCH v6 1/5] skbuff: return -EMSGSIZE in skb_to_sgvec to prevent overflow
From: David Miller @ 2017-04-27 15:54 UTC (permalink / raw)
  To: Jason; +Cc: netdev, linux-kernel, David.Laight, kernel-hardening
In-Reply-To: <CAHmME9qDmcvzF_xeaxegC2RpBOs8PziJOaKEqv6Z_X1pUFbR0w@mail.gmail.com>

From: "Jason A. Donenfeld" <Jason@zx2c4.com>
Date: Thu, 27 Apr 2017 11:21:51 +0200

> Hey Dave,
> 
> David Laight and I have been discussing offlist. It occurred to both
> of us that this could just be turned into a loop because perhaps this
> is actually just tail-recursive. Upon further inspection, however, the
> way the current algorithm works, it's possible that each of the
> fraglist skbs has its own fraglist, which would make this into tree
> recursion, which is why in the first place I wanted to place that
> limit on it. If that's the case, then the patch I proposed above is
> the best way forward. However, perhaps there's the chance that
> fraglist skbs having separate fraglists are actually forbidden? Is
> this the case? Are there other parts of the API that enforce this
> contract? Is it something we could safely rely on here? If you say
> yes, I'll send a v7 that makes this into a non-recursive loop.

As Sabrina showed, it can happen.  There are no such restrictions on
the geometry of an SKB.
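
To make the shape of the problem concrete, here is a toy user-space
sketch of a depth-limited walk over such a tree.  struct buf,
SKETCH_MAX_DEPTH and count_segments() are hypothetical names for this
example only; it is not the skb_to_sgvec() code, just the same idea of
bailing out with -EMSGSIZE once the nesting gets too deep.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_DEPTH 24	/* hypothetical nesting limit */

/* Toy buffer: each one may carry a list of children, and every child
 * may again carry its own list, so the geometry is a tree.
 */
struct buf {
	int nfrags;		/* page fragments held directly */
	struct buf *frag_list;	/* first child, or NULL */
	struct buf *next;	/* next sibling, or NULL */
};

/* Count the segments needed for b and everything below it, or return
 * -EMSGSIZE when the nesting exceeds SKETCH_MAX_DEPTH.
 */
static int count_segments(const struct buf *b, unsigned int depth)
{
	const struct buf *child;
	int total, ret;

	assert(b != NULL);

	if (depth >= SKETCH_MAX_DEPTH)
		return -EMSGSIZE;

	total = 1 + b->nfrags;	/* linear part plus own fragments */
	for (child = b->frag_list; child; child = child->next) {
		ret = count_segments(child, depth + 1);
		if (ret < 0)
			return ret;	/* propagate the error upwards */
		total += ret;
	}
	return total;
}
```

Because every child can branch again, the walk cannot be flattened
into a simple loop over one list, which is why the depth check sits in
the recursive call itself.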

* Re: rhashtable - Cap total number of entries to 2^31
From: David Miller @ 2017-04-27 15:48 UTC (permalink / raw)
  To: herbert; +Cc: fw, netdev, tgraf
In-Reply-To: <20170427054451.GA529@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 27 Apr 2017 13:44:51 +0800

> When max_size is not set or if it set to a sufficiently large
> value, the nelems counter can overflow.  This would cause havoc
> with the automatic shrinking as it would then attempt to fit a
> huge number of entries into a tiny hash table.
> 
> This patch fixes this by adding max_elems to struct rhashtable
> to cap the number of elements.  This is set to 2^31 as nelems is
> not a precise count.  This is sufficiently smaller than UINT_MAX
> that it should be safe.
> 
> When max_size is set max_elems will be lowered to at most twice
> max_size as is the status quo.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied to net-next, thanks Herbert.
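
For readers skimming the thread, the capping rule described in the
commit message can be sketched in user space roughly as follows.
struct table_sketch and both function names are made up for this
example, and the -E2BIG return is an assumption about how a full table
might be reported; only the 2^31 cap and the "twice max_size" rule
come from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, stripped-down table bookkeeping. */
struct table_sketch {
	uint32_t max_size;	/* bucket limit, 0 means "not set" */
	uint32_t max_elems;	/* cap on the element counter */
	uint32_t nelems;	/* current element count */
};

static void table_init(struct table_sketch *t, uint32_t max_size)
{
	t->max_size = max_size;
	t->nelems = 0;

	/* Default cap: 2^31, comfortably below UINT32_MAX. */
	t->max_elems = UINT32_C(1) << 31;

	/* With max_size set, lower the cap to at most twice max_size. */
	if (max_size && max_size < t->max_elems / 2)
		t->max_elems = max_size * 2;
}

/* Account for one insertion; refuse it once the cap is reached. */
static int table_account_insert(struct table_sketch *t)
{
	assert(t->nelems <= t->max_elems);

	if (t->nelems >= t->max_elems)
		return -E2BIG;

	t->nelems++;
	return 0;
}
```

With the counter bounded this way it can no longer wrap around, so the
shrinking logic never sees a bogus small element count.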

* Re: [PATCH net-next] tcp: tcp_rack_reo_timeout() must update tp->tcp_mstamp
From: David Miller @ 2017-04-27 15:46 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, soheil, ncardwell, ycheng
In-Reply-To: <1493266255.6453.103.camel@edumazet-glaptop3.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 26 Apr 2017 21:10:55 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> I wrongly assumed tp->tcp_mstamp was up to date at the time
> tcp_rack_reo_timeout() was called.
> 
> It is not true, since we only update tcp->tcp_mstamp when receiving
> a packet (as initially done in commit 69e996c58a35 ("tcp: add
> tp->tcp_mstamp field")
> 
> tcp_rack_reo_timeout() being called by a timer and not an incoming
> packet, we need to refresh tp->tcp_mstamp
> 
> Fixes: 7c1c7308592f ("tcp: do not pass timestamp to tcp_rack_detect_loss()")
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.
