Netdev List


* Re: [PATCH v2] net: amd-xgbe: Get rid of custom hex_dump_to_buffer()
From: David Miller @ 2017-12-20 18:05 UTC (permalink / raw)
  To: andriy.shevchenko; +Cc: thomas.lendacky, netdev
In-Reply-To: <20171219212215.11561-1-andriy.shevchenko@linux.intel.com>

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Tue, 19 Dec 2017 23:22:15 +0200

> Get rid of yet another custom hex_dump_to_buffer().
> 
> The output is slightly changed, i.e. each byte is followed by white space.
> 
> Note, we don't use print_hex_dump() here since the original code uses
> netdev_dbg().
> 
> Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Applied to net-next.
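
For readers who want to picture the output change described in the commit
message (one space-separated pair of hex digits per byte), here is a minimal
user-space sketch. hex_dump_line() is a hypothetical helper written for this
note. It is not the kernel's hex_dump_to_buffer() and does not try to
reproduce its row size, grouping or ASCII options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical user-space helper (not a kernel API): format bytes from
 * buf as two hex digits each, separated by single spaces, into out.
 * Stops early when out is too small. Returns the number of bytes that
 * were formatted; out is NUL-terminated whenever outlen > 0.
 */
static size_t hex_dump_line(const unsigned char *buf, size_t len,
			    char *out, size_t outlen)
{
	size_t i, pos = 0;

	assert(buf != NULL && out != NULL);
	if (outlen == 0)
		return 0;
	out[0] = '\0';

	for (i = 0; i < len; i++) {
		/* Two digits per byte, plus a separator after the first byte. */
		size_t need = i ? 3 : 2;

		/* Keep one byte of room for the terminating NUL. */
		if (pos + need + 1 > outlen)
			break;
		pos += (size_t)snprintf(out + pos, outlen - pos,
					i ? " %02x" : "%02x", buf[i]);
	}
	return i;
}
```

Calling it on the four bytes de ad be ef with a large enough buffer yields
the string "de ad be ef"; with a 6-byte buffer only the first two bytes fit.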

^ permalink raw reply

* Re: [PATCH v5 3/6] perf: implement pmu perf_kprobe
From: Song Liu @ 2017-12-20 18:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, mingo@redhat.com, David Miller,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Daniel Borkmann, Kernel Team
In-Reply-To: <20171220101426.cquh5uv4bgj4bk7y@hirez.programming.kicks-ass.net>


> On Dec 20, 2017, at 2:14 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Wed, Dec 20, 2017 at 11:03:01AM +0100, Peter Zijlstra wrote:
>> On Wed, Dec 06, 2017 at 02:45:15PM -0800, Song Liu wrote:
>>> @@ -8537,7 +8620,7 @@ static int perf_event_set_filter(struct perf_event *event, void __user *arg)
>>> 	char *filter_str;
>>> 	int ret = -EINVAL;
>>> 
>>> -	if ((event->attr.type != PERF_TYPE_TRACEPOINT ||
>>> +	if ((!perf_event_is_tracing(event) ||
>>> 	     !IS_ENABLED(CONFIG_EVENT_TRACING)) &&
>>> 	    !has_addr_filter(event))
>>> 		return -EINVAL;
>> 
>> You actually missed an instance later in this same function... fixing
>> that.
> 
> 
> @@ -8518,23 +8601,19 @@ perf_event_set_addr_filter(struct perf_e
> 
> static int perf_event_set_filter(struct perf_event *event, void __user *arg)
> {
> -	char *filter_str;
> 	int ret = -EINVAL;
> -
> -	if ((event->attr.type != PERF_TYPE_TRACEPOINT ||
> -	    !IS_ENABLED(CONFIG_EVENT_TRACING)) &&
> -	    !has_addr_filter(event))
> -		return -EINVAL;
> +	char *filter_str;
> 
> 	filter_str = strndup_user(arg, PAGE_SIZE);
> 	if (IS_ERR(filter_str))
> 		return PTR_ERR(filter_str);
> 
> -	if (IS_ENABLED(CONFIG_EVENT_TRACING) &&
> -	    event->attr.type == PERF_TYPE_TRACEPOINT)
> -		ret = ftrace_profile_set_filter(event, event->attr.config,
> -						filter_str);
> -	else if (has_addr_filter(event))
> +#ifdef CONFIG_EVENT_TRACING
> +	if (perf_event_is_tracing(event))
> +		ret = ftrace_profile_set_filter(event, event->attr.config, filter_str);
> +	else
> +#endif
> +	if (has_addr_filter(event))
> 		ret = perf_event_set_addr_filter(event, filter_str);
> 
> 	kfree(filter_str);
> 
> 
> 
> Is that right?

Yeah, this is right and neat. Thanks a lot for your help on this. 

I think there is one more thing to change:

diff --git i/kernel/events/core.c w/kernel/events/core.c
index a906f30..516ff9b 100644
--- i/kernel/events/core.c
+++ w/kernel/events/core.c
@@ -8226,7 +8226,7 @@ static int perf_event_set_bpf_prog(struct perf_event *event, u32 prog_fd)

 static void perf_event_free_bpf_prog(struct perf_event *event)
 {
-       if (event->attr.type != PERF_TYPE_TRACEPOINT) {
+       if (!perf_event_is_tracing(event)) {
                perf_event_free_bpf_handler(event);
                return;
        }

Thanks,
Song

^ permalink raw reply related

* Re: [PATCH net v2] openvswitch: Fix pop_vlan action for double tagged frames
From: Eric Garver @ 2017-12-20 18:13 UTC (permalink / raw)
  To: Jiri Benc; +Cc: netdev, ovs-dev
In-Reply-To: <20171220184117.19dfa575@redhat.com>

On Wed, Dec 20, 2017 at 06:41:17PM +0100, Jiri Benc wrote:
> On Wed, 20 Dec 2017 10:39:32 -0500, Eric Garver wrote:
> > +	if (is_flow_key_valid(key) && key->eth.vlan.tci && key->eth.cvlan.tci)
> 
> Maybe (key->eth.vlan.tci & htons(VLAN_TAG_PRESENT)) for consistency
> with the rest of the code? But it's just nitpicking.
> 
> The real problem here is when a double tagged packet leaves the ovs
> bridge, it won't have the skb->protocol that the kernel expects: it
> will be ethertype of the payload, while my understanding is it should
> be the inner tpid, right?

Right.

> 
> This patch fixes that nicely for the pop vlan case. But what about
> other cases? It seems to me that we need to add the logic to
> key_extract.
> 

The part I was missing is that, before encap into the L3 tunnel, all the
VLAN tags must be explicitly popped.

Setting skb->protocol to the TPID for double tagged frames means the pop
operations shift the payload ethertype into skb->protocol.

I'll send a v3 that does this in key_extract().
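
The ethertype "shift" discussed above can be pictured with a toy model. This
is only a sketch under simplifying assumptions: struct toy_frame,
toy_protocol() and toy_pop_vlan() are hypothetical names invented for this
note, the frame is reduced to a list of ethertypes, and none of it is the
actual sk_buff or openvswitch code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_IP     0x0800
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

/* Toy frame: ethertypes only, outermost first; the last one is the payload. */
struct toy_frame {
	uint16_t types[3];
	size_t ntypes;
};

static int is_vlan_tpid(uint16_t t)
{
	return t == ETH_P_8021Q || t == ETH_P_8021AD;
}

/*
 * Value the toy "skb->protocol" should hold: the ethertype that follows
 * the outermost VLAN tag, or the payload ethertype if the frame is
 * untagged. For a double tagged frame this is the inner TPID.
 */
static uint16_t toy_protocol(const struct toy_frame *f)
{
	assert(f->ntypes >= 1);
	if (f->ntypes >= 2 && is_vlan_tpid(f->types[0]))
		return f->types[1];
	return f->types[0];
}

/* Remove the outermost VLAN tag. Returns 0 on success, -1 if untagged. */
static int toy_pop_vlan(struct toy_frame *f)
{
	size_t i;

	if (f->ntypes < 2 || !is_vlan_tpid(f->types[0]))
		return -1;
	for (i = 1; i < f->ntypes; i++)
		f->types[i - 1] = f->types[i];
	f->ntypes--;
	return 0;
}
```

In this model a frame tagged 0x88A8 / 0x8100 carrying IPv4 starts with
protocol 0x8100, and the first pop moves the payload ethertype 0x0800 into
that slot, which is the behaviour the reply above relies on.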

^ permalink raw reply

* Re: RCU callback crashes
From: Cong Wang @ 2017-12-20 18:17 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jiri Pirko, netdev@vger.kernel.org
In-Reply-To: <20171219223404.03786d66@cakuba.netronome.com>

On Tue, Dec 19, 2017 at 10:34 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> Ah, no object debug but KASAN on produces this:
>


I bet it is an ingress qdisc which is being freed?



> [   39.268209] BUG: KASAN: use-after-free in cpu_needs_another_gp+0x246/0x2b0
> [   39.275965] Read of size 8 at addr ffff8803aa64f138 by task swapper/13/0
> [   39.283524]
> [   39.285256] CPU: 13 PID: 0 Comm: swapper/13 Not tainted 4.15.0-rc3-perf-00955-g1d0b01347dd5-dirty #8
> [   39.295535] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
> [   39.303969] Call Trace:
> [   39.306769]  <IRQ>
> [   39.309088]  dump_stack+0xa6/0x118
> [   39.312957]  ? _atomic_dec_and_lock+0xe8/0xe8
> [   39.317895]  ? cpu_needs_another_gp+0x246/0x2b0
> [   39.323030]  print_address_description+0x6a/0x270
> [   39.328380]  ? cpu_needs_another_gp+0x246/0x2b0
> [   39.333510]  kasan_report+0x23f/0x350
> [   39.337672]  cpu_needs_another_gp+0x246/0x2b0
> ...
> [   39.383026]  rcu_process_callbacks+0x1a0/0x620
> ...


This is confusing.

I guess it is q->miniqp which is freed in qdisc_graft() without properly
waiting for rcu readers?


> [   39.426713]  __do_softirq+0x17f/0x4de
> ...
> [   39.463841]  irq_exit+0xe1/0xf0
> [   39.467437]  smp_apic_timer_interrupt+0xd9/0x290
> [   39.472685]  ? smp_call_function_single_interrupt+0x230/0x230
> [   39.479195]  ? smp_reschedule_interrupt+0x240/0x240
> [   39.484736]  apic_timer_interrupt+0x8c/0xa0
> [   39.489497]  </IRQ>
> [   39.491929] RIP: 0010:cpuidle_enter_state+0x12a/0x510
> [   39.497660] RSP: 0018:ffff88086bf9fd08 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
> [   39.506228] RAX: 0000000000000000 RBX: ffffe8ffffb060e0 RCX: ffffffff921329f5
> [   39.514291] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff88086f3246e8
> [   39.522354] RBP: 1ffff1010d7f3fa6 R08: fffffbfff2742768 R09: fffffbfff2742768
> [   39.530418] R10: ffff88086bf9fcc8 R11: fffffbfff2742767 R12: 0000000924148b4b
> [   39.538480] R13: 0000000000000004 R14: 0000000000000004 R15: ffffffff9383eb80
> [   39.546545]  ? sched_idle_set_state+0x25/0x30
> [   39.551502]  ? cpuidle_enter_state+0x106/0x510
> [   39.556556]  ? cpuidle_enter_s2idle+0x130/0x130
> [   39.561706]  ? rcu_eqs_enter_common.constprop.62+0xd1/0x1e0
> [   39.568037]  ? rcu_gp_init+0xf70/0xf70
> [   39.572331]  ? sched_set_stop_task+0x160/0x160
> [   39.577384]  do_idle+0x1af/0x200
> [   39.581076]  cpu_startup_entry+0xd2/0xe0
> [   39.585545]  ? cpu_in_idle+0x20/0x20
> [   39.589626]  ? _raw_spin_trylock+0xe0/0xe0
> [   39.594292]  ? memcpy+0x34/0x50
> [   39.597890]  start_secondary+0x271/0x2b0
> [   39.602361]  ? set_cpu_sibling_map+0x840/0x840
> [   39.607416]  secondary_startup_64+0xa5/0xb0
> [   39.612180]
> [   39.613929] Allocated by task 1358:
> [   39.617914]  __kmalloc_node+0x183/0x2c0
> [   39.622290]  qdisc_alloc+0xbd/0x3f0
> [   39.626274]  qdisc_create+0xd8/0x720
> [   39.630355]  tc_modify_qdisc+0x657/0x910
> [   39.634826]  rtnetlink_rcv_msg+0x37c/0x7e0
> [   39.639491]  netlink_rcv_skb+0x122/0x230
> [   39.643960]  netlink_unicast+0x2ae/0x360
> [   39.648443]  netlink_sendmsg+0x5d5/0x620
> [   39.652915]  sock_sendmsg+0x64/0x80
> [   39.656900]  ___sys_sendmsg+0x4a8/0x500
> [   39.661272]  __sys_sendmsg+0xa9/0x140
> [   39.665450]  entry_SYSCALL_64_fastpath+0x1e/0x81
> [   39.670695]
> [   39.672441] Freed by task 1370:
> [   39.676052]  kfree+0x8d/0x1c0
> [   39.679454]  qdisc_graft+0x208/0x670
> [   39.683535]  tc_get_qdisc+0x229/0x350
> [   39.687713]  rtnetlink_rcv_msg+0x37c/0x7e0
> [   39.692411]  netlink_rcv_skb+0x122/0x230
> [   39.696881]  netlink_unicast+0x2ae/0x360
> [   39.701350]  netlink_sendmsg+0x5d5/0x620
> [   39.705819]  sock_sendmsg+0x64/0x80
> [   39.709801]  ___sys_sendmsg+0x4a8/0x500
> [   39.714172]  __sys_sendmsg+0xa9/0x140
> [   39.718351]  entry_SYSCALL_64_fastpath+0x1e/0x81
> [   39.723597]
> [   39.725347] The buggy address belongs to the object at ffff8803aa64ef80
> [   39.725347]  which belongs to the cache kmalloc-512 of size 512
> [   39.739453] The buggy address is located 440 bytes inside of
> [   39.739453]  512-byte region [ffff8803aa64ef80, ffff8803aa64f180)
> [   39.752684] The buggy address belongs to the page:
> [   39.758127] page:0000000042b3124b count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
> [   39.769222] flags: 0x2ffff0000008100(slab|head)
> [   39.774365] raw: 02ffff0000008100 0000000000000000 0000000000000000 0000000180190019
> [   39.783129] raw: dead000000000100 dead000000000200 ffff8803afc0ed80 0000000000000000
> [   39.791986] page dumped because: kasan: bad access detected
> [   39.798300]
> [   39.800063] Memory state around the buggy address:
> [   39.805503]  ffff8803aa64f000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [   39.813684]  ffff8803aa64f080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [   39.821866] >ffff8803aa64f100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [   39.830045]                                         ^
> [   39.835778]  ffff8803aa64f180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   39.843958]  ffff8803aa64f200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>

^ permalink raw reply

* Re: RCU callback crashes
From: Cong Wang @ 2017-12-20 18:31 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jiri Pirko, netdev@vger.kernel.org
In-Reply-To: <CAM_iQpWUjfv2-Sirmdb5WfV4pZ4uF0m7=HR5YGWaKxb4KHp8gQ@mail.gmail.com>

On Wed, Dec 20, 2017 at 10:17 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> I guess it is q->miniqp which is freed in qdisc_graft() without properly
> waiting for rcu readers?

That is probably it: the call_rcu_bh(&miniq_old->rcu, mini_qdisc_rcu_func)
at the end of mini_qdisc_pair_swap() is invoked on miniq_old->rcu,
but miniq is being freed, and no rcu barrier waits for it...

You can try adding an rcu_barrier_bh() at the end to see if this crash
goes away, but I don't think people like adding yet another rcu barrier...

^ permalink raw reply

* Re: [RFC PATCH net-next] tools/bpf: fix build with binutils >= 2.28
From: Roman Gushchin @ 2017-12-20 18:32 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: netdev, linux-kernel, kernel-team, Jakub Kicinski,
	Alexei Starovoitov, Daniel Borkmann
In-Reply-To: <140ea0e9-964f-6f62-0721-88c9f8905cd3@netronome.com>

On Tue, Dec 19, 2017 at 04:22:51PM +0000, Quentin Monnet wrote:
> 2017-12-19 16:10 UTC+0000 ~ Roman Gushchin <guro@fb.com>
> > On Tue, Dec 19, 2017 at 03:57:02PM +0000, Quentin Monnet wrote:
> >> Hi Roman, thanks for working on this!
> >>
> >>
> >> I discussed this issue with Jakub recently, and one suggestion he had
> >> was to look in tools/build/feature to add a new "feature", by trying to
> >> compile short programs, for making the distinction between binutils
> >> versions. It probably requires more work, but could be more robust than
> >> parsing the version from the command line?
> > 
> > Hm, might be an option. Parsing readelf output is pretty ugly, here I agree.
> > In general it feels more like a binutils issue, so we have to workaround it
> > in either way.
> > 
> > Is Jakub or someone else working on it?
> > 
> > Thanks!
> > 
> 
> Jakub isn't. On our side, I noticed last week that there was this change
> in binutils, and started to have a look at how these "features" work.
> But I have nothing that works so far, so feel free to tackle this.
> 
> Quentin

Hi Quentin!

Can you please check that the patch below works in your environment?

Thanks!

--


>From b08deabf42e4c143b9e0eec8c49714e4d2c928e3 Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro@fb.com>
Date: Wed, 20 Dec 2017 13:27:32 +0000
Subject: [RFC PATCH net-next] tools/bpftool: fix bpftool build with binutils
 >= 2.28

Bpftool build is broken with binutils version 2.28 and later.
The cause is commit 003ca0fd2286 ("Refactor disassembler selection")
in the binutils repo, which changed the disassembler() function
signature.

Fix this by adding a new "feature" to the tools/build/features
infrastructure and making it responsible for deciding which
disassembler() function signature to use.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
---
 tools/bpf/Makefile                      | 18 ++++++++++++++++++
 tools/bpf/bpf_jit_disasm.c              |  7 +++++++
 tools/bpf/bpftool/Makefile              | 13 +++++++++++++
 tools/bpf/bpftool/jit_disasm.c          |  7 +++++++
 tools/build/feature/Makefile            |  4 ++++
 tools/build/feature/test-disassembler.c | 15 +++++++++++++++
 6 files changed, 64 insertions(+)
 create mode 100644 tools/build/feature/test-disassembler.c

diff --git a/tools/bpf/Makefile b/tools/bpf/Makefile
index 07a6697466ef..c62b3a311486 100644
--- a/tools/bpf/Makefile
+++ b/tools/bpf/Makefile
@@ -9,6 +9,24 @@ MAKE = make
 CFLAGS += -Wall -O2
 CFLAGS += -D__EXPORTED_HEADERS__ -I../../include/uapi -I../../include
 
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(CURDIR)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+endif
+
+FEATURE_TESTS = disassembler
+FEATURE_DISPLAY = disassembler
+
+ifeq ($(FEATURES_DUMP),)
+include $(srctree)/tools/build/Makefile.feature
+else
+include $(FEATURES_DUMP)
+endif
+
+ifeq ($(feature-disassembler), 1)
+CFLAGS += -DNEW_DISSASSEMBLER_SIGNATURE
+endif
+
 %.yacc.c: %.y
 	$(YACC) -o $@ -d $<
 
diff --git a/tools/bpf/bpf_jit_disasm.c b/tools/bpf/bpf_jit_disasm.c
index 75bf526a0168..a5f4dbacdb11 100644
--- a/tools/bpf/bpf_jit_disasm.c
+++ b/tools/bpf/bpf_jit_disasm.c
@@ -72,7 +72,14 @@ static void get_asm_insns(uint8_t *image, size_t len, int opcodes)
 
 	disassemble_init_for_target(&info);
 
+#ifdef NEW_DISSASSEMBLER_SIGNATURE
+	disassemble = disassembler(bfd_get_arch(bfdf),
+				   bfd_big_endian(bfdf),
+				   bfd_get_mach(bfdf),
+				   bfdf);
+#else
 	disassemble = disassembler(bfdf);
+#endif
 	assert(disassemble);
 
 	do {
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index 3f17ad317512..9c089cfa5f3f 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -43,6 +43,19 @@ LIBS = -lelf -lbfd -lopcodes $(LIBBPF)
 INSTALL ?= install
 RM ?= rm -f
 
+FEATURE_TESTS = disassembler
+FEATURE_DISPLAY = disassembler
+
+ifeq ($(FEATURES_DUMP),)
+include $(srctree)/tools/build/Makefile.feature
+else
+include $(FEATURES_DUMP)
+endif
+
+ifeq ($(feature-disassembler), 1)
+CFLAGS += -DNEW_DISSASSEMBLER_SIGNATURE
+endif
+
 include $(wildcard *.d)
 
 all: $(OUTPUT)bpftool
diff --git a/tools/bpf/bpftool/jit_disasm.c b/tools/bpf/bpftool/jit_disasm.c
index 1551d3918d4c..8295e2f14ed7 100644
--- a/tools/bpf/bpftool/jit_disasm.c
+++ b/tools/bpf/bpftool/jit_disasm.c
@@ -107,7 +107,14 @@ void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes)
 
 	disassemble_init_for_target(&info);
 
+#ifdef NEW_DISSASSEMBLER_SIGNATURE
+	disassemble = disassembler(bfd_get_arch(bfdf),
+				   bfd_big_endian(bfdf),
+				   bfd_get_mach(bfdf),
+				   bfdf);
+#else
 	disassemble = disassembler(bfdf);
+#endif
 	assert(disassemble);
 
 	if (json_output)
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 96982640fbf8..91f937943918 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -13,6 +13,7 @@ FILES=                                          \
          test-hello.bin                         \
          test-libaudit.bin                      \
          test-libbfd.bin                        \
+         test-disassembler.bin                  \
          test-liberty.bin                       \
          test-liberty-z.bin                     \
          test-cplus-demangle.bin                \
@@ -188,6 +189,9 @@ $(OUTPUT)test-libpython-version.bin:
 $(OUTPUT)test-libbfd.bin:
 	$(BUILD) -DPACKAGE='"perf"' -lbfd -lz -liberty -ldl
 
+$(OUTPUT)test-disassembler.bin:
+	$(BUILD) -lopcodes
+
 $(OUTPUT)test-liberty.bin:
 	$(CC) $(CFLAGS) -Wall -Werror -o $@ test-libbfd.c -DPACKAGE='"perf"' $(LDFLAGS) -lbfd -ldl -liberty
 
diff --git a/tools/build/feature/test-disassembler.c b/tools/build/feature/test-disassembler.c
new file mode 100644
index 000000000000..45ce65cfddf0
--- /dev/null
+++ b/tools/build/feature/test-disassembler.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <bfd.h>
+#include <dis-asm.h>
+
+int main(void)
+{
+	bfd *abfd = bfd_openr(NULL, NULL);
+
+	disassembler(bfd_get_arch(abfd),
+		     bfd_big_endian(abfd),
+		     bfd_get_mach(abfd),
+		     abfd);
+
+	return 0;
+}
-- 
2.14.3

^ permalink raw reply related

* Re: [PATCH net 0/2] cls_bpf: fix offload state tracking with block callbacks
From: David Miller @ 2017-12-20 18:32 UTC (permalink / raw)
  To: jakub.kicinski; +Cc: netdev, daniel, jiri, oss-drivers
In-Reply-To: <20171219213214.1084-1-jakub.kicinski@netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Tue, 19 Dec 2017 13:32:12 -0800

> After introduction of block callbacks classifiers can no longer track
> offload state.  cls_bpf used to do that in an attempt to move common
> code from drivers to the core.  Remove that functionality and fix
> drivers.
> 
> The user-visible bug this is fixing is that trying to offload a second
> filter would trigger a spurious DESTROY and in turn disable the already
> installed one.

Series applied, thanks for the heads up about the net-next conflict.

^ permalink raw reply

* [PATCH bpf-next] tools/bpf: adjust rlimit RLIMIT_MEMLOCK for test_dev_cgroup
From: Yonghong Song @ 2017-12-20 18:37 UTC (permalink / raw)
  To: ast, daniel, guro, netdev; +Cc: kernel-team

The default rlimit RLIMIT_MEMLOCK is 64KB. In certain cases,
e.g. in a test machine mimicking our production system, this test may
fail because the memory required for prog load cannot be charged:

  $ ./test_dev_cgroup
  libbpf: load bpf program failed: Operation not permitted
  libbpf: failed to load program 'cgroup/dev'
  libbpf: failed to load object './dev_cgroup.o'
  Failed to load DEV_CGROUP program
  ...

Changing the default rlimit RLIMIT_MEMLOCK to unlimited
makes the test pass.

This patch also fixes a problem: when bpf_prog_load() fails,
cleanup_cgroup_environment() should not be called, since
setup_cgroup_environment() has not been invoked. Otherwise,
the following confusing message will appear:
  ...
  (/home/yhs/local/linux/tools/testing/selftests/bpf/cgroup_helpers.c:95:
   errno: No such file or directory) Opening Cgroup Procs: /mnt/cgroup.procs
  ...

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/test_dev_cgroup.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_dev_cgroup.c b/tools/testing/selftests/bpf/test_dev_cgroup.c
index 02c85d6..c1535b3 100644
--- a/tools/testing/selftests/bpf/test_dev_cgroup.c
+++ b/tools/testing/selftests/bpf/test_dev_cgroup.c
@@ -10,6 +10,8 @@
 #include <string.h>
 #include <errno.h>
 #include <assert.h>
+#include <sys/time.h>
+#include <sys/resource.h>

 #include <linux/bpf.h>
 #include <bpf/bpf.h>
@@ -23,15 +25,19 @@

 int main(int argc, char **argv)
 {
+	struct rlimit limit  = { RLIM_INFINITY, RLIM_INFINITY };
 	struct bpf_object *obj;
 	int error = EXIT_FAILURE;
 	int prog_fd, cgroup_fd;
 	__u32 prog_cnt;

+	if (setrlimit(RLIMIT_MEMLOCK, &limit) < 0)
+		perror("Unable to lift memlock rlimit");
+
 	if (bpf_prog_load(DEV_CGROUP_PROG, BPF_PROG_TYPE_CGROUP_DEVICE,
 			  &obj, &prog_fd)) {
 		printf("Failed to load DEV_CGROUP program\n");
-		goto err;
+		goto out;
 	}

 	if (setup_cgroup_environment()) {
@@ -89,5 +95,6 @@ int main(int argc, char **argv)
 err:
 	cleanup_cgroup_environment();

+out:
 	return error;
 }
-- 
2.9.5

^ permalink raw reply related

* Re: [pull request][net 00/14] Mellanox, mlx5 fixes 2017-12-19
From: David Miller @ 2017-12-20 18:42 UTC (permalink / raw)
  To: saeedm; +Cc: netdev
In-Reply-To: <20171219222456.29627-1-saeedm@mellanox.com>

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Wed, 20 Dec 2017 00:24:42 +0200

> The following series includes some fixes for mlx5 core and ethernet
> driver.
> 
> Please pull and let me know if there is any problem.

Pulled.

> For -stable:
> 
> kernels >= v4.7.y
>     ("net/mlx5e: Fix possible deadlock of VXLAN lock")
>     ("net/mlx5e: Add refcount to VXLAN structure")
>     ("net/mlx5e: Prevent possible races in VXLAN control flow")
>     ("net/mlx5e: Fix features check of IPv6 traffic")
> 
> kernels >= v4.9.y
>     ("net/mlx5: Fix error flow in CREATE_QP command")
>     ("net/mlx5: Fix rate limit packet pacing naming and struct")
> 
> kernels >= v4.13.y
>     ("net/mlx5: FPGA, return -EINVAL if size is zero")
> 
> kernels >= v4.14.y
>     ("Revert "mlx5: move affinity hints assignments to generic code"")

Queued up.

Thanks.

^ permalink raw reply

* Re: [PATCH v2] hv_netvsc: automatically name slave VF network device
From: David Miller @ 2017-12-20 18:47 UTC (permalink / raw)
  To: stephen; +Cc: netdev, sthemmin
In-Reply-To: <20171219235930.16916-1-sthemmin@microsoft.com>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Tue, 19 Dec 2017 15:59:30 -0800

> Rename the VF device to ethX_vf, based on the ethX name of the
> synthetic device.  This eliminates the need for delay on setup,
> and the PCI (udev based) naming is not reproducible on Hyper-V
> anyway. The name of the VF does not matter since all control
> operations take place on the primary device. It does make the
> user experience better to associate the names.
> 
> Based on feedback from all.systems.go talk.
> 
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
> ---
> v2 - also handle case where synthetic device gets renamed
>      and the case where rename causes clash with pre-existing device name

Besides Jakub's objections (which I agree with) it is just so
unexpected that changing the name of device X has the side effect of
changing the name of device Y.

This is why we do this policy stuff in userspace.  The kernel has no
business messing with these netdev names.

This patch means that if a user decides to add his own custom udev
rules for the VF (for whatever reason, it doesn't have to make sense
to you or me) we're just going to undo them.  That's bad.

^ permalink raw reply

* Re: [PATCH net] enic: add wq clean up budget
From: David Miller @ 2017-12-20 18:50 UTC (permalink / raw)
  To: gvaradar; +Cc: netdev, govindarajulu90, benve
In-Reply-To: <alpine.LNX.2.20.1712191623240.28262@cae-iprp-alln-lb.cisco.com>

From: Govindarajulu Varadarajan <gvaradar@cisco.com>
Date: Tue, 19 Dec 2017 16:37:14 -0800

> How would you want us to fix this issue? Is doing an ioread on
> fetch_index for every poll our only option? (to get head and tail
> point once)
> 
> If 256 is not reasonable, will wq_budget equal to wq ring size be
> acceptable?
> At any point number of wq entries to be cleaned cannot be more than
> ring size.

You can resubmit your original patch to increase things to 256 but
realize that I still consider this driver terrible at layering,
indirection, and how it does reclaim.

^ permalink raw reply

* Re: [PATCH v3,net-next] ip6_gre: fix a potential issue in ip6erspan_rcv
From: David Miller @ 2017-12-20 18:52 UTC (permalink / raw)
  To: yanhaishuang; +Cc: kuznet, yoshfuji, netdev, linux-kernel, u9012063
In-Reply-To: <1513734799-20879-1-git-send-email-yanhaishuang@cmss.chinamobile.com>

From: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Date: Wed, 20 Dec 2017 09:53:19 +0800

> pskb_may_pull() can change skb->data, so we need to load ipv6h/ershdr at
> the right place.
> 
> Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
> Cc: William Tu <u9012063@gmail.com>
> Acked-by: William Tu <u9012063@gmail.com>
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
> 
> ---
> Change since v3:
>   * Rebase on latest master branch.
>   * Fix wrong commit information.

Applied.
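
The rule behind this fix (derive header pointers only after the call that may
move the data) can be shown with a user-space analogy. This is a sketch, not
kernel code: struct toy_buf, toy_may_pull() and toy_read_byte() are
hypothetical names, and realloc() merely stands in for the reallocation that
pskb_may_pull() can trigger.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer; a stand-in for the idea, not struct sk_buff. */
struct toy_buf {
	unsigned char *data;
	size_t len;
};

/*
 * Hypothetical analogue of pskb_may_pull(): make sure at least `need`
 * bytes are addressable through b->data. It may move b->data, so any
 * pointer computed from the old b->data is stale afterwards.
 * Returns 1 on success, 0 on allocation failure.
 */
static int toy_may_pull(struct toy_buf *b, size_t need)
{
	unsigned char *p;

	if (need <= b->len)
		return 1;
	p = realloc(b->data, need);
	if (!p)
		return 0;
	memset(p + b->len, 0, need - b->len);	/* zero the new tail */
	b->data = p;
	b->len = need;
	return 1;
}

/* Read the byte at `off`, loading the pointer only after the pull. */
static int toy_read_byte(struct toy_buf *b, size_t off, unsigned char *val)
{
	const unsigned char *hdr;

	assert(b != NULL && val != NULL);
	if (!toy_may_pull(b, off + 1))
		return -1;
	hdr = b->data + off;	/* computed after the call, never before */
	*val = *hdr;
	return 0;
}
```

Had hdr been computed before toy_may_pull(), it could point into memory that
realloc() has already released, which mirrors the class of bug the patch
addresses.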

^ permalink raw reply

* Re: [PATCH v3,net-next 1/2] ip_gre: fix error path when erspan_rcv failed
From: David Miller @ 2017-12-20 18:52 UTC (permalink / raw)
  To: yanhaishuang; +Cc: kuznet, yoshfuji, netdev, linux-kernel, u9012063
In-Reply-To: <1513736507-22968-2-git-send-email-yanhaishuang@cmss.chinamobile.com>

From: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Date: Wed, 20 Dec 2017 10:21:46 +0800

> When the erspan_rcv call returns PACKET_REJECT, we shouldn't call ipgre_rcv
> to process packets again; instead, send an ICMP unreachable message in the
> error path.
> 
> Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
> Acked-by: William Tu <u9012063@gmail.com>
> Cc: William Tu <u9012063@gmail.com>
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>

Applied.

^ permalink raw reply

* Re: [PATCH v3,net-next 2/2] ip6_gre: fix error path when ip6erspan_rcv failed
From: David Miller @ 2017-12-20 18:53 UTC (permalink / raw)
  To: yanhaishuang; +Cc: kuznet, yoshfuji, netdev, linux-kernel, u9012063
In-Reply-To: <1513736507-22968-3-git-send-email-yanhaishuang@cmss.chinamobile.com>

From: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Date: Wed, 20 Dec 2017 10:21:47 +0800

> Same as the IPv4 code: when the ip6erspan_rcv call returns PACKET_REJECT, we
> should call icmpv6_send to send an ICMP unreachable message in the error path.
> 
> Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
> Acked-by: William Tu <u9012063@gmail.com>
> Cc: William Tu <u9012063@gmail.com>
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>

Applied.

^ permalink raw reply

* Re: [PATCH v3,net-next 1/2] ip_gre: fix potential memory leak in erspan_rcv
From: David Miller @ 2017-12-20 18:57 UTC (permalink / raw)
  To: yanhaishuang; +Cc: kuznet, yoshfuji, netdev, linux-kernel, u9012063
In-Reply-To: <1513735621-21913-2-git-send-email-yanhaishuang@cmss.chinamobile.com>

From: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Date: Wed, 20 Dec 2017 10:07:00 +0800

> If md is NULL, tun_dst must be freed, otherwise it will cause a memory
> leak.
> 
> Fixes: 1a66a836da6 ("gre: add collect_md mode to ERSPAN tunnel")
> Cc: William Tu <u9012063@gmail.com>
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>

Applied.

^ permalink raw reply

* Re: [PATCH v3,net-next 2/2] ip6_gre: fix potential memory leak in ip6erspan_rcv
From: David Miller @ 2017-12-20 18:57 UTC (permalink / raw)
  To: yanhaishuang; +Cc: kuznet, yoshfuji, netdev, linux-kernel, u9012063
In-Reply-To: <1513735621-21913-3-git-send-email-yanhaishuang@cmss.chinamobile.com>

From: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Date: Wed, 20 Dec 2017 10:07:01 +0800

> If md is NULL, tun_dst must be freed, otherwise it will cause a memory
> leak.
> 
> Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
> Cc: William Tu <u9012063@gmail.com>
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>

Applied.

> @@ -550,8 +550,10 @@ static int ip6erspan_rcv(struct sk_buff *skb, int gre_hdr_len,
>  
>  			info = &tun_dst->u.tun_info;
>  			md = ip_tunnel_info_opts(info);
> -			if (!md)
> +			if (!md) {
> +				dst_release((struct dst_entry *)tun_dst);
>  				return PACKET_REJECT;
> +			}
>  
>  			memcpy(md, pkt_md, sizeof(*md));

I agree with William that 'md' should never be NULL here, but that check existed
before your changes, so removing it is a separate patch altogether.
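
The shape of the fix in the quoted hunk (release what was allocated before
returning early) can be sketched in user space. All names here are
hypothetical: toy_dst_alloc() and toy_dst_release() only imitate the role of
the tunnel dst allocation and dst_release(), and toy_live_objects exists
purely so the leak is observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted object standing in for the tunnel dst. */
struct toy_dst {
	int refcnt;
};

static int toy_live_objects;	/* allocated and not yet freed */

static struct toy_dst *toy_dst_alloc(void)
{
	struct toy_dst *d = malloc(sizeof(*d));

	if (d) {
		d->refcnt = 1;
		toy_live_objects++;
	}
	return d;
}

static void toy_dst_release(struct toy_dst *d)
{
	if (!d)
		return;
	assert(d->refcnt > 0);
	if (--d->refcnt == 0) {
		free(d);
		toy_live_objects--;
	}
}

/*
 * Returns 0 and hands the object to *out on success. Returns -1 on
 * rejection, in which case nothing stays allocated.
 */
static int toy_rcv(int md_available, struct toy_dst **out)
{
	struct toy_dst *tun_dst = toy_dst_alloc();

	if (!tun_dst)
		return -1;
	if (!md_available) {
		/* Without this release, the early return would leak tun_dst. */
		toy_dst_release(tun_dst);
		return -1;
	}
	*out = tun_dst;
	return 0;
}
```

Removing the release in the rejection branch leaves toy_live_objects at 1
after a failed call, which is the leak in miniature.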

^ permalink raw reply

* Re: [PATCH net-next v3] netdevsim: correctly check return value of debugfs_create_dir
From: David Miller @ 2017-12-20 18:59 UTC (permalink / raw)
  To: bhole_prashant_q7; +Cc: netdev, jakub.kicinski
In-Reply-To: <20171220031857.3696-1-bhole_prashant_q7@lab.ntt.co.jp>

From: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Date: Wed, 20 Dec 2017 12:18:57 +0900

> - Check the return value with IS_ERR_OR_NULL
> - Add error handling where it was missing
> 
> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
> v3: nit-pick: directly returning error instead of going to label

Applied.
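
For readers unfamiliar with the error-pointer convention this patch relies
on, here is a user-space imitation. It is a sketch only: the toy_* helpers
are re-implementations written for this note in the spirit of the kernel's
ERR_PTR(), PTR_ERR(), IS_ERR() and IS_ERR_OR_NULL(), and toy_create_dir() is
a made-up stand-in for a create call that can fail by returning either NULL
or an error pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *toy_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Error pointers occupy the top TOY_MAX_ERRNO values of the address space. */
static inline int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

static inline int toy_is_err_or_null(const void *ptr)
{
	return !ptr || toy_is_err(ptr);
}

/* Decode the errno value; only meaningful for error pointers. */
static inline long toy_ptr_err(const void *ptr)
{
	assert(toy_is_err(ptr));
	return (long)(intptr_t)ptr;
}

/* Made-up create call: mode 0 succeeds, 1 returns NULL, others an error. */
static void *toy_create_dir(int mode, int *storage)
{
	switch (mode) {
	case 0:
		return storage;
	case 1:
		return NULL;
	default:
		return toy_err_ptr(-ENODEV);
	}
}

/* Caller that handles both failure styles with a single check. */
static int toy_init(int mode)
{
	static int dir_obj;
	void *dir = toy_create_dir(mode, &dir_obj);

	if (toy_is_err_or_null(dir))
		return dir ? (int)toy_ptr_err(dir) : -ENOMEM;
	return 0;
}
```

A plain NULL check would treat the error pointer as a valid directory, and a
plain IS_ERR-style check would miss NULL, which is why the combined test is
the safer choice when a function may fail either way.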

^ permalink raw reply

* Re: [PATCH v3 net-next 0/5] replace tcp_set_state tracepoint with inet_sock_set_state
From: David Miller @ 2017-12-20 19:00 UTC (permalink / raw)
  To: laoar.shao
  Cc: songliubraving, marcelo.leitner, rostedt, bgregg, netdev,
	linux-kernel
In-Reply-To: <1513739574-3345-1-git-send-email-laoar.shao@gmail.com>

From: Yafang Shao <laoar.shao@gmail.com>
Date: Wed, 20 Dec 2017 11:12:49 +0800

> According to the discussion in the mail thread
> https://patchwork.kernel.org/patch/10099243/,
> tcp_set_state tracepoint is renamed to inet_sock_set_state tracepoint and is
> moved to include/trace/events/sock.h.
> 
> With this new tracepoint, we can trace AF_INET/AF_INET6 sock state transitions.
> As there's only a single tracepoint for inet, I didn't create a new trace
> file named trace/events/inet_sock.h, and just placed it in
> include/trace/events/sock.h.
> 
> Currently TCP/DCCP/SCTP state transitions are traced with this tracepoint.
> 
> - Why not more protocols?
> If we really think that another protocol should be traced, I will modify the
> code to trace it.
> I just want to keep the code simple and not output useless information.

Series applied, thanks.

^ permalink raw reply

* [PATCH net-next 00/15] s390/net: updates 2017-12-20
From: Julian Wiedmann @ 2017-12-20 19:10 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Stefan Raspl, Ursula Braun, Julian Wiedmann

Hi Dave,

Please apply the following patch series for 4.16.
Nothing too exciting, mostly just beating the qeth L3 code into shape.

Thanks & happy holidays,
Julian


Elena Reshetova (2):
  net: convert lcs_reply.refcnt from atomic_t to refcount_t
  qeth: convert qeth_reply.refcnt from atomic_t to refcount_t

Julian Wiedmann (13):
  s390/qeth: use ip*_eth_mc_map helpers
  s390/qeth: drop CONFIG_QETH_IPV6
  s390/qeth: don't keep track of MAC address's cast type
  s390/qeth: consolidate qeth MAC address helpers
  s390/qeth: use ether_addr_* helpers
  s390/qeth: align L2 and L3 set_rx_mode() implementations
  s390/qeth: robustify qeth_get_ip_version()
  s390/qeth: clean up l3_get_cast_type()
  s390/qeth: recognize non-IP multicast on L3 transmit
  s390/qeth: unionize next-hop field in qeth L3 header
  s390/qeth: streamline l3_fill_header()
  s390/qeth: pass full data length to l3_fill_header()
  s390/qeth: replace open-coded in*_pton()

 drivers/s390/net/Kconfig          |   3 -
 drivers/s390/net/lcs.c            |  10 +-
 drivers/s390/net/lcs.h            |   3 +-
 drivers/s390/net/qeth_core.h      |  42 ++--
 drivers/s390/net/qeth_core_main.c |  19 +-
 drivers/s390/net/qeth_core_mpc.h  |  13 +-
 drivers/s390/net/qeth_l2.h        |   3 +-
 drivers/s390/net/qeth_l2_main.c   |  92 +++-----
 drivers/s390/net/qeth_l3.h        |   3 +-
 drivers/s390/net/qeth_l3_main.c   | 474 +++++++++++---------------------------
 drivers/s390/net/qeth_l3_sys.c    |  12 +
 11 files changed, 234 insertions(+), 440 deletions(-)

-- 
2.13.5

^ permalink raw reply

* [PATCH net-next 02/15] qeth: convert qeth_reply.refcnt from atomic_t to refcount_t
From: Julian Wiedmann @ 2017-12-20 19:10 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Stefan Raspl, Ursula Braun, Julian Wiedmann
In-Reply-To: <20171220191109.90487-1-jwi@linux.vnet.ibm.com>

From: Elena Reshetova <elena.reshetova@intel.com>

atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to a use-after-free situation and be exploitable.

The variable qeth_reply.refcnt is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
[jwi: removed the WARN_ONs. Use CONFIG_REFCOUNT_FULL if you care.]
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
---
 drivers/s390/net/qeth_core.h      | 3 ++-
 drivers/s390/net/qeth_core_main.c | 8 +++-----
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h
index badf42acbf95..f5ee62c98011 100644
--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -21,6 +21,7 @@
 #include <linux/ethtool.h>
 #include <linux/hashtable.h>
 #include <linux/ip.h>
+#include <linux/refcount.h>
 
 #include <net/ipv6.h>
 #include <net/if_inet6.h>
@@ -632,7 +633,7 @@ struct qeth_reply {
 	int rc;
 	void *param;
 	struct qeth_card *card;
-	atomic_t refcnt;
+	refcount_t refcnt;
 };
 
 struct qeth_card_blkt {
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 6c815207f4f5..bc4e57540a9e 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -564,7 +564,7 @@ static struct qeth_reply *qeth_alloc_reply(struct qeth_card *card)
 
 	reply = kzalloc(sizeof(struct qeth_reply), GFP_ATOMIC);
 	if (reply) {
-		atomic_set(&reply->refcnt, 1);
+		refcount_set(&reply->refcnt, 1);
 		atomic_set(&reply->received, 0);
 		reply->card = card;
 	}
@@ -573,14 +573,12 @@ static struct qeth_reply *qeth_alloc_reply(struct qeth_card *card)
 
 static void qeth_get_reply(struct qeth_reply *reply)
 {
-	WARN_ON(atomic_read(&reply->refcnt) <= 0);
-	atomic_inc(&reply->refcnt);
+	refcount_inc(&reply->refcnt);
 }
 
 static void qeth_put_reply(struct qeth_reply *reply)
 {
-	WARN_ON(atomic_read(&reply->refcnt) <= 0);
-	if (atomic_dec_and_test(&reply->refcnt))
+	if (refcount_dec_and_test(&reply->refcnt))
 		kfree(reply);
 }
 
-- 
2.13.5

^ permalink raw reply related
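
The commit message above describes the counter semantics that refcount_t is
meant to enforce: start at 1, free on reaching zero, never increment again
once zero is reached, and never wrap around. A minimal userspace sketch of
those rules follows. It is not the kernel's implementation; the
sketch_refcount type and all sketch_* names are hypothetical and exist only
for illustration, and C11 <stdatomic.h> stands in for the kernel's atomics.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a saturating reference counter. */
struct sketch_refcount {
	atomic_uint refs;
};

/* Once the counter hits this value it is pinned there (assumption of this sketch). */
#define SKETCH_REFCOUNT_SATURATED UINT_MAX

static void sketch_refcount_set(struct sketch_refcount *r, unsigned int n)
{
	assert(r != NULL);
	atomic_store(&r->refs, n);
}

static unsigned int sketch_refcount_read(struct sketch_refcount *r)
{
	return atomic_load(&r->refs);
}

/*
 * Take a reference. Returns false if the counter is already zero, because
 * the object may have been freed. A saturated counter stays saturated, so
 * the object is leaked rather than freed too early.
 */
static bool sketch_refcount_inc(struct sketch_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0)
			return false;
		if (old == SKETCH_REFCOUNT_SATURATED)
			return true;
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old + 1));

	return true;
}

/*
 * Drop a reference. Returns true only for the caller that moved the
 * counter from 1 to 0, i.e. the one that must free the object. Underflow
 * (already zero) and saturated counters are left untouched.
 */
static bool sketch_refcount_dec_and_test(struct sketch_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	do {
		if (old == 0 || old == SKETCH_REFCOUNT_SATURATED)
			return false;
	} while (!atomic_compare_exchange_weak(&r->refs, &old, old - 1));

	return old == 1;
}
```

With a plain atomic counter, an extra put followed by a get would bring a
freed object back to a nonzero count. In this sketch both operations are
refused instead, which is the property the conversion is after.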

* [PATCH net-next 01/15] net: convert lcs_reply.refcnt from atomic_t to refcount_t
From: Julian Wiedmann @ 2017-12-20 19:10 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Stefan Raspl, Ursula Braun, Julian Wiedmann
In-Reply-To: <20171220191109.90487-1-jwi@linux.vnet.ibm.com>

From: Elena Reshetova <elena.reshetova@intel.com>

atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to a use-after-free situation and be exploitable.

The variable lcs_reply.refcnt is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
[jwi: removed the WARN_ONs. Use CONFIG_REFCOUNT_FULL if you care.]
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
---
 drivers/s390/net/lcs.c | 10 +++-------
 drivers/s390/net/lcs.h |  3 ++-
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/s390/net/lcs.c b/drivers/s390/net/lcs.c
index 92ae84a927fc..0ee8f33efb54 100644
--- a/drivers/s390/net/lcs.c
+++ b/drivers/s390/net/lcs.c
@@ -756,18 +756,14 @@ lcs_get_lancmd(struct lcs_card *card, int count)
 static void
 lcs_get_reply(struct lcs_reply *reply)
 {
-	WARN_ON(atomic_read(&reply->refcnt) <= 0);
-	atomic_inc(&reply->refcnt);
+	refcount_inc(&reply->refcnt);
 }
 
 static void
 lcs_put_reply(struct lcs_reply *reply)
 {
-        WARN_ON(atomic_read(&reply->refcnt) <= 0);
-        if (atomic_dec_and_test(&reply->refcnt)) {
+	if (refcount_dec_and_test(&reply->refcnt))
 		kfree(reply);
-	}
-
 }
 
 static struct lcs_reply *
@@ -780,7 +776,7 @@ lcs_alloc_reply(struct lcs_cmd *cmd)
 	reply = kzalloc(sizeof(struct lcs_reply), GFP_ATOMIC);
 	if (!reply)
 		return NULL;
-	atomic_set(&reply->refcnt,1);
+	refcount_set(&reply->refcnt, 1);
 	reply->sequence_no = cmd->sequence_no;
 	reply->received = 0;
 	reply->rc = 0;
diff --git a/drivers/s390/net/lcs.h b/drivers/s390/net/lcs.h
index fbc8b90b1f85..bd52caa3b11b 100644
--- a/drivers/s390/net/lcs.h
+++ b/drivers/s390/net/lcs.h
@@ -5,6 +5,7 @@
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>
 #include <linux/workqueue.h>
+#include <linux/refcount.h>
 #include <asm/ccwdev.h>
 
 #define LCS_DBF_TEXT(level, name, text) \
@@ -271,7 +272,7 @@ struct lcs_buffer {
 struct lcs_reply {
 	struct list_head list;
 	__u16 sequence_no;
-	atomic_t refcnt;
+	refcount_t refcnt;
 	/* Callback for completion notification. */
 	void (*callback)(struct lcs_card *, struct lcs_cmd *);
 	wait_queue_head_t wait_q;
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next 03/15] s390/qeth: use ip*_eth_mc_map helpers
From: Julian Wiedmann @ 2017-12-20 19:10 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Stefan Raspl, Ursula Braun, Julian Wiedmann
In-Reply-To: <20171220191109.90487-1-jwi@linux.vnet.ibm.com>

Get rid of some wrapper indirection, and stop accessing the skb at
hard-coded offsets.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
---
 drivers/s390/net/qeth_l3_main.c | 43 ++++++++++++++---------------------------
 1 file changed, 14 insertions(+), 29 deletions(-)

diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c
index ef0961e18686..aeff11ad1de5 100644
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -1328,11 +1328,6 @@ qeth_diags_trace(struct qeth_card *card, enum qeth_diags_trace_cmds diags_cmd)
 	return qeth_send_ipa_cmd(card, iob, qeth_diags_trace_cb, NULL);
 }
 
-static void qeth_l3_get_mac_for_ipm(__be32 ipm, char *mac)
-{
-	ip_eth_mc_map(ipm, mac);
-}
-
 static void qeth_l3_mark_all_mc_to_be_deleted(struct qeth_card *card)
 {
 	struct qeth_ipaddr *addr;
@@ -1383,26 +1378,22 @@ static void qeth_l3_delete_nonused_mc(struct qeth_card *card)
 
 }
 
-
 static void
 qeth_l3_add_mc_to_hash(struct qeth_card *card, struct in_device *in4_dev)
 {
 	struct ip_mc_list *im4;
 	struct qeth_ipaddr *tmp, *ipm;
-	char buf[MAX_ADDR_LEN];
 
 	QETH_CARD_TEXT(card, 4, "addmc");
 
 	tmp = qeth_l3_get_addr_buffer(QETH_PROT_IPV4);
-		if (!tmp)
-			return;
+	if (!tmp)
+		return;
 
 	for (im4 = rcu_dereference(in4_dev->mc_list); im4 != NULL;
 	     im4 = rcu_dereference(im4->next_rcu)) {
-		qeth_l3_get_mac_for_ipm(im4->multiaddr, buf);
-
+		ip_eth_mc_map(im4->multiaddr, tmp->mac);
 		tmp->u.a4.addr = be32_to_cpu(im4->multiaddr);
-		memcpy(tmp->mac, buf, sizeof(tmp->mac));
 		tmp->is_multicast = 1;
 
 		ipm = qeth_l3_ip_from_hash(card, tmp);
@@ -1412,7 +1403,7 @@ qeth_l3_add_mc_to_hash(struct qeth_card *card, struct in_device *in4_dev)
 			ipm = qeth_l3_get_addr_buffer(QETH_PROT_IPV4);
 			if (!ipm)
 				continue;
-			memcpy(ipm->mac, buf, sizeof(tmp->mac));
+			memcpy(ipm->mac, tmp->mac, sizeof(tmp->mac));
 			ipm->u.a4.addr = be32_to_cpu(im4->multiaddr);
 			ipm->is_multicast = 1;
 			ipm->disp_flag = QETH_DISP_ADDR_ADD;
@@ -1473,18 +1464,15 @@ qeth_l3_add_mc6_to_hash(struct qeth_card *card, struct inet6_dev *in6_dev)
 	struct qeth_ipaddr *ipm;
 	struct ifmcaddr6 *im6;
 	struct qeth_ipaddr *tmp;
-	char buf[MAX_ADDR_LEN];
 
 	QETH_CARD_TEXT(card, 4, "addmc6");
 
 	tmp = qeth_l3_get_addr_buffer(QETH_PROT_IPV6);
-		if (!tmp)
-			return;
+	if (!tmp)
+		return;
 
 	for (im6 = in6_dev->mc_list; im6 != NULL; im6 = im6->next) {
-		ndisc_mc_map(&im6->mca_addr, buf, in6_dev->dev, 0);
-
-		memcpy(tmp->mac, buf, sizeof(tmp->mac));
+		ipv6_eth_mc_map(&im6->mca_addr, tmp->mac);
 		memcpy(&tmp->u.a6.addr, &im6->mca_addr.s6_addr,
 		       sizeof(struct in6_addr));
 		tmp->is_multicast = 1;
@@ -1499,7 +1487,7 @@ qeth_l3_add_mc6_to_hash(struct qeth_card *card, struct inet6_dev *in6_dev)
 		if (!ipm)
 			continue;
 
-		memcpy(ipm->mac, buf, OSA_ADDR_LEN);
+		memcpy(ipm->mac, tmp->mac, sizeof(tmp->mac));
 		memcpy(&ipm->u.a6.addr, &im6->mca_addr.s6_addr,
 		       sizeof(struct in6_addr));
 		ipm->is_multicast = 1;
@@ -1679,26 +1667,23 @@ static int qeth_l3_vlan_rx_kill_vid(struct net_device *dev,
 static void qeth_l3_rebuild_skb(struct qeth_card *card, struct sk_buff *skb,
 				struct qeth_hdr *hdr)
 {
-	__u16 prot;
-	struct iphdr *ip_hdr;
 	unsigned char tg_addr[MAX_ADDR_LEN];
 
 	if (!(hdr->hdr.l3.flags & QETH_HDR_PASSTHRU)) {
-		prot = (hdr->hdr.l3.flags & QETH_HDR_IPV6) ? ETH_P_IPV6 :
-			      ETH_P_IP;
+		u16 prot = (hdr->hdr.l3.flags & QETH_HDR_IPV6) ? ETH_P_IPV6 :
+								 ETH_P_IP;
+
+		skb_reset_network_header(skb);
 		switch (hdr->hdr.l3.flags & QETH_HDR_CAST_MASK) {
 		case QETH_CAST_MULTICAST:
 			switch (prot) {
 #ifdef CONFIG_QETH_IPV6
 			case ETH_P_IPV6:
-				ndisc_mc_map((struct in6_addr *)
-				     skb->data + 24,
-				     tg_addr, card->dev, 0);
+				ipv6_eth_mc_map(&ipv6_hdr(skb)->daddr, tg_addr);
 				break;
 #endif
 			case ETH_P_IP:
-				ip_hdr = (struct iphdr *)skb->data;
-				ip_eth_mc_map(ip_hdr->daddr, tg_addr);
+				ip_eth_mc_map(ip_hdr(skb)->daddr, tg_addr);
 				break;
 			default:
 				memcpy(tg_addr, card->dev->broadcast,
-- 
2.13.5

^ permalink raw reply related
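
For readers unfamiliar with the helpers this patch switches to, the mapping
they perform is the standard one from an IP multicast group to an Ethernet
multicast MAC: 01:00:5e plus the low 23 bits of the IPv4 group, and 33:33
plus the last four bytes of the IPv6 group. The sketch below re-implements
that mapping in userspace C. The sketch_* names and signatures are
hypothetical and differ from the kernel helpers; in particular the IPv4
variant here takes the group in host byte order, whereas ip_eth_mc_map() is
passed a __be32 in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* Map an IPv4 multicast group (host byte order) to its Ethernet MAC. */
static void sketch_ip_mc_to_mac(uint32_t group, uint8_t mac[SKETCH_ETH_ALEN])
{
	assert(mac != NULL);
	mac[0] = 0x01;
	mac[1] = 0x00;
	mac[2] = 0x5e;
	mac[3] = (group >> 16) & 0x7f;	/* only 23 bits of the group survive */
	mac[4] = (group >> 8) & 0xff;
	mac[5] = group & 0xff;
}

/* Map an IPv6 multicast group (16 bytes, network order) to its Ethernet MAC. */
static void sketch_ipv6_mc_to_mac(const uint8_t group[16],
				  uint8_t mac[SKETCH_ETH_ALEN])
{
	assert(group != NULL && mac != NULL);
	mac[0] = 0x33;
	mac[1] = 0x33;
	memcpy(&mac[2], &group[12], 4);	/* low 32 bits of the group */
}
```

Under this mapping, 224.0.0.251 becomes 01:00:5e:00:00:fb and ff02::1
becomes 33:33:00:00:00:01. Neither result depends on the device, which is
consistent with the patch being able to drop the card->dev argument when it
replaces ndisc_mc_map().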

* [PATCH net-next 04/15] s390/qeth: drop CONFIG_QETH_IPV6
From: Julian Wiedmann @ 2017-12-20 19:10 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Stefan Raspl, Ursula Braun, Julian Wiedmann
In-Reply-To: <20171220191109.90487-1-jwi@linux.vnet.ibm.com>

commit "s390/qeth: use ip*_eth_mc_map helpers" removed the last
occurrence of CONFIG_IPV6-dependent code.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
---
 drivers/s390/net/Kconfig        |  3 ---
 drivers/s390/net/qeth_l3_main.c | 60 +++++++----------------------------------
 2 files changed, 9 insertions(+), 54 deletions(-)

diff --git a/drivers/s390/net/Kconfig b/drivers/s390/net/Kconfig
index a782a207ad31..c7e484f70654 100644
--- a/drivers/s390/net/Kconfig
+++ b/drivers/s390/net/Kconfig
@@ -91,9 +91,6 @@ config QETH_L3
 	  To compile as a module choose M. The module name is qeth_l3.
 	  If unsure, choose Y.
 
-config QETH_IPV6
-	def_bool y if (QETH_L3 = IPV6) || (QETH_L3 && IPV6 = 'y')
-
 config CCWGROUP
 	tristate
 	default (LCS || CTCM || QETH)
diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c
index aeff11ad1de5..0404d5c61ad7 100644
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -582,7 +582,6 @@ int qeth_l3_setrouting_v6(struct qeth_card *card)
 	int rc = 0;
 
 	QETH_CARD_TEXT(card, 3, "setrtg6");
-#ifdef CONFIG_QETH_IPV6
 
 	if (!qeth_is_supported(card, IPA_IPV6))
 		return 0;
@@ -599,7 +598,6 @@ int qeth_l3_setrouting_v6(struct qeth_card *card)
 			" on %s. Type set to 'no router'.\n", rc,
 			QETH_CARD_IFNAME(card));
 	}
-#endif
 	return rc;
 }
 
@@ -933,7 +931,6 @@ static int qeth_l3_setadapter_parms(struct qeth_card *card)
 	return rc;
 }
 
-#ifdef CONFIG_QETH_IPV6
 static int qeth_l3_send_simple_setassparms_ipv6(struct qeth_card *card,
 		enum qeth_ipa_funcs ipa_func, __u16 cmd_code)
 {
@@ -949,7 +946,6 @@ static int qeth_l3_send_simple_setassparms_ipv6(struct qeth_card *card,
 				   qeth_setassparms_cb, NULL);
 	return rc;
 }
-#endif
 
 static int qeth_l3_start_ipa_arp_processing(struct qeth_card *card)
 {
@@ -1045,7 +1041,6 @@ static int qeth_l3_start_ipa_multicast(struct qeth_card *card)
 	return rc;
 }
 
-#ifdef CONFIG_QETH_IPV6
 static int qeth_l3_softsetup_ipv6(struct qeth_card *card)
 {
 	int rc;
@@ -1091,12 +1086,9 @@ static int qeth_l3_softsetup_ipv6(struct qeth_card *card)
 	dev_info(&card->gdev->dev, "IPV6 enabled\n");
 	return 0;
 }
-#endif
 
 static int qeth_l3_start_ipa_ipv6(struct qeth_card *card)
 {
-	int rc = 0;
-
 	QETH_CARD_TEXT(card, 3, "strtipv6");
 
 	if (!qeth_is_supported(card, IPA_IPV6)) {
@@ -1104,10 +1096,7 @@ static int qeth_l3_start_ipa_ipv6(struct qeth_card *card)
 			"IPv6 not supported on %s\n", QETH_CARD_IFNAME(card));
 		return 0;
 	}
-#ifdef CONFIG_QETH_IPV6
-	rc = qeth_l3_softsetup_ipv6(card);
-#endif
-	return rc ;
+	return qeth_l3_softsetup_ipv6(card);
 }
 
 static int qeth_l3_start_ipa_broadcast(struct qeth_card *card)
@@ -1457,9 +1446,8 @@ static void qeth_l3_add_multicast_ipv4(struct qeth_card *card)
 	rcu_read_unlock();
 }
 
-#ifdef CONFIG_QETH_IPV6
-static void
-qeth_l3_add_mc6_to_hash(struct qeth_card *card, struct inet6_dev *in6_dev)
+static void qeth_l3_add_mc6_to_hash(struct qeth_card *card,
+				    struct inet6_dev *in6_dev)
 {
 	struct qeth_ipaddr *ipm;
 	struct ifmcaddr6 *im6;
@@ -1548,7 +1536,6 @@ static void qeth_l3_add_multicast_ipv6(struct qeth_card *card)
 	rcu_read_unlock();
 	in6_dev_put(in6_dev);
 }
-#endif /* CONFIG_QETH_IPV6 */
 
 static void qeth_l3_free_vlan_addresses4(struct qeth_card *card,
 			unsigned short vid)
@@ -1588,9 +1575,8 @@ static void qeth_l3_free_vlan_addresses4(struct qeth_card *card,
 }
 
 static void qeth_l3_free_vlan_addresses6(struct qeth_card *card,
-			unsigned short vid)
+					 unsigned short vid)
 {
-#ifdef CONFIG_QETH_IPV6
 	struct inet6_dev *in6_dev;
 	struct inet6_ifaddr *ifa;
 	struct qeth_ipaddr *addr;
@@ -1625,7 +1611,6 @@ static void qeth_l3_free_vlan_addresses6(struct qeth_card *card,
 	kfree(addr);
 out:
 	in6_dev_put(in6_dev);
-#endif /* CONFIG_QETH_IPV6 */
 }
 
 static void qeth_l3_free_vlan_addresses(struct qeth_card *card,
@@ -1676,19 +1661,11 @@ static void qeth_l3_rebuild_skb(struct qeth_card *card, struct sk_buff *skb,
 		skb_reset_network_header(skb);
 		switch (hdr->hdr.l3.flags & QETH_HDR_CAST_MASK) {
 		case QETH_CAST_MULTICAST:
-			switch (prot) {
-#ifdef CONFIG_QETH_IPV6
-			case ETH_P_IPV6:
-				ipv6_eth_mc_map(&ipv6_hdr(skb)->daddr, tg_addr);
-				break;
-#endif
-			case ETH_P_IP:
+			if (prot == ETH_P_IP)
 				ip_eth_mc_map(ip_hdr(skb)->daddr, tg_addr);
-				break;
-			default:
-				memcpy(tg_addr, card->dev->broadcast,
-					card->dev->addr_len);
-			}
+			else
+				ipv6_eth_mc_map(&ipv6_hdr(skb)->daddr, tg_addr);
+
 			card->stats.multicast++;
 			skb->pkt_type = PACKET_MULTICAST;
 			break;
@@ -1949,9 +1926,7 @@ static void qeth_l3_set_multicast_list(struct net_device *dev)
 		qeth_l3_mark_all_mc_to_be_deleted(card);
 
 		qeth_l3_add_multicast_ipv4(card);
-#ifdef CONFIG_QETH_IPV6
 		qeth_l3_add_multicast_ipv6(card);
-#endif
 		qeth_l3_delete_nonused_mc(card);
 		qeth_l3_add_all_new_mc(card);
 
@@ -2222,12 +2197,10 @@ static int qeth_l3_arp_query(struct qeth_card *card, char __user *udata)
 			rc = -EFAULT;
 		goto free_and_out;
 	}
-#ifdef CONFIG_QETH_IPV6
 	if (qinfo.mask_bits & QETH_QARP_WITH_IPV6) {
 		/* fails in case of GuestLAN QDIO mode */
 		qeth_l3_query_arp_cache_info(card, QETH_PROT_IPV6, &qinfo);
 	}
-#endif
 	if (copy_to_user(udata, qinfo.udata, qinfo.udata_len)) {
 		QETH_CARD_TEXT(card, 4, "qactf");
 		rc = -EFAULT;
@@ -3356,10 +3329,6 @@ static struct notifier_block qeth_l3_ip_notifier = {
 	NULL,
 };
 
-#ifdef CONFIG_QETH_IPV6
-/**
- * IPv6 event handler
- */
 static int qeth_l3_ip6_event(struct notifier_block *this,
 			     unsigned long event, void *ptr)
 {
@@ -3404,7 +3373,6 @@ static struct notifier_block qeth_l3_ip6_notifier = {
 	qeth_l3_ip6_event,
 	NULL,
 };
-#endif
 
 static int qeth_l3_register_notifiers(void)
 {
@@ -3414,35 +3382,25 @@ static int qeth_l3_register_notifiers(void)
 	rc = register_inetaddr_notifier(&qeth_l3_ip_notifier);
 	if (rc)
 		return rc;
-#ifdef CONFIG_QETH_IPV6
 	rc = register_inet6addr_notifier(&qeth_l3_ip6_notifier);
 	if (rc) {
 		unregister_inetaddr_notifier(&qeth_l3_ip_notifier);
 		return rc;
 	}
-#else
-	pr_warn("There is no IPv6 support for the layer 3 discipline\n");
-#endif
 	return 0;
 }
 
 static void qeth_l3_unregister_notifiers(void)
 {
-
 	QETH_DBF_TEXT(SETUP, 5, "unregnot");
 	WARN_ON(unregister_inetaddr_notifier(&qeth_l3_ip_notifier));
-#ifdef CONFIG_QETH_IPV6
 	WARN_ON(unregister_inet6addr_notifier(&qeth_l3_ip6_notifier));
-#endif /* QETH_IPV6 */
 }
 
 static int __init qeth_l3_init(void)
 {
-	int rc = 0;
-
 	pr_info("register layer 3 discipline\n");
-	rc = qeth_l3_register_notifiers();
-	return rc;
+	return qeth_l3_register_notifiers();
 }
 
 static void __exit qeth_l3_exit(void)
-- 
2.13.5

^ permalink raw reply related

* [PATCH net-next 05/15] s390/qeth: don't keep track of MAC address's cast type
From: Julian Wiedmann @ 2017-12-20 19:10 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, linux-s390, Martin Schwidefsky, Heiko Carstens,
	Stefan Raspl, Ursula Braun, Julian Wiedmann
In-Reply-To: <20171220191109.90487-1-jwi@linux.vnet.ibm.com>

Instead of tracking the uc/mc state in each MAC address object, just
check the multicast bit in the address itself.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
---
 drivers/s390/net/qeth_l2.h      |  1 -
 drivers/s390/net/qeth_l2_main.c | 27 ++++++++-------------------
 2 files changed, 8 insertions(+), 20 deletions(-)

diff --git a/drivers/s390/net/qeth_l2.h b/drivers/s390/net/qeth_l2.h
index 09b1c4ef3dc9..3223601cc3ac 100644
--- a/drivers/s390/net/qeth_l2.h
+++ b/drivers/s390/net/qeth_l2.h
@@ -23,7 +23,6 @@ bool qeth_l2_vnicc_is_in_use(struct qeth_card *card);
 
 struct qeth_mac {
 	u8 mac_addr[OSA_ADDR_LEN];
-	u8 is_uc:1;
 	u8 disp_flag:2;
 	struct hlist_node hnode;
 };
diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 5863ea170ff2..88dd92954eec 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -186,22 +186,16 @@ static int qeth_l2_send_delgroupmac(struct qeth_card *card, __u8 *mac)
 
 static int qeth_l2_write_mac(struct qeth_card *card, struct qeth_mac *mac)
 {
-	if (mac->is_uc) {
-		return qeth_l2_send_setdelmac(card, mac->mac_addr,
-						IPA_CMD_SETVMAC);
-	} else {
+	if (is_multicast_ether_addr_64bits(mac->mac_addr))
 		return qeth_l2_send_setgroupmac(card, mac->mac_addr);
-	}
+	return qeth_l2_send_setdelmac(card, mac->mac_addr, IPA_CMD_SETVMAC);
 }
 
 static int qeth_l2_remove_mac(struct qeth_card *card, struct qeth_mac *mac)
 {
-	if (mac->is_uc) {
-		return qeth_l2_send_setdelmac(card, mac->mac_addr,
-						IPA_CMD_DELVMAC);
-	} else {
+	if (is_multicast_ether_addr_64bits(mac->mac_addr))
 		return qeth_l2_send_delgroupmac(card, mac->mac_addr);
-	}
+	return qeth_l2_send_setdelmac(card, mac->mac_addr, IPA_CMD_DELVMAC);
 }
 
 static void qeth_l2_del_all_macs(struct qeth_card *card)
@@ -597,27 +591,23 @@ static void qeth_promisc_to_bridge(struct qeth_card *card)
  * only if there is not in the hash table storage already
  *
 */
-static void qeth_l2_add_mac(struct qeth_card *card, struct netdev_hw_addr *ha,
-			    u8 is_uc)
+static void qeth_l2_add_mac(struct qeth_card *card, struct netdev_hw_addr *ha)
 {
 	u32 mac_hash = get_unaligned((u32 *)(&ha->addr[2]));
 	struct qeth_mac *mac;
 
 	hash_for_each_possible(card->mac_htable, mac, hnode, mac_hash) {
-		if (is_uc == mac->is_uc &&
-		    !memcmp(ha->addr, mac->mac_addr, OSA_ADDR_LEN)) {
+		if (!memcmp(ha->addr, mac->mac_addr, OSA_ADDR_LEN)) {
 			mac->disp_flag = QETH_DISP_ADDR_DO_NOTHING;
 			return;
 		}
 	}
 
 	mac = kzalloc(sizeof(struct qeth_mac), GFP_ATOMIC);
-
 	if (!mac)
 		return;
 
 	memcpy(mac->mac_addr, ha->addr, OSA_ADDR_LEN);
-	mac->is_uc = is_uc;
 	mac->disp_flag = QETH_DISP_ADDR_ADD;
 
 	hash_add(card->mac_htable, &mac->hnode, mac_hash);
@@ -643,10 +633,9 @@ static void qeth_l2_set_rx_mode(struct net_device *dev)
 	spin_lock_bh(&card->mclock);
 
 	netdev_for_each_mc_addr(ha, dev)
-		qeth_l2_add_mac(card, ha, 0);
-
+		qeth_l2_add_mac(card, ha);
 	netdev_for_each_uc_addr(ha, dev)
-		qeth_l2_add_mac(card, ha, 1);
+		qeth_l2_add_mac(card, ha);
 
 	hash_for_each_safe(card->mac_htable, i, tmp, mac, hnode) {
 		if (mac->disp_flag == QETH_DISP_ADDR_DELETE) {
-- 
2.13.5

^ permalink raw reply related
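
The idea behind dropping the is_uc flag is that an Ethernet address already
encodes its cast type: the least significant bit of the first octet is set
for group (multicast and broadcast) addresses and clear for unicast ones. A
minimal userspace sketch of that check, and of the resulting dispatch,
follows. The sketch_* names and the enum are hypothetical and only mirror
the shape of qeth_l2_write_mac(); the kernel code uses
is_multicast_ether_addr_64bits() instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the two command families the driver chooses between. */
enum sketch_mac_cmd {
	SKETCH_CMD_UNICAST,	/* stands in for the SETVMAC/DELVMAC path */
	SKETCH_CMD_GROUP,	/* stands in for the set/del group MAC path */
};

/* The I/G bit: lowest bit of the first octet marks a group address. */
static bool sketch_is_multicast_mac(const uint8_t mac[6])
{
	assert(mac != NULL);
	return (mac[0] & 0x01) != 0;
}

/* Derive the command family from the address alone, with no stored flag. */
static enum sketch_mac_cmd sketch_pick_cmd(const uint8_t mac[6])
{
	if (sketch_is_multicast_mac(mac))
		return SKETCH_CMD_GROUP;
	return SKETCH_CMD_UNICAST;
}
```

Because the cast type is a function of the address bytes, two hash table
entries with identical addresses can no longer differ in cast type, which is
why the lookup in qeth_l2_add_mac() can compare the address alone.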
