Netdev List
* Re: [net-next 11/15] i40e: Implement debug macro hw_dbg using pr_debug
From: Jakub Kicinski @ 2019-08-28 22:39 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: davem, Mauro S. M. Rodrigues, netdev, nhorman, sassmann,
	Andrew Bowers
In-Reply-To: <20190828064407.30168-12-jeffrey.t.kirsher@intel.com>

On Tue, 27 Aug 2019 23:44:03 -0700, Jeff Kirsher wrote:
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_osdep.h b/drivers/net/ethernet/intel/i40e/i40e_osdep.h
> index a07574bff550..c0c9ce3eab23 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_osdep.h
> +++ b/drivers/net/ethernet/intel/i40e/i40e_osdep.h
> @@ -18,7 +18,12 @@
>   * actual OS primitives
>   */
>  
> -#define hw_dbg(hw, S, A...)	do {} while (0)
> +#define hw_dbg(hw, S, A...)							\
> +do {										\
> +	int domain = pci_domain_nr(((struct i40e_pf *)(hw)->back)->pdev->bus);	\
> +	pr_debug("i40e %04x:%02x:%02x.%x " S, domain, (hw)->bus.bus_id,		\
> +		 (hw)->bus.device, (hw)->bus.func, ## A);			\

This looks like open coded dev_dbg() / dev_name(), why?

> +} while (0)

^ permalink raw reply

* Re: [net-next 10/15] i40e: fix hw_dbg usage in i40e_hmc_get_object_va
From: Jakub Kicinski @ 2019-08-28 22:38 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: davem, Mauro S. M. Rodrigues, netdev, nhorman, sassmann,
	Andrew Bowers
In-Reply-To: <20190828064407.30168-11-jeffrey.t.kirsher@intel.com>

On Tue, 27 Aug 2019 23:44:02 -0700, Jeff Kirsher wrote:
> @@ -982,6 +983,7 @@ i40e_status i40e_hmc_get_object_va(struct i40e_hmc_info *hmc_info,
>  	struct i40e_hmc_sd_entry *sd_entry;
>  	struct i40e_hmc_pd_entry *pd_entry;
>  	u32 pd_idx, pd_lmt, rel_pd_idx;
> +	struct i40e_hmc_info *hmc_info = &hw->hmc;

reverse xmas tree

>  	u64 obj_offset_in_fpm;
>  	u32 sd_idx, sd_lmt;
>  

^ permalink raw reply

* Re: general protection fault in tls_sk_proto_close (2)
From: Jakub Kicinski @ 2019-08-28 22:26 UTC (permalink / raw)
  To: john.fastabend
  Cc: syzbot, aviadye, borisp, daniel, davejwatson, davem, linux-kernel,
	netdev, syzkaller-bugs
In-Reply-To: <000000000000c3c461059127a1c4@google.com>

On Tue, 27 Aug 2019 23:38:07 -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    a55aa89a Linux 5.3-rc6
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=16c26ebc600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=2a6a2b9826fdadf9
> dashboard link: https://syzkaller.appspot.com/bug?extid=7a6ee4d0078eac6bf782
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1112a4de600000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+7a6ee4d0078eac6bf782@syzkaller.appspotmail.com

Hi John!

This is a loop where TLS calls its own close function recursively.
It seems we must have gotten BPF installed on top of TLS, and then
it handed TLS TLS's own sk_proto via tcp_update_ulp().

Can BPF on top of TLS be prevented somehow?

Quick fix should probably be something like:

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 43252a801c3f..3f4962756fa4 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -816,6 +816,9 @@ static void tls_update(struct sock *sk, struct proto *p)
 
        ctx = tls_get_ctx(sk);
        if (likely(ctx)) {
+               if (p->setsockopt == tls_setsockopt)
+                       return;
+
                ctx->sk_proto_close = p->close;
                ctx->sk_proto = p;
        } else {
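
To make the recursion concrete, here is a minimal userspace sketch of the guard in the diff above. It is a model under stated assumptions, not the kernel code: struct proto, struct tls_ctx and struct sock are cut down to the fields that matter here, and tcp_prot / tls_prot are hypothetical stand-ins for the real proto tables.

```c
#include <assert.h>
#include <stddef.h>

struct sock;

/* Cut-down stand-in for the kernel's struct proto (assumption). */
struct proto {
	void (*close)(struct sock *sk);
	int (*setsockopt)(struct sock *sk);
};

/* Cut-down stand-in for struct tls_context (assumption). */
struct tls_ctx {
	void (*sk_proto_close)(struct sock *sk);
	struct proto *sk_proto;
};

struct sock {
	struct tls_ctx *ctx;
};

static int tcp_close_calls;

static void tcp_close(struct sock *sk) { (void)sk; tcp_close_calls++; }
static int tcp_setsockopt(struct sock *sk) { (void)sk; return 0; }
static int tls_setsockopt(struct sock *sk) { (void)sk; return 0; }

/* TLS close ends by calling the saved lower-layer close. If that saved
 * pointer is tls_close itself, this recurses without end.
 */
static void tls_close(struct sock *sk)
{
	sk->ctx->sk_proto_close(sk);
}

static struct proto tcp_prot = { tcp_close, tcp_setsockopt };
static struct proto tls_prot = { tls_close, tls_setsockopt };

/* Mirrors the proposed check: refuse to record TLS's own proto as the
 * lower layer, so tls_close can never be saved as its own "next" close.
 */
static void tls_update(struct sock *sk, struct proto *p)
{
	struct tls_ctx *ctx = sk->ctx;

	assert(ctx != NULL);
	if (p->setsockopt == tls_setsockopt)
		return;

	ctx->sk_proto_close = p->close;
	ctx->sk_proto = p;
}
```

In this model, handing tls_prot to tls_update() leaves the saved TCP callbacks in place, so a later tls_close() reaches tcp_close() exactly once instead of looping.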

> kasan: CONFIG_KASAN_INLINE enabled
> kasan: GPF could be caused by NULL-ptr deref or user memory access
> general protection fault: 0000 [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 10290 Comm: syz-executor.0 Not tainted 5.3.0-rc6 #120
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
> Google 01/01/2011
> RIP: 0010:tls_sk_proto_close+0xe5/0x990 net/tls/tls_main.c:298
> Code: 0f 85 3f 08 00 00 49 8b 84 24 c0 02 00 00 4d 8d 75 14 4c 89 f2 48 c1  
> ea 03 48 89 85 50 ff ff ff 48 b8 00 00 00 00 00 fc ff df <0f> b6 04 02 4c  
> 89 f2 83 e2 07 38 d0 7f 08 84 c0 0f 85 2e 06 00 00
> RSP: 0018:ffff88809b23fb90 EFLAGS: 00010203
> RAX: dffffc0000000000 RBX: dffffc0000000000 RCX: ffffffff862cb8db
> RDX: 0000000000000002 RSI: ffffffff862cb639 RDI: ffff8880a155ef00
> RBP: ffff88809b23fc48 R08: ffff888094344640 R09: ffffed10142abd9a
> R10: ffffed10142abd99 R11: ffff8880a155eccb R12: ffff8880a155ec40
> R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000001
> FS:  00005555556a8940(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f353458e000 CR3: 00000000a9174000 CR4: 00000000001406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>   tls_sk_proto_close+0x35b/0x990 net/tls/tls_main.c:321
>   tcp_bpf_close+0x17c/0x390 net/ipv4/tcp_bpf.c:582
>   inet_release+0xed/0x200 net/ipv4/af_inet.c:427
>   inet6_release+0x53/0x80 net/ipv6/af_inet6.c:470
>   __sock_release+0xce/0x280 net/socket.c:590
>   sock_close+0x1e/0x30 net/socket.c:1268
>   __fput+0x2ff/0x890 fs/file_table.c:280
>   ____fput+0x16/0x20 fs/file_table.c:313
>   task_work_run+0x145/0x1c0 kernel/task_work.c:113
>   tracehook_notify_resume include/linux/tracehook.h:188 [inline]
>   exit_to_usermode_loop+0x316/0x380 arch/x86/entry/common.c:163
>   prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
>   syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
>   do_syscall_64+0x5a9/0x6a0 arch/x86/entry/common.c:299
>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x413540
> Code: 01 f0 ff ff 0f 83 30 1b 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f  
> 44 00 00 83 3d 4d 2d 66 00 00 75 14 b8 03 00 00 00 0f 05 <48> 3d 01 f0 ff  
> ff 0f 83 04 1b 00 00 c3 48 83 ec 08 e8 0a fc ff ff
> RSP: 002b:00007fff5d481778 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
> RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000413540
> RDX: 0000001b2e520000 RSI: 0000000000000000 RDI: 0000000000000005
> RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffffffffffff
> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000075bf20
> R13: 0000000000000003 R14: 0000000000761220 R15: ffffffffffffffff
> Modules linked in:
> ---[ end trace bdfd4385a0f1f76d ]---
> RIP: 0010:tls_sk_proto_close+0xe5/0x990 net/tls/tls_main.c:298
> Code: 0f 85 3f 08 00 00 49 8b 84 24 c0 02 00 00 4d 8d 75 14 4c 89 f2 48 c1  
> ea 03 48 89 85 50 ff ff ff 48 b8 00 00 00 00 00 fc ff df <0f> b6 04 02 4c  
> 89 f2 83 e2 07 38 d0 7f 08 84 c0 0f 85 2e 06 00 00
> RSP: 0018:ffff88809b23fb90 EFLAGS: 00010203
> RAX: dffffc0000000000 RBX: dffffc0000000000 RCX: ffffffff862cb8db
> RDX: 0000000000000002 RSI: ffffffff862cb639 RDI: ffff8880a155ef00
> RBP: ffff88809b23fc48 R08: ffff888094344640 R09: ffffed10142abd9a
> R10: ffffed10142abd99 R11: ffff8880a155eccb R12: ffff8880a155ec40
> R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000001
> FS:  00005555556a8940(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f353458e000 CR3: 00000000a9174000 CR4: 00000000001406e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> 
> 
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches


^ permalink raw reply related

* Re: [PATCH bpf-next] bpf, capabilities: introduce CAP_BPF
From: Alexei Starovoitov @ 2019-08-28 22:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andy Lutomirski, Alexei Starovoitov, Kees Cook, LSM List,
	James Morris, Jann Horn, Masami Hiramatsu, Steven Rostedt,
	David S. Miller, Daniel Borkmann, Network Development, bpf,
	kernel-team, Linux API
In-Reply-To: <20190828071421.GK2332@hirez.programming.kicks-ass.net>

On Wed, Aug 28, 2019 at 09:14:21AM +0200, Peter Zijlstra wrote:
> On Tue, Aug 27, 2019 at 04:01:08PM -0700, Andy Lutomirski wrote:
> 
> > > Tracing:
> > >
> > > CAP_BPF and perf_paranoid_tracepoint_raw() (which is kernel.perf_event_paranoid == -1)
> > > are necessary to:
> 
> That's not tracing, that's perf.
> 
> > > +bool cap_bpf_tracing(void)
> > > +{
> > > +       return capable(CAP_SYS_ADMIN) ||
> > > +              (capable(CAP_BPF) && !perf_paranoid_tracepoint_raw());
> > > +}
> 
> A whole long time ago, I proposed we introduce CAP_PERF or something
> along those lines; as a replacement for that horrible crap Android and
> Debian ship. But nobody was ever interested enough.
> 
> The nice thing about that is that you can then disallow perf/tracing in
> general, but tag the perf executable (and similar tools) with the
> capability so that unpriv users can still use it, but only limited
> through the tool, not the syscalls directly.

Exactly.
Similar motivation for CAP_BPF as well.

re: your first comment above.
I'm not sure what difference you see in words 'tracing' and 'perf'.
I really hope we don't partition the overall tracing category
into CAP_PERF and CAP_FTRACE only because these pieces are maintained
by different people.
On one side perf_event_open() isn't really doing tracing (as step by
step ftracing of function sequences), but perf_event_open() opens
an event and the sequence of events (may include IP) becomes a trace.
imo CAP_TRACING is the best name to describe the privileged space
of operations possible via perf_event_open, ftrace, kprobe, stack traces, etc.

Another reason is kuprobes. They can be created via perf_event_open
and via tracefs. Are they in CAP_PERF or in CAP_FTRACE ? In both, right?
Should then CAP_KPROBE be used ? that would be an overkill.
It would partition the space even further without obvious need.

Looking from BPF angle... BPF doesn't have integration with ftrace yet.
bpf_trace_printk is using ftrace mechanism, but that's 1% of ftrace.
In the long run I'd really like to see bpf using all of ftrace.
Whereas bpf is using a lot of 'perf'.
And extending some perf things in bpf specific way.
Take a look at BPF_F_STACK_BUILD_ID. It's clearly a perf/stack_tracing
feature that generic perf can use one day.
Currently it sits in bpf land and is accessible via bpf only.
Though it's bpf-only today, I categorize it under CAP_TRACING.

I think CAP_TRACING privilege should allow a task to do all of perf_event_open,
kuprobe, stack trace, ftrace, and kallsyms.
We can think of some exceptions that should stay under CAP_SYS_ADMIN,
but most of the functionality available by 'perf' binary should be
usable with CAP_TRACING. 'perf' can do bpf too.
With CAP_BPF it would be all set.
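
For reference, the decision made by the quoted cap_bpf_tracing() helper can be modelled in userspace as below. This is only a sketch: the capabilities and the sysctl are passed in as plain arguments (an assumption made for illustration), and the paranoid check is modelled as "perf_event_paranoid > -1", matching the "== -1" reading in the quoted text.

```c
#include <stdbool.h>

/* Model of perf_paranoid_tracepoint_raw(): true when raw tracepoint
 * access is restricted, i.e. kernel.perf_event_paranoid is above -1.
 */
static bool perf_paranoid_tracepoint_raw(int perf_event_paranoid)
{
	return perf_event_paranoid > -1;
}

/* Same boolean structure as the quoted helper: CAP_SYS_ADMIN always
 * suffices; CAP_BPF suffices only when perf is not in paranoid mode.
 */
static bool cap_bpf_tracing(bool has_sys_admin, bool has_cap_bpf,
			    int perf_event_paranoid)
{
	return has_sys_admin ||
	       (has_cap_bpf &&
		!perf_paranoid_tracepoint_raw(perf_event_paranoid));
}
```

With this model, a task holding only CAP_BPF gets tracing access when perf_event_paranoid is -1 and is refused at any higher setting.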


^ permalink raw reply

* Re: [PATCH V3 net 2/2] openvswitch: Clear the L4 portion of the key for "later" fragments.
From: David Miller @ 2019-08-28 21:54 UTC (permalink / raw)
  To: gvrose8192; +Cc: netdev, pshelar, joe, jpettit
In-Reply-To: <1566917890-22304-2-git-send-email-gvrose8192@gmail.com>

From: Greg Rose <gvrose8192@gmail.com>
Date: Tue, 27 Aug 2019 07:58:10 -0700

> From: Justin Pettit <jpettit@ovn.org>
> 
> Only the first fragment in a datagram contains the L4 headers.  When the
> Open vSwitch module parses a packet, it always sets the IP protocol
> field in the key, but can only set the L4 fields on the first fragment.
> The original behavior would not clear the L4 portion of the key, so
> garbage values would be sent in the key for "later" fragments.  This
> patch clears the L4 fields in that circumstance to prevent sending those
> garbage values as part of the upcall.
> 
> Signed-off-by: Justin Pettit <jpettit@ovn.org>

Applied.
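
A rough userspace sketch of the behaviour the commit message describes: the IP protocol is always recorded, but the L4 part of the key is zeroed for "later" fragments so no stale values leak out. The struct layout, the enum and the function name are hypothetical and are not taken from the Open vSwitch sources.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical fragment classification. */
enum demo_frag { DEMO_FRAG_NONE, DEMO_FRAG_FIRST, DEMO_FRAG_LATER };

/* Hypothetical, heavily reduced flow key. */
struct demo_flow_key {
	struct {
		uint8_t proto;
		uint8_t frag;
	} ip;
	struct {
		uint16_t src;
		uint16_t dst;
	} tp;
};

static void demo_key_set_l4(struct demo_flow_key *key, uint8_t proto,
			    enum demo_frag frag, uint16_t sport,
			    uint16_t dport)
{
	/* The IP protocol field is set for every packet. */
	key->ip.proto = proto;
	key->ip.frag = (uint8_t)frag;

	/* Later fragments carry no L4 header: clear the transport part
	 * rather than leaving whatever was in the key before.
	 */
	if (frag == DEMO_FRAG_LATER) {
		memset(&key->tp, 0, sizeof(key->tp));
		return;
	}

	key->tp.src = sport;
	key->tp.dst = dport;
}
```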

^ permalink raw reply

* Re: [PATCH V3 net 1/2] openvswitch: Properly set L4 keys on "later" IP fragments
From: David Miller @ 2019-08-28 21:54 UTC (permalink / raw)
  To: gvrose8192; +Cc: netdev, pshelar, joe
In-Reply-To: <1566917890-22304-1-git-send-email-gvrose8192@gmail.com>

From: Greg Rose <gvrose8192@gmail.com>
Date: Tue, 27 Aug 2019 07:58:09 -0700

> When IP fragments are reassembled before being sent to conntrack, the
> key from the last fragment is used.  Unless there are reordering
> issues, the last fragment received will not contain the L4 ports, so the
> key for the reassembled datagram won't contain them.  This patch updates
> the key once we have a reassembled datagram.
> 
> The handle_fragments() function works on L3 headers so we pull the L3/L4
> flow key update code from key_extract into a new function
> 'key_extract_l3l4'.  Then we add another new function
> ovs_flow_key_update_l3l4() and export it so that it is accessible by
> handle_fragments() for conntrack packet reassembly.
> 
> Co-authored by: Justin Pettit <jpettit@ovn.org>
> Signed-off-by: Greg Rose <gvrose8192@gmail.com>

Applied with Co-authored-by fixed.

^ permalink raw reply

* Re: [PATCH v2 0/3] Add NETIF_F_HW_BR_CAP feature
From: Horatiu Vultur @ 2019-08-28 21:53 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: roopa, nikolay, davem, UNGLinuxDriver, alexandre.belloni,
	allan.nielsen, f.fainelli, netdev, linux-kernel, bridge
In-Reply-To: <20190827131824.GC11471@lunn.ch>

The 08/27/2019 15:18, Andrew Lunn wrote:
> External E-Mail
> 
> 
> > That sounds like a great idea. I was expecting to add this logic in the
> > set_rx_mode function of the driver. But unfortunately, I got the calls to
> > this function before the dev->promiscuity is updated or not to get the
> > call at all. For example in case the port is member of a bridge and I try
> > to enable the promisc mode.
> 
> Hi Horatiu

Hi Andrew,
> 
> What about the notifier? Is it called in all the conditions you need
> to know about?

I also had a look at this, but without any luck. I can get good
information from it, like knowing when a port is added to or removed from
the bridge (NETDEV_CHANGEUPPER), but not when promisc mode is changed by
an application (e.g. tcpdump). In that case, if the port is part of the
bridge and promisc is then enabled, there is no callback to the driver
and no notification.
> 
> Or, you could consider adding a new switchdev call to pass this
> information to any switchdev driver which is interested in the
> information.

Having this new switchdev call and listening for NETDEV_CHANGEUPPER
seems to be enough to know when a port needs to go in promisc mode.
> 
> At the moment, the DSA driver core does not pass onto the driver it
> should put a port into promisc mode. So pcap etc, will only see
> traffic directed to the CPU, not all the traffic ingressing the
> interface. If you put the needed core infrastructure into place, we
> could plumb it down from the DSA core to DSA drivers.
> 
> Having said that, i don't actually know if the Marvell switches
> support this. Forward using the ATU and send a copy to the CPU?  What
> switches tend to support is port mirroring, sending all the traffic
> out another port. A couple of DSA drivers support that, via TC.
> 
> 	Andrew
> 

-- 
/Horatiu

^ permalink raw reply

* Re: [PATCH net-next] phy: mdio-mux-meson-g12a: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-28 21:51 UTC (permalink / raw)
  To: yuehaibing
  Cc: andrew, f.fainelli, hkallweit1, khilman, netdev, linux-arm-kernel,
	linux-amlogic, linux-kernel
In-Reply-To: <20190827134940.14944-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Tue, 27 Aug 2019 21:49:40 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] phy: mdio-sun4i: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-28 21:51 UTC (permalink / raw)
  To: yuehaibing
  Cc: andrew, f.fainelli, hkallweit1, mripard, wens, netdev,
	linux-arm-kernel, linux-kernel
In-Reply-To: <20190827135032.14620-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Tue, 27 Aug 2019 21:50:32 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] phy: mdio-moxart: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-28 21:51 UTC (permalink / raw)
  To: yuehaibing; +Cc: andrew, f.fainelli, hkallweit1, netdev, linux-kernel
In-Reply-To: <20190827134804.14888-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Tue, 27 Aug 2019 21:48:04 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] phy: mdio-hisi-femac: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-28 21:51 UTC (permalink / raw)
  To: yuehaibing; +Cc: andrew, f.fainelli, hkallweit1, netdev, linux-kernel
In-Reply-To: <20190827134722.14332-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Tue, 27 Aug 2019 21:47:22 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] phy: mdio-bcm-iproc: use devm_platform_ioremap_resource() to simplify code
From: David Miller @ 2019-08-28 21:51 UTC (permalink / raw)
  To: yuehaibing
  Cc: andrew, f.fainelli, hkallweit1, rjui, sbranden,
	bcm-kernel-feedback-list, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <20190827134616.11396-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Tue, 27 Aug 2019 21:46:16 +0800

> Use devm_platform_ioremap_resource() to simplify the code a bit.
> This is detected by coccinelle.
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH] wimax/i2400m: remove redundant assignment to variable result
From: David Miller @ 2019-08-28 21:49 UTC (permalink / raw)
  To: colin.king
  Cc: inaky.perez-gonzalez, linux-wimax, netdev, kernel-janitors,
	linux-kernel
In-Reply-To: <20190827114739.27305-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Tue, 27 Aug 2019 12:47:39 +0100

> From: Colin Ian King <colin.king@canonical.com>
> 
> Variable result is being assigned a value that is never read and result
> is being re-assigned a little later on. The assignment is redundant
> and hence can be removed.
> 
> Addresses-Coverity: ("Ununsed value")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH] arcnet: capmode: remove redundant assignment to pointer pkt
From: David Miller @ 2019-08-28 21:49 UTC (permalink / raw)
  To: colin.king; +Cc: m.grzeschik, netdev, kernel-janitors, linux-kernel
In-Reply-To: <20190827112954.26677-1-colin.king@canonical.com>


Please fix the typo spotted by Sergei.

^ permalink raw reply

* Re: [PATCH net] mld: fix memory leak in mld_del_delrec()
From: David Miller @ 2019-08-28 21:48 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, syzkaller
In-Reply-To: <20190827103312.180258-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Tue, 27 Aug 2019 03:33:12 -0700

> Similar to the fix done for IPv4 in commit e5b1c6c6277d
> ("igmp: fix memory leak in igmpv3_del_delrec()"), we need to
> make sure mca_tomb and mca_sources are not blindly overwritten.
> 
> Using swap() then a call to ip6_mc_clear_src() will take care
> of the missing free.
 ...
> Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down")
> Fixes: 9c8bb163ae78 ("igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>

Applied and queued up for -stable.
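
The "swap() then clear" idea from the commit message can be sketched in userspace as follows. This is an illustration under assumptions, not the mld code: the list node, the container struct and both helpers are hypothetical, and a plain singly linked list stands in for the real source lists.

```c
#include <stdlib.h>

/* Hypothetical source-list node and owner. */
struct src_node {
	struct src_node *next;
};

struct demo_mc {
	struct src_node *sources;
};

static int nodes_freed;

/* Free every node on the list and leave it empty. */
static void clear_sources(struct demo_mc *mc)
{
	while (mc->sources) {
		struct src_node *next = mc->sources->next;

		free(mc->sources);
		mc->sources = next;
		nodes_freed++;
	}
}

/* Restore the saved list into the live entry. A plain assignment
 * (live->sources = tomb->sources) would lose whatever live already
 * held. Swapping first means the old list ends up in tomb, where the
 * clear call frees it.
 */
static void restore_sources(struct demo_mc *live, struct demo_mc *tomb)
{
	struct src_node *tmp = live->sources;

	live->sources = tomb->sources;
	tomb->sources = tmp;
	clear_sources(tomb);
}
```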

^ permalink raw reply

* Re: [PATCH net v2] net/sched: pfifo_fast: fix wrong dereference when qdisc is reset
From: David Miller @ 2019-08-28 21:46 UTC (permalink / raw)
  To: dcaratti; +Cc: xiyou.wangcong, jhs, jiri, netdev, pabeni, sbrivio, shuali
In-Reply-To: <783231162b9d32faaf5df34ad8ad437b0031bd31.1566901438.git.dcaratti@redhat.com>

From: Davide Caratti <dcaratti@redhat.com>
Date: Tue, 27 Aug 2019 12:29:09 +0200

> Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
> 'TCQ_F_NOLOCK' bit in the parent qdisc, we need to be sure that per-cpu
> counters are present when 'reset()' is called for pfifo_fast qdiscs.
> Otherwise, the following script:
 ...
> can generate the following splat:
 ...
> Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
> before dereferencing 'qdisc->cpu_qstats'.
> 
> Changes since v1:
>  - coding style improvements, thanks to Stefano Brivio
> 
> Fixes: 8a53e616de29 ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
> CC: Paolo Abeni <pabeni@redhat.com>
> Reported-by: Li Shuang <shuali@redhat.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>

Applied and queued up for v5.2 -stable.
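
The shape of the fix (test the flag before touching the optional per-cpu counters) looks roughly like this in a userspace model. All names are hypothetical stand-ins, the flag value is arbitrary, and a single stats object replaces the real per-cpu allocation.

```c
#include <stddef.h>

/* Hypothetical flag standing in for TCQ_F_CPUSTATS; value is arbitrary. */
#define DEMO_F_CPUSTATS 0x1u

struct demo_qstats {
	unsigned int qlen;
	unsigned int backlog;
};

struct demo_qdisc {
	unsigned int flags;
	struct demo_qstats *cpu_qstats;	/* NULL unless the flag is set */
	struct demo_qstats qstats;
};

static void demo_reset(struct demo_qdisc *q)
{
	/* Only dereference the optional counters when the flag says they
	 * were allocated. Without this test, a qdisc whose flag has been
	 * cleared would hit a NULL pointer here.
	 */
	if (q->flags & DEMO_F_CPUSTATS) {
		q->cpu_qstats->qlen = 0;
		q->cpu_qstats->backlog = 0;
	}

	q->qstats.qlen = 0;
	q->qstats.backlog = 0;
}
```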

^ permalink raw reply

* Re: [PATCH net-next] ipv6: shrink struct ipv6_mc_socklist
From: David Miller @ 2019-08-28 21:43 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet
In-Reply-To: <20190827070812.150106-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Tue, 27 Aug 2019 00:08:12 -0700

> Remove two holes on 64bit arches, to bring the size
> to one cache line exactly.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied.
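
For readers unfamiliar with "holes": on 64-bit arches the compiler pads after a 4-byte member that is followed by an 8-byte-aligned one. The sketch below uses two made-up structs (not struct ipv6_mc_socklist) to show how reordering members removes that padding; a tool such as pahole reports the same information for real kernel structs.

```c
#include <stddef.h>

/* Hypothetical layout with two holes on LP64: 4 bytes of padding after
 * 'a' and another 4 after 'b', because each pointer needs 8-byte
 * alignment.
 */
struct demo_holey {
	int a;
	void *p;
	int b;
	void *q;
};

/* Same members, reordered so the two ints share one 8-byte slot. */
struct demo_tight {
	int a;
	int b;
	void *p;
	void *q;
};

/* Bytes saved by the reordering on the current target (0 on targets
 * where pointers are no wider than int).
 */
static size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_holey) - sizeof(struct demo_tight);
}
```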

^ permalink raw reply

* Re: [PATCH v2] riscv: add support for SECCOMP and SECCOMP_FILTER
From: David Abdurachmanov @ 2019-08-28 21:39 UTC (permalink / raw)
  To: Kees Cook
  Cc: Paul Walmsley, Tycho Andersen, Palmer Dabbelt, Albert Ou,
	Oleg Nesterov, Andy Lutomirski, Will Drewry, Shuah Khan,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Abdurachmanov, Thomas Gleixner,
	Allison Randal, Alexios Zavras, Anup Patel, Vincent Chen,
	Alan Kao, linux-riscv, linux-kernel, linux-kselftest, netdev, bpf,
	me
In-Reply-To: <201908251446.04BCB8C@keescook>

On Wed, Aug 28, 2019 at 10:36 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Fri, Aug 23, 2019 at 05:30:53PM -0700, Paul Walmsley wrote:
> > On Thu, 22 Aug 2019, David Abdurachmanov wrote:
> >
> > > There is one failing kernel selftest: global.user_notification_signal
> >
> > Is this the only failing test?  Or are the rest of the selftests skipped
> > when this test fails, and no further tests are run, as seems to be shown
> > here:
> >
> >   https://lore.kernel.org/linux-riscv/CADnnUqcmDMRe1f+3jG8SPR6jRrnBsY8VVD70VbKEm0NqYeoicA@mail.gmail.com/
> >
> > For example, looking at the source, I'd naively expect to see the
> > user_notification_closed_listener test result -- which follows right
> > after the failing test in the selftest source.  But there aren't any
> > results?
> >
> > Also - could you follow up with the author of this failing test to see if
> > we can get some more clarity about what might be going wrong here?  It
> > appears that the failing test was added in commit 6a21cc50f0c7f ("seccomp:
> > add a return code to trap to userspace") by Tycho Andersen
> > <tycho@tycho.ws>.
>
> So, the original email says the riscv series is tested on top of 5.2-rc7,
> but just for fun, can you confirm that you're building a tree that includes
> 9dd3fcb0ab73 ("selftests/seccomp: Handle namespace failures gracefully")? I
> assume it does, but I suspect something similar is happening, where the
> environment is slightly different than expected and the test stalls.
>
> Does it behave the same way under emulation (i.e. can I hope to
> reproduce this myself?)

This was tested in 5.2-rc7 and later in 5.3-rc with the same behavior.
Also VM or physical HW doesn't matter, same result.

>
> --
> Kees Cook

^ permalink raw reply

* Re: [PATCH v2] riscv: add support for SECCOMP and SECCOMP_FILTER
From: David Abdurachmanov @ 2019-08-28 21:37 UTC (permalink / raw)
  To: Kees Cook
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Oleg Nesterov,
	Andy Lutomirski, Will Drewry, Shuah Khan, Alexei Starovoitov,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Abdurachmanov, Thomas Gleixner, Allison Randal,
	Alexios Zavras, Anup Patel, Vincent Chen, Alan Kao, linux-riscv,
	linux-kernel, linux-kselftest, netdev, bpf, me
In-Reply-To: <201908251451.73C6812E8@keescook>

On Wed, Aug 28, 2019 at 10:36 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, Aug 22, 2019 at 01:55:22PM -0700, David Abdurachmanov wrote:
> > This patch was extensively tested on Fedora/RISCV (applied by default on
> > top of 5.2-rc7 kernel for <2 months). The patch was also tested with 5.3-rc
> > on QEMU and SiFive Unleashed board.
>
> Oops, I see the mention of QEMU here. Where's the best place to find
> instructions on creating a qemu riscv image/environment?

Examples from what I personally use:
https://github.com/riscv/meta-riscv
https://fedoraproject.org/wiki/Architectures/RISC-V/Installing#Boot_with_libvirt
(might be outdated)

If you are running a machine with a properly working libvirt/QEMU setup:

VIRTBUILDER_IMAGE=fedora-rawhide-developer-20190703n0
FIRMWARE=fw_payload-uboot-qemu-virt-smode.elf
wget https://dl.fedoraproject.org/pub/alt/risc-v/disk-images/fedora/rawhide/20190703.n.0/Developer/$FIRMWARE
echo riscv > /tmp/rootpw
virt-builder \
    --verbose \
    --source https://dl.fedoraproject.org/pub/alt/risc-v/repo/virt-builder-images/images/index \
    --no-check-signature \
    --arch riscv64 \
    --size 10G \
    --format raw \
    --hostname fedora-riscv \
    -o disk \
    --root-password file:/tmp/rootpw \
    ${VIRTBUILDER_IMAGE}

sudo virt-install \
    --name fedora-riscv \
    --arch riscv64 \
    --vcpus 4 \
    --memory 3048 \
    --import \
    --disk path=$PWD/disk \
    --boot kernel=$PWD/${FIRMWARE} \
    --network network=default \
    --graphics none \
    --serial log.file=/tmp/fedora-riscv.serial.log \
    --noautoconsole

That image includes the SECCOMP v2 patch on top of the 5.2-rc7 kernel.

>
> > There is one failing kernel selftest: global.user_notification_signal
>
> This test has been fragile (and is not arch-specific), so as long as
> everything else is passing, I would call this patch ready to go. :)
>
> Reviewed-by: Kees Cook <keescook@chromium.org>
>
> --
> Kees Cook

^ permalink raw reply

* Re: [PATCH v1 2/5] mdev: Make mdev alias unique among all mdevs
From: Alex Williamson @ 2019-08-28 21:36 UTC (permalink / raw)
  To: Parav Pandit; +Cc: jiri, kwankhede, cohuck, davem, kvm, linux-kernel, netdev
In-Reply-To: <20190827191654.41161-3-parav@mellanox.com>

On Tue, 27 Aug 2019 14:16:51 -0500
Parav Pandit <parav@mellanox.com> wrote:

> Mdev alias should be unique among all the mdevs, so that when such alias
> is used by the mdev users to derive other objects, there is no
> collision in a given system.
> 
> Signed-off-by: Parav Pandit <parav@mellanox.com>
> 
> ---
> Changelog:
> v0->v1:
>  - Fixed inclusion of alias for NULL check
>  - Added ratelimited debug print for sha1 hash collision error
> ---
>  drivers/vfio/mdev/mdev_core.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index 62d29f57fe0c..4b9899e40665 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -375,6 +375,13 @@ int mdev_device_create(struct kobject *kobj, struct device *dev,
>  			ret = -EEXIST;
>  			goto mdev_fail;
>  		}
> +		if (tmp->alias && alias && strcmp(tmp->alias, alias) == 0) {

Nit, test if the device we're adding has an alias before the device we're
testing against.  The compiler can better optimize keeping alias hot.
Thanks,

Alex

> +			mutex_unlock(&mdev_list_lock);
> +			ret = -EEXIST;
> +			dev_dbg_ratelimited(dev, "Hash collision in alias creation for UUID %pUl\n",
> +					    uuid);
> +			goto mdev_fail;
> +		}
>  	}
>  
>  	mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
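
A userspace sketch of the uniqueness check with the operand order from the nit above: the alias of the device being added is tested first, then the alias of the existing entry. The struct and function names are hypothetical, and an array replaces the kernel's mdev list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reduced device record. */
struct demo_mdev {
	const char *alias;	/* may be NULL when no alias was set */
};

/* Returns true when 'alias' is already used by an existing device.
 * 'alias' is loop-invariant, so it is checked before the per-entry
 * pointer; a NULL on either side means there is nothing to compare.
 */
static bool demo_alias_in_use(const struct demo_mdev *devs, size_t n,
			      const char *alias)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (alias && devs[i].alias &&
		    strcmp(devs[i].alias, alias) == 0)
			return true;
	}

	return false;
}
```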


^ permalink raw reply

* Re: [PATCH v1 1/5] mdev: Introduce sha1 based mdev alias
From: Alex Williamson @ 2019-08-28 21:34 UTC (permalink / raw)
  To: Parav Pandit; +Cc: jiri, kwankhede, cohuck, davem, kvm, linux-kernel, netdev
In-Reply-To: <20190828152544.16ba2617@x1.home>

On Wed, 28 Aug 2019 15:25:44 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Tue, 27 Aug 2019 14:16:50 -0500
> Parav Pandit <parav@mellanox.com> wrote:
> >  module_init(mdev_init)
> > diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
> > index 7d922950caaf..cf1c0d9842c6 100644
> > --- a/drivers/vfio/mdev/mdev_private.h
> > +++ b/drivers/vfio/mdev/mdev_private.h
> > @@ -33,6 +33,7 @@ struct mdev_device {
> >  	struct kobject *type_kobj;
> >  	struct device *iommu_device;
> >  	bool active;
> > +	const char *alias;

Nit, put this above active to avoid creating a hole in the structure.
Thanks,

Alex

^ permalink raw reply

* Re: [PATCH bpf-next 01/10] bpf: introduce __MAX_BPF_PROG_TYPE and __MAX_BPF_MAP_TYPE enum values
From: Alexei Starovoitov @ 2019-08-28 21:33 UTC (permalink / raw)
  To: Julia Kartseva; +Cc: rdna, bpf, ast, daniel, netdev, kernel-team
In-Reply-To: <43989d37be938b7d284028481e63df0a0471e29f.1567024943.git.hex@fb.com>

On Wed, Aug 28, 2019 at 02:03:04PM -0700, Julia Kartseva wrote:
> Similar to __MAX_BPF_ATTACH_TYPE identifying the number of elements in
> bpf_attach_type enum, add trailing enum values __MAX_BPF_PROG_TYPE
> and __MAX_BPF_MAP_TYPE to simplify e.g. iteration over enum values in
> the case when new values are added.
> 
> Signed-off-by: Julia Kartseva <hex@fb.com>
> ---
>  include/uapi/linux/bpf.h | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 5d2fb183ee2d..9b681bb82211 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -136,8 +136,11 @@ enum bpf_map_type {
>  	BPF_MAP_TYPE_STACK,
>  	BPF_MAP_TYPE_SK_STORAGE,
>  	BPF_MAP_TYPE_DEVMAP_HASH,
> +	__MAX_BPF_MAP_TYPE
>  };
>  
> +#define MAX_BPF_MAP_TYPE __MAX_BPF_MAP_TYPE
> +
>  /* Note that tracing related programs such as
>   * BPF_PROG_TYPE_{KPROBE,TRACEPOINT,PERF_EVENT,RAW_TRACEPOINT}
>   * are not subject to a stable API since kernel internal data
> @@ -173,8 +176,11 @@ enum bpf_prog_type {
>  	BPF_PROG_TYPE_CGROUP_SYSCTL,
>  	BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
>  	BPF_PROG_TYPE_CGROUP_SOCKOPT,
> +	__MAX_BPF_PROG_TYPE
>  };
>  
> +#define MAX_BPF_PROG_TYPE __MAX_BPF_PROG_TYPE
> +

This came up before and my position is still the same.
I'm against this type of band-aid in uapi.
'bpftool feature probe' can easily discover all supported
prog and map types already.
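
For context, the pattern the patch proposes is the usual trailing-sentinel idiom. A minimal sketch with a made-up enum (the DEMO_ names are hypothetical; the leading double underscore simply mirrors the naming in the patch):

```c
/* Hypothetical enum; the last entry is not a real type, only a count. */
enum demo_map_type {
	DEMO_MAP_TYPE_UNSPEC,
	DEMO_MAP_TYPE_HASH,
	DEMO_MAP_TYPE_ARRAY,
	__MAX_DEMO_MAP_TYPE
};

#define MAX_DEMO_MAP_TYPE __MAX_DEMO_MAP_TYPE

/* Walk every real type without hard-coding the last one: the loop bound
 * moves automatically when a new enumerator is added before the
 * sentinel.
 */
static int demo_count_map_types(void)
{
	int type, count = 0;

	for (type = DEMO_MAP_TYPE_UNSPEC + 1; type < MAX_DEMO_MAP_TYPE; type++)
		count++;

	return count;
}
```

The catch with doing this in a uapi header is that the sentinel's value changes with every kernel release, so a binary built against one header can disagree with the running kernel about the count. Probing at run time does not have that problem.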


^ permalink raw reply

* Re: [PATCH net] netdevsim: Restore per-network namespace accounting for fib entries
From: David Ahern @ 2019-08-28 21:26 UTC (permalink / raw)
  To: Jiri Pirko, David Ahern; +Cc: davem, netdev
In-Reply-To: <20190828103718.GF2312@nanopsycho>

On 8/28/19 4:37 AM, Jiri Pirko wrote:
> Tue, Aug 06, 2019 at 09:15:17PM CEST, dsahern@kernel.org wrote:
>> From: David Ahern <dsahern@gmail.com>
>>
>> Prior to the commit in the fixes tag, the resource controller in netdevsim
>> tracked fib entries and rules per network namespace. Restore that behavior.
> 
> David, please help me understand. If the counters are per-device, not
> per-netns, they are both the same. If a device (devlink instance) is in
> a netns and takes only things happening in this netns into account,
> it should count exactly the same number of fib entries, shouldn't it?

If you are only changing where the counters are stored - net_generic vs
devlink private - then yes, they should be equivalent.

> 
> I rethought the devlink netns patchset and am currently going in a
> slightly different direction. I have netns as an attribute of
> devlink reload, so all the port netdevices and everything else get
> re-instantiated in the new netns. This works fine with mlxsw. There we
> also re-register the fib notifier.
> 
> I think that this can work for your usecase in netdevsim too:
> 1) devlink instance is registering a fib notifier to track all fib
>    entries in a namespace it belongs to. The counters are per-device -
>    counting fib entries in a namespace the device is in.
> 2) another devlink instance can do the same tracking in the same
>    namespace. No problem, it's a separate counter, but the numbers are
>    the same. One can set different limits to different devlink
>    instances, but you can have only one. That is the behaviour you have
>    now.
> 3) on devlink reload, netdevsim re-instantiates ports and re-registers
>    fib notifier
> 4) on devlink reload with netns change, all should be fine as the
>    re-registered fib notifier replays the entries. The ports are
>    re-instantiated in the new netns.
> 
> This way, we would get consistent behaviour between netdevsim and real
> devices (mlxsw), correct devlink-netns implementation (you also
> suggested to move ports to the namespace). Everyone should be happy.
> 
> What do you think?
> 

Right now, registering the fib notifier walks all namespaces. That is
not a scalable solution. Are you changing that to replay only a given
netns? Are you changing the notifiers to be per-namespace?

Also, you are still allowing devlink instances to be created within a
namespace?

^ permalink raw reply

* Re: [PATCH v1 1/5] mdev: Introduce sha1 based mdev alias
From: Alex Williamson @ 2019-08-28 21:25 UTC (permalink / raw)
  To: Parav Pandit; +Cc: jiri, kwankhede, cohuck, davem, kvm, linux-kernel, netdev
In-Reply-To: <20190827191654.41161-2-parav@mellanox.com>

On Tue, 27 Aug 2019 14:16:50 -0500
Parav Pandit <parav@mellanox.com> wrote:

> Some vendor drivers want an identifier for an mdev device that is
> shorter than the UUID, due to length restrictions in the consumers of
> that identifier.
> 
> Add a callback that allows a vendor driver to request an alias of a
> specified length to be generated for an mdev device. If generated,
> that alias is checked for collisions.
> 
> It is an optional attribute.
> mdev alias is generated using sha1 from the mdev name.
> 
> Signed-off-by: Parav Pandit <parav@mellanox.com>
> 
> ---
> Changelog:
> 
> v0->v1:
>  - Moved alias length check outside of the parent lock
>  - Moved alias and digest allocation from kvzalloc to kzalloc
>  - &alias[0] changed to alias
>  - alias_length check is nested under get_alias_length callback check
>  - Changed comments to start with an empty line
>  - Fixed cleanup of hash if mdev_bus_register() fails
>  - Added comment where alias memory ownership is handed over to mdev device
>  - Updated commit log to indicate motivation for this feature
> ---
>  drivers/vfio/mdev/mdev_core.c    | 110 ++++++++++++++++++++++++++++++-
>  drivers/vfio/mdev/mdev_private.h |   5 +-
>  drivers/vfio/mdev/mdev_sysfs.c   |  13 ++--
>  include/linux/mdev.h             |   4 ++
>  4 files changed, 122 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index b558d4cfd082..62d29f57fe0c 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -10,9 +10,11 @@
>  #include <linux/module.h>
>  #include <linux/device.h>
>  #include <linux/slab.h>
> +#include <linux/mm.h>
>  #include <linux/uuid.h>
>  #include <linux/sysfs.h>
>  #include <linux/mdev.h>
> +#include <crypto/hash.h>
>  
>  #include "mdev_private.h"
>  
> @@ -27,6 +29,8 @@ static struct class_compat *mdev_bus_compat_class;
>  static LIST_HEAD(mdev_list);
>  static DEFINE_MUTEX(mdev_list_lock);
>  
> +static struct crypto_shash *alias_hash;
> +
>  struct device *mdev_parent_dev(struct mdev_device *mdev)
>  {
>  	return mdev->parent->dev;
> @@ -150,6 +154,16 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
>  	if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
>  		return -EINVAL;
>  
> +	if (ops->get_alias_length) {
> +		unsigned int digest_size;
> +		unsigned int aligned_len;
> +
> +		aligned_len = roundup(ops->get_alias_length(), 2);
> +		digest_size = crypto_shash_digestsize(alias_hash);
> +		if (aligned_len / 2 > digest_size)
> +			return -EINVAL;
> +	}
> +
>  	dev = get_device(dev);
>  	if (!dev)
>  		return -EINVAL;
> @@ -259,6 +273,7 @@ static void mdev_device_free(struct mdev_device *mdev)
>  	mutex_unlock(&mdev_list_lock);
>  
>  	dev_dbg(&mdev->dev, "MDEV: destroying\n");
> +	kfree(mdev->alias);
>  	kfree(mdev);
>  }
>  
> @@ -269,18 +284,88 @@ static void mdev_device_release(struct device *dev)
>  	mdev_device_free(mdev);
>  }
>  
> -int mdev_device_create(struct kobject *kobj,
> -		       struct device *dev, const guid_t *uuid)
> +static const char *
> +generate_alias(const char *uuid, unsigned int max_alias_len)
> +{
> +	struct shash_desc *hash_desc;
> +	unsigned int digest_size;
> +	unsigned char *digest;
> +	unsigned int alias_len;
> +	char *alias;
> +	int ret = 0;
> +
> +	/*
> +	 * Align to multiple of 2 as bin2hex will generate
> +	 * even number of bytes.
> +	 */
> +	alias_len = roundup(max_alias_len, 2);
> +	alias = kzalloc(alias_len + 1, GFP_KERNEL);
> +	if (!alias)
> +		return NULL;
> +
> +	/* Allocate and init descriptor */
> +	hash_desc = kvzalloc(sizeof(*hash_desc) +
> +			     crypto_shash_descsize(alias_hash),
> +			     GFP_KERNEL);
> +	if (!hash_desc)
> +		goto desc_err;
> +
> +	hash_desc->tfm = alias_hash;
> +
> +	digest_size = crypto_shash_digestsize(alias_hash);
> +
> +	digest = kzalloc(digest_size, GFP_KERNEL);
> +	if (!digest) {
> +		ret = -ENOMEM;
> +		goto digest_err;
> +	}
> +	crypto_shash_init(hash_desc);
> +	crypto_shash_update(hash_desc, uuid, UUID_STRING_LEN);
> +	crypto_shash_final(hash_desc, digest);

All of these can fail, and many, if not most, of their callers appear
to test the return value.  Thanks,

Alex
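
A minimal sketch of the return-value checking asked for above, written as
plain userspace C so it can be compiled on its own. The demo_hash_*()
functions and struct demo_desc are hypothetical stand-ins for the
init/update/final sequence in the quoted hunk, not the kernel crypto API; the
only assumption carried over is that each stage returns 0 on success or a
negative errno.

```c
#include <errno.h>
#include <string.h>

#define DEMO_DIGEST_SIZE 4

/* Hypothetical descriptor; fail_step selects which stage reports an error. */
struct demo_desc {
	int fail_step;	/* 0 = none, 1 = init, 2 = update, 3 = final */
};

/* Hypothetical stand-ins: each returns 0 on success or a negative errno. */
static int demo_hash_init(struct demo_desc *desc)
{
	return desc->fail_step == 1 ? -EINVAL : 0;
}

static int demo_hash_update(struct demo_desc *desc, const char *data,
			    size_t len)
{
	(void)data;
	(void)len;
	return desc->fail_step == 2 ? -EIO : 0;
}

static int demo_hash_final(struct demo_desc *desc, unsigned char *out)
{
	if (desc->fail_step == 3)
		return -ENOMEM;
	memset(out, 0xab, DEMO_DIGEST_SIZE);
	return 0;
}

/* Stop at the first failing stage and hand its error code to the caller. */
int demo_digest(struct demo_desc *desc, const char *data, size_t len,
		unsigned char *out)
{
	int ret;

	ret = demo_hash_init(desc);
	if (ret)
		return ret;

	ret = demo_hash_update(desc, data, len);
	if (ret)
		return ret;

	return demo_hash_final(desc, out);
}
```

In a function shaped like generate_alias(), the early returns would instead
jump to the existing unwind labels so the digest, descriptor and alias buffers
are freed on every error path.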

> +	bin2hex(alias, digest, min_t(unsigned int, digest_size, alias_len / 2));
> +	/*
> +	 * When alias length is odd, zero out the additional last byte
> +	 * that bin2hex has copied.
> +	 */
> +	if (max_alias_len % 2)
> +		alias[max_alias_len] = 0;
> +
> +	kfree(digest);
> +	kvfree(hash_desc);
> +	return alias;
> +
> +digest_err:
> +	kvfree(hash_desc);
> +desc_err:
> +	kfree(alias);
> +	return NULL;
> +}
> +
> +int mdev_device_create(struct kobject *kobj, struct device *dev,
> +		       const char *uuid_str, const guid_t *uuid)
>  {
>  	int ret;
>  	struct mdev_device *mdev, *tmp;
>  	struct mdev_parent *parent;
>  	struct mdev_type *type = to_mdev_type(kobj);
> +	const char *alias = NULL;
>  
>  	parent = mdev_get_parent(type->parent);
>  	if (!parent)
>  		return -EINVAL;
>  
> +	if (parent->ops->get_alias_length) {
> +		unsigned int alias_len;
> +
> +		alias_len = parent->ops->get_alias_length();
> +		if (alias_len) {
> +			alias = generate_alias(uuid_str, alias_len);
> +			if (!alias) {
> +				ret = -ENOMEM;
> +				goto alias_fail;
> +			}
> +		}
> +	}
>  	mutex_lock(&mdev_list_lock);
>  
>  	/* Check for duplicate */
> @@ -300,6 +385,12 @@ int mdev_device_create(struct kobject *kobj,
>  	}
>  
>  	guid_copy(&mdev->uuid, uuid);
> +	mdev->alias = alias;
> +	/*
> +	 * At this point alias memory is owned by the mdev.
> +	 * Mark it NULL, so that only mdev can free it.
> +	 */
> +	alias = NULL;
>  	list_add(&mdev->next, &mdev_list);
>  	mutex_unlock(&mdev_list_lock);
>  
> @@ -346,6 +437,8 @@ int mdev_device_create(struct kobject *kobj,
>  	up_read(&parent->unreg_sem);
>  	put_device(&mdev->dev);
>  mdev_fail:
> +	kfree(alias);
> +alias_fail:
>  	mdev_put_parent(parent);
>  	return ret;
>  }
> @@ -406,7 +499,17 @@ EXPORT_SYMBOL(mdev_get_iommu_device);
>  
>  static int __init mdev_init(void)
>  {
> -	return mdev_bus_register();
> +	int ret;
> +
> +	alias_hash = crypto_alloc_shash("sha1", 0, 0);
> +	if (!alias_hash)
> +		return -ENOMEM;
> +
> +	ret = mdev_bus_register();
> +	if (ret)
> +		crypto_free_shash(alias_hash);
> +
> +	return ret;
>  }
>  
>  static void __exit mdev_exit(void)
> @@ -415,6 +518,7 @@ static void __exit mdev_exit(void)
>  		class_compat_unregister(mdev_bus_compat_class);
>  
>  	mdev_bus_unregister();
> +	crypto_free_shash(alias_hash);
>  }
>  
>  module_init(mdev_init)
> diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
> index 7d922950caaf..cf1c0d9842c6 100644
> --- a/drivers/vfio/mdev/mdev_private.h
> +++ b/drivers/vfio/mdev/mdev_private.h
> @@ -33,6 +33,7 @@ struct mdev_device {
>  	struct kobject *type_kobj;
>  	struct device *iommu_device;
>  	bool active;
> +	const char *alias;
>  };
>  
>  #define to_mdev_device(dev)	container_of(dev, struct mdev_device, dev)
> @@ -57,8 +58,8 @@ void parent_remove_sysfs_files(struct mdev_parent *parent);
>  int  mdev_create_sysfs_files(struct device *dev, struct mdev_type *type);
>  void mdev_remove_sysfs_files(struct device *dev, struct mdev_type *type);
>  
> -int  mdev_device_create(struct kobject *kobj,
> -			struct device *dev, const guid_t *uuid);
> +int mdev_device_create(struct kobject *kobj, struct device *dev,
> +		       const char *uuid_str, const guid_t *uuid);
>  int  mdev_device_remove(struct device *dev);
>  
>  #endif /* MDEV_PRIVATE_H */
> diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
> index 7570c7602ab4..43afe0e80b76 100644
> --- a/drivers/vfio/mdev/mdev_sysfs.c
> +++ b/drivers/vfio/mdev/mdev_sysfs.c
> @@ -63,15 +63,18 @@ static ssize_t create_store(struct kobject *kobj, struct device *dev,
>  		return -ENOMEM;
>  
>  	ret = guid_parse(str, &uuid);
> -	kfree(str);
>  	if (ret)
> -		return ret;
> +		goto err;
>  
> -	ret = mdev_device_create(kobj, dev, &uuid);
> +	ret = mdev_device_create(kobj, dev, str, &uuid);
>  	if (ret)
> -		return ret;
> +		goto err;
>  
> -	return count;
> +	ret = count;
> +
> +err:
> +	kfree(str);
> +	return ret;
>  }
>  
>  MDEV_TYPE_ATTR_WO(create);
> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
> index 0ce30ca78db0..f036fe9854ee 100644
> --- a/include/linux/mdev.h
> +++ b/include/linux/mdev.h
> @@ -72,6 +72,9 @@ struct device *mdev_get_iommu_device(struct device *dev);
>   * @mmap:		mmap callback
>   *			@mdev: mediated device structure
>   *			@vma: vma structure
> + * @get_alias_length:	Generate alias for the mdevs of this parent based on the
> + *			mdev device name when it returns non zero alias length.
> + *			It is optional.
>   * Parent device that support mediated device should be registered with mdev
>   * module with mdev_parent_ops structure.
>   **/
> @@ -92,6 +95,7 @@ struct mdev_parent_ops {
>  	long	(*ioctl)(struct mdev_device *mdev, unsigned int cmd,
>  			 unsigned long arg);
>  	int	(*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
> +	unsigned int (*get_alias_length)(void);
>  };
>  
>  /* interface for exporting mdev supported type attributes */


^ permalink raw reply

* Re: [PATCH bpf-next 03/10] tools/bpf: handle __MAX_BPF_(PROG|MAP)_TYPE in switch statements
From: Arnaldo Carvalho de Melo @ 2019-08-28 21:19 UTC (permalink / raw)
  To: Julia Kartseva; +Cc: rdna, bpf, ast, daniel, netdev, kernel-team
In-Reply-To: <1895f7dfe2a8067f6397ff565edf20130a28aa91.1567024943.git.hex@fb.com>

On Wed, Aug 28, 2019 at 02:03:06PM -0700, Julia Kartseva wrote:
> Add cases to switch statements in probe_load, bpf_prog_type__needs_kver and
> bpf_probe_map_type to fix the "enumeration value not handled in switch"
> compilation error.
> The prog_type_name array in bpftool/main.h doesn't have a __MAX_BPF_PROG_TYPE
> entry, same for map, so probe won't be called.

Shouldn't this be added when adding that __MAX_BPF_PROG_TYPE value to
the enum? Otherwise the build will fail when __MAX_BPF_PROG_TYPE is
added but not handled in the switches.

I.e. the tree should build patch by patch, not just at the end of the
patch series.

- Arnaldo
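
The breakage described above can be reproduced in miniature. This is only a
sketch with hypothetical names (demo_prog_type, demo_needs_kver), and it
assumes a build that enables something like -Wswitch-enum together with
-Werror, under which every enumerator needs its own case label even when a
default label is present.

```c
#include <stdbool.h>

/* Hypothetical enum with a trailing sentinel, mirroring the quoted patch. */
enum demo_prog_type {
	DEMO_PROG_TYPE_SOCKET_FILTER,
	DEMO_PROG_TYPE_KPROBE,
	__DEMO_MAX_PROG_TYPE
};

bool demo_needs_kver(enum demo_prog_type type)
{
	switch (type) {
	case DEMO_PROG_TYPE_SOCKET_FILTER:
	/*
	 * Without the next label, -Wswitch-enum reports the sentinel as an
	 * enumeration value not explicitly handled in the switch.
	 */
	case __DEMO_MAX_PROG_TYPE:
		return false;
	case DEMO_PROG_TYPE_KPROBE:
	default:
		return true;
	}
}
```

If the enumerator is added in one commit and the case label in a later one,
every commit in between fails to build under those flags, which is what breaks
bisection.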
 
> Signed-off-by: Julia Kartseva <hex@fb.com>
> ---
>  tools/lib/bpf/libbpf.c        | 1 +
>  tools/lib/bpf/libbpf_probes.c | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 2233f919dd88..72e6e5eb397f 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -3580,6 +3580,7 @@ static bool bpf_prog_type__needs_kver(enum bpf_prog_type type)
>  	case BPF_PROG_TYPE_PERF_EVENT:
>  	case BPF_PROG_TYPE_CGROUP_SYSCTL:
>  	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
> +	case __MAX_BPF_PROG_TYPE:
>  		return false;
>  	case BPF_PROG_TYPE_KPROBE:
>  	default:
> diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
> index 4b0b0364f5fc..8f2ba6a457ac 100644
> --- a/tools/lib/bpf/libbpf_probes.c
> +++ b/tools/lib/bpf/libbpf_probes.c
> @@ -102,6 +102,7 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
>  	case BPF_PROG_TYPE_FLOW_DISSECTOR:
>  	case BPF_PROG_TYPE_CGROUP_SYSCTL:
>  	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
> +	case __MAX_BPF_PROG_TYPE:
>  	default:
>  		break;
>  	}
> @@ -250,6 +251,7 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
>  	case BPF_MAP_TYPE_XSKMAP:
>  	case BPF_MAP_TYPE_SOCKHASH:
>  	case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
> +	case __MAX_BPF_MAP_TYPE:
>  	default:
>  		break;
>  	}
> -- 
> 2.17.1

-- 

- Arnaldo

^ permalink raw reply

