* Re: One question about __tcp_select_window()
From: Wang Jian @ 2018-04-18 0:23 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
In-Reply-To: <6b7c1a82-e253-09bd-fd3e-363ce6a84653@gmail.com>
Thanks for your reply, Eric.
Actually, this is just a question that came up while I was reading the code.
From my instinct and the comment, I think we should choose the bigger
one, but maybe I am missing something (like you said, autotuning).
Anyway, I will read more code and do more tests.
Thanks.
On Tue, Apr 17, 2018 at 10:43 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
> On 04/17/2018 06:53 AM, Wang Jian wrote:
>> I tested the fix with 4.17.0-rc1+ and it seems to work.
>>
>> 1. iperf -c IP -i 20 -t 60 -w 1K
>> with-fix vs without-fix : 1.15Gbits/sec vs 1.05Gbits/sec
>> I also try other windows and have similar results.
>>
>> 2. Use tcp probe trace snd_wind.
>> with-fix vs without-fix: 1245568 vs 1042816
>>
>> 3. I don't see extra retransmit/drops.
>>
>
> Unfortunately I have no idea what exact problem you had to solve.
>
> Setting small windows is not exactly the path we are taking.
>
> And I do not know how many side effects your change will have for 'standard' flows
> using autotuning or sane windows.
>
^ permalink raw reply
* Re: [RFC PATCH v3 bpf-next 2/5] bpf/verifier: rewrite subprog boundary detection
From: Alexei Starovoitov @ 2018-04-17 23:48 UTC (permalink / raw)
To: Edward Cree; +Cc: Daniel Borkmann, netdev
In-Reply-To: <99e70dfe-66a1-911a-6616-60eae4ddc689@solarflare.com>
On Fri, Apr 06, 2018 at 06:13:59PM +0100, Edward Cree wrote:
> By storing a subprogno in each insn's aux data, we avoid the need to keep
> the list of subprog starts sorted or bsearch() it in find_subprog().
> Also, get rid of the weird one-based indexing of subprog numbers.
>
> Signed-off-by: Edward Cree <ecree@solarflare.com>
> ---
> include/linux/bpf_verifier.h | 3 +-
> kernel/bpf/verifier.c | 284 ++++++++++++++++++++++++++-----------------
> 2 files changed, 177 insertions(+), 110 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 8f70dc181e23..17990dd56e65 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -146,6 +146,7 @@ struct bpf_insn_aux_data {
> s32 call_imm; /* saved imm field of call insn */
> };
> int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
> + u16 subprogno; /* subprog in which this insn resides */
> bool seen; /* this insn was processed by the verifier */
> };
As I was saying before, this is a no-go.
subprogno is meaningless in the hierarchy of: prog -> func -> bb -> insn
Soon bpf will have libraries, and this field would need to become
a pointer back to the bb or func structure, creating an unnecessary circular dependency.
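
For readers following the disagreement, here is a minimal user-space sketch of the two lookup strategies the patch description contrasts. It is not the verifier code: `struct insn_aux`, `find_subprog_sorted`, `find_subprog_aux`, and `fill_aux` are hypothetical names, and the sorted-array variant only illustrates the bsearch() approach the patch says it removes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-instruction side data; only the field under
 * discussion is modeled here. */
struct insn_aux {
	uint16_t subprogno; /* index of the subprog containing this insn */
};

static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Variant 1: sorted array of subprog start offsets. An exact-match
 * bsearch() maps a start offset to its index, or -1 if `off` is not
 * the first insn of any subprog. */
static int find_subprog_sorted(const int *starts, size_t n, int off)
{
	const int *p = bsearch(&off, starts, n, sizeof(*starts), cmp_int);

	return p ? (int)(p - starts) : -1;
}

/* Variant 2: constant-time lookup of the containing subprog through
 * the per-insn data. */
static int find_subprog_aux(const struct insn_aux *aux, size_t insn_cnt,
			    size_t idx)
{
	assert(idx < insn_cnt);
	return aux[idx].subprogno;
}

/* Fill the per-insn numbers from sorted starts (starts[0] == 0 assumed). */
static void fill_aux(struct insn_aux *aux, size_t insn_cnt,
		     const int *starts, size_t n)
{
	size_t cur = 0;

	for (size_t i = 0; i < insn_cnt; i++) {
		while (cur + 1 < n && (size_t)starts[cur + 1] <= i)
			cur++;
		aux[i].subprogno = (uint16_t)cur;
	}
}
```

The sketch only shows the lookup side. The objection above is about coupling (an insn pointing back up the prog -> func -> bb hierarchy), which a lookup-cost comparison does not address.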
^ permalink raw reply
* Re: [PATCH net-next] net: introduce a new tracepoint for tcp_rcv_space_adjust
From: Alexei Starovoitov @ 2018-04-17 23:44 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Yafang Shao, davem, songliubraving, netdev, linux-kernel
In-Reply-To: <d2ab1647-c832-c982-5952-ab7a415f7c76@gmail.com>
On Mon, Apr 16, 2018 at 08:43:31AM -0700, Eric Dumazet wrote:
>
>
> On 04/16/2018 08:33 AM, Yafang Shao wrote:
> > tcp_rcv_space_adjust is called every time data is copied to user space,
> > so introducing a tcp tracepoint for it could show us when the packet is
> > copied to user.
> > This could help us figure out whether there's latency in the user process.
> >
> > When a tcp packet arrives, tcp_rcv_established() will be called and with
> > the existing tracepoint tcp_probe we could get the time when this packet
> > arrives.
> > Then this packet will be copied to user, and tcp_rcv_space_adjust will
> > be called and with this newly introduced tracepoint we could get the time
> > when this packet is copied to user.
> >
> > arrives time : user process time => latency caused by user
> > tcp_probe tcp_rcv_space_adjust
> >
> > Hence in the printk message, sk is printed as a key to connect these two
> > tracepoints.
> >
>
> socket pointer is not a key.
>
> TCP sockets can be reused pretty fast after free.
>
> I suggest you go for cookie instead, this is a unique 64-bit identifier.
> ( sock_gen_cookie() for details )
I think it would be even better if the stack did this sock_gen_cookie()
on its own, in some way that the user cannot infer the order.
In many cases we wanted to use the socket cookie, but since it's not initialized
by default it's kind of useless.
Turning this tracepoint on just to get the cookie would be an ugly workaround.
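
To illustrate Eric's point that a socket pointer is not a stable key, here is a small user-space sketch. Everything in it is hypothetical (`struct fake_sock`, `fake_alloc`, `gen_cookie`, the single-slot allocator); it only models the idea of lazily assigning a never-reused 64-bit counter value and is not the kernel's sock_gen_cookie() implementation.

```c
#include <stdint.h>
#include <string.h>

struct fake_sock {
	uint64_t cookie; /* 0 means "not assigned yet" */
};

static struct fake_sock slot;   /* one reusable slot: models fast reuse after free */
static uint64_t cookie_counter; /* only ever grows, so values are never handed out twice */

/* Hand out the same memory every time, as a freed socket's address may be. */
static struct fake_sock *fake_alloc(void)
{
	memset(&slot, 0, sizeof(slot));
	return &slot;
}

/* Assign a cookie on first use, then keep returning the same value. */
static uint64_t gen_cookie(struct fake_sock *sk)
{
	if (!sk->cookie)
		sk->cookie = ++cookie_counter;
	return sk->cookie;
}
```

Two sockets allocated one after the other share an address here but get different cookies. Note that a plain counter like this also exposes creation order, which is exactly the concern raised above, and it stays unset until something asks for it.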
^ permalink raw reply
* Re: fix for bnx2x panic during ethtool reporting
From: Florian Fainelli @ 2018-04-17 23:42 UTC (permalink / raw)
To: Sebastian Kuzminsky, linux-kernel, netdev, ariel.elior,
everest-linux-l2
In-Reply-To: <CAOTh5U2NwGU68cwzX2amhzPyj_=f0pm4CO4biR4m1DX49MJkUw@mail.gmail.com>
+netdev, Ariel,
On 04/17/2018 10:21 AM, Sebastian Kuzminsky wrote:
> "ethtool -i" on a bnx2x interface causes kernel panic when the
> firmware version is longer than expected. The attached patch fixes
> the problem by simplifying the string handling in bnx2x_fill_fw_str().
> It applies cleanly to 4.14 and 4.17-rc1.
If you want to have a chance of getting your patch included, you should
make sure you copy the driver maintainers and the network mailing list.
Doing that now.
--
Florian
^ permalink raw reply
* Re: [patch net-next RFC 00/12] devlink: introduce port flavours and common phys_port_name generation
From: Jakub Kicinski @ 2018-04-17 23:42 UTC (permalink / raw)
To: Or Gerlitz
Cc: Jiri Pirko, Linux Netdev List, David Miller, Ido Schimmel, mlxsw,
Andrew Lunn, Vivien Didelot, Florian Fainelli, Michael Chan,
Ganesh Goudar, Saeed Mahameed, Simon Horman,
Pieter Jansen van Vuuren, John Hurley, Dirk van der Merwe,
Alexander Duyck, Or Gerlitz, David Ahern, vijaya.guvva
In-Reply-To: <CAJ3xEMhX621ed7s-z8OO6rmwP2avczk-bh6HpERwwAU+ufGzTw@mail.gmail.com>
On Tue, 17 Apr 2018 16:23:48 +0300, Or Gerlitz wrote:
> On Thu, Mar 22, 2018 at 1:55 PM, Jiri Pirko <jiri@resnulli.us> wrote:
> > From: Jiri Pirko <jiri@mellanox.com>
> >
> > This patchset resolves 2 issues we have right now:
> > 1) There are many netdevices / ports in the system, for port, pf, vf
> > representation, but the user has no way to see which is which
> > 2) The ndo_get_phys_port_name is implemented in each driver separately,
> > which may lead to inconsistent names between drivers.
> >
> > This patchset introduces port flavours which should address the first
> > problem. I'm testing this with Netronome nfp hardware. When the user
> > has 2 physical ports, 1 pf, and 4 vfs, he should see something like this:
>
> J/J (Jiri/Jakub) --
>
> re "2 physical ports, 1 pf, and 4 vfs" --- does NFP expose one PF for
> both physical ports?
Yes there are multiple PCIe PFs on the card, but the basic CX card just
uses one for all uplinks (like mlx4).
> FWIW note that in mlx5 and AFAIK any other device except for mlx4 (...)
> folks have FPP (Function Per Port) scheme.
>
> [..]
>
> > The desired output should look like this:
> > # devlink port
> > pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical number 0
> > pci/0000:05:00.0/1: type eth netdev enp5s0np1 flavour physical number 1
> > pci/0000:05:00.0/2: type eth netdev enp5s0npf0 flavour pf_rep number 0
> > pci/0000:05:00.0/3: type eth netdev enp5s0nvf0 flavour vf_rep number 0
> > pci/0000:05:00.0/4: type eth netdev enp5s0nvf1 flavour vf_rep number 1
> > pci/0000:05:00.0/5: type eth netdev enp5s0nvf2 flavour vf_rep number 2
> > pci/0000:05:00.0/6: type eth netdev enp5s0nvf3 flavour vf_rep number 3
> > As you can see, the netdev names are generated according to the flavour
> > and port number. In case the port is split, the split subnumber is also included.
>
> What is the purpose/role in getting dev link ports here? is it such
> that @ the end
> of the day the driver would do a devlink_port_get_phys_port_name() call in their
> get phys port name ndo? or we buy more advantages out of doing so?
IMHO having a way to get all netdevs and the netdev type from devlink is
quite user friendly. As of today we also use the devlink ports for port
splitting on 40/100G parts. Hopefully more functionality migrates over
from ethtool over time.
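
As a rough illustration of the common name generation discussed in the quoted cover letter, here is a user-space sketch that derives the phys_port_name part from flavour and port number. The suffixes ("p", "pf", "vf") are inferred from the sample `devlink port` output above (enp5s0np0, enp5s0npf0, enp5s0nvf0); the enum and the `port_name` function are hypothetical, and split ports are left out.

```c
#include <stdio.h>

/* Hypothetical flavour enum mirroring the three flavours in the sample output. */
enum port_flavour {
	FLAVOUR_PHYSICAL,
	FLAVOUR_PF_REP,
	FLAVOUR_VF_REP,
};

/* Write "<prefix><number>" into buf. Returns 0 on success, -1 on an
 * unknown flavour or if the name does not fit. */
static int port_name(char *buf, size_t len, enum port_flavour flavour,
		     unsigned int number)
{
	const char *prefix;
	int n;

	switch (flavour) {
	case FLAVOUR_PHYSICAL:
		prefix = "p";
		break;
	case FLAVOUR_PF_REP:
		prefix = "pf";
		break;
	case FLAVOUR_VF_REP:
		prefix = "vf";
		break;
	default:
		return -1;
	}

	n = snprintf(buf, len, "%s%u", prefix, number);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Keeping this in one place rather than in each driver's ndo is what would make the names consistent across drivers.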
^ permalink raw reply
* [PATCH net] net: qualcomm: rmnet: Fix warning seen with fill_info
From: Subash Abhinov Kasiviswanathan @ 2018-04-17 23:40 UTC (permalink / raw)
To: davem, netdev; +Cc: Subash Abhinov Kasiviswanathan
When the last rmnet device attached to a real device is removed, the
real device is unregistered from rmnet. As a result, the real device
lookup fails resulting in a warning when the fill_info handler is
called as part of the rmnet device unregistration.
Fix this by returning the rmnet flags as 0 when no real device is
present.
WARNING: CPU: 0 PID: 1779 at net/core/rtnetlink.c:3254
rtmsg_ifinfo_build_skb+0xca/0x10d
Modules linked in:
CPU: 0 PID: 1779 Comm: ip Not tainted 4.16.0-11872-g7ce2367 #1
Stack:
7fe655f0 60371ea3 00000000 00000000
60282bc6 6006b116 7fe65600 60371ee8
7fe65660 6003a68c 00000000 900000000
Call Trace:
[<6006b116>] ? printk+0x0/0x94
[<6001f375>] show_stack+0xfe/0x158
[<60371ea3>] ? dump_stack_print_info+0xe8/0xf1
[<60282bc6>] ? rtmsg_ifinfo_build_skb+0xca/0x10d
[<6006b116>] ? printk+0x0/0x94
[<60371ee8>] dump_stack+0x2a/0x2c
[<6003a68c>] __warn+0x10e/0x13e
[<6003a82c>] warn_slowpath_null+0x48/0x4f
[<60282bc6>] rtmsg_ifinfo_build_skb+0xca/0x10d
[<60282c4d>] rtmsg_ifinfo_event.part.37+0x1e/0x43
[<60282c2f>] ? rtmsg_ifinfo_event.part.37+0x0/0x43
[<60282d03>] rtmsg_ifinfo+0x24/0x28
[<60264e86>] dev_close_many+0xba/0x119
[<60282cdf>] ? rtmsg_ifinfo+0x0/0x28
[<6027c225>] ? rtnl_is_locked+0x0/0x1c
[<6026ca67>] rollback_registered_many+0x1ae/0x4ae
[<600314be>] ? unblock_signals+0x0/0xae
[<6026cdc0>] ? unregister_netdevice_queue+0x19/0xec
[<6026ceec>] unregister_netdevice_many+0x21/0xa1
[<6027c765>] rtnl_delete_link+0x3e/0x4e
[<60280ecb>] rtnl_dellink+0x262/0x29c
[<6027c241>] ? rtnl_get_link+0x0/0x3e
[<6027f867>] rtnetlink_rcv_msg+0x235/0x274
Fixes: be81a85f5f87 ("net: qualcomm: rmnet: Implement fill_info")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
index d339885..5f4e447 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
@@ -350,15 +350,16 @@ static int rmnet_fill_info(struct sk_buff *skb, const struct net_device *dev)
real_dev = priv->real_dev;
- if (!rmnet_is_real_dev_registered(real_dev))
- return -ENODEV;
-
if (nla_put_u16(skb, IFLA_RMNET_MUX_ID, priv->mux_id))
goto nla_put_failure;
- port = rmnet_get_port_rtnl(real_dev);
+ if (rmnet_is_real_dev_registered(real_dev)) {
+ port = rmnet_get_port_rtnl(real_dev);
+ f.flags = port->data_format;
+ } else {
+ f.flags = 0;
+ }
- f.flags = port->data_format;
f.mask = ~0;
if (nla_put(skb, IFLA_RMNET_FLAGS, sizeof(f), &f))
--
1.9.1
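
The control flow of the fix can be modeled in user space as follows. This is only a sketch under stated assumptions: `struct model_port`, `struct model_flags`, and `model_fill_flags` are hypothetical stand-ins, and a NULL port models "the real device is no longer registered with rmnet".

```c
#include <stddef.h>
#include <stdint.h>

struct model_port {
	uint32_t data_format;
};

struct model_flags {
	uint32_t flags;
	uint32_t mask;
};

/* Report the port's data format when a port exists, and 0 otherwise,
 * instead of failing the whole fill operation. */
static struct model_flags model_fill_flags(const struct model_port *port)
{
	struct model_flags f;

	f.flags = port ? port->data_format : 0;
	f.mask = ~0u;
	return f;
}
```

The point of the pattern is that the attribute is still emitted during unregistration, so the caller never sees an error from the fill step.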
^ permalink raw reply related
* Re: [PATCH] samples/bpf: correct comment in sock_example.c
From: Alexei Starovoitov @ 2018-04-17 23:39 UTC (permalink / raw)
To: Wang Sheng-Hui; +Cc: ast, daniel, netdev
In-Reply-To: <20180417022520.2412-1-shhuiw@foxmail.com>
On Tue, Apr 17, 2018 at 10:25:20AM +0800, Wang Sheng-Hui wrote:
> The program runs against loopback interface "lo", not "eth0".
> Correct the comment.
>
> Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
for future patches please use the following format for the subject:
[PATCH bpf-next] samples/bpf: ...
^ permalink raw reply
* Re: Repeatable inet6_dump_fib crash in stock 4.12.0-rc4+
From: Ben Greear @ 2018-04-17 23:29 UTC (permalink / raw)
To: David Ahern, Michal Kubecek; +Cc: Cong Wang, Eric Dumazet, netdev
In-Reply-To: <763bdb6c-bd5f-2398-53ca-6d9dc28c3df6@candelatech.com>
On 01/24/2018 03:59 PM, Ben Greear wrote:
> On 06/20/2017 08:03 PM, David Ahern wrote:
>> On 6/20/17 5:41 PM, Ben Greear wrote:
>>> On 06/20/2017 11:05 AM, Michal Kubecek wrote:
>>>> On Tue, Jun 20, 2017 at 07:12:27AM -0700, Ben Greear wrote:
>>>>> On 06/14/2017 03:25 PM, David Ahern wrote:
>>>>>> On 6/14/17 4:23 PM, Ben Greear wrote:
>>>>>>> On 06/13/2017 07:27 PM, David Ahern wrote:
>>>>>>>
>>>>>>>> Let's try a targeted debug patch. See attached
>>>>>>>
>>>>>>> I had to change it to pr_err so it would go to our serial console
>>>>>>> since the system locked hard on crash,
>>>>>>> and that appears to be enough to change the timing where we can no
>>>>>>> longer
>>>>>>> reproduce the problem.
>>>>>>
>>>>>>
>>>>>> ok, let's figure out which one is doing that. There are 3 debug
>>>>>> statements. I suspect fib6_del_route is the one setting the state to
>>>>>> FWS_U. Can you remove the debug prints in fib6_repair_tree and
>>>>>> fib6_walk_continue and try again?
>>>>>
>>>>> We cannot reproduce with just that one printf in the kernel either. It
>>>>> must change the timing too much to trigger the bug.
>>>>
>>>> You might try trace_printk() which should have less impact (don't forget
>>>> to enable /proc/sys/kernel/ftrace_dump_on_oops).
>>>
>>> We cannot reproduce with trace_printk() either.
>>
>> I think that suggests the walker state is set to FWS_U in
>> fib6_del_route, and it is the FWS_U case in fib6_walk_continue that
>> triggers the fault -- the null parent (pn = fn->parent). So we have the
>> 2 areas of code that are interacting.
>>
>> I'm on a road trip through the end of this week with little time to
>> focus on this problem. I'll get back to you another suggestion when I can.
FYI, problem still happens in 4.16. I'm going to re-enable my hack below
for this kernel as well...I had hopes it might be fixed...
BUG: unable to handle kernel NULL pointer dereference at 8
IP: fib6_walk_continue+0x5b/0x140 [ipv6]
PGD 80000007dfc0c067 P4D 80000007dfc0c067 PUD 7e66ff067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 libcrc32c vrf]
CPU: 3 PID: 15117 Comm: ip Tainted: G O 4.16.0+ #5
Hardware name: Iron_Systems,Inc CS-CAD-2U-A02/X10SRL-F, BIOS 2.0b 05/02/2017
RIP: 0010:fib6_walk_continue+0x5b/0x140 [ipv6]
RSP: 0018:ffffc90008c3bc10 EFLAGS: 00010287
RAX: ffff88085ac45050 RBX: ffff8807e03008a0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffc90008c3bc48 RDI: ffffffff8232b240
RBP: ffff880819167600 R08: 0000000000000008 R09: ffff8807dff10071
R10: ffffc90008c3bbd0 R11: 0000000000000000 R12: ffff8807e03008a0
R13: 0000000000000002 R14: ffff8807e05744c8 R15: ffff8807e08ef000
FS: 00007f2f04342700(0000) GS:ffff88087fcc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 00000007e0556002 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
inet6_dump_fib+0x14b/0x2c0 [ipv6]
netlink_dump+0x216/0x2a0
netlink_recvmsg+0x254/0x400
? copy_msghdr_from_user+0xb5/0x110
___sys_recvmsg+0xe9/0x230
? find_held_lock+0x3b/0xb0
? __handle_mm_fault+0x617/0x1180
? __audit_syscall_entry+0xb3/0x110
? __sys_recvmsg+0x39/0x70
__sys_recvmsg+0x39/0x70
do_syscall_64+0x63/0x120
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7f2f03a72030
RSP: 002b:00007fffab3de508 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 00007fffab3e641c RCX: 00007f2f03a72030
RDX: 0000000000000000 RSI: 00007fffab3de570 RDI: 0000000000000004
RBP: 0000000000000000 R08: 0000000000007e6c R09: 00007fffab3e63a8
R10: 00007fffab3de5b0 R11: 0000000000000246 R12: 00007fffab3e6608
R13: 000000000066b460 R14: 0000000000007e6c R15: 0000000000000000
Code: 85 d2 74 17 f6 40 2a 04 74 11 8b 53 2c 85 d2 0f 84 d7 00 00 00 83 ea 01 89 53 2c c7 4
RIP: fib6_walk_continue+0x5b/0x140 [ipv6] RSP: ffffc90008c3bc10
CR2: 0000000000000008
---[ end trace bd03458864eb266c ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: disabled
Rebooting in 10 seconds..
ACPI MEMORY or I/O RESET_REG.
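
For context, the workaround quoted below amounts to refusing to step upward through a NULL parent. A minimal user-space sketch of that guard, assuming a plain binary tree and hypothetical names (`struct node`, `struct walker`, `walk_up`); it simplifies the real walker considerably and is not a proposed fix:

```c
#include <stddef.h>

struct node {
	struct node *parent;
	struct node *left;
	struct node *right;
};

enum walk_state { WALK_RIGHT, WALK_CURRENT, WALK_DONE };

struct walker {
	struct node *root;
	struct node *node;
	enum walk_state state;
};

/* One upward step. Returns 0 if the walk can continue from the parent,
 * 1 if it is finished or had to stop because the node was detached. */
static int walk_up(struct walker *w)
{
	struct node *fn = w->node;
	struct node *pn;

	if (fn == w->root) {
		w->state = WALK_DONE;
		return 1;
	}

	pn = fn->parent;
	if (!pn) {
		/* Node was unlinked under us: stop rather than dereference NULL. */
		w->state = WALK_DONE;
		return 1;
	}

	w->node = pn;
	if (pn->left == fn) {
		w->state = WALK_RIGHT;   /* left subtree done, visit right next */
		return 0;
	}
	if (pn->right == fn) {
		w->state = WALK_CURRENT; /* both subtrees done, visit the parent */
		return 0;
	}

	/* Parent does not point back at us: tree is inconsistent, give up. */
	w->state = WALK_DONE;
	return 1;
}
```

Like the hack itself, this only hides the symptom; it says nothing about why a walker is left pointing at a detached node.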
>
> So, though I don't know the right way to fix it, the patch below appears
> to make the system not crash.
>
>
> diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
> index 68b9cc7..bf19a14 100644
> --- a/net/ipv6/ip6_fib.c
> +++ b/net/ipv6/ip6_fib.c
> @@ -1614,6 +1614,12 @@ static int fib6_walk_continue(struct fib6_walker *w)
> pn = fn->parent;
> w->node = pn;
> #ifdef CONFIG_IPV6_SUBTREES
> + if (WARN_ON_ONCE(!pn)) {
> + pr_err("FWS-U, w: %p fn: %p pn: %p\n",
> + w, fn, pn);
> + /* Attempt to work around crash that has been here forever. --Ben */
> + return 0;
> + }
> if (FIB6_SUBTREE(pn) == fn) {
> WARN_ON(!(fn->fn_flags & RTN_ROOT));
> w->state = FWS_L;
>
>
>
> The printout looks like this (when adding 4000 mac-vlans, so it is pretty rare). PN is definitely NULL sometimes:
>
> [root@2u-6n ~]# journalctl -f|grep FWS
> Jan 24 15:48:05 2u-6n kernel: IPv6: FWS-U, w: ffff8807ea121ba0 fn: ffff880856a09260 pn: (null)
> Jan 24 15:51:15 2u-6n kernel: IPv6: FWS-U, w: ffff8807e3963de0 fn: ffff880856a09260 pn: (null)
> Jan 24 15:51:15 2u-6n kernel: IPv6: FWS-U, w: ffff88081ac22de0 fn: ffff880856a09260 pn: (null)
> Jan 24 15:53:13 2u-6n kernel: IPv6: FWS-U, w: ffff8808290c69c0 fn: ffff8807e369f920 pn: (null)
> Jan 24 15:53:24 2u-6n kernel: IPv6: FWS-U, w: ffff8807ea3156c0 fn: ffff88082d1eeb60 pn: (null)
>
>
>
> 8066 Jan 24 15:48:04 2u-6n kernel: 8021q: adding VLAN 0 to HW filter on device eth2#1006
> 8067 Jan 24 15:48:05 2u-6n kernel: ------------[ cut here ]------------
> 8068 Jan 24 15:48:05 2u-6n kernel: WARNING: CPU: 5 PID: 3346 at /home/greearb/git/linux-4.13.dev.y/net/ipv6/ip6_fib.c:1617 fib6_walk_continue+0x154/0x1b0 [ipv6]
> 8069 Jan 24 15:48:05 2u-6n kernel: Modules linked in: 8021q garp mrp stp llc fuse macvlan wanlink(O) pktgen ipmi_ssif coretemp intel_rapl sb_edac
> x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm ath9k irqbypass iTCO_wdt ath9k_common iTCO_vendor_support ath9k_hw ath i2c_i801 mac80211 joydev
> lpc_ich cfg80211 ioatdma shpchp tpm_tis tpm_tis_core wmi tpm ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter nfsd auth_rpcgss nfs_acl
> sch_fq_codel lockd grace sunrpc ast drm_kms_helper ttm drm igb hwmon ptp pps_core dca i2c_algo_bit i2c_core ipv6 crc_ccitt
> 8070 Jan 24 15:48:05 2u-6n kernel: CPU: 5 PID: 3346 Comm: ip Tainted: G O 4.13.16+ #22
> 8071 Jan 24 15:48:05 2u-6n kernel: Hardware name: Iron_Systems,Inc CS-CAD-2U-A02/X10SRL-F, BIOS 2.0b 05/02/2017
> 8072 Jan 24 15:48:05 2u-6n kernel: task: ffff8807e9ef1dc0 task.stack: ffffc9002083c000
> 8073 Jan 24 15:48:05 2u-6n kernel: RIP: 0010:fib6_walk_continue+0x154/0x1b0 [ipv6]
> 8074 Jan 24 15:48:05 2u-6n kernel: RSP: 0018:ffffc9002083fbc0 EFLAGS: 00010246
> 8075 Jan 24 15:48:05 2u-6n kernel: RAX: 0000000000000000 RBX: ffff8807ea121ba0 RCX: 0000000000000000
> 8076 Jan 24 15:48:05 2u-6n kernel: RDX: ffff880856a09260 RSI: ffffc9002083fc00 RDI: ffffffff81ef2140
> 8077 Jan 24 15:48:05 2u-6n kernel: RBP: ffffc9002083fbc8 R08: 0000000000000008 R09: ffff8807e36f6b25
> 8078 Jan 24 15:48:05 2u-6n kernel: R10: ffffc9002083fb70 R11: 0000000000000000 R12: 0000000000000002
> 8079 Jan 24 15:48:05 2u-6n kernel: R13: 0000000000000002 R14: ffff8807ea121ba0 R15: ffff8807ebcc8d80
> 8080 Jan 24 15:48:05 2u-6n kernel: FS: 00007f77a5d0f700(0000) GS:ffff88087fd40000(0000) knlGS:0000000000000000
> 8081 Jan 24 15:48:05 2u-6n kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 8082 Jan 24 15:48:05 2u-6n kernel: CR2: 0000000003d56c88 CR3: 00000007f3106000 CR4: 00000000003406e0
> 8083 Jan 24 15:48:05 2u-6n kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> 8084 Jan 24 15:48:05 2u-6n kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> 8085 Jan 24 15:48:05 2u-6n kernel: Call Trace:
> 8086 Jan 24 15:48:05 2u-6n kernel: inet6_dump_fib+0x1ab/0x2a0 [ipv6]
> 8087 Jan 24 15:48:05 2u-6n kernel: netlink_dump+0x22c/0x2b0
> 8088 Jan 24 15:48:05 2u-6n kernel: netlink_recvmsg+0x260/0x3f0
> 8089 Jan 24 15:48:05 2u-6n kernel: sock_recvmsg+0x38/0x40
> 8090 Jan 24 15:48:05 2u-6n kernel: ___sys_recvmsg+0xe9/0x230
> 8091 Jan 24 15:48:05 2u-6n kernel: ? alloc_pages_vma+0x83/0x1e0
> 8092 Jan 24 15:48:05 2u-6n kernel: ? page_add_new_anon_rmap+0x88/0xc0
> 8093 Jan 24 15:48:05 2u-6n kernel: ? lru_cache_add_active_or_unevictable+0x31/0xb0
> 8094 Jan 24 15:48:05 2u-6n kernel: ? __handle_mm_fault+0x5e5/0xfa0
> 8095 Jan 24 15:48:05 2u-6n kernel: __sys_recvmsg+0x3d/0x70
> 8096 Jan 24 15:48:05 2u-6n kernel: ? __sys_recvmsg+0x3d/0x70
> 8097 Jan 24 15:48:05 2u-6n kernel: SyS_recvmsg+0xd/0x20
> 8098 Jan 24 15:48:05 2u-6n kernel: do_syscall_64+0x56/0xc0
> 8099 Jan 24 15:48:05 2u-6n kernel: entry_SYSCALL64_slow_path+0x25/0x25
> 8100 Jan 24 15:48:05 2u-6n kernel: RIP: 0033:0x7f77a5644030
> 8101 Jan 24 15:48:05 2u-6n kernel: RSP: 002b:00007ffc3e783e68 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
> 8102 Jan 24 15:48:05 2u-6n kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f77a5644030
> 8103 Jan 24 15:48:05 2u-6n kernel: RDX: 0000000000000000 RSI: 00007ffc3e783ed0 RDI: 0000000000000004
> 8104 Jan 24 15:48:05 2u-6n kernel: RBP: 00007ffc3e787ef4 R08: 0000000000003fe4 R09: 0
> 8105 Jan 24 15:48:05 2u-6n kernel: R10: 00007ffc3e783f10 R11: 0000000000000246 R12: 000000000064f360
> 8106 Jan 24 15:48:05 2u-6n kernel: R13: 00007ffc3e787f60 R14: 0000000000003fe4 R15: 0000000000000000
> 8107 Jan 24 15:48:05 2u-6n kernel: Code: ff 24 c5 a8 e5 04 a0 f6 42 2a 02 74 68 c7 43 28 01 00 00 00 48 89 c2 e9 c7 fe ff ff c7 43 28 02 00 00 00 48 89
> c2 e9 b8 fe ff ff <0f> ff 31 c9 48 89 de 48 c7 c7 78 36 05 a0 e8 65 e4 14 e1 31 c0
> 8108 Jan 24 15:48:05 2u-6n kernel: ---[ end trace 1d1c7028c9dec459 ]---
> 8109 Jan 24 15:48:05 2u-6n kernel: IPv6: FWS-U, w: ffff8807ea121ba0 fn: ffff880856a09260 pn: (null)
> 8110 Jan 24 15:48:05 2u-6n kernel: 8021q: adding VLAN 0 to HW filter on device eth2#1008
> 8111 Jan 24 15:48:05 2u-6n kernel: 8021q: adding VLAN 0 to HW filter on device eth2#1009
> ....
>
> Thanks,
> Ben
>
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply
* Re: SRIOV switchdev mode BoF minutes
From: Jakub Kicinski @ 2018-04-17 23:19 UTC (permalink / raw)
To: Andy Gospodarek
Cc: Or Gerlitz, Samudrala, Sridhar, David Miller, Anjali Singhai Jain,
Michael Chan, Simon Horman, John Fastabend, Saeed Mahameed,
Jiri Pirko, Rony Efraim, Linux Netdev List
In-Reply-To: <20180417144700.GJ33938@C02RW35GFVH8.dhcp.broadcom.net>
On Tue, 17 Apr 2018 10:47:00 -0400, Andy Gospodarek wrote:
> There is also a school of thought that the VF reps could be
> pre-allocated on the SmartNIC so that any application processing that
> traffic would sit idle when no traffic arrives on the rep, but could
> process frames that do arrive when the VFs were created on the host.
> This implementation will depend on how resources are allocated on a
> given bit of hardware, but can really work well.
+1, if there are no FW resource allocation issues IMHO it's okay to
just show all reprs for "remote PCIes (PFs and VFs)" on the SmartNIC/
controller. The reprs should just show link down as if the PCIe cable
was unplugged until the host actually enables them.
A similar issue exists on multi-host for PFs, right? If one of the
hosts is down do we still show their PF repr? IMHO yes.
That makes the thing look more like a switch with cables being plugged
in and out.
^ permalink raw reply
* Re: [PATCH v2 bpf-next 1/3] bpftool: Support new prog types and attach types
From: Jakub Kicinski @ 2018-04-17 23:09 UTC (permalink / raw)
To: Andrey Ignatov; +Cc: ast, daniel, quentin.monnet, netdev, kernel-team
In-Reply-To: <b5413a9cf18bb0a5a2480346e95101eb3377c0a6.1523985784.git.rdna@fb.com>
On Tue, 17 Apr 2018 10:28:44 -0700, Andrey Ignatov wrote:
> Add recently added prog types to `bpftool prog` and attach types to
> `bpftool cgroup`.
>
> Update bpftool documentation and bash completion appropriately.
>
> Signed-off-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Thank you!!
^ permalink raw reply
* Re: [PATCH bpf-next 08/10] [bpf]: make netronome nfp compatible w/ bpf_xdp_adjust_tail
From: Alexei Starovoitov @ 2018-04-17 23:08 UTC (permalink / raw)
To: Nikita V. Shirokov
Cc: Alexei Starovoitov, Daniel Borkmann, Jakub Kicinski, netdev
In-Reply-To: <20180417065131.3632-9-tehnerd@tehnerd.com>
On Mon, Apr 16, 2018 at 11:51:29PM -0700, Nikita V. Shirokov wrote:
> w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as
> well (only "decrease" of pointer's location is going to be supported).
> changing of this pointer will change packet's size.
> for nfp driver we will just calculate packet's length unconditionally
>
> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
> ---
> drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> index 1eb6549f2a54..d9111c077699 100644
> --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
> @@ -1722,7 +1722,7 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
>
> act = bpf_prog_run_xdp(xdp_prog, &xdp);
>
> - pkt_len -= xdp.data - orig_data;
> + pkt_len = xdp.data_end - xdp.data;
Looks correct, but Jakub please review.
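
A small user-space sketch of why the one-line change matters, assuming a hypothetical `struct model_xdp` with just the two pointers. The old expression only tracks movement of the head, so a tail adjustment goes unnoticed; recomputing from both bounds covers either case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two xdp buffer pointers used in the diff. */
struct model_xdp {
	unsigned char *data;
	unsigned char *data_end;
};

/* Old approach: subtract only how far the head moved. */
static size_t len_head_only(size_t pkt_len, const struct model_xdp *x,
			    const unsigned char *orig_data)
{
	return (size_t)((ptrdiff_t)pkt_len - (x->data - orig_data));
}

/* New approach: derive the length from the current bounds. */
static size_t len_from_bounds(const struct model_xdp *x)
{
	assert(x->data_end >= x->data);
	return (size_t)(x->data_end - x->data);
}
```

With a 64-byte packet whose tail is shrunk by 10 bytes, `len_head_only()` still reports 64 while `len_from_bounds()` reports 54.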
^ permalink raw reply
* Re: [PATCH bpf-next 07/10] [bpf]: make cavium thunder compatible w/ bpf_xdp_adjust_tail
From: Alexei Starovoitov @ 2018-04-17 23:07 UTC (permalink / raw)
To: Nikita V. Shirokov
Cc: Alexei Starovoitov, Daniel Borkmann, Robert Richter,
Sunil Goutham, netdev
In-Reply-To: <20180417065131.3632-8-tehnerd@tehnerd.com>
On Mon, Apr 16, 2018 at 11:51:28PM -0700, Nikita V. Shirokov wrote:
> w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as
> well (only "decrease" of pointer's location is going to be supported).
> changing of this pointer will change packet's size.
> for cavium's thunder driver we will just calculate packet's length
> unconditionally
>
> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
^ permalink raw reply
* Re: [PATCH bpf-next 06/10] [bpf]: make bnxt compatible w/ bpf_xdp_adjust_tail
From: Alexei Starovoitov @ 2018-04-17 23:07 UTC (permalink / raw)
To: Nikita V. Shirokov
Cc: Alexei Starovoitov, Daniel Borkmann, Michael Chan, netdev
In-Reply-To: <20180417065131.3632-7-tehnerd@tehnerd.com>
On Mon, Apr 16, 2018 at 11:51:27PM -0700, Nikita V. Shirokov wrote:
> w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as
> well (only "decrease" of pointer's location is going to be supported).
> changing of this pointer will change packet's size.
> for bnxt driver we will just calculate packet's length unconditionally
>
> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
^ permalink raw reply
* Re: [PATCH bpf-next 05/10] [bpf]: make mlx4 compatible w/ bpf_xdp_adjust_tail
From: Alexei Starovoitov @ 2018-04-17 23:06 UTC (permalink / raw)
To: Nikita V. Shirokov
Cc: Alexei Starovoitov, Daniel Borkmann, Tariq Toukan, netdev
In-Reply-To: <20180417065131.3632-6-tehnerd@tehnerd.com>
On Mon, Apr 16, 2018 at 11:51:26PM -0700, Nikita V. Shirokov wrote:
> w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as
> well (only "decrease" of pointer's location is going to be supported).
> changing of this pointer will change packet's size.
> for mlx4 driver we will just calculate packet's length unconditionally
> (the same way as it's already being done in mlx5)
>
> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
^ permalink raw reply
* Re: [PATCH bpf-next 04/10] [bpf]: make generic xdp compatible w/ bpf_xdp_adjust_tail
From: Alexei Starovoitov @ 2018-04-17 23:06 UTC (permalink / raw)
To: Nikita V. Shirokov
Cc: Alexei Starovoitov, Daniel Borkmann, David S. Miller, netdev
In-Reply-To: <20180417065131.3632-5-tehnerd@tehnerd.com>
On Mon, Apr 16, 2018 at 11:51:25PM -0700, Nikita V. Shirokov wrote:
> w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as
> well (only "decrease" of pointer's location is going to be supported).
> changing of this pointer will change packet's size.
> for generic XDP we need to reflect this packet's length change by
> adjusting skb's tail pointer
>
> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
pls also change the order of the test/sample patches.
they should come last, since they will work only after this one
and all other driver support.
^ permalink raw reply
* Re: [PATCH bpf-next 02/10] [bpf]: adding tests for bpf_xdp_adjust_tail
From: Alexei Starovoitov @ 2018-04-17 23:04 UTC (permalink / raw)
To: Nikita V. Shirokov; +Cc: Alexei Starovoitov, Daniel Borkmann, netdev
In-Reply-To: <20180417065131.3632-3-tehnerd@tehnerd.com>
On Mon, Apr 16, 2018 at 11:51:23PM -0700, Nikita V. Shirokov wrote:
> adding selftests for bpf_xdp_adjust_tail helper. in this synthetic test
> we are testing that 1) if data_end < data helper will return EINVAL
> 2) for normal use case packet's length would be reduced.
>
> aside from adding new tests i'm changing behaviour of bpf_prog_test_run
> so it would recalculate packet's length if only data_end pointer was
> changed
>
> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
> ---
> net/bpf/test_run.c | 3 ++-
> tools/include/uapi/linux/bpf.h | 11 ++++++++-
> tools/testing/selftests/bpf/Makefile | 2 +-
> tools/testing/selftests/bpf/bpf_helpers.h | 3 +++
> tools/testing/selftests/bpf/test_adjust_tail.c | 29 +++++++++++++++++++++++
> tools/testing/selftests/bpf/test_progs.c | 32 ++++++++++++++++++++++++++
> 6 files changed, 77 insertions(+), 3 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/test_adjust_tail.c
>
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 2ced48662c1f..68c3578343b4 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -170,7 +170,8 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
> xdp.rxq = &rxqueue->xdp_rxq;
>
> retval = bpf_test_run(prog, &xdp, repeat, &duration);
> - if (xdp.data != data + XDP_PACKET_HEADROOM + NET_IP_ALIGN)
> + if (xdp.data != data + XDP_PACKET_HEADROOM + NET_IP_ALIGN ||
> + xdp.data_end != xdp.data + size)
please split fixing prog_test_run for adjust_tail into separate patch
and selftests into another one.
> size = xdp.data_end - xdp.data;
> ret = bpf_test_finish(kattr, uattr, xdp.data, size, retval, duration);
> kfree(data);
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 9d07465023a2..9a2d1a04eb24 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -755,6 +755,13 @@ union bpf_attr {
> * @addr: pointer to struct sockaddr to bind socket to
> * @addr_len: length of sockaddr structure
> * Return: 0 on success or negative error code
> + *
> + * int bpf_xdp_adjust_tail(xdp_md, delta)
> + * Adjust the xdp_md.data_end by delta. Only shrinking of packet's
> + * size is supported.
> + * @xdp_md: pointer to xdp_md
> + * @delta: A negative integer to be added to xdp_md.data_end
> + * Return: 0 on success or negative on error
> */
> #define __BPF_FUNC_MAPPER(FN) \
> FN(unspec), \
> @@ -821,7 +828,8 @@ union bpf_attr {
> FN(msg_apply_bytes), \
> FN(msg_cork_bytes), \
> FN(msg_pull_data), \
> - FN(bind),
> + FN(bind), \
> + FN(xdp_adjust_tail),
>
> /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> * function eBPF program intends to call
> @@ -864,6 +872,7 @@ enum bpf_func_id {
> /* BPF_FUNC_skb_set_tunnel_key flags. */
> #define BPF_F_ZERO_CSUM_TX (1ULL << 1)
> #define BPF_F_DONT_FRAGMENT (1ULL << 2)
> +#define BPF_F_SEQ_NUMBER (1ULL << 3)
William Tu missed adding it to tools/include/uapi/bpf.h when it was added
to the main uapi/bpf.h, but don't add it as part of this patch.
I saw a separate patch for this passing by in the tip tree from Arnaldo.
I'm not sure how quickly it will get into Linus's tree, so
let's not create extra merge conflicts.
>
> /* BPF_FUNC_perf_event_output, BPF_FUNC_perf_event_read and
> * BPF_FUNC_perf_event_read_value flags.
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 0a315ddabbf4..3e819dc70bee 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -31,7 +31,7 @@ TEST_GEN_FILES = test_pkt_access.o test_xdp.o test_l4lb.o test_tcp_estats.o test
> sockmap_verdict_prog.o dev_cgroup.o sample_ret0.o test_tracepoint.o \
> test_l4lb_noinline.o test_xdp_noinline.o test_stacktrace_map.o \
> sample_map_ret0.o test_tcpbpf_kern.o test_stacktrace_build_id.o \
> - sockmap_tcp_msg_prog.o connect4_prog.o connect6_prog.o
> + sockmap_tcp_msg_prog.o connect4_prog.o connect6_prog.o test_adjust_tail.o
>
> # Order correspond to 'make run_tests' order
> TEST_PROGS := test_kmod.sh \
> diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
> index d8223d99f96d..50c607014b22 100644
> --- a/tools/testing/selftests/bpf/bpf_helpers.h
> +++ b/tools/testing/selftests/bpf/bpf_helpers.h
> @@ -96,6 +96,9 @@ static int (*bpf_msg_pull_data)(void *ctx, int start, int end, int flags) =
> (void *) BPF_FUNC_msg_pull_data;
> static int (*bpf_bind)(void *ctx, void *addr, int addr_len) =
> (void *) BPF_FUNC_bind;
> +static int (*bpf_xdp_adjust_tail)(void *ctx, int offset) =
> + (void *) BPF_FUNC_xdp_adjust_tail;
> +
>
> /* llvm builtin functions that eBPF C program may use to
> * emit BPF_LD_ABS and BPF_LD_IND instructions
> diff --git a/tools/testing/selftests/bpf/test_adjust_tail.c b/tools/testing/selftests/bpf/test_adjust_tail.c
> new file mode 100644
> index 000000000000..86239e792d6d
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/test_adjust_tail.c
> @@ -0,0 +1,29 @@
> +/* Copyright (c) 2016,2017 Facebook
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of version 2 of the GNU General Public
> + * License as published by the Free Software Foundation.
> + */
Please use SPDX header here and in other patches.
> +#include <linux/bpf.h>
> +#include <linux/if_ether.h>
> +#include "bpf_helpers.h"
> +
> +int _version SEC("version") = 1;
we really should fix libbpf to avoid requiring that for all program types.
It's annoying to see this in every networking test.
^ permalink raw reply
* Re: [PATCH v2 8/8] net: New ax88796 platform driver for Amiga X-Surf 100 Zorro board (m68k)
From: Michael Schmitz @ 2018-04-17 23:00 UTC (permalink / raw)
To: Andrew Lunn; +Cc: netdev, Linux/m68k, Michael Karcher, Michael Karcher
In-Reply-To: <20180417132625.GI2591@lunn.ch>
Hi Andrew,
On Wed, Apr 18, 2018 at 1:26 AM, Andrew Lunn <andrew@lunn.ch> wrote:
> On Tue, Apr 17, 2018 at 02:08:15PM +1200, Michael Schmitz wrote:
>> Add platform device driver to populate the ax88796 platform data from
>> information provided by the XSurf100 zorro device driver.
>> This driver will have to be loaded before loading the ax88796 module,
>> or compiled as built-in.
>>
>> Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
>> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
>> ---
>> drivers/net/ethernet/8390/Kconfig | 14 +-
>> drivers/net/ethernet/8390/Makefile | 1 +
>> drivers/net/ethernet/8390/xsurf100.c | 411 ++++++++++++++++++++++++++++++++++
>> 3 files changed, 425 insertions(+), 1 deletions(-)
>> create mode 100644 drivers/net/ethernet/8390/xsurf100.c
>>
>> diff --git a/drivers/net/ethernet/8390/Kconfig b/drivers/net/ethernet/8390/Kconfig
>> index fdc6734..0cadd45 100644
>> --- a/drivers/net/ethernet/8390/Kconfig
>> +++ b/drivers/net/ethernet/8390/Kconfig
>> @@ -30,7 +30,7 @@ config PCMCIA_AXNET
>>
>> config AX88796
>> tristate "ASIX AX88796 NE2000 clone support"
>> - depends on (ARM || MIPS || SUPERH)
>> + depends on (ARM || MIPS || SUPERH || AMIGA)
>
> Hi Michael
>
> Will it compile on other platforms? If so, it is a good idea to add
> COMPILE_TEST as well.
I suppose it will - nothing in there that wouldn't be portable. Well,
let's find out, shall we?
Cheers,
Michael
>
> Andrew
^ permalink raw reply
* Re: [PATCH bpf-next 01/10] [bpf]: adding bpf_xdp_adjust_tail helper
From: Alexei Starovoitov @ 2018-04-17 22:58 UTC (permalink / raw)
To: Nikita V. Shirokov; +Cc: Alexei Starovoitov, Daniel Borkmann, netdev
In-Reply-To: <20180417065131.3632-2-tehnerd@tehnerd.com>
On Mon, Apr 16, 2018 at 11:51:22PM -0700, Nikita V. Shirokov wrote:
> Adding new bpf helper which would allow us to manipulate
> xdp's data_end pointer, and allow us to reduce packet's size
> indended use case: to generate ICMP messages from XDP context,
> where such message would contain truncated original packet.
>
> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com>
> ---
> include/uapi/linux/bpf.h | 10 +++++++++-
> net/core/filter.c | 29 ++++++++++++++++++++++++++++-
> 2 files changed, 37 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c5ec89732a8d..9a2d1a04eb24 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -755,6 +755,13 @@ union bpf_attr {
> * @addr: pointer to struct sockaddr to bind socket to
> * @addr_len: length of sockaddr structure
> * Return: 0 on success or negative error code
> + *
> + * int bpf_xdp_adjust_tail(xdp_md, delta)
> + * Adjust the xdp_md.data_end by delta. Only shrinking of packet's
> + * size is supported.
> + * @xdp_md: pointer to xdp_md
> + * @delta: A negative integer to be added to xdp_md.data_end
> + * Return: 0 on success or negative on error
> */
> #define __BPF_FUNC_MAPPER(FN) \
> FN(unspec), \
> @@ -821,7 +828,8 @@ union bpf_attr {
> FN(msg_apply_bytes), \
> FN(msg_cork_bytes), \
> FN(msg_pull_data), \
> - FN(bind),
> + FN(bind), \
> + FN(xdp_adjust_tail),
>
> /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> * function eBPF program intends to call
> diff --git a/net/core/filter.c b/net/core/filter.c
> index d31aff93270d..6c8ac7b548d6 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2717,6 +2717,30 @@ static const struct bpf_func_proto bpf_xdp_adjust_head_proto = {
> .arg2_type = ARG_ANYTHING,
> };
>
> +BPF_CALL_2(bpf_xdp_adjust_tail, struct xdp_buff *, xdp, int, offset)
> +{
> + /* only shrinking is allowed for now. */
> + if (unlikely(offset > 0))
> + return -EINVAL;
Why allow offset == 0?
It's a nop. xdp_adjust_head allows it, but that's not a reason
to repeat the same here.
We may decide to do something with offset == 0 in the future,
so let's keep it reserved.
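
To illustrate the stricter check, here is a minimal userspace sketch, not
the patch's code. The struct xdp_buff_model, the function
adjust_tail_model() and the MODEL_ETH_HLEN lower bound are all
hypothetical stand-ins chosen for the example; only the "reject
offset >= 0" rule comes from the comment above.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed minimum frame length to keep after shrinking (illustrative). */
#define MODEL_ETH_HLEN 14

/* Hypothetical stand-in for the kernel's struct xdp_buff. */
struct xdp_buff_model {
	unsigned char *data;
	unsigned char *data_end;
};

/* Shrink the packet tail by -offset bytes; offset == 0 stays reserved. */
static int adjust_tail_model(struct xdp_buff_model *xdp, int offset)
{
	ptrdiff_t len = xdp->data_end - xdp->data;

	/* Only shrinking is allowed: reject growth and the no-op case. */
	if (offset >= 0)
		return -EINVAL;

	/* Assumption for this sketch: never shrink below an Ethernet header. */
	if (len + offset < MODEL_ETH_HLEN)
		return -EINVAL;

	xdp->data_end += offset;
	return 0;
}
```

With a 64-byte buffer, an offset of 0 or 4 is rejected with -EINVAL,
while an offset of -10 succeeds and leaves 54 bytes.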
In the subject please replace
[bpf]: adding bpf_xdp_adjust_tail helper
with
bpf: adding bpf_xdp_adjust_tail helper
"[bpf] foo bar" subject used to be llvm patch convention,
but lately we switched it to kernel style as well with "bpf: foo bar"
^ permalink raw reply
* Re: [PATCH 10/10] net: New ax88796 platform driver for Amiga X-Surf 100 Zorro board (m68k)
From: Michael Schmitz @ 2018-04-17 22:35 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: netdev, Linux/m68k, Michael Karcher, Michael Karcher
In-Reply-To: <CAMuHMdUnonyL93AmF3TdPcUPj5ZEuTb59ZgArH5BjLjcx8LcvA@mail.gmail.com>
Hi Geert,
thanks for your suggestions!
On Wed, Apr 18, 2018 at 1:53 AM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> Hi Michael,
>
> Thanks for your patch!
>
> On Tue, Apr 17, 2018 at 12:04 AM, Michael Schmitz <schmitzmic@gmail.com> wrote:
>> Add platform device driver to populate the ax88796 platform data from
>> information provided by the XSurf100 zorro device driver.
>> This driver will have to be loaded before loading the ax88796 module,
>> or compiled as built-in.
>
> Is that really true? The platform device should be probed when both the
> device and driver have been registered, but order shouldn't matter.
Loading the xsurf100 module will pull in the ax88796 module, so order
does not matter. I'll drop that.
>
>> Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
>
> Missing "From: Michael Karcher ..."?
Fixed the authorship now - probably got mangled when squashing in my
local edits.
>
>> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
>
>> --- a/drivers/net/ethernet/8390/Kconfig
>> +++ b/drivers/net/ethernet/8390/Kconfig
>> @@ -30,7 +30,7 @@ config PCMCIA_AXNET
>>
>> config AX88796
>> tristate "ASIX AX88796 NE2000 clone support"
>> - depends on (ARM || MIPS || SUPERH)
>> + depends on (ARM || MIPS || SUPERH || AMIGA)
>
> s/AMIGA/ZORRO/, for consistency with the below.
Will do.
>
>> select CRC32
>> select PHYLIB
>> select MDIO_BITBANG
>> @@ -45,6 +45,18 @@ config AX88796_93CX6
>> ---help---
>> Select this if your platform comes with an external 93CX6 eeprom.
>>
>> +config XSURF100
>> + tristate "Amiga XSurf 100 AX88796/NE2000 clone support"
>> + depends on ZORRO
>> + depends on AX88796
>
> It's a bit unfortunate the user has to enable _two_ config options to enable
> this driver.
>
> I see two solutions for that:
>
> 1) Hide the XSURF100 symbol, so it gets enabled automatically if AX88796 is
> enabled on a Zorro bus system:
>
> config XSURF100
> tristate
> depends on ZORRO
> default AX88796
>
> 2) Hide the AX88796 symbol, and let it be selected by XSURF100:
>
> config AX88796
> tristate "ASIX AX88796 NE2000 clone support" if !ZORRO
> depends on ARM || MIPS || SUPERH || ZORRO
> ...
>
> config XSURF100
> tristate "Amiga XSurf 100 AX88796/NE2000 clone support"
> depends on ZORRO
> select AX88796
I'll use the latter.
>> --- /dev/null
>> +++ b/drivers/net/ethernet/8390/xsurf100.c
>> @@ -0,0 +1,411 @@
>> +#include <linux/module.h>
>> +#include <linux/netdevice.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/zorro.h>
>> +#include <net/ax88796.h>
>> +#include <asm/amigaints.h>
>> +
>> +#define ZORRO_PROD_INDIVIDUAL_COMPUTERS_X_SURF100 \
>> + ZORRO_ID(INDIVIDUAL_COMPUTERS, 0x64, 0)
>
> Another long define to get rid of? ;-)
>
>> +/* Hard reset the card. This used to pause for the same period that a
>> + * 8390 reset command required, but that shouldn't be necessary.
>> + */
>> +static void ax_reset_8390(struct net_device *dev)
>> +{
>> + struct ei_device *ei_local = netdev_priv(dev);
>> + unsigned long reset_start_time = jiffies;
>> + void __iomem *addr = (void __iomem *)dev->base_addr;
>> +
>> + netif_dbg(ei_local, hw, dev, "resetting the 8390 t=%ld...\n", jiffies);
>> +
>> + ei_outb(ei_inb(addr + NE_RESET), addr + NE_RESET);
>> +
>> + ei_local->txing = 0;
>> + ei_local->dmaing = 0;
>> +
>> + /* This check _should_not_ be necessary, omit eventually. */
>> + while ((ei_inb(addr + EN0_ISR) & ENISR_RESET) == 0) {
>> + if (time_after(jiffies, reset_start_time + 2 * HZ / 100)) {
>> + netdev_warn(dev, "%s: did not complete.\n", __func__);
>> + break;
>> + }
>
> cpu_relax()?
>
> How long does this usually take? If > 1 ms, you can use e.g. msleep(1)
> instead of cpu_relax().
No idea how long this will take - the reset function is lifted
straight out of ax88796.c with no modifications whatsoever.
Come to think of it - it's exported as ei_local->reset_8390 there, so
there is no good reason for even duplicating the code that I can see.
I'll drop it.
>
>> + }
>> +
>> + ei_outb(ENISR_RESET, addr + EN0_ISR); /* Ack intr. */
>> +}
>
>> + if (ei_local->dmaing) {
>> + netdev_err(dev,
>> + "DMAing conflict in %s "
>> + "[DMAstat:%d][irqlock:%d].\n",
>
> Please don't split error messages, as that makes it more difficult to
> grep for them.
Again, found like that in ax88796.c. Will fix here (and eventually in
ax88796.c).
>> + __func__,
>> + ei_local->dmaing, ei_local->irqlock);
>> + return;
>
>> +static int xsurf100_probe(struct zorro_dev *zdev,
>> + const struct zorro_device_id *ent)
>> +{
>
>> + /* error handling for ioremap regs */
>> + if (!ax88796_data.base_regs) {
>> + dev_err(&zdev->dev, "Cannot ioremap area %p (registers)\n",
>> + (void *)zdev->resource.start);
>
> Please use %pR to format struct resource.
> Documentation/core-api/printk-formats.rst
The driver uses ioremap to map two subsections of the mem resource for
two different purposes - control register access, and ring buffer
access. The output of %pR may be misleading here (wrong size), and
even more so below.
>
>> + /* error handling for ioremap data */
>> + if (!ax88796_data.data_area) {
>> + dev_err(&zdev->dev, "Cannot ioremap area %p (32-bit access)\n",
>> + (void *)zdev->resource.start + XS100_8390_DATA32_BASE);
>
> %pR
I've added the offset into the mem resource here to clarify what we've
tried to map.
>
>> +static void xsurf100_remove(struct zorro_dev *zdev)
>> +{
>> + struct platform_device *pdev;
>> + struct xsurf100_ax_plat_data *xs100;
>> +
>> + pdev = zorro_get_drvdata(zdev);
>> + xs100 = dev_get_platdata(&pdev->dev);
>
> struct platform_device *pdev = pdev = zorro_get_drvdata(zdev);
> struct xsurf100_ax_plat_data *xs100 = dev_get_platdata(&pdev->dev);
Of course.
Cheers,
Michael
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds
^ permalink raw reply
* [PATCH net-next] hv_netvsc: Add NetVSP v6 and v6.1 into version negotiation
From: Haiyang Zhang @ 2018-04-17 22:31 UTC (permalink / raw)
To: davem, netdev
Cc: haiyangz, kys, sthemmin, olaf, vkuznets, devel, linux-kernel
From: Haiyang Zhang <haiyangz@microsoft.com>
This patch adds the NetVSP v6 and 6.1 message structures, and includes
these versions into NetVSC/NetVSP version negotiation process.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
drivers/net/hyperv/hyperv_net.h | 164 ++++++++++++++++++++++++++++++++++++++++
drivers/net/hyperv/netvsc.c | 3 +-
2 files changed, 166 insertions(+), 1 deletion(-)
diff --git a/drivers/net/hyperv/hyperv_net.h b/drivers/net/hyperv/hyperv_net.h
index 960f06141472..6ebe39a3dde6 100644
--- a/drivers/net/hyperv/hyperv_net.h
+++ b/drivers/net/hyperv/hyperv_net.h
@@ -237,6 +237,8 @@ void netvsc_switch_datapath(struct net_device *nv_dev, bool vf);
#define NVSP_PROTOCOL_VERSION_2 0x30002
#define NVSP_PROTOCOL_VERSION_4 0x40000
#define NVSP_PROTOCOL_VERSION_5 0x50000
+#define NVSP_PROTOCOL_VERSION_6 0x60000
+#define NVSP_PROTOCOL_VERSION_61 0x60001
enum {
NVSP_MSG_TYPE_NONE = 0,
@@ -308,6 +310,12 @@ enum {
NVSP_MSG5_TYPE_SEND_INDIRECTION_TABLE,
NVSP_MSG5_MAX = NVSP_MSG5_TYPE_SEND_INDIRECTION_TABLE,
+
+ /* Version 6 messages */
+ NVSP_MSG6_TYPE_PD_API,
+ NVSP_MSG6_TYPE_PD_POST_BATCH,
+
+ NVSP_MSG6_MAX = NVSP_MSG6_TYPE_PD_POST_BATCH
};
enum {
@@ -619,12 +627,168 @@ union nvsp_5_message_uber {
struct nvsp_5_send_indirect_table send_table;
} __packed;
+enum nvsp_6_pd_api_op {
+ PD_API_OP_CONFIG = 1,
+ PD_API_OP_SW_DATAPATH, /* Switch Datapath */
+ PD_API_OP_OPEN_PROVIDER,
+ PD_API_OP_CLOSE_PROVIDER,
+ PD_API_OP_CREATE_QUEUE,
+ PD_API_OP_FLUSH_QUEUE,
+ PD_API_OP_FREE_QUEUE,
+ PD_API_OP_ALLOC_COM_BUF, /* Allocate Common Buffer */
+ PD_API_OP_FREE_COM_BUF, /* Free Common Buffer */
+ PD_API_OP_MAX
+};
+
+struct grp_affinity {
+ u64 mask;
+ u16 grp;
+ u16 reserved[3];
+} __packed;
+
+struct nvsp_6_pd_api_req {
+ u32 op;
+
+ union {
+ /* MMIO information is sent from the VM to VSP */
+ struct __packed {
+ u64 mmio_pa; /* MMIO Physical Address */
+ u32 mmio_len;
+
+ /* Number of PD queues a VM can support */
+ u16 num_subchn;
+ } config;
+
+ /* Switch Datapath */
+ struct __packed {
+ /* Host Datapath Is PacketDirect */
+ u8 host_dpath_is_pd;
+
+ /* Guest PacketDirect Is Enabled */
+ u8 guest_pd_enabled;
+ } sw_dpath;
+
+ /* Open Provider*/
+ struct __packed {
+ u32 prov_id; /* Provider id */
+ u32 flag;
+ } open_prov;
+
+ /* Close Provider */
+ struct __packed {
+ u32 prov_id;
+ } cls_prov;
+
+ /* Create Queue*/
+ struct __packed {
+ u32 prov_id;
+ u16 q_id;
+ u16 q_size;
+ u8 is_recv_q;
+ u8 is_rss_q;
+ u32 recv_data_len;
+ struct grp_affinity affy;
+ } cr_q;
+
+ /* Delete Queue*/
+ struct __packed {
+ u32 prov_id;
+ u16 q_id;
+ } del_q;
+
+ /* Flush Queue */
+ struct __packed {
+ u32 prov_id;
+ u16 q_id;
+ } flush_q;
+
+ /* Allocate Common Buffer */
+ struct __packed {
+ u32 len;
+ u32 pf_node; /* Preferred Node */
+ u16 region_id;
+ } alloc_com_buf;
+
+ /* Free Common Buffer */
+ struct __packed {
+ u32 len;
+ u64 pa; /* Physical Address */
+ u32 pf_node; /* Preferred Node */
+ u16 region_id;
+ u8 cache_type;
+ } free_com_buf;
+ } __packed;
+} __packed;
+
+struct nvsp_6_pd_api_comp {
+ u32 op;
+ u32 status;
+
+ union {
+ struct __packed {
+ /* actual number of PD queues allocated to the VM */
+ u16 num_pd_q;
+
+ /* Num Receive Rss PD Queues */
+ u8 num_rss_q;
+
+ u8 is_supported; /* Is supported by VSP */
+ u8 is_enabled; /* Is enabled by VSP */
+ } config;
+
+ /* Open Provider */
+ struct __packed {
+ u32 prov_id;
+ } open_prov;
+
+ /* Create Queue */
+ struct __packed {
+ u32 prov_id;
+ u16 q_id;
+ u16 q_size;
+ u32 recv_data_len;
+ struct grp_affinity affy;
+ } cr_q;
+
+ /* Allocate Common Buffer */
+ struct __packed {
+ u64 pa; /* Physical Address */
+ u32 len;
+ u32 pf_node; /* Preferred Node */
+ u16 region_id;
+ u8 cache_type;
+ } alloc_com_buf;
+ } __packed;
+} __packed;
+
+struct nvsp_6_pd_buf {
+ u32 region_offset;
+ u16 region_id;
+ u16 is_partial:1;
+ u16 reserved:15;
+} __packed;
+
+struct nvsp_6_pd_batch_msg {
+ struct nvsp_message_header hdr;
+ u16 count;
+ u16 guest2host:1;
+ u16 is_recv:1;
+ u16 reserved:14;
+ struct nvsp_6_pd_buf pd_buf[0];
+} __packed;
+
+union nvsp_6_message_uber {
+ struct nvsp_6_pd_api_req pd_req;
+ struct nvsp_6_pd_api_comp pd_comp;
+} __packed;
+
union nvsp_all_messages {
union nvsp_message_init_uber init_msg;
union nvsp_1_message_uber v1_msg;
union nvsp_2_message_uber v2_msg;
union nvsp_4_message_uber v4_msg;
union nvsp_5_message_uber v5_msg;
+ union nvsp_6_message_uber v6_msg;
} __packed;
/* ALL Messages */
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 04f611e6f678..e7308958b7a9 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -525,7 +525,8 @@ static int netvsc_connect_vsp(struct hv_device *device,
struct net_device *ndev = hv_get_drvdata(device);
static const u32 ver_list[] = {
NVSP_PROTOCOL_VERSION_1, NVSP_PROTOCOL_VERSION_2,
- NVSP_PROTOCOL_VERSION_4, NVSP_PROTOCOL_VERSION_5
+ NVSP_PROTOCOL_VERSION_4, NVSP_PROTOCOL_VERSION_5,
+ NVSP_PROTOCOL_VERSION_6, NVSP_PROTOCOL_VERSION_61
};
struct nvsp_message *init_packet;
int ndis_version, i, ret;
--
2.15.1
^ permalink raw reply related
* Re: [PATCH v2 net-next] net: introduce a new tracepoint for tcp_rcv_space_adjust
From: Yafang Shao @ 2018-04-17 21:54 UTC (permalink / raw)
To: Song Liu
Cc: eric.dumazet@gmail.com, davem@davemloft.net, kuznet@ms2.inr.ac.ru,
yoshfuji@linux-ipv6.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <A047B5AE-CA5F-4664-92F7-935E8ED4C0CA@fb.com>
On Wed, Apr 18, 2018 at 1:38 AM, Song Liu <songliubraving@fb.com> wrote:
>
>
>> On Apr 17, 2018, at 9:36 AM, Yafang Shao <laoar.shao@gmail.com> wrote:
>>
>> tcp_rcv_space_adjust is called every time data is copied to user space,
>> introducing a tcp tracepoint for which could show us when the packet is
>> copied to user.
>> This could help us figure out whether there's latency in user process.
>>
>> When a tcp packet arrives, tcp_rcv_established() will be called and with
>> the existed tracepoint tcp_probe we could get the time when this packet
>> arrives.
>> Then this packet will be copied to user, and tcp_rcv_space_adjust will
>> be called and with this new introduced tracepoint we could get the time
>> when this packet is copied to user.
>>
>> arrives time : user process time => latency caused by user
>> tcp_probe tcp_rcv_space_adjust
>>
>> Hence in the printk message, sk_cookie is printed as a key to relate
>> tcp_rcv_space_adjust with tcp_probe.
>>
>> Maybe we could export sockfd in this new tracepoint as well, then we
>> could relate this new tracepoint with epoll/read/recv* tracepoints, and
>> finally that could show us the whole lifespan of this packet. But we
>> could also implement that with pid as these functions are executed in
>> process context.
>>
>> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
>>
>> ---
>> v1 -> v2: use sk_cookie as key suggested by Eric.
>> ---
>> include/trace/events/tcp.h | 33 +++++++++++++++++++++++++++------
>> net/ipv4/tcp_input.c | 2 ++
>> 2 files changed, 29 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
>> index 3dd6802..814f754 100644
>> --- a/include/trace/events/tcp.h
>> +++ b/include/trace/events/tcp.h
>> @@ -10,6 +10,7 @@
>> #include <linux/tracepoint.h>
>> #include <net/ipv6.h>
>> #include <net/tcp.h>
>> +#include <linux/sock_diag.h>
>>
>> #define TP_STORE_V4MAPPED(__entry, saddr, daddr) \
>> do { \
>> @@ -125,6 +126,7 @@
>> __array(__u8, daddr, 4)
>> __array(__u8, saddr_v6, 16)
>> __array(__u8, daddr_v6, 16)
>> + __field(__u64, sock_cookie)
>> ),
>>
>> TP_fast_assign(
>> @@ -144,12 +146,24 @@
>>
>> TP_STORE_ADDRS(__entry, inet->inet_saddr, inet->inet_daddr,
>> sk->sk_v6_rcv_saddr, sk->sk_v6_daddr);
>> +
>> + /*
>> + * sk_cookie is used to identify a socket, with which we could
>> + * relate this tracepoint with other tracepoints,
>> + * i.e. tcp_probe.
>> + * If we needn't this relation, then sk_cookie is useless;
>> + * if we need this relation, then tcp_probe is already set,
>> + * and sk_cookie is already set in tcp_probe, so we could get
>> + * the value directly.
>> + */
>> + __entry->sock_cookie = atomic64_read(&sk->sk_cookie);
>> ),
>>
>> - TP_printk("sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c",
>> + TP_printk("sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c sock_cookie=%llu",
>> __entry->sport, __entry->dport,
>> __entry->saddr, __entry->daddr,
>> - __entry->saddr_v6, __entry->daddr_v6)
>> + __entry->saddr_v6, __entry->daddr_v6,
>> + __entry->sock_cookie)
>> );
>>
>> DEFINE_EVENT(tcp_event_sk, tcp_receive_reset,
>> @@ -166,6 +180,13 @@
>> TP_ARGS(sk)
>> );
>>
>> +DEFINE_EVENT(tcp_event_sk, tcp_rcv_space_adjust,
>> +
>> + TP_PROTO(const struct sock *sk),
>> +
>> + TP_ARGS(sk)
>> +);
>> +
>> TRACE_EVENT(tcp_retransmit_synack,
>>
>> TP_PROTO(const struct sock *sk, const struct request_sock *req),
>> @@ -232,6 +253,7 @@
>> __field(__u32, snd_wnd)
>> __field(__u32, srtt)
>> __field(__u32, rcv_wnd)
>> + __field(__u64, sock_cookie)
>> ),
>>
>> TP_fast_assign(
>> @@ -256,15 +278,14 @@
>> __entry->rcv_wnd = tp->rcv_wnd;
>> __entry->ssthresh = tcp_current_ssthresh(sk);
>> __entry->srtt = tp->srtt_us >> 3;
>> + __entry->sock_cookie = sock_gen_cookie(sk);
>> ),
>>
>> - TP_printk("src=%pISpc dest=%pISpc mark=%#x length=%d snd_nxt=%#x "
>> - "snd_una=%#x snd_cwnd=%u ssthresh=%u snd_wnd=%u srtt=%u "
>> - "rcv_wnd=%u",
>> + TP_printk("src=%pISpc dest=%pISpc mark=%#x length=%d snd_nxt=%#x snd_una=%#x snd_cwnd=%u ssthresh=%u snd_wnd=%u srtt=%u rcv_wnd=%u sock_cookie=%llu",
>> __entry->saddr, __entry->daddr, __entry->mark,
>> __entry->length, __entry->snd_nxt, __entry->snd_una,
>> __entry->snd_cwnd, __entry->ssthresh, __entry->snd_wnd,
>> - __entry->srtt, __entry->rcv_wnd)
>> + __entry->srtt, __entry->rcv_wnd, __entry->sock_cookie)
>> );
>>
>> #endif /* _TRACE_TCP_H */
>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index f93687f..43ad468 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -582,6 +582,8 @@ void tcp_rcv_space_adjust(struct sock *sk)
>> u32 copied;
>> int time;
>>
>> + trace_tcp_rcv_space_adjust(sk);
>> +
>> tcp_mstamp_refresh(tp);
>> time = tcp_stamp_us_delta(tp->tcp_mstamp, tp->rcvq_space.time);
>> if (time < (tp->rcv_rtt_est.rtt_us >> 3) || tp->rcv_rtt_est.rtt_us == 0)
>> --
>> 1.8.3.1
>>
>
> If I understand this correctly, you can get all the information you need with
> a kprobe on tcp_rcv_space_adjust(). Why is it necessary to introduce a new
> tracepoint?
>
A tracepoint is less expensive and more convenient; that is the same
reason why tcp_probe.c was removed and the tcp_probe tracepoint was
introduced.
Thanks
Yafang
^ permalink raw reply
* Re: general protection fault in encode_rpcb_string
From: Trond Myklebust @ 2018-04-17 21:54 UTC (permalink / raw)
To: bfields@fieldses.org,
syzbot+4b98281f2401ab849f4b@syzkaller.appspotmail.com
Cc: syzkaller-bugs@googlegroups.com, anna.schumaker@netapp.com,
davem@davemloft.net, linux-kernel@vger.kernel.org,
linux-nfs@vger.kernel.org, jlayton@kernel.org,
netdev@vger.kernel.org
In-Reply-To: <20180417213308.GC18217@fieldses.org>
On Tue, 2018-04-17 at 17:33 -0400, J. Bruce Fields wrote:
> On Mon, Apr 16, 2018 at 09:02:01PM -0700, syzbot wrote:
> > syzbot hit the following crash on bpf-next commit
> > 5d1365940a68dd57b031b6e3c07d7d451cd69daf (Thu Apr 12 18:09:05 2018
> > +0000)
> > Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
> > syzbot dashboard link:
> > https://syzkaller.appspot.com/bug?extid=4b98281f2401ab849f4b
> >
> > So far this crash happened 2 times on bpf-next.
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?id=6433835633
> > 868800
> > syzkaller reproducer:
> > https://syzkaller.appspot.com/x/repro.syz?id=6407311794896896
> > Raw console output:
> > https://syzkaller.appspot.com/x/log.txt?id=5861511176126464
>
> Based on that, looks like it's attempting an nfs mount while causing
> kmalloc failures?
>
> Probably one of rpcb->r_netid, r_addr, or r_owner was bad in
> rpcb_enc_getaddr.
>
> Hm, and previous log makes it look like it was an
> rpc_sockaddr2uaddr()
> in rpcb_getport_async() that was made to fail. Do we need to check
> for
> failure of:
>
> map->r_addr = rpc_sockaddr2uaddr(sap, GFP_ATOMIC);
>
> ?
Yes, and we can probably convert it, and the other GFP_ATOMIC
allocations in the rpcbind client to use GFP_NOFS in order to improve
reliability.
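
The missing failure check could look roughly like the userspace sketch
below. It is only a model under stated assumptions: struct
rpcb_map_model, make_uaddr_model() and fill_map_model() are hypothetical
names, and make_uaddr_model() merely stands in for an allocating helper
such as rpc_sockaddr2uaddr() that can return NULL.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the rpcbind map; only r_addr is modelled. */
struct rpcb_map_model {
	char *r_addr;
};

/* Stand-in for an allocating helper: returns NULL when allocation fails. */
static char *make_uaddr_model(const char *text, int simulate_failure)
{
	char *copy;

	if (simulate_failure)
		return NULL;
	copy = malloc(strlen(text) + 1);
	if (copy)
		strcpy(copy, text);
	return copy;
}

/* Bail out with -ENOMEM instead of later encoding a NULL string. */
static int fill_map_model(struct rpcb_map_model *map, const char *text,
			  int simulate_failure)
{
	map->r_addr = make_uaddr_model(text, simulate_failure);
	if (!map->r_addr)
		return -ENOMEM;
	return 0;
}
```

The point of the sketch is only the ordering: the result is tested right
after the allocation, so nothing downstream ever sees a NULL r_addr.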
Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammer.space
^ permalink raw reply
* Re: [PATCH v2 3/8] net: ax88796: Do not free IRQ in ax_remove() (already freed in ax_close()).
From: Michael Schmitz @ 2018-04-17 21:53 UTC (permalink / raw)
To: Andrew Lunn
Cc: John Paul Adrian Glaubitz, netdev, Linux/m68k, Michael Karcher,
Michael Karcher
In-Reply-To: <20180417211329.GA18336@lunn.ch>
Hi Andrew,
thanks, that's what I was looking for. The next version will have all
but one patch correctly attributed to Michael Karcher.
Cheers,
Michael
On Wed, Apr 18, 2018 at 9:13 AM, Andrew Lunn <andrew@lunn.ch> wrote:
> On Wed, Apr 18, 2018 at 08:32:25AM +1200, Michael Schmitz wrote:
>> Hi Adrian,
>>
>> On Tue, Apr 17, 2018 at 11:40 PM, John Paul Adrian Glaubitz
>> <glaubitz@physik.fu-berlin.de> wrote:
>> > On 04/17/2018 04:08 AM, Michael Schmitz wrote:
>> >>
>> >> From: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
>> >
>> > This should be:
>> >
>> > From: Michael Karcher <debian@mkarcher.dialup.fu-berlin.de>
>>
>> I haven't found a way to change that in my tree yet, sorry. Unless
>> someone has a simple way to fix patch authorship after a merge, I may
>> have to reimport from scratch.
>
> git commit --am --author=<author>
>
> Andrew
^ permalink raw reply
* Re: [PATCH v2 net-next] net: introduce a new tracepoint for tcp_rcv_space_adjust
From: Yafang Shao @ 2018-04-17 21:53 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, Alexey Kuznetsov, yoshfuji, Song Liu, netdev, LKML
In-Reply-To: <010ea3d3-7925-4718-8aee-c1f6de6cc608@gmail.com>
On Wed, Apr 18, 2018 at 1:27 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
> On 04/17/2018 09:36 AM, Yafang Shao wrote:
>> tcp_rcv_space_adjust is called every time data is copied to user space,
>> introducing a tcp tracepoint for which could show us when the packet is
>> copied to user.
>> This could help us figure out whether there's latency in user process.
>>
>> When a tcp packet arrives, tcp_rcv_established() will be called and with
>> the existed tracepoint tcp_probe we could get the time when this packet
>> arrives.
>> Then this packet will be copied to user, and tcp_rcv_space_adjust will
>> be called and with this new introduced tracepoint we could get the time
>> when this packet is copied to user.
>>
>> arrives time : user process time => latency caused by user
>> tcp_probe tcp_rcv_space_adjust
>
> Sorry, I could not parse these :/
>
Sorry for the poor wording. I will improve it.
I mean that with these two tracepoints we could calculate the latency
incurred when the user process can't process the packet immediately.
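
As a rough sketch of that calculation, assuming the trace output has
already been parsed into (sock_cookie, timestamp) pairs: struct
trace_event and user_latency_us() are hypothetical names made up for
this example, and matching each tcp_rcv_space_adjust event to the
nearest earlier tcp_probe event on the same socket is a simplification,
not an exact per-packet match.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parsed trace record: one tracepoint hit, two fields. */
struct trace_event {
	uint64_t sock_cookie; /* key shared by both tracepoints */
	uint64_t ts_us;       /* timestamp in microseconds */
};

/*
 * For one tcp_rcv_space_adjust event, find the most recent tcp_probe
 * event on the same socket that is not later than it, and return the
 * delay in microseconds. Returns -1 if no such tcp_probe event exists.
 */
static int64_t user_latency_us(const struct trace_event *probe, size_t n_probe,
			       const struct trace_event *adjust)
{
	int64_t best = -1;
	int64_t delta;
	size_t i;

	for (i = 0; i < n_probe; i++) {
		/* Different socket: not the packet we are looking for. */
		if (probe[i].sock_cookie != adjust->sock_cookie)
			continue;
		/* Arrived after the copy to user space: cannot be the cause. */
		if (probe[i].ts_us > adjust->ts_us)
			continue;
		delta = (int64_t)(adjust->ts_us - probe[i].ts_us);
		if (best < 0 || delta < best)
			best = delta;
	}
	return best;
}
```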
>>
>> Hence in the printk message, sk_cookie is printed as a key to relate
>> tcp_rcv_space_adjust with tcp_probe.
>>
>> Maybe we could export sockfd in this new tracepoint as well, then we
>> could relate this new tracepoint with epoll/read/recv* tracepoints, and
>> finally that could show us the whole lifespan of this packet. But we
>> could also implement that with pid as these functions are executed in
>> process context.
>>
>> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
>>
>> ---
>> v1 -> v2: use sk_cookie as key suggested by Eric.
>> ---
>> include/trace/events/tcp.h | 33 +++++++++++++++++++++++++++------
>> net/ipv4/tcp_input.c | 2 ++
>> 2 files changed, 29 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
>> index 3dd6802..814f754 100644
>> --- a/include/trace/events/tcp.h
>> +++ b/include/trace/events/tcp.h
>> @@ -10,6 +10,7 @@
>> #include <linux/tracepoint.h>
>> #include <net/ipv6.h>
>> #include <net/tcp.h>
>> +#include <linux/sock_diag.h>
>>
>> #define TP_STORE_V4MAPPED(__entry, saddr, daddr) \
>> do { \
>> @@ -125,6 +126,7 @@
>> __array(__u8, daddr, 4)
>> __array(__u8, saddr_v6, 16)
>> __array(__u8, daddr_v6, 16)
>> + __field(__u64, sock_cookie)
>> ),
>>
>> TP_fast_assign(
>> @@ -144,12 +146,24 @@
>>
>> TP_STORE_ADDRS(__entry, inet->inet_saddr, inet->inet_daddr,
>> sk->sk_v6_rcv_saddr, sk->sk_v6_daddr);
>> +
>> + /*
>> + * sk_cookie is used to identify a socket, with which we could
>> + * relate this tracepoint with other tracepoints,
>> + * i.e. tcp_probe.
>> + * If we needn't this relation, then sk_cookie is useless;
>> + * if we need this relation, then tcp_probe is already set,
>> + * and sk_cookie is already set in tcp_probe, so we could get
>> + * the value directly.
>> + */
>> + __entry->sock_cookie = atomic64_read(&sk->sk_cookie);
>
> Please scrap this comment and simply use the real thing.
>
> _entry->sock_cookie = sock_gen_cookie(sk);
>
> We build generic events.
>
> Being able to filter many TCP events on one socket cookie will be useful
>
> If you worry about sock_gen_cookie(sk) being too expensive, then we can add an inline helper
> for the fast path (when sk_cookie has been already set)
>
Will improve.
>> ),
>>
>> - TP_printk("sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c",
>> + TP_printk("sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c sock_cookie=%llu",
>
>
> iproute2/ss command uses hexadcimal output for socket cookie. Please use %llx for consistency.
>
OK
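
For reference, a small userspace illustration of the requested format
change. format_cookie() is a hypothetical helper for this example only;
the one thing taken from the review is printing the cookie with %llx
rather than %llu.

```c
#include <stdio.h>

/* Format a socket cookie in hexadecimal, as the review asks for. */
static int format_cookie(char *buf, size_t len, unsigned long long cookie)
{
	return snprintf(buf, len, "sock_cookie=%llx", cookie);
}
```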
^ permalink raw reply
* [PATCH net-next 19/19] r8169: remove jumbo_tx_csum from chip config struct
From: Heiner Kallweit @ 2018-04-17 21:36 UTC (permalink / raw)
To: David Miller, Realtek linux nic maintainers; +Cc: netdev@vger.kernel.org
In-Reply-To: <4049e598-1b6c-bc3e-a905-178b76d7b161@gmail.com>
According to the chip configuration entries only RTL8169 (ver <= 06)
supports tx checksumming for jumbo packets.
By the way: constant JUMBO_1K is a little misleading because it refers
to the standard packet size and not to a jumbo packet size.
By implementing this rule we can get rid of configuring tx checksumming
support per chip type.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
drivers/net/ethernet/realtek/r8169.c | 133 +++++++++++----------------
1 file changed, 54 insertions(+), 79 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 94e91d3c..fcd42d03 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -171,12 +171,11 @@ enum rtl_tx_desc_version {
#define JUMBO_7K (7*1024 - ETH_HLEN - 2)
#define JUMBO_9K (9*1024 - ETH_HLEN - 2)
-#define _R(NAME,TD,FW,SZ,B) { \
+#define _R(NAME,TD,FW,SZ) { \
.name = NAME, \
.txd_version = TD, \
.fw_name = FW, \
.jumbo_max = SZ, \
- .jumbo_tx_csum = B \
}
static const struct {
@@ -184,135 +183,111 @@ static const struct {
enum rtl_tx_desc_version txd_version;
const char *fw_name;
u16 jumbo_max;
- bool jumbo_tx_csum;
} rtl_chip_infos[] = {
/* PCI devices. */
[RTL_GIGA_MAC_VER_01] =
- _R("RTL8169", RTL_TD_0, NULL, JUMBO_7K, true),
+ _R("RTL8169", RTL_TD_0, NULL, JUMBO_7K),
[RTL_GIGA_MAC_VER_02] =
- _R("RTL8169s", RTL_TD_0, NULL, JUMBO_7K, true),
+ _R("RTL8169s", RTL_TD_0, NULL, JUMBO_7K),
[RTL_GIGA_MAC_VER_03] =
- _R("RTL8110s", RTL_TD_0, NULL, JUMBO_7K, true),
+ _R("RTL8110s", RTL_TD_0, NULL, JUMBO_7K),
[RTL_GIGA_MAC_VER_04] =
- _R("RTL8169sb/8110sb", RTL_TD_0, NULL, JUMBO_7K, true),
+ _R("RTL8169sb/8110sb", RTL_TD_0, NULL, JUMBO_7K),
[RTL_GIGA_MAC_VER_05] =
- _R("RTL8169sc/8110sc", RTL_TD_0, NULL, JUMBO_7K, true),
+ _R("RTL8169sc/8110sc", RTL_TD_0, NULL, JUMBO_7K),
[RTL_GIGA_MAC_VER_06] =
- _R("RTL8169sc/8110sc", RTL_TD_0, NULL, JUMBO_7K, true),
+ _R("RTL8169sc/8110sc", RTL_TD_0, NULL, JUMBO_7K),
/* PCI-E devices. */
[RTL_GIGA_MAC_VER_07] =
- _R("RTL8102e", RTL_TD_1, NULL, JUMBO_1K, true),
+ _R("RTL8102e", RTL_TD_1, NULL, JUMBO_1K),
[RTL_GIGA_MAC_VER_08] =
- _R("RTL8102e", RTL_TD_1, NULL, JUMBO_1K, true),
+ _R("RTL8102e", RTL_TD_1, NULL, JUMBO_1K),
[RTL_GIGA_MAC_VER_09] =
- _R("RTL8102e", RTL_TD_1, NULL, JUMBO_1K, true),
+ _R("RTL8102e", RTL_TD_1, NULL, JUMBO_1K),
[RTL_GIGA_MAC_VER_10] =
- _R("RTL8101e", RTL_TD_0, NULL, JUMBO_1K, true),
+ _R("RTL8101e", RTL_TD_0, NULL, JUMBO_1K),
[RTL_GIGA_MAC_VER_11] =
- _R("RTL8168b/8111b", RTL_TD_0, NULL, JUMBO_4K, false),
+ _R("RTL8168b/8111b", RTL_TD_0, NULL, JUMBO_4K),
[RTL_GIGA_MAC_VER_12] =
- _R("RTL8168b/8111b", RTL_TD_0, NULL, JUMBO_4K, false),
+ _R("RTL8168b/8111b", RTL_TD_0, NULL, JUMBO_4K),
[RTL_GIGA_MAC_VER_13] =
- _R("RTL8101e", RTL_TD_0, NULL, JUMBO_1K, true),
+ _R("RTL8101e", RTL_TD_0, NULL, JUMBO_1K),
[RTL_GIGA_MAC_VER_14] =
- _R("RTL8100e", RTL_TD_0, NULL, JUMBO_1K, true),
+ _R("RTL8100e", RTL_TD_0, NULL, JUMBO_1K),
[RTL_GIGA_MAC_VER_15] =
- _R("RTL8100e", RTL_TD_0, NULL, JUMBO_1K, true),
+ _R("RTL8100e", RTL_TD_0, NULL, JUMBO_1K),
[RTL_GIGA_MAC_VER_16] =
- _R("RTL8101e", RTL_TD_0, NULL, JUMBO_1K, true),
+ _R("RTL8101e", RTL_TD_0, NULL, JUMBO_1K),
[RTL_GIGA_MAC_VER_17] =
- _R("RTL8168b/8111b", RTL_TD_0, NULL, JUMBO_4K, false),
+ _R("RTL8168b/8111b", RTL_TD_0, NULL, JUMBO_4K),
[RTL_GIGA_MAC_VER_18] =
- _R("RTL8168cp/8111cp", RTL_TD_1, NULL, JUMBO_6K, false),
+ _R("RTL8168cp/8111cp", RTL_TD_1, NULL, JUMBO_6K),
[RTL_GIGA_MAC_VER_19] =
- _R("RTL8168c/8111c", RTL_TD_1, NULL, JUMBO_6K, false),
+ _R("RTL8168c/8111c", RTL_TD_1, NULL, JUMBO_6K),
[RTL_GIGA_MAC_VER_20] =
- _R("RTL8168c/8111c", RTL_TD_1, NULL, JUMBO_6K, false),
+ _R("RTL8168c/8111c", RTL_TD_1, NULL, JUMBO_6K),
[RTL_GIGA_MAC_VER_21] =
- _R("RTL8168c/8111c", RTL_TD_1, NULL, JUMBO_6K, false),
+ _R("RTL8168c/8111c", RTL_TD_1, NULL, JUMBO_6K),
[RTL_GIGA_MAC_VER_22] =
- _R("RTL8168c/8111c", RTL_TD_1, NULL, JUMBO_6K, false),
+ _R("RTL8168c/8111c", RTL_TD_1, NULL, JUMBO_6K),
[RTL_GIGA_MAC_VER_23] =
- _R("RTL8168cp/8111cp", RTL_TD_1, NULL, JUMBO_6K, false),
+ _R("RTL8168cp/8111cp", RTL_TD_1, NULL, JUMBO_6K),
[RTL_GIGA_MAC_VER_24] =
- _R("RTL8168cp/8111cp", RTL_TD_1, NULL, JUMBO_6K, false),
+ _R("RTL8168cp/8111cp", RTL_TD_1, NULL, JUMBO_6K),
[RTL_GIGA_MAC_VER_25] =
- _R("RTL8168d/8111d", RTL_TD_1, FIRMWARE_8168D_1,
- JUMBO_9K, false),
+ _R("RTL8168d/8111d", RTL_TD_1, FIRMWARE_8168D_1, JUMBO_9K),
[RTL_GIGA_MAC_VER_26] =
- _R("RTL8168d/8111d", RTL_TD_1, FIRMWARE_8168D_2,
- JUMBO_9K, false),
+ _R("RTL8168d/8111d", RTL_TD_1, FIRMWARE_8168D_2, JUMBO_9K),
[RTL_GIGA_MAC_VER_27] =
- _R("RTL8168dp/8111dp", RTL_TD_1, NULL, JUMBO_9K, false),
+ _R("RTL8168dp/8111dp", RTL_TD_1, NULL, JUMBO_9K),
[RTL_GIGA_MAC_VER_28] =
- _R("RTL8168dp/8111dp", RTL_TD_1, NULL, JUMBO_9K, false),
+ _R("RTL8168dp/8111dp", RTL_TD_1, NULL, JUMBO_9K),
[RTL_GIGA_MAC_VER_29] =
- _R("RTL8105e", RTL_TD_1, FIRMWARE_8105E_1,
- JUMBO_1K, true),
+ _R("RTL8105e", RTL_TD_1, FIRMWARE_8105E_1, JUMBO_1K),
[RTL_GIGA_MAC_VER_30] =
- _R("RTL8105e", RTL_TD_1, FIRMWARE_8105E_1,
- JUMBO_1K, true),
+ _R("RTL8105e", RTL_TD_1, FIRMWARE_8105E_1, JUMBO_1K),
[RTL_GIGA_MAC_VER_31] =
- _R("RTL8168dp/8111dp", RTL_TD_1, NULL, JUMBO_9K, false),
+ _R("RTL8168dp/8111dp", RTL_TD_1, NULL, JUMBO_9K),
[RTL_GIGA_MAC_VER_32] =
- _R("RTL8168e/8111e", RTL_TD_1, FIRMWARE_8168E_1,
- JUMBO_9K, false),
+ _R("RTL8168e/8111e", RTL_TD_1, FIRMWARE_8168E_1, JUMBO_9K),
[RTL_GIGA_MAC_VER_33] =
- _R("RTL8168e/8111e", RTL_TD_1, FIRMWARE_8168E_2,
- JUMBO_9K, false),
+ _R("RTL8168e/8111e", RTL_TD_1, FIRMWARE_8168E_2, JUMBO_9K),
[RTL_GIGA_MAC_VER_34] =
- _R("RTL8168evl/8111evl",RTL_TD_1, FIRMWARE_8168E_3,
- JUMBO_9K, false),
+ _R("RTL8168evl/8111evl",RTL_TD_1, FIRMWARE_8168E_3, JUMBO_9K),
[RTL_GIGA_MAC_VER_35] =
- _R("RTL8168f/8111f", RTL_TD_1, FIRMWARE_8168F_1,
- JUMBO_9K, false),
+ _R("RTL8168f/8111f", RTL_TD_1, FIRMWARE_8168F_1, JUMBO_9K),
[RTL_GIGA_MAC_VER_36] =
- _R("RTL8168f/8111f", RTL_TD_1, FIRMWARE_8168F_2,
- JUMBO_9K, false),
+ _R("RTL8168f/8111f", RTL_TD_1, FIRMWARE_8168F_2, JUMBO_9K),
[RTL_GIGA_MAC_VER_37] =
- _R("RTL8402", RTL_TD_1, FIRMWARE_8402_1,
- JUMBO_1K, true),
+ _R("RTL8402", RTL_TD_1, FIRMWARE_8402_1, JUMBO_1K),
[RTL_GIGA_MAC_VER_38] =
- _R("RTL8411", RTL_TD_1, FIRMWARE_8411_1,
- JUMBO_9K, false),
+ _R("RTL8411", RTL_TD_1, FIRMWARE_8411_1, JUMBO_9K),
[RTL_GIGA_MAC_VER_39] =
- _R("RTL8106e", RTL_TD_1, FIRMWARE_8106E_1,
- JUMBO_1K, true),
+ _R("RTL8106e", RTL_TD_1, FIRMWARE_8106E_1, JUMBO_1K),
[RTL_GIGA_MAC_VER_40] =
- _R("RTL8168g/8111g", RTL_TD_1, FIRMWARE_8168G_2,
- JUMBO_9K, false),
+ _R("RTL8168g/8111g", RTL_TD_1, FIRMWARE_8168G_2, JUMBO_9K),
[RTL_GIGA_MAC_VER_41] =
- _R("RTL8168g/8111g", RTL_TD_1, NULL, JUMBO_9K, false),
+ _R("RTL8168g/8111g", RTL_TD_1, NULL, JUMBO_9K),
[RTL_GIGA_MAC_VER_42] =
- _R("RTL8168g/8111g", RTL_TD_1, FIRMWARE_8168G_3,
- JUMBO_9K, false),
+ _R("RTL8168g/8111g", RTL_TD_1, FIRMWARE_8168G_3, JUMBO_9K),
[RTL_GIGA_MAC_VER_43] =
- _R("RTL8106e", RTL_TD_1, FIRMWARE_8106E_2,
- JUMBO_1K, true),
+ _R("RTL8106e", RTL_TD_1, FIRMWARE_8106E_2, JUMBO_1K),
[RTL_GIGA_MAC_VER_44] =
- _R("RTL8411", RTL_TD_1, FIRMWARE_8411_2,
- JUMBO_9K, false),
+ _R("RTL8411", RTL_TD_1, FIRMWARE_8411_2, JUMBO_9K),
[RTL_GIGA_MAC_VER_45] =
- _R("RTL8168h/8111h", RTL_TD_1, FIRMWARE_8168H_1,
- JUMBO_9K, false),
+ _R("RTL8168h/8111h", RTL_TD_1, FIRMWARE_8168H_1, JUMBO_9K),
[RTL_GIGA_MAC_VER_46] =
- _R("RTL8168h/8111h", RTL_TD_1, FIRMWARE_8168H_2,
- JUMBO_9K, false),
+ _R("RTL8168h/8111h", RTL_TD_1, FIRMWARE_8168H_2, JUMBO_9K),
[RTL_GIGA_MAC_VER_47] =
- _R("RTL8107e", RTL_TD_1, FIRMWARE_8107E_1,
- JUMBO_1K, false),
+ _R("RTL8107e", RTL_TD_1, FIRMWARE_8107E_1, JUMBO_1K),
[RTL_GIGA_MAC_VER_48] =
- _R("RTL8107e", RTL_TD_1, FIRMWARE_8107E_2,
- JUMBO_1K, false),
+ _R("RTL8107e", RTL_TD_1, FIRMWARE_8107E_2, JUMBO_1K),
[RTL_GIGA_MAC_VER_49] =
- _R("RTL8168ep/8111ep", RTL_TD_1, NULL,
- JUMBO_9K, false),
+ _R("RTL8168ep/8111ep", RTL_TD_1, NULL, JUMBO_9K),
[RTL_GIGA_MAC_VER_50] =
- _R("RTL8168ep/8111ep", RTL_TD_1, NULL,
- JUMBO_9K, false),
+ _R("RTL8168ep/8111ep", RTL_TD_1, NULL, JUMBO_9K),
[RTL_GIGA_MAC_VER_51] =
- _R("RTL8168ep/8111ep", RTL_TD_1, NULL,
- JUMBO_9K, false),
+ _R("RTL8168ep/8111ep", RTL_TD_1, NULL, JUMBO_9K),
};
#undef _R
@@ -1954,7 +1929,7 @@ static netdev_features_t rtl8169_fix_features(struct net_device *dev,
features &= ~NETIF_F_ALL_TSO;
if (dev->mtu > JUMBO_1K &&
- !rtl_chip_infos[tp->mac_version].jumbo_tx_csum)
+ tp->mac_version > RTL_GIGA_MAC_VER_06)
features &= ~NETIF_F_IP_CSUM;
return features;
@@ -8338,7 +8313,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
netif_info(tp, probe, dev, "jumbo features [frames: %d bytes, "
"tx checksumming: %s]\n",
rtl_chip_infos[chipset].jumbo_max,
- rtl_chip_infos[chipset].jumbo_tx_csum ? "ok" : "ko");
+ tp->mac_version <= RTL_GIGA_MAC_VER_06 ? "ok" : "ko");
}
if (r8168_check_dash(tp))
--
2.17.0
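
The rule the patch encodes can be sketched in isolation as below. This is a
minimal standalone model, not driver code: the enum values and STD_MTU_LIMIT
are hypothetical placeholders (only the ordering of the MAC versions matters,
and 1500 is assumed here as the standard-size limit that JUMBO_1K stands
for), and jumbo_tx_csum_ok() and must_clear_ip_csum() are illustrative names.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the RTL_GIGA_MAC_VER_xx enum; only the
 * relative ordering is relevant to the rule.
 */
enum mac_version {
	MAC_VER_01 = 0,
	MAC_VER_06 = 5,
	MAC_VER_07 = 6,
	MAC_VER_51 = 50,
};

/* Assumed placeholder for JUMBO_1K, i.e. the standard packet size. */
#define STD_MTU_LIMIT 1500

/* Per the commit message: only RTL8169 (ver <= 06) supports tx
 * checksumming for jumbo packets.
 */
static bool jumbo_tx_csum_ok(enum mac_version ver)
{
	return ver <= MAC_VER_06;
}

/* Mirrors the condition in rtl8169_fix_features(): the IP checksum
 * feature is cleared only when the MTU is above the standard size and
 * the chip cannot checksum jumbo packets.
 */
static bool must_clear_ip_csum(unsigned int mtu, enum mac_version ver)
{
	return mtu > STD_MTU_LIMIT && !jumbo_tx_csum_ok(ver);
}
```

Chips whose jumbo_max is JUMBO_1K never take the MTU branch, so in this model
the version comparison only changes the outcome for chips that accept larger
frames.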