Netdev List


* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Heiner Kallweit @ 2019-02-08 21:50 UTC (permalink / raw)
  To: Sander Eikelenboom, Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <059e59c6-2264-fd5c-068f-3656e39539c1@eikelenboom.it>

On 08.02.2019 22:45, Sander Eikelenboom wrote:
> On 08/02/2019 22:22, Heiner Kallweit wrote:
>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>> L.S.,
>>>>>
>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>
>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>
>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>
>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>
>>> Hmm, I did some digging and I think:
>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>
>> You're right. Thought this was added in 4.20 already.
>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>> have onboard Realtek network I have quite a few testers out there.
>> Does the issue occur under specific circumstances like very high load?
> 
> Yep, the box is already quite contended with the Xen VMs, and if I remember correctly it occurred while compiling a kernel
> on the host.
> 
>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>> as author of the underlying changes.
> 
> It could also be that the barriers weren't as unneeded as assumed.

The barriers were removed after adding xmit_more handling. Therefore it would be good to
also test with only
bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
reverted.

> Since we are almost at RC6, I took the liberty of CCing Eric now.
> 
Sure, thanks.

> BTW, am I correct that these patches are merely optimizations?

Yes

> If so, and assuming they revert cleanly, perhaps it should be considered at this point in the RCs
> to revert them for 5.0 and try again for 5.1?
> 
Before reverting both it would be good to test with only the barrier removal reverted.

> --
> Sander
> 
Heiner

> 
>>
>>> would be candidates, which were merged in 5.0.
>>>
>>> I have reverted the first two, see how that works out.
>>>
>>> --
>>> Sander
>>>
>> Heiner
>>
>>>  
>>>>> --
>>>>> Sander
>>>>>
>>>> Heiner
>>>>
>>>>>
>>>>> [ 6466.554866] kernel BUG at lib/dynamic_queue_limits.c:27!
>>>>> [ 6466.571425] invalid opcode: 0000 [#1] SMP NOPTI
>>>>> [ 6466.585890] CPU: 3 PID: 7057 Comm: as Not tainted 5.0.0-rc5-20190208-thp-net-florian-doflr+ #1
>>>>> [ 6466.598693] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>> [ 6466.611579] RIP: e030:dql_completed+0x126/0x140
>>>>> [ 6466.624339] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>> [ 6466.648130] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>> [ 6466.659616] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>> [ 6466.672835] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>> [ 6466.684521] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>> [ 6466.696824] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>> [ 6466.709953] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>> [ 6466.722165] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>> [ 6466.733228] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 6466.746581] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>> [ 6466.758366] Call Trace:
>>>>> [ 6466.768118]  <IRQ>
>>>>> [ 6466.778214]  rtl8169_poll+0x4f4/0x640
>>>>> [ 6466.789198]  net_rx_action+0x23d/0x370
>>>>> [ 6466.798467]  __do_softirq+0xed/0x229
>>>>> [ 6466.807039]  irq_exit+0xb7/0xc0
>>>>> [ 6466.815471]  xen_evtchn_do_upcall+0x27/0x40
>>>>> [ 6466.826647]  xen_do_hypervisor_callback+0x29/0x40
>>>>> [ 6466.835902]  </IRQ>
>>>>> [ 6466.845361] RIP: e030:xen_hypercall_mmu_update+0xa/0x20
>>>>> [ 6466.853390] Code: 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>>>> [ 6466.874031] RSP: e02b:ffffc90003c0bdd0 EFLAGS: 00000246
>>>>> [ 6466.883452] RAX: 0000000000000000 RBX: 000000041f83bfe8 RCX: ffffffff8100102a
>>>>> [ 6466.891986] RDX: deadbeefdeadf00d RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
>>>>> [ 6466.903402] RBP: 0000000000000fe8 R08: 000000000000000b R09: 0000000000000000
>>>>> [ 6466.911201] R10: deadbeefdeadf00d R11: 0000000000000246 R12: 800000050c346067
>>>>> [ 6466.918491] R13: ffff8880607c4fe8 R14: ffff888005082800 R15: 0000000000000000
>>>>> [ 6466.926647]  ? xen_hypercall_mmu_update+0xa/0x20
>>>>> [ 6466.938195]  ? xen_set_pte_at+0x78/0xe0
>>>>> [ 6466.947046]  ? __handle_mm_fault+0xc43/0x1060
>>>>> [ 6466.955772]  ? do_mmap+0x44b/0x5b0
>>>>> [ 6466.964410]  ? handle_mm_fault+0xf8/0x200
>>>>> [ 6466.973290]  ? __do_page_fault+0x231/0x4a0
>>>>> [ 6466.981973]  ? page_fault+0x8/0x30
>>>>> [ 6466.990904]  ? page_fault+0x1e/0x30
>>>>> [ 6466.999585] Modules linked in:
>>>>> [ 6467.007533] ---[ end trace 94bec01608fe4061 ]---
>>>>> [ 6467.016751] RIP: e030:dql_completed+0x126/0x140
>>>>> [ 6467.024271] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>> [ 6467.039726] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>> [ 6467.047243] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>> [ 6467.054202] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>> [ 6467.062000] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>> [ 6467.069664] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>> [ 6467.077715] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>> [ 6467.084916] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>> [ 6467.093352] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 6467.101492] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>> [ 6467.110542] Kernel panic - not syncing: Fatal exception in interrupt
>>>>> [ 6467.118166] Kernel Offset: disabled
>>>>> (XEN) [2019-02-08 18:04:48.854] Hardware Dom0 crashed: rebooting machine in 5 seconds.
>>>>>
>>>>
>>>
>>>
>>
> 
> 



* Re: Resource management for ndo_xdp_xmit (Was: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames)
From: Saeed Mahameed @ 2019-02-08 22:49 UTC (permalink / raw)
  To: toke@redhat.com, brouer@redhat.com
  Cc: hawk@kernel.org, virtualization@lists.linux-foundation.org,
	borkmann@iogearbox.net, Tariq Toukan, john.fastabend@gmail.com,
	jakub.kicinski@netronome.com, mst@redhat.com, dsahern@gmail.com,
	netdev@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
	makita.toshiaki@lab.ntt.co.jp
In-Reply-To: <87o97mp6dp.fsf@toke.dk>

On Fri, 2019-02-08 at 17:55 +0100, Toke Høiland-Jørgensen wrote:
> Saeed Mahameed <saeedm@mellanox.com> writes:
> 
> > But:
> > 2) this won't totally solve our problem, since sometimes the driver can
> > decide to recreate (change of configuration) hw resources on the fly
> > while redirect/devmap is already happening, so we need some kind of a
> > dev_map_notification or a flag with rcu synch, for when the driver wants
> > to make the xdp redirect resources unavailable.
> 
> Good point, I'll make a note of this. Do you have a pointer to where the
> mlx5 driver does this kind of change currently?
> 

example:
ethtool -L to reduce/increase the number of rings
e.g. @mlx5e_ethtool_set_ringparam
or virtually anywhere mlx5e_switch_priv_channels is called when xdp
prog redirect is attached to mlx5.

> -Toke


* Re: [PATCH net-next 00/14] mlxsw: Implement periodic ERP rehash
From: David Miller @ 2019-02-08 23:03 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, mlxsw
In-Reply-To: <20190207112211.10375-1-idosch@mellanox.com>

From: Ido Schimmel <idosch@mellanox.com>
Date: Thu, 7 Feb 2019 11:22:44 +0000

> Currently, an ERP set is created for each region according to rules
> inserted and order of their insertion. However that might lead to
> suboptimal ERP sets and possible unnecessary spillage into C-TCAM.
> This patchset aims to fix this problem and introduces periodical checking
> of used ERP sets and in case a better ERP set is possible for the given
> set of rules, it rehashes the region to use the better ERP set.
 ...

Series applied, I'll push this out after my build tests complete.

Thanks.


* Re: [PATCH] net: hso: do not unregister if not registered
From: David Miller @ 2019-02-08 23:08 UTC (permalink / raw)
  To: tuba; +Cc: netdev
In-Reply-To: <1549413631237.66546@ece.ufl.edu>

From: "Yavuz, Tuba" <tuba@ece.ufl.edu>
Date: Wed, 6 Feb 2019 00:40:31 +0000

> 
> On an error path inside the hso_create_net_device function of the hso
> driver, hso_free_net_device gets called. This causes potentially a
> negative reference count in the net device if register_netdev has not
> been called yet as hso_free_net_device calls unregister_netdev
> regardless. I think the driver should distinguish these cases and call
> unregister_netdev only if register_netdev has been called.
> 
> Signed-off-by: Tuba Yavuz <tuba@ece.ufl.edu>

This patch is corrupted by your email client.
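
The guard proposed in the quoted patch description can be sketched in plain userspace C. This is a minimal model and not the hso driver: `toy_netdev`, `toy_register`, `toy_unregister`, and `toy_free` are hypothetical names, and the integer counter only stands in for the net device's real reference counting.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a net device; not a kernel type. */
struct toy_netdev {
	bool registered;  /* set once registration has succeeded */
	int refcnt;       /* toy reference count */
};

/* Model of registration: takes a reference and marks the device. */
static int toy_register(struct toy_netdev *d)
{
	d->refcnt++;
	d->registered = true;
	return 0;
}

/* Model of unregistration: drops the reference taken at registration. */
static void toy_unregister(struct toy_netdev *d)
{
	d->refcnt--;
	d->registered = false;
}

/*
 * Free path that is safe to call from an error path that runs before
 * registration: it only unregisters a device that was registered.
 */
static void toy_free(struct toy_netdev *d)
{
	if (d->registered)
		toy_unregister(d);
}
```

Calling `toy_free()` on a device that never reached `toy_register()` leaves the counter at zero, whereas an unconditional `toy_unregister()` would drive it negative, which is the failure mode the description mentions.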


* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Eric Dumazet @ 2019-02-08 23:09 UTC (permalink / raw)
  To: Heiner Kallweit, Sander Eikelenboom,
	Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <140d0df7-1775-5457-aa03-b21ece250a72@gmail.com>



On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>> L.S.,
>>>>>>
>>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>
>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>
>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>
>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>
>>>> Hmm, I did some digging and I think:
>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>
>>> You're right. Thought this was added in 4.20 already.
>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>> have onboard Realtek network I have quite a few testers out there.
>>> Does the issue occur under specific circumstances like very high load?
>>
>> Yep, the box is already quite contended with the Xen VMs, and if I remember correctly it occurred while compiling a kernel
>> on the host.
>>
>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>> as author of the underlying changes.
>>
>> It could also be that the barriers weren't as unneeded as assumed.
> 
> The barriers were removed after adding xmit_more handling. Therefore it would be good to
> also test with only
> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
> reverted.
> 
>> Since we are almost at RC6, I took the liberty of CCing Eric now.
>>
> Sure, thanks.
> 
>> BTW, am I correct that these patches are merely optimizations?
> 
> Yes
> 
>> If so, and assuming they revert cleanly, perhaps it should be considered at this point in the RCs
>> to revert them for 5.0 and try again for 5.1?
>>
> Before reverting both it would be good to test with only the barrier removal reverted.
> 

Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
looks buggy to me, since the skb might have been freed already on another cpu when you call
__netdev_sent_queue(dev, skb->len, skb->xmit_more).

You could try :

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
        dma_addr_t mapping;
        u32 opts[2], len;
        bool stop_queue;
+       bool door_bell;
        int frags;
 
        if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
@@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
        /* Force memory writes to complete before releasing descriptor */
        dma_wmb();
 
+       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
+
        txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
 
        /* Force all memory writes to complete before notifying device */
@@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
        if (unlikely(stop_queue))
                netif_stop_queue(dev);
 
-       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
+       if (door_bell) {
                RTL_W8(tp, TxPoll, NPQ);
                mmiowb();
        }
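
The ordering hazard behind the patch above can be illustrated with a toy userspace model. Everything in it is hypothetical (`toy_pkt`, `toy_queue`, and the `toy_*` helpers are not kernel names). It assumes that the completion path may run as soon as the descriptor is published, and that the queue-limit code rejects completing more bytes than were accounted as sent; both are assumptions for the sketch rather than details taken from the r8169 or dql source.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these types exist in the kernel. */
struct toy_pkt {
	unsigned int len;
	bool freed;
};

struct toy_queue {
	unsigned int queued;     /* bytes accounted as sent */
	unsigned int completed;  /* bytes accounted as completed */
	bool invariant_ok;       /* cleared if completion outruns accounting */
};

static void toy_account_sent(struct toy_queue *q, unsigned int bytes)
{
	q->queued += bytes;
}

/*
 * Publishing the descriptor hands the packet to the "device". The model
 * assumes the completion path runs immediately (as it could on another
 * CPU): it accounts the completion and then poisons the packet, which
 * stands in for freeing it.
 */
static void toy_publish(struct toy_queue *q, struct toy_pkt *p)
{
	if (p->len > q->queued - q->completed)
		q->invariant_ok = false;
	else
		q->completed += p->len;
	p->len = 0;
	p->freed = true;
}

/* Order before the patch: publish first, then read p->len for accounting. */
static void toy_xmit_old_order(struct toy_queue *q, struct toy_pkt *p)
{
	toy_publish(q, p);
	toy_account_sent(q, p->len);  /* reads a packet that is already gone */
}

/* Order after the patch: account while the packet is still owned. */
static void toy_xmit_new_order(struct toy_queue *q, struct toy_pkt *p)
{
	toy_account_sent(q, p->len);
	toy_publish(q, p);
}
```

In the old order the model both trips the invariant and accounts a stale length; in the new order the sent bytes are recorded before the packet can be completed, which mirrors why the patch moves the accounting call above the `opts1` write.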




* Re: [PATCH] net: sfp: do not probe SFP module before we're attached
From: David Miller @ 2019-02-08 23:11 UTC (permalink / raw)
  To: rmk+kernel; +Cc: netdev, andrew, f.fainelli, hkallweit1
In-Reply-To: <E1grKog-0004c1-OB@rmk-PC.armlinux.org.uk>

From: Russell King <rmk+kernel@armlinux.org.uk>
Date: Wed, 06 Feb 2019 10:52:30 +0000

> When we probe a SFP module, we expect to be able to call the upstream
> device's module_insert() function so that the upstream link can be
> configured.  However, when the upstream device is delayed, we currently
> may end up probing the module before the upstream device is available,
> and lose the module_insert() call.
> 
> Avoid this by holding off probing the module until the SFP bus is
> properly connected to both the SFP socket driver and the upstream
> driver.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Applied, thanks Russell.

-stable?


* Re: Resource management for ndo_xdp_xmit (Was: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames)
From: Saeed Mahameed @ 2019-02-08 23:17 UTC (permalink / raw)
  To: brouer@redhat.com
  Cc: thoiland@redhat.com, hawk@kernel.org,
	virtualization@lists.linux-foundation.org, borkmann@iogearbox.net,
	Tariq Toukan, toke@toke.dk, john.fastabend@gmail.com,
	mst@redhat.com, jakub.kicinski@netronome.com, dsahern@gmail.com,
	netdev@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
	makita.toshiaki@lab.ntt.co.jp
In-Reply-To: <9e5e6882566ac67276209b35ec112a824b256bff.camel@mellanox.com>

On Thu, 2019-02-07 at 19:08 +0000, Saeed Mahameed wrote:
> On Thu, 2019-02-07 at 08:48 +0100, Jesper Dangaard Brouer wrote:
> > On Wed, 6 Feb 2019 00:06:33 +0000 Saeed Mahameed <saeedm@mellanox.com> wrote:
> > > On Mon, 2019-02-04 at 19:13 -0800, David Ahern wrote:
> > [...]
> > > > mlx5 needs some work. As I recall it still has the bug/panic
> > > > removing xdp programs - at least I don't recall seeing a patch for
> > > > it.
> > > 
> > > Only when doing xdp_redirect to mlx5 and removing the program while
> > > redirect is happening. This is actually due to the lack of a
> > > synchronization mechanism between different drivers. We have some ideas
> > > to overcome this using a standard XDP API, or we could just use a hack
> > > in the mlx5 driver, which I don't like:
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp-redirect-fix&id=a3652d03cc35fd3ad62744986c8ccaca74c9f20c
> > > 
> > > I will be working on this towards the end of this week.
> > 
> > Toke and I have been discussing how to solve this.
> > 
> > The main idea for fixing this is to tie resource allocation to interface
> > insertion into interface maps (kernel/bpf/devmap.c), as the =devmap=
> > already has the needed synchronisation mechanisms and steps for safely
> > adding and removing =net_devices= (e.g. stopping RX side, flushing
> > remaining frames, waiting RCU period before freeing objects, etc.)
> > 
> > As described here:
> >  
> > https://github.com/xdp-project/xdp-project/blob/master/xdp-project.org#better-ndo_xdp_xmit-resource-management
> > 
> > --Jesper
> 
> Yes you already suggested this approach @LPC:
> 
> So 
> 1) on dev_map_update_elem() we will call
> dev->dev->ndo_bpf() to notify the device of the intention to start/stop
> redirect, and wait for it to create/destroy the HW resources
> before/after actually updating the map
> 

silly me, dev_map_update_elem must be atomic, we can't hook driver
resource allocation to it, it must come as a separate request (syscall)
from user space to request to create XDP redirect resources.


> But:
> 2) this won't totally solve our problem, since sometimes the driver can
> decide to recreate (change of configuration) hw resources on the fly
> while redirect/devmap is already happening, so we need some kind of a
> dev_map_notification or a flag with rcu synch, for when the driver wants
> to make the xdp redirect resources unavailable.
> 

I will focus on this problem first, then figure out how to create XDP
redirect resources without actually attaching a dummy xdp program.

> Thanks,
> Saeed.


* Re: [PATCH v2 1/3] net/macb: bindings doc/trivial: fix documentation for sama5d3 10/100 interface
From: David Miller @ 2019-02-08 23:20 UTC (permalink / raw)
  To: nicolas.ferre
  Cc: alexandre.belloni, ludovic.desroches, linux-arm-kernel, robh+dt,
	linux-kernel, netdev, devicetree
In-Reply-To: <20190206175610.26773-1-nicolas.ferre@microchip.com>

From: Nicolas Ferre <nicolas.ferre@microchip.com>
Date: Wed, 6 Feb 2019 18:56:08 +0100

> This removes a line left while adding the correct compatibility string for
> sama5d3 10/100 interface. Now use the "atmel,sama5d3-macb" string.
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
> Reviewed-by: Rob Herring <robh@kernel.org>

Applied to net-next.


* Re: [PATCH v2 2/3] net/macb: bindings doc: add sam9x60 binding
From: David Miller @ 2019-02-08 23:20 UTC (permalink / raw)
  To: nicolas.ferre
  Cc: alexandre.belloni, ludovic.desroches, linux-arm-kernel, robh+dt,
	linux-kernel, netdev, devicetree
In-Reply-To: <20190206175610.26773-2-nicolas.ferre@microchip.com>

From: Nicolas Ferre <nicolas.ferre@microchip.com>
Date: Wed, 6 Feb 2019 18:56:09 +0100

> Add the compatibility string documentation for sam9x60 10/100 interface.
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>

Applied to net-next.


* Re: [PATCH v2 3/3] net: macb: add sam9x60-macb compatibility string
From: David Miller @ 2019-02-08 23:20 UTC (permalink / raw)
  To: nicolas.ferre
  Cc: alexandre.belloni, ludovic.desroches, linux-arm-kernel, robh+dt,
	linux-kernel, netdev, devicetree
In-Reply-To: <20190206175610.26773-3-nicolas.ferre@microchip.com>

From: Nicolas Ferre <nicolas.ferre@microchip.com>
Date: Wed, 6 Feb 2019 18:56:10 +0100

> Add a new compatibility string for this product. It's using
> at91sam9260-macb layout but has a newer hardware revision: it's safer
> to use its own string.
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>

Applied to net-next.


* Re: [PATCH bpf-next v8 4/6] bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c
From: kbuild test robot @ 2019-02-08 23:20 UTC (permalink / raw)
  To: Peter Oskolkov
  Cc: kbuild-all, Alexei Starovoitov, Daniel Borkmann, netdev,
	Peter Oskolkov, David Ahern, Willem de Bruijn, Peter Oskolkov
In-Reply-To: <20190208163849.151626-5-posk@google.com>

[-- Attachment #1: Type: text/plain, Size: 3928 bytes --]

Hi Peter,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Peter-Oskolkov/bpf-add-BPF_LWT_ENCAP_IP-option-to-bpf_lwt_push_encap/20190209-030743
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: x86_64-randconfig-j0-02040958 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   net//core/lwt_bpf.c: In function 'bpf_lwt_xmit_reroute':
>> net//core/lwt_bpf.c:216:10: warning: missing braces around initializer [-Wmissing-braces]
      struct flowi4 fl4 = {0};
             ^
   net//core/lwt_bpf.c:216:10: warning: (near initialization for 'fl4.__fl_common') [-Wmissing-braces]
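
For context, older GCC versions emit this warning when `{0}` is used to initialize a struct whose first member is itself a struct, as the "near initialization for 'fl4.__fl_common'" note indicates. One portable way to avoid it, sketched below with hypothetical toy types rather than the real `struct flowi4`, is to zero the object with `memset()`. Kernel code often uses the empty initializer `{ }` for the same purpose, which is a GNU extension and therefore not shown in the portable sketch.

```c
#include <string.h>

/* Hypothetical nested layout, loosely mirroring a struct whose first
 * member is another struct; these are not the kernel's flow types. */
struct toy_common {
	int oif;
	unsigned int mark;
};

struct toy_flow {
	struct toy_common common;  /* nested first member triggers the warning with {0} */
	unsigned int daddr;
	unsigned int saddr;
};

/* Return a fully zeroed flow key without a brace initializer. */
static struct toy_flow toy_flow_zeroed(void)
{
	struct toy_flow fl;

	memset(&fl, 0, sizeof(fl));
	return fl;
}
```

The caller then assigns individual fields, as the function under test does for `fl4` after its declaration.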

vim +216 net//core/lwt_bpf.c

   184	
   185	static int bpf_lwt_xmit_reroute(struct sk_buff *skb)
   186	{
   187		struct net_device *l3mdev = l3mdev_master_dev_rcu(skb_dst(skb)->dev);
   188		int oif = l3mdev ? l3mdev->ifindex : 0;
   189		struct dst_entry *dst = NULL;
   190		struct sock *sk;
   191		struct net *net;
   192		bool ipv4;
   193		int err;
   194	
   195		if (skb->protocol == htons(ETH_P_IP)) {
   196			ipv4 = true;
   197		} else if (skb->protocol == htons(ETH_P_IPV6)) {
   198			ipv4 = false;
   199		} else {
   200			pr_warn_once("BPF_LWT_REROUTE xmit: unsupported proto %d\n",
   201				     skb->protocol);
   202			return -EINVAL;
   203		}
   204	
   205		sk = sk_to_full_sk(skb->sk);
   206		if (sk) {
   207			if (sk->sk_bound_dev_if)
   208				oif = sk->sk_bound_dev_if;
   209			net = sock_net(sk);
   210		} else {
   211			net = dev_net(skb_dst(skb)->dev);
   212		}
   213	
   214		if (ipv4) {
   215			struct iphdr *iph = ip_hdr(skb);
 > 216			struct flowi4 fl4 = {0};
   217			struct rtable *rt;
   218	
   219			fl4.flowi4_oif = oif;
   220			fl4.flowi4_mark = skb->mark;
   221			fl4.flowi4_uid = sock_net_uid(net, sk);
   222			fl4.flowi4_tos = RT_TOS(iph->tos);
   223			fl4.flowi4_flags = FLOWI_FLAG_ANYSRC;
   224			fl4.flowi4_proto = iph->protocol;
   225			fl4.daddr = iph->daddr;
   226			fl4.saddr = iph->saddr;
   227	
   228			rt = ip_route_output_key(net, &fl4);
   229			if (IS_ERR(rt) || rt->dst.error)
   230				return -EINVAL;
   231			dst = &rt->dst;
   232		} else {
   233	#if IS_BUILTIN(CONFIG_IPV6)
   234			struct ipv6hdr *iph6 = ipv6_hdr(skb);
   235			struct flowi6 fl6 = {0};
   236	
   237			fl6.flowi6_oif = oif;
   238			fl6.flowi6_mark = skb->mark;
   239			fl6.flowi6_uid = sock_net_uid(net, sk);
   240			fl6.flowlabel = ip6_flowinfo(iph6);
   241			fl6.flowi6_proto = iph6->nexthdr;
   242			fl6.daddr = iph6->daddr;
   243			fl6.saddr = iph6->saddr;
   244	
   245			dst = ip6_route_output(net, skb->sk, &fl6);
   246			if (IS_ERR(dst) || dst->error)
   247				return -EINVAL;
   248	#else
   249			pr_warn_once("BPF_LWT_REROUTE xmit: IPV6 not built-in\n");
   250			return -EINVAL;
   251	#endif
   252		}
   253	
   254		/* Although skb header was reserved in bpf_lwt_push_ip_encap(), it
   255		 * was done for the previous dst, so we are doing it here again, in
   256		 * case the new dst needs much more space. The call below is a noop
   257		 * if there is enough header space in skb.
   258		 */
   259		err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));
   260		if (unlikely(err))
   261			return err;
   262	
   263		skb_dst_drop(skb);
   264		skb_dst_set(skb, dst);
   265	
   266		err = dst_output(dev_net(skb_dst(skb)->dev), skb->sk, skb);
   267		if (unlikely(err))
   268			return err;
   269	
   270		/* ip[6]_finish_output2 understand LWTUNNEL_XMIT_DONE */
   271		return LWTUNNEL_XMIT_DONE;
   272	}
   273	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 29197 bytes --]


* Re: [PATCH bpf-next v8 4/6] bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c
From: David Ahern @ 2019-02-08 23:24 UTC (permalink / raw)
  To: Peter Oskolkov, Alexei Starovoitov, Daniel Borkmann, netdev
  Cc: Peter Oskolkov, Willem de Bruijn
In-Reply-To: <20190208163849.151626-5-posk@google.com>

On 2/8/19 8:38 AM, Peter Oskolkov wrote:
> This patch builds on top of the previous patch in the patchset,
> which added BPF_LWT_ENCAP_IP mode to bpf_lwt_push_encap. As the
> encapping can result in the skb needing to go via a different
> interface/route/dst, bpf programs can indicate this by returning
> BPF_LWT_REROUTE, which triggers a new route lookup for the skb.
> 
> v8 changes: fix kbuild errors when LWTUNNEL_BPF is builtin, but
>    IPV6 is a module: as LWTUNNEL_BPF can only be either Y or N,
>    call IPV6 routing functions only if they are built-in.

you need to use the ipv6 stub to access v6 functionality when it is a
module.
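
The stub idea can be illustrated with a small userspace model of an indirection table: built-in code always calls through a pointer that defaults to a "not available" implementation and is repointed when the module providing the real functions loads. All names below (`toy_v6_ops`, `toy_v6_stub`, `toy_reroute_v6`, and so on) are hypothetical and only model the pattern; the kernel's actual stub structure and its members are not reproduced here.

```c
#include <errno.h>

/* Hypothetical operations table; not the kernel's stub structure. */
struct toy_v6_ops {
	int (*route_lookup)(int oif);
};

/* Default used while the "module" is absent: fail cleanly. */
static int toy_v6_lookup_unavailable(int oif)
{
	(void)oif;
	return -EAFNOSUPPORT;
}

static const struct toy_v6_ops toy_v6_default = {
	.route_lookup = toy_v6_lookup_unavailable,
};

/* Built-in code only ever dereferences this pointer. */
static const struct toy_v6_ops *toy_v6_stub = &toy_v6_default;

/* Toy "real" implementation that a module would provide. */
static int toy_v6_real_lookup(int oif)
{
	return oif;
}

static const struct toy_v6_ops toy_v6_real = {
	.route_lookup = toy_v6_real_lookup,
};

/* Module load and unload repoint the stub. */
static void toy_v6_module_load(void)
{
	toy_v6_stub = &toy_v6_real;
}

static void toy_v6_module_unload(void)
{
	toy_v6_stub = &toy_v6_default;
}

/* Built-in caller: no compile-time dependency on the module's symbols. */
static int toy_reroute_v6(int oif)
{
	return toy_v6_stub->route_lookup(oif);
}
```

With this shape the built-in caller links regardless of how the provider is configured, and the IPv6 path works once the module is present instead of being compiled out.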



* Re: [RFC PATCH] perf, bpf: Retain kernel executable code in memory to aid Intel PT tracing
From: Alexei Starovoitov @ 2019-02-08 23:29 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Peter Zijlstra, Andi Kleen, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, Song Liu, Daniel Borkmann,
	Alexei Starovoitov, linux-kernel, netdev
In-Reply-To: <20190207111901.2399-1-adrian.hunter@intel.com>

On Thu, Feb 07, 2019 at 01:19:01PM +0200, Adrian Hunter wrote:
> Subject to memory pressure and other limits, retain executable code, such
> as JIT-compiled bpf, in memory instead of freeing it immediately it is no
> longer needed for execution.
> 
> While perf is primarily aimed at statistical analysis, tools like Intel
> PT can aim to provide a trace of exactly what happened. As such, corner
> cases that can be overlooked statistically need to be addressed. For
> example, there is a gap where JIT-compiled bpf can be freed from memory
> before a tracer has a chance to read it out through the bpf syscall.
> While that can be ignored statistically, it contributes to a death by
> 1000 cuts for tracers attempting to assemble exactly what happened. This is
> a bit gratuitous given that retaining the executable code is relatively
> simple, and the amount of memory involved relatively small. The retained
> executable code is then available in memory images such as /proc/kcore.
> 
> This facility could perhaps be extended also to init sections.
> 
> Note that this patch is compile tested only and, at present, is missing
> the ability to retain symbols.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  arch/x86/Kconfig.cpu       |   1 +
>  include/linux/filter.h     |   4 +
>  include/linux/xc_retain.h  |  49 ++++++++++
>  init/Kconfig               |   6 ++
>  kernel/Makefile            |   1 +
>  kernel/bpf/core.c          |  44 ++++++++-
>  kernel/xc_retain.c         | 183 +++++++++++++++++++++++++++++++++++++
>  net/core/sysctl_net_core.c |  62 +++++++++++++
>  8 files changed, 349 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/xc_retain.h
>  create mode 100644 kernel/xc_retain.c
> 
> diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
> index 6adce15268bd..21dcd064c272 100644
> --- a/arch/x86/Kconfig.cpu
> +++ b/arch/x86/Kconfig.cpu
> @@ -389,6 +389,7 @@ menuconfig PROCESSOR_SELECT
>  config CPU_SUP_INTEL
>  	default y
>  	bool "Support Intel processors" if PROCESSOR_SELECT
> +	select XC_RETAIN if PERF_EVENTS && BPF_JIT
>  	---help---
>  	  This enables detection, tunings and quirks for Intel processors
>  
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index d531d4250bff..40b9f601e18f 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -851,6 +851,10 @@ extern int bpf_jit_enable;
>  extern int bpf_jit_harden;
>  extern int bpf_jit_kallsyms;
>  extern long bpf_jit_limit;
> +extern unsigned int bpf_jit_retain_min;
> +extern unsigned int bpf_jit_retain_max;
> +
> +void bpf_jit_retain_update_sz(void);
>  
>  typedef void (*bpf_jit_fill_hole_t)(void *area, unsigned int size);
>  
> diff --git a/include/linux/xc_retain.h b/include/linux/xc_retain.h
> new file mode 100644
> index 000000000000..e79dc138bab8
> --- /dev/null
> +++ b/include/linux/xc_retain.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2019 Intel Corporation.
> + */
> +#ifndef _LINUX_XC_RETAIN_H
> +#define _LINUX_XC_RETAIN_H
> +
> +#include <linux/list.h>
> +#include <linux/shrinker.h>
> +#include <linux/spinlock.h>
> +
> +struct xc_retain_ops {
> +	void (*free)(void *addr);
> +};
> +
> +struct xc_retain {
> +	struct list_head list;
> +	struct list_head items;
> +	const struct xc_retain_ops ops;
> +	unsigned int min_pages;
> +	unsigned int max_pages;
> +	unsigned int current_pages;
> +	unsigned int item_cnt;
> +	spinlock_t lock;
> +	struct shrinker shrinker;
> +};
> +
> +#ifdef CONFIG_XC_RETAIN
> +int xc_retain_register(struct xc_retain *xr);
> +void xc_retain_binary(struct xc_retain *xr, void *addr, unsigned int pages);
> +void xc_retain_set_min_pages(struct xc_retain *xr, unsigned int min_pages);
> +void xc_retain_set_max_pages(struct xc_retain *xr, unsigned int max_pages);
> +#else
> +static inline int xc_retain_register(struct xc_retain *xr)
> +{
> +	return 0;
> +}
> +static inline void xc_retain_binary(struct xc_retain *xr, void *addr,
> +				    unsigned int pages)
> +{
> +	xr->ops.free(addr);
> +}
> +static inline void xc_retain_set_max_pages(struct xc_retain *xr,
> +					   unsigned int max_pages)
> +{
> +}
> +#endif
> +
> +#endif
> diff --git a/init/Kconfig b/init/Kconfig
> index c9386a365eea..954c288cabdc 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1550,6 +1550,12 @@ config EMBEDDED
>  	  an embedded system so certain expert options are available
>  	  for configuration.
>  
> +config XC_RETAIN
> +	bool
> +	help
> +	  Retain kernel executable code (e.g. jitted BPF) in memory after it
> +	  would normally be freed.
> +
>  config HAVE_PERF_EVENTS
>  	bool
>  	help
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 6aa7543bcdb2..5df40e2a934e 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -98,6 +98,7 @@ obj-$(CONFIG_TRACEPOINTS) += trace/
>  obj-$(CONFIG_IRQ_WORK) += irq_work.o
>  obj-$(CONFIG_CPU_PM) += cpu_pm.o
>  obj-$(CONFIG_BPF) += bpf/
> +obj-$(CONFIG_XC_RETAIN) += xc_retain.o
>  
>  obj-$(CONFIG_PERF_EVENTS) += events/
>  
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 19c49313c709..7fd235d235c2 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -34,6 +34,7 @@
>  #include <linux/kallsyms.h>
>  #include <linux/rcupdate.h>
>  #include <linux/perf_event.h>
> +#include <linux/xc_retain.h>
>  
>  #include <asm/unaligned.h>
>  
> @@ -480,6 +481,10 @@ int bpf_jit_enable   __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_ALWAYS_ON);
>  int bpf_jit_harden   __read_mostly;
>  int bpf_jit_kallsyms __read_mostly;
>  long bpf_jit_limit   __read_mostly;
> +#define BPF_JIT_RETAIN_MIN 0
> +#define BPF_JIT_RETAIN_MAX 16
> +unsigned int bpf_jit_retain_min __read_mostly = BPF_JIT_RETAIN_MIN;
> +unsigned int bpf_jit_retain_max __read_mostly = BPF_JIT_RETAIN_MAX;
>  
>  static __always_inline void
>  bpf_get_prog_addr_region(const struct bpf_prog *prog,
> @@ -795,6 +800,43 @@ void bpf_jit_binary_free(struct bpf_binary_header *hdr)
>  	bpf_jit_uncharge_modmem(pages);
>  }
>  
> +#ifdef CONFIG_XC_RETAIN
> +static struct xc_retain bpf_jit_retain = {
> +	.min_pages = BPF_JIT_RETAIN_MIN,
> +	.max_pages = BPF_JIT_RETAIN_MAX,
> +	.ops = {
> +		.free = module_memfree,
> +	},
> +};
> +
> +void bpf_jit_retain_update_sz(void)
> +{
> +	xc_retain_set_min_pages(&bpf_jit_retain, bpf_jit_retain_min);
> +	xc_retain_set_max_pages(&bpf_jit_retain, bpf_jit_retain_max);
> +}
> +
> +static int __init bpf_jit_retain_init(void)
> +{
> +	return xc_retain_register(&bpf_jit_retain);
> +}
> +subsys_initcall(bpf_jit_retain_init);
> +
> +static void bpf_jit_binary_retain(struct bpf_prog *fp,
> +				  struct bpf_binary_header *hdr)
> +{
> +	u32 pages = hdr->pages;
> +
> +	xc_retain_binary(&bpf_jit_retain, hdr, pages);
> +	bpf_jit_uncharge_modmem(pages);
> +}
> +#else
> +static void bpf_jit_binary_retain(struct bpf_prog *fp,
> +				  struct bpf_binary_header *hdr)
> +{
> +	return bpf_jit_binary_free(hdr);
> +}
> +#endif

I'm strongly against this approach.

I understand that it's under CONFIG, but turning the kernel
into a garbage collection nightmare, even under a config option
or sysctl, is not an option.
In many cases bpf progs are loaded/unloaded a lot.
Consider a CI test system that runs tests 24/7:
bpf progs are loaded/unloaded in huge numbers.
Such a system will suffer non-deterministic test and
performance results due to shrinkers.
perf analysis with PT becomes inaccurate and the main goal
of retaining accurate instruction info is not achieved.
The bpf_jit_retain_min/max tunables are not an option either.
Please see how perf record handles bpf prog load/unload.
What stops you from doing the same for PT?


^ permalink raw reply

* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Sander Eikelenboom @ 2019-02-08 23:34 UTC (permalink / raw)
  To: Heiner Kallweit, Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <140d0df7-1775-5457-aa03-b21ece250a72@gmail.com>

On 08/02/2019 22:50, Heiner Kallweit wrote:
> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>> L.S.,
>>>>>>
>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen i the nasty splat below, 
>>>>>> that I haven encountered with Linux 4.20.x.
>>>>>>
>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>
>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>
>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>
>>>> Hmm i did some diging and i think:
>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>
>>> You're right. Thought this was added in 4.20 already.
>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>> have onboard Realtek network I have quite a few testers out there.
>>> Does the issue occur under specific circumstances like very high load?
>>
>> Yep, the box is already quite contented with the Xen VM's and if I remember correctly it occurred while kernel compiling
>> on the host.
>>
>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>> as author of the underlying changes.
>>
>> It could also be the barriers weren't that unneeded as assumed.
> 
> The barriers were removed after adding xmit_more handling. Therefore it would be good to
> test also with only 
> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
> removed.

*arghh* *grmbl*

with both:
    bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3
    and
    2e6eedb4813e34d8d84ac0eb3afb668966f3f356 
reverted I get yet another splat:

[ 3769.246083] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[ 3769.246095] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
[ 3769.246096] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[ 3769.246098] Call Trace:
[ 3769.246104]  <IRQ>
[ 3769.246114]  dump_stack+0x5c/0x7b
[ 3769.246120]  warn_alloc+0x103/0x190
[ 3769.246122]  __alloc_pages_nodemask+0xe3d/0xe80
[ 3769.246128]  ? inet_gro_receive+0x232/0x2c0
[ 3769.246130]  page_frag_alloc+0x117/0x150
[ 3769.246132]  __napi_alloc_skb+0x83/0xd0
[ 3769.246137]  rtl8169_poll+0x210/0x640
[ 3769.246140]  net_rx_action+0x23d/0x370
[ 3769.246145]  __do_softirq+0xed/0x229
[ 3769.246149]  irq_exit+0xb7/0xc0
[ 3769.246152]  xen_evtchn_do_upcall+0x27/0x40
[ 3769.246154]  xen_do_hypervisor_callback+0x29/0x40
[ 3769.246155]  </IRQ>
[ 3769.246161] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
[ 3769.246163] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
[ 3769.246164] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
[ 3769.246166] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
[ 3769.246167] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
[ 3769.246167] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
[ 3769.246168] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
[ 3769.246169] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
[ 3769.246173]  _raw_spin_lock+0x16/0x20
[ 3769.246176]  list_lru_add+0x59/0x170
[ 3769.246179]  inode_lru_list_add+0x1b/0x40
[ 3769.246182]  iput+0x18b/0x1a0
[ 3769.246184]  __dentry_kill+0xc5/0x170
[ 3769.246186]  shrink_dentry_list+0x93/0x1c0
[ 3769.246187]  prune_dcache_sb+0x4d/0x70
[ 3769.246191]  super_cache_scan+0x104/0x190
[ 3769.246194]  do_shrink_slab+0x12c/0x1e0
[ 3769.246196]  shrink_slab+0xdf/0x2b0
[ 3769.246198]  shrink_node+0x158/0x470
[ 3769.246200]  do_try_to_free_pages+0xd1/0x380
[ 3769.246202]  try_to_free_pages+0xb2/0xe0
[ 3769.246204]  __alloc_pages_nodemask+0x603/0xe80
[ 3769.246207]  ? xas_load+0x9/0x80
[ 3769.246209]  ? find_get_entry+0x58/0x120
[ 3769.246210]  pagecache_get_page+0xde/0x210
[ 3769.246213]  grab_cache_page_write_begin+0x17/0x30
[ 3769.246215]  ext4_da_write_begin+0xc4/0x340
[ 3769.246217]  generic_perform_write+0xb8/0x1b0
[ 3769.246219]  __generic_file_write_iter+0x13c/0x1b0
[ 3769.246223]  ext4_file_write_iter+0x121/0x3c0
[ 3769.246225]  __vfs_write+0x123/0x1a0
[ 3769.246226]  vfs_write+0xab/0x1a0
[ 3769.246229]  ksys_write+0x4d/0xc0
[ 3769.246232]  do_syscall_64+0x49/0x100
[ 3769.246234]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3769.246237] RIP: 0033:0x7fee5b265730
[ 3769.246238] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
[ 3769.246239] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3769.246240] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
[ 3769.246241] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
[ 3769.246241] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
[ 3769.246242] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
[ 3769.246243] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
[ 3769.246244] Mem-Info:
[ 3769.246249] active_anon:152383 inactive_anon:99216 isolated_anon:0
                active_file:51569 inactive_file:85922 isolated_file:0
                unevictable:552 dirty:6866 writeback:0 unstable:0
                slab_reclaimable:6707 slab_unreclaimable:16166
                mapped:1870 shmem:6 pagetables:2716 bounce:0
                free:3639 free_pcp:900 free_cma:0
[ 3769.246252] Node 0 active_anon:609532kB inactive_anon:396864kB active_file:206276kB inactive_file:343688kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:7480kB dirty:27464kB writeback:0kB shmem:24kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 3769.246253] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:8056kB inactive_anon:0kB active_file:92kB inactive_file:148kB unevictable:0kB writepending:8kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:20kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 3769.246256] lowmem_reserve[]: 0 1865 1865 1865
[ 3769.246258] Node 0 DMA32 free:7076kB min:19472kB low:21380kB high:23288kB active_anon:601840kB inactive_anon:396512kB active_file:206216kB inactive_file:343644kB unevictable:2208kB writepending:27256kB present:2080768kB managed:1833792kB mlocked:2208kB kernel_stack:9392kB pagetables:10844kB bounce:0kB free_pcp:3600kB local_pcp:596kB free_cma:0kB
[ 3769.246260] lowmem_reserve[]: 0 0 0 0
[ 3769.246262] Node 0 DMA: 6*4kB (UE) 4*8kB (UME) 4*16kB (UME) 2*32kB (UE) 6*64kB (UE) 2*128kB (UM) 4*256kB (UME) 3*512kB (UME) 2*1024kB (ME) 1*2048kB (M) 0*4096kB = 7480kB
[ 3769.246267] Node 0 DMA32: 66*4kB (UM) 271*8kB (UME) 218*16kB (UME) 45*32kB (UME) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7360kB
[ 3769.246272] 144878 total pagecache pages
[ 3769.246276] 6812 pages in swap cache
[ 3769.246277] Swap cache stats: add 62616, delete 55806, find 31/55
[ 3769.246278] Free swap  = 3943164kB
[ 3769.246278] Total swap = 4194300kB
[ 3769.246279] 524181 pages RAM
[ 3769.246279] 0 pages HighMem/MovableOnly
[ 3769.246280] 61765 pages reserved
[ 3769.246280] 0 pages cma reserved
[ 3769.246284] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[ 3769.246286] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
[ 3769.246287] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[ 3769.246287] Call Trace:
[ 3769.246288]  <IRQ>
[ 3769.246290]  dump_stack+0x5c/0x7b
[ 3769.246291]  warn_alloc+0x103/0x190
[ 3769.246293]  __alloc_pages_nodemask+0xe3d/0xe80
[ 3769.246294]  ? inet_gro_receive+0x232/0x2c0
[ 3769.246296]  page_frag_alloc+0x117/0x150
[ 3769.246297]  __napi_alloc_skb+0x83/0xd0
[ 3769.246299]  rtl8169_poll+0x210/0x640
[ 3769.246300]  net_rx_action+0x23d/0x370
[ 3769.246302]  __do_softirq+0xed/0x229
[ 3769.246304]  irq_exit+0xb7/0xc0
[ 3769.246305]  xen_evtchn_do_upcall+0x27/0x40
[ 3769.246306]  xen_do_hypervisor_callback+0x29/0x40
[ 3769.246307]  </IRQ>
[ 3769.246308] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
[ 3769.246310] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
[ 3769.246310] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
[ 3769.246311] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
[ 3769.246312] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
[ 3769.246313] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
[ 3769.246313] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
[ 3769.246314] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
[ 3769.246316]  _raw_spin_lock+0x16/0x20
[ 3769.246317]  list_lru_add+0x59/0x170
[ 3769.246318]  inode_lru_list_add+0x1b/0x40
[ 3769.246320]  iput+0x18b/0x1a0
[ 3769.246321]  __dentry_kill+0xc5/0x170
[ 3769.246322]  shrink_dentry_list+0x93/0x1c0
[ 3769.246323]  prune_dcache_sb+0x4d/0x70
[ 3769.246325]  super_cache_scan+0x104/0x190
[ 3769.246326]  do_shrink_slab+0x12c/0x1e0
[ 3769.246328]  shrink_slab+0xdf/0x2b0
[ 3769.246329]  shrink_node+0x158/0x470
[ 3769.246331]  do_try_to_free_pages+0xd1/0x380
[ 3769.246333]  try_to_free_pages+0xb2/0xe0
[ 3769.246334]  __alloc_pages_nodemask+0x603/0xe80
[ 3769.246336]  ? xas_load+0x9/0x80
[ 3769.246337]  ? find_get_entry+0x58/0x120
[ 3769.246338]  pagecache_get_page+0xde/0x210
[ 3769.246340]  grab_cache_page_write_begin+0x17/0x30
[ 3769.246341]  ext4_da_write_begin+0xc4/0x340
[ 3769.246342]  generic_perform_write+0xb8/0x1b0
[ 3769.246344]  __generic_file_write_iter+0x13c/0x1b0
[ 3769.246345]  ext4_file_write_iter+0x121/0x3c0
[ 3769.246347]  __vfs_write+0x123/0x1a0
[ 3769.246348]  vfs_write+0xab/0x1a0
[ 3769.246349]  ksys_write+0x4d/0xc0
[ 3769.246350]  do_syscall_64+0x49/0x100
[ 3769.246352]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3769.246353] RIP: 0033:0x7fee5b265730
[ 3769.246354] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
[ 3769.246354] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3769.246355] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
[ 3769.246356] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
[ 3769.246357] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
[ 3769.246357] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
[ 3769.246358] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
[ 3769.246364] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[ 3769.246366] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
[ 3769.246366] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[ 3769.246366] Call Trace:
[ 3769.246367]  <IRQ>
[ 3769.246368]  dump_stack+0x5c/0x7b
[ 3769.246370]  warn_alloc+0x103/0x190
[ 3769.246371]  __alloc_pages_nodemask+0xe3d/0xe80
[ 3769.246373]  ? inet_gro_receive+0x232/0x2c0
[ 3769.246374]  page_frag_alloc+0x117/0x150
[ 3769.246375]  __napi_alloc_skb+0x83/0xd0
[ 3769.246376]  rtl8169_poll+0x210/0x640
[ 3769.246378]  net_rx_action+0x23d/0x370
[ 3769.246379]  __do_softirq+0xed/0x229
[ 3769.246381]  irq_exit+0xb7/0xc0
[ 3769.246382]  xen_evtchn_do_upcall+0x27/0x40
[ 3769.246383]  xen_do_hypervisor_callback+0x29/0x40
[ 3769.246383]  </IRQ>
[ 3769.246385] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
[ 3769.246386] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
[ 3769.246387] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
[ 3769.246388] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
[ 3769.246388] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
[ 3769.246389] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
[ 3769.246390] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
[ 3769.246390] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
[ 3769.246392]  _raw_spin_lock+0x16/0x20
[ 3769.246393]  list_lru_add+0x59/0x170
[ 3769.246395]  inode_lru_list_add+0x1b/0x40
[ 3769.246396]  iput+0x18b/0x1a0
[ 3769.246397]  __dentry_kill+0xc5/0x170
[ 3769.246398]  shrink_dentry_list+0x93/0x1c0
[ 3769.246399]  prune_dcache_sb+0x4d/0x70
[ 3769.246401]  super_cache_scan+0x104/0x190
[ 3769.246402]  do_shrink_slab+0x12c/0x1e0
[ 3769.246404]  shrink_slab+0xdf/0x2b0
[ 3769.246405]  shrink_node+0x158/0x470
[ 3769.246407]  do_try_to_free_pages+0xd1/0x380
[ 3769.246408]  try_to_free_pages+0xb2/0xe0
[ 3769.246410]  __alloc_pages_nodemask+0x603/0xe80
[ 3769.246411]  ? xas_load+0x9/0x80
[ 3769.246413]  ? find_get_entry+0x58/0x120
[ 3769.246414]  pagecache_get_page+0xde/0x210
[ 3769.246415]  grab_cache_page_write_begin+0x17/0x30
[ 3769.246416]  ext4_da_write_begin+0xc4/0x340
[ 3769.246418]  generic_perform_write+0xb8/0x1b0
[ 3769.246420]  __generic_file_write_iter+0x13c/0x1b0
[ 3769.246421]  ext4_file_write_iter+0x121/0x3c0
[ 3769.246422]  __vfs_write+0x123/0x1a0
[ 3769.246423]  vfs_write+0xab/0x1a0
[ 3769.246424]  ksys_write+0x4d/0xc0
[ 3769.246426]  do_syscall_64+0x49/0x100
[ 3769.246427]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3769.246428] RIP: 0033:0x7fee5b265730
[ 3769.246429] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
[ 3769.246430] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3769.246431] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
[ 3769.246431] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
[ 3769.246432] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
[ 3769.246433] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
[ 3769.246433] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710


 
>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>
> Sure, thanks.
> 
>> BTW am i correct these patches are merely optimizations ?
> 
> Yes
> 
>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>> to revert them for 5.0 and try again for 5.1 ?
>>
> Before removing both it would be good to test with only the barrier-removal removed.
> 
>> --
>> Sander
>>
> Heiner
> 
>>
>>>
>>>> would be candidates, which were merged in 5.0.
>>>>
>>>> I have reverted the first two, see how that works out.
>>>>
>>>> --
>>>> Sander
>>>>
>>> Heiner
>>>
>>>>  
>>>>>> --
>>>>>> Sander
>>>>>>
>>>>> Heiner
>>>>>
>>>>>>
>>>>>> [ 6466.554866] kernel BUG at lib/dynamic_queue_limits.c:27!
>>>>>> [ 6466.571425] invalid opcode: 0000 [#1] SMP NOPTI
>>>>>> [ 6466.585890] CPU: 3 PID: 7057 Comm: as Not tainted 5.0.0-rc5-20190208-thp-net-florian-doflr+ #1
>>>>>> [ 6466.598693] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>>> [ 6466.611579] RIP: e030:dql_completed+0x126/0x140
>>>>>> [ 6466.624339] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>>> [ 6466.648130] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>>> [ 6466.659616] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>>> [ 6466.672835] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>>> [ 6466.684521] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>>> [ 6466.696824] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>>> [ 6466.709953] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>>> [ 6466.722165] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>>> [ 6466.733228] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 6466.746581] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>>> [ 6466.758366] Call Trace:
>>>>>> [ 6466.768118]  <IRQ>
>>>>>> [ 6466.778214]  rtl8169_poll+0x4f4/0x640
>>>>>> [ 6466.789198]  net_rx_action+0x23d/0x370
>>>>>> [ 6466.798467]  __do_softirq+0xed/0x229
>>>>>> [ 6466.807039]  irq_exit+0xb7/0xc0
>>>>>> [ 6466.815471]  xen_evtchn_do_upcall+0x27/0x40
>>>>>> [ 6466.826647]  xen_do_hypervisor_callback+0x29/0x40
>>>>>> [ 6466.835902]  </IRQ>
>>>>>> [ 6466.845361] RIP: e030:xen_hypercall_mmu_update+0xa/0x20
>>>>>> [ 6466.853390] Code: 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>>>>> [ 6466.874031] RSP: e02b:ffffc90003c0bdd0 EFLAGS: 00000246
>>>>>> [ 6466.883452] RAX: 0000000000000000 RBX: 000000041f83bfe8 RCX: ffffffff8100102a
>>>>>> [ 6466.891986] RDX: deadbeefdeadf00d RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
>>>>>> [ 6466.903402] RBP: 0000000000000fe8 R08: 000000000000000b R09: 0000000000000000
>>>>>> [ 6466.911201] R10: deadbeefdeadf00d R11: 0000000000000246 R12: 800000050c346067
>>>>>> [ 6466.918491] R13: ffff8880607c4fe8 R14: ffff888005082800 R15: 0000000000000000
>>>>>> [ 6466.926647]  ? xen_hypercall_mmu_update+0xa/0x20
>>>>>> [ 6466.938195]  ? xen_set_pte_at+0x78/0xe0
>>>>>> [ 6466.947046]  ? __handle_mm_fault+0xc43/0x1060
>>>>>> [ 6466.955772]  ? do_mmap+0x44b/0x5b0
>>>>>> [ 6466.964410]  ? handle_mm_fault+0xf8/0x200
>>>>>> [ 6466.973290]  ? __do_page_fault+0x231/0x4a0
>>>>>> [ 6466.981973]  ? page_fault+0x8/0x30
>>>>>> [ 6466.990904]  ? page_fault+0x1e/0x30
>>>>>> [ 6466.999585] Modules linked in:
>>>>>> [ 6467.007533] ---[ end trace 94bec01608fe4061 ]---
>>>>>> [ 6467.016751] RIP: e030:dql_completed+0x126/0x140
>>>>>> [ 6467.024271] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>>> [ 6467.039726] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>>> [ 6467.047243] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>>> [ 6467.054202] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>>> [ 6467.062000] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>>> [ 6467.069664] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>>> [ 6467.077715] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>>> [ 6467.084916] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>>> [ 6467.093352] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 6467.101492] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>>> [ 6467.110542] Kernel panic - not syncing: Fatal exception in interrupt
>>>>>> [ 6467.118166] Kernel Offset: disabled
>>>>>> (XEN) [2019-02-08 18:04:48.854] Hardware Dom0 crashed: rebooting machine in 5 seconds.
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
> 


^ permalink raw reply

* Re: [PATCH] net: sfp: do not probe SFP module before we're attached
From: Russell King - ARM Linux admin @ 2019-02-08 23:36 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, andrew, f.fainelli, hkallweit1
In-Reply-To: <20190208.151139.1930176272346229162.davem@davemloft.net>

On Fri, Feb 08, 2019 at 03:11:39PM -0800, David Miller wrote:
> From: Russell King <rmk+kernel@armlinux.org.uk>
> Date: Wed, 06 Feb 2019 10:52:30 +0000
> 
> > When we probe a SFP module, we expect to be able to call the upstream
> > device's module_insert() function so that the upstream link can be
> > configured.  However, when the upstream device is delayed, we currently
> > may end up probing the module before the upstream device is available,
> > and lose the module_insert() call.
> > 
> > Avoid this by holding off probing the module until the SFP bus is
> > properly connected to both the SFP socket driver and the upstream
> > driver.
> > 
> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> 
> Applied, thanks Russell.
> 
> -stable?

Yes please.  Would you like me to mail the stable team once it hits
mainline?

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply

* Re: [PATCH] net: sfp: do not probe SFP module before we're attached
From: David Miller @ 2019-02-08 23:42 UTC (permalink / raw)
  To: linux; +Cc: netdev, andrew, f.fainelli, hkallweit1
In-Reply-To: <20190208233651.wdaywntcwwq63xpo@shell.armlinux.org.uk>

From: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Date: Fri, 8 Feb 2019 23:36:51 +0000

> On Fri, Feb 08, 2019 at 03:11:39PM -0800, David Miller wrote:
>> From: Russell King <rmk+kernel@armlinux.org.uk>
>> Date: Wed, 06 Feb 2019 10:52:30 +0000
>> 
>> > When we probe a SFP module, we expect to be able to call the upstream
>> > device's module_insert() function so that the upstream link can be
>> > configured.  However, when the upstream device is delayed, we currently
>> > may end up probing the module before the upstream device is available,
>> > and lose the module_insert() call.
>> > 
>> > Avoid this by holding off probing the module until the SFP bus is
>> > properly connected to both the SFP socket driver and the upstream
>> > driver.
>> > 
>> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
>> 
>> Applied, thanks Russell.
>> 
>> -stable?
> 
> Yes please.  Would you like me to mail the stable team once it hits
> mainline?

Networking -stable submissions are handled purely by me, so no you don't
need to do that.

I've queued this one up, thanks.

Thanks.

^ permalink raw reply

* [PATCH] net: hso: do not call unregister if not registered
From: Yavuz, Tuba @ 2019-02-09  0:02 UTC (permalink / raw)
  To: netdev@vger.kernel.org

On an error path inside the hso_create_net_device function of the hso
driver, hso_free_net_device gets called. This can cause a negative
reference count in the net device if register_netdev has not been
called yet, because hso_free_net_device calls unregister_netdev
regardless. I think the driver should distinguish these cases and call
unregister_netdev only if register_netdev has been called.

Signed-off-by: Tuba Yavuz <tuba@ece.ufl.edu>
---

--- linux-stable/drivers/net/usb/hso.c.orig	2019-01-27 14:45:58.232683119 -0500
+++ linux-stable/drivers/net/usb/hso.c	2019-02-05 17:54:17.056496019 -0500
@@ -2377,7 +2377,9 @@ static void hso_free_net_device(struct h

 	remove_net_device(hso_net->parent);

-	if (hso_net->net)
+	if (hso_net->net &&
+	    hso_net->net->reg_state == NETREG_REGISTERED)
 		unregister_netdev(hso_net->net);

 	/* start freeing */
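
The guard in the hunk above can be modelled in plain userspace C. This is
only a sketch of the idea, not driver code: struct fake_netdev, the
REG_* states and the fake_* functions are hypothetical stand-ins for
struct net_device, its reg_state field and unregister_netdev().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's netdev registration states. */
enum reg_state { REG_UNINITIALIZED, REG_REGISTERED };

struct fake_netdev {
	enum reg_state reg_state;
	int unregister_calls;	/* counts calls, for illustration only */
};

/* Stand-in for unregister_netdev(): only valid on a registered device. */
void fake_unregister(struct fake_netdev *dev)
{
	dev->unregister_calls++;
	dev->reg_state = REG_UNINITIALIZED;
}

/*
 * Mirrors the guarded teardown: unregister only if the device exists
 * and registration actually happened. Returns true if it unregistered.
 */
bool fake_free_net_device(struct fake_netdev *dev)
{
	if (dev && dev->reg_state == REG_REGISTERED) {
		fake_unregister(dev);
		return true;
	}
	return false;
}
```

With this shape, an error path that runs before registration falls
through the check and never reaches the unregister call.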

^ permalink raw reply

* Re: [PATCH] net: hso: do not unregister if not registered
From: Yavuz, Tuba @ 2019-02-09  0:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <20190208.150818.2115309831038338250.davem@davemloft.net>

I just resubmitted the patch and made sure to send it in plaintext. Hopefully, it will work this time.

Best,

Tuba 
________________________________________
From: David Miller <davem@davemloft.net>
Sent: Friday, February 8, 2019 6:08 PM
To: Yavuz, Tuba
Cc: netdev@vger.kernel.org
Subject: Re: [PATCH] net: hso: do not unregister if not registered

From: "Yavuz, Tuba" <tuba@ece.ufl.edu>
Date: Wed, 6 Feb 2019 00:40:31 +0000

>
> On an error path inside the hso_create_net_device function of the hso
> driver, hso_free_net_device gets called. This causes potentially a
> negative reference count in the net device if register_netdev has not
> been called yet as hso_free_net_device calls unregister_netdev
> regardless. I think the driver should distinguish these cases and call
> unregister_netdev only if register_netdev has been called.
>
> Signed-off-by: Tuba Yavuz <tuba@ece.ufl.edu>

This patch is corrupted by your email client.

^ permalink raw reply

* Re: [PATCH 0/3] iw_cxgb4: add support for completing cached SRQ buffers
From: Jason Gunthorpe @ 2019-02-09  0:04 UTC (permalink / raw)
  To: Raju Rangoju; +Cc: davem, linux-rdma, netdev, swise
In-Reply-To: <20190206172444.21997-1-rajur@chelsio.com>

On Wed, Feb 06, 2019 at 10:54:41PM +0530, Raju Rangoju wrote:
> This series adds support for completing the SRQ buffers that were
> fetched but could not be completed by hw due to connection aborts,
> also fixes the potential srqidx leak during the connection abort.
> 
> This series has both net(cxgb4) and rdma(iw_cxgb4) changes,
> and I would request this merge via rdma repo.
> 
> I have made sure this series applies cleanly on both net-next
> and rdma-for-next and doesn't cause any merge conflicts.
> 
> Raju Rangoju (3):
>   cxgb4: add tcb flags and tcb rpl struct
>   iw_cxgb4: complete the cached SRQ buffers
>   iw_cxgb4: fix srqidx leak during connection abort
> 
>  drivers/infiniband/hw/cxgb4/cm.c            | 166 ++++++++++++++++++++++++++--
>  drivers/infiniband/hw/cxgb4/iw_cxgb4.h      |   3 +
>  drivers/infiniband/hw/cxgb4/t4.h            |   1 +
>  drivers/net/ethernet/chelsio/cxgb4/t4_msg.h |   8 ++
>  drivers/net/ethernet/chelsio/cxgb4/t4_tcb.h |  12 ++
>  5 files changed, 180 insertions(+), 10 deletions(-)

Since this is mostly rdma code let's go through the rdma tree.

Applied to for-next, thanks

Jason 

^ permalink raw reply

* Re: Resource management for ndo_xdp_xmit (Was: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames)
From: Saeed Mahameed @ 2019-02-09  0:18 UTC (permalink / raw)
  To: brouer@redhat.com
  Cc: thoiland@redhat.com, hawk@kernel.org,
	virtualization@lists.linux-foundation.org, borkmann@iogearbox.net,
	Tariq Toukan, toke@toke.dk, john.fastabend@gmail.com,
	mst@redhat.com, jakub.kicinski@netronome.com, dsahern@gmail.com,
	netdev@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
	makita.toshiaki@lab.ntt.co.jp
In-Reply-To: <71c687209afb1268fdb5dc4aabbab9ecf6c2aa37.camel@mellanox.com>

On Fri, 2019-02-08 at 15:17 -0800, Saeed Mahameed wrote:
> On Thu, 2019-02-07 at 19:08 +0000, Saeed Mahameed wrote:
> > 
> > So 
> > 1) on dev_map_update_elem() we will call
> > dev->dev->ndo_bpf() to notify the device on the intention to
> > start/stop
> > redirect, and wait for it to create/destroy the HW resources
> > before/after actually updating the map
> > 
> 
> silly me, dev_map_update_elem must be atomic, we can't hook driver
> resource allocation to it, it must come as a separate request
> (syscall)
> from user space to request to create XDP redirect resources.
> 

Well, it is possible to make dev_map_update_elem non-atomic and have
the verifier reject BPF programs that try to call it, in
check_map_func_compatibility.

If you know of any case where devmap needs to be updated from the BPF
program please let me know.
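
To illustrate the kind of check meant here, a toy model of a per-map-type
allow-list follows. It is a sketch under assumptions, not the verifier's
actual code: the enum names and map_func_compatible() are made up for
illustration and only mimic the shape of such a compatibility check.

```c
#include <stdbool.h>

/* Hypothetical map types and helper IDs; names are illustrative only. */
enum map_type { MAP_HASH, MAP_ARRAY, MAP_DEVMAP };
enum helper_id { HELPER_MAP_LOOKUP, HELPER_MAP_UPDATE, HELPER_REDIRECT_MAP };

/* Returns true if a program may call helper `fn` on a map of `type`. */
bool map_func_compatible(enum map_type type, enum helper_id fn)
{
	switch (type) {
	case MAP_DEVMAP:
		/* Programs may only redirect through the devmap, not update it. */
		return fn == HELPER_REDIRECT_MAP;
	default:
		/* Generic maps: everything except the devmap-only helper. */
		return fn != HELPER_REDIRECT_MAP;
	}
}
```

If updates from programs are rejected at load time like this, the update
path is only reachable from the syscall side, where sleeping is allowed.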


^ permalink raw reply

* [PATCH net-next 00/16] net: Remove switchdev_ops
From: Florian Fainelli @ 2019-02-09  0:32 UTC (permalink / raw)
  To: netdev
  Cc: Florian Fainelli, David S. Miller, Ido Schimmel, open list,
	open list:STAGING SUBSYSTEM, moderated list:ETHERNET BRIDGE, jiri,
	andrew, vivien.didelot

Hi all,

This patch series finishes with the removal of switchdev_ops. To get there
we need to do a few things:

- get rid of the one and only call to switchdev_port_attr_get(), which is
  used to fetch the device's bridge port flags capabilities; instead we
  can just check what flags are being programmed during the prepare
  phase

- once we get rid of getting switchdev port attributes, we convert the
  setting of such attributes to use a blocking notifier

And then remove switchdev_ops completely.
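
As a rough illustration of the second item, the sketch below models in plain
userspace C how an attribute-set request might be delivered through a chain of
handlers rather than through a per-device ops pointer. Everything in it
(struct attr_event, call_attr_chain, the demo handlers) is hypothetical and
far simpler than the kernel's blocking notifier API; it only shows the
dispatch idea, not the code in this series.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical event describing one attribute-set request. */
struct attr_event {
	int port;             /* which port the request targets */
	unsigned long flags;  /* attribute payload */
};

typedef int (*attr_handler_t)(const struct attr_event *ev);

/* One registered handler; each driver would add a node to the chain. */
struct handler_node {
	attr_handler_t fn;
	struct handler_node *next;
};

/*
 * Walk the chain. A handler returns -EOPNOTSUPP when the event is not
 * for it, 0 on success, or another negative errno to abort the walk.
 * If no handler claims the event, the result is -EOPNOTSUPP.
 */
static int call_attr_chain(const struct handler_node *head,
			   const struct attr_event *ev)
{
	const struct handler_node *n;
	int ret = -EOPNOTSUPP;

	for (n = head; n; n = n->next) {
		int err = n->fn(ev);

		if (err == -EOPNOTSUPP)
			continue;
		ret = err;
		if (err)
			break;
	}
	return ret;
}

/* Demo handler: claims port 0 and accepts any payload. */
static int demo_port0_handler(const struct attr_event *ev)
{
	return ev->port == 0 ? 0 : -EOPNOTSUPP;
}

/* Demo handler: claims port 1 and rejects any non-zero payload. */
static int demo_port1_handler(const struct attr_event *ev)
{
	if (ev->port != 1)
		return -EOPNOTSUPP;
	return ev->flags ? -EINVAL : 0;
}
```

With both demo handlers chained, a request for port 0 succeeds, a request for
port 1 with a non-zero payload fails with -EINVAL, and a request for an
unclaimed port falls through with -EOPNOTSUPP.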

Please review and let me know what you think!

Florian Fainelli (16):
  Documentation: networking: switchdev: Update port parent ID section
  mlxsw: spectrum: Check bridge flags during prepare phase
  staging: fsl-dpaa2: ethsw: Check bridge port flags during set
  net: dsa: Add setter for SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS
  rocker: Check bridge flags during prepare phase
  net: bridge: Stop calling switchdev_port_attr_get()
  net: Remove SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS_SUPPORT
  net: Get rid of switchdev_port_attr_get()
  switchdev: Add SWITCHDEV_PORT_ATTR_SET
  rocker: Handle SWITCHDEV_PORT_ATTR_SET
  net: dsa: Handle SWITCHDEV_PORT_ATTR_SET
  mlxsw: spectrum_switchdev: Handle SWITCHDEV_PORT_ATTR_SET
  net: mscc: ocelot: Handle SWITCHDEV_PORT_ATTR_SET
  staging: fsl-dpaa2: ethsw: Handle SWITCHDEV_PORT_ATTR_SET
  net: switchdev: Replace port attr set SDO with a notification
  net: Remove switchdev_ops

 Documentation/networking/switchdev.txt        | 15 ++-
 .../net/ethernet/mellanox/mlxsw/spectrum.c    | 12 ---
 .../net/ethernet/mellanox/mlxsw/spectrum.h    |  2 -
 .../mellanox/mlxsw/spectrum_switchdev.c       | 72 +++++---------
 drivers/net/ethernet/mscc/ocelot.c            | 25 ++++-
 drivers/net/ethernet/rocker/rocker_main.c     | 99 +++++++++----------
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c       | 52 +++++-----
 include/linux/netdevice.h                     |  3 -
 include/net/switchdev.h                       | 37 ++-----
 net/bridge/br_switchdev.c                     | 20 +---
 net/dsa/dsa_priv.h                            |  3 +
 net/dsa/port.c                                | 10 ++
 net/dsa/slave.c                               | 44 +++++----
 net/switchdev/switchdev.c                     | 92 +++++------------
 14 files changed, 190 insertions(+), 296 deletions(-)

-- 
2.17.1


^ permalink raw reply

* [PATCH net-next 01/16] Documentation: networking: switchdev: Update port parent ID section
From: Florian Fainelli @ 2019-02-09  0:32 UTC (permalink / raw)
  To: netdev
  Cc: Florian Fainelli, David S. Miller, Ido Schimmel, open list,
	open list:STAGING SUBSYSTEM, moderated list:ETHERNET BRIDGE, jiri,
	andrew, vivien.didelot
In-Reply-To: <20190209003248.31088-1-f.fainelli@gmail.com>

Update the section about switchdev drivers having to implement a
switchdev_port_attr_get() function to return
SWITCHDEV_ATTR_ID_PORT_PARENT_ID since that is no longer valid after
commit bccb30254a4a ("net: Get rid of
SWITCHDEV_ATTR_ID_PORT_PARENT_ID").

Fixes: bccb30254a4a ("net: Get rid of SWITCHDEV_ATTR_ID_PORT_PARENT_ID")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 Documentation/networking/switchdev.txt | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
index f3244d87512a..2842f63ad47b 100644
--- a/Documentation/networking/switchdev.txt
+++ b/Documentation/networking/switchdev.txt
@@ -92,11 +92,11 @@ device.
 Switch ID
 ^^^^^^^^^
 
-The switchdev driver must implement the switchdev op switchdev_port_attr_get
-for SWITCHDEV_ATTR_ID_PORT_PARENT_ID for each port netdev, returning the same
-physical ID for each port of a switch.  The ID must be unique between switches
-on the same system.  The ID does not need to be unique between switches on
-different systems.
+The switchdev driver must implement the net_device operation
+ndo_get_port_parent_id for each port netdev,  returning the same physical ID
+for each port of a switch. The ID must be unique between switches on the same
+system. The ID does not need to be unique between switches on different
+systems.
 
 The switch ID is used to locate ports on a switch and to know if aggregated
 ports belong to the same switch.
-- 
2.17.1


^ permalink raw reply related

* [PATCH net-next 02/16] mlxsw: spectrum: Check bridge flags during prepare phase
From: Florian Fainelli @ 2019-02-09  0:32 UTC (permalink / raw)
  To: netdev
  Cc: Florian Fainelli, David S. Miller, Ido Schimmel, open list,
	open list:STAGING SUBSYSTEM, moderated list:ETHERNET BRIDGE, jiri,
	andrew, vivien.didelot
In-Reply-To: <20190209003248.31088-1-f.fainelli@gmail.com>

In preparation for getting rid of switchdev_port_attr_get(), have mlxsw
check for the bridge flags being set through switchdev_port_attr_set()
with the SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS attribute identifier.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 95e37de3e48f..468a6d513074 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -623,8 +623,11 @@ static int mlxsw_sp_port_attr_br_flags_set(struct mlxsw_sp_port *mlxsw_sp_port,
 	struct mlxsw_sp_bridge_port *bridge_port;
 	int err;
 
-	if (switchdev_trans_ph_prepare(trans))
+	if (switchdev_trans_ph_prepare(trans)) {
+		if (brport_flags & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD))
+			return -EOPNOTSUPP;
 		return 0;
+	}
 
 	bridge_port = mlxsw_sp_bridge_port_find(mlxsw_sp_port->mlxsw_sp->bridge,
 						orig_dev);
-- 
2.17.1
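
The check added above is a plain bitmask test: any requested bit outside the
supported set makes the request fail. A minimal standalone sketch of that
test follows. The helper name check_brport_flags is hypothetical, and the
numeric values given to the BR_* names are stand-ins assumed for this
example rather than taken from the kernel headers.

```c
#include <errno.h>

/* Stand-in bit values for this sketch only (assumed, not authoritative). */
#define BR_HAIRPIN_MODE (1UL << 0)
#define BR_LEARNING     (1UL << 5)
#define BR_FLOOD        (1UL << 6)
#define BR_MCAST_FLOOD  (1UL << 11)

/* Flags the hypothetical driver claims to support. */
#define DEMO_SUPPORTED  (BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD)

/*
 * Hypothetical helper: return 0 when every requested bit is in the
 * supported mask, -EOPNOTSUPP when at least one bit falls outside it.
 */
static int check_brport_flags(unsigned long requested,
			      unsigned long supported)
{
	if (requested & ~supported)
		return -EOPNOTSUPP;
	return 0;
}
```

For example, check_brport_flags(BR_LEARNING | BR_FLOOD, DEMO_SUPPORTED)
returns 0, while adding BR_HAIRPIN_MODE to the request yields -EOPNOTSUPP.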


^ permalink raw reply related

* [PATCH net-next 03/16] staging: fsl-dpaa2: ethsw: Check bridge port flags during set
From: Florian Fainelli @ 2019-02-09  0:32 UTC (permalink / raw)
  To: netdev
  Cc: Florian Fainelli, David S. Miller, Ido Schimmel, open list,
	open list:STAGING SUBSYSTEM, moderated list:ETHERNET BRIDGE, jiri,
	andrew, vivien.didelot
In-Reply-To: <20190209003248.31088-1-f.fainelli@gmail.com>

In preparation for removing SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS_SUPPORT,
have ethsw check that the bridge port flags that are being set are
supported.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index e559f4c25cf7..6228c4375835 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -680,8 +680,11 @@ static int port_attr_br_flags_set(struct net_device *netdev,
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	int err = 0;
 
-	if (switchdev_trans_ph_prepare(trans))
+	if (switchdev_trans_ph_prepare(trans)) {
+		if (flags & ~(BR_LEARNING | BR_FLOOD))
+			return -EOPNOTSUPP;
 		return 0;
+	}
 
 	/* Learning is enabled per switch */
 	err = ethsw_set_learning(port_priv->ethsw_data, flags & BR_LEARNING);
-- 
2.17.1
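
Why validating in the prepare phase is enough can be sketched with a small
userspace model of a two-phase set: the caller runs the prepare phase first
and only proceeds to commit when prepare succeeds, so a rejected request never
touches device state. All names below (demo_phase, demo_port, demo_flags_set,
demo_attr_set, the DEMO_* flags) are hypothetical and stand in for the
switchdev transaction machinery.

```c
#include <errno.h>

enum demo_phase { DEMO_PREPARE, DEMO_COMMIT };

/* Hypothetical flag bits; DEMO_OTHER plays the unsupported flag. */
#define DEMO_LEARNING (1UL << 0)
#define DEMO_FLOOD    (1UL << 1)
#define DEMO_OTHER    (1UL << 2)

/* Hypothetical per-port state: the flags last written to "hardware". */
struct demo_port {
	unsigned long applied_flags;
};

/* Driver-side setter: validate during prepare, apply during commit. */
static int demo_flags_set(struct demo_port *port, unsigned long flags,
			  enum demo_phase phase)
{
	if (phase == DEMO_PREPARE) {
		if (flags & ~(DEMO_LEARNING | DEMO_FLOOD))
			return -EOPNOTSUPP;
		return 0;
	}

	port->applied_flags = flags;
	return 0;
}

/* Caller side: commit runs only if prepare reported no error. */
static int demo_attr_set(struct demo_port *port, unsigned long flags)
{
	int err = demo_flags_set(port, flags, DEMO_PREPARE);

	if (err)
		return err;
	return demo_flags_set(port, flags, DEMO_COMMIT);
}
```

In this model a request carrying DEMO_OTHER fails with -EOPNOTSUPP and leaves
applied_flags at its previous value.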


^ permalink raw reply related

* [PATCH net-next 04/16] net: dsa: Add setter for SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS
From: Florian Fainelli @ 2019-02-09  0:32 UTC (permalink / raw)
  To: netdev
  Cc: Florian Fainelli, David S. Miller, Ido Schimmel, open list,
	open list:STAGING SUBSYSTEM, moderated list:ETHERNET BRIDGE, jiri,
	andrew, vivien.didelot
In-Reply-To: <20190209003248.31088-1-f.fainelli@gmail.com>

In preparation for removing SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS_SUPPORT,
add a function that processes the SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS
attribute and returns -EOPNOTSUPP for any flag set, since DSA does not
yet support toggling those bridge port attributes.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 net/dsa/dsa_priv.h |  3 +++
 net/dsa/port.c     | 10 ++++++++++
 net/dsa/slave.c    |  4 ++++
 3 files changed, 17 insertions(+)

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 1f4972dab9f2..97594f0b6efb 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -150,6 +150,9 @@ int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
 			    struct switchdev_trans *trans);
 int dsa_port_ageing_time(struct dsa_port *dp, clock_t ageing_clock,
 			 struct switchdev_trans *trans);
+int dsa_port_bridge_port_flags_set(struct dsa_port *dp,
+				   unsigned long brport_flags,
+				   struct switchdev_trans *trans);
 int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 		     u16 vid);
 int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 2d7e01b23572..2ce3752203cf 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -177,6 +177,16 @@ int dsa_port_ageing_time(struct dsa_port *dp, clock_t ageing_clock,
 	return dsa_port_notify(dp, DSA_NOTIFIER_AGEING_TIME, &info);
 }
 
+int dsa_port_bridge_port_flags_set(struct dsa_port *dp,
+				   unsigned long brport_flags,
+				   struct switchdev_trans *trans)
+{
+	if (brport_flags)
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
 int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
 		     u16 vid)
 {
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 70395a0ae52e..212fc1cc27fc 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -292,6 +292,10 @@ static int dsa_slave_port_attr_set(struct net_device *dev,
 	case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
 		ret = dsa_port_ageing_time(dp, attr->u.ageing_time, trans);
 		break;
+	case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:
+		ret = dsa_port_bridge_port_flags_set(dp, attr->u.brport_flags,
+						     trans);
+		break;
 	default:
 		ret = -EOPNOTSUPP;
 		break;
-- 
2.17.1


^ permalink raw reply related
