Netdev List
* Re: [PATCH net-next] xen-netback: mark expected switch fall-through
From: Gustavo A. R. Silva @ 2019-02-08 21:00 UTC
  To: David Miller; +Cc: wei.liu2, paul.durrant, xen-devel, netdev, linux-kernel
In-Reply-To: <20190208.122152.2025054759290850260.davem@davemloft.net>



On 2/8/19 2:21 PM, David Miller wrote:
> From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
> Date: Fri, 8 Feb 2019 13:58:38 -0600
> 
>> In preparation for enabling -Wimplicit-fallthrough, mark switch
>> cases where we are expecting to fall through.
>>
>> Warning level 3 was used: -Wimplicit-fallthrough=3
>>
>> This patch is part of the ongoing efforts to enable
>> -Wimplicit-fallthrough.
>>
>> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
> 
> Applied.
> 

Thanks, Dave.

--
Gustavo
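
For readers unfamiliar with the annotation, here is a minimal sketch of what
marking an expected fall-through looks like. The enum and function names are
hypothetical and not taken from xen-netback. The only compiler assumption is
that GCC at -Wimplicit-fallthrough=3 accepts a "fall through" comment placed
directly before the next case label.

```c
/* Hypothetical request types, for illustration only. */
enum toy_req_type { TOY_REQ_FLUSH, TOY_REQ_WRITE, TOY_REQ_READ };

/* Returns a bitmask describing which steps ran for the request. */
unsigned int toy_handle_request(enum toy_req_type type)
{
	unsigned int steps = 0;

	switch (type) {
	case TOY_REQ_FLUSH:
		steps |= 0x1;	/* flush-specific work */
		/* fall through */
	case TOY_REQ_WRITE:
		steps |= 0x2;	/* work shared by flush and write */
		break;
	case TOY_REQ_READ:
		steps |= 0x4;
		break;
	}

	return steps;
}
```

Without the comment, building with -Wimplicit-fallthrough=3 would be expected
to warn at the TOY_REQ_WRITE label.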

* RE: [PATCH net-next] devlink: Add WARN_ON to catch errors of not cleaning devlink objects
From: Parav Pandit @ 2019-02-08 21:09 UTC
  To: Parav Pandit, David Ahern, netdev@vger.kernel.org,
	davem@davemloft.net
In-Reply-To: <VI1PR0501MB2271230158058746713392C0D1690@VI1PR0501MB2271.eurprd05.prod.outlook.com>



> -----Original Message-----
> From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> On
> Behalf Of Parav Pandit
> Sent: Friday, February 8, 2019 12:01 PM
> To: David Ahern <dsahern@gmail.com>; netdev@vger.kernel.org;
> davem@davemloft.net
> Subject: RE: [PATCH net-next] devlink: Add WARN_ON to catch errors of not
> cleaning devlink objects
> 
> 
> 
> > -----Original Message-----
> > From: David Ahern <dsahern@gmail.com>
> > Sent: Friday, February 8, 2019 11:30 AM
> > To: Parav Pandit <parav@mellanox.com>; netdev@vger.kernel.org;
> > davem@davemloft.net
> > Subject: Re: [PATCH net-next] devlink: Add WARN_ON to catch errors of
> > not cleaning devlink objects
> >
> > On 2/8/19 8:22 AM, Parav Pandit wrote:
> > > Add WARN_ON to make sure that all sub objects of a devlink device
> > > > are cleaned up before freeing the devlink device.
> > > This helps to catch any driver bugs.
> > >
> > > Signed-off-by: Parav Pandit <parav@mellanox.com>
> > > Acked-by: Jiri Pirko <jiri@mellanox.com>
> > > ---
> > >  net/core/devlink.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > > diff --git a/net/core/devlink.c b/net/core/devlink.c
> > > > index cd0d393..5e2ef5a 100644
> > > > --- a/net/core/devlink.c
> > > > +++ b/net/core/devlink.c
> > > > @@ -4229,6 +4229,13 @@ void devlink_unregister(struct devlink *devlink)
> > > >   */
> > > >  void devlink_free(struct devlink *devlink)
> > > >  {
> > > +	WARN_ON(!list_empty(&devlink->port_list));
> > > +	WARN_ON(!list_empty(&devlink->sb_list));
> > > +	WARN_ON(!list_empty(&devlink->dpipe_table_list));
> > > +	WARN_ON(!list_empty(&devlink->resource_list));
> > > +	WARN_ON(!list_empty(&devlink->param_list));
> > > +	WARN_ON(!list_empty(&devlink->region_list));
> > > +
> > >  	kfree(devlink);
> > >  }
> > >  EXPORT_SYMBOL_GPL(devlink_free);
> > >
> >
> > reporter_list was just added which brings up the maintenance question:
> > If you are going to do this you might want a comment in
> > include/net/devlink.h to remind folks to update this function as relevant.
> I see. Makes sense. I will add reporter_list and also update devlink.h with a
> comment in v1.
I think it's a little too much to add such a generic comment in devlink.h for the lists.
My tree was a bit old and missed the reporter_list entry because we first test on internal trees.
I updated my tree to avoid such problems in the future.

* [PATCH v2 net-next] net: phy: aquantia: add support for AQCS109
From: Heiner Kallweit @ 2019-02-08 21:12 UTC
  To: Andrew Lunn, Florian Fainelli, David Miller
  Cc: netdev@vger.kernel.org, Nikita Yushchenko

From: Nikita Yushchenko <nikita.yoush@cogentembedded.com>

Add support for the AQCS109. From a software point of view,
it should be almost equivalent to AQR107.

v2:
- make Nikita the author
- document what I changed

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[hkallweit1@gmail.com: use PHY_ID_MATCH_MODEL macro]
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/aquantia.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/phy/aquantia.c b/drivers/net/phy/aquantia.c
index 482004efa..0f772a47a 100644
--- a/drivers/net/phy/aquantia.c
+++ b/drivers/net/phy/aquantia.c
@@ -17,6 +17,7 @@
 #define PHY_ID_AQR105	0x03a1b4a2
 #define PHY_ID_AQR106	0x03a1b4d0
 #define PHY_ID_AQR107	0x03a1b4e0
+#define PHY_ID_AQCS109	0x03a1b5c2
 #define PHY_ID_AQR405	0x03a1b4b0
 
 #define MDIO_AN_TX_VEND_STATUS1			0xc800
@@ -202,6 +203,16 @@ static struct phy_driver aqr_driver[] = {
 	.ack_interrupt	= aqr_ack_interrupt,
 	.read_status	= aqr_read_status,
 },
+{
+	PHY_ID_MATCH_MODEL(PHY_ID_AQCS109),
+	.name		= "Aquantia AQCS109",
+	.features	= PHY_10GBIT_FULL_FEATURES,
+	.aneg_done	= genphy_c45_aneg_done,
+	.config_aneg    = aqr_config_aneg,
+	.config_intr	= aqr_config_intr,
+	.ack_interrupt	= aqr_ack_interrupt,
+	.read_status	= aqr_read_status,
+},
 {
 	PHY_ID_MATCH_MODEL(PHY_ID_AQR405),
 	.name		= "Aquantia AQR405",
@@ -222,6 +233,7 @@ static struct mdio_device_id __maybe_unused aqr_tbl[] = {
 	{ PHY_ID_MATCH_MODEL(PHY_ID_AQR105) },
 	{ PHY_ID_MATCH_MODEL(PHY_ID_AQR106) },
 	{ PHY_ID_MATCH_MODEL(PHY_ID_AQR107) },
+	{ PHY_ID_MATCH_MODEL(PHY_ID_AQCS109) },
 	{ PHY_ID_MATCH_MODEL(PHY_ID_AQR405) },
 	{ }
 };
-- 
2.20.1
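
As a rough illustration of what the PHY_ID_MATCH_MODEL entries in this patch
are meant to achieve, the sketch below matches a PHY ID while ignoring the
low four revision bits. It is a user-space approximation: the mask value
(bits 31..4) reflects my understanding of the macro and should be treated as
an assumption, and toy_phy_model_matches() is a hypothetical helper, not a
kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

#define PHY_ID_AQR107	0x03a1b4e0u
#define PHY_ID_AQCS109	0x03a1b5c2u

/* Assumed model mask: compare bits 31..4, ignore the 4 revision bits. */
#define TOY_PHY_MODEL_MASK	0xfffffff0u

/* Returns true if the ID read from hardware matches the table entry's model. */
bool toy_phy_model_matches(uint32_t hw_id, uint32_t table_id)
{
	return (hw_id & TOY_PHY_MODEL_MASK) == (table_id & TOY_PHY_MODEL_MASK);
}
```

Under this assumption, a chip reporting 0x03a1b5c5 would still bind to the
AQCS109 entry, while an AQR107 ID would not.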


* Re: [PATCH v2 net-next] net: phy: aquantia: add support for AQCS109
From: Andrew Lunn @ 2019-02-08 21:14 UTC
  To: Heiner Kallweit
  Cc: Florian Fainelli, David Miller, netdev@vger.kernel.org,
	Nikita Yushchenko
In-Reply-To: <ed788670-c0aa-212f-2c8f-0e8f858dc35c@gmail.com>

On Fri, Feb 08, 2019 at 10:12:23PM +0100, Heiner Kallweit wrote:
> From: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
>
> Add support for the AQCS109. From a software point of view,
> it should be almost equivalent to AQR107.
> 
> v2:
> - make Nikita the author
> - document what I changed
> 
> Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> [hkallweit1@gmail.com: use PHY_ID_MATCH_MODEL macro]
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

* [PATCHv1 net-next] devlink: Add WARN_ON to catch errors of not cleaning devlink objects
From: Parav Pandit @ 2019-02-08 21:15 UTC
  To: netdev, davem; +Cc: parav

Add WARN_ON to make sure that all sub objects of a devlink device are
cleaned up before freeing the devlink device.
This helps to catch any driver bugs.

Signed-off-by: Parav Pandit <parav@mellanox.com>
---
Changelog:
v0->v1:
 - Added WARN_ON for reporter_list too
 - Changed the order of the WARN_ON checks to mirror the order of list initialization.
---
 net/core/devlink.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/core/devlink.c b/net/core/devlink.c
index 7fbdba5..ab7ce80 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -5237,6 +5237,14 @@ void devlink_unregister(struct devlink *devlink)
  */
 void devlink_free(struct devlink *devlink)
 {
+	WARN_ON(!list_empty(&devlink->reporter_list));
+	WARN_ON(!list_empty(&devlink->region_list));
+	WARN_ON(!list_empty(&devlink->param_list));
+	WARN_ON(!list_empty(&devlink->resource_list));
+	WARN_ON(!list_empty(&devlink->dpipe_table_list));
+	WARN_ON(!list_empty(&devlink->sb_list));
+	WARN_ON(!list_empty(&devlink->port_list));
+
 	kfree(devlink);
 }
 EXPORT_SYMBOL_GPL(devlink_free);
-- 
1.8.3.1
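
The idea behind the patch can be tried outside the kernel with a small
user-space sketch. Everything prefixed toy_ is hypothetical: the list type is
a stripped-down stand-in for the kernel's struct list_head, and a counted
message on stderr stands in for WARN_ON. Only two of the seven lists are
modelled.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, standing in for struct list_head. */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

void toy_list_init(struct toy_list_head *head)
{
	head->next = head;
	head->prev = head;
}

bool toy_list_empty(const struct toy_list_head *head)
{
	return head->next == head;
}

void toy_list_add(struct toy_list_head *node, struct toy_list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

struct toy_devlink {
	struct toy_list_head port_list;
	struct toy_list_head param_list;
};

struct toy_devlink *toy_devlink_alloc(void)
{
	struct toy_devlink *dl = malloc(sizeof(*dl));

	if (!dl)
		return NULL;
	toy_list_init(&dl->port_list);
	toy_list_init(&dl->param_list);
	return dl;
}

/* Frees the object and returns how many lists were still populated. */
int toy_devlink_free(struct toy_devlink *dl)
{
	int warnings = 0;

	/* Checks run in the reverse order of initialization. */
	if (!toy_list_empty(&dl->param_list)) {
		fprintf(stderr, "WARN: param_list not empty\n");
		warnings++;
	}
	if (!toy_list_empty(&dl->port_list)) {
		fprintf(stderr, "WARN: port_list not empty\n");
		warnings++;
	}

	free(dl);
	return warnings;
}
```

A caller that forgets to remove a port before freeing gets one warning, while
a clean teardown gets none.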


* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Heiner Kallweit @ 2019-02-08 21:22 UTC
  To: Sander Eikelenboom, Realtek linux nic maintainers
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <471e550b-c227-22e6-19fd-5f9abd450e5f@eikelenboom.it>

On 08.02.2019 21:55, Sander Eikelenboom wrote:
> On 08/02/2019 19:52, Heiner Kallweit wrote:
>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>> L.S.,
>>>
>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>> which I haven't encountered with Linux 4.20.x.
>>>
>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>
>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>
>> Thanks for the report. However I see no change in the r8169 driver between
>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>> be somewhere else. Therefore I'm afraid a bisect will be needed.
> 
> Hmm, I did some digging and I think:
> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
> 
You're right. I thought this was added in 4.20 already.
I copied the BQL code pattern from the mlx4 driver, and so far I haven't heard about
this issue from any user of physical hw. Because a lot of mainboards have onboard
Realtek network chips, I have quite a few testers out there.
Does the issue occur under specific circumstances like very high load?

If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
as author of the underlying changes.
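
For context, here is a toy model of the accounting that BQL relies on. As far
as I can tell, the BUG in the report is the consistency check in
dql_completed() that refuses to complete more bytes than were queued; please
treat that reading, and all toy_ names below, as assumptions rather than the
actual kernel code.

```c
#include <stdbool.h>

struct toy_dql {
	unsigned int num_queued;	/* total bytes reported as queued */
	unsigned int num_completed;	/* total bytes reported as completed */
};

/* Transmit path: account for bytes handed to the hardware. */
void toy_dql_queued(struct toy_dql *dql, unsigned int count)
{
	dql->num_queued += count;
}

/*
 * Completion path: returns false where the real code would hit BUG_ON(),
 * i.e. when more bytes complete than are currently outstanding.
 */
bool toy_dql_completed(struct toy_dql *dql, unsigned int count)
{
	if (count > dql->num_queued - dql->num_completed)
		return false;

	dql->num_completed += count;
	return true;
}
```

In this model the check fails whenever a completion is reported before the
matching queue accounting has happened. Whether that ordering is what happens
in r8169 here has not been established in this thread.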

> would be candidates, which were merged in 5.0.
> 
> I have reverted the first two, see how that works out.
> 
> --
> Sander
> 
Heiner

>  
>>> --
>>> Sander
>>>
>> Heiner
>>
>>>
>>> [ 6466.554866] kernel BUG at lib/dynamic_queue_limits.c:27!
>>> [ 6466.571425] invalid opcode: 0000 [#1] SMP NOPTI
>>> [ 6466.585890] CPU: 3 PID: 7057 Comm: as Not tainted 5.0.0-rc5-20190208-thp-net-florian-doflr+ #1
>>> [ 6466.598693] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>> [ 6466.611579] RIP: e030:dql_completed+0x126/0x140
>>> [ 6466.624339] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>> [ 6466.648130] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>> [ 6466.659616] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>> [ 6466.672835] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>> [ 6466.684521] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>> [ 6466.696824] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>> [ 6466.709953] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>> [ 6466.722165] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>> [ 6466.733228] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 6466.746581] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>> [ 6466.758366] Call Trace:
>>> [ 6466.768118]  <IRQ>
>>> [ 6466.778214]  rtl8169_poll+0x4f4/0x640
>>> [ 6466.789198]  net_rx_action+0x23d/0x370
>>> [ 6466.798467]  __do_softirq+0xed/0x229
>>> [ 6466.807039]  irq_exit+0xb7/0xc0
>>> [ 6466.815471]  xen_evtchn_do_upcall+0x27/0x40
>>> [ 6466.826647]  xen_do_hypervisor_callback+0x29/0x40
>>> [ 6466.835902]  </IRQ>
>>> [ 6466.845361] RIP: e030:xen_hypercall_mmu_update+0xa/0x20
>>> [ 6466.853390] Code: 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>> [ 6466.874031] RSP: e02b:ffffc90003c0bdd0 EFLAGS: 00000246
>>> [ 6466.883452] RAX: 0000000000000000 RBX: 000000041f83bfe8 RCX: ffffffff8100102a
>>> [ 6466.891986] RDX: deadbeefdeadf00d RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
>>> [ 6466.903402] RBP: 0000000000000fe8 R08: 000000000000000b R09: 0000000000000000
>>> [ 6466.911201] R10: deadbeefdeadf00d R11: 0000000000000246 R12: 800000050c346067
>>> [ 6466.918491] R13: ffff8880607c4fe8 R14: ffff888005082800 R15: 0000000000000000
>>> [ 6466.926647]  ? xen_hypercall_mmu_update+0xa/0x20
>>> [ 6466.938195]  ? xen_set_pte_at+0x78/0xe0
>>> [ 6466.947046]  ? __handle_mm_fault+0xc43/0x1060
>>> [ 6466.955772]  ? do_mmap+0x44b/0x5b0
>>> [ 6466.964410]  ? handle_mm_fault+0xf8/0x200
>>> [ 6466.973290]  ? __do_page_fault+0x231/0x4a0
>>> [ 6466.981973]  ? page_fault+0x8/0x30
>>> [ 6466.990904]  ? page_fault+0x1e/0x30
>>> [ 6466.999585] Modules linked in:
>>> [ 6467.007533] ---[ end trace 94bec01608fe4061 ]---
>>> [ 6467.016751] RIP: e030:dql_completed+0x126/0x140
>>> [ 6467.024271] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>> [ 6467.039726] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>> [ 6467.047243] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>> [ 6467.054202] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>> [ 6467.062000] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>> [ 6467.069664] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>> [ 6467.077715] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>> [ 6467.084916] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>> [ 6467.093352] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 6467.101492] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>> [ 6467.110542] Kernel panic - not syncing: Fatal exception in interrupt
>>> [ 6467.118166] Kernel Offset: disabled
>>> (XEN) [2019-02-08 18:04:48.854] Hardware Dom0 crashed: rebooting machine in 5 seconds.
>>>
>>
> 
> 


* [PATCH net-next] net: dsa: mv88e6xxx: SERDES support 2500BaseT via external PHY
From: Heiner Kallweit @ 2019-02-08 21:25 UTC
  To: Andrew Lunn, Florian Fainelli, David Miller, Vivien Didelot
  Cc: netdev@vger.kernel.org

From: Andrew Lunn <andrew@lunn.ch>
By using an external PHY, ports 9 and 10 can support 2500BaseT.
So set this link mode in the mask when validating.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 8dca2c949..739c0c168 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -647,8 +647,10 @@ static void mv88e6390_phylink_validate(struct mv88e6xxx_chip *chip, int port,
 				       unsigned long *mask,
 				       struct phylink_link_state *state)
 {
-	if (port >= 9)
+	if (port >= 9) {
 		phylink_set(mask, 2500baseX_Full);
+		phylink_set(mask, 2500baseT_Full);
+	}
 
 	/* No ethtool bits for 200Mbps */
 	phylink_set(mask, 1000baseT_Full);
-- 
2.20.1


* Re: [PATCH net-next] ipmr: ip6mr: Create new sockopt to clear mfc cache or vifs
From: Nikolay Aleksandrov @ 2019-02-08 21:40 UTC
  To: nicolas.dichtel, Callum Sinclair, davem, kuznet, yoshfuji, netdev,
	linux-kernel
In-Reply-To: <31a5155c-ce6e-4c0f-61c0-35a5472549aa@6wind.com>

On 08/02/2019 17:08, Nicolas Dichtel wrote:
> On 08/02/2019 at 15:43, Nikolay Aleksandrov wrote:
>> On 08/02/2019 16:18, Nicolas Dichtel wrote:
>>> On 08/02/2019 at 05:11, Callum Sinclair wrote:
>>>> Currently the only way to clear the mfc cache was to delete the entries
>>> mfc stands for 'multicast forwarding cache', so 'mfc cache' is a bit strange.
>>>
>>>> one by one using the MRT_DEL_MFC socket option or to destroy and
>>>> recreate the socket.
>>> Note that if entries were added with MRT_ADD_MFC_PROXY, they will survive the
>>> socket destruction. This is not the case with your new cmd. Is it intended?
>>
>> I think you're referring to MFC_STATIC entries (sk != mroute_sk). It
>> doesn't matter how you add an entry - they all get cleaned up if added
>> through the mroute socket.
> Yes, right.
> MRT_FLUSH_MFC_STATIC ?
> 

Sounds good to me.


* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Sander Eikelenboom @ 2019-02-08 21:45 UTC
  To: Heiner Kallweit, Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <1265d424-4943-e571-a74b-b1512ebec179@gmail.com>

On 08/02/2019 22:22, Heiner Kallweit wrote:
> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>> L.S.,
>>>>
>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>>> which I haven't encountered with Linux 4.20.x.
>>>>
>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>
>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>
>>> Thanks for the report. However I see no change in the r8169 driver between
>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>
>> Hmm, I did some digging and I think:
>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>
> You're right. I thought this was added in 4.20 already.
> I copied the BQL code pattern from the mlx4 driver, and so far I haven't heard about
> this issue from any user of physical hw. Because a lot of mainboards have onboard
> Realtek network chips, I have quite a few testers out there.
> Does the issue occur under specific circumstances like very high load?

Yep, the box is already under quite some contention from the Xen VMs, and if I remember correctly it occurred while compiling a kernel
on the host.

> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
> as author of the underlying changes.

It could also be that the barriers weren't as unneeded as assumed.
Since we are almost at RC6, I took the liberty of CCing Eric now.

BTW, am I correct that these patches are merely optimizations?
If so, and provided they revert cleanly, perhaps it should be considered at this point in the RCs
to revert them for 5.0 and try again for 5.1?

--
Sander


> 
>> would be candidates, which were merged in 5.0.
>>
>> I have reverted the first two, see how that works out.
>>
>> --
>> Sander
>>
> Heiner
> 
>>  
>>>> --
>>>> Sander
>>>>
>>> Heiner
>>>
>>>>
>>>> [...]
>>>>
>>>
>>
>>
> 


* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Heiner Kallweit @ 2019-02-08 21:50 UTC
  To: Sander Eikelenboom, Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <059e59c6-2264-fd5c-068f-3656e39539c1@eikelenboom.it>

On 08.02.2019 22:45, Sander Eikelenboom wrote:
> On 08/02/2019 22:22, Heiner Kallweit wrote:
>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>> L.S.,
>>>>>
>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>
>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>
>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>
>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>
>>> Hmm, I did some digging and I think:
>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>
>> You're right. I thought this was added in 4.20 already.
>> I copied the BQL code pattern from the mlx4 driver, and so far I haven't heard about
>> this issue from any user of physical hw. Because a lot of mainboards have onboard
>> Realtek network chips, I have quite a few testers out there.
>> Does the issue occur under specific circumstances like very high load?
> 
> Yep, the box is already under quite some contention from the Xen VMs, and if I remember correctly it occurred while compiling a kernel
> on the host.
> 
>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>> as author of the underlying changes.
> 
> It could also be that the barriers weren't as unneeded as assumed.

The barriers were removed after adding xmit_more handling. Therefore it would be good to
also test with only
bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
reverted.

> Since we are almost at RC6, I took the liberty of CCing Eric now.
> 
Sure, thanks.

> BTW, am I correct that these patches are merely optimizations?

Yes

> If so, and provided they revert cleanly, perhaps it should be considered at this point in the RCs
> to revert them for 5.0 and try again for 5.1?
> 
Before reverting both, it would be good to test with only the barrier-removal commit reverted.

> --
> Sander
> 
Heiner

> 
>>
>>> would be candidates, which were merged in 5.0.
>>>
>>> I have reverted the first two, see how that works out.
>>>
>>> --
>>> Sander
>>>
>> Heiner
>>
>>>  
>>>>> --
>>>>> Sander
>>>>>
>>>> Heiner
>>>>
>>>>>
>>>>> [...]
>>>>>
>>>>
>>>
>>>
>>
> 
> 


* Re: Resource management for ndo_xdp_xmit (Was: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames)
From: Saeed Mahameed @ 2019-02-08 22:49 UTC
  To: toke@redhat.com, brouer@redhat.com
  Cc: hawk@kernel.org, virtualization@lists.linux-foundation.org,
	borkmann@iogearbox.net, Tariq Toukan, john.fastabend@gmail.com,
	jakub.kicinski@netronome.com, mst@redhat.com, dsahern@gmail.com,
	netdev@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
	makita.toshiaki@lab.ntt.co.jp
In-Reply-To: <87o97mp6dp.fsf@toke.dk>

On Fri, 2019-02-08 at 17:55 +0100, Toke Høiland-Jørgensen wrote:
> Saeed Mahameed <saeedm@mellanox.com> writes:
> 
> > But:
> > 2) this won't totally solve our problem, since sometimes the driver can
> > decide to recreate (change of configuration) hw resources on the fly
> > while redirect/devmap is already happening, so we need some kind of a
> > dev_map_notification or a flag with rcu synch, for when the driver wants
> > to make the xdp redirect resources unavailable.
> 
> Good point, I'll make a note of this. Do you have a pointer to where the
> mlx5 driver does this kind of change currently?
> 

example:
ethtool -L to reduce/increase the number of rings
e.g. @mlx5e_ethtool_set_ringparam
or virtually anywhere mlx5e_switch_priv_channels is called when xdp
prog redirect is attached to mlx5.

> -Toke

^ permalink raw reply

* Re: [PATCH net-next 00/14] mlxsw: Implement periodic ERP rehash
From: David Miller @ 2019-02-08 23:03 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, mlxsw
In-Reply-To: <20190207112211.10375-1-idosch@mellanox.com>

From: Ido Schimmel <idosch@mellanox.com>
Date: Thu, 7 Feb 2019 11:22:44 +0000

> Currently, an ERP set is created for each region according to rules
> inserted and order of their insertion. However that might lead to
> suboptimal ERP sets and possible unnecessary spillage into C-TCAM.
> This patchset aims to fix this problem and introduces periodical checking
> of used ERP sets and in case a better ERP set is possible for the given
> set of rules, it rehashes the region to use the better ERP set.
 ...

Series applied, I'll push this out after my build tests complete.

Thanks.

^ permalink raw reply

* Re: [PATCH] net: hso: do not unregister if not registered
From: David Miller @ 2019-02-08 23:08 UTC (permalink / raw)
  To: tuba; +Cc: netdev
In-Reply-To: <1549413631237.66546@ece.ufl.edu>

From: "Yavuz, Tuba" <tuba@ece.ufl.edu>
Date: Wed, 6 Feb 2019 00:40:31 +0000

> 
> On an error path inside the hso_create_net_device function of the hso
> driver, hso_free_net_device gets called. This causes potentially a
> negative reference count in the net device if register_netdev has not
> been called yet as hso_free_net_device calls unregister_netdev
> regardless. I think the driver should distinguish these cases and call
> unregister_netdev only if register_netdev has been called.
> 
> Signed-off-by: Tuba Yavuz <tuba@ece.ufl.edu>

This patch is corrupted by your email client.

^ permalink raw reply

* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Eric Dumazet @ 2019-02-08 23:09 UTC (permalink / raw)
  To: Heiner Kallweit, Sander Eikelenboom,
	Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <140d0df7-1775-5457-aa03-b21ece250a72@gmail.com>



On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>> L.S.,
>>>>>>
>>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen I got the nasty splat below,
>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>
>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>
>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>
>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>
>>>> Hmm, I did some digging and I think:
>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>
>>> You're right. Thought this was added in 4.20 already.
>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>> have onboard Realtek network I have quite a few testers out there.
>>> Does the issue occur under specific circumstances like very high load?
>>
>> Yep, the box is already quite contended with the Xen VMs, and if I remember correctly it occurred while compiling a kernel
>> on the host.
>>
>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>> as author of the underlying changes.
>>
>> It could also be the barriers weren't that unneeded as assumed.
> 
> The barriers were removed after adding xmit_more handling. Therefore it would be good to
> test also with only 
> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
> removed.
> 
>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>
> Sure, thanks.
> 
>> BTW am i correct these patches are merely optimizations ?
> 
> Yes
> 
>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>> to revert them for 5.0 and try again for 5.1 ?
>>
> Before removing both it would be good to test with only the barrier-removal removed.
> 

Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
looks buggy to me, since the skb might have been freed already on another CPU when you call
__netdev_sent_queue(dev, skb->len, skb->xmit_more).

You could try :

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
        dma_addr_t mapping;
        u32 opts[2], len;
        bool stop_queue;
+       bool door_bell;
        int frags;
 
        if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
@@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
        /* Force memory writes to complete before releasing descriptor */
        dma_wmb();
 
+       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
+
        txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
 
        /* Force all memory writes to complete before notifying device */
@@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
        if (unlikely(stop_queue))
                netif_stop_queue(dev);
 
-       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
+       if (door_bell) {
                RTL_W8(tp, TxPoll, NPQ);
                mmiowb();
        }
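
The ordering issue behind the diff above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not driver code: struct pkt, struct tx_slot, struct txq, account_sent() and complete_slot() are hypothetical stand-ins for the skb, the TX descriptor, the BQL state, __netdev_sent_queue() and TX completion, and the doorbell decision is simplified to "ring unless more packets follow".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb: only the fields the sketch needs. */
struct pkt {
	unsigned int len;
	bool more;		/* plays the role of xmit_more */
};

/* Hypothetical stand-in for one TX descriptor slot. */
struct tx_slot {
	struct pkt *pkt;
	bool owned_by_hw;
};

/* Hypothetical byte accounting for the queue. */
struct txq {
	unsigned long queued_bytes;
};

/* Simplified accounting: returns true when the doorbell should be rung. */
static bool account_sent(struct txq *q, unsigned int bytes, bool more)
{
	q->queued_bytes += bytes;
	return !more;
}

/* Simulated completion path: frees the packet once the slot is released. */
static void complete_slot(struct tx_slot *slot)
{
	if (slot->owned_by_hw) {
		free(slot->pkt);
		slot->pkt = NULL;
		slot->owned_by_hw = false;
	}
}

static bool xmit_one(struct txq *q, struct tx_slot *slot, struct pkt *p)
{
	bool door_bell;

	assert(p != NULL);
	slot->pkt = p;

	/* Read p->len and p->more while the packet is still ours. */
	door_bell = account_sent(q, p->len, p->more);

	/* Release the slot: completion may free p at any time from here. */
	slot->owned_by_hw = true;
	complete_slot(slot);	/* worst case: completion runs immediately */

	/* p must not be dereferenced past this point. */
	return door_bell;
}
```

The design point is the same as in the diff: everything that needs the packet's fields happens before ownership is handed over, and only the saved boolean is used afterwards.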



^ permalink raw reply related

* Re: [PATCH] net: sfp: do not probe SFP module before we're attached
From: David Miller @ 2019-02-08 23:11 UTC (permalink / raw)
  To: rmk+kernel; +Cc: netdev, andrew, f.fainelli, hkallweit1
In-Reply-To: <E1grKog-0004c1-OB@rmk-PC.armlinux.org.uk>

From: Russell King <rmk+kernel@armlinux.org.uk>
Date: Wed, 06 Feb 2019 10:52:30 +0000

> When we probe a SFP module, we expect to be able to call the upstream
> device's module_insert() function so that the upstream link can be
> configured.  However, when the upstream device is delayed, we currently
> may end up probing the module before the upstream device is available,
> and lose the module_insert() call.
> 
> Avoid this by holding off probing the module until the SFP bus is
> properly connected to both the SFP socket driver and the upstream
> driver.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Applied, thanks Russell.

-stable?

^ permalink raw reply

* Re: Resource management for ndo_xdp_xmit (Was: [PATCH net] virtio_net: Account for tx bytes and packets on sending xdp_frames)
From: Saeed Mahameed @ 2019-02-08 23:17 UTC (permalink / raw)
  To: brouer@redhat.com
  Cc: thoiland@redhat.com, hawk@kernel.org,
	virtualization@lists.linux-foundation.org, borkmann@iogearbox.net,
	Tariq Toukan, toke@toke.dk, john.fastabend@gmail.com,
	mst@redhat.com, jakub.kicinski@netronome.com, dsahern@gmail.com,
	netdev@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
	makita.toshiaki@lab.ntt.co.jp
In-Reply-To: <9e5e6882566ac67276209b35ec112a824b256bff.camel@mellanox.com>

On Thu, 2019-02-07 at 19:08 +0000, Saeed Mahameed wrote:
> On Thu, 2019-02-07 at 08:48 +0100, Jesper Dangaard Brouer wrote:
> > On Wed, 6 Feb 2019 00:06:33 +0000 Saeed Mahameed <
> > saeedm@mellanox.com
> > > wrote:
> > > On Mon, 2019-02-04 at 19:13 -0800, David Ahern wrote:
> > [...]
> > > > mlx5 needs some work. As I recall it still has the bug/panic
> > > > removing xdp programs - at least I don't recall seeing a patch
> > > > for
> > > > it.  
> > > 
> > > Only when xdp_redirect to mlx5, and removing the program while
> > > redirect is happening, this is actually due to a lack of
> > > synchronization means between different drivers, we have some
> > > ideas
> > > to overcome this using a standard XDP API, or just use a hack in
> > > mlx5
> > > driver which i don't like:
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=topic/xdp-redirect-fix&id=a3652d03cc35fd3ad62744986c8ccaca74c9f20c
> > > 
> > > I will be working on this towards the end of this week.
> > 
> > Toke and I have been discussing how to solve this.
> > 
> > The main idea for fixing this is to tie resource allocation to
> > interface
> > insertion into interface maps (kernel/bpf/devmap.c). As the
> > =devmap=
> > already have the needed synchronisation mechanisms and steps for
> > safely
> > adding and removing =net_devices= (e.g. stopping RX side, flushing
> > remaining frames, waiting RCU period before freeing objects, etc.)
> > 
> > As described here:
> >  
> > https://github.com/xdp-project/xdp-project/blob/master/xdp-project.org#better-ndo_xdp_xmit-resource-management
> > 
> > --Jesper
> 
> Yes you already suggested this approach @LPC:
> 
> So 
> 1) on dev_map_update_elem() we will call
> dev->dev->ndo_bpf() to notify the device on the intention to
> start/stop
> redirect, and wait for it to create/destroy the HW resources
> before/after actually updating the map
> 

Silly me: dev_map_update_elem must be atomic, so we can't hook driver
resource allocation to it. It must come as a separate request (syscall)
from user space to create the XDP redirect resources.


> But:
> 2) this won't totally solve our problem, since sometimes the driver
> can
> decide to recreate (change of configuration) hw resources on the fly
> while redirect/devmap is already happening, so we need some kind of a
> dev_map_notification or a flag with rcu synch, for when the driver
> want
> to make the xdp redirect resources unavailable.
> 

I will focus on this problem first, then figure out how to create XDP
redirect resources without actually attaching a dummy xdp program.

> Thanks,
> Saeed.

^ permalink raw reply

* Re: [PATCH v2 1/3] net/macb: bindings doc/trivial: fix documentation for sama5d3 10/100 interface
From: David Miller @ 2019-02-08 23:20 UTC (permalink / raw)
  To: nicolas.ferre
  Cc: alexandre.belloni, ludovic.desroches, linux-arm-kernel, robh+dt,
	linux-kernel, netdev, devicetree
In-Reply-To: <20190206175610.26773-1-nicolas.ferre@microchip.com>

From: Nicolas Ferre <nicolas.ferre@microchip.com>
Date: Wed, 6 Feb 2019 18:56:08 +0100

> This removes a line left while adding the correct compatibility string for
> sama5d3 10/100 interface. Now use the "atmel,sama5d3-macb" string.
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
> Reviewed-by: Rob Herring <robh@kernel.org>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH v2 2/3] net/macb: bindings doc: add sam9x60 binding
From: David Miller @ 2019-02-08 23:20 UTC (permalink / raw)
  To: nicolas.ferre
  Cc: alexandre.belloni, ludovic.desroches, linux-arm-kernel, robh+dt,
	linux-kernel, netdev, devicetree
In-Reply-To: <20190206175610.26773-2-nicolas.ferre@microchip.com>

From: Nicolas Ferre <nicolas.ferre@microchip.com>
Date: Wed, 6 Feb 2019 18:56:09 +0100

> Add the compatibility string documentation for sam9x60 10/100 interface.
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH v2 3/3] net: macb: add sam9x60-macb compatibility string
From: David Miller @ 2019-02-08 23:20 UTC (permalink / raw)
  To: nicolas.ferre
  Cc: alexandre.belloni, ludovic.desroches, linux-arm-kernel, robh+dt,
	linux-kernel, netdev, devicetree
In-Reply-To: <20190206175610.26773-3-nicolas.ferre@microchip.com>

From: Nicolas Ferre <nicolas.ferre@microchip.com>
Date: Wed, 6 Feb 2019 18:56:10 +0100

> Add a new compatibility string for this product. It's using
> at91sam9260-macb layout but has a newer hardware revision: it's safer
> to use its own string.
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH bpf-next v8 4/6] bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c
From: kbuild test robot @ 2019-02-08 23:20 UTC (permalink / raw)
  To: Peter Oskolkov
  Cc: kbuild-all, Alexei Starovoitov, Daniel Borkmann, netdev,
	Peter Oskolkov, David Ahern, Willem de Bruijn, Peter Oskolkov
In-Reply-To: <20190208163849.151626-5-posk@google.com>

[-- Attachment #1: Type: text/plain, Size: 3928 bytes --]

Hi Peter,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/0day-ci/linux/commits/Peter-Oskolkov/bpf-add-BPF_LWT_ENCAP_IP-option-to-bpf_lwt_push_encap/20190209-030743
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: x86_64-randconfig-j0-02040958 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   net//core/lwt_bpf.c: In function 'bpf_lwt_xmit_reroute':
>> net//core/lwt_bpf.c:216:10: warning: missing braces around initializer [-Wmissing-braces]
      struct flowi4 fl4 = {0};
             ^
   net//core/lwt_bpf.c:216:10: warning: (near initialization for 'fl4.__fl_common') [-Wmissing-braces]
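
For readers unfamiliar with this warning: older gcc versions emit -Wmissing-braces when "{0}" is used to initialize a struct whose first member is itself an aggregate, as fl4.__fl_common is here. A minimal user-space sketch of one way to sidestep it, assuming memset-based zeroing is acceptable, follows. struct flow_common, struct flow4 and flow4_init() are hypothetical stand-ins, not the kernel's struct flowi4; whether the patch author chose this or an empty-brace initializer is not stated in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical nested member, standing in for the common flow fields. */
struct flow_common {
	int oif;
	unsigned int mark;
};

/* Hypothetical outer struct: its first member is an aggregate. */
struct flow4 {
	struct flow_common common;
	unsigned int daddr;
	unsigned int saddr;
};

/*
 * Zero the whole struct explicitly instead of writing "= {0}", then fill
 * in only the fields that matter. No brace initializer is involved, so
 * -Wmissing-braces has nothing to complain about.
 */
static void flow4_init(struct flow4 *fl4, int oif, unsigned int daddr)
{
	assert(fl4 != NULL);
	memset(fl4, 0, sizeof(*fl4));
	fl4->common.oif = oif;
	fl4->daddr = daddr;
}
```
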

vim +216 net//core/lwt_bpf.c

   184	
   185	static int bpf_lwt_xmit_reroute(struct sk_buff *skb)
   186	{
   187		struct net_device *l3mdev = l3mdev_master_dev_rcu(skb_dst(skb)->dev);
   188		int oif = l3mdev ? l3mdev->ifindex : 0;
   189		struct dst_entry *dst = NULL;
   190		struct sock *sk;
   191		struct net *net;
   192		bool ipv4;
   193		int err;
   194	
   195		if (skb->protocol == htons(ETH_P_IP)) {
   196			ipv4 = true;
   197		} else if (skb->protocol == htons(ETH_P_IPV6)) {
   198			ipv4 = false;
   199		} else {
   200			pr_warn_once("BPF_LWT_REROUTE xmit: unsupported proto %d\n",
   201				     skb->protocol);
   202			return -EINVAL;
   203		}
   204	
   205		sk = sk_to_full_sk(skb->sk);
   206		if (sk) {
   207			if (sk->sk_bound_dev_if)
   208				oif = sk->sk_bound_dev_if;
   209			net = sock_net(sk);
   210		} else {
   211			net = dev_net(skb_dst(skb)->dev);
   212		}
   213	
   214		if (ipv4) {
   215			struct iphdr *iph = ip_hdr(skb);
 > 216			struct flowi4 fl4 = {0};
   217			struct rtable *rt;
   218	
   219			fl4.flowi4_oif = oif;
   220			fl4.flowi4_mark = skb->mark;
   221			fl4.flowi4_uid = sock_net_uid(net, sk);
   222			fl4.flowi4_tos = RT_TOS(iph->tos);
   223			fl4.flowi4_flags = FLOWI_FLAG_ANYSRC;
   224			fl4.flowi4_proto = iph->protocol;
   225			fl4.daddr = iph->daddr;
   226			fl4.saddr = iph->saddr;
   227	
   228			rt = ip_route_output_key(net, &fl4);
   229			if (IS_ERR(rt) || rt->dst.error)
   230				return -EINVAL;
   231			dst = &rt->dst;
   232		} else {
   233	#if IS_BUILTIN(CONFIG_IPV6)
   234			struct ipv6hdr *iph6 = ipv6_hdr(skb);
   235			struct flowi6 fl6 = {0};
   236	
   237			fl6.flowi6_oif = oif;
   238			fl6.flowi6_mark = skb->mark;
   239			fl6.flowi6_uid = sock_net_uid(net, sk);
   240			fl6.flowlabel = ip6_flowinfo(iph6);
   241			fl6.flowi6_proto = iph6->nexthdr;
   242			fl6.daddr = iph6->daddr;
   243			fl6.saddr = iph6->saddr;
   244	
   245			dst = ip6_route_output(net, skb->sk, &fl6);
   246			if (IS_ERR(dst) || dst->error)
   247				return -EINVAL;
   248	#else
   249			pr_warn_once("BPF_LWT_REROUTE xmit: IPV6 not built-in\n");
   250			return -EINVAL;
   251	#endif
   252		}
   253	
   254		/* Although skb header was reserved in bpf_lwt_push_ip_encap(), it
   255		 * was done for the previous dst, so we are doing it here again, in
   256		 * case the new dst needs much more space. The call below is a noop
   257		 * if there is enough header space in skb.
   258		 */
   259		err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));
   260		if (unlikely(err))
   261			return err;
   262	
   263		skb_dst_drop(skb);
   264		skb_dst_set(skb, dst);
   265	
   266		err = dst_output(dev_net(skb_dst(skb)->dev), skb->sk, skb);
   267		if (unlikely(err))
   268			return err;
   269	
   270		/* ip[6]_finish_output2 understand LWTUNNEL_XMIT_DONE */
   271		return LWTUNNEL_XMIT_DONE;
   272	}
   273	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 29197 bytes --]

^ permalink raw reply

* Re: [PATCH bpf-next v8 4/6] bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c
From: David Ahern @ 2019-02-08 23:24 UTC (permalink / raw)
  To: Peter Oskolkov, Alexei Starovoitov, Daniel Borkmann, netdev
  Cc: Peter Oskolkov, Willem de Bruijn
In-Reply-To: <20190208163849.151626-5-posk@google.com>

On 2/8/19 8:38 AM, Peter Oskolkov wrote:
> This patch builds on top of the previous patch in the patchset,
> which added BPF_LWT_ENCAP_IP mode to bpf_lwt_push_encap. As the
> encapping can result in the skb needing to go via a different
> interface/route/dst, bpf programs can indicate this by returning
> BPF_LWT_REROUTE, which triggers a new route lookup for the skb.
> 
> v8 changes: fix kbuild errors when LWTUNNEL_BPF is builtin, but
>    IPV6 is a module: as LWTUNNEL_BPF can only be either Y or N,
>    call IPV6 routing functions only if they are built-in.

you need to use the ipv6 stub to access v6 functionality when it is a
module.
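
As a rough illustration of the stub idea (a user-space sketch only, not the kernel's actual struct ipv6_stub): built-in code calls through a table of function pointers that starts out pointing at "not supported" defaults and is repointed when the module loads. All names below (v6_stub_ops, v6_stub_register, reroute_v6, and so on) are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operations table filled in by the optional module. */
struct v6_stub_ops {
	int (*route_lookup)(int oif);
};

/* Default used while the "module" is not loaded. */
static int v6_unavailable_lookup(int oif)
{
	(void)oif;
	return -EAFNOSUPPORT;
}

static const struct v6_stub_ops v6_default_ops = {
	.route_lookup = v6_unavailable_lookup,
};

/* Built-in code only ever calls through this pointer. */
static const struct v6_stub_ops *v6_stub = &v6_default_ops;

/* Called by the "module" at load time to publish its implementation. */
static void v6_stub_register(const struct v6_stub_ops *ops)
{
	assert(ops != NULL && ops->route_lookup != NULL);
	v6_stub = ops;
}

/* What the "module" would provide once loaded. */
static int v6_module_lookup(int oif)
{
	return oif;	/* pretend the lookup succeeded on this interface */
}

static const struct v6_stub_ops v6_module_ops = {
	.route_lookup = v6_module_lookup,
};

/* Built-in caller: no direct symbol reference into the module. */
static int reroute_v6(int oif)
{
	return v6_stub->route_lookup(oif);
}
```

With this shape the built-in caller links fine whether or not the module is present, and simply gets an error until the module registers itself.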


^ permalink raw reply

* Re: [RFC PATCH] perf, bpf: Retain kernel executable code in memory to aid Intel PT tracing
From: Alexei Starovoitov @ 2019-02-08 23:29 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Ingo Molnar, Peter Zijlstra, Andi Kleen, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, Song Liu, Daniel Borkmann,
	Alexei Starovoitov, linux-kernel, netdev
In-Reply-To: <20190207111901.2399-1-adrian.hunter@intel.com>

On Thu, Feb 07, 2019 at 01:19:01PM +0200, Adrian Hunter wrote:
> Subject to memory pressure and other limits, retain executable code, such
> as JIT-compiled bpf, in memory instead of freeing it immediately it is no
> longer needed for execution.
> 
> While perf is primarily aimed at statistical analysis, tools like Intel
> PT can aim to provide a trace of exactly what happened. As such, corner
> cases that can be overlooked statistically need to be addressed. For
> example, there is a gap where JIT-compiled bpf can be freed from memory
> before a tracer has a chance to read it out through the bpf syscall.
> While that can be ignored statistically, it contributes to a death by
> 1000 cuts for tracers attempting to assemble exactly what happened. This is
> a bit gratuitous given that retaining the executable code is relatively
> simple, and the amount of memory involved relatively small. The retained
> executable code is then available in memory images such as /proc/kcore.
> 
> This facility could perhaps be extended also to init sections.
> 
> Note that this patch is compile tested only and, at present, is missing
> the ability to retain symbols.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  arch/x86/Kconfig.cpu       |   1 +
>  include/linux/filter.h     |   4 +
>  include/linux/xc_retain.h  |  49 ++++++++++
>  init/Kconfig               |   6 ++
>  kernel/Makefile            |   1 +
>  kernel/bpf/core.c          |  44 ++++++++-
>  kernel/xc_retain.c         | 183 +++++++++++++++++++++++++++++++++++++
>  net/core/sysctl_net_core.c |  62 +++++++++++++
>  8 files changed, 349 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/xc_retain.h
>  create mode 100644 kernel/xc_retain.c
> 
> diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
> index 6adce15268bd..21dcd064c272 100644
> --- a/arch/x86/Kconfig.cpu
> +++ b/arch/x86/Kconfig.cpu
> @@ -389,6 +389,7 @@ menuconfig PROCESSOR_SELECT
>  config CPU_SUP_INTEL
>  	default y
>  	bool "Support Intel processors" if PROCESSOR_SELECT
> +	select XC_RETAIN if PERF_EVENTS && BPF_JIT
>  	---help---
>  	  This enables detection, tunings and quirks for Intel processors
>  
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index d531d4250bff..40b9f601e18f 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -851,6 +851,10 @@ extern int bpf_jit_enable;
>  extern int bpf_jit_harden;
>  extern int bpf_jit_kallsyms;
>  extern long bpf_jit_limit;
> +extern unsigned int bpf_jit_retain_min;
> +extern unsigned int bpf_jit_retain_max;
> +
> +void bpf_jit_retain_update_sz(void);
>  
>  typedef void (*bpf_jit_fill_hole_t)(void *area, unsigned int size);
>  
> diff --git a/include/linux/xc_retain.h b/include/linux/xc_retain.h
> new file mode 100644
> index 000000000000..e79dc138bab8
> --- /dev/null
> +++ b/include/linux/xc_retain.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2019 Intel Corporation.
> + */
> +#ifndef _LINUX_XC_RETAIN_H
> +#define _LINUX_XC_RETAIN_H
> +
> +#include <linux/list.h>
> +#include <linux/shrinker.h>
> +#include <linux/spinlock.h>
> +
> +struct xc_retain_ops {
> +	void (*free)(void *addr);
> +};
> +
> +struct xc_retain {
> +	struct list_head list;
> +	struct list_head items;
> +	const struct xc_retain_ops ops;
> +	unsigned int min_pages;
> +	unsigned int max_pages;
> +	unsigned int current_pages;
> +	unsigned int item_cnt;
> +	spinlock_t lock;
> +	struct shrinker shrinker;
> +};
> +
> +#ifdef CONFIG_XC_RETAIN
> +int xc_retain_register(struct xc_retain *xr);
> +void xc_retain_binary(struct xc_retain *xr, void *addr, unsigned int pages);
> +void xc_retain_set_min_pages(struct xc_retain *xr, unsigned int min_pages);
> +void xc_retain_set_max_pages(struct xc_retain *xr, unsigned int max_pages);
> +#else
> +static inline int xc_retain_register(struct xc_retain *xr)
> +{
> +	return 0;
> +}
> +static inline void xc_retain_binary(struct xc_retain *xr, void *addr,
> +				    unsigned int pages)
> +{
> +	xr->ops.free(addr);
> +}
> +static inline void xc_retain_set_max_pages(struct xc_retain *xr,
> +					   unsigned int max_pages)
> +{
> +}
> +#endif
> +
> +#endif
> diff --git a/init/Kconfig b/init/Kconfig
> index c9386a365eea..954c288cabdc 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1550,6 +1550,12 @@ config EMBEDDED
>  	  an embedded system so certain expert options are available
>  	  for configuration.
>  
> +config XC_RETAIN
> +	bool
> +	help
> +	  Retain kernel executable code (e.g. jitted BPF) in memory after it
> +	  would normally be freed.
> +
>  config HAVE_PERF_EVENTS
>  	bool
>  	help
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 6aa7543bcdb2..5df40e2a934e 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -98,6 +98,7 @@ obj-$(CONFIG_TRACEPOINTS) += trace/
>  obj-$(CONFIG_IRQ_WORK) += irq_work.o
>  obj-$(CONFIG_CPU_PM) += cpu_pm.o
>  obj-$(CONFIG_BPF) += bpf/
> +obj-$(CONFIG_XC_RETAIN) += xc_retain.o
>  
>  obj-$(CONFIG_PERF_EVENTS) += events/
>  
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 19c49313c709..7fd235d235c2 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -34,6 +34,7 @@
>  #include <linux/kallsyms.h>
>  #include <linux/rcupdate.h>
>  #include <linux/perf_event.h>
> +#include <linux/xc_retain.h>
>  
>  #include <asm/unaligned.h>
>  
> @@ -480,6 +481,10 @@ int bpf_jit_enable   __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_ALWAYS_ON);
>  int bpf_jit_harden   __read_mostly;
>  int bpf_jit_kallsyms __read_mostly;
>  long bpf_jit_limit   __read_mostly;
> +#define BPF_JIT_RETAIN_MIN 0
> +#define BPF_JIT_RETAIN_MAX 16
> +unsigned int bpf_jit_retain_min __read_mostly = BPF_JIT_RETAIN_MIN;
> +unsigned int bpf_jit_retain_max __read_mostly = BPF_JIT_RETAIN_MAX;
>  
>  static __always_inline void
>  bpf_get_prog_addr_region(const struct bpf_prog *prog,
> @@ -795,6 +800,43 @@ void bpf_jit_binary_free(struct bpf_binary_header *hdr)
>  	bpf_jit_uncharge_modmem(pages);
>  }
>  
> +#ifdef CONFIG_XC_RETAIN
> +static struct xc_retain bpf_jit_retain = {
> +	.min_pages = BPF_JIT_RETAIN_MIN,
> +	.max_pages = BPF_JIT_RETAIN_MAX,
> +	.ops = {
> +		.free = module_memfree,
> +	},
> +};
> +
> +void bpf_jit_retain_update_sz(void)
> +{
> +	xc_retain_set_min_pages(&bpf_jit_retain, bpf_jit_retain_min);
> +	xc_retain_set_max_pages(&bpf_jit_retain, bpf_jit_retain_max);
> +}
> +
> +static int __init bpf_jit_retain_init(void)
> +{
> +	return xc_retain_register(&bpf_jit_retain);
> +}
> +subsys_initcall(bpf_jit_retain_init);
> +
> +static void bpf_jit_binary_retain(struct bpf_prog *fp,
> +				  struct bpf_binary_header *hdr)
> +{
> +	u32 pages = hdr->pages;
> +
> +	xc_retain_binary(&bpf_jit_retain, hdr, pages);
> +	bpf_jit_uncharge_modmem(pages);
> +}
> +#else
> +static void bpf_jit_binary_retain(struct bpf_prog *fp,
> +				  struct bpf_binary_header *hdr)
> +{
> +	return bpf_jit_binary_free(hdr);
> +}
> +#endif

I'm strongly against this approach.

I understand that it's under CONFIG, but turning the kernel
into a garbage collection nightmare, even under a config option
or sysctl, is not an option.
In many cases bpf progs are loaded/unloaded a lot.
Consider a CI test system that runs tests 24/7:
bpf progs are loaded/unloaded in huge numbers.
Such a system will suffer non-deterministic test and
performance results due to shrinkers.
perf analysis with PT becomes inaccurate and the main goal
of retaining accurate instruction info is not achieved.
The bpf_jit_retain_min/max tunables are not an option either.
Please see how perf record handles bpf prog load/unload.
What stops you from doing the same for PT?


^ permalink raw reply

* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Sander Eikelenboom @ 2019-02-08 23:34 UTC (permalink / raw)
  To: Heiner Kallweit, Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <140d0df7-1775-5457-aa03-b21ece250a72@gmail.com>

On 08/02/2019 22:50, Heiner Kallweit wrote:
> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>> L.S.,
>>>>>>
>>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen I got the nasty splat below,
>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>
>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>
>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>
>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>
>>>> Hmm, I did some digging and I think:
>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>
>>> You're right. Thought this was added in 4.20 already.
>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>> have onboard Realtek network I have quite a few testers out there.
>>> Does the issue occur under specific circumstances like very high load?
>>
>> Yep, the box is already quite contended with the Xen VMs, and if I remember correctly it occurred while compiling a kernel
>> on the host.
>>
>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>> as author of the underlying changes.
>>
>> It could also be the barriers weren't that unneeded as assumed.
> 
> The barriers were removed after adding xmit_more handling. Therefore it would be good to
> test also with only 
> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
> removed.

*arghh* *grmbl*

with both:
    bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3
    and
    2e6eedb4813e34d8d84ac0eb3afb668966f3f356 
reverted, I get yet another splat:

[ 3769.246083] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[ 3769.246095] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
[ 3769.246096] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[ 3769.246098] Call Trace:
[ 3769.246104]  <IRQ>
[ 3769.246114]  dump_stack+0x5c/0x7b
[ 3769.246120]  warn_alloc+0x103/0x190
[ 3769.246122]  __alloc_pages_nodemask+0xe3d/0xe80
[ 3769.246128]  ? inet_gro_receive+0x232/0x2c0
[ 3769.246130]  page_frag_alloc+0x117/0x150
[ 3769.246132]  __napi_alloc_skb+0x83/0xd0
[ 3769.246137]  rtl8169_poll+0x210/0x640
[ 3769.246140]  net_rx_action+0x23d/0x370
[ 3769.246145]  __do_softirq+0xed/0x229
[ 3769.246149]  irq_exit+0xb7/0xc0
[ 3769.246152]  xen_evtchn_do_upcall+0x27/0x40
[ 3769.246154]  xen_do_hypervisor_callback+0x29/0x40
[ 3769.246155]  </IRQ>
[ 3769.246161] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
[ 3769.246163] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
[ 3769.246164] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
[ 3769.246166] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
[ 3769.246167] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
[ 3769.246167] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
[ 3769.246168] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
[ 3769.246169] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
[ 3769.246173]  _raw_spin_lock+0x16/0x20
[ 3769.246176]  list_lru_add+0x59/0x170
[ 3769.246179]  inode_lru_list_add+0x1b/0x40
[ 3769.246182]  iput+0x18b/0x1a0
[ 3769.246184]  __dentry_kill+0xc5/0x170
[ 3769.246186]  shrink_dentry_list+0x93/0x1c0
[ 3769.246187]  prune_dcache_sb+0x4d/0x70
[ 3769.246191]  super_cache_scan+0x104/0x190
[ 3769.246194]  do_shrink_slab+0x12c/0x1e0
[ 3769.246196]  shrink_slab+0xdf/0x2b0
[ 3769.246198]  shrink_node+0x158/0x470
[ 3769.246200]  do_try_to_free_pages+0xd1/0x380
[ 3769.246202]  try_to_free_pages+0xb2/0xe0
[ 3769.246204]  __alloc_pages_nodemask+0x603/0xe80
[ 3769.246207]  ? xas_load+0x9/0x80
[ 3769.246209]  ? find_get_entry+0x58/0x120
[ 3769.246210]  pagecache_get_page+0xde/0x210
[ 3769.246213]  grab_cache_page_write_begin+0x17/0x30
[ 3769.246215]  ext4_da_write_begin+0xc4/0x340
[ 3769.246217]  generic_perform_write+0xb8/0x1b0
[ 3769.246219]  __generic_file_write_iter+0x13c/0x1b0
[ 3769.246223]  ext4_file_write_iter+0x121/0x3c0
[ 3769.246225]  __vfs_write+0x123/0x1a0
[ 3769.246226]  vfs_write+0xab/0x1a0
[ 3769.246229]  ksys_write+0x4d/0xc0
[ 3769.246232]  do_syscall_64+0x49/0x100
[ 3769.246234]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3769.246237] RIP: 0033:0x7fee5b265730
[ 3769.246238] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
[ 3769.246239] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3769.246240] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
[ 3769.246241] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
[ 3769.246241] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
[ 3769.246242] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
[ 3769.246243] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
[ 3769.246244] Mem-Info:
[ 3769.246249] active_anon:152383 inactive_anon:99216 isolated_anon:0
                active_file:51569 inactive_file:85922 isolated_file:0
                unevictable:552 dirty:6866 writeback:0 unstable:0
                slab_reclaimable:6707 slab_unreclaimable:16166
                mapped:1870 shmem:6 pagetables:2716 bounce:0
                free:3639 free_pcp:900 free_cma:0
[ 3769.246252] Node 0 active_anon:609532kB inactive_anon:396864kB active_file:206276kB inactive_file:343688kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:7480kB dirty:27464kB writeback:0kB shmem:24kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 3769.246253] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:8056kB inactive_anon:0kB active_file:92kB inactive_file:148kB unevictable:0kB writepending:8kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:20kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 3769.246256] lowmem_reserve[]: 0 1865 1865 1865
[ 3769.246258] Node 0 DMA32 free:7076kB min:19472kB low:21380kB high:23288kB active_anon:601840kB inactive_anon:396512kB active_file:206216kB inactive_file:343644kB unevictable:2208kB writepending:27256kB present:2080768kB managed:1833792kB mlocked:2208kB kernel_stack:9392kB pagetables:10844kB bounce:0kB free_pcp:3600kB local_pcp:596kB free_cma:0kB
[ 3769.246260] lowmem_reserve[]: 0 0 0 0
[ 3769.246262] Node 0 DMA: 6*4kB (UE) 4*8kB (UME) 4*16kB (UME) 2*32kB (UE) 6*64kB (UE) 2*128kB (UM) 4*256kB (UME) 3*512kB (UME) 2*1024kB (ME) 1*2048kB (M) 0*4096kB = 7480kB
[ 3769.246267] Node 0 DMA32: 66*4kB (UM) 271*8kB (UME) 218*16kB (UME) 45*32kB (UME) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7360kB
[ 3769.246272] 144878 total pagecache pages
[ 3769.246276] 6812 pages in swap cache
[ 3769.246277] Swap cache stats: add 62616, delete 55806, find 31/55
[ 3769.246278] Free swap  = 3943164kB
[ 3769.246278] Total swap = 4194300kB
[ 3769.246279] 524181 pages RAM
[ 3769.246279] 0 pages HighMem/MovableOnly
[ 3769.246280] 61765 pages reserved
[ 3769.246280] 0 pages cma reserved
[ 3769.246284] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[ 3769.246286] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
[ 3769.246287] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[ 3769.246287] Call Trace:
[ 3769.246288]  <IRQ>
[ 3769.246290]  dump_stack+0x5c/0x7b
[ 3769.246291]  warn_alloc+0x103/0x190
[ 3769.246293]  __alloc_pages_nodemask+0xe3d/0xe80
[ 3769.246294]  ? inet_gro_receive+0x232/0x2c0
[ 3769.246296]  page_frag_alloc+0x117/0x150
[ 3769.246297]  __napi_alloc_skb+0x83/0xd0
[ 3769.246299]  rtl8169_poll+0x210/0x640
[ 3769.246300]  net_rx_action+0x23d/0x370
[ 3769.246302]  __do_softirq+0xed/0x229
[ 3769.246304]  irq_exit+0xb7/0xc0
[ 3769.246305]  xen_evtchn_do_upcall+0x27/0x40
[ 3769.246306]  xen_do_hypervisor_callback+0x29/0x40
[ 3769.246307]  </IRQ>
[ 3769.246308] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
[ 3769.246310] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
[ 3769.246310] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
[ 3769.246311] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
[ 3769.246312] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
[ 3769.246313] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
[ 3769.246313] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
[ 3769.246314] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
[ 3769.246316]  _raw_spin_lock+0x16/0x20
[ 3769.246317]  list_lru_add+0x59/0x170
[ 3769.246318]  inode_lru_list_add+0x1b/0x40
[ 3769.246320]  iput+0x18b/0x1a0
[ 3769.246321]  __dentry_kill+0xc5/0x170
[ 3769.246322]  shrink_dentry_list+0x93/0x1c0
[ 3769.246323]  prune_dcache_sb+0x4d/0x70
[ 3769.246325]  super_cache_scan+0x104/0x190
[ 3769.246326]  do_shrink_slab+0x12c/0x1e0
[ 3769.246328]  shrink_slab+0xdf/0x2b0
[ 3769.246329]  shrink_node+0x158/0x470
[ 3769.246331]  do_try_to_free_pages+0xd1/0x380
[ 3769.246333]  try_to_free_pages+0xb2/0xe0
[ 3769.246334]  __alloc_pages_nodemask+0x603/0xe80
[ 3769.246336]  ? xas_load+0x9/0x80
[ 3769.246337]  ? find_get_entry+0x58/0x120
[ 3769.246338]  pagecache_get_page+0xde/0x210
[ 3769.246340]  grab_cache_page_write_begin+0x17/0x30
[ 3769.246341]  ext4_da_write_begin+0xc4/0x340
[ 3769.246342]  generic_perform_write+0xb8/0x1b0
[ 3769.246344]  __generic_file_write_iter+0x13c/0x1b0
[ 3769.246345]  ext4_file_write_iter+0x121/0x3c0
[ 3769.246347]  __vfs_write+0x123/0x1a0
[ 3769.246348]  vfs_write+0xab/0x1a0
[ 3769.246349]  ksys_write+0x4d/0xc0
[ 3769.246350]  do_syscall_64+0x49/0x100
[ 3769.246352]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3769.246353] RIP: 0033:0x7fee5b265730
[ 3769.246354] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
[ 3769.246354] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3769.246355] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
[ 3769.246356] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
[ 3769.246357] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
[ 3769.246357] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
[ 3769.246358] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
[ 3769.246364] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[ 3769.246366] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
[ 3769.246366] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[ 3769.246366] Call Trace:
[ 3769.246367]  <IRQ>
[ 3769.246368]  dump_stack+0x5c/0x7b
[ 3769.246370]  warn_alloc+0x103/0x190
[ 3769.246371]  __alloc_pages_nodemask+0xe3d/0xe80
[ 3769.246373]  ? inet_gro_receive+0x232/0x2c0
[ 3769.246374]  page_frag_alloc+0x117/0x150
[ 3769.246375]  __napi_alloc_skb+0x83/0xd0
[ 3769.246376]  rtl8169_poll+0x210/0x640
[ 3769.246378]  net_rx_action+0x23d/0x370
[ 3769.246379]  __do_softirq+0xed/0x229
[ 3769.246381]  irq_exit+0xb7/0xc0
[ 3769.246382]  xen_evtchn_do_upcall+0x27/0x40
[ 3769.246383]  xen_do_hypervisor_callback+0x29/0x40
[ 3769.246383]  </IRQ>
[ 3769.246385] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
[ 3769.246386] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
[ 3769.246387] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
[ 3769.246388] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
[ 3769.246388] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
[ 3769.246389] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
[ 3769.246390] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
[ 3769.246390] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
[ 3769.246392]  _raw_spin_lock+0x16/0x20
[ 3769.246393]  list_lru_add+0x59/0x170
[ 3769.246395]  inode_lru_list_add+0x1b/0x40
[ 3769.246396]  iput+0x18b/0x1a0
[ 3769.246397]  __dentry_kill+0xc5/0x170
[ 3769.246398]  shrink_dentry_list+0x93/0x1c0
[ 3769.246399]  prune_dcache_sb+0x4d/0x70
[ 3769.246401]  super_cache_scan+0x104/0x190
[ 3769.246402]  do_shrink_slab+0x12c/0x1e0
[ 3769.246404]  shrink_slab+0xdf/0x2b0
[ 3769.246405]  shrink_node+0x158/0x470
[ 3769.246407]  do_try_to_free_pages+0xd1/0x380
[ 3769.246408]  try_to_free_pages+0xb2/0xe0
[ 3769.246410]  __alloc_pages_nodemask+0x603/0xe80
[ 3769.246411]  ? xas_load+0x9/0x80
[ 3769.246413]  ? find_get_entry+0x58/0x120
[ 3769.246414]  pagecache_get_page+0xde/0x210
[ 3769.246415]  grab_cache_page_write_begin+0x17/0x30
[ 3769.246416]  ext4_da_write_begin+0xc4/0x340
[ 3769.246418]  generic_perform_write+0xb8/0x1b0
[ 3769.246420]  __generic_file_write_iter+0x13c/0x1b0
[ 3769.246421]  ext4_file_write_iter+0x121/0x3c0
[ 3769.246422]  __vfs_write+0x123/0x1a0
[ 3769.246423]  vfs_write+0xab/0x1a0
[ 3769.246424]  ksys_write+0x4d/0xc0
[ 3769.246426]  do_syscall_64+0x49/0x100
[ 3769.246427]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3769.246428] RIP: 0033:0x7fee5b265730
[ 3769.246429] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
[ 3769.246430] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3769.246431] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
[ 3769.246431] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
[ 3769.246432] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
[ 3769.246433] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
[ 3769.246433] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
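
For anyone skimming the Mem-Info dump above: the DMA32 zone reports free:7076kB against min:19472kB, which is consistent with the order-0 GFP_ATOMIC allocation in rtl8169_poll() failing. Below is a minimal sketch that pulls those numbers out of the zone lines and flags zones under their min watermark. The helper names are made up for illustration, and the comparison is a simplification: the kernel's real watermark check for atomic allocations is more involved than a plain free < min test.

```python
import re

# Zone lines copied (abridged) from the Mem-Info dump in this report.
ZONE_LINES = [
    "Node 0 DMA free:7480kB min:44kB low:56kB high:68kB",
    "Node 0 DMA32 free:7076kB min:19472kB low:21380kB high:23288kB",
]

# Matches the per-zone summary lines; the per-node line has no "free:" field
# directly after the zone name, so it is skipped.
ZONE_RE = re.compile(
    r"Node (?P<node>\d+) (?P<zone>\S+) free:(?P<free>\d+)kB "
    r"min:(?P<min>\d+)kB low:(?P<low>\d+)kB high:(?P<high>\d+)kB"
)


def parse_zone(line):
    """Return (zone name, free kB, min kB), or None if the line does not match."""
    m = ZONE_RE.search(line)
    if m is None:
        return None
    return m["zone"], int(m["free"]), int(m["min"])


def zones_below_min(lines):
    """Names of zones whose free memory is under the min watermark."""
    result = []
    for line in lines:
        parsed = parse_zone(line)
        if parsed is not None and parsed[1] < parsed[2]:
            result.append(parsed[0])
    return result


if __name__ == "__main__":
    print(zones_below_min(ZONE_LINES))
```

The regex uses search() rather than match(), so the same helper should also work on raw dmesg lines that still carry the "[ 3769.246256]" timestamp prefix.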


 
>> Since we are almost at RC6, I took the liberty of CCing Eric now.
>>
> Sure, thanks.
> 
>> BTW, am I correct that these patches are merely optimizations?
> 
> Yes
> 
>> If so, and provided they revert cleanly, perhaps at this point in the RCs we should
>> consider reverting them for 5.0 and trying again for 5.1?
>>
> Before reverting both, it would be good to test with only the barrier-removal patch reverted.
> 
>> --
>> Sander
>>
> Heiner
> 
>>
>>>
>>>> would be candidates, which were merged in 5.0.
>>>>
>>>> I have reverted the first two, see how that works out.
>>>>
>>>> --
>>>> Sander
>>>>
>>> Heiner
>>>
>>>>  
>>>>>> --
>>>>>> Sander
>>>>>>
>>>>> Heiner
>>>>>
>>>>>>
>>>>>> [ 6466.554866] kernel BUG at lib/dynamic_queue_limits.c:27!
>>>>>> [ 6466.571425] invalid opcode: 0000 [#1] SMP NOPTI
>>>>>> [ 6466.585890] CPU: 3 PID: 7057 Comm: as Not tainted 5.0.0-rc5-20190208-thp-net-florian-doflr+ #1
>>>>>> [ 6466.598693] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>>> [ 6466.611579] RIP: e030:dql_completed+0x126/0x140
>>>>>> [ 6466.624339] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>>> [ 6466.648130] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>>> [ 6466.659616] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>>> [ 6466.672835] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>>> [ 6466.684521] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>>> [ 6466.696824] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>>> [ 6466.709953] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>>> [ 6466.722165] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>>> [ 6466.733228] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 6466.746581] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>>> [ 6466.758366] Call Trace:
>>>>>> [ 6466.768118]  <IRQ>
>>>>>> [ 6466.778214]  rtl8169_poll+0x4f4/0x640
>>>>>> [ 6466.789198]  net_rx_action+0x23d/0x370
>>>>>> [ 6466.798467]  __do_softirq+0xed/0x229
>>>>>> [ 6466.807039]  irq_exit+0xb7/0xc0
>>>>>> [ 6466.815471]  xen_evtchn_do_upcall+0x27/0x40
>>>>>> [ 6466.826647]  xen_do_hypervisor_callback+0x29/0x40
>>>>>> [ 6466.835902]  </IRQ>
>>>>>> [ 6466.845361] RIP: e030:xen_hypercall_mmu_update+0xa/0x20
>>>>>> [ 6466.853390] Code: 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>>>>> [ 6466.874031] RSP: e02b:ffffc90003c0bdd0 EFLAGS: 00000246
>>>>>> [ 6466.883452] RAX: 0000000000000000 RBX: 000000041f83bfe8 RCX: ffffffff8100102a
>>>>>> [ 6466.891986] RDX: deadbeefdeadf00d RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
>>>>>> [ 6466.903402] RBP: 0000000000000fe8 R08: 000000000000000b R09: 0000000000000000
>>>>>> [ 6466.911201] R10: deadbeefdeadf00d R11: 0000000000000246 R12: 800000050c346067
>>>>>> [ 6466.918491] R13: ffff8880607c4fe8 R14: ffff888005082800 R15: 0000000000000000
>>>>>> [ 6466.926647]  ? xen_hypercall_mmu_update+0xa/0x20
>>>>>> [ 6466.938195]  ? xen_set_pte_at+0x78/0xe0
>>>>>> [ 6466.947046]  ? __handle_mm_fault+0xc43/0x1060
>>>>>> [ 6466.955772]  ? do_mmap+0x44b/0x5b0
>>>>>> [ 6466.964410]  ? handle_mm_fault+0xf8/0x200
>>>>>> [ 6466.973290]  ? __do_page_fault+0x231/0x4a0
>>>>>> [ 6466.981973]  ? page_fault+0x8/0x30
>>>>>> [ 6466.990904]  ? page_fault+0x1e/0x30
>>>>>> [ 6466.999585] Modules linked in:
>>>>>> [ 6467.007533] ---[ end trace 94bec01608fe4061 ]---
>>>>>> [ 6467.016751] RIP: e030:dql_completed+0x126/0x140
>>>>>> [ 6467.024271] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>>> [ 6467.039726] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>>> [ 6467.047243] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>>> [ 6467.054202] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>>> [ 6467.062000] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>>> [ 6467.069664] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>>> [ 6467.077715] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>>> [ 6467.084916] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>>> [ 6467.093352] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 6467.101492] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>>> [ 6467.110542] Kernel panic - not syncing: Fatal exception in interrupt
>>>>>> [ 6467.118166] Kernel Offset: disabled
>>>>>> (XEN) [2019-02-08 18:04:48.854] Hardware Dom0 crashed: rebooting machine in 5 seconds.
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
> 



* Re: [PATCH] net: sfp: do not probe SFP module before we're attached
From: Russell King - ARM Linux admin @ 2019-02-08 23:36 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, andrew, f.fainelli, hkallweit1
In-Reply-To: <20190208.151139.1930176272346229162.davem@davemloft.net>

On Fri, Feb 08, 2019 at 03:11:39PM -0800, David Miller wrote:
> From: Russell King <rmk+kernel@armlinux.org.uk>
> Date: Wed, 06 Feb 2019 10:52:30 +0000
> 
> > When we probe a SFP module, we expect to be able to call the upstream
> > device's module_insert() function so that the upstream link can be
> > configured.  However, when the upstream device is delayed, we currently
> > may end up probing the module before the upstream device is available,
> > and lose the module_insert() call.
> > 
> > Avoid this by holding off probing the module until the SFP bus is
> > properly connected to both the SFP socket driver and the upstream
> > driver.
> > 
> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> 
> Applied, thanks Russell.
> 
> -stable?

Yes please.  Would you like me to mail the stable team once it hits
mainline?

Thanks.
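
For readers who have not looked at the patch itself, the ordering rule from the commit message can be modelled roughly as follows. This is a toy sketch in Python, not the kernel code: the class and method names are hypothetical, and only module_insert() comes from the description above. It assumes the probe is simply deferred until both the socket side and the upstream side have attached.

```python
class SfpBusModel:
    """Toy model of the ordering rule; every name except module_insert is hypothetical."""

    def __init__(self):
        self.socket_attached = False
        self.upstream_attached = False
        self.module_present = False
        self.module_probed = False
        self.upstream_calls = []  # records calls made to the upstream device

    def _maybe_probe(self):
        # Probe only when a module is present AND both ends are attached,
        # and only once per plugged module.
        ready = self.socket_attached and self.upstream_attached
        if self.module_present and ready and not self.module_probed:
            self.module_probed = True
            self.upstream_calls.append("module_insert")

    def attach_socket(self):
        self.socket_attached = True
        self._maybe_probe()

    def attach_upstream(self):
        self.upstream_attached = True
        self._maybe_probe()

    def module_plugged(self):
        self.module_present = True
        self._maybe_probe()


if __name__ == "__main__":
    bus = SfpBusModel()
    bus.attach_socket()
    bus.module_plugged()   # upstream not attached yet: the probe is held off
    print(bus.upstream_calls)
    bus.attach_upstream()  # both ends attached: the deferred probe runs now
    print(bus.upstream_calls)
```

Re-checking the condition on every state change means the order in which the socket, the upstream device and the module show up does not matter, so the module_insert() notification is not lost when the upstream device arrives late.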

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


* Re: [PATCH] net: sfp: do not probe SFP module before we're attached
From: David Miller @ 2019-02-08 23:42 UTC (permalink / raw)
  To: linux; +Cc: netdev, andrew, f.fainelli, hkallweit1
In-Reply-To: <20190208233651.wdaywntcwwq63xpo@shell.armlinux.org.uk>

From: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Date: Fri, 8 Feb 2019 23:36:51 +0000

> On Fri, Feb 08, 2019 at 03:11:39PM -0800, David Miller wrote:
>> From: Russell King <rmk+kernel@armlinux.org.uk>
>> Date: Wed, 06 Feb 2019 10:52:30 +0000
>> 
>> > When we probe a SFP module, we expect to be able to call the upstream
>> > device's module_insert() function so that the upstream link can be
>> > configured.  However, when the upstream device is delayed, we currently
>> > may end up probing the module before the upstream device is available,
>> > and lose the module_insert() call.
>> > 
>> > Avoid this by holding off probing the module until the SFP bus is
>> > properly connected to both the SFP socket driver and the upstream
>> > driver.
>> > 
>> > Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
>> 
>> Applied, thanks Russell.
>> 
>> -stable?
> 
> Yes please.  Would you like me to mail the stable team once it hits
> mainline?

Networking -stable submissions are handled purely by me, so no, you don't
need to do that.

I've queued this one up, thanks.

Thanks.


