Netdev List
* [PATCH] inet_diag: fix reporting cgroup classid and fallback to priority
From: Konstantin Khlebnikov @ 2019-02-09 10:35 UTC
  To: netdev, linux-kernel, David S. Miller; +Cc: Sasha Levin, linux-sctp

The idiag_ext field in struct inet_diag_req_v2, used as a bitmap of
requested extensions, has only 8 bits. Thus extensions starting from
DCTCPINFO cannot be requested directly. Some of them are included in the
response unconditionally or hook into one of the lower 8 bits.

Extension INET_DIAG_CLASS_ID has had no way to be requested from the beginning.

This patch bundles it with INET_DIAG_TCLASS (ipv6 tos), fixes space
reservation, and documents the behavior of other extensions.

Also this patch adds a fallback to reporting socket priority. This field
is more widely used for traffic classification because ipv4 sockets
automatically map TOS to priority and the default qdisc pfifo_fast knows
about that. But priority can be changed via setsockopt SO_PRIORITY, so
INET_DIAG_TOS isn't enough for predicting the class.

Also cgroup2 obsoletes net_cls classid (it is always zero), but we cannot
reuse this field for reporting cgroup2 id because it is 64-bit (ino+gen).

So, after this patch INET_DIAG_CLASS_ID will report socket priority
for the most common setup, when net_cls isn't set and/or cgroup2 is in use.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Fixes: 0888e372c37f ("net: inet: diag: expose sockets cgroup classid")
---
 include/uapi/linux/inet_diag.h |   16 +++++++++++-----
 net/ipv4/inet_diag.c           |   10 +++++++++-
 net/sctp/diag.c                |    1 +
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/inet_diag.h b/include/uapi/linux/inet_diag.h
index 14565d703291..e8baca85bac6 100644
--- a/include/uapi/linux/inet_diag.h
+++ b/include/uapi/linux/inet_diag.h
@@ -137,15 +137,21 @@ enum {
 	INET_DIAG_TCLASS,
 	INET_DIAG_SKMEMINFO,
 	INET_DIAG_SHUTDOWN,
-	INET_DIAG_DCTCPINFO,
-	INET_DIAG_PROTOCOL,  /* response attribute only */
+
+	/*
+	 * The following extensions cannot be requested in struct inet_diag_req_v2:
+	 * its field idiag_ext has only 8 bits.
+	 */
+
+	INET_DIAG_DCTCPINFO,	/* request as INET_DIAG_VEGASINFO */
+	INET_DIAG_PROTOCOL,	/* response attribute only */
 	INET_DIAG_SKV6ONLY,
 	INET_DIAG_LOCALS,
 	INET_DIAG_PEERS,
 	INET_DIAG_PAD,
-	INET_DIAG_MARK,
-	INET_DIAG_BBRINFO,
-	INET_DIAG_CLASS_ID,
+	INET_DIAG_MARK,		/* only with CAP_NET_ADMIN */
+	INET_DIAG_BBRINFO,	/* request as INET_DIAG_VEGASINFO */
+	INET_DIAG_CLASS_ID,	/* request as INET_DIAG_TCLASS */
 	INET_DIAG_MD5SIG,
 	__INET_DIAG_MAX,
 };
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 1a4e9ff02762..5731670c560b 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -108,6 +108,7 @@ static size_t inet_sk_attr_size(struct sock *sk,
 		+ nla_total_size(1) /* INET_DIAG_TOS */
 		+ nla_total_size(1) /* INET_DIAG_TCLASS */
 		+ nla_total_size(4) /* INET_DIAG_MARK */
+		+ nla_total_size(4) /* INET_DIAG_CLASS_ID */
 		+ nla_total_size(sizeof(struct inet_diag_meminfo))
 		+ nla_total_size(sizeof(struct inet_diag_msg))
 		+ nla_total_size(SK_MEMINFO_VARS * sizeof(u32))
@@ -287,12 +288,19 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk,
 			goto errout;
 	}
 
-	if (ext & (1 << (INET_DIAG_CLASS_ID - 1))) {
+	if (ext & (1 << (INET_DIAG_CLASS_ID - 1)) ||
+	    ext & (1 << (INET_DIAG_TCLASS - 1))) {
 		u32 classid = 0;
 
 #ifdef CONFIG_SOCK_CGROUP_DATA
 		classid = sock_cgroup_classid(&sk->sk_cgrp_data);
 #endif
+		/* Fallback to socket priority if class id isn't set.
+		 * Classful qdiscs use it as direct reference to class.
+		 * For cgroup2 classid is always zero.
+		 */
+		if (!classid)
+			classid = sk->sk_priority;
 
 		if (nla_put_u32(skb, INET_DIAG_CLASS_ID, classid))
 			goto errout;
diff --git a/net/sctp/diag.c b/net/sctp/diag.c
index 078f01a8d582..435847d98b51 100644
--- a/net/sctp/diag.c
+++ b/net/sctp/diag.c
@@ -256,6 +256,7 @@ static size_t inet_assoc_attr_size(struct sctp_association *asoc)
 		+ nla_total_size(1) /* INET_DIAG_TOS */
 		+ nla_total_size(1) /* INET_DIAG_TCLASS */
 		+ nla_total_size(4) /* INET_DIAG_MARK */
+		+ nla_total_size(4) /* INET_DIAG_CLASS_ID */
 		+ nla_total_size(addrlen * asoc->peer.transport_count)
 		+ nla_total_size(addrlen * addrcnt)
 		+ nla_total_size(sizeof(struct inet_diag_meminfo))
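
To illustrate the 8-bit limit described in the commit message, here is a
minimal user-space sketch of how a request bit is derived from an extension
number. The helper names ext_request_bit() and class_id_request_mask() and
the EXT_* constants are made up for this example; the numeric values mirror
the enum in include/uapi/linux/inet_diag.h, and the (1 << (ext - 1)) mapping
is the one tested in inet_sk_diag_fill() in the hunk above.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of a few values from include/uapi/linux/inet_diag.h. */
enum {
	EXT_TCLASS    = 6,	/* INET_DIAG_TCLASS */
	EXT_SHUTDOWN  = 8,	/* INET_DIAG_SHUTDOWN, the last one that fits */
	EXT_DCTCPINFO = 9,	/* INET_DIAG_DCTCPINFO */
	EXT_CLASS_ID  = 17,	/* INET_DIAG_CLASS_ID */
};

/*
 * Hypothetical helper: the bit to set in the 8-bit idiag_ext field for
 * extension number ext, or 0 if the extension has no bit in that field.
 */
static uint8_t ext_request_bit(int ext)
{
	if (ext < 1 || ext > 8)
		return 0;
	return (uint8_t)(1u << (ext - 1));
}

/*
 * Hypothetical helper: the mask a tool could send to get
 * INET_DIAG_CLASS_ID once it is bundled with INET_DIAG_TCLASS.
 */
static uint8_t class_id_request_mask(void)
{
	/* CLASS_ID itself cannot be expressed in an 8-bit bitmap. */
	assert(ext_request_bit(EXT_CLASS_ID) == 0);
	return ext_request_bit(EXT_TCLASS);
}
```

With these definitions, INET_DIAG_SHUTDOWN maps to the top bit of the field,
while INET_DIAG_DCTCPINFO and everything after it map to no bit at all, which
is why the patch piggybacks INET_DIAG_CLASS_ID on the INET_DIAG_TCLASS bit.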



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Sander Eikelenboom @ 2019-02-09 10:07 UTC
  To: Heiner Kallweit, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <70e9a3fe-158a-c3a2-a427-2343bc6c9031@gmail.com>

On 09/02/2019 10:59, Heiner Kallweit wrote:
> On 09.02.2019 10:34, Sander Eikelenboom wrote:
>> On 09/02/2019 10:02, Heiner Kallweit wrote:
>>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>>
>>>>
>>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>>> L.S.,
>>>>>>>>>>
>>>>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen I got the nasty splat below,
>>>>>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>>>>>
>>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>>
>>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>>
>>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>>
>>>>>>>> Hmm, I did some digging and I think:
>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>>
>>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>>
>>>>>> Yep, the box is already quite contented with the Xen VM's and if I remember correctly it occurred while kernel compiling
>>>>>> on the host.
>>>>>>
>>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>>> as author of the underlying changes.
>>>>>>
>>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>>
>>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>>> test also with only 
>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>> removed.
>>>>>
>>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>>
>>>>> Sure, thanks.
>>>>>
>>>>>> BTW am i correct these patches are merely optimizations ?
>>>>>
>>>>> Yes
>>>>>
>>>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>>>
>>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>>
>>>>
>>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>>
>>>> You could try :
>>>>
>>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>         dma_addr_t mapping;
>>>>         u32 opts[2], len;
>>>>         bool stop_queue;
>>>> +       bool door_bell;
>>>>         int frags;
>>>>  
>>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>         /* Force memory writes to complete before releasing descriptor */
>>>>         dma_wmb();
>>>>  
>>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>>> +
>>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>>  
>>>>         /* Force all memory writes to complete before notifying device */
>>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>         if (unlikely(stop_queue))
>>>>                 netif_stop_queue(dev);
>>>>  
>>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>>> +       if (door_bell) {
>>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>>                 mmiowb();
>>>>         }
>>>>
>>> Thanks a lot for checking and for the proposed fix.
>>> Sander, can you try with this patch on top of 5.0-rc5 without removing the two commits?
>>
>> I have already done that during the night. The results:
>> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit that hits the BUG_ON in lib/dynamic_queue_limits.c
>>   (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).
>>
>> - Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c; it has run overnight and I have done a few kernel compiles
>>   this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
>>   compile kernels in the same way as I do now. The only thing I can't say is whether that is due to this change, or if it's something else again.
>>   That makes me somewhat inclined to test the complete revert some more and see whether I can trigger the queue timeout on that or not.
>>
>>   If I can, it is a separate issue.
>>   If I can't, then even with the patch it still looks like a regression in comparison with 4.20.x, for which
>>   a revert would be the right thing to do (since, as you indicated, these are merely optimizations),
>>   which would give us more time for 5.1 to try to solve things on top of the 5.0-release-to-be
>>   (especially since I still seem to have other issues which need to be sorted out and time is limited).
>>
>>   The timeout in question:
>>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>>         [28336.893358] Modules linked in:
>>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>>         [28337.090052] Call Trace:
>>         [28337.103615]  <IRQ>
>>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>>         [28337.128905]  call_timer_fn+0x19/0x90
>>         [28337.141892]  expire_timers+0x8b/0xa0
>>         [28337.153354]  run_timer_softirq+0x7e/0x160
>>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>>         [28337.186734]  __do_softirq+0xed/0x229
>>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>>         [28337.207822]  irq_exit+0xb7/0xc0
>>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>>         [28337.241261]  </IRQ>
>>         [28337.253283] RIP: e033:0xff7e62
>>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
>>
> Thanks for your efforts. As usual this tx timeout trace says basically nothing except
> "timeout" and root cause could be anything. Earlier you reported a memory allocation error,
> did that occur again?
> If we decide to revert, I'd leave removal of the memory barriers in (as it doesn't seem to
> contribute to the issue) and just submit a patch to effectively revert
> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.

I can't say if that is correct, because I haven't tested that.

Another thing I could test is:
 - putting all the r8169 patches (and prerequisites) that went into 5.0
   up to bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 onto 4.20.7 and seeing what that does.
   If that is feasible (not too many prerequisites needed outside of r8169) and if
   you could spare some time to prep such a branch somewhere so I can pull and compile it,
   that would be great.

--
Sander

>> --
>> Sander
>>
>>  
>>>>
>>>> .
>>>>
>>> Heiner
>>>
>>
>>
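
The ordering issue Eric points out in the quoted patch can be sketched in
isolation. The following is a simplified user-space model, not driver code:
struct fake_skb, struct fake_desc, DESC_OWN, account_sent() and xmit_one()
are hypothetical stand-ins for the skb, the TX descriptor, the descriptor
ownership bit, __netdev_sent_queue() and rtl8169_start_xmit(). It assumes
that once the ownership bit is written, the device and the completion path
on another CPU may consume and free the buffer, so every skb field has to
be read before that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DESC_OWN 0x80000000u	/* hypothetical "device owns descriptor" bit */

struct fake_skb {
	uint32_t len;
	bool xmit_more;
};

struct fake_desc {
	uint32_t opts1;
};

/*
 * Stand-in for the queue accounting call: records the queued bytes and
 * reports whether the doorbell should be rung. The real kernel helper
 * has more conditions; this model only looks at xmit_more.
 */
static bool account_sent(uint32_t *queued_bytes, uint32_t len, bool more)
{
	*queued_bytes += len;
	return !more;
}

/* Returns true if the caller should ring the doorbell. */
static bool xmit_one(struct fake_desc *txd, const struct fake_skb *skb,
		     uint32_t *queued_bytes)
{
	bool door_bell;
	uint32_t len;

	assert(txd && skb && queued_bytes);

	/* Read everything we need while we still own the skb. */
	len = skb->len;
	door_bell = account_sent(queued_bytes, len, skb->xmit_more);

	/* Ownership passes to the device here; skb is off limits afterwards. */
	txd->opts1 = DESC_OWN | len;

	return door_bell;
}
```

The point of the sketch is only the order of the two steps: the accounting
call that dereferences the skb happens before the descriptor is released,
and the later doorbell decision uses the saved boolean instead of touching
the skb again.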



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Heiner Kallweit @ 2019-02-09  9:59 UTC
  To: Sander Eikelenboom, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <6b4e8aa0-03c5-c0a8-439e-77daabb07416@eikelenboom.it>

On 09.02.2019 10:34, Sander Eikelenboom wrote:
> On 09/02/2019 10:02, Heiner Kallweit wrote:
>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>
>>>
>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>> L.S.,
>>>>>>>>>
>>>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen I got the nasty splat below,
>>>>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>>>>
>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>
>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>
>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>
>>>>>>> Hmm, I did some digging and I think:
>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>
>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>
>>>>> Yep, the box is already quite contented with the Xen VM's and if I remember correctly it occurred while kernel compiling
>>>>> on the host.
>>>>>
>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>> as author of the underlying changes.
>>>>>
>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>
>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>> test also with only 
>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>> removed.
>>>>
>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>
>>>> Sure, thanks.
>>>>
>>>>> BTW am i correct these patches are merely optimizations ?
>>>>
>>>> Yes
>>>>
>>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>>
>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>
>>>
>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>
>>> You could try :
>>>
>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>         dma_addr_t mapping;
>>>         u32 opts[2], len;
>>>         bool stop_queue;
>>> +       bool door_bell;
>>>         int frags;
>>>  
>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>         /* Force memory writes to complete before releasing descriptor */
>>>         dma_wmb();
>>>  
>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>> +
>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>  
>>>         /* Force all memory writes to complete before notifying device */
>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>         if (unlikely(stop_queue))
>>>                 netif_stop_queue(dev);
>>>  
>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>> +       if (door_bell) {
>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>                 mmiowb();
>>>         }
>>>
>> Thanks a lot for checking and for the proposed fix.
>> Sander, can you try with this patch on top of 5.0-rc5 without removing the two commits?
> 
> I have already done that during the night. The results:
> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit that hits the BUG_ON in lib/dynamic_queue_limits.c
>   (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).
> 
> - Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c; it has run overnight and I have done a few kernel compiles
>   this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
>   compile kernels in the same way as I do now. The only thing I can't say is whether that is due to this change, or if it's something else again.
>   That makes me somewhat inclined to test the complete revert some more and see whether I can trigger the queue timeout on that or not.
> 
>   If I can, it is a separate issue.
>   If I can't, then even with the patch it still looks like a regression in comparison with 4.20.x, for which
>   a revert would be the right thing to do (since, as you indicated, these are merely optimizations),
>   which would give us more time for 5.1 to try to solve things on top of the 5.0-release-to-be
>   (especially since I still seem to have other issues which need to be sorted out and time is limited).
> 
>   The timeout in question:
>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>         [28336.893358] Modules linked in:
>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>         [28337.090052] Call Trace:
>         [28337.103615]  <IRQ>
>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>         [28337.128905]  call_timer_fn+0x19/0x90
>         [28337.141892]  expire_timers+0x8b/0xa0
>         [28337.153354]  run_timer_softirq+0x7e/0x160
>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>         [28337.186734]  __do_softirq+0xed/0x229
>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>         [28337.207822]  irq_exit+0xb7/0xc0
>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>         [28337.241261]  </IRQ>
>         [28337.253283] RIP: e033:0xff7e62
>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
> 
Thanks for your efforts. As usual this tx timeout trace says basically nothing except
"timeout" and root cause could be anything. Earlier you reported a memory allocation error,
did that occur again?
If we decide to revert, I'd leave removal of the memory barriers in (as it doesn't seem to
contribute to the issue) and just submit a patch to effectively revert
2e6eedb4813e34d8d84ac0eb3afb668966f3f356.

> --
> Sander
> 
>  
>>>
>>> .
>>>
>> Heiner
>>
> 
> 


* Re: [PATCH 13/19] net: split out functions related to registering inflight socket files
From: Hannes Reinecke @ 2019-02-09  9:49 UTC
  To: Jens Axboe, linux-aio, linux-block, linux-api
  Cc: hch, jmoyer, avi, jannh, viro, netdev, David S . Miller
In-Reply-To: <20190208173423.27014-14-axboe@kernel.dk>

On 2/8/19 6:34 PM, Jens Axboe wrote:
> We need this functionality for the io_uring file registration, but
> we cannot rely on it since CONFIG_UNIX can be modular. Move the helpers
> to a separate file, that's always builtin to the kernel if CONFIG_UNIX is
> m/y.
> 
> No functional changes in this patch, just moving code around.
> 
> Cc: netdev@vger.kernel.org
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   include/net/af_unix.h |   1 +
>   net/unix/Kconfig      |   5 ++
>   net/unix/Makefile     |   2 +
>   net/unix/af_unix.c    |  63 +-----------------
>   net/unix/garbage.c    |  71 +-------------------
>   net/unix/scm.c        | 146 ++++++++++++++++++++++++++++++++++++++++++
>   net/unix/scm.h        |  10 +++
>   7 files changed, 168 insertions(+), 130 deletions(-)
>   create mode 100644 net/unix/scm.c
>   create mode 100644 net/unix/scm.h
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes




* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Sander Eikelenboom @ 2019-02-09  9:34 UTC
  To: Heiner Kallweit, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <aa7eb814-ebf7-ffa5-ce61-317b9bbf2394@gmail.com>

On 09/02/2019 10:02, Heiner Kallweit wrote:
> On 09.02.2019 00:09, Eric Dumazet wrote:
>>
>>
>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>> L.S.,
>>>>>>>>
>>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen I got the nasty splat below,
>>>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>>>
>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>
>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>
>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>
>>>>>> Hmm, I did some digging and I think:
>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>
>>>>> You're right. Thought this was added in 4.20 already.
>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>> Does the issue occur under specific circumstances like very high load?
>>>>
>>>> Yep, the box is already quite contented with the Xen VM's and if I remember correctly it occurred while kernel compiling
>>>> on the host.
>>>>
>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>> as author of the underlying changes.
>>>>
>>>> It could also be the barriers weren't that unneeded as assumed.
>>>
>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>> test also with only 
>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>> removed.
>>>
>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>
>>> Sure, thanks.
>>>
>>>> BTW am i correct these patches are merely optimizations ?
>>>
>>> Yes
>>>
>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>
>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>
>>
>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>
>> You could try :
>>
>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>> --- a/drivers/net/ethernet/realtek/r8169.c
>> +++ b/drivers/net/ethernet/realtek/r8169.c
>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>         dma_addr_t mapping;
>>         u32 opts[2], len;
>>         bool stop_queue;
>> +       bool door_bell;
>>         int frags;
>>  
>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>         /* Force memory writes to complete before releasing descriptor */
>>         dma_wmb();
>>  
>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>> +
>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>  
>>         /* Force all memory writes to complete before notifying device */
>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>         if (unlikely(stop_queue))
>>                 netif_stop_queue(dev);
>>  
>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>> +       if (door_bell) {
>>                 RTL_W8(tp, TxPoll, NPQ);
>>                 mmiowb();
>>         }
>>
> Thanks a lot for checking and for the proposed fix.
> Sander, can you try with this patch on top of 5.0-rc5 without removing the two commits?

I have already done that during the night. The results:
- I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit that hits the BUG_ON in lib/dynamic_queue_limits.c
  (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).

- Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
  The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c; it has run overnight and I have done a few kernel compiles
  this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
  compile kernels in the same way as I do now. The only thing I can't say is whether that is due to this change, or if it's something else again.
  That makes me somewhat inclined to test the complete revert some more and see whether I can trigger the queue timeout on that or not.

  If I can, it is a separate issue.
  If I can't, then even with the patch it still looks like a regression in comparison with 4.20.x, for which
  a revert would be the right thing to do (since, as you indicated, these are merely optimizations),
  which would give us more time for 5.1 to try to solve things on top of the 5.0-release-to-be
  (especially since I still seem to have other issues which need to be sorted out and time is limited).

  The timeout in question:
        [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
        [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
        [28336.893358] Modules linked in:
        [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
        [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
        [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
        [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
        [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
        [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
        [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
        [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
        [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
        [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
        [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
        [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
        [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
        [28337.090052] Call Trace:
        [28337.103615]  <IRQ>
        [28337.116587]  ? qdisc_destroy+0x120/0x120
        [28337.128905]  call_timer_fn+0x19/0x90
        [28337.141892]  expire_timers+0x8b/0xa0
        [28337.153354]  run_timer_softirq+0x7e/0x160
        [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
        [28337.176548]  ? handle_percpu_irq+0x32/0x50
        [28337.186734]  __do_softirq+0xed/0x229
        [28337.196404]  ? hypervisor_callback+0xa/0x20
        [28337.207822]  irq_exit+0xb7/0xc0
        [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
        [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
        [28337.241261]  </IRQ>
        [28337.253283] RIP: e033:0xff7e62
        [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
        [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
        [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
        [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
        [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
        [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
        [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
        [28337.353977] ---[ end trace 6ff49f09286816b7 ]---

--
Sander

 
>>
>> .
>>
> Heiner
> 


^ permalink raw reply

* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Heiner Kallweit @ 2019-02-09  9:02 UTC (permalink / raw)
  To: Eric Dumazet, Sander Eikelenboom, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <e6423f51-42cc-e49f-bb48-85ea922f01b6@gmail.com>

On 09.02.2019 00:09, Eric Dumazet wrote:
> 
> 
> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>> L.S.,
>>>>>>>
>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen I hit the nasty splat below,
>>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>>
>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>
>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>
>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>
>>>>> Hmm, I did some digging and I think:
>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>
>>>> You're right. Thought this was added in 4.20 already.
>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>> have onboard Realtek network I have quite a few testers out there.
>>>> Does the issue occur under specific circumstances like very high load?
>>>
>>> Yep, the box is already quite contended with the Xen VMs and, if I remember correctly, it occurred while kernel compiling
>>> on the host.
>>>
>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>> as author of the underlying changes.
>>>
>>> It could also be that the barriers weren't as unneeded as assumed.
>>
>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>> test also with only 
>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>> removed.
>>
>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>
>> Sure, thanks.
>>
>>> BTW am i correct these patches are merely optimizations ?
>>
>> Yes
>>
>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>> to revert them for 5.0 and try again for 5.1 ?
>>>
>> Before removing both it would be good to test with only the barrier-removal removed.
>>
> 
> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
> looks buggy to me, since the skb might have been freed already on another cpu when you call
> __netdev_sent_queue(dev, skb->len, skb->xmit_more).
> 
> You could try :
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>         dma_addr_t mapping;
>         u32 opts[2], len;
>         bool stop_queue;
> +       bool door_bell;
>         int frags;
>  
>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>         /* Force memory writes to complete before releasing descriptor */
>         dma_wmb();
>  
> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
> +
>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>  
>         /* Force all memory writes to complete before notifying device */
> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>         if (unlikely(stop_queue))
>                 netif_stop_queue(dev);
>  
> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
> +       if (door_bell) {
>                 RTL_W8(tp, TxPoll, NPQ);
>                 mmiowb();
>         }
> 
Thanks a lot for checking and for the proposed fix.
Sander, can you try with this patch on top of 5.0-rc5 w/o removing the two commits?

> 
> .
> 
Heiner

^ permalink raw reply

* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
From: Heiner Kallweit @ 2019-02-09  9:10 UTC (permalink / raw)
  To: Sander Eikelenboom, Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev
In-Reply-To: <7084be7a-c279-080d-d1ec-cd604f2b2b14@eikelenboom.it>

On 09.02.2019 00:34, Sander Eikelenboom wrote:
> On 08/02/2019 22:50, Heiner Kallweit wrote:
>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>> L.S.,
>>>>>>>
>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen I hit the nasty splat below,
>>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>>
>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>
>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>
>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>
>>>>> Hmm, I did some digging and I think:
>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>
>>>> You're right. Thought this was added in 4.20 already.
>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>> have onboard Realtek network I have quite a few testers out there.
>>>> Does the issue occur under specific circumstances like very high load?
>>>
>>> Yep, the box is already quite contended with the Xen VMs and, if I remember correctly, it occurred while kernel compiling
>>> on the host.
>>>
>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>> as author of the underlying changes.
>>>
>>> It could also be that the barriers weren't as unneeded as assumed.
>>
>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>> test also with only 
>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>> removed.
> 
> *arghh* *grmbl*
> 
> with both:
>     bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3
>     and
>     2e6eedb4813e34d8d84ac0eb3afb668966f3f356 
> reverted i get yet another splat:
> 
Phew, I'm not a memory management expert. The traces also include a failed memory
allocation from a file system operation. Maybe the system is running low on memory?
The issue occurs so deep in memory management that I wonder whether and how this could
be caused by the network driver.


> [ 3769.246083] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
> [ 3769.246095] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
> [ 3769.246096] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
> [ 3769.246098] Call Trace:
> [ 3769.246104]  <IRQ>
> [ 3769.246114]  dump_stack+0x5c/0x7b
> [ 3769.246120]  warn_alloc+0x103/0x190
> [ 3769.246122]  __alloc_pages_nodemask+0xe3d/0xe80
> [ 3769.246128]  ? inet_gro_receive+0x232/0x2c0
> [ 3769.246130]  page_frag_alloc+0x117/0x150
> [ 3769.246132]  __napi_alloc_skb+0x83/0xd0
> [ 3769.246137]  rtl8169_poll+0x210/0x640
> [ 3769.246140]  net_rx_action+0x23d/0x370
> [ 3769.246145]  __do_softirq+0xed/0x229
> [ 3769.246149]  irq_exit+0xb7/0xc0
> [ 3769.246152]  xen_evtchn_do_upcall+0x27/0x40
> [ 3769.246154]  xen_do_hypervisor_callback+0x29/0x40
> [ 3769.246155]  </IRQ>
> [ 3769.246161] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
> [ 3769.246163] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
> [ 3769.246164] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
> [ 3769.246166] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
> [ 3769.246167] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
> [ 3769.246167] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
> [ 3769.246168] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
> [ 3769.246169] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
> [ 3769.246173]  _raw_spin_lock+0x16/0x20
> [ 3769.246176]  list_lru_add+0x59/0x170
> [ 3769.246179]  inode_lru_list_add+0x1b/0x40
> [ 3769.246182]  iput+0x18b/0x1a0
> [ 3769.246184]  __dentry_kill+0xc5/0x170
> [ 3769.246186]  shrink_dentry_list+0x93/0x1c0
> [ 3769.246187]  prune_dcache_sb+0x4d/0x70
> [ 3769.246191]  super_cache_scan+0x104/0x190
> [ 3769.246194]  do_shrink_slab+0x12c/0x1e0
> [ 3769.246196]  shrink_slab+0xdf/0x2b0
> [ 3769.246198]  shrink_node+0x158/0x470
> [ 3769.246200]  do_try_to_free_pages+0xd1/0x380
> [ 3769.246202]  try_to_free_pages+0xb2/0xe0
> [ 3769.246204]  __alloc_pages_nodemask+0x603/0xe80
> [ 3769.246207]  ? xas_load+0x9/0x80
> [ 3769.246209]  ? find_get_entry+0x58/0x120
> [ 3769.246210]  pagecache_get_page+0xde/0x210
> [ 3769.246213]  grab_cache_page_write_begin+0x17/0x30
> [ 3769.246215]  ext4_da_write_begin+0xc4/0x340
> [ 3769.246217]  generic_perform_write+0xb8/0x1b0
> [ 3769.246219]  __generic_file_write_iter+0x13c/0x1b0
> [ 3769.246223]  ext4_file_write_iter+0x121/0x3c0
> [ 3769.246225]  __vfs_write+0x123/0x1a0
> [ 3769.246226]  vfs_write+0xab/0x1a0
> [ 3769.246229]  ksys_write+0x4d/0xc0
> [ 3769.246232]  do_syscall_64+0x49/0x100
> [ 3769.246234]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 3769.246237] RIP: 0033:0x7fee5b265730
> [ 3769.246238] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
> [ 3769.246239] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 3769.246240] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
> [ 3769.246241] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
> [ 3769.246241] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
> [ 3769.246242] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
> [ 3769.246243] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
> [ 3769.246244] Mem-Info:
> [ 3769.246249] active_anon:152383 inactive_anon:99216 isolated_anon:0
>                 active_file:51569 inactive_file:85922 isolated_file:0
>                 unevictable:552 dirty:6866 writeback:0 unstable:0
>                 slab_reclaimable:6707 slab_unreclaimable:16166
>                 mapped:1870 shmem:6 pagetables:2716 bounce:0
>                 free:3639 free_pcp:900 free_cma:0
> [ 3769.246252] Node 0 active_anon:609532kB inactive_anon:396864kB active_file:206276kB inactive_file:343688kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:7480kB dirty:27464kB writeback:0kB shmem:24kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> [ 3769.246253] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:8056kB inactive_anon:0kB active_file:92kB inactive_file:148kB unevictable:0kB writepending:8kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:20kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [ 3769.246256] lowmem_reserve[]: 0 1865 1865 1865
> [ 3769.246258] Node 0 DMA32 free:7076kB min:19472kB low:21380kB high:23288kB active_anon:601840kB inactive_anon:396512kB active_file:206216kB inactive_file:343644kB unevictable:2208kB writepending:27256kB present:2080768kB managed:1833792kB mlocked:2208kB kernel_stack:9392kB pagetables:10844kB bounce:0kB free_pcp:3600kB local_pcp:596kB free_cma:0kB
> [ 3769.246260] lowmem_reserve[]: 0 0 0 0
> [ 3769.246262] Node 0 DMA: 6*4kB (UE) 4*8kB (UME) 4*16kB (UME) 2*32kB (UE) 6*64kB (UE) 2*128kB (UM) 4*256kB (UME) 3*512kB (UME) 2*1024kB (ME) 1*2048kB (M) 0*4096kB = 7480kB
> [ 3769.246267] Node 0 DMA32: 66*4kB (UM) 271*8kB (UME) 218*16kB (UME) 45*32kB (UME) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7360kB
> [ 3769.246272] 144878 total pagecache pages
> [ 3769.246276] 6812 pages in swap cache
> [ 3769.246277] Swap cache stats: add 62616, delete 55806, find 31/55
> [ 3769.246278] Free swap  = 3943164kB
> [ 3769.246278] Total swap = 4194300kB
> [ 3769.246279] 524181 pages RAM
> [ 3769.246279] 0 pages HighMem/MovableOnly
> [ 3769.246280] 61765 pages reserved
> [ 3769.246280] 0 pages cma reserved
> [ 3769.246284] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
> [ 3769.246286] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
> [ 3769.246287] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
> [ 3769.246287] Call Trace:
> [ 3769.246288]  <IRQ>
> [ 3769.246290]  dump_stack+0x5c/0x7b
> [ 3769.246291]  warn_alloc+0x103/0x190
> [ 3769.246293]  __alloc_pages_nodemask+0xe3d/0xe80
> [ 3769.246294]  ? inet_gro_receive+0x232/0x2c0
> [ 3769.246296]  page_frag_alloc+0x117/0x150
> [ 3769.246297]  __napi_alloc_skb+0x83/0xd0
> [ 3769.246299]  rtl8169_poll+0x210/0x640
> [ 3769.246300]  net_rx_action+0x23d/0x370
> [ 3769.246302]  __do_softirq+0xed/0x229
> [ 3769.246304]  irq_exit+0xb7/0xc0
> [ 3769.246305]  xen_evtchn_do_upcall+0x27/0x40
> [ 3769.246306]  xen_do_hypervisor_callback+0x29/0x40
> [ 3769.246307]  </IRQ>
> [ 3769.246308] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
> [ 3769.246310] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
> [ 3769.246310] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
> [ 3769.246311] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
> [ 3769.246312] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
> [ 3769.246313] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
> [ 3769.246313] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
> [ 3769.246314] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
> [ 3769.246316]  _raw_spin_lock+0x16/0x20
> [ 3769.246317]  list_lru_add+0x59/0x170
> [ 3769.246318]  inode_lru_list_add+0x1b/0x40
> [ 3769.246320]  iput+0x18b/0x1a0
> [ 3769.246321]  __dentry_kill+0xc5/0x170
> [ 3769.246322]  shrink_dentry_list+0x93/0x1c0
> [ 3769.246323]  prune_dcache_sb+0x4d/0x70
> [ 3769.246325]  super_cache_scan+0x104/0x190
> [ 3769.246326]  do_shrink_slab+0x12c/0x1e0
> [ 3769.246328]  shrink_slab+0xdf/0x2b0
> [ 3769.246329]  shrink_node+0x158/0x470
> [ 3769.246331]  do_try_to_free_pages+0xd1/0x380
> [ 3769.246333]  try_to_free_pages+0xb2/0xe0
> [ 3769.246334]  __alloc_pages_nodemask+0x603/0xe80
> [ 3769.246336]  ? xas_load+0x9/0x80
> [ 3769.246337]  ? find_get_entry+0x58/0x120
> [ 3769.246338]  pagecache_get_page+0xde/0x210
> [ 3769.246340]  grab_cache_page_write_begin+0x17/0x30
> [ 3769.246341]  ext4_da_write_begin+0xc4/0x340
> [ 3769.246342]  generic_perform_write+0xb8/0x1b0
> [ 3769.246344]  __generic_file_write_iter+0x13c/0x1b0
> [ 3769.246345]  ext4_file_write_iter+0x121/0x3c0
> [ 3769.246347]  __vfs_write+0x123/0x1a0
> [ 3769.246348]  vfs_write+0xab/0x1a0
> [ 3769.246349]  ksys_write+0x4d/0xc0
> [ 3769.246350]  do_syscall_64+0x49/0x100
> [ 3769.246352]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 3769.246353] RIP: 0033:0x7fee5b265730
> [ 3769.246354] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
> [ 3769.246354] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 3769.246355] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
> [ 3769.246356] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
> [ 3769.246357] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
> [ 3769.246357] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
> [ 3769.246358] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
> [ 3769.246364] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
> [ 3769.246366] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
> [ 3769.246366] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
> [ 3769.246366] Call Trace:
> [ 3769.246367]  <IRQ>
> [ 3769.246368]  dump_stack+0x5c/0x7b
> [ 3769.246370]  warn_alloc+0x103/0x190
> [ 3769.246371]  __alloc_pages_nodemask+0xe3d/0xe80
> [ 3769.246373]  ? inet_gro_receive+0x232/0x2c0
> [ 3769.246374]  page_frag_alloc+0x117/0x150
> [ 3769.246375]  __napi_alloc_skb+0x83/0xd0
> [ 3769.246376]  rtl8169_poll+0x210/0x640
> [ 3769.246378]  net_rx_action+0x23d/0x370
> [ 3769.246379]  __do_softirq+0xed/0x229
> [ 3769.246381]  irq_exit+0xb7/0xc0
> [ 3769.246382]  xen_evtchn_do_upcall+0x27/0x40
> [ 3769.246383]  xen_do_hypervisor_callback+0x29/0x40
> [ 3769.246383]  </IRQ>
> [ 3769.246385] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
> [ 3769.246386] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
> [ 3769.246387] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
> [ 3769.246388] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
> [ 3769.246388] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
> [ 3769.246389] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
> [ 3769.246390] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
> [ 3769.246390] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
> [ 3769.246392]  _raw_spin_lock+0x16/0x20
> [ 3769.246393]  list_lru_add+0x59/0x170
> [ 3769.246395]  inode_lru_list_add+0x1b/0x40
> [ 3769.246396]  iput+0x18b/0x1a0
> [ 3769.246397]  __dentry_kill+0xc5/0x170
> [ 3769.246398]  shrink_dentry_list+0x93/0x1c0
> [ 3769.246399]  prune_dcache_sb+0x4d/0x70
> [ 3769.246401]  super_cache_scan+0x104/0x190
> [ 3769.246402]  do_shrink_slab+0x12c/0x1e0
> [ 3769.246404]  shrink_slab+0xdf/0x2b0
> [ 3769.246405]  shrink_node+0x158/0x470
> [ 3769.246407]  do_try_to_free_pages+0xd1/0x380
> [ 3769.246408]  try_to_free_pages+0xb2/0xe0
> [ 3769.246410]  __alloc_pages_nodemask+0x603/0xe80
> [ 3769.246411]  ? xas_load+0x9/0x80
> [ 3769.246413]  ? find_get_entry+0x58/0x120
> [ 3769.246414]  pagecache_get_page+0xde/0x210
> [ 3769.246415]  grab_cache_page_write_begin+0x17/0x30
> [ 3769.246416]  ext4_da_write_begin+0xc4/0x340
> [ 3769.246418]  generic_perform_write+0xb8/0x1b0
> [ 3769.246420]  __generic_file_write_iter+0x13c/0x1b0
> [ 3769.246421]  ext4_file_write_iter+0x121/0x3c0
> [ 3769.246422]  __vfs_write+0x123/0x1a0
> [ 3769.246423]  vfs_write+0xab/0x1a0
> [ 3769.246424]  ksys_write+0x4d/0xc0
> [ 3769.246426]  do_syscall_64+0x49/0x100
> [ 3769.246427]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 3769.246428] RIP: 0033:0x7fee5b265730
> [ 3769.246429] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
> [ 3769.246430] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 3769.246431] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
> [ 3769.246431] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
> [ 3769.246432] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
> [ 3769.246433] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
> [ 3769.246433] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
> 
> 
>  
>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>
>> Sure, thanks.
>>
>>> BTW am i correct these patches are merely optimizations ?
>>
>> Yes
>>
>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>> to revert them for 5.0 and try again for 5.1 ?
>>>
>> Before removing both it would be good to test with only the barrier-removal removed.
>>
>>> --
>>> Sander
>>>
>> Heiner
>>
>>>
>>>>
>>>>> would be candidates, which were merged in 5.0.
>>>>>
>>>>> I have reverted the first two, see how that works out.
>>>>>
>>>>> --
>>>>> Sander
>>>>>
>>>> Heiner
>>>>
>>>>>  
>>>>>>> --
>>>>>>> Sander
>>>>>>>
>>>>>> Heiner
>>>>>>
>>>>>>>
>>>>>>> [ 6466.554866] kernel BUG at lib/dynamic_queue_limits.c:27!
>>>>>>> [ 6466.571425] invalid opcode: 0000 [#1] SMP NOPTI
>>>>>>> [ 6466.585890] CPU: 3 PID: 7057 Comm: as Not tainted 5.0.0-rc5-20190208-thp-net-florian-doflr+ #1
>>>>>>> [ 6466.598693] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>>>> [ 6466.611579] RIP: e030:dql_completed+0x126/0x140
>>>>>>> [ 6466.624339] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>>>> [ 6466.648130] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>>>> [ 6466.659616] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>>>> [ 6466.672835] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>>>> [ 6466.684521] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>>>> [ 6466.696824] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>>>> [ 6466.709953] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>>>> [ 6466.722165] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>>>> [ 6466.733228] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> [ 6466.746581] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>>>> [ 6466.758366] Call Trace:
>>>>>>> [ 6466.768118]  <IRQ>
>>>>>>> [ 6466.778214]  rtl8169_poll+0x4f4/0x640
>>>>>>> [ 6466.789198]  net_rx_action+0x23d/0x370
>>>>>>> [ 6466.798467]  __do_softirq+0xed/0x229
>>>>>>> [ 6466.807039]  irq_exit+0xb7/0xc0
>>>>>>> [ 6466.815471]  xen_evtchn_do_upcall+0x27/0x40
>>>>>>> [ 6466.826647]  xen_do_hypervisor_callback+0x29/0x40
>>>>>>> [ 6466.835902]  </IRQ>
>>>>>>> [ 6466.845361] RIP: e030:xen_hypercall_mmu_update+0xa/0x20
>>>>>>> [ 6466.853390] Code: 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>>>>>> [ 6466.874031] RSP: e02b:ffffc90003c0bdd0 EFLAGS: 00000246
>>>>>>> [ 6466.883452] RAX: 0000000000000000 RBX: 000000041f83bfe8 RCX: ffffffff8100102a
>>>>>>> [ 6466.891986] RDX: deadbeefdeadf00d RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
>>>>>>> [ 6466.903402] RBP: 0000000000000fe8 R08: 000000000000000b R09: 0000000000000000
>>>>>>> [ 6466.911201] R10: deadbeefdeadf00d R11: 0000000000000246 R12: 800000050c346067
>>>>>>> [ 6466.918491] R13: ffff8880607c4fe8 R14: ffff888005082800 R15: 0000000000000000
>>>>>>> [ 6466.926647]  ? xen_hypercall_mmu_update+0xa/0x20
>>>>>>> [ 6466.938195]  ? xen_set_pte_at+0x78/0xe0
>>>>>>> [ 6466.947046]  ? __handle_mm_fault+0xc43/0x1060
>>>>>>> [ 6466.955772]  ? do_mmap+0x44b/0x5b0
>>>>>>> [ 6466.964410]  ? handle_mm_fault+0xf8/0x200
>>>>>>> [ 6466.973290]  ? __do_page_fault+0x231/0x4a0
>>>>>>> [ 6466.981973]  ? page_fault+0x8/0x30
>>>>>>> [ 6466.990904]  ? page_fault+0x1e/0x30
>>>>>>> [ 6466.999585] Modules linked in:
>>>>>>> [ 6467.007533] ---[ end trace 94bec01608fe4061 ]---
>>>>>>> [ 6467.016751] RIP: e030:dql_completed+0x126/0x140
>>>>>>> [ 6467.024271] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>>>> [ 6467.039726] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>>>> [ 6467.047243] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>>>> [ 6467.054202] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>>>> [ 6467.062000] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>>>> [ 6467.069664] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>>>> [ 6467.077715] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>>>> [ 6467.084916] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>>>> [ 6467.093352] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> [ 6467.101492] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>>>> [ 6467.110542] Kernel panic - not syncing: Fatal exception in interrupt
>>>>>>> [ 6467.118166] Kernel Offset: disabled
>>>>>>> (XEN) [2019-02-08 18:04:48.854] Hardware Dom0 crashed: rebooting machine in 5 seconds.
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
> 
> 


^ permalink raw reply

* [PATCH net-next] net: phy: remove unneeded masking of PHY register read results
From: Heiner Kallweit @ 2019-02-09  8:46 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <75c9d8ee-582f-f247-7595-d8732ac26c20@gmail.com>

PHY registers are only 16 bits wide, therefore, if the read was
successful, there's no need to mask out the higher 16 bits.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
---
 drivers/net/phy/phy_device.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index d4fc1fd8a..31f9e7c49 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -676,13 +676,13 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
 	phy_reg = mdiobus_read(bus, addr, reg_addr);
 	if (phy_reg < 0)
 		return -EIO;
-	*devices_in_package = (phy_reg & 0xffff) << 16;
+	*devices_in_package = phy_reg << 16;
 
 	reg_addr = MII_ADDR_C45 | dev_addr << 16 | MDIO_DEVS1;
 	phy_reg = mdiobus_read(bus, addr, reg_addr);
 	if (phy_reg < 0)
 		return -EIO;
-	*devices_in_package |= (phy_reg & 0xffff);
+	*devices_in_package |= phy_reg;
 
 	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
 	*devices_in_package &= ~BIT(0);
@@ -746,13 +746,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 		phy_reg = mdiobus_read(bus, addr, reg_addr);
 		if (phy_reg < 0)
 			return -EIO;
-		c45_ids->device_ids[i] = (phy_reg & 0xffff) << 16;
+		c45_ids->device_ids[i] = phy_reg << 16;
 
 		reg_addr = MII_ADDR_C45 | i << 16 | MII_PHYSID2;
 		phy_reg = mdiobus_read(bus, addr, reg_addr);
 		if (phy_reg < 0)
 			return -EIO;
-		c45_ids->device_ids[i] |= (phy_reg & 0xffff);
+		c45_ids->device_ids[i] |= phy_reg;
 	}
 	*phy_id = 0;
 	return 0;
@@ -789,14 +789,14 @@ static int get_phy_id(struct mii_bus *bus, int addr, u32 *phy_id,
 		return (phy_reg == -EIO || phy_reg == -ENODEV) ? -ENODEV : -EIO;
 	}
 
-	*phy_id = (phy_reg & 0xffff) << 16;
+	*phy_id = phy_reg << 16;
 
 	/* Grab the bits from PHYIR2, and put them in the lower half */
 	phy_reg = mdiobus_read(bus, addr, MII_PHYSID2);
 	if (phy_reg < 0)
 		return -EIO;
 
-	*phy_id |= (phy_reg & 0xffff);
+	*phy_id |= phy_reg;
 
 	return 0;
 }
-- 
2.20.1
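
To illustrate why the mask is redundant, here is a minimal userspace sketch that combines two successful 16-bit register reads into one 32-bit ID. The helper name combine_phy_id is hypothetical, and its int arguments only stand in for mdiobus_read() return values. The cast to uint32_t is an addition of this sketch so that the shift is well defined in portable C; it is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: hi and lo stand in for mdiobus_read() results,
 * i.e. a negative errno on failure or a value in 0..0xffff on success.
 */
static int combine_phy_id(int hi, int lo, uint32_t *phy_id)
{
	assert(phy_id != NULL);

	if (hi < 0)
		return hi;
	if (lo < 0)
		return lo;

	/* Both values already fit in 16 bits, so no "& 0xffff" is needed. */
	*phy_id = ((uint32_t)hi << 16) | (uint32_t)lo;
	return 0;
}
```

A failed read leaves *phy_id untouched and returns the error. The real functions return -EIO instead of passing the original error through.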



* Re: [PATCH net-next 3/4] nfp: devlink: rename vendor to manufacture
From: Jiri Pirko @ 2019-02-09  8:36 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, oss-drivers
In-Reply-To: <20190209031611.1102-4-jakub.kicinski@netronome.com>

Sat, Feb 09, 2019 at 04:16:10AM CET, jakub.kicinski@netronome.com wrote:
>Vendor may sound ambiguous, let's rename the fab string to
>"board.manufacture".
>
>Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
>Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
>---
> drivers/net/ethernet/netronome/nfp/nfp_devlink.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
>index dddbb0575be9..6e15e216732a 100644
>--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
>+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
>@@ -178,7 +178,7 @@ static const struct nfp_devlink_versions_simple {
> } nfp_devlink_versions_hwinfo[] = {
> 	{ DEVLINK_INFO_VERSION_GENERIC_BOARD_ID,	"assembly.partno", },
> 	{ DEVLINK_INFO_VERSION_GENERIC_BOARD_REV,	"assembly.revision", },
>-	{ "board.vendor", /* fab */			"assembly.vendor", },
>+	{ "board.manufacture",				"assembly.vendor", },

I wonder, why this is not among generic?


> 	{ "board.model", /* code name */		"assembly.model", },
> };
> 
>-- 
>2.19.2
>


* Re: [PATCH net-next 2/4] devlink: don't allocate attrs on the stack
From: Jiri Pirko @ 2019-02-09  8:35 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, oss-drivers
In-Reply-To: <20190209031611.1102-3-jakub.kicinski@netronome.com>

Sat, Feb 09, 2019 at 04:16:09AM CET, jakub.kicinski@netronome.com wrote:
>The number of devlink attributes has grown beyond 128, causing the
>following warning:
>
>../net/core/devlink.c: In function ‘devlink_nl_cmd_region_read_dumpit’:
>../net/core/devlink.c:3740:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=]
> }
>  ^
>
>Since the number of attributes is only going to grow, allocate
>the array dynamically.
>
>Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>
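
The idea can be pictured with a small userspace sketch. The names MAX_ATTR, struct attr and alloc_attr_table are hypothetical, and calloc()/free() stand in for the kernel allocator. The only point is that a table of more than 128 pointers moves from the stack frame to the heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ATTR 130	/* hypothetical attribute count, larger than 128 */

struct attr;		/* opaque stand-in for a parsed attribute */

/*
 * Allocate a zeroed table with slots 0..max_attr on the heap.
 * On a 64-bit build, 131 pointers take 1048 bytes, which is already
 * more than the 1024-byte frame limit quoted in the warning.
 * Returns NULL on allocation failure; the caller frees the table.
 */
static struct attr **alloc_attr_table(size_t max_attr)
{
	assert(max_attr < SIZE_MAX);	/* max_attr + 1 must not wrap */
	return calloc(max_attr + 1, sizeof(struct attr *));
}
```

A caller would check for NULL, use the table while parsing, and release it with free() on every exit path.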


* Re: [PATCH net-next 1/4] devlink: fix condition for compat device info
From: Jiri Pirko @ 2019-02-09  8:33 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, oss-drivers
In-Reply-To: <20190209031611.1102-2-jakub.kicinski@netronome.com>

Sat, Feb 09, 2019 at 04:16:08AM CET, jakub.kicinski@netronome.com wrote:
>We need the port to be both ethernet and have the right netdev,
>not one or the other.
>
>Fixes: ddb6e99e2db1 ("ethtool: add compat for devlink info")
>Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>
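
The condition described in the commit message can be sketched as a predicate. The types below are hypothetical stand-ins, not the devlink structures. The sketch only shows that both tests must hold, so they are joined with && rather than ||.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical port types for illustration only. */
enum port_type { PORT_TYPE_NOTSET, PORT_TYPE_ETH, PORT_TYPE_IB };

struct port {			/* hypothetical stand-in for a devlink port */
	enum port_type type;
	const void *netdev;
};

/* A port matches only when it is Ethernet AND bound to the requested netdev. */
static bool port_matches(const struct port *p, const void *netdev)
{
	assert(p != NULL);
	return p->type == PORT_TYPE_ETH && p->netdev == netdev;
}
```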


* Re: [PATCH v2 net-next] net: phy: disregard "Clause 22 registers present" bit in get_phy_c45_devs_in_pkg
From: Heiner Kallweit @ 2019-02-09  8:40 UTC (permalink / raw)
  To: David Miller; +Cc: andrew, f.fainelli, netdev
In-Reply-To: <20190208.231100.2168035998559471182.davem@davemloft.net>

On 09.02.2019 08:11, David Miller wrote:
> From: Heiner Kallweit <hkallweit1@gmail.com>
> Date: Fri, 8 Feb 2019 19:25:22 +0100
> 
>> Bit 0 in register 1.5 doesn't represent a device but is a flag that
>> Clause 22 registers are present. Therefore disregard this bit when
>> populating the device list. If code needs this information it
>> should read register 1.5 directly instead of accessing the device
>> list.
>> Because this bit doesn't represent a device don't define a
>> MDIO_MMD_XYZ constant, just define a MDIO_DEVS_XYZ constant for
>> the flag in the device list bitmap.
>>
>> v2:
>> - make masking of bit 0 more explicit
>> - improve commit message
>>
Andrew had a few further review comments, and based on them I prepared a v3,
this time as a series of three patches. What you just applied was split
into two patches, and patch 1 is new. But this shouldn't be a big deal.
We can keep what was applied and I will rebase patch 1 and resubmit it.

>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> 
> Applied, thanks Heiner.
> 
Heiner


* Re: Possible bug into DSA2 code.
From: Rodolfo Giometti @ 2019-02-09  8:24 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn, Vivien Didelot; +Cc: David S. Miller, netdev
In-Reply-To: <a121e6b5-03cd-da9e-42e8-41c68e12babe@enneenne.com>

Hello,

I'm working with an ESPRESSObin and DSA2, where I added the ability to dynamically 
load and unload switch configurations by using DT overlays (with the patch from 
https://lore.kernel.org/patchwork/patch/468129/). During my tests I noticed that 
when I remove the overlay in order to disable the switch, I get the following BUG 
message:

[   24.862079] ------------[ cut here ]------------
[   24.866767] kernel BUG at drivers/net/phy/mdio_bus.c:448!
[   24.872328] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[   24.877967] Modules linked in:
[   24.881109] CPU: 0 PID: 2189 Comm: rmdir Not tainted 4.19.0-sw3720tsn1-038561
[   24.890509] Hardware name: Kbact sw3720tsn1 smart switch (DT)
[   24.896426] pstate: 20000005 (nzCv daif -PAN -UAO)
[   24.901365] pc : mdiobus_unregister+0x90/0x98
[   24.905838] lr : mv88e6xxx_mdios_unregister+0x64/0x88
[   24.911028] sp : ffff80006aea7730
[   24.914434] x29: ffff80006aea7730 x28: ffff80006adc27c0
[   24.919898] x27: ffff0000091e3000 x26: ffff0000090d9000
[   24.925365] x25: ffff80006c9420a0 x24: ffff80006c3e1c10
[   24.930828] x23: 0000000000000060 x22: ffff80006c9e6018
[   24.936294] x21: ffff80006c9e6110 x20: ffff80006c942800
[   24.941758] x19: ffff80006c942d40 x18: ffffffffffffffff
[   24.947225] x17: 0000000000000000 x16: ffff80006adc27c0
[   24.952690] x15: ffff0000090d96c8 x14: ffff000089198737
[   24.958156] x13: ffff000009198745 x12: ffff0000090d9940
[   24.963621] x11: ffff0000086be4b0 x10: 0000000000000040
[   24.969087] x9 : ffff0000090f4710 x8 : 0000000040000000
[   24.974553] x7 : ffff0000090d96c8 x6 : ffff80006a97a921
[   24.980018] x5 : ffff80006a97a920 x4 : 0000000000000fff
[   24.985483] x3 : 0000000000000000 x2 : 0000000000000000
[   24.990948] x1 : 0000000000000003 x0 : ffff80006c942800
[   24.996416] Process rmdir (pid: 2189, stack limit = 0x(____ptrval____))
[   25.003225] Call trace:
[   25.005737]  mdiobus_unregister+0x90/0x98
[   25.009858]  mv88e6xxx_mdios_unregister+0x64/0x88
[   25.014696]  mv88e6xxx_remove+0x2c/0x88
[   25.018637]  mdio_remove+0x20/0x48
[   25.022135]  device_release_driver_internal+0x1a8/0x240
[   25.027509]  device_release_driver+0x14/0x20
[   25.031899]  bus_remove_device+0x110/0x128
[   25.036109]  device_del+0x124/0x340
[   25.039693]  mdio_device_remove+0x14/0x28
[   25.043815]  mdiobus_unregister+0x50/0x98
[   25.047940]  orion_mdio_remove+0x34/0xb0
[   25.051970]  platform_drv_remove+0x24/0x50
[   25.056181]  device_release_driver_internal+0x1a8/0x240
[   25.061557]  device_release_driver+0x14/0x20
[   25.065947]  bus_remove_device+0x110/0x128
[   25.070158]  device_del+0x124/0x340
[   25.073742]  platform_device_del.part.3+0x24/0x90
[   25.078580]  platform_device_unregister+0x18/0x30
[   25.083422]  of_platform_device_destroy+0xb4/0xb8
[   25.088257]  of_platform_notify+0xa8/0x170
[   25.092471]  notifier_call_chain+0x54/0x98
[   25.096679]  blocking_notifier_call_chain+0x48/0x70
[   25.101697]  of_property_notify+0x60/0xa0
[   25.105819]  __of_changeset_entry_notify+0x54/0x100
[   25.110836]  __of_changeset_revert_notify+0x3c/0x70
[   25.115857]  of_overlay_remove+0x2ac/0x378
[   25.120066]  cfs_overlay_release+0x28/0x50
[   25.124278]  config_item_put.part.0+0x70/0xb0
[   25.128757]  config_item_put+0x10/0x20
[   25.132609]  configfs_rmdir+0x1ec/0x2e0
[   25.136554]  vfs_rmdir+0x7c/0x170
[   25.139956]  do_rmdir+0x17c/0x1d0
[   25.143361]  __arm64_sys_unlinkat+0x4c/0x60
[   25.147664]  el0_svc_common+0x60/0xe8
[   25.151426]  el0_svc_handler+0x2c/0x80
[   25.155279]  el0_svc+0x8/0xc
[   25.158236] Code: a94153f3 a9425bf5 a8c37bfd d65f03c0 (d4210000)
[   25.164509] ---[ end trace 5138591d8b9c9222 ]---

After looking into the kernel code I found that this is caused by commit
1eb59443e72c69edbb836626f9f7f7e82427eeac, whose changes I reproduce below:

diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 921a36fd139d..4e0f3c268103 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -312,6 +312,18 @@ static int dsa_ds_apply(struct dsa_switch_tree *dst, struct dsa_switch *ds)
          if (err < 0)
                  return err;

+       if (!ds->slave_mii_bus && ds->drv->phy_read) {
+               ds->slave_mii_bus = devm_mdiobus_alloc(ds->dev);
+               if (!ds->slave_mii_bus)
+                       return -ENOMEM;
+
+               dsa_slave_mii_bus_init(ds);
+
+               err = mdiobus_register(ds->slave_mii_bus);
+               if (err < 0)
+                       return err;
+       }
+
          for (index = 0; index < DSA_MAX_PORTS; index++) {
                  port = ds->ports[index].dn;
                  if (!port)
@@ -361,6 +373,9 @@ static void dsa_ds_unapply(struct dsa_switch_tree *dst, struct dsa_switch *ds)

                  dsa_user_port_unapply(port, index, ds);
          }
+
+       if (ds->slave_mii_bus && ds->drv->phy_read)
+               mdiobus_unregister(ds->slave_mii_bus);
   }

   static int dsa_dst_apply(struct dsa_switch_tree *dst)

This patch looks buggy to me. If its purpose is to handle drivers that call 
dsa_ds_apply() with ds->slave_mii_bus set to NULL and ds->ops->phy_read defined, 
then it should also take into account drivers that already have both 
ds->slave_mii_bus and ds->ops->phy_read defined, and it should NOT call 
mdiobus_unregister() for them during dsa_ds_unapply(), because DSA should NOT 
undo an operation it never did.

So I see two possible solutions:

1) having both ds->slave_mii_bus and ds->ops->phy_read already defined is an 
error, and it must be signaled to the calling code, or

2) we have to use a flag to signal dsa_ds_unapply() what to do.

I don't know DSA well enough to provide the right-thing(TM), so I'm waiting for a 
reply before proposing a patch. :-)
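
Solution 2 could look roughly like the following userspace sketch. It uses mock types rather than the real DSA structures. The flag bus_owned_by_dsa and the names ds_apply/ds_unapply are hypothetical, and the assignments to "registered" only mark where mdiobus_register() and mdiobus_unregister() would be called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mii_bus { bool registered; };	/* mock bus, not the kernel structure */

struct dsa_switch_mock {
	struct mii_bus *slave_mii_bus;
	bool has_phy_read;		/* stands in for ds->ops->phy_read != NULL */
	bool bus_owned_by_dsa;		/* hypothetical flag: DSA registered the bus */
	struct mii_bus dsa_bus;		/* storage used when DSA sets up the bus */
};

static void ds_apply(struct dsa_switch_mock *ds)
{
	assert(ds != NULL);
	if (!ds->slave_mii_bus && ds->has_phy_read) {
		ds->slave_mii_bus = &ds->dsa_bus;
		ds->slave_mii_bus->registered = true;	/* mdiobus_register() */
		ds->bus_owned_by_dsa = true;
	}
}

static void ds_unapply(struct dsa_switch_mock *ds)
{
	assert(ds != NULL);
	/* Undo only what ds_apply() did; a driver-owned bus is left alone. */
	if (ds->bus_owned_by_dsa) {
		ds->slave_mii_bus->registered = false;	/* mdiobus_unregister() */
		ds->slave_mii_bus = NULL;
		ds->bus_owned_by_dsa = false;
	}
}
```

With this shape, a driver that registered its own bus keeps full responsibility for unregistering it, which avoids the double unregister seen in the trace above.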

Ciao,

Rodolfo

-- 
GNU/Linux Solutions                  e-mail: giometti@enneenne.com
Linux Device Driver                          giometti@linux.it
Embedded Systems                     phone:  +39 349 2432127
UNIX programming                     skype:  rodolfo.giometti


* [PATCH net-next] net/tls: Disable async decrytion for tls1.3
From: Vakul Garg @ 2019-02-09  7:53 UTC (permalink / raw)
  To: netdev@vger.kernel.org
  Cc: borisp@mellanox.com, aviadye@mellanox.com, davejwatson@fb.com,
	davem@davemloft.net, doronrk@fb.com, Vakul Garg

Function tls_sw_recvmsg() dequeues multiple records from the stream parser
and decrypts them. When decryption is done by an async accelerator,
records may get submitted for decryption while the previous ones have
not been decrypted yet. For tls1.3, the record type is known only
after decryption. Therefore, for tls1.3, tls_sw_recvmsg() may submit
records for decryption even if it gets 'handshake' records after 'data'
records. These intermediate 'handshake' records may perform a key update.
By the time new keys are given to ktls by userspace, it is possible that
ktls has already submitted some records (which are encrypted with the new
keys) for decryption using the old keys. This would lead to a decryption
failure. Therefore, async decryption of records should be disabled for tls1.3.

Fixes: 130b392c6cd6b ("net: tls: Add tls 1.3 support")
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
---
 net/tls/tls_sw.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 8051a9164139..fe8c287cbaa1 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2215,8 +2215,12 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx)
 
 	if (sw_ctx_rx) {
 		tfm = crypto_aead_tfm(sw_ctx_rx->aead_recv);
-		sw_ctx_rx->async_capable =
-			tfm->__crt_alg->cra_flags & CRYPTO_ALG_ASYNC;
+
+		if (crypto_info->version == TLS_1_3_VERSION)
+			sw_ctx_rx->async_capable = false;
+		else
+			sw_ctx_rx->async_capable =
+				tfm->__crt_alg->cra_flags & CRYPTO_ALG_ASYNC;
 
 		/* Set up strparser */
 		memset(&cb, 0, sizeof(cb));
-- 
2.13.6



* Re: [PATCH v2 net-next] net: phy: disregard "Clause 22 registers present" bit in get_phy_c45_devs_in_pkg
From: David Miller @ 2019-02-09  7:11 UTC (permalink / raw)
  To: hkallweit1; +Cc: andrew, f.fainelli, netdev
In-Reply-To: <de542d98-a5c4-3dea-18de-a630f11a945c@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Fri, 8 Feb 2019 19:25:22 +0100

> Bit 0 in register 1.5 doesn't represent a device but is a flag that
> Clause 22 registers are present. Therefore disregard this bit when
> populating the device list. If code needs this information it
> should read register 1.5 directly instead of accessing the device
> list.
> Because this bit doesn't represent a device don't define a
> MDIO_MMD_XYZ constant, just define a MDIO_DEVS_XYZ constant for
> the flag in the device list bitmap.
> 
> v2:
> - make masking of bit 0 more explicit
> - improve commit message
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Applied, thanks Heiner.
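
For readers following the thread, the masking described in the quoted commit message can be sketched in userspace as below. The macro DEVS_C22_PRESENT and the helper split_devs are hypothetical names chosen for this illustration; the applied patch may name and structure things differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the "Clause 22 registers present" flag (bit 0). */
#define DEVS_C22_PRESENT (UINT32_C(1) << 0)

/*
 * Split a raw devices-in-package value into the flag and the real
 * device bitmap. The returned bitmap never has bit 0 set.
 */
static uint32_t split_devs(uint32_t raw, int *c22_present)
{
	assert(c22_present != NULL);
	*c22_present = (raw & DEVS_C22_PRESENT) != 0;
	return raw & ~DEVS_C22_PRESENT;
}
```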


* Re: [PATCH net-next 0/5] mvpp2 phylink fixes
From: David Miller @ 2019-02-09  7:09 UTC (permalink / raw)
  To: linux; +Cc: antoine.tenart, maxime.chevallier, baruch, netdev, sven.auhagen
In-Reply-To: <20190208153432.igh26ubphiljsswa@shell.armlinux.org.uk>

From: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Date: Fri, 8 Feb 2019 15:34:32 +0000

> Having spent a while debugging issues with Sven Auhagen, it appears
> that the mvpp2 network driver's phylink support isn't quite correct.
> 
> This series fixes that up, but, despite being tested locally, by
> Sven, and by Antoine, I would prefer it to be applied to net-next
> so that there is time for more people to test before it hits -rc or
> stable backports.
 ...

Series applied to net-next, thanks.


* Re: [PATCH net-next 0/2] Revert wake_on_lan devlink parameter
From: David Miller @ 2019-02-09  7:07 UTC (permalink / raw)
  To: vasundhara-v.volam; +Cc: michael.chan, jiri, netdev
In-Reply-To: <1549617190-387130-1-git-send-email-vasundhara-v.volam@broadcom.com>

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Date: Fri,  8 Feb 2019 14:43:08 +0530

> As per discussion with Jakub Kicinski and Michal Kubecek,
> this will be better addressed by the soon-to-come ethtool netlink
> API with additional indication that given WoL configuration request
> is supposed to be persisted.
> 
> Retain bnxt_en code for devlink port param table registration.
> There will be follow up patches to add some devlink port params
> for bnxt_en driver.

Please fix the kbuild robot reported build failure and repost.


* Re: [PATCH net-next] ethtool: Remove unnecessary null check in ethtool_rx_flow_rule_create
From: David Miller @ 2019-02-09  7:05 UTC (permalink / raw)
  To: natechancellor; +Cc: netdev, linux-kernel, pablo, jiri, ndesaulniers
In-Reply-To: <20190208044652.32166-1-natechancellor@gmail.com>

From: Nathan Chancellor <natechancellor@gmail.com>
Date: Thu,  7 Feb 2019 21:46:53 -0700

> net/core/ethtool.c:3023:19: warning: address of array
> 'ext_m_spec->h_dest' will always evaluate to 'true'
> [-Wpointer-bool-conversion]
>                 if (ext_m_spec->h_dest) {
>                 ~~  ~~~~~~~~~~~~^~~~~~
> 
> h_dest is an array, it can't be null so remove this check.
> 
> Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator")
> Link: https://github.com/ClangBuiltLinux/linux/issues/353
> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

Applied.
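
The warning arises because h_dest is an array embedded in the structure, so its address can never be NULL. The patch simply drops the check. The sketch below shows what a content-based test would look like if one were ever wanted, which is an assumption of this example and not something the patch does. struct ext_spec and dest_mask_is_set are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ext_spec {		/* hypothetical stand-in for the ethtool spec */
	unsigned char h_dest[6];
};

/* Tests the bytes of the array; the array itself can never be NULL. */
static bool dest_mask_is_set(const struct ext_spec *spec)
{
	size_t i;

	assert(spec != NULL);
	for (i = 0; i < sizeof(spec->h_dest); i++)
		if (spec->h_dest[i])
			return true;
	return false;
}
```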


* Re: [PATCH net-next] ixgbe: Use struct_size() helper
From: David Miller @ 2019-02-09  7:04 UTC (permalink / raw)
  To: gustavo; +Cc: jeffrey.t.kirsher, intel-wired-lan, netdev, linux-kernel
In-Reply-To: <20190208042258.GA32468@embeddedor>

From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Date: Thu, 7 Feb 2019 22:22:58 -0600

> One of the more common cases of allocation size calculations is finding
> the size of a structure that has a zero-sized array at the end, along
> with memory for some number of elements for that array. For example:
> 
> struct foo {
>     int stuff;
>     struct boo entry[];
> };
> 
> size = sizeof(struct foo) + count * sizeof(struct boo);
> instance = kzalloc(size, GFP_KERNEL);
> 
> Instead of leaving these open-coded and prone to type mistakes, we can
> now use the new struct_size() helper:
> 
> instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
> 
> Notice that, in this case, variable size is not necessary, hence
> it is removed.
> 
> This code was detected with the help of Coccinelle.
> 
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

Applied.
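
For readers outside the kernel tree, a rough userspace analogue of the quoted pattern follows. flex_size and foo_alloc are hypothetical names. Returning SIZE_MAX on overflow mirrors the saturating behaviour that struct_size() is understood to have, but that detail should be checked against include/linux/overflow.h rather than taken from this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct boo { int value; };

struct foo {
	int stuff;
	struct boo entry[];	/* flexible array member */
};

/* Bytes for a header plus count elements, or SIZE_MAX on overflow. */
static size_t flex_size(size_t header, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + elem * count;
}

/* Allocates a zeroed struct foo with room for count entries, or returns NULL. */
static struct foo *foo_alloc(size_t count)
{
	size_t size = flex_size(sizeof(struct foo), sizeof(struct boo), count);

	if (size == SIZE_MAX)
		return NULL;
	return calloc(1, size);
}
```

The open-coded form "sizeof(struct foo) + count * sizeof(struct boo)" would silently wrap for a huge count. The helper turns that case into a failed allocation instead.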


* Re: [PATCH net-next] igc: Use struct_size() helper
From: David Miller @ 2019-02-09  7:04 UTC (permalink / raw)
  To: gustavo; +Cc: jeffrey.t.kirsher, intel-wired-lan, netdev, linux-kernel
In-Reply-To: <20190208041945.GA4687@embeddedor>

From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Date: Thu, 7 Feb 2019 22:19:45 -0600

> One of the more common cases of allocation size calculations is finding
> the size of a structure that has a zero-sized array at the end, along
> with memory for some number of elements for that array. For example:
> 
> struct foo {
>     int stuff;
>     struct boo entry[];
> };
> 
> size = sizeof(struct foo) + count * sizeof(struct boo);
> instance = kzalloc(size, GFP_KERNEL);
> 
> Instead of leaving these open-coded and prone to type mistakes, we can
> now use the new struct_size() helper:
> 
> instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
> 
> Notice that, in this case, variable size is not necessary, hence
> it is removed.
> 
> This code was detected with the help of Coccinelle.
> 
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

Applied.


* Re: [PATCH net-next] igb: use struct_size() helper
From: David Miller @ 2019-02-09  7:04 UTC (permalink / raw)
  To: gustavo; +Cc: jeffrey.t.kirsher, intel-wired-lan, netdev, linux-kernel
In-Reply-To: <20190208041540.GA28817@embeddedor>

From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Date: Thu, 7 Feb 2019 22:15:40 -0600

> One of the more common cases of allocation size calculations is finding
> the size of a structure that has a zero-sized array at the end, along
> with memory for some number of elements for that array. For example:
> 
> struct foo {
>     int stuff;
>     struct boo entry[];
> };
> 
> size = sizeof(struct foo) + count * sizeof(struct boo);
> instance = alloc(size, GFP_KERNEL);
> 
> Instead of leaving these open-coded and prone to type mistakes, we can
> now use the new struct_size() helper:
> 
> size = struct_size(instance, entry, count);
> 
> This code was detected with the help of Coccinelle.
> 
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

Applied.


* Re: [PATCH net-next] net: phy: don't double-read link status register if link is up
From: David Miller @ 2019-02-09  7:02 UTC (permalink / raw)
  To: hkallweit1; +Cc: andrew, f.fainelli, netdev
In-Reply-To: <fd5559ed-6843-9ce5-fbb2-d9ffa9eb92e9@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Thu, 7 Feb 2019 20:22:20 +0100

> The link status register latches link-down events. Therefore, if link
> is reported as being up, there's no need for a second read.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Looks good.

Applied, thanks.
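
The latching behaviour behind this change can be modelled with a simulated PHY. Everything below is an illustrative userspace sketch: struct fake_phy, fake_read_status and read_link are hypothetical, and only the BMSR_LSTATUS bit value is taken from the standard MII status register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BMSR_LSTATUS 0x0004	/* link status bit of the MII status register */

struct fake_phy {		/* simulated PHY, purely for illustration */
	bool link_now;		/* current link state */
	bool latched_down;	/* a link-down event happened since the last read */
	int reads;		/* number of status reads issued */
};

/* Latching read: reports "down" once if a drop was latched, then clears it. */
static int fake_read_status(struct fake_phy *phy)
{
	bool up = phy->link_now && !phy->latched_down;

	phy->latched_down = false;
	phy->reads++;
	return up ? BMSR_LSTATUS : 0;
}

/* Returns 1 if the link is up, 0 if it is down. */
static int read_link(struct fake_phy *phy)
{
	int status;

	assert(phy != NULL);
	status = fake_read_status(phy);
	/* "Up" cannot be a stale latched value; only "down" needs a second read. */
	if (!(status & BMSR_LSTATUS))
		status = fake_read_status(phy);
	return (status & BMSR_LSTATUS) ? 1 : 0;
}
```

In this model a stable link costs one read, while a link that dropped and recovered costs two and is still reported as up.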


* Re: [PATCH] net: stmmac: Variable "val" in function sun8i_dwmac_set_syscon() could be uninitialized
From: David Miller @ 2019-02-09  7:01 UTC (permalink / raw)
  To: yzhai003
  Cc: csong, zhiyunq, peppe.cavallaro, alexandre.torgue, maxime.ripard,
	wens, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <20190207174623.16712-1-yzhai003@ucr.edu>

From: Yizhuo <yzhai003@ucr.edu>
Date: Thu,  7 Feb 2019 09:46:23 -0800

> In function sun8i_dwmac_set_syscon(), local variable "val" could
> be uninitialized if function regmap_read() returns -EINVAL.
> However, it will be used directly in the if statement, which
> is potentially unsafe.
> 
> Signed-off-by: Yizhuo <yzhai003@ucr.edu>

This doesn't apply to any of my trees.
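
Independent of which tree the patch targets, the pattern it describes can be sketched as follows. mock_reg_read and check_syscon are hypothetical userspace stand-ins, not the regmap API or the driver code. The sketch only shows that the value is consumed after the return code has been checked.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock register read that can fail without ever writing *val. */
static int mock_reg_read(int fail, unsigned int *val)
{
	assert(val != NULL);
	if (fail)
		return -EINVAL;
	*val = 0x5;
	return 0;
}

/* Returns a negative errno on read failure, 0 on match, 1 on mismatch. */
static int check_syscon(int fail, unsigned int expected)
{
	unsigned int val = 0;	/* defensive initialisation */
	int ret;

	ret = mock_reg_read(fail, &val);
	if (ret)
		return ret;	/* val is not inspected on failure */

	return val == expected ? 0 : 1;
}
```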


* Re: [PATCH net-next] fm10k: use struct_size() in kzalloc()
From: David Miller @ 2019-02-09  6:58 UTC (permalink / raw)
  To: gustavo; +Cc: jeffrey.t.kirsher, intel-wired-lan, netdev, linux-kernel
In-Reply-To: <20190208035537.GA12318@embeddedor>

From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Date: Thu, 7 Feb 2019 21:55:37 -0600

> One of the more common cases of allocation size calculations is finding
> the size of a structure that has a zero-sized array at the end, along
> with memory for some number of elements for that array. For example:
> 
> struct foo {
>     int stuff;
>     struct boo entry[];
> };
> 
> size = sizeof(struct foo) + count * sizeof(struct boo);
> instance = kzalloc(size, GFP_KERNEL);
> 
> Instead of leaving these open-coded and prone to type mistakes, we can
> now use the new struct_size() helper:
> 
> instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
> 
> Notice that, in this case, variable size is not necessary, hence
> it is removed.
> 
> This code was detected with the help of Coccinelle.
> 
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

Applied.


* Re: [PATCH net-next] nfp: flower: cmsg: use struct_size() helper
From: David Miller @ 2019-02-09  6:58 UTC (permalink / raw)
  To: gustavo; +Cc: jakub.kicinski, oss-drivers, netdev, linux-kernel
In-Reply-To: <20190208034725.GA12043@embeddedor>

From: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Date: Thu, 7 Feb 2019 21:47:25 -0600

> One of the more common cases of allocation size calculations is finding
> the size of a structure that has a zero-sized array at the end, along
> with memory for some number of elements for that array. For example:
> 
> struct foo {
>     int stuff;
>     void *entry[];
> };
> 
> size = sizeof(struct foo) + count * sizeof(void *);
> instance = alloc(size, GFP_KERNEL);
> 
> Instead of leaving these open-coded and prone to type mistakes, we can
> now use the new struct_size() helper:
> 
> instance = alloc(struct_size(instance, entry, count), GFP_KERNEL);
> 
> Notice that, in this case, variable size is not necessary, hence
> it is removed.
> 
> This code was detected with the help of Coccinelle.
> 
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

Applied.
