Netdev List


* [PATCH 0/5] Add support in dwmac-sun8i for accessing EMAC clock
From: Icenowy Zheng @ 2018-04-11 14:16 UTC (permalink / raw)
  To: Rob Herring, Maxime Ripard, Chen-Yu Tsai, Giuseppe Cavallaro,
	Corentin Labbe
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-sunxi-/JYPxA39Uh5TLH3MbocFFw, Icenowy Zheng

On some Allwinner SoCs, the EMAC clock register is in another device's
memory space, e.g. on A64 it's in the memory space of SRAM controller.

This patchset adds the possibility for the device to export the EMAC
clock register as a single-register regmap.

PATCH 1 adds the device tree binding for dwmac-sun8i to use another
device's regmap.

PATCH 2 and 3 are dwmac-sun8i refactors.

PATCH 4 exports the EMAC clock regmap in the SRAM controller driver.

PATCH 5 enables the SRAM controller in the A64 device tree (replaces
syscon), and binds dwmac-sun8i to it.

Chen-Yu Tsai (2):
  net: stmmac: dwmac-sun8i: Use regmap_field for syscon register access
  net: stmmac: dwmac-sun8i: Allow getting syscon regmap from device

Icenowy Zheng (3):
  dt-bindings: allow dwmac-sun8i to use other devices' exported regmap
  drivers: soc: sunxi: export a regmap for EMAC clock reg on A64
  arm64: allwinner: a64: add SRAM controller device tree node

 .../devicetree/bindings/net/dwmac-sun8i.txt        |  5 +-
 arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi      | 23 +++++-
 drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c  | 85 +++++++++++++++++++---
 drivers/soc/sunxi/sunxi_sram.c                     | 48 +++++++++++-
 4 files changed, 141 insertions(+), 20 deletions(-)

-- 
2.15.1


* Re: [PATCH net 4/6] bnxt_en: Support max-mtu with VF-reps
From: David Miller @ 2018-04-11 14:08 UTC (permalink / raw)
  To: michael.chan; +Cc: netdev, sriharsha.basavapatna
In-Reply-To: <1523419093-18637-5-git-send-email-michael.chan@broadcom.com>

From: Michael Chan <michael.chan@broadcom.com>
Date: Tue, 10 Apr 2018 23:58:11 -0400

> From: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
> 
> While a VF is configured with a bigger mtu (> 1500), any packets that
> are punted to the VF-rep (slow-path) get dropped by OVS kernel-datapath
> with the following message: "dropped over-mtu packet". Fix this by
> returning the max-mtu value for a VF-rep derived from its corresponding VF.
> VF-rep's mtu can be changed using 'ip' command as shown in this example:
> 
> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
> Signed-off-by: Michael Chan <michael.chan@broadcom.com>

This commit message appears to be truncated; the example 'ip' command
it mentions is missing.


* Re: WARNING: possible recursive locking detected
From: Dmitry Vyukov @ 2018-04-11 14:05 UTC (permalink / raw)
  To: syzbot
  Cc: Christian Brauner, David Miller, David Ahern, Florian Westphal,
	Jiri Benc, Kirill Tkhai, LKML, Xin Long, netdev, syzkaller-bugs
In-Reply-To: <94eb2c056114525fed05699315f1@google.com>

On Wed, Apr 11, 2018 at 4:02 PM, syzbot
<syzbot+3c43eecd7745a5ce1640@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> b284d4d5a6785f8cd07eda2646a95782373cd01e (Tue Apr 10 19:25:30 2018 +0000)
> Merge tag 'ceph-for-4.17-rc1' of git://github.com/ceph/ceph-client
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=3c43eecd7745a5ce1640
>
> So far this crash happened 3 times on upstream.
> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5103706542440448
> syzkaller reproducer:
> https://syzkaller.appspot.com/x/repro.syz?id=5641659786199040
> Raw console output:
> https://syzkaller.appspot.com/x/log.txt?id=5099510896263168
> Kernel config:
> https://syzkaller.appspot.com/x/.config?id=-1223000601505858474
> compiler: gcc (GCC) 8.0.1 20180301 (experimental)
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+3c43eecd7745a5ce1640@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.

#syz dup: possible deadlock in rtnl_lock (5)

> IPVS: sync thread started: state = BACKUP, mcast_ifn = lo, syncid = 0, id =
> 0
> IPVS: stopping backup sync thread 4546 ...
>
> ============================================
> IPVS: stopping backup sync thread 4559 ...
> WARNING: possible recursive locking detected
> 4.16.0+ #19 Not tainted
> --------------------------------------------
> syzkaller046099/4543 is trying to acquire lock:
> 000000008d06d497 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20
> net/core/rtnetlink.c:74
>
> but task is already holding lock:
> IPVS: stopping backup sync thread 4557 ...
> 000000008d06d497 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20
> net/core/rtnetlink.c:74
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(rtnl_mutex);
>   lock(rtnl_mutex);
>
>  *** DEADLOCK ***
>
>  May be due to missing lock nesting notation
>
> 2 locks held by syzkaller046099/4543:
>  #0: 000000008d06d497 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20
> net/core/rtnetlink.c:74
>  #1: 000000008326bc5c (ipvs->sync_mutex){+.+.}, at:
> do_ip_vs_set_ctl+0x562/0x1d30 net/netfilter/ipvs/ip_vs_ctl.c:2388
>
> stack backtrace:
> CPU: 1 PID: 4543 Comm: syzkaller046099 Not tainted 4.16.0+ #19
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>  print_deadlock_bug kernel/locking/lockdep.c:1761 [inline]
>  check_deadlock kernel/locking/lockdep.c:1805 [inline]
>  validate_chain kernel/locking/lockdep.c:2401 [inline]
>  __lock_acquire.cold.62+0x18c/0x55b kernel/locking/lockdep.c:3431
>  lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
>  __mutex_lock_common kernel/locking/mutex.c:756 [inline]
>  __mutex_lock+0x16d/0x17f0 kernel/locking/mutex.c:893
>  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>  rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
>  ip_mc_drop_socket+0x8f/0x270 net/ipv4/igmp.c:2643
>  inet_release+0x4e/0x1f0 net/ipv4/af_inet.c:413
>  sock_release+0x96/0x1b0 net/socket.c:594
>  start_sync_thread+0xdc3/0x2d40 net/netfilter/ipvs/ip_vs_sync.c:1924
>  do_ip_vs_set_ctl+0x59c/0x1d30 net/netfilter/ipvs/ip_vs_ctl.c:2389
>  nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
>  nf_setsockopt+0x7d/0xd0 net/netfilter/nf_sockopt.c:115
>  ip_setsockopt+0xd8/0xf0 net/ipv4/ip_sockglue.c:1253
>  udp_setsockopt+0x62/0xa0 net/ipv4/udp.c:2413
>  ipv6_setsockopt+0x149/0x170 net/ipv6/ipv6_sockglue.c:917
>  udpv6_setsockopt+0x62/0xa0 net/ipv6/udp.c:1424
>  sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3039
>  __sys_setsockopt+0x1bd/0x390 net/socket.c:1903
>  SYSC_setsockopt net/socket.c:1914 [inline]
>  SyS_setsockopt+0x34/0x50 net/socket.c:1911
>  do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x447c19
> RSP: 002b:00007fb627a93db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
> RAX: ffffffffffffffda RBX: 0000000000700024 RCX: 0000000000447c19
> RDX: 000000000000048b RSI: 0000000000000000 RDI: 0000000000000004
> RBP: 0000000000700020 R08: 0000000000000018 R09: 0000000000000000
> R10: 0000000020000100 R11: 0000000000000246 R12: 0000000000000000
> R13: 000000000080fe4f R14: 00007fb627a949c0 R15: 0000000000002710
>
>
> ---
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report.
> If you forgot to add the Reported-by tag, once the fix for this bug is
> merged
> into any tree, please reply to this email with:
> #syz fix: exact-commit-title
> If you want to test a patch for this bug, please reply with:
> #syz test: git://repo/address.git branch
> and provide the patch inline or as an attachment.
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug
> report.
> Note: all commands must start from beginning of the line in the email body.
>
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/syzkaller-bugs/94eb2c056114525fed05699315f1%40google.com.
> For more options, visit https://groups.google.com/d/optout.


* WARNING: possible recursive locking detected
From: syzbot @ 2018-04-11 14:02 UTC (permalink / raw)
  To: christian.brauner, davem, dsahern, fw, jbenc, ktkhai,
	linux-kernel, lucien.xin, netdev, syzkaller-bugs

Hello,

syzbot hit the following crash on upstream commit
b284d4d5a6785f8cd07eda2646a95782373cd01e (Tue Apr 10 19:25:30 2018 +0000)
Merge tag 'ceph-for-4.17-rc1' of git://github.com/ceph/ceph-client
syzbot dashboard link:  
https://syzkaller.appspot.com/bug?extid=3c43eecd7745a5ce1640

So far this crash happened 3 times on upstream.
C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5103706542440448
syzkaller reproducer:  
https://syzkaller.appspot.com/x/repro.syz?id=5641659786199040
Raw console output:  
https://syzkaller.appspot.com/x/log.txt?id=5099510896263168
Kernel config:  
https://syzkaller.appspot.com/x/.config?id=-1223000601505858474
compiler: gcc (GCC) 8.0.1 20180301 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+3c43eecd7745a5ce1640@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for  
details.
If you forward the report, please keep this part and the footer.

IPVS: sync thread started: state = BACKUP, mcast_ifn = lo, syncid = 0, id =  
0
IPVS: stopping backup sync thread 4546 ...

============================================
IPVS: stopping backup sync thread 4559 ...
WARNING: possible recursive locking detected
4.16.0+ #19 Not tainted
--------------------------------------------
syzkaller046099/4543 is trying to acquire lock:
000000008d06d497 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20  
net/core/rtnetlink.c:74

but task is already holding lock:
IPVS: stopping backup sync thread 4557 ...
000000008d06d497 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20  
net/core/rtnetlink.c:74

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(rtnl_mutex);
   lock(rtnl_mutex);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

2 locks held by syzkaller046099/4543:
  #0: 000000008d06d497 (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20  
net/core/rtnetlink.c:74
  #1: 000000008326bc5c (ipvs->sync_mutex){+.+.}, at:  
do_ip_vs_set_ctl+0x562/0x1d30 net/netfilter/ipvs/ip_vs_ctl.c:2388

stack backtrace:
CPU: 1 PID: 4543 Comm: syzkaller046099 Not tainted 4.16.0+ #19
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  print_deadlock_bug kernel/locking/lockdep.c:1761 [inline]
  check_deadlock kernel/locking/lockdep.c:1805 [inline]
  validate_chain kernel/locking/lockdep.c:2401 [inline]
  __lock_acquire.cold.62+0x18c/0x55b kernel/locking/lockdep.c:3431
  lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
  __mutex_lock_common kernel/locking/mutex.c:756 [inline]
  __mutex_lock+0x16d/0x17f0 kernel/locking/mutex.c:893
  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
  rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
  ip_mc_drop_socket+0x8f/0x270 net/ipv4/igmp.c:2643
  inet_release+0x4e/0x1f0 net/ipv4/af_inet.c:413
  sock_release+0x96/0x1b0 net/socket.c:594
  start_sync_thread+0xdc3/0x2d40 net/netfilter/ipvs/ip_vs_sync.c:1924
  do_ip_vs_set_ctl+0x59c/0x1d30 net/netfilter/ipvs/ip_vs_ctl.c:2389
  nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
  nf_setsockopt+0x7d/0xd0 net/netfilter/nf_sockopt.c:115
  ip_setsockopt+0xd8/0xf0 net/ipv4/ip_sockglue.c:1253
  udp_setsockopt+0x62/0xa0 net/ipv4/udp.c:2413
  ipv6_setsockopt+0x149/0x170 net/ipv6/ipv6_sockglue.c:917
  udpv6_setsockopt+0x62/0xa0 net/ipv6/udp.c:1424
  sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3039
  __sys_setsockopt+0x1bd/0x390 net/socket.c:1903
  SYSC_setsockopt net/socket.c:1914 [inline]
  SyS_setsockopt+0x34/0x50 net/socket.c:1911
  do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x447c19
RSP: 002b:00007fb627a93db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000700024 RCX: 0000000000447c19
RDX: 000000000000048b RSI: 0000000000000000 RDI: 0000000000000004
RBP: 0000000000700020 R08: 0000000000000018 R09: 0000000000000000
R10: 0000000020000100 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000080fe4f R14: 00007fb627a949c0 R15: 0000000000002710


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkaller@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is  
merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug  
report.
Note: all commands must start from beginning of the line in the email body.


* Re: [PATCH net] rds: MP-RDS may use an invalid c_path
From: santosh.shilimkar @ 2018-04-11 14:00 UTC (permalink / raw)
  To: Ka-Cheong Poon, netdev; +Cc: davem, rds-devel
In-Reply-To: <1523433445-7596-1-git-send-email-ka-cheong.poon@oracle.com>

On 4/11/18 12:57 AM, Ka-Cheong Poon wrote:
> rds_sendmsg() calls rds_send_mprds_hash() to find a c_path to use to
> send a message.  Suppose the RDS connection is not yet up.  In
> rds_send_mprds_hash(), it does
> 
> 	if (conn->c_npaths == 0)
> 		wait_event_interruptible(conn->c_hs_waitq,
> 					 (conn->c_npaths != 0));
> 
> If it is interrupted before the connection is set up,
> rds_send_mprds_hash() will return a non-zero hash value.  Hence
> rds_sendmsg() will use a non-zero c_path to send the message.  But if
> the RDS connection ends up being non-MP capable, the message will be
> lost as only the zero c_path can be used.
> 
> Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
> ---
Thanks for posting the fix upstream as well.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>


* Re: [PATCH] make net_gso_ok return false when gso_type is zero(invalid)
From: Wenhua Shi @ 2018-04-11 13:59 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: David Miller, netdev, linux-kernel
In-Reply-To: <CAN6D2nrPeWG-LxurpT=u8-kS-2jVQPpWnkth7bxqFn_U=VZJvA@mail.gmail.com>

> Note that TCP stack now works with GSO being always on.
> 0a6b2a1dc2a2 ("tcp: switch to GSO being always on")

I've tested on the latest net-next branch
17dec0a949153d9ac00760ba2f5b78cb583e995f. The problem still exists. My
patch won't work. Reverting commit 0a6b2a1dc2a2 won't help.


* Re: [PATCH] vhost: Fix vhost_copy_to_user()
From: Michael S. Tsirkin @ 2018-04-11 13:51 UTC (permalink / raw)
  To: Eric Auger
  Cc: kvm, netdev, linux-kernel, virtualization, stefanha, kvmarm,
	eric.auger.pro
In-Reply-To: <1523453438-4266-1-git-send-email-eric.auger@redhat.com>

On Wed, Apr 11, 2018 at 03:30:38PM +0200, Eric Auger wrote:
> vhost_copy_to_user is used to copy vring used elements to userspace.
> We should use VHOST_ADDR_USED instead of VHOST_ADDR_DESC.
> 
> Fixes: f88949138058 ("vhost: introduce O(1) vq metadata cache")
> Signed-off-by: Eric Auger <eric.auger@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
> 
> This fixes a stall observed when running an aarch64 guest with
> virtual smmu
> ---
>  drivers/vhost/vhost.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index bec722e..f44aead 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -744,7 +744,7 @@ static int vhost_copy_to_user(struct vhost_virtqueue *vq, void __user *to,
>  		struct iov_iter t;
>  		void __user *uaddr = vhost_vq_meta_fetch(vq,
>  				     (u64)(uintptr_t)to, size,
> -				     VHOST_ADDR_DESC);
> +				     VHOST_ADDR_USED);
>  
>  		if (uaddr)
>  			return __copy_to_user(uaddr, from, size);
> -- 
> 2.5.5


* Re: [PATCH] vhost: Fix vhost_copy_to_user()
From: Auger Eric @ 2018-04-11 13:45 UTC (permalink / raw)
  To: Jason Wang, eric.auger.pro, linux-kernel, virtualization, netdev,
	kvm, mst, kvmarm
  Cc: stefanha
In-Reply-To: <85c033b9-b230-7ef9-744c-4e2799684609@redhat.com>

Hi Jason,

On 11/04/18 15:44, Jason Wang wrote:
> 
> 
> On 2018-04-11 21:30, Eric Auger wrote:
>> vhost_copy_to_user is used to copy vring used elements to userspace.
>> We should use VHOST_ADDR_USED instead of VHOST_ADDR_DESC.
>>
>> Fixes: f88949138058 ("vhost: introduce O(1) vq metadata cache")
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> This fixes a stall observed when running an aarch64 guest with
>> virtual smmu
>> ---
>>   drivers/vhost/vhost.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index bec722e..f44aead 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -744,7 +744,7 @@ static int vhost_copy_to_user(struct
>> vhost_virtqueue *vq, void __user *to,
>>           struct iov_iter t;
>>           void __user *uaddr = vhost_vq_meta_fetch(vq,
>>                        (u64)(uintptr_t)to, size,
>> -                     VHOST_ADDR_DESC);
>> +                     VHOST_ADDR_USED);
>>             if (uaddr)
>>               return __copy_to_user(uaddr, from, size);
> 
> Acked-by: Jason Wang <jasowang@redhat.com>
> 
> Thanks!
> 
> Stable material I think.

yes I think so.

Thanks

Eric


* Re: [PATCH] vhost: Fix vhost_copy_to_user()
From: Jason Wang @ 2018-04-11 13:44 UTC (permalink / raw)
  To: Eric Auger, eric.auger.pro, linux-kernel, virtualization, netdev,
	kvm, mst, kvmarm
  Cc: stefanha
In-Reply-To: <1523453438-4266-1-git-send-email-eric.auger@redhat.com>



On 2018-04-11 21:30, Eric Auger wrote:
> vhost_copy_to_user is used to copy vring used elements to userspace.
> We should use VHOST_ADDR_USED instead of VHOST_ADDR_DESC.
>
> Fixes: f88949138058 ("vhost: introduce O(1) vq metadata cache")
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> This fixes a stall observed when running an aarch64 guest with
> virtual smmu
> ---
>   drivers/vhost/vhost.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index bec722e..f44aead 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -744,7 +744,7 @@ static int vhost_copy_to_user(struct vhost_virtqueue *vq, void __user *to,
>   		struct iov_iter t;
>   		void __user *uaddr = vhost_vq_meta_fetch(vq,
>   				     (u64)(uintptr_t)to, size,
> -				     VHOST_ADDR_DESC);
> +				     VHOST_ADDR_USED);
>   
>   		if (uaddr)
>   			return __copy_to_user(uaddr, from, size);

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks!

Stable material I think.


* [PATCH] vhost: Fix vhost_copy_to_user()
From: Eric Auger @ 2018-04-11 13:30 UTC (permalink / raw)
  To: eric.auger.pro, eric.auger, linux-kernel, virtualization, netdev,
	kvm, jasowang, mst, kvmarm
  Cc: stefanha

vhost_copy_to_user is used to copy vring used elements to userspace.
We should use VHOST_ADDR_USED instead of VHOST_ADDR_DESC.

Fixes: f88949138058 ("vhost: introduce O(1) vq metadata cache")
Signed-off-by: Eric Auger <eric.auger@redhat.com>

---

This fixes a stall observed when running an aarch64 guest with
virtual smmu
---
 drivers/vhost/vhost.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index bec722e..f44aead 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -744,7 +744,7 @@ static int vhost_copy_to_user(struct vhost_virtqueue *vq, void __user *to,
 		struct iov_iter t;
 		void __user *uaddr = vhost_vq_meta_fetch(vq,
 				     (u64)(uintptr_t)to, size,
-				     VHOST_ADDR_DESC);
+				     VHOST_ADDR_USED);
 
 		if (uaddr)
 			return __copy_to_user(uaddr, from, size);
-- 
2.5.5


* Re: [PATCH v3 0/2] vhost: fix vhost_vq_access_ok() log check
From: Michael S. Tsirkin @ 2018-04-11 13:24 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, netdev, syzkaller-bugs, linux-kernel, virtualization,
	Linus Torvalds
In-Reply-To: <20180411023541.15776-1-stefanha@redhat.com>

On Wed, Apr 11, 2018 at 10:35:39AM +0800, Stefan Hajnoczi wrote:
> v3:
>  * Rebased onto net/master and resolved conflict [DaveM]
> 
> v2:
>  * Rewrote the conditional to make the vq access check clearer [Linus]
>  * Added Patch 2 to make the return type consistent and harder to misuse [Linus]
> 
> The first patch fixes the vhost virtqueue access check which was recently
> broken.  The second patch replaces the int return type with bool to prevent
> future bugs.

Acked-by: Michael S. Tsirkin <mst@redhat.com>

We need the 1st one on stable I think.


> Stefan Hajnoczi (2):
>   vhost: fix vhost_vq_access_ok() log check
>   vhost: return bool from *_access_ok() functions
> 
>  drivers/vhost/vhost.h |  4 +--
>  drivers/vhost/vhost.c | 70 ++++++++++++++++++++++++++-------------------------
>  2 files changed, 38 insertions(+), 36 deletions(-)
> 
> -- 
> 2.14.3


* Re: [PATCH v3 2/2] vhost: return bool from *_access_ok() functions
From: Michael S. Tsirkin @ 2018-04-11 13:23 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kvm, netdev, syzkaller-bugs, linux-kernel, virtualization,
	Linus Torvalds
In-Reply-To: <20180411023541.15776-3-stefanha@redhat.com>

On Wed, Apr 11, 2018 at 10:35:41AM +0800, Stefan Hajnoczi wrote:
> Currently vhost *_access_ok() functions return int.  This is error-prone
> because there are two popular conventions:
> 
> 1. 0 means failure, 1 means success
> 2. -errno means failure, 0 means success
> 
> Although vhost mostly uses #1, it does not do so consistently.
> umem_access_ok() uses #2.
> 
> This patch changes the return type from int to bool so that false means
> failure and true means success.  This eliminates a potential source of
> errors.
> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/vhost/vhost.h |  4 ++--
>  drivers/vhost/vhost.c | 66 +++++++++++++++++++++++++--------------------------
>  2 files changed, 35 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index d8ee85ae8fdc..6c844b90a168 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -178,8 +178,8 @@ void vhost_dev_cleanup(struct vhost_dev *);
>  void vhost_dev_stop(struct vhost_dev *);
>  long vhost_dev_ioctl(struct vhost_dev *, unsigned int ioctl, void __user *argp);
>  long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *argp);
> -int vhost_vq_access_ok(struct vhost_virtqueue *vq);
> -int vhost_log_access_ok(struct vhost_dev *);
> +bool vhost_vq_access_ok(struct vhost_virtqueue *vq);
> +bool vhost_log_access_ok(struct vhost_dev *);
>  
>  int vhost_get_vq_desc(struct vhost_virtqueue *,
>  		      struct iovec iov[], unsigned int iov_count,
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index fc805b7fad9d..0fcb51a9940c 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -641,14 +641,14 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
>  }
>  EXPORT_SYMBOL_GPL(vhost_dev_cleanup);
>  
> -static int log_access_ok(void __user *log_base, u64 addr, unsigned long sz)
> +static bool log_access_ok(void __user *log_base, u64 addr, unsigned long sz)
>  {
>  	u64 a = addr / VHOST_PAGE_SIZE / 8;
>  
>  	/* Make sure 64 bit math will not overflow. */
>  	if (a > ULONG_MAX - (unsigned long)log_base ||
>  	    a + (unsigned long)log_base > ULONG_MAX)
> -		return 0;
> +		return false;
>  
>  	return access_ok(VERIFY_WRITE, log_base + a,
>  			 (sz + VHOST_PAGE_SIZE * 8 - 1) / VHOST_PAGE_SIZE / 8);
> @@ -661,30 +661,30 @@ static bool vhost_overflow(u64 uaddr, u64 size)
>  }
>  
>  /* Caller should have vq mutex and device mutex. */
> -static int vq_memory_access_ok(void __user *log_base, struct vhost_umem *umem,
> -			       int log_all)
> +static bool vq_memory_access_ok(void __user *log_base, struct vhost_umem *umem,
> +				int log_all)
>  {
>  	struct vhost_umem_node *node;
>  
>  	if (!umem)
> -		return 0;
> +		return false;
>  
>  	list_for_each_entry(node, &umem->umem_list, link) {
>  		unsigned long a = node->userspace_addr;
>  
>  		if (vhost_overflow(node->userspace_addr, node->size))
> -			return 0;
> +			return false;
>  
>  
>  		if (!access_ok(VERIFY_WRITE, (void __user *)a,
>  				    node->size))
> -			return 0;
> +			return false;
>  		else if (log_all && !log_access_ok(log_base,
>  						   node->start,
>  						   node->size))
> -			return 0;
> +			return false;
>  	}
> -	return 1;
> +	return true;
>  }
>  
>  static inline void __user *vhost_vq_meta_fetch(struct vhost_virtqueue *vq,
> @@ -701,13 +701,13 @@ static inline void __user *vhost_vq_meta_fetch(struct vhost_virtqueue *vq,
>  
>  /* Can we switch to this memory table? */
>  /* Caller should have device mutex but not vq mutex */
> -static int memory_access_ok(struct vhost_dev *d, struct vhost_umem *umem,
> -			    int log_all)
> +static bool memory_access_ok(struct vhost_dev *d, struct vhost_umem *umem,
> +			     int log_all)
>  {
>  	int i;
>  
>  	for (i = 0; i < d->nvqs; ++i) {
> -		int ok;
> +		bool ok;
>  		bool log;
>  
>  		mutex_lock(&d->vqs[i]->mutex);
> @@ -717,12 +717,12 @@ static int memory_access_ok(struct vhost_dev *d, struct vhost_umem *umem,
>  			ok = vq_memory_access_ok(d->vqs[i]->log_base,
>  						 umem, log);
>  		else
> -			ok = 1;
> +			ok = true;
>  		mutex_unlock(&d->vqs[i]->mutex);
>  		if (!ok)
> -			return 0;
> +			return false;
>  	}
> -	return 1;
> +	return true;
>  }
>  
>  static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
> @@ -959,21 +959,21 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
>  	spin_unlock(&d->iotlb_lock);
>  }
>  
> -static int umem_access_ok(u64 uaddr, u64 size, int access)
> +static bool umem_access_ok(u64 uaddr, u64 size, int access)
>  {
>  	unsigned long a = uaddr;
>  
>  	/* Make sure 64 bit math will not overflow. */
>  	if (vhost_overflow(uaddr, size))
> -		return -EFAULT;
> +		return false;
>  
>  	if ((access & VHOST_ACCESS_RO) &&
>  	    !access_ok(VERIFY_READ, (void __user *)a, size))
> -		return -EFAULT;
> +		return false;
>  	if ((access & VHOST_ACCESS_WO) &&
>  	    !access_ok(VERIFY_WRITE, (void __user *)a, size))
> -		return -EFAULT;
> -	return 0;
> +		return false;
> +	return true;
>  }
>  
>  static int vhost_process_iotlb_msg(struct vhost_dev *dev,
> @@ -988,7 +988,7 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
>  			ret = -EFAULT;
>  			break;
>  		}
> -		if (umem_access_ok(msg->uaddr, msg->size, msg->perm)) {
> +		if (!umem_access_ok(msg->uaddr, msg->size, msg->perm)) {
>  			ret = -EFAULT;
>  			break;
>  		}
> @@ -1135,10 +1135,10 @@ static int vhost_iotlb_miss(struct vhost_virtqueue *vq, u64 iova, int access)
>  	return 0;
>  }
>  
> -static int vq_access_ok(struct vhost_virtqueue *vq, unsigned int num,
> -			struct vring_desc __user *desc,
> -			struct vring_avail __user *avail,
> -			struct vring_used __user *used)
> +static bool vq_access_ok(struct vhost_virtqueue *vq, unsigned int num,
> +			 struct vring_desc __user *desc,
> +			 struct vring_avail __user *avail,
> +			 struct vring_used __user *used)
>  
>  {
>  	size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
> @@ -1161,8 +1161,8 @@ static void vhost_vq_meta_update(struct vhost_virtqueue *vq,
>  		vq->meta_iotlb[type] = node;
>  }
>  
> -static int iotlb_access_ok(struct vhost_virtqueue *vq,
> -			   int access, u64 addr, u64 len, int type)
> +static bool iotlb_access_ok(struct vhost_virtqueue *vq,
> +			    int access, u64 addr, u64 len, int type)
>  {
>  	const struct vhost_umem_node *node;
>  	struct vhost_umem *umem = vq->iotlb;
> @@ -1220,7 +1220,7 @@ EXPORT_SYMBOL_GPL(vq_iotlb_prefetch);
>  
>  /* Can we log writes? */
>  /* Caller should have device mutex but not vq mutex */
> -int vhost_log_access_ok(struct vhost_dev *dev)
> +bool vhost_log_access_ok(struct vhost_dev *dev)
>  {
>  	return memory_access_ok(dev, dev->umem, 1);
>  }
> @@ -1228,8 +1228,8 @@ EXPORT_SYMBOL_GPL(vhost_log_access_ok);
>  
>  /* Verify access for write logging. */
>  /* Caller should have vq mutex and device mutex */
> -static int vq_log_access_ok(struct vhost_virtqueue *vq,
> -			    void __user *log_base)
> +static bool vq_log_access_ok(struct vhost_virtqueue *vq,
> +			     void __user *log_base)
>  {
>  	size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
>  
> @@ -1242,14 +1242,14 @@ static int vq_log_access_ok(struct vhost_virtqueue *vq,
>  
>  /* Can we start vq? */
>  /* Caller should have vq mutex and device mutex */
> -int vhost_vq_access_ok(struct vhost_virtqueue *vq)
> +bool vhost_vq_access_ok(struct vhost_virtqueue *vq)
>  {
>  	if (!vq_log_access_ok(vq, vq->log_base))
> -		return 0;
> +		return false;
>  
>  	/* Access validation occurs at prefetch time with IOTLB */
>  	if (vq->iotlb)
> -		return 1;
> +		return true;
>  
>  	return vq_access_ok(vq, vq->num, vq->desc, vq->avail, vq->used);
>  }
> -- 
> 2.14.3
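
The hazard described in the commit message above can be sketched in plain
userspace C. This is only an illustration, not the vhost code: the names
range_ok_errno(), range_ok_bool(), caller_assuming_truthy_int() and
caller_bool() are hypothetical, and the overflow test merely stands in for
the real access checks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Convention #2: 0 means success, -errno means failure (hypothetical checker). */
int range_ok_errno(unsigned long long addr, unsigned long long size)
{
	if (addr + size < addr)	/* unsigned wrap-around: range overflows */
		return -EFAULT;
	return 0;
}

/* Same check with a bool return: true means success, false means failure. */
bool range_ok_bool(unsigned long long addr, unsigned long long size)
{
	return addr + size >= addr;
}

/*
 * A caller written with convention #1 in mind ("nonzero means ok").
 * Against a convention #2 function this silently inverts the result.
 */
bool caller_assuming_truthy_int(unsigned long long addr,
				unsigned long long size)
{
	return range_ok_errno(addr, size) != 0;
}

/* With a bool return there is only one way to read the result. */
bool caller_bool(unsigned long long addr, unsigned long long size)
{
	return range_ok_bool(addr, size);
}
```

With these definitions, the mismatched caller rejects a valid range such as
(16, 16) and accepts an overflowing one such as (~0ULL, 2), while the bool
caller gets both right.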


* Re: [PATCH v3 1/2] vhost: fix vhost_vq_access_ok() log check
From: Michael S. Tsirkin @ 2018-04-11 13:23 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: virtualization, Linus Torvalds, jasowang, netdev, syzkaller-bugs,
	kvm, linux-kernel
In-Reply-To: <20180411023541.15776-2-stefanha@redhat.com>

On Wed, Apr 11, 2018 at 10:35:40AM +0800, Stefan Hajnoczi wrote:
> Commit d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ("vhost: validate log
> when IOTLB is enabled") introduced a regression.  The logic was
> originally:
> 
>   if (vq->iotlb)
>       return 1;
>   return A && B;
> 
> After the patch the short-circuit logic for A was inverted:
> 
>   if (A || vq->iotlb)
>       return A;
>   return B;
> 
> This patch fixes the regression by rewriting the checks in the obvious
> way, no longer returning A when vq->iotlb is non-NULL (which is hard to
> understand).
> 
> Reported-by: syzbot+65a84dde0214b0387ccd@syzkaller.appspotmail.com
> Cc: Jason Wang <jasowang@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/vhost/vhost.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index bec722e41f58..fc805b7fad9d 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -1244,10 +1244,12 @@ static int vq_log_access_ok(struct vhost_virtqueue *vq,
>  /* Caller should have vq mutex and device mutex */
>  int vhost_vq_access_ok(struct vhost_virtqueue *vq)
>  {
> -	int ret = vq_log_access_ok(vq, vq->log_base);
> +	if (!vq_log_access_ok(vq, vq->log_base))
> +		return 0;
>  
> -	if (ret || vq->iotlb)
> -		return ret;
> +	/* Access validation occurs at prefetch time with IOTLB */
> +	if (vq->iotlb)
> +		return 1;
>  
>  	return vq_access_ok(vq, vq->num, vq->desc, vq->avail, vq->used);
>  }
> -- 
> 2.14.3
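
The three variants of the check discussed in the commit message can be
compared exhaustively in user space. This is only a sketch, not part of
the patch: A stands for the log check, B for the ring check, and all
function names below are made up for illustration. It assumes the
intended result is A && (iotlb || B).

```c
#include <assert.h>
#include <stdbool.h>

/* a: log access check passed, b: ring access check passed,
 * iotlb: vq->iotlb is non-NULL. All names here are hypothetical. */

/* Logic before d65026c6c62e: the log is not validated when IOTLB is on. */
static bool access_ok_original(bool a, bool b, bool iotlb)
{
	if (iotlb)
		return true;
	return a && b;
}

/* Logic after d65026c6c62e: returns early whenever A is true, skipping B. */
static bool access_ok_regressed(bool a, bool b, bool iotlb)
{
	if (a || iotlb)
		return a;
	return b;
}

/* Logic after this patch. */
static bool access_ok_fixed(bool a, bool b, bool iotlb)
{
	if (!a)
		return false;
	if (iotlb)
		return true;
	return b;
}

/* Assumed intent: the log must always be valid, the ring only without IOTLB. */
static bool access_ok_intended(bool a, bool b, bool iotlb)
{
	return a && (iotlb || b);
}

/* Walk all eight input combinations and check the fixed variant. */
static void check_all_inputs(void)
{
	for (int i = 0; i < 8; i++) {
		bool a = i & 1, b = i & 2, iotlb = i & 4;

		assert(access_ok_fixed(a, b, iotlb) ==
		       access_ok_intended(a, b, iotlb));
	}
}
```

With A true, B false and no IOTLB, the regressed variant returns true
while the fixed one returns false, which is the case the patch is about.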

^ permalink raw reply

* [PATCH][next] iwlwifi: mvm: remove division by sizeof(struct ieee80211_wmm_rule)
From: Colin King @ 2018-04-11 13:05 UTC (permalink / raw)
  To: Johannes Berg, Emmanuel Grumbach, Luca Coelho,
	Intel Linux Wireless, Kalle Valo, linux-wireless, netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

The subtraction of two struct ieee80211_wmm_rule pointers yields a result
that is already scaled down by the size of the pointed-to type, hence the
division by sizeof(struct ieee80211_wmm_rule) is bogus and should be
removed.

Detected by CoverityScan, CID#1467777 ("Extra sizeof expression")

Fixes: 77e30e10ee28 ("iwlwifi: mvm: query regdb for wmm rule if needed")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c b/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c
index ca0174680af9..e78219057d2f 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c
@@ -1007,8 +1007,7 @@ iwl_parse_nvm_mcc_info(struct device *dev, const struct iwl_cfg *cfg,
 			continue;
 
 		copy_rd->reg_rules[i].wmm_rule = d_wmm +
-			(regd->reg_rules[i].wmm_rule - s_wmm) /
-			sizeof(struct ieee80211_wmm_rule);
+			(regd->reg_rules[i].wmm_rule - s_wmm);
 	}
 
 out:
-- 
2.17.0
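
For illustration only (not part of the patch), the effect can be
reproduced in user space. struct wmm_rule below is a hypothetical
stand-in for struct ieee80211_wmm_rule, and the helper names are
invented; only the pointer arithmetic matters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; only its size (larger than one byte) matters. */
struct wmm_rule {
	int payload[8];
};

/* Correct: pointer subtraction already yields a count of elements. */
static ptrdiff_t rule_index(const struct wmm_rule *rule,
			    const struct wmm_rule *base)
{
	return rule - base;
}

/* Pre-patch shape: the element count is divided by the element size again.
 * The cast keeps the division signed; it is an addition for this sketch. */
static ptrdiff_t rule_index_buggy(const struct wmm_rule *rule,
				  const struct wmm_rule *base)
{
	return (rule - base) / (ptrdiff_t)sizeof(struct wmm_rule);
}

/* Map a rule in the source table to the same slot in the copied table. */
static struct wmm_rule *remap_rule(struct wmm_rule *dst,
				   const struct wmm_rule *src,
				   const struct wmm_rule *rule)
{
	assert(rule >= src);
	return dst + rule_index(rule, src);
}
```

With the extra division, every index smaller than the structure size
collapses to 0, so all copied rules would point at the first entry.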

^ permalink raw reply related

* Re: [PATCH] lan78xx: Correctly indicate invalid OTP
From: Phil Elwell @ 2018-04-11 13:03 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Woojung Huh, Microchip Linux Driver Support, netdev, linux-usb,
	linux-kernel
In-Reply-To: <20180411125737.GB6119@lunn.ch>

Hi Andrew.

On 11/04/2018 13:57, Andrew Lunn wrote:
> On Wed, Apr 11, 2018 at 10:59:17AM +0100, Phil Elwell wrote:
>> lan78xx_read_otp tries to return -EINVAL in the event of invalid OTP
>> content, but the value gets overwritten before it is returned and the
>> read goes ahead anyway. Make the read conditional as it should be
>> and preserve the error code.
> 
> Hi Phil
> 
> Do you know what the Fixes: tag should be for this? When did it break?

It's been broken since day 1, so:

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
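
The pattern described in the commit message can be sketched as below.
This is not the driver code: SIG_VALID, read_raw() and both read_otp_*()
functions are hypothetical stand-ins that only show how an unconditional
assignment loses the error code.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define SIG_VALID 0xF3	/* hypothetical "OTP content is valid" signature */

static int raw_reads;	/* counts how often the raw read was attempted */

/* Stand-in for the raw OTP read: always succeeds and returns zeros. */
static int read_raw(unsigned char *data, size_t len)
{
	assert(data != NULL);
	raw_reads++;
	memset(data, 0, len);
	return 0;
}

/* Buggy shape: -EINVAL is set, then overwritten by the unconditional read. */
static int read_otp_buggy(unsigned char sig, unsigned char *data, size_t len)
{
	int ret = 0;

	if (sig != SIG_VALID)
		ret = -EINVAL;

	ret = read_raw(data, len);	/* clobbers the error, reads anyway */
	return ret;
}

/* Fixed shape: the read only happens for valid content, the error survives. */
static int read_otp_fixed(unsigned char sig, unsigned char *data, size_t len)
{
	if (sig != SIG_VALID)
		return -EINVAL;

	return read_raw(data, len);
}
```

With an invalid signature the buggy variant returns 0 and performs the
read, while the fixed variant returns -EINVAL without touching the device.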

^ permalink raw reply

* RE: [PATCH v3] net: tipc: Replace GFP_ATOMIC with GFP_KERNEL in tipc_mon_create
From: Jon Maloy @ 2018-04-11 13:01 UTC (permalink / raw)
  To: Ying Xue, Jia-Ju Bai, davem@davemloft.net
  Cc: netdev@vger.kernel.org, tipc-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org
In-Reply-To: <7bf48d79-54fe-421a-e04c-f2d6cd2c71e2@windriver.com>



> -----Original Message-----
> From: Ying Xue [mailto:ying.xue@windriver.com]
> Sent: Wednesday, April 11, 2018 06:27
> To: Jia-Ju Bai <baijiaju1990@gmail.com>; Jon Maloy
> <jon.maloy@ericsson.com>; davem@davemloft.net
> Cc: netdev@vger.kernel.org; tipc-discussion@lists.sourceforge.net; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH v3] net: tipc: Replace GFP_ATOMIC with GFP_KERNEL in
> tipc_mon_create
> 
> On 04/11/2018 06:24 PM, Jia-Ju Bai wrote:
> > tipc_mon_create() is never called in atomic context.
> >
> > The call chain ending up at tipc_mon_create() is:
> > [1] tipc_mon_create() <- tipc_enable_bearer() <-
> > tipc_nl_bearer_enable()
> > tipc_nl_bearer_enable() calls rtnl_lock(), which indicates this
> > function is not called in atomic context.
> >
> > Despite never getting called from atomic context,
> > tipc_mon_create() calls kzalloc() with GFP_ATOMIC, which does not
> > sleep for allocation.
> > GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL,
> which
> > can sleep and improve the possibility of successful allocation.
> >
> > This is found by a static analysis tool named DCNS written by myself.
> > And I also manually check it.
> >
> > Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
> 
> Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
> 
> > ---
> > v2:
> > * Modify the description of GFP_ATOMIC in v1.
> >   Thank Eric for good advice.
> > v3:
> > * Modify wrong text in description in v2.
> >   Thank Ying for good advice.
> > ---
> >  net/tipc/monitor.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/tipc/monitor.c b/net/tipc/monitor.c index
> > 9e109bb..9714d80 100644
> > --- a/net/tipc/monitor.c
> > +++ b/net/tipc/monitor.c
> > @@ -604,9 +604,9 @@ int tipc_mon_create(struct net *net, int bearer_id)
> >  	if (tn->monitors[bearer_id])
> >  		return 0;
> >
> > -	mon = kzalloc(sizeof(*mon), GFP_ATOMIC);
> > -	self = kzalloc(sizeof(*self), GFP_ATOMIC);
> > -	dom = kzalloc(sizeof(*dom), GFP_ATOMIC);
> > +	mon = kzalloc(sizeof(*mon), GFP_KERNEL);
> > +	self = kzalloc(sizeof(*self), GFP_KERNEL);
> > +	dom = kzalloc(sizeof(*dom), GFP_KERNEL);
> >  	if (!mon || !self || !dom) {
> >  		kfree(mon);
> >  		kfree(self);
> >

^ permalink raw reply

* [PATCH net] sctp: do not check port in sctp_inet6_cmp_addr
From: Xin Long @ 2018-04-11 12:58 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: davem, Marcelo Ricardo Leitner, Neil Horman

pf->cmp_addr() is called before binding a v6 address to the sock. It
should not check ports, just as sctp_inet_cmp_addr does not.

But sctp_inet6_cmp_addr checks the addr by invoking af(6)->cmp_addr,
i.e. sctp_v6_cmp_addr, which also compares the ports.

As a result, setsockopt(SCTP_SOCKOPT_BINDX_ADD) can bind multiple
duplicate IPv6 addresses since commit 40b4f0fd74e4 ("sctp: lack the
check for ports in sctp_v6_cmp_addr").

This patch removes the af->cmp_addr call from sctp_inet6_cmp_addr and
instead does the proper check for both v6 addrs and v4mapped addrs.

Fixes: 40b4f0fd74e4 ("sctp: lack the check for ports in sctp_v6_cmp_addr")
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/ipv6.c | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index f1fc48e..be4b72c 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -846,8 +846,8 @@ static int sctp_inet6_cmp_addr(const union sctp_addr *addr1,
 			       const union sctp_addr *addr2,
 			       struct sctp_sock *opt)
 {
-	struct sctp_af *af1, *af2;
 	struct sock *sk = sctp_opt2sk(opt);
+	struct sctp_af *af1, *af2;
 
 	af1 = sctp_get_af_specific(addr1->sa.sa_family);
 	af2 = sctp_get_af_specific(addr2->sa.sa_family);
@@ -863,10 +863,31 @@ static int sctp_inet6_cmp_addr(const union sctp_addr *addr1,
 	if (sctp_is_any(sk, addr1) || sctp_is_any(sk, addr2))
 		return 1;
 
-	if (addr1->sa.sa_family != addr2->sa.sa_family)
+	if (addr1->sa.sa_family != addr2->sa.sa_family) {
+		if (addr1->sa.sa_family == AF_INET &&
+		    addr2->sa.sa_family == AF_INET6 &&
+		    ipv6_addr_v4mapped(&addr2->v6.sin6_addr))
+			if (addr2->v6.sin6_addr.s6_addr32[3] ==
+			    addr1->v4.sin_addr.s_addr)
+				return 1;
+		if (addr2->sa.sa_family == AF_INET &&
+		    addr1->sa.sa_family == AF_INET6 &&
+		    ipv6_addr_v4mapped(&addr1->v6.sin6_addr))
+			if (addr1->v6.sin6_addr.s6_addr32[3] ==
+			    addr2->v4.sin_addr.s_addr)
+				return 1;
+		return 0;
+	}
+
+	if (!ipv6_addr_equal(&addr1->v6.sin6_addr, &addr2->v6.sin6_addr))
+		return 0;
+
+	if ((ipv6_addr_type(&addr1->v6.sin6_addr) & IPV6_ADDR_LINKLOCAL) &&
+	    addr1->v6.sin6_scope_id && addr2->v6.sin6_scope_id &&
+	    addr1->v6.sin6_scope_id != addr2->v6.sin6_scope_id)
 		return 0;
 
-	return af1->cmp_addr(addr1, addr2);
+	return 1;
 }
 
 /* Verify that the provided sockaddr looks bindable.   Common verification,
-- 
2.1.0
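
The v4-mapped comparison in the patch has a rough user-space analogue,
sketched here with the standard socket headers instead of the kernel
helpers. The function names are invented for this example; as in the
patch, ports are deliberately not part of the comparison.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Return 1 if v4 equals the IPv4 address embedded in a v4-mapped v6 address. */
static int v4_equals_mapped(const struct in_addr *v4,
			    const struct in6_addr *v6)
{
	assert(v4 != NULL && v6 != NULL);

	if (!IN6_IS_ADDR_V4MAPPED(v6))
		return 0;

	/* The last four bytes of ::ffff:a.b.c.d hold a.b.c.d in network
	 * byte order, the same layout as in_addr.s_addr. */
	return memcmp(&v6->s6_addr[12], &v4->s_addr, sizeof(v4->s_addr)) == 0;
}

/* Convenience wrapper taking textual addresses; returns 0 on parse errors. */
static int v4_equals_mapped_str(const char *v4_text, const char *v6_text)
{
	struct in_addr v4;
	struct in6_addr v6;

	if (inet_pton(AF_INET, v4_text, &v4) != 1 ||
	    inet_pton(AF_INET6, v6_text, &v6) != 1)
		return 0;

	return v4_equals_mapped(&v4, &v6);
}
```

Under these assumptions 192.0.2.1 and ::ffff:192.0.2.1 count as the same
address, so binding both would be treated as a duplicate.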

^ permalink raw reply related

* Re: [PATCH] lan78xx: Correctly indicate invalid OTP
From: Andrew Lunn @ 2018-04-11 12:57 UTC (permalink / raw)
  To: Phil Elwell
  Cc: Woojung Huh, Microchip Linux Driver Support, netdev, linux-usb,
	linux-kernel
In-Reply-To: <1523440757-127451-1-git-send-email-phil@raspberrypi.org>

On Wed, Apr 11, 2018 at 10:59:17AM +0100, Phil Elwell wrote:
> lan78xx_read_otp tries to return -EINVAL in the event of invalid OTP
> content, but the value gets overwritten before it is returned and the
> read goes ahead anyway. Make the read conditional as it should be
> and preserve the error code.

Hi Phil

Do you know what the Fixes: tag should be for this? When did it break?

Thanks
	Andrew

^ permalink raw reply

* Re: [PATCH v1 net 0/3] lan78xx: Fixes and enhancements
From: Andrew Lunn @ 2018-04-11 12:55 UTC (permalink / raw)
  To: Raghuram Chary J; +Cc: davem, netdev, unglinuxdriver, woojung.huh
In-Reply-To: <20180411072450.9809-1-raghuramchary.jallipalli@microchip.com>

On Wed, Apr 11, 2018 at 12:54:47PM +0530, Raghuram Chary J wrote:
> This series of patches has a fix and enhancements for the
> lan78xx driver.

Hi Raghuram

Please separate the fixes from the enhancements. The enhancements need
to wait until net-next re-opens in a week's time. The first patch,
which is a real fix, can probably be accepted now.

      Andrew

> 
> Raghuram Chary J (3):
>   lan78xx: PHY DSP registers initialization to address EEE link drop
>     issues with long cables
>   lan78xx: Add support to dump lan78xx registers
>   lan78xx: Lan7801 Support for Fixed PHY
> 
>  drivers/net/phy/microchip.c  | 178 ++++++++++++++++++++++++++++++++++++++++++-
>  drivers/net/usb/Kconfig      |   1 +
>  drivers/net/usb/lan78xx.c    |  96 ++++++++++++++++++++++-
>  include/linux/microchipphy.h |   8 ++
>  4 files changed, 278 insertions(+), 5 deletions(-)
> 
> -- 
> 2.16.2
> 

^ permalink raw reply

* [PATCH net 2/2] net: aquantia: oops when shutdown on already stopped device
From: Igor Russkikh @ 2018-04-11 12:23 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, David Arcari, Pavel Belous, Igor Russkikh
In-Reply-To: <cover.1523449097.git.igor.russkikh@aquantia.com>

If the netdev is already closed at the moment of pci shutdown, aq_nic_stop
gets called a second time, and napi_disable then hangs indefinitely.
If the device was never opened at all, we get an oops because
of a null pointer access.

We should invoke aq_nic_stop conditionally, only if the device is running
at the moment of shutdown.

Reported-by: David Arcari <darcari@redhat.com>
Fixes: 90869ddfefeb ("net: aquantia: Implement pci shutdown callback")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
---
 drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
index c96a921..32f6d2e 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -951,9 +951,11 @@ void aq_nic_shutdown(struct aq_nic_s *self)

 	netif_device_detach(self->ndev);

-	err = aq_nic_stop(self);
-	if (err < 0)
-		goto err_exit;
+	if (netif_running(self->ndev)) {
+		err = aq_nic_stop(self);
+		if (err < 0)
+			goto err_exit;
+	}
 	aq_nic_deinit(self);

 err_exit:
-- 
2.7.4

^ permalink raw reply related

* [PATCH net 1/2] net: aquantia: Regression on reset with 1.x firmware
From: Igor Russkikh @ 2018-04-11 12:23 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, David Arcari, Pavel Belous, Igor Russkikh
In-Reply-To: <cover.1523449097.git.igor.russkikh@aquantia.com>

On ASUS XG-C100C with 1.5.44 firmware a special mode called "dirty wake"
is active. In this mode, when the motherboard gets powered (but no power-on
happens yet), the NIC automatically enables a powersave link and watches
for a WOL packet.
This normally allows the PC to be powered up after AC power failures.

Not all motherboards or BIOS settings give power to PCI slots,
so this mode is not enabled on all hardware.

The 4.16 Linux driver introduced a full hardware reset sequence.
This is required since before that we had no NIC hardware
reset implemented and there were side effects of a "not clean start".

But this full reset is incompatible with the "dirty wake" WOL feature:
it keeps the PHY link in a special mode forever. As a consequence,
the driver sees no link and no traffic.

To fix this we forcibly change the FW state to idle before doing
the full reset. This makes the FW restore the link state.

Fixes: c8c82eb ("net: aquantia: Introduce global AQC hardware reset sequence")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
---
 .../net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
index 84d7f4d..e652d86 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
@@ -48,6 +48,8 @@
 #define FORCE_FLASHLESS 0

 static int hw_atl_utils_ver_match(u32 ver_expected, u32 ver_actual);
+static int hw_atl_utils_mpi_set_state(struct aq_hw_s *self,
+				      enum hal_atl_utils_fw_state_e state);

 int hw_atl_utils_initfw(struct aq_hw_s *self, const struct aq_fw_ops **fw_ops)
 {
@@ -247,6 +249,20 @@ int hw_atl_utils_soft_reset(struct aq_hw_s *self)

 	self->rbl_enabled = (boot_exit_code != 0);

+	/* FW 1.x may bootup in an invalid POWER state (WOL feature).
+	 * We should work around this by forcing its state back to DEINIT
+	 */
+	if (!hw_atl_utils_ver_match(HW_ATL_FW_VER_1X,
+				    aq_hw_read_reg(self,
+						   HW_ATL_MPI_FW_VERSION))) {
+		int err = 0;
+
+		hw_atl_utils_mpi_set_state(self, MPI_DEINIT);
+		AQ_HW_WAIT_FOR((aq_hw_read_reg(self, HW_ATL_MPI_STATE_ADR) &
+			       HW_ATL_MPI_STATE_MSK) == MPI_DEINIT,
+			       10, 1000U);
+	}
+
 	if (self->rbl_enabled)
 		return hw_atl_utils_soft_reset_rbl(self);
 	else
-- 
2.7.4

^ permalink raw reply related

* [PATCH net 0/2] Aquantia atlantic critical fixes 04/2018
From: Igor Russkikh @ 2018-04-11 12:23 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, David Arcari, Pavel Belous, Igor Russkikh

Two regressions in the latest 4.16 driver were reported by users.

Some old FW (1.5.44) has link management logic which prevents the
driver from doing a clean reset. The 4.16 driver has a full hardware reset
implemented, and that broke link and traffic on such cards.

The second is an oops in the shutdown callback when the interface is
already closed or was never opened.

Igor Russkikh (2):
  net: aquantia: Regression on reset with 1.x firmware
  net: aquantia: oops when shutdown on already stopped device

 drivers/net/ethernet/aquantia/atlantic/aq_nic.c          |  8 +++++---
 .../net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c | 16 ++++++++++++++++
 2 files changed, 21 insertions(+), 3 deletions(-)

-- 
2.7.4

^ permalink raw reply

* Re: [RFC PATCH v2 00/14] Introducing AF_XDP support
From: Björn Töpel @ 2018-04-11 12:17 UTC (permalink / raw)
  To: William Tu
  Cc: Karlsson, Magnus, Alexander Duyck, Alexander Duyck,
	John Fastabend, Alexei Starovoitov, Jesper Dangaard Brouer,
	Willem de Bruijn, Daniel Borkmann,
	Linux Kernel Network Developers, Björn Töpel,
	michael.lundkvist, Brandeburg, Jesse, Anjali Singhai Jain,
	Zhang, Qi Z, ravineet.singh
In-Reply-To: <CALDO+SYvxrnqKy=7P2c_agWDJEn2spwNQr0ynE2-JTfWPAaEEg@mail.gmail.com>

2018-04-10 16:14 GMT+02:00 William Tu <u9012063@gmail.com>:
> On Mon, Apr 9, 2018 at 11:47 PM, Björn Töpel <bjorn.topel@gmail.com> wrote:

[...]

>>>
>>
>> So you've setup two identical UMEMs? Then you can just forward the
>> incoming Rx descriptor to the other netdev's Tx queue. Note, that you
>> only need to copy the descriptor, not the actual frame data.
>>
>
> Thanks!
> I will give it a try, I guess you're saying I can do below:
>
> int sfd1; // for device1
> int sfd2; // for device2
> ...
> // create 2 umem
> umem1 = calloc(1, sizeof(*umem));
> umem2 = calloc(1, sizeof(*umem));
>
> // allocate 1 shared buffer, 1 xdp_umem_reg
> posix_memalign(&bufs, ...)
> mr.addr = (__u64)bufs; // shared for umem1,2
> ...
>
> // umem reg the same mr
> setsockopt(sfd1, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr))
> setsockopt(sfd2, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr))
>
> // setup fill, completion, mmap for sfd1 and sfd2
> ...
>
> Since both devices can put frame data in 'bufs', I only need to copy
> the descs between umem1 and umem2. Do I understand correctly?
>

Yup, spot on! umem1 and umem2 have the same layout/index "address
space", so you can just forward the descriptors and never touch the
data.

In the current RFC you are required to create both an Rx and Tx queue
to bind the socket, which is just weird for your "Rx on one device, Tx
to another" scenario. I'll fix that in the next RFC.
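
To make the "forward only the descriptor" point concrete, here is a toy
user-space model. It does not use the AF_XDP uapi at all: toy_desc,
toy_ring and every function below are invented for illustration, and the
rings are plain arrays rather than mmap()ed queues.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SIZE 2048
#define NUM_FRAMES 4
#define RING_SIZE  4

/* Toy descriptor: which frame of the shared buffer, and how many bytes. */
struct toy_desc {
	uint32_t idx;
	uint32_t len;
};

/* Toy ring with free-running head/tail counters. */
struct toy_ring {
	struct toy_desc descs[RING_SIZE];
	unsigned int head;	/* next slot to consume */
	unsigned int tail;	/* next slot to produce */
};

/* One buffer shared by both "umems", so frame indices mean the same thing. */
static unsigned char shared_bufs[NUM_FRAMES * FRAME_SIZE];

static unsigned char *frame_data(uint32_t idx)
{
	return &shared_bufs[idx * FRAME_SIZE];
}

static int ring_push(struct toy_ring *r, struct toy_desc d)
{
	if (r->tail - r->head == RING_SIZE)
		return -1;	/* full */
	r->descs[r->tail++ % RING_SIZE] = d;
	return 0;
}

static int ring_pop(struct toy_ring *r, struct toy_desc *d)
{
	if (r->head == r->tail)
		return -1;	/* empty */
	*d = r->descs[r->head++ % RING_SIZE];
	return 0;
}

/* Move descriptors from rx to tx; the frame bytes are never copied. */
static unsigned int forward_descs(struct toy_ring *rx, struct toy_ring *tx)
{
	struct toy_desc d;
	unsigned int n = 0;

	while (tx->tail - tx->head < RING_SIZE && ring_pop(rx, &d) == 0) {
		int rc = ring_push(tx, d);

		assert(rc == 0);	/* space was checked above */
		(void)rc;
		n++;
	}
	return n;
}
```

Because both rings index the same buffer, the descriptor popped on the
Tx side refers to exactly the bytes that were received.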


Björn

> Regards,
> William

^ permalink raw reply

* Re: [PATCH linux] net: fix deadlock while clearing neighbor proxy table
From: Wolfgang Bumiller @ 2018-04-11 12:17 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, yoshfuji
In-Reply-To: <20180410.110229.1597289057689247263.davem@davemloft.net>

David Miller wrote:
> From: Wolfgang Bumiller <w.bumiller@proxmox.com>
> Date: Tue, 10 Apr 2018 11:15:14 +0200
> 
> > diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> > index 7b7a14abba28..601df647588c 100644
> > --- a/net/core/neighbour.c
> > +++ b/net/core/neighbour.c
> > @@ -292,7 +292,6 @@ int neigh_ifdown(struct neigh_table *tbl, struct net_device *dev)
> >  	write_lock_bh(&tbl->lock);
> >  	neigh_flush_dev(tbl, dev);
> >  	pneigh_ifdown(tbl, dev);
> > -	write_unlock_bh(&tbl->lock);
> 
> If we are going to fix it this way, we need to annotate the code here in some
> way so that future readers understand why the tbl->lock is not being released
> here.

A better way would of course be nice, too, but I find it hard to find
one given how "far away" the IGMP and then output code are from this
point.

> One way is to add a comment.
> 
> Another way is to rename pneigh_ifdown() to "pneigh_ifdown_and_unlock()".

Sure, I can send a v2 with whichever is preferred - personally I prefer
the rename as it'll be visible at both the calling & implementation
side.

^ permalink raw reply

* Re: TCP one-by-one acking - RFC interpretation question
From: Michal Kubecek @ 2018-04-11 12:06 UTC (permalink / raw)
  To: netdev; +Cc: Eric Dumazet, Yuchung Cheng, Neal Cardwell,
	Kenneth Klette Jonassen
In-Reply-To: <20180411105837.bwnpoqvbra43kjub@unicorn.suse.cz>

On Wed, Apr 11, 2018 at 12:58:37PM +0200, Michal Kubecek wrote:
> There is something else I don't understand, though. In the case of
> acking previously sacked and never retransmitted segment,
> tcp_clean_rtx_queue() calculates the parameters for tcp_ack_update_rtt()
> using
> 
>         if (sack->first_sackt.v64) {
>                 sack_rtt_us = skb_mstamp_us_delta(&now, &sack->first_sackt);
>                 ca_rtt_us = skb_mstamp_us_delta(&now, &sack->last_sackt);
>         }
> 
> (in 4.4; mainline code replaces &now with tp->tcp_mstamp). If I read the
> code correctly, both sack->first_sackt and sack->last_sackt contain
> timestamps of initial segment transmission. This would mean we use the
> time difference between the initial transmission and now (i.e. including
> the RTO of the lost packet).
> 
> IMHO we should take the actual round trip time instead, i.e. the
> difference between the original transmission and the time the packet
> was sacked (for the first time). It seems we have been doing this before
> commit 31231a8a8730 ("tcp: improve RTT from SACK for CC").

Sorry for the noise, this was my misunderstanding, the first_sackt and
last_sackt values are only taken from segments newly sacked by ack
received right now, not those which were already sacked before.

The actual problem and unrealistic RTT measurements come from another
RFC violation I didn't mention before: the NAS doesn't follow the RFC 2018
section 4 rule for ordering of SACK blocks. Rather than sending SACK
blocks for the three most recently received out-of-order blocks, it simply
sends the first three ordered by sequence number. In the earlier example
(odd packets were received, even lost)

       ACK             SAK             SAK             SAK
    +-------+-------+-------+-------+-------+-------+-------+-------+-------+
    |   1   |   2   |   3   |   4   |   5   |   6   |   7   |   8   |   9   |
    +-------+-------+-------+-------+-------+-------+-------+-------+-------+
  34273   35701   37129   38557   39985   41413   42841   44269   45697   47125

it responds to retransmitted segment 2 by

  1. ACK 37129, SACK 37129-38557 39985-41413 42841-44269
  2. ACK 38557, SACK 39985-41413 42841-44269 45697-47125

This new SACK block 45697-47125 has not been retransmitted and as it
wasn't sacked before, it is considered newly sacked. Therefore it gets
processed and its deemed RTT (time since its original transmit time)
"poisons" the RTT calculation, leading to RTO spiraling up.

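As a rough illustration of why 45697-47125 looks new to the sender, the
two ACKs above can be compared block by block. This is only a sketch:
the helper names are invented, sequence wraparound is ignored, and it
compares against the previous ACK only, whereas a real sender tracks
SACK state per segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open range [start, end) of sequence numbers. */
struct sack_block {
	uint32_t start;
	uint32_t end;
};

/* True if blk is fully covered by one of the n blocks in prev. */
static bool already_sacked(const struct sack_block *blk,
			   const struct sack_block *prev, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (blk->start >= prev[i].start && blk->end <= prev[i].end)
			return true;
	return false;
}

/* Copy the blocks of cur that prev did not report into out.
 * out must have room for ncur entries; returns how many were copied. */
static size_t newly_sacked(const struct sack_block *prev, size_t nprev,
			   const struct sack_block *cur, size_t ncur,
			   struct sack_block *out)
{
	size_t n = 0;

	assert(out != NULL);
	for (size_t i = 0; i < ncur; i++)
		if (!already_sacked(&cur[i], prev, nprev))
			out[n++] = cur[i];
	return n;
}
```

Fed with the SACK options of ACK 37129 and ACK 38557, this reports
exactly one new block, 45697-47125, even though the receiver has held
that segment since before the retransmission.
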
Thus if we want to work around the NAS behaviour, we would need to
recognize such a new SACK block as "not really new" and ignore it for
first_sackt/last_sackt. I'm not sure if it's possible without
misinterpreting actually delayed out of order packets. Of course, it is
not clear if it's worth the effort to work around so severely broken TCP
implementations (two obvious RFC violations, even if we don't count the
one-by-one acking).

Michal Kubecek

^ permalink raw reply
