Netdev List


* Re: netconsole deadlock with virtnet
From: Leon Romanovsky @ 2020-11-23 11:08 UTC
  To: Steven Rostedt
  Cc: Jason Wang, Sergey Senozhatsky, Michael S. Tsirkin, Petr Mladek,
	John Ogness, virtualization, Amit Shah, Itay Aveksis,
	Ran Rozenstein, netdev
In-Reply-To: <20201118091257.2ee6757a@gandalf.local.home>

On Wed, Nov 18, 2020 at 09:12:57AM -0500, Steven Rostedt wrote:
>
> [ Adding netdev as perhaps someone there knows ]
>
> On Wed, 18 Nov 2020 12:09:59 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
> > > This CPU0 lock(_xmit_ETHER#2) -> hard IRQ -> lock(console_owner) is
> > > basically
> > > 	soft IRQ -> lock(_xmit_ETHER#2) -> hard IRQ -> printk()
> > >
> > > Then CPU1 spins on xmit, which is owned by CPU0, CPU0 spins on
> > > console_owner, which is owned by CPU1?
>
> It still looks to me that the target_list_lock is taken in IRQ, (which can
> be the case because printk calls write_msg() which takes that lock). And
> someplace there's a:
>
> 	lock(target_list_lock)
> 	lock(xmit_lock)
>
> which means you can remove the console lock from this scenario completely,
> and you still have a possible deadlock between target_list_lock and
> xmit_lock.
>
> >
> >
> > If this is true, it looks like this is not a virtio-net specific issue but
> > one somewhere else.
> >
> > I think all network drivers synchronize through bh instead of hardirq.
>
> I think the issue is where target_list_lock is held when we take xmit_lock.
> Is there anywhere in netconsole.c that can end up taking xmit_lock while
> holding the target_list_lock? If so, that's the problem. As
> target_list_lock is something that can be taken in IRQ context, which means
> *any* other lock that is taken while holding the target_list_lock must
> also protect against interrupts happening while it is held.

I increased the printk buffer as Petr suggested, and the splat is below.
It doesn't happen on x86, but it does on ARM64 and ppc64.

 [   10.027975] =====================================================
 [   10.027976] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
 [   10.027976] 5.10.0-rc4_for_upstream_min_debug_2020_11_22_19_37 #1 Not tainted
 [   10.027977] -----------------------------------------------------
 [   10.027978] modprobe/638 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
 [   10.027979] ffff0000c9f63c98 (_xmit_ETHER#2){+.-.}-{2:2}, at: virtnet_poll_tx+0x84/0x120
 [   10.027982]
 [   10.027982] and this task is already holding:
 [   10.027983] ffff800009007018 (target_list_lock){....}-{2:2}, at: write_msg+0x6c/0x120 [netconsole]
 [   10.027985] which would create a new lock dependency:
 [   10.027985]  (target_list_lock){....}-{2:2} -> (_xmit_ETHER#2){+.-.}-{2:2}
 [   10.027989]
 [   10.027989] but this new dependency connects a HARDIRQ-irq-safe lock:
 [   10.027990]  (console_owner){-...}-{0:0}
 [   10.027991]
 [   10.027992] ... which became HARDIRQ-irq-safe at:
 [   10.027992]   __lock_acquire+0xa78/0x1a94
 [   10.027993]   lock_acquire.part.0+0x170/0x360
 [   10.027993]   lock_acquire+0x68/0x8c
 [   10.027994]   console_unlock+0x1e8/0x6a4
 [   10.027994]   vprintk_emit+0x1c4/0x3c4
 [   10.027995]   vprintk_default+0x40/0x4c
 [   10.027995]   vprintk_func+0x10c/0x220
 [   10.027995]   printk+0x68/0x90
 [   10.027996]   crng_fast_load+0x1bc/0x1c0
 [   10.027997]   add_interrupt_randomness+0x280/0x290
 [   10.027997]   handle_irq_event+0x80/0x120
 [   10.027997]   handle_fasteoi_irq+0xac/0x200
 [   10.027998]   __handle_domain_irq+0x84/0xf0
 [   10.027999]   gic_handle_irq+0xd4/0x320
 [   10.027999]   el1_irq+0xd0/0x180
 [   10.028000]   arch_cpu_idle+0x24/0x44
 [   10.028000]   default_idle_call+0x48/0xa0
 [   10.028001]   do_idle+0x260/0x300
 [   10.028001]   cpu_startup_entry+0x30/0x6c
 [   10.028001]   rest_init+0x1b4/0x288
 [   10.028002]   arch_call_rest_init+0x18/0x24
 [   10.028002]   start_kernel+0x5cc/0x608
 [   10.028003]
 [   10.028003] to a HARDIRQ-irq-unsafe lock:
 [   10.028004]  (_xmit_ETHER#2){+.-.}-{2:2}
 [   10.028005]
 [   10.028006] ... which became HARDIRQ-irq-unsafe at:
 [   10.028006] ...  __lock_acquire+0x8bc/0x1a94
 [   10.028007]   lock_acquire.part.0+0x170/0x360
 [   10.028007]   lock_acquire+0x68/0x8c
 [   10.028008]   _raw_spin_trylock+0x80/0xd0
 [   10.028008]   virtnet_poll+0xac/0x360
 [   10.028009]   net_rx_action+0x1b0/0x4e0
 [   10.028010]   __do_softirq+0x1f4/0x638
 [   10.028010]   do_softirq+0xb8/0xcc
 [   10.028010]   __local_bh_enable_ip+0x18c/0x200
 [   10.028011]   virtnet_napi_enable+0xc0/0xd4
 [   10.028011]   virtnet_open+0x98/0x1c0
 [   10.028012]   __dev_open+0x12c/0x200
 [   10.028013]   __dev_change_flags+0x1a0/0x220
 [   10.028013]   dev_change_flags+0x2c/0x70
 [   10.028014]   do_setlink+0x214/0xe20
 [   10.028014]   __rtnl_newlink+0x514/0x820
 [   10.028015]   rtnl_newlink+0x58/0x84
 [   10.028015]   rtnetlink_rcv_msg+0x184/0x4b4
 [   10.028016]   netlink_rcv_skb+0x60/0x124
 [   10.028016]   rtnetlink_rcv+0x20/0x30
 [   10.028017]   netlink_unicast+0x1b4/0x270
 [   10.028017]   netlink_sendmsg+0x1f0/0x400
 [   10.028018]   sock_sendmsg+0x5c/0x70
 [   10.028018]   ____sys_sendmsg+0x24c/0x280
 [   10.028019]   ___sys_sendmsg+0x88/0xd0
 [   10.028019]   __sys_sendmsg+0x70/0xd0
 [   10.028020]   __arm64_sys_sendmsg+0x2c/0x40
 [   10.028021]   el0_svc_common.constprop.0+0x84/0x200
 [   10.028021]   do_el0_svc+0x2c/0x90
 [   10.028021]   el0_svc+0x18/0x50
 [   10.028022]   el0_sync_handler+0xe0/0x350
 [   10.028023]   el0_sync+0x158/0x180
 [   10.028023]
 [   10.028023] other info that might help us debug this:
 [   10.028024]
 [   10.028024] Chain exists of:
 [   10.028025]   console_owner --> target_list_lock --> _xmit_ETHER#2
 [   10.028028]
 [   10.028028]  Possible interrupt unsafe locking scenario:
 [   10.028029]
 [   10.028029]        CPU0                    CPU1
 [   10.028030]        ----                    ----
 [   10.028030]   lock(_xmit_ETHER#2);
 [   10.028032]                                local_irq_disable();
 [   10.028032]                                lock(console_owner);
 [   10.028034]                                lock(target_list_lock);
 [   10.028035]   <Interrupt>
 [   10.028035]     lock(console_owner);
 [   10.028036]
 [   10.028037]  *** DEADLOCK ***
 [   10.028037]
 [   10.028038] 3 locks held by modprobe/638:
 [   10.028038]  #0: ffff800011e1efe0 (console_lock){+.+.}-{0:0}, at: register_console+0x144/0x2f4
 [   10.028040]  #1: ffff800011e1f108 (console_owner){-...}-{0:0}, at: console_unlock+0x17c/0x6a4
 [   10.028043]  #2: ffff800009007018 (target_list_lock){....}-{2:2}, at: write_msg+0x6c/0x120 [netconsole]
 [   10.028045]
 [   10.028046] the dependencies between HARDIRQ-irq-safe lock and the holding lock:
 [   10.028046]  -> (console_owner){-...}-{0:0} ops: 1574 {
 [   10.028049]     IN-HARDIRQ-W at:
 [   10.028050]                          __lock_acquire+0xa78/0x1a94
 [   10.028050]                          lock_acquire.part.0+0x170/0x360
 [   10.028051]                          lock_acquire+0x68/0x8c
 [   10.028051]                          console_unlock+0x1e8/0x6a4
 [   10.028052]                          vprintk_emit+0x1c4/0x3c4
 [   10.028052]                          vprintk_default+0x40/0x4c
 [   10.028053]                          vprintk_func+0x10c/0x220
 [   10.028054]                          printk+0x68/0x90
 [   10.028054]                          crng_fast_load+0x1bc/0x1c0
 [   10.028055]                          add_interrupt_randomness+0x280/0x290
 [   10.028056]                          handle_irq_event+0x80/0x120
 [   10.028056]                          handle_fasteoi_irq+0xac/0x200
 [   10.028057]                          __handle_domain_irq+0x84/0xf0
 [   10.028057]                          gic_handle_irq+0xd4/0x320
 [   10.028058]                          el1_irq+0xd0/0x180
 [   10.028058]                          arch_cpu_idle+0x24/0x44
 [   10.028059]                          default_idle_call+0x48/0xa0
 [   10.028060]                          do_idle+0x260/0x300
 [   10.028061]                          cpu_startup_entry+0x30/0x6c
 [   10.028061]                          rest_init+0x1b4/0x288
 [   10.028062]                          arch_call_rest_init+0x18/0x24
 [   10.028062]                          start_kernel+0x5cc/0x608
 [   10.028063]     INITIAL USE at:
 [   10.028064]                         __lock_acquire+0x2e0/0x1a94
 [   10.028064]                         lock_acquire.part.0+0x170/0x360
 [   10.028065]                         lock_acquire+0x68/0x8c
 [   10.028066]                         console_unlock+0x1e8/0x6a4
 [   10.028067]                         vprintk_emit+0x1c4/0x3c4
 [   10.028067]                         vprintk_default+0x40/0x4c
 [   10.028068]                         vprintk_func+0x10c/0x220
 [   10.028068]                         printk+0x68/0x90
 [   10.028069]                         start_kernel+0x8c/0x608
 [   10.028069]   }
 [   10.028070]   ... key      at: [<ffff800011e1f108>] console_owner_dep_map+0x0/0x28
 [   10.028071]   ... acquired at:
 [   10.028071]    lock_acquire.part.0+0x170/0x360
 [   10.028072]    lock_acquire+0x68/0x8c
 [   10.028072]    _raw_spin_lock_irqsave+0x88/0x15c
 [   10.028073]    write_msg+0x6c/0x120 [netconsole]
 [   10.028073]    console_unlock+0x3ec/0x6a4
 [   10.028074]    register_console+0x17c/0x2f4
 [   10.028075]    init_netconsole+0x20c/0x1000 [netconsole]
 [   10.028075]    do_one_initcall+0x8c/0x480
 [   10.028076]    do_init_module+0x60/0x270
 [   10.028076]    load_module+0x21f8/0x2734
 [   10.028077]    __do_sys_finit_module+0xbc/0x12c
 [   10.028077]    __arm64_sys_finit_module+0x28/0x34
 [   10.028078]    el0_svc_common.constprop.0+0x84/0x200
 [   10.028078]    do_el0_svc+0x2c/0x90
 [   10.028079]    el0_svc+0x18/0x50
 [   10.028079]    el0_sync_handler+0xe0/0x350
 [   10.028080]    el0_sync+0x158/0x180
 [   10.028080]
 [   10.028081] -> (target_list_lock){....}-{2:2} ops: 34 {
 [   10.028083]    INITIAL USE at:
 [   10.028084]                       __lock_acquire+0x2e0/0x1a94
 [   10.028084]                       lock_acquire.part.0+0x170/0x360
 [   10.028085]                       lock_acquire+0x68/0x8c
 [   10.028085]                       _raw_spin_lock_irqsave+0x88/0x15c
 [   10.028086]                       init_netconsole+0x148/0x1000 [netconsole]
 [   10.028087]                       do_one_initcall+0x8c/0x480
 [   10.028087]                       do_init_module+0x60/0x270
 [   10.028088]                       load_module+0x21f8/0x2734
 [   10.028088]                       __do_sys_finit_module+0xbc/0x12c
 [   10.028089]                       __arm64_sys_finit_module+0x28/0x34
 [   10.028090]                       el0_svc_common.constprop.0+0x84/0x200
 [   10.028090]                       do_el0_svc+0x2c/0x90
 [   10.028091]                       el0_svc+0x18/0x50
 [   10.028092]                       el0_sync_handler+0xe0/0x350
 [   10.028092]                       el0_sync+0x158/0x180
 [   10.028093]  }
 [   10.028093]  ... key      at: [<ffff800009007018>] target_list_lock+0x18/0xfffffffffffff000 [netconsole]
 [   10.028094]  ... acquired at:
 [   10.028094]    __lock_acquire+0x134c/0x1a94
 [   10.028095]    lock_acquire.part.0+0x170/0x360
 [   10.028095]    lock_acquire+0x68/0x8c
 [   10.028096]    _raw_spin_lock+0x64/0x90
 [   10.028096]    virtnet_poll_tx+0x84/0x120
 [   10.028097]    netpoll_poll_dev+0x12c/0x350
 [   10.028097]    netpoll_send_skb+0x39c/0x400
 [   10.028098]    netpoll_send_udp+0x2b8/0x440
 [   10.028098]    write_msg+0xfc/0x120 [netconsole]
 [   10.028099]    console_unlock+0x3ec/0x6a4
 [   10.028100]    register_console+0x17c/0x2f4
 [   10.028100]    init_netconsole+0x20c/0x1000 [netconsole]
 [   10.028101]    do_one_initcall+0x8c/0x480
 [   10.028101]    do_init_module+0x60/0x270
 [   10.028102]    load_module+0x21f8/0x2734
 [   10.028102]    __do_sys_finit_module+0xbc/0x12c
 [   10.028103]    __arm64_sys_finit_module+0x28/0x34
 [   10.028103]    el0_svc_common.constprop.0+0x84/0x200
 [   10.028104]    do_el0_svc+0x2c/0x90
 [   10.028104]    el0_svc+0x18/0x50
 [   10.028105]    el0_sync_handler+0xe0/0x350
 [   10.028105]    el0_sync+0x158/0x180
 [   10.028106]
 [   10.028106]
 [   10.028107] the dependencies between the lock to be acquired
 [   10.028107]  and HARDIRQ-irq-unsafe lock:
 [   10.028108] -> (_xmit_ETHER#2){+.-.}-{2:2} ops: 217 {
 [   10.028110]    HARDIRQ-ON-W at:
 [   10.028111]                        __lock_acquire+0x8bc/0x1a94
 [   10.028111]                        lock_acquire.part.0+0x170/0x360
 [   10.028112]                        lock_acquire+0x68/0x8c
 [   10.028113]                        _raw_spin_trylock+0x80/0xd0
 [   10.028113]                        virtnet_poll+0xac/0x360
 [   10.028114]                        net_rx_action+0x1b0/0x4e0
 [   10.028115]                        __do_softirq+0x1f4/0x638
 [   10.028115]                        do_softirq+0xb8/0xcc
 [   10.028116]                        __local_bh_enable_ip+0x18c/0x200
 [   10.028116]                        virtnet_napi_enable+0xc0/0xd4
 [   10.028117]                        virtnet_open+0x98/0x1c0
 [   10.028118]                        __dev_open+0x12c/0x200
 [   10.028118]                        __dev_change_flags+0x1a0/0x220
 [   10.028119]                        dev_change_flags+0x2c/0x70
 [   10.028119]                        do_setlink+0x214/0xe20
 [   10.028120]                        __rtnl_newlink+0x514/0x820
 [   10.028120]                        rtnl_newlink+0x58/0x84
 [   10.028121]                        rtnetlink_rcv_msg+0x184/0x4b4
 [   10.028122]                        netlink_rcv_skb+0x60/0x124
 [   10.028122]                        rtnetlink_rcv+0x20/0x30
 [   10.028123]                        netlink_unicast+0x1b4/0x270
 [   10.028124]                        netlink_sendmsg+0x1f0/0x400
 [   10.028124]                        sock_sendmsg+0x5c/0x70
 [   10.028125]                        ____sys_sendmsg+0x24c/0x280
 [   10.028125]                        ___sys_sendmsg+0x88/0xd0
 [   10.028126]                        __sys_sendmsg+0x70/0xd0
 [   10.028127]                        __arm64_sys_sendmsg+0x2c/0x40
 [   10.028128]                        el0_svc_common.constprop.0+0x84/0x200
 [   10.028128]                        do_el0_svc+0x2c/0x90
 [   10.028129]                        el0_svc+0x18/0x50
 [   10.028129]                        el0_sync_handler+0xe0/0x350
 [   10.028130]                        el0_sync+0x158/0x180
 [   10.028130]    IN-SOFTIRQ-W at:
 [   10.028131]                        __lock_acquire+0x894/0x1a94
 [   10.028132]                        lock_acquire.part.0+0x170/0x360
 [   10.028132]                        lock_acquire+0x68/0x8c
 [   10.028133]                        _raw_spin_lock+0x64/0x90
 [   10.028134]                        virtnet_poll_tx+0x84/0x120
 [   10.028134]                        net_rx_action+0x1b0/0x4e0
 [   10.028135]                        __do_softirq+0x1f4/0x638
 [   10.028135]                        do_softirq+0xb8/0xcc
 [   10.028136]                        __local_bh_enable_ip+0x18c/0x200
 [   10.028137]                        virtnet_napi_enable+0xc0/0xd4
 [   10.028137]                        virtnet_open+0x14c/0x1c0
 [   10.028138]                        __dev_open+0x12c/0x200
 [   10.028138]                        __dev_change_flags+0x1a0/0x220
 [   10.028139]                        dev_change_flags+0x2c/0x70
 [   10.028140]                        do_setlink+0x214/0xe20
 [   10.028140]                        __rtnl_newlink+0x514/0x820
 [   10.028141]                        rtnl_newlink+0x58/0x84
 [   10.028141]                        rtnetlink_rcv_msg+0x184/0x4b4
 [   10.028142]                        netlink_rcv_skb+0x60/0x124
 [   10.028142]                        rtnetlink_rcv+0x20/0x30
 [   10.028143]                        netlink_unicast+0x1b4/0x270
 [   10.028144]                        netlink_sendmsg+0x1f0/0x400
 [   10.028144]                        sock_sendmsg+0x5c/0x70
 [   10.028145]                        ____sys_sendmsg+0x24c/0x280
 [   10.028146]                        ___sys_sendmsg+0x88/0xd0
 [   10.028146]                        __sys_sendmsg+0x70/0xd0
 [   10.028147]                        __arm64_sys_sendmsg+0x2c/0x40
 [   10.028148]                        el0_svc_common.constprop.0+0x84/0x200
 [   10.028148]                        do_el0_svc+0x2c/0x90
 [   10.028149]                        el0_svc+0x18/0x50
 [   10.028149]                        el0_sync_handler+0xe0/0x350
 [   10.028150]                        el0_sync+0x158/0x180
 [   10.028150]    INITIAL USE at:
 [   10.028151]                       __lock_acquire+0x2e0/0x1a94
 [   10.028152]                       lock_acquire.part.0+0x170/0x360
 [   10.028153]                       lock_acquire+0x68/0x8c
 [   10.028153]                       _raw_spin_trylock+0x80/0xd0
 [   10.028154]                       virtnet_poll+0xac/0x360
 [   10.028154]                       net_rx_action+0x1b0/0x4e0
 [   10.028155]                       __do_softirq+0x1f4/0x638
 [   10.028155]                       do_softirq+0xb8/0xcc
 [   10.028156]                       __local_bh_enable_ip+0x18c/0x200
 [   10.028157]                       virtnet_napi_enable+0xc0/0xd4
 [   10.028157]                       virtnet_open+0x98/0x1c0
 [   10.028158]                       __dev_open+0x12c/0x200
 [   10.028158]                       __dev_change_flags+0x1a0/0x220
 [   10.028159]                       dev_change_flags+0x2c/0x70
 [   10.028159]                       do_setlink+0x214/0xe20
 [   10.028160]                       __rtnl_newlink+0x514/0x820
 [   10.028161]                       rtnl_newlink+0x58/0x84
 [   10.028161]                       rtnetlink_rcv_msg+0x184/0x4b4
 [   10.028162]                       netlink_rcv_skb+0x60/0x124
 [   10.028162]                       rtnetlink_rcv+0x20/0x30
 [   10.028163]                       netlink_unicast+0x1b4/0x270
 [   10.028163]                       netlink_sendmsg+0x1f0/0x400
 [   10.028164]                       sock_sendmsg+0x5c/0x70
 [   10.028165]                       ____sys_sendmsg+0x24c/0x280
 [   10.028165]                       ___sys_sendmsg+0x88/0xd0
 [   10.028166]                       __sys_sendmsg+0x70/0xd0
 [   10.028166]                       __arm64_sys_sendmsg+0x2c/0x40
 [   10.028167]                       el0_svc_common.constprop.0+0x84/0x200
 [   10.028168]                       do_el0_svc+0x2c/0x90
 [   10.028168]                       el0_svc+0x18/0x50
 [   10.028169]                       el0_sync_handler+0xe0/0x350
 [   10.028169]                       el0_sync+0x158/0x180
 [   10.028170]  }
 [   10.028171]  ... key      at: [<ffff80001312aef8>] netdev_xmit_lock_key+0x10/0x390
 [   10.028171]  ... acquired at:
 [   10.028172]    __lock_acquire+0x134c/0x1a94
 [   10.028172]    lock_acquire.part.0+0x170/0x360
 [   10.028173]    lock_acquire+0x68/0x8c
 [   10.028173]    _raw_spin_lock+0x64/0x90
 [   10.028174]    virtnet_poll_tx+0x84/0x120
 [   10.028174]    netpoll_poll_dev+0x12c/0x350
 [   10.028175]    netpoll_send_skb+0x39c/0x400
 [   10.028175]    netpoll_send_udp+0x2b8/0x440
 [   10.028176]    write_msg+0xfc/0x120 [netconsole]
 [   10.028176]    console_unlock+0x3ec/0x6a4
 [   10.028177]    register_console+0x17c/0x2f4
 [   10.028178]    init_netconsole+0x20c/0x1000 [netconsole]
 [   10.028178]    do_one_initcall+0x8c/0x480
 [   10.028179]    do_init_module+0x60/0x270
 [   10.028179]    load_module+0x21f8/0x2734
 [   10.028180]    __do_sys_finit_module+0xbc/0x12c
 [   10.028180]    __arm64_sys_finit_module+0x28/0x34
 [   10.028181]    el0_svc_common.constprop.0+0x84/0x200
 [   10.028181]    do_el0_svc+0x2c/0x90
 [   10.028182]    el0_svc+0x18/0x50
 [   10.028182]    el0_sync_handler+0xe0/0x350
 [   10.028183]    el0_sync+0x158/0x180
 [   10.028183]
 [   10.028183]
 [   10.028184] stack backtrace:
 [   10.028185] CPU: 14 PID: 638 Comm: modprobe Not tainted 5.10.0-rc4_for_upstream_min_debug_2020_11_22_19_37 #1
 [   10.028186] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
 [   10.028186] Call trace:
 [   10.028186]  dump_backtrace+0x0/0x1d0
 [   10.028187]  show_stack+0x20/0x3c
 [   10.028187]  dump_stack+0xec/0x138
 [   10.028188]  check_irq_usage+0x6b8/0x6cc
 [   10.028188]  __lock_acquire+0x134c/0x1a94
 [   10.028189]  lock_acquire.part.0+0x170/0x360
 [   10.028189]  lock_acquire+0x68/0x8c
 [   10.028190]  _raw_spin_lock+0x64/0x90
 [   10.028191]  virtnet_poll_tx+0x84/0x120
 [   10.028191]  netpoll_poll_dev+0x12c/0x350
 [   10.028192]  netpoll_send_skb+0x39c/0x400
 [   10.028192]  netpoll_send_udp+0x2b8/0x440
 [   10.028193]  write_msg+0xfc/0x120 [netconsole]
 [   10.028193]  console_unlock+0x3ec/0x6a4
 [   10.028194]  register_console+0x17c/0x2f4
 [   10.028194]  init_netconsole+0x20c/0x1000 [netconsole]
 [   10.028195]  do_one_initcall+0x8c/0x480
 [   10.028195]  do_init_module+0x60/0x270
 [   10.028196]  load_module+0x21f8/0x2734
 [   10.028197]  __do_sys_finit_module+0xbc/0x12c
 [   10.028197]  __arm64_sys_finit_module+0x28/0x34
 [   10.028198]  el0_svc_common.constprop.0+0x84/0x200
 [   10.028198]  do_el0_svc+0x2c/0x90
 [   10.028199]  el0_svc+0x18/0x50
 [   10.028199]  el0_sync_handler+0xe0/0x350
 [   10.028200]  el0_sync+0x158/0x180
 [   10.073569] random: crng init done
 [   10.073964] printk: console [netcon0] enabled
 [   10.074704] random: 7 urandom warning(s) missed due to ratelimiting
 [   10.075340] netconsole: network logging started

>
> -- Steve


* RE: [PATCH] libbpf: add support for canceling cached_cons advance
From: Li,Rongqing @ 2020-11-23 10:43 UTC
  To: Magnus Karlsson; +Cc: Network Development, bpf
In-Reply-To: <CAJ8uoz3d4x9pWWNxmd9+ozt7ei7WUE=S=FnKE1sLZOqoKRwMJQ@mail.gmail.com>



> -----Original Message-----
> From: Magnus Karlsson [mailto:magnus.karlsson@gmail.com]
> Sent: Monday, November 23, 2020 5:40 PM
> To: Li,Rongqing <lirongqing@baidu.com>
> Cc: Network Development <netdev@vger.kernel.org>; bpf
> <bpf@vger.kernel.org>
> Subject: Re: [PATCH] libbpf: add support for canceling cached_cons advance
> 
> On Sun, Nov 22, 2020 at 2:21 PM Li RongQing <lirongqing@baidu.com> wrote:
> >
> > It is possible to fail receiving packets after calling
> > xsk_ring_cons__peek, at this condition, cached_cons has been advanced,
> > should be cancelled.
> 
> Thanks RongQing,
> 
> I have needed this myself in various situations, so I think we should add this.
> But your motivation in the commit message is somewhat confusing. How about
> something like this?
> 
> Add a new function for returning descriptors the user received after an
> xsk_ring_cons__peek call. After the application has gotten a number of
> descriptors from a ring, it might not be able to or want to process them all for
> various reasons. Therefore, it would be useful to have an interface for returning
> or cancelling a number of them so that they are returned to the ring. This patch
> adds a new function called xsk_ring_cons__cancel that performs this operation
> on nb descriptors counted from the end of the batch of descriptors that was
> received through the peek call.
> 
> Replace your commit message with this, fix the bug below, send a v2 and then I
> am happy to ack this.


Thank you very much
> 
> /Magnus
> 
> > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > ---
> >  tools/lib/bpf/xsk.h | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h index
> > 1069c46364ff..4128215c246b 100644
> > --- a/tools/lib/bpf/xsk.h
> > +++ b/tools/lib/bpf/xsk.h
> > @@ -153,6 +153,12 @@ static inline size_t xsk_ring_cons__peek(struct
> xsk_ring_cons *cons,
> >         return entries;
> >  }
> >
> > +static inline void xsk_ring_cons__cancel(struct xsk_ring_cons *cons,
> > +                                        size_t nb) {
> > +       rx->cached_cons -= nb;
> 
> cons-> not rx->. Please make sure the v2 compiles and passes checkpatch.
> 

Sorry for the build error.
I will send a v2.

Thanks

-Li


> > +}
> > +
> >  static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons,
> > size_t nb)  {
> >         /* Make sure data has been read before indicating we are done
> > --
> > 2.17.3
> >


* Re: [PATCH v15 8/9] doc: add ptp_kvm introduction for arm64 support
From: Marc Zyngier @ 2020-11-23 10:58 UTC
  To: Jianyong Wu
  Cc: netdev, yangbo.lu, john.stultz, tglx, pbonzini,
	sean.j.christopherson, richardcochran, Mark.Rutland, will,
	suzuki.poulose, Andre.Przywara, steven.price, linux-kernel,
	linux-arm-kernel, kvmarm, kvm, Steve.Capper, justin.he, nd
In-Reply-To: <20201111062211.33144-9-jianyong.wu@arm.com>

On 2020-11-11 06:22, Jianyong Wu wrote:
> PTP_KVM implementation depends on hypercall using SMCCC. So we
> introduce a new SMCCC service ID. This doc explains how does the
> ID define and how does PTP_KVM works on arm/arm64.
> 
> Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
> ---
>  Documentation/virt/kvm/api.rst         |  9 +++++++
>  Documentation/virt/kvm/arm/index.rst   |  1 +
>  Documentation/virt/kvm/arm/ptp_kvm.rst | 29 +++++++++++++++++++++
>  Documentation/virt/kvm/timekeeping.rst | 35 ++++++++++++++++++++++++++
>  4 files changed, 74 insertions(+)
>  create mode 100644 Documentation/virt/kvm/arm/ptp_kvm.rst
> 
> diff --git a/Documentation/virt/kvm/api.rst 
> b/Documentation/virt/kvm/api.rst
> index 36d5f1f3c6dd..9843dbcbf770 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6391,3 +6391,12 @@ When enabled, KVM will disable paravirtual
> features provided to the
>  guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
>  (0x40000001). Otherwise, a guest may use the paravirtual features
>  regardless of what has actually been exposed through the CPUID leaf.
> +
> +8.27 KVM_CAP_PTP_KVM
> +--------------------
> +
> +:Architectures: arm64
> +
> +This capability indicates that KVM virtual PTP service is supported in 
> host.
> +It must company with the implementation of KVM virtual PTP service in 
> host
> +so VMM can probe if there is the service in host by checking this 
> capability.
> diff --git a/Documentation/virt/kvm/arm/index.rst
> b/Documentation/virt/kvm/arm/index.rst
> index 3e2b2aba90fc..78a9b670aafe 100644
> --- a/Documentation/virt/kvm/arm/index.rst
> +++ b/Documentation/virt/kvm/arm/index.rst
> @@ -10,3 +10,4 @@ ARM
>     hyp-abi
>     psci
>     pvtime
> +   ptp_kvm
> diff --git a/Documentation/virt/kvm/arm/ptp_kvm.rst
> b/Documentation/virt/kvm/arm/ptp_kvm.rst
> new file mode 100644
> index 000000000000..bb1e6cfefe44
> --- /dev/null
> +++ b/Documentation/virt/kvm/arm/ptp_kvm.rst
> @@ -0,0 +1,29 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +PTP_KVM support for arm/arm64
> +=============================
> +
> +PTP_KVM is used for time sync between guest and host in a high 
> precision.
> +It needs to get the wall time and counter value from the host and
> transfer these
> +to guest via hypercall service. So one more hypercall service has been 
> added.
> +
> +This new SMCCC hypercall is defined as:
> +
> +* ARM_SMCCC_HYP_KVM_PTP_FUNC_ID: 0x86000001
> +
> +As both 32 and 64-bits ptp_kvm client should be supported, we choose
> SMC32/HVC32
> +calling convention.
> +
> +ARM_SMCCC_HYP_KVM_PTP_FUNC_ID:
> +
> +    =============    ==========    ==========
> +    Function ID:     (uint32)      0x86000001
> +    Arguments:	     (uint32)      ARM_PTP_PHY_COUNTER(1) or
> ARM_PTP_VIRT_COUNTER(0)
> +                                   which indicate acquiring physical 
> counter or
> +                                   virtual counter respectively.
> +    return value:    (uint32)      NOT_SUPPORTED(-1) or val0 and val1 
> represent
> +                                   wall clock time and val2 and val3 
> represent
> +                                   counter cycle.

This needs a lot more description:

- Which word contains which part of the data (upper/lower half of the
  64-bit data)
- The endianness of the data returned

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH v15 7/9] ptp: arm/arm64: Enable ptp_kvm for arm/arm64
From: Marc Zyngier @ 2020-11-23 10:49 UTC
  To: Jianyong Wu
  Cc: netdev, yangbo.lu, john.stultz, tglx, pbonzini,
	sean.j.christopherson, richardcochran, Mark.Rutland, will,
	suzuki.poulose, Andre.Przywara, steven.price, linux-kernel,
	linux-arm-kernel, kvmarm, kvm, Steve.Capper, justin.he, nd
In-Reply-To: <20201111062211.33144-8-jianyong.wu@arm.com>

On 2020-11-11 06:22, Jianyong Wu wrote:
> Currently, there is no mechanism to keep time sync between guest and 
> host
> in arm/arm64 virtualization environment. Time in guest will drift 
> compared
> with host after boot up as they may both use third party time sources
> to correct their time respectively. The time deviation will be in order
> of milliseconds. But in some scenarios,like in cloud envirenment, we 
> ask

environment

> for higher time precision.
> 
> kvm ptp clock, which chooses the host clock source as a reference
> clock to sync time between guest and host, has been adopted by x86
> which takes the time sync order from milliseconds to nanoseconds.
> 
> This patch enables kvm ptp clock for arm/arm64 and improves clock sync 
> precison

precision

> significantly.
> 
> Test result comparisons between with kvm ptp clock and without it in 
> arm/arm64
> are as follows. This test derived from the result of command 'chronyc
> sources'. we should take more care of the last sample column which 
> shows
> the offset between the local clock and the source at the last 
> measurement.
> 
> no kvm ptp in guest:
> MS Name/IP address   Stratum Poll Reach LastRx Last sample
> ========================================================================
> ^* dns1.synet.edu.cn      2   6   377    13  +1040us[+1581us] +/-   21ms
> ^* dns1.synet.edu.cn      2   6   377    21  +1040us[+1581us] +/-   21ms
> ^* dns1.synet.edu.cn      2   6   377    29  +1040us[+1581us] +/-   21ms
> ^* dns1.synet.edu.cn      2   6   377    37  +1040us[+1581us] +/-   21ms
> ^* dns1.synet.edu.cn      2   6   377    45  +1040us[+1581us] +/-   21ms
> ^* dns1.synet.edu.cn      2   6   377    53  +1040us[+1581us] +/-   21ms
> ^* dns1.synet.edu.cn      2   6   377    61  +1040us[+1581us] +/-   21ms
> ^* dns1.synet.edu.cn      2   6   377     4   -130us[ +796us] +/-   21ms
> ^* dns1.synet.edu.cn      2   6   377    12   -130us[ +796us] +/-   21ms
> ^* dns1.synet.edu.cn      2   6   377    20   -130us[ +796us] +/-   21ms
> 
> in host:
> MS Name/IP address   Stratum Poll Reach LastRx Last sample
> ========================================================================
> ^* 120.25.115.20          2   7   377    72   -470us[ -603us] +/-   18ms
> ^* 120.25.115.20          2   7   377    92   -470us[ -603us] +/-   18ms
> ^* 120.25.115.20          2   7   377   112   -470us[ -603us] +/-   18ms
> ^* 120.25.115.20          2   7   377     2   +872ns[-6808ns] +/-   17ms
> ^* 120.25.115.20          2   7   377    22   +872ns[-6808ns] +/-   17ms
> ^* 120.25.115.20          2   7   377    43   +872ns[-6808ns] +/-   17ms
> ^* 120.25.115.20          2   7   377    63   +872ns[-6808ns] +/-   17ms
> ^* 120.25.115.20          2   7   377    83   +872ns[-6808ns] +/-   17ms
> ^* 120.25.115.20          2   7   377   103   +872ns[-6808ns] +/-   17ms
> ^* 120.25.115.20          2   7   377   123   +872ns[-6808ns] +/-   17ms
> 
> The dns1.synet.edu.cn is the network reference clock for guest and
> 120.25.115.20 is the network reference clock for host. we can't get the
> clock error between guest and host directly, but a roughly estimated 
> value
> will be in order of hundreds of us to ms.
> 
> with kvm ptp in guest:
> chrony has been disabled in the host to remove the disturbance from
> the network clock.
> 
> MS Name/IP address         Stratum Poll Reach LastRx Last sample
> ========================================================================
> * PHC0                    0   3   377     8     -7ns[   +1ns] +/-    3ns
> * PHC0                    0   3   377     8     +1ns[  +16ns] +/-    3ns
> * PHC0                    0   3   377     6     -4ns[   -0ns] +/-    6ns
> * PHC0                    0   3   377     6     -8ns[  -12ns] +/-    5ns
> * PHC0                    0   3   377     5     +2ns[   +4ns] +/-    4ns
> * PHC0                    0   3   377    13     +2ns[   +4ns] +/-    4ns
> * PHC0                    0   3   377    12     -4ns[   -6ns] +/-    4ns
> * PHC0                    0   3   377    11     -8ns[  -11ns] +/-    6ns
> * PHC0                    0   3   377    10    -14ns[  -20ns] +/-    4ns
> * PHC0                    0   3   377     8     +4ns[   +5ns] +/-    4ns
> 
> PHC0 is the ptp clock, which uses the host clock as its source
> clock. So we can see that the clock difference between host and guest
> is on the order of ns.
> 
> Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
> ---
>  drivers/clocksource/arm_arch_timer.c | 28 ++++++++++++++++++
>  drivers/ptp/Kconfig                  |  2 +-
>  drivers/ptp/Makefile                 |  1 +
>  drivers/ptp/ptp_kvm_arm.c            | 44 ++++++++++++++++++++++++++++
>  4 files changed, 74 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/ptp/ptp_kvm_arm.c
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c
> b/drivers/clocksource/arm_arch_timer.c
> index d55acffb0b90..b33c5a663d30 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -25,6 +25,7 @@
>  #include <linux/sched/clock.h>
>  #include <linux/sched_clock.h>
>  #include <linux/acpi.h>
> +#include <linux/arm-smccc.h>
> 
>  #include <asm/arch_timer.h>
>  #include <asm/virt.h>
> @@ -1650,3 +1651,30 @@ static int __init arch_timer_acpi_init(struct
> acpi_table_header *table)
>  }
>  TIMER_ACPI_DECLARE(arch_timer, ACPI_SIG_GTDT, arch_timer_acpi_init);
>  #endif
> +
> +int kvm_arch_ptp_get_crosststamp(u64 *cycle, struct timespec64 *ts,
> +			      struct clocksource **cs)
> +{
> +	struct arm_smccc_res hvc_res;
> +	ktime_t ktime;
> +	u32 ptp_counter;
> +
> +	if (arch_timer_uses_ppi == ARCH_TIMER_VIRT_PPI)
> +		ptp_counter = ARM_PTP_VIRT_COUNTER;
> +	else
> +		ptp_counter = ARM_PTP_PHY_COUNTER;
> +
> +	arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
> +			     ptp_counter, &hvc_res);
> +
> +	if ((int)(hvc_res.a0) < 0)
> +		return -EOPNOTSUPP;
> +
> +	ktime = (u64)hvc_res.a0 << 32 | hvc_res.a1;
> +	*ts = ktime_to_timespec64(ktime);
> +	*cycle = (u64)hvc_res.a2 << 32 | hvc_res.a3;

Endianness.

> +	*cs = &clocksource_counter;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_arch_ptp_get_crosststamp);
> diff --git a/drivers/ptp/Kconfig b/drivers/ptp/Kconfig
> index 942f72d8151d..677c7f696b70 100644
> --- a/drivers/ptp/Kconfig
> +++ b/drivers/ptp/Kconfig
> @@ -106,7 +106,7 @@ config PTP_1588_CLOCK_PCH
>  config PTP_1588_CLOCK_KVM
>  	tristate "KVM virtual PTP clock"
>  	depends on PTP_1588_CLOCK
> -	depends on KVM_GUEST && X86
> +	depends on KVM_GUEST && X86 || (HAVE_ARM_SMCCC_DISCOVERY && 
> ARM_ARCH_TIMER)
>  	default y
>  	help
>  	  This driver adds support for using kvm infrastructure as a PTP
> diff --git a/drivers/ptp/Makefile b/drivers/ptp/Makefile
> index 699a4e4d19c2..9fa5ede44b2b 100644
> --- a/drivers/ptp/Makefile
> +++ b/drivers/ptp/Makefile
> @@ -5,6 +5,7 @@
> 
>  ptp-y					:= ptp_clock.o ptp_chardev.o ptp_sysfs.o
>  ptp_kvm-$(CONFIG_X86)			:= ptp_kvm_x86.o ptp_kvm_common.o
> +ptp_kvm-$(CONFIG_HAVE_ARM_SMCCC)	:= ptp_kvm_arm.o ptp_kvm_common.o
>  obj-$(CONFIG_PTP_1588_CLOCK)		+= ptp.o
>  obj-$(CONFIG_PTP_1588_CLOCK_DTE)	+= ptp_dte.o
>  obj-$(CONFIG_PTP_1588_CLOCK_INES)	+= ptp_ines.o
> diff --git a/drivers/ptp/ptp_kvm_arm.c b/drivers/ptp/ptp_kvm_arm.c
> new file mode 100644
> index 000000000000..2212827c0384
> --- /dev/null
> +++ b/drivers/ptp/ptp_kvm_arm.c
> @@ -0,0 +1,44 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + *  Virtual PTP 1588 clock for use with KVM guests
> + *  Copyright (C) 2019 ARM Ltd.
> + *  All Rights Reserved
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/err.h>
> +#include <asm/hypervisor.h>
> +#include <linux/module.h>
> +#include <linux/psci.h>
> +#include <linux/arm-smccc.h>
> +#include <linux/timecounter.h>
> +#include <linux/sched/clock.h>
> +#include <asm/arch_timer.h>
> +#include <asm/hypervisor.h>
> +
> +int kvm_arch_ptp_init(void)
> +{
> +	int ret;
> +
> +	ret = kvm_arm_hyp_service_available(ARM_SMCCC_KVM_FUNC_KVM_PTP);
> +	if (ret <= 0)
> +		return -EOPNOTSUPP;
> +
> +	return 0;
> +}
> +
> +int kvm_arch_ptp_get_clock(struct timespec64 *ts)
> +{
> +	ktime_t ktime;
> +	struct arm_smccc_res hvc_res;
> +
> +	arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
> +			     ARM_PTP_NONE_COUNTER, &hvc_res);

I really don't see the need to use a non-architectural counter ID.
Using the virtual counter ID should just be fine, and shouldn't
lead to any issue.

Am I missing something?

> +	if ((int)(hvc_res.a0) < 0)
> +		return -EOPNOTSUPP;
> +
> +	ktime = (u64)hvc_res.a0 << 32 | hvc_res.a1;

Endianness.

> +	*ts = ktime_to_timespec64(ktime);
> +
> +	return 0;
> +}

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* Re: [PATCH v15 6/9] arm64/kvm: Add hypercall service for kvm ptp.
From: Marc Zyngier @ 2020-11-23 10:44 UTC (permalink / raw)
  To: Jianyong Wu
  Cc: netdev, yangbo.lu, john.stultz, tglx, pbonzini,
	sean.j.christopherson, richardcochran, Mark.Rutland, will,
	suzuki.poulose, Andre.Przywara, steven.price, linux-kernel,
	linux-arm-kernel, kvmarm, kvm, Steve.Capper, justin.he, nd
In-Reply-To: <20201111062211.33144-7-jianyong.wu@arm.com>

On 2020-11-11 06:22, Jianyong Wu wrote:
> ptp_kvm will get this service through SMCC call.
> The service offers wall time and cycle count of host to guest.
> The caller must specify whether they want the host cycle count
> or the difference between host cycle count and cntvoff.
> 
> Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
> ---
>  arch/arm64/kvm/hypercalls.c | 61 +++++++++++++++++++++++++++++++++++++
>  include/linux/arm-smccc.h   | 17 +++++++++++
>  2 files changed, 78 insertions(+)
> 
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index b9d8607083eb..f7d189563f3d 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -9,6 +9,51 @@
>  #include <kvm/arm_hypercalls.h>
>  #include <kvm/arm_psci.h>
> 
> +static void kvm_ptp_get_time(struct kvm_vcpu *vcpu, u64 *val)
> +{
> +	struct system_time_snapshot systime_snapshot;
> +	u64 cycles = ~0UL;
> +	u32 feature;
> +
> +	/*
> +	 * system time and counter value must captured in the same
> +	 * time to keep consistency and precision.
> +	 */
> +	ktime_get_snapshot(&systime_snapshot);
> +
> +	// binding ptp_kvm clocksource to arm_arch_counter
> +	if (systime_snapshot.cs_id != CSID_ARM_ARCH_COUNTER)
> +		return;
> +
> +	val[0] = upper_32_bits(systime_snapshot.real);
> +	val[1] = lower_32_bits(systime_snapshot.real);

What is the endianness of these values? I can't see it defined
anywhere, and this is likely not to work if guest and hypervisor
don't align.
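
To make the concern concrete, here is a minimal standalone sketch of the
high/low split used for val[0]/val[1] and of the reassembly the guest
would have to do. The helper names ptp_pack64() and ptp_unpack64() are
hypothetical and not part of the patch. Assuming the halves travel as
register values, what has to be defined somewhere is the convention:
which register carries which half.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: split a 64-bit value into
 * two 32-bit halves, the same way val[0]/val[1] are filled above. */
static void ptp_pack64(uint64_t v, uint32_t *hi, uint32_t *lo)
{
	*hi = (uint32_t)(v >> 32);	/* value of upper_32_bits(v) */
	*lo = (uint32_t)v;		/* value of lower_32_bits(v) */
}

/* Hypothetical helper: reassemble on the receiving side. This only
 * round-trips if both sides agree that the first argument is the
 * high half. */
static uint64_t ptp_unpack64(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```

Swapping the two arguments on the receiving side yields a different
value, which is why the ordering needs to be spelled out in the ABI
description rather than left implicit.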

> +
> +	/*
> +	 * which of virtual counter or physical counter being
> +	 * asked for is decided by the r1 value of SMCCC
> +	 * call. If no invalid r1 value offered, default cycle
> +	 * value(-1) will be returned.
> +	 * Note: keep in mind that feature is u32 and smccc_get_arg1
> +	 * will return u64, so need auto cast here.
> +	 */
> +	feature = smccc_get_arg1(vcpu);
> +	switch (feature) {
> +	case ARM_PTP_VIRT_COUNTER:
> +		cycles = systime_snapshot.cycles - vcpu_read_sys_reg(vcpu, 
> CNTVOFF_EL2);
> +		break;
> +	case ARM_PTP_PHY_COUNTER:
> +		cycles = systime_snapshot.cycles;
> +		break;
> +	case ARM_PTP_NONE_COUNTER:

What is this "NONE" counter?

> +		break;
> +	default:
> +		val[0] = SMCCC_RET_NOT_SUPPORTED;
> +		break;
> +	}
> +	val[2] = upper_32_bits(cycles);
> +	val[3] = lower_32_bits(cycles);

Same problem as above.

> +}
> +
>  int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>  {
>  	u32 func_id = smccc_get_function(vcpu);
> @@ -79,6 +124,22 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>  		break;
>  	case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
>  		val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
> +		val[0] |= BIT(ARM_SMCCC_KVM_FUNC_KVM_PTP);
> +		break;
> +	/*
> +	 * This serves virtual kvm_ptp.
> +	 * Four values will be passed back.
> +	 * reg0 stores high 32-bits of host ktime;
> +	 * reg1 stores low 32-bits of host ktime;
> +	 * For ARM_PTP_VIRT_COUNTER:
> +	 * reg2 stores high 32-bits of difference of host cycles and cntvoff;
> +	 * reg3 stores low 32-bits of difference of host cycles and cntvoff.
> +	 * For ARM_PTP_PHY_COUNTER:
> +	 * reg2 stores the high 32-bits of host cycles;
> +	 * reg3 stores the low 32-bits of host cycles.
> +	 */
> +	case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> +		kvm_ptp_get_time(vcpu, val);
>  		break;
>  	default:
>  		return kvm_psci_call(vcpu);
> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
> index d75408141137..a03c5dd409d3 100644
> --- a/include/linux/arm-smccc.h
> +++ b/include/linux/arm-smccc.h
> @@ -103,6 +103,7 @@
> 
>  /* KVM "vendor specific" services */
>  #define ARM_SMCCC_KVM_FUNC_FEATURES		0
> +#define ARM_SMCCC_KVM_FUNC_KVM_PTP		1

I think having KVM once in the name is enough.

>  #define ARM_SMCCC_KVM_FUNC_FEATURES_2		127
>  #define ARM_SMCCC_KVM_NUM_FUNCS			128
> 
> @@ -114,6 +115,22 @@
> 
>  #define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED	1
> 
> +/*
> + * ptp_kvm is a feature used for time sync between vm and host.
> + * ptp_kvm module in guest kernel will get service from host using
> + * this hypercall ID.
> + */
> +#define ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID				\
> +	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,				\
> +			   ARM_SMCCC_SMC_32,				\
> +			   ARM_SMCCC_OWNER_VENDOR_HYP,			\
> +			   ARM_SMCCC_KVM_FUNC_KVM_PTP)
> +
> +/* ptp_kvm counter type ID */
> +#define ARM_PTP_VIRT_COUNTER			0
> +#define ARM_PTP_PHY_COUNTER			1
> +#define ARM_PTP_NONE_COUNTER			2

The architecture definitely doesn't have this last counter.

> +
>  /* Paravirtualised time calls (defined by ARM DEN0057A) */
>  #define ARM_SMCCC_HV_PV_TIME_FEATURES				\
>  	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,			\

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* Re: [PATCH net-next v4 2/5] net/lapb: support netdev events
From: Martin Schiller @ 2020-11-23 10:38 UTC (permalink / raw)
  To: Xie He
  Cc: Andrew Hendry, David S. Miller, Jakub Kicinski, Linux X25,
	Linux Kernel Network Developers, LKML
In-Reply-To: <CAJht_EO+enBOFMkVVB5y6aRnyMEsOZtUBJcAvOFBS91y7CauyQ@mail.gmail.com>

On 2020-11-23 11:08, Xie He wrote:
> On Mon, Nov 23, 2020 at 1:36 AM Xie He <xie.he.0141@gmail.com> wrote:
>> 
>> Some drivers don't support carrier status and will never change it.
>> Their carrier status will always be UP. There will not be a
>> NETDEV_CHANGE event.

Well, one could argue that we would have to repair these drivers, but I
don't think that will get us anywhere.

From this point of view, it would be best to handle NETDEV_UP in the
lapb event handler and establish the link, analogous to the
NETDEV_CHANGE event, if the carrier is UP.
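
A minimal standalone sketch of that idea follows. It is a model only:
the demo_* names are hypothetical stand-ins for the kernel's notifier
events, netif_carrier_ok() and the LAPB link state, and treating "carrier
down" as "link not established" is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for NETDEV_UP / NETDEV_CHANGE. */
enum demo_event { DEMO_NETDEV_UP, DEMO_NETDEV_CHANGE };

struct demo_dev {
	bool carrier_ok;	/* models netif_carrier_ok(dev) */
	bool link_established;	/* models the LAPB link state */
};

/* Handle NETDEV_UP the same way as NETDEV_CHANGE: the link follows the
 * carrier state, so a driver that never changes its carrier (and thus
 * never triggers NETDEV_CHANGE) still gets its link established. */
static void demo_lapb_event(struct demo_dev *dev, enum demo_event ev)
{
	switch (ev) {
	case DEMO_NETDEV_UP:
	case DEMO_NETDEV_CHANGE:
		/* Assumption: carrier down means no link. */
		dev->link_established = dev->carrier_ok;
		break;
	}
}
```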

>> 
>> lapbether doesn't change carrier status. I also have my own virtual
>> HDLC WAN driver (for testing) which also doesn't change carrier
>> status.
>> 
>> I just tested with lapbether. When I bring up the interface, there
>> will only be NETDEV_PRE_UP and then NETDEV_UP. There will not be
>> NETDEV_CHANGE. The carrier status is always UP.
>> 
>> I haven't tested whether a device can receive NETDEV_CHANGE when it is
>> down. It's possible for a device driver to call netif_carrier_on when
>> the interface is down. Do you know what will happen if a device driver
>> calls netif_carrier_on when the interface is down?
> 
> I just did a test on lapbether and saw there would be no NETDEV_CHANGE
> event when the netif is down, even if netif_carrier_on/off is called.
> So we can rest assured of this part.

^ permalink raw reply

* [PATCH][next] net: hns3: fix spelling mistake "memroy" -> "memory"
From: Colin King @ 2020-11-23 10:34 UTC (permalink / raw)
  To: Yisen Zhuang, Salil Mehta, David S . Miller, Jakub Kicinski,
	Huazhong Tan, netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

There are spelling mistakes in two dev_err messages. Fix them.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c   | 2 +-
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 500cc19225f3..ca668a47121e 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -9924,7 +9924,7 @@ static int hclge_dev_mem_map(struct hclge_dev *hdev)
 				       pci_resource_start(pdev, HCLGE_MEM_BAR),
 				       pci_resource_len(pdev, HCLGE_MEM_BAR));
 	if (!hw->mem_base) {
-		dev_err(&pdev->dev, "failed to map device memroy\n");
+		dev_err(&pdev->dev, "failed to map device memory\n");
 		return -EFAULT;
 	}
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 5d6b419b8a78..5b2f9a56f1d8 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -2904,7 +2904,7 @@ static int hclgevf_dev_mem_map(struct hclgevf_dev *hdev)
 							  HCLGEVF_MEM_BAR),
 				       pci_resource_len(pdev, HCLGEVF_MEM_BAR));
 	if (!hw->mem_base) {
-		dev_err(&pdev->dev, "failed to map device memroy\n");
+		dev_err(&pdev->dev, "failed to map device memory\n");
 		return -EFAULT;
 	}
 
-- 
2.28.0


^ permalink raw reply related

* Re: [PATCH] stmmac: pci: Add support for LS7A bridge chip
From: Jiaxun Yang @ 2020-11-23 10:31 UTC (permalink / raw)
  To: lizhi01, davem, kuba, mcoquelin.stm32
  Cc: lixuefeng, gaojuxin, linux-kernel, netdev
In-Reply-To: <1606125828-15742-1-git-send-email-lizhi01@loongson.cn>

Hi Lizhi,

You didn't send the patch to any mailing list; is this intentional?

On 2020/11/23 18:03, lizhi01 wrote:
> Add gmac driver to support LS7A bridge chip.
>
> Signed-off-by: lizhi01 <lizhi01@loongson.cn>
> ---
>   arch/mips/configs/loongson3_defconfig              |   4 +-
>   drivers/net/ethernet/stmicro/stmmac/Kconfig        |   8 +
>   drivers/net/ethernet/stmicro/stmmac/Makefile       |   1 +
>   .../net/ethernet/stmicro/stmmac/dwmac-loongson.c   | 194 +++++++++++++++++++++
>   4 files changed, 206 insertions(+), 1 deletion(-)
>   create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwmac-loongson.c
>
> diff --git a/arch/mips/configs/loongson3_defconfig b/arch/mips/configs/loongson3_defconfig
> index 38a817e..2e8d2be 100644
> --- a/arch/mips/configs/loongson3_defconfig
> +++ b/arch/mips/configs/loongson3_defconfig
> @@ -225,7 +225,9 @@ CONFIG_R8169=y
>   # CONFIG_NET_VENDOR_SILAN is not set
>   # CONFIG_NET_VENDOR_SIS is not set
>   # CONFIG_NET_VENDOR_SMSC is not set
> -# CONFIG_NET_VENDOR_STMICRO is not set
> +CONFIG_NET_VENDOR_STMICR=y
> +CONFIG_STMMAC_ETH=y
> +CONFIG_DWMAC_LOONGSON=y
>   # CONFIG_NET_VENDOR_SUN is not set
>   # CONFIG_NET_VENDOR_TEHUTI is not set
>   # CONFIG_NET_VENDOR_TI is not set
> diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
> index 53f14c5..30117cb 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
> +++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
> @@ -230,6 +230,14 @@ config DWMAC_INTEL
>   	  This selects the Intel platform specific bus support for the
>   	  stmmac driver. This driver is used for Intel Quark/EHL/TGL.
>   
> +config DWMAC_LOONGSON
> +	tristate "Intel GMAC support"
> +	depends on STMMAC_ETH && PCI
> +	depends on COMMON_CLK
> +	help
> +	  This selects the Intel platform specific bus support for the
> +	  stmmac driver.

Intel ???

> +
>   config STMMAC_PCI
>   	tristate "STMMAC PCI bus support"
>   	depends on STMMAC_ETH && PCI
> diff --git a/drivers/net/ethernet/stmicro/stmmac/Makefile b/drivers/net/ethernet/stmicro/stmmac/Makefile
> index 24e6145..11ea4569 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/Makefile
> +++ b/drivers/net/ethernet/stmicro/stmmac/Makefile
> @@ -34,4 +34,5 @@ dwmac-altr-socfpga-objs := altr_tse_pcs.o dwmac-socfpga.o
>   
>   obj-$(CONFIG_STMMAC_PCI)	+= stmmac-pci.o
>   obj-$(CONFIG_DWMAC_INTEL)	+= dwmac-intel.o
> +obj-$(CONFIG_DWMAC_LOONGSON)	+= dwmac-loongson.o
>   stmmac-pci-objs:= stmmac_pci.o
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-loongson.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-loongson.c
> new file mode 100644
> index 0000000..765412e
> --- /dev/null
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-loongson.c
> @@ -0,0 +1,194 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020, Loongson Corporation
> + */
> +
> +#include <linux/clk-provider.h>
> +#include <linux/pci.h>
> +#include <linux/dmi.h>
> +#include <linux/device.h>
> +#include <linux/of_irq.h>
> +#include "stmmac.h"
> +
> +struct stmmac_pci_info {
> +	int (*setup)(struct pci_dev *pdev, struct plat_stmmacenet_data *plat);
> +};
> +
> +static void common_default_data(struct plat_stmmacenet_data *plat)
> +{
> +	plat->clk_csr = 2;
> +	plat->has_gmac = 1;
> +	plat->force_sf_dma_mode = 1;
> +	
> +	plat->mdio_bus_data->needs_reset = true;
> +
> +	plat->multicast_filter_bins = HASH_TABLE_SIZE;
> +
> +	plat->unicast_filter_entries = 1;
> +
> +	plat->maxmtu = JUMBO_LEN;
> +
> +	plat->tx_queues_to_use = 1;
> +	plat->rx_queues_to_use = 1;
> +
> +	plat->tx_queues_cfg[0].use_prio = false;
> +	plat->rx_queues_cfg[0].use_prio = false;
> +
> +	plat->rx_queues_cfg[0].pkt_route = 0x0;
> +}
> +
> +static int loongson_default_data(struct pci_dev *pdev, struct plat_stmmacenet_data *plat)
> +{
> +	common_default_data(plat);
> +	
> +	plat->bus_id = pci_dev_id(pdev);
> +	plat->phy_addr = -1;
> +	plat->interface = PHY_INTERFACE_MODE_GMII;
> +
> +	plat->dma_cfg->pbl = 32;
> +	plat->dma_cfg->pblx8 = true;
> +
> +	plat->multicast_filter_bins = 256;
> +
> +	return 0;	
> +}


You can merge common and Loongson config as the driver is solely used by 
Loongson.

The callback is not necessary as well...


> +
> +static const struct stmmac_pci_info loongson_pci_info = {
> +	.setup = loongson_default_data,
> +};
> +
> +static int loongson_gmac_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> +	struct stmmac_pci_info *info = (struct stmmac_pci_info *)id->driver_data;
> +	struct plat_stmmacenet_data *plat;
> +	struct stmmac_resources res;
> +	int ret, i, lpi_irq;
> +	struct device_node *np;	
> +	
> +	plat = devm_kzalloc(&pdev->dev, sizeof(struct plat_stmmacenet_data), GFP_KERNEL);
> +	if (!plat)
> +		return -ENOMEM;
> +
> +	plat->mdio_bus_data = devm_kzalloc(&pdev->dev, sizeof(struct stmmac_mdio_bus_data), GFP_KERNEL);
> +	if (!plat->mdio_bus_data) {
> +		kfree(plat);
> +		return -ENOMEM;
> +	}
> +
> +	plat->dma_cfg = devm_kzalloc(&pdev->dev, sizeof(struct stmmac_dma_cfg), GFP_KERNEL);
> +	if (!plat->dma_cfg)	{
> +		kfree(plat);
> +		return -ENOMEM;
> +	}
> +
> +	ret = pci_enable_device(pdev);
> +	if (ret) {
> +		dev_err(&pdev->dev, "%s: ERROR: failed to enable device\n", __func__);
> +		kfree(plat);
> +		return ret;
> +	}
> +
> +	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
> +		if (pci_resource_len(pdev, i) == 0)
> +			continue;
> +		ret = pcim_iomap_regions(pdev, BIT(0), pci_name(pdev));
> +		if (ret)
> +			return ret;
> +		break;
> +	}


The BAR order is fixed on Loongson so there is no need to check it one 
by one.

Simply use BAR0 instead.


> +
> +	pci_set_master(pdev);
> +
> +	ret = info->setup(pdev, plat);
> +	if (ret)
> +		return ret;
> +
> +	pci_enable_msi(pdev);
> +
> +	memset(&res, 0, sizeof(res));
> +	res.addr = pcim_iomap_table(pdev)[i];
> +	res.irq = pdev->irq;
> +	res.wol_irq = pdev->irq;	
> +
> +	np = dev_of_node(&pdev->dev);


Please check for the node earlier and bail out if there is no node.

Also you should get both IRQs via DT to avoid misordering.


> +	lpi_irq = of_irq_get_byname(np, "eth_lpi");
> +	res.lpi_irq = lpi_irq;
> +	
> +	return stmmac_dvr_probe(&pdev->dev, plat, &res);
> +}
> +
> +static void loongson_gmac_remove(struct pci_dev *pdev)
> +{
> +	int i;
> +	
> +	stmmac_dvr_remove(&pdev->dev);
> +	
> +	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
> +		if (pci_resource_len(pdev, i) == 0)
> +			continue;
> +		pcim_iounmap_regions(pdev, BIT(i));
> +		break;
> +	}
> +
> +	pci_disable_device(pdev);
> +}
> +
> +static int __maybe_unused loongson_eth_pci_suspend(struct device *dev)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	int ret;
> +
> +	ret = stmmac_suspend(dev);
> +	if (ret)
> +		return ret;
> +	
> +	ret = pci_save_state(pdev);
> +	if (ret)
> +		return ret;
> +
> +	pci_disable_device(pdev);
> +	pci_wake_from_d3(pdev, true);
> +	return 0;
> +}
> +
> +static int __maybe_unused loongson_eth_pci_resume(struct device *dev)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	int ret;
> +
> +	pci_restore_state(pdev);
> +	pci_set_power_state(pdev, PCI_D0);
> +
> +	ret = pci_enable_device(pdev);
> +	if (ret)
> +		return ret;
> +	
> +	pci_set_master(pdev);
> +	
> +	return stmmac_resume(dev);
> +}	
> +
> +static SIMPLE_DEV_PM_OPS(loongson_eth_pm_ops, loongson_eth_pci_suspend, loongson_eth_pci_resume);
> +
> +#define PCI_DEVICE_ID_LOONGSON_GMAC 0x7a03
> +
> +static const struct pci_device_id loongson_gmac_table[] = {
> +	{ PCI_DEVICE_DATA(LOONGSON, GMAC, &loongson_pci_info) },
> +	{}
> +};
> +MODULE_DEVICE_TABLE(pci, loongson_gmac_table);
> +
> +struct pci_driver loongson_gmac_driver = {
> +	.name = "loongson gmac",
> +	.id_table = loongson_gmac_table,
> +	.probe = loongson_gmac_probe,
> +	.remove = loongson_gmac_remove,
> +	.driver = {
> +		.pm = &loongson_eth_pm_ops,
> +	},
> +};
> +
> +module_pci_driver(loongson_gmac_driver);
> +
> +MODULE_DESCRIPTION("Loongson DWMAC PCI driver");
> +MODULE_AUTHOR("Zhi Li <lizhi01@loongson.com>");
> +MODULE_LICENSE("GPL v2");


Thanks

- Jiaxun

^ permalink raw reply

* Re: Is test_offload.py supposed to work?
From: Toke Høiland-Jørgensen @ 2020-11-23 10:31 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: Jakub Kicinski, Jiri Pirko, bpf, Networking
In-Reply-To: <CAEf4BzaYPXKCSUX50UrkvbGZ+Ne_YqHLfcgtXzwWFpCvugC8jg@mail.gmail.com>

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Fri, Nov 20, 2020 at 7:49 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Hi Jakub and Jiri
>>
>> I am investigating an error with XDP offload mode, and figured I'd run
>> 'test_offload.py' from selftests. However, I'm unable to get it to run
>> successfully; am I missing some config options, or has it simply
>> bit-rotted to the point where it no longer works?
>>
>
> See also discussion in [0]
>
>   [0] https://www.spinics.net/lists/netdev/msg697523.html

Ah, right, thanks for the pointer :)

-Toke

^ permalink raw reply

* Re: Is test_offload.py supposed to work?
From: Toke Høiland-Jørgensen @ 2020-11-23 10:31 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: Jiri Pirko, bpf, netdev
In-Reply-To: <20201120084846.710549e8@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

Jakub Kicinski <kuba@kernel.org> writes:

> On Fri, 20 Nov 2020 16:46:51 +0100 Toke Høiland-Jørgensen wrote:
>> Hi Jakub and Jiri
>> 
>> I am investigating an error with XDP offload mode, and figured I'd run
>> 'test_offload.py' from selftests. However, I'm unable to get it to run
>> successfully; am I missing some config options, or has it simply
>> bit-rotted to the point where it no longer works?
>
> Yeah it must have bit rotted, there are no config options to get
> wrong there AFAIK.
>
> It shouldn't be too hard to fix tho, it's just a python script...

Right, I'll take a stab at fixing it, just wanted to make sure I wasn't
missing something obvious; thanks!

-Toke

^ permalink raw reply

* [PATCH v2] ath10k: qmi: Skip host capability request for Xiaomi Poco F1
From: Amit Pundir @ 2020-11-23 10:28 UTC (permalink / raw)
  To: Kalle Valo, David S Miller, Jakub Kicinski, Bjorn Andersson,
	Jeffrey Hugo
  Cc: John Stultz, Sumit Semwal, Konrad Dybcio, Joel S, ath10k,
	linux-wireless, netdev, phone-devel, lkml

Workaround to get WiFi working on Xiaomi Poco F1 (sdm845)
phone. We get a non-fatal QMI_ERR_MALFORMED_MSG_V01 error
message in ath10k_qmi_host_cap_send_sync(), but we can still
bring up WiFi services successfully on AOSP if we ignore it.

We suspect either the host cap is not implemented or there
may be firmware specific issues. Firmware version is
QC_IMAGE_VERSION_STRING=WLAN.HL.2.0.c3-00257-QCAHLSWMTPLZ-1

qcom,snoc-host-cap-8bit-quirk didn't help. If I use this
quirk, then the host capability request does get accepted,
but we run into fatal "msa info req rejected" error and
WiFi interface doesn't come up.

Attempts are being made to debug the failure reasons but no
luck so far. Hence this device specific workaround instead
of checking for QMI_ERR_MALFORMED_MSG_V01 error message.
Tried ath10k/WCN3990/hw1.0/wlanmdsp.mbn from the upstream
linux-firmware project but it didn't help and neither did
building board-2.bin file from stock bdwlan* files.

This workaround will be removed once we have a viable fix.
Thanks to postmarketOS guys for catching this.

Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
---
We dropped this workaround last time in favor of
a generic dts quirk to skip the host cap check. But that
has been under discussion for a while now,
https://lkml.org/lkml/2020/9/25/1119, so I am resending
this short-term workaround for the time being.

v2: ath10k-check complained about a too long line last
    time, so moved the comment to a new line.

 drivers/net/wireless/ath/ath10k/qmi.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/qmi.c b/drivers/net/wireless/ath/ath10k/qmi.c
index ae6b1f402adf..1c58b0ff1d29 100644
--- a/drivers/net/wireless/ath/ath10k/qmi.c
+++ b/drivers/net/wireless/ath/ath10k/qmi.c
@@ -653,7 +653,9 @@ static int ath10k_qmi_host_cap_send_sync(struct ath10k_qmi *qmi)

 	/* older FW didn't support this request, which is not fatal */
 	if (resp.resp.result != QMI_RESULT_SUCCESS_V01 &&
-	    resp.resp.error != QMI_ERR_NOT_SUPPORTED_V01) {
+	    resp.resp.error != QMI_ERR_NOT_SUPPORTED_V01 &&
+	    /* Xiaomi Poco F1 workaround */
+	    !of_machine_is_compatible("xiaomi,beryllium")) {
 		ath10k_err(ar, "host capability request rejected: %d\n", resp.resp.error);
 		ret = -EINVAL;
 		goto out;
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCHv4 net-next 2/3] octeontx2-af: Add devlink health reporters for NPA
From: George Cherian @ 2020-11-23 10:28 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	kuba@kernel.org, davem@davemloft.net, Sunil Kovvuri Goutham,
	Linu Cherian, Geethasowjanya Akula, masahiroy@kernel.org,
	willemdebruijn.kernel@gmail.com, saeed@kernel.org

Hi Jiri,

> -----Original Message-----
> From: Jiri Pirko <jiri@resnulli.us>
> Sent: Monday, November 23, 2020 3:52 PM
> To: George Cherian <gcherian@marvell.com>
> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> kuba@kernel.org; davem@davemloft.net; Sunil Kovvuri Goutham
> <sgoutham@marvell.com>; Linu Cherian <lcherian@marvell.com>;
> Geethasowjanya Akula <gakula@marvell.com>; masahiroy@kernel.org;
> willemdebruijn.kernel@gmail.com; saeed@kernel.org
> Subject: Re: [PATCHv4 net-next 2/3] octeontx2-af: Add devlink health
> reporters for NPA
> 
> Mon, Nov 23, 2020 at 03:49:06AM CET, gcherian@marvell.com wrote:
> >
> >
> >> -----Original Message-----
> >> From: Jiri Pirko <jiri@resnulli.us>
> >> Sent: Saturday, November 21, 2020 7:44 PM
> >> To: George Cherian <gcherian@marvell.com>
> >> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> >> kuba@kernel.org; davem@davemloft.net; Sunil Kovvuri Goutham
> >> <sgoutham@marvell.com>; Linu Cherian <lcherian@marvell.com>;
> >> Geethasowjanya Akula <gakula@marvell.com>; masahiroy@kernel.org;
> >> willemdebruijn.kernel@gmail.com; saeed@kernel.org
> >> Subject: Re: [PATCHv4 net-next 2/3] octeontx2-af: Add devlink health
> >> reporters for NPA
> >>
> >> Sat, Nov 21, 2020 at 05:02:00AM CET, george.cherian@marvell.com wrote:
> >> >Add health reporters for RVU NPA block.
> >> >NPA Health reporters handle following HW event groups
> >> > - GENERAL events
> >> > - ERROR events
> >> > - RAS events
> >> > - RVU event
> >> >An event counter per event is maintained in SW.
> >> >
> >> >Output:
> >> > # devlink health
> >> > pci/0002:01:00.0:
> >> >   reporter npa
> >> >     state healthy error 0 recover 0
> >> > # devlink  health dump show pci/0002:01:00.0 reporter npa
> >> > NPA_AF_GENERAL:
> >> >        Unmap PF Error: 0
> >> >        Free Disabled for NIX0 RX: 0
> >> >        Free Disabled for NIX0 TX: 0
> >> >        Free Disabled for NIX1 RX: 0
> >> >        Free Disabled for NIX1 TX: 0
> >>
> >> This is for 2 ports if I'm not mistaken. Then you need to have this
> >> reporter per-port. Register ports and have reporter for each.
> >>
> >No, these are not port specific reports.
> >NIX is the Network Interface Controller co-processor block.
> >There are (max of) 2 such co-processor blocks per SoC.
> 
> Ah. I see. In that case, could you please structure the json differently. Don't
> concatenate the number with the string. Instead of that, please have 2
> subtrees, one for each NIX.
> 
NPA_AF_GENERAL:
        Unmap PF Error: 0
        Free Disabled for NIX0
                RX: 0
                TX: 0
        Free Disabled for NIX1
                RX: 0
                TX: 0

Something like this?

Regards,
-George
> 
> >
> >Moreover, this is an NPA (Network Pool/Buffer Allocator co-processor)
> >reporter.
> >This tells whether a free or alloc operation is skipped due to the
> >configurations set by other co-processor blocks (NIX,SSO,TIM etc).
> >
> >https://www.kernel.org/doc/html/latest/networking/device_drivers/ethernet/marvell/octeontx2.html
> >> NAK.


^ permalink raw reply

* Re: [PATCHv4 net-next 2/3] octeontx2-af: Add devlink health reporters for NPA
From: Jiri Pirko @ 2020-11-23 10:22 UTC (permalink / raw)
  To: George Cherian
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	kuba@kernel.org, davem@davemloft.net, Sunil Kovvuri Goutham,
	Linu Cherian, Geethasowjanya Akula, masahiroy@kernel.org,
	willemdebruijn.kernel@gmail.com, saeed@kernel.org
In-Reply-To: <BYAPR18MB2679FA2CCEBC4E921C3E078DC5FC0@BYAPR18MB2679.namprd18.prod.outlook.com>

Mon, Nov 23, 2020 at 03:49:06AM CET, gcherian@marvell.com wrote:
>
>
>> -----Original Message-----
>> From: Jiri Pirko <jiri@resnulli.us>
>> Sent: Saturday, November 21, 2020 7:44 PM
>> To: George Cherian <gcherian@marvell.com>
>> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
>> kuba@kernel.org; davem@davemloft.net; Sunil Kovvuri Goutham
>> <sgoutham@marvell.com>; Linu Cherian <lcherian@marvell.com>;
>> Geethasowjanya Akula <gakula@marvell.com>; masahiroy@kernel.org;
>> willemdebruijn.kernel@gmail.com; saeed@kernel.org
>> Subject: Re: [PATCHv4 net-next 2/3] octeontx2-af: Add devlink health
>> reporters for NPA
>> 
>> Sat, Nov 21, 2020 at 05:02:00AM CET, george.cherian@marvell.com wrote:
>> >Add health reporters for RVU NPA block.
>> >NPA Health reporters handle following HW event groups
>> > - GENERAL events
>> > - ERROR events
>> > - RAS events
>> > - RVU event
>> >An event counter per event is maintained in SW.
>> >
>> >Output:
>> > # devlink health
>> > pci/0002:01:00.0:
>> >   reporter npa
>> >     state healthy error 0 recover 0
>> > # devlink  health dump show pci/0002:01:00.0 reporter npa
>> > NPA_AF_GENERAL:
>> >        Unmap PF Error: 0
>> >        Free Disabled for NIX0 RX: 0
>> >        Free Disabled for NIX0 TX: 0
>> >        Free Disabled for NIX1 RX: 0
>> >        Free Disabled for NIX1 TX: 0
>> 
>> This is for 2 ports if I'm not mistaken. Then you need to have this reporter
>> per-port. Register ports and have reporter for each.
>> 
>No, these are not port specific reports.
>NIX is the Network Interface Controller co-processor block.
>There are (max of) 2 such co-processor blocks per SoC.

Ah. I see. In that case, could you please structure the json
differently. Don't concatenate the number with the string. Instead of
that, please have 2 subtrees, one for each NIX.
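
A minimal userspace sketch of the layout requested above, assuming one
subtree per NIX block with the block number carried as its own field
rather than inside the key string. The struct, the function name and the
JSON key names are hypothetical and are not taken from the driver or
from devlink.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical per-NIX counters; not the driver's real structures. */
struct nix_free_dis {
	unsigned long rx;
	unsigned long tx;
};

/*
 * Emit one JSON subtree per NIX block. The block number is a separate
 * "id" field, so no key has a number concatenated into it.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * the buffer is too small.
 */
static int npa_dump_json(char *buf, size_t len,
			 const struct nix_free_dis *nix, int nr_nix)
{
	size_t off;
	int i, n;

	n = snprintf(buf, len, "{\"nix\":[");
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	for (i = 0; i < nr_nix; i++) {
		n = snprintf(buf + off, len - off,
			     "%s{\"id\":%d,\"free_disabled\":{\"rx\":%lu,\"tx\":%lu}}",
			     i ? "," : "", i, nix[i].rx, nix[i].tx);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	n = snprintf(buf + off, len - off, "]}");
	if (n < 0 || (size_t)n >= len - off)
		return -1;
	off += (size_t)n;

	return (int)off;
}
```

With two blocks this produces an array of two objects, so adding or
removing a NIX block changes the number of subtrees and not the set of
key names.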


>
>Moreover, this is an NPA (Network Pool/Buffer Allocator co-processor) reporter.
>This tells whether a free or alloc operation is skipped due to the configurations set by
>other co-processor blocks (NIX,SSO,TIM etc).
>
>https://www.kernel.org/doc/html/latest/networking/device_drivers/ethernet/marvell/octeontx2.html
>> NAK.


* [PATCH net-next v2] net/nfc/nci: Support NCI 2.x initial sequence
From: Bongsu Jeon @ 2020-11-23 10:12 UTC (permalink / raw)
  To: davem@davemloft.net, kuba@kernel.org
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <CGME20201123101208epcms2p71d4c8d66f08fb7a2e10ae422abde3389@epcms2p7>

Implement the NCI 2.x initial sequence to support NCI 2.x NFCC.
The CORE_RESET and CORE_INIT sequences changed in NCI 2.0.
If the NFCEE supports NCI 2.x, the NCI 2.x initial sequence is used.

In NCI 1.0, Initial sequence and payloads are as below:
(DH)                     (NFCC)
 |  -- CORE_RESET_CMD --> |
 |  <-- CORE_RESET_RSP -- |
 |  -- CORE_INIT_CMD -->  |
 |  <-- CORE_INIT_RSP --  |
 CORE_RESET_RSP payloads are Status, NCI version, Configuration Status.
 CORE_INIT_CMD payloads are empty.
 CORE_INIT_RSP payloads are Status, NFCC Features,
    Number of Supported RF Interfaces, Supported RF Interface,
    Max Logical Connections, Max Routing table Size,
    Max Control Packet Payload Size, Max Size for Large Parameters,
    Manufacturer ID, Manufacturer Specific Information.

In NCI 2.0, Initial Sequence and Parameters are as below:
(DH)                     (NFCC)
 |  -- CORE_RESET_CMD --> |
 |  <-- CORE_RESET_RSP -- |
 |  <-- CORE_RESET_NTF -- |
 |  -- CORE_INIT_CMD -->  |
 |  <-- CORE_INIT_RSP --  |
 CORE_RESET_RSP payloads are Status.
 CORE_RESET_NTF payloads are Reset Trigger,
    Configuration Status, NCI Version, Manufacturer ID,
    Manufacturer Specific Information Length,
    Manufacturer Specific Information.
 CORE_INIT_CMD payloads are Feature1, Feature2.
 CORE_INIT_RSP payloads are Status, NFCC Features,
    Max Logical Connections, Max Routing Table Size,
    Max Control Packet Payload Size,
    Max Data Packet Payload Size of the Static HCI Connection,
    Number of Credits of the Static HCI Connection,
    Max NFC-V RF Frame Size, Number of Supported RF Interfaces,
    Supported RF Interfaces.

Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
---
 Changes in v2:
  - fix the warning of type casting.
  - changed the __u8 type to unsigned char.

 include/net/nfc/nci.h | 39 ++++++++++++++++++++++
 net/nfc/nci/core.c    | 23 +++++++++++--
 net/nfc/nci/ntf.c     | 21 ++++++++++++
 net/nfc/nci/rsp.c     | 75 +++++++++++++++++++++++++++++++++++++------
 4 files changed, 146 insertions(+), 12 deletions(-)

diff --git a/include/net/nfc/nci.h b/include/net/nfc/nci.h
index 0550e0380b8d..decc89803d4b 100644
--- a/include/net/nfc/nci.h
+++ b/include/net/nfc/nci.h
@@ -25,6 +25,8 @@
 #define NCI_MAX_PARAM_LEN					251
 #define NCI_MAX_PAYLOAD_SIZE					255
 #define NCI_MAX_PACKET_SIZE					258
+#define NCI_MAX_LARGE_PARAMS_NCI_v2				15
+#define NCI_VER_2_MASK						0x20
 
 /* NCI Status Codes */
 #define NCI_STATUS_OK						0x00
@@ -131,6 +133,9 @@
 #define NCI_LF_CON_BITR_F_212					0x02
 #define NCI_LF_CON_BITR_F_424					0x04
 
+/* NCI 2.x Feature Enable Bit */
+#define NCI_FEATURE_DISABLE					0x00
+
 /* NCI Reset types */
 #define NCI_RESET_TYPE_KEEP_CONFIG				0x00
 #define NCI_RESET_TYPE_RESET_CONFIG				0x01
@@ -220,6 +225,11 @@ struct nci_core_reset_cmd {
 } __packed;
 
 #define NCI_OP_CORE_INIT_CMD		nci_opcode_pack(NCI_GID_CORE, 0x01)
+/* To support NCI 2.x */
+struct nci_core_init_v2_cmd {
+	unsigned char	feature1;
+	unsigned char	feature2;
+} __packed;
 
 #define NCI_OP_CORE_SET_CONFIG_CMD	nci_opcode_pack(NCI_GID_CORE, 0x02)
 struct set_config_param {
@@ -316,6 +326,11 @@ struct nci_core_reset_rsp {
 	__u8	config_status;
 } __packed;
 
+/* To support NCI ver 2.x */
+struct nci_core_reset_rsp_nci_ver2 {
+	unsigned char	status;
+} __packed;
+
 #define NCI_OP_CORE_INIT_RSP		nci_opcode_pack(NCI_GID_CORE, 0x01)
 struct nci_core_init_rsp_1 {
 	__u8	status;
@@ -334,6 +349,20 @@ struct nci_core_init_rsp_2 {
 	__le32	manufact_specific_info;
 } __packed;
 
+/* To support NCI ver 2.x */
+struct nci_core_init_rsp_nci_ver2 {
+	unsigned char	status;
+	__le32	nfcc_features;
+	unsigned char	max_logical_connections;
+	__le16	max_routing_table_size;
+	unsigned char	max_ctrl_pkt_payload_len;
+	unsigned char	max_data_pkt_hci_payload_len;
+	unsigned char	number_of_hci_credit;
+	__le16	max_nfc_v_frame_size;
+	unsigned char	num_supported_rf_interfaces;
+	unsigned char	supported_rf_interfaces[];
+} __packed;
+
 #define NCI_OP_CORE_SET_CONFIG_RSP	nci_opcode_pack(NCI_GID_CORE, 0x02)
 struct nci_core_set_config_rsp {
 	__u8	status;
@@ -372,6 +401,16 @@ struct nci_nfcee_discover_rsp {
 /* --------------------------- */
 /* ---- NCI Notifications ---- */
 /* --------------------------- */
+#define NCI_OP_CORE_RESET_NTF		nci_opcode_pack(NCI_GID_CORE, 0x00)
+struct nci_core_reset_ntf {
+	unsigned char	reset_trigger;
+	unsigned char	config_status;
+	unsigned char	nci_ver;
+	unsigned char	manufact_id;
+	unsigned char	manufacturer_specific_len;
+	__le32	manufact_specific_info;
+} __packed;
+
 #define NCI_OP_CORE_CONN_CREDITS_NTF	nci_opcode_pack(NCI_GID_CORE, 0x06)
 struct conn_credit_entry {
 	__u8	conn_id;
diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c
index 4953ee5146e1..68889faadda2 100644
--- a/net/nfc/nci/core.c
+++ b/net/nfc/nci/core.c
@@ -165,7 +165,14 @@ static void nci_reset_req(struct nci_dev *ndev, unsigned long opt)
 
 static void nci_init_req(struct nci_dev *ndev, unsigned long opt)
 {
-	nci_send_cmd(ndev, NCI_OP_CORE_INIT_CMD, 0, NULL);
+	struct nci_core_init_v2_cmd *cmd = (struct nci_core_init_v2_cmd *)opt;
+
+	if (!cmd) {
+		nci_send_cmd(ndev, NCI_OP_CORE_INIT_CMD, 0, NULL);
+	} else {
+		/* if nci version is 2.0, then use the feature parameters */
+		nci_send_cmd(ndev, NCI_OP_CORE_INIT_CMD, sizeof(struct nci_core_init_v2_cmd), cmd);
+	}
 }
 
 static void nci_init_complete_req(struct nci_dev *ndev, unsigned long opt)
@@ -497,8 +504,18 @@ static int nci_open_device(struct nci_dev *ndev)
 	}
 
 	if (!rc) {
-		rc = __nci_request(ndev, nci_init_req, 0,
-				   msecs_to_jiffies(NCI_INIT_TIMEOUT));
+		if (!(ndev->nci_ver & NCI_VER_2_MASK)) {
+			rc = __nci_request(ndev, nci_init_req, 0,
+					   msecs_to_jiffies(NCI_INIT_TIMEOUT));
+		} else {
+			struct nci_core_init_v2_cmd nci_init_v2_cmd;
+
+			nci_init_v2_cmd.feature1 = NCI_FEATURE_DISABLE;
+			nci_init_v2_cmd.feature2 = NCI_FEATURE_DISABLE;
+
+			rc = __nci_request(ndev, nci_init_req, (unsigned long)&nci_init_v2_cmd,
+					   msecs_to_jiffies(NCI_INIT_TIMEOUT));
+		}
 	}
 
 	if (!rc && ndev->ops->post_setup)
diff --git a/net/nfc/nci/ntf.c b/net/nfc/nci/ntf.c
index 33e1170817f0..98af04c86b2c 100644
--- a/net/nfc/nci/ntf.c
+++ b/net/nfc/nci/ntf.c
@@ -27,6 +27,23 @@
 
 /* Handle NCI Notification packets */
 
+static void nci_core_reset_ntf_packet(struct nci_dev *ndev,
+				      struct sk_buff *skb)
+{
+	/* Handle NCI 2.x core reset notification */
+	struct nci_core_reset_ntf *ntf = (void *)skb->data;
+
+	ndev->nci_ver = ntf->nci_ver;
+	pr_debug("nci_ver 0x%x, config_status 0x%x\n",
+		 ntf->nci_ver, ntf->config_status);
+
+	ndev->manufact_id = ntf->manufact_id;
+	ndev->manufact_specific_info =
+		__le32_to_cpu(ntf->manufact_specific_info);
+
+	nci_req_complete(ndev, NCI_STATUS_OK);
+}
+
 static void nci_core_conn_credits_ntf_packet(struct nci_dev *ndev,
 					     struct sk_buff *skb)
 {
@@ -756,6 +773,10 @@ void nci_ntf_packet(struct nci_dev *ndev, struct sk_buff *skb)
 	}
 
 	switch (ntf_opcode) {
+	case NCI_OP_CORE_RESET_NTF:
+		nci_core_reset_ntf_packet(ndev, skb);
+		break;
+
 	case NCI_OP_CORE_CONN_CREDITS_NTF:
 		nci_core_conn_credits_ntf_packet(ndev, skb);
 		break;
diff --git a/net/nfc/nci/rsp.c b/net/nfc/nci/rsp.c
index a48297b79f34..521fa0383d48 100644
--- a/net/nfc/nci/rsp.c
+++ b/net/nfc/nci/rsp.c
@@ -31,16 +31,19 @@ static void nci_core_reset_rsp_packet(struct nci_dev *ndev, struct sk_buff *skb)
 
 	pr_debug("status 0x%x\n", rsp->status);
 
-	if (rsp->status == NCI_STATUS_OK) {
-		ndev->nci_ver = rsp->nci_ver;
-		pr_debug("nci_ver 0x%x, config_status 0x%x\n",
-			 rsp->nci_ver, rsp->config_status);
-	}
+	/* Handle NCI 1.x ver */
+	if (skb->len != 1) {
+		if (rsp->status == NCI_STATUS_OK) {
+			ndev->nci_ver = rsp->nci_ver;
+			pr_debug("nci_ver 0x%x, config_status 0x%x\n",
+				 rsp->nci_ver, rsp->config_status);
+		}
 
-	nci_req_complete(ndev, rsp->status);
+		nci_req_complete(ndev, rsp->status);
+	}
 }
 
-static void nci_core_init_rsp_packet(struct nci_dev *ndev, struct sk_buff *skb)
+static unsigned char nci_core_init_rsp_packet_v1(struct nci_dev *ndev, struct sk_buff *skb)
 {
 	struct nci_core_init_rsp_1 *rsp_1 = (void *) skb->data;
 	struct nci_core_init_rsp_2 *rsp_2;
@@ -48,7 +51,7 @@ static void nci_core_init_rsp_packet(struct nci_dev *ndev, struct sk_buff *skb)
 	pr_debug("status 0x%x\n", rsp_1->status);
 
 	if (rsp_1->status != NCI_STATUS_OK)
-		goto exit;
+		return rsp_1->status;
 
 	ndev->nfcc_features = __le32_to_cpu(rsp_1->nfcc_features);
 	ndev->num_supported_rf_interfaces = rsp_1->num_supported_rf_interfaces;
@@ -77,6 +80,60 @@ static void nci_core_init_rsp_packet(struct nci_dev *ndev, struct sk_buff *skb)
 	ndev->manufact_specific_info =
 		__le32_to_cpu(rsp_2->manufact_specific_info);
 
+	return NCI_STATUS_OK;
+}
+
+static unsigned char nci_core_init_rsp_packet_v2(struct nci_dev *ndev, struct sk_buff *skb)
+{
+	struct nci_core_init_rsp_nci_ver2 *rsp = (void *)skb->data;
+	unsigned char rf_interface_idx = 0;
+	unsigned char rf_extension_cnt = 0;
+	unsigned char *supported_rf_interface = rsp->supported_rf_interfaces;
+
+	pr_debug("status %x\n", rsp->status);
+
+	if (rsp->status != NCI_STATUS_OK)
+		return rsp->status;
+
+	ndev->nfcc_features = __le32_to_cpu(rsp->nfcc_features);
+	ndev->num_supported_rf_interfaces = rsp->num_supported_rf_interfaces;
+
+	if (ndev->num_supported_rf_interfaces >
+	    NCI_MAX_SUPPORTED_RF_INTERFACES) {
+		ndev->num_supported_rf_interfaces =
+			NCI_MAX_SUPPORTED_RF_INTERFACES;
+	}
+
+	while (rf_interface_idx < ndev->num_supported_rf_interfaces) {
+		ndev->supported_rf_interfaces[rf_interface_idx++] = *supported_rf_interface++;
+
+		/* skip rf extension parameters */
+		rf_extension_cnt = *supported_rf_interface++;
+		supported_rf_interface += rf_extension_cnt;
+	}
+
+	ndev->max_logical_connections = rsp->max_logical_connections;
+	ndev->max_routing_table_size =
+			__le16_to_cpu(rsp->max_routing_table_size);
+	ndev->max_ctrl_pkt_payload_len =
+			rsp->max_ctrl_pkt_payload_len;
+	ndev->max_size_for_large_params = NCI_MAX_LARGE_PARAMS_NCI_v2;
+
+	return NCI_STATUS_OK;
+}
+
+static void nci_core_init_rsp_packet(struct nci_dev *ndev, struct sk_buff *skb)
+{
+	unsigned char status = 0;
+
+	if (!(ndev->nci_ver & NCI_VER_2_MASK))
+		status = nci_core_init_rsp_packet_v1(ndev, skb);
+	else
+		status = nci_core_init_rsp_packet_v2(ndev, skb);
+
+	if (status != NCI_STATUS_OK)
+		goto exit;
+
 	pr_debug("nfcc_features 0x%x\n",
 		 ndev->nfcc_features);
 	pr_debug("num_supported_rf_interfaces %d\n",
@@ -103,7 +160,7 @@ static void nci_core_init_rsp_packet(struct nci_dev *ndev, struct sk_buff *skb)
 		 ndev->manufact_specific_info);
 
 exit:
-	nci_req_complete(ndev, rsp_1->status);
+	nci_req_complete(ndev, status);
 }
 
 static void nci_core_set_config_rsp_packet(struct nci_dev *ndev,
-- 
2.17.1
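
A userspace sketch of the Supported RF Interfaces walk from
nci_core_init_rsp_packet_v2() above, with a length check added as an
illustration. It assumes the entry layout the patch relies on: one
interface byte, one extension-count byte, then that many extension
bytes. The function name, the explicit length argument and MAX_RF_IF
are hypothetical; the kernel code uses NCI_MAX_SUPPORTED_RF_INTERFACES
and reads from skb->data.

```c
#include <stddef.h>

/* Hypothetical cap standing in for NCI_MAX_SUPPORTED_RF_INTERFACES. */
#define MAX_RF_IF 4

/*
 * Copy up to 'count' interface IDs from 'p' into 'out', skipping the
 * extension bytes of each entry. 'out' must hold MAX_RF_IF bytes.
 * Returns the number of interfaces copied, or -1 if the payload ends
 * before all requested entries were read.
 */
static int parse_rf_interfaces(const unsigned char *p, size_t len,
			       unsigned char count, unsigned char *out)
{
	size_t off = 0;
	int n = 0;

	if (count > MAX_RF_IF)
		count = MAX_RF_IF;

	while (n < count) {
		unsigned char ext_cnt;

		/* each entry needs at least interface + extension count */
		if (len - off < 2)
			return -1;
		out[n++] = p[off++];
		ext_cnt = p[off++];

		/* skip the RF extension parameters, if they are all there */
		if (len - off < ext_cnt)
			return -1;
		off += ext_cnt;
	}

	return n;
}
```

The only difference from the loop in the patch is that every read is
checked against the remaining payload length, so a short response is
reported as an error and nothing is read beyond the buffer.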



* Re: [PATCH 1/1] xdp: compact the function xsk_map_inc
From: Magnus Karlsson @ 2020-11-23 10:09 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Karlsson, Magnus, Björn Töpel, David S. Miller,
	Network Development, Zhu Yanjun
In-Reply-To: <1606035891-6797-1-git-send-email-yanjunz@nvidia.com>

On Sun, Nov 22, 2020 at 10:07 AM Zhu Yanjun <yanjunz@nvidia.com> wrote:
>
> From: Zhu Yanjun <zyjzyj2000@gmail.com>
>
> The function xsk_map_inc always returns zero. As such, change the
> return type to void and remove the test code.
>
> Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
> Signed-off-by: Zhu Yanjun <yanjunz@nvidia.com>
> ---
>  net/xdp/xsk.c    |    1 -
>  net/xdp/xsk.h    |    2 +-
>  net/xdp/xskmap.c |   10 ++--------
>  3 files changed, 3 insertions(+), 10 deletions(-)
>
> diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
> index cfbec39..c1b8a88 100644
> --- a/net/xdp/xsk.c
> +++ b/net/xdp/xsk.c
> @@ -548,7 +548,6 @@ static void xsk_unbind_dev(struct xdp_sock *xs)
>         node = list_first_entry_or_null(&xs->map_list, struct xsk_map_node,
>                                         node);
>         if (node) {
> -               WARN_ON(xsk_map_inc(node->map));
>                 map = node->map;
>                 *map_entry = node->map_entry;
>         }
> diff --git a/net/xdp/xsk.h b/net/xdp/xsk.h
> index b9e896c..766b9e2 100644
> --- a/net/xdp/xsk.h
> +++ b/net/xdp/xsk.h
> @@ -41,7 +41,7 @@ struct xsk_map_node {
>
>  void xsk_map_try_sock_delete(struct xsk_map *map, struct xdp_sock *xs,
>                              struct xdp_sock **map_entry);
> -int xsk_map_inc(struct xsk_map *map);
> +void xsk_map_inc(struct xsk_map *map);
>  void xsk_map_put(struct xsk_map *map);
>  void xsk_clear_pool_at_qid(struct net_device *dev, u16 queue_id);
>  int xsk_reg_pool_at_qid(struct net_device *dev, struct xsk_buff_pool *pool,
> diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
> index 49da2b8..c7dd94a 100644
> --- a/net/xdp/xskmap.c
> +++ b/net/xdp/xskmap.c
> @@ -11,10 +11,9 @@
>
>  #include "xsk.h"
>
> -int xsk_map_inc(struct xsk_map *map)
> +void xsk_map_inc(struct xsk_map *map)
>  {
>         bpf_map_inc(&map->map);
> -       return 0;
>  }

Thank you Yanjun for your cleanup. I think we can take this one step
further and remove the function xsk_map_inc completely and use
bpf_map_inc directly in the code. Could you please do this and submit
a v2?

>  void xsk_map_put(struct xsk_map *map)
> @@ -26,17 +25,12 @@ void xsk_map_put(struct xsk_map *map)
>                                                struct xdp_sock **map_entry)
>  {
>         struct xsk_map_node *node;
> -       int err;
>
>         node = kzalloc(sizeof(*node), GFP_ATOMIC | __GFP_NOWARN);
>         if (!node)
>                 return ERR_PTR(-ENOMEM);
>
> -       err = xsk_map_inc(map);
> -       if (err) {
> -               kfree(node);
> -               return ERR_PTR(err);
> -       }
> +       xsk_map_inc(map);
>
>         node->map = map;
>         node->map_entry = map_entry;
> --
> 1.7.1
>
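
A rough userspace sketch of the follow-up suggested above: drop the
xsk_map_inc() wrapper and take the reference with bpf_map_inc()
directly. All types here are simplified stand-ins, calloc() replaces
kzalloc(), and a NULL return replaces ERR_PTR(-ENOMEM); only the shape
of the call site is meant to carry over.

```c
#include <stdlib.h>

/* Userspace stand-ins for the kernel types; only the shape matters. */
struct bpf_map { long refcnt; };
struct xsk_map { struct bpf_map map; };
struct xsk_map_node {
	struct xsk_map *map;
	void **map_entry;
};

/* Stand-in for the kernel's bpf_map_inc(), which returns void. */
static void bpf_map_inc(struct bpf_map *map)
{
	map->refcnt++;
}

/* Allocate a node and take the map reference without a wrapper. */
static struct xsk_map_node *xsk_map_node_alloc(struct xsk_map *map,
					       void **map_entry)
{
	struct xsk_map_node *node = calloc(1, sizeof(*node));

	if (!node)
		return NULL;

	bpf_map_inc(&map->map);	/* cannot fail, so no error path */

	node->map = map;
	node->map_entry = map_entry;
	return node;
}
```

Since the increment cannot fail, allocation is the only remaining error
path.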


* Re: [PATCH] net: mlx5e: fix fs_tcp.c build when IPV6 is not enabled
From: Tariq Toukan @ 2020-11-23 10:08 UTC (permalink / raw)
  To: Randy Dunlap, netdev
  Cc: kernel test robot, Saeed Mahameed, Boris Pismenny, Tariq Toukan,
	David S. Miller, Jakub Kicinski
In-Reply-To: <20201122211231.5682-1-rdunlap@infradead.org>



On 11/22/2020 11:12 PM, Randy Dunlap wrote:
> Fix build when CONFIG_IPV6 is not enabled by making a function
> be built conditionally.
> 
> Fixes these build errors and warnings:
> 
> ../drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c: In function 'accel_fs_tcp_set_ipv6_flow':
> ../include/net/sock.h:380:34: error: 'struct sock_common' has no member named 'skc_v6_daddr'; did you mean 'skc_daddr'?
>    380 | #define sk_v6_daddr  __sk_common.skc_v6_daddr
>        |                                  ^~~~~~~~~~~~
> ../drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c:55:14: note: in expansion of macro 'sk_v6_daddr'
>     55 |         &sk->sk_v6_daddr, 16);
>        |              ^~~~~~~~~~~
> At top level:
> ../drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c:47:13: warning: 'accel_fs_tcp_set_ipv6_flow' defined but not used [-Wunused-function]
>     47 | static void accel_fs_tcp_set_ipv6_flow(struct mlx5_flow_spec *spec, struct sock *sk)
> 
> Fixes: 5229a96e59ec ("net/mlx5e: Accel, Expose flow steering API for rules add/del")
> Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
> Reported-by: kernel test robot <lkp@intel.com>
> Cc: Saeed Mahameed <saeedm@nvidia.com>
> Cc: Boris Pismenny <borisp@nvidia.com>
> Cc: Tariq Toukan <tariqt@mellanox.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Jakub Kicinski <kuba@kernel.org>
> ---

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>

Thanks for your patch.


* Re: [PATCH net-next v4 2/5] net/lapb: support netdev events
From: Xie He @ 2020-11-23 10:08 UTC (permalink / raw)
  To: Martin Schiller
  Cc: Andrew Hendry, David S. Miller, Jakub Kicinski, Linux X25,
	Linux Kernel Network Developers, LKML
In-Reply-To: <CAJht_EMjO_Tkm93QmAeK_2jg2KbLdv2744kCSHiZLy48aXiHnw@mail.gmail.com>

On Mon, Nov 23, 2020 at 1:36 AM Xie He <xie.he.0141@gmail.com> wrote:
>
> Some drivers don't support carrier status and will never change it.
> Their carrier status will always be UP. There will not be a
> NETDEV_CHANGE event.
>
> lapbether doesn't change carrier status. I also have my own virtual
> HDLC WAN driver (for testing) which also doesn't change carrier
> status.
>
> I just tested with lapbether. When I bring up the interface, there
> will only be NETDEV_PRE_UP and then NETDEV_UP. There will not be
> NETDEV_CHANGE. The carrier status is always UP.
>
> I haven't tested whether a device can receive NETDEV_CHANGE when it is
> down. It's possible for a device driver to call netif_carrier_on when
> the interface is down. Do you know what will happen if a device driver
> calls netif_carrier_on when the interface is down?

I just did a test on lapbether and saw there would be no NETDEV_CHANGE
event when the netif is down, even if netif_carrier_on/off is called.
So we can rest assured of this part.


* general protection fault in ieee80211_subif_start_xmit
From: syzbot @ 2020-11-23 10:04 UTC (permalink / raw)
  To: davem, johannes, kuba, linux-kernel, linux-wireless, netdev,
	syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    a349e4c6 Merge tag 'xfs-5.10-fixes-7' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1427b225500000
kernel config:  https://syzkaller.appspot.com/x/.config?x=330f3436df12fd44
dashboard link: https://syzkaller.appspot.com/bug?extid=d7a3b15976bf7de2238a
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=164652f5500000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d7a3b15976bf7de2238a@syzkaller.appspotmail.com

general protection fault, probably for non-canonical address 0xdffffc0000000034: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000001a0-0x00000000000001a7]
CPU: 0 PID: 10156 Comm: syz-executor.4 Not tainted 5.10.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:ieee80211_multicast_to_unicast net/mac80211/tx.c:4070 [inline]
RIP: 0010:ieee80211_subif_start_xmit+0x24e/0xee0 net/mac80211/tx.c:4154
Code: 03 80 3c 02 00 0f 85 83 0c 00 00 49 8b 9f 50 17 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d bb a4 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 58 0c 00 00
RSP: 0018:ffffc90000007588 EFLAGS: 00010203
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8851c61d
RDX: 0000000000000034 RSI: ffffffff8851c6ad RDI: 00000000000001a4
RBP: ffff88801b850280 R08: 0000000000000000 R09: ffffffff8cecb9cf
R10: 0000000000000004 R11: 0000000000000000 R12: ffffffff8a61f1e0
R13: ffff888012f07042 R14: 000000000000005a R15: ffff8880284b0000
FS:  00007f1159678700(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000016a9e60 CR3: 000000002ca99000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 __netdev_start_xmit include/linux/netdevice.h:4718 [inline]
 netdev_start_xmit include/linux/netdevice.h:4732 [inline]
 xmit_one net/core/dev.c:3564 [inline]
 dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3580
 sch_direct_xmit+0x2e1/0xbd0 net/sched/sch_generic.c:313
 qdisc_restart net/sched/sch_generic.c:376 [inline]
 __qdisc_run+0x4ba/0x15e0 net/sched/sch_generic.c:384
 qdisc_run include/net/pkt_sched.h:131 [inline]
 qdisc_run include/net/pkt_sched.h:123 [inline]
 __dev_xmit_skb net/core/dev.c:3755 [inline]
 __dev_queue_xmit+0x1453/0x2da0 net/core/dev.c:4108
 neigh_hh_output include/net/neighbour.h:499 [inline]
 neigh_output include/net/neighbour.h:508 [inline]
 ip6_finish_output2+0x8db/0x16c0 net/ipv6/ip6_output.c:117
 __ip6_finish_output net/ipv6/ip6_output.c:143 [inline]
 __ip6_finish_output+0x447/0xab0 net/ipv6/ip6_output.c:128
 ip6_finish_output+0x34/0x1f0 net/ipv6/ip6_output.c:153
 NF_HOOK_COND include/linux/netfilter.h:290 [inline]
 ip6_output+0x1db/0x520 net/ipv6/ip6_output.c:176
 dst_output include/net/dst.h:443 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 NF_HOOK include/linux/netfilter.h:295 [inline]
 mld_sendpack+0x92a/0xdb0 net/ipv6/mcast.c:1679
 mld_send_cr net/ipv6/mcast.c:1975 [inline]
 mld_ifc_timer_expire+0x60a/0xf10 net/ipv6/mcast.c:2474
 call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1410
 expire_timers kernel/time/timer.c:1455 [inline]
 __run_timers.part.0+0x67c/0xa50 kernel/time/timer.c:1747
 __run_timers kernel/time/timer.c:1728 [inline]
 run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1760
 __do_softirq+0x2a0/0x9f6 kernel/softirq.c:298
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0xaa/0xd0 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:393 [inline]
 __irq_exit_rcu kernel/softirq.c:423 [inline]
 irq_exit_rcu+0x132/0x200 kernel/softirq.c:435
 sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:631
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/irqflags.h:85 [inline]
RIP: 0010:lock_acquire kernel/locking/lockdep.c:5438 [inline]
RIP: 0010:lock_acquire+0x2cd/0x8c0 kernel/locking/lockdep.c:5400
Code: 48 c7 c7 c0 5e 4b 89 48 83 c4 20 e8 dd 68 8f 07 b8 ff ff ff ff 65 0f c1 05 c0 b2 ab 7e 83 f8 01 0f 85 09 04 00 00 ff 34 24 9d <e9> 37 fe ff ff 65 ff 05 67 a1 ab 7e 48 8b 05 a0 ab 82 0b e8 6b 5d
RSP: 0018:ffffc9000aaf73e0 EFLAGS: 00000246
RAX: 0000000000000001 RBX: 1ffff9200155ee7e RCX: ffffffff8155f384
RDX: 1ffff11004e58121 RSI: 0000000000000001 RDI: 0000000000000000
RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffff8ebb166f
R10: fffffbfff1d762cd R11: 0000000000000000 R12: 0000000000000000
R13: ffff88803eff20a8 R14: 0000000000000000 R15: 0000000000000000
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 lockref_put_or_lock+0x14/0x80 lib/lockref.c:174
 fast_dput fs/dcache.c:747 [inline]
 dput+0x4b9/0xbc0 fs/dcache.c:865
 simple_recursive_removal+0x411/0x6b0 fs/libfs.c:296
 debugfs_remove fs/debugfs/inode.c:725 [inline]
 debugfs_remove+0x59/0x80 fs/debugfs/inode.c:719
 ieee80211_debugfs_remove_netdev+0x43/0xc0 net/mac80211/debugfs_netdev.c:833
 ieee80211_teardown_sdata+0x48/0x2d0 net/mac80211/iface.c:687
 ieee80211_runtime_change_iftype net/mac80211/iface.c:1657 [inline]
 ieee80211_if_change_type+0x2b4/0x620 net/mac80211/iface.c:1691
 ieee80211_change_iface+0x26/0x210 net/mac80211/cfg.c:157
 rdev_change_virtual_intf net/wireless/rdev-ops.h:69 [inline]
 cfg80211_change_iface+0x2eb/0xef0 net/wireless/util.c:1032
 nl80211_set_interface+0x65c/0x8d0 net/wireless/nl80211.c:3789
 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739
 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
 genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:671
 __sys_sendto+0x21c/0x320 net/socket.c:1992
 __do_sys_sendto net/socket.c:2004 [inline]
 __se_sys_sendto net/socket.c:2000 [inline]
 __x64_sys_sendto+0xdd/0x1b0 net/socket.c:2000
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x417937
Code: 2c 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 81 19 00 00 c3 48 83 ec 08 e8 e7 fa ff ff 48 89 04 24 49 89 ca b8 2c 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 2d fb ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007f1159676a90 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f1159676be0 RCX: 0000000000417937
RDX: 0000000000000024 RSI: 00007f1159676c30 RDI: 0000000000000007
RBP: 0000000000000000 R08: 00007f1159676aa0 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f1159676c30 R15: 0000000000000007
Modules linked in:
---[ end trace 80d935084a37d7a4 ]---
RIP: 0010:ieee80211_multicast_to_unicast net/mac80211/tx.c:4070 [inline]
RIP: 0010:ieee80211_subif_start_xmit+0x24e/0xee0 net/mac80211/tx.c:4154
Code: 03 80 3c 02 00 0f 85 83 0c 00 00 49 8b 9f 50 17 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d bb a4 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 58 0c 00 00
RSP: 0018:ffffc90000007588 EFLAGS: 00010203
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8851c61d
RDX: 0000000000000034 RSI: ffffffff8851c6ad RDI: 00000000000001a4
RBP: ffff88801b850280 R08: 0000000000000000 R09: ffffffff8cecb9cf
R10: 0000000000000004 R11: 0000000000000000 R12: ffffffff8a61f1e0
R13: ffff888012f07042 R14: 000000000000005a R15: ffff8880284b0000
FS:  00007f1159678700(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000016a9e60 CR3: 000000002ca99000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
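
The faulting address in the report above can be reproduced from the
register dump. The sketch below assumes the usual x86-64 KASAN mapping,
in which one shadow byte covers 8 bytes of memory and the shadow offset
is the 0xdffffc0000000000 value seen in RAX. RBX is 0 and RDI is 0x1a4,
which suggests a field at offset 0x1a4 of a NULL pointer. The helper
names are illustrative and are not the kernel's own.

```c
#include <stdint.h>

/* x86-64 KASAN: one shadow byte per 8 bytes of memory. */
#define SHADOW_SCALE_SHIFT	3
#define SHADOW_OFFSET		0xdffffc0000000000ULL

/* Address of the shadow byte that describes 'addr'. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* First byte of the 8-byte granule described by 'shadow'. */
static uint64_t shadow_to_mem(uint64_t shadow)
{
	return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}
```

mem_to_shadow(0x1a4) gives 0xdffffc0000000034, the non-canonical
address in the fault message, and shadow_to_mem() of that value gives
0x1a0, the start of the 0x1a0-0x1a7 range that KASAN prints.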


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


* BUG: receive list entry not found for dev vcan0, id 001, mask C00007FF
From: syzbot @ 2020-11-23  9:55 UTC (permalink / raw)
  To: davem, kuba, linux-can, linux-kernel, mkl, netdev, socketcan,
	syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    b9ad3e9f bonding: wait for sysfs kobject destruction befor..
git tree:       net
console output: https://syzkaller.appspot.com/x/log.txt?x=1195c5cd500000
kernel config:  https://syzkaller.appspot.com/x/.config?x=330f3436df12fd44
dashboard link: https://syzkaller.appspot.com/bug?extid=d0ddd88c9a7432f041e6
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13c409cd500000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1349ced1500000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d0ddd88c9a7432f041e6@syzkaller.appspotmail.com

RAX: ffffffffffffffda RBX: 00007fffc0827800 RCX: 0000000000443749
RDX: 0000000000000018 RSI: 0000000020000300 RDI: 0000000000000004
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000001bbbbbb
R10: 0000000000000000 R11: 0000000000000246 R12: ffffffffffffffff
R13: 0000000000000005 R14: 0000000000000000 R15: 0000000000000000
------------[ cut here ]------------
BUG: receive list entry not found for dev vcan0, id 001, mask C00007FF
WARNING: CPU: 0 PID: 8495 at net/can/af_can.c:546 can_rx_unregister+0x5a4/0x700 net/can/af_can.c:546
Modules linked in:
CPU: 0 PID: 8495 Comm: syz-executor608 Not tainted 5.10.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:can_rx_unregister+0x5a4/0x700 net/can/af_can.c:546
Code: 8b 7c 24 78 44 8b 64 24 68 49 c7 c5 a0 ae 56 8a e8 11 58 97 f9 44 89 f9 44 89 e2 4c 89 ee 48 c7 c7 e0 ae 56 8a e8 76 ab d3 00 <0f> 0b 48 8b 7c 24 28 e8 90 22 0f 01 e9 54 fb ff ff e8 06 cf d8 f9
RSP: 0018:ffffc9000182f9f0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88801ffe8000 RSI: ffffffff8158f3c5 RDI: fffff52000305f30
RBP: 0000000000000118 R08: 0000000000000001 R09: ffff8880b9e30627
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: ffff88801ab00000 R14: 1ffff92000305f45 R15: 00000000c00007ff
FS:  0000000000000000(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004c8928 CR3: 000000000b08e000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 isotp_notifier+0x2a7/0x540 net/can/isotp.c:1303
 call_netdevice_notifier net/core/dev.c:1735 [inline]
 call_netdevice_unregister_notifiers+0x156/0x1c0 net/core/dev.c:1763
 call_netdevice_unregister_net_notifiers net/core/dev.c:1791 [inline]
 unregister_netdevice_notifier+0xcd/0x170 net/core/dev.c:1870
 isotp_release+0x136/0x600 net/can/isotp.c:1011
 __sock_release+0xcd/0x280 net/socket.c:596
 sock_close+0x18/0x20 net/socket.c:1277
 __fput+0x285/0x920 fs/file_table.c:281
 task_work_run+0xdd/0x190 kernel/task_work.c:151
 exit_task_work include/linux/task_work.h:30 [inline]
 do_exit+0xb64/0x29b0 kernel/exit.c:809
 do_group_exit+0x125/0x310 kernel/exit.c:906
 __do_sys_exit_group kernel/exit.c:917 [inline]
 __se_sys_exit_group kernel/exit.c:915 [inline]
 __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:915
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x442388
Code: Unable to access opcode bytes at RIP 0x44235e.
RSP: 002b:00007fffc0827768 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000000000442388
RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
RBP: 00000000004c88f0 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006dd240 R14: 0000000000000000 R15: 0000000000000000
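
The mask in the warning above decomposes into the flag and ID bits from
include/uapi/linux/can.h. The sketch below shows that decomposition
together with the usual masked-ID comparison. can_filter_match() is an
illustrative helper and is not a function in af_can.c.

```c
#include <stdint.h>

/* Values as defined in include/uapi/linux/can.h. */
#define CAN_EFF_FLAG	0x80000000U	/* extended frame format */
#define CAN_RTR_FLAG	0x40000000U	/* remote transmission request */
#define CAN_SFF_MASK	0x000007FFU	/* 11-bit standard identifier */

/* A filter matches when the masked frame ID equals the masked filter ID. */
static int can_filter_match(uint32_t frame_id, uint32_t filt_id,
			    uint32_t mask)
{
	return (frame_id & mask) == (filt_id & mask);
}
```

CAN_EFF_FLAG | CAN_RTR_FLAG | CAN_SFF_MASK is 0xC00007FF, so the entry
that could not be found would match only standard-format, non-RTR
frames with ID 0x001.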


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply

* inconsistent lock state in io_file_data_ref_zero
From: syzbot @ 2020-11-23  9:55 UTC (permalink / raw)
  To: axboe, davem, io-uring, johannes.berg, johannes, kuba,
	linux-fsdevel, linux-kernel, linux-wireless, netdev,
	syzkaller-bugs, viro

Hello,

syzbot found the following issue on:

HEAD commit:    27bba9c5 Merge tag 'scsi-fixes' of git://git.kernel.org/pu..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11041f1e500000
kernel config:  https://syzkaller.appspot.com/x/.config?x=330f3436df12fd44
dashboard link: https://syzkaller.appspot.com/bug?extid=1f4ba1e5520762c523c6
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17d9b775500000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=157e4f75500000

The issue was bisected to:

commit dcd479e10a0510522a5d88b29b8f79ea3467d501
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Fri Oct 9 12:17:11 2020 +0000

    mac80211: always wind down STA state

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=130299a9500000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=108299a9500000
console output: https://syzkaller.appspot.com/x/log.txt?x=170299a9500000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+1f4ba1e5520762c523c6@syzkaller.appspotmail.com
Fixes: dcd479e10a05 ("mac80211: always wind down STA state")

================================
WARNING: inconsistent lock state
5.10.0-rc4-syzkaller #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffff8880125202a8 (&file_data->lock){+.?.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
ffff8880125202a8 (&file_data->lock){+.?.}-{2:2}, at: io_file_data_ref_zero+0x75/0x480 fs/io_uring.c:7361
{SOFTIRQ-ON-W} state was registered at:
  lock_acquire kernel/locking/lockdep.c:5435 [inline]
  lock_acquire+0x2a3/0x8c0 kernel/locking/lockdep.c:5400
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
  spin_lock include/linux/spinlock.h:354 [inline]
  io_sqe_files_register fs/io_uring.c:7496 [inline]
  __io_uring_register fs/io_uring.c:9660 [inline]
  __do_sys_io_uring_register+0x343a/0x40d0 fs/io_uring.c:9750
  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
irq event stamp: 131582
hardirqs last  enabled at (131582): [<ffffffff88e80d52>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
hardirqs last  enabled at (131582): [<ffffffff88e80d52>] _raw_spin_unlock_irqrestore+0x42/0x50 kernel/locking/spinlock.c:191
hardirqs last disabled at (131581): [<ffffffff88e80b1e>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (131581): [<ffffffff88e80b1e>] _raw_spin_lock_irqsave+0x4e/0x50 kernel/locking/spinlock.c:159
softirqs last  enabled at (131566): [<ffffffff814279df>] irq_enter_rcu+0xcf/0xf0 kernel/softirq.c:360
softirqs last disabled at (131567): [<ffffffff89000eaf>] asm_call_irq_on_stack+0xf/0x20

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&file_data->lock);
  <Interrupt>
    lock(&file_data->lock);

 *** DEADLOCK ***

2 locks held by swapper/0/0:
 #0: ffffffff8b337700 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2466 [inline]
 #0: ffffffff8b337700 (rcu_callback){....}-{0:0}, at: rcu_core+0x576/0xe80 kernel/rcu/tree.c:2711
 #1: ffffffff8b337820 (rcu_read_lock){....}-{1:2}, at: percpu_ref_put_many.constprop.0+0x0/0x250 net/netfilter/xt_cgroup.c:62

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:118
 print_usage_bug kernel/locking/lockdep.c:3738 [inline]
 valid_state kernel/locking/lockdep.c:3749 [inline]
 mark_lock_irq kernel/locking/lockdep.c:3952 [inline]
 mark_lock.cold+0x32/0x74 kernel/locking/lockdep.c:4409
 mark_usage kernel/locking/lockdep.c:4304 [inline]
 __lock_acquire+0x11b1/0x5c00 kernel/locking/lockdep.c:4784
 lock_acquire kernel/locking/lockdep.c:5435 [inline]
 lock_acquire+0x2a3/0x8c0 kernel/locking/lockdep.c:5400
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 io_file_data_ref_zero+0x75/0x480 fs/io_uring.c:7361
 percpu_ref_put_many.constprop.0+0x217/0x250 include/linux/percpu-refcount.h:322
 rcu_do_batch kernel/rcu/tree.c:2476 [inline]
 rcu_core+0x5df/0xe80 kernel/rcu/tree.c:2711
 __do_softirq+0x2a0/0x9f6 kernel/softirq.c:298
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0xaa/0xd0 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:393 [inline]
 __irq_exit_rcu kernel/softirq.c:423 [inline]
 irq_exit_rcu+0x132/0x200 kernel/softirq.c:435
 sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:631
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:79 [inline]
RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:169 [inline]
RIP: 0010:acpi_safe_halt drivers/acpi/processor_idle.c:112 [inline]
RIP: 0010:acpi_idle_do_entry+0x1c9/0x250 drivers/acpi/processor_idle.c:517
Code: 8d 21 88 f8 84 db 75 ac e8 74 29 88 f8 e8 2f e8 8d f8 e9 0c 00 00 00 e8 65 29 88 f8 0f 00 2d 5e 74 c0 00 e8 59 29 88 f8 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 b4 21 88 f8 48 85 db
RSP: 0018:ffffffff8b007d60 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffffffff19d8ff9
RDX: ffffffff8b09af80 RSI: ffffffff88e80687 RDI: 0000000000000000
RBP: ffff88814141d064 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: ffff88814141d000 R14: ffff88814141d064 R15: ffff888014984004
 acpi_idle_enter+0x361/0x500 drivers/acpi/processor_idle.c:648
 cpuid


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches
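
For readers unfamiliar with this class of lockdep warning, the report above says that &file_data->lock was once taken in process context with softirqs enabled (the io_sqe_files_register path) and is now being taken from softirq context (io_file_data_ref_zero called from an RCU callback). The following is a toy userspace model of that consistency check, not lockdep's actual implementation; struct demo_lock_class and demo_mark_acquire() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-lock-class usage record; lockdep tracks far more state. */
struct demo_lock_class {
	bool used_in_softirq;      /* acquired while running in softirq context */
	bool held_with_softirq_on; /* acquired in process context, softirqs enabled */
};

/*
 * Record one acquisition of the lock class.
 * Returns false once the class has been seen both ways, which is the
 * {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} combination: a softirq could interrupt
 * the process-context holder on the same CPU and spin on the lock forever.
 */
static inline bool demo_mark_acquire(struct demo_lock_class *c,
				     bool in_softirq, bool softirqs_enabled)
{
	if (in_softirq)
		c->used_in_softirq = true;
	else if (softirqs_enabled)
		c->held_with_softirq_on = true;

	return !(c->used_in_softirq && c->held_with_softirq_on);
}
```

In this model, a process-context path that takes the lock with softirqs disabled (in the kernel, for example via spin_lock_bh()) never sets held_with_softirq_on, so the softirq acquisition stays consistent. Whether that is the appropriate fix for this particular report is not established here.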

^ permalink raw reply

* RE: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes
From: Danielle Ratson @ 2020-11-23  9:47 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: Jiri Pirko, Andrew Lunn, Jakub Kicinski, Ido Schimmel,
	netdev@vger.kernel.org, davem@davemloft.net, Jiri Pirko,
	f.fainelli@gmail.com, mlxsw, Ido Schimmel,
	johannes@sipsolutions.net
In-Reply-To: <20201022162740.nisrhdzc4keuosgw@lion.mk-sys.cz>



> -----Original Message-----
> From: Michal Kubecek <mkubecek@suse.cz>
> Sent: Thursday, October 22, 2020 7:28 PM
> To: Danielle Ratson <danieller@nvidia.com>
> Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel
> <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw
> <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net
> Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes
> 
> On Thu, Oct 22, 2020 at 06:15:48AM +0000, Danielle Ratson wrote:
> > > -----Original Message-----
> > > From: Michal Kubecek <mkubecek@suse.cz>
> > > Sent: Wednesday, October 21, 2020 11:48 AM
> > >
> > > Ah, right, it does. But as you extend struct ethtool_link_ksettings
> > > and drivers will need to be updated to provide this information,
> > > wouldn't it be more useful to let the driver provide link mode in
> > > use instead (and derive number of lanes from it)?
> >
> > This is the way it is done with the speed parameter, so I have aligned
> > it to it. Why the lanes should be done differently comparing to the
> > speed?
> 
> Speed and duplex have worked this way since ages and the interface was probably introduced back in times when combination of
> speed and duplex was sufficient to identify the link mode. This is no longer the case and even adding number of lanes wouldn't make
> the combination unique. So if we are going to extend the interface now and update drivers to provide extra information, I believe it
> would be more useful to provide full information.
> 
> Michal

Hi Michal,

What do you think of passing the link mode you suggested as a bitmask, similar to "supported", that has exactly one bit set?
Something like this:

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index afae2beacbc3..dd946c88daa3 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -127,6 +127,7 @@ struct ethtool_link_ksettings {
                __ETHTOOL_DECLARE_LINK_MODE_MASK(supported);
                __ETHTOOL_DECLARE_LINK_MODE_MASK(advertising);
                __ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising);
+               __ETHTOOL_DECLARE_LINK_MODE_MASK(chosen);
        } link_modes;
        u32     lanes;
 };

Do you have perhaps a better suggestion?

And the speed and duplex parameters should then no longer be passed that way either, right?

Thanks,
Danielle
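
A consumer of such a mask would need to confirm that exactly one bit is set before treating it as "the link mode in use". The following is a minimal userspace sketch of that check over a multi-word bitmap; demo_single_mode() and DEMO_BITS_PER_LONG are hypothetical names, and the kernel has its own bitmap helpers that a real patch would presumably use instead.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the only set bit in mask[0..nwords), or -1 if the
 * mask is empty or has more than one bit set.
 */
static inline int demo_single_mode(const unsigned long *mask, size_t nwords)
{
	int found = -1;

	for (size_t w = 0; w < nwords; w++) {
		unsigned long v = mask[w];
		int bit = 0;

		if (!v)
			continue;
		/* A second non-zero word, or two bits in this word, is invalid. */
		if (found >= 0 || (v & (v - 1)))
			return -1;
		while (!(v & 1UL)) {
			v >>= 1;
			bit++;
		}
		found = (int)(w * DEMO_BITS_PER_LONG) + bit;
	}
	return found;
}
```

The returned index could then be mapped to speed, duplex and lane count through a per-mode table, which is the "derive it from the link mode" direction Michal describes; that table is not shown because its contents are not part of this thread.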

^ permalink raw reply related

* Re: [PATCH] libbpf: add support for canceling cached_cons advance
From: Magnus Karlsson @ 2020-11-23  9:40 UTC (permalink / raw)
  To: Li RongQing; +Cc: Network Development, bpf
In-Reply-To: <1606050623-22963-1-git-send-email-lirongqing@baidu.com>

On Sun, Nov 22, 2020 at 2:21 PM Li RongQing <lirongqing@baidu.com> wrote:
>
> It is possible to fail receiving packets after calling
> xsk_ring_cons__peek, at this condition, cached_cons has
> been advanced, should be cancelled.

Thanks RongQing,

I have needed this myself in various situations, so I think we should
add this. But your motivation in the commit message is somewhat
confusing. How about something like this?

Add a new function for returning descriptors the user received after
an xsk_ring_cons__peek call. After the application has gotten a number
of descriptors from a ring, it might not be able to or want to process
them all for various reasons. Therefore, it would be useful to have an
interface for returning or cancelling a number of them so that they
are returned to the ring. This patch adds a new function called
xsk_ring_cons__cancel that performs this operation on nb descriptors
counted from the end of the batch of descriptors that was received
through the peek call.

Replace your commit message with this, fix the bug below, send a v2
and then I am happy to ack this.

/Magnus

> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
>  tools/lib/bpf/xsk.h | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/tools/lib/bpf/xsk.h b/tools/lib/bpf/xsk.h
> index 1069c46364ff..4128215c246b 100644
> --- a/tools/lib/bpf/xsk.h
> +++ b/tools/lib/bpf/xsk.h
> @@ -153,6 +153,12 @@ static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
>         return entries;
>  }
>
> +static inline void xsk_ring_cons__cancel(struct xsk_ring_cons *cons,
> +                                        size_t nb)
> +{
> +       rx->cached_cons -= nb;

cons-> not rx->. Please make sure the v2 compiles and passes checkpatch.

> +}
> +
>  static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, size_t nb)
>  {
>         /* Make sure data has been read before indicating we are done
> --
> 2.17.3
>
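
To make the peek/cancel/release interplay described above concrete, here is a simplified userspace model of a consumer ring. It is a sketch only: struct demo_ring and the demo_* functions are hypothetical stand-ins, not libbpf's structures, and the memory barriers a real shared ring needs are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified consumer ring state (no shared memory, no barriers). */
struct demo_ring {
	uint32_t cached_prod; /* producer index as last seen by the consumer */
	uint32_t cached_cons; /* local consumer index, advanced by peek */
	uint32_t consumer;    /* index published back to the producer */
};

/* Reserve up to nb descriptors; *idx is the first one. Returns the count. */
static inline size_t demo_peek(struct demo_ring *r, size_t nb, uint32_t *idx)
{
	size_t avail = (uint32_t)(r->cached_prod - r->cached_cons);
	size_t entries = avail < nb ? avail : nb;

	if (entries > 0) {
		*idx = r->cached_cons;
		r->cached_cons += (uint32_t)entries;
	}
	return entries;
}

/* Hand back the last nb descriptors of the batch obtained from peek. */
static inline void demo_cancel(struct demo_ring *r, size_t nb)
{
	r->cached_cons -= (uint32_t)nb;
}

/* Tell the producer that nb descriptors have been fully processed. */
static inline void demo_release(struct demo_ring *r, size_t nb)
{
	r->consumer += (uint32_t)nb;
}
```

With eight descriptors available, peeking four, cancelling one and releasing three leaves the next peek starting at index 3, so the cancelled descriptor is seen again rather than lost.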

^ permalink raw reply

* Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes
From: Jiri Pirko @ 2020-11-23  9:40 UTC (permalink / raw)
  To: Edwin Peer
  Cc: Ido Schimmel, netdev, David S . Miller, Jakub Kicinski, jiri,
	danieller, andrew, f.fainelli, mkubecek, mlxsw, Ido Schimmel
In-Reply-To: <CAKOOJTw1rRdS0+WRqeWY4Hc9gzwvPn7FGFdZuVd3hFYORcRz4g@mail.gmail.com>

Thu, Nov 19, 2020 at 09:38:34PM CET, edwin.peer@broadcom.com wrote:
>On Sat, Oct 10, 2020 at 3:54 PM Ido Schimmel <idosch@idosch.org> wrote:
>
>> Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct
>> ethtool_link_settings' with lanes field in order to implement a new
>> lanes-selector that will enable the user to advertise a specific number
>> of lanes as well.
>
>Why can't this be implied by port break-out configuration? For higher
>speed signalling modes like PAM4, what's the difference between a
>port with unused lanes vs the same port split into multiple logical
>ports? In essence, the driver could then always choose the slowest

There is a crucial difference. A split port is always configured by the user.
Each split port has a devlink instance and a netdevice associated with it.
It is one level above the lanes.


>signalling mode that utilizes all the available lanes.
>
>Regards,
>Edwin Peer



^ permalink raw reply

* Re: [PATCH] dpaa2-eth: Fix compile error due to missing devlink support
From: Ioana Ciornei @ 2020-11-23  9:39 UTC (permalink / raw)
  To: Ezequiel Garcia
  Cc: netdev@vger.kernel.org, Jakub Kicinski, David S . Miller,
	Ioana Ciocoi Radulescu, kernel@collabora.com
In-Reply-To: <20201122002336.79912-1-ezequiel@collabora.com>


Hi Ezequiel,

Thanks a lot for the fix, I overlooked this when adding devlink support.

On Sat, Nov 21, 2020 at 09:23:36PM -0300, Ezequiel Garcia wrote:
> The dpaa2 driver depends on devlink, so it should select
> NET_DEVLINK in order to fix compile errors, such as:
>
> drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.o: in function `dpaa2_eth_rx_err':
> dpaa2-eth.c:(.text+0x3cec): undefined reference to `devlink_trap_report'
> drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-devlink.o: in function `dpaa2_eth_dl_info_get':
> dpaa2-eth-devlink.c:(.text+0x160): undefined reference to `devlink_info_driver_name_put'
>

What tree is this intended for?

Maybe add a Fixes: tag and send this to the net tree?

Ioana

> Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
> ---
>  drivers/net/ethernet/freescale/dpaa2/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/freescale/dpaa2/Kconfig b/drivers/net/ethernet/freescale/dpaa2/Kconfig
> index cfd369cf4c8c..aee59ead7250 100644
> --- a/drivers/net/ethernet/freescale/dpaa2/Kconfig
> +++ b/drivers/net/ethernet/freescale/dpaa2/Kconfig
> @@ -2,6 +2,7 @@
>  config FSL_DPAA2_ETH
>       tristate "Freescale DPAA2 Ethernet"
>       depends on FSL_MC_BUS && FSL_MC_DPIO
> +     select NET_DEVLINK
>       select PHYLINK
>       select PCS_LYNX
>       help
> --
> 2.27.0
>

^ permalink raw reply

* Re: [PATCH net-next v4 2/5] net/lapb: support netdev events
From: Xie He @ 2020-11-23  9:36 UTC (permalink / raw)
  To: Martin Schiller
  Cc: Andrew Hendry, David S. Miller, Jakub Kicinski, Linux X25,
	Linux Kernel Network Developers, LKML
In-Reply-To: <d85a4543eae46bac1de28ec17a2389dd@dev.tdt.de>

On Mon, Nov 23, 2020 at 1:00 AM Martin Schiller <ms@dev.tdt.de> wrote:
>
> AFAIK the carrier can't be up before the device is up. Therefore, there
> will be a NETDEV_CHANGE event after the NETDEV_UP event.
>
> This is what I can see in my tests (with the HDLC interface).
>
> Is the behaviour different for e.g. lapbether?

Some drivers don't support carrier status and will never change it.
Their carrier status will always be UP. There will not be a
NETDEV_CHANGE event.

lapbether doesn't change carrier status. I also have my own virtual
HDLC WAN driver (for testing) which also doesn't change carrier
status.

I just tested with lapbether. When I bring up the interface, there
will only be NETDEV_PRE_UP and then NETDEV_UP. There will not be
NETDEV_CHANGE. The carrier status is always UP.

I haven't tested whether a device can receive NETDEV_CHANGE when it is
down. It's possible for a device driver to call netif_carrier_on when
the interface is down. Do you know what will happen if a device driver
calls netif_carrier_on when the interface is down?
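
The concern above can be sketched as a small state model: if the protocol is only started on a carrier-change event, a driver that never toggles carrier would never start it, so the "device up" event has to look at the current carrier state too. This is a userspace illustration under that assumption, not the patch under review; enum demo_event, struct demo_dev and demo_handle_event() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the NETDEV_UP / NETDEV_CHANGE / NETDEV_GOING_DOWN events. */
enum demo_event { DEMO_EV_UP, DEMO_EV_CHANGE, DEMO_EV_GOING_DOWN };

struct demo_dev {
	bool admin_up;     /* interface has been brought up */
	bool carrier_ok;   /* carrier state, maintained by the "driver" */
	bool link_started; /* whether the protocol layer is running */
};

/* Re-evaluate the protocol state on every event from both conditions. */
static inline void demo_handle_event(struct demo_dev *d, enum demo_event ev)
{
	switch (ev) {
	case DEMO_EV_UP:
		d->admin_up = true;
		break;
	case DEMO_EV_GOING_DOWN:
		d->admin_up = false;
		break;
	case DEMO_EV_CHANGE:
		/* carrier_ok was already updated by the caller */
		break;
	}
	d->link_started = d->admin_up && d->carrier_ok;
}
```

Because the state is recomputed from both flags each time, a device whose carrier is always on starts on the up event alone, and a carrier-on event that arrives while the interface is down does not start anything.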

^ permalink raw reply
