Netdev List
* Re: [PATCH net] ipv4: fix cloning issues in fib_trie_unmerge()
From: Alexander Duyck @ 2016-11-14 22:25 UTC
  To: Eric Dumazet; +Cc: David Miller, netdev, Alexander Duyck
In-Reply-To: <1479159274.8455.82.camel@edumazet-glaptop3.roam.corp.google.com>

On Mon, Nov 14, 2016 at 1:34 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> I had crashes in DEBUG_PAGEALLOC kernels in fib_table_flush() or
> fib_table_lookup() that I backtracked to a refcounting issue
> happening when we clone struct fib_alias in fib_trie_unmerge().
>
> While fixing this issue, I also noticed a mem leak happening
> if fib_insert_alias() fails.
>
> Fixes: 0ddcf43d5d4a0 ("ipv4: FIB Local/MAIN table collapse")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
>  net/ipv4/fib_trie.c |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index 4cff74d4133f..ebf49ab889e8 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -1737,14 +1737,19 @@ struct fib_table *fib_trie_unmerge(struct fib_table *oldtb)
>                                 goto out;
>
>                         memcpy(new_fa, fa, sizeof(*fa));
> +                       if (fa->fa_info)
> +                               fa->fa_info->fib_treeref++;
>
>                         /* insert clone into table */
>                         if (!local_l)
>                                 local_l = fib_find_node(lt, &local_tp, l->key);
>
>                         if (fib_insert_alias(lt, local_tp, local_l, new_fa,
> -                                            NULL, l->key))
> +                                            NULL, l->key)) {
> +                               kmem_cache_free(fn_alias_kmem, new_fa);
> +                               fib_release_info(fa->fa_info);
>                                 goto out;
> +                       }
>                 }
>
>                 /* stop loop if key wrapped back to 0 */
>
>

Actually, I think this creates a reference leak.  If you look, the call
to fib_table_flush_external() skips the call to fib_release_info().
If you add this, you would probably need to update
fib_table_flush_external() so that it calls fib_release_info() like
fib_table_flush() does.
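
The concern above can be modelled in user space.  This is only a sketch:
struct info and struct alias are hypothetical stand-ins for struct fib_info
and struct fib_alias, and the helper names are made up for illustration, not
kernel functions.  It shows that once the clone takes its own reference,
every path that frees a clone has to drop that reference too.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for struct fib_info / struct fib_alias. */
struct info  { int treeref; };
struct alias { struct info *fi; };

/* Clone an alias; the copy shares fi, so it takes its own reference. */
struct alias *alias_clone(const struct alias *fa)
{
	struct alias *new_fa;

	assert(fa != NULL);
	new_fa = malloc(sizeof(*new_fa));
	if (!new_fa)
		return NULL;
	memcpy(new_fa, fa, sizeof(*fa));
	if (new_fa->fi)
		new_fa->fi->treeref++;
	return new_fa;
}

/* Flush path that drops the reference before freeing the alias. */
void alias_free_release(struct alias *fa)
{
	if (fa->fi)
		fa->fi->treeref--;
	free(fa);
}

/* Flush path that only frees the alias: its reference is leaked. */
void alias_free_no_release(struct alias *fa)
{
	free(fa);
}
```

If one alias and its clone are freed, one through each path, the counter in
this model ends at 1 rather than 0, which is the leak being described.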


* [patch] netlink.7: srcfix Change buffer size in example code about reading netlink message.
From: dwilder @ 2016-11-14 22:20 UTC
  To: mtk.manpages; +Cc: linux-man, netdev

The example code in netlink(7) (for reading a netlink message) suggests using
a 4k read buffer with recvmsg().  This can cause truncated messages on systems
with a page size >4096.  Please see:
linux/include/linux/netlink.h (in the kernel source)

<snip>
/*
  *      skb should fit one page. This choice is good for headerless malloc.
  *      But we should limit to 8K so that userspace does not have to
  *      use enormous buffer sizes on recvmsg() calls just to avoid
  *      MSG_TRUNC when PAGE_SIZE is very large.
  */
#if PAGE_SIZE < 8192UL
#define NLMSG_GOODSIZE  SKB_WITH_OVERHEAD(PAGE_SIZE)
#else
#define NLMSG_GOODSIZE  SKB_WITH_OVERHEAD(8192UL)
#endif

#define NLMSG_DEFAULT_SIZE (NLMSG_GOODSIZE - NLMSG_HDRLEN)
<snip>

I was troubleshooting some upstream code on a ppc64le system
(page size of 64k).  This code had duplicated the example from netlink(7) and
was using a 4k buffer.  On x86-64 with a 4k page size this is not a problem,
however on the 64k page system some messages were truncated.  Using an 8k
buffer as implied in netlink.h prevents problems with any page size.

Let's change the example so others don't propagate the problem further.

Signed-off-by: David Wilder <dwilder@us.ibm.com>

--- man7/netlink.7.orig 2016-11-14 13:30:36.522101156 -0800
+++ man7/netlink.7      2016-11-14 13:30:51.002086354 -0800
@@ -511,7 +511,7 @@
  .in +4n
  .nf
  int len;
-char buf[4096];
+char buf[8192];
  struct iovec iov = { buf, sizeof(buf) };
  struct sockaddr_nl sa;
  struct msghdr msg;
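
For reference, a minimal sketch of a read helper that also reports
truncation.  The name recv_one() and its signature are made up for this
illustration and are not part of netlink(7).  It relies only on recvmsg(2)
setting MSG_TRUNC in msg_flags when the datagram did not fit the supplied
buffer.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Read one datagram into buf.  Returns the number of bytes stored, or -1
 * on error.  *truncated is set to 1 when the kernel reports MSG_TRUNC,
 * i.e. the datagram was larger than the buffer.
 */
ssize_t recv_one(int fd, void *buf, size_t buflen, int *truncated)
{
	struct iovec iov = { buf, buflen };
	struct msghdr msg;
	ssize_t len;

	assert(truncated != NULL);
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	len = recvmsg(fd, &msg, 0);
	if (len < 0)
		return -1;
	*truncated = (msg.msg_flags & MSG_TRUNC) != 0;
	return len;
}
```

A caller would pass an 8192-byte buffer, as in the patched example, and
treat a set truncated flag as a sign that the message has to be discarded
or the buffer enlarged.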


* Re: Long delays creating a netns after deleting one (possibly RCU related)
From: Eric W. Biederman @ 2016-11-14 22:12 UTC
  To: Paul E. McKenney
  Cc: Cong Wang, Rolf Neugebauer, LKML, Linux Kernel Network Developers,
	Justin Cormack, Ian Campbell, netdev, Eric Dumazet
In-Reply-To: <20161114181425.GN4127@linux.vnet.ibm.com>

"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:

> On Mon, Nov 14, 2016 at 09:44:35AM -0800, Cong Wang wrote:
>> On Mon, Nov 14, 2016 at 8:24 AM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> > On Sun, Nov 13, 2016 at 10:47:01PM -0800, Cong Wang wrote:
>> >> On Fri, Nov 11, 2016 at 4:55 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> >> > On Fri, Nov 11, 2016 at 4:23 PM, Paul E. McKenney
>> >> > <paulmck@linux.vnet.ibm.com> wrote:
>> >> >>
>> >> >> Ah!  This net_mutex is different than RTNL.  Should synchronize_net() be
>> >> >> modified to check for net_mutex being held in addition to the current
>> >> >> checks for RTNL being held?
>> >> >>
>> >> >
>> >> > Good point!
>> >> >
>> >> > Like commit be3fc413da9eb17cce0991f214ab0, checking
>> >> > for net_mutex for this case seems to be an optimization, I assume
>> >> > synchronize_rcu_expedited() and synchronize_rcu() have the same
>> >> > behavior...
>> >>
>> >> Thinking a bit more, I think commit be3fc413da9eb17cce0991f
>> >> gets rtnl_is_locked() wrong: the lock could be held by another
>> >> process, not by the current one, so it should be
>> >> lockdep_rtnl_is_held(), which, however, is defined only when LOCKDEP
>> >> is enabled... Sigh.
>> >>
>> >> I don't see any better way than letting callers decide if they want the
>> >> expedited version or not, but this requires changes of all callers of
>> >> synchronize_net(). Hm.
>> >
>> > I must confess that I don't understand how it would help to use an
>> > expedited grace period when some other process is holding RTNL.
>> > In contrast, I do well understand how it helps when the current process
>> > is holding RTNL.
>> 
>> Yeah, this is exactly my point. And same for ASSERT_RTNL() which checks
>> rtnl_is_locked(), clearly we need to assert "it is held by the current process"
>> rather than "it is locked by whatever process".
>> 
>> But given that *_is_held() is only defined with LOCKDEP, we probably need
>> mutex to provide such a helper directly; mutex->owner is not always defined
>> either. :-/
>
> There is always the option of making acquisition and release set a per-task
> variable that can be tested.  (Where did I put that asbestos suit, anyway?)
>
> 							Thanx, Paul

synchronize_rcu_expedited() is not enough if you have multiple network
devices in play.

Looking at the code, it comes down to this commit, and it appears there
is a promise by Eric Dumazet to add RCU grace period combining.

Eric, since people are hitting noticeable stalls because of the RCU grace
period taking a long time, do you think you could look at this code path
a bit more?

commit 93d05d4a320cb16712bb3d57a9658f395d8cecb9
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Nov 18 06:31:03 2015 -0800

    net: provide generic busy polling to all NAPI drivers
    
    NAPI drivers no longer need to observe a particular protocol
    to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y)
    
    napi_hash_add() and napi_hash_del() are automatically called
    from core networking stack, respectively from
    netif_napi_add() and netif_napi_del()
    
    This patch depends on free_netdev() and netif_napi_del() being
    called from process context, which seems to be the norm.
    
    Drivers might still prefer to call napi_hash_del() on their
    own, since they might combine all the rcu grace periods into
    a single one, knowing their NAPI structures lifetime, while
    core networking stack has no idea of a possible combining.
    
    Once this patch proves to not bring serious regressions,
    we will cleanup drivers to either remove napi_hash_del()
    or provide appropriate rcu grace periods combining.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Eric
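
To illustrate the distinction made earlier in this thread between "the lock
is locked by somebody" and "the lock is held by the current task", here is a
toy user-space model.  Everything in it is hypothetical: struct owned_lock,
the integer task ids and the helper names are illustration only, it is not
thread-safe, and it is not how RTNL or struct mutex is implemented.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Toy lock that records which task took it; task ids are plain ints. */
struct owned_lock {
	bool locked;
	int owner;
};

void owned_lock_init(struct owned_lock *l)
{
	l->locked = false;
	l->owner = NO_OWNER;
}

/* Returns true if the lock was free and is now held by task. */
bool owned_lock_acquire(struct owned_lock *l, int task)
{
	if (l->locked)
		return false;
	l->locked = true;
	l->owner = task;
	return true;
}

void owned_lock_release(struct owned_lock *l, int task)
{
	assert(l->locked && l->owner == task);
	l->locked = false;
	l->owner = NO_OWNER;
}

/* "Locked by whatever process": true no matter who the holder is. */
bool owned_lock_is_locked(const struct owned_lock *l)
{
	return l->locked;
}

/* "Held by the current process": true only for the recorded owner. */
bool owned_lock_is_held_by(const struct owned_lock *l, int task)
{
	return l->locked && l->owner == task;
}
```

With task 1 holding the lock, owned_lock_is_locked() is true for every
caller, while owned_lock_is_held_by() is true only for task 1.  Only the
second check tells a caller that it is itself the one blocking others.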


* qed, qedi patchset submission
From: Arun Easi @ 2016-11-14 21:53 UTC
  To: Martin K. Petersen, David Miller, linux-scsi, netdev

Hi Martin, David,

This is regarding the submission of the recent patch series we have posted
to linux-scsi and netdev:

    [PATCH v2 0/6] Add QLogic FastLinQ iSCSI (qedi) driver.
    [PATCH v2 1/6] qed: Add support for hardware offloaded iSCSI.
    [PATCH v2 2/6] qed: Add iSCSI out of order packet handling.
    [PATCH v2 3/6] qedi: Add QLogic FastLinQ offload iSCSI driver framework.
    [PATCH v2 4/6] qedi: Add LL2 iSCSI interface for offload iSCSI.
    [PATCH v2 5/6] qedi: Add support for iSCSI session management.
    [PATCH v2 6/6] qedi: Add support for data path.

Patches 1 & 2 are "qed" module patches that go under the
drivers/net/ethernet/qlogic/qed/ and include/linux/qed/ directories.
	- These are the iSCSI enablement changes to the common "qed"
	  module. There is no dependency for these patches, so they
	  can go in independently.

Patches 3, 4, 5 & 6 are the qedi patches, which are aimed at the
drivers/scsi/qedi/ directory.
	- These are the core qedi changes and are dependent on the
	  qed changes (they invoke qed_XXX functions).

As qed sits in the net tree, the patches are usually submitted via netdev.

Do you have any preference or thoughts on how the "qed" patches should be
approached? Just as a reference, our rdma driver "qedr" went through
something similar[1], and eventually the "qed" patches were taken by David
in the net tree and "qedr" in the rdma tree (obviously) by Doug L.

Hi David,

For the "qed" enablement sent with the v2 series, we did not prefix the
qed patches with "[PATCH net-next]", so netdev folks may have
failed to notice/review them; sorry about that. We will send the next (v3)
series with that corrected.

Right now, we are basing the "qed" patches on top of latest net + net-next 
tree. FYI, I tried a test merge of net-next/master + qed patches with 
"net/master" and I see no conflict in qed.

Regards,
-Arun

[1] http://marc.info/?l=linux-rdma&m=147509152719831&w=2


* [GIT] Networking
From: David Miller @ 2016-11-14 22:08 UTC
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) Fix off by one wrt. indexing when dumping /proc/net/route entries, from
   Alexander Duyck.

2) Fix lockdep splats in iwlwifi, from Johannes Berg.

3) Cure panic when inserting certain netfilter rules when NFT_SET_HASH
   is disabled, from Liping Zhang.

4) Memory leak when nft_expr_clone() fails, also from Liping Zhang.

5) Disable UFO when path will apply IPSEC transformations, from Jakub
   Sitnicki.

6) Don't bogusly double cwnd in dctcp module, from Florian Westphal.

7) skb_checksum_help() should never actually use the value "0" for
   the resulting checksum, that has a special meaning, use CSUM_MANGLED_0
   instead.  From Eric Dumazet.

8) Per-tx/rx queue statistic strings are wrong in qede driver, fix from
   Yuval Mintz.

9) Fix SCTP reference counting of associations and transports in
   sctp_diag.  From Xin Long.

10) When we hit ip6tunnel_xmit() we could have come from an ipv4
    path in a previous layer or similar, so explicitly clear the
    ipv6 control block in the skb.  From Eli Cooper.

11) Fix bogus sleeping inside of inet_wait_for_connect(), from WANG
    Cong.

12) Correct device ID of T6 adapter in cxgb4 driver, from Hariprasad
    Shenai.

13) Fix potential access past the end of the skb page frag array in
    tcp_sendmsg().  From Eric Dumazet.

14) 'skb' can legitimately be NULL in inet{,6}_exact_dif_match(). Fix
    from David Ahern.

15) Don't return an error in tcp_sendmsg() if we wrote any bytes
    successfully, from Eric Dumazet.

16) Extraneous unlocks in netlink_diag_dump(), we removed the locking
    but forgot to purge these unlock calls. From Eric Dumazet.

17) Fix memory leak in error path of __genl_register_family().  We
    leak the attrbuf, from WANG Cong.

18) cgroupstats netlink policy table is mis-sized, from WANG Cong.

19) Several XDP bug fixes in mlx5, from Saeed Mahameed.

20) Fix several device refcount leaks in network drivers, from Johan
    Hovold.

21) icmp6_send() should use skb dst device not skb->dev to determine
    L3 routing domain.  From David Ahern.

22) ip_vs_genl_family sets maxattr incorrectly, from WANG Cong.

23) We leak new macvlan port in some cases of macvlan_common_newlink()
    errors.  Fix from Gao Feng.

24) Similar to the icmp6_send() fix, icmp_route_lookup() should determine
    L3 routing domain using skb_dst(skb)->dev not skb->dev.  Also
    from David Ahern.

25) Several fixes for route offloading and FIB notification handling
    in mlxsw driver, from Jiri Pirko.

26) Properly cap __skb_flow_dissect()'s return value, from Eric
    Dumazet.

27) Fix long standing regression in ipv4 redirect handling,
    wrt. validating the new neighbour's reachability.  From
    Stephen Suryaputra Lin.

28) If sk_filter() trims the packet excessively, handle it reasonably
    in tcp input instead of exploding.  From Eric Dumazet.

29) Fix handling of napi hash state when copying channels in sfc
    driver, from Bert Kenward.

Please pull, thanks a lot!

The following changes since commit 2a26d99b251b8625d27aed14e97fc10707a3a81f:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2016-10-29 20:33:20 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to ac571de999e14b87890cb960ad6f03fbdde6abc8:

  mlxsw: spectrum_router: Flush FIB tables during fini (2016-11-14 16:45:16 -0500)

----------------------------------------------------------------
Alexander Duyck (1):
      fib_trie: Correct /proc/net/route off by one error

Allan Chou (1):
      Net Driver: Add Cypress GX3 VID=04b4 PID=3610.

Andy Gospodarek (1):
      bgmac: stop clearing DMA receive control register right after it is set

Arkadi Sharshevsky (1):
      mlxsw: spectrum_router: Correctly dump neighbour activity

Arnd Bergmann (3):
      brcmfmac: avoid maybe-uninitialized warning in brcmf_cfg80211_start_ap
      netfilter: ip_vs_sync: fix bogus maybe-uninitialized warning
      vxlan: hide unused local variable

Baoquan He (2):
      Revert "bnx2: Reset device during driver initialization"
      bnx2: Wait for in-flight DMA to complete at probe stage

Baruch Siach (1):
      net: bpqether.h: remove if_ether.h guard

Benjamin Poirier (1):
      bna: Add synchronization for tx ring.

Bert Kenward (1):
      sfc: clear napi_hash state when copying channels

Christophe Jaillet (1):
      net/mlx5: Simplify a test

Colin Ian King (2):
      net: ethernet: ixp4xx_eth: fix spelling mistake in debug message
      ps3_gelic: fix spelling mistake in debug message

Daniel Borkmann (2):
      bpf: fix htab map destruction when extra reserve is in use
      bpf: fix map not being uncharged during map creation failure

David Ahern (4):
      net: tcp: check skb is non-NULL for exact match on lookups
      net: icmp6_send should use dst dev to determine L3 domain
      net: icmp_route_lookup should use rt dev to determine L3 domain
      net: tcp response should set oif only if it is L3 master

David S. Miller (14):
      Merge tag 'wireless-drivers-for-davem-2016-10-30' of git://git.kernel.org/.../kvalo/wireless-drivers
      Merge branch 'sctp-hold-transport-fixes'
      Merge tag 'linux-can-fixes-for-4.9-20161031' of git://git.kernel.org/.../mkl/linux-can
      Merge branch 'xgene-coalescing-bugs'
      Merge branch 'mlx5-fixes'
      Merge branch 'phy-ref-leaks'
      Merge branch 'qcom-emac-pause'
      Merge git://git.kernel.org/.../pablo/nf
      Merge branch 'qed-fixes'
      Merge branch 'mlxsw-fixes'
      Merge branch 'fix-bpf_redirect'
      Merge branch 'bnxt_en-fixes'
      Merge branch 'mlxsw-fixes'
      Merge branch 'bnx2-kdump-fix'

Dongli Zhang (2):
      xen-netfront: do not cast grant table reference to signed short
      xen-netfront: cast grant table reference first to type int

Eli Cooper (2):
      ip6_tunnel: Clear IP6CB in ip6tunnel_xmit()
      ip6_udp_tunnel: remove unused IPCB related codes

Eric Dumazet (12):
      net: clear sk_err_soft in sk_clone_lock()
      net: mangle zero checksum in skb_checksum_help()
      tcp: fix potential memory corruption
      tcp: fix return value for partial writes
      dccp: do not release listeners too soon
      dccp: do not send reset to already closed sockets
      dccp: fix out of bound access in dccp_v4_err()
      netlink: netlink_diag_dump() runs without locks
      ipv6: dccp: fix out of bound access in dccp_v6_err()
      ipv6: dccp: add missing bind_conflict to dccp_ipv6_mapped
      net: __skb_flow_dissect() must cap its return value
      tcp: take care of truncations done by sk_filter()

Fabian Mewes (1):
      Documentation: networking: dsa: Update tagging protocols

Florian Fainelli (1):
      net: stmmac: Fix lack of link transition for fixed PHYs

Florian Westphal (5):
      netfilter: conntrack: avoid excess memory allocation
      dctcp: avoid bogus doubling of cwnd after loss
      netfilter: connmark: ignore skbs with magic untracked conntrack objects
      netfilter: conntrack: fix CT target for UNSPEC helpers
      netfilter: conntrack: refine gc worker heuristics

Gao Feng (1):
      driver: macvlan: Destroy new macvlan port if macvlan_common_newlink failed.

Guenter Roeck (1):
      r8152: Fix error path in open function

Guilherme G. Piccoli (1):
      ehea: fix operation state report

Haim Dreyfuss (1):
      iwlwifi: mvm: comply with fw_restart mod param on suspend

Hariprasad Shenai (1):
      cxgb4: correct device ID of T6 adapter

Huy Nguyen (1):
      net/mlx5: Fix invalid pointer reference when prof_sel parameter is invalid

Ido Schimmel (2):
      mlxsw: spectrum: Fix incorrect reuse of MID entries
      mlxsw: spectrum_router: Flush FIB tables during fini

Isaac Boukris (1):
      unix: escape all null bytes in abstract unix domain socket

Iyappan Subramanian (2):
      drivers: net: xgene: fix: Disable coalescing on v1 hardware
      drivers: net: xgene: fix: Coalescing values for v2 hardware

Jakub Sitnicki (1):
      ipv6: Don't use ufo handling on later transformed packets

Jiri Pirko (2):
      mlxsw: spectrum_router: Fix handling of neighbour structure
      mlxsw: spectrum_router: Ignore FIB notification events for non-init namespaces

Johan Hovold (4):
      phy: fix device reference leaks
      net: ethernet: ti: cpsw: fix device and of_node leaks
      net: ethernet: ti: davinci_emac: fix device reference leak
      net: hns: fix device reference leaks

Johannes Berg (1):
      iwlwifi: pcie: mark command queue lock with separate lockdep class

John Allen (1):
      ibmvnic: Start completion queue negotiation at server-provided optimum values

John W. Linville (1):
      netfilter: nf_tables: fix type mismatch with error return from nft_parse_u32_check

Kalle Valo (1):
      Merge tag 'iwlwifi-for-kalle-2015-10-25' of git://git.kernel.org/.../iwlwifi/iwlwifi-fixes

Lance Richardson (2):
      ipv4: allow local fragmentation in ip_finish_output_gso()
      ipv4: update comment to document GSO fragmentation cases.

Liping Zhang (6):
      netfilter: nft_dynset: fix panic if NFT_SET_HASH is not enabled
      netfilter: nf_tables: fix *leak* when expr clone fail
      netfilter: nf_tables: fix race when create new element in dynset
      netfilter: nf_tables: destroy the set if fail to add transaction
      netfilter: nft_dup: do not use sreg_dev if the user doesn't specify it
      netfilter: nf_tables: fix oops when inserting an element into a verdict map

Luca Coelho (4):
      iwlwifi: mvm: use ssize_t for len in iwl_debugfs_mem_read()
      iwlwifi: mvm: fix d3_test with unified D0/D3 images
      iwlwifi: pcie: fix SPLC structure parsing
      iwlwifi: mvm: fix netdetect starting/stopping for unified images

Lukas Resch (1):
      can: sja1000: plx_pci: Add support for Moxa CAN devices

Maciej Żenczykowski (1):
      net-ipv6: on device mtu change do not add mtu to mtu-less routes

Marcelo Ricardo Leitner (1):
      sctp: assign assoc_id earlier in __sctp_connect

Mark Lord (1):
      r8152: Fix broken RX checksums.

Martin KaFai Lau (2):
      bpf: Fix bpf_redirect to an ipip/ip6tnl dev
      bpf: Add test for bpf_redirect to ipip/ip6tnl

Mathias Krause (1):
      rtnl: reset calcit fptr in rtnl_unregister()

Michael Chan (2):
      bnxt_en: Fix ring arithmetic in bnxt_setup_tc().
      bnxt_en: Fix VF virtual link state.

Michael S. Tsirkin (1):
      virtio-net: drop legacy features in virtio 1 mode

Mike Frysinger (1):
      Revert "include/uapi/linux/atm_zatm.h: include linux/time.h"

Mintz, Yuval (2):
      qede: Fix statistics' strings for Tx/Rx queues
      qede: Correctly map aggregation replacement pages

Oliver Hartkopp (1):
      can: bcm: fix warning in bcm_connect/proc_register

Or Gerlitz (3):
      net/mlx5e: Disallow changing name-space for VF representors
      net/mlx5e: Handle matching on vlan priority for offloaded TC rules
      net/mlx5: E-Switch, Set the actions for offloaded rules properly

Rafał Miłecki (1):
      net: bgmac: fix reversed checks for clock control flag

Ram Amrani (2):
      qed: configure ll2 RoCE v1/v2 flavor correctly
      qed: Correct rdma params configuration

Russell King (1):
      net: mv643xx_eth: ensure coalesce settings survive read-modify-write

Saeed Mahameed (3):
      MAINTAINERS: Update MELLANOX MLX5 core VPI driver maintainers
      net/mlx5e: Fix XDP error path of mlx5e_open_channel()
      net/mlx5e: Re-arrange XDP SQ/CQ creation

Sara Sharon (1):
      iwlwifi: mvm: wake the wait queue when the RX sync counter is zero

Soheil Hassas Yeganeh (1):
      sock: fix sendmmsg for partial sendmsg

Stephen Suryaputra Lin (1):
      ipv4: use new_gw for redirect neigh lookup

Tariq Toukan (1):
      Revert "net/mlx4_en: Fix panic during reboot"

Thomas Falcon (2):
      ibmvnic: Unmap ibmvnic_statistics structure
      ibmvnic: Fix size of debugfs name buffer

Timur Tabi (3):
      net: qcom/emac: use correct value for SGMII_LN_UCDR_SO_GAIN_MODE0
      net: qcom/emac: configure the external phy to allow pause frames
      net: qcom/emac: enable flow control if requested

Ulrich Weber (1):
      netfilter: nf_conntrack_sip: extend request line validation

WANG Cong (4):
      inet: fix sleeping inside inet_wait_for_connect()
      genetlink: fix a memory leak on error path
      taskstats: fix the length of cgroupstats_cmd_get_policy
      ipvs: use IPVS_CMD_ATTR_MAX for family.maxattr

Xin Long (5):
      ipv6: add mtu lock check in __ip6_rt_update_pmtu
      sctp: hold transport instead of assoc in sctp_diag
      sctp: return back transport in __sctp_rcv_init_lookup
      sctp: hold transport instead of assoc when lookup assoc in rx path
      sctp: change sk state only when it has assocs in sctp_shutdown

Yotam Gigi (1):
      mlxsw: spectrum: Fix refcount bug on span entries

 Documentation/networking/dsa/dsa.txt                        |   3 +-
 MAINTAINERS                                                 |   1 +
 drivers/net/can/sja1000/plx_pci.c                           |  18 ++++++++++
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c              |  12 -------
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h              |   2 ++
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c            |   3 +-
 drivers/net/ethernet/apm/xgene/xgene_enet_ring2.c           |  12 ++++---
 drivers/net/ethernet/broadcom/bgmac.c                       |   9 +++--
 drivers/net/ethernet/broadcom/bnx2.c                        |  48 +++++++++++++++++++-------
 drivers/net/ethernet/broadcom/bnxt/bnxt.c                   |  11 +++---
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c             |   4 +--
 drivers/net/ethernet/brocade/bna/bnad.c                     |   4 +--
 drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h          |   2 +-
 drivers/net/ethernet/hisilicon/hns/hnae.c                   |   8 ++++-
 drivers/net/ethernet/ibm/ehea/ehea_main.c                   |   2 ++
 drivers/net/ethernet/ibm/ibmvnic.c                          |  10 +++---
 drivers/net/ethernet/marvell/mv643xx_eth.c                  |   2 ++
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c              |   1 -
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c           |  31 +++++++++--------
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c            |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c             |   5 ++-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c  |   3 +-
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c           |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/main.c              |   5 +--
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c              |   4 ++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h              |   2 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c       | 134 ++++++++++++++++++++++++++++++++++++----------------------------------
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c    |  14 ++++----
 drivers/net/ethernet/qlogic/qed/qed_hsi.h                   |   3 --
 drivers/net/ethernet/qlogic/qed/qed_ll2.c                   |   1 +
 drivers/net/ethernet/qlogic/qed/qed_main.c                  |  17 +++++----
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c             |  25 +++++++++-----
 drivers/net/ethernet/qlogic/qede/qede_main.c                |   2 +-
 drivers/net/ethernet/qualcomm/emac/emac-mac.c               |  15 +++++---
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.c             |   2 +-
 drivers/net/ethernet/sfc/efx.c                              |   3 ++
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c           |   7 ++++
 drivers/net/ethernet/ti/cpsw-phy-sel.c                      |   3 ++
 drivers/net/ethernet/ti/davinci_emac.c                      |  10 +++---
 drivers/net/ethernet/toshiba/ps3_gelic_wireless.c           |   2 +-
 drivers/net/ethernet/xscale/ixp4xx_eth.c                    |   3 +-
 drivers/net/macvlan.c                                       |  31 ++++++++++++-----
 drivers/net/phy/phy_device.c                                |   2 ++
 drivers/net/usb/ax88179_178a.c                              |  17 +++++++++
 drivers/net/usb/r8152.c                                     |  21 ++++++-----
 drivers/net/virtio_net.c                                    |  30 ++++++++++------
 drivers/net/vxlan.c                                         |   4 ++-
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c |   2 +-
 drivers/net/wireless/intel/iwlwifi/mvm/d3.c                 |  49 ++++++++++++++++++++------
 drivers/net/wireless/intel/iwlwifi/mvm/debugfs.c            |   4 +--
 drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c           |   3 +-
 drivers/net/wireless/intel/iwlwifi/mvm/mvm.h                |   1 +
 drivers/net/wireless/intel/iwlwifi/mvm/ops.c                |   1 +
 drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c               |   3 +-
 drivers/net/wireless/intel/iwlwifi/mvm/scan.c               |  33 ++++++++++++++----
 drivers/net/wireless/intel/iwlwifi/pcie/drv.c               |  79 +++++++++++++++++++++++++-----------------
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c                |   8 +++++
 drivers/net/xen-netfront.c                                  |   4 +--
 include/linux/ipv6.h                                        |   2 +-
 include/linux/netdevice.h                                   |  15 ++++++++
 include/net/ip.h                                            |   3 +-
 include/net/ip6_tunnel.h                                    |   1 +
 include/net/netfilter/nf_conntrack_labels.h                 |   3 +-
 include/net/netfilter/nf_tables.h                           |   8 +++--
 include/net/sctp/sctp.h                                     |   2 +-
 include/net/sock.h                                          |   4 +--
 include/net/tcp.h                                           |   3 +-
 include/uapi/linux/atm_zatm.h                               |   1 -
 include/uapi/linux/bpqether.h                               |   2 --
 kernel/bpf/hashtab.c                                        |   3 +-
 kernel/bpf/syscall.c                                        |   4 ++-
 kernel/taskstats.c                                          |   6 +++-
 net/can/bcm.c                                               |  32 ++++++++++++-----
 net/core/dev.c                                              |  19 ++++------
 net/core/filter.c                                           |  68 +++++++++++++++++++++++++++++++-----
 net/core/flow_dissector.c                                   |  11 ++++--
 net/core/rtnetlink.c                                        |   1 +
 net/core/sock.c                                             |   6 ++--
 net/dccp/ipv4.c                                             |  16 +++++----
 net/dccp/ipv6.c                                             |  19 +++++-----
 net/dccp/proto.c                                            |   4 +++
 net/ipv4/af_inet.c                                          |   9 +++--
 net/ipv4/fib_trie.c                                         |  21 +++++------
 net/ipv4/icmp.c                                             |   4 +--
 net/ipv4/ip_forward.c                                       |   2 +-
 net/ipv4/ip_output.c                                        |  25 ++++++++------
 net/ipv4/ip_tunnel_core.c                                   |  11 ------
 net/ipv4/ipmr.c                                             |   2 +-
 net/ipv4/netfilter/nft_dup_ipv4.c                           |   6 ++--
 net/ipv4/route.c                                            |   4 ++-
 net/ipv4/tcp.c                                              |   4 +--
 net/ipv4/tcp_dctcp.c                                        |  13 ++++++-
 net/ipv4/tcp_ipv4.c                                         |  19 +++++++++-
 net/ipv6/icmp.c                                             |   2 +-
 net/ipv6/ip6_output.c                                       |   2 +-
 net/ipv6/ip6_udp_tunnel.c                                   |   3 --
 net/ipv6/netfilter/nft_dup_ipv6.c                           |   6 ++--
 net/ipv6/route.c                                            |   4 +++
 net/ipv6/tcp_ipv6.c                                         |  14 +++++---
 net/netfilter/ipvs/ip_vs_ctl.c                              |   2 +-
 net/netfilter/ipvs/ip_vs_sync.c                             |   7 ++--
 net/netfilter/nf_conntrack_core.c                           |  49 +++++++++++++++++++++-----
 net/netfilter/nf_conntrack_helper.c                         |  11 ++++--
 net/netfilter/nf_conntrack_sip.c                            |   5 ++-
 net/netfilter/nf_tables_api.c                               |  18 ++++++----
 net/netfilter/nft_dynset.c                                  |  19 ++++++----
 net/netfilter/nft_set_hash.c                                |  19 +++++++---
 net/netfilter/nft_set_rbtree.c                              |   2 +-
 net/netfilter/xt_connmark.c                                 |   4 +--
 net/netlink/diag.c                                          |   5 +--
 net/netlink/genetlink.c                                     |   4 ++-
 net/sctp/input.c                                            |  35 +++++++++----------
 net/sctp/ipv6.c                                             |   2 +-
 net/sctp/socket.c                                           |  27 +++++++--------
 net/socket.c                                                |   2 ++
 net/unix/af_unix.c                                          |   3 +-
 samples/bpf/Makefile                                        |   4 +++
 samples/bpf/tc_l2_redirect.sh                               | 173 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 samples/bpf/tc_l2_redirect_kern.c                           | 236 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 samples/bpf/tc_l2_redirect_user.c                           |  73 +++++++++++++++++++++++++++++++++++++++
 120 files changed, 1358 insertions(+), 465 deletions(-)
 create mode 100755 samples/bpf/tc_l2_redirect.sh
 create mode 100644 samples/bpf/tc_l2_redirect_kern.c
 create mode 100644 samples/bpf/tc_l2_redirect_user.c

^ permalink raw reply

* Re: [PATCH] net/phy/vitesse: Configure RGMII skew on VSC8601, if needed
From: Alex @ 2016-11-14 21:54 UTC (permalink / raw)
  To: Florian Fainelli, David Miller; +Cc: gokhan, netdev, linux-kernel
In-Reply-To: <d567c69f-6b57-7083-9090-df01fb140e36@gmail.com>



On 11/14/2016 01:25 PM, Florian Fainelli wrote:
> On 11/14/2016 01:18 PM, David Miller wrote:
>> From: Alexandru Gagniuc <alex.g@adaptrum.com>
>> Date: Sat, 12 Nov 2016 15:32:13 -0800
>>
>>> +	if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID)
>>> +		ret = vsc8601_add_skew(phydev);
>>
>> I think you should use phy_interface_is_rgmii() here.
>>
>
> This would include all RGMII modes, here I think the intent is to check
> for PHY_INTERFACE_MODE_RGMII_ID and PHY_INTERFACE_MODE_RGMII_TXID (or
> RXID),

That is correct.

> Alexandru, what direction do the skew settings apply to?

It applies a skew in both TX and RX directions.
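
For reference, a minimal userspace sketch of the difference between the
two checks discussed above. The enum and the demo_* helpers are
hypothetical stand-ins, not the kernel's phy_interface_t or
phy_interface_is_rgmii(); the only point is that a broad RGMII test
matches all four variants, while the check in the patch matches the
"delay in both directions" mode alone.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's PHY interface modes. */
enum demo_phy_mode {
	DEMO_MODE_MII,
	DEMO_MODE_RGMII,      /* no internal delay */
	DEMO_MODE_RGMII_ID,   /* internal delay on both RX and TX */
	DEMO_MODE_RGMII_RXID, /* internal delay on RX only */
	DEMO_MODE_RGMII_TXID, /* internal delay on TX only */
};

/* Models a broad "is this any RGMII variant" check. */
static bool demo_is_rgmii(enum demo_phy_mode mode)
{
	return mode >= DEMO_MODE_RGMII && mode <= DEMO_MODE_RGMII_TXID;
}

/* Models the narrower check: add skew only when both directions need it. */
static bool demo_needs_skew(enum demo_phy_mode mode)
{
	return mode == DEMO_MODE_RGMII_ID;
}
```

With these definitions, plain DEMO_MODE_RGMII passes demo_is_rgmii() but
not demo_needs_skew(), which is the case where the broad check would add
a skew nobody asked for.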

Alex

^ permalink raw reply

* Re: [patch net] mlxsw: spectrum_router: Flush FIB tables during fini
From: David Miller @ 2016-11-14 21:46 UTC (permalink / raw)
  To: jiri; +Cc: netdev, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz
In-Reply-To: <1479119192-1545-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Mon, 14 Nov 2016 11:26:32 +0100

> From: Ido Schimmel <idosch@mellanox.com>
> 
> Since commit b45f64d16d45 ("mlxsw: spectrum_router: Use FIB notifications
> instead of switchdev calls") we reflect to the device the entire FIB
> table and not only FIBs that point to netdevs created by the driver.
> 
> During module removal, FIBs of the second type are removed following
> NETDEV_UNREGISTER events sent. The other FIBs are still present in both
> the driver's cache and the device's table.
> 
> Fix this by iterating over all the FIB tables in the device and flush
> them. There's no need to take locks, as we're the only writer.
> 
> Fixes: b45f64d16d45 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

Applied, thanks Jiri.

^ permalink raw reply

* Re: [PATCH net-next] mdio: Demote print from info to debug in mdio_driver_register
From: David Miller @ 2016-11-14 21:41 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, andrew
In-Reply-To: <20161114030117.25169-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Sun, 13 Nov 2016 19:01:17 -0800

> While it is useful to know which MDIO driver is being registered, demote
> the pr_info() to a pr_debug().
> 
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net] net: stmmac: Fix lack of link transition for fixed PHYs
From: David Miller @ 2016-11-14 21:40 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, peppe.cavallaro, alexandre.torgue, linux-kernel
In-Reply-To: <20161114015036.6926-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Sun, 13 Nov 2016 17:50:35 -0800

> Commit 52f95bbfcf72 ("stmmac: fix adjust link call in case of a switch
> is attached") added some logic to avoid polling the fixed PHY and
> therefore invoking the adjust_link callback more than once, since this
> is a fixed PHY and link events won't be generated.
> 
> This works fine the first time, because we start with phydev->irq =
> PHY_POLL, so we call adjust_link, then we set phydev->irq =
> PHY_IGNORE_INTERRUPT and we stop polling the PHY.
> 
> Now, if we called ndo_close(), which calls both phy_stop() and does an
> explicit netif_carrier_off(), we end up with a link down. Upon calling
> ndo_open() again, despite starting the PHY state machine, we have
> PHY_IGNORE_INTERRUPT set, and we generate no link event at all, so the
> link is permanently down.
> 
> 52f95bbfcf72 ("stmmac: fix adjust link call in case of a switch is attached")

I added the missing "Fixes: " here.

> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>

Applied and queued up for -stable, thanks Florian.

^ permalink raw reply

* Re: [PATCH net-next 1/1] driver: macvlan: Replace integer number with bool value
From: David Miller @ 2016-11-14 21:36 UTC (permalink / raw)
  To: fgao; +Cc: kaber, netdev, gfree.wind
In-Reply-To: <1479083059-12688-1-git-send-email-fgao@ikuai8.com>

From: fgao@ikuai8.com
Date: Mon, 14 Nov 2016 08:24:19 +0800

> From: Gao Feng <gfree.wind@gmail.com>
> 
> The return value of function macvlan_addr_busy is used as bool value,
> so use bool value instead of integer number "1" and "0".
> 
> Signed-off-by: Gao Feng <gfree.wind@gmail.com>

Applied, thanks.

^ permalink raw reply

* [PATCH] can: spi: hi311x: fix semicolon.cocci warnings (fwd)
From: Julia Lawall @ 2016-11-14 21:36 UTC (permalink / raw)
  To: Akshay Bhat
  Cc: wg, mkl, robh+dt, mark.rutland, linux-can, netdev, devicetree,
	linux-kernel, Akshay Bhat, Akshay Bhat

 Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Akshay Bhat <akshay.bhat@timesys.com>
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---

It's a minor issue, but may as well fix it up now.  The tree this code is
from is as follows:

url:
https://github.com/0day-ci/linux/commits/Akshay-Bhat/can-holt_hi311x-document-device-tree-bindings/20161115-034509
:::::: branch date: 2 hours ago
:::::: commit date: 2 hours ago

 hi311x.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/can/spi/hi311x.c
+++ b/drivers/net/can/spi/hi311x.c
@@ -673,7 +673,7 @@ static irqreturn_t hi3110_can_ist(int ir
 		while (!(HI3110_STAT_RXFMTY &
 			hi3110_read(spi, HI3110_READ_STATF))) {
 			hi3110_hw_rx(spi);
-		};
+		}

 		intf = hi3110_read(spi, HI3110_READ_INTF);
 		eflag = hi3110_read(spi, HI3110_READ_ERR);

^ permalink raw reply

* [PATCH net] ipv4: fix cloning issues in fib_trie_unmerge()
From: Eric Dumazet @ 2016-11-14 21:34 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Alexander Duyck

From: Eric Dumazet <edumazet@google.com>

I had crashes in DEBUG_PAGEALLOC kernels in fib_table_flush() or
fib_table_lookup() that I backtracked to a refcounting issue
happening when we clone struct fib_alias in fib_trie_unmerge().

While fixing this issue, I also noticed a memory leak happening
if fib_insert_alias() fails.

Fixes: 0ddcf43d5d4a0 ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
---
 net/ipv4/fib_trie.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 4cff74d4133f..ebf49ab889e8 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1737,14 +1737,19 @@ struct fib_table *fib_trie_unmerge(struct fib_table *oldtb)
 				goto out;
 
 			memcpy(new_fa, fa, sizeof(*fa));
+			if (fa->fa_info)
+				fa->fa_info->fib_treeref++;
 
 			/* insert clone into table */
 			if (!local_l)
 				local_l = fib_find_node(lt, &local_tp, l->key);
 
 			if (fib_insert_alias(lt, local_tp, local_l, new_fa,
-					     NULL, l->key))
+					     NULL, l->key)) {
+				kmem_cache_free(fn_alias_kmem, new_fa);
+				fib_release_info(fa->fa_info);
 				goto out;
+			}
 		}
 
 		/* stop loop if key wrapped back to 0 */
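
To illustrate the two fixes in isolation, here is a small userspace
model. All demo_* names are hypothetical; demo_release_info() only
decrements a counter, which simplifies whatever the real
fib_release_info() does, and malloc()/free() stand in for the kmem
cache. The insert_ok parameter is an assumption used to simulate
fib_insert_alias() succeeding or failing.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a shared, reference-counted fib_info. */
struct demo_info {
	int treeref;
};

/* Hypothetical stand-in for fib_alias: holds a pointer to shared info. */
struct demo_alias {
	struct demo_info *info;
	int tos;
};

/* Simplified release: drop one reference. */
static void demo_release_info(struct demo_info *fi)
{
	if (fi)
		fi->treeref--;
}

/* Clone an alias and "insert" it. Returns the clone on success, or NULL
 * on failure with the allocation and the extra reference both undone. */
static struct demo_alias *demo_clone_alias(const struct demo_alias *fa,
					   int insert_ok)
{
	struct demo_alias *new_fa = malloc(sizeof(*new_fa));

	if (!new_fa)
		return NULL;

	memcpy(new_fa, fa, sizeof(*fa));
	/* memcpy duplicated the pointer, so account for the new holder. */
	if (fa->info)
		fa->info->treeref++;

	if (!insert_ok) {
		/* Undo both the allocation and the reference taken above. */
		free(new_fa);
		demo_release_info(fa->info);
		return NULL;
	}
	return new_fa;
}
```

Without the increment, freeing either the original or the clone later
would drop a reference the other still depends on; without the cleanup
in the failure branch, the clone is simply lost.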

^ permalink raw reply related

* Re: [PATCH] net: stmmac: Add support for ethtool::nway_reset
From: David Miller @ 2016-11-14 21:33 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, peppe.cavallaro, alexandre.torgue, linux-kernel
In-Reply-To: <680b3968-5cdc-2229-fb8d-e9e75f87809a@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Sun, 13 Nov 2016 13:35:04 -0800

> Le 13/11/2016 à 13:24, Florian Fainelli a écrit :
>> If we have a PHY device, just invoke genphy_restart_aneg() to restart
>> auto-negotiation.
>> 
>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> 
> David, please drop this patch for now, since I have another one pending
> which is going to touch the net_device/phydev interaction, this one also
> causes a build warning since priv is not used.

Ok, thanks for letting me know.

^ permalink raw reply

* Re: [PATCH net-next 00/11] Start adding support for mv88e6390 family
From: David Miller @ 2016-11-14 21:30 UTC (permalink / raw)
  To: andrew; +Cc: netdev, vivien.didelot
In-Reply-To: <20161113202403.GB18258@lunn.ch>

From: Andrew Lunn <andrew@lunn.ch>
Date: Sun, 13 Nov 2016 21:24:03 +0100

> I'm happy to respin, but I'm wondering why they don't apply.

Andrew, even though this issue has been resolved, it looks like Vivien
has some feedback for you to address with this series so I guess I'll
see a v2 or similar soon.

Thanks.

^ permalink raw reply

* Re: [PATCH] net/phy/vitesse: Configure RGMII skew on VSC8601, if needed
From: Florian Fainelli @ 2016-11-14 21:25 UTC (permalink / raw)
  To: David Miller, alex.g; +Cc: gokhan, netdev, linux-kernel
In-Reply-To: <20161114.161818.1460191406108019273.davem@davemloft.net>

On 11/14/2016 01:18 PM, David Miller wrote:
> From: Alexandru Gagniuc <alex.g@adaptrum.com>
> Date: Sat, 12 Nov 2016 15:32:13 -0800
> 
>> +	if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID)
>> +		ret = vsc8601_add_skew(phydev);
> 
> I think you should use phy_interface_is_rgmii() here.
> 

This would include all RGMII modes; here I think the intent is to check
for PHY_INTERFACE_MODE_RGMII_ID and PHY_INTERFACE_MODE_RGMII_TXID (or
RXID). Alexandru, what direction do the skew settings apply to?
-- 
Florian

^ permalink raw reply

* Re: [PATCH net] sctp: change sk state only when it has assocs in sctp_shutdown
From: David Miller @ 2016-11-14 21:23 UTC (permalink / raw)
  To: lucien.xin
  Cc: netdev, linux-sctp, marcelo.leitner, nhorman, vyasevich,
	andreyknvl
In-Reply-To: <d260f8b59f52d7ef00b83a554fc19d4aa91766a2.1479044677.git.lucien.xin@gmail.com>

From: Xin Long <lucien.xin@gmail.com>
Date: Sun, 13 Nov 2016 21:44:37 +0800

> Now when users shutdown a sock with SEND_SHUTDOWN in sctp, even if
> this sock has no connection (assoc), sk state would be changed to
> SCTP_SS_CLOSING, which is not as we expect.
> 
> Besides, after that if users try to listen on this sock, kernel
> could even panic when it dereferences sctp_sk(sk)->bind_hash in
> sctp_inet_listen, as bind_hash is null when sock has no assoc.
> 
> This patch is to move sk state change after checking sk assocs
> is not empty, and also merge these two if() conditions and reduce
> indent level.
> 
> Fixes: d46e416c11c8 ("sctp: sctp should change socket state when shutdown is received")
> Reported-by: Andrey Konovalov <andreyknvl@google.com>
> Tested-by: Andrey Konovalov <andreyknvl@google.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Applied and queued up for -stable, thanks.
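
A rough userspace model of the ordering the quoted changelog describes:
the state change happens only after confirming the socket has at least
one association. The struct, the enum values and demo_shutdown_send()
are hypothetical and much simpler than the real sctp_shutdown().

```c
#include <stdbool.h>

/* Hypothetical socket states, loosely mirroring the SCTP_SS_* idea. */
enum demo_sk_state {
	DEMO_SS_CLOSED,
	DEMO_SS_ESTABLISHED,
	DEMO_SS_CLOSING,
};

struct demo_sock {
	enum demo_sk_state state;
	int nr_assocs; /* number of associations on this socket */
};

/* Handle a send-side shutdown. Returns true if a shutdown was started. */
static bool demo_shutdown_send(struct demo_sock *sk)
{
	/* No association: nothing to shut down, leave the state alone. */
	if (sk->nr_assocs == 0)
		return false;

	sk->state = DEMO_SS_CLOSING;
	return true;
}
```

A socket that was never connected therefore stays in its original state
and can still be passed to listen() afterwards.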

^ permalink raw reply

* Re: [PATCH v2 0/2] bnx2: Wait for in-flight DMA to complete at probe stage
From: David Miller @ 2016-11-14 21:21 UTC (permalink / raw)
  To: bhe
  Cc: netdev, michael.chan, linux-kernel, Dept-GELinuxNICDev,
	rasesh.mody, harish.patil, frank, jsr, pmenzel, jroedel, dyoung
In-Reply-To: <1479013293-21001-1-git-send-email-bhe@redhat.com>

From: Baoquan He <bhe@redhat.com>
Date: Sun, 13 Nov 2016 13:01:31 +0800

> This is v2 post.
> 
> In commit 3e1be7a ("bnx2: Reset device during driver initialization"),
> the firmware request code was moved from the open stage to the probe
> stage. The reason is that in a kdump kernel the hardware IOMMU needs
> the device to be reset at driver probe, otherwise in-flight DMA from
> the first kernel will continue and look up the newly created io-page
> tables. However, resetting the bnx2 chip involves requesting firmware,
> which needs to be done in the open stage.
> 
> Michael Chan suggested we can just wait for the old in-flight DMA to
> complete at the probe stage; then, even without resetting the device,
> we don't need to worry about the old in-flight DMA continuing to look
> up the newly created io-page tables.
> 
> v1->v2:
>     Michael suggested to wait for the in-flight DMA to complete at probe
>     stage. So give up the old method of trying to reset chip at probe
>     stage, take the new way accordingly.

Series applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH] net/phy/vitesse: Configure RGMII skew on VSC8601, if needed
From: David Miller @ 2016-11-14 21:18 UTC (permalink / raw)
  To: alex.g; +Cc: f.fainelli, gokhan, netdev, linux-kernel
In-Reply-To: <1478993533-1936-1-git-send-email-alex.g@adaptrum.com>

From: Alexandru Gagniuc <alex.g@adaptrum.com>
Date: Sat, 12 Nov 2016 15:32:13 -0800

> +	if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID)
> +		ret = vsc8601_add_skew(phydev);

I think you should use phy_interface_is_rgmii() here.

^ permalink raw reply

* Re: Debugging Ethernet issues
From: Florian Fainelli @ 2016-11-14 21:00 UTC (permalink / raw)
  To: Mason
  Cc: Sebastian Frias, Andrew Lunn, netdev, Mans Rullgard,
	Sergei Shtylyov, Tom Lendacky, Zach Brown, Shaohui Xie, Tim Beale,
	Brian Hill, Vince Bridgers, Balakumaran Kannan, David S. Miller,
	Kirill Kapranov
In-Reply-To: <582A1E3F.8040908@free.fr>

On 11/14/2016 12:27 PM, Mason wrote:
> On 14/11/2016 19:20, Florian Fainelli wrote:
> 
>> On 11/14/2016 09:59 AM, Sebastian Frias wrote:
>>
>>> Could you confirm that Mason's patch is correct and/or that it does not
>>> have negative side-effects?
>>
>> The patch is not correct nor incorrect per-se, it changes the default
>> policy of having pause frames advertised by default to not having them
>> advertised by default. This influences both your Ethernet MAC and the
>> link partner in that the result is either flow control is enabled
>> (before) or it is not (with the patch). There must be something amiss if
>> you see packet loss or some kind of problem like that with an early
>> exchange such as DHCP. Flow control tends to kick in under higher packet
>> rates (at least, that's what you expect).
> 
> Did you note that, without the change under discussion (i.e. with
> the eth driver as it is upstream), when the board is connected to
> a 100 Mbps switch, then *nothing* works *systematically* (no ping,
> no DHCP; are there other relevant low-level network tools?).

No I missed that, way too many emails, really. So how about you compare
the register settings that could be (that is, all that could be modified
by the PHYLIB adjust_link function) and try to spot where things could
go wrong? Any other register that can be influenced by the link speed?

One possible (yet, after re-reading, very unlikely) scenario: priv->speed,
priv->duplex and priv->link are initially zero (because nb8800_priv is
zero-initialized), which may not force a correct link transition and a
full MAC reconfiguration in nb8800_link_reconfigure(), where some of the
cached values are used.

NB: you will see most drivers initialize the previous link, speed and
duplex values to -1, because that is outside of the range of values
that PHYLIB would assign to phydev->{link,duplex,speed}. This
guarantees that an adjust_link callback which tries to minimize
reconfiguration still forces a transition the first time it runs.
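
A minimal userspace sketch of that point, loosely modelled on the
per-field comparison visible in the quoted nb8800_link_reconfigure()
hunk. The structs, the DEMO_DUPLEX_* values and the duplex_configs
counter are assumptions for illustration only; the counter stands in
for "the MAC was programmed for this duplex setting".

```c
#include <stdbool.h>

/* Assumed encoding for this sketch: half duplex is 0, full duplex is 1. */
#define DEMO_DUPLEX_HALF 0
#define DEMO_DUPLEX_FULL 1

/* What the PHY layer reports. */
struct demo_phy {
	int link, speed, duplex;
};

/* What the driver cached from the previous callback. */
struct demo_priv {
	int link, speed, duplex;
	int duplex_configs; /* how often the duplex setting was programmed */
};

static void demo_priv_init(struct demo_priv *p, int initial)
{
	p->link = p->speed = p->duplex = initial;
	p->duplex_configs = 0;
}

/* adjust_link-style callback that only reprograms what changed. */
static bool demo_adjust_link(struct demo_priv *p, const struct demo_phy *phy)
{
	bool change = false;

	if (phy->link) {
		if (phy->speed != p->speed) {
			p->speed = phy->speed;
			change = true;
		}
		if (phy->duplex != p->duplex) {
			p->duplex = phy->duplex;
			p->duplex_configs++;
			change = true;
		}
	}
	if (phy->link != p->link) {
		p->link = phy->link;
		change = true;
	}
	return change;
}
```

With a zero-initialized cache and a link that comes up at 10/Half, the
duplex branch is skipped because the cached 0 already equals the
half-duplex value; initializing the cache with -1 makes every field
differ on the first call.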

> 
> Also, maybe this comment was lost in my own noise:
> 
> If I manually set the link up, then down, then run udhcpc
> => then nothing works, as if something is wedged somewhere
> (a kernel thread gets borked by a race condition?)

Well then start seriously debugging the problem: the first thing you
need to check is whether the RUNNING flag is set on the interface (which
indicates that the carrier is on). Without that, the networking stack
won't even send packets. If it is not set, why isn't it set? Did
nb8800_mac_config() get called in the first place to configure the MAC
wrt. the link settings?

When you transmit, do transmit counters increase? That would indicate
the TX DMA does its job. When transmission occurs, is it successful or
is it reporting errors? If the PHY supports it, can you access PHY
counters and look for success/error counters changing? Finally, try
adding another golden (working) host and, if your switch supports it,
configure port mirroring to look at packets. If the switch does not
support it, then try different link partners.

> 
> Could not advertising pause frames result in making such a
> race condition impossible? (I don't really believe in a race,
> due to the 100% nature of the problem.)
> 
>>> Right now we know that Mason's patch makes this work, but we do not understand
>>> why nor its implications.
>>
>> You need to understand why, right now, the way this problem is
>> presented, you came up with a workaround, not with the root cause or the
>> solution. What does your link partner (switch?) report, that is, what
>> is the ethtool output when you have a link up from your nb8800 adapter?
> 
> Isn't that what ethtool -a eth0 prints?

No, ethtool -a prints the local pause settings.

> How do I get the link partner information?

ethtool eth0:

# ethtool eth0
Settings for eth0:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                             100baseT/Half 100baseT/Full

	^======================

        Link partner advertised pause frame use: Symmetric
        Link partner advertised auto-negotiation: Yes

	^========================

        Speed: 100Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: gs
        Wake-on: d
        SecureOn password: 00:00:00:00:00:00
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes
#
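
As a sketch of how the two "pause frame use" lines combine, the function
below follows the usual pause resolution for full-duplex links as
commonly implemented (hedged: this is a standalone model, with
hypothetical DEMO_* bit values, not the kernel's helper). Each side
advertises a symmetric bit and an asymmetric bit, and the result is the
set of directions in which the local side may use flow control.

```c
/* Hypothetical bit values for the advertised pause capabilities. */
#define DEMO_ADV_PAUSE      0x1 /* symmetric pause */
#define DEMO_ADV_ASYM_PAUSE 0x2 /* asymmetric pause */

/* Resolved flow control directions for the local side. */
#define DEMO_FLOW_TX 0x1
#define DEMO_FLOW_RX 0x2

static unsigned int demo_resolve_pause(unsigned int local,
				       unsigned int partner)
{
	unsigned int cap = 0;

	if (local & partner & DEMO_ADV_PAUSE) {
		/* Both sides symmetric: pause in both directions. */
		cap = DEMO_FLOW_TX | DEMO_FLOW_RX;
	} else if (local & partner & DEMO_ADV_ASYM_PAUSE) {
		/* Only asymmetric in common: one direction at most. */
		if (local & DEMO_ADV_PAUSE)
			cap = DEMO_FLOW_RX;
		else if (partner & DEMO_ADV_PAUSE)
			cap = DEMO_FLOW_TX;
	}
	return cap;
}
```

For the output above (local side advertises nothing, partner advertises
symmetric), demo_resolve_pause(0, DEMO_ADV_PAUSE) yields 0, i.e. flow
control off in both directions.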


> Just ethtool eth0?

Yes, just that.
-- 
Florian

^ permalink raw reply

* Re: [PATCH v2 net-next 4/6] bpf: Add BPF_MAP_TYPE_LRU_HASH
From: Alexei Starovoitov @ 2016-11-14 20:52 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: netdev, David Miller, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team
In-Reply-To: <1478890511-1346984-5-git-send-email-kafai@fb.com>

On Fri, Nov 11, 2016 at 10:55:09AM -0800, Martin KaFai Lau wrote:
> Provide a LRU version of the existing BPF_MAP_TYPE_HASH.
> 
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
...
> +/* Instead of having one common LRU list in the
> + * BPF_MAP_TYPE_LRU_HASH map, use a percpu LRU list
> + * which can scale and perform better.
> + * Note, the LRU nodes (including free nodes) cannot be moved
> + * across different LRU lists.
> + */
> +#define BPF_F_NO_COMMON_LRU	(1U << 1)

I couldn't come up with better name, so I think it's good :)

> +	if (lru && !capable(CAP_SYS_ADMIN))
> +		/* LRU implementation is much complicated than other
> +		 * maps.  Hence, limit to CAP_SYS_ADMIN for now.
> +		 */
> +		return ERR_PTR(-EPERM);

+1
good call.

> +	if (!percpu && !lru) {
> +		/* lru itself can remove the least used element, so
> +		 * there is no need for an extra elem during map_update.
> +		 */

yeah. that's an important comment, otherwise
@@ -48,11 +52,19 @@ struct htab_elem {
 	union {
 		struct rcu_head rcu;
 		enum extra_elem_state state;
+		struct bpf_lru_node lru_node;
 	};
wouldn't be correct.

Acked-by: Alexei Starovoitov <ast@kernel.org>

^ permalink raw reply

* Re: [RFC v4 00/18] Landlock LSM: Unprivileged sandboxing
From: Mickaël Salaün @ 2016-11-14 20:51 UTC (permalink / raw)
  To: Sargun Dhillon
  Cc: LKML, Alexei Starovoitov, Andy Lutomirski, Daniel Borkmann,
	Daniel Mack, David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Kees Cook, Paul Moore, Serge E . Hallyn,
	Tejun Heo, Thomas Graf, Will Drewry, kernel-hardening, Linux API,
	LSM, netdev, open list:CONTROL GROUP (CGROUP)
In-Reply-To: <CAMp4zn8u3kg-nhiZ5rSUCLGveAzHr6FoP1x=iJasF2W0S56WfA@mail.gmail.com>


On 14/11/2016 11:35, Sargun Dhillon wrote:
> Was there a plan around getting Daniel's patches in as well? Also,
> rather than making these handles landlock-specific, can they be
> implemented in such a way where we can keep track of (some) of these
> in other types of programs?
> 

About the map of handles, this is only a new type of map so it's not
particularly Landlock-specific. Anyway, we'll see that in the third step.

 Mickaël


^ permalink raw reply

* [PATCH net][v2] bpf: fix range arithmetic for bpf map access
From: Josef Bacik @ 2016-11-14 20:45 UTC (permalink / raw)
  To: jannh, ast, daniel, davem, netdev

I made some invalid assumptions with BPF_AND and BPF_MOD that could result in
invalid accesses to bpf map entries.  Fix this up by doing a few things:

1) Kill BPF_MOD support.  This doesn't actually get used by the compiler in real
life and just adds extra complexity.

2) Fix the logic for BPF_AND, don't allow AND of negative numbers and set the
minimum value to 0 for positive AND's.

3) Don't do operations on the ranges if they are set to the limits, as they are
by definition undefined, and allowing arithmetic operations on those values
could make them appear valid when they really aren't.

This fixes the testcase provided by Jann as well as a few other theoretical
problems.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
---
V1->V2:
- set the MIN_RANGE to -1 to essentially disable all negative values for the min
  value.
- rebased onto net instead of net-next.

 include/linux/bpf_verifier.h |  5 ++--
 kernel/bpf/verifier.c        | 70 +++++++++++++++++++++++++++++---------------
 2 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 7035b99..6aaf425 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -14,7 +14,7 @@
   * are obviously wrong for any sort of memory access.
   */
 #define BPF_REGISTER_MAX_RANGE (1024 * 1024 * 1024)
-#define BPF_REGISTER_MIN_RANGE -(1024 * 1024 * 1024)
+#define BPF_REGISTER_MIN_RANGE -1
 
 struct bpf_reg_state {
 	enum bpf_reg_type type;
@@ -22,7 +22,8 @@ struct bpf_reg_state {
 	 * Used to determine if any memory access using this register will
 	 * result in a bad access.
 	 */
-	u64 min_value, max_value;
+	s64 min_value;
+	u64 max_value;
 	union {
 		/* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */
 		s64 imm;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 99a7e5b..6a93615 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -216,8 +216,8 @@ static void print_verifier_state(struct bpf_verifier_state *state)
 				reg->map_ptr->key_size,
 				reg->map_ptr->value_size);
 		if (reg->min_value != BPF_REGISTER_MIN_RANGE)
-			verbose(",min_value=%llu",
-				(unsigned long long)reg->min_value);
+			verbose(",min_value=%lld",
+				(long long)reg->min_value);
 		if (reg->max_value != BPF_REGISTER_MAX_RANGE)
 			verbose(",max_value=%llu",
 				(unsigned long long)reg->max_value);
@@ -758,7 +758,7 @@ static int check_mem_access(struct bpf_verifier_env *env, u32 regno, int off,
 			 * index'es we need to make sure that whatever we use
 			 * will have a set floor within our range.
 			 */
-			if ((s64)reg->min_value < 0) {
+			if (reg->min_value < 0) {
 				verbose("R%d min value is negative, either use unsigned index or do a if (index >=0) check.\n",
 					regno);
 				return -EACCES;
@@ -1468,7 +1468,8 @@ static void check_reg_overflow(struct bpf_reg_state *reg)
 {
 	if (reg->max_value > BPF_REGISTER_MAX_RANGE)
 		reg->max_value = BPF_REGISTER_MAX_RANGE;
-	if ((s64)reg->min_value < BPF_REGISTER_MIN_RANGE)
+	if (reg->min_value < BPF_REGISTER_MIN_RANGE ||
+	    reg->min_value > BPF_REGISTER_MAX_RANGE)
 		reg->min_value = BPF_REGISTER_MIN_RANGE;
 }
 
@@ -1476,7 +1477,8 @@ static void adjust_reg_min_max_vals(struct bpf_verifier_env *env,
 				    struct bpf_insn *insn)
 {
 	struct bpf_reg_state *regs = env->cur_state.regs, *dst_reg;
-	u64 min_val = BPF_REGISTER_MIN_RANGE, max_val = BPF_REGISTER_MAX_RANGE;
+	s64 min_val = BPF_REGISTER_MIN_RANGE;
+	u64 max_val = BPF_REGISTER_MAX_RANGE;
 	bool min_set = false, max_set = false;
 	u8 opcode = BPF_OP(insn->code);
 
@@ -1512,22 +1514,43 @@ static void adjust_reg_min_max_vals(struct bpf_verifier_env *env,
 		return;
 	}
 
+	/* If one of our values was at the end of our ranges then we can't just
+	 * do our normal operations to the register, we need to set the values
+	 * to the min/max since they are undefined.
+	 */
+	if (min_val == BPF_REGISTER_MIN_RANGE)
+		dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
+	if (max_val == BPF_REGISTER_MAX_RANGE)
+		dst_reg->max_value = BPF_REGISTER_MAX_RANGE;
+
 	switch (opcode) {
 	case BPF_ADD:
-		dst_reg->min_value += min_val;
-		dst_reg->max_value += max_val;
+		if (dst_reg->min_value != BPF_REGISTER_MIN_RANGE)
+			dst_reg->min_value += min_val;
+		if (dst_reg->max_value != BPF_REGISTER_MAX_RANGE)
+			dst_reg->max_value += max_val;
 		break;
 	case BPF_SUB:
-		dst_reg->min_value -= min_val;
-		dst_reg->max_value -= max_val;
+		if (dst_reg->min_value != BPF_REGISTER_MIN_RANGE)
+			dst_reg->min_value -= min_val;
+		if (dst_reg->max_value != BPF_REGISTER_MAX_RANGE)
+			dst_reg->max_value -= max_val;
 		break;
 	case BPF_MUL:
-		dst_reg->min_value *= min_val;
-		dst_reg->max_value *= max_val;
+		if (dst_reg->min_value != BPF_REGISTER_MIN_RANGE)
+			dst_reg->min_value *= min_val;
+		if (dst_reg->max_value != BPF_REGISTER_MAX_RANGE)
+			dst_reg->max_value *= max_val;
 		break;
 	case BPF_AND:
-		/* & is special since it could end up with 0 bits set. */
-		dst_reg->min_value &= min_val;
+		/* Disallow AND'ing of negative numbers, ain't nobody got time
+		 * for that.  Otherwise the minimum is 0 and the max is the max
+		 * value we could AND against.
+		 */
+		if (min_val < 0)
+			dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
+		else
+			dst_reg->min_value = 0;
 		dst_reg->max_value = max_val;
 		break;
 	case BPF_LSH:
@@ -1537,24 +1560,25 @@ static void adjust_reg_min_max_vals(struct bpf_verifier_env *env,
 		 */
 		if (min_val > ilog2(BPF_REGISTER_MAX_RANGE))
 			dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
-		else
+		else if (dst_reg->min_value != BPF_REGISTER_MIN_RANGE)
 			dst_reg->min_value <<= min_val;
 
 		if (max_val > ilog2(BPF_REGISTER_MAX_RANGE))
 			dst_reg->max_value = BPF_REGISTER_MAX_RANGE;
-		else
+		else if (dst_reg->max_value != BPF_REGISTER_MAX_RANGE)
 			dst_reg->max_value <<= max_val;
 		break;
 	case BPF_RSH:
-		dst_reg->min_value >>= min_val;
-		dst_reg->max_value >>= max_val;
-		break;
-	case BPF_MOD:
-		/* % is special since it is an unsigned modulus, so the floor
-		 * will always be 0.
+		/* RSH by a negative number is undefined, and the BPF_RSH is an
+		 * unsigned shift, so make the appropriate casts.
 		 */
-		dst_reg->min_value = 0;
-		dst_reg->max_value = max_val - 1;
+		if (min_val < 0 || dst_reg->min_value < 0)
+			dst_reg->min_value = BPF_REGISTER_MIN_RANGE;
+		else
+			dst_reg->min_value =
+				(u64)(dst_reg->min_value) >> min_val;
+		if (dst_reg->max_value != BPF_REGISTER_MAX_RANGE)
+			dst_reg->max_value >>= max_val;
 		break;
 	default:
 		reset_reg_range_values(regs, insn->dst_reg);
-- 
2.5.5
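
A standalone userspace sketch of points 2) and 3) from the changelog,
for readers who want to poke at the range logic outside the verifier.
The DEMO_* constants mirror the values in the diff, but struct demo_reg
and the two helpers are hypothetical and cover only AND and ADD.

```c
#include <stdint.h>

/* Sentinels meaning "bound unknown", mirroring the values in the diff. */
#define DEMO_MAX_RANGE (1024 * 1024 * 1024)
#define DEMO_MIN_RANGE (-1)

struct demo_reg {
	int64_t min_value;
	uint64_t max_value;
};

/* Range update for dst &= src, where src lies in [min_val, max_val]. */
static void demo_range_and(struct demo_reg *dst, int64_t min_val,
			   uint64_t max_val)
{
	/* AND with a possibly negative value: give up on the minimum. */
	if (min_val < 0)
		dst->min_value = DEMO_MIN_RANGE;
	else
		dst->min_value = 0;
	dst->max_value = max_val;
}

/* Range update for dst += src; unknown bounds stay unknown. */
static void demo_range_add(struct demo_reg *dst, int64_t min_val,
			   uint64_t max_val)
{
	/* An unknown source bound makes the destination bound unknown. */
	if (min_val == DEMO_MIN_RANGE)
		dst->min_value = DEMO_MIN_RANGE;
	if (max_val == DEMO_MAX_RANGE)
		dst->max_value = DEMO_MAX_RANGE;

	/* Never do arithmetic on a sentinel. */
	if (dst->min_value != DEMO_MIN_RANGE)
		dst->min_value += min_val;
	if (dst->max_value != DEMO_MAX_RANGE)
		dst->max_value += max_val;
}
```

The key property is that once a bound is at its sentinel, later
additions cannot move it to a value that merely looks valid.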

^ permalink raw reply related

* Re: Debugging Ethernet issues
From: Mason @ 2016-11-14 20:27 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Sebastian Frias, Andrew Lunn, netdev, Mans Rullgard,
	Sergei Shtylyov, Tom Lendacky, Zach Brown, Shaohui Xie, Tim Beale,
	Brian Hill, Vince Bridgers, Balakumaran Kannan, David S. Miller,
	Kirill Kapranov
In-Reply-To: <2187db98-dc5a-7a3c-7965-7ccbeffc0fa1@gmail.com>

On 14/11/2016 19:20, Florian Fainelli wrote:

> On 11/14/2016 09:59 AM, Sebastian Frias wrote:
>
>> Could you confirm that Mason's patch is correct and/or that it does not
>> have negative side-effects?
> 
> The patch is not correct nor incorrect per-se, it changes the default
> policy of having pause frames advertised by default to not having them
> advertised by default. This influences both your Ethernet MAC and the
> link partner in that the result is either flow control is enabled
> (before) or it is not (with the patch). There must be something amiss if
> you see packet loss or some kind of problem like that with an early
> exchange such as DHCP. Flow control tends to kick in under higher packet
> rates (at least, that's what you expect).

Did you note that, without the change under discussion (i.e. with
the eth driver as it is upstream), when the board is connected to
a 100 Mbps switch, then *nothing* works *systematically* (no ping,
no DHCP; are there other relevant low-level network tools?).

Also, maybe this comment was lost in my own noise:

If I manually set the link up, then down, then run udhcpc
=> then nothing works, as if something is wedged somewhere
(a kernel thread gets borked by a race condition?)

Could not advertising pause frames result in making such a
race condition impossible? (I don't really believe in a race,
due to the 100% nature of the problem.)

>> Right now we know that Mason's patch makes this work, but we do not understand
>> why nor its implications.
> 
> You need to understand why, right now, the way this problem is
> presented, you came up with a workaround, not with the root cause or the
> solution. What does your link partner (switch?) report, that is, what
> is the ethtool output when you have a link up from your nb8800 adapter?

Isn't that what ethtool -a eth0 prints?
How do I get the link partner information?
Just ethtool eth0?

Regards.

^ permalink raw reply

* Re: Debugging Ethernet issues
From: Florian Fainelli @ 2016-11-14 19:19 UTC (permalink / raw)
  To: Måns Rullgård
  Cc: Sebastian Frias, Mason, Andrew Lunn, netdev, Sergei Shtylyov,
	Tom Lendacky, Zach Brown, Shaohui Xie, Tim Beale, Brian Hill,
	Vince Bridgers, Balakumaran Kannan, David S. Miller,
	Kirill Kapranov
In-Reply-To: <yw1xpolxga3o.fsf@unicorn.mansr.com>

On 11/14/2016 11:00 AM, Måns Rullgård wrote:
> Florian Fainelli <f.fainelli@gmail.com> writes:
> 
>> On 11/14/2016 10:20 AM, Florian Fainelli wrote:
>>> On 11/14/2016 09:59 AM, Sebastian Frias wrote:
>>>> On 11/14/2016 06:32 PM, Florian Fainelli wrote:
>>>>> On 11/14/2016 07:33 AM, Mason wrote:
>>>>>> On 14/11/2016 15:58, Mason wrote:
>>>>>>
>>>>>>> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
>>>>>>> vs
>>>>>>> nb8800 26000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off
>>>>>>>
>>>>>>> I'm not sure whether "flow control" is relevant...
>>>>>>
>>>>>> Based on phy_print_status()
>>>>>> phydev->pause ? "rx/tx" : "off"
>>>>>> I added the following patch.
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
>>>>>> index defc22a15f67..4e758c1cfa4e 100644
>>>>>> --- a/drivers/net/ethernet/aurora/nb8800.c
>>>>>> +++ b/drivers/net/ethernet/aurora/nb8800.c
>>>>>> @@ -667,6 +667,8 @@ static void nb8800_link_reconfigure(struct net_device *dev)
>>>>>>         struct phy_device *phydev = priv->phydev;
>>>>>>         int change = 0;
>>>>>>  
>>>>>> +       printk("%s from %pf\n", __func__, __builtin_return_address(0));
>>>>>> +
>>>>>>         if (phydev->link) {
>>>>>>                 if (phydev->speed != priv->speed) {
>>>>>>                         priv->speed = phydev->speed;
>>>>>> @@ -1274,9 +1276,9 @@ static int nb8800_hw_init(struct net_device *dev)
>>>>>>         nb8800_writeb(priv, NB8800_PQ2, val & 0xff);
>>>>>>  
>>>>>>         /* Auto-negotiate by default */
>>>>>> -       priv->pause_aneg = true;
>>>>>> -       priv->pause_rx = true;
>>>>>> -       priv->pause_tx = true;
>>>>>> +       priv->pause_aneg = false;
>>>>>> +       priv->pause_rx = false;
>>>>>> +       priv->pause_tx = false;
>>>>>>  
>>>>>>         nb8800_mc_init(dev, 0);
>>>>>>  
>>>>>>
> 
> [...]
> 
>>>>> And the time difference is clearly accounted for by auto-negotiation time
>>>>> here, as you can see it takes about 3 seconds for Gigabit Ethernet to
>>>>> auto-negotiate and that seems completely acceptable and normal to me
>>>>> since it is a more involved process than lower speeds.
>>>>>
>>>>>>
>>>>>>
>>>>>> OK, so now it works (by accident?) even on 100 Mbps switch, but it still
>>>>>> prints "flow control rx/tx"...
>>>>>
>>>>> Because your link partner advertises flow control, and that's what
>>>>> phydev->pause and phydev->asym_pause report (I know it's confusing, but
>>>>> that's what it is at the moment).
>>>>
>>>> Thanks.
>>>> Could you confirm that Mason's patch is correct and/or that it does not
>>>> have negative side-effects?
>>>
>>> The patch is neither correct nor incorrect per se, it changes the default
>>> policy of having pause frames advertised by default to not having them
>>> advertised by default.
> 
> I was advised to advertise flow control by default back when I was
> working on the driver, and I think it makes sense to do so.
> 
>>> This influences both your Ethernet MAC and the link partner in that
>>> the result is either flow control is enabled (before) or it is not
>>> (with the patch). There must be something amiss if you see packet
>>> loss or some kind of problem like that with an early exchange such as
>>> DHCP. Flow control tends to kick in under higher packet rates (at
>>> least, that's what you expect).
>>>
>>>>
>>>> Right now we know that Mason's patch makes this work, but we do not
>>>> understand why nor its implications.
>>>
>>> You need to understand why. Right now, the way this problem is
>>> presented, you came up with a workaround, not with the root cause or the
>>> solution. What does your link partner (switch?) report, that is, what
>>> is the ethtool output when you have a link up from your nb8800 adapter?
>>
>> Actually, nb8800_pause_config() seems to be doing a complete MAC/DMA
>> reconfiguration when pause frames get auto-negotiated while the link is
>> UP,
> 
> This is due to a silly hardware limitation.  The register containing the
> flow control bits can't be written while rx is enabled.

You do a DMA stop, but unlike nb8800_stop() you don't disable the MAC
receiver. Why is calling nb8800_mac_rx() not necessary here?
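
To make the ordering concern concrete, here is a small userspace model
of the constraint described above (a register that cannot be written
while rx is enabled). Everything in it is hypothetical: struct fake_mac,
the FC_* bits and both helpers are invented for illustration and do not
correspond to the real nb8800 registers or driver functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flow control bits, not the real register layout. */
#define FC_RX_PAUSE 0x01
#define FC_TX_PAUSE 0x02

/* Toy model of a MAC whose flow control register is locked while rx is on. */
struct fake_mac {
	bool rx_enabled;
	uint8_t flow_ctrl;
};

/* Models the hardware: the write is dropped while the receiver is enabled. */
static bool fake_write_flow_ctrl(struct fake_mac *mac, uint8_t val)
{
	if (mac->rx_enabled)
		return false;

	mac->flow_ctrl = val;
	return true;
}

/* Disable rx, update the pause bits, then restore the previous rx state. */
static bool fake_pause_config(struct fake_mac *mac, bool rx, bool tx)
{
	bool was_enabled = mac->rx_enabled;
	uint8_t val = (rx ? FC_RX_PAUSE : 0) | (tx ? FC_TX_PAUSE : 0);
	bool ok;

	mac->rx_enabled = false;
	ok = fake_write_flow_ctrl(mac, val);
	mac->rx_enabled = was_enabled;

	return ok;
}
```

In this model a direct fake_write_flow_ctrl() call with rx enabled is
silently lost, while fake_pause_config() succeeds. Whether stopping DMA
alone has the same effect on the real hardware is exactly the open
question.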

> 
>> and it does not differentiate being called from
>> ethtool::set_pauseparam or the PHYLIB adjust_link callback (which it
>> probably should),
> 
> Differentiate how?

Differentiate in that, when you are called from adjust_link, why bother
checking netif_running(), since you are only configuring the pause
settings when phydev->link is set? Not that this matters much, but
that's something the caller can tell you.

> 
>> wondering if there is not a remote chance you can get the reply to
>> arrive right when you just got signaled a link UP?
> 
> If you're attempting to send or receive things before you get the link
> up notification, you shouldn't expect anything to work reliably.

No kidding.
-- 
Florian

^ permalink raw reply

