Netdev List


* Re: [PATCH net] sctp: fix the missing put_user when dumping transport thresholds
From: Marcelo Ricardo Leitner @ 2019-09-10  0:03 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, linux-sctp, davem, Neil Horman
In-Reply-To: <3fa4f7700c93f06530c80bc666d1696cb7c077de.1568014409.git.lucien.xin@gmail.com>

On Mon, Sep 09, 2019 at 03:33:29PM +0800, Xin Long wrote:
> This issue prevents the SCTP_PEER_ADDR_THLDS sockopt from dumping a
> transport's thresholds info.
> 
> Fix it by adding 'goto' put_user in sctp_getsockopt_paddr_thresholds.
> 
> Fixes: 8add543e369d ("sctp: add SCTP_FUTURE_ASSOC for SCTP_PEER_ADDR_THLDS sockopt")
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
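
The patch body is not quoted in this reply, so the following is only a
minimal user-space sketch of the bug shape the commit message describes:
one branch returns success before the result is copied out, and the fix
routes it through a shared copy-out label. The struct, the copy_out()
helper, and both function names are hypothetical stand-ins, not the
kernel's SCTP code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in; not the kernel's SCTP structures. */
struct thresholds {
	int pathmaxrxt;
	int pf_retrans;
};

/* Stand-in for copying the result back to the caller's buffer. */
static int copy_out(struct thresholds *dst, const struct thresholds *src)
{
	if (!dst)
		return -EFAULT;
	memcpy(dst, src, sizeof(*dst));
	return 0;
}

/* Buggy shape: the per-transport branch returns before the copy-out. */
static int get_thresholds_buggy(const struct thresholds *trans,
				const struct thresholds *assoc,
				struct thresholds *user_buf)
{
	struct thresholds val;

	if (trans) {
		val = *trans;
		return 0;	/* result never reaches user_buf */
	}
	val = *assoc;
	return copy_out(user_buf, &val);
}

/* Fixed shape: both branches converge on one copy-out label. */
static int get_thresholds_fixed(const struct thresholds *trans,
				const struct thresholds *assoc,
				struct thresholds *user_buf)
{
	struct thresholds val;

	if (trans) {
		val = *trans;
		goto put_user;
	}
	val = *assoc;

put_user:
	return copy_out(user_buf, &val);
}
```

In the buggy variant the caller sees a successful return but an untouched
buffer, which matches the "not able to dump" symptom in the description.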


* RE: [PATCH v1 net-next 00/15] tc-taprio offload for SJA1105 DSA
From: Gomes, Vinicius @ 2019-09-09 23:49 UTC (permalink / raw)
  To: David Miller, olteanv@gmail.com
  Cc: f.fainelli@gmail.com, vivien.didelot@gmail.com, andrew@lunn.ch,
	Patel, Vedang, richardcochran@gmail.com, Voon, Weifeng,
	jiri@mellanox.com, m-karicheri2@ti.com, Jose.Abreu@synopsys.com,
	ilias.apalodimas@linaro.org, jhs@mojatatu.com,
	xiyou.wangcong@gmail.com, kurt.kanzenbach@linutronix.de,
	netdev@vger.kernel.org
In-Reply-To: <20190907.155549.1880685136488421385.davem@davemloft.net>

Hi Vladimir,

> This is a warning that I will toss this patch series if it receives no serious review in
> the next couple of days.

Sorry about the delay in reviewing this. On top of the usual business, some changes to the
IT infrastructure here have hit my email workflow pretty hard.

In the meantime I am taking a look at the datasheet, since it's been a long time since I last
looked at it. The idea is to help review the scheduler from hell :-)

One thing that wasn't clear is what you did to test this series.

Cheers,
--
Vinicius




* Re: general protection fault in qdisc_put
From: Cong Wang @ 2019-09-09 23:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: syzbot, Akinobu Mita, Andrew Morton, David Miller, Dmitry Vyukov,
	Jamal Hadi Salim, Jiri Pirko, Linux List Kernel Mailing,
	Michal Hocko, Netdev, syzkaller-bugs
In-Reply-To: <CAHk-=wgZneAegyitz7f+JLjB6=28ewtvT7M4xy_a-wqsTjOX_w@mail.gmail.com>

On Sun, Sep 8, 2019 at 10:19 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> I see three solutions:
>
>  (a) move the
>
>         q->qdisc = &noop_qdisc;
>
>      up earlier in sfb_init(), so that qdisc is always initialized
> after sfb_init(), even on failure.
>
>  (b) make qdisc_put(NULL) just silently work as a no-op.
>
>  (c) change all the semantics to not call ->destroy if ->init failed.
>
> Honestly, (a) seems very fragile - do all the other init routines do
> this? And (c) sounds like a big change, and very fragile too.
>
> So I'd suggest that qdisc_put() be made to just ignore a NULL pointer
> (and maybe an error pointer too?).

I think (a) is the best solution here.

(c) changes too much, we already rely on this behavior.

(b) is not bad either, just very slightly more risky.

Alternatively, we can add a quick NULL check inside
sfb_destroy().

I can send out a patch if you don't.

Thanks for looking at this!
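
For reference, option (b) can be sketched in user space as follows. This
is only an illustration under stated assumptions: toy_qdisc, toy_sfb and
the toy_* functions are hypothetical miniatures, not the kernel's
struct Qdisc, qdisc_put() or the sfb code. The point is that a release
helper which ignores NULL makes "destroy after a failed init" safe
without every init routine having to pre-assign a placeholder child.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical miniature of a refcounted object. */
struct toy_qdisc {
	int refcnt;
};

/* Option (b): the release helper treats NULL as a no-op, like free(NULL). */
static void toy_qdisc_put(struct toy_qdisc *q)
{
	if (!q)
		return;
	if (--q->refcnt == 0)
		free(q);
}

struct toy_sfb {
	struct toy_qdisc *child;	/* stays NULL if init fails early */
};

/* An init routine that can fail before the child is assigned. */
static int toy_sfb_init(struct toy_sfb *s, int fail_early)
{
	s->child = NULL;
	if (fail_early)
		return -1;
	s->child = malloc(sizeof(*s->child));
	if (!s->child)
		return -1;
	s->child->refcnt = 1;
	return 0;
}

/* destroy runs even when init failed, so it must cope with a NULL child. */
static void toy_sfb_destroy(struct toy_sfb *s)
{
	toy_qdisc_put(s->child);
	s->child = NULL;
}
```

The alternative mentioned above, a NULL check inside the destroy routine
itself, would move the same test from toy_qdisc_put() into
toy_sfb_destroy().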


* Re: [PATCH v2] net: enable wireless core features with LEGACY_WEXT_ALLCONFIG
From: Greg KH @ 2019-09-09 22:58 UTC (permalink / raw)
  To: Mark Salyzyn
  Cc: linux-kernel, kernel-team, Johannes Berg, David S. Miller,
	Marcel Holtmann, linux-wireless, netdev, stable
In-Reply-To: <b7027a5d-5d75-677b-0e9b-cd70e5e30092@android.com>

On Mon, Sep 09, 2019 at 07:24:29AM -0700, Mark Salyzyn wrote:
> On 9/6/19 4:30 PM, Greg KH wrote:
> > On Fri, Sep 06, 2019 at 12:24:00PM -0700, Mark Salyzyn wrote:
> > > In embedded environments the requirements are to be able to pick and
> > > choose which features one requires built into the kernel.  If an
> > > embedded environment wants to support loading modules that have been
> > > kbuilt out of tree, there is a need to enable hidden configurations
> > > for legacy wireless core features to provide the API surface for
> > > them to load.
> > > 
> > > Introduce CONFIG_LEGACY_WEXT_ALLCONFIG to select all legacy wireless
> > > extension core features by activating in turn all the associated
> > > hidden configuration options, without having to specifically select
> > > any wireless module(s).
> > > 
> > > Signed-off-by: Mark Salyzyn <salyzyn@android.com>
> > > Cc: kernel-team@android.com
> > > Cc: Johannes Berg <johannes@sipsolutions.net>
> > > Cc: "David S. Miller" <davem@davemloft.net>
> > > Cc: Marcel Holtmann <marcel@holtmann.org>
> > > Cc: linux-wireless@vger.kernel.org
> > > Cc: netdev@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > Cc: stable@vger.kernel.org # 4.19
> > > ---
> > > v2: change name and documentation to CONFIG_LEGACY_WEXT_ALLCONFIG
> > > ---
> > >   net/wireless/Kconfig | 14 ++++++++++++++
> > >   1 file changed, 14 insertions(+)
> > > 
> > > diff --git a/net/wireless/Kconfig b/net/wireless/Kconfig
> > > index 67f8360dfcee..0d646cf28de5 100644
> > > --- a/net/wireless/Kconfig
> > > +++ b/net/wireless/Kconfig
> > > @@ -17,6 +17,20 @@ config WEXT_SPY
> > >   config WEXT_PRIV
> > >   	bool
> > > +config LEGACY_WEXT_ALLCONFIG
> > > +	bool "allconfig for legacy wireless extensions"
> > > +	select WIRELESS_EXT
> > > +	select WEXT_CORE
> > > +	select WEXT_PROC
> > > +	select WEXT_SPY
> > > +	select WEXT_PRIV
> > > +	help
> > > +	  Config option used to enable all the legacy wireless extensions to
> > > +	  the core functionality used by add-in modules.
> > > +
> > > +	  If you are not building a kernel to be used for a variety of
> > > +	  out-of-kernel built wireless modules, say N here.
> > > +
> > >   config CFG80211
> > >   	tristate "cfg80211 - wireless configuration API"
> > >   	depends on RFKILL || !RFKILL
> > > -- 
> > > 2.23.0.187.g17f5b7556c-goog
> > > 
> > How is this patch applicable to stable kernels???
> 
> A) worth a shot ;-}

Not nice, please, you know better :)

> B) there is a shortcoming in _all_ kernel versions with respect to hidden
> configuration options like this, hoping to set one precedent in how to
> handle them if acceptable to the community.

That's fine, but it's a new feature, not for stable.

> C) [AGENDA ALERT] Android _will_ be back-porting this to android-4.19 kernel
> anyway, would help maintenance if via stable. <holding hat in hand>

That's fine; lots of distros backport loads of new features that are
already upstream.  That's trivial to do, so please don't try to abuse
the stable tree for new features like this.  It only makes
maintainers grumpy when you do so :(

> D) Not an ABI or interface break, does not introduce instability, but rather
> keeps downstream kernels of any distributions from having to hack in their
> own alternate means of dealing with this problem leading to further
> fragmentation.

Again, new feature, not fixing a bug, so not applicable for stable.

For penance I require a handwritten copy of:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html

thanks,

greg k-h


* Re: [PATCH] libbpf: Don't error out if getsockopt() fails for XDP_OPTIONS
From: Toke Høiland-Jørgensen @ 2019-09-09 23:06 UTC (permalink / raw)
  To: Yonghong Song, Alexei Starovoitov, Daniel Borkmann,
	netdev@vger.kernel.org, bpf@vger.kernel.org
In-Reply-To: <8e909219-a225-b242-aaa5-bee1180aed48@fb.com>

Yonghong Song <yhs@fb.com> writes:

> On 9/9/19 10:46 AM, Toke Høiland-Jørgensen wrote:
>> The xsk_socket__create() function fails and returns an error if it cannot
>> get the XDP_OPTIONS through getsockopt(). However, support for XDP_OPTIONS
>> was not added until kernel 5.3, so this means that creating XSK sockets
>> always fails on older kernels.
>> 
>> Since the option is just used to set the zero-copy flag in the xsk struct,
>> there really is no need to error out if the getsockopt() call fails.
>> 
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> ---
>>   tools/lib/bpf/xsk.c | 8 ++------
>>   1 file changed, 2 insertions(+), 6 deletions(-)
>> 
>> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
>> index 680e63066cf3..598e487d9ce8 100644
>> --- a/tools/lib/bpf/xsk.c
>> +++ b/tools/lib/bpf/xsk.c
>> @@ -603,12 +603,8 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
>>   
>>   	optlen = sizeof(opts);
>>   	err = getsockopt(xsk->fd, SOL_XDP, XDP_OPTIONS, &opts, &optlen);
>> -	if (err) {
>> -		err = -errno;
>> -		goto out_mmap_tx;
>> -	}
>> -
>> -	xsk->zc = opts.flags & XDP_OPTIONS_ZEROCOPY;
>> +	if (!err)
>> +		xsk->zc = opts.flags & XDP_OPTIONS_ZEROCOPY;
>>   
>>   	if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
>>   		err = xsk_setup_xdp_prog(xsk);
>
> Since 'zc' is not used by anybody, maybe all 'zc'-related code can be
> removed? It can be added back once there is an interface that uses
> 'zc'.

Fine with me; up to the maintainers what they prefer, I guess? :)

-Toke
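
The best-effort pattern in the patch above can be sketched without a
real AF_XDP socket. In this minimal illustration the query is passed in
as a function pointer so both the "new kernel" and "old kernel" cases can
be exercised; every toy_* name and the flag value are hypothetical
stand-ins, not libbpf or kernel identifiers.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_OPT_ZEROCOPY 0x1u	/* hypothetical flag bit for this sketch */

struct toy_opts {
	unsigned int flags;
};

/* Signature of a query that may be unsupported on older kernels. */
typedef int (*toy_query_fn)(struct toy_opts *opts);

/* Simulates a kernel that knows the option. */
static int toy_query_supported(struct toy_opts *opts)
{
	opts->flags = TOY_OPT_ZEROCOPY;
	return 0;
}

/* Simulates an older kernel that rejects the option. */
static int toy_query_unsupported(struct toy_opts *opts)
{
	(void)opts;
	errno = ENOPROTOOPT;
	return -1;
}

/* Best-effort probe: a failed query leaves the flag at its default (false)
 * instead of failing the whole setup.
 */
static bool toy_probe_zerocopy(toy_query_fn query)
{
	struct toy_opts opts = { 0 };

	if (query(&opts) != 0)
		return false;
	return (opts.flags & TOY_OPT_ZEROCOPY) != 0;
}
```

The design choice is the same as in the diff: the option only feeds an
informational flag, so an unsupported query is treated as "flag not set"
rather than as an error.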


* Re: [PATCH v4 bpf-next 1/4] capability: introduce CAP_BPF and CAP_TRACING
From: Andy Lutomirski @ 2019-09-09 22:52 UTC (permalink / raw)
  To: Alexei Starovoitov, James Morris, LSM List, Kees Cook, Jann Horn,
	Steven Rostedt
  Cc: David S. Miller, Daniel Borkmann, Peter Zijlstra,
	Network Development, bpf, kernel-team, Linux API
In-Reply-To: <20190906231053.1276792-2-ast@kernel.org>

On Fri, Sep 6, 2019 at 4:10 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> Split BPF and perf/tracing operations that are allowed under
> CAP_SYS_ADMIN into corresponding CAP_BPF and CAP_TRACING.
> For backward compatibility include them in CAP_SYS_ADMIN as well.
>
> The end result provides a simple safety model for applications that use BPF:
> - for tracing program types
>   BPF_PROG_TYPE_{KPROBE, TRACEPOINT, PERF_EVENT, RAW_TRACEPOINT, etc}
>   use CAP_BPF and CAP_TRACING
> - for networking program types
>   BPF_PROG_TYPE_{SCHED_CLS, XDP, CGROUP_SKB, SK_SKB, etc}
>   use CAP_BPF and CAP_NET_ADMIN
>
> There are a few exceptions to this simple rule:
> - bpf_trace_printk() is allowed in networking programs, but it's using
>   ftrace mechanism, hence this helper needs additional CAP_TRACING.
> - cpumap is used by XDP programs. Currently it's kept under CAP_SYS_ADMIN,
>   but could be relaxed to CAP_NET_ADMIN in the future.
> - BPF_F_ZERO_SEED flag for hash/lru map is allowed under CAP_SYS_ADMIN only
>   to discourage production use.
> - BPF HW offload is allowed under CAP_SYS_ADMIN.
> - cg_sysctl, cg_device, lirc program types are neither networking nor tracing.
>   They can be loaded under CAP_BPF, but attach is allowed under CAP_NET_ADMIN.
>   This will be cleaned up in the future.
>
> userid=nobody + (CAP_TRACING | CAP_NET_ADMIN) + CAP_BPF is safer than
> typical setup with userid=root and sudo by existing bpf applications.
> It's not secure, since these capabilities:
> - allow bpf progs to access arbitrary memory
> - let tasks access any bpf map
> - let tasks attach/detach any bpf prog
>
> bpftool, bpftrace, bcc tools binaries should not be installed with
> cap_bpf+cap_tracing, since unpriv users will be able to read kernel secrets.
>
> CAP_BPF, CAP_NET_ADMIN, CAP_TRACING are roughly equal in terms of
> damage they can make to the system.
> Example:
> CAP_NET_ADMIN can stop network traffic. CAP_BPF can write into map
> and if that map is used by firewall-like bpf prog the network traffic
> may stop.
> CAP_BPF allows many bpf prog_load commands in parallel. The verifier
> may consume a large amount of memory and significantly slow down the system.
> CAP_TRACING allows many kprobes that can slow down the system.

Do we want to split CAP_TRACE_KERNEL and CAP_TRACE_USER?  It's not
entirely clear to me that it's useful.

>
> In the future more fine-grained bpf permissions may be added.
>
> Existing unprivileged BPF operations are not affected.
> In particular unprivileged users are allowed to load socket_filter and cg_skb
> program types and to create array, hash, prog_array, map-in-map map types.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  include/linux/capability.h          | 18 +++++++++++
>  include/uapi/linux/capability.h     | 49 ++++++++++++++++++++++++++++-
>  security/selinux/include/classmap.h |  4 +--
>  3 files changed, 68 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index ecce0f43c73a..13eb49c75797 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -247,6 +247,24 @@ static inline bool ns_capable_setid(struct user_namespace *ns, int cap)
>         return true;
>  }
>  #endif /* CONFIG_MULTIUSER */
> +
> +static inline bool capable_bpf(void)
> +{
> +       return capable(CAP_SYS_ADMIN) || capable(CAP_BPF);
> +}
> +static inline bool capable_tracing(void)
> +{
> +       return capable(CAP_SYS_ADMIN) || capable(CAP_TRACING);
> +}
> +static inline bool capable_bpf_tracing(void)
> +{
> +       return capable(CAP_SYS_ADMIN) || (capable(CAP_BPF) && capable(CAP_TRACING));
> +}
> +static inline bool capable_bpf_net_admin(void)
> +{
> +       return (capable(CAP_SYS_ADMIN) || capable(CAP_BPF)) && capable(CAP_NET_ADMIN);
> +}
> +

These helpers are all wrong, unfortunately, since they will produce
inappropriate audit events.  capable_bpf() should look more like this:

if (capable_noaudit(CAP_BPF))
  return capable(CAP_BPF);
if (capable_noaudit(CAP_SYS_ADMIN))
  return capable(CAP_SYS_ADMIN);

return capable(CAP_BPF);

James, etc: should there instead be new helpers to do this more
generically rather than going through the noaudit contortions?  My
code above is horrible.
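
To make the audit concern concrete, here is a small user-space
simulation. It is only a sketch: the toy_* types and functions are
hypothetical stand-ins for the kernel's capability checks, and the model
records only denials, which is an assumption made to keep it short. It
contrasts the helper shape from the patch with the probe-then-audit
shape suggested above.

```c
#include <stdbool.h>

enum toy_cap { TOY_CAP_SYS_ADMIN, TOY_CAP_BPF, TOY_CAP_COUNT };

struct toy_task {
	bool caps[TOY_CAP_COUNT];
	int denials[TOY_CAP_COUNT];	/* audit denial records per capability */
};

/* Audited check: a failure leaves a denial record. */
static bool toy_capable(struct toy_task *t, enum toy_cap cap)
{
	if (!t->caps[cap])
		t->denials[cap]++;
	return t->caps[cap];
}

/* Silent check: never writes an audit record. */
static bool toy_capable_noaudit(const struct toy_task *t, enum toy_cap cap)
{
	return t->caps[cap];
}

/* Shape of the helper in the patch: a task holding only the BPF
 * capability first fails the SYS_ADMIN check and logs a spurious denial.
 */
static bool toy_capable_bpf_naive(struct toy_task *t)
{
	return toy_capable(t, TOY_CAP_SYS_ADMIN) || toy_capable(t, TOY_CAP_BPF);
}

/* Shape suggested in the review: probe silently, then run the audited
 * check only on the capability that decides the outcome.
 */
static bool toy_capable_bpf(struct toy_task *t)
{
	if (toy_capable_noaudit(t, TOY_CAP_BPF))
		return toy_capable(t, TOY_CAP_BPF);
	if (toy_capable_noaudit(t, TOY_CAP_SYS_ADMIN))
		return toy_capable(t, TOY_CAP_SYS_ADMIN);
	return toy_capable(t, TOY_CAP_BPF);
}
```

With a task that holds only the BPF capability, the naive helper grants
access but leaves one SYS_ADMIN denial behind, while the second helper
grants access with no denial at all. A task holding neither capability
gets exactly one denial, attributed to the narrower capability.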


* Re: Default qdisc not correctly initialized with custom MTU
From: Cong Wang @ 2019-09-09 22:52 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: Netdev
In-Reply-To: <211c7151-7500-f895-7fd7-2c868dd48579@applied-asynchrony.com>

On Mon, Sep 9, 2019 at 5:44 AM Holger Hoffstätte
<holger@applied-asynchrony.com> wrote:
> I can't help but feel this is a slight bug in terms of initialization order,
> and that the default qdisc should only be created when it's first being
> used/attached to a link, not when the sysctls are configured.

Yeah, this is because the fq_codel qdisc is initialized once and
doesn't get any notification when the netdev's MTU gets changed.
We can "fix" this by adding a NETDEV_CHANGEMTU notifier to
qdisc's, but I don't know if it is really worth the effort.

Is there any reason you can't change that order?

Thanks.
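
A rough user-space model of the notifier idea, for illustration only.
All toy_* names are hypothetical; the kernel's real event is
NETDEV_CHANGEMTU, delivered through its netdevice notifier chain. The
thread does not say which qdisc field goes stale, so the MTU-derived
"quantum" below is an assumption chosen to show the effect.

```c
#include <stddef.h>

/* Hypothetical event type for this sketch. */
enum toy_event { TOY_CHANGEMTU };

struct toy_netdev {
	unsigned int mtu;
};

struct toy_qdisc {
	unsigned int quantum;	/* derived from the MTU at init time (assumed) */
};

#define TOY_HARD_HEADER_LEN 14u

static unsigned int toy_psched_mtu(const struct toy_netdev *dev)
{
	return dev->mtu + TOY_HARD_HEADER_LEN;
}

/* Without a notifier, this is the only time the MTU is sampled. */
static void toy_qdisc_init(struct toy_qdisc *q, const struct toy_netdev *dev)
{
	q->quantum = toy_psched_mtu(dev);
}

/* Notifier callback: refresh MTU-derived state when the MTU changes. */
static void toy_qdisc_notify(struct toy_qdisc *q, enum toy_event ev,
			     const struct toy_netdev *dev)
{
	if (ev == TOY_CHANGEMTU)
		q->quantum = toy_psched_mtu(dev);
}

/* Change the MTU and, if a listener is registered, tell it. */
static void toy_set_mtu(struct toy_netdev *dev, unsigned int mtu,
			struct toy_qdisc *listener)
{
	dev->mtu = mtu;
	if (listener)
		toy_qdisc_notify(listener, TOY_CHANGEMTU, dev);
}
```

Passing NULL as the listener reproduces the reported behaviour: the MTU
changes but the value computed at init time stays as it was.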


* Re: [PATCH] net/ibmvnic: Fix missing { in __ibmvnic_reset
From: Juliet Kim @ 2019-09-09 22:49 UTC (permalink / raw)
  To: Michal Suchanek, netdev, David S. Miller
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Falcon, John Allen, linuxppc-dev, linux-kernel
In-Reply-To: <20190909204451.7929-1-msuchanek@suse.de>

On 9/9/19 3:44 PM, Michal Suchanek wrote:
> Commit 1c2977c09499 ("net/ibmvnic: free reset work of removed device from queue")
> adds a } without corresponding { causing build break.
>
> Fixes: 1c2977c09499 ("net/ibmvnic: free reset work of removed device from queue")
> Signed-off-by: Michal Suchanek <msuchanek@suse.de>

Reviewed-by: Juliet Kim <julietk@linux.vnet.ibm.com>

> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 6644cabc8e75..5cb55ea671e3 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -1984,7 +1984,7 @@ static void __ibmvnic_reset(struct work_struct *work)
>  	rwi = get_next_rwi(adapter);
>  	while (rwi) {
>  		if (adapter->state == VNIC_REMOVING ||
> -		    adapter->state == VNIC_REMOVED)
> +		    adapter->state == VNIC_REMOVED) {
>  			kfree(rwi);
>  			rc = EBUSY;
>  			break;


* [net-next v2 00/15][pull request] Intel Wired LAN Driver Updates 2019-09-09
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann

This series contains a variety of cold and hot savoury changes to Intel
drivers.  Some of the fixes could be considered for stable even though
the author did not request it.

YueHaibing cleans up (i.e. removes) a function that has no caller in
the iavf driver, as reported by Hulk Robot.

Radoslaw fixes an ixgbevf issue where the VM has no link after the
hypervisor is restored from a low-power state, because the driver did
not properly restore device features that had been disabled during
the suspension.

Kai-Heng Feng modified e1000e to use mod_delayed_work() to help resolve
a hot plug speed detection issue by adding a deterministic 1 second
delay before running watchdog task after an interrupt.

Sasha moves static functions around in igc so that their forward
declarations are no longer needed.  Also adds a check during igc driver
probe to validate the NVM checksum, cleans up defines that were not
being used in the igc driver, and adds support for IP generic transmit
checksum offload in the igc driver.

Updated the iavf kernel documentation by a developer with no life.

Jake provides another fm10k update to a local variable for ease of code
readability.

Mitch fixes the iavf driver to allow the VF to override the MAC address
set by the host, if the VF is in "trusted" mode.

Mauro S. M. Rodrigues provides several changes for i40e driver, first
with resolving hw_dbg usage and referencing a i40e_hw attribute.  Also
implemented a debug macro using pr_debug, since the use of netdev_dbg
could cause a NULL pointer dereference during probe.  Finally cleaned up
code that is no longer used or needed.

Firo Yang provides a change in the ixgbe driver to ensure we sync the
first fragment unconditionally to help resolve an issue seen in the XEN
environment when the upper network stack could receive an incomplete
network packet.

Mariusz adds a missing device to the i40e PCI table in the driver.

v2: Mauro S. M. Rodrigues updated patches 10 & 11 of the series based on
    feedback from Jakub Kicinski.  Also updated patch 13 description so
    that the "Fixes:" tag was not wrapped.

The following are changes since commit 6703a605b5ab33502d7a327de880188013d7c377:
  Merge branch 'net-tls-small-TX-offload-optimizations'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 10GbE

Firo Yang (1):
  ixgbe: sync the first fragment unconditionally

Jacob Keller (1):
  fm10k: use a local variable for the frag pointer

Jeff Kirsher (1):
  Documentation: iavf: Update the Intel LAN driver doc for iavf

Kai-Heng Feng (1):
  e1000e: Make speed detection on hotplugging cable more reliable

Mariusz Stachura (1):
  i40e: Add support for X710 device

Mauro S. M. Rodrigues (3):
  i40e: fix hw_dbg usage in i40e_hmc_get_object_va
  i40e: Implement debug macro hw_dbg using dev_dbg
  i40e: Remove EMPR traces from debugfs facility

Mitch Williams (1):
  iavf: allow permanent MAC address to change

Radoslaw Tyl (1):
  ixgbevf: Link lost in VM on ixgbevf when restoring from freeze or
    suspend

Sasha Neftin (4):
  igc: Remove useless forward declaration
  igc: Add NVM checksum validation
  igc: Remove unneeded PCI bus defines
  igc: Add tx_csum offload functionality

YueHaibing (1):
  iavf: remove unused debug function iavf_debug_d

 .../networking/device_drivers/intel/iavf.rst  | 115 ++++++++---
 drivers/net/ethernet/intel/e1000e/netdev.c    |  12 +-
 drivers/net/ethernet/intel/fm10k/fm10k_main.c |   8 +-
 drivers/net/ethernet/intel/i40e/i40e.h        |   1 -
 drivers/net/ethernet/intel/i40e/i40e_common.c |   1 +
 .../net/ethernet/intel/i40e/i40e_debugfs.c    |   4 -
 drivers/net/ethernet/intel/i40e/i40e_hmc.c    |   1 +
 .../net/ethernet/intel/i40e/i40e_lan_hmc.c    |  21 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c   |   1 +
 drivers/net/ethernet/intel/i40e/i40e_osdep.h  |   5 +-
 drivers/net/ethernet/intel/iavf/iavf.h        |   1 -
 drivers/net/ethernet/intel/iavf/iavf_main.c   |  26 ---
 drivers/net/ethernet/intel/igc/igc.h          |   4 +
 drivers/net/ethernet/intel/igc/igc_base.h     |   8 +
 drivers/net/ethernet/intel/igc/igc_defines.h  |   9 +-
 drivers/net/ethernet/intel/igc/igc_mac.c      |  73 ++++---
 drivers/net/ethernet/intel/igc/igc_main.c     | 106 ++++++++++
 drivers/net/ethernet/intel/igc/igc_phy.c      | 192 +++++++++---------
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  16 +-
 .../net/ethernet/intel/ixgbevf/ixgbevf_main.c |   1 +
 20 files changed, 373 insertions(+), 232 deletions(-)

-- 
2.21.0


* [net-next v2 01/15] iavf: remove unused debug function iavf_debug_d
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem
  Cc: YueHaibing, netdev, nhorman, sassmann, Hulk Robot, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: YueHaibing <yuehaibing@huawei.com>

There is no caller of function iavf_debug_d() in tree since
commit 75051ce4c5d8 ("iavf: Fix up debug print macro"),
so it can be removed.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 22 ---------------------
 1 file changed, 22 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 9d2b50964a08..554aa619ff02 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -142,28 +142,6 @@ enum iavf_status iavf_free_virt_mem_d(struct iavf_hw *hw,
 	return 0;
 }
 
-/**
- * iavf_debug_d - OS dependent version of debug printing
- * @hw:  pointer to the HW structure
- * @mask: debug level mask
- * @fmt_str: printf-type format description
- **/
-void iavf_debug_d(void *hw, u32 mask, char *fmt_str, ...)
-{
-	char buf[512];
-	va_list argptr;
-
-	if (!(mask & ((struct iavf_hw *)hw)->debug_mask))
-		return;
-
-	va_start(argptr, fmt_str);
-	vsnprintf(buf, sizeof(buf), fmt_str, argptr);
-	va_end(argptr);
-
-	/* the debug string is already formatted with a newline */
-	pr_info("%s", buf);
-}
-
 /**
  * iavf_schedule_reset - Set the flags and schedule a reset event
  * @adapter: board private structure
-- 
2.21.0



* [net-next v2 07/15] igc: Add NVM checksum validation
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: Sasha Neftin <sasha.neftin@intel.com>

Add NVM checksum validation during probe.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 251552855c40..965d1c939f0f 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -4133,6 +4133,15 @@ static int igc_probe(struct pci_dev *pdev,
 	 */
 	hw->mac.ops.reset_hw(hw);
 
+	if (igc_get_flash_presence_i225(hw)) {
+		if (hw->nvm.ops.validate(hw) < 0) {
+			dev_err(&pdev->dev,
+				"The NVM Checksum Is Not Valid\n");
+			err = -EIO;
+			goto err_eeprom;
+		}
+	}
+
 	if (eth_platform_get_mac_address(&pdev->dev, hw->mac.addr)) {
 		/* copy the MAC address out of the NVM */
 		if (hw->mac.ops.read_mac_addr(hw))
-- 
2.21.0
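
The patch only shows the call to hw->nvm.ops.validate(), not what the
validation does. As a hedged illustration of the general technique, the
sketch below sums a fixed range of 16-bit NVM words and compares the
result with a known constant. The word index 0x3F and the target sum
0xBABA are assumptions modeled on the convention of Intel's older
e1000-family drivers; all toy_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NVM_CHECKSUM_WORD	0x3F	/* index of the checksum word (assumed) */
#define TOY_NVM_SUM		0xBABA	/* expected 16-bit sum (assumed) */

/* Sum words 0 through the checksum word, wrapping at 16 bits. */
static uint16_t toy_nvm_sum(const uint16_t *words)
{
	uint16_t sum = 0;
	size_t i;

	for (i = 0; i <= TOY_NVM_CHECKSUM_WORD; i++)
		sum = (uint16_t)(sum + words[i]);
	return sum;
}

/* The image is valid when the wrapped sum matches the constant. */
static bool toy_nvm_checksum_valid(const uint16_t *words)
{
	return toy_nvm_sum(words) == TOY_NVM_SUM;
}

/* Store the word that makes the total come out to TOY_NVM_SUM. */
static void toy_nvm_update_checksum(uint16_t *words)
{
	uint16_t sum = 0;
	size_t i;

	for (i = 0; i < TOY_NVM_CHECKSUM_WORD; i++)
		sum = (uint16_t)(sum + words[i]);
	words[TOY_NVM_CHECKSUM_WORD] = (uint16_t)(TOY_NVM_SUM - sum);
}
```

A probe routine built on this would refuse to continue when
toy_nvm_checksum_valid() returns false, which mirrors the -EIO path in
the diff.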



* [net-next v2 09/15] igc: Remove unneeded PCI bus defines
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: Sasha Neftin <sasha.neftin@intel.com>

The PCIe Device Control 2 defines are not used internally.
This patch removes them.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_defines.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 11b99acf4abe..549134ecd105 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -10,10 +10,6 @@
 
 #define IGC_CTRL_EXT_DRV_LOAD	0x10000000 /* Drv loaded bit for FW */
 
-/* PCI Bus Info */
-#define PCIE_DEVICE_CONTROL2		0x28
-#define PCIE_DEVICE_CONTROL2_16ms	0x0005
-
 /* Physical Func Reset Done Indication */
 #define IGC_CTRL_EXT_LINK_MODE_MASK	0x00C00000
 
-- 
2.21.0



* [net-next v2 15/15] i40e: Add support for X710 device
From: Jeff Kirsher @ 2019-09-09 22:48 UTC (permalink / raw)
  To: davem
  Cc: Mariusz Stachura, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: Mariusz Stachura <mariusz.stachura@intel.com>

Add I40E_DEV_ID_10G_BASE_T_BC to i40e_pci_tbl

Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 3c8a2f55c43a..e9f2f276bf27 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -73,6 +73,7 @@ static const struct pci_device_id i40e_pci_tbl[] = {
 	{PCI_VDEVICE(INTEL, I40E_DEV_ID_QSFP_C), 0},
 	{PCI_VDEVICE(INTEL, I40E_DEV_ID_10G_BASE_T), 0},
 	{PCI_VDEVICE(INTEL, I40E_DEV_ID_10G_BASE_T4), 0},
+	{PCI_VDEVICE(INTEL, I40E_DEV_ID_10G_BASE_T_BC), 0},
 	{PCI_VDEVICE(INTEL, I40E_DEV_ID_10G_SFP), 0},
 	{PCI_VDEVICE(INTEL, I40E_DEV_ID_10G_B), 0},
 	{PCI_VDEVICE(INTEL, I40E_DEV_ID_KX_X722), 0},
-- 
2.21.0



* [net-next v2 12/15] i40e: Remove EMPR traces from debugfs facility
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem
  Cc: Mauro S. M. Rodrigues, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>

Since commit 5098850c9b9b ("i40e/i40evf: i40e_register.h updates")
it is no longer possible to trigger an EMP Reset from debugfs. The
command can still be issued, though, and only ends in a bad reset request:

echo empr > /sys/kernel/debug/i40e/0002\:01\:00.1/command
i40e 0002:01:00.1: debugfs: forcing EMPR
i40e 0002:01:00.1: bad reset request 0x00010000

So remove this piece of code and show the list of valid commands instead,
as already happens when any invalid command is issued.

Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e.h         | 1 -
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 4 ----
 2 files changed, 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 3e535d3263b3..f1a1bd324b50 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -131,7 +131,6 @@ enum i40e_state_t {
 	__I40E_PF_RESET_REQUESTED,
 	__I40E_CORE_RESET_REQUESTED,
 	__I40E_GLOBAL_RESET_REQUESTED,
-	__I40E_EMP_RESET_REQUESTED,
 	__I40E_EMP_RESET_INTR_RECEIVED,
 	__I40E_SUSPENDED,
 	__I40E_PTP_TX_IN_PROGRESS,
diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
index 41232898d8ae..99ea543dd245 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
@@ -1125,10 +1125,6 @@ static ssize_t i40e_dbg_command_write(struct file *filp,
 		dev_info(&pf->pdev->dev, "debugfs: forcing GlobR\n");
 		i40e_do_reset_safe(pf, BIT(__I40E_GLOBAL_RESET_REQUESTED));
 
-	} else if (strncmp(cmd_buf, "empr", 4) == 0) {
-		dev_info(&pf->pdev->dev, "debugfs: forcing EMPR\n");
-		i40e_do_reset_safe(pf, BIT(__I40E_EMP_RESET_REQUESTED));
-
 	} else if (strncmp(cmd_buf, "read", 4) == 0) {
 		u32 address;
 		u32 value;
-- 
2.21.0



* [net-next v2 14/15] igc: Add tx_csum offload functionality
From: Jeff Kirsher @ 2019-09-09 22:48 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: Sasha Neftin <sasha.neftin@intel.com>

Add IP generic TX checksum offload functionality.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h         |  4 +
 drivers/net/ethernet/intel/igc/igc_base.h    |  8 ++
 drivers/net/ethernet/intel/igc/igc_defines.h |  5 +
 drivers/net/ethernet/intel/igc/igc_main.c    | 97 ++++++++++++++++++++
 4 files changed, 114 insertions(+)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 0f5534ce27b0..7e16345d836e 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -135,6 +135,9 @@ extern char igc_driver_version[];
 /* How many Rx Buffers do we bundle into one write to the hardware ? */
 #define IGC_RX_BUFFER_WRITE	16 /* Must be power of 2 */
 
+/* VLAN info */
+#define IGC_TX_FLAGS_VLAN_MASK	0xffff0000
+
 /* igc_test_staterr - tests bits within Rx descriptor status and error fields */
 static inline __le32 igc_test_staterr(union igc_adv_rx_desc *rx_desc,
 				      const u32 stat_err_bits)
@@ -254,6 +257,7 @@ struct igc_ring {
 	u16 count;                      /* number of desc. in the ring */
 	u8 queue_index;                 /* logical index of the ring*/
 	u8 reg_idx;                     /* physical index of the ring */
+	bool launchtime_enable;		/* true if LaunchTime is enabled */
 
 	/* everything past this point are written often */
 	u16 next_to_clean;
diff --git a/drivers/net/ethernet/intel/igc/igc_base.h b/drivers/net/ethernet/intel/igc/igc_base.h
index 58d1109d7f3f..ea627ce52525 100644
--- a/drivers/net/ethernet/intel/igc/igc_base.h
+++ b/drivers/net/ethernet/intel/igc/igc_base.h
@@ -22,6 +22,14 @@ union igc_adv_tx_desc {
 	} wb;
 };
 
+/* Context descriptors */
+struct igc_adv_tx_context_desc {
+	__le32 vlan_macip_lens;
+	__le32 launch_time;
+	__le32 type_tucmd_mlhl;
+	__le32 mss_l4len_idx;
+};
+
 /* Adv Transmit Descriptor Config Masks */
 #define IGC_ADVTXD_MAC_TSTAMP	0x00080000 /* IEEE1588 Timestamp packet */
 #define IGC_ADVTXD_DTYP_CTXT	0x00200000 /* Advanced Context Descriptor */
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 549134ecd105..f3f2325fe567 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -397,4 +397,9 @@
 #define IGC_VLAPQF_P_VALID(_n)	(0x1 << (3 + (_n) * 4))
 #define IGC_VLAPQF_QUEUE_MASK	0x03
 
+#define IGC_ADVTXD_MACLEN_SHIFT		9  /* Adv ctxt desc mac len shift */
+#define IGC_ADVTXD_TUCMD_IPV4		0x00000400  /* IP Packet Type:1=IPv4 */
+#define IGC_ADVTXD_TUCMD_L4T_TCP	0x00000800  /* L4 Packet Type of TCP */
+#define IGC_ADVTXD_TUCMD_L4T_SCTP	0x00001000 /* L4 packet TYPE of SCTP */
+
 #endif /* _IGC_DEFINES_H_ */
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 965d1c939f0f..63b62d74f961 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -5,6 +5,11 @@
 #include <linux/types.h>
 #include <linux/if_vlan.h>
 #include <linux/aer.h>
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#include <linux/ip.h>
+
+#include <net/ipv6.h>
 
 #include "igc.h"
 #include "igc_hw.h"
@@ -790,8 +795,96 @@ static int igc_set_mac(struct net_device *netdev, void *p)
 	return 0;
 }
 
+static void igc_tx_ctxtdesc(struct igc_ring *tx_ring,
+			    struct igc_tx_buffer *first,
+			    u32 vlan_macip_lens, u32 type_tucmd,
+			    u32 mss_l4len_idx)
+{
+	struct igc_adv_tx_context_desc *context_desc;
+	u16 i = tx_ring->next_to_use;
+	struct timespec64 ts;
+
+	context_desc = IGC_TX_CTXTDESC(tx_ring, i);
+
+	i++;
+	tx_ring->next_to_use = (i < tx_ring->count) ? i : 0;
+
+	/* set bits to identify this as an advanced context descriptor */
+	type_tucmd |= IGC_TXD_CMD_DEXT | IGC_ADVTXD_DTYP_CTXT;
+
+	/* For 82575, context index must be unique per ring. */
+	if (test_bit(IGC_RING_FLAG_TX_CTX_IDX, &tx_ring->flags))
+		mss_l4len_idx |= tx_ring->reg_idx << 4;
+
+	context_desc->vlan_macip_lens	= cpu_to_le32(vlan_macip_lens);
+	context_desc->type_tucmd_mlhl	= cpu_to_le32(type_tucmd);
+	context_desc->mss_l4len_idx	= cpu_to_le32(mss_l4len_idx);
+
+	/* We assume there is always a valid Tx time available. Invalid times
+	 * should have been handled by the upper layers.
+	 */
+	if (tx_ring->launchtime_enable) {
+		ts = ns_to_timespec64(first->skb->tstamp);
+		first->skb->tstamp = 0;
+		context_desc->launch_time = cpu_to_le32(ts.tv_nsec / 32);
+	} else {
+		context_desc->launch_time = 0;
+	}
+}
+
+static inline bool igc_ipv6_csum_is_sctp(struct sk_buff *skb)
+{
+	unsigned int offset = 0;
+
+	ipv6_find_hdr(skb, &offset, IPPROTO_SCTP, NULL, NULL);
+
+	return offset == skb_checksum_start_offset(skb);
+}
+
 static void igc_tx_csum(struct igc_ring *tx_ring, struct igc_tx_buffer *first)
 {
+	struct sk_buff *skb = first->skb;
+	u32 vlan_macip_lens = 0;
+	u32 type_tucmd = 0;
+
+	if (skb->ip_summed != CHECKSUM_PARTIAL) {
+csum_failed:
+		if (!(first->tx_flags & IGC_TX_FLAGS_VLAN) &&
+		    !tx_ring->launchtime_enable)
+			return;
+		goto no_csum;
+	}
+
+	switch (skb->csum_offset) {
+	case offsetof(struct tcphdr, check):
+		type_tucmd = IGC_ADVTXD_TUCMD_L4T_TCP;
+		/* fall through */
+	case offsetof(struct udphdr, check):
+		break;
+	case offsetof(struct sctphdr, checksum):
+		/* validate that this is actually an SCTP request */
+		if ((first->protocol == htons(ETH_P_IP) &&
+		     (ip_hdr(skb)->protocol == IPPROTO_SCTP)) ||
+		    (first->protocol == htons(ETH_P_IPV6) &&
+		     igc_ipv6_csum_is_sctp(skb))) {
+			type_tucmd = IGC_ADVTXD_TUCMD_L4T_SCTP;
+			break;
+		}
+		/* fall through */
+	default:
+		skb_checksum_help(skb);
+		goto csum_failed;
+	}
+
+	/* update TX checksum flag */
+	first->tx_flags |= IGC_TX_FLAGS_CSUM;
+	vlan_macip_lens = skb_checksum_start_offset(skb) -
+			  skb_network_offset(skb);
+no_csum:
+	vlan_macip_lens |= skb_network_offset(skb) << IGC_ADVTXD_MACLEN_SHIFT;
+	vlan_macip_lens |= first->tx_flags & IGC_TX_FLAGS_VLAN_MASK;
+
+	igc_tx_ctxtdesc(tx_ring, first, vlan_macip_lens, type_tucmd, 0);
 }
 
 static int __igc_maybe_stop_tx(struct igc_ring *tx_ring, const u16 size)
@@ -4116,6 +4209,9 @@ static int igc_probe(struct pci_dev *pdev,
 	if (err)
 		goto err_sw_init;
 
+	/* Add supported features to the features list*/
+	netdev->features |= NETIF_F_HW_CSUM;
+
 	/* setup the private structure */
 	err = igc_sw_init(adapter);
 	if (err)
@@ -4123,6 +4219,7 @@ static int igc_probe(struct pci_dev *pdev,
 
 	/* copy netdev features into list of user selectable features */
 	netdev->hw_features |= NETIF_F_NTUPLE;
+	netdev->hw_features |= netdev->features;
 
 	/* MTU range: 68 - 9216 */
 	netdev->min_mtu = ETH_MIN_MTU;
-- 
2.21.0


^ permalink raw reply related

* [net-next v2 13/15] ixgbe: sync the first fragment unconditionally
From: Jeff Kirsher @ 2019-09-09 22:48 UTC (permalink / raw)
  To: davem
  Cc: Firo Yang, netdev, nhorman, sassmann, Alexander Duyck,
	Andrew Bowers, Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: Firo Yang <firo.yang@suse.com>

In a Xen environment, if Xen-swiotlb is enabled, the ixgbe driver
could possibly allocate a page, a DMA memory buffer, for the first
fragment which is not suitable for Xen-swiotlb to do DMA operations.
Xen-swiotlb has to internally allocate another page for doing DMA
operations. This mechanism requires syncing the data from the internal
page to the page which ixgbe sends to the upper network stack. However,
since commit f3213d932173 ("ixgbe: Update driver to make use of DMA
attributes in Rx path"), the unmap operation is performed with
DMA_ATTR_SKIP_CPU_SYNC. As a result, the sync is not performed.
Since the sync isn't performed, the upper network stack could receive
an incomplete network packet. By incomplete, it means the linear data
on the first fragment (between skb->head and skb->end) is invalid. So
we have to copy the data from the internal xen-swiotlb page to the page
which ixgbe sends to the upper network stack through the sync operation.

More details from Alexander Duyck:
Specifically since we are mapping the frame with
DMA_ATTR_SKIP_CPU_SYNC we have to unmap with that as well. As a result
a sync is not performed on an unmap and must be done manually as we
skipped it for the first frag. As such we need to always sync before
possibly performing a page unmap operation.
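
The ordering problem described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: the
toy_* names and TOY_ATTR_SKIP_CPU_SYNC are hypothetical stand-ins, not
the kernel DMA API, and the bounce buffer stands in for the page that
Xen-swiotlb allocates internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE		64
#define TOY_ATTR_SKIP_CPU_SYNC	0x1UL	/* hypothetical stand-in attribute */

/* Hypothetical model of a mapping that goes through a bounce buffer. */
struct toy_mapping {
	unsigned char bounce[TOY_BUF_SIZE];	/* what the device wrote */
	unsigned char page[TOY_BUF_SIZE];	/* what the driver passes up */
	bool mapped;
};

/* Copy device-written data from the bounce buffer into the driver page. */
static void toy_sync_for_cpu(struct toy_mapping *m, size_t offset, size_t len)
{
	assert(m != NULL);
	if (m->mapped && offset <= TOY_BUF_SIZE && len <= TOY_BUF_SIZE - offset)
		memcpy(m->page + offset, m->bounce + offset, len);
}

/* Tear down the mapping; with the skip attribute no copy happens here. */
static void toy_unmap(struct toy_mapping *m, unsigned long attrs)
{
	if (!(attrs & TOY_ATTR_SKIP_CPU_SYNC))
		toy_sync_for_cpu(m, 0, TOY_BUF_SIZE);
	m->mapped = false;
}

/* Old ordering: unmap only, so the driver page never sees the data. */
int toy_rx_unmap_only(void)
{
	struct toy_mapping m = { .mapped = true };

	memset(m.bounce, 0xab, sizeof(m.bounce));
	toy_unmap(&m, TOY_ATTR_SKIP_CPU_SYNC);
	return m.page[0];
}

/* New ordering: always sync first, then unmap the released page. */
int toy_rx_sync_then_unmap(void)
{
	struct toy_mapping m = { .mapped = true };

	memset(m.bounce, 0xab, sizeof(m.bounce));
	toy_sync_for_cpu(&m, 0, TOY_BUF_SIZE);
	toy_unmap(&m, TOY_ATTR_SKIP_CPU_SYNC);
	return m.page[0];
}
```

In this model toy_rx_unmap_only() returns 0 (stale data), while
toy_rx_sync_then_unmap() returns 0xab, which is the reason for syncing
unconditionally before a possible unmap.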

Fixes: f3213d932173 ("ixgbe: Update driver to make use of DMA attributes in Rx path")
Signed-off-by: Firo Yang <firo.yang@suse.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 9bcae44e9883..99df595abfba 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1825,13 +1825,7 @@ static void ixgbe_pull_tail(struct ixgbe_ring *rx_ring,
 static void ixgbe_dma_sync_frag(struct ixgbe_ring *rx_ring,
 				struct sk_buff *skb)
 {
-	/* if the page was released unmap it, else just sync our portion */
-	if (unlikely(IXGBE_CB(skb)->page_released)) {
-		dma_unmap_page_attrs(rx_ring->dev, IXGBE_CB(skb)->dma,
-				     ixgbe_rx_pg_size(rx_ring),
-				     DMA_FROM_DEVICE,
-				     IXGBE_RX_DMA_ATTR);
-	} else if (ring_uses_build_skb(rx_ring)) {
+	if (ring_uses_build_skb(rx_ring)) {
 		unsigned long offset = (unsigned long)(skb->data) & ~PAGE_MASK;

 		dma_sync_single_range_for_cpu(rx_ring->dev,
@@ -1848,6 +1842,14 @@ static void ixgbe_dma_sync_frag(struct ixgbe_ring *rx_ring,
 					      skb_frag_size(frag),
 					      DMA_FROM_DEVICE);
 	}
+
+	/* If the page was released, just unmap it. */
+	if (unlikely(IXGBE_CB(skb)->page_released)) {
+		dma_unmap_page_attrs(rx_ring->dev, IXGBE_CB(skb)->dma,
+				     ixgbe_rx_pg_size(rx_ring),
+				     DMA_FROM_DEVICE,
+				     IXGBE_RX_DMA_ATTR);
+	}
 }

 /**
-- 
2.21.0

^ permalink raw reply related

* [net-next v2 10/15] i40e: fix hw_dbg usage in i40e_hmc_get_object_va
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem
  Cc: Mauro S. M. Rodrigues, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>

The mentioned function references an i40e_hw attribute, as a parameter for
hw_dbg, but it doesn't exist in the function scope.
Fix it by changing the parameter from i40e_hmc_info to i40e_hw, from which
the necessary i40e_hmc_info can be retrieved.
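
As a rough illustration of the pattern (not the driver code), a helper
can take the parent structure and derive the embedded one locally, so
that both are in scope. All toy_* names below are hypothetical, and the
counter merely stands in for a hw_dbg() call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for i40e_hmc_info and i40e_hw. */
struct toy_hmc_info {
	unsigned int obj_cnt;
};

struct toy_hw {
	struct toy_hmc_info hmc;
	unsigned int dbg_count;		/* counts debug events */
};

/* Takes the parent so that hw (for diagnostics) and hw->hmc are both usable. */
int toy_get_object(struct toy_hw *hw, unsigned int obj_idx)
{
	struct toy_hmc_info *hmc_info;

	assert(hw != NULL);
	hmc_info = &hw->hmc;

	if (obj_idx >= hmc_info->obj_cnt) {
		hw->dbg_count++;	/* stands in for hw_dbg(hw, ...) */
		return -1;
	}

	return 0;
}
```

Callers then pass hw instead of &hw->hmc, as the call sites in the diff
below do.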

v2:
 - Fixed reverse xmas tree code style issue as suggested by Jakub Kicinski

Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 .../net/ethernet/intel/i40e/i40e_lan_hmc.c    | 21 ++++++++++---------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_lan_hmc.c b/drivers/net/ethernet/intel/i40e/i40e_lan_hmc.c
index 994011c38fb4..be24d42280d8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_lan_hmc.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_lan_hmc.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright(c) 2013 - 2018 Intel Corporation. */
 
+#include "i40e.h"
 #include "i40e_osdep.h"
 #include "i40e_register.h"
 #include "i40e_type.h"
@@ -963,7 +964,7 @@ static i40e_status i40e_set_hmc_context(u8 *context_bytes,
 
 /**
  * i40e_hmc_get_object_va - retrieves an object's virtual address
- * @hmc_info: pointer to i40e_hmc_info struct
+ * @hw: the hardware struct, from which we obtain the i40e_hmc_info pointer
  * @object_base: pointer to u64 to get the va
  * @rsrc_type: the hmc resource type
  * @obj_idx: hmc object index
@@ -972,16 +973,16 @@ static i40e_status i40e_set_hmc_context(u8 *context_bytes,
  * base pointer.  This function is used for LAN Queue contexts.
  **/
 static
-i40e_status i40e_hmc_get_object_va(struct i40e_hmc_info *hmc_info,
-					u8 **object_base,
-					enum i40e_hmc_lan_rsrc_type rsrc_type,
-					u32 obj_idx)
+i40e_status i40e_hmc_get_object_va(struct i40e_hw *hw, u8 **object_base,
+				   enum i40e_hmc_lan_rsrc_type rsrc_type,
+				   u32 obj_idx)
 {
+	struct i40e_hmc_info *hmc_info = &hw->hmc;
 	u32 obj_offset_in_sd, obj_offset_in_pd;
-	i40e_status ret_code = 0;
 	struct i40e_hmc_sd_entry *sd_entry;
 	struct i40e_hmc_pd_entry *pd_entry;
 	u32 pd_idx, pd_lmt, rel_pd_idx;
+	i40e_status ret_code = 0;
 	u64 obj_offset_in_fpm;
 	u32 sd_idx, sd_lmt;
 
@@ -1047,7 +1048,7 @@ i40e_status i40e_clear_lan_tx_queue_context(struct i40e_hw *hw,
 	i40e_status err;
 	u8 *context_bytes;
 
-	err = i40e_hmc_get_object_va(&hw->hmc, &context_bytes,
+	err = i40e_hmc_get_object_va(hw, &context_bytes,
 				     I40E_HMC_LAN_TX, queue);
 	if (err < 0)
 		return err;
@@ -1068,7 +1069,7 @@ i40e_status i40e_set_lan_tx_queue_context(struct i40e_hw *hw,
 	i40e_status err;
 	u8 *context_bytes;
 
-	err = i40e_hmc_get_object_va(&hw->hmc, &context_bytes,
+	err = i40e_hmc_get_object_va(hw, &context_bytes,
 				     I40E_HMC_LAN_TX, queue);
 	if (err < 0)
 		return err;
@@ -1088,7 +1089,7 @@ i40e_status i40e_clear_lan_rx_queue_context(struct i40e_hw *hw,
 	i40e_status err;
 	u8 *context_bytes;
 
-	err = i40e_hmc_get_object_va(&hw->hmc, &context_bytes,
+	err = i40e_hmc_get_object_va(hw, &context_bytes,
 				     I40E_HMC_LAN_RX, queue);
 	if (err < 0)
 		return err;
@@ -1109,7 +1110,7 @@ i40e_status i40e_set_lan_rx_queue_context(struct i40e_hw *hw,
 	i40e_status err;
 	u8 *context_bytes;
 
-	err = i40e_hmc_get_object_va(&hw->hmc, &context_bytes,
+	err = i40e_hmc_get_object_va(hw, &context_bytes,
 				     I40E_HMC_LAN_RX, queue);
 	if (err < 0)
 		return err;
-- 
2.21.0


^ permalink raw reply related

* [net-next v2 11/15] i40e: Implement debug macro hw_dbg using dev_dbg
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem
  Cc: Mauro S. M. Rodrigues, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>

There are several uses of hw_dbg in the code, producing no output. This
patch implements it using dev_dbg.

Initially the intention was to implement it using netdev_dbg, analogously
to what is done in ixgbe for instance. That approach was avoided because
some early uses of hw_dbg, such as in i40e_pf_reset, happen before the VSI
structure is initialized, which would cause a NULL pointer dereference
during driver probe if the debug messages were turned on as soon as the
module is probed.
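
A user-space sketch of the same macro shape may help show the idea. It
is not the kernel macro: the toy_* structures are hypothetical, the
message is captured in a buffer instead of going through dev_dbg, and
##__VA_ARGS__ is a GNU extension accepted by gcc and clang.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the driver structures. */
struct toy_dev { const char *name; };
struct toy_pf { struct toy_dev *dev; };
struct toy_hw { void *back; };	/* opaque back-pointer to the owning toy_pf */

static char toy_log[128];	/* holds the last message, instead of a kernel log */

/* Reach the device through the back-pointer, which is valid early in probe. */
#define toy_hw_dbg(hw, fmt, ...)					\
do {									\
	struct toy_pf *pf_ = (struct toy_pf *)(hw)->back;		\
	assert(pf_ != NULL && pf_->dev != NULL);			\
	snprintf(toy_log, sizeof(toy_log), "%s: " fmt,			\
		 pf_->dev->name, ##__VA_ARGS__);			\
} while (0)

/* Emit one message the way an early reset path might. */
const char *toy_probe_message(void)
{
	struct toy_dev dev = { .name = "0000:01:00.0" };
	struct toy_pf pf = { .dev = &dev };
	struct toy_hw hw = { .back = &pf };

	toy_hw_dbg(&hw, "reset took %d ms\n", 12);
	return toy_log;
}

/* Compare the captured message against an expected string. */
int toy_log_equals(const char *expected)
{
	return strcmp(toy_log, expected) == 0;
}
```

Parenthesising the hw argument in the sketch is a defensive choice so
that an expression such as &hw expands correctly.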

v2:
 - Use dev_dbg instead of pr_debug, and take advantage of dev_name
instead of crafting pretty much the same device name locally as suggested
by Jakub Kicinski.

Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_common.c | 1 +
 drivers/net/ethernet/intel/i40e/i40e_hmc.c    | 1 +
 drivers/net/ethernet/intel/i40e/i40e_osdep.h  | 5 ++++-
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 46e649c09f72..d37c6e0e5f08 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright(c) 2013 - 2018 Intel Corporation. */
 
+#include "i40e.h"
 #include "i40e_type.h"
 #include "i40e_adminq.h"
 #include "i40e_prototype.h"
diff --git a/drivers/net/ethernet/intel/i40e/i40e_hmc.c b/drivers/net/ethernet/intel/i40e/i40e_hmc.c
index 19ce93d7fd0a..163ee8c6311c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_hmc.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_hmc.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright(c) 2013 - 2018 Intel Corporation. */
 
+#include "i40e.h"
 #include "i40e_osdep.h"
 #include "i40e_register.h"
 #include "i40e_status.h"
diff --git a/drivers/net/ethernet/intel/i40e/i40e_osdep.h b/drivers/net/ethernet/intel/i40e/i40e_osdep.h
index a07574bff550..c302ef2524f8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_osdep.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_osdep.h
@@ -18,7 +18,10 @@
  * actual OS primitives
  */
 
-#define hw_dbg(hw, S, A...)	do {} while (0)
+#define hw_dbg(hw, S, A...)							\
+do {										\
+	dev_dbg(&((struct i40e_pf *)hw->back)->pdev->dev, S, ##A);		\
+} while (0)
 
 #define wr32(a, reg, value)	writel((value), ((a)->hw_addr + (reg)))
 #define rd32(a, reg)		readl((a)->hw_addr + (reg))
-- 
2.21.0


^ permalink raw reply related

* [net-next v2 04/15] igc: Remove useless forward declaration
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem; +Cc: Sasha Neftin, netdev, nhorman, sassmann, Aaron Brown,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: Sasha Neftin <sasha.neftin@intel.com>

Move igc_phy_setup_autoneg, igc_wait_autoneg and igc_set_fc_watermarks
up to avoid forward declarations.
It is not necessary to forward declare these static functions.
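
The principle is simply define-before-use. A minimal sketch, with
hypothetical toy_* helpers that loosely mirror the advertisement
masking done in the autoneg path but are not igc functions:

```c
#include <assert.h>

/* Defined first, so the caller below needs no forward declaration. */
static int toy_setup_autoneg(int advertised, int mask)
{
	/* Drop unsupported bits; fall back to full capability if none remain. */
	advertised &= mask;
	return advertised ? advertised : mask;
}

/* The caller comes after its callee in the file. */
int toy_link_autoneg(int advertised, int mask)
{
	assert(mask != 0);
	return toy_setup_autoneg(advertised, mask);
}
```

Had toy_link_autoneg() been placed above toy_setup_autoneg(), a
"static int toy_setup_autoneg(int, int);" line would be needed first.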

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_mac.c |  73 +++++----
 drivers/net/ethernet/intel/igc/igc_phy.c | 192 +++++++++++------------
 2 files changed, 129 insertions(+), 136 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_mac.c b/drivers/net/ethernet/intel/igc/igc_mac.c
index ba4646737288..5eeb4c8caf4a 100644
--- a/drivers/net/ethernet/intel/igc/igc_mac.c
+++ b/drivers/net/ethernet/intel/igc/igc_mac.c
@@ -7,9 +7,6 @@
 #include "igc_mac.h"
 #include "igc_hw.h"
 
-/* forward declaration */
-static s32 igc_set_fc_watermarks(struct igc_hw *hw);
-
 /**
  * igc_disable_pcie_master - Disables PCI-express master access
  * @hw: pointer to the HW structure
@@ -74,6 +71,41 @@ void igc_init_rx_addrs(struct igc_hw *hw, u16 rar_count)
 		hw->mac.ops.rar_set(hw, mac_addr, i);
 }
 
+/**
+ * igc_set_fc_watermarks - Set flow control high/low watermarks
+ * @hw: pointer to the HW structure
+ *
+ * Sets the flow control high/low threshold (watermark) registers.  If
+ * flow control XON frame transmission is enabled, then set XON frame
+ * transmission as well.
+ */
+static s32 igc_set_fc_watermarks(struct igc_hw *hw)
+{
+	u32 fcrtl = 0, fcrth = 0;
+
+	/* Set the flow control receive threshold registers.  Normally,
+	 * these registers will be set to a default threshold that may be
+	 * adjusted later by the driver's runtime code.  However, if the
+	 * ability to transmit pause frames is not enabled, then these
+	 * registers will be set to 0.
+	 */
+	if (hw->fc.current_mode & igc_fc_tx_pause) {
+		/* We need to set up the Receive Threshold high and low water
+		 * marks as well as (optionally) enabling the transmission of
+		 * XON frames.
+		 */
+		fcrtl = hw->fc.low_water;
+		if (hw->fc.send_xon)
+			fcrtl |= IGC_FCRTL_XONE;
+
+		fcrth = hw->fc.high_water;
+	}
+	wr32(IGC_FCRTL, fcrtl);
+	wr32(IGC_FCRTH, fcrth);
+
+	return 0;
+}
+
 /**
  * igc_setup_link - Setup flow control and link settings
  * @hw: pointer to the HW structure
@@ -194,41 +226,6 @@ s32 igc_force_mac_fc(struct igc_hw *hw)
 	return ret_val;
 }
 
-/**
- * igc_set_fc_watermarks - Set flow control high/low watermarks
- * @hw: pointer to the HW structure
- *
- * Sets the flow control high/low threshold (watermark) registers.  If
- * flow control XON frame transmission is enabled, then set XON frame
- * transmission as well.
- */
-static s32 igc_set_fc_watermarks(struct igc_hw *hw)
-{
-	u32 fcrtl = 0, fcrth = 0;
-
-	/* Set the flow control receive threshold registers.  Normally,
-	 * these registers will be set to a default threshold that may be
-	 * adjusted later by the driver's runtime code.  However, if the
-	 * ability to transmit pause frames is not enabled, then these
-	 * registers will be set to 0.
-	 */
-	if (hw->fc.current_mode & igc_fc_tx_pause) {
-		/* We need to set up the Receive Threshold high and low water
-		 * marks as well as (optionally) enabling the transmission of
-		 * XON frames.
-		 */
-		fcrtl = hw->fc.low_water;
-		if (hw->fc.send_xon)
-			fcrtl |= IGC_FCRTL_XONE;
-
-		fcrth = hw->fc.high_water;
-	}
-	wr32(IGC_FCRTL, fcrtl);
-	wr32(IGC_FCRTH, fcrth);
-
-	return 0;
-}
-
 /**
  * igc_clear_hw_cntrs_base - Clear base hardware counters
  * @hw: pointer to the HW structure
diff --git a/drivers/net/ethernet/intel/igc/igc_phy.c b/drivers/net/ethernet/intel/igc/igc_phy.c
index 4c8f96a9a148..f4b05af0dd2f 100644
--- a/drivers/net/ethernet/intel/igc/igc_phy.c
+++ b/drivers/net/ethernet/intel/igc/igc_phy.c
@@ -3,10 +3,6 @@
 
 #include "igc_phy.h"
 
-/* forward declaration */
-static s32 igc_phy_setup_autoneg(struct igc_hw *hw);
-static s32 igc_wait_autoneg(struct igc_hw *hw);
-
 /**
  * igc_check_reset_block - Check if PHY reset is blocked
  * @hw: pointer to the HW structure
@@ -207,100 +203,6 @@ s32 igc_phy_hw_reset(struct igc_hw *hw)
 	return ret_val;
 }
 
-/**
- * igc_copper_link_autoneg - Setup/Enable autoneg for copper link
- * @hw: pointer to the HW structure
- *
- * Performs initial bounds checking on autoneg advertisement parameter, then
- * configure to advertise the full capability.  Setup the PHY to autoneg
- * and restart the negotiation process between the link partner.  If
- * autoneg_wait_to_complete, then wait for autoneg to complete before exiting.
- */
-static s32 igc_copper_link_autoneg(struct igc_hw *hw)
-{
-	struct igc_phy_info *phy = &hw->phy;
-	u16 phy_ctrl;
-	s32 ret_val;
-
-	/* Perform some bounds checking on the autoneg advertisement
-	 * parameter.
-	 */
-	phy->autoneg_advertised &= phy->autoneg_mask;
-
-	/* If autoneg_advertised is zero, we assume it was not defaulted
-	 * by the calling code so we set to advertise full capability.
-	 */
-	if (phy->autoneg_advertised == 0)
-		phy->autoneg_advertised = phy->autoneg_mask;
-
-	hw_dbg("Reconfiguring auto-neg advertisement params\n");
-	ret_val = igc_phy_setup_autoneg(hw);
-	if (ret_val) {
-		hw_dbg("Error Setting up Auto-Negotiation\n");
-		goto out;
-	}
-	hw_dbg("Restarting Auto-Neg\n");
-
-	/* Restart auto-negotiation by setting the Auto Neg Enable bit and
-	 * the Auto Neg Restart bit in the PHY control register.
-	 */
-	ret_val = phy->ops.read_reg(hw, PHY_CONTROL, &phy_ctrl);
-	if (ret_val)
-		goto out;
-
-	phy_ctrl |= (MII_CR_AUTO_NEG_EN | MII_CR_RESTART_AUTO_NEG);
-	ret_val = phy->ops.write_reg(hw, PHY_CONTROL, phy_ctrl);
-	if (ret_val)
-		goto out;
-
-	/* Does the user want to wait for Auto-Neg to complete here, or
-	 * check at a later time (for example, callback routine).
-	 */
-	if (phy->autoneg_wait_to_complete) {
-		ret_val = igc_wait_autoneg(hw);
-		if (ret_val) {
-			hw_dbg("Error while waiting for autoneg to complete\n");
-			goto out;
-		}
-	}
-
-	hw->mac.get_link_status = true;
-
-out:
-	return ret_val;
-}
-
-/**
- * igc_wait_autoneg - Wait for auto-neg completion
- * @hw: pointer to the HW structure
- *
- * Waits for auto-negotiation to complete or for the auto-negotiation time
- * limit to expire, which ever happens first.
- */
-static s32 igc_wait_autoneg(struct igc_hw *hw)
-{
-	u16 i, phy_status;
-	s32 ret_val = 0;
-
-	/* Break after autoneg completes or PHY_AUTO_NEG_LIMIT expires. */
-	for (i = PHY_AUTO_NEG_LIMIT; i > 0; i--) {
-		ret_val = hw->phy.ops.read_reg(hw, PHY_STATUS, &phy_status);
-		if (ret_val)
-			break;
-		ret_val = hw->phy.ops.read_reg(hw, PHY_STATUS, &phy_status);
-		if (ret_val)
-			break;
-		if (phy_status & MII_SR_AUTONEG_COMPLETE)
-			break;
-		msleep(100);
-	}
-
-	/* PHY_AUTO_NEG_TIME expiration doesn't guarantee auto-negotiation
-	 * has completed.
-	 */
-	return ret_val;
-}
-
 /**
  * igc_phy_setup_autoneg - Configure PHY for auto-negotiation
  * @hw: pointer to the HW structure
@@ -485,6 +387,100 @@ static s32 igc_phy_setup_autoneg(struct igc_hw *hw)
 	return ret_val;
 }
 
+/**
+ * igc_wait_autoneg - Wait for auto-neg completion
+ * @hw: pointer to the HW structure
+ *
+ * Waits for auto-negotiation to complete or for the auto-negotiation time
+ * limit to expire, which ever happens first.
+ */
+static s32 igc_wait_autoneg(struct igc_hw *hw)
+{
+	u16 i, phy_status;
+	s32 ret_val = 0;
+
+	/* Break after autoneg completes or PHY_AUTO_NEG_LIMIT expires. */
+	for (i = PHY_AUTO_NEG_LIMIT; i > 0; i--) {
+		ret_val = hw->phy.ops.read_reg(hw, PHY_STATUS, &phy_status);
+		if (ret_val)
+			break;
+		ret_val = hw->phy.ops.read_reg(hw, PHY_STATUS, &phy_status);
+		if (ret_val)
+			break;
+		if (phy_status & MII_SR_AUTONEG_COMPLETE)
+			break;
+		msleep(100);
+	}
+
+	/* PHY_AUTO_NEG_TIME expiration doesn't guarantee auto-negotiation
+	 * has completed.
+	 */
+	return ret_val;
+}
+
+/**
+ * igc_copper_link_autoneg - Setup/Enable autoneg for copper link
+ * @hw: pointer to the HW structure
+ *
+ * Performs initial bounds checking on autoneg advertisement parameter, then
+ * configure to advertise the full capability.  Setup the PHY to autoneg
+ * and restart the negotiation process between the link partner.  If
+ * autoneg_wait_to_complete, then wait for autoneg to complete before exiting.
+ */
+static s32 igc_copper_link_autoneg(struct igc_hw *hw)
+{
+	struct igc_phy_info *phy = &hw->phy;
+	u16 phy_ctrl;
+	s32 ret_val;
+
+	/* Perform some bounds checking on the autoneg advertisement
+	 * parameter.
+	 */
+	phy->autoneg_advertised &= phy->autoneg_mask;
+
+	/* If autoneg_advertised is zero, we assume it was not defaulted
+	 * by the calling code so we set to advertise full capability.
+	 */
+	if (phy->autoneg_advertised == 0)
+		phy->autoneg_advertised = phy->autoneg_mask;
+
+	hw_dbg("Reconfiguring auto-neg advertisement params\n");
+	ret_val = igc_phy_setup_autoneg(hw);
+	if (ret_val) {
+		hw_dbg("Error Setting up Auto-Negotiation\n");
+		goto out;
+	}
+	hw_dbg("Restarting Auto-Neg\n");
+
+	/* Restart auto-negotiation by setting the Auto Neg Enable bit and
+	 * the Auto Neg Restart bit in the PHY control register.
+	 */
+	ret_val = phy->ops.read_reg(hw, PHY_CONTROL, &phy_ctrl);
+	if (ret_val)
+		goto out;
+
+	phy_ctrl |= (MII_CR_AUTO_NEG_EN | MII_CR_RESTART_AUTO_NEG);
+	ret_val = phy->ops.write_reg(hw, PHY_CONTROL, phy_ctrl);
+	if (ret_val)
+		goto out;
+
+	/* Does the user want to wait for Auto-Neg to complete here, or
+	 * check at a later time (for example, callback routine).
+	 */
+	if (phy->autoneg_wait_to_complete) {
+		ret_val = igc_wait_autoneg(hw);
+		if (ret_val) {
+			hw_dbg("Error while waiting for autoneg to complete\n");
+			goto out;
+		}
+	}
+
+	hw->mac.get_link_status = true;
+
+out:
+	return ret_val;
+}
+
 /**
  * igc_setup_copper_link - Configure copper link settings
  * @hw: pointer to the HW structure
-- 
2.21.0


^ permalink raw reply related

* [net-next v2 06/15] fm10k: use a local variable for the frag pointer
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem; +Cc: Jacob Keller, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

In the function fm10k_xmit_frame_ring, we recently switched to using
the skb_frag_size accessor instead of directly using the size member of
the skb fragment.

This made the for loop slightly harder to read because it created a very
long line that is difficult to split up. Avoid this by using a local
variable in the for loop, so that we do not have to break the line on an
open parenthesis.
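
For illustration, the loop shape looks roughly like the following
user-space sketch. The toy_* types and macros are hypothetical, and the
16 KiB per-descriptor limit and round-up division are assumptions made
for the example, not values taken from this patch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DATA_PER_TXD	(1u << 14)	/* assumed per-descriptor limit */
#define TOY_DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
#define TOY_TXD_USE_COUNT(s)	TOY_DIV_ROUND_UP((s), TOY_MAX_DATA_PER_TXD)

/* Hypothetical stand-in for skb_frag_t. */
struct toy_frag {
	unsigned int size;
};

/* Accessor, playing the role of skb_frag_size(). */
static unsigned int toy_frag_size(const struct toy_frag *frag)
{
	return frag->size;
}

/* Sum the descriptors needed for all fragments. */
unsigned int toy_count_descs(const struct toy_frag *frags, size_t nr_frags)
{
	unsigned int count = 0;
	size_t f;

	assert(frags != NULL || nr_frags == 0);

	for (f = 0; f < nr_frags; f++) {
		/* The local keeps the next line short enough to stay unbroken. */
		const struct toy_frag *frag = &frags[f];

		count += TOY_TXD_USE_COUNT(toy_frag_size(frag));
	}

	return count;
}
```

With fragments of 100, 16384 and 16385 bytes this sketch counts
1 + 1 + 2 = 4 descriptors.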

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index e0a2be534b20..2be9222510e7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -1073,9 +1073,11 @@ netdev_tx_t fm10k_xmit_frame_ring(struct sk_buff *skb,
 	 *       + 2 desc gap to keep tail from touching head
 	 * otherwise try next time
 	 */
-	for (f = 0; f < skb_shinfo(skb)->nr_frags; f++)
-		count += TXD_USE_COUNT(skb_frag_size(
-						&skb_shinfo(skb)->frags[f]));
+	for (f = 0; f < skb_shinfo(skb)->nr_frags; f++) {
+		skb_frag_t *frag = &skb_shinfo(skb)->frags[f];
+
+		count += TXD_USE_COUNT(skb_frag_size(frag));
+	}
 
 	if (fm10k_maybe_stop_tx(tx_ring, count + 3)) {
 		tx_ring->tx_stats.tx_busy++;
-- 
2.21.0


^ permalink raw reply related

* [net-next v2 08/15] iavf: allow permanent MAC address to change
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem
  Cc: Mitch Williams, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: Mitch Williams <mitch.a.williams@intel.com>

Allow the VF to override the "permanent" MAC address set by the host.
This allows bonding to work in the case where the administrator has set
the VF MAC.

Note that the VF must still be set to Trusted on the host if this change
is to be accepted by the PF driver.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/iavf/iavf.h      | 1 -
 drivers/net/ethernet/intel/iavf/iavf_main.c | 4 ----
 2 files changed, 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h
index 9fc635d816d2..29de3ae96ef2 100644
--- a/drivers/net/ethernet/intel/iavf/iavf.h
+++ b/drivers/net/ethernet/intel/iavf/iavf.h
@@ -253,7 +253,6 @@ struct iavf_adapter {
 #define IAVF_FLAG_RESET_PENDING		BIT(4)
 #define IAVF_FLAG_RESET_NEEDED		BIT(5)
 #define IAVF_FLAG_WB_ON_ITR_CAPABLE		BIT(6)
-#define IAVF_FLAG_ADDR_SET_BY_PF		BIT(8)
 #define IAVF_FLAG_SERVICE_CLIENT_REQUESTED	BIT(9)
 #define IAVF_FLAG_CLIENT_NEEDS_OPEN		BIT(10)
 #define IAVF_FLAG_CLIENT_NEEDS_CLOSE		BIT(11)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 554aa619ff02..07f5541a0f01 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -790,9 +790,6 @@ static int iavf_set_mac(struct net_device *netdev, void *p)
 	if (ether_addr_equal(netdev->dev_addr, addr->sa_data))
 		return 0;
 
-	if (adapter->flags & IAVF_FLAG_ADDR_SET_BY_PF)
-		return -EPERM;
-
 	spin_lock_bh(&adapter->mac_vlan_list_lock);
 
 	f = iavf_find_filter(adapter, hw->mac.addr);
@@ -1811,7 +1808,6 @@ static int iavf_init_get_resources(struct iavf_adapter *adapter)
 		eth_hw_addr_random(netdev);
 		ether_addr_copy(adapter->hw.mac.addr, netdev->dev_addr);
 	} else {
-		adapter->flags |= IAVF_FLAG_ADDR_SET_BY_PF;
 		ether_addr_copy(netdev->dev_addr, adapter->hw.mac.addr);
 		ether_addr_copy(netdev->perm_addr, adapter->hw.mac.addr);
 	}
-- 
2.21.0


^ permalink raw reply related

* [net-next v2 05/15] Documentation: iavf: Update the Intel LAN driver doc for iavf
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, Aaron Brown
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

Update the LAN driver documentation to include the latest feature
implementation and driver capabilities.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
---
 .../networking/device_drivers/intel/iavf.rst  | 115 +++++++++++++-----
 1 file changed, 82 insertions(+), 33 deletions(-)

diff --git a/Documentation/networking/device_drivers/intel/iavf.rst b/Documentation/networking/device_drivers/intel/iavf.rst
index 2d0c3baa1752..cfc08842e32c 100644
--- a/Documentation/networking/device_drivers/intel/iavf.rst
+++ b/Documentation/networking/device_drivers/intel/iavf.rst
@@ -10,11 +10,15 @@ Copyright(c) 2013-2018 Intel Corporation.
 Contents
 ========
 
+- Overview
 - Identifying Your Adapter
 - Additional Configurations
 - Known Issues/Troubleshooting
 - Support
 
+Overview
+========
+
 This file describes the iavf Linux* Base Driver. This driver was formerly
 called i40evf.
 
@@ -27,6 +31,7 @@ The guest OS loading the iavf driver must support MSI-X interrupts.
 
 Identifying Your Adapter
 ========================
+
 The driver in this kernel is compatible with devices based on the following:
  * Intel(R) XL710 X710 Virtual Function
  * Intel(R) X722 Virtual Function
@@ -50,9 +55,10 @@ Link messages will not be displayed to the console if the distribution is
 restricting system messages. In order to see network driver link messages on
 your console, set dmesg to eight by entering the following::
 
-  dmesg -n 8
+    # dmesg -n 8
 
-NOTE: This setting is not saved across reboots.
+NOTE:
+  This setting is not saved across reboots.
 
 ethtool
 -------
@@ -72,11 +78,11 @@ then requests from that VF to set VLAN tag stripping will be ignored.
 To enable/disable VLAN tag stripping for a VF, issue the following command
 from inside the VM in which you are running the VF::
 
-  ethtool -K <if_name> rxvlan on/off
+    # ethtool -K <if_name> rxvlan on/off
 
 or alternatively::
 
-  ethtool --offload <if_name> rxvlan on/off
+    # ethtool --offload <if_name> rxvlan on/off
 
 Adaptive Virtual Function
 -------------------------
@@ -91,21 +97,21 @@ additional features depending on what features are available in the PF with
 which the AVF is associated. The following are base mode features:
 
 - 4 Queue Pairs (QP) and associated Configuration Status Registers (CSRs)
-  for Tx/Rx.
-- i40e descriptors and ring format.
-- Descriptor write-back completion.
-- 1 control queue, with i40e descriptors, CSRs and ring format.
-- 5 MSI-X interrupt vectors and corresponding i40e CSRs.
-- 1 Interrupt Throttle Rate (ITR) index.
-- 1 Virtual Station Interface (VSI) per VF.
+  for Tx/Rx
+- i40e descriptors and ring format
+- Descriptor write-back completion
+- 1 control queue, with i40e descriptors, CSRs and ring format
+- 5 MSI-X interrupt vectors and corresponding i40e CSRs
+- 1 Interrupt Throttle Rate (ITR) index
+- 1 Virtual Station Interface (VSI) per VF
 - 1 Traffic Class (TC), TC0
 - Receive Side Scaling (RSS) with 64 entry indirection table and key,
-  configured through the PF.
-- 1 unicast MAC address reserved per VF.
-- 16 MAC address filters for each VF.
-- Stateless offloads - non-tunneled checksums.
-- AVF device ID.
-- HW mailbox is used for VF to PF communications (including on Windows).
+  configured through the PF
+- 1 unicast MAC address reserved per VF
+- 16 MAC address filters for each VF
+- Stateless offloads - non-tunneled checksums
+- AVF device ID
+- HW mailbox is used for VF to PF communications (including on Windows)
 
 IEEE 802.1ad (QinQ) Support
 ---------------------------
@@ -117,8 +123,8 @@ VLAN ID, among other uses.
 
 The following are examples of how to configure 802.1ad (QinQ)::
 
-  ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24
-  ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371
+    # ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24
+    # ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371
 
 Where "24" and "371" are example VLAN IDs.
 
@@ -133,6 +139,19 @@ specific application. This can reduce latency for the specified application,
 and allow Tx traffic to be rate limited per application. Follow the steps below
 to set ADq.
 
+Requirements:
+
+- The sch_mqprio, act_mirred and cls_flower modules must be loaded
+- The latest version of iproute2
+- If another driver (for example, DPDK) has set cloud filters, you cannot
+  enable ADQ
+- Depending on the underlying PF device, ADQ cannot be enabled when the
+  following features are enabled:
+
+  + Data Center Bridging (DCB)
+  + Multiple Functions per Port (MFP)
+  + Sideband Filters
+
 1. Create traffic classes (TCs). Maximum of 8 TCs can be created per interface.
 The shaper bw_rlimit parameter is optional.
 
@@ -141,9 +160,9 @@ to 1Gbit for tc0 and 3Gbit for tc1.
 
 ::
 
-  # tc qdisc add dev <interface> root mqprio num_tc 2 map 0 0 0 0 1 1 1 1
-  queues 16@0 16@16 hw 1 mode channel shaper bw_rlimit min_rate 1Gbit 2Gbit
-  max_rate 1Gbit 3Gbit
+    tc qdisc add dev <interface> root mqprio num_tc 2 map 0 0 0 0 1 1 1 1
+    queues 16@0 16@16 hw 1 mode channel shaper bw_rlimit min_rate 1Gbit 2Gbit
+    max_rate 1Gbit 3Gbit
 
 map: priority mapping for up to 16 priorities to tcs (e.g. map 0 0 0 0 1 1 1 1
 sets priorities 0-3 to use tc0 and 4-7 to use tc1)
@@ -162,6 +181,10 @@ Totals must be equal or less than port speed.
 For example: min_rate 1Gbit 3Gbit: Verify bandwidth limit using network
 monitoring tools such as ifstat or sar –n DEV [interval] [number of samples]
 
+NOTE:
+  Setting up channels via ethtool (ethtool -L) is not supported when the
+  TCs are configured using mqprio.
+
 2. Enable HW TC offload on interface::
 
     # ethtool -K <interface> hw-tc-offload on
@@ -171,16 +194,16 @@ monitoring tools such as ifstat or sar –n DEV [interval] [number of samples]
     # tc qdisc add dev <interface> ingress
 
 NOTES:
- - Run all tc commands from the iproute2 <pathtoiproute2>/tc/ directory.
- - ADq is not compatible with cloud filters.
+ - Run all tc commands from the iproute2 <pathtoiproute2>/tc/ directory
+ - ADq is not compatible with cloud filters
  - Setting up channels via ethtool (ethtool -L) is not supported when the TCs
-   are configured using mqprio.
+   are configured using mqprio
  - You must have iproute2 latest version
- - NVM version 6.01 or later is required.
+ - NVM version 6.01 or later is required
  - ADq cannot be enabled when any the following features are enabled: Data
-   Center Bridging (DCB), Multiple Functions per Port (MFP), or Sideband Filters.
+   Center Bridging (DCB), Multiple Functions per Port (MFP), or Sideband Filters
  - If another driver (for example, DPDK) has set cloud filters, you cannot
-   enable ADq.
+   enable ADq
  - Tunnel filters are not supported in ADq. If encapsulated packets do arrive
    in non-tunnel mode, filtering will be done on the inner headers.  For example,
    for VXLAN traffic in non-tunnel mode, PCTYPE is identified as a VXLAN
@@ -198,6 +221,16 @@ NOTES:
 Known Issues/Troubleshooting
 ============================
 
+Bonding fails with VFs bound to an Intel(R) Ethernet Controller 700 series device
+---------------------------------------------------------------------------------
+If you bind Virtual Functions (VFs) to an Intel(R) Ethernet Controller 700
+series based device, the VF slaves may fail when they become the active slave.
+If the MAC address of the VF is set by the PF (Physical Function) of the
+device, when you add a slave, or change the active-backup slave, Linux bonding
+tries to sync the backup slave's MAC address to the same MAC address as the
+active slave. Linux bonding will fail at this point. This issue will not occur
+if the VF's MAC address is not set by the PF.
+
 Traffic Is Not Being Passed Between VM and Client
 -------------------------------------------------
 You may not be able to pass traffic between a client system and a
@@ -215,13 +248,28 @@ Do not unload a port's driver if a Virtual Function (VF) with an active Virtual
 Machine (VM) is bound to it. Doing so will cause the port to appear to hang.
 Once the VM shuts down, or otherwise releases the VF, the command will complete.
 
+Using four traffic classes fails
+--------------------------------
+Do not try to reserve more than three traffic classes in the iavf driver. Doing
+so will fail to set any traffic classes and will cause the driver to write
+errors to stdout. Use a maximum of three queues to avoid this issue.
+
+Multiple log error messages on iavf driver removal
+--------------------------------------------------
+If you have several VFs and you remove the iavf driver, several instances of
+the following log errors are written to the log::
+
+    Unable to send opcode 2 to PF, err I40E_ERR_QUEUE_EMPTY, aq_err ok
+    Unable to send the message to VF 2 aq_err 12
+    ARQ Overflow Error detected
+
 Virtual machine does not get link
 ---------------------------------
 If the virtual machine has more than one virtual port assigned to it, and those
 virtual ports are bound to different physical ports, you may not get link on
 all of the virtual ports. The following command may work around the issue::
 
-  ethtool -r <PF>
+    # ethtool -r <PF>
 
 Where <PF> is the PF interface in the host, for example: p5p1. You may need to
 run the command more than once to get link on all virtual ports.
@@ -251,12 +299,13 @@ traffic.
 If you have multiple interfaces in a server, either turn on ARP filtering by
 entering::
 
-  echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
+    # echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
 
-NOTE: This setting is not saved across reboots. The configuration change can be
-made permanent by adding the following line to the file /etc/sysctl.conf::
+NOTE:
+  This setting is not saved across reboots. The configuration change can be
+  made permanent by adding the following line to the file /etc/sysctl.conf::
 
-  net.ipv4.conf.all.arp_filter = 1
+    net.ipv4.conf.all.arp_filter = 1
 
 Another alternative is to install the interfaces in separate broadcast domains
 (either in different switches or in a switch partitioned to VLANs).
-- 
2.21.0



* [net-next v2 03/15] e1000e: Make speed detection on hotplugging cable more reliable
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem; +Cc: Kai-Heng Feng, netdev, nhorman, sassmann, Aaron Brown,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: Kai-Heng Feng <kai.heng.feng@canonical.com>

After hot-plugging a 1Gbps Ethernet cable with a 1Gbps link partner,
MII_BMSR may report 10Mbps, which renders the network rather slow.

The failure rate is much lower after commit 59653e6497d1 ("e1000e:
Make watchdog use delayed work"), which essentially introduces some
delay before running the watchdog task.

But there's still a chance that the hot-plugging event and the queued
watchdog task run at the same time, in which case the original issue
can be observed once again.

So let's use mod_delayed_work() to add a deterministic 1 second delay
before running the watchdog task after an interrupt.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
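The difference between the two workqueue calls can be illustrated with a
minimal userspace model. This is a sketch, not kernel code: it assumes only
the documented behaviour that queue_delayed_work() leaves an already-pending
item untouched, while mod_delayed_work() re-arms its timer. The names
model_dwork, model_queue, model_mod and MODEL_HZ are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tick rate for the model; the kernel's HZ is a config option. */
#define MODEL_HZ 250UL

/* Userspace stand-in for a delayed work item (assumption, not the kernel struct). */
struct model_dwork {
	bool pending;          /* is the item already queued? */
	unsigned long expires; /* absolute expiry time, in ticks */
};

/*
 * Models queue_delayed_work(): if the item is already pending, nothing
 * changes and false is returned; otherwise it is armed and true is returned.
 */
static bool model_queue(struct model_dwork *w, unsigned long now,
			unsigned long delay)
{
	assert(w != NULL);
	if (w->pending)
		return false;
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/*
 * Models mod_delayed_work(): the timer is always (re)armed relative to
 * "now". Returns true if the item was already pending, false if it was idle.
 */
static bool model_mod(struct model_dwork *w, unsigned long now,
		      unsigned long delay)
{
	bool was_pending;

	assert(w != NULL);
	was_pending = w->pending;
	w->pending = true;
	w->expires = now + delay;
	return was_pending;
}
```

In this model, a watchdog that is already pending 3 ticks from now stays at
3 ticks after model_queue(), whereas model_mod(&w, now, MODEL_HZ) pushes it
to a full MODEL_HZ ticks after the interrupt, which is the deterministic
delay this patch is after.
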
 drivers/net/ethernet/intel/e1000e/netdev.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 8a3f035c3a5f..d7d56e42a6aa 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1780,8 +1780,8 @@ static irqreturn_t e1000_intr_msi(int __always_unused irq, void *data)
 		}
 		/* guard against interrupt when we're going down */
 		if (!test_bit(__E1000_DOWN, &adapter->state))
-			queue_delayed_work(adapter->e1000_workqueue,
-					   &adapter->watchdog_task, 1);
+			mod_delayed_work(adapter->e1000_workqueue,
+					 &adapter->watchdog_task, HZ);
 	}
 
 	/* Reset on uncorrectable ECC error */
@@ -1861,8 +1861,8 @@ static irqreturn_t e1000_intr(int __always_unused irq, void *data)
 		}
 		/* guard against interrupt when we're going down */
 		if (!test_bit(__E1000_DOWN, &adapter->state))
-			queue_delayed_work(adapter->e1000_workqueue,
-					   &adapter->watchdog_task, 1);
+			mod_delayed_work(adapter->e1000_workqueue,
+					 &adapter->watchdog_task, HZ);
 	}
 
 	/* Reset on uncorrectable ECC error */
@@ -1907,8 +1907,8 @@ static irqreturn_t e1000_msix_other(int __always_unused irq, void *data)
 		hw->mac.get_link_status = true;
 		/* guard against interrupt when we're going down */
 		if (!test_bit(__E1000_DOWN, &adapter->state))
-			queue_delayed_work(adapter->e1000_workqueue,
-					   &adapter->watchdog_task, 1);
+			mod_delayed_work(adapter->e1000_workqueue,
+					 &adapter->watchdog_task, HZ);
 	}
 
 	if (!test_bit(__E1000_DOWN, &adapter->state))
-- 
2.21.0



* [net-next v2 02/15] ixgbevf: Link lost in VM on ixgbevf when restoring from freeze or suspend
From: Jeff Kirsher @ 2019-09-09 22:47 UTC (permalink / raw)
  To: davem; +Cc: Radoslaw Tyl, netdev, nhorman, sassmann, Andrew Bowers,
	Jeff Kirsher
In-Reply-To: <20190909224802.29595-1-jeffrey.t.kirsher@intel.com>

From: Radoslaw Tyl <radoslawx.tyl@intel.com>

This patch fixes an issue where a VM shows no link after the hypervisor
is restored from a low-power state. The driver is responsible for
re-enabling any features of the device that were disabled during suspend
calls, such as IRQs and bus mastering.

Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 8c011d4ce7a9..75e849a64db7 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -2517,6 +2517,7 @@ void ixgbevf_reinit_locked(struct ixgbevf_adapter *adapter)
 		msleep(1);
 
 	ixgbevf_down(adapter);
+	pci_set_master(adapter->pdev);
 	ixgbevf_up(adapter);
 
 	clear_bit(__IXGBEVF_RESETTING, &adapter->state);
-- 
2.21.0



* [RFC PATCH net-next 1/2] Allow 225/8-231/8 as unicast
From: Dave Taht @ 2019-09-09 22:37 UTC (permalink / raw)
  To: netdev; +Cc: Dave Taht
In-Reply-To: <1568068639-6511-1-git-send-email-dave.taht@gmail.com>

This patch converts the long "reserved for future use" multicast
address space, 225/8-231/8 - 120m addresses - for use as unicast
addresses instead.

In a comprehensive survey of all the open source code on the planet
we found no users of this range. We found some official and unofficial
usage of addresses in 224/8 and in 239/8 - both spaces at well under
50% allocation in the first place, so we anticipate no additional growth
for any reason, into the 225-231 spaces.

There will be some short-term incompatibilities induced.

The principal flaw of converting this space to unicast involves
a non-uniext box sending a packet to the formerly multicast address,
and the reply coming back from that "formerly multicast" address
as unicast.

The return packet will be dropped because the source of the reply is unicast
(L2) with what the non-uniext box considers to be multicast (L3).

And, like all multicast packets sent anywhere, the attempt will still
flood all ports on the local switch.

A TCP attempt fails immediately due to the inherent IN_MULTICAST
check in the existing kernel. Some stacks (not Linux) MAY do more
of the wrong thing here.

As for userspace exposure...

We were only able to find 89 packages in Fedora that used the IN_MULTICAST
macro. Currently the plan is not to kill IN_MULTICAST (as doing it right
requires access to the big endian macros) but to retire its usages in
the kernel (already done) and then in the very few programs that use it
in userspace.

All the routing daemons we've inspected and modified don't use IN_MULTICAST.
The patches to them are trivial.

New users of multicast seem to always pick something out of the 224/8
or 239/8 ranges, which are untouched by this patch.

Additional potential problems include:

* hardware offloads that explicitly check for multicast
* binary firmware that explicitly checks for multicast
* a tiny cpu hit

Whether or not these problems are worth addressing to regain 120m
useful unicast addresses in the next decade is up for debate.

---
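The range check described above can be tried out in userspace. The sketch
below is an illustration under stated assumptions, not the kernel function:
be32_t stands in for the kernel's __be32, and sketch_is_multicast,
parse_ipv4 and reclaimed_addresses are hypothetical names. It converts to
host order once with ntohl() before comparing against the 225/8-231/8
bounds.

```c
#include <sys/socket.h>  /* AF_INET */
#include <netinet/in.h>  /* struct in_addr */
#include <arpa/inet.h>   /* ntohl, inet_pton */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's __be32 (network byte order). */
typedef uint32_t be32_t;

/*
 * Returns true for 224.0.0.0/4 except 225.0.0.0-231.255.255.255,
 * which this sketch treats as unicast.
 */
static bool sketch_is_multicast(be32_t addr)
{
	uint32_t host = ntohl(addr); /* compare in host byte order */

	if ((host & 0xf0000000u) != 0xe0000000u)
		return false; /* outside 224.0.0.0/4 */
	return !(host >= 0xe1000000u && host < 0xe8000000u);
}

/* Parses a dotted-quad string into a network-order address. */
static be32_t parse_ipv4(const char *s)
{
	struct in_addr a;
	int ok = inet_pton(AF_INET, s, &a);

	assert(ok == 1);
	(void)ok; /* silence unused warning when NDEBUG is defined */
	return a.s_addr;
}

/* Number of addresses in 225/8 through 231/8: seven /8 blocks. */
static uint32_t reclaimed_addresses(void)
{
	return 0xe8000000u - 0xe1000000u;
}
```

With these helpers, 224.0.0.1, 232.0.0.1 and 239.1.2.3 are still reported
as multicast, while 225.0.0.1 and 231.255.255.255 are not. The exact size of
the converted range is 7 * 2^24 = 117,440,512 addresses, which the
description rounds to 120m.
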
 include/linux/in.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/in.h b/include/linux/in.h
index 1873ef642605..8665842a3589 100644
--- a/include/linux/in.h
+++ b/include/linux/in.h
@@ -42,7 +42,10 @@ static inline bool ipv4_is_loopback(__be32 addr)

 static inline bool ipv4_is_multicast(__be32 addr)
 {
-	return (addr & htonl(0xf0000000)) == htonl(0xe0000000);
+	if ((addr & htonl(0xf0000000)) == htonl(0xe0000000))
+		return !((ntohl(addr) >= 0xe1000000) &&
+			 (ntohl(addr) < 0xe8000000));
+	return false;
 }

 static inline bool ipv4_is_local_multicast(__be32 addr)
-- 
2.17.1

