Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH v2] rsi: remove set but not used variable 'header_size'
From: YueHaibing @ 2018-08-31 11:13 UTC (permalink / raw)
  To: kvalo, davem
  Cc: linux-kernel, netdev, amit.karwar, pavani.muthyala, keescook,
	linux-wireless, YueHaibing

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/wireless/rsi/rsi_91x_hal.c: In function 'rsi_send_data_pkt':
drivers/net/wireless/rsi/rsi_91x_hal.c:288:5: warning:
 variable 'header_size' set but not used [-Wunused-but-set-variable]

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
v2: remove unused 'tx_params'
---
 drivers/net/wireless/rsi/rsi_91x_hal.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/wireless/rsi/rsi_91x_hal.c b/drivers/net/wireless/rsi/rsi_91x_hal.c
index 01edf96..182b066 100644
--- a/drivers/net/wireless/rsi/rsi_91x_hal.c
+++ b/drivers/net/wireless/rsi/rsi_91x_hal.c
@@ -282,10 +282,8 @@ int rsi_send_data_pkt(struct rsi_common *common, struct sk_buff *skb)
 	struct rsi_hw *adapter = common->priv;
 	struct ieee80211_vif *vif;
 	struct ieee80211_tx_info *info;
-	struct skb_info *tx_params;
 	struct ieee80211_bss_conf *bss;
 	int status = -EINVAL;
-	u8 header_size;
 
 	if (!skb)
 		return 0;
@@ -297,8 +295,6 @@ int rsi_send_data_pkt(struct rsi_common *common, struct sk_buff *skb)
 		goto err;
 	vif = info->control.vif;
 	bss = &vif->bss_conf;
-	tx_params = (struct skb_info *)info->driver_data;
-	header_size = tx_params->internal_hdr_size;
 
 	if (((vif->type == NL80211_IFTYPE_STATION) ||
 	     (vif->type == NL80211_IFTYPE_P2P_CLIENT)) &&
-- 
2.7.0

^ permalink raw reply related

* Re: RFC: changed error code when binding unix socket twice
From: Petr Vorel @ 2018-08-31 11:14 UTC (permalink / raw)
  To: Michal Kubecek, David S. Miller
  Cc: netdev, WANG Cong, Rainer Weikusat, linux-kernel, ltp,
	Cyril Hrubis, Junchi Chen
In-Reply-To: <20170630073448.GA9546@unicorn.suse.cz>

Hi,

> commit 0fb44559ffd6 ("af_unix: move unix_mknod() out of bindlock") moves
> the special file creation in unix_bind() before u->bindlock is taken in
> order to avoid an ABBA deadlock with do_splice(). As a side effect, it
> also moves the check for existence of the special file (which would
> result in -EADDRINUSE) before the check of u->addr (which would result
> in -EINVAL if socket is already bound). This means that the error
> returned for an attempt to bind a unix socket to the same path twice
> changed from -EINVAL to -EADDRINUSE with this commit.

> One way to restore the old error code is indicated below but before
> submitting it, I would like to ask if we need/want it.

> Pro:
>   - in general, we do not want to change return code for given testcase
>   - old error (-EINVAL) is consistent with AF_INET(6)
> Con:
>   - both POSIX and Linux man page only list error conditions without
>     stating which should take precedence if more of them apply so
>     neither of them seems wrong, strictly speaking

I'd be for restoring the original behavior (be conservative + looks like as not intended).

Any comment from netdev maintainers?


Kind regards,
Petr


> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 1a0c961f4ffe..509292bdf7ed 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -992,7 +992,7 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
>  	struct unix_sock *u = unix_sk(sk);
>  	struct sockaddr_un *sunaddr = (struct sockaddr_un *)uaddr;
>  	char *sun_path = sunaddr->sun_path;
> -	int err;
> +	int err, mknod_err;
>  	unsigned int hash;
>  	struct unix_address *addr;
>  	struct hlist_head *list;
> @@ -1016,12 +1016,10 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
>  	if (sun_path[0]) {
>  		umode_t mode = S_IFSOCK |
>  		       (SOCK_INODE(sock)->i_mode & ~current_umask());
> -		err = unix_mknod(sun_path, mode, &path);
> -		if (err) {
> -			if (err == -EEXIST)
> -				err = -EADDRINUSE;
> -			goto out;
> -		}
> +		mknod_err = unix_mknod(sun_path, mode, &path);
> +		/* do not exit on error until after u->addr check */
> +		if (mknod_err == -EEXIST)
> +			mknod_err = -EADDRINUSE;
>  	}

>  	err = mutex_lock_interruptible(&u->bindlock);
> @@ -1031,6 +1029,10 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
>  	err = -EINVAL;
>  	if (u->addr)
>  		goto out_up;
> +	if (mknod_err) {
> +		err = mknod_err;
> +		goto out_up;
> +	}

>  	err = -ENOMEM;
>  	addr = kmalloc(sizeof(*addr)+addr_len, GFP_KERNEL);

^ permalink raw reply

* Re: [PATCH net] sctp: hold transport before accessing its asoc in sctp_transport_get_next
From: Xin Long @ 2018-08-31  7:09 UTC (permalink / raw)
  To: Neil Horman; +Cc: network dev, linux-sctp, davem, Marcelo Ricardo Leitner
In-Reply-To: <20180829113532.GA21923@hmswarspite.think-freely.org>

On Wed, Aug 29, 2018 at 7:36 PM Neil Horman <nhorman@tuxdriver.com> wrote:
>
> On Wed, Aug 29, 2018 at 12:08:40AM +0800, Xin Long wrote:
> > On Mon, Aug 27, 2018 at 9:08 PM Neil Horman <nhorman@tuxdriver.com> wrote:
> > >
> > > On Mon, Aug 27, 2018 at 06:38:31PM +0800, Xin Long wrote:
> > > > As Marcelo noticed, in sctp_transport_get_next, it is iterating over
> > > > transports but then also accessing the association directly, without
> > > > checking any refcnts before that, which can cause an use-after-free
> > > > Read.
> > > >
> > > > So fix it by holding transport before accessing the association. With
> > > > that, sctp_transport_hold calls can be removed in the later places.
> > > >
> > > > Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
> > > > Reported-by: syzbot+fe62a0c9aa6a85c6de16@syzkaller.appspotmail.com
> > > > Signed-off-by: Xin Long <lucien.xin@gmail.com>
> > > > ---
> > > >  net/sctp/proc.c   |  4 ----
> > > >  net/sctp/socket.c | 22 +++++++++++++++-------
> > > >  2 files changed, 15 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/net/sctp/proc.c b/net/sctp/proc.c
> > > > index ef5c9a8..4d6f1c8 100644
> > > > --- a/net/sctp/proc.c
> > > > +++ b/net/sctp/proc.c
> > > > @@ -264,8 +264,6 @@ static int sctp_assocs_seq_show(struct seq_file *seq, void *v)
> > > >       }
> > > >
> > > >       transport = (struct sctp_transport *)v;
> > > > -     if (!sctp_transport_hold(transport))
> > > > -             return 0;
> > > >       assoc = transport->asoc;
> > > >       epb = &assoc->base;
> > > >       sk = epb->sk;
> > > > @@ -322,8 +320,6 @@ static int sctp_remaddr_seq_show(struct seq_file *seq, void *v)
> > > >       }
> > > >
> > > >       transport = (struct sctp_transport *)v;
> > > > -     if (!sctp_transport_hold(transport))
> > > > -             return 0;
> > > >       assoc = transport->asoc;
> > > >
> > > >       list_for_each_entry_rcu(tsp, &assoc->peer.transport_addr_list,
> > > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > > > index e96b15a..aa76586 100644
> > > > --- a/net/sctp/socket.c
> > > > +++ b/net/sctp/socket.c
> > > > @@ -5005,9 +5005,14 @@ struct sctp_transport *sctp_transport_get_next(struct net *net,
> > > >                       break;
> > > >               }
> > > >
> > > > +             if (!sctp_transport_hold(t))
> > > > +                     continue;
> > > > +
> > > >               if (net_eq(sock_net(t->asoc->base.sk), net) &&
> > > >                   t->asoc->peer.primary_path == t)
> > > >                       break;
> > > > +
> > > > +             sctp_transport_put(t);
> > > >       }
> > > >
> > > >       return t;
> > > > @@ -5017,13 +5022,18 @@ struct sctp_transport *sctp_transport_get_idx(struct net *net,
> > > >                                             struct rhashtable_iter *iter,
> > > >                                             int pos)
> > > >  {
> > > > -     void *obj = SEQ_START_TOKEN;
> > > > +     struct sctp_transport *t;
> > > >
> > > > -     while (pos && (obj = sctp_transport_get_next(net, iter)) &&
> > > > -            !IS_ERR(obj))
> > > > -             pos--;
> > > > +     if (!pos)
> > > > +             return SEQ_START_TOKEN;
> > > >
> > > > -     return obj;
> > > > +     while ((t = sctp_transport_get_next(net, iter)) && !IS_ERR(t)) {
> > > > +             if (!--pos)
> > > > +                     break;
> > > > +             sctp_transport_put(t);
> > > > +     }
> > > > +
> > > > +     return t;
> > > >  }
> > > >
> > > >  int sctp_for_each_endpoint(int (*cb)(struct sctp_endpoint *, void *),
> > > > @@ -5082,8 +5092,6 @@ int sctp_for_each_transport(int (*cb)(struct sctp_transport *, void *),
> > > >
> > > >       tsp = sctp_transport_get_idx(net, &hti, *pos + 1);
> > > >       for (; !IS_ERR_OR_NULL(tsp); tsp = sctp_transport_get_next(net, &hti)) {
> > > > -             if (!sctp_transport_hold(tsp))
> > > > -                     continue;
> > > >               ret = cb(tsp, p);
> > > >               if (ret)
> > > >                       break;
> > > > --
> > > > 2.1.0
> > > >
> > > >
> > > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> > >
> > > Additionally, its not germaine to this particular fix, but why are we still
> > > using that pos variable in sctp_transport_get_idx?  With the conversion to
> > > rhashtables, it doesn't seem particularly useful anymore.
> > For proc, seems so, hti is saved into seq->private.
> > But for diag, "hti" in sctp_for_each_transport() is a local variable.
> > do you think where we can save it?
> >
> Sorry, wasn't suggesting that it had to be removed from sctp_for_each_trasnport,
> its clearly used as both an input and output there.  All I was sugesting was
> that, in sctp_transport_get_idx, the pos variable might no longer be needed
> there specifically, as sctp_transprt_get_next should terminate the loop on its
> own.  Or is there another purpose for that positional variable I am missing
Yes, Neil, all transports/asocs could not be dumped once as one buffer/rtnl msg
is not big enough for them. when coming into proc/diag again, it needs to start
from the *next* one, and 'pos' is used to save its position.

Without 'pos', it would always start from the first one and dump the duplicate
ones in the next times. Make sense?

>
> Neil
>

^ permalink raw reply

* Re: [PATCH 2/2] net/ibm/emac: deletion of unneeded macros definitions
From: Ivan Mikhaylov @ 2018-08-31 11:23 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel, Christian Lamparter
In-Reply-To: <20180829.181007.967677259832053608.davem@davemloft.net>

>Please do not delete these, they could be useful for other
>developers and people reading this driver.
Hmm, that's possible, need I resubmit first patch without this one?

^ permalink raw reply

* Re: [PATCH bpf-next] xsk: remove unnecessary assignment
From: Björn Töpel @ 2018-08-31  7:27 UTC (permalink / raw)
  To: bhole_prashant_q7
  Cc: ast, Daniel Borkmann, Björn Töpel, Karlsson, Magnus,
	Netdev
In-Reply-To: <20180831005916.9276-1-bhole_prashant_q7@lab.ntt.co.jp>

Den fre 31 aug. 2018 kl 03:02 skrev Prashant Bhole
<bhole_prashant_q7@lab.ntt.co.jp>:
>
> Since xdp_umem_query() was added one assignment of bpf.command was
> missed from cleanup. Removing the assignment statement.
>

Good catch!

Acked-by: Björn Töpel <bjorn.topel@intel.com>

> Fixes: 84c6b86875e01a0 ("xsk: don't allow umem replace at stack level")
> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
> ---
>  net/xdp/xdp_umem.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
> index bfe2dbea480b..d179732617dc 100644
> --- a/net/xdp/xdp_umem.c
> +++ b/net/xdp/xdp_umem.c
> @@ -76,8 +76,6 @@ int xdp_umem_assign_dev(struct xdp_umem *umem, struct net_device *dev,
>         if (!dev->netdev_ops->ndo_bpf || !dev->netdev_ops->ndo_xsk_async_xmit)
>                 return force_zc ? -EOPNOTSUPP : 0; /* fail or fallback */
>
> -       bpf.command = XDP_QUERY_XSK_UMEM;
> -
>         rtnl_lock();
>         err = xdp_umem_query(dev, queue_id);
>         if (err) {
> --
> 2.17.1
>
>

^ permalink raw reply

* Re: [PATCH bpf-next] samples/bpf: xdpsock, minor fixes
From: Björn Töpel @ 2018-08-31  7:32 UTC (permalink / raw)
  To: bhole_prashant_q7
  Cc: ast, Daniel Borkmann, Björn Töpel, Karlsson, Magnus,
	Netdev
In-Reply-To: <20180831010049.9388-1-bhole_prashant_q7@lab.ntt.co.jp>

Den fre 31 aug. 2018 kl 03:04 skrev Prashant Bhole
<bhole_prashant_q7@lab.ntt.co.jp>:
>
> - xsks_map size was fixed to 4, changed it MAX_SOCKS
> - Remove redundant definition of MAX_SOCKS in xdpsock_user.c
> - In dump_stats(), add NULL check for xsks[i]
>

Thanks for the cleanup!

Acked-by: Björn Töpel <bjorn.topel@intel.com>

> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
> ---
>  samples/bpf/xdpsock_kern.c | 2 +-
>  samples/bpf/xdpsock_user.c | 3 +--
>  2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/samples/bpf/xdpsock_kern.c b/samples/bpf/xdpsock_kern.c
> index d8806c41362e..b8ccd0802b3f 100644
> --- a/samples/bpf/xdpsock_kern.c
> +++ b/samples/bpf/xdpsock_kern.c
> @@ -16,7 +16,7 @@ struct bpf_map_def SEC("maps") xsks_map = {
>         .type = BPF_MAP_TYPE_XSKMAP,
>         .key_size = sizeof(int),
>         .value_size = sizeof(int),
> -       .max_entries = 4,
> +       .max_entries = MAX_SOCKS,
>  };
>
>  struct bpf_map_def SEC("maps") rr_map = {
> diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
> index b3906111bedb..57ecadc58403 100644
> --- a/samples/bpf/xdpsock_user.c
> +++ b/samples/bpf/xdpsock_user.c
> @@ -118,7 +118,6 @@ struct xdpsock {
>         unsigned long prev_tx_npkts;
>  };
>
> -#define MAX_SOCKS 4
>  static int num_socks;
>  struct xdpsock *xsks[MAX_SOCKS];
>
> @@ -596,7 +595,7 @@ static void dump_stats(void)
>
>         prev_time = now;
>
> -       for (i = 0; i < num_socks; i++) {
> +       for (i = 0; i < num_socks && xsks[i]; i++) {
>                 char *fmt = "%-15s %'-11.0f %'-11lu\n";
>                 double rx_pps, tx_pps;
>
> --
> 2.17.1
>
>

^ permalink raw reply

* Re: [PATCH bpf-next 08/11] i40e: add AF_XDP zero-copy Rx support
From: Jakub Kicinski @ 2018-08-31  7:55 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Björn Töpel, magnus.karlsson, magnus.karlsson,
	alexander.h.duyck, alexander.duyck, ast, brouer, daniel, netdev,
	jesse.brandeburg, anjali.singhai, peter.waskiewicz.jr,
	michael.lundkvist, willemdebruijn.kernel, john.fastabend,
	neerav.parikh, mykyta.iziumtsev, francois.ozog, ilias.apalodimas,
	brian.brooks, u9012063, pavel, qi.z.zhang
In-Reply-To: <65684cea-76bc-2f2d-600f-402f8504301d@intel.com>

On Thu, 30 Aug 2018 14:06:24 +0200, Björn Töpel wrote:
> On 2018-08-29 21:14, Jakub Kicinski wrote:
>  > On Tue, 28 Aug 2018 14:44:32 +0200, Björn Töpel wrote:  
>  >> From: Björn Töpel <bjorn.topel@intel.com>
>  >>
>  >> This patch adds zero-copy Rx support for AF_XDP sockets. Instead of
>  >> allocating buffers of type MEM_TYPE_PAGE_SHARED, the Rx frames are
>  >> allocated as MEM_TYPE_ZERO_COPY when AF_XDP is enabled for a certain
>  >> queue.
>  >>
>  >> All AF_XDP specific functions are added to a new file, i40e_xsk.c.
>  >>
>  >> Note that when AF_XDP zero-copy is enabled, the XDP action XDP_PASS
>  >> will allocate a new buffer and copy the zero-copy frame prior passing
>  >> it to the kernel stack.
>  >>
>  >> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>  
>  >
>  > Mm.. I'm surprised you don't run into buffer reuse issues that I had
>  > when playing with AF_XDP.  What happens in i40e if someone downs the
>  > interface?  Will UMEMs get destroyed?  Will the RX buffers get freed?
>  >  
> 
> The UMEM will linger in the driver until the sockets are dead.
> 
>  > I'll shortly send an RFC with my quick and dirty RX buffer reuse queue,
>  > FWIW.
>  >  
> 
> Some background for folks that don't know the details: A zero-copy
> capable driver picks buffers off the fill ring and places them on the
> hardware Rx ring to be completed at a later point when DMA is
> complete. Analogous for the Tx side; The driver picks buffers off the
> Tx ring and places them on the Tx hardware ring.
> 
> In the typical flow, the Rx buffer will be placed onto an Rx ring
> (completed to the user), and the Tx buffer will be placed on the
> completion ring to notify the user that the transfer is done.
> 
> However, if the driver needs to tear down the hardware rings for some
> reason (interface goes down, reconfiguration and such), what should be
> done with the Rx and Tx buffers that has been given to the driver?
> 
> So, to frame the problem: What should a driver do when this happens,
> so that buffers aren't leaked?
> 
> Note that when the UMEM is going down, there's no need to complete
> anything, since the sockets are dying/dead already.
> 
> This is, as you state, a missing piece in the implementation and needs
> to be fixed.
> 
> Now on to possible solutions:
> 
> 1. Complete the buffers back to the user. For Tx, this is probably the
>     best way -- just place the buffers onto the completion ring.
> 
>     For Rx, we can give buffers back to user space by setting the
>     length in the Rx descriptor to zero And putting them on the Rx
>     ring. However, one complication here is that we do not have any
>     back-pressure mechanism for the Rx side like we have on Tx. If the
>     Rx ring(s) is (are) full the kernel will have to leak them or
>     implement a retry mechanism (ugly and should be avoided).
> 
>     Another option to solve this without needing any retry or leaking
>     for Rx is to implement the same back-pressure mechanism that we
>     have on the Tx path in the Rx path. In the Tx path, the driver will
>     only get a Tx packet to send if there is space for it in the
>     completion ring. On Rx, this would be that the driver would only
>     get a buffer from the fill ring if there is space for it to put it
>     on the Rx ring. The drawback of this is that it would likely impact
>     performance negatively since the Rx ring would have to be touch one
>     more time (in the Tx path, it increased performance since it made
>     it possible to implement the Tx path without any buffering), but it
>     would guarantee that all buffers can always be returned to user
>     space making solution this a viable option.
> 
> 2. Store the buffers internally in the driver, and make sure that they
>     are inserted into the "normal flow" again. For Rx that would be
>     putting the buffers back into the allocation scheme that the driver
>     is using. For Tx, placing the buffers back onto the Tx HW ring
>     (plus all the logic for making sure that all corner cases work).
> 
> 3. Mark the socket(s) as in error state, en require the user to redo
>     the setup. This is bit harsh...

That's a good summary, one more (bad) option:

4. Disallow any operations on the device which could lead to RX buffer
   discards if any UMEMs are attached.

> For i40e I think #2 for Rx (buffers reside in kernel, return to
> allocator) and #1 for Tx (complete to userland).
> 
> Your RFC is plumbing to implement #2 for Rx in a driver. I'm not a fan
> of extending the umem with the "reuse queue". This decision is really
> up the driver. Some driver might prefer another scheme, or simply
> prefer storing the buffers in another manner.

The only performance cost is the extra pointer in xdp_umem.  Drivers
have to opt-in by using the *_rq() helpers.  We can move the extra
pointer to driver structs, and have them pass it to the helpers, so that
would be zero extra cost, and we can reuse the implementation.

> Looking forward, as both you and Jesper has alluded to, we need a
> proper allocator for zero-copy. Then it would be a matter of injecting
> the Rx buffers back to the allocator.
> 
> I'll send out a patch, so that we don't leak the buffers, but I'd like
> to hear your thoughts on what the behavior should be.
> 
> And what should the behavior be when the netdev is removed from the
> kernel?

What would AF_PACKET do, return ENXIO?

> And thanks for looking into the code!
> 
> Björn

^ permalink raw reply

* [PATCH] r8169: add support for NCube 8168 network card
From: Anthony Wong @ 2018-08-31 12:06 UTC (permalink / raw)
  To: nic_swsd, bhelgaas, netdev, davem, linux-kernel, linux-pci

This card identifies itself as:
  Ethernet controller [0200]: NCube Device [10ff:8168] (rev 06)
  Subsystem: TP-LINK Technologies Co., Ltd. Device [7470:3468]

Adding a new entry to rtl8169_pci_tbl makes the card work.

Link: http://launchpad.net/bugs/1788730
Cc: <stable@vger.kernel.org>
Signed-off-by: Anthony Wong <anthony.wong@ubuntu.com>
---
 drivers/net/ethernet/realtek/r8169.c | 1 +
 include/linux/pci_ids.h              | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index ac30679..b08d51b 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -218,6 +218,7 @@ static const struct pci_device_id rtl8169_pci_tbl[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_REALTEK,	0x8161), 0, 0, RTL_CFG_1 },
 	{ PCI_DEVICE(PCI_VENDOR_ID_REALTEK,	0x8167), 0, 0, RTL_CFG_0 },
 	{ PCI_DEVICE(PCI_VENDOR_ID_REALTEK,	0x8168), 0, 0, RTL_CFG_1 },
+	{ PCI_DEVICE(PCI_VENDOR_ID_NCUBE,	0x8168), 0, 0, RTL_CFG_1 },
 	{ PCI_DEVICE(PCI_VENDOR_ID_REALTEK,	0x8169), 0, 0, RTL_CFG_0 },
 	{ PCI_VENDOR_ID_DLINK,			0x4300,
 		PCI_VENDOR_ID_DLINK, 0x4b10,		 0, 0, RTL_CFG_1 },
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 6656286..d4afd80 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -3088,4 +3088,6 @@
 
 #define PCI_VENDOR_ID_OCZ		0x1b85
 
+#define PCI_VENDOR_ID_NCUBE		0x10ff
+
 #endif /* _LINUX_PCI_IDS_H */
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH] mt76: Enable NL80211_EXT_FEATURE_CQM_RSSI_LIST
From: Kristian Evensen @ 2018-08-31 12:25 UTC (permalink / raw)
  To: lorenzo.bianconi-H+wXaHxf7aLQT0dZR+AlfA
  Cc: arend.vanspriel-dY08KVG/lbpWk0Htik3J/w,
	kvalo-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA, Network Development
In-Reply-To: <CAKfDRXia3zwW9Q=7RZGqK_DWCsyQzG4KEZHsyHzGcNb70Ge1kA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Hi Lorenzo,

On Mon, Aug 13, 2018 at 4:25 PM Kristian Evensen
<kristian.evensen-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, Aug 13, 2018 at 12:55 PM Lorenzo Bianconi
> <lorenzo.bianconi-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > I am AFK at the moment, I will test that patch when I am back from vacations.

I hope you enjoyed you vacations. Have you been able to test if any of
the mt76 USB-dongles support CQM?

BR,
Kristian

^ permalink raw reply

* Re: [RFC] net: xsk: add a simple buffer reuse queue
From: Björn Töpel @ 2018-08-31  8:34 UTC (permalink / raw)
  To: Jakub Kicinski, Björn Töpel
  Cc: magnus.karlsson, magnus.karlsson, alexander.h.duyck,
	alexander.duyck, ast, brouer, daniel, netdev, jesse.brandeburg,
	anjali.singhai, peter.waskiewicz.jr, michael.lundkvist,
	willemdebruijn.kernel, john.fastabend, neerav.parikh,
	mykyta.iziumtsev, francois.ozog, ilias.apalodimas, brian.brooks,
	u9012063, pavel, qi.z.zhang
In-Reply-To: <20180829211940.54aa0130@cakuba.netronome.com>

On 2018-08-29 21:19, Jakub Kicinski wrote:
> XSK UMEM is strongly single producer single consumer so reuse of
> frames is challenging.  Add a simple "stash" of FILL packets to
> reuse for drivers to optionally make use of.  This is useful
> when driver has to free (ndo_stop) or resize a ring with an active
> AF_XDP ZC socket.
>

I'll take a stab using this in i40e. I have a couple of
comments/thoughts on this RFC, but let me get back when I have an actual
patch in place. :-)


Thanks!
Björn


> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> ---
>   include/net/xdp_sock.h | 44 +++++++++++++++++++++++++++++++++
>   net/xdp/xdp_umem.c     |  2 ++
>   net/xdp/xsk_queue.c    | 56 ++++++++++++++++++++++++++++++++++++++++++
>   net/xdp/xsk_queue.h    |  3 +++
>   4 files changed, 105 insertions(+)
> 
> diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
> index 6871e4755975..108c1c100de4 100644
> --- a/include/net/xdp_sock.h
> +++ b/include/net/xdp_sock.h
> @@ -14,6 +14,7 @@
>   #include <net/sock.h>
>   
>   struct net_device;
> +struct xdp_umem_fq_reuse;
>   struct xsk_queue;
>   
>   struct xdp_umem_props {
> @@ -41,6 +42,7 @@ struct xdp_umem {
>   	struct page **pgs;
>   	u32 npgs;
>   	struct net_device *dev;
> +	struct xdp_umem_fq_reuse *fq_reuse;
>   	u16 queue_id;
>   	bool zc;
>   	spinlock_t xsk_list_lock;
> @@ -110,4 +112,46 @@ static inline dma_addr_t xdp_umem_get_dma(struct xdp_umem *umem, u64 addr)
>   	return umem->pages[addr >> PAGE_SHIFT].dma + (addr & (PAGE_SIZE - 1));
>   }
>   
> +struct xdp_umem_fq_reuse {
> +	u32 nentries;
> +	u32 length;
> +	u64 handles[];
> +};
> +
> +/* Following functions are not thread-safe in any way */
> +struct xdp_umem_fq_reuse *xsk_reuseq_prepare(u32 nentries);
> +struct xdp_umem_fq_reuse *xsk_reuseq_swap(struct xdp_umem *umem,
> +					  struct xdp_umem_fq_reuse *newq);
> +void xsk_reuseq_free(struct xdp_umem_fq_reuse *rq);
> +
> +/* Reuse-queue aware version of FILL queue helpers */
> +static inline u64 *xsk_umem_peek_addr_rq(struct xdp_umem *umem, u64 *addr)
> +{
> +	struct xdp_umem_fq_reuse *rq = umem->fq_reuse;
> +
> +	if (!rq->length) {
> +		return xsk_umem_peek_addr(umem, addr);
> +	} else {
> +		*addr = rq->handles[rq->length - 1];
> +		return addr;
> +	}
> +}
> +
> +static inline void xsk_umem_discard_addr_rq(struct xdp_umem *umem)
> +{
> +	struct xdp_umem_fq_reuse *rq = umem->fq_reuse;
> +
> +	if (!rq->length)
> +		xsk_umem_discard_addr(umem);
> +	else
> +		rq->length--;
> +}
> +
> +static inline void xsk_umem_fq_reuse(struct xdp_umem *umem, u64 addr)
> +{
> +	struct xdp_umem_fq_reuse *rq = umem->fq_reuse;
> +
> +	rq->handles[rq->length++] = addr;
> +}
> +
>   #endif /* _LINUX_XDP_SOCK_H */
> diff --git a/net/xdp/xdp_umem.c b/net/xdp/xdp_umem.c
> index e762310c9bee..40303e24c954 100644
> --- a/net/xdp/xdp_umem.c
> +++ b/net/xdp/xdp_umem.c
> @@ -170,6 +170,8 @@ static void xdp_umem_release(struct xdp_umem *umem)
>   		umem->cq = NULL;
>   	}
>   
> +	xsk_reuseq_destroy(umem);
> +
>   	xdp_umem_unpin_pages(umem);
>   
>   	task = get_pid_task(umem->pid, PIDTYPE_PID);
> diff --git a/net/xdp/xsk_queue.c b/net/xdp/xsk_queue.c
> index 6c32e92e98fc..f9ee40a13a9a 100644
> --- a/net/xdp/xsk_queue.c
> +++ b/net/xdp/xsk_queue.c
> @@ -3,7 +3,9 @@
>    * Copyright(c) 2018 Intel Corporation.
>    */
>   
> +#include <linux/log2.h>
>   #include <linux/slab.h>
> +#include <linux/overflow.h>
>   
>   #include "xsk_queue.h"
>   
> @@ -61,3 +63,57 @@ void xskq_destroy(struct xsk_queue *q)
>   	page_frag_free(q->ring);
>   	kfree(q);
>   }
> +
> +struct xdp_umem_fq_reuse *xsk_reuseq_prepare(u32 nentries)
> +{
> +	struct xdp_umem_fq_reuse *newq;
> +
> +	/* Check for overflow */
> +	if (nentries > (u32)roundup_pow_of_two(nentries))
> +		return NULL;
> +	nentries = roundup_pow_of_two(nentries);
> +
> +	newq = kvmalloc(struct_size(newq, handles, nentries), GFP_KERNEL);
> +	if (!newq)
> +		return NULL;
> +	memset(newq, 0, offsetof(typeof(*newq), handles));
> +
> +	newq->nentries = nentries;
> +	return newq;
> +}
> +EXPORT_SYMBOL_GPL(xsk_reuseq_prepare);
> +
> +struct xdp_umem_fq_reuse *xsk_reuseq_swap(struct xdp_umem *umem,
> +					  struct xdp_umem_fq_reuse *newq)
> +{
> +	struct xdp_umem_fq_reuse *oldq = umem->fq_reuse;
> +
> +	if (!oldq) {
> +		umem->fq_reuse = newq;
> +		return NULL;
> +	}
> +
> +	if (newq->nentries < oldq->length)
> +		return newq;
> +
> +
> +	memcpy(newq->handles, oldq->handles,
> +	       array_size(oldq->length, sizeof(u64)));
> +	newq->length = oldq->length;
> +
> +	umem->fq_reuse = newq;
> +	return oldq;
> +}
> +EXPORT_SYMBOL_GPL(xsk_reuseq_swap);
> +
> +void xsk_reuseq_free(struct xdp_umem_fq_reuse *rq)
> +{
> +	kvfree(rq);
> +}
> +EXPORT_SYMBOL_GPL(xsk_reuseq_free);
> +
> +void xsk_reuseq_destroy(struct xdp_umem *umem)
> +{
> +	xsk_reuseq_free(umem->fq_reuse);
> +	umem->fq_reuse = NULL;
> +}
> diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
> index 8a64b150be54..7a480e3eb35d 100644
> --- a/net/xdp/xsk_queue.h
> +++ b/net/xdp/xsk_queue.h
> @@ -257,4 +257,7 @@ void xskq_set_umem(struct xsk_queue *q, struct xdp_umem_props *umem_props);
>   struct xsk_queue *xskq_create(u32 nentries, bool umem_queue);
>   void xskq_destroy(struct xsk_queue *q_ops);
>   
> +/* Executed by the core when the entire UMEM gets freed */
> +void xsk_reuseq_destroy(struct xdp_umem *umem);
> +
>   #endif /* _LINUX_XSK_QUEUE_H */
> 

^ permalink raw reply

* Re: [PATCH net] ebpf: fix bpf_msg_pull_data
From: Tushar Dave @ 2018-08-31  8:37 UTC (permalink / raw)
  To: Daniel Borkmann, john.fastabend, ast, davem, netdev; +Cc: sowmini.varadhan
In-Reply-To: <18201a71-d47e-6d60-81e0-19411afbc8d4@iogearbox.net>



On 08/30/2018 12:20 AM, Daniel Borkmann wrote:
> On 08/30/2018 02:21 AM, Tushar Dave wrote:
>> On 08/29/2018 05:07 PM, Tushar Dave wrote:
>>> While doing some preliminary testing it is found that bpf helper
>>> bpf_msg_pull_data does not calculate the data and data_end offset
>>> correctly. Fix it!
>>>
>>> Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
>>> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
>>> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
>>> ---
>>>    net/core/filter.c | 38 +++++++++++++++++++++++++-------------
>>>    1 file changed, 25 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index c25eb36..3eeb3d6 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -2285,7 +2285,7 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>>    BPF_CALL_4(bpf_msg_pull_data,
>>>           struct sk_msg_buff *, msg, u32, start, u32, end, u64, flags)
>>>    {
>>> -    unsigned int len = 0, offset = 0, copy = 0;
>>> +    unsigned int len = 0, offset = 0, copy = 0, off = 0;
>>>        struct scatterlist *sg = msg->sg_data;
>>>        int first_sg, last_sg, i, shift;
>>>        unsigned char *p, *to, *from;
>>> @@ -2299,22 +2299,30 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>>        i = msg->sg_start;
>>>        do {
>>>            len = sg[i].length;
>>> -        offset += len;
>>>            if (start < offset + len)
>>>                break;
>>> +        offset += len;
>>>            i++;
>>>            if (i == MAX_SKB_FRAGS)
>>>                i = 0;
>>> -    } while (i != msg->sg_end);
>>> +    } while (i <= msg->sg_end);
> 
> I don't think this condition is correct, msg operates as a scatterlist ring,
> so sg_end may very well be < current i when there's a wrap-around in the
> traversal ... more below.

I'm wondering then how this is suppose to work in case sg list is not
ring! For RDS, We have sg list that is not a ring. More below.

> 
>>>    +    /* return error if start is out of range */
>>>        if (unlikely(start >= offset + len))
>>>            return -EINVAL;
>>>    -    if (!msg->sg_copy[i] && bytes <= len)
>>> -        goto out;
>>> +    /* return error if i is last entry in sglist and end is out of range */
>>> +    if (msg->sg_copy[i] && end > offset + len)
>>> +        return -EINVAL;
>>>          first_sg = i;
>>>    +    /* if i is not last entry in sg list and end (i.e start + bytes) is
>>> +     * within this sg[i] then goto out and calculate data and data_end
>>> +     */
>>> +    if (!msg->sg_copy[i] && end <= offset + len)
>>> +        goto out;
>>> +
>>>        /* At this point we need to linearize multiple scatterlist
>>>         * elements or a single shared page. Either way we need to
>>>         * copy into a linear buffer exclusively owned by BPF. Then
>>> @@ -2330,9 +2338,14 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>>            i++;
>>>            if (i == MAX_SKB_FRAGS)
>>>                i = 0;
>>> -        if (bytes < copy)
>>> +        if (end < copy)
>>>                break;
>>> -    } while (i != msg->sg_end);
>>> +    } while (i <= msg->sg_end);
>>> +
>>> +    /* return error if i is last entry in sglist and end is out of range */
>>> +    if (i > msg->sg_end && end > offset + copy)
>>> +        return -EINVAL;
>>> +
>>>        last_sg = i;
>>>          if (unlikely(copy < end - start))
>>> @@ -2342,23 +2355,22 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>>        if (unlikely(!page))
>>>            return -ENOMEM;
>>>        p = page_address(page);
>>> -    offset = 0;
>>>          i = first_sg;
>>>        do {
>>>            from = sg_virt(&sg[i]);
>>>            len = sg[i].length;
>>> -        to = p + offset;
>>> +        to = p + off;
>>>              memcpy(to, from, len);
>>> -        offset += len;
>>> +        off += len;
>>>            sg[i].length = 0;
>>>            put_page(sg_page(&sg[i]));
>>>              i++;
>>>            if (i == MAX_SKB_FRAGS)
>>>                i = 0;
>>> -    } while (i != last_sg);
>>> +    } while (i < last_sg);
>>>          sg[first_sg].length = copy;
>>>        sg_set_page(&sg[first_sg], page, copy, 0);
>>> @@ -2380,7 +2392,7 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>>            else
>>>                move_from = i + shift;
>>>    -        if (move_from == msg->sg_end)
>>> +        if (move_from > msg->sg_end)
>>>                break;
>>>              sg[i] = sg[move_from];
>>> @@ -2396,7 +2408,7 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>>        if (msg->sg_end < 0)
>>>            msg->sg_end += MAX_SKB_FRAGS;
>>>    out:
>>> -    msg->data = sg_virt(&sg[i]) + start - offset;
>>> +    msg->data = sg_virt(&sg[first_sg]) + start - offset;
>>>        msg->data_end = msg->data + bytes;
>>>          return 0;
>>>
>>
>> Please discard this patch. I just noticed that Daniel Borkmann sent some similar fixes for bpf_msg_pull_data.
> 
> Yeah I've been looking at these recently as well, please make sure you test
> with the below fixes included to see if there's anything left:

I tested the latest net tree which has all the fixes you mentioned and I
am still seeing issues.

As I already mentioned before on RFC v3 thread, we need to be careful
reusing 'offset' while linearizing multiple scatterlist
elements.
Variable 'offset' is used to calculate the 'msg->data'
i.e. msg->data = sg_virt(&sg[first_sg]) + start - offset"

We should use different variable name (e.g. off) while linearizing
multiple scatterlist elements or a single shared page so that we don't
overwrite 'offset' that has correct value already been calculated.

A code change for this would look like:

diff --git a/net/core/filter.c b/net/core/filter.c
index 60a29ca..076ca09 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2326,7 +2326,7 @@ struct sock *do_msg_redirect_map(struct 
sk_msg_buff *msg)
  BPF_CALL_4(bpf_msg_pull_data,
            struct sk_msg_buff *, msg, u32, start, u32, end, u64, flags)
  {
-       unsigned int len = 0, offset = 0, copy = 0;
+       unsigned int len = 0, offset = 0, copy = 0, off;
         int bytes = end - start, bytes_sg_total;
         struct scatterlist *sg = msg->sg_data;
         int first_sg, last_sg, i, shift;
@@ -2354,6 +2354,8 @@ struct sock *do_msg_redirect_map(struct 
sk_msg_buff *msg)
          * account for the headroom.
          */
         bytes_sg_total = start - offset + bytes;

         if (!msg->sg_copy[i] && bytes_sg_total <= len)
                 goto out;

@@ -2382,18 +2384,18 @@ struct sock *do_msg_redirect_map(struct 
sk_msg_buff *msg)
         if (unlikely(!page))
                 return -ENOMEM;
         p = page_address(page);
-       offset = 0;
+       off = 0;

         i = first_sg;
         do {
                 from = sg_virt(&sg[i]);
                 len = sg[i].length;
-               to = p + offset;
+               to = p + off;

                 memcpy(to, from, len);
-               offset += len;
+               off += len;
                 sg[i].length = 0;
                 put_page(sg_page(&sg[i]));

                 sk_msg_iter_var(i);
         } while (i != last_sg);


Apart from above bug fix, I see issues in below 2 cases:

case 1:
=======
For things like RDS, where sg list is not a ring how to know end of packet?

For example I have RDS packet of length 8192 Bytes which is in
scatterlist like
sge[0].lenghth = 1400
sge[1].length = 1448
sge[2].length = 1448
sge[3].length = 1448
sge[4].length = 1448
sge[5].length = 1000

Now when I execute "bpf_msg_pull_data(msg, 8190, 8196, 0)" on above RDS
packet, I expect bpf_msg_pull_data to return error because end is out of
bound e.g. 8196 > 8192. Instead current code wraps it to sge index 0!
That is if I dump first 6 bytes starting with 'start' I get byte value
at offset 8190, 8191 from sge[5] and value at offset 8192, 8193, 8194
8195 bytes from sge[0] from offset 0 ,1 ,2 ,3 respectively - which is
not expected!


case 2:
=======
Same example of RDS packet as above, when I execute
"bpf_msg_pull_data(msg, 7191, 8192, 0)"
I get -EINVAL which is *wrong*.

Debugging this, I found, right before we enter below do while loop the 
values of variables are:
i = 4
first_sg = 4
last_sg= 5
start=7191
bytes_sg_total=2448
offset=5744
copy=0
len=1448

      do {
                 copy += sg[i].length;
                 sk_msg_iter_var(i);
                 if (bytes_sg_total <= copy)
                         break;
         } while (i != msg->sg_end);
         last_sg = i;

However, above loop execute only one time because after one execution,
i=5 and msg->sg_end=5. That makes condition "while (i != msg->sg_end)"
false. So when loop exit, the values are:
i = 5
first_sg=4
last_sg= 5
start=7191
bytes_sg_total=2448 <<<<<<
offset=5744
copy=1448 <<<<<<
len=1448

And that makes below condition true
         if (unlikely(bytes_sg_total > copy))
                 return -EINVAL;


There may be one or more similar corner scenarios like case 1 and 2
which can be fixed however I am not sure how can we fix so that we can
get it right for sg ring and sg list?

Thanks.
-Tushar
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/net/core/filter.c?id=5b24109b0563d45094c470684c1f8cea1af269f8
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/net/core/filter.c?id=0e06b227c5221dd51b5569de93f3b9f532be4a32
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/net/core/filter.c?id=2e43f95dd8ee62bc8bf57f2afac37fbd70c8d565
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/net/core/filter.c?id=a8cf76a9023bc6709b1361d06bb2fae5227b9d68
> 
> Thanks,
> Daniel
> 

^ permalink raw reply related

* Re: [PATCH bpf-next] tools/bpf: bpftool, add xskmap in map types
From: Jakub Kicinski @ 2018-08-31  8:45 UTC (permalink / raw)
  To: Prashant Bhole
  Cc: Alexei Starovoitov, Daniel Borkmann, Quentin Monnet, netdev
In-Reply-To: <20180831063242.9532-1-bhole_prashant_q7@lab.ntt.co.jp>

On Fri, 31 Aug 2018 15:32:42 +0900, Prashant Bhole wrote:
> When listed all maps, bpftool currently shows (null) for xskmap.
> Added xskmap type in map_type_name[] to show correct type.
> 
> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Thank you!  I feel tempted to suggest considering the bpf tree, but
perhaps that's a stretch..

> diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
> index df175bc33c5d..9c55077ca5dd 100644
> --- a/tools/bpf/bpftool/map.c
> +++ b/tools/bpf/bpftool/map.c
> @@ -68,6 +68,7 @@ static const char * const map_type_name[] = {
>  	[BPF_MAP_TYPE_DEVMAP]		= "devmap",
>  	[BPF_MAP_TYPE_SOCKMAP]		= "sockmap",
>  	[BPF_MAP_TYPE_CPUMAP]		= "cpumap",
> +	[BPF_MAP_TYPE_XSKMAP]           = "xskmap",
>  	[BPF_MAP_TYPE_SOCKHASH]		= "sockhash",
>  	[BPF_MAP_TYPE_CGROUP_STORAGE]	= "cgroup_storage",
>  };

^ permalink raw reply

* [PATCH] net: qca_spi: Fix race condition in spi transfers
From: Stefan Wahren @ 2018-08-31 12:53 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-kernel, Stefan Wahren

The performance optimization with the prepared spi messages
also introduced a race condition between the kernel thread
and the register dump. This could make spi_sync hang forever.

So revert the optimization completely.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
---
 drivers/net/ethernet/qualcomm/qca_7k.c  |  84 ++++++++++++------------
 drivers/net/ethernet/qualcomm/qca_spi.c | 110 +++++++++++++++++---------------
 drivers/net/ethernet/qualcomm/qca_spi.h |   5 --
 3 files changed, 97 insertions(+), 102 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/qca_7k.c b/drivers/net/ethernet/qualcomm/qca_7k.c
index ffe7a16..e9914b6 100644
--- a/drivers/net/ethernet/qualcomm/qca_7k.c
+++ b/drivers/net/ethernet/qualcomm/qca_7k.c
@@ -43,41 +43,40 @@ qcaspi_spi_error(struct qcaspi *qca)
 int
 qcaspi_read_register(struct qcaspi *qca, u16 reg, u16 *result)
 {
-	__be16 rx_data;
 	__be16 tx_data;
-	struct spi_transfer *transfer;
-	struct spi_message *msg;
+	struct spi_transfer transfer[2];
+	struct spi_message msg;
 	int ret;
 
+	memset(transfer, 0, sizeof(transfer));
+
+	spi_message_init(&msg);
+
 	tx_data = cpu_to_be16(QCA7K_SPI_READ | QCA7K_SPI_INTERNAL | reg);
+	*result = 0;
+
+	transfer[0].tx_buf = &tx_data;
+	transfer[0].len = QCASPI_CMD_LEN;
+	transfer[1].rx_buf = result;
+	transfer[1].len = QCASPI_CMD_LEN;
+
+	spi_message_add_tail(&transfer[0], &msg);
 
 	if (qca->legacy_mode) {
-		msg = &qca->spi_msg1;
-		transfer = &qca->spi_xfer1;
-		transfer->tx_buf = &tx_data;
-		transfer->rx_buf = NULL;
-		transfer->len = QCASPI_CMD_LEN;
-		spi_sync(qca->spi_dev, msg);
-	} else {
-		msg = &qca->spi_msg2;
-		transfer = &qca->spi_xfer2[0];
-		transfer->tx_buf = &tx_data;
-		transfer->rx_buf = NULL;
-		transfer->len = QCASPI_CMD_LEN;
-		transfer = &qca->spi_xfer2[1];
+		spi_sync(qca->spi_dev, &msg);
+		spi_message_init(&msg);
 	}
-	transfer->tx_buf = NULL;
-	transfer->rx_buf = &rx_data;
-	transfer->len = QCASPI_CMD_LEN;
-	ret = spi_sync(qca->spi_dev, msg);
+	spi_message_add_tail(&transfer[1], &msg);
+	ret = spi_sync(qca->spi_dev, &msg);
 
 	if (!ret)
-		ret = msg->status;
+		ret = msg.status;
 
-	if (ret)
+	if (ret) {
 		qcaspi_spi_error(qca);
-	else
-		*result = be16_to_cpu(rx_data);
+	} else {
+		*result = be16_to_cpu(*result);
+	}
 
 	return ret;
 }
@@ -86,35 +85,32 @@ int
 qcaspi_write_register(struct qcaspi *qca, u16 reg, u16 value)
 {
 	__be16 tx_data[2];
-	struct spi_transfer *transfer;
-	struct spi_message *msg;
+	struct spi_transfer transfer[2];
+	struct spi_message msg;
 	int ret;
 
+	memset(&transfer, 0, sizeof(transfer));
+
+	spi_message_init(&msg);
+
 	tx_data[0] = cpu_to_be16(QCA7K_SPI_WRITE | QCA7K_SPI_INTERNAL | reg);
 	tx_data[1] = cpu_to_be16(value);
 
+	transfer[0].tx_buf = &tx_data[0];
+	transfer[0].len = QCASPI_CMD_LEN;
+	transfer[1].tx_buf = &tx_data[1];
+	transfer[1].len = QCASPI_CMD_LEN;
+
+	spi_message_add_tail(&transfer[0], &msg);
 	if (qca->legacy_mode) {
-		msg = &qca->spi_msg1;
-		transfer = &qca->spi_xfer1;
-		transfer->tx_buf = &tx_data[0];
-		transfer->rx_buf = NULL;
-		transfer->len = QCASPI_CMD_LEN;
-		spi_sync(qca->spi_dev, msg);
-	} else {
-		msg = &qca->spi_msg2;
-		transfer = &qca->spi_xfer2[0];
-		transfer->tx_buf = &tx_data[0];
-		transfer->rx_buf = NULL;
-		transfer->len = QCASPI_CMD_LEN;
-		transfer = &qca->spi_xfer2[1];
+		spi_sync(qca->spi_dev, &msg);
+		spi_message_init(&msg);
 	}
-	transfer->tx_buf = &tx_data[1];
-	transfer->rx_buf = NULL;
-	transfer->len = QCASPI_CMD_LEN;
-	ret = spi_sync(qca->spi_dev, msg);
+	spi_message_add_tail(&transfer[1], &msg);
+	ret = spi_sync(qca->spi_dev, &msg);
 
 	if (!ret)
-		ret = msg->status;
+		ret = msg.status;
 
 	if (ret)
 		qcaspi_spi_error(qca);
diff --git a/drivers/net/ethernet/qualcomm/qca_spi.c b/drivers/net/ethernet/qualcomm/qca_spi.c
index 206f026..66b775d 100644
--- a/drivers/net/ethernet/qualcomm/qca_spi.c
+++ b/drivers/net/ethernet/qualcomm/qca_spi.c
@@ -99,22 +99,24 @@ static u32
 qcaspi_write_burst(struct qcaspi *qca, u8 *src, u32 len)
 {
 	__be16 cmd;
-	struct spi_message *msg = &qca->spi_msg2;
-	struct spi_transfer *transfer = &qca->spi_xfer2[0];
+	struct spi_message msg;
+	struct spi_transfer transfer[2];
 	int ret;
 
+	memset(&transfer, 0, sizeof(transfer));
+	spi_message_init(&msg);
+
 	cmd = cpu_to_be16(QCA7K_SPI_WRITE | QCA7K_SPI_EXTERNAL);
-	transfer->tx_buf = &cmd;
-	transfer->rx_buf = NULL;
-	transfer->len = QCASPI_CMD_LEN;
-	transfer = &qca->spi_xfer2[1];
-	transfer->tx_buf = src;
-	transfer->rx_buf = NULL;
-	transfer->len = len;
+	transfer[0].tx_buf = &cmd;
+	transfer[0].len = QCASPI_CMD_LEN;
+	transfer[1].tx_buf = src;
+	transfer[1].len = len;
 
-	ret = spi_sync(qca->spi_dev, msg);
+	spi_message_add_tail(&transfer[0], &msg);
+	spi_message_add_tail(&transfer[1], &msg);
+	ret = spi_sync(qca->spi_dev, &msg);
 
-	if (ret || (msg->actual_length != QCASPI_CMD_LEN + len)) {
+	if (ret || (msg.actual_length != QCASPI_CMD_LEN + len)) {
 		qcaspi_spi_error(qca);
 		return 0;
 	}
@@ -125,17 +127,20 @@ qcaspi_write_burst(struct qcaspi *qca, u8 *src, u32 len)
 static u32
 qcaspi_write_legacy(struct qcaspi *qca, u8 *src, u32 len)
 {
-	struct spi_message *msg = &qca->spi_msg1;
-	struct spi_transfer *transfer = &qca->spi_xfer1;
+	struct spi_message msg;
+	struct spi_transfer transfer;
 	int ret;
 
-	transfer->tx_buf = src;
-	transfer->rx_buf = NULL;
-	transfer->len = len;
+	memset(&transfer, 0, sizeof(transfer));
+	spi_message_init(&msg);
+
+	transfer.tx_buf = src;
+	transfer.len = len;
 
-	ret = spi_sync(qca->spi_dev, msg);
+	spi_message_add_tail(&transfer, &msg);
+	ret = spi_sync(qca->spi_dev, &msg);
 
-	if (ret || (msg->actual_length != len)) {
+	if (ret || (msg.actual_length != len)) {
 		qcaspi_spi_error(qca);
 		return 0;
 	}
@@ -146,23 +151,25 @@ qcaspi_write_legacy(struct qcaspi *qca, u8 *src, u32 len)
 static u32
 qcaspi_read_burst(struct qcaspi *qca, u8 *dst, u32 len)
 {
-	struct spi_message *msg = &qca->spi_msg2;
+	struct spi_message msg;
 	__be16 cmd;
-	struct spi_transfer *transfer = &qca->spi_xfer2[0];
+	struct spi_transfer transfer[2];
 	int ret;
 
+	memset(&transfer, 0, sizeof(transfer));
+	spi_message_init(&msg);
+
 	cmd = cpu_to_be16(QCA7K_SPI_READ | QCA7K_SPI_EXTERNAL);
-	transfer->tx_buf = &cmd;
-	transfer->rx_buf = NULL;
-	transfer->len = QCASPI_CMD_LEN;
-	transfer = &qca->spi_xfer2[1];
-	transfer->tx_buf = NULL;
-	transfer->rx_buf = dst;
-	transfer->len = len;
+	transfer[0].tx_buf = &cmd;
+	transfer[0].len = QCASPI_CMD_LEN;
+	transfer[1].rx_buf = dst;
+	transfer[1].len = len;
 
-	ret = spi_sync(qca->spi_dev, msg);
+	spi_message_add_tail(&transfer[0], &msg);
+	spi_message_add_tail(&transfer[1], &msg);
+	ret = spi_sync(qca->spi_dev, &msg);
 
-	if (ret || (msg->actual_length != QCASPI_CMD_LEN + len)) {
+	if (ret || (msg.actual_length != QCASPI_CMD_LEN + len)) {
 		qcaspi_spi_error(qca);
 		return 0;
 	}
@@ -173,17 +180,20 @@ qcaspi_read_burst(struct qcaspi *qca, u8 *dst, u32 len)
 static u32
 qcaspi_read_legacy(struct qcaspi *qca, u8 *dst, u32 len)
 {
-	struct spi_message *msg = &qca->spi_msg1;
-	struct spi_transfer *transfer = &qca->spi_xfer1;
+	struct spi_message msg;
+	struct spi_transfer transfer;
 	int ret;
 
-	transfer->tx_buf = NULL;
-	transfer->rx_buf = dst;
-	transfer->len = len;
+	memset(&transfer, 0, sizeof(transfer));
+	spi_message_init(&msg);
 
-	ret = spi_sync(qca->spi_dev, msg);
+	transfer.rx_buf = dst;
+	transfer.len = len;
 
-	if (ret || (msg->actual_length != len)) {
+	spi_message_add_tail(&transfer, &msg);
+	ret = spi_sync(qca->spi_dev, &msg);
+
+	if (ret || (msg.actual_length != len)) {
 		qcaspi_spi_error(qca);
 		return 0;
 	}
@@ -195,19 +205,23 @@ static int
 qcaspi_tx_cmd(struct qcaspi *qca, u16 cmd)
 {
 	__be16 tx_data;
-	struct spi_message *msg = &qca->spi_msg1;
-	struct spi_transfer *transfer = &qca->spi_xfer1;
+	struct spi_message msg;
+	struct spi_transfer transfer;
 	int ret;
 
+	memset(&transfer, 0, sizeof(transfer));
+
+	spi_message_init(&msg);
+
 	tx_data = cpu_to_be16(cmd);
-	transfer->len = sizeof(tx_data);
-	transfer->tx_buf = &tx_data;
-	transfer->rx_buf = NULL;
+	transfer.len = sizeof(cmd);
+	transfer.tx_buf = &tx_data;
+	spi_message_add_tail(&transfer, &msg);
 
-	ret = spi_sync(qca->spi_dev, msg);
+	ret = spi_sync(qca->spi_dev, &msg);
 
 	if (!ret)
-		ret = msg->status;
+		ret = msg.status;
 
 	if (ret)
 		qcaspi_spi_error(qca);
@@ -835,16 +849,6 @@ qcaspi_netdev_setup(struct net_device *dev)
 	qca = netdev_priv(dev);
 	memset(qca, 0, sizeof(struct qcaspi));
 
-	memset(&qca->spi_xfer1, 0, sizeof(struct spi_transfer));
-	memset(&qca->spi_xfer2, 0, sizeof(struct spi_transfer) * 2);
-
-	spi_message_init(&qca->spi_msg1);
-	spi_message_add_tail(&qca->spi_xfer1, &qca->spi_msg1);
-
-	spi_message_init(&qca->spi_msg2);
-	spi_message_add_tail(&qca->spi_xfer2[0], &qca->spi_msg2);
-	spi_message_add_tail(&qca->spi_xfer2[1], &qca->spi_msg2);
-
 	memset(&qca->txr, 0, sizeof(qca->txr));
 	qca->txr.count = TX_RING_MAX_LEN;
 }
diff --git a/drivers/net/ethernet/qualcomm/qca_spi.h b/drivers/net/ethernet/qualcomm/qca_spi.h
index fc4beb1..fc0e987 100644
--- a/drivers/net/ethernet/qualcomm/qca_spi.h
+++ b/drivers/net/ethernet/qualcomm/qca_spi.h
@@ -83,11 +83,6 @@ struct qcaspi {
 	struct tx_ring txr;
 	struct qcaspi_stats stats;
 
-	struct spi_message spi_msg1;
-	struct spi_message spi_msg2;
-	struct spi_transfer spi_xfer1;
-	struct spi_transfer spi_xfer2[2];
-
 	u8 *rx_buffer;
 	u32 buffer_size;
 	u8 sync;
-- 
2.7.4

^ permalink raw reply related

* [PATCH net] ip6_tunnel: respect ttl inherit for ip6tnl
From: Hangbin Liu @ 2018-08-31  8:52 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Hangbin Liu

man ip-tunnel ttl section says:
0 is a special value meaning that packets inherit the TTL value.

IPv4 tunnel respect this in ip_tunnel_xmit(), but IPv6 tunnel has not
implement it yet. To make IPv6 behave consistently with IP tunnel,
add ipv6 tunnel inherit support.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 net/ipv6/ip6_tunnel.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 5df2a58..419960b 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1188,7 +1188,15 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 		init_tel_txopt(&opt, encap_limit);
 		ipv6_push_frag_opts(skb, &opt.ops, &proto);
 	}
-	hop_limit = hop_limit ? : ip6_dst_hoplimit(dst);
+
+	if (hop_limit == 0) {
+		if (skb->protocol == htons(ETH_P_IP))
+			hop_limit = ip_hdr(skb)->ttl;
+		else if (skb->protocol == htons(ETH_P_IPV6))
+			hop_limit = ipv6_hdr(skb)->hop_limit;
+		else
+			hop_limit = ip6_dst_hoplimit(dst);
+	}
 
 	/* Calculate max headroom for all the headers and adjust
 	 * needed_headroom if necessary.
-- 
2.5.5

^ permalink raw reply related

* Re: [PATCH v2 1/2] IB/ipoib: Use dev_port to expose network interface port numbers
From: Arseny Maslennikov @ 2018-08-31  8:57 UTC (permalink / raw)
  To: Doug Ledford; +Cc: linux-rdma, Jason Gunthorpe, netdev
In-Reply-To: <075a5d0f9fb9dfc868e82073f930a98bea15e578.camel@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 2680 bytes --]

On Thu, Aug 30, 2018 at 04:17:58PM -0400, Doug Ledford wrote:
> On Thu, 2018-08-30 at 21:22 +0300, Arseny Maslennikov wrote:
> > Some InfiniBand network devices have multiple ports on the same PCI
> > function. This initializes the `dev_port' sysfs field of those
> > network interfaces with their port number.
> > 
> > Prior to this the kernel erroneously used the `dev_id' sysfs
> > field of those network interfaces to convey the port number to userspace.
> > 
> > The use of `dev_id' was considered correct until Linux 3.15,
> > when another field, `dev_port', was defined for this particular
> > purpose and `dev_id' was reserved for distinguishing stacked ifaces
> > (e.g: VLANs) with the same hardware address as their parent device.
> > 
> > Similar fixes to net/mlx4_en and many other drivers, which started
> > exporting this information through `dev_id' before 3.15, were accepted
> > into the kernel 4 years ago.
> > See 76a066f2a2a0 (`net/mlx4_en: Expose port number through sysfs').
> > 
> > Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru>
> > ---
> >  drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > index e3d28f9ad9c0..ba16a63ee303 100644
> > --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > @@ -1880,7 +1880,7 @@ static int ipoib_parent_init(struct net_device *ndev)
> >  	       sizeof(union ib_gid));
> >  
> >  	SET_NETDEV_DEV(priv->dev, priv->ca->dev.parent);
> > -	priv->dev->dev_id = priv->port - 1;
> > +	priv->dev->dev_port = priv->port - 1;
> 
> I don't know that we can't do this.  At least not yet.  Expose the new
> item to make us compliant with the new docs, and deprecate the old sysfs
> item, but we can't just yank the old item.  Existing tools/scripts might
> (probably) rely on it (existing tools already special case IPoIB
> interfaces and we'll need to make sure they don't special case this
> element too).

I'm good with keeping both items for a (probably long) while to not break
things. But how exactly should we notify users of the deprecation, so they
don't special case this again? A comment in the code seems too little —
everyone's obviously too busy to look there and stumble upon that.
A distinct notice in the doc seems too much. I can't think of another place
for the deprecation notice where people would take note of it, however.

Anyway: would it be OK to just restore both items and put a small note in
dev_id's doc entry? If yes, I'll then send a v3.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH RFC net-next 00/11] udp gso
From: Paolo Abeni @ 2018-08-31  9:09 UTC (permalink / raw)
  To: Willem de Bruijn, Sowmini Varadhan; +Cc: Network Development, Willem de Bruijn
In-Reply-To: <CAF=yD-+psRcT+666fCgh8f2r8Raq_D4yCsEycKrNpW-tan1-Yw@mail.gmail.com>

Hi,

On Tue, 2018-04-17 at 17:07 -0400, Willem de Bruijn wrote:
> That said, for negotiated flows an inverse GRO feature could
> conceivably be implemented to reduce rx stack traversal, too.
> Though due to interleaving of packets on the wire, it aggregation
> would be best effort, similar to TCP TSO and GRO using the
> PSH bit as packetization signal.

Reviving this old thread, before I forgot again. I have some local
patches implementing UDP GRO in a dual way to current GSO_UDP_L4
implementation: several datagram with the same length are aggregated
into a single one, and the user space receive a single larger packet
instead of multiple ones. I hope quic can leverage such scenario, but I
really know nothing about the protocol.

I measure roughly a 50% performance improvement with udpgso_bench in
respect to UDP GSO, and ~100% using a pktgen sender, and a reduced CPU
usage on the receiver[1]. Some additional hacking to the general GRO
bits is required to avoid useless socket lookups for ingress UDP
packets when UDP_GSO is not enabled.

If there is interest on this topic, I can share some RFC patches
(hopefully somewhat next week).

Cheers,

Paolo

[1] With udpgso_bench_tx, the bottle-neck is again the sender, even
with GSO enabled. With a pktgen sender, the bottle-neck become the rx
softirqd, and I see a lot of time consumed due to retpolines in the GRO
code. In both scenarios skb_release_data() becomes the topmost perf
offender for the user space process.

^ permalink raw reply

* [PATCH] rsi: remove set but not used variable 'header_size'
From: YueHaibing @ 2018-08-31  9:27 UTC (permalink / raw)
  To: Kalle Valo, Pavani Muthyala, Kees Cook, Colin Ian King,
	Amitkumar Karwar, Prameela Rani Garnepudi, Siva Rebbagondla,
	Sushant Kumar Mishra
  Cc: YueHaibing, linux-wireless, netdev, kernel-janitors

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/wireless/rsi/rsi_91x_hal.c: In function 'rsi_send_data_pkt':
drivers/net/wireless/rsi/rsi_91x_hal.c:288:5: warning:
 variable 'header_size' set but not used [-Wunused-but-set-variable]

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/wireless/rsi/rsi_91x_hal.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/wireless/rsi/rsi_91x_hal.c b/drivers/net/wireless/rsi/rsi_91x_hal.c
index 01edf96..c4fb319 100644
--- a/drivers/net/wireless/rsi/rsi_91x_hal.c
+++ b/drivers/net/wireless/rsi/rsi_91x_hal.c
@@ -285,7 +285,6 @@ int rsi_send_data_pkt(struct rsi_common *common, struct sk_buff *skb)
 	struct skb_info *tx_params;
 	struct ieee80211_bss_conf *bss;
 	int status = -EINVAL;
-	u8 header_size;
 
 	if (!skb)
 		return 0;
@@ -298,7 +297,6 @@ int rsi_send_data_pkt(struct rsi_common *common, struct sk_buff *skb)
 	vif = info->control.vif;
 	bss = &vif->bss_conf;
 	tx_params = (struct skb_info *)info->driver_data;
-	header_size = tx_params->internal_hdr_size;
 
 	if (((vif->type == NL80211_IFTYPE_STATION) ||
 	     (vif->type == NL80211_IFTYPE_P2P_CLIENT)) &&

^ permalink raw reply related

* Re: [PATCH net-next v3 02/10] net: mvpp2: phylink support
From: Antoine Tenart @ 2018-08-31 13:36 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Antoine Tenart, davem, kishon, gregory.clement, andrew, jason,
	sebastian.hesselbarth, netdev, linux-kernel, thomas.petazzoni,
	maxime.chevallier, miquel.raynal, nadavh, stefanc, ymarkman, mw,
	linux-arm-kernel
In-Reply-To: <20180827165058.GD30658@n2100.armlinux.org.uk>

Hello Russell,

On Mon, Aug 27, 2018 at 05:50:59PM +0100, Russell King - ARM Linux wrote:
> On Thu, May 17, 2018 at 10:29:31AM +0200, Antoine Tenart wrote:
> > +static void mvpp2_mac_config(struct net_device *dev, unsigned int mode,
> > +			     const struct phylink_link_state *state)
> > +{
> > +	struct mvpp2_port *port = netdev_priv(dev);
> > +
> > +	/* Check for invalid configuration */
> > +	if (state->interface == PHY_INTERFACE_MODE_10GKR && port->gop_id != 0) {
> > +		netdev_err(dev, "Invalid mode on %s\n", dev->name);
> > +		return;
> > +	}
> > +
> > +	netif_tx_stop_all_queues(port->dev);
> > +	if (!port->has_phy)
> > +		netif_carrier_off(port->dev);
> ...
> > +	/* If the port already was up, make sure it's still in the same state */
> > +	if (state->link || !port->has_phy) {
> > +		mvpp2_port_enable(port);
> > +
> > +		mvpp2_egress_enable(port);
> > +		mvpp2_ingress_enable(port);
> > +		if (!port->has_phy)
> > +			netif_carrier_on(dev);
> > +		netif_tx_wake_all_queues(dev);
> > +	}
> 
> The above appears to cause a problem when testing on a singleshot
> mcbin board - as a direct result of the above, I find that _all_
> the network devices apparently have link, including those which
> have no SFP inserted into the cage.  This results in debian jessie
> hanging at boot for over 1 minute while systemd tries to bring up
> all network devices.

OK, we should fix that. Removing the above code does work for most of
the configurations, including the double shot MacchiatoBin.

> Have you identified a case where mac_config() is called with the
> link up for which a major reconfiguration is required?  mac_config()
> will get called for small reconfigurations such as changing the
> pause parameters (which should _not_ require the queues to be
> restarted) but the link should always be down for major
> reconfigurations such sa changing the PHY interface mode.

With the above code remove one case did not worked anymore: when the
port is configured as a fixed-link because the SFP cage can't be
described and used (on the 7040-db and 8040-db boards). In such cases
phylink is called, mac_config() is called, but link_up() is never
called. I'm not sure this is actually an issue in phylink, but the PPv2
driver should probably take care of this weird case itself (by calling
explicitly link_up()). What do you think?

> mac_config() can also be called when nothing has changed - if we've
> decided to re-run the link state resolve, it can be called with the
> same parameters as previously, especially now that we have polling
> of a GPIO for link state monitoring.  With your code above, this will
> cause mvpp2 to down/up the carrier state every second.

OK, that's bad.

> Note that state->link here is the _future_ state of the link, which
> may change state before the mac_link_up()/down() functions have been
> called - turning the carrier on/off like this will prevent these
> callbacks being made, and the link state being printed by phylink.

OK. Hopefully the fix will drop all use of state->link.


I've made a patch (see below) to stop messing with the carrier link
state in mac_config(), while also fixing the fixed-link handling
directly from within the PPv2 driver. If the patch is OK for you, I'll
sent it upstream to the net tree.

-- >8 --
>From 55ee933e259095d68b9bcae21f69d672e783a813 Mon Sep 17 00:00:00 2001
From: Antoine Tenart <antoine.tenart@bootlin.com>
Date: Fri, 31 Aug 2018 14:05:02 +0200
Subject: [PATCH] net: mvpp2: let phylink manage the carrier state

Net drivers using phylink shouldn't mess with the link carrier
themselves and should let phylink manage it. The mvpp2 driver wasn't
following this best practice as the mac_config() function made calls to
change the link carrier state. This led to wrongly reported carrier link
state which, according to Russell, increased a lot the Debian Jessie
boot time as systemd tried to bring up all those interfaces.

This patch fixes this behaviour.

As parts of the driver relied on this misbehaviour from the mac_config()
function, they were fixed as well to call the link_up() function. Also,
the fixed-link configuration relied on the mac_config() function to mess
with the link carrier state. As it does not anymore, if this mode is
used the link_up() function is called explicitly.

Fixes: 4bb043262878 ("net: mvpp2: phylink support")
Reported-by: linux@armlinux.org.uk
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
---
 drivers/net/ethernet/marvell/mvpp2/mvpp2.h    |  1 +
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   | 27 +++++++++----------
 2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
index 67b9e81b7c02..c8d2bc23d21d 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
@@ -613,6 +613,7 @@
 
 /* Port flags */
 #define MVPP2_F_LOOPBACK		BIT(0)
+#define MVPP2_F_FIXED_LINK		BIT(1)
 
 /* Marvell tag types */
 enum mvpp2_tag_type {
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index 28500417843e..19d48992f213 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -58,6 +58,8 @@ static struct {
  */
 static void mvpp2_mac_config(struct net_device *dev, unsigned int mode,
 			     const struct phylink_link_state *state);
+static void mvpp2_mac_link_up(struct net_device *dev, unsigned int mode,
+			      phy_interface_t interface, struct phy_device *phy);
 
 /* Queue modes */
 #define MVPP2_QDIST_SINGLE_MODE	0
@@ -3143,6 +3145,10 @@ static void mvpp2_start_dev(struct mvpp2_port *port)
 
 	if (port->phylink) {
 		phylink_start(port->phylink);
+
+		if (!(port->flags & MVPP2_F_FIXED_LINK))
+			mvpp2_mac_link_up(port->dev, MLO_AN_FIXED,
+					  port->phy_interface, NULL);
 	} else {
 		/* Phylink isn't used as of now for ACPI, so the MAC has to be
 		 * configured manually when the interface is started. This will
@@ -3150,9 +3156,10 @@ static void mvpp2_start_dev(struct mvpp2_port *port)
 		 */
 		struct phylink_link_state state = {
 			.interface = port->phy_interface,
-			.link = 1,
 		};
 		mvpp2_mac_config(port->dev, MLO_AN_INBAND, &state);
+		mvpp2_mac_link_up(port->dev, MLO_AN_INBAND, port->phy_interface,
+				  NULL);
 	}
 
 	netif_tx_start_all_queues(port->dev);
@@ -4495,10 +4502,6 @@ static void mvpp2_mac_config(struct net_device *dev, unsigned int mode,
 		return;
 	}
 
-	netif_tx_stop_all_queues(port->dev);
-	if (!port->has_phy)
-		netif_carrier_off(port->dev);
-
 	/* Make sure the port is disabled when reconfiguring the mode */
 	mvpp2_port_disable(port);
 
@@ -4523,16 +4526,7 @@ static void mvpp2_mac_config(struct net_device *dev, unsigned int mode,
 	if (port->priv->hw_version == MVPP21 && port->flags & MVPP2_F_LOOPBACK)
 		mvpp2_port_loopback_set(port, state);
 
-	/* If the port already was up, make sure it's still in the same state */
-	if (state->link || !port->has_phy) {
-		mvpp2_port_enable(port);
-
-		mvpp2_egress_enable(port);
-		mvpp2_ingress_enable(port);
-		if (!port->has_phy)
-			netif_carrier_on(dev);
-		netif_tx_wake_all_queues(dev);
-	}
+	mvpp2_port_enable(port);
 }
 
 static void mvpp2_mac_link_up(struct net_device *dev, unsigned int mode,
@@ -4691,6 +4685,9 @@ static int mvpp2_port_probe(struct platform_device *pdev,
 	if (fwnode_property_read_bool(port_fwnode, "marvell,loopback"))
 		port->flags |= MVPP2_F_LOOPBACK;
 
+	if (fwnode_property_present(port_fwnode, "fixed-link"))
+		port->flags |= MVPP2_F_FIXED_LINK;
+
 	port->id = id;
 	if (priv->hw_version == MVPP21)
 		port->first_rxq = port->id * port->nrxqs;
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH net-next 1/3] net: rework SIOCGSTAMP ioctl handling
From: Arnd Bergmann @ 2018-08-31 14:00 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Networking, David Miller, linux-arch, y2038 Mailman List,
	Eric Dumazet, Willem de Bruijn, Linux Kernel Mailing List,
	linux-hams, Bluez mailing list, linux-can, dccp, linux-wpan,
	linux-sctp, linux-x25
In-Reply-To: <CAF=yD-+9yc838Y_DO1S4HsBpLrBk8iNt0oAj1ceO1S6sE+tCdg@mail.gmail.com>

On Fri, Aug 31, 2018 at 3:38 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> On Fri, Aug 31, 2018 at 6:31 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > On Thu, Aug 30, 2018 at 10:10 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > > On Wed, Aug 29, 2018 at 9:05 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > > If this is the only valid implementation of .gettstamp, the indirect
> > > call could be avoided in favor of a simple branch.
> >
> > I thought about that as well, but I could not come up with a
> > good way to encode the difference between socket protocols
> > that allow timestamping and those that don't.
> >
> > I think ideally we would just call sock_gettstamp() unconditonally
> > on every socket, and have that function decide whether timestamps
> > make sense or not. The part I did not understand is which ones
> > actually want the timestamps or not. Most protocols that
> > implement the ioctls also assign skb->tstamp, but there are some
> > protocols in which I could not see skb->tstamp ever being set,
> > and some that set it but don't seem to have the ioctls.
>
> These probably only use cmsgs SCM_TIMESTAMP(NS|IMG)
> to read timestamps.

Good point. FWIW, I have discussed with Deepa how those should
be modified for y2038, but we don't have any current patches for
those.

> > Looking at it again, it seems that sock_gettstamp() should
> > actually deal with this gracefully: it will return a -EINVAL
> > error condition if the timestamp remains at the
> > SK_DEFAULT_STAMP initial value, which is probably
> > just as appropriate (or better) as the current -ENOTTY
> > default, and if we are actually recording timestamps, we
> > might just as well report them.
>
> Yes, that's a nice solution. There is always some risk in changing
> error codes. But ioctl callers should be able to support newly
> implemented functionality. Even if partially implemented and
> returning ENOENT instead of ENOIOCTLCMD.

Ok, so do you think we should stay with the current version
for now, and change the two points later, or should I rework
it to integrate the locking and removing the callback?

I suppose the series actually gets nicer without the
callback, since I can simply add the generic timestamping
implementation first, and then remove the dead ioctl
handlers.

        Arnd

^ permalink raw reply

* Re: [PATCH net-next v3 02/10] net: mvpp2: phylink support
From: Russell King - ARM Linux @ 2018-08-31 14:11 UTC (permalink / raw)
  To: Antoine Tenart
  Cc: davem, kishon, gregory.clement, andrew, jason,
	sebastian.hesselbarth, netdev, linux-kernel, thomas.petazzoni,
	maxime.chevallier, miquel.raynal, nadavh, stefanc, ymarkman, mw,
	linux-arm-kernel
In-Reply-To: <20180831133651.GB32574@kwain>

On Fri, Aug 31, 2018 at 03:36:51PM +0200, Antoine Tenart wrote:
> With the above code remove one case did not worked anymore: when the
> port is configured as a fixed-link because the SFP cage can't be
> described and used (on the 7040-db and 8040-db boards). In such cases
> phylink is called, mac_config() is called, but link_up() is never
> called. I'm not sure this is actually an issue in phylink, but the PPv2
> driver should probably take care of this weird case itself (by calling
> explicitly link_up()). What do you think?

Fixed link should work:

- when a fixed link is configured, link_config.link is set true.
- when phylink_start() is called, mac_config() will be called to do the
  initial setup, and a resolve is triggered.
- phylink_resolve() will read the fixed link state, which in the case
  of no GPIO, will inherit link_config.link.
  - you will then see another mac_config() call.

Now what happens depends whether you've set the netdev's carrier state
in the driver - if you haven't, the netdev's carrier state should be
off.  Since the state mismatches the link_state.link (which will be
true), you will get a mac_link_up() call.

mvneta ensures this state by always calling netif_carrier_off() in
mvneta_open(), maybe that ought to be in phylink_start() as that's the
state that phylink expects when phylink_start() has been called.  So,
maybe it's a phylink bug.

Can you see any down-sides to moving the netif_carrier_off() in
mvneta_open() to phylink_start() ?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

^ permalink raw reply

* Re: [PATCH RFC net-next 00/11] udp gso
From: Eric Dumazet @ 2018-08-31 10:09 UTC (permalink / raw)
  To: Paolo Abeni, Willem de Bruijn, Sowmini Varadhan
  Cc: Network Development, Willem de Bruijn
In-Reply-To: <8b4de31a06d9bdb69e348f88ad0dcbf7d8576477.camel@redhat.com>

On 08/31/2018 02:09 AM, Paolo Abeni wrote:
> I hope quic can leverage such scenario, but I
> really know nothing about the protocol.
>

Most QUIC receivers are mobile phones, laptops, with wifi without GRO anyway...

Even if they had GRO, the inter-packet delay would be too high for GRO to be successful.

(vast majority of QUIC flows are < 100 Mbits because of the last mile hop)

GSO UDP is used on servers with clear gains, but there are not
really high speed receivers where GRO could be used.

^ permalink raw reply

* Re: [PATCH 04/15] soc: octeontx2: Add mailbox support infra
From: Arnd Bergmann @ 2018-08-31 14:16 UTC (permalink / raw)
  To: Sunil Kovvuri
  Cc: Linux Kernel Mailing List, Olof Johansson, Linux ARM, linux-soc,
	amakarov, sgoutham, lbartosik, Networking, David Miller
In-Reply-To: <CA+sq2CfjJjswOwypNMtro+g8TeHGoBqAeKpFrf2ys-1=Xg+qHw@mail.gmail.com>

On Thu, Aug 30, 2018 at 8:37 PM Sunil Kovvuri <sunil.kovvuri@gmail.com> wrote:
> On Thu, Aug 30, 2018 at 7:27 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > On Tue, Aug 28, 2018 at 3:23 PM Sunil Kovvuri <sunil.kovvuri@gmail.com> wrote:
> > > On Tue, Aug 28, 2018 at 6:22 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > On Tue, Aug 28, 2018 at 2:48 PM Sunil Kovvuri <sunil.kovvuri@gmail.com> wrote:
> > > Any PCI device here irrespective in what domain (kernel or userspace)
> > > they are in
> > > use common mailbox communication. Which is
> > > # Write a mailbox msg (format is agreed between all parties) into
> > > shared (between AF and other PF/VFs)
> > >    memory region and trigger a interrupt to admin function.
> > > # Admin function processes the msg and puts reply in the same memory
> > > region and trigger
> > >    IRQ to the requesting device. If the device has a driver instance
> > > in kernel then it uses
> > >    IRQ and userspace applications does polling on the IRQ status bit.
> >
> > What is the purpose of the exported interface then? Is this
> > just an abstraction so each of the drivers can talk to its own
> > mailbox using a set of common helper functions?
> >
>
> Yes, that's correct.
>
> In kernel there will be a minimum of 3 drivers which will use this
> mailbox communication.
> So instead of duplicating APIs and structures in every driver, we
> thought of adding them in this AF driver and export them to ethernet
> and crypto drivers.

Ok. My feeling is then that the API is fine, but that it should not
be part of the AF module but rather be a standalone module.

My comment about the generic mailbox API no longer applies
here: you don't have a single shared mailbox hardware interface,
but each device has its own mailbox register set, so there
is no point in setting up a separate device for it, but I see
no need for creating an artificial dependency on the AF
driver. E.g. in a virtual machine that only has one ethernet
interface, you otherwise wouldn't load that driver, right?

         Arnd

^ permalink raw reply

* Re: [PATCH net-next v3 02/10] net: mvpp2: phylink support
From: Andrew Lunn @ 2018-08-31 14:18 UTC (permalink / raw)
  To: Antoine Tenart
  Cc: Russell King - ARM Linux, davem, kishon, gregory.clement, jason,
	sebastian.hesselbarth, netdev, linux-kernel, thomas.petazzoni,
	maxime.chevallier, miquel.raynal, nadavh, stefanc, ymarkman, mw,
	linux-arm-kernel
In-Reply-To: <20180831133651.GB32574@kwain>

> @@ -4691,6 +4685,9 @@ static int mvpp2_port_probe(struct platform_device *pdev,
>  	if (fwnode_property_read_bool(port_fwnode, "marvell,loopback"))
>  		port->flags |= MVPP2_F_LOOPBACK;
>  
> +	if (fwnode_property_present(port_fwnode, "fixed-link"))
> +		port->flags |= MVPP2_F_FIXED_LINK;
> +

Hi Antoine

There is no need to go look in device tree. When phylink calls
mac_config() it passes mode == MLO_AN_FIXED, when it is using a fixed
link.

	Andrew

^ permalink raw reply

* Re: [PATCH net-next v3 02/10] net: mvpp2: phylink support
From: Antoine Tenart @ 2018-08-31 14:23 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Antoine Tenart, Russell King - ARM Linux, davem, kishon,
	gregory.clement, jason, sebastian.hesselbarth, netdev,
	linux-kernel, thomas.petazzoni, maxime.chevallier, miquel.raynal,
	nadavh, stefanc, ymarkman, mw, linux-arm-kernel
In-Reply-To: <20180831141851.GA19733@lunn.ch>

Hi Andrew,

On Fri, Aug 31, 2018 at 04:18:51PM +0200, Andrew Lunn wrote:
> > @@ -4691,6 +4685,9 @@ static int mvpp2_port_probe(struct platform_device *pdev,
> >  	if (fwnode_property_read_bool(port_fwnode, "marvell,loopback"))
> >  		port->flags |= MVPP2_F_LOOPBACK;
> >  
> > +	if (fwnode_property_present(port_fwnode, "fixed-link"))
> > +		port->flags |= MVPP2_F_FIXED_LINK;
> > +
> 
> There is no need to go look in device tree. When phylink calls
> mac_config() it passes mode == MLO_AN_FIXED, when it is using a fixed
> link.

Sure. This was a proposal to have a workaround when phylink does not
call link_up(), so that we know which mode to pass to the function.

In the meantime Russell explained an assumption about the carrier link
state made by phylink, and a proposal to avoid doing this kind of hack
in the driver. I'll test his solution which will probably means we can
drop this piece of code.

Thanks!
Antoine

-- 
Antoine Ténart, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply

* Re: [PATCH 1/2] net/ibm/emac: wrong emac_calc_base call was used by typo
From: Ivan Mikhaylov1 @ 2018-08-31 10:18 UTC (permalink / raw)
  To: Christian Lamparter; +Cc: netdev, David S . Miller, linux-kernel
In-Reply-To: <13714694.2rXNjIIGUQ@debian64>

>I do have a ot question though:
>Since you are working for IBM and probably have access to all the
>wonderful docs and the working hardware: Would you consider to be
>the emac/emac4/emac-sync maintainer? From what I can parse from
>the /MAINTAINERS file, there isn't currently anyone listed.

Unfortunately, I can't take this obligation due to some circumstances
in the future. 

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox