Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH cumulus-4.18.y] vxlan: support NTF_USE refresh of fdb entries
From: Roopa Prabhu @ 2018-10-11 19:34 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <1539286408-25597-1-git-send-email-roopa@cumulusnetworks.com>

On Thu, Oct 11, 2018 at 12:33 PM Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> This makes use of NTF_USE in vxlan driver consistent
> with bridge driver.
>
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---

pls ignore. wrong patch prefix :)

^ permalink raw reply

* [PATCH cumulus-4.18.y] vxlan: support NTF_USE refresh of fdb entries
From: Roopa Prabhu @ 2018-10-11 19:33 UTC (permalink / raw)
  To: davem; +Cc: netdev

From: Roopa Prabhu <roopa@cumulusnetworks.com>

This makes use of NTF_USE in vxlan driver consistent
with bridge driver.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 drivers/net/vxlan.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index fb0cdbb..018406c 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -697,6 +697,7 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 			    __be16 port, __be32 src_vni, __be32 vni,
 			    __u32 ifindex, __u8 ndm_flags)
 {
+	__u8 fdb_flags = (ndm_flags & ~NTF_USE);
 	struct vxlan_rdst *rd = NULL;
 	struct vxlan_fdb *f;
 	int notify = 0;
@@ -714,8 +715,8 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 			f->updated = jiffies;
 			notify = 1;
 		}
-		if (f->flags != ndm_flags) {
-			f->flags = ndm_flags;
+		if (f->flags != fdb_flags) {
+			f->flags = fdb_flags;
 			f->updated = jiffies;
 			notify = 1;
 		}
@@ -737,6 +738,9 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 				return rc;
 			notify |= rc;
 		}
+
+		if (ndm_flags & NTF_USE)
+			f->used = jiffies;
 	} else {
 		if (!(flags & NLM_F_CREATE))
 			return -ENOENT;
@@ -748,7 +752,7 @@ static int vxlan_fdb_update(struct vxlan_dev *vxlan,
 
 		netdev_dbg(vxlan->dev, "add %pM -> %pIS\n", mac, ip);
 		rc = vxlan_fdb_create(vxlan, mac, ip, state, port, src_vni,
-				      vni, ifindex, ndm_flags, &f);
+				      vni, ifindex, fdb_flags, &f);
 		if (rc < 0)
 			return rc;
 		notify = 1;
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH net-next 0/9] net: Kernel side filtering for route dumps
From: Sowmini Varadhan @ 2018-10-11 19:32 UTC (permalink / raw)
  To: David Miller; +Cc: dsahern, jhs, stephen, dsahern, netdev
In-Reply-To: <20181011.122847.1015772734131510783.davem@davemloft.net>

Without getting into Ahern's patchset, which he obviously feels 
quite passionately about..

On (10/11/18 12:28), David Miller wrote:
> 
> Once you've composed the message, the whole point of filtering is lost.

it would be nice to apply the filter *before* constructing the skb, 
but afaict most things in BPF today only operate on sk_buffs. How should
we use *BPF on something other than an sk_buff?

--Sowmini

^ permalink raw reply

* Re: [PATCH net-next 0/9] net: Kernel side filtering for route dumps
From: David Miller @ 2018-10-11 19:28 UTC (permalink / raw)
  To: dsahern; +Cc: jhs, sowmini.varadhan, stephen, dsahern, netdev
In-Reply-To: <199349c3-fd14-1401-86d1-814776372917@gmail.com>

From: David Ahern <dsahern@gmail.com>
Date: Thu, 11 Oct 2018 12:44:49 -0600

> I disagree with your overall premise of bpf the end-all hammer. It is a
> tool but not the only tool. For starters, you are proposing building the
> message, run the filter on it, and potentially back the message up to
> drop the recently added piece because the filter does not want it
> included. That is still wasting a lot of cpu cycles to build and drop. I
> am thinking about scale to 1 million routes -- I do not need the dump
> loop building a message for 1 million entries only to drop 99% of them.
> That is crazy.

I completely agree.

Once you've composed the message, the whole point of filtering is lost.

^ permalink raw reply

* Re: [PATCH net] net: udp: fix handling of CHECKSUM_COMPLETE packets
From: Eric Dumazet @ 2018-10-11 19:20 UTC (permalink / raw)
  To: Sean Tranchetti, davem, netdev
In-Reply-To: <1539282601-19795-1-git-send-email-stranche@codeaurora.org>



On 10/11/2018 11:30 AM, Sean Tranchetti wrote:
> Current handling of CHECKSUM_COMPLETE packets by the UDP stack is
> incorrect for any packet that has an incorrect checksum value.
> 
> udp4/6_csum_init() will both make a call to
> __skb_checksum_validate_complete() to initialize/validate the csum
> field when receiving a CHECKSUM_COMPLETE packet. When this packet
> fails validation, skb->csum will be overwritten with the pseudoheader
> checksum so the packet can be fully validated by software, but the
> skb->ip_summed value will be left as CHECKSUM_COMPLETE so that way
> the stack can later warn the user about their hardware spewing bad
> checksums. Unfortunately, leaving the SKB in this state can cause
> problems later on in the checksum calculation.
> 
> Since the the packet is still marked as CHECKSUM_COMPLETE,
> udp_csum_pull_header() will SUBTRACT the checksum of the UDP header
> from skb->csum instead of adding it, leaving us with a garbage value
> in that field. Once we try to copy the packet to userspace in the
> udp4/6_recvmsg(), we'll make a call to skb_copy_and_csum_datagram_msg()
> to checksum the packet data and add it in the garbage skb->csum value
> to perform our final validation check.
> 
> Since the value we're validating is not the proper checksum, it's possible
> that the folded value could come out to 0, causing us not to drop the
> packet. Instead, we believe that the packet was checksummed incorrectly
> by hardware since skb->ip_summed is still CHECKSUM_COMPLETE, and we attempt
> to warn the user with netdev_rx_csum_fault(skb->dev);
> 
> Unfortunately, since this is the UDP path, skb->dev has been overwritten
> by skb->dev_scratch and is no longer a valid pointer, so we end up
> reading invalid memory.
> 
> This patch addresses this problem in two ways:
> 	1) Remove the invalid netdev_rx_csum_fault(skb->dev) call from
> 	   skb_copy_and_csum_datagram_msg(). Since this is used by UDP
> 	   where skb->dev is invalid, trying to warn doesn't make sense.
> 
> 	2) Add better CHECKSUM_COMPLETE handling to udp4/6_csum_init().
> 	   If we receive a packet that's CHECKSUM_COMPLETE that fails
> 	   verification (i.e. skb->csum_valid == 0), check who performed
> 	   the calculation. It's possible that the checksum was done in
> 	   software by the network stack earlier (such as Netfilter's
> 	   CONNTRACK module), and if that says the checksum is bad,
> 	   we can drop the packet immediately instead of waiting until
> 	   we try and copy it to userspace. Otherwise, we need to
> 	   mark the SKB as CHECKSUM_NONE, since the skb->csum field
> 	   no longer contains the full packet checksum after the
> 	   call to __skb_checksum_validate_complete().
> 
> Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>

It would be nice if you could provide a Fixes: tag, so that we know what kind
of backport work is needed.

You could also CC the patch author, it is always a nice thing to do.

If really we could emit a message using skb->dev after skb has been queued
and RCU section gone, this always has been buggy, and maybe we should not
even try to use skb->dev when warning is sent.

Alternative would be to use dev index and perform a lookup to find the device,
if device still exists.

Thanks.

^ permalink raw reply

* Re: [PATCH net-next] nfp: replace long license headers with SPDX
From: David Miller @ 2018-10-11 19:16 UTC (permalink / raw)
  To: jakub.kicinski
  Cc: netdev, oss-drivers, simon.horman, edwin.peer, nick.viljoen
In-Reply-To: <20181011155742.19633-1-jakub.kicinski@netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Thu, 11 Oct 2018 08:57:42 -0700

> Replace the repeated license text with SDPX identifiers.
> While at it bump the Copyright dates for files we touched
> this year.
> 
> Signed-off-by: Edwin Peer <edwin.peer@netronome.com>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com>
> Reviewed-by: Simon Horman <simon.horman@netronome.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: phy: sfp: remove sfp_mutex's definition
From: David Miller @ 2018-10-11 19:10 UTC (permalink / raw)
  To: bigeasy; +Cc: netdev, tglx, andrew, f.fainelli
In-Reply-To: <20181011150621.8466-1-bigeasy@linutronix.de>

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Thu, 11 Oct 2018 17:06:21 +0200

> The sfp_mutex variable is defined but never used in this file. Not even
> in the commit that introduced that variable.
> 
> Remove sfp_mutex, it has no purpose.
> 
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Florian Fainelli <f.fainelli@gmail.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Applied.

^ permalink raw reply

* [PATCH net-next] net: fddi: skfp: Remove unused macros 'PNMI_GET_ID' and 'PNMI_SET_ID'
From: YueHaibing @ 2018-10-12  2:37 UTC (permalink / raw)
  To: davem, natechancellor; +Cc: linux-kernel, netdev, YueHaibing

The two PNMI macros are never used

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/fddi/skfp/h/cmtdef.h | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/net/fddi/skfp/h/cmtdef.h b/drivers/net/fddi/skfp/h/cmtdef.h
index a12f464..448d66c 100644
--- a/drivers/net/fddi/skfp/h/cmtdef.h
+++ b/drivers/net/fddi/skfp/h/cmtdef.h
@@ -655,14 +655,6 @@ void dump_hex(char *p, int len);
 #ifndef PNMI_INIT
 #define	PNMI_INIT(smc)	/* Nothing */
 #endif
-#ifndef PNMI_GET_ID
-#define PNMI_GET_ID( smc, ndis_oid, buf, len, BytesWritten, BytesNeeded ) \
-		( 1 ? (-1) : (-1) )
-#endif
-#ifndef PNMI_SET_ID
-#define PNMI_SET_ID( smc, ndis_oid, buf, len, BytesRead, BytesNeeded, \
-		set_type) ( 1 ? (-1) : (-1) )
-#endif
 
 /*
  * SMT_PANIC defines
-- 
2.7.0

^ permalink raw reply related

* Re: [PATCH 1/1] net: socionext: clear rx irq correctly
From: David Miller @ 2018-10-11 19:04 UTC (permalink / raw)
  To: ilias.apalodimas
  Cc: netdev, jaswinder.singh, ard.biesheuvel, masami.hiramatsu
In-Reply-To: <1539260906-29596-1-git-send-email-ilias.apalodimas@linaro.org>

From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Date: Thu, 11 Oct 2018 15:28:26 +0300

> commit 63ae7949e94a ("net: socionext: Use descriptor info instead of MMIO reads on Rx")
> removed constant mmio reads from the driver and started using a descriptor 
> field to check if packet should be processed.
> This lead the napi rx handler being constantly called while no packets
> needed processing and ksoftirq getting 100% cpu usage. Issue one mmio read 
> to clear the irq correcty after processing packets
> 
> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Applied.

^ permalink raw reply

* Re: [RFC PATCH 2/2] net/ncsi: Configure multi-package, multi-channel modes with failover
From: Samuel Mendoza-Jonas @ 2018-10-12  2:24 UTC (permalink / raw)
  To: Justin.Lee1, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <e20a7a3da4ca4aec913800355cd6cdaf@AUSX13MPS302.AMER.DELL.COM>

On Thu, 2018-10-11 at 22:09 +0000, Justin.Lee1@Dell.com wrote:
> Hi Sam,
> 
> Please see my comments below.
> 
> Thanks,
> Justin
>  
>  
> > On Wed, 2018-10-10 at 22:36 +0000, Justin.Lee1@Dell.com wrote:
> > > Hi Samuel,
> > > 
> > > I am still testing your change and have some comments below.
> > > 
> > > Thanks,
> > > Justin
> > > 
> > > 
> > > > This patch extends the ncsi-netlink interface with two new commands and
> > > > three new attributes to configure multiple packages and/or channels at
> > > > once, and configure specific failover modes.
> > > > 
> > > > NCSI_CMD_SET_PACKAGE mask and NCSI_CMD_SET_CHANNEL_MASK set a whitelist
> > > > of packages or channels allowed to be configured with the
> > > > NCSI_ATTR_PACKAGE_MASK and NCSI_ATTR_CHANNEL_MASK attributes
> > > > respectively. If one of these whitelists is set only packages or
> > > > channels matching the whitelist are considered for the channel queue in
> > > > ncsi_choose_active_channel().
> > > > 
> > > > These commands may also use the NCSI_ATTR_MULTI_FLAG to signal that
> > > > multiple packages or channels may be configured simultaneously. NCSI
> > > > hardware arbitration (HWA) must be available in order to enable
> > > > multi-package mode. Multi-channel mode is always available.
> > > > 
> > > > If the NCSI_ATTR_CHANNEL_ID attribute is present in the
> > > > NCSI_CMD_SET_CHANNEL_MASK command the it sets the preferred channel as
> > > > with the NCSI_CMD_SET_INTERFACE command. The combination of preferred
> > > > channel and channel whitelist defines a primary channel and the allowed
> > > > failover channels.
> > > > If the NCSI_ATTR_MULTI_FLAG attribute is also present then the preferred
> > > > channel is configured for Tx/Rx and the other channels are enabled only
> > > > for Rx.
> > > > 
> > > > Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
> > > > ---
> > > >  include/uapi/linux/ncsi.h |  16 +++
> > > >  net/ncsi/internal.h       |  11 +-
> > > >  net/ncsi/ncsi-aen.c       |   2 +-
> > > >  net/ncsi/ncsi-manage.c    | 138 ++++++++++++++++--------
> > > >  net/ncsi/ncsi-netlink.c   | 217 +++++++++++++++++++++++++++++++++-----
> > > >  net/ncsi/ncsi-rsp.c       |   2 +-
> > > >  6 files changed, 312 insertions(+), 74 deletions(-)
> > > > 
> > > > diff --git a/include/uapi/linux/ncsi.h b/include/uapi/linux/ncsi.h
> > > > index 4c292ecbb748..035fba1693f9 100644
> > > > --- a/include/uapi/linux/ncsi.h
> > > > +++ b/include/uapi/linux/ncsi.h
> > > > @@ -23,6 +23,13 @@
> > > >   *	optionally the preferred NCSI_ATTR_CHANNEL_ID.
> > > >   * @NCSI_CMD_CLEAR_INTERFACE: clear any preferred package/channel combination.
> > > >   *	Requires NCSI_ATTR_IFINDEX.
> > > > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
> > > > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > > > + *	Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
> > > > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > > > + *	Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
> > > > + *	NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
> > > > + *	the primary channel.
> > > >   * @NCSI_CMD_MAX: highest command number
> > > >   */
> > > 
> > > There are some typo in the description.
> > > * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
> > >  *	Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
> > >  * @NCSI_CMD_SET_CHANNEL_MASK: set a whitelist of allowed channels.
> > >  *	Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
> > >  *	NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
> > >  *	the primary channel.
> > 
> > Haha, yes I threw that in at the end, thanks for catching it.
> > 
> > > >  enum ncsi_nl_commands {
> > > > @@ -30,6 +37,8 @@ enum ncsi_nl_commands {
> > > >  	NCSI_CMD_PKG_INFO,
> > > >  	NCSI_CMD_SET_INTERFACE,
> > > >  	NCSI_CMD_CLEAR_INTERFACE,
> > > > +	NCSI_CMD_SET_PACKAGE_MASK,
> > > > +	NCSI_CMD_SET_CHANNEL_MASK,
> > > >  
> > > >  	__NCSI_CMD_AFTER_LAST,
> > > >  	NCSI_CMD_MAX = __NCSI_CMD_AFTER_LAST - 1
> > > > @@ -43,6 +52,10 @@ enum ncsi_nl_commands {
> > > >   * @NCSI_ATTR_PACKAGE_LIST: nested array of NCSI_PKG_ATTR attributes
> > > >   * @NCSI_ATTR_PACKAGE_ID: package ID
> > > >   * @NCSI_ATTR_CHANNEL_ID: channel ID
> > > > + * @NCSI_ATTR_MULTI_FLAG: flag to signal that multi-mode should be enabled with
> > > > + *	NCSI_CMD_SET_PACKAGE_MASK or NCSI_CMD_SET_CHANNEL_MASK.
> > > > + * @NCSI_ATTR_PACKAGE_MASK: 32-bit mask of allowed packages.
> > > > + * @NCSI_ATTR_CHANNEL_MASK: 32-bit mask of allowed channels.
> > > >   * @NCSI_ATTR_MAX: highest attribute number
> > > >   */
> > > >  enum ncsi_nl_attrs {
> > > > @@ -51,6 +64,9 @@ enum ncsi_nl_attrs {
> > > >  	NCSI_ATTR_PACKAGE_LIST,
> > > >  	NCSI_ATTR_PACKAGE_ID,
> > > >  	NCSI_ATTR_CHANNEL_ID,
> > > > +	NCSI_ATTR_MULTI_FLAG,
> > > > +	NCSI_ATTR_PACKAGE_MASK,
> > > > +	NCSI_ATTR_CHANNEL_MASK,
> > > 
> > > Is there a case that we might set these two masks at the same time?
> > > If not, maybe we can just have one generic MASK attribute.
> > > 
> > 
> > I thought of this too: not yet, but I wonder if we might in the future.
> > I'll have a think about it.
> > 
> > > >  
> > > >  	__NCSI_ATTR_AFTER_LAST,
> > > >  	NCSI_ATTR_MAX = __NCSI_ATTR_AFTER_LAST - 1
> > > > diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
> > > > index 3d0a33b874f5..8437474d0a78 100644
> > > > --- a/net/ncsi/internal.h
> > > > +++ b/net/ncsi/internal.h
> > > > @@ -213,6 +213,10 @@ struct ncsi_package {
> > > >  	unsigned int         channel_num; /* Number of channels     */
> > > >  	struct list_head     channels;    /* List of chanels        */
> > > >  	struct list_head     node;        /* Form list of packages  */
> > > > +
> > > > +	bool                 multi_channel; /* Enable multiple channels  */
> > > > +	u32                  channel_whitelist; /* Channels to configure */
> > > > +	struct ncsi_channel  *preferred_channel; /* Primary channel      */
> > > >  };
> > > >  
> > > >  struct ncsi_request {
> > > > @@ -280,8 +284,6 @@ struct ncsi_dev_priv {
> > > >  	unsigned int        package_num;     /* Number of packages         */
> > > >  	struct list_head    packages;        /* List of packages           */
> > > >  	struct ncsi_channel *hot_channel;    /* Channel was ever active    */
> > > > -	struct ncsi_package *force_package;  /* Force a specific package   */
> > > > -	struct ncsi_channel *force_channel;  /* Force a specific channel   */
> > > >  	struct ncsi_request requests[256];   /* Request table              */
> > > >  	unsigned int        request_id;      /* Last used request ID       */
> > > >  #define NCSI_REQ_START_IDX	1
> > > > @@ -294,6 +296,9 @@ struct ncsi_dev_priv {
> > > >  	struct list_head    node;            /* Form NCSI device list      */
> > > >  #define NCSI_MAX_VLAN_VIDS	15
> > > >  	struct list_head    vlan_vids;       /* List of active VLAN IDs */
> > > > +
> > > > +	bool                multi_package;   /* Enable multiple packages   */
> > > > +	u32                 package_whitelist; /* Packages to configure    */
> > > >  };
> > > >  
> > > >  struct ncsi_cmd_arg {
> > > > @@ -345,6 +350,8 @@ struct ncsi_request *ncsi_alloc_request(struct ncsi_dev_priv *ndp,
> > > >  void ncsi_free_request(struct ncsi_request *nr);
> > > >  struct ncsi_dev *ncsi_find_dev(struct net_device *dev);
> > > >  int ncsi_process_next_channel(struct ncsi_dev_priv *ndp);
> > > > +bool ncsi_channel_is_last(struct ncsi_dev_priv *ndp,
> > > > +			  struct ncsi_channel *channel);
> > > >  
> > > >  /* Packet handlers */
> > > >  u32 ncsi_calculate_checksum(unsigned char *data, int len);
> > > > diff --git a/net/ncsi/ncsi-aen.c b/net/ncsi/ncsi-aen.c
> > > > index 65f47a648be3..eac56aee30c4 100644
> > > > --- a/net/ncsi/ncsi-aen.c
> > > > +++ b/net/ncsi/ncsi-aen.c
> > > > @@ -86,7 +86,7 @@ static int ncsi_aen_handler_lsc(struct ncsi_dev_priv *ndp,
> > > >  	    !(state == NCSI_CHANNEL_ACTIVE && !(data & 0x1)))
> > > >  		return 0;
> > > >  
> > > > -	if (state == NCSI_CHANNEL_ACTIVE)
> > > > +	if (state == NCSI_CHANNEL_ACTIVE && ncsi_channel_is_last(ndp, nc))
> > > >  		ndp->flags |= NCSI_DEV_RESHUFFLE;
> > > >  
> > > >  	ncsi_stop_channel_monitor(nc);
> > > > diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
> > > > index 665bee25ec44..6a55df700bcb 100644
> > > > --- a/net/ncsi/ncsi-manage.c
> > > > +++ b/net/ncsi/ncsi-manage.c
> > > > @@ -27,6 +27,24 @@
> > > >  LIST_HEAD(ncsi_dev_list);
> > > >  DEFINE_SPINLOCK(ncsi_dev_lock);
> > > >  
> > > > +/* Returns true if the given channel is the last channel available */
> > > > +bool ncsi_channel_is_last(struct ncsi_dev_priv *ndp,
> > > > +			  struct ncsi_channel *channel)
> > > > +{
> > > > +	struct ncsi_package *np;
> > > > +	struct ncsi_channel *nc;
> > > > +
> > > > +	NCSI_FOR_EACH_PACKAGE(ndp, np)
> > > > +		NCSI_FOR_EACH_CHANNEL(np, nc) {
> > > > +			if (nc == channel)
> > > > +				continue;
> > > > +			if (nc->state == NCSI_CHANNEL_ACTIVE)
> > > > +				return false;
> > > > +		}
> > > > +
> > > > +	return true;
> > > > +}
> > > > +
> > > >  static void ncsi_report_link(struct ncsi_dev_priv *ndp, bool force_down)
> > > >  {
> > > >  	struct ncsi_dev *nd = &ndp->ndev;
> > > > @@ -266,6 +284,7 @@ struct ncsi_package *ncsi_add_package(struct ncsi_dev_priv *ndp,
> > > >  	np->ndp = ndp;
> > > >  	spin_lock_init(&np->lock);
> > > >  	INIT_LIST_HEAD(&np->channels);
> > > > +	np->channel_whitelist = UINT_MAX;
> > > >  
> > > >  	spin_lock_irqsave(&ndp->lock, flags);
> > > >  	tmp = ncsi_find_package(ndp, id);
> > > > @@ -633,6 +652,34 @@ static int set_one_vid(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc,
> > > >  	return 0;
> > > >  }
> > > >  
> > > > +/* Determine if a given channel should be the Tx channel */
> > > > +bool ncsi_channel_is_tx(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc)
> > > > +{
> > > > +	struct ncsi_package *np = nc->package;
> > > > +	struct ncsi_channel *channel;
> > > > +	struct ncsi_channel_mode *ncm;
> 
> Learn from Dave. All local variable declarations from longest to shortest
> line. :)
> 
> > > > +
> > > > +	NCSI_FOR_EACH_CHANNEL(np, channel) {
> > > > +		ncm = &channel->modes[NCSI_MODE_TX_ENABLE];
> > > > +		/* Another channel is already Tx */
> > > > +		if (ncm->enable)
> > > > +			return false;
> > > > +	}
> > > > +
> > > > +	if (!np->preferred_channel)
> > > > +		return true;
> > > > +
> > > > +	if (np->preferred_channel == nc)
> > > > +		return true;
> > > > +
> > > > +	/* The preferred channel is not in the queue and not active */
> > > > +	if (list_empty(&np->preferred_channel->link) &&
> > > > +	    np->preferred_channel->state != NCSI_CHANNEL_ACTIVE)
> > > > +		return true;
> > > > +
> > > > +	return false;
> > > > +}
> > > > +
> > > >  static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> > > >  {
> > > >  	struct ncsi_dev *nd = &ndp->ndev;
> > > > @@ -745,18 +792,22 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> > > >  		} else if (nd->state == ncsi_dev_state_config_ebf) {
> > > >  			nca.type = NCSI_PKT_CMD_EBF;
> > > >  			nca.dwords[0] = nc->caps[NCSI_CAP_BC].cap;
> > > > -			nd->state = ncsi_dev_state_config_ecnt;
> > > > +			if (ncsi_channel_is_tx(ndp, nc))
> > > > +				nd->state = ncsi_dev_state_config_ecnt;
> > > > +			else
> > > > +				nd->state = ncsi_dev_state_config_ec;
> > > >  #if IS_ENABLED(CONFIG_IPV6)
> > > >  			if (ndp->inet6_addr_num > 0 &&
> > > >  			    (nc->caps[NCSI_CAP_GENERIC].cap &
> > > >  			     NCSI_CAP_GENERIC_MC))
> > > >  				nd->state = ncsi_dev_state_config_egmf;
> > > > -			else
> > > > -				nd->state = ncsi_dev_state_config_ecnt;
> > > >  		} else if (nd->state == ncsi_dev_state_config_egmf) {
> > > >  			nca.type = NCSI_PKT_CMD_EGMF;
> > > >  			nca.dwords[0] = nc->caps[NCSI_CAP_MC].cap;
> > > > -			nd->state = ncsi_dev_state_config_ecnt;
> > > > +			if (ncsi_channel_is_tx(ndp, nc))
> > > > +				nd->state = ncsi_dev_state_config_ecnt;
> > > > +			else
> > > > +				nd->state = ncsi_dev_state_config_ec;
> > > >  #endif /* CONFIG_IPV6 */
> > > >  		} else if (nd->state == ncsi_dev_state_config_ecnt) {
> > > >  			nca.type = NCSI_PKT_CMD_ECNT;
> > > > @@ -840,43 +891,35 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> > > >  
> > > >  static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> > > >  {
> > > > -	struct ncsi_package *np, *force_package;
> > > > -	struct ncsi_channel *nc, *found, *hot_nc, *force_channel;
> > > > +	struct ncsi_package *np;
> > > > +	struct ncsi_channel *nc, *found, *hot_nc;
> > > >  	struct ncsi_channel_mode *ncm;
> > > > -	unsigned long flags;
> > > > +	unsigned long flags, cflags;
> > > > +	bool with_link;
> 
> All local variable declarations from longest to shortest
> line.
> 
> > > >  
> > > >  	spin_lock_irqsave(&ndp->lock, flags);
> > > >  	hot_nc = ndp->hot_channel;
> > > > -	force_channel = ndp->force_channel;
> > > > -	force_package = ndp->force_package;
> > > >  	spin_unlock_irqrestore(&ndp->lock, flags);
> > > >  
> > > > -	/* Force a specific channel whether or not it has link if we have been
> > > > -	 * configured to do so
> > > > -	 */
> > > > -	if (force_package && force_channel) {
> > > > -		found = force_channel;
> > > > -		ncm = &found->modes[NCSI_MODE_LINK];
> > > > -		if (!(ncm->data[2] & 0x1))
> > > > -			netdev_info(ndp->ndev.dev,
> > > > -				    "NCSI: Channel %u forced, but it is link down\n",
> > > > -				    found->id);
> > > > -		goto out;
> > > > -	}
> > > > -
> > > > -	/* The search is done once an inactive channel with up
> > > > -	 * link is found.
> > > > +	/* By default the search is done once an inactive channel with up
> > > > +	 * link is found, unless a preferred channel is set.
> > > > +	 * If multi_package or multi_channel are configured all channels in the
> > > > +	 * whitelist with link are added to the channel queue.
> > > >  	 */
> > > >  	found = NULL;
> > > > +	with_link = false;
> > > >  	NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > > > -		if (ndp->force_package && np != ndp->force_package)
> > > > +		if (!(ndp->package_whitelist & (0x1 << np->id)))
> > > >  			continue;
> > > >  		NCSI_FOR_EACH_CHANNEL(np, nc) {
> > > > -			spin_lock_irqsave(&nc->lock, flags);
> > > > +			if (!(np->channel_whitelist & (0x1 << nc->id)))
> > > > +				continue;
> > > > +
> > > > +			spin_lock_irqsave(&nc->lock, cflags);
> > > >  
> > > >  			if (!list_empty(&nc->link) ||
> > > >  			    nc->state != NCSI_CHANNEL_INACTIVE) {
> > > > -				spin_unlock_irqrestore(&nc->lock, flags);
> > > > +				spin_unlock_irqrestore(&nc->lock, cflags);
> > > >  				continue;
> > > >  			}
> > > >  
> > > > @@ -888,32 +931,42 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> > > >  
> > > >  			ncm = &nc->modes[NCSI_MODE_LINK];
> > > >  			if (ncm->data[2] & 0x1) {
> > > 
> > > This data will not be updated if the channel monitor for it is not running.
> > > If I move the cable from the current configured channel to the other channel,
> > > NC-SI module will not detect the link status as the other channel is not configured
> > > and AEN will not happen.
> > > Is it per design that NC-SI module will always use the first interface with the link?
> > 
> > We'll check the link state of every channel at init, and per-package on
> > suspend events, but you're right that we only get AENs right now from
> > configured channels. There's probably no reason why we could at least
> > enable AENs on all channels even if we don't use them for network, maybe
> > during the package probe step.
> > 
> 
> To receive the AEN packet, the channel (RX) needs to be enabled.
> It seems that we need to send Get Link Status command to all channels before choosing
> The active channel.

We mostly already do a GLS before this; either we've just started the
NCSI device in which case we sent a GLS to every package as part of
probing, or some other event has occured and we've gone through
ncsi_suspend_channel() which will do ncsi_dev_state_suspend_gls.
However there are some gaps here, for example the
ncsi_stop_dev()/ncsi_start_dev() path won't cause GLS commands to be sent
so we'll have stale information. Continued below..

> 
> > > > -				spin_unlock_irqrestore(&nc->lock, flags);
> > > >  				found = nc;
> > > > -				goto out;
> > > > +				with_link = true;
> > > > +
> > > > +				spin_lock_irqsave(&ndp->lock, flags);
> > > > +				list_add_tail_rcu(&found->link,
> > > > +						  &ndp->channel_queue);
> > > > +				spin_unlock_irqrestore(&ndp->lock, flags);
> > > > +
> > > > +				netdev_dbg(ndp->ndev.dev,
> > > > +					   "NCSI: Channel %u added to queue (link %s)\n",
> > > > +					   found->id,
> > > > +					   ncm->data[2] & 0x1 ? "up" : "down");
> > > >  			}
> 
> If multi_channel is enabled, the interface without link still needs to be processed.
> Something likes below?
> 			} else if (np->multi_channel) {
> 				found = nc;
> 
> 				spin_lock_irqsave(&ndp->lock, flags);
> 				list_add_tail_rcu(&found->link,
> 						  &ndp->channel_queue);
> 				spin_unlock_irqrestore(&ndp->lock, flags);
> 
> 				netdev_dbg(ndp->ndev.dev,
> 					   "NCSI: pkg %u ch %u added to queue (link %s)\n",
> 					   found->package->id,
> 					   found->id,
> 					   ncm->data[2] & 0x1 ? "up" : "down");
> 			}

Yes we should just configure every channel in the whitelist if
multi_channel is set - this gives us AENs for that channel and we'll
recognise when it goes link up.

It would be good to have a better way of "bouncing" the NCSI
configuration in these cases though, especially to include a re-probe of
channel states. I'll include something like that for this series.

> 
> > > > +			spin_unlock_irqrestore(&nc->lock, cflags);
> > > >  
> > > > -			spin_unlock_irqrestore(&nc->lock, flags);
> > > > +			if (with_link && !np->multi_channel)
> > > > +				break;
> > > >  		}
> > > > +		if (with_link && !ndp->multi_package)
> > > > +			break;
> > > >  	}
> > > >  
> > > > -	if (!found) {
> > > > +	if (!with_link && found) {
> > > > +		netdev_info(ndp->ndev.dev,
> > > > +			    "NCSI: No channel with link found, configuring channel %u\n",
> > > > +			    found->id);
> > > > +		spin_lock_irqsave(&ndp->lock, flags);
> > > > +		list_add_tail_rcu(&found->link, &ndp->channel_queue);
> > > > +		spin_unlock_irqrestore(&ndp->lock, flags);
> > > > +	} else if (!found) {
> > > >  		netdev_warn(ndp->ndev.dev,
> > > > -			    "NCSI: No channel found with link\n");
> > > > +			    "NCSI: No channel found to configure!\n");
> > > >  		ncsi_report_link(ndp, true);
> > > >  		return -ENODEV;
> > > >  	}
> > > >  
> > > > -	ncm = &found->modes[NCSI_MODE_LINK];
> > > > -	netdev_dbg(ndp->ndev.dev,
> > > > -		   "NCSI: Channel %u added to queue (link %s)\n",
> > > > -		   found->id, ncm->data[2] & 0x1 ? "up" : "down");
> > > > -
> > > > -out:
> > > > -	spin_lock_irqsave(&ndp->lock, flags);
> > > > -	list_add_tail_rcu(&found->link, &ndp->channel_queue);
> > > > -	spin_unlock_irqrestore(&ndp->lock, flags);
> > > > -
> > > >  	return ncsi_process_next_channel(ndp);
> > > >  }
> > > >  
> > > > @@ -1428,6 +1481,7 @@ struct ncsi_dev *ncsi_register_dev(struct net_device *dev,
> > > >  	INIT_LIST_HEAD(&ndp->channel_queue);
> > > >  	INIT_LIST_HEAD(&ndp->vlan_vids);
> > > >  	INIT_WORK(&ndp->work, ncsi_dev_work);
> > > > +	ndp->package_whitelist = UINT_MAX;
> > > >  
> > > >  	/* Initialize private NCSI device */
> > > >  	spin_lock_init(&ndp->lock);
> > > > diff --git a/net/ncsi/ncsi-netlink.c b/net/ncsi/ncsi-netlink.c
> > > > index 32cb7751d216..33a091e6f466 100644
> > > > --- a/net/ncsi/ncsi-netlink.c
> > > > +++ b/net/ncsi/ncsi-netlink.c
> > > 
> > > Is the following missed in the patch?
> > > static const struct nla_policy ncsi_genl_policy[NCSI_ATTR_MAX + 1] = {
> > > ...
> > > 	[NCSI_ATTR_MULTI_FLAG] =	{ .type = NLA_FLAG },
> > > 	[NCSI_ATTR_PACKAGE_MASK] =	{ .type = NLA_U32 },
> > > 	[NCSI_ATTR_CHANNEL_MASK] =	{ .type = NLA_U32 },
> > 
> > Oops, added.
> > 
> > > > @@ -67,7 +67,7 @@ static int ncsi_write_channel_info(struct sk_buff *skb,
> > > >  	nla_put_u32(skb, NCSI_CHANNEL_ATTR_LINK_STATE, m->data[2]);
> > > >  	if (nc->state == NCSI_CHANNEL_ACTIVE)
> > > >  		nla_put_flag(skb, NCSI_CHANNEL_ATTR_ACTIVE);
> > > > -	if (ndp->force_channel == nc)
> > > > +	if (nc == nc->package->preferred_channel)
> > > >  		nla_put_flag(skb, NCSI_CHANNEL_ATTR_FORCED);
> > > >  
> > > >  	nla_put_u32(skb, NCSI_CHANNEL_ATTR_VERSION_MAJOR, nc->version.version);
> > > > @@ -112,7 +112,7 @@ static int ncsi_write_package_info(struct sk_buff *skb,
> > > >  		if (!pnest)
> > > >  			return -ENOMEM;
> > > >  		nla_put_u32(skb, NCSI_PKG_ATTR_ID, np->id);
> > > > -		if (ndp->force_package == np)
> > > > +		if ((0x1 << np->id) == ndp->package_whitelist)
> > > >  			nla_put_flag(skb, NCSI_PKG_ATTR_FORCED);
> > > >  		cnest = nla_nest_start(skb, NCSI_PKG_ATTR_CHANNEL_LIST);
> > > >  		if (!cnest) {
> > > > @@ -288,45 +288,54 @@ static int ncsi_set_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > > >  	package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
> > > >  	package = NULL;
> > > >  
> > > > -	spin_lock_irqsave(&ndp->lock, flags);
> > > > -
> > > >  	NCSI_FOR_EACH_PACKAGE(ndp, np)
> > > >  		if (np->id == package_id)
> > > >  			package = np;
> > > >  	if (!package) {
> > > >  		/* The user has set a package that does not exist */
> > > > -		spin_unlock_irqrestore(&ndp->lock, flags);
> > > >  		return -ERANGE;
> > > >  	}
> > > >  
> > > >  	channel = NULL;
> > > > -	if (!info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > > > -		/* Allow any channel */
> > > > -		channel_id = NCSI_RESERVED_CHANNEL;
> > > > -	} else {
> > > > +	if (info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > > >  		channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
> > > >  		NCSI_FOR_EACH_CHANNEL(package, nc)
> > > > -			if (nc->id == channel_id)
> > > > +			if (nc->id == channel_id) {
> > > >  				channel = nc;
> > > > +				break;
> > > > +			}
> > > > +		if (!channel) {
> > > > +			netdev_info(ndp->ndev.dev,
> > > > +				    "NCSI: Channel %u does not exist!\n",
> > > > +				    channel_id);
> > > > +			return -ERANGE;
> > > > +		}
> > > >  	}
> > > >  
> > > > -	if (channel_id != NCSI_RESERVED_CHANNEL && !channel) {
> > > > -		/* The user has set a channel that does not exist on this
> > > > -		 * package
> > > > -		 */
> > > > -		spin_unlock_irqrestore(&ndp->lock, flags);
> > > > -		netdev_info(ndp->ndev.dev, "NCSI: Channel %u does not exist!\n",
> > > > -			    channel_id);
> > > > -		return -ERANGE;
> > > > -	}
> > > > -
> > > > -	ndp->force_package = package;
> > > > -	ndp->force_channel = channel;
> > > > +	spin_lock_irqsave(&ndp->lock, flags);
> > > > +	ndp->package_whitelist = 0x1 << package->id;
> > > > +	ndp->multi_package = false;
> > > >  	spin_unlock_irqrestore(&ndp->lock, flags);
> > > >  
> > > > -	netdev_info(ndp->ndev.dev, "Set package 0x%x, channel 0x%x%s as preferred\n",
> > > > -		    package_id, channel_id,
> > > > -		    channel_id == NCSI_RESERVED_CHANNEL ? " (any)" : "");
> > > > +	spin_lock_irqsave(&package->lock, flags);
> > > > +	package->multi_channel = false;
> > > > +	if (channel) {
> > > > +		package->channel_whitelist = 0x1 << channel->id;
> > > > +		package->preferred_channel = channel;
> > > > +	} else {
> > > > +		/* Allow any channel */
> > > > +		package->channel_whitelist = UINT_MAX;
> > > > +		package->preferred_channel = NULL;
> > > > +	}
> > > > +	spin_unlock_irqrestore(&package->lock, flags);
> > > > +
> > > > +	if (channel)
> > > > +		netdev_info(ndp->ndev.dev,
> > > > +			    "Set package 0x%x, channel 0x%x as preferred\n",
> > > > +			    package_id, channel_id);
> > > > +	else
> > > > +		netdev_info(ndp->ndev.dev, "Set package 0x%x as preferred\n",
> > > > +			    package_id);
> > > >  
> > > >  	/* Bounce the NCSI channel to set changes */
> > > >  	ncsi_stop_dev(&ndp->ndev);
> > > > @@ -338,6 +347,7 @@ static int ncsi_set_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > > >  static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > > >  {
> > > >  	struct ncsi_dev_priv *ndp;
> > > > +	struct ncsi_package *np;
> > > >  	unsigned long flags;
> > > >  
> > > >  	if (!info || !info->attrs)
> > > > @@ -351,11 +361,19 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > > >  	if (!ndp)
> > > >  		return -ENODEV;
> > > >  
> > > > -	/* Clear any override */
> > > > +	/* Reset any whitelists and disable multi mode */
> > > >  	spin_lock_irqsave(&ndp->lock, flags);
> > > > -	ndp->force_package = NULL;
> > > > -	ndp->force_channel = NULL;
> > > > +	ndp->package_whitelist = UINT_MAX;
> > > > +	ndp->multi_package = false;
> > > >  	spin_unlock_irqrestore(&ndp->lock, flags);
> > > > +
> > > > +	NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > > > +		spin_lock_irqsave(&np->lock, flags);
> > > > +		np->multi_channel = false;
> > > > +		np->channel_whitelist = UINT_MAX;
> > > > +		np->preferred_channel = NULL;
> > > > +		spin_unlock_irqrestore(&np->lock, flags);
> > > > +	}
> > > >  	netdev_info(ndp->ndev.dev, "NCSI: Cleared preferred package/channel\n");
> > > >  
> > > >  	/* Bounce the NCSI channel to set changes */
> > > > @@ -365,6 +383,137 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> > > >  	return 0;
> > > >  }
> > > >  
> > > > +static int ncsi_set_package_mask_nl(struct sk_buff *msg,
> > > > +				    struct genl_info *info)
> > > > +{
> > > > +	struct ncsi_dev_priv *ndp;
> > > > +	unsigned long flags;
> > > > +	int rc;
> > > > +
> > > > +	if (!info || !info->attrs)
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!info->attrs[NCSI_ATTR_IFINDEX])
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!info->attrs[NCSI_ATTR_PACKAGE_MASK])
> > > > +		return -EINVAL;
> > > > +
> > > > +	ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
> > > > +			       nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
> > > > +	if (!ndp)
> > > > +		return -ENODEV;
> > > > +
> > > > +	spin_lock_irqsave(&ndp->lock, flags);
> > > > +	ndp->package_whitelist =
> > > > +		nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_MASK]);
> > > > +
> > > > +	if (nla_get_flag(info->attrs[NCSI_ATTR_MULTI_FLAG])) {
> > > > +		if (ndp->flags & NCSI_DEV_HWA) {
> > > > +			ndp->multi_package = true;
> > > > +			rc = 0;
> > > > +		} else {
> > > > +			netdev_err(ndp->ndev.dev,
> > > > +				   "NCSI: Can't use multiple packages without HWA\n");
> > > > +			rc = -EPERM;
> > > > +		}
> > > > +	} else {
> > > > +		rc = 0;
> > > > +	}
> 
> If the flag is not set, do we need to clear the multi_package flag? It is possible that it is
> user's intension to disable the multi-mode.
> 	} else {
> 		ndp->multi_package = false;
> 		rc = 0;
> 	}

Yep, good catch (although technically multi_package could not have been
set true in the past if HWA is not available).

> 
> > > > +
> > > > +	spin_unlock_irqrestore(&ndp->lock, flags);
> > > > +
> > > > +	if (!rc) {
> > > > +		/* Bounce the NCSI channel to set changes */
> > > > +		ncsi_stop_dev(&ndp->ndev);
> > > > +		ncsi_start_dev(&ndp->ndev);
> > > 
> > > Is it possible to delay the restart? If we have two packages, we might send
> > > set_package_mask command once and set_channel_mask command twice.
> > > We will see the unnecessary reconfigurations in a very short period time.
> > 
> > We could probably do with a better way of bouncing the link here too. I
> > suppose we could schedule the actual link 'bounce' to occur a second or
> > two later, and delay if we receive new configuration in that time.
> > Perhaps something outside of the scope of this patch though.
> > 
> > > > +	}
> > > > +
> > > > +	return rc;
> > > > +}
> > > > +
> > > > +static int ncsi_set_channel_mask_nl(struct sk_buff *msg,
> > > > +				    struct genl_info *info)
> > > > +{
> > > > +	struct ncsi_package *np, *package;
> > > > +	struct ncsi_channel *nc, *channel;
> > > > +	struct ncsi_dev_priv *ndp;
> > > > +	unsigned long flags;
> > > > +	u32 package_id, channel_id;
> 
> All local variable declarations from longest to shortest
> line.
> 
> > > > +
> > > > +	if (!info || !info->attrs)
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!info->attrs[NCSI_ATTR_IFINDEX])
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!info->attrs[NCSI_ATTR_PACKAGE_ID])
> > > > +		return -EINVAL;
> > > > +
> > > > +	if (!info->attrs[NCSI_ATTR_CHANNEL_MASK])
> > > > +		return -EINVAL;
> > > > +
> > > > +	ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
> > > > +			       nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
> > > > +	if (!ndp)
> > > > +		return -ENODEV;
> > > > +
> > > > +	package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
> > > > +	package = NULL;
> > > > +	NCSI_FOR_EACH_PACKAGE(ndp, np)
> > > > +		if (np->id == package_id) {
> > > > +			package = np;
> > > > +			break;
> > > > +		}
> > > > +	if (!package)
> > > > +		return -ERANGE;
> > > > +
> > > > +	spin_lock_irqsave(&package->lock, flags);
> > > > +
> > > > +	channel = NULL;
> > > > +	if (info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > > > +		channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
> > > > +		NCSI_FOR_EACH_CHANNEL(np, nc)
> > > > +			if (nc->id == channel_id) {
> > > > +				channel = nc;
> > > > +				break;
> > > > +			}
> > > > +		if (!channel) {
> > > > +			spin_unlock_irqrestore(&package->lock, flags);
> > > > +			return -ERANGE;
> > > > +		}
> > > > +		netdev_dbg(ndp->ndev.dev,
> > > > +			   "NCSI: Channel %u set as preferred channel\n",
> > > > +			   channel->id);
> > > > +	}
> > > > +
> > > > +	package->channel_whitelist =
> > > > +		nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_MASK]);
> > > > +	if (package->channel_whitelist == 0)
> > > > +		netdev_dbg(ndp->ndev.dev,
> > > > +			   "NCSI: Package %u set to all channels disabled\n",
> > > > +			   package->id);
> > > > +
> > > > +	package->preferred_channel = channel;
> > > > +
> > > > +	if (nla_get_flag(info->attrs[NCSI_ATTR_MULTI_FLAG])) {
> > > > +		package->multi_channel = true;
> > > > +		netdev_info(ndp->ndev.dev,
> > > > +			    "NCSI: Multi-channel enabled on package %u\n",
> > > > +			    package_id);
> > > > +	} else {
> > > > +		package->multi_channel = false;
> > > > +	}
> > > > +
> > > > +	spin_unlock_irqrestore(&package->lock, flags);
> > > > +
> > > > +	/* Bounce the NCSI channel to set changes */
> > > > +	ncsi_stop_dev(&ndp->ndev);
> > > > +	ncsi_start_dev(&ndp->ndev);
> > > 
> > > Same question as set_package_mask function.
> > > Is it possible to delay the restart? If we have two packages, we might send
> > > set_package_mask command once and set_channel_mask command twice.
> > > We will see the unnecessary reconfigurations in a very short period time.
> > > 
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > >  static const struct genl_ops ncsi_ops[] = {
> > > >  	{
> > > >  		.cmd = NCSI_CMD_PKG_INFO,
> > > > @@ -385,6 +534,18 @@ static const struct genl_ops ncsi_ops[] = {
> > > >  		.doit = ncsi_clear_interface_nl,
> > > >  		.flags = GENL_ADMIN_PERM,
> > > >  	},
> > > > +	{
> > > > +		.cmd = NCSI_CMD_SET_PACKAGE_MASK,
> > > > +		.policy = ncsi_genl_policy,
> > > > +		.doit = ncsi_set_package_mask_nl,
> > > > +		.flags = GENL_ADMIN_PERM,
> > > > +	},
> > > > +	{
> > > > +		.cmd = NCSI_CMD_SET_CHANNEL_MASK,
> > > > +		.policy = ncsi_genl_policy,
> > > > +		.doit = ncsi_set_channel_mask_nl,
> > > > +		.flags = GENL_ADMIN_PERM,
> > > > +	},
> > > >  };
> > > >  
> > > >  static struct genl_family ncsi_genl_family __ro_after_init = {
> > > > diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
> > > > index d66b34749027..02ce7626b579 100644
> > > > --- a/net/ncsi/ncsi-rsp.c
> > > > +++ b/net/ncsi/ncsi-rsp.c
> > > > @@ -241,7 +241,7 @@ static int ncsi_rsp_handler_dcnt(struct ncsi_request *nr)
> > > >  	if (!ncm->enable)
> > > >  		return 0;
> > > >  
> > > > -	ncm->enable = 1;
> > > > +	ncm->enable = 0;
> > > >  	return 0;
> > > >  }
> > > >  
> > > > -- 
> > > > 2.19.0

^ permalink raw reply

* Re: [PATCH net-next 0/9] net: Kernel side filtering for route dumps
From: David Ahern @ 2018-10-11 18:44 UTC (permalink / raw)
  To: Jamal Hadi Salim, Sowmini Varadhan, Stephen Hemminger
  Cc: David Ahern, netdev, davem
In-Reply-To: <a8810dd6-fb22-8309-77a2-3e761feea871@mojatatu.com>

On 10/11/18 12:05 PM, Jamal Hadi Salim wrote:
> On 2018-10-11 1:04 p.m., David Ahern wrote:
> 
>> You can already filter link dumps by kind. How? By passing in the KIND
>> attribute on a dump request. This type of filtering exists for link
>> dumps, neighbor dumps, fdb dumps. Why is there a push to make route
>> dumps different? Why can't they be consistent and use existing semantics?
> 
> I think you meant filtering by ifindex in neighbor.

I meant the general API of users passing filter arguments as attributes
to the dump (or values in the header) -- KIND, MASTER, device index,
etc. This is an existing API and existing capability.

> note: I would argue that there are already "adhoc" ways of filtering
> in place, mostly use case driven). Otherwise Sowmini wouldnt have to
> craft that bpf filter. There are netlink users who have none or some
> weird filtering involved. There is no arguement that your approach
> works for rtm. But the rest of the users missing filters will require
> similar kernel changes. Could this be made generic enough to benefit
> other netlink users?
> The problem is there's always one new attribute that would make sense
> for some use case which requires a kernel change ("send me an event only
> if you get link down" or "dump all ports with link down").
> 

I disagree with your overall premise of bpf the end-all hammer. It is a
tool but not the only tool. For starters, you are proposing building the
message, run the filter on it, and potentially back the message up to
drop the recently added piece because the filter does not want it
included. That is still wasting a lot of cpu cycles to build and drop. I
am thinking about scale to 1 million routes -- I do not need the dump
loop building a message for 1 million entries only to drop 99% of them.
That is crazy.

The way the kernel manages route tables says I should pass in the table
id as it is a major differentiator on what is returned. From there
lookup the specific table (I need to fix this part per my response to
Andrew), and then only walk it. The existing semantics, capabilities
that exist for other dump commands is the most efficient for some of
these high level, big hammer filters.

What you want gets into the tiniest of details and yes the imagination
can go wild with combinations of filter options. So maybe this scanning
of post-built messages is reasonable *after* the high level sorting is done.

^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2018-10-12  2:06 UTC (permalink / raw)
  To: gregkh; +Cc: akpm, netdev, linux-kernel


1) RXRPC receive path fixes from David Howells.

2) Re-export __skb_recv_udp(), from Jiri Kosina.

3) Fix refcounting in u32 classificer, from Al Viro.

4) Userspace netlink ABI fixes from Eugene Syromiatnikov.

5) Don't double iounmap on rmmod in ena driver, from Arthur
   Kiyanovski.

6) Fix devlink string attribute handling, we must pull a copy
   into a kernel buffer if the lifetime extends past the netlink
   request.  From Moshe Shemesh.

7) Fix hangs in RDS, from Ka-Cheong Poon.

8) Fix recursive locking lockdep warnings in tipc, from Ying Xue.

9) Clear RX irq correctly in socionext, from Ilias Apalodimas.

10) bcm_sf2 fixes from Florian Fainelli.

Please pull, thanks a lot!

The following changes since commit c1d84a1b42ef70d8ae601df9cadedc7ed4f1beb1:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2018-10-06 02:11:30 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to 6b9bab550cac108d86c731c207e3e74ea10eb638:

  Merge branch 'net-dsa-bcm_sf2-Couple-of-fixes' (2018-10-11 15:20:00 -0700)

----------------------------------------------------------------
Al Viro (1):
      net: sched: cls_u32: fix hnode refcounting

Arthur Kiyanovski (4):
      net: ena: fix warning in rmmod caused by double iounmap
      net: ena: fix rare bug when failed restart/resume is followed by driver removal
      net: ena: fix NULL dereference due to untimely napi initialization
      net: ena: fix auto casting to boolean

David Howells (10):
      rxrpc: Fix some missed refs to init_net
      rxrpc: Fix the data_ready handler
      rxrpc: Use the UDP encap_rcv hook
      rxrpc: Don't need to take the RCU read lock in the packet receiver
      rxrpc: Don't check RXRPC_CALL_TX_LAST after calling rxrpc_rotate_tx_window()
      rxrpc: Carry call state out of locked section in rxrpc_rotate_tx_window()
      rxrpc: Only take the rwind and mtu values from latest ACK
      rxrpc: Fix connection-level abort handling
      rxrpc: Fix the rxrpc_tx_packet trace line
      rxrpc: Fix the packet reception routine

David S. Miller (7):
      Merge branch 'net-smc-userspace-breakage-fixes'
      Merge branch 'ena-fixes'
      Merge branch 'devlink-param-type-string-fixes'
      Merge tag 'rxrpc-fixes-20181008' of git://git.kernel.org/.../dhowells/linux-fs
      Merge branch 'net-ipv4-fixes-for-PMTU-when-link-MTU-changes'
      Merge branch 'net-explicitly-requires-bash-when-needed'
      Merge branch 'net-dsa-bcm_sf2-Couple-of-fixes'

Eric Dumazet (1):
      net: make skb_partial_csum_set() more robust against overflows

Eugene Syromiatnikov (2):
      net/smc: use __aligned_u64 for 64-bit smc_diag fields
      net/smc: retain old name for diag_mode field

Florian Fainelli (2):
      net: dsa: bcm_sf2: Fix unbind ordering
      net: dsa: bcm_sf2: Call setup during switch resume

Giacinto Cifelli (1):
      qmi_wwan: Added support for Gemalto's Cinterion ALASxx WWAN interface

Ilias Apalodimas (1):
      net: socionext: clear rx irq correctly

Jiri Kosina (1):
      udp: Unbreak modules that rely on external __skb_recv_udp() availability

Jon Maloy (1):
      tipc: set link tolerance correctly in broadcast link

Ka-Cheong Poon (1):
      rds: RDS (tcp) hangs on sendto() to unresponding address

Maciej S. Szmigiero (1):
      r8169: set RX_MULTI_EN bit in RxConfig for 8168F-family chips

Mike Rapoport (1):
      net/ipv6: stop leaking percpu memory in fib6 info

Moshe Shemesh (4):
      devlink: Fix param set handling for string type
      devlink: Fix param cmode driverinit for string type
      devlink: Add helper function for safely copy string param
      net/mlx4_core: Fix warnings during boot on driverinit param set failures

Paolo Abeni (2):
      selftests: rtnetlink.sh explicitly requires bash.
      selftests: udpgso_bench.sh explicitly requires bash

Parthasarathy Bhuvaragan (1):
      tipc: queue socket protocol error messages into socket receive buffer

Sabrina Dubroca (2):
      net: ipv4: update fnhe_pmtu when first hop's MTU changes
      net: ipv4: don't let PMTU updates increase route MTU

Sebastian Andrzej Siewior (1):
      net: phy: sfp: remove sfp_mutex's definition

Ying Xue (1):
      tipc: eliminate possible recursive locking detected by LOCKDEP

 drivers/net/dsa/bcm_sf2.c                     |  14 ++-----
 drivers/net/ethernet/amazon/ena/ena_eth_com.c |   8 ++--
 drivers/net/ethernet/amazon/ena/ena_netdev.c  |  22 +++++-----
 drivers/net/ethernet/mellanox/mlx4/main.c     |  43 +++++++------------
 drivers/net/ethernet/realtek/r8169.c          |   4 +-
 drivers/net/ethernet/socionext/netsec.c       |   5 ++-
 drivers/net/phy/sfp.c                         |   2 -
 drivers/net/usb/qmi_wwan.c                    |   1 +
 include/linux/netdevice.h                     |   7 ++++
 include/net/devlink.h                         |  12 +++++-
 include/net/ip_fib.h                          |   1 +
 include/trace/events/rxrpc.h                  |   1 +
 include/uapi/linux/smc_diag.h                 |  25 ++++++-----
 include/uapi/linux/udp.h                      |   1 +
 net/core/dev.c                                |  28 ++++++++++++-
 net/core/devlink.c                            |  43 ++++++++++++++++---
 net/core/skbuff.c                             |  12 +++---
 net/ipv4/fib_frontend.c                       |  12 ++++--
 net/ipv4/fib_semantics.c                      |  50 ++++++++++++++++++++++
 net/ipv4/route.c                              |   7 ++--
 net/ipv4/udp.c                                |   2 +-
 net/ipv6/ip6_fib.c                            |   2 +
 net/rds/send.c                                |  13 ++++--
 net/rxrpc/ar-internal.h                       |  23 +++++-----
 net/rxrpc/call_accept.c                       |  27 ++++++++----
 net/rxrpc/call_object.c                       |   5 ++-
 net/rxrpc/conn_client.c                       |  10 +++--
 net/rxrpc/conn_event.c                        |  26 +++++++-----
 net/rxrpc/input.c                             | 253 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------
 net/rxrpc/local_object.c                      |  30 ++++++++++---
 net/rxrpc/peer_event.c                        |   5 +++
 net/rxrpc/peer_object.c                       |  29 ++++++++-----
 net/sched/cls_u32.c                           |  10 ++---
 net/tipc/link.c                               |  27 ++++++++----
 net/tipc/socket.c                             |  14 ++++++-
 tools/testing/selftests/net/rtnetlink.sh      |   2 +-
 tools/testing/selftests/net/udpgso_bench.sh   |   2 +-
 37 files changed, 493 insertions(+), 285 deletions(-)

^ permalink raw reply

* [PATCH net] net: udp: fix handling of CHECKSUM_COMPLETE packets
From: Sean Tranchetti @ 2018-10-11 18:30 UTC (permalink / raw)
  To: eric.dumazet, davem, netdev; +Cc: Sean Tranchetti

Current handling of CHECKSUM_COMPLETE packets by the UDP stack is
incorrect for any packet that has an incorrect checksum value.

udp4/6_csum_init() will both make a call to
__skb_checksum_validate_complete() to initialize/validate the csum
field when receiving a CHECKSUM_COMPLETE packet. When this packet
fails validation, skb->csum will be overwritten with the pseudoheader
checksum so the packet can be fully validated by software, but the
skb->ip_summed value will be left as CHECKSUM_COMPLETE so that way
the stack can later warn the user about their hardware spewing bad
checksums. Unfortunately, leaving the SKB in this state can cause
problems later on in the checksum calculation.

Since the the packet is still marked as CHECKSUM_COMPLETE,
udp_csum_pull_header() will SUBTRACT the checksum of the UDP header
from skb->csum instead of adding it, leaving us with a garbage value
in that field. Once we try to copy the packet to userspace in the
udp4/6_recvmsg(), we'll make a call to skb_copy_and_csum_datagram_msg()
to checksum the packet data and add it in the garbage skb->csum value
to perform our final validation check.

Since the value we're validating is not the proper checksum, it's possible
that the folded value could come out to 0, causing us not to drop the
packet. Instead, we believe that the packet was checksummed incorrectly
by hardware since skb->ip_summed is still CHECKSUM_COMPLETE, and we attempt
to warn the user with netdev_rx_csum_fault(skb->dev);

Unfortunately, since this is the UDP path, skb->dev has been overwritten
by skb->dev_scratch and is no longer a valid pointer, so we end up
reading invalid memory.

This patch addresses this problem in two ways:
	1) Remove the invalid netdev_rx_csum_fault(skb->dev) call from
	   skb_copy_and_csum_datagram_msg(). Since this is used by UDP
	   where skb->dev is invalid, trying to warn doesn't make sense.

	2) Add better CHECKSUM_COMPLETE handling to udp4/6_csum_init().
	   If we receive a packet that's CHECKSUM_COMPLETE that fails
	   verification (i.e. skb->csum_valid == 0), check who performed
	   the calculation. It's possible that the checksum was done in
	   software by the network stack earlier (such as Netfilter's
	   CONNTRACK module), and if that says the checksum is bad,
	   we can drop the packet immediately instead of waiting until
	   we try and copy it to userspace. Otherwise, we need to
	   mark the SKB as CHECKSUM_NONE, since the skb->csum field
	   no longer contains the full packet checksum after the
	   call to __skb_checksum_validate_complete().

Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
---
 net/core/datagram.c     |  3 ---
 net/ipv4/udp.c          | 20 ++++++++++++++++++--
 net/ipv6/ip6_checksum.c | 20 ++++++++++++++++++--
 3 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/net/core/datagram.c b/net/core/datagram.c
index 9aac0d6..ab679c5 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -807,9 +807,6 @@ int skb_copy_and_csum_datagram_msg(struct sk_buff *skb,
 			iov_iter_revert(&msg->msg_iter, chunk);
 			return -EINVAL;
 		}
-
-		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE))
-			netdev_rx_csum_fault(skb->dev);
 	}
 	return 0;
 fault:
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index c32a4c1..f8183fd 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2120,8 +2120,24 @@ static inline int udp4_csum_init(struct sk_buff *skb, struct udphdr *uh,
 	/* Note, we are only interested in != 0 or == 0, thus the
 	 * force to int.
 	 */
-	return (__force int)skb_checksum_init_zero_check(skb, proto, uh->check,
-							 inet_compute_pseudo);
+	err = (__force int)skb_checksum_init_zero_check(skb, proto, uh->check,
+							inet_compute_pseudo);
+	if (err)
+		return err;
+
+	if (skb->ip_summed == CHECKSUM_COMPLETE && !skb->csum_valid) {
+		/* If SW calculated the value, we know it's bad */
+		if (skb->csum_complete_sw)
+			return 1;
+
+		/* HW says the value is bad. Let's validate that.
+		 * skb->csum is no longer the full packet checksum,
+		 * so don't treat it as such.
+		 */
+		skb_checksum_complete_unset(skb);
+	}
+
+	return 0;
 }

 /* wrapper for udp_queue_rcv_skb tacking care of csum conversion and
diff --git a/net/ipv6/ip6_checksum.c b/net/ipv6/ip6_checksum.c
index 547515e..3777170 100644
--- a/net/ipv6/ip6_checksum.c
+++ b/net/ipv6/ip6_checksum.c
@@ -88,8 +88,24 @@ int udp6_csum_init(struct sk_buff *skb, struct udphdr *uh, int proto)
 	 * Note, we are only interested in != 0 or == 0, thus the
 	 * force to int.
 	 */
-	return (__force int)skb_checksum_init_zero_check(skb, proto, uh->check,
-							 ip6_compute_pseudo);
+	err = (__force int)skb_checksum_init_zero_check(skb, proto, uh->check,
+							ip6_compute_pseudo);
+	if (err)
+		return err;
+
+	if (skb->ip_summed == CHECKSUM_COMPLETE && !skb->csum_valid) {
+		/* If SW calculated the value, we know it's bad */
+		if (skb->csum_complete_sw)
+			return 1;
+
+		/* HW says the value is bad. Let's validate that.
+		 * skb->csum is no longer the full packet checksum,
+		 * so don't treat is as such.
+		 */
+		skb_checksum_complete_unset(skb);
+	}
+
+	return 0;
 }
 EXPORT_SYMBOL(udp6_csum_init);

-- 
1.9.1

^ permalink raw reply related

* Re: [PATCH net-next 2/9] net/ipv4: Plumb support for filtering route dumps
From: Andrew Lunn @ 2018-10-11 18:30 UTC (permalink / raw)
  To: David Ahern; +Cc: David Ahern, netdev, davem
In-Reply-To: <216764dd-a882-9788-c2d7-54f0e69be3df@gmail.com>

On Thu, Oct 11, 2018 at 10:44:04AM -0600, David Ahern wrote:
> On 10/11/18 9:56 AM, Andrew Lunn wrote:
> >> @@ -866,10 +866,13 @@ static int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
> >>  		hlist_for_each_entry_rcu(tb, head, tb_hlist) {
> >>  			if (e < s_e)
> >>  				goto next;
> >> +			if (filter.table_id && filter.table_id != tb->tb_id)
> >> +				goto next;
> >> +
> > 
> > Hi David
> > 
> > Should there be a test here that filter->filter_set is set, before
> > looking at filter.table_id.
> 
> filter_set is meant for places that would need to look at multiple flags.

Hi David

It would be good to add some comments to struct fib_dump_filter. Maybe
also move table_id before filter_set, to try to indicate it is not
relevant for the table_id.

	 Andrew

^ permalink raw reply

* [Patch net] llc: set SOCK_RCU_FREE in llc_sap_add_socket()
From: Cong Wang @ 2018-10-11 18:15 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang

WHen an llc sock is added into the sk_laddr_hash of an llc_sap,
it is not marked with SOCK_RCU_FREE.

This causes that the sock could be freed while it is still being
read by __llc_lookup_established() with RCU read lock. sock is
refcounted, but with RCU read lock, nothing prevents the readers
getting a zero refcnt.

Fix it by setting SOCK_RCU_FREE in llc_sap_add_socket().

Reported-by: syzbot+11e05f04c15e03be5254@syzkaller.appspotmail.com
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/llc/llc_conn.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/llc/llc_conn.c b/net/llc/llc_conn.c
index c0ac522b48a1..4ff89cb7c86f 100644
--- a/net/llc/llc_conn.c
+++ b/net/llc/llc_conn.c
@@ -734,6 +734,7 @@ void llc_sap_add_socket(struct llc_sap *sap, struct sock *sk)
 	llc_sk(sk)->sap = sap;

 	spin_lock_bh(&sap->sk_lock);
+	sock_set_flag(sk, SOCK_RCU_FREE);
 	sap->sk_count++;
 	sk_nulls_add_node_rcu(sk, laddr_hb);
 	hlist_add_head(&llc->dev_hash_node, dev_hb);
-- 
2.14.4

^ permalink raw reply related

* [PATCH net-next v7] net/ncsi: Extend NC-SI Netlink interface to allow user space to send NC-SI command
From: Justin.Lee1 @ 2018-10-11 18:07 UTC (permalink / raw)
  To: davem
  Cc: sam, joel, linux-aspeed, netdev, openbmc, amithash, christian,
	vijaykhemka

The new command (NCSI_CMD_SEND_CMD) is added to allow user space application
to send NC-SI command to the network card.
Also, add a new attribute (NCSI_ATTR_DATA) for transferring request and response.

The work flow is as below. 

Request:
User space application
	-> Netlink interface (msg)
	-> new Netlink handler - ncsi_send_cmd_nl()
	-> ncsi_xmit_cmd()

Response:
Response received - ncsi_rcv_rsp()
	-> internal response handler - ncsi_rsp_handler_xxx()
	-> ncsi_rsp_handler_netlink()
	-> ncsi_send_netlink_rsp ()
	-> Netlink interface (msg)
	-> user space application

Command timeout - ncsi_request_timeout()
	-> ncsi_send_netlink_timeout ()
	-> Netlink interface (msg with zero data length)
	-> user space application

Error:
Error detected
	-> ncsi_send_netlink_err ()
	-> Netlink interface (err msg)
	-> user space application


Signed-off-by: Justin Lee <justin.lee1@dell.com> 


---
V7: Adjust the order of local variables from longest to shortest line.
V6: Add checking before accessing NCSI_ATTR_DATA attribute to avoid null-dereference. 
V5: Update comments and debug message.
V4: Update comments and remove some debug message.
V3: Based on http://patchwork.ozlabs.org/patch/979688/ to remove the duplicated code.
V2: Remove non-related debug message and clean up the code.

 include/uapi/linux/ncsi.h |   6 ++
 net/ncsi/internal.h       |   7 ++
 net/ncsi/ncsi-cmd.c       |   8 ++
 net/ncsi/ncsi-manage.c    |  16 ++++
 net/ncsi/ncsi-netlink.c   | 204 ++++++++++++++++++++++++++++++++++++++++++++++
 net/ncsi/ncsi-netlink.h   |  12 +++
 net/ncsi/ncsi-rsp.c       |  67 +++++++++++++--
 7 files changed, 315 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/ncsi.h b/include/uapi/linux/ncsi.h
index 4c292ec..0a26a55 100644
--- a/include/uapi/linux/ncsi.h
+++ b/include/uapi/linux/ncsi.h
@@ -23,6 +23,9 @@
  *	optionally the preferred NCSI_ATTR_CHANNEL_ID.
  * @NCSI_CMD_CLEAR_INTERFACE: clear any preferred package/channel combination.
  *	Requires NCSI_ATTR_IFINDEX.
+ * @NCSI_CMD_SEND_CMD: send NC-SI command to network card.
+ *	Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID
+ *	and NCSI_ATTR_CHANNEL_ID.
  * @NCSI_CMD_MAX: highest command number
  */
 enum ncsi_nl_commands {
@@ -30,6 +33,7 @@ enum ncsi_nl_commands {
 	NCSI_CMD_PKG_INFO,
 	NCSI_CMD_SET_INTERFACE,
 	NCSI_CMD_CLEAR_INTERFACE,
+	NCSI_CMD_SEND_CMD,
 
 	__NCSI_CMD_AFTER_LAST,
 	NCSI_CMD_MAX = __NCSI_CMD_AFTER_LAST - 1
@@ -43,6 +47,7 @@ enum ncsi_nl_commands {
  * @NCSI_ATTR_PACKAGE_LIST: nested array of NCSI_PKG_ATTR attributes
  * @NCSI_ATTR_PACKAGE_ID: package ID
  * @NCSI_ATTR_CHANNEL_ID: channel ID
+ * @NCSI_ATTR_DATA: command payload
  * @NCSI_ATTR_MAX: highest attribute number
  */
 enum ncsi_nl_attrs {
@@ -51,6 +56,7 @@ enum ncsi_nl_attrs {
 	NCSI_ATTR_PACKAGE_LIST,
 	NCSI_ATTR_PACKAGE_ID,
 	NCSI_ATTR_CHANNEL_ID,
+	NCSI_ATTR_DATA,
 
 	__NCSI_ATTR_AFTER_LAST,
 	NCSI_ATTR_MAX = __NCSI_ATTR_AFTER_LAST - 1
diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
index 3d0a33b..13c9b5e 100644
--- a/net/ncsi/internal.h
+++ b/net/ncsi/internal.h
@@ -175,6 +175,8 @@ struct ncsi_package;
 #define NCSI_RESERVED_CHANNEL	0x1f
 #define NCSI_CHANNEL_INDEX(c)	((c) & ((1 << NCSI_PACKAGE_SHIFT) - 1))
 #define NCSI_TO_CHANNEL(p, c)	(((p) << NCSI_PACKAGE_SHIFT) | (c))
+#define NCSI_MAX_PACKAGE	8
+#define NCSI_MAX_CHANNEL	32
 
 struct ncsi_channel {
 	unsigned char               id;
@@ -220,11 +222,15 @@ struct ncsi_request {
 	bool                 used;    /* Request that has been assigned  */
 	unsigned int         flags;   /* NCSI request property           */
 #define NCSI_REQ_FLAG_EVENT_DRIVEN	1
+#define NCSI_REQ_FLAG_NETLINK_DRIVEN	2
 	struct ncsi_dev_priv *ndp;    /* Associated NCSI device          */
 	struct sk_buff       *cmd;    /* Associated NCSI command packet  */
 	struct sk_buff       *rsp;    /* Associated NCSI response packet */
 	struct timer_list    timer;   /* Timer on waiting for response   */
 	bool                 enabled; /* Time has been enabled or not    */
+	u32                  snd_seq;     /* netlink sending sequence number */
+	u32                  snd_portid;  /* netlink portid of sender        */
+	struct nlmsghdr      nlhdr;       /* netlink message header          */
 };
 
 enum {
@@ -310,6 +316,7 @@ struct ncsi_cmd_arg {
 		unsigned int   dwords[4];
 	};
 	unsigned char        *data;       /* NCSI OEM data                 */
+	struct genl_info     *info;       /* Netlink information           */
 };
 
 extern struct list_head ncsi_dev_list;
diff --git a/net/ncsi/ncsi-cmd.c b/net/ncsi/ncsi-cmd.c
index 82b7d92..356af47 100644
--- a/net/ncsi/ncsi-cmd.c
+++ b/net/ncsi/ncsi-cmd.c
@@ -17,6 +17,7 @@
 #include <net/ncsi.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
+#include <net/genetlink.h>
 
 #include "internal.h"
 #include "ncsi-pkt.h"
@@ -346,6 +347,13 @@ int ncsi_xmit_cmd(struct ncsi_cmd_arg *nca)
 	if (!nr)
 		return -ENOMEM;
 
+	/* track netlink information */
+	if (nca->req_flags == NCSI_REQ_FLAG_NETLINK_DRIVEN) {
+		nr->snd_seq = nca->info->snd_seq;
+		nr->snd_portid = nca->info->snd_portid;
+		nr->nlhdr = *nca->info->nlhdr;
+	}
+
 	/* Prepare the packet */
 	nca->id = nr->id;
 	ret = nch->handler(nr->cmd, nca);
diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
index 0912847..6aa0614 100644
--- a/net/ncsi/ncsi-manage.c
+++ b/net/ncsi/ncsi-manage.c
@@ -19,6 +19,7 @@
 #include <net/addrconf.h>
 #include <net/ipv6.h>
 #include <net/if_inet6.h>
+#include <net/genetlink.h>
 
 #include "internal.h"
 #include "ncsi-pkt.h"
@@ -406,6 +407,9 @@ static void ncsi_request_timeout(struct timer_list *t)
 {
 	struct ncsi_request *nr = from_timer(nr, t, timer);
 	struct ncsi_dev_priv *ndp = nr->ndp;
+	struct ncsi_cmd_pkt *cmd;
+	struct ncsi_package *np;
+	struct ncsi_channel *nc;
 	unsigned long flags;
 
 	/* If the request already had associated response,
@@ -419,6 +423,18 @@ static void ncsi_request_timeout(struct timer_list *t)
 	}
 	spin_unlock_irqrestore(&ndp->lock, flags);
 
+	if (nr->flags == NCSI_REQ_FLAG_NETLINK_DRIVEN) {
+		if (nr->cmd) {
+			/* Find the package */
+			cmd = (struct ncsi_cmd_pkt *)
+			      skb_network_header(nr->cmd);
+			ncsi_find_package_and_channel(ndp,
+						      cmd->cmd.common.channel,
+						      &np, &nc);
+			ncsi_send_netlink_timeout(nr, np, nc);
+		}
+	}
+
 	/* Release the request */
 	ncsi_free_request(nr);
 }
diff --git a/net/ncsi/ncsi-netlink.c b/net/ncsi/ncsi-netlink.c
index 45f33d6..9b48b9f 100644
--- a/net/ncsi/ncsi-netlink.c
+++ b/net/ncsi/ncsi-netlink.c
@@ -20,6 +20,7 @@
 #include <uapi/linux/ncsi.h>
 
 #include "internal.h"
+#include "ncsi-pkt.h"
 #include "ncsi-netlink.h"
 
 static struct genl_family ncsi_genl_family;
@@ -29,6 +30,7 @@ static const struct nla_policy ncsi_genl_policy[NCSI_ATTR_MAX + 1] = {
 	[NCSI_ATTR_PACKAGE_LIST] =	{ .type = NLA_NESTED },
 	[NCSI_ATTR_PACKAGE_ID] =	{ .type = NLA_U32 },
 	[NCSI_ATTR_CHANNEL_ID] =	{ .type = NLA_U32 },
+	[NCSI_ATTR_DATA] =		{ .type = NLA_BINARY, .len = 2048 },
 };
 
 static struct ncsi_dev_priv *ndp_from_ifindex(struct net *net, u32 ifindex)
@@ -366,6 +368,202 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
 	return 0;
 }
 
+static int ncsi_send_cmd_nl(struct sk_buff *msg, struct genl_info *info)
+{
+	struct ncsi_dev_priv *ndp;
+	struct ncsi_pkt_hdr *hdr;
+	struct ncsi_cmd_arg nca;
+	unsigned char *data;
+	u32 package_id;
+	u32 channel_id;
+	int len, ret;
+
+	if (!info || !info->attrs) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (!info->attrs[NCSI_ATTR_IFINDEX]) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (!info->attrs[NCSI_ATTR_PACKAGE_ID]) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (!info->attrs[NCSI_ATTR_CHANNEL_ID]) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (!info->attrs[NCSI_ATTR_DATA]) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
+			       nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
+	if (!ndp) {
+		ret = -ENODEV;
+		goto out;
+	}
+
+	package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
+	channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
+
+	if (package_id >= NCSI_MAX_PACKAGE || channel_id >= NCSI_MAX_CHANNEL) {
+		ret = -ERANGE;
+		goto out_netlink;
+	}
+
+	len = nla_len(info->attrs[NCSI_ATTR_DATA]);
+	if (len < sizeof(struct ncsi_pkt_hdr)) {
+		netdev_info(ndp->ndev.dev, "NCSI: no command to send %u\n",
+			    package_id);
+		ret = -EINVAL;
+		goto out_netlink;
+	} else {
+		data = (unsigned char *)nla_data(info->attrs[NCSI_ATTR_DATA]);
+	}
+
+	hdr = (struct ncsi_pkt_hdr *)data;
+
+	nca.ndp = ndp;
+	nca.package = (unsigned char)package_id;
+	nca.channel = (unsigned char)channel_id;
+	nca.type = hdr->type;
+	nca.req_flags = NCSI_REQ_FLAG_NETLINK_DRIVEN;
+	nca.info = info;
+	nca.payload = ntohs(hdr->length);
+	nca.data = data + sizeof(*hdr);
+
+	ret = ncsi_xmit_cmd(&nca);
+out_netlink:
+	if (ret != 0) {
+		netdev_err(ndp->ndev.dev,
+			   "NCSI: Error %d sending command\n",
+			   ret);
+		ncsi_send_netlink_err(ndp->ndev.dev,
+				      info->snd_seq,
+				      info->snd_portid,
+				      info->nlhdr,
+				      ret);
+	}
+out:
+	return ret;
+}
+
+int ncsi_send_netlink_rsp(struct ncsi_request *nr,
+			  struct ncsi_package *np,
+			  struct ncsi_channel *nc)
+{
+	struct sk_buff *skb;
+	struct net *net;
+	void *hdr;
+	int rc;
+
+	net = dev_net(nr->rsp->dev);
+
+	skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
+	if (!skb)
+		return -ENOMEM;
+
+	hdr = genlmsg_put(skb, nr->snd_portid, nr->snd_seq,
+			  &ncsi_genl_family, 0, NCSI_CMD_SEND_CMD);
+	if (!hdr) {
+		kfree_skb(skb);
+		return -EMSGSIZE;
+	}
+
+	nla_put_u32(skb, NCSI_ATTR_IFINDEX, nr->rsp->dev->ifindex);
+	if (np)
+		nla_put_u32(skb, NCSI_ATTR_PACKAGE_ID, np->id);
+	if (nc)
+		nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, nc->id);
+	else
+		nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, NCSI_RESERVED_CHANNEL);
+
+	rc = nla_put(skb, NCSI_ATTR_DATA, nr->rsp->len, (void *)nr->rsp->data);
+	if (rc)
+		goto err;
+
+	genlmsg_end(skb, hdr);
+	return genlmsg_unicast(net, skb, nr->snd_portid);
+
+err:
+	kfree_skb(skb);
+	return rc;
+}
+
+int ncsi_send_netlink_timeout(struct ncsi_request *nr,
+			      struct ncsi_package *np,
+			      struct ncsi_channel *nc)
+{
+	struct sk_buff *skb;
+	struct net *net;
+	void *hdr;
+
+	skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
+	if (!skb)
+		return -ENOMEM;
+
+	hdr = genlmsg_put(skb, nr->snd_portid, nr->snd_seq,
+			  &ncsi_genl_family, 0, NCSI_CMD_SEND_CMD);
+	if (!hdr) {
+		kfree_skb(skb);
+		return -EMSGSIZE;
+	}
+
+	net = dev_net(nr->cmd->dev);
+
+	nla_put_u32(skb, NCSI_ATTR_IFINDEX, nr->cmd->dev->ifindex);
+
+	if (np)
+		nla_put_u32(skb, NCSI_ATTR_PACKAGE_ID, np->id);
+	else
+		nla_put_u32(skb, NCSI_ATTR_PACKAGE_ID,
+			    NCSI_PACKAGE_INDEX((((struct ncsi_pkt_hdr *)
+						 nr->cmd->data)->channel)));
+
+	if (nc)
+		nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, nc->id);
+	else
+		nla_put_u32(skb, NCSI_ATTR_CHANNEL_ID, NCSI_RESERVED_CHANNEL);
+
+	genlmsg_end(skb, hdr);
+	return genlmsg_unicast(net, skb, nr->snd_portid);
+}
+
+int ncsi_send_netlink_err(struct net_device *dev,
+			  u32 snd_seq,
+			  u32 snd_portid,
+			  struct nlmsghdr *nlhdr,
+			  int err)
+{
+	struct nlmsghdr *nlh;
+	struct nlmsgerr *nle;
+	struct sk_buff *skb;
+	struct net *net;
+
+	skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
+	if (!skb)
+		return -ENOMEM;
+
+	net = dev_net(dev);
+
+	nlh = nlmsg_put(skb, snd_portid, snd_seq,
+			NLMSG_ERROR, sizeof(*nle), 0);
+	nle = (struct nlmsgerr *)nlmsg_data(nlh);
+	nle->error = err;
+	memcpy(&nle->msg, nlhdr, sizeof(*nlh));
+
+	nlmsg_end(skb, nlh);
+
+	return nlmsg_unicast(net->genl_sock, skb, snd_portid);
+}
+
 static const struct genl_ops ncsi_ops[] = {
 	{
 		.cmd = NCSI_CMD_PKG_INFO,
@@ -386,6 +584,12 @@ static const struct genl_ops ncsi_ops[] = {
 		.doit = ncsi_clear_interface_nl,
 		.flags = GENL_ADMIN_PERM,
 	},
+	{
+		.cmd = NCSI_CMD_SEND_CMD,
+		.policy = ncsi_genl_policy,
+		.doit = ncsi_send_cmd_nl,
+		.flags = GENL_ADMIN_PERM,
+	},
 };
 
 static struct genl_family ncsi_genl_family __ro_after_init = {
diff --git a/net/ncsi/ncsi-netlink.h b/net/ncsi/ncsi-netlink.h
index 91a5c25..c4a4688 100644
--- a/net/ncsi/ncsi-netlink.h
+++ b/net/ncsi/ncsi-netlink.h
@@ -14,6 +14,18 @@
 
 #include "internal.h"
 
+int ncsi_send_netlink_rsp(struct ncsi_request *nr,
+			  struct ncsi_package *np,
+			  struct ncsi_channel *nc);
+int ncsi_send_netlink_timeout(struct ncsi_request *nr,
+			      struct ncsi_package *np,
+			      struct ncsi_channel *nc);
+int ncsi_send_netlink_err(struct net_device *dev,
+			  u32 snd_seq,
+			  u32 snd_portid,
+			  struct nlmsghdr *nlhdr,
+			  int err);
+
 int ncsi_init_netlink(struct net_device *dev);
 int ncsi_unregister_netlink(struct net_device *dev);
 
diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
index d66b347..85fa59a 100644
--- a/net/ncsi/ncsi-rsp.c
+++ b/net/ncsi/ncsi-rsp.c
@@ -16,9 +16,11 @@
 #include <net/ncsi.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
+#include <net/genetlink.h>
 
 #include "internal.h"
 #include "ncsi-pkt.h"
+#include "ncsi-netlink.h"
 
 static int ncsi_validate_rsp_pkt(struct ncsi_request *nr,
 				 unsigned short payload)
@@ -32,15 +34,25 @@ static int ncsi_validate_rsp_pkt(struct ncsi_request *nr,
 	 * before calling this function.
 	 */
 	h = (struct ncsi_rsp_pkt_hdr *)skb_network_header(nr->rsp);
-	if (h->common.revision != NCSI_PKT_REVISION)
+
+	if (h->common.revision != NCSI_PKT_REVISION) {
+		netdev_dbg(nr->ndp->ndev.dev,
+			   "NCSI: unsupported header revision\n");
 		return -EINVAL;
-	if (ntohs(h->common.length) != payload)
+	}
+	if (ntohs(h->common.length) != payload) {
+		netdev_dbg(nr->ndp->ndev.dev,
+			   "NCSI: payload length mismatched\n");
 		return -EINVAL;
+	}
 
 	/* Check on code and reason */
 	if (ntohs(h->code) != NCSI_PKT_RSP_C_COMPLETED ||
-	    ntohs(h->reason) != NCSI_PKT_RSP_R_NO_ERROR)
-		return -EINVAL;
+	    ntohs(h->reason) != NCSI_PKT_RSP_R_NO_ERROR) {
+		netdev_dbg(nr->ndp->ndev.dev,
+			   "NCSI: non zero response/reason code\n");
+		return -EPERM;
+	}
 
 	/* Validate checksum, which might be zeroes if the
 	 * sender doesn't support checksum according to NCSI
@@ -52,8 +64,11 @@ static int ncsi_validate_rsp_pkt(struct ncsi_request *nr,
 
 	checksum = ncsi_calculate_checksum((unsigned char *)h,
 					   sizeof(*h) + payload - 4);
-	if (*pchecksum != htonl(checksum))
+
+	if (*pchecksum != htonl(checksum)) {
+		netdev_dbg(nr->ndp->ndev.dev, "NCSI: checksum mismatched\n");
 		return -EINVAL;
+	}
 
 	return 0;
 }
@@ -941,6 +956,26 @@ static int ncsi_rsp_handler_gpuuid(struct ncsi_request *nr)
 	return 0;
 }
 
+static int ncsi_rsp_handler_netlink(struct ncsi_request *nr)
+{
+	struct ncsi_dev_priv *ndp = nr->ndp;
+	struct ncsi_rsp_pkt *rsp;
+	struct ncsi_package *np;
+	struct ncsi_channel *nc;
+	int ret;
+
+	/* Find the package */
+	rsp = (struct ncsi_rsp_pkt *)skb_network_header(nr->rsp);
+	ncsi_find_package_and_channel(ndp, rsp->rsp.common.channel,
+				      &np, &nc);
+	if (!np)
+		return -ENODEV;
+
+	ret = ncsi_send_netlink_rsp(nr, np, nc);
+
+	return ret;
+}
+
 static struct ncsi_rsp_handler {
 	unsigned char	type;
 	int             payload;
@@ -1043,6 +1078,17 @@ int ncsi_rcv_rsp(struct sk_buff *skb, struct net_device *dev,
 		netdev_warn(ndp->ndev.dev,
 			    "NCSI: 'bad' packet ignored for type 0x%x\n",
 			    hdr->type);
+
+		if (nr->flags == NCSI_REQ_FLAG_NETLINK_DRIVEN) {
+			if (ret == -EPERM)
+				goto out_netlink;
+			else
+				ncsi_send_netlink_err(ndp->ndev.dev,
+						      nr->snd_seq,
+						      nr->snd_portid,
+						      &nr->nlhdr,
+						      ret);
+		}
 		goto out;
 	}
 
@@ -1052,6 +1098,17 @@ int ncsi_rcv_rsp(struct sk_buff *skb, struct net_device *dev,
 		netdev_err(ndp->ndev.dev,
 			   "NCSI: Handler for packet type 0x%x returned %d\n",
 			   hdr->type, ret);
+
+out_netlink:
+	if (nr->flags == NCSI_REQ_FLAG_NETLINK_DRIVEN) {
+		ret = ncsi_rsp_handler_netlink(nr);
+		if (ret) {
+			netdev_err(ndp->ndev.dev,
+				   "NCSI: Netlink handler for packet type 0x%x returned %d\n",
+				   hdr->type, ret);
+		}
+	}
+
 out:
 	ncsi_free_request(nr);
 	return ret;
-- 
2.9.3

^ permalink raw reply related

* Re: [PATCH net-next 0/9] net: Kernel side filtering for route dumps
From: Jamal Hadi Salim @ 2018-10-11 18:05 UTC (permalink / raw)
  To: David Ahern, Sowmini Varadhan, Stephen Hemminger
  Cc: David Ahern, netdev, davem
In-Reply-To: <7f4228a7-9a55-1ca5-f36f-62ef30393c90@gmail.com>

On 2018-10-11 1:04 p.m., David Ahern wrote:

> You can already filter link dumps by kind. How? By passing in the KIND
> attribute on a dump request. This type of filtering exists for link
> dumps, neighbor dumps, fdb dumps. Why is there a push to make route
> dumps different? Why can't they be consistent and use existing semantics?

I think you meant filtering by ifindex in neighbor.
note: I would argue that there are already "adhoc" ways of filtering
in place, mostly use case driven). Otherwise Sowmini wouldnt have to
craft that bpf filter. There are netlink users who have none or some
weird filtering involved. There is no arguement that your approach
works for rtm. But the rest of the users missing filters will require
similar kernel changes. Could this be made generic enough to benefit
other netlink users?
The problem is there's always one new attribute that would make sense
for some use case which requires a kernel change ("send me an event only
if you get link down" or "dump all ports with link down").

cheers,
jamal

^ permalink raw reply

* [PATCH net-next] net: phy: trigger state machine immediately in phy_start_machine
From: Heiner Kallweit @ 2018-10-11 17:31 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, David Miller; +Cc: netdev@vger.kernel.org

When starting the state machine there may be work to be done
immediately, e.g. if the initial state is PHY_UP then the state
machine may trigger an autonegotiation. Having said that I see no need
to wait a second until the state machine is run first time.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/phy/phy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 14509a890..704428211 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -654,7 +654,7 @@ static void phy_queue_state_machine(struct phy_device *phydev,
  */
 void phy_start_machine(struct phy_device *phydev)
 {
-	phy_queue_state_machine(phydev, 1);
+	phy_trigger_machine(phydev);
 }
 EXPORT_SYMBOL_GPL(phy_start_machine);
 
-- 
2.19.1

^ permalink raw reply related

* Re: [PATCH net] net/mlx4_core: Fix warnings during boot on driverinit param set failures
From: David Miller @ 2018-10-11 17:25 UTC (permalink / raw)
  To: tariqt; +Cc: netdev, eranbe, moshe
In-Reply-To: <1539259279-6527-1-git-send-email-tariqt@mellanox.com>

From: Tariq Toukan <tariqt@mellanox.com>
Date: Thu, 11 Oct 2018 15:01:19 +0300

> From: Moshe Shemesh <moshe@mellanox.com>
> 
> During boot, mlx4_core sets the driverinit configuration parameters and
> updates the devlink module on the initial values calling
> devlink_param_driverinit_value_set().
> If devlink_param_driverinit_value_set() returns an error mlx4_core
> reports kernel module warning.
> 
> This caused false alarm during boot in case kernel was compiled with
> CONFIG_NET_DEVLINK off.
> Fix by removing warning reported in case
> devlink_param_driverinit_value_set() fails.
> 
> This actually makes the function mlx4_devlink_set_init_value()
> redundant to using directly devlink_param_driverinit_value_set() and so
> removed.
> 
> It fixes the following kernel trace:
> 
>  mlx4_core 0000:00:06.0: devlink set parameter 0 value failed (err = -95)
>  mlx4_core 0000:00:06.0: devlink set parameter 1 value failed (err = -95)
>  mlx4_core 0000:00:06.0: devlink set parameter 4 value failed (err = -95)
>  mlx4_core 0000:00:06.0: devlink set parameter 5 value failed (err = -95)
>  mlx4_core 0000:00:06.0: devlink set parameter 3 value failed (err = -95)
> 
> Fixes: bd1b51dc66df ("mlx4: Add mlx4 initial parameters table and register it")
> Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
> Signed-off-by: Tariq Toukan <tariqt@mellanox.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] selftests: use posix-style redirection in ip_defrag.sh
From: David Miller @ 2018-10-11 17:22 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, posk
In-Reply-To: <808c09ba03e2bb341a5f603d830bd691332554c9.1539249443.git.pabeni@redhat.com>

From: Paolo Abeni <pabeni@redhat.com>
Date: Thu, 11 Oct 2018 11:17:37 +0200

> The ip_defrag.sh script requires bash-style output redirection but
> use the default shell. This may cause random failures if the default
> shell is not bash.
> Address the above using posix compliant output redirection.
> 
> Fixes: 02c7f38b7ace ("selftests/net: add ip_defrag selftest")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Applied.

^ permalink raw reply

* Re: [PATCH net 0/2] net: explicitly requires bash when needed.
From: David Miller @ 2018-10-11 17:19 UTC (permalink / raw)
  To: pabeni; +Cc: netdev, fw, willemb
In-Reply-To: <cover.1539247467.git.pabeni@redhat.com>

From: Paolo Abeni <pabeni@redhat.com>
Date: Thu, 11 Oct 2018 10:54:51 +0200

> Some test scripts require bash-only features but use the default shell.
> This may cause random failures if the default shell is not bash.
> Instead of doing a potentially complex rewrite of such scripts, these patches 
> require the bash interpreter, where needed.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH V2 net-next 00/12] Improving performance and reducing latencies, by using latest capabilities exposed in ENA device
From: David Miller @ 2018-10-11 17:14 UTC (permalink / raw)
  To: akiyano
  Cc: netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea, gtzalik,
	netanel, alisaidi
In-Reply-To: <1539246387-311-1-git-send-email-akiyano@amazon.com>

From: <akiyano@amazon.com>
Date: Thu, 11 Oct 2018 11:26:15 +0300

> From: Arthur Kiyanovski <akiyano@amazon.com>
> 
> This patchset introduces the following:
> 1. A new placement policy of Tx headers and descriptors, which takes
> advantage of an option to place headers + descriptors in device memory
> space. This is sometimes referred to as LLQ - low latency queue.
> The patch set defines the admin capability, maps the device memory as
> write-combined, and adds a mode in transmit datapath to do header +
> descriptor placement on the device.
> 2. Support for RX checksum offloading
> 3. Miscelaneous small improvements and code cleanups
> 
> Note: V1 of this patchset was created as if patches e2a322a 248ab77
> from net were applied to net-next before applying the patchset. This V2 
> version does not assume this, and should be applyed directly on net-next
> without the aformentioned patches.

Series applied.

^ permalink raw reply

* Re: [PATCH net-next v3 1/2] net/ncsi: Add NCSI Broadcom OEM command
From: Samuel Mendoza-Jonas @ 2018-10-12  0:41 UTC (permalink / raw)
  To: Vijay Khemka, David S. Miller, netdev, linux-kernel
  Cc: openbmc @ lists . ozlabs . org, Justin.Lee1, joel, linux-aspeed
In-Reply-To: <20181011230518.3204700-1-vijaykhemka@fb.com>

On Thu, 2018-10-11 at 16:05 -0700, Vijay Khemka wrote:
> This patch adds OEM Broadcom commands and response handling. It also
> defines OEM Get MAC Address handler to get and configure the device.
> 
> ncsi_oem_gma_handler_bcm: This handler send NCSI broadcom command for
> getting mac address.
> ncsi_rsp_handler_oem_bcm: This handles response received for all
> broadcom OEM commands.
> ncsi_rsp_handler_oem_bcm_gma: This handles get mac address response and
> set it to device.
> 
> Signed-off-by: Vijay Khemka <vijaykhemka@fb.com>
> ---
>  net/ncsi/Kconfig       |  6 ++++
>  net/ncsi/internal.h    |  8 +++++
>  net/ncsi/ncsi-manage.c | 70 ++++++++++++++++++++++++++++++++++++++++++
>  net/ncsi/ncsi-pkt.h    |  8 +++++
>  net/ncsi/ncsi-rsp.c    | 40 +++++++++++++++++++++++-
>  5 files changed, 131 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ncsi/Kconfig b/net/ncsi/Kconfig
> index 08a8a6031fd7..7f2b46108a24 100644
> --- a/net/ncsi/Kconfig
> +++ b/net/ncsi/Kconfig
> @@ -10,3 +10,9 @@ config NET_NCSI
>  	  support. Enable this only if your system connects to a network
>  	  device via NCSI and the ethernet driver you're using supports
>  	  the protocol explicitly.
> +config NCSI_OEM_CMD_GET_MAC
> +	bool "Get NCSI OEM MAC Address"
> +	depends on NET_NCSI
> +	---help---
> +	  This allows to get MAC address from NCSI firmware and set them back to
> +		controller.
> diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
> index 3d0a33b874f5..45883b32790e 100644
> --- a/net/ncsi/internal.h
> +++ b/net/ncsi/internal.h
> @@ -71,6 +71,13 @@ enum {
>  /* OEM Vendor Manufacture ID */
>  #define NCSI_OEM_MFR_MLX_ID             0x8119
>  #define NCSI_OEM_MFR_BCM_ID             0x113d
> +/* Broadcom specific OEM Command */
> +#define NCSI_OEM_BCM_CMD_GMA            0x01   /* CMD ID for Get MAC */
> +/* OEM Command payload lengths*/
> +#define NCSI_OEM_BCM_CMD_GMA_LEN        12
> +/* Mac address offset in OEM response */
> +#define BCM_MAC_ADDR_OFFSET             28
> +
>  
>  struct ncsi_channel_version {
>  	u32 version;		/* Supported BCD encoded NCSI version */
> @@ -240,6 +247,7 @@ enum {
>  	ncsi_dev_state_probe_dp,
>  	ncsi_dev_state_config_sp	= 0x0301,
>  	ncsi_dev_state_config_cis,
> +	ncsi_dev_state_config_oem_gma,
>  	ncsi_dev_state_config_clear_vids,
>  	ncsi_dev_state_config_svf,
>  	ncsi_dev_state_config_ev,
> diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
> index 091284760d21..75504ccd1b95 100644
> --- a/net/ncsi/ncsi-manage.c
> +++ b/net/ncsi/ncsi-manage.c
> @@ -635,6 +635,39 @@ static int set_one_vid(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc,
>  	return 0;
>  }
>  
> +#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC)
> +
> +/* NCSI OEM Command APIs */
> +static void ncsi_oem_gma_handler_bcm(struct ncsi_cmd_arg *nca)
> +{
> +	int ret = 0;
> +	unsigned char data[NCSI_OEM_BCM_CMD_GMA_LEN];
> +
> +	nca->payload = NCSI_OEM_BCM_CMD_GMA_LEN;
> +
> +	memset(data, 0, NCSI_OEM_BCM_CMD_GMA_LEN);
> +	*(unsigned int *)data = ntohl(NCSI_OEM_MFR_BCM_ID);
> +	data[5] = NCSI_OEM_BCM_CMD_GMA;
> +
> +	nca->data = data;
> +
> +	ret = ncsi_xmit_cmd(nca);
> +	if (ret)
> +		netdev_err(nca->ndp->ndev.dev,
> +			   "NCSI: Failed to transmit cmd 0x%x during configure\n",
> +			   nca->type);
> +}
> +
> +/* OEM Command handlers initialization */
> +static struct ncsi_oem_gma_handler {
> +	unsigned int	mfr_id;
> +	void		(*handler)(struct ncsi_cmd_arg *nca);
> +} ncsi_oem_gma_handlers[] = {
> +	{ NCSI_OEM_MFR_BCM_ID, ncsi_oem_gma_handler_bcm }
> +};
> +
> +#endif /* CONFIG_NCSI_OEM_CMD_GET_MAC */
> +
>  static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
>  {
>  	struct ncsi_dev *nd = &ndp->ndev;
> @@ -685,6 +718,43 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
>  			goto error;
>  		}
>  
> +#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC)
> +		nd->state = ncsi_dev_state_config_oem_gma;
> +		break;
> +	case ncsi_dev_state_config_oem_gma:
> +		nca.type = NCSI_PKT_CMD_OEM;
> +		nca.package = np->id;
> +		nca.channel = nc->id;
> +		ndp->pending_req_num = 1;
> +
> +		/* Check for manufacturer id and Find the handler */
> +		struct ncsi_oem_gma_handler *nch = NULL;
> +		int i;
> +

This has the opposite affect, now if we do compile with
CONFIG_NCSI_OEM_CMD_GET_MAC we get:

../net/ncsi/ncsi-manage.c: In function ‘ncsi_configure_channel’:
../net/ncsi/ncsi-manage.c:769:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   struct ncsi_oem_gma_handler *nch = NULL;
   ^~~~~~

Perhaps we should lay this out slightly differently. For example we could go
through the ncsi_dev_state_config_oem_gma state regardless and call some other
function that finds and calls the handler, or does nothing if the config option
isn't set.

Regards,
Sam

> +		for (i = 0; i < ARRAY_SIZE(ncsi_oem_gma_handlers); i++) {
> +			if (ncsi_oem_gma_handlers[i].mfr_id ==
> +					nc->version.mf_id) {
> +				if (ncsi_oem_gma_handlers[i].handler)
> +					nch = &ncsi_oem_gma_handlers[i];
> +				else
> +					nch = NULL;
> +
> +				break;
> +			}
> +		}
> +
> +		if (!nch) {
> +			netdev_err(ndp->ndev.dev, "No handler available for GMA with MFR-ID (0x%x)\n",
> +				   nc->version.mf_id);
> +			nd->state = ncsi_dev_state_config_clear_vids;
> +			schedule_work(&ndp->work);
> +			break;
> +		}
> +
> +		/* Get Mac address from NCSI device */
> +		nch->handler(&nca);
> +#endif /* CONFIG_NCSI_OEM_CMD_GET_MAC */
> +
>  		nd->state = ncsi_dev_state_config_clear_vids;
>  		break;
>  	case ncsi_dev_state_config_clear_vids:
> diff --git a/net/ncsi/ncsi-pkt.h b/net/ncsi/ncsi-pkt.h
> index 0f2087c8d42a..4d3f06be38bd 100644
> --- a/net/ncsi/ncsi-pkt.h
> +++ b/net/ncsi/ncsi-pkt.h
> @@ -165,6 +165,14 @@ struct ncsi_rsp_oem_pkt {
>  	unsigned char           data[];      /* Payload data      */
>  };
>  
> +/* Broadcom Response Data */
> +struct ncsi_rsp_oem_bcm_pkt {
> +	unsigned char           ver;         /* Payload Version   */
> +	unsigned char           type;        /* OEM Command type  */
> +	__be16                  len;         /* Payload Length    */
> +	unsigned char           data[];      /* Cmd specific Data */
> +};
> +
>  /* Get Link Status */
>  struct ncsi_rsp_gls_pkt {
>  	struct ncsi_rsp_pkt_hdr rsp;        /* Response header   */
> diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
> index d66b34749027..31672e967db2 100644
> --- a/net/ncsi/ncsi-rsp.c
> +++ b/net/ncsi/ncsi-rsp.c
> @@ -596,12 +596,50 @@ static int ncsi_rsp_handler_snfc(struct ncsi_request *nr)
>  	return 0;
>  }
>  
> +/* Response handler for Broadcom command Get Mac Address */
> +static int ncsi_rsp_handler_oem_bcm_gma(struct ncsi_request *nr)
> +{
> +	struct ncsi_rsp_oem_pkt *rsp;
> +	struct ncsi_dev_priv *ndp = nr->ndp;
> +	struct net_device *ndev = ndp->ndev.dev;
> +	int ret = 0;
> +	const struct net_device_ops *ops = ndev->netdev_ops;
> +	struct sockaddr saddr;
> +
> +	/* Get the response header */
> +	rsp = (struct ncsi_rsp_oem_pkt *)skb_network_header(nr->rsp);
> +
> +	saddr.sa_family = ndev->type;
> +	ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
> +	memcpy(saddr.sa_data, &rsp->data[BCM_MAC_ADDR_OFFSET], ETH_ALEN);
> +	ret = ops->ndo_set_mac_address(ndev, &saddr);
> +	if (ret < 0)
> +		netdev_warn(ndev, "NCSI: Writing mac address to device failed\n");
> +
> +	return ret;
> +}
> +
> +/* Response handler for Broadcom card */
> +static int ncsi_rsp_handler_oem_bcm(struct ncsi_request *nr)
> +{
> +	struct ncsi_rsp_oem_pkt *rsp;
> +	struct ncsi_rsp_oem_bcm_pkt *bcm;
> +
> +	/* Get the response header */
> +	rsp = (struct ncsi_rsp_oem_pkt *)skb_network_header(nr->rsp);
> +	bcm = (struct ncsi_rsp_oem_bcm_pkt *)(rsp->data);
> +
> +	if (bcm->type == NCSI_OEM_BCM_CMD_GMA)
> +		return ncsi_rsp_handler_oem_bcm_gma(nr);
> +	return 0;
> +}
> +
>  static struct ncsi_rsp_oem_handler {
>  	unsigned int	mfr_id;
>  	int		(*handler)(struct ncsi_request *nr);
>  } ncsi_rsp_oem_handlers[] = {
>  	{ NCSI_OEM_MFR_MLX_ID, NULL },
> -	{ NCSI_OEM_MFR_BCM_ID, NULL }
> +	{ NCSI_OEM_MFR_BCM_ID, ncsi_rsp_handler_oem_bcm }
>  };
>  
>  /* Response handler for OEM command */

^ permalink raw reply

* Re: [net 1/1] tipc: initialize broadcast link stale counter correctly
From: Ying Xue @ 2018-10-12  0:41 UTC (permalink / raw)
  To: Jon Maloy, davem, netdev; +Cc: tipc-discussion
In-Reply-To: <1539288149-24122-1-git-send-email-jon.maloy@ericsson.com>

On 10/12/2018 04:02 AM, Jon Maloy wrote:
> In the commit referred to below we added link tolerance as an additional
> criteria for declaring broadcast transmission "stale" and resetting the
> unicast links to the affected node.
> 
> Unfortunately, this 'improvement' introduced two bugs, which each and
> one alone cause only limited problems, but combined lead to seemingly
> stochastic unicast link resets, depending on the amount of broadcast
> traffic transmitted.
> 
> The first issue, a missing initialization of the 'tolerance' field of
> the receiver broadcast link, was recently fixed by commit 047491ea334a
> ("tipc: set link tolerance correctly in broadcast link").
> 
> Ths second issue, where we omit to reset the 'stale_cnt' field of
> the same link after a 'stale' period is over, leads to this counter
> accumulating over time, and in the absence of the 'tolerance' criteria
> leads to the above described symptoms. This commit adds the missing
> initialization.
> 
> Fixes: a4dc70d46cf1 ("tipc: extend link reset criteria for stale packet
> retransmission")
> 
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>

Acked-by: Ying Xue <ying.xue@windriver.com>

> ---
>  net/tipc/link.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index f6552e4..201c3b5 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -1041,6 +1041,7 @@ static int tipc_link_retrans(struct tipc_link *l, struct tipc_link *r,
>  	if (r->last_retransm != buf_seqno(skb)) {
>  		r->last_retransm = buf_seqno(skb);
>  		r->stale_limit = jiffies + msecs_to_jiffies(r->tolerance);
> +		r->stale_cnt = 0;
>  	} else if (++r->stale_cnt > 99 && time_after(jiffies, r->stale_limit)) {
>  		link_retransmit_failure(l, skb);
>  		if (link_is_bc_sndlink(l))
> 

^ permalink raw reply

* Re: [PATCH net-next 00/19] mlxsw: Preparations for VxLAN support
From: David Miller @ 2018-10-11 17:10 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, petrm, mlxsw
In-Reply-To: <20181011074701.17983-1-idosch@mellanox.com>

From: Ido Schimmel <idosch@mellanox.com>
Date: Thu, 11 Oct 2018 07:47:48 +0000

> This patchset prepares mlxsw for VxLAN support. It contains small and
> mostly non-functional changes.

Series applied, thanks.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox