Netdev List
 help / color / mirror / Atom feed
* Re: net/tipc: recursive locking in tipc_link_reset
From: Dmitry Vyukov @ 2018-10-11 12:11 UTC (permalink / raw)
  To: Ying Xue; +Cc: Jon Maloy, David Miller, netdev, tipc-discussion, LKML
In-Reply-To: <97d8c196-cd11-8859-fc14-ed12191f52af@windriver.com>

On Thu, Oct 11, 2018 at 2:03 PM, Ying Xue <ying.xue@windriver.com> wrote:
>>> Hi,
>>>
>>> I am getting the following error while booting the latest kernel on
>>> bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.
>>>
>>> Since this happens during boot, this makes LOCKDEP completely
>>> unusable, does not allow to discover any other locking issues and
>>> masks all new bugs being introduced into kernel.
>>> Please fix asap.
>>> Thanks
>> -parthasarathy.bhuvaragan address as it gives me bounces
>> but this is highly likely due to:
>>
>> commit 3f32d0be6c16b902b687453c962d17eea5b8ea19
>> Author: Parthasarathy Bhuvaragan
>> Date:   Tue Sep 25 22:09:10 2018 +0200
>>
>>     tipc: lock wakeup & inputq at tipc_link_reset()
>>
>>
>
> Dmitry, I agree with you. The complaint should be caused by the commit
> above. Please try to verify the patch:
> https://patchwork.ozlabs.org/patch/982447.


I trust you for testing ;)

Thanks for the quick fix!

^ permalink raw reply

* Re: net/tipc: recursive locking in tipc_link_reset
From: Ying Xue @ 2018-10-11 12:05 UTC (permalink / raw)
  To: Jon Maloy, Dmitry Vyukov, parthasarathy.bhuvaragan@ericsson.com,
	David Miller, netdev, tipc-discussion@lists.sourceforge.net, LKML
In-Reply-To: <DM5PR15MB15134EF6AA09E8268FA58D519AE10@DM5PR15MB1513.namprd15.prod.outlook.com>

Jon, please help to review the patch:
https://patchwork.ozlabs.org/patch/982447.

Thanks,
Ying

On 10/11/2018 06:55 PM, Jon Maloy wrote:
> Hi Dmitry,
> Yes, we are aware of this, the kernel test robot warned us about this a few days ago.
> I am looking into it.
> 
> ///jon

^ permalink raw reply

* Re: net/tipc: recursive locking in tipc_link_reset
From: Ying Xue @ 2018-10-11 12:03 UTC (permalink / raw)
  To: Dmitry Vyukov, Jon Maloy, David Miller, netdev, tipc-discussion,
	LKML
In-Reply-To: <CACT4Y+Ym3_NuqWrgHXWfTahWTm17xi_2gjy8hvuDeyUCPuYXUQ@mail.gmail.com>

On 10/11/2018 03:59 PM, Dmitry Vyukov wrote:
> On Thu, Oct 11, 2018 at 9:55 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> Hi,
>>
>> I am getting the following error while booting the latest kernel on
>> bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.
>>
>> Since this happens during boot, this makes LOCKDEP completely
>> unusable, does not allow to discover any other locking issues and
>> masks all new bugs being introduced into kernel.
>> Please fix asap.
>> Thanks
> -parthasarathy.bhuvaragan address as it gives me bounces
> but this is highly likely due to:
> 
> commit 3f32d0be6c16b902b687453c962d17eea5b8ea19
> Author: Parthasarathy Bhuvaragan
> Date:   Tue Sep 25 22:09:10 2018 +0200
> 
>     tipc: lock wakeup & inputq at tipc_link_reset()
> 
> 

Dmitry, I agree with you. The complaint should be caused by the commit
above. Please try to verify the patch:
https://patchwork.ozlabs.org/patch/982447.

Thanks,
Ying

^ permalink raw reply

* Re: [RFC PATCH 2/2] net/ncsi: Configure multi-package, multi-channel modes with failover
From: Samuel Mendoza-Jonas @ 2018-10-11  4:25 UTC (permalink / raw)
  To: Justin.Lee1, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <d97bb52a060a40cf8e03a49eb82a285f@AUSX13MPS302.AMER.DELL.COM>

On Wed, 2018-10-10 at 22:36 +0000, Justin.Lee1@Dell.com wrote:
> Hi Samuel,
> 
> I am still testing your change and have some comments below.
> 
> Thanks,
> Justin
> 
> 
> > This patch extends the ncsi-netlink interface with two new commands and
> > three new attributes to configure multiple packages and/or channels at
> > once, and configure specific failover modes.
> > 
> > NCSI_CMD_SET_PACKAGE mask and NCSI_CMD_SET_CHANNEL_MASK set a whitelist
> > of packages or channels allowed to be configured with the
> > NCSI_ATTR_PACKAGE_MASK and NCSI_ATTR_CHANNEL_MASK attributes
> > respectively. If one of these whitelists is set only packages or
> > channels matching the whitelist are considered for the channel queue in
> > ncsi_choose_active_channel().
> > 
> > These commands may also use the NCSI_ATTR_MULTI_FLAG to signal that
> > multiple packages or channels may be configured simultaneously. NCSI
> > hardware arbitration (HWA) must be available in order to enable
> > multi-package mode. Multi-channel mode is always available.
> > 
> > If the NCSI_ATTR_CHANNEL_ID attribute is present in the
> > NCSI_CMD_SET_CHANNEL_MASK command the it sets the preferred channel as
> > with the NCSI_CMD_SET_INTERFACE command. The combination of preferred
> > channel and channel whitelist defines a primary channel and the allowed
> > failover channels.
> > If the NCSI_ATTR_MULTI_FLAG attribute is also present then the preferred
> > channel is configured for Tx/Rx and the other channels are enabled only
> > for Rx.
> > 
> > Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
> > ---
> >  include/uapi/linux/ncsi.h |  16 +++
> >  net/ncsi/internal.h       |  11 +-
> >  net/ncsi/ncsi-aen.c       |   2 +-
> >  net/ncsi/ncsi-manage.c    | 138 ++++++++++++++++--------
> >  net/ncsi/ncsi-netlink.c   | 217 +++++++++++++++++++++++++++++++++-----
> >  net/ncsi/ncsi-rsp.c       |   2 +-
> >  6 files changed, 312 insertions(+), 74 deletions(-)
> > 
> > diff --git a/include/uapi/linux/ncsi.h b/include/uapi/linux/ncsi.h
> > index 4c292ecbb748..035fba1693f9 100644
> > --- a/include/uapi/linux/ncsi.h
> > +++ b/include/uapi/linux/ncsi.h
> > @@ -23,6 +23,13 @@
> >   *	optionally the preferred NCSI_ATTR_CHANNEL_ID.
> >   * @NCSI_CMD_CLEAR_INTERFACE: clear any preferred package/channel combination.
> >   *	Requires NCSI_ATTR_IFINDEX.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > + *	Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
> > + * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed channels.
> > + *	Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
> > + *	NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
> > + *	the primary channel.
> >   * @NCSI_CMD_MAX: highest command number
> >   */
> 
> There are some typo in the description.
> * @NCSI_CMD_SET_PACKAGE_MASK: set a whitelist of allowed packages.
>  *	Requires NCSI_ATTR_IFINDEX and NCSI_ATTR_PACKAGE_MASK.
>  * @NCSI_CMD_SET_CHANNEL_MASK: set a whitelist of allowed channels.
>  *	Requires NCSI_ATTR_IFINDEX, NCSI_ATTR_PACKAGE_ID, and
>  *	NCSI_ATTR_CHANNEL_MASK. If NCSI_ATTR_CHANNEL_ID is present it sets
>  *	the primary channel.

Haha, yes I threw that in at the end, thanks for catching it.

> 
> >  enum ncsi_nl_commands {
> > @@ -30,6 +37,8 @@ enum ncsi_nl_commands {
> >  	NCSI_CMD_PKG_INFO,
> >  	NCSI_CMD_SET_INTERFACE,
> >  	NCSI_CMD_CLEAR_INTERFACE,
> > +	NCSI_CMD_SET_PACKAGE_MASK,
> > +	NCSI_CMD_SET_CHANNEL_MASK,
> >  
> >  	__NCSI_CMD_AFTER_LAST,
> >  	NCSI_CMD_MAX = __NCSI_CMD_AFTER_LAST - 1
> > @@ -43,6 +52,10 @@ enum ncsi_nl_commands {
> >   * @NCSI_ATTR_PACKAGE_LIST: nested array of NCSI_PKG_ATTR attributes
> >   * @NCSI_ATTR_PACKAGE_ID: package ID
> >   * @NCSI_ATTR_CHANNEL_ID: channel ID
> > + * @NCSI_ATTR_MULTI_FLAG: flag to signal that multi-mode should be enabled with
> > + *	NCSI_CMD_SET_PACKAGE_MASK or NCSI_CMD_SET_CHANNEL_MASK.
> > + * @NCSI_ATTR_PACKAGE_MASK: 32-bit mask of allowed packages.
> > + * @NCSI_ATTR_CHANNEL_MASK: 32-bit mask of allowed channels.
> >   * @NCSI_ATTR_MAX: highest attribute number
> >   */
> >  enum ncsi_nl_attrs {
> > @@ -51,6 +64,9 @@ enum ncsi_nl_attrs {
> >  	NCSI_ATTR_PACKAGE_LIST,
> >  	NCSI_ATTR_PACKAGE_ID,
> >  	NCSI_ATTR_CHANNEL_ID,
> > +	NCSI_ATTR_MULTI_FLAG,
> > +	NCSI_ATTR_PACKAGE_MASK,
> > +	NCSI_ATTR_CHANNEL_MASK,
> 
> Is there a case that we might set these two masks at the same time?
> If not, maybe we can just have one generic MASK attribute.
> 

I thought of this too: not yet, but I wonder if we might in the future.
I'll have a think about it.

> >  
> >  	__NCSI_ATTR_AFTER_LAST,
> >  	NCSI_ATTR_MAX = __NCSI_ATTR_AFTER_LAST - 1
> > diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
> > index 3d0a33b874f5..8437474d0a78 100644
> > --- a/net/ncsi/internal.h
> > +++ b/net/ncsi/internal.h
> > @@ -213,6 +213,10 @@ struct ncsi_package {
> >  	unsigned int         channel_num; /* Number of channels     */
> >  	struct list_head     channels;    /* List of chanels        */
> >  	struct list_head     node;        /* Form list of packages  */
> > +
> > +	bool                 multi_channel; /* Enable multiple channels  */
> > +	u32                  channel_whitelist; /* Channels to configure */
> > +	struct ncsi_channel  *preferred_channel; /* Primary channel      */
> >  };
> >  
> >  struct ncsi_request {
> > @@ -280,8 +284,6 @@ struct ncsi_dev_priv {
> >  	unsigned int        package_num;     /* Number of packages         */
> >  	struct list_head    packages;        /* List of packages           */
> >  	struct ncsi_channel *hot_channel;    /* Channel was ever active    */
> > -	struct ncsi_package *force_package;  /* Force a specific package   */
> > -	struct ncsi_channel *force_channel;  /* Force a specific channel   */
> >  	struct ncsi_request requests[256];   /* Request table              */
> >  	unsigned int        request_id;      /* Last used request ID       */
> >  #define NCSI_REQ_START_IDX	1
> > @@ -294,6 +296,9 @@ struct ncsi_dev_priv {
> >  	struct list_head    node;            /* Form NCSI device list      */
> >  #define NCSI_MAX_VLAN_VIDS	15
> >  	struct list_head    vlan_vids;       /* List of active VLAN IDs */
> > +
> > +	bool                multi_package;   /* Enable multiple packages   */
> > +	u32                 package_whitelist; /* Packages to configure    */
> >  };
> >  
> >  struct ncsi_cmd_arg {
> > @@ -345,6 +350,8 @@ struct ncsi_request *ncsi_alloc_request(struct ncsi_dev_priv *ndp,
> >  void ncsi_free_request(struct ncsi_request *nr);
> >  struct ncsi_dev *ncsi_find_dev(struct net_device *dev);
> >  int ncsi_process_next_channel(struct ncsi_dev_priv *ndp);
> > +bool ncsi_channel_is_last(struct ncsi_dev_priv *ndp,
> > +			  struct ncsi_channel *channel);
> >  
> >  /* Packet handlers */
> >  u32 ncsi_calculate_checksum(unsigned char *data, int len);
> > diff --git a/net/ncsi/ncsi-aen.c b/net/ncsi/ncsi-aen.c
> > index 65f47a648be3..eac56aee30c4 100644
> > --- a/net/ncsi/ncsi-aen.c
> > +++ b/net/ncsi/ncsi-aen.c
> > @@ -86,7 +86,7 @@ static int ncsi_aen_handler_lsc(struct ncsi_dev_priv *ndp,
> >  	    !(state == NCSI_CHANNEL_ACTIVE && !(data & 0x1)))
> >  		return 0;
> >  
> > -	if (state == NCSI_CHANNEL_ACTIVE)
> > +	if (state == NCSI_CHANNEL_ACTIVE && ncsi_channel_is_last(ndp, nc))
> >  		ndp->flags |= NCSI_DEV_RESHUFFLE;
> >  
> >  	ncsi_stop_channel_monitor(nc);
> > diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
> > index 665bee25ec44..6a55df700bcb 100644
> > --- a/net/ncsi/ncsi-manage.c
> > +++ b/net/ncsi/ncsi-manage.c
> > @@ -27,6 +27,24 @@
> >  LIST_HEAD(ncsi_dev_list);
> >  DEFINE_SPINLOCK(ncsi_dev_lock);
> >  
> > +/* Returns true if the given channel is the last channel available */
> > +bool ncsi_channel_is_last(struct ncsi_dev_priv *ndp,
> > +			  struct ncsi_channel *channel)
> > +{
> > +	struct ncsi_package *np;
> > +	struct ncsi_channel *nc;
> > +
> > +	NCSI_FOR_EACH_PACKAGE(ndp, np)
> > +		NCSI_FOR_EACH_CHANNEL(np, nc) {
> > +			if (nc == channel)
> > +				continue;
> > +			if (nc->state == NCSI_CHANNEL_ACTIVE)
> > +				return false;
> > +		}
> > +
> > +	return true;
> > +}
> > +
> >  static void ncsi_report_link(struct ncsi_dev_priv *ndp, bool force_down)
> >  {
> >  	struct ncsi_dev *nd = &ndp->ndev;
> > @@ -266,6 +284,7 @@ struct ncsi_package *ncsi_add_package(struct ncsi_dev_priv *ndp,
> >  	np->ndp = ndp;
> >  	spin_lock_init(&np->lock);
> >  	INIT_LIST_HEAD(&np->channels);
> > +	np->channel_whitelist = UINT_MAX;
> >  
> >  	spin_lock_irqsave(&ndp->lock, flags);
> >  	tmp = ncsi_find_package(ndp, id);
> > @@ -633,6 +652,34 @@ static int set_one_vid(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc,
> >  	return 0;
> >  }
> >  
> > +/* Determine if a given channel should be the Tx channel */
> > +bool ncsi_channel_is_tx(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc)
> > +{
> > +	struct ncsi_package *np = nc->package;
> > +	struct ncsi_channel *channel;
> > +	struct ncsi_channel_mode *ncm;
> > +
> > +	NCSI_FOR_EACH_CHANNEL(np, channel) {
> > +		ncm = &channel->modes[NCSI_MODE_TX_ENABLE];
> > +		/* Another channel is already Tx */
> > +		if (ncm->enable)
> > +			return false;
> > +	}
> > +
> > +	if (!np->preferred_channel)
> > +		return true;
> > +
> > +	if (np->preferred_channel == nc)
> > +		return true;
> > +
> > +	/* The preferred channel is not in the queue and not active */
> > +	if (list_empty(&np->preferred_channel->link) &&
> > +	    np->preferred_channel->state != NCSI_CHANNEL_ACTIVE)
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> >  static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> >  {
> >  	struct ncsi_dev *nd = &ndp->ndev;
> > @@ -745,18 +792,22 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> >  		} else if (nd->state == ncsi_dev_state_config_ebf) {
> >  			nca.type = NCSI_PKT_CMD_EBF;
> >  			nca.dwords[0] = nc->caps[NCSI_CAP_BC].cap;
> > -			nd->state = ncsi_dev_state_config_ecnt;
> > +			if (ncsi_channel_is_tx(ndp, nc))
> > +				nd->state = ncsi_dev_state_config_ecnt;
> > +			else
> > +				nd->state = ncsi_dev_state_config_ec;
> >  #if IS_ENABLED(CONFIG_IPV6)
> >  			if (ndp->inet6_addr_num > 0 &&
> >  			    (nc->caps[NCSI_CAP_GENERIC].cap &
> >  			     NCSI_CAP_GENERIC_MC))
> >  				nd->state = ncsi_dev_state_config_egmf;
> > -			else
> > -				nd->state = ncsi_dev_state_config_ecnt;
> >  		} else if (nd->state == ncsi_dev_state_config_egmf) {
> >  			nca.type = NCSI_PKT_CMD_EGMF;
> >  			nca.dwords[0] = nc->caps[NCSI_CAP_MC].cap;
> > -			nd->state = ncsi_dev_state_config_ecnt;
> > +			if (ncsi_channel_is_tx(ndp, nc))
> > +				nd->state = ncsi_dev_state_config_ecnt;
> > +			else
> > +				nd->state = ncsi_dev_state_config_ec;
> >  #endif /* CONFIG_IPV6 */
> >  		} else if (nd->state == ncsi_dev_state_config_ecnt) {
> >  			nca.type = NCSI_PKT_CMD_ECNT;
> > @@ -840,43 +891,35 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
> >  
> >  static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> >  {
> > -	struct ncsi_package *np, *force_package;
> > -	struct ncsi_channel *nc, *found, *hot_nc, *force_channel;
> > +	struct ncsi_package *np;
> > +	struct ncsi_channel *nc, *found, *hot_nc;
> >  	struct ncsi_channel_mode *ncm;
> > -	unsigned long flags;
> > +	unsigned long flags, cflags;
> > +	bool with_link;
> >  
> >  	spin_lock_irqsave(&ndp->lock, flags);
> >  	hot_nc = ndp->hot_channel;
> > -	force_channel = ndp->force_channel;
> > -	force_package = ndp->force_package;
> >  	spin_unlock_irqrestore(&ndp->lock, flags);
> >  
> > -	/* Force a specific channel whether or not it has link if we have been
> > -	 * configured to do so
> > -	 */
> > -	if (force_package && force_channel) {
> > -		found = force_channel;
> > -		ncm = &found->modes[NCSI_MODE_LINK];
> > -		if (!(ncm->data[2] & 0x1))
> > -			netdev_info(ndp->ndev.dev,
> > -				    "NCSI: Channel %u forced, but it is link down\n",
> > -				    found->id);
> > -		goto out;
> > -	}
> > -
> > -	/* The search is done once an inactive channel with up
> > -	 * link is found.
> > +	/* By default the search is done once an inactive channel with up
> > +	 * link is found, unless a preferred channel is set.
> > +	 * If multi_package or multi_channel are configured all channels in the
> > +	 * whitelist with link are added to the channel queue.
> >  	 */
> >  	found = NULL;
> > +	with_link = false;
> >  	NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > -		if (ndp->force_package && np != ndp->force_package)
> > +		if (!(ndp->package_whitelist & (0x1 << np->id)))
> >  			continue;
> >  		NCSI_FOR_EACH_CHANNEL(np, nc) {
> > -			spin_lock_irqsave(&nc->lock, flags);
> > +			if (!(np->channel_whitelist & (0x1 << nc->id)))
> > +				continue;
> > +
> > +			spin_lock_irqsave(&nc->lock, cflags);
> >  
> >  			if (!list_empty(&nc->link) ||
> >  			    nc->state != NCSI_CHANNEL_INACTIVE) {
> > -				spin_unlock_irqrestore(&nc->lock, flags);
> > +				spin_unlock_irqrestore(&nc->lock, cflags);
> >  				continue;
> >  			}
> >  
> > @@ -888,32 +931,42 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> >  
> >  			ncm = &nc->modes[NCSI_MODE_LINK];
> >  			if (ncm->data[2] & 0x1) {
> 
> This data will not be updated if the channel monitor for it is not running.
> If I move the cable from the current configured channel to the other channel,
> NC-SI module will not detect the link status as the other channel is not configured
> and AEN will not happen.
> Is it per design that NC-SI module will always use the first interface with the link?

We'll check the link state of every channel at init, and per-package on
suspend events, but you're right that we only get AENs right now from
configured channels. There's probably no reason why we could at least
enable AENs on all channels even if we don't use them for network, maybe
during the package probe step.

> 
> > -				spin_unlock_irqrestore(&nc->lock, flags);
> >  				found = nc;
> > -				goto out;
> > +				with_link = true;
> > +
> > +				spin_lock_irqsave(&ndp->lock, flags);
> > +				list_add_tail_rcu(&found->link,
> > +						  &ndp->channel_queue);
> > +				spin_unlock_irqrestore(&ndp->lock, flags);
> > +
> > +				netdev_dbg(ndp->ndev.dev,
> > +					   "NCSI: Channel %u added to queue (link %s)\n",
> > +					   found->id,
> > +					   ncm->data[2] & 0x1 ? "up" : "down");
> >  			}
> > +			spin_unlock_irqrestore(&nc->lock, cflags);
> >  
> > -			spin_unlock_irqrestore(&nc->lock, flags);
> > +			if (with_link && !np->multi_channel)
> > +				break;
> >  		}
> > +		if (with_link && !ndp->multi_package)
> > +			break;
> >  	}
> >  
> > -	if (!found) {
> > +	if (!with_link && found) {
> > +		netdev_info(ndp->ndev.dev,
> > +			    "NCSI: No channel with link found, configuring channel %u\n",
> > +			    found->id);
> > +		spin_lock_irqsave(&ndp->lock, flags);
> > +		list_add_tail_rcu(&found->link, &ndp->channel_queue);
> > +		spin_unlock_irqrestore(&ndp->lock, flags);
> > +	} else if (!found) {
> >  		netdev_warn(ndp->ndev.dev,
> > -			    "NCSI: No channel found with link\n");
> > +			    "NCSI: No channel found to configure!\n");
> >  		ncsi_report_link(ndp, true);
> >  		return -ENODEV;
> >  	}
> >  
> > -	ncm = &found->modes[NCSI_MODE_LINK];
> > -	netdev_dbg(ndp->ndev.dev,
> > -		   "NCSI: Channel %u added to queue (link %s)\n",
> > -		   found->id, ncm->data[2] & 0x1 ? "up" : "down");
> > -
> > -out:
> > -	spin_lock_irqsave(&ndp->lock, flags);
> > -	list_add_tail_rcu(&found->link, &ndp->channel_queue);
> > -	spin_unlock_irqrestore(&ndp->lock, flags);
> > -
> >  	return ncsi_process_next_channel(ndp);
> >  }
> >  
> > @@ -1428,6 +1481,7 @@ struct ncsi_dev *ncsi_register_dev(struct net_device *dev,
> >  	INIT_LIST_HEAD(&ndp->channel_queue);
> >  	INIT_LIST_HEAD(&ndp->vlan_vids);
> >  	INIT_WORK(&ndp->work, ncsi_dev_work);
> > +	ndp->package_whitelist = UINT_MAX;
> >  
> >  	/* Initialize private NCSI device */
> >  	spin_lock_init(&ndp->lock);
> > diff --git a/net/ncsi/ncsi-netlink.c b/net/ncsi/ncsi-netlink.c
> > index 32cb7751d216..33a091e6f466 100644
> > --- a/net/ncsi/ncsi-netlink.c
> > +++ b/net/ncsi/ncsi-netlink.c
> 
> Is the following missed in the patch?
> static const struct nla_policy ncsi_genl_policy[NCSI_ATTR_MAX + 1] = {
> ...
> 	[NCSI_ATTR_MULTI_FLAG] =	{ .type = NLA_FLAG },
> 	[NCSI_ATTR_PACKAGE_MASK] =	{ .type = NLA_U32 },
> 	[NCSI_ATTR_CHANNEL_MASK] =	{ .type = NLA_U32 },

Oops, added.

> 
> > @@ -67,7 +67,7 @@ static int ncsi_write_channel_info(struct sk_buff *skb,
> >  	nla_put_u32(skb, NCSI_CHANNEL_ATTR_LINK_STATE, m->data[2]);
> >  	if (nc->state == NCSI_CHANNEL_ACTIVE)
> >  		nla_put_flag(skb, NCSI_CHANNEL_ATTR_ACTIVE);
> > -	if (ndp->force_channel == nc)
> > +	if (nc == nc->package->preferred_channel)
> >  		nla_put_flag(skb, NCSI_CHANNEL_ATTR_FORCED);
> >  
> >  	nla_put_u32(skb, NCSI_CHANNEL_ATTR_VERSION_MAJOR, nc->version.version);
> > @@ -112,7 +112,7 @@ static int ncsi_write_package_info(struct sk_buff *skb,
> >  		if (!pnest)
> >  			return -ENOMEM;
> >  		nla_put_u32(skb, NCSI_PKG_ATTR_ID, np->id);
> > -		if (ndp->force_package == np)
> > +		if ((0x1 << np->id) == ndp->package_whitelist)
> >  			nla_put_flag(skb, NCSI_PKG_ATTR_FORCED);
> >  		cnest = nla_nest_start(skb, NCSI_PKG_ATTR_CHANNEL_LIST);
> >  		if (!cnest) {
> > @@ -288,45 +288,54 @@ static int ncsi_set_interface_nl(struct sk_buff *msg, struct genl_info *info)
> >  	package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
> >  	package = NULL;
> >  
> > -	spin_lock_irqsave(&ndp->lock, flags);
> > -
> >  	NCSI_FOR_EACH_PACKAGE(ndp, np)
> >  		if (np->id == package_id)
> >  			package = np;
> >  	if (!package) {
> >  		/* The user has set a package that does not exist */
> > -		spin_unlock_irqrestore(&ndp->lock, flags);
> >  		return -ERANGE;
> >  	}
> >  
> >  	channel = NULL;
> > -	if (!info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > -		/* Allow any channel */
> > -		channel_id = NCSI_RESERVED_CHANNEL;
> > -	} else {
> > +	if (info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> >  		channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
> >  		NCSI_FOR_EACH_CHANNEL(package, nc)
> > -			if (nc->id == channel_id)
> > +			if (nc->id == channel_id) {
> >  				channel = nc;
> > +				break;
> > +			}
> > +		if (!channel) {
> > +			netdev_info(ndp->ndev.dev,
> > +				    "NCSI: Channel %u does not exist!\n",
> > +				    channel_id);
> > +			return -ERANGE;
> > +		}
> >  	}
> >  
> > -	if (channel_id != NCSI_RESERVED_CHANNEL && !channel) {
> > -		/* The user has set a channel that does not exist on this
> > -		 * package
> > -		 */
> > -		spin_unlock_irqrestore(&ndp->lock, flags);
> > -		netdev_info(ndp->ndev.dev, "NCSI: Channel %u does not exist!\n",
> > -			    channel_id);
> > -		return -ERANGE;
> > -	}
> > -
> > -	ndp->force_package = package;
> > -	ndp->force_channel = channel;
> > +	spin_lock_irqsave(&ndp->lock, flags);
> > +	ndp->package_whitelist = 0x1 << package->id;
> > +	ndp->multi_package = false;
> >  	spin_unlock_irqrestore(&ndp->lock, flags);
> >  
> > -	netdev_info(ndp->ndev.dev, "Set package 0x%x, channel 0x%x%s as preferred\n",
> > -		    package_id, channel_id,
> > -		    channel_id == NCSI_RESERVED_CHANNEL ? " (any)" : "");
> > +	spin_lock_irqsave(&package->lock, flags);
> > +	package->multi_channel = false;
> > +	if (channel) {
> > +		package->channel_whitelist = 0x1 << channel->id;
> > +		package->preferred_channel = channel;
> > +	} else {
> > +		/* Allow any channel */
> > +		package->channel_whitelist = UINT_MAX;
> > +		package->preferred_channel = NULL;
> > +	}
> > +	spin_unlock_irqrestore(&package->lock, flags);
> > +
> > +	if (channel)
> > +		netdev_info(ndp->ndev.dev,
> > +			    "Set package 0x%x, channel 0x%x as preferred\n",
> > +			    package_id, channel_id);
> > +	else
> > +		netdev_info(ndp->ndev.dev, "Set package 0x%x as preferred\n",
> > +			    package_id);
> >  
> >  	/* Bounce the NCSI channel to set changes */
> >  	ncsi_stop_dev(&ndp->ndev);
> > @@ -338,6 +347,7 @@ static int ncsi_set_interface_nl(struct sk_buff *msg, struct genl_info *info)
> >  static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> >  {
> >  	struct ncsi_dev_priv *ndp;
> > +	struct ncsi_package *np;
> >  	unsigned long flags;
> >  
> >  	if (!info || !info->attrs)
> > @@ -351,11 +361,19 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> >  	if (!ndp)
> >  		return -ENODEV;
> >  
> > -	/* Clear any override */
> > +	/* Reset any whitelists and disable multi mode */
> >  	spin_lock_irqsave(&ndp->lock, flags);
> > -	ndp->force_package = NULL;
> > -	ndp->force_channel = NULL;
> > +	ndp->package_whitelist = UINT_MAX;
> > +	ndp->multi_package = false;
> >  	spin_unlock_irqrestore(&ndp->lock, flags);
> > +
> > +	NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > +		spin_lock_irqsave(&np->lock, flags);
> > +		np->multi_channel = false;
> > +		np->channel_whitelist = UINT_MAX;
> > +		np->preferred_channel = NULL;
> > +		spin_unlock_irqrestore(&np->lock, flags);
> > +	}
> >  	netdev_info(ndp->ndev.dev, "NCSI: Cleared preferred package/channel\n");
> >  
> >  	/* Bounce the NCSI channel to set changes */
> > @@ -365,6 +383,137 @@ static int ncsi_clear_interface_nl(struct sk_buff *msg, struct genl_info *info)
> >  	return 0;
> >  }
> >  
> > +static int ncsi_set_package_mask_nl(struct sk_buff *msg,
> > +				    struct genl_info *info)
> > +{
> > +	struct ncsi_dev_priv *ndp;
> > +	unsigned long flags;
> > +	int rc;
> > +
> > +	if (!info || !info->attrs)
> > +		return -EINVAL;
> > +
> > +	if (!info->attrs[NCSI_ATTR_IFINDEX])
> > +		return -EINVAL;
> > +
> > +	if (!info->attrs[NCSI_ATTR_PACKAGE_MASK])
> > +		return -EINVAL;
> > +
> > +	ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
> > +			       nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
> > +	if (!ndp)
> > +		return -ENODEV;
> > +
> > +	spin_lock_irqsave(&ndp->lock, flags);
> > +	ndp->package_whitelist =
> > +		nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_MASK]);
> > +
> > +	if (nla_get_flag(info->attrs[NCSI_ATTR_MULTI_FLAG])) {
> > +		if (ndp->flags & NCSI_DEV_HWA) {
> > +			ndp->multi_package = true;
> > +			rc = 0;
> > +		} else {
> > +			netdev_err(ndp->ndev.dev,
> > +				   "NCSI: Can't use multiple packages without HWA\n");
> > +			rc = -EPERM;
> > +		}
> > +	} else {
> > +		rc = 0;
> > +	}
> > +
> > +	spin_unlock_irqrestore(&ndp->lock, flags);
> > +
> > +	if (!rc) {
> > +		/* Bounce the NCSI channel to set changes */
> > +		ncsi_stop_dev(&ndp->ndev);
> > +		ncsi_start_dev(&ndp->ndev);
> 
> Is it possible to delay the restart? If we have two packages, we might send
> set_package_mask command once and set_channel_mask command twice.
> We will see the unnecessary reconfigurations in a very short period time.

We could probably do with a better way of bouncing the link here too. I
suppose we could schedule the actual link 'bounce' to occur a second or
two later, and delay if we receive new configuration in that time.
Perhaps something outside of the scope of this patch though.

> 
> > +	}
> > +
> > +	return rc;
> > +}
> > +
> > +static int ncsi_set_channel_mask_nl(struct sk_buff *msg,
> > +				    struct genl_info *info)
> > +{
> > +	struct ncsi_package *np, *package;
> > +	struct ncsi_channel *nc, *channel;
> > +	struct ncsi_dev_priv *ndp;
> > +	unsigned long flags;
> > +	u32 package_id, channel_id;
> > +
> > +	if (!info || !info->attrs)
> > +		return -EINVAL;
> > +
> > +	if (!info->attrs[NCSI_ATTR_IFINDEX])
> > +		return -EINVAL;
> > +
> > +	if (!info->attrs[NCSI_ATTR_PACKAGE_ID])
> > +		return -EINVAL;
> > +
> > +	if (!info->attrs[NCSI_ATTR_CHANNEL_MASK])
> > +		return -EINVAL;
> > +
> > +	ndp = ndp_from_ifindex(get_net(sock_net(msg->sk)),
> > +			       nla_get_u32(info->attrs[NCSI_ATTR_IFINDEX]));
> > +	if (!ndp)
> > +		return -ENODEV;
> > +
> > +	package_id = nla_get_u32(info->attrs[NCSI_ATTR_PACKAGE_ID]);
> > +	package = NULL;
> > +	NCSI_FOR_EACH_PACKAGE(ndp, np)
> > +		if (np->id == package_id) {
> > +			package = np;
> > +			break;
> > +		}
> > +	if (!package)
> > +		return -ERANGE;
> > +
> > +	spin_lock_irqsave(&package->lock, flags);
> > +
> > +	channel = NULL;
> > +	if (info->attrs[NCSI_ATTR_CHANNEL_ID]) {
> > +		channel_id = nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_ID]);
> > +		NCSI_FOR_EACH_CHANNEL(np, nc)
> > +			if (nc->id == channel_id) {
> > +				channel = nc;
> > +				break;
> > +			}
> > +		if (!channel) {
> > +			spin_unlock_irqrestore(&package->lock, flags);
> > +			return -ERANGE;
> > +		}
> > +		netdev_dbg(ndp->ndev.dev,
> > +			   "NCSI: Channel %u set as preferred channel\n",
> > +			   channel->id);
> > +	}
> > +
> > +	package->channel_whitelist =
> > +		nla_get_u32(info->attrs[NCSI_ATTR_CHANNEL_MASK]);
> > +	if (package->channel_whitelist == 0)
> > +		netdev_dbg(ndp->ndev.dev,
> > +			   "NCSI: Package %u set to all channels disabled\n",
> > +			   package->id);
> > +
> > +	package->preferred_channel = channel;
> > +
> > +	if (nla_get_flag(info->attrs[NCSI_ATTR_MULTI_FLAG])) {
> > +		package->multi_channel = true;
> > +		netdev_info(ndp->ndev.dev,
> > +			    "NCSI: Multi-channel enabled on package %u\n",
> > +			    package_id);
> > +	} else {
> > +		package->multi_channel = false;
> > +	}
> > +
> > +	spin_unlock_irqrestore(&package->lock, flags);
> > +
> > +	/* Bounce the NCSI channel to set changes */
> > +	ncsi_stop_dev(&ndp->ndev);
> > +	ncsi_start_dev(&ndp->ndev);
> 
> Same question as set_package_mask function.
> Is it possible to delay the restart? If we have two packages, we might send
> set_package_mask command once and set_channel_mask command twice.
> We will see the unnecessary reconfigurations in a very short period time.
> 
> > +
> > +	return 0;
> > +}
> > +
> >  static const struct genl_ops ncsi_ops[] = {
> >  	{
> >  		.cmd = NCSI_CMD_PKG_INFO,
> > @@ -385,6 +534,18 @@ static const struct genl_ops ncsi_ops[] = {
> >  		.doit = ncsi_clear_interface_nl,
> >  		.flags = GENL_ADMIN_PERM,
> >  	},
> > +	{
> > +		.cmd = NCSI_CMD_SET_PACKAGE_MASK,
> > +		.policy = ncsi_genl_policy,
> > +		.doit = ncsi_set_package_mask_nl,
> > +		.flags = GENL_ADMIN_PERM,
> > +	},
> > +	{
> > +		.cmd = NCSI_CMD_SET_CHANNEL_MASK,
> > +		.policy = ncsi_genl_policy,
> > +		.doit = ncsi_set_channel_mask_nl,
> > +		.flags = GENL_ADMIN_PERM,
> > +	},
> >  };
> >  
> >  static struct genl_family ncsi_genl_family __ro_after_init = {
> > diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
> > index d66b34749027..02ce7626b579 100644
> > --- a/net/ncsi/ncsi-rsp.c
> > +++ b/net/ncsi/ncsi-rsp.c
> > @@ -241,7 +241,7 @@ static int ncsi_rsp_handler_dcnt(struct ncsi_request *nr)
> >  	if (!ncm->enable)
> >  		return 0;
> >  
> > -	ncm->enable = 1;
> > +	ncm->enable = 0;
> >  	return 0;
> >  }
> >  
> > -- 
> > 2.19.0
> 
> 

^ permalink raw reply

* [PATCH net] tipc: eliminate possible recursive locking detected by LOCKDEP
From: Ying Xue @ 2018-10-11 11:57 UTC (permalink / raw)
  To: jon.maloy, dvyukov
  Cc: netdev, tipc-discussion, davem, linux-kernel,
	parthasarathy.bhuvaragan

When booting kernel with LOCKDEP option, below warning info was found:

WARNING: possible recursive locking detected
4.19.0-rc7+ #14 Not tainted
--------------------------------------------
swapper/0/1 is trying to acquire lock:
00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
include/linux/spinlock.h:334 [inline]
00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850

but task is already holding lock:
00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
include/linux/spinlock.h:334 [inline]
00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&list->lock)->rlock#4);
  lock(&(&list->lock)->rlock#4);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by swapper/0/1:
 #0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
 #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
spin_lock_bh include/linux/spinlock.h:334 [inline]
 #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849

stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1af/0x295 lib/dump_stack.c:113
 print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
 check_deadlock kernel/locking/lockdep.c:1803 [inline]
 validate_chain kernel/locking/lockdep.c:2399 [inline]
 __lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
 lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
 _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
 spin_lock_bh include/linux/spinlock.h:334 [inline]
 tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
 tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
 tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
 tipc_init_net+0x472/0x610 net/tipc/core.c:82
 ops_init+0xf7/0x520 net/core/net_namespace.c:129
 __register_pernet_operations net/core/net_namespace.c:940 [inline]
 register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
 register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
 tipc_init+0x83/0x104 net/tipc/core.c:140
 do_one_initcall+0x109/0x70a init/main.c:885
 do_initcall_level init/main.c:953 [inline]
 do_initcalls init/main.c:961 [inline]
 do_basic_setup init/main.c:979 [inline]
 kernel_init_freeable+0x4bd/0x57f init/main.c:1144
 kernel_init+0x13/0x180 init/main.c:1063
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413

The reason why the noise above was complained by LOCKDEP is because we
nested to hold l->wakeupq.lock and l->inputq->lock in tipc_link_reset
function. In fact it's unnecessary to move skb buffer from l->wakeupq
queue to l->inputq queue while holding the two locks at the same time.
Instead, we can move skb buffers in l->wakeupq queue to a temporary
list first and then move the buffers of the temporary list to l->inputq
queue, which is also safe for us.

Fixes: 3f32d0be6c16 ("tipc: lock wakeup & inputq at tipc_link_reset()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 net/tipc/link.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index fb886b5..1d21ae4 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -843,14 +843,21 @@ static void link_prepare_wakeup(struct tipc_link *l)
 
 void tipc_link_reset(struct tipc_link *l)
 {
+	struct sk_buff_head list;
+
+	__skb_queue_head_init(&list);
+
 	l->in_session = false;
 	l->session++;
 	l->mtu = l->advertised_mtu;
+
 	spin_lock_bh(&l->wakeupq.lock);
+	skb_queue_splice_init(&l->wakeupq, &list);
+	spin_unlock_bh(&l->wakeupq.lock);
+
 	spin_lock_bh(&l->inputq->lock);
-	skb_queue_splice_init(&l->wakeupq, l->inputq);
+	skb_queue_splice_init(&list, l->inputq);
 	spin_unlock_bh(&l->inputq->lock);
-	spin_unlock_bh(&l->wakeupq.lock);
 
 	__skb_queue_purge(&l->transmq);
 	__skb_queue_purge(&l->deferdq);
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH net-next v5] net/ncsi: Extend NC-SI Netlink interface to allow user space to send NC-SI command
From: Samuel Mendoza-Jonas @ 2018-10-11  3:55 UTC (permalink / raw)
  To: Justin.Lee1, joel
  Cc: linux-aspeed, netdev, openbmc, amithash, christian, vijaykhemka
In-Reply-To: <245b32d9a3dd463f9334b9704b639ec4@AUSX13MPS302.AMER.DELL.COM>

On Wed, 2018-10-10 at 18:11 +0000, Justin.Lee1@Dell.com wrote:
<snip>
> +
> +	len = nla_len(info->attrs[NCSI_ATTR_DATA]);
> +	if (len < sizeof(struct ncsi_pkt_hdr)) {
> +		netdev_info(ndp->ndev.dev, "NCSI: no command to send %u\n",
> +			    package_id);
> +		ret = -EINVAL;
> +		goto out_netlink;
> +	} else {
> +		data = (unsigned char *)nla_data(info->attrs[NCSI_ATTR_DATA]);
> +	}

I only just noticed this, the call to nla_len() can cause a null-dereference if
the NCSI_ATTR_DATA attribute isn't present; we need to make sure it exists
before accessing it in info->attrs.

eg:

root@ozrom2-bmc:~# ./ncsi-netlink -l 2 -p 0 -c 0 --cmd
[   81.399837] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   81.409092] pgd = ddaa9fa6
[   81.413084] [00000000] *pgd=9702c831, *pte=00000000, *ppte=00000000
[   81.420729] Internal error: Oops: 17 [#1] ARM
[   81.426447] CPU: 0 PID: 1028 Comm: ncsi-netlink Not tainted 4.18.8-sammj-00144-gbc129f31bfa5 #12
...
[   81.874434] Kernel panic - not syncing: Fatal exception

Cheers,
Sam

^ permalink raw reply

* RE: net/tipc: recursive locking in tipc_link_reset
From: Jon Maloy @ 2018-10-11 10:55 UTC (permalink / raw)
  To: Dmitry Vyukov, parthasarathy.bhuvaragan@ericsson.com,
	David Miller, Ying Xue, netdev,
	tipc-discussion@lists.sourceforge.net, LKML
  Cc: Parthasarathy Bhuvaragan
In-Reply-To: <CACT4Y+bcxEz=P2E5TRj6A1qeCOQ+WtWR44WGLzqdDttLzD393A@mail.gmail.com>

Hi Dmitry,
Yes, we are aware of this, the kernel test robot warned us about this a few days ago.
I am looking into it.

///jon

> -----Original Message-----
> From: Dmitry Vyukov <dvyukov@google.com>
> Sent: October 11, 2018 3:55 AM
> To: parthasarathy.bhuvaragan@ericsson.com; Jon Maloy
> <jon.maloy@ericsson.com>; David Miller <davem@davemloft.net>; Ying Xue
> <ying.xue@windriver.com>; netdev <netdev@vger.kernel.org>; tipc-
> discussion@lists.sourceforge.net; LKML <linux-kernel@vger.kernel.org
> Subject: net/tipc: recursive locking in tipc_link_reset
> 
> Hi,
> 
> I am getting the following error while booting the latest kernel on
> bb2d8f2f61047cbde08b78ec03e4ebdb01ee5434 (Oct 10). Config is attached.
> 
> Since this happens during boot, this makes LOCKDEP completely unusable,
> does not allow to discover any other locking issues and masks all new bugs
> being introduced into kernel.
> Please fix asap.
> Thanks
> 
> 
> WARNING: possible recursive locking detected 4.19.0-rc7+ #14 Not tainted
> --------------------------------------------
> swapper/0/1 is trying to acquire lock:
> 00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
> include/linux/spinlock.h:334 [inline]
> 00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
> 
> but task is already holding lock:
> 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
> include/linux/spinlock.h:334 [inline]
> 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>        CPU0
>        ----
>   lock(&(&list->lock)->rlock#4);
>   lock(&(&list->lock)->rlock#4);
> 
>  *** DEADLOCK ***
> 
>  May be due to missing lock nesting notation
> 
> 2 locks held by swapper/0/1:
>  #0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
> register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
>  #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> spin_lock_bh include/linux/spinlock.h:334 [inline]
>  #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
> tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
> 
> stack backtrace:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14 Hardware name:
> QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1af/0x295 lib/dump_stack.c:113  print_deadlock_bug
> kernel/locking/lockdep.c:1759 [inline]  check_deadlock
> kernel/locking/lockdep.c:1803 [inline]  validate_chain
> kernel/locking/lockdep.c:2399 [inline]
>  __lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
>  lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
> __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
>  _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168  spin_lock_bh
> include/linux/spinlock.h:334 [inline]
>  tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
>  tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
>  tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
>  tipc_init_net+0x472/0x610 net/tipc/core.c:82
>  ops_init+0xf7/0x520 net/core/net_namespace.c:129
> __register_pernet_operations net/core/net_namespace.c:940 [inline]
>  register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
>  register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
>  tipc_init+0x83/0x104 net/tipc/core.c:140  do_one_initcall+0x109/0x70a
> init/main.c:885  do_initcall_level init/main.c:953 [inline]  do_initcalls
> init/main.c:961 [inline]  do_basic_setup init/main.c:979 [inline]
> kernel_init_freeable+0x4bd/0x57f init/main.c:1144
>  kernel_init+0x13/0x180 init/main.c:1063
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413

^ permalink raw reply

* Re: [PATCH] virtio_net: enable tx after resuming from suspend
From: ake @ 2018-10-11 10:22 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, David S. Miller, virtualization, netdev,
	linux-kernel
In-Reply-To: <dd530053-efab-b661-3423-1faaab8fb12f@redhat.com>



On 2018年10月11日 18:44, Jason Wang wrote:
> 
> 
> On 2018年10月11日 15:51, Ake Koomsin wrote:
>> commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
>> disabled the virtio tx before going to suspend to avoid a use after free.
>> However, after resuming, it causes the virtio_net device to lose its
>> network connectivity.
>>
>> To solve the issue, we need to enable tx after resuming.
>>
>> Fixes commit 713a98d90c5e ("virtio-net: serialize tx routine during
>> reset")
>> Signed-off-by: Ake Koomsin <ake@igel.co.jp>
>> ---
>>   drivers/net/virtio_net.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index dab504ec5e50..3453d80f5f81 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -2256,6 +2256,7 @@ static int virtnet_restore_up(struct
>> virtio_device *vdev)
>>       }
>>         netif_device_attach(vi->dev);
>> +    netif_start_queue(vi->dev);
> 
> I believe this is duplicated with netif_tx_wake_all_queues() in
> netif_device_attach() above?

Thank you for your review.

If both netif_tx_wake_all_queues() and netif_start_queue() result in
clearing __QUEUE_STATE_DRV_XOFF, then is it possible that some
conditions in netif_device_attach() is not satisfied? Without
netif_start_queue(), the virtio_net device does not resume properly
after waking up.

Is it better to report this as a bug first? If I am to do more
investigation, what areas should I look into?


Best Regards
Ake Koomsin

^ permalink raw reply

* [PATCH] iwlwifi: fix spelling mistake "registrating" -> "registering"
From: Colin King @ 2018-10-11  9:57 UTC (permalink / raw)
  To: Johannes Berg, Emmanuel Grumbach, Luca Coelho,
	Intel Linux Wireless, Kalle Valo, David S . Miller,
	linux-wireless, netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Trivial fix to spelling mistake in IWL_ERR error message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/wireless/intel/iwlwifi/dvm/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/main.c b/drivers/net/wireless/intel/iwlwifi/dvm/main.c
index 1088ff036e13..a5041d92babc 100644
--- a/drivers/net/wireless/intel/iwlwifi/dvm/main.c
+++ b/drivers/net/wireless/intel/iwlwifi/dvm/main.c
@@ -1054,7 +1054,7 @@ static void iwl_bg_restart(struct work_struct *data)
 			ieee80211_restart_hw(priv->hw);
 		else
 			IWL_ERR(priv,
-				"Cannot request restart before registrating with mac80211\n");
+				"Cannot request restart before registering with mac80211\n");
 	} else {
 		WARN_ON(1);
 	}
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH][next] ath10k: fix out of bound read on array ath10k_rates
From: Kalle Valo @ 2018-10-11  9:49 UTC (permalink / raw)
  To: Colin King
  Cc: linux-wireless, Sriram R, netdev, kernel-janitors, linux-kernel,
	ath10k, David S . Miller
In-Reply-To: <20181005215609.13935-1-colin.king@canonical.com>

Colin King <colin.king@canonical.com> wrote:

> From: Colin Ian King <colin.king@canonical.com>
> 
> An out-of-bounds read on array ath10k_rates is occurring because
> the maximum number of elements is currently based on the size of
> the array and not the number of elements in the array. Fix this
> by using ARRAY_SIZE instead of sizeof.
> 
> Detected by CoverityScan, CID#1473918 ("Out-of-bounds read")
> 
> Fixes: f279294e9ee2 ("ath10k: add support for configuring management packet rate")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

This is already fixed in ath-next.

error: patch failed: drivers/net/wireless/ath/ath10k/mac.c:164
error: drivers/net/wireless/ath/ath10k/mac.c: patch does not apply
stg import: Diff does not apply cleanly

Patch set to Rejected.

-- 
https://patchwork.kernel.org/patch/10628815/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH][next] ath10k: fix out of bound read on array ath10k_rates
From: Kalle Valo @ 2018-10-11  9:49 UTC (permalink / raw)
  To: Colin King
  Cc: Sriram R, David S . Miller, ath10k, linux-wireless, netdev,
	kernel-janitors, linux-kernel
In-Reply-To: <20181005215609.13935-1-colin.king@canonical.com>

Colin King <colin.king@canonical.com> wrote:

> From: Colin Ian King <colin.king@canonical.com>
> 
> An out-of-bounds read on array ath10k_rates is occurring because
> the maximum number of elements is currently based on the size of
> the array and not the number of elements in the array. Fix this
> by using ARRAY_SIZE instead of sizeof.
> 
> Detected by CoverityScan, CID#1473918 ("Out-of-bounds read")
> 
> Fixes: f279294e9ee2 ("ath10k: add support for configuring management packet rate")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

This is already fixed in ath-next.

error: patch failed: drivers/net/wireless/ath/ath10k/mac.c:164
error: drivers/net/wireless/ath/ath10k/mac.c: patch does not apply
stg import: Diff does not apply cleanly

Patch set to Rejected.

-- 
https://patchwork.kernel.org/patch/10628815/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* A mail kvóta túllépte a korlátot
From: Escuela 398, Justo Daract @ 2018-10-11  2:20 UTC (permalink / raw)




Tisztelt felhasználó,

Postaládája túllépte a szokásos 100 MB tárolási korlátot, nem lesz képes fogadni vagy küldeni e-mailt, amíg meg nem növeli a postaláda kvótáját. A levél kvótájának növeléséhez kattintson az alábbi hivatkozásra, és töltse ki a szükséges adatokat a postaláda-kvóta növeléséhez.

https://szolgltats.yolasite.com/

24 óra elteltével válasz nélkül a postaláda átmenetileg inaktívvá válik.

Kattintson ide: https://szolgltats.yolasite.com/

Köszönjük, hogy használja a webmail,
Copyright © 2018 ügyfélszolgálati szolgálat,
webmail adminisztrátor.

^ permalink raw reply

* Re: [PATCH] virtio_net: enable tx after resuming from suspend
From: Jason Wang @ 2018-10-11  9:44 UTC (permalink / raw)
  To: Ake Koomsin
  Cc: Michael S. Tsirkin, David S. Miller, virtualization, netdev,
	linux-kernel
In-Reply-To: <20181011075127.2608-1-ake@igel.co.jp>



On 2018年10月11日 15:51, Ake Koomsin wrote:
> commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
> disabled the virtio tx before going to suspend to avoid a use after free.
> However, after resuming, it causes the virtio_net device to lose its
> network connectivity.
>
> To solve the issue, we need to enable tx after resuming.
>
> Fixes commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
> Signed-off-by: Ake Koomsin <ake@igel.co.jp>
> ---
>   drivers/net/virtio_net.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index dab504ec5e50..3453d80f5f81 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2256,6 +2256,7 @@ static int virtnet_restore_up(struct virtio_device *vdev)
>   	}
>   
>   	netif_device_attach(vi->dev);
> +	netif_start_queue(vi->dev);

I believe this is duplicated with netif_tx_wake_all_queues() in 
netif_device_attach() above?

Thanks

>   	return err;
>   }
>   

^ permalink raw reply

* [iproute2-next] tipc: support interface name when activating UDP bearer
From: Hoang Le @ 2018-10-11  2:07 UTC (permalink / raw)
  To: jon.maloy, maloy, ying.xue, netdev, tipc-discussion

Support for indicating interface name has an ip address in parallel
with specifying ip address when activating UDP bearer.
This liberates the user from keeping track of the current ip address
for each device.

Old command syntax:
$tipc bearer enable media udp name NAME localip IP

New command syntax:
$tipc bearer enable media udp name NAME [localip IP|dev DEVICE]

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
---
 tipc/bearer.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 60 insertions(+), 5 deletions(-)

diff --git a/tipc/bearer.c b/tipc/bearer.c
index 05dc84aa8ded..118a664370e7 100644
--- a/tipc/bearer.c
+++ b/tipc/bearer.c
@@ -19,6 +19,8 @@
 #include <linux/tipc_netlink.h>
 #include <linux/tipc.h>
 #include <linux/genetlink.h>
+#include <linux/if.h>
+#include <sys/ioctl.h>
 
 #include <libmnl/libmnl.h>
 #include <sys/socket.h>
@@ -68,7 +70,7 @@ static void cmd_bearer_enable_l2_help(struct cmdl *cmdl, char *media)
 static void cmd_bearer_enable_udp_help(struct cmdl *cmdl, char *media)
 {
 	fprintf(stderr,
-		"Usage: %s bearer enable [OPTIONS] media %s name NAME localip IP [UDP OPTIONS]\n\n",
+		"Usage: %s bearer enable [OPTIONS] media %s name NAME [localip IP|device DEVICE] [UDP OPTIONS]\n\n",
 		cmdl->argv[0], media);
 	fprintf(stderr,
 		"OPTIONS\n"
@@ -121,6 +123,47 @@ static int generate_multicast(short af, char *buf, int bufsize)
 	return 0;
 }
 
+static int cmd_bearer_validate_and_get_addr(const char *name, char *straddr)
+{
+	struct ifreq ifc;
+	struct sockaddr_in *ip4addr;
+	struct sockaddr_in6 *ip6addr;
+	int fd = 0;
+
+	if (!name || !straddr)
+		return 0;
+
+	fd = socket(PF_INET, SOCK_DGRAM, 0);
+	if (fd <= 0) {
+		fprintf(stderr, "Failed to create socket\n");
+		return 0;
+	}
+
+	ifc.ifr_name[0] = 0;
+	memcpy(ifc.ifr_name, name, strlen(name));
+	ifc.ifr_name[strlen(name)] = 0;
+
+	if (ioctl(fd, SIOCGIFADDR, &ifc) < 0) {
+		fprintf(stderr, "ioctl failed: %s\n", strerror(errno));
+		close(fd);
+		return 0;
+	}
+
+	ip4addr = (struct sockaddr_in *)&ifc.ifr_addr;
+	if (inet_ntop(AF_INET, &ip4addr->sin_addr, straddr,
+		      INET_ADDRSTRLEN) == NULL)	{
+		ip6addr = (struct sockaddr_in6 *)&ifc.ifr_addr;
+		if (inet_ntop(AF_INET6, &ip6addr->sin6_addr, straddr,
+			      INET6_ADDRSTRLEN) == NULL) {
+			fprintf(stderr, "UDP local address error\n");
+			close(fd);
+			return -EINVAL;
+		}
+	}
+	close(fd);
+	return 1;
+}
+
 static int nl_add_udp_enable_opts(struct nlmsghdr *nlh, struct opt *opts,
 				  struct cmdl *cmdl)
 {
@@ -138,13 +181,25 @@ static int nl_add_udp_enable_opts(struct nlmsghdr *nlh, struct opt *opts,
 		.ai_family = AF_UNSPEC,
 		.ai_socktype = SOCK_DGRAM
 	};
+	char addr[INET6_ADDRSTRLEN] = {0};
 
-	if (!(opt = get_opt(opts, "localip"))) {
-		fprintf(stderr, "error, udp bearer localip missing\n");
-		cmd_bearer_enable_udp_help(cmdl, "udp");
+	opt = get_opt(opts, "device");
+	if (opt && !cmd_bearer_validate_and_get_addr(opt->val, addr)) {
+		fprintf(stderr, "error, no device name available\n");
 		return -EINVAL;
 	}
-	locip = opt->val;
+
+	if (strlen(addr) > 0) {
+		locip = addr;
+	} else {
+		opt = get_opt(opts, "localip");
+		if (!opt) {
+			fprintf(stderr, "error, udp bearer localip/device missing\n");
+			cmd_bearer_enable_udp_help(cmdl, "udp");
+			return -EINVAL;
+		}
+		locip = opt->val;
+	}
 
 	if ((opt = get_opt(opts, "remoteip")))
 		remip = opt->val;
-- 
2.17.1

^ permalink raw reply related

* [net 3/3] net/mlx5: WQ, fixes for fragmented WQ buffers API
From: Saeed Mahameed @ 2018-10-11  1:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20181011013244.29554-1-saeedm@mellanox.com>

From: Tariq Toukan <tariqt@mellanox.com>

mlx5e netdevice used to calculate fragment edges by a call to
mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct
indication for queues smaller than a PAGE_SIZE, (broken by default on
PowerPC, where PAGE_SIZE == 64KB).  Here it is replaced by the correct new
calls/API.

Since (TX/RX) Work Queues buffers are fragmented, here we introduce
changes to the API in core driver, so that it gets a stride index and
returns the index of last stride on same fragment, and an additional
wrapping function that returns the number of physically contiguous
strides that can be written contiguously to the work queue.

This obsoletes the following API functions, and their buggy
usage in EN driver:
* mlx5_wq_cyc_get_frag_size()
* mlx5_wq_cyc_ctr2fragix()

The new API improves modularity and hides the details of such
calculation for mlx5e netdevice and mlx5_ib rdma drivers.

New calculation is also more efficient, and improves performance
as follows:

Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size.

Before: 16,477,619 pps
After:  17,085,793 pps

3.7% improvement

Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 12 +++++-----
 .../net/ethernet/mellanox/mlx5/core/en_tx.c   | 22 +++++++++----------
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib.h |  5 ++---
 drivers/net/ethernet/mellanox/mlx5/core/wq.c  |  5 -----
 drivers/net/ethernet/mellanox/mlx5/core/wq.h  | 11 +++++-----
 include/linux/mlx5/driver.h                   |  8 +++++++
 6 files changed, 31 insertions(+), 32 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 15d8ae28c040..00172dee5339 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -432,10 +432,9 @@ static inline u16 mlx5e_icosq_wrap_cnt(struct mlx5e_icosq *sq)
 
 static inline void mlx5e_fill_icosq_frag_edge(struct mlx5e_icosq *sq,
 					      struct mlx5_wq_cyc *wq,
-					      u16 pi, u16 frag_pi)
+					      u16 pi, u16 nnops)
 {
 	struct mlx5e_sq_wqe_info *edge_wi, *wi = &sq->db.ico_wqe[pi];
-	u8 nnops = mlx5_wq_cyc_get_frag_size(wq) - frag_pi;
 
 	edge_wi = wi + nnops;
 
@@ -454,15 +453,14 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 	struct mlx5_wq_cyc *wq = &sq->wq;
 	struct mlx5e_umr_wqe *umr_wqe;
 	u16 xlt_offset = ix << (MLX5E_LOG_ALIGNED_MPWQE_PPW - 1);
-	u16 pi, frag_pi;
+	u16 pi, contig_wqebbs_room;
 	int err;
 	int i;
 
 	pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
-	frag_pi = mlx5_wq_cyc_ctr2fragix(wq, sq->pc);
-
-	if (unlikely(frag_pi + MLX5E_UMR_WQEBBS > mlx5_wq_cyc_get_frag_size(wq))) {
-		mlx5e_fill_icosq_frag_edge(sq, wq, pi, frag_pi);
+	contig_wqebbs_room = mlx5_wq_cyc_get_contig_wqebbs(wq, pi);
+	if (unlikely(contig_wqebbs_room < MLX5E_UMR_WQEBBS)) {
+		mlx5e_fill_icosq_frag_edge(sq, wq, pi, contig_wqebbs_room);
 		pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index ae73ea992845..6dacaeba2fbf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -290,10 +290,9 @@ mlx5e_txwqe_build_dsegs(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 
 static inline void mlx5e_fill_sq_frag_edge(struct mlx5e_txqsq *sq,
 					   struct mlx5_wq_cyc *wq,
-					   u16 pi, u16 frag_pi)
+					   u16 pi, u16 nnops)
 {
 	struct mlx5e_tx_wqe_info *edge_wi, *wi = &sq->db.wqe_info[pi];
-	u8 nnops = mlx5_wq_cyc_get_frag_size(wq) - frag_pi;
 
 	edge_wi = wi + nnops;
 
@@ -348,8 +347,8 @@ netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 	struct mlx5e_tx_wqe_info *wi;
 
 	struct mlx5e_sq_stats *stats = sq->stats;
+	u16 headlen, ihs, contig_wqebbs_room;
 	u16 ds_cnt, ds_cnt_inl = 0;
-	u16 headlen, ihs, frag_pi;
 	u8 num_wqebbs, opcode;
 	u32 num_bytes;
 	int num_dma;
@@ -386,9 +385,9 @@ netdev_tx_t mlx5e_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 	}
 
 	num_wqebbs = DIV_ROUND_UP(ds_cnt, MLX5_SEND_WQEBB_NUM_DS);
-	frag_pi = mlx5_wq_cyc_ctr2fragix(wq, sq->pc);
-	if (unlikely(frag_pi + num_wqebbs > mlx5_wq_cyc_get_frag_size(wq))) {
-		mlx5e_fill_sq_frag_edge(sq, wq, pi, frag_pi);
+	contig_wqebbs_room = mlx5_wq_cyc_get_contig_wqebbs(wq, pi);
+	if (unlikely(contig_wqebbs_room < num_wqebbs)) {
+		mlx5e_fill_sq_frag_edge(sq, wq, pi, contig_wqebbs_room);
 		mlx5e_sq_fetch_wqe(sq, &wqe, &pi);
 	}
 
@@ -636,7 +635,7 @@ netdev_tx_t mlx5i_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 	struct mlx5e_tx_wqe_info *wi;
 
 	struct mlx5e_sq_stats *stats = sq->stats;
-	u16 headlen, ihs, pi, frag_pi;
+	u16 headlen, ihs, pi, contig_wqebbs_room;
 	u16 ds_cnt, ds_cnt_inl = 0;
 	u8 num_wqebbs, opcode;
 	u32 num_bytes;
@@ -672,13 +671,14 @@ netdev_tx_t mlx5i_sq_xmit(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 	}
 
 	num_wqebbs = DIV_ROUND_UP(ds_cnt, MLX5_SEND_WQEBB_NUM_DS);
-	frag_pi = mlx5_wq_cyc_ctr2fragix(wq, sq->pc);
-	if (unlikely(frag_pi + num_wqebbs > mlx5_wq_cyc_get_frag_size(wq))) {
+	pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
+	contig_wqebbs_room = mlx5_wq_cyc_get_contig_wqebbs(wq, pi);
+	if (unlikely(contig_wqebbs_room < num_wqebbs)) {
+		mlx5e_fill_sq_frag_edge(sq, wq, pi, contig_wqebbs_room);
 		pi = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
-		mlx5e_fill_sq_frag_edge(sq, wq, pi, frag_pi);
 	}
 
-	mlx5i_sq_fetch_wqe(sq, &wqe, &pi);
+	mlx5i_sq_fetch_wqe(sq, &wqe, pi);
 
 	/* fill wqe */
 	wi       = &sq->db.wqe_info[pi];
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
index 08eac92fc26c..0982c579ec74 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
@@ -109,12 +109,11 @@ struct mlx5i_tx_wqe {
 
 static inline void mlx5i_sq_fetch_wqe(struct mlx5e_txqsq *sq,
 				      struct mlx5i_tx_wqe **wqe,
-				      u16 *pi)
+				      u16 pi)
 {
 	struct mlx5_wq_cyc *wq = &sq->wq;
 
-	*pi  = mlx5_wq_cyc_ctr2ix(wq, sq->pc);
-	*wqe = mlx5_wq_cyc_get_wqe(wq, *pi);
+	*wqe = mlx5_wq_cyc_get_wqe(wq, pi);
 	memset(*wqe, 0, sizeof(**wqe));
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.c b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
index 68e7f8df2a6d..ddca327e8950 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
@@ -39,11 +39,6 @@ u32 mlx5_wq_cyc_get_size(struct mlx5_wq_cyc *wq)
 	return (u32)wq->fbc.sz_m1 + 1;
 }
 
-u16 mlx5_wq_cyc_get_frag_size(struct mlx5_wq_cyc *wq)
-{
-	return wq->fbc.frag_sz_m1 + 1;
-}
-
 u32 mlx5_cqwq_get_size(struct mlx5_cqwq *wq)
 {
 	return wq->fbc.sz_m1 + 1;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.h b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
index 3a1a170bb2d7..b1293d153a58 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
@@ -80,7 +80,6 @@ int mlx5_wq_cyc_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
 		       void *wqc, struct mlx5_wq_cyc *wq,
 		       struct mlx5_wq_ctrl *wq_ctrl);
 u32 mlx5_wq_cyc_get_size(struct mlx5_wq_cyc *wq);
-u16 mlx5_wq_cyc_get_frag_size(struct mlx5_wq_cyc *wq);
 
 int mlx5_wq_qp_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
 		      void *qpc, struct mlx5_wq_qp *wq,
@@ -140,11 +139,6 @@ static inline u16 mlx5_wq_cyc_ctr2ix(struct mlx5_wq_cyc *wq, u16 ctr)
 	return ctr & wq->fbc.sz_m1;
 }
 
-static inline u16 mlx5_wq_cyc_ctr2fragix(struct mlx5_wq_cyc *wq, u16 ctr)
-{
-	return ctr & wq->fbc.frag_sz_m1;
-}
-
 static inline u16 mlx5_wq_cyc_get_head(struct mlx5_wq_cyc *wq)
 {
 	return mlx5_wq_cyc_ctr2ix(wq, wq->wqe_ctr);
@@ -160,6 +154,11 @@ static inline void *mlx5_wq_cyc_get_wqe(struct mlx5_wq_cyc *wq, u16 ix)
 	return mlx5_frag_buf_get_wqe(&wq->fbc, ix);
 }
 
+static inline u16 mlx5_wq_cyc_get_contig_wqebbs(struct mlx5_wq_cyc *wq, u16 ix)
+{
+	return mlx5_frag_buf_get_idx_last_contig_stride(&wq->fbc, ix) - ix + 1;
+}
+
 static inline int mlx5_wq_cyc_cc_bigger(u16 cc1, u16 cc2)
 {
 	int equal   = (cc1 == cc2);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 66d94b4557cf..88a041b73abf 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1032,6 +1032,14 @@ static inline void *mlx5_frag_buf_get_wqe(struct mlx5_frag_buf_ctrl *fbc,
 		((fbc->frag_sz_m1 & ix) << fbc->log_stride);
 }
 
+static inline u32
+mlx5_frag_buf_get_idx_last_contig_stride(struct mlx5_frag_buf_ctrl *fbc, u32 ix)
+{
+	u32 last_frag_stride_idx = (ix + fbc->strides_offset) | fbc->frag_sz_m1;
+
+	return min_t(u32, last_frag_stride_idx - fbc->strides_offset, fbc->sz_m1);
+}
+
 int mlx5_cmd_init(struct mlx5_core_dev *dev);
 void mlx5_cmd_cleanup(struct mlx5_core_dev *dev);
 void mlx5_cmd_use_events(struct mlx5_core_dev *dev);
-- 
2.17.1

^ permalink raw reply related

* [net 2/3] net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type
From: Saeed Mahameed @ 2018-10-11  1:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Huy Nguyen, Eli Cohen, Saeed Mahameed
In-Reply-To: <20181011013244.29554-1-saeedm@mellanox.com>

From: Huy Nguyen <huyn@mellanox.com>

The HW spec defines only bits 24-26 of pftype_wq as the page fault type,
use the required mask to ensure that.

Fixes: d9aaed838765 ("{net,IB}/mlx5: Refactor page fault handling")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 48864f4988a4..c1e1a16a9b07 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -273,7 +273,7 @@ static void eq_pf_process(struct mlx5_eq *eq)
 		case MLX5_PFAULT_SUBTYPE_WQE:
 			/* WQE based event */
 			pfault->type =
-				be32_to_cpu(pf_eqe->wqe.pftype_wq) >> 24;
+				(be32_to_cpu(pf_eqe->wqe.pftype_wq) >> 24) & 0x7;
 			pfault->token =
 				be32_to_cpu(pf_eqe->wqe.token);
 			pfault->wqe.wq_num =
-- 
2.17.1

^ permalink raw reply related

* [pull request][net 0/3] Mellanox, mlx5 fixes 2018-10-10
From: Saeed Mahameed @ 2018-10-11  1:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Saeed Mahameed

Hi Dave,

This pull request includes some fixes to mlx5 driver,
Please pull and let me know if there's any problem.

For -stable v4.11:
('net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type')
For -stable v4.17:
('net/mlx5: Fix memory leak when setting fpga ipsec caps')
For -stable v4.18:
('net/mlx5: WQ, fixes for fragmented WQ buffers API')

Thanks,
Saeed.
---

The following changes since commit 52b5d6f5dcf0e5201392f7d417148ccb537dbf6f:

  net: make skb_partial_csum_set() more robust against overflows (2018-10-10 10:21:31 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2018-10-10

for you to fetch changes up to 37fdffb217a45609edccbb8b407d031143f551c0:

  net/mlx5: WQ, fixes for fragmented WQ buffers API (2018-10-10 18:26:16 -0700)

----------------------------------------------------------------
mlx5-fixes-2018-10-10

----------------------------------------------------------------
Huy Nguyen (1):
      net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type

Talat Batheesh (1):
      net/mlx5: Fix memory leak when setting fpga ipsec caps

Tariq Toukan (1):
      net/mlx5: WQ, fixes for fragmented WQ buffers API

 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 12 +++++-------
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c    | 22 +++++++++++-----------
 drivers/net/ethernet/mellanox/mlx5/core/eq.c       |  2 +-
 .../net/ethernet/mellanox/mlx5/core/fpga/ipsec.c   |  9 ++++-----
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h  |  5 ++---
 drivers/net/ethernet/mellanox/mlx5/core/wq.c       |  5 -----
 drivers/net/ethernet/mellanox/mlx5/core/wq.h       | 11 +++++------
 include/linux/mlx5/driver.h                        |  8 ++++++++
 8 files changed, 36 insertions(+), 38 deletions(-)

^ permalink raw reply

* [net 1/3] net/mlx5: Fix memory leak when setting fpga ipsec caps
From: Saeed Mahameed @ 2018-10-11  1:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Talat Batheesh, Saeed Mahameed
In-Reply-To: <20181011013244.29554-1-saeedm@mellanox.com>

From: Talat Batheesh <talatb@mellanox.com>

Allocated memory for context should be freed once
finished working with it.

Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code")
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c
index 5645a4facad2..b8ee9101c506 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.c
@@ -245,7 +245,7 @@ static void *mlx5_fpga_ipsec_cmd_exec(struct mlx5_core_dev *mdev,
 		return ERR_PTR(res);
 	}
 
-	/* Context will be freed by wait func after completion */
+	/* Context should be freed by the caller after completion. */
 	return context;
 }
 
@@ -418,10 +418,8 @@ static int mlx5_fpga_ipsec_set_caps(struct mlx5_core_dev *mdev, u32 flags)
 	cmd.cmd = htonl(MLX5_FPGA_IPSEC_CMD_OP_SET_CAP);
 	cmd.flags = htonl(flags);
 	context = mlx5_fpga_ipsec_cmd_exec(mdev, &cmd, sizeof(cmd));
-	if (IS_ERR(context)) {
-		err = PTR_ERR(context);
-		goto out;
-	}
+	if (IS_ERR(context))
+		return PTR_ERR(context);
 
 	err = mlx5_fpga_ipsec_cmd_wait(context);
 	if (err)
@@ -435,6 +433,7 @@ static int mlx5_fpga_ipsec_set_caps(struct mlx5_core_dev *mdev, u32 flags)
 	}
 
 out:
+	kfree(context);
 	return err;
 }
 
-- 
2.17.1

^ permalink raw reply related

* [net-next 7/7] net/mlx5e: Do not ignore netdevice TX/RX queues number
From: Saeed Mahameed @ 2018-10-11  1:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jason Gunthorpe, linux-rdma, Feras Daoud, Saeed Mahameed
In-Reply-To: <20181011012444.28194-1-saeedm@mellanox.com>

From: Feras Daoud <ferasda@mellanox.com>

The current design of mlx5e driver ignores the netdevice TX/RX queues
number for netdevices that RDMA IPoIB ULP creates. Instead, the queue
number is initialized to the maximum number that mlx5 thinks best for
performance. As a result, ULP drivers that choose to create a netdevice
with queue number that is less than the maximum channels mlx5 creates,
will get a memory corruption.

This fix changes the mlx5e netdev logic to respect ULP netdevices TX/RX
queue number and use it when creating resources instead of the maximum
channel number.

Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  | 11 ++++++--
 .../ethernet/mellanox/mlx5/core/en_ethtool.c  |  2 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 25 ++++++++++---------
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  |  9 ++++---
 .../ethernet/mellanox/mlx5/core/en_stats.c    |  8 +++---
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib.c |  9 ++++---
 .../mellanox/mlx5/core/ipoib/ipoib_vlan.c     |  1 -
 7 files changed, 37 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 0fbdd7696ec4..d7fbd5b6ac95 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -173,6 +173,7 @@ static inline u16 mlx5_min_rx_wqes(int wq_type, u32 wq_size)
 	}
 }
 
+/* Use this function to get max num channels (rxqs/txqs) only to create netdev */
 static inline int mlx5e_get_max_num_channels(struct mlx5_core_dev *mdev)
 {
 	return is_kdump_kernel() ?
@@ -181,6 +182,13 @@ static inline int mlx5e_get_max_num_channels(struct mlx5_core_dev *mdev)
 		      MLX5E_MAX_NUM_CHANNELS);
 }
 
+/* Use this function to get max num channels after netdev was created */
+static inline int mlx5e_get_netdev_max_channels(struct net_device *netdev)
+{
+	return min_t(unsigned int, netdev->num_rx_queues,
+		     netdev->num_tx_queues);
+}
+
 struct mlx5e_tx_wqe {
 	struct mlx5_wqe_ctrl_seg ctrl;
 	struct mlx5_wqe_eth_seg  eth;
@@ -710,7 +718,6 @@ struct mlx5e_profile {
 	void	(*disable)(struct mlx5e_priv *priv);
 	void	(*update_stats)(struct mlx5e_priv *priv);
 	void	(*update_carrier)(struct mlx5e_priv *priv);
-	int	(*max_nch)(struct mlx5_core_dev *mdev);
 	struct {
 		mlx5e_fp_handle_rx_cqe handle_rx_cqe;
 		mlx5e_fp_handle_rx_cqe handle_rx_cqe_mpwqe;
@@ -970,7 +977,7 @@ int mlx5e_netdev_init(struct net_device *netdev,
 void mlx5e_netdev_cleanup(struct net_device *netdev, struct mlx5e_priv *priv);
 struct net_device*
 mlx5e_create_netdev(struct mlx5_core_dev *mdev, const struct mlx5e_profile *profile,
-		    void *ppriv);
+		    int nch, void *ppriv);
 int mlx5e_attach_netdev(struct mlx5e_priv *priv);
 void mlx5e_detach_netdev(struct mlx5e_priv *priv);
 void mlx5e_destroy_netdev(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index c86fd770c463..c61f6302cde2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -319,7 +319,7 @@ static int mlx5e_set_ringparam(struct net_device *dev,
 void mlx5e_ethtool_get_channels(struct mlx5e_priv *priv,
 				struct ethtool_channels *ch)
 {
-	ch->max_combined   = priv->profile->max_nch(priv->mdev);
+	ch->max_combined   = mlx5e_get_netdev_max_channels(priv->netdev);
 	ch->combined_count = priv->channels.params.num_channels;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index f4b11bdae879..c9848e333450 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1799,7 +1799,7 @@ static int mlx5e_open_sqs(struct mlx5e_channel *c,
 			  struct mlx5e_channel_param *cparam)
 {
 	struct mlx5e_priv *priv = c->priv;
-	int err, tc, max_nch = priv->profile->max_nch(priv->mdev);
+	int err, tc, max_nch = mlx5e_get_netdev_max_channels(priv->netdev);
 
 	for (tc = 0; tc < params->num_tc; tc++) {
 		int txq_ix = c->ix + tc * max_nch;
@@ -2439,7 +2439,7 @@ int mlx5e_create_direct_rqts(struct mlx5e_priv *priv)
 	int err;
 	int ix;
 
-	for (ix = 0; ix < priv->profile->max_nch(priv->mdev); ix++) {
+	for (ix = 0; ix < mlx5e_get_netdev_max_channels(priv->netdev); ix++) {
 		rqt = &priv->direct_tir[ix].rqt;
 		err = mlx5e_create_rqt(priv, 1 /*size */, rqt);
 		if (err)
@@ -2460,7 +2460,7 @@ void mlx5e_destroy_direct_rqts(struct mlx5e_priv *priv)
 {
 	int i;
 
-	for (i = 0; i < priv->profile->max_nch(priv->mdev); i++)
+	for (i = 0; i < mlx5e_get_netdev_max_channels(priv->netdev); i++)
 		mlx5e_destroy_rqt(priv, &priv->direct_tir[i].rqt);
 }
 
@@ -2554,7 +2554,7 @@ static void mlx5e_redirect_rqts(struct mlx5e_priv *priv,
 		mlx5e_redirect_rqt(priv, rqtn, MLX5E_INDIR_RQT_SIZE, rrp);
 	}
 
-	for (ix = 0; ix < priv->profile->max_nch(priv->mdev); ix++) {
+	for (ix = 0; ix < mlx5e_get_netdev_max_channels(priv->netdev); ix++) {
 		struct mlx5e_redirect_rqt_param direct_rrp = {
 			.is_rss = false,
 			{
@@ -2755,7 +2755,7 @@ static int mlx5e_modify_tirs_lro(struct mlx5e_priv *priv)
 			goto free_in;
 	}
 
-	for (ix = 0; ix < priv->profile->max_nch(priv->mdev); ix++) {
+	for (ix = 0; ix < mlx5e_get_netdev_max_channels(priv->netdev); ix++) {
 		err = mlx5_core_modify_tir(mdev, priv->direct_tir[ix].tirn,
 					   in, inlen);
 		if (err)
@@ -2855,7 +2855,7 @@ static void mlx5e_netdev_set_tcs(struct net_device *netdev)
 
 static void mlx5e_build_tc2txq_maps(struct mlx5e_priv *priv)
 {
-	int max_nch = priv->profile->max_nch(priv->mdev);
+	int max_nch = mlx5e_get_netdev_max_channels(priv->netdev);
 	int i, tc;
 
 	for (i = 0; i < max_nch; i++)
@@ -3247,7 +3247,7 @@ int mlx5e_create_indirect_tirs(struct mlx5e_priv *priv, bool inner_ttc)
 
 int mlx5e_create_direct_tirs(struct mlx5e_priv *priv)
 {
-	int nch = priv->profile->max_nch(priv->mdev);
+	int nch = mlx5e_get_netdev_max_channels(priv->netdev);
 	struct mlx5e_tir *tir;
 	void *tirc;
 	int inlen;
@@ -3300,7 +3300,7 @@ void mlx5e_destroy_indirect_tirs(struct mlx5e_priv *priv, bool inner_ttc)
 
 void mlx5e_destroy_direct_tirs(struct mlx5e_priv *priv)
 {
-	int nch = priv->profile->max_nch(priv->mdev);
+	int nch = mlx5e_get_netdev_max_channels(priv->netdev);
 	int i;
 
 	for (i = 0; i < nch; i++)
@@ -4743,7 +4743,7 @@ static int mlx5e_nic_init(struct mlx5_core_dev *mdev,
 		return err;
 
 	mlx5e_build_nic_params(mdev, &priv->channels.params,
-			       profile->max_nch(mdev), netdev->mtu);
+			       mlx5e_get_netdev_max_channels(netdev), netdev->mtu);
 
 	mlx5e_timestamp_init(priv);
 
@@ -4926,7 +4926,6 @@ static const struct mlx5e_profile mlx5e_nic_profile = {
 	.enable		   = mlx5e_nic_enable,
 	.disable	   = mlx5e_nic_disable,
 	.update_stats	   = mlx5e_update_ndo_stats,
-	.max_nch	   = mlx5e_get_max_num_channels,
 	.update_carrier	   = mlx5e_update_carrier,
 	.rx_handlers.handle_rx_cqe       = mlx5e_handle_rx_cqe,
 	.rx_handlers.handle_rx_cqe_mpwqe = mlx5e_handle_rx_cqe_mpwrq,
@@ -4977,9 +4976,9 @@ void mlx5e_netdev_cleanup(struct net_device *netdev, struct mlx5e_priv *priv)
 
 struct net_device *mlx5e_create_netdev(struct mlx5_core_dev *mdev,
 				       const struct mlx5e_profile *profile,
+				       int nch,
 				       void *ppriv)
 {
-	int nch = profile->max_nch(mdev);
 	struct net_device *netdev;
 	int err;
 
@@ -5101,6 +5100,7 @@ static void *mlx5e_add(struct mlx5_core_dev *mdev)
 	void *rpriv = NULL;
 	void *priv;
 	int err;
+	int nch;
 
 	err = mlx5e_check_required_hca_cap(mdev);
 	if (err)
@@ -5116,7 +5116,8 @@ static void *mlx5e_add(struct mlx5_core_dev *mdev)
 	}
 #endif
 
-	netdev = mlx5e_create_netdev(mdev, &mlx5e_nic_profile, rpriv);
+	nch = mlx5e_get_max_num_channels(mdev);
+	netdev = mlx5e_create_netdev(mdev, &mlx5e_nic_profile, nch, rpriv);
 	if (!netdev) {
 		mlx5_core_err(mdev, "mlx5e_create_netdev failed\n");
 		goto err_free_rpriv;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 1846edb01fd6..64c2b9ea8b1e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -1090,7 +1090,8 @@ static int mlx5e_init_rep(struct mlx5_core_dev *mdev,
 		return err;
 
 
-	priv->channels.params.num_channels = profile->max_nch(mdev);
+	priv->channels.params.num_channels =
+				mlx5e_get_netdev_max_channels(netdev);
 
 	mlx5e_build_rep_params(mdev, &priv->channels.params, netdev->mtu);
 	mlx5e_build_rep_netdev(netdev);
@@ -1233,7 +1234,6 @@ static const struct mlx5e_profile mlx5e_rep_profile = {
 	.init_tx		= mlx5e_init_rep_tx,
 	.cleanup_tx		= mlx5e_cleanup_nic_tx,
 	.update_stats           = mlx5e_rep_update_hw_counters,
-	.max_nch		= mlx5e_get_max_num_channels,
 	.update_carrier		= NULL,
 	.rx_handlers.handle_rx_cqe       = mlx5e_handle_rx_cqe_rep,
 	.rx_handlers.handle_rx_cqe_mpwqe = mlx5e_handle_rx_cqe_mpwrq,
@@ -1296,13 +1296,14 @@ mlx5e_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
 	struct mlx5e_rep_priv *rpriv;
 	struct net_device *netdev;
 	struct mlx5e_priv *upriv;
-	int err;
+	int nch, err;
 
 	rpriv = kzalloc(sizeof(*rpriv), GFP_KERNEL);
 	if (!rpriv)
 		return -ENOMEM;
 
-	netdev = mlx5e_create_netdev(dev, &mlx5e_rep_profile, rpriv);
+	nch = mlx5e_get_max_num_channels(dev);
+	netdev = mlx5e_create_netdev(dev, &mlx5e_rep_profile, nch, rpriv);
 	if (!netdev) {
 		pr_warn("Failed to create representor netdev for vport %d\n",
 			rep->vport);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index b7d4896c7c7b..1c006869a642 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -133,7 +133,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 
 	memset(s, 0, sizeof(*s));
 
-	for (i = 0; i < priv->profile->max_nch(priv->mdev); i++) {
+	for (i = 0; i < mlx5e_get_netdev_max_channels(priv->netdev); i++) {
 		struct mlx5e_channel_stats *channel_stats =
 			&priv->channel_stats[i];
 		struct mlx5e_xdpsq_stats *xdpsq_red_stats = &channel_stats->xdpsq;
@@ -1217,7 +1217,7 @@ static const struct counter_desc ch_stats_desc[] = {
 
 static int mlx5e_grp_channels_get_num_stats(struct mlx5e_priv *priv)
 {
-	int max_nch = priv->profile->max_nch(priv->mdev);
+	int max_nch = mlx5e_get_netdev_max_channels(priv->netdev);
 
 	return (NUM_RQ_STATS * max_nch) +
 	       (NUM_CH_STATS * max_nch) +
@@ -1229,7 +1229,7 @@ static int mlx5e_grp_channels_get_num_stats(struct mlx5e_priv *priv)
 static int mlx5e_grp_channels_fill_strings(struct mlx5e_priv *priv, u8 *data,
 					   int idx)
 {
-	int max_nch = priv->profile->max_nch(priv->mdev);
+	int max_nch = mlx5e_get_netdev_max_channels(priv->netdev);
 	int i, j, tc;
 
 	for (i = 0; i < max_nch; i++)
@@ -1264,7 +1264,7 @@ static int mlx5e_grp_channels_fill_strings(struct mlx5e_priv *priv, u8 *data,
 static int mlx5e_grp_channels_fill_stats(struct mlx5e_priv *priv, u64 *data,
 					 int idx)
 {
-	int max_nch = priv->profile->max_nch(priv->mdev);
+	int max_nch = mlx5e_get_netdev_max_channels(priv->netdev);
 	int i, j, tc;
 
 	for (i = 0; i < max_nch; i++)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index ed28602c98f6..b59953daf8b4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -88,7 +88,8 @@ int mlx5i_init(struct mlx5_core_dev *mdev,
 	netdev->mtu = max_mtu;
 
 	mlx5e_build_nic_params(mdev, &priv->channels.params,
-			       profile->max_nch(mdev), netdev->mtu);
+			       mlx5e_get_netdev_max_channels(netdev),
+			       netdev->mtu);
 	mlx5i_build_nic_params(mdev, &priv->channels.params);
 
 	mlx5e_timestamp_init(priv);
@@ -117,10 +118,11 @@ void mlx5i_cleanup(struct mlx5e_priv *priv)
 
 static void mlx5i_grp_sw_update_stats(struct mlx5e_priv *priv)
 {
+	int max_nch = mlx5e_get_netdev_max_channels(priv->netdev);
 	struct mlx5e_sw_stats s = { 0 };
 	int i, j;
 
-	for (i = 0; i < priv->profile->max_nch(priv->mdev); i++) {
+	for (i = 0; i < max_nch; i++) {
 		struct mlx5e_channel_stats *channel_stats;
 		struct mlx5e_rq_stats *rq_stats;
 
@@ -417,7 +419,6 @@ static const struct mlx5e_profile mlx5i_nic_profile = {
 	.enable		   = NULL, /* mlx5i_enable */
 	.disable	   = NULL, /* mlx5i_disable */
 	.update_stats	   = NULL, /* mlx5i_update_stats */
-	.max_nch	   = mlx5e_get_max_num_channels,
 	.update_carrier    = NULL, /* no HW update in IB link */
 	.rx_handlers.handle_rx_cqe       = mlx5i_handle_rx_cqe,
 	.rx_handlers.handle_rx_cqe_mpwqe = NULL, /* Not supported */
@@ -729,7 +730,7 @@ int mlx5_rdma_rn_get_params(struct mlx5_core_dev *mdev,
 	if (rc)
 		return rc;
 
-	nch = mlx5_get_profile(mdev)->max_nch(mdev);
+	nch = mlx5e_get_max_num_channels(mdev);
 
 	*params = (struct rdma_netdev_alloc_params){
 		.sizeof_priv = sizeof(struct mlx5i_priv) +
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
index 3dbfa532fda1..b491b8f5fd6b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
@@ -351,7 +351,6 @@ static const struct mlx5e_profile mlx5i_pkey_nic_profile = {
 	.enable		   = NULL,
 	.disable	   = NULL,
 	.update_stats	   = NULL,
-	.max_nch	   = mlx5e_get_max_num_channels,
 	.rx_handlers.handle_rx_cqe       = mlx5i_handle_rx_cqe,
 	.rx_handlers.handle_rx_cqe_mpwqe = NULL, /* Not supported */
 	.max_tc		   = MLX5I_MAX_NUM_TC,
-- 
2.17.1

^ permalink raw reply related

* [net-next 6/7] net/mlx5e: Use non-delayed work for update stats
From: Saeed Mahameed @ 2018-10-11  1:24 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Jason Gunthorpe, linux-rdma, Saeed Mahameed
In-Reply-To: <20181011012444.28194-1-saeedm@mellanox.com>

Convert mlx5e update stats work to a normal work structure, since it is
never used delayed.

Add a helper function to queue update stats work on demand which checks
for some conditions and reduce code duplication to have a better
abstraction.

Fixes: ed56c5193ad8 ("net/mlx5e: Update NIC HW stats on demand only")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  3 ++-
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 24 ++++++++++++-------
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  |  3 +--
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index f6dd0254d103..0fbdd7696ec4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -673,7 +673,7 @@ struct mlx5e_priv {
 	struct work_struct         update_carrier_work;
 	struct work_struct         set_rx_mode_work;
 	struct work_struct         tx_timeout_work;
-	struct delayed_work        update_stats_work;
+	struct work_struct         update_stats_work;
 
 	struct mlx5_core_dev      *mdev;
 	struct net_device         *netdev;
@@ -927,6 +927,7 @@ void mlx5e_cleanup_nic_tx(struct mlx5e_priv *priv);
 int mlx5e_close(struct net_device *netdev);
 int mlx5e_open(struct net_device *netdev);
 
+void mlx5e_queue_update_stats(struct mlx5e_priv *priv);
 int mlx5e_bits_invert(unsigned long a, int size);
 
 typedef int (*change_hw_mtu_cb)(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index d5a7bd493240..f4b11bdae879 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -274,8 +274,7 @@ static void mlx5e_update_ndo_stats(struct mlx5e_priv *priv)
 
 static void mlx5e_update_stats_work(struct work_struct *work)
 {
-	struct delayed_work *dwork = to_delayed_work(work);
-	struct mlx5e_priv *priv = container_of(dwork, struct mlx5e_priv,
+	struct mlx5e_priv *priv = container_of(work, struct mlx5e_priv,
 					       update_stats_work);
 
 	mutex_lock(&priv->state_lock);
@@ -283,6 +282,17 @@ static void mlx5e_update_stats_work(struct work_struct *work)
 	mutex_unlock(&priv->state_lock);
 }
 
+void mlx5e_queue_update_stats(struct mlx5e_priv *priv)
+{
+	if (!priv->profile->update_stats)
+		return;
+
+	if (unlikely(test_bit(MLX5E_STATE_DESTROYING, &priv->state)))
+		return;
+
+	queue_work(priv->wq, &priv->update_stats_work);
+}
+
 static void mlx5e_async_event(struct mlx5_core_dev *mdev, void *vpriv,
 			      enum mlx5_dev_event event, unsigned long param)
 {
@@ -2957,9 +2967,7 @@ int mlx5e_open_locked(struct net_device *netdev)
 	if (priv->profile->update_carrier)
 		priv->profile->update_carrier(priv);
 
-	if (priv->profile->update_stats)
-		queue_delayed_work(priv->wq, &priv->update_stats_work, 0);
-
+	mlx5e_queue_update_stats(priv);
 	return 0;
 
 err_clear_state_opened_flag:
@@ -3441,7 +3449,7 @@ mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	struct mlx5e_pport_stats *pstats = &priv->stats.pport;
 
 	/* update HW stats in background for next time */
-	queue_delayed_work(priv->wq, &priv->update_stats_work, 0);
+	mlx5e_queue_update_stats(priv);
 
 	if (mlx5e_is_uplink_rep(priv)) {
 		stats->rx_packets = PPORT_802_3_GET(pstats, a_frames_received_ok);
@@ -4946,7 +4954,7 @@ int mlx5e_netdev_init(struct net_device *netdev,
 	INIT_WORK(&priv->update_carrier_work, mlx5e_update_carrier_work);
 	INIT_WORK(&priv->set_rx_mode_work, mlx5e_set_rx_mode_work);
 	INIT_WORK(&priv->tx_timeout_work, mlx5e_tx_timeout_work);
-	INIT_DELAYED_WORK(&priv->update_stats_work, mlx5e_update_stats_work);
+	INIT_WORK(&priv->update_stats_work, mlx5e_update_stats_work);
 
 	priv->wq = create_singlethread_workqueue("mlx5e");
 	if (!priv->wq)
@@ -5037,7 +5045,7 @@ void mlx5e_detach_netdev(struct mlx5e_priv *priv)
 
 	profile->cleanup_rx(priv);
 	profile->cleanup_tx(priv);
-	cancel_delayed_work_sync(&priv->update_stats_work);
+	cancel_work_sync(&priv->update_stats_work);
 }
 
 void mlx5e_destroy_netdev(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 1a3efa87f557..1846edb01fd6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -992,8 +992,7 @@ mlx5e_rep_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	struct mlx5e_priv *priv = netdev_priv(dev);
 
 	/* update HW stats in background for next time */
-	queue_delayed_work(priv->wq, &priv->update_stats_work, 0);
-
+	mlx5e_queue_update_stats(priv);
 	memcpy(stats, &priv->stats.vf_vport, sizeof(*stats));
 }
 
-- 
2.17.1

^ permalink raw reply related

* [net-next 1/7] RDMA/netdev: Hoist alloc_netdev_mqs out of the driver
From: Saeed Mahameed @ 2018-10-11  1:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jason Gunthorpe, linux-rdma, Denis Drozdov,
	Saeed Mahameed
In-Reply-To: <20181011012444.28194-1-saeedm@mellanox.com>

From: Denis Drozdov <denisd@mellanox.com>

netdev has several interfaces that expect to call alloc_netdev_mqs from
the core code, with the driver only providing the arguments.  This is
incompatible with the rdma_netdev interface that returns the netdev
directly.

Thus re-organize the API used by ipoib so that the verbs core code calls
alloc_netdev_mqs for the driver. This is done by allowing the drivers to
provide the allocation parameters via a 'get_params' callback and then
initializing an allocated netdev as a second step.

Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Denis Drozdov <denisd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/infiniband/core/verbs.c               | 32 +++++++
 drivers/infiniband/hw/mlx5/main.c             | 23 ++---
 drivers/infiniband/ulp/ipoib/ipoib_main.c     | 21 ++---
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 87 ++++++++++---------
 include/linux/mlx5/driver.h                   | 14 +--
 include/rdma/ib_verbs.h                       | 23 ++++-
 6 files changed, 119 insertions(+), 81 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 6ee03d6089eb..2f34d3412097 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2621,3 +2621,35 @@ void ib_drain_qp(struct ib_qp *qp)
 		ib_drain_rq(qp);
 }
 EXPORT_SYMBOL(ib_drain_qp);
+
+struct net_device *rdma_alloc_netdev(struct ib_device *device, u8 port_num,
+				     enum rdma_netdev_t type, const char *name,
+				     unsigned char name_assign_type,
+				     void (*setup)(struct net_device *))
+{
+	struct rdma_netdev_alloc_params params;
+	struct net_device *netdev;
+	int rc;
+
+	if (!device->rdma_netdev_get_params)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	rc = device->rdma_netdev_get_params(device, port_num, type, &params);
+	if (rc)
+		return ERR_PTR(rc);
+
+	netdev = alloc_netdev_mqs(params.sizeof_priv, name, name_assign_type,
+				  setup, params.txqs, params.rxqs);
+	if (!netdev)
+		return ERR_PTR(-ENOMEM);
+
+	rc = params.initialize_rdma_netdev(device, port_num, netdev,
+					   params.param);
+	if (rc) {
+		free_netdev(netdev);
+		return ERR_PTR(rc);
+	}
+
+	return netdev;
+}
+EXPORT_SYMBOL(rdma_alloc_netdev);
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index c414f3809e5c..5d9b7f62a0ba 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -5163,22 +5163,14 @@ static int mlx5_ib_get_hw_stats(struct ib_device *ibdev,
 	return num_counters;
 }
 
-static struct net_device*
-mlx5_ib_alloc_rdma_netdev(struct ib_device *hca,
-			  u8 port_num,
-			  enum rdma_netdev_t type,
-			  const char *name,
-			  unsigned char name_assign_type,
-			  void (*setup)(struct net_device *))
+static int mlx5_ib_rn_get_params(struct ib_device *device, u8 port_num,
+				 enum rdma_netdev_t type,
+				 struct rdma_netdev_alloc_params *params)
 {
-	struct net_device *netdev;
-
 	if (type != RDMA_NETDEV_IPOIB)
-		return ERR_PTR(-EOPNOTSUPP);
+		return -EOPNOTSUPP;
 
-	netdev = mlx5_rdma_netdev_alloc(to_mdev(hca)->mdev, hca,
-					name, setup);
-	return netdev;
+	return mlx5_rdma_rn_get_params(to_mdev(device)->mdev, device, params);
 }
 
 static void delay_drop_debugfs_cleanup(struct mlx5_ib_dev *dev)
@@ -5824,8 +5816,9 @@ int mlx5_ib_stage_caps_init(struct mlx5_ib_dev *dev)
 	dev->ib_dev.check_mr_status	= mlx5_ib_check_mr_status;
 	dev->ib_dev.get_dev_fw_str      = get_dev_fw_str;
 	dev->ib_dev.get_vector_affinity	= mlx5_ib_get_vector_affinity;
-	if (MLX5_CAP_GEN(mdev, ipoib_enhanced_offloads))
-		dev->ib_dev.alloc_rdma_netdev	= mlx5_ib_alloc_rdma_netdev;
+	if (MLX5_CAP_GEN(mdev, ipoib_enhanced_offloads) &&
+	    IS_ENABLED(CONFIG_MLX5_CORE_IPOIB))
+		dev->ib_dev.rdma_netdev_get_params = mlx5_ib_rn_get_params;
 
 	if (mlx5_core_is_pf(mdev)) {
 		dev->ib_dev.get_vf_config	= mlx5_ib_get_vf_config;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index e3d28f9ad9c0..9c816cd41724 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -2146,20 +2146,15 @@ static struct net_device *ipoib_get_netdev(struct ib_device *hca, u8 port,
 {
 	struct net_device *dev;
 
-	if (hca->alloc_rdma_netdev) {
-		dev = hca->alloc_rdma_netdev(hca, port,
-					     RDMA_NETDEV_IPOIB, name,
-					     NET_NAME_UNKNOWN,
-					     ipoib_setup_common);
-		if (IS_ERR_OR_NULL(dev) && PTR_ERR(dev) != -EOPNOTSUPP)
-			return NULL;
-	}
-
-	if (!hca->alloc_rdma_netdev || PTR_ERR(dev) == -EOPNOTSUPP)
-		dev = ipoib_create_netdev_default(hca, name, NET_NAME_UNKNOWN,
-						  ipoib_setup_common);
+	dev = rdma_alloc_netdev(hca, port, RDMA_NETDEV_IPOIB, name,
+				NET_NAME_UNKNOWN, ipoib_setup_common);
+	if (!IS_ERR(dev))
+		return dev;
+	if (PTR_ERR(dev) != -EOPNOTSUPP)
+		return NULL;
 
-	return dev;
+	return ipoib_create_netdev_default(hca, name, NET_NAME_UNKNOWN,
+					   ipoib_setup_common);
 }
 
 struct ipoib_dev_priv *ipoib_intf_alloc(struct ib_device *hca, u8 port,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index 299e2a897f7e..af1a95f80404 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -658,53 +658,36 @@ static void mlx5_rdma_netdev_free(struct net_device *netdev)
 	}
 }
 
-struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
-					  struct ib_device *ibdev,
-					  const char *name,
-					  void (*setup)(struct net_device *))
+static bool mlx5_is_sub_interface(struct mlx5_core_dev *mdev)
 {
-	const struct mlx5e_profile *profile;
-	struct net_device *netdev;
+	return mdev->mlx5e_res.pdn != 0;
+}
+
+static const struct mlx5e_profile *mlx5_get_profile(struct mlx5_core_dev *mdev)
+{
+	if (mlx5_is_sub_interface(mdev))
+		return mlx5i_pkey_get_profile();
+	return &mlx5i_nic_profile;
+}
+
+static int mlx5_rdma_setup_rn(struct ib_device *ibdev, u8 port_num,
+			      struct net_device *netdev, void *param)
+{
+	struct mlx5_core_dev *mdev = (struct mlx5_core_dev *)param;
+	const struct mlx5e_profile *prof = mlx5_get_profile(mdev);
 	struct mlx5i_priv *ipriv;
 	struct mlx5e_priv *epriv;
 	struct rdma_netdev *rn;
-	bool sub_interface;
-	int nch;
 	int err;
 
-	if (mlx5i_check_required_hca_cap(mdev)) {
-		mlx5_core_warn(mdev, "Accelerated mode is not supported\n");
-		return ERR_PTR(-EOPNOTSUPP);
-	}
-
-	/* TODO: Need to find a better way to check if child device*/
-	sub_interface = (mdev->mlx5e_res.pdn != 0);
-
-	if (sub_interface)
-		profile = mlx5i_pkey_get_profile();
-	else
-		profile = &mlx5i_nic_profile;
-
-	nch = profile->max_nch(mdev);
-
-	netdev = alloc_netdev_mqs(sizeof(struct mlx5i_priv) + sizeof(struct mlx5e_priv),
-				  name, NET_NAME_UNKNOWN,
-				  setup,
-				  nch * MLX5E_MAX_NUM_TC,
-				  nch);
-	if (!netdev) {
-		mlx5_core_warn(mdev, "alloc_netdev_mqs failed\n");
-		return NULL;
-	}
-
 	ipriv = netdev_priv(netdev);
 	epriv = mlx5i_epriv(netdev);
 
 	epriv->wq = create_singlethread_workqueue("mlx5i");
 	if (!epriv->wq)
-		goto err_free_netdev;
+		return -ENOMEM;
 
-	ipriv->sub_interface = sub_interface;
+	ipriv->sub_interface = mlx5_is_sub_interface(mdev);
 	if (!ipriv->sub_interface) {
 		err = mlx5i_pkey_qpn_ht_init(netdev);
 		if (err) {
@@ -718,7 +701,7 @@ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 			goto destroy_ht;
 	}
 
-	profile->init(mdev, netdev, profile, ipriv);
+	prof->init(mdev, netdev, prof, ipriv);
 
 	mlx5e_attach_netdev(epriv);
 	netif_carrier_off(netdev);
@@ -734,15 +717,37 @@ struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 	netdev->priv_destructor = mlx5_rdma_netdev_free;
 	netdev->needs_free_netdev = 1;
 
-	return netdev;
+	return 0;
 
 destroy_ht:
 	mlx5i_pkey_qpn_ht_cleanup(netdev);
 destroy_wq:
 	destroy_workqueue(epriv->wq);
-err_free_netdev:
-	free_netdev(netdev);
+	return err;
+}
+
+int mlx5_rdma_rn_get_params(struct mlx5_core_dev *mdev,
+			    struct ib_device *device,
+			    struct rdma_netdev_alloc_params *params)
+{
+	int nch;
+	int rc;
+
+	rc = mlx5i_check_required_hca_cap(mdev);
+	if (rc)
+		return rc;
 
-	return NULL;
+	nch = mlx5_get_profile(mdev)->max_nch(mdev);
+
+	*params = (struct rdma_netdev_alloc_params){
+		.sizeof_priv = sizeof(struct mlx5i_priv) +
+			       sizeof(struct mlx5e_priv),
+		.txqs = nch * MLX5E_MAX_NUM_TC,
+		.rxqs = nch,
+		.param = mdev,
+		.initialize_rdma_netdev = mlx5_rdma_setup_rn,
+	};
+
+	return 0;
 }
-EXPORT_SYMBOL(mlx5_rdma_netdev_alloc);
+EXPORT_SYMBOL(mlx5_rdma_rn_get_params);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 26a92462f4ce..4b75796cac23 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1228,21 +1228,15 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
 struct mlx5_uars_page *mlx5_get_uars_page(struct mlx5_core_dev *mdev);
 void mlx5_put_uars_page(struct mlx5_core_dev *mdev, struct mlx5_uars_page *up);
 
-#ifndef CONFIG_MLX5_CORE_IPOIB
-static inline
-struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
-					  struct ib_device *ibdev,
-					  const char *name,
-					  void (*setup)(struct net_device *))
-{
-	return ERR_PTR(-EOPNOTSUPP);
-}
-#else
+#ifdef CONFIG_MLX5_CORE_IPOIB
 struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
 					  struct ib_device *ibdev,
 					  const char *name,
 					  void (*setup)(struct net_device *));
 #endif /* CONFIG_MLX5_CORE_IPOIB */
+int mlx5_rdma_rn_get_params(struct mlx5_core_dev *mdev,
+			    struct ib_device *device,
+			    struct rdma_netdev_alloc_params *params);
 
 struct mlx5_profile {
 	u64	mask;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index e950c2a68f06..020216cee8f1 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2223,6 +2223,16 @@ struct rdma_netdev {
 			    union ib_gid *gid, u16 mlid);
 };
 
+struct rdma_netdev_alloc_params {
+	size_t sizeof_priv;
+	unsigned int txqs;
+	unsigned int rxqs;
+	void *param;
+
+	int (*initialize_rdma_netdev)(struct ib_device *device, u8 port_num,
+				      struct net_device *netdev, void *param);
+};
+
 struct ib_port_pkey_list {
 	/* Lock to hold while modifying the list. */
 	spinlock_t                    list_lock;
@@ -2523,8 +2533,8 @@ struct ib_device {
 	/**
 	 * rdma netdev operation
 	 *
-	 * Driver implementing alloc_rdma_netdev must return -EOPNOTSUPP if it
-	 * doesn't support the specified rdma netdev type.
+	 * Driver implementing alloc_rdma_netdev or rdma_netdev_get_params
+	 * must return -EOPNOTSUPP if it doesn't support the specified type.
 	 */
 	struct net_device *(*alloc_rdma_netdev)(
 					struct ib_device *device,
@@ -2534,6 +2544,10 @@ struct ib_device {
 					unsigned char name_assign_type,
 					void (*setup)(struct net_device *));
 
+	int (*rdma_netdev_get_params)(struct ib_device *device, u8 port_num,
+				      enum rdma_netdev_t type,
+				      struct rdma_netdev_alloc_params *params);
+
 	struct module               *owner;
 	struct device                dev;
 	struct kobject               *ports_parent;
@@ -4179,4 +4193,9 @@ struct ib_ucontext *ib_uverbs_get_ucontext(struct ib_uverbs_file *ufile);
 
 int uverbs_destroy_def_handler(struct ib_uverbs_file *file,
 			       struct uverbs_attr_bundle *attrs);
+
+struct net_device *rdma_alloc_netdev(struct ib_device *device, u8 port_num,
+				     enum rdma_netdev_t type, const char *name,
+				     unsigned char name_assign_type,
+				     void (*setup)(struct net_device *));
 #endif /* IB_VERBS_H */
-- 
2.17.1

^ permalink raw reply related

* [net-next 3/7] net/mlx5e: Gather common netdev init/cleanup functionality in one place
From: Saeed Mahameed @ 2018-10-11  1:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jason Gunthorpe, linux-rdma, Feras Daoud, Saeed Mahameed
In-Reply-To: <20181011012444.28194-1-saeedm@mellanox.com>

From: Feras Daoud <ferasda@mellanox.com>

Introduce a helper init/cleanup function that initializes mlx5e generic
netdev private structure, and use them from all profiles init/cleanup
callbacks.

This patch will also be helpful to initialize/cleanup netdevs that are
not created by mlx5 driver, e.g: accelerated ipoib child netdevs.

Fixes: 26e59d8077a3 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  4 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 53 ++++++++++++-------
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  | 25 ++++++---
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 28 +++++-----
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib.h |  9 ++--
 .../mellanox/mlx5/core/ipoib/ipoib_vlan.c     | 17 +++---
 6 files changed, 87 insertions(+), 49 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 1d743bd5d212..4b2737c613fe 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -698,7 +698,7 @@ struct mlx5e_priv {
 };
 
 struct mlx5e_profile {
-	void	(*init)(struct mlx5_core_dev *mdev,
+	int	(*init)(struct mlx5_core_dev *mdev,
 			struct net_device *netdev,
 			const struct mlx5e_profile *profile, void *ppriv);
 	void	(*cleanup)(struct mlx5e_priv *priv);
@@ -962,6 +962,8 @@ int mlx5e_ethtool_flash_device(struct mlx5e_priv *priv,
 			       struct ethtool_flash *flash);
 
 /* mlx5e generic netdev management API */
+int mlx5e_netdev_init(struct net_device *netdev, struct mlx5e_priv *priv);
+void mlx5e_netdev_cleanup(struct net_device *netdev, struct mlx5e_priv *priv);
 struct net_device*
 mlx5e_create_netdev(struct mlx5_core_dev *mdev, const struct mlx5e_profile *profile,
 		    void *ppriv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index bc034958c846..376197c97e3b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4749,14 +4749,18 @@ void mlx5e_destroy_q_counters(struct mlx5e_priv *priv)
 		mlx5_core_dealloc_q_counter(priv->mdev, priv->drop_rq_q_counter);
 }
 
-static void mlx5e_nic_init(struct mlx5_core_dev *mdev,
-			   struct net_device *netdev,
-			   const struct mlx5e_profile *profile,
-			   void *ppriv)
+static int mlx5e_nic_init(struct mlx5_core_dev *mdev,
+			  struct net_device *netdev,
+			  const struct mlx5e_profile *profile,
+			  void *ppriv)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	int err;
 
+	err = mlx5e_netdev_init(netdev, priv);
+	if (err)
+		return err;
+
 	mlx5e_build_nic_netdev_priv(mdev, netdev, profile, ppriv);
 	err = mlx5e_ipsec_init(priv);
 	if (err)
@@ -4766,12 +4770,15 @@ static void mlx5e_nic_init(struct mlx5_core_dev *mdev,
 		mlx5_core_err(mdev, "TLS initialization failed, %d\n", err);
 	mlx5e_build_nic_netdev(netdev);
 	mlx5e_build_tc2txq_maps(priv);
+
+	return 0;
 }
 
 static void mlx5e_nic_cleanup(struct mlx5e_priv *priv)
 {
 	mlx5e_tls_cleanup(priv);
 	mlx5e_ipsec_cleanup(priv);
+	mlx5e_netdev_cleanup(priv->netdev, priv);
 }
 
 static int mlx5e_init_nic_rx(struct mlx5e_priv *priv)
@@ -4943,13 +4950,30 @@ static const struct mlx5e_profile mlx5e_nic_profile = {
 
 /* mlx5e generic netdev management API (move to en_common.c) */
 
+/* mlx5e_netdev_init/cleanup must be called from profile->init/cleanup callbacks */
+int mlx5e_netdev_init(struct net_device *netdev, struct mlx5e_priv *priv)
+{
+	netif_carrier_off(netdev);
+
+	priv->wq = create_singlethread_workqueue("mlx5e");
+	if (!priv->wq)
+		return -ENOMEM;
+
+	return 0;
+}
+
+void mlx5e_netdev_cleanup(struct net_device *netdev, struct mlx5e_priv *priv)
+{
+	destroy_workqueue(priv->wq);
+}
+
 struct net_device *mlx5e_create_netdev(struct mlx5_core_dev *mdev,
 				       const struct mlx5e_profile *profile,
 				       void *ppriv)
 {
 	int nch = profile->max_nch(mdev);
 	struct net_device *netdev;
-	struct mlx5e_priv *priv;
+	int err;
 
 	netdev = alloc_etherdev_mqs(sizeof(struct mlx5e_priv),
 				    nch * profile->max_tc,
@@ -4963,21 +4987,15 @@ struct net_device *mlx5e_create_netdev(struct mlx5_core_dev *mdev,
 	netdev->rx_cpu_rmap = mdev->rmap;
 #endif
 
-	profile->init(mdev, netdev, profile, ppriv);
-
-	netif_carrier_off(netdev);
-
-	priv = netdev_priv(netdev);
-
-	priv->wq = create_singlethread_workqueue("mlx5e");
-	if (!priv->wq)
-		goto err_cleanup_nic;
+	err = profile->init(mdev, netdev, profile, ppriv);
+	if (err) {
+		mlx5_core_err(mdev, "failed to init mlx5e profile %d\n", err);
+		goto err_free_netdev;
+	}
 
 	return netdev;
 
-err_cleanup_nic:
-	if (profile->cleanup)
-		profile->cleanup(priv);
+err_free_netdev:
 	free_netdev(netdev);
 
 	return NULL;
@@ -5031,7 +5049,6 @@ void mlx5e_destroy_netdev(struct mlx5e_priv *priv)
 	const struct mlx5e_profile *profile = priv->profile;
 	struct net_device *netdev = priv->netdev;
 
-	destroy_workqueue(priv->wq);
 	if (profile->cleanup)
 		profile->cleanup(priv);
 	free_netdev(netdev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 9264c3332aa6..6176e53c8b83 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -1078,28 +1078,40 @@ static void mlx5e_build_rep_netdev(struct net_device *netdev)
 	netdev->max_mtu = MLX5E_HW2SW_MTU(&priv->channels.params, max_mtu);
 }
 
-static void mlx5e_init_rep(struct mlx5_core_dev *mdev,
-			   struct net_device *netdev,
-			   const struct mlx5e_profile *profile,
-			   void *ppriv)
+static int mlx5e_init_rep(struct mlx5_core_dev *mdev,
+			  struct net_device *netdev,
+			  const struct mlx5e_profile *profile,
+			  void *ppriv)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
+	int err;
 
 	priv->mdev                         = mdev;
 	priv->netdev                       = netdev;
 	priv->profile                      = profile;
 	priv->ppriv                        = ppriv;
 
-	mutex_init(&priv->state_lock);
+	err = mlx5e_netdev_init(netdev, priv);
+	if (err)
+		return err;
 
 	INIT_DELAYED_WORK(&priv->update_stats_work, mlx5e_update_stats_work);
 
-	priv->channels.params.num_channels = 1;
+	mutex_init(&priv->state_lock);
+
+	priv->channels.params.num_channels = profile->max_nch(mdev);
 
 	mlx5e_build_rep_params(mdev, &priv->channels.params, netdev->mtu);
 	mlx5e_build_rep_netdev(netdev);
 
 	mlx5e_timestamp_init(priv);
+
+	return 0;
+}
+
+static void mlx5e_cleanup_rep(struct mlx5e_priv *priv)
+{
+	mlx5e_netdev_cleanup(priv->netdev, priv);
 }
 
 static int mlx5e_create_rep_ttc_table(struct mlx5e_priv *priv)
@@ -1224,6 +1236,7 @@ static int mlx5e_init_rep_tx(struct mlx5e_priv *priv)
 
 static const struct mlx5e_profile mlx5e_rep_profile = {
 	.init			= mlx5e_init_rep,
+	.cleanup		= mlx5e_cleanup_rep,
 	.init_rx		= mlx5e_init_rep_rx,
 	.cleanup_rx		= mlx5e_cleanup_rep_rx,
 	.init_tx		= mlx5e_init_rep_tx,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index af1a95f80404..3f544baab9c9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -71,13 +71,14 @@ static void mlx5i_build_nic_params(struct mlx5_core_dev *mdev,
 }
 
 /* Called directly after IPoIB netdevice was created to initialize SW structs */
-void mlx5i_init(struct mlx5_core_dev *mdev,
-		struct net_device *netdev,
-		const struct mlx5e_profile *profile,
-		void *ppriv)
+int mlx5i_init(struct mlx5_core_dev *mdev,
+	       struct net_device *netdev,
+	       const struct mlx5e_profile *profile,
+	       void *ppriv)
 {
 	struct mlx5e_priv *priv  = mlx5i_epriv(netdev);
 	u16 max_mtu;
+	int err;
 
 	/* priv init */
 	priv->mdev        = mdev;
@@ -87,6 +88,10 @@ void mlx5i_init(struct mlx5_core_dev *mdev,
 	priv->max_opened_tc = 1;
 	mutex_init(&priv->state_lock);
 
+	err = mlx5e_netdev_init(netdev, priv);
+	if (err)
+		return err;
+
 	mlx5_query_port_max_mtu(mdev, &max_mtu, 1);
 	netdev->mtu = max_mtu;
 
@@ -108,12 +113,14 @@ void mlx5i_init(struct mlx5_core_dev *mdev,
 
 	netdev->netdev_ops = &mlx5i_netdev_ops;
 	netdev->ethtool_ops = &mlx5i_ethtool_ops;
+
+	return 0;
 }
 
 /* Called directly before IPoIB netdevice is destroyed to cleanup SW structs */
-static void mlx5i_cleanup(struct mlx5e_priv *priv)
+void mlx5i_cleanup(struct mlx5e_priv *priv)
 {
-	/* Do nothing .. */
+	mlx5e_netdev_cleanup(priv->netdev, priv);
 }
 
 static void mlx5i_grp_sw_update_stats(struct mlx5e_priv *priv)
@@ -650,7 +657,6 @@ static void mlx5_rdma_netdev_free(struct net_device *netdev)
 
 	mlx5e_detach_netdev(priv);
 	profile->cleanup(priv);
-	destroy_workqueue(priv->wq);
 
 	if (!ipriv->sub_interface) {
 		mlx5i_pkey_qpn_ht_cleanup(netdev);
@@ -683,16 +689,12 @@ static int mlx5_rdma_setup_rn(struct ib_device *ibdev, u8 port_num,
 	ipriv = netdev_priv(netdev);
 	epriv = mlx5i_epriv(netdev);
 
-	epriv->wq = create_singlethread_workqueue("mlx5i");
-	if (!epriv->wq)
-		return -ENOMEM;
-
 	ipriv->sub_interface = mlx5_is_sub_interface(mdev);
 	if (!ipriv->sub_interface) {
 		err = mlx5i_pkey_qpn_ht_init(netdev);
 		if (err) {
 			mlx5_core_warn(mdev, "allocate qpn_to_netdev ht failed\n");
-			goto destroy_wq;
+			return err;
 		}
 
 		/* This should only be called once per mdev */
@@ -721,8 +723,6 @@ static int mlx5_rdma_setup_rn(struct ib_device *ibdev, u8 port_num,
 
 destroy_ht:
 	mlx5i_pkey_qpn_ht_cleanup(netdev);
-destroy_wq:
-	destroy_workqueue(epriv->wq);
 	return err;
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
index 2e7fb829e1b0..5ef3ef0072b4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.h
@@ -84,10 +84,11 @@ void mlx5i_dev_cleanup(struct net_device *dev);
 int mlx5i_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd);
 
 /* Parent profile functions */
-void mlx5i_init(struct mlx5_core_dev *mdev,
-		struct net_device *netdev,
-		const struct mlx5e_profile *profile,
-		void *ppriv);
+int mlx5i_init(struct mlx5_core_dev *mdev,
+	       struct net_device *netdev,
+	       const struct mlx5e_profile *profile,
+	       void *ppriv);
+void mlx5i_cleanup(struct mlx5e_priv *priv);
 
 /* Get child interface nic profile */
 const struct mlx5e_profile *mlx5i_pkey_get_profile(void);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
index e3e8a5f1ac9b..3dbfa532fda1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.c
@@ -275,14 +275,17 @@ static int mlx5i_pkey_change_mtu(struct net_device *netdev, int new_mtu)
 }
 
 /* Called directly after IPoIB netdevice was created to initialize SW structs */
-static void mlx5i_pkey_init(struct mlx5_core_dev *mdev,
-			     struct net_device *netdev,
-			     const struct mlx5e_profile *profile,
-			     void *ppriv)
+static int mlx5i_pkey_init(struct mlx5_core_dev *mdev,
+			   struct net_device *netdev,
+			   const struct mlx5e_profile *profile,
+			   void *ppriv)
 {
 	struct mlx5e_priv *priv  = mlx5i_epriv(netdev);
+	int err;
 
-	mlx5i_init(mdev, netdev, profile, ppriv);
+	err = mlx5i_init(mdev, netdev, profile, ppriv);
+	if (err)
+		return err;
 
 	/* Override parent ndo */
 	netdev->netdev_ops = &mlx5i_pkey_netdev_ops;
@@ -292,12 +295,14 @@ static void mlx5i_pkey_init(struct mlx5_core_dev *mdev,
 
 	/* Use dummy rqs */
 	priv->channels.params.log_rq_mtu_frames = MLX5E_PARAMS_MINIMUM_LOG_RQ_SIZE;
+
+	return 0;
 }
 
 /* Called directly before IPoIB netdevice is destroyed to cleanup SW structs */
 static void mlx5i_pkey_cleanup(struct mlx5e_priv *priv)
 {
-	/* Do nothing .. */
+	mlx5i_cleanup(priv);
 }
 
 static int mlx5i_pkey_init_tx(struct mlx5e_priv *priv)
-- 
2.17.1

^ permalink raw reply related

* [net-next 5/7] net/mlx5e: Initialize all netdev common structures in one place
From: Saeed Mahameed @ 2018-10-11  1:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jason Gunthorpe, linux-rdma, Saeed Mahameed, Feras Daoud
In-Reply-To: <20181011012444.28194-1-saeedm@mellanox.com>

Move all mlx5e generic structures initializations to mlx5e_netdev_init.
The common structure new initializer function will be used to initialize
mlx5 context for netlink created netdevs such as IPoIB mlx5 accelerated
child netdevs.

Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  6 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 63 +++++++++----------
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  |  8 +--
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 10 +--
 4 files changed, 36 insertions(+), 51 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 59d07a6fe7c5..f6dd0254d103 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -961,7 +961,11 @@ int mlx5e_ethtool_flash_device(struct mlx5e_priv *priv,
 			       struct ethtool_flash *flash);
 
 /* mlx5e generic netdev management API */
-int mlx5e_netdev_init(struct net_device *netdev, struct mlx5e_priv *priv);
+int mlx5e_netdev_init(struct net_device *netdev,
+		      struct mlx5e_priv *priv,
+		      struct mlx5_core_dev *mdev,
+		      const struct mlx5e_profile *profile,
+		      void *ppriv);
 void mlx5e_netdev_cleanup(struct net_device *netdev, struct mlx5e_priv *priv);
 struct net_device*
 mlx5e_create_netdev(struct mlx5_core_dev *mdev, const struct mlx5e_profile *profile,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 9cf5863b7949..d5a7bd493240 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4560,32 +4560,6 @@ void mlx5e_build_nic_params(struct mlx5_core_dev *mdev,
 	mlx5e_build_rss_params(params);
 }
 
-static void mlx5e_build_nic_netdev_priv(struct mlx5_core_dev *mdev,
-					struct net_device *netdev,
-					const struct mlx5e_profile *profile,
-					void *ppriv)
-{
-	struct mlx5e_priv *priv = netdev_priv(netdev);
-
-	priv->mdev        = mdev;
-	priv->netdev      = netdev;
-	priv->profile     = profile;
-	priv->ppriv       = ppriv;
-	priv->msglevel    = MLX5E_MSG_LEVEL;
-	priv->max_opened_tc = 1;
-
-	mlx5e_build_nic_params(mdev, &priv->channels.params,
-			       profile->max_nch(mdev), netdev->mtu);
-
-	mutex_init(&priv->state_lock);
-
-	INIT_WORK(&priv->update_carrier_work, mlx5e_update_carrier_work);
-	INIT_WORK(&priv->set_rx_mode_work, mlx5e_set_rx_mode_work);
-	INIT_WORK(&priv->tx_timeout_work, mlx5e_tx_timeout_work);
-
-	mlx5e_timestamp_init(priv);
-}
-
 static void mlx5e_set_netdev_dev_addr(struct net_device *netdev)
 {
 	struct mlx5e_priv *priv = netdev_priv(netdev);
@@ -4756,11 +4730,15 @@ static int mlx5e_nic_init(struct mlx5_core_dev *mdev,
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	int err;
 
-	err = mlx5e_netdev_init(netdev, priv);
+	err = mlx5e_netdev_init(netdev, priv, mdev, profile, ppriv);
 	if (err)
 		return err;
 
-	mlx5e_build_nic_netdev_priv(mdev, netdev, profile, ppriv);
+	mlx5e_build_nic_params(mdev, &priv->channels.params,
+			       profile->max_nch(mdev), netdev->mtu);
+
+	mlx5e_timestamp_init(priv);
+
 	err = mlx5e_ipsec_init(priv);
 	if (err)
 		mlx5_core_err(mdev, "IPSec initialization failed, %d\n", err);
@@ -4950,16 +4928,37 @@ static const struct mlx5e_profile mlx5e_nic_profile = {
 /* mlx5e generic netdev management API (move to en_common.c) */
 
 /* mlx5e_netdev_init/cleanup must be called from profile->init/cleanup callbacks */
-int mlx5e_netdev_init(struct net_device *netdev, struct mlx5e_priv *priv)
+int mlx5e_netdev_init(struct net_device *netdev,
+		      struct mlx5e_priv *priv,
+		      struct mlx5_core_dev *mdev,
+		      const struct mlx5e_profile *profile,
+		      void *ppriv)
 {
-	netif_carrier_off(netdev);
+	/* priv init */
+	priv->mdev        = mdev;
+	priv->netdev      = netdev;
+	priv->profile     = profile;
+	priv->ppriv       = ppriv;
+	priv->msglevel    = MLX5E_MSG_LEVEL;
+	priv->max_opened_tc = 1;
 
+	mutex_init(&priv->state_lock);
+	INIT_WORK(&priv->update_carrier_work, mlx5e_update_carrier_work);
+	INIT_WORK(&priv->set_rx_mode_work, mlx5e_set_rx_mode_work);
+	INIT_WORK(&priv->tx_timeout_work, mlx5e_tx_timeout_work);
 	INIT_DELAYED_WORK(&priv->update_stats_work, mlx5e_update_stats_work);
 
 	priv->wq = create_singlethread_workqueue("mlx5e");
 	if (!priv->wq)
 		return -ENOMEM;
 
+	/* netdev init */
+	netif_carrier_off(netdev);
+
+#ifdef CONFIG_MLX5_EN_ARFS
+	netdev->rx_cpu_rmap = mdev->rmap;
+#endif
+
 	return 0;
 }
 
@@ -4984,10 +4983,6 @@ struct net_device *mlx5e_create_netdev(struct mlx5_core_dev *mdev,
 		return NULL;
 	}
 
-#ifdef CONFIG_MLX5_EN_ARFS
-	netdev->rx_cpu_rmap = mdev->rmap;
-#endif
-
 	err = profile->init(mdev, netdev, profile, ppriv);
 	if (err) {
 		mlx5_core_err(mdev, "failed to init mlx5e profile %d\n", err);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index bd8caac2e572..1a3efa87f557 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -1086,16 +1086,10 @@ static int mlx5e_init_rep(struct mlx5_core_dev *mdev,
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	int err;
 
-	priv->mdev                         = mdev;
-	priv->netdev                       = netdev;
-	priv->profile                      = profile;
-	priv->ppriv                        = ppriv;
-
-	err = mlx5e_netdev_init(netdev, priv);
+	err = mlx5e_netdev_init(netdev, priv, mdev, profile, ppriv);
 	if (err)
 		return err;
 
-	mutex_init(&priv->state_lock);
 
 	priv->channels.params.num_channels = profile->max_nch(mdev);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index 3f544baab9c9..ed28602c98f6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -80,15 +80,7 @@ int mlx5i_init(struct mlx5_core_dev *mdev,
 	u16 max_mtu;
 	int err;
 
-	/* priv init */
-	priv->mdev        = mdev;
-	priv->netdev      = netdev;
-	priv->profile     = profile;
-	priv->ppriv       = ppriv;
-	priv->max_opened_tc = 1;
-	mutex_init(&priv->state_lock);
-
-	err = mlx5e_netdev_init(netdev, priv);
+	err = mlx5e_netdev_init(netdev, priv, mdev, profile, ppriv);
 	if (err)
 		return err;
 
-- 
2.17.1

^ permalink raw reply related

* [net-next 2/7] RDMA/netdev: Fix netlink support in IPoIB
From: Saeed Mahameed @ 2018-10-11  1:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jason Gunthorpe, linux-rdma, Denis Drozdov, Feras Daoud,
	Saeed Mahameed
In-Reply-To: <20181011012444.28194-1-saeedm@mellanox.com>

From: Denis Drozdov <denisd@mellanox.com>

IPoIB netlink support was broken by the below commit since integrating
the rdma_netdev support relies on an allocation flow for netdevs that
was controlled by the ipoib driver while netdev's rtnl_newlink
implementation assumes that the netdev will be allocated by netlink.
Such situation leads to crash in __ipoib_device_add, once trying to
reuse netlink device.

This patch fixes the kernel oops for both mlx4 and mlx5
devices triggered by the following command:

Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Denis Drozdov <denisd@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/infiniband/core/verbs.c              |  28 +++--
 drivers/infiniband/ulp/ipoib/ipoib.h         |   8 +-
 drivers/infiniband/ulp/ipoib/ipoib_main.c    | 123 +++++++++++--------
 drivers/infiniband/ulp/ipoib/ipoib_netlink.c |  23 +++-
 drivers/infiniband/ulp/ipoib/ipoib_vlan.c    |  19 +--
 include/rdma/ib_verbs.h                      |   7 ++
 6 files changed, 136 insertions(+), 72 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 2f34d3412097..8ec7418e99f0 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2643,13 +2643,27 @@ struct net_device *rdma_alloc_netdev(struct ib_device *device, u8 port_num,
 	if (!netdev)
 		return ERR_PTR(-ENOMEM);
 
-	rc = params.initialize_rdma_netdev(device, port_num, netdev,
-					   params.param);
-	if (rc) {
-		free_netdev(netdev);
-		return ERR_PTR(rc);
-	}
-
 	return netdev;
 }
 EXPORT_SYMBOL(rdma_alloc_netdev);
+
+int rdma_init_netdev(struct ib_device *device, u8 port_num,
+		     enum rdma_netdev_t type, const char *name,
+		     unsigned char name_assign_type,
+		     void (*setup)(struct net_device *),
+		     struct net_device *netdev)
+{
+	struct rdma_netdev_alloc_params params;
+	int rc;
+
+	if (!device->rdma_netdev_get_params)
+		return -EOPNOTSUPP;
+
+	rc = device->rdma_netdev_get_params(device, port_num, type, &params);
+	if (rc)
+		return rc;
+
+	return params.initialize_rdma_netdev(device, port_num,
+					     netdev, params.param);
+}
+EXPORT_SYMBOL(rdma_init_netdev);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 1abe3c62f106..1da119d901a9 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -499,8 +499,10 @@ void ipoib_reap_ah(struct work_struct *work);
 struct ipoib_path *__path_find(struct net_device *dev, void *gid);
 void ipoib_mark_paths_invalid(struct net_device *dev);
 void ipoib_flush_paths(struct net_device *dev);
-struct ipoib_dev_priv *ipoib_intf_alloc(struct ib_device *hca, u8 port,
-					const char *format);
+struct net_device *ipoib_intf_alloc(struct ib_device *hca, u8 port,
+				    const char *format);
+int ipoib_intf_init(struct ib_device *hca, u8 port, const char *format,
+		    struct net_device *dev);
 void ipoib_ib_tx_timer_func(struct timer_list *t);
 void ipoib_ib_dev_flush_light(struct work_struct *work);
 void ipoib_ib_dev_flush_normal(struct work_struct *work);
@@ -531,6 +533,8 @@ int ipoib_dma_map_tx(struct ib_device *ca, struct ipoib_tx_buf *tx_req);
 void ipoib_dma_unmap_tx(struct ipoib_dev_priv *priv,
 			struct ipoib_tx_buf *tx_req);
 
+struct rtnl_link_ops *ipoib_get_link_ops(void);
+
 static inline void ipoib_build_sge(struct ipoib_dev_priv *priv,
 				   struct ipoib_tx_buf *tx_req)
 {
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 9c816cd41724..8baa75a705c5 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -2115,77 +2115,58 @@ static const struct net_device_ops ipoib_netdev_default_pf = {
 	.ndo_stop		 = ipoib_ib_dev_stop_default,
 };
 
-static struct net_device
-*ipoib_create_netdev_default(struct ib_device *hca,
-			     const char *name,
-			     unsigned char name_assign_type,
-			     void (*setup)(struct net_device *))
-{
-	struct net_device *dev;
-	struct rdma_netdev *rn;
-
-	dev = alloc_netdev((int)sizeof(struct rdma_netdev),
-			   name,
-			   name_assign_type, setup);
-	if (!dev)
-		return NULL;
-
-	rn = netdev_priv(dev);
-
-	rn->send = ipoib_send;
-	rn->attach_mcast = ipoib_mcast_attach;
-	rn->detach_mcast = ipoib_mcast_detach;
-	rn->hca = hca;
-	dev->netdev_ops = &ipoib_netdev_default_pf;
-
-	return dev;
-}
-
-static struct net_device *ipoib_get_netdev(struct ib_device *hca, u8 port,
-					   const char *name)
+static struct net_device *ipoib_alloc_netdev(struct ib_device *hca, u8 port,
+					     const char *name)
 {
 	struct net_device *dev;
 
 	dev = rdma_alloc_netdev(hca, port, RDMA_NETDEV_IPOIB, name,
 				NET_NAME_UNKNOWN, ipoib_setup_common);
-	if (!IS_ERR(dev))
+	if (!IS_ERR(dev) || PTR_ERR(dev) != -EOPNOTSUPP)
 		return dev;
-	if (PTR_ERR(dev) != -EOPNOTSUPP)
-		return NULL;
 
-	return ipoib_create_netdev_default(hca, name, NET_NAME_UNKNOWN,
-					   ipoib_setup_common);
+	dev = alloc_netdev(sizeof(struct rdma_netdev), name, NET_NAME_UNKNOWN,
+			   ipoib_setup_common);
+	if (!dev)
+		return ERR_PTR(-ENOMEM);
+	return dev;
 }
 
-struct ipoib_dev_priv *ipoib_intf_alloc(struct ib_device *hca, u8 port,
-					const char *name)
+int ipoib_intf_init(struct ib_device *hca, u8 port, const char *name,
+		    struct net_device *dev)
 {
-	struct net_device *dev;
+	struct rdma_netdev *rn = netdev_priv(dev);
 	struct ipoib_dev_priv *priv;
-	struct rdma_netdev *rn;
+	int rc;
 
 	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
 	if (!priv)
-		return NULL;
+		return -ENOMEM;
 
 	priv->ca = hca;
 	priv->port = port;
 
-	dev = ipoib_get_netdev(hca, port, name);
-	if (!dev)
-		goto free_priv;
+	rc = rdma_init_netdev(hca, port, RDMA_NETDEV_IPOIB, name,
+			      NET_NAME_UNKNOWN, ipoib_setup_common, dev);
+	if (rc) {
+		if (rc != -EOPNOTSUPP)
+			goto out;
+
+		dev->netdev_ops = &ipoib_netdev_default_pf;
+		rn->send = ipoib_send;
+		rn->attach_mcast = ipoib_mcast_attach;
+		rn->detach_mcast = ipoib_mcast_detach;
+		rn->hca = hca;
+	}
 
 	priv->rn_ops = dev->netdev_ops;
 
-	/* fixme : should be after the query_cap */
-	if (priv->hca_caps & IB_DEVICE_VIRTUAL_FUNCTION)
+	if (hca->attrs.device_cap_flags & IB_DEVICE_VIRTUAL_FUNCTION)
 		dev->netdev_ops	= &ipoib_netdev_ops_vf;
 	else
 		dev->netdev_ops	= &ipoib_netdev_ops_pf;
 
-	rn = netdev_priv(dev);
 	rn->clnt_priv = priv;
-
 	/*
 	 * Only the child register_netdev flows can handle priv_destructor
 	 * being set, so we force it to NULL here and handle manually until it
@@ -2196,10 +2177,35 @@ struct ipoib_dev_priv *ipoib_intf_alloc(struct ib_device *hca, u8 port,
 
 	ipoib_build_priv(dev);
 
-	return priv;
-free_priv:
+	return 0;
+
+out:
 	kfree(priv);
-	return NULL;
+	return rc;
+}
+
+struct net_device *ipoib_intf_alloc(struct ib_device *hca, u8 port,
+				    const char *name)
+{
+	struct net_device *dev;
+	int rc;
+
+	dev = ipoib_alloc_netdev(hca, port, name);
+	if (IS_ERR(dev))
+		return dev;
+
+	rc = ipoib_intf_init(hca, port, name, dev);
+	if (rc) {
+		free_netdev(dev);
+		return ERR_PTR(rc);
+	}
+
+	/*
+	 * Upon success the caller must ensure ipoib_intf_free is called or
+	 * register_netdevice succeed'd and priv_destructor is set to
+	 * ipoib_intf_free.
+	 */
+	return dev;
 }
 
 void ipoib_intf_free(struct net_device *dev)
@@ -2382,16 +2388,19 @@ int ipoib_add_pkey_attr(struct net_device *dev)
 static struct net_device *ipoib_add_port(const char *format,
 					 struct ib_device *hca, u8 port)
 {
+	struct rtnl_link_ops *ops = ipoib_get_link_ops();
+	struct rdma_netdev_alloc_params params;
 	struct ipoib_dev_priv *priv;
 	struct net_device *ndev;
 	int result;
 
-	priv = ipoib_intf_alloc(hca, port, format);
-	if (!priv) {
-		pr_warn("%s, %d: ipoib_intf_alloc failed\n", hca->name, port);
-		return ERR_PTR(-ENOMEM);
+	ndev = ipoib_intf_alloc(hca, port, format);
+	if (IS_ERR(ndev)) {
+		pr_warn("%s, %d: ipoib_intf_alloc failed %ld\n", hca->name, port,
+			PTR_ERR(ndev));
+		return ndev;
 	}
-	ndev = priv->dev;
+	priv = ipoib_priv(ndev);
 
 	INIT_IB_EVENT_HANDLER(&priv->event_handler,
 			      priv->ca, ipoib_event);
@@ -2412,6 +2421,14 @@ static struct net_device *ipoib_add_port(const char *format,
 		return ERR_PTR(result);
 	}
 
+	if (hca->rdma_netdev_get_params) {
+		int rc = hca->rdma_netdev_get_params(hca, port,
+						     RDMA_NETDEV_IPOIB,
+						     &params);
+
+		if (!rc && ops->priv_size < params.sizeof_priv)
+			ops->priv_size = params.sizeof_priv;
+	}
 	/*
 	 * We cannot set priv_destructor before register_netdev because we
 	 * need priv to be always valid during the error flow to execute
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
index d4d553a51fa9..38c984d16996 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
@@ -122,12 +122,26 @@ static int ipoib_new_child_link(struct net *src_net, struct net_device *dev,
 	} else
 		child_pkey  = nla_get_u16(data[IFLA_IPOIB_PKEY]);
 
+	err = ipoib_intf_init(ppriv->ca, ppriv->port, dev->name, dev);
+	if (err) {
+		ipoib_warn(ppriv, "failed to initialize pkey device\n");
+		return err;
+	}
+
 	err = __ipoib_vlan_add(ppriv, ipoib_priv(dev),
 			       child_pkey, IPOIB_RTNL_CHILD);
+	if (err)
+		return err;
 
-	if (!err && data)
+	if (data) {
 		err = ipoib_changelink(dev, tb, data, extack);
-	return err;
+		if (err) {
+			unregister_netdevice(dev);
+			return err;
+		}
+	}
+
+	return 0;
 }
 
 static size_t ipoib_get_size(const struct net_device *dev)
@@ -149,6 +163,11 @@ static struct rtnl_link_ops ipoib_link_ops __read_mostly = {
 	.fill_info	= ipoib_fill_info,
 };
 
+struct rtnl_link_ops *ipoib_get_link_ops(void)
+{
+	return &ipoib_link_ops;
+}
+
 int __init ipoib_netlink_init(void)
 {
 	return rtnl_link_register(&ipoib_link_ops);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
index 341753fbda54..8ac8e18fbe0c 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
@@ -85,7 +85,7 @@ static bool is_child_unique(struct ipoib_dev_priv *ppriv,
 
 /*
  * NOTE: If this function fails then the priv->dev will remain valid, however
- * priv can have been freed and must not be touched by caller in the error
+ * priv will have been freed and must not be touched by caller in the error
  * case.
  *
  * If (ndev->reg_state == NETREG_UNINITIALIZED) then it is up to the caller to
@@ -100,6 +100,12 @@ int __ipoib_vlan_add(struct ipoib_dev_priv *ppriv, struct ipoib_dev_priv *priv,
 
 	ASSERT_RTNL();
 
+	/*
+	 * We do not need to touch priv if register_netdevice fails, so just
+	 * always use this flow.
+	 */
+	ndev->priv_destructor = ipoib_intf_free;
+
 	/*
 	 * Racing with unregister of the parent must be prevented by the
 	 * caller.
@@ -120,9 +126,6 @@ int __ipoib_vlan_add(struct ipoib_dev_priv *ppriv, struct ipoib_dev_priv *priv,
 		goto out_early;
 	}
 
-	/* We do not need to touch priv if register_netdevice fails */
-	ndev->priv_destructor = ipoib_intf_free;
-
 	result = register_netdevice(ndev);
 	if (result) {
 		ipoib_warn(priv, "failed to initialize; error %i", result);
@@ -182,12 +185,12 @@ int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey)
 	snprintf(intf_name, sizeof(intf_name), "%s.%04x",
 		 ppriv->dev->name, pkey);
 
-	priv = ipoib_intf_alloc(ppriv->ca, ppriv->port, intf_name);
-	if (!priv) {
-		result = -ENOMEM;
+	ndev = ipoib_intf_alloc(ppriv->ca, ppriv->port, intf_name);
+	if (IS_ERR(ndev)) {
+		result = PTR_ERR(ndev);
 		goto out;
 	}
-	ndev = priv->dev;
+	priv = ipoib_priv(ndev);
 
 	result = __ipoib_vlan_add(ppriv, priv, pkey, IPOIB_LEGACY_CHILD);
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 020216cee8f1..0ed5d913a492 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -4198,4 +4198,11 @@ struct net_device *rdma_alloc_netdev(struct ib_device *device, u8 port_num,
 				     enum rdma_netdev_t type, const char *name,
 				     unsigned char name_assign_type,
 				     void (*setup)(struct net_device *));
+
+int rdma_init_netdev(struct ib_device *device, u8 port_num,
+		     enum rdma_netdev_t type, const char *name,
+		     unsigned char name_assign_type,
+		     void (*setup)(struct net_device *),
+		     struct net_device *netdev);
+
 #endif /* IB_VERBS_H */
-- 
2.17.1

^ permalink raw reply related

* [net-next 4/7] net/mlx5e: Always initialize update stats delayed work
From: Saeed Mahameed @ 2018-10-11  1:24 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Jason Gunthorpe, linux-rdma, Feras Daoud, Saeed Mahameed
In-Reply-To: <20181011012444.28194-1-saeedm@mellanox.com>

From: Feras Daoud <ferasda@mellanox.com>

mlx5e_detach_netdev cancels update_stats work which was not initialized
in ipoib netdevice profile, as a result, the following assert occurs:

ODEBUG: assert_init not available (active state 0) object type:
timer_list hint:(null)

This change moves the update stats work to be initialized for all
mlx5e netdevices.

Fixes: cd565b4b51e5 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      | 1 -
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 5 +++--
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c  | 2 --
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 4b2737c613fe..59d07a6fe7c5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -926,7 +926,6 @@ int mlx5e_create_tises(struct mlx5e_priv *priv);
 void mlx5e_cleanup_nic_tx(struct mlx5e_priv *priv);
 int mlx5e_close(struct net_device *netdev);
 int mlx5e_open(struct net_device *netdev);
-void mlx5e_update_stats_work(struct work_struct *work);
 
 int mlx5e_bits_invert(unsigned long a, int size);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 376197c97e3b..9cf5863b7949 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -272,7 +272,7 @@ static void mlx5e_update_ndo_stats(struct mlx5e_priv *priv)
 			mlx5e_stats_grps[i].update_stats(priv);
 }
 
-void mlx5e_update_stats_work(struct work_struct *work)
+static void mlx5e_update_stats_work(struct work_struct *work)
 {
 	struct delayed_work *dwork = to_delayed_work(work);
 	struct mlx5e_priv *priv = container_of(dwork, struct mlx5e_priv,
@@ -4582,7 +4582,6 @@ static void mlx5e_build_nic_netdev_priv(struct mlx5_core_dev *mdev,
 	INIT_WORK(&priv->update_carrier_work, mlx5e_update_carrier_work);
 	INIT_WORK(&priv->set_rx_mode_work, mlx5e_set_rx_mode_work);
 	INIT_WORK(&priv->tx_timeout_work, mlx5e_tx_timeout_work);
-	INIT_DELAYED_WORK(&priv->update_stats_work, mlx5e_update_stats_work);
 
 	mlx5e_timestamp_init(priv);
 }
@@ -4955,6 +4954,8 @@ int mlx5e_netdev_init(struct net_device *netdev, struct mlx5e_priv *priv)
 {
 	netif_carrier_off(netdev);
 
+	INIT_DELAYED_WORK(&priv->update_stats_work, mlx5e_update_stats_work);
+
 	priv->wq = create_singlethread_workqueue("mlx5e");
 	if (!priv->wq)
 		return -ENOMEM;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 6176e53c8b83..bd8caac2e572 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -1095,8 +1095,6 @@ static int mlx5e_init_rep(struct mlx5_core_dev *mdev,
 	if (err)
 		return err;
 
-	INIT_DELAYED_WORK(&priv->update_stats_work, mlx5e_update_stats_work);
-
 	mutex_init(&priv->state_lock);
 
 	priv->channels.params.num_channels = profile->max_nch(mdev);
-- 
2.17.1

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox