From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
To: Matan Azrad <matan@mellanox.com>
Cc: Gaetan Rivet <gaetan.rivet@6wind.com>, dev@dpdk.org
Subject: Re: [PATCH 3/3] net/mlx5: adjust removal error
Date: Fri, 3 Nov 2017 14:06:05 +0100
Message-ID: <20171103130605.GO24849@6wind.com>
In-Reply-To: <1509637324-13525-4-git-send-email-matan@mellanox.com>

On Thu, Nov 02, 2017 at 03:42:04PM +0000, Matan Azrad wrote:
> Fail-safe PMD expects to get -ENODEV error value if sub PMD control
> command fails because of device removal.
> 
> Make control callbacks return with -ENODEV when the device has
> disappeared.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>

In short, my comments on the mlx4 patch about usage consistency also apply
to mlx5: mlx5_removed() should only be used by the public callbacks from
struct eth_dev_ops.

There's an additional difficulty with this PMD: you need to take into
account the fact that it provides secondary process support
(mlx5_dev_sec_ops). I think secondary processes do not have any IBV context
available for mlx5_removed() to query, so it should resolve to a no-op in
this case. Make sure secondary processes do not crash, whatever happens.
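
To illustrate the secondary process point, here is a minimal standalone
sketch, not actual driver code. struct fake_ctx, struct fake_priv,
fake_query_device() and sketch_removed() are hypothetical stand-ins for the
Verbs context, struct priv, ibv_query_device() and mlx5_removed(). It also
assumes the context pointer is NULL in secondary processes, which needs to
be confirmed against the real code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the Verbs context, not the real ibv_context. */
struct fake_ctx {
	int gone; /* Nonzero once the device has been unplugged. */
};

/* Hypothetical stand-in for struct priv. */
struct fake_priv {
	struct fake_ctx *ctx; /* Assumed to be NULL in secondary processes. */
};

/* Stand-in for ibv_query_device(): 0 on success, EIO once device is gone. */
static int
fake_query_device(const struct fake_ctx *ctx)
{
	return ctx->gone ? EIO : 0;
}

/*
 * Return 1 if the device is known to be removed, 0 otherwise. A missing
 * context means there is nothing to query, so the check is a no-op.
 */
static int
sketch_removed(const struct fake_priv *priv)
{
	assert(priv != NULL);
	if (priv->ctx == NULL)
		return 0;
	return fake_query_device(priv->ctx) == EIO;
}
```

The point is only that the check must bail out before touching the
context, so callbacks shared with secondary processes keep returning their
original error.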

See below for coding style and other issues.

> ---
>  drivers/net/mlx5/mlx5.h        |  1 +
>  drivers/net/mlx5/mlx5_ethdev.c | 39 +++++++++++++++++++++++++++++++++++----
>  drivers/net/mlx5/mlx5_flow.c   |  2 ++
>  drivers/net/mlx5/mlx5_rss.c    |  4 ++++
>  drivers/net/mlx5/mlx5_rxq.c    | 12 ++++++++++--
>  drivers/net/mlx5/mlx5_stats.c  |  6 +++++-
>  drivers/net/mlx5/mlx5_txq.c    |  2 ++
>  7 files changed, 59 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index e6a69b8..0dd104a 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -208,6 +208,7 @@ int mlx5_ibv_device_to_pci_addr(const struct ibv_device *,
>  int mlx5_set_link_up(struct rte_eth_dev *dev);
>  void priv_dev_select_tx_function(struct priv *priv, struct rte_eth_dev *dev);
>  void priv_dev_select_rx_function(struct priv *priv, struct rte_eth_dev *dev);
> +int mlx5_removed(const struct priv *priv);
>  
>  /* mlx5_mac.c */
>  
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index c31ea4b..bf61cd6 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -394,6 +394,8 @@ struct priv *
>  
>  	ret = priv_sysfs_write(priv, name, value_str, (sizeof(value_str) - 1));
>  	if (ret == -1) {
> +		if (mlx5_removed(priv))
> +			errno = ENODEV;
>  		DEBUG("cannot write %s `%s' (%lu) to sysfs: %s",
>  		      name, value_str, value, strerror(errno));
>  		return -1;
> @@ -925,13 +927,17 @@ struct priv *
>  {
>  	struct utsname utsname;
>  	int ver[3];
> +	int ret;
>  
>  	if (uname(&utsname) == -1 ||
>  	    sscanf(utsname.release, "%d.%d.%d",
>  		   &ver[0], &ver[1], &ver[2]) != 3 ||
>  	    KERNEL_VERSION(ver[0], ver[1], ver[2]) < KERNEL_VERSION(4, 9, 0))
> -		return mlx5_link_update_unlocked_gset(dev, wait_to_complete);
> -	return mlx5_link_update_unlocked_gs(dev, wait_to_complete);
> +		ret = mlx5_link_update_unlocked_gset(dev, wait_to_complete);
> +	ret =  mlx5_link_update_unlocked_gs(dev, wait_to_complete);

Besides the extra space after "ret =", I think this doesn't work as
intended. An "else" statement is necessary; without it, the second
assignment always overwrites the first.
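
A reduced standalone sketch of the problem, with hypothetical stubs
(sketch_gset() and sketch_gs()) standing in for the two link update
helpers; their return values are made up for the demonstration:

```c
#include <assert.h>

/* Hypothetical stub for the legacy helper; pretend it fails. */
static int
sketch_gset(void)
{
	return -1;
}

/* Hypothetical stub for the newer helper; pretend it succeeds. */
static int
sketch_gs(void)
{
	return 0;
}

/* Same shape as the patch: the second assignment is unconditional. */
static int
sketch_link_update_no_else(int old_kernel)
{
	int ret;

	if (old_kernel)
		ret = sketch_gset();
	ret = sketch_gs(); /* Always runs and overwrites ret. */
	return ret;
}

/* With the missing "else", only one helper runs per call. */
static int
sketch_link_update_else(int old_kernel)
{
	int ret;

	if (old_kernel)
		ret = sketch_gset();
	else
		ret = sketch_gs();
	return ret;
}
```

On an old kernel the first version calls both helpers and loses the result
of the legacy one, while the second version returns it.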

> +	if (ret && mlx5_removed(mlx5_get_priv(dev)))
> +		return -ENODEV;
> +	return ret;
>  }
>  
>  /**
> @@ -978,6 +984,8 @@ struct priv *
>  	     strerror(ret));
>  	priv_unlock(priv);
>  	assert(ret >= 0);
> +	if (mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> @@ -1029,6 +1037,8 @@ struct priv *
>  out:
>  	priv_unlock(priv);
>  	assert(ret >= 0);
> +	if (mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> @@ -1083,6 +1093,8 @@ struct priv *
>  out:
>  	priv_unlock(priv);
>  	assert(ret >= 0);
> +	if (mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> @@ -1364,13 +1376,13 @@ struct priv *
>  	if (up) {
>  		err = priv_set_flags(priv, ~IFF_UP, IFF_UP);
>  		if (err)
> -			return err;
> +			return errno == ENODEV ? -ENODEV : err;

There is a documentation issue here, since the mlx5 PMD didn't get all the
errno consistency fixes that mlx4 got. However, err is documented as being
-1 in case of error, whereas priv_dev_set_link() returns a positive errno
value instead, and mlx5_set_link_down/up() should return only negative errno
values but are documented as returning positive ones.

Anyway, to keep it short: currently in mlx5, priv_*() => positive errno and
the public-facing mlx5_*() => negative errno, hence you should return a
positive ENODEV here.

You could avoid this mess by patching the public callbacks only and not
internal functions like this one.
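
As a rough standalone sketch of that convention, with the removal check
done in the public callback only. sketch_priv_set_link() and
sketch_mlx5_set_link() are hypothetical names, and the "fail" and "removed"
parameters merely simulate an internal failure and the result of the
removal check:

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical internal helper, left untouched by the patch.
 * priv_*() convention: 0 on success, positive errno value on failure.
 */
static int
sketch_priv_set_link(int fail)
{
	return fail ? EIO : 0;
}

/*
 * Hypothetical public callback.
 * mlx5_*() convention: 0 on success, negative errno value on failure.
 * The removal check lives here and only overrides an actual error.
 */
static int
sketch_mlx5_set_link(int fail, int removed)
{
	int ret = sketch_priv_set_link(fail);

	assert(ret >= 0);
	if (ret && removed)
		ret = ENODEV;
	return -ret;
}
```

This way the sign is flipped in a single place and internal functions do
not need to know about device removal at all.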

>  		priv_dev_select_tx_function(priv, dev);
>  		priv_dev_select_rx_function(priv, dev);
>  	} else {
>  		err = priv_set_flags(priv, ~IFF_UP, ~IFF_UP);
>  		if (err)
> -			return err;
> +			return errno == ENODEV ? -ENODEV : err;

Same here.

>  		dev->rx_pkt_burst = removed_rx_burst;
>  		dev->tx_pkt_burst = removed_tx_burst;
>  	}
> @@ -1474,3 +1486,22 @@ struct priv *
>  		dev->rx_pkt_burst = mlx5_rx_burst;
>  	}
>  }
> +
> +/**
> + * Check if mlx5 device was removed.
> + *

"mlx5" is redundant.

As with mlx4, a short paragraph should describe where this function is
supposed to be used.

> + * @param priv
> + *   Pointer to private structure.
> + *
> + * @return
> + *   -ENODEV when device is removed and rte_errno is set, otherwise 0.
> + */
> +int
> +mlx5_removed(const struct priv *priv)
> +{
> +	struct ibv_device_attr device_attr;
> +
> +	if (ibv_query_device(priv->ctx, &device_attr) == EIO)
> +		return -(rte_errno = ENODEV);

Coding rules prohibit this kind of assignment within an expression, see
mlx4 comments.
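
Something along these lines would do, shown here as a standalone sketch.
sketch_errno is a plain variable standing in for rte_errno, and query_ret
stands in for the value returned by ibv_query_device(); both are
assumptions made for the example:

```c
#include <assert.h>
#include <errno.h>

/* Plain stand-in for rte_errno, for illustration only. */
static int sketch_errno;

/*
 * query_ret simulates the value returned by ibv_query_device(),
 * i.e. 0 on success or a positive errno value.
 */
static int
sketch_report_removed(int query_ret)
{
	assert(query_ret >= 0);
	if (query_ret == EIO) {
		/* Assignment and return are separate statements. */
		sketch_errno = ENODEV;
		return -ENODEV;
	}
	return 0;
}
```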

> +	return 0;
> +}
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 5f49bf5..448c0a3 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -3068,6 +3068,8 @@ struct rte_flow *
>  		priv_lock(priv);
>  		ret = priv_fdir_ctrl_func(priv, filter_op, arg);
>  		priv_unlock(priv);
> +		if (ret && mlx5_removed(priv))
> +			ret = ENODEV;
>  		break;
>  	default:
>  		ERROR("%p: filter type (%d) not supported",
> diff --git a/drivers/net/mlx5/mlx5_rss.c b/drivers/net/mlx5/mlx5_rss.c
> index f3de46d..1ad9269 100644
> --- a/drivers/net/mlx5/mlx5_rss.c
> +++ b/drivers/net/mlx5/mlx5_rss.c
> @@ -250,6 +250,8 @@
>  	priv_lock(priv);
>  	ret = priv_dev_rss_reta_query(priv, reta_conf, reta_size);
>  	priv_unlock(priv);
> +	if (ret && mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> @@ -282,5 +284,7 @@
>  		mlx5_dev_stop(dev);
>  		mlx5_dev_start(dev);
>  	}
> +	if (ret && mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index a1f382b..c9a549d 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -278,6 +278,8 @@
>  	(*priv->rxqs)[idx] = &rxq_ctrl->rxq;
>  out:
>  	priv_unlock(priv);
> +	if (mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> @@ -485,8 +487,11 @@
>  	}
>  exit:
>  	priv_unlock(priv);
> -	if (ret)
> +	if (ret) {
>  		WARN("unable to arm interrupt on rx queue %d", rx_queue_id);
> +		if (mlx5_removed(priv))
> +			return -ENODEV;
> +	}
>  	return -ret;
>  }
>  
> @@ -537,9 +542,12 @@
>  	if (rxq_ibv)
>  		mlx5_priv_rxq_ibv_release(priv, rxq_ibv);
>  	priv_unlock(priv);
> -	if (ret)
> +	if (ret) {
>  		WARN("unable to disable interrupt on rx queue %d",
>  		     rx_queue_id);
> +		if (mlx5_removed(priv))
> +			return -ENODEV;
> +	}
>  	return -ret;
>  }
>  
> diff --git a/drivers/net/mlx5/mlx5_stats.c b/drivers/net/mlx5/mlx5_stats.c
> index 5e225d3..33b2a60 100644
> --- a/drivers/net/mlx5/mlx5_stats.c
> +++ b/drivers/net/mlx5/mlx5_stats.c
> @@ -438,13 +438,17 @@ struct mlx5_counter_ctrl {
>  		stats_n = priv_ethtool_get_stats_n(priv);
>  		if (stats_n < 0) {
>  			priv_unlock(priv);
> -			return -1;
> +			ret = -1;
> +			goto error;
>  		}
>  		if (xstats_ctrl->stats_n != stats_n)
>  			priv_xstats_init(priv);
>  		ret = priv_xstats_get(priv, stats);
>  		priv_unlock(priv);
>  	}
> +error:
> +	if (ret < 0 && mlx5_removed(priv))
> +		return -ENODEV;
>  	return ret;
>  }
>  
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index fbb2630..a0101cb 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -186,6 +186,8 @@
>  	(*priv->txqs)[idx] = &txq_ctrl->txq;
>  out:
>  	priv_unlock(priv);
> +	if (mlx5_removed(priv))
> +		return -ENODEV;
>  	return -ret;
>  }
>  
> -- 
> 1.8.3.1
> 

-- 
Adrien Mazarguil
6WIND
