netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 net-next] net/mlx4: Treat VFs fair when handling comm_channel_events
@ 2021-04-22  6:21 Hans Westgaard Ry
  2021-04-22  9:45 ` Tariq Toukan
  2021-04-22 22:10 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: Hans Westgaard Ry @ 2021-04-22  6:21 UTC (permalink / raw)
  To: Tariq Toukan, David S. Miller, Jakub Kicinski, netdev

Handling comm_channel_event in mlx4_master_comm_channel uses a double
loop to determine which slaves have requested work. The search is
always started at lowest slave. This leads to unfairness; lower VFs
tends to be prioritized over higher VFs.

The patch uses find_next_bit to determine which slaves to handle.
Fairness is implemented by always starting at the next to the last
start.

An MPI program has been used to measure improvements. It runs 500
ibv_reg_mr, synchronizes with all other instances and then runs 500
ibv_dereg_mr.

The results running 500 processes, time reported is for running 500
calls:

ibv_reg_mr:
             Mod.   Org.
mlx4_1    403.356ms 424.674ms
mlx4_2    403.355ms 424.674ms
mlx4_3    403.354ms 424.674ms
mlx4_4    403.355ms 424.674ms
mlx4_5    403.357ms 424.677ms
mlx4_6    403.354ms 424.676ms
mlx4_7    403.357ms 424.675ms
mlx4_8    403.355ms 424.675ms

ibv_dereg_mr:
             Mod.   Org.
mlx4_1    116.408ms 142.818ms
mlx4_2    116.434ms 142.793ms
mlx4_3    116.488ms 143.247ms
mlx4_4    116.679ms 143.230ms
mlx4_5    112.017ms 107.204ms
mlx4_6    112.032ms 107.516ms
mlx4_7    112.083ms 184.195ms
mlx4_8    115.089ms 190.618ms

Suggested-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
---
v1 -> v2:                                                                                                                                                                                                                                
 - removed set but not user varible,
   reported by 'kernel test robot'
 - fixed reversed Christmas tree,
   removed else,
   removed extra varibles in printout,
   moved next_slave to struct mlx4_mfunc_master_ctx,
   as suggested by Tariq Toukan       
 drivers/net/ethernet/mellanox/mlx4/cmd.c  | 69 ++++++++++++++++++-------------
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |  1 +
 2 files changed, 41 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index c678344d22a2..8d751383530b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -2241,43 +2241,52 @@ void mlx4_master_comm_channel(struct work_struct *work)
 	struct mlx4_priv *priv =
 		container_of(mfunc, struct mlx4_priv, mfunc);
 	struct mlx4_dev *dev = &priv->dev;
-	__be32 *bit_vec;
+	u32 lbit_vec[COMM_CHANNEL_BIT_ARRAY_SIZE];
+	u32 nmbr_bits;
 	u32 comm_cmd;
-	u32 vec;
-	int i, j, slave;
+	int i, slave;
 	int toggle;
+	bool first = true;
 	int served = 0;
 	int reported = 0;
 	u32 slt;
 
-	bit_vec = master->comm_arm_bit_vector;
-	for (i = 0; i < COMM_CHANNEL_BIT_ARRAY_SIZE; i++) {
-		vec = be32_to_cpu(bit_vec[i]);
-		for (j = 0; j < 32; j++) {
-			if (!(vec & (1 << j)))
-				continue;
-			++reported;
-			slave = (i * 32) + j;
-			comm_cmd = swab32(readl(
-					  &mfunc->comm[slave].slave_write));
-			slt = swab32(readl(&mfunc->comm[slave].slave_read))
-				     >> 31;
-			toggle = comm_cmd >> 31;
-			if (toggle != slt) {
-				if (master->slave_state[slave].comm_toggle
-				    != slt) {
-					pr_info("slave %d out of sync. read toggle %d, state toggle %d. Resynching.\n",
-						slave, slt,
-						master->slave_state[slave].comm_toggle);
-					master->slave_state[slave].comm_toggle =
-						slt;
-				}
-				mlx4_master_do_cmd(dev, slave,
-						   comm_cmd >> 16 & 0xff,
-						   comm_cmd & 0xffff, toggle);
-				++served;
+	for (i = 0; i < COMM_CHANNEL_BIT_ARRAY_SIZE; i++)
+		lbit_vec[i] = be32_to_cpu(master->comm_arm_bit_vector[i]);
+	nmbr_bits = dev->persist->num_vfs + 1;
+	if (++master->next_slave >= nmbr_bits)
+		master->next_slave = 0;
+	slave = master->next_slave;
+	while (true) {
+		slave = find_next_bit((const unsigned long *)&lbit_vec, nmbr_bits, slave);
+		if  (!first && slave >= master->next_slave)
+			break;
+		if (slave == nmbr_bits) {
+			if (!first)
+				break;
+			first = false;
+			slave = 0;
+			continue;
+		}
+		++reported;
+		comm_cmd = swab32(readl(&mfunc->comm[slave].slave_write));
+		slt = swab32(readl(&mfunc->comm[slave].slave_read)) >> 31;
+		toggle = comm_cmd >> 31;
+		if (toggle != slt) {
+			if (master->slave_state[slave].comm_toggle
+			    != slt) {
+				pr_info("slave %d out of sync. read toggle %d, state toggle %d. Resynching.\n",
+					slave, slt,
+					master->slave_state[slave].comm_toggle);
+				master->slave_state[slave].comm_toggle =
+					slt;
 			}
+			mlx4_master_do_cmd(dev, slave,
+					   comm_cmd >> 16 & 0xff,
+					   comm_cmd & 0xffff, toggle);
+			++served;
 		}
+		slave++;
 	}
 
 	if (reported && reported != served)
@@ -2389,6 +2398,8 @@ int mlx4_multi_func_init(struct mlx4_dev *dev)
 		if (!priv->mfunc.master.vf_oper)
 			goto err_comm_oper;
 
+		priv->mfunc.master.next_slave = 0;
+
 		for (i = 0; i < dev->num_slaves; ++i) {
 			vf_admin = &priv->mfunc.master.vf_admin[i];
 			vf_oper = &priv->mfunc.master.vf_oper[i];
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 64bed7ac3836..6ccf340660d9 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -603,6 +603,7 @@ struct mlx4_mfunc_master_ctx {
 	struct mlx4_slave_event_eq slave_eq;
 	struct mutex		gen_eqe_mutex[MLX4_MFUNC_MAX];
 	struct mlx4_qos_manager qos_ctl[MLX4_MAX_PORTS + 1];
+	u32			next_slave; /* mlx4_master_comm_channel */
 };
 
 struct mlx4_mfunc {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2 net-next] net/mlx4: Treat VFs fair when handling comm_channel_events
  2021-04-22  6:21 [PATCH v2 net-next] net/mlx4: Treat VFs fair when handling comm_channel_events Hans Westgaard Ry
@ 2021-04-22  9:45 ` Tariq Toukan
  2021-04-22 22:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Tariq Toukan @ 2021-04-22  9:45 UTC (permalink / raw)
  To: Hans Westgaard Ry, Tariq Toukan, David S. Miller, Jakub Kicinski,
	netdev



On 4/22/2021 9:21 AM, Hans Westgaard Ry wrote:
> Handling comm_channel_event in mlx4_master_comm_channel uses a double
> loop to determine which slaves have requested work. The search is
> always started at lowest slave. This leads to unfairness; lower VFs
> tends to be prioritized over higher VFs.
> 
> The patch uses find_next_bit to determine which slaves to handle.
> Fairness is implemented by always starting at the next to the last
> start.
> 
> An MPI program has been used to measure improvements. It runs 500
> ibv_reg_mr, synchronizes with all other instances and then runs 500
> ibv_dereg_mr.
> 
> The results running 500 processes, time reported is for running 500
> calls:
> 
> ibv_reg_mr:
>               Mod.   Org.
> mlx4_1    403.356ms 424.674ms
> mlx4_2    403.355ms 424.674ms
> mlx4_3    403.354ms 424.674ms
> mlx4_4    403.355ms 424.674ms
> mlx4_5    403.357ms 424.677ms
> mlx4_6    403.354ms 424.676ms
> mlx4_7    403.357ms 424.675ms
> mlx4_8    403.355ms 424.675ms
> 
> ibv_dereg_mr:
>               Mod.   Org.
> mlx4_1    116.408ms 142.818ms
> mlx4_2    116.434ms 142.793ms
> mlx4_3    116.488ms 143.247ms
> mlx4_4    116.679ms 143.230ms
> mlx4_5    112.017ms 107.204ms
> mlx4_6    112.032ms 107.516ms
> mlx4_7    112.083ms 184.195ms
> mlx4_8    115.089ms 190.618ms
> 
> Suggested-by: Håkon Bugge <haakon.bugge@oracle.com>
> Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
> ---
> v1 -> v2:
>   - removed set but not user varible,
>     reported by 'kernel test robot'
>   - fixed reversed Christmas tree,
>     removed else,
>     removed extra varibles in printout,
>     moved next_slave to struct mlx4_mfunc_master_ctx,
>     as suggested by Tariq Toukan
>   drivers/net/ethernet/mellanox/mlx4/cmd.c  | 69 ++++++++++++++++++-------------
>   drivers/net/ethernet/mellanox/mlx4/mlx4.h |  1 +
>   2 files changed, 41 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
> index c678344d22a2..8d751383530b 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
> @@ -2241,43 +2241,52 @@ void mlx4_master_comm_channel(struct work_struct *work)
>   	struct mlx4_priv *priv =
>   		container_of(mfunc, struct mlx4_priv, mfunc);
>   	struct mlx4_dev *dev = &priv->dev;
> -	__be32 *bit_vec;
> +	u32 lbit_vec[COMM_CHANNEL_BIT_ARRAY_SIZE];
> +	u32 nmbr_bits;
>   	u32 comm_cmd;
> -	u32 vec;
> -	int i, j, slave;
> +	int i, slave;
>   	int toggle;
> +	bool first = true;
>   	int served = 0;
>   	int reported = 0;
>   	u32 slt;
>   
> -	bit_vec = master->comm_arm_bit_vector;
> -	for (i = 0; i < COMM_CHANNEL_BIT_ARRAY_SIZE; i++) {
> -		vec = be32_to_cpu(bit_vec[i]);
> -		for (j = 0; j < 32; j++) {
> -			if (!(vec & (1 << j)))
> -				continue;
> -			++reported;
> -			slave = (i * 32) + j;
> -			comm_cmd = swab32(readl(
> -					  &mfunc->comm[slave].slave_write));
> -			slt = swab32(readl(&mfunc->comm[slave].slave_read))
> -				     >> 31;
> -			toggle = comm_cmd >> 31;
> -			if (toggle != slt) {
> -				if (master->slave_state[slave].comm_toggle
> -				    != slt) {
> -					pr_info("slave %d out of sync. read toggle %d, state toggle %d. Resynching.\n",
> -						slave, slt,
> -						master->slave_state[slave].comm_toggle);
> -					master->slave_state[slave].comm_toggle =
> -						slt;
> -				}
> -				mlx4_master_do_cmd(dev, slave,
> -						   comm_cmd >> 16 & 0xff,
> -						   comm_cmd & 0xffff, toggle);
> -				++served;
> +	for (i = 0; i < COMM_CHANNEL_BIT_ARRAY_SIZE; i++)
> +		lbit_vec[i] = be32_to_cpu(master->comm_arm_bit_vector[i]);
> +	nmbr_bits = dev->persist->num_vfs + 1;
> +	if (++master->next_slave >= nmbr_bits)
> +		master->next_slave = 0;
> +	slave = master->next_slave;
> +	while (true) {
> +		slave = find_next_bit((const unsigned long *)&lbit_vec, nmbr_bits, slave);
> +		if  (!first && slave >= master->next_slave)
> +			break;
> +		if (slave == nmbr_bits) {
> +			if (!first)
> +				break;
> +			first = false;
> +			slave = 0;
> +			continue;
> +		}
> +		++reported;
> +		comm_cmd = swab32(readl(&mfunc->comm[slave].slave_write));
> +		slt = swab32(readl(&mfunc->comm[slave].slave_read)) >> 31;
> +		toggle = comm_cmd >> 31;
> +		if (toggle != slt) {
> +			if (master->slave_state[slave].comm_toggle
> +			    != slt) {
> +				pr_info("slave %d out of sync. read toggle %d, state toggle %d. Resynching.\n",
> +					slave, slt,
> +					master->slave_state[slave].comm_toggle);
> +				master->slave_state[slave].comm_toggle =
> +					slt;
>   			}
> +			mlx4_master_do_cmd(dev, slave,
> +					   comm_cmd >> 16 & 0xff,
> +					   comm_cmd & 0xffff, toggle);
> +			++served;
>   		}
> +		slave++;
>   	}
>   
>   	if (reported && reported != served)
> @@ -2389,6 +2398,8 @@ int mlx4_multi_func_init(struct mlx4_dev *dev)
>   		if (!priv->mfunc.master.vf_oper)
>   			goto err_comm_oper;
>   
> +		priv->mfunc.master.next_slave = 0;
> +
>   		for (i = 0; i < dev->num_slaves; ++i) {
>   			vf_admin = &priv->mfunc.master.vf_admin[i];
>   			vf_oper = &priv->mfunc.master.vf_oper[i];
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> index 64bed7ac3836..6ccf340660d9 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> @@ -603,6 +603,7 @@ struct mlx4_mfunc_master_ctx {
>   	struct mlx4_slave_event_eq slave_eq;
>   	struct mutex		gen_eqe_mutex[MLX4_MFUNC_MAX];
>   	struct mlx4_qos_manager qos_ctl[MLX4_MAX_PORTS + 1];
> +	u32			next_slave; /* mlx4_master_comm_channel */
>   };
>   
>   struct mlx4_mfunc {
> 

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>

Thanks,
Tariq

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2 net-next] net/mlx4: Treat VFs fair when handling comm_channel_events
  2021-04-22  6:21 [PATCH v2 net-next] net/mlx4: Treat VFs fair when handling comm_channel_events Hans Westgaard Ry
  2021-04-22  9:45 ` Tariq Toukan
@ 2021-04-22 22:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-04-22 22:10 UTC (permalink / raw)
  To: Hans Westgaard Ry; +Cc: tariqt, davem, kuba, netdev

Hello:

This patch was applied to netdev/net-next.git (refs/heads/master):

On Thu, 22 Apr 2021 08:21:40 +0200 you wrote:
> Handling comm_channel_event in mlx4_master_comm_channel uses a double
> loop to determine which slaves have requested work. The search is
> always started at lowest slave. This leads to unfairness; lower VFs
> tends to be prioritized over higher VFs.
> 
> The patch uses find_next_bit to determine which slaves to handle.
> Fairness is implemented by always starting at the next to the last
> start.
> 
> [...]

Here is the summary with links:
  - [v2,net-next] net/mlx4: Treat VFs fair when handling comm_channel_events
    https://git.kernel.org/netdev/net-next/c/79ebfb11fe08

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-04-22 22:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-04-22  6:21 [PATCH v2 net-next] net/mlx4: Treat VFs fair when handling comm_channel_events Hans Westgaard Ry
2021-04-22  9:45 ` Tariq Toukan
2021-04-22 22:10 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).