netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/2] net/mlx4: two virtualization bug fixes
@ 2013-01-17 15:30 Or Gerlitz
  2013-01-17 15:30 ` [PATCH net 1/2] net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode Or Gerlitz
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Or Gerlitz @ 2013-01-17 15:30 UTC (permalink / raw)
  To: davem; +Cc: netdev, amirv, jackm, yevgenyp, Or Gerlitz

Hi Dave, 

These are two fixes to driver bugs which were introduced during the integration 
of the SRIOV patches, would love see them going to -stable too.

Or.


Or Gerlitz (1):
  net/mlx4_core: Set number of msix vectors under SRIOV mode to
    firmware defaults

Yan Burman (1):
  net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode

 drivers/net/ethernet/mellanox/mlx4/en_tx.c |   13 +++++++++----
 drivers/net/ethernet/mellanox/mlx4/main.c  |   11 ++---------
 2 files changed, 11 insertions(+), 13 deletions(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net 1/2] net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode
  2013-01-17 15:30 [PATCH net 0/2] net/mlx4: two virtualization bug fixes Or Gerlitz
@ 2013-01-17 15:30 ` Or Gerlitz
  2013-01-17 16:21   ` Eric Dumazet
  2013-01-17 15:30 ` [PATCH net 2/2] net/mlx4_core: Set number of msix vectors under SRIOV mode to firmware defaults Or Gerlitz
  2013-01-18 19:25 ` [PATCH net 0/2] net/mlx4: two virtualization bug fixes David Miller
  2 siblings, 1 reply; 7+ messages in thread
From: Or Gerlitz @ 2013-01-17 15:30 UTC (permalink / raw)
  To: davem; +Cc: netdev, amirv, jackm, yevgenyp, Yan Burman

From: Yan Burman <yanb@mellanox.com>

Commit 5b4c4d36860e "mlx4_en: Allow communication between functions on
same host" introduced a regression under which a bridge acting as vSwitch
whose uplink is an mlx4 Ethernet device become non-operative in native
(non sriov) mode. This happens since broadcast ARP requests sent by VMs
were loopback-ed by the HW and hence the bridge learned VM source MACs
on both the VM and the uplink ports.

The fix is to place the DMAC in the send WQE only under SRIOV/eSwitch
configuration or when the device is in selftest.

Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 2b799f4..6771b69 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -630,10 +630,15 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 		ring->tx_csum++;
 	}
 
-	/* Copy dst mac address to wqe */
-	ethh = (struct ethhdr *)skb->data;
-	tx_desc->ctrl.srcrb_flags16[0] = get_unaligned((__be16 *)ethh->h_dest);
-	tx_desc->ctrl.imm = get_unaligned((__be32 *)(ethh->h_dest + 2));
+	if (mlx4_is_mfunc(mdev->dev) || priv->validate_loopback) {
+		/* Copy dst mac address to wqe. This allows loopback in eSwitch,
+		 * so that VFs and PF can communicate with each other
+		 */
+		ethh = (struct ethhdr *)skb->data;
+		tx_desc->ctrl.srcrb_flags16[0] = get_unaligned((__be16 *)ethh->h_dest);
+		tx_desc->ctrl.imm = get_unaligned((__be32 *)(ethh->h_dest + 2));
+	}
+
 	/* Handle LSO (TSO) packets */
 	if (lso_header_size) {
 		/* Mark opcode as LSO */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net 2/2] net/mlx4_core: Set number of msix vectors under SRIOV mode to firmware defaults
  2013-01-17 15:30 [PATCH net 0/2] net/mlx4: two virtualization bug fixes Or Gerlitz
  2013-01-17 15:30 ` [PATCH net 1/2] net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode Or Gerlitz
@ 2013-01-17 15:30 ` Or Gerlitz
  2013-01-18 19:25 ` [PATCH net 0/2] net/mlx4: two virtualization bug fixes David Miller
  2 siblings, 0 replies; 7+ messages in thread
From: Or Gerlitz @ 2013-01-17 15:30 UTC (permalink / raw)
  To: davem; +Cc: netdev, amirv, jackm, yevgenyp, Or Gerlitz

The lines

	if (mlx4_is_mfunc(dev)) {
		nreq = 2;
	} else {

which hard code the number of requested msi-x vectors under multi-function
mode to two can be removed completely, since the firmware sets num_eqs and
reserved_eqs appropriately Thus, the code line:

	nreq = min_t(int, dev->caps.num_eqs - dev->caps.reserved_eqs, nreq);

is by itself sufficient and correct for all cases. Currently, for mfunc
mode num_eqs = 32 and reserved_eqs = 28, hence four vectors will be enabled.

This triples (one vector is used for the async events and commands EQ) the
horse power provided for processing of incoming packets on netdev RSS scheme,
IO initiators/targets commands processing flows, etc.

Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/main.c |   11 ++---------
 1 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index e1bafff..a6542d7 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -1790,15 +1790,8 @@ static void mlx4_enable_msi_x(struct mlx4_dev *dev)
 	int i;
 
 	if (msi_x) {
-		/* In multifunction mode each function gets 2 msi-X vectors
-		 * one for data path completions anf the other for asynch events
-		 * or command completions */
-		if (mlx4_is_mfunc(dev)) {
-			nreq = 2;
-		} else {
-			nreq = min_t(int, dev->caps.num_eqs -
-				     dev->caps.reserved_eqs, nreq);
-		}
+		nreq = min_t(int, dev->caps.num_eqs - dev->caps.reserved_eqs,
+			     nreq);
 
 		entries = kcalloc(nreq, sizeof *entries, GFP_KERNEL);
 		if (!entries)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net 1/2] net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode
  2013-01-17 15:30 ` [PATCH net 1/2] net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode Or Gerlitz
@ 2013-01-17 16:21   ` Eric Dumazet
  2013-01-17 16:32     ` Or Gerlitz
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2013-01-17 16:21 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: davem, netdev, amirv, jackm, yevgenyp, Yan Burman

On Thu, 2013-01-17 at 17:30 +0200, Or Gerlitz wrote:
> From: Yan Burman <yanb@mellanox.com>
> 
> Commit 5b4c4d36860e "mlx4_en: Allow communication between functions on
> same host" introduced a regression under which a bridge acting as vSwitch
> whose uplink is an mlx4 Ethernet device become non-operative in native
> (non sriov) mode. This happens since broadcast ARP requests sent by VMs
> were loopback-ed by the HW and hence the bridge learned VM source MACs
> on both the VM and the uplink ports.
> 
> The fix is to place the DMAC in the send WQE only under SRIOV/eSwitch
> configuration or when the device is in selftest.
> 
> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
> Signed-off-by: Yan Burman <yanb@mellanox.com>
> Signed-off-by: Amir Vadai <amirv@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx4/en_tx.c |   13 +++++++++----
>  1 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> index 2b799f4..6771b69 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> @@ -630,10 +630,15 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
>  		ring->tx_csum++;
>  	}
>  
> -	/* Copy dst mac address to wqe */
> -	ethh = (struct ethhdr *)skb->data;
> -	tx_desc->ctrl.srcrb_flags16[0] = get_unaligned((__be16 *)ethh->h_dest);
> -	tx_desc->ctrl.imm = get_unaligned((__be32 *)(ethh->h_dest + 2));
> +	if (mlx4_is_mfunc(mdev->dev) || priv->validate_loopback) {
> +		/* Copy dst mac address to wqe. This allows loopback in eSwitch,
> +		 * so that VFs and PF can communicate with each other
> +		 */
> +		ethh = (struct ethhdr *)skb->data;
> +		tx_desc->ctrl.srcrb_flags16[0] = get_unaligned((__be16 *)ethh->h_dest);
> +		tx_desc->ctrl.imm = get_unaligned((__be32 *)(ethh->h_dest + 2));
> +	}
> +
>  	/* Handle LSO (TSO) packets */
>  	if (lso_header_size) {
>  		/* Mark opcode as LSO */

Isn't this bug could explain why mlx4 had to filter incoming frames ?

It seems you also could remove this chunk ?

(I don't really understand why mlx4_en_mac_to_u64() is done... its
horribly expensive...). 
matching 6 bytes can be done in 3 x86_64 instructions using the right
helper (ether_addr_equal_64bits())

vi +613 drivers/net/ethernet/mellanox/mlx4/en_rx.c

               ethh = (struct ethhdr *)(page_address(frags[0].page) +
                                         frags[0].offset);
                s_mac = mlx4_en_mac_to_u64(ethh->h_source);

                /* If source MAC is equal to our own MAC and not performing
                 * the selftest or flb disabled - drop the packet */
                if (s_mac == priv->mac &&
                    !((dev->features & NETIF_F_LOOPBACK) ||
                      priv->validate_loopback))
                        goto next;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net 1/2] net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode
  2013-01-17 16:21   ` Eric Dumazet
@ 2013-01-17 16:32     ` Or Gerlitz
  2013-01-17 16:34       ` Or Gerlitz
  0 siblings, 1 reply; 7+ messages in thread
From: Or Gerlitz @ 2013-01-17 16:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev, amirv, jackm, yevgenyp, Yan Burman

On 17/01/2013 18:21, Eric Dumazet wrote:
> On Thu, 2013-01-17 at 17:30 +0200, Or Gerlitz wrote:
>> From: Yan Burman <yanb@mellanox.com>
>>
>> Commit 5b4c4d36860e "mlx4_en: Allow communication between functions on
>> same host" introduced a regression under which a bridge acting as vSwitch
>> whose uplink is an mlx4 Ethernet device become non-operative in native
>> (non sriov) mode. This happens since broadcast ARP requests sent by VMs
>> were loopback-ed by the HW and hence the bridge learned VM source MACs
>> on both the VM and the uplink ports.
>>
>> The fix is to place the DMAC in the send WQE only under SRIOV/eSwitch
>> configuration or when the device is in selftest.
>>
>> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
>> Signed-off-by: Yan Burman <yanb@mellanox.com>
>> Signed-off-by: Amir Vadai <amirv@mellanox.com>
>> ---
>>   drivers/net/ethernet/mellanox/mlx4/en_tx.c |   13 +++++++++----
>>   1 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> index 2b799f4..6771b69 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> @@ -630,10 +630,15 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
>>   		ring->tx_csum++;
>>   	}
>>   
>> -	/* Copy dst mac address to wqe */
>> -	ethh = (struct ethhdr *)skb->data;
>> -	tx_desc->ctrl.srcrb_flags16[0] = get_unaligned((__be16 *)ethh->h_dest);
>> -	tx_desc->ctrl.imm = get_unaligned((__be32 *)(ethh->h_dest + 2));
>> +	if (mlx4_is_mfunc(mdev->dev) || priv->validate_loopback) {
>> +		/* Copy dst mac address to wqe. This allows loopback in eSwitch,
>> +		 * so that VFs and PF can communicate with each other
>> +		 */
>> +		ethh = (struct ethhdr *)skb->data;
>> +		tx_desc->ctrl.srcrb_flags16[0] = get_unaligned((__be16 *)ethh->h_dest);
>> +		tx_desc->ctrl.imm = get_unaligned((__be32 *)(ethh->h_dest + 2));
>> +	}
>> +
>>   	/* Handle LSO (TSO) packets */
>>   	if (lso_header_size) {
>>   		/* Mark opcode as LSO */
> Isn't this bug could explain why mlx4 had to filter incoming frames ?

yes and no...

when in multi-function -- eSwitch mode we DO need to always put the DMAC 
in the TX WQE
since this is a hint for the HW to optionally conduct loopback (e.g if 
this DMAC is
registered with the HW for another VF). For net / 3.8 we wanted a pure 
bug fix 1st and most,
we have more patches that reduce checks related to loopback but they can 
have a better fit
to net-next.

Also, there's another patch set coming from Yan that enables bridge and 
macvlan over
VFs, this is new functionality, adding support to ndo_fdb entries and 
IFF_UNICAST_FLT,
will be submitted to net-next soon (next week I believe) too.

All in all, the check on the RX mode was intentionally not removed by this
patch and will be optimized and enhanced in more patches which are 
coming for
net-next.

Hope this makes things clearer.

Or.





>
> It seems you also could remove this chunk ?
>
> (I don't really understand why mlx4_en_mac_to_u64() is done... its
> horribly expensive...).
> matching 6 bytes can be done in 3 x86_64 instructions using the right
> helper (ether_addr_equal_64bits())
>
> vi +613 drivers/net/ethernet/mellanox/mlx4/en_rx.c
>
>                 ethh = (struct ethhdr *)(page_address(frags[0].page) +
>                                           frags[0].offset);
>                  s_mac = mlx4_en_mac_to_u64(ethh->h_source);
>
>                  /* If source MAC is equal to our own MAC and not performing
>                   * the selftest or flb disabled - drop the packet */
>                  if (s_mac == priv->mac &&
>                      !((dev->features & NETIF_F_LOOPBACK) ||
>                        priv->validate_loopback))
>                          goto next;
>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net 1/2] net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode
  2013-01-17 16:32     ` Or Gerlitz
@ 2013-01-17 16:34       ` Or Gerlitz
  0 siblings, 0 replies; 7+ messages in thread
From: Or Gerlitz @ 2013-01-17 16:34 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev, amirv, jackm, yevgenyp, Yan Burman

On 17/01/2013 18:32, Or Gerlitz wrote:
> (I don't really understand why mlx4_en_mac_to_u64() is done... its 
> horribly expensive...). matching 6 bytes can be done in 3 x86_64 
> instructions using the right helper (ether_addr_equal_64bits()) 

thanks for the heads up, we will fix.

Or.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net 0/2] net/mlx4: two virtualization bug fixes
  2013-01-17 15:30 [PATCH net 0/2] net/mlx4: two virtualization bug fixes Or Gerlitz
  2013-01-17 15:30 ` [PATCH net 1/2] net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode Or Gerlitz
  2013-01-17 15:30 ` [PATCH net 2/2] net/mlx4_core: Set number of msix vectors under SRIOV mode to firmware defaults Or Gerlitz
@ 2013-01-18 19:25 ` David Miller
  2 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2013-01-18 19:25 UTC (permalink / raw)
  To: ogerlitz; +Cc: netdev, amirv, jackm, yevgenyp

From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Thu, 17 Jan 2013 17:30:41 +0200

> These are two fixes to driver bugs which were introduced during the integration 
> of the SRIOV patches, would love see them going to -stable too.

Applied and queued up for -stable.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-01-18 19:25 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-17 15:30 [PATCH net 0/2] net/mlx4: two virtualization bug fixes Or Gerlitz
2013-01-17 15:30 ` [PATCH net 1/2] net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode Or Gerlitz
2013-01-17 16:21   ` Eric Dumazet
2013-01-17 16:32     ` Or Gerlitz
2013-01-17 16:34       ` Or Gerlitz
2013-01-17 15:30 ` [PATCH net 2/2] net/mlx4_core: Set number of msix vectors under SRIOV mode to firmware defaults Or Gerlitz
2013-01-18 19:25 ` [PATCH net 0/2] net/mlx4: two virtualization bug fixes David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).