Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing
From: David Miller @ 2012-12-10 18:41 UTC (permalink / raw)
  To: johannes; +Cc: eric, netdev, linux-wireless, linville, eric.dumazet
In-Reply-To: <1355132466.9857.6.camel@jlt4.sipsolutions.net>

From: Johannes Berg <johannes@sipsolutions.net>
Date: Mon, 10 Dec 2012 10:41:06 +0100

> From: Johannes Berg <johannes.berg@intel.com>
> 
> ip_check_defrag() might be called from af_packet within the
> RX path where shared SKBs are used, so it must not modify
> the input SKB before it has unshared it for defragmentation.
> Use skb_copy_bits() to get the IP header and only pull in
> everything later.
> 
> The same is true for the other caller in macvlan as it is
> called from dev->rx_handler which can also get a shared SKB.
> 
> Reported-by: Eric Leblond <eric@regit.org>
> Cc: stable@vger.kernel.org
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
> For some versions of the kernel, this code goes into af_packet.c

So the bug is that ip_check_defrag() has a precondition which is met
properly by all callers except AF_PACKET.

If this is the case, remind me why are we changing ip_check_defrag()
rather than the violator of the precondition?

^ permalink raw reply

* Re: [PATCH net-next v3 03/22] bnx2x: Add to VF <-> PF channel the release request
From: David Miller @ 2012-12-10 18:39 UTC (permalink / raw)
  To: ariele; +Cc: netdev, eilong
In-Reply-To: <1355154406-10855-4-git-send-email-ariele@broadcom.com>

From: "Ariel Elior" <ariele@broadcom.com>
Date: Mon, 10 Dec 2012 17:46:27 +0200

> +	/* PF released us */
> +	if (resp->hdr.status == PFVF_STATUS_SUCCESS) {
> +		DP(BNX2X_MSG_SP, "vf released\n");
> +
> +	/* PF reports error */
> +	} else {

Seriously?  Do I even need to tell you how awful this looks?

Remove the empty line and put that comment before the "else"
somewhere where it doesn't have to have indentation which does
not match the basic block it is in.

I think I've reviewed enough of this series, please audit the
rest of coding style problems yourself.

^ permalink raw reply

* Re: [PATCH net-next v3 02/22] bnx2x: VF <-> PF channel 'acquire' at vf probe
From: David Miller @ 2012-12-10 18:37 UTC (permalink / raw)
  To: ariele; +Cc: netdev, eilong
In-Reply-To: <1355154406-10855-3-git-send-email-ariele@broadcom.com>

From: "Ariel Elior" <ariele@broadcom.com>
Date: Mon, 10 Dec 2012 17:46:26 +0200

> +/* VF MBX (aka vf-pf channel) */
> +#include "bnx2x.h"
> +#include "bnx2x_sriov.h"
> +/* place a given tlv on the tlv buffer at a given offset */
> +void bnx2x_add_tlv(struct bnx2x *bp, void *tlvs_list, u16 offset, u16 type,
> +		   u16 length)

Put an empty line between include directives and the rest of the
file.

> +	while (tlv->type != CHANNEL_TLV_LIST_END) {
> +
> +		/* output tlv */
> +		DP(BNX2X_MSG_IOV, "TLV number %d: type %d, length %d\n", i,
> +		   tlv->type, tlv->length);

But this is a place where an empty line is not appropriate, please remove
the empty line after the one with the while().

> +/* acquire response tlv - carries the allocated resources */
> +struct pfvf_acquire_resp_tlv {
> +	struct pfvf_tlv hdr;
> +	struct pf_vf_pfdev_info {
> +		u32 chip_num;
> +		u32 pf_cap;
> +		#define PFVF_CAP_RSS		0x00000001
> +		#define PFVF_CAP_DHC		0x00000002
> +		#define PFVF_CAP_TPA		0x00000004

Please don't indent CPP defines excessively like this, they should be
at or near the initial column so that CPP directives can be easily
spotted by someone reading this code.

^ permalink raw reply

* Re: [PATCH net-next v3 01/22] bnx2x: Support probing and removing of VF device
From: David Miller @ 2012-12-10 18:35 UTC (permalink / raw)
  To: ariele; +Cc: netdev, eilong
In-Reply-To: <1355154406-10855-2-git-send-email-ariele@broadcom.com>

From: "Ariel Elior" <ariele@broadcom.com>
Date: Mon, 10 Dec 2012 17:46:25 +0200

>  static void bnx2x_get_pcie_width_speed(struct bnx2x *bp, int *width, int *speed)
>  {
> -	u32 val = REG_RD(bp, PCICFG_OFFSET + PCICFG_LINK_CONTROL);
> +	u32 val = 0;
> +	pci_read_config_dword(bp->pdev, PCICFG_LINK_CONTROL, &val);

Please put an empty line between function local variable declarations
and actual code.

> @@ -12023,85 +12057,115 @@ static int bnx2x_get_num_non_def_sbs(struct pci_dev *pdev,
>  	 * If MSI-X is not supported - return number of SBs needed to support
>  	 * one fast path queue: one FP queue + SB for CNIC
>  	 */
> -	if (!pos)
> +	if (!pos) {
> +		pr_info("no msix capability found");
>  		return 1 + cnic_cnt;
> +	}
> +
> +	pr_info("msix capability found");
>  

Use dev_info(), netdev_info(), or similar, rather than plain
pr_info().

^ permalink raw reply

* RE: ixgbe:  pci_get_device() call without counterpart call of pci_dev_put()
From: Rose, Gregory V @ 2012-12-10 18:24 UTC (permalink / raw)
  To: Kirsher, Jeffrey T, Elena Gurevich
  Cc: netdev, e1000-devel@lists.sourceforge.net, Skidmore, Donald C
In-Reply-To: <50C58C98.2030509@gmail.com>

> -----Original Message-----
> From: Jeff Kirsher [mailto:tarbal@gmail.com]
> Sent: Sunday, December 09, 2012 11:18 PM
> To: Elena Gurevich
> Cc: netdev; e1000-devel@lists.sourceforge.net; Rose, Gregory V; Skidmore,
> Donald C
> Subject: Re: ixgbe: pci_get_device() call without counterpart call of
> pci_dev_put()
> 
> On 12/09/2012 01:47 AM, Elena Gurevich wrote:
> > Hi all,
> > I am pioneer in linux device drivers here and using Intel 82599 NIC as
> > reference model, During investigation to drivers sources I found the
> > suspicious code:
> >
> > Is code sequence (1)   and (2) the possible device reference count
> leakage
> > ???
> >
> >
> > Thanks a lot in advance
> > Lena
> >
> > --------snipped from ixgbe_main.c  file function
> > ixgbe_io_error_detected()
> > -----------
> >
> > . . .
> > 		/* Find the pci device of the offending VF */
> > 		vfdev = pci_get_device(PCI_VENDOR_ID_INTEL, device_id, NULL);
> > 		while (vfdev) {
> > 			if (vfdev->devfn == (req_id & 0xFF))
> > 				break;
> > <------------------------------      (1) leaves the loop with successful
> get
> > call  !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> >
> > 			vfdev = pci_get_device(PCI_VENDOR_ID_INTEL,
> > 					       device_id, vfdev);
> > 		}
> > 		/*
> > 		 * There's a slim chance the VF could have been hot plugged,
> > 		 * so if it is no longer present we don't need to issue the
> > 		 * VFLR.  Just clean up the AER in that case.
> > 		 */
> > 		if (vfdev) {
> > 			e_dev_err("Issuing VFLR to VF %d\n", vf);
> > 			pci_write_config_dword(vfdev, 0xA8, 0x00008000);
> > 		}
> >
> > 		pci_cleanup_aer_uncorrect_error_status(pdev);
> > 	}
> >
> > 	/*
> > 	 * Even though the error may have occurred on the other port
> > 	 * we still need to increment the vf error reference count for
> > 	 * both ports because the I/O resume function will be called
> > 	 * for both of them.
> > 	 */
> > 	adapter->vferr_refcount++;
> >
> > 	return PCI_ERS_RESULT_RECOVERED;
> > <--------------------------------------------  (2) leaves the function
> > without put call !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> > ----------------------------------  snipped
> > -----------------------------------
> >
> Added Greg Rose (ixgbevf maintainer) & Don Skidmore (ixgbe maintainer)
> 
> Since the code you have questions about is dealing with the vf, Greg will
> most likely be able to shed some light to your questions.

That's a bug.  I'll generate a patch to add the necessary pci_dev_put().

- Greg


^ permalink raw reply

* Re: [PATCH net-next v3] tipc: sk_recv_queue size check only for connectionless sockets
From: Neil Horman @ 2012-12-10 18:22 UTC (permalink / raw)
  To: Jon Maloy; +Cc: Ying Xue, Paul.Gortmaker, erik.hugne, netdev, tipc-discussion
In-Reply-To: <50C60496.5040800@ericsson.com>

On Mon, Dec 10, 2012 at 10:49:42AM -0500, Jon Maloy wrote:
> On 12/10/2012 05:13 AM, Jon Maloy wrote:
> > On 12/10/2012 04:23 AM, Ying Xue wrote:
> >> The sk_receive_queue limit control is currently performed for all
> >> arriving messages, disregarding socket and message type. But for
> >> connectionless sockets this check is redundant, since the protocol
> >> flow already makes queue overflow impossible.
> >>
> >> We move the sk_receive_queue limit control so that it's only performed
> >> for connectionless sockets, i.e. SOCK_RDM and SOCK_DGRAM type sockets.
> >>
> >> However, as Neil Horman specified, we cannot simply force the socket
> >> receive queue limit against connectionless sockets as it may create a
> >> DoS vulnerability. For example, if a sender floods a receiver with
> >> messages containing an invalid set of message importance bits or
> >> CRITICAL importance, we will queue messages indefinitely.
> >>
> >> To avoid DoS attack, socket receive queue will be marked as overflow
> >> if we receive messages with invalid message importances, meanwhile,
> >> we also set one new threshold for CRITICAL importance messages.
> >>
> >> Signed-off-by: Ying Xue <ying.xue@windriver.com>
> >> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
> >> Cc: Neil Horman <nhorman@tuxdriver.com>
> >> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> >> ---
> >> v3 changes:
> >>  - set new threshold for CRITICAL message
> >>  - defined an importance factor table to avoid multiplication and
> >>    division operations in rx_queue_full().
> >>  - changed return value of rx_queue_full() from integer to boolean.
> >>
> >>  net/tipc/socket.c |   44 +++++++++++++++++++-------------------------
> >>  1 files changed, 19 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/net/tipc/socket.c b/net/tipc/socket.c
> >> index 9b4e483..a18a757 100644
> >> --- a/net/tipc/socket.c
> >> +++ b/net/tipc/socket.c
> >> @@ -43,7 +43,7 @@
> >>  #define SS_LISTENING	-1	/* socket is listening */
> >>  #define SS_READY	-2	/* socket is connectionless */
> >>  
> >> -#define OVERLOAD_LIMIT_BASE	10000
> >> +#define OVERLOAD_LIMIT_BASE	5000
> >>  #define CONN_TIMEOUT_DEFAULT	8000	/* default connect timeout = 8s */
> >>  
> >>  struct tipc_sock {
> >> @@ -73,6 +73,13 @@ static struct proto tipc_proto;
> >>  
> >>  static int sockets_enabled;
> >>  
> >> +static const u32 msg_importance_factor[] = {
> >> +	OVERLOAD_LIMIT_BASE,		/* TIPC_LOW_IMPORTANCE limit */
> >> +	OVERLOAD_LIMIT_BASE * 2,	/* TIPC_MEDIUM_IMPORTANCE limit */
> >> +	OVERLOAD_LIMIT_BASE * 100,	/* TIPC_HIGH_IMPORTANCE limit */
> >> +	OVERLOAD_LIMIT_BASE * 200	/* TIPC_CRITICAL_IMPORTANCE limit */
> >> +	};
> >> +
> >>  /*
> >>   * Revised TIPC socket locking policy:
> >>   *
> >> @@ -1158,28 +1165,17 @@ static void tipc_data_ready(struct sock *sk, int len)
> >>   * rx_queue_full - determine if receive queue can accept another message
> >>   * @msg: message to be added to queue
> >>   * @queue_size: current size of queue
> >> - * @base: nominal maximum size of queue
> >>   *
> >> - * Returns 1 if queue is unable to accept message, 0 otherwise
> >> + * Returns true if queue is unable to accept message, false otherwise
> >>   */
> >> -static int rx_queue_full(struct tipc_msg *msg, u32 queue_size, u32 base)
> >> +static bool rx_queue_full(struct tipc_msg *msg, u32 queue_size)
> >>  {
> >> -	u32 threshold;
> >>  	u32 imp = msg_importance(msg);
> >>  
> >> -	if (imp == TIPC_LOW_IMPORTANCE)
> >> -		threshold = base;
> >> -	else if (imp == TIPC_MEDIUM_IMPORTANCE)
> >> -		threshold = base * 2;
> >> -	else if (imp == TIPC_HIGH_IMPORTANCE)
> >> -		threshold = base * 100;
> >> -	else
> >> -		return 0;
> >> +	if (unlikely(imp > TIPC_CRITICAL_IMPORTANCE))
> >> +		return true;
> > 
> > This test is not necessary. Such messages have already been filtered out
> > in tipc_recv_msg() at link level.
> > The test msg_isdata(), which determines if a message should be sent up to
> > the port/socket level, is  also an implicit test that 
> > importance < TIPC_CRITICAL_IMPORTANCE.
> 
> (importance <= TIPC_CRITICAL_IMPORTANCE), of course.
> This is an effect of the co-location of the user and importance fields in the
> TIPC header. I.e., the importance is in reality coded into the value of
> the user field.
> 
> To clarify (and improve) my previous suggestion, what I had in mind
> was something like this:
> 
> 
>                 recv_q_len = skb_queue_len(&sk->sk_receive_queue);
> -               if (unlikely(recv_q_len >= (OVERLOAD_LIMIT_BASE / 2))) {
> -                       if (rx_queue_full(msg, recv_q_len,
> -                           OVERLOAD_LIMIT_BASE / 2))
> -                               return TIPC_ERR_OVERLOAD;
> -               }
> +               imp = msg_importance(msg);
> +               if (unlikely(recv_q_len > (OVERLOAD_LIMIT_BASE << (imp << 1))))
> +                       return TIPC_ERR_OVERLOAD;                 
>         } else {
>                 if (msg_mcast(msg))
>                         return TIPC_ERR_NO_PORT;
> 
> No need for any separate function at all, and guaranteed inline.
> This will probably translate to one single shift-instruction extra,
> and no memor access, since the compiler merge (imp << 1) with the 
> shift operation used to read imp from the header.
> 
> 
> The limits for message rejection become the following,
> given an OVERLOAD_LIMIT_BASE of 10000:
> 
> TIP_LOW_IMPORTANCE:       10000    (previously 10000)
> TIPC_MEDIUM_IMPORTANCE    40000    (previously 20000)
> TIPC_MEDIUM_IMPORTANCE    1600000  (previously 1000000)
> TIPC_CRITICAL_IMPORTANCE  25600000 (previously no limit)
> 
> ///jon
> 
Sure, this will work nicely as well.
Neil

^ permalink raw reply

* netconsole fun
From: Peter Hurley @ 2012-12-10 18:00 UTC (permalink / raw)
  To: netdev

Is there a recommended method of running netconsole on a machine that
has a slaved device?

I ask because it seems pretty difficult to get netconsole running even
on a machine that has multiple physical interfaces, only one of which is
slaved.

I scoured the documentation but didn't find anything relevant (does
documentation/networking/netconsole.txt need a patch?). AFAICT, the last
discussion ended here back in June '11
http://lkml.indiana.edu/hypermail/linux/kernel/1106.1/03185.html 

Any help or pointers provided would be much appreciated.

Regards,
Peter Hurley

^ permalink raw reply

* Re: [RFC PATCH v2 3/3] tun: fix LSM/SELinux labeling of tun/tap devices
From: Michael S. Tsirkin @ 2012-12-10 17:50 UTC (permalink / raw)
  To: Paul Moore; +Cc: netdev, linux-security-module, selinux, jasowang
In-Reply-To: <3124654.2UMIXvF0vN@sifl>

On Mon, Dec 10, 2012 at 12:33:49PM -0500, Paul Moore wrote:
> On Monday, December 10, 2012 07:26:56 PM Michael S. Tsirkin wrote:
> > On Mon, Dec 10, 2012 at 12:04:35PM -0500, Paul Moore wrote:
> > > On Friday, December 07, 2012 02:25:16 PM Michael S. Tsirkin wrote:
> > > > On Thu, Dec 06, 2012 at 04:09:51PM -0500, Paul Moore wrote:
> > > > > On Thursday, December 06, 2012 10:57:16 PM Michael S. Tsirkin wrote:
> > > > > > On Thu, Dec 06, 2012 at 11:56:45AM -0500, Paul Moore wrote:
> > > > > > > The SETQUEUE/tun_socket:create_queue permissions do not yet exist
> > > > > > > in any released SELinux policy as we are just now adding them with
> > > > > > > this patchset. With current policies loaded into a kernel with
> > > > > > > this patchset applied the SETQUEUE/tun_socket:create_queue
> > > > > > > permission would be treated according to the policy's unknown
> > > > > > > permission setting.
> > > > > > 
> > > > > > OK I think we need to rethink what we are doing here: what you sent
> > > > > > addresses the problem as stated but I think we mis-stated it.  Let
> > > > > > me try to restate the problem: it is not just selinux problem. Let's
> > > > > > assume qemu wants to use tun, I (libvirt) don't want to run it as
> > > > > > root.
> > > > > > 
> > > > > > 1. TUNSETIFF: I can open tun, attach an fd and pass it to qemu.
> > > > > > Now, qemu does not invoke TUNSETIFF so it can run without
> > > > > > kernel priveledges.
> > > > > 
> > > > > Correct me if I'm wrong, but I believe libvirt does this while running
> > > > > as root.  Assuming that is the case, why not simply setuid()/setgid()
> > > > > to the same credentials as the QEMU instance before creating the TUN
> > > > > device? You can always (re)configure the device afterwards while
> > > > > running as root/CAP_NET_ADMIN.
> > > > 
> > > > We want isolation between qemu instances.
> > > 
> > > Understood, I agree.
> > > 
> > > Achieving separation via SELinux is easily done, with libvirt/sVirt
> > > already doing this for us automatically in most cases; the only thing we
> > > will want to do is make sure the SELinux policy is aware of the new
> > > permission.
> > > 
> > > Achieving separation via DAC should also be easily done, simply run each
> > > QEMU instance with a separate UID and/or GID.
> > > 
> > > > Giving qemu right to open tun and SETIFF would give it rights
> > > > to access any tun device.
> > > 
> > > I'm quickly looked at tun_chr_open() again and I don't see any special
> > > rights/privileges required, the same for tun_chr_ioctl() and
> > > __tun_chr_ioctl().  Looking at tun_set_queue() I see we call
> > > tun_not_capable() which does a simple DAC check; it must have the same
> > > UID/GID or have CAP_NET_ADMIN.
> > > 
> > > I'm having a hard time seeing the problem you are describing; help me
> > > understand.
> > 
> > The issue is guest controls the number of queues in use.
> > So qemu would be required to be allowed to call tun_set_queue.
> > If we allow this we have a problem as one qemu will be
> > able to access any tun.
> 
> QEMU can call tun_set_queue() as long as it satisfies tun_not_capable(), which 
> from a practical point of view means that the TUN device was created with the 
> same UID/GID as the QEMU instance.  If you want TUN device separation between 
> QEMU instances using DAC you need to run each QEMU instance with a different 
> UID/GID (which you should be doing anyway if you want DAC enforced general 
> separation).
> 
> I believe I've stated this point several times now and I don't feel you've 
> addressed it properly.

Look at how it works at the moment:
a priveledged libvirt server calls tun_set_iff
and passes the fd to qemu which is not priveledged.

The result is isolation between qemu instances without
need to create uid per qemu instance.

How do we create multiple queues? It makes sense to
follow this model and pass in fds for individual queues.
However they need to be disabled initially
so libvirt can not do tun_set_queue for us.
When qemu later calls tun_set_queue it will fail which means we
can't utilize multiqueue.

My solution is an unpriveledged variant
of tun_set_queue that only enables/disables
a queue without attach/detach.


> -- 
> paul moore
> security and virtualization @ redhat

^ permalink raw reply

* Re: [RFC PATCH v2 3/3] tun: fix LSM/SELinux labeling of tun/tap devices
From: Paul Moore @ 2012-12-10 17:33 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-security-module, selinux, jasowang
In-Reply-To: <20121210172656.GA30775@redhat.com>

On Monday, December 10, 2012 07:26:56 PM Michael S. Tsirkin wrote:
> On Mon, Dec 10, 2012 at 12:04:35PM -0500, Paul Moore wrote:
> > On Friday, December 07, 2012 02:25:16 PM Michael S. Tsirkin wrote:
> > > On Thu, Dec 06, 2012 at 04:09:51PM -0500, Paul Moore wrote:
> > > > On Thursday, December 06, 2012 10:57:16 PM Michael S. Tsirkin wrote:
> > > > > On Thu, Dec 06, 2012 at 11:56:45AM -0500, Paul Moore wrote:
> > > > > > The SETQUEUE/tun_socket:create_queue permissions do not yet exist
> > > > > > in any released SELinux policy as we are just now adding them with
> > > > > > this patchset. With current policies loaded into a kernel with
> > > > > > this patchset applied the SETQUEUE/tun_socket:create_queue
> > > > > > permission would be treated according to the policy's unknown
> > > > > > permission setting.
> > > > > 
> > > > > OK I think we need to rethink what we are doing here: what you sent
> > > > > addresses the problem as stated but I think we mis-stated it.  Let
> > > > > me try to restate the problem: it is not just selinux problem. Let's
> > > > > assume qemu wants to use tun, I (libvirt) don't want to run it as
> > > > > root.
> > > > > 
> > > > > 1. TUNSETIFF: I can open tun, attach an fd and pass it to qemu.
> > > > > Now, qemu does not invoke TUNSETIFF so it can run without
> > > > > kernel priveledges.
> > > > 
> > > > Correct me if I'm wrong, but I believe libvirt does this while running
> > > > as root.  Assuming that is the case, why not simply setuid()/setgid()
> > > > to the same credentials as the QEMU instance before creating the TUN
> > > > device? You can always (re)configure the device afterwards while
> > > > running as root/CAP_NET_ADMIN.
> > > 
> > > We want isolation between qemu instances.
> > 
> > Understood, I agree.
> > 
> > Achieving separation via SELinux is easily done, with libvirt/sVirt
> > already doing this for us automatically in most cases; the only thing we
> > will want to do is make sure the SELinux policy is aware of the new
> > permission.
> > 
> > Achieving separation via DAC should also be easily done, simply run each
> > QEMU instance with a separate UID and/or GID.
> > 
> > > Giving qemu right to open tun and SETIFF would give it rights
> > > to access any tun device.
> > 
> > I'm quickly looked at tun_chr_open() again and I don't see any special
> > rights/privileges required, the same for tun_chr_ioctl() and
> > __tun_chr_ioctl().  Looking at tun_set_queue() I see we call
> > tun_not_capable() which does a simple DAC check; it must have the same
> > UID/GID or have CAP_NET_ADMIN.
> > 
> > I'm having a hard time seeing the problem you are describing; help me
> > understand.
> 
> The issue is guest controls the number of queues in use.
> So qemu would be required to be allowed to call tun_set_queue.
> If we allow this we have a problem as one qemu will be
> able to access any tun.

QEMU can call tun_set_queue() as long as it satisfies tun_not_capable(), which 
from a practical point of view means that the TUN device was created with the 
same UID/GID as the QEMU instance.  If you want TUN device separation between 
QEMU instances using DAC you need to run each QEMU instance with a different 
UID/GID (which you should be doing anyway if you want DAC enforced general 
separation).

I believe I've stated this point several times now and I don't feel you've 
addressed it properly.

-- 
paul moore
security and virtualization @ redhat

^ permalink raw reply

* Re: [RFC PATCH v2 3/3] tun: fix LSM/SELinux labeling of tun/tap devices
From: Michael S. Tsirkin @ 2012-12-10 17:26 UTC (permalink / raw)
  To: Paul Moore; +Cc: netdev, linux-security-module, selinux, jasowang
In-Reply-To: <5998443.squEvSxCG9@sifl>

On Mon, Dec 10, 2012 at 12:04:35PM -0500, Paul Moore wrote:
> On Friday, December 07, 2012 02:25:16 PM Michael S. Tsirkin wrote:
> > On Thu, Dec 06, 2012 at 04:09:51PM -0500, Paul Moore wrote:
> > > On Thursday, December 06, 2012 10:57:16 PM Michael S. Tsirkin wrote:
> > > > On Thu, Dec 06, 2012 at 11:56:45AM -0500, Paul Moore wrote:
> > > > > The SETQUEUE/tun_socket:create_queue permissions do not yet exist in
> > > > > any released SELinux policy as we are just now adding them with this
> > > > > patchset. With current policies loaded into a kernel with this
> > > > > patchset applied the SETQUEUE/tun_socket:create_queue permission would
> > > > > be treated according to the policy's unknown permission setting.
> > > > 
> > > > OK I think we need to rethink what we are doing here: what you sent
> > > > addresses the problem as stated but I think we mis-stated it.  Let me
> > > > try to restate the problem: it is not just selinux problem. Let's assume
> > > > qemu wants to use tun, I (libvirt) don't want to run it as root.
> > > > 
> > > > 1. TUNSETIFF: I can open tun, attach an fd and pass it to qemu.
> > > > Now, qemu does not invoke TUNSETIFF so it can run without
> > > > kernel priveledges.
> > > 
> > > Correct me if I'm wrong, but I believe libvirt does this while running as
> > > root.  Assuming that is the case, why not simply setuid()/setgid() to the
> > > same credentials as the QEMU instance before creating the TUN device? 
> > > You can always (re)configure the device afterwards while running as
> > > root/CAP_NET_ADMIN.
> > 
> > We want isolation between qemu instances.
> 
> Understood, I agree.
> 
> Achieving separation via SELinux is easily done, with libvirt/sVirt already 
> doing this for us automatically in most cases; the only thing we will want to 
> do is make sure the SELinux policy is aware of the new permission.
> 
> Achieving separation via DAC should also be easily done, simply run each QEMU 
> instance with a separate UID and/or GID.
> 
> > Giving qemu right to open tun and SETIFF would give it rights
> > to access any tun device.
> 
> I'm quickly looked at tun_chr_open() again and I don't see any special 
> rights/privileges required, the same for tun_chr_ioctl() and 
> __tun_chr_ioctl().  Looking at tun_set_queue() I see we call tun_not_capable() 
> which does a simple DAC check; it must have the same UID/GID or have 
> CAP_NET_ADMIN.
> 
> I'm having a hard time seeing the problem you are describing; help me 
> understand.

The issue is guest controls the number of queues in use.
So qemu would be required to be allowed to call tun_set_queue.
If we allow this we have a problem as one qemu will be
able to access any tun.

At least with DAC, looks like there's a problem. SELinux I think
can address this.

> > There could also be user tun users we want them isolated from qemu.
> 
> Once again, should be possible using either SELinux, DAC, or both.
> 
> -- 
> paul moore
> security and virtualization @ redhat

^ permalink raw reply

* Re: [RFC PATCH v2 3/3] tun: fix LSM/SELinux labeling of tun/tap devices
From: Paul Moore @ 2012-12-10 17:04 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-security-module, selinux, jasowang
In-Reply-To: <20121207122516.GA16577@redhat.com>

On Friday, December 07, 2012 02:25:16 PM Michael S. Tsirkin wrote:
> On Thu, Dec 06, 2012 at 04:09:51PM -0500, Paul Moore wrote:
> > On Thursday, December 06, 2012 10:57:16 PM Michael S. Tsirkin wrote:
> > > On Thu, Dec 06, 2012 at 11:56:45AM -0500, Paul Moore wrote:
> > > > The SETQUEUE/tun_socket:create_queue permissions do not yet exist in
> > > > any released SELinux policy as we are just now adding them with this
> > > > patchset. With current policies loaded into a kernel with this
> > > > patchset applied the SETQUEUE/tun_socket:create_queue permission would
> > > > be treated according to the policy's unknown permission setting.
> > > 
> > > OK I think we need to rethink what we are doing here: what you sent
> > > addresses the problem as stated but I think we mis-stated it.  Let me
> > > try to restate the problem: it is not just selinux problem. Let's assume
> > > qemu wants to use tun, I (libvirt) don't want to run it as root.
> > > 
> > > 1. TUNSETIFF: I can open tun, attach an fd and pass it to qemu.
> > > Now, qemu does not invoke TUNSETIFF so it can run without
> > > kernel priveledges.
> > 
> > Correct me if I'm wrong, but I believe libvirt does this while running as
> > root.  Assuming that is the case, why not simply setuid()/setgid() to the
> > same credentials as the QEMU instance before creating the TUN device? 
> > You can always (re)configure the device afterwards while running as
> > root/CAP_NET_ADMIN.
> 
> We want isolation between qemu instances.

Understood, I agree.

Achieving separation via SELinux is easily done, with libvirt/sVirt already 
doing this for us automatically in most cases; the only thing we will want to 
do is make sure the SELinux policy is aware of the new permission.

Achieving separation via DAC should also be easily done, simply run each QEMU 
instance with a separate UID and/or GID.

> Giving qemu right to open tun and SETIFF would give it rights
> to access any tun device.

I'm quickly looked at tun_chr_open() again and I don't see any special 
rights/privileges required, the same for tun_chr_ioctl() and 
__tun_chr_ioctl().  Looking at tun_set_queue() I see we call tun_not_capable() 
which does a simple DAC check; it must have the same UID/GID or have 
CAP_NET_ADMIN.

I'm having a hard time seeing the problem you are describing; help me 
understand.

> There could also be user tun users we want them isolated from qemu.

Once again, should be possible using either SELinux, DAC, or both.

-- 
paul moore
security and virtualization @ redhat


^ permalink raw reply

* RE: ipgre rss is broken since gro
From: Eric Dumazet @ 2012-12-10 16:54 UTC (permalink / raw)
  To: Dmitry Kravkov; +Cc: Eric Dumazet, netdev@vger.kernel.org
In-Reply-To: <504C9EFCA2D0054393414C9CB605C37F1BFC104B@SJEXCHMB06.corp.ad.broadcom.com>

On Mon, 2012-12-10 at 11:32 +0000, Dmitry Kravkov wrote:

> CPU is not loaded at all 
> 
> > cat /proc/net/softnet_stat
> Please find attached.
> 
> For gre interface RX and DROP statistics are advancing simultaneously (by one each ICMP request):
> 
> [root@ ~]# ifconfig gre
> gre       Link encap:UNSPEC  HWaddr C0-A8-0A-40-73-72-83-D2-00-00-00-00-00-00-00-00
>           inet addr:8.0.0.1  P-t-P:8.0.0.1  Mask:255.255.255.0
>           inet6 addr: fe80::5efe:c0a8:a40/64 Scope:Link
>           UP POINTOPOINT RUNNING NOARP  MTU:1476  Metric:1
>           RX packets:1646824 errors:0 dropped:51610 overruns:0 frame:0
>           TX packets:140519 errors:1 dropped:0 overruns:0 carrier:1
>           collisions:0 txqueuelen:0
>           RX bytes:2357650904 (2.1 GiB)  TX bytes:7309072 (6.9 MiB)


dropped:51610  so obviously  one cpu is fully loaded.

I believe performance problem might come from the
skb_set_queue_mapping(skb, 0); in __skb_tunnel_rx()

So all packets are queued into a single GRO queue, instead of being
split as intended in multiple queues.

I cant find why we must clear queue_mapping, so could you try :


diff --git a/include/net/dst.h b/include/net/dst.h
index 9a78810..4cb27df 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -329,7 +329,6 @@ static inline void __skb_tunnel_rx(struct sk_buff *skb, struct net_device *dev)
 	 */
 	if (!skb->l4_rxhash)
 		skb->rxhash = 0;
-	skb_set_queue_mapping(skb, 0);
 	skb_dst_drop(skb);
 	nf_reset(skb);
 }

^ permalink raw reply related

* Re: [PATCH v4 1/5] net: Add support for hardware-offloaded encapsulation
From: Alexander Duyck @ 2012-12-10 16:22 UTC (permalink / raw)
  To: saeed bishara
  Cc: Joseph Gasparakis, davem, shemminger, chrisw, gospo, netdev,
	linux-kernel, dmitry, bhutchings, Peter P Waskiewicz Jr,
	Alexander Duyck
In-Reply-To: <CAMAG_ee6xW4EnzPFt=_5+NL2+LuyUjxd52Xc242inYm2+Yhg7g@mail.gmail.com>

On 12/10/2012 02:04 AM, saeed bishara wrote:
>> +static inline struct iphdr *inner_ip_hdr(const struct sk_buff *skb)
>> +{
>> +       return (struct iphdr *)skb_inner_network_header(skb);
>> +}
> Hi,
> I'm a little bit bothered because of those inner_ functions, what
> about the following approach:
> 1. the skb will have a new state, that state can be outer (normal
> mode) and inner.
> 2. when you change the state to inner, all the helper functions such
> as ip_hdr will return the innter header.
>
> that's ofcourse the API side. the implementation may still use the
> fields you added to the skb.
>
> what you think?
> saeed

What you describe isn't too far off from what we are doing.  However we
need to store both the inner and the outer headers.  All these inner_
functions are meant to do is assist drivers to access the inner headers
in the case that skb->encapsulation is set.  We wanted to avoid
abstracting it too much since it is possible in the future that both
inner and outer network headers may be needed if for instance you were
to place a tunnelled frame inside of a VLAN with hardware tag insertion.

Thanks,

Alex

^ permalink raw reply

* Re: [PATCH net-next v3] tipc: sk_recv_queue size check only for connectionless sockets
From: Jon Maloy @ 2012-12-10 15:49 UTC (permalink / raw)
  To: Ying Xue; +Cc: Paul.Gortmaker, tipc-discussion, nhorman, netdev
In-Reply-To: <50C5B5B5.9060004@ericsson.com>

On 12/10/2012 05:13 AM, Jon Maloy wrote:
> On 12/10/2012 04:23 AM, Ying Xue wrote:
>> The sk_receive_queue limit control is currently performed for all
>> arriving messages, disregarding socket and message type. But for
>> connectionless sockets this check is redundant, since the protocol
>> flow already makes queue overflow impossible.
>>
>> We move the sk_receive_queue limit control so that it's only performed
>> for connectionless sockets, i.e. SOCK_RDM and SOCK_DGRAM type sockets.
>>
>> However, as Neil Horman specified, we cannot simply force the socket
>> receive queue limit against connectionless sockets as it may create a
>> DoS vulnerability. For example, if a sender floods a receiver with
>> messages containing an invalid set of message importance bits or
>> CRITICAL importance, we will queue messages indefinitely.
>>
>> To avoid DoS attack, socket receive queue will be marked as overflow
>> if we receive messages with invalid message importances, meanwhile,
>> we also set one new threshold for CRITICAL importance messages.
>>
>> Signed-off-by: Ying Xue <ying.xue@windriver.com>
>> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
>> Cc: Neil Horman <nhorman@tuxdriver.com>
>> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
>> ---
>> v3 changes:
>>  - set new threshold for CRITICAL message
>>  - defined an importance factor table to avoid multiplication and
>>    division operations in rx_queue_full().
>>  - changed return value of rx_queue_full() from integer to boolean.
>>
>>  net/tipc/socket.c |   44 +++++++++++++++++++-------------------------
>>  1 files changed, 19 insertions(+), 25 deletions(-)
>>
>> diff --git a/net/tipc/socket.c b/net/tipc/socket.c
>> index 9b4e483..a18a757 100644
>> --- a/net/tipc/socket.c
>> +++ b/net/tipc/socket.c
>> @@ -43,7 +43,7 @@
>>  #define SS_LISTENING	-1	/* socket is listening */
>>  #define SS_READY	-2	/* socket is connectionless */
>>  
>> -#define OVERLOAD_LIMIT_BASE	10000
>> +#define OVERLOAD_LIMIT_BASE	5000
>>  #define CONN_TIMEOUT_DEFAULT	8000	/* default connect timeout = 8s */
>>  
>>  struct tipc_sock {
>> @@ -73,6 +73,13 @@ static struct proto tipc_proto;
>>  
>>  static int sockets_enabled;
>>  
>> +static const u32 msg_importance_factor[] = {
>> +	OVERLOAD_LIMIT_BASE,		/* TIPC_LOW_IMPORTANCE limit */
>> +	OVERLOAD_LIMIT_BASE * 2,	/* TIPC_MEDIUM_IMPORTANCE limit */
>> +	OVERLOAD_LIMIT_BASE * 100,	/* TIPC_HIGH_IMPORTANCE limit */
>> +	OVERLOAD_LIMIT_BASE * 200	/* TIPC_CRITICAL_IMPORTANCE limit */
>> +	};
>> +
>>  /*
>>   * Revised TIPC socket locking policy:
>>   *
>> @@ -1158,28 +1165,17 @@ static void tipc_data_ready(struct sock *sk, int len)
>>   * rx_queue_full - determine if receive queue can accept another message
>>   * @msg: message to be added to queue
>>   * @queue_size: current size of queue
>> - * @base: nominal maximum size of queue
>>   *
>> - * Returns 1 if queue is unable to accept message, 0 otherwise
>> + * Returns true if queue is unable to accept message, false otherwise
>>   */
>> -static int rx_queue_full(struct tipc_msg *msg, u32 queue_size, u32 base)
>> +static bool rx_queue_full(struct tipc_msg *msg, u32 queue_size)
>>  {
>> -	u32 threshold;
>>  	u32 imp = msg_importance(msg);
>>  
>> -	if (imp == TIPC_LOW_IMPORTANCE)
>> -		threshold = base;
>> -	else if (imp == TIPC_MEDIUM_IMPORTANCE)
>> -		threshold = base * 2;
>> -	else if (imp == TIPC_HIGH_IMPORTANCE)
>> -		threshold = base * 100;
>> -	else
>> -		return 0;
>> +	if (unlikely(imp > TIPC_CRITICAL_IMPORTANCE))
>> +		return true;
> 
> This test is not necessary. Such messages have already been filtered out
> in tipc_recv_msg() at link level.
> The test msg_isdata(), which determines if a message should be sent up to
> the port/socket level, is  also an implicit test that 
> importance < TIPC_CRITICAL_IMPORTANCE.

(importance <= TIPC_CRITICAL_IMPORTANCE), of course.
This is an effect of the co-location of the user and importance fields in the
TIPC header. I.e., the importance is in reality coded into the value of
the user field.

To clarify (and improve) my previous suggestion, what I had in mind
was something like this:


                recv_q_len = skb_queue_len(&sk->sk_receive_queue);
-               if (unlikely(recv_q_len >= (OVERLOAD_LIMIT_BASE / 2))) {
-                       if (rx_queue_full(msg, recv_q_len,
-                           OVERLOAD_LIMIT_BASE / 2))
-                               return TIPC_ERR_OVERLOAD;
-               }
+               imp = msg_importance(msg);
+               if (unlikely(recv_q_len > (OVERLOAD_LIMIT_BASE << (imp << 1))))
+                       return TIPC_ERR_OVERLOAD;                 
        } else {
                if (msg_mcast(msg))
                        return TIPC_ERR_NO_PORT;

No need for any separate function at all, and guaranteed inline.
This will probably translate to one single shift-instruction extra,
and no memor access, since the compiler merge (imp << 1) with the 
shift operation used to read imp from the header.


The limits for message rejection become the following,
given an OVERLOAD_LIMIT_BASE of 10000:

TIP_LOW_IMPORTANCE:       10000    (previously 10000)
TIPC_MEDIUM_IMPORTANCE    40000    (previously 20000)
TIPC_MEDIUM_IMPORTANCE    1600000  (previously 1000000)
TIPC_CRITICAL_IMPORTANCE  25600000 (previously no limit)

///jon

> 
> 
>>  
>> -	if (msg_connected(msg))
>> -		threshold *= 4;
>> -
>> -	return queue_size >= threshold;
>> +	return queue_size >= msg_importance_factor[imp];
> 
> Ok. Less optimal than my suggestion, but also lower risk until we know
> the consequences of changing the multiplication factors.
> 
>>  }
>>  
>>  /**
>> @@ -1275,7 +1271,6 @@ static u32 filter_rcv(struct sock *sk, struct sk_buff *buf)
>>  {
>>  	struct socket *sock = sk->sk_socket;
>>  	struct tipc_msg *msg = buf_msg(buf);
>> -	u32 recv_q_len;
>>  	u32 res = TIPC_OK;
>>  
>>  	/* Reject message if it is wrong sort of message for socket */
>> @@ -1285,19 +1280,18 @@ static u32 filter_rcv(struct sock *sk, struct sk_buff *buf)
>>  	if (sock->state == SS_READY) {
>>  		if (msg_connected(msg))
>>  			return TIPC_ERR_NO_PORT;
>> +		/* Reject SOCK_DGRAM and SOCK_RDM message if there isn't room
>> +		 * to queue it
>> +		 */
>> +		if (unlikely(rx_queue_full(msg,
>> +			     skb_queue_len(&sk->sk_receive_queue))))
>> +			return TIPC_ERR_OVERLOAD;
>>  	} else {
>>  		res = filter_connect(tipc_sk(sk), &buf);
>>  		if (res != TIPC_OK || buf == NULL)
>>  			return res;
>>  	}
>>  
>> -	/* Reject message if there isn't room to queue it */
>> -	recv_q_len = skb_queue_len(&sk->sk_receive_queue);
>> -	if (unlikely(recv_q_len >= (OVERLOAD_LIMIT_BASE / 2))) {
>> -		if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE / 2))
>> -			return TIPC_ERR_OVERLOAD;
>> -	}
>> -
>>  	/* Enqueue message (finally!) */
>>  	TIPC_SKB_CB(buf)->handle = 0;
>>  	__skb_queue_tail(&sk->sk_receive_queue, buf);
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


------------------------------------------------------------------------------
LogMeIn Rescue: Anywhere, Anytime Remote support for IT. Free Trial
Remotely access PCs and mobile devices and provide instant support
Improve your efficiency, and focus on delivering more value-add services
Discover what IT Professionals Know. Rescue delivers
http://p.sf.net/sfu/logmein_12329d2d

^ permalink raw reply

* [PATCH net-next v3 20/22] bnx2x: Support VF FLR
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

The FLR indication arrives as an attention from the management processor.
Upon VF flr all FLRed function in the indication have already been
released by Firmware and now we basically need to free the resources
allocated to those VFs, and clean any remainders from the device
(FLR final cleanup).

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h       |    6 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |   19 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h   |    1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c |  294 +++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h |    7 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h  |    1 +
 6 files changed, 321 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index a7d9fb0..a2b8ffa 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -1879,7 +1879,13 @@ void bnx2x_prep_dmae_with_comp(struct bnx2x *bp, struct dmae_command *dmae,
 int bnx2x_issue_dmae_with_comp(struct bnx2x *bp, struct dmae_command *dmae);
 void bnx2x_dp_dmae(struct bnx2x *bp, struct dmae_command *dmae, int msglvl);
 
+/* FLR related routines */
+u32 bnx2x_flr_clnup_poll_count(struct bnx2x *bp);
+void bnx2x_tx_hw_flushed(struct bnx2x *bp, u32 poll_count);
+int bnx2x_send_final_clnup(struct bnx2x *bp, u8 clnup_func, u32 poll_cnt);
 u8 bnx2x_is_pcie_pending(struct pci_dev *dev);
+int bnx2x_flr_clnup_poll_hw_counter(struct bnx2x *bp, u32 reg,
+				    char *msg, u32 poll_cnt);
 
 void bnx2x_calc_fc_adv(struct bnx2x *bp);
 int bnx2x_sp_post(struct bnx2x *bp, int command, int cid,
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 384f303..91a2eb5 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -1095,8 +1095,8 @@ static u32 bnx2x_flr_clnup_reg_poll(struct bnx2x *bp, u32 reg,
 	return val;
 }
 
-static int bnx2x_flr_clnup_poll_hw_counter(struct bnx2x *bp, u32 reg,
-					   char *msg, u32 poll_cnt)
+int bnx2x_flr_clnup_poll_hw_counter(struct bnx2x *bp, u32 reg,
+				    char *msg, u32 poll_cnt)
 {
 	u32 val = bnx2x_flr_clnup_reg_poll(bp, reg, 0, poll_cnt);
 	if (val != 0) {
@@ -1106,7 +1106,8 @@ static int bnx2x_flr_clnup_poll_hw_counter(struct bnx2x *bp, u32 reg,
 	return 0;
 }
 
-static u32 bnx2x_flr_clnup_poll_count(struct bnx2x *bp)
+/* Common routines with VF FLR cleanup */
+u32 bnx2x_flr_clnup_poll_count(struct bnx2x *bp)
 {
 	/* adjust polling timeout */
 	if (CHIP_REV_IS_EMUL(bp))
@@ -1118,7 +1119,7 @@ static u32 bnx2x_flr_clnup_poll_count(struct bnx2x *bp)
 	return FLR_POLL_CNT;
 }
 
-static void bnx2x_tx_hw_flushed(struct bnx2x *bp, u32 poll_count)
+void bnx2x_tx_hw_flushed(struct bnx2x *bp, u32 poll_count)
 {
 	struct pbf_pN_cmd_regs cmd_regs[] = {
 		{0, (CHIP_IS_E3B0(bp)) ?
@@ -1193,8 +1194,7 @@ static void bnx2x_tx_hw_flushed(struct bnx2x *bp, u32 poll_count)
 	(((index) << SDM_OP_GEN_AGG_VECT_IDX_SHIFT) & SDM_OP_GEN_AGG_VECT_IDX)
 
 
-static int bnx2x_send_final_clnup(struct bnx2x *bp, u8 clnup_func,
-					 u32 poll_cnt)
+int bnx2x_send_final_clnup(struct bnx2x *bp, u8 clnup_func, u32 poll_cnt)
 {
 	struct sdm_op_gen op_gen = {0};
 
@@ -1219,7 +1219,8 @@ static int bnx2x_send_final_clnup(struct bnx2x *bp, u8 clnup_func,
 		BNX2X_ERR("FW final cleanup did not succeed\n");
 		DP(BNX2X_MSG_SP, "At timeout completion address contained %x\n",
 		   (REG_RD(bp, comp_addr)));
-		ret = 1;
+		bnx2x_panic();
+		return 1;
 	}
 	/* Zero completion for nxt FLR */
 	REG_WR(bp, comp_addr, 0);
@@ -3901,6 +3902,10 @@ static void bnx2x_attn_int_deasserted3(struct bnx2x *bp, u32 attn)
 
 			if (val & DRV_STATUS_DRV_INFO_REQ)
 				bnx2x_handle_drv_info_req(bp);
+
+			if (val & DRV_STATUS_VF_DISABLED)
+				bnx2x_vf_handle_flr_event(bp);
+
 			if ((bp->port.pmf == 0) && (val & DRV_STATUS_PMF))
 				bnx2x_pmf_update(bp);
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
index 1008d3f..a015965 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
@@ -876,6 +876,7 @@
 #define HC_CONFIG_0_REG_MSI_MSIX_INT_EN_0			 (0x1<<2)
 #define HC_CONFIG_0_REG_SINGLE_ISR_EN_0				 (0x1<<1)
 #define HC_CONFIG_1_REG_BLOCK_DISABLE_1				 (0x1<<0)
+#define DORQ_REG_VF_USAGE_CNT					 0x170320
 #define HC_REG_AGG_INT_0					 0x108050
 #define HC_REG_AGG_INT_1					 0x108054
 #define HC_REG_ATTN_BIT 					 0x108120
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index 658915f..a0a019c0 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -137,6 +137,17 @@ enum bnx2x_vfop_mcast_state {
 	   BNX2X_VFOP_MCAST_ADD,
 	   BNX2X_VFOP_MCAST_CHK_DONE
 };
+enum bnx2x_vfop_qflr_state {
+	   BNX2X_VFOP_QFLR_CLR_VLAN,
+	   BNX2X_VFOP_QFLR_CLR_MAC,
+	   BNX2X_VFOP_QFLR_TERMINATE,
+	   BNX2X_VFOP_QFLR_DONE
+};
+
+enum bnx2x_vfop_flr_state {
+	   BNX2X_VFOP_FLR_QUEUES,
+	   BNX2X_VFOP_FLR_HW
+};
 
 enum bnx2x_vfop_close_state {
 	   BNX2X_VFOP_CLOSE_QUEUES,
@@ -974,6 +985,94 @@ int bnx2x_vfop_qsetup_cmd(struct bnx2x *bp,
 	return -ENOMEM;
 }
 
+/* VFOP queue FLR handling (clear vlans, clear macs, queue destructor) */
+static void bnx2x_vfop_qflr(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	int qid = vfop->args.qx.qid;
+	enum bnx2x_vfop_qflr_state state = vfop->state;
+	struct bnx2x_queue_state_params *qstate;
+	struct bnx2x_vfop_cmd cmd;
+
+	bnx2x_vfop_reset_wq(vf);
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	DP(BNX2X_MSG_IOV, "VF[%d] STATE: %d\n", vf->abs_vfid, state);
+
+	cmd.done = bnx2x_vfop_qflr;
+	cmd.block = false;
+
+	switch (state) {
+	case BNX2X_VFOP_QFLR_CLR_VLAN:
+		/* vlan-clear-all: driver-only, don't consume credit */
+		vfop->state = BNX2X_VFOP_QFLR_CLR_MAC;
+		vfop->rc = bnx2x_vfop_vlan_delall_cmd(bp, vf, &cmd, qid, true);
+		if (vfop->rc)
+			goto op_err;
+		return;
+
+	case BNX2X_VFOP_QFLR_CLR_MAC:
+		/* mac-clear-all: driver only consume credit */
+		vfop->state = BNX2X_VFOP_QFLR_TERMINATE;
+		vfop->rc = bnx2x_vfop_mac_delall_cmd(bp, vf, &cmd, qid, true);
+		DP(BNX2X_MSG_IOV,
+		   "VF[%d] vfop->rc after bnx2x_vfop_mac_delall_cmd was %d",
+		   vf->abs_vfid, vfop->rc);
+		if (vfop->rc)
+			goto op_err;
+		return;
+
+	case BNX2X_VFOP_QFLR_TERMINATE:
+		qstate = &vfop->op_p->qctor.qstate;
+		memset(qstate , 0, sizeof(*qstate));
+		qstate->q_obj = &bnx2x_vfq(vf, qid, sp_obj);
+		vfop->state = BNX2X_VFOP_QFLR_DONE;
+
+		DP(BNX2X_MSG_IOV, "VF[%d] qstate during flr was %d",
+		   vf->abs_vfid, qstate->q_obj->state);
+
+		if (qstate->q_obj->state != BNX2X_Q_STATE_RESET) {
+			qstate->q_obj->state = BNX2X_Q_STATE_STOPPED;
+			qstate->cmd = BNX2X_Q_CMD_TERMINATE;
+			vfop->rc = bnx2x_queue_state_change(bp, qstate);
+			bnx2x_vfop_finalize(vf, vfop->rc, VFOP_VERIFY_PEND);
+		} else {
+			goto op_done;
+		}
+
+op_err:
+	BNX2X_ERR("QFLR[%d:%d] error: rc %d\n",
+		  vf->abs_vfid, qid, vfop->rc);
+op_done:
+	case BNX2X_VFOP_QFLR_DONE:
+		bnx2x_vfop_end(bp, vf, vfop);
+		return;
+	default:
+		bnx2x_vfop_default(state);
+	}
+op_pending:
+	return;
+}
+
+static int bnx2x_vfop_qflr_cmd(struct bnx2x *bp,
+			       struct bnx2x_virtf *vf,
+			       struct bnx2x_vfop_cmd *cmd,
+			       int qid)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+
+	if (vfop) {
+		vfop->args.qx.qid = qid;
+		bnx2x_vfop_opset(BNX2X_VFOP_QFLR_CLR_VLAN,
+				 bnx2x_vfop_qflr, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_qflr,
+					     cmd->block);
+	}
+	return -ENOMEM;
+}
+
 /* VFOP multi-casts */
 static void bnx2x_vfop_mcast(struct bnx2x *bp, struct bnx2x_virtf *vf)
 {
@@ -1433,6 +1532,201 @@ static void bnx2x_vf_free_resc(struct bnx2x *bp, struct bnx2x_virtf *vf)
 	vf->state = VF_FREE;
 }
 
+static void bnx2x_vf_flr_clnup_hw(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	u32 poll_cnt = bnx2x_flr_clnup_poll_count(bp);
+
+	/* DQ usage counter */
+	bnx2x_pretend_func(bp, HW_VF_HANDLE(bp, vf->abs_vfid));
+	bnx2x_flr_clnup_poll_hw_counter(bp, DORQ_REG_VF_USAGE_CNT,
+					"DQ VF usage counter timed out",
+					poll_cnt);
+	bnx2x_pretend_func(bp, BP_ABS_FUNC(bp));
+
+	/* FW cleanup command - poll for the results */
+	if (bnx2x_send_final_clnup(bp, (u8)FW_VF_HANDLE(vf->abs_vfid),
+				   poll_cnt))
+		BNX2X_ERR("VF[%d] Final cleanup timed-out\n", vf->abs_vfid);
+
+	/* verify TX hw is flushed */
+	bnx2x_tx_hw_flushed(bp, poll_cnt);
+}
+
+static void bnx2x_vfop_flr(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	struct bnx2x_vfop_args_qx *qx = &vfop->args.qx;
+	enum bnx2x_vfop_flr_state state = vfop->state;
+	struct bnx2x_vfop_cmd cmd = {
+		.done = bnx2x_vfop_flr,
+		.block = false,
+	};
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	DP(BNX2X_MSG_IOV, "vf[%d] STATE: %d\n", vf->abs_vfid, state);
+
+	switch (state) {
+	case BNX2X_VFOP_FLR_QUEUES:
+		/* the cleanup operations are valid if and only if the VF
+		 * was first acquired.
+		 */
+		if (++(qx->qid) < vf_rxq_count(vf)) {
+			vfop->rc = bnx2x_vfop_qflr_cmd(bp, vf, &cmd,
+						       qx->qid);
+			if (vfop->rc)
+				goto op_err;
+			return;
+		}
+		/* remove multicasts */
+		vfop->state = BNX2X_VFOP_FLR_HW;
+		vfop->rc = bnx2x_vfop_mcast_cmd(bp, vf, &cmd, NULL,
+						0, true);
+		if (vfop->rc)
+			goto op_err;
+		return;
+	case BNX2X_VFOP_FLR_HW:
+
+		/* dispatch final cleanup and wait for HW queues to flush */
+		bnx2x_vf_flr_clnup_hw(bp, vf);
+
+		/* release VF resources */
+		bnx2x_vf_free_resc(bp, vf);
+
+		/* re-open the mailbox */
+		bnx2x_vf_enable_mbx(bp, vf->abs_vfid);
+
+		goto op_done;
+	default:
+		bnx2x_vfop_default(state);
+	}
+op_err:
+	BNX2X_ERR("VF[%d] FLR error: rc %d\n", vf->abs_vfid, vfop->rc);
+op_done:
+	vf->flr_clnup_stage = VF_FLR_ACK;
+	bnx2x_vfop_end(bp, vf, vfop);
+	bnx2x_unlock_vf_pf_channel(bp, vf, CHANNEL_TLV_FLR);
+}
+
+static int bnx2x_vfop_flr_cmd(struct bnx2x *bp,
+			      struct bnx2x_virtf *vf,
+			      vfop_handler_t done)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+	if (vfop) {
+		vfop->args.qx.qid = -1; /* loop */
+		bnx2x_vfop_opset(BNX2X_VFOP_FLR_QUEUES,
+				 bnx2x_vfop_flr, done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_flr, false);
+	}
+	return -ENOMEM;
+}
+
+static void bnx2x_vf_flr_clnup(struct bnx2x *bp, struct bnx2x_virtf *prev_vf)
+{
+	int i = prev_vf ? prev_vf->index + 1 : 0;
+	struct bnx2x_virtf *vf;
+
+	/* find next VF to cleanup */
+next_vf_to_clean:
+	for (;
+	     i < BNX2X_NR_VIRTFN(bp) &&
+	     (bnx2x_vf(bp, i, state) != VF_RESET ||
+	      bnx2x_vf(bp, i, flr_clnup_stage) != VF_FLR_CLN);
+	     i++)
+		;
+
+	DP(BNX2X_MSG_IOV, "next vf to cleanup: %d. num of vfs: %d\n", i,
+	   BNX2X_NR_VIRTFN(bp));
+
+	if (i < BNX2X_NR_VIRTFN(bp)) {
+		vf = BP_VF(bp, i);
+
+		/* lock the vf pf channel */
+		bnx2x_lock_vf_pf_channel(bp, vf, CHANNEL_TLV_FLR);
+
+		/* invoke the VF FLR SM */
+		if (bnx2x_vfop_flr_cmd(bp, vf, bnx2x_vf_flr_clnup)) {
+			BNX2X_ERR("VF[%d]: FLR cleanup failed -ENOMEM\n",
+				  vf->abs_vfid);
+
+			/* mark the VF to be ACKED and continue */
+			vf->flr_clnup_stage = VF_FLR_ACK;
+			goto next_vf_to_clean;
+		}
+		return;
+	}
+
+	/* we are done, update vf records */
+	for_each_vf(bp, i) {
+		vf = BP_VF(bp, i);
+
+		if (vf->flr_clnup_stage != VF_FLR_ACK)
+			continue;
+
+		vf->flr_clnup_stage = VF_FLR_EPILOG;
+	}
+
+	/* Acknowledge the handled VFs.
+	 * we are acknowledge all the vfs which an flr was requested for, even
+	 * if amongst them there are such that we never opened, since the mcp
+	 * will interrupt us immediately again if we only ack some of the bits,
+	 * resulting in an endless loop. This can happen for example in KVM
+	 * where an 'all ones' flr request is sometimes given by hyper visor
+	 */
+	DP(BNX2X_MSG_MCP, "DRV_STATUS_VF_DISABLED ACK for vfs 0x%x 0x%x\n",
+	   bp->vfdb->flrd_vfs[0], bp->vfdb->flrd_vfs[1]);
+	for (i = 0; i < FLRD_VFS_DWORDS; i++)
+		SHMEM2_WR(bp, drv_ack_vf_disabled[BP_FW_MB_IDX(bp)][i],
+			  bp->vfdb->flrd_vfs[i]);
+
+	bnx2x_fw_command(bp, DRV_MSG_CODE_VF_DISABLED_DONE, 0);
+
+	/* clear the acked bits - better yet if the MCP implemented
+	 * write to clear semantics
+	 */
+	for (i = 0; i < FLRD_VFS_DWORDS; i++)
+		SHMEM2_WR(bp, drv_ack_vf_disabled[BP_FW_MB_IDX(bp)][i], 0);
+}
+
+void bnx2x_vf_handle_flr_event(struct bnx2x *bp)
+{
+	int i;
+
+	/* Read FLR'd VFs */
+	for (i = 0; i < FLRD_VFS_DWORDS; i++)
+		bp->vfdb->flrd_vfs[i] = SHMEM2_RD(bp, mcp_vf_disabled[i]);
+
+	DP(BNX2X_MSG_MCP,
+	   "DRV_STATUS_VF_DISABLED received for vfs 0x%x 0x%x\n",
+	   bp->vfdb->flrd_vfs[0], bp->vfdb->flrd_vfs[1]);
+
+	for_each_vf(bp, i) {
+		struct bnx2x_virtf *vf = BP_VF(bp, i);
+		u32 reset = 0;
+
+		if (vf->abs_vfid < 32)
+			reset = bp->vfdb->flrd_vfs[0] & (1 << vf->abs_vfid);
+		else
+			reset = bp->vfdb->flrd_vfs[1] &
+				(1 << (vf->abs_vfid - 32));
+
+		if (reset) {
+			/* set as reset and ready for cleanup */
+			vf->state = VF_RESET;
+			vf->flr_clnup_stage = VF_FLR_CLN;
+
+			DP(BNX2X_MSG_IOV,
+			   "Initiating Final cleanup for VF %d\n",
+			   vf->abs_vfid);
+		}
+	}
+
+	/* do the FLR cleanup for all marked VFs*/
+	bnx2x_vf_flr_clnup(bp, NULL);
+}
+
 /* IOV global initialization routines  */
 void bnx2x_iov_init_dq(struct bnx2x *bp)
 {
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index 8e65b29..8ffef4e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -685,9 +685,16 @@ int bnx2x_vfop_release_cmd(struct bnx2x *bp,
 void bnx2x_vf_release(struct bnx2x *bp, struct bnx2x_virtf *vf, bool block);
 int bnx2x_vf_idx_by_abs_fid(struct bnx2x *bp, u16 abs_vfid);
 u8 bnx2x_vf_max_queue_cnt(struct bnx2x *bp, struct bnx2x_virtf *vf);
+
+/* FLR routines */
+
 /* VF FLR helpers */
 int bnx2x_vf_flr_clnup_epilog(struct bnx2x *bp, u8 abs_vfid);
 void bnx2x_vf_enable_access(struct bnx2x *bp, u8 abs_vfid);
+
+/* Handles an FLR (or VF_DISABLE) notification form the MCP */
+void bnx2x_vf_handle_flr_event(struct bnx2x *bp);
+
 void bnx2x_add_tlv(struct bnx2x *bp, void *tlvs_list, u16 offset, u16 type,
 		   u16 length);
 void bnx2x_vfpf_prep(struct bnx2x *bp, struct vfpf_first_tlv *first_tlv,
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h
index c4d4891..4ab176b 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h
@@ -311,6 +311,7 @@ enum channel_tlvs {
 	   CHANNEL_TLV_RELEASE,
 	   CHANNEL_TLV_PF_RELEASE_VF,
 	   CHANNEL_TLV_LIST_END,
+	   CHANNEL_TLV_FLR,
 	   CHANNEL_TLV_MAX
 };
 
-- 
1.7.9.GIT

^ permalink raw reply related

* [PATCH net-next v3 21/22] bnx2x: Support PF <-> VF Bulletin Board
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

The PF <-> VF Bulletin Board is a simple interface between the
PF and the VF. The main reason for the Bulletin Board is to allow
the PF to be the initiator. The VF publishes at 'acquire' stage
the GPA of a Bulletin Board structure it has allocated. The PF notes
this GPA in the VF database. The VF samples the Bulletin Board
periodically for new messages. The latest version of the BB is always
used.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h       |    6 ++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |   89 +++++++++++++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h   |    2 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |   96 ++++++++++++++++++++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c |   13 +++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h |   18 ++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c  |   67 ++++++++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h  |   37 ++++++++
 8 files changed, 327 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index a2b8ffa..d6c89d6 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -1259,6 +1259,12 @@ struct bnx2x {
 	/* we set aside a copy of the acquire response */
 	struct pfvf_acquire_resp_tlv acquire_resp;
 
+	/* bulletin board for messages from pf to vf*/
+	union pf_vf_bulletin   *pf2vf_bulletin;
+	dma_addr_t		pf2vf_bulletin_mapping;
+
+	struct pf_vf_bulletin_content	old_bulletin;
+
 	struct net_device	*dev;
 	struct pci_dev		*pdev;
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 0a00c4c..67f7334 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -3787,6 +3787,95 @@ int bnx2x_setup_tc(struct net_device *dev, u8 num_tc)
 	return 0;
 }
 
+/* New mac for VF. Consider these cases:
+ * 1. VF hasn't been acquired yet - save the mac in local bulletin board and
+ *    supply at acquire.
+ * 2. VF has already been acquired but has not yet initialized - store in local
+ *    bulletin board. mac will be posted on VF bulletin board after VF init. VF
+ *    will configure this mac when it is ready.
+ * 3. VF has already initialized but has not yet setup a queue - post the new
+ *    mac on VF's bulletin board right now. VF will configure this mac when it
+ *    is ready.
+ * 4. VF has already set a queue - delete any macs already configured for this
+ *    queue and manually config the new mac.
+ * In any event, once this function has been called refuse any attempts by the
+ * VF to configure any mac for itself except for this mac. In case of a race
+ * where the VF fails to see the new post on its bulletin board before sending a
+ * mac configuration request, the PF will simply fail the request and VF can try
+ * again after consulting its bulletin board
+ */
+int bnx2x_set_vf_mac(struct net_device *dev, int queue, u8 *mac)
+{
+
+	struct bnx2x *bp = netdev_priv(dev);
+	int rc, q_logical_state, vfidx = queue;
+	struct bnx2x_virtf *vf = BP_VF(bp, vfidx);
+	struct pf_vf_bulletin_content *bulletin = BP_VF_BULLETIN(bp, vfidx);
+
+	/* if SRIOV is disabled there is nothing to do (and somewhere, someone
+	 * has erred).
+	 */
+	if (!IS_SRIOV(bp)) {
+		BNX2X_ERR("bnx2x_set_vf_mac called though sriov is disabled\n");
+		return -EINVAL;
+	}
+
+	if (!is_valid_ether_addr(mac)) {
+		BNX2X_ERR("mac address invalid\n");
+		return -EINVAL;
+	}
+
+	/* update PF's copy of the VF's bulletin. will no longer accept mac
+	 * configuration requests from vf unless match this mac
+	 */
+	bulletin->valid_bitmap |= 1 << MAC_ADDR_VALID;
+	memcpy(bulletin->mac, mac, ETH_ALEN);
+
+	/* Post update on VF's bulletin board */
+	rc = bnx2x_post_vf_bulletin(bp, vfidx);
+	if (rc) {
+		BNX2X_ERR("failed to update VF[%d] bulletin", vfidx);
+		return rc;
+	}
+
+	/* is vf initialized and queue set up? */
+	q_logical_state =
+		bnx2x_get_q_logical_state(bp, &bnx2x_vfq(vf, 0, sp_obj));
+	if (vf->state == VF_ENABLED &&
+	    q_logical_state == BNX2X_Q_LOGICAL_STATE_ACTIVE) {
+
+		/* configure the mac in device on this vf's queue */
+		unsigned long flags = 0;
+		struct bnx2x_vlan_mac_obj *mac_obj = &bnx2x_vfq(vf, 0, mac_obj);
+
+		/* must lock vfpf channel to protect against vf flows */
+		bnx2x_lock_vf_pf_channel(bp, vf, CHANNEL_TLV_PF_SET_MAC);
+
+		/* remove existing eth macs */
+		rc = bnx2x_del_all_macs(bp, mac_obj, BNX2X_ETH_MAC, true);
+		if (rc) {
+			BNX2X_ERR("failed to delete eth macs\n");
+			return -EINVAL;
+		}
+
+		/* remove existing uc list macs */
+		rc = bnx2x_del_all_macs(bp, mac_obj, BNX2X_UC_LIST_MAC, true);
+		if (rc) {
+			BNX2X_ERR("failed to delete uc_list macs\n");
+			return -EINVAL;
+		}
+
+		/* configure the new mac to device */
+		__set_bit(RAMROD_COMP_WAIT, &flags);
+		bnx2x_set_mac_one(bp, (u8 *)&bulletin->mac, mac_obj, true,
+				  BNX2X_ETH_MAC, &flags);
+
+		bnx2x_unlock_vf_pf_channel(bp, vf, CHANNEL_TLV_PF_SET_MAC);
+	}
+
+	return rc;
+}
+
 /* called with rtnl_lock */
 int bnx2x_change_mac_addr(struct net_device *dev, void *p)
 {
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
index cd1eaff..23a1fa9 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
@@ -496,6 +496,8 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev);
 /* setup_tc callback */
 int bnx2x_setup_tc(struct net_device *dev, u8 num_tc);
 
+int bnx2x_set_vf_mac(struct net_device *dev, int queue, u8 *mac);
+
 /* select_queue callback */
 u16 bnx2x_select_queue(struct net_device *dev, struct sk_buff *skb);
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 91a2eb5..b44ef79 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -5247,6 +5247,63 @@ void bnx2x_drv_pulse(struct bnx2x *bp)
 		 bp->fw_drv_pulse_wr_seq);
 }
 
+/* crc is the first field in the bulletin board. compute the crc over the
+ * entire bulletin board excluding the crc field itself
+ */
+u32 bnx2x_crc_vf_bulletin(struct bnx2x *bp,
+			  struct pf_vf_bulletin_content *bulletin)
+{
+	return crc32(BULLETIN_CRC_SEED,
+		 ((u8 *)bulletin) + sizeof(bulletin->crc),
+		 BULLETIN_CONTENT_SIZE - sizeof(bulletin->crc));
+}
+
+/* Check for new posts on the bulletin board */
+enum sample_bulletin_result bnx2x_sample_bulletin(struct bnx2x *bp)
+{
+	struct pf_vf_bulletin_content bulletin = bp->pf2vf_bulletin->content;
+	int attempts;
+
+	/* bulletin board hasn't changed since last sample */
+	if (bp->old_bulletin.version == bulletin.version)
+		return PFVF_BULLETIN_UNCHANGED;
+
+	/* validate crc of new bulletin board */
+	if (bp->old_bulletin.version != bp->pf2vf_bulletin->content.version) {
+
+		/* sampling structure in mid post may result with corrupted data
+		 * validate crc to ensure coherency.
+		 */
+		for (attempts = 0; attempts < BULLETIN_ATTEMPTS; attempts++) {
+			bulletin = bp->pf2vf_bulletin->content;
+			if (bulletin.crc == bnx2x_crc_vf_bulletin(bp,
+								  &bulletin))
+				break;
+
+			BNX2X_ERR("bad crc on bulletin board. contained %x computed %x\n",
+				  bulletin.crc,
+				  bnx2x_crc_vf_bulletin(bp, &bulletin));
+		}
+		if (attempts >= BULLETIN_ATTEMPTS) {
+			BNX2X_ERR("pf to vf bulletin board crc was wrong %d consecutive times. Aborting\n",
+				  attempts);
+			return PFVF_BULLETIN_CRC_ERR;
+		}
+	}
+
+	/* the mac address in bulletin board is valid and is new */
+	if (bulletin.valid_bitmap & 1 << MAC_ADDR_VALID &&
+	    memcmp(bulletin.mac, bp->old_bulletin.mac, ETH_ALEN)) {
+
+		/* update new mac to net device */
+		memcpy(bp->dev->dev_addr, bulletin.mac, ETH_ALEN);
+	}
+
+	/* copy new bulletin board to bp */
+	bp->old_bulletin = bulletin;
+
+	return PFVF_BULLETIN_UPDATED;
+}
 
 static void bnx2x_timer(unsigned long data)
 {
@@ -5283,6 +5340,10 @@ static void bnx2x_timer(unsigned long data)
 	if (bp->state == BNX2X_STATE_OPEN)
 		bnx2x_stats_handle(bp, STATS_EVENT_UPDATE);
 
+	/* sample pf vf bulletin board for new posts from pf */
+	if (IS_VF(bp))
+		bnx2x_sample_bulletin(bp);
+
 	mod_timer(&bp->timer, jiffies + bp->current_interval);
 }
 
@@ -11665,7 +11726,7 @@ static const struct net_device_ops bnx2x_netdev_ops = {
 	.ndo_poll_controller	= poll_bnx2x,
 #endif
 	.ndo_setup_tc		= bnx2x_setup_tc,
-
+	.ndo_set_vf_mac		= bnx2x_set_vf_mac,
 #ifdef NETDEV_FCOE_WWNN
 	.ndo_fcoe_get_wwn	= bnx2x_fcoe_get_wwn,
 #endif
@@ -12327,6 +12388,11 @@ static int bnx2x_init_one(struct pci_dev *pdev,
 		/* allocate vf2pf mailbox for vf to pf channel */
 		BNX2X_PCI_ALLOC(bp->vf2pf_mbox, &bp->vf2pf_mbox_mapping,
 				sizeof(struct bnx2x_vf_mbx_msg));
+
+		/* allocate pf 2 vf bulletin board */
+		BNX2X_PCI_ALLOC(bp->pf2vf_bulletin, &bp->pf2vf_bulletin_mapping,
+				sizeof(union pf_vf_bulletin));
+
 	} else {
 		doorbell_size = BNX2X_L2_MAX_CID(bp) * (1 << BNX2X_DB_SHIFT);
 		if (doorbell_size > pci_resource_len(pdev, 2)) {
@@ -13385,6 +13451,9 @@ int bnx2x_vfpf_acquire(struct bnx2x *bp, u8 tx_count, u8 rx_count)
 	req->resc_request.num_mac_filters = VF_ACQUIRE_MAC_FILTERS;
 	req->resc_request.num_mc_filters = VF_ACQUIRE_MC_FILTERS;
 
+	/* pf 2 vf bulletin board address */
+	req->bulletin_addr = bp->pf2vf_bulletin_mapping;
+
 	/* add list termination tlv */
 	bnx2x_add_tlv(bp, req, req->first_tlv.tl.length, CHANNEL_TLV_LIST_END,
 		      sizeof(struct channel_list_end_tlv));
@@ -13741,6 +13810,9 @@ int bnx2x_vfpf_set_mac(struct bnx2x *bp)
 	req->filters[0].flags =
 		VFPF_Q_FILTER_DEST_MAC_VALID | VFPF_Q_FILTER_SET_MAC;
 
+	/* sample bulletin board for new mac */
+	bnx2x_sample_bulletin(bp);
+
 	/* copy mac from device to request */
 	memcpy(req->filters[0].mac, bp->dev->dev_addr, ETH_ALEN);
 
@@ -13758,6 +13830,28 @@ int bnx2x_vfpf_set_mac(struct bnx2x *bp)
 		return rc;
 	}
 
+	/* failure may mean PF was configured with a new mac for us */
+	while (resp->hdr.status == PFVF_STATUS_FAILURE) {
+
+		DP(BNX2X_MSG_IOV,
+		   "vfpf SET MAC failed. Check bulletin board for new posts\n");
+
+		/* check if bulletin board was updated */
+		if (bnx2x_sample_bulletin(bp) == PFVF_BULLETIN_UPDATED) {
+
+			/* copy mac from device to request */
+			memcpy(req->filters[0].mac, bp->dev->dev_addr,
+			       ETH_ALEN);
+
+			/* send message to pf */
+			rc = bnx2x_send_msg2pf(bp, &resp->hdr.status,
+					       bp->vf2pf_mbox_mapping);
+		} else {
+			/* no new info in bulletin */
+			break;
+		}
+	}
+
 	if (resp->hdr.status != PFVF_STATUS_SUCCESS) {
 		BNX2X_ERR("vfpf SET MAC failed: %d\n", resp->hdr.status);
 		return -EINVAL;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index a0a019c0..11d603f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -2060,6 +2060,10 @@ void bnx2x_iov_free_mem(struct bnx2x *bp)
 	BNX2X_PCI_FREE(BP_VF_MBX_DMA(bp)->addr,
 		       BP_VF_MBX_DMA(bp)->mapping,
 		       BP_VF_MBX_DMA(bp)->size);
+
+	BNX2X_PCI_FREE(BP_VF_BULLETIN_DMA(bp)->addr,
+		       BP_VF_BULLETIN_DMA(bp)->mapping,
+		       BP_VF_BULLETIN_DMA(bp)->size);
 }
 
 int bnx2x_iov_alloc_mem(struct bnx2x *bp)
@@ -2099,6 +2103,12 @@ int bnx2x_iov_alloc_mem(struct bnx2x *bp)
 			tot_size);
 	BP_VF_MBX_DMA(bp)->size = tot_size;
 
+	/* allocate local bulletin boards */
+	tot_size = BNX2X_NR_VIRTFN(bp) * BULLETIN_CONTENT_SIZE;
+	BNX2X_PCI_ALLOC(BP_VF_BULLETIN_DMA(bp)->addr,
+			&BP_VF_BULLETIN_DMA(bp)->mapping, tot_size);
+	BP_VF_BULLETIN_DMA(bp)->size = tot_size;
+
 	return 0;
 
 alloc_mem_err:
@@ -2815,6 +2825,9 @@ int bnx2x_vf_init(struct bnx2x *bp, struct bnx2x_virtf *vf, dma_addr_t *sb_map)
 
 	vf->state = VF_ENABLED;
 
+	/* update vf bulletin board */
+	bnx2x_post_vf_bulletin(bp, vf->index);
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index 8ffef4e..5c8aac1 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -379,6 +379,12 @@ struct bnx2x_vfdb {
 	struct bnx2x_vf_mbx	mbxs[BNX2X_MAX_NUM_OF_VFS];
 #define BP_VF_MBX(bp, vfid)	(&((bp)->vfdb->mbxs[(vfid)]))
 
+	struct hw_dma		bulletin_dma;
+#define BP_VF_BULLETIN_DMA(bp)	(&((bp)->vfdb->bulletin_dma))
+#define	BP_VF_BULLETIN(bp, vf) \
+	(((struct pf_vf_bulletin_content *)(BP_VF_BULLETIN_DMA(bp)->addr)) \
+	 + (vf))
+
 	struct hw_dma		sp_dma;
 #define bnx2x_vf_sp(bp, vf, field) ((bp)->vfdb->sp_dma.addr +		\
 		(vf)->index * sizeof(struct bnx2x_vf_sp) +		\
@@ -703,4 +709,16 @@ void bnx2x_dp_tlv_list(struct bnx2x *bp, void *tlvs_list);
 
 bool bnx2x_tlv_supported(u16 tlvtype);
 
+u32 bnx2x_crc_vf_bulletin(struct bnx2x *bp,
+			  struct pf_vf_bulletin_content *bulletin);
+int bnx2x_post_vf_bulletin(struct bnx2x *bp, int vf);
+
+enum sample_bulletin_result {
+	   PFVF_BULLETIN_UNCHANGED,
+	   PFVF_BULLETIN_UPDATED,
+	   PFVF_BULLETIN_CRC_ERR
+};
+
+enum sample_bulletin_result bnx2x_sample_bulletin(struct bnx2x *bp);
+
 #endif /* bnx2x_sriov.h */
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
index 65a3723..f246c63 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
@@ -301,6 +301,10 @@ static void bnx2x_vf_mbx_acquire_resp(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		resc->num_mc_filters = 0;
 
 		if (status == PFVF_STATUS_SUCCESS) {
+			/* fill in the allocated resources */
+			struct pf_vf_bulletin_content *bulletin =
+				BP_VF_BULLETIN(bp, vf->index);
+
 			for_each_vfq(vf, i)
 				resc->hw_qid[i] =
 					vfq_qzone_id(vf, vfq_get(vf, i));
@@ -309,6 +313,12 @@ static void bnx2x_vf_mbx_acquire_resp(struct bnx2x *bp, struct bnx2x_virtf *vf,
 				resc->hw_sbs[i].hw_sb_id = vf_igu_sb(vf, i);
 				resc->hw_sbs[i].sb_qid = vf_hc_qzone(vf, i);
 			}
+
+			/* if a mac has been set for this vf, supply it */
+			if (bulletin->valid_bitmap & 1 << MAC_ADDR_VALID) {
+				memcpy(resc->current_mac_addr, bulletin->mac,
+				       ETH_ALEN);
+			}
 		}
 	}
 
@@ -360,6 +370,9 @@ static void bnx2x_vf_mbx_acquire(struct bnx2x *bp, struct bnx2x_virtf *vf,
 	/* acquire the resources */
 	rc = bnx2x_vf_acquire(bp, vf, &acquire->resc_request);
 
+	/* store address of vf's bulletin board */
+	vf->bulletin_map = acquire->bulletin_addr;
+
 	/* response */
 	bnx2x_vf_mbx_acquire_resp(bp, vf, mbx, rc);
 }
@@ -771,11 +784,39 @@ static void bnx2x_vf_mbx_set_q_filters(struct bnx2x *bp,
 				       struct bnx2x_vf_mbx *mbx)
 {
 	struct vfpf_set_q_filters_tlv *filters = &mbx->msg->req.set_q_filters;
+	struct pf_vf_bulletin_content *bulletin = BP_VF_BULLETIN(bp, vf->index);
 	struct bnx2x_vfop_cmd cmd = {
 		.done = bnx2x_vf_mbx_resp,
 		.block = false,
 	};
 
+	/* if a mac was already set for this VF via the set vf mac ndo, we only
+	 * accept mac configurations of that mac. Why accept them at all?
+	 * because PF may have been unable to configure the mac at the time
+	 * since queue was not set up.
+	 */
+	if (bulletin->valid_bitmap & 1 << MAC_ADDR_VALID) {
+
+		/* once a mac was set by ndo can only accept a single mac... */
+		if (filters->n_mac_vlan_filters > 1) {
+			BNX2X_ERR("VF[%d] requested the addition of multiple macs after set_vf_mac ndo was called\n",
+				  vf->abs_vfid);
+			vf->op_rc = -EPERM;
+			goto response;
+		}
+
+		/* ...and only the mac set by the ndo */
+		if (filters->n_mac_vlan_filters == 1 &&
+		    memcmp(filters->filters->mac, bulletin->mac, ETH_ALEN)) {
+
+			BNX2X_ERR("VF[%d] requested the addition of a mac address not matching the one configured by set_vf_mac ndo\n",
+				  vf->abs_vfid);
+
+			vf->op_rc = -EPERM;
+			goto response;
+		}
+	}
+
 	/* verify vf_qid */
 	if (filters->vf_qid > vf_rxq_count(vf))
 		goto response;
@@ -987,3 +1028,29 @@ mbx_error:
 mbx_done:
 	return;
 }
+
+/* propagate local bulletin board to vf */
+int bnx2x_post_vf_bulletin(struct bnx2x *bp, int vf)
+{
+	struct pf_vf_bulletin_content *bulletin = BP_VF_BULLETIN(bp, vf);
+	dma_addr_t pf_addr = BP_VF_BULLETIN_DMA(bp)->mapping +
+		vf * BULLETIN_CONTENT_SIZE;
+	dma_addr_t vf_addr = bnx2x_vf(bp, vf, bulletin_map);
+	u32 len = BULLETIN_CONTENT_SIZE;
+	int rc;
+
+	/* can only update vf after init took place */
+	if (bnx2x_vf(bp, vf, state) != VF_ENABLED &&
+	    bnx2x_vf(bp, vf, state) != VF_ACQUIRED)
+		return 0;
+
+	/* increment bulletin board version and compute crc */
+	bulletin->version++;
+	bulletin->crc = bnx2x_crc_vf_bulletin(bp, bulletin);
+
+	/* propagate bulletin board via dmae to vm memory */
+	rc = bnx2x_copy32_vf_dmae(bp, false, pf_addr,
+				  bnx2x_vf(bp, vf, abs_vfid), U64_HI(vf_addr),
+				  U64_LO(vf_addr), len/4);
+	return rc;
+}
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h
index 4ab176b..74d2831 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h
@@ -37,6 +37,7 @@ struct hw_sb_info {
  * A.K.A VF-PF mailbox
  */
 #define TLV_BUFFER_SIZE			1024
+#define PF_VF_BULLETIN_SIZE		512
 
 #define VFPF_QUEUE_FLG_TPA		0x0001
 #define VFPF_QUEUE_FLG_TPA_IPV6		0x0002
@@ -60,6 +61,9 @@ struct hw_sb_info {
 #define VFPF_RX_MASK_ACCEPT_ALL_UNICAST		0x00000004
 #define VFPF_RX_MASK_ACCEPT_ALL_MULTICAST	0x00000008
 #define VFPF_RX_MASK_ACCEPT_BROADCAST		0x00000010
+#define BULLETIN_CONTENT_SIZE		(sizeof(struct pf_vf_bulletin_content))
+#define BULLETIN_ATTEMPTS	5 /* crc failures before throwing towel */
+#define BULLETIN_CRC_SEED	0
 
 enum {
 	PFVF_STATUS_WAITING = 0,
@@ -298,6 +302,38 @@ union pfvf_tlvs {
 	struct tlv_buffer_size		tlv_buf_size;
 };
 
+/* This is a structure which is allocated in the VF, which the PF may update
+ * when it deems it necessary to do so. The bulletin board is sampled
+ * periodically by the VF. A copy per VF is maintained in the PF (to prevent
+ * loss of data upon multiple updates (or the need for read modify write)).
+ */
+struct pf_vf_bulletin_size {
+	u8 size[PF_VF_BULLETIN_SIZE];
+};
+
+struct pf_vf_bulletin_content {
+	u32 crc;			/* crc of structure to ensure is not in
+					 * mid-update
+					 */
+	u32 version;
+
+	aligned_u64 valid_bitmap;	/* bitmap indicating which fields
+					 * hold valid values
+					 */
+
+#define MAC_ADDR_VALID		0	/* alert the vf that a new mac address
+					 * is available for it
+					 */
+
+	u8 mac[ETH_ALEN];
+	u8 padding[2];
+};
+
+union pf_vf_bulletin {
+	struct pf_vf_bulletin_content content;
+	struct pf_vf_bulletin_size size;
+};
+
 #define MAX_TLVS_IN_LIST 50
 
 enum channel_tlvs {
@@ -312,6 +348,7 @@ enum channel_tlvs {
 	   CHANNEL_TLV_PF_RELEASE_VF,
 	   CHANNEL_TLV_LIST_END,
 	   CHANNEL_TLV_FLR,
+	   CHANNEL_TLV_PF_SET_MAC,
 	   CHANNEL_TLV_MAX
 };
 
-- 
1.7.9.GIT

^ permalink raw reply related

* [PATCH net-next v3 22/22] bnx2x: Add VF device ids and enable feature
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

Add the various VF device ids (of all supported hardware)
Add the calls to enable_sriov and disable_sriov to enable the
SR-IOV feature. This patch also advances the version and release
date of the bnx2x module.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h       |   22 +++++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |   85 +++++++++++++++++++--
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c |    4 +
 3 files changed, 101 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index d6c89d6..08afced 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -26,8 +26,8 @@
  * (you will need to reboot afterwards) */
 /* #define BNX2X_STOP_ON_ERROR */
 
-#define DRV_MODULE_VERSION      "1.78.00-0"
-#define DRV_MODULE_RELDATE      "2012/09/27"
+#define DRV_MODULE_VERSION      "1.78.01-0"
+#define DRV_MODULE_RELDATE      "2012/10/30"
 #define BNX2X_BC_VER            0x040200
 
 #if defined(CONFIG_DCB)
@@ -802,36 +802,46 @@ struct bnx2x_common {
 #define CHIP_NUM_57711E			0x1650
 #define CHIP_NUM_57712			0x1662
 #define CHIP_NUM_57712_MF		0x1663
+#define CHIP_NUM_57712_VF		0x166f
 #define CHIP_NUM_57713			0x1651
 #define CHIP_NUM_57713E			0x1652
 #define CHIP_NUM_57800			0x168a
 #define CHIP_NUM_57800_MF		0x16a5
+#define CHIP_NUM_57800_VF		0x16a9
 #define CHIP_NUM_57810			0x168e
 #define CHIP_NUM_57810_MF		0x16ae
+#define CHIP_NUM_57810_VF		0x16af
 #define CHIP_NUM_57811			0x163d
 #define CHIP_NUM_57811_MF		0x163e
+#define CHIP_NUM_57811_VF		0x163f
 #define CHIP_NUM_57840_OBSOLETE	0x168d
 #define CHIP_NUM_57840_MF_OBSOLETE	0x16ab
 #define CHIP_NUM_57840_4_10		0x16a1
 #define CHIP_NUM_57840_2_20		0x16a2
 #define CHIP_NUM_57840_MF		0x16a4
+#define CHIP_NUM_57840_VF		0x16ad
 #define CHIP_IS_E1(bp)			(CHIP_NUM(bp) == CHIP_NUM_57710)
 #define CHIP_IS_57711(bp)		(CHIP_NUM(bp) == CHIP_NUM_57711)
 #define CHIP_IS_57711E(bp)		(CHIP_NUM(bp) == CHIP_NUM_57711E)
 #define CHIP_IS_57712(bp)		(CHIP_NUM(bp) == CHIP_NUM_57712)
+#define CHIP_IS_57712_VF(bp)		(CHIP_NUM(bp) == CHIP_NUM_57712_VF)
 #define CHIP_IS_57712_MF(bp)		(CHIP_NUM(bp) == CHIP_NUM_57712_MF)
 #define CHIP_IS_57800(bp)		(CHIP_NUM(bp) == CHIP_NUM_57800)
 #define CHIP_IS_57800_MF(bp)		(CHIP_NUM(bp) == CHIP_NUM_57800_MF)
+#define CHIP_IS_57800_VF(bp)		(CHIP_NUM(bp) == CHIP_NUM_57800_VF)
 #define CHIP_IS_57810(bp)		(CHIP_NUM(bp) == CHIP_NUM_57810)
 #define CHIP_IS_57810_MF(bp)		(CHIP_NUM(bp) == CHIP_NUM_57810_MF)
+#define CHIP_IS_57810_VF(bp)		(CHIP_NUM(bp) == CHIP_NUM_57810_VF)
 #define CHIP_IS_57811(bp)		(CHIP_NUM(bp) == CHIP_NUM_57811)
 #define CHIP_IS_57811_MF(bp)		(CHIP_NUM(bp) == CHIP_NUM_57811_MF)
+#define CHIP_IS_57811_VF(bp)		(CHIP_NUM(bp) == CHIP_NUM_57811_VF)
 #define CHIP_IS_57840(bp)		\
 		((CHIP_NUM(bp) == CHIP_NUM_57840_4_10) || \
 		 (CHIP_NUM(bp) == CHIP_NUM_57840_2_20) || \
 		 (CHIP_NUM(bp) == CHIP_NUM_57840_OBSOLETE))
 #define CHIP_IS_57840_MF(bp)	((CHIP_NUM(bp) == CHIP_NUM_57840_MF) || \
 				 (CHIP_NUM(bp) == CHIP_NUM_57840_MF_OBSOLETE))
+#define CHIP_IS_57840_VF(bp)		(CHIP_NUM(bp) == CHIP_NUM_57840_VF)
 #define CHIP_IS_E1H(bp)			(CHIP_IS_57711(bp) || \
 					 CHIP_IS_57711E(bp))
 #define CHIP_IS_E2(bp)			(CHIP_IS_57712(bp) || \
@@ -840,10 +850,13 @@ struct bnx2x_common {
 					 CHIP_IS_57800_MF(bp) || \
 					 CHIP_IS_57810(bp) || \
 					 CHIP_IS_57810_MF(bp) || \
+					 CHIP_IS_57810_VF(bp) || \
 					 CHIP_IS_57811(bp) || \
 					 CHIP_IS_57811_MF(bp) || \
+					 CHIP_IS_57811_VF(bp) || \
 					 CHIP_IS_57840(bp) || \
-					 CHIP_IS_57840_MF(bp))
+					 CHIP_IS_57840_MF(bp) || \
+					 CHIP_IS_57840_VF(bp))
 #define CHIP_IS_E1x(bp)			(CHIP_IS_E1((bp)) || CHIP_IS_E1H((bp)))
 #define USES_WARPCORE(bp)		(CHIP_IS_E3(bp))
 #define IS_E1H_OFFSET			(!CHIP_IS_E1(bp))
@@ -1195,8 +1208,9 @@ struct bnx2x_fw_stats_data {
 enum {
 	BNX2X_SP_RTNL_SETUP_TC,
 	BNX2X_SP_RTNL_TX_TIMEOUT,
-	BNX2X_SP_RTNL_AFEX_F_UPDATE,
 	BNX2X_SP_RTNL_FAN_FAILURE,
+	BNX2X_SP_RTNL_AFEX_F_UPDATE,
+	BNX2X_SP_RTNL_ENABLE_SRIOV,
 	BNX2X_SP_RTNL_VFPF_MCAST,
 	BNX2X_SP_RTNL_VFPF_STORM_RX_MODE,
 };
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index b44ef79..2d19b04 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -193,12 +193,18 @@ static struct {
 #ifndef PCI_DEVICE_ID_NX2_57712_MF
 #define PCI_DEVICE_ID_NX2_57712_MF	CHIP_NUM_57712_MF
 #endif
+#ifndef PCI_DEVICE_ID_NX2_57712_VF
+#define PCI_DEVICE_ID_NX2_57712_VF	CHIP_NUM_57712_VF
+#endif
 #ifndef PCI_DEVICE_ID_NX2_57800
 #define PCI_DEVICE_ID_NX2_57800		CHIP_NUM_57800
 #endif
 #ifndef PCI_DEVICE_ID_NX2_57800_MF
 #define PCI_DEVICE_ID_NX2_57800_MF	CHIP_NUM_57800_MF
 #endif
+#ifndef PCI_DEVICE_ID_NX2_57800_VF
+#define PCI_DEVICE_ID_NX2_57800_VF	CHIP_NUM_57800_VF
+#endif
 #ifndef PCI_DEVICE_ID_NX2_57810
 #define PCI_DEVICE_ID_NX2_57810		CHIP_NUM_57810
 #endif
@@ -208,6 +214,9 @@ static struct {
 #ifndef PCI_DEVICE_ID_NX2_57840_O
 #define PCI_DEVICE_ID_NX2_57840_O	CHIP_NUM_57840_OBSOLETE
 #endif
+#ifndef PCI_DEVICE_ID_NX2_57810_VF
+#define PCI_DEVICE_ID_NX2_57810_VF	CHIP_NUM_57810_VF
+#endif
 #ifndef PCI_DEVICE_ID_NX2_57840_4_10
 #define PCI_DEVICE_ID_NX2_57840_4_10	CHIP_NUM_57840_4_10
 #endif
@@ -220,29 +229,41 @@ static struct {
 #ifndef PCI_DEVICE_ID_NX2_57840_MF
 #define PCI_DEVICE_ID_NX2_57840_MF	CHIP_NUM_57840_MF
 #endif
+#ifndef PCI_DEVICE_ID_NX2_57840_VF
+#define PCI_DEVICE_ID_NX2_57840_VF	CHIP_NUM_57840_VF
+#endif
 #ifndef PCI_DEVICE_ID_NX2_57811
 #define PCI_DEVICE_ID_NX2_57811		CHIP_NUM_57811
 #endif
 #ifndef PCI_DEVICE_ID_NX2_57811_MF
 #define PCI_DEVICE_ID_NX2_57811_MF	CHIP_NUM_57811_MF
 #endif
+#ifndef PCI_DEVICE_ID_NX2_57811_VF
+#define PCI_DEVICE_ID_NX2_57811_VF	CHIP_NUM_57811_VF
+#endif
+
 static DEFINE_PCI_DEVICE_TABLE(bnx2x_pci_tbl) = {
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57710), BCM57710 },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57711), BCM57711 },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57711E), BCM57711E },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57712), BCM57712 },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57712_MF), BCM57712_MF },
+	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57712_VF), BCM57712_VF },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57800), BCM57800 },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57800_MF), BCM57800_MF },
+	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57800_VF), BCM57800_VF },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57810), BCM57810 },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57810_MF), BCM57810_MF },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57840_O), BCM57840_O },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57840_4_10), BCM57840_4_10 },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57840_2_20), BCM57840_2_20 },
+	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57810_VF), BCM57810_VF },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57840_MFO), BCM57840_MFO },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57840_MF), BCM57840_MF },
+	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57840_VF), BCM57840_VF },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57811), BCM57811 },
 	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57811_MF), BCM57811_MF },
+	{ PCI_VDEVICE(BROADCOM, PCI_DEVICE_ID_NX2_57811_VF), BCM57811_VF },
 	{ 0 }
 };
 
@@ -9432,8 +9453,10 @@ static void bnx2x_sp_rtnl_task(struct work_struct *work)
 
 	rtnl_lock();
 
-	if (!netif_running(bp->dev))
-		goto sp_rtnl_exit;
+	if (!netif_running(bp->dev)) {
+		rtnl_unlock();
+		return;
+	}
 
 	/* if stop on error is defined no recovery flows should be executed */
 #ifdef BNX2X_STOP_ON_ERROR
@@ -9452,7 +9475,8 @@ static void bnx2x_sp_rtnl_task(struct work_struct *work)
 
 		bnx2x_parity_recover(bp);
 
-		goto sp_rtnl_exit;
+		rtnl_unlock();
+		return;
 	}
 
 	if (test_and_clear_bit(BNX2X_SP_RTNL_TX_TIMEOUT, &bp->sp_rtnl_state)) {
@@ -9466,7 +9490,8 @@ static void bnx2x_sp_rtnl_task(struct work_struct *work)
 		bnx2x_nic_unload(bp, UNLOAD_NORMAL, true);
 		bnx2x_nic_load(bp, LOAD_NORMAL);
 
-		goto sp_rtnl_exit;
+		rtnl_unlock();
+		return;
 	}
 #ifdef BNX2X_STOP_ON_ERROR
 sp_rtnl_not_reset:
@@ -9484,6 +9509,8 @@ sp_rtnl_not_reset:
 		DP(NETIF_MSG_HW, "fan failure detected. Unloading driver\n");
 		netif_device_detach(bp->dev);
 		bnx2x_close(bp->dev);
+		rtnl_unlock();
+		return;
 	}
 
 	if (test_and_clear_bit(BNX2X_SP_RTNL_VFPF_MCAST, &bp->sp_rtnl_state)) {
@@ -9499,8 +9526,28 @@ sp_rtnl_not_reset:
 		bnx2x_vfpf_storm_rx_mode(bp);
 	}
 
-sp_rtnl_exit:
+	/* work which needs rtnl lock not-taken (as it takes the lock itself and
+	 * can be called from other contexts as well)
+	 */
+
 	rtnl_unlock();
+
+	if (IS_SRIOV(bp) && test_and_clear_bit(BNX2X_SP_RTNL_ENABLE_SRIOV,
+					       &bp->sp_rtnl_state)) {
+		int rc = 0;
+
+		/* disbale sriov in case it is still enabled */
+		pci_disable_sriov(bp->pdev);
+		DP(BNX2X_MSG_IOV, "sriov disabled");
+
+		/* enable sriov */
+		DP(BNX2X_MSG_IOV, "vf num (%d)\n", (bp->vfdb->sriov.nr_virtfn));
+		rc = pci_enable_sriov(bp->pdev, (bp->vfdb->sriov.nr_virtfn));
+		if (rc)
+			BNX2X_ERR("pci_enable_sriov failed with %d\n", rc);
+		else
+			DP(BNX2X_MSG_IOV, "sriov enabled");
+	}
 }
 
 /* end of nic load/unload */
@@ -11359,6 +11406,26 @@ static int bnx2x_init_bp(struct bnx2x *bp)
  * net_device service functions
  */
 
+static int bnx2x_open_epilog(struct bnx2x *bp)
+{
+	/* Enable sriov via delayed work. This must be done via delayed work
+	 * because it causes the probe of the vf devices to be run, which invoke
+	 * register_netdevice which must have rtnl lock taken. As we are holding
+	 * the lock right now, that could only work if the probe would not take
+	 * the lock. However, as the probe of the vf may be called from other
+	 * contexts as well (such as passthrough to vm failes) it can't assume
+	 * the lock is being held for it. Using delayed work here allows the
+	 * probe code to simply take the lock (i.e. wait for it to be released
+	 * if it is being held).
+	 */
+	smp_mb__before_clear_bit();
+	set_bit(BNX2X_SP_RTNL_ENABLE_SRIOV, &bp->sp_rtnl_state);
+	smp_mb__after_clear_bit();
+	schedule_delayed_work(&bp->sp_rtnl_task, 0);
+
+	return 0;
+}
+
 /* called with rtnl_lock */
 static int bnx2x_open(struct net_device *dev)
 {
@@ -11366,6 +11433,7 @@ static int bnx2x_open(struct net_device *dev)
 	bool global = false;
 	int other_engine = BP_PATH(bp) ? 0 : 1;
 	bool other_load_status, load_status;
+	int rc;
 
 	bp->stats_init = true;
 
@@ -11423,7 +11491,12 @@ static int bnx2x_open(struct net_device *dev)
 		} while (0);
 
 	bp->recovery_state = BNX2X_RECOVERY_DONE;
-	return bnx2x_nic_load(bp, LOAD_OPEN);
+
+	rc = bnx2x_nic_load(bp, LOAD_OPEN);
+	if (rc)
+		return rc;
+
+	return bnx2x_open_epilog(bp);
 }
 
 /* called with rtnl_lock */
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index 11d603f..4d76b6c 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -2036,6 +2036,10 @@ void bnx2x_iov_remove_one(struct bnx2x *bp)
 	if (!IS_SRIOV(bp))
 		return;
 
+	DP(BNX2X_MSG_IOV, "about to call disable sriov");
+	pci_disable_sriov(bp->pdev);
+	DP(BNX2X_MSG_IOV, "sriov disabled");
+
 	/* free vf database */
 	__bnx2x_iov_free_vfdb(bp);
 }
-- 
1.7.9.GIT

^ permalink raw reply related

* [PATCH net-next v3 19/22] bnx2x: Support of PF driver of a VF release request
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

The 'release' request is the opposite of the 'acquire' request.
At release, all the resources allocated to the VF are reclaimed.
The release flow applies the close flow if applicable.
Note that there are actually two types of release:
1. The VF has been removed, and so issued a 'release' request
over the VF <-> PF Channel.
2. The PF is going down and so has to release all of it's VFs.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |    1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c |  121 +++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h |   21 ++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c  |   25 ++++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h  |    1 +
 5 files changed, 168 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 3618afb..384f303 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -8628,6 +8628,7 @@ void bnx2x_chip_cleanup(struct bnx2x *bp, int unload_mode, bool keep_link)
 
 	netif_addr_unlock_bh(bp->dev);
 
+	bnx2x_iov_chip_cleanup(bp);
 
 
 	/*
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index c5d471e..658915f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -1425,6 +1425,14 @@ bnx2x_iov_static_resc(struct bnx2x *bp, struct vf_pf_resc_request *resc)
 	/* num_sbs already set */
 }
 
+/* FLR routines: */
+static void bnx2x_vf_free_resc(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	/* reset the state variables */
+	bnx2x_iov_static_resc(bp, &vf->alloc_resc);
+	vf->state = VF_FREE;
+}
+
 /* IOV global initialization routines  */
 void bnx2x_iov_init_dq(struct bnx2x *bp)
 {
@@ -1950,6 +1958,21 @@ int bnx2x_iov_nic_init(struct bnx2x *bp)
 	return 0;
 }
 
+/* called by bnx2x_chip_cleanup */
+int bnx2x_iov_chip_cleanup(struct bnx2x *bp)
+{
+	int i;
+
+	if (!IS_SRIOV(bp))
+		return 0;
+
+	/* release all the VFs */
+	for_each_vf(bp, i)
+		bnx2x_vf_release(bp, BP_VF(bp, i), true); /* blocking */
+
+	return 0;
+}
+
 /* called by bnx2x_init_hw_func, returns the next ilt line */
 int bnx2x_iov_init_ilt(struct bnx2x *bp, u16 line)
 {
@@ -2571,6 +2594,104 @@ int bnx2x_vfop_close_cmd(struct bnx2x *bp,
 	return -ENOMEM;
 }
 
+/* VF release can be called either: 1. the VF was acquired but
+ * not enabled 2. the vf was enabled or in the process of being
+ * enabled
+ */
+static void bnx2x_vfop_release(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	struct bnx2x_vfop_cmd cmd = {
+		.done = bnx2x_vfop_release,
+		.block = false,
+	};
+
+	DP(BNX2X_MSG_IOV, "vfop->rc %d\n", vfop->rc);
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	DP(BNX2X_MSG_IOV, "VF[%d] STATE: %s\n", vf->abs_vfid,
+	   vf->state == VF_FREE ? "Free" :
+	   vf->state == VF_ACQUIRED ? "Acquired" :
+	   vf->state == VF_ENABLED ? "Enabled" :
+	   vf->state == VF_RESET ? "Reset" :
+	   "Unknown");
+
+	switch (vf->state) {
+	case VF_ENABLED:
+		vfop->rc = bnx2x_vfop_close_cmd(bp, vf, &cmd);
+		if (vfop->rc)
+			goto op_err;
+		return;
+
+	case VF_ACQUIRED:
+		DP(BNX2X_MSG_IOV, "about to free resources\n");
+		bnx2x_vf_free_resc(bp, vf);
+		DP(BNX2X_MSG_IOV, "vfop->rc %d\n", vfop->rc);
+		goto op_done;
+
+	case VF_FREE:
+	case VF_RESET:
+		/* do nothing */
+		goto op_done;
+	default:
+		bnx2x_vfop_default(vf->state);
+	}
+op_err:
+	BNX2X_ERR("VF[%d] RELEASE error: rc %d\n", vf->abs_vfid, vfop->rc);
+op_done:
+	bnx2x_vfop_end(bp, vf, vfop);
+}
+
+int bnx2x_vfop_release_cmd(struct bnx2x *bp,
+			   struct bnx2x_virtf *vf,
+			   struct bnx2x_vfop_cmd *cmd)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+	if (vfop) {
+		bnx2x_vfop_opset(-1, /* use vf->state */
+				 bnx2x_vfop_release, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_release,
+					     cmd->block);
+	}
+	return -ENOMEM;
+}
+
+/* VF release ~ VF close + VF release-resources
+ * Release is the ultimate SW shutdown and is called whenever an
+ * irrecoverable error is encountered.
+ */
+void bnx2x_vf_release(struct bnx2x *bp, struct bnx2x_virtf *vf, bool block)
+{
+	struct bnx2x_vfop_cmd cmd = {
+		.done = NULL,
+		.block = block,
+	};
+	int rc;
+	bnx2x_lock_vf_pf_channel(bp, vf, CHANNEL_TLV_PF_RELEASE_VF);
+	rc = bnx2x_vfop_release_cmd(bp, vf, &cmd);
+	if (rc)
+		WARN(rc,
+		     "VF[%d] Failed to allocate resources for release op- rc=%d\n",
+		     vf->abs_vfid, rc);
+}
+
+static inline void bnx2x_vf_get_sbdf(struct bnx2x *bp,
+			      struct bnx2x_virtf *vf, u32 *sbdf)
+{
+	*sbdf = vf->devfn | (vf->bus << 8);
+}
+
+static inline void bnx2x_vf_get_bars(struct bnx2x *bp, struct bnx2x_virtf *vf,
+		       struct bnx2x_vf_bar_info *bar_info)
+{
+	int n;
+	bar_info->nr_bars = bp->vfdb->sriov.nres;
+	for (n = 0; n < bar_info->nr_bars; n++)
+		bar_info->bars[n] = vf->bars[n];
+}
+
 void bnx2x_lock_vf_pf_channel(struct bnx2x *bp, struct bnx2x_virtf *vf,
 			      enum channel_tlvs tlv)
 {
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index 2e2c571..8e65b29 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -51,6 +51,11 @@ struct bnx2x_vf_bar {
 	u32 size;
 };
 
+struct bnx2x_vf_bar_info {
+	struct bnx2x_vf_bar bars[PCI_SRIOV_NUM_BARS];
+	u8 nr_bars;
+};
+
 /* vf queue (used both for rx or tx) */
 struct bnx2x_vf_queue {
 	struct eth_context		*cxt;
@@ -430,6 +435,7 @@ void bnx2x_iov_remove_one(struct bnx2x *bp);
 void bnx2x_iov_free_mem(struct bnx2x *bp);
 int bnx2x_iov_alloc_mem(struct bnx2x *bp);
 int bnx2x_iov_nic_init(struct bnx2x *bp);
+int bnx2x_iov_chip_cleanup(struct bnx2x *bp);
 void bnx2x_iov_init_dq(struct bnx2x *bp);
 void bnx2x_iov_init_dmae(struct bnx2x *bp);
 void bnx2x_iov_set_queue_sp_obj(struct bnx2x *bp, int vf_cid,
@@ -547,6 +553,11 @@ static inline void bnx2x_vfop_end(struct bnx2x *bp, struct bnx2x_virtf *vf,
 	if (vfop->done) {
 		DP(BNX2X_MSG_IOV, "calling done handler\n");
 		vfop->done(bp, vf);
+	} else {
+		/* there is no done handler for the operation to unlock
+		 * the mutex. Must have gotten here from PF initiated VF RELEASE
+		 */
+		bnx2x_unlock_vf_pf_channel(bp, vf, CHANNEL_TLV_PF_RELEASE_VF);
 	}
 
 	DP(BNX2X_MSG_IOV, "done handler complete. vf->op_rc %d, vfop->rc %d\n",
@@ -662,6 +673,16 @@ int bnx2x_vfop_close_cmd(struct bnx2x *bp,
 			 struct bnx2x_virtf *vf,
 			 struct bnx2x_vfop_cmd *cmd);
 
+int bnx2x_vfop_release_cmd(struct bnx2x *bp,
+			   struct bnx2x_virtf *vf,
+			   struct bnx2x_vfop_cmd *cmd);
+
+/* VF release ~ VF close + VF release-resources
+ *
+ * Release is the ultimate SW shutdown and is called whenever an
+ * irrecoverable error is encountered.
+ */
+void bnx2x_vf_release(struct bnx2x *bp, struct bnx2x_virtf *vf, bool block);
 int bnx2x_vf_idx_by_abs_fid(struct bnx2x *bp, u16 abs_vfid);
 u8 bnx2x_vf_max_queue_cnt(struct bnx2x *bp, struct bnx2x_virtf *vf);
 /* VF FLR helpers */
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
index fef4564..65a3723 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
@@ -231,7 +231,7 @@ static void bnx2x_vf_mbx_resp(struct bnx2x *bp, struct bnx2x_virtf *vf)
 		if (rc) {
 			BNX2X_ERR("Failed to copy response body to VF %d\n",
 				  vf->abs_vfid);
-			return;
+			goto mbx_error;
 		}
 		vf_addr -= sizeof(u64);
 		pf_addr -= sizeof(u64);
@@ -258,8 +258,12 @@ static void bnx2x_vf_mbx_resp(struct bnx2x *bp, struct bnx2x_virtf *vf)
 	if (rc) {
 		BNX2X_ERR("Failed to copy response status to VF %d\n",
 			  vf->abs_vfid);
+		goto mbx_error;
 	}
 	return;
+
+mbx_error:
+	bnx2x_vf_release(bp, vf, false); /* non blocking */
 }
 
 static void bnx2x_vf_mbx_acquire_resp(struct bnx2x *bp, struct bnx2x_virtf *vf,
@@ -833,6 +837,21 @@ static void bnx2x_vf_mbx_close_vf(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		bnx2x_vf_mbx_close_done(bp, vf);
 }
 
+static void bnx2x_vf_mbx_release_vf(struct bnx2x *bp, struct bnx2x_virtf *vf,
+				    struct bnx2x_vf_mbx *mbx)
+{
+	struct bnx2x_vfop_cmd cmd = {
+		.done = bnx2x_vf_mbx_resp,
+		.block = false,
+	};
+
+	DP(BNX2X_MSG_IOV, "VF[%d] VF_RELEASE\n", vf->abs_vfid);
+
+	vf->op_rc = bnx2x_vfop_release_cmd(bp, vf, &cmd);
+	if (vf->op_rc)
+		bnx2x_vf_mbx_resp(bp, vf);
+}
+
 /* dispatch request */
 static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 				  struct bnx2x_vf_mbx *mbx)
@@ -867,6 +886,9 @@ static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		case CHANNEL_TLV_CLOSE:
 			bnx2x_vf_mbx_close_vf(bp, vf, mbx);
 			break;
+		case CHANNEL_TLV_RELEASE:
+			bnx2x_vf_mbx_release_vf(bp, vf, mbx);
+			break;
 		}
 
 	/* unknown TLV - this may belong to a VF driver from the future - a
@@ -961,6 +983,7 @@ void bnx2x_vf_mbx(struct bnx2x *bp, struct vf_pf_event_data *vfpf_event)
 	goto mbx_done;
 
 mbx_error:
+	bnx2x_vf_release(bp, vf, false); /* non blocking */
 mbx_done:
 	return;
 }
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h
index a289e59..c4d4891 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.h
@@ -309,6 +309,7 @@ enum channel_tlvs {
 	   CHANNEL_TLV_TEARDOWN_Q,
 	   CHANNEL_TLV_CLOSE,
 	   CHANNEL_TLV_RELEASE,
+	   CHANNEL_TLV_PF_RELEASE_VF,
 	   CHANNEL_TLV_LIST_END,
 	   CHANNEL_TLV_MAX
 };
-- 
1.7.9.GIT

^ permalink raw reply related

* [PATCH net-next v3 17/22] bnx2x: Support of PF driver of a VF q_teardown request
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

The 'q_teardown' request is basically the opposite of the 'q_setup'.
Here the PF driver removes from the device the queue it opened against
the VF fastpath ring at 'setup_q' stage, along with all related
rx_mode info.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c |  270 +++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h |    5 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c  |   20 ++
 3 files changed, 295 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index 472e8ca..fa76c4e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -110,6 +110,13 @@ enum bnx2x_vfop_qctor_state {
 	   BNX2X_VFOP_QCTOR_INT_EN
 };
 
+enum bnx2x_vfop_qdtor_state {
+	   BNX2X_VFOP_QDTOR_HALT,
+	   BNX2X_VFOP_QDTOR_TERMINATE,
+	   BNX2X_VFOP_QDTOR_CFCDEL,
+	   BNX2X_VFOP_QDTOR_DONE
+};
+
 enum bnx2x_vfop_vlan_mac_state {
 	   BNX2X_VFOP_VLAN_MAC_CONFIG_SINGLE,
 	   BNX2X_VFOP_VLAN_MAC_CLEAR,
@@ -136,6 +143,14 @@ enum bnx2x_vfop_rxmode_state {
 	   BNX2X_VFOP_RXMODE_DONE
 };
 
+enum bnx2x_vfop_qteardown_state {
+	   BNX2X_VFOP_QTEARDOWN_RXMODE,
+	   BNX2X_VFOP_QTEARDOWN_CLR_VLAN,
+	   BNX2X_VFOP_QTEARDOWN_CLR_MAC,
+	   BNX2X_VFOP_QTEARDOWN_QDTOR,
+	   BNX2X_VFOP_QTEARDOWN_DONE
+};
+
 #define bnx2x_vfop_reset_wq(vf)	atomic_set(&vf->op_in_progress, 0)
 
 void bnx2x_vfop_qctor_dump_tx(struct bnx2x *bp, struct bnx2x_virtf *vf,
@@ -342,6 +357,101 @@ static int bnx2x_vfop_qctor_cmd(struct bnx2x *bp,
 	return -ENOMEM;
 }
 
+/* VFOP queue destruction */
+static void bnx2x_vfop_qdtor(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	struct bnx2x_vfop_args_qdtor *qdtor = &vfop->args.qdtor;
+	struct bnx2x_queue_state_params *q_params = &vfop->op_p->qctor.qstate;
+	enum bnx2x_vfop_qdtor_state state = vfop->state;
+
+	bnx2x_vfop_reset_wq(vf);
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	DP(BNX2X_MSG_IOV, "vf[%d] STATE: %d\n", vf->abs_vfid, state);
+
+	switch (state) {
+	case BNX2X_VFOP_QDTOR_HALT:
+
+		/* has this queue already been stopped? */
+		if (bnx2x_get_q_logical_state(bp, q_params->q_obj) ==
+		    BNX2X_Q_LOGICAL_STATE_STOPPED) {
+			DP(BNX2X_MSG_IOV,
+			   "Entered qdtor but queue was already stopped. Aborting gracefully\n");
+			goto op_done;
+		}
+
+		/* next state */
+		vfop->state = BNX2X_VFOP_QDTOR_TERMINATE;
+
+		q_params->cmd = BNX2X_Q_CMD_HALT;
+		vfop->rc = bnx2x_queue_state_change(bp, q_params);
+
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_CONT);
+
+	case BNX2X_VFOP_QDTOR_TERMINATE:
+		/* next state */
+		vfop->state = BNX2X_VFOP_QDTOR_CFCDEL;
+
+		q_params->cmd = BNX2X_Q_CMD_TERMINATE;
+		vfop->rc = bnx2x_queue_state_change(bp, q_params);
+
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_CONT);
+
+	case BNX2X_VFOP_QDTOR_CFCDEL:
+		/* next state */
+		vfop->state = BNX2X_VFOP_QDTOR_DONE;
+
+		q_params->cmd = BNX2X_Q_CMD_CFC_DEL;
+		vfop->rc = bnx2x_queue_state_change(bp, q_params);
+
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_DONE);
+op_err:
+	BNX2X_ERR("QDTOR[%d:%d] error: cmd %d, rc %d\n",
+		  vf->abs_vfid, qdtor->qid, q_params->cmd, vfop->rc);
+op_done:
+	case BNX2X_VFOP_QDTOR_DONE:
+		/* invalidate the context */
+		qdtor->cxt->ustorm_ag_context.cdu_usage = 0;
+		qdtor->cxt->xstorm_ag_context.cdu_reserved = 0;
+		bnx2x_vfop_end(bp, vf, vfop);
+		return;
+	default:
+		bnx2x_vfop_default(state);
+	}
+op_pending:
+	return;
+}
+
+static int bnx2x_vfop_qdtor_cmd(struct bnx2x *bp,
+				struct bnx2x_virtf *vf,
+				struct bnx2x_vfop_cmd *cmd,
+				int qid)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+
+	if (vfop) {
+		struct bnx2x_queue_state_params *qstate =
+			&vf->op_params.qctor.qstate;
+
+		memset(qstate, 0, sizeof(*qstate));
+		qstate->q_obj = &bnx2x_vfq(vf, qid, sp_obj);
+
+		vfop->args.qdtor.qid = qid;
+		vfop->args.qdtor.cxt = bnx2x_vfq(vf, qid, cxt);
+
+		bnx2x_vfop_opset(BNX2X_VFOP_QDTOR_HALT,
+				 bnx2x_vfop_qdtor, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_qdtor,
+					     cmd->block);
+	}
+	DP(BNX2X_MSG_IOV, "VF[%d] failed to add a vfop. rc %d\n",
+	   vf->abs_vfid, vfop->rc);
+	return -ENOMEM;
+}
+
 static void
 bnx2x_vf_set_igu_info(struct bnx2x *bp, u8 igu_sb_id, u8 abs_vfid)
 {
@@ -593,6 +703,45 @@ bnx2x_vfop_mac_prep_ramrod(struct bnx2x_vlan_mac_ramrod_params *ramrod,
 	set_bit(BNX2X_ETH_MAC, &ramrod->user_req.vlan_mac_flags);
 }
 
+static int bnx2x_vfop_mac_delall_cmd(struct bnx2x *bp,
+				     struct bnx2x_virtf *vf,
+				     struct bnx2x_vfop_cmd *cmd,
+				     int qid, bool drv_only)
+{
+
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+
+	if (vfop) {
+		struct bnx2x_vfop_args_filters filters = {
+			.multi_filter = NULL,	/* single */
+			.credit = NULL,		/* consume credit */
+		};
+		struct bnx2x_vfop_vlan_mac_flags flags = {
+			.drv_only = drv_only,
+			.dont_consume = (filters.credit != NULL),
+			.single_cmd = true,
+			.add = false /* don't care */,
+		};
+		struct bnx2x_vlan_mac_ramrod_params *ramrod =
+			&vf->op_params.vlan_mac;
+
+		/* set ramrod params */
+		bnx2x_vfop_mac_prep_ramrod(ramrod, &flags);
+
+		/* set object */
+		ramrod->vlan_mac_obj = &bnx2x_vfq(vf, qid, mac_obj);
+
+		/* set extra args */
+		vfop->args.filters = filters;
+
+		bnx2x_vfop_opset(BNX2X_VFOP_VLAN_MAC_CLEAR,
+				 bnx2x_vfop_vlan_mac, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_vlan_mac,
+					     cmd->block);
+	}
+	return -ENOMEM;
+}
+
 int bnx2x_vfop_mac_list_cmd(struct bnx2x *bp,
 			    struct bnx2x_virtf *vf,
 			    struct bnx2x_vfop_cmd *cmd,
@@ -675,6 +824,44 @@ int bnx2x_vfop_vlan_set_cmd(struct bnx2x *bp,
 	return -ENOMEM;
 }
 
+static int bnx2x_vfop_vlan_delall_cmd(struct bnx2x *bp,
+			       struct bnx2x_virtf *vf,
+			       struct bnx2x_vfop_cmd *cmd,
+			       int qid, bool drv_only)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+
+	if (vfop) {
+		struct bnx2x_vfop_args_filters filters = {
+			.multi_filter = NULL, /* single command */
+			.credit = &bnx2x_vfq(vf, qid, vlan_count),
+		};
+		struct bnx2x_vfop_vlan_mac_flags flags = {
+			.drv_only = drv_only,
+			.dont_consume = (filters.credit != NULL),
+			.single_cmd = true,
+			.add = false, /* don't care */
+		};
+		struct bnx2x_vlan_mac_ramrod_params *ramrod =
+			&vf->op_params.vlan_mac;
+
+		/* set ramrod params */
+		bnx2x_vfop_vlan_mac_prep_ramrod(ramrod, &flags);
+
+		/* set object */
+		ramrod->vlan_mac_obj = &bnx2x_vfq(vf, qid, vlan_obj);
+
+		/* set extra args */
+		vfop->args.filters = filters;
+
+		bnx2x_vfop_opset(BNX2X_VFOP_VLAN_MAC_CLEAR,
+				 bnx2x_vfop_vlan_mac, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_vlan_mac,
+					     cmd->block);
+	}
+	return -ENOMEM;
+}
+
 int bnx2x_vfop_vlan_list_cmd(struct bnx2x *bp,
 			     struct bnx2x_virtf *vf,
 			     struct bnx2x_vfop_cmd *cmd,
@@ -958,6 +1145,89 @@ int bnx2x_vfop_rxmode_cmd(struct bnx2x *bp,
 
 }
 
+/* VFOP queue tear-down ('drop all' rx-mode, clear vlans, clear macs,
+ * queue destructor)
+ */
+static void bnx2x_vfop_qdown(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	int qid = vfop->args.qx.qid;
+	enum bnx2x_vfop_qteardown_state state = vfop->state;
+	struct bnx2x_vfop_cmd cmd;
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	DP(BNX2X_MSG_IOV, "vf[%d] STATE: %d\n", vf->abs_vfid, state);
+
+	cmd.done = bnx2x_vfop_qdown;
+	cmd.block = false;
+
+	switch (state) {
+	case BNX2X_VFOP_QTEARDOWN_RXMODE:
+		/* Drop all */
+		vfop->state = BNX2X_VFOP_QTEARDOWN_CLR_VLAN;
+		vfop->rc = bnx2x_vfop_rxmode_cmd(bp, vf, &cmd, qid, 0);
+		if (vfop->rc)
+			goto op_err;
+		return;
+
+	case BNX2X_VFOP_QTEARDOWN_CLR_VLAN:
+		/* vlan-clear-all: don't consume credit */
+		vfop->state = BNX2X_VFOP_QTEARDOWN_CLR_MAC;
+		vfop->rc = bnx2x_vfop_vlan_delall_cmd(bp, vf, &cmd, qid, false);
+		if (vfop->rc)
+			goto op_err;
+		return;
+
+	case BNX2X_VFOP_QTEARDOWN_CLR_MAC:
+		/* mac-clear-all: consume credit */
+		vfop->state = BNX2X_VFOP_QTEARDOWN_QDTOR;
+		vfop->rc = bnx2x_vfop_mac_delall_cmd(bp, vf, &cmd, qid, false);
+		if (vfop->rc)
+			goto op_err;
+		return;
+
+	case BNX2X_VFOP_QTEARDOWN_QDTOR:
+		/* run the queue destruction flow */
+		DP(BNX2X_MSG_IOV, "case: BNX2X_VFOP_QTEARDOWN_QDTOR\n");
+		vfop->state = BNX2X_VFOP_QTEARDOWN_DONE;
+		DP(BNX2X_MSG_IOV, "new state: BNX2X_VFOP_QTEARDOWN_DONE\n");
+		vfop->rc = bnx2x_vfop_qdtor_cmd(bp, vf, &cmd, qid);
+		DP(BNX2X_MSG_IOV, "returned from cmd");
+		if (vfop->rc)
+			goto op_err;
+		return;
+op_err:
+	BNX2X_ERR("QTEARDOWN[%d:%d] error: rc %d\n",
+		  vf->abs_vfid, qid, vfop->rc);
+
+	case BNX2X_VFOP_QTEARDOWN_DONE:
+		bnx2x_vfop_end(bp, vf, vfop);
+		return;
+	default:
+		bnx2x_vfop_default(state);
+	}
+}
+
+int bnx2x_vfop_qdown_cmd(struct bnx2x *bp,
+			 struct bnx2x_virtf *vf,
+			 struct bnx2x_vfop_cmd *cmd,
+			 int qid)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+
+	if (vfop) {
+		vfop->args.qx.qid = qid;
+		bnx2x_vfop_opset(BNX2X_VFOP_QTEARDOWN_RXMODE,
+				 bnx2x_vfop_qdown, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_qdown,
+					     cmd->block);
+	}
+
+	return -ENOMEM;
+}
+
 /* VF enable primitives
  *
  * when pretend is required the caller is responsible
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index 776aa75..09b3394 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -642,6 +642,11 @@ int bnx2x_vfop_qsetup_cmd(struct bnx2x *bp,
 			  struct bnx2x_vfop_cmd *cmd,
 			  int qid);
 
+int bnx2x_vfop_qdown_cmd(struct bnx2x *bp,
+			 struct bnx2x_virtf *vf,
+			 struct bnx2x_vfop_cmd *cmd,
+			 int qid);
+
 int bnx2x_vfop_mcast_cmd(struct bnx2x *bp,
 			 struct bnx2x_virtf *vf,
 			 struct bnx2x_vfop_cmd *cmd,
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
index 06598ec..ad92a4d 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
@@ -792,6 +792,23 @@ response:
 	bnx2x_vf_mbx_resp(bp, vf);
 }
 
+static void bnx2x_vf_mbx_teardown_q(struct bnx2x *bp, struct bnx2x_virtf *vf,
+				    struct bnx2x_vf_mbx *mbx)
+{
+	int qid = mbx->msg->req.q_op.vf_qid;
+	struct bnx2x_vfop_cmd cmd = {
+		.done = bnx2x_vf_mbx_resp,
+		.block = false,
+	};
+
+	DP(BNX2X_MSG_IOV, "VF[%d] Q_TEARDOWN: vf_qid=%d\n",
+	   vf->abs_vfid, qid);
+
+	vf->op_rc = bnx2x_vfop_qdown_cmd(bp, vf, &cmd, qid);
+	if (vf->op_rc)
+		bnx2x_vf_mbx_resp(bp, vf);
+}
+
 /* dispatch request */
 static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 				  struct bnx2x_vf_mbx *mbx)
@@ -820,6 +837,9 @@ static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		case CHANNEL_TLV_SET_Q_FILTERS:
 			bnx2x_vf_mbx_set_q_filters(bp, vf, mbx);
 			break;
+		case CHANNEL_TLV_TEARDOWN_Q:
+			bnx2x_vf_mbx_teardown_q(bp, vf, mbx);
+			break;
 		}
 
 	/* unknown TLV - this may belong to a VF driver from the future - a
-- 
1.7.9.GIT

^ permalink raw reply related

* [PATCH net-next v3 18/22] bnx2x: Support of PF driver of a VF close request
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

The 'close' command is the opposite of an init request. Here the
queues of the VF are closed (if any are opened) and released.
This flow applies the 'q_teardown' flow on all the queues.
The VF state is changed by this request.
Interrupts are disabled for the VF when closed.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c |   97 +++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h |    4 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c  |   27 ++++++
 3 files changed, 128 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index fa76c4e..c5d471e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -138,6 +138,11 @@ enum bnx2x_vfop_mcast_state {
 	   BNX2X_VFOP_MCAST_CHK_DONE
 };
 
+enum bnx2x_vfop_close_state {
+	   BNX2X_VFOP_CLOSE_QUEUES,
+	   BNX2X_VFOP_CLOSE_HW
+};
+
 enum bnx2x_vfop_rxmode_state {
 	   BNX2X_VFOP_RXMODE_CONFIG,
 	   BNX2X_VFOP_RXMODE_DONE
@@ -2303,6 +2308,28 @@ static void bnx2x_vf_qtbl_set_q(struct bnx2x *bp, u8 abs_vfid, u8 qid,
 	REG_WR(bp, reg, val);
 }
 
+static void bnx2x_vf_clr_qtbl(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	int i;
+
+	for_each_vfq(vf, i)
+		bnx2x_vf_qtbl_set_q(bp, vf->abs_vfid,
+				    vfq_qzone_id(vf, vfq_get(vf, i)), false);
+}
+
+static void bnx2x_vf_igu_disable(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	u32 val;
+
+	/* clear the VF configuration - pretend */
+	bnx2x_pretend_func(bp, HW_VF_HANDLE(bp, vf->abs_vfid));
+	val = REG_RD(bp, IGU_REG_VF_CONFIGURATION);
+	val &= ~(IGU_VF_CONF_MSI_MSIX_EN | IGU_VF_CONF_SINGLE_ISR_EN |
+		 IGU_VF_CONF_FUNC_EN | IGU_VF_CONF_PARENT_MASK);
+	REG_WR(bp, IGU_REG_VF_CONFIGURATION, val);
+	bnx2x_pretend_func(bp, BP_ABS_FUNC(bp));
+}
+
 u8 bnx2x_vf_max_queue_cnt(struct bnx2x *bp, struct bnx2x_virtf *vf)
 {
 	return min_t(u8, min_t(u8, vf_sb_count(vf), BNX2X_CIDS_PER_VF),
@@ -2474,6 +2501,76 @@ int bnx2x_vf_init(struct bnx2x *bp, struct bnx2x_virtf *vf, dma_addr_t *sb_map)
 	return 0;
 }
 
+/* VFOP close (teardown the queues, delete mcasts and close HW) */
+static void bnx2x_vfop_close(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	struct bnx2x_vfop_args_qx *qx = &vfop->args.qx;
+	enum bnx2x_vfop_close_state state = vfop->state;
+	struct bnx2x_vfop_cmd cmd = {
+		.done = bnx2x_vfop_close,
+		.block = false,
+	};
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	DP(BNX2X_MSG_IOV, "vf[%d] STATE: %d\n", vf->abs_vfid, state);
+
+	switch (state) {
+	case BNX2X_VFOP_CLOSE_QUEUES:
+
+		if (++(qx->qid) < vf_rxq_count(vf)) {
+			vfop->rc = bnx2x_vfop_qdown_cmd(bp, vf, &cmd, qx->qid);
+			if (vfop->rc)
+				goto op_err;
+			return;
+		}
+
+		/* remove multicasts */
+		vfop->state = BNX2X_VFOP_CLOSE_HW;
+		vfop->rc = bnx2x_vfop_mcast_cmd(bp, vf, &cmd, NULL, 0, false);
+		if (vfop->rc)
+			goto op_err;
+		return;
+
+	case BNX2X_VFOP_CLOSE_HW:
+
+		/* disable the interrupts */
+		DP(BNX2X_MSG_IOV, "disabling igu\n");
+		bnx2x_vf_igu_disable(bp, vf);
+
+		/* disable the VF */
+		DP(BNX2X_MSG_IOV, "clearing qtbl\n");
+		bnx2x_vf_clr_qtbl(bp, vf);
+
+		goto op_done;
+	default:
+		bnx2x_vfop_default(state);
+	}
+op_err:
+	BNX2X_ERR("VF[%d] CLOSE error: rc %d\n", vf->abs_vfid, vfop->rc);
+op_done:
+	vf->state = VF_ACQUIRED;
+	DP(BNX2X_MSG_IOV, "set state to acquired\n");
+	bnx2x_vfop_end(bp, vf, vfop);
+}
+
+int bnx2x_vfop_close_cmd(struct bnx2x *bp,
+			 struct bnx2x_virtf *vf,
+			 struct bnx2x_vfop_cmd *cmd)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+	if (vfop) {
+		vfop->args.qx.qid = -1; /* loop */
+		bnx2x_vfop_opset(BNX2X_VFOP_CLOSE_QUEUES,
+				 bnx2x_vfop_close, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_close,
+					     cmd->block);
+	}
+	return -ENOMEM;
+}
+
 void bnx2x_lock_vf_pf_channel(struct bnx2x *bp, struct bnx2x_virtf *vf,
 			      enum channel_tlvs tlv)
 {
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index 09b3394..2e2c571 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -658,6 +658,10 @@ int bnx2x_vfop_rxmode_cmd(struct bnx2x *bp,
 			  struct bnx2x_vfop_cmd *cmd,
 			  int qid, unsigned long accept_flags);
 
+int bnx2x_vfop_close_cmd(struct bnx2x *bp,
+			 struct bnx2x_virtf *vf,
+			 struct bnx2x_vfop_cmd *cmd);
+
 int bnx2x_vf_idx_by_abs_fid(struct bnx2x *bp, u16 abs_vfid);
 u8 bnx2x_vf_max_queue_cnt(struct bnx2x *bp, struct bnx2x_virtf *vf);
 /* VF FLR helpers */
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
index ad92a4d..fef4564 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
@@ -809,6 +809,30 @@ static void bnx2x_vf_mbx_teardown_q(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		bnx2x_vf_mbx_resp(bp, vf);
 }
 
+/* Done handler for 'close' operation: send response and reopen the
+ * channel.
+ */
+static void bnx2x_vf_mbx_close_done(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	bnx2x_vf_mbx_resp(bp, vf);
+	bnx2x_vf_enable_mbx(bp, vf->abs_vfid);
+}
+
+static void bnx2x_vf_mbx_close_vf(struct bnx2x *bp, struct bnx2x_virtf *vf,
+				  struct bnx2x_vf_mbx *mbx)
+{
+	struct bnx2x_vfop_cmd cmd = {
+		.done = bnx2x_vf_mbx_close_done,
+		.block = false,
+	};
+
+	DP(BNX2X_MSG_IOV, "VF[%d] VF_CLOSE\n", vf->abs_vfid);
+
+	vf->op_rc = bnx2x_vfop_close_cmd(bp, vf, &cmd);
+	if (vf->op_rc)
+		bnx2x_vf_mbx_close_done(bp, vf);
+}
+
 /* dispatch request */
 static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 				  struct bnx2x_vf_mbx *mbx)
@@ -840,6 +864,9 @@ static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		case CHANNEL_TLV_TEARDOWN_Q:
 			bnx2x_vf_mbx_teardown_q(bp, vf, mbx);
 			break;
+		case CHANNEL_TLV_CLOSE:
+			bnx2x_vf_mbx_close_vf(bp, vf, mbx);
+			break;
 		}
 
 	/* unknown TLV - this may belong to a VF driver from the future - a
-- 
1.7.9.GIT

^ permalink raw reply related

* [PATCH net-next v3 16/22] bnx2x: Support of PF driver of a VF q_filters request
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

The VF driver uses the 'q_filters' message on the VF <-> PF channel
for configuring an open queue, for example when the rxmode changes.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c |  294 ++++++++++++++++++++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h |   32 +++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c  |  277 +++++++++++++++++++
 3 files changed, 597 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index bde2f47..472e8ca 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -101,6 +101,8 @@ static void bnx2x_vf_igu_ack_sb(struct bnx2x *bp, struct bnx2x_virtf *vf,
 }
 /* VFOP - VF slow-path operation support */
 
+#define BNX2X_VFOP_FILTER_ADD_CNT_MAX		0x10000
+
 /* VFOP operations states */
 enum bnx2x_vfop_qctor_state {
 	   BNX2X_VFOP_QCTOR_INIT,
@@ -123,6 +125,17 @@ enum bnx2x_vfop_qsetup_state {
 	   BNX2X_VFOP_QSETUP_DONE
 };
 
+enum bnx2x_vfop_mcast_state {
+	   BNX2X_VFOP_MCAST_DEL,
+	   BNX2X_VFOP_MCAST_ADD,
+	   BNX2X_VFOP_MCAST_CHK_DONE
+};
+
+enum bnx2x_vfop_rxmode_state {
+	   BNX2X_VFOP_RXMODE_CONFIG,
+	   BNX2X_VFOP_RXMODE_DONE
+};
+
 #define bnx2x_vfop_reset_wq(vf)	atomic_set(&vf->op_in_progress, 0)
 
 void bnx2x_vfop_qctor_dump_tx(struct bnx2x *bp, struct bnx2x_virtf *vf,
@@ -572,6 +585,57 @@ bnx2x_vfop_vlan_mac_prep_ramrod(struct bnx2x_vlan_mac_ramrod_params *ramrod,
 	ureq->cmd = flags->add ? BNX2X_VLAN_MAC_ADD : BNX2X_VLAN_MAC_DEL;
 }
 
+static inline void
+bnx2x_vfop_mac_prep_ramrod(struct bnx2x_vlan_mac_ramrod_params *ramrod,
+			   struct bnx2x_vfop_vlan_mac_flags *flags)
+{
+	bnx2x_vfop_vlan_mac_prep_ramrod(ramrod, flags);
+	set_bit(BNX2X_ETH_MAC, &ramrod->user_req.vlan_mac_flags);
+}
+
+int bnx2x_vfop_mac_list_cmd(struct bnx2x *bp,
+			    struct bnx2x_virtf *vf,
+			    struct bnx2x_vfop_cmd *cmd,
+			    struct bnx2x_vfop_filters *macs,
+			    int qid, bool drv_only)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+
+	if (vfop) {
+		struct bnx2x_vfop_args_filters filters = {
+			.multi_filter = macs,
+			.credit = NULL,		/* consume credit */
+		};
+		struct bnx2x_vfop_vlan_mac_flags flags = {
+			.drv_only = drv_only,
+			.dont_consume = (filters.credit != NULL),
+			.single_cmd = false,
+			.add = false, /* don't care since only the items in the
+				       * filters list affect the sp operation,
+				       * not the list itself
+				       */
+		};
+		struct bnx2x_vlan_mac_ramrod_params *ramrod =
+			&vf->op_params.vlan_mac;
+
+		/* set ramrod params */
+		bnx2x_vfop_mac_prep_ramrod(ramrod, &flags);
+
+		/* set object */
+		ramrod->vlan_mac_obj = &bnx2x_vfq(vf, qid, mac_obj);
+
+		/* set extra args */
+		filters.multi_filter->add_cnt = BNX2X_VFOP_FILTER_ADD_CNT_MAX;
+		vfop->args.filters = filters;
+
+		bnx2x_vfop_opset(BNX2X_VFOP_MAC_CONFIG_LIST,
+				 bnx2x_vfop_vlan_mac, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_vlan_mac,
+					     cmd->block);
+	}
+	return -ENOMEM;
+}
+
 int bnx2x_vfop_vlan_set_cmd(struct bnx2x *bp,
 			    struct bnx2x_virtf *vf,
 			    struct bnx2x_vfop_cmd *cmd,
@@ -611,6 +675,48 @@ int bnx2x_vfop_vlan_set_cmd(struct bnx2x *bp,
 	return -ENOMEM;
 }
 
+int bnx2x_vfop_vlan_list_cmd(struct bnx2x *bp,
+			     struct bnx2x_virtf *vf,
+			     struct bnx2x_vfop_cmd *cmd,
+			     struct bnx2x_vfop_filters *vlans,
+			     int qid, bool drv_only)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+
+	if (vfop) {
+		struct bnx2x_vfop_args_filters filters = {
+			.multi_filter = vlans,
+			.credit = &bnx2x_vfq(vf, qid, vlan_count),
+		};
+		struct bnx2x_vfop_vlan_mac_flags flags = {
+			.drv_only = drv_only,
+			.dont_consume = (filters.credit != NULL),
+			.single_cmd = false,
+			.add = false, /* don't care */
+		};
+		struct bnx2x_vlan_mac_ramrod_params *ramrod =
+			&vf->op_params.vlan_mac;
+
+		/* set ramrod params */
+		bnx2x_vfop_vlan_mac_prep_ramrod(ramrod, &flags);
+
+		/* set object */
+		ramrod->vlan_mac_obj = &bnx2x_vfq(vf, qid, vlan_obj);
+
+		/* set extra args */
+		filters.multi_filter->add_cnt = vf_vlan_rules_cnt(vf) -
+			atomic_read(filters.credit);
+
+		vfop->args.filters = filters;
+
+		bnx2x_vfop_opset(BNX2X_VFOP_VLAN_CONFIG_LIST,
+				 bnx2x_vfop_vlan_mac, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_vlan_mac,
+					     cmd->block);
+	}
+	return -ENOMEM;
+}
+
 /* VFOP queue setup (queue constructor + set vlan 0) */
 static void bnx2x_vfop_qsetup(struct bnx2x *bp, struct bnx2x_virtf *vf)
 {
@@ -676,16 +782,180 @@ int bnx2x_vfop_qsetup_cmd(struct bnx2x *bp,
 	return -ENOMEM;
 }
 
-static u8 bnx2x_iov_get_max_queue_count(struct bnx2x *bp)
+/* VFOP multi-casts */
+static void bnx2x_vfop_mcast(struct bnx2x *bp, struct bnx2x_virtf *vf)
 {
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	struct bnx2x_mcast_ramrod_params *mcast = &vfop->op_p->mcast;
+	struct bnx2x_raw_obj *raw = &mcast->mcast_obj->raw;
+	struct bnx2x_vfop_args_mcast *args = &vfop->args.mc_list;
+	enum bnx2x_vfop_mcast_state state = vfop->state;
 	int i;
-	u8 queue_count = 0;
 
-	if (IS_SRIOV(bp))
-		for_each_vf(bp, i)
-			queue_count += bnx2x_vf(bp, i, alloc_resc.num_sbs);
+	bnx2x_vfop_reset_wq(vf);
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	DP(BNX2X_MSG_IOV, "vf[%d] STATE: %d\n", vf->abs_vfid, state);
+
+	switch (state) {
+	case BNX2X_VFOP_MCAST_DEL:
+		/* clear existing mcasts */
+		vfop->state = BNX2X_VFOP_MCAST_ADD;
+		vfop->rc = bnx2x_config_mcast(bp, mcast, BNX2X_MCAST_CMD_DEL);
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_CONT);
+
+	case BNX2X_VFOP_MCAST_ADD:
+		if (raw->check_pending(raw))
+			goto op_pending;
+
+		if (args->mc_num) {
+			/* update mcast list on the ramrod params */
+			INIT_LIST_HEAD(&mcast->mcast_list);
+			for (i = 0; i < args->mc_num; i++)
+				list_add_tail(&(args->mc[i].link),
+					      &mcast->mcast_list);
+			/* add new mcasts */
+			vfop->state = BNX2X_VFOP_MCAST_CHK_DONE;
+			vfop->rc = bnx2x_config_mcast(bp, mcast,
+						      BNX2X_MCAST_CMD_ADD);
+		}
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_DONE);
+
+	case BNX2X_VFOP_MCAST_CHK_DONE:
+		vfop->rc = raw->check_pending(raw) ? 1 : 0;
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_DONE);
+	default:
+		bnx2x_vfop_default(state);
+	}
+op_err:
+	BNX2X_ERR("MCAST CONFIG error: rc %d\n", vfop->rc);
+op_done:
+	kfree(args->mc);
+	bnx2x_vfop_end(bp, vf, vfop);
+op_pending:
+	return;
+}
+
+int bnx2x_vfop_mcast_cmd(struct bnx2x *bp,
+			 struct bnx2x_virtf *vf,
+			 struct bnx2x_vfop_cmd *cmd,
+			 bnx2x_mac_addr_t *mcasts,
+			 int mcast_num, bool drv_only)
+{
+	struct bnx2x_vfop *vfop = NULL;
+	size_t mc_sz = mcast_num * sizeof(struct bnx2x_mcast_list_elem);
+	struct bnx2x_mcast_list_elem *mc = mc_sz ? kzalloc(mc_sz, GFP_KERNEL) :
+		NULL;
+
+	if (!mc_sz || mc) {
+		vfop = bnx2x_vfop_add(bp, vf);
+		if (vfop) {
+			int i;
+			struct bnx2x_mcast_ramrod_params *ramrod =
+				&vf->op_params.mcast;
+
+			/* set ramrod params */
+			memset(ramrod, 0, sizeof(*ramrod));
+			ramrod->mcast_obj = &vf->mcast_obj;
+			if (drv_only)
+				set_bit(RAMROD_DRV_CLR_ONLY,
+					&ramrod->ramrod_flags);
+
+			/* copy mcasts pointers */
+			vfop->args.mc_list.mc_num = mcast_num;
+			vfop->args.mc_list.mc = mc;
+			for (i = 0; i < mcast_num; i++)
+				mc[i].mac = mcasts[i];
+
+			bnx2x_vfop_opset(BNX2X_VFOP_MCAST_DEL,
+					 bnx2x_vfop_mcast, cmd->done);
+			return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_mcast,
+						     cmd->block);
+		} else {
+			kfree(mc);
+		}
+	}
+	return -ENOMEM;
+}
+
+/* VFOP rx-mode */
+static void bnx2x_vfop_rxmode(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	struct bnx2x_rx_mode_ramrod_params *ramrod = &vfop->op_p->rx_mode;
+	enum bnx2x_vfop_rxmode_state state = vfop->state;
+
+	bnx2x_vfop_reset_wq(vf);
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	DP(BNX2X_MSG_IOV, "vf[%d] STATE: %d\n", vf->abs_vfid, state);
+
+	switch (state) {
+	case BNX2X_VFOP_RXMODE_CONFIG:
+		/* next state */
+		vfop->state = BNX2X_VFOP_RXMODE_DONE;
+
+		vfop->rc = bnx2x_config_rx_mode(bp, ramrod);
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_DONE);
+op_err:
+	BNX2X_ERR("RXMODE error: rc %d\n", vfop->rc);
+op_done:
+	case BNX2X_VFOP_RXMODE_DONE:
+		bnx2x_vfop_end(bp, vf, vfop);
+		return;
+	default:
+		bnx2x_vfop_default(state);
+	}
+op_pending:
+	return;
+}
+
+int bnx2x_vfop_rxmode_cmd(struct bnx2x *bp,
+			  struct bnx2x_virtf *vf,
+			  struct bnx2x_vfop_cmd *cmd,
+			  int qid, unsigned long accept_flags)
+{
+
+	struct bnx2x_vf_queue *vfq = vfq_get(vf, qid);
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+
+	if (vfop) {
+		struct bnx2x_rx_mode_ramrod_params *ramrod =
+			&vf->op_params.rx_mode;
+
+		memset(ramrod, 0, sizeof(*ramrod));
+
+		/* Prepare ramrod parameters */
+		ramrod->cid = vfq->cid;
+		ramrod->cl_id = vfq_cl_id(vf, vfq);
+		ramrod->rx_mode_obj = &bp->rx_mode_obj;
+		ramrod->func_id = FW_VF_HANDLE(vf->abs_vfid);
+
+		ramrod->rx_accept_flags = accept_flags;
+		ramrod->tx_accept_flags = accept_flags;
+		ramrod->pstate = &vf->filter_state;
+		ramrod->state = BNX2X_FILTER_RX_MODE_PENDING;
+
+		set_bit(BNX2X_FILTER_RX_MODE_PENDING, &vf->filter_state);
+		set_bit(RAMROD_RX, &ramrod->ramrod_flags);
+		set_bit(RAMROD_TX, &ramrod->ramrod_flags);
+
+		ramrod->rdata =
+			bnx2x_vf_sp(bp, vf, rx_mode_rdata.e2);
+		ramrod->rdata_mapping =
+			bnx2x_vf_sp_map(bp, vf, rx_mode_rdata.e2);
+
+		bnx2x_vfop_opset(BNX2X_VFOP_RXMODE_CONFIG,
+				 bnx2x_vfop_rxmode, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_rxmode,
+					     cmd->block);
+	}
+	return -ENOMEM;
 
-	return queue_count;
 }
 
 /* VF enable primitives
@@ -1054,6 +1324,18 @@ static int bnx2x_sriov_info(struct bnx2x *bp, struct bnx2x_sriov *iov)
 	return 0;
 }
 
+static u8 bnx2x_iov_get_max_queue_count(struct bnx2x *bp)
+{
+	int i;
+	u8 queue_count = 0;
+
+	if (IS_SRIOV(bp))
+		for_each_vf(bp, i)
+			queue_count += bnx2x_vf(bp, i, alloc_resc.num_sbs);
+
+	return queue_count;
+}
+
 /* must be called after PF bars are mapped */
 int bnx2x_iov_init_one(struct bnx2x *bp, int int_mode_param,
 				 int num_vfs_param)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index 3b758d4..776aa75 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -442,6 +442,10 @@ void bnx2x_iov_sp_task(struct bnx2x *bp);
 /* global vf mailbox routines */
 void bnx2x_vf_mbx(struct bnx2x *bp, struct vf_pf_event_data *vfpf_event);
 void bnx2x_vf_enable_mbx(struct bnx2x *bp, u8 abs_vfid);
+
+/* CORE VF API */
+typedef u8 bnx2x_mac_addr_t[ETH_ALEN];
+
 /* acquire */
 int bnx2x_vf_acquire(struct bnx2x *bp, struct bnx2x_virtf *vf,
 			  struct vf_pf_resc_request *resc);
@@ -616,11 +620,39 @@ void bnx2x_vfop_qctor_prep(struct bnx2x *bp,
 			   struct bnx2x_vf_queue *q,
 			   struct bnx2x_vfop_qctor_params *p,
 			   unsigned long q_type);
+int bnx2x_vfop_mac_list_cmd(struct bnx2x *bp,
+			    struct bnx2x_virtf *vf,
+			    struct bnx2x_vfop_cmd *cmd,
+			    struct bnx2x_vfop_filters *macs,
+			    int qid, bool drv_only);
+
+int bnx2x_vfop_vlan_set_cmd(struct bnx2x *bp,
+			    struct bnx2x_virtf *vf,
+			    struct bnx2x_vfop_cmd *cmd,
+			    int qid, u16 vid, bool add);
+
+int bnx2x_vfop_vlan_list_cmd(struct bnx2x *bp,
+			     struct bnx2x_virtf *vf,
+			     struct bnx2x_vfop_cmd *cmd,
+			     struct bnx2x_vfop_filters *vlans,
+			     int qid, bool drv_only);
+
 int bnx2x_vfop_qsetup_cmd(struct bnx2x *bp,
 			  struct bnx2x_virtf *vf,
 			  struct bnx2x_vfop_cmd *cmd,
 			  int qid);
 
+int bnx2x_vfop_mcast_cmd(struct bnx2x *bp,
+			 struct bnx2x_virtf *vf,
+			 struct bnx2x_vfop_cmd *cmd,
+			 bnx2x_mac_addr_t *mcasts,
+			 int mcast_num, bool drv_only);
+
+int bnx2x_vfop_rxmode_cmd(struct bnx2x *bp,
+			  struct bnx2x_virtf *vf,
+			  struct bnx2x_vfop_cmd *cmd,
+			  int qid, unsigned long accept_flags);
+
 int bnx2x_vf_idx_by_abs_fid(struct bnx2x *bp, u16 abs_vfid);
 u8 bnx2x_vf_max_queue_cnt(struct bnx2x *bp, struct bnx2x_virtf *vf);
 /* VF FLR helpers */
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
index eb10378..06598ec 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
@@ -518,6 +518,280 @@ response:
 	bnx2x_vf_mbx_resp(bp, vf);
 }
 
+enum bnx2x_vfop_filters_state {
+	   BNX2X_VFOP_MBX_Q_FILTERS_MACS,
+	   BNX2X_VFOP_MBX_Q_FILTERS_VLANS,
+	   BNX2X_VFOP_MBX_Q_FILTERS_RXMODE,
+	   BNX2X_VFOP_MBX_Q_FILTERS_MCAST,
+	   BNX2X_VFOP_MBX_Q_FILTERS_DONE
+};
+
+static int bnx2x_vf_mbx_macvlan_list(struct bnx2x *bp,
+				     struct bnx2x_virtf *vf,
+				     struct vfpf_set_q_filters_tlv *tlv,
+				     struct bnx2x_vfop_filters **pfl,
+				     u32 type_flag)
+{
+	int i, j;
+	struct bnx2x_vfop_filters *fl = NULL;
+	size_t fsz;
+
+	fsz = tlv->n_mac_vlan_filters * sizeof(struct bnx2x_vfop_filter) +
+		sizeof(struct bnx2x_vfop_filters);
+
+	fl = kzalloc(fsz, GFP_KERNEL);
+	if (!fl)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&fl->head);
+
+	for (i = 0, j = 0; i < tlv->n_mac_vlan_filters; i++) {
+		struct vfpf_q_mac_vlan_filter *msg_filter = &tlv->filters[i];
+
+		if ((msg_filter->flags & type_flag) != type_flag)
+			continue;
+		if (type_flag == VFPF_Q_FILTER_DEST_MAC_VALID) {
+			fl->filters[j].mac = msg_filter->mac;
+			fl->filters[j].type = BNX2X_VFOP_FILTER_MAC;
+		} else {
+			fl->filters[j].vid = msg_filter->vlan_tag;
+			fl->filters[j].type = BNX2X_VFOP_FILTER_VLAN;
+		}
+		fl->filters[j].add =
+			(msg_filter->flags & VFPF_Q_FILTER_SET_MAC) ?
+			true : false;
+		list_add_tail(&fl->filters[j++].link, &fl->head);
+	}
+	if (list_empty(&fl->head))
+		kfree(fl);
+	else
+		*pfl = fl;
+
+	return 0;
+}
+
+static void bnx2x_vf_mbx_dp_q_filter(struct bnx2x *bp, int msglvl, int idx,
+				       struct vfpf_q_mac_vlan_filter *filter)
+{
+	DP(msglvl, "MAC-VLAN[%d] -- flags=0x%x", idx, filter->flags);
+	if (filter->flags & VFPF_Q_FILTER_VLAN_TAG_VALID)
+		DP_CONT(msglvl, ", vlan=%d", filter->vlan_tag);
+	if (filter->flags & VFPF_Q_FILTER_DEST_MAC_VALID)
+		DP_CONT(msglvl, ", MAC=%pM", filter->mac);
+	DP_CONT(msglvl, "\n");
+}
+
+static void bnx2x_vf_mbx_dp_q_filters(struct bnx2x *bp, int msglvl,
+				       struct vfpf_set_q_filters_tlv *filters)
+{
+	int i;
+
+	if (filters->flags & VFPF_SET_Q_FILTERS_MAC_VLAN_CHANGED)
+		for (i = 0; i < filters->n_mac_vlan_filters; i++)
+			bnx2x_vf_mbx_dp_q_filter(bp, msglvl, i,
+						 &filters->filters[i]);
+
+	if (filters->flags & VFPF_SET_Q_FILTERS_RX_MASK_CHANGED)
+		DP(msglvl, "RX-MASK=0x%x\n", filters->rx_mask);
+
+	if (filters->flags & VFPF_SET_Q_FILTERS_MULTICAST_CHANGED)
+		for (i = 0; i < filters->n_multicast; i++)
+			DP(msglvl, "MULTICAST=%pM\n", filters->multicast[i]);
+}
+
+#define VFPF_MAC_FILTER		VFPF_Q_FILTER_DEST_MAC_VALID
+#define VFPF_VLAN_FILTER	VFPF_Q_FILTER_VLAN_TAG_VALID
+
+static void bnx2x_vfop_mbx_qfilters(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	int rc;
+
+	struct vfpf_set_q_filters_tlv *msg =
+		&BP_VF_MBX(bp, vf->index)->msg->req.set_q_filters;
+
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	enum bnx2x_vfop_filters_state state = vfop->state;
+
+	struct bnx2x_vfop_cmd cmd = {
+		.done = bnx2x_vfop_mbx_qfilters,
+		.block = false,
+	};
+
+	DP(BNX2X_MSG_IOV, "STATE: %d\n", state);
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	switch (state) {
+	case BNX2X_VFOP_MBX_Q_FILTERS_MACS:
+		/* next state */
+		vfop->state = BNX2X_VFOP_MBX_Q_FILTERS_VLANS;
+
+		/* check for any vlan/mac changes */
+		if (msg->flags & VFPF_SET_Q_FILTERS_MAC_VLAN_CHANGED) {
+
+			/* build mac list */
+			struct bnx2x_vfop_filters *fl = NULL;
+			vfop->rc = bnx2x_vf_mbx_macvlan_list(bp, vf, msg, &fl,
+							     VFPF_MAC_FILTER);
+			if (vfop->rc)
+				goto op_err;
+
+			if (fl) {
+				/* set mac list */
+				rc = bnx2x_vfop_mac_list_cmd(bp, vf, &cmd, fl,
+							     msg->vf_qid,
+							     false);
+				if (rc) {
+					vfop->rc = rc;
+					goto op_err;
+				}
+				return;
+			}
+		}
+		/* fall through */
+
+	case BNX2X_VFOP_MBX_Q_FILTERS_VLANS:
+		/* next state */
+		vfop->state = BNX2X_VFOP_MBX_Q_FILTERS_RXMODE;
+
+		/* check for any vlan/mac changes */
+		if (msg->flags & VFPF_SET_Q_FILTERS_MAC_VLAN_CHANGED) {
+
+			/* build vlan list */
+			struct bnx2x_vfop_filters *fl = NULL;
+			vfop->rc = bnx2x_vf_mbx_macvlan_list(bp, vf, msg, &fl,
+							     VFPF_VLAN_FILTER);
+			if (vfop->rc)
+				goto op_err;
+
+			if (fl) {
+				/* set vlan list */
+				rc = bnx2x_vfop_vlan_list_cmd(bp, vf, &cmd, fl,
+							      msg->vf_qid,
+							      false);
+				if (rc) {
+					vfop->rc = rc;
+					goto op_err;
+				}
+				return;
+			}
+		}
+		/* fall through */
+
+	case BNX2X_VFOP_MBX_Q_FILTERS_RXMODE:
+		/* next state */
+		vfop->state = BNX2X_VFOP_MBX_Q_FILTERS_MCAST;
+
+		if (msg->flags & VFPF_SET_Q_FILTERS_RX_MASK_CHANGED) {
+			unsigned long accept = 0;
+
+			/* covert VF-PF if mask to bnx2x accept flags */
+			if (msg->rx_mask & VFPF_RX_MASK_ACCEPT_MATCHED_UNICAST)
+				__set_bit(BNX2X_ACCEPT_UNICAST, &accept);
+
+			if (msg->rx_mask &
+					VFPF_RX_MASK_ACCEPT_MATCHED_MULTICAST)
+				__set_bit(BNX2X_ACCEPT_MULTICAST, &accept);
+
+			if (msg->rx_mask & VFPF_RX_MASK_ACCEPT_ALL_UNICAST)
+				__set_bit(BNX2X_ACCEPT_ALL_UNICAST, &accept);
+
+			if (msg->rx_mask & VFPF_RX_MASK_ACCEPT_ALL_MULTICAST)
+				__set_bit(BNX2X_ACCEPT_ALL_MULTICAST, &accept);
+
+			if (msg->rx_mask & VFPF_RX_MASK_ACCEPT_BROADCAST)
+				__set_bit(BNX2X_ACCEPT_BROADCAST, &accept);
+
+			/* A packet arriving the vf's mac should be accepted
+			 * with any vlan
+			 */
+			__set_bit(BNX2X_ACCEPT_ANY_VLAN, &accept);
+
+			/* set rx-mode */
+			rc = bnx2x_vfop_rxmode_cmd(bp, vf, &cmd,
+						   msg->vf_qid, accept);
+			if (rc) {
+				vfop->rc = rc;
+				goto op_err;
+			}
+			return;
+		}
+		/* fall through */
+
+	case BNX2X_VFOP_MBX_Q_FILTERS_MCAST:
+		/* next state */
+		vfop->state = BNX2X_VFOP_MBX_Q_FILTERS_DONE;
+
+		if (msg->flags & VFPF_SET_Q_FILTERS_MULTICAST_CHANGED) {
+			/* set mcasts */
+			rc = bnx2x_vfop_mcast_cmd(bp, vf, &cmd, msg->multicast,
+						  msg->n_multicast, false);
+			if (rc) {
+				vfop->rc = rc;
+				goto op_err;
+			}
+			return;
+		}
+		/* fall through */
+op_done:
+	case BNX2X_VFOP_MBX_Q_FILTERS_DONE:
+		bnx2x_vfop_end(bp, vf, vfop);
+		return;
+op_err:
+	BNX2X_ERR("QFILTERS[%d:%d] error: rc %d\n",
+		  vf->abs_vfid, msg->vf_qid, vfop->rc);
+	goto op_done;
+
+	default:
+		bnx2x_vfop_default(state);
+	}
+}
+
+static int bnx2x_vfop_mbx_qfilters_cmd(struct bnx2x *bp,
+					struct bnx2x_virtf *vf,
+					struct bnx2x_vfop_cmd *cmd)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+	if (vfop) {
+		bnx2x_vfop_opset(BNX2X_VFOP_MBX_Q_FILTERS_MACS,
+				 bnx2x_vfop_mbx_qfilters, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_mbx_qfilters,
+					     cmd->block);
+	}
+	return -ENOMEM;
+}
+
+static void bnx2x_vf_mbx_set_q_filters(struct bnx2x *bp,
+				       struct bnx2x_virtf *vf,
+				       struct bnx2x_vf_mbx *mbx)
+{
+	struct vfpf_set_q_filters_tlv *filters = &mbx->msg->req.set_q_filters;
+	struct bnx2x_vfop_cmd cmd = {
+		.done = bnx2x_vf_mbx_resp,
+		.block = false,
+	};
+
+	/* verify vf_qid */
+	if (filters->vf_qid > vf_rxq_count(vf))
+		goto response;
+
+	DP(BNX2X_MSG_IOV, "VF[%d] Q_FILTERS: queue[%d]\n",
+	   vf->abs_vfid,
+	   filters->vf_qid);
+
+	/* print q_filter message */
+	bnx2x_vf_mbx_dp_q_filters(bp, BNX2X_MSG_IOV, filters);
+
+	vf->op_rc = bnx2x_vfop_mbx_qfilters_cmd(bp, vf, &cmd);
+	if (vf->op_rc)
+		goto response;
+	return;
+
+response:
+	bnx2x_vf_mbx_resp(bp, vf);
+}
+
 /* dispatch request */
 static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 				  struct bnx2x_vf_mbx *mbx)
@@ -543,6 +817,9 @@ static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		case CHANNEL_TLV_SETUP_Q:
 			bnx2x_vf_mbx_setup_q(bp, vf, mbx);
 			break;
+		case CHANNEL_TLV_SET_Q_FILTERS:
+			bnx2x_vf_mbx_set_q_filters(bp, vf, mbx);
+			break;
 		}
 
 	/* unknown TLV - this may belong to a VF driver from the future - a
-- 
1.7.9.GIT

^ permalink raw reply related

* [PATCH net-next v3 15/22] bnx2x: Support of PF driver of a VF setup_q request
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

Upon receiving a 'setup_q' request from the VF over the VF <-> PF
channel the PF driver will open a corresponding queue in the
device. The PF driver configures the queue with appropriate mac
address, vlan configuration, etc from the VF.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h       |    1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |   19 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c |  905 +++++++++++++++++----
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h |  175 ++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c  |  146 ++++
 5 files changed, 1070 insertions(+), 176 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index c97616c..a7d9fb0 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -969,6 +969,7 @@ extern struct workqueue_struct *bnx2x_wq;
 #define BNX2X_MAX_NUM_OF_VFS	64
 #define BNX2X_VF_CID_WND	0
 #define BNX2X_CIDS_PER_VF	(1 << BNX2X_VF_CID_WND)
+#define BNX2X_CLIENTS_PER_VF	1
 #define BNX2X_FIRST_VF_CID	256
 #define BNX2X_VF_CIDS		(BNX2X_MAX_NUM_OF_VFS * BNX2X_CIDS_PER_VF)
 #define BNX2X_VF_ID_INVALID	0xFF
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 7966d7f..0a00c4c 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -2015,7 +2015,7 @@ static void bnx2x_free_fw_stats_mem(struct bnx2x *bp)
 
 static int bnx2x_alloc_fw_stats_mem(struct bnx2x *bp)
 {
-	int num_groups;
+	int num_groups, vf_headroom = 0;
 	int is_fcoe_stats = NO_FCOE(bp) ? 0 : 1;
 
 	/* number of queues for statistics is number of eth queues + FCoE */
@@ -2027,18 +2027,27 @@ static int bnx2x_alloc_fw_stats_mem(struct bnx2x *bp)
 	 * for fcoe l2 queue if applicable)
 	 */
 	bp->fw_stats_num = 2 + is_fcoe_stats + num_queue_stats;
+
+	/* vf stats appear in the request list, but their data is allocated by
+	 * the VFs themselves. We don't include them in the bp->fw_stats_num as
+	 * it is used to determine where to place the vf stats queries in the
+	 * request struct
+	 */
+	if (IS_SRIOV(bp))
+		vf_headroom = bp->vfdb->sriov.nr_virtfn * BNX2X_CLIENTS_PER_VF;
+
 	/* Request is built from stats_query_header and an array of
 	 * stats_query_cmd_group each of which contains
 	 * STATS_QUERY_CMD_COUNT rules. The real number or requests is
 	 * configured in the stats_query_header.
 	 */
 	num_groups =
-		(((bp->fw_stats_num) / STATS_QUERY_CMD_COUNT) +
-		 (((bp->fw_stats_num) % STATS_QUERY_CMD_COUNT) ?
+		(((bp->fw_stats_num + vf_headroom) / STATS_QUERY_CMD_COUNT) +
+		 (((bp->fw_stats_num + vf_headroom) % STATS_QUERY_CMD_COUNT) ?
 		 1 : 0));
 
-	DP(BNX2X_MSG_SP, "stats fw_stats_num %d, num_groups %d",
-	   bp->fw_stats_num, num_groups);
+	DP(BNX2X_MSG_SP, "stats fw_stats_num %d, vf headroom %d, num_groups %d",
+	   bp->fw_stats_num, vf_headroom, num_groups);
 
 	bp->fw_stats_req_sz = sizeof(struct stats_query_header) +
 		num_groups * sizeof(struct stats_query_cmd_group);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index 52c3bd2..bde2f47 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -99,10 +99,234 @@ static void bnx2x_vf_igu_ack_sb(struct bnx2x *bp, struct bnx2x_virtf *vf,
 	mmiowb();
 	barrier();
 }
+/* VFOP - VF slow-path operation support */
+
+/* VFOP operations states */
+enum bnx2x_vfop_qctor_state {
+	   BNX2X_VFOP_QCTOR_INIT,
+	   BNX2X_VFOP_QCTOR_SETUP,
+	   BNX2X_VFOP_QCTOR_INT_EN
+};
+
+enum bnx2x_vfop_vlan_mac_state {
+	   BNX2X_VFOP_VLAN_MAC_CONFIG_SINGLE,
+	   BNX2X_VFOP_VLAN_MAC_CLEAR,
+	   BNX2X_VFOP_VLAN_MAC_CHK_DONE,
+	   BNX2X_VFOP_MAC_CONFIG_LIST,
+	   BNX2X_VFOP_VLAN_CONFIG_LIST,
+	   BNX2X_VFOP_VLAN_CONFIG_LIST_0
+};
+
+enum bnx2x_vfop_qsetup_state {
+	   BNX2X_VFOP_QSETUP_CTOR,
+	   BNX2X_VFOP_QSETUP_VLAN0,
+	   BNX2X_VFOP_QSETUP_DONE
+};
+
+#define bnx2x_vfop_reset_wq(vf)	atomic_set(&vf->op_in_progress, 0)
+
+void bnx2x_vfop_qctor_dump_tx(struct bnx2x *bp, struct bnx2x_virtf *vf,
+			      struct bnx2x_queue_init_params *init_params,
+			      struct bnx2x_queue_setup_params *setup_params,
+			      u16 q_idx, u16 sb_idx)
+{
+	DP(BNX2X_MSG_IOV,
+	   "VF[%d] Q_SETUP: txq[%d]-- vfsb=%d, sb-index=%d, hc-rate=%d, flags=0x%lx, traffic-type=%d",
+	   vf->abs_vfid,
+	   q_idx,
+	   sb_idx,
+	   init_params->tx.sb_cq_index,
+	   init_params->tx.hc_rate,
+	   setup_params->flags,
+	   setup_params->txq_params.traffic_type);
+}
 
-static int bnx2x_ari_enabled(struct pci_dev *dev)
+void bnx2x_vfop_qctor_dump_rx(struct bnx2x *bp, struct bnx2x_virtf *vf,
+			    struct bnx2x_queue_init_params *init_params,
+			    struct bnx2x_queue_setup_params *setup_params,
+			    u16 q_idx, u16 sb_idx)
 {
-	return dev->bus->self && dev->bus->self->ari_enabled;
+	struct bnx2x_rxq_setup_params *rxq_params = &setup_params->rxq_params;
+
+	DP(BNX2X_MSG_IOV, "VF[%d] Q_SETUP: rxq[%d]-- vfsb=%d, sb-index=%d, hc-rate=%d, mtu=%d, buf-size=%d\n"
+	   "sge-size=%d, max_sge_pkt=%d, tpa-agg-size=%d, flags=0x%lx, drop-flags=0x%x, cache-log=%d\n",
+	   vf->abs_vfid,
+	   q_idx,
+	   sb_idx,
+	   init_params->rx.sb_cq_index,
+	   init_params->rx.hc_rate,
+	   setup_params->gen_params.mtu,
+	   rxq_params->buf_sz,
+	   rxq_params->sge_buf_sz,
+	   rxq_params->max_sges_pkt,
+	   rxq_params->tpa_agg_sz,
+	   setup_params->flags,
+	   rxq_params->drop_flags,
+	   rxq_params->cache_line_log);
+}
+
+void bnx2x_vfop_qctor_prep(struct bnx2x *bp,
+			   struct bnx2x_virtf *vf,
+			   struct bnx2x_vf_queue *q,
+			   struct bnx2x_vfop_qctor_params *p,
+			   unsigned long q_type)
+{
+
+	struct bnx2x_queue_init_params *init_p = &p->qstate.params.init;
+	struct bnx2x_queue_setup_params *setup_p = &p->prep_qsetup;
+
+	/* INIT */
+
+	/* Enable host coalescing in the transition to INIT state */
+	if (test_bit(BNX2X_Q_FLG_HC, &init_p->rx.flags))
+		__set_bit(BNX2X_Q_FLG_HC_EN, &init_p->rx.flags);
+
+	if (test_bit(BNX2X_Q_FLG_HC, &init_p->tx.flags))
+		__set_bit(BNX2X_Q_FLG_HC_EN, &init_p->tx.flags);
+
+	/* FW SB ID */
+	init_p->rx.fw_sb_id = vf_igu_sb(vf, q->sb_idx);
+	init_p->tx.fw_sb_id = vf_igu_sb(vf, q->sb_idx);
+
+	/* context */
+	init_p->cxts[0] = q->cxt;
+
+	/* SETUP */
+
+	/* Setup-op general parameters */
+	setup_p->gen_params.spcl_id = vf->sp_cl_id;
+	setup_p->gen_params.stat_id = vfq_stat_id(vf, q);
+
+	/* Setup-op pause params:
+	 * Nothing to do, the pause thresholds are set by default to 0 which
+	 * effectively turns off the feature for this queue. We don't want
+	 * one queue (VF) to interfering with another queue (another VF)
+	 */
+	if (vf->cfg_flags & VF_CFG_FW_FC)
+		BNX2X_ERR("No support for pause to VFs (abs_vfid: %d)\n",
+			  vf->abs_vfid);
+	/* Setup-op flags:
+	 * collect statistics, zero statistics, local-switching, security,
+	 * OV for Flex10, RSS and MCAST for leading
+	 */
+	if (test_bit(BNX2X_Q_FLG_STATS, &setup_p->flags))
+		__set_bit(BNX2X_Q_FLG_ZERO_STATS, &setup_p->flags);
+
+	/* for VFs, enable tx switching, bd coherency, and mac address
+	 * anti-spoofing
+	 */
+	__set_bit(BNX2X_Q_FLG_TX_SWITCH, &setup_p->flags);
+	__set_bit(BNX2X_Q_FLG_TX_SEC, &setup_p->flags);
+	__set_bit(BNX2X_Q_FLG_ANTI_SPOOF, &setup_p->flags);
+
+	if (vfq_is_leading(q)) {
+		__set_bit(BNX2X_Q_FLG_LEADING_RSS, &setup_p->flags);
+		__set_bit(BNX2X_Q_FLG_MCAST, &setup_p->flags);
+	}
+
+	/* Setup-op rx parameters */
+	if (test_bit(BNX2X_Q_TYPE_HAS_RX, &q_type)) {
+		struct bnx2x_rxq_setup_params *rxq_p = &setup_p->rxq_params;
+
+		rxq_p->cl_qzone_id = vfq_qzone_id(vf, q);
+		rxq_p->fw_sb_id = vf_igu_sb(vf, q->sb_idx);
+		rxq_p->rss_engine_id = FW_VF_HANDLE(vf->abs_vfid);
+
+		if (test_bit(BNX2X_Q_FLG_TPA, &setup_p->flags))
+			rxq_p->max_tpa_queues = BNX2X_VF_MAX_TPA_AGG_QUEUES;
+	}
+
+	/* Setup-op tx parameters */
+	if (test_bit(BNX2X_Q_TYPE_HAS_TX, &q_type)) {
+		setup_p->txq_params.tss_leading_cl_id = vf->leading_rss;
+		setup_p->txq_params.fw_sb_id = vf_igu_sb(vf, q->sb_idx);
+	}
+}
+
+/* VFOP queue construction */
+static void bnx2x_vfop_qctor(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	struct bnx2x_vfop_args_qctor *args = &vfop->args.qctor;
+	struct bnx2x_queue_state_params *q_params = &vfop->op_p->qctor.qstate;
+	enum bnx2x_vfop_qctor_state state = vfop->state;
+
+	bnx2x_vfop_reset_wq(vf);
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	DP(BNX2X_MSG_IOV, "vf[%d] STATE: %d\n", vf->abs_vfid, state);
+
+	switch (state) {
+	case BNX2X_VFOP_QCTOR_INIT:
+
+		/* has this queue already been opened? */
+		if (bnx2x_get_q_logical_state(bp, q_params->q_obj) ==
+		    BNX2X_Q_LOGICAL_STATE_ACTIVE) {
+			DP(BNX2X_MSG_IOV,
+			   "Entered qctor but queue was already up. Aborting gracefully\n");
+			goto op_done;
+		}
+
+		/* next state */
+		vfop->state = BNX2X_VFOP_QCTOR_SETUP;
+
+		q_params->cmd = BNX2X_Q_CMD_INIT;
+		vfop->rc = bnx2x_queue_state_change(bp, q_params);
+
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_CONT);
+
+	case BNX2X_VFOP_QCTOR_SETUP:
+		/* next state */
+		vfop->state = BNX2X_VFOP_QCTOR_INT_EN;
+
+		/* copy pre-prepared setup params to the queue-state params */
+		vfop->op_p->qctor.qstate.params.setup =
+			vfop->op_p->qctor.prep_qsetup;
+
+		q_params->cmd = BNX2X_Q_CMD_SETUP;
+		vfop->rc = bnx2x_queue_state_change(bp, q_params);
+
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_CONT);
+
+	case BNX2X_VFOP_QCTOR_INT_EN:
+
+		/* enable interrupts */
+		bnx2x_vf_igu_ack_sb(bp, vf, vf_igu_sb(vf, args->sb_idx),
+				    USTORM_ID, 0, IGU_INT_ENABLE, 0);
+		goto op_done;
+	default:
+		bnx2x_vfop_default(state);
+	}
+op_err:
+	BNX2X_ERR("QCTOR[%d:%d] error: cmd %d, rc %d\n",
+		  vf->abs_vfid, args->qid, q_params->cmd, vfop->rc);
+op_done:
+	bnx2x_vfop_end(bp, vf, vfop);
+op_pending:
+	return;
+}
+
+static int bnx2x_vfop_qctor_cmd(struct bnx2x *bp,
+				struct bnx2x_virtf *vf,
+				struct bnx2x_vfop_cmd *cmd,
+				int qid)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+
+	if (vfop) {
+		vf->op_params.qctor.qstate.q_obj = &bnx2x_vfq(vf, qid, sp_obj);
+
+		vfop->args.qctor.qid = qid;
+		vfop->args.qctor.sb_idx = bnx2x_vfq(vf, qid, sb_idx);
+
+		bnx2x_vfop_opset(BNX2X_VFOP_QCTOR_INIT,
+				 bnx2x_vfop_qctor, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_qctor,
+					     cmd->block);
+	}
+	return -ENOMEM;
 }
 
 static void
@@ -116,225 +340,354 @@ bnx2x_vf_set_igu_info(struct bnx2x *bp, u8 igu_sb_id, u8 abs_vfid)
 	}
 }
 
-static void
-bnx2x_get_vf_igu_cam_info(struct bnx2x *bp)
+/* VFOP MAC/VLAN helpers */
+static inline void bnx2x_vfop_credit(struct bnx2x *bp,
+				     struct bnx2x_vfop *vfop,
+				     struct bnx2x_vlan_mac_obj *obj)
 {
-	int sb_id;
-	u32 val;
-	u8 fid;
+	struct bnx2x_vfop_args_filters *args = &vfop->args.filters;
 
-	/* IGU in normal mode - read CAM */
-	for (sb_id = 0; sb_id < IGU_REG_MAPPING_MEMORY_SIZE; sb_id++) {
-		val = REG_RD(bp, IGU_REG_MAPPING_MEMORY + sb_id * 4);
-		if (!(val & IGU_REG_MAPPING_MEMORY_VALID))
-			continue;
-		fid = GET_FIELD((val), IGU_REG_MAPPING_MEMORY_FID);
-		if (!(fid & IGU_FID_ENCODE_IS_PF))
-			bnx2x_vf_set_igu_info(bp, sb_id,
-					      (fid & IGU_FID_VF_NUM_MASK));
+	/* update credit only if there is no error
+	 * and a valid credit counter
+	 */
+	if (!vfop->rc && args->credit) {
+		int cnt = 0;
+		struct list_head *pos;
 
-		DP(BNX2X_MSG_IOV, "%s[%d], igu_sb_id=%d, msix=%d\n",
-		   ((fid & IGU_FID_ENCODE_IS_PF) ? "PF" : "VF"),
-		   ((fid & IGU_FID_ENCODE_IS_PF) ? (fid & IGU_FID_PF_NUM_MASK) :
-		   (fid & IGU_FID_VF_NUM_MASK)), sb_id,
-		   GET_FIELD((val), IGU_REG_MAPPING_MEMORY_VECTOR));
+		list_for_each(pos, &obj->head)
+			cnt++;
+
+		atomic_set(args->credit, cnt);
 	}
 }
 
-static void __bnx2x_iov_free_vfdb(struct bnx2x *bp)
+static int bnx2x_vfop_set_user_req(struct bnx2x *bp,
+				    struct bnx2x_vfop_filter *pos,
+				    struct bnx2x_vlan_mac_data *user_req)
 {
-	if (bp->vfdb) {
-		kfree(bp->vfdb->vfqs);
-		kfree(bp->vfdb->vfs);
-		kfree(bp->vfdb);
+	user_req->cmd = pos->add ? BNX2X_VLAN_MAC_ADD :
+		BNX2X_VLAN_MAC_DEL;
+
+	switch (pos->type) {
+	case BNX2X_VFOP_FILTER_MAC:
+		memcpy(user_req->u.mac.mac, pos->mac, ETH_ALEN);
+		break;
+	case BNX2X_VFOP_FILTER_VLAN:
+		user_req->u.vlan.vlan = pos->vid;
+		break;
+	default:
+		BNX2X_ERR("Invalid filter type, skipping\n");
+		return 1;
 	}
-	bp->vfdb = NULL;
+	return 0;
 }
 
-static int bnx2x_sriov_pci_cfg_info(struct bnx2x *bp, struct bnx2x_sriov *iov)
+static int
+bnx2x_vfop_config_vlan0(struct bnx2x *bp,
+			struct bnx2x_vlan_mac_ramrod_params *vlan_mac,
+			bool add)
 {
-	int pos;
-	struct pci_dev *dev = bp->pdev;
-
-	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV);
-	if (!pos) {
-		BNX2X_ERR("failed to find SRIOV capability in device\n");
-		return -ENODEV;
-	}
+	int rc;
 
-	iov->pos = pos;
-	DP(BNX2X_MSG_IOV, "sriov ext pos %d\n", pos);
-	pci_read_config_word(dev, pos + PCI_SRIOV_CTRL, &iov->ctrl);
-	pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &iov->total);
-	pci_read_config_word(dev, pos + PCI_SRIOV_INITIAL_VF, &iov->initial);
-	pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &iov->offset);
-	pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &iov->stride);
-	pci_read_config_dword(dev, pos + PCI_SRIOV_SUP_PGSIZE, &iov->pgsz);
-	pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
-	pci_read_config_byte(dev, pos + PCI_SRIOV_FUNC_LINK, &iov->link);
+	vlan_mac->user_req.cmd = add ? BNX2X_VLAN_MAC_ADD :
+		BNX2X_VLAN_MAC_DEL;
+	vlan_mac->user_req.u.vlan.vlan = 0;
 
-	return 0;
+	rc = bnx2x_config_vlan_mac(bp, vlan_mac);
+	if (rc == -EEXIST)
+		rc = 0;
+	return rc;
 }
 
-static int bnx2x_sriov_info(struct bnx2x *bp, struct bnx2x_sriov *iov)
+static int bnx2x_vfop_config_list(struct bnx2x *bp,
+				  struct bnx2x_vfop_filters *filters,
+				  struct bnx2x_vlan_mac_ramrod_params *vlan_mac)
 {
-	u32 val;
+	struct bnx2x_vfop_filter *pos, *tmp;
+	struct list_head rollback_list, *filters_list = &filters->head;
+	struct bnx2x_vlan_mac_data *user_req = &vlan_mac->user_req;
+	int rc = 0, cnt = 0;
 
-	/* read the SRIOV capability structure
-	 * The fields can be read via configuration read or
-	 * directly from the device (starting at offset PCICFG_OFFSET)
-	 */
-	if (bnx2x_sriov_pci_cfg_info(bp, iov))
-		return -ENODEV;
+	INIT_LIST_HEAD(&rollback_list);
 
-	/* get the number of SRIOV bars */
-	iov->nres = 0;
-
-	/* read the first_vfid */
-	val = REG_RD(bp, PCICFG_OFFSET + GRC_CONFIG_REG_PF_INIT_VF);
-	iov->first_vf_in_pf = ((val & GRC_CR_PF_INIT_VF_PF_FIRST_VF_NUM_MASK)
-			       * 8) - (BNX2X_MAX_NUM_OF_VFS * BP_PATH(bp));
+	list_for_each_entry_safe(pos, tmp, filters_list, link) {
+		if (bnx2x_vfop_set_user_req(bp, pos, user_req))
+			continue;
 
-	DP(BNX2X_MSG_IOV,
-	   "IOV info[%d]: first vf %d, nres %d, cap 0x%x, ctrl 0x%x, total %d, initial %d, num vfs %d, offset %d, stride %d, page size 0x%x\n",
-	   BP_FUNC(bp),
-	   iov->first_vf_in_pf, iov->nres, iov->cap, iov->ctrl, iov->total,
-	   iov->initial, iov->nr_virtfn, iov->offset, iov->stride, iov->pgsz);
+		rc = bnx2x_config_vlan_mac(bp, vlan_mac);
+		if (rc >= 0) {
+			cnt += pos->add ? 1 : -1;
+			list_del(&pos->link);
+			list_add(&pos->link, &rollback_list);
+			rc = 0;
+		} else if (rc == -EEXIST) {
+			rc = 0;
+		} else {
+			BNX2X_ERR("Failed to add a new vlan_mac command\n");
+			break;
+		}
+	}
 
-	return 0;
+	/* rollback if error or too many rules added */
+	if (rc || cnt > filters->add_cnt) {
+		BNX2X_ERR("error or too many rules added. Performing rollback\n");
+		list_for_each_entry_safe(pos, tmp, &rollback_list, link) {
+			pos->add = !pos->add;	/* reverse op */
+			bnx2x_vfop_set_user_req(bp, pos, user_req);
+			bnx2x_config_vlan_mac(bp, vlan_mac);
+			list_del(&pos->link);
+		}
+		cnt = 0;
+		if (!rc)
+			rc = -EINVAL;
+	}
+	filters->add_cnt = cnt;
+	return rc;
 }
 
-static u8 bnx2x_iov_get_max_queue_count(struct bnx2x *bp)
+/* VFOP set VLAN/MAC */
+static void bnx2x_vfop_vlan_mac(struct bnx2x *bp, struct bnx2x_virtf *vf)
 {
-	int i;
-	u8 queue_count = 0;
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	struct bnx2x_vlan_mac_ramrod_params *vlan_mac = &vfop->op_p->vlan_mac;
+	struct bnx2x_vlan_mac_obj *obj = vlan_mac->vlan_mac_obj;
+	struct bnx2x_vfop_filters *filters = vfop->args.filters.multi_filter;
 
-	if (IS_SRIOV(bp))
-		for_each_vf(bp, i)
-			queue_count += bnx2x_vf(bp, i, alloc_resc.num_sbs);
+	enum bnx2x_vfop_vlan_mac_state state = vfop->state;
 
-	return queue_count;
-}
+	if (vfop->rc < 0)
+		goto op_err;
 
-/* must be called after PF bars are mapped */
-int bnx2x_iov_init_one(struct bnx2x *bp, int int_mode_param,
-				 int num_vfs_param)
-{
-	int err, i, qcount;
-	struct bnx2x_sriov *iov;
-	struct pci_dev *dev = bp->pdev;
+	DP(BNX2X_MSG_IOV, "vf[%d] STATE: %d\n", vf->abs_vfid, state);
 
-	bp->vfdb = NULL;
+	bnx2x_vfop_reset_wq(vf);
 
-	/* verify sriov capability is present in configuration space */
-	if (!pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV)) {
-		DP(BNX2X_MSG_IOV, "no sriov - capability not found");
-		return 0;
-	}
+	switch (state) {
+	case BNX2X_VFOP_VLAN_MAC_CLEAR:
+		/* next state */
+		vfop->state = BNX2X_VFOP_VLAN_MAC_CHK_DONE;
 
-	/* verify is pf */
-	if (IS_VF(bp))
-		return 0;
+		/* do delete */
+		vfop->rc = obj->delete_all(bp, obj,
+					   &vlan_mac->user_req.vlan_mac_flags,
+					   &vlan_mac->ramrod_flags);
 
-	/* verify chip revision */
-	if (CHIP_IS_E1x(bp))
-		return 0;
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_DONE);
 
-	/* check if SRIOV support is turned off */
-	if (!num_vfs_param)
-		return 0;
+	case BNX2X_VFOP_VLAN_MAC_CONFIG_SINGLE:
+		/* next state */
+		vfop->state = BNX2X_VFOP_VLAN_MAC_CHK_DONE;
 
-	/* SRIOV assumes that num of PF CIDs < BNX2X_FIRST_VF_CID */
-	if (BNX2X_L2_MAX_CID(bp) >= BNX2X_FIRST_VF_CID) {
-		BNX2X_ERR("PF cids %d are overspilling into vf space (starts at %d). Abort SRIOV",
-			  BNX2X_L2_MAX_CID(bp), BNX2X_FIRST_VF_CID);
-		return 0;
-	}
+		/* do config */
+		vfop->rc = bnx2x_config_vlan_mac(bp, vlan_mac);
+		if (vfop->rc == -EEXIST)
+			vfop->rc = 0;
 
-	/* SRIOV can be enabled only with MSIX */
-	if (int_mode_param == BNX2X_INT_MODE_MSI ||
-	    int_mode_param == BNX2X_INT_MODE_INTX) {
-		BNX2X_ERR("Forced MSI/INTx mode is incompatible with SRIOV\n");
-		return 0;
-	}
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_DONE);
 
-	/* verify ari is enabled */
-	if (!bnx2x_ari_enabled(bp->pdev)) {
-		BNX2X_ERR("ARI not supported, SRIOV can not be enabled\n");
-		return 0;
-	}
+	case BNX2X_VFOP_VLAN_MAC_CHK_DONE:
+		vfop->rc = !!obj->raw.check_pending(&obj->raw);
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_DONE);
 
-	/* verify igu is in normal mode */
-	if (CHIP_INT_MODE_IS_BC(bp)) {
-		BNX2X_ERR("IGU not normal mode,  SRIOV can not be enabled\n");
-		return 0;
-	}
+	case BNX2X_VFOP_MAC_CONFIG_LIST:
+		/* next state */
+		vfop->state = BNX2X_VFOP_VLAN_MAC_CHK_DONE;
 
-	/* allocate the vfs database */
-	bp->vfdb = kzalloc(sizeof(*(bp->vfdb)), GFP_KERNEL);
-	if (!bp->vfdb) {
-		BNX2X_ERR("failed to allocate vf database\n");
-		err = -ENOMEM;
-		goto failed;
+		/* do list config */
+		vfop->rc = bnx2x_vfop_config_list(bp, filters, vlan_mac);
+		if (vfop->rc)
+			goto op_err;
+
+		set_bit(RAMROD_CONT, &vlan_mac->ramrod_flags);
+		vfop->rc = bnx2x_config_vlan_mac(bp, vlan_mac);
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_DONE);
+
+	case BNX2X_VFOP_VLAN_CONFIG_LIST:
+		/* next state */
+		vfop->state = BNX2X_VFOP_VLAN_CONFIG_LIST_0;
+
+		/* remove vlan0 - could be no-op */
+		vfop->rc = bnx2x_vfop_config_vlan0(bp, vlan_mac, false);
+		if (vfop->rc)
+			goto op_err;
+
+		/* Do vlan list config. if this operation fails we try to
+		 * restore vlan0 to keep the queue is working order
+		 */
+		vfop->rc = bnx2x_vfop_config_list(bp, filters, vlan_mac);
+		if (!vfop->rc) {
+			set_bit(RAMROD_CONT, &vlan_mac->ramrod_flags);
+			vfop->rc = bnx2x_config_vlan_mac(bp, vlan_mac);
+		}
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_CONT); /* fall-through */
+
+	case BNX2X_VFOP_VLAN_CONFIG_LIST_0:
+		/* next state */
+		vfop->state = BNX2X_VFOP_VLAN_MAC_CHK_DONE;
+
+		if (list_empty(&obj->head))
+			/* add vlan0 */
+			vfop->rc = bnx2x_vfop_config_vlan0(bp, vlan_mac, true);
+		bnx2x_vfop_finalize(vf, vfop->rc, VFOP_DONE);
+
+	default:
+		bnx2x_vfop_default(state);
 	}
+op_err:
+	BNX2X_ERR("VLAN-MAC error: rc %d\n", vfop->rc);
+op_done:
+	kfree(filters);
+	bnx2x_vfop_credit(bp, vfop, obj);
+	bnx2x_vfop_end(bp, vf, vfop);
+op_pending:
+	return;
+}
 
-	/* get the sriov info - Linux already collected all the pertinent
-	 * information, however the sriov structure is for the private use
-	 * of the pci module. Also we want this information regardless
-	 *  of the hyper-visor.
-	 */
-	iov = &(bp->vfdb->sriov);
-	err = bnx2x_sriov_info(bp, iov);
-	if (err)
-		goto failed;
+struct bnx2x_vfop_vlan_mac_flags {
+	bool drv_only;
+	bool dont_consume;
+	bool single_cmd;
+	bool add;
+};
 
-	/* SR-IOV capability was enabled but there are no VFs*/
-	if (iov->total == 0)
-		goto failed;
+static void
+bnx2x_vfop_vlan_mac_prep_ramrod(struct bnx2x_vlan_mac_ramrod_params *ramrod,
+				struct bnx2x_vfop_vlan_mac_flags *flags)
+{
+	struct bnx2x_vlan_mac_data *ureq = &ramrod->user_req;
 
-	/* calcuate the actual number of VFs */
-	iov->nr_virtfn = min_t(u16, iov->total, (u16)num_vfs_param);
+	memset(ramrod, 0, sizeof(*ramrod));
 
-	/* allcate the vf array */
-	bp->vfdb->vfs = kzalloc(sizeof(struct bnx2x_virtf) *
-				BNX2X_NR_VIRTFN(bp), GFP_KERNEL);
-	if (!bp->vfdb->vfs) {
-		BNX2X_ERR("failed to allocate vf array\n");
-		err = -ENOMEM;
-		goto failed;
+	/* ramrod flags */
+	if (flags->drv_only)
+		set_bit(RAMROD_DRV_CLR_ONLY, &ramrod->ramrod_flags);
+	if (flags->single_cmd)
+		set_bit(RAMROD_EXEC, &ramrod->ramrod_flags);
+
+	/* mac_vlan flags */
+	if (flags->dont_consume)
+		set_bit(BNX2X_DONT_CONSUME_CAM_CREDIT, &ureq->vlan_mac_flags);
+
+	/* cmd */
+	ureq->cmd = flags->add ? BNX2X_VLAN_MAC_ADD : BNX2X_VLAN_MAC_DEL;
+}
+
+int bnx2x_vfop_vlan_set_cmd(struct bnx2x *bp,
+			    struct bnx2x_virtf *vf,
+			    struct bnx2x_vfop_cmd *cmd,
+			    int qid, u16 vid, bool add)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
+
+	if (vfop) {
+		struct bnx2x_vfop_args_filters filters = {
+			.multi_filter = NULL, /* single command */
+			.credit = &bnx2x_vfq(vf, qid, vlan_count),
+		};
+		struct bnx2x_vfop_vlan_mac_flags flags = {
+			.drv_only = false,
+			.dont_consume = (filters.credit != NULL),
+			.single_cmd = true,
+			.add = add,
+		};
+		struct bnx2x_vlan_mac_ramrod_params *ramrod =
+			&vf->op_params.vlan_mac;
+
+		/* set ramrod params */
+		bnx2x_vfop_vlan_mac_prep_ramrod(ramrod, &flags);
+		ramrod->user_req.u.vlan.vlan = vid;
+
+		/* set object */
+		ramrod->vlan_mac_obj = &bnx2x_vfq(vf, qid, vlan_obj);
+
+		/* set extra args */
+		vfop->args.filters = filters;
+
+		bnx2x_vfop_opset(BNX2X_VFOP_VLAN_MAC_CONFIG_SINGLE,
+				 bnx2x_vfop_vlan_mac, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_vlan_mac,
+					     cmd->block);
 	}
+	return -ENOMEM;
+}
 
-	/* Initial VF init - index and abs_vfid - nr_virtfn must be set */
-	for_each_vf(bp, i) {
-		bnx2x_vf(bp, i, index) = i;
-		bnx2x_vf(bp, i, abs_vfid) = iov->first_vf_in_pf + i;
-		bnx2x_vf(bp, i, state) = VF_FREE;
-		INIT_LIST_HEAD(&bnx2x_vf(bp, i, op_list_head));
-		mutex_init(&bnx2x_vf(bp, i, op_mutex));
-		bnx2x_vf(bp, i, op_current) = CHANNEL_TLV_NONE;
+/* VFOP queue setup (queue constructor + set vlan 0) */
+static void bnx2x_vfop_qsetup(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_cur(bp, vf);
+	int qid = vfop->args.qctor.qid;
+	enum bnx2x_vfop_qsetup_state state = vfop->state;
+	struct bnx2x_vfop_cmd cmd = {
+		.done = bnx2x_vfop_qsetup,
+		.block = false,
+	};
+
+	if (vfop->rc < 0)
+		goto op_err;
+
+	DP(BNX2X_MSG_IOV, "vf[%d] STATE: %d\n", vf->abs_vfid, state);
+
+	switch (state) {
+	case BNX2X_VFOP_QSETUP_CTOR:
+		/* init the queue ctor command */
+		vfop->state = BNX2X_VFOP_QSETUP_VLAN0;
+		vfop->rc = bnx2x_vfop_qctor_cmd(bp, vf, &cmd, qid);
+		if (vfop->rc)
+			goto op_err;
+		return;
+
+	case BNX2X_VFOP_QSETUP_VLAN0:
+		/* skip if non-leading or FPGA/EMU*/
+		if (qid || CHIP_REV_IS_SLOW(bp))
+			goto op_done;
+
+		/* init the queue set-vlan command (for vlan 0) */
+		vfop->state = BNX2X_VFOP_QSETUP_DONE;
+		vfop->rc = bnx2x_vfop_vlan_set_cmd(bp, vf, &cmd, qid, 0, true);
+		if (vfop->rc)
+			goto op_err;
+		return;
+op_err:
+	BNX2X_ERR("QSETUP[%d:%d] error: rc %d\n", vf->abs_vfid, qid, vfop->rc);
+op_done:
+	case BNX2X_VFOP_QSETUP_DONE:
+		bnx2x_vfop_end(bp, vf, vfop);
+		return;
+	default:
+		bnx2x_vfop_default(state);
 	}
+}
 
-	/* re-read the IGU CAM for VFs - index and abs_vfid must be set */
-	bnx2x_get_vf_igu_cam_info(bp);
+int bnx2x_vfop_qsetup_cmd(struct bnx2x *bp,
+			  struct bnx2x_virtf *vf,
+			  struct bnx2x_vfop_cmd *cmd,
+			  int qid)
+{
+	struct bnx2x_vfop *vfop = bnx2x_vfop_add(bp, vf);
 
-	/* get the total queue count and allocate the global queue arrays */
-	qcount = bnx2x_iov_get_max_queue_count(bp);
+	if (vfop) {
+		vfop->args.qctor.qid = qid;
 
-	/* allocate the queue arrays for all VFs */
-	bp->vfdb->vfqs = kzalloc(qcount * sizeof(struct bnx2x_vf_queue),
-				 GFP_KERNEL);
-	if (!bp->vfdb->vfqs) {
-		BNX2X_ERR("failed to allocate vf queue array\n");
-		err = -ENOMEM;
-		goto failed;
+		bnx2x_vfop_opset(BNX2X_VFOP_QSETUP_CTOR,
+				 bnx2x_vfop_qsetup, cmd->done);
+		return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_qsetup,
+					     cmd->block);
 	}
+	return -ENOMEM;
+}
 
-	return 0;
-failed:
-	DP(BNX2X_MSG_IOV, "Failed err=%d\n", err);
-	__bnx2x_iov_free_vfdb(bp);
-	return err;
+static u8 bnx2x_iov_get_max_queue_count(struct bnx2x *bp)
+{
+	int i;
+	u8 queue_count = 0;
+
+	if (IS_SRIOV(bp))
+		for_each_vf(bp, i)
+			queue_count += bnx2x_vf(bp, i, alloc_resc.num_sbs);
+
+	return queue_count;
 }
+
 /* VF enable primitives
  *
  * when pretend is required the caller is responsible
@@ -608,6 +961,216 @@ static void bnx2x_vf_set_bars(struct bnx2x *bp, struct bnx2x_virtf *vf)
 	}
 }
 
+static int bnx2x_ari_enabled(struct pci_dev *dev)
+{
+	return dev->bus->self && dev->bus->self->ari_enabled;
+}
+
+static void
+bnx2x_get_vf_igu_cam_info(struct bnx2x *bp)
+{
+	int sb_id;
+	u32 val;
+	u8 fid;
+
+	/* IGU in normal mode - read CAM */
+	for (sb_id = 0; sb_id < IGU_REG_MAPPING_MEMORY_SIZE; sb_id++) {
+		val = REG_RD(bp, IGU_REG_MAPPING_MEMORY + sb_id * 4);
+		if (!(val & IGU_REG_MAPPING_MEMORY_VALID))
+			continue;
+		fid = GET_FIELD((val), IGU_REG_MAPPING_MEMORY_FID);
+		if (!(fid & IGU_FID_ENCODE_IS_PF))
+			bnx2x_vf_set_igu_info(bp, sb_id,
+					      (fid & IGU_FID_VF_NUM_MASK));
+
+		DP(BNX2X_MSG_IOV, "%s[%d], igu_sb_id=%d, msix=%d\n",
+		   ((fid & IGU_FID_ENCODE_IS_PF) ? "PF" : "VF"),
+		   ((fid & IGU_FID_ENCODE_IS_PF) ? (fid & IGU_FID_PF_NUM_MASK) :
+		   (fid & IGU_FID_VF_NUM_MASK)), sb_id,
+		   GET_FIELD((val), IGU_REG_MAPPING_MEMORY_VECTOR));
+	}
+}
+
+static void __bnx2x_iov_free_vfdb(struct bnx2x *bp)
+{
+	if (bp->vfdb) {
+		kfree(bp->vfdb->vfqs);
+		kfree(bp->vfdb->vfs);
+		kfree(bp->vfdb);
+	}
+	bp->vfdb = NULL;
+}
+
+static int bnx2x_sriov_pci_cfg_info(struct bnx2x *bp, struct bnx2x_sriov *iov)
+{
+	int pos;
+	struct pci_dev *dev = bp->pdev;
+
+	pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV);
+	if (!pos) {
+		BNX2X_ERR("failed to find SRIOV capability in device\n");
+		return -ENODEV;
+	}
+
+	iov->pos = pos;
+	DP(BNX2X_MSG_IOV, "sriov ext pos %d\n", pos);
+	pci_read_config_word(dev, pos + PCI_SRIOV_CTRL, &iov->ctrl);
+	pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &iov->total);
+	pci_read_config_word(dev, pos + PCI_SRIOV_INITIAL_VF, &iov->initial);
+	pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &iov->offset);
+	pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &iov->stride);
+	pci_read_config_dword(dev, pos + PCI_SRIOV_SUP_PGSIZE, &iov->pgsz);
+	pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap);
+	pci_read_config_byte(dev, pos + PCI_SRIOV_FUNC_LINK, &iov->link);
+
+	return 0;
+}
+
+static int bnx2x_sriov_info(struct bnx2x *bp, struct bnx2x_sriov *iov)
+{
+	u32 val;
+
+	/* read the SRIOV capability structure
+	 * The fields can be read via configuration read or
+	 * directly from the device (starting at offset PCICFG_OFFSET)
+	 */
+	if (bnx2x_sriov_pci_cfg_info(bp, iov))
+		return -ENODEV;
+
+	/* get the number of SRIOV bars */
+	iov->nres = 0;
+
+	/* read the first_vfid */
+	val = REG_RD(bp, PCICFG_OFFSET + GRC_CONFIG_REG_PF_INIT_VF);
+	iov->first_vf_in_pf = ((val & GRC_CR_PF_INIT_VF_PF_FIRST_VF_NUM_MASK)
+			       * 8) - (BNX2X_MAX_NUM_OF_VFS * BP_PATH(bp));
+
+	DP(BNX2X_MSG_IOV,
+	   "IOV info[%d]: first vf %d, nres %d, cap 0x%x, ctrl 0x%x, total %d, initial %d, num vfs %d, offset %d, stride %d, page size 0x%x\n",
+	   BP_FUNC(bp),
+	   iov->first_vf_in_pf, iov->nres, iov->cap, iov->ctrl, iov->total,
+	   iov->initial, iov->nr_virtfn, iov->offset, iov->stride, iov->pgsz);
+
+	return 0;
+}
+
+/* must be called after PF bars are mapped */
+int bnx2x_iov_init_one(struct bnx2x *bp, int int_mode_param,
+				 int num_vfs_param)
+{
+	int err, i, qcount;
+	struct bnx2x_sriov *iov;
+	struct pci_dev *dev = bp->pdev;
+
+	bp->vfdb = NULL;
+
+	/* verify is pf */
+	if (IS_VF(bp))
+		return 0;
+
+	/* verify sriov capability is present in configuration space */
+	if (!pci_find_ext_capability(dev, PCI_EXT_CAP_ID_SRIOV))
+		return 0;
+
+	/* verify chip revision */
+	if (CHIP_IS_E1x(bp))
+		return 0;
+
+	/* check if SRIOV support is turned off */
+	if (!num_vfs_param)
+		return 0;
+
+	/* SRIOV assumes that num of PF CIDs < BNX2X_FIRST_VF_CID */
+	if (BNX2X_L2_MAX_CID(bp) >= BNX2X_FIRST_VF_CID) {
+		BNX2X_ERR("PF cids %d are overspilling into vf space (starts at %d). Abort SRIOV",
+			  BNX2X_L2_MAX_CID(bp), BNX2X_FIRST_VF_CID);
+		return 0;
+	}
+
+	/* SRIOV can be enabled only with MSIX */
+	if (int_mode_param == BNX2X_INT_MODE_MSI ||
+	    int_mode_param == BNX2X_INT_MODE_INTX)
+		BNX2X_ERR("Forced MSI/INTx mode is incompatible with SRIOV\n");
+
+	err = -EIO;
+	/* verify ari is enabled */
+	if (!bnx2x_ari_enabled(bp->pdev)) {
+		BNX2X_ERR("ARI not supported, SRIOV can not be enabled\n");
+		return err;
+	}
+
+	/* verify igu is in normal mode */
+	if (CHIP_INT_MODE_IS_BC(bp)) {
+		BNX2X_ERR("IGU not normal mode,  SRIOV can not be enabled\n");
+		return err;
+	}
+
+	/* allocate the vfs database */
+	bp->vfdb = kzalloc(sizeof(*(bp->vfdb)), GFP_KERNEL);
+	if (!bp->vfdb) {
+		BNX2X_ERR("failed to allocate vf database\n");
+		err = -ENOMEM;
+		goto failed;
+	}
+
+	/* get the sriov info - Linux already collected all the pertinent
+	 * information, however the sriov structure is for the private use
+	 * of the pci module. Also we want this information regardless
+	 * of the hyper-visor.
+	 */
+	iov = &(bp->vfdb->sriov);
+	err = bnx2x_sriov_info(bp, iov);
+	if (err)
+		goto failed;
+
+	/* SR-IOV capability was enabled but there are no VFs*/
+	if (iov->total == 0)
+		goto failed;
+
+	/* calculate the actual number of VFs */
+	iov->nr_virtfn = min_t(u16, iov->total, (u16)num_vfs_param);
+
+	/* allocate the vf array */
+	bp->vfdb->vfs = kzalloc(sizeof(struct bnx2x_virtf) *
+				BNX2X_NR_VIRTFN(bp), GFP_KERNEL);
+	if (!bp->vfdb->vfs) {
+		BNX2X_ERR("failed to allocate vf array\n");
+		err = -ENOMEM;
+		goto failed;
+	}
+
+	/* Initial VF init - index and abs_vfid - nr_virtfn must be set */
+	for_each_vf(bp, i) {
+		bnx2x_vf(bp, i, index) = i;
+		bnx2x_vf(bp, i, abs_vfid) = iov->first_vf_in_pf + i;
+		bnx2x_vf(bp, i, state) = VF_FREE;
+		INIT_LIST_HEAD(&bnx2x_vf(bp, i, op_list_head));
+		mutex_init(&bnx2x_vf(bp, i, op_mutex));
+		bnx2x_vf(bp, i, op_current) = CHANNEL_TLV_NONE;
+	}
+
+	/* re-read the IGU CAM for VFs - index and abs_vfid must be set */
+	bnx2x_get_vf_igu_cam_info(bp);
+
+	/* get the total queue count and allocate the global queue arrays */
+	qcount = bnx2x_iov_get_max_queue_count(bp);
+
+	/* allocate the queue arrays for all VFs */
+	bp->vfdb->vfqs = kzalloc(qcount * sizeof(struct bnx2x_vf_queue),
+				 GFP_KERNEL);
+	if (!bp->vfdb->vfqs) {
+		BNX2X_ERR("failed to allocate vf queue array\n");
+		err = -ENOMEM;
+		goto failed;
+	}
+
+	return 0;
+failed:
+	DP(BNX2X_MSG_IOV, "Failed err=%d\n", err);
+	__bnx2x_iov_free_vfdb(bp);
+	return err;
+}
+
 void bnx2x_iov_remove_one(struct bnx2x *bp)
 {
 	/* if SRIOV is not enabled there's nothing to do */
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index d86b289..3b758d4 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -26,6 +26,8 @@
  * The VF array is indexed by the relative vfid.
  */
 #define BNX2X_VF_MAX_QUEUES		16
+#define BNX2X_VF_MAX_TPA_AGG_QUEUES	8
+
 struct bnx2x_sriov {
 	u32 first_vf_in_pf;
 
@@ -91,6 +93,11 @@ struct bnx2x_virtf;
 /* VFOP definitions */
 typedef void (*vfop_handler_t)(struct bnx2x *bp, struct bnx2x_virtf *vf);
 
+struct bnx2x_vfop_cmd {
+	vfop_handler_t done;
+	bool block;
+};
+
 /* VFOP queue filters command additional arguments */
 struct bnx2x_vfop_filter {
 	struct list_head link;
@@ -406,6 +413,11 @@ static u8 vfq_cl_id(struct bnx2x_virtf *vf, struct bnx2x_vf_queue *q)
 	return vf->igu_base_id + q->index;
 }
 
+static inline u8 vfq_stat_id(struct bnx2x_virtf *vf, struct bnx2x_vf_queue *q)
+{
+	return vfq_cl_id(vf, q);
+}
+
 static inline u8 vfq_qzone_id(struct bnx2x_virtf *vf, struct bnx2x_vf_queue *q)
 {
 	return vfq_cl_id(vf, q);
@@ -438,6 +450,44 @@ int bnx2x_vf_acquire(struct bnx2x *bp, struct bnx2x_virtf *vf,
 int bnx2x_vf_init(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		  dma_addr_t *sb_map);
 
+/* VFOP generic helpers */
+#define bnx2x_vfop_default(state) do {				\
+		BNX2X_ERR("Bad state %d\n", (state));		\
+		vfop->rc = -EINVAL;				\
+		goto op_err;					\
+	} while (0)
+
+enum {
+	VFOP_DONE,
+	VFOP_CONT,
+	VFOP_VERIFY_PEND,
+};
+
+#define bnx2x_vfop_finalize(vf, rc, next) do {				\
+		if ((rc) < 0)						\
+			goto op_err;					\
+		else if ((rc) > 0)					\
+			goto op_pending;				\
+		else if ((next) == VFOP_DONE)				\
+			goto op_done;					\
+		else if ((next) == VFOP_VERIFY_PEND)			\
+			BNX2X_ERR("expected pending");			\
+		else {							\
+			DP(BNX2X_MSG_IOV, "no ramrod. scheduling");	\
+			atomic_set(&vf->op_in_progress, 1);		\
+			queue_delayed_work(bnx2x_wq, &bp->sp_task, 0);  \
+			return;						\
+		}							\
+	} while (0)
+
+#define bnx2x_vfop_opset(first_state, trans_hndlr, done_hndlr)		\
+	do {								\
+		vfop->state = first_state;				\
+		vfop->op_p = &vf->op_params;				\
+		vfop->transition = trans_hndlr;				\
+		vfop->done = done_hndlr;				\
+	} while (0)
+
 static inline struct bnx2x_vfop *bnx2x_vfop_cur(struct bnx2x *bp,
 						struct bnx2x_virtf *vf)
 {
@@ -446,6 +496,131 @@ static inline struct bnx2x_vfop *bnx2x_vfop_cur(struct bnx2x *bp,
 	return list_first_entry(&vf->op_list_head, struct bnx2x_vfop, link);
 }
 
+static inline struct bnx2x_vfop *bnx2x_vfop_add(struct bnx2x *bp,
+						struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vfop *vfop = kzalloc(sizeof(*vfop), GFP_KERNEL);
+
+	WARN(!mutex_is_locked(&vf->op_mutex), "about to access vf op linked list but mutex was not locked!");
+	if (vfop) {
+		INIT_LIST_HEAD(&vfop->link);
+		list_add(&vfop->link, &vf->op_list_head);
+	}
+	return vfop;
+}
+
+static inline void bnx2x_vfop_end(struct bnx2x *bp, struct bnx2x_virtf *vf,
+				  struct bnx2x_vfop *vfop)
+{
+	/* rc < 0 - error, otherwise set to 0 */
+	DP(BNX2X_MSG_IOV, "rc was %d\n", vfop->rc);
+	if (vfop->rc >= 0)
+		vfop->rc = 0;
+	DP(BNX2X_MSG_IOV, "rc is now %d\n", vfop->rc);
+
+	/* unlink the current op context and propagate error code
+	 * must be done before invoking the 'done()' handler
+	 */
+	WARN(!mutex_is_locked(&vf->op_mutex),
+	     "about to access vf op linked list but mutex was not locked!");
+	list_del(&vfop->link);
+
+	if (list_empty(&vf->op_list_head)) {
+		DP(BNX2X_MSG_IOV, "list was empty %d\n", vfop->rc);
+		vf->op_rc = vfop->rc;
+		DP(BNX2X_MSG_IOV, "copying rc vf->op_rc %d,  vfop->rc %d\n",
+		   vf->op_rc, vfop->rc);
+	} else {
+		struct bnx2x_vfop *cur_vfop;
+		DP(BNX2X_MSG_IOV, "list not empty %d\n", vfop->rc);
+		cur_vfop = bnx2x_vfop_cur(bp, vf);
+		cur_vfop->rc = vfop->rc;
+		DP(BNX2X_MSG_IOV, "copying rc vf->op_rc %d, vfop->rc %d\n",
+		   vf->op_rc, vfop->rc);
+	}
+
+	/* invoke done handler */
+	if (vfop->done) {
+		DP(BNX2X_MSG_IOV, "calling done handler\n");
+		vfop->done(bp, vf);
+	}
+
+	DP(BNX2X_MSG_IOV, "done handler complete. vf->op_rc %d, vfop->rc %d\n",
+	   vf->op_rc, vfop->rc);
+
+	/* if this is the last nested op reset the wait_blocking flag
+	 * to release any blocking wrappers, only after 'done()' is invoked
+	 */
+	if (list_empty(&vf->op_list_head)) {
+		DP(BNX2X_MSG_IOV, "list was empty after done %d\n", vfop->rc);
+		vf->op_wait_blocking = false;
+	}
+
+	kfree(vfop);
+}
+
+static inline int bnx2x_vfop_wait_blocking(struct bnx2x *bp,
+					   struct bnx2x_virtf *vf)
+{
+	/* can take a while if any port is running */
+	int cnt = 5000;
+
+	might_sleep();
+	while (cnt--) {
+		if (vf->op_wait_blocking == false) {
+#ifdef BNX2X_STOP_ON_ERROR
+			DP(BNX2X_MSG_IOV, "exit  (cnt %d)\n", 5000 - cnt);
+#endif
+			return 0;
+		}
+		usleep_range(1000, 2000);
+
+		if (bp->panic)
+			return -EIO;
+	}
+
+	/* timeout! */
+#ifdef BNX2X_STOP_ON_ERROR
+	bnx2x_panic();
+#endif
+
+	return -EBUSY;
+}
+
+static inline int bnx2x_vfop_transition(struct bnx2x *bp,
+					struct bnx2x_virtf *vf,
+					vfop_handler_t transition,
+					bool block)
+{
+	if (block)
+		vf->op_wait_blocking = true;
+	transition(bp, vf);
+	if (block)
+		return bnx2x_vfop_wait_blocking(bp, vf);
+	return 0;
+}
+
+/* VFOP queue construction helpers */
+void bnx2x_vfop_qctor_dump_tx(struct bnx2x *bp, struct bnx2x_virtf *vf,
+			    struct bnx2x_queue_init_params *init_params,
+			    struct bnx2x_queue_setup_params *setup_params,
+			    u16 q_idx, u16 sb_idx);
+
+void bnx2x_vfop_qctor_dump_rx(struct bnx2x *bp, struct bnx2x_virtf *vf,
+			    struct bnx2x_queue_init_params *init_params,
+			    struct bnx2x_queue_setup_params *setup_params,
+			    u16 q_idx, u16 sb_idx);
+
+void bnx2x_vfop_qctor_prep(struct bnx2x *bp,
+			   struct bnx2x_virtf *vf,
+			   struct bnx2x_vf_queue *q,
+			   struct bnx2x_vfop_qctor_params *p,
+			   unsigned long q_type);
+int bnx2x_vfop_qsetup_cmd(struct bnx2x *bp,
+			  struct bnx2x_virtf *vf,
+			  struct bnx2x_vfop_cmd *cmd,
+			  int qid);
+
 int bnx2x_vf_idx_by_abs_fid(struct bnx2x *bp, u16 abs_vfid);
 u8 bnx2x_vf_max_queue_cnt(struct bnx2x *bp, struct bnx2x_virtf *vf);
 /* VF FLR helpers */
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
index ba5503e..eb10378 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
@@ -375,6 +375,149 @@ static void bnx2x_vf_mbx_init_vf(struct bnx2x *bp, struct bnx2x_virtf *vf,
 	bnx2x_vf_mbx_resp(bp, vf);
 }
 
+/* convert MBX queue-flags to standard SP queue-flags */
+static void bnx2x_vf_mbx_set_q_flags(u32 mbx_q_flags,
+				     unsigned long *sp_q_flags)
+{
+	if (mbx_q_flags & VFPF_QUEUE_FLG_TPA)
+		__set_bit(BNX2X_Q_FLG_TPA, sp_q_flags);
+	if (mbx_q_flags & VFPF_QUEUE_FLG_TPA_IPV6)
+		__set_bit(BNX2X_Q_FLG_TPA_IPV6, sp_q_flags);
+	if (mbx_q_flags & VFPF_QUEUE_FLG_TPA_GRO)
+		__set_bit(BNX2X_Q_FLG_TPA_GRO, sp_q_flags);
+	if (mbx_q_flags & VFPF_QUEUE_FLG_STATS)
+		__set_bit(BNX2X_Q_FLG_STATS, sp_q_flags);
+	if (mbx_q_flags & VFPF_QUEUE_FLG_OV)
+		__set_bit(BNX2X_Q_FLG_OV, sp_q_flags);
+	if (mbx_q_flags & VFPF_QUEUE_FLG_VLAN)
+		__set_bit(BNX2X_Q_FLG_VLAN, sp_q_flags);
+	if (mbx_q_flags & VFPF_QUEUE_FLG_COS)
+		__set_bit(BNX2X_Q_FLG_COS, sp_q_flags);
+	if (mbx_q_flags & VFPF_QUEUE_FLG_HC)
+		__set_bit(BNX2X_Q_FLG_HC, sp_q_flags);
+	if (mbx_q_flags & VFPF_QUEUE_FLG_DHC)
+		__set_bit(BNX2X_Q_FLG_DHC, sp_q_flags);
+}
+
+static void bnx2x_vf_mbx_setup_q(struct bnx2x *bp, struct bnx2x_virtf *vf,
+				 struct bnx2x_vf_mbx *mbx)
+{
+	struct vfpf_setup_q_tlv *setup_q = &mbx->msg->req.setup_q;
+	struct bnx2x_vfop_cmd cmd = {
+		.done = bnx2x_vf_mbx_resp,
+		.block = false,
+	};
+
+	/* verify vf_qid */
+	if (setup_q->vf_qid >= vf_rxq_count(vf)) {
+		BNX2X_ERR("vf_qid %d invalid, max queue count is %d\n",
+			  setup_q->vf_qid, vf_rxq_count(vf));
+		vf->op_rc = -EINVAL;
+		goto response;
+	}
+
+	/* tx queues must be setup alongside rx queues thus if the rx queue
+	 * is not marked as valid there's nothing to do.
+	 */
+	if (setup_q->param_valid & (VFPF_RXQ_VALID|VFPF_TXQ_VALID)) {
+		struct bnx2x_vf_queue *q = vfq_get(vf, setup_q->vf_qid);
+		unsigned long q_type = 0;
+
+		struct bnx2x_queue_init_params *init_p;
+		struct bnx2x_queue_setup_params *setup_p;
+
+		/* reinit the VF operation context */
+		memset(&vf->op_params.qctor, 0 , sizeof(vf->op_params.qctor));
+		setup_p = &vf->op_params.qctor.prep_qsetup;
+		init_p =  &vf->op_params.qctor.qstate.params.init;
+
+		/* activate immediateely */
+		__set_bit(BNX2X_Q_FLG_ACTIVE, &setup_p->flags);
+
+		if (setup_q->param_valid & VFPF_TXQ_VALID) {
+			struct bnx2x_txq_setup_params *txq_params =
+				&setup_p->txq_params;
+
+			__set_bit(BNX2X_Q_TYPE_HAS_TX, &q_type);
+
+			/* save sb resource index */
+			q->sb_idx = setup_q->txq.vf_sb;
+
+			/* tx init */
+			init_p->tx.hc_rate = setup_q->txq.hc_rate;
+			init_p->tx.sb_cq_index = setup_q->txq.sb_index;
+
+			bnx2x_vf_mbx_set_q_flags(setup_q->txq.flags,
+						 &init_p->tx.flags);
+
+			/* tx setup - flags */
+			bnx2x_vf_mbx_set_q_flags(setup_q->txq.flags,
+						 &setup_p->flags);
+
+			/* tx setup - general, nothing */
+
+			/* tx setup - tx */
+			txq_params->dscr_map = setup_q->txq.txq_addr;
+			txq_params->sb_cq_index = setup_q->txq.sb_index;
+			txq_params->traffic_type = setup_q->txq.traffic_type;
+
+			bnx2x_vfop_qctor_dump_tx(bp, vf, init_p, setup_p,
+						 q->index, q->sb_idx);
+		}
+
+		if (setup_q->param_valid & VFPF_RXQ_VALID) {
+			struct bnx2x_rxq_setup_params *rxq_params =
+							&setup_p->rxq_params;
+
+			__set_bit(BNX2X_Q_TYPE_HAS_RX, &q_type);
+
+			/* Note: there is no support for different SBs
+			 * for TX and RX
+			 */
+			q->sb_idx = setup_q->rxq.vf_sb;
+
+			/* rx init */
+			init_p->rx.hc_rate = setup_q->rxq.hc_rate;
+			init_p->rx.sb_cq_index = setup_q->rxq.sb_index;
+			bnx2x_vf_mbx_set_q_flags(setup_q->rxq.flags,
+						 &init_p->rx.flags);
+
+			/* rx setup - flags */
+			bnx2x_vf_mbx_set_q_flags(setup_q->rxq.flags,
+						 &setup_p->flags);
+
+			/* rx setup - general */
+			setup_p->gen_params.mtu = setup_q->rxq.mtu;
+
+			/* rx setup - rx */
+			rxq_params->drop_flags = setup_q->rxq.drop_flags;
+			rxq_params->dscr_map = setup_q->rxq.rxq_addr;
+			rxq_params->sge_map = setup_q->rxq.sge_addr;
+			rxq_params->rcq_map = setup_q->rxq.rcq_addr;
+			rxq_params->rcq_np_map = setup_q->rxq.rcq_np_addr;
+			rxq_params->buf_sz = setup_q->rxq.buf_sz;
+			rxq_params->tpa_agg_sz = setup_q->rxq.tpa_agg_sz;
+			rxq_params->max_sges_pkt = setup_q->rxq.max_sge_pkt;
+			rxq_params->sge_buf_sz = setup_q->rxq.sge_buf_sz;
+			rxq_params->cache_line_log =
+				setup_q->rxq.cache_line_log;
+			rxq_params->sb_cq_index = setup_q->rxq.sb_index;
+
+			bnx2x_vfop_qctor_dump_rx(bp, vf, init_p, setup_p,
+						 q->index, q->sb_idx);
+		}
+		/* complete the preparations */
+		bnx2x_vfop_qctor_prep(bp, vf, q, &vf->op_params.qctor, q_type);
+
+		vf->op_rc = bnx2x_vfop_qsetup_cmd(bp, vf, &cmd, q->index);
+		if (vf->op_rc)
+			goto response;
+		return;
+	}
+response:
+	bnx2x_vf_mbx_resp(bp, vf);
+}
+
 /* dispatch request */
 static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 				  struct bnx2x_vf_mbx *mbx)
@@ -397,6 +540,9 @@ static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		case CHANNEL_TLV_INIT:
 			bnx2x_vf_mbx_init_vf(bp, vf, mbx);
 			break;
+		case CHANNEL_TLV_SETUP_Q:
+			bnx2x_vf_mbx_setup_q(bp, vf, mbx);
+			break;
 		}
 
 	/* unknown TLV - this may belong to a VF driver from the future - a
-- 
1.7.9.GIT

^ permalink raw reply related

* [PATCH net-next v3 14/22] bnx2x: Support statistics collection for VFs by the PF
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

Statistics are collected by the PF driver. The collection is
performed via a query sent to the device which is basically an array
of 3-tuples of the form (statistics client, function, DMAE address).
In this patch the PF driver adds to the query, on top of the
statistics clients it is maintaining for itself (rss queues, storage,
etc), the 3-tuples for the VFs it is maintaining. The addresses used
are the GPAs of the statistics buffers supplied by the VF in the
init message on the VF <-> PF channel. The function parameter
ensures that the iommu will translate the GPA to the correct physical
address.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |   77 +--------------
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c    |   21 ++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.h    |    9 ++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c |   92 ++++++++++++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h |    2 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c |  108 +++++++++++++++++----
 6 files changed, 217 insertions(+), 92 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 78514ab..3618afb 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -5250,7 +5250,8 @@ static void bnx2x_timer(unsigned long data)
 	if (!netif_running(bp->dev))
 		return;
 
-	if (!BP_NOMCP(bp)) {
+	if (IS_PF(bp) &&
+	    !BP_NOMCP(bp)) {
 		int mb_idx = BP_FW_MB_IDX(bp);
 		u32 drv_pulse;
 		u32 mcp_pulse;
@@ -7672,66 +7673,6 @@ void bnx2x_free_mem(struct bnx2x *bp)
 		       BCM_PAGE_SIZE * NUM_EQ_PAGES);
 }
 
-static int bnx2x_alloc_fw_stats_mem(struct bnx2x *bp)
-{
-	int num_groups;
-	int is_fcoe_stats = NO_FCOE(bp) ? 0 : 1;
-
-	/* number of queues for statistics is number of eth queues + FCoE */
-	u8 num_queue_stats = BNX2X_NUM_ETH_QUEUES(bp) + is_fcoe_stats;
-
-	/* Total number of FW statistics requests =
-	 * 1 for port stats + 1 for PF stats + potential 1 for FCoE stats +
-	 * num of queues
-	 */
-	bp->fw_stats_num = 2 + is_fcoe_stats + num_queue_stats;
-
-
-	/* Request is built from stats_query_header and an array of
-	 * stats_query_cmd_group each of which contains
-	 * STATS_QUERY_CMD_COUNT rules. The real number or requests is
-	 * configured in the stats_query_header.
-	 */
-	num_groups = ((bp->fw_stats_num) / STATS_QUERY_CMD_COUNT) +
-		     (((bp->fw_stats_num) % STATS_QUERY_CMD_COUNT) ? 1 : 0);
-
-	bp->fw_stats_req_sz = sizeof(struct stats_query_header) +
-			num_groups * sizeof(struct stats_query_cmd_group);
-
-	/* Data for statistics requests + stats_conter
-	 *
-	 * stats_counter holds per-STORM counters that are incremented
-	 * when STORM has finished with the current request.
-	 *
-	 * memory for FCoE offloaded statistics are counted anyway,
-	 * even if they will not be sent.
-	 */
-	bp->fw_stats_data_sz = sizeof(struct per_port_stats) +
-		sizeof(struct per_pf_stats) +
-		sizeof(struct fcoe_statistics_params) +
-		sizeof(struct per_queue_stats) * num_queue_stats +
-		sizeof(struct stats_counter);
-
-	BNX2X_PCI_ALLOC(bp->fw_stats, &bp->fw_stats_mapping,
-			bp->fw_stats_data_sz + bp->fw_stats_req_sz);
-
-	/* Set shortcuts */
-	bp->fw_stats_req = (struct bnx2x_fw_stats_req *)bp->fw_stats;
-	bp->fw_stats_req_mapping = bp->fw_stats_mapping;
-
-	bp->fw_stats_data = (struct bnx2x_fw_stats_data *)
-		((u8 *)bp->fw_stats + bp->fw_stats_req_sz);
-
-	bp->fw_stats_data_mapping = bp->fw_stats_mapping +
-				   bp->fw_stats_req_sz;
-	return 0;
-
-alloc_mem_err:
-	BNX2X_PCI_FREE(bp->fw_stats, bp->fw_stats_mapping,
-		       bp->fw_stats_data_sz + bp->fw_stats_req_sz);
-	BNX2X_ERR("Can't allocate memory\n");
-	return -ENOMEM;
-}
 
 int bnx2x_alloc_mem_cnic(struct bnx2x *bp)
 {
@@ -7778,10 +7719,6 @@ int bnx2x_alloc_mem(struct bnx2x *bp)
 	BNX2X_PCI_ALLOC(bp->slowpath, &bp->slowpath_mapping,
 			sizeof(struct bnx2x_slowpath));
 
-	/* Allocated memory for FW statistics  */
-	if (bnx2x_alloc_fw_stats_mem(bp))
-		goto alloc_mem_err;
-
 	/* Allocate memory for CDU context:
 	 * This memory is allocated separately and not in the generic ILT
 	 * functions because CDU differs in few aspects:
@@ -7810,6 +7747,9 @@ int bnx2x_alloc_mem(struct bnx2x *bp)
 	if (bnx2x_ilt_mem_op(bp, ILT_MEMOP_ALLOC))
 		goto alloc_mem_err;
 
+	if (bnx2x_iov_alloc_mem(bp))
+		goto alloc_mem_err;
+
 	/* Slow path ring */
 	BNX2X_PCI_ALLOC(bp->spq, &bp->spq_mapping, BCM_PAGE_SIZE);
 
@@ -7817,13 +7757,6 @@ int bnx2x_alloc_mem(struct bnx2x *bp)
 	BNX2X_PCI_ALLOC(bp->eq_ring, &bp->eq_mapping,
 			BCM_PAGE_SIZE * NUM_EQ_PAGES);
 
-
-	/* fastpath */
-	/* need to be done at the end, since it's self adjusting to amount
-	 * of memory available for RSS queues
-	 */
-	if (bnx2x_alloc_fp_mem(bp))
-		goto alloc_mem_err;
 	return 0;
 
 alloc_mem_err:
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
index b8b4b74..1eb92b8 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.c
@@ -5199,6 +5199,27 @@ void bnx2x_init_queue_obj(struct bnx2x *bp,
 	obj->set_pending = bnx2x_queue_set_pending;
 }
 
+/* return a queue object's logical state*/
+int bnx2x_get_q_logical_state(struct bnx2x *bp,
+			       struct bnx2x_queue_sp_obj *obj)
+{
+	switch (obj->state) {
+	case BNX2X_Q_STATE_ACTIVE:
+	case BNX2X_Q_STATE_MULTI_COS:
+		return BNX2X_Q_LOGICAL_STATE_ACTIVE;
+	case BNX2X_Q_STATE_RESET:
+	case BNX2X_Q_STATE_INITIALIZED:
+	case BNX2X_Q_STATE_MCOS_TERMINATED:
+	case BNX2X_Q_STATE_INACTIVE:
+	case BNX2X_Q_STATE_STOPPED:
+	case BNX2X_Q_STATE_TERMINATED:
+	case BNX2X_Q_STATE_FLRED:
+		return BNX2X_Q_LOGICAL_STATE_STOPPED;
+	default:
+		return -EINVAL;
+	}
+}
+
 /********************** Function state object *********************************/
 enum bnx2x_func_state bnx2x_func_get_state(struct bnx2x *bp,
 					   struct bnx2x_func_sp_obj *o)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.h
index adbd91b..b304678 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sp.h
@@ -776,6 +776,12 @@ enum bnx2x_q_state {
 	BNX2X_Q_STATE_MAX,
 };
 
+/* Allowed Queue states */
+enum bnx2x_q_logical_state {
+	BNX2X_Q_LOGICAL_STATE_ACTIVE,
+	BNX2X_Q_LOGICAL_STATE_STOPPED,
+};
+
 /* Allowed commands */
 enum bnx2x_queue_cmd {
 	BNX2X_Q_CMD_INIT,
@@ -1261,6 +1267,9 @@ void bnx2x_init_queue_obj(struct bnx2x *bp,
 int bnx2x_queue_state_change(struct bnx2x *bp,
 			     struct bnx2x_queue_state_params *params);
 
+int bnx2x_get_q_logical_state(struct bnx2x *bp,
+			       struct bnx2x_queue_sp_obj *obj);
+
 /********************* VLAN-MAC ****************/
 void bnx2x_init_mac_obj(struct bnx2x *bp,
 			struct bnx2x_vlan_mac_obj *mac_obj,
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index 99fd606..52c3bd2 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -1068,6 +1068,81 @@ void bnx2x_iov_sp_event(struct bnx2x *bp, int vf_cid, bool queue_work)
 	}
 }
 
+void bnx2x_iov_adjust_stats_req(struct bnx2x *bp)
+{
+	int i;
+	int first_queue_query_index, num_queues_req;
+	dma_addr_t cur_data_offset;
+	struct stats_query_entry *cur_query_entry;
+	u8 stats_count = 0;
+	bool is_fcoe = false;
+
+	if (!IS_SRIOV(bp))
+		return;
+
+	if (!NO_FCOE(bp))
+		is_fcoe = true;
+
+	/* fcoe adds one global request and one queue request */
+	num_queues_req = BNX2X_NUM_ETH_QUEUES(bp) + is_fcoe;
+
+	first_queue_query_index = BNX2X_FIRST_QUEUE_QUERY_IDX -
+		(is_fcoe ? 0 : 1);
+
+	DP(BNX2X_MSG_IOV,
+	   "BNX2X_NUM_ETH_QUEUES %d, is_fcoe %d, first_queue_query_index %d => determined the last non virtual statistics query index is %d. Will add queries on top of that\n",
+	   BNX2X_NUM_ETH_QUEUES(bp), is_fcoe, first_queue_query_index,
+	   first_queue_query_index + num_queues_req);
+
+	cur_data_offset = bp->fw_stats_data_mapping +
+		offsetof(struct bnx2x_fw_stats_data, queue_stats) +
+		num_queues_req * sizeof(struct per_queue_stats);
+
+	cur_query_entry = &bp->fw_stats_req->
+		query[first_queue_query_index + num_queues_req];
+
+	for_each_vf(bp, i) {
+		int j;
+		struct bnx2x_virtf *vf = BP_VF(bp, i);
+		if (vf->state != VF_ENABLED) {
+			DP(BNX2X_MSG_IOV,
+			   "vf %d not enabled so no stats for it\n",
+			   vf->abs_vfid);
+			continue;
+		}
+
+		DP(BNX2X_MSG_IOV, "add addresses for vf %d\n", vf->abs_vfid);
+
+		for_each_vfq(vf, j) {
+			struct bnx2x_vf_queue *rxq = vfq_get(vf, j);
+
+			/* collect stats fro active queues only */
+			if (bnx2x_get_q_logical_state(bp, &rxq->sp_obj) ==
+			    BNX2X_Q_LOGICAL_STATE_STOPPED)
+				continue;
+
+			/* create stats query entry for this queue */
+			cur_query_entry->kind = STATS_TYPE_QUEUE;
+			cur_query_entry->index = vfq_cl_id(vf, rxq);
+			cur_query_entry->funcID =
+				cpu_to_le16(FW_VF_HANDLE(vf->abs_vfid));
+			cur_query_entry->address.hi =
+				cpu_to_le32(U64_HI(vf->fw_stat_map));
+			cur_query_entry->address.lo =
+				cpu_to_le32(U64_LO(vf->fw_stat_map));
+			DP(BNX2X_MSG_IOV,
+			   "added address %x %x for vf %d queue %d client %d\n",
+			   cur_query_entry->address.hi,
+			   cur_query_entry->address.lo, cur_query_entry->funcID,
+			   j, cur_query_entry->index);
+			cur_query_entry++;
+			cur_data_offset += sizeof(struct per_queue_stats);
+			stats_count++;
+		}
+	}
+	bp->fw_stats_req->hdr.cmd_num = bp->fw_stats_num + stats_count;
+}
+
 void bnx2x_iov_sp_task(struct bnx2x *bp)
 {
 	int i;
@@ -1088,6 +1163,23 @@ void bnx2x_iov_sp_task(struct bnx2x *bp)
 		}
 	}
 }
+
+static inline
+struct bnx2x_virtf *__vf_from_stat_id(struct bnx2x *bp, u8 stat_id)
+{
+	int i;
+	struct bnx2x_virtf *vf = NULL;
+
+	for_each_vf(bp, i) {
+		vf = BP_VF(bp, i);
+		if (stat_id >= vf->igu_base_id &&
+		    stat_id < vf->igu_base_id + vf_sb_count(vf))
+			break;
+	}
+	return vf;
+}
+
+/* VF API helpers */
 static void bnx2x_vf_qtbl_set_q(struct bnx2x *bp, u8 abs_vfid, u8 qid,
 				u8 enable)
 {
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index 152a28e..d86b289 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -424,6 +424,8 @@ void bnx2x_iov_set_queue_sp_obj(struct bnx2x *bp, int vf_cid,
 				struct bnx2x_queue_sp_obj **q_obj);
 void bnx2x_iov_sp_event(struct bnx2x *bp, int vf_cid, bool queue_work);
 int bnx2x_iov_eq_sp_event(struct bnx2x *bp, union event_ring_elem *elem);
+void bnx2x_iov_adjust_stats_req(struct bnx2x *bp);
+void bnx2x_iov_storm_stats_update(struct bnx2x *bp);
 void bnx2x_iov_sp_task(struct bnx2x *bp);
 /* global vf mailbox routines */
 void bnx2x_vf_mbx(struct bnx2x *bp, struct vf_pf_event_data *vfpf_event);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c
index 89ec066..c87eee0 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c
@@ -19,7 +19,7 @@
 
 #include "bnx2x_stats.h"
 #include "bnx2x_cmn.h"
-
+#include "bnx2x_sriov.h"
 
 /* Statistics */
 
@@ -79,6 +79,41 @@ static inline u16 bnx2x_get_port_stats_dma_len(struct bnx2x *bp)
  * Init service functions
  */
 
+static void bnx2x_dp_stats(struct bnx2x *bp)
+{
+	int i;
+	DP(BNX2X_MSG_STATS, "dumping stats:\n"
+	   "fw_stats_req\n"
+	   "    hdr\n"
+	   "        cmd_num %d\n"
+	   "        reserved0 %d\n"
+	   "        drv_stats_counter %d\n"
+	   "        reserved1 %d\n"
+	   "        stats_counters_addrs %x %x\n",
+	   bp->fw_stats_req->hdr.cmd_num,
+	   bp->fw_stats_req->hdr.reserved0,
+	   bp->fw_stats_req->hdr.drv_stats_counter,
+	   bp->fw_stats_req->hdr.reserved1,
+	   bp->fw_stats_req->hdr.stats_counters_addrs.hi,
+	   bp->fw_stats_req->hdr.stats_counters_addrs.lo);
+
+	for (i = 0; i < bp->fw_stats_req->hdr.cmd_num; i++) {
+		DP(BNX2X_MSG_STATS,
+		   "query[%d]\n"
+		   "              kind %d\n"
+		   "              index %d\n"
+		   "              funcID %d\n"
+		   "              reserved %d\n"
+		   "              address %x %x\n",
+		   i, bp->fw_stats_req->query[i].kind,
+		   bp->fw_stats_req->query[i].index,
+		   bp->fw_stats_req->query[i].funcID,
+		   bp->fw_stats_req->query[i].reserved,
+		   bp->fw_stats_req->query[i].address.hi,
+		   bp->fw_stats_req->query[i].address.lo);
+	}
+}
+
 /* Post the next statistics ramrod. Protect it with the spin in
  * order to ensure the strict order between statistics ramrods
  * (each ramrod has a sequence number passed in a
@@ -103,7 +138,10 @@ static void bnx2x_storm_stats_post(struct bnx2x *bp)
 		DP(BNX2X_MSG_STATS, "Sending statistics ramrod %d\n",
 			bp->fw_stats_req->hdr.drv_stats_counter);
 
+		/* adjust the ramrod to include VF queues statistics */
+		bnx2x_iov_adjust_stats_req(bp);
 
+		bnx2x_dp_stats(bp);
 
 		/* send FW stats ramrod */
 		rc = bnx2x_sp_post(bp, RAMROD_CMD_ID_COMMON_STAT_QUERY, 0,
@@ -482,6 +520,12 @@ static void bnx2x_func_stats_init(struct bnx2x *bp)
 
 static void bnx2x_stats_start(struct bnx2x *bp)
 {
+	/* vfs travel through here as part of the statistics FSM, but no action
+	 * is required
+	 */
+	if (IS_VF(bp))
+		return;
+
 	if (bp->port.pmf)
 		bnx2x_port_stats_init(bp);
 
@@ -501,6 +545,11 @@ static void bnx2x_stats_pmf_start(struct bnx2x *bp)
 
 static void bnx2x_stats_restart(struct bnx2x *bp)
 {
+	/* vfs travel through here as part of the statistics FSM, but no action
+	 * is required
+	 */
+	if (IS_VF(bp))
+		return;
 	bnx2x_stats_comp(bp);
 	bnx2x_stats_start(bp);
 }
@@ -832,19 +881,10 @@ static int bnx2x_hw_stats_update(struct bnx2x *bp)
 	return 0;
 }
 
-static int bnx2x_storm_stats_update(struct bnx2x *bp)
+static int bnx2x_storm_stats_validate_counters(struct bnx2x *bp)
 {
-	struct tstorm_per_port_stats *tport =
-				&bp->fw_stats_data->port.tstorm_port_statistics;
-	struct tstorm_per_pf_stats *tfunc =
-				&bp->fw_stats_data->pf.tstorm_pf_statistics;
-	struct host_func_stats *fstats = &bp->func_stats;
-	struct bnx2x_eth_stats *estats = &bp->eth_stats;
-	struct bnx2x_eth_stats_old *estats_old = &bp->eth_stats_old;
 	struct stats_counter *counters = &bp->fw_stats_data->storm_counters;
-	int i;
 	u16 cur_stats_counter;
-
 	/* Make sure we use the value of the counter
 	 * used for sending the last stats ramrod.
 	 */
@@ -880,6 +920,23 @@ static int bnx2x_storm_stats_update(struct bnx2x *bp)
 		   le16_to_cpu(counters->tstats_counter), bp->stats_counter);
 		return -EAGAIN;
 	}
+	return 0;
+}
+
+static int bnx2x_storm_stats_update(struct bnx2x *bp)
+{
+	struct tstorm_per_port_stats *tport =
+				&bp->fw_stats_data->port.tstorm_port_statistics;
+	struct tstorm_per_pf_stats *tfunc =
+				&bp->fw_stats_data->pf.tstorm_pf_statistics;
+	struct host_func_stats *fstats = &bp->func_stats;
+	struct bnx2x_eth_stats *estats = &bp->eth_stats;
+	struct bnx2x_eth_stats_old *estats_old = &bp->eth_stats_old;
+	int i;
+
+	/* vfs stat counter is managed by pf */
+	if (IS_PF(bp) && bnx2x_storm_stats_validate_counters(bp))
+		return -EAGAIN;
 
 	estats->error_bytes_received_hi = 0;
 	estats->error_bytes_received_lo = 0;
@@ -1174,23 +1231,34 @@ static void bnx2x_stats_update(struct bnx2x *bp)
 	if (bnx2x_edebug_stats_stopped(bp))
 		return;
 
-	if (*stats_comp != DMAE_COMP_VAL)
-		return;
+	if (IS_PF(bp)) {
+		if (*stats_comp != DMAE_COMP_VAL)
+			return;
 
-	if (bp->port.pmf)
-		bnx2x_hw_stats_update(bp);
+		if (bp->port.pmf)
+			bnx2x_hw_stats_update(bp);
 
-	if (bnx2x_storm_stats_update(bp)) {
-		if (bp->stats_pending++ == 3) {
-			BNX2X_ERR("storm stats were not updated for 3 times\n");
-			bnx2x_panic();
+		if (bnx2x_storm_stats_update(bp)) {
+			if (bp->stats_pending++ == 3) {
+				BNX2X_ERR("storm stats were not updated for 3 times\n");
+				bnx2x_panic();
+			}
+			return;
 		}
-		return;
+	} else {
+		/* vf doesn't collect HW statistics, and doesn't get completions
+		 * perform only update
+		 */
+		bnx2x_storm_stats_update(bp);
 	}
 
 	bnx2x_net_stats_update(bp);
 	bnx2x_drv_stats_update(bp);
 
+	/* vf is done */
+	if (IS_VF(bp))
+		return;
+
 	if (netif_msg_timer(bp)) {
 		struct bnx2x_eth_stats *estats = &bp->eth_stats;
 
-- 
1.7.9.GIT

^ permalink raw reply related

* [PATCH net-next v3 12/22] bnx2x: Support of PF driver of a VF acquire request
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

When a VF is probed by the VF driver, the VF driver sends an
'acquire' request over the VF <-> PF channel for the resources
it needs to operate (interrupts, queues, etc).
The PF driver either ratifies the request and allocates the resources,
responds with the maximum values it will allow the VF to acquire,
or fails the request entirely if there is a problem.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c    |   28 +++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h    |    9 +
 .../net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c    |   13 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c  |  200 +++++++++++++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h  |   43 ++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c   |  201 ++++++++++++++++++++
 6 files changed, 483 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index e8a05f4..7966d7f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -87,6 +87,34 @@ static inline void bnx2x_move_fp(struct bnx2x *bp, int from, int to)
 	to_fp->txdata_ptr[0] = &bp->bnx2x_txq[new_txdata_index];
 }
 
+/**
+ * bnx2x_fill_fw_str - Fill buffer with FW version string.
+ *
+ * @bp:        driver handle
+ * @buf:       character buffer to fill with the fw name
+ * @buf_len:   length of the above buffer
+ *
+ */
+void bnx2x_fill_fw_str(struct bnx2x *bp, char *buf, size_t buf_len)
+{
+	if (IS_PF(bp)) {
+		u8 phy_fw_ver[PHY_FW_VER_LEN];
+
+		phy_fw_ver[0] = '\0';
+		bnx2x_get_ext_phy_fw_version(&bp->link_params,
+					     phy_fw_ver, PHY_FW_VER_LEN);
+		strlcpy(buf, bp->fw_ver, buf_len);
+		snprintf(buf + strlen(bp->fw_ver), 32 - strlen(bp->fw_ver),
+			 "bc %d.%d.%d%s%s",
+			 (bp->common.bc_ver & 0xff0000) >> 16,
+			 (bp->common.bc_ver & 0xff00) >> 8,
+			 (bp->common.bc_ver & 0xff),
+			 ((phy_fw_ver[0] != '\0') ? " phy " : ""), phy_fw_ver);
+	} else {
+		strlcpy(buf, bp->acquire_resp.pfdev_info.fw_ver, buf_len);
+	}
+}
+
 int load_count[2][3] = { {0} }; /* per-path: 0-common, 1-port0, 2-port1 */
 
 /* free skb in the packet ring at pos idx
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
index 9ee67fa..cd1eaff 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h
@@ -1401,4 +1401,13 @@ static inline bool bnx2x_is_valid_ether_addr(struct bnx2x *bp, u8 *addr)
 	return false;
 }
 
+/**
+ * bnx2x_fill_fw_str - Fill buffer with FW version string.
+ *
+ * @bp:        driver handle
+ * @buf:       character buffer to fill with the fw name
+ * @buf_len:   length of the above buffer
+ *
+ */
+void bnx2x_fill_fw_str(struct bnx2x *bp, char *buf, size_t buf_len);
 #endif /* BNX2X_CMN_H */
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
index b7c82f9..292634f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c
@@ -817,21 +817,12 @@ static void bnx2x_get_drvinfo(struct net_device *dev,
 			      struct ethtool_drvinfo *info)
 {
 	struct bnx2x *bp = netdev_priv(dev);
-	u8 phy_fw_ver[PHY_FW_VER_LEN];
 
 	strlcpy(info->driver, DRV_MODULE_NAME, sizeof(info->driver));
 	strlcpy(info->version, DRV_MODULE_VERSION, sizeof(info->version));
 
-	phy_fw_ver[0] = '\0';
-	bnx2x_get_ext_phy_fw_version(&bp->link_params,
-				     phy_fw_ver, PHY_FW_VER_LEN);
-	strlcpy(info->fw_version, bp->fw_ver, sizeof(info->fw_version));
-	snprintf(info->fw_version + strlen(bp->fw_ver), 32 - strlen(bp->fw_ver),
-		 "bc %d.%d.%d%s%s",
-		 (bp->common.bc_ver & 0xff0000) >> 16,
-		 (bp->common.bc_ver & 0xff00) >> 8,
-		 (bp->common.bc_ver & 0xff),
-		 ((phy_fw_ver[0] != '\0') ? " phy " : ""), phy_fw_ver);
+	bnx2x_fill_fw_str(bp, info->fw_version, sizeof(info->fw_version));
+
 	strlcpy(info->bus_info, pci_name(bp->pdev), sizeof(info->bus_info));
 	info->n_stats = BNX2X_NUM_STATS;
 	info->testinfo_len = BNX2X_NUM_TESTS(bp);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index 5a55bf6..0c499c8 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -591,6 +591,64 @@ alloc_mem_err:
 	return -ENOMEM;
 }
 
+static void bnx2x_vfq_init(struct bnx2x *bp, struct bnx2x_virtf *vf,
+			   struct bnx2x_vf_queue *q)
+{
+	u8 cl_id = vfq_cl_id(vf, q);
+	u8 func_id = FW_VF_HANDLE(vf->abs_vfid);
+
+	unsigned long q_type = 0;
+	set_bit(BNX2X_Q_TYPE_HAS_TX, &q_type);
+	set_bit(BNX2X_Q_TYPE_HAS_RX, &q_type);
+
+	/* Queue State object */
+	bnx2x_init_queue_obj(bp, &q->sp_obj,
+			     cl_id, &q->cid, 1, func_id,
+			     bnx2x_vf_sp(bp, vf, q_data),
+			     bnx2x_vf_sp_map(bp, vf, q_data),
+			     q_type);
+
+	DP(BNX2X_MSG_IOV,
+	   "initialized vf %d's queue object. func id set to %d\n",
+	   vf->abs_vfid, q->sp_obj.func_id);
+
+	/* mac/vlan objects are per queue, but only those
+	 * that belong to the leading queue are initialized
+	 */
+	if (vfq_is_leading(q)) {
+
+		/* mac*/
+		bnx2x_init_mac_obj(bp, &q->mac_obj,
+				   cl_id, q->cid, func_id,
+				   bnx2x_vf_sp(bp, vf, mac_rdata),
+				   bnx2x_vf_sp_map(bp, vf, mac_rdata),
+				   BNX2X_FILTER_MAC_PENDING,
+				   &vf->filter_state,
+				   BNX2X_OBJ_TYPE_RX_TX,
+				   &bp->macs_pool);
+		/* vlan */
+		bnx2x_init_vlan_obj(bp, &q->vlan_obj,
+				    cl_id, q->cid, func_id,
+				    bnx2x_vf_sp(bp, vf, vlan_rdata),
+				    bnx2x_vf_sp_map(bp, vf, vlan_rdata),
+				    BNX2X_FILTER_VLAN_PENDING,
+				    &vf->filter_state,
+				    BNX2X_OBJ_TYPE_RX_TX,
+				    &bp->vlans_pool);
+
+		/* mcast */
+		bnx2x_init_mcast_obj(bp, &vf->mcast_obj, cl_id,
+				     q->cid, func_id, func_id,
+				     bnx2x_vf_sp(bp, vf, mcast_rdata),
+				     bnx2x_vf_sp_map(bp, vf, mcast_rdata),
+				     BNX2X_FILTER_MCAST_PENDING,
+				     &vf->filter_state,
+				     BNX2X_OBJ_TYPE_RX_TX);
+
+		vf->leading_rss = cl_id;
+	}
+}
+
 /* called by bnx2x_nic_load */
 int bnx2x_iov_nic_init(struct bnx2x *bp)
 {
@@ -938,3 +996,145 @@ void bnx2x_iov_sp_task(struct bnx2x *bp)
 		}
 	}
 }
+
+u8 bnx2x_vf_max_queue_cnt(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	return min_t(u8, min_t(u8, vf_sb_count(vf), BNX2X_CIDS_PER_VF),
+		     BNX2X_VF_MAX_QUEUES);
+}
+
+static
+int bnx2x_vf_chk_avail_resc(struct bnx2x *bp, struct bnx2x_virtf *vf,
+			    struct vf_pf_resc_request *req_resc)
+{
+	u8 rxq_cnt = vf_rxq_count(vf) ? : bnx2x_vf_max_queue_cnt(bp, vf);
+	u8 txq_cnt = vf_txq_count(vf) ? : bnx2x_vf_max_queue_cnt(bp, vf);
+
+	return ((req_resc->num_rxqs <= rxq_cnt) &&
+		(req_resc->num_txqs <= txq_cnt) &&
+		(req_resc->num_sbs <= vf_sb_count(vf))   &&
+		(req_resc->num_mac_filters <= vf_mac_rules_cnt(vf)) &&
+		(req_resc->num_vlan_filters <= vf_vlan_rules_cnt(vf)));
+}
+
+/* CORE VF API */
+int bnx2x_vf_acquire(struct bnx2x *bp, struct bnx2x_virtf *vf,
+		     struct vf_pf_resc_request *resc)
+{
+	int base_vf_cid = (BP_VFDB(bp)->sriov.first_vf_in_pf + vf->index) *
+		BNX2X_CIDS_PER_VF;
+
+	union cdu_context *base_cxt = (union cdu_context *)
+		BP_VF_CXT_PAGE(bp, base_vf_cid/ILT_PAGE_CIDS)->addr +
+		(base_vf_cid & (ILT_PAGE_CIDS-1));
+	int i;
+
+	/* if state is 'acquired' the VF was not released or FLR'd, in
+	 * this case the returned resources match the acquired already
+	 * acquired resources. Verify that the requested numbers do
+	 * not exceed the already acquired numbers.
+	 */
+	if (vf->state == VF_ACQUIRED) {
+		DP(BNX2X_MSG_IOV, "VF[%d] Trying to re-acquire resources (VF was not released or FLR'd)\n",
+		   vf->abs_vfid);
+
+		if (!bnx2x_vf_chk_avail_resc(bp, vf, resc)) {
+			BNX2X_ERR("VF[%d] When re-acquiring resources, requested numbers must be <= then previously acquired numbers\n",
+				  vf->abs_vfid);
+			return -EINVAL;
+		}
+		return 0;
+	}
+
+	/* Otherwise vf state must be 'free' or 'reset' */
+	if (vf->state != VF_FREE && vf->state != VF_RESET) {
+		BNX2X_ERR("VF[%d] Can not acquire a VF with state %d\n",
+			  vf->abs_vfid, vf->state);
+		return -EINVAL;
+	}
+
+	/* static allocation:
+	 * the global maximum number are fixed per VF. fail the request if
+	 * requested number exceed these globals
+	 */
+	if (!bnx2x_vf_chk_avail_resc(bp, vf, resc)) {
+		DP(BNX2X_MSG_IOV,
+		   "cannot fulfill vf resource request. Placing maximal available values in response");
+		/* set the max resource in the vf */
+		return -ENOMEM;
+	}
+
+	/* Set resources counters - 0 request means max available */
+	vf_sb_count(vf) = resc->num_sbs;
+	vf_rxq_count(vf) = resc->num_rxqs ? : bnx2x_vf_max_queue_cnt(bp, vf);
+	vf_txq_count(vf) = resc->num_txqs ? : bnx2x_vf_max_queue_cnt(bp, vf);
+	if (resc->num_mac_filters)
+		vf_mac_rules_cnt(vf) = resc->num_mac_filters;
+	if (resc->num_vlan_filters)
+		vf_vlan_rules_cnt(vf) = resc->num_vlan_filters;
+
+	DP(BNX2X_MSG_IOV,
+	   "Fullfilling vf request: sb count %d, tx_count %d, rx_count %d, mac_rules_count %d, vlan_rules_count %d\n",
+	   vf_sb_count(vf), vf_rxq_count(vf),
+	   vf_txq_count(vf), vf_mac_rules_cnt(vf),
+	   vf_vlan_rules_cnt(vf));
+
+	/* Initialize the queues */
+	if (!vf->vfqs) {
+		DP(BNX2X_MSG_IOV, "vf->vfqs was not allocated");
+		return -EINVAL;
+	}
+
+	for_each_vfq(vf, i) {
+		struct bnx2x_vf_queue *q = vfq_get(vf, i);
+
+		if (!q) {
+			DP(BNX2X_MSG_IOV, "q number %d was not allocated", i);
+			return -EINVAL;
+		}
+
+		q->index = i;
+		q->cxt = &((base_cxt + i)->eth);
+		q->cid = BNX2X_FIRST_VF_CID + base_vf_cid + i;
+
+		DP(BNX2X_MSG_IOV, "VFQ[%d:%d]: index %d, cid 0x%x, cxt %p\n",
+		   vf->abs_vfid, i, q->index, q->cid, q->cxt);
+
+		/* init SP objects */
+		bnx2x_vfq_init(bp, vf, q);
+	}
+	vf->state = VF_ACQUIRED;
+	return 0;
+}
+
+void bnx2x_lock_vf_pf_channel(struct bnx2x *bp, struct bnx2x_virtf *vf,
+			      enum channel_tlvs tlv)
+{
+	/* lock the channel */
+	mutex_lock(&vf->op_mutex);
+
+	/* record the locking op */
+	vf->op_current = tlv;
+
+	/* log the lock */
+	DP(BNX2X_MSG_IOV, "VF[%d]: vf pf channel locked by %d\n",
+	   vf->abs_vfid, tlv);
+}
+
+void bnx2x_unlock_vf_pf_channel(struct bnx2x *bp, struct bnx2x_virtf *vf,
+				enum channel_tlvs expected_tlv)
+{
+	WARN(expected_tlv != vf->op_current,
+	     "lock mismatch: expected %d found %d", expected_tlv,
+	     vf->op_current);
+
+	/* lock the channel */
+	mutex_unlock(&vf->op_mutex);
+
+	/* log the unlock */
+	DP(BNX2X_MSG_IOV, "VF[%d]: vf pf channel unlocked by %d\n",
+	   vf->abs_vfid, vf->op_current);
+
+	/* record the locking op */
+	vf->op_current = CHANNEL_TLV_NONE;
+}
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index 7c46034..e494b95 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -19,9 +19,13 @@
 #ifndef BNX2X_SRIOV_H
 #define BNX2X_SRIOV_H
 
+#include "bnx2x_vfpf.h"
+#include "bnx2x_cmn.h"
+
 /* The bnx2x device structure holds vfdb structure described below.
  * The VF array is indexed by the relative vfid.
  */
+#define BNX2X_VF_MAX_QUEUES		16
 struct bnx2x_sriov {
 	u32 first_vf_in_pf;
 
@@ -257,6 +261,12 @@ struct bnx2x_virtf {
 #define for_each_vf(bp, var) \
 		for ((var) = 0; (var) < BNX2X_NR_VIRTFN(bp); (var)++)
 
+#define for_each_vfq(vf, var) \
+		for ((var) = 0; (var) < vf_rxq_count(vf); (var)++)
+
+#define for_each_vf_sb(vf, var) \
+		for ((var) = 0; (var) < vf_sb_count(vf); (var)++)
+
 #define HW_VF_HANDLE(bp, abs_vfid) \
 	(u16)(BP_ABS_FUNC((bp)) | (1<<3) |  ((u16)(abs_vfid) << 4))
 
@@ -265,6 +275,13 @@ struct bnx2x_virtf {
 #define FW_VF_HANDLE(abs_vfid)	\
 	(abs_vfid + FW_PF_MAX_HANDLE)
 
+/* locking and unlocking the channel mutex */
+void bnx2x_lock_vf_pf_channel(struct bnx2x *bp, struct bnx2x_virtf *vf,
+			      enum channel_tlvs tlv);
+
+void bnx2x_unlock_vf_pf_channel(struct bnx2x *bp, struct bnx2x_virtf *vf,
+				enum channel_tlvs expected_tlv);
+
 /* VF mail box (aka vf-pf channel) */
 
 /* a container for the bi-directional vf<-->pf messages.
@@ -366,11 +383,32 @@ static inline struct bnx2x_vf_queue *vfq_get(struct bnx2x_virtf *vf, u8 index)
 	return &(vf->vfqs[index]);
 }
 
+static inline bool vfq_is_leading(struct bnx2x_vf_queue *vfq)
+{
+	return (vfq->index == 0);
+}
+
+/* FW ids */
 static inline u8 vf_igu_sb(struct bnx2x_virtf *vf, u16 sb_idx)
 {
 	return vf->igu_base_id + sb_idx;
 }
 
+static inline u8 vf_hc_qzone(struct bnx2x_virtf *vf, u16 sb_idx)
+{
+	return vf_igu_sb(vf, sb_idx);
+}
+
+static u8 vfq_cl_id(struct bnx2x_virtf *vf, struct bnx2x_vf_queue *q)
+{
+	return vf->igu_base_id + q->index;
+}
+
+static inline u8 vfq_qzone_id(struct bnx2x_virtf *vf, struct bnx2x_vf_queue *q)
+{
+	return vfq_cl_id(vf, q);
+}
+
 /* global iov routines */
 int bnx2x_iov_init_ilt(struct bnx2x *bp, u16 line);
 int bnx2x_iov_init_one(struct bnx2x *bp, int int_mode_param, int num_vfs_param);
@@ -388,6 +426,10 @@ void bnx2x_iov_sp_task(struct bnx2x *bp);
 /* global vf mailbox routines */
 void bnx2x_vf_mbx(struct bnx2x *bp, struct vf_pf_event_data *vfpf_event);
 void bnx2x_vf_enable_mbx(struct bnx2x *bp, u8 abs_vfid);
+/* acquire */
+int bnx2x_vf_acquire(struct bnx2x *bp, struct bnx2x_virtf *vf,
+			  struct vf_pf_resc_request *resc);
+
 static inline struct bnx2x_vfop *bnx2x_vfop_cur(struct bnx2x *bp,
 						struct bnx2x_virtf *vf)
 {
@@ -397,6 +439,7 @@ static inline struct bnx2x_vfop *bnx2x_vfop_cur(struct bnx2x *bp,
 }
 
 int bnx2x_vf_idx_by_abs_fid(struct bnx2x *bp, u16 abs_vfid);
+u8 bnx2x_vf_max_queue_cnt(struct bnx2x *bp, struct bnx2x_virtf *vf);
 /* VF FLR helpers */
 int bnx2x_vf_flr_clnup_epilog(struct bnx2x *bp, u8 abs_vfid);
 void bnx2x_vf_enable_access(struct bnx2x *bp, u8 abs_vfid);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
index 7c8e87d..2f46972 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
@@ -184,6 +184,180 @@ static int bnx2x_copy32_vf_dmae(struct bnx2x *bp, u8 from_vf,
 	return bnx2x_issue_dmae_with_comp(bp, &dmae);
 }
 
+static void bnx2x_vf_mbx_resp(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	struct bnx2x_vf_mbx *mbx = BP_VF_MBX(bp, vf->index);
+	u64 vf_addr;
+	dma_addr_t pf_addr;
+	u16 length, type;
+	int rc;
+
+	struct pfvf_general_resp_tlv *resp = &mbx->msg->resp.general_resp;
+
+	/* prepare response */
+	type = mbx->first_tlv.tl.type;
+	length = type == CHANNEL_TLV_ACQUIRE ?
+		sizeof(struct pfvf_acquire_resp_tlv) :
+		sizeof(struct pfvf_general_resp_tlv);
+	bnx2x_add_tlv(bp, resp, 0, type, length);
+	resp->hdr.status = bnx2x_pfvf_status_codes(vf->op_rc);
+	bnx2x_add_tlv(bp, resp, length, CHANNEL_TLV_LIST_END,
+		      sizeof(struct channel_list_end_tlv));
+	bnx2x_dp_tlv_list(bp, resp);
+
+	DP(BNX2X_MSG_IOV, "mailbox vf address hi 0x%x, lo 0x%x, offset 0x%x",
+	   mbx->vf_addr_hi, mbx->vf_addr_lo, mbx->first_tlv.resp_msg_offset);
+
+	/* send response */
+	vf_addr = HILO_U64(mbx->vf_addr_hi, mbx->vf_addr_lo) +
+		  mbx->first_tlv.resp_msg_offset;
+
+	pf_addr = mbx->msg_mapping +
+		offsetof(struct bnx2x_vf_mbx_msg, resp);
+
+	/* copy the response body, if there is one, before the header, as the vf
+	 * is sensitive to the header being written
+	 */
+	if (resp->hdr.tl.length > sizeof(u64)) {
+		length = resp->hdr.tl.length - sizeof(u64);
+		vf_addr += sizeof(u64);
+		pf_addr += sizeof(u64);
+		rc = bnx2x_copy32_vf_dmae(bp, false, pf_addr, vf->abs_vfid,
+					  U64_HI(vf_addr),
+					  U64_LO(vf_addr),
+					  length/4);
+		if (rc) {
+			BNX2X_ERR("Failed to copy response body to VF %d\n",
+				  vf->abs_vfid);
+			return;
+		}
+		vf_addr -= sizeof(u64);
+		pf_addr -= sizeof(u64);
+	}
+
+	/* ack the FW */
+	storm_memset_vf_mbx_ack(bp, vf->abs_vfid);
+	mmiowb();
+
+	/* initiate dmae to send the response */
+	mbx->flags &= ~VF_MSG_INPROCESS;
+
+	/* copy the response header including status-done field,
+	 * must be last dmae, must be after FW is acked
+	 */
+	rc = bnx2x_copy32_vf_dmae(bp, false, pf_addr, vf->abs_vfid,
+				  U64_HI(vf_addr),
+				  U64_LO(vf_addr),
+				  sizeof(u64)/4);
+
+	/* unlock channel mutex */
+	bnx2x_unlock_vf_pf_channel(bp, vf, mbx->first_tlv.tl.type);
+
+	if (rc) {
+		BNX2X_ERR("Failed to copy response status to VF %d\n",
+			  vf->abs_vfid);
+	}
+	return;
+}
+
+static void bnx2x_vf_mbx_acquire_resp(struct bnx2x *bp, struct bnx2x_virtf *vf,
+				      struct bnx2x_vf_mbx *mbx, int vfop_status)
+{
+	int i;
+	struct pfvf_acquire_resp_tlv *resp = &mbx->msg->resp.acquire_resp;
+	struct pf_vf_resc *resc = &resp->resc;
+	u8 status = bnx2x_pfvf_status_codes(vfop_status);
+
+	memset(resp, 0, sizeof(*resp));
+
+	/* fill in pfdev info */
+	resp->pfdev_info.chip_num = bp->common.chip_id;
+	resp->pfdev_info.db_size = (1 << BNX2X_DB_SHIFT);
+	resp->pfdev_info.indices_per_sb = HC_SB_MAX_INDICES_E2;
+	resp->pfdev_info.pf_cap = (PFVF_CAP_RSS |
+				   /* PFVF_CAP_DHC |*/ PFVF_CAP_TPA);
+	bnx2x_fill_fw_str(bp, resp->pfdev_info.fw_ver,
+			  sizeof(resp->pfdev_info.fw_ver));
+
+	if (status == PFVF_STATUS_NO_RESOURCE ||
+	    status == PFVF_STATUS_SUCCESS) {
+
+		/* set resources numbers, if status equals NO_RESOURCE these
+		 * are max possible numbers
+		 */
+		resc->num_rxqs = vf_rxq_count(vf) ? :
+			bnx2x_vf_max_queue_cnt(bp, vf);
+		resc->num_txqs = vf_txq_count(vf) ? :
+			bnx2x_vf_max_queue_cnt(bp, vf);
+		resc->num_sbs = vf_sb_count(vf);
+		resc->num_mac_filters = vf_mac_rules_cnt(vf);
+		resc->num_vlan_filters = vf_vlan_rules_cnt(vf);
+		resc->num_mc_filters = 0;
+
+		if (status == PFVF_STATUS_SUCCESS) {
+			for_each_vfq(vf, i)
+				resc->hw_qid[i] =
+					vfq_qzone_id(vf, vfq_get(vf, i));
+
+			for_each_vf_sb(vf, i) {
+				resc->hw_sbs[i].hw_sb_id = vf_igu_sb(vf, i);
+				resc->hw_sbs[i].sb_qid = vf_hc_qzone(vf, i);
+			}
+		}
+	}
+
+	DP(BNX2X_MSG_IOV, "VF[%d] ACQUIRE_RESPONSE: pfdev_info- chip_num=0x%x, db_size=%d, idx_per_sb=%d, pf_cap=0x%x\n"
+	   "resources- n_rxq-%d, n_txq-%d, n_sbs-%d, n_macs-%d, n_vlans-%d, n_mcs-%d, fw_ver: '%s'\n",
+	   vf->abs_vfid,
+	   resp->pfdev_info.chip_num,
+	   resp->pfdev_info.db_size,
+	   resp->pfdev_info.indices_per_sb,
+	   resp->pfdev_info.pf_cap,
+	   resc->num_rxqs,
+	   resc->num_txqs,
+	   resc->num_sbs,
+	   resc->num_mac_filters,
+	   resc->num_vlan_filters,
+	   resc->num_mc_filters,
+	   resp->pfdev_info.fw_ver);
+
+	DP_CONT(BNX2X_MSG_IOV, "hw_qids- [ ");
+	for (i = 0; i < vf_rxq_count(vf); i++)
+		DP_CONT(BNX2X_MSG_IOV, "%d ", resc->hw_qid[i]);
+	DP_CONT(BNX2X_MSG_IOV, "], sb_info- [ ");
+	for (i = 0; i < vf_sb_count(vf); i++)
+		DP_CONT(BNX2X_MSG_IOV, "%d:%d ",
+			resc->hw_sbs[i].hw_sb_id,
+			resc->hw_sbs[i].sb_qid);
+	DP_CONT(BNX2X_MSG_IOV, "]\n");
+
+	/* send the response */
+	vf->op_rc = vfop_status;
+	bnx2x_vf_mbx_resp(bp, vf);
+}
+
+static void bnx2x_vf_mbx_acquire(struct bnx2x *bp, struct bnx2x_virtf *vf,
+				 struct bnx2x_vf_mbx *mbx)
+{
+	int rc;
+	struct vfpf_acquire_tlv *acquire = &mbx->msg->req.acquire;
+
+	/* log vfdef info */
+	DP(BNX2X_MSG_IOV,
+	   "VF[%d] ACQUIRE: vfdev_info- vf_id %d, vf_os %d resources- n_rxq-%d, n_txq-%d, n_sbs-%d, n_macs-%d, n_vlans-%d, n_mcs-%d\n",
+	   vf->abs_vfid, acquire->vfdev_info.vf_id, acquire->vfdev_info.vf_os,
+	   acquire->resc_request.num_rxqs, acquire->resc_request.num_txqs,
+	   acquire->resc_request.num_sbs, acquire->resc_request.num_mac_filters,
+	   acquire->resc_request.num_vlan_filters,
+	   acquire->resc_request.num_mc_filters);
+
+	/* acquire the resources */
+	rc = bnx2x_vf_acquire(bp, vf, &acquire->resc_request);
+
+	/* response */
+	bnx2x_vf_mbx_acquire_resp(bp, vf, mbx, rc);
+}
+
 /* dispatch request */
 static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 				  struct bnx2x_vf_mbx *mbx)
@@ -193,8 +367,16 @@ static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 	/* check if tlv type is known */
 	if (bnx2x_tlv_supported(mbx->first_tlv.tl.type)) {
 
+		/* Lock the per vf op mutex and note the locker's identity.
+		 * The unlock will take place in mbx response.
+		 */
+		bnx2x_lock_vf_pf_channel(bp, vf, mbx->first_tlv.tl.type);
+
 		/* switch on the opcode */
 		switch (mbx->first_tlv.tl.type) {
+		case CHANNEL_TLV_ACQUIRE:
+			bnx2x_vf_mbx_acquire(bp, vf, mbx);
+			break;
 		}
 
 	/* unknown TLV - this may belong to a VF driver from the future - a
@@ -209,6 +391,25 @@ static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		for (i = 0; i < 20; i++)
 			DP_CONT(BNX2X_MSG_IOV, "%x ",
 				mbx->msg->req.tlv_buf_size.tlv_buffer[i]);
+
+		/* test whether we can respond to the VF (do we have an address
+		 * for it?)
+		 */
+		if (vf->state == VF_ACQUIRED) {
+
+			/* mbx_resp uses the op_rc of the VF */
+			vf->op_rc = PFVF_STATUS_NOT_SUPPORTED;
+
+			/* notify the VF that we do not support this request */
+			bnx2x_vf_mbx_resp(bp, vf);
+		} else {
+
+			/* can't send a response since this VF is unknown to us
+			 * just unlock the channel and be done with.
+			 */
+			bnx2x_unlock_vf_pf_channel(bp, vf,
+						   mbx->first_tlv.tl.type);
+		}
 	}
 }
 
-- 
1.7.9.GIT

^ permalink raw reply related

* [PATCH net-next v3 13/22] bnx2x: Support of PF driver of a VF init request
From: Ariel Elior @ 2012-12-10 15:46 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ariel Elior, Eilon Greenstein
In-Reply-To: <1355154406-10855-1-git-send-email-ariele@broadcom.com>

The VF driver will send an 'init' request as part of its nic load
flow. This message is used by the VF to publish the GPA's of its
status blocks, slow path ring and statistics buffer.
The PF driver notes all this down in the VF database, and also uses
this message to transfer the VF to VF_INIT state internally.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h       |    2 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c  |    2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h   |    4 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c |  160 +++++++++++++++++++++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h |    6 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c  |   20 +++
 6 files changed, 193 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index 6fc3fef..c97616c 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -1852,6 +1852,8 @@ int bnx2x_del_all_macs(struct bnx2x *bp,
 
 /* Init Function API  */
 void bnx2x_func_init(struct bnx2x *bp, struct bnx2x_func_init_params *p);
+void bnx2x_init_sb(struct bnx2x *bp, dma_addr_t mapping, int vfid,
+			  u8 vf_valid, int fw_sb_id, int igu_sb_id);
 u32 bnx2x_get_pretend_reg(struct bnx2x *bp);
 int bnx2x_get_gpio(struct bnx2x *bp, int gpio_num, u8 port);
 int bnx2x_set_gpio(struct bnx2x *bp, int gpio_num, u32 mode, u8 port);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 07e2963..78514ab 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -5419,7 +5419,7 @@ static void bnx2x_map_sb_state_machines(struct hc_index_data *index_data)
 		SM_TX_ID << HC_INDEX_DATA_SM_ID_SHIFT;
 }
 
-static void bnx2x_init_sb(struct bnx2x *bp, dma_addr_t mapping, int vfid,
+void bnx2x_init_sb(struct bnx2x *bp, dma_addr_t mapping, int vfid,
 			  u8 vf_valid, int fw_sb_id, int igu_sb_id)
 {
 	int igu_seg_id;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
index cefe448..1008d3f 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_reg.h
@@ -3726,6 +3726,10 @@
 #define PXP_REG_HST_DISCARD_INTERNAL_WRITES_STATUS		 0x10309c
 /* [WB 160] Used for initialization of the inbound interrupts memory */
 #define PXP_REG_HST_INBOUND_INT 				 0x103800
+/* [RW 7] Indirect access to the permission table. The fields are : {Valid;
+ * VFID[5:0]}
+ */
+#define PXP_REG_HST_ZONE_PERMISSION_TABLE			 0x103400
 /* [RW 32] Interrupt mask register #0 read/write */
 #define PXP_REG_PXP_INT_MASK_0					 0x103074
 #define PXP_REG_PXP_INT_MASK_1					 0x103084
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index 0c499c8..99fd606 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -65,6 +65,41 @@ struct bnx2x_virtf *bnx2x_vf_by_abs_fid(struct bnx2x *bp, u16 abs_vfid)
 	return (idx < BNX2X_NR_VIRTFN(bp)) ? BP_VF(bp, idx) : NULL;
 }
 
+static void bnx2x_vf_igu_ack_sb(struct bnx2x *bp, struct bnx2x_virtf *vf,
+				u8 igu_sb_id, u8 segment, u16 index, u8 op,
+				u8 update)
+{
+	/* acking a VF sb through the PF - use the GRC */
+	u32 ctl;
+	u32 igu_addr_data = IGU_REG_COMMAND_REG_32LSB_DATA;
+	u32 igu_addr_ctl = IGU_REG_COMMAND_REG_CTRL;
+	u32 func_encode = vf->abs_vfid;
+	u32 addr_encode = IGU_CMD_E2_PROD_UPD_BASE + igu_sb_id;
+	struct igu_regular cmd_data = {0};
+
+	cmd_data.sb_id_and_flags =
+			((index << IGU_REGULAR_SB_INDEX_SHIFT) |
+			 (segment << IGU_REGULAR_SEGMENT_ACCESS_SHIFT) |
+			 (update << IGU_REGULAR_BUPDATE_SHIFT) |
+			 (op << IGU_REGULAR_ENABLE_INT_SHIFT));
+
+	ctl = addr_encode << IGU_CTRL_REG_ADDRESS_SHIFT		|
+	      func_encode << IGU_CTRL_REG_FID_SHIFT		|
+	      IGU_CTRL_CMD_TYPE_WR << IGU_CTRL_REG_TYPE_SHIFT;
+
+	DP(NETIF_MSG_HW, "write 0x%08x to IGU(via GRC) addr 0x%x\n",
+	   cmd_data.sb_id_and_flags, igu_addr_data);
+	REG_WR(bp, igu_addr_data, cmd_data.sb_id_and_flags);
+	mmiowb();
+	barrier();
+
+	DP(NETIF_MSG_HW, "write 0x%08x to IGU(via GRC) addr 0x%x\n",
+	   ctl, igu_addr_ctl);
+	REG_WR(bp, igu_addr_ctl, ctl);
+	mmiowb();
+	barrier();
+}
+
 static int bnx2x_ari_enabled(struct pci_dev *dev)
 {
 	return dev->bus->self && dev->bus->self->ari_enabled;
@@ -364,6 +399,52 @@ static void bnx2x_vf_pglue_clear_err(struct bnx2x *bp, u8 abs_vfid)
 	REG_WR(bp, was_err_reg, 1 << (abs_vfid & 0x1f));
 }
 
+static void bnx2x_vf_igu_reset(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	int i;
+	u32 val;
+
+	/* Set VF masks and configuration - pretend */
+	bnx2x_pretend_func(bp, HW_VF_HANDLE(bp, vf->abs_vfid));
+
+	REG_WR(bp, IGU_REG_SB_INT_BEFORE_MASK_LSB, 0);
+	REG_WR(bp, IGU_REG_SB_INT_BEFORE_MASK_MSB, 0);
+	REG_WR(bp, IGU_REG_SB_MASK_LSB, 0);
+	REG_WR(bp, IGU_REG_SB_MASK_MSB, 0);
+	REG_WR(bp, IGU_REG_PBA_STATUS_LSB, 0);
+	REG_WR(bp, IGU_REG_PBA_STATUS_MSB, 0);
+
+	val = REG_RD(bp, IGU_REG_VF_CONFIGURATION);
+	val |= (IGU_VF_CONF_FUNC_EN | IGU_VF_CONF_MSI_MSIX_EN);
+	if (vf->cfg_flags & VF_CFG_INT_SIMD)
+		val |= IGU_VF_CONF_SINGLE_ISR_EN;
+	val &= ~IGU_VF_CONF_PARENT_MASK;
+	val |= BP_FUNC(bp) << IGU_VF_CONF_PARENT_SHIFT;	/* parent PF */
+	REG_WR(bp, IGU_REG_VF_CONFIGURATION, val);
+
+	DP(BNX2X_MSG_IOV,
+	   "value in IGU_REG_VF_CONFIGURATION of vf %d after write %x\n",
+	   vf->abs_vfid, REG_RD(bp, IGU_REG_VF_CONFIGURATION));
+
+	bnx2x_pretend_func(bp, BP_ABS_FUNC(bp));
+
+	/* iterate over all queues, clear sb consumer */
+	for (i = 0; i < vf_sb_count(vf); i++) {
+		u8 igu_sb_id = vf_igu_sb(vf, i);
+
+		/* zero prod memory */
+		REG_WR(bp, IGU_REG_PROD_CONS_MEMORY + igu_sb_id * 4, 0);
+
+		/* clear sb state machine */
+		bnx2x_igu_clear_sb_gen(bp, vf->abs_vfid, igu_sb_id,
+				       false /* VF */);
+
+		/* disable + update */
+		bnx2x_vf_igu_ack_sb(bp, vf, igu_sb_id, USTORM_ID, 0,
+				    IGU_INT_DISABLE, 1);
+	}
+}
+
 void bnx2x_vf_enable_access(struct bnx2x *bp, u8 abs_vfid)
 {
 	/* set the VF-PF association in the FW */
@@ -381,6 +462,17 @@ void bnx2x_vf_enable_access(struct bnx2x *bp, u8 abs_vfid)
 	bnx2x_pretend_func(bp, BP_ABS_FUNC(bp));
 }
 
+static void bnx2x_vf_enable_traffic(struct bnx2x *bp, struct bnx2x_virtf *vf)
+{
+	/* Reset vf in IGU  interrupts are still disabled */
+	bnx2x_vf_igu_reset(bp, vf);
+
+	/* pretend to enable the vf with the PBF */
+	bnx2x_pretend_func(bp, HW_VF_HANDLE(bp, vf->abs_vfid));
+	REG_WR(bp, PBF_REG_DISABLE_VF, 0);
+	bnx2x_pretend_func(bp, BP_ABS_FUNC(bp));
+}
+
 static u8 bnx2x_vf_is_pcie_pending(struct bnx2x *bp, u8 abs_vfid)
 {
 	struct pci_dev *dev;
@@ -996,6 +1088,13 @@ void bnx2x_iov_sp_task(struct bnx2x *bp)
 		}
 	}
 }
+static void bnx2x_vf_qtbl_set_q(struct bnx2x *bp, u8 abs_vfid, u8 qid,
+				u8 enable)
+{
+	u32 reg = PXP_REG_HST_ZONE_PERMISSION_TABLE + qid * 4;
+	u32 val = enable ? (abs_vfid | (1 << 6)) : 0;
+	REG_WR(bp, reg, val);
+}
 
 u8 bnx2x_vf_max_queue_cnt(struct bnx2x *bp, struct bnx2x_virtf *vf)
 {
@@ -1107,6 +1206,67 @@ int bnx2x_vf_acquire(struct bnx2x *bp, struct bnx2x_virtf *vf,
 	return 0;
 }
 
+int bnx2x_vf_init(struct bnx2x *bp, struct bnx2x_virtf *vf, dma_addr_t *sb_map)
+{
+	struct bnx2x_func_init_params func_init = {0};
+	u16 flags = 0;
+	int i;
+
+	/* the sb resources are initialized at this point, do the
+	 * FW/HW initializations
+	 */
+	for_each_vf_sb(vf, i)
+		bnx2x_init_sb(bp, (dma_addr_t)sb_map[i], vf->abs_vfid, true,
+			      vf_igu_sb(vf, i), vf_igu_sb(vf, i));
+
+	/* Sanity checks */
+	if (vf->state != VF_ACQUIRED) {
+		DP(BNX2X_MSG_IOV, "VF[%d] is not in VF_ACQUIRED, but %d\n",
+		   vf->abs_vfid, vf->state);
+		return -EINVAL;
+	}
+	/* FLR cleanup epilogue */
+	if (bnx2x_vf_flr_clnup_epilog(bp, vf->abs_vfid))
+		return -EBUSY;
+
+	/* reset IGU VF statistics: MSIX */
+	REG_WR(bp, IGU_REG_STATISTIC_NUM_MESSAGE_SENT + vf->abs_vfid * 4 , 0);
+
+	/* vf init */
+	if (vf->cfg_flags & VF_CFG_STATS)
+		flags |= (FUNC_FLG_STATS | FUNC_FLG_SPQ);
+
+	if (vf->cfg_flags & VF_CFG_TPA)
+		flags |= FUNC_FLG_TPA;
+
+	if (is_vf_multi(vf))
+		flags |= FUNC_FLG_RSS;
+
+	/* function setup */
+	func_init.func_flgs = flags;
+	func_init.pf_id = BP_FUNC(bp);
+	func_init.func_id = FW_VF_HANDLE(vf->abs_vfid);
+	func_init.fw_stat_map = vf->fw_stat_map;
+	func_init.spq_map = vf->spq_map;
+	func_init.spq_prod = 0;
+	bnx2x_func_init(bp, &func_init);
+
+	/* Configure RSS TBD */
+
+	/* Enable the vf */
+	bnx2x_vf_enable_access(bp, vf->abs_vfid);
+	bnx2x_vf_enable_traffic(bp, vf);
+
+	/* queue protection table */
+	for_each_vfq(vf, i)
+		bnx2x_vf_qtbl_set_q(bp, vf->abs_vfid,
+				    vfq_qzone_id(vf, vfq_get(vf, i)), true);
+
+	vf->state = VF_ENABLED;
+
+	return 0;
+}
+
 void bnx2x_lock_vf_pf_channel(struct bnx2x *bp, struct bnx2x_virtf *vf,
 			      enum channel_tlvs tlv)
 {
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index e494b95..152a28e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -267,6 +267,8 @@ struct bnx2x_virtf {
 #define for_each_vf_sb(vf, var) \
 		for ((var) = 0; (var) < vf_sb_count(vf); (var)++)
 
+#define is_vf_multi(vf)	(vf_rxq_count(vf) > 1)
+
 #define HW_VF_HANDLE(bp, abs_vfid) \
 	(u16)(BP_ABS_FUNC((bp)) | (1<<3) |  ((u16)(abs_vfid) << 4))
 
@@ -430,6 +432,10 @@ void bnx2x_vf_enable_mbx(struct bnx2x *bp, u8 abs_vfid);
 int bnx2x_vf_acquire(struct bnx2x *bp, struct bnx2x_virtf *vf,
 			  struct vf_pf_resc_request *resc);
 
+/* init */
+int bnx2x_vf_init(struct bnx2x *bp, struct bnx2x_virtf *vf,
+		  dma_addr_t *sb_map);
+
 static inline struct bnx2x_vfop *bnx2x_vfop_cur(struct bnx2x *bp,
 						struct bnx2x_virtf *vf)
 {
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
index 2f46972..ba5503e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c
@@ -20,6 +20,8 @@
 /* VF MBX (aka vf-pf channel) */
 #include "bnx2x.h"
 #include "bnx2x_sriov.h"
+#include <linux/crc32.h>
+
 /* place a given tlv on the tlv buffer at a given offset */
 void bnx2x_add_tlv(struct bnx2x *bp, void *tlvs_list, u16 offset, u16 type,
 		   u16 length)
@@ -358,6 +360,21 @@ static void bnx2x_vf_mbx_acquire(struct bnx2x *bp, struct bnx2x_virtf *vf,
 	bnx2x_vf_mbx_acquire_resp(bp, vf, mbx, rc);
 }
 
+static void bnx2x_vf_mbx_init_vf(struct bnx2x *bp, struct bnx2x_virtf *vf,
+			      struct bnx2x_vf_mbx *mbx)
+{
+	struct vfpf_init_tlv *init = &mbx->msg->req.init;
+
+	/* record ghost addresses from vf message */
+	vf->spq_map = init->spq_addr;
+	vf->fw_stat_map = init->stats_addr;
+
+	vf->op_rc = bnx2x_vf_init(bp, vf, (dma_addr_t *)init->sb_addr);
+
+	/* response */
+	bnx2x_vf_mbx_resp(bp, vf);
+}
+
 /* dispatch request */
 static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 				  struct bnx2x_vf_mbx *mbx)
@@ -377,6 +394,9 @@ static void bnx2x_vf_mbx_request(struct bnx2x *bp, struct bnx2x_virtf *vf,
 		case CHANNEL_TLV_ACQUIRE:
 			bnx2x_vf_mbx_acquire(bp, vf, mbx);
 			break;
+		case CHANNEL_TLV_INIT:
+			bnx2x_vf_mbx_init_vf(bp, vf, mbx);
+			break;
 		}
 
 	/* unknown TLV - this may belong to a VF driver from the future - a
-- 
1.7.9.GIT

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox