Netdev List
 help / color / mirror / Atom feed
* RE: [PATCH V2 1/2] Export firmware assigned labels of network devices to sysfs
From: Narendra_K @ 2010-06-04 20:31 UTC (permalink / raw)
  To: Narendra_K, greg; +Cc: netdev, linux-hotplug, linux-pci, Matt_Domsch
In-Reply-To: <20100603210715.GA18025@auslistsprd01.us.dell.com>



With regards,
Narendra K


> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Narendra K
> Sent: Friday, June 04, 2010 2:37 AM
> To: greg@kroah.com
> Cc: netdev@vger.kernel.org; linux-hotplug@vger.kernel.org; linux-
> pci@vger.kernel.org; Domsch, Matt
> Subject: Re: [PATCH V2 1/2] Export firmware assigned labels of network
> devices to sysfs
> 
> > -----Original Message-----
> > From: Greg KH [mailto:greg@kroah.com]
> > Sent: Friday, May 28, 2010 9:11 PM
> > To: K, Narendra
> > Cc: netdev@vger.kernel.org; linux-hotplug@vger.kernel.org;
> > linux-pci@vger.kernel.org; Domsch, Matt; Hargrave, Jordan; Rose,
> > Charles; Nijhawan, Vijay
> > Subject: Re: [PATCH 1/2] Export firmware assigned labels of network
> > devices to sysfs
> >
> Thanks for the comments.
> 
> > On Fri, May 28, 2010 at 06:55:21AM -0500, K, Narendra wrote:
> > > Please refer to the PCI-SIG Draft ECN
> > > "PCIe Device Labeling under Operating Systems Draft ECN" at this
> link
> > -
> > > http://www.pcisig.com/specifications/pciexpress/review_zone/.
> > >
> > > It would be great to know your views on this ECN. Please let us
> know
> > if you have
> > > have any suggestions or changes.
> >
> > Note that only members of the PCI-SIG can do this, which pretty much
> > rules out any "normal" Linux kernel developer :(
> >
> > Care to post a public version of this for us to review?
> > > --- /dev/null
> > > +++ b/drivers/pci/pci-label.c
> > > @@ -0,0 +1,242 @@
> > > +/*
> > > + * File:       drivers/pci/pci-label.c
> >
> > This line is not needed, we know the file name :)
> >
> > > + * Purpose:    Export the firmware label associated with a pci
> > network interface
> > > + * device to sysfs
> > > + * Copyright (C) 2010 Dell Inc.
> > > + * by Narendra K <Narendra_K@dell.com>, Jordan Hargrave
> > <Jordan_Hargrave@dell.com>
> > > + *
> > > + * This code checks if the pci network device has a related ACPI
> > _DSM. If
> > > + * available, the code calls the _DSM to retrieve the index and
> > string and
> > > + * exports them to sysfs. If the ACPI _DSM is not available, it
> falls
> > back on
> > > + * SMBIOS. SMBIOS defines type 41 for onboard pci devices. This
> code
> > retrieves
> > > + * strings associated with the type 41 and exports it to sysfs.
> > > + *
> > > + * Please see
> http://linux.dell.com/wiki/index.php/Oss/libnetdevname
> > for more
> > > + * information.
> > > + */
> > > +
> > > +#include <linux/pci-label.h>
> >
> > Why is this file in include/linux/ ?  Who needs it there?  Can't it
> just
> > be in in the drivers/pci/ directory?  Actually all you need is 2
> > functions in there, so it could go into the internal pci.h file in
> that
> > directory without a problem, right?
> >
> 
> This file is removed and functions are moved to the internal pci.h
> 
> > > +
> > > +static ssize_t
> > > +smbiosname_string_exists(struct device *dev, char *buf)
> > > +{
> > > +       struct pci_dev *pdev = to_pci_dev(dev);
> > > +       const struct dmi_device *dmi;
> > > +       struct dmi_devslot *dslot;
> > > +       int bus;
> > > +       int devfn;
> > > +
> > > +       bus = pdev->bus->number;
> > > +       devfn = pdev->devfn;
> > > +
> > > +       dmi = NULL;
> > > +       while ((dmi = dmi_find_device(DMI_DEV_TYPE_DEVSLOT, NULL,
> > dmi)) != NULL) {
> > > +               dslot = dmi->device_data;
> > > +               if (dslot && dslot->bus == bus && dslot->devfn ==
> > devfn) {
> > > +                       if (buf)
> > > +                               return scnprintf(buf, PAGE_SIZE,
> > "%s\n", dmi->name);
> > > +                       return strlen(dmi->name);
> > > +               }
> > > +       }
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static ssize_t
> > > +smbiosname_show(struct device *dev, struct device_attribute *attr,
> > char *buf)
> > > +{
> > > +       return smbiosname_string_exists(dev, buf);
> > > +}
> > > +
> > > +struct smbios_attribute smbios_attr_label = {
> > > +       .attr = {.name = __stringify(label), .mode = 0444, .owner =
> > THIS_MODULE},
> >
> > Can't you just put "label" as the name?
> >
> 
> This is fixed.
> 
> > > +       .show = smbiosname_show,
> > > +       .test = smbiosname_string_exists,
> > > +};
> > > +
> > > +static int
> > > +pci_create_smbiosname_file(struct pci_dev *pdev)
> > > +{
> > > +       if (smbios_attr_label.test &&
> > smbios_attr_label.test(&pdev->dev, NULL)) {
> > > +               sysfs_create_file(&pdev->dev.kobj,
> > &smbios_attr_label.attr);
> > > +               return 0;
> > > +       }
> > > +       return -1;
> > > +}
> > > +
> > > +static int
> > > +pci_remove_smbiosname_file(struct pci_dev *pdev)
> > > +{
> > > +       if (smbios_attr_label.test &&
> > smbios_attr_label.test(&pdev->dev, NULL)) {
> > > +               sysfs_remove_file(&pdev->dev.kobj,
> > &smbios_attr_label.attr);
> > > +               return 0;
> > > +       }
> > > +       return -1;
> > > +}
> > > +
> > > +static const char dell_dsm_uuid[] = {
> >
> > Um, a dell specific uuid in a generic file?  What happens when we
> need
> > to support another manufacturer?
> >
> 
> My understanding of uuid was incorrect. I have renamed it to a more
> generic
> device_label_dsm_uuid and ACPI_DSM_FUNCTION to DEVICE_LABEL_DSM
> 
> > > +       0xD0, 0x37, 0xC9, 0xE5, 0x53, 0x35, 0x7A, 0x4D,
> > > +       0x91, 0x17, 0xEA, 0x4D, 0x19, 0xC3, 0x43, 0x4D
> > > +};
> > > +
> > > +
> > > +static int
> > > +dsm_get_label(acpi_handle handle, int func,
> > > +              struct acpi_buffer *output,
> > > +              char *buf, char *attribute)
> > > +{
> > > +       struct acpi_object_list input;
> > > +       union acpi_object params[4];
> > > +       union acpi_object *obj;
> > > +       int len = 0;
> > > +
> > > +       int err;
> > > +
> > > +       input.count = 4;
> > > +       input.pointer = params;
> > > +       params[0].type = ACPI_TYPE_BUFFER;
> > > +       params[0].buffer.length = sizeof(dell_dsm_uuid);
> > > +       params[0].buffer.pointer = (char *)dell_dsm_uuid;
> > > +       params[1].type = ACPI_TYPE_INTEGER;
> > > +       params[1].integer.value = 0x02;
> > > +       params[2].type = ACPI_TYPE_INTEGER;
> > > +       params[2].integer.value = func;
> > > +       params[3].type = ACPI_TYPE_PACKAGE;
> > > +       params[3].package.count = 0;
> > > +       params[3].package.elements = NULL;
> > > +
> > > +       err = acpi_evaluate_object(handle, "_DSM", &input, output);
> > > +       if (err) {
> > > +               printk(KERN_INFO "failed to evaulate _DSM\n");
> > > +               return -1;
> > > +       }
> > > +
> > > +       obj = (union acpi_object *)output->pointer;
> > > +
> > > +       switch (obj->type) {
> > > +       case ACPI_TYPE_PACKAGE:
> > > +               if (obj->package.count == 2) {
> > > +                       len = obj-
> >package.elements[0].integer.value;
> > > +                       if (buf) {
> > > +                               if (!strncmp(attribute, "index",
> > strlen(attribute)))
> > > +                                       scnprintf(buf, PAGE_SIZE,
> > "%lu\n",
> > > +
> > obj->package.elements[0].integer.value);
> > > +                               else
> > > +                                       scnprintf(buf, PAGE_SIZE,
> > "%s\n",
> > > +
> > obj->package.elements[1].string.pointer);
> > > +                               kfree(output->pointer);
> > > +                               return strlen(buf);
> > > +                       }
> > > +               }
> > > +               kfree(output->pointer);
> > > +               return len;
> > > +       break;
> > > +       default:
> > > +               return -1;
> > > +       }
> > > +}
> > > +
> > > +static ssize_t
> > > +acpi_index_string_exist(struct device *dev, char *buf, char
> > *attribute)
> > > +{
> > > +       struct pci_dev *pdev = to_pci_dev(dev);
> > > +
> > > +       struct acpi_buffer output = {ACPI_ALLOCATE_BUFFER, NULL};
> > > +       acpi_handle handle;
> > > +       int length;
> > > +       int is_addin_card = 0;
> > > +
> > > +       if ((pdev->class >> 16) != PCI_BASE_CLASS_NETWORK)
> > > +               return -1;
> > > +
> > > +       handle = DEVICE_ACPI_HANDLE(dev);
> > > +
> > > +       if (!handle) {
> > > +               /*
> > > +                * The device is an add-in network controller and
> does
> > have
> > > +                * a valid handle. Try until we get the handle for
> the
> > parent
> > > +                * bridge
> > > +                */
> > > +               struct pci_bus *pbus;
> > > +               for (pbus = pdev->bus; pbus; pbus = pbus->parent) {
> > > +                       handle =
> > DEVICE_ACPI_HANDLE(&(pbus->self->dev));
> > > +                       if (handle)
> > > +                               break;
> > > +
> > > +               }
> > > +       }
> > > +
> > > +       if ((length = dsm_get_label(handle, DELL_DSM_NETWORK,
> > > +                                   &output, buf, attribute)) < 0)
> > > +               return -1;
> > > +
> > > +       return length;
> > > +}
> > > +
> > > +static ssize_t
> > > +acpilabel_show(struct device *dev, struct device_attribute *attr,
> > char *buf)
> > > +{
> > > +       return acpi_index_string_exist(dev, buf, "label");
> > > +}
> > > +
> > > +static ssize_t
> > > +acpiindex_show(struct device *dev, struct device_attribute *attr,
> > char *buf)
> > > +{
> > > +       return acpi_index_string_exist(dev, buf, "index");
> > > +}
> > > +
> > > +struct acpi_attribute acpi_attr_label = {
> > > +       .attr = {.name = __stringify(label), .mode = 0444, .owner =
> > THIS_MODULE},
> > > +       .show = acpilabel_show,
> > > +       .test = acpi_index_string_exist,
> > > +};
> > > +
> > > +struct acpi_attribute acpi_attr_index = {
> > > +       .attr = {.name = __stringify(index), .mode = 0444, .owner =
> > THIS_MODULE},
> > > +       .show = acpiindex_show,
> > > +       .test = acpi_index_string_exist,
> > > +};
> > > +
> > > +static int
> > > +pci_create_acpi_index_label_files(struct pci_dev *pdev)
> > > +{
> > > +       if (acpi_attr_label.test && acpi_attr_label.test(&pdev-
> >dev,
> > NULL) > 0) {
> > > +               sysfs_create_file(&pdev->dev.kobj,
> > &acpi_attr_label.attr);
> > > +               sysfs_create_file(&pdev->dev.kobj,
> > &acpi_attr_index.attr);
> > > +               return 0;
> > > +       }
> > > +       return -1;
> > > +}
> > > +
> > > +static int
> > > +pci_remove_acpi_index_label_files(struct pci_dev *pdev)
> > > +{
> > > +       if (acpi_attr_label.test && acpi_attr_label.test(&pdev-
> >dev,
> > NULL) > 0) {
> > > +               sysfs_remove_file(&pdev->dev.kobj,
> > &acpi_attr_label.attr);
> > > +               sysfs_remove_file(&pdev->dev.kobj,
> > &acpi_attr_index.attr);
> > > +               return 0;
> > > +       }
> > > +       return -1;
> > > +}
> > > +
> > > +int pci_create_acpi_attr_files(struct pci_dev *pdev)
> > > +{
> > > +       if (!pci_create_acpi_index_label_files(pdev))
> > > +               return 0;
> > > +       if (!pci_create_smbiosname_file(pdev))
> > > +               return 0;
> > > +       return -ENODEV;
> > > +}
> > > +EXPORT_SYMBOL(pci_create_acpi_attr_files);
> >
> > EXPORT_SYMBOL_GPL?
> >
> > Wait, why does this need to be exported at all?  What module is ever
> > going to call this function?
> >
> > > +int pci_remove_acpi_attr_files(struct pci_dev *pdev)
> > > +{
> > > +       if (!pci_remove_acpi_index_label_files(pdev))
> > > +               return 0;
> > > +       if (!pci_remove_smbiosname_file(pdev))
> > > +               return 0;
> > > +       return -ENODEV;
> > > +
> > > +}
> > > +EXPORT_SYMBOL(pci_remove_acpi_attr_files);
> >
> > Same here, what module will call this?
> >
> 
> These functions need not be exported as they are not called by any
> module.
> 
> > > +++ b/include/linux/pci-label.h
> >
> > As discussed above, this whole file does not need to exist.
> >
> > > +extern int pci_create_acpi_attr_files(struct pci_dev *pdev);
> > > +extern int pci_remove_acpi_attr_files(struct pci_dev *pdev);
> >
> > Just put these two functions in the drivers/pci/pci.h file.
> >
> Fixed.
> 
> In addition to these changes there are a coulple of changes i have done
> -
> 
> 1.Removed the check for network devices and evaulate _DSM for any pci
> device
> that has _DSM defined in adherence to the spec.
> 
> 2.Renamed the functions pci_create,remove-acpi_attr_files to
> pci_create,remove_firmware_label_files.
> 
> 3.Added checks for conditional compilation of if CONFIG_ACPI ||
> CONFIG_DMI
> 
> Note: While testing the patch with CONFIG_ACPI set to no, the
> compilation
> would fail with the below message.
> 
>   CC      drivers/pci/pci-label.o
> In file included from drivers/pci/pci-label.c:24:
> include/linux/pci-acpi.h:39: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or
> ‘__attribute__’
> before ‘acpi_find_root_bridge_handle’
> 
> I had to add make this change to proceed with the compilation. It would
> be
> great to know if i am missing something in the way conditional
> compilation
> is implemented or is it a issue.
> 
> ---
>  include/linux/pci-acpi.h |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
> index c8b6473..bc40827 100644
> --- a/include/linux/pci-acpi.h
> +++ b/include/linux/pci-acpi.h
> @@ -36,8 +36,8 @@ static inline acpi_handle
> acpi_pci_get_bridge_handle(struct pci_bus *pbus)
>  					      pbus->number);
>  }
>  #else
> -static inline acpi_handle acpi_find_root_bridge_handle(struct pci_dev
> *pdev)
> -{ return NULL; }
> +static inline void acpi_find_root_bridge_handle(struct pci_dev *pdev)
> +{ }
>  #endif
> 
>  #endif	/* _PCI_ACPI_H_ */
> 
> 
> Please find the patch with above suggestions and changes -
> 
> 
> From: Narendra K <narendra_k@dell.com>
> Subject: [PATCH V2 1/2] Export firmware assigned labels of pci devices
> to sysfs
> 
> This patch exports the firmware assigned labels of pci devices to
> sysfs which could be used by user space.
> 
> Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
> Signed-off-by: Narendra K <narendra_k@dell.com>
> ---
>  drivers/firmware/dmi_scan.c |   24 ++++
>  drivers/pci/Makefile        |    2 +-
>  drivers/pci/pci-label.c     |  273
> +++++++++++++++++++++++++++++++++++++++++++
>  drivers/pci/pci-sysfs.c     |    5 +
>  drivers/pci/pci.h           |    2 +
>  include/linux/dmi.h         |    9 ++
>  6 files changed, 314 insertions(+), 1 deletions(-)
>  create mode 100644 drivers/pci/pci-label.c
> 
> diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
> index d464672..7d8439b 100644
> --- a/drivers/firmware/dmi_scan.c
> +++ b/drivers/firmware/dmi_scan.c
> @@ -277,6 +277,28 @@ static void __init dmi_save_ipmi_device(const
> struct dmi_header *dm)
>  	list_add_tail(&dev->list, &dmi_devices);
>  }
> 
> +static void __init dmi_save_devslot(int id, int seg, int bus, int
> devfn, const char *name)
> +{
> +	struct dmi_devslot *slot;
> +
> +	slot = dmi_alloc(sizeof(*slot) + strlen(name) + 1);
> +	if (!slot) {
> +		printk(KERN_ERR "dmi_save_devslot: out of memory.\n");
> +		return;
> +	}
> +	slot->id = id;
> +	slot->seg = seg;
> +	slot->bus = bus;
> +	slot->devfn = devfn;
> +
> +	strcpy((char *)&slot[1], name);
> +	slot->dev.type = DMI_DEV_TYPE_DEVSLOT;
> +	slot->dev.name = (char *)&slot[1];
> +	slot->dev.device_data = slot;
> +
> +	list_add(&slot->dev.list, &dmi_devices);
> +}
> +
>  static void __init dmi_save_extended_devices(const struct dmi_header
> *dm)
>  {
>  	const u8 *d = (u8*) dm + 5;
> @@ -285,6 +307,7 @@ static void __init dmi_save_extended_devices(const
> struct dmi_header *dm)
>  	if ((*d & 0x80) == 0)
>  		return;
> 
> +	dmi_save_devslot(-1, *(u16 *)(d+2), *(d+4), *(d+5),
> dmi_string_nosave(dm, *(d-1)));
>  	dmi_save_one_device(*d & 0x7f, dmi_string_nosave(dm, *(d - 1)));
>  }
> 
> @@ -333,6 +356,7 @@ static void __init dmi_decode(const struct
> dmi_header *dm, void *dummy)
>  		break;
>  	case 41:	/* Onboard Devices Extended Information */
>  		dmi_save_extended_devices(dm);
> +		break;
>  	}
>  }
> 
> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> index 0b51857..69c503a 100644
> --- a/drivers/pci/Makefile
> +++ b/drivers/pci/Makefile
> @@ -4,7 +4,7 @@
> 
>  obj-y		+= access.o bus.o probe.o remove.o pci.o \
>  			pci-driver.o search.o pci-sysfs.o rom.o setup-res.o \
> -			irq.o vpd.o
> +			irq.o vpd.o pci-label.o
>  obj-$(CONFIG_PROC_FS) += proc.o
>  obj-$(CONFIG_SYSFS) += slot.o
> 
> diff --git a/drivers/pci/pci-label.c b/drivers/pci/pci-label.c
> new file mode 100644
> index 0000000..b35d48c
> --- /dev/null
> +++ b/drivers/pci/pci-label.c
> @@ -0,0 +1,273 @@
> +/*
> + * Purpose: Export the firmware label associated with a pci network
> interface
> + * device to sysfs
> + * Copyright (C) 2010 Dell Inc.
> + * by Narendra K <Narendra_K@dell.com>, Jordan Hargrave
> <Jordan_Hargrave@dell.com>
> + *
> + * This code checks if the pci network device has a related ACPI _DSM.
> If
> + * available, the code calls the _DSM to retrieve the index and string
> and
> + * exports them to sysfs. If the ACPI _DSM is not available, it falls
> back on
> + * SMBIOS. SMBIOS defines type 41 for onboard pci devices. This code
> retrieves
> + * strings associated with the type 41 and exports it to sysfs.
> + *
> + * Please see http://linux.dell.com/wiki/index.php/Oss/libnetdevname
> for more
> + * information.
> + */
> +
> +#include <linux/dmi.h>
> +#include <linux/sysfs.h>
> +#include <linux/pci.h>
> +#include <linux/pci_ids.h>
> +#include <linux/module.h>
> +#include <linux/acpi.h>
> +#include <linux/pci-acpi.h>
> +#include <acpi/acpi_drivers.h>
> +#include <acpi/acpi_bus.h>
> +#include "pci.h"
> +
> +#define	DEVICE_LABEL_DSM	0x07
> +
> +#if defined CONFIG_DMI
> +
> +struct smbios_attribute {
> +	struct attribute attr;
> +	ssize_t (*show) (struct device *dev, char *buf);
> +	ssize_t (*test) (struct device *dev, char *buf);
> +};
> +
> +static ssize_t
> +smbiosname_string_exists(struct device *dev, char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +  	const struct dmi_device *dmi;
> +  	struct dmi_devslot *dslot;
> +  	int bus;
> +  	int devfn;
> +
> +  	bus = pdev->bus->number;
> +  	devfn = pdev->devfn;
> +
> +  	dmi = NULL;
> +  	while ((dmi = dmi_find_device(DMI_DEV_TYPE_DEVSLOT, NULL, dmi))
> != NULL) {
> +    		dslot = dmi->device_data;
> +    		if (dslot && dslot->bus == bus && dslot->devfn == devfn) {
> +			if (buf)
> +      				return scnprintf(buf, PAGE_SIZE, "%s\n",
> dmi->name);
> +			return strlen(dmi->name);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static ssize_t
> +smbiosname_show(struct device *dev, struct device_attribute *attr,
> char *buf)
> +{
> +	return smbiosname_string_exists(dev, buf);
> +}
> +
> +struct smbios_attribute smbios_attr_label = {
> +	.attr = {.name = "label", .mode = 0444, .owner = THIS_MODULE},
> +	.show = smbiosname_show,
> +	.test = smbiosname_string_exists,
> +};
> +
> +static int
> +pci_create_smbiosname_file(struct pci_dev *pdev)
> +{
> +	if (smbios_attr_label.test && smbios_attr_label.test(&pdev->dev,
> NULL)) {
> +		sysfs_create_file(&pdev->dev.kobj,
> &smbios_attr_label.attr);
> +		return 0;
> +	}
> +	return -1;
> +}
> +
> +static int
> +pci_remove_smbiosname_file(struct pci_dev *pdev)
> +{
> +	if (smbios_attr_label.test && smbios_attr_label.test(&pdev->dev,
> NULL)) {
> +		sysfs_remove_file(&pdev->dev.kobj,
> &smbios_attr_label.attr);
> +		return 0;
> +	}
> +	return -1;
> +}
> +#else
> +static inline int
> +pci_create_smbiosname_file(struct pci_dev *pdev)
> +{
> +	return -1;
> +}
> +
> +static inline int
> +pci_remove_smbiosname_file(struct pci_dev *pdev)
> +{
> +	return -1;
> +}
> +#endif
> +
> +#if defined CONFIG_ACPI
> +
> +static const char device_label_dsm_uuid[] = {
> +	0xD0, 0x37, 0xC9, 0xE5, 0x53, 0x35, 0x7A, 0x4D,
> +	0x91, 0x17, 0xEA, 0x4D, 0x19, 0xC3, 0x43, 0x4D
> +};
> +
> +struct acpi_attribute {
> +	struct attribute attr;
> +	ssize_t (*show) (struct device *dev, char *buf);
> +	ssize_t (*test) (struct device *dev, char *buf);
> +};
> +
> +static int
> +dsm_get_label(acpi_handle handle, int func,
> +              struct acpi_buffer *output,
> +              char *buf, char *attribute)
> +{
> +	struct acpi_object_list input;
> +	union acpi_object params[4];
> +	union acpi_object *obj;
> +	int len = 0;
> +
> +	int err;
> +
> +	input.count = 4;
> +	input.pointer = params;
> +	params[0].type = ACPI_TYPE_BUFFER;
> +	params[0].buffer.length = sizeof(device_label_dsm_uuid);
> +	params[0].buffer.pointer = (char *)device_label_dsm_uuid;
> +	params[1].type = ACPI_TYPE_INTEGER;
> +	params[1].integer.value = 0x02;
> +	params[2].type = ACPI_TYPE_INTEGER;
> +	params[2].integer.value = func;
> +	params[3].type = ACPI_TYPE_PACKAGE;
> +	params[3].package.count = 0;
> +	params[3].package.elements = NULL;
> +
> +	err = acpi_evaluate_object(handle, "_DSM", &input, output);
> +	if (err)
> +		return -1;
> +
> +
> +	obj = (union acpi_object *)output->pointer;
> +
> +	switch (obj->type) {
> +	case ACPI_TYPE_PACKAGE:
> +		if (obj->package.count != 2)
> +			break;
> +		len = obj->package.elements[0].integer.value;
> +		if (buf) {
> +			if (!strncmp(attribute, "index", strlen(attribute)))
> +				scnprintf(buf, PAGE_SIZE, "%lu\n",
> +				obj->package.elements[0].integer.value);
> +			else
> +				scnprintf(buf, PAGE_SIZE, "%s\n",
> +				obj->package.elements[1].string.pointer);
> +			kfree(output->pointer);
> +			return strlen(buf);
> +		}
> +		kfree(output->pointer);
> +		return len;
> +	break;
> +	default:
> +		kfree(output->pointer);
> +		return -1;
> +	}
> +}
> +
> +static ssize_t
> +acpi_index_string_exist(struct device *dev, char *buf, char
> *attribute)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	struct acpi_buffer output = {ACPI_ALLOCATE_BUFFER, NULL};
> +	acpi_handle handle;
> +	int length;
> +
> +	handle = DEVICE_ACPI_HANDLE(dev);
> +
> +	if (!handle)
> +		return -1;
> +
> +	if ((length = dsm_get_label(handle, DEVICE_LABEL_DSM,
> +				    &output, buf, attribute)) < 0)
> +		return -1;
> +
> +	return length;
> +}
> +
> +static ssize_t
> +acpilabel_show(struct device *dev, struct device_attribute *attr, char
> *buf)
> +{
> +	return acpi_index_string_exist(dev, buf, "label");
> +}
> +
> +static ssize_t
> +acpiindex_show(struct device *dev, struct device_attribute *attr, char
> *buf)
> +{
> +	return acpi_index_string_exist(dev, buf, "index");
> +}
> +
> +struct acpi_attribute acpi_attr_label = {
> +	.attr = {.name = "label", .mode = 0444, .owner = THIS_MODULE},
> +	.show = acpilabel_show,
> +	.test = acpi_index_string_exist,
> +};
> +
> +struct acpi_attribute acpi_attr_index = {
> +	.attr = {.name = "index", .mode = 0444, .owner = THIS_MODULE},
> +	.show = acpiindex_show,
> +	.test = acpi_index_string_exist,
> +};
> +
> +static int
> +pci_create_acpi_index_label_files(struct pci_dev *pdev)
> +{
> +	if (acpi_attr_label.test && acpi_attr_label.test(&pdev->dev,
> NULL) > 0) {
> +		sysfs_create_file(&pdev->dev.kobj, &acpi_attr_label.attr);
> +		sysfs_create_file(&pdev->dev.kobj, &acpi_attr_index.attr);
> +		return 0;
> +	}
> +	return -1;
> +}
> +
> +static int
> +pci_remove_acpi_index_label_files(struct pci_dev *pdev)
> +{
> +	if (acpi_attr_label.test && acpi_attr_label.test(&pdev->dev,
> NULL) > 0) {
> +		sysfs_remove_file(&pdev->dev.kobj, &acpi_attr_label.attr);
> +		sysfs_remove_file(&pdev->dev.kobj, &acpi_attr_index.attr);
> +		return 0;
> +	}
> +	return -1;
> +}
> +#else
> +static inline int
> +pci_create_acpi_index_label_files(struct pci_dev *pdev)
> +{
> +	return -1;
> +}
> +
> +static inline int
> +pci_remove_acpi_index_label_files(struct pci_dev *pdev)
> +{
> +	return -1;
> +}
> +#endif
> +
> +int pci_create_firmware_label_files(struct pci_dev *pdev)
> +{
> +	if (!pci_create_acpi_index_label_files(pdev))
> +		return 0;
> +	if (!pci_create_smbiosname_file(pdev))
> +		return 0;
> +	return -ENODEV;
> +}
> +
> +int pci_remove_firmware_label_files(struct pci_dev *pdev)
> +{
> +	if (!pci_remove_acpi_index_label_files(pdev))
> +		return 0;
> +	if (!pci_remove_smbiosname_file(pdev))
> +		return 0;
> +	return -ENODEV;
> +}
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index fad9398..4ed517f 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1073,6 +1073,8 @@ int __must_check pci_create_sysfs_dev_files
> (struct pci_dev *pdev)
>  	if (retval)
>  		goto err_vga_file;
> 
> +	pci_create_firmware_label_files(pdev);
> +
>  	return 0;
> 
>  err_vga_file:
> @@ -1140,6 +1142,9 @@ void pci_remove_sysfs_dev_files(struct pci_dev
> *pdev)
>  		sysfs_remove_bin_file(&pdev->dev.kobj, pdev->rom_attr);
>  		kfree(pdev->rom_attr);
>  	}
> +
> +	pci_remove_firmware_label_files(pdev);
> +
>  }
> 
>  static int __init pci_sysfs_init(void)
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 4eb10f4..f223283 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -11,6 +11,8 @@
>  extern int pci_uevent(struct device *dev, struct kobj_uevent_env
> *env);
>  extern int pci_create_sysfs_dev_files(struct pci_dev *pdev);
>  extern void pci_remove_sysfs_dev_files(struct pci_dev *pdev);
> +extern int pci_create_firmware_label_files(struct pci_dev *pdev);
> +extern int pci_remove_firmware_label_files(struct pci_dev *pdev);
>  extern void pci_cleanup_rom(struct pci_dev *dev);
>  #ifdef HAVE_PCI_MMAP
>  extern int pci_mmap_fits(struct pci_dev *pdev, int resno,
> diff --git a/include/linux/dmi.h b/include/linux/dmi.h
> index a8a3e1a..cc57c3a 100644
> --- a/include/linux/dmi.h
> +++ b/include/linux/dmi.h
> @@ -20,6 +20,7 @@ enum dmi_device_type {
>  	DMI_DEV_TYPE_SAS,
>  	DMI_DEV_TYPE_IPMI = -1,
>  	DMI_DEV_TYPE_OEM_STRING = -2,
> +	DMI_DEV_TYPE_DEVSLOT = -3,
>  };
> 
>  struct dmi_header {
> @@ -37,6 +38,14 @@ struct dmi_device {
> 
>  #ifdef CONFIG_DMI
> 
> +struct dmi_devslot {
> +	struct dmi_device dev;
> +	int id;
> +	int seg;
> +	int bus;
> +	int devfn;
> +};
> +
>  extern int dmi_check_system(const struct dmi_system_id *list);
>  const struct dmi_system_id *dmi_first_match(const struct dmi_system_id
> *list);
>  extern const char * dmi_get_system_info(int field);
> --
> 1.6.5.2
> 
> With regards,
> Narendra K
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 1/2] net: Enable 64-bit net device statistics on 32-bit architectures
From: Stephen Hemminger @ 2010-06-04 20:39 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: David Miller, Arnd Bergmann, netdev, linux-net-drivers
In-Reply-To: <1275675318.2095.30.camel@achroite.uk.solarflarecom.com>

On Fri, 04 Jun 2010 19:15:18 +0100
Ben Hutchings <bhutchings@solarflare.com> wrote:

> On Fri, 2010-06-04 at 10:28 -0700, Stephen Hemminger wrote:
> > On Thu, 03 Jun 2010 20:11:38 +0100
> > Ben Hutchings <bhutchings@solarflare.com> wrote:
> > 
> > > static inline u64 rtnl_link_stats64_read(const u64 *field)
> > > {
> > > 	return ACCESS_ONCE(*field);
> > > }
> > > static inline u32 rtnl_link_stats64_read32(const u64 *field)
> > > {
> > > 	return ACCESS_ONCE(*field);
> > > }
> > 
> > Do we really care if compiler reorders access. I think not.
> > There was no order guarantee in the past.
> 
> Since these reads are potentially racing with writes, we want to ensure
> that they are atomic.  Without the volatile-qualification, the compiler
> can legitimately split or repeat the reads, though I don't see any
> particular reason why this is a likely optimisation.
> 
> Ben.
> 

But this part of the code is only being run on on 64 bit machines.
Updates of basic types for the CPU are atomic, lots of other code
already assumes this.
Take off your tin hat, this is excessive paranoia.

-- 

^ permalink raw reply

* Re: [PATCH 1/2] net: Enable 64-bit net device statistics on 32-bit architectures
From: Ben Hutchings @ 2010-06-04 20:47 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, Arnd Bergmann, netdev, linux-net-drivers
In-Reply-To: <20100604133930.34e2d53b@nehalam>

On Fri, 2010-06-04 at 13:39 -0700, Stephen Hemminger wrote:
> On Fri, 04 Jun 2010 19:15:18 +0100
> Ben Hutchings <bhutchings@solarflare.com> wrote:
> 
> > On Fri, 2010-06-04 at 10:28 -0700, Stephen Hemminger wrote:
> > > On Thu, 03 Jun 2010 20:11:38 +0100
> > > Ben Hutchings <bhutchings@solarflare.com> wrote:
> > > 
> > > > static inline u64 rtnl_link_stats64_read(const u64 *field)
> > > > {
> > > > 	return ACCESS_ONCE(*field);
> > > > }
> > > > static inline u32 rtnl_link_stats64_read32(const u64 *field)
> > > > {
> > > > 	return ACCESS_ONCE(*field);
> > > > }
> > > 
> > > Do we really care if compiler reorders access. I think not.
> > > There was no order guarantee in the past.
> > 
> > Since these reads are potentially racing with writes, we want to ensure
> > that they are atomic.  Without the volatile-qualification, the compiler
> > can legitimately split or repeat the reads, though I don't see any
> > particular reason why this is a likely optimisation.
> > 
> > Ben.
> > 
> 
> But this part of the code is only being run on on 64 bit machines.
> Updates of basic types for the CPU are atomic, lots of other code
> already assumes this.
> Take off your tin hat, this is excessive paranoia.

I like my tin hat, I'm sure I saw more oopses before I put it on. ;-)

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH V2 1/2] Export firmware assigned labels of network devices to sysfs
From: Greg KH @ 2010-06-04 20:49 UTC (permalink / raw)
  To: Narendra_K; +Cc: netdev, linux-hotplug, linux-pci, Matt_Domsch
In-Reply-To: <EDA0A4495861324DA2618B4C45DCB3EE612913@blrx3m08.blr.amer.dell.com>

On Sat, Jun 05, 2010 at 02:01:35AM +0530, Narendra_K@Dell.com wrote:
> 
> 
> With regards,

Um, what?

Why are you resending this over and over with no additional content
added?

If you have a new patch, send it.  Don't bury it at the bottom of
another message, and then linewrap the thing...

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH 2/2] pktgen: receive packets and process incoming rate
From: Eric Dumazet @ 2010-06-02 13:00 UTC (permalink / raw)
  To: Daniel Turull; +Cc: netdev, robert, jens.laas
In-Reply-To: <4C06453B.1080801@gmail.com>

Le mercredi 02 juin 2010 à 13:49 +0200, Daniel Turull a écrit :
> This patch adds receiver part to pktgen taking advantages of SMP systems
> with multiple rx queues:
> - Creation of new proc file  /proc/net/pktgen/pgrx to control and display the receiver.
> - It uses PER-CPU variable to store the results per each CPU.
> - Results displayed per CPU and aggregated.
> - The packet handler is add in the protocols handlers (dev_Add_pack())
> - Available statistics: packets and bytes received, work time and rate
> - Only process pktgen packets
> - It is possible to select the incoming interface 
> - Documentation updated with the new commands to control the receiver part.
> 

Interesting, but does it belongs to pktgen ?

other comments included :)

> Signed-off-by: Daniel Turull <daniel.turull@gmail.com>
> 


>  
> +/*Recevier parameters per cpu*/
> +struct pktgen_rx {
> +	u64 rx_packets;		/*packets arrived*/

unsigned long rx_packets ?

> +	u64 rx_bytes;		/*bytes arrived*/
> +
> +	ktime_t start_time;	/*first time stamp of a packet*/
> +	ktime_t last_time;	/*last packet arrival */
> +};


> +int pktgen_rcv_basic(struct sk_buff *skb, struct net_device *dev,
> +			 struct packet_type *pt, struct net_device *orig_dev)
> +{
> +	/* Check magic*/
> +	struct iphdr *iph = ip_hdr(skb);
> +	struct pktgen_hdr *pgh;
> +	void *vaddr;

Following code is ... well ... interesting... But ...

1) Is it IPV6 compatable ? pktgen can be ipv6 or ipv4
2) Is it resistant to malicious packets ? (very small ones)
3) No checksum ?

I think you should use standard mechanisms... (pskb_may_pull(), ...)
Take a look at __netpoll_rx() for an example.

> +	if (skb_is_nonlinear(skb)) {
> +		vaddr = kmap_skb_frag(&skb_shinfo(skb)->frags[0]);
> +		pgh = (struct pktgen_hdr *)
> +			(vaddr+skb_shinfo(skb)->frags[0].page_offset);
> +	} else {
> +		pgh = (struct pktgen_hdr *)(((char *)(iph)) + 28);
> +	}
> +
> +	if (unlikely(pgh->pgh_magic != PKTGEN_MAGIC_NET))
> +		goto end;
> +
> +	if (unlikely(!__get_cpu_var(pktgen_rx_data).rx_packets))
> +		__get_cpu_var(pktgen_rx_data).start_time = ktime_now();
> +
> +	__get_cpu_var(pktgen_rx_data).last_time = ktime_now();
> +

Its a bit suboptimal to use __get_cpu_var several time. Take a look at
disassembly code :)

Use a pointer instead


> +	/* Update counter of packets*/
> +	__get_cpu_var(pktgen_rx_data).rx_packets++;
> +	__get_cpu_var(pktgen_rx_data).rx_bytes += skb->len+14;

This +14 seems suspect (what about vlan tags ?)
Use ETH_HLEN instead at a very minimum?

> +end:
> +	if (skb_is_nonlinear(skb))
> +		kunmap_skb_frag(vaddr);

Should not recognised packets be allowed to flight in other parts of
kernel stack ? This way, we could use ssh to remotely control this
pktgen session ;)

> +	kfree_skb(skb);
> +	return 0;
> +}
> +


^ permalink raw reply

* Re: [PATCH V2 1/2] Export firmware assigned labels of network devices to sysfs
From: Greg KH @ 2010-06-04 21:01 UTC (permalink / raw)
  To: Narendra_K; +Cc: netdev, linux-hotplug, linux-pci, Matt_Domsch
In-Reply-To: <20100604204955.GA20329@kroah.com>

On Fri, Jun 04, 2010 at 01:49:55PM -0700, Greg KH wrote:
> On Sat, Jun 05, 2010 at 02:01:35AM +0530, Narendra_K@Dell.com wrote:
> > 
> > 
> > With regards,
> 
> Um, what?
> 
> Why are you resending this over and over with no additional content
> added?

And you have an out-of-office autoreply.

I give up...

^ permalink raw reply

* Re: [PATCH V3 1/2] Export firmware assigned labels of network devices to sysfs
From: Narendra K @ 2010-06-04 21:07 UTC (permalink / raw)
  To: greg
  Cc: netdev, linux-hotplug, linux-pci, matt_domsch, jordan_hargrave,
	charles_rose
In-Reply-To: <EDA0A4495861324DA2618B4C45DCB3EE612918@blrx3m08.blr.amer.dell.com>

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Narendra K
> Sent: Friday, June 04, 2010 2:37 AM
> To: greg@kroah.com
> Cc: netdev@vger.kernel.org; linux-hotplug@vger.kernel.org; linux-pci@vger.kernel.org; Domsch, Matt
> Subject: Re: [PATCH V2 1/2] Export firmware assigned labels of network devices to sysfs
> 
> In addition to these changes there are a coulple of changes i have done -
> 
> 1.Removed the check for network devices and evaulate _DSM for any pci device
> that has _DSM defined in adherence to the spec.
> 
> 2.Renamed the functions pci_create,remove-acpi_attr_files to 
> pci_create,remove_firmware_label_files.
> 
> 3.Added checks for conditional compilation of if CONFIG_ACPI || CONFIG_DMI
> 

V2 -> V3: I had missed fixing warnings in the patch. Please find the V3
of the patch with the warnings fixed -


From: Narendra K <narendra_k@dell.com>
Subject: [PATCH V3 1/2] Export firmware assigned labels of pci devices to sysfs

This patch exports the firmware assigned labels of pci devices to
sysfs which could be used by user space.

Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
---
 drivers/firmware/dmi_scan.c |   24 ++++
 drivers/pci/Makefile        |    2 +-
 drivers/pci/pci-label.c     |  274 +++++++++++++++++++++++++++++++++++++++++++
 drivers/pci/pci-sysfs.c     |    5 +
 drivers/pci/pci.h           |    2 +
 include/linux/dmi.h         |    9 ++
 6 files changed, 315 insertions(+), 1 deletions(-)
 create mode 100644 drivers/pci/pci-label.c

diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
index d464672..7d8439b 100644
--- a/drivers/firmware/dmi_scan.c
+++ b/drivers/firmware/dmi_scan.c
@@ -277,6 +277,28 @@ static void __init dmi_save_ipmi_device(const struct dmi_header *dm)
 	list_add_tail(&dev->list, &dmi_devices);
 }
 
+static void __init dmi_save_devslot(int id, int seg, int bus, int devfn, const char *name)
+{
+	struct dmi_devslot *slot;
+
+	slot = dmi_alloc(sizeof(*slot) + strlen(name) + 1);
+	if (!slot) {
+		printk(KERN_ERR "dmi_save_devslot: out of memory.\n");
+		return;
+	}	
+	slot->id = id;
+	slot->seg = seg;
+	slot->bus = bus;
+	slot->devfn = devfn;
+
+	strcpy((char *)&slot[1], name);
+	slot->dev.type = DMI_DEV_TYPE_DEVSLOT;
+	slot->dev.name = (char *)&slot[1];
+	slot->dev.device_data = slot;
+
+	list_add(&slot->dev.list, &dmi_devices);
+}
+
 static void __init dmi_save_extended_devices(const struct dmi_header *dm)
 {
 	const u8 *d = (u8*) dm + 5;
@@ -285,6 +307,7 @@ static void __init dmi_save_extended_devices(const struct dmi_header *dm)
 	if ((*d & 0x80) == 0)
 		return;
 
+	dmi_save_devslot(-1, *(u16 *)(d+2), *(d+4), *(d+5), dmi_string_nosave(dm, *(d-1)));
 	dmi_save_one_device(*d & 0x7f, dmi_string_nosave(dm, *(d - 1)));
 }
 
@@ -333,6 +356,7 @@ static void __init dmi_decode(const struct dmi_header *dm, void *dummy)
 		break;
 	case 41:	/* Onboard Devices Extended Information */
 		dmi_save_extended_devices(dm);
+		break;
 	}
 }
 
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 0b51857..69c503a 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -4,7 +4,7 @@
 
 obj-y		+= access.o bus.o probe.o remove.o pci.o \
 			pci-driver.o search.o pci-sysfs.o rom.o setup-res.o \
-			irq.o vpd.o
+			irq.o vpd.o pci-label.o
 obj-$(CONFIG_PROC_FS) += proc.o
 obj-$(CONFIG_SYSFS) += slot.o
 
diff --git a/drivers/pci/pci-label.c b/drivers/pci/pci-label.c
new file mode 100644
index 0000000..d6e0e9b
--- /dev/null
+++ b/drivers/pci/pci-label.c
@@ -0,0 +1,274 @@
+/*
+ * Purpose: Export the firmware label associated with a pci network interface
+ * device to sysfs
+ * Copyright (C) 2010 Dell Inc.
+ * by Narendra K <Narendra_K@dell.com>, Jordan Hargrave <Jordan_Hargrave@dell.com>
+ *
+ * This code checks if the pci network device has a related ACPI _DSM. If
+ * available, the code calls the _DSM to retrieve the index and string and
+ * exports them to sysfs. If the ACPI _DSM is not available, it falls back on
+ * SMBIOS. SMBIOS defines type 41 for onboard pci devices. This code retrieves
+ * strings associated with the type 41 and exports it to sysfs.
+ *
+ * Please see http://linux.dell.com/wiki/index.php/Oss/libnetdevname for more
+ * information.
+ */
+
+#include <linux/dmi.h>
+#include <linux/sysfs.h>
+#include <linux/pci.h>
+#include <linux/pci_ids.h>
+#include <linux/module.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <acpi/acpi_drivers.h>
+#include <acpi/acpi_bus.h>
+#include "pci.h"
+
+#define	DEVICE_LABEL_DSM	0x07
+
+#if defined CONFIG_DMI
+
+struct smbios_attribute {
+	struct attribute attr;
+	ssize_t (*show) (struct device *dev, struct device_attribute *, char *buf);
+	ssize_t (*test) (struct device *dev, char *buf);
+};
+
+static ssize_t
+smbiosname_string_exists(struct device *dev, char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+  	const struct dmi_device *dmi;
+  	struct dmi_devslot *dslot;
+  	int bus;
+  	int devfn;
+
+  	bus = pdev->bus->number;
+  	devfn = pdev->devfn;
+
+  	dmi = NULL;
+  	while ((dmi = dmi_find_device(DMI_DEV_TYPE_DEVSLOT, NULL, dmi)) != NULL) {
+    		dslot = dmi->device_data;
+    		if (dslot && dslot->bus == bus && dslot->devfn == devfn) {
+			if (buf)
+      				return scnprintf(buf, PAGE_SIZE, "%s\n", dmi->name);
+			return strlen(dmi->name);
+		}
+	}
+	
+	return 0;
+}
+
+static ssize_t
+smbiosname_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return smbiosname_string_exists(dev, buf);
+}
+
+struct smbios_attribute smbios_attr_label = {
+	.attr = {.name = "label", .mode = 0444, .owner = THIS_MODULE},
+	.show = smbiosname_show,
+	.test = smbiosname_string_exists,
+};
+
+static int 
+pci_create_smbiosname_file(struct pci_dev *pdev)
+{
+	if (smbios_attr_label.test && smbios_attr_label.test(&pdev->dev, NULL)) {
+		if (sysfs_create_file(&pdev->dev.kobj, &smbios_attr_label.attr))
+			return -1;
+		return 0;
+	}
+	return -1;	
+}
+
+static int 
+pci_remove_smbiosname_file(struct pci_dev *pdev)
+{
+	if (smbios_attr_label.test && smbios_attr_label.test(&pdev->dev, NULL)) {
+		sysfs_remove_file(&pdev->dev.kobj, &smbios_attr_label.attr);
+		return 0;
+	}
+	return -1;
+}
+#else
+static inline int 
+pci_create_smbiosname_file(struct pci_dev *pdev)
+{
+	return -1;
+}
+
+static inline int 
+pci_remove_smbiosname_file(struct pci_dev *pdev)
+{
+	return -1;
+}
+#endif
+
+#if defined CONFIG_ACPI
+
+static const char device_label_dsm_uuid[] = {
+	0xD0, 0x37, 0xC9, 0xE5, 0x53, 0x35, 0x7A, 0x4D,
+	0x91, 0x17, 0xEA, 0x4D, 0x19, 0xC3, 0x43, 0x4D
+};
+
+struct acpi_attribute {
+	struct attribute attr;
+	ssize_t (*show) (struct device *dev, struct device_attribute *attr, char *buf);
+	ssize_t (*test) (struct device *dev, char *buf, char *attribute);
+};
+
+static int
+dsm_get_label(acpi_handle handle, int func, 
+              struct acpi_buffer *output, 
+              char *buf, char *attribute)
+{
+	struct acpi_object_list input;
+	union acpi_object params[4];
+	union acpi_object *obj; 
+	int len = 0;
+
+	int err;
+
+	input.count = 4;
+	input.pointer = params;
+	params[0].type = ACPI_TYPE_BUFFER;
+	params[0].buffer.length = sizeof(device_label_dsm_uuid);
+	params[0].buffer.pointer = (char *)device_label_dsm_uuid;
+	params[1].type = ACPI_TYPE_INTEGER;
+	params[1].integer.value = 0x02;
+	params[2].type = ACPI_TYPE_INTEGER;
+	params[2].integer.value = func;
+	params[3].type = ACPI_TYPE_PACKAGE;
+	params[3].package.count = 0;
+	params[3].package.elements = NULL;
+
+	err = acpi_evaluate_object(handle, "_DSM", &input, output);
+	if (err) 
+		return -1;
+	
+	
+	obj = (union acpi_object *)output->pointer;
+
+	switch (obj->type) {
+	case ACPI_TYPE_PACKAGE:
+		if (obj->package.count != 2) 
+			break;
+		len = obj->package.elements[0].integer.value;
+		if (buf) {
+			if (!strncmp(attribute, "index", strlen(attribute)))
+				scnprintf(buf, PAGE_SIZE, "%llu\n", 
+				obj->package.elements[0].integer.value);
+			else
+				scnprintf(buf, PAGE_SIZE, "%s\n", 
+				obj->package.elements[1].string.pointer);
+			kfree(output->pointer);
+			return strlen(buf);
+		}
+		kfree(output->pointer);
+		return len;
+	break;
+	default:
+		kfree(output->pointer);
+	}
+	return -1;
+}	
+
+static ssize_t
+acpi_index_string_exist(struct device *dev, char *buf, char *attribute)
+{
+	struct acpi_buffer output = {ACPI_ALLOCATE_BUFFER, NULL};
+	acpi_handle handle;
+	int length;
+	
+	handle = DEVICE_ACPI_HANDLE(dev);
+
+	if (!handle) 
+		return -1;
+
+	if ((length = dsm_get_label(handle, DEVICE_LABEL_DSM, 
+				    &output, buf, attribute)) < 0)
+		return -1;
+
+	return length;
+}
+	
+static ssize_t
+acpilabel_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return acpi_index_string_exist(dev, buf, "label");
+}
+
+static ssize_t
+acpiindex_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return acpi_index_string_exist(dev, buf, "index");
+}
+
+struct acpi_attribute acpi_attr_label = {
+	.attr = {.name = "label", .mode = 0444, .owner = THIS_MODULE},
+	.show = acpilabel_show,
+	.test = acpi_index_string_exist,
+};
+
+struct acpi_attribute acpi_attr_index = {
+	.attr = {.name = "index", .mode = 0444, .owner = THIS_MODULE},
+	.show = acpiindex_show,
+	.test = acpi_index_string_exist,
+};
+
+static int
+pci_create_acpi_index_label_files(struct pci_dev *pdev)
+{
+	if (acpi_attr_label.test && acpi_attr_label.test(&pdev->dev, NULL, NULL) > 0) {
+		if (sysfs_create_file(&pdev->dev.kobj, &acpi_attr_label.attr)) 
+			return -1;
+		if (sysfs_create_file(&pdev->dev.kobj, &acpi_attr_index.attr)) 
+			return -1;
+		return 0;
+	}
+	return -1;
+}
+
+static int
+pci_remove_acpi_index_label_files(struct pci_dev *pdev)
+{
+	if (acpi_attr_label.test && acpi_attr_label.test(&pdev->dev, NULL, NULL) > 0) {
+		sysfs_remove_file(&pdev->dev.kobj, &acpi_attr_label.attr);
+		sysfs_remove_file(&pdev->dev.kobj, &acpi_attr_index.attr);
+		return 0;
+	}
+	return -1;
+}
+#else
+static inline int
+pci_create_acpi_index_label_files(struct pci_dev *pdev)
+{
+	return -1;
+}
+
+static inline int
+pci_remove_acpi_index_label_files(struct pci_dev *pdev)
+{
+	return -1;
+}
+#endif
+
+int pci_create_firmware_label_files(struct pci_dev *pdev)
+{
+	if (!pci_create_acpi_index_label_files(pdev))
+		return 0;
+	if (!pci_create_smbiosname_file(pdev))
+		return 0;
+	return -ENODEV;
+}
+
+int pci_remove_firmware_label_files(struct pci_dev *pdev)
+{
+	if (!pci_remove_acpi_index_label_files(pdev))
+		return 0;
+	if (!pci_remove_smbiosname_file(pdev))
+		return 0;
+	return -ENODEV;
+}
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index fad9398..4ed517f 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1073,6 +1073,8 @@ int __must_check pci_create_sysfs_dev_files (struct pci_dev *pdev)
 	if (retval)
 		goto err_vga_file;
 
+	pci_create_firmware_label_files(pdev);
+
 	return 0;
 
 err_vga_file:
@@ -1140,6 +1142,9 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
 		sysfs_remove_bin_file(&pdev->dev.kobj, pdev->rom_attr);
 		kfree(pdev->rom_attr);
 	}
+
+	pci_remove_firmware_label_files(pdev);
+
 }
 
 static int __init pci_sysfs_init(void)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4eb10f4..f223283 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -11,6 +11,8 @@
 extern int pci_uevent(struct device *dev, struct kobj_uevent_env *env);
 extern int pci_create_sysfs_dev_files(struct pci_dev *pdev);
 extern void pci_remove_sysfs_dev_files(struct pci_dev *pdev);
+extern int pci_create_firmware_label_files(struct pci_dev *pdev);
+extern int pci_remove_firmware_label_files(struct pci_dev *pdev);
 extern void pci_cleanup_rom(struct pci_dev *dev);
 #ifdef HAVE_PCI_MMAP
 extern int pci_mmap_fits(struct pci_dev *pdev, int resno,
diff --git a/include/linux/dmi.h b/include/linux/dmi.h
index a8a3e1a..cc57c3a 100644
--- a/include/linux/dmi.h
+++ b/include/linux/dmi.h
@@ -20,6 +20,7 @@ enum dmi_device_type {
 	DMI_DEV_TYPE_SAS,
 	DMI_DEV_TYPE_IPMI = -1,
 	DMI_DEV_TYPE_OEM_STRING = -2,
+	DMI_DEV_TYPE_DEVSLOT = -3,
 };
 
 struct dmi_header {
@@ -37,6 +38,14 @@ struct dmi_device {
 
 #ifdef CONFIG_DMI
 
+struct dmi_devslot {
+	struct dmi_device dev;
+	int id;
+	int seg;
+	int bus;
+	int devfn;
+};
+
 extern int dmi_check_system(const struct dmi_system_id *list);
 const struct dmi_system_id *dmi_first_match(const struct dmi_system_id *list);
 extern const char * dmi_get_system_info(int field);
-- 
1.6.5.2

With regards,
Narendra K

^ permalink raw reply related

* RE: [PATCH V2 1/2] Export firmware assigned labels of network devices to sysfs
From: Narendra_K @ 2010-06-04 21:15 UTC (permalink / raw)
  To: greg; +Cc: netdev, linux-hotplug, linux-pci, Matt_Domsch
In-Reply-To: <20100604204955.GA20329@kroah.com>

> 
> Um, what?
> 
> Why are you resending this over and over with no additional content
> added?
> 
> If you have a new patch, send it.  Don't bury it at the bottom of
> another message, and then linewrap the thing...
> 

Greg, Sorry for the inconvenience caused. I have sent the V3 patch with
changes mentioned.

With regards,
Narendra K

^ permalink raw reply

* Ethernet drivers wiki page
From: Luis R. Rodriguez @ 2010-06-04 21:52 UTC (permalink / raw)
  To: netdev; +Cc: H. Peter Anvin, Stephen Hemminger, Johannes Berg

Where can I start adding documentation for our Ethernet drivers
upstream? I was hoping for a wiki but I don't think we have a netdev
one yet. Shall we create one or piggy back on something else? I'm
reviewing:

https://wiki.kernel.org/

FWIW, I want to see something like this:

http://wireless.kernel.org/en/users/Drivers

But for ethernet.

  Luis

^ permalink raw reply

* Re: Ethernet drivers wiki page
From: Ben Hutchings @ 2010-06-04 22:01 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: netdev, H. Peter Anvin, Stephen Hemminger, Johannes Berg
In-Reply-To: <AANLkTinSjJsqNvEhJZIE6cJxEkBtg-DO2dqZWMczUy2i@mail.gmail.com>

On Fri, 2010-06-04 at 14:52 -0700, Luis R. Rodriguez wrote:
> Where can I start adding documentation for our Ethernet drivers
> upstream? I was hoping for a wiki but I don't think we have a netdev
> one yet. Shall we create one or piggy back on something else? I'm
> reviewing:
> 
> https://wiki.kernel.org/
> 
> FWIW, I want to see something like this:
> 
> http://wireless.kernel.org/en/users/Drivers
> 
> But for ethernet.

Wherever you put it, it should probably be linked from:

http://www.linuxfoundation.org/collaborate/workgroups/networking/

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: Ethernet drivers wiki page
From: Luis R. Rodriguez @ 2010-06-04 22:04 UTC (permalink / raw)
  To: netdev; +Cc: H. Peter Anvin, Stephen Hemminger, Johannes Berg, linux-kernel
In-Reply-To: <AANLkTinSjJsqNvEhJZIE6cJxEkBtg-DO2dqZWMczUy2i@mail.gmail.com>

On Fri, Jun 4, 2010 at 2:52 PM, Luis R. Rodriguez <mcgrof@gmail.com> wrote:
> Where can I start adding documentation for our Ethernet drivers
> upstream? I was hoping for a wiki but I don't think we have a netdev
> one yet. Shall we create one or piggy back on something else? I'm
> reviewing:
>
> https://wiki.kernel.org/
>
> FWIW, I want to see something like this:
>
> http://wireless.kernel.org/en/users/Drivers
>
> But for ethernet.

Since I've stuffed a few ethernet drivers on compat-wireless I might
as well start carrying more. So now we'd have 802.11, Bluetooth and
Ethernet all backported using the same framework, automatically, down
to at least the oldest stable kernel supported listed on kernel.org
and using all a generic compat module.

I want to document all this crap.

  Luis

^ permalink raw reply

* [PATCHv2 1/2] net: Enable 64-bit net device statistics on 32-bit architectures
From: Ben Hutchings @ 2010-06-04 22:04 UTC (permalink / raw)
  To: David Miller; +Cc: Stephen Hemminger, Arnd Bergmann, netdev, linux-net-drivers

Use struct rtnl_link_stats64 as the statistics structure.

On 32-bit architectures, insert 32 bits of padding after/before each
field of struct net_device_stats to make its layout compatible with
struct rtnl_link_stats64.  Add an anonymous union in net_device; move
stats into the union and add struct rtnl_link_stats64 stats64.

Add net_device_ops::ndo_get_stats64, implementations of which will
return a pointer to struct rtnl_link_stats64.  Drivers that implement
this operation must not update the structure asynchronously.

Change dev_get_stats() to call ndo_get_stats64 if available, and to
return a pointer to struct rtnl_link_stats64.  Change callers of
dev_get_stats() accordingly.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
This time without the horrible accessor macros.

Ben.

 include/linux/if_link.h   |    3 +-
 include/linux/netdevice.h |   91 ++++++++++++++++++++++++++------------------
 net/8021q/vlanproc.c      |   13 +++---
 net/core/dev.c            |   19 +++++----
 net/core/net-sysfs.c      |   12 +++---
 net/core/rtnetlink.c      |    6 +-
 6 files changed, 83 insertions(+), 61 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 85c812d..7fcad2e 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -4,7 +4,7 @@
 #include <linux/types.h>
 #include <linux/netlink.h>
 
-/* The struct should be in sync with struct net_device_stats */
+/* This struct should be in sync with struct rtnl_link_stats64 */
 struct rtnl_link_stats {
 	__u32	rx_packets;		/* total packets received	*/
 	__u32	tx_packets;		/* total packets transmitted	*/
@@ -37,6 +37,7 @@ struct rtnl_link_stats {
 	__u32	tx_compressed;
 };
 
+/* The main device statistics structure */
 struct rtnl_link_stats64 {
 	__u64	rx_packets;		/* total packets received	*/
 	__u64	tx_packets;		/* total packets transmitted	*/
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a249161..dceacc2 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -159,45 +159,49 @@ static inline bool dev_xmit_complete(int rc)
 #define MAX_HEADER (LL_MAX_HEADER + 48)
 #endif
 
-#endif  /*  __KERNEL__  */
-
 /*
- *	Network device statistics. Akin to the 2.0 ether stats but
- *	with byte counters.
+ *	Old network device statistics. Fields are native words
+ *	(unsigned long) so they can be read and written atomically.
+ *	Each field is padded to 64 bits for compatibility with
+ *	rtnl_link_stats64.
  */
 
+#if BITS_PER_LONG == 64
+#define NET_DEVICE_STATS_DEFINE(name)	u64 name
+#elif defined(__LITTLE_ENDIAN)
+#define NET_DEVICE_STATS_DEFINE(name)	u32 name, pad_ ## name
+#else
+#define NET_DEVICE_STATS_DEFINE(name)	u32 pad_ ## name, name
+#endif
+
 struct net_device_stats {
-	unsigned long	rx_packets;		/* total packets received	*/
-	unsigned long	tx_packets;		/* total packets transmitted	*/
-	unsigned long	rx_bytes;		/* total bytes received 	*/
-	unsigned long	tx_bytes;		/* total bytes transmitted	*/
-	unsigned long	rx_errors;		/* bad packets received		*/
-	unsigned long	tx_errors;		/* packet transmit problems	*/
-	unsigned long	rx_dropped;		/* no space in linux buffers	*/
-	unsigned long	tx_dropped;		/* no space available in linux	*/
-	unsigned long	multicast;		/* multicast packets received	*/
-	unsigned long	collisions;
-
-	/* detailed rx_errors: */
-	unsigned long	rx_length_errors;
-	unsigned long	rx_over_errors;		/* receiver ring buff overflow	*/
-	unsigned long	rx_crc_errors;		/* recved pkt with crc error	*/
-	unsigned long	rx_frame_errors;	/* recv'd frame alignment error */
-	unsigned long	rx_fifo_errors;		/* recv'r fifo overrun		*/
-	unsigned long	rx_missed_errors;	/* receiver missed packet	*/
-
-	/* detailed tx_errors */
-	unsigned long	tx_aborted_errors;
-	unsigned long	tx_carrier_errors;
-	unsigned long	tx_fifo_errors;
-	unsigned long	tx_heartbeat_errors;
-	unsigned long	tx_window_errors;
-	
-	/* for cslip etc */
-	unsigned long	rx_compressed;
-	unsigned long	tx_compressed;
+	NET_DEVICE_STATS_DEFINE(rx_packets);
+	NET_DEVICE_STATS_DEFINE(tx_packets);
+	NET_DEVICE_STATS_DEFINE(rx_bytes);
+	NET_DEVICE_STATS_DEFINE(tx_bytes);
+	NET_DEVICE_STATS_DEFINE(rx_errors);
+	NET_DEVICE_STATS_DEFINE(tx_errors);
+	NET_DEVICE_STATS_DEFINE(rx_dropped);
+	NET_DEVICE_STATS_DEFINE(tx_dropped);
+	NET_DEVICE_STATS_DEFINE(multicast);
+	NET_DEVICE_STATS_DEFINE(collisions);
+	NET_DEVICE_STATS_DEFINE(rx_length_errors);
+	NET_DEVICE_STATS_DEFINE(rx_over_errors);
+	NET_DEVICE_STATS_DEFINE(rx_crc_errors);
+	NET_DEVICE_STATS_DEFINE(rx_frame_errors);
+	NET_DEVICE_STATS_DEFINE(rx_fifo_errors);
+	NET_DEVICE_STATS_DEFINE(rx_missed_errors);
+	NET_DEVICE_STATS_DEFINE(tx_aborted_errors);
+	NET_DEVICE_STATS_DEFINE(tx_carrier_errors);
+	NET_DEVICE_STATS_DEFINE(tx_fifo_errors);
+	NET_DEVICE_STATS_DEFINE(tx_heartbeat_errors);
+	NET_DEVICE_STATS_DEFINE(tx_window_errors);
+	NET_DEVICE_STATS_DEFINE(rx_compressed);
+	NET_DEVICE_STATS_DEFINE(tx_compressed);
 };
 
+#endif  /*  __KERNEL__  */
+
 
 /* Media selection options. */
 enum {
@@ -660,10 +664,19 @@ struct netdev_rx_queue {
  *	Callback uses when the transmitter has not made any progress
  *	for dev->watchdog ticks.
  *
+ * struct rtnl_link_stats64* (*ndo_get_stats64)(struct net_device *dev);
  * struct net_device_stats* (*ndo_get_stats)(struct net_device *dev);
  *	Called when a user wants to get the network device usage
- *	statistics. If not defined, the counters in dev->stats will
- *	be used.
+ *	statistics. Drivers must do one of the following:
+ *	1. Define @ndo_get_stats64 to update a rtnl_link_stats64 structure
+ *	   (which should normally be dev->stats64) and return a ponter to
+ *	   it. The structure must not be changed asynchronously.
+ *	2. Define @ndo_get_stats to update a net_device_stats64 structure
+ *	   (which should normally be dev->stats) and return a pointer to
+ *	   it. The structure may be changed asynchronously only if each
+ *	   field is written atomically.
+ *	3. Update dev->stats asynchronously and atomically, and define
+ *	   neither operation.
  *
  * void (*ndo_vlan_rx_register)(struct net_device *dev, struct vlan_group *grp);
  *	If device support VLAN receive accleration
@@ -718,6 +731,7 @@ struct net_device_ops {
 						   struct neigh_parms *);
 	void			(*ndo_tx_timeout) (struct net_device *dev);
 
+	struct rtnl_link_stats64* (*ndo_get_stats64)(struct net_device *dev);
 	struct net_device_stats* (*ndo_get_stats)(struct net_device *dev);
 
 	void			(*ndo_vlan_rx_register)(struct net_device *dev,
@@ -867,7 +881,10 @@ struct net_device {
 	int			ifindex;
 	int			iflink;
 
-	struct net_device_stats	stats;
+	union {
+		struct rtnl_link_stats64 stats64;
+		struct net_device_stats stats;
+	};
 
 #ifdef CONFIG_WIRELESS_EXT
 	/* List of functions to handle Wireless Extensions (instead of ioctl).
@@ -2118,7 +2135,7 @@ extern void		netdev_features_change(struct net_device *dev);
 /* Load a device via the kmod */
 extern void		dev_load(struct net *net, const char *name);
 extern void		dev_mcast_init(void);
-extern const struct net_device_stats *dev_get_stats(struct net_device *dev);
+extern const struct rtnl_link_stats64 *dev_get_stats(struct net_device *dev);
 extern void		dev_txq_stats_fold(const struct net_device *dev, struct net_device_stats *stats);
 
 extern int		netdev_max_backlog;
diff --git a/net/8021q/vlanproc.c b/net/8021q/vlanproc.c
index afead35..df56f5c 100644
--- a/net/8021q/vlanproc.c
+++ b/net/8021q/vlanproc.c
@@ -278,8 +278,9 @@ static int vlandev_seq_show(struct seq_file *seq, void *offset)
 {
 	struct net_device *vlandev = (struct net_device *) seq->private;
 	const struct vlan_dev_info *dev_info = vlan_dev_info(vlandev);
-	const struct net_device_stats *stats;
+	const struct rtnl_link_stats64 *stats;
 	static const char fmt[] = "%30s %12lu\n";
+	static const char fmt64[] = "%30s %12llu\n";
 	int i;
 
 	if (!is_vlan_dev(vlandev))
@@ -291,12 +292,12 @@ static int vlandev_seq_show(struct seq_file *seq, void *offset)
 		   vlandev->name, dev_info->vlan_id,
 		   (int)(dev_info->flags & 1), vlandev->priv_flags);
 
-	seq_printf(seq, fmt, "total frames received", stats->rx_packets);
-	seq_printf(seq, fmt, "total bytes received", stats->rx_bytes);
-	seq_printf(seq, fmt, "Broadcast/Multicast Rcvd", stats->multicast);
+	seq_printf(seq, fmt64, "total frames received", stats->rx_packets);
+	seq_printf(seq, fmt64, "total bytes received", stats->rx_bytes);
+	seq_printf(seq, fmt64, "Broadcast/Multicast Rcvd", stats->multicast);
 	seq_puts(seq, "\n");
-	seq_printf(seq, fmt, "total frames transmitted", stats->tx_packets);
-	seq_printf(seq, fmt, "total bytes transmitted", stats->tx_bytes);
+	seq_printf(seq, fmt64, "total frames transmitted", stats->tx_packets);
+	seq_printf(seq, fmt64, "total bytes transmitted", stats->tx_bytes);
 	seq_printf(seq, fmt, "total headroom inc",
 		   dev_info->cnt_inc_headroom_on_tx);
 	seq_printf(seq, fmt, "total encap on xmit",
diff --git a/net/core/dev.c b/net/core/dev.c
index 983a3c1..71a6fd8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3686,10 +3686,10 @@ void dev_seq_stop(struct seq_file *seq, void *v)
 
 static void dev_seq_printf_stats(struct seq_file *seq, struct net_device *dev)
 {
-	const struct net_device_stats *stats = dev_get_stats(dev);
+	const struct rtnl_link_stats64 *stats = dev_get_stats(dev);
 
-	seq_printf(seq, "%6s: %7lu %7lu %4lu %4lu %4lu %5lu %10lu %9lu "
-		   "%8lu %7lu %4lu %4lu %4lu %5lu %7lu %10lu\n",
+	seq_printf(seq, "%6s: %7llu %7llu %4llu %4llu %4llu %5llu %10llu %9llu "
+		   "%8llu %7llu %4llu %4llu %4llu %5llu %7llu %10llu\n",
 		   dev->name, stats->rx_bytes, stats->rx_packets,
 		   stats->rx_errors,
 		   stats->rx_dropped + stats->rx_missed_errors,
@@ -5266,18 +5266,21 @@ EXPORT_SYMBOL(dev_txq_stats_fold);
  *	@dev: device to get statistics from
  *
  *	Get network statistics from device. The device driver may provide
- *	its own method by setting dev->netdev_ops->get_stats; otherwise
- *	the internal statistics structure is used.
+ *	its own method by setting dev->netdev_ops->get_stats64 or
+ *	dev->netdev_ops->get_stats; otherwise the internal statistics
+ *	structure is used.
  */
-const struct net_device_stats *dev_get_stats(struct net_device *dev)
+const struct rtnl_link_stats64 *dev_get_stats(struct net_device *dev)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
 
+	if (ops->ndo_get_stats64)
+		return ops->ndo_get_stats64(dev);
 	if (ops->ndo_get_stats)
-		return ops->ndo_get_stats(dev);
+		return (struct rtnl_link_stats64 *)ops->ndo_get_stats(dev);
 
 	dev_txq_stats_fold(dev, &dev->stats);
-	return &dev->stats;
+	return &dev->stats64;
 }
 EXPORT_SYMBOL(dev_get_stats);
 
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 99e7052..ea3bb4c 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -29,6 +29,7 @@ static const char fmt_hex[] = "%#x\n";
 static const char fmt_long_hex[] = "%#lx\n";
 static const char fmt_dec[] = "%d\n";
 static const char fmt_ulong[] = "%lu\n";
+static const char fmt_u64[] = "%llu\n";
 
 static inline int dev_isalive(const struct net_device *dev)
 {
@@ -324,14 +325,13 @@ static ssize_t netstat_show(const struct device *d,
 	struct net_device *dev = to_net_dev(d);
 	ssize_t ret = -EINVAL;
 
-	WARN_ON(offset > sizeof(struct net_device_stats) ||
-			offset % sizeof(unsigned long) != 0);
+	WARN_ON(offset > sizeof(struct rtnl_link_stats64) ||
+			offset % sizeof(u64) != 0);
 
 	read_lock(&dev_base_lock);
 	if (dev_isalive(dev)) {
-		const struct net_device_stats *stats = dev_get_stats(dev);
-		ret = sprintf(buf, fmt_ulong,
-			      *(unsigned long *)(((u8 *) stats) + offset));
+		const struct rtnl_link_stats64 *stats = dev_get_stats(dev);
+		ret = sprintf(buf, fmt_u64, *(u64 *)(((u8 *) stats) + offset));
 	}
 	read_unlock(&dev_base_lock);
 	return ret;
@@ -343,7 +343,7 @@ static ssize_t show_##name(struct device *d,				\
 			   struct device_attribute *attr, char *buf) 	\
 {									\
 	return netstat_show(d, attr, buf,				\
-			    offsetof(struct net_device_stats, name));	\
+			    offsetof(struct rtnl_link_stats64, name));	\
 }									\
 static DEVICE_ATTR(name, S_IRUGO, show_##name, NULL)
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 1a2af24..e645778 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -579,7 +579,7 @@ static unsigned int rtnl_dev_combine_flags(const struct net_device *dev,
 }
 
 static void copy_rtnl_link_stats(struct rtnl_link_stats *a,
-				 const struct net_device_stats *b)
+				 const struct rtnl_link_stats64 *b)
 {
 	a->rx_packets = b->rx_packets;
 	a->tx_packets = b->tx_packets;
@@ -610,7 +610,7 @@ static void copy_rtnl_link_stats(struct rtnl_link_stats *a,
 	a->tx_compressed = b->tx_compressed;
 }
 
-static void copy_rtnl_link_stats64(void *v, const struct net_device_stats *b)
+static void copy_rtnl_link_stats64(void *v, const struct rtnl_link_stats64 *b)
 {
 	struct rtnl_link_stats64 a;
 
@@ -791,7 +791,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 {
 	struct ifinfomsg *ifm;
 	struct nlmsghdr *nlh;
-	const struct net_device_stats *stats;
+	const struct rtnl_link_stats64 *stats;
 	struct nlattr *attr;
 
 	nlh = nlmsg_put(skb, pid, seq, type, sizeof(*ifm), flags);
-- 
1.6.2.5


-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* [PATCHv2 2/2] sfc: Implement 64-bit net device statistics on all architectures
From: Ben Hutchings @ 2010-06-04 22:06 UTC (permalink / raw)
  To: David Miller; +Cc: Stephen Hemminger, Arnd Bergmann, netdev, linux-net-drivers
In-Reply-To: <1275689093.2095.36.camel@achroite.uk.solarflarecom.com>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
This is unchanged from v1.

Ben.

 drivers/net/sfc/efx.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sfc/efx.c b/drivers/net/sfc/efx.c
index 26b0cc2..8ad476a 100644
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -1492,11 +1492,11 @@ static int efx_net_stop(struct net_device *net_dev)
 }
 
 /* Context: process, dev_base_lock or RTNL held, non-blocking. */
-static struct net_device_stats *efx_net_stats(struct net_device *net_dev)
+static struct rtnl_link_stats64 *efx_net_stats(struct net_device *net_dev)
 {
 	struct efx_nic *efx = netdev_priv(net_dev);
 	struct efx_mac_stats *mac_stats = &efx->mac_stats;
-	struct net_device_stats *stats = &net_dev->stats;
+	struct rtnl_link_stats64 *stats = &net_dev->stats64;
 
 	spin_lock_bh(&efx->stats_lock);
 	efx->type->update_stats(efx);
@@ -1630,7 +1630,7 @@ static void efx_set_multicast_list(struct net_device *net_dev)
 static const struct net_device_ops efx_netdev_ops = {
 	.ndo_open		= efx_net_open,
 	.ndo_stop		= efx_net_stop,
-	.ndo_get_stats		= efx_net_stats,
+	.ndo_get_stats64	= efx_net_stats,
 	.ndo_tx_timeout		= efx_watchdog,
 	.ndo_start_xmit		= efx_hard_start_xmit,
 	.ndo_validate_addr	= eth_validate_addr,
-- 
1.6.2.5

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply related

* Re: Is CONFIG_IP_VS_IPV6 still/really dangerous?
From: Ferenc Wagner @ 2010-06-04 22:21 UTC (permalink / raw)
  To: lvs-users; +Cc: netdev, robert.gallagher, Julius Volz
In-Reply-To: <AANLkTikQPaHYuOOAiJtIft9NnOZ2S1LXTVMi_BLD8gau@mail.gmail.com>

Julius Volz <juliusv@google.com> writes:

> On Fri, Jun 4, 2010 at 3:56 PM, Ferenc Wagner <wferi@niif.hu> wrote:
>
>> In commit fab0de02fb0da83b90cec7fce4294747d86d5c6f CONFIG_IP_VS_IPV6 is
>> described as:
>>
>>    Add IPv6 support to IPVS. This is incomplete and might be dangerous.
>>
>> I agree its implementation is incomplete.  But I wonder if it's really
>> dangerous in the sense that generic distribution kernels shouldn't
>> enable it, because it can break unrelated (eg. IPv4 IPVS) functionality.
>>
>> What does that warning mean today?  Isn't it out of date?
>
> I wrote the IPv6 support back in the day, but never used it
> large-scale. Rob Gallagher from HEAnet was doing some bigger
> experiments with it, but I'm not sure how far it went. CCing him.
>
> There are probably some other people out there that have tested it
> extensively. Maybe try the lvs-users and lvs-devel mailing lists?

Sounds like a good idea!  So:

Dear lvs-users,

did you experience any breakage as a result of switching on
CONFIG_IP_VS_IPV6?  I mean apart from it having incomplete
functionality.  The gist of the question is whether this option
is suitable for generic distro kernels or not, cf. above.
-- 
Thanks for your time,
Feri.

^ permalink raw reply

* Re: Ethernet drivers wiki page
From: Stephen Hemminger @ 2010-06-04 22:23 UTC (permalink / raw)
  To: Luis R. Rodriguez; +Cc: netdev, H. Peter Anvin, Johannes Berg, linux-kernel
In-Reply-To: <AANLkTik1isTBolg4IDchOlmKqXOLjQoobZrJ-9bb8ZL5@mail.gmail.com>

On Fri, 4 Jun 2010 15:04:17 -0700
"Luis R. Rodriguez" <mcgrof@gmail.com> wrote:

> On Fri, Jun 4, 2010 at 2:52 PM, Luis R. Rodriguez <mcgrof@gmail.com> wrote:
> > Where can I start adding documentation for our Ethernet drivers
> > upstream? I was hoping for a wiki but I don't think we have a netdev
> > one yet. Shall we create one or piggy back on something else? I'm
> > reviewing:
> >
> > https://wiki.kernel.org/
> >
> > FWIW, I want to see something like this:
> >
> > http://wireless.kernel.org/en/users/Drivers
> >
> > But for ethernet.
> 
> Since I've stuffed a few ethernet drivers on compat-wireless I might
> as well start carrying more. So now we'd have 802.11, Bluetooth and
> Ethernet all backported using the same framework, automatically, down
> to at least the oldest stable kernel supported listed on kernel.org
> and using all a generic compat module.
> 
> I want to document all this crap.

Why not Documentation/networking in kernel source? There is already
documentation there though much of it is out of date.
This has the advantage of having change log and matching kernel version.

^ permalink raw reply

* Re: 200 millisecond timeouts in TCP
From: Ryousei Takano @ 2010-06-04 22:55 UTC (permalink / raw)
  To: Satoru SATOH; +Cc: Ivan Novick, netdev
In-Reply-To: <20100604150202.GA8268@localhost.localdomain>

Hi satoru,

On Sat, Jun 5, 2010 at 12:02 AM, Satoru SATOH <satoru.satoh@gmail.com> wrote:
> It's possible to tune the RTO min value (rto_min) per route since
> 2.6.23+.  (see also the manual of iproute2, ip(8) )
>
Yeah, I forgot it.  Per route configuration is a nice idea compared
with global sysctls.

Thanks!
Ryousei

> - satoru
>
> On Fri, Jun 04, 2010 at 03:58:57PM +0900, Ryousei Takano wrote:
>> On Fri, Jun 4, 2010 at 7:37 AM, Ivan Novick <novickivan@gmail.com> wrote:
>> > Hello,
>> >
>> > Using tcpdump and systemtap I am seeing that sometimes retransmission
>> > of data is sent after waiting 200 milliseconds.  However sometimes
>> > retransmissions happen quicker.
>> >
>> > Is there a specifc event that causes these 200 milisec delays to kick
>> > in?  Are those events identifiable in netstat -s output?
>> >
>> > Also do you know if the timeout numbers for TCP are configurable parameters?
>> >
>> The minimum RTO value is fixed to 200 ms.  It is useful to make the min/max
>> RTO values tunable.  For example, reducing the minimum RTO value is effective
>> for TCP incast problem [1].  Of course, it may occur spurious retransmissions.
>>
>> [1] Vijay Vasudevan, et al, Safe and Effective Fine-grained TCP Retransmissions
>> for Datacenter Communication, SIGCOMM2009
>>
>> In Solaris, there are two tunable parameters: tcp_rexmit_interval_min/max.
>>
>> Do you have plan to introduce sysctl parameters like these to the Linux.
>>
>> Thanks,
>> Ryousei
>

^ permalink raw reply

* Re: [PATCH] syncookies: remove Kconfig text line about disabled-by-default
From: David Miller @ 2010-06-04 22:57 UTC (permalink / raw)
  To: fw; +Cc: netdev
In-Reply-To: <1275561750-24111-1-git-send-email-fw@strlen.de>

From: Florian Westphal <fw@strlen.de>
Date: Thu,  3 Jun 2010 12:42:30 +0200

> syncookies default to on since
> e994b7c901ded7200b525a707c6da71f2cf6d4bb
> (tcp: Don't make syn cookies initial setting depend on CONFIG_SYSCTL).
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>

Applied, thanks Florian.

^ permalink raw reply

* Re: [PATCH] tcp: use correct net ns in cookie_v4_check()
From: David Miller @ 2010-06-04 22:57 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1275579947.2456.80.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 03 Jun 2010 17:45:47 +0200

> [PATCH] tcp: use correct net ns in cookie_v4_check()
> 
> Its better to make a route lookup in appropriate namespace.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied and queued up for -stable, thanks Eric!

^ permalink raw reply

* Re: [PATCH] rps: tcp: fix rps_sock_flow_table table updates
From: David Miller @ 2010-06-04 22:57 UTC (permalink / raw)
  To: eric.dumazet; +Cc: therbert, netdev
In-Reply-To: <1275591838.2533.37.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 03 Jun 2010 21:03:58 +0200

> I believe a moderate SYN flood attack can corrupt RFS flow table
> (rps_sock_flow_table), making RPS/RFS much less effective.
> 
> Even in a normal situation, server handling short lived sessions suffer
> from bad steering for the first data packet of a session, if another SYN
> packet is received for another session.
> 
> We do following action in tcp_v4_rcv() :
> 
> 	sock_rps_save_rxhash(sk, skb->rxhash);
> 
> We should _not_ do this if sk is a LISTEN socket, as about each
> packet received on a LISTEN socket has a different rxhash than
> previous one.
>  -> RPS_NO_CPU markers are spread all over rps_sock_flow_table.
> 
> Also, it makes sense to protect sk->rxhash field changes with socket
> lock (We currently can change it even if user thread owns the lock
> and might use rxhash)
> 
> This patch moves sock_rps_save_rxhash() to a sock locked section,
> and only for non LISTEN sockets.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied.

^ permalink raw reply

* Re: [net-2.6 PATCH] ixgbe: only check pfc bits in hang logic if pfc is enabled
From: David Miller @ 2010-06-04 22:58 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, john.r.fastabend, donald.c.skidmore
In-Reply-To: <20100604030314.16356.79492.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 03 Jun 2010 20:03:45 -0700

> From: John Fastabend <john.r.fastabend@intel.com>
> 
> Only check pfc bits in hang logic if PFC is enabled.  Previously,
> if DCB was enabled but PFC was disabled the incorrect pause
> bits would be checked.
> 
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
> Acked-by: Don Skidmore <donald.c.skidmore@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.

^ permalink raw reply

* Re: [PATCH v2] net: check for refcount if pop a stacked dst_entry
From: David Miller @ 2010-06-04 22:58 UTC (permalink / raw)
  To: eric.dumazet; +Cc: steffen.klassert, netdev
In-Reply-To: <1275653207.2482.120.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 04 Jun 2010 14:06:47 +0200

> Le vendredi 04 juin 2010 à 13:57 +0200, Steffen Klassert a écrit :
>> xfrm triggers a warning if dst_pop() drops a refcount
>> on a noref dst. This patch changes dst_pop() to
>> skb_dst_pop(). skb_dst_pop() drops the refcnt only
>> on a refcounted dst. Also we don't clone the child
>> dst_entry, so it is not refcounted and we can use
>> skb_dst_set_noref() in xfrm_output_one().
>> 
>> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
>> ---
> 
> Thanks a lot Steffen !
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks guys!

^ permalink raw reply

* cxgb4i_v4 submission
From: Rakesh Ranjan @ 2010-06-04 22:58 UTC (permalink / raw)
  To: LK-NetDev, LK-SCSIDev, LK-iSCSIDev
  Cc: LKML, Karen Xie, David Miller, James Bottomley, Mike Christie,
	Anish Bhatt

The following 3 patches add a new iscsi LLD driver cxgb4i to enable iscsi offload
support on Chelsio's new 1G and 10G cards. This is updated version of previous cxgb4i
patch. Please share you commnets after review.

Changes since cxgb4i_v3.1
1. Fixed issues reported by mike.
2. Made libcxgbi more generic.

[PATCH 1/3] cxgb4i_v4: add build support
[PATCH 2/3] cxgb4i_v4: libcxgbi library for handling common part in cxgb4i/cxgb3i
[PATCH 3/3] cxgb4i_v4: main driver files

Regards
Rakesh Ranjan

-- 
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To post to this group, send email to open-iscsi-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to open-iscsi+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.

^ permalink raw reply

* [PATCH 2/3] cxgb4i_v4: libcxgbi library for handling common part in cxgb4i/cxgb3i
From: Rakesh Ranjan @ 2010-06-04 22:58 UTC (permalink / raw)
  To: LK-NetDev, LK-SCSIDev, LK-iSCSIDev
  Cc: LKML, Karen Xie, David Miller, James Bottomley, Mike Christie,
	Anish Bhatt, Rakesh Ranjan
In-Reply-To: <1275692327-21502-1-git-send-email-rakesh-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>

From: Rakesh Ranjan <rakesh-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>


Signed-off-by: Rakesh Ranjan <rakesh-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>
---
 drivers/scsi/cxgbi/libcxgbi.c | 1529 +++++++++++++++++++++++++++++++++++++++++
 drivers/scsi/cxgbi/libcxgbi.h |  577 ++++++++++++++++
 2 files changed, 2106 insertions(+), 0 deletions(-)
 create mode 100644 drivers/scsi/cxgbi/libcxgbi.c
 create mode 100644 drivers/scsi/cxgbi/libcxgbi.h

diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
new file mode 100644
index 0000000..3c2e621
--- /dev/null
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -0,0 +1,1529 @@
+/*
+ * libcxgbi.c: Chelsio common library for T3/T4 iSCSI driver.
+ *
+ * Copyright (c) 2010 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ *
+ * Written by: Karen Xie (kxie-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ * Written by: Rakesh Ranjan (rranjan-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ */
+
+#include <linux/skbuff.h>
+#include <linux/crypto.h>
+#include <linux/scatterlist.h>
+#include <linux/pci.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_host.h>
+#include <linux/if_vlan.h>
+#include <net/dst.h>
+#include <net/route.h>
+#include <net/tcp.h>
+
+#include "libcxgbi.h"
+
+MODULE_AUTHOR("Chelsio Communications");
+MODULE_DESCRIPTION("Chelsio libcxgbi common library");
+MODULE_LICENSE("GPL");
+
+static LIST_HEAD(cdev_list);
+static DEFINE_MUTEX(cdev_rwlock);
+
+struct cxgbi_device *cxgbi_device_register(unsigned int dd_size,
+					unsigned int nports)
+{
+	struct cxgbi_device *cdev;
+
+	cdev = kzalloc(sizeof(*cdev) + dd_size, GFP_KERNEL);
+	if (!cdev)
+		return NULL;
+
+	cdev->hbas = kzalloc(sizeof(struct cxgbi_hba **) *  nports, GFP_KERNEL);
+	if (!cdev->hbas) {
+		kfree(cdev);
+		return NULL;
+	}
+
+	mutex_lock(&cdev_rwlock);
+	list_add_tail(&cdev->list_head, &cdev_list);
+	mutex_unlock(&cdev_rwlock);
+	return cdev;
+}
+EXPORT_SYMBOL_GPL(cxgbi_device_register);
+
+void cxgbi_device_unregister(struct cxgbi_device *cdev)
+{
+	mutex_lock(&cdev_rwlock);
+	list_del(&cdev->list_head);
+	mutex_unlock(&cdev_rwlock);
+
+	kfree(cdev->hbas);
+	kfree(cdev);
+}
+EXPORT_SYMBOL_GPL(cxgbi_device_unregister);
+
+struct cxgbi_hba *cxgbi_hba_find_by_netdev(struct net_device *dev,
+					   struct cxgbi_device *cdev)
+{
+	int i;
+
+	if (dev->priv_flags & IFF_802_1Q_VLAN)
+		dev = vlan_dev_real_dev(dev);
+
+	for (i = 0; i < cdev->nports; i++) {
+		if (cdev->hbas[i]->ndev == dev)
+			return cdev->hbas[i];
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(cxgbi_hba_find_by_netdev);
+
+static struct rtable *find_route(struct net_device *dev,
+				__be32 saddr, __be32 daddr,
+				__be16 sport, __be16 dport,
+				u8 tos)
+{
+	struct rtable *rt;
+	struct flowi fl = {
+		.oif = dev ? dev->ifindex : 0,
+		.nl_u = {
+			.ip4_u = {
+				.daddr = daddr,
+				.saddr = saddr,
+				.tos = tos }
+			},
+		.proto = IPPROTO_TCP,
+		.uli_u = {
+			.ports = {
+				.sport = sport,
+				.dport = dport }
+			}
+	};
+
+	if (ip_route_output_flow(dev ? dev_net(dev) : &init_net,
+					&rt, &fl, NULL, 0))
+		return NULL;
+
+	return rt;
+}
+
+static struct net_device *cxgbi_find_dev(struct net_device *dev,
+					__be32 ipaddr)
+{
+	struct flowi fl;
+	struct rtable *rt;
+	int err;
+
+	memset(&fl, 0, sizeof(fl));
+	fl.nl_u.ip4_u.daddr = ipaddr;
+
+	err = ip_route_output_key(dev ? dev_net(dev) : &init_net, &rt, &fl);
+	if (!err)
+		return (&rt->u.dst)->dev;
+
+	return NULL;
+}
+
+static int is_cxgbi_dev(struct net_device *dev, struct cxgbi_device *cdev)
+{
+	struct net_device *ndev = dev;
+	int i;
+
+	if (dev->priv_flags & IFF_802_1Q_VLAN)
+		ndev = vlan_dev_real_dev(dev);
+
+	for (i = 0; i < cdev->nports; i++) {
+		if (ndev == cdev->ports[i])
+			return 1;
+	}
+	return 0;
+}
+
+static struct net_device *cxgbi_find_egress_dev(struct net_device *root_dev,
+						struct cxgbi_device *cdev)
+{
+	while (root_dev) {
+		if (root_dev->priv_flags & IFF_802_1Q_VLAN)
+			root_dev = vlan_dev_real_dev(root_dev);
+		else if (is_cxgbi_dev(root_dev, cdev))
+			return root_dev;
+		else
+			return NULL;
+	}
+
+	return NULL;
+}
+
+struct cxgbi_device *cxgbi_find_cdev(struct net_device *dev, __be32 ipaddr)
+{
+	struct flowi fl;
+	struct rtable *rt;
+	struct net_device *sdev = NULL;
+	struct cxgbi_device *cdev = NULL, *tmp;
+	int err, i;
+
+	memset(&fl, 0, sizeof(fl));
+	fl.nl_u.ip4_u.daddr = ipaddr;
+
+	err = ip_route_output_key(dev ? dev_net(dev) : &init_net, &rt, &fl);
+	if (err)
+		goto out;
+
+	sdev = (&rt->u.dst)->dev;
+	mutex_lock(&cdev_rwlock);
+	list_for_each_entry_safe(cdev, tmp, &cdev_list, list_head) {
+		if (cdev) {
+			for (i = 0; i < cdev->nports; i++) {
+				if (sdev == cdev->ports[i]) {
+					mutex_unlock(&cdev_rwlock);
+					return cdev;
+				}
+			}
+		}
+	}
+	mutex_unlock(&cdev_rwlock);
+out:	return cdev;
+}
+
+/*
+ * pdu receive, interact with libiscsi_tcp
+ */
+static inline int read_pdu_skb(struct iscsi_conn *conn,
+			       struct sk_buff *skb,
+			       unsigned int offset,
+			       int offloaded)
+{
+	int status = 0;
+	int bytes_read;
+
+	bytes_read = iscsi_tcp_recv_skb(conn, skb, offset, offloaded, &status);
+	switch (status) {
+	case ISCSI_TCP_CONN_ERR:
+		return -EIO;
+	case ISCSI_TCP_SUSPENDED:
+		/* no transfer - just have caller flush queue */
+		return bytes_read;
+	case ISCSI_TCP_SKB_DONE:
+		/*
+		 * pdus should always fit in the skb and we should get
+		 * segment done notifcation.
+		 */
+		iscsi_conn_printk(KERN_ERR, conn, "Invalid pdu or skb.");
+		return -EFAULT;
+	case ISCSI_TCP_SEGMENT_DONE:
+		return bytes_read;
+	default:
+		iscsi_conn_printk(KERN_ERR, conn, "Invalid iscsi_tcp_recv_skb "
+				  "status %d\n", status);
+		return -EINVAL;
+	}
+}
+
+static int cxgbi_conn_read_bhs_pdu_skb(struct iscsi_conn *conn,
+				       struct sk_buff *skb)
+{
+	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct cxgbi_device *cdev = cconn->chba->cdev;
+	int rc;
+
+	cxgbi_rx_debug("conn 0x%p, skb 0x%p, len %u, flag 0x%x.\n",
+			conn, skb, skb->len, cdev->get_skb_ulp_mode(skb));
+
+	if (!iscsi_tcp_recv_segment_is_hdr(tcp_conn)) {
+		iscsi_conn_failure(conn, ISCSI_ERR_PROTO);
+		return -EIO;
+	}
+
+	if (conn->hdrdgst_en && (cdev->get_skb_ulp_mode(skb)
+				& ULP2_FLAG_HCRC_ERROR)) {
+		iscsi_conn_failure(conn, ISCSI_ERR_HDR_DGST);
+		return -EIO;
+	}
+
+	rc = read_pdu_skb(conn, skb, 0, 0);
+	if (rc <= 0)
+		return rc;
+
+	return 0;
+}
+
+static int cxgbi_conn_read_data_pdu_skb(struct iscsi_conn *conn,
+					struct sk_buff *skb)
+{
+	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct cxgbi_device *cdev = cconn->chba->cdev;
+	bool offloaded = 0;
+	unsigned int offset = 0;
+	int rc;
+
+	cxgbi_rx_debug("conn 0x%p, skb 0x%p, len %u, flag 0x%x.\n",
+			conn, skb, skb->len, cdev->get_skb_ulp_mode(skb));
+
+	if (conn->datadgst_en &&
+		(cdev->get_skb_ulp_mode(skb) & ULP2_FLAG_DCRC_ERROR)) {
+		iscsi_conn_failure(conn, ISCSI_ERR_DATA_DGST);
+		return -EIO;
+	}
+
+	if (iscsi_tcp_recv_segment_is_hdr(tcp_conn))
+		return 0;
+
+	if (conn->hdrdgst_en)
+		offset = ISCSI_DIGEST_SIZE;
+
+	if (cdev->get_skb_ulp_mode(skb) & ULP2_FLAG_DATA_DDPED) {
+		cxgbi_rx_debug("skb 0x%p, opcode 0x%x, data %u, ddp'ed, "
+				"itt 0x%x.\n",
+				skb,
+				tcp_conn->in.hdr->opcode & ISCSI_OPCODE_MASK,
+				tcp_conn->in.datalen,
+				ntohl(tcp_conn->in.hdr->itt));
+		offloaded = 1;
+	} else {
+		cxgbi_rx_debug("skb 0x%p, opcode 0x%x, data %u, NOT ddp'ed, "
+				"itt 0x%x.\n",
+				skb,
+				tcp_conn->in.hdr->opcode & ISCSI_OPCODE_MASK,
+				tcp_conn->in.datalen,
+				ntohl(tcp_conn->in.hdr->itt));
+	}
+
+	rc = read_pdu_skb(conn, skb, 0, offloaded);
+	if (rc < 0)
+		return rc;
+
+	return 0;
+}
+
+void cxgbi_conn_pdu_ready(struct cxgbi_sock *csk)
+{
+	struct sk_buff *skb;
+	unsigned int read = 0;
+	struct iscsi_conn *conn = csk->user_data;
+	int err = 0;
+
+	cxgbi_rx_debug("csk 0x%p.\n", csk);
+
+	read_lock(&csk->callback_lock);
+	if (unlikely(!conn || conn->suspend_rx)) {
+		cxgbi_rx_debug("conn 0x%p, id %d, suspend_rx %lu!\n",
+				conn, conn ? conn->id : 0xFF,
+				conn ? conn->suspend_rx : 0xFF);
+		read_unlock(&csk->callback_lock);
+		return;
+	}
+
+	skb = skb_peek(&csk->receive_queue);
+	while (!err && skb) {
+		__skb_unlink(skb, &csk->receive_queue);
+		read += csk->cdev->get_skb_rx_pdulen(skb);
+		cxgbi_rx_debug("conn 0x%p, csk 0x%p, rx skb 0x%p, pdulen %u\n",
+				conn, csk, skb,
+				csk->cdev->get_skb_rx_pdulen(skb));
+		if (csk->flags & CTPF_MSG_COALESCED) {
+			err = cxgbi_conn_read_bhs_pdu_skb(conn, skb);
+			err = cxgbi_conn_read_data_pdu_skb(conn, skb);
+		} else {
+			if (csk->cdev->get_skb_flags(skb) & CTP_SKCBF_HDR_RCVD)
+				err = cxgbi_conn_read_bhs_pdu_skb(conn, skb);
+			else if (csk->cdev->get_skb_flags(skb) ==
+				CTP_SKCBF_DATA_RCVD)
+				err = cxgbi_conn_read_data_pdu_skb(conn, skb);
+		}
+		__kfree_skb(skb);
+		skb = skb_peek(&csk->receive_queue);
+	}
+	cxgbi_log_debug("read %d\n", read);
+	read_unlock(&csk->callback_lock);
+	csk->copied_seq += read;
+	csk->cdev->sock_rx_credits(csk, read);
+	conn->rxdata_octets += read;
+
+	if (err) {
+		cxgbi_log_info("conn 0x%p rx failed err %d.\n", conn, err);
+		iscsi_conn_failure(conn, ISCSI_ERR_CONN_FAILED);
+	}
+}
+EXPORT_SYMBOL_GPL(cxgbi_conn_pdu_ready);
+
+static int sgl_seek_offset(struct scatterlist *sgl, unsigned int sgcnt,
+				unsigned int offset, unsigned int *off,
+				struct scatterlist **sgp)
+{
+	int i;
+	struct scatterlist *sg;
+
+	for_each_sg(sgl, sg, sgcnt, i) {
+		if (offset < sg->length) {
+			*off = offset;
+			*sgp = sg;
+			return 0;
+		}
+		offset -= sg->length;
+	}
+	return -EFAULT;
+}
+
+static int sgl_read_to_frags(struct scatterlist *sg, unsigned int sgoffset,
+				unsigned int dlen, skb_frag_t *frags,
+				int frag_max)
+{
+	unsigned int datalen = dlen;
+	unsigned int sglen = sg->length - sgoffset;
+	struct page *page = sg_page(sg);
+	int i;
+
+	i = 0;
+	do {
+		unsigned int copy;
+
+		if (!sglen) {
+			sg = sg_next(sg);
+			if (!sg) {
+				cxgbi_log_error("sg NULL, len %u/%u.\n",
+								datalen, dlen);
+				return -EINVAL;
+			}
+			sgoffset = 0;
+			sglen = sg->length;
+			page = sg_page(sg);
+
+		}
+		copy = min(datalen, sglen);
+		if (i && page == frags[i - 1].page &&
+		    sgoffset + sg->offset ==
+			frags[i - 1].page_offset + frags[i - 1].size) {
+			frags[i - 1].size += copy;
+		} else {
+			if (i >= frag_max) {
+				cxgbi_log_error("too many pages %u, "
+						 "dlen %u.\n", frag_max, dlen);
+				return -EINVAL;
+			}
+
+			frags[i].page = page;
+			frags[i].page_offset = sg->offset + sgoffset;
+			frags[i].size = copy;
+			i++;
+		}
+		datalen -= copy;
+		sgoffset += copy;
+		sglen -= copy;
+	} while (datalen);
+
+	return i;
+}
+
+int cxgbi_conn_alloc_pdu(struct iscsi_task *task, u8 opcode)
+{
+	struct iscsi_tcp_conn *tcp_conn = task->conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct cxgbi_device *cdev = cconn->chba->cdev;
+	struct iscsi_conn *conn = task->conn;
+	struct iscsi_tcp_task *tcp_task = task->dd_data;
+	struct cxgbi_task_data *tdata = task->dd_data + sizeof(*tcp_task);
+	struct scsi_cmnd *sc = task->sc;
+	int headroom = SKB_TX_PDU_HEADER_LEN;
+
+	tcp_task->dd_data = tdata;
+	task->hdr = NULL;
+
+	/* write command, need to send data pdus */
+	if (cdev->skb_extra_headroom && (opcode == ISCSI_OP_SCSI_DATA_OUT ||
+	    (opcode == ISCSI_OP_SCSI_CMD &&
+	    (scsi_bidi_cmnd(sc) || sc->sc_data_direction == DMA_TO_DEVICE))))
+		headroom += min(cdev->skb_extra_headroom,
+					conn->max_xmit_dlength);
+
+	tdata->skb = alloc_skb(cdev->skb_tx_headroom + headroom, GFP_ATOMIC);
+	if (!tdata->skb)
+		return -ENOMEM;
+
+	skb_reserve(tdata->skb, cdev->skb_tx_headroom);
+	cxgbi_tx_debug("task 0x%p, opcode 0x%x, skb 0x%p.\n",
+			task, opcode, tdata->skb);
+	task->hdr = (struct iscsi_hdr *)tdata->skb->data;
+	task->hdr_max = SKB_TX_PDU_HEADER_LEN;
+
+	/* data_out uses scsi_cmd's itt */
+	if (opcode != ISCSI_OP_SCSI_DATA_OUT)
+		cxgbi_reserve_itt(task, &task->hdr->itt);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxgbi_conn_alloc_pdu);
+
+int cxgbi_conn_init_pdu(struct iscsi_task *task, unsigned int offset,
+			      unsigned int count)
+{
+	struct iscsi_tcp_conn *tcp_conn = task->conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct cxgbi_device *cdev = cconn->chba->cdev;
+	struct iscsi_conn *conn = task->conn;
+	struct iscsi_tcp_task *tcp_task = task->dd_data;
+	struct cxgbi_task_data *tdata = tcp_task->dd_data;
+	struct sk_buff *skb = tdata->skb;
+	unsigned int datalen = count;
+	int i, padlen = iscsi_padding(count);
+	struct page *pg;
+
+	cxgbi_tx_debug("task 0x%p,0x%p, offset %u, count %u, skb 0x%p.\n",
+			task, task->sc, offset, count, skb);
+
+	skb_put(skb, task->hdr_len);
+	cdev->set_skb_txmode(skb, conn->hdrdgst_en,
+			     datalen ? conn->datadgst_en : 0);
+	if (!count)
+		return 0;
+
+	if (task->sc) {
+		struct scsi_data_buffer *sdb = scsi_out(task->sc);
+		struct scatterlist *sg = NULL;
+		int err;
+
+		tdata->offset = offset;
+		tdata->count = count;
+		err = sgl_seek_offset(sdb->table.sgl, sdb->table.nents,
+					tdata->offset, &tdata->sgoffset, &sg);
+		if (err < 0) {
+			cxgbi_log_warn("tpdu, sgl %u, bad offset %u/%u.\n",
+					sdb->table.nents, tdata->offset,
+					sdb->length);
+			return err;
+		}
+		err = sgl_read_to_frags(sg, tdata->sgoffset, tdata->count,
+					tdata->frags, MAX_PDU_FRAGS);
+		if (err < 0) {
+			cxgbi_log_warn("tpdu, sgl %u, bad offset %u + %u.\n",
+					sdb->table.nents, tdata->offset,
+					tdata->count);
+			return err;
+		}
+		tdata->nr_frags = err;
+
+		if (tdata->nr_frags > MAX_SKB_FRAGS ||
+		    (padlen && tdata->nr_frags == MAX_SKB_FRAGS)) {
+			char *dst = skb->data + task->hdr_len;
+			skb_frag_t *frag = tdata->frags;
+
+			/* data fits in the skb's headroom */
+			for (i = 0; i < tdata->nr_frags; i++, frag++) {
+				char *src = kmap_atomic(frag->page,
+							KM_SOFTIRQ0);
+
+				memcpy(dst, src+frag->page_offset, frag->size);
+				dst += frag->size;
+				kunmap_atomic(src, KM_SOFTIRQ0);
+			}
+			if (padlen) {
+				memset(dst, 0, padlen);
+				padlen = 0;
+			}
+			skb_put(skb, count + padlen);
+		} else {
+			/* data fit into frag_list */
+			for (i = 0; i < tdata->nr_frags; i++)
+				get_page(tdata->frags[i].page);
+
+			memcpy(skb_shinfo(skb)->frags, tdata->frags,
+				sizeof(skb_frag_t) * tdata->nr_frags);
+			skb_shinfo(skb)->nr_frags = tdata->nr_frags;
+			skb->len += count;
+			skb->data_len += count;
+			skb->truesize += count;
+		}
+
+	} else {
+		pg = virt_to_page(task->data);
+
+		get_page(pg);
+		skb_fill_page_desc(skb, 0, pg, offset_in_page(task->data),
+					count);
+		skb->len += count;
+		skb->data_len += count;
+		skb->truesize += count;
+	}
+
+	if (padlen) {
+		i = skb_shinfo(skb)->nr_frags;
+		get_page(cdev->pad_page);
+		skb_fill_page_desc(skb, skb_shinfo(skb)->nr_frags,
+					cdev->pad_page, 0, padlen);
+
+		skb->data_len += padlen;
+		skb->truesize += padlen;
+		skb->len += padlen;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxgbi_conn_init_pdu);
+
+int cxgbi_conn_xmit_pdu(struct iscsi_task *task)
+{
+	struct iscsi_tcp_conn *tcp_conn = task->conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct cxgbi_device *cdev = cconn->chba->cdev;
+	struct iscsi_tcp_task *tcp_task = task->dd_data;
+	struct cxgbi_task_data *tdata = tcp_task->dd_data;
+	struct sk_buff *skb = tdata->skb;
+	unsigned int datalen;
+	int err;
+
+	if (!skb)
+		return 0;
+
+	datalen = skb->data_len;
+	tdata->skb = NULL;
+	err = cdev->sock_send_pdus(cconn->cep->csk, skb);
+	if (err > 0) {
+		int pdulen = err;
+
+		cxgbi_tx_debug("task 0x%p, skb 0x%p, len %u/%u, rv %d.\n",
+				task, skb, skb->len, skb->data_len, err);
+
+		if (task->conn->hdrdgst_en)
+			pdulen += ISCSI_DIGEST_SIZE;
+
+		if (datalen && task->conn->datadgst_en)
+			pdulen += ISCSI_DIGEST_SIZE;
+
+		task->conn->txdata_octets += pdulen;
+		return 0;
+	}
+
+	if (err == -EAGAIN || err == -ENOBUFS) {
+		/* reset skb to send when we are called again */
+		tdata->skb = skb;
+		return err;
+	}
+
+	kfree_skb(skb);
+	cxgbi_tx_debug("itt 0x%x, skb 0x%p, len %u/%u, xmit err %d.\n",
+			task->itt, skb, skb->len, skb->data_len, err);
+	iscsi_conn_printk(KERN_ERR, task->conn, "xmit err %d.\n", err);
+	iscsi_conn_failure(task->conn, ISCSI_ERR_XMIT_FAILED);
+	return err;
+}
+EXPORT_SYMBOL_GPL(cxgbi_conn_xmit_pdu);
+
+int cxgbi_pdu_init(struct cxgbi_device *cdev)
+{
+	cdev->pad_page = alloc_page(GFP_KERNEL);
+	if (!cdev->pad_page)
+		return -ENOMEM;
+
+	memset(page_address(cdev->pad_page), 0, PAGE_SIZE);
+
+	if (cdev->skb_tx_headroom > (512 * MAX_SKB_FRAGS))
+		cdev->skb_extra_headroom = cdev->skb_tx_headroom;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxgbi_pdu_init);
+
+void cxgbi_pdu_cleanup(struct cxgbi_device *cdev)
+{
+	if (cdev->pad_page) {
+		__free_page(cdev->pad_page);
+		cdev->pad_page = NULL;
+	}
+}
+EXPORT_SYMBOL_GPL(cxgbi_pdu_cleanup);
+
+void cxgbi_conn_tx_open(struct cxgbi_sock *csk)
+{
+	struct iscsi_conn *conn = csk->user_data;
+
+	if (conn) {
+		cxgbi_tx_debug("cn 0x%p, cid %d.\n", csk, conn->id);
+		iscsi_conn_queue_work(conn);
+	}
+}
+EXPORT_SYMBOL_GPL(cxgbi_conn_tx_open);
+
+struct cxgbi_sock *cxgbi_sock_create(struct cxgbi_device *cdev)
+{
+	struct cxgbi_sock *csk = NULL;
+
+	csk = kzalloc(sizeof(*csk), GFP_NOIO);
+	if (!csk)
+		return NULL;
+
+	if (cdev->alloc_cpl_skbs(csk) < 0)
+		goto free_csk;
+
+	cxgbi_conn_debug("alloc csk: 0x%p\n", csk);
+
+	csk->flags = 0;
+	spin_lock_init(&csk->lock);
+	kref_init(&csk->refcnt);
+	skb_queue_head_init(&csk->receive_queue);
+	skb_queue_head_init(&csk->write_queue);
+	setup_timer(&csk->retry_timer, NULL, (unsigned long)csk);
+	rwlock_init(&csk->callback_lock);
+	csk->cdev = cdev;
+	return csk;
+free_csk:
+	cxgbi_api_debug("csk alloc failed %p, baling out\n", csk);
+	kfree(csk);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(cxgbi_sock_create);
+
+int cxgbi_sock_connect(struct net_device *dev, struct cxgbi_sock *csk,
+			struct sockaddr_in *sin)
+{
+	struct rtable *rt;
+	__be32 sipv4 = 0;
+	struct net_device *dstdev;
+	struct cxgbi_hba *chba = NULL;
+	int err;
+
+	cxgbi_conn_debug("csk 0x%p, dev 0x%p\n", csk, dev);
+
+	if (sin->sin_family != AF_INET)
+		return -EAFNOSUPPORT;
+
+	csk->daddr.sin_port = sin->sin_port;
+	csk->daddr.sin_addr.s_addr = sin->sin_addr.s_addr;
+
+	dstdev = cxgbi_find_dev(dev, sin->sin_addr.s_addr);
+	if (!dstdev || !is_cxgbi_dev(dstdev, csk->cdev))
+		return -ENETUNREACH;
+
+	if (dstdev->priv_flags & IFF_802_1Q_VLAN)
+		dev = dstdev;
+
+	rt = find_route(dev, csk->saddr.sin_addr.s_addr,
+			csk->daddr.sin_addr.s_addr,
+			csk->saddr.sin_port,
+			csk->daddr.sin_port,
+			0);
+	if (rt == NULL) {
+		cxgbi_conn_debug("no route to %pI4, port %u, dev %s, "
+					"snic 0x%p\n",
+					&csk->daddr.sin_addr.s_addr,
+					ntohs(csk->daddr.sin_port),
+					dev ? dev->name : "any",
+					csk->dd_data);
+		return -ENETUNREACH;
+	}
+
+	if (rt->rt_flags & (RTCF_MULTICAST | RTCF_BROADCAST)) {
+		cxgbi_conn_debug("multi-cast route to %pI4, port %u, "
+					"dev %s, snic 0x%p\n",
+					&csk->daddr.sin_addr.s_addr,
+					ntohs(csk->daddr.sin_port),
+					dev ? dev->name : "any",
+					csk->dd_data);
+		ip_rt_put(rt);
+		return -ENETUNREACH;
+	}
+
+	if (!csk->saddr.sin_addr.s_addr)
+		csk->saddr.sin_addr.s_addr = rt->rt_src;
+
+	csk->dst = &rt->u.dst;
+
+	dev = cxgbi_find_egress_dev(csk->dst->dev, csk->cdev);
+	if (dev == NULL) {
+		cxgbi_conn_debug("csk: 0x%p, egress dev NULL\n", csk);
+		return -ENETUNREACH;
+	}
+
+	csk->egdev = dev;
+
+	err = cxgbi_sock_get_port(csk);
+	if (err)
+		return err;
+
+	cxgbi_conn_debug("csk: 0x%p get port: %u\n",
+			csk, ntohs(csk->saddr.sin_port));
+
+	chba = cxgbi_hba_find_by_netdev(csk->dst->dev, csk->cdev);
+
+	sipv4 = cxgbi_get_iscsi_ipv4(chba);
+	if (!sipv4) {
+		cxgbi_conn_debug("csk: 0x%p, iscsi is not configured\n", csk);
+		sipv4 = csk->saddr.sin_addr.s_addr;
+		cxgbi_set_iscsi_ipv4(chba, sipv4);
+	} else
+		csk->saddr.sin_addr.s_addr = sipv4;
+
+	cxgbi_conn_debug("csk: 0x%p, %pI4:[%u], %pI4:[%u] SYN_SENT\n",
+				csk,
+				&csk->saddr.sin_addr.s_addr,
+				ntohs(csk->saddr.sin_port),
+				&csk->daddr.sin_addr.s_addr,
+				ntohs(csk->daddr.sin_port));
+
+	cxgbi_sock_set_state(csk, CTP_CONNECTING);
+
+	if (!csk->cdev->init_act_open(csk, dev))
+		return 0;
+
+	err = -ENOTSUPP;
+	cxgbi_conn_debug("csk 0x%p -> closed\n", csk);
+	cxgbi_sock_set_state(csk, CTP_CLOSED);
+	ip_rt_put(rt);
+	cxgbi_sock_put_port(csk);
+	return err;
+}
+EXPORT_SYMBOL_GPL(cxgbi_sock_connect);
+
+int cxgbi_sock_get_port(struct cxgbi_sock *csk)
+{
+	unsigned int start;
+	int idx;
+
+	if (!csk->cdev->pmap)
+		goto error_out;
+
+	if (csk->saddr.sin_port) {
+		cxgbi_log_error("connect, sin_port none ZERO %u\n",
+				ntohs(csk->saddr.sin_port));
+		return -EADDRINUSE;
+	}
+
+	spin_lock_bh(&csk->cdev->pmap->lock);
+	start = idx = csk->cdev->pmap->next;
+
+	do {
+		if (++idx >= csk->cdev->pmap->max_connect)
+			idx = 0;
+		if (!csk->cdev->pmap->port_csk[idx]) {
+			csk->saddr.sin_port =
+				htons(csk->cdev->pmap->sport_base + idx);
+			csk->cdev->pmap->next = idx;
+			csk->cdev->pmap->port_csk[idx] = csk;
+			spin_unlock_bh(&csk->cdev->pmap->lock);
+			cxgbi_conn_debug("reserved port %u\n",
+					csk->cdev->pmap->sport_base + idx);
+			return 0;
+		}
+	} while (idx != start);
+	spin_unlock_bh(&csk->cdev->pmap->lock);
+error_out:
+	return -EADDRNOTAVAIL;
+}
+EXPORT_SYMBOL_GPL(cxgbi_sock_get_port);
+
+void cxgbi_sock_put_port(struct cxgbi_sock *csk)
+{
+	if (csk->saddr.sin_port) {
+		int idx = ntohs(csk->saddr.sin_port) -
+			csk->cdev->pmap->sport_base;
+
+		csk->saddr.sin_port = 0;
+		if (idx < 0 || idx >= csk->cdev->pmap->max_connect)
+			return;
+
+		spin_lock_bh(&csk->cdev->pmap->lock);
+		csk->cdev->pmap->port_csk[idx] = NULL;
+		spin_unlock_bh(&csk->cdev->pmap->lock);
+		cxgbi_conn_debug("released port %u\n",
+				csk->cdev->pmap->sport_base + idx);
+	}
+}
+EXPORT_SYMBOL_GPL(cxgbi_sock_put_port);
+
+void cxgbi_sock_conn_closing(struct cxgbi_sock *csk)
+{
+	struct iscsi_conn *conn = csk->user_data;
+
+	read_lock(&csk->callback_lock);
+	if (conn && csk->state != CTP_ESTABLISHED)
+		iscsi_conn_failure(conn, ISCSI_ERR_CONN_FAILED);
+	read_unlock(&csk->callback_lock);
+}
+EXPORT_SYMBOL_GPL(cxgbi_sock_conn_closing);
+
+void cxgbi_sock_closed(struct cxgbi_sock *csk)
+{
+	cxgbi_conn_debug("csk 0x%p, state %u, flags 0x%lx\n",
+			csk, csk->state, csk->flags);
+
+	cxgbi_sock_put_port(csk);
+	csk->cdev->release_offload_resources(csk);
+	cxgbi_sock_set_state(csk, CTP_CLOSED);
+	cxgbi_sock_conn_closing(csk);
+}
+EXPORT_SYMBOL_GPL(cxgbi_sock_closed);
+
+static void cxgbi_sock_active_close(struct cxgbi_sock *csk)
+{
+	int data_lost;
+	int close_req = 0;
+
+	cxgbi_conn_debug("csk 0x%p, state %u, flags %lu\n",
+			csk, csk->state, csk->flags);
+
+	dst_confirm(csk->dst);
+	cxgbi_sock_hold(csk);
+	spin_lock_bh(&csk->lock);
+	data_lost = skb_queue_len(&csk->receive_queue);
+	__skb_queue_purge(&csk->receive_queue);
+
+	switch (csk->state) {
+	case CTP_CLOSED:
+	case CTP_ACTIVE_CLOSE:
+	case CTP_CLOSE_WAIT_1:
+	case CTP_CLOSE_WAIT_2:
+	case CTP_ABORTING:
+		break;
+	case CTP_CONNECTING:
+		cxgbi_sock_set_flag(csk, CTPF_ACTIVE_CLOSE_NEEDED);
+		break;
+	case CTP_ESTABLISHED:
+		close_req = 1;
+		cxgbi_sock_set_flag(csk, CTP_ACTIVE_CLOSE);
+		break;
+	case CTP_PASSIVE_CLOSE:
+		close_req = 1;
+		cxgbi_sock_set_flag(csk, CTP_CLOSE_WAIT_2);
+		break;
+	}
+
+	if (close_req) {
+		if (data_lost)
+			csk->cdev->send_abort_req(csk);
+		else
+			csk->cdev->send_close_req(csk);
+	}
+
+	spin_unlock_bh(&csk->lock);
+	cxgbi_sock_put(csk);
+}
+
+void cxgbi_sock_release(struct cxgbi_sock *csk)
+{
+	cxgbi_conn_debug("csk 0x%p, state %u, flags %lu\n",
+			csk, csk->state, csk->flags);
+
+	if (unlikely(csk->state == CTP_CONNECTING))
+		cxgbi_sock_set_state(csk, CTPF_ACTIVE_CLOSE_NEEDED);
+	else if (likely(csk->state != CTP_CLOSED))
+		cxgbi_sock_active_close(csk);
+
+	cxgbi_sock_put(csk);
+}
+EXPORT_SYMBOL_GPL(cxgbi_sock_release);
+
+unsigned int cxgbi_sock_find_best_mtu(struct cxgbi_sock *csk,
+					unsigned short mtu)
+{
+	int i = 0;
+
+	while (i < csk->cdev->nmtus - 1 && csk->cdev->mtus[i + 1] <= mtu)
+		++i;
+
+	return i;
+}
+EXPORT_SYMBOL_GPL(cxgbi_sock_find_best_mtu);
+
+unsigned int cxgbi_sock_select_mss(struct cxgbi_sock *csk, unsigned int pmtu)
+{
+	unsigned int idx;
+	struct dst_entry *dst = csk->dst;
+	u16 advmss = dst_metric(dst, RTAX_ADVMSS);
+
+	if (advmss > pmtu - 40)
+		advmss = pmtu - 40;
+	if (advmss < csk->cdev->mtus[0] - 40)
+		advmss = csk->cdev->mtus[0] - 40;
+	idx = cxgbi_sock_find_best_mtu(csk, advmss + 40);
+
+	return idx;
+}
+EXPORT_SYMBOL_GPL(cxgbi_sock_select_mss);
+
+void cxgbi_release_itt(struct iscsi_task *task, itt_t hdr_itt)
+{
+	struct scsi_cmnd *sc = task->sc;
+	struct iscsi_tcp_conn *tcp_conn = task->conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct cxgbi_sock *csk = cconn->cep->csk;
+	struct cxgbi_tag_format *tformat = &csk->cdev->tag_format;
+	u32 tag = ntohl((__force u32)hdr_itt);
+
+	cxgbi_tag_debug("release tag 0x%x.\n", tag);
+
+	if (sc && (scsi_bidi_cmnd(sc) ||
+	    sc->sc_data_direction == DMA_FROM_DEVICE) &&
+	    cxgbi_is_ddp_tag(tformat, tag))
+		csk->cdev->ddp_tag_release(csk, tag);
+}
+EXPORT_SYMBOL_GPL(cxgbi_release_itt);
+
+int cxgbi_reserve_itt(struct iscsi_task *task, itt_t *hdr_itt)
+{
+	struct scsi_cmnd *sc = task->sc;
+	struct iscsi_conn *conn = task->conn;
+	struct iscsi_session *sess = conn->session;
+	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct cxgbi_device *cdev = cconn->chba->cdev;
+	struct cxgbi_tag_format *tformat = &cdev->tag_format;
+	u32 sw_tag = (sess->age << cconn->task_idx_bits) | task->itt;
+	u32 tag;
+	int err = -EINVAL;
+
+	if (sc && (scsi_bidi_cmnd(sc) ||
+	    sc->sc_data_direction == DMA_FROM_DEVICE) &&
+			cxgbi_sw_tag_usable(tformat, sw_tag)) {
+		struct cxgbi_sock *csk = cconn->cep->csk;
+		struct cxgbi_gather_list *gl;
+
+		gl = cdev->ddp_make_gl(scsi_in(sc)->length,
+					scsi_in(sc)->table.sgl,
+					scsi_in(sc)->table.nents,
+					cdev->pdev, GFP_ATOMIC);
+		if (gl) {
+			tag = sw_tag;
+			err = cdev->ddp_tag_reserve(csk, csk->hwtid,
+							tformat, &tag,
+							gl, GFP_ATOMIC);
+			if (err < 0)
+				cdev->ddp_release_gl(gl, cdev->pdev);
+		}
+	}
+	if (err < 0)
+		tag = cxgbi_set_non_ddp_tag(tformat, sw_tag);
+	/*  the itt need to sent in big-endian order */
+	*hdr_itt = (__force itt_t)htonl(tag);
+
+	cxgbi_tag_debug("new sc 0x%p tag 0x%x/0x%x (itt 0x%x, age 0x%x).\n",
+			sc, tag, *hdr_itt, task->itt, sess->age);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxgbi_reserve_itt);
+
+void cxgbi_parse_pdu_itt(struct iscsi_conn *conn, itt_t itt,
+				int *idx, int *age)
+{
+	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct cxgbi_device *cdev = cconn->chba->cdev;
+	u32 tag = ntohl((__force u32) itt);
+	u32 sw_bits;
+
+	sw_bits = cxgbi_tag_nonrsvd_bits(&cdev->tag_format, tag);
+	if (idx)
+		*idx = sw_bits & ((1 << cconn->task_idx_bits) - 1);
+	if (age)
+		*age = (sw_bits >> cconn->task_idx_bits) & ISCSI_AGE_MASK;
+
+	cxgbi_tag_debug("parse tag 0x%x/0x%x, sw 0x%x, itt 0x%x, age 0x%x.\n",
+			tag, itt, sw_bits, idx ? *idx : 0xFFFFF,
+			age ? *age : 0xFF);
+}
+EXPORT_SYMBOL_GPL(cxgbi_parse_pdu_itt);
+
+void cxgbi_cleanup_task(struct iscsi_task *task)
+{
+	struct cxgbi_task_data *tdata = task->dd_data +
+				sizeof(struct iscsi_tcp_task);
+
+	/*  never reached the xmit task callout */
+	if (tdata->skb)
+		__kfree_skb(tdata->skb);
+	memset(tdata, 0, sizeof(*tdata));
+
+	cxgbi_release_itt(task, task->hdr_itt);
+	iscsi_tcp_cleanup_task(task);
+}
+EXPORT_SYMBOL_GPL(cxgbi_cleanup_task);
+
+void cxgbi_get_conn_stats(struct iscsi_cls_conn *cls_conn,
+				struct iscsi_stats *stats)
+{
+	struct iscsi_conn *conn = cls_conn->dd_data;
+
+	stats->txdata_octets = conn->txdata_octets;
+	stats->rxdata_octets = conn->rxdata_octets;
+	stats->scsicmd_pdus = conn->scsicmd_pdus_cnt;
+	stats->dataout_pdus = conn->dataout_pdus_cnt;
+	stats->scsirsp_pdus = conn->scsirsp_pdus_cnt;
+	stats->datain_pdus = conn->datain_pdus_cnt;
+	stats->r2t_pdus = conn->r2t_pdus_cnt;
+	stats->tmfcmd_pdus = conn->tmfcmd_pdus_cnt;
+	stats->tmfrsp_pdus = conn->tmfrsp_pdus_cnt;
+	stats->digest_err = 0;
+	stats->timeout_err = 0;
+	stats->custom_length = 1;
+	strcpy(stats->custom[0].desc, "eh_abort_cnt");
+	stats->custom[0].value = conn->eh_abort_cnt;
+}
+EXPORT_SYMBOL_GPL(cxgbi_get_conn_stats);
+
+int cxgbi_conn_max_xmit_dlength(struct iscsi_conn *conn)
+{
+	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct cxgbi_device *cdev = cconn->chba->cdev;
+	unsigned int skb_tx_headroom = cdev->skb_tx_headroom;
+	unsigned int max_def = 512 * MAX_SKB_FRAGS;
+	unsigned int max = max(max_def, skb_tx_headroom);
+
+	max = min(cconn->chba->cdev->tx_max_size, max);
+	if (conn->max_xmit_dlength)
+		conn->max_xmit_dlength = min(conn->max_xmit_dlength, max);
+	else
+		conn->max_xmit_dlength = max;
+	cxgbi_align_pdu_size(conn->max_xmit_dlength);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxgbi_conn_max_xmit_dlength);
+
+int cxgbi_conn_max_recv_dlength(struct iscsi_conn *conn)
+{
+	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	unsigned int max = cconn->chba->cdev->rx_max_size;
+
+	cxgbi_align_pdu_size(max);
+
+	if (conn->max_recv_dlength) {
+		if (conn->max_recv_dlength > max) {
+			cxgbi_log_error("MaxRecvDataSegmentLength %u too big."
+					" Need to be <= %u.\n",
+					conn->max_recv_dlength, max);
+			return -EINVAL;
+		}
+		conn->max_recv_dlength = min(conn->max_recv_dlength, max);
+		cxgbi_align_pdu_size(conn->max_recv_dlength);
+	} else
+		conn->max_recv_dlength = max;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxgbi_conn_max_recv_dlength);
+
+int cxgbi_set_conn_param(struct iscsi_cls_conn *cls_conn,
+			enum iscsi_param param, char *buf, int buflen)
+{
+	struct iscsi_conn *conn = cls_conn->dd_data;
+	struct iscsi_session *session = conn->session;
+	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct cxgbi_sock *csk = cconn->cep->csk;
+	int value, err = 0;
+
+	switch (param) {
+	case ISCSI_PARAM_HDRDGST_EN:
+		err = iscsi_set_param(cls_conn, param, buf, buflen);
+		if (!err && conn->hdrdgst_en)
+			err = csk->cdev->ddp_setup_conn_digest(csk, csk->hwtid,
+							conn->hdrdgst_en,
+							conn->datadgst_en, 0);
+		break;
+	case ISCSI_PARAM_DATADGST_EN:
+		err = iscsi_set_param(cls_conn, param, buf, buflen);
+		if (!err && conn->datadgst_en)
+			err = csk->cdev->ddp_setup_conn_digest(csk, csk->hwtid,
+							conn->hdrdgst_en,
+							conn->datadgst_en, 0);
+		break;
+	case ISCSI_PARAM_MAX_R2T:
+		sscanf(buf, "%d", &value);
+		if (value <= 0 || !is_power_of_2(value))
+			return -EINVAL;
+		if (session->max_r2t == value)
+			break;
+		iscsi_tcp_r2tpool_free(session);
+		err = iscsi_set_param(cls_conn, param, buf, buflen);
+		if (!err && iscsi_tcp_r2tpool_alloc(session))
+			return -ENOMEM;
+	case ISCSI_PARAM_MAX_RECV_DLENGTH:
+		err = iscsi_set_param(cls_conn, param, buf, buflen);
+		if (!err)
+			err = cxgbi_conn_max_recv_dlength(conn);
+		break;
+	case ISCSI_PARAM_MAX_XMIT_DLENGTH:
+		err = iscsi_set_param(cls_conn, param, buf, buflen);
+		if (!err)
+			err = cxgbi_conn_max_xmit_dlength(conn);
+		break;
+	default:
+		return iscsi_set_param(cls_conn, param, buf, buflen);
+	}
+	return err;
+}
+EXPORT_SYMBOL_GPL(cxgbi_set_conn_param);
+
+int cxgbi_get_conn_param(struct iscsi_cls_conn *cls_conn,
+			enum iscsi_param param, char *buff)
+{
+	struct iscsi_conn *iconn = cls_conn->dd_data;
+	int len;
+
+	switch (param) {
+	case ISCSI_PARAM_CONN_PORT:
+		spin_lock_bh(&iconn->session->lock);
+		len = sprintf(buff, "%hu\n", iconn->portal_port);
+		spin_unlock_bh(&iconn->session->lock);
+		break;
+	case ISCSI_PARAM_CONN_ADDRESS:
+		spin_lock_bh(&iconn->session->lock);
+		len = sprintf(buff, "%s\n", iconn->portal_address);
+		spin_unlock_bh(&iconn->session->lock);
+		break;
+	default:
+		return iscsi_conn_get_param(cls_conn, param, buff);
+	}
+	return len;
+}
+EXPORT_SYMBOL_GPL(cxgbi_get_conn_param);
+
+struct iscsi_cls_conn *
+cxgbi_create_conn(struct iscsi_cls_session *cls_session, u32 cid)
+{
+	struct iscsi_cls_conn *cls_conn;
+	struct iscsi_conn *conn;
+	struct iscsi_tcp_conn *tcp_conn;
+	struct cxgbi_conn *cconn;
+
+	cls_conn = iscsi_tcp_conn_setup(cls_session, sizeof(*cconn), cid);
+	if (!cls_conn)
+		return NULL;
+
+	conn = cls_conn->dd_data;
+	tcp_conn = conn->dd_data;
+	cconn = tcp_conn->dd_data;
+
+	cconn->iconn = conn;
+	return cls_conn;
+}
+EXPORT_SYMBOL_GPL(cxgbi_create_conn);
+
+int cxgbi_bind_conn(struct iscsi_cls_session *cls_session,
+				struct iscsi_cls_conn *cls_conn,
+				u64 transport_eph, int is_leading)
+{
+	struct iscsi_conn *conn = cls_conn->dd_data;
+	struct iscsi_tcp_conn *tcp_conn = conn->dd_data;
+	struct cxgbi_conn *cconn = tcp_conn->dd_data;
+	struct iscsi_endpoint *ep;
+	struct cxgbi_endpoint *cep;
+	struct cxgbi_sock *csk;
+	int err;
+
+	ep = iscsi_lookup_endpoint(transport_eph);
+	if (!ep)
+		return -EINVAL;
+
+	/*  setup ddp pagesize */
+	cep = ep->dd_data;
+	csk = cep->csk;
+	err = csk->cdev->ddp_setup_conn_host_pgsz(csk, csk->hwtid, 0);
+	if (err < 0)
+		return err;
+
+	err = iscsi_conn_bind(cls_session, cls_conn, is_leading);
+	if (err)
+		return -EINVAL;
+
+	/*  calculate the tag idx bits needed for this conn based on cmds_max */
+	cconn->task_idx_bits = (__ilog2_u32(conn->session->cmds_max - 1)) + 1;
+
+	read_lock(&csk->callback_lock);
+	csk->user_data = conn;
+	cconn->chba = cep->chba;
+	cconn->cep = cep;
+	cep->cconn = cconn;
+	read_unlock(&csk->callback_lock);
+
+	cxgbi_conn_max_xmit_dlength(conn);
+	cxgbi_conn_max_recv_dlength(conn);
+
+	spin_lock_bh(&conn->session->lock);
+	sprintf(conn->portal_address, "%pI4", &csk->daddr.sin_addr.s_addr);
+	conn->portal_port = ntohs(csk->daddr.sin_port);
+	spin_unlock_bh(&conn->session->lock);
+
+	/*  init recv engine */
+	iscsi_tcp_hdr_recv_prep(tcp_conn);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxgbi_bind_conn);
+
+struct iscsi_cls_session *
+cxgbi_create_session(struct iscsi_endpoint *ep, u16 cmds_max, u16 qdepth,
+							u32 initial_cmdsn)
+{
+	struct cxgbi_endpoint *cep;
+	struct cxgbi_hba *chba;
+	struct Scsi_Host *shost;
+	struct iscsi_cls_session *cls_session;
+	struct iscsi_session *session;
+
+	if (!ep) {
+		cxgbi_log_error("missing endpoint\n");
+		return NULL;
+	}
+
+	cep = ep->dd_data;
+	chba = cep->chba;
+	shost = chba->shost;
+
+	BUG_ON(chba != iscsi_host_priv(shost));
+
+	cls_session = iscsi_session_setup(chba->cdev->itp, shost,
+					cmds_max, 0,
+					sizeof(struct iscsi_tcp_task) +
+					sizeof(struct cxgbi_task_data),
+					initial_cmdsn, ISCSI_MAX_TARGET);
+	if (!cls_session)
+		return NULL;
+
+	session = cls_session->dd_data;
+	if (iscsi_tcp_r2tpool_alloc(session))
+		goto remove_session;
+
+	return cls_session;
+
+remove_session:
+	iscsi_session_teardown(cls_session);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(cxgbi_create_session);
+
+void cxgbi_destroy_session(struct iscsi_cls_session *cls_session)
+{
+	iscsi_tcp_r2tpool_free(cls_session->dd_data);
+	iscsi_session_teardown(cls_session);
+}
+EXPORT_SYMBOL_GPL(cxgbi_destroy_session);
+
+int cxgbi_set_host_param(struct Scsi_Host *shost,
+			enum iscsi_host_param param, char *buff, int buflen)
+{
+	struct cxgbi_hba *chba = iscsi_host_priv(shost);
+
+	if (!chba->ndev) {
+		shost_printk(KERN_ERR, shost, "Could not set host param. "
+				"Netdev for host not set\n");
+		return -ENODEV;
+	}
+
+	cxgbi_api_debug("param %d, buff %s\n", param, buff);
+
+	switch (param) {
+	case ISCSI_HOST_PARAM_IPADDRESS:
+	{
+		__be32 addr = in_aton(buff);
+		cxgbi_set_iscsi_ipv4(chba, addr);
+		return 0;
+	}
+	case ISCSI_HOST_PARAM_HWADDRESS:
+	case ISCSI_HOST_PARAM_NETDEV_NAME:
+		return 0;
+	default:
+		return iscsi_host_set_param(shost, param, buff, buflen);
+	}
+}
+EXPORT_SYMBOL_GPL(cxgbi_set_host_param);
+
+int cxgbi_get_host_param(struct Scsi_Host *shost,
+			enum iscsi_host_param param, char *buff)
+{
+	struct cxgbi_hba *chba = iscsi_host_priv(shost);
+	int len = 0;
+
+	if (!chba->ndev) {
+		shost_printk(KERN_ERR, shost, "Could not set host param. "
+				"Netdev for host not set\n");
+		return -ENODEV;
+	}
+
+	cxgbi_api_debug("hba %s, param %d\n", chba->ndev->name, param);
+
+	switch (param) {
+	case ISCSI_HOST_PARAM_HWADDRESS:
+		len = sysfs_format_mac(buff, chba->ndev->dev_addr, 6);
+		break;
+	case ISCSI_HOST_PARAM_NETDEV_NAME:
+		len = sprintf(buff, "%s\n", chba->ndev->name);
+		break;
+	case ISCSI_HOST_PARAM_IPADDRESS:
+	{
+		__be32 addr;
+
+		addr = cxgbi_get_iscsi_ipv4(chba);
+		len = sprintf(buff, "%pI4", &addr);
+		break;
+	}
+	default:
+		return iscsi_host_get_param(shost, param, buff);
+	}
+
+	return len;
+}
+EXPORT_SYMBOL_GPL(cxgbi_get_host_param);
+
+struct iscsi_endpoint *cxgbi_ep_connect(struct Scsi_Host *shost,
+					struct sockaddr *dst_addr,
+					int non_blocking)
+{
+	struct iscsi_endpoint *iep;
+	struct cxgbi_endpoint *cep;
+	struct cxgbi_hba *hba = NULL;
+	struct cxgbi_sock *csk = NULL;
+	struct cxgbi_device *cdev;
+	int err = 0;
+
+	if (shost)
+		hba = iscsi_host_priv(shost);
+
+	cdev = cxgbi_find_cdev(hba ? hba->ndev : NULL,
+			((struct sockaddr_in *)dst_addr)->sin_addr.s_addr);
+	if (!cdev) {
+		cxgbi_log_info("ep connect no cdev\n");
+		err = -ENOSPC;
+		goto release_conn;
+	}
+
+	csk = cxgbi_sock_create(cdev);
+	if (!csk) {
+		cxgbi_log_info("ep connect OOM\n");
+		err = -ENOMEM;
+		goto release_conn;
+	}
+	err = cxgbi_sock_connect(hba ? hba->ndev : NULL, csk,
+				(struct sockaddr_in *)dst_addr);
+	if (err < 0) {
+		cxgbi_log_info("ep connect failed\n");
+		goto release_conn;
+	}
+
+	hba = cxgbi_hba_find_by_netdev(csk->dst->dev, cdev);
+	if (!hba) {
+		err = -ENOSPC;
+		cxgbi_log_info("Not going through cxgb4i device\n");
+		goto release_conn;
+	}
+
+	if (shost && hba != iscsi_host_priv(shost)) {
+		err = -ENOSPC;
+		cxgbi_log_info("Could not connect through request host %u\n",
+				shost->host_no);
+		goto release_conn;
+	}
+
+	if (cxgbi_sock_is_closing(csk)) {
+		err = -ENOSPC;
+		cxgbi_log_info("ep connect unable to connect\n");
+		goto release_conn;
+	}
+
+	iep = iscsi_create_endpoint(sizeof(*cep));
+	if (!iep) {
+		err = -ENOMEM;
+		cxgbi_log_info("iscsi alloc ep, OOM\n");
+		goto release_conn;
+	}
+
+	cep = iep->dd_data;
+	cep->csk = csk;
+	cep->chba = hba;
+	cxgbi_api_debug("iep 0x%p, cep 0x%p, csk 0x%p, hba 0x%p\n",
+			iep, cep, csk, hba);
+	return iep;
+release_conn:
+	cxgbi_api_debug("conn 0x%p failed, release\n", csk);
+	if (csk)
+		cxgbi_sock_release(csk);
+
+	return ERR_PTR(err);
+}
+EXPORT_SYMBOL_GPL(cxgbi_ep_connect);
+
+int cxgbi_ep_poll(struct iscsi_endpoint *ep, int timeout_ms)
+{
+	struct cxgbi_endpoint *cep = ep->dd_data;
+	struct cxgbi_sock *csk = cep->csk;
+
+	if (!cxgbi_sock_is_established(csk))
+		return 0;
+
+	return 1;
+}
+EXPORT_SYMBOL_GPL(cxgbi_ep_poll);
+
+void cxgbi_ep_disconnect(struct iscsi_endpoint *ep)
+{
+	struct cxgbi_endpoint *cep = ep->dd_data;
+	struct cxgbi_conn *cconn = cep->cconn;
+
+	if (cconn && cconn->iconn) {
+		iscsi_suspend_tx(cconn->iconn);
+
+		write_lock_bh(&cep->csk->callback_lock);
+		cep->csk->user_data = NULL;
+		cconn->cep = NULL;
+		write_unlock_bh(&cep->csk->callback_lock);
+	}
+
+	cxgbi_sock_release(cep->csk);
+	iscsi_destroy_endpoint(ep);
+}
+EXPORT_SYMBOL_GPL(cxgbi_ep_disconnect);
+
+struct cxgbi_hba *cxgbi_hba_add(struct cxgbi_device *cdev,
+				unsigned int max_lun,
+				unsigned int max_id,
+				struct scsi_transport_template *stt,
+				struct scsi_host_template *sht,
+				struct net_device *dev)
+{
+	struct cxgbi_hba *chba;
+	struct Scsi_Host *shost;
+	int err;
+
+	shost = iscsi_host_alloc(sht, sizeof(*chba), 1);
+
+	if (!shost) {
+		cxgbi_log_info("cdev 0x%p, ndev 0x%p, host alloc failed\n",
+				cdev, dev);
+		return NULL;
+	}
+
+	shost->transportt = stt;
+	shost->max_lun = max_lun;
+	shost->max_id = max_id;
+	shost->max_channel = 0;
+	shost->max_cmd_len = 16;
+	chba = iscsi_host_priv(shost);
+	cxgbi_log_debug("cdev %p\n", cdev);
+	chba->cdev = cdev;
+	chba->ndev = dev;
+	chba->shost = shost;
+	pci_dev_get(cdev->pdev);
+	err = iscsi_host_add(shost, &cdev->pdev->dev);
+	if (err) {
+		cxgbi_log_info("cdev 0x%p, dev 0x%p, host add failed\n",
+				cdev, dev);
+		goto pci_dev_put;
+	}
+
+	return chba;
+pci_dev_put:
+	pci_dev_put(cdev->pdev);
+	scsi_host_put(shost);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(cxgbi_hba_add);
+
+void cxgbi_hba_remove(struct cxgbi_hba *chba)
+{
+	iscsi_host_remove(chba->shost);
+	pci_dev_put(chba->cdev->pdev);
+	iscsi_host_free(chba->shost);
+}
+EXPORT_SYMBOL_GPL(cxgbi_hba_remove);
+
diff --git a/drivers/scsi/cxgbi/libcxgbi.h b/drivers/scsi/cxgbi/libcxgbi.h
new file mode 100644
index 0000000..880eb9e
--- /dev/null
+++ b/drivers/scsi/cxgbi/libcxgbi.h
@@ -0,0 +1,577 @@
+/*
+ * libcxgbi.h: Chelsio common library for T3/T4 iSCSI driver.
+ *
+ * Copyright (c) 2010 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ *
+ * Written by: Karen Xie (kxie-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ * Written by: Rakesh Ranjan (rranjan-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ */
+
+#ifndef	__LIBCXGBI_H__
+#define	__LIBCXGBI_H__
+
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/netdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/scatterlist.h>
+#include <linux/skbuff.h>
+#include <scsi/libiscsi_tcp.h>
+
+
+#define	cxgbi_log_error(fmt...)	printk(KERN_ERR "cxgbi: ERR! " fmt)
+#define cxgbi_log_warn(fmt...)	printk(KERN_WARNING "cxgbi: WARN! " fmt)
+#define cxgbi_log_info(fmt...)	printk(KERN_INFO "cxgbi: " fmt)
+#define cxgbi_debug_log(fmt, args...) \
+	printk(KERN_INFO "cxgbi: %s - " fmt, __func__ , ## args)
+
+
+#ifdef	__DEBUG_CXGBI__
+#define	cxgbi_log_debug	cxgbi_debug_log
+#else
+#define cxgbi_log_debug(fmt...)
+#endif
+
+#ifdef __DEBUG_CXGBI_TAG__
+#define cxgbi_tag_debug        cxgbi_log_debug
+#else
+#define cxgbi_tag_debug(fmt...)
+#endif
+
+#ifdef __DEBUG_CXGBI_API__
+#define cxgbi_api_debug        cxgbi_log_debug
+#else
+#define cxgbi_api_debug(fmt...)
+#endif
+
+#ifdef __DEBUG_CXGBI_CONN__
+#define cxgbi_conn_debug         cxgbi_log_debug
+#else
+#define cxgbi_conn_debug(fmt...)
+#endif
+
+#ifdef __DEBUG_CXGBI_TX__
+#define cxgbi_tx_debug           cxgbi_log_debug
+#else
+#define cxgbi_tx_debug(fmt...)
+#endif
+
+#ifdef __DEBUG_CXGBI_RX__
+#define cxgbi_rx_debug           cxgbi_log_debug
+#else
+#define cxgbi_rx_debug(fmt...)
+#endif
+
+/* always allocate rooms for AHS */
+#define SKB_TX_PDU_HEADER_LEN	\
+	(sizeof(struct iscsi_hdr) + ISCSI_MAX_AHS_SIZE)
+
+#define	ISCSI_PDU_NONPAYLOAD_LEN	312 /* bhs(48) + ahs(256) + digest(8)*/
+#define ULP2_MAX_PKT_SIZE		16224
+#define ULP2_MAX_PDU_PAYLOAD	\
+	(ULP2_MAX_PKT_SIZE - ISCSI_PDU_NONPAYLOAD_LEN)
+
+#define PPOD_PAGES_MAX			4
+#define PPOD_PAGES_SHIFT		2       /*  4 pages per pod */
+
+/*
+ * align pdu size to multiple of 512 for better performance
+ */
+#define cxgbi_align_pdu_size(n) do { n = (n) & (~511); } while (0)
+
+/*
+ * struct pagepod_hdr, pagepod - pagepod format
+ */
+struct pagepod_hdr {
+	unsigned int vld_tid;
+	unsigned int pgsz_tag_clr;
+	unsigned int max_offset;
+	unsigned int page_offset;
+	unsigned long long rsvd;
+};
+
+struct pagepod {
+	struct pagepod_hdr hdr;
+	unsigned long long addr[PPOD_PAGES_MAX + 1];
+};
+
+struct cxgbi_tag_format {
+	unsigned char sw_bits;
+	unsigned char rsvd_bits;
+	unsigned char rsvd_shift;
+	unsigned char filler[1];
+	unsigned int rsvd_mask;
+};
+
+struct cxgbi_gather_list {
+	unsigned int tag;
+	unsigned int length;
+	unsigned int offset;
+	unsigned int nelem;
+	struct page **pages;
+	dma_addr_t phys_addr[0];
+};
+
+/*
+ * sge_opaque_hdr -
+ * Opaque version of structure the SGE stores at skb->head of TX_DATA packets
+ * and for which we must reserve space.
+ */
+struct sge_opaque_hdr {
+	void *dev;
+	dma_addr_t addr[MAX_SKB_FRAGS + 1];
+};
+
+struct cxgbi_sock {
+	struct net_device *egdev;
+	struct cxgbi_device *cdev;
+
+	unsigned long flags;
+	unsigned short rss_qid;
+	unsigned short txq_idx;
+
+	unsigned int hwtid;
+	unsigned int atid;
+
+	unsigned int tx_chan;
+	unsigned int rx_chan;
+	unsigned int mss_idx;
+	unsigned int smac_idx;
+	unsigned char port_id;
+
+	unsigned char hcrc_len;
+	unsigned char dcrc_len;
+
+	void *l2t;
+
+	int wr_max_cred;
+	int wr_cred;
+	int wr_una_cred;
+
+	struct sk_buff *wr_pending_head;
+	struct sk_buff *wr_pending_tail;
+	struct sk_buff *cpl_close;
+	struct sk_buff *cpl_abort_req;
+	struct sk_buff *cpl_abort_rpl;
+	struct sk_buff *skb_ulp_lhdr;
+	spinlock_t lock;
+	struct kref refcnt;
+	unsigned int state;
+	struct sockaddr_in saddr;
+	struct sockaddr_in daddr;
+	struct dst_entry *dst;
+	struct sk_buff_head receive_queue;
+	struct sk_buff_head write_queue;
+	struct timer_list retry_timer;
+	int err;
+	rwlock_t callback_lock;
+	void *user_data;
+
+	u32 rcv_nxt;
+	u32 copied_seq;
+	u32 rcv_wup;
+	u32 snd_nxt;
+	u32 snd_una;
+	u32 write_seq;
+};
+
+enum cxgbi_sock_states{
+	CTP_CONNECTING = 1,
+	CTP_ESTABLISHED,
+	CTP_ACTIVE_CLOSE,
+	CTP_PASSIVE_CLOSE,
+	CTP_CLOSE_WAIT_1,
+	CTP_CLOSE_WAIT_2,
+	CTP_ABORTING,
+	CTP_CLOSED,
+};
+
+enum cxgbi_sock_flags {
+	CTPF_ABORT_RPL_RCVD = 1,/*received one ABORT_RPL_RSS message */
+	CTPF_ABORT_REQ_RCVD,	/*received one ABORT_REQ_RSS message */
+	CTPF_ABORT_RPL_PENDING,	/* expecting an abort reply */
+	CTPF_TX_DATA_SENT,	/* already sent a TX_DATA WR */
+	CTPF_ACTIVE_CLOSE_NEEDED,	/* need to be closed */
+	CTPF_MSG_COALESCED,
+	CTPF_OFFLOAD_DOWN,		/* offload function off */
+};
+
+enum cxgbi_skcb_flags {
+	CTP_SKCBF_NEED_HDR = 1 << 0,	/* packet needs a header */
+	CTP_SKCBF_NO_APPEND = 1 << 1,	/* don't grow this skb */
+	CTP_SKCBF_COMPL = 1 << 2,	/* request WR completion */
+	CTP_SKCBF_HDR_RCVD = 1 << 3,	/* recieved header pdu */
+	CTP_SKCBF_DATA_RCVD = 1 << 4,	/*  recieved data pdu */
+	CTP_SKCBF_STATUS_RCVD = 1 << 5,	/* recieved ddp status */
+};
+
+static inline void cxgbi_sock_set_flag(struct cxgbi_sock *csk,
+					enum cxgbi_sock_flags flag)
+{
+	__set_bit(flag, &csk->flags);
+	cxgbi_conn_debug("csk 0x%p, set %d, state %u, flags 0x%lu\n",
+			csk, flag, csk->state, csk->flags);
+}
+
+static inline void cxgbi_sock_clear_flag(struct cxgbi_sock *csk,
+					enum cxgbi_sock_flags flag)
+{
+	__clear_bit(flag, &csk->flags);
+	cxgbi_conn_debug("csk 0x%p, clear %d, state %u, flags 0x%lu\n",
+			csk, flag, csk->state, csk->flags);
+}
+
+static inline int cxgbi_sock_flag(struct cxgbi_sock *csk,
+				enum cxgbi_sock_flags flag)
+{
+	if (csk == NULL)
+		return 0;
+
+	return test_bit(flag, &csk->flags);
+}
+
+static inline void cxgbi_sock_set_state(struct cxgbi_sock *csk, int state)
+{
+	csk->state = state;
+}
+
+static inline void cxgbi_sock_hold(struct cxgbi_sock *csk)
+{
+	kref_get(&csk->refcnt);
+}
+
+static inline void cxgbi_clean_sock(struct kref *kref)
+{
+	struct cxgbi_sock *csk = container_of(kref,
+						struct cxgbi_sock,
+						refcnt);
+	if (csk) {
+		cxgbi_log_debug("free csk 0x%p, state %u, flags 0x%lx\n",
+						csk, csk->state, csk->flags);
+		kfree(csk);
+	}
+}
+
+static inline void cxgbi_sock_put(struct cxgbi_sock *csk)
+{
+	if (csk)
+		kref_put(&csk->refcnt, cxgbi_clean_sock);
+}
+
+static inline unsigned int cxgbi_sock_is_closing(const struct cxgbi_sock *csk)
+{
+	return csk->state >= CTP_ACTIVE_CLOSE;
+}
+
+static inline unsigned int cxgbi_sock_is_established(
+						const struct cxgbi_sock *csk)
+{
+	return csk->state == CTP_ESTABLISHED;
+}
+
+static inline void cxgbi_sock_purge_write_queue(struct cxgbi_sock *csk)
+{
+	struct sk_buff *skb;
+
+	while ((skb = __skb_dequeue(&csk->write_queue)))
+		__kfree_skb(skb);
+}
+
+static inline int cxgbi_sock_compute_wscale(int win)
+{
+	int wscale = 0;
+	while (wscale < 14 && (65535 << wscale) < win)
+		wscale++;
+	return wscale;
+}
+
+
+struct cxgbi_sock *cxgbi_sock_create(struct cxgbi_device *);
+int cxgbi_sock_connect(struct net_device *, struct cxgbi_sock *,
+			struct sockaddr_in *);
+int cxgbi_sock_get_port(struct cxgbi_sock *);
+void cxgbi_sock_put_port(struct cxgbi_sock *);
+void cxgbi_sock_conn_closing(struct cxgbi_sock *);
+void cxgbi_sock_closed(struct cxgbi_sock *);
+void cxgbi_sock_release(struct cxgbi_sock *);
+unsigned int cxgbi_sock_find_best_mtu(struct cxgbi_sock *, unsigned short);
+unsigned int cxgbi_sock_select_mss(struct cxgbi_sock *, unsigned int);
+
+struct cxgbi_hba {
+	struct net_device *ndev;
+	struct Scsi_Host *shost;
+	struct cxgbi_device *cdev;
+	__be32 ipv4addr;
+};
+
+struct cxgbi_ports_map {
+	unsigned int max_connect;
+	unsigned short sport_base;
+	spinlock_t lock;
+	unsigned int next;
+	struct cxgbi_sock *port_csk[0];
+};
+
+struct cxgbi_device {
+	struct list_head list_head;
+	char *name;
+	struct net_device **ports;
+	struct cxgbi_hba **hbas;
+	const unsigned short *mtus;
+	unsigned char nmtus;
+	unsigned char nports;
+	struct pci_dev *pdev;
+
+	unsigned int skb_tx_headroom;
+	unsigned int skb_extra_headroom;
+	unsigned int tx_max_size;
+	unsigned int rx_max_size;
+	struct page *pad_page;
+	struct cxgbi_ports_map *pmap;
+	struct iscsi_transport *itp;
+	struct cxgbi_tag_format tag_format;
+
+	int (*ddp_tag_reserve)(struct cxgbi_sock *, unsigned int,
+				struct cxgbi_tag_format *, u32 *,
+				struct cxgbi_gather_list *, gfp_t);
+	void (*ddp_tag_release)(struct cxgbi_sock *, u32);
+	struct cxgbi_gather_list* (*ddp_make_gl)(unsigned int,
+						struct scatterlist *,
+						unsigned int,
+						struct pci_dev *,
+						gfp_t);
+	void (*ddp_release_gl)(struct cxgbi_gather_list *, struct pci_dev *);
+	int (*ddp_setup_conn_digest)(struct cxgbi_sock *,
+					unsigned int, int, int, int);
+	int (*ddp_setup_conn_host_pgsz)(struct cxgbi_sock *,
+					unsigned int, int);
+	__u16 (*get_skb_ulp_mode)(struct sk_buff *);
+	__u16 (*get_skb_flags)(struct sk_buff *);
+	__u32 (*get_skb_tcp_seq)(struct sk_buff *);
+	__u32 (*get_skb_rx_pdulen)(struct sk_buff *);
+	void (*set_skb_txmode)(struct sk_buff *, int, int);
+
+	void (*release_offload_resources)(struct cxgbi_sock *);
+	int (*sock_send_pdus)(struct cxgbi_sock *, struct sk_buff *);
+	void (*sock_rx_credits)(struct cxgbi_sock *, int);
+	void (*send_abort_req)(struct cxgbi_sock *);
+	void (*send_close_req)(struct cxgbi_sock *);
+	int (*alloc_cpl_skbs)(struct cxgbi_sock *);
+	int (*init_act_open)(struct cxgbi_sock *, struct net_device *);
+
+	unsigned long dd_data[0];
+};
+
+static inline void *cxgbi_cdev_priv(struct cxgbi_device *cdev)
+{
+	return (void *)cdev->dd_data;
+}
+
+struct cxgbi_device *cxgbi_device_register(unsigned int, unsigned int);
+void cxgbi_device_unregister(struct cxgbi_device *);
+
+struct cxgbi_conn {
+	struct list_head list_head;
+	struct cxgbi_endpoint *cep;
+	struct iscsi_conn *iconn;
+	struct cxgbi_hba *chba;
+	u32 task_idx_bits;
+};
+
+struct cxgbi_endpoint {
+	struct cxgbi_conn *cconn;
+	struct cxgbi_hba *chba;
+	struct cxgbi_sock *csk;
+};
+
+#define MAX_PDU_FRAGS	((ULP2_MAX_PDU_PAYLOAD + 512 - 1) / 512)
+struct cxgbi_task_data {
+	unsigned short nr_frags;
+	skb_frag_t frags[MAX_PDU_FRAGS];
+	struct sk_buff *skb;
+	unsigned int offset;
+	unsigned int count;
+	unsigned int sgoffset;
+};
+
+static inline int cxgbi_is_ddp_tag(struct cxgbi_tag_format *tformat, u32 tag)
+{
+	return !(tag & (1 << (tformat->rsvd_bits + tformat->rsvd_shift - 1)));
+}
+
+static inline int cxgbi_sw_tag_usable(struct cxgbi_tag_format *tformat,
+					u32 sw_tag)
+{
+	sw_tag >>= (32 - tformat->rsvd_bits);
+	return !sw_tag;
+}
+
+static inline u32 cxgbi_set_non_ddp_tag(struct cxgbi_tag_format *tformat,
+					u32 sw_tag)
+{
+	unsigned char shift = tformat->rsvd_bits + tformat->rsvd_shift - 1;
+
+	u32 mask = (1 << shift) - 1;
+
+	if (sw_tag && (sw_tag & ~mask)) {
+		u32 v1 = sw_tag & ((1 << shift) - 1);
+		u32 v2 = (sw_tag >> (shift - 1)) << shift;
+
+		return v2 | v1 | 1 << shift;
+	}
+
+	return sw_tag | 1 << shift;
+}
+
+static inline u32 cxgbi_ddp_tag_base(struct cxgbi_tag_format *tformat,
+					u32 sw_tag)
+{
+	u32 mask = (1 << tformat->rsvd_shift) - 1;
+
+	if (sw_tag && (sw_tag & ~mask)) {
+		u32 v1 = sw_tag & mask;
+		u32 v2 = sw_tag >> tformat->rsvd_shift;
+
+		v2 <<= tformat->rsvd_bits + tformat->rsvd_shift;
+
+		return v2 | v1;
+	}
+
+	return sw_tag;
+}
+
+static inline u32 cxgbi_tag_rsvd_bits(struct cxgbi_tag_format *tformat,
+					u32 tag)
+{
+	if (cxgbi_is_ddp_tag(tformat, tag))
+		return (tag >> tformat->rsvd_shift) & tformat->rsvd_mask;
+
+	return 0;
+}
+
+static inline u32 cxgbi_tag_nonrsvd_bits(struct cxgbi_tag_format *tformat,
+					u32 tag)
+{
+	unsigned char shift = tformat->rsvd_bits + tformat->rsvd_shift - 1;
+	u32 v1, v2;
+
+	if (cxgbi_is_ddp_tag(tformat, tag)) {
+		v1 = tag & ((1 << tformat->rsvd_shift) - 1);
+		v2 = (tag >> (shift + 1)) << tformat->rsvd_shift;
+	} else {
+		u32 mask = (1 << shift) - 1;
+		tag &= ~(1 << shift);
+		v1 = tag & mask;
+		v2 = (tag >> 1) & ~mask;
+	}
+	return v1 | v2;
+}
+
+static inline void *cxgbi_alloc_big_mem(unsigned int size,
+					gfp_t gfp)
+{
+	void *p = kmalloc(size, gfp);
+	if (!p)
+		p = vmalloc(size);
+	if (p)
+		memset(p, 0, size);
+	return p;
+}
+
+static inline void cxgbi_free_big_mem(void *addr)
+{
+	if (is_vmalloc_addr(addr))
+		vfree(addr);
+	else
+		kfree(addr);
+}
+
+#define RX_DDP_STATUS_IPP_SHIFT		27      /* invalid pagepod */
+#define RX_DDP_STATUS_TID_SHIFT		26      /* tid mismatch */
+#define RX_DDP_STATUS_COLOR_SHIFT	25      /* color mismatch */
+#define RX_DDP_STATUS_OFFSET_SHIFT	24      /* offset mismatch */
+#define RX_DDP_STATUS_ULIMIT_SHIFT	23      /* ulimit error */
+#define RX_DDP_STATUS_TAG_SHIFT		22      /* tag mismatch */
+#define RX_DDP_STATUS_DCRC_SHIFT	21      /* dcrc error */
+#define RX_DDP_STATUS_HCRC_SHIFT	20      /* hcrc error */
+#define RX_DDP_STATUS_PAD_SHIFT		19      /* pad error */
+#define RX_DDP_STATUS_PPP_SHIFT		18      /* pagepod parity error */
+#define RX_DDP_STATUS_LLIMIT_SHIFT	17      /* llimit error */
+#define RX_DDP_STATUS_DDP_SHIFT		16      /* ddp'able */
+#define RX_DDP_STATUS_PMM_SHIFT		15      /* pagepod mismatch */
+
+
+#define ULP2_FLAG_DATA_READY		0x1
+#define ULP2_FLAG_DATA_DDPED		0x2
+#define ULP2_FLAG_HCRC_ERROR		0x4
+#define ULP2_FLAG_DCRC_ERROR		0x8
+#define ULP2_FLAG_PAD_ERROR		0x10
+
+struct cxgbi_hba *cxgbi_hba_find_by_netdev(struct net_device *dev,
+					struct cxgbi_device *cdev);
+struct cxgbi_hba *cxgbi_hba_add(struct cxgbi_device *,
+				unsigned int, unsigned int,
+				struct scsi_transport_template *,
+				struct scsi_host_template *,
+				struct net_device *);
+void cxgbi_hba_remove(struct cxgbi_hba *);
+
+
+int cxgbi_reserve_itt(struct iscsi_task *task, itt_t *hdr_itt);
+void cxgbi_release_itt(struct iscsi_task *task, itt_t hdr_itt);
+void cxgbi_parse_pdu_itt(struct iscsi_conn *conn, itt_t itt,
+				int *idx, int *age);
+void cxgbi_cleanup_task(struct iscsi_task *task);
+
+void cxgbi_conn_pdu_ready(struct cxgbi_sock *);
+void cxgbi_conn_tx_open(struct cxgbi_sock *);
+int cxgbi_conn_init_pdu(struct iscsi_task *, unsigned int , unsigned int);
+int cxgbi_conn_alloc_pdu(struct iscsi_task *, u8);
+int cxgbi_conn_xmit_pdu(struct iscsi_task *);
+void cxgbi_get_conn_stats(struct iscsi_cls_conn *, struct iscsi_stats *);
+int cxgbi_conn_max_xmit_dlength(struct iscsi_conn *);
+int cxgbi_conn_max_recv_dlength(struct iscsi_conn *);
+int cxgbi_set_conn_param(struct iscsi_cls_conn *,
+			enum iscsi_param, char *, int);
+int cxgbi_get_conn_param(struct iscsi_cls_conn *, enum iscsi_param, char *);
+struct iscsi_cls_conn *cxgbi_create_conn(struct iscsi_cls_session *, u32);
+int cxgbi_bind_conn(struct iscsi_cls_session *,
+			struct iscsi_cls_conn *, u64, int);
+void cxgbi_destroy_session(struct iscsi_cls_session *);
+struct iscsi_cls_session *cxgbi_create_session(struct iscsi_endpoint *,
+						u16, u16, u32);
+int cxgbi_set_host_param(struct Scsi_Host *,
+				enum iscsi_host_param, char *, int);
+int cxgbi_get_host_param(struct Scsi_Host *, enum iscsi_host_param, char *);
+
+
+int cxgbi_pdu_init(struct cxgbi_device *);
+void cxgbi_pdu_cleanup(struct cxgbi_device *);
+
+struct iscsi_endpoint *cxgbi_ep_connect(struct Scsi_Host *,
+					struct sockaddr *, int);
+int cxgbi_ep_poll(struct iscsi_endpoint *, int);
+void cxgbi_ep_disconnect(struct iscsi_endpoint *);
+
+static inline void cxgbi_set_iscsi_ipv4(struct cxgbi_hba *chba, __be32 ipaddr)
+{
+	chba->ipv4addr = ipaddr;
+}
+
+static inline __be32 cxgbi_get_iscsi_ipv4(struct cxgbi_hba *chba)
+{
+	return chba->ipv4addr;
+}
+
+struct cxgbi_device *cxgbi_device_alloc(unsigned int dd_size);
+void cxgbi_device_free(struct cxgbi_device *cdev);
+void cxgbi_device_add(struct list_head *list_head);
+void cxgbi_device_remove(struct cxgbi_device *cdev);
+
+#endif	/*__LIBCXGBI_H__*/
+
-- 
1.6.6.1

-- 
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To post to this group, send email to open-iscsi-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to open-iscsi+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.

^ permalink raw reply related

* [PATCH 3/3] cxgb4i_v4: main driver files
From: Rakesh Ranjan @ 2010-06-04 22:58 UTC (permalink / raw)
  To: LK-NetDev, LK-SCSIDev, LK-iSCSIDev
  Cc: LKML, Karen Xie, David Miller, James Bottomley, Mike Christie,
	Anish Bhatt, Rakesh Ranjan
In-Reply-To: <1275692327-21502-3-git-send-email-rakesh-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>

From: Rakesh Ranjan <rakesh-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>


Signed-off-by: Rakesh Ranjan <rakesh-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>
---
 drivers/scsi/cxgbi/cxgb4i.h         |  179 +++++
 drivers/scsi/cxgbi/cxgb4i_ddp.c     |  657 ++++++++++++++++
 drivers/scsi/cxgbi/cxgb4i_init.c    |  308 ++++++++
 drivers/scsi/cxgbi/cxgb4i_offload.c | 1412 +++++++++++++++++++++++++++++++++++
 4 files changed, 2556 insertions(+), 0 deletions(-)
 create mode 100644 drivers/scsi/cxgbi/cxgb4i.h
 create mode 100644 drivers/scsi/cxgbi/cxgb4i_ddp.c
 create mode 100644 drivers/scsi/cxgbi/cxgb4i_init.c
 create mode 100644 drivers/scsi/cxgbi/cxgb4i_offload.c

diff --git a/drivers/scsi/cxgbi/cxgb4i.h b/drivers/scsi/cxgbi/cxgb4i.h
new file mode 100644
index 0000000..658c600
--- /dev/null
+++ b/drivers/scsi/cxgbi/cxgb4i.h
@@ -0,0 +1,179 @@
+/*
+ * cxgb4i.h: Chelsio T4 iSCSI driver.
+ *
+ * Copyright (c) 2010 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ *
+ * Written by: Karen Xie (kxie-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ * Written by: Rakesh Ranjan (rranjan-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ */
+
+#ifndef	__CXGB4I_H__
+#define	__CXGB4I_H__
+
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/netdevice.h>
+#include <linux/if_vlan.h>
+#include <linux/scatterlist.h>
+#include <linux/skbuff.h>
+#include <scsi/libiscsi_tcp.h>
+
+#include "t4fw_api.h"
+#include "t4_msg.h"
+#include "l2t.h"
+#include "cxgb4.h"
+#include "cxgb4_uld.h"
+#include "libcxgbi.h"
+
+#define	CXGB4I_MAX_CONN		16384
+#define	CXGB4I_SCSI_HOST_QDEPTH	1024
+#define	CXGB4I_MAX_TARGET	CXGB4I_MAX_CONN
+#define	CXGB4I_MAX_LUN		0x1000
+
+struct cxgb4i_snic;
+typedef int (*cxgb4i_cplhandler_func)(struct cxgb4i_snic *, struct sk_buff *);
+
+struct cxgb4i_snic {
+	struct list_head list_head;
+	spinlock_t lock;
+	unsigned int flags;
+	struct cxgb4_lld_info lldi;
+	struct cxgb4i_ddp_info *ddp;
+	cxgb4i_cplhandler_func *handlers;
+};
+
+enum {
+	CPL_RET_BUF_DONE = 1,
+	CPL_RET_BAD_MSG = 2,
+	CPL_RET_UNKNOWN_TID = 4
+};
+
+struct cxgb4i_skb_rx_cb {
+	__u32 ddigest;
+	__u32 pdulen;
+};
+
+struct cxgb4i_skb_tx_cb {
+	struct l2t_skb_cb l2t;
+	struct sk_buff *wr_next;
+};
+
+struct cxgb4i_skb_cb {
+	__u16 flags;
+	__u16 ulp_mode;
+	__u32 seq;
+
+	union {
+		struct cxgb4i_skb_rx_cb rx;
+		struct cxgb4i_skb_tx_cb tx;
+	};
+};
+
+#define CXGB4I_SKB_CB(skb)	((struct cxgb4i_skb_cb *)&((skb)->cb[0]))
+#define cxgb4i_skb_flags(skb)	(CXGB4I_SKB_CB(skb)->flags)
+#define cxgb4i_skb_ulp_mode(skb)	(CXGB4I_SKB_CB(skb)->ulp_mode)
+#define cxgb4i_skb_tcp_seq(skb)		(CXGB4I_SKB_CB(skb)->seq)
+#define cxgb4i_skb_rx_ddigest(skb)	(CXGB4I_SKB_CB(skb)->rx.ddigest)
+#define cxgb4i_skb_rx_pdulen(skb)	(CXGB4I_SKB_CB(skb)->rx.pdulen)
+#define cxgb4i_skb_tx_wr_next(skb)	(CXGB4I_SKB_CB(skb)->tx.wr_next)
+
+/* for TX: a skb must have a headroom of at least TX_HEADER_LEN bytes */
+#define CXGB4I_TX_HEADER_LEN \
+	(sizeof(struct fw_ofld_tx_data_wr) + sizeof(struct sge_opaque_hdr))
+
+int cxgb4i_ofld_init(struct cxgbi_device *);
+void cxgb4i_ofld_cleanup(struct cxgbi_device *);
+
+struct cxgb4i_ddp_info {
+	struct list_head list;
+	struct kref refcnt;
+	struct cxgb4i_snic *snic;
+	struct pci_dev *pdev;
+	unsigned int max_txsz;
+	unsigned int max_rxsz;
+	unsigned int llimit;
+	unsigned int ulimit;
+	unsigned int nppods;
+	unsigned int idx_last;
+	unsigned char idx_bits;
+	unsigned char filler[3];
+	unsigned int idx_mask;
+	unsigned int rsvd_tag_mask;
+	spinlock_t map_lock;
+	struct cxgbi_gather_list **gl_map;
+};
+
+struct cpl_rx_data_ddp {
+	union opcode_tid ot;
+	__be16 urg;
+	__be16 len;
+	__be32 seq;
+	union {
+		__be32 nxt_seq;
+		__be32 ddp_report;
+	};
+	__be32 ulp_crc;
+	__be32 ddpvld;
+};
+
+#define PPOD_SIZE               sizeof(struct pagepod)  /*  64 */
+#define PPOD_SIZE_SHIFT         6
+
+#define ULPMEM_DSGL_MAX_NPPODS	16	/*  1024/PPOD_SIZE */
+#define ULPMEM_IDATA_MAX_NPPODS	4	/*  256/PPOD_SIZE */
+#define PCIE_MEMWIN_MAX_NPPODS	16	/*  1024/PPOD_SIZE */
+
+#define PPOD_COLOR_SHIFT	0
+#define PPOD_COLOR_MASK		0x3F
+#define PPOD_COLOR_SIZE         6
+#define PPOD_COLOR(x)		((x) << PPOD_COLOR_SHIFT)
+
+#define PPOD_TAG_SHIFT	6
+#define PPOD_TAG_MASK	0xFFFFFF
+#define PPOD_TAG(x)	((x) << PPOD_TAG_SHIFT)
+
+#define PPOD_PGSZ_SHIFT	30
+#define PPOD_PGSZ_MASK	0x3
+#define PPOD_PGSZ(x)	((x) << PPOD_PGSZ_SHIFT)
+
+#define PPOD_TID_SHIFT	32
+#define PPOD_TID_MASK	0xFFFFFF
+#define PPOD_TID(x)	((__u64)(x) << PPOD_TID_SHIFT)
+
+#define PPOD_VALID_SHIFT	56
+#define PPOD_VALID(x)	((__u64)(x) << PPOD_VALID_SHIFT)
+#define PPOD_VALID_FLAG	PPOD_VALID(1ULL)
+
+#define PPOD_LEN_SHIFT	32
+#define PPOD_LEN_MASK	0xFFFFFFFF
+#define PPOD_LEN(x)	((__u64)(x) << PPOD_LEN_SHIFT)
+
+#define PPOD_OFST_SHIFT	0
+#define PPOD_OFST_MASK	0xFFFFFFFF
+#define PPOD_OFST(x)	((x) << PPOD_OFST_SHIFT)
+
+#define PPOD_IDX_SHIFT          PPOD_COLOR_SIZE
+#define PPOD_IDX_MAX_SIZE       24
+
+#define W_TCB_ULP_TYPE          0
+#define TCB_ULP_TYPE_SHIFT      0
+#define TCB_ULP_TYPE_MASK       0xfULL
+#define TCB_ULP_TYPE(x)         ((x) << TCB_ULP_TYPE_SHIFT)
+
+#define W_TCB_ULP_RAW           0
+#define TCB_ULP_RAW_SHIFT       4
+#define TCB_ULP_RAW_MASK        0xffULL
+#define TCB_ULP_RAW(x)          ((x) << TCB_ULP_RAW_SHIFT)
+
+void cxgb4i_ddp_init(struct cxgbi_device *);
+void cxgb4i_ddp_cleanup(struct cxgbi_device *);
+
+#endif	/* __CXGB4I_H__ */
+
diff --git a/drivers/scsi/cxgbi/cxgb4i_ddp.c b/drivers/scsi/cxgbi/cxgb4i_ddp.c
new file mode 100644
index 0000000..3ef6f7e
--- /dev/null
+++ b/drivers/scsi/cxgbi/cxgb4i_ddp.c
@@ -0,0 +1,657 @@
+/*
+ * cxgb4i_ddp.c: Chelsio T4 iSCSI driver.
+ *
+ * Copyright (c) 2010 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ *
+ * Written by: Karen Xie (kxie-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ * Written by: Rakesh Ranjan (rranjan-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ */
+
+#include <linux/skbuff.h>
+#include <linux/scatterlist.h>
+
+#include "libcxgbi.h"
+#include "cxgb4i.h"
+
+#define DDP_PGIDX_MAX	4
+#define DDP_THRESHOLD	2048
+
+static unsigned char ddp_page_order[DDP_PGIDX_MAX] = {0, 1, 2, 4};
+static unsigned char ddp_page_shift[DDP_PGIDX_MAX] = {12, 13, 14, 16};
+static unsigned char page_idx = DDP_PGIDX_MAX;
+static unsigned char sw_tag_idx_bits;
+static unsigned char sw_tag_age_bits;
+
+static inline void cxgb4i_ddp_ppod_set(struct pagepod *ppod,
+				       struct pagepod_hdr *hdr,
+				       struct cxgbi_gather_list *gl,
+				       unsigned int pidx)
+{
+	int i;
+
+	memcpy(ppod, hdr, sizeof(*hdr));
+	for (i = 0; i < (PPOD_PAGES_MAX + 1); i++, pidx++) {
+		ppod->addr[i] = pidx < gl->nelem ?
+			cpu_to_be64(gl->phys_addr[pidx]) : 0ULL;
+	}
+}
+
+static inline void cxgb4i_ddp_ppod_clear(struct pagepod *ppod)
+{
+	memset(ppod, 0, sizeof(*ppod));
+}
+
+static inline void cxgb4i_ddp_ulp_mem_io_set_hdr(struct ulp_mem_io *req,
+						unsigned int wr_len,
+						unsigned int dlen,
+						unsigned int pm_addr)
+{
+	struct ulptx_sgl *sgl;
+
+	INIT_ULPTX_WR(req, wr_len, 0, 0);
+	req->cmd = htonl(ULPTX_CMD(ULP_TX_MEM_WRITE));
+	req->dlen = htonl(ULP_MEMIO_DATA_LEN(dlen >> 5));
+	req->len16 = htonl(DIV_ROUND_UP(wr_len - sizeof(req->wr), 16));
+	req->lock_addr = htonl(ULP_MEMIO_ADDR(pm_addr >> 5));
+	sgl = (struct ulptx_sgl *)(req + 1);
+	sgl->cmd_nsge = htonl(ULPTX_CMD(ULP_TX_SC_DSGL) | ULPTX_NSGE(1));
+	sgl->len0 = htonl(dlen);
+}
+
+static int cxgb4i_ddp_ppod_write_sgl(struct cxgbi_sock *csk,
+				     struct cxgb4i_ddp_info *ddp,
+				     struct pagepod_hdr *hdr,
+				     unsigned int idx,
+				     unsigned int npods,
+				     struct cxgbi_gather_list *gl,
+				     unsigned int gl_pidx)
+{
+	unsigned int dlen, pm_addr, wr_len;
+	struct sk_buff *skb;
+	struct ulp_mem_io *req;
+	struct ulptx_sgl *sgl;
+	struct pagepod *ppod;
+	unsigned int i;
+
+	dlen = PPOD_SIZE * npods;
+	pm_addr = idx * PPOD_SIZE + ddp->llimit;
+	wr_len = roundup(sizeof(struct ulp_mem_io) +
+			sizeof(struct ulptx_sgl), 16);
+
+	skb = alloc_skb(wr_len + dlen, GFP_ATOMIC);
+	if (!skb) {
+		cxgbi_log_error("snic 0x%p, idx %u, npods %u, OOM\n",
+				ddp->snic, idx, npods);
+		return -ENOMEM;
+	}
+
+	memset(skb->data, 0, wr_len + dlen);
+	set_wr_txq(skb, CPL_PRIORITY_CONTROL, csk->txq_idx);
+	req = (struct ulp_mem_io *)__skb_put(skb, wr_len);
+	cxgb4i_ddp_ulp_mem_io_set_hdr(req, wr_len, dlen, pm_addr);
+	sgl = (struct ulptx_sgl *)(req + 1);
+	ppod = (struct pagepod *)(sgl + 1);
+	sgl->addr0 = cpu_to_be64(virt_to_phys(ppod));
+
+	for (i = 0; i < npods; i++, ppod++, gl_pidx += PPOD_PAGES_MAX) {
+		if (!hdr && !gl)
+			cxgb4i_ddp_ppod_clear(ppod);
+		else
+			cxgb4i_ddp_ppod_set(ppod, hdr, gl, gl_pidx);
+
+	}
+
+	cxgb4_ofld_send(csk->cdev->ports[csk->port_id], skb);
+	return 0;
+}
+
+static int cxgb4i_ddp_set_map(struct cxgbi_sock *csk,
+				struct cxgb4i_ddp_info *ddp,
+				struct pagepod_hdr *hdr,
+				unsigned int idx,
+				unsigned int npods,
+				struct cxgbi_gather_list *gl)
+{
+	unsigned int pidx, w_npods, cnt;
+	int err = 0;
+
+	for (w_npods = 0, pidx = 0; w_npods < npods;
+		idx += cnt, w_npods += cnt, pidx += PPOD_PAGES_MAX) {
+		cnt = npods - w_npods;
+		if (cnt > ULPMEM_DSGL_MAX_NPPODS)
+			cnt = ULPMEM_DSGL_MAX_NPPODS;
+		err = cxgb4i_ddp_ppod_write_sgl(csk, ddp, hdr, idx, cnt, gl,
+						pidx);
+		if (err < 0)
+			break;
+	}
+	return err;
+}
+
+static void cxgb4i_ddp_clear_map(struct cxgbi_sock *csk,
+				struct cxgb4i_ddp_info *ddp,
+				unsigned int tag,
+				unsigned int idx,
+				unsigned int npods)
+{
+	int err;
+	unsigned int w_npods, cnt;
+
+	for (w_npods = 0; w_npods < npods; idx += cnt, w_npods += cnt) {
+		cnt = npods - w_npods;
+
+		if (cnt > ULPMEM_DSGL_MAX_NPPODS)
+			cnt = ULPMEM_DSGL_MAX_NPPODS;
+		err = cxgb4i_ddp_ppod_write_sgl(csk, ddp, NULL, idx, cnt,
+						NULL, 0);
+		if (err < 0)
+			break;
+	}
+}
+
+static inline int cxgb4i_ddp_find_unused_entries(struct cxgb4i_ddp_info *ddp,
+					unsigned int start, unsigned int max,
+					unsigned int count,
+					struct cxgbi_gather_list *gl)
+{
+	unsigned int i, j, k;
+
+	/*  not enough entries */
+	if ((max - start) < count)
+		return -EBUSY;
+
+	max -= count;
+	spin_lock(&ddp->map_lock);
+	for (i = start; i < max;) {
+		for (j = 0, k = i; j < count; j++, k++) {
+			if (ddp->gl_map[k])
+				break;
+		}
+		if (j == count) {
+			for (j = 0, k = i; j < count; j++, k++)
+				ddp->gl_map[k] = gl;
+			spin_unlock(&ddp->map_lock);
+			return i;
+		}
+		i += j + 1;
+	}
+	spin_unlock(&ddp->map_lock);
+	return -EBUSY;
+}
+
+static inline void cxgb4i_ddp_unmark_entries(struct cxgb4i_ddp_info *ddp,
+						int start, int count)
+{
+	spin_lock(&ddp->map_lock);
+	memset(&ddp->gl_map[start], 0,
+		count * sizeof(struct cxgbi_gather_list *));
+	spin_unlock(&ddp->map_lock);
+}
+
+static int cxgb4i_ddp_find_page_index(unsigned long pgsz)
+{
+	int i;
+
+	for (i = 0; i < DDP_PGIDX_MAX; i++) {
+		if (pgsz == (1UL << ddp_page_shift[i]))
+			return i;
+	}
+	cxgbi_log_debug("ddp page size 0x%lx not supported\n", pgsz);
+	return DDP_PGIDX_MAX;
+}
+
+static int cxgb4i_ddp_adjust_page_table(void)
+{
+	int i;
+	unsigned int base_order, order;
+
+	if (PAGE_SIZE < (1UL << ddp_page_shift[0])) {
+		cxgbi_log_info("PAGE_SIZE 0x%lx too small, min 0x%lx\n",
+				PAGE_SIZE, 1UL << ddp_page_shift[0]);
+		return -EINVAL;
+	}
+
+	base_order = get_order(1UL << ddp_page_shift[0]);
+	order = get_order(1UL << PAGE_SHIFT);
+
+	for (i = 0; i < DDP_PGIDX_MAX; i++) {
+		/* first is the kernel page size,
+		 * then just doubling the size */
+		ddp_page_order[i] = order - base_order + i;
+		ddp_page_shift[i] = PAGE_SHIFT + i;
+	}
+	return 0;
+}
+
+static inline void cxgb4i_ddp_gl_unmap(struct pci_dev *pdev,
+					struct cxgbi_gather_list *gl)
+{
+	int i;
+
+	for (i = 0; i < gl->nelem; i++)
+		dma_unmap_page(&pdev->dev, gl->phys_addr[i], PAGE_SIZE,
+				PCI_DMA_FROMDEVICE);
+}
+
+static inline int cxgb4i_ddp_gl_map(struct pci_dev *pdev,
+				    struct cxgbi_gather_list *gl)
+{
+	int i;
+
+	for (i = 0; i < gl->nelem; i++) {
+		gl->phys_addr[i] = dma_map_page(&pdev->dev, gl->pages[i], 0,
+						PAGE_SIZE,
+						PCI_DMA_FROMDEVICE);
+		if (unlikely(dma_mapping_error(&pdev->dev, gl->phys_addr[i])))
+			goto unmap;
+	}
+	return i;
+unmap:
+	if (i) {
+		unsigned int nelem = gl->nelem;
+
+		gl->nelem = i;
+		cxgb4i_ddp_gl_unmap(pdev, gl);
+		gl->nelem = nelem;
+	}
+	return -ENOMEM;
+}
+
+void cxgb4i_ddp_release_gl(struct cxgbi_gather_list *gl, struct pci_dev *pdev)
+{
+	cxgb4i_ddp_gl_unmap(pdev, gl);
+	kfree(gl);
+}
+
+struct cxgbi_gather_list *cxgb4i_ddp_make_gl(unsigned int xferlen,
+					     struct scatterlist *sgl,
+					     unsigned int sgcnt,
+					     struct pci_dev *pdev,
+					     gfp_t gfp)
+{
+	struct cxgbi_gather_list *gl;
+	struct scatterlist *sg = sgl;
+	struct page *sgpage = sg_page(sg);
+	unsigned int sglen = sg->length;
+	unsigned int sgoffset = sg->offset;
+	unsigned int npages = (xferlen + sgoffset + PAGE_SIZE - 1) >>
+				PAGE_SHIFT;
+	int i = 1, j = 0;
+
+	if (xferlen < DDP_THRESHOLD) {
+		cxgbi_log_debug("xfer %u < threshold %u, no ddp.\n",
+				xferlen, DDP_THRESHOLD);
+		return NULL;
+	}
+
+	gl = kzalloc(sizeof(struct cxgbi_gather_list) +
+		     npages * (sizeof(dma_addr_t) +
+		     sizeof(struct page *)), gfp);
+	if (!gl)
+		return NULL;
+
+	gl->pages = (struct page **)&gl->phys_addr[npages];
+	gl->length = xferlen;
+	gl->offset = sgoffset;
+	gl->pages[0] = sgpage;
+	sg = sg_next(sg);
+
+	while (sg) {
+		struct page *page = sg_page(sg);
+
+		if (sgpage == page && sg->offset == sgoffset + sglen)
+			sglen += sg->length;
+		else {
+			/*  make sure the sgl is fit for ddp:
+			 *  each has the same page size, and
+			 *  all of the middle pages are used completely
+			 */
+			if ((j && sgoffset) || ((i != sgcnt - 1) &&
+					 ((sglen + sgoffset) & ~PAGE_MASK)))
+				goto error_out;
+
+			j++;
+			if (j == gl->nelem || sg->offset)
+				goto error_out;
+			gl->pages[j] = page;
+			sglen = sg->length;
+			sgoffset = sg->offset;
+			sgpage = page;
+		}
+		i++;
+		sg = sg_next(sg);
+	}
+	gl->nelem = ++j;
+
+	if (cxgb4i_ddp_gl_map(pdev, gl) < 0)
+		goto error_out;
+	return gl;
+error_out:
+	kfree(gl);
+	return NULL;
+}
+
+static void cxgb4i_ddp_tag_release(struct cxgbi_sock *csk, u32 tag)
+{
+	struct cxgb4i_snic *snic = cxgbi_cdev_priv(csk->cdev);
+	struct cxgb4i_ddp_info *ddp = snic->ddp;
+	u32 idx;
+
+	if (!ddp) {
+		cxgbi_log_error("release ddp tag 0x%x, ddp NULL.\n", tag);
+		return;
+	}
+
+	idx = (tag >> PPOD_IDX_SHIFT) & ddp->idx_mask;
+	if (idx < ddp->nppods) {
+		struct cxgbi_gather_list *gl = ddp->gl_map[idx];
+		unsigned int npods;
+
+		if (!gl || !gl->nelem) {
+			cxgbi_log_error("rel 0x%x, idx 0x%x, gl 0x%p, %u\n",
+					tag, idx, gl, gl ? gl->nelem : 0);
+			return;
+		}
+		npods = (gl->nelem + PPOD_PAGES_MAX - 1) >> PPOD_PAGES_SHIFT;
+		cxgbi_log_debug("ddp tag 0x%x, release idx 0x%x, npods %u.\n",
+				tag, idx, npods);
+		cxgb4i_ddp_clear_map(csk, ddp, tag, idx, npods);
+		cxgb4i_ddp_unmark_entries(ddp, idx, npods);
+		cxgb4i_ddp_release_gl(gl, ddp->pdev);
+	} else
+		cxgbi_log_error("ddp tag 0x%x, idx 0x%x > max 0x%x.\n",
+				tag, idx, ddp->nppods);
+}
+
+static int cxgb4i_ddp_tag_reserve(struct cxgbi_sock *csk,
+				  unsigned int tid,
+				  struct cxgbi_tag_format *tformat,
+				  u32 *tagp,
+				  struct cxgbi_gather_list *gl,
+				  gfp_t gfp)
+{
+	struct cxgb4i_snic *snic = cxgbi_cdev_priv(csk->cdev);
+	struct cxgb4i_ddp_info *ddp = snic->ddp;
+	struct pagepod_hdr hdr;
+	unsigned int npods;
+	int idx = -1;
+	int err = -ENOMEM;
+	u32 sw_tag = *tagp;
+	u32 tag;
+
+	if (page_idx >= DDP_PGIDX_MAX || !ddp || !gl || !gl->nelem ||
+			gl->length < DDP_THRESHOLD) {
+		cxgbi_log_debug("pgidx %u, xfer %u/%u, NO ddp.\n",
+				page_idx, gl->length, DDP_THRESHOLD);
+		return -EINVAL;
+	}
+
+	npods = (gl->nelem + PPOD_PAGES_MAX - 1) >> PPOD_PAGES_SHIFT;
+
+	if (ddp->idx_last == ddp->nppods)
+		idx = cxgb4i_ddp_find_unused_entries(ddp, 0, ddp->nppods,
+							npods, gl);
+	else {
+		idx = cxgb4i_ddp_find_unused_entries(ddp, ddp->idx_last + 1,
+							ddp->nppods, npods,
+							gl);
+		if (idx < 0 && ddp->idx_last >= npods) {
+			idx = cxgb4i_ddp_find_unused_entries(ddp, 0,
+				min(ddp->idx_last + npods, ddp->nppods),
+							npods, gl);
+		}
+	}
+	if (idx < 0) {
+		cxgbi_log_debug("xferlen %u, gl %u, npods %u NO DDP.\n",
+				gl->length, gl->nelem, npods);
+		return idx;
+	}
+
+	tag = cxgbi_ddp_tag_base(tformat, sw_tag);
+	tag |= idx << PPOD_IDX_SHIFT;
+
+	hdr.rsvd = 0;
+	hdr.vld_tid = htonl(PPOD_VALID_FLAG | PPOD_TID(tid));
+	hdr.pgsz_tag_clr = htonl(tag & ddp->rsvd_tag_mask);
+	hdr.max_offset = htonl(gl->length);
+	hdr.page_offset = htonl(gl->offset);
+
+	err = cxgb4i_ddp_set_map(csk, ddp, &hdr, idx, npods, gl);
+	if (err < 0)
+		goto unmark_entries;
+
+	ddp->idx_last = idx;
+	cxgbi_log_debug("xfer %u, gl %u,%u, tid 0x%x, 0x%x -> 0x%x(%u,%u).\n",
+			gl->length, gl->nelem, gl->offset, tid, sw_tag, tag,
+			idx, npods);
+	*tagp = tag;
+	return 0;
+unmark_entries:
+	cxgb4i_ddp_unmark_entries(ddp, idx, npods);
+	return err;
+}
+
+static int cxgb4i_ddp_setup_conn_pgidx(struct cxgbi_sock *csk,
+					unsigned int tid,
+					int pg_idx,
+					bool reply)
+{
+	struct sk_buff *skb;
+	struct cpl_set_tcb_field *req;
+	u64 val = pg_idx < DDP_PGIDX_MAX ? pg_idx : 0;
+
+	skb = alloc_skb(sizeof(*req), GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	/*  set up ulp submode and page size */
+	val = (val & 0x03) << 2;
+	val |= TCB_ULP_TYPE(ULP_MODE_ISCSI);
+	req = (struct cpl_set_tcb_field *)skb_put(skb, sizeof(*req));
+	INIT_TP_WR(req, tid);
+	OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_SET_TCB_FIELD, csk->hwtid));
+	req->reply_ctrl = htons(NO_REPLY(reply) | QUEUENO(csk->rss_qid));
+	req->word_cookie = htons(TCB_WORD(W_TCB_ULP_RAW));
+	req->mask = cpu_to_be64(TCB_ULP_TYPE(TCB_ULP_TYPE_MASK));
+	req->val = cpu_to_be64(val);
+	set_wr_txq(skb, CPL_PRIORITY_CONTROL, csk->txq_idx);
+	cxgb4_ofld_send(csk->cdev->ports[csk->port_id], skb);
+	return 0;
+}
+
+int cxgb4i_ddp_setup_conn_host_pagesize(struct cxgbi_sock *csk,
+					unsigned int tid,
+					int reply)
+{
+	return cxgb4i_ddp_setup_conn_pgidx(csk, tid, page_idx, reply);
+}
+
+int cxgb4i_ddp_setup_conn_pagesize(struct cxgbi_sock *csk,
+				   unsigned int tid,
+				   int reply,
+				   unsigned long pgsz)
+{
+	int pgidx = cxgb4i_ddp_find_page_index(pgsz);
+
+	return cxgb4i_ddp_setup_conn_pgidx(csk, tid, pgidx, reply);
+}
+
+int cxgb4i_ddp_setup_conn_digest(struct cxgbi_sock *csk,
+				unsigned int tid,
+				int hcrc,
+				int dcrc,
+				int reply)
+{
+	struct sk_buff *skb;
+	struct cpl_set_tcb_field *req;
+	u64 val = (hcrc ? ULP_CRC_HEADER : 0) | (dcrc ? ULP_CRC_DATA : 0);
+
+	val = TCB_ULP_RAW(val);
+	val |= TCB_ULP_TYPE(ULP_MODE_ISCSI);
+
+	skb = alloc_skb(sizeof(*req), GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	csk->hcrc_len = (hcrc ? 4 : 0);
+	csk->dcrc_len = (dcrc ? 4 : 0);
+	/*  set up ulp submode and page size */
+	req = (struct cpl_set_tcb_field *)skb_put(skb, sizeof(*req));
+	INIT_TP_WR(req, tid);
+	OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_SET_TCB_FIELD, tid));
+	req->reply_ctrl = htons(NO_REPLY(reply) | QUEUENO(csk->rss_qid));
+	req->word_cookie = htons(TCB_WORD(W_TCB_ULP_RAW));
+	req->mask = cpu_to_be64(TCB_ULP_RAW(TCB_ULP_RAW_MASK));
+	req->val = cpu_to_be64(val);
+	set_wr_txq(skb, CPL_PRIORITY_CONTROL, csk->txq_idx);
+	cxgb4_ofld_send(csk->cdev->ports[csk->port_id], skb);
+	return 0;
+}
+
+static void __cxgb4i_ddp_cleanup(struct kref *kref)
+{
+	int i = 0;
+	struct cxgb4i_ddp_info *ddp = container_of(kref,
+						struct cxgb4i_ddp_info,
+						refcnt);
+
+	cxgbi_log_info("kref release ddp 0x%p, snic 0x%p\n", ddp, ddp->snic);
+	ddp->snic->ddp = NULL;
+
+	while (i < ddp->nppods) {
+		struct cxgbi_gather_list *gl = ddp->gl_map[i];
+
+		if (gl) {
+			int npods = (gl->nelem + PPOD_PAGES_MAX - 1) >>
+							PPOD_PAGES_SHIFT;
+			cxgbi_log_info("snic 0x%p, ddp %d + %d\n",
+						ddp->snic, i, npods);
+			kfree(gl);
+			i += npods;
+		} else
+			i++;
+	}
+	cxgbi_free_big_mem(ddp);
+}
+
+static void __cxgb4i_ddp_init(struct cxgbi_device *cdev)
+{
+	struct cxgb4i_snic *snic = cxgbi_cdev_priv(cdev);
+	struct cxgb4i_ddp_info *ddp = snic->ddp;
+	unsigned int ppmax, bits, tagmask, pgsz_factor[4];
+	int i;
+
+	if (ddp) {
+		kref_get(&ddp->refcnt);
+		cxgbi_log_warn("snic 0x%p, ddp 0x%p already set up\n",
+				snic, snic->ddp);
+		return;
+	}
+	sw_tag_idx_bits = (__ilog2_u32(ISCSI_ITT_MASK)) + 1;
+	sw_tag_age_bits = (__ilog2_u32(ISCSI_AGE_MASK)) + 1;
+	cdev->tag_format.sw_bits = sw_tag_idx_bits + sw_tag_age_bits;
+	cxgbi_log_info("tag itt 0x%x, %u bits, age 0x%x, %u bits\n",
+			ISCSI_ITT_MASK, sw_tag_idx_bits,
+			ISCSI_AGE_MASK, sw_tag_age_bits);
+	ppmax = (snic->lldi.vr->iscsi.size >> PPOD_SIZE_SHIFT);
+	bits = __ilog2_u32(ppmax) + 1;
+	if (bits > PPOD_IDX_MAX_SIZE)
+		bits = PPOD_IDX_MAX_SIZE;
+	ppmax = (1 << (bits - 1)) - 1;
+	ddp = cxgbi_alloc_big_mem(sizeof(struct cxgb4i_ddp_info) +
+			ppmax * (sizeof(struct cxgbi_gather_list *) +
+				sizeof(struct sk_buff *)),
+				GFP_KERNEL);
+	if (!ddp) {
+		cxgbi_log_warn("snic 0x%p unable to alloc ddp 0x%d, "
+			       "ddp disabled\n", snic, ppmax);
+		return;
+	}
+	ddp->gl_map = (struct cxgbi_gather_list **)(ddp + 1);
+	spin_lock_init(&ddp->map_lock);
+	kref_init(&ddp->refcnt);
+	ddp->snic = snic;
+	ddp->pdev = snic->lldi.pdev;
+	ddp->max_txsz = min_t(unsigned int,
+				snic->lldi.iscsi_iolen,
+				ULP2_MAX_PKT_SIZE);
+	ddp->max_rxsz = min_t(unsigned int,
+				snic->lldi.iscsi_iolen,
+				ULP2_MAX_PKT_SIZE);
+	ddp->llimit = snic->lldi.vr->iscsi.start;
+	ddp->ulimit = ddp->llimit + snic->lldi.vr->iscsi.size - 1;
+	ddp->nppods = ppmax;
+	ddp->idx_last = ppmax;
+	ddp->idx_bits = bits;
+	ddp->idx_mask = (1 << bits) - 1;
+	ddp->rsvd_tag_mask = (1 << (bits + PPOD_IDX_SHIFT)) - 1;
+	tagmask = ddp->idx_mask << PPOD_IDX_SHIFT;
+	for (i = 0; i < DDP_PGIDX_MAX; i++)
+		pgsz_factor[i] = ddp_page_order[i];
+	cxgb4_iscsi_init(snic->lldi.ports[0], tagmask, pgsz_factor);
+	snic->ddp = ddp;
+	cdev->tag_format.rsvd_bits = ddp->idx_bits;
+	cdev->tag_format.rsvd_shift = PPOD_IDX_SHIFT;
+	cdev->tag_format.rsvd_mask =
+		((1 << cdev->tag_format.rsvd_bits) - 1);
+	cxgbi_log_info("tag format: sw %u, rsvd %u,%u, mask 0x%x.\n",
+			cdev->tag_format.sw_bits,
+			cdev->tag_format.rsvd_bits,
+			cdev->tag_format.rsvd_shift,
+			cdev->tag_format.rsvd_mask);
+	cdev->tx_max_size = min_t(unsigned int, ULP2_MAX_PDU_PAYLOAD,
+				ddp->max_txsz - ISCSI_PDU_NONPAYLOAD_LEN);
+	cdev->rx_max_size = min_t(unsigned int, ULP2_MAX_PDU_PAYLOAD,
+				ddp->max_rxsz - ISCSI_PDU_NONPAYLOAD_LEN);
+	cxgbi_log_info("max payload size: %u/%u, %u/%u.\n",
+			cdev->tx_max_size, ddp->max_txsz,
+			cdev->rx_max_size, ddp->max_rxsz);
+	cxgbi_log_info("snic 0x%p, nppods %u, bits %u, mask 0x%x,0x%x "
+			"pkt %u/%u, %u/%u\n",
+			snic, ppmax, ddp->idx_bits, ddp->idx_mask,
+			ddp->rsvd_tag_mask, ddp->max_txsz,
+			snic->lldi.iscsi_iolen,
+			ddp->max_rxsz, snic->lldi.iscsi_iolen);
+}
+
+void cxgb4i_ddp_init(struct cxgbi_device *cdev)
+{
+	if (page_idx == DDP_PGIDX_MAX) {
+		page_idx = cxgb4i_ddp_find_page_index(PAGE_SIZE);
+
+		if (page_idx == DDP_PGIDX_MAX) {
+			cxgbi_log_info("system PAGE_SIZE %lu, update hw\n",
+					PAGE_SIZE);
+			if (cxgb4i_ddp_adjust_page_table()) {
+				cxgbi_log_info("PAGE_SIZE %lu, ddp disabled\n",
+						PAGE_SIZE);
+				return;
+			}
+			page_idx = cxgb4i_ddp_find_page_index(PAGE_SIZE);
+		}
+		cxgbi_log_info("system PAGE_SIZE %lu, ddp idx %u\n",
+				PAGE_SIZE, page_idx);
+	}
+	__cxgb4i_ddp_init(cdev);
+	cdev->ddp_make_gl = cxgb4i_ddp_make_gl;
+	cdev->ddp_release_gl = cxgb4i_ddp_release_gl;
+	cdev->ddp_tag_reserve = cxgb4i_ddp_tag_reserve;
+	cdev->ddp_tag_release = cxgb4i_ddp_tag_release;
+	cdev->ddp_setup_conn_digest = cxgb4i_ddp_setup_conn_digest;
+	cdev->ddp_setup_conn_host_pgsz = cxgb4i_ddp_setup_conn_host_pagesize;
+}
+
+void cxgb4i_ddp_cleanup(struct cxgbi_device *cdev)
+{
+	struct cxgb4i_snic *snic = cxgbi_cdev_priv(cdev);
+	struct cxgb4i_ddp_info *ddp = snic->ddp;
+
+	cxgbi_log_info("snic 0x%p, release ddp 0x%p\n", cdev, ddp);
+	if (ddp)
+		kref_put(&ddp->refcnt, __cxgb4i_ddp_cleanup);
+}
+
diff --git a/drivers/scsi/cxgbi/cxgb4i_init.c b/drivers/scsi/cxgbi/cxgb4i_init.c
new file mode 100644
index 0000000..b03b4c0
--- /dev/null
+++ b/drivers/scsi/cxgbi/cxgb4i_init.c
@@ -0,0 +1,308 @@
+/*
+ * cxgb4i_init.c: Chelsio T4 iSCSI driver.
+ *
+ * Copyright (c) 2010 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ *
+ * Written by: Karen Xie (kxie-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ * Written by: Rakesh Ranjan (rranjan-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ */
+
+#include <net/route.h>
+#include <scsi/scsi_cmnd.h>
+#include <scsi/scsi_device.h>
+#include <scsi/scsi_eh.h>
+#include <scsi/scsi_host.h>
+#include <scsi/scsi.h>
+#include <scsi/iscsi_proto.h>
+#include <scsi/libiscsi.h>
+#include <scsi/scsi_transport_iscsi.h>
+
+#include "cxgb4i.h"
+
+#define	DRV_MODULE_NAME		"cxgb4i"
+#define	DRV_MODULE_VERSION	"0.90"
+#define	DRV_MODULE_RELDATE	"05/04/2010"
+
+static char version[] =
+	"Chelsio T4 iSCSI driver " DRV_MODULE_NAME
+	" v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n";
+
+MODULE_AUTHOR("Chelsio Communications");
+MODULE_DESCRIPTION("Chelsio T4 iSCSI driver");
+MODULE_LICENSE("GPL");
+MODULE_VERSION(DRV_MODULE_VERSION);
+
+#define RX_PULL_LEN	128
+
+static struct scsi_transport_template *cxgb4i_scsi_transport;
+static struct scsi_host_template cxgb4i_host_template;
+static struct iscsi_transport cxgb4i_iscsi_transport;
+static struct cxgbi_device *cxgb4i_cdev;
+
+static void *cxgb4i_uld_add(const struct cxgb4_lld_info *linfo);
+static int cxgb4i_uld_rx_handler(void *handle, const __be64 *rsp,
+				const struct pkt_gl *pgl);
+static int cxgb4i_uld_state_change(void *handle, enum cxgb4_state state);
+static int cxgb4i_iscsi_init(struct cxgbi_device *);
+static void cxgb4i_iscsi_cleanup(struct cxgbi_device *);
+
+const static struct cxgb4_uld_info cxgb4i_uld_info = {
+	.name = "cxgb4i",
+	.add = cxgb4i_uld_add,
+	.rx_handler = cxgb4i_uld_rx_handler,
+	.state_change = cxgb4i_uld_state_change,
+};
+
+struct cxgbi_device *cxgb4i_uld_init(const struct cxgb4_lld_info *linfo)
+{
+	struct cxgbi_device *cdev;
+	struct cxgb4i_snic *snic;
+	int i, rc;
+
+	cdev = cxgbi_device_register(sizeof(*snic), linfo->nports);
+	if (!cdev) {
+		cxgbi_log_debug("error cxgbi_device_alloc\n");
+		return NULL;
+	}
+	cxgb4i_cdev = cdev;
+	snic = cxgbi_cdev_priv(cxgb4i_cdev);
+	spin_lock_init(&snic->lock);
+	snic->lldi = *linfo;
+	cdev->ports = snic->lldi.ports;
+	cdev->nports = snic->lldi.nports;
+	cdev->pdev = snic->lldi.pdev;
+	cdev->mtus = snic->lldi.mtus;
+	cdev->nmtus = NMTUS;
+	cdev->skb_tx_headroom = SKB_MAX_HEAD(CXGB4I_TX_HEADER_LEN);
+	rc = cxgb4i_iscsi_init(cdev);
+	if (rc) {
+		cxgbi_log_info("cxgb4i_iscsi_init failed\n");
+		goto free_cdev;
+	}
+	rc = cxgbi_pdu_init(cdev);
+	if (rc) {
+		cxgbi_log_info("cxgbi_pdu_init failed\n");
+		goto clean_iscsi;
+	}
+	cxgb4i_ddp_init(cdev);
+	rc = cxgb4i_ofld_init(cdev);
+	if (rc) {
+		cxgbi_log_info("cxgb4i_ofld_init failed\n");
+		goto clean_pdu_ddp;
+	}
+
+	for (i = 0; i < cdev->nports; i++) {
+		cdev->hbas[i] = cxgbi_hba_add(cdev,
+						CXGB4I_MAX_LUN,
+						CXGB4I_MAX_CONN,
+						cxgb4i_scsi_transport,
+						&cxgb4i_host_template,
+						snic->lldi.ports[i]);
+		if (!cdev->hbas[i])
+			goto clean_iscsi;
+	}
+	return cdev;
+clean_iscsi:
+	cxgb4i_iscsi_cleanup(cdev);
+clean_pdu_ddp:
+	cxgbi_pdu_cleanup(cdev);
+	cxgb4i_ddp_cleanup(cdev);
+free_cdev:
+	cxgbi_device_unregister(cdev);
+	cdev = ERR_PTR(-ENOMEM);
+	return cdev;
+}
+
+void cxgb4i_uld_cleanup(void *handle)
+{
+	struct cxgbi_device *cdev = handle;
+	int i;
+
+	if (!cdev)
+		return;
+
+	for (i = 0; i < cdev->nports; i++) {
+		if (cdev->hbas[i]) {
+			cxgbi_hba_remove(cdev->hbas[i]);
+			cdev->hbas[i] = NULL;
+		}
+	}
+	cxgb4i_ofld_cleanup(cdev);
+	cxgb4i_ddp_cleanup(cdev);
+	cxgbi_pdu_cleanup(cdev);
+	cxgbi_log_info("snic 0x%p, %u scsi hosts removed.\n",
+			cdev, cdev->nports);
+	cxgb4i_iscsi_cleanup(cdev);
+	cxgbi_device_unregister(cdev);
+}
+
+static void *cxgb4i_uld_add(const struct cxgb4_lld_info *linfo)
+{
+	cxgbi_log_info("%s", version);
+	return cxgb4i_uld_init(linfo);
+}
+
+static int cxgb4i_uld_rx_handler(void *handle, const __be64 *rsp,
+				const struct pkt_gl *pgl)
+{
+	const struct cpl_act_establish *rpl;
+	struct sk_buff *skb;
+	unsigned int opc;
+	struct cxgb4i_snic *snic = cxgbi_cdev_priv(handle);
+
+	if (pgl == NULL) {
+		unsigned int len = 64 - sizeof(struct rsp_ctrl) - 8;
+
+		skb = alloc_skb(len, GFP_ATOMIC);
+		if (!skb)
+			goto nomem;
+		__skb_put(skb, len);
+		skb_copy_to_linear_data(skb, &rsp[1], len);
+	} else {
+		skb = cxgb4_pktgl_to_skb(pgl, RX_PULL_LEN, RX_PULL_LEN);
+		if (unlikely(!skb))
+			goto nomem;
+	}
+
+	rpl = (struct cpl_act_establish *)skb->data;
+	opc = rpl->ot.opcode;
+	cxgbi_log_debug("snic %p, opcode 0x%x, skb %p\n", snic, opc, skb);
+	BUG_ON(!snic->handlers[opc]);
+
+	if (snic->handlers[opc])
+		snic->handlers[opc](snic, skb);
+	else
+		cxgbi_log_error("No handler for opcode 0x%x\n", opc);
+	return 0;
+nomem:
+	cxgbi_api_debug("OOM bailing out\n");
+	return 1;
+}
+
+static int cxgb4i_uld_state_change(void *handle, enum cxgb4_state state)
+{
+	return 0;
+}
+
+static struct scsi_host_template cxgb4i_host_template = {
+	.module				= THIS_MODULE,
+	.name				= "Chelsio T4 iSCSI initiator",
+	.proc_name			= "cxgb4i",
+	.queuecommand			= iscsi_queuecommand,
+	.change_queue_depth		= iscsi_change_queue_depth,
+	.can_queue			= CXGB4I_SCSI_HOST_QDEPTH,
+	.sg_tablesize			= SG_ALL,
+	.max_sectors			= 0xFFFF,
+	.cmd_per_lun			= ISCSI_DEF_CMD_PER_LUN,
+	.eh_abort_handler		= iscsi_eh_abort,
+	.eh_device_reset_handler	= iscsi_eh_device_reset,
+	.eh_target_reset_handler	= iscsi_eh_recover_target,
+	.target_alloc			= iscsi_target_alloc,
+	.use_clustering			= DISABLE_CLUSTERING,
+	.this_id			= -1,
+};
+
+#define	CXGB4I_CAPS	(CAP_RECOVERY_L0 | CAP_MULTI_R2T |	\
+			CAP_HDRDGST | CAP_DATADGST |		\
+			CAP_DIGEST_OFFLOAD | CAP_PADDING_OFFLOAD)
+#define	CXGB4I_PMASK	(ISCSI_MAX_RECV_DLENGTH | ISCSI_MAX_XMIT_DLENGTH | \
+			ISCSI_HDRDGST_EN | ISCSI_DATADGST_EN | \
+			ISCSI_INITIAL_R2T_EN | ISCSI_MAX_R2T | \
+			ISCSI_IMM_DATA_EN | ISCSI_FIRST_BURST | \
+			ISCSI_MAX_BURST | ISCSI_PDU_INORDER_EN | \
+			ISCSI_DATASEQ_INORDER_EN | ISCSI_ERL | \
+			ISCSI_CONN_PORT | ISCSI_CONN_ADDRESS | \
+			ISCSI_EXP_STATSN | ISCSI_PERSISTENT_PORT | \
+			ISCSI_PERSISTENT_ADDRESS | ISCSI_TARGET_NAME | \
+			ISCSI_TPGT | ISCSI_USERNAME | \
+			ISCSI_PASSWORD | ISCSI_USERNAME_IN | \
+			ISCSI_PASSWORD_IN | ISCSI_FAST_ABORT | \
+			ISCSI_ABORT_TMO | ISCSI_LU_RESET_TMO | \
+			ISCSI_TGT_RESET_TMO | ISCSI_PING_TMO | \
+			ISCSI_RECV_TMO | ISCSI_IFACE_NAME | \
+			ISCSI_INITIATOR_NAME)
+#define	CXGB4I_HPMASK	(ISCSI_HOST_HWADDRESS | ISCSI_HOST_IPADDRESS | \
+			ISCSI_HOST_INITIATOR_NAME | ISCSI_HOST_INITIATOR_NAME)
+
+static struct iscsi_transport cxgb4i_iscsi_transport = {
+	.owner				= THIS_MODULE,
+	.name				= "cxgb4i",
+	.caps				= CXGB4I_CAPS,
+	.param_mask			= CXGB4I_PMASK,
+	.host_param_mask		= CXGB4I_HPMASK,
+	.get_host_param			= cxgbi_get_host_param,
+	.set_host_param			= cxgbi_set_host_param,
+
+	.create_session			= cxgbi_create_session,
+	.destroy_session		= cxgbi_destroy_session,
+	.get_session_param		= iscsi_session_get_param,
+
+	.create_conn			= cxgbi_create_conn,
+	.bind_conn			= cxgbi_bind_conn,
+	.destroy_conn			= iscsi_tcp_conn_teardown,
+	.start_conn			= iscsi_conn_start,
+	.stop_conn			= iscsi_conn_stop,
+	.get_conn_param			= cxgbi_get_conn_param,
+	.set_param			= cxgbi_set_conn_param,
+	.get_stats			= cxgbi_get_conn_stats,
+
+	.send_pdu			= iscsi_conn_send_pdu,
+
+	.init_task			= iscsi_tcp_task_init,
+	.xmit_task			= iscsi_tcp_task_xmit,
+	.cleanup_task			= cxgbi_cleanup_task,
+
+	.alloc_pdu			= cxgbi_conn_alloc_pdu,
+	.init_pdu			= cxgbi_conn_init_pdu,
+	.xmit_pdu			= cxgbi_conn_xmit_pdu,
+	.parse_pdu_itt			= cxgbi_parse_pdu_itt,
+
+	.ep_connect			= cxgbi_ep_connect,
+	.ep_poll			= cxgbi_ep_poll,
+	.ep_disconnect			= cxgbi_ep_disconnect,
+
+	.session_recovery_timedout	= iscsi_session_recovery_timedout,
+};
+
+int cxgb4i_iscsi_init(struct cxgbi_device *cdev)
+{
+	cxgb4i_scsi_transport = iscsi_register_transport(
+					&cxgb4i_iscsi_transport);
+	if (!cxgb4i_scsi_transport) {
+		cxgbi_log_error("Could not register cxgb4i transport\n");
+		return -ENODATA;
+	}
+
+	cdev->itp = &cxgb4i_iscsi_transport;
+	return 0;
+}
+
+void cxgb4i_iscsi_cleanup(struct cxgbi_device *cdev)
+{
+	if (cxgb4i_scsi_transport) {
+		cxgbi_api_debug("cxgb4i transport 0x%p removed\n",
+				cxgb4i_scsi_transport);
+		iscsi_unregister_transport(&cxgb4i_iscsi_transport);
+	}
+}
+
+
+static int __init cxgb4i_init_module(void)
+{
+	cxgb4_register_uld(CXGB4_ULD_ISCSI, &cxgb4i_uld_info);
+	return 0;
+}
+
+static void __exit cxgb4i_exit_module(void)
+{
+	cxgb4_unregister_uld(CXGB4_ULD_ISCSI);
+	cxgb4i_uld_cleanup(cxgb4i_cdev);
+}
+
+module_init(cxgb4i_init_module);
+module_exit(cxgb4i_exit_module);
+
diff --git a/drivers/scsi/cxgbi/cxgb4i_offload.c b/drivers/scsi/cxgbi/cxgb4i_offload.c
new file mode 100644
index 0000000..314e5df
--- /dev/null
+++ b/drivers/scsi/cxgbi/cxgb4i_offload.c
@@ -0,0 +1,1412 @@
+/*
+ * cxgb4i_offload.c: Chelsio T4 iSCSI driver.
+ *
+ * Copyright (c) 2010 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ *
+ * Written by: Karen Xie (kxie-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ * Written by: Rakesh Ranjan (rranjan-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org)
+ */
+
+#include <linux/if_vlan.h>
+#include <net/dst.h>
+#include <net/route.h>
+#include <net/tcp.h>
+
+#include "libcxgbi.h"
+#include "cxgb4i.h"
+
+static int cxgb4i_rcv_win = 256 * 1024;
+module_param(cxgb4i_rcv_win, int, 0644);
+MODULE_PARM_DESC(cxgb4i_rcv_win, "TCP reveive window in bytes");
+
+static int cxgb4i_snd_win = 128 * 1024;
+module_param(cxgb4i_snd_win, int, 0644);
+MODULE_PARM_DESC(cxgb4i_snd_win, "TCP send window in bytes");
+
+static int cxgb4i_rx_credit_thres = 10 * 1024;
+module_param(cxgb4i_rx_credit_thres, int, 0644);
+MODULE_PARM_DESC(cxgb4i_rx_credit_thres,
+		"RX credits return threshold in bytes (default=10KB)");
+
+static unsigned int cxgb4i_max_connect = (8 * 1024);
+module_param(cxgb4i_max_connect, uint, 0644);
+MODULE_PARM_DESC(cxgb4i_max_connect, "Maximum number of connections");
+
+static unsigned short cxgb4i_sport_base = 20000;
+module_param(cxgb4i_sport_base, ushort, 0644);
+MODULE_PARM_DESC(cxgb4i_sport_base, "Starting port number (default 20000)");
+
+#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
+#define RCV_BUFSIZ_MASK	0x3FFU
+#define MAX_IMM_TX_PKT_LEN 128
+
+static int cxgb4i_sock_push_tx_frames(struct cxgbi_sock *, int);
+
+/*
+ * is_ofld_imm - check whether a packet can be sent as immediate data
+ * @skb: the packet
+ *
+ * Returns true if a packet can be sent as an offload WR with immediate
+ * data.  We currently use the same limit as for Ethernet packets.
+ */
+static inline int is_ofld_imm(const struct sk_buff *skb)
+{
+	return skb->len <= (MAX_IMM_TX_PKT_LEN -
+			sizeof(struct fw_ofld_tx_data_wr));
+}
+
+static void cxgb4i_sock_make_act_open_req(struct cxgbi_sock *csk,
+					   struct sk_buff *skb,
+					   unsigned int qid_atid,
+					   struct l2t_entry *e)
+{
+	struct cpl_act_open_req *req;
+	unsigned long long opt0;
+	unsigned int opt2;
+	int wscale;
+
+	cxgbi_conn_debug("csk 0x%p, atid 0x%x\n", csk, qid_atid);
+
+	wscale = cxgbi_sock_compute_wscale(csk->mss_idx);
+	opt0 = KEEP_ALIVE(1) |
+		WND_SCALE(wscale) |
+		MSS_IDX(csk->mss_idx) |
+		L2T_IDX(((struct l2t_entry *)csk->l2t)->idx) |
+		TX_CHAN(csk->tx_chan) |
+		SMAC_SEL(csk->smac_idx) |
+		RCV_BUFSIZ(cxgb4i_rcv_win >> 10);
+	opt2 = RX_CHANNEL(0) |
+		RSS_QUEUE_VALID |
+		RSS_QUEUE(csk->rss_qid);
+	set_wr_txq(skb, CPL_PRIORITY_SETUP, csk->txq_idx);
+	req = (struct cpl_act_open_req *)__skb_put(skb, sizeof(*req));
+	INIT_TP_WR(req, 0);
+	OPCODE_TID(req) = cpu_to_be32(MK_OPCODE_TID(CPL_ACT_OPEN_REQ,
+					qid_atid));
+	req->local_port = csk->saddr.sin_port;
+	req->peer_port = csk->daddr.sin_port;
+	req->local_ip = csk->saddr.sin_addr.s_addr;
+	req->peer_ip = csk->daddr.sin_addr.s_addr;
+	req->opt0 = cpu_to_be64(opt0);
+	req->params = 0;
+	req->opt2 = cpu_to_be32(opt2);
+}
+
+static void cxgb4i_fail_act_open(struct cxgbi_sock *csk, int errno)
+{
+	cxgbi_conn_debug("csk 0%p, state %u, flag 0x%lx\n", csk,
+			csk->state, csk->flags);
+	csk->err = errno;
+	cxgbi_sock_closed(csk);
+}
+
+static void cxgb4i_act_open_req_arp_failure(void *handle, struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk = (struct cxgbi_sock *)skb->sk;
+
+	cxgbi_sock_hold(csk);
+	spin_lock_bh(&csk->lock);
+	if (csk->state == CTP_CONNECTING)
+		cxgb4i_fail_act_open(csk, -EHOSTUNREACH);
+	spin_unlock_bh(&csk->lock);
+	cxgbi_sock_put(csk);
+	__kfree_skb(skb);
+}
+
+static void cxgb4i_sock_skb_entail(struct cxgbi_sock *csk,
+				   struct sk_buff *skb,
+				   int flags)
+{
+	cxgb4i_skb_tcp_seq(skb) = csk->write_seq;
+	cxgb4i_skb_flags(skb) = flags;
+	__skb_queue_tail(&csk->write_queue, skb);
+}
+
+static void cxgb4i_sock_send_close_req(struct cxgbi_sock *csk)
+{
+	struct sk_buff *skb = csk->cpl_close;
+	struct cpl_close_con_req *req = (struct cpl_close_con_req *)skb->head;
+	unsigned int tid = csk->hwtid;
+
+	csk->cpl_close = NULL;
+	set_wr_txq(skb, CPL_PRIORITY_DATA, csk->txq_idx);
+	INIT_TP_WR(req, tid);
+	OPCODE_TID(req) = cpu_to_be32(MK_OPCODE_TID(CPL_CLOSE_CON_REQ, tid));
+	req->rsvd = 0;
+	cxgb4i_sock_skb_entail(csk, skb, CTP_SKCBF_NO_APPEND);
+	if (csk->state != CTP_CONNECTING)
+		cxgb4i_sock_push_tx_frames(csk, 1);
+}
+
+static void cxgb4i_sock_abort_arp_failure(void *handle, struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk = (struct cxgbi_sock *)handle;
+	struct cpl_abort_req *req;
+
+	req = (struct cpl_abort_req *)skb->data;
+	req->cmd = CPL_ABORT_NO_RST;
+	cxgb4_ofld_send(csk->cdev->ports[csk->port_id], skb);
+}
+
+static void cxgb4i_sock_send_abort_req(struct cxgbi_sock *csk)
+{
+	struct cpl_abort_req *req;
+	struct sk_buff *skb = csk->cpl_abort_req;
+
+	if (unlikely(csk->state == CTP_ABORTING) ||
+			!skb || !csk->cdev)
+		return;
+	cxgbi_sock_set_state(csk, CTP_ABORTING);
+	cxgbi_conn_debug("csk 0x%p, flag ABORT_RPL + ABORT_SHUT\n", csk);
+	cxgbi_sock_set_state(csk, CTPF_ABORT_RPL_PENDING);
+	cxgbi_sock_purge_write_queue(csk);
+	csk->cpl_abort_req = NULL;
+	req = (struct cpl_abort_req *)skb->head;
+	set_wr_txq(skb, CPL_PRIORITY_DATA, csk->txq_idx);
+	t4_set_arp_err_handler(skb, csk, cxgb4i_sock_abort_arp_failure);
+	INIT_TP_WR(req, csk->hwtid);
+	OPCODE_TID(req) = cpu_to_be32(MK_OPCODE_TID(CPL_ABORT_REQ, csk->hwtid));
+	req->rsvd0 = htonl(csk->snd_nxt);
+	req->rsvd1 = !cxgbi_sock_flag(csk, CTPF_TX_DATA_SENT);
+	req->cmd = CPL_ABORT_SEND_RST;
+	cxgb4_l2t_send(csk->cdev->ports[csk->port_id], skb, csk->l2t);
+}
+
+static void cxgb4i_sock_send_abort_rpl(struct cxgbi_sock *csk, int rst_status)
+{
+	struct sk_buff *skb = csk->cpl_abort_rpl;
+	struct cpl_abort_rpl *rpl = (struct cpl_abort_rpl *)skb->head;
+
+	csk->cpl_abort_rpl = NULL;
+	set_wr_txq(skb, CPL_PRIORITY_DATA, csk->txq_idx);
+	INIT_TP_WR(rpl, csk->hwtid);
+	OPCODE_TID(rpl) = cpu_to_be32(MK_OPCODE_TID(CPL_ABORT_RPL, csk->hwtid));
+	rpl->cmd = rst_status;
+	cxgb4_ofld_send(csk->cdev->ports[csk->port_id], skb);
+}
+
+static u32 cxgb4i_csk_send_rx_credits(struct cxgbi_sock *csk, u32 credits)
+{
+	struct sk_buff *skb;
+	struct cpl_rx_data_ack *req;
+	int wrlen = roundup(sizeof(*req), 16);
+
+	skb = alloc_skb(wrlen, GFP_ATOMIC);
+	if (!skb)
+		return 0;
+	req = (struct cpl_rx_data_ack *)__skb_put(skb, wrlen);
+	memset(req, 0, wrlen);
+	set_wr_txq(skb, CPL_PRIORITY_ACK, csk->txq_idx);
+	INIT_TP_WR(req, csk->hwtid);
+	OPCODE_TID(req) = cpu_to_be32(MK_OPCODE_TID(CPL_RX_DATA_ACK,
+				      csk->hwtid));
+	req->credit_dack = cpu_to_be32(RX_CREDITS(credits) | RX_FORCE_ACK(1));
+	cxgb4_ofld_send(csk->cdev->ports[csk->port_id], skb);
+	return credits;
+}
+
+#define SKB_WR_LIST_SIZE	(MAX_SKB_FRAGS + 2)
+static const unsigned int cxgb4i_ulp_extra_len[] = { 0, 4, 4, 8 };
+
+static inline unsigned int ulp_extra_len(const struct sk_buff *skb)
+{
+	return cxgb4i_ulp_extra_len[cxgb4i_skb_ulp_mode(skb) & 3];
+}
+
+static inline void cxgb4i_sock_reset_wr_list(struct cxgbi_sock *csk)
+{
+	csk->wr_pending_head = csk->wr_pending_tail = NULL;
+}
+
+static inline void cxgb4i_sock_enqueue_wr(struct cxgbi_sock *csk,
+					  struct sk_buff *skb)
+{
+	cxgb4i_skb_tx_wr_next(skb) = NULL;
+	/*
+	 * We want to take an extra reference since both us and the driver
+	 * need to free the packet before it's really freed. We know there's
+	 * just one user currently so we use atomic_set rather than skb_get
+	 * to avoid the atomic op.
+	 */
+	atomic_set(&skb->users, 2);
+
+	if (!csk->wr_pending_head)
+		csk->wr_pending_head = skb;
+	else
+		cxgb4i_skb_tx_wr_next(csk->wr_pending_tail) = skb;
+	csk->wr_pending_tail = skb;
+}
+
+static int cxgb4i_sock_count_pending_wrs(const struct cxgbi_sock *csk)
+{
+	int n = 0;
+	const struct sk_buff *skb = csk->wr_pending_head;
+
+	while (skb) {
+		n += skb->csum;
+		skb = cxgb4i_skb_tx_wr_next(skb);
+	}
+	return n;
+}
+
+static inline struct sk_buff *cxgb4i_sock_peek_wr(const struct cxgbi_sock *csk)
+{
+	return csk->wr_pending_head;
+}
+
+static inline struct sk_buff *cxgb4i_sock_dequeue_wr(struct cxgbi_sock *csk)
+{
+	struct sk_buff *skb = csk->wr_pending_head;
+
+	if (likely(skb)) {
+		csk->wr_pending_head = cxgb4i_skb_tx_wr_next(skb);
+		cxgb4i_skb_tx_wr_next(skb) = NULL;
+	}
+	return skb;
+}
+
+static void cxgb4i_sock_purge_wr_queue(struct cxgbi_sock *csk)
+{
+	struct sk_buff *skb;
+
+	while ((skb = cxgb4i_sock_dequeue_wr(csk)) != NULL)
+		kfree_skb(skb);
+}
+
+/*
+ * sgl_len - calculates the size of an SGL of the given capacity
+ * @n: the number of SGL entries
+ * Calculates the number of flits needed for a scatter/gather list that
+ * can hold the given number of entries.
+ */
+static inline unsigned int sgl_len(unsigned int n)
+{
+	n--;
+	return (3 * n) / 2 + (n & 1) + 2;
+}
+
+/*
+ * calc_tx_flits_ofld - calculate # of flits for an offload packet
+ * @skb: the packet
+ *
+ * Returns the number of flits needed for the given offload packet.
+ * These packets are already fully constructed and no additional headers
+ * will be added.
+ */
+static inline unsigned int calc_tx_flits_ofld(const struct sk_buff *skb)
+{
+	unsigned int flits, cnt;
+
+	if (is_ofld_imm(skb))
+		return DIV_ROUND_UP(skb->len, 8);
+	flits = skb_transport_offset(skb) / 8;
+	cnt = skb_shinfo(skb)->nr_frags;
+	if (skb->tail != skb->transport_header)
+		cnt++;
+	return flits + sgl_len(cnt);
+}
+
+static inline void cxgb4i_sock_send_tx_flowc_wr(struct cxgbi_sock *csk)
+{
+	struct sk_buff *skb;
+	struct fw_flowc_wr *flowc;
+	int flowclen, i;
+
+	flowclen = 80;
+	skb = alloc_skb(flowclen, GFP_ATOMIC);
+	flowc = (struct fw_flowc_wr *)__skb_put(skb, flowclen);
+	flowc->op_to_nparams =
+		htonl(FW_WR_OP(FW_FLOWC_WR) | FW_FLOWC_WR_NPARAMS(8));
+	flowc->flowid_len16 =
+		htonl(FW_WR_LEN16(DIV_ROUND_UP(72, 16)) |
+				FW_WR_FLOWID(csk->hwtid));
+	flowc->mnemval[0].mnemonic = FW_FLOWC_MNEM_PFNVFN;
+	flowc->mnemval[0].val = htonl(0);
+	flowc->mnemval[1].mnemonic = FW_FLOWC_MNEM_CH;
+	flowc->mnemval[1].val = htonl(csk->tx_chan);
+	flowc->mnemval[2].mnemonic = FW_FLOWC_MNEM_PORT;
+	flowc->mnemval[2].val = htonl(csk->tx_chan);
+	flowc->mnemval[3].mnemonic = FW_FLOWC_MNEM_IQID;
+	flowc->mnemval[3].val = htonl(csk->rss_qid);
+	flowc->mnemval[4].mnemonic = FW_FLOWC_MNEM_SNDNXT;
+	flowc->mnemval[4].val = htonl(csk->snd_nxt);
+	flowc->mnemval[5].mnemonic = FW_FLOWC_MNEM_RCVNXT;
+	flowc->mnemval[5].val = htonl(csk->rcv_nxt);
+	flowc->mnemval[6].mnemonic = FW_FLOWC_MNEM_SNDBUF;
+	flowc->mnemval[6].val = htonl(cxgb4i_snd_win);
+	flowc->mnemval[7].mnemonic = FW_FLOWC_MNEM_MSS;
+	flowc->mnemval[7].val = htonl(csk->mss_idx);
+	flowc->mnemval[8].mnemonic = 0;
+	flowc->mnemval[8].val = 0;
+	for (i = 0; i < 9; i++) {
+		flowc->mnemval[i].r4[0] = 0;
+		flowc->mnemval[i].r4[1] = 0;
+		flowc->mnemval[i].r4[2] = 0;
+	}
+	set_wr_txq(skb, CPL_PRIORITY_DATA, csk->txq_idx);
+	cxgb4_ofld_send(csk->cdev->ports[csk->port_id], skb);
+}
+
+static inline void cxgb4i_sock_make_tx_data_wr(struct cxgbi_sock *csk,
+						struct sk_buff *skb, int dlen,
+						int len, u32 credits,
+						int req_completion)
+{
+	struct fw_ofld_tx_data_wr *req;
+	unsigned int wr_ulp_mode;
+
+	if (is_ofld_imm(skb)) {
+			req = (struct fw_ofld_tx_data_wr *)
+				__skb_push(skb, sizeof(*req));
+			req->op_to_immdlen =
+				cpu_to_be32(FW_WR_OP(FW_OFLD_TX_DATA_WR) |
+					FW_WR_COMPL(req_completion) |
+					FW_WR_IMMDLEN(dlen));
+			req->flowid_len16 =
+				cpu_to_be32(FW_WR_FLOWID(csk->hwtid) |
+						FW_WR_LEN16(credits));
+	} else {
+		req = (struct fw_ofld_tx_data_wr *)
+			__skb_push(skb, sizeof(*req));
+		req->op_to_immdlen =
+			cpu_to_be32(FW_WR_OP(FW_OFLD_TX_DATA_WR) |
+					FW_WR_COMPL(req_completion) |
+					FW_WR_IMMDLEN(0));
+		req->flowid_len16 =
+			cpu_to_be32(FW_WR_FLOWID(csk->hwtid) |
+					FW_WR_LEN16(credits));
+	}
+	wr_ulp_mode =
+		FW_OFLD_TX_DATA_WR_ULPMODE(cxgb4i_skb_ulp_mode(skb) >> 4) |
+		FW_OFLD_TX_DATA_WR_ULPSUBMODE(cxgb4i_skb_ulp_mode(skb) & 3);
+	req->tunnel_to_proxy = cpu_to_be32(wr_ulp_mode) |
+		FW_OFLD_TX_DATA_WR_SHOVE(skb_peek(&csk->write_queue) ? 0 : 1);
+	req->plen = cpu_to_be32(len);
+	if (!cxgbi_sock_flag(csk, CTPF_TX_DATA_SENT))
+		cxgbi_sock_set_flag(csk, CTPF_TX_DATA_SENT);
+}
+
+static void cxgb4i_sock_arp_failure_discard(void *handle, struct sk_buff *skb)
+{
+	kfree_skb(skb);
+}
+
+static int cxgb4i_sock_push_tx_frames(struct cxgbi_sock *csk,
+						int req_completion)
+{
+	int total_size = 0;
+	struct sk_buff *skb;
+	struct cxgb4i_snic *snic;
+
+	if (unlikely(csk->state == CTP_CONNECTING ||
+			csk->state == CTP_CLOSE_WAIT_1 ||
+			csk->state >= CTP_ABORTING)) {
+		cxgbi_tx_debug("csk 0x%p, in closing state %u.\n",
+				csk, csk->state);
+		return 0;
+	}
+
+	snic = cxgbi_cdev_priv(csk->cdev);
+	while (csk->wr_cred && (skb = skb_peek(&csk->write_queue)) != NULL) {
+		int dlen;
+		int len;
+		unsigned int credits_needed;
+
+		dlen = len = skb->len;
+		skb_reset_transport_header(skb);
+		if (is_ofld_imm(skb))
+			credits_needed = DIV_ROUND_UP(dlen +
+					sizeof(struct fw_ofld_tx_data_wr), 16);
+		else
+			credits_needed = DIV_ROUND_UP(8 *
+					calc_tx_flits_ofld(skb)+
+					sizeof(struct fw_ofld_tx_data_wr), 16);
+		if (csk->wr_cred < credits_needed) {
+			cxgbi_tx_debug("csk 0x%p, skb len %u/%u, "
+					"wr %d < %u.\n",
+					csk, skb->len, skb->data_len,
+					credits_needed, csk->wr_cred);
+			break;
+		}
+		__skb_unlink(skb, &csk->write_queue);
+		set_wr_txq(skb, CPL_PRIORITY_DATA, csk->txq_idx);
+		skb->csum = credits_needed;
+		csk->wr_cred -= credits_needed;
+		csk->wr_una_cred += credits_needed;
+		cxgb4i_sock_enqueue_wr(csk, skb);
+		cxgbi_tx_debug("csk 0x%p, enqueue, skb len %u/%u, "
+				"wr %d, left %u, unack %u.\n",
+				csk, skb->len, skb->data_len,
+				credits_needed, csk->wr_cred,
+				csk->wr_una_cred);
+		if (likely(cxgb4i_skb_flags(skb) & CTP_SKCBF_NEED_HDR)) {
+			len += ulp_extra_len(skb);
+			if (!cxgbi_sock_flag(csk, CTPF_TX_DATA_SENT)) {
+				cxgb4i_sock_send_tx_flowc_wr(csk);
+				skb->csum += 5;
+				csk->wr_cred -= 5;
+				csk->wr_una_cred += 5;
+			}
+			if ((req_completion &&
+				csk->wr_una_cred == credits_needed) ||
+				(cxgb4i_skb_flags(skb) & CTP_SKCBF_COMPL) ||
+				csk->wr_una_cred >= csk->wr_max_cred >> 1) {
+				req_completion = 1;
+				csk->wr_una_cred = 0;
+			}
+			cxgb4i_sock_make_tx_data_wr(csk, skb, dlen, len,
+						    credits_needed,
+						    req_completion);
+			csk->snd_nxt += len;
+			if (req_completion)
+				cxgb4i_skb_flags(skb) &= ~CTP_SKCBF_NEED_HDR;
+		}
+		total_size += skb->truesize;
+		t4_set_arp_err_handler(skb, csk,
+					cxgb4i_sock_arp_failure_discard);
+		cxgb4_l2t_send(csk->cdev->ports[csk->port_id], skb, csk->l2t);
+	}
+	return total_size;
+}
+
+static inline void cxgb4i_sock_free_atid(struct cxgbi_sock *csk)
+{
+	struct cxgb4i_snic *snic = cxgbi_cdev_priv(csk->cdev);
+
+	cxgb4_free_atid(snic->lldi.tids, csk->atid);
+	cxgbi_sock_put(csk);
+}
+
+static void cxgb4i_sock_established(struct cxgbi_sock *csk, u32 snd_isn,
+					unsigned int opt)
+{
+	cxgbi_conn_debug("csk 0x%p, state %u.\n", csk, csk->state);
+
+	csk->write_seq = csk->snd_nxt = csk->snd_una = snd_isn;
+	/*
+	 * Causes the first RX_DATA_ACK to supply any Rx credits we couldn't
+	 * pass through opt0.
+	 */
+	if (cxgb4i_rcv_win > (RCV_BUFSIZ_MASK << 10))
+		csk->rcv_wup -= cxgb4i_rcv_win - (RCV_BUFSIZ_MASK << 10);
+	dst_confirm(csk->dst);
+	smp_mb();
+	cxgbi_sock_set_state(csk, CTP_ESTABLISHED);
+}
+
+static int cxgb4i_cpl_act_establish(struct cxgb4i_snic *snic,
+				    struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk;
+	struct cpl_act_establish *req = (struct cpl_act_establish *)skb->data;
+	unsigned int hwtid = GET_TID(req);
+	unsigned int atid = GET_TID_TID(ntohl(req->tos_atid));
+	struct tid_info *t = snic->lldi.tids;
+	u32 rcv_isn = be32_to_cpu(req->rcv_isn);
+
+	csk = lookup_atid(t, atid);
+	if (unlikely(!csk)) {
+		cxgbi_log_error("can't find connection for tid %u\n", hwtid);
+		return CPL_RET_UNKNOWN_TID;
+	}
+	cxgbi_conn_debug("csk 0x%p, state %u, flag 0x%lx\n",
+				csk, csk->state, csk->flags);
+	csk->hwtid = hwtid;
+	cxgbi_sock_hold(csk);
+	cxgb4_insert_tid(snic->lldi.tids, csk, hwtid);
+	cxgb4i_sock_free_atid(csk);
+	spin_lock_bh(&csk->lock);
+
+	if (unlikely(csk->state != CTP_CONNECTING))
+		cxgbi_log_error("TID %u expected SYN_SENT, got EST., s %u\n",
+				csk->hwtid, csk->state);
+	csk->copied_seq = csk->rcv_wup = csk->rcv_nxt = rcv_isn;
+	cxgb4i_sock_established(csk, ntohl(req->snd_isn), ntohs(req->tcp_opt));
+	__kfree_skb(skb);
+
+	if (unlikely(cxgbi_sock_flag(csk, CTPF_ACTIVE_CLOSE_NEEDED)))
+		cxgb4i_sock_send_abort_req(csk);
+	else {
+		if (skb_queue_len(&csk->write_queue))
+			cxgb4i_sock_push_tx_frames(csk, 1);
+		cxgbi_conn_tx_open(csk);
+	}
+
+	if (csk->retry_timer.function) {
+		del_timer(&csk->retry_timer);
+		csk->retry_timer.function = NULL;
+	}
+
+	spin_unlock_bh(&csk->lock);
+	return 0;
+}
+
+static int act_open_rpl_status_to_errno(int status)
+{
+	switch (status) {
+	case CPL_ERR_CONN_RESET:
+		return -ECONNREFUSED;
+	case CPL_ERR_ARP_MISS:
+		return -EHOSTUNREACH;
+	case CPL_ERR_CONN_TIMEDOUT:
+		return -ETIMEDOUT;
+	case CPL_ERR_TCAM_FULL:
+		return -ENOMEM;
+	case CPL_ERR_CONN_EXIST:
+		cxgbi_log_error("ACTIVE_OPEN_RPL: 4-tuple in use\n");
+		return -EADDRINUSE;
+	default:
+		return -EIO;
+	}
+}
+
+/*
+ * Return whether a failed active open has allocated a TID
+ */
+static inline int act_open_has_tid(int status)
+{
+	return status != CPL_ERR_TCAM_FULL && status != CPL_ERR_CONN_EXIST &&
+		status != CPL_ERR_ARP_MISS;
+}
+
+static void cxgb4i_sock_act_open_retry_timer(unsigned long data)
+{
+	struct sk_buff *skb;
+	struct cxgbi_sock *csk = (struct cxgbi_sock *)data;
+
+	cxgbi_conn_debug("csk 0x%p, state %u.\n", csk, csk->state);
+	spin_lock_bh(&csk->lock);
+	skb = alloc_skb(sizeof(struct cpl_act_open_req), GFP_ATOMIC);
+	if (!skb)
+		cxgb4i_fail_act_open(csk, -ENOMEM);
+	else {
+		unsigned int qid_atid  = csk->rss_qid << 14;
+		qid_atid |= (unsigned int)csk->atid;
+		skb->sk = (struct sock *)csk;
+		t4_set_arp_err_handler(skb, csk,
+					cxgb4i_act_open_req_arp_failure);
+		cxgb4i_sock_make_act_open_req(csk, skb, qid_atid, csk->l2t);
+		cxgb4_l2t_send(csk->cdev->ports[csk->port_id], skb, csk->l2t);
+	}
+	spin_unlock_bh(&csk->lock);
+	cxgbi_sock_put(csk);
+}
+
+static int cxgb4i_cpl_act_open_rpl(struct cxgb4i_snic *snic,
+				   struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk;
+	struct cpl_act_open_rpl *rpl = (struct cpl_act_open_rpl *)skb->data;
+	unsigned int atid =
+		GET_TID_TID(GET_AOPEN_ATID(be32_to_cpu(rpl->atid_status)));
+	struct tid_info *t = snic->lldi.tids;
+	unsigned int status = GET_AOPEN_STATUS(be32_to_cpu(rpl->atid_status));
+
+	csk = lookup_atid(t, atid);
+	if (unlikely(!csk)) {
+		cxgbi_log_error("can't find connection for tid %u\n", atid);
+		return CPL_RET_UNKNOWN_TID;
+	}
+
+	cxgbi_sock_hold(csk);
+	spin_lock_bh(&csk->lock);
+	cxgbi_conn_debug("rcv, status 0x%x, csk 0x%p, csk->state %u, "
+			"csk->flag 0x%lx, csk->atid %u.\n",
+			status, csk, csk->state, csk->flags, csk->hwtid);
+
+	if (status & act_open_has_tid(status))
+		cxgb4_remove_tid(snic->lldi.tids, csk->port_id, GET_TID(rpl));
+
+	if (status == CPL_ERR_CONN_EXIST &&
+	    csk->retry_timer.function != cxgb4i_sock_act_open_retry_timer) {
+		csk->retry_timer.function = cxgb4i_sock_act_open_retry_timer;
+		if (!mod_timer(&csk->retry_timer, jiffies + HZ / 2))
+			cxgbi_sock_hold(csk);
+	} else
+		cxgb4i_fail_act_open(csk,
+				     act_open_rpl_status_to_errno(status));
+
+	__kfree_skb(skb);
+	spin_unlock_bh(&csk->lock);
+	cxgbi_sock_put(csk);
+	return 0;
+}
+
+static int cxgb4i_cpl_peer_close(struct cxgb4i_snic *snic, struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk;
+	struct cpl_peer_close *req = (struct cpl_peer_close *)skb->data;
+	unsigned int hwtid = GET_TID(req);
+	struct tid_info *t = snic->lldi.tids;
+
+	csk = lookup_tid(t, hwtid);
+	if (unlikely(!csk)) {
+		cxgbi_log_error("can't find connection for tid %u\n", hwtid);
+		return CPL_RET_UNKNOWN_TID;
+	}
+
+	cxgbi_sock_hold(csk);
+	spin_lock_bh(&csk->lock);
+
+	if (cxgbi_sock_flag(csk, CTPF_ABORT_RPL_PENDING))
+		goto out;
+
+	switch (csk->state) {
+	case CTP_ESTABLISHED:
+		cxgbi_sock_set_state(csk, CTP_PASSIVE_CLOSE);
+		break;
+	case CTP_ACTIVE_CLOSE:
+		cxgbi_sock_set_state(csk, CTP_CLOSE_WAIT_2);
+		break;
+	case CTP_CLOSE_WAIT_1:
+		cxgbi_sock_closed(csk);
+		break;
+	case CTP_ABORTING:
+		break;
+	default:
+		cxgbi_log_error("peer close, TID %u in bad state %u\n",
+				csk->hwtid, csk->state);
+	}
+
+	cxgbi_sock_conn_closing(csk);
+out:
+	__kfree_skb(skb);
+	spin_unlock_bh(&csk->lock);
+	cxgbi_sock_put(csk);
+	return 0;
+}
+
+static int cxgb4i_cpl_close_con_rpl(struct cxgb4i_snic *snic,
+				    struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk;
+	struct cpl_close_con_rpl *rpl = (struct cpl_close_con_rpl *)skb->data;
+	unsigned int hwtid = GET_TID(rpl);
+	struct tid_info *t = snic->lldi.tids;
+
+	csk = lookup_tid(t, hwtid);
+	if (unlikely(!csk)) {
+		cxgbi_log_error("can't find connection for tid %u\n", hwtid);
+		return CPL_RET_UNKNOWN_TID;
+	}
+
+	cxgbi_sock_hold(csk);
+	spin_lock_bh(&csk->lock);
+	cxgbi_conn_debug("csk 0x%p, state %u, flag 0x%lx.\n",
+			csk, csk->state, csk->flags);
+	csk->snd_una = ntohl(rpl->snd_nxt) - 1;
+
+	if (cxgbi_sock_flag(csk, CTPF_ABORT_RPL_PENDING))
+		goto out;
+
+	switch (csk->state) {
+	case CTP_ACTIVE_CLOSE:
+		cxgbi_sock_set_state(csk, CTP_CLOSE_WAIT_1);
+		break;
+	case CTP_CLOSE_WAIT_1:
+	case CTP_CLOSE_WAIT_2:
+		cxgbi_sock_closed(csk);
+		break;
+	case CTP_ABORTING:
+		break;
+	default:
+		cxgbi_log_error("close_rpl, TID %u in bad state %u\n",
+				csk->hwtid, csk->state);
+	}
+out:
+	spin_unlock_bh(&csk->lock);
+	cxgbi_sock_put(csk);
+	kfree_skb(skb);
+	return 0;
+}
+
+static int abort_status_to_errno(struct cxgbi_sock *csk, int abort_reason,
+								int *need_rst)
+{
+	switch (abort_reason) {
+	case CPL_ERR_BAD_SYN: /* fall through */
+	case CPL_ERR_CONN_RESET:
+		return csk->state > CTP_ESTABLISHED ?
+			-EPIPE : -ECONNRESET;
+	case CPL_ERR_XMIT_TIMEDOUT:
+	case CPL_ERR_PERSIST_TIMEDOUT:
+	case CPL_ERR_FINWAIT2_TIMEDOUT:
+	case CPL_ERR_KEEPALIVE_TIMEDOUT:
+		return -ETIMEDOUT;
+	default:
+		return -EIO;
+	}
+}
+
+/*
+ * Returns whether an ABORT_REQ_RSS message is a negative advice.
+ */
+static inline int is_neg_adv_abort(unsigned int status)
+{
+	return status == CPL_ERR_RTX_NEG_ADVICE ||
+		status == CPL_ERR_PERSIST_NEG_ADVICE;
+}
+
+static int cxgb4i_cpl_abort_req_rss(struct cxgb4i_snic *snic,
+				    struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk;
+	struct cpl_abort_req_rss *req = (struct cpl_abort_req_rss *)skb->data;
+	unsigned int hwtid = GET_TID(req);
+	struct tid_info *t = snic->lldi.tids;
+	int rst_status = CPL_ABORT_NO_RST;
+
+	csk = lookup_tid(t, hwtid);
+	if (unlikely(!csk)) {
+		cxgbi_log_error("can't find connection for tid %u\n", hwtid);
+		return CPL_RET_UNKNOWN_TID;
+	}
+
+	if (is_neg_adv_abort(req->status)) {
+		__kfree_skb(skb);
+		return 0;
+	}
+
+	cxgbi_sock_hold(csk);
+	spin_lock_bh(&csk->lock);
+
+	if (!cxgbi_sock_flag(csk, CTPF_ABORT_REQ_RCVD)) {
+		cxgbi_sock_set_flag(csk, CTPF_ABORT_REQ_RCVD);
+		cxgbi_sock_set_state(csk, CTP_ABORTING);
+		__kfree_skb(skb);
+		goto out;
+	}
+
+	cxgbi_sock_clear_flag(csk, CTPF_ABORT_REQ_RCVD);
+	cxgb4i_sock_send_abort_rpl(csk, rst_status);
+
+	if (!cxgbi_sock_flag(csk, CTPF_ABORT_RPL_PENDING)) {
+		csk->err = abort_status_to_errno(csk, req->status,
+						 &rst_status);
+		cxgbi_sock_closed(csk);
+	}
+out:
+	spin_unlock_bh(&csk->lock);
+	cxgbi_sock_put(csk);
+	return 0;
+}
+
+static int cxgb4i_cpl_abort_rpl_rss(struct cxgb4i_snic *snic,
+				    struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk;
+	struct cpl_abort_rpl_rss *rpl = (struct cpl_abort_rpl_rss *)skb->data;
+	unsigned int hwtid = GET_TID(rpl);
+	struct tid_info *t = snic->lldi.tids;
+
+	if (rpl->status == CPL_ERR_ABORT_FAILED)
+		goto out;
+
+	csk = lookup_tid(t, hwtid);
+	if (unlikely(!csk)) {
+		cxgbi_log_error("can't find connection for tid %u\n", hwtid);
+		goto out;
+	}
+
+	cxgbi_sock_hold(csk);
+	spin_lock_bh(&csk->lock);
+
+	if (cxgbi_sock_flag(csk, CTPF_ABORT_RPL_PENDING)) {
+		if (!cxgbi_sock_flag(csk, CTPF_ABORT_RPL_RCVD))
+			cxgbi_sock_set_flag(csk, CTPF_ABORT_RPL_RCVD);
+		else {
+			cxgbi_sock_clear_flag(csk, CTPF_ABORT_RPL_RCVD);
+			cxgbi_sock_clear_flag(csk, CTPF_ABORT_RPL_PENDING);
+
+			if (cxgbi_sock_flag(csk, CTPF_ABORT_REQ_RCVD))
+				cxgbi_log_error("tid %u, ABORT_RPL_RSS\n",
+						csk->hwtid);
+
+			cxgbi_sock_closed(csk);
+		}
+	}
+
+	spin_unlock_bh(&csk->lock);
+	cxgbi_sock_put(csk);
+out:
+	__kfree_skb(skb);
+	return 0;
+}
+
+static int cxgb4i_cpl_iscsi_hdr(struct cxgb4i_snic *snic, struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk;
+	struct cpl_iscsi_hdr *cpl = (struct cpl_iscsi_hdr *)skb->data;
+	unsigned int hwtid = GET_TID(cpl);
+	struct tid_info *t = snic->lldi.tids;
+	struct sk_buff *lskb;
+
+	csk = lookup_tid(t, hwtid);
+	if (unlikely(!csk)) {
+		cxgbi_log_error("can't find connection for tid %u\n", hwtid);
+		return CPL_RET_UNKNOWN_TID;
+	}
+
+	spin_lock_bh(&csk->lock);
+
+	if (unlikely(csk->state >= CTP_PASSIVE_CLOSE)) {
+		if (csk->state != CTP_ABORTING)
+			goto abort_conn;
+	}
+
+	cxgb4i_skb_tcp_seq(skb) = ntohl(cpl->seq);
+	skb_reset_transport_header(skb);
+	__skb_pull(skb, sizeof(*cpl));
+	__pskb_trim(skb, ntohs(cpl->len));
+
+	if (!csk->skb_ulp_lhdr) {
+		unsigned char *bhs;
+		unsigned int hlen, dlen;
+
+		csk->skb_ulp_lhdr = skb;
+		lskb = csk->skb_ulp_lhdr;
+		cxgb4i_skb_flags(lskb) = CTP_SKCBF_HDR_RCVD;
+
+		if (cxgb4i_skb_tcp_seq(lskb) != csk->rcv_nxt) {
+			cxgbi_log_error("tid 0x%x, CPL_ISCSI_HDR, bad seq got "
+					"0x%x, exp 0x%x\n",
+					csk->hwtid,
+					cxgb4i_skb_tcp_seq(lskb),
+					csk->rcv_nxt);
+			goto abort_conn;
+		}
+
+		bhs = lskb->data;
+		hlen = ntohs(cpl->len);
+		dlen = ntohl(*(unsigned int *)(bhs + 4)) & 0xFFFFFF;
+
+		if ((hlen + dlen) != ntohs(cpl->pdu_len_ddp) - 40) {
+			cxgbi_log_error("tid 0x%x, CPL_ISCSI_HDR, pdu len "
+					"mismatch %u != %u + %u, seq 0x%x\n",
+					csk->hwtid,
+					ntohs(cpl->pdu_len_ddp) - 40,
+					hlen, dlen, cxgb4i_skb_tcp_seq(skb));
+		}
+		cxgb4i_skb_rx_pdulen(skb) = hlen + dlen;
+		if (dlen)
+			cxgb4i_skb_rx_pdulen(skb) += csk->dcrc_len;
+		cxgb4i_skb_rx_pdulen(skb) =
+			((cxgb4i_skb_rx_pdulen(skb) + 3) & (~3));
+		csk->rcv_nxt += cxgb4i_skb_rx_pdulen(skb);
+	} else {
+		lskb = csk->skb_ulp_lhdr;
+		cxgb4i_skb_flags(lskb) |= CTP_SKCBF_DATA_RCVD;
+		cxgb4i_skb_flags(skb) = CTP_SKCBF_DATA_RCVD;
+		cxgbi_log_debug("csk 0x%p, tid 0x%x skb 0x%p, pdu data, "
+				" header 0x%p.\n",
+				csk, csk->hwtid, skb, lskb);
+	}
+
+	__skb_queue_tail(&csk->receive_queue, skb);
+	spin_unlock_bh(&csk->lock);
+	return 0;
+abort_conn:
+	cxgb4i_sock_send_abort_req(csk);
+	__kfree_skb(skb);
+	spin_unlock_bh(&csk->lock);
+	return -EINVAL;
+}
+
+static int cxgb4i_cpl_rx_data_ddp(struct cxgb4i_snic *snic, struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk;
+	struct sk_buff *lskb;
+	struct cpl_rx_data_ddp *rpl = (struct cpl_rx_data_ddp *)skb->data;
+	unsigned int hwtid = GET_TID(rpl);
+	struct tid_info *t = snic->lldi.tids;
+	unsigned int status;
+
+	csk = lookup_tid(t, hwtid);
+	if (unlikely(!csk)) {
+		cxgbi_log_error("can't find connection for tid %u\n", hwtid);
+		return CPL_RET_UNKNOWN_TID;
+	}
+
+	spin_lock_bh(&csk->lock);
+
+	if (unlikely(csk->state >= CTP_PASSIVE_CLOSE)) {
+		if (csk->state != CTP_ABORTING)
+			goto abort_conn;
+	}
+
+	if (!csk->skb_ulp_lhdr) {
+		cxgbi_log_error("tid 0x%x, rcv RX_DATA_DDP w/o pdu header\n",
+				csk->hwtid);
+		goto abort_conn;
+	}
+
+	lskb = csk->skb_ulp_lhdr;
+	cxgb4i_skb_flags(lskb) |= CTP_SKCBF_STATUS_RCVD;
+
+	if (ntohs(rpl->len) != cxgb4i_skb_rx_pdulen(lskb)) {
+		cxgbi_log_error("tid 0x%x, RX_DATA_DDP pdulen %u != %u.\n",
+				csk->hwtid, ntohs(rpl->len),
+				cxgb4i_skb_rx_pdulen(lskb));
+	}
+
+	cxgb4i_skb_rx_ddigest(lskb) = ntohl(rpl->ulp_crc);
+	status = ntohl(rpl->ddpvld);
+
+	if (status & (1 << RX_DDP_STATUS_HCRC_SHIFT)) {
+		cxgbi_log_info("ULP2_FLAG_HCRC_ERROR set\n");
+		cxgb4i_skb_ulp_mode(skb) |= ULP2_FLAG_HCRC_ERROR;
+	}
+	if (status & (1 << RX_DDP_STATUS_DCRC_SHIFT)) {
+		cxgbi_log_info("ULP2_FLAG_DCRC_ERROR set\n");
+		cxgb4i_skb_ulp_mode(skb) |= ULP2_FLAG_DCRC_ERROR;
+	}
+	if (status & (1 << RX_DDP_STATUS_PAD_SHIFT)) {
+		cxgbi_log_info("ULP2_FLAG_PAD_ERROR set\n");
+		cxgb4i_skb_ulp_mode(skb) |= ULP2_FLAG_PAD_ERROR;
+	}
+	if ((cxgb4i_skb_flags(lskb) & ULP2_FLAG_DATA_READY)) {
+		cxgbi_log_info("ULP2_FLAG_DATA_DDPED set\n");
+		cxgb4i_skb_ulp_mode(skb) |= ULP2_FLAG_DATA_DDPED;
+	}
+
+	csk->skb_ulp_lhdr = NULL;
+	__kfree_skb(skb);
+	cxgbi_conn_pdu_ready(csk);
+	spin_unlock_bh(&csk->lock);
+	return 0;
+abort_conn:
+	cxgb4i_sock_send_abort_req(csk);
+	__kfree_skb(skb);
+	spin_unlock_bh(&csk->lock);
+	return -EINVAL;
+}
+
+static void check_wr_invariants(const struct cxgbi_sock *csk)
+{
+	int pending = cxgb4i_sock_count_pending_wrs(csk);
+
+	if (unlikely(csk->wr_cred + pending != csk->wr_max_cred))
+		printk(KERN_ERR "TID %u: credit imbalance: avail %u, "
+				"pending %u, total should be %u\n",
+				csk->hwtid,
+				csk->wr_cred,
+				pending,
+				csk->wr_max_cred);
+}
+
+static int cxgb4i_cpl_fw4_ack(struct cxgb4i_snic *snic, struct sk_buff *skb)
+{
+	struct cxgbi_sock *csk;
+	struct cpl_fw4_ack *rpl = (struct cpl_fw4_ack *)skb->data;
+	unsigned int hwtid = GET_TID(rpl);
+	struct tid_info *t = snic->lldi.tids;
+	unsigned char credits;
+	unsigned int snd_una;
+
+	csk = lookup_tid(t, hwtid);
+	if (unlikely(!csk)) {
+		cxgbi_log_error("can't find connection for tid %u\n", hwtid);
+		kfree_skb(skb);
+		return CPL_RET_UNKNOWN_TID;
+	}
+
+	cxgbi_sock_hold(csk);
+	spin_lock_bh(&csk->lock);
+	credits = rpl->credits;
+	snd_una = be32_to_cpu(rpl->snd_una);
+	cxgbi_tx_debug("%u WR credits, avail %u, unack %u, TID %u, state %u\n",
+				credits, csk->wr_cred, csk->wr_una_cred,
+						csk->hwtid, csk->state);
+	csk->wr_cred += credits;
+
+	if (csk->wr_una_cred > csk->wr_max_cred - csk->wr_cred)
+		csk->wr_una_cred = csk->wr_max_cred - csk->wr_cred;
+
+	while (credits) {
+		struct sk_buff *p = cxgb4i_sock_peek_wr(csk);
+
+		if (unlikely(!p)) {
+			cxgbi_log_error("%u WR_ACK credits for TID %u with "
+					"nothing pending, state %u\n",
+					credits, csk->hwtid, csk->state);
+			break;
+		}
+
+		if (unlikely(credits < p->csum))
+			p->csum -= credits;
+		else {
+			cxgb4i_sock_dequeue_wr(csk);
+			credits -= p->csum;
+			kfree_skb(p);
+		}
+	}
+
+	check_wr_invariants(csk);
+
+	if (rpl->seq_vld) {
+		if (unlikely(before(snd_una, csk->snd_una))) {
+			cxgbi_log_error("TID %u, unexpected sequence # %u "
+					"in WR_ACK snd_una %u\n",
+					csk->hwtid, snd_una, csk->snd_una);
+			goto out_free;
+		}
+	}
+
+	if (csk->snd_una != snd_una) {
+		csk->snd_una = snd_una;
+		dst_confirm(csk->dst);
+	}
+
+	if (skb_queue_len(&csk->write_queue)) {
+		if (cxgb4i_sock_push_tx_frames(csk, 0))
+			cxgbi_conn_tx_open(csk);
+	} else
+		cxgbi_conn_tx_open(csk);
+
+	goto out;
+out_free:
+	__kfree_skb(skb);
+out:
+	spin_unlock_bh(&csk->lock);
+	cxgbi_sock_put(csk);
+	return 0;
+}
+
+static int cxgb4i_cpl_set_tcb_rpl(struct cxgb4i_snic *snic, struct sk_buff *skb)
+{
+	struct cpl_set_tcb_rpl *rpl = (struct cpl_set_tcb_rpl *)skb->data;
+	unsigned int hwtid = GET_TID(rpl);
+	struct tid_info *t = snic->lldi.tids;
+	struct cxgbi_sock *csk;
+
+	csk = lookup_tid(t, hwtid);
+	if (!csk) {
+		cxgbi_log_error("can't find connection for tid %u\n", hwtid);
+		__kfree_skb(skb);
+		return CPL_RET_UNKNOWN_TID;
+	}
+
+	spin_lock_bh(&csk->lock);
+
+	if (rpl->status != CPL_ERR_NONE) {
+		cxgbi_log_error("Unexpected SET_TCB_RPL status %u "
+				 "for tid %u\n", rpl->status, GET_TID(rpl));
+	}
+
+	__kfree_skb(skb);
+	spin_unlock_bh(&csk->lock);
+	return 0;
+}
+
+static void cxgb4i_sock_free_cpl_skbs(struct cxgbi_sock *csk)
+{
+	if (csk->cpl_close)
+		kfree_skb(csk->cpl_close);
+	if (csk->cpl_abort_req)
+		kfree_skb(csk->cpl_abort_req);
+	if (csk->cpl_abort_rpl)
+		kfree_skb(csk->cpl_abort_rpl);
+}
+
+static int cxgb4i_alloc_cpl_skbs(struct cxgbi_sock *csk)
+{
+	int wrlen;
+
+	wrlen = roundup(sizeof(struct cpl_close_con_req), 16);
+	csk->cpl_close = alloc_skb(wrlen, GFP_NOIO);
+	if (!csk->cpl_close)
+		return -ENOMEM;
+	skb_put(csk->cpl_close, wrlen);
+
+	wrlen = roundup(sizeof(struct cpl_abort_req), 16);
+	csk->cpl_abort_req = alloc_skb(wrlen, GFP_NOIO);
+	if (!csk->cpl_abort_req)
+		goto free_cpl_skbs;
+	skb_put(csk->cpl_abort_req, wrlen);
+
+	wrlen = roundup(sizeof(struct cpl_abort_rpl), 16);
+	csk->cpl_abort_rpl = alloc_skb(wrlen, GFP_NOIO);
+	if (!csk->cpl_abort_rpl)
+		goto free_cpl_skbs;
+	skb_put(csk->cpl_abort_rpl, wrlen);
+	return 0;
+free_cpl_skbs:
+	cxgb4i_sock_free_cpl_skbs(csk);
+	return -ENOMEM;
+}
+
+static void cxgb4i_sock_release_offload_resources(struct cxgbi_sock *csk)
+{
+
+	cxgb4i_sock_free_cpl_skbs(csk);
+
+	if (csk->wr_cred != csk->wr_max_cred) {
+		cxgb4i_sock_purge_wr_queue(csk);
+		cxgb4i_sock_reset_wr_list(csk);
+	}
+
+	if (csk->l2t) {
+		cxgb4_l2t_release(csk->l2t);
+		csk->l2t = NULL;
+	}
+
+	if (csk->state == CTP_CONNECTING)
+		cxgb4i_sock_free_atid(csk);
+	else {
+		struct cxgb4i_snic *snic = cxgbi_cdev_priv(csk->cdev);
+		cxgb4_remove_tid(snic->lldi.tids, 0, csk->hwtid);
+		cxgbi_sock_put(csk);
+	}
+
+	csk->dst = NULL;
+	csk->cdev = NULL;
+}
+
+static int cxgb4i_init_act_open(struct cxgbi_sock *csk,
+				struct net_device *dev)
+{
+	struct dst_entry *dst = csk->dst;
+	struct sk_buff *skb;
+	struct port_info *pi = netdev_priv(dev);
+	struct cxgb4i_snic *snic = cxgbi_cdev_priv(csk->cdev);
+	int offs;
+
+	cxgbi_conn_debug("csk 0x%p, state %u, flags 0x%lx\n",
+			csk, csk->state, csk->flags);
+
+	csk->atid = cxgb4_alloc_atid(snic->lldi.tids, csk);
+	if (csk->atid == -1) {
+		cxgbi_log_error("cannot alloc atid\n");
+		goto out_err;
+	}
+
+	csk->l2t = cxgb4_l2t_get(snic->lldi.l2t, csk->dst->neighbour, dev, 0);
+	if (!csk->l2t) {
+		cxgbi_log_error("cannot alloc l2t\n");
+		goto free_atid;
+	}
+
+	skb = alloc_skb(sizeof(struct cpl_act_open_req), GFP_NOIO);
+	if (!skb)
+		goto free_l2t;
+
+	skb->sk = (struct sock *)csk;
+	t4_set_arp_err_handler(skb, csk, cxgb4i_act_open_req_arp_failure);
+	cxgbi_sock_hold(csk);
+	offs = snic->lldi.ntxq / snic->lldi.nchan;
+	csk->txq_idx = pi->port_id * offs;
+	cxgbi_log_debug("csk->txq_idx : %d\n", csk->txq_idx);
+	offs = snic->lldi.nrxq / snic->lldi.nchan;
+	csk->rss_qid = snic->lldi.rxq_ids[pi->port_id * offs];
+	cxgbi_log_debug("csk->rss_qid : %d\n", csk->rss_qid);
+	csk->wr_max_cred = csk->wr_cred = snic->lldi.wr_cred;
+	csk->port_id = pi->port_id;
+	csk->tx_chan = cxgb4_port_chan(dev);
+	csk->smac_idx = csk->tx_chan << 1;
+	csk->wr_una_cred = 0;
+	csk->mss_idx = cxgbi_sock_select_mss(csk, dst_mtu(dst));
+	csk->err = 0;
+	cxgb4i_sock_reset_wr_list(csk);
+	cxgb4i_sock_make_act_open_req(csk, skb,
+					((csk->rss_qid << 14) |
+					 (csk->atid)), csk->l2t);
+	cxgb4_l2t_send(csk->cdev->ports[csk->port_id], skb, csk->l2t);
+	return 0;
+free_l2t:
+	cxgb4_l2t_release(csk->l2t);
+free_atid:
+	cxgb4i_sock_free_atid(csk);
+out_err:
+	return -EINVAL;;
+}
+
+void cxgb4i_sock_rx_credits(struct cxgbi_sock *csk, int copied)
+{
+	int must_send;
+	u32 credits;
+
+	if (csk->state != CTP_ESTABLISHED)
+		return;
+
+	credits = csk->copied_seq - csk->rcv_wup;
+	if (unlikely(!credits))
+		return;
+
+	if (unlikely(cxgb4i_rx_credit_thres == 0))
+		return;
+
+	must_send = credits + 16384 >= cxgb4i_rcv_win;
+
+	if (must_send || credits >= cxgb4i_rx_credit_thres)
+		csk->rcv_wup += cxgb4i_csk_send_rx_credits(csk, credits);
+}
+
+int cxgb4i_sock_send_pdus(struct cxgbi_sock *csk, struct sk_buff *skb)
+{
+	struct sk_buff *next;
+	int err, copied = 0;
+
+	spin_lock_bh(&csk->lock);
+
+	if (csk->state != CTP_ESTABLISHED) {
+		cxgbi_tx_debug("csk 0x%p, not in est. state %u.\n",
+			      csk, csk->state);
+		err = -EAGAIN;
+		goto out_err;
+	}
+
+	if (csk->err) {
+		cxgbi_tx_debug("csk 0x%p, err %d.\n", csk, csk->err);
+		err = -EPIPE;
+		goto out_err;
+	}
+
+	if (csk->write_seq - csk->snd_una >= cxgb4i_snd_win) {
+		cxgbi_tx_debug("csk 0x%p, snd %u - %u > %u.\n",
+				csk, csk->write_seq, csk->snd_una,
+				cxgb4i_snd_win);
+		err = -ENOBUFS;
+		goto out_err;
+	}
+
+	while (skb) {
+		int frags = skb_shinfo(skb)->nr_frags +
+				(skb->len != skb->data_len);
+
+		if (unlikely(skb_headroom(skb) < CXGB4I_TX_HEADER_LEN)) {
+			cxgbi_tx_debug("csk 0x%p, skb head.\n", csk);
+			err = -EINVAL;
+			goto out_err;
+		}
+
+		if (frags >= SKB_WR_LIST_SIZE) {
+			cxgbi_log_error("csk 0x%p, tx frags %d, len %u,%u.\n",
+					 csk, skb_shinfo(skb)->nr_frags,
+					 skb->len, skb->data_len);
+			err = -EINVAL;
+			goto out_err;
+		}
+
+		next = skb->next;
+		skb->next = NULL;
+		cxgb4i_sock_skb_entail(csk, skb, CTP_SKCBF_NO_APPEND |
+					CTP_SKCBF_NEED_HDR);
+		copied += skb->len;
+		csk->write_seq += skb->len + ulp_extra_len(skb);
+		skb = next;
+	}
+done:
+	if (likely(skb_queue_len(&csk->write_queue)))
+		cxgb4i_sock_push_tx_frames(csk, 1);
+	spin_unlock_bh(&csk->lock);
+	return copied;
+out_err:
+	if (copied == 0 && err == -EPIPE)
+		copied = csk->err ? csk->err : -EPIPE;
+	else
+		copied = err;
+	goto done;
+}
+
+static void tx_skb_setmode(struct sk_buff *skb, int hcrc, int dcrc)
+{
+	u8 submode = 0;
+
+	if (hcrc)
+		submode |= 1;
+	if (dcrc)
+		submode |= 2;
+	cxgb4i_skb_ulp_mode(skb) = (ULP_MODE_ISCSI << 4) | submode;
+}
+
+static inline __u16 get_skb_ulp_mode(struct sk_buff *skb)
+{
+	return cxgb4i_skb_ulp_mode(skb);
+}
+
+static inline __u16 get_skb_flags(struct sk_buff *skb)
+{
+	return cxgb4i_skb_flags(skb);
+}
+
+static inline __u32 get_skb_tcp_seq(struct sk_buff *skb)
+{
+	return cxgb4i_skb_tcp_seq(skb);
+}
+
+static inline __u32 get_skb_rx_pdulen(struct sk_buff *skb)
+{
+	return cxgb4i_skb_rx_pdulen(skb);
+}
+
+static cxgb4i_cplhandler_func cxgb4i_cplhandlers[NUM_CPL_CMDS] = {
+	[CPL_ACT_ESTABLISH] = cxgb4i_cpl_act_establish,
+	[CPL_ACT_OPEN_RPL] = cxgb4i_cpl_act_open_rpl,
+	[CPL_PEER_CLOSE] = cxgb4i_cpl_peer_close,
+	[CPL_ABORT_REQ_RSS] = cxgb4i_cpl_abort_req_rss,
+	[CPL_ABORT_RPL_RSS] = cxgb4i_cpl_abort_rpl_rss,
+	[CPL_CLOSE_CON_RPL] = cxgb4i_cpl_close_con_rpl,
+	[CPL_FW4_ACK] = cxgb4i_cpl_fw4_ack,
+	[CPL_ISCSI_HDR] = cxgb4i_cpl_iscsi_hdr,
+	[CPL_SET_TCB_RPL] = cxgb4i_cpl_set_tcb_rpl,
+	[CPL_RX_DATA_DDP] = cxgb4i_cpl_rx_data_ddp
+};
+
+int cxgb4i_ofld_init(struct cxgbi_device *cdev)
+{
+	struct cxgb4i_snic *snic = cxgbi_cdev_priv(cdev);
+	struct cxgbi_ports_map *ports;
+	int mapsize;
+
+	if (cxgb4i_max_connect > CXGB4I_MAX_CONN)
+		cxgb4i_max_connect = CXGB4I_MAX_CONN;
+
+	mapsize = (cxgb4i_max_connect * sizeof(struct cxgbi_sock));
+	ports = cxgbi_alloc_big_mem(sizeof(*ports) + mapsize, GFP_KERNEL);
+	if (!ports)
+		return -ENOMEM;
+
+	spin_lock_init(&ports->lock);
+	cdev->pmap = ports;
+	cdev->pmap->max_connect = cxgb4i_max_connect;
+	cdev->pmap->sport_base = cxgb4i_sport_base;
+	cdev->set_skb_txmode = tx_skb_setmode;
+	cdev->get_skb_ulp_mode = get_skb_ulp_mode;
+	cdev->get_skb_flags = get_skb_flags;
+	cdev->get_skb_tcp_seq = get_skb_tcp_seq;
+	cdev->get_skb_rx_pdulen = get_skb_rx_pdulen;
+	cdev->release_offload_resources = cxgb4i_sock_release_offload_resources;
+	cdev->sock_send_pdus = cxgb4i_sock_send_pdus;
+	cdev->send_abort_req = cxgb4i_sock_send_abort_req;
+	cdev->send_close_req = cxgb4i_sock_send_close_req;
+	cdev->sock_rx_credits = cxgb4i_sock_rx_credits;
+	cdev->alloc_cpl_skbs = cxgb4i_alloc_cpl_skbs;
+	cdev->init_act_open = cxgb4i_init_act_open;
+	snic->handlers = cxgb4i_cplhandlers;
+	return 0;
+}
+
+void cxgb4i_ofld_cleanup(struct cxgbi_device *cdev)
+{
+	struct cxgbi_sock *csk;
+	int i;
+
+	for (i = 0; i < cdev->pmap->max_connect; i++) {
+		if (cdev->pmap->port_csk[i]) {
+			csk = cdev->pmap->port_csk[i];
+			cdev->pmap->port_csk[i] = NULL;
+
+			cxgbi_sock_hold(csk);
+			spin_lock_bh(&csk->lock);
+			cxgbi_sock_closed(csk);
+			spin_unlock_bh(&csk->lock);
+			cxgbi_sock_put(csk);
+		}
+	}
+	cxgbi_free_big_mem(cdev->pmap);
+}
+
-- 
1.6.6.1

-- 
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To post to this group, send email to open-iscsi-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to open-iscsi+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.

^ permalink raw reply related

* [RESEND]cxgb4i_v4 submission
From: Rakesh Ranjan @ 2010-06-04 23:05 UTC (permalink / raw)
  To: LK-NetDev, LK-SCSIDev, LK-iSCSIDev
  Cc: LKML, Karen Xie, David Miller, James Bottomley, Mike Christie,
	Anish Bhatt

Resending the whole patchset due to missed [PATCH 1/3].

The following 3 patches add a new iscsi LLD driver cxgb4i to enable iscsi offload
support on Chelsio's new 1G and 10G cards. This is updated version of previous cxgb4i
patch. Please share you commnets after review.

Changes since cxgb4i_v3.1
1. Fixed issues reported by mike.
2. Made libcxgbi more generic.

[PATCH 1/3] cxgb4i_v4: add build support
[PATCH 2/3] cxgb4i_v4: libcxgbi library for handling common part in cxgb4i/cxgb3i
[PATCH 3/3] cxgb4i_v4: main driver files

Regards
Rakesh Ranjan

-- 
You received this message because you are subscribed to the Google Groups "open-iscsi" group.
To post to this group, send email to open-iscsi-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to open-iscsi+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox