Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next] net: more accurate network taps in transmit path
From: David Miller @ 2012-09-19 19:33 UTC (permalink / raw)
  To: eric.dumazet; +Cc: jamie.gloudon, netdev
In-Reply-To: <1348078872.26523.1233.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 19 Sep 2012 20:21:12 +0200

> On Wed, 2012-09-19 at 14:16 -0400, David Miller wrote:
> 
>> Are you really sure that all the network tap implementations can
>> handle software GSO segmented skbs using skb->next linkage?
>> 
>> Because that is what they can potentially see after this change.
> 
> I dont think so, because skb->next is NULL at the points I call the
> network tap.
> 
> Or did I miss something ?

Indeed, you're right.  I misread the control flow here.

Applied, thanks a lot Eric.

^ permalink raw reply

* Re: [PATCH 2/2] Using LP firmware for taking advantage of the low-power capabilities.
From: Arend van Spriel @ 2012-09-19 19:33 UTC (permalink / raw)
  To: Jarl Friis
  Cc: Larry Finger, Stefano Brivio, Gábor Stefanik,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	b43-dev-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
In-Reply-To: <CAOjsGA25PR9OCC_cAVn=gztk9YB6pNxCz=1mFjDxxQoc-tofEw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 09/19/2012 08:54 PM, Jarl Friis wrote:
>> Is the addition of your copyright to the driver
>> >warranted by this change?
> As far as I understand the copyright law: Yes, but I'm not an expert.
> Neither am I 100% sure what you mean.

You add your name to the copy notice of the file so you claim it for the 
entire file.

>> >For example, I have made much larger contributions
>> >to b43 over the years before I started doing reverse-engineering on this
>> >driver, but I never added my copyright.
> I suggest you do.

Typically, the names you see there are and/or were b43 maintainers.

>> >Your "Signed-off-by" implies
>> >copyright for the patch.
> The fact that I authored the patch implies copyright (even without
> Signed-off-by)
>
> Jarl

You can indeed claim it for your patch, but I think not beyond that. But 
I am no lawyer either so .....pffff.

Gr. AvS

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH net-next 11/11] sfc: Avoid generating over-length MC_CMD_FLUSH_RX_QUEUES request
From: Ben Hutchings @ 2012-09-19 19:18 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

MCDI supports requests up to 252 bytes long, which is only enough to
pass 63 RX queue IDs to MC_CMD_FLUSH_RX_QUEUES.  However a VF may have
up to 64 RX queues, and if we try to flush them all we will generate
an over-length request and BUG() in efx_mcdi_copyin().  Currently
all VF drivers limit themselves to 32 RX queues, so reducing the
limit to 63 does no harm.

Also add a BUILD_BUG_ON in efx_mcdi_flush_rxqs() so we remember to
deal with the same problem there if EFX_MAX_CHANNELS is increased.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/mcdi.c        |    3 +++
 drivers/net/ethernet/sfc/siena_sriov.c |    7 +++++++
 2 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mcdi.c b/drivers/net/ethernet/sfc/mcdi.c
index e855f4c..aea43cb 100644
--- a/drivers/net/ethernet/sfc/mcdi.c
+++ b/drivers/net/ethernet/sfc/mcdi.c
@@ -1195,6 +1195,9 @@ int efx_mcdi_flush_rxqs(struct efx_nic *efx)
 	__le32 *qid;
 	int rc, count;
 
+	BUILD_BUG_ON(EFX_MAX_CHANNELS >
+		     MC_CMD_FLUSH_RX_QUEUES_IN_QID_OFST_MAXNUM);
+
 	qid = kmalloc(EFX_MAX_CHANNELS * sizeof(*qid), GFP_KERNEL);
 	if (qid == NULL)
 		return -ENOMEM;
diff --git a/drivers/net/ethernet/sfc/siena_sriov.c b/drivers/net/ethernet/sfc/siena_sriov.c
index 9cb3b84..a8f48a4 100644
--- a/drivers/net/ethernet/sfc/siena_sriov.c
+++ b/drivers/net/ethernet/sfc/siena_sriov.c
@@ -21,6 +21,9 @@
 /* Number of longs required to track all the VIs in a VF */
 #define VI_MASK_LENGTH BITS_TO_LONGS(1 << EFX_VI_SCALE_MAX)
 
+/* Maximum number of RX queues supported */
+#define VF_MAX_RX_QUEUES 63
+
 /**
  * enum efx_vf_tx_filter_mode - TX MAC filtering behaviour
  * @VF_TX_FILTER_OFF: Disabled
@@ -578,6 +581,7 @@ static int efx_vfdi_init_rxq(struct efx_vf *vf)
 	efx_oword_t reg;
 
 	if (bad_vf_index(efx, vf_evq) || bad_vf_index(efx, vf_rxq) ||
+	    vf_rxq >= VF_MAX_RX_QUEUES ||
 	    bad_buf_count(buf_count, EFX_MAX_DMAQ_SIZE)) {
 		if (net_ratelimit())
 			netif_err(efx, hw, efx->net_dev,
@@ -683,6 +687,9 @@ static int efx_vfdi_fini_all_queues(struct efx_vf *vf)
 	__le32 *rxqs;
 	int rc;
 
+	BUILD_BUG_ON(VF_MAX_RX_QUEUES >
+		     MC_CMD_FLUSH_RX_QUEUES_IN_QID_OFST_MAXNUM);
+
 	rxqs = kmalloc(count * sizeof(*rxqs), GFP_KERNEL);
 	if (rxqs == NULL)
 		return VFDI_RC_ENOMEM;
-- 
1.7.7.6


-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH net-next 10/11] sfc: Bump version to 3.2
From: Ben Hutchings @ 2012-09-19 19:18 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

The key new feature for 3.2 is PTP support.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/net_driver.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 797dbed..c1a010c 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -37,7 +37,7 @@
  *
  **************************************************************************/
 
-#define EFX_DRIVER_VERSION	"3.1"
+#define EFX_DRIVER_VERSION	"3.2"
 
 #ifdef DEBUG
 #define EFX_BUG_ON_PARANOID(x) BUG_ON(x)
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH net-next 09/11] sfc: Expose FPGA bitfile partition through MTD
From: Ben Hutchings @ 2012-09-19 19:18 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/mtd.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mtd.c b/drivers/net/ethernet/sfc/mtd.c
index 8f4604d..08f825b 100644
--- a/drivers/net/ethernet/sfc/mtd.c
+++ b/drivers/net/ethernet/sfc/mtd.c
@@ -585,6 +585,7 @@ static const struct siena_nvram_type_info siena_nvram_types[] = {
 	[MC_CMD_NVRAM_TYPE_EXP_ROM_CFG_PORT1]	= { 1, "sfc_exp_rom_cfg" },
 	[MC_CMD_NVRAM_TYPE_PHY_PORT0]		= { 0, "sfc_phy_fw" },
 	[MC_CMD_NVRAM_TYPE_PHY_PORT1]		= { 1, "sfc_phy_fw" },
+	[MC_CMD_NVRAM_TYPE_FPGA]		= { 0, "sfc_fpga" },
 };
 
 static int siena_mtd_probe_partition(struct efx_nic *efx,
@@ -598,7 +599,8 @@ static int siena_mtd_probe_partition(struct efx_nic *efx,
 	bool protected;
 	int rc;
 
-	if (type >= ARRAY_SIZE(siena_nvram_types))
+	if (type >= ARRAY_SIZE(siena_nvram_types) ||
+	    siena_nvram_types[type].name == NULL)
 		return -ENODEV;
 
 	info = &siena_nvram_types[type];
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH net-next 08/11] sfc: Support variable-length response to MCDI GET_BOARD_CFG
From: Ben Hutchings @ 2012-09-19 19:17 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/mcdi.c |    6 ++++--
 drivers/net/ethernet/sfc/mtd.c  |    3 ++-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mcdi.c b/drivers/net/ethernet/sfc/mcdi.c
index 578e5f7..e855f4c 100644
--- a/drivers/net/ethernet/sfc/mcdi.c
+++ b/drivers/net/ethernet/sfc/mcdi.c
@@ -683,12 +683,14 @@ int efx_mcdi_get_board_cfg(struct efx_nic *efx, u8 *mac_address,
 	if (mac_address)
 		memcpy(mac_address, outbuf + offset, ETH_ALEN);
 	if (fw_subtype_list) {
+		/* Byte-swap and truncate or zero-pad as necessary */
 		offset = MC_CMD_GET_BOARD_CFG_OUT_FW_SUBTYPE_LIST_OFST;
 		for (i = 0;
-		     i < MC_CMD_GET_BOARD_CFG_OUT_FW_SUBTYPE_LIST_MINNUM;
+		     i < MC_CMD_GET_BOARD_CFG_OUT_FW_SUBTYPE_LIST_MAXNUM;
 		     i++) {
 			fw_subtype_list[i] =
-				le16_to_cpup((__le16 *)(outbuf + offset));
+				(offset + 2 <= outlen) ?
+				le16_to_cpup((__le16 *)(outbuf + offset)) : 0;
 			offset += 2;
 		}
 	}
diff --git a/drivers/net/ethernet/sfc/mtd.c b/drivers/net/ethernet/sfc/mtd.c
index 7581483..8f4604d 100644
--- a/drivers/net/ethernet/sfc/mtd.c
+++ b/drivers/net/ethernet/sfc/mtd.c
@@ -627,7 +627,8 @@ static int siena_mtd_get_fw_subtypes(struct efx_nic *efx,
 				     struct efx_mtd *efx_mtd)
 {
 	struct efx_mtd_partition *part;
-	uint16_t fw_subtype_list[MC_CMD_GET_BOARD_CFG_OUT_FW_SUBTYPE_LIST_MINNUM];
+	uint16_t fw_subtype_list[
+		MC_CMD_GET_BOARD_CFG_OUT_FW_SUBTYPE_LIST_MAXNUM];
 	int rc;
 
 	rc = efx_mcdi_get_board_cfg(efx, NULL, fw_subtype_list, NULL);
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH net-next 07/11] sfc: Convert firmware subtypes to native byte order in efx_mcdi_get_board_cfg()
From: Ben Hutchings @ 2012-09-19 19:17 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

On big-endian systems the MTD partition names currently have mangled
subtype numbers and are not recognised by the firmware update tool
(sfupdate).

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/mcdi.c |   18 +++++++++++-------
 1 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mcdi.c b/drivers/net/ethernet/sfc/mcdi.c
index 294df4b..578e5f7 100644
--- a/drivers/net/ethernet/sfc/mcdi.c
+++ b/drivers/net/ethernet/sfc/mcdi.c
@@ -661,9 +661,8 @@ int efx_mcdi_get_board_cfg(struct efx_nic *efx, u8 *mac_address,
 			   u16 *fw_subtype_list, u32 *capabilities)
 {
 	uint8_t outbuf[MC_CMD_GET_BOARD_CFG_OUT_LENMIN];
-	size_t outlen;
+	size_t outlen, offset, i;
 	int port_num = efx_port_num(efx);
-	int offset;
 	int rc;
 
 	BUILD_BUG_ON(MC_CMD_GET_BOARD_CFG_IN_LEN != 0);
@@ -683,11 +682,16 @@ int efx_mcdi_get_board_cfg(struct efx_nic *efx, u8 *mac_address,
 		: MC_CMD_GET_BOARD_CFG_OUT_MAC_ADDR_BASE_PORT0_OFST;
 	if (mac_address)
 		memcpy(mac_address, outbuf + offset, ETH_ALEN);
-	if (fw_subtype_list)
-		memcpy(fw_subtype_list,
-		       outbuf + MC_CMD_GET_BOARD_CFG_OUT_FW_SUBTYPE_LIST_OFST,
-		       MC_CMD_GET_BOARD_CFG_OUT_FW_SUBTYPE_LIST_MINNUM *
-		       sizeof(fw_subtype_list[0]));
+	if (fw_subtype_list) {
+		offset = MC_CMD_GET_BOARD_CFG_OUT_FW_SUBTYPE_LIST_OFST;
+		for (i = 0;
+		     i < MC_CMD_GET_BOARD_CFG_OUT_FW_SUBTYPE_LIST_MINNUM;
+		     i++) {
+			fw_subtype_list[i] =
+				le16_to_cpup((__le16 *)(outbuf + offset));
+			offset += 2;
+		}
+	}
 	if (capabilities) {
 		if (port_num)
 			*capabilities = MCDI_DWORD(outbuf,
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH net-next 06/11] sfc: Add support for IEEE-1588 PTP
From: Ben Hutchings @ 2012-09-19 19:17 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, linux-net-drivers, Richard Cochran, Rodolfo Giometti,
	Andrew Jackson
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

From: Stuart Hodgson <smhodgson@solarflare.com>

Add PTP IEEE-1588 support and make accesible via the PHC subsystem.

This work is based on prior code by Andrew Jackson

Signed-off-by: Stuart Hodgson <smhodgson@solarflare.com>
[bwh:
 - Add byte order conversion in efx_ptp_send_times()
 - Simplify conversion of PPS event times
 - Add the built-in vs module check to CONFIG_SFC_PTP dependencies]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/Kconfig      |    7 +
 drivers/net/ethernet/sfc/Makefile     |    1 +
 drivers/net/ethernet/sfc/efx.c        |    3 +
 drivers/net/ethernet/sfc/ethtool.c    |    1 +
 drivers/net/ethernet/sfc/mcdi.c       |    5 +
 drivers/net/ethernet/sfc/mcdi_pcol.h  |    1 +
 drivers/net/ethernet/sfc/net_driver.h |   19 +-
 drivers/net/ethernet/sfc/nic.h        |   36 +
 drivers/net/ethernet/sfc/ptp.c        | 1483 +++++++++++++++++++++++++++++++++
 drivers/net/ethernet/sfc/siena.c      |    1 +
 drivers/net/ethernet/sfc/tx.c         |    6 +
 11 files changed, 1562 insertions(+), 1 deletions(-)
 create mode 100644 drivers/net/ethernet/sfc/ptp.c

diff --git a/drivers/net/ethernet/sfc/Kconfig b/drivers/net/ethernet/sfc/Kconfig
index fb3cbc2..25906c1 100644
--- a/drivers/net/ethernet/sfc/Kconfig
+++ b/drivers/net/ethernet/sfc/Kconfig
@@ -34,3 +34,10 @@ config SFC_SRIOV
 	  This enables support for the SFC9000 I/O Virtualization
 	  features, allowing accelerated network performance in
 	  virtualized environments.
+config SFC_PTP
+	bool "Solarflare SFC9000-family PTP support"
+	depends on SFC && PTP_1588_CLOCK && !(SFC=y && PTP_1588_CLOCK=m)
+	default y
+	---help---
+	  This enables support for the Precision Time Protocol (PTP)
+	  on SFC9000-family NICs
diff --git a/drivers/net/ethernet/sfc/Makefile b/drivers/net/ethernet/sfc/Makefile
index ea1f8db..e11f2ec 100644
--- a/drivers/net/ethernet/sfc/Makefile
+++ b/drivers/net/ethernet/sfc/Makefile
@@ -5,5 +5,6 @@ sfc-y			+= efx.o nic.o falcon.o siena.o tx.o rx.o filter.o \
 			   mcdi.o mcdi_phy.o mcdi_mon.o
 sfc-$(CONFIG_SFC_MTD)	+= mtd.o
 sfc-$(CONFIG_SFC_SRIOV)	+= siena_sriov.o
+sfc-$(CONFIG_SFC_PTP)	+= ptp.o
 
 obj-$(CONFIG_SFC)	+= sfc.o
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 8b79a64..96bd980 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -1779,6 +1779,9 @@ static int efx_ioctl(struct net_device *net_dev, struct ifreq *ifr, int cmd)
 	struct efx_nic *efx = netdev_priv(net_dev);
 	struct mii_ioctl_data *data = if_mii(ifr);
 
+	if (cmd == SIOCSHWTSTAMP)
+		return efx_ptp_ioctl(efx, ifr, cmd);
+
 	/* Convert phy_id from older PRTAD/DEVAD format */
 	if ((cmd == SIOCGMIIREG || cmd == SIOCSMIIREG) &&
 	    (data->phy_id & 0xfc00) == 0x0400)
diff --git a/drivers/net/ethernet/sfc/ethtool.c b/drivers/net/ethernet/sfc/ethtool.c
index f8e7e20..9df556c 100644
--- a/drivers/net/ethernet/sfc/ethtool.c
+++ b/drivers/net/ethernet/sfc/ethtool.c
@@ -1174,6 +1174,7 @@ const struct ethtool_ops efx_ethtool_ops = {
 	.get_rxfh_indir_size	= efx_ethtool_get_rxfh_indir_size,
 	.get_rxfh_indir		= efx_ethtool_get_rxfh_indir,
 	.set_rxfh_indir		= efx_ethtool_set_rxfh_indir,
+	.get_ts_info		= efx_ptp_get_ts_info,
 	.get_module_info	= efx_ethtool_get_module_info,
 	.get_module_eeprom	= efx_ethtool_get_module_eeprom,
 };
diff --git a/drivers/net/ethernet/sfc/mcdi.c b/drivers/net/ethernet/sfc/mcdi.c
index 2707e86..294df4b 100644
--- a/drivers/net/ethernet/sfc/mcdi.c
+++ b/drivers/net/ethernet/sfc/mcdi.c
@@ -578,6 +578,11 @@ void efx_mcdi_process_event(struct efx_channel *channel,
 	case MCDI_EVENT_CODE_FLR:
 		efx_sriov_flr(efx, MCDI_EVENT_FIELD(*event, FLR_VF));
 		break;
+	case MCDI_EVENT_CODE_PTP_RX:
+	case MCDI_EVENT_CODE_PTP_FAULT:
+	case MCDI_EVENT_CODE_PTP_PPS:
+		efx_ptp_event(efx, event);
+		break;
 
 	default:
 		netif_err(efx, hw, efx->net_dev, "Unknown MCDI event 0x%x\n",
diff --git a/drivers/net/ethernet/sfc/mcdi_pcol.h b/drivers/net/ethernet/sfc/mcdi_pcol.h
index 5038932..9d426d0 100644
--- a/drivers/net/ethernet/sfc/mcdi_pcol.h
+++ b/drivers/net/ethernet/sfc/mcdi_pcol.h
@@ -289,6 +289,7 @@
 #define          MCDI_EVENT_CODE_TX_FLUSH  0xc /* enum */
 #define          MCDI_EVENT_CODE_PTP_RX  0xd /* enum */
 #define          MCDI_EVENT_CODE_PTP_FAULT  0xe /* enum */
+#define          MCDI_EVENT_CODE_PTP_PPS  0xf /* enum */
 #define       MCDI_EVENT_CMDDONE_DATA_OFST 0
 #define       MCDI_EVENT_CMDDONE_DATA_LBN 0
 #define       MCDI_EVENT_CMDDONE_DATA_WIDTH 32
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 0f0926e..797dbed 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -56,7 +56,8 @@
 #define EFX_MAX_CHANNELS 32U
 #define EFX_MAX_RX_QUEUES EFX_MAX_CHANNELS
 #define EFX_EXTRA_CHANNEL_IOV	0
-#define EFX_MAX_EXTRA_CHANNELS	1U
+#define EFX_EXTRA_CHANNEL_PTP	1
+#define EFX_MAX_EXTRA_CHANNELS	2U
 
 /* Checksum generation is a per-queue option in hardware, so each
  * queue visible to the networking core is backed by two hardware TX
@@ -68,6 +69,9 @@
 #define EFX_TXQ_TYPES		4
 #define EFX_MAX_TX_QUEUES	(EFX_TXQ_TYPES * EFX_MAX_CHANNELS)
 
+/* Forward declare Precision Time Protocol (PTP) support structure. */
+struct efx_ptp_data;
+
 struct efx_self_tests;
 
 /**
@@ -736,6 +740,7 @@ struct vfdi_status;
  *	%local_addr_list. Protected by %local_lock.
  * @local_lock: Mutex protecting %local_addr_list and %local_page_list.
  * @peer_work: Work item to broadcast peer addresses to VMs.
+ * @ptp_data: PTP state data
  * @monitor_work: Hardware monitor workitem
  * @biu_lock: BIU (bus interface unit) lock
  * @last_irq_cpu: Last CPU to handle a possible test interrupt.  This
@@ -863,6 +868,10 @@ struct efx_nic {
 	struct work_struct peer_work;
 #endif
 
+#ifdef CONFIG_SFC_PTP
+	struct efx_ptp_data *ptp_data;
+#endif
+
 	/* The following fields may be written more often */
 
 	struct delayed_work monitor_work ____cacheline_aligned_in_smp;
@@ -1125,5 +1134,13 @@ static inline void clear_bit_le(unsigned nr, unsigned char *addr)
 #define EFX_MAX_FRAME_LEN(mtu) \
 	((((mtu) + ETH_HLEN + VLAN_HLEN + 4/* FCS */ + 7) & ~7) + 16)
 
+static inline bool efx_xmit_with_hwtstamp(struct sk_buff *skb)
+{
+	return skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP;
+}
+static inline void efx_xmit_hwtstamp_pending(struct sk_buff *skb)
+{
+	skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
+}
 
 #endif /* EFX_NET_DRIVER_H */
diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h
index bab5cd9..438cef1 100644
--- a/drivers/net/ethernet/sfc/nic.h
+++ b/drivers/net/ethernet/sfc/nic.h
@@ -11,6 +11,7 @@
 #ifndef EFX_NIC_H
 #define EFX_NIC_H
 
+#include <linux/net_tstamp.h>
 #include <linux/i2c-algo-bit.h>
 #include "net_driver.h"
 #include "efx.h"
@@ -250,6 +251,41 @@ extern int efx_sriov_get_vf_config(struct net_device *dev, int vf,
 extern int efx_sriov_set_vf_spoofchk(struct net_device *net_dev, int vf,
 				     bool spoofchk);
 
+struct ethtool_ts_info;
+#ifdef CONFIG_SFC_PTP
+extern void efx_ptp_probe(struct efx_nic *efx);
+extern int efx_ptp_ioctl(struct efx_nic *efx, struct ifreq *ifr, int cmd);
+extern int efx_ptp_get_ts_info(struct net_device *net_dev,
+			       struct ethtool_ts_info *ts_info);
+extern bool efx_ptp_is_ptp_tx(struct efx_nic *efx, struct sk_buff *skb);
+extern int efx_ptp_tx(struct efx_nic *efx, struct sk_buff *skb);
+extern void efx_ptp_event(struct efx_nic *efx, efx_qword_t *ev);
+#else
+static inline void efx_ptp_probe(struct efx_nic *efx) {}
+static inline int efx_ptp_ioctl(struct efx_nic *efx, struct ifreq *ifr, int cmd)
+{
+	return -EOPNOTSUPP;
+}
+static inline int efx_ptp_get_ts_info(struct net_device *net_dev,
+				      struct ethtool_ts_info *ts_info)
+{
+	ts_info->so_timestamping = (SOF_TIMESTAMPING_SOFTWARE |
+				    SOF_TIMESTAMPING_RX_SOFTWARE);
+	ts_info->phc_index = -1;
+
+	return 0;
+}
+static inline bool efx_ptp_is_ptp_tx(struct efx_nic *efx, struct sk_buff *skb)
+{
+	return false;
+}
+static inline int efx_ptp_tx(struct efx_nic *efx, struct sk_buff *skb)
+{
+	return NETDEV_TX_OK;
+}
+static inline void efx_ptp_event(struct efx_nic *efx, efx_qword_t *ev) {}
+#endif
+
 extern const struct efx_nic_type falcon_a1_nic_type;
 extern const struct efx_nic_type falcon_b0_nic_type;
 extern const struct efx_nic_type siena_a0_nic_type;
diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c
new file mode 100644
index 0000000..2b07a4e
--- /dev/null
+++ b/drivers/net/ethernet/sfc/ptp.c
@@ -0,0 +1,1483 @@
+/****************************************************************************
+ * Driver for Solarflare Solarstorm network controllers and boards
+ * Copyright 2011 Solarflare Communications Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation, incorporated herein by reference.
+ */
+
+/* Theory of operation:
+ *
+ * PTP support is assisted by firmware running on the MC, which provides
+ * the hardware timestamping capabilities.  Both transmitted and received
+ * PTP event packets are queued onto internal queues for subsequent processing;
+ * this is because the MC operations are relatively long and would block
+ * block NAPI/interrupt operation.
+ *
+ * Receive event processing:
+ *	The event contains the packet's UUID and sequence number, together
+ *	with the hardware timestamp.  The PTP receive packet queue is searched
+ *	for this UUID/sequence number and, if found, put on a pending queue.
+ *	Packets not matching are delivered without timestamps (MCDI events will
+ *	always arrive after the actual packet).
+ *	It is important for the operation of the PTP protocol that the ordering
+ *	of packets between the event and general port is maintained.
+ *
+ * Work queue processing:
+ *	If work waiting, synchronise host/hardware time
+ *
+ *	Transmit: send packet through MC, which returns the transmission time
+ *	that is converted to an appropriate timestamp.
+ *
+ *	Receive: the packet's reception time is converted to an appropriate
+ *	timestamp.
+ */
+#include <linux/ip.h>
+#include <linux/udp.h>
+#include <linux/time.h>
+#include <linux/ktime.h>
+#include <linux/module.h>
+#include <linux/net_tstamp.h>
+#include <linux/pps_kernel.h>
+#include <linux/ptp_clock_kernel.h>
+#include "net_driver.h"
+#include "efx.h"
+#include "mcdi.h"
+#include "mcdi_pcol.h"
+#include "io.h"
+#include "regs.h"
+#include "nic.h"
+
+/* Maximum number of events expected to make up a PTP event */
+#define	MAX_EVENT_FRAGS			3
+
+/* Maximum delay, ms, to begin synchronisation */
+#define	MAX_SYNCHRONISE_WAIT_MS		2
+
+/* How long, at most, to spend synchronising */
+#define	SYNCHRONISE_PERIOD_NS		250000
+
+/* How often to update the shared memory time */
+#define	SYNCHRONISATION_GRANULARITY_NS	200
+
+/* Minimum permitted length of a (corrected) synchronisation time */
+#define	MIN_SYNCHRONISATION_NS		120
+
+/* Maximum permitted length of a (corrected) synchronisation time */
+#define	MAX_SYNCHRONISATION_NS		1000
+
+/* How many (MC) receive events that can be queued */
+#define	MAX_RECEIVE_EVENTS		8
+
+/* Length of (modified) moving average. */
+#define	AVERAGE_LENGTH			16
+
+/* How long an unmatched event or packet can be held */
+#define PKT_EVENT_LIFETIME_MS		10
+
+/* Offsets into PTP packet for identification.  These offsets are from the
+ * start of the IP header, not the MAC header.  Note that neither PTP V1 nor
+ * PTP V2 permit the use of IPV4 options.
+ */
+#define PTP_DPORT_OFFSET	22
+
+#define PTP_V1_VERSION_LENGTH	2
+#define PTP_V1_VERSION_OFFSET	28
+
+#define PTP_V1_UUID_LENGTH	6
+#define PTP_V1_UUID_OFFSET	50
+
+#define PTP_V1_SEQUENCE_LENGTH	2
+#define PTP_V1_SEQUENCE_OFFSET	58
+
+/* The minimum length of a PTP V1 packet for offsets, etc. to be valid:
+ * includes IP header.
+ */
+#define	PTP_V1_MIN_LENGTH	64
+
+#define PTP_V2_VERSION_LENGTH	1
+#define PTP_V2_VERSION_OFFSET	29
+
+/* Although PTP V2 UUIDs are comprised a ClockIdentity (8) and PortNumber (2),
+ * the MC only captures the last six bytes of the clock identity. These values
+ * reflect those, not the ones used in the standard.  The standard permits
+ * mapping of V1 UUIDs to V2 UUIDs with these same values.
+ */
+#define PTP_V2_MC_UUID_LENGTH	6
+#define PTP_V2_MC_UUID_OFFSET	50
+
+#define PTP_V2_SEQUENCE_LENGTH	2
+#define PTP_V2_SEQUENCE_OFFSET	58
+
+/* The minimum length of a PTP V2 packet for offsets, etc. to be valid:
+ * includes IP header.
+ */
+#define	PTP_V2_MIN_LENGTH	63
+
+#define	PTP_MIN_LENGTH		63
+
+#define PTP_ADDRESS		0xe0000181	/* 224.0.1.129 */
+#define PTP_EVENT_PORT		319
+#define PTP_GENERAL_PORT	320
+
+/* Annoyingly the format of the version numbers are different between
+ * versions 1 and 2 so it isn't possible to simply look for 1 or 2.
+ */
+#define	PTP_VERSION_V1		1
+
+#define	PTP_VERSION_V2		2
+#define	PTP_VERSION_V2_MASK	0x0f
+
+enum ptp_packet_state {
+	PTP_PACKET_STATE_UNMATCHED = 0,
+	PTP_PACKET_STATE_MATCHED,
+	PTP_PACKET_STATE_TIMED_OUT,
+	PTP_PACKET_STATE_MATCH_UNWANTED
+};
+
+/* NIC synchronised with single word of time only comprising
+ * partial seconds and full nanoseconds: 10^9 ~ 2^30 so 2 bits for seconds.
+ */
+#define	MC_NANOSECOND_BITS	30
+#define	MC_NANOSECOND_MASK	((1 << MC_NANOSECOND_BITS) - 1)
+#define	MC_SECOND_MASK		((1 << (32 - MC_NANOSECOND_BITS)) - 1)
+
+/* Maximum parts-per-billion adjustment that is acceptable */
+#define MAX_PPB			1000000
+
+/* Number of bits required to hold the above */
+#define	MAX_PPB_BITS		20
+
+/* Number of extra bits allowed when calculating fractional ns.
+ * EXTRA_BITS + MC_CMD_PTP_IN_ADJUST_BITS + MAX_PPB_BITS should
+ * be less than 63.
+ */
+#define	PPB_EXTRA_BITS		2
+
+/* Precalculate scale word to avoid long long division at runtime */
+#define	PPB_SCALE_WORD	((1LL << (PPB_EXTRA_BITS + MC_CMD_PTP_IN_ADJUST_BITS +\
+			MAX_PPB_BITS)) / 1000000000LL)
+
+#define PTP_SYNC_ATTEMPTS	4
+
+/**
+ * struct efx_ptp_match - Matching structure, stored in sk_buff's cb area.
+ * @words: UUID and (partial) sequence number
+ * @expiry: Time after which the packet should be delivered irrespective of
+ *            event arrival.
+ * @state: The state of the packet - whether it is ready for processing or
+ *         whether that is of no interest.
+ */
+struct efx_ptp_match {
+	u32 words[DIV_ROUND_UP(PTP_V1_UUID_LENGTH, 4)];
+	unsigned long expiry;
+	enum ptp_packet_state state;
+};
+
+/**
+ * struct efx_ptp_event_rx - A PTP receive event (from MC)
+ * @seq0: First part of (PTP) UUID
+ * @seq1: Second part of (PTP) UUID and sequence number
+ * @hwtimestamp: Event timestamp
+ */
+struct efx_ptp_event_rx {
+	struct list_head link;
+	u32 seq0;
+	u32 seq1;
+	ktime_t hwtimestamp;
+	unsigned long expiry;
+};
+
+/**
+ * struct efx_ptp_timeset - Synchronisation between host and MC
+ * @host_start: Host time immediately before hardware timestamp taken
+ * @seconds: Hardware timestamp, seconds
+ * @nanoseconds: Hardware timestamp, nanoseconds
+ * @host_end: Host time immediately after hardware timestamp taken
+ * @waitns: Number of nanoseconds between hardware timestamp being read and
+ *          host end time being seen
+ * @window: Difference of host_end and host_start
+ * @valid: Whether this timeset is valid
+ */
+struct efx_ptp_timeset {
+	u32 host_start;
+	u32 seconds;
+	u32 nanoseconds;
+	u32 host_end;
+	u32 waitns;
+	u32 window;	/* Derived: end - start, allowing for wrap */
+};
+
+/**
+ * struct efx_ptp_data - Precision Time Protocol (PTP) state
+ * @channel: The PTP channel
+ * @rxq: Receive queue (awaiting timestamps)
+ * @txq: Transmit queue
+ * @evt_list: List of MC receive events awaiting packets
+ * @evt_free_list: List of free events
+ * @evt_lock: Lock for manipulating evt_list and evt_free_list
+ * @rx_evts: Instantiated events (on evt_list and evt_free_list)
+ * @workwq: Work queue for processing pending PTP operations
+ * @work: Work task
+ * @reset_required: A serious error has occurred and the PTP task needs to be
+ *                  reset (disable, enable).
+ * @rxfilter_event: Receive filter when operating
+ * @rxfilter_general: Receive filter when operating
+ * @config: Current timestamp configuration
+ * @enabled: PTP operation enabled
+ * @mode: Mode in which PTP operating (PTP version)
+ * @evt_frags: Partly assembled PTP events
+ * @evt_frag_idx: Current fragment number
+ * @evt_code: Last event code
+ * @start: Address at which MC indicates ready for synchronisation
+ * @host_time_pps: Host time at last PPS
+ * @last_sync_ns: Last number of nanoseconds between readings when synchronising
+ * @base_sync_ns: Number of nanoseconds for last synchronisation.
+ * @base_sync_valid: Whether base_sync_time is valid.
+ * @current_adjfreq: Current ppb adjustment.
+ * @phc_clock: Pointer to registered phc device
+ * @phc_clock_info: Registration structure for phc device
+ * @pps_work: pps work task for handling pps events
+ * @pps_workwq: pps work queue
+ * @nic_ts_enabled: Flag indicating if NIC generated TS events are handled
+ * @txbuf: Buffer for use when transmitting (PTP) packets to MC (avoids
+ *         allocations in main data path).
+ * @debug_ptp_dir: PTP debugfs directory
+ * @missed_rx_sync: Number of packets received without syncrhonisation.
+ * @good_syncs: Number of successful synchronisations.
+ * @no_time_syncs: Number of synchronisations with no good times.
+ * @bad_sync_durations: Number of synchronisations with bad durations.
+ * @bad_syncs: Number of failed synchronisations.
+ * @last_sync_time: Number of nanoseconds for last synchronisation.
+ * @sync_timeouts: Number of synchronisation timeouts
+ * @fast_syncs: Number of synchronisations requiring short delay
+ * @min_sync_delta: Minimum time between event and synchronisation
+ * @max_sync_delta: Maximum time between event and synchronisation
+ * @average_sync_delta: Average time between event and synchronisation.
+ *                      Modified moving average.
+ * @last_sync_delta: Last time between event and synchronisation
+ * @mc_stats: Context value for MC statistics
+ * @timeset: Last set of synchronisation statistics.
+ */
+struct efx_ptp_data {
+	struct efx_channel *channel;
+	struct sk_buff_head rxq;
+	struct sk_buff_head txq;
+	struct list_head evt_list;
+	struct list_head evt_free_list;
+	spinlock_t evt_lock;
+	struct efx_ptp_event_rx rx_evts[MAX_RECEIVE_EVENTS];
+	struct workqueue_struct *workwq;
+	struct work_struct work;
+	bool reset_required;
+	u32 rxfilter_event;
+	u32 rxfilter_general;
+	bool rxfilter_installed;
+	struct hwtstamp_config config;
+	bool enabled;
+	unsigned int mode;
+	efx_qword_t evt_frags[MAX_EVENT_FRAGS];
+	int evt_frag_idx;
+	int evt_code;
+	struct efx_buffer start;
+	struct pps_event_time host_time_pps;
+	unsigned last_sync_ns;
+	unsigned base_sync_ns;
+	bool base_sync_valid;
+	s64 current_adjfreq;
+	struct ptp_clock *phc_clock;
+	struct ptp_clock_info phc_clock_info;
+	struct work_struct pps_work;
+	struct workqueue_struct *pps_workwq;
+	bool nic_ts_enabled;
+	u8 txbuf[ALIGN(MC_CMD_PTP_IN_TRANSMIT_LEN(
+			       MC_CMD_PTP_IN_TRANSMIT_PACKET_MAXNUM), 4)];
+	struct efx_ptp_timeset
+	timeset[MC_CMD_PTP_OUT_SYNCHRONIZE_TIMESET_MAXNUM];
+};
+
+static int efx_phc_adjfreq(struct ptp_clock_info *ptp, s32 delta);
+static int efx_phc_adjtime(struct ptp_clock_info *ptp, s64 delta);
+static int efx_phc_gettime(struct ptp_clock_info *ptp, struct timespec *ts);
+static int efx_phc_settime(struct ptp_clock_info *ptp,
+			   const struct timespec *e_ts);
+static int efx_phc_enable(struct ptp_clock_info *ptp,
+			  struct ptp_clock_request *request, int on);
+
+/* Enable MCDI PTP support. */
+static int efx_ptp_enable(struct efx_nic *efx)
+{
+	u8 inbuf[MC_CMD_PTP_IN_ENABLE_LEN];
+
+	MCDI_SET_DWORD(inbuf, PTP_IN_OP, MC_CMD_PTP_OP_ENABLE);
+	MCDI_SET_DWORD(inbuf, PTP_IN_ENABLE_QUEUE,
+		       efx->ptp_data->channel->channel);
+	MCDI_SET_DWORD(inbuf, PTP_IN_ENABLE_MODE, efx->ptp_data->mode);
+
+	return efx_mcdi_rpc(efx, MC_CMD_PTP, inbuf, sizeof(inbuf),
+			    NULL, 0, NULL);
+}
+
+/* Disable MCDI PTP support.
+ *
+ * Note that this function should never rely on the presence of ptp_data -
+ * may be called before that exists.
+ */
+static int efx_ptp_disable(struct efx_nic *efx)
+{
+	u8 inbuf[MC_CMD_PTP_IN_DISABLE_LEN];
+
+	MCDI_SET_DWORD(inbuf, PTP_IN_OP, MC_CMD_PTP_OP_DISABLE);
+	return efx_mcdi_rpc(efx, MC_CMD_PTP, inbuf, sizeof(inbuf),
+			    NULL, 0, NULL);
+}
+
+static void efx_ptp_deliver_rx_queue(struct sk_buff_head *q)
+{
+	struct sk_buff *skb;
+
+	while ((skb = skb_dequeue(q))) {
+		local_bh_disable();
+		netif_receive_skb(skb);
+		local_bh_enable();
+	}
+}
+
+static void efx_ptp_handle_no_channel(struct efx_nic *efx)
+{
+	netif_err(efx, drv, efx->net_dev,
+		  "ERROR: PTP requires MSI-X and 1 additional interrupt"
+		  "vector. PTP disabled\n");
+}
+
+/* Repeatedly send the host time to the MC which will capture the hardware
+ * time.
+ */
+static void efx_ptp_send_times(struct efx_nic *efx,
+			       struct pps_event_time *last_time)
+{
+	struct pps_event_time now;
+	struct timespec limit;
+	struct efx_ptp_data *ptp = efx->ptp_data;
+	struct timespec start;
+	int *mc_running = ptp->start.addr;
+
+	pps_get_ts(&now);
+	start = now.ts_real;
+	limit = now.ts_real;
+	timespec_add_ns(&limit, SYNCHRONISE_PERIOD_NS);
+
+	/* Write host time for specified period or until MC is done */
+	while ((timespec_compare(&now.ts_real, &limit) < 0) &&
+	       ACCESS_ONCE(*mc_running)) {
+		struct timespec update_time;
+		unsigned int host_time;
+
+		/* Don't update continuously to avoid saturating the PCIe bus */
+		update_time = now.ts_real;
+		timespec_add_ns(&update_time, SYNCHRONISATION_GRANULARITY_NS);
+		do {
+			pps_get_ts(&now);
+		} while ((timespec_compare(&now.ts_real, &update_time) < 0) &&
+			 ACCESS_ONCE(*mc_running));
+
+		/* Synchronise NIC with single word of time only */
+		host_time = (now.ts_real.tv_sec << MC_NANOSECOND_BITS |
+			     now.ts_real.tv_nsec);
+		/* Update host time in NIC memory */
+		_efx_writed(efx, cpu_to_le32(host_time),
+			    FR_CZ_MC_TREG_SMEM + MC_SMEM_P0_PTP_TIME_OFST);
+	}
+	*last_time = now;
+}
+
+/* Read a timeset from the MC's results and partial process. */
+static void efx_ptp_read_timeset(u8 *data, struct efx_ptp_timeset *timeset)
+{
+	unsigned start_ns, end_ns;
+
+	timeset->host_start = MCDI_DWORD(data, PTP_OUT_SYNCHRONIZE_HOSTSTART);
+	timeset->seconds = MCDI_DWORD(data, PTP_OUT_SYNCHRONIZE_SECONDS);
+	timeset->nanoseconds = MCDI_DWORD(data,
+					 PTP_OUT_SYNCHRONIZE_NANOSECONDS);
+	timeset->host_end = MCDI_DWORD(data, PTP_OUT_SYNCHRONIZE_HOSTEND),
+	timeset->waitns = MCDI_DWORD(data, PTP_OUT_SYNCHRONIZE_WAITNS);
+
+	/* Ignore seconds */
+	start_ns = timeset->host_start & MC_NANOSECOND_MASK;
+	end_ns = timeset->host_end & MC_NANOSECOND_MASK;
+	/* Allow for rollover */
+	if (end_ns < start_ns)
+		end_ns += NSEC_PER_SEC;
+	/* Determine duration of operation */
+	timeset->window = end_ns - start_ns;
+}
+
+/* Process times received from MC.
+ *
+ * Extract times from returned results, and establish the minimum value
+ * seen.  The minimum value represents the "best" possible time and events
+ * too much greater than this are rejected - the machine is, perhaps, too
+ * busy. A number of readings are taken so that, hopefully, at least one good
+ * synchronisation will be seen in the results.
+ */
+static int efx_ptp_process_times(struct efx_nic *efx, u8 *synch_buf,
+				 size_t response_length,
+				 const struct pps_event_time *last_time)
+{
+	unsigned number_readings = (response_length /
+			       MC_CMD_PTP_OUT_SYNCHRONIZE_TIMESET_LEN);
+	unsigned i;
+	unsigned min;
+	unsigned min_set = 0;
+	unsigned total;
+	unsigned ngood = 0;
+	unsigned last_good = 0;
+	struct efx_ptp_data *ptp = efx->ptp_data;
+	bool min_valid = false;
+	u32 last_sec;
+	u32 start_sec;
+	struct timespec delta;
+
+	if (number_readings == 0)
+		return -EAGAIN;
+
+	/* Find minimum value in this set of results, discarding clearly
+	 * erroneous results.
+	 */
+	for (i = 0; i < number_readings; i++) {
+		efx_ptp_read_timeset(synch_buf, &ptp->timeset[i]);
+		synch_buf += MC_CMD_PTP_OUT_SYNCHRONIZE_TIMESET_LEN;
+		if (ptp->timeset[i].window > SYNCHRONISATION_GRANULARITY_NS) {
+			if (min_valid) {
+				if (ptp->timeset[i].window < min_set)
+					min_set = ptp->timeset[i].window;
+			} else {
+				min_valid = true;
+				min_set = ptp->timeset[i].window;
+			}
+		}
+	}
+
+	if (min_valid) {
+		if (ptp->base_sync_valid && (min_set > ptp->base_sync_ns))
+			min = ptp->base_sync_ns;
+		else
+			min = min_set;
+	} else {
+		min = SYNCHRONISATION_GRANULARITY_NS;
+	}
+
+	/* Discard excessively long synchronise durations.  The MC times
+	 * when it finishes reading the host time so the corrected window
+	 * time should be fairly constant for a given platform.
+	 */
+	total = 0;
+	for (i = 0; i < number_readings; i++)
+		if (ptp->timeset[i].window > ptp->timeset[i].waitns) {
+			unsigned win;
+
+			win = ptp->timeset[i].window - ptp->timeset[i].waitns;
+			if (win >= MIN_SYNCHRONISATION_NS &&
+			    win < MAX_SYNCHRONISATION_NS) {
+				total += ptp->timeset[i].window;
+				ngood++;
+				last_good = i;
+			}
+		}
+
+	if (ngood == 0) {
+		netif_warn(efx, drv, efx->net_dev,
+			   "PTP no suitable synchronisations %dns %dns\n",
+			   ptp->base_sync_ns, min_set);
+		return -EAGAIN;
+	}
+
+	/* Average minimum this synchronisation */
+	ptp->last_sync_ns = DIV_ROUND_UP(total, ngood);
+	if (!ptp->base_sync_valid || (ptp->last_sync_ns < ptp->base_sync_ns)) {
+		ptp->base_sync_valid = true;
+		ptp->base_sync_ns = ptp->last_sync_ns;
+	}
+
+	/* Calculate delay from actual PPS to last_time */
+	delta.tv_nsec =
+		ptp->timeset[last_good].nanoseconds +
+		last_time->ts_real.tv_nsec -
+		(ptp->timeset[last_good].host_start & MC_NANOSECOND_MASK);
+
+	/* It is possible that the seconds rolled over between taking
+	 * the start reading and the last value written by the host.  The
+	 * timescales are such that a gap of more than one second is never
+	 * expected.
+	 */
+	start_sec = ptp->timeset[last_good].host_start >> MC_NANOSECOND_BITS;
+	last_sec = last_time->ts_real.tv_sec & MC_SECOND_MASK;
+	if (start_sec != last_sec) {
+		if (((start_sec + 1) & MC_SECOND_MASK) != last_sec) {
+			netif_warn(efx, hw, efx->net_dev,
+				   "PTP bad synchronisation seconds\n");
+			return -EAGAIN;
+		} else {
+			delta.tv_sec = 1;
+		}
+	} else {
+		delta.tv_sec = 0;
+	}
+
+	ptp->host_time_pps = *last_time;
+	pps_sub_ts(&ptp->host_time_pps, delta);
+
+	return 0;
+}
+
+/* Synchronize times between the host and the MC */
+static int efx_ptp_synchronize(struct efx_nic *efx, unsigned int num_readings)
+{
+	struct efx_ptp_data *ptp = efx->ptp_data;
+	u8 synch_buf[MC_CMD_PTP_OUT_SYNCHRONIZE_LENMAX];
+	size_t response_length;
+	int rc;
+	unsigned long timeout;
+	struct pps_event_time last_time = {};
+	unsigned int loops = 0;
+	int *start = ptp->start.addr;
+
+	MCDI_SET_DWORD(synch_buf, PTP_IN_OP, MC_CMD_PTP_OP_SYNCHRONIZE);
+	MCDI_SET_DWORD(synch_buf, PTP_IN_SYNCHRONIZE_NUMTIMESETS,
+		       num_readings);
+	MCDI_SET_DWORD(synch_buf, PTP_IN_SYNCHRONIZE_START_ADDR_LO,
+		       (u32)ptp->start.dma_addr);
+	MCDI_SET_DWORD(synch_buf, PTP_IN_SYNCHRONIZE_START_ADDR_HI,
+		       (u32)((u64)ptp->start.dma_addr >> 32));
+
+	/* Clear flag that signals MC ready */
+	ACCESS_ONCE(*start) = 0;
+	efx_mcdi_rpc_start(efx, MC_CMD_PTP, synch_buf,
+			   MC_CMD_PTP_IN_SYNCHRONIZE_LEN);
+
+	/* Wait for start from MCDI (or timeout) */
+	timeout = jiffies + msecs_to_jiffies(MAX_SYNCHRONISE_WAIT_MS);
+	while (!ACCESS_ONCE(*start) && (time_before(jiffies, timeout))) {
+		udelay(20);	/* Usually start MCDI execution quickly */
+		loops++;
+	}
+
+	if (ACCESS_ONCE(*start))
+		efx_ptp_send_times(efx, &last_time);
+
+	/* Collect results */
+	rc = efx_mcdi_rpc_finish(efx, MC_CMD_PTP,
+				 MC_CMD_PTP_IN_SYNCHRONIZE_LEN,
+				 synch_buf, sizeof(synch_buf),
+				 &response_length);
+	if (rc == 0)
+		rc = efx_ptp_process_times(efx, synch_buf, response_length,
+					   &last_time);
+
+	return rc;
+}
+
+/* Transmit a PTP packet, via the MCDI interface, to the wire. */
+static int efx_ptp_xmit_skb(struct efx_nic *efx, struct sk_buff *skb)
+{
+	u8 *txbuf = efx->ptp_data->txbuf;
+	struct skb_shared_hwtstamps timestamps;
+	int rc = -EIO;
+	/* MCDI driver requires word aligned lengths */
+	size_t len = ALIGN(MC_CMD_PTP_IN_TRANSMIT_LEN(skb->len), 4);
+	u8 txtime[MC_CMD_PTP_OUT_TRANSMIT_LEN];
+
+	MCDI_SET_DWORD(txbuf, PTP_IN_OP, MC_CMD_PTP_OP_TRANSMIT);
+	MCDI_SET_DWORD(txbuf, PTP_IN_TRANSMIT_LENGTH, skb->len);
+	if (skb_shinfo(skb)->nr_frags != 0) {
+		rc = skb_linearize(skb);
+		if (rc != 0)
+			goto fail;
+	}
+
+	if (skb->ip_summed == CHECKSUM_PARTIAL) {
+		rc = skb_checksum_help(skb);
+		if (rc != 0)
+			goto fail;
+	}
+	skb_copy_from_linear_data(skb,
+				  &txbuf[MC_CMD_PTP_IN_TRANSMIT_PACKET_OFST],
+				  len);
+	rc = efx_mcdi_rpc(efx, MC_CMD_PTP, txbuf, len, txtime,
+			  sizeof(txtime), &len);
+	if (rc != 0)
+		goto fail;
+
+	memset(&timestamps, 0, sizeof(timestamps));
+	timestamps.hwtstamp = ktime_set(
+		MCDI_DWORD(txtime, PTP_OUT_TRANSMIT_SECONDS),
+		MCDI_DWORD(txtime, PTP_OUT_TRANSMIT_NANOSECONDS));
+
+	skb_tstamp_tx(skb, &timestamps);
+
+	rc = 0;
+
+fail:
+	dev_kfree_skb(skb);
+
+	return rc;
+}
+
+static void efx_ptp_drop_time_expired_events(struct efx_nic *efx)
+{
+	struct efx_ptp_data *ptp = efx->ptp_data;
+	struct list_head *cursor;
+	struct list_head *next;
+
+	/* Drop time-expired events */
+	spin_lock_bh(&ptp->evt_lock);
+	if (!list_empty(&ptp->evt_list)) {
+		list_for_each_safe(cursor, next, &ptp->evt_list) {
+			struct efx_ptp_event_rx *evt;
+
+			evt = list_entry(cursor, struct efx_ptp_event_rx,
+					 link);
+			if (time_after(jiffies, evt->expiry)) {
+				list_del(&evt->link);
+				list_add(&evt->link, &ptp->evt_free_list);
+				netif_warn(efx, hw, efx->net_dev,
+					   "PTP rx event dropped\n");
+			}
+		}
+	}
+	spin_unlock_bh(&ptp->evt_lock);
+}
+
+static enum ptp_packet_state efx_ptp_match_rx(struct efx_nic *efx,
+					      struct sk_buff *skb)
+{
+	struct efx_ptp_data *ptp = efx->ptp_data;
+	bool evts_waiting;
+	struct list_head *cursor;
+	struct list_head *next;
+	struct efx_ptp_match *match;
+	enum ptp_packet_state rc = PTP_PACKET_STATE_UNMATCHED;
+
+	spin_lock_bh(&ptp->evt_lock);
+	evts_waiting = !list_empty(&ptp->evt_list);
+	spin_unlock_bh(&ptp->evt_lock);
+
+	if (!evts_waiting)
+		return PTP_PACKET_STATE_UNMATCHED;
+
+	match = (struct efx_ptp_match *)skb->cb;
+	/* Look for a matching timestamp in the event queue */
+	spin_lock_bh(&ptp->evt_lock);
+	list_for_each_safe(cursor, next, &ptp->evt_list) {
+		struct efx_ptp_event_rx *evt;
+
+		evt = list_entry(cursor, struct efx_ptp_event_rx, link);
+		if ((evt->seq0 == match->words[0]) &&
+		    (evt->seq1 == match->words[1])) {
+			struct skb_shared_hwtstamps *timestamps;
+
+			/* Match - add in hardware timestamp */
+			timestamps = skb_hwtstamps(skb);
+			timestamps->hwtstamp = evt->hwtimestamp;
+
+			match->state = PTP_PACKET_STATE_MATCHED;
+			rc = PTP_PACKET_STATE_MATCHED;
+			list_del(&evt->link);
+			list_add(&evt->link, &ptp->evt_free_list);
+			break;
+		}
+	}
+	spin_unlock_bh(&ptp->evt_lock);
+
+	return rc;
+}
+
+/* Process any queued receive events and corresponding packets
+ *
+ * q is returned with all the packets that are ready for delivery.
+ * true is returned if at least one of those packets requires
+ * synchronisation.
+ */
+static bool efx_ptp_process_events(struct efx_nic *efx, struct sk_buff_head *q)
+{
+	struct efx_ptp_data *ptp = efx->ptp_data;
+	bool rc = false;
+	struct sk_buff *skb;
+
+	while ((skb = skb_dequeue(&ptp->rxq))) {
+		struct efx_ptp_match *match;
+
+		match = (struct efx_ptp_match *)skb->cb;
+		if (match->state == PTP_PACKET_STATE_MATCH_UNWANTED) {
+			__skb_queue_tail(q, skb);
+		} else if (efx_ptp_match_rx(efx, skb) ==
+			   PTP_PACKET_STATE_MATCHED) {
+			rc = true;
+			__skb_queue_tail(q, skb);
+		} else if (time_after(jiffies, match->expiry)) {
+			match->state = PTP_PACKET_STATE_TIMED_OUT;
+			netif_warn(efx, rx_err, efx->net_dev,
+				   "PTP packet - no timestamp seen\n");
+			__skb_queue_tail(q, skb);
+		} else {
+			/* Replace unprocessed entry and stop */
+			skb_queue_head(&ptp->rxq, skb);
+			break;
+		}
+	}
+
+	return rc;
+}
+
+/* Complete processing of a received packet */
+static inline void efx_ptp_process_rx(struct efx_nic *efx, struct sk_buff *skb)
+{
+	local_bh_disable();
+	netif_receive_skb(skb);
+	local_bh_enable();
+}
+
+static int efx_ptp_start(struct efx_nic *efx)
+{
+	struct efx_ptp_data *ptp = efx->ptp_data;
+	struct efx_filter_spec rxfilter;
+	int rc;
+
+	ptp->reset_required = false;
+
+	/* Must filter on both event and general ports to ensure
+	 * that there is no packet re-ordering.
+	 */
+	efx_filter_init_rx(&rxfilter, EFX_FILTER_PRI_REQUIRED, 0,
+			   efx_rx_queue_index(
+				   efx_channel_get_rx_queue(ptp->channel)));
+	rc = efx_filter_set_ipv4_local(&rxfilter, IPPROTO_UDP,
+				       htonl(PTP_ADDRESS),
+				       htons(PTP_EVENT_PORT));
+	if (rc != 0)
+		return rc;
+
+	rc = efx_filter_insert_filter(efx, &rxfilter, true);
+	if (rc < 0)
+		return rc;
+	ptp->rxfilter_event = rc;
+
+	efx_filter_init_rx(&rxfilter, EFX_FILTER_PRI_REQUIRED, 0,
+			   efx_rx_queue_index(
+				   efx_channel_get_rx_queue(ptp->channel)));
+	rc = efx_filter_set_ipv4_local(&rxfilter, IPPROTO_UDP,
+				       htonl(PTP_ADDRESS),
+				       htons(PTP_GENERAL_PORT));
+	if (rc != 0)
+		goto fail;
+
+	rc = efx_filter_insert_filter(efx, &rxfilter, true);
+	if (rc < 0)
+		goto fail;
+	ptp->rxfilter_general = rc;
+
+	rc = efx_ptp_enable(efx);
+	if (rc != 0)
+		goto fail2;
+
+	ptp->evt_frag_idx = 0;
+	ptp->current_adjfreq = 0;
+	ptp->rxfilter_installed = true;
+
+	return 0;
+
+fail2:
+	efx_filter_remove_id_safe(efx, EFX_FILTER_PRI_REQUIRED,
+				  ptp->rxfilter_general);
+fail:
+	efx_filter_remove_id_safe(efx, EFX_FILTER_PRI_REQUIRED,
+				  ptp->rxfilter_event);
+
+	return rc;
+}
+
+static int efx_ptp_stop(struct efx_nic *efx)
+{
+	struct efx_ptp_data *ptp = efx->ptp_data;
+	int rc = efx_ptp_disable(efx);
+	struct list_head *cursor;
+	struct list_head *next;
+
+	if (ptp->rxfilter_installed) {
+		efx_filter_remove_id_safe(efx, EFX_FILTER_PRI_REQUIRED,
+					  ptp->rxfilter_general);
+		efx_filter_remove_id_safe(efx, EFX_FILTER_PRI_REQUIRED,
+					  ptp->rxfilter_event);
+		ptp->rxfilter_installed = false;
+	}
+
+	/* Make sure RX packets are really delivered */
+	efx_ptp_deliver_rx_queue(&efx->ptp_data->rxq);
+	skb_queue_purge(&efx->ptp_data->txq);
+
+	/* Drop any pending receive events */
+	spin_lock_bh(&efx->ptp_data->evt_lock);
+	list_for_each_safe(cursor, next, &efx->ptp_data->evt_list) {
+		list_del(cursor);
+		list_add(cursor, &efx->ptp_data->evt_free_list);
+	}
+	spin_unlock_bh(&efx->ptp_data->evt_lock);
+
+	return rc;
+}
+
+static void efx_ptp_pps_worker(struct work_struct *work)
+{
+	struct efx_ptp_data *ptp =
+		container_of(work, struct efx_ptp_data, pps_work);
+	struct efx_nic *efx = ptp->channel->efx;
+	struct ptp_clock_event ptp_evt;
+
+	if (efx_ptp_synchronize(efx, PTP_SYNC_ATTEMPTS))
+		return;
+
+	ptp_evt.type = PTP_CLOCK_PPSUSR;
+	ptp_evt.pps_times = ptp->host_time_pps;
+	ptp_clock_event(ptp->phc_clock, &ptp_evt);
+}
+
+/* Process any pending transmissions and timestamp any received packets.
+ */
+static void efx_ptp_worker(struct work_struct *work)
+{
+	struct efx_ptp_data *ptp_data =
+		container_of(work, struct efx_ptp_data, work);
+	struct efx_nic *efx = ptp_data->channel->efx;
+	struct sk_buff *skb;
+	struct sk_buff_head tempq;
+
+	if (ptp_data->reset_required) {
+		efx_ptp_stop(efx);
+		efx_ptp_start(efx);
+		return;
+	}
+
+	efx_ptp_drop_time_expired_events(efx);
+
+	__skb_queue_head_init(&tempq);
+	if (efx_ptp_process_events(efx, &tempq) ||
+	    !skb_queue_empty(&ptp_data->txq)) {
+
+		while ((skb = skb_dequeue(&ptp_data->txq)))
+			efx_ptp_xmit_skb(efx, skb);
+	}
+
+	while ((skb = __skb_dequeue(&tempq)))
+		efx_ptp_process_rx(efx, skb);
+}
+
+/* Initialise PTP channel and state.
+ *
+ * Setting core_index to zero causes the queue to be initialised and doesn't
+ * overlap with 'rxq0' because ptp.c doesn't use skb_record_rx_queue.
+ */
+static int efx_ptp_probe_channel(struct efx_channel *channel)
+{
+	struct efx_nic *efx = channel->efx;
+	struct efx_ptp_data *ptp;
+	int rc = 0;
+	unsigned int pos;
+
+	channel->irq_moderation = 0;
+	channel->rx_queue.core_index = 0;
+
+	ptp = kzalloc(sizeof(struct efx_ptp_data), GFP_KERNEL);
+	efx->ptp_data = ptp;
+	if (!efx->ptp_data)
+		return -ENOMEM;
+
+	rc = efx_nic_alloc_buffer(efx, &ptp->start, sizeof(int));
+	if (rc != 0)
+		goto fail1;
+
+	ptp->channel = channel;
+	skb_queue_head_init(&ptp->rxq);
+	skb_queue_head_init(&ptp->txq);
+	ptp->workwq = create_singlethread_workqueue("sfc_ptp");
+	if (!ptp->workwq) {
+		rc = -ENOMEM;
+		goto fail2;
+	}
+
+	INIT_WORK(&ptp->work, efx_ptp_worker);
+	ptp->config.flags = 0;
+	ptp->config.tx_type = HWTSTAMP_TX_OFF;
+	ptp->config.rx_filter = HWTSTAMP_FILTER_NONE;
+	INIT_LIST_HEAD(&ptp->evt_list);
+	INIT_LIST_HEAD(&ptp->evt_free_list);
+	spin_lock_init(&ptp->evt_lock);
+	for (pos = 0; pos < MAX_RECEIVE_EVENTS; pos++)
+		list_add(&ptp->rx_evts[pos].link, &ptp->evt_free_list);
+
+	ptp->phc_clock_info.owner = THIS_MODULE;
+	snprintf(ptp->phc_clock_info.name,
+		 sizeof(ptp->phc_clock_info.name),
+		 "%pm", efx->net_dev->perm_addr);
+	ptp->phc_clock_info.max_adj = MAX_PPB;
+	ptp->phc_clock_info.n_alarm = 0;
+	ptp->phc_clock_info.n_ext_ts = 0;
+	ptp->phc_clock_info.n_per_out = 0;
+	ptp->phc_clock_info.pps = 1;
+	ptp->phc_clock_info.adjfreq = efx_phc_adjfreq;
+	ptp->phc_clock_info.adjtime = efx_phc_adjtime;
+	ptp->phc_clock_info.gettime = efx_phc_gettime;
+	ptp->phc_clock_info.settime = efx_phc_settime;
+	ptp->phc_clock_info.enable = efx_phc_enable;
+
+	ptp->phc_clock = ptp_clock_register(&ptp->phc_clock_info);
+	if (!ptp->phc_clock)
+		goto fail3;
+
+	INIT_WORK(&ptp->pps_work, efx_ptp_pps_worker);
+	ptp->pps_workwq = create_singlethread_workqueue("sfc_pps");
+	if (!ptp->pps_workwq) {
+		rc = -ENOMEM;
+		goto fail4;
+	}
+	ptp->nic_ts_enabled = false;
+
+	return 0;
+fail4:
+	ptp_clock_unregister(efx->ptp_data->phc_clock);
+
+fail3:
+	destroy_workqueue(efx->ptp_data->workwq);
+
+fail2:
+	efx_nic_free_buffer(efx, &ptp->start);
+
+fail1:
+	kfree(efx->ptp_data);
+	efx->ptp_data = NULL;
+
+	return rc;
+}
+
+static void efx_ptp_remove_channel(struct efx_channel *channel)
+{
+	struct efx_nic *efx = channel->efx;
+
+	if (!efx->ptp_data)
+		return;
+
+	(void)efx_ptp_disable(channel->efx);
+
+	cancel_work_sync(&efx->ptp_data->work);
+	cancel_work_sync(&efx->ptp_data->pps_work);
+
+	skb_queue_purge(&efx->ptp_data->rxq);
+	skb_queue_purge(&efx->ptp_data->txq);
+
+	ptp_clock_unregister(efx->ptp_data->phc_clock);
+
+	destroy_workqueue(efx->ptp_data->workwq);
+	destroy_workqueue(efx->ptp_data->pps_workwq);
+
+	efx_nic_free_buffer(efx, &efx->ptp_data->start);
+	kfree(efx->ptp_data);
+}
+
+static void efx_ptp_get_channel_name(struct efx_channel *channel,
+				     char *buf, size_t len)
+{
+	snprintf(buf, len, "%s-ptp", channel->efx->name);
+}
+
+/* Determine whether this packet should be processed by the PTP module
+ * or transmitted conventionally.
+ */
+bool efx_ptp_is_ptp_tx(struct efx_nic *efx, struct sk_buff *skb)
+{
+	return efx->ptp_data &&
+		efx->ptp_data->enabled &&
+		skb->len >= PTP_MIN_LENGTH &&
+		skb->len <= MC_CMD_PTP_IN_TRANSMIT_PACKET_MAXNUM &&
+		likely(skb->protocol == htons(ETH_P_IP)) &&
+		ip_hdr(skb)->protocol == IPPROTO_UDP &&
+		udp_hdr(skb)->dest == htons(PTP_EVENT_PORT);
+}
+
+/* Receive a PTP packet.  Packets are queued until the arrival of
+ * the receive timestamp from the MC - this will probably occur after the
+ * packet arrival because of the processing in the MC.
+ */
+static void efx_ptp_rx(struct efx_channel *channel, struct sk_buff *skb)
+{
+	struct efx_nic *efx = channel->efx;
+	struct efx_ptp_data *ptp = efx->ptp_data;
+	struct efx_ptp_match *match = (struct efx_ptp_match *)skb->cb;
+	u8 *data;
+	unsigned int version;
+
+	match->expiry = jiffies + msecs_to_jiffies(PKT_EVENT_LIFETIME_MS);
+
+	/* Correct version? */
+	if (ptp->mode == MC_CMD_PTP_MODE_V1) {
+		if (skb->len < PTP_V1_MIN_LENGTH) {
+			netif_receive_skb(skb);
+			return;
+		}
+		version = ntohs(*(__be16 *)&skb->data[PTP_V1_VERSION_OFFSET]);
+		if (version != PTP_VERSION_V1) {
+			netif_receive_skb(skb);
+			return;
+		}
+	} else {
+		if (skb->len < PTP_V2_MIN_LENGTH) {
+			netif_receive_skb(skb);
+			return;
+		}
+		version = skb->data[PTP_V2_VERSION_OFFSET];
+
+		BUG_ON(ptp->mode != MC_CMD_PTP_MODE_V2);
+		BUILD_BUG_ON(PTP_V1_UUID_OFFSET != PTP_V2_MC_UUID_OFFSET);
+		BUILD_BUG_ON(PTP_V1_UUID_LENGTH != PTP_V2_MC_UUID_LENGTH);
+		BUILD_BUG_ON(PTP_V1_SEQUENCE_OFFSET != PTP_V2_SEQUENCE_OFFSET);
+		BUILD_BUG_ON(PTP_V1_SEQUENCE_LENGTH != PTP_V2_SEQUENCE_LENGTH);
+
+		if ((version & PTP_VERSION_V2_MASK) != PTP_VERSION_V2) {
+			netif_receive_skb(skb);
+			return;
+		}
+	}
+
+	/* Does this packet require timestamping? */
+	if (ntohs(*(__be16 *)&skb->data[PTP_DPORT_OFFSET]) == PTP_EVENT_PORT) {
+		struct skb_shared_hwtstamps *timestamps;
+
+		match->state = PTP_PACKET_STATE_UNMATCHED;
+
+		/* Clear all timestamps held: filled in later */
+		timestamps = skb_hwtstamps(skb);
+		memset(timestamps, 0, sizeof(*timestamps));
+
+		/* Extract UUID/Sequence information */
+		data = skb->data + PTP_V1_UUID_OFFSET;
+		match->words[0] = (data[0]         |
+				   (data[1] << 8)  |
+				   (data[2] << 16) |
+				   (data[3] << 24));
+		match->words[1] = (data[4]         |
+				   (data[5] << 8)  |
+				   (skb->data[PTP_V1_SEQUENCE_OFFSET +
+					      PTP_V1_SEQUENCE_LENGTH - 1] <<
+				    16));
+	} else {
+		match->state = PTP_PACKET_STATE_MATCH_UNWANTED;
+	}
+
+	skb_queue_tail(&ptp->rxq, skb);
+	queue_work(ptp->workwq, &ptp->work);
+}
+
+/* Transmit a PTP packet.  This has to be transmitted by the MC
+ * itself, through an MCDI call.  MCDI calls aren't permitted
+ * in the transmit path so defer the actual transmission to a suitable worker.
+ */
+int efx_ptp_tx(struct efx_nic *efx, struct sk_buff *skb)
+{
+	struct efx_ptp_data *ptp = efx->ptp_data;
+
+	skb_queue_tail(&ptp->txq, skb);
+
+	if ((udp_hdr(skb)->dest == htons(PTP_EVENT_PORT)) &&
+	    (skb->len <= MC_CMD_PTP_IN_TRANSMIT_PACKET_MAXNUM))
+		efx_xmit_hwtstamp_pending(skb);
+	queue_work(ptp->workwq, &ptp->work);
+
+	return NETDEV_TX_OK;
+}
+
+static int efx_ptp_change_mode(struct efx_nic *efx, bool enable_wanted,
+			       unsigned int new_mode)
+{
+	if ((enable_wanted != efx->ptp_data->enabled) ||
+	    (enable_wanted && (efx->ptp_data->mode != new_mode))) {
+		int rc;
+
+		if (enable_wanted) {
+			/* Change of mode requires disable */
+			if (efx->ptp_data->enabled &&
+			    (efx->ptp_data->mode != new_mode)) {
+				efx->ptp_data->enabled = false;
+				rc = efx_ptp_stop(efx);
+				if (rc != 0)
+					return rc;
+			}
+
+			/* Set new operating mode and establish
+			 * baseline synchronisation, which must
+			 * succeed.
+			 */
+			efx->ptp_data->mode = new_mode;
+			rc = efx_ptp_start(efx);
+			if (rc == 0) {
+				rc = efx_ptp_synchronize(efx,
+							 PTP_SYNC_ATTEMPTS * 2);
+				if (rc != 0)
+					efx_ptp_stop(efx);
+			}
+		} else {
+			rc = efx_ptp_stop(efx);
+		}
+
+		if (rc != 0)
+			return rc;
+
+		efx->ptp_data->enabled = enable_wanted;
+	}
+
+	return 0;
+}
+
+static int efx_ptp_ts_init(struct efx_nic *efx, struct hwtstamp_config *init)
+{
+	bool enable_wanted = false;
+	unsigned int new_mode;
+	int rc;
+
+	if (init->flags)
+		return -EINVAL;
+
+	if ((init->tx_type != HWTSTAMP_TX_OFF) &&
+	    (init->tx_type != HWTSTAMP_TX_ON))
+		return -ERANGE;
+
+	new_mode = efx->ptp_data->mode;
+	/* Determine whether any PTP HW operations are required */
+	switch (init->rx_filter) {
+	case HWTSTAMP_FILTER_NONE:
+		break;
+	case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+	case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
+	case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
+		init->rx_filter = HWTSTAMP_FILTER_PTP_V1_L4_EVENT;
+		new_mode = MC_CMD_PTP_MODE_V1;
+		enable_wanted = true;
+		break;
+	case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
+	/* Although these three are accepted only IPV4 packets will be
+	 * timestamped
+	 */
+		init->rx_filter = HWTSTAMP_FILTER_PTP_V2_L4_EVENT;
+		new_mode = MC_CMD_PTP_MODE_V2;
+		enable_wanted = true;
+		break;
+	case HWTSTAMP_FILTER_PTP_V2_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+	case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
+	case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
+	case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
+		/* Non-IP + IPv6 timestamping not supported */
+		return -ERANGE;
+		break;
+	default:
+		return -ERANGE;
+	}
+
+	if (init->tx_type != HWTSTAMP_TX_OFF)
+		enable_wanted = true;
+
+	rc = efx_ptp_change_mode(efx, enable_wanted, new_mode);
+	if (rc != 0)
+		return rc;
+
+	efx->ptp_data->config = *init;
+
+	return 0;
+}
+
+int
+efx_ptp_get_ts_info(struct net_device *net_dev, struct ethtool_ts_info *ts_info)
+{
+	struct efx_nic *efx = netdev_priv(net_dev);
+	struct efx_ptp_data *ptp = efx->ptp_data;
+
+	if (!ptp)
+		return -EOPNOTSUPP;
+
+	ts_info->so_timestamping = (SOF_TIMESTAMPING_TX_HARDWARE |
+				    SOF_TIMESTAMPING_RX_HARDWARE |
+				    SOF_TIMESTAMPING_RAW_HARDWARE);
+	ts_info->phc_index = ptp_clock_index(ptp->phc_clock);
+	ts_info->tx_types = 1 << HWTSTAMP_TX_OFF | 1 << HWTSTAMP_TX_ON;
+	ts_info->rx_filters = (1 << HWTSTAMP_FILTER_NONE |
+			       1 << HWTSTAMP_FILTER_PTP_V1_L4_EVENT |
+			       1 << HWTSTAMP_FILTER_PTP_V1_L4_SYNC |
+			       1 << HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ |
+			       1 << HWTSTAMP_FILTER_PTP_V2_L4_EVENT |
+			       1 << HWTSTAMP_FILTER_PTP_V2_L4_SYNC |
+			       1 << HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ);
+	return 0;
+}
+
+int efx_ptp_ioctl(struct efx_nic *efx, struct ifreq *ifr, int cmd)
+{
+	struct hwtstamp_config config;
+	int rc;
+
+	/* Not a PTP enabled port */
+	if (!efx->ptp_data)
+		return -EOPNOTSUPP;
+
+	if (copy_from_user(&config, ifr->ifr_data, sizeof(config)))
+		return -EFAULT;
+
+	rc = efx_ptp_ts_init(efx, &config);
+	if (rc != 0)
+		return rc;
+
+	return copy_to_user(ifr->ifr_data, &config, sizeof(config))
+		? -EFAULT : 0;
+}
+
+static void ptp_event_failure(struct efx_nic *efx, int expected_frag_len)
+{
+	struct efx_ptp_data *ptp = efx->ptp_data;
+
+	netif_err(efx, hw, efx->net_dev,
+		"PTP unexpected event length: got %d expected %d\n",
+		ptp->evt_frag_idx, expected_frag_len);
+	ptp->reset_required = true;
+	queue_work(ptp->workwq, &ptp->work);
+}
+
+/* Process a completed receive event.  Put it on the event queue and
+ * start worker thread.  This is required because event and their
+ * correspoding packets may come in either order.
+ */
+static void ptp_event_rx(struct efx_nic *efx, struct efx_ptp_data *ptp)
+{
+	struct efx_ptp_event_rx *evt = NULL;
+
+	if (ptp->evt_frag_idx != 3) {
+		ptp_event_failure(efx, 3);
+		return;
+	}
+
+	spin_lock_bh(&ptp->evt_lock);
+	if (!list_empty(&ptp->evt_free_list)) {
+		evt = list_first_entry(&ptp->evt_free_list,
+				       struct efx_ptp_event_rx, link);
+		list_del(&evt->link);
+
+		evt->seq0 = EFX_QWORD_FIELD(ptp->evt_frags[2], MCDI_EVENT_DATA);
+		evt->seq1 = (EFX_QWORD_FIELD(ptp->evt_frags[2],
+					     MCDI_EVENT_SRC)        |
+			     (EFX_QWORD_FIELD(ptp->evt_frags[1],
+					      MCDI_EVENT_SRC) << 8) |
+			     (EFX_QWORD_FIELD(ptp->evt_frags[0],
+					      MCDI_EVENT_SRC) << 16));
+		evt->hwtimestamp = ktime_set(
+			EFX_QWORD_FIELD(ptp->evt_frags[0], MCDI_EVENT_DATA),
+			EFX_QWORD_FIELD(ptp->evt_frags[1], MCDI_EVENT_DATA));
+		evt->expiry = jiffies + msecs_to_jiffies(PKT_EVENT_LIFETIME_MS);
+		list_add_tail(&evt->link, &ptp->evt_list);
+
+		queue_work(ptp->workwq, &ptp->work);
+	} else {
+		netif_err(efx, rx_err, efx->net_dev, "No free PTP event");
+	}
+	spin_unlock_bh(&ptp->evt_lock);
+}
+
+static void ptp_event_fault(struct efx_nic *efx, struct efx_ptp_data *ptp)
+{
+	int code = EFX_QWORD_FIELD(ptp->evt_frags[0], MCDI_EVENT_DATA);
+	if (ptp->evt_frag_idx != 1) {
+		ptp_event_failure(efx, 1);
+		return;
+	}
+
+	netif_err(efx, hw, efx->net_dev, "PTP error %d\n", code);
+}
+
+static void ptp_event_pps(struct efx_nic *efx, struct efx_ptp_data *ptp)
+{
+	if (ptp->nic_ts_enabled)
+		queue_work(ptp->pps_workwq, &ptp->pps_work);
+}
+
+void efx_ptp_event(struct efx_nic *efx, efx_qword_t *ev)
+{
+	struct efx_ptp_data *ptp = efx->ptp_data;
+	int code = EFX_QWORD_FIELD(*ev, MCDI_EVENT_CODE);
+
+	if (!ptp->enabled)
+		return;
+
+	if (ptp->evt_frag_idx == 0) {
+		ptp->evt_code = code;
+	} else if (ptp->evt_code != code) {
+		netif_err(efx, hw, efx->net_dev,
+			  "PTP out of sequence event %d\n", code);
+		ptp->evt_frag_idx = 0;
+	}
+
+	ptp->evt_frags[ptp->evt_frag_idx++] = *ev;
+	if (!MCDI_EVENT_FIELD(*ev, CONT)) {
+		/* Process resulting event */
+		switch (code) {
+		case MCDI_EVENT_CODE_PTP_RX:
+			ptp_event_rx(efx, ptp);
+			break;
+		case MCDI_EVENT_CODE_PTP_FAULT:
+			ptp_event_fault(efx, ptp);
+			break;
+		case MCDI_EVENT_CODE_PTP_PPS:
+			ptp_event_pps(efx, ptp);
+			break;
+		default:
+			netif_err(efx, hw, efx->net_dev,
+				  "PTP unknown event %d\n", code);
+			break;
+		}
+		ptp->evt_frag_idx = 0;
+	} else if (MAX_EVENT_FRAGS == ptp->evt_frag_idx) {
+		netif_err(efx, hw, efx->net_dev,
+			  "PTP too many event fragments\n");
+		ptp->evt_frag_idx = 0;
+	}
+}
+
+static int efx_phc_adjfreq(struct ptp_clock_info *ptp, s32 delta)
+{
+	struct efx_ptp_data *ptp_data = container_of(ptp,
+						     struct efx_ptp_data,
+						     phc_clock_info);
+	struct efx_nic *efx = ptp_data->channel->efx;
+	u8 inadj[MC_CMD_PTP_IN_ADJUST_LEN];
+	s64 adjustment_ns;
+	int rc;
+
+	if (delta > MAX_PPB)
+		delta = MAX_PPB;
+	else if (delta < -MAX_PPB)
+		delta = -MAX_PPB;
+
+	/* Convert ppb to fixed point ns. */
+	adjustment_ns = (((s64)delta * PPB_SCALE_WORD) >>
+			 (PPB_EXTRA_BITS + MAX_PPB_BITS));
+
+	MCDI_SET_DWORD(inadj, PTP_IN_OP, MC_CMD_PTP_OP_ADJUST);
+	MCDI_SET_DWORD(inadj, PTP_IN_ADJUST_FREQ_LO, (u32)adjustment_ns);
+	MCDI_SET_DWORD(inadj, PTP_IN_ADJUST_FREQ_HI,
+		       (u32)(adjustment_ns >> 32));
+	MCDI_SET_DWORD(inadj, PTP_IN_ADJUST_SECONDS, 0);
+	MCDI_SET_DWORD(inadj, PTP_IN_ADJUST_NANOSECONDS, 0);
+	rc = efx_mcdi_rpc(efx, MC_CMD_PTP, inadj, sizeof(inadj),
+			  NULL, 0, NULL);
+	if (rc != 0)
+		return rc;
+
+	ptp_data->current_adjfreq = delta;
+	return 0;
+}
+
+static int efx_phc_adjtime(struct ptp_clock_info *ptp, s64 delta)
+{
+	struct efx_ptp_data *ptp_data = container_of(ptp,
+						     struct efx_ptp_data,
+						     phc_clock_info);
+	struct efx_nic *efx = ptp_data->channel->efx;
+	struct timespec delta_ts = ns_to_timespec(delta);
+	u8 inbuf[MC_CMD_PTP_IN_ADJUST_LEN];
+
+	MCDI_SET_DWORD(inbuf, PTP_IN_OP, MC_CMD_PTP_OP_ADJUST);
+	MCDI_SET_DWORD(inbuf, PTP_IN_ADJUST_FREQ_LO, 0);
+	MCDI_SET_DWORD(inbuf, PTP_IN_ADJUST_FREQ_HI, 0);
+	MCDI_SET_DWORD(inbuf, PTP_IN_ADJUST_SECONDS, (u32)delta_ts.tv_sec);
+	MCDI_SET_DWORD(inbuf, PTP_IN_ADJUST_NANOSECONDS, (u32)delta_ts.tv_nsec);
+	return efx_mcdi_rpc(efx, MC_CMD_PTP, inbuf, sizeof(inbuf),
+			    NULL, 0, NULL);
+}
+
+static int efx_phc_gettime(struct ptp_clock_info *ptp, struct timespec *ts)
+{
+	struct efx_ptp_data *ptp_data = container_of(ptp,
+						     struct efx_ptp_data,
+						     phc_clock_info);
+	struct efx_nic *efx = ptp_data->channel->efx;
+	u8 inbuf[MC_CMD_PTP_IN_READ_NIC_TIME_LEN];
+	u8 outbuf[MC_CMD_PTP_OUT_READ_NIC_TIME_LEN];
+	int rc;
+
+	MCDI_SET_DWORD(inbuf, PTP_IN_OP, MC_CMD_PTP_OP_READ_NIC_TIME);
+
+	rc = efx_mcdi_rpc(efx, MC_CMD_PTP, inbuf, sizeof(inbuf),
+			  outbuf, sizeof(outbuf), NULL);
+	if (rc != 0)
+		return rc;
+
+	ts->tv_sec = MCDI_DWORD(outbuf, PTP_OUT_READ_NIC_TIME_SECONDS);
+	ts->tv_nsec = MCDI_DWORD(outbuf, PTP_OUT_READ_NIC_TIME_NANOSECONDS);
+	return 0;
+}
+
+static int efx_phc_settime(struct ptp_clock_info *ptp,
+			   const struct timespec *e_ts)
+{
+	/* Get the current NIC time, efx_phc_gettime.
+	 * Subtract from the desired time to get the offset
+	 * call efx_phc_adjtime with the offset
+	 */
+	int rc;
+	struct timespec time_now;
+	struct timespec delta;
+
+	rc = efx_phc_gettime(ptp, &time_now);
+	if (rc != 0)
+		return rc;
+
+	delta = timespec_sub(*e_ts, time_now);
+
+	efx_phc_adjtime(ptp, timespec_to_ns(&delta));
+	if (rc != 0)
+		return rc;
+
+	return 0;
+}
+
+static int efx_phc_enable(struct ptp_clock_info *ptp,
+			  struct ptp_clock_request *request,
+			  int enable)
+{
+	struct efx_ptp_data *ptp_data = container_of(ptp,
+						     struct efx_ptp_data,
+						     phc_clock_info);
+	if (request->type != PTP_CLK_REQ_PPS)
+		return -EOPNOTSUPP;
+
+	ptp_data->nic_ts_enabled = !!enable;
+	return 0;
+}
+
+static const struct efx_channel_type efx_ptp_channel_type = {
+	.handle_no_channel	= efx_ptp_handle_no_channel,
+	.pre_probe		= efx_ptp_probe_channel,
+	.post_remove		= efx_ptp_remove_channel,
+	.get_name		= efx_ptp_get_channel_name,
+	/* no copy operation; there is no need to reallocate this channel */
+	.receive_skb		= efx_ptp_rx,
+	.keep_eventq		= false,
+};
+
+void efx_ptp_probe(struct efx_nic *efx)
+{
+	/* Check whether PTP is implemented on this NIC.  The DISABLE
+	 * operation will succeed if and only if it is implemented.
+	 */
+	if (efx_ptp_disable(efx) == 0)
+		efx->extra_channel_type[EFX_EXTRA_CHANNEL_PTP] =
+			&efx_ptp_channel_type;
+}
diff --git a/drivers/net/ethernet/sfc/siena.c b/drivers/net/ethernet/sfc/siena.c
index 6bafd21..84b41bf 100644
--- a/drivers/net/ethernet/sfc/siena.c
+++ b/drivers/net/ethernet/sfc/siena.c
@@ -335,6 +335,7 @@ static int siena_probe_nic(struct efx_nic *efx)
 		goto fail5;
 
 	efx_sriov_probe(efx);
+	efx_ptp_probe(efx);
 
 	return 0;
 
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index ebca75e..5e090e5 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -339,6 +339,12 @@ netdev_tx_t efx_hard_start_xmit(struct sk_buff *skb,
 
 	EFX_WARN_ON_PARANOID(!netif_device_present(net_dev));
 
+	/* PTP "event" packet */
+	if (unlikely(efx_xmit_with_hwtstamp(skb)) &&
+	    unlikely(efx_ptp_is_ptp_tx(efx, skb))) {
+		return efx_ptp_tx(efx, skb);
+	}
+
 	index = skb_get_queue_mapping(skb);
 	type = skb->ip_summed == CHECKSUM_PARTIAL ? EFX_TXQ_TYPE_OFFLOAD : 0;
 	if (index >= efx->n_tx_channels) {
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH net-next 05/11] sfc: Fix maximum array sizes for various MCDI commands
From: Ben Hutchings @ 2012-09-19 19:16 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

The maximum array sizes have been calculated on the basis of a maximum
SDU size of 255 bytes, whereas the actual maximum is 252 bytes.
Constructing a larger SDU will result in a BUG_ON in efx_mcdi_copyin.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/mcdi_pcol.h |   28 ++++++++++++++--------------
 1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mcdi_pcol.h b/drivers/net/ethernet/sfc/mcdi_pcol.h
index db4beed..5038932 100644
--- a/drivers/net/ethernet/sfc/mcdi_pcol.h
+++ b/drivers/net/ethernet/sfc/mcdi_pcol.h
@@ -491,12 +491,12 @@
 
 /* MC_CMD_GET_FPGAREG_OUT msgresponse */
 #define    MC_CMD_GET_FPGAREG_OUT_LENMIN 1
-#define    MC_CMD_GET_FPGAREG_OUT_LENMAX 255
+#define    MC_CMD_GET_FPGAREG_OUT_LENMAX 252
 #define    MC_CMD_GET_FPGAREG_OUT_LEN(num) (0+1*(num))
 #define       MC_CMD_GET_FPGAREG_OUT_BUFFER_OFST 0
 #define       MC_CMD_GET_FPGAREG_OUT_BUFFER_LEN 1
 #define       MC_CMD_GET_FPGAREG_OUT_BUFFER_MINNUM 1
-#define       MC_CMD_GET_FPGAREG_OUT_BUFFER_MAXNUM 255
+#define       MC_CMD_GET_FPGAREG_OUT_BUFFER_MAXNUM 252
 

 /***********************************/
@@ -507,13 +507,13 @@
 
 /* MC_CMD_PUT_FPGAREG_IN msgrequest */
 #define    MC_CMD_PUT_FPGAREG_IN_LENMIN 5
-#define    MC_CMD_PUT_FPGAREG_IN_LENMAX 255
+#define    MC_CMD_PUT_FPGAREG_IN_LENMAX 252
 #define    MC_CMD_PUT_FPGAREG_IN_LEN(num) (4+1*(num))
 #define       MC_CMD_PUT_FPGAREG_IN_ADDR_OFST 0
 #define       MC_CMD_PUT_FPGAREG_IN_BUFFER_OFST 4
 #define       MC_CMD_PUT_FPGAREG_IN_BUFFER_LEN 1
 #define       MC_CMD_PUT_FPGAREG_IN_BUFFER_MINNUM 1
-#define       MC_CMD_PUT_FPGAREG_IN_BUFFER_MAXNUM 251
+#define       MC_CMD_PUT_FPGAREG_IN_BUFFER_MAXNUM 248
 
 /* MC_CMD_PUT_FPGAREG_OUT msgresponse */
 #define    MC_CMD_PUT_FPGAREG_OUT_LEN 0
@@ -560,7 +560,7 @@
 
 /* MC_CMD_PTP_IN_TRANSMIT msgrequest */
 #define    MC_CMD_PTP_IN_TRANSMIT_LENMIN 13
-#define    MC_CMD_PTP_IN_TRANSMIT_LENMAX 255
+#define    MC_CMD_PTP_IN_TRANSMIT_LENMAX 252
 #define    MC_CMD_PTP_IN_TRANSMIT_LEN(num) (12+1*(num))
 /*            MC_CMD_PTP_IN_CMD_OFST 0 */
 /*            MC_CMD_PTP_IN_PERIPH_ID_OFST 4 */
@@ -568,7 +568,7 @@
 #define       MC_CMD_PTP_IN_TRANSMIT_PACKET_OFST 12
 #define       MC_CMD_PTP_IN_TRANSMIT_PACKET_LEN 1
 #define       MC_CMD_PTP_IN_TRANSMIT_PACKET_MINNUM 1
-#define       MC_CMD_PTP_IN_TRANSMIT_PACKET_MAXNUM 243
+#define       MC_CMD_PTP_IN_TRANSMIT_PACKET_MAXNUM 240
 
 /* MC_CMD_PTP_IN_READ_NIC_TIME msgrequest */
 #define    MC_CMD_PTP_IN_READ_NIC_TIME_LEN 8
@@ -1145,7 +1145,7 @@
 
 /* MC_CMD_PUTS_IN msgrequest */
 #define    MC_CMD_PUTS_IN_LENMIN 13
-#define    MC_CMD_PUTS_IN_LENMAX 255
+#define    MC_CMD_PUTS_IN_LENMAX 252
 #define    MC_CMD_PUTS_IN_LEN(num) (12+1*(num))
 #define       MC_CMD_PUTS_IN_DEST_OFST 0
 #define        MC_CMD_PUTS_IN_UART_LBN 0
@@ -1157,7 +1157,7 @@
 #define       MC_CMD_PUTS_IN_STRING_OFST 12
 #define       MC_CMD_PUTS_IN_STRING_LEN 1
 #define       MC_CMD_PUTS_IN_STRING_MINNUM 1
-#define       MC_CMD_PUTS_IN_STRING_MAXNUM 243
+#define       MC_CMD_PUTS_IN_STRING_MAXNUM 240
 
 /* MC_CMD_PUTS_OUT msgresponse */
 #define    MC_CMD_PUTS_OUT_LEN 0
@@ -1947,12 +1947,12 @@
 
 /* MC_CMD_NVRAM_READ_OUT msgresponse */
 #define    MC_CMD_NVRAM_READ_OUT_LENMIN 1
-#define    MC_CMD_NVRAM_READ_OUT_LENMAX 255
+#define    MC_CMD_NVRAM_READ_OUT_LENMAX 252
 #define    MC_CMD_NVRAM_READ_OUT_LEN(num) (0+1*(num))
 #define       MC_CMD_NVRAM_READ_OUT_READ_BUFFER_OFST 0
 #define       MC_CMD_NVRAM_READ_OUT_READ_BUFFER_LEN 1
 #define       MC_CMD_NVRAM_READ_OUT_READ_BUFFER_MINNUM 1
-#define       MC_CMD_NVRAM_READ_OUT_READ_BUFFER_MAXNUM 255
+#define       MC_CMD_NVRAM_READ_OUT_READ_BUFFER_MAXNUM 252
 

 /***********************************/
@@ -1963,7 +1963,7 @@
 
 /* MC_CMD_NVRAM_WRITE_IN msgrequest */
 #define    MC_CMD_NVRAM_WRITE_IN_LENMIN 13
-#define    MC_CMD_NVRAM_WRITE_IN_LENMAX 255
+#define    MC_CMD_NVRAM_WRITE_IN_LENMAX 252
 #define    MC_CMD_NVRAM_WRITE_IN_LEN(num) (12+1*(num))
 #define       MC_CMD_NVRAM_WRITE_IN_TYPE_OFST 0
 /*            Enum values, see field(s): */
@@ -1973,7 +1973,7 @@
 #define       MC_CMD_NVRAM_WRITE_IN_WRITE_BUFFER_OFST 12
 #define       MC_CMD_NVRAM_WRITE_IN_WRITE_BUFFER_LEN 1
 #define       MC_CMD_NVRAM_WRITE_IN_WRITE_BUFFER_MINNUM 1
-#define       MC_CMD_NVRAM_WRITE_IN_WRITE_BUFFER_MAXNUM 243
+#define       MC_CMD_NVRAM_WRITE_IN_WRITE_BUFFER_MAXNUM 240
 
 /* MC_CMD_NVRAM_WRITE_OUT msgresponse */
 #define    MC_CMD_NVRAM_WRITE_OUT_LEN 0
@@ -2305,13 +2305,13 @@
 
 /* MC_CMD_GET_PHY_MEDIA_INFO_OUT msgresponse */
 #define    MC_CMD_GET_PHY_MEDIA_INFO_OUT_LENMIN 5
-#define    MC_CMD_GET_PHY_MEDIA_INFO_OUT_LENMAX 255
+#define    MC_CMD_GET_PHY_MEDIA_INFO_OUT_LENMAX 252
 #define    MC_CMD_GET_PHY_MEDIA_INFO_OUT_LEN(num) (4+1*(num))
 #define       MC_CMD_GET_PHY_MEDIA_INFO_OUT_DATALEN_OFST 0
 #define       MC_CMD_GET_PHY_MEDIA_INFO_OUT_DATA_OFST 4
 #define       MC_CMD_GET_PHY_MEDIA_INFO_OUT_DATA_LEN 1
 #define       MC_CMD_GET_PHY_MEDIA_INFO_OUT_DATA_MINNUM 1
-#define       MC_CMD_GET_PHY_MEDIA_INFO_OUT_DATA_MAXNUM 251
+#define       MC_CMD_GET_PHY_MEDIA_INFO_OUT_DATA_MAXNUM 248
 

 /***********************************/
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH net-next 04/11] sfc: Allow efx_mcdi_rpc to be called in two parts
From: Ben Hutchings @ 2012-09-19 19:16 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

From: Stuart Hodgson <smhodgson@solarflare.com>

For NIC/System time synchonisation efx_mcdi_rpc needs to be split in
efx_mcdi_rpc_start and efx_mcdi_rpc_finish operations.

Signed-off-by: Stuart Hodgson <smhodgson@solarflare.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/mcdi.c |   21 ++++++++++++++++++---
 drivers/net/ethernet/sfc/mcdi.h |    6 ++++++
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mcdi.c b/drivers/net/ethernet/sfc/mcdi.c
index fc5e7bb..2707e86 100644
--- a/drivers/net/ethernet/sfc/mcdi.c
+++ b/drivers/net/ethernet/sfc/mcdi.c
@@ -320,14 +320,20 @@ static void efx_mcdi_ev_cpl(struct efx_nic *efx, unsigned int seqno,
 		efx_mcdi_complete(mcdi);
 }
 
-/* Issue the given command by writing the data into the shared memory PDU,
- * ring the doorbell and wait for completion. Copyout the result. */
 int efx_mcdi_rpc(struct efx_nic *efx, unsigned cmd,
 		 const u8 *inbuf, size_t inlen, u8 *outbuf, size_t outlen,
 		 size_t *outlen_actual)
 {
+	efx_mcdi_rpc_start(efx, cmd, inbuf, inlen);
+	return efx_mcdi_rpc_finish(efx, cmd, inlen,
+				   outbuf, outlen, outlen_actual);
+}
+
+void efx_mcdi_rpc_start(struct efx_nic *efx, unsigned cmd, const u8 *inbuf,
+			size_t inlen)
+{
 	struct efx_mcdi_iface *mcdi = efx_mcdi(efx);
-	int rc;
+
 	BUG_ON(efx_nic_rev(efx) < EFX_REV_SIENA_A0);
 
 	efx_mcdi_acquire(mcdi);
@@ -338,6 +344,15 @@ int efx_mcdi_rpc(struct efx_nic *efx, unsigned cmd,
 	spin_unlock_bh(&mcdi->iface_lock);
 
 	efx_mcdi_copyin(efx, cmd, inbuf, inlen);
+}
+
+int efx_mcdi_rpc_finish(struct efx_nic *efx, unsigned cmd, size_t inlen,
+			u8 *outbuf, size_t outlen, size_t *outlen_actual)
+{
+	struct efx_mcdi_iface *mcdi = efx_mcdi(efx);
+	int rc;
+
+	BUG_ON(efx_nic_rev(efx) < EFX_REV_SIENA_A0);
 
 	if (mcdi->mode == MCDI_MODE_POLL)
 		rc = efx_mcdi_poll(efx);
diff --git a/drivers/net/ethernet/sfc/mcdi.h b/drivers/net/ethernet/sfc/mcdi.h
index 0bdf3e3..dc25caa 100644
--- a/drivers/net/ethernet/sfc/mcdi.h
+++ b/drivers/net/ethernet/sfc/mcdi.h
@@ -71,6 +71,12 @@ extern int efx_mcdi_rpc(struct efx_nic *efx, unsigned cmd, const u8 *inbuf,
 			size_t inlen, u8 *outbuf, size_t outlen,
 			size_t *outlen_actual);
 
+extern void efx_mcdi_rpc_start(struct efx_nic *efx, unsigned cmd,
+			       const u8 *inbuf, size_t inlen);
+extern int efx_mcdi_rpc_finish(struct efx_nic *efx, unsigned cmd, size_t inlen,
+			       u8 *outbuf, size_t outlen,
+			       size_t *outlen_actual);
+
 extern int efx_mcdi_poll_reboot(struct efx_nic *efx);
 extern void efx_mcdi_mode_poll(struct efx_nic *efx);
 extern void efx_mcdi_mode_event(struct efx_nic *efx);
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH net-next 03/11] sfc: Add channel specific receive_skb handler and post_remove callback
From: Ben Hutchings @ 2012-09-19 19:15 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

From: Stuart Hodgson <smhodgson@solarflare.com>

Allows an extra channel to override the standard receive_skb handler
and also for extra non generic operations to be performed on remove.

Also set default rx strategy so only skbs can be delivered to the
PTP receive function.

Signed-off-by: Stuart Hodgson <smhodgson@solarflare.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/efx.c        |    6 ++++++
 drivers/net/ethernet/sfc/efx.h        |    1 +
 drivers/net/ethernet/sfc/net_driver.h |    3 +++
 drivers/net/ethernet/sfc/rx.c         |   13 +++++++++++--
 4 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 342a1f3..8b79a64 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -734,6 +734,7 @@ static void efx_remove_channel(struct efx_channel *channel)
 	efx_for_each_possible_channel_tx_queue(tx_queue, channel)
 		efx_remove_tx_queue(tx_queue);
 	efx_remove_eventq(channel);
+	channel->type->post_remove(channel);
 }
 
 static void efx_remove_channels(struct efx_nic *efx)
@@ -852,6 +853,7 @@ void efx_schedule_slow_fill(struct efx_rx_queue *rx_queue)
 
 static const struct efx_channel_type efx_default_channel_type = {
 	.pre_probe		= efx_channel_dummy_op_int,
+	.post_remove		= efx_channel_dummy_op_void,
 	.get_name		= efx_get_channel_name,
 	.copy			= efx_copy_channel,
 	.keep_eventq		= false,
@@ -862,6 +864,10 @@ int efx_channel_dummy_op_int(struct efx_channel *channel)
 	return 0;
 }
 
+void efx_channel_dummy_op_void(struct efx_channel *channel)
+{
+}
+
 /**************************************************************************
  *
  * Port handling
diff --git a/drivers/net/ethernet/sfc/efx.h b/drivers/net/ethernet/sfc/efx.h
index 70755c9..f11170b 100644
--- a/drivers/net/ethernet/sfc/efx.h
+++ b/drivers/net/ethernet/sfc/efx.h
@@ -102,6 +102,7 @@ static inline void efx_filter_rfs_expire(struct efx_channel *channel) {}
 
 /* Channels */
 extern int efx_channel_dummy_op_int(struct efx_channel *channel);
+extern void efx_channel_dummy_op_void(struct efx_channel *channel);
 extern void efx_process_channel_now(struct efx_channel *channel);
 extern int
 efx_realloc_channels(struct efx_nic *efx, u32 rxq_entries, u32 txq_entries);
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 24a78a3..0f0926e 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -393,14 +393,17 @@ struct efx_channel {
  * @get_name: Generate the channel's name (used for its IRQ handler)
  * @copy: Copy the channel state prior to reallocation.  May be %NULL if
  *	reallocation is not supported.
+ * @receive_skb: Handle an skb ready to be passed to netif_receive_skb()
  * @keep_eventq: Flag for whether event queue should be kept initialised
  *	while the device is stopped
  */
 struct efx_channel_type {
 	void (*handle_no_channel)(struct efx_nic *);
 	int (*pre_probe)(struct efx_channel *);
+	void (*post_remove)(struct efx_channel *);
 	void (*get_name)(struct efx_channel *, char *buf, size_t len);
 	struct efx_channel *(*copy)(const struct efx_channel *);
+	void (*receive_skb)(struct efx_channel *, struct sk_buff *);
 	bool keep_eventq;
 };
 
diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c
index e997f83..9e0ad1b 100644
--- a/drivers/net/ethernet/sfc/rx.c
+++ b/drivers/net/ethernet/sfc/rx.c
@@ -575,7 +575,10 @@ static void efx_rx_deliver(struct efx_channel *channel,
 	skb_record_rx_queue(skb, channel->rx_queue.core_index);
 
 	/* Pass the packet up */
-	netif_receive_skb(skb);
+	if (channel->type->receive_skb)
+		channel->type->receive_skb(channel, skb);
+	else
+		netif_receive_skb(skb);
 
 	/* Update allocation strategy method */
 	channel->rx_alloc_level += RX_ALLOC_FACTOR_SKB;
@@ -617,7 +620,8 @@ void __efx_rx_packet(struct efx_channel *channel, struct efx_rx_buffer *rx_buf)
 	if (unlikely(!(efx->net_dev->features & NETIF_F_RXCSUM)))
 		rx_buf->flags &= ~EFX_RX_PKT_CSUMMED;
 
-	if (likely(rx_buf->flags & (EFX_RX_BUF_PAGE | EFX_RX_PKT_CSUMMED)))
+	if (likely(rx_buf->flags & (EFX_RX_BUF_PAGE | EFX_RX_PKT_CSUMMED)) &&
+	    !channel->type->receive_skb)
 		efx_rx_packet_gro(channel, rx_buf, eh);
 	else
 		efx_rx_deliver(channel, rx_buf);
@@ -627,6 +631,11 @@ void efx_rx_strategy(struct efx_channel *channel)
 {
 	enum efx_rx_alloc_method method = rx_alloc_method;
 
+	if (channel->type->receive_skb) {
+		channel->rx_alloc_push_pages = false;
+		return;
+	}
+
 	/* Only makes sense to use page based allocation if GRO is enabled */
 	if (!(channel->efx->net_dev->features & NETIF_F_GRO)) {
 		method = RX_ALLOC_METHOD_SKB;
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
From: Jesper Dangaard Brouer @ 2012-09-19 19:14 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Pablo Neira Ayuso, netfilter-devel, netdev, yongjun_wei, kaber
In-Reply-To: <20120912213627.GJ14750@breakpoint.cc>

On Wed, 2012-09-12 at 23:36 +0200, Florian Westphal wrote:

[...cut...]

> On module removal nf_nat_ipv4 calls nf_iterate_cleanup which invokes
> nf_nat_proto_clean() for each conntrack.  That will then call
> hlist_del_rcu(&nat->bysource) using eachs conntracks nat ext area.
> 
> Problem is that nf_nat_proto_clean() is called multiple times for the same
> conntrack:
> a) nf_ct_iterate_cleanup() returns each ct twice (origin, reply)
> b) we call it both for l3 and for l4 protocol ids
> 
> We barf in hlist_del_rcu the 2nd time because ->pprev is poisoned.
> 
> This was introduced with the ipv6 nat patches.
> 
> --- a/net/netfilter/nf_nat_core.c
> +++ b/net/netfilter/nf_nat_core.c
> @@ -487,7 +487,7 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
>  
>         if (clean->hash) {
>                 spin_lock_bh(&nf_nat_lock);
> -               hlist_del_rcu(&nat->bysource);
> +               hlist_del_init_rcu(&nat->bysource);
>                 spin_unlock_bh(&nf_nat_lock);
>         } else {
>
> Would probably avoid it.  I guess it would be nicer to only call this
> once for each ct.

Florian's patch fixes the Oops :-)




^ permalink raw reply

* [PATCH net-next 02/11] sfc: Add explicit RX queue flag to channel
From: Ben Hutchings @ 2012-09-19 19:14 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

The PTP channel will have its own RX queue even though it's not
a regular traffic channel.

Original work by Ben Hutchings <bhutchings@solarflare.com>

Signed-off-by: Stuart Hodgson <smhodgson@solarflare.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/efx.c        |    8 +++++++-
 drivers/net/ethernet/sfc/net_driver.h |    5 ++++-
 drivers/net/ethernet/sfc/rx.c         |    7 +++++--
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index a606db4..342a1f3 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -1451,10 +1451,16 @@ static void efx_set_channels(struct efx_nic *efx)
 	efx->tx_channel_offset =
 		separate_tx_channels ? efx->n_channels - efx->n_tx_channels : 0;
 
-	/* We need to adjust the TX queue numbers if we have separate
+	/* We need to mark which channels really have RX and TX
+	 * queues, and adjust the TX queue numbers if we have separate
 	 * RX-only and TX-only channels.
 	 */
 	efx_for_each_channel(channel, efx) {
+		if (channel->channel < efx->n_rx_channels)
+			channel->rx_queue.core_index = channel->channel;
+		else
+			channel->rx_queue.core_index = -1;
+
 		efx_for_each_channel_tx_queue(tx_queue, channel)
 			tx_queue->queue -= (efx->tx_channel_offset *
 					    EFX_TXQ_TYPES);
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 7ab1232..24a78a3 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -242,6 +242,8 @@ struct efx_rx_page_state {
 /**
  * struct efx_rx_queue - An Efx RX queue
  * @efx: The associated Efx NIC
+ * @core_index:  Index of network core RX queue.  Will be >= 0 iff this
+ *	is associated with a real RX queue.
  * @buffer: The software buffer ring
  * @rxd: The hardware descriptor ring
  * @ptr_mask: The size of the ring minus 1.
@@ -263,6 +265,7 @@ struct efx_rx_page_state {
  */
 struct efx_rx_queue {
 	struct efx_nic *efx;
+	int core_index;
 	struct efx_rx_buffer *buffer;
 	struct efx_special_buffer rxd;
 	unsigned int ptr_mask;
@@ -1047,7 +1050,7 @@ static inline bool efx_tx_queue_used(struct efx_tx_queue *tx_queue)
 
 static inline bool efx_channel_has_rx_queue(struct efx_channel *channel)
 {
-	return channel->channel < channel->efx->n_rx_channels;
+	return channel->rx_queue.core_index >= 0;
 }
 
 static inline struct efx_rx_queue *
diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c
index 719319b..e997f83 100644
--- a/drivers/net/ethernet/sfc/rx.c
+++ b/drivers/net/ethernet/sfc/rx.c
@@ -479,7 +479,7 @@ static void efx_rx_packet_gro(struct efx_channel *channel,
 		skb->ip_summed = ((rx_buf->flags & EFX_RX_PKT_CSUMMED) ?
 				  CHECKSUM_UNNECESSARY : CHECKSUM_NONE);
 
-		skb_record_rx_queue(skb, channel->channel);
+		skb_record_rx_queue(skb, channel->rx_queue.core_index);
 
 		gro_result = napi_gro_frags(napi);
 	} else {
@@ -571,6 +571,9 @@ static void efx_rx_deliver(struct efx_channel *channel,
 	/* Set the SKB flags */
 	skb_checksum_none_assert(skb);
 
+	/* Record the rx_queue */
+	skb_record_rx_queue(skb, channel->rx_queue.core_index);
+
 	/* Pass the packet up */
 	netif_receive_skb(skb);
 
@@ -608,7 +611,7 @@ void __efx_rx_packet(struct efx_channel *channel, struct efx_rx_buffer *rx_buf)
 		 * at the ethernet header */
 		skb->protocol = eth_type_trans(skb, efx->net_dev);
 
-		skb_record_rx_queue(skb, channel->channel);
+		skb_record_rx_queue(skb, channel->rx_queue.core_index);
 	}
 
 	if (unlikely(!(efx->net_dev->features & NETIF_F_RXCSUM)))
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH net-next 01/11] pps/ptp: Allow PHC devices to adjust PPS events for known delay
From: Ben Hutchings @ 2012-09-19 19:13 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, linux-net-drivers, Richard Cochran, Rodolfo Giometti,
	Andrew Jackson
In-Reply-To: <1348081775.2636.15.camel@bwh-desktop.uk.solarflarecom.com>

Initial version by Stuart Hodgson <smhodgson@solarflare.com>

Some PHC device drivers may deliver PPS events with a significant
and variable delay, but still be able to measure precisely what
that delay is.

Add a pps_sub_ts() function for subtracting a delay from the
timestamp(s) in a PPS event, and a PTP event type (PTP_CLOCK_PPSUSR)
for which the caller provides a complete PPS event.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/ptp/ptp_clock.c          |    5 +++++
 include/linux/pps_kernel.h       |    9 +++++++++
 include/linux/ptp_clock_kernel.h |   10 ++++++++--
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c
index 1e528b5..966875d 100644
--- a/drivers/ptp/ptp_clock.c
+++ b/drivers/ptp/ptp_clock.c
@@ -300,6 +300,11 @@ void ptp_clock_event(struct ptp_clock *ptp, struct ptp_clock_event *event)
 		pps_get_ts(&evt);
 		pps_event(ptp->pps_source, &evt, PTP_PPS_EVENT, NULL);
 		break;
+
+	case PTP_CLOCK_PPSUSR:
+		pps_event(ptp->pps_source, &event->pps_times,
+			  PTP_PPS_EVENT, NULL);
+		break;
 	}
 }
 EXPORT_SYMBOL(ptp_clock_event);
diff --git a/include/linux/pps_kernel.h b/include/linux/pps_kernel.h
index 9404854..0cc45ae 100644
--- a/include/linux/pps_kernel.h
+++ b/include/linux/pps_kernel.h
@@ -116,5 +116,14 @@ static inline void pps_get_ts(struct pps_event_time *ts)
 
 #endif /* CONFIG_NTP_PPS */
 
+/* Subtract known time delay from PPS event time(s) */
+static inline void pps_sub_ts(struct pps_event_time *ts, struct timespec delta)
+{
+	ts->ts_real = timespec_sub(ts->ts_real, delta);
+#ifdef CONFIG_NTP_PPS
+	ts->ts_raw = timespec_sub(ts->ts_raw, delta);
+#endif
+}
+
 #endif /* LINUX_PPS_KERNEL_H */
 
diff --git a/include/linux/ptp_clock_kernel.h b/include/linux/ptp_clock_kernel.h
index 945704c..a644b29 100644
--- a/include/linux/ptp_clock_kernel.h
+++ b/include/linux/ptp_clock_kernel.h
@@ -21,6 +21,7 @@
 #ifndef _PTP_CLOCK_KERNEL_H_
 #define _PTP_CLOCK_KERNEL_H_
 
+#include <linux/pps_kernel.h>
 #include <linux/ptp_clock.h>
 

@@ -110,6 +111,7 @@ enum ptp_clock_events {
 	PTP_CLOCK_ALARM,
 	PTP_CLOCK_EXTTS,
 	PTP_CLOCK_PPS,
+	PTP_CLOCK_PPSUSR,
 };
 
 /**
@@ -117,13 +119,17 @@ enum ptp_clock_events {
  *
  * @type:  One of the ptp_clock_events enumeration values.
  * @index: Identifies the source of the event.
- * @timestamp: When the event occured.
+ * @timestamp: When the event occurred (%PTP_CLOCK_EXTTS only).
+ * @pps_times: When the event occurred (%PTP_CLOCK_PPSUSR only).
  */
 
 struct ptp_clock_event {
 	int type;
 	int index;
-	u64 timestamp;
+	union {
+		u64 timestamp;
+		struct pps_event_time pps_times;
+	};
 };
 
 /**
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* Pull request: sfc-next 2012-09-19
From: Ben Hutchings @ 2012-09-19 19:09 UTC (permalink / raw)
  To: David Miller
  Cc: Richard Cochran, Rodolfo Giometti, linux-net-drivers, netdev,
	Andrew Jackson

The following changes since commit b4516a288e71c64d7e214902250baf78b7b3cdcf:

  llc: Remove stray reference to sysctl_llc_station_ack_timeout. (2012-09-17 13:13:24 -0400)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next.git for-davem

(commit 450783747f42dfa3883920acfad4acdd93ce69af)

1. Extension to PPS/PTP to allow for PHC devices where pulses are
   subject to a variable but measurable delay.
2. PPS/PTP/PHC support for Solarflare boards with a timestamping 
   peripheral.
3. MTD support for updating the timestamping peripheral on those boards.
4. Fix for potential over-length requests to firmware.

Ben.

Ben Hutchings (7):
      pps/ptp: Allow PHC devices to adjust PPS events for known delay
      sfc: Fix maximum array sizes for various MCDI commands
      sfc: Convert firmware subtypes to native byte order in efx_mcdi_get_board_cfg()
      sfc: Support variable-length response to MCDI GET_BOARD_CFG
      sfc: Expose FPGA bitfile partition through MTD
      sfc: Bump version to 3.2
      sfc: Avoid generating over-length MC_CMD_FLUSH_RX_QUEUES request

Stuart Hodgson (4):
      sfc: Add explicit RX queue flag to channel
      sfc: Add channel specific receive_skb handler and post_remove callback
      sfc: Allow efx_mcdi_rpc to be called in two parts
      sfc: Add support for IEEE-1588 PTP

 drivers/net/ethernet/sfc/Kconfig       |    7 +
 drivers/net/ethernet/sfc/Makefile      |    1 +
 drivers/net/ethernet/sfc/efx.c         |   17 +-
 drivers/net/ethernet/sfc/efx.h         |    1 +
 drivers/net/ethernet/sfc/ethtool.c     |    1 +
 drivers/net/ethernet/sfc/mcdi.c        |   49 +-
 drivers/net/ethernet/sfc/mcdi.h        |    6 +
 drivers/net/ethernet/sfc/mcdi_pcol.h   |   29 +-
 drivers/net/ethernet/sfc/mtd.c         |    7 +-
 drivers/net/ethernet/sfc/net_driver.h  |   29 +-
 drivers/net/ethernet/sfc/nic.h         |   36 +
 drivers/net/ethernet/sfc/ptp.c         | 1483 ++++++++++++++++++++++++++++++++
 drivers/net/ethernet/sfc/rx.c          |   20 +-
 drivers/net/ethernet/sfc/siena.c       |    1 +
 drivers/net/ethernet/sfc/siena_sriov.c |    7 +
 drivers/net/ethernet/sfc/tx.c          |    6 +
 drivers/ptp/ptp_clock.c                |    5 +
 include/linux/pps_kernel.h             |    9 +
 include/linux/ptp_clock_kernel.h       |   10 +-
 19 files changed, 1688 insertions(+), 36 deletions(-)
 create mode 100644 drivers/net/ethernet/sfc/ptp.c

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [RFC] tcp: use order-3 pages in tcp_sendmsg()
From: Alexander Duyck @ 2012-09-19 19:04 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev
In-Reply-To: <20120919.135655.381209248813843140.davem@davemloft.net>

On 09/19/2012 10:56 AM, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 19 Sep 2012 17:14:19 +0200
>
>> I did some tests and got no problem so far, even using splice() [ this
>> one was tricky because it only deals with order-0 pages at this moment ]
>>
>> NIC tested : ixgbe, igb, bnx2x, tg3, mellanox mlx4
>>
>> On loopback, performance of netperf goes from 31900 Mb/s to 38500 Mb/s,
>> thats a 20 % increase.
> That's really a lot more than I expected, nice.

When I get some time I will test this patch on a system with an iommu
enabled.  I suspect it will have a huge performance impact there since
now you would be looking at roughly 1/8th the total number of map/unmap
calls on a system with 4K pages.

Thanks,

Alex

^ permalink raw reply

* Re: [PATCH 1/2] Added information about which firmware file is being requested.
From: Jarl Friis @ 2012-09-19 18:56 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: Stefano Brivio, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	b43-dev-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
In-Reply-To: <505A1206.4020606-Gdu+ltImwkhes2APU0mLOQ@public.gmane.org>

2012/9/19 Michael Tokarev <mjt-XAri/EZa3C4vJsYlp49lxw@public.gmane.org>:
> On 19.09.2012 15:18, Jarl Friis wrote:
>
>> +     b43info(ctx->dev->wl, "Requesting firmware file '%s'\n", ctx->fwname);
>>       err = request_firmware(&blob, ctx->fwname, ctx->dev->dev->dev);
>
> Hmm.  I wonder if this should be printed in request_firmware()
> itself instead of in all callers?

Now that you mention it, I also think that is a much better idea.
However that would be a much more central place to do the change, so I
would gladly see somebody else do that patch (in replacement of mine)

Jarl

^ permalink raw reply

* Re: [PATCH 2/2] Using LP firmware for taking advantage of the low-power capabilities.
From: Jarl Friis @ 2012-09-19 18:54 UTC (permalink / raw)
  To: Larry Finger
  Cc: Stefano Brivio, Gábor Stefanik,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	b43-dev-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
In-Reply-To: <5059DFF1.4060405-tQ5ms3gMjBLk1uMJSBkQmQ@public.gmane.org>

Thanks for feed back.

2012/9/19 Larry Finger <Larry.Finger-tQ5ms3gMjBLk1uMJSBkQmQ@public.gmane.org>:
>
>
> I have some questions about this patch. Where did you get the information
> needed to make these changes?

To be completely honest: I didn't get any information, it is based
purely on practical experiments: I experienced some inconsistency wrt.
performance on my hp6735b. I had some time to look at the driver (it's
15 years since I patched the kernel last time). Anyway I saw the
filename pattern for firmware and saw that `b43-fwcutter
broadcom-wl-5.100.138/linux/wl_apsta.o -w` contained similar files
ending on "16". So I simple tried it out (for version 16), and so far
it seems to work better. It also seems to wake up faster after a
sleep.

> Did it come from reverse engineering some
> Broadcom code

No!

> or did you look at their actual code?

No!

> There is a great deal
> of difference relative to our "clean-room" status. Anyone that has seen
> non-GPL Broadcom material cannot contribute code to b43.

I have not seen any Broadcom code at all (apart from the stuff that is
already in the linux source tree)

>
> Have you tested this code on devices with rev>=16?

Yes on my HP6735b having this chip integrated:
[ 1577.549270] b43-phy1: Broadcom 4322 WLAN found (core revision 16)
[ 1577.592117] b43-phy1 debug: Found PHY: Analog 8, Type 4, Revision 4
[ 1577.592158] b43-phy1 debug: Found Radio: Manuf 0x17F, Version
0x2056, Revision 3

I guess the part of the patch for PHY_LP has not been reached. I will
submit a new series of patches that separates things

>
> Now for some comments: This patch also needs the "b43:" added to the
> subject.

Sorry. It's long ago I have submitted patches to the kernel.

> In addition, you appear to have at least one white-space error in
> the MODULE_FIRMWARE line.

I am not sure what you mean here. Is this a mail issue... (I wrote it
just like the other ones around it)

> Is the addition of your copyright to the driver
> warranted by this change?

As far as I understand the copyright law: Yes, but I'm not an expert.
Neither am I 100% sure what you mean.

> For example, I have made much larger contributions
> to b43 over the years before I started doing reverse-engineering on this
> driver, but I never added my copyright.

I suggest you do.

> Your "Signed-off-by" implies
> copyright for the patch.

The fact that I authored the patch implies copyright (even without
Signed-off-by)

Jarl
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 1/2] Added information about which firmware file is being requested.
From: Michael Tokarev @ 2012-09-19 18:42 UTC (permalink / raw)
  To: Jarl Friis
  Cc: Stefano Brivio, Gábor Stefanik, linux-wireless, b43-dev,
	netdev, John W. Linville
In-Reply-To: <1348053493-22955-1-git-send-email-jarl@softace.dk>

On 19.09.2012 15:18, Jarl Friis wrote:

> +	b43info(ctx->dev->wl, "Requesting firmware file '%s'\n", ctx->fwname);
>  	err = request_firmware(&blob, ctx->fwname, ctx->dev->dev->dev);

Hmm.  I wonder if this should be printed in request_firmware()
itself instead of in all callers?

/mjt

^ permalink raw reply

* Opportunity to backport pch_gbe patches to stable 3.2.x
From: Erwan Velu @ 2012-09-19 18:41 UTC (permalink / raw)
  To: netdev

Hi there,

I'm using on a daily use and in production 3.2.x kernels (.29 currently) on a Kontron Nano-etx board that features the pch_gbe network driver.

I've been experiencing lots of troubles on this kernel series but since I've backported the following commits back to my 3.2.x, it works far better.

So from the history of next the following commits apply perfectly and runs very well.

I do think that other users of pch_gbe will appreciate this.

1a0bdadb4e36abac63b0a9787f372aac30c11a9e
5481c8cd83b4cb0f9f0746e1f477ac231e7eedb6
eefc48b078e1c74c701e8b44a56717418e9cd2bb
93c8acb599b72ca7da42e36d7971a28dce273665
32127a0a0a35706c18df11cd7ad69e96214b3c68
d89bdff152acc0c1e1c8093832547a553b69b45c
62ecc37986414ff98ba863f8f4b8c3fa9c8fb808
3ab77bf271e6a41512e366dfa5110edb981ed1d3
913f53e4c8a464c46a70898c88f2291ade28c196
f2c31662762b9e82b9891d6b385b17f9e5ef0ed2

Cheers,
Erwan

^ permalink raw reply

* Re: [PATCH net-next] net: more accurate network taps in transmit path
From: Eric Dumazet @ 2012-09-19 18:21 UTC (permalink / raw)
  To: David Miller; +Cc: jamie.gloudon, netdev
In-Reply-To: <20120919.141637.531173978256020103.davem@davemloft.net>

On Wed, 2012-09-19 at 14:16 -0400, David Miller wrote:

> Are you really sure that all the network tap implementations can
> handle software GSO segmented skbs using skb->next linkage?
> 
> Because that is what they can potentially see after this change.

I dont think so, because skb->next is NULL at the points I call the
network tap.

Or did I miss something ?

^ permalink raw reply

* Re: [PATCH net-next] net: more accurate network taps in transmit path
From: David Miller @ 2012-09-19 18:16 UTC (permalink / raw)
  To: eric.dumazet; +Cc: jamie.gloudon, netdev
In-Reply-To: <1348037089.26523.397.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 19 Sep 2012 08:44:49 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> dev_queue_xmit_nit() should be called right before ndo_start_xmit()
> calls or we might give wrong packet contents to taps users :
> 
> Packet checksum can be changed, or packet can be linearized or
> segmented, and segments partially sent for the later case.
> 
> Also a memory allocation can fail and packet never really hit the
> driver entry point.
> 
> Reported-by: Jamie Gloudon <jamie.gloudon@gmail.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Are you really sure that all the network tap implementations can
handle software GSO segmented skbs using skb->next linkage?

Because that is what they can potentially see after this change.

^ permalink raw reply

* Re: [RFC] tcp: use order-3 pages in tcp_sendmsg()
From: David Miller @ 2012-09-19 17:56 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1348067659.26523.949.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 19 Sep 2012 17:14:19 +0200

> I did some tests and got no problem so far, even using splice() [ this
> one was tricky because it only deals with order-0 pages at this moment ]
> 
> NIC tested : ixgbe, igb, bnx2x, tg3, mellanox mlx4
> 
> On loopback, performance of netperf goes from 31900 Mb/s to 38500 Mb/s,
> thats a 20 % increase.

That's really a lot more than I expected, nice.

^ permalink raw reply

* Re: [RFC] tcp: use order-3 pages in tcp_sendmsg()
From: Eric Dumazet @ 2012-09-19 17:55 UTC (permalink / raw)
  To: Rick Jones; +Cc: David Miller, netdev
In-Reply-To: <505A00C8.5050302@hp.com>

On Wed, 2012-09-19 at 10:28 -0700, Rick Jones wrote:
> On 09/19/2012 08:14 AM, Eric Dumazet wrote:
> > I did some tests and got no problem so far, even using splice() [ this
> > one was tricky because it only deals with order-0 pages at this moment ]
> >
> > NIC tested : ixgbe, igb, bnx2x, tg3, mellanox mlx4
> >
> > On loopback, performance of netperf goes from 31900 Mb/s to 38500 Mb/s,
> > thats a 20 % increase.
> 
> I guess Brutus will need a new baseline for his TCP Friends patch then :)
> 
> BTW, what is the change, if any for TCP_RR?
> 
> happy benchmarking,
> 
> rick jones
> 

No difference, because I already optimized this case last year ;)


commit f07d960df33c5aef8f513efce0fd201f962f94a1
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Mon Nov 28 22:41:47 2011 +0000

    tcp: avoid frag allocation for small frames
    
    tcp_sendmsg() uses select_size() helper to choose skb head size when a
    new skb must be allocated.
    
    If GSO is enabled for the socket, current strategy is to force all
    payload data to be outside of headroom, in PAGE fragments.
    
    This strategy is not welcome for small packets, wasting memory.
    
    Experiments show that best results are obtained when using 2048 bytes
    for skb head (This includes the skb overhead and various headers)
    
    This patch provides better len/truesize ratios for packets sent to
    loopback device, and reduce memory needs for in-flight loopback packets,
    particularly on arches with big pages.
    
    If a sender sends many 1-byte packets to an unresponsive application,
    receiver rmem_alloc will grow faster and will stop queuing these packets
    sooner, or will collapse its receive queue to free excess memory.
    
    netperf -t TCP_RR results are improved by ~4 %, and many workloads are
    improved as well (tbench, mysql...)

^ permalink raw reply

* Re: Macvtap bug: contractor wanted
From: David Miller @ 2012-09-19 17:54 UTC (permalink / raw)
  To: richard, chris; +Cc: netdev, qemu-devel, mst, jasowang, arnd
In-Reply-To: <20120919151133.GA23987@arachsys.com>

Solicitation of paid services or employment is absolutely not
appropriate on the vger.kernel.org mailing lists.

Please do not do this ever again.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox